mirror of
https://github.com/larksuite/cli.git
synced 2026-07-03 14:02:43 +08:00
sync(sheets): pick up sheets_df.py + doc DRY cleanup from spec

Mirror of the sheet-skill-spec change that ships a 32-line helper-only sheets_df.py (df_to_sheet + sheet_to_df) and removes the corresponding inline `def` blocks from three reference docs.

- skills/lark-sheets/scripts/sheets_df.py (new): pandas DataFrame ↔ one +table-put / +table-get sheet, importable as a library. Same helper pair the docs already taught, lifted out of the prose so callers can `from sheets_df import df_to_sheet, sheet_to_df`.
- lark-sheets-write-cells.md / lark-sheets-read-data.md / lark-sheets-workbook.md: drop the inline helper definitions; keep the usage examples (single/multi-sheet, round-trip) and switch them to import-from-script. The workbook reference's +workbook-create --sheets section now points pandas users at the helper directly (it was previously a textual reference back to write-cells).

End-to-end verified against PPE (--as user):

- +workbook-create with df_to_sheet for three sheets (income / balance / cashflow): create ok, dtypes (datetime64[ns] / float64) + formats (#,##0 / 0.0% / yyyy-mm-dd) survive on read-back through sheet_to_df.
- read → pandas mutate → write-back round-trip preserves both data and formats.
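
The round-trip claim above can be illustrated offline. This is a minimal sketch, not the verification that was run against PPE: the two helper bodies are copied from the sheets_df.py added in this commit, while the sample DataFrame and the `json.dumps` / `json.loads` step (standing in for the lark-cli write and read-back) are assumptions made only for illustration.

```python
import json

import pandas as pd


def df_to_sheet(df, name, formats=None):
    """Same body as sheets_df.py in this commit."""
    return {
        "name": name,
        **json.loads(df.to_json(orient="split", date_format="iso")),
        "dtypes": df.dtypes.astype(str).to_dict(),
        **({"formats": formats} if formats else {}),
    }


def sheet_to_df(sheet):
    """Same body as sheets_df.py in this commit."""
    return pd.DataFrame(sheet["data"], columns=sheet["columns"]).astype(sheet["dtypes"])


# Hypothetical sample data: one datetime column and two float columns.
df = pd.DataFrame({
    "date": pd.to_datetime(["2026-01-31", "2026-02-28"]),
    "revenue": [1200.5, 1350.0],
    "margin": [0.31, 0.285],
})

# Pack the frame, then push it through a JSON string as a stand-in for
# the CLI write + read-back (an assumption; no lark-cli call is made here).
sheet = df_to_sheet(
    df, "income", {"revenue": "#,##0", "margin": "0.0%", "date": "yyyy-mm-dd"}
)
wire = json.dumps({"sheets": [sheet]})
restored = sheet_to_df(json.loads(wire)["sheets"][0])

# Column dtypes are restored in one astype() call.
print(restored.dtypes)
```

The `dtypes` mapping is what lets `sheet_to_df` rebuild the datetime column from ISO strings without a per-column `to_datetime`.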
@@ -187,14 +187,12 @@ lark-cli sheets +table-get --url "<sheet URL>"
 lark-cli sheets +table-get --url "<sheet URL>" --sheet-name "销售"
 ```
 
-#### Output → DataFrame (2-line helper)
+#### Output → DataFrame (with the `sheet_to_df` helper)
 
-The output shape matches pandas' split orientation: `columns` is an array of column names, `data` is the 2-D data, and `dtypes` is a `{column_name: pandas_dtype_str}` mapping. Feed it straight into `pd.DataFrame(...).astype(...)` to restore every column type in one pass (no per-column `to_datetime` / `to_numeric`). The mirror of the write-side `df_to_sheet` helper:
+The output shape matches pandas' split orientation: `columns` is an array of column names, `data` is the 2-D data, and `dtypes` is a `{column_name: pandas_dtype_str}` mapping. Feed it straight into `pd.DataFrame(...).astype(...)` to restore every column type in one pass (no per-column `to_datetime` / `to_numeric`). This skill packages that 2-line helper as the importable [`scripts/sheets_df.py`](../scripts/sheets_df.py) (it contains `df_to_sheet` and `sheet_to_df`, a write / read-back pair):
 
 ```python
-import pandas as pd
-def sheet_to_df(sheet):
-    return pd.DataFrame(sheet["data"], columns=sheet["columns"]).astype(sheet["dtypes"])
+from sheets_df import sheet_to_df
 
 # Single sheet
 df = sheet_to_df(out["data"]["sheets"][0])
@@ -236,10 +234,12 @@ df = pd.read_feather(io.BytesIO(res.stdout))
 
 #### Round-trip: read → modify → write back (read/write duality)
 
-`sheet_to_df` and the `df_to_sheet` in the write-cells reference are a pair of mirror helpers; each of the three round-trip stages (read / modify / write) takes one line:
+The mirror helpers `sheet_to_df` and `df_to_sheet` ([`scripts/sheets_df.py`](../scripts/sheets_df.py)) reduce each of the three round-trip stages (read / modify / write) to one line:
 
 ```python
 import json, subprocess
+from sheets_df import df_to_sheet, sheet_to_df
+
 # 1. Read
 out = json.loads(subprocess.check_output(
     ["lark-cli","sheets","+table-get","--url",URL,"--sheet-name","销售"]))
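@@ -229,6 +229,13 @@ python prepare.py | lark-cli sheets +workbook-create --title "交易" --datafram
 
 The `--sheets` protocol is fully isomorphic to `+table-put` (for field meanings see `+table-put` in lark-sheets-write-cells; send large payloads via stdin / `@file`). `--dataframe` is the binary wire form of the same typed data (Arrow IPC; see the `--dataframe` subsection of the `+table-put` section in the same reference). Choose by the API the producer already has: pandas goes through `--dataframe`, multi-sheet or hand-assembled JSON goes through `--sheets`. Key difference: **the new workbook's default sheet is reused as the first sheet** (renamed, then filled with data), so no empty `Sheet1` is left behind; the remaining sheets are created as needed. It folds "create workbook + typed write", which `+table-put` cannot do alone, into one command, making it the first choice for "finish the pandas computation, then land a new sheet with real dates". For read-back verification use `+table-get` (isomorphic to `--sheets` and round-trippable; pandas users can also use `--dataframe-out` to get the Arrow file directly).
 
+> 💡 When sending a pandas DataFrame through `--sheets`, just `from sheets_df import df_to_sheet` ([`scripts/sheets_df.py`](../scripts/sheets_df.py), the same helper `+table-put` uses); the helper pays off most in multi-sheet scenarios:
+> ```python
+> payload = {"sheets": [df_to_sheet(income, "Income Statement"),
+>                       df_to_sheet(balance, "Balance Sheet"),
+>                       df_to_sheet(cashflow, "Cash Flow")]}
+> ```
+
 `--styles` applies visual styling in the same create-and-write call. Like `--sheets`, it has a single outer form: a top-level object holding a `styles` array; each array item corresponds to one sheet, carries a `name`, and is split by capability into four kinds of optional arrays:
 
 - `cell_styles`: like `+cells-set-style`, takes an A1-notation cell `range` plus flat style fields (`font_weight` / `background_color` / `horizontal_alignment` / `vertical_alignment` / `number_format`, etc.) and an optional `border_styles`; these styles are applied together with the content in the same write. For the full field list run `+workbook-create --print-schema --flag-name styles`.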
@@ -506,17 +506,12 @@ lark-cli sheets +table-put --spreadsheet-token "<token>" --sheets @payload.json
 
 Each sheet can also carry `"allow_overwrite": false` (refuse to write over non-empty cells, protecting existing data) and `"header": false` (write data only, no header row). For the full field list run `+table-put --print-schema --flag-name sheets`.
 
-#### DataFrame → protocol (5-line helper)
+#### DataFrame → protocol (with the `df_to_sheet` helper)
 
-pandas' `df.to_json(orient="split", date_format="iso")` does all the cleaning in one step (NaN→null, Timestamp→ISO string, numpy scalar→native number); the helper only has to attach dtypes, and 5 lines cover both single and multiple sheets:
+pandas' `df.to_json(orient="split", date_format="iso")` does all the cleaning in one step (NaN→null, Timestamp→ISO string, numpy scalar→native number); all that remains is attaching dtypes. This skill packages that 5-line helper as the importable [`scripts/sheets_df.py`](../scripts/sheets_df.py) (it contains `df_to_sheet` and `sheet_to_df`, a write / read-back pair):
 
 ```python
-import json
-def df_to_sheet(df, name, formats=None):
-    return {"name": name,
-            **json.loads(df.to_json(orient="split", date_format="iso")),
-            "dtypes": df.dtypes.astype(str).to_dict(),
-            **({"formats": formats} if formats else {})}
+from sheets_df import df_to_sheet
 
 # Single sheet (explicit formats override the default display)
 payload = {"sheets": [df_to_sheet(df, "销售", {"营收": "#,##0.00", "毛利率": "0.0%"})]}
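
The per-sheet switches and the `@payload.json` form mentioned in this hunk can be combined as sketched below. It is only an illustration under stated assumptions: the sheet dict is a hand-written stand-in for what `df_to_sheet` would return (the real helper's output also carries the `index` key that pandas' split orientation emits), the sample values and the temporary file location are made up, and the command is only assembled, never executed.

```python
import json
import tempfile
from pathlib import Path

# Hand-written stand-in for df_to_sheet(df, "销售", {...}); values are hypothetical.
sheet = {
    "name": "销售",
    "columns": ["营收", "毛利率"],
    "data": [[1200.5, 0.31], [1350.0, 0.285]],
    "dtypes": {"营收": "float64", "毛利率": "float64"},
    "formats": {"营收": "#,##0.00", "毛利率": "0.0%"},
}

# Per-sheet switches described in the docs: refuse to overwrite non-empty
# cells, and write data rows only (no header row).
sheet["allow_overwrite"] = False
sheet["header"] = False

payload = {"sheets": [sheet]}

# Write the payload to a file so it can be passed as --sheets @<file>.
payload_path = Path(tempfile.mkdtemp()) / "payload.json"
payload_path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")

TOKEN = "<token>"  # placeholder, exactly as in the docs
argv = [
    "lark-cli", "sheets", "+table-put",
    "--spreadsheet-token", TOKEN,
    "--sheets", f"@{payload_path}",
]
print(" ".join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would perform the write; the sketch stops before that step so it can run without a token.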
BIN
skills/lark-sheets/scripts/__pycache__/sheets_df.cpython-312.pyc
Normal file
Binary file not shown.
32
skills/lark-sheets/scripts/sheets_df.py
Normal file
@@ -0,0 +1,32 @@
+#!/usr/bin/env python3
+# Copyright (c) 2026 Lark Technologies Pte. Ltd.
+# SPDX-License-Identifier: MIT
+"""DataFrame ↔ Feishu Sheet typed-JSON helpers.
+
+This is the same 7-line snippet the skill docs already inline (see
+`lark-sheets-write-cells` "DataFrame → protocol (5-line helper)" and
+`lark-sheets-read-data` "Output → DataFrame (2-line helper)"), pulled out
+so callers can `import` it instead of copy-pasting:
+
+    from sheets_df import df_to_sheet, sheet_to_df
+
+Callers run lark-cli themselves; this file is a library, not a CLI.
+"""
+import json
+
+import pandas as pd
+
+
+def df_to_sheet(df, name, formats=None):
+    """Pack one DataFrame into one entry of a `+table-put --sheets` payload."""
+    return {
+        "name": name,
+        **json.loads(df.to_json(orient="split", date_format="iso")),
+        "dtypes": df.dtypes.astype(str).to_dict(),
+        **({"formats": formats} if formats else {}),
+    }
+
+
+def sheet_to_df(sheet):
+    """Restore one `+table-get` sheet dict into a typed DataFrame."""
+    return pd.DataFrame(sheet["data"], columns=sheet["columns"]).astype(sheet["dtypes"])
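
Because the file is a plain library rather than an installed package, `from sheets_df import ...` only resolves when its directory is on the import path. One possible way to arrange that is sketched below; the relative location `skills/lark-sheets/scripts` is taken from this commit, but resolving it against the current working directory is an assumption, and the `helpers_available` flag is a hypothetical name used only here.

```python
import sys
from pathlib import Path

# Directory that holds sheets_df.py in this commit; assumed here to be
# reachable relative to the current working directory.
SCRIPTS_DIR = Path("skills/lark-sheets/scripts")

scripts_dir = str(SCRIPTS_DIR.resolve())
if scripts_dir not in sys.path:
    sys.path.insert(0, scripts_dir)

try:
    # Succeeds only if the file exists at that path and pandas is installed.
    from sheets_df import df_to_sheet, sheet_to_df
    helpers_available = True
except ImportError:
    helpers_available = False

print("sheets_df importable:", helpers_available)
```

Setting `PYTHONPATH` to the same directory before launching Python would have the same effect without touching `sys.path` in code.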