Files
larksuite-cli/shortcuts/sheets/batch_op_dispatch.go
xiongyuanwen-byted bd898a1d74 feat(sheets): typed table I/O & error contract, workbook import/export, skill refresh (#1355)
* feat(sheets): add +sheet-show-gridline / +sheet-hide-gridline shortcuts

* docs(sheets): strengthen lark-sheets references for common editing pitfalls

Add targeted guidance to six lark-sheets references to reduce frequent
mistakes when editing spreadsheets through the CLI:

- write-cells: sanity-check units / dimension conversion / quantity factors
  before formula writes (formulas can run clean yet be off by a factor);
  keep derived output off original data columns to avoid clobbering source
- core-operations: prefer live formulas for derived values even when "live
  update" is not explicitly requested; scope rewrite/transform precisely so
  rows/columns that should stay unchanged are kept 1:1; treat header-stated
  format rules as checklist items; confirm the artifact file actually exists
  before finishing; write back bare values from local scripts
- visual-standards: apply border/header formatting on explicit request and
  identify the real header row; keep font size consistent with the source
- range-operations: keep total column width within A4 for printing
- read-data: dedup/compare long numbers via raw values, not csv formatted
  display (scientific notation collapses distinct numbers and causes false
  duplicates)
- chart: format date/number axes via source-cell number_format; place charts
  outside the data area so they do not cover existing data

* feat(sheets): implement table-put/table-get and sync skill specs

- Add lark_sheet_table_io.go with +table-put / +table-get and tests
- Refactor read-data; extend workbook; register new shortcuts
- Sync generated flag defs/schemas (go:embed) from sheet-skill-spec
- Sync skill references (write-cells numeric-column guidance, plus
  read-data / workbook / chart updates)

* docs(sheets): surface typed-write path at the write-decision point

Quick-ref table (SKILL.md, the first decision point) had no +table-put and
gated typed writes on "DataFrame", so a model holding a Counter/list/dict
would fall back to +csv-put and silently lose number/date fidelity.

- split csv-put row to plain-text values (no numeric/date semantics)
- add +table-put row for typed writes into an existing sheet
- add +workbook-create --sheets row for create + typed write in one shot
- add judgment note: number/amount/date/percent/count -> +table-put
  (or +workbook-create --sheets when the workbook does not exist yet);
  plain text -> +csv-put
- reframe write-cells scenario row to lead with numeric semantics
- point new-table writes at +workbook-create --sheets (one shot) instead
  of the create-empty-then-table-put two-step

Synced from sheet-skill-spec canonical (generate:cli + sync:cli).

* docs(sheets): sync SKILL.md (drop "not for local Excel" caveat)

Mirror the upstream sheet-skill-spec change removing the "not applicable to local Excel files" tail from the sheets skill and reference descriptions.

* docs(sheets): sync SKILL.md (drop "Feishu sheets only" caveat)

Mirror the upstream sheet-skill-spec change removing the "applies to Feishu sheets only" tail from the 14 sheet reference descriptions.

* feat(sheets): add +workbook-import wrapping the drive import core

Import a local xlsx/xls/csv as a new spreadsheet by delegating to the shared drive import flow with the target type pinned to sheet. Refactor drive +import to expose ImportParams / ValidateImport / PlanImportDryRun / RunImport (behavior unchanged, existing drive tests still cover it); sheets reuses them. Regenerate flag_defs_gen.go and sync the spec mirror.

* refactor(sheets): reuse the drive export core in +workbook-export

Replace +workbook-export's parallel export-task implementation with the shared drive ExportParams/RunExport core (pinned to type=sheet). Drops ~90 lines of duplicated poll/download code; +workbook-export now inherits drive's ctx cancellation, resume-on-timeout, filename sanitize/overwrite, and the full set of export status labels. The output contract aligns with drive's (adds ready/downloaded/doc_type; saved_path preserved). Also normalize an empty drive --output-dir to "." so drive +export behavior is unchanged, and fix the sheets export e2e to call +workbook-export instead of a nonexistent +export.

* docs(sheets): keep original column widths; align chart axis with requested metric

- range-operations: only widen new / overflowing columns; never recompute or
  shrink the widths of existing columns (any blanket resize, even by 1px,
  breaks the original visual format)
- chart: when the user asks for a share / percentage, the value axis should be
  a percentage (pie, or stack.percentage on bar/column) rather than raw counts

* docs(sheets): reword guidance to avoid eval-specific phrasing

Replace scoring-framework wording in the examples with plain functional
consequences (e.g. "not delivered", "goes stale when the source changes",
"breaks the original visual format"), so the references stay agent-facing.

* docs: add lark sheets financial modeling guidance

* docs(sheets): align write-cells reference with the generated output

Bring the hand-applied write-cells example in line with the spec-generated
reference so the CLI mirror is byte-identical to the canonical source.

* docs(sheets): align +csv-put help with formula support

Sync the formula-support wording from sheet-skill-spec (flag-defs, skill
references) and update the hand-authored cobra Description and comment for
+csv-put. +csv-put evaluates a leading-= cell as a formula via
set_range_from_csv; descriptions only, no behavior change.

* docs(sheets): fix invalid +dim-insert example in chart reference

The chart reference's placement example used non-existent flags
--dimension/--start/--end for +dim-insert. The real signature is
--position (required) + --count (required); copying the example
fails Validate with "--position is required". Replace it with
+dim-insert --position V --count 6 (insert 6 columns before V,
i.e. after U), aligning with the sheet-structure reference.

* docs(sheets): chart coordinate base / quoting + filter condition enums

Sync three reference-doc corrections from the spec source:

1. chart: label position.row as 0-based (first row = row:0), distinct
   from the 1-based row numbers used by A1 ranges and +dim-insert
   --position, removing the row-base ambiguity.

2. chart: convert the three runnable examples whose JSON contains a
   quoted sheet prefix ('Sheet1'!A1) from inline single-quoted
   --properties '{...}' to a stdin heredoc (--properties - <<'JSON').
   Inside an inline single-quoted string bash strips the inner quotes
   around the sheet name (and splits names with spaces into words),
   corrupting the JSON; a quoted heredoc delimiter performs no shell
   substitution and preserves it. Adds a short note on the pitfall.

3. filter / filter-view: add the full conditions[].type x compare_type
   enum table (text / number / multiValue / color and their respective
   compare_type values and values shape), and call out the
   equals/notEquals (with s) vs equal/notEqual (no s) gotcha. The docs
   previously only showed two values via examples.

* docs(sheets): label +sheet-create --index as 0-based

The base flag description for +sheet-create's --index omitted the
coordinate base, while its siblings +sheet-move ("Target position
(0-based)") and +sheet-copy already state 0-based. Align the description
so the index base is unambiguous. Synced from the spec source
(flag-defs.json + workbook reference).

* fix(sheets): regenerate flag defs and fix asasalint in table io

* feat(sheets): add counta to chart aggregateType enum

Add `counta` (count non-empty cells, incl. text) to manage_chart_object
dim2.series[].aggregateType in the chart flag schema. `count` only counts
numeric cells, so counting occurrences of a text/category column renders an
empty chart; `counta` enables category frequency counts. Synced from the
sheet-skill-spec canonical schema.

* feat(sheets): make --target-position and --range mutually exclusive on +pivot-create

Both flags map to the same wire field (properties.range), so passing
non-default values for both is ambiguous. Mirror the
--target-sheet-id / --target-sheet-name mutex pattern: --target-position
takes priority over --range, and supplying both with non-default values
is rejected up front with a typed FlagErrorf. --target-position=A1 is
the documented default and is treated as "not set".

Add a symmetric validateCreateInput hook on objectCRUDSpec (alongside
the existing validateUpdateInput), wire it into objectCreateInput, and
inject the pivot-specific check on pivotSpec.

* feat(sheets): rework +workbook-create flags and --styles

- --values builds a type-less typed payload, writing through --sheets' batched set_cell_range path (raw passthrough preserves auto-detect; large tables batch; big ints via json.Number)
- drop --headers (subsumed by --values first row) and --header-style (typed header no longer auto-bold; use --styles instead)
- styles: deep-merge overlapping cell_styles/border_styles fields (was wholesale-replace which dropped fields); add manual border_styles validation (style/weight enums + sides) since --styles is on parseJSONFlagSkip and bypasses the schema validator
- regenerate flag-defs/flag-schemas/skills mirror from sheet-skill-spec (--styles flag + full per-side border schema)

* fix(sheets): add mention_type enum to set_cell_range cells schema

Constrain rich_text mention_type to the proto MENTION_FILE_TYPE set so a
file @mention with an out-of-enum value (e.g. 6 = cloud shared folder) is
rejected by the schema validator before it reaches the server and fails
pb serialization ("mentionFileInfo.fileType: enum value expected").

- data/flag-schemas.json: mention_type gains enum + per-value description
- lark_sheet_write_cells_test.go: cover reject (6) + allow (0 / 2 / 22)

* feat(sheets): implement pandas-split --sheets protocol for +table-put/+table-get/+workbook-create

Synced from sheet-skill-spec canonical (cli:table_put schema +
references). +table-put/+workbook-create accept the new shape via a
tableSheetIn -> tableSheetSpec normalize step (dtype string -> internal
type/format mapping). +table-get emits the same shape so the writer's
df_to_sheet and the reader's sheet_to_df round-trip cleanly.

isoDateToSerial now accepts the full ISO datetime form
(2024-01-15T00:00:00.000, including timezone suffixes) emitted by
df.to_json(date_format="iso"), not just yyyy-mm-dd. End-to-end verified
by the spec repo's contracts/python_helper_roundtrip script against a
real Lark spreadsheet on pandas 2.2 and 3.0.

* feat(sheets): add --dataframe Arrow IPC input for +table-put/+table-get/+workbook-create

Introduce a binary-typed twin of --sheets: --dataframe accepts an Arrow IPC
(Feather v2) payload that pandas' df.to_feather() writes, deriving dtypes and
per-column number formats from the Arrow schema. The two producers are mutually
exclusive and funnel through a shared resolver so +table-put and
+workbook-create stay in lockstep; +table-get gains --dataframe-out for
single-sheet reads. Also auto-grow a sub-sheet's row/column count before
writing so blocks past the backend's default 200x20 bounds no longer fail with
range-exceeds-sheet-bounds.

* docs(lark-sheets): remove financial modeling standards reference

Drop the lark-sheets-financial-modeling-standards.md reference doc and all
pointers to it from SKILL.md, core-operations, and visual-standards. Bump
skill version to 3.0.0.

* docs(lark-sheets): clarify cell-image vs float-image routing and fix reference self-references

Synced from sheet-skill-spec.

- Add a binding-based decision (does the image belong to a record and move with its row?) to route +cells-set-image vs +float-image-create across the SKILL entry, float-image and write-cells references.
- Add routing rows to the SKILL command cheat-sheet and warn against defaulting to float-image out of familiarity.
- Replace mislabeled 本 skill / 子 skill / 跨 skill wording in references with 本文 / reference names, matching the existing convention.

* feat(sheets): add --styles to +table-put for one-step typed write with styling

+table-put now accepts --styles (same shape as +workbook-create's --styles):
cell_styles merge into the set_cell_range matrix, while cell_merges /
row_sizes / col_sizes apply as their own tool calls after the write. The
styles payload is name-matched against the written sheets and validated up
front, so a malformed or mismatched style fails before any write lands.

Also points +sheet-create users to +table-put (auto-creates missing sheets)
when they need data/styles, via a runtime Tip and the lark-sheets skill
references. Flag is sourced from the upstream Base table and regenerated
through sheet-skill-spec (flag-defs.json / flag-schemas.json / gen file).

Adds unit tests (dry-run styles, name-mismatch reject, execute) and a
dry-run E2E (tests/cli_e2e/sheets/sheets_table_put_dryrun_test.go).

* docs(lark-sheets): point read-data to +sheet-info for hidden row/col identification

skip-hidden defaults to false (lossless reads), but the read primitives don't mark which rows/cols are hidden. Cross-reference +sheet-info --include hidden_rows,hidden_cols + row_indices/col_indices so agents can identify hidden ranges when they need to filter or interpret hidden data.

Synced from sheet-skill-spec.

* feat(sheets): document link requirement for @document mentions in cells flag schema

@document mentions (mention_type != 0) must pass link (doc URL) to render a
clickable card; @user mentions (mention_type=0) don't need it. Synced from the
upstream tools-schema.

* fix(sheets): reject cond-format attrs whose shape mismatches rule_type

A conditional-format rule created with --rule-type colorScale but
cellIs-shaped attrs ({compare_type,value}, no color) was accepted by
the CLI and written through to the server, producing a color-less
color-scale segment. That dirty data crashes the frontend on snapshot
deserialization, so the spreadsheet can no longer be opened (5005).

The per-entry schema check can't catch this: properties.attrs.items is
a oneOf over all nine attr shapes and passes as soon as any branch
matches, blind to the sibling rule_type — {compare_type,value} matches
the cellIs branch even when rule_type says colorScale. The tool side
maps attrs blindly by rule_type and only validates dataBar count and
iconSet ordering, so the gap reaches the data layer.

Add a cross-field validator (validateCondFormatAttrs) wired into both
create and update via the new objectCRUDSpec.validateCreateInput hook
(twin of validateUpdateInput). It enforces, per rule_type, the keys
every attrs entry must carry — mirroring the tool's converter contract
— and treats an empty required string (notably color) as missing.
Rule types that take no attrs (duplicateValues / uniqueValues /
containsBlanks / notContainsBlanks) and updates that omit rule_type are
left to the server.

* test(sheets): guard condFormatAttrsRequired against flag-schemas drift

Add TestCondFormatAttrsRequired_MatchesSchemaOneOf, comparing the
hand-maintained condFormatAttrsRequired table against the embedded
flag-schemas.json attrs oneOf (multiset of required-key sets, for both
create and update). The cross-field validator only holds if its
per-rule_type required keys mirror the schema branches, and the two
share no compile-time link — this pins them together so a future schema
sync that adds/drops a required key can't silently desync the table.

* fix(sheets): default +table-get to full used range, not A1 current region

+table-get without --range anchored its current_region probe at A1, so an
internal blank row or column silently truncated everything past it — agents
then treated the partial data as complete (the pro016 / pro025 incident).

- Probe the used range over the full physical grid (row_count × column_count
  from the workbook structure) so it spans internal blank rows/columns; fall
  back to the legacy A1 anchor when dimensions are unknown.
- Emit the actually-read `range` on every sheet so callers can detect
  truncation (get_cell_ranges has no has_more flag).
- Fix the same A1-anchor bug in append mode's last-data-row probe, which could
  otherwise overwrite data past an internal blank row.
- Add unit + dry-run/live E2E coverage; refresh synced skill docs.

* docs(sheets): fix csv-get current_region guidance to cross-check row_count

current_region is a blank-row/column-bounded block, not the true sheet extent:
an internal blank row truncates it, so it can miss rows past the gap. The
read-data reference previously called it the "真实数据边界" and told agents to
prefer it over row_count — which drove the "read only to current_region's last
row, miss the tail" failure.

- current_region: warn it can be both smaller (internal blank rows truncate)
  and larger (trailing summary/signature rows) than the real data range.
- csv-get output contract: clarify its row_count/col_count is the returned size
  (= actual_range), not the physical sheet size; has_more only reflects the
  current range, not whether the whole sheet was read.
- "确定数据范围的正确流程": add a step to cross-check against +workbook-info's
  physical row_count and probe past current_region's last row for data beyond an
  internal blank row.

* fix(sheets): collapse duplicate validateCreateInput from bad merge resolution

A prior merge kept both branches' independently-added validateCreateInput
fields on objectCRUDSpec with conflicting signatures (pivot's
func(rt, input) and cond-format's func(input)), plus both call sites in
objectCreateInput, which failed to compile (validateCreateInput redeclared).

Collapse to the single richer func(rt flagView, input) signature and one
call site. cond-format's validateCondFormatAttrs (func(input), still shared
with validateUpdateInput) is wrapped in a closure that ignores rt. Both
behaviors are preserved: pivot --target-position/--range mutex and
cond-format attrs-shape-vs-rule_type validation.

* refactor(sheets): migrate legacy error helpers to typed errs in sheets domain

golangci-lint forbidigo (errs-no-legacy-helper / errs-no-bare-wrap) flagged
the table I/O, workbook, and dataframe shortcuts that landed on this branch:
93 common.FlagErrorf and 48 fmt.Errorf calls.

- Replace every common.FlagErrorf with common.ValidationErrorf (typed
  *errs.ValidationError, same signature) across workbook / table_io /
  dataframe / object_crud.
- writeDataframeOut's two final --dataframe-out write failures become typed
  errs.NewInternalError(SubtypeFileIO, ...).WithCause(err).
- applyWorkbookCreateVisualOps now passes the typed callTool error through
  unchanged (re-wrapping would downgrade classification) and attaches the
  failing op as a recovery hint only when none is set.
- The remaining fmt.Errorf are genuine intermediate errors that the command
  layer re-wraps into typed validation errors (buildTypedCell / Arrow
  decode-encode) or surfaces as a partial_success message string
  (writeTypedSheets via tablePutPartial); each carries a //nolint:forbidigo
  with that reason, per the lint guidance.

No behavior change: error messages and partial-success shapes are preserved;
gofmt, go vet, golangci-lint (0 issues) and sheets tests all pass.

* fix(shortcuts): clarify single-stdin constraint in flag help and error hint

Input flags advertised '(supports @file, - for stdin)' per flag, leading
AI agents to write '--a - <x --b - <y' where the second '<' silently
clobbers the first and the first flag reads the wrong payload. A process
has a single stdin, so at most one flag per call can use '-'.

- Reword the generated help hint to '- reads stdin (one flag per call;
  use @file for others)'.
- Add an actionable .WithHint to the stdin-conflict validation error
  pointing callers to @file for the extra flags.
- Assert the new hint in TestResolveInputFlags_DuplicateStdin.

* feat(sheets): +cells-get/+csv-get --max-chars 默认值 200000 → 500000

放宽默认防爆上限。flag_defs_gen.go 由 go generate 重生;flag_defs_test.go
的 expected default 同步;flag-schemas.json schema_version 2 → 3 是上游
spec-tables 架构调整带来的元数据 bump,与本业务改动无关、go:embed 不解析
该字段、无功能影响。

Synced from sheet-skill-spec@93f7a78.

* docs(lark-sheets): sync from spec — +csv-put 含逗号公式正例 + 收敛警示标签

源同步自 sheet-skill-spec:write-cells 补含逗号公式 RFC 4180 转义正例与结构化写入优先指引;全 reference 收敛「高频致命错误」类标签。

* docs(lark-sheets): sync from spec — --max-chars 放出为可见 flag + 落盘优先指引

源同步自 sheet-skill-spec:--max-chars 放出(默认 500000,可调小避免大输出被 Bash/终端转存为文件、改 has_more 分页);read-data 增「大数据优先落盘」指引。

* feat(sheets): 写操作报错增强 + --token 别名

- 复合 JSON shape 校验失败时报错附 --print-schema 提示,agent 可直接拿到精确结构(pro26 头号:+cells-set --cells 反复猜 shape)
- JSON 解析失败且该 flag 支持 stdin 时提示改用 stdin(公式/引号/逗号内联到 shell 被转义弄坏 JSON)
- --token 作为 --spreadsheet-token 的解析期别名:复用 sheets 已有 PostMount 钩子 + pflag normalize,仅 sheets 包,common 零改动

* docs(lark-sheets): sync from spec — set+H 改单引号 / 速查表补臆造命令名 / workbook-import 引导

* fix(sheets): migrate +table-put to typed error contract

The merge from main brought in #1449 (retire legacy error envelopes),
which removed output.ExitError / output.ErrDetail and forbids
constructing them. Port tablePutPartial off the legacy envelope:

- no sheets written -> typed errs.APIError (plain failure)
- some sheets written -> ok:false result via runtime.OutPartialFailure
  carrying written_sheets, returning the partial-failure exit signal

Also fix two drifts the same merge introduced:
- regenerate flag_defs_gen.go to match the committed flag-defs.json
- update the --max-chars flag test to assert visible (no longer hidden)

* docs(lark-sheets): sync from spec — set+H 告诫通则化(移入 stdin 段)

* feat(sheets): styles 接受 halign/valign 等对齐字段别名

把模型常幻觉的 horizontal_align / halign / vertical_align / valign 映射到
规范字段 horizontal_alignment / vertical_alignment,覆盖 --styles 与 typed
--cells;与规范字段冲突时报错而非静默择一。同步 lark-sheets skill 文档补
对齐字段说明 + --print-schema --flag-name styles 提示。

* feat(sheets): resolve wiki URLs to the backing spreadsheet for --url

Sheets shortcuts only accepted /sheets/ and /spreadsheets/ URLs via --url.
A /wiki/<node_token> URL was rejected with "must be a spreadsheet URL"
because the wiki node_token is not a spreadsheet token: resolving it to the
backing spreadsheet needs a wiki get_node call, which Validate/DryRun (kept
network-free) must not make.

Mirror the existing slides/doc/drive two-stage pattern:

- parseSpreadsheetRef classifies --url / --spreadsheet-token network-free
  into a sheet token or an (unresolved) wiki node_token.
- resolveSpreadsheetTokenExec (Execute only) resolves a /wiki/ node_token
  via wiki get_node, verifies obj_type=sheet, and returns the obj_token.
  The wiki:node:read scope is enforced on this path only, so non-wiki
  invocations are unaffected.
- resolveSpreadsheetToken stays network-free for Validate/DryRun, passing
  the node_token through unchanged.

All 47 Execute paths (including +batch-update and +workbook-export) switch
to the Exec resolver; Validate/DryRun keep the network-free one. No tool
schema change: the CLI feeds the resolved spreadsheet token as excel_id, so
this is a pure CLI-layer change.

Tested: unit (parse classification + wiki get_node e2e via httpmock) and
live end-to-end against a real wiki spreadsheet (read: +workbook-info,
+cells-get, +csv-get; write: +sheet-create, +sheet-rename, +csv-put).

* docs(sheets): note --url accepts wiki URLs (synced from spec)

* fix(sheets): match --url path segment via url.Parse, not substring

parseSpreadsheetRef classified /wiki/ with strings.Index over the whole URL, so a /sheets/ link whose query or fragment merely contained /wiki/ (e.g. .../sheets/sht?from=/wiki/x) was hijacked into a get_node call. Now parse the URL and match /sheets/, /spreadsheets/, /wiki/ only as a path prefix, mirroring slides parsePresentationRef which already fixed this class. Drop the substring helpers. Also align wiki resolution with slides: CallAPITyped (typed error + log_id) and classify an incomplete get_node response as InternalError instead of a --url validation error. Add regression tests for query/fragment /wiki/ and incomplete node.

* fix(sheets): satisfy errorlint/copyloopvar + regen flag defs

- helpers_test.go: drop the Go 1.22+ redundant `tc := tc` loop copy
  (copyloopvar).
- lark_sheet_dataframe.go, lark_sheet_table_io.go: switch the
  intermediate-error fmt.Errorf calls from %v to %w so errorlint passes.
  Behavior unchanged — these errors are always rewrapped into typed
  validation errors at the command layer.
- flag_defs_gen.go: regenerate from data/flag-defs.json (drift from the
  wiki-URL merge).

* ci: allow Apache Arrow module in license check

Arrow is Apache-2.0 overall, but it vendors c-ares (LicenseRef-C-Ares,
ISC-like) inside the module which go-licenses classifies as Unknown and
the strict disallowed_types=...,unknown gate rejects.

Pass --ignore github.com/apache/arrow/go/v17 since Arrow is required by
sheets +table-put / +table-get / +workbook-create --dataframe (Arrow IPC
ingest) and the vendored c-ares is not redistributed by us.

* fix(sheets): resolve wiki URL in +range-move/+range-copy Execute

transformExecuteFn (the named Execute helper shared by +range-move and +range-copy) still called the network-free resolveSpreadsheetToken, so a /wiki/ URL reached transform_range as an unresolved node_token and failed. #1519's sweep over Execute hooks only rewrote inline closures; this is the only Execute backed by a named helper. Switch it to resolveSpreadsheetTokenExec (Validate/DryRun stay network-free) and add a +range-move wiki-URL regression test.

* refactor(sheets): drop +table-put manual capacity grow; rely on set_cell_range auto-grow

set_cell_range now auto-grows the sub-sheet to fit the write, so the
ensureSheetCapacity helper (and its modify_sheet_structure dim-insert
call before each write) is no longer needed. This also closes a data-
safety hole flagged in review: inserting before the last existing row
could push real data down into the area set_cell_range was about to
write, and allow_overwrite=false could not protect against it because
the structural insert had already mutated the sheet by the time the
write-collision check ran.

Verified end-to-end against a real spreadsheet: +table-put writing
300x25 into a fresh Sheet1 (default 200x20) succeeds in one write and
the sheet ends up 301x25.

* fix(sheets): close --dataframe stdin guard hole

--dataframe is binary and bypasses the common Input resolver, which is
where the existing single-stdin guard lives. Result: an invocation like
+table-put --dataframe - --styles - was accepted, then one of the two
consumers raced for stdin and the other silently saw an empty stream.

Add a stdinConsumed marker on RuntimeContext that both consumers share:
common.resolveInputFlags sets it when an Input flag uses '-', and
readDataframeBytes both checks and sets it. A second consumer is
rejected up front with an actionable hint pointing at @file.

Flagged in code review (lark_sheet_dataframe.go:93).

* fix(sheets): harden +table-put / +table-get input validation and round-trip safety

Four review-flagged correctness gaps in table I/O, all bundled because
they touch the same file:

1. --sheets accepted trailing data after the first JSON value
   (json.Decoder does not surface that, unlike json.Unmarshal). A new
   decoderExpectEOF helper rejects e.g. `--sheets '{...} oops'` with a
   typed validation error instead of letting the leading object pass
   through and surface as a confusing downstream failure.

2. +table-get with a duplicate header (e.g. `amount, amount`) used to
   read back successfully — the dtypes map silently collapsed to one
   entry — and only failed later on +table-put because the writer
   rejects duplicate column names. Fail fast at read time with an
   actionable hint to rename or pass --no-header. --no-header mode is
   exempt (fallback col<N> names are always unique).

3. +table-put dry-run rendered an invalid range like A1:C0 when
   header=false with rows=[]. tablePutFullRange returns "" for an
   empty matrix or zero columns instead of building a degenerate
   rectangle.

4. +table-get with --sheet-id and a get_workbook_structure miss (read
   failure or selector mismatch) used to return a target with
   name="", which then broke +table-get → +table-put round-trip (the
   writer requires a non-empty sheet name). Fall back to using the id
   as the name.

End-to-end verified against a real spreadsheet: trailing data, duplicate
header, and --no-header fallback all behave as advertised.

* fix(sheets): apply +workbook-create style-only ops instead of silently dropping them

A +workbook-create call carrying only cell_merges / row_sizes / col_sizes
(no --values / --sheets and no cell_styles) used to create the workbook
but silently drop the requested visual ops. Two reasons, both fixed:

- workbookCreateStyleDimensions only counted cell_styles when computing
  the write extent, so cell_merges / row_sizes / col_sizes always
  contributed 0 → buildValuesPayload returned a nil payload → Execute
  skipped writeTypedSheets entirely → no visual ops ran. Extend the
  helper to fold the merge / resize ranges in.

- Pure row_sizes / col_sizes payloads can never expand a cell rectangle
  (they are dimension ranges, not cell ranges), so even with the extent
  fix Execute would still skip the write path. Add a no-data branch:
  when payload == nil but a styles item is present, look up the default
  sheet and apply visual ops directly via applyWorkbookCreateVisualOps.
  The dry-run plan mirrors this so the preview shows the visual ops.

Also picks up the --values trailing-JSON-data EOF check (mirror of the
--sheets one in lark_sheet_table_io.go).

End-to-end verified against a real spreadsheet: a cell_merges-only
+workbook-create now produces a sheet with merged_cells_count: 1.

* fix(sheets): preserve causes and render messages cleanly for typed validation errors

common.ValidationErrorf goes through fmt.Sprintf, which does not support
%w — the seven call sites that used `%w` were rendering the cause as
literal `%!w(*fmt.wrapError=&{...})` and dropping the cause from the
typed-error chain (so callers couldn't errors.As back to the underlying
error).

Switch each to `%v` for clean rendering and attach the cause via
.WithCause(err) so the typed contract is preserved. Touched call sites:

- lark_sheet_dataframe.go: --dataframe Arrow decode / stdin read / file
  read failures (3 call sites).
- lark_sheet_table_io.go: --sheets invalid JSON, payload-validate
  per-cell coercion error, buildSheetMatrix per-cell error,
  --dataframe-out arrow encode failure (4 call sites).

End-to-end verified against a real spreadsheet: both invalid-JSON and
typed-cell errors now render readable messages instead of %!w(...).

* sync(sheets): pick up +sheet-{show,hide}-gridline in +batch-update schema

Mirror of the sheet-skill-spec change adding the two gridline shortcuts
to cli-schemas.json batch_update.operations.shortcut enum. Synced from
the upstream canonical via generate:cli + sync:cli.

Verified end-to-end on a real spreadsheet — +batch-update with a
+sheet-hide-gridline op passes schema validation and the backend run
returns succeeded: 1.

* sync(sheets): pick up +workbook-export UX clarification from spec

Mirror of the sheet-skill-spec update that documents +workbook-export's
default-no-download behavior and its relationship to drive +export
--doc-type sheet. Synced from canonical via generate:cli + sync:cli +
go generate.

End-to-end verified against a real spreadsheet:
- Omit --output-path → ok:true, downloaded:false, file_token returned
- Pass --output-path ./crfix_test.xlsx → ok:true, file saved
  (17892 bytes), saved_path returned

The --help output for +workbook-export now states the default behavior
and points callers at `drive +export --doc-type sheet` when they need
the --output-dir / --file-name / --overwrite split.

* test(sheets): assert typed errs.Problem instead of err.Error() substrings

Per the coding guideline "Error-path tests must assert typed metadata via
errs.ProblemOf (category / subtype / param) and cause preservation, not
message substrings alone." — sweep through every error-path assertion in
the sheets domain and replace the
`strings.Contains(stdout+stderr+err.Error(), ...)` pattern with two
small helpers landed in helpers_test.go:

  requireProblem(t, err, wantCategory, wantSubtype, msgContains)
    -> *errs.Problem
  requireValidation(t, err, msgContains)
    -> *errs.ValidationError   // shorthand for CategoryValidation +
                               //   SubtypeInvalidArgument; lets callers
                               //   also assert .Param / .Params / .Cause

~60 assertion sites across 18 test files now check the typed envelope
shape, with message-substring checks moved onto the returned Problem
(.Message / .Hint / .Param). The substring is preserved as a sanity
check rather than the sole assertion, so a category drift like
validation → internal would now fail loudly instead of slipping past.

Cases intentionally left as substring (each with a one-line reason):
  - Errors that come straight from cobra's native flag parser (untyped
    *errors.errorString — e.g. "required flag(s) ... not set", mutually-
    exclusive groups). Re-typing these needs a custom FlagErrorFunc and
    is out of scope here.
  - Intermediate errors from decodeArrowToSheet that the caller wraps
    into a typed envelope (`//nolint:forbidigo` reason). Those unit
    tests assert the unwrapped intermediate directly.

One production tweak:
  - shortcuts/sheets/flag_schema.go: printFlagSchemaFor returns typed
    *errs.ValidationError (with WithParam("--flag-name") on the
    unknown-flag branch) instead of raw fmt.Errorf. The framework
    already wraps this when called via --print-schema, so user-facing
    behaviour is unchanged; direct callers (and tests) now get the
    typed envelope.

Verified: go test ./shortcuts/sheets/... passes; golangci-lint
--new-from-rev=origin/main reports 0 issues.

* test(common): assert typed errs.Problem instead of err.Error() substrings

Mirror of the sweep just landed in shortcuts/sheets: replace error-path
substring assertions with typed-envelope checks via two small helpers
landed in a new shortcuts/common/typed_error_assertions_test.go:

  requireProblem(t, err, wantCategory, wantSubtype, msgContains)
    -> *errs.Problem
  requireValidation(t, err, msgContains)
    -> *errs.ValidationError   // shorthand for CategoryValidation +
                               //   SubtypeInvalidArgument; lets callers
                               //   also assert .Param / .Params / .Cause

8 sites moved to typed assertions across runner_jq_test.go,
mcp_client_test.go, drive_media_upload_typed_test.go, and
runner_input_test.go (the input tests already used a typed-param helper;
this just retargets the substring follow-up onto the typed Message).

Sites intentionally left as substring + comment (production returns raw
fmt.Errorf, not a typed envelope):
  - runner_botinfo_test.go (6 sites): BotInfo / fetchBotInfo wrap upstream
    errors with fmt.Errorf so the SDK-level message ([99991], 403,
    invalid character, etc.) shows through.
  - runner_args_test.go (4 sites in 2 tests): rejectPositionalArgs returns
    raw fmt.Errorf to satisfy cobra's PositionalArgs contract.
  - permission_grant_test.go (2 sites): assert on stderr / hint strings,
    not error messages — already out of the err.Error() substring class.

No production code changes.

Verified: go test ./shortcuts/common/... passes;
golangci-lint --new-from-rev=origin/main ./shortcuts/common/... reports
0 issues.

* fix(sheets): plug four +table-put / +table-get correctness gaps flagged in CR

Four review-flagged bugs, all in lark_sheet_table_io.go (bundled because
they touch the same file and the same +table-put / +table-get domain):

1. +table-get --dry-run dropped the --sheet-id / --sheet-name selector
   from the get_cell_ranges body, while Execute always passed it. Agents
   that validate the dry-run shape and then run live would see a request
   shape mismatch. The dry-run now calls sheetSelectorForToolInput so
   the body matches Execute.

2. isDateNumberFormat used a simple `strings.ContainsRune(_, 'y')` so
   number formats like "JPY #,##0" (a currency prefix that happens to
   contain a lone 'Y') were misread as date formats — round-tripping
   integer cells out as ISO dates. The detector is now token-aware:
   it skips quoted "...", `\\x`-escaped, and `[...]` bracket sections,
   and only fires on an unescaped `yy` (a real Excel year token).

3. sheetCreateDims sized new append-mode sheets by `headerOn(s)` only,
   but writeSheetData forces a header on empty append sheets when
   Header == nil. Near 50000 rows / 200 cols this created the sheet one
   row short and the follow-up set_cell_range bounced off the backend
   ceiling. Size now matches the forced-header logic exactly.

4. tableGetTargets fallback paths (read-failure / selector mismatch on
   --sheet-id) returned a target with name="" — already corrected for
   --sheet-id structure-success path in 086876d2, but the structure-
   failure fallback still left it empty. Use the id as the name there
   too so the +table-get → +table-put round-trip never breaks on a
   nameless sheet.

End-to-end verified against a real spreadsheet:
- table-get --dry-run with --sheet-name / --sheet-id both render the
  selector field in the get_cell_ranges body
- A real round-trip (typed put → get) preserves dtypes + formats

* fix(sheets): bound --dataframe memory use with byte / row / column caps

readDataframeBytes used to read the whole Arrow file unbounded — a
stdin / file > 1 GiB would OOM the CLI long before the backend
per-sheet ceilings kicked in. decodeArrowToSheet then materialized
every record into [][]interface{} regardless of size.

Three caps now match the backend's per-sheet hard ceilings:
- byte cap: 256 MiB (covers worst-case 200×50000 cells × ~25 B Arrow
  overhead). File path pre-Stat()s before opening; both file and stdin
  paths read through io.LimitReader so an oversized input is rejected
  without allocating the full payload.
- column cap: 200, checked at schema-decode time before allocating any
  per-column slices.
- row cap: 50000, checked during record-batch iteration so a 1M-row
  Arrow file is rejected mid-stream instead of fully decoding first.

End-to-end verified against PPE — a 257 MiB file is rejected at file-
Stat with a typed validation error before any read happens.

* fix(drive): wrap +export ctx cancellation/deadline as typed errs.NetworkError

The poll loop in RunExport returned ctx.Err() directly in two places —
on the inter-attempt sleep cancel and on the pre-attempt deadline check.
That let context.Canceled / context.DeadlineExceeded escape as untyped
errors at the cobra layer, bypassing the typed-error contract every
other failure path already honors.

Add wrapExportContextErr that maps both into errs.NewNetworkError with
SubtypeNetworkTransport / SubtypeNetworkTimeout respectively and
preserves the cause via .WithCause(err), so callers can still
errors.Is(err, context.Canceled) downstream.

CR-flagged at drive_export.go:229 / :234.

* ci(license): narrow Apache Arrow workaround with a follow-up assertion

The dependency-license check still has to --ignore Apache Arrow wholesale
because go-licenses' classifier parses its LICENSE.txt as a single license
and mis-reports the module as LicenseRef-C-Ares / Unknown (Arrow inlines
the c-ares 3rdparty notice alongside its own Apache-2.0). Re-classifying
on our side isn't possible without changing go-licenses itself.

The CR concern was that --ignore is too wide — a future Arrow re-license
or new inlined dep would silently sail through. Add a follow-up step that
re-checks Arrow's LICENSE.txt independently: it must still open with
"Apache License" AND must still inline the c-ares 3rdparty notice (the
two facts that make the --ignore safe today). If either invariant breaks,
CI fails here and forces a human to re-evaluate the ignore.

Verified locally — both assertions pass against the current pinned
Arrow v17.

* sync(sheets): pick up +table-put payload-shape doc corrections from spec

Mirror of the sheet-skill-spec change that fixes three places teaching
an invalid +table-put payload shape — the typed protocol only has
columns / data / dtypes / formats (no formula field) and must always
be wrapped in an outer {"sheets":[...]} envelope. write-cells and the
SKILL.md decision table previously used the wrong field names (type /
format) and pointed users at +table-put for formula writes, which the
shortcut can't actually accept.

Synced from upstream canonical via generate:cli + sync:cli.

* test(sheets/e2e): add E2E coverage for new shortcuts + typed workbook-create

AGENTS.md requires a dry-run E2E for every new shortcut and a live E2E
for new flows. Three new files cover the four shortcuts this branch
adds or materially changes:

- sheets_gridline_dryrun_test.go — pins +sheet-show-gridline /
  +sheet-hide-gridline as a single modify_workbook_structure call with
  the right operation name (show_gridline / hide_gridline) and
  sheet_id, so an op-name typo would trip CI before any live run.

- sheets_workbook_import_dryrun_test.go — pins +workbook-import as a
  two-step plan (drive media upload + drive import-task create) with
  the doc type hard-coded to "sheet" — the wrapper's whole reason for
  existing on top of generic drive +import. --name reaches file_name
  on the wire; file_extension is sniffed from the local file.

- sheets_table_put_typed_workflow_test.go — two live workflows running
  against a freshly created spreadsheet. The first runs the full
  typed +table-put → +table-get round-trip (date / numeric / object
  columns with custom number_format) and asserts the dtype + format
  contract holds end-to-end. The second exercises the typed
  +workbook-create --sheets path: create + write in one shortcut, the
  payload sheet name adopts the workbook's default sheet (no empty
  "Sheet1" left behind), and the typed contract still survives the
  read-back.

End-to-end verified locally (user identity): typed put round-trips
preserve dtypes (date → datetime64[ns], numeric → float64, object →
object) + formats verbatim; workbook-create adopts the named sheet as
the first sheet with the same typed shape intact.

* sync(sheets): pick up sheets_df.py — pandas ↔ JSON skill script from spec

Mirror of the sheet-skill-spec change that adds a DataFrame ↔ JSON
bridge as a skill-bundled Python script instead of inside the CLI
binary. Per PR #1355 review (docx NcmxdRo2yoZ4OXxoMUZcxRZ7nHd, §4.2):
keep the CLI a thin JSON/REST client; pandas / Arrow editing lives in
the caller's Python process. Synced from canonical via generate:cli +
sync:cli.

- skills/lark-sheets/scripts/sheets_df.py (new): pandas DataFrame ↔
  one sheet, .parquet / .feather / .arrow / .csv / .json. Shells out to
  `+table-put` / `+table-get` over typed JSON — no CLI changes.
- SKILL.md decision tree + write-cells.md +table-put section: explicit
  pointers so pandas users land on the script instead of hand-rolling
  the `--sheets` payload.

End-to-end verified against PPE: 3-row DataFrame (datetime / float /
object) round-trips parquet → script put → real sheet → script get →
parquet with dtypes preserved.

* Revert "sync(sheets): pick up sheets_df.py — pandas ↔ JSON skill script from spec"

This reverts commit 2964983b92.

* sync(sheets): pick up sheets_df.py + doc DRY cleanup from spec

Mirror of the sheet-skill-spec change that ships a 32-line helper-only
sheets_df.py (df_to_sheet + sheet_to_df) and removes the corresponding
inline `def` blocks from three reference docs.

- skills/lark-sheets/scripts/sheets_df.py (new): pandas DataFrame ↔
  one +table-put / +table-get sheet, importable as a library. Same
  helper pair the docs already taught, lifted out of the prose so
  callers can `from sheets_df import df_to_sheet, sheet_to_df`.
- lark-sheets-write-cells.md / lark-sheets-read-data.md /
  lark-sheets-workbook.md: drop the inline helper definitions; keep
  the usage examples (single/multi-sheet, round-trip) and switch them
  to import-from-script. workbook reference's +workbook-create
  --sheets section now points pandas users at the helper directly
  (was previously a textual reference back to write-cells).

End-to-end verified against PPE (--as user):
- +workbook-create with df_to_sheet for three sheets (income / balance
  / cashflow): create ok, dtypes (datetime64[ns] / float64) + formats
  (#,##0 / 0.0% / yyyy-mm-dd) survive on read-back through sheet_to_df.
- read → pandas mutate → write-back round-trip preserves both data
  and formats.

* chore: drop accidentally-committed __pycache__/ and gitignore .pyc

The previous commit (5fac9c39) shipped sheets_df.py and inadvertently
included its `__pycache__/sheets_df.cpython-312.pyc` — local Python
import created the bytecode cache during PPE round-trip verification and
`git add skills/lark-sheets/` swept it in.

Remove the pyc and add Python bytecode patterns to .gitignore so the
skill-bundled helper scripts don't pull cache files into future commits.

* refactor(sheets): drop --dataframe / --dataframe-out + apache/arrow dep

Per the design review at NcmxdRo2yoZ4OXxoMUZcxRZ7nHd, the Arrow IPC binary
input/output channel adds a heavy columnar runtime to the CLI for no new
capability — the typed JSON --sheets path already covers everything, and
the column-major / zero-copy advantages collapse the moment the CLI re-
encodes into the row-oriented sheets OpenAPI JSON body. Removing it also
lets us drop the `--ignore github.com/apache/arrow/go/v17` license-check
escape hatch.

Deleted:
- shortcuts/sheets/lark_sheet_dataframe.go (+ test)
- --dataframe branches in +table-put / +workbook-create
- --dataframe-out branch in +table-get
- StdinConsumed / MarkStdinConsumed exported methods (the binary stdin
  reader was the only out-of-band consumer); internal stdinConsumed
  guard against duplicate `-` input flags stays
- apache/arrow/go/v17 + transitive deps via `go mod tidy`
- CI go-licenses --ignore for arrow and the LICENSE.txt assertion step
- --dataframe / --dataframe-out coverage in skill references

Pandas users keep the round-trip via the existing skill script
skills/lark-sheets/scripts/sheets_df.py over the JSON path.

The full pre-removal state is preserved on branch feat/sheets-arrow-stash.

Upstream sheet-skill-spec follow-up: the two flag rows in the canonical
spec + base table tblV2F6fqIjyCFQW must also be dropped so the next sync
does not re-add them.

* sync(sheets): pick up --sheets one-liner fix from spec

Mirrors sheet-skill-spec 5562f83. The +table-put / +workbook-create
--sheets flag descriptions (and the --print-schema description on the
sheets array) now point at the existing df_to_sheet helper instead of
the previous misleading one-liner that produced a dict missing the
outer {"sheets":[...]} envelope and the per-sheet `name`. Agents that
copy-paste the description verbatim now build a valid payload.

Auto-synced via spec's generate:cli + sync:consumers; go generate
./shortcuts/sheets/... regenerated flag_defs_gen.go so its embedded
flagDefs stays byte-equal to data/flag-defs.json.

* test(sheets/e2e): close E2E coverage gaps for newly added shortcuts

AGENTS.md requires both dry-run and live E2E for every newly registered
shortcut, and behavior-changing refactors need at least the matching
half. Three gaps remained on feat/lark-sheets-develop:

- +sheet-show-gridline / +sheet-hide-gridline (new): only dry-run E2E.
  Add sheets_gridline_workflow_test.go — create a real spreadsheet,
  toggle hide then show against a live sub-sheet, assert ok=true on
  both (gridline state is write-only — there is no read-back field on
  +sheet-info / +workbook-info — so a successful envelope is the
  meaningful signal; the dry-run E2E already pins the wire shape).

- +workbook-import (new): only dry-run E2E. Add
  sheets_workbook_import_workflow_test.go — write a local CSV, run
  the full upload → create-task → poll, assert ready=true with a
  sheet token, +info confirms the imported workbook is reachable,
  cleanup deletes the spreadsheet.

- +workbook-export refactor (no-download default changed): had live
  E2E but no dry-run E2E in tests/cli_e2e/. Add
  sheets_workbook_export_dryrun_test.go — pin the three sheet-
  specific differences vs drive +export: type=sheet hard-coded,
  csv mode routes --sheet-id onto sub_id (xlsx mode omits it), and
  --output-path maps onto the dry-run plan's top-level output_dir.
  Also pins the csv-without-sheet-id validation error.

* refactor(sheets): unify workbookCreatedButFillFailed with OutPartialFailure

Three "made it halfway and stopped" exits in the sheets domain previously
disagreed on shape, which made the post-failure recovery flow hard for
agents to predict from one command to another:

- +table-put partial write           → exit 1, stdout ok:false envelope
- +table-put zero-sheet write        → exit 1, stderr api/server_error
- +workbook-create create-but-fill   → exit 2, stderr validation/failed_precondition

OutPartialFailure exists exactly for "the side effect landed but the
follow-up didn't" — it stamps an ok:false result envelope on stdout
(carrying the state the caller needs to recover) and returns the bare
partial-failure exit signal. The workbook-create fill-failure path was
the odd one out: it surfaced as a typed failed_precondition error on
stderr, which agents couldn't tell apart from a plain validation refusal
even though the spreadsheet really did exist and a retry / cleanup was
possible.

Migrate workbookCreatedButFillFailed onto OutPartialFailure so the four
call sites in +workbook-create's Execute (sheet-resolve failure, initial
fill failure, style-only resolve failure, style-only apply failure) emit
the same envelope shape +table-put's partial write does:

  {
    "ok": false,
    "data": {
      "spreadsheet_token": "shtNEW",
      "reason": "spreadsheet shtNEW created but initial fill failed",
      "hint":   "the spreadsheet exists; retry the fill … or delete it",
      "cause":  {"category": "...", "subtype": "...", "message": "..."}
    }
  }

The inner failure's typed problem (category / subtype / message) is
flattened into the `cause` field so agents stay diagnosable from the JSON
envelope alone, instead of having to errors.Unwrap a Go error.

Updated TestExecute_WorkbookCreate_FillFailureKeepsToken to assert the
new shape (ok:false envelope on stdout, *output.PartialFailureError exit
signal, structured cause carrying the underlying invalid_response
subtype) — preserving the original test intent (token must survive for
recovery; inner cause must stay diagnosable) under the new contract.

* chore(sheets): three review nits — WithCause + stale comment + unexport

- shortcuts/sheets/flag_schema_validate.go:106 — composite-JSON shape
  validation was wrapping vErr's message into a typed sheets validation
  error without preserving vErr as the typed cause; add the missing
  .WithCause(vErr) so errors.Unwrap and ProblemOf still find the
  underlying validator error (matches every other typed-error chain
  helper in the file).

- shortcuts/sheets/lark_sheet_batch_update.go:92 — comment claimed
  batchUpdateInput returns "FlagErrorf-typed errors", but FlagErrorf no
  longer exists (the typed-error migration replaced it with
  common.ValidationErrorf / errs.ValidationError); update the comment
  to reflect what is actually returned.

- shortcuts/drive/drive_export.go:121 — drop the ValidateExport public
  alias and rename to validateExport. sheets +workbook-export reuses
  RunExport / PlanExportDryRun from this package but inlines its own
  (sheet-specific) Validate, so there is no cross-package call site —
  ValidateExport was a misleading sibling of the genuinely-shared
  ValidateImport. Comment added to record the asymmetry so future
  readers do not export it back.

* chore(deps): drop stale indirect bumps left by the arrow removal

The earlier --dataframe / --dataframe-out + apache/arrow/go/v17 removal
deleted the arrow consumer but left two indirect lines in go.mod pinned
to the versions arrow had pulled in:

  - github.com/kr/text                   v0.2.0
  - golang.org/x/exp  v0.0.0-20240222234643-814bf88cf225

With arrow gone, larksuite/cli was the only requirer of those exact
versions; every real consumer needs lower ones (kr/pretty wants
kr/text v0.1.0; charmbracelet/huh wants x/exp …20231006; xo/terminfo
wants x/exp …20220909). Removing the two indirect lines and running
`go mod tidy` lets MVS pick the real-consumer versions and drops the
explicit indirect entries entirely — go.mod net-diff against main is
now zero for this branch.

Verified locally: go build ./...; go test ./shortcuts/sheets/...
./shortcuts/drive/... ./shortcuts/common/... ./internal/auth/...
./cmd/auth/... — all green.

---------

Co-authored-by: zhengzhijie <zhengzhijie.j@bytedance.com>
Co-authored-by: Chenweifeng-bd <chenweifeng.1534@bytedance.com>
2026-06-25 10:48:13 +08:00

350 lines
17 KiB
Go
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
// Copyright (c) 2026 Lark Technologies Pte. Ltd.
// SPDX-License-Identifier: MIT
package sheets
import (
"strings"
)
// ─── +batch-update sub-op dispatch ─────────────────────────────────────
//
// 用户传给 +batch-update --operations 的形态是 CLI 视角的 {shortcut, input}
//
// [{"shortcut": "+range-copy", "input": {"sheet_id":"...","source-range":"A1:B2","target-range":"A10"}}, ...]
//
// input 里用的是该 shortcut 的 **CLI flag 名**(与 standalone 调用一致;连字符 /
// 下划线两种写法都接受)。底层 MCP batch_update tool 要的是
// {tool_name, input(MCP body)} —— body 的字段名往往与 CLI flag 名不同
// (如 +range-copy 的 source-range/target-range 要翻成 range/destination_range
//
// 关键:每个子操作复用 **standalone shortcut 同一套 flag→body translator**
// (那些 *Input 构建函数,现在统一接收 flagView 接口)。这样 batch 子操作
// 产出的 MCP body 与该 shortcut 单独调用产出的 body 完全一致(由
// batch-vs-standalone 契约测试保证。dispatch 表只列**可纳入 atomic batch
// 的 write shortcut**——读操作、fan-out wrapper+batch-update 自身、
// +cells-batch-set-style、+cells-batch-clear、+dropdown-{update,delete})一律不放进表里,
// 用户传到 +batch-update 里会被 translator 拒绝。
// batchTranslateFn turns a sub-op's CLI-shape input (via flagView) into the MCP
// tool body for the underlying batch_update sub-tool. token is the
// +batch-update top-level spreadsheet token; sheetID/sheetName are the resolved
// sheet selector for this sub-op. The returned body already carries excel_id
// and (where the tool needs one) the operation discriminator — exactly as the
// standalone shortcut would emit.
type batchTranslateFn func(fv flagView, token, sheetID, sheetName string) (map[string]interface{}, error)
type batchOpMapping struct {
// mcpToolName 是底层 MCP batch_update 接受的 tool_name。
mcpToolName string
// translate 复用 standalone 的 *Input 构建逻辑,产出 MCP body。
translate batchTranslateFn
}
// sheetSelectorFlagsForSubOp returns the (id, name) flag names a +batch-update
// sub-op uses to express its placement / context sheet. Defaults are
// `sheet-id` / `sheet-name`; +pivot-create deviates because its create
// shortcut renamed the placement selector to `target-sheet-id` /
// `target-sheet-name` (the data-source sheet is encoded in --source as
// `'SheetName'!Range`, not in a sheet selector flag). Update / delete on
// pivot still use the default names — only the create create-side
// shortcut was renamed.
func sheetSelectorFlagsForSubOp(shortcut string) (string, string) {
if shortcut == "+pivot-create" {
return "target-sheet-id", "target-sheet-name"
}
return "sheet-id", "sheet-name"
}
// objCreateTranslate / objUpdateTranslate / objDeleteTranslate bind an object
// CRUD spec to the shared object_crud builders.
func objCreateTranslate(spec objectCRUDSpec) batchTranslateFn {
return func(fv flagView, token, sheetID, sheetName string) (map[string]interface{}, error) {
return objectCreateInput(fv, token, sheetID, sheetName, spec)
}
}
func objUpdateTranslate(spec objectCRUDSpec) batchTranslateFn {
return func(fv flagView, token, sheetID, sheetName string) (map[string]interface{}, error) {
return objectUpdateInput(fv, token, sheetID, sheetName, spec)
}
}
func objDeleteTranslate(spec objectCRUDSpec) batchTranslateFn {
return func(fv flagView, token, sheetID, sheetName string) (map[string]interface{}, error) {
return objectDeleteInput(fv, token, sheetID, sheetName, spec)
}
}
// batchOpDispatch covers every write shortcut that can join an atomic batch.
// Each entry plugs the shortcut's standalone xxxInput builder into the
// batch translator path — so the body is byte-identical to the standalone
// invocation (locked by TestBatchOp_BodyMatchesStandalone) and the missing-
// flag error is identical too (locked by TestBatchOp_ErrorEquivalence).
var batchOpDispatch = map[string]batchOpMapping{
// ─── 单元格内容 ──────────────────────────────────────────────────
"+cells-set": {"set_cell_range", cellsSetInput},
"+cells-set-style": {"set_cell_range", cellsSetStyleInput},
"+cells-clear": {"clear_cell_range", cellsClearInput},
"+cells-replace": {"replace_data", replaceInput},
"+csv-put": {"set_range_from_csv", csvPutInput},
"+dropdown-set": {"set_cell_range", dropdownSetInput},
// ─── 单元格合并 (merge_cells, operation 区分) ────────────────────
"+cells-merge": {"merge_cells", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return mergeInput(fv, token, sid, sname, "merge", true)
}},
"+cells-unmerge": {"merge_cells", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return mergeInput(fv, token, sid, sname, "unmerge", false)
}},
// ─── 行列结构 (modify_sheet_structure, operation 区分) ──────────
"+dim-insert": {"modify_sheet_structure", dimInsertInput},
"+dim-delete": {"modify_sheet_structure", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return dimRangeOpInput(fv, token, sid, sname, "delete")
}},
"+dim-hide": {"modify_sheet_structure", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return dimRangeOpInput(fv, token, sid, sname, "hide")
}},
"+dim-unhide": {"modify_sheet_structure", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return dimRangeOpInput(fv, token, sid, sname, "unhide")
}},
"+dim-freeze": {"modify_sheet_structure", dimFreezeInput},
"+dim-group": {"modify_sheet_structure", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return dimGroupInput(fv, token, sid, sname, "group")
}},
"+dim-ungroup": {"modify_sheet_structure", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return dimGroupInput(fv, token, sid, sname, "ungroup")
}},
// ─── 行高列宽 (resize_range, 无 operation 字段) ─────────────────
"+rows-resize": {"resize_range", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return resizeInput(fv, token, sid, sname, "row")
}},
"+cols-resize": {"resize_range", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return resizeInput(fv, token, sid, sname, "column")
}},
// ─── 区域操作 (transform_range, operation 区分) ─────────────────
"+range-move": {"transform_range", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return transformMoveCopyInput(fv, token, sid, sname, "move", false)
}},
"+range-copy": {"transform_range", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
return transformMoveCopyInput(fv, token, sid, sname, "copy", true)
}},
"+range-fill": {"transform_range", rangeFillInput},
"+range-sort": {"transform_range", rangeSortInput},
// ─── 工作簿 / 子表 (modify_workbook_structure, operation 区分) ──
"+sheet-create": {"modify_workbook_structure", func(fv flagView, token, _, _ string) (map[string]interface{}, error) {
return sheetCreateInput(fv, token)
}},
"+sheet-delete": {"modify_workbook_structure", sheetDeleteInput},
"+sheet-rename": {"modify_workbook_structure", sheetRenameInput},
"+sheet-move": {"modify_workbook_structure", sheetMoveBatchInput},
"+sheet-copy": {"modify_workbook_structure", sheetCopyInput},
"+sheet-hide": {"modify_workbook_structure", func(fv flagView, t, sid, sn string) (map[string]interface{}, error) {
return sheetVisibilityInput(fv, t, sid, sn, "hide")
}},
"+sheet-unhide": {"modify_workbook_structure", func(fv flagView, t, sid, sn string) (map[string]interface{}, error) {
return sheetVisibilityInput(fv, t, sid, sn, "unhide")
}},
"+sheet-set-tab-color": {"modify_workbook_structure", sheetSetTabColorInput},
"+sheet-show-gridline": {"modify_workbook_structure", func(fv flagView, t, sid, sn string) (map[string]interface{}, error) {
return sheetVisibilityInput(fv, t, sid, sn, "show_gridline")
}},
"+sheet-hide-gridline": {"modify_workbook_structure", func(fv flagView, t, sid, sn string) (map[string]interface{}, error) {
return sheetVisibilityInput(fv, t, sid, sn, "hide_gridline")
}},
// ─── 对象族 CRUD (manage_*_object, operation 区分) ─────────────
"+chart-create": {"manage_chart_object", objCreateTranslate(chartSpec)},
"+chart-update": {"manage_chart_object", objUpdateTranslate(chartSpec)},
"+chart-delete": {"manage_chart_object", objDeleteTranslate(chartSpec)},
"+pivot-create": {"manage_pivot_table_object", objCreateTranslate(pivotSpec)},
"+pivot-update": {"manage_pivot_table_object", objUpdateTranslate(pivotSpec)},
"+pivot-delete": {"manage_pivot_table_object", objDeleteTranslate(pivotSpec)},
"+cond-format-create": {"manage_conditional_format_object", objCreateTranslate(condFormatSpec)},
"+cond-format-update": {"manage_conditional_format_object", objUpdateTranslate(condFormatSpec)},
"+cond-format-delete": {"manage_conditional_format_object", objDeleteTranslate(condFormatSpec)},
"+filter-create": {"manage_filter_object", filterCreateInput},
"+filter-update": {"manage_filter_object", filterUpdateInput},
"+filter-delete": {"manage_filter_object", filterDeleteInput},
"+filter-view-create": {"manage_filter_view_object", objCreateTranslate(filterViewSpec)},
"+filter-view-update": {"manage_filter_view_object", objUpdateTranslate(filterViewSpec)},
"+filter-view-delete": {"manage_filter_view_object", objDeleteTranslate(filterViewSpec)},
"+sparkline-create": {"manage_sparkline_object", objCreateTranslate(sparklineSpec)},
"+sparkline-update": {"manage_sparkline_object", objUpdateTranslate(sparklineSpec)},
"+sparkline-delete": {"manage_sparkline_object", objDeleteTranslate(sparklineSpec)},
"+float-image-create": {"manage_float_image_object", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
if err := rejectLocalImageInBatch(fv); err != nil {
return nil, err
}
return floatImageWriteInput(fv, token, sid, sname, "create", false, "")
}},
"+float-image-update": {"manage_float_image_object", func(fv flagView, token, sid, sname string) (map[string]interface{}, error) {
if err := rejectLocalImageInBatch(fv); err != nil {
return nil, err
}
return floatImageWriteInput(fv, token, sid, sname, "update", true, "")
}},
"+float-image-delete": {"manage_float_image_object", objDeleteTranslate(floatImageDeleteSpec)},
}
// rejectLocalImageInBatch blocks the local-file --image source inside
// +batch-update: a batch sub-op has no upload phase, so the file could not be
// turned into a file_token. Callers must pass --image-token / --image-uri.
func rejectLocalImageInBatch(fv flagView) error {
if strings.TrimSpace(fv.Str("image")) != "" {
return sheetsValidationForFlag("image", "--image (local upload) is not supported inside +batch-update; pass --image-token or --image-uri instead")
}
return nil
}
// sheetMoveBatchInput translates +sheet-move inside a batch. Unlike the
// standalone shortcut it cannot issue the get_workbook_structure read that
// auto-derives sheet_id / source_index, so both must be supplied explicitly.
func sheetMoveBatchInput(fv flagView, token, sheetID, sheetName string) (map[string]interface{}, error) {
if sheetID == "" {
return nil, sheetsValidationForFlag("sheet-id", "+sheet-move in +batch-update requires sheet_id (sheet_name needs a network lookup unavailable mid-batch)")
}
if !fv.Changed("source-index") {
return nil, sheetsValidationForFlag("source-index", "+sheet-move in +batch-update requires source_index (auto-derive needs a network lookup unavailable mid-batch)")
}
if fv.Int("source-index") < 0 {
return nil, sheetsValidationForFlag("source-index", "--source-index must be >= 0")
}
// Standalone +sheet-move requires --index (see SheetMove.Validate). A batch
// sub-op skips that path, and mapFlagView falls back to the flag default (0),
// which would silently move the sheet to the front. Require it explicitly so
// the batch contract matches the standalone one.
if !fv.Changed("index") {
return nil, sheetsValidationForFlag("index", "+sheet-move in +batch-update requires index")
}
if fv.Int("index") < 0 {
return nil, sheetsValidationForFlag("index", "--index must be >= 0")
}
return map[string]interface{}{
"excel_id": token,
"operation": "move",
"sheet_id": sheetID,
"source_index": fv.Int("source-index"),
"target_index": fv.Int("index"),
}, nil
}
// reservedSubOpKeys 是禁止用户在 sub-op input 里手填的 key —— 它们由
// +batch-update 顶层 --url/--token 统一提供excel_id / spreadsheet_token / url
var reservedSubOpKeys = []string{"excel_id", "spreadsheet_token", "url"}
// translateBatchOp 把一个 CLI 视角的 {shortcut, input} 翻成底层 MCP
// batch_update 的 {tool_name, input}。`index` 用于错误信息定位。input 用
// shortcut 的 CLI flag 名(连字符/下划线均可),经该 shortcut 的 standalone
// translator 翻成 MCP body。
//
// 失败场景:
// - shortcut 字段缺失 / 非 string
// - shortcut 不在 dispatch 表拼写错read 操作;嵌套 fan-out wrapper
// - input 不是 object
// - input 里手填了 operation由 shortcut 名隐含,禁手填以防 mismatch
// - input 里手填了 excel_id / spreadsheet_token / url
// - 子操作的 translator 报错(如缺必填字段)
func translateBatchOp(raw interface{}, token string, index int) (map[string]interface{}, error) {
op, ok := raw.(map[string]interface{})
if !ok {
return nil, sheetsValidationForFlag("operations", "operations[%d] must be a JSON object", index)
}
scRaw, present := op["shortcut"]
if !present {
return nil, sheetsValidationForFlag("operations", "operations[%d]: 'shortcut' field is required", index)
}
sc, ok := scRaw.(string)
if !ok || sc == "" {
return nil, sheetsValidationForFlag("operations", "operations[%d]: 'shortcut' must be a non-empty string (got %T)", index, scRaw)
}
mapping, ok := batchOpDispatch[sc]
if !ok {
return nil, sheetsValidationForFlag(
"operations",
"operations[%d]: shortcut %q not allowed in +batch-update "+
"(read ops / fan-out wrappers like +batch-update / +cells-batch-set-style / +cells-batch-clear / +dropdown-{update,delete} are excluded; "+
"run `lark-cli sheets +batch-update --print-schema --flag-name operations` to see the full enum)",
index, sc,
)
}
inputRaw, hasInput := op["input"]
var input map[string]interface{}
if !hasInput || inputRaw == nil {
input = map[string]interface{}{}
} else {
input, ok = inputRaw.(map[string]interface{})
if !ok {
return nil, sheetsValidationForFlag("operations", "operations[%d] (%s): 'input' must be a JSON object (got %T)", index, sc, inputRaw)
}
}
// 禁手填 operation —— 由 shortcut 名表达,手填易与 shortcut 不一致。
if _, has := input["operation"]; has {
return nil, sheetsValidationForFlag(
"operations",
"operations[%d] (%s): do not pass input.operation manually — it is implied by the shortcut name",
index, sc,
)
}
// 禁在 sub-op 重复填 spreadsheet 定位 —— 由 +batch-update 顶层 --url/--token 统一提供。
for _, k := range reservedSubOpKeys {
if _, has := input[k]; has {
return nil, sheetsValidationForFlag(
"operations",
"operations[%d] (%s): do not pass input.%s — it is already set from +batch-update top-level --url / --token",
index, sc, k,
)
}
}
// 拒绝任何额外的 sub-op 顶层 key防御未来 schema drift / 用户笔误)。
for k := range op {
if k != "shortcut" && k != "input" {
return nil, sheetsValidationForFlag("operations", "operations[%d] (%s): unknown top-level key %q (expected only 'shortcut' and 'input')", index, sc, k)
}
}
fv := newMapFlagViewForCommand(sc, input)
// operations is skipped by parse-time schema validation, so type-check the
// sub-op's scalar fields here before the translator reads them via
// Int/Bool/Float64 (which would otherwise coerce a wrong type to zero).
if err := fv.validateRawTypes(); err != nil {
return nil, sheetsValidationForFlag("operations", "operations[%d] (%s): %v", index, sc, err)
}
sheetIDFlag, sheetNameFlag := sheetSelectorFlagsForSubOp(sc)
sheetID := strings.TrimSpace(fv.Str(sheetIDFlag))
sheetName := strings.TrimSpace(fv.Str(sheetNameFlag))
body, err := mapping.translate(fv, token, sheetID, sheetName)
if err != nil {
return nil, sheetsValidationForFlag("operations", "operations[%d] (%s): %v", index, sc, err)
}
return map[string]interface{}{
"tool_name": mapping.mcpToolName,
"input": body,
}, nil
}
// translateBatchOperations 翻译整个 ops 数组fail-fast遇错立即返回。
func translateBatchOperations(rawOps []interface{}, token string) ([]interface{}, error) {
if len(rawOps) == 0 {
return nil, sheetsValidationForFlag("operations", "--operations must be a non-empty JSON array")
}
out := make([]interface{}, 0, len(rawOps))
for i, raw := range rawOps {
translated, err := translateBatchOp(raw, token, i)
if err != nil {
return nil, err
}
out = append(out, translated)
}
return out, nil
}