67 Commits

Author SHA1 Message Date
liujinkun2025
cccf025599 docs(drive): document 30-char query limit for +search (#1560)
The Search v2 API rejects queries longer than 30 characters (counted by
Unicode code point, CJK 1 each) with 99992402 field validation failed —
it is a hard error, not truncation. Surface this in the --query -h help
text and the lark-drive search skill so callers compress long queries
before searching instead of hitting the error.

Change-Id: Ieb30a66edae7a573690c49719627ec8fb2500a1a
2026-07-03 11:39:07 +08:00
fangshuyu-768
d0cde9a414 Improve secure label error handling (#1707)
* Improve secure label error handling

* Address secure label review feedback
2026-07-02 15:41:02 +08:00
fangshuyu-768
a6797ac2e4 Improve drive batch failure handling (#1703) 2026-07-01 18:15:14 +08:00
liujinkun2025
214318aa02 fix: support bot identity for drive search (#1670) 2026-06-30 21:59:51 +08:00
SunPeiYang996
22108c3300 feat(docs): add reference map flags (#1547) 2026-06-30 12:07:18 +08:00
zhaojiaxing-coding
af9835c288 feat(drive): add +member-add shortcut with wiki space member collection collaborator support (#1204) 2026-06-25 17:45:42 +08:00
xiongyuanwen-byted
bd898a1d74 feat(sheets): typed table I/O & error contract, workbook import/export, skill refresh (#1355)
* feat(sheets): add +sheet-show-gridline / +sheet-hide-gridline shortcuts

* docs(sheets): strengthen lark-sheets references for common editing pitfalls

Add targeted guidance to six lark-sheets references to reduce frequent
mistakes when editing spreadsheets through the CLI:

- write-cells: sanity-check units / dimension conversion / quantity factors
  before formula writes (formulas can run clean yet be off by a factor);
  keep derived output off original data columns to avoid clobbering source
- core-operations: prefer live formulas for derived values even when "live
  update" is not explicitly requested; scope rewrite/transform precisely so
  rows/columns that should stay unchanged are kept 1:1; treat header-stated
  format rules as checklist items; confirm the artifact file actually exists
  before finishing; write back bare values from local scripts
- visual-standards: apply border/header formatting on explicit request and
  identify the real header row; keep font size consistent with the source
- range-operations: keep total column width within A4 for printing
- read-data: dedup/compare long numbers via raw values, not csv formatted
  display (scientific notation collapses distinct numbers and causes false
  duplicates)
- chart: format date/number axes via source-cell number_format; place charts
  outside the data area so they do not cover existing data

* feat(sheets): implement table-put/table-get and sync skill specs

- Add lark_sheet_table_io.go with +table-put / +table-get and tests
- Refactor read-data; extend workbook; register new shortcuts
- Sync generated flag defs/schemas (go:embed) from sheet-skill-spec
- Sync skill references (write-cells numeric-column guidance, plus
  read-data / workbook / chart updates)

* docs(sheets): surface typed-write path at the write-decision point

Quick-ref table (SKILL.md, the first decision point) had no +table-put and
gated typed writes on "DataFrame", so a model holding a Counter/list/dict
would fall back to +csv-put and silently lose number/date fidelity.

- split csv-put row to plain-text values (no numeric/date semantics)
- add +table-put row for typed writes into an existing sheet
- add +workbook-create --sheets row for create + typed write in one shot
- add judgment note: number/amount/date/percent/count -> +table-put
  (or +workbook-create --sheets when the workbook does not exist yet);
  plain text -> +csv-put
- reframe write-cells scenario row to lead with numeric semantics
- point new-table writes at +workbook-create --sheets (one shot) instead
  of the create-empty-then-table-put two-step

Synced from sheet-skill-spec canonical (generate:cli + sync:cli).

* docs(sheets): sync SKILL.md (drop "not for local Excel" caveat)

Mirror the upstream sheet-skill-spec change removing the "not applicable to local Excel files" tail from the sheets skill and reference descriptions.

* docs(sheets): sync SKILL.md (drop "Feishu sheets only" caveat)

Mirror the upstream sheet-skill-spec change removing the "applies to Feishu sheets only" tail from the 14 sheet reference descriptions.

* feat(sheets): add +workbook-import wrapping the drive import core

Import a local xlsx/xls/csv as a new spreadsheet by delegating to the shared drive import flow with the target type pinned to sheet. Refactor drive +import to expose ImportParams / ValidateImport / PlanImportDryRun / RunImport (behavior unchanged, existing drive tests still cover it); sheets reuses them. Regenerate flag_defs_gen.go and sync the spec mirror.

* refactor(sheets): reuse the drive export core in +workbook-export

Replace +workbook-export's parallel export-task implementation with the shared drive ExportParams/RunExport core (pinned to type=sheet). Drops ~90 lines of duplicated poll/download code; +workbook-export now inherits drive's ctx cancellation, resume-on-timeout, filename sanitize/overwrite, and the full set of export status labels. The output contract aligns with drive's (adds ready/downloaded/doc_type; saved_path preserved). Also normalize an empty drive --output-dir to "." so drive +export behavior is unchanged, and fix the sheets export e2e to call +workbook-export instead of a nonexistent +export.

* docs(sheets): keep original column widths; align chart axis with requested metric

- range-operations: only widen new / overflowing columns; never recompute or
  shrink the widths of existing columns (any blanket resize, even by 1px,
  breaks the original visual format)
- chart: when the user asks for a share / percentage, the value axis should be
  a percentage (pie, or stack.percentage on bar/column) rather than raw counts

* docs(sheets): reword guidance to avoid eval-specific phrasing

Replace scoring-framework wording in the examples with plain functional
consequences (e.g. "not delivered", "goes stale when the source changes",
"breaks the original visual format"), so the references stay agent-facing.

* docs: add lark sheets financial modeling guidance

* docs(sheets): align write-cells reference with the generated output

Bring the hand-applied write-cells example in line with the spec-generated
reference so the CLI mirror is byte-identical to the canonical source.

* docs(sheets): align +csv-put help with formula support

Sync the formula-support wording from sheet-skill-spec (flag-defs, skill
references) and update the hand-authored cobra Description and comment for
+csv-put. +csv-put evaluates a leading-= cell as a formula via
set_range_from_csv; descriptions only, no behavior change.

* docs(sheets): fix invalid +dim-insert example in chart reference

The chart reference's placement example used non-existent flags
--dimension/--start/--end for +dim-insert. The real signature is
--position (required) + --count (required); copying the example
fails Validate with "--position is required". Replace it with
+dim-insert --position V --count 6 (insert 6 columns before V,
i.e. after U), aligning with the sheet-structure reference.

* docs(sheets): chart coordinate base / quoting + filter condition enums

Sync three reference-doc corrections from the spec source:

1. chart: label position.row as 0-based (first row = row:0), distinct
   from the 1-based row numbers used by A1 ranges and +dim-insert
   --position, removing the row-base ambiguity.

2. chart: convert the three runnable examples whose JSON contains a
   quoted sheet prefix ('Sheet1'!A1) from inline single-quoted
   --properties '{...}' to a stdin heredoc (--properties - <<'JSON').
   Inside an inline single-quoted string bash strips the inner quotes
   around the sheet name (and splits names with spaces into words),
   corrupting the JSON; a quoted heredoc delimiter performs no shell
   substitution and preserves it. Adds a short note on the pitfall.

3. filter / filter-view: add the full conditions[].type x compare_type
   enum table (text / number / multiValue / color and their respective
   compare_type values and values shape), and call out the
   equals/notEquals (with s) vs equal/notEqual (no s) gotcha. The docs
   previously only showed two values via examples.

* docs(sheets): label +sheet-create --index as 0-based

The base flag description for +sheet-create's --index omitted the
coordinate base, while its siblings +sheet-move ("Target position
(0-based)") and +sheet-copy already state 0-based. Align the description
so the index base is unambiguous. Synced from the spec source
(flag-defs.json + workbook reference).

* fix(sheets): regenerate flag defs and fix asasalint in table io

* feat(sheets): add counta to chart aggregateType enum

Add `counta` (count non-empty cells, incl. text) to manage_chart_object
dim2.series[].aggregateType in the chart flag schema. `count` only counts
numeric cells, so counting occurrences of a text/category column renders an
empty chart; `counta` enables category frequency counts. Synced from the
sheet-skill-spec canonical schema.

* feat(sheets): make --target-position and --range mutually exclusive on +pivot-create

Both flags map to the same wire field (properties.range), so passing
non-default values for both is ambiguous. Mirror the
--target-sheet-id / --target-sheet-name mutex pattern: --target-position
takes priority over --range, and supplying both with non-default values
is rejected up front with a typed FlagErrorf. --target-position=A1 is
the documented default and is treated as "not set".

Add a symmetric validateCreateInput hook on objectCRUDSpec (alongside
the existing validateUpdateInput), wire it into objectCreateInput, and
inject the pivot-specific check on pivotSpec.

* feat(sheets): rework +workbook-create flags and --styles

- --values builds a type-less typed payload, writing through --sheets' batched set_cell_range path (raw passthrough preserves auto-detect; large tables batch; big ints via json.Number)
- drop --headers (subsumed by --values first row) and --header-style (typed header no longer auto-bold; use --styles instead)
- styles: deep-merge overlapping cell_styles/border_styles fields (was wholesale-replace which dropped fields); add manual border_styles validation (style/weight enums + sides) since --styles is on parseJSONFlagSkip and bypasses the schema validator
- regenerate flag-defs/flag-schemas/skills mirror from sheet-skill-spec (--styles flag + full per-side border schema)

* fix(sheets): add mention_type enum to set_cell_range cells schema

Constrain rich_text mention_type to the proto MENTION_FILE_TYPE set so a
file @mention with an out-of-enum value (e.g. 6 = cloud shared folder) is
rejected by the schema validator before it reaches the server and fails
pb serialization ("mentionFileInfo.fileType: enum value expected").

- data/flag-schemas.json: mention_type gains enum + per-value description
- lark_sheet_write_cells_test.go: cover reject (6) + allow (0 / 2 / 22)

* feat(sheets): implement pandas-split --sheets protocol for +table-put/+table-get/+workbook-create

Synced from sheet-skill-spec canonical (cli:table_put schema +
references). +table-put/+workbook-create accept the new shape via a
tableSheetIn -> tableSheetSpec normalize step (dtype string -> internal
type/format mapping). +table-get emits the same shape so the writer's
df_to_sheet and the reader's sheet_to_df round-trip cleanly.

isoDateToSerial now accepts the full ISO datetime form
(2024-01-15T00:00:00.000, including timezone suffixes) emitted by
df.to_json(date_format="iso"), not just yyyy-mm-dd. End-to-end verified
by the spec repo's contracts/python_helper_roundtrip script against a
real Lark spreadsheet on pandas 2.2 and 3.0.

* feat(sheets): add --dataframe Arrow IPC input for +table-put/+table-get/+workbook-create

Introduce a binary-typed twin of --sheets: --dataframe accepts an Arrow IPC
(Feather v2) payload that pandas' df.to_feather() writes, deriving dtypes and
per-column number formats from the Arrow schema. The two producers are mutually
exclusive and funnel through a shared resolver so +table-put and
+workbook-create stay in lockstep; +table-get gains --dataframe-out for
single-sheet reads. Also auto-grow a sub-sheet's row/column count before
writing so blocks past the backend's default 200x20 bounds no longer fail with
range-exceeds-sheet-bounds.

* docs(lark-sheets): remove financial modeling standards reference

Drop the lark-sheets-financial-modeling-standards.md reference doc and all
pointers to it from SKILL.md, core-operations, and visual-standards. Bump
skill version to 3.0.0.

* docs(lark-sheets): clarify cell-image vs float-image routing and fix reference self-references

Synced from sheet-skill-spec.

- Add a binding-based decision (does the image belong to a record and move with its row?) to route +cells-set-image vs +float-image-create across the SKILL entry, float-image and write-cells references.
- Add routing rows to the SKILL command cheat-sheet and warn against defaulting to float-image out of familiarity.
- Replace mislabeled 本 skill / 子 skill / 跨 skill wording in references with 本文 / reference names, matching the existing convention.

* feat(sheets): add --styles to +table-put for one-step typed write with styling

+table-put now accepts --styles (same shape as +workbook-create's --styles):
cell_styles merge into the set_cell_range matrix, while cell_merges /
row_sizes / col_sizes apply as their own tool calls after the write. The
styles payload is name-matched against the written sheets and validated up
front, so a malformed or mismatched style fails before any write lands.

Also points +sheet-create users to +table-put (auto-creates missing sheets)
when they need data/styles, via a runtime Tip and the lark-sheets skill
references. Flag is sourced from the upstream Base table and regenerated
through sheet-skill-spec (flag-defs.json / flag-schemas.json / gen file).

Adds unit tests (dry-run styles, name-mismatch reject, execute) and a
dry-run E2E (tests/cli_e2e/sheets/sheets_table_put_dryrun_test.go).

* docs(lark-sheets): point read-data to +sheet-info for hidden row/col identification

skip-hidden defaults to false (lossless reads), but the read primitives don't mark which rows/cols are hidden. Cross-reference +sheet-info --include hidden_rows,hidden_cols + row_indices/col_indices so agents can identify hidden ranges when they need to filter or interpret hidden data.

Synced from sheet-skill-spec.

* feat(sheets): document link requirement for @document mentions in cells flag schema

@document mentions (mention_type != 0) must pass link (doc URL) to render a
clickable card; @user mentions (mention_type=0) don't need it. Synced from the
upstream tools-schema.

* fix(sheets): reject cond-format attrs whose shape mismatches rule_type

A conditional-format rule created with --rule-type colorScale but
cellIs-shaped attrs ({compare_type,value}, no color) was accepted by
the CLI and written through to the server, producing a color-less
color-scale segment. That dirty data crashes the frontend on snapshot
deserialization, so the spreadsheet can no longer be opened (5005).

The per-entry schema check can't catch this: properties.attrs.items is
a oneOf over all nine attr shapes and passes as soon as any branch
matches, blind to the sibling rule_type — {compare_type,value} matches
the cellIs branch even when rule_type says colorScale. The tool side
maps attrs blindly by rule_type and only validates dataBar count and
iconSet ordering, so the gap reaches the data layer.

Add a cross-field validator (validateCondFormatAttrs) wired into both
create and update via the new objectCRUDSpec.validateCreateInput hook
(twin of validateUpdateInput). It enforces, per rule_type, the keys
every attrs entry must carry — mirroring the tool's converter contract
— and treats an empty required string (notably color) as missing.
Rule types that take no attrs (duplicateValues / uniqueValues /
containsBlanks / notContainsBlanks) and updates that omit rule_type are
left to the server.

* test(sheets): guard condFormatAttrsRequired against flag-schemas drift

Add TestCondFormatAttrsRequired_MatchesSchemaOneOf, comparing the
hand-maintained condFormatAttrsRequired table against the embedded
flag-schemas.json attrs oneOf (multiset of required-key sets, for both
create and update). The cross-field validator only holds if its
per-rule_type required keys mirror the schema branches, and the two
share no compile-time link — this pins them together so a future schema
sync that adds/drops a required key can't silently desync the table.

* fix(sheets): default +table-get to full used range, not A1 current region

+table-get without --range anchored its current_region probe at A1, so an
internal blank row or column silently truncated everything past it — agents
then treated the partial data as complete (the pro016 / pro025 incident).

- Probe the used range over the full physical grid (row_count × column_count
  from the workbook structure) so it spans internal blank rows/columns; fall
  back to the legacy A1 anchor when dimensions are unknown.
- Emit the actually-read `range` on every sheet so callers can detect
  truncation (get_cell_ranges has no has_more flag).
- Fix the same A1-anchor bug in append mode's last-data-row probe, which could
  otherwise overwrite data past an internal blank row.
- Add unit + dry-run/live E2E coverage; refresh synced skill docs.

* docs(sheets): fix csv-get current_region guidance to cross-check row_count

current_region is a blank-row/column-bounded block, not the true sheet extent:
an internal blank row truncates it, so it can miss rows past the gap. The
read-data reference previously called it the "真实数据边界" and told agents to
prefer it over row_count — which drove the "read only to current_region's last
row, miss the tail" failure.

- current_region: warn it can be both smaller (internal blank rows truncate)
  and larger (trailing summary/signature rows) than the real data range.
- csv-get output contract: clarify its row_count/col_count is the returned size
  (= actual_range), not the physical sheet size; has_more only reflects the
  current range, not whether the whole sheet was read.
- "确定数据范围的正确流程": add a step to cross-check against +workbook-info's
  physical row_count and probe past current_region's last row for data beyond an
  internal blank row.

* fix(sheets): collapse duplicate validateCreateInput from bad merge resolution

A prior merge kept both branches' independently-added validateCreateInput
fields on objectCRUDSpec with conflicting signatures (pivot's
func(rt, input) and cond-format's func(input)), plus both call sites in
objectCreateInput, which failed to compile (validateCreateInput redeclared).

Collapse to the single richer func(rt flagView, input) signature and one
call site. cond-format's validateCondFormatAttrs (func(input), still shared
with validateUpdateInput) is wrapped in a closure that ignores rt. Both
behaviors are preserved: pivot --target-position/--range mutex and
cond-format attrs-shape-vs-rule_type validation.

* refactor(sheets): migrate legacy error helpers to typed errs in sheets domain

golangci-lint forbidigo (errs-no-legacy-helper / errs-no-bare-wrap) flagged
the table I/O, workbook, and dataframe shortcuts that landed on this branch:
93 common.FlagErrorf and 48 fmt.Errorf calls.

- Replace every common.FlagErrorf with common.ValidationErrorf (typed
  *errs.ValidationError, same signature) across workbook / table_io /
  dataframe / object_crud.
- writeDataframeOut's two final --dataframe-out write failures become typed
  errs.NewInternalError(SubtypeFileIO, ...).WithCause(err).
- applyWorkbookCreateVisualOps now passes the typed callTool error through
  unchanged (re-wrapping would downgrade classification) and attaches the
  failing op as a recovery hint only when none is set.
- The remaining fmt.Errorf are genuine intermediate errors that the command
  layer re-wraps into typed validation errors (buildTypedCell / Arrow
  decode-encode) or surfaces as a partial_success message string
  (writeTypedSheets via tablePutPartial); each carries a //nolint:forbidigo
  with that reason, per the lint guidance.

No behavior change: error messages and partial-success shapes are preserved;
gofmt, go vet, golangci-lint (0 issues) and sheets tests all pass.

* fix(shortcuts): clarify single-stdin constraint in flag help and error hint

Input flags advertised '(supports @file, - for stdin)' per flag, leading
AI agents to write '--a - <x --b - <y' where the second '<' silently
clobbers the first and the first flag reads the wrong payload. A process
has a single stdin, so at most one flag per call can use '-'.

- Reword the generated help hint to '- reads stdin (one flag per call;
  use @file for others)'.
- Add an actionable .WithHint to the stdin-conflict validation error
  pointing callers to @file for the extra flags.
- Assert the new hint in TestResolveInputFlags_DuplicateStdin.

* feat(sheets): +cells-get/+csv-get --max-chars 默认值 200000 → 500000

放宽默认防爆上限。flag_defs_gen.go 由 go generate 重生;flag_defs_test.go
的 expected default 同步;flag-schemas.json schema_version 2 → 3 是上游
spec-tables 架构调整带来的元数据 bump,与本业务改动无关、go:embed 不解析
该字段、无功能影响。

Synced from sheet-skill-spec@93f7a78.

* docs(lark-sheets): sync from spec — +csv-put 含逗号公式正例 + 收敛警示标签

源同步自 sheet-skill-spec:write-cells 补含逗号公式 RFC 4180 转义正例与结构化写入优先指引;全 reference 收敛「高频致命错误」类标签。

* docs(lark-sheets): sync from spec — --max-chars 放出为可见 flag + 落盘优先指引

源同步自 sheet-skill-spec:--max-chars 放出(默认 500000,可调小避免大输出被 Bash/终端转存为文件、改 has_more 分页);read-data 增「大数据优先落盘」指引。

* feat(sheets): 写操作报错增强 + --token 别名

- 复合 JSON shape 校验失败时报错附 --print-schema 提示,agent 可直接拿到精确结构(pro26 头号:+cells-set --cells 反复猜 shape)
- JSON 解析失败且该 flag 支持 stdin 时提示改用 stdin(公式/引号/逗号内联到 shell 被转义弄坏 JSON)
- --token 作为 --spreadsheet-token 的解析期别名:复用 sheets 已有 PostMount 钩子 + pflag normalize,仅 sheets 包,common 零改动

* docs(lark-sheets): sync from spec — set+H 改单引号 / 速查表补臆造命令名 / workbook-import 引导

* fix(sheets): migrate +table-put to typed error contract

The merge from main brought in #1449 (retire legacy error envelopes),
which removed output.ExitError / output.ErrDetail and forbids
constructing them. Port tablePutPartial off the legacy envelope:

- no sheets written -> typed errs.APIError (plain failure)
- some sheets written -> ok:false result via runtime.OutPartialFailure
  carrying written_sheets, returning the partial-failure exit signal

Also fix two drifts the same merge introduced:
- regenerate flag_defs_gen.go to match the committed flag-defs.json
- update the --max-chars flag test to assert visible (no longer hidden)

* docs(lark-sheets): sync from spec — set+H 告诫通则化(移入 stdin 段)

* feat(sheets): styles 接受 halign/valign 等对齐字段别名

把模型常幻觉的 horizontal_align / halign / vertical_align / valign 映射到
规范字段 horizontal_alignment / vertical_alignment,覆盖 --styles 与 typed
--cells;与规范字段冲突时报错而非静默择一。同步 lark-sheets skill 文档补
对齐字段说明 + --print-schema --flag-name styles 提示。

* feat(sheets): resolve wiki URLs to the backing spreadsheet for --url

Sheets shortcuts only accepted /sheets/ and /spreadsheets/ URLs via --url.
A /wiki/<node_token> URL was rejected with "must be a spreadsheet URL"
because the wiki node_token is not a spreadsheet token: resolving it to the
backing spreadsheet needs a wiki get_node call, which Validate/DryRun (kept
network-free) must not make.

Mirror the existing slides/doc/drive two-stage pattern:

- parseSpreadsheetRef classifies --url / --spreadsheet-token network-free
  into a sheet token or an (unresolved) wiki node_token.
- resolveSpreadsheetTokenExec (Execute only) resolves a /wiki/ node_token
  via wiki get_node, verifies obj_type=sheet, and returns the obj_token.
  The wiki:node:read scope is enforced on this path only, so non-wiki
  invocations are unaffected.
- resolveSpreadsheetToken stays network-free for Validate/DryRun, passing
  the node_token through unchanged.

All 47 Execute paths (including +batch-update and +workbook-export) switch
to the Exec resolver; Validate/DryRun keep the network-free one. No tool
schema change: the CLI feeds the resolved spreadsheet token as excel_id, so
this is a pure CLI-layer change.

Tested: unit (parse classification + wiki get_node e2e via httpmock) and
live end-to-end against a real wiki spreadsheet (read: +workbook-info,
+cells-get, +csv-get; write: +sheet-create, +sheet-rename, +csv-put).

* docs(sheets): note --url accepts wiki URLs (synced from spec)

* fix(sheets): match --url path segment via url.Parse, not substring

parseSpreadsheetRef classified /wiki/ with strings.Index over the whole URL, so a /sheets/ link whose query or fragment merely contained /wiki/ (e.g. .../sheets/sht?from=/wiki/x) was hijacked into a get_node call. Now parse the URL and match /sheets/, /spreadsheets/, /wiki/ only as a path prefix, mirroring slides parsePresentationRef which already fixed this class. Drop the substring helpers. Also align wiki resolution with slides: CallAPITyped (typed error + log_id) and classify an incomplete get_node response as InternalError instead of a --url validation error. Add regression tests for query/fragment /wiki/ and incomplete node.

* fix(sheets): satisfy errorlint/copyloopvar + regen flag defs

- helpers_test.go: drop the Go 1.22+ redundant `tc := tc` loop copy
  (copyloopvar).
- lark_sheet_dataframe.go, lark_sheet_table_io.go: switch the
  intermediate-error fmt.Errorf calls from %v to %w so errorlint passes.
  Behavior unchanged — these errors are always rewrapped into typed
  validation errors at the command layer.
- flag_defs_gen.go: regenerate from data/flag-defs.json (drift from the
  wiki-URL merge).

* ci: allow Apache Arrow module in license check

Arrow is Apache-2.0 overall, but it vendors c-ares (LicenseRef-C-Ares,
ISC-like) inside the module which go-licenses classifies as Unknown and
the strict disallowed_types=...,unknown gate rejects.

Pass --ignore github.com/apache/arrow/go/v17 since Arrow is required by
sheets +table-put / +table-get / +workbook-create --dataframe (Arrow IPC
ingest) and the vendored c-ares is not redistributed by us.

* fix(sheets): resolve wiki URL in +range-move/+range-copy Execute

transformExecuteFn (the named Execute helper shared by +range-move and +range-copy) still called the network-free resolveSpreadsheetToken, so a /wiki/ URL reached transform_range as an unresolved node_token and failed. #1519's sweep over Execute hooks only rewrote inline closures; this is the only Execute backed by a named helper. Switch it to resolveSpreadsheetTokenExec (Validate/DryRun stay network-free) and add a +range-move wiki-URL regression test.

* refactor(sheets): drop +table-put manual capacity grow; rely on set_cell_range auto-grow

set_cell_range now auto-grows the sub-sheet to fit the write, so the
ensureSheetCapacity helper (and its modify_sheet_structure dim-insert
call before each write) is no longer needed. This also closes a data-
safety hole flagged in review: inserting before the last existing row
could push real data down into the area set_cell_range was about to
write, and allow_overwrite=false could not protect against it because
the structural insert had already mutated the sheet by the time the
write-collision check ran.

Verified end-to-end against a real spreadsheet: +table-put writing
300x25 into a fresh Sheet1 (default 200x20) succeeds in one write and
the sheet ends up 301x25.

* fix(sheets): close --dataframe stdin guard hole

--dataframe is binary and bypasses the common Input resolver, which is
where the existing single-stdin guard lives. Result: an invocation like
+table-put --dataframe - --styles - was accepted, then one of the two
consumers raced for stdin and the other silently saw an empty stream.

Add a stdinConsumed marker on RuntimeContext that both consumers share:
common.resolveInputFlags sets it when an Input flag uses '-', and
readDataframeBytes both checks and sets it. A second consumer is
rejected up front with an actionable hint pointing at @file.

Flagged in code review (lark_sheet_dataframe.go:93).

* fix(sheets): harden +table-put / +table-get input validation and round-trip safety

Four review-flagged correctness gaps in table I/O, all bundled because
they touch the same file:

1. --sheets accepted trailing data after the first JSON value
   (json.Decoder does not surface that, unlike json.Unmarshal). A new
   decoderExpectEOF helper rejects e.g. `--sheets '{...} oops'` with a
   typed validation error instead of letting the leading object pass
   through and surface as a confusing downstream failure.

2. +table-get with a duplicate header (e.g. `amount, amount`) used to
   read back successfully — the dtypes map silently collapsed to one
   entry — and only failed later on +table-put because the writer
   rejects duplicate column names. Fail fast at read time with an
   actionable hint to rename or pass --no-header. --no-header mode is
   exempt (fallback col<N> names are always unique).

3. +table-put dry-run rendered an invalid range like A1:C0 when
   header=false with rows=[]. tablePutFullRange returns "" for an
   empty matrix or zero columns instead of building a degenerate
   rectangle.

4. +table-get with --sheet-id and a get_workbook_structure miss (read
   failure or selector mismatch) used to return a target with
   name="", which then broke +table-get → +table-put round-trip (the
   writer requires a non-empty sheet name). Fall back to using the id
   as the name.

End-to-end verified against a real spreadsheet: trailing data, duplicate
header, and --no-header fallback all behave as advertised.

* fix(sheets): apply +workbook-create style-only ops instead of silently dropping them

A +workbook-create call carrying only cell_merges / row_sizes / col_sizes
(no --values / --sheets and no cell_styles) used to create the workbook
but silently drop the requested visual ops. Two reasons, both fixed:

- workbookCreateStyleDimensions only counted cell_styles when computing
  the write extent, so cell_merges / row_sizes / col_sizes always
  contributed 0 → buildValuesPayload returned a nil payload → Execute
  skipped writeTypedSheets entirely → no visual ops ran. Extend the
  helper to fold the merge / resize ranges in.

- Pure row_sizes / col_sizes payloads can never expand a cell rectangle
  (they are dimension ranges, not cell ranges), so even with the extent
  fix Execute would still skip the write path. Add a no-data branch:
  when payload == nil but a styles item is present, look up the default
  sheet and apply visual ops directly via applyWorkbookCreateVisualOps.
  The dry-run plan mirrors this so the preview shows the visual ops.

Also picks up the --values trailing-JSON-data EOF check (mirror of the
--sheets one in lark_sheet_table_io.go).

End-to-end verified against a real spreadsheet: a cell_merges-only
+workbook-create now produces a sheet with merged_cells_count: 1.

* fix(sheets): preserve causes and render messages cleanly for typed validation errors

common.ValidationErrorf goes through fmt.Sprintf, which does not support
%w — the seven call sites that used `%w` were rendering the cause as
literal `%!w(*fmt.wrapError=&{...})` and dropping the cause from the
typed-error chain (so callers couldn't errors.As back to the underlying
error).

Switch each to `%v` for clean rendering and attach the cause via
.WithCause(err) so the typed contract is preserved. Touched call sites:

- lark_sheet_dataframe.go: --dataframe Arrow decode / stdin read / file
  read failures (3 call sites).
- lark_sheet_table_io.go: --sheets invalid JSON, payload-validate
  per-cell coercion error, buildSheetMatrix per-cell error,
  --dataframe-out arrow encode failure (4 call sites).

End-to-end verified against a real spreadsheet: both invalid-JSON and
typed-cell errors now render readable messages instead of %!w(...).

* sync(sheets): pick up +sheet-{show,hide}-gridline in +batch-update schema

Mirror of the sheet-skill-spec change adding the two gridline shortcuts
to cli-schemas.json batch_update.operations.shortcut enum. Synced from
the upstream canonical via generate:cli + sync:cli.

Verified end-to-end on a real spreadsheet — +batch-update with a
+sheet-hide-gridline op passes schema validation and the backend run
returns succeeded: 1.

* sync(sheets): pick up +workbook-export UX clarification from spec

Mirror of the sheet-skill-spec update that documents +workbook-export's
default-no-download behavior and its relationship to drive +export
--doc-type sheet. Synced from canonical via generate:cli + sync:cli +
go generate.

End-to-end verified against a real spreadsheet:
- Omit --output-path → ok:true, downloaded:false, file_token returned
- Pass --output-path ./crfix_test.xlsx → ok:true, file saved
  (17892 bytes), saved_path returned

The --help output for +workbook-export now states the default behavior
and points callers at `drive +export --doc-type sheet` when they need
the --output-dir / --file-name / --overwrite split.

* test(sheets): assert typed errs.Problem instead of err.Error() substrings

Per the coding guideline "Error-path tests must assert typed metadata via
errs.ProblemOf (category / subtype / param) and cause preservation, not
message substrings alone." — sweep through every error-path assertion in
the sheets domain and replace the
`strings.Contains(stdout+stderr+err.Error(), ...)` pattern with two
small helpers landed in helpers_test.go:

  requireProblem(t, err, wantCategory, wantSubtype, msgContains)
    -> *errs.Problem
  requireValidation(t, err, msgContains)
    -> *errs.ValidationError   // shorthand for CategoryValidation +
                               //   SubtypeInvalidArgument; lets callers
                               //   also assert .Param / .Params / .Cause

~60 assertion sites across 18 test files now check the typed envelope
shape, with message-substring checks moved onto the returned Problem
(.Message / .Hint / .Param). The substring is preserved as a sanity
check rather than the sole assertion, so a category drift like
validation → internal would now fail loudly instead of slipping past.

Cases intentionally left as substring (each with a one-line reason):
  - Errors that come straight from cobra's native flag parser (untyped
    *errors.errorString — e.g. "required flag(s) ... not set", mutually-
    exclusive groups). Re-typing these needs a custom FlagErrorFunc and
    is out of scope here.
  - Intermediate errors from decodeArrowToSheet that the caller wraps
    into a typed envelope (`//nolint:forbidigo` reason). Those unit
    tests assert the unwrapped intermediate directly.

One production tweak:
  - shortcuts/sheets/flag_schema.go: printFlagSchemaFor returns typed
    *errs.ValidationError (with WithParam("--flag-name") on the
    unknown-flag branch) instead of raw fmt.Errorf. The framework
    already wraps this when called via --print-schema, so user-facing
    behaviour is unchanged; direct callers (and tests) now get the
    typed envelope.

Verified: go test ./shortcuts/sheets/... passes; golangci-lint
--new-from-rev=origin/main reports 0 issues.

* test(common): assert typed errs.Problem instead of err.Error() substrings

Mirror of the sweep just landed in shortcuts/sheets: replace error-path
substring assertions with typed-envelope checks via two small helpers
landed in a new shortcuts/common/typed_error_assertions_test.go:

  requireProblem(t, err, wantCategory, wantSubtype, msgContains)
    -> *errs.Problem
  requireValidation(t, err, msgContains)
    -> *errs.ValidationError   // shorthand for CategoryValidation +
                               //   SubtypeInvalidArgument; lets callers
                               //   also assert .Param / .Params / .Cause

8 sites moved to typed assertions across runner_jq_test.go,
mcp_client_test.go, drive_media_upload_typed_test.go, and
runner_input_test.go (the input tests already used a typed-param helper;
this just retargets the substring follow-up onto the typed Message).

Sites intentionally left as substring + comment (production returns raw
fmt.Errorf, not a typed envelope):
  - runner_botinfo_test.go (6 sites): BotInfo / fetchBotInfo wrap upstream
    errors with fmt.Errorf so the SDK-level message ([99991], 403,
    invalid character, etc.) shows through.
  - runner_args_test.go (4 sites in 2 tests): rejectPositionalArgs returns
    raw fmt.Errorf to satisfy cobra's PositionalArgs contract.
  - permission_grant_test.go (2 sites): assert on stderr / hint strings,
    not error messages — already out of the err.Error() substring class.

No production code changes.

Verified: go test ./shortcuts/common/... passes;
golangci-lint --new-from-rev=origin/main ./shortcuts/common/... reports
0 issues.

* fix(sheets): plug four +table-put / +table-get correctness gaps flagged in CR

Four review-flagged bugs, all in lark_sheet_table_io.go (bundled because
they touch the same file and the same +table-put / +table-get domain):

1. +table-get --dry-run dropped the --sheet-id / --sheet-name selector
   from the get_cell_ranges body, while Execute always passed it. Agents
   that validate the dry-run shape and then run live would see a request
   shape mismatch. The dry-run now calls sheetSelectorForToolInput so
   the body matches Execute.

2. isDateNumberFormat used a simple `strings.ContainsRune(_, 'y')` so
   number formats like "JPY #,##0" (a currency prefix that happens to
   contain a lone 'Y') were misread as date formats — round-tripping
   integer cells out as ISO dates. The detector is now token-aware:
   it skips quoted "...", `\\x`-escaped, and `[...]` bracket sections,
   and only fires on an unescaped `yy` (a real Excel year token).

3. sheetCreateDims sized new append-mode sheets by `headerOn(s)` only,
   but writeSheetData forces a header on empty append sheets when
   Header == nil. Near 50000 rows / 200 cols this created the sheet one
   row short and the follow-up set_cell_range bounced off the backend
   ceiling. Size now matches the forced-header logic exactly.

4. tableGetTargets fallback paths (read-failure / selector mismatch on
   --sheet-id) returned a target with name="" — already corrected for
   --sheet-id structure-success path in 086876d2, but the structure-
   failure fallback still left it empty. Use the id as the name there
   too so the +table-get → +table-put round-trip never breaks on a
   nameless sheet.

End-to-end verified against a real spreadsheet:
- table-get --dry-run with --sheet-name / --sheet-id both render the
  selector field in the get_cell_ranges body
- A real round-trip (typed put → get) preserves dtypes + formats

* fix(sheets): bound --dataframe memory use with byte / row / column caps

readDataframeBytes used to read the whole Arrow file unbounded — a
stdin / file > 1 GiB would OOM the CLI long before the backend
per-sheet ceilings kicked in. decodeArrowToSheet then materialized
every record into [][]interface{} regardless of size.

Three caps now match the backend's per-sheet hard ceilings:
- byte cap: 256 MiB (covers worst-case 200×50000 cells × ~25 B Arrow
  overhead). File path pre-Stat()s before opening; both file and stdin
  paths read through io.LimitReader so an oversized input is rejected
  without allocating the full payload.
- column cap: 200, checked at schema-decode time before allocating any
  per-column slices.
- row cap: 50000, checked during record-batch iteration so a 1M-row
  Arrow file is rejected mid-stream instead of fully decoding first.

End-to-end verified against PPE — a 257 MiB file is rejected at file-
Stat with a typed validation error before any read happens.

* fix(drive): wrap +export ctx cancellation/deadline as typed errs.NetworkError

The poll loop in RunExport returned ctx.Err() directly in two places —
on the inter-attempt sleep cancel and on the pre-attempt deadline check.
That let context.Canceled / context.DeadlineExceeded escape as untyped
errors at the cobra layer, bypassing the typed-error contract every
other failure path already honors.

Add wrapExportContextErr that maps both into errs.NewNetworkError with
SubtypeNetworkTransport / SubtypeNetworkTimeout respectively and
preserves the cause via .WithCause(err), so callers can still
errors.Is(err, context.Canceled) downstream.

CR-flagged at drive_export.go:229 / :234.

* ci(license): narrow Apache Arrow workaround with a follow-up assertion

The dependency-license check still has to --ignore Apache Arrow wholesale
because go-licenses' classifier parses its LICENSE.txt as a single license
and mis-reports the module as LicenseRef-C-Ares / Unknown (Arrow inlines
the c-ares 3rdparty notice alongside its own Apache-2.0). Re-classifying
on our side isn't possible without changing go-licenses itself.

The CR concern was that --ignore is too wide — a future Arrow re-license
or new inlined dep would silently sail through. Add a follow-up step that
re-checks Arrow's LICENSE.txt independently: it must still open with
"Apache License" AND must still inline the c-ares 3rdparty notice (the
two facts that make the --ignore safe today). If either invariant breaks,
CI fails here and forces a human to re-evaluate the ignore.

Verified locally — both assertions pass against the current pinned
Arrow v17.

* sync(sheets): pick up +table-put payload-shape doc corrections from spec

Mirror of the sheet-skill-spec change that fixes three places teaching
an invalid +table-put payload shape — the typed protocol only has
columns / data / dtypes / formats (no formula field) and must always
be wrapped in an outer {"sheets":[...]} envelope. write-cells and the
SKILL.md decision table previously used the wrong field names (type /
format) and pointed users at +table-put for formula writes, which the
shortcut can't actually accept.

Synced from upstream canonical via generate:cli + sync:cli.

* test(sheets/e2e): add E2E coverage for new shortcuts + typed workbook-create

AGENTS.md requires a dry-run E2E for every new shortcut and a live E2E
for new flows. Three new files cover the four shortcuts this branch
adds or materially changes:

- sheets_gridline_dryrun_test.go — pins +sheet-show-gridline /
  +sheet-hide-gridline as a single modify_workbook_structure call with
  the right operation name (show_gridline / hide_gridline) and
  sheet_id, so an op-name typo would trip CI before any live run.

- sheets_workbook_import_dryrun_test.go — pins +workbook-import as a
  two-step plan (drive media upload + drive import-task create) with
  the doc type hard-coded to "sheet" — the wrapper's whole reason for
  existing on top of generic drive +import. --name reaches file_name
  on the wire; file_extension is sniffed from the local file.

- sheets_table_put_typed_workflow_test.go — two live workflows running
  against a freshly created spreadsheet. The first runs the full
  typed +table-put → +table-get round-trip (date / numeric / object
  columns with custom number_format) and asserts the dtype + format
  contract holds end-to-end. The second exercises the typed
  +workbook-create --sheets path: create + write in one shortcut, the
  payload sheet name adopts the workbook's default sheet (no empty
  "Sheet1" left behind), and the typed contract still survives the
  read-back.

End-to-end verified locally (user identity): typed put round-trips
preserve dtypes (date → datetime64[ns], numeric → float64, object →
object) + formats verbatim; workbook-create adopts the named sheet as
the first sheet with the same typed shape intact.

* sync(sheets): pick up sheets_df.py — pandas ↔ JSON skill script from spec

Mirror of the sheet-skill-spec change that adds a DataFrame ↔ JSON
bridge as a skill-bundled Python script instead of inside the CLI
binary. Per PR #1355 review (docx NcmxdRo2yoZ4OXxoMUZcxRZ7nHd, §4.2):
keep the CLI a thin JSON/REST client; pandas / Arrow editing lives in
the caller's Python process. Synced from canonical via generate:cli +
sync:cli.

- skills/lark-sheets/scripts/sheets_df.py (new): pandas DataFrame ↔
  one sheet, .parquet / .feather / .arrow / .csv / .json. Shells out to
  `+table-put` / `+table-get` over typed JSON — no CLI changes.
- SKILL.md decision tree + write-cells.md +table-put section: explicit
  pointers so pandas users land on the script instead of hand-rolling
  the `--sheets` payload.

End-to-end verified against PPE: 3-row DataFrame (datetime / float /
object) round-trips parquet → script put → real sheet → script get →
parquet with dtypes preserved.

* Revert "sync(sheets): pick up sheets_df.py — pandas ↔ JSON skill script from spec"

This reverts commit 2964983b92.

* sync(sheets): pick up sheets_df.py + doc DRY cleanup from spec

Mirror of the sheet-skill-spec change that ships a 32-line helper-only
sheets_df.py (df_to_sheet + sheet_to_df) and removes the corresponding
inline `def` blocks from three reference docs.

- skills/lark-sheets/scripts/sheets_df.py (new): pandas DataFrame ↔
  one +table-put / +table-get sheet, importable as a library. Same
  helper pair the docs already taught, lifted out of the prose so
  callers can `from sheets_df import df_to_sheet, sheet_to_df`.
- lark-sheets-write-cells.md / lark-sheets-read-data.md /
  lark-sheets-workbook.md: drop the inline helper definitions; keep
  the usage examples (single/multi-sheet, round-trip) and switch them
  to import-from-script. workbook reference's +workbook-create
  --sheets section now points pandas users at the helper directly
  (was previously a textual reference back to write-cells).

End-to-end verified against PPE (--as user):
- +workbook-create with df_to_sheet for three sheets (income / balance
  / cashflow): create ok, dtypes (datetime64[ns] / float64) + formats
  (#,##0 / 0.0% / yyyy-mm-dd) survive on read-back through sheet_to_df.
- read → pandas mutate → write-back round-trip preserves both data
  and formats.

* chore: drop accidentally-committed __pycache__/ and gitignore .pyc

The previous commit (5fac9c39) shipped sheets_df.py and inadvertently
included its `__pycache__/sheets_df.cpython-312.pyc` — local Python
import created the bytecode cache during PPE round-trip verification and
`git add skills/lark-sheets/` swept it in.

Remove the pyc and add Python bytecode patterns to .gitignore so the
skill-bundled helper scripts don't pull cache files into future commits.

* refactor(sheets): drop --dataframe / --dataframe-out + apache/arrow dep

Per the design review at NcmxdRo2yoZ4OXxoMUZcxRZ7nHd, the Arrow IPC binary
input/output channel adds a heavy columnar runtime to the CLI for no new
capability — the typed JSON --sheets path already covers everything, and
the column-major / zero-copy advantages collapse the moment the CLI re-
encodes into the row-oriented sheets OpenAPI JSON body. Removing it also
lets us drop the `--ignore github.com/apache/arrow/go/v17` license-check
escape hatch.

Deleted:
- shortcuts/sheets/lark_sheet_dataframe.go (+ test)
- --dataframe branches in +table-put / +workbook-create
- --dataframe-out branch in +table-get
- StdinConsumed / MarkStdinConsumed exported methods (the binary stdin
  reader was the only out-of-band consumer); internal stdinConsumed
  guard against duplicate `-` input flags stays
- apache/arrow/go/v17 + transitive deps via `go mod tidy`
- CI go-licenses --ignore for arrow and the LICENSE.txt assertion step
- --dataframe / --dataframe-out coverage in skill references

Pandas users keep the round-trip via the existing skill script
skills/lark-sheets/scripts/sheets_df.py over the JSON path.

The full pre-removal state is preserved on branch feat/sheets-arrow-stash.

Upstream sheet-skill-spec follow-up: the two flag rows in the canonical
spec + base table tblV2F6fqIjyCFQW must also be dropped so the next sync
does not re-add them.

* sync(sheets): pick up --sheets one-liner fix from spec

Mirrors sheet-skill-spec 5562f83. The +table-put / +workbook-create
--sheets flag descriptions (and the --print-schema description on the
sheets array) now point at the existing df_to_sheet helper instead of
the previous misleading one-liner that produced a dict missing the
outer {"sheets":[...]} envelope and the per-sheet `name`. Agents that
copy-paste the description verbatim now build a valid payload.

Auto-synced via spec's generate:cli + sync:consumers; go generate
./shortcuts/sheets/... regenerated flag_defs_gen.go so its embedded
flagDefs stays byte-equal to data/flag-defs.json.

* test(sheets/e2e): close E2E coverage gaps for newly added shortcuts

AGENTS.md requires both dry-run and live E2E for every newly registered
shortcut, and behavior-changing refactors need at least the matching
half. Three gaps remained on feat/lark-sheets-develop:

- +sheet-show-gridline / +sheet-hide-gridline (new): only dry-run E2E.
  Add sheets_gridline_workflow_test.go — create a real spreadsheet,
  toggle hide then show against a live sub-sheet, assert ok=true on
  both (gridline state is write-only — there is no read-back field on
  +sheet-info / +workbook-info — so a successful envelope is the
  meaningful signal; the dry-run E2E already pins the wire shape).

- +workbook-import (new): only dry-run E2E. Add
  sheets_workbook_import_workflow_test.go — write a local CSV, run
  the full upload → create-task → poll, assert ready=true with a
  sheet token, +info confirms the imported workbook is reachable,
  cleanup deletes the spreadsheet.

- +workbook-export refactor (no-download default changed): had live
  E2E but no dry-run E2E in tests/cli_e2e/. Add
  sheets_workbook_export_dryrun_test.go — pin the three sheet-
  specific differences vs drive +export: type=sheet hard-coded,
  csv mode routes --sheet-id onto sub_id (xlsx mode omits it), and
  --output-path maps onto the dry-run plan's top-level output_dir.
  Also pins the csv-without-sheet-id validation error.

* refactor(sheets): unify workbookCreatedButFillFailed with OutPartialFailure

Three "made it halfway and stopped" exits in the sheets domain previously
disagreed on shape, which made the post-failure recovery flow hard for
agents to predict from one command to another:

- +table-put partial write           → exit 1, stdout ok:false envelope
- +table-put zero-sheet write        → exit 1, stderr api/server_error
- +workbook-create create-but-fill   → exit 2, stderr validation/failed_precondition

OutPartialFailure exists exactly for "the side effect landed but the
follow-up didn't" — it stamps an ok:false result envelope on stdout
(carrying the state the caller needs to recover) and returns the bare
partial-failure exit signal. The workbook-create fill-failure path was
the odd one out: it surfaced as a typed failed_precondition error on
stderr, which agents couldn't tell apart from a plain validation refusal
even though the spreadsheet really did exist and a retry / cleanup was
possible.

Migrate workbookCreatedButFillFailed onto OutPartialFailure so the four
call sites in +workbook-create's Execute (sheet-resolve failure, initial
fill failure, style-only resolve failure, style-only apply failure) emit
the same envelope shape +table-put's partial write does:

  {
    "ok": false,
    "data": {
      "spreadsheet_token": "shtNEW",
      "reason": "spreadsheet shtNEW created but initial fill failed",
      "hint":   "the spreadsheet exists; retry the fill … or delete it",
      "cause":  {"category": "...", "subtype": "...", "message": "..."}
    }
  }

The inner failure's typed problem (category / subtype / message) is
flattened into the `cause` field so agents stay diagnosable from the JSON
envelope alone, instead of having to errors.Unwrap a Go error.

Updated TestExecute_WorkbookCreate_FillFailureKeepsToken to assert the
new shape (ok:false envelope on stdout, *output.PartialFailureError exit
signal, structured cause carrying the underlying invalid_response
subtype) — preserving the original test intent (token must survive for
recovery; inner cause must stay diagnosable) under the new contract.

* chore(sheets): three review nits — WithCause + stale comment + unexport

- shortcuts/sheets/flag_schema_validate.go:106 — composite-JSON shape
  validation was wrapping vErr's message into a typed sheets validation
  error without preserving vErr as the typed cause; add the missing
  .WithCause(vErr) so errors.Unwrap and ProblemOf still find the
  underlying validator error (matches every other typed-error chain
  helper in the file).

- shortcuts/sheets/lark_sheet_batch_update.go:92 — comment claimed
  batchUpdateInput returns "FlagErrorf-typed errors", but FlagErrorf no
  longer exists (the typed-error migration replaced it with
  common.ValidationErrorf / errs.ValidationError); update the comment
  to reflect what is actually returned.

- shortcuts/drive/drive_export.go:121 — drop the ValidateExport public
  alias and rename to validateExport. sheets +workbook-export reuses
  RunExport / PlanExportDryRun from this package but inlines its own
  (sheet-specific) Validate, so there is no cross-package call site —
  ValidateExport was a misleading sibling of the genuinely-shared
  ValidateImport. Comment added to record the asymmetry so future
  readers do not export it back.

* chore(deps): drop stale indirect bumps left by the arrow removal

The earlier --dataframe / --dataframe-out + apache/arrow/go/v17 removal
deleted the arrow consumer but left two indirect lines in go.mod pinned
to the versions arrow had pulled in:

  - github.com/kr/text                   v0.2.0
  - golang.org/x/exp  v0.0.0-20240222234643-814bf88cf225

With arrow gone, larksuite/cli was the only requirer of those exact
versions; every real consumer needs lower ones (kr/pretty wants
kr/text v0.1.0; charmbracelet/huh wants x/exp …20231006; xo/terminfo
wants x/exp …20220909). Removing the two indirect lines and running
`go mod tidy` lets MVS pick the real-consumer versions and drops the
explicit indirect entries entirely — go.mod net-diff against main is
now zero for this branch.

Verified locally: go build ./...; go test ./shortcuts/sheets/...
./shortcuts/drive/... ./shortcuts/common/... ./internal/auth/...
./cmd/auth/... — all green.

---------

Co-authored-by: zhengzhijie <zhengzhijie.j@bytedance.com>
Co-authored-by: Chenweifeng-bd <chenweifeng.1534@bytedance.com>
2026-06-25 10:48:13 +08:00
arnold9672
5efaf65aec feat: surface search API notices (#1413)
* feat: surface search API notices

sa: safe
doc: none
cfg: none
test: unit test

* fix: surface search notices in default output

* docs: add search notice doc comments

* docs: expand search notice doc comments
2026-06-23 14:27:04 +08:00
zgz2048
80bea45c6a feat: support base record comments (#1043)
* feat: support base record comments

* fix: tighten base comment validation

* fix: validate wiki base comment flags
2026-06-23 11:20:07 +08:00
evandance
c5b5aece33 refactor: retire legacy error envelopes and enforce typed contract (#1449)
* refactor: retire legacy error envelopes and enforce typed contract

Consolidate all command error reporting onto the typed errs.* contract, remove
the legacy error surface that predated it, and tighten the lint guards so the
contract holds across the whole repository going forward.

Every failure now reaches stderr as one envelope shape: a category, an
optional subtype, a human- and agent-readable message, and a recovery hint,
with invalid parameters listed under `params`. The legacy ExitError envelope,
its constructors, and the boundary bridge that promoted untyped config and
authorization errors are deleted, leaving a single path from error to wire.
Predicate commands keep their silent-exit behavior through a dedicated signal
that carries only an exit code.

Infrastructure paths that still emitted ad-hoc envelopes — flag parsing,
unknown commands and subcommands, plugin and policy guards, confirmation
prompts, and auth/config failures — now classify into the same taxonomy.
Business, API, auth, and config exit codes are preserved; the one behavioral
change is that Cobra usage failures (missing required flag, unknown command,
bad arguments) now emit the typed validation envelope and exit 2, matching the
explicit flag and subcommand guards, instead of Cobra's plain-text exit 1.

Enforcement is repo-wide rather than per-path:
- The errscontract guards run by default everywhere instead of through a
  migration allowlist, so legacy envelopes cannot be reintroduced anywhere.
- errorlint runs across the whole repository: every error wrap must use %w and
  every comparison must use errors.Is/errors.As, so interior wraps stay legal
  but can no longer break the chain the typed boundary relies on.
- The errs-no-bare-wrap guard is keyed by structural prefix instead of an
  explicit per-domain allowlist, so new shortcut domains are covered without
  editing a list. It runs where forbidigo is enabled (the shortcut domains and
  the auth/config/service command groups); repo-wide chain integrity for the
  remaining command paths is carried by errorlint above.

* test: align cli_e2e success assertions to the ok envelope

The api and service success path now emits the {"ok":true} envelope, so the
cli_e2e workflow assertions that still expected the old {"code":0} shape via
AssertStdoutStatus(t, 0) fail once they run with live credentials. Switch those
workflow assertions to AssertStdoutStatus(t, true); the fake-payload helper test
in core_test.go keeps its code-shape assertion.
2026-06-17 19:42:38 +08:00
wangweiming-01
4464ba7660 fix: validate drive import folder target (#1485)
Change-Id: I43755c3966b0daa06b708d2b3d03294f439547fa
2026-06-16 18:14:08 +08:00
yballul-bytedance
3feb70b32a feat(drive): 支持导出 Base 结构快照 (#1481)
1. 为 drive +export 增加 --only-schema 参数,并透传 only_schema 到导出任务请求。
2. 限制该参数仅用于 bitable 导出 .base,并补充单测与 dry-run E2E 覆盖。

Change-Id: I736cebf5841cc1c6acaa8a3ab16be51ba4cb355d
2026-06-16 16:36:31 +08:00
search_zhuhao
72c294712c feat: 【larksuite/cli】【drive 搜索支持 original_creator_ids】 M-7074213537 (#1046)
sa: none

fg: none

cfg: none

doc: none

test: ppe
Change-Id: I88bedd02a5daa3307b05c9b6f94748e1544d279a
2026-06-15 14:18:45 +08:00
fangshuyu-768
170565c57e fix: add @file/stdin support to drive +add-comment --content (#1343) 2026-06-09 15:20:25 +08:00
fangshuyu-768
281cdbd37c feat(drive): harden inspect shortcut failures (#1324) 2026-06-08 17:09:53 +08:00
caojie0621
62364fc320 fix(drive): use docs secure label read scope (#1281) 2026-06-05 12:48:22 +08:00
zhoujunteng-max
ac116e7ca3 feat(drive): add drive preview and cover shortcuts and document quota details (#1259)
* feat: support get quota detail

* feat: add drive preview and cover shortcuts

- add `drive +preview` and `drive +cover` shortcuts
- wrap `preview_result` output with stable preview item fields
- support cover download via `preview_download` with validated preset mappings
- update lark-drive skill references for preview and cover usage

* fix(drive): classify cover 404 as failed precondition

* fix(drive): show preview download step in dry-run

* docs(drive): clarify quota details user-only usage

* fix(drive): soften cover 404 guidance
2026-06-04 21:08:59 +08:00
evandance
b3fcf55611 feat(common): emit typed validation errors from shared shortcut pre-checks (#1242)
Input pre-check failures shared by every shortcut — @file/stdin input
resolution, enum validation, and unsupported --dry-run — now leave the
CLI as typed validation envelopes naming the offending flag, so scripts
and AI agents can branch on `param` instead of parsing prose. Wire type,
exit code, and message text are unchanged; the new fields are additive.

The shared layer also gains typed replacements for its legacy
error-producing helpers, so each business domain can migrate to typed
errors without rebuilding common plumbing, and a path-scoped lint guard
keeps migrated domains from sliding back.

Changes:
- Shared pre-check failures (input flags, enum values, dry-run support)
  return typed validation errors carrying the offending flag as `param`.
- Every legacy error-producing helper in shortcuts/common has a typed
  replacement that preserves the existing message text: validation and
  flag-group checks, chat/user ID validation (callers name the flag so
  `param` is ground truth), "me" open-id resolution, safe-path checks,
  input-stat and save-error wrapping. Legacy helpers stay for
  not-yet-migrated domains, marked deprecated — including the legacy
  API-result classifier, whose typed route is runtime.CallAPITyped.
- A new errscontract rule rejects legacy common-helper calls on migrated
  paths, so a migrated domain cannot silently reintroduce legacy
  envelopes; drive is the first locked path and its last legacy
  ID-helper calls are replaced.
2026-06-03 19:20:19 +08:00
evandance
98173ae5a9 feat(drive): emit typed error envelopes across the drive domain (#1205)
Drive-domain errors now leave the CLI as typed, machine-branchable
envelopes — a stable `type` plus `subtype` and named fields (param,
params, retryable, log_id, hint) — so scripts and AI agents can branch on
structure and act on a recovery hint instead of parsing prose.

Changes:
- Every error produced in the drive domain — validation, file I/O, and the
  failures returned from its Lark API calls — is emitted as a typed errs.*
  error; the exit code is derived from the error category. Drive's API calls
  now go through a shared typed classifier, so failures carry subtype,
  troubleshooter, a recovery hint, and the request's log_id whether the
  server returns it in the response body or the x-tt-logid header; an
  already-typed network/auth error is never downgraded into a generic API
  error.
- Known API conditions (resource conflict, cross-tenant, cross-brand, ...)
  carry a recovery hint keyed by their error class; a command can refine
  that hint with command-specific guidance.
- Batch partial failures (+push / +pull / +sync, where some items succeed
  and some fail) now report an honest ok:false multi-status result on
  stdout — the summary and every per-item outcome stay machine-readable —
  and exit non-zero, instead of a misleading ok:true success envelope.
- Duplicate rel_path conflicts report each colliding path as a structured
  params entry (RFC 7807 invalid-params style).
- Static guards lock the drive path so legacy error construction — direct
  envelopes or the auto-classifying API helpers — cannot be reintroduced,
  making drive the template for the remaining domains.

Output changes worth noting for consumers:
- Error envelopes now carry typed type/subtype and named fields; exit
  codes follow the error category (malformed or incomplete API responses
  are reported as internal errors rather than generic API errors).
- Batch partial failures (+push / +pull / +sync) emit an ok:false result
  envelope on stdout (summary + per-item items[]) and exit non-zero; the
  per-item results stay on stdout rather than in a stderr error envelope.

Errors surfaced through shared cross-domain helpers (scope precheck, media
import upload, metadata lookup, save-path resolution) are not yet typed;
they migrate with the shared layer in a follow-up change.
2026-06-03 10:27:15 +08:00
evandance
99e314fe0b feat(errs): typed envelope contract for auth-domain errors (#1135)
Every failure on the authentication, authorization, and configuration
path now surfaces as a typed structured error instead of an ad-hoc
envelope. Users and scripts that consume CLI output get:

  - a fixed nine-category taxonomy on the wire, each mapped to a
    stable shell exit code (authentication/authorization/config = 3,
    network = 4, internal = 5, policy = 6, confirmation = 10)
  - identity-aware detail fields (missing_scopes, requested_scopes,
    granted_scopes, console_url, log_id, retryable, hint) carried
    uniformly on the envelope
  - a single canonical policy envelope at exit 6; the legacy
    auth_error carve-out is retired
  - per-subtype canonical message + hint that preserves Lark's
    diagnostic phrasing and routes recovery to the right actor:
    app developer (app_scope_not_applied), user (missing_scope,
    token_scope_insufficient, user_unauthorized), or tenant admin
    (app_unavailable, app_disabled)
  - wrong app credentials classify as config/invalid_client whether
    surfaced by the Open API endpoint (99991543) or the tenant
    access-token mint endpoint (10003 / 10014), instead of
    collapsing to a transport error or api/unknown
  - local shortcut scope preflight emits the same
    authorization/missing_scope envelope (identity + deterministic
    missing-scope set) used by the post-call permission path, so AI
    consumers read the same structured shape from precheck and from
    server-returned permission denial
  - streaming download/upload failures keep the same network subtype
    split (timeout / TLS / DNS / transport) as the non-stream path
    instead of collapsing every cause to a generic transport failure
  - console_url is carried only on the bot-perspective
    app_scope_not_applied envelope (where the recovery action is
    "developer applies the scope at the developer console"); the
    user-perspective missing_scope envelope drops the field, since
    the only actionable user recovery is `lark-cli auth login --scope`
    and pointing an end user at a console they cannot modify is
    misleading
  - bind workflows (Hermes / OpenClaw / lark-channel) flatten dynamic
    Type tags to wire 'config' with the original module name kept
    as a metric label

All 10 typed errors are cause-bearing, nil-safe on .Error() and
.Unwrap(), and defensively clone slice setter inputs. Four lint
rules (CheckNilSafeError / CheckBuilderImmutable / CheckUnwrapSymmetry
/ CheckBuildAPIErrorArms) lock these invariants on migrated paths.
2026-05-30 19:08:41 +08:00
Kyalpha
3cd84fca90 test(drive): drop redundant CONFIG_DIR isolation in inspect Execute tests (#1121)
The six TestDriveInspectExecute_* tests set
t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir()) but build the CLI via
cmdutil.TestFactory(t, cfg), which provides an in-memory config closure
(func() (*core.CliConfig, error) { return config, nil }) and never reads the
filesystem. Per the repo learning from PR #343, this env var should only be
set for tests exercising the real NewDefault() factory path. None of these
tests use NewDefault(), so the calls are dead and removed.

No behavior change; all TestDriveInspect* tests still pass.

Co-authored-by: kyalpha313 <kyalpha313@users.noreply.github.com>
2026-05-28 17:32:31 +08:00
caojie0621
e182b01f68 feat(drive): add secure label shortcuts (#985) 2026-05-26 22:06:12 +08:00
fangshuyu-768
f00261da9f fix(drive): support doubao drive inspect URL variants (#1106) 2026-05-26 19:51:47 +08:00
ethan-zhx
ee9d090e64 feat(slides): support importing pptx as slides (#1068) 2026-05-26 11:54:46 +08:00
evandance
fe72e41fb2 feat(errs): add structured CLI error contract (#984)
Introduce a typed error contract framework for lark-cli so in-process
Go callers can branch via errors.As(&errs.XxxError{}) and shell scripts,
AI agents, and protocol adapters can branch on stable JSON type/subtype
fields instead of regex-parsing free-form messages.

Adds:
- Canonical taxonomy under errs/ (9 categories + typed Error structs
  embedding a shared Problem, RFC 7807-aligned)
- Centralized Lark code metadata + identity-aware BuildAPIError dispatch
- Typed JSON envelope writer alongside the legacy envelope writer
- MCP / OAuth (RFC 6750 Bearer) projection adapters
- Five CI lint guards preventing ad-hoc taxonomy drift

Backward compatibility: legacy *output.ExitError producers (ErrAPI,
ErrWithHint, Errorf, ErrBare) and business shortcuts that use them
continue to render the legacy envelope unchanged. SecurityPolicyError
wire format and exit code are preserved via a carve-out; taxonomy
migration is deferred to PR 2. Domain-specific business migration is
staged across PR 3+.

Framework-direct paths now return typed *errs.*Error: ErrAuth /
ErrValidation / ErrNetwork emit category literals on the wire
(authentication / validation / network), *core.ConfigError is promoted
at the cmd/root boundary with exit code aligned from 2 to 3, and Lark
API permission denials classified by BuildAPIError exit 3.

At the SDK boundary, WrapDoAPIError preserves any already-classified
error (legacy *output.ExitError or typed *errs.*) so output.ErrAuth
from missing credentials surfaces with the auth category and exit 3
intact instead of being downgraded to a network error. Policy responses
classified by BuildAPIError (codes 21000 / 21001) extract challenge_url
and the canonical hint from the response body, matching what the
auth transport already surfaces at the HTTP layer; non-https
challenge URLs are dropped.

First PR in the feat/error-contract-* series.
2026-05-26 11:42:33 +08:00
ethan-zhx
5c01a7f7f0 feat(slides): export slides (#988)
Change-Id: Ice3e8784e78986d427c4c94664e1e5edff2a4fcd
2026-05-22 17:19:49 +08:00
wangweiming-01
e54220ade1 feat: support files in drive +add-comment (#975)
* feat: support markdown files in drive +add-comment

Change-Id: Id9a87706a1e43756d8142637be9ec1e0748d4ddf

* fix: use markdown file comment anchor placeholder

Change-Id: Ifffc4cdd963c13e53f4cad154aebe11ae309df9e

* fix: gate drive file comments by supported extensions

Change-Id: Ie6c7f38dbbea1f87a81600da71180627b53a2355
2026-05-21 21:40:27 +08:00
fangshuyu-768
816927f8b8 fix: surface auto-grant failures via stderr and JSON hint (#1015)
When a resource is created with bot identity, the CLI attempts to
auto-grant full_access to the current user. If the user open_id is
missing or the grant API call fails, the result was only written to
the JSON permission_grant field and easily overlooked.

Changes:
- Add stderr warnings when auto-grant is skipped or fails
- Add 'hint' field to permission_grant JSON output with failure reason
  and actionable next step (e.g. auth login, check scope, retry)
- Add end-to-end skipped/failed tests across all affected shortcuts
  (doc, drive, sheets, slides, wiki, markdown, base)

Closes #963
2026-05-21 18:17:24 +08:00
wangweiming-01
e19e09019c feat: return real tenant URLs for drive +upload and markdown +create (#992)
Change-Id: I6b513eef57a3479c8971b3bb6cbf005cad3f8040
2026-05-21 11:07:37 +08:00
liujinkun2025
9272b9da99 docs(skills): migrate docs +search to drive +search and fix creator_ids owner semantic (#951)
docs +search is in maintenance and will be removed; cloud-space resource
discovery is consolidated onto drive +search. Two related doc/help fixes:

1. Redirect guidance: docs +search -> drive +search
   - skill-template/domains/{doc,sheets}.md
   - lark-base/SKILL.md: --filter '{"doc_types":["BITABLE"]}' -> --doc-types bitable
   - lark-sheets/SKILL.md: body + frontmatter description, add drive-search ref link
   Same server API, equivalent capability; only flattens the entry from
   nested --filter JSON to flags. reference links repointed to lark-drive.

2. Fix creator_ids/--mine semantic: creator -> owner
   The server matches creator_ids (incl. --mine / --creator-ids) by owner
   (document owner), not original creator, despite the OpenAPI field name.
   - shortcuts/drive/drive_search.go: --help Desc and Tip
   - lark-drive/references/lark-drive-search.md: identity section, params, rules, examples
   - lark-drive/SKILL.md: top-level guidance
   - lark-doc/references/lark-doc-search.md: creator_ids usage note (now self-consistent)
   Wire field name creator_ids kept (aligned with the server).

Docs/help strings only, no logic change; gofmt / go vet / package build pass.

Change-Id: If3ebf5a247b7e38b58050c677dc888a310f1c6b6
2026-05-20 15:08:50 +08:00
yballul-bytedance
69c34481f5 feat: Product CLI 4no-meego (#759)
Change-Id: If08f236c8ae351f92683f2b861cc999eb6f1d22d
2026-05-20 14:02:03 +08:00
fangshuyu-768
7c54f9b023 feat(drive): switch markdown export to V2 docs_ai fetch API (#948)
Switch `drive +export --file-extension markdown` from the legacy V1
GET /open-apis/docs/v1/content API to the V2
POST /open-apis/docs_ai/v1/documents/{token}/fetch API for
higher-quality Lark-flavored Markdown output.

- Update DryRun and Execute paths to use V2 endpoint with JSON body
- Add docx:document:readonly scope for the new API
- Validate V2 response structure (fail fast on missing document/content)
- Encode token in URL path via validate.EncodePathSegment
- Update unit tests and add V2 response validation error path tests
- Add E2E dry-run test for markdown export path
- Update skill documentation
2026-05-19 17:53:54 +08:00
fangshuyu-768
4aa61db8b2 feat(drive): add +inspect shortcut for document URL inspection with wiki unwrapping (#947)
* feat(drive): add +inspect shortcut for document URL inspection with wiki unwrapping

Implements #662: `lark-cli drive +inspect --url <url>` inspects any
Lark/Feishu document URL to get its type, title, and canonical token,
with automatic wiki URL unwrapping via get_node API.

- Add ParseResourceURL (inverse of BuildResourceURL) in common
- Extract FetchDriveMetaTitle as public shared helper
- Add drive +inspect shortcut with wiki unwrapping support
- Add skill reference docs and update SKILL.md
- Dry-run E2E tests for docx URL, wiki URL, and bare token

* refactor: move host validation from ParseResourceURL to +inspect

ParseResourceURL is a general-purpose URL parser that should not
hardcode domain lists — future Lark domains would silently break.
Move isLarkHost/larkHostSuffixes to drive_inspect.go where host
validation is a business decision of the +inspect command.
Add E2E test for non-Lark host with Lark-like path.

* refactor: remove host validation from +inspect

Lark supports custom enterprise domains, so a hardcoded suffix list
can never be exhaustive and would falsely reject valid URLs.
Path-based matching in ParseResourceURL is sufficient; invalid URLs
will fail naturally at the API call stage.
2026-05-19 15:19:35 +08:00
liujinkun2025
c4fb7006d2 feat(wiki): add +node-get / +node-delete / +space-create shortcuts (#904)
- +node-get: wrap wiki.spaces.get_node; accepts node_token, obj_token,
  or a Lark URL (URL path auto-infers obj_type); formatted output with
  creator / updated_at. No synthesized url — get_node returns none and a
  BuildResourceURL fallback is a non-canonical link that misleads in a
  read/confirm command (sibling read shortcuts omit it too)
- +node-delete: wrap space.node delete; high-risk-write (--yes gated),
  async delete-node task polling, auto-resolves space_id via get_node
  when --space-id omitted, actionable hints for codes 131011 / 131003.
  The delete-node task result lives under the gateway's generic
  `simple_task_result` key (NOT `delete_node_result`)
- +space-create: wrap spaces.create; user-only identity, --name
  required (no empty-name spaces), flattened space output, no url
- factor the shared wiki async-task poll loop into wiki_async_task.go;
  preserve upstream Lark Detail.Code on poll exhaustion (no longer
  rebuilt via lossy ErrWithHint)
- drive +task_result: add wiki_delete_node scenario so +node-delete's
  async-timeout next_command actually resolves
- skill docs: reference pages for the 3 new shortcuts + SKILL.md
  shortcuts table (no raw nodes.delete API exists — it's shortcut-only,
  so it is intentionally absent from API Resources / permission table);
  drop the circular TestWikiShortcutsIncludeAllCommands change-detector

Change-Id: I316f78290cec5bc50f80d629173e3bf2a35dd005
2026-05-19 11:21:54 +08:00
fangshuyu-768
df4b657737 feat(drive): add +sync workflow for Drive directories (#873)
Bidirectional sync between a local directory and a Drive folder with
diff detection (new_local, new_remote, modified, unchanged) and
conflict resolution strategies (--on-conflict: remote-wins, local-wins,
keep-both, ask).

Key behaviors:
- Type conflict detection: hard-fail when local file vs remote non-file
  or local directory vs remote file
- Keep-both: rename local with __lark_<hash> suffix, then pull remote;
  occupied map includes localDirs to prevent suffix collision
- Local-wins partial-success: prefer returned file_token on upload failure
- Empty directory mirroring: pre-create local dirs on Drive via
  drivePushWalkLocal before scope preflight
- Structured errors throughout (output.Errorf / output.ErrWithHint)

Includes unit tests and E2E tests (dry-run + live workflow).
2026-05-18 19:56:43 +08:00
wangweiming-01
241952459d feat: add drive version shortcut (#841)
Change-Id: I87bb32c86e3c3362f541ccc6320c656eb795ec9b
2026-05-18 16:44:10 +08:00
fangshuyu-768
5778adfefa fix(drive): preserve parent token on nested overwrite (#908)
* fix(drive): preserve parent token on nested overwrite

Ensure drive +push overwrite requests for nested files keep parent_node aligned with the actual remote parent folder and report parent resolution failures explicitly.

* test(drive): cover nested overwrite push workflow

Add a live drive +push workflow case for overwriting a nested remote file so the PR parent-token fix is exercised against the real backend and verified to converge via +status.
2026-05-15 18:32:58 +08:00
fangshuyu-768
52e0129078 feat(drive): add quick mode to status diff (#870) 2026-05-14 20:37:39 +08:00
wangweiming-01
0d20f88453 feat: support file-token overwrite and version output for drive +upload (#885)
Change-Id: I76c334578fc2fa5cfd2eedb4525b0d9d735f610e
2026-05-14 19:50:51 +08:00
fangshuyu-768
f1aa7d8f42 feat(drive): add modified-time smart sync mode (#859) 2026-05-14 14:10:35 +08:00
fangshuyu-768
4ba39ef392 fix(drive): handle duplicate remote sync paths (#803) 2026-05-11 17:51:23 +08:00
河伯
020aeb87ad feat(drive): pre-flight 10000-rune total cap for +add-comment reply_elements (#605)
* feat(drive): pre-flight per-text-element byte limit for +add-comment

The open-platform comment API returns an opaque [1069302] Invalid or
missing parameters whenever a single reply_elements[i] text exceeds
its implicit byte budget. The error does not name which element failed
or that length is the cause, so callers resort to binary-search
debugging.

Empirically: Chinese text up to ~80 chars (~240 bytes) lands; ~130
chars (~390 bytes) fails. Set the pre-flight limit to 300 bytes which
sits safely inside the known-good zone.

- parseCommentReplyElements now rejects any text element whose UTF-8
  byte length exceeds 300, with an ExitError naming the element index
  (#N, 1-based) and both the rune and byte counts, plus an ErrWithHint
  recommending the correct remediation (split into multiple text
  elements — the comment UI renders them as one contiguous comment).
- The previous 1000-rune check is removed: it was too lenient (a
  Chinese text under that cap would still fail server-side).
- skills/lark-drive/references/lark-drive-add-comment.md documents
  the per-element limit and the correct split pattern so agents
  avoid constructing oversized single elements upstream.

Addresses Case 12 in the 踩坑列表 doc.

* fix(drive): correct +add-comment hint to match actual escape coverage

`escapeCommentText` only expands `<` and `>` (each → 4 bytes via
`&lt;` / `&gt;`); `&` is intentionally left as-is. Both the over-limit
hint and the inline comment in `parseCommentReplyElements` previously
claimed `&` was also escaped, with a "4-5 bytes each" range that
implicitly assumed `&amp;` (5 bytes) — a string of 300 `&` chars
would actually fit in the budget, but a user reading the hint would
think otherwise and pre-emptively split it.

Code:
- Hint string ends with `Note: '<' and '>' are HTML-escaped and
  counted in their escaped form (4 bytes each).` (was: included `&`
  and "4-5 bytes")
- Inline comment above the budget check now matches:
  `escapeCommentText only expands '<' and '>' (each becomes 4 bytes:
  &lt; / &gt;); '&' is intentionally left as-is.`

Tests (regression):
- New `300 ampersands accepted (escapeCommentText leaves '&' as-is)`
  subtest pins that 300 `&` chars stay within budget. Without the fix
  this also passed (function was always correct), but the hint was
  lying — the test pins the budget contract loud and clear.
- New `TestParseCommentReplyElementsHintMatchesEscape` asserts the
  hint string itself: must mention `'<' and '>'` / `4 bytes`, must NOT
  mention `'&'` / `&amp;` / `4-5 bytes`. Catches a future drift if
  `escapeCommentText` is changed without updating the hint, or
  vice-versa.

The skill md (`skills/lark-drive/references/lark-drive-add-comment.md`)
already had the right wording (`每个 < 或 > 占 4 字节`), so it was the
in-Go strings that drifted; this commit aligns code with doc.

* fix(drive): rewrite +add-comment length cap to match real server behavior

The original PR set a 300-byte per-element pre-flight check, justified
by the empirical pattern "~80 Chinese chars succeeds, ~130 fails". A
fresh round of probing the live `/open-apis/drive/v1/files/{token}/
new_comments` endpoint with a real docx shows that pattern does not
reproduce, and the actual contract is very different:

  - 10000 ASCII / 10000 Chinese / 10000 '<' (escaped to 40000 bytes)
    in a single text element: all OK
  - 10001 of any of the above in a single text element: [1069302]
  - 5000 + 5000 across two text elements (total 10000): OK
  - 5000 + 5001 across two text elements (total 10001): [1069302]
  - 4000 + 4000 + 4000 across three (total 12000): [1069302]

Two consequences:

1. The cap is *10000 runes total across all reply_elements text*, not
   300 bytes per element. The old check rejected legitimate input
   anywhere from ~100 to 10000 Chinese chars (≈100x too aggressive).

2. The hint that recommended "split the content across multiple
   {\"type\":\"text\",\"text\":\"...\"} elements" was actively wrong —
   splitting doesn't bypass a total cap. A user told to split a
   10001-char message into 5000+5001 hits the same opaque [1069302].

This commit:

- Replaces `maxCommentTextElementBytes = 300` with
  `maxCommentTotalRunes = 10000`. The constant's doc comment records
  the probe matrix above so future maintainers know how it was
  derived.
- Switches the measurement from `len(escapeCommentText(input.Text))`
  to `utf8.RuneCountInString(input.Text)`. Server counts raw runes;
  byte width and post-escape form are irrelevant. The escape itself
  still happens — `<` and `>` still get rendered literally — but it
  no longer participates in the length check.
- Tracks a running `totalRunes` across the whole reply_elements array
  and bails at the first element that pushes the cumulative total
  over the 10000-rune budget, with index reporting that points at the
  offending element.
- Rewrites the over-cap hint to (a) name the actual 10000-rune budget,
  (b) explicitly say splitting does NOT help, (c) drop the wrong
  "comment UI still renders them as one contiguous comment" framing
  that implied splitting was a workaround.
- Adds a `TestParseCommentReplyElementsHintForbidsSplitAdvice`
  watchdog that fails if any future drift puts the discredited split
  advice back into the hint.

Tests: 11 cases on TestParseCommentReplyElementsTextLength covering
single-element boundary (ASCII / Chinese / angle brackets at exactly
10000 and at 10001), multi-element total cap (5000+5000 OK, 5000+5001
rejected with index pointing at element #2), early-element-overshoot
indexing (first element at 10001 reports index #1, not the trailing
element), and mention_user not double-counting toward the cap.

Skill md updated: removes the 300-byte / "split into multiple
elements" advice; documents the 10000-rune total cap with a note that
the schema currently advertises 1-1000 chars and is out of date,
plus a procedure for re-probing if the server-side limit ever moves.

Manual API verification: rebuilt binary and posted comments at
boundary lengths — all OK cases (100 / 5000 / 10000 chars, 5000+5000
split) accepted by server; over-cap cases (10001 / 10100 single, and
5000+5001 split) rejected by the new pre-flight before reaching the
network.

---------

Co-authored-by: fangshuyu <fangshuyu@bytedance.com>
2026-04-30 18:52:44 +08:00
zhouyue-bytedance
ac4c34f2ad feat: support file-name for drive export (#685)
* feat: support file-name for drive export

* test: cover drive export file-name metadata
2026-04-30 17:30:23 +08:00
fangshuyu-768
30ad38d4b6 feat(drive): add +pull shortcut for one-way Drive → local mirror (#696)
* feat(drive): add +pull shortcut to mirror a Drive folder onto local

Adds `drive +pull`, a one-way Drive → local mirror command. It
recursively lists --folder-token, downloads each type=file entry
into --local-dir at the matching relative path, and optionally
deletes local files absent from the remote (mirror semantics).

Implementation notes:

- Listing recurses through subfolders with the standard 200-page
  pagination loop. Online docs (docx, sheet, bitable, mindnote,
  slides) and shortcuts are skipped since there is no equivalent
  local binary to write back. Folder tree is reproduced under
  --local-dir, with parent directories auto-created by FileIO.Save.

- Per-file --if-exists=overwrite (default) | skip controls how
  pre-existing local files are treated; the framework's enum guard
  rejects any other value.

- --delete-local is the only destructive flag and is bound to --yes
  in Validate: --delete-local without --yes is rejected upfront so
  no listing or download even runs. --delete-local --yes performs
  downloads first, then walks --local-dir and removes regular files
  not present in the remote map. This matches the spec doc's
  "high-risk-write" intent for --delete-local without making the
  default pull path require confirmation.

- --local-dir is funneled through validate.SafeLocalFlagPath so
  errors reference --local-dir instead of the framework default
  --file. FileIO().Stat then enforces existence and IsDir.

- Scopes: drive:drive.metadata:readonly + drive:file:download. The
  broader drive:drive is disabled by enterprise policy in some
  tenants.

- Listing helper (drivePullListRemote) is duplicated locally rather
  than reused from drive_status.go because that change is still in
  open PR #692; once it merges, both can be lifted into a shared
  drive package helper. TODO marker is left in the code.

Tests cover six unit scenarios (happy-path with nested subfolder +
docx skipping, --if-exists=skip, --delete-local rejection without
--yes, --delete-local --yes deletes orphans, absolute-path
rejection, bad enum) and four E2E dry-run scenarios (request shape,
absolute path rejection, --delete-local --yes guard, missing
required flag).

* docs(skills): document drive +pull in lark-drive skill

Adds references/lark-drive-pull.md covering parameters, output schema
(summary + per-item action breakdown), the type=file scoping rule,
the --if-exists policy matrix, and the --delete-local + --yes safety
contract. Calls out the network-traffic caveat (pull is full-download,
unlike +status which only fetches when both sides have the file) and
the cwd boundary on --local-dir.

Wires +pull into the Shortcuts table in SKILL.md.

* fix(drive): walk +pull on canonical absolute root to close symlink/.. escape

Same root cause as the +status fix: --local-dir was validated through
SafeLocalFlagPath but the walk used the user-supplied raw string.
SafeLocalFlagPath returns the original value (the canonical form is
discarded), and SafeInputPath itself relies on filepath.Clean for
normalization, which shrinks "link/.." to "." purely as string
manipulation. The kernel then resolves "link/.." through the symlink
target's parent at walk time, putting the traversal outside cwd.

For +pull the bug is more dangerous than for +status because it
travels through --delete-local --yes — a raw walk would let the
delete pass land on files outside cwd.

Fix:
- In Execute, resolve --local-dir via validate.SafeInputPath to get a
  canonical absolute path, and resolve "." the same way for cwd.
- Convert the resolved root back to a cwd-relative form
  (filepath.Rel) for download targets so FileIO.Save's existing
  SafeOutputPath check (which rejects absolute paths) still applies.
- For --delete-local, walk the canonical absolute root, then delete
  via the absolute path. Both values come from the validated
  safeRoot, so kernel path resolution cannot redirect a delete to a
  file outside the canonical subtree.
- drivePullWalkLocal now returns absolute paths instead of rel paths;
  the caller computes the rel_path via filepath.Rel against safeRoot
  for output / remote-set membership checks.

Adds TestDrivePullDeleteLocalDoesNotEscapeViaSymlinkParentRef as a
regression: it stages an "escape" sibling directory containing a
sentinel file, adds a "link" symlink in cwd pointing into it, and
runs +pull --delete-local --yes against an empty remote with
--local-dir "link/..". The sentinel must survive (proving --delete
did not escape) and the in-cwd file must be removed (proving the
walk did run).

* test(drive): pin walker / download behavior on +pull symlink corner cases

Adds three regressions on top of the canonical-root walk fix:

- TestDrivePullSkipsSymlinkInsideRoot: a child symlink inside the
  validated root pointing to a sibling temp dir. Under
  --delete-local --yes with an empty remote, the sentinel inside the
  target must survive (walker did not follow the child symlink) and
  the in-cwd file must be deleted (walker did run).

- TestDrivePullSurvivesCircularSymlinkInsideRoot: a child symlink
  pointing at one of its ancestors. The walk must terminate so the
  test does not hang on the per-test timeout.

- TestDrivePullDownloadDoesNotEscapeViaSymlinkParentRef: pins the
  download half of the fix. With --local-dir "link/.." the canonical
  root resolves to cwd, so the remote file must land in cwd, not
  inside the symlink target's parent. The preexisting sentinel inside
  the escape directory must remain untouched.

* fix(drive): +pull --delete-local must not unlink local files shadowed by online docs

CodeRabbit (PR #696) flagged that the --delete-local pass treated any
local path missing from `remoteFiles` as orphaned, but `remoteFiles` only
records type=file entries. If Drive held a docx/sheet/shortcut at the
same rel_path as a local file, the local file would be unlinked even
though Drive still owned that path.

drivePullListRemote now returns two views:

  - files:    rel_path -> file_token, type=file only (download/skip set)
  - allPaths: every entry's rel_path regardless of type

The download loop continues to consume `files`; the --delete-local pass
consults `allPaths`, so an online-doc shadow of a local filename keeps
the local file safe.

Also routes the local walk and the delete through the vfs abstraction
(vfs.ReadDir + vfs.Remove) instead of filepath.WalkDir + os.Remove.
This drops the //nolint:forbidigo justifications and lines up with how
internal/keychain and internal/registry already do filesystem I/O. The
recursive vfs.ReadDir walker preserves the same "do not follow child
symlinks" semantics that filepath.WalkDir gave us, so the canonical-root
escape protections in 240b772 stay intact.

Adds TestDrivePullDeleteLocalPreservesLocalFileShadowedByOnlineDoc as a
direct regression: Drive serves keep.txt (file) plus notes.docx (docx),
local has both keep.txt and a hand-edited notes.docx; --delete-local
--yes must download keep.txt, leave notes.docx untouched, and report
deleted_local=0.

* fix(drive): count +pull delete failures in summary.failed

CodeRabbit (PR #696) flagged that both delete_failed branches in the
--delete-local pass appended an item but left the `failed` counter at
zero, so the JSON summary could legitimately report `"failed": 0` after
a partially-failed mirror. Increment failed in both branches (the
filepath.Rel error path and the vfs.Remove error path) so summary.failed
reflects every item flagged delete_failed in items[].

Adds TestDrivePullDeleteLocalCountsFailureInSummary, which forces
vfs.Remove to fail by chmod-ing the local dir 0o555 right before the
run and restoring 0o755 in t.Cleanup so t.TempDir teardown still works.

* fix(drive): swap +pull walk/remove back to filepath/os to satisfy depguard

The previous fix-up commits used vfs.ReadDir + vfs.Remove inside the
+pull shortcut, which depguard's "shortcuts-no-vfs" rule rejects:
shortcuts cannot import internal/vfs directly. CI lint failed on the
import line.

Restore the same pattern used in drive_status.go and the prior +pull
walker:

- filepath.WalkDir to enumerate files under the canonical absolute
  root, gated by //nolint:forbidigo with a comment explaining why.
- os.Remove for the actual delete, also gated by //nolint:forbidigo.

The canonical-root safety still holds: validate.SafeInputPath bounds
the walk root inside cwd before WalkDir runs, and WalkDir's default
"do not follow child symlinks" policy is preserved. The two earlier
fixes (drivePullListRemote returning allPaths so online-doc shadows
do not look orphaned, and incrementing failed on delete_failed) stay
in place.

`go test ./shortcuts/drive/...` and `golangci-lint run
--new-from-rev=origin/main` are both clean.

* fix(drive): record remote folder rel_path in +pull allPaths

Follow-up to 45fe4e3. The folder branch in drivePullListRemote merged
descendant rel_paths into allPaths but never recorded the folder's own
rel_path, so a local regular file with the same name as a remote
folder still looked orphaned and got unlinked under --delete-local.

Adds the missing allPaths[rel] for the folder case and a regression:
TestDrivePullDeleteLocalPreservesLocalFileShadowedByRemoteFolder
stages a Drive containing a folder named shadow alongside a
downloadable file, with the local side holding a regular file named
shadow; --delete-local --yes must download keep.txt and leave the
shadow file untouched.

* fix(drive): +pull pagination + dir/file conflict + skill doc symlink claim

Codex review on PR #696 surfaced three issues; addressed in one go:

1. drivePullListRemote only honored next_page_token. The shared
   common.PaginationMeta helper accepts both page_token and
   next_page_token; switched +pull over so a backend reply using
   page_token no longer makes the lister stop at page 1 (which would
   silently drop later remote files from both download and
   --delete-local).

2. --if-exists=skip swallowed mirror conflicts. The skip/overwrite
   branch only checked Stat success, so a local directory shadowing a
   remote regular file was reported as action=skipped. Now Stat's
   IsDir() is checked first; the conflict surfaces as action=failed
   with a message naming the directory, under both --if-exists=skip
   and --if-exists=overwrite, and increments summary.failed.

3. Skill doc told callers to soft-link the target into cwd if they
   wanted to pull from outside cwd. That is wrong: SafeInputPath
   evaluates symlinks before the cwd check, so a symlink pointing
   out-of-tree is rejected. Replaced the bogus shortcut with the
   actually viable options (switch the agent working directory,
   physically move/copy the target, or skip the comparison).

Two new regressions:

- TestDrivePullSurfacesDirectoryFileMirrorConflict — table test over
  both policies asserting failed=1, no skipped, action=failed, plus
  the 'is a directory' hint in the error message.

- TestDrivePullPaginationHandlesPageTokenField — first page returns
  page_token (not next_page_token) with has_more=true; asserts both
  pages are fetched and both files land on disk.

* fix(drive): +pull exits non-zero on item failures; gate --delete-local

Two PR-696 review fixes:

- Item-level failures (download error, dir/file conflict, delete error)
  now surface as a structured partial_failure ExitError instead of a
  success envelope with summary.failed > 0. Exit code becomes non-zero
  and error.detail still carries the {summary, items[]} payload, so
  AI / script callers can detect the failure via the exit code without
  reaching into the JSON body.

- A failed download pass now skips the --delete-local walk entirely.
  Previously +pull would continue removing local-only files even when
  the download phase had partially failed, leaving the mirror in a
  half-synced state (some Drive files missing locally AND some
  local-only files unlinked). Re-runs after fixing the download error
  recover cleanly.

Skill doc / shortcut description / flag desc updated to call the
operation a one-way file-level mirror, since --delete-local only
unlinks regular files and does not prune empty local directories left
behind by remote folder deletes (true directory-level mirroring is
explicitly out of scope).

Tests: existing dir/file-conflict and delete-failure cases now assert
the partial_failure ExitError shape; new test covers the
"download fails => --delete-local skipped" gating contract.

* refactor(drive): consolidate folder-listing helpers into listRemoteFolder

Closes the post-#692 / post-#709 TODO that lived in drive_pull.go (and
the matching note in drive_push.go): both #692 and #709 are now on main,
so the three near-identical recursive Drive folder listers can collapse
into one.

New shared helper in shortcuts/drive/list_remote.go:

  driveRemoteEntry { FileToken, Type, RelPath }
  listRemoteFolder(ctx, runtime, folderToken, relBase) -> map[rel]entry

Returns one entry per Drive item (every type), keyed by rel_path.
Subfolders are descended into and the folder's own entry is recorded so
callers can reason about "this rel_path is occupied by a folder"
without re-listing. Pagination via common.PaginationMeta is unchanged.

Each shortcut now derives its own per-shortcut view from the unified
listing:

  - drive_status.go: collapses to remoteFiles (Type=="file" -> token) for
    the content-hash diff.
  - drive_pull.go: derives remoteFiles (Type=="file") for the download
    set, plus remotePaths (every rel_path) as the --delete-local guard.
  - drive_push.go: derives remoteFiles (Type=="file") for upload /
    overwrite / orphan-delete, plus remoteFolders (Type=="folder") for
    the create_folder cache. drivePushRemoteEntry was a duplicate of
    driveRemoteEntry's first two fields and is dropped; the few call
    sites that read .FileToken keep working unchanged.

Per-shortcut copies removed:
  - drive_status.go: listRemoteForStatus, joinRelStatus,
    driveStatusListPageSize/FileType/FolderType
  - drive_pull.go: drivePullListRemote, drivePullJoinRel,
    drivePullListPageSize/FileType/FolderType
  - drive_push.go: drivePushListRemote, drivePushJoinRel,
    drivePushListPageSize/FileType/FolderType, drivePushRemoteEntry

drive_push_test.go's TestDrivePushHelpersRelPath is retargeted at the
shared joinRelDrive; the docstrings on the same-name-conflict tests
were tweaked to refer to "the remoteFiles view" instead of the
just-removed drivePushListRemote.

Net diff: +1 new file, -207 net lines across the four touched files.
All existing unit + e2e dry-run tests pass without behavioral change;
the rel_path / pagination / type-filter contracts each shortcut depends
on are preserved by construction.
2026-04-30 17:07:59 +08:00
fangshuyu-768
4d68e09537 feat(drive): add +push shortcut for one-way local → Drive mirror (#709)
* feat(drive): add +push shortcut for one-way local → Drive mirror

Mirrors a local directory onto a Drive folder: walks --local-dir,
recursively lists --folder-token, mirrors local subdirectory structure
(including empty dirs) onto Drive via create_folder, and for each
rel_path uploads new files, overwrites already-present files, or skips
them per --if-exists. With --delete-remote --yes, any Drive type=file
entry absent locally is removed; Lark native cloud docs (docx/sheet/
bitable/mindnote/slides) and shortcuts are never overwritten or deleted.

Overwrite hits POST /open-apis/drive/v1/files/upload_all with the
existing file_token in the form body and the response's `version` is
propagated to items[].version, mirroring the markdown +overwrite
contract. Files >20MB fall back to the 3-step
upload_prepare/upload_part/upload_finish path with a single shared fd
reused via io.NewSectionReader per block.

Output is a {summary, items[]} envelope; items[].action is one of
uploaded / overwritten / skipped / folder_created / deleted_remote /
failed / delete_failed.

--delete-remote is bound to --yes upfront in Validate, same pattern as
+pull's --delete-local: a stray flag never silently deletes anything.
Path safety reuses the canonical-root walk + SafeInputPath mechanics
from the sibling +status / +pull commands.

Scopes: drive:drive.metadata:readonly + drive:file:upload +
space:folder:create. space:document:delete is intentionally NOT in the
default set — the framework's pre-flight scope check would otherwise
block plain pushes and dry-runs for callers that haven't granted delete;
--delete-remote --yes relies on the runtime DELETE call to surface
missing_scope. The skill ref calls out the scope so users running
mirror sync can grant it upfront.

13 unit tests cover the upload/overwrite/skip/delete matrix, online-doc
protection, same-name conflict between local file and native cloud doc,
empty-directory mirroring, multipart, scope/path validation, and helper
correctness. 4 dry-run e2e tests pin the request shape.

* fix(drive +push): address review — failure semantics, default skip, scope pre-check, mirror wording

- Item-level failures now bump the exit code via output.ErrBare(ExitAPI)
  while keeping the structured items[] envelope on stdout. The
  --delete-remote phase is skipped entirely when any upload / overwrite /
  folder step fails, so a partial upload never proceeds to delete remote
  orphans (a half-synced state).
- Default --if-exists flipped from "overwrite" to the safer "skip": the
  upload_all overwrite-version protocol field is still rolling out, so
  the default no longer fails a first push against a pre-populated
  folder. Callers must opt into "overwrite" explicitly.
- --delete-remote --yes now triggers a conditional space:document:delete
  scope pre-check in Validate via the new RuntimeContext.EnsureScopes
  helper, so a missing grant fails the run before any upload — instead
  of after the upload phase, which would leave orphans uncleaned.
- Description, Tips and skill doc rewritten to call this a file-level
  mirror (not a directory mirror): the command does not remove
  remote-only directories or close gaps in directory structure that
  exists only on Drive.

Tests:
- new TestDrivePushDefaultsToSkipForExistingRemote pins the new default
- new TestDrivePushSkipsDeleteAfterUploadFailure pins the half-sync
  guard and the non-zero exit on item-level failure
- new TestDrivePushExitsZeroOnCleanRun pins the inverse
- existing tests that relied on the old overwrite default now opt in
  explicitly with --if-exists=overwrite
- TestDrivePushOverwriteWithoutVersionFails updated to assert
  *output.ExitError with Code=ExitAPI
- new TestDrive_PushDryRunAcceptsDeleteRemoteWithYes (e2e) symmetric to
  the existing reject-without-yes test, pinning that EnsureScopes is a
  silent no-op when the resolver has no scope metadata

* fix(drive +push): close remaining CodeRabbit comments

Three small follow-ups on the +push review thread that were still
open after the earlier failure-semantics / default-skip / scope
pre-check fix:

- drivePushUploadAll now extracts data.file_token before checking
  larkCode, and surfaces the returned token on the partial-success
  path (non-zero code + non-empty file_token). Without this, a backend
  response where bytes already landed but code != 0 would force the
  caller to fall back to entry.FileToken and silently lose the actual
  Drive token, defeating the overwrite-error token-stability handling
  in Execute.
- TestDrivePushOverwriteWithoutVersionFails switched from "tok_keep"
  to "tok_keep_new" in the upload_all stub and now asserts that the
  returned token (not entry.FileToken) lands in items[].file_token —
  pins the contract that a regression to the fallback branch would
  otherwise pass silently.
- New TestDrivePushOverwritePartialSuccessSurfacesReturnedToken pins
  the new partial-success branch end-to-end.
- drive_push_dryrun_test.go: tightened the three Validate / cobra
  rejections from `exit != 0` to exact codes — `exit == 2` for the
  two Validate-stage rejections (--local-dir absolute,
  --delete-remote without --yes), `exit == 1` for the cobra
  required-flag check (--folder-token missing). Locks in failure
  classification so a regression that misroutes the error layer
  doesn't slip through.
2026-04-30 15:00:44 +08:00
fangshuyu-768
a3bbe00ee0 feat(drive): add +status shortcut for content-hash diff (#692)
* feat(drive): add +status shortcut for content-hash diff

Adds `drive +status`, a read-only diff primitive that walks --local-dir,
recursively lists --folder-token, and reports four buckets — new_local,
new_remote, modified, unchanged — by SHA-256 content hash.

Implementation notes:

- Drive's list/metas APIs do not expose a content hash, so files
  present on both sides are downloaded via DoAPIStream and hashed in
  memory (sha256 + io.Copy, no disk write). Files only on one side are
  not fetched. The command stays Risk: "read".

- Only Drive entries with type=file participate. Online docs (docx,
  sheet, bitable, mindnote, slides) and shortcuts are skipped — there
  is no equivalent local binary to hash against.

- --local-dir is funneled through the framework's
  validate.SafeLocalFlagPath helper so that absolute paths and any ..
  that escapes cwd are rejected with --local-dir in the error message
  (rather than the internal default --file). FileIO().Stat() then
  enforces existence and the IsDir check.

- Local walk uses filepath.WalkDir behind a //nolint:forbidigo comment.
  The runtime FileIO interface has no walker today and shortcuts can't
  import internal/vfs; SafeInputPath has already bounded the walk root
  inside cwd, so the bare walk is acceptable until a runtime-level
  walker lands.

- Scopes: drive:drive.metadata:readonly (list folders) +
  drive:file:download (fetch files for hashing). The broader
  drive:drive scope is disabled by enterprise policy in some tenants;
  this narrower pair was verified end-to-end.

Tests cover the four-bucket categorization with a nested subfolder and
docx/shortcut filtering, plus validation errors for missing local-dir,
non-directory local-dir, and absolute-path local-dir.

* docs(skills): document drive +status in lark-drive skill

Adds references/lark-drive-status.md covering parameters, output
schema, the type=file scoping rule, and the network-traffic caveat
(hash is streamed in memory, but bytes still cross the wire).

Notes that --local-dir is bounded to cwd by the CLI's path validation,
and that when a user wants to compare a directory outside cwd the
agent should ask the user to relocate or to switch the agent's working
directory rather than `cd`-ing on its own.

Wires +status into the Shortcuts table in SKILL.md.

* test(drive): cover --folder-token validation and add +status dry-run E2E

Addresses two CodeRabbit review comments on PR #692:

- Adds TestDriveStatusRejectsEmptyFolderToken and
  TestDriveStatusRejectsMalformedFolderToken so the Validate-stage
  required-check and the ResourceName format guard for --folder-token
  are exercised, not just --local-dir.

- Adds tests/cli_e2e/drive/drive_status_dryrun_test.go which drives
  the real binary in dry-run mode and asserts:

  * the request shape (GET /open-apis/drive/v1/files with
    folder_token in the dry-run envelope), plus the description text,
  * --local-dir absolute paths are rejected by Validate (which still
    runs under --dry-run) with --local-dir surfaced in the message,
  * cobra's required-flag enforcement rejects a missing
    --folder-token before any custom validation.

* fix(drive): walk +status on canonical absolute root to close symlink/.. escape

Reported in PR review: --local-dir was validated through
SafeLocalFlagPath, but the actual walk used the user-supplied raw
string. SafeLocalFlagPath returns the original value (it only checks
the path through SafeInputPath and discards the canonical form), and
SafeInputPath itself relies on filepath.Clean for path normalization.
filepath.Clean shrinks "link/.." to "." purely as string manipulation,
so the validator sees a path inside cwd. The kernel, however, resolves
"link/.." through the symlink target's parent — which is outside cwd
and is what filepath.WalkDir actually traverses.

Fix: in Execute, resolve --local-dir via validate.SafeInputPath to
get the canonical absolute path (this one fully evaluates symlinks
across the entire path), and walk that path. Each absolute walk hit
is converted to a cwd-relative form via filepath.Rel against
validate.SafeInputPath(".") so FileIO.Open's existing SafeInputPath
guard (which rejects absolute paths) still applies.

Adds TestDriveStatusDoesNotEscapeViaSymlinkParentRef as a regression:
it stages an "escape" sibling directory containing a sentinel file,
adds a "link" symlink in cwd pointing into the escape directory, and
runs +status with --local-dir "link/..". Without this fix, the raw
walk visits the sentinel and leaks it into new_local; with the fix,
the walk stays inside the canonical cwd.

A standalone repro confirms the underlying behavior: raw
filepath.WalkDir("link/..", ...) traversed dozens of unrelated files
in the kernel-resolved parent directory; walking the canonical root
visits only the legitimate cwd contents.

* test(drive): pin walker behavior on child / circular symlinks for +status

Adds two corner-case regressions to back up the canonical-root walk fix:

- TestDriveStatusSkipsSymlinkInsideRoot: a child symlink under
  --local-dir that points to a sibling temp dir outside cwd. WalkDir's
  default policy must report it as a non-regular entry so the callback
  skips it, and the sentinel inside the target must not surface in
  new_local. This pins the contract our caller relies on (walk
  declines to follow child symlinks even when the canonical root
  resolves cleanly).

- TestDriveStatusSurvivesCircularSymlinkInsideRoot: a child symlink
  pointing back at one of its ancestors. The walk must terminate and
  surface the legitimate sibling file; if WalkDir ever followed the
  loop, the per-test timeout would catch it.

* fix(drive): close +status review gaps from Codex (pagination, doc, live E2E)

Three independent fixes flagged on PR #692:

1. Route the recursive Drive folder listing through common.PaginationMeta
   instead of reading next_page_token directly. The shared helper accepts
   both page_token and next_page_token, matching what okr/im already do
   and keeping +status safe against a backend field rename. Adds
   TestDriveStatusPaginatesRemoteListing, which serves a 2-page response
   where page 1 advertises the cursor as next_page_token and page 2 as
   page_token; either spelling alone would silently drop one page.

2. The skill doc previously suggested "or symlink the target into cwd"
   as a workaround for cwd-relative --local-dir. SafeInputPath calls
   filepath.EvalSymlinks before checking isUnderDir(canonicalCwd), so
   any symlink whose final target sits outside cwd still gets rejected
   as `unsafe file path`. Rewrite the section so agents stop steering
   users into a path that always errors out.

3. Add tests/cli_e2e/drive/drive_status_workflow_test.go — the live
   E2E that AGENTS.md requires for new shortcuts. Seeds a real Drive
   folder with three uploaded files (unchanged.txt, modified.txt,
   remote-only.txt), seeds a local tree with matching/diverging
   content plus a local-only.txt, runs +status, and asserts each of
   the four buckets contains exactly the file we expect with the
   right file_token. Cleanup of every uploaded file plus the parent
   folder is registered through the existing best-effort cleanup
   helpers. Coverage table bumped: drive +status moves to ✓ and the
   denominator goes from 28→29 to account for the new shortcut.

Codex also flagged the local-side filepath.WalkDir as a vfs-bypass.
Investigated: the depguard rule shortcuts-no-vfs explicitly forbids
shortcuts from importing internal/vfs (see commit c1b0bed on the
+pull branch where the same migration was rejected by CI). The
filepath.WalkDir + nolint:forbidigo pattern in walkLocalForStatus is
the lint-required convention until FileIO grows a walker, so leaving
it as-is.
2026-04-30 14:27:25 +08:00
caojie0621
fc22e9a04b feat(common): backfill resource URL when create APIs omit it (#680)
Add BuildResourceURL helper and wire it into doc/sheets/drive/base/wiki
create paths so callers always receive a clickable link, even when the
backend response (MCP degraded path or upstream OpenAPI) returns an
empty URL field. The fallback uses the brand-standard host
(www.feishu.cn / www.larksuite.com), which redirects to the tenant
domain.

Affected entries:
- docs +create v1 / v2
- sheets +create
- drive +create-folder / +import / +upload (newly exposes url)
- wiki +node-create (newly exposes url)

drive +create-shortcut is intentionally skipped because the URL form
depends on the underlying file kind, which the shortcut payload does
not carry.
2026-04-28 18:20:35 +08:00
ethan-zhx
05d8137c7d feat(drive): extend +add-comment to support slides targets (#674)
Change-Id: Id87ecce098d87f7db82389a73f3134b66fcd4814
2026-04-28 10:25:32 +08:00
liujinkun2025
aa48d70d7a feat(drive): add +search shortcut with flat filter flags (#658)
Expose doc_wiki/search v2 under the drive domain via explicit flags
(--query, --edited-since, --commented-since, --opened-since,
--created-since, --mine, --creator-ids, --doc-types, --folder-tokens,
--space-ids, ...) instead of a nested JSON filter, so natural-language
queries from AI agents map 1:1 to discrete flags.

Time handling:
- my_edit_time and my_comment_time are snapped to the hour (floor/ceil)
  with a stderr notice, since those fields are aggregated at hour
  granularity server-side. create_time passes through as-is.
- open_time has a server-side 3-month cap per request. When
  --opened-since / --opened-until span exceeds 90 days, the CLI narrows
  the request to the most recent 90-day slice and emits a stderr notice
  listing every remaining slice's --opened-* values so the agent can
  re-invoke for older ranges. Spans over 365 days are rejected up front
  to bound runaway slicing.

Flag ergonomics:
- --doc-types accepts mixed case; values are normalized to upper case
  before validation and before being sent to the server.
- --sort default is translated to the server enum DEFAULT_TYPE (every
  other sort value upper-cases 1:1).

Error hints:
- Lark code 99992351 (referenced open_id outside the app's contact
  visibility) is enriched with a +search-specific hint that
  distinguishes API scope from contact visibility and points at
  --creator-ids / --sharer-ids as the likely source.

Skill docs:
- new reference at skills/lark-drive/references/lark-drive-search.md,
  including the open_time slicing protocol and the paginate-within-
  slice-before-switching agent playbook.
- lark-drive/SKILL.md routes resource-discovery to drive +search.
- lark-doc/SKILL.md and lark-doc-search.md mark docs +search as
  deprecated and point users at drive +search.

Change-Id: I36d620045809b448446d4fdbdfa923b05794da19
2026-04-25 16:22:35 +08:00
wittam-01
24afe39516 feat: support wiki node targets in drive +upload (#611)
Change-Id: Iaf94270c0a2a2ac02af81c234553ac5850c0668b
2026-04-23 22:37:47 +08:00