* refactor: retire legacy error envelopes and enforce typed contract
Consolidate all command error reporting onto the typed errs.* contract, remove
the legacy error surface that predated it, and tighten the lint guards so the
contract holds across the whole repository going forward.
Every failure now reaches stderr as one envelope shape: a category, an
optional subtype, a human- and agent-readable message, and a recovery hint,
with invalid parameters listed under `params`. The legacy ExitError envelope,
its constructors, and the boundary bridge that promoted untyped config and
authorization errors are deleted, leaving a single path from error to wire.
Predicate commands keep their silent-exit behavior through a dedicated signal
that carries only an exit code.
Infrastructure paths that still emitted ad-hoc envelopes — flag parsing,
unknown commands and subcommands, plugin and policy guards, confirmation
prompts, and auth/config failures — now classify into the same taxonomy.
Business, API, auth, and config exit codes are preserved; the one behavioral
change is that Cobra usage failures (missing required flag, unknown command,
bad arguments) now emit the typed validation envelope and exit 2, matching the
explicit flag and subcommand guards, instead of Cobra's plain-text exit 1.
Enforcement is repo-wide rather than per-path:
- The errscontract guards run by default everywhere instead of through a
migration allowlist, so legacy envelopes cannot be reintroduced anywhere.
- errorlint runs across the whole repository: every error wrap must use %w and
every comparison must use errors.Is/errors.As, so interior wraps stay legal
but can no longer break the chain the typed boundary relies on.
- The errs-no-bare-wrap guard is keyed by structural prefix instead of an
explicit per-domain allowlist, so new shortcut domains are covered without
editing a list. It runs where forbidigo is enabled (the shortcut domains and
the auth/config/service command groups); repo-wide chain integrity for the
remaining command paths is carried by errorlint above.
* test: align cli_e2e success assertions to the ok envelope
The api and service success path now emits the {"ok":true} envelope, so the
cli_e2e workflow assertions that still expected the old {"code":0} shape via
AssertStdoutStatus(t, 0) fail once they run with live credentials. Switch those
workflow assertions to AssertStdoutStatus(t, true); the fake-payload helper test
in core_test.go keeps its code-shape assertion.
Drive-domain errors now leave the CLI as typed, machine-branchable
envelopes — a stable `type` plus `subtype` and named fields (param,
params, retryable, log_id, hint) — so scripts and AI agents can branch on
structure and act on a recovery hint instead of parsing prose.
Changes:
- Every error produced in the drive domain — validation, file I/O, and the
failures returned from its Lark API calls — is emitted as a typed errs.*
error; the exit code is derived from the error category. Drive's API calls
now go through a shared typed classifier, so failures carry subtype,
troubleshooter, a recovery hint, and the request's log_id whether the
server returns it in the response body or the x-tt-logid header; an
already-typed network/auth error is never downgraded into a generic API
error.
- Known API conditions (resource conflict, cross-tenant, cross-brand, ...)
carry a recovery hint keyed by their error class; a command can refine
that hint with command-specific guidance.
- Batch partial failures (+push / +pull / +sync, where some items succeed
and some fail) now report an honest ok:false multi-status result on
stdout — the summary and every per-item outcome stay machine-readable —
and exit non-zero, instead of a misleading ok:true success envelope.
- Duplicate rel_path conflicts report each colliding path as a structured
params entry (RFC 7807 invalid-params style).
- Static guards lock the drive path so legacy error construction — direct
envelopes or the auto-classifying API helpers — cannot be reintroduced,
making drive the template for the remaining domains.
Output changes worth noting for consumers:
- Error envelopes now carry typed type/subtype and named fields; exit
codes follow the error category (malformed or incomplete API responses
are reported as internal errors rather than generic API errors).
- Batch partial failures (+push / +pull / +sync) emit an ok:false result
envelope on stdout (summary + per-item items[]) and exit non-zero; the
per-item results stay on stdout rather than in a stderr error envelope.
Errors surfaced through shared cross-domain helpers (scope precheck, media
import upload, metadata lookup, save-path resolution) are not yet typed;
they migrate with the shared layer in a follow-up change.
Every failure on the authentication, authorization, and configuration
path now surfaces as a typed structured error instead of an ad-hoc
envelope. Users and scripts that consume CLI output get:
- a fixed nine-category taxonomy on the wire, each mapped to a
stable shell exit code (authentication/authorization/config = 3,
network = 4, internal = 5, policy = 6, confirmation = 10)
- identity-aware detail fields (missing_scopes, requested_scopes,
granted_scopes, console_url, log_id, retryable, hint) carried
uniformly on the envelope
- a single canonical policy envelope at exit 6; the legacy
auth_error carve-out is retired
- per-subtype canonical message + hint that preserves Lark's
diagnostic phrasing and routes recovery to the right actor:
app developer (app_scope_not_applied), user (missing_scope,
token_scope_insufficient, user_unauthorized), or tenant admin
(app_unavailable, app_disabled)
- wrong app credentials classify as config/invalid_client whether
surfaced by the Open API endpoint (99991543) or the tenant
access-token mint endpoint (10003 / 10014), instead of
collapsing to a transport error or api/unknown
- local shortcut scope preflight emits the same
authorization/missing_scope envelope (identity + deterministic
missing-scope set) used by the post-call permission path, so AI
consumers read the same structured shape from precheck and from
server-returned permission denial
- streaming download/upload failures keep the same network subtype
split (timeout / TLS / DNS / transport) as the non-stream path
instead of collapsing every cause to a generic transport failure
- console_url is carried only on the bot-perspective
app_scope_not_applied envelope (where the recovery action is
"developer applies the scope at the developer console"); the
user-perspective missing_scope envelope drops the field, since
the only actionable user recovery is `lark-cli auth login --scope`
and pointing an end user at a console they cannot modify is
misleading
- bind workflows (Hermes / OpenClaw / lark-channel) flatten dynamic
Type tags to wire 'config' with the original module name kept
as a metric label
All 10 typed errors are cause-bearing, nil-safe on .Error() and
.Unwrap(), and defensively clone slice setter inputs. Four lint
rules (CheckNilSafeError / CheckBuilderImmutable / CheckUnwrapSymmetry
/ CheckBuildAPIErrorArms) lock these invariants on migrated paths.
Introduce a typed error contract framework for lark-cli so in-process
Go callers can branch via errors.As(&errs.XxxError{}) and shell scripts,
AI agents, and protocol adapters can branch on stable JSON type/subtype
fields instead of regex-parsing free-form messages.
Adds:
- Canonical taxonomy under errs/ (9 categories + typed Error structs
embedding a shared Problem, RFC 7807-aligned)
- Centralized Lark code metadata + identity-aware BuildAPIError dispatch
- Typed JSON envelope writer alongside the legacy envelope writer
- MCP / OAuth (RFC 6750 Bearer) projection adapters
- Five CI lint guards preventing ad-hoc taxonomy drift
Backward compatibility: legacy *output.ExitError producers (ErrAPI,
ErrWithHint, Errorf, ErrBare) and business shortcuts that use them
continue to render the legacy envelope unchanged. SecurityPolicyError
wire format and exit code are preserved via a carve-out; taxonomy
migration is deferred to PR 2. Domain-specific business migration is
staged across PR 3+.
Framework-direct paths now return typed *errs.*Error: ErrAuth /
ErrValidation / ErrNetwork emit category literals on the wire
(authentication / validation / network), *core.ConfigError is promoted
at the cmd/root boundary with exit code aligned from 2 to 3, and Lark
API permission denials classified by BuildAPIError exit 3.
At the SDK boundary, WrapDoAPIError preserves any already-classified
error (legacy *output.ExitError or typed *errs.*) so output.ErrAuth
from missing credentials surfaces with the auth category and exit 3
intact instead of being downgraded to a network error. Policy responses
classified by BuildAPIError (codes 21000 / 21001) extract challenge_url
and the canonical hint from the response body, matching what the
auth transport already surfaces at the HTTP layer; non-https
challenge URLs are dropped.
First PR in the feat/error-contract-* series.