larksuite-cli/internal/schema/path_test.go
sang-neo03 9e2be14301 feat(schema): output json spec envelope for all API commands (#1048)
* feat(schema): add envelope types and ordered properties container

* feat(schema): build meta_data.json key-order index for property ordering

* feat(schema): implement convertProperty with file/enum/range/nested handling

* feat(schema): build inputSchema with x-in / file binary / yes injection

* feat(schema): build outputSchema wrapping responseBody

* feat(schema): build _meta with scopes/risk/access_tokens normalization

* feat(schema): scaffold affordance overlay loader (PR-1 stub)

* feat(schema): wire up AssembleEnvelope main entry point

* feat(schema): parse dotted and space-separated path arguments

* feat(schema): batch envelope assembly with optional method filter

* feat(schema): implement L1-L3 envelope lint (structure/type/cross-field)

* feat(schema): measure L4 coverage and gate all envelopes through L1-L3

* feat(schema): add golden test harness with UPDATE_GOLDEN refresh

* test(schema): seed 20 golden envelopes covering edge cases

* feat(schema): output MCP envelope as default JSON, preserve pretty mode

Rewrites cmd/schema/schema.go so the default --format json branch emits
MCP-spec envelopes via schema.AssembleAll/AssembleService/AssembleEnvelope.
The legacy --format pretty branch is preserved verbatim and still uses
printServices / printResourceList / printMethodDetail.

Args max raised from 1 to 8 so the path can be supplied either as a single
dotted argument (im.reactions.list) or as space-separated segments
(im reactions list); both forms route through schema.ParsePath and produce
byte-identical output.
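
The two accepted path forms can be illustrated with a minimal sketch. The function parsePath below is a hypothetical stand-in for schema.ParsePath, reconstructed only from the behaviour described above (one dotted argument is split on '.', several arguments pass through unchanged); the real implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePath is a hypothetical stand-in for schema.ParsePath. It is inferred
// from the described behaviour only and is not the project's actual code.
func parsePath(args []string) []string {
	switch len(args) {
	case 0:
		// No path given: nothing to resolve.
		return nil
	case 1:
		// Dotted form: "im.reactions.list" -> [im reactions list].
		return strings.Split(args[0], ".")
	default:
		// Space form: segments are already separated, so copy them as-is.
		return append([]string(nil), args...)
	}
}

func main() {
	dotted := parsePath([]string{"im.reactions.list"})
	spaced := parsePath([]string{"im", "reactions", "list"})
	// Both forms resolve to the same segments.
	fmt.Println(strings.Join(dotted, "/") == strings.Join(spaced, "/")) // prints "true"
}
```

Copying the slice in the space form is a design choice of this sketch, so the caller's argv is never aliased.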

The completeSchemaPath function is extended to drive tab-completion for
both forms: legacy dotted prefix when len(args) == 0, and per-segment
resource/method completion when args already contains earlier segments.

BREAKING CHANGE: default JSON output shape changes from the raw meta_data
structure to an MCP envelope array/object. Existing scripts parsing the
old shape must either pin --format pretty or migrate to the new envelope
fields (name, description, inputSchema, outputSchema, _meta).

* test(schema): cover envelope JSON output, space-form path, yes injection

Replaces TestSchemaCmd_NoArgs with two variants reflecting the new default
shape: TestSchemaCmd_NoArgs_Pretty asserts the legacy "Available services"
text appears only under --format pretty, and TestSchemaCmd_NoArgs_JSON_IsArray
asserts the default JSON output parses as an envelope array with at least 180
entries.

Adds six new tests:
- TestSchemaCmd_JSONIsEnvelope: single-method output has name / description
  / inputSchema / outputSchema / _meta keys and envelope_version "1.0".
- TestSchemaCmd_SpaceSeparatedPath_EqualsDotted: dotted and space forms
  produce identical output bytes for the same command path.
- TestSchemaCmd_ServiceListIsArray: schema <service> returns a JSON array
  whose every entry's name starts with "<service> ".
- TestSchemaCmd_HighRiskYesInjection: high-risk-write commands inject
  inputSchema.properties.yes.
- TestSchemaCmd_NoYesForReadRisk: read-risk commands do not inject yes.
- TestSchemaCmd_PrettyUnchanged_KeyTextPresent: --format pretty still
  surfaces the legacy section markers (Parameters:, Response:, Identity:,
  Scopes:, CLI:).

* feat(schema): assemble envelope from embedded data only for stability

* chore(schema): lint cleanup

* fix(schema): preserve dotted resource segments in envelope name

Nested resources whose meta_data key contains a dot (e.g. chat.members,
user_mailbox.templates) were previously split on '.' and rejoined with
spaces, producing envelope names like 'im chat members bots'. AI
consumers that ran name.split(' ') and fed the result back as argv
got 'lark-cli im chat members bots', which the CLI rejects; the actual
invocation form is 'lark-cli im chat.members bots'.

Pass the dotted resource key as a single argv segment so the envelope
name 'im chat.members bots' round-trips through name.split(' ') back
to the CLI. Mirror the same convention in the golden harness so its
single-method assembly matches the live AssembleService walk.
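
A minimal sketch of the round-trip property described above. The helpers envelopeName and argvFromName are hypothetical names chosen for illustration; only the example strings come from the commit text.

```go
package main

import (
	"fmt"
	"strings"
)

// envelopeName is a hypothetical helper: it joins service, resource key, and
// method with single spaces, keeping a dotted resource key as one segment.
func envelopeName(service, resourceKey, method string) string {
	return strings.Join([]string{service, resourceKey, method}, " ")
}

// argvFromName mimics a consumer that splits the envelope name on spaces
// and feeds the pieces back to the CLI as arguments.
func argvFromName(name string) []string {
	return strings.Split(name, " ")
}

func main() {
	name := envelopeName("im", "chat.members", "bots")
	argv := argvFromName(name)
	fmt.Println(name)      // prints "im chat.members bots"
	fmt.Println(len(argv)) // prints "3"
}
```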

* fix(schema): align MCP envelope output with JSON Schema 2020-12 contract

- coerce enum literals to typed JSON values (integer to int64,
  number to float64, boolean to bool) so type:"integer" fields no
  longer emit string enums; sort numeric/boolean enums while
  preserving meta_data order for string enums that carry semantic
  priority
- translate non-standard meta_data type:"list" to JSON Schema
  type:"array" with items:{} fallback when element shape is absent
  (covers the two mail attachment_ids fields)
- render inputSchema.required even when empty so consumers see a
  stable envelope shape ("[]" means no required fields, not "field
  is missing")
- reject trailing path segments in both JSON and pretty modes so
  schema im.messages.delete.foo errors instead of silently
  returning the delete method
- drop dead "list type" entry from lint_test isKnownDataInconsistency
  whitelist now that list values are translated upstream
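
The enum policy in the first bullet can be sketched as follows. This is an illustration under assumptions, not the project's code: coerceEnum and enumJSON are hypothetical names, raw literals are assumed to arrive as strings, and dropping literals that fail to parse is a choice made for this sketch only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// coerceEnum is a hypothetical sketch: it converts string literals to values
// of the declared JSON Schema type, sorts numeric and boolean enums, and
// leaves string enums in their source order.
func coerceEnum(typ string, raw []string) []any {
	out := make([]any, 0, len(raw))
	for _, s := range raw {
		var v any
		var err error
		switch typ {
		case "integer":
			v, err = strconv.ParseInt(s, 10, 64)
		case "number":
			v, err = strconv.ParseFloat(s, 64)
		case "boolean":
			v, err = strconv.ParseBool(s)
		default:
			v = s
		}
		if err == nil { // assumption: unparsable literals are skipped
			out = append(out, v)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		switch a := out[i].(type) {
		case int64:
			return a < out[j].(int64)
		case float64:
			return a < out[j].(float64)
		case bool:
			return !a && out[j].(bool) // false sorts before true
		}
		return false // strings compare as equal, so the stable sort keeps source order
	})
	return out
}

// enumJSON renders the coerced enum as compact JSON for inspection.
func enumJSON(typ string, raw []string) string {
	b, _ := json.Marshal(coerceEnum(typ, raw))
	return string(b)
}

func main() {
	fmt.Println(enumJSON("integer", []string{"10", "2", "1"}))   // prints [1,2,10]
	fmt.Println(enumJSON("string", []string{"urgent", "normal"})) // prints ["urgent","normal"]
}
```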

* fix(schema): address CodeRabbit findings and stabilize CI tests

CI fix
- Replace hard-coded absolute key-order assertions in TestKeyOrderIndex_*
  and TestBuildInputSchema_* with set-membership and propagation invariants;
  the upstream meta_data API does not guarantee stable JSON key order across
  fetches, so the old tests were inherently flaky on CI.
- Skip byte-level TestGoldenEnvelopes when CI=true; golden snapshots are a
  manual refresh artefact tied to a specific meta_data fetch, not a CI gate.
- Add TestMain to isolate registry-backed tests from any host ~/.lark-cli
  cache (LARKSUITE_CLI_CONFIG_DIR + LARKSUITE_CLI_REMOTE_META=off) so the
  suite gives the same answer on every machine.

CodeRabbit review actionables
- EmbeddedServiceNames returns a defensive copy so callers cannot mutate
  the package-level slice and affect subsequent assembly determinism.
- coerceEnumValue is now also applied to default literals: integer fields
  no longer ship default: "500"; they ship default: 500 (same idea as the
  earlier enum coercion fix).
- options-branch string enums preserve meta_data source order, matching the
  enum-branch policy; only numeric/boolean enums get sorted.
- validatePropertyTypes now validates the array element schema itself
  (type, nested items), not only items.properties. Previously a primitive
  element with an invalid type (e.g. items.type="list") slipped past lint.
- OrderedProps.MarshalJSON falls back to alphabetical key order when Map
  has entries but Order is empty, instead of silently emitting {}.
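
The OrderedProps fallback in the last bullet might look roughly like the sketch below. Only the field names Order and Map come from the commit text; the value type, the skip-on-missing-key behaviour, and the mustJSON helper are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// OrderedProps is a simplified stand-in for the real container.
type OrderedProps struct {
	Order []string       // explicit key order, may be empty
	Map   map[string]any // property name -> schema (value type assumed)
}

// MarshalJSON writes keys in Order; if Order is empty but Map has entries,
// it falls back to alphabetical order instead of emitting {}.
func (p OrderedProps) MarshalJSON() ([]byte, error) {
	keys := p.Order
	if len(keys) == 0 && len(p.Map) > 0 {
		keys = make([]string, 0, len(p.Map))
		for k := range p.Map {
			keys = append(keys, k)
		}
		sort.Strings(keys)
	}
	var buf bytes.Buffer
	buf.WriteByte('{')
	written := 0
	for _, k := range keys {
		v, ok := p.Map[k]
		if !ok {
			continue // assumption: keys listed in Order but absent from Map are skipped
		}
		kb, err := json.Marshal(k)
		if err != nil {
			return nil, err
		}
		vb, err := json.Marshal(v)
		if err != nil {
			return nil, err
		}
		if written > 0 {
			buf.WriteByte(',')
		}
		buf.Write(kb)
		buf.WriteByte(':')
		buf.Write(vb)
		written++
	}
	buf.WriteByte('}')
	return buf.Bytes(), nil
}

// mustJSON is a small helper for the demo; it panics on marshal errors.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	m := map[string]any{"a": 1, "b": 2}
	fmt.Println(mustJSON(OrderedProps{Order: []string{"b", "a"}, Map: m})) // prints {"b":2,"a":1}
	fmt.Println(mustJSON(OrderedProps{Map: m}))                             // prints {"a":1,"b":2}
}
```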

Tests pass locally and with CI=true env (simulating GitHub Actions).

* chore(schema): refresh golden envelopes after meta_data drift

Re-generated with UPDATE_GOLDEN=1 against the current meta_data.json
snapshot. The bulk of the diff is upstream noise (description wording,
enum entries, field order) which the CI snapshot diff can no longer
reasonably gate (see previous commit). Side-effects of the code fixes
in the parent commit are also captured:

  - integer-typed defaults now emit numeric literals (e.g. page_size
    default 500, not "500") thanks to coerceEnumValue
  - mail.user_mailbox.templates.create _meta.risk corrects to "write"
    (assembler already emitted "write"; the old golden was stale)

* fix(schema): address CodeRabbit round-3 review findings

- TestMain: cleanup now runs reliably. os.Exit skips deferred functions,
  so the previous defer os.RemoveAll(dir) never executed. Replace defer
  with explicit cleanup, and fail fast if MkdirTemp errors instead of
  silently running against the host cache (which defeats isolation).
- convertProperty default coercion: when the literal cannot be coerced to
  the declared type (e.g. default:"" on integer field, used by meta_data
  to mean "no default"), omit the field entirely rather than emit a
  type-mismatched default. Removes a contract violation flagged on
  im.reactions.list.json#page_size.
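
The TestMain fix relies on a general Go rule: os.Exit terminates the process without running deferred calls. A minimal sketch of the cleanup pattern follows. runIsolated, configDir, and exists are hypothetical helpers, and the temp-dir prefix is invented; the two environment variable names are the ones mentioned in the earlier CI-fix commit.

```go
package main

import (
	"fmt"
	"os"
)

// runIsolated is a hypothetical helper. It points the CLI at a throwaway
// config dir, runs the supplied function, cleans up explicitly, and returns
// the exit code so the caller can pass it to os.Exit afterwards.
func runIsolated(run func() int) int {
	dir, err := os.MkdirTemp("", "lark-cli-test-*")
	if err != nil {
		// Fail fast rather than silently running against the host cache.
		fmt.Fprintln(os.Stderr, "cannot create temp config dir:", err)
		return 1
	}
	os.Setenv("LARKSUITE_CLI_CONFIG_DIR", dir)
	os.Setenv("LARKSUITE_CLI_REMOTE_META", "off")

	code := run()

	// Explicit cleanup: this runs before the caller reaches os.Exit.
	os.RemoveAll(dir)
	return code
}

// In a test file this would be wired as:
//
//	func TestMain(m *testing.M) { os.Exit(runIsolated(m.Run)) }

// configDir reports the config dir currently set in the environment.
func configDir() string { return os.Getenv("LARKSUITE_CLI_CONFIG_DIR") }

// exists reports whether path is present on disk.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	var dir string
	code := runIsolated(func() int {
		dir = configDir()
		return 0
	})
	fmt.Println(code, exists(dir)) // prints "0 false"
}
```

Keeping os.Exit outside the helper means the cleanup always completes before the process ends.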

* feat(schema): wire affordance overlay into envelope _meta

Replace the loadAffordance stub (which always returned nil and read
from an empty embedded annotations/ directory) with parseAffordance,
which lifts the affordance block from method["affordance"]. The block
is authored under larksuite-cli-registry's registry-config.yaml in the
overrides: section and flows through gen-registry.py's deep_merge into
the embedded meta_data.json.

Simplify buildMeta signature: the service/resourcePath/method args
existed only to feed the old dotted-path lookup.

Refresh 9 golden envelopes for unrelated upstream meta_data.json drift.

* refactor(schema): drop x-in extension from inputSchema

x-in (path/query/body) was an HTTP-shape leak in a CLI-facing tool spec.
AI consumers call the CLI by name with named args — they never construct
HTTP requests directly, so the path-vs-body-vs-query distinction is the
CLI's internal concern, not part of the contract.

Execution path (cmd/service/service.go) already reads location from
meta_data.json directly, so removing x-in does not affect routing.

Drop:
- Property.XIn field
- validXIn map and the two lint rules that depend on x-in
  (L1 "top-level missing x-in" and L2 "path field must be in required")
- contains() helper, no longer referenced after the path-required rule
  went away

Refresh 20 goldens for the now-absent x-in lines.

* refactor(schema): wrap inputSchema into params/data/flags sub-objects

Replace the flat inputSchema with a 3-bucket nested structure that mirrors
the CLI's actual flag layout, so AI consumers can directly map envelope
fields to lark-cli invocation:

  inputSchema:
    properties:
      params: { ...path + query fields  }   → CLI --params JSON
      data:   { ...body fields           }   → CLI --data   JSON
      flags:  { yes: ... }                  → CLI --yes (only for high-risk-write)

Each sub-object only appears when the method has the corresponding source,
so read-only GETs have a single `params` block, body-only POSTs have a
single `data` block, etc.

The `flags` wrapper carries an explicit description marking it as a CLI
control bucket (not API fields), so AI does not confuse `yes` with a
backend parameter.

Lint:
- L2 walkForL2 helper recurses into params/data sub-objects so leaf
  invariants (format:binary on non-string, min<max, required-in-properties)
  still apply.
- L3 yes-presence check now navigates flags.properties.yes.

Refresh all 20 goldens for the new shape.

* refactor(schema): drop flags wrapper, put yes at top level alongside params/data

The flags wrapper added one extra layer for a single field. Flatten so
inputSchema.properties has three siblings:

  inputSchema:
    properties:
      params: { ...path + query    }   → CLI --params
      data:   { ...body            }   → CLI --data
      yes:    { boolean, default:false }   → CLI --yes (only when risk == high-risk-write)

`yes` description strengthened to mark it as a CLI confirmation gate
(consumed by lark-cli, not sent to the backend), so AI can still
distinguish it from API fields without needing a wrapper.

Lint L3 yes-presence check goes back to top-level Properties.Map["yes"].
Refresh 20 goldens.

* feat(schema): add `file` top-level sub-object for binary upload fields

Splits file fields out of `data` into their own sibling, so the four
top-level slots in inputSchema map 1:1 to CLI flag dispatch:

  inputSchema.properties:
    params  { path + query fields }                   → --params JSON
    data    { non-file body fields }                  → --data   JSON
    file    { type:file body fields, format:binary }  → --file <key>=<path>
    yes     boolean                                   → --yes (only when risk == high-risk-write)

Each slot is conditional: only registered when the method actually has
fields for that source. This matches the CLI's own conditional flag
registration (cmd/service/service.go:170-195), so what AI sees in the
schema is exactly what flags exist for that method.

The file sub-object carries a description explaining its semantics so AI
knows to use --file for those fields rather than embedding the binary
in --data JSON.
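
The slot dispatch described above can be sketched as a small bucketing function. The Field and Slots types and the example field names are hypothetical; the location labels (path/query/body), the "file" type, and the "high-risk-write" risk value are taken from the commit text.

```go
package main

import "fmt"

// Field is a hypothetical, simplified view of one meta_data field.
type Field struct {
	Name string
	In   string // "path", "query", or "body"
	Type string // "file" marks a binary upload field
}

// Slots lists the field names per inputSchema slot. An empty slice means the
// slot would not be registered for the method.
type Slots struct {
	Params []string // -> --params JSON
	Data   []string // -> --data JSON
	File   []string // -> --file <key>=<path>
	Yes    bool     // -> --yes, only for high-risk-write
}

// buildSlots assigns each field to exactly one slot.
func buildSlots(fields []Field, risk string) Slots {
	var s Slots
	for _, f := range fields {
		switch {
		case f.In == "path" || f.In == "query":
			s.Params = append(s.Params, f.Name)
		case f.Type == "file":
			s.File = append(s.File, f.Name)
		default:
			s.Data = append(s.Data, f.Name)
		}
	}
	s.Yes = risk == "high-risk-write"
	return s
}

func main() {
	// Field names below are invented for the example.
	fields := []Field{
		{Name: "item_id", In: "path", Type: "string"},
		{Name: "kind", In: "body", Type: "string"},
		{Name: "blob", In: "body", Type: "file"},
	}
	s := buildSlots(fields, "write")
	fmt.Println(s.Params, s.Data, s.File, s.Yes) // prints "[item_id] [kind] [blob] false"
}
```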

Refresh im.images.create golden (the only file-upload method in the
golden set).

* test(schema): cover L2 lint recursion into params/data sub-objects

Add two negative test cases that stuff bad values inside the wrapped
inputSchema sub-objects (rather than at top-level), to lock in
walkForL2's recursive coverage:

  - format:binary on a non-string field nested under params
  - sub-object Required referencing a key not in its Properties

Regression guard so future walkForL2 refactors do not silently lose
recursion and let leaf-field violations slip past lint.

* fix(schema): coerce example, aggregate nested required, fix path hint

- coerce `example` literal to the declared JSON Schema type (rename
  coerceEnumValue -> coerceLiteral, drop on coerce failure to match the
  `default` policy). Without this, integer/boolean/number fields emitted
  string examples and failed strict validators.
- aggregate child field `required:true` into the enclosing nested
  object's `required[]` (both object and array-items shapes). Previously
  only the top-level params/data sub-objects scanned `required`, so
  envelopes silently under-reported the real call contract.
- check method existence before reporting trailing-segment failure in
  both JSON and pretty `schema` paths. A typo like `schema im messages
  typo extra` now reports "Unknown method: im.messages.typo" instead of
  the misleading "Method 'typo' exists but trailing segments ..." hint.
- extract risk level constants (RiskRead / RiskWrite / RiskHighRiskWrite)
  in internal/cmdutil/risk.go; replace literal usages in schema, lint,
  and confirm helpers so the typo radius is one file.
- reconcile AssembleEnvelope docstring with implementation reality (the
  package-level currentMethodOrder + assembleMu serialize concurrent
  callers; output is deterministic per inputs).
- drop testdata/golden/ and golden_test harness. End-to-end envelope
  shape regression now relies on real CLI invocations and the existing
  property-level unit + lint coverage.
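
The nested required aggregation from the second bullet can be sketched as a recursive builder. The Field type and objectSchema function are hypothetical, and arrays are assumed to hold objects for simplicity; the real converter handles more shapes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Field is a hypothetical, simplified field description.
type Field struct {
	Name     string
	Type     string // "object", "array", or a primitive type name
	Required bool
	Children []Field
}

// objectSchema builds an object schema and collects the names of required
// children into required[] at the same level; nested objects and array
// items repeat the process recursively.
func objectSchema(fields []Field) map[string]any {
	props := map[string]any{}
	required := []string{} // rendered as [] even when empty
	for _, f := range fields {
		if f.Required {
			required = append(required, f.Name)
		}
		switch f.Type {
		case "object":
			props[f.Name] = objectSchema(f.Children)
		case "array":
			// Assumption of this sketch: array elements are objects.
			props[f.Name] = map[string]any{
				"type":  "array",
				"items": objectSchema(f.Children),
			}
		default:
			props[f.Name] = map[string]any{"type": f.Type}
		}
	}
	return map[string]any{"type": "object", "properties": props, "required": required}
}

func main() {
	schema := objectSchema([]Field{
		{Name: "user", Type: "object", Required: true, Children: []Field{
			{Name: "id", Type: "string", Required: true},
			{Name: "nick", Type: "string"},
		}},
	})
	b, _ := json.Marshal(schema)
	fmt.Println(string(b))
}
```

In this example the inner object reports required ["id"] and the outer one reports required ["user"], so each level states its own contract.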

* fix(schema): emit items:{} for all typeless arrays, restore lint gate

The list→array fallback only added items:{} when the source type was
"list", leaving ~64 natively-typed array fields (e.g.
approval.instances.cc.cc_user_ids) as {type:"array"} with no items.
These violated the L1 lint rule, but TestAllEnvelopesPass skipped the
"array missing items" error as a known data inconsistency, so the MCP
tool contract was not actually lint-clean.

Relax the fallback to cover every array lacking element shape regardless
of source type, and drop the lint-test skip so the gate is hard again.
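
A minimal sketch of the relaxed fallback, assuming properties are held as generic maps. normalizeArray is a hypothetical name and the map representation is an assumption; the real converter works on its own property type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeArray is a hypothetical helper. It translates the non-standard
// type "list" to "array" and adds items:{} to every array that lacks an
// element shape, regardless of the source type. It modifies prop in place.
func normalizeArray(prop map[string]any) map[string]any {
	if prop["type"] == "list" {
		prop["type"] = "array"
	}
	if prop["type"] == "array" {
		if _, ok := prop["items"]; !ok {
			prop["items"] = map[string]any{}
		}
	}
	return prop
}

func main() {
	for _, p := range []map[string]any{
		{"type": "list"},  // translated, then given items:{}
		{"type": "array"}, // natively typed array, also given items:{}
	} {
		b, _ := json.Marshal(normalizeArray(p))
		fmt.Println(string(b)) // prints {"items":{},"type":"array"} for both
	}
}
```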
2026-05-27 12:04:01 +08:00

// Copyright (c) 2026 Lark Technologies Pte. Ltd.
// SPDX-License-Identifier: MIT

package schema

import (
	"reflect"
	"testing"
)

// TestParsePath checks that a single dotted argument is split on '.', while
// multiple arguments pass through unchanged, so a dotted resource key such as
// chat.members survives as one segment in the space form.
func TestParsePath(t *testing.T) {
	tests := []struct {
		name string
		args []string
		want []string
	}{
		{"empty args -> nil", nil, nil},
		{"empty slice -> nil", []string{}, nil},
		{"single dotted", []string{"im.messages.reply"}, []string{"im", "messages", "reply"}},
		{"single no-dot", []string{"im"}, []string{"im"}},
		{"multi args", []string{"im", "messages", "reply"}, []string{"im", "messages", "reply"}},
		{"two args", []string{"im", "messages"}, []string{"im", "messages"}},
		{"nested resource dotted", []string{"im.chat.members.bots"}, []string{"im", "chat", "members", "bots"}},
		{"nested resource space form", []string{"im", "chat.members", "bots"}, []string{"im", "chat.members", "bots"}},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := ParsePath(tt.args)
			if !reflect.DeepEqual(got, tt.want) {
				t.Errorf("ParsePath(%v) = %v, want %v", tt.args, got, tt.want)
			}
		})
	}
}