docs: refine lark-drive knowledge organize workflow (#1253)

Change-Id: I49b4f398d60c5bb073d6c8d61987bd16f1a29c4e
2026-07-03 14:02:43 +08:00 · 2026-06-04 15:31:46 +08:00
parent 256df8c0fb
commit c000dc3a44
6 changed files with 314 additions and 57 deletions
--- a/skills/lark-drive/SKILL.md
+++ b/skills/lark-drive/SKILL.md
@@ -295,12 +295,14 @@ lark-cli drive <resource> <method> [flags] # 调用 API
 ```

 > **重要**：使用原生 API 时，必须先运行 `schema` 查看 `--data` / `--params` 参数结构，不要猜测字段格式。
+>
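+> **Frequently used native command:** To read a Drive folder listing, use `drive files list`. Follow the templates in [`references/lark-drive-files-list.md`](references/lark-drive-files-list.md): pass `folder_token` / `page_token` through `--params` and handle pagination manually. Do not feed `--page-all` output directly to a JSON-parsing script.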

 ### files

  - `copy` — 复制文件
  - `create_folder` — 新建文件夹
-  - `list` — 获取文件夹下的清单
+  - `list` — List the items in a folder; read [`references/lark-drive-files-list.md`](references/lark-drive-files-list.md) before use
  - `patch` — 修改文件标题

 ### file.comments
--- a/skills/lark-drive/references/lark-drive-files-list.md
+++ b/skills/lark-drive/references/lark-drive-files-list.md
@@ -0,0 +1,158 @@
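+# drive files list (native API: read a Drive folder listing)
+
+`drive files list` is a native API command, not a shortcut. It reads the direct children of the Drive root or of a given Drive folder; to inventory a directory tree recursively, the Agent must call this command again with each returned subfolder token.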
+
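+## When to use
+
+| Scenario | Use it? | Notes |
+|------|----------|------|
+| Inventory a confirmed Drive folder tree | Yes | List recursively starting from the target `folder_token` |
+| Inventory a Drive root that the user has explicitly confirmed | Yes | Use an empty `folder_token` for the first level; recurse into subfolders as ordinary folders |
+| Verify the actual location after a move / create | Yes | Read the direct children of the target directory, then recurse to verify as needed |
+| Find resources by keyword, title, time, or owner | No | Prefer `drive +search` |
+| Read Docx body content | No | Use `docs +fetch --api-version v2` |
+| Read data inside a Sheet / Base | No | Switch to `lark-sheets` / `lark-base` |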
+
+## Standard command templates
+
+List an ordinary folder:
+
+```bash
+lark-cli drive files list \
+  --params '{"folder_token":"<folder_token>","page_size":200}' \
+  --format json
+```
+
+Fetch the next page:
+
+```bash
+lark-cli drive files list \
+  --params '{"folder_token":"<folder_token>","page_size":200,"page_token":"<PAGE_TOKEN>"}' \
+  --format json
+```
+
+List the direct children of the current user's Drive root:
+
+```bash
+lark-cli drive files list \
+  --params '{"folder_token":"","page_size":200}' \
+  --format json
+```
+
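+The `folder_token` field can also be omitted to request the root, but in Agent orchestration it is better to pass an explicit empty string, so that "forgot to pass the parameter" is not confused with "deliberately requested the root".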
+
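+## Parameter rules
+
+1. `folder_token` must go inside the `--params` JSON; do not use the nonexistent `--folder-token` flag.
+2. `page_token` must go inside the `--params` JSON; do not rely on shell variable interpolation that can produce incomplete JSON.
+3. Set `page_size` explicitly; `200` is recommended. If the server or environment returns a parameter error, fall back to a value the server allows and record the reason for the fallback.
+4. If the field structure is unclear before the call, run `lark-cli schema drive.files.list` to inspect the `--params` structure.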
+
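+## Response structure and parsing
+
+In `--format json` output, the Agent uses only the API response fields under `data` that match `schema drive.files.list`.
+
+Common fields:
+
+| Field | Purpose |
+|------|------|
+| `data.files` | Direct children on the current page |
+| `data.has_more` | Whether the current directory has another page |
+| `data.next_page_token` | Token for the next page; when `has_more=true`, put it back into `--params.page_token` |
+| `data.files[].type` | File type; recurse when it equals `folder` |
+| `data.files[].token` | Token of the resource; for folders, used as the next level's `folder_token` |
+| `data.files[].name` | Used to build the path and the display title |
+| `data.files[].url` | Browser link to the resource |
+| `data.files[].owner_id` | Resource owner |
+| `data.files[].created_time` / `data.files[].modified_time` | Creation / modification time |
+
+Field names follow `schema drive.files.list`. The Agent MUST go by the actual response; if a field is missing, confirm the structure with `schema drive.files.list` or a one-page sample instead of guessing.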
+
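+## Root directory semantics
+
+1. When `folder_token` is an empty string or omitted, the request targets the direct children of the calling user's Drive root.
+2. The root response is not recursive; do not treat the root's first page or direct-child count as the total number of resources in the cloud space.
+3. The root is only the starting point of the tree. Each returned subfolder must be listed by calling `drive files list` again with its own `folder_token`.
+4. According to the schema description, the root first-level listing does not support pagination and does not return shortcuts; do not infer from the root response that subfolder contents, root first-level shortcuts, or root items that cannot be paged have been covered.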
+
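+## Recursive inventory rules
+
+1. Recurse only into returned items of type `folder`.
+2. Maintain pagination state per directory; a `page_token` from one directory must not be reused for another.
+3. Keep requesting each directory until it returns `has_more=false`. Listings of ordinary (non-root) folders may return `type=shortcut` entries; do not assume these entries carry `shortcut_info` target information.
+4. Generate a stable `path` during recursion; do not store titles only, or resources with the same name cannot be told apart.
+5. For URL, owner, creation time, and modification time, prefer the fields returned by `files.list`; use `drive metas batch_query` only when fields are missing or need to be filled in bulk. Do not guess metadata from titles or paths.
+6. Limits on depth, item count, or pages per directory may serve only as internal batch checkpoints; they are not a completion condition for the recursion.
+7. When a depth checkpoint is reached, add the deeper subfolders to the continuation queue and resume from them in the next batch, preserving their original `path`.
+8. When an item-count checkpoint is reached, save the current directory, current page token, remaining directory queue, and collected resource count, then continue with the next batch immediately; do not enter the analysis or planning stage.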
+
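+### Recursive algorithm
+
+When inventorying a Drive folder tree, the Agent proceeds in this order:
+
+1. Initialize the pending queue with the starting directory:
+   - Ordinary folder: `{folder_token:"<folder_token>", path:"<folder_name>"}`
+   - Drive root: `{folder_token:"", path:""}`
+2. Take a directory from the queue and request its first page.
+3. Build the current page key from `(folder_token, page_token)`; append each page key only once, so that retries do not double-count.
+4. Take the direct children on the current page from `data.files`, deduplicate by `dedupe_key`, generate `path`, and add them to the result set.
+5. If a newly appended child is a `folder`, enqueue its token, child path, and depth.
+6. If `has_more=true`, use `data.next_page_token` to request the next page of the same directory.
+7. After a directory's pagination ends, process the next directory in the queue.
+8. If a depth, item-count, or pages-per-directory checkpoint is reached, write the current directory / page token / remaining queue / visited page keys / dedupe keys to the continuation queue and continue with the next batch.
+9. The inventory of the confirmed scope is complete only when both the ordinary queue and the continuation queue are empty and there is no pagination blocker.
+
+Simplified pseudocode: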
+
+```text
+queue = [root_or_start_folder]
+visited_pages = set()
+dedupe_keys = set()
+while queue not empty:
+  folder = queue.pop()
+  page_token = folder.page_token or ""
+  retry_without_token = 0
+  while true:
+    page_key = (folder.folder_token, page_token or "first")
+    page = drive files list(folder.folder_token, page_token)
+    if page_key not in visited_pages:
+      append only files whose dedupe_key is not in dedupe_keys
+      enqueue newly appended child folders with folder_token, path, and depth
+      add page_key to visited_pages
+    if page.has_more != true:
+      break
+    next = page.next_page_token
+    if next is empty:
+      retry_without_token += 1
+      if retry_without_token >= 3:
+        record pagination blocker for folder
+        break
+      continue
+    page_token = next
+    retry_without_token = 0
+```
+
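+## Pagination and exceptions
+
+1. By default, handle `has_more` and the returned `next_page_token` manually.
+2. Do not use `--page-all` output as input for script JSON parsing; auto-paginated output may not be suitable for a direct `json.loads`.
+3. If `has_more=true` but no usable `next_page_token` is returned, retry the same page at most 3 times.
+4. If there is still no continuation token after retrying, record the affected directory and the pagination blocker, and stop expanding that directory; do not loop forever, and do not claim the directory is fully covered.
+5. If a depth, item-count, or pages-per-directory limit is triggered, treat it as a batch checkpoint; continue with the next batch within the confirmed scope rather than presenting the current result as complete.
+6. Do not end the inventory just because `max_depth=3`, `max_items=500`, or a similar per-batch threshold was reached; the inventory of the confirmed scope may end only when the queue is exhausted or a permission / API / tool-budget blocker is hit.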
+
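+## JSON parsing rules
+
+1. stdout is the data channel. Scripts read only stdout when parsing JSON.
+2. stderr may contain token refreshes, progress, warnings, or other notices; do not merge stderr into the JSON input, for example by running `json.loads` after `2>&1`.
+3. Use `--format json` to keep stdout structured JSON; when parsing a Drive file listing, read only schema fields such as `data.files` / `data.has_more` / `data.next_page_token`.
+4. Do not infer the recursive total from the root response count or the current page count; the recursive total must be computed from the set of resources actually traversed and deduplicated.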
+
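+## Common mistakes
+
+| Incorrect usage | Problem | Correct approach |
+|----------|------|----------|
+| `lark-cli drive files list --folder-token <token>` | `files.list` has no `--folder-token` flag | Use `--params '{"folder_token":"<token>"}'` |
+| Treating the N items returned by the root as the whole cloud space | The root returns only direct children, not a recursive result | Keep recursing into the returned subfolders |
+| `--page-all \| python json.loads(...)` | Auto-paginated output is not suitable for parsing as a single JSON object | Page manually with `page_token` and parse page by page |
+| Parsing JSON after `cmd 2>&1` | stderr notices pollute the JSON input | Parse only stdout; treat stderr as logs |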
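
Reviewer sketch (not part of the patch): the traversal in `lark-drive-files-list.md` could look roughly like the Python below. It is a minimal sketch under stated assumptions, not the skill's implementation. The function names `cli_fetch_page` and `inventory` are hypothetical, the resource `token` is assumed to serve as the `dedupe_key`, and checkpoint persistence is left out. The CLI invocation uses only the `--params` / `--format json` form shown in the templates, and only stdout is parsed.

```python
import json
import subprocess
from collections import deque

MAX_TOKEN_RETRIES = 3  # mirrors the retry limit in the pseudocode


def cli_fetch_page(folder_token, page_token=""):
    """Fetch one page through the CLI; parse stdout only, leave stderr as logs."""
    params = {"folder_token": folder_token, "page_size": 200}
    if page_token:
        params["page_token"] = page_token
    proc = subprocess.run(
        ["lark-cli", "drive", "files", "list",
         "--params", json.dumps(params), "--format", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout).get("data", {})


def inventory(start_token, start_path, fetch_page=cli_fetch_page):
    """Walk a folder tree and return (items, blockers)."""
    queue = deque([(start_token, start_path, 0)])
    visited_pages, dedupe_keys = set(), set()
    items, blockers = [], []
    while queue:
        folder_token, path, depth = queue.popleft()
        page_token, retries = "", 0
        while True:
            page = fetch_page(folder_token, page_token)
            page_key = (folder_token, page_token or "first")
            if page_key not in visited_pages:  # a retried page is appended once
                visited_pages.add(page_key)
                for f in page.get("files", []):
                    if f["token"] in dedupe_keys:  # assumption: token is the dedupe key
                        continue
                    dedupe_keys.add(f["token"])
                    child_path = f"{path}/{f['name']}" if path else f["name"]
                    items.append({**f, "path": child_path, "depth": depth + 1})
                    if f.get("type") == "folder":
                        queue.append((f["token"], child_path, depth + 1))
            if not page.get("has_more"):
                break
            next_token = page.get("next_page_token") or ""
            if not next_token:
                retries += 1
                if retries >= MAX_TOKEN_RETRIES:
                    blockers.append({
                        "folder_token": folder_token,
                        "path": path,
                        "reason": "has_more=true without next_page_token",
                    })
                    break
                continue  # request the same page again
            page_token, retries = next_token, 0
    return items, blockers
```

Passing `fetch_page` as a parameter keeps the traversal testable without the CLI. A non-empty `blockers` list means the affected directory must not be reported as fully covered.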
--- a/skills/lark-drive/references/lark-drive-workflow-knowledge-organize-analysis.md
+++ b/skills/lark-drive/references/lark-drive-workflow-knowledge-organize-analysis.md
@@ -24,7 +24,8 @@ MUST:
 4. Switch to `lark-sheets` / `lark-base` only when sheet / bitable title and path are insufficient.
 5. Record read evidence for classification.
 6. Continue reading low-confidence resources in internal batches until all supported low-confidence resources in the current inventory are processed or a blocker occurs.
-7. Output progress / summary without asking the user to continue between batches.
+7. Apply `Analysis Progress Reporting`.
+8. Output progress / summary without asking the user to continue between batches.

 Exit: low-confidence items are classified or marked `needs_review=true`.

@@ -93,6 +94,30 @@ Output this summary:

 - After every 50 processed low-confidence resources.
 - Once after low-confidence reading finishes.
+- About every 60 seconds during long-running reads, even if fewer than 50 additional resources were processed.
+
+### Analysis Progress Reporting
+
+Applies to `CONTENT_READ`, `ISSUE_ANALYSIS`, and `RULE_GENERATION`.
+
+Rules:
+
+1. For `CONTENT_READ`, use `Low-Confidence Read Summary` as the progress report format.
+2. For `ISSUE_ANALYSIS`, if analysis runs longer than about 60 seconds, output progress about every 60 seconds with current stage, processed resource count when known, detected problem type count when known, and the next analysis step.
+3. For `RULE_GENERATION`, if classification rule or target-tree generation runs longer than about 60 seconds, output progress about every 60 seconds with current stage, classified item count when known, unresolved item count when known, and target category / path count when known.
+4. Progress reports MUST be factual and stage-specific. Do not output generic "still running" messages without counts or the current stage.
+5. Do not ask the user to continue between internal batches unless auth, permission, API, target scope, or environment blockers occur.
+6. Do not expose internal chain-of-thought, raw tokens, or intermediate rule drafts.
+
+Examples:
+
+```text
+分析进度：正在归纳整理问题，已处理 <processed_count>/<resource_count> 项资源，已识别 <problem_type_count> 类问题。继续生成整理思路，不会执行移动或创建。
+```
+
+```text
+规则生成进度：正在生成分类规则和目标目录，已归类 <classified_count> 项，待人工确认 <needs_review_count> 项。继续生成完整计划前置数据。
+```

 ## State: ISSUE_ANALYSIS

@@ -103,8 +128,9 @@ MUST:
 1. Detect problems from organization perspective only. Do not generate research conclusions.
 2. Generate an organization approach based on inventory, low-confidence read evidence, and detected problems.
 3. Include how non-reused source containers will be handled after their contents are moved.
-4. Output `Inventory And Organization Approach Decision`.
-5. Stop and wait for the user to confirm the approach before `RULE_GENERATION`.
+4. Apply `Analysis Progress Reporting`.
+5. Output `Inventory And Organization Approach Decision`.
+6. Stop and wait for the user to confirm the approach before `RULE_GENERATION`.

 Problem rules:

@@ -161,10 +187,10 @@ MUST output evidence count or example paths. Do not output only abstract judgmen
 是否基于这个整理思路生成目标目录和移动 / 创建计划？

 你可以选择：
-A. 基于这个思路生成目标目录和计划
-B. 调整整理思路
-C. 查看问题详情
-D. 取消本次整理
+1. 基于这个思路生成目标目录和计划
+2. 调整整理思路
+3. 查看问题详情
+4. 取消本次整理
 ```

 ## State: RULE_GENERATION
@@ -181,7 +207,8 @@ MUST:
 6. For non-reused source containers, ensure `target_tree` includes a source-container cleanup target, defaulting to `待人工确认/待清理旧目录`, unless the user explicitly asks to keep source containers in place.
 7. Ensure target tree can contain every planned `target_path`.
 8. Ensure the target tree contains a manual confirmation target named `待人工确认` unless the user explicitly provides an equivalent name.
-9. Continue to `PLAN_GENERATION` without a separate target-tree-only confirmation.
+9. Apply `Analysis Progress Reporting`.
+10. Continue to `PLAN_GENERATION` without a separate target-tree-only confirmation.

 ### Classification

--- a/skills/lark-drive/references/lark-drive-workflow-knowledge-organize-discovery.md
+++ b/skills/lark-drive/references/lark-drive-workflow-knowledge-organize-discovery.md
@@ -10,8 +10,9 @@ Before executing rules in this file:

 1. Follow [`../../lark-shared/SKILL.md`](../../lark-shared/SKILL.md) for identity, auth, and permission handling.
 2. For Wiki / personal library targets, follow [`../../lark-wiki/SKILL.md`](../../lark-wiki/SKILL.md).
-3. For Drive search targets, follow [`lark-drive-search.md`](lark-drive-search.md).
-4. For URL / token inspection, follow [`lark-drive-inspect.md`](lark-drive-inspect.md) and [`../../lark-wiki/references/lark-wiki-node-get.md`](../../lark-wiki/references/lark-wiki-node-get.md).
+3. For Drive folder inventory, follow [`lark-drive-files-list.md`](lark-drive-files-list.md).
+4. For Drive search targets, follow [`lark-drive-search.md`](lark-drive-search.md).
+5. For URL / token inspection, follow [`lark-drive-inspect.md`](lark-drive-inspect.md) and [`../../lark-wiki/references/lark-wiki-node-get.md`](../../lark-wiki/references/lark-wiki-node-get.md).

 ## State: PARSE_SCOPE

@@ -87,6 +88,10 @@ Clarification template:
 请确认是否按这个范围继续？
 ```

+Scope confirmation is user-facing. It MUST confirm only the business scope, environment / profile, identity, and whether write operations will run.
+
+Do not display internal batching controls in scope confirmation, including `max_depth`, `max_items`, `page_size`, page tokens, retry counts, or `partial=true`. For example, when the user confirms Drive root, say the scope is the Drive root tree; do not append "recursive depth at most 3" or "at most 500 resources".
+
 ## State: INVENTORY

 Entry: `target_scope` confirmed.
@@ -96,20 +101,57 @@ MUST:
 1. Recursively list resources according to target type.
 2. Generate `path` during traversal.
 3. Normalize all results to `ResourceItem`.
-4. Track pagination, depth, and item limits.
-5. Set `partial=true` when limits are hit.
-6. Output `Inventory Summary`.
-7. Continue to `CONTENT_READ` without asking the user unless auth, permission, API, target scope, or environment blockers occur.
+4. Track pagination, depth, item limits, and continuation checkpoints.
+5. Treat pagination, depth, item, and per-folder page limits as batching checkpoints; continue inventory in the confirmed scope unless blocked.
+6. Set `partial=true` only when inventory cannot continue because of auth, permission, API / pagination failure after retries, API coverage limitations, tool budget, target scope, or environment blockers.
+7. Apply `Inventory Progress Reporting`.
+8. Output `Inventory Summary`.
+9. Do not leave `INVENTORY` while `inventory_continuation_state` has queued folders, nodes, pages, or slices that can still be fetched.
+10. Continue to `CONTENT_READ` without asking the user only after the confirmed scope is exhausted or blocked.

-### Inventory Limits
+### Inventory Batch Checkpoints

-| Scope | Default Limit | If Limit Is Hit |
-|-------|---------------|-----------------|
-| Wiki recursion | `max_depth=3`, `max_items=500`; follow `lark-wiki-node-list` pagination | Set `partial=true`; list covered paths and suggested next first-level directories |
-| Drive folder recursion | `max_depth=3`, `max_items=500`, max 10 pages per folder, `page_size=50` | Set `partial=true`; list folders not drilled into |
-| Search discovery | `page_size=20`, `max_items=500`; continue pages until `has_more=false` or `max_items` is reached | Set `partial=true`; report collected_count, service_total when available, page_count, and continuation information |
+| Scope | Internal Batch Checkpoint | Required Continuation |
+|-------|---------------------------|-----------------------|
+| Wiki recursion | `max_depth=3`, `max_items=500`; follow `lark-wiki-node-list` pagination | Record queued nodes / paths in `inventory_continuation_state` and immediately continue the next internal batch within the confirmed scope unless blocked |
+| Drive folder tree | `max_depth=3`, `max_items=500`, max 10 pages per folder, `page_size=200` | Record queued folders / pages in `inventory_continuation_state` and immediately continue the next internal batch within the confirmed scope unless blocked |
+| Search discovery | `page_size=20`, `max_items=500`; continue pages until `has_more=false` | Record remaining pages / slices in `inventory_continuation_state` and immediately continue the next internal batch within the confirmed scope unless blocked |

-If the user explicitly asks for full processing, batch by first-level directory, Wiki space, or time window. Do not remove all limits in one run.
+These checkpoints are pacing controls, not coverage limits. If the confirmed scope still has queued work after a checkpoint, continue with the next internal batch instead of presenting the current `resource_items` as final inventory or moving to content analysis.
+
+When a depth checkpoint is reached, enqueue the child folders / nodes that would exceed the current batch depth; the next batch starts from those queued children with their original paths preserved. When an item checkpoint is reached, persist the current folder / node / page cursor plus the remaining queue, visited page keys, and resource dedupe keys, then continue from that checkpoint before analysis or planning.
+
+If tool budget would be exceeded for a very large confirmed scope, stop only at that blocker, report that the inventory is incomplete, and suggest batching by first-level directory, Wiki space, or time window. Do not stop merely because a depth or item checkpoint was reached.
+
+### Inventory Continuation Rules
+
+1. Pagination, depth, item, and per-folder page limits are internal batching checkpoints.
+2. When a checkpoint is reached, record `inventory_continuation_state` with `scope`, `queue`, `current_cursor`, `visited_page_keys`, `dedupe_keys`, and `blockers`; Drive queue entries MUST contain `folder_token`, `path`, `depth`, and `page_token`; Wiki queue entries MUST contain `space_id` / `node_token`, `path`, `depth`, and pagination cursor; search entries MUST contain query / filters and pagination cursor.
+3. A depth checkpoint MUST enqueue deeper folders / nodes; it MUST NOT discard them or treat the current depth as final coverage.
+4. An item-count checkpoint MUST persist the current cursor and queue; it MUST NOT transition to `CONTENT_READ`, `ISSUE_ANALYSIS`, or `PLAN_GENERATION` while fetchable work remains.
+5. If `inventory_continuation_state` is missing, corrupt, or lacks required fields for the current scope, set `partial=true`, record the checkpoint blocker, and do not claim full coverage.
+6. Do not set `partial=true` solely because a valid batching checkpoint was reached.
+7. Set `partial=true` only when continuation is blocked by auth, permission, API / pagination failure after retries, API coverage limitations, tool budget, target scope, or environment blockers.
+8. Do not claim full coverage until the continuation queue for the confirmed scope is exhausted or blocked.
+
+### Inventory Progress Reporting
+
+Inventory can be long-running when a Drive root, large folder tree, Wiki space, or broad search scope is confirmed.
+
+Rules:
+
+1. When inventory starts, output one concise stage notice with the confirmed scope type and the fact that no write operation will be executed.
+2. If inventory runs longer than about 60 seconds, output progress about every 60 seconds.
+3. Progress reports SHOULD include only fields that are currently known: scanned folders / nodes, collected resources, current depth, queued folders / nodes, current search page / slice, and current blocker if any.
+4. When a batching checkpoint is reached and continuation will proceed automatically, report it as continuing inventory, not as a user action request.
+5. Do not output filler such as "still running" without current counts or current stage.
+6. Do not expose raw folder tokens, page tokens, retry logs, or `partial=true` unless the user explicitly asks to view inventory coverage details.
+
+Example:
+
+```text
+盘点进度：已扫描 <scanned_container_count> 个目录 / 节点，收集 <resource_count> 项资源，队列剩余 <queued_container_count> 个目录 / 节点。继续盘点，不会执行移动或创建。
+```

 ### Wiki Inventory Rules

@@ -120,11 +162,13 @@ If the user explicitly asks for full processing, batch by first-level directory,

 ### Drive Inventory Rules

-1. Use CLI command family `drive files list` according to `lark-drive` API rules; its schema path is `drive.files.list`.
-2. Recurse only into `folder` items.
-3. Use `drive metas batch_query` when URL, owner, created time, or updated time is needed.
-4. Continue pages by feeding `next_page_token` into request param `page_token`.
-5. Prefer explicit `folder_token`; querying root with empty `folder_token` may return broad root data and may not paginate as expected.
+1. Use `drive files list` according to [`lark-drive-files-list.md`](lark-drive-files-list.md); its schema path is `drive.files.list`.
+2. After the first request, traverse a Drive root exactly like an ordinary folder tree. Only the first-level root request differs: `folder_token` is omitted or empty, pagination is not supported, and, according to the schema, root-level shortcuts are not returned. Returned child folders MUST still be listed by their own folder tokens like ordinary folders, and those ordinary folder lists may return `type=shortcut` entries. For a Drive root target, record the root-level shortcut coverage caveat and do not claim root-level shortcut coverage as complete; set `partial=true` only if the user requested full root-level shortcut coverage or root pagination cannot continue.
+3. Recurse only into `folder` items within the confirmed scope.
+4. For each directory, continue pages manually by feeding the returned `next_page_token` into request param `page_token`. Do not rely on `--page-all` for inventory.
+5. If a page returns `has_more=true` but no usable `next_page_token`, retry the same page request up to 3 times. If retries still cannot produce a continuation token, set `partial=true` for that directory and record the pagination blocker.
+6. Use `drive metas batch_query` when URL, owner, created time, or updated time is needed.
+7. Pagination blocker details such as `partial=true`, folder token, page token, and retry logs are internal by default. Do not show them to the user unless the user explicitly asks to view inventory coverage details.

 ### Search Inventory Rules

@@ -132,10 +176,11 @@ If the user explicitly asks for full processing, batch by first-level directory,
 2. If a search result is a Wiki item and lacks `node_token`, resolve it with `drive +inspect` or `wiki +node-get` before dedupe.
 3. If Wiki identity still cannot be resolved, keep the item, set `needs_review=true`, and record `needs_review_reason`.
 4. For search scope, use `page_size=20` unless a lower value is required by the command.
-5. Continue fetching pages until `has_more=false` or `max_items` is reached.
-6. Do not stop at an arbitrary sample size such as first 5 pages unless the user explicitly asks for sampling or auth, permission, API, environment, or tool-budget blockers occur.
-7. If `service_total` / result total is greater than collected items, set `partial=true` and show collected_count, service_total, page_count, and continuation information.
-8. Do not present a partial search sample as complete inventory. Before generating a full organization plan from partial search results, ask whether to continue fetching more pages or proceed with sample-based planning.
+5. Continue fetching pages until `has_more=false`.
+6. If `max_items=500` is reached in one batch, record the current search cursor in `inventory_continuation_state` and continue the next internal batch without asking the user.
+7. Do not stop at an arbitrary sample size such as first 5 pages unless the user explicitly asks for sampling or auth, permission, API, environment, or tool-budget blockers occur.
+8. If `service_total` / result total is greater than collected items, treat it as continuation evidence: continue fetching when a cursor / page is available; set `partial=true` only if continuation is blocked.
+9. Do not present a partial search sample as complete inventory. Before generating a full organization plan from partial search results, continue fetching available pages unless the user explicitly asked for sampling or a blocker prevents continuation.

 ## ResourceItem

@@ -179,7 +224,9 @@ ResourceItem rules:
 ## Inventory Summary

 ```text
-已完成盘点。
+已完成当前可覆盖范围盘点。
+
+<仅当适用：覆盖说明：Drive 根目录第一层清单不返回快捷方式；本次盘点不包含根目录第一层快捷方式。根目录下子文件夹会按普通文件夹继续盘点，普通文件夹内返回的 `type=shortcut` 条目仍会被纳入资源清单。>

 | 指标 | 数量 |
 |------|------|
@@ -202,4 +249,5 @@ ResourceItem rules:
 | Environment / profile is ambiguous | Ask user to confirm prod / BOE / PRE and profile | Do not cross environment boundaries |
 | Missing API scope | Follow `lark-shared` permission handling and stop | Do not retry the same command repeatedly |
 | Resource access denied | Stop and follow the main workflow `Permission Request Gate` | Do not request permission automatically or in batch |
-| Pagination / depth / item limit reached | Set `partial=true`; record uncovered range and continuation command | Do not claim full coverage |
+| Pagination / depth / item checkpoint reached | Record `inventory_continuation_state` and continue inventory in the confirmed scope | Do not set `partial=true` solely because a batching checkpoint was reached |
+| Pagination cursor missing after retries / API pagination failure | Set `partial=true`; record the affected directory and blocker | Do not loop indefinitely or claim full coverage |
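
Reviewer sketch (not part of the patch): the `inventory_continuation_state` rules for a Drive scope could be checked along these lines. This is an illustration under assumptions. The function names and the `"blocked"` / `"continue"` / `"done"` labels are hypothetical; only the required field names come from the Inventory Continuation Rules above.

```python
REQUIRED_STATE_FIELDS = ("scope", "queue", "current_cursor",
                         "visited_page_keys", "dedupe_keys", "blockers")
REQUIRED_DRIVE_ENTRY_FIELDS = ("folder_token", "path", "depth", "page_token")


def validate_drive_state(state):
    """Return a list of problems; an empty list means the checkpoint is usable."""
    if not isinstance(state, dict):
        return ["state is missing or not an object"]
    problems = [f"missing field: {key}"
                for key in REQUIRED_STATE_FIELDS if key not in state]
    queue = state.get("queue", [])
    if not isinstance(queue, list):
        return problems + ["queue is not a list"]
    for index, entry in enumerate(queue):
        if not isinstance(entry, dict):
            problems.append(f"queue[{index}] is not an object")
            continue
        for key in REQUIRED_DRIVE_ENTRY_FIELDS:
            if key not in entry:  # an empty folder_token (Drive root) is still valid
                problems.append(f"queue[{index}] missing {key}")
    return problems


def next_step(state):
    """Return (action, partial) for the INVENTORY state.

    Hypothetical action labels:
      "blocked"  - state is missing or corrupt; partial=true, no full-coverage claim
      "continue" - fetchable work remains; a checkpoint alone is not partial
      "done"     - queue exhausted; partial only if blockers were recorded
    """
    if validate_drive_state(state):
        return ("blocked", True)
    if state["queue"]:
        return ("continue", False)
    return ("done", bool(state["blockers"]))
```

The point of the split is that a non-empty queue never sets `partial=true` by itself; only a recorded blocker or an unusable checkpoint does.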
--- a/skills/lark-drive/references/lark-drive-workflow-knowledge-organize-planning.md
+++ b/skills/lark-drive/references/lark-drive-workflow-knowledge-organize-planning.md
@@ -24,7 +24,8 @@ MUST:
 4. Apply `Plan Pagination`.
 5. Set `active_plan_items` to the latest complete plan.
 6. Keep complete plan internally even if only one page is displayed.
-7. Output `Target Tree And Plan Overview` or requested plan page, then wait.
+7. Apply `Plan Generation Progress Reporting`.
+8. Output `Target Tree And Plan Overview` or requested plan page, then wait.

 ### Plan Generation

@@ -44,6 +45,25 @@ MUST:
 | Target parent token unresolved | Keep plan item but block execution until token is resolved |
 | Resource title is poor or inconsistent | Report the naming issue only; do not create rename or title-patch plan items |

+### Plan Generation Progress Reporting
+
+Plan generation can be long-running when `resource_items` is large or source-container parent / child move ordering is complex.
+
+Rules:
+
+1. If plan generation starts with more than 500 `resource_items`, output one concise start notice with the resource count and that no write operation is being executed.
+2. If plan generation runs longer than about 60 seconds, output progress about every 60 seconds.
+3. Progress reports SHOULD include only fields currently known: processed resource count, generated plan item count, create count, move count, source-container move count, review count, and current step.
+4. Do not display unpaginated plan details as progress. Complete `plan_items` remain internal until the normal paginated output.
+5. Do not ask the user to continue during plan generation unless auth, permission, API, target scope, or environment blockers occur.
+6. Do not output filler such as "still running" without current counts or current step.
+
+Example:
+
+```text
+计划生成进度：已处理 <processed_count>/<resource_count> 项资源，生成 <plan_item_count> 项计划，其中创建 <create_count> 项、移动 <move_count> 项。继续计算父子目录移动顺序，不会执行创建或移动。
+```
+
 ## PlanItem

 `PlanItem` is for internal execution. It may contain tokens and internal enums.
@@ -167,11 +187,11 @@ Confidence display map:
 - 低置信度：<low_count> 项

 你可以选择：
- 查看第 1 页明细
- 只看将创建的目录 / 节点
- 只看待人工确认项
- 只看高置信度移动项
- 进入执行确认
+1. 查看第 1 页明细
+2. 只看将创建的目录 / 节点
+3. 只看待人工确认项
+4. 只看高置信度移动项
+5. 进入下一步：确认执行计划
 ```

 If `total_count > 500`, say:
@@ -224,10 +244,10 @@ User-facing output:
 说明：后续执行默认基于这份完整修正版计划，不是只执行刚才的修正项。

 你可以选择：
-A. 查看修正版计划总览
-B. 查看本次修改涉及的资源
-C. 进入执行确认
-D. 继续调整
+1. 查看修正版计划总览
+2. 查看本次修改涉及的资源
+3. 进入下一步：确认执行计划
+4. 继续调整
 ```

 If the user explicitly asks to execute only the corrected items, ask for confirmation before execution:
@@ -248,15 +268,15 @@ If the user explicitly asks to execute only the corrected items, ask for confirm
 还有 <remaining_pages> 页未展示。

 你可以回复：
- 继续看下一页
- 只看待人工确认项
- 只看低置信度项
- 进入执行确认
+1. 继续看下一页
+2. 只看待人工确认项
+3. 只看低置信度项
+4. 进入下一步：确认执行计划
 ```

 ## State: EXEC_CONFIRM

-Entry: user asks to execute.
+Entry: user asks to view execution confirmation or continue toward execution.

 MUST:

@@ -284,17 +304,17 @@ Before execution confirmation, MUST show this notice:

 When the user wants execution, ask for execution scope:

-Execution confirmation options MUST be renumbered by currently available choices. Do not show disabled choices, and do not ask the user to reply with skipped letters.
+Execution confirmation options MUST be numbered consecutively, counting only the currently available choices. Do not show disabled choices, and do not ask the user to reply with skipped numbers.

 If a plan detail page is currently active:

 ```text
 请确认执行范围：

-A. 执行完整计划：<total_count> 项
-B. 只执行当前页：<current_page_count> 项
-C. 只执行高置信度项：<high_confidence_count> 项
-D. 暂不执行，只保留方案
+1. 执行完整计划：<total_count> 项
+2. 只执行当前页：<current_page_count> 项
+3. 只执行高置信度项：<high_confidence_count> 项
+4. 暂不执行，只保留方案

 本 workflow 只执行已确认范围内的创建、移动和必要的单资源权限申请；不会重命名任何资源。
 ```
@@ -304,9 +324,9 @@ If no plan detail page is currently active:
 ```text
 请确认执行范围：

-A. 执行完整计划：<total_count> 项
-B. 只执行高置信度项：<high_confidence_count> 项
-C. 暂不执行，只保留方案
+1. 执行完整计划：<total_count> 项
+2. 只执行高置信度项：<high_confidence_count> 项
+3. 暂不执行，只保留方案

 如需只执行某一页，请先查看计划明细页。

--- a/skills/lark-drive/references/lark-drive-workflow-knowledge-organize.md
+++ b/skills/lark-drive/references/lark-drive-workflow-knowledge-organize.md
@@ -89,7 +89,8 @@ Agent MUST maintain these internal fields during one workflow run:
 | `environment_profile` | Current environment and CLI profile, such as prod / BOE / PRE and config profile |
 | `identity` | `user` by default unless user explicitly asks for app / bot perspective |
 | `resource_items` | Complete normalized resource list from discovery |
-| `partial` | Whether inventory or content-read limits were hit |
+| `partial` | Whether inventory or content read cannot fully continue because of auth, permission, API / pagination failure after retries, API coverage limitations, tool budget, or scope blockers; batching checkpoints alone are not partial |
+| `inventory_continuation_state` | Structured checkpoint for continuing inventory batches within the confirmed scope. Must preserve `scope`, `queue`, `current_cursor`, `visited_page_keys`, `dedupe_keys`, and `blockers`; Drive queue entries carry `folder_token`, `path`, `depth`, and `page_token`; Wiki queue entries carry `space_id` / `node_token`, `path`, `depth`, and pagination cursor; search entries carry query / filters and pagination cursor. Missing or corrupt state is a blocker, not a completed inventory. |
 | `low_confidence_items` | Items requiring mandatory partial content read |
 | `issue_summary` | Problem types, counts, evidence paths, and suggested handling |
 | `classification_rules` | Rules used to map resources to target paths |
@@ -211,6 +212,7 @@ Never request permission automatically, never batch permission requests, and nev
 - [Rollback phase](lark-drive-workflow-knowledge-organize-rollback.md)
 - [lark-shared](../../lark-shared/SKILL.md)
 - [lark-drive](../SKILL.md)
+- [lark-drive-files-list](lark-drive-files-list.md)
 - [lark-drive-search](lark-drive-search.md)
 - [lark-drive-inspect](lark-drive-inspect.md)
 - [lark-drive-apply-permission](lark-drive-apply-permission.md)
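
Reviewer sketch (not part of the patch): the progress-reporting rules added in the analysis, discovery, and planning references share one cadence, "about every 60 seconds", and the low-confidence read summary adds "every 50 processed resources". A small gate like the one below could decide when a report is due. The class name `ProgressGate` and its parameters are hypothetical, and the report text itself would still follow the templates in each reference.

```python
import time


class ProgressGate:
    """Decide when a progress report is due: every N items or every T seconds."""

    def __init__(self, every_items=50, every_seconds=60.0, clock=time.monotonic):
        self.every_items = every_items
        self.every_seconds = every_seconds
        self.clock = clock  # injectable so the gate can be tested without sleeping
        self.last_count = 0
        self.last_time = clock()

    def due(self, processed_count):
        """Return True when enough items or enough time has passed since the last report."""
        now = self.clock()
        hit_items = processed_count - self.last_count >= self.every_items
        hit_time = now - self.last_time >= self.every_seconds
        if hit_items or hit_time:
            self.last_count, self.last_time = processed_count, now
            return True
        return False
```

For stages that report on time only, such as inventory or plan generation, `every_items` can be set to a very large value so that only the 60-second rule fires.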