feat(ci): refine PR business area labels and introduce skill format check (#148)

* feat(ci): add PR size label pipeline

* chore(ci): make PR label sync non-blocking

* feat(ci): add dry-run mode for PR label sync

* feat(ci): add PR label dry-run samples

* test(ci): update PR label samples with real historical merged PRs

Replaced synthetic or open PR samples with actual merged/closed PRs from the
repository to provide a more accurate reflection of the size label categorization.
Added 4 samples each for sizes S, M, and L covering docs, fixes, ci, and features.

* feat(ci): add high-level area tags for PRs

Based on user feedback, fine-grained domain labels (like `domain/base`) are too detailed for the early stages.
This change adds support for applying `area/*` tags to indicate which important top-level modules a PR touches.

Currently tracked areas:
- `area/shortcuts`
- `area/skills`
- `area/cmd`

Minor modules like docs, ci, and tests are intentionally excluded to keep tags focused on critical architectural components.

* refactor(ci): extract pr-label-sync logic to a dedicated directory

To avoid polluting the root `scripts/` directory, moved `sync_pr_labels.js` and
`sync_pr_labels.samples.json` into a new `scripts/sync-pr-labels/` folder.
Added a dedicated README to document its usage and behavior.
Updated `.github/workflows/pr-labels.yml` to reflect the new path.

* refactor(ci): rename pr label script directory for simplicity

Renamed `scripts/sync-pr-labels/` to `scripts/pr-labels/` to keep directory
names concise. Updated internal references and GitHub workflow files to point
to the new path.

* ci: add GitHub Actions workflow to check skill format

* test(ci): update sample json to include expected_areas

Added `expected_areas` lists to each sample in `samples.json` to reflect
the newly added `area/*` high-level module tagging logic. Allows testing
to accurately check both `size/*` and `area/*` outputs.

* refactor(scripts): move skill format check to isolated directory and add README

* test(scripts): add positive and negative tests for skill format check

* fix(scripts): revert skill changes and downgrade version/metadata checks to warnings

* fix(scripts): completely remove version check and skip lark-shared

* refactor(ci): improve pr-labels script readability and maintainability

- Reorganized code into logical sections with clear comments
- Encapsulated GitHub API interactions into a reusable `GitHubClient` class
- Extracted and centralized classification logic into a pure `evaluateRules` function
- Replaced magic numbers with named constants (`THRESHOLD_L`, `THRESHOLD_XL`)
- Fixed `ROOT` path resolution logic
- Simplified conditional statements and control flow

* ci: fix setup-node version in pr-labels workflow

* tmp

* refactor(ci): replace generic area labels with business-specific ones

- Add PATH_TO_AREA_MAP to map shortcuts/skills paths to business areas (im, vc, ccm, base, mail, calendar, task, contact)

- Replace importantAreas with businessAreas throughout the codebase

- Remove area/shortcuts, area/skills, area/cmd generic labels

- Now generates specific labels like area/im, area/vc, area/ccm, etc.

- Update samples.json expected_areas to match new behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): address PR review feedback for label scripts and workflows

- Add `edited` event to PR labels workflow to trigger on title changes
- Add security warning comment in pr-labels.yml workflow
- Update pr-labels README with latest business area labels
- Exclude `skills/lark-*` paths from low risk doc classification
- Handle renamed files properly in PR path classification
- Fix YAML frontmatter extraction to handle CRLF line endings
- Use precise regex for YAML key validation instead of substring match
- Fix exit code checking logic in skill-format-check test script
- Translate Chinese comments in skill-format-check to English

* fix(skill-format-check): address CodeRabbit review feedback

- Fix frontmatter closing delimiter detection to strictly match '---' using regex, preventing invalid closing tags like '----' from passing.
- Improve test fixture reliability by failing tests immediately if fixture preparation fails, avoiding false positives.

* fix: address review comments from PR 148

- ci: warn when PR label sync fails in job summary
- test(skill-format-check): capture validator output for negative tests
- fix(skill-format-check): catch errors when reading SKILL.md to avoid hard crashes

* fix: add error handling for directory enumeration in skill-format-check

- refactor: use `fs.readdirSync` with `{ withFileTypes: true }` to avoid extra stat calls
- fix: catch and report errors gracefully during skills directory enumeration instead of crashing

* docs(skill-format-check): clarify `metadata` requirement in README

test(pr-labels): add edge case samples for skills paths, CCM multi-paths, and renames

* test(pr-labels): add real PR edge case samples

- use PR #134 to test skill path behaviors
- use PR #57 to test multi-path CCM resolution
- use PR #11 to test track renames cross domains

* refactor(ci): migrate pr labels from area to domain prefix

- Replaced `area/` prefix with `domain/` for PR labeling to align with existing GitHub labels
- Renamed internal constants and variables from `area` to `domain` (e.g. `PATH_TO_AREA_MAP` to `PATH_TO_DOMAIN_MAP`)
- Updated `samples.json` test data to use new `domain/` format and `expected_domains` key
- Added `scripts/pr-labels/test.js` runner script for continuous validation of labeling logic against PR samples
- Corrected expected size label for PR #134 test sample

* test: use execFileSync instead of execSync in pr-labels test script

* fix: resolve target path against process.cwd() instead of __dirname in skill-format-check

* docs: correct label prefix in PR label workflow README

- Updated README.md to reflect the new `domain/` label prefix instead of `area/`

* fix(ci): fix dry-run console output formatting and enforce auth in tests

- Removed duplicate domain array interpolation in printDryRunResult
- Added process.env.GITHUB_TOKEN guard in test.js to prevent ambiguous failures from API rate limits

* fix(ci): ensure PR labels can be applied reliably

- Added `issues: write` permission to pr-labels workflow, which is strictly required by the GitHub REST API to modify labels on pull requests
- Reordered script execution in `index.js` to apply/remove labels on the PR *before* attempting to sync repository-level label definitions (colors/descriptions). The definition sync is now a trailing best-effort step with error catching so transient repo-level API failures don't abort the critical path.

* fix(ci): fix edge cases in pr-label index script

- Added missing `skills/lark-task/` to `PATH_TO_DOMAIN_MAP` to properly detect task domain modifications
- Updated GitHub REST API error checking in `syncLabelDefinition` to reliably match `error.status === 422` rather than loosely checking substring
- Moved token presence check in `main()` to happen before `resolveContext` to avoid triggering unauthenticated 401 API limits when GITHUB_TOKEN is omitted locally

* test(ci): clean up PR label test samples

- Removed duplicate PR entries (#11 and #57) to reduce redundant API calls during testing
- Renamed sample test cases to correctly reflect their expected labels (e.g. `size-l-skill-format-check` -> `size-m-skill-format-check`)

* fix(ci): bootstrap new labels before applying to PRs

- Prior changes correctly made full label sync best-effort, but broke the flow for brand new domains
- GitHub API returns a 422 error if you attempt to attach a label to an Issue/PR that does not exist in the repository
- Added a targeted bootstrap loop to create/sync specifically the labels in `toAdd` before attempting `client.addLabels()`
- Left the remaining global label synchronization as a best-effort trailing action

* test(ci): automate PR label regression testing

- Added a dedicated GitHub Actions workflow (`pr-labels-test.yml`) to automatically run `test.js` against `samples.json` whenever the labeling logic is updated
- Documented local testing instructions in `scripts/pr-labels/README.md`

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
williamfzc
2026-04-01 17:45:39 +08:00
committed by GitHub
parent 17698d5c6a
commit 5621d2e555
16 changed files with 1381 additions and 0 deletions

View File

@@ -0,0 +1,58 @@
# PR Label Sync
This directory contains scripts and sample data for automatically classifying and labeling GitHub Pull Requests based on the files they modify.
## Files
- `index.js`: The main Node.js script. It fetches PR files, evaluates their risk level, calculates business impact, and uses GitHub APIs to add appropriate `size/*` and `domain/*` labels.
- `samples.json`: A collection of historical PRs used as test cases to verify the labeling logic (especially for regression testing the S/M/L thresholds).
## Features
### Size Labels (`size/*`)
The script evaluates the "effective" lines of code changed (ignoring tests, docs, and ci files) to classify the PR:
- **`size/S`**: Low-risk changes involving only docs, tests, CI workflows, or chores.
- **`size/M`**: Small-to-medium changes affecting a single business domain, with effective lines under 300.
- **`size/L`**: Large features (>= 300 lines), cross-domain changes, or any changes touching core architecture paths (like `cmd/`).
- **`size/XL`**: Architectural overhauls, extremely large PRs (>1200 lines), or sensitive refactors.
### Domain Tags (`domain/*`)
The script also identifies which business domains a PR touches to give reviewers an immediate sense of the impact scope. Currently tracked domains include:
- `domain/im`
- `domain/vc`
- `domain/ccm`
- `domain/base`
- `domain/mail`
- `domain/calendar`
- `domain/task`
- `domain/contact`
Minor modules like docs and tests are omitted to keep PR tags clean and focused on structural changes.
## Usage
### In GitHub Actions
This script is designed to run in CI workflows. It automatically reads the `GITHUB_EVENT_PATH` payload to get the PR context.
```bash
node scripts/pr-labels/index.js
```
### Local Dry Run
You can test the labeling logic against an existing GitHub PR without actually applying labels by using the `--dry-run` flag.
```bash
# Requires GITHUB_TOKEN environment variable or passing --token
node scripts/pr-labels/index.js --dry-run --repo larksuite/cli --pr-number 123
```
## Testing
A regression test suite is available in `test.js` which verifies the output of the classification logic against historical PRs configured in `samples.json`.
```bash
# Requires GITHUB_TOKEN environment variable to avoid rate limits
GITHUB_TOKEN=$(gh auth token) node scripts/pr-labels/test.js
```
This test suite also runs automatically in CI via `.github/workflows/pr-labels-test.yml` when changes are made to this directory.

747
scripts/pr-labels/index.js Executable file
View File

@@ -0,0 +1,747 @@
#!/usr/bin/env node
// Copyright (c) 2026 Lark Technologies Pte. Ltd.
// SPDX-License-Identifier: MIT
const fs = require("node:fs/promises");
const path = require("node:path");
// ============================================================================
// Constants & Configuration
// ============================================================================
const API_BASE = "https://api.github.com";
const SCRIPT_DIR = __dirname;
const ROOT = path.join(SCRIPT_DIR, "..", "..");
const THRESHOLD_L = 300;
const THRESHOLD_XL = 1200;
const LABEL_DEFINITIONS = {
"size/S": { color: "77bb00", description: "Low-risk docs, CI, test, or chore only changes" },
"size/M": { color: "eebb00", description: "Single-domain feat or fix with limited business impact" },
"size/L": { color: "ff8800", description: "Large or sensitive change across domains or core paths" },
"size/XL": { color: "ee0000", description: "Architecture-level or global-impact change" },
};
const MANAGED_LABELS = new Set(Object.keys(LABEL_DEFINITIONS));
// File path matching configurations
const DOC_SUFFIXES = [".md", ".mdx", ".txt", ".rst"];
const LOW_RISK_PREFIXES = [".github/", "docs/", ".changeset/", "testdata/", "tests/", "skill-template/"];
const LOW_RISK_FILENAMES = new Set(["readme.md", "readme.zh.md", "changelog.md", "license", "cla.md"]);
const LOW_RISK_TEST_SUFFIXES = ["_test.go", ".snap"];
const CORE_PREFIXES = ["internal/auth/", "internal/engine/", "internal/config/", "cmd/"];
const HEAD_BUSINESS_DOMAINS = new Set(["im", "contact", "ccm", "base", "docx"]);
const LOW_RISK_TYPES = new Set(["docs", "ci", "test", "chore"]);
// CODEOWNERS-based path to domain label mapping
// Maps shortcuts and skills paths to business domain labels
const PATH_TO_DOMAIN_MAP = {
// shortcuts
"shortcuts/im/": "im",
"shortcuts/vc/": "vc",
"shortcuts/calendar/": "calendar",
"shortcuts/doc/": "ccm",
"shortcuts/sheets/": "ccm",
"shortcuts/drive/": "ccm",
"shortcuts/base/": "base",
"shortcuts/mail/": "mail",
"shortcuts/task/": "task",
"shortcuts/contact/": "contact",
// skills
"skills/lark-im/": "im",
"skills/lark-vc/": "vc",
"skills/lark-doc/": "ccm",
"skills/lark-base/": "base",
"skills/lark-mail/": "mail",
"skills/lark-calendar/": "calendar",
"skills/lark-task/": "task",
"skills/lark-contact/": "contact",
};
const SENSITIVE_PATTERN = /(^|\/)(auth|permission|permissions|security)(\/|_|\.|$)/;
const CLASS_STANDARDS = {
"size/S": {
channel: "Fast track (S)",
gates: [
"Code quality: AI code review passed",
"Dependency and configuration security checks passed",
],
},
"size/M": {
channel: "Fast track (M)",
gates: [
"Code quality: AI code review passed",
"Dependency and configuration security checks passed",
"Skill format validation: added or modified Skills load successfully",
"CLI automation tests: all required business-line tests passed",
],
},
"size/L": {
channel: "Standard track (L)",
gates: [
"Code quality: AI code review passed",
"Dependency and configuration security checks passed",
"Skill format validation: added or modified Skills load successfully",
"CLI automation tests: all required business-line tests passed",
"Domain evaluation passed: reported success rate is greater than 95%",
],
},
"size/XL": {
channel: "Strict track (XL)",
gates: [
"Code quality: AI code review passed",
"Dependency and configuration security checks passed",
"Skill format validation: added or modified Skills load successfully",
"CLI automation tests: all required business-line tests passed",
"Domain evaluation passed: reported success rate is greater than 95%",
"Cross-domain release gate: all domains and full integration evaluations passed",
],
},
};
// ============================================================================
// Utilities
// ============================================================================
function log(message) {
console.error(`sync-pr-labels: ${message}`);
}
function normalizePath(input) {
return String(input || "").trim().toLowerCase();
}
function envValue(name) {
return (process.env[name] || "").trim();
}
function envOrFail(name) {
const value = envValue(name);
if (!value) {
throw new Error(`missing required environment variable: ${name}`);
}
return value;
}
// ============================================================================
// GitHub API Client
// ============================================================================
class GitHubClient {
constructor(token, repo, prNumber) {
this.token = token;
this.repo = repo;
this.prNumber = prNumber;
}
buildHeaders(hasBody = false) {
const headers = {
Accept: "application/vnd.github+json",
"X-GitHub-Api-Version": "2022-11-28",
};
if (this.token) {
headers.Authorization = `Bearer ${this.token}`;
}
if (hasBody) {
headers["Content-Type"] = "application/json";
}
return headers;
}
async request(endpoint, options = {}) {
const { method = "GET", payload, allow404 = false } = options;
const hasBody = payload !== undefined;
const url = endpoint.startsWith("http") ? endpoint : `${API_BASE}${endpoint}`;
const response = await fetch(url, {
method,
headers: this.buildHeaders(hasBody),
body: hasBody ? JSON.stringify(payload) : undefined,
});
if (allow404 && response.status === 404) {
return null;
}
if (!response.ok) {
const detail = await response.text();
const error = new Error(`GitHub API ${method} ${url} failed: ${response.status} ${detail}`);
error.status = response.status;
throw error;
}
const text = await response.text();
return text ? JSON.parse(text) : null;
}
async getPullRequest() {
return this.request(`/repos/${this.repo}/pulls/${this.prNumber}`);
}
async listPrFiles() {
const files = [];
for (let page = 1; ; page += 1) {
const params = new URLSearchParams({ per_page: "100", page: String(page) });
const batch = await this.request(`/repos/${this.repo}/pulls/${this.prNumber}/files?${params}`);
if (!batch || batch.length === 0) {
break;
}
files.push(...batch);
if (batch.length < 100) {
break;
}
}
return files;
}
async listIssueLabels() {
const labels = await this.request(`/repos/${this.repo}/issues/${this.prNumber}/labels`);
return new Set(labels.map((item) => item.name));
}
async syncLabelDefinition(name) {
const label = LABEL_DEFINITIONS[name];
const createUrl = `/repos/${this.repo}/labels`;
const updateUrl = `/repos/${this.repo}/labels/${encodeURIComponent(name)}`;
try {
await this.request(createUrl, {
method: "POST",
payload: { name, color: label.color, description: label.description },
});
log(`created label ${name}`);
} catch (error) {
if (error.status !== 422) {
throw error;
}
await this.request(updateUrl, {
method: "PATCH",
payload: { new_name: name, color: label.color, description: label.description },
});
log(`updated label ${name}`);
}
}
async addLabels(labels) {
if (labels.length === 0) return;
await this.request(`/repos/${this.repo}/issues/${this.prNumber}/labels`, {
method: "POST",
payload: { labels },
});
log(`added labels: ${labels.join(", ")}`);
}
async removeLabel(name) {
await this.request(`/repos/${this.repo}/issues/${this.prNumber}/labels/${encodeURIComponent(name)}`, {
method: "DELETE",
allow404: true,
});
log(`removed label: ${name}`);
}
}
// ============================================================================
// Path & Domain Heuristics
// ============================================================================
function parsePrType(title) {
const match = String(title || "").trim().match(/^([a-z]+)(?:\([^)]+\))?!?:/i);
return match ? match[1].toLowerCase() : "";
}
function isLowRiskPath(filePath) {
const normalized = normalizePath(filePath);
const basename = path.posix.basename(normalized);
if (normalized.startsWith("skills/lark-")) return false;
if (DOC_SUFFIXES.some((suffix) => normalized.endsWith(suffix))) return true;
if (LOW_RISK_FILENAMES.has(basename)) return true;
if (LOW_RISK_PREFIXES.some((prefix) => normalized.startsWith(prefix))) return true;
if (LOW_RISK_TEST_SUFFIXES.some((suffix) => normalized.endsWith(suffix))) return true;
return normalized.includes("/testdata/");
}
function isBusinessSkillPath(filePath) {
const normalized = normalizePath(filePath);
return normalized.startsWith("shortcuts/") || normalized.startsWith("skills/lark-");
}
function shortcutDomainForPath(filePath) {
const parts = normalizePath(filePath).split("/");
return parts.length >= 2 && parts[0] === "shortcuts" ? parts[1] : "";
}
function skillDomainForPath(filePath) {
const parts = normalizePath(filePath).split("/");
return parts.length >= 2 && parts[0] === "skills" && parts[1].startsWith("lark-")
? parts[1].slice("lark-".length)
: "";
}
// Get business domain label based on CODEOWNERS path mapping
function getBusinessDomain(filePath) {
const normalized = normalizePath(filePath);
for (const [prefix, domain] of Object.entries(PATH_TO_DOMAIN_MAP)) {
if (normalized.startsWith(prefix)) {
return domain;
}
}
return "";
}
async function detectNewShortcutDomain(files) {
for (const item of files) {
if (item.status !== "added") continue;
const domain = shortcutDomainForPath(item.filename);
if (!domain) continue;
try {
await fs.access(path.join(ROOT, "shortcuts", domain));
} catch {
return domain;
}
}
return "";
}
function collectCoreAreas(filenames) {
const areas = new Set();
for (const name of filenames) {
const normalized = normalizePath(name);
for (const prefix of CORE_PREFIXES) {
if (normalized.startsWith(prefix)) {
// remove trailing slash for area name
areas.add(prefix.slice(0, -1));
}
}
}
return areas;
}
function collectSensitiveKeywords(filenames) {
const hits = new Set();
for (const name of filenames) {
const match = normalizePath(name).match(SENSITIVE_PATTERN);
if (match && match[2]) {
hits.add(match[2]);
}
}
return [...hits].sort();
}
// ============================================================================
// Classification Logic
// ============================================================================
function evaluateRules(context) {
const {
prType, effectiveChanges, lowRiskOnly,
domains, headDomains, coreAreas, coreSignals,
sensitiveKeywords, sensitive, newShortcutDomain,
singleDomain, multiDomain, filenames
} = context;
const reasons = [];
let label;
if (lowRiskOnly && (LOW_RISK_TYPES.has(prType) || effectiveChanges === 0)) {
reasons.push("Only low-risk docs, CI, test, or chore paths were changed, with no effective business code or Skill changes");
label = "size/S";
return { label, reasons };
}
// XL is reserved for architecture-level or global-impact changes.
const isXL =
effectiveChanges > THRESHOLD_XL ||
(prType === "refactor" && sensitive && effectiveChanges >= THRESHOLD_L) ||
(coreAreas.size >= 2 && (multiDomain || effectiveChanges >= THRESHOLD_L)) ||
(headDomains.length >= 2 && sensitive);
if (isXL) {
if (effectiveChanges > THRESHOLD_XL) reasons.push("Effective business code or Skill changes are far beyond the L threshold");
if (prType === "refactor" && sensitive && effectiveChanges >= THRESHOLD_L) reasons.push("Refactor PR touches core or sensitive paths");
if (coreAreas.size >= 2) reasons.push("Touches multiple core areas at the same time");
if (headDomains.length >= 2) reasons.push("Impacts multiple major business domains");
coreSignals.forEach((signal) => reasons.push(`Core area hit: ${signal}`));
sensitiveKeywords.forEach((keyword) => reasons.push(`Sensitive keyword hit: ${keyword}`));
label = "size/XL";
} else if (
prType === "refactor" ||
effectiveChanges >= THRESHOLD_L ||
Boolean(newShortcutDomain) ||
multiDomain ||
sensitive
) {
if (prType === "refactor") reasons.push("PR type is refactor");
if (effectiveChanges >= THRESHOLD_L) reasons.push(`Effective business code or Skill changes exceed ${THRESHOLD_L} lines`);
if (newShortcutDomain) reasons.push(`Introduces a new business domain directory: shortcuts/${newShortcutDomain}/`);
if (multiDomain) reasons.push("Touches multiple business domains");
coreSignals.forEach((signal) => reasons.push(`Core area hit: ${signal}`));
sensitiveKeywords.forEach((keyword) => reasons.push(`Sensitive keyword hit: ${keyword}`));
label = "size/L";
} else {
if (filenames.some(isBusinessSkillPath) || effectiveChanges > 0) {
reasons.push("Regular feat, fix, or Skill change within a single business domain");
}
if (singleDomain && domains.size > 0) {
reasons.push(`Impact is limited to a single business domain: ${[...domains].sort().join(", ")}`);
}
if (effectiveChanges < THRESHOLD_L) {
reasons.push(`Effective business code or Skill changes are below ${THRESHOLD_L} lines`);
}
label = "size/M";
}
return { label, reasons };
}
async function classifyPr(payload, files) {
const pr = payload.pull_request;
const title = pr.title || "";
const prType = parsePrType(title);
const filenames = files.map((item) => item.filename || "");
const impactedPaths = files.flatMap((item) => {
const paths = [item.filename || ""];
if (item.status === "renamed" && item.previous_filename) {
paths.push(item.previous_filename);
}
return paths.filter(Boolean);
});
// Filter out docs, tests, and other low-risk paths so the size label tracks business impact.
const effectiveChanges = files.reduce(
(sum, item) => sum + (isLowRiskPath(item.filename) ? 0 : (item.changes || 0)),
0,
);
const totalChanges = files.reduce((sum, item) => sum + (item.changes || 0), 0);
const domains = new Set();
const businessDomains = new Set();
for (const name of impactedPaths) {
const businessDomain = getBusinessDomain(name);
if (businessDomain) {
businessDomains.add(businessDomain);
domains.add(businessDomain);
continue;
}
const shortcutDomain = shortcutDomainForPath(name);
if (shortcutDomain) domains.add(shortcutDomain);
const skillDomain = skillDomainForPath(name);
if (skillDomain) domains.add(skillDomain);
}
const coreAreas = collectCoreAreas(impactedPaths);
const newShortcutDomain = await detectNewShortcutDomain(files);
const lowRiskOnly = impactedPaths.length > 0 && impactedPaths.every(isLowRiskPath);
const singleDomain = domains.size <= 1;
const multiDomain = domains.size >= 2;
const headDomains = [...domains].filter((domain) => HEAD_BUSINESS_DOMAINS.has(domain));
const coreSignals = [...coreAreas].sort();
const sensitiveKeywords = collectSensitiveKeywords(impactedPaths);
const sensitive = coreSignals.length > 0 || sensitiveKeywords.length > 0;
const context = {
prType, effectiveChanges, lowRiskOnly,
domains, headDomains, coreAreas, coreSignals,
sensitiveKeywords, sensitive, newShortcutDomain,
singleDomain, multiDomain, filenames: impactedPaths
};
const { label, reasons } = evaluateRules(context);
return {
label,
title,
prType: prType || "unknown",
totalChanges,
effectiveChanges,
domains: [...domains].sort(),
businessDomains: [...businessDomains].sort(),
coreAreas: [...coreAreas].sort(),
coreSignals,
sensitiveKeywords,
newShortcutDomain,
reasons,
lowRiskOnly,
filenames,
};
}
// ============================================================================
// Output & Formatting
// ============================================================================
async function writeStepSummary(prNumber, classification) {
const summaryPath = (process.env.GITHUB_STEP_SUMMARY || "").trim();
if (!summaryPath) return;
const standard = CLASS_STANDARDS[classification.label];
const domains = classification.domains.join(", ") || "-";
const bDomains = classification.businessDomains.join(", ") || "-";
const coreAreas = classification.coreAreas.join(", ") || "-";
const reasons = classification.reasons.length > 0
? classification.reasons
: ["No higher-severity rule matched, so the PR defaults to medium classification"];
const lines = [
"## PR Size Classification",
"",
`- PR: #${prNumber}`,
`- Label: \`${classification.label}\``,
`- PR Type: \`${classification.prType}\``,
`- Total Changes: \`${classification.totalChanges}\``,
`- Effective Business/SKILL Changes: \`${classification.effectiveChanges}\``,
`- Business Domains: \`${domains}\``,
`- Impacted Domains: \`${bDomains}\``,
`- Core Areas: \`${coreAreas}\``,
`- CI/CD Channel: \`${standard.channel}\``,
`- Low Risk Only: \`${classification.lowRiskOnly}\``,
"",
"### Reasons",
"",
...reasons.map((reason) => `- ${reason}`),
"",
"### Pipeline Gates",
"",
...standard.gates.map((gate) => `- ${gate}`),
"",
];
await fs.appendFile(summaryPath, `${lines.join("\n")}\n`, "utf8");
}
function formatDryRunResult(repo, prNumber, classification) {
const standard = CLASS_STANDARDS[classification.label];
return {
repo,
prNumber,
label: classification.label,
prType: classification.prType,
totalChanges: classification.totalChanges,
effectiveChanges: classification.effectiveChanges,
lowRiskOnly: classification.lowRiskOnly,
domains: classification.domains,
businessDomains: classification.businessDomains,
coreAreas: classification.coreAreas,
coreSignals: classification.coreSignals,
sensitiveKeywords: classification.sensitiveKeywords,
reasons: classification.reasons,
channel: standard.channel,
gates: standard.gates,
};
}
function printDryRunResult(result, options) {
if (options.json) {
console.log(JSON.stringify(result, null, 2));
return;
}
const signalParts = [
...result.coreSignals.map((signal) => `core:${signal}`),
...result.sensitiveKeywords.map((keyword) => `keyword:${keyword}`),
...(result.domains.length > 0 ? [`domains:${result.domains.join(",")}`] : []),
];
const reasonParts = result.reasons.length > 0
? result.reasons
: ["No higher-severity rule matched, so the PR defaults to medium classification"];
console.log(
`${result.label} | #${result.prNumber} | type:${result.prType} | eff:${result.effectiveChanges} | `
+ `sig:${signalParts.join(";") || "-"} | reason:${reasonParts.join("; ")}`,
);
}
function printHelp() {
const lines = [
"Usage:",
" node scripts/pr-labels/index.js",
" node scripts/pr-labels/index.js --dry-run --pr-url <github-pr-url> [--token <token>] [--json]",
" node scripts/pr-labels/index.js --dry-run --repo <owner/name> --pr-number <number> [--token <token>] [--json]",
"",
"Modes:",
" default Read the GitHub Actions event payload and apply labels",
" --dry-run Fetch the PR, compute the managed label, and print the result without writing labels",
"",
"Options:",
" --pr-url <url> GitHub pull request URL, for example https://github.com/larksuite/cli/pull/123",
" --repo <owner/name> Repository name, used with --pr-number",
" --pr-number <n> Pull request number, used with --repo",
" --token <token> GitHub token override; falls back to GITHUB_TOKEN",
" --json Print dry-run output as JSON instead of the default one-line summary",
" --help Show this message",
];
console.log(lines.join("\n"));
}
function parseArgs(argv) {
const options = {
dryRun: false,
json: false,
help: false,
prUrl: "",
repo: "",
prNumber: "",
token: "",
};
for (let i = 0; i < argv.length; i += 1) {
const arg = argv[i];
if (arg === "--dry-run") options.dryRun = true;
else if (arg === "--json") options.json = true;
else if (arg === "--help" || arg === "-h") options.help = true;
else if (arg === "--pr-url") options.prUrl = argv[++i] || "";
else if (arg === "--repo") options.repo = argv[++i] || "";
else if (arg === "--pr-number") options.prNumber = argv[++i] || "";
else if (arg === "--token") options.token = argv[++i] || "";
else throw new Error(`unknown argument: ${arg}`);
}
return options;
}
function parsePrUrl(prUrl) {
let parsed;
try {
parsed = new URL(prUrl);
} catch {
throw new Error(`invalid PR URL: ${prUrl}`);
}
const match = parsed.pathname.match(/^\/([^/]+)\/([^/]+)\/pull\/(\d+)\/?$/);
if (!match) throw new Error(`unsupported PR URL format: ${prUrl}`);
return { repo: `${match[1]}/${match[2]}`, prNumber: Number(match[3]) };
}
async function loadEventPayload(filePath) {
return JSON.parse(await fs.readFile(filePath, "utf8"));
}
async function resolveContext(options) {
const token = options.token;
if (options.prUrl) {
const { repo, prNumber } = parsePrUrl(options.prUrl);
const client = new GitHubClient(token, repo, prNumber);
const payload = {
repository: { full_name: repo },
pull_request: await client.getPullRequest(),
};
return { repo, prNumber, payload, client };
}
if (options.repo || options.prNumber) {
if (!options.repo || !options.prNumber) throw new Error("--repo and --pr-number must be provided together");
const prNumber = Number(options.prNumber);
if (!Number.isInteger(prNumber) || prNumber <= 0) throw new Error(`invalid PR number: ${options.prNumber}`);
const client = new GitHubClient(token, options.repo, prNumber);
const payload = {
repository: { full_name: options.repo },
pull_request: await client.getPullRequest(),
};
return { repo: options.repo, prNumber, payload, client };
}
const eventPath = envOrFail("GITHUB_EVENT_PATH");
const payload = await loadEventPayload(eventPath);
const repo = payload.repository.full_name;
const prNumber = payload.pull_request.number;
const client = new GitHubClient(token, repo, prNumber);
return { repo, prNumber, payload, client };
}
// ============================================================================
// Main Execution
// ============================================================================
async function main() {
const options = parseArgs(process.argv.slice(2));
if (options.help) {
printHelp();
return;
}
options.token = options.token || envValue("GITHUB_TOKEN");
if (!options.dryRun && !options.token) {
throw new Error("missing required GitHub token; set GITHUB_TOKEN or pass --token");
}
const { repo, prNumber, payload, client } = await resolveContext(options);
const files = await client.listPrFiles();
const classification = await classifyPr(payload, files);
if (options.dryRun) {
printDryRunResult(formatDryRunResult(repo, prNumber, classification), options);
return;
}
const desired = new Set([classification.label]);
for (const domain of classification.businessDomains) {
desired.add(`domain/${domain}`);
}
const current = await client.listIssueLabels();
const managedCurrent = [...current].filter((label) => MANAGED_LABELS.has(label) || label.startsWith("domain/"));
const toAdd = [...desired].filter((label) => !current.has(label)).sort();
const toRemove = managedCurrent.filter((label) => !desired.has(label)).sort();
for (const domain of classification.businessDomains) {
const labelName = `domain/${domain}`;
if (!LABEL_DEFINITIONS[labelName]) {
LABEL_DEFINITIONS[labelName] = { color: "1d76db", description: `PR touches the ${domain} domain` };
}
}
// Ensure labels to be added actually exist in the repository first
// If the label doesn't exist, GitHub API will return 422 Unprocessable Entity when trying to add it to a PR.
for (const label of toAdd) {
if (LABEL_DEFINITIONS[label]) {
try {
await client.syncLabelDefinition(label);
} catch (e) {
log(`Warning: Failed to bootstrap new label ${label}: ${e.message}`);
}
}
}
await client.addLabels(toAdd);
for (const label of toRemove) {
await client.removeLabel(label);
}
// Keep other label metadata consistent. This is best-effort trailing work.
for (const label of Object.keys(LABEL_DEFINITIONS)) {
if (toAdd.includes(label)) continue; // Already synced above
try {
await client.syncLabelDefinition(label);
} catch (e) {
log(`Warning: Failed to sync label definition for ${label}: ${e.message}`);
}
}
await writeStepSummary(prNumber, classification);
log(
`pr #${prNumber} type=${classification.prType} total_changes=${classification.totalChanges} `
+ `effective_changes=${classification.effectiveChanges} files=${files.length} `
+ `desired=${[...desired].sort().join(",") || "-"} current_managed=${managedCurrent.sort().join(",") || "-"} `
+ `reasons=${classification.reasons.join(" | ") || "-"}`,
);
}
main().catch((error) => {
log(error.message || String(error));
process.exit(1);
});

View File

@@ -0,0 +1,145 @@
[
{
"name": "size-s-docs-badge",
"number": 103,
"title": "docs: add official badge to distinguish from third-party Lark CLI tools",
"pr_url": "https://github.com/larksuite/cli/pull/103",
"status": "merged",
"merged_at": "2026-03-30T12:15:45Z",
"expected_label": "size/S",
"expected_domains": [],
"review_note": "Pure docs sample. Useful to confirm low-risk paths stay in S even when total changed lines are not tiny."
},
{
"name": "size-s-docs-simplify",
"number": 26,
"title": "docs: simplify installation steps by merging CLI and Skills into one …",
"pr_url": "https://github.com/larksuite/cli/pull/26",
"status": "merged",
"merged_at": "2026-03-28T09:33:24Z",
"expected_label": "size/S",
"expected_domains": [],
"review_note": "Docs sample, verifying docs changes remain in S."
},
{
"name": "size-s-docs-star-history",
"number": 12,
"title": "docs: add Star History chart to readmes",
"pr_url": "https://github.com/larksuite/cli/pull/12",
"status": "merged",
"merged_at": "2026-03-28T16:00:15Z",
"expected_label": "size/S",
"expected_domains": [],
"review_note": "Docs sample, no effective business code changes."
},
{
"name": "size-s-docs-clarify-install",
"number": 3,
"title": "docs: clarify install methods and add source build steps",
"pr_url": "https://github.com/larksuite/cli/pull/3",
"status": "merged",
"merged_at": "2026-03-28T03:43:44Z",
"expected_label": "size/S",
"expected_domains": [],
"review_note": "Docs sample, pure documentation clarification."
},
{
"name": "size-m-fix-base-scope",
"number": 96,
"title": "fix(base): correct scope for record history list shortcut",
"pr_url": "https://github.com/larksuite/cli/pull/96",
"status": "merged",
"merged_at": "2026-03-30T11:40:18Z",
"expected_label": "size/M",
"expected_domains": ["domain/base"],
"review_note": "Small fix sample. Verify the lower edge of the M bucket within a single domain."
},
{
"name": "size-m-fix-mail-sensitive",
"number": 92,
"title": "fix: remove sensitive send scope from reply and forward shortcuts",
"pr_url": "https://github.com/larksuite/cli/pull/92",
"status": "merged",
"merged_at": "2026-03-30T10:19:11Z",
"expected_label": "size/M",
"expected_domains": ["domain/mail"],
"review_note": "Security-like wording in the title but stays in one business domain (mail)."
},
{
"name": "size-m-ci-improve",
"number": 71,
"title": "ci: improve CI workflows and add golangci-lint config",
"pr_url": "https://github.com/larksuite/cli/pull/71",
"status": "merged",
"merged_at": "2026-03-30T03:09:31Z",
"expected_label": "size/M",
"expected_domains": [],
"review_note": "CI workflow change that goes beyond S threshold."
},
{
"name": "size-m-feat-im-pagination",
"number": 30,
"title": "feat: add auto-pagination to messages search and update lark-im docs",
"pr_url": "https://github.com/larksuite/cli/pull/30",
"status": "merged",
"merged_at": "2026-03-30T15:00:41Z",
"expected_label": "size/M",
"expected_domains": ["domain/im"],
"review_note": "Single-domain feature with larger diff but effective changes stay in M."
},
{
"name": "size-l-fix-api-silent",
"number": 85,
"title": "fix: resolve silent failure in `lark-cli api` error output (#39)",
"pr_url": "https://github.com/larksuite/cli/pull/85",
"status": "merged",
"merged_at": "2026-03-30T09:19:24Z",
"expected_label": "size/L",
"expected_domains": [],
"review_note": "Touches core area (cmd), bumping the size to L."
},
{
"name": "size-l-fix-cli",
"number": 91,
"title": "fix: correct CLI examples in root help and READMEs (closes #48)",
"pr_url": "https://github.com/larksuite/cli/pull/91",
"status": "closed",
"merged_at": null,
"expected_label": "size/L",
"expected_domains": [],
"review_note": "Closed PR touching core area (cmd)."
},
{
"name": "size-m-skill-format-check",
"number": 134,
"title": "feat(ci): add skill format check workflow to ensure SKILL.md compliance",
"pr_url": "https://github.com/larksuite/cli/pull/134",
"status": "closed",
"merged_at": null,
"expected_label": "size/M",
"expected_domains": [],
"review_note": "Includes updates to tests/bad-skill/SKILL.md inside skills-like paths, testing how skill mock files and test scripts are handled."
},
{
"name": "size-l-ccm-multi-path",
"number": 57,
"title": "feat(docs): support local image upload in docs +create",
"pr_url": "https://github.com/larksuite/cli/pull/57",
"status": "closed",
"merged_at": null,
"expected_label": "size/L",
"expected_domains": ["domain/ccm"],
"review_note": "Touches docs_create_images.go and table_auto_width.go, representing multiple CCM sub-paths but resolving to a single ccm domain."
},
{
"name": "size-l-domain-rename",
"number": 11,
"title": "docs: rename user-facing Bitable references to Base",
"pr_url": "https://github.com/larksuite/cli/pull/11",
"status": "merged",
"merged_at": "2026-03-28T16:00:52Z",
"expected_label": "size/L",
"expected_domains": ["domain/base", "domain/ccm"],
"review_note": "A rename across paths. Since we track previous_filename to evaluate domains, this should properly capture the base domain."
}
]

52
scripts/pr-labels/test.js Normal file
View File

@@ -0,0 +1,52 @@
const fs = require('fs');
const { execFileSync } = require('child_process');
const path = require('path');
const samplesPath = path.join(__dirname, 'samples.json');
const indexPath = path.join(__dirname, 'index.js');
const samples = JSON.parse(fs.readFileSync(samplesPath, 'utf8'));
if (!process.env.GITHUB_TOKEN) {
console.error("❌ Error: GITHUB_TOKEN environment variable is required to run tests without hitting API rate limits.");
console.error("Please run: GITHUB_TOKEN=$(gh auth token) node scripts/pr-labels/test.js");
process.exit(1);
}
let passed = 0;
let failed = 0;
for (const sample of samples) {
try {
const output = execFileSync(
process.execPath,
[indexPath, '--dry-run', '--json', '--pr-url', sample.pr_url],
{ encoding: 'utf8', env: process.env }
);
const result = JSON.parse(output);
const matchLabel = result.label === sample.expected_label;
// Sort before comparing to ignore order
const actualDomains = (result.businessDomains || []).sort();
const expectedDomains = (sample.expected_domains || []).map(d => d.replace('domain/', '')).sort();
const matchDomains = JSON.stringify(actualDomains) === JSON.stringify(expectedDomains);
if (matchLabel && matchDomains) {
console.log(`✅ Passed: ${sample.name}`);
passed++;
} else {
console.log(`❌ Failed: ${sample.name}`);
console.log(` Label expected: ${sample.expected_label}, got: ${result.label}`);
console.log(` Domains expected: ${expectedDomains}, got: ${actualDomains}`);
failed++;
}
} catch (e) {
console.log(`❌ Failed: ${sample.name} (Execution error)`);
console.error(e.message);
failed++;
}
}
console.log(`\nTest Summary: ${passed} passed, ${failed} failed`);
if (failed > 0) process.exit(1);

View File

@@ -0,0 +1,36 @@
# Skill Format Check
This directory contains a script to validate the format of `SKILL.md` files located in the `../../skills` directory.
## Purpose
The `index.js` script ensures that all `SKILL.md` files conform to the standard template defined in `skill-template/skill-template.md`. Specifically, it checks that the YAML frontmatter includes the following fields:
- `name` (required)
- `description` (required)
- `metadata` (outputs a warning if missing, does not fail the build)
> **Note:** The `lark-shared` skill is explicitly excluded from these format checks.
## Usage
This script is executed automatically via GitHub Actions (`.github/workflows/skill-format-check.yml`) on pull requests and pushes that modify the `skills/` directory.
To run the check manually from the root of the repository, execute:
```bash
node scripts/skill-format-check/index.js
```
You can also specify a custom target directory as the first argument:
```bash
node scripts/skill-format-check/index.js ./path/to/my/skills
```
## Testing
This tool comes with a quick validation script to ensure it correctly identifies good and bad skill formats. To run the tests, execute:
```bash
./scripts/skill-format-check/test.sh
```

View File

@@ -0,0 +1,96 @@
const fs = require('fs');
const path = require('path');
// Allow passing a target directory as the first argument.
// If provided, resolve against process.cwd() so it behaves as the user expects.
// If not provided, default to '../../skills' relative to this script's directory.
const targetDirArg = process.argv[2];
const SKILLS_DIR = targetDirArg
? path.resolve(process.cwd(), targetDirArg)
: path.resolve(__dirname, '../../skills');
function checkSkillFormat() {
console.log(`Checking skill format in ${SKILLS_DIR}...`);
if (!fs.existsSync(SKILLS_DIR)) {
console.error('Skills directory not found:', SKILLS_DIR);
process.exit(1);
}
let skills;
try {
skills = fs
.readdirSync(SKILLS_DIR, { withFileTypes: true })
.filter(entry => entry.isDirectory())
.map(entry => entry.name);
} catch (err) {
console.error(`Failed to enumerate skills directory: ${err.message}`);
process.exit(1);
}
let hasErrors = false;
skills.forEach(skill => {
// Skip lark-shared skill completely
if (skill === 'lark-shared') {
console.log(`⏭️ Skipping check for ${skill}`);
return;
}
const skillPath = path.join(SKILLS_DIR, skill);
const skillFile = path.join(skillPath, 'SKILL.md');
if (!fs.existsSync(skillFile)) {
console.error(`❌ [${skill}] Missing SKILL.md`);
hasErrors = true;
return;
}
let content;
try {
content = fs.readFileSync(skillFile, 'utf-8');
} catch (err) {
console.error(`❌ [${skill}] Failed to read SKILL.md: ${err.message}`);
hasErrors = true;
return;
}
// Normalize line endings to simplify parsing
const normalizedContent = content.replace(/\r\n/g, '\n');
// Check YAML Frontmatter
if (!normalizedContent.startsWith('---\n')) {
console.error(`❌ [${skill}] SKILL.md must start with YAML frontmatter (---)`);
hasErrors = true;
} else {
const frontmatterMatch = normalizedContent.match(/^---\n([\s\S]*?)\n---(?:\n|$)/);
if (!frontmatterMatch) {
console.error(`❌ [${skill}] SKILL.md has unclosed or invalid YAML frontmatter`);
hasErrors = true;
} else {
const frontmatter = frontmatterMatch[1];
if (!/^name:/m.test(frontmatter)) {
console.error(`❌ [${skill}] YAML frontmatter missing 'name'`);
hasErrors = true;
}
if (!/^description:/m.test(frontmatter)) {
console.error(`❌ [${skill}] YAML frontmatter missing 'description'`);
hasErrors = true;
}
if (!/^metadata:/m.test(frontmatter)) {
console.warn(`⚠️ [${skill}] YAML frontmatter missing 'metadata' (Warning only)`);
// hasErrors = true; // Downgrade to warning to not fail on existing skills
}
}
}
});
if (hasErrors) {
console.error('\n❌ Skill format check failed. Please fix the errors above.');
process.exit(1);
} else {
console.log('\n✅ Skill format check passed!');
}
}
checkSkillFormat();

View File

@@ -0,0 +1,82 @@
#!/bin/bash
# Get the directory of this script
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
INDEX_JS="$DIR/index.js"
TEMP_DIR="$DIR/tests/temp_test_dir"
echo "=== Running tests for skill-format-check ==="
echo "Index script: $INDEX_JS"
prepare_fixture() {
local test_name=$1
rm -rf "$TEMP_DIR"
mkdir -p "$TEMP_DIR"
if [ ! -d "$DIR/tests/$test_name" ]; then
echo "❌ Missing fixture directory: $DIR/tests/$test_name"
exit 1
fi
cp -r "$DIR/tests/$test_name" "$TEMP_DIR/" || {
echo "❌ Failed to copy fixture: $test_name"
exit 1
}
}
# Function to run a positive test
run_positive_test() {
local test_name=$1
echo -e "\n--- [Positive] $test_name ---"
prepare_fixture "$test_name"
node "$INDEX_JS" "$TEMP_DIR"
if [ $? -eq 0 ]; then
echo "✅ Passed! (Correctly validated $test_name)"
rm -rf "$TEMP_DIR"
return 0
else
echo "❌ Failed! Expected $test_name to pass but it failed."
rm -rf "$TEMP_DIR"
exit 1
fi
}
# Function to run a negative test
run_negative_test() {
local test_name=$1
echo -e "\n--- [Negative] $test_name ---"
prepare_fixture "$test_name"
# Capture output for diagnostics while still treating non-zero as expected
local log_file="$TEMP_DIR/.validator.log"
node "$INDEX_JS" "$TEMP_DIR" > "$log_file" 2>&1
local exit_code=$?
if [ $exit_code -ne 0 ]; then
echo "✅ Passed! (Correctly rejected $test_name)"
rm -rf "$TEMP_DIR"
return 0
else
echo "❌ Failed! Expected $test_name to fail but it passed."
if [ -s "$log_file" ]; then
echo "--- Validator output ---"
cat "$log_file"
fi
rm -rf "$TEMP_DIR"
exit 1
fi
}
# Run positive tests
run_positive_test "good-skill"
run_positive_test "good-skill-minimal"
run_positive_test "good-skill-complex"
# Run negative tests
run_negative_test "bad-skill"
run_negative_test "bad-skill-no-frontmatter"
run_negative_test "bad-skill-unclosed-frontmatter"
echo -e "\n🎉 All tests passed successfully!"

View File

@@ -0,0 +1,3 @@
# No Frontmatter Skill
This skill completely lacks a YAML frontmatter.

View File

@@ -0,0 +1,9 @@
---
name: bad-skill-unclosed
version: 1.0.0
description: "This skill has an unclosed frontmatter block."
metadata: {}
# Unclosed Frontmatter Skill
This frontmatter does not have a closing `---` block.

View File

@@ -0,0 +1,8 @@
---
version: 1.0.0
metadata: {}
---
# Bad Skill
This skill is missing required fields like name and description.

View File

@@ -0,0 +1,17 @@
---
name: good-skill-complex
version: 2.5.1-beta
description: >
A very complex description
that spans multiple lines
and contains weird chars: !@#$%^&*()
metadata:
requires:
bins: ["lark-cli", "node"]
cliHelp: "lark-cli something --help"
customField: "customValue"
---
# Complex Skill
This skill has a complex frontmatter block.

View File

@@ -0,0 +1,10 @@
---
name: good-skill-minimal
version: 0.1.0
description: Minimal valid description
metadata: {}
---
# Minimal Skill
This has the bare minimum required fields.

View File

@@ -0,0 +1,12 @@
---
name: good-skill
version: 1.0.0
description: "This is a properly formatted skill."
metadata:
requires:
bins: ["lark-cli"]
---
# Good Skill
This skill follows all the formatting rules.