Files
github-spec-kit/tests/integrations/test_extra_args.py
Huy Do f50839a928 feat(integrations): support SPECIFY_<KEY>_EXTRA_ARGS env var for agent subprocess flags (#2596)
* feat(integrations): support SPECIFY_<KEY>_EXTRA_ARGS env var for agent subprocess flags

Read a per-integration env var (SPECIFY_<KEY>_EXTRA_ARGS) inside
`SkillsIntegration.build_exec_args`, `MarkdownIntegration.build_exec_args`,
and `TomlIntegration.build_exec_args` and append the parsed flags to the
spawned agent's argv, gated per integration key.

Operators can now opt into extra CLI flags (e.g.
`SPECIFY_CLAUDE_EXTRA_ARGS=--dangerously-skip-permissions`) without
modifying any SKILL or workflow YAML. Useful in CI / non-interactive
contexts where the spawned `<agent> -p ...` would otherwise hang on
an internal permission or input prompt invisible to the parent
`specify workflow run` process.

Key normalization: `kiro-cli` → `SPECIFY_KIRO_CLI_EXTRA_ARGS` (hyphen
replaced with underscore, then uppercased).

Default (env var unset or whitespace-only) is byte-identical to
previous behaviour. Extra args are inserted between `-p prompt` and
the model / output-format flags so they cannot clobber canonical
Spec Kit args.

Implementation: a single helper `IntegrationBase._apply_extra_args_env_var`
encapsulates the env-var read + shlex parsing; each of the three
concrete `build_exec_args` implementations calls it.

Closes #2595

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(integrations): wire SPECIFY_<KEY>_EXTRA_ARGS into Codex/Devin/Opencode/Copilot

Four integrations override `build_exec_args` and were silently
ignoring the env-var hook introduced in the previous commit:

- CodexIntegration (`codex exec ...`)
- DevinIntegration (`devin -p ...`)
- OpencodeIntegration (`opencode run ...`)
- CopilotIntegration (`copilot -p ...`)

Each now calls `self._apply_extra_args_env_var(args)` between the
base argv and the canonical Spec Kit flags (matching the placement
in `MarkdownIntegration`, `TomlIntegration`, and `SkillsIntegration`),
so operator-injected flags cannot clobber `--model` / `--output-format`
/ `--json`.

Adds 4 parameterized override-integration tests locking the wiring
per agent. Also cleans up an inline `__import__("os").environ` in the
fixture to a top-of-file `import os`.

Drive-by typing fix: guard `self.registrar_config.get(...)` in
`CopilotIntegration.add_commands` against the `None` case, matching
the pattern already used in `base.py` for the same access.

Addresses Copilot review on #2596.

* fix(integrations): apply Opencode extra-args before prompt-derived --command

When the Opencode prompt starts with `/`, `build_exec_args` injects
`--command <X>` derived from the prompt. The previous placement of
`self._apply_extra_args_env_var(args)` appended operator-injected
args AFTER that `--command`, so a user setting
`SPECIFY_OPENCODE_EXTRA_ARGS="--command override"` could redirect the
command under typical last-wins repeated-flag CLI semantics.

Move the hook to immediately after `args = [self.key, "run"]`, before
the prompt-parsing block. The operator's `--command override` (if
any) now precedes the Spec Kit-derived `--command speckit`, so the
canonical choice wins.

Adds `test_opencode_extra_args_cannot_clobber_prompt_derived_command`
locking the ordering. Also corrects the module docstring to describe
the hook as living in `IntegrationBase` (not `SkillsIntegration`) and
to acknowledge that this file covers Codex/Devin/Opencode/Copilot in
addition to SkillsIntegration stubs.

Addresses Copilot review on #2596.

* fix(integrations): honour SPECIFY_COPILOT_EXTRA_ARGS in dispatch_command

`CopilotIntegration` is the only integration that overrides
`dispatch_command` — it builds `cli_args` inline rather than going
through `build_exec_args`. The previous commit wired
`_apply_extra_args_env_var` into `build_exec_args` but workflow
execution calls `dispatch_command`, so `SPECIFY_COPILOT_EXTRA_ARGS`
was silently ignored at runtime.

Add the hook in `dispatch_command` immediately after
`cli_args = ["copilot", "-p", prompt]`, mirroring the placement in
`build_exec_args` (between `-p prompt` and the canonical
`--agent` / `--yolo` / `--model` / `--output-format` flags).

`IntegrationBase.dispatch_command` already delegates to
`build_exec_args`, so Codex, Devin, and Opencode continue to honour
their respective env vars through inheritance — no further changes
needed for them.

Adds two end-to-end tests that monkeypatch `subprocess.run` and
assert the env-var args reach the executed argv:
- `test_copilot_dispatch_command_includes_extra_args` locks the
  bypass fix on the overridden path.
- `test_codex_dispatch_command_includes_extra_args` locks the
  inherited-base-dispatch path for the other three integrations.

Addresses Copilot review on #2596.

* refactor(integrations): rename env var to SPECIFY_INTEGRATION_<KEY>_EXTRA_ARGS

Per maintainer request: scope the operator-injected env var to the
integration subsystem by prepending `INTEGRATION_` to the key
segment, so it does not collide with other Spec Kit env-var
namespaces.

Renames everywhere it appears:
- Helper `IntegrationBase._apply_extra_args_env_var` env_name
  format and docstring (`base.py`).
- Inline comment in `CopilotIntegration.dispatch_command`.
- All `monkeypatch.setenv(...)` calls, docstrings, and the
  autouse fixture's scope filter in
  `tests/integrations/test_extra_args.py`.

No behaviour change beyond the variable name. Default (env var
unset) still byte-identical to before this PR.

Addresses review on #2596.

* fix(integrations): raise actionable error on malformed EXTRA_ARGS quoting

Wrap `shlex.split` in `_apply_extra_args_env_var` so an unmatched quote
in `SPECIFY_INTEGRATION_<KEY>_EXTRA_ARGS` surfaces a clear `ValueError`
naming the offending env var and showing the invalid value, instead of
crashing workflow dispatch with a bare shlex traceback. Adds a new test
locking the actionable error path.

Addresses Copilot review feedback on #2596.

* test(integrations): use `_copilot_executable()` in Copilot extra-args test

`test_copilot_integration_honours_extra_args` hardcoded `"copilot"`
in the expected argv, but `CopilotIntegration.build_exec_args` calls
`_copilot_executable()` which returns `"copilot.cmd"` on Windows
(`os.name == "nt"`). The test passed on Linux/macOS and failed on
all three Windows-latest matrix entries.

Resolve by importing `_copilot_executable` alongside `CopilotIntegration`
and using it as the first expected argv token. The companion
`test_copilot_dispatch_command_includes_extra_args` already uses
`index()` lookups rather than full-argv equality so it was unaffected.

* fix(integrations): couple Codex executable to self.key + cover base classes

Two Copilot findings on the latest pass:

1. `CodexIntegration.build_exec_args` hardcoded the executable name
   as the literal `"codex"` while the env-var lookup derives from
   `self.key`. The two should stay coupled (matching Devin/Opencode,
   which both use `self.key` already). Replace the literal with
   `self.key` so the argv and env-var scoping cannot drift.

2. `tests/integrations/test_extra_args.py` covered the
   `SkillsIntegration` mechanism (via stubs near the top) and the
   four `build_exec_args` overrides (Codex/Devin/Opencode/Copilot)
   end-to-end, but did not exercise the `MarkdownIntegration` or
   `TomlIntegration` base implementations directly. Add bare
   `_MarkdownAgentStub` and `_TomlAgentStub` test stubs and a test
   each — the most common integration pattern (Amp, Auggie, Generic,
   Gemini, Tabnine, …) inherits without overriding, so the base
   wiring is now locked.

Full suite: 3011 passed (was 3009), 40 skipped, no regressions.

Addresses Copilot review feedback on #2596.

* fix(integrations): rename env var to SPECKIT_INTEGRATION_<KEY>_EXTRA_ARGS

Renames the env-var hook prefix from `SPECIFY_INTEGRATION_*` to
`SPECKIT_INTEGRATION_*` to match the established codebase
convention for integration-subsystem env vars
(`SPECKIT_INTEGRATION_CATALOG_URL` in `integrations/catalog.py`,
`SPECKIT_COPILOT_ALLOW_ALL_TOOLS` in `integrations/copilot/__init__.py`).

The `SPECIFY_*` prefix is reserved for user-facing
feature-resolution variables (`SPECIFY_FEATURE`,
`SPECIFY_FEATURE_DIRECTORY`); reusing it for integration-subsystem
scoping would introduce a second integration namespace under a
different prefix, confusing operators who already set
`SPECKIT_INTEGRATION_CATALOG_URL`.

Also reverts the unrelated defensive `arg_placeholder` /
`registrar_config is None` guard in
`CopilotIntegration.setup_skills_mode` — it was a drive-by pyright
cleanup mixed into this PR. Every concrete integration sets
`registrar_config` so the guard never fires in practice; the
typing issue belongs in a focused follow-up rather than this
env-var-hook PR.

Updates everywhere the old prefix appeared:
- `IntegrationBase._apply_extra_args_env_var` helper + docstring
- `CopilotIntegration.dispatch_command` inline comment
- All `monkeypatch.setenv(...)` calls in `tests/integrations/test_extra_args.py`
- The autouse fixture scope filter
- Test module docstring

Full suite: 3011 passed, 40 skipped, no regressions.

Addresses Copilot review feedback on #2596.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 14:00:15 -05:00

481 lines
15 KiB
Python

"""Tests for the per-integration `SPECKIT_INTEGRATION_<KEY>_EXTRA_ARGS` env-var hook.
The hook is implemented in `IntegrationBase._apply_extra_args_env_var`
and wired into every concrete `build_exec_args` —
`MarkdownIntegration`, `TomlIntegration`, `SkillsIntegration`, plus the
overrides in Codex, Devin, Opencode and Copilot. These tests cover both
the shared mechanism (via `SkillsIntegration` stubs near the top of the
file) and each override integration end-to-end (further down). See
issue #2595."""
import os
import pytest
from specify_cli.integrations.base import (
MarkdownIntegration,
SkillsIntegration,
TomlIntegration,
)
class _ClaudeStub(SkillsIntegration):
"""Minimal Claude-like SkillsIntegration for testing."""
key = "claude"
config = {
"name": "Claude (test stub)",
"folder": ".claude/",
"commands_subdir": "skills",
"install_url": None,
"requires_cli": True,
}
registrar_config = {
"dir": ".claude/skills",
"format": "markdown",
"args": "$ARGUMENTS",
"extension": "/SKILL.md",
}
context_file = "CLAUDE.md"
class _KiroCliStub(SkillsIntegration):
"""SkillsIntegration with a hyphenated key to exercise key
normalization (`kiro-cli` → `KIRO_CLI`)."""
key = "kiro-cli"
config = {
"name": "Kiro CLI (test stub)",
"folder": ".kiro/",
"commands_subdir": "commands",
"install_url": None,
"requires_cli": True,
}
registrar_config = {
"dir": ".kiro/commands",
"format": "markdown",
"args": "$ARGUMENTS",
"extension": ".md",
}
context_file = "KIRO.md"
class _NoCliStub(SkillsIntegration):
"""SkillsIntegration with requires_cli=False — build_exec_args
must return None and the env-var hook must not fire."""
key = "no-cli"
config = {
"name": "No-CLI agent (test stub)",
"folder": ".no-cli/",
"commands_subdir": "commands",
"install_url": None,
"requires_cli": False,
}
registrar_config = {
"dir": ".no-cli/commands",
"format": "markdown",
"args": "$ARGUMENTS",
"extension": ".md",
}
context_file = "NOCLI.md"
class _MarkdownAgentStub(MarkdownIntegration):
"""Bare MarkdownIntegration subclass — does NOT override
`build_exec_args`. Locks the base implementation in
`MarkdownIntegration.build_exec_args` for the common case
(most concrete integrations: Amp, Auggie, Generic, …)."""
key = "md-agent"
config = {
"name": "Markdown agent (test stub)",
"folder": ".md-agent/",
"commands_subdir": "commands",
"install_url": None,
"requires_cli": True,
}
registrar_config = {
"dir": ".md-agent/commands",
"format": "markdown",
"args": "$ARGUMENTS",
"extension": ".md",
}
context_file = "MDAGENT.md"
class _TomlAgentStub(TomlIntegration):
"""Bare TomlIntegration subclass — does NOT override
`build_exec_args`. Locks the base implementation in
`TomlIntegration.build_exec_args` (Gemini, Tabnine)."""
key = "toml-agent"
config = {
"name": "TOML agent (test stub)",
"folder": ".toml-agent/",
"commands_subdir": "commands",
"install_url": None,
"requires_cli": True,
}
registrar_config = {
"dir": ".toml-agent/commands",
"format": "toml",
"args": "$ARGUMENTS",
"extension": ".toml",
}
context_file = "TOMLAGENT.md"
@pytest.fixture(autouse=True)
def _clean_extra_args_env(monkeypatch):
"""Strip any leaked SPECKIT_INTEGRATION_*_EXTRA_ARGS from the test
env so a developer's shell setting doesn't pollute results."""
for key in list(os.environ):
if key.startswith("SPECKIT_INTEGRATION_") and key.endswith(
"_EXTRA_ARGS"
):
monkeypatch.delenv(key, raising=False)
def test_env_var_unset_byte_identical_argv():
"""Default behaviour: env var unset → no extra args inserted.
Locks the backward-compatibility guarantee that existing
operators see no change.
"""
args = _ClaudeStub().build_exec_args("hello prompt")
assert args == ["claude", "-p", "hello prompt", "--output-format", "json"]
def test_env_var_set_flag_inserted_before_model_and_output_format(
monkeypatch,
):
monkeypatch.setenv(
"SPECKIT_INTEGRATION_CLAUDE_EXTRA_ARGS", "--dangerously-skip-permissions"
)
args = _ClaudeStub().build_exec_args("hello prompt", model="sonnet")
assert args == [
"claude",
"-p",
"hello prompt",
"--dangerously-skip-permissions",
"--model",
"sonnet",
"--output-format",
"json",
]
def test_env_var_multi_token_parsed_via_shlex(monkeypatch):
monkeypatch.setenv(
"SPECKIT_INTEGRATION_CLAUDE_EXTRA_ARGS",
"--dangerously-skip-permissions --max-turns 3",
)
args = _ClaudeStub().build_exec_args("p")
assert args == [
"claude",
"-p",
"p",
"--dangerously-skip-permissions",
"--max-turns",
"3",
"--output-format",
"json",
]
def test_malformed_quoting_raises_actionable_value_error(monkeypatch):
"""An unmatched quote in the env-var value must surface a clear
error naming the offending env var and showing the invalid value,
rather than crashing workflow dispatch with a bare shlex traceback."""
monkeypatch.setenv(
"SPECKIT_INTEGRATION_CLAUDE_EXTRA_ARGS",
'--flag "unterminated',
)
with pytest.raises(ValueError) as excinfo:
_ClaudeStub().build_exec_args("p")
msg = str(excinfo.value)
assert "SPECKIT_INTEGRATION_CLAUDE_EXTRA_ARGS" in msg
assert "--flag \"unterminated" in msg
def test_env_var_empty_or_whitespace_is_noop(monkeypatch):
"""An env var set to '' or ' ' is treated as unset."""
monkeypatch.setenv("SPECKIT_INTEGRATION_CLAUDE_EXTRA_ARGS", " ")
args = _ClaudeStub().build_exec_args("p")
assert args == ["claude", "-p", "p", "--output-format", "json"]
def test_other_integration_env_var_ignored(monkeypatch):
"""`SPECKIT_INTEGRATION_GEMINI_EXTRA_ARGS` set must NOT leak into
Claude's argv (per-integration scoping)."""
monkeypatch.setenv("SPECKIT_INTEGRATION_GEMINI_EXTRA_ARGS", "--gemini-only-flag")
args = _ClaudeStub().build_exec_args("p")
assert args == ["claude", "-p", "p", "--output-format", "json"]
def test_key_normalization_hyphen_to_underscore_uppercase(monkeypatch):
"""`kiro-cli` key looks up `SPECKIT_INTEGRATION_KIRO_CLI_EXTRA_ARGS`
(hyphens replaced with underscores, then uppercased)."""
monkeypatch.setenv(
"SPECKIT_INTEGRATION_KIRO_CLI_EXTRA_ARGS", "--some-kiro-flag"
)
args = _KiroCliStub().build_exec_args("p")
assert args == [
"kiro-cli",
"-p",
"p",
"--some-kiro-flag",
"--output-format",
"json",
]
def test_requires_cli_false_returns_none(monkeypatch):
"""`requires_cli: False` short-circuits to None — the env-var
hook is never reached and no argv is built."""
monkeypatch.setenv("SPECKIT_INTEGRATION_NO_CLI_EXTRA_ARGS", "--should-not-appear")
assert _NoCliStub().build_exec_args("p") is None
# ---------------------------------------------------------------------------
# Base-class coverage
#
# Most integrations inherit `build_exec_args` from `MarkdownIntegration`
# or `TomlIntegration` without overriding it. The tests above use
# `SkillsIntegration` stubs (which share the same hook mechanism) — these
# tests exercise the two other base implementations directly so all three
# concrete bases are covered.
# ---------------------------------------------------------------------------
def test_markdown_integration_base_honours_extra_args(monkeypatch):
"""A bare `MarkdownIntegration` subclass — which does not override
`build_exec_args` — must honour the env var via the base
implementation. Covers the most common integration pattern."""
monkeypatch.setenv(
"SPECKIT_INTEGRATION_MD_AGENT_EXTRA_ARGS", "--debug --max-tokens 100"
)
args = _MarkdownAgentStub().build_exec_args("p")
assert args == [
"md-agent",
"-p",
"p",
"--debug",
"--max-tokens",
"100",
"--output-format",
"json",
]
def test_toml_integration_base_honours_extra_args(monkeypatch):
"""A bare `TomlIntegration` subclass — which does not override
`build_exec_args` — must honour the env var via the base
implementation. Covers Gemini/Tabnine-style integrations."""
monkeypatch.setenv(
"SPECKIT_INTEGRATION_TOML_AGENT_EXTRA_ARGS", "--yolo"
)
args = _TomlAgentStub().build_exec_args("p", model="gemini-pro")
# TomlIntegration uses `-m` for model (vs Markdown's `--model`).
assert args == [
"toml-agent",
"-p",
"p",
"--yolo",
"-m",
"gemini-pro",
"--output-format",
"json",
]
# ---------------------------------------------------------------------------
# Override-integration coverage
#
# CodexIntegration, DevinIntegration, OpencodeIntegration and
# CopilotIntegration each override `build_exec_args` rather than using the
# base implementations. The env-var hook must be wired into every override
# so the documented behaviour ("works for every requires_cli integration")
# is honoured. These tests lock that contract per integration.
# ---------------------------------------------------------------------------
def test_codex_integration_honours_extra_args(monkeypatch):
from specify_cli.integrations.codex import CodexIntegration
monkeypatch.setenv("SPECKIT_INTEGRATION_CODEX_EXTRA_ARGS", "--sandbox read-only")
args = CodexIntegration().build_exec_args("p", model="gpt-5")
assert args == [
"codex",
"exec",
"p",
"--sandbox",
"read-only",
"--model",
"gpt-5",
"--json",
]
def test_devin_integration_honours_extra_args(monkeypatch):
from specify_cli.integrations.devin import DevinIntegration
monkeypatch.setenv("SPECKIT_INTEGRATION_DEVIN_EXTRA_ARGS", "--no-confirm")
args = DevinIntegration().build_exec_args("p")
assert args == ["devin", "-p", "p", "--no-confirm"]
def test_opencode_integration_honours_extra_args(monkeypatch):
from specify_cli.integrations.opencode import OpencodeIntegration
monkeypatch.setenv("SPECKIT_INTEGRATION_OPENCODE_EXTRA_ARGS", "--quiet")
args = OpencodeIntegration().build_exec_args("p")
assert args == [
"opencode",
"run",
"--quiet",
"--format",
"json",
"p",
]
def test_opencode_extra_args_cannot_clobber_prompt_derived_command(
monkeypatch,
):
"""Operator-injected extra args must appear BEFORE the prompt-derived
``--command <X>`` so that Spec Kit's command selection wins under
repeated-flag CLI semantics (last value typically takes precedence).
Locks against the regression where an operator setting
``SPECKIT_INTEGRATION_OPENCODE_EXTRA_ARGS="--command malicious"`` could redirect
a slash-prefixed prompt to a different command.
"""
from specify_cli.integrations.opencode import OpencodeIntegration
monkeypatch.setenv(
"SPECKIT_INTEGRATION_OPENCODE_EXTRA_ARGS", "--command operator-override"
)
args = OpencodeIntegration().build_exec_args("/speckit body text")
# Prompt-derived "--command speckit" appears AFTER the
# operator-injected one, so a CLI that resolves repeated flags
# last-wins will honour Spec Kit's choice.
assert args == [
"opencode",
"run",
"--command",
"operator-override",
"--command",
"speckit",
"--format",
"json",
"body text",
]
def test_copilot_integration_honours_extra_args(monkeypatch):
from specify_cli.integrations.copilot import (
CopilotIntegration,
_copilot_executable,
)
# Disable --yolo so the argv shape stays deterministic.
monkeypatch.setenv("SPECKIT_COPILOT_ALLOW_ALL_TOOLS", "0")
monkeypatch.setenv(
"SPECKIT_INTEGRATION_COPILOT_EXTRA_ARGS", "--allow-tool 'shell(echo)'"
)
args = CopilotIntegration().build_exec_args("p")
# `_copilot_executable()` returns "copilot.cmd" on Windows and
# "copilot" elsewhere; the test must mirror that to stay portable.
assert args == [
_copilot_executable(),
"-p",
"p",
"--allow-tool",
"shell(echo)",
"--output-format",
"json",
]
# ---------------------------------------------------------------------------
# `dispatch_command` end-to-end coverage
#
# Workflow execution calls `impl.dispatch_command(...)`, not
# `build_exec_args` directly. `IntegrationBase.dispatch_command` delegates
# to `build_exec_args` (so the override fixes above flow through), but
# `CopilotIntegration` overrides `dispatch_command` and constructs
# `cli_args` inline — the hook must be invoked there too or the env var
# is silently ignored at workflow runtime. These tests monkeypatch
# `subprocess.run` and assert the env-var args reach the executed argv.
# ---------------------------------------------------------------------------
class _RunCapture:
"""Test double that captures argv passed to subprocess.run."""
def __init__(self):
self.captured_args: list[str] | None = None
def __call__(self, args, **kwargs):
self.captured_args = list(args)
class _Result:
returncode = 0
stdout = ""
stderr = ""
return _Result()
def test_copilot_dispatch_command_includes_extra_args(monkeypatch):
"""Locks the bypass fix: `CopilotIntegration.dispatch_command`
must honour `SPECKIT_INTEGRATION_COPILOT_EXTRA_ARGS`, not just `build_exec_args`.
"""
import subprocess
from specify_cli.integrations.copilot import CopilotIntegration
capture = _RunCapture()
monkeypatch.setattr(subprocess, "run", capture)
monkeypatch.setenv("SPECKIT_COPILOT_ALLOW_ALL_TOOLS", "0")
monkeypatch.setenv(
"SPECKIT_INTEGRATION_COPILOT_EXTRA_ARGS", "--allow-tool 'shell(echo)'"
)
CopilotIntegration().dispatch_command(
"speckit.plan", args="body", stream=False
)
assert capture.captured_args is not None
# Hook inserted between `-p prompt` and the canonical Copilot flags.
p_idx = capture.captured_args.index("-p")
agent_idx = capture.captured_args.index("--agent")
extra_idx = capture.captured_args.index("--allow-tool")
assert p_idx < extra_idx < agent_idx
assert "shell(echo)" in capture.captured_args
def test_codex_dispatch_command_includes_extra_args(monkeypatch):
"""Lock the inherited `IntegrationBase.dispatch_command` path:
Codex (and by transitivity Devin, Opencode) flow through
`build_exec_args`, so the env var must reach argv at workflow
runtime.
"""
import subprocess
from specify_cli.integrations.codex import CodexIntegration
capture = _RunCapture()
monkeypatch.setattr(subprocess, "run", capture)
monkeypatch.setenv("SPECKIT_INTEGRATION_CODEX_EXTRA_ARGS", "--sandbox read-only")
CodexIntegration().dispatch_command(
"speckit.plan", args="body", stream=False
)
assert capture.captured_args is not None
assert "--sandbox" in capture.captured_args
assert "read-only" in capture.captured_args