Files
github-spec-kit/src/specify_cli/authentication/config.py
Copilot f0998348be feat: Config-driven opt-in authentication registry with multi-platform support (#2393)
* Initial plan

* feat: add authentication provider registry (GitHub + Azure DevOps)

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/da7ecfd0-e1c9-48dc-b692-27be0879e976

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* feat: add try-each-provider HTTP helper and wire all catalog fetches through auth registry

- Add authentication/http.py with open_url() that tries each configured
  provider in registry order, falling through on 401/403 to the next,
  and finally to unauthenticated
- Add build_request() for one-shot request construction
- Add configured_providers() to registry __init__
- Remove api_base_url() from AuthProvider ABC (unused)
- Remove hosts attribute from providers (no host matching)
- Replace _github_http.py usage in ExtensionCatalog and PresetCatalog
- Wire IntegrationCatalog and WorkflowCatalog through open_url (were unauthenticated)
- Wire _fetch_latest_release_tag() through open_url
- Wire all inline --from-url downloads through open_url
- Fix unused stub variable flagged by code-quality bot
- 49 auth tests (positive + negative), 1805 total tests passing

* fix: address review — fix stale docstrings, restore Accept header, add extra_headers to open_url

- Fix _open_url() docstrings in extensions.py and presets.py that
  incorrectly claimed redirect stripping behavior
- Add extra_headers parameter to open_url() so callers can pass
  additional headers (e.g. Accept) that persist across retries
- Restore Accept: application/vnd.github+json header in
  _fetch_latest_release_tag() via extra_headers

* feat: config-driven opt-in auth via ~/.specify/auth.json

Security-first redesign: no credentials are sent unless the user
explicitly creates ~/.specify/auth.json mapping hosts to providers.

- Add authentication/config.py: loads and validates auth.json with
  host-to-provider mappings, supports token/token_env/azure-ad/azure-cli
- Refactor AuthProvider ABC: auth_headers(token, scheme) + resolve_token(entry)
- Refactor GitHubAuth: bearer scheme only, token from config entry
- Refactor AzureDevOpsAuth: 4 schemes (basic-pat, bearer, azure-cli, azure-ad)
  with dynamic token acquisition for azure-cli and azure-ad
- Rewrite authentication/http.py: host matching, redirect stripping,
  provider fallthrough on 401/403, unauthenticated fallback
- Add docs/reference/authentication.md with full reference and template
- 1823 tests passing (67 auth-specific)

* fix: address review — unused imports, host normalization, provider+scheme validation, security hardening

- Remove unused imports (os, field, Any) in config.py
- Normalize hosts during load (strip + lowercase)
- Validate token/token_env are non-empty strings during load
- Validate provider+scheme compatibility during load
- Fix extra_headers order: auth headers applied last, cannot be overridden
- Remove unused 'tried' variable in http.py
- Warn (once) on malformed auth.json instead of silent fallback
- URL-encode OAuth2 client credentials body in azure_devops.py
- Update 403 message to mention auth.json configuration
- Fix registry leak in test_register_duplicate (try/finally)
- Fix import style consistency in test_authentication.py
- Add azure-cli and azure-ad token acquisition tests (mock subprocess/urlopen)
- Add autouse fixture to isolate upgrade tests from real auth.json
- 1829 tests passing

* fix: reject unknown providers, validate azure-ad fields, strip Authorization from extra_headers

- Reject unknown provider keys during auth.json load with clear error message
- Validate azure-ad tenant_id/client_id/client_secret_env as non-empty strings
- Strip Authorization from extra_headers in both build_request and open_url
  to prevent accidental or intentional bypass of provider-configured auth
- Add tests for unknown provider and incompatible scheme validation
- 1831 tests passing

* fix: extract shared auth test helpers, global config isolation, align docstring

- Move _inject_github_config / make_github_auth_entry to tests/auth_helpers.py
  to eliminate duplication across test_extensions, test_presets, test_upgrade
- Move auth config isolation fixture to global conftest.py (autouse) so ALL
  tests are isolated from ~/.specify/auth.json, not just test_upgrade
- Align load_auth_config docstring with actual behavior: ValueError may be
  caught by higher-level HTTP helpers that warn and continue unauthenticated
- 1831 tests passing

* fix: preserve auth header across multi-hop redirect chains

- Read Authorization from both headers and unredirected_hdrs in
  _StripAuthOnRedirect to survive multi-hop chains within allowed hosts
- Add test_multi_hop_redirect_within_hosts_preserves_auth
- 1832 tests passing

* fix: use resolved config path in warning/error messages and patch build_opener in no-network test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/86df9557-54f1-4fe4-a25f-9501cb2356cf

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: assert full resolved config path in rate-limit output test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/86df9557-54f1-4fe4-a25f-9501cb2356cf

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: close HTTPError on 401/403, remove _VALID_AUTH_SCHEMES, catch TimeoutExpired, skip POSIX test on Windows, remove unused import

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/a1e29737-dd6e-4287-96c1-509e0c96fb21

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: use stable ~/.specify/auth.json in rate-limit message, skip POSIX permission check on Windows

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/4636bcdb-87ae-45d6-9545-a40e4effd617

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: validate host patterns, cache auth config per-process

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: clarify _is_valid_host_pattern docstring, clean up test sentinel type

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: improve _is_valid_host_pattern docstring and test observability

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-07 12:51:20 -05:00

210 lines
7.7 KiB
Python

"""Authentication configuration loader.
Reads ``~/.specify/auth.json`` to determine which hosts receive credentials
and which provider/auth-scheme to use. No credentials are sent without
an explicit opt-in via this file.
"""
from __future__ import annotations
import json
import os
import stat
from dataclasses import dataclass
from fnmatch import fnmatch
from pathlib import Path
from urllib.parse import urlparse
@dataclass(frozen=True)
class AuthConfigEntry:
"""A single provider entry from ``auth.json``."""
hosts: tuple[str, ...]
provider: str
auth: str
token: str | None = None
token_env: str | None = None
# Azure AD service-principal fields
tenant_id: str | None = None
client_id: str | None = None
client_secret_env: str | None = None
def _default_config_path() -> Path:
"""Return ``~/.specify/auth.json``."""
return Path.home() / ".specify" / "auth.json"
def _is_valid_host_pattern(pattern: str) -> bool:
"""Return True for safe host patterns: exact hostnames or ``*.suffix`` only.
Rejects patterns like ``*github.com`` (which would match
``github.com.evil.com``) or multi-wildcard forms. Only these two
forms are accepted:
* ``example.com`` — exact hostname
* ``*.example.com`` — leading ``*.`` wildcard; matches subdomains
such as ``myorg.example.com`` but not ``example.com`` itself
"""
if "*" not in pattern:
return True # exact hostname — already validated as non-empty
# Only *.suffix is allowed; no other wildcard positions
return pattern.startswith("*.") and "*" not in pattern[2:]
def load_auth_config(
path: Path | None = None,
) -> list[AuthConfigEntry]:
"""Load and validate ``auth.json``, returning configured entries.
Returns an empty list when the file does not exist — this means
all HTTP requests will be unauthenticated (opt-in model).
Raises ``ValueError`` on schema violations. Callers that want
misconfigurations to fail fast can allow this exception to
propagate; higher-level HTTP helpers may instead catch it,
warn, and continue with unauthenticated requests.
"""
config_path = path or _default_config_path()
if not config_path.is_file():
return []
# Warn (but don't fail) if the file is world-readable (POSIX only).
if os.name != "nt":
try:
mode = config_path.stat().st_mode
if mode & (stat.S_IRGRP | stat.S_IROTH):
import warnings
warnings.warn(
f"{config_path} is readable by group/others. "
"Consider restricting with: chmod 600 "
f"{config_path}",
UserWarning,
stacklevel=2,
)
except OSError:
pass # stat failed — skip permission check
raw = json.loads(config_path.read_text(encoding="utf-8"))
if not isinstance(raw, dict):
raise ValueError(f"auth.json must be a JSON object, got {type(raw).__name__}")
providers_raw = raw.get("providers")
if not isinstance(providers_raw, list):
raise ValueError("auth.json must contain a 'providers' array")
entries: list[AuthConfigEntry] = []
for i, entry_raw in enumerate(providers_raw):
if not isinstance(entry_raw, dict):
raise ValueError(f"providers[{i}]: must be a JSON object")
hosts = entry_raw.get("hosts")
if not isinstance(hosts, list) or not hosts:
raise ValueError(f"providers[{i}]: 'hosts' must be a non-empty array")
if not all(isinstance(h, str) and h.strip() for h in hosts):
raise ValueError(f"providers[{i}]: each host must be a non-empty string")
# Normalize hosts: strip whitespace and lowercase
hosts = [h.strip().lower() for h in hosts]
# Reject dangerous wildcard forms (e.g. *github.com matches github.com.evil.com)
for h in hosts:
if not _is_valid_host_pattern(h):
raise ValueError(
f"providers[{i}]: invalid host pattern {h!r}. "
"Only exact hostnames or '*.suffix' forms are allowed "
"(e.g. 'github.com' or '*.visualstudio.com')."
)
provider = entry_raw.get("provider", "")
if not isinstance(provider, str) or not provider:
raise ValueError(f"providers[{i}]: 'provider' must be a non-empty string")
auth = entry_raw.get("auth", "")
if not isinstance(auth, str) or not auth:
raise ValueError(f"providers[{i}]: 'auth' must be a non-empty string")
token = entry_raw.get("token")
token_env = entry_raw.get("token_env")
# Validate token/token_env types
if token is not None and (not isinstance(token, str) or not token.strip()):
raise ValueError(f"providers[{i}]: 'token' must be a non-empty string")
if token_env is not None and (not isinstance(token_env, str) or not token_env.strip()):
raise ValueError(f"providers[{i}]: 'token_env' must be a non-empty string")
# Validate provider+scheme compatibility
from . import get_provider as _get_provider
_prov = _get_provider(provider)
if _prov is None:
from . import AUTH_REGISTRY
raise ValueError(
f"providers[{i}]: unknown provider {provider!r}; "
f"registered: {sorted(AUTH_REGISTRY.keys())}"
)
if auth not in _prov.supported_auth_schemes:
raise ValueError(
f"providers[{i}]: provider {provider!r} does not support "
f"auth scheme {auth!r}; supported: {list(_prov.supported_auth_schemes)}"
)
# Validate token source based on auth scheme
if auth in ("bearer", "basic-pat"):
if not token and not token_env:
raise ValueError(
f"providers[{i}]: auth={auth!r} requires 'token' or 'token_env'"
)
elif auth == "azure-ad":
tenant_id = entry_raw.get("tenant_id")
client_id = entry_raw.get("client_id")
client_secret_env = entry_raw.get("client_secret_env")
if not all([tenant_id, client_id, client_secret_env]):
raise ValueError(
f"providers[{i}]: auth='azure-ad' requires "
"'tenant_id', 'client_id', and 'client_secret_env'"
)
for field_name, field_val in [
("tenant_id", tenant_id),
("client_id", client_id),
("client_secret_env", client_secret_env),
]:
if not isinstance(field_val, str) or not field_val.strip():
raise ValueError(
f"providers[{i}]: '{field_name}' must be a non-empty string"
)
# azure-cli needs no extra fields
entries.append(
AuthConfigEntry(
hosts=tuple(hosts),
provider=provider,
auth=auth,
token=token,
token_env=token_env,
tenant_id=entry_raw.get("tenant_id"),
client_id=entry_raw.get("client_id"),
client_secret_env=entry_raw.get("client_secret_env"),
)
)
return entries
def find_entries_for_url(
url: str, entries: list[AuthConfigEntry]
) -> list[AuthConfigEntry]:
"""Return entries whose ``hosts`` match the hostname of *url*."""
hostname = (urlparse(url).hostname or "").lower()
if not hostname:
return []
return [
e
for e in entries
if any(
pattern == hostname or fnmatch(hostname, pattern)
for pattern in e.hosts
)
]