feat: Config-driven opt-in authentication registry with multi-platform support (#2393)

* Initial plan

* feat: add authentication provider registry (GitHub + Azure DevOps)

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/da7ecfd0-e1c9-48dc-b692-27be0879e976

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* feat: add try-each-provider HTTP helper and wire all catalog fetches through auth registry

- Add authentication/http.py with open_url() that tries each configured
  provider in registry order, falling through on 401/403 to the next,
  and finally to unauthenticated
- Add build_request() for one-shot request construction
- Add configured_providers() to registry __init__
- Remove api_base_url() from AuthProvider ABC (unused)
- Remove hosts attribute from providers (no host matching)
- Replace _github_http.py usage in ExtensionCatalog and PresetCatalog
- Wire IntegrationCatalog and WorkflowCatalog through open_url (were unauthenticated)
- Wire _fetch_latest_release_tag() through open_url
- Wire all inline --from-url downloads through open_url
- Fix unused stub variable flagged by code-quality bot
- 49 auth tests (positive + negative), 1805 total tests passing

* fix: address review — fix stale docstrings, restore Accept header, add extra_headers to open_url

- Fix _open_url() docstrings in extensions.py and presets.py that
  incorrectly claimed redirect stripping behavior
- Add extra_headers parameter to open_url() so callers can pass
  additional headers (e.g. Accept) that persist across retries
- Restore Accept: application/vnd.github+json header in
  _fetch_latest_release_tag() via extra_headers

* feat: config-driven opt-in auth via ~/.specify/auth.json

Security-first redesign: no credentials are sent unless the user
explicitly creates ~/.specify/auth.json mapping hosts to providers.

- Add authentication/config.py: loads and validates auth.json with
  host-to-provider mappings, supports token/token_env/azure-ad/azure-cli
- Refactor AuthProvider ABC: auth_headers(token, scheme) + resolve_token(entry)
- Refactor GitHubAuth: bearer scheme only, token from config entry
- Refactor AzureDevOpsAuth: 4 schemes (basic-pat, bearer, azure-cli, azure-ad)
  with dynamic token acquisition for azure-cli and azure-ad
- Rewrite authentication/http.py: host matching, redirect stripping,
  provider fallthrough on 401/403, unauthenticated fallback
- Add docs/reference/authentication.md with full reference and template
- 1823 tests passing (67 auth-specific)

* fix: address review — unused imports, host normalization, provider+scheme validation, security hardening

- Remove unused imports (os, field, Any) in config.py
- Normalize hosts during load (strip + lowercase)
- Validate token/token_env are non-empty strings during load
- Validate provider+scheme compatibility during load
- Fix extra_headers order: auth headers applied last, cannot be overridden
- Remove unused 'tried' variable in http.py
- Warn (once) on malformed auth.json instead of silent fallback
- URL-encode OAuth2 client credentials body in azure_devops.py
- Update 403 message to mention auth.json configuration
- Fix registry leak in test_register_duplicate (try/finally)
- Fix import style consistency in test_authentication.py
- Add azure-cli and azure-ad token acquisition tests (mock subprocess/urlopen)
- Add autouse fixture to isolate upgrade tests from real auth.json
- 1829 tests passing

* fix: reject unknown providers, validate azure-ad fields, strip Authorization from extra_headers

- Reject unknown provider keys during auth.json load with clear error message
- Validate azure-ad tenant_id/client_id/client_secret_env as non-empty strings
- Strip Authorization from extra_headers in both build_request and open_url
  to prevent accidental or intentional bypass of provider-configured auth
- Add tests for unknown provider and incompatible scheme validation
- 1831 tests passing

* fix: extract shared auth test helpers, global config isolation, align docstring

- Move _inject_github_config / make_github_auth_entry to tests/auth_helpers.py
  to eliminate duplication across test_extensions, test_presets, test_upgrade
- Move auth config isolation fixture to global conftest.py (autouse) so ALL
  tests are isolated from ~/.specify/auth.json, not just test_upgrade
- Align load_auth_config docstring with actual behavior: ValueError may be
  caught by higher-level HTTP helpers that warn and continue unauthenticated
- 1831 tests passing

* fix: preserve auth header across multi-hop redirect chains

- Read Authorization from both headers and unredirected_hdrs in
  _StripAuthOnRedirect to survive multi-hop chains within allowed hosts
- Add test_multi_hop_redirect_within_hosts_preserves_auth
- 1832 tests passing

* fix: use resolved config path in warning/error messages and patch build_opener in no-network test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/86df9557-54f1-4fe4-a25f-9501cb2356cf

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: assert full resolved config path in rate-limit output test

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/86df9557-54f1-4fe4-a25f-9501cb2356cf

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: close HTTPError on 401/403, remove _VALID_AUTH_SCHEMES, catch TimeoutExpired, skip POSIX test on Windows, remove unused import

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/a1e29737-dd6e-4287-96c1-509e0c96fb21

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: use stable ~/.specify/auth.json in rate-limit message, skip POSIX permission check on Windows

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/4636bcdb-87ae-45d6-9545-a40e4effd617

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: validate host patterns, cache auth config per-process

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: clarify _is_valid_host_pattern docstring, clean up test sentinel type

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* fix: improve _is_valid_host_pattern docstring and test observability

Agent-Logs-Url: https://github.com/github/spec-kit/sessions/889b58a7-7f8c-47e2-8056-931ebcc671cc

Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mnriem <15701806+mnriem@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This commit is contained in:
Copilot
2026-05-07 12:51:20 -05:00
committed by GitHub
parent 5563269831
commit f0998348be
19 changed files with 1851 additions and 174 deletions

View File

@@ -1762,22 +1762,14 @@ def _fetch_latest_release_tag() -> tuple[str | None, str | None]:
On anything else — including a malformed response body — the exception
propagates; there is no catch-all (research D-006).
"""
req = urllib.request.Request(
GITHUB_API_LATEST,
headers={"Accept": "application/vnd.github+json"},
)
token = None
for env_var in ("GH_TOKEN", "GITHUB_TOKEN"):
candidate = os.environ.get(env_var)
if candidate is not None:
candidate = candidate.strip()
if candidate:
token = candidate
break
if token:
req.add_header("Authorization", f"Bearer {token}")
from .authentication.http import open_url
try:
with urllib.request.urlopen(req, timeout=5) as resp:
with open_url(
GITHUB_API_LATEST,
timeout=5,
extra_headers={"Accept": "application/vnd.github+json"},
) as resp:
payload = json.loads(resp.read().decode("utf-8"))
tag = payload.get("tag_name")
if not isinstance(tag, str) or not tag:
@@ -1786,7 +1778,9 @@ def _fetch_latest_release_tag() -> tuple[str | None, str | None]:
except urllib.error.HTTPError as e:
# Order matters: HTTPError is a subclass of URLError.
if e.code == 403:
return None, "rate limited (try setting GH_TOKEN or GITHUB_TOKEN)"
return None, (
"rate limited (configure ~/.specify/auth.json with a GitHub token)"
)
return None, f"HTTP {e.code}"
except (urllib.error.URLError, OSError):
return None, "offline or timeout"
@@ -3381,7 +3375,9 @@ def preset_add(
with tempfile.TemporaryDirectory() as tmpdir:
zip_path = Path(tmpdir) / "preset.zip"
try:
with urllib.request.urlopen(from_url, timeout=60) as response:
from specify_cli.authentication.http import open_url as _open_url
with _open_url(from_url, timeout=60) as response:
zip_path.write_bytes(response.read())
except urllib.error.URLError as e:
console.print(f"[red]Error:[/red] Failed to download: {e}")
@@ -4285,7 +4281,9 @@ def extension_add(
zip_path = download_dir / f"{extension}-url-download.zip"
try:
with urllib.request.urlopen(from_url, timeout=60) as response:
from specify_cli.authentication.http import open_url as _open_url
with _open_url(from_url, timeout=60) as response:
zip_data = response.read()
zip_path.write_bytes(zip_data)
@@ -5500,7 +5498,7 @@ def workflow_add(
if source.startswith("http://") or source.startswith("https://"):
from ipaddress import ip_address
from urllib.parse import urlparse
from urllib.request import urlopen # noqa: S310
from specify_cli.authentication.http import open_url as _open_url
parsed_src = urlparse(source)
src_host = parsed_src.hostname or ""
@@ -5517,7 +5515,7 @@ def workflow_add(
import tempfile
try:
with urlopen(source, timeout=30) as resp: # noqa: S310
with _open_url(source, timeout=30) as resp:
final_url = resp.geturl()
final_parsed = urlparse(final_url)
final_host = final_parsed.hostname or ""
@@ -5613,10 +5611,10 @@ def workflow_add(
workflow_file = workflow_dir / "workflow.yml"
try:
from urllib.request import urlopen # noqa: S310 — URL comes from catalog
from specify_cli.authentication.http import open_url as _open_url
workflow_dir.mkdir(parents=True, exist_ok=True)
with urlopen(workflow_url, timeout=30) as response: # noqa: S310
with _open_url(workflow_url, timeout=30) as response:
# Validate final URL after redirects
final_url = response.geturl()
final_parsed = urlparse(final_url)

View File

@@ -0,0 +1,50 @@
"""Authentication provider registry for multi-platform support.
Credentials are **opt-in only**. No authentication headers are sent unless
the user creates ``~/.specify/auth.json`` mapping hosts to providers.
Provider classes define *how* to authenticate (Bearer, Basic-PAT, etc.)
while the config file defines *where* and *with what credentials*.
"""
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .base import AuthProvider
# Maps provider key → AuthProvider class instance.
AUTH_REGISTRY: dict[str, AuthProvider] = {}
def _register(provider: AuthProvider) -> None:
"""Register a provider instance in the global registry.
Raises ``ValueError`` for falsy keys and ``KeyError`` for duplicates.
"""
key = provider.key
if not key:
raise ValueError("Cannot register provider with an empty key.")
if key in AUTH_REGISTRY:
raise KeyError(f"Provider with key {key!r} is already registered.")
AUTH_REGISTRY[key] = provider
def get_provider(key: str) -> AuthProvider | None:
"""Return the provider for *key*, or ``None`` if not registered."""
return AUTH_REGISTRY.get(key)
# -- Register built-in providers -----------------------------------------
def _register_builtins() -> None:
"""Register all built-in authentication providers (alphabetical)."""
from .azure_devops import AzureDevOpsAuth
from .github import GitHubAuth
_register(AzureDevOpsAuth())
_register(GitHubAuth())
_register_builtins()

View File

@@ -0,0 +1,117 @@
"""Azure DevOps authentication provider."""
from __future__ import annotations
import base64
import json as _json
import os
import subprocess
from typing import TYPE_CHECKING
from .base import AuthProvider
if TYPE_CHECKING:
from .config import AuthConfigEntry
# Azure DevOps resource ID for OAuth / Azure AD token acquisition.
_ADO_RESOURCE_ID = "499b84ac-1321-427f-aa17-267ca6975798"
class AzureDevOpsAuth(AuthProvider):
"""Azure DevOps authentication provider.
Supports four auth schemes:
* ``basic-pat`` — PAT with empty username, Base64-encoded as ``:<PAT>``
* ``bearer`` — pre-acquired OAuth / Azure AD token
* ``azure-cli`` — acquires a token via ``az account get-access-token``
* ``azure-ad`` — acquires a token via OAuth2 client credentials flow
"""
key = "azure-devops"
supported_auth_schemes = ("basic-pat", "bearer", "azure-cli", "azure-ad")
def auth_headers(self, token: str, auth_scheme: str) -> dict[str, str]:
"""Build the ``Authorization`` header for the given scheme."""
if auth_scheme == "basic-pat":
encoded = base64.b64encode(f":{token}".encode("ascii")).decode("ascii")
return {"Authorization": f"Basic {encoded}"}
if auth_scheme in ("bearer", "azure-cli", "azure-ad"):
return {"Authorization": f"Bearer {token}"}
raise ValueError(
f"AzureDevOpsAuth does not support auth scheme {auth_scheme!r}"
)
def resolve_token(self, entry: AuthConfigEntry) -> str | None:
"""Resolve token, with special handling for azure-cli and azure-ad."""
if entry.auth == "azure-cli":
return self._acquire_via_az_cli()
if entry.auth == "azure-ad":
return self._acquire_via_client_credentials(entry)
return super().resolve_token(entry)
# -- Token acquisition ------------------------------------------------
@staticmethod
def _acquire_via_az_cli() -> str | None:
"""Run ``az account get-access-token`` and return the access token."""
try:
result = subprocess.run( # noqa: S603, S607
[
"az",
"account",
"get-access-token",
"--resource",
_ADO_RESOURCE_ID,
"--output",
"json",
],
capture_output=True,
text=True,
timeout=30,
check=False,
)
if result.returncode != 0:
return None
payload = _json.loads(result.stdout)
token = payload.get("accessToken", "").strip()
return token or None
except (OSError, subprocess.TimeoutExpired, _json.JSONDecodeError, KeyError):
return None
@staticmethod
def _acquire_via_client_credentials(entry: AuthConfigEntry) -> str | None:
"""Acquire a token via OAuth2 client credentials flow."""
import urllib.error
import urllib.request
if not entry.tenant_id or not entry.client_id or not entry.client_secret_env:
return None
client_secret = os.environ.get(entry.client_secret_env, "").strip()
if not client_secret:
return None
url = (
f"https://login.microsoftonline.com/{entry.tenant_id}"
"/oauth2/v2.0/token"
)
from urllib.parse import urlencode
body = urlencode({
"grant_type": "client_credentials",
"client_id": entry.client_id,
"client_secret": client_secret,
"scope": f"{_ADO_RESOURCE_ID}/.default",
}).encode("utf-8")
req = urllib.request.Request(
url,
data=body,
headers={"Content-Type": "application/x-www-form-urlencoded"},
)
try:
with urllib.request.urlopen(req, timeout=30) as resp: # noqa: S310
payload = _json.loads(resp.read().decode("utf-8"))
token = payload.get("access_token", "").strip()
return token or None
except (urllib.error.URLError, OSError, _json.JSONDecodeError, KeyError):
return None

View File

@@ -0,0 +1,57 @@
"""Abstract base class for authentication providers."""
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .config import AuthConfigEntry
class AuthProvider(ABC):
"""Abstract base class every authentication provider must implement.
Subclasses must set:
* ``key`` — unique provider identifier (e.g. ``"github"``, ``"azure-devops"``)
* ``supported_auth_schemes`` — tuple of auth scheme strings this provider handles
And implement:
* ``auth_headers(token, auth_scheme)`` — build headers from a resolved token
* ``resolve_token(entry)`` — obtain the token for a config entry
"""
key: str = ""
"""Unique provider identifier."""
supported_auth_schemes: tuple[str, ...] = ()
"""Auth schemes this provider supports (e.g. ``("bearer",)``)."""
@abstractmethod
def auth_headers(self, token: str, auth_scheme: str) -> dict[str, str]:
"""Build authentication headers for *token* using *auth_scheme*.
Must return a dict with at least an ``Authorization`` key.
"""
def resolve_token(self, entry: AuthConfigEntry) -> str | None:
"""Resolve the token for *entry*.
Default implementation reads from ``entry.token`` directly
or from the environment variable named by ``entry.token_env``.
Override for schemes that acquire tokens dynamically
(e.g. ``azure-cli``, ``azure-ad``).
"""
import os
if entry.token:
return entry.token.strip() or None
if entry.token_env:
val = os.environ.get(entry.token_env)
if val is not None:
val = val.strip()
if val:
return val
return None

View File

@@ -0,0 +1,209 @@
"""Authentication configuration loader.
Reads ``~/.specify/auth.json`` to determine which hosts receive credentials
and which provider/auth-scheme to use. No credentials are sent without
an explicit opt-in via this file.
"""
from __future__ import annotations
import json
import os
import stat
from dataclasses import dataclass
from fnmatch import fnmatch
from pathlib import Path
from urllib.parse import urlparse
@dataclass(frozen=True)
class AuthConfigEntry:
"""A single provider entry from ``auth.json``."""
hosts: tuple[str, ...]
provider: str
auth: str
token: str | None = None
token_env: str | None = None
# Azure AD service-principal fields
tenant_id: str | None = None
client_id: str | None = None
client_secret_env: str | None = None
def _default_config_path() -> Path:
"""Return ``~/.specify/auth.json``."""
return Path.home() / ".specify" / "auth.json"
def _is_valid_host_pattern(pattern: str) -> bool:
"""Return True for safe host patterns: exact hostnames or ``*.suffix`` only.
Rejects patterns like ``*github.com`` (which would match
``github.com.evil.com``) or multi-wildcard forms. Only these two
forms are accepted:
* ``example.com`` — exact hostname
* ``*.example.com`` — leading ``*.`` wildcard; matches subdomains
such as ``myorg.example.com`` but not ``example.com`` itself
"""
if "*" not in pattern:
return True # exact hostname — already validated as non-empty
# Only *.suffix is allowed; no other wildcard positions
return pattern.startswith("*.") and "*" not in pattern[2:]
def load_auth_config(
path: Path | None = None,
) -> list[AuthConfigEntry]:
"""Load and validate ``auth.json``, returning configured entries.
Returns an empty list when the file does not exist — this means
all HTTP requests will be unauthenticated (opt-in model).
Raises ``ValueError`` on schema violations. Callers that want
misconfigurations to fail fast can allow this exception to
propagate; higher-level HTTP helpers may instead catch it,
warn, and continue with unauthenticated requests.
"""
config_path = path or _default_config_path()
if not config_path.is_file():
return []
# Warn (but don't fail) if the file is world-readable (POSIX only).
if os.name != "nt":
try:
mode = config_path.stat().st_mode
if mode & (stat.S_IRGRP | stat.S_IROTH):
import warnings
warnings.warn(
f"{config_path} is readable by group/others. "
"Consider restricting with: chmod 600 "
f"{config_path}",
UserWarning,
stacklevel=2,
)
except OSError:
pass # stat failed — skip permission check
raw = json.loads(config_path.read_text(encoding="utf-8"))
if not isinstance(raw, dict):
raise ValueError(f"auth.json must be a JSON object, got {type(raw).__name__}")
providers_raw = raw.get("providers")
if not isinstance(providers_raw, list):
raise ValueError("auth.json must contain a 'providers' array")
entries: list[AuthConfigEntry] = []
for i, entry_raw in enumerate(providers_raw):
if not isinstance(entry_raw, dict):
raise ValueError(f"providers[{i}]: must be a JSON object")
hosts = entry_raw.get("hosts")
if not isinstance(hosts, list) or not hosts:
raise ValueError(f"providers[{i}]: 'hosts' must be a non-empty array")
if not all(isinstance(h, str) and h.strip() for h in hosts):
raise ValueError(f"providers[{i}]: each host must be a non-empty string")
# Normalize hosts: strip whitespace and lowercase
hosts = [h.strip().lower() for h in hosts]
# Reject dangerous wildcard forms (e.g. *github.com matches github.com.evil.com)
for h in hosts:
if not _is_valid_host_pattern(h):
raise ValueError(
f"providers[{i}]: invalid host pattern {h!r}. "
"Only exact hostnames or '*.suffix' forms are allowed "
"(e.g. 'github.com' or '*.visualstudio.com')."
)
provider = entry_raw.get("provider", "")
if not isinstance(provider, str) or not provider:
raise ValueError(f"providers[{i}]: 'provider' must be a non-empty string")
auth = entry_raw.get("auth", "")
if not isinstance(auth, str) or not auth:
raise ValueError(f"providers[{i}]: 'auth' must be a non-empty string")
token = entry_raw.get("token")
token_env = entry_raw.get("token_env")
# Validate token/token_env types
if token is not None and (not isinstance(token, str) or not token.strip()):
raise ValueError(f"providers[{i}]: 'token' must be a non-empty string")
if token_env is not None and (not isinstance(token_env, str) or not token_env.strip()):
raise ValueError(f"providers[{i}]: 'token_env' must be a non-empty string")
# Validate provider+scheme compatibility
from . import get_provider as _get_provider
_prov = _get_provider(provider)
if _prov is None:
from . import AUTH_REGISTRY
raise ValueError(
f"providers[{i}]: unknown provider {provider!r}; "
f"registered: {sorted(AUTH_REGISTRY.keys())}"
)
if auth not in _prov.supported_auth_schemes:
raise ValueError(
f"providers[{i}]: provider {provider!r} does not support "
f"auth scheme {auth!r}; supported: {list(_prov.supported_auth_schemes)}"
)
# Validate token source based on auth scheme
if auth in ("bearer", "basic-pat"):
if not token and not token_env:
raise ValueError(
f"providers[{i}]: auth={auth!r} requires 'token' or 'token_env'"
)
elif auth == "azure-ad":
tenant_id = entry_raw.get("tenant_id")
client_id = entry_raw.get("client_id")
client_secret_env = entry_raw.get("client_secret_env")
if not all([tenant_id, client_id, client_secret_env]):
raise ValueError(
f"providers[{i}]: auth='azure-ad' requires "
"'tenant_id', 'client_id', and 'client_secret_env'"
)
for field_name, field_val in [
("tenant_id", tenant_id),
("client_id", client_id),
("client_secret_env", client_secret_env),
]:
if not isinstance(field_val, str) or not field_val.strip():
raise ValueError(
f"providers[{i}]: '{field_name}' must be a non-empty string"
)
# azure-cli needs no extra fields
entries.append(
AuthConfigEntry(
hosts=tuple(hosts),
provider=provider,
auth=auth,
token=token,
token_env=token_env,
tenant_id=entry_raw.get("tenant_id"),
client_id=entry_raw.get("client_id"),
client_secret_env=entry_raw.get("client_secret_env"),
)
)
return entries
def find_entries_for_url(
url: str, entries: list[AuthConfigEntry]
) -> list[AuthConfigEntry]:
"""Return entries whose ``hosts`` match the hostname of *url*."""
hostname = (urlparse(url).hostname or "").lower()
if not hostname:
return []
return [
e
for e in entries
if any(
pattern == hostname or fnmatch(hostname, pattern)
for pattern in e.hosts
)
]

View File

@@ -0,0 +1,24 @@
"""GitHub authentication provider."""
from __future__ import annotations
from .base import AuthProvider
class GitHubAuth(AuthProvider):
"""GitHub authentication provider.
Supports the ``bearer`` auth scheme, used for PATs, fine-grained PATs,
OAuth tokens, and GitHub App installation tokens.
"""
key = "github"
supported_auth_schemes = ("bearer",)
def auth_headers(self, token: str, auth_scheme: str) -> dict[str, str]:
"""Return ``Authorization: Bearer <token>``."""
if auth_scheme != "bearer":
raise ValueError(
f"GitHubAuth does not support auth scheme {auth_scheme!r}"
)
return {"Authorization": f"Bearer {token}"}

View File

@@ -0,0 +1,149 @@
"""Authenticated HTTP helpers driven by ``~/.specify/auth.json``.
No credentials are sent unless the user has created ``auth.json``.
For each outbound URL the helper matches the hostname against
configured entries, resolves the token via the appropriate provider
class, and attaches auth headers. Redirect safety is enforced:
the ``Authorization`` header is stripped when a redirect leaves the
entry's declared hosts. On 401/403 the next matching entry is tried,
then unauthenticated.
"""
from __future__ import annotations
import urllib.error
import urllib.request
from fnmatch import fnmatch
from urllib.parse import urlparse
from . import get_provider
from .config import AuthConfigEntry, _default_config_path, find_entries_for_url, load_auth_config
_config_override: list[AuthConfigEntry] | None = None
_config_cache: list[AuthConfigEntry] | None = None # None = not yet loaded
def _load_config() -> list[AuthConfigEntry]:
"""Load auth config, using override if set (for testing).
The result is cached per-process so ``auth.json`` is read at most once,
and any warning about a malformed file fires only once.
"""
global _config_cache
if _config_override is not None:
return _config_override
if _config_cache is not None:
return _config_cache
try:
_config_cache = load_auth_config()
except (ValueError, OSError) as exc:
import warnings
config_path = _default_config_path()
warnings.warn(
f"Failed to load {config_path}: {exc}. "
"All requests will be unauthenticated.",
UserWarning,
stacklevel=2,
)
_config_cache = []
return _config_cache
def _hostname_in_hosts(hostname: str, hosts: tuple[str, ...]) -> bool:
"""Return True if *hostname* matches any pattern in *hosts*."""
hostname = hostname.lower()
return any(p == hostname or fnmatch(hostname, p) for p in hosts)
class _StripAuthOnRedirect(urllib.request.HTTPRedirectHandler):
"""Drop ``Authorization`` when a redirect leaves the entry's declared hosts."""
def __init__(self, hosts: tuple[str, ...]) -> None:
super().__init__()
self._hosts = hosts
def redirect_request(self, req, fp, code, msg, headers, newurl):
original_auth = (
req.get_header("Authorization")
or req.unredirected_hdrs.get("Authorization")
)
new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
if new_req is not None:
hostname = (urlparse(newurl).hostname or "").lower()
if _hostname_in_hosts(hostname, self._hosts):
if original_auth:
new_req.add_unredirected_header("Authorization", original_auth)
else:
new_req.headers.pop("Authorization", None)
new_req.unredirected_hdrs.pop("Authorization", None)
return new_req
def build_request(url: str, extra_headers: dict[str, str] | None = None) -> urllib.request.Request:
"""Build a :class:`~urllib.request.Request`, attaching auth when config matches.
Uses the first matching entry from ``auth.json`` whose token resolves.
Returns a plain request when no entry matches or the file doesn't exist.
"""
headers: dict[str, str] = {}
if extra_headers:
# Strip Authorization from extra_headers to prevent bypass
headers.update({k: v for k, v in extra_headers.items() if k.lower() != "authorization"})
# Auth headers applied last — cannot be overridden by extra_headers
entries = find_entries_for_url(url, _load_config())
for entry in entries:
provider = get_provider(entry.provider)
if provider is None:
continue
token = provider.resolve_token(entry)
if token:
headers.update(provider.auth_headers(token, entry.auth))
break
return urllib.request.Request(url, headers=headers)
def open_url(url: str, timeout: int = 10, extra_headers: dict[str, str] | None = None):
"""Open *url* with config-driven auth, redirect stripping, and fallthrough.
1. Find ``auth.json`` entries whose hosts match the URL.
2. For each entry, resolve the token and try the request.
3. On 401/403 move to the next matching entry.
4. After all entries exhausted (or none matched), try unauthenticated.
5. Non-auth errors (404, 500, network) raise immediately.
*extra_headers* (e.g. ``Accept``) are merged into every attempt.
"""
entries = find_entries_for_url(url, _load_config())
def _make_req(auth_headers: dict[str, str]) -> urllib.request.Request:
merged = {}
if extra_headers:
# Strip Authorization from extra_headers to prevent bypass
merged.update({k: v for k, v in extra_headers.items() if k.lower() != "authorization"})
# Auth headers applied last — cannot be overridden by extra_headers
merged.update(auth_headers)
return urllib.request.Request(url, headers=merged)
# Try each matching entry
for entry in entries:
provider = get_provider(entry.provider)
if provider is None:
continue
token = provider.resolve_token(entry)
if not token:
continue
req = _make_req(provider.auth_headers(token, entry.auth))
opener = urllib.request.build_opener(_StripAuthOnRedirect(entry.hosts))
try:
return opener.open(req, timeout=timeout)
except urllib.error.HTTPError as exc:
if exc.code in (401, 403):
exc.close()
continue # try next entry
raise
# No entry worked (or none matched) — unauthenticated fallback
req = _make_req({})
return urllib.request.urlopen(req, timeout=timeout) # noqa: S310

View File

@@ -1707,20 +1707,20 @@ class ExtensionCatalog:
raise ValidationError("Catalog URL must be a valid URL with a host.")
def _make_request(self, url: str):
"""Build a urllib Request, adding a GitHub auth header when available.
"""Build a urllib Request, adding auth headers when a provider matches.
Delegates to :func:`specify_cli._github_http.build_github_request`.
Delegates to :func:`specify_cli.authentication.http.build_request`.
"""
from specify_cli._github_http import build_github_request
return build_github_request(url)
from specify_cli.authentication.http import build_request
return build_request(url)
def _open_url(self, url: str, timeout: int = 10):
"""Open a URL with GitHub auth, stripping the header on cross-host redirects.
"""Open a URL with provider-based auth, trying each configured provider.
Delegates to :func:`specify_cli._github_http.open_github_url`.
Delegates to :func:`specify_cli.authentication.http.open_url`.
"""
from specify_cli._github_http import open_github_url
return open_github_url(url, timeout)
from specify_cli.authentication.http import open_url
return open_url(url, timeout)
def _load_catalog_config(self, config_path: Path) -> Optional[List[CatalogEntry]]:
"""Load catalog stack configuration from a YAML file.

View File

@@ -265,7 +265,6 @@ class IntegrationCatalog:
) -> Dict[str, Any]:
"""Fetch one catalog, with per-URL caching."""
import urllib.error
import urllib.request
url_hash = hashlib.sha256(entry.url.encode()).hexdigest()[:16]
cache_file = self.cache_dir / f"catalog-{url_hash}.json"
@@ -289,7 +288,9 @@ class IntegrationCatalog:
pass # Cache cleanup is best-effort; ignore deletion failures.
try:
with urllib.request.urlopen(entry.url, timeout=10) as resp:
from specify_cli.authentication.http import open_url
with open_url(entry.url, timeout=10) as resp:
# Validate final URL after redirects
final_url = resp.geturl()
if final_url != entry.url:

View File

@@ -1845,20 +1845,20 @@ class PresetCatalog:
)
def _make_request(self, url: str):
"""Build a urllib Request, adding a GitHub auth header when available.
"""Build a urllib Request, adding auth headers when a provider matches.
Delegates to :func:`specify_cli._github_http.build_github_request`.
Delegates to :func:`specify_cli.authentication.http.build_request`.
"""
from specify_cli._github_http import build_github_request
return build_github_request(url)
from specify_cli.authentication.http import build_request
return build_request(url)
def _open_url(self, url: str, timeout: int = 10):
"""Open a URL with GitHub auth, stripping the header on cross-host redirects.
"""Open a URL with provider-based auth, trying each configured provider.
Delegates to :func:`specify_cli._github_http.open_github_url`.
Delegates to :func:`specify_cli.authentication.http.open_url`.
"""
from specify_cli._github_http import open_github_url
return open_github_url(url, timeout)
from specify_cli.authentication.http import open_url
return open_url(url, timeout)
def _load_catalog_config(self, config_path: Path) -> Optional[List[PresetCatalogEntry]]:
"""Load catalog stack configuration from a YAML file.

View File

@@ -322,7 +322,7 @@ class WorkflowCatalog:
# Fetch from URL — validate scheme before opening and after redirects
from urllib.parse import urlparse
from urllib.request import urlopen
from specify_cli.authentication.http import open_url as _open_url
def _validate_catalog_url(url: str) -> None:
parsed = urlparse(url)
@@ -337,7 +337,7 @@ class WorkflowCatalog:
_validate_catalog_url(entry.url)
try:
with urlopen(entry.url, timeout=30) as resp: # noqa: S310
with _open_url(entry.url, timeout=30) as resp:
_validate_catalog_url(resp.geturl())
data = json.loads(resp.read().decode("utf-8"))
except Exception as exc: