From b8982a748a27d808f5034887cc4b408053f111dd Mon Sep 17 00:00:00 2001 From: doquanghuy Date: Thu, 28 May 2026 22:50:38 +0700 Subject: [PATCH] =?UTF-8?q?docs(workflows):=20clarify=20continue=5Fon=5Fer?= =?UTF-8?q?ror=20scope=20=E2=80=94=20returned=20FAILED=20only?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Copilot finding on d0b9e00: The README's "Error Handling" intro implied `continue_on_error` covers "any other runtime error raised during step execution", but the engine only consults the flag when a step returns `StepResult(status=FAILED, ...)`. Exceptions raised out of `step_impl.execute()` propagate to `WorkflowEngine.execute()`, where the catch-all logs `workflow_failed` and re-raises — the step result is never recorded, and the flag is never consulted. Audited the whole PR diff for the same overclaim: 1. workflows/README.md — main fix. Reworded the Error Handling intro to "any step that returns StepResult(status=FAILED, ...)" and promoted the parenthetical structural-validation note into the Notes block. Added a new "Scope: returned failures only" note that names the exception path explicitly and tells step authors how to bring the flag into scope for exceptional code (catch internally and return FAILED with the failure encoded in `output`). 2. tests/test_workflows.py — section comment used "when an executable step fails", same ambiguity. Tightened to "when a step returns StepResult(status=FAILED, ...)" and added a sentence calling out that unhandled exceptions are out of scope. 3. src/specify_cli/workflows/engine.py — already correct ("any step that returns FAILED" in the validator comment; "lets the pipeline route around the failure" in the execute path). No change. Engine semantics and test bodies are unchanged. Docs-only. Co-Authored-By: Claude Opus 4.7 (1M context) --- tests/test_workflows.py | 15 +++++++++------ workflows/README.md | 29 ++++++++++++++++++----------- 2 files changed, 27 insertions(+), 17 deletions(-) diff --git a/tests/test_workflows.py b/tests/test_workflows.py index 0a5e61c2a..3ab8871de 100644 --- a/tests/test_workflows.py +++ b/tests/test_workflows.py @@ -2336,12 +2336,15 @@ steps: # ===== continue_on_error Tests ===== # # Locks the contract documented in workflows/README.md "Error Handling" -# section: when an executable step fails and `continue_on_error: true` -# is declared, the engine records the step's `output` (with `exit_code` -# and `stderr` from the failure) and its `status` (sibling key on -# `steps.`, not nested under `output`) and continues to the next -# sibling step instead of halting the run. Gate aborts -# (`output.aborted`) still halt regardless of the flag. +# section: when a step returns `StepResult(status=FAILED, ...)` and +# `continue_on_error: true` is declared, the engine records the step's +# `output` (with `exit_code` and `stderr` from the failure) and its +# `status` (sibling key on `steps.`, not nested under `output`) +# and continues to the next sibling step instead of halting the run. +# Gate aborts (`output.aborted`) still halt regardless of the flag. +# Unhandled exceptions raised out of `step_impl.execute()` are out of +# scope for this flag — they propagate to `WorkflowEngine.execute()` +# and abort the run. class TestContinueOnError: diff --git a/workflows/README.md b/workflows/README.md index 40c593fb6..b0b7050fe 100644 --- a/workflows/README.md +++ b/workflows/README.md @@ -221,17 +221,13 @@ Aggregate results from fan-out steps: ## Error Handling -By default, any step that ends in `StepStatus.FAILED` at runtime halts -the entire run — most commonly a `shell` or `command` step exiting -non-zero, but also any other runtime error raised during step -execution. (Invalid workflow definitions are rejected up-front by -`specify workflow run` before the run even starts, so structural -validation failures never reach this code path.) Set -`continue_on_error: true` on a step to record its result and continue -to the next sibling step instead. When the failure was a non-zero -exit, the exit code remains available on -`steps..output.exit_code` so downstream `if`, `switch`, or `gate` -steps can branch on it: +By default, any step that returns `StepResult(status=FAILED, ...)` +at runtime halts the entire run — most commonly a `shell` or +`command` step exiting non-zero. Set `continue_on_error: true` on +a step to record its result and continue to the next sibling step +instead. When the failure was a non-zero exit, the exit code +remains available on `steps..output.exit_code` so downstream +`if`, `switch`, or `gate` steps can branch on it: ```yaml - id: heavy-thing @@ -263,10 +259,21 @@ step in your own loop. - The field must be a literal boolean (`true` / `false`); coerced strings like `"true"` are rejected at validation time. +- **Scope: returned failures only.** The flag applies to step results + with `status=FAILED`. Unhandled exceptions raised out of a step's + `execute()` method are caught one level up by `WorkflowEngine.execute()`, + logged as `workflow_failed`, and abort the run regardless of + `continue_on_error`. If a step author wants the flag to cover an + exceptional path, the step must catch the exception internally and + return `StepResult(status=FAILED, ...)` with the failure encoded in + `output` (e.g. `exit_code`, `stderr`, or a custom field). - Gate aborts (`on_reject: abort` chosen by the operator) always halt the run — `continue_on_error` does not override them. The flag is for transient/expected step failures, not for overriding deliberate operator decisions. +- Structural validation runs up-front: `specify workflow run` rejects + invalid workflow definitions before the run is created, so + validation failures never reach this code path. - When the flag is omitted, behaviour is byte-equivalent to before this feature.