docs(workflows): clarify continue_on_error scope — returned FAILED only

Copilot finding on d0b9e00: The README's "Error Handling" intro implied `continue_on_error` covers "any other runtime error raised during step execution", but the engine only consults the flag when a step returns `StepResult(status=FAILED, ...)`. Exceptions raised out of `step_impl.execute()` propagate to `WorkflowEngine.execute()`, where the catch-all logs `workflow_failed` and re-raises; the step result is never recorded, and the flag is never consulted.

Audited the whole PR diff for the same overclaim:

1. workflows/README.md: main fix. Reworded the Error Handling intro to "any step that returns StepResult(status=FAILED, ...)" and promoted the parenthetical structural-validation note into the Notes block. Added a new "Scope: returned failures only" note that names the exception path explicitly and tells step authors how to bring the flag into scope for exceptional code (catch internally and return FAILED with the failure encoded in `output`).
2. tests/test_workflows.py: section comment used "when an executable step fails", same ambiguity. Tightened to "when a step returns StepResult(status=FAILED, ...)" and added a sentence calling out that unhandled exceptions are out of scope.
3. src/specify_cli/workflows/engine.py: already correct ("any step that returns FAILED" in the validator comment; "lets the pipeline route around the failure" in the execute path). No change.

Engine semantics and test bodies are unchanged. Docs-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
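The distinction drawn in the message above (a returned `FAILED` result versus an exception raised out of `execute()`) can be illustrated with a toy run loop. This is only a sketch under stated assumptions: `Step`, `run_steps`, the `COMPLETED` member, and the halt-by-`break` behaviour are hypothetical stand-ins, not the code in `engine.py`; only the names `StepResult`, `StepStatus.FAILED`, `output`, and `continue_on_error` come from the commit.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class StepStatus(Enum):
    # Stand-in enum; only FAILED is named in the commit.
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class StepResult:
    # Stand-in result type with just the two fields the commit mentions.
    status: StepStatus
    output: dict[str, Any] = field(default_factory=dict)


@dataclass
class Step:
    # Hypothetical step record: an id, a callable, and the flag.
    id: str
    execute: Callable[[], StepResult]
    continue_on_error: bool = False


def run_steps(steps: list[Step]) -> dict[str, StepResult]:
    """Run steps in order; stop at the first FAILED result that is not excused."""
    recorded: dict[str, StepResult] = {}
    for step in steps:
        # An exception raised here is not caught by this loop: nothing is
        # recorded for the step and continue_on_error is never looked at.
        result = step.execute()
        recorded[step.id] = result
        # The flag is consulted only for a *returned* FAILED result.
        if result.status is StepStatus.FAILED and not step.continue_on_error:
            break
    return recorded


def returns_failed() -> StepResult:
    return StepResult(StepStatus.FAILED, {"exit_code": 2, "stderr": "boom"})


def raises() -> StepResult:
    raise RuntimeError("boom")


def succeeds() -> StepResult:
    return StepResult(StepStatus.COMPLETED)
```

In this sketch, `run_steps([Step("a", returns_failed, continue_on_error=True), Step("b", succeeds)])` records both steps, while swapping `returns_failed` for `raises` lets the `RuntimeError` escape the loop even though the flag is set.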
2026-07-03 12:28:06 +08:00 · 2026-05-28 22:50:38 +07:00
parent 22c5184fa3
commit b8982a748a
2 changed files with 27 additions and 17 deletions
--- a/tests/test_workflows.py
+++ b/tests/test_workflows.py
@@ -2336,12 +2336,15 @@ steps:
 # ===== continue_on_error Tests =====
 #
 # Locks the contract documented in workflows/README.md "Error Handling"
-# section: when an executable step fails and `continue_on_error: true`
-# is declared, the engine records the step's `output` (with `exit_code`
-# and `stderr` from the failure) and its `status` (sibling key on
-# `steps.<id>`, not nested under `output`) and continues to the next
-# sibling step instead of halting the run. Gate aborts
-# (`output.aborted`) still halt regardless of the flag.
+# section: when a step returns `StepResult(status=FAILED, ...)` and
+# `continue_on_error: true` is declared, the engine records the step's
+# `output` (with `exit_code` and `stderr` from the failure) and its
+# `status` (sibling key on `steps.<id>`, not nested under `output`)
+# and continues to the next sibling step instead of halting the run.
+# Gate aborts (`output.aborted`) still halt regardless of the flag.
+# Unhandled exceptions raised out of `step_impl.execute()` are out of
+# scope for this flag — they propagate to `WorkflowEngine.execute()`
+# and abort the run.


 class TestContinueOnError:
--- a/workflows/README.md
+++ b/workflows/README.md
@@ -221,17 +221,13 @@ Aggregate results from fan-out steps:

 ## Error Handling

-By default, any step that ends in `StepStatus.FAILED` at runtime halts
-the entire run — most commonly a `shell` or `command` step exiting
-non-zero, but also any other runtime error raised during step
-execution. (Invalid workflow definitions are rejected up-front by
-`specify workflow run` before the run even starts, so structural
-validation failures never reach this code path.) Set
-`continue_on_error: true` on a step to record its result and continue
-to the next sibling step instead. When the failure was a non-zero
-exit, the exit code remains available on
-`steps.<id>.output.exit_code` so downstream `if`, `switch`, or `gate`
-steps can branch on it:
+By default, any step that returns `StepResult(status=FAILED, ...)`
+at runtime halts the entire run — most commonly a `shell` or
+`command` step exiting non-zero. Set `continue_on_error: true` on
+a step to record its result and continue to the next sibling step
+instead. When the failure was a non-zero exit, the exit code
+remains available on `steps.<id>.output.exit_code` so downstream
+`if`, `switch`, or `gate` steps can branch on it:

 ```yaml
 - id: heavy-thing
@@ -263,10 +259,21 @@ step in your own loop.

 - The field must be a literal boolean (`true` / `false`); coerced
  strings like `"true"` are rejected at validation time.
+- **Scope: returned failures only.** The flag applies to step results
+  with `status=FAILED`. Unhandled exceptions raised out of a step's
+  `execute()` method are caught one level up by `WorkflowEngine.execute()`,
+  logged as `workflow_failed`, and abort the run regardless of
+  `continue_on_error`. If a step author wants the flag to cover an
+  exceptional path, the step must catch the exception internally and
+  return `StepResult(status=FAILED, ...)` with the failure encoded in
+  `output` (e.g. `exit_code`, `stderr`, or a custom field).
 - Gate aborts (`on_reject: abort` chosen by the operator) always halt
  the run — `continue_on_error` does not override them. The flag is
  for transient/expected step failures, not for overriding deliberate
  operator decisions.
+- Structural validation runs up-front: `specify workflow run` rejects
+  invalid workflow definitions before the run is created, so
+  validation failures never reach this code path.
 - When the flag is omitted, behaviour is byte-equivalent to before
  this feature.
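
The "Scope: returned failures only" note added above says a step must catch its own exceptions and return `StepResult(status=FAILED, ...)` for `continue_on_error` to apply. One way a step author might do that is with a small decorator. This is a minimal sketch, not part of the commit: `failed_on_exception`, `FetchStep`, the `context` argument, and the `COMPLETED` member are hypothetical, and the `StepResult` dataclass is a stand-in that models only `status` and `output`.

```python
import functools
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class StepStatus(Enum):
    # Stand-in enum; only FAILED is named in the README.
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class StepResult:
    # Stand-in result type with just the two fields the README mentions.
    status: StepStatus
    output: dict[str, Any] = field(default_factory=dict)


def failed_on_exception(
    execute: Callable[..., StepResult],
) -> Callable[..., StepResult]:
    """Hypothetical helper: turn an exception into a returned FAILED result."""

    @functools.wraps(execute)
    def wrapper(*args: Any, **kwargs: Any) -> StepResult:
        try:
            return execute(*args, **kwargs)
        except Exception as exc:
            # Encode the failure in `output` so downstream steps can branch
            # on it; exit_code=1 is an arbitrary choice for this sketch.
            return StepResult(
                status=StepStatus.FAILED,
                output={
                    "exit_code": 1,
                    "stderr": f"{type(exc).__name__}: {exc}",
                },
            )

    return wrapper


class FetchStep:
    """Hypothetical step whose execute() may raise."""

    @failed_on_exception
    def execute(self, context: dict[str, Any]) -> StepResult:
        url = context["url"]  # raises KeyError when the key is missing
        return StepResult(status=StepStatus.COMPLETED, output={"url": url})
```

With the decorator in place, `FetchStep().execute({})` returns a `FAILED` result instead of raising, so a flag such as `continue_on_error: true` would have a result to act on. Catching `Exception` rather than `BaseException` leaves `KeyboardInterrupt` and `SystemExit` free to propagate.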