50 Commits

Author SHA1 Message Date
Tom Huang
6be4e56544 feat(workspace): add plan mode and Excalidraw sketch flows (#4862)
* feat(daemon): add 'plan' session mode and update related functionality

- Introduced a new session mode 'plan' alongside existing 'design' and 'chat' modes, allowing for editable document creation.
- Updated various functions and interfaces to accommodate the new session mode, including normalization and usage in commands.
- Enhanced markdown rendering capabilities by integrating micromark and its GFM extension for improved markdown processing.
- Added new input types for question forms, expanding the range of user interactions.
- Updated UI components to reflect changes in session modes and ensure proper handling of next steps based on the current mode.

* feat(chat): integrate SessionModeToggle into ChatComposer and HomeHero

- Added SessionModeToggle component to both ChatComposer and HomeHero for improved session management.
- Updated HomeHero styles to accommodate the new mode switcher layout.
- Extended the skip countdown in QuestionsPanel from 120 seconds to 10 minutes.
- Added tests to ensure the countdown functionality works as expected.

* feat(FileViewer): implement synchronized scrolling for Markdown editor and preview

- Added functionality to synchronize scrolling between the Markdown editor and preview panes.
- Introduced new types and utility functions for managing scroll behavior.
- Enhanced the MarkdownViewer component to handle scroll events and maintain scroll position across different modes.
- Updated the component's state management to accommodate the new scrolling features.

* feat(excalidraw): integrate Excalidraw into the project

- Added @excalidraw/excalidraw as a dependency in package.json.
- Updated vitest configuration to include an alias for Excalidraw.
- Imported Excalidraw CSS in the layout component for styling.
- Modified AssistantMessage component to handle optional projectId.
- Enhanced FileOpsSummary to display delete operations.
- Implemented new Excalidraw scene management in SketchEditor and FileWorkspace components.
- Updated MarkdownViewer to support file mentions and improved file handling.
- Refactored various components to accommodate Excalidraw integration and ensure compatibility with existing features.

* feat(files): improve sketch and file handoff flows

* test: align post-merge expectations

* chore(nix): refresh pnpm deps hash

* fix(workspace): stabilize sketch persistence and ci checks

* fix(workspace): address review blockers in editable files

* fix(workspace): persist cleared sketch scenes

* test(workspace): type sketch editor mock scene

* fix(workspace): serialize sketch autosaves

* fix(workspace): keep sketch save revisions current

* test(e2e): stabilize project workspace smoke flows

* fix(analytics): preserve plan mode for BYOK runs

* test(e2e): stabilize new project rail interactions

* fix(files): stop bash delete parsing at shell operators

* test(e2e): stabilize ui cold-start suites

* fix(viewer): preserve absolute markdown image sources

* feat(workspace): preload sketches and enhance markdown save options

- Added functionality to preload persisted sketches before opening the tab.
- Introduced new MarkdownSaveOptions type to manage save behavior.
- Updated saveMarkdownText to handle options for refreshing files and showing saving state.
- Enhanced FileViewer to maintain focus and selection during metadata refresh.
- Implemented a Toast component for user feedback on save and export actions.

* fix(web): stabilize markdown and sketch editor polish

* fix(web): finish sketch editor merge resolution

* chore(nix): refresh pnpm deps hash

* fix(plan): bypass discovery and stabilize markdown sync

* fix(web): simplify scene retrieval in SketchEditor component

* feat(web): enhance markdown viewer with auto-save functionality

- Implemented passive auto-save status in the MarkdownViewer component, replacing the manual Save button with an auto-save indicator.
- Introduced new hooks and state management for tracking auto-save events and displaying the last saved time.
- Added support for synchronized scrolling between the markdown editor and preview.
- Created a new markdown-scroll-sync module to handle scroll synchronization logic.
- Updated localization files to include new strings for auto-save messages.
- Added a SketchEnginePrewarm component to optimize Excalidraw loading times.

* feat(web): enhance session mode toggle with cost indicators

- Added cost tiers for each session mode in the SessionModeToggle component, providing users with a visual representation of usage costs.
- Introduced a new ModeCostTag component to display cost information alongside session mode labels.
- Updated localization files to include new keys for cost labels and notes.
- Enhanced styling for cost indicators to improve user experience and clarity.
- Refactored EntryShell to open the new project modal instead of creating a blank project directly from the rail.
- Implemented a utility function in markdown-scroll-sync to check for vertical progression in block offsets.

* feat(web): add max height adjustment for session mode description card

- Introduced maxHeight prop to the ModeDescriptionCard component to control the height of the description card based on available space.
- Implemented useLayoutEffect in SessionModeToggle to dynamically calculate and set the maximum height of the description card, ensuring it does not overlap with the project tab bar.
- Updated tests to verify that session modes display their expected usage/cost correctly in the UI.
- Enhanced localization files to include new cost-related strings for various languages.

* feat(web): implement goBack function for improved navigation and update auto-open logic

- Added a new `goBack` function to handle in-app navigation, allowing users to return to the previous route instead of a hardcoded destination.
- Updated the `navigate` function to maintain history state for better back navigation.
- Refactored auto-open logic to prioritize produced artifacts, allowing markdown files to be opened alongside HTML files.
- Updated tests to cover new navigation behavior and artifact selection logic.
- Enhanced localization files to include new descriptions for workspace actions.

* refactor(web): remove create design system functionality and update design files panel actions

- Removed the `onCreateDesignSystem` prop and associated button from the DesignFilesPanel component.
- Updated the empty state actions to include a button for creating a new document via the `onPaste` function.
- Adjusted tests to reflect the removal of the design system creation action and ensure the new document button is functional.
- Enhanced the MarkdownViewer component by adding a placeholder for the text area and removing the header bar for a cleaner interface.
- Updated localization files to include a new placeholder string for the markdown editor.

* fix(web): restore markdown placeholder translations

* refactor(web): streamline MarkdownViewer and enhance localization

- Removed unnecessary state management and reload functionality from the MarkdownViewer component for improved performance.
- Added a placeholder text for the markdown editor in multiple localization files to enhance user guidance.
- Updated styles for the save state indicator in the viewer to improve visual clarity and alignment.
- Adjusted tests to reflect changes in the MarkdownViewer and ensure proper functionality.

* fix(web): adjust SketchEditor button size and remove shortcut hints

- Reduced the icon size in the SketchEditor component from 13 to 12 for better alignment.
- Updated the removeSketchMermaidShortcutHints function to also remove the submit shortcut hints from the dialog, enhancing the user interface by decluttering unnecessary elements.
- Adjusted tests to verify the absence of shortcut hints in the modal after updates.

* fix(web): address plan mode follow-up polish

* fix(web): align ci expectations after merge

* test(e2e): stabilize project workspace helpers

* test(e2e): scale settings visual timeout

* fix(prompts): lock ElevenLabs voice picker choices

* test(e2e): pin project workspace P0 worker

* test(e2e): use rail new project entry

* test(e2e): relax app restoration startup waits

* fix(web): limit markdown pipe escaping to tables

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: Amy <1184569493@qq.com>
2026-07-01 04:38:31 +00:00
lefarcen
59bca72f7e feat(export): programmatic screenshot-based PPTX/PDF export (#4604)
* feat(export): programmatic screenshot-based PPTX/PDF export

Replace the prompt-driven "ask the agent to run python-pptx" PPTX export
with a deterministic, programmatic pipeline. The daemon renders each deck
slide to a pixel-perfect PNG via the desktop's bundled Electron Chromium
(reused over sidecar IPC — no second headless engine, so the packaged app
does not grow) and assembles a one-image-per-slide .pptx (PptxGenJS) or
raster .pdf (pdf-lib).

- sidecar-proto: RENDER_SLIDES message + DesktopRenderSlides{Input,Result}
- daemon: deck-export.ts assembler (decode + pptx + pdf), POST
  /export/pptx and /export/pdf-image routes, desktopSlideRenderer wiring
- desktop: deck-capture.ts renders the deck off-screen and captures one
  PNG per `.deck > .slide` (skips presenter-mode .mini-slide clones)
- web: exportProjectAsPptx() fetch+blob download; ProjectView swaps the
  prompt path for it
- cli: `od export pptx|pdf` dual-track closure
- remove the now-dead build-pptx-export-prompt lib + test

Tests: deck-export assembler unit tests + exportProjectAsPptx web tests.
Screenshot mode ships first; editable-rebuild modes are follow-up.

* chore(nix): refresh pnpm deps hash

* fix(export): deck PDF blank pages on opacity decks + image export capturing the modal

Two pre-existing deck-export defects surfaced while validating the new export:

- Vector PDF (Print-ready) left blank pages for decks that show one slide at a
  time via opacity: DECK_PRINT_CSS forced each .slide onto its own page but
  never reset opacity/visibility, and CSS transitions animated opacity from 0,
  so inactive slides printed blank. Force opacity/visibility/animation/
  transition in print media (both the web and desktop DECK_PRINT_CSS).

- "Export as image" captured Open Design's own format-chooser modal: the host
  compositor snapshot ran after the modal opened, so the overlay leaked into
  the PNG. Capture the clean preview before showing the modal; the cached
  snapshot is reused so opening the modal never re-captures over the overlay.

* feat(export): dual-mode renderer + reuse screenshot for image/PDF export

Unify all screenshot exports on one off-screen renderer and make them
viewport-independent:

- Renderer (deck-capture.ts) now has two modes: deck → one 1920x1080 PNG per
  slide (optionally just the slide at `index`); page (no `.slide` sections,
  e.g. a website) → a single full-document PNG at natural size. Adds `index`
  to the render input and `mode` to the result.
- Image export now renders through the daemon off-screen renderer (deck → the
  current slide at fixed size; website → the whole page as one long image),
  so the exported size no longer depends on the preview pane and can never
  capture Open Design's own UI. Falls back to the host/iframe snapshot on web.
- "Export as PDF" (UI) now produces a pixel-perfect screenshot PDF that matches
  the preview (same renderer as PPTX/image), falling back to the vector
  print path on web or on failure.
- New POST /export/image route; PPTX on a non-deck returns a clear 422.

* feat(export): smart full-page image capture + PDF back to print-view default

Full-page image export now auto-selects the capture technique (users only see
"full page"):
- captureBeyondViewport (one clean off-screen pass, no fixed-element
  duplication) when the output fits the machine's real GPU texture limit —
  queried at runtime via WebGL MAX_TEXTURE_SIZE, not hard-coded — and
  below-the-fold content actually rendered.
- scroll-segment stitch otherwise (too tall, or blank-below-fold scroll-driven
  pages like parallax landings): scrolls a viewport at a time, captures each
  frame, and stitches by real scroll offset into one long PNG. RAM-bound (a
  plain buffer, not a GPU texture), capped by a memory budget; encoded with a
  tiny dependency-free PNG encoder (node:zlib) so it bundles cleanly into the
  ESM packaged main and has no Skia dimension cap.
- Output scale derives from the window DPR / actual captured chunk, fixing a
  double-scale bug (DPR x clip.scale) that produced 4x-sized images.

PDF "Export as PDF" reverts to the print-ready vector path (instant, selectable
text) as the default; the pixel-perfect screenshot PDF stays available via
`od export pdf`.
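
The technique selection described in this commit can be sketched as a pure decision function. This is a minimal illustration under stated assumptions, not the renderer's code: `chooseCaptureTechnique`, `PageMetrics`, and its field names are hypothetical, and the texture limit is taken as an input (the commit says it is queried at runtime via WebGL MAX_TEXTURE_SIZE).

```typescript
// Hypothetical names throughout; only the two technique names come from the commit text.
type CaptureTechnique = "captureBeyondViewport" | "scrollSegmentStitch";

interface PageMetrics {
  widthPx: number;            // output width in device pixels
  heightPx: number;           // output height in device pixels
  maxTextureSize: number;     // assumed to come from WebGL MAX_TEXTURE_SIZE at runtime
  belowFoldRendered: boolean; // whether below-the-fold content actually rendered
}

function chooseCaptureTechnique(m: PageMetrics): CaptureTechnique {
  // One-shot capture is only safe when both dimensions fit in a single GPU texture.
  const fitsTexture = m.widthPx <= m.maxTextureSize && m.heightPx <= m.maxTextureSize;
  // Too tall, or blank below the fold: fall back to scrolling and stitching.
  return fitsTexture && m.belowFoldRendered ? "captureBeyondViewport" : "scrollSegmentStitch";
}
```

Keeping the decision separate from the capture calls makes it easy to unit-test without a browser window.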

* fix(export): PDF defaults to the CJK-safe screenshot path, not vector printToPDF

Chromium's vector printToPDF embeds no fonts in the packaged runtime and drops
CJK glyphs entirely — a Chinese page exported to "PDF" lost all its Chinese text
(only Latin survived). The off-screen screenshot renderer (already used for
image/PPTX) rasterizes the real browser render, so CJK is always correct.

"Export as PDF" now produces a pixel-perfect screenshot PDF that matches the
preview (one page per deck slide, or the whole page for a website), falling back
to the vector/browser print path only on web or on failure. Verified: a Chinese
site that lost its text under vector printToPDF renders fully under the raster
path.

* perf(export): stitch scroll-segments with Electron's native PNG encoder

The scroll-segment path was slow (~100s for a long parallax page) because of a
hand-written PNG encoder: a per-pixel JS BGRA→RGBA loop over tens of millions of
pixels plus zlib.deflate of a ~110MB high-entropy buffer.

Replace both with Electron's native image pipeline: stitch chunks as BGRA with
one Buffer.copy per chunk (capturePage already returns BGRA, which is what
createFromBitmap wants — no channel swap, no per-pixel JS) and encode once via
nativeImage.createFromBitmap(...).toPNG(). createFromBitmap is a CPU bitmap, not
a GPU texture, so it is not bound by the texture limit. Removes the hand-written
PNG encoder (crc32 / chunk framing / node:zlib).

Measured on a long parallax page: image 44s→17s, PDF 101s→19s (~5x), and the
native encoder also compresses better (55MB→26MB PNG).
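
The "one copy per chunk" stitch can be sketched as follows. This is an illustrative sketch, not the project's code: `Chunk` and `stitchVertically` are hypothetical names, a plain `Uint8Array` stands in for the Node `Buffer`, and the chunks are assumed to be contiguous and equally wide (the commit stitches by real scroll offset). The final encode via `nativeImage.createFromBitmap(...).toPNG()` is not shown.

```typescript
// Hypothetical types: each chunk is a tightly packed BGRA bitmap (4 bytes per pixel).
interface Chunk {
  width: number;
  height: number;
  bgra: Uint8Array;
}

function stitchVertically(chunks: Chunk[]): Chunk {
  if (chunks.length === 0) throw new Error("no chunks to stitch");
  const width = chunks[0].width;
  let totalHeight = 0;
  for (const c of chunks) {
    if (c.width !== width) throw new Error("chunk widths differ");
    if (c.bgra.length !== c.width * c.height * 4) throw new Error("unexpected chunk byte length");
    totalHeight += c.height;
  }
  const out = new Uint8Array(width * totalHeight * 4);
  let offset = 0;
  for (const c of chunks) {
    // Rows are full-width, so a whole chunk is one contiguous block copy:
    // no channel swap and no per-pixel loop.
    out.set(c.bgra, offset);
    offset += c.bgra.length;
  }
  return { width, height: totalHeight, bgra: out };
}
```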

* fix(export): render the image only on Save, after the format is chosen

Image export was rendering eagerly when the modal opened (a holdover from when
the capture had to run before the modal to avoid catching the overlay). Now the
desktop path renders off-screen and can't see the modal, so capture moves to the
Save click: open modal → pick format → Save → render + encode + download.

The Save button shows the "Saving…" state during the render; for the web-only
host-compositor fallback the modal is hidden during the brief capture so it
can't leak into the image. The snapshot is cached, so switching format after a
render re-encodes without re-capturing.

* perf(export): raster PDF embeds full pages as JPEG (52MB -> a few MB)

The screenshot PDF embedded full-page captures as PNG, so a photo-heavy long
page produced a ~52MB PDF. Full pages now render as JPEG (quality 82) — visually
near-identical for web screenshots but ~10x smaller. Deck slides stay PNG (crisp
text/graphics); image export still uses a lossless PNG source the client
re-encodes to the user's chosen format.

Threads a `pageImageFormat` hint through the render input; the desktop renderer
encodes page mode as JPEG (CDP captureScreenshot format:jpeg / nativeImage
toJPEG) and the daemon assembler embeds with embedJpg vs embedPng per image.

* feat(export): pre-pass for reveal-on-scroll + deck image as one long image

Two improvements found while testing real landing pages:

1. Pre-pass before full-page capture: freeze animations/transitions and scroll
   the whole page once (then back to top) so reveal-on-scroll content
   (IntersectionObserver / AOS / lazy images) is triggered and holds. This is
   the standard full-page-screenshot technique. Result: pages that previously
   came back blank-below-the-fold (and fell to the duplicate-prone scroll-
   stitch) now succeed with a clean single-pass captureBeyondViewport.
   Verified: a reveal-on-scroll landing page renders fully in one shot (no black,
   no duplicates); only pure JS-scrollY parallax (which re-hides at scrollY=0)
   still falls back to scroll-stitch.

2. Image export of a deck now stitches every slide top-to-bottom into one tall
   image (the "whole deck as one picture") via the native BGRA stitch, capped so
   a long deck can't exceed the bitmap limit. Ordinary pages remain a single
   full-page capture; a specific slide index is still honored if given.

* fix(export): show the loading toast at the start (not after) + track real completion

The "Exporting" toast was set inside the promise's .then(), so for a multi-second
screenshot export it only appeared AFTER the export finished — looking frozen the
whole time. Now fireShareExport shows a loading toast immediately, clears it on
success, and shows an error toast on failure. The loading toast TTL is raised to
60s so it survives a long export, and PPTX threads its real promise (was fire-
and-forget) so the toast reflects actual completion. Adds fileViewer.exportFailed
across all locales.

* fix(export): detect scroll-driven pages by comparison, not by color

The blank-below-fold heuristic (flat-color fraction) was unreliable: a dark-
themed page that renders fine (Mindloop, 89% near-black below the fold) looked
"blanker" than a scroll-driven page that genuinely fails (Luxury, 78%), so the
black middle slipped through and exported with a black band.

Replace it with a color-independent comparison: render the document's MIDDLE
band two ways — scrolled into view (real content) vs captureBeyondViewport at
scroll 0 (what the one-shot produces). If they differ significantly, the page is
scroll-driven and we use scroll-segment stitch; otherwise the clean one-shot.
Verified: Luxury now exports full content (stitch), Mindloop stays a clean
one-shot, dark designs are no longer false-flagged.

* fix(export): recognize nested-`.slide` decks (export all slides + show PPTX)

A deck whose slides are `.slide` nested under `.deck-viewport`/`.deck-stage`
(not direct children of `.deck`/`body`) was missed: PDF/image exported only the
first slide and the PPTX option was hidden, even though the slide pager showed
"1 / 9".

- Renderer: find slides via `.slide` anywhere, filtering out presenter-mode
  clones (`.mini-slide`/`.overview`/`.thumb`) in-page, instead of the rigid
  `.deck > .slide` selector. showSlide now also sets `visibility:visible` and
  toggles the common active-slide classes (active/visible/is-active/current) so
  decks that hide via `visibility:hidden` and gate reveals on `.visible` render
  every slide; animations are frozen so reveals reach final state instantly.
- UI: PPTX export shows whenever the artifact is deck-like (incl. the
  content-detected `.slide` decks that drive the pager), matching the pager.

Verified on a nested-`.slide` deck: PPTX = 9 slides, image = 9 stitched, all
with content (previously only slide 1).

* perf(export): cache /raw/ assets + halve per-slide round trips

Two performance wins for screenshot export (covers + live preview benefit too):

1. /raw/ now emits ETag + Last-Modified + Cache-Control: no-cache, and answers
   conditional GETs with 304. Covers, live preview, and the screenshot export
   window all load project HTML + its fonts/CSS/images through /raw/, and in the
   packaged app the hidden export window shares the same Chromium session/cache
   as the web UI — so a second load reuses already-downloaded bytes instead of
   re-fetching every asset. The validators are derived from file size+mtime, so
   any agent rewrite changes them and busts the cache immediately (no-cache keeps
   it always-revalidate, never silently stale). Previously /raw/ sent no cache
   headers at all, so nothing was reusable.

2. Deck slide capture merges the slide-show DOM toggle and its two-frame settle
   into a single executeJavaScript round trip (showSlide returns the settle
   Promise) instead of two separate main<->renderer hops. Output is identical;
   the loop is measurably faster and the saving scales with slide count.

Tests: apps/daemon/tests/project-raw-cache.test.ts covers the validators, 304 on
If-None-Match / If-Modified-Since, cache-bust after rewrite, and the streamed
media path. The merge is correctness-preserving by construction (same DOM ops,
same two-frame settle).
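
A minimal sketch of validators derived from file size and mtime, plus the conditional-GET check. The function names, the ETag string format, and the header-object shape are assumptions for illustration, not the daemon's actual code. The sketch treats If-None-Match as decisive when present, which is the precedence HTTP specifies.

```typescript
// Hypothetical shapes; the real route works on fs stats and request headers.
interface FileStat { size: number; mtimeMs: number; }
interface ConditionalHeaders { ifNoneMatch?: string; ifModifiedSince?: string; }

function validatorsFor(stat: FileStat): { etag: string; lastModified: string } {
  // Weak ETag: size + mtime identify a revision without hashing the bytes.
  const etag = `W/"${stat.size.toString(16)}-${Math.floor(stat.mtimeMs).toString(16)}"`;
  return { etag, lastModified: new Date(stat.mtimeMs).toUTCString() };
}

// Returns true when the server may answer 304 Not Modified.
function isFresh(stat: FileStat, headers: ConditionalHeaders): boolean {
  const { etag } = validatorsFor(stat);
  const opaque = (tag: string) => tag.trim().replace(/^W\//, ""); // weak comparison
  if (headers.ifNoneMatch !== undefined) {
    // If-None-Match wins; If-Modified-Since is ignored when it is present.
    return headers.ifNoneMatch
      .split(",")
      .some((tag) => tag.trim() === "*" || opaque(tag) === opaque(etag));
  }
  if (headers.ifModifiedSince !== undefined) {
    const since = Date.parse(headers.ifModifiedSince);
    // HTTP dates have one-second resolution, so compare whole seconds.
    return !Number.isNaN(since) && Math.floor(stat.mtimeMs / 1000) * 1000 <= since;
  }
  return false;
}
```

Because the ETag includes millisecond mtime and size, a rewrite within the same second still changes it even though Last-Modified stays the same.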

* perf(export): hand rendered images to the daemon as files, not base64 IPC

The desktop renderer used to return every rendered slide/page as a base64 data
URL inside the JSON IPC reply. For large images (photo-heavy decks) that means a
1.33x base64 blow-up plus JSON.stringify/parse of a multi-MB string plus a
multi-MB socket transfer across the desktop->daemon sidecar bridge, with a
matching RAM spike. The pixels only exist in the desktop process (it owns the
Chromium that captures them), so they must cross to the daemon — but they should
cross as a file on the shared filesystem, not as base64 in a JSON message.

Now the daemon picks a unique scratch dir under its data root
(<RUNTIME_DATA_DIR>/export-render/<id>), passes it as `outputDir` in the
RENDER_SLIDES request, the desktop writes the images there and returns their
paths in `slideFiles`, and the daemon reads them back and deletes the dir in a
finally. The desktop only ever writes to the absolute path the daemon handed it, so
this works identically in dev and packaged (desktop never infers the data root).
A unique per-request id means concurrent exports never collide. Base64 data URLs
remain a fallback for older desktop builds that don't honor outputDir.

- sidecar-proto: DesktopRenderSlidesInput.outputDir + DesktopRenderSlidesResult.slideFiles
- deck-capture: emitImages() writes files when outputDir is set (all 3 paths:
  deck per-slide, deck stitch, full-page incl. scroll-segment)
- deck-export: readSlideFiles() reads the handoff files (companion to decodeSlideDataUrls)
- import-export-routes: create/own/clean the scratch dir; prefer slideFiles

Tests: readSlideFiles unit tests; a route-level test that asserts the renderer
is handed an outputDir under the data root, the image returns, the scratch dir is
deleted after the response, and concurrent exports each get a unique dir.

* chore(export): one-line per-phase timing logs for screenshot export

A slow export now leaves a diagnosable trail instead of guesswork:

- desktop `[od-export] render`: load / assets(fonts+images) / prepare / render
  phase breakdown + total, plus mode and whether the handoff used files.
- daemon `[od-export] assemble`: renderer(IPC) / read(file handoff vs base64) /
  assemble(pptx/pdf build) + total + byte size.

These immediately surfaced that a slow image export was dominated by the
artifact's own in-browser compile (Babel/Tailwind CDN) and uncacheable external
media — not the export pipeline (file read was ~2ms). One info line per export.

* fix(export): normalize PDF page size to points; honor --title in CLI output name

Addresses review feedback on PR #4604:

- buildScreenshotPdf sized each PDF page by the captured image's pixel
  dimensions, so the nominal page size scaled with the capture's device pixel
  ratio (a 2x retina capture produced a page twice as large as 1x). Normalize
  each page to a fixed longest-side in points (960pt; a 16:9 slide => 960x540pt,
  matching PowerPoint) with the image's aspect ratio. The image still embeds at
  full pixel resolution; only the page's size in points changes.
- `od export pptx --title "X"` forwarded the title to the server but always saved
  the local file under the source HTML's basename. Name the output after the
  slugified title when --output is not given.

Tests: PDF page-size normalization assertion (loads the PDF, checks 960pt not
the 1px capture size); sidecar-proto render-slides IPC validation (outputDir,
enum, boolean, unknown-key rejection, minimal round-trip).
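
The page-size normalization amounts to scaling the captured image so its longest side is 960pt while keeping the aspect ratio. A minimal sketch, assuming a hypothetical helper name (`pdfPageSizePt`); only the 960pt figure and the 960x540pt example come from the commit message.

```typescript
const PDF_LONGEST_SIDE_PT = 960; // from the commit message

// Hypothetical helper: maps captured pixel dimensions to a page size in points.
function pdfPageSizePt(imageWidthPx: number, imageHeightPx: number): { width: number; height: number } {
  if (!(imageWidthPx > 0 && imageHeightPx > 0)) {
    throw new Error("image dimensions must be positive");
  }
  // The same factor applies to both sides, so the aspect ratio is preserved
  // and the result no longer depends on the capture's device pixel ratio.
  const scale = PDF_LONGEST_SIDE_PT / Math.max(imageWidthPx, imageHeightPx);
  return { width: imageWidthPx * scale, height: imageHeightPx * scale };
}
```

A 1x capture (1920x1080) and a 2x capture (3840x2160) of the same slide both map to a 960x540pt page.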

* test(export): cover the server Content-Disposition filename branch

The exportProjectAsPptx happy-path test only exercised the no-header local
fallback name; production always returns a Content-Disposition. Add a test that
pins the branch the desktop download actually uses (server filename wins).

* feat(export): support arbitrary-aspect decks (not just 16:9)

Screenshot deck export no longer assumes every deck is 16:9. The renderer
measures the deck's authored slide box (the rendered rect of the first slide
with layout, so fit-to-viewport decks report the stage they actually paint),
sizes the capture window + pinned stage to it, and clips capturePage to it. The
measured pixel dimensions flow to the PPTX assembler, which derives the slide
layout from the real aspect ratio (13.333" wide, height = width/aspect) instead
of hardcoding LAYOUT_16x9 — so 4:3, square, and portrait decks export
correctly-proportioned slides and PDFs instead of being letterboxed or clipped.

Falls back to 1920x1080 / 16:9 when the slide box can't be measured or is out of
a sane range, so existing 16:9 decks are unchanged.

Verified: demo-deck measures 1920x1080 (16:9, unchanged); a 1024x768 deck
measures 4:3. Tests: PPTX layout follows 16:9 / 4:3 / 9:16 aspect (asserted via
the slide cx/cy in presentation.xml).
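
The layout derivation (13.333" wide, height = width/aspect) can be sketched as below. The function name is hypothetical, and the fallback here only covers unmeasurable input; the commit's "sane range" bounds are not stated, so they are not reproduced.

```typescript
const PPTX_WIDTH_IN = 13.333; // slide width in inches, from the commit message

// Hypothetical helper: derives the slide layout from the measured slide box.
function pptxLayoutInches(widthPx: number, heightPx: number): { width: number; height: number } {
  const measurable = Number.isFinite(widthPx) && Number.isFinite(heightPx) && widthPx > 0 && heightPx > 0;
  // Assumption: fall back to 16:9 when the slide box could not be measured.
  const aspect = measurable ? widthPx / heightPx : 16 / 9;
  return { width: PPTX_WIDTH_IN, height: PPTX_WIDTH_IN / aspect };
}
```

With this rule a 1024x768 deck yields a slide of roughly 13.333" x 10", and a 9:16 portrait deck yields a slide taller than it is wide.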

* fix(export): capture off-screen carousel slides (translated-strip decks)

showSlide only toggled the active class/opacity, so decks that paginate by
translating a flex-strip container (e.g. html-ppt-zhangzara-grove) left slide 2+
off-screen and capturePage kept grabbing the first viewport region — exporting
the wrong slide or a blank page.

showSlide now reports where the active slide actually landed; if it is off the
top-left capture stage, showDeckSlide restacks just that slide into the viewport
(clears ancestor transforms + pins it fixed at 0,0) and re-settles before
capture. This branch only runs when the slide is genuinely off-stage, so
transform-scaled fit-to-viewport decks (active slide already at 0,0, and which
DO rely on an ancestor scale) are never touched.

Verified: a 3-slide flex-strip carousel — slide 0 stays at 0,0 (untouched),
slides 1/2 detected off-stage (x=1920) and restacked to 0,0 before capture.

* fix(export): gate PPTX on a host runtime; unify + center the image toast

- PPTX export has no web-only fallback (it needs the daemon's Electron-Chromium
  screenshot renderer), so a web-only deployment showed a PPTX button that always
  failed with 501. Gate `showPptxExport`/`canPptx` on `isOpenDesignHostAvailable()`
  so the action only appears where it can succeed. Image/PDF keep their web
  fallbacks and stay shown.
- Image export showed an in-modal spinner and a separate, non-portaled "saved"
  toast that rendered off-center (its `position:fixed` resolved against the
  preview pane's transform). Route image export progress through the same
  portaled, viewport-centered `exportToast` used by PPTX/PDF: close the modal on
  Save, show a loading toast, then success/error — one consistent, centered toast
  style. Removes the now-dead imageExportBusy/imageExportCapturing/savedToast.

* fix(export): screenshot the current deck slide; never drop slides when stitching

Two more review findings on the screenshot export path:

- captureExportImageSnapshot() routed deck snapshots through the daemon without a
  slide index, so /export/image fell into the stitch-whole-deck branch even for
  "Copy screenshot" and "Export as image" — which both promise "the current
  preview". Pass the active slide index for decks so both capture the current
  slide. Stitching the whole deck into one long image is reserved for an explicit
  action (a follow-up modal toggle).
- stitchDeckSlides() capped the output at DECK_STITCH_MAX_H by stopping the loop
  and still returning ok:true, silently dropping trailing slides (~13+ on a 2x
  capture) — partial-success data loss. It now captures slide 0 to learn the
  native size, picks one uniform downscale so all `count` slides fit under the
  cap, and stitches every slide (long decks just get a smaller per-slide size).
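
The uniform-downscale rule can be sketched as a small planning function. This is an assumption-laden illustration: `planDeckStitch` is a hypothetical name, and the cap is passed in as a parameter because the actual value of DECK_STITCH_MAX_H is not given in this log.

```typescript
interface StitchPlan {
  scale: number;               // 1 when the deck already fits under the cap
  scaledSlideHeightPx: number; // per-slide height after downscaling
  totalHeightPx: number;       // height of the stitched image
}

// Hypothetical helper: pick one scale so that all `count` slides fit under the cap.
function planDeckStitch(slideHeightPx: number, count: number, maxHeightPx: number): StitchPlan {
  if (!(slideHeightPx > 0 && count > 0 && maxHeightPx > 0)) {
    throw new Error("slide height, count, and cap must be positive");
  }
  const naturalHeightPx = slideHeightPx * count;
  const scale = naturalHeightPx <= maxHeightPx ? 1 : maxHeightPx / naturalHeightPx;
  // Round down so the stitched total can never exceed the cap.
  const scaledSlideHeightPx = Math.floor(slideHeightPx * scale);
  if (scaledSlideHeightPx < 1) throw new Error("deck is too long to stitch under the cap");
  return { scale, scaledSlideHeightPx, totalHeightPx: scaledSlideHeightPx * count };
}
```

Every slide is kept; a long deck simply produces smaller slides rather than a shorter deck.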

* fix(export): drop dead `scale` param; keep deck PDF slides PNG (not JPEG)

Two more review findings:

- The render-slides contract accepted a `scale` field (validated in sidecar-proto,
  forwarded by handleScreenshotExport) that the desktop renderer never read — a
  broken protocol surface on the feature's first release. Remove it from the
  proto, the daemon route, and BuildDeckRenderInputOptions; the capture resolution
  comes from the measured stage size and host DPR. (No scale multiplier is needed
  today; if one is added later it must actually be applied in the renderer.)
- The deck branch derived its image encoding from `pageImageFormat`, so the
  screenshot-PDF path (which sets pageImageFormat='jpeg') made deck slides lossy
  JPEG — contradicting the contract ("deck slides stay PNG; JPEG is a full-page
  page-mode optimization") and adding compression artifacts to text-heavy slides.
  The deck branch now always encodes PNG; only `page` mode honors JPEG.

* fix(export): no silent truncation for tall pages; deterministic deck slide index

- The full-page scroll-stitch path clamped the document height to the RAM budget
  and returned ok:true, silently dropping everything below the cap on very tall
  pages. It now refuses with a clear "page is too tall — export as PDF instead"
  error instead of returning a truncated image as success; pages within budget
  still stitch their full height. (Decks downscale to fit since they are discrete
  slides; a continuous page is failed rather than seam-spliced at reduced scale.)
- Deck screenshots now always send a concrete slide index
  (slideState?.active ?? cached ?? 0) so a fresh open — or a deck detected only
  from `.slide` markup that never emits od:slide-state — captures the current
  slide instead of falling into the stitch-whole-deck branch.

* fix(export): explicit page-vs-deck signal; surface semantic export failures

Two review findings:

- Treating any `.slide` element as proof of a deck was too broad for the generic
  /export/image and /export/pdf-image routes — an ordinary page with carousel or
  testimonial `.slide` markup would skip full-page capture and stitch those
  elements as slides. The caller now passes an explicit `deck` flag (the web
  knows `effectiveDeck`; PPTX is deck-only): `deck:false` forces full-page
  capture, `deck:true` forces slide capture, and the `.slide`-count heuristic
  remains only as the no-signal fallback (e.g. the CLI).
- `exportProjectImageDataUrl()` returned null for every non-OK response, so a
  semantic failure (e.g. the daemon's new "page is too tall — export as PDF")
  was treated as "renderer unavailable" and silently downgraded to a partial
  visible-viewport screenshot. It now returns a discriminated result; the caller
  only falls back to a web capture when the off-screen renderer is genuinely
  unavailable (501/no-host/network) and surfaces the real error otherwise (Copy
  screenshot + Export as image both show the message).

Plumbs `deck` through sidecar-proto, the daemon route/options, exports.ts
(image + pptx + screenshot-pdf), FileViewer, and ProjectView. Proto test covers
deck round-trip + rejection.
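
The three-way `deck` signal can be sketched as follows; `resolveCaptureMode` and `CaptureMode` are hypothetical names, and `slideCount` stands in for whatever the renderer's `.slide` detection reports.

```typescript
type CaptureMode = "deck" | "page";

// Hypothetical helper: an explicit flag always wins; the heuristic is only a fallback.
function resolveCaptureMode(deck: boolean | undefined, slideCount: number): CaptureMode {
  if (deck === true) return "deck";   // caller knows this is a deck
  if (deck === false) return "page";  // carousel/testimonial `.slide` markup is ignored
  return slideCount > 0 ? "deck" : "page"; // no signal: fall back to the `.slide` count
}
```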

* fix(export): harden the file handoff (path confinement) + narrow unavailable

Three security/contract findings on the render-slides file handoff:

- sidecar-proto now rejects a non-absolute `outputDir` (was: any non-empty
  string), so a malformed render-slides request can't make desktop main mkdir +
  write outside the daemon scratch area. Negative proto test added.
- The daemon canonicalizes every returned `slideFiles` path and requires it to
  stay under the canonical `renderOutputDir` before reading — a buggy/malicious
  renderer response can no longer make /export/{pptx,pdf-image,image} read and
  stream back arbitrary files (path traversal / symlink escape). Returns 502 on
  an out-of-scope path; handoff test proves an out-of-tree path is refused and
  its bytes never reach the response.
- exportProjectImageDataUrl wrapped the whole flow in one try/catch, so a 200
  with a corrupt/unreadable payload was reported as `unavailable` and silently
  downgraded to the viewport screenshot. The `unavailable` path is now narrowed
  to transport-level failures (the fetch itself); a bad 200 payload returns a
  semantic `error` so the real failure surfaces.
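
A minimal sketch of the canonicalize-then-confine check from the second bullet, assuming Node's `fs.realpathSync` and `path.relative`. The helper name `confineToRoot` is hypothetical, and the real daemon may structure the check differently.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: returns the canonical path if `candidate` resolves to a
// location strictly inside `root`, otherwise null. Both paths must exist,
// because realpathSync resolves symlinks on disk.
function confineToRoot(root: string, candidate: string): string | null {
  let realRoot: string;
  let realCandidate: string;
  try {
    realRoot = fs.realpathSync(root);
    realCandidate = fs.realpathSync(candidate);
  } catch {
    return null; // missing or unreadable path: refuse rather than guess
  }
  const rel = path.relative(realRoot, realCandidate);
  const escapes =
    rel === "" || rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return escapes ? null : realCandidate;
}
```

Because both sides are canonicalized before the comparison, a `..` segment and a symlink pointing out of the tree are refused by the same test. A caller would read the file only when the result is non-null.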

* fix(export): CLI page/deck flag; reject out-of-range slide index

Two review follow-ups:

- `od export pdf|pptx` now accepts `--deck` / `--page` and forwards the signal in
  the request body, so the CLI hits the route with the same page-vs-deck
  semantics the UI uses (which sends effectiveDeck). Previously the CLI fell back
  to the daemon's `.slide` heuristic, so an ordinary HTML file with carousel
  markup could export as a deck from the CLI but a full page from the UI. (PPTX
  stays deck-only server-side; the flag matters for PDF.) `--deck` and `--page`
  are mutually exclusive; omitting both keeps the heuristic fallback.
- renderDeckSlides did not reject an out-of-range `index`: it fell back to
  range(count), and for image export the daemon returned slide 0 with a 200, so
  asking for slide 99 of a 3-slide deck silently returned slide 0. It now fails
  with a clear "slide index N is out of range" error.
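
The index validation in the second bullet could be sketched as below. `slideIndicesToRender` is a hypothetical name, and the exact error wording beyond the quoted phrase is an assumption.

```typescript
// Hypothetical helper: resolves which slide indices to render.
// `index` undefined means "all slides"; otherwise it must name an existing slide.
function slideIndicesToRender(count: number, index?: number): number[] {
  if (index === undefined) {
    return Array.from({ length: count }, (_, i) => i);
  }
  if (!Number.isInteger(index) || index < 0 || index >= count) {
    throw new RangeError(`slide index ${index} is out of range (deck has ${count} slides)`);
  }
  return [index];
}
```

The point of the design is that an invalid index never shares a code path with "no index given", so it cannot silently degrade to the full range or to slide 0.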

* fix(export): If-None-Match precedence; renderer IPC outage -> 502 not 400

- rawRequestIsFresh fell through to If-Modified-Since even when the request sent
  a non-matching If-None-Match, so a same-second rewrite (ETag changes, but
  Last-Modified is identical at second granularity) could 304 changed bytes when
  both headers were sent. If-None-Match is now authoritative when present
  (RFC 9110 §13.1.3) — freshness is the ETag match alone. Regression test sends a
  stale ETag + the current If-Modified-Since and expects 200.
- A rejection from desktopSlideRenderer (a 600s requestJsonIpc) — missing desktop
  process, broken socket, timeout — landed in the outer catch and became
  400 BAD_REQUEST, making renderer outages look like caller errors to retries /
  monitoring. The IPC call is now wrapped and translated to 502
  UPSTREAM_UNAVAILABLE, matching the !rendered.ok branch; the outer 400 stays for
  real request-validation / assembly errors.
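
The If-None-Match precedence rule from the first bullet, as a rough sketch. The function signature and the `Validators` shape are assumptions for illustration. The sketch also splits the header on commas, which is a simplification that would mis-handle an entity-tag containing a comma.

```typescript
interface Validators {
  etag: string; // current ETag of the file, e.g. W/"123-456"
  lastModified: Date; // current modification time
}

// Hypothetical freshness check. Header values are passed in directly so the
// sketch stays independent of any particular HTTP framework.
function requestIsFresh(
  ifNoneMatch: string | undefined,
  ifModifiedSince: string | undefined,
  current: Validators,
): boolean {
  if (ifNoneMatch !== undefined) {
    // If-None-Match is authoritative: If-Modified-Since is ignored entirely.
    if (ifNoneMatch.trim() === "*") return true;
    // If-None-Match uses weak comparison, so the W/ prefix is dropped on both sides.
    const opaque = (tag: string) => tag.trim().replace(/^W\//, "");
    return ifNoneMatch.split(",").some((tag) => opaque(tag) === opaque(current.etag));
  }
  if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    if (Number.isNaN(since)) return false;
    // HTTP dates have one-second resolution, so compare whole seconds.
    return Math.floor(current.lastModified.getTime() / 1000) <= Math.floor(since / 1000);
  }
  return false;
}
```

A request carrying a stale ETag plus the current If-Modified-Since value is not fresh under this logic, which is the case the regression test above describes.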

* fix(export): full-page stitch corrupts on fractional DPR (125%/150% scaling)

scrollSegmentStitch rounded the device pixel ratio to an integer
(`Math.round(size.width / PAGE_W)`), so on non-retina display scaling (1.25x,
1.5x) the output width and every row offset were wrong — the stitched full-page
screenshot (/export/image and the raster PDF page path) came back cropped
horizontally or with vertical gaps/overlap even though the page rendered fine.

Derive width/height/placement from the REAL captured device width and its true
(possibly fractional) ratio instead. Extracted scrollStitchGeometry /
scrollStitchRowOffset as pure helpers with a non-integer-DPR regression test
(1x / 1.25x / 1.5x / 2x).
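
A sketch of the geometry idea: derive the ratio from the real captured width and keep it fractional. The helper names and parameters here are illustrative assumptions and do not reproduce the actual scrollStitchGeometry / scrollStitchRowOffset signatures.

```typescript
interface StitchGeometry {
  ratio: number; // true device pixel ratio, possibly fractional
  outputWidth: number; // device pixels
  outputHeight: number; // device pixels
}

// Hypothetical helper: all inputs other than the captured width are CSS pixels.
function stitchGeometry(
  capturedWidthPx: number,
  pageWidthCss: number,
  pageHeightCss: number,
): StitchGeometry {
  const ratio = capturedWidthPx / pageWidthCss; // no rounding to an integer
  return {
    ratio,
    outputWidth: capturedWidthPx,
    outputHeight: Math.round(pageHeightCss * ratio),
  };
}

// Device-pixel row at which a segment captured at CSS scroll offset
// `scrollYCss` is placed in the stitched output.
function stitchRowOffset(scrollYCss: number, ratio: number): number {
  return Math.round(scrollYCss * ratio);
}
```

For a 1280px-wide page captured at 1600 device pixels, the ratio is 1.25. Rounding it to 1 would place every row 20% too early and size the output at 1280px wide, which matches the cropping and overlap described above.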

* fix(export): broaden deck slide selector; content ETag for transformed HTML

- The renderer only recognized `.slide`, but shipped decks use other slide
  contracts the print/export path already supports (e.g.
  html-ppt-zhangzara-creative-mode uses `<section data-screen-label=...>`), so an
  explicit deck export of those silently downgraded to a single full-page
  capture. Broaden SLIDE_SELECTOR to the pdf-export family
  (`.slide, [data-screen-label], .deck-slide, .ppt-slide`), and when
  `deck === true` finds no slide surfaces, fail fast with a clear error instead
  of capturing a page.
- /raw/ revalidation used the source file's mtime ETag even when the response is
  substituted by a transform (Vite dev-entry -> dist/index.html, or preview
  bridge injection). A change to dist/index.html with an unchanged source entry
  could then return a stale 304. Compute a content ETag from the actual sent
  bytes for transformed HTML; assets/fonts/images/streamed media keep the fast
  mtime ETag + early 304. Regression: rewriting only dist/index.html returns 200.

* fix(export): gate PPTX on explicit deck; page-mode DOM intact; stitch RAM budget

Four review findings:

- PPTX action was gated on the `.slide` regex (`effectiveDeck`/`looksLikeDeck`),
  so ordinary pages with carousel/testimonial `.slide` markup surfaced PPTX and
  were forced through the deck renderer (hardcoded `deck: true`). Gate
  show/canPptx on the EXPLICIT deck signal (`isDeckArtifact`: deck renderer / kind
  / presentation) instead; real decks keep PPTX, pages don't, and `deck: true`
  is now always correct. Image/PDF stay on the broader signal (they handle pages).
- renderDeckSlides ran prepareDeck (hide chrome + freeze animations) BEFORE
  deciding page vs deck, so page-mode exports rendered on a mutated DOM (content
  using generic `.notes`/`.overview` classes vanished). Split the non-mutating
  slide count from the deck-only DOM prep; page mode now captures the original
  document.
- stitchDeckSlides capped only output height, so a wide/high-DPR deck could still
  allocate >1 GiB (8192px stage @2x => W~16384 * 30000 * 4). Add a RAM byte budget
  (320MB, like the page stitcher): downscale by min(heightScale, byteScale).
- sidecar-proto render-slides test now covers the `index` field (success + reject
  negative / fractional / non-number).
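
The `min(heightScale, byteScale)` rule from the third bullet can be worked through as follows. Only the 320MB figure comes from the text. Reading it as 320 MiB, taking 30000px as the height cap (from the arithmetic in the bullet), and defining `byteScale` as a square root are assumptions of this sketch.

```typescript
// Assumed limits for illustration; see the note above on where each comes from.
const MAX_OUTPUT_HEIGHT_PX = 30000;
const RAM_BUDGET_BYTES = 320 * 1024 * 1024;
const BYTES_PER_PIXEL = 4; // RGBA

// Hypothetical helper: uniform scale factor (<= 1) to apply to both dimensions.
function stitchDownscale(widthPx: number, heightPx: number): number {
  const heightScale = Math.min(1, MAX_OUTPUT_HEIGHT_PX / heightPx);
  const bytes = widthPx * heightPx * BYTES_PER_PIXEL;
  // Scaling both dimensions by s multiplies the byte count by s^2,
  // so the largest s that fits the budget is sqrt(budget / bytes).
  const byteScale = Math.min(1, Math.sqrt(RAM_BUDGET_BYTES / bytes));
  return Math.min(heightScale, byteScale);
}
```

For the 16384 x 30000 case the height cap alone allows a scale of 1, but the unscaled bitmap would need about 1.97 GB, so the byte budget is the binding limit and the scale comes out near 0.41.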

* fix(export): image/PDF deck flag from explicit signal, not .slide heuristic

The image and screenshot-PDF exports still passed `deck: effectiveDeck` (the
`.slide` regex), so an ordinary HTML page with carousel/testimonial `.slide`
markup exported only the current card instead of the full page. Drive both off
the explicit `isDeckArtifact` signal (same as PPTX): a real deck → per-slide, a
page → full-page capture. Extracted `shouldCaptureAsDeck()` as a pure helper with
a regression test (page + slides + deck:false => page, not per-slide).

* fix(export): screenshot PDF download must prompt Save As (.pdf in allowlist)

The default Export PDF flow now streams a .pdf download via
exportProjectScreenshotPdf, but the will-download Save As hook only intercepted
.pptx + image extensions — so PDF silently wrote to the OS Downloads folder
while PPTX/images prompted. Add .pdf to SAVE_AS_EXTENSIONS with a PDF filter,
and extract saveAsDialogOptionsForFilename() as a pure helper with a runtime test
(PDF/PPTX/image prompt; uppercase matched; other extensions pass through).

* fix(export): single-shot guard for image export (no double-click duplicates)

The toast-based image export closes the modal and starts the save without an
in-flight guard (the old in-modal busy/disabled states were removed), so a fast
double-click / Enter-repeat on Save could enqueue two concurrent exports
(duplicate captures, downloads, and fireImageExportResult bookkeeping) before the
modal-close re-render removed the button. Add an imageExportInFlightRef guard
that returns early on re-entry and resets in finally — mirrors the existing
screenshotInFlightRef pattern.
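
The guard pattern described above, reduced to a framework-free sketch. The real code uses a React ref (`imageExportInFlightRef`); the closure variable and the `singleShot` wrapper here are stand-ins chosen for illustration.

```typescript
// Hypothetical wrapper: while one run is in flight, further calls return
// undefined immediately instead of starting a second run.
function singleShot<T>(task: () => Promise<T>): () => Promise<T | undefined> {
  let inFlight = false;
  return async () => {
    if (inFlight) return undefined; // re-entry: ignore the duplicate trigger
    inFlight = true;
    try {
      return await task();
    } finally {
      inFlight = false; // reset on success and on failure
    }
  };
}
```

The flag is set synchronously before the first `await`, so a second click dispatched in the same tick already sees it. Resetting in `finally` keeps a failed export from locking the button out permanently.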

* fix(export): If-Range guard on /raw/ stream; block image-modal reopen mid-export

Two non-blocking correctness issues:

- /raw/ honored Range unconditionally even with the new ETag/Last-Modified, so a
  client resuming a cached font/media download after the file changed could
  splice stale + fresh bytes. Gate Range on If-Range (RFC 9110 §13.1.5): serve
  206 only when the If-Range validator (ETag or date) still matches the current
  file, else fall back to a full 200. Regression test: stale If-Range + Range
  returns 200 with the new full length.
- The image-export single-shot guard covered handleImageExportSave, but reopening
  the modal mid-export reset the shared request/result refs, mis-attributing or
  dropping the in-flight export's analytics result. openImageExportModal now
  no-ops while an export is in flight.

* fix(export): drive image/PDF deck decision off the viewer signal (effectiveDeck)

The desktop screenshot image/PDF paths were gated on isDeckArtifact while the
vector-PDF fallback (and the viewer's own prev/next/Present) use effectiveDeck.
That diverged: a metadata-free `.slide` deck rendered as a deck in preview but
exported as a single full page on a desktop host, yet as a deck via the browser
fallback — same artifact, different output depending on host.

Drive image + screenshot-PDF off effectiveDeck (the viewer's deck decision), so
export matches what the user sees and is host-independent. PPTX keeps the
narrower isDeckArtifact: it is deck-only with no vector fallback, so it can't
diverge, and it must not offer slide export for incidental carousel markup.
Removes the now-dead isDeckForExport binding.

* test(web): update image-export specs for capture-on-Save modal flow

The image-export modal was redesigned in this PR from eager-capture-on-open
(preview + live format re-render + in-modal alert + disabled-until-ready Save)
to capture-on-Save unified with the PPTX/PDF portaled-toast flow: the dialog
just picks a format, and Save closes it and runs the single capture behind the
export toast. The 9 specs in file-viewer-image-export.test.tsx still drove the
old eager flow and failed in CI (Web workspace tests). Updated each to click
Save before asserting capture, pick the format before Save, assert the portaled
toast (role=alert error text unchanged) instead of the removed in-modal alert,
and replaced the obsolete "preparing label" spec with one proving no eager
capture happens on open or on format change.

* fix(cli): od export honors the server Content-Disposition filename

The web download helper prefers the daemon's Content-Disposition filename and
only falls back to a locally derived name. `od export` ignored it and always
synthesized the name from --title/basename, so the two surfaces could write
different filenames for the same export. Parse the header (RFC 5987 filename*
and plain filename, reduced to a hardened basename so an odd header can't steer
the write outside the cwd) and prefer it when --output is not given, keeping the
title-slug/basename fallback. Mirrors apps/web/src/runtime/exports.ts.
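
A rough sketch of that header parsing, assuming only the two forms named above (`filename*=UTF-8''...` and plain `filename=`). The function name and the regular expressions are illustrative and may not match what the CLI or apps/web/src/runtime/exports.ts actually do.

```typescript
// Hypothetical parser: returns a safe basename from a Content-Disposition
// header, or null when the header carries no usable filename.
function filenameFromContentDisposition(header: string | null): string | null {
  if (!header) return null;
  let name: string | null = null;
  // Extended form first: filename*=UTF-8''percent-encoded-value
  const ext = /filename\*\s*=\s*UTF-8''([^;]+)/i.exec(header);
  if (ext) {
    try {
      name = decodeURIComponent(ext[1].trim());
    } catch {
      name = null; // malformed percent-encoding: fall through to the plain form
    }
  }
  if (name === null) {
    const plain = /filename\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^;]+))/i.exec(header);
    if (plain) {
      name = (plain[1] !== undefined ? plain[1].replace(/\\(.)/g, "$1") : plain[2]).trim();
    }
  }
  if (name === null) return null;
  // Keep only the last path segment so the name cannot steer the write
  // outside the target directory.
  const base = name.split(/[\\/]/).pop() ?? "";
  return base === "" || base === "." || base === ".." ? null : base;
}
```

A caller would use the returned name only when `--output` is absent and fall back to the title slug or basename when the result is null.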

* fix(export): detect runtime-managed decks; image=whole deck; de-dup long pages

QA found three blocking export-fidelity issues on this PR:

1. Horizontal decks export only slide 1 (image: all such templates; PDF:
   some). Runtime-managed decks (`<deck-stage>` web component with slotted
   `<section data-screen-label>` children toggled via `data-deck-active`)
   carry no literal `class="slide"`, so the viewer's `looksLikeDeck` regex
   misses them and the UI sent an authoritative `deck:false`. The host then
   force-captured page mode (`mode:'page', slides:1`) — a full-page shot of
   whatever slide was visible. PDF same path: `deck:false` skips the host
   DECK_PRINT_CSS, so decks without their own `@media print` print one page.
   Fix: a broader EXPORT-only signal `sourceLooksLikeExportableDeck` /
   `deckExportSignal` mirroring the host's slide-surface family
   (`.slide`/`[data-screen-label]`/`.deck-slide`/`.ppt-slide`) plus
   `<deck-stage>`. Kept OUT of `effectiveDeck` so the host's deck-stage-
   incompatible prev/next nav is not surfaced as a dead "— / —" control.

2. "Export as image" of a deck returned the current slide only. It now
   stitches every slide into one long image (matching the slide count the
   viewer reports); Copy screenshot / Mark-Draw capture keep the current
   slide via `captureExportImageSnapshot({ wholeDeck })`.

3. Long-page image/PDF export duplicated a fixed/sticky hero down the
   output: the scroll-segment stitch captures the viewport per offset, so a
   pinned element was copied into every segment. `preparePageForCapture` now
   neutralizes `position:fixed`->absolute and `sticky`->static before
   measuring/capturing, so each pinned element renders once (the
   captureBeyondViewport path already de-dupes; the step is applied uniformly
   for consistency).

Red specs: exports.test.ts (deck detection), neutralize-positioning.test.ts
(fixed/sticky normalization).

* chore: re-trigger CI on updated main — needs-validation gate moved to merge_group (#4714)

* fix(sidecar): decode IPC frames with StringDecoder (multibyte UTF-8 corruption)

Exported CJK artifacts intermittently showed `???` / `◆?` (U+FFFD) in place of a
character — e.g. "拥挤" rendered as "拥���", "交付边界" as "交付���界". The bad
character varied between exports, the source bytes on disk were correct, and the
daemon /raw/ serve was byte-identical, so it was not a font or storage problem.

Root cause is in the generic JSON-IPC transport. Both the server and client
socket readers did `buffer += chunk.toString()` into a STRING. A render request
carries the full artifact HTML over the desktop IPC; when the payload spans
multiple `data` events, a multibyte UTF-8 character (CJK = 3 bytes) straddling a
chunk boundary is decoded per-chunk, turning each partial half into U+FFFD. Small
payloads never hit a boundary (hence "works in my repro, breaks on the real
file"); large real artifacts do, at whichever character lands on the split.

Fix: feed each chunk through a per-connection `StringDecoder("utf8")`, which
holds an incomplete trailing byte sequence until the next chunk completes it.

Verified end-to-end against the QA "Blog Post" artifact in a packaged client:
"拥���" → "拥挤" after the fix. Red spec: a ~1.3 MB CJK payload round-tripped
through createJsonIpcServer/requestJsonIpc (forces multi-chunk delivery) is now
byte-exact; it fails on the pre-fix reader.
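
The failure mode and the fix can be reproduced in a few lines with Node's `StringDecoder`. This is a standalone sketch, not the transport code itself; the two function names are made up for the comparison.

```typescript
import { StringDecoder } from "node:string_decoder";

// Naive reader: decodes each chunk on its own, so a character split across
// two chunks becomes U+FFFD replacement characters.
function decodePerChunk(chunks: Buffer[]): string {
  let buffer = "";
  for (const chunk of chunks) buffer += chunk.toString("utf8");
  return buffer;
}

// Fixed reader: one decoder per connection holds back an incomplete trailing
// byte sequence until the next chunk completes it.
function decodeWithStringDecoder(chunks: Buffer[]): string {
  const decoder = new StringDecoder("utf8");
  let buffer = "";
  for (const chunk of chunks) buffer += decoder.write(chunk);
  return buffer + decoder.end();
}

// "拥挤" is 6 bytes; cutting at byte 4 splits the second character.
const bytes = Buffer.from("拥挤", "utf8");
const chunks = [bytes.subarray(0, 4), bytes.subarray(4)];
console.log(decodePerChunk(chunks).includes("\uFFFD")); // true
console.log(decodeWithStringDecoder(chunks) === "拥挤"); // true
```

The decoder has to live as long as the connection: creating a new one per `data` event would reintroduce the same per-chunk decoding.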

* fix(export): vector deck PDF rendered only the first slide

A deck exported via the vector PDF fallback (POST /export/pdf →
exportPdfFromHtml) collapsed to a single page: only the runtime-active slide
appeared. Decks gate visibility with `.slide:not(.active){display:none!important}`
(specificity 0,2,0); the host DECK_PRINT_CSS `.slide{}` rule (0,1,0) cannot win
that cascade, so every non-active slide stayed `display:none` in print.

Fix: before printToPDF, mark every slide surface active (the same class set the
screenshot path toggles in deck-capture's showSlide), so the deck's own
`.slide.active` styling applies to all slides and DECK_PRINT_CSS paginates them
one per page. Shadow-DOM `<deck-stage>` decks are unaffected (their own
`@media print` already lays out every slide).

Verified with an offscreen printToPDF of a 12-slide `.slide`-class deck: 1 page
-> 12 pages, each a distinct centered slide.

* fix(export): screenshot PDF fails fast instead of masking errors as vector PDF

Per review: the raster-PDF path fell back to the vector `exportProjectAsPdf` for
EVERY non-ok screenshot result, so a semantic failure (bad deck routing, a 422,
a renderer-side 502, "page too tall", unreadable output) silently handed the user
a different (vector) PDF — the exact fidelity/CJK-glyph class of bug the
screenshot path exists to avoid.

exportProjectAsPptx now returns the same tri-state as exportProjectImageDataUrl:
`{ok:true}` / `{ok:false,unavailable:true}` (501 or transport — caller may fall
back) / `{ok:false,error}` (semantic — must surface). The PDF action only falls
through to the vector path on `unavailable`; a semantic error throws and is shown
in the export toast (onErr now prefers the export's own user-facing message).
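
The tri-state and the caller's fallback rule might be modelled like this. The `ExportResult` type mirrors the three shapes listed above; `exportPdf`, `runRaster`, and `runVector` are hypothetical stand-ins, and the sketch is synchronous only to stay short.

```typescript
// The three result shapes named above, as a union.
type ExportResult =
  | { ok: true }
  | { ok: false; unavailable: true }
  | { ok: false; error: string };

// Hypothetical caller: `runRaster` and `runVector` stand in for the real exports.
function exportPdf(runRaster: () => ExportResult, runVector: () => void): "raster" | "vector" {
  const result = runRaster();
  if (result.ok) return "raster";
  if ("error" in result) {
    throw new Error(result.error); // semantic failure: surface it, never downgrade
  }
  runVector(); // renderer genuinely unavailable: fallback is allowed
  return "vector";
}
```

Keeping `unavailable` and `error` as separate variants means the fallback decision is made on the type of failure, not on the absence of a result.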

* chore(nix): refresh pnpm deps hash

* fix(export): guard deck capture against stale-frame duplicate pages

QA saw a deck export with duplicate pages (e.g. two identical 目录 pages, a
slide silently missing). Root cause is a compositor race: after showing slide i,
`capturePage()` can return the PREVIOUS slide's frame when the new slide hasn't
painted yet (more likely on slower / loaded machines and slides with heavier
reveal content), so the loop emits an exact duplicate of the prior page. The
source has 12 distinct slides and live navigation is fine — the race is purely in
the offscreen capture loop.

Fix: after each capture, compare a cheap sampled checksum to the previous
slide's; if byte-identical (which can't happen for distinct slides), wait for
more frames and re-capture (bounded, 4 attempts × ~60ms). Two genuinely-identical
adjacent slides exhaust the retries and emit once. Applied to both the per-slide
(PDF/PPTX) and stitch (whole-deck image) loops.

Test: imageSignature distinguishes captures by content and length. (The race
itself is timing-dependent and not reproducible on a fast/idle machine — both
file:// and packaged-http exports of the reported deck render 12 unique pages
here — so the guard hardens the failure mode rather than relying on local repro.)

* fix(export): paginate tall pages for raster PDF instead of refusing

Per review: the single-image RAM/texture guard in capturePage refused any page
taller than the budget with "page is too tall — export as PDF instead". That is
right for /export/image, but /export/pdf-image routes ordinary-page PDFs through
the same branch — and since the screenshot-PDF path now fails fast (no silent
vector fallback), a long landing page exported as PDF hit a self-contradictory
hard error and regressed tall-website PDF export.

Fix: the PDF path (`jpeg`) now paginates a too-tall page into a multi-page raster
PDF — captureBeyondViewport per texture/RAM-bounded chunk, one image per chunk,
which the daemon assembles into one PDF page each. /export/image (png) keeps its
refusal (it has nowhere to paginate to). tallPageChunkHeights extracted + tested.

Verified offscreen: a ~20400px page → PDF path returns 3 paginated pages
(ok/page), image path still refuses.
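
The chunking step could be sketched as below. This is not the extracted tallPageChunkHeights helper; the name `chunkHeights` and the 8192px cap used in the usage note are hypothetical, since the log does not state the real budget.

```typescript
// Hypothetical helper: splits a page of `totalHeight` pixels into consecutive
// chunk heights, none taller than `maxChunkHeight`.
function chunkHeights(totalHeight: number, maxChunkHeight: number): number[] {
  if (totalHeight <= 0 || maxChunkHeight <= 0) return [];
  const heights: number[] = [];
  for (let y = 0; y < totalHeight; y += maxChunkHeight) {
    heights.push(Math.min(maxChunkHeight, totalHeight - y));
  }
  return heights;
}
```

With an assumed cap of 8192px, `chunkHeights(20400, 8192)` yields `[8192, 8192, 4016]`: three chunks, each of which would become one PDF page.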

* fix(export): capture deck slides via CDP (structural fix for duplicate pages)

Replaces the pixel-compare/retry guard (88d21c7) with a structural fix, per
review feedback: comparing each capture to the previous slide is the wrong
abstraction (it can't tell a stale frame from two genuinely-identical adjacent
slides, and wastes retries on the latter).

Root cause: the deck path used `webContents.capturePage()`, which grabs the last
COMPOSITED frame and can return the previous slide's frame when the just-shown
slide hasn't composited yet — emitting an exact-duplicate page. The page path
never had this because it uses CDP `Page.captureScreenshot`, which renders the
CURRENT DOM to a fresh frame.

Fix: deck capture now uses CDP `Page.captureScreenshot` too (attach the debugger
once around the deck loop; fall back to capturePage if it can't attach). The
captured pixels always reflect the slide just shown — no compare, no retry, no
identical-slide edge case. Animations/transitions are already frozen
(prepareDeckStage), so each slide is captured at its final state, never a
mid-page-turn frame. Removed imageSignature + the retry loop.

Verified: 12-slide deck still stitches to 12 distinct slides at the correct dims.

* fix(export): current-slide capture of runtime decks uses the visible slide

Per review: deckExportSignal makes runtime-managed decks (<deck-stage> /
data-screen-label) exportable, but the current-slide path (Copy screenshot /
annotation capture) still resolved the slide index as `slideState?.active ?? 0`.
Those decks are deliberately kept out of effectiveDeck, so the viewer never
receives their active-slide bridge and slideState is null — meaning Copy
screenshot always off-screen-rendered slide 0 instead of the slide on screen,
inconsistent with the PPTX/PDF fix on the same templates.

Fix: planDeckImageCapture() decides per capture — whole-deck (Export as image),
ordinary pages, and tracked .slide decks render off-screen (with the active index
when tracked); an untracked deck's current-slide capture skips the off-screen
path and falls through to the visible host snapshot (which IS the current slide).

Tests: planDeckImageCapture unit cases (exports.test.ts) + a FileViewer
regression — Copy screenshot of a data-screen-label deck with no tracked slide
uses the host snapshot and does NOT off-screen-render slide 0.

* fix(export): don't mask post-response failures / debugger-less tall PDF as fallback

Two review edge cases:

- exportProjectAsPptx wrapped resp.blob() + triggerDownload() in the same
  try/catch that maps to `{unavailable:true}`, so a corrupt body or a
  client-side download failure (after a 200) was reported as "renderer
  unavailable" — letting the PDF caller silently downgrade to the vector path.
  Only the fetch (transport) and 501 now map to `unavailable`; post-response
  failures return `{error}` so they surface. Unit test added.

- capturePage's no-debugger fallback still returned "page is too tall — export
  as PDF instead" for the PDF path (jpeg). Pagination needs CDP, and we only
  reach this branch when the debugger can't attach, so it now surfaces a
  distinct retryable error instead of telling the user to switch to the format
  they already chose. (The debugger attaches in normal packaged exports; this is
  a rare transient.)

* fix(export): distinguish CDP attach failure from later CDP command failure

Per review: when the debugger attached but a later CDP command threw (a real
Chromium/GPU/clip error), the broad catch swallowed it and the too-tall PDF
refusal reported "renderer is busy, please retry" — hiding the actionable error
and sending users into a pointless retry loop. The retryable busy message is
only correct when the attach itself failed.

Track the caught CDP error (cdpError) separately: the too-tall PDF branch now
surfaces the real CDP error message when the debugger was available but a command
failed, and reserves the retryable "busy" message for true attach contention.

* fix(export): reject `od export pptx --page`; test the tall-PDF error split

Two review items:

- CLI: `od export pptx --page` advertised a page mode that can never work (the
  daemon forces deck mode for /export/pptx). Reject `--page` for pptx with a
  clear contract error pointing at `od export pdf --page` instead of silently
  ignoring it.

- Lock down the cdpError split with a regression: extract tooTallPdfErrorMessage
  and unit-test both branches — attach failure → retryable "busy" message;
  attached-but-CDP-command-failed → the real Chromium/GPU error surfaces (and
  neither tells the user to "export as PDF", which they already chose).
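
The two-branch message selection from the second bullet, as a minimal sketch. The parameter shape and the "PDF export failed" prefix are assumptions; only the "renderer is busy, please retry" wording is quoted from an earlier entry in this log.

```typescript
// Hypothetical shape: `cdpError` is the error caught from a CDP command after a
// successful attach, or undefined when the debugger never attached.
function tooTallPdfErrorMessage(cdpError?: Error): string {
  if (cdpError) {
    // Attached, but a later command failed: show the real cause.
    return `PDF export failed: ${cdpError.message}`;
  }
  // Attach itself failed: the only case where retrying is useful advice.
  return "renderer is busy, please retry";
}
```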

* fix(export): keep current-view captures viewport-based; reject weak If-Range

Two review items:

- planDeckImageCapture sent ordinary-page Copy screenshot / captureViewport
  annotation through the off-screen renderer (useOffscreen:true, no index), which
  renders the WHOLE document instead of the visible region — a regression for
  screenshot/annotation viewport semantics. Now: Export-as-image (wholeDeck) and
  tracked-deck current-slide still render off-screen; an ordinary page's
  current-view capture (and an untracked deck's) falls back to the visible host
  snapshot. Tests updated.

- ifRangeAllowsPartial accepted weak entity-tags for a 206, but RFC 9110 §13.1.5
  requires a strong validator and our /raw/ ETag is weak (W/"size-mtime"). A
  same-size rewrite / mtime collision could splice stale + fresh bytes under a
  matching weak tag. Now any entity-tag If-Range falls back to full 200; only the
  date form authorizes a range. project-raw-cache.test.ts pins it (weak-ETag
  If-Range → 200, fresh date → 206, stale date → 200).
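
The If-Range rule from the second bullet might be sketched as follows. The signature is an assumption (the real ifRangeAllowsPartial may take different arguments), as is the choice to allow a plain Range request that carries no If-Range header.

```typescript
// Hypothetical check: returns true when a Range request may be answered with 206.
function ifRangeAllowsPartial(ifRange: string | undefined, lastModified: Date): boolean {
  if (ifRange === undefined) return true; // plain Range request, no precondition
  const value = ifRange.trim();
  // Any entity-tag form is refused here, because the only ETag this server
  // issues is weak and cannot authorize a byte range.
  if (value.startsWith('"') || value.startsWith("W/")) return false;
  const date = Date.parse(value);
  if (Number.isNaN(date)) return false;
  // The date must match the current Last-Modified exactly, at one-second resolution.
  return Math.floor(lastModified.getTime() / 1000) === Math.floor(date / 1000);
}
```

When the function returns false the server ignores the Range header and sends the full representation with a 200, so a resumed download never mixes bytes from two versions of the file.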

* fix(export): resolve imported-folder project files via metadata.baseDir

Per review: the new screenshot export routes (and the vector /export/pdf) read
the source with readProjectFile() and no metadata, so it fell back to
<OD_DATA_DIR>/projects/:id and returned FILE_NOT_FOUND for imported-folder
projects (whose workspace lives at metadata.baseDir) even though the file renders
in the UI.

Thread project metadata through: BuildDeckRenderInputOptions and
BuildDesktopPdfExportInputOptions gain a `metadata` field passed to
readProjectFile; handleScreenshotExport and the /export/pdf route load it via
getProject(db, id)?.metadata. HTTP regression added: an imported-folder project
(created through /api/import/folder) hitting /export/image now returns 200 with
the rendered image instead of 404.

* chore(nix): refresh pnpm deps hash

* Show PPTX export for detected decks

* Fix deck export detection for page captures

* Route CLI image export through screenshot renderer

* Route legacy image export through screenshot renderer

* fix(export): per-viewport PDF pagination + parallax-faithful image capture

A long non-deck page exported to PDF came out as one giant page, and the same
page exported as an image dropped its scroll-pinned text. Both stemmed from the
page-capture path: PDF assembled one PDF page from a single tall capture, and
the image path flattened fixed/sticky positioning (fixed->absolute,
sticky->static), which deleted parallax headline/foreground text.

- PDF: add a `paginate` render-slides input. A non-deck page now captures one
  image PER VIEWPORT, top to bottom, and the daemon assembles a multi-page PDF
  (one screen per page). Decks still paginate per slide; `paginate` applies to
  page mode only.
- Image: capture each viewport live at its real scroll offset and stitch into
  one tall image, keeping fixed/sticky CSS as authored -- the SAME capture logic
  as the PDF path, differing only in assembly. Drop the captureBeyondViewport
  one-shot and its isScrollBound heuristic (it rendered the whole document at
  scroll 0 and got parallax/reveal-on-scroll content wrong), and drop the
  fixed-neutralization step (it dropped pinned text).

Adds paginateViewportBand unit coverage and a paginate IPC round-trip/rejection
case; removes the now-unused neutralizeFixedAndStickyPositioning helper and test.

* fix(export): capture deck-stage at authored size; share pptx in contracts

Addresses two review findings on the screenshot export surface.

- deck-stage fidelity (blocking): the <deck-stage> runtime fits its canvas to
  the viewport with `transform: scale(...)` by default and documents that PPTX
  export must set the `noscale` attribute so the DOM is captured at the authored
  slide size. The renderer never set it, so a deck whose authored canvas differs
  from the 1920x1080 capture viewport was measured + captured at the preview-
  scaled size. prepareDeckStage now sets `noscale` on every <deck-stage> (a
  no-op for plain `.slide` decks).

- contract boundary: `pptx` was a first-class CLI/daemon export format but the
  shared `EXPORT_FORMATS` in `@open-design/contracts` still declared only
  `['pdf', 'image']`, so the capability was typed through an ad hoc local union.
  Add `pptx` to the shared contract, import it in the CLI instead of a local
  duplicate, and route `pptx` through the generic `/export` route (to the
  screenshot renderer) alongside `image`.

* fix(export): route CLI --format pdf through the raster screenshot PDF path

`od export --format pdf` still posted to the generic `/export` route, whose
desktopArtifactExporter renders vector PDF via printToPDF() and drops CJK glyphs
in the packaged runtime. The web UI was deliberately switched to the raster
`/export/pdf-image` path for that reason, so the CLI diverged from the UI on the
exact decks/pages this feature targets.

Route all three CLI formats through the screenshot renderer (pdf →
/export/pdf-image, matching the UI). Extract the format→route mapping into a
pure `exportRoutePath` helper so it is unit-testable without executing the CLI
entrypoint, and assert no format falls through to the vector `/export` route.
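
A pure format-to-route mapping of the kind described above could look like this. The route strings are the ones that appear elsewhere in this log; the exact shape of the real `exportRoutePath` helper is an assumption.

```typescript
type ExportFormat = "pdf" | "image" | "pptx";

// Hypothetical mapping: every format goes to a screenshot-renderer route,
// and none falls through to the generic `/export` route.
function exportRoutePath(format: ExportFormat): string {
  switch (format) {
    case "pdf":
      return "/export/pdf-image"; // raster screenshot PDF, matching the UI
    case "image":
      return "/export/image";
    case "pptx":
      return "/export/pptx";
  }
}
```

Because the function has no side effects, a unit test can enumerate all formats and assert on the returned path without running the CLI entrypoint.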

* fix(export): route generic POST /export pdf through the raster screenshot path

The shared ExportRequest contract advertises `pdf` as part of the screenshot-
rendered export surface, but the generic `/export` route still sent `format:
'pdf'` to desktopArtifactExporter's vector printToPDF() path, which drops CJK
glyphs in the packaged runtime. So a contract caller hitting POST /export got the
lower-fidelity PDF while the dedicated /export/pdf-image route, the UI, and the
CLI all use the raster screenshot PDF — the API surface was internally
inconsistent.

Route every /export format (pdf included) through handleScreenshotExport so the
generic endpoint matches the dedicated routes and the contract; drop the now
unused desktopArtifactExporter / buildDesktopArtifactExportInput wiring from the
route. Add an HTTP-level regression asserting POST /export with format:'pdf'
runs the screenshot renderer and streams back a real (%PDF) raster PDF.

* Restore editable PPTX export

* Clarify authored slide measurement

* Enable PPTX export from browser

* Stabilize large editable PPTX text

* Use workspace root for PPTX export resource

* Let CLI exports auto-detect decks

* Avoid tracking generated PPTX bundle

* Fix generic export deck routing

* Fix deck export routing regressions

* Add CLI page-mode export flag

* Preserve authored deck capture DOM

* Load PPTX vendor bundle from gzip resource

* Harden export CLI and PPTX bundle loading

* Preserve editable PPTX slide background images

* Preserve export render sizing contract

* Classify screenshot export request errors

* Preserve freeform slide deck exports

* Preserve UTF-8 export filenames

* Align export routing and CLI JSON contract

* Preserve export compatibility paths

* Keep PDF export on screenshot renderer

* chore(nix): refresh pnpm deps hash

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-29 13:15:14 +00:00
lefarcen
c0b5db2cd4 [codex] Bump bundled Vela CLI to 0.0.19 (#4888)
* chore: bump bundled vela cli to 0.0.19

* chore: refresh nix pnpm deps hash
2026-06-29 12:08:42 +00:00
Tom Huang
29b138f7a3 feat(brands): turn any brand into a reusable design system (#4691)
* Implement brand management routes and CLI support

- Added `brand-routes.ts` to handle HTTP endpoints for brand operations: listing, extracting, retrieving details, deleting, and serving logos.
- Introduced `brands-cli-help.ts` for CLI commands related to brand management, including usage instructions for listing, creating, retrieving, and deleting brands.
- Updated `cli.ts` to integrate brand commands into the existing CLI structure, allowing headless management of brands via the command line.
- Created supporting files for brand metadata handling, including `design-md.ts` for rendering brand information in markdown format and `index.ts` for the brand engine API.
- Implemented `prefetch.ts` to fetch and process brand material from specified URLs, ensuring a streamlined extraction process.
- Enhanced server setup in `server.ts` to register brand routes and manage brand-related data effectively.

This commit establishes a comprehensive framework for managing brands within the application, facilitating both HTTP and CLI interactions.

* Enhance memory management and onboarding experience

- Introduced canonical profile labels to ensure consistent handling of user input in profile forms, preventing duplicate entries.
- Updated the `parseProfileBody` and `captureProfileFromForm` functions to utilize the new canonical label matching.
- Added a memory callout section in the onboarding view to highlight the benefits of memory usage, including personalized responses and reduced setup questions.
- Implemented new UI elements in the onboarding view to improve user engagement with memory features.
- Expanded i18n support for new onboarding messages related to memory benefits across multiple languages.

* Refactor onboarding flow and enhance design system integration

- Updated the onboarding process to include a new brand extraction step, replacing the previous newsletter step.
- Adjusted the tracking logic to reflect the new onboarding steps, ensuring accurate analytics for user progress.
- Improved the UI for the onboarding view, including new input fields for email collection during the brand extraction phase.
- Refined the EntryShell component to remove outdated comments and clarify the onboarding renderer's purpose.
- Enhanced CSS styles for the onboarding steps to improve layout and user experience.
- Updated internationalization strings across multiple languages to reflect changes in the onboarding flow and brand extraction messaging.

* Add brand management features and enhance font handling

- Introduced new modules for managing brand assets, including `chrome.ts` for headless Chrome operations and `fonts.ts` for self-hosting web fonts.
- Implemented `prefetch.ts` to streamline the brand material extraction process, allowing for efficient harvesting of colors, fonts, and logos.
- Enhanced the brand system with new schema definitions in `schema.ts` to support brand color and font management.
- Developed the `engine` module to integrate brand building and rendering processes, including token derivation and artifact generation.
- Improved the overall structure and organization of brand-related files for better maintainability and scalability.

* Enhance brand extraction and project management features

- Updated `brand-routes.ts` to include new dependencies for project management, allowing for the registration of brand-related projects.
- Modified the `extractBrand` function to support project ID and system files, improving the brand extraction process.
- Enhanced the CLI commands in `cli.ts` to handle project IDs during brand creation, enabling better tracking of brand projects.
- Updated the server setup in `server.ts` to register new project-related routes.
- Improved the UI components to display project information associated with brands, including buttons for opening projects in the `BrandDetailView` and `BrandsTab`.
- Added new metadata fields in the contracts to support project tracking and management for brands.

Brands are now linked to backing projects across the daemon routes, CLI, contracts, and web UI.

* Enhance onboarding profile management and memory persistence

- Added new canonical profile labels for 'Organization size', 'Use cases', and 'Discovery source' to improve user input consistency.
- Introduced `OnboardingProfileState` type to manage onboarding profile data more effectively.
- Implemented functions to build and persist the onboarding profile body to memory, ensuring user selections are saved accurately.
- Updated the `OnboardingView` component to handle profile persistence during navigation and submission steps.
- Enhanced tests to verify that user selections are correctly persisted to the memory profile.

Onboarding selections are now captured under consistent labels and persisted to the memory profile.
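
A minimal sketch of what canonical label matching could look like. The three labels come from the bullets above; the matching rule (trim, collapse whitespace, ignore case) and the `canonicalLabel` name are assumptions, not the project's actual implementation.

```typescript
// Labels taken from the commit message; the matching rule is an assumption.
const CANONICAL_LABELS = ["Organization size", "Use cases", "Discovery source"] as const;

// Collapse case and whitespace so near-duplicate labels map to one canonical entry.
function canonicalLabel(input: string): string | null {
  const key = input.trim().replace(/\s+/g, " ").toLowerCase();
  const match = CANONICAL_LABELS.find((label) => label.toLowerCase() === key);
  return match ?? null; // null means "not a canonical profile field"
}
```

Under this rule, a form that submits `"use   CASES"` would update the existing `Use cases` entry rather than create a second one.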

* Reflow brand extraction into an agent-driven, live flow

Replace the deterministic SSE prefetch/preview/system pipeline with an
agent-driven extraction: POST /api/brands now reserves the brand and stands
up a backing project with the target site open in an in-app browser tab plus
a seeded prompt, so the agent measures, synthesizes brand.json incrementally,
and the user can clear anti-bot walls by hand. New /preview and /finalize
routes let the agent render the kit page live and register the resulting
user:<id> design system, so extracted brand facts persist as a structured,
reusable brand kit instead of a one-shot deterministic guess.

Adds the brand-extract skill (SKILL.md + brand-kit.html template), kit-render
engine, brand-extraction-engine tests, brand project covers in the Designs
tab, onboarding extract handoff, and the matching od brand extract/preview/
finalize CLI subcommands and contract updates.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Sediment finalized brands into structured memory

Reflow a finalized brand into the memory store (brandToMemoryEntries +
reflowBrandToMemory) so future chats can ground vague requests in the
brand's palette, type, voice and rules. finalizeBrand now wires through
the runtime dataDir and best-effort persists the brand, MemoryChangeEvent
gains a 'brand' source, and the brand kit render hardens its inline JSON
escaping. Adds brand.previewEmpty / brand.viewDetails across all locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Implement logo fallback and imagery support in brand extraction

- Introduced a deterministic logo fallback mechanism to ensure that brand extraction processes can retrieve and save site logos, even when the agent fails to do so.
- Enhanced the `startBrandExtraction` and `finalizeBrand` functions to utilize the new logo fallback, allowing for better handling of logo assets.
- Added support for imagery samples in brand validation, enabling the inclusion of representative images in the brand kit.
- Updated the brand kit rendering to include self-hosted fonts and imagery, improving the overall presentation of brand assets.

Logo capture no longer depends on the agent, and the brand kit now renders imagery samples alongside self-hosted fonts.

* Enhance memory management with rule proposal and verification features

- Introduced new functionality for distilling annotations into rule proposals, allowing users to suggest rules based on in-canvas annotations through the `od memory rule suggest` command.
- Implemented a verification system that programmatically enforces compliance with active rules during artifact generation, ensuring that all active rules are covered in the self-verify scorecard.
- Added endpoints for managing verification outcomes, including listing, removing, and clearing verification records, enhancing the transparency of the verification process.
- Updated the memory management system to support the retrieval of active rule entries, ensuring that only linked rules are considered during verification.
- Enhanced tests for both rule proposal generation and verification processes to ensure reliability and correctness.

Together, rule proposals and programmatic verification let users manage design rules and check that generated artifacts comply with them.

* Distill review annotations into memory and enforce self-verify scorecard

Add distillAnnotationsToMemory to mine inline preview comments/highlights/
marks into durable feedback + rule memory via a dedicated distiller prompt,
threaded through the existing extract pipeline with an 'annotation' change
source. Tighten the self-verify prompt (daemon + contracts) to state the
daemon programmatically checks the scorecard, so a missing or uncovered
scorecard on an artifact turn is an enforcement failure. Cover the rule
suggest and verification-history routes with tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Apply brand design system through web config on "Use in new chat"

Thread onApplyDesignSystem from the entry shell into BrandsTab so the brand's
registered design system is applied via the web config channel instead of a
bare daemon PATCH that left the Home composer stale. Add a transient
home-intent latch + event so the Brands tab can request the Prototype chip on
the already-mounted HomeView, which consumes it once the plugin catalog loads.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Wire annotation distillation into background memory extraction

Add a background distill pass that mines inline review annotations
(comments / highlights / drawn marks) from a turn into durable memory
alongside the general LLM extraction, surface an `annotation` memory
toast source in the web UI, and cover the flow with a unit test.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix brand design system not applying to composer on "Use in new chat"

Selecting a brand's "Use in new chat" applies the brand's design system as
the default and fires the Prototype chip intent in the same synchronous click
handler. HomeView consumed that intent inside the event listener, so `pickChip`
ran before React committed the config change and seeded the composer's
design-system field from the stale (empty) default — the composer showed
"No design system" instead of the brand until a reload.

Split the intent handling: the listener now only bumps a tick, and a separate
effect consumes the chip after the re-render lands, so the seeded design system
reflects the freshly-applied brand. Add the previously-untracked home-intent
latch test coverage.
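
The ordering problem described above can be sketched without React. `IntentLatch`, `HomeIntent`, and the `user:acme` id are hypothetical names for illustration; the point is only that the listener records the intent and a later step consumes it once the new config is in place.

```typescript
// Hypothetical one-shot latch; names are illustrative, not the project's API.
type HomeIntent = { chip: string };

class IntentLatch {
  private pending: HomeIntent | null = null;
  private tick = 0;

  // Called from the click handler or event listener: record only, do no work.
  request(intent: HomeIntent): number {
    this.pending = intent;
    this.tick += 1;
    return this.tick; // a changed tick is what would trigger the later effect
  }

  // Called after the re-render: hands the intent out exactly once.
  consume(): HomeIntent | null {
    const intent = this.pending;
    this.pending = null;
    return intent;
  }
}

// Simulated sequence: request first, config commit second, consume last.
let appliedDesignSystem = ""; // stale (empty) default at click time
const latch = new IntentLatch();
latch.request({ chip: "prototype" });
appliedDesignSystem = "user:acme"; // the config change lands
const consumed = latch.consume();
const seededComposer = consumed
  ? { chip: consumed.chip, designSystem: appliedDesignSystem }
  : null;
```

Because consumption happens after the assignment, the seeded composer sees the brand's design system instead of the empty default.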

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): rework Brands into Brand Kit and add Home create entry

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(brands): harvest real cover/hero images for the Images module

The brand kit's Images gallery only populated when the extraction agent
remembered to save imagery — so a forgetful or bot-blocked agent (and the
pre-imagery "Open Design" brand) left it empty. Add a deterministic,
server-side imagery fallback (imagery-fallback.ts), mirroring the logo
fallback: it parses og:image/twitter:image, large <img> (highest-res
srcset/<picture>), <link rel=preload as=image>, and CSS background-image
hero blocks, fetches candidates with browser-shaped headers, decodes
PNG/GIF/JPEG/WebP dimensions to keep only big representative images
(dropping icons/sprites/logos/tracking pixels), dedupes by content hash,
and saves up to 8 of the largest into imagery/ with labeled samples.
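
The selection step (size filter, dedupe by content hash, keep the largest 8) can be sketched as a pure function. The `ImageCandidate` shape, the `MIN_SIDE` threshold, and the function name are assumptions; the commit does not state the actual cutoff, and hashing and dimension decoding are taken as already done.

```typescript
interface ImageCandidate {
  url: string;
  width: number;
  height: number;
  contentHash: string; // assumed to be precomputed from the fetched bytes
}

const MIN_SIDE = 400;  // assumed threshold; the real cutoff is not stated
const MAX_SAMPLES = 8; // "up to 8" per the commit message

function pickRepresentativeImages(candidates: ImageCandidate[]): ImageCandidate[] {
  const seen = new Set<string>();
  const kept: ImageCandidate[] = [];
  for (const c of candidates) {
    if (Math.min(c.width, c.height) < MIN_SIDE) continue; // icons, sprites, pixels
    if (seen.has(c.contentHash)) continue;                // same bytes, other URL
    seen.add(c.contentHash);
    kept.push(c);
  }
  // Largest area first, then cap the result.
  return kept
    .sort((a, b) => b.width * b.height - a.width * a.height)
    .slice(0, MAX_SAMPLES);
}
```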

finalizeBrand runs it as a timeout-bounded, failure-tolerant safety net
(injectable so tests stay offline) when the agent captured too few
samples, first adopting any on-disk images. The extraction prompt and
brand-extract SKILL now explicitly direct the agent to harvest the site's
large/cover/hero images, filtered by rendered size.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(qa): implement deck layout validation and safety checks

Add a new QA module for validating the layout of generated brand decks to ensure robustness against clipping and truncation issues. The `analyseDeckLayout` function checks for critical layout invariants, including the presence of `.slide` sections, correct container types, and necessary runtime layers. Introduce `assertDeckLayoutSafe` to enforce these checks during brand system rebuilds, preventing the deployment of decks that fail validation. Additionally, create comprehensive tests to verify the functionality of the new layout validation features.

* fix(brands): apply deck shrink-to-fit synchronously so slides never clip

The no-clip runtime scheduled its fit pass through requestAnimationFrame,
whose callbacks are throttled while the deck is offscreen or occluded. A
slide could therefore stay unscaled — and clip its content — until first
paint. Fit synchronously on resize/load/fonts-ready with a trailing
setTimeout settle pass for late reflow, removing the rAF dependency.
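
For illustration, a shrink-to-fit factor can be computed as a pure function, which is what makes it safe to call directly from resize/load handlers rather than from a scheduled frame callback. `fitScale` and its parameters are hypothetical; the actual runtime's measurement logic is not shown in this commit.

```typescript
// Returns a scale in (0, 1]: 1 when the content already fits, smaller otherwise.
function fitScale(
  contentWidth: number,
  contentHeight: number,
  frameWidth: number,
  frameHeight: number,
): number {
  if (contentWidth <= 0 || contentHeight <= 0) return 1; // nothing to scale
  return Math.min(1, frameWidth / contentWidth, frameHeight / contentHeight);
}
```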

Verified at the previously-broken 1024x620 viewport: container-type:size,
zero truncations, runtime auto-applies scale (Problem 0.71, ASK 0.87,
Product 0.97, Competition 0.97) and frame clip count drops to 0.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): let New Brand modal embed scrollable brand reference picker

Add a fillHeight mode to BrandReferencePicker so the heading, quick-pick
row and controls stay pinned while only the gallery scrolls inside a
bounded-height parent. Wire it into NewBrandModal with a stable, spacious
dialog and refresh the related newBrand/brandPicker copy across all 18
locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(brands): enhance brand extraction with deterministic seed harvesting

Introduce a new `seed-fallback` module that harvests a deterministic, server-side palette and typography seed during brand extraction. The brand kit's first paint then shows a harvested logo, an approximate color palette, and font families instead of an all-skeleton page. Update `startBrandExtraction` to use the module.

Additionally, enhance the `BrandReferencePicker` component to reflect loading states and errors during brand extraction, ensuring users receive immediate feedback on their actions. Update related tests to verify the idempotency of the `finalizeBrand` function, ensuring that re-finalizing a brand correctly reuses the existing design system without duplication.

* feat(brand-extract): enhance BrandReferencePicker and localization updates

Updated the BrandReferencePicker component to reflect loading states and errors during brand extraction, improving user feedback. Added a new localization key for the brand extraction process and updated existing translations in English, Simplified Chinese, and Traditional Chinese to enhance clarity and user experience. Additionally, introduced new styles for better interaction with brand assets in the brand kit template.

* feat(brands): wire in-page lightbox/masonry/asset preview + refine seed

Brand-kit preview improvements for the live extraction kit:
- brand-kit.html: add in-page overlay system (sandboxed iframe has no
  top-nav) — clickable image lightbox with prev/next, a "view all"
  masonry modal, and a full-page asset preview modal that loads
  system/artifacts/<kind>.html in an iframe. Defer auto-reload while an
  overlay is open so it never yanks the modal out mid-interaction.
- seed-fallback.ts: prefer vivid mid-luminance hues for the seeded
  accent/accent-secondary, and drop icon/symbol faces (Remix Icon etc.)
  from the typography seed so specimens never render glyph soup.

Co-authored-by: Cursor <cursoragent@cursor.com>

* i18n(web): add brandPicker.opening across remaining locales + picker test

Completes the brand-reference picker i18n key that was committed only for
en/zh-CN/zh-TW, so every locale satisfies the typed Dict, and lands the
BrandReferencePicker extraction-feedback test left untracked by the
concurrent worker.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(EntryShell): enhance AMR cloud card visibility post-detection

Updated the EntryShell component so the AMR cloud card stays visible after detection settles, even when the AMR runtime is unavailable. The card no longer disappears; it degrades gracefully to fallback content and the sign-in flow. Added tests to verify the new behavior.

* feat(library): OD Library asset registry + OD Clipper extension

Add a global, cross-project asset registry (OD Library) and a Chrome MV3
capture extension (OD Clipper), wiring the full HTTP + CLI + Web UI three-track
loop per specs/od-clipper.md.

- contracts: LibraryAsset/Source/Kind, ingest, search, pairing, task DTOs
- daemon: 6 additive SQLite tables, content-addressed owned storage, the
  idempotent registerLibraryAsset hook (hash dedup + append-source),
  programmatic enrichment (mime/size/image dims/domain/tags), pairing tokens
  with a persisted extension-origin allowlist, /api/library/* routes, and
  /api/tools/library/{search,apply} for in-task agent reuse
- cli: `od library list|get|rm|search|import|pair`
- web: Library tab (grid, source badges, filters, search, live SSE updates,
  extension pairing affordance)
- clipper/: standalone MV3 extension (background SW, content toolbar, popup)
- skills/library-curator: utility skill for agent-driven asset reuse

Origin middleware now honors paired chrome-extension:// origins (seeded from
SQLite on boot) and exempts the pairing-confirm handshake. Enrichment AI stages
(caption/OCR/embedding) are recorded as skipped pending a configured model.
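
A minimal in-memory sketch of an idempotent register hook with hash dedup and append-source. The `LibraryRecord` shape and the `registerAsset` name are assumptions for illustration; the real hook is backed by SQLite and content-addressed storage.

```typescript
interface LibraryRecord {
  hash: string;      // content hash, doubles as the dedup key
  sources: string[]; // every place this content was captured from
}

function registerAsset(
  store: Map<string, LibraryRecord>,
  hash: string,
  source: string,
): LibraryRecord {
  const existing = store.get(hash);
  if (existing === undefined) {
    const created: LibraryRecord = { hash, sources: [source] };
    store.set(hash, created);
    return created;
  }
  // Same bytes seen again: append the source only if it is new.
  if (existing.sources.indexOf(source) === -1) existing.sources.push(source);
  return existing;
}
```

Registering the same hash and source twice leaves the store unchanged, which is the property that lets callers retry safely.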

* feat(brands): programmatic-first design system extraction + rename

Make brand extraction two-phase so a usable design system is ready the
moment the user enters a URL — the instant "aha" — instead of waiting on
the AI agent:

1. PROGRAMMATIC-FIRST (synchronous): startBrandExtraction now harvests the
   site deterministically (logo, palette, typography, one-line description,
   cover imagery, source URL) via prefetchBrand, synthesizes a valid design
   system with brandFromMaterial (no LLM), and finalizes + registers it
   before returning. finalizeBrand is refactored into a reusable
   finalizeBrandCore shared by both the programmatic path and the agent path.
2. ASYNC AI ENRICHMENT: the seeded agent prompt is reframed to enrich the
   already-usable design system and re-finalize in place (same user:<id>),
   updating every artifact/template.

Bounded + best-effort: a blocked/unreachable origin skips phase 1 and stays
`extracting` for the agent to drive. Gated on userDesignSystemsRoot so the
legacy agent-only path stays intact for tests.

Also rename the user-facing "Brand Kit" surface to "Design System" across
en + zh-CN strings, project names, and the enrichment prompt.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(library): enhance asset import and management features

- Updated the `import` command to allow multiple local files and remote URLs, with restrictions on supported formats.
- Added new commands: `apply` for copying assets into project design files, `edit-as-page` for converting HTML assets into editable projects, and `figma` for exporting Figma captures.
- Introduced sidecar functionality for storing derived data alongside owned assets, including Figma capture IR and element HTML.
- Enhanced server configuration to support larger ingest payloads for asset captures.
- Improved error handling and user feedback during asset import and application processes.

* feat(asset-management): enhance asset dropzone and introduce chat-to-design feature

- Updated the DesignSystemAssetDropzone component to improve file preview handling with new functions for creating and revoking object URLs.
- Adjusted CSS for better layout and spacing in the asset dropzone.
- Added a new "Chat to design" button in the LibrarySection component, allowing users to send selected assets to the Home chat composer for project creation.
- Updated localization strings across multiple languages to reflect changes in asset import terminology.
- Enhanced the HomeView component to handle asset staging from the chat composer.

* feat(library): enhance asset application with element markup support

- Updated the `applyLibraryAsset` function to include an `includeElement` option, allowing the capture of element markup alongside assets.
- Modified related components (e.g., `ChatComposer`, `LibrarySection`, `FileWorkspace`) to handle the new element markup feature, ensuring both asset paths and optional element paths are returned and processed.
- Introduced a new function, `fetchLibraryAssetElementHtml`, to retrieve the captured HTML for element-pick assets.
- Enhanced the UI to display element markup inline within the chat composer, improving user interaction with captured elements.
- Updated API contracts to reflect changes in asset application responses, including optional element markup paths.

* feat(library): enhance asset filtering and preview handling

- Updated the LibraryPicker and LibrarySection components to implement a badge-aware kind filter, allowing for more precise asset filtering based on badge kind.
- Introduced a new `matchesKindFilter` function to streamline the filtering logic across components.
- Enhanced the DesignSystemAssetDropzone to ensure proper handling of image previews, addressing issues with broken thumbnails under React StrictMode.
- Added CSS styles for kind badges to improve asset representation in the UI.
- Implemented tests for the DesignSystemAssetDropzone to ensure correct preview lifecycle management.

* feat(library): hydrate single asset on SSE ingest

Add fetchLibraryAsset(id) so the Library grid can merge just the one
asset an `ingest` SSE event references instead of refetching the whole
list on every capture. Returns null on miss/error.
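
The merge step on the grid side might look like the sketch below. The `Asset` shape and `mergeAsset` are hypothetical; the `null` branch mirrors the "returns null on miss/error" behavior described above.

```typescript
interface Asset {
  id: string;
  title: string;
}

// Merge one hydrated asset into the current list without refetching the rest.
function mergeAsset(list: Asset[], incoming: Asset | null): Asset[] {
  if (incoming === null) return list; // miss or error: keep the list as-is
  const index = list.findIndex((a) => a.id === incoming.id);
  if (index === -1) return [incoming, ...list]; // new capture goes first
  const next = list.slice();
  next[index] = incoming; // existing asset: replace in place
  return next;
}
```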

* feat(clipper): richer in-page image picker

Collect CSS background-image url()s in addition to <img> (so hero/section
art painted as backgrounds is no longer silently missed), defer thumbnail
decode to visible cells via IntersectionObserver, draw downscaled canvas
thumbnails instead of second full-res decodes, and add locate-on-page
highlighting so a picked image can be traced back to its DOM source.

* feat(library): implement lazy loading for thumbnails and enhance asset filtering

- Introduced a `LibraryThumb` component to lazily load heavy content (images, videos, iframes) only when they are near the viewport, improving performance.
- Added a debounced search feature to optimize asset filtering, reducing unnecessary network requests during rapid input.
- Enhanced the asset filtering logic to track active filters using a ref, ensuring efficient updates during live events.
- Updated the `snapshotCardRects` and `cardIdsInBand` functions to support improved hit-testing for drag-and-drop interactions.

* feat(library): lazy picker thumbnails + debounced search

Extend the Library grid's lazy-thumbnail + 250ms debounced-search pattern
to the composer LibraryPicker so opening it no longer fires one full-bytes
request per asset, and tidy the clipper content-script image collection.
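
A generic debounce of the kind described, as a sketch. The injectable `Schedule` parameter is an assumption added so the behavior can be exercised without real timers; it is not taken from the project.

```typescript
type Timer = { cancel(): void };
type Schedule = (fn: () => void, ms: number) => Timer;

// Default scheduler backed by the global timer functions.
const realSchedule: Schedule = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return { cancel: () => clearTimeout(id) };
};

// Only the last call within `waitMs` reaches `fn`.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
  schedule: Schedule = realSchedule,
): (...args: A) => void {
  let timer: Timer | null = null;
  return (...args: A): void => {
    if (timer !== null) timer.cancel(); // drop the previously queued call
    timer = schedule(() => {
      timer = null;
      fn(...args);
    }, waitMs);
  };
}
```

With a 250ms wait, typing "a" then "ab" in quick succession would issue a single search for "ab".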

* feat(clipper): compress and budget capture inlining

Re-encode large raster images to downscaled WebP and inline smallest-first
within a fixed budget, dropping only the secondary Figma IR past a safe body
size, so an image-heavy page (e.g. a news front page) always saves as an
editable HTML capture instead of 413-failing the ingest.
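
Smallest-first inlining within a fixed budget is a simple greedy pass. The sketch below assumes a hypothetical `Resource` shape and a caller-supplied byte budget; the extension's actual budget and re-encoding step are not shown.

```typescript
interface Resource {
  url: string;
  bytes: number; // size after any re-encoding
}

// Inline the smallest resources first so as many as possible fit the budget.
function planInlining(
  resources: Resource[],
  budgetBytes: number,
): { inlined: Resource[]; skipped: Resource[] } {
  const sorted = [...resources].sort((a, b) => a.bytes - b.bytes);
  const inlined: Resource[] = [];
  const skipped: Resource[] = [];
  let used = 0;
  for (const r of sorted) {
    if (used + r.bytes <= budgetBytes) {
      inlined.push(r);
      used += r.bytes;
    } else {
      skipped.push(r); // left as a plain URL reference
    }
  }
  return { inlined, skipped };
}
```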

* test(library): LibraryPicker debounce + lazy-thumbnail coverage

Cover the composer picker's 250ms debounced search and its lazy <img>
mount (deferred until the card is in view), matching the grid's perf test.

* feat(design-system): enhance asset handling and UI for design systems

- Updated the CLI to support additional asset kinds, including 'design-system'.
- Enhanced the DesignSystemProvenance type to include source URLs, improving provenance tracking.
- Modified the design system generation jobs to correctly summarize source links and GitHub repositories.
- Updated UI components to reflect changes in asset handling, including new source link management in the DesignSystemFlow.
- Improved tests to cover new functionality for adding source links and ensuring proper handling of design system assets.

* refactor(library): rename 'design-system' to 'brand kit' and enhance thumbnail loading

- Updated labels and filters in Library components to replace 'design-system' with 'brand kit'.
- Introduced a shimmer skeleton for lazy-loaded thumbnails in the LibraryPicker to improve user experience during asset loading.
- Enhanced the PickerCard component for better performance by memoizing individual asset cards.
- Updated tests to ensure proper handling of brand kit assets and their visibility in the LibraryPicker.

* feat(clipper): implement internationalization for toolbar and popup

- Added i18n support to the clipper, enabling localization of UI elements and tooltips.
- Introduced a new i18n.js file to manage translations for various languages.
- Updated content.js and popup.js to utilize the i18n functions for dynamic text rendering.
- Enhanced accessibility by ensuring aria-labels and tooltips are also localized.
- Improved user experience by providing localized messages for actions and statuses.

* feat(clipper): enhance brand kit extraction and localization support

- Updated the brand kit extraction process to include improved handling of assets and localization for various UI elements.
- Added internationalization support for the brand kit feature, allowing for dynamic text rendering based on user locale.
- Enhanced the user experience by ensuring that all relevant messages and tooltips are localized.
- Updated tests to cover new localization features and ensure proper functionality of the brand kit extraction process.

* feat(clipper): enhance brand color derivation and update localization

- Introduced new functions for color manipulation, including linear interpolation and clamping, to improve brand color derivation.
- Updated the deriveBrandColors function to better map observed palettes to semantic roles, ensuring consistent brand representation.
- Revised localization strings in i18n.js to reflect changes from 'brand kit' to 'design system', enhancing clarity and user experience.
- Improved overall code organization and readability by refactoring existing functions and adding new utility methods.
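
As an illustration of the interpolation and clamping helpers mentioned above, here is a minimal sketch. `clamp`, `lerp`, and `mixRgb` are hypothetical names; how `deriveBrandColors` actually combines them is not described in this commit.

```typescript
type Rgb = [number, number, number];

const clamp = (value: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, value));

// Linear interpolation with t clamped to [0, 1].
const lerp = (a: number, b: number, t: number): number =>
  a + (b - a) * clamp(t, 0, 1);

// Blend two colors channel by channel, rounding to integer channel values.
function mixRgb(from: Rgb, to: Rgb, t: number): Rgb {
  return [
    Math.round(lerp(from[0], to[0], t)),
    Math.round(lerp(from[1], to[1], t)),
    Math.round(lerp(from[2], to[2], t)),
  ];
}
```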

* refactor(clipper): update terminology from 'brand kit' to 'design system'

- Replaced all instances of 'brand kit' with 'design system' across various components and localization files for consistency.
- Updated UI elements, tooltips, and documentation to reflect the new terminology.
- Enhanced user experience by ensuring clarity in the design system extraction process and related functionalities.
- Adjusted localization strings in multiple languages to align with the updated terminology.

* feat(clipper): enhance image fill handling and normalization

- Introduced functions to normalize image fills by converting non-PNG/JPEG formats (SVG, WebP, GIF, AVIF) to PNG before import, ensuring all images are properly rendered in Figma.
- Updated the UI to report the number of images converted and dropped during the import process, improving user feedback.
- Enhanced the overall image processing workflow to prevent silent failures when unsupported formats are encountered.
- Revised documentation to reflect the new image handling capabilities and supported formats.

* feat(clipper): enhance UI kit and busy state feedback

- Updated the UI kit to include new components such as inputs, selection, and overlays, improving the overall design system representation.
- Enhanced the busy state feedback during capture processes with localized messages and a step-by-step progress indicator, providing users with clearer status updates.
- Revised localization strings to support new UI elements and improve user experience across multiple languages.
- Improved documentation to reflect changes in the UI kit and busy state handling.

* fix(brands): restore design-systems nav entry + reconcile BrandsTab on re-activation

Address review feedback on PR #4260:

1. EntryNavRail dropped the only control that reached view==='design-systems'
   when Brands replaced it in the rail, leaving the still-rendered/routed
   design-system manager deep-link only (the entry-nav-design-systems e2e
   specs assert this). Restore a reachable rail entry (blocks icon, existing
   navDesignSystems key) alongside Brands.

2. BrandsTab only fetched once on mount, but EntryShell keeps sub-views
   mounted and toggles visibility, so a brand finishing extraction in its
   backing project never reconciled until a full reload. Refresh whenever the
   Brands view becomes active again, and poll while any brand is extracting
   (torn down once settled / when hidden).

Red spec: tests/components/BrandsTab.refresh.test.tsx (fails pre-fix:
fetchBrands called once, not twice).

* Update clipper/brand-capture.js

* fix(clipper): improve busy state handling and UI feedback

- Adjusted the spinner CSS to use flex properties for better layout control.
- Enhanced the reclampIfMoved function to preserve user position during busy state transitions.
- Added loading toast notifications for popup-launched captures to ensure progress visibility even when the on-page bar is hidden.

* feat(daemon): add kiwi-schema dependency and enhance Figma API integration

- Added kiwi-schema package to the daemon for improved schema handling.
- Updated FigmaApiNode interface and related functions to support shared functionality with the offline decoder, ensuring consistency in node processing.
- Refactored capture functions in clipper to maintain UI visibility during DOM/IR snapshots, enhancing user experience during capture operations.

* fix(web): surface missing backing projects

* fix(web): re-enable brand actions after use

* fix(daemon): serve brand logos from data roots

* fix(brands): reconcile failed extractions

* feat(daemon): implement offline Figma import and decoding functionality

- Added support for importing `.fig` files directly into the daemon, enabling offline processing without requiring a Figma account.
- Introduced a new `fig-decode.ts` module for decoding `.fig` files, handling both ZIP-wrapped and raw formats.
- Created `figma-import.ts` to orchestrate the import process, generating a canonical snapshot and associated metadata.
- Enhanced the server to handle Figma file uploads and integrate with the new decoding logic.
- Updated package dependencies to include `kiwi-schema`, `html2canvas`, and `jspdf` for improved functionality.
- Added tests for the new Figma import features to ensure reliability and correctness.
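
Telling a ZIP-wrapped file from a raw one can be done by checking the standard ZIP local-file-header signature (`PK\x03\x04`). The sketch below shows only that check; `isZipWrapped` is a hypothetical helper, and what `fig-decode.ts` does after detection is not covered here.

```typescript
// Standard ZIP local file header signature: "PK" 0x03 0x04.
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04];

function isZipWrapped(bytes: Uint8Array): boolean {
  if (bytes.length < ZIP_MAGIC.length) return false;
  return ZIP_MAGIC.every((expected, i) => bytes[i] === expected);
}
```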

* feat(clipper): reload-proof capture progress badge on the extension icon

The on-page progress strip dies if a page reloads itself mid-capture
(aggressive paywall sites like economist.com do this), leaving no
loading signal. Add a per-tab '•••' badge on the extension icon for the
lifetime of any capture message — it lives on the action icon, so a page
navigation can't take it down. Verified end-to-end via a real loaded
extension.

* feat(daemon): add export functionality for Figma and enhance PDF export process

- Introduced `runFigma` command for importing Figma designs, supporting both local `.fig` files and Figma URLs.
- Added detailed usage instructions for the `od figma import` command.
- Implemented `runExport` command for programmatic export of HTML/deck artifacts to PDF, PPTX, or image formats.
- Enhanced error handling and user feedback during export processes.
- Removed obsolete `build-pptx-export-prompt` module and related tests to streamline the codebase.

* feat(daemon): enhance library synchronization and export capabilities

- Implemented `reconcileLibrary` to mirror design systems and agent-produced project deliverables into the Library as referenced assets.
- Added support for programmatic export of artifacts via the `od export` command, including detailed usage instructions.
- Introduced new functions for handling Figma imports and exports, improving integration with design workflows.
- Enhanced error handling and user feedback during synchronization and export processes.
- Added tests for new features to ensure reliability and correctness.

* feat(web): PPTX export for any shareable artifact + Library toolbar tooltips

* chore(nix): refresh pnpm deps hash

* refactor(web): enhance onboarding view and file export progress indicators

- Updated the onboarding view layout for improved accessibility and visual hierarchy, including adjustments to spacing, typography, and button styles.
- Introduced a loading toast for file export progress, displaying elapsed time and estimated time remaining for slide exports.
- Added new translation keys for export progress messages in multiple languages.
- Refactored the export progress handling to provide real-time updates during the export process, improving user feedback and experience.

* refactor(web): streamline export capture bridge and update connector styles

- Removed unused loading logic for html2canvas in the export capture bridge, simplifying the code.
- Updated CSS for the onboarding view connector to improve visual clarity and ensure it does not overlap with node numbers.

* refactor(web): remove html2canvas dependency and enhance Figma URL handling

- Removed the html2canvas package from the project, including its references in the lock file and related components.
- Added functionality to manage Figma URLs within the Design System flow, allowing users to add, remove, and validate Figma file links.
- Improved drag-and-drop handling to prevent unintended file navigation when dropping files outside designated areas.
- Updated UI components to accommodate new Figma URL features, enhancing user experience and accessibility.

* refactor(web): unify brand and design system flows

- Merged the brand extraction process into the design system creation workflow, allowing users to start from a brand directly within the design system wizard.
- Updated routing to redirect legacy brand links to the unified design systems tab.
- Enhanced the onboarding experience by removing the separate Brand Kit tab and integrating brand selection into the design system creation process.
- Improved UI components to reflect these changes, ensuring a seamless user experience across the application.

* feat(web): introduce brand enrichment banner and picker modal

- Added a new BrandEnrichmentBanner component to allow users to refine programmatically-extracted design systems with AI by selecting design-system skills.
- Implemented a BrandPickerModal for selecting brands from a searchable gallery, enhancing the design system creation flow.
- Updated ChatPane to conditionally display the enrichment banner for eligible brand projects, improving user engagement.
- Enhanced the design system flow to support the new brand enrichment features, ensuring a seamless experience for users.

* feat(web): enhance BrandPickerModal and DesignSystemAssetDropzone

- Updated the BrandPickerModal to allow scrolling of the entire picker area, improving user experience by creating a unified scrolling surface.
- Added new props to the BrandReferencePicker for action labels and scroll root reference, enhancing flexibility in brand selection.
- Introduced a new DesignKitView component for rendering design kits consistently across different surfaces.
- Enhanced the DesignSystemAssetDropzone to support a wider variety of file types with appropriate previews, improving asset management during design system creation.
- Updated styles for better visual clarity and responsiveness across components.

* feat(web): update Design Systems tab actions and enhance localization

- Changed the button label in the DesignSystemsTab from "Edit" to "Open" for better clarity in user actions.
- Added a new translation key for 'dsManager.openSystem' across multiple languages to support the updated button label.
- Enhanced the FileWorkspace component to ensure the Design Files tab aligns correctly with the Design System tab, improving UI consistency.
- Implemented a new design system editing feature that allows users to fetch and save design system content from DESIGN.md, enhancing the design workflow.

* fix(merge): repair post-merge regressions after origin/main integration

Follow-up fixes on top of the origin/main merge (886f925cd) addressing
regressions the conflict resolution introduced. main's web suite is the
oracle (100% green); resolution principle was main's engine/backend +
HEAD's UI, unioned.

daemon:
- library-sync.ts: correct design-systems import to ./design-systems/index.js
  (design-systems became a directory module on main).
- tests/server-bootstrap-regression: add LIBRARY_DIR to the PathDeps fixture
  (main-added test x HEAD-added LIBRARY_DIR field).

web:
- WorkspaceTabsBar: union — restore main's onboarding Search-popover close
  behaviour + guards, keep HEAD's library/brands nav entries.
- HomeView: restore main's composer sending-state (await onSubmit, widen its
  return type to Promise<boolean>|boolean|void, pass submitting to HomeHero).
- MemorySection.test: take main's version to match main's two-loop memory
  component.
- i18n: restore dropped settings.onboardingRoleMarketing key across types.ts
  and all locales.
- App/BrandsTab/EntryNavRail/router/home-intent: union fixes restoring main
  features dropped during conflict resolution (needs_input handling, etc.).

Validation: pnpm guard + full pnpm typecheck (all 23 packages) green.
Known-red: EntryShell onboarding step 3 intentionally retains HEAD's "build"
step rather than main's brand-extract step; 8 EntryShell.onboarding /
App.onboarding-amr-e2e tests stay red pending that onboarding decision.

* fix(merge): keep HEAD's unified brand flow (revert main's separate Brands tab)

Follow-up to 688544ff7. Per the chosen product direction (brand creation
unified into the design-system create wizard, not a standalone Brands tab),
revert the brand-flow routing/nav that the post-merge repair had restored
from main:

- router.ts: keep HEAD's brand routing (brands folded into design-systems),
  drop main's standalone /brands and /brands/:id view routing.
- EntryNavRail.tsx: drop main's standalone "Brands" nav button.
- runtime/home-intent.ts: drop main's brand "Use in new chat" confirmation
  notice plumbing (tied to the separate Brands flow).

Kept from the repair commit (additive, non-conflicting): App.tsx
loadedActiveProject correctness, composer Sending… state, WorkspaceTabsBar
onboarding popover behaviour, two-loop memory test, restored i18n keys,
brand needs_input STATUS handling, server.ts plugin-route infrastructure.

* feat(library-ui): implement conditional rendering based on LIBRARY_UI_VISIBLE

- Updated router.ts to conditionally render the library view based on the LIBRARY_UI_VISIBLE flag.
- Modified ComposerPlusMenu.tsx, DesignFilesPanel.tsx, and DesignSystemAssetDropzone.tsx to show the "Select from library" button only when LIBRARY_UI_VISIBLE is true.
- Adjusted EntryNavRail.tsx and EntryShell.tsx to include the library navigation button and section conditionally based on the LIBRARY_UI_VISIBLE state.
- Enhanced HomeHero.tsx to allow starting a blank project directly, improving user experience by providing more options for project creation.

This commit introduces a feature toggle for the library UI, allowing for better control over its visibility during development and testing.

* feat(home-hero): implement edge auto-scroll for horizontal overflow

- Introduced `useEdgeAutoScroll` hook to manage auto-scrolling behavior for horizontally overflowing components in HomeHero.
- Updated `PluginPromptPresets` and `RailGroup` components to utilize the new auto-scroll functionality, enhancing accessibility for users without trackpads.
- Added `EdgeScrollZones` component to provide interactive edge zones for scrolling.
- Enhanced CSS styles to support the new scrolling layout and ensure proper positioning of elements.

This commit improves user experience by making overflow content more accessible and easier to navigate.

* feat(design-systems): add project creation from design system and enhance UI components

- Implemented `handleCreateProjectFromDesignSystem` function in `AppInner` to facilitate project creation directly from a selected design system.
- Updated `DesignKitView` to wrap the iframe in a span for better layout control.
- Refactored CSS for `BrandPreviewCard` and `DesignSystemsTab` to improve styling and responsiveness.
- Introduced a new `TemplatePicker` component in `HomeHero` for selecting project-type templates, enhancing user experience.
- Updated various components to support asynchronous handlers for design system actions, improving overall functionality.

This commit enhances the design system integration and user interface, making project creation more intuitive and accessible.

* feat(brand-routes): enhance brand reservation API and add DESIGN.md support

- Updated the POST /api/brands endpoint to accept optional fields: description and designMd, allowing for more flexible brand reservations.
- Modified validation to require either a URL or designMd for brand extraction.
- Introduced a new design-md-input module to handle parsing and validation of DESIGN.md content.
- Enhanced startBrandExtraction function to support processing of DESIGN.md, improving integration with design systems.
- Added utility functions for managing DESIGN.md input and output, streamlining the brand creation process.

This commit improves the brand extraction workflow by integrating DESIGN.md support, making it easier for users to create and manage brands.

* feat(chat-pane, design-kit-view): enhance chat functionality and design preview features

- Added `handleNextStepPromptAction` to `ChatPane` for setting draft prompts, improving user interaction.
- Introduced `nextStepVariant` to differentiate design system projects in `ChatPane`.
- Updated `DesignKitView` to include a button for previewing design kit covers, enhancing user experience.
- Implemented a modal for displaying design kit previews, allowing users to view content in a dedicated space.

These changes improve the chat interface and design kit interactions, making the application more intuitive and user-friendly.

* feat(brand-extraction): enhance DESIGN.md support and testing

- Added a new test case to validate brand extraction from DESIGN.md input without requiring a website.
- Implemented functionality to register brands directly from DESIGN.md, improving the brand creation workflow.
- Updated the `ChatPane` and `NextStepActions` components to handle design system-specific actions for projects, enhancing user experience.
- Enhanced localization files with new carousel hints and project brief options across multiple languages.

These changes streamline the brand extraction process and improve the overall functionality of the design system integration.

* feat(wireframe-examples): add annotated and greybox wireframe examples

- Introduced new wireframe examples for annotated and greybox styles, enhancing design system capabilities.
- Added HTML and JSON files for both wireframe types, providing templates for low-fidelity design mockups.
- Implemented SKILL.md documentation for each wireframe example, detailing usage and design specifications.

These additions improve the design toolkit, offering users more options for creating wireframes in various styles.

* feat(brand-extraction): refine Chrome fallback and enhance error handling

- Updated the Chrome fallback logic in the prefetch pipeline to clarify its purpose and usage as a diagnostic tool.
- Introduced environment variable checks to enable or disable system Chrome usage, improving control over the extraction process.
- Enhanced error messages in the DesignSystemCreationFlow component to provide clearer guidance on required inputs for creating a design system.
- Added regression tests to ensure that prompts do not instruct the agent to invoke a non-existent `brand-extract` skill, preventing potential failures during brand extraction.

These changes improve the robustness of the brand extraction process and enhance user experience by providing clearer instructions and error handling.

* feat(brand-extraction): enhance DESIGN.md input handling and introduce brand ready prompt

- Updated the BrandFromDesignMdInput interface to declare the description property as optional and explicitly allow undefined.
- Enhanced the brand extraction prompts to clarify the inline brand-extract workflow, preventing confusion during the extraction process.
- Added a new BrandReadyPrompt component to notify users when a design system is ready for preview, improving user experience.
- Introduced CSS styles for the BrandReadyPrompt to ensure a visually appealing and user-friendly interface.
- Updated localization files to support new strings related to the brand ready prompt across multiple languages.

These changes improve the clarity and usability of the brand extraction process, providing users with timely feedback and a more intuitive interface.

* feat(brand-extraction): improve design system focus handling and localization updates

- Refactored the handling of browser tabs in the brand extraction tests to ensure proper validation of tab states.
- Enhanced the AppInner component to refresh design systems alongside templates, ensuring users see the latest updates without page reloads.
- Introduced a pending focus state in the DesignSystemsTab to manage design system selection more effectively after brand extraction.
- Added a BrandReadyPrompt in the ProjectView to notify users when a design system is ready for preview, improving user engagement.
- Updated localization files for Chinese (Simplified and Traditional) to reflect changes in terminology related to design systems.

These changes enhance the user experience by providing timely feedback and ensuring that the design system selection process is seamless and intuitive.

* fix(styles): adjust letter-spacing and enhance plus-menu trigger styles

- Set letter-spacing to 0 in design-system-flow.css for improved text clarity.
- Added styles for plus-menu trigger in plus-menu.css, including background, border, and hover effects to enhance user interaction and visual consistency.

These changes refine the design aesthetics and improve the usability of the plus-menu component.

* feat(tests): add design-system focus handoff tests

- Introduced a new test suite for validating the design-system focus handoff functionality.
- Implemented tests to ensure that the focus ID is correctly set, read, and cleared from session storage, preventing user selection hijacking.
- Added checks for scenarios where no focus ID is pending, enhancing test coverage for the design system's behavior.

These tests ensure the reliability of the design-system focus handling, contributing to a more robust user experience.
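
The set, read, and clear behaviour described above can be sketched as a one-shot handoff. This is an illustration under assumptions, not the project's code: the storage key, the function names, and the in-memory store standing in for `sessionStorage` are all hypothetical.

```typescript
// Hypothetical names throughout; the real module and storage key are not shown in this log.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const PENDING_FOCUS_KEY = "pending-design-system-focus"; // assumed key

function setPendingFocus(store: KeyValueStore, id: string): void {
  store.setItem(PENDING_FOCUS_KEY, id);
}

// Read the pending ID and clear it in the same step, so a later visit
// cannot override whatever the user has selected in the meantime.
function consumePendingFocus(store: KeyValueStore): string | null {
  const id = store.getItem(PENDING_FOCUS_KEY);
  if (id !== null) store.removeItem(PENDING_FOCUS_KEY);
  return id;
}

// In-memory stand-in for sessionStorage so the sketch runs outside a browser.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}
```

In a browser, `window.sessionStorage` satisfies the same three-method interface and could be passed in place of the in-memory store.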

* feat(export): restrict image format options to PNG and JPEG

- Updated the image format options in the export functionality to only allow PNG and JPEG, removing WebP to prevent silent downgrades.
- Enhanced error handling to provide clear feedback when an unsupported image format is specified.
- Adjusted related documentation and comments to reflect the changes in supported formats across the application.

These changes ensure consistency in image export behavior and improve user experience by providing immediate validation errors for unsupported formats.
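
Rejecting an unsupported format up front, rather than silently substituting another one, might look like the following. A minimal sketch: `parseImageFormat` and the error wording are hypothetical, and only the PNG/JPEG allow-list comes from the commit message.

```typescript
// Allow-list taken from the commit message; everything else here is illustrative.
const SUPPORTED_IMAGE_FORMATS = ["png", "jpeg"] as const;
type ImageFormat = (typeof SUPPORTED_IMAGE_FORMATS)[number];

// Hypothetical validator: fail fast instead of downgrading to another format.
function parseImageFormat(raw: string): ImageFormat {
  const normalized = raw.trim().toLowerCase();
  const match = SUPPORTED_IMAGE_FORMATS.find((format) => format === normalized);
  if (match === undefined) {
    throw new Error(
      `Unsupported image format "${raw}". Supported formats: ${SUPPORTED_IMAGE_FORMATS.join(", ")}.`,
    );
  }
  return match;
}
```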

* feat(origin-validation): implement zero-config OD Clipper bypass for library requests

- Added a new function `isZeroConfigClipperLibraryRequest` to validate requests from locally-installed browser extensions targeting the `/library/` path.
- Updated the origin validation middleware to utilize this function, allowing unpaired browser extensions to access the `/api/library/ingest` endpoint while blocking other cross-origin requests.
- Enhanced tests to cover the new bypass functionality, ensuring correct behavior for both valid and invalid origins.

These changes improve the integration of browser extensions with the local daemon, enhancing user experience while maintaining security.
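
The shape of such a narrowly scoped bypass can be sketched as a predicate over the request origin and path. This is not the `isZeroConfigClipperLibraryRequest` implementation: the accepted extension schemes, the path prefix, and the function name below are assumptions made for illustration.

```typescript
// Assumed values; the real daemon may accept different schemes or paths.
const EXTENSION_PROTOCOLS = new Set(["chrome-extension:", "moz-extension:"]);
const LIBRARY_PATH_PREFIX = "/api/library/";

// Hypothetical predicate: true only for extension origins hitting library routes.
// A real check should run against an already-normalized path.
function isExtensionLibraryRequest(origin: string | undefined, path: string): boolean {
  if (!origin) return false;
  let protocol: string;
  try {
    protocol = new URL(origin).protocol;
  } catch {
    return false; // malformed or opaque origins such as "null"
  }
  return EXTENSION_PROTOCOLS.has(protocol) && path.startsWith(LIBRARY_PATH_PREFIX);
}
```

Keeping the check to a scheme plus a path prefix means every other cross-origin request still falls through to the regular origin validation.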

* feat(design-systems): add download functionality for design systems

- Implemented a new command `od design-systems download <id>` to allow users to download design systems as a .zip file, including all system files and a generated SKILLS.md usage guide.
- Updated the CLI help documentation to include usage instructions for the new download command.
- Enhanced the design systems API to support the download feature, ensuring only user design systems are accessible while handling errors for non-existent presets.
- Added localization strings for the new download functionality across multiple languages.

These changes enhance the usability of design systems by providing a straightforward method for users to obtain and share their design assets.

* feat(design-systems): enhance design system management and localization

- Introduced new UI components and styles for managing design systems, including buttons for downloading, refreshing, and resetting edits.
- Updated the DesignKitView to support direct actions for DESIGN.md editing, improving user interaction with design systems.
- Enhanced the DesignSystemDetail component to include download functionality and improved state management for design system edits.
- Added localization strings for new features, ensuring consistent user experience across multiple languages.
- Improved error handling and user feedback for design system operations, including download failures.

These changes streamline the design system management process, making it more intuitive and user-friendly while ensuring robust localization support.

* feat(tests): add comprehensive tests for design system archive functionality

- Introduced a new test suite for validating the `buildUserDesignSystemArchive` and `buildDesignSystemSkillsMarkdown` functions.
- Implemented tests to ensure correct packing of design system files, including the generation of a `SKILLS.md` guide and exclusion of internal metadata.
- Added checks for handling non-user IDs and scenarios where a design system already includes its own `SKILLS.md`.
- Enhanced the overall test coverage for design system functionalities, ensuring reliability and correctness in the design system archive process.

These changes improve the robustness of the design system features by ensuring thorough testing of critical functionalities.

* feat(figma-import): enhance CLI output and add Figma import endpoint

- Updated the CLI to conditionally log detailed import information based on the `--json` flag, improving usability for users who prefer JSON output.
- Introduced a new API endpoint for importing Figma files, handling file uploads and validating project existence, with appropriate error responses for missing files or invalid URLs.
- Added a dedicated route for the Figma import functionality, ensuring seamless integration with existing project workflows.

These changes improve the Figma import experience by providing clearer output options and robust error handling, enhancing overall user interaction with the CLI and API.

* feat(design-files): enhance DesignFilesPanel with new actions and styles

- Added new action buttons for opening a browser and creating a design system in the DesignFilesPanel, improving user interaction in the empty state.
- Updated styles for action buttons to enhance visual distinction and usability.
- Enhanced tests to verify the functionality of new actions in the DesignFilesPanel, ensuring they trigger correctly.

These changes improve the user experience by providing additional functionality and clearer visual cues in the design files interface.

* fix(ci): restore new project modal flow

* fix(ci): align design kit and onboarding checks

* fix(ci): sync bake preview workflow action

* fix(ci): include plugin preview helper scripts

* fix(review): harden brand source and preview flows

* fix(ci): stabilize web workspace tests

* fix(review): address latest blocking feedback

* chore(ci): retrigger validation after label update

* chore: re-trigger CI on updated main — needs-validation gate moved to merge_group (#4714)

* refactor(lightbox): implement portal for overlays to resolve z-index issues

- Updated the lightbox component to use React's createPortal for rendering overlays directly to the <body>, ensuring proper z-index stacking.
- Removed session mode toggle from HomeHero and adjusted related styles and tests accordingly.
- Cleaned up CSS by removing unused styles related to session mode toggle.
- Updated tests to reflect changes in the HomeHero component and its interaction with the design router.

* style(home-hero): remove focus halo from template search input

- Updated CSS to eliminate the global input focus outline and box-shadow for the template search field in the HomeHero component.
- Added a test to verify that the template picker search field maintains a clean appearance when focused.

* feat(design-system): add create design CTA and enhance design kit functionality

- Introduced a new `DesignSystemCreateCta` component to facilitate creating new designs from an active design system, enhancing user experience in the chat interface.
- Updated `ChatPane` to include the new CTA, allowing users to create designs directly from the chat.
- Enhanced `DesignKitView` with sticky header functionality for better accessibility while scrolling.
- Added new CSS styles for the `DesignSystemCreateCta` component to ensure a visually appealing and consistent design.
- Updated internationalization files to include new strings for the design system creation feature.

* feat(upload): enhance file upload handling and error recovery

- Introduced `sanitizePath` to preserve directory structures during file uploads, preventing issues with subdirectory paths.
- Updated `DesignKitView` and related components to utilize the new `sanitizePath` function for improved file name resolution.
- Added `KitErrorBoundary` component to gracefully handle rendering errors in the design kit, providing a user-friendly fallback.
- Implemented internationalization updates for new error messages and action confirmations related to uploads and error handling.
- Enhanced CSS styles for better visual feedback during error states and improved user experience.
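
Preserving subdirectories while still cleaning an uploaded path generally means sanitizing each segment rather than the whole string. A minimal sketch under assumptions: `sanitizeRelativePath` is a hypothetical stand-in, and the actual rules applied by `sanitizePath` are not shown in this log.

```typescript
// Hypothetical stand-in; the real sanitizePath may apply different rules.
function sanitizeRelativePath(input: string): string {
  return input
    .split(/[\\/]+/) // accept both POSIX and Windows separators
    .map((segment) => segment.trim())
    // drop empty segments and anything that could climb out of the upload root
    .filter((segment) => segment !== "" && segment !== "." && segment !== "..")
    // replace characters that are unsafe in file names on common platforms
    .map((segment) => segment.replace(/[<>:"|?*]/g, "_"))
    .join("/");
}
```

Because the directory separators survive, `assets/icons/logo.svg` keeps its folder structure instead of being flattened to a single file name.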

* feat(design-kit): add keyboard shortcuts hint and enhance key handling

- Introduced a new keyboard shortcuts hint in the DesignKitView, providing users with quick access to essential actions (E edit, C copy, U upload, R refresh, ⌫ delete logo).
- Implemented a keydown event handler to manage keyboard shortcuts contextually within the design kit, improving user interaction and accessibility.
- Updated CSS for the shortcuts hint to ensure it remains low-contrast until hovered, enhancing the UI experience.
- Added internationalization support for new shortcut labels and hints across multiple languages.
- Adjusted DesignSystemsTab to prefer user logos for their systems, improving visual consistency.

* feat(design-system): introduce DesignSystemExtractionPanel and enhance design system interactions

- Added the `DesignSystemExtractionPanel` component to facilitate user interactions during design system extraction, providing a synthesized conversation view and next steps.
- Updated `ChatPane` to render the new extraction panel when a design system is active, enhancing user guidance.
- Introduced a new utility function `designSystemExtractionSource` to derive human-readable labels for design system sources.
- Enhanced internationalization support with new strings for extraction-related actions and prompts across multiple languages.
- Updated various components and tests to reflect changes in terminology and functionality, improving overall user experience.

* feat(project): add project deletion functionality and enhance design system interactions

- Introduced `onDeleteProject` prop in `ProjectView` to handle project deletion, improving project management capabilities.
- Updated `AppInner` to include the new delete project handler, enhancing user experience in project interactions.
- Enhanced `DesignKitView` and `DesignSystemsTab` with loading states and improved visual feedback during design system resolution.
- Removed deprecated `DesignSystemCreateCta` component and associated styles, streamlining the codebase.
- Updated internationalization files to reflect changes in project management terminology and actions.

* feat(design-kit): enhance internationalization and user feedback in DesignKitView

- Updated various labels and error messages in the DesignKitView to utilize internationalization functions, improving accessibility and user experience.
- Enhanced color input validation messages and added confirmation prompts for design system deletions in DesignSystemsTab and FileWorkspace.
- Introduced new props for handling design system project deletions, streamlining project management.
- Updated internationalization files to reflect new strings and translations for improved user guidance across multiple languages.

* refactor(design-kit): remove keyboard shortcuts hint and streamline header menu

- Eliminated the keyboard shortcuts hint from the DesignKitView, simplifying the header menu.
- Updated the sticky-header overflow menu to exclude upload, full-system preview, and shortcut help actions, focusing on essential project operations.
- Adjusted related tests to reflect the removal of the shortcuts hint and ensure accurate menu item visibility.

* feat(brand-routes): add extract-from-html endpoint for brand extraction

- Introduced a new POST endpoint `/api/brands/:id/extract-from-html` to re-run brand extraction using HTML rendered from the in-app browser after clearing anti-bot walls.
- Implemented error handling for missing HTML and brand not found scenarios.
- Enhanced the `extractBrandFromHtml` function to process the provided HTML and optional CSS, integrating it into the existing brand extraction workflow.
- Updated `prefetch` functionality to support extraction from pre-rendered HTML, improving the overall brand data retrieval process.

* chore(nix): refresh pnpm deps hash

* feat(brand-cli): add extract-from-html command for brand extraction

- Introduced a new CLI command `od brand extract-from-html` to facilitate brand extraction from pre-captured rendered HTML, allowing users to bypass anti-bot walls.
- Enhanced the command to accept optional CSS and base URL parameters, improving flexibility in extraction scenarios.
- Implemented error handling for missing HTML input and invalid brand IDs, ensuring robust user feedback.
- Updated the `BRAND_USAGE` documentation to reflect the new command and its usage details.
- Adjusted server configuration to accommodate larger payloads for the new extraction endpoint.

* feat(design-system): enhance design system extraction and browser tools

- Added a new script to collect CSS styles from rendered pages, improving brand extraction capabilities by capturing computed styles from cross-origin stylesheets.
- Removed the `DesignSystemExtractionPanel` component and its associated styles, streamlining the codebase.
- Updated `ProjectView` and `FileWorkspace` components to enhance design system interactions and improve user experience.
- Introduced new internationalization strings for design system phases and actions, ensuring better user guidance across multiple languages.

* feat(brand-assist): implement browser assist for brand extraction

- Added support for a client-side confirmation mechanism for the brand-browser-assist od-card, allowing users to extract brand information from the unblocked browser DOM.
- Enhanced the `ProjectView`, `ChatPane`, and `AssistantMessage` components to handle the new assist functionality, improving user interaction during brand extraction.
- Introduced new internationalization strings for browser assist prompts and messages, ensuring clarity and guidance across multiple languages.
- Updated the `useBrandReadyPrompt` hook to manage the state of the browser assist, providing a seamless user experience when dealing with anti-bot walls.

* feat(brand-prompt): enhance BrandReadyPrompt with refinement options

- Updated the BrandReadyPrompt component to include options for AI optimization and manual editing, allowing users to refine extracted brand systems.
- Added a new refinement nudge to inform users that automatic extraction may miss details, improving user guidance.
- Adjusted styles for the prompt and dismiss button for better alignment and visual consistency.
- Introduced new internationalization strings for the refinement features, ensuring clarity across multiple languages.
- Removed deprecated PPTX export functionality from the FileViewer component, streamlining the export options.

* refactor(export): remove PPTX export functionality and streamline export options

- Eliminated PPTX export support across various components, including CLI, desktop, and web, to simplify export formats.
- Updated documentation and help messages to reflect the removal of PPTX, ensuring clarity for users.
- Adjusted export-related types and constants to focus on PDF and image formats only, enhancing code maintainability.
- Improved user experience by refining export options and related UI elements.

* refactor(export): remove PPTX references and update export functionality

- Removed all instances of PPTX export functionality from the codebase, including related dependencies and comments.
- Updated export options to focus solely on PDF and image formats, enhancing clarity and maintainability.
- Adjusted UI components and tests to reflect the removal of PPTX, ensuring a streamlined user experience.
- Improved internationalization strings and documentation to align with the new export capabilities.

* chore(nix): refresh pnpm deps hash

* fix(onboarding): preserve selected runtime

* fix(brand): localize generated kit copy

* fix(onboarding): align first-run flow with main

* fix(nav): use palette icon for design systems

* fix(analytics): use design system onboarding step

* fix(ui): remove design system guide toggle

* fix(ui): position design system ready prompt

* fix(ui): space plugin task notice

* fix(web): restore home ask mode and design kit preview

* test(e2e): align onboarding visual capture

* test(e2e): align amr onboarding checks

* fix(brand): remove blocked reference brands

* feat(onboarding): show profile choices as chips

* fix(home): prefer design system cover art in recents

* test(e2e): select onboarding profile chips

* feat(brand-extraction): implement programmatic extraction transcript and UI enhancements for design systems

* feat(brand-extraction): enhance programmatic extraction with transcript agent support and UI improvements

* feat(brand-extraction): add transcript agent resolution and improve message handling in brand extraction

* fix(design-systems): stabilize loading state coverage

* test(e2e): align design system detail visual

* fix(brand-extraction): backfill programmatic transcripts

* fix(web): refresh ready brand design systems

* fix(brands): stabilize extraction handoff and seed colors

* fix(brands): return extraction transcript immediately

* fix(web): open new project modal from entry rail

* fix(editing): expose content edits for plain targets

* feat(file-viewer): implement manual edit draft dirty state tracking and reset logic

* feat(design-system): enhance project creation flow with conversation ID handling

* feat(brands): implement light theme handling for color extraction and seed generation

* feat(brands): add finalizeBrandProject function for brand project completion

* feat(file-workspace): add designSystemBrandId prop and update DesignSystemProjectPanel to use it

* Fix manual editing for brand kits

* fix(design-system): wait for project refreshes

* fix(web): open new project modal from rail

* fix(web): restore home ask mode toggle

* fix(web): sync brand color edits with seeds

* fix(web): stabilize design system workspace tests

* test(tools-pack): relax Windows resource cache timeout

* chore(pr): retrigger review after validation

* fix(web): surface design kit action progress

* fix(web): clarify brand next-step actions

* fix(web): cancel programmatic brand extraction

* fix(web): add design systems tab action feedback

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: xne998808-ai <xne998808@gmail.com>
Co-authored-by: PerishCode <perishcode@gmail.com>
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: lefarcen <935902669@qq.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-06-25 03:56:14 +00:00
Caprika
eef6efa5c7 [codex] show AMR wallet balance in Open Design (#4675)
* feat: show AMR wallet balance

* fix: preserve AMR wallet reauth state

* chore: retrigger PR validation

* fix: bound AMR wallet balance fetch

* chore: retrigger wallet validation

* chore: bump bundled Vela CLI to 0.0.18

* chore: retrigger Vela CLI validation

* chore(nix): refresh pnpm deps hash
2026-06-24 03:58:49 +00:00
PerishFire
0bf1b6d6b8 [codex] converge release workflows and stable dry-runs (#4390)
* fix(tools-pack): use junctions for Windows standalone peer deps

* fix(desktop): expose IPC during startup

* fix(tools-pack): preserve Windows inspect diagnostics

* fix(tools-pack): report Windows inspect status errors

* fix(packaged): use Electron net fetch for app protocol

* fix(packaged): load Windows renderer from web sidecar

* fix(desktop): show Windows packaged window during startup

* fix(packaged): disable Windows GPU startup

* fix(tools-pack): keep Windows core smoke observable

* fix(packaged): remove Windows startup probes

* fix(tools-pack): trace Windows desktop IPC status

* fix(tools-pack): add Windows IPC diagnose loop

* fix(release): default beta-s Windows updater feed

* chore: clean merged test eof

* refactor(release): unify prerelease channel model

* chore(release): close prerelease doc escape hatches

* refactor(release): converge release channel workflows

* fix(release): install toolchain in metadata jobs

* fix(release): build release package before contracts

* chore(release): bump development version to 0.10.1

* fix(e2e): seed windows packaged smoke runtime config

* fix(release): install toolchain for metadata publish

* fix(release): materialize betas metadata checkout

* chore(release): bump development version to 0.10.2

* fix(release): allow betas metadata cold start from s3

* fix(e2e): support betas packaged update scenarios

* fix(release): pass betas channel into packaged smoke

* fix(release): set betas channel during self-hosted builds

* fix(release): verify counted channel reservations

* fix(release): use pnpm cmd for betas windows publish

* fix(release): add betas manifest artifact fallback

* fix(release): skip beta-s public metadata fetch

* fix(release): read beta-s manifests from storage

* fix(release): cache beta windows tools-pack builds

* fix(release): inline beta mac tools-pack builds

* fix(pack): deep sign unsigned mac bundles

* docs(pack): document payload-first beta updater validation

* fix(release): align preview tools-pack cache flow

* fix(release): align prerelease tools-pack cache flow

* fix(release): pass github token to prerelease metadata

* fix(release): setup pnpm before feishu notify

* fix(release): add stable dry-run prepublish flow

* fix(release): accept completed prerelease metadata gate

* fix(release): require stable release branches

* fix(release): converge r2 access checks

* fix(updater): use release channel parser for defaults

* fix(updater): harden windows payload relaunch

* fix(release): converge updater smoke fixture contract

* test(e2e): require silent updater fixture output

* fix(release): align stable windows smoke build path

* fix(ci): include release workspace in validation

* fix(ci): repair release validation lanes

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

* fix(ci): restore zero-install Feishu notification

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-23 06:13:21 +00:00
PerishFire
c782aeb3bb ci: stop duplicate post-merge validation (#4469) 2026-06-17 10:04:03 +00:00
PerishFire
891981d460 [codex] Optimize CI runtime topology (#4450)
* Optimize e2e tools-dev runtime parallelism

* Remove visual networkidle wait

* Optimize e2e vitest parallelism

* Optimize Nix flake caching

* Test CI on Blacksmith runners

* Allow parallel manual CI runs

* Tier CI runner sizes

* Temporarily narrow CI debug scope

* Instrument watcher CI timeouts

* Instrument watcher event diagnostics

* Avoid default polling in watcher tests

* Skip runtime trace during watcher debug

* Probe ARM runner tiers for CI

* Focus CI runner probe on x64 browser

* Probe browser workers by runner size

* Probe Playwright file parallelism

* Probe Playwright worker scaling on 8v

* Reshape CI topology

* Fix split E2E Vitest CI lane

* Simplify daemon CI topology

* Optimize Windows payload CI setup

* Revert "Optimize Windows payload CI setup"

This reverts commit 5cbc48c0af.

* Cache better-sqlite3 Nix binding separately

* Revert "Cache better-sqlite3 Nix binding separately"

This reverts commit 0384e3787e.

* Remove unused Nix cache setup from CI

* Use Blacksmith ARM for lightweight CI jobs
2026-06-17 12:53:06 +08:00
Chris Tam
f5b71e1af3 fix(nix): pin pnpm_10 through fetchPnpmDeps and pnpmConfigHook (#3990)
* fix(nix): pin pnpm_10 through fetchPnpmDeps and pnpmConfigHook

The flake's pnpm_10 override built a 10.33.2 binary, but only added
it to nativeBuildInputs. Both fetchPnpmDeps (the deps-fetch phase)
and pnpmConfigHook (the install phase) were silently falling back to
pkgs.pnpm, which has moved to 11.x on nixpkgs-unstable. That
triggers ERR_PNPM_UNSUPPORTED_ENGINE since package.json pins
>=10.33.2 <11.

Thread pnpm_10 into both:
  - pnpmConfigHook.overrideAttrs replaces its hardcoded
    propagatedBuildInputs = [ pnpm ] with [ pnpm_10 ].
  - fetchPnpmDeps now receives pnpm = pnpm_10 explicitly
    (its default is pkgs.pnpm).

Verified by rebuilding .#daemon.pnpmDeps and .#web.pnpmDeps
against current unstable; both succeed.

* fix: Clean up unused overrides
2026-06-15 07:01:18 +00:00
Marc Chan
0390de94a0 refactor(components): share composable dialog primitives for web dialogs (#4256)
* refactor(web): standardize simple dialog shell structure

Generated-By: looper 0.9.8 (runner=worker, agent=opencode)

* fix(web): preserve sketch text modal drafts

Generated-By: looper 0.9.8 (runner=fixer, agent=opencode)

* refactor(components): share dialog primitive

* refactor(web): share more low-risk dialogs

* fix(components): let custom dialog panels opt out of modal chrome

Generated-By: looper 0.9.9 (runner=fixer, agent=opencode)

* fix(components): drop dialog chrome on opt-out panels

Generated-By: looper 0.9.9 (runner=fixer, agent=opencode)

* fix(web): keep replacement confirm dialog custom chrome

Generated-By: looper 0.9.9 (runner=fixer, agent=opencode)

* fix(components): preserve custom dialog backdrop chrome

Generated-By: looper 0.9.9 (runner=fixer, agent=opencode)

* fix(components): localize dialog tests and keyframes

Generated-By: looper 0.9.9 (runner=fixer, agent=opencode)

* fix(nix): refresh daemon pnpm deps hash

Generated-By: looper 0.9.9 (runner=fixer, agent=opencode)

* refactor(components): add dialog section primitives

* fix(components): preserve dialog variant sizing

Generated-By: looper 0.9.9 (runner=fixer, agent=opencode)
2026-06-15 03:15:20 +00:00
PerishFire
a0afc584bb [codex] centralize daemon data directory docs (#4222)
* docs: centralize daemon data directory contract

* fix(e2e): allow slower artifact consistency navigation

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* docs: localize daemon data directory pointers

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-15 02:52:05 +00:00
lefarcen
1df027c02f chore(amr): bump bundled vela CLI to 0.0.16 (#4071)
* chore(amr): bump bundled vela CLI to 0.0.16

Picks up powerformer/vela#277 (forward OpenCode token usage on the ACP
session/prompt result), so packaged builds that bundle the vela CLI report
real provider token counts for AMR runs instead of token_count_source:unknown.

* chore(nix): refresh pnpm-deps hashes for vela CLI 0.0.16 bump

The lockfile change shifts the daemon/web pnpm-deps fixed-output hashes;
regenerate nix/pnpm-deps.nix so 'Validate Nix flake' passes (hashes taken
from the CI nix-hash-refresh artifact).
2026-06-10 09:46:45 +00:00
lefarcen
3f5fe0f96c feat(amr): forward AMR_CLIENT_SOURCE to vela + source on wallet links (#4005)
* fix(landing): keep auth redirects on AMR wallet

* feat(amr): tag vela CLI launches with AMR_CLIENT_SOURCE + source on wallet links

So vela analytics can attribute Open Design's command funnel and model spend
back to this host (source=open_design), and the AMR wallet landing carries the
source for the web page_view.

- runtimes/env: set AMR_CLIENT_SOURCE=open_design for the amr agent spawn (not
  PII, so no telemetry-consent gate)
- integrations/vela: spawnVelaLogin forwards OD_INSTALLATION_ID (consent-gated)
- web/daemon: AMR wallet recharge URL gains ?source=open_design
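
As an illustration of the `?source=open_design` tagging listed above, here is a minimal sketch using the standard `URL` API. The helper name `withSource` and the example URLs are assumptions made for this sketch; the actual recharge URL constant and helper in the repository are not shown in this log.

```typescript
// Hypothetical helper: append (or overwrite) a `source` query parameter so
// the destination can attribute the visit. Not the repository's actual code.
function withSource(rawUrl: string, source: string = "open_design"): string {
  const url = new URL(rawUrl);
  // `set` replaces any existing `source` value instead of duplicating it.
  url.searchParams.set("source", source);
  return url.toString();
}

// Example with a placeholder host (the real wallet URL is not given here).
const tagged = withSource("https://example.com/wallet/recharge");
console.log(tagged);
```

Using `searchParams.set` rather than string concatenation keeps existing query parameters intact and avoids a doubled `?`.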

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(web): expect ?source=open_design on AMR console/recharge URLs

The AMR console/recharge links the OD app opens into vela web now carry
?source=open_design so vela attributes the visit to Open Design. Update the
console/recharge URL assertions accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(daemon): expect ?source=open_design on AMR insufficient-balance recharge URL

DEFAULT_AMR_RECHARGE_URL now carries the source param for attribution.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(amr): bump bundled vela CLI to 0.0.15 + fix e2e recharge URL assertion

- tools/pack: bundle @powerformer/vela-cli 0.0.15 (the release carrying the
  source-attribution logic that reads AMR_CLIENT_SOURCE and tags model_request)
- e2e: expect the insufficient-balance recharge URL to carry ?source=open_design
  (addresses review on DEFAULT_AMR_RECHARGE_URL)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(nix): refresh pnpm-deps hashes after vela CLI bump

The tools/pack @powerformer/vela-cli bump changed pnpm-lock.yaml, invalidating
the daemon/web fetchPnpmDeps fixed-output hashes. Update to the values the Nix
flake check computed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: audit <a@b.c>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 02:02:48 +00:00
Caprika
3abedf09c8 [codex] Bump packaged vela-cli to 0.0.13 (#3897)
* Bump packaged vela-cli to 0.0.13

* chore(nix): refresh pnpm deps hash

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-08 09:06:51 +00:00
PerishFire
10dfd32a3e Revert "feat: add screenshot-based visual validation to critique loop (#3660)" (#3865)
This reverts commit 931780c914.
2026-06-08 06:24:44 +00:00
nettee
931780c914 feat: add screenshot-based visual validation to critique loop (#3660)
* feat(daemon): add visual validation atom

Generated-By: looper 0.9.3 (runner=worker, agent=codex)

* fix(daemon): fail closed in visual validation

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): narrow visual validation defaults

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): add visual validation to default critique stages

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): harden visual validation review fixes

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): fail closed in visual validation discovery

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* test(daemon): tighten visual validation discovery regression

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): match visual validation reference viewport

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): tighten visual validation capture flow

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): fail closed across visual validation refs

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): honor metadata entry file in visual validation

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): capture visual validation through preview route

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): fail closed without preview context

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): disable pre-start visual validation worker

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): narrow visual validation spec discovery

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): stop advertising visual validation as runnable

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): restore visual validation critique worker

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): delay visual validation until run success

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* test(daemon): cover post-run visual validation scheduling

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): gate post-run visual validation before finish

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* chore(nix): refresh pnpm deps hash

* fix(daemon): preserve deferred pipeline ordering

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(packaging): bundle Chromium for visual validation

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(scenarios): keep visual validation out of default critique loop

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(packaging): bundle Playwright headless shell

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(packaging): invalidate Playwright resource cache

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* fix(daemon): include visual-validation in bundled atom roster

Generated-By: looper 0.9.3 (runner=fixer, agent=codex)

* chore(nix): refresh pnpm deps hash

* fix: restore pre-run plugin surfaces and nix hashes

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* fix: preserve pre-run surfaces and tolerate headed-only chromium bundles

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* fix: launch visual validation with bundled chromium

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* fix: avoid Playwright fixture teardown races

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* fix: fail packaged visual validation on shell-only chromium

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* fix: isolate synthetic Playwright test bundles

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-08 04:03:43 +00:00
Caprika
02a1a5a537 [codex] fix connector refresh propagation (#3723)
* fix connector refresh propagation

* address connector refresh review feedback

* bump vela cli to 0.0.12

* chore(nix): refresh pnpm deps hash

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-05 10:31:46 +00:00
lefarcen
817885bc34 feat(amr): forward installationId to the vela CLI for analytics correlation (#3634)
* feat(amr): forward installationId to the vela CLI for analytics correlation

When the daemon spawns the vela (amr) CLI, pass this installation's id as
OD_INSTALLATION_ID once the user has consented to telemetry. vela attaches it
to its analytics events so CLI activity can be correlated back to the
open-design installation that launched it.

Resolution mirrors the /api/analytics/config handler: consent =
telemetry.metrics === true; channel-root installation.json wins over the legacy
app-config.json field. Reads are synchronous and best-effort — any missing
file / parse error / withheld consent simply omits the id.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(amr): bump bundled @powerformer/vela-cli to 0.0.11

0.0.11 is the first vela CLI release that consumes OD_INSTALLATION_ID (the env
forwarded by this PR's daemon change), so bundle it to ship the installationId
analytics correlation end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(amr): mirror readAppConfig defaults in installationId forwarding

Address review: the previous helper bailed out entirely when app-config.json
was missing or corrupt, so vela correlation was skipped even though the web
analytics config (via readAppConfig) would still emit events with the same
installationId. Now match readAppConfig / applyTelemetryDefaults semantics:

- a missing/corrupt app-config.json is treated as an empty config, not a hard
  failure — we still consult the channel-root installation.json for the id;
- telemetry defaults to on (opt-out model); the id is withheld only when the
  user has explicitly set telemetry.metrics to a non-true value.
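
The resolution rules described in this commit (opt-out telemetry, missing config treated as empty, channel-root installation.json preferred over the legacy field) could be sketched roughly as follows. The type shapes, the `installationId` field names, and the function name `installationIdEnv` are assumptions for illustration; only `OD_INSTALLATION_ID` and `telemetry.metrics` come from the commit text.

```typescript
// Hypothetical shapes; the real config types are not shown in this log.
interface AppConfig {
  telemetry?: { metrics?: unknown };
  installationId?: string; // assumed name of the legacy app-config.json field
}
interface InstallationFile {
  installationId?: string; // assumed field in channel-root installation.json
}

// Returns the extra env vars for the spawned CLI, or {} when the id is withheld.
// `null` stands for a missing or corrupt file.
function installationIdEnv(
  appConfig: AppConfig | null,
  installation: InstallationFile | null,
): Record<string, string> {
  const config: AppConfig = appConfig ?? {}; // missing/corrupt => empty config
  const metrics = config.telemetry?.metrics;
  // Opt-out model: unset means consented; any explicit non-true value withholds.
  const consented = metrics === undefined ? true : metrics === true;
  if (!consented) return {};
  // The channel-root file wins over the legacy app-config field.
  const id = installation?.installationId ?? config.installationId;
  return id ? { OD_INSTALLATION_ID: id } : {};
}
```

Under these assumptions, a missing app-config.json combined with a valid installation.json still yields the id, which is the behavior the review asked for.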

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(nix): refresh pnpm-deps hashes for vela-cli 0.0.11 bump

The bundled @powerformer/vela-cli bump changed pnpm-lock.yaml, so the
fixed-output pnpm-deps derivation hashes no longer matched. Apply the
CI-generated hash refresh (nix-hash-refresh artifact).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(amr): resolve installationId via shared readAppConfigSync

Address review: instead of hand-replicating readAppConfig's parsing in the
spawn-env helper (which kept diverging on edge cases like invalid telemetry
shapes), reuse the real logic. Add a synchronous mirror of readAppConfig:

- installation.ts: extract parseInstallationFile + add readInstallationFileSync
- app-config.ts: add exported readAppConfigSync (same filterAllowedKeys /
  validateTelemetry / applyTelemetryDefaults pipeline as the async path; only
  skips the best-effort migration write)
- env.ts: amrAnalyticsIdentityEnv now calls readAppConfigSync and gates on
  telemetry.metrics === true (matching analytics.ts), so vela correlation can
  no longer drift from the web analytics config for any config shape.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(nix): refresh pnpm-deps hashes after main merge

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 09:44:07 +00:00
luden
3a21392c46 Fix Electron composer lag and stale generation failures (#3593)
* Keep chat typing and web CSS resolution responsive

Cache the normalized mention index by entity array and scan only concrete @ positions so the composer does not rescan every file/plugin/skill token on each draft update. Link the shared components package at the root importer so Turbopack/PostCSS can resolve the exported component stylesheet from the web CSS entrypoint.

Constraint: Chat composer typing updates run on every keystroke in Electron, and tools-dev/Next evaluates global CSS from the workspace root.
Rejected: Debouncing textarea state updates | would make text entry feel stale and would not remove the expensive synchronous parser path.
Rejected: Relative-importing package source CSS from apps/web | would bypass the package export contract and hardcode package layout into the app stylesheet.
Confidence: high
Scope-risk: narrow
Directive: Keep inline mention parsing proportional to draft mention markers, and keep package CSS imports resolvable through workspace package links.
Tested: pnpm install
Tested: pnpm --filter @open-design/web test -- tests/utils/inlineMentions.test.ts
Tested: pnpm --filter @open-design/web typecheck
Tested: pnpm --filter @open-design/web build
Tested: pnpm guard
Not-tested: Full web Vitest remains blocked by unrelated CSS selector tests observed before this commit.
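
A minimal sketch of the approach this commit describes: cache the normalized mention index per entity array, and scan only at concrete `@` positions. The `MentionEntity` shape, the function names, and the lowercase-name normalization are assumptions for illustration, not the code in `inlineMentions`.

```typescript
// Hypothetical entity shape; the real mention entities are not shown in this log.
interface MentionEntity {
  id: string;
  name: string;
}

// Cache keyed by the entity array's identity, so the index is rebuilt only
// when a new array is passed in, not on every draft update.
const indexCache = new WeakMap<readonly MentionEntity[], Map<string, MentionEntity>>();

function mentionIndex(entities: readonly MentionEntity[]): Map<string, MentionEntity> {
  let index = indexCache.get(entities);
  if (!index) {
    index = new Map<string, MentionEntity>();
    for (const entity of entities) index.set(entity.name.toLowerCase(), entity);
    indexCache.set(entities, index);
  }
  return index;
}

// Visit only the '@' markers in the draft instead of testing every entity
// against the whole text.
function findMentions(draft: string, entities: readonly MentionEntity[]): MentionEntity[] {
  const index = mentionIndex(entities);
  const found: MentionEntity[] = [];
  let at = draft.indexOf("@");
  while (at !== -1) {
    // The token is the run of non-space, non-'@' characters after the marker.
    const match = /^[^\s@]+/.exec(draft.slice(at + 1));
    const hit = match ? index.get(match[0].toLowerCase()) : undefined;
    if (hit) found.push(hit);
    at = draft.indexOf("@", at + 1);
  }
  return found;
}
```

With this shape, the per-keystroke cost depends on the number of `@` markers in the draft rather than on the number of files, plugins, and skills.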

* Keep composer typing off the render path

Move keystroke-time draft updates into the textarea/ref path while deferring React draft rendering, mention overlay work, picker filtering, and localStorage persistence so Electron input echo stays within frame budget.

Constraint: Electron chat input was visibly lagging for plain text and mention text while the right iframe stayed responsive.

Rejected: Keep the textarea fully controlled | every keystroke still synchronizes the largest ChatComposer render path.

Confidence: high

Scope-risk: moderate

Directive: Keep prompt submission, slash commands, and mention insertion reading draftRef when draft rendering is deferred.

Tested: pnpm guard; pnpm --filter @open-design/web typecheck; pnpm --filter @open-design/web build; pnpm exec vitest run -c vitest.config.ts tests/components/ChatComposer.infinite-render.test.tsx; pnpm exec vitest run -c vitest.config.ts tests/components/ChatComposer.context-pickers.test.tsx; pnpm exec vitest run -c vitest.config.ts tests/utils/inlineMentions.test.ts; Electron inspect eval plain text avg 0.03ms max 0.5ms; Electron inspect eval @ path avg 0.03ms max 0.1ms; HTTP smoke status 200 with no components CSS resolve error.

Not-tested: Full web Vitest suite and Playwright UI suite.

* Keep Electron chat input responsive under heavy projects

Constraint: Electron project views can carry large chat/file DOM and Chromium can throttle or defer visible renderer work under tab/window visibility edge cases.
Rejected: Re-rendering the composer draft on every plain keystroke | it couples textarea input to mention overlay, resize, storage, and project layout work.
Confidence: medium
Scope-risk: moderate
Directive: Keep plain text input on the textarea/ref path; only commit React draft state immediately for mention, slash, chip, or imperative draft mutations.
Tested: pnpm guard; pnpm --filter @open-design/web typecheck; pnpm exec vitest run -c vitest.config.ts tests/components/ChatComposer.infinite-render.test.tsx; pnpm --filter @open-design/desktop test; pnpm --filter @open-design/desktop typecheck; desktop inspect actual project composer with OS SendKeys.
Not-tested: Push/PR remains blocked by GitHub 403 for mimicryluden on nexu-io/open-design.

* Keep heavy project typing isolated from preview layout

Constraint: Electron project panes can carry large DOM histories and preview frames that make textarea paint sensitive to parent layout.

Rejected: Further debouncing ChatComposer input state | app-side handler costs were already low while native textarea updates still lagged.

Confidence: high

Scope-risk: moderate

Directive: Keep the composer mounted through the fixed layer/slot pair so future layout changes do not pull typing back into the split-pane flex reflow path.

Tested: pnpm --filter @open-design/web typecheck; pnpm exec vitest run -c vitest.config.ts tests/components/ChatComposer.infinite-render.test.tsx tests/components/ChatComposer.context-pickers.test.tsx; pnpm guard; pnpm --filter @open-design/web build; desktop inspect fixed-layer DOM and SendKeys probe avg beforeinput->input 1.4ms max 2.5ms

Not-tested: PR publish blocked by GitHub repository permissions if remote still rejects mimicryluden

* Keep stale run recovery from hiding generated previews

Constraint: Completed daemon runs can outlive the in-memory run registry after a restart.

Rejected: Replaying succeeded rows with empty producedFiles | It turns completed messages into failed messages when the registry is gone.

Confidence: high

Scope-risk: narrow

Directive: Only active run statuses should enter daemon reattach recovery.

Tested: vitest generation-preview; vitest ProjectView reattach restore; web typecheck; pnpm guard; web build; desktop DOM eval failureText=false iframeCount=2

Not-tested: Fresh provider generation against an external model

* Keep composer contracts aligned with latest main

Rebase conflict resolution restored the current composer contract while leaving the lower-risk latency fixes in separate layers.

Constraint: The branch was rebased over newer workspace/session context composer contracts.

Rejected: Keeping the older composer conflict side | it removed current ChatComposer props and broke web typecheck.

Confidence: high

Scope-risk: narrow

Directive: Preserve upstream ChatComposer contracts when rebasing input-latency work.

Tested: web typecheck; targeted ChatComposer, inlineMentions, ProjectView generation-preview tests; desktop window-chrome test

Not-tested: Full browser interaction after fork push

* Fix daemon Nix workspace symlink pruning

---------

Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-06-04 10:47:44 +00:00
elihahah666
f3ec4968af feat(web): revamp chat UI styling + live streaming code card (#3382)
* feat(web): revamp chat UI styling + live streaming code card

Restyle the chat pane (chat/tokens/code/tools/routines/composio CSS) and
add a live code-component card that streams Write/Edit tool input and
<artifact type="text/html"> bodies character-by-character via a new
ephemeral tool_input_delta SSE event, flipping to the completed
FileWriteCard/FileEditCard view on close. Includes shiki highlighting,
markdown/artifact strip handling, tool-status cleanup, and i18n keys.

* chore(nix): refresh pnpm deps hash for merged lock

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-06-03 16:39:20 +00:00
Tom Huang
c6722d2671 feat: Lexical composer, interactive terminals, comment attachments & browser reference board (#3516)
* feat(daemon): add interactive terminal support with node-pty

- Introduced a new terminal service to manage interactive terminal sessions.
- Added routes for creating, streaming, and managing terminal sessions in the daemon.
- Integrated terminal functionality into the CLI, allowing users to open interactive shells.
- Updated project routes to support conversation seeding for side chats.
- Enhanced the web application to include terminal tab functionality and UI components.

This feature enables users to interact with a terminal directly within the application, enhancing the overall user experience and providing a more integrated development environment.

* feat(web): implement embedded browser module in Design Files workspace

- Added a `+` icon to the Design Files tab for opening a new Browser module.
- The Browser module supports navigation features including back, forward, refresh, and address input.
- Integrated a curated list of design reference URLs for user convenience.
- Implemented browser data clearing functionality via IPC.
- Enhanced desktop runtime to support embedded browser with appropriate security measures.
- Added tests for browser functionality and URL handling.

This commit establishes a new workspace for browsing and referencing design resources directly within the application, improving user experience and accessibility to design tools.

* refactor(web): enhance add tab menu functionality in FileWorkspace

- Updated the add tab button to toggle the visibility of the menu with improved accessibility attributes.
- Refactored the add menu to be portaled to the document body for better positioning and visibility.
- Adjusted CSS styles for the add menu to use fixed positioning and increased z-index for proper layering.
- Minor CSS adjustments in entry layout for consistent padding.

These changes improve the user experience when adding new modules in the FileWorkspace, ensuring the add menu is more accessible and visually consistent.

* feat(daemon): introduce session mode for conversations

- Added a new `session_mode` column to the `conversations` table with a default value of 'design'.
- Implemented logic to handle `session_mode` in conversation creation, updates, and retrieval.
- Enhanced the API to support `session_mode` in conversation requests, allowing for 'chat' or 'design' modes.
- Updated the web application to include a session mode toggle, enabling users to switch between chat and design modes seamlessly.
- Adjusted system prompts to reflect the current session mode, providing context-aware responses.

This feature enhances the user experience by allowing for more flexible conversation management, catering to different interaction styles.

* feat(web): enhance navigation and settings functionality in DesignBrowserPanel and EntryShell

- Introduced a navigation stack in DesignBrowserPanel to manage back and forward navigation states.
- Updated the browser navigation logic to handle URL history and improve user experience.
- Added a settings menu in EntryShell for quick access to language and appearance options.
- Implemented CSS styles for the new settings menu, ensuring a consistent and user-friendly interface.
- Enhanced tests for navigation functionality and settings menu interactions.

These changes improve the overall usability of the application by streamlining navigation and providing easy access to settings.
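
One common way to model the back/forward navigation stack mentioned above is a list of URLs plus a cursor. The class below is a hypothetical sketch under that assumption; it is not the DesignBrowserPanel implementation, and the class and method names are invented for illustration.

```typescript
// Hypothetical sketch of a back/forward navigation stack.
class NavigationStack {
  private entries: string[] = [];
  private index = -1;

  // Navigating to a new URL discards any forward history.
  push(url: string): void {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index = this.entries.length - 1;
  }

  current(): string | undefined {
    return this.entries[this.index];
  }

  canGoBack(): boolean {
    return this.index > 0;
  }

  canGoForward(): boolean {
    return this.index < this.entries.length - 1;
  }

  // Both moves are no-ops at the ends of the history.
  back(): string | undefined {
    if (this.canGoBack()) this.index -= 1;
    return this.current();
  }

  forward(): string | undefined {
    if (this.canGoForward()) this.index += 1;
    return this.current();
  }
}
```

The `canGoBack` and `canGoForward` checks are what a toolbar would typically use to enable or disable its buttons.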

* feat(daemon): enhance conversation session mode handling

- Added a new `mode` flag to CLI commands for project and conversation creation, allowing users to specify 'design' or 'chat' modes.
- Implemented `normalizeChatSessionModeFlag` function to validate and normalize session mode inputs.
- Updated project routes to handle session mode during conversation creation and updates.
- Enhanced web components to support session mode changes, including new props and handlers for managing session modes in conversations.
- Adjusted UI elements to reflect the current session mode, improving user experience and interaction flexibility.

This update provides a more robust framework for managing conversation modes, catering to diverse user needs and enhancing overall functionality.
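
For illustration, validating and normalizing a session mode value might look like the sketch below. This is not the `normalizeChatSessionModeFlag` implementation, whose signature is not shown in this log; the function name and the choice to fall back to a default (rather than reject invalid input) are assumptions. The 'design' default mirrors the `session_mode` column default described in an earlier commit.

```typescript
type SessionMode = "design" | "chat";

const SESSION_MODES: readonly string[] = ["design", "chat"];

// Hypothetical normalizer: trims and lowercases the input, then falls back
// to the default when the value is not a known mode.
function normalizeSessionMode(raw: unknown, fallback: SessionMode = "design"): SessionMode {
  if (typeof raw !== "string") return fallback;
  const value = raw.trim().toLowerCase();
  return SESSION_MODES.includes(value) ? (value as SessionMode) : fallback;
}

console.log(normalizeSessionMode(" Chat "));
```

A CLI could instead report an error on unknown values; which behavior the real flag uses is not stated here.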

* feat(web): enhance HandoffButton and DesignBrowserPanel with improved functionality and styling

- Updated HandoffButton to support framework-specific CLI prompts and improved local project path handling.
- Enhanced DesignBrowserPanel to manage browser history with favicon support and improved address display.
- Introduced new utility functions for formatting addresses and extracting hostnames.
- Refactored CSS styles for better layout and responsiveness across components.
- Added tests for new functionalities in HandoffButton and DesignBrowserPanel, ensuring robust behavior.

These changes improve user experience by streamlining the handoff process and enhancing the design browsing capabilities within the application.

* feat(web): enhance HandoffButton and ProjectView with improved instructions handling and UI updates

- Updated HandoffButton to include a tabbed interface for switching between editor and CLI options, enhancing user experience.
- Added support for opening the AMR website directly from the HandoffButton.
- Refactored ProjectView to implement a modal for custom instructions, allowing users to edit and review instructions more intuitively.
- Improved CSS styles for project instructions modal and handoff menu, ensuring better layout and responsiveness.
- Added keyboard accessibility for closing the instructions modal with the Escape key.

These changes streamline the handoff process and improve the usability of custom instructions within the application.

* feat(daemon): add social share functionality and project folder management

- Introduced new `share` command in CLI for building localized social-share targets for Open Design projects.
- Implemented `printShareUsage` and `runShare` functions to handle share requests and display usage instructions.
- Added API routes for social sharing, allowing users to create shareable links for projects.
- Enhanced project routes with new endpoints for listing and creating project folders, improving project organization.
- Updated relevant files and tests to support new functionalities, ensuring robust behavior.

These changes enhance user experience by facilitating social sharing and better project management within the application.

* feat(web): enhance DesignBrowserPanel and DesignFilesPanel with improved address formatting and drag-and-drop functionality

- Refactored `formatAddressDisplay` to utilize a new `formatAddressDisplayParts` function, separating URL and title handling for better clarity.
- Updated `DesignFilesPanel` to improve drag-and-drop interactions, including enhanced directory navigation and file management features.
- Adjusted CSS styles for better visual consistency and responsiveness across components.
- Added tests for new functionalities in `DesignBrowserPanel` and `DesignFilesPanel`, ensuring robust behavior.

These changes improve user experience by streamlining address formatting and enhancing file management capabilities within the application.

* feat(web): enhance DesignBrowserPanel and DesignFilesPanel with new reference categories and improved folder creation

- Added new reference categories in DesignBrowserPanel, including 'Inspiration', 'Real Interfaces', 'Color', 'Typography', and more, each with curated design resources.
- Improved folder creation logic in DesignFilesPanel to suggest names based on existing folders, enhancing user experience.
- Updated CSS styles for better layout and responsiveness across components, particularly in control rows and search functionalities.
- Added tests for new reference categories and folder creation features, ensuring robust functionality.

These changes enrich the design resource catalog and streamline folder management, improving overall usability within the application.
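
A minimal sketch of how a folder-name suggestion based on existing folders might work. The helper name `suggestFolderName`, the default base name, and the numbering scheme are assumptions for illustration, not the code in DesignFilesPanel.

```typescript
// Hypothetical helper: returns the base name if it is free, otherwise the
// first "<base> <n>" (n starting at 2) that does not collide.
// The comparison is case-insensitive, which is an assumption.
function suggestFolderName(existing: string[], base = "New folder"): string {
  const taken = new Set(existing.map((name) => name.toLowerCase()));
  if (!taken.has(base.toLowerCase())) return base;
  for (let i = 2; ; i += 1) {
    const candidate = `${base} ${i}`;
    if (!taken.has(candidate.toLowerCase())) return candidate;
  }
}
```

With this scheme, a directory that already contains "New folder" and "New folder 2" would get "New folder 3" as the next suggestion.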

* feat(web): implement reference icon functionality and enhance social share features

- Replaced favicon URL generation with a new `referenceIconUrl` function to provide reliable icons for curated design sites, improving visual consistency in the DesignBrowserPanel.
- Updated the FileViewer component to enhance social share functionality, including clearer messaging for protected deployments and improved UI for sharing links.
- Added CSS styles to support new visual states for social share components, ensuring better user experience.
- Expanded tests for the new `referenceIconUrl` function and social share interactions, ensuring robust functionality.

These changes enhance the design resource presentation and improve the sharing experience within the application.

* feat(web): enhance DesignFilesPanel and FileViewer with improved folder management and social share functionality

- Added optimistic folder path management in DesignFilesPanel to improve user experience during folder creation.
- Updated social share logic in FileViewer to handle protected deployments more effectively, ensuring clearer messaging and UI updates.
- Refactored CSS styles for better layout and responsiveness in both components, enhancing overall usability.
- Expanded tests for new folder management features and social share interactions, ensuring robust functionality.

These changes streamline folder management and enhance the social sharing experience within the application.

* feat(daemon): enhance project tab management with new state handling

- Updated database schema to include a new `state_json` column in the `tabs_state` table for improved project tab state management.
- Implemented functions to normalize and parse project tab states, including handling browser workspace tabs.
- Modified `listTabs` and `setTabs` functions to utilize the new state management features, allowing for better tracking of active tabs and saved states.
- Refactored related types in the contracts and web applications to support the new tab state structure.

These changes improve the functionality and user experience of managing project tabs within the application.

* feat(web): enhance project tab management and browser tab functionality

- Updated project routes and server logic to include `browserTabs` in the request body, improving tab state management.
- Implemented validation for `browserTabs` to ensure it is an array, enhancing error handling.
- Refactored `setTabs` function to accommodate the new structure for managing tabs, including browser-specific tabs.
- Added tests for browser tab state persistence and management in the new `project-tabs-state.test.ts` file, ensuring robust functionality.

These changes improve the user experience by providing better management and persistence of project and browser tabs within the application.
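
The array check on `browserTabs` could look roughly like the sketch below. Only the field name comes from the notes above; the body shape, the result type, and the choice to treat a missing field as an empty list are assumptions.

```typescript
// Hypothetical request-body shape; only `browserTabs` is named in the commit notes.
interface TabsRequestBody {
  tabs?: unknown;
  browserTabs?: unknown;
}

type ValidationResult =
  | { ok: true; browserTabs: unknown[] }
  | { ok: false; error: string };

// Assumption: a missing field means "no browser tabs"; any non-array value is rejected.
function validateBrowserTabs(body: TabsRequestBody): ValidationResult {
  if (body.browserTabs === undefined) return { ok: true, browserTabs: [] };
  if (!Array.isArray(body.browserTabs)) {
    return { ok: false, error: "browserTabs must be an array" };
  }
  return { ok: true, browserTabs: body.browserTabs };
}
```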

* feat(web): enhance ProjectView with improved chat send management and UI updates

- Added support for queue-only chat sends, allowing for better handling of messages during busy conversations.
- Refactored chat send functions to improve state management and persistence of queued messages.
- Updated CSS styles in the DesignFilesPanel for better layout and responsiveness, including search control enhancements.
- Expanded tests for new chat send functionalities and search interactions, ensuring robust behavior.

These changes improve the user experience by streamlining chat interactions and enhancing the overall design file management interface.

* feat(web): improve project tab state management and caching

- Enhanced the `loadTabs` function to utilize cached tab states, improving performance and user experience during data fetching.
- Implemented `normalizeTabsState` and caching functions to manage tab states effectively, including validation and error handling.
- Updated `writeCachedTabs` to ensure the latest state is stored in local storage, facilitating better persistence of tab information.
- Modified `listTabs` in the daemon to include `updatedAt` in the state retrieval, allowing for more accurate tracking of tab updates.

These changes streamline tab management and enhance the overall responsiveness of the application.
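
As an illustration of the caching flow described above, here is a small sketch. The names `normalizeTabsState` and `writeCachedTabs` appear in the notes, but the state shape, the `readCachedTabs` helper, and the `KeyValueStore` interface are assumptions; the real implementation may differ.

```typescript
// Assumed state shape; the project's real tab state may carry more fields.
interface TabsState {
  tabs: string[];
  active: string | null;
  updatedAt: number;
}

// Minimal subset of the Web Storage API so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Returns a well-formed state, or null when the input is not usable.
function normalizeTabsState(raw: unknown): TabsState | null {
  if (typeof raw !== "object" || raw === null) return null;
  const value = raw as Record<string, unknown>;
  if (!Array.isArray(value.tabs)) return null;
  const tabs = value.tabs.filter((t): t is string => typeof t === "string");
  const active =
    typeof value.active === "string" && tabs.includes(value.active) ? value.active : null;
  const updatedAt = typeof value.updatedAt === "number" ? value.updatedAt : 0;
  return { tabs, active, updatedAt };
}

// Hypothetical read path: corrupt or missing entries count as a cache miss.
function readCachedTabs(store: KeyValueStore, key: string): TabsState | null {
  const text = store.getItem(key);
  if (text === null) return null;
  try {
    return normalizeTabsState(JSON.parse(text));
  } catch {
    return null;
  }
}

function writeCachedTabs(store: KeyValueStore, key: string, state: TabsState): void {
  store.setItem(key, JSON.stringify(state));
}
```

In a browser, `window.localStorage` satisfies `KeyValueStore`, so it can be passed in directly.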

* feat(web): integrate Lexical for enhanced text composition and mention handling

- Added Lexical as a dependency to improve text composition capabilities within the chat interface.
- Implemented mention functionality with the creation of `MentionNode` and related serialization/deserialization logic for inline mentions.
- Enhanced `ChatComposer` and `ChatPane` components to support queue-only message sending and improved state management for queued messages.
- Updated `DesignBrowserPanel` and `PreviewDrawOverlay` to incorporate new features for capturing and annotating browser snapshots.
- Refactored various components to streamline interactions and improve user experience during chat and design tasks.

These changes significantly enhance the text editing experience and provide better management of chat interactions, improving overall usability in the application.

* feat(workspace): integrate terminal viewer and enhance chat functionality

- Added a new terminal viewer component with customizable themes and improved styling for better user experience.
- Integrated terminal functionality into the workspace, allowing users to interact with a terminal directly within the application.
- Updated chat components to support active conversation states, enabling seamless message handling and interaction.
- Refactored chat-related props and state management to enhance performance and maintainability.
- Removed deprecated file tree explorer components to streamline the workspace interface.

This update enhances the overall functionality of the workspace, providing users with a more integrated and responsive environment for both terminal and chat interactions.

* feat(workspace): enhance chat and session mode functionality

- Introduced a new `+` launcher in the workspace for easy access to project files and new tabs, including Side Chat and Terminal.
- Added Side Chat functionality that allows users to create context-aware conversations based on existing chats.
- Implemented a new terminal tab type for interactive terminal sessions using `node-pty`, enabling users to run shell commands directly within the workspace.
- Enhanced chat functionality with a new session mode toggle, allowing users to switch between 'design' and 'chat' modes seamlessly.
- Added a feature to copy chat responses as markdown to the clipboard, improving usability and sharing capabilities.
- Updated various components and styles to support these new features, ensuring a cohesive user experience.

This update significantly improves the workspace's interactivity and usability, providing users with more tools for collaboration and development.

* feat(workspace): enhance session mode toggle and chat functionality

- Improved the SessionModeToggle component to include localized guidance cards that provide context-aware descriptions for each mode.
- Updated the UI to display a popover with mode descriptions when hovering over options, enhancing user understanding of the available modes.
- Added a new copy button in the AssistantMessage footer to allow users to copy responses as raw Markdown, improving usability for external documentation.
- Enhanced localization support by updating i18n keys and translations for various languages, ensuring consistent user experience across different locales.
- Refactored styles for the session mode toggle and associated components to improve layout and responsiveness.

This update significantly enhances the user experience by providing clearer guidance and improved functionality in chat interactions.

* feat(styles): enhance session mode toggle styling for improved visibility

- Added new CSS rules to ensure the session mode toggle popover and hover cards are displayed correctly with increased z-index and visibility.
- Updated styles for various chat components to maintain consistent positioning and overflow behavior when session mode elements are present.
- Improved overall layout responsiveness for chat interactions, enhancing user experience during mode transitions.

This update refines the visual presentation of session mode toggles, ensuring they are more accessible and user-friendly.

* feat(web): finish Lexical composer input with atomic mention pills

* fix(web): make browser reference board scroll with a pinned toolbar

The browser tab's default Reference Board (.db-start) is wrapped by
PreviewDrawOverlay's position:absolute container, which is not a flex
parent — so .db-start's flex:1 1 auto never bounded its height and the
board grew to its content height instead of scrolling (and the sticky
.db-reference-toolbar could not pin). Fill the overlay with height:100%
like the .db-webview/.db-fallback siblings already do, restoring scroll
and the sticky toolbar.

* test(daemon): expect updatedAt in persisted tab state round-trip

listTabs() returns the tabs_state row's updatedAt timestamp on the
saved-state path, but these two toEqual assertions predated that field
and so failed the strict equality check. Match the real shape with expect.any(Number).

* feat(chat): enhance ChatComposer with session mode management and UI improvements

- Introduced a new `sessionMode` prop to the ChatComposer, allowing users to switch between different session modes (e.g., 'design').
- Added a SessionModeToggle component for improved user interaction and visibility of session options.
- Updated the ToolsTab type to reflect the removal of the pet option, streamlining the tools available in the composer.
- Refactored styles to enhance the visibility and positioning of session mode elements, ensuring a better user experience during mode transitions.
- Improved the handling of draft state and user interactions within the composer, enhancing overall functionality.

These changes significantly improve the ChatComposer's usability and flexibility, providing users with clearer options and a more responsive interface.

* feat(chat): implement caret floating layer for mention and slash popovers

- Introduced a new `CaretFloatingLayer` component to manage the positioning of mention and slash command popovers relative to the caret.
- Enhanced `ChatComposer` to utilize the caret rectangle for accurate popover placement, improving user experience during text input.
- Updated `LexicalComposerInput` to pass caret position data to the trigger handling logic, allowing for dynamic popover adjustments.
- Refactored styles for popovers to ensure consistent appearance and behavior, including improved animations and responsiveness.
- Added accessibility features to mention and slash popovers, enhancing usability for keyboard navigation.

These changes significantly improve the interaction model for mentions and commands within the chat interface, providing a more intuitive and responsive user experience.

* feat(chat): update mention tab order and improve search functionality

- Reordered the mention tab sections in the ChatComposer to prioritize 'Design files' over other categories, enhancing user experience during mentions.
- Updated the search prompt to reflect the new tab order, ensuring clarity in search functionality.
- Enhanced the mention selection logic to accommodate the new tab structure, allowing for a more intuitive navigation experience.
- Added tests to verify the correct display and functionality of the updated mention tabs and search behavior.

These changes significantly improve the usability of the mention feature within the chat interface, making it easier for users to find and select relevant items.

* feat(chat): enhance context management in ChatComposer and HomeHero

- Added functionality to manage MCP servers and connectors within the ChatComposer, allowing users to remove these contexts seamlessly.
- Updated the HomeHero component to support the selection and removal of MCP servers and connectors, improving context handling in user interactions.
- Enhanced the search prompt to include files, ensuring users can search across all relevant categories.
- Refactored related components and styles for better integration and user experience.
- Added tests to verify the correct functionality of the new context management features.

These changes significantly improve the usability of context features in the chat interface, making it easier for users to manage their interactions effectively.

* feat(chat): enhance message handling with session mode and plugin snapshot support

- Added `session_mode` and `applied_plugin_snapshot_json` columns to the database schema to support enhanced message context.
- Updated the `upsertMessage` and `listMessages` functions to handle the new fields, ensuring messages can store and retrieve session mode and plugin snapshot data.
- Enhanced the `ChatComposer` to manage and send the applied plugin snapshot as part of the message context, improving user interaction with plugins.
- Introduced a new `MessageSessionModeChip` component to visually represent the session mode in the chat interface.
- Updated styles for better presentation of session mode and plugin context within messages, enhancing user experience.

These changes significantly improve the context management capabilities in the chat interface, allowing for richer interactions and better tracking of session-specific data.

* feat(chat): add annotation handling and UI improvements in ChatComposer and PreviewDrawOverlay

- Implemented functionality to stage draw annotations into the composer input without sending, enhancing user interaction.
- Added a new button in the PreviewDrawOverlay to allow users to append notes to the input, improving workflow flexibility.
- Updated the ChatComposer tests to verify the correct staging of annotations and their integration into the input.
- Enhanced internationalization support by adding new translation keys for annotation actions across multiple languages.

These changes significantly improve the user experience by providing more intuitive annotation handling and better integration within the chat interface.

* feat(capture): implement page capture functionality and enhance folder management dialogs

- Added new capture functionality to allow users to take snapshots of the current page, improving user interaction with visual content.
- Introduced in-app dialogs for folder creation and moving files, replacing the unsupported window.prompt in the Electron desktop host, enhancing usability across platforms.
- Updated the DesignFilesPanel to support these new dialogs, ensuring a seamless experience for managing project files.
- Enhanced internationalization support by adding new translation keys for folder management actions across multiple languages.

These changes significantly improve the user experience by providing intuitive capture options and streamlined file management within the application.

* feat(projects): implement folder deletion and enhance project file management

- Added a new API endpoint to delete project folders, improving the file management capabilities within the application.
- Introduced utility functions for ensuring project subdirectories and safely deleting folders, enhancing the robustness of folder operations.
- Updated the DesignFilesPanel and FileWorkspace components to support folder deletion actions, providing users with a more intuitive interface for managing project files.
- Enhanced internationalization support by adding new translation keys for folder management actions.

These changes significantly improve the user experience by streamlining folder management and providing clearer options for users to organize their projects effectively.

* feat(composer): enhance BoardComposerPopover with image attachment functionality

- Added support for attaching images to comments, allowing users to upload and preview images directly within the composer.
- Implemented new handlers for image input changes and clipboard pasting, improving the user experience for image uploads.
- Updated the component's props to include image-related callbacks and state management for attached images.
- Enhanced styles for image thumbnails and removal buttons, ensuring a cohesive design with the existing comment popover interface.

These changes significantly improve the functionality of the BoardComposerPopover, providing users with a more interactive and visually rich commenting experience.

* feat(file-viewer): enhance comment attachment functionality with image support

- Updated the `onSendBoardCommentAttachments` prop to accept an additional `images` parameter, allowing for image attachments alongside comments.
- Introduced state management for handling attached images, including functions to add and remove images, and to generate previews.
- Implemented a modal for previewing attached images, improving user interaction when managing comment attachments.
- Updated the `FileWorkspace` component to reflect changes in the props, ensuring consistency across components.

These enhancements significantly improve the commenting experience by enabling users to attach and preview images directly within the file viewer.

* feat(home-hero, project-view, styles): enhance functionality and user experience

- Updated the HomeHero component to prevent unnecessary state changes during programmatic updates, improving user interaction with prompts.
- Enhanced the ProjectView component to support image attachments alongside comments, allowing for a more versatile commenting experience.
- Implemented a new image upload process that queues tasks efficiently, ensuring smooth handling of comment attachments.
- Added CSS to reserve scrollbar space in design files, preventing layout shifts when scrollbars appear, thus enhancing visual stability.

These changes collectively improve the user experience by streamlining interactions and ensuring consistent UI behavior across components.

* feat(preview-comments, chat): enhance comment functionality with attachment support

- Added an `attachments_json` column to the `preview_comments` table, allowing users to attach files to comments.
- Updated relevant functions to handle attachments, including `upsertPreviewComment` and `listPreviewComments`, ensuring attachments are properly managed and displayed.
- Enhanced the `CommentSidePanel` to render attached files, providing users with a visual representation of their attachments.
- Improved the `BoardComposerPopover` and `ChatPane` components to support image attachments, including drag-and-drop functionality for reordering queued sends.

These changes significantly enhance the commenting experience by enabling users to attach and manage files directly within the chat interface, improving overall usability and interaction.

* refactor(comment-attachments): rename attachment normalization functions for clarity

- Renamed the `normalizeCommentAttachments` function to `normalizePreviewCommentAttachments` to make clear that it handles preview comment attachments.
- Adjusted the `upsertPreviewComment` and `normalizePreviewComment` functions to utilize the renamed attachment normalization function.
- Added tests to ensure that image attachments are correctly saved and retrieved, addressing a regression issue with attachment persistence.

These changes enhance code clarity and maintainability while ensuring the functionality for handling comment attachments remains robust.

* feat(comment-attachments): enhance comment submission with image and note validation

- Updated the `upsertPreviewComment` function to require either a text note or at least one image attachment for comment submission, improving validation logic.
- Modified the `BoardComposerPopover` to allow saving comments with only images, enhancing user experience by simplifying the commenting process.
- Adjusted the `FileViewer` to support saving comments with image-only notes, ensuring consistency across components.
- Improved styles in the chat and home hero components for better visual representation of attachments and comments.

These changes collectively enhance the commenting functionality, providing users with more flexibility in how they submit comments while ensuring robust validation.
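
The "note or at least one image" rule can be expressed as a small predicate. This is a sketch under assumed names (`CommentInput`, `canSubmitComment`); it is not the validation code inside `upsertPreviewComment`.

```typescript
// Hypothetical input shape for a preview comment.
interface CommentInput {
  note?: string;
  images?: { path: string }[];
}

// A comment is submittable when it has non-whitespace text or at least one image.
function canSubmitComment(input: CommentInput): boolean {
  const hasNote = (input.note ?? "").trim().length > 0;
  const hasImage = (input.images ?? []).length > 0;
  return hasNote || hasImage;
}
```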

* feat(project-view): implement auto-start for queued chat sends

- Replaced the `startingQueuedChatSendIdRef` with a state variable `queuedAutoStartBlocked` to manage the auto-start behavior of queued chat sends.
- Updated the `useEffect` to ensure that queued sends are processed one at a time after the active conversation completes, enhancing the chat experience.
- Added a test to verify that queued sends auto-start correctly after the active run finishes, ensuring reliable functionality.

These changes improve the handling of queued chat messages, providing a smoother user experience during conversations.

* feat(project-view): refine auto-start logic for queued chat sends

- Replaced the state variable `queuedAutoStartBlocked` with a reference `startingQueuedChatSendIdRef` to manage the auto-start behavior of queued chat sends more effectively.
- Updated the `useEffect` to ensure that queued sends are processed sequentially, improving the handling of chat messages during active conversations.
- Introduced a new state variable `queuedAutoStartTick` to track the auto-start process, enhancing the responsiveness of the chat interface.

These changes improve the reliability and user experience of the chat functionality by ensuring queued messages are handled smoothly and efficiently.

* feat(comment-attachments): improve attachment handling in preview comments

- Updated the `upsertPreviewComment` function to merge existing and incoming attachments, ensuring that image attachments are preserved when updating comments without new files.
- Introduced a new helper function, `mergePreviewCommentAttachments`, to handle the merging of attachments without duplicates, enhancing the robustness of attachment management.
- Added tests to verify the correct merging of attachments and the preservation of existing attachments during comment updates, improving overall functionality and user experience.

These changes enhance the commenting system by providing better management of image attachments, ensuring users can update comments seamlessly while retaining their attached images.
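
One way to merge existing and incoming attachments without duplicates is sketched below. The attachment shape and the use of `path` as the identity key are assumptions; the actual `mergePreviewCommentAttachments` helper may deduplicate differently.

```typescript
// Hypothetical attachment shape; `path` is assumed to identify an attachment.
interface Attachment {
  path: string;
  name?: string;
}

// Existing attachments are kept; an incoming entry with the same path replaces
// the stored value but stays at the original position (Map preserves the
// insertion order of the first `set` for a key).
function mergeAttachments(existing: Attachment[], incoming: Attachment[]): Attachment[] {
  const byPath = new Map<string, Attachment>();
  for (const item of [...existing, ...incoming]) {
    byPath.set(item.path, item);
  }
  return Array.from(byPath.values());
}
```

Updating a comment with an empty `incoming` list therefore returns the existing attachments unchanged, which matches the preservation behavior described above.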

* feat(BoardComposerPopover): enhance popover positioning and measurement

- Updated the `popoverAnchorStyle` function to incorporate viewport scroll positions, ensuring the popover remains within visible bounds.
- Introduced a new `PopoverSize` type to manage measured dimensions, improving the accuracy of popover placement.
- Implemented a `useLayoutEffect` to dynamically measure the popover size and adjust its position accordingly, enhancing user experience during interactions.
- Added tests to verify that the popover correctly adjusts its position based on target and viewport dimensions, ensuring it remains fully visible.

These changes improve the usability of the BoardComposerPopover by ensuring it is properly positioned within the viewport, enhancing the overall commenting experience.
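
The positioning idea (measure the popover, then clamp its anchor to the visible viewport including scroll offsets) can be sketched as a pure function. `PopoverSize` is named in the notes, but its fields, the `placePopover` helper, the below-the-target placement, and the 8px margin are assumptions.

```typescript
interface Rect { left: number; top: number; width: number; height: number; }
interface PopoverSize { width: number; height: number; }
interface Viewport { scrollX: number; scrollY: number; width: number; height: number; }

// Clamp `value` into [min, max]; if the range is empty, pin to `min`.
function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), Math.max(min, max));
}

// Hypothetical placement: prefer just below the target, then clamp so the
// popover stays inside the visible viewport (document coordinates).
function placePopover(
  target: Rect,
  size: PopoverSize,
  viewport: Viewport,
  margin = 8,
): { left: number; top: number } {
  const minLeft = viewport.scrollX + margin;
  const maxLeft = viewport.scrollX + viewport.width - size.width - margin;
  const minTop = viewport.scrollY + margin;
  const maxTop = viewport.scrollY + viewport.height - size.height - margin;
  return {
    left: clamp(target.left, minLeft, maxLeft),
    top: clamp(target.top + target.height + margin, minTop, maxTop),
  };
}
```

In a React component the measured size would typically come from a `useLayoutEffect` that reads the popover element's bounding box before paint, as the notes describe.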

* feat(comment-attachments): enhance image attachment handling in comments

- Updated the `normalizeCommentAttachments` function to include image attachments in the comment payload, allowing comments to be submitted with only images.
- Introduced a fallback message for comments that consist solely of image attachments, improving user guidance.
- Enhanced the `renderCommentAttachmentHint` function to display image attachment details, ensuring users are informed about attached images.
- Added tests to verify that image attachments are preserved in comment submissions and correctly rendered in hints, improving overall functionality and user experience.

These changes enhance the commenting system by providing better support for image-only comments, ensuring users can effectively utilize image attachments in their interactions.

* feat(comment-attachments): refine comment context handling and enhance drag-and-drop functionality

- Updated the `normalizeCommentAttachments` function to omit comment text when the context is set to 'query', improving clarity in comment submissions.
- Enhanced the `renderCommentAttachmentHint` function to only display comments when not in 'query' context, ensuring a cleaner output.
- Implemented drag-and-drop functionality in the `CommentSidePanel` for reordering comments, improving user interaction and organization of comments.
- Added tests to verify the correct handling of comment context and the functionality of the drag-and-drop feature, ensuring robust performance.

These changes enhance the commenting system by providing clearer context management and improved usability through drag-and-drop capabilities.

* feat(FileViewer, ProjectView): enhance comment attachment handling and status updates

- Updated the `FileViewer` component to manage the state of sent comment IDs, ensuring that active previews are correctly updated after sending comments.
- Refined the `ProjectView` component to filter out comment attachments from the board-batch source and update their status to 'applying' during processing, improving user feedback on attachment handling.
- Introduced logic to handle the removal of sent comment IDs from the preview, enhancing the overall user experience when managing comments and attachments.

These changes improve the functionality and responsiveness of the commenting system, providing clearer feedback and better management of comment attachments.

* feat(conversation-forking): introduce message-based conversation forking

- Added the ability to fork conversations from a specific message, enhancing the chat experience by allowing users to create new conversations that inherit context up to a chosen point.
- Updated the CLI commands and help documentation to reflect the new `--fork-after` option, which takes the ID of the last message to copy into the forked conversation.
- Enhanced the backend to handle the new forking logic, ensuring that only messages up to the specified ID are included in the new conversation.
- Implemented tests to verify the forking functionality, ensuring robust performance and correct behavior when forking conversations.

These changes improve the flexibility of conversation management, allowing users to create tailored discussions based on previous interactions.
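
A minimal sketch of the fork selection step, assuming the fork includes the message with the given ID and everything before it. The `Message` shape and the `messagesForFork` helper are hypothetical; the backend's actual logic is not shown in these notes.

```typescript
// Hypothetical message shape.
interface Message {
  id: string;
  role: "user" | "assistant";
  content: string;
}

// Returns copies of the messages up to and including `forkAfterId`.
// Assumption: an unknown ID is an error rather than a silent full copy.
function messagesForFork(messages: Message[], forkAfterId: string): Message[] {
  const index = messages.findIndex((m) => m.id === forkAfterId);
  if (index === -1) throw new Error(`message not found: ${forkAfterId}`);
  return messages.slice(0, index + 1).map((m) => ({ ...m }));
}
```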

* feat(terminal-service): enhance session output management and memory handling

- Introduced new parameters for managing session output, including `maxBufferBytes`, `exitTailBytes`, `flushIntervalMs`, and `flushThresholdBytes`, to optimize memory usage and performance.
- Implemented a `trimBuffer` function to efficiently evict old events based on byte and count limits, improving memory management during active sessions.
- Added logic to coalesce buffered PTY output into single `data` events, reducing the frequency of event emissions and enhancing performance during high-throughput scenarios.
- Updated session event structure to include `byteLength`, allowing for better tracking of output size and memory usage.

These changes improve the efficiency and responsiveness of terminal sessions, ensuring better resource management and user experience.
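
The eviction and coalescing steps might look like the following sketch. `trimBuffer`, `maxBufferBytes`, and `byteLength` are named in the notes; the event shape, the `maxEvents` limit name, and the `coalesce` helper are assumptions.

```typescript
// Assumed event shape; `byteLength` is the field mentioned in the notes.
interface SessionEvent {
  type: "data" | "exit";
  data: string;
  byteLength: number;
}

interface BufferLimits {
  maxBufferBytes: number;
  maxEvents: number; // hypothetical name for the count limit
}

// Evicts the oldest events in place until both limits hold.
// Returns the total byte size of the events that remain.
function trimBuffer(events: SessionEvent[], limits: BufferLimits): number {
  let total = events.reduce((sum, e) => sum + e.byteLength, 0);
  let drop = 0;
  while (
    drop < events.length &&
    (total > limits.maxBufferBytes || events.length - drop > limits.maxEvents)
  ) {
    total -= events[drop].byteLength;
    drop += 1;
  }
  events.splice(0, drop);
  return total;
}

// Merges pending output chunks into a single `data` event so subscribers
// receive one emission per flush instead of one per PTY chunk.
function coalesce(pending: SessionEvent[]): SessionEvent {
  return {
    type: "data",
    data: pending.map((e) => e.data).join(""),
    byteLength: pending.reduce((sum, e) => sum + e.byteLength, 0),
  };
}
```

A flush timer driven by `flushIntervalMs`, or a size check against `flushThresholdBytes`, would decide when `coalesce` runs; that scheduling is left out of the sketch.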

* feat(workspace-context): enhance project attachment handling and workspace context management

- Introduced `formatProjectAttachmentHint` function to render project attachments in a user-visible order, improving clarity for users referencing attachments.
- Added `normalizeWorkspaceContextItems` function to standardize workspace context items, ensuring consistent handling of various context types.
- Updated `mergeRunContextSelections` to include workspace items, enhancing the context management during chat interactions.
- Enhanced `renderRunContextPrompt` to display active workspace context, providing users with better visibility of their current workspace state.
- Implemented tests for new functions to ensure robust functionality and correct behavior in handling project attachments and workspace contexts.

These changes improve the user experience by providing clearer context and better management of project attachments within the application.

* feat(database, chat): enhance message and conversation data structure

- Added `run_context_json` to the `messages` table schema to store contextual information for each message.
- Updated migration logic to include the new `run_context_json` field and ensure backward compatibility.
- Enhanced conversation retrieval to include `messageCount`, providing better insights into the number of messages per conversation.
- Improved attachment handling in the `ChatComposer` component by introducing ordering logic for attachments, ensuring a consistent user experience.
- Refactored `SessionModeToggle` to simplify state management and improve tooltip visibility.

These changes enhance data management and user interaction within the chat application, providing clearer context and improved functionality.

* feat(attachment-handling): improve attachment sorting and context management

- Introduced `sortAttachmentsByUserOrder` function to ensure attachments are displayed in a user-defined order, enhancing clarity and usability.
- Updated `historyWithApiAttachmentContext` to utilize the new sorting function, improving the context provided with message histories.
- Refactored `buildAnthropicMessageContent` to apply sorting for image attachments, ensuring consistent handling across different message types.

These changes enhance the user experience by providing better organization and visibility of attachments within the chat application.
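
A sketch of ordering attachments by a user-defined position. The `order` field and the rule that unordered items go last in their original sequence are assumptions; the real `sortAttachmentsByUserOrder` may use a different key or tie-break.

```typescript
// Sorts by an optional numeric `order`; items without one keep their
// original relative position after all ordered items. Does not mutate input.
function sortByUserOrder<T extends { order?: number }>(items: T[]): T[] {
  return items
    .map((item, index) => ({ item, index }))
    .sort((a, b) => {
      const ao = a.item.order ?? Number.POSITIVE_INFINITY;
      const bo = b.item.order ?? Number.POSITIVE_INFINITY;
      // Compare explicitly: Infinity - Infinity would yield NaN.
      if (ao !== bo) return ao < bo ? -1 : 1;
      return a.index - b.index;
    })
    .map(({ item }) => item);
}
```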

* feat(chat): enhance conversation listing and loading states

- Improved the SQL query for listing conversations to include message counts, providing better insights into conversation activity.
- Added loading states to the ChatPane component, enhancing user experience during data fetching.
- Implemented a search feature in the conversation history, allowing users to filter conversations by title for easier navigation.
- Updated styles for loading indicators and conversation list to improve visual feedback during loading states.

These changes enhance the usability and responsiveness of the chat interface, providing users with clearer context and improved interaction capabilities.

* feat(chat): optimize suggestion filtering and enhance design system integration

- Refactored suggestion filtering in the ChatComposer component to utilize `useMemo`, improving performance by memoizing results based on dependencies.
- Added new props in ChatPane for handling plugin and design system details, enhancing the integration of these features within the chat interface.
- Updated the FileWorkspace component to manage tab states for design system and browser tabs, improving user navigation and context management.
- Introduced modals for displaying plugin and design system details in the ProjectView, enhancing user experience by providing contextual information.

These changes improve the efficiency of suggestion handling and enhance the overall user experience in managing plugins and design systems within the chat application.

* fix(chat): adjust styling and layout for improved chat interface

- Reduced the conversation row height in ChatPane for better alignment with design specifications.
- Updated CSS styles across various components to enhance layout consistency, including adjustments to margins, padding, and flex properties.
- Improved visual feedback and responsiveness in chat elements, ensuring a more cohesive user experience.

These changes refine the chat interface, making it more visually appealing and user-friendly.

* feat(comments): implement structural equality checks for comment snapshots

- Added `commentSnapshotOverlayEqual` and `commentSnapshotEqual` functions to compare comment snapshots based on their structural properties, improving performance by avoiding unnecessary state updates in the `FileViewer` component.
- Updated `HtmlViewer` to utilize these equality checks, optimizing the handling of live comment targets during pointer movements and hover events.
- Enhanced the overall responsiveness of the comment system by preventing redundant re-renders when comment snapshots remain unchanged.

These changes enhance the efficiency of comment handling and improve user experience in the commenting interface.
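
A minimal sketch of the structural-equality idea described above. The `CommentSnapshot` shape and the helper names `snapshotEqual` and `nextSnapshot` are illustrative assumptions, not the repository's actual `commentSnapshotEqual` implementation. The point is that returning the previous reference when nothing changed lets a state setter skip the update.

```typescript
// Hypothetical snapshot shape; the real fields in the repository may differ.
interface CommentSnapshot {
  elementId: string;
  selector: string;
  rect: { x: number; y: number; width: number; height: number };
}

// Compare by value so a freshly built but identical snapshot counts as unchanged.
function snapshotEqual(a: CommentSnapshot | null, b: CommentSnapshot | null): boolean {
  if (a === b) return true;
  if (a === null || b === null) return false;
  return (
    a.elementId === b.elementId &&
    a.selector === b.selector &&
    a.rect.x === b.rect.x &&
    a.rect.y === b.rect.y &&
    a.rect.width === b.rect.width &&
    a.rect.height === b.rect.height
  );
}

// Return the previous reference when nothing changed, so a caller that
// compares references (for example a React state setter) can bail out.
function nextSnapshot(
  prev: CommentSnapshot | null,
  incoming: CommentSnapshot | null,
): CommentSnapshot | null {
  return snapshotEqual(prev, incoming) ? prev : incoming;
}
```

Used as `setSnapshot((prev) => nextSnapshot(prev, incoming))`, repeated pointer-move events that produce the same snapshot would leave state untouched.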

* feat(chat): enhance comment attachment sorting and order management

- Introduced `sortChatCommentAttachmentsByOrder` function to ensure comment attachments are displayed in a user-defined order, improving clarity and usability.
- Updated the `currentCommentAttachments` function to utilize the new sorting logic, enhancing the organization of attachments.
- Adjusted the order assignment for visual attachments to ensure consistent handling based on existing attachment orders.

These changes improve the user experience by providing better organization and visibility of comment attachments within the chat application.
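
One way to sort attachments by a user-defined order, sketched under assumptions: the `order` field, the `OrderedAttachment` type, and the rule that unordered items go last in their original sequence are all hypothetical and may not match `sortChatCommentAttachmentsByOrder`.

```typescript
// Hypothetical attachment shape; `order` is an assumed optional field.
interface OrderedAttachment {
  id: string;
  order?: number;
}

// Sort by `order` ascending. Items without an order go last, and ties keep
// their original relative position via the captured index.
function sortByUserOrder<T extends OrderedAttachment>(items: readonly T[]): T[] {
  return items
    .map((item, index) => ({ item, index }))
    .sort((left, right) => {
      const lo = left.item.order ?? Number.POSITIVE_INFINITY;
      const ro = right.item.order ?? Number.POSITIVE_INFINITY;
      if (lo !== ro) return lo < ro ? -1 : 1;
      return left.index - right.index;
    })
    .map((entry) => entry.item);
}
```

The function returns a new array and leaves its input unmodified, which keeps it safe to call on state or props.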

* feat(chat): enhance queued send strip with overflow handling and styling improvements

- Added overflow handling to the queued send strip, allowing for better visibility of additional queued items when the list exceeds the visible limit.
- Updated CSS styles for the chat components, including adjustments to layout, padding, and font sizes to improve overall aesthetics and usability.
- Refactored the structure of the queued send row to utilize a grid layout, enhancing alignment and responsiveness of the elements within the chat interface.

These changes improve the user experience by providing clearer visibility of queued messages and a more polished interface.

* feat(database): add index for created_at and enhance comment retrieval

- Introduced a new index on `preview_comments` for `created_at` to optimize query performance when retrieving comments.
- Updated the `listPreviewComments` function to order results by `created_at` and `rowid`, improving the organization of displayed comments.
- Enhanced the test suite to verify the injection of the new URL preview selection bridge and its functionality in various scenarios.

These changes improve the efficiency of comment retrieval and enhance the user experience in the commenting interface.

* feat(tooltip): implement tooltip system for enhanced user guidance

- Introduced a new `TooltipLayer` component to manage tooltip display across the application, improving user interaction by providing contextual information on hover and focus.
- Updated various components to utilize the tooltip system, including buttons and icons, ensuring consistent tooltip behavior and styling.
- Enhanced CSS styles for tooltips, improving visibility and responsiveness, while maintaining a cohesive design across the application.

These changes enhance the user experience by providing clearer guidance and improving the overall usability of interactive elements.

* feat(tooltip): enhance tooltip integration across components

- Updated various components to include tooltip functionality, improving user guidance with contextual information on hover and focus.
- Added `data-tooltip` and `data-tooltip-placement` attributes to buttons and interactive elements for consistent tooltip behavior.
- Enhanced CSS styles to ensure tooltips are displayed correctly and responsively, maintaining a cohesive design across the application.

These changes improve the overall user experience by providing clearer guidance and enhancing the usability of interactive elements.

* feat(project-view): refactor comment handling and enhance tooltip positioning

- Introduced `mergeSavedPreviewComment` function to streamline the management of preview comments, improving clarity and maintainability.
- Updated `ProjectView` to utilize the new comment merging function, enhancing the efficiency of comment updates.
- Refactored `TooltipLayer` to use x and y coordinates for positioning, improving tooltip display accuracy and responsiveness.
- Enhanced CSS styles for tooltips, ensuring better visibility and layout consistency across the application.

These changes improve the user experience by providing more efficient comment handling and refined tooltip interactions.

* feat(design-files): enhance workspace hint and file handling in project context

- Introduced `formatDesignFilesWorkspaceHint` function to provide a detailed overview of the current Design Files workspace, including folder and file listings.
- Added limits for the number of folders and files displayed to improve clarity and prevent overwhelming users with excessive information.
- Updated the `startServer` function to integrate the new workspace hint, ensuring that the context of existing project files and folders is communicated effectively.
- Enhanced tests for the new workspace hint functionality to ensure accurate representation of project context.

These changes improve user experience by providing clearer insights into the Design Files workspace and facilitating better project management.
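
The capped listing could look roughly like the sketch below. The function name `formatWorkspaceHintSketch`, the default limits, and the output wording are assumptions for illustration; the actual `formatDesignFilesWorkspaceHint` output format is not shown in this log.

```typescript
// Render one capped section; entries beyond `limit` are summarized in one line.
function formatSection(label: string, names: readonly string[], limit: number): string[] {
  const shown = names.slice(0, limit);
  const hidden = names.length - shown.length;
  const lines = shown.map((name) => `- ${name}`);
  if (hidden > 0) lines.push(`- ... and ${hidden} more`);
  return [`${label} (${names.length}):`, ...lines];
}

// Hypothetical limits; the real values are not stated in the commit message.
function formatWorkspaceHintSketch(
  folders: readonly string[],
  files: readonly string[],
  maxFolders = 20,
  maxFiles = 50,
): string {
  return [
    ...formatSection('Folders', folders, maxFolders),
    ...formatSection('Files', files, maxFiles),
  ].join('\n');
}
```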

* feat(composer): atomic @mention keyboard navigation and deletion

* feat(database): add attachments_json field to comments and enhance migration logic

- Introduced `attachments_json` field in the database schema for comments to support attachment storage.
- Updated migration functions to include the new field, ensuring existing data is properly migrated.
- Refactored related SQL queries to accommodate the new field, improving data handling for comments.

These changes enhance the comment functionality by allowing attachments to be stored and retrieved effectively.

* chore(nix): refresh pnpm deps hash

* feat(chat-composer): implement design toolbox for enhanced design actions

- Added a new design toolbox feature to the ChatComposer, allowing users to access various design actions such as 'auto-match', 'motion', and 'visual-polish'.
- Introduced a state management system for the design toolbox, including hooks for opening and closing the toolbox.
- Enhanced the user interface with new components and styles for the design toolbox, improving accessibility and usability.
- Implemented functionality to apply design actions directly from the toolbox, streamlining the design workflow.
- Added tests to ensure the correct behavior of the design toolbox and its interactions within the ChatComposer.

These changes significantly enhance the design capabilities within the ChatComposer, providing users with a more efficient and intuitive design experience.

* feat(chat-composer): enhance design toolbox resource management

- Expanded the design toolbox functionality in ChatComposer to include a comprehensive resource index, allowing for better organization and retrieval of skills, plugins, MCP servers, templates, connectors, and project files.
- Introduced new types and interfaces to support the expanded resource management, improving type safety and clarity in the codebase.
- Updated the design toolbox action descriptions and search functionality to reflect the new resource capabilities, enhancing user experience.
- Added tests to validate the new resource indexing and search features, ensuring robust functionality.

These enhancements significantly improve the design workflow by providing users with a more organized and efficient way to access various design resources.

* feat(workspace): enhance workspace context management and UI integration

- Introduced a new function `renderWorkspaceContextToolHints` to provide contextual hints based on the type of workspace items (browser, terminal, files, live artifacts).
- Updated `ChatComposer`, `ChatPane`, and `FileWorkspace` components to support and display workspace context items, improving user interaction and accessibility.
- Enhanced the `QuickSwitcher` and `TabLauncherMenu` components to include workspace context items in search results, allowing users to navigate between tabs and files more efficiently.
- Added new translations and updated existing ones to reflect the inclusion of workspace tabs in the user interface.

These enhancements significantly improve the usability and functionality of the workspace, providing users with better context and navigation options.

* test(e2e): assert Lexical composer content with toHaveText, not toHaveValue

The chat composer and home hero input are now Lexical contenteditable
editors, not native form controls, so Playwright's toHaveValue (form-only)
fails with "Not an input element". Switch all chat-composer-input and
home-hero-input content assertions to toHaveText, and assert multi-line
soft-break cases with separate toContainText checks since the editor's
textContent collapses the newline (the newline reaching the sent payload
is already covered by downstream message/payload assertions).

* chore(nix): refresh pnpm deps hash

* chore(nix): refresh pnpm deps hash

* test(e2e): open settings through the entry settings menu

The home settings entry is now a menu (EntrySettingsMenu): clicking the
gear opens a popover whose "Settings" item opens the full execution-mode
dialog. Update the three specs that assumed a single click opened the
dialog directly to go through the menu trigger + open-details item.

* test(web): drive HomeView context picker through the Lexical helper

The home hero input is a Lexical contenteditable, so fireEvent.change /
reading .value throws "element does not have a value setter". Switch the
MCP+connector first-turn-context spec to setHomeHeroPrompt / homeHeroPromptText
like the rest of the file already does.

* Fix PR review blockers

* Update Nix pnpm deps hash

* Enhance project folder routes with error handling and tests

- Added checks for project existence in GET, POST, and DELETE folder routes, returning a 404 error if the project is not found.
- Updated tests to verify 404 responses for unknown project IDs in folder operations.
- Improved folder metadata handling in project routes.
- Refactored the TabLauncherMenu component for better UI structure and scrolling behavior.
- Adjusted styles for the TabLauncherMenu to improve usability and visual consistency.

* Update ProjectView component and enhance design toolbox localization

- Modified the ProjectView component to include workspacePanelTrack in chat panel width adjustments.
- Updated localization files to add new keys and translations for the design toolbox in English, Simplified Chinese, and Traditional Chinese.
- Enhanced user experience by providing comprehensive tooltips and prompts for design toolbox actions.

* Add browser use prompt launcher

* Cover browser use prompt launcher

* Add design toolbox badge translations

* Refactor ChatComposer and DesignBrowserPanel components

- Removed the design system picker from the ChatComposer component and integrated it into the StagedRunContexts for better context handling.
- Updated the DesignBrowserPanel to include a new function for determining viewport icons and modified the browser use categories to reflect changes in action prompts and titles.
- Enhanced the HandoffButton component with a new feature to copy the project path to the clipboard, improving user experience.
- Added new styles for search input and empty states in design files, enhancing the UI consistency across components.

* Enhance ChatPane and DesignBrowserPanel components with new features and styles

- Added scroll handling features to the ChatPane component, including scrollable state management and improved user experience during chat interactions.
- Updated the DesignBrowserPanel to include new localization keys and improved category titles for better clarity in the browser use prompt.
- Enhanced styles for chat log scrolling behavior, providing a more intuitive interface for users.
- Implemented keyboard shortcuts for tab navigation in the WorkspaceTabsBar, improving accessibility and user efficiency.

* Fix web workspace test hang

* Update styles and functionality for workspace tabs and chat components

- Adjusted CSS for the MAC_WINDOW_CHROME to improve spacing and margins for better layout.
- Enhanced the ChatComposer component to ensure proper tab order and mention handling, improving user experience.
- Implemented keyboard shortcuts for navigating workspace tabs, allowing for more efficient tab management.
- Updated localization strings for clarity in search prompts across multiple languages.
- Improved the HandoffButton component's UI for better visibility and interaction when copying project paths.

* Update visual settings flow

* Refresh Nix pnpm deps hashes

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: qiongyu1999 <2694684348@qq.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-06-03 15:46:22 +00:00
Caprika
86d92e9586 [codex] bump vela-cli to 0.0.10 (#3577)
* bump vela-cli to 0.0.10

* chore(nix): refresh pnpm deps hash

* chore(nix): refresh pnpm deps hash

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-03 15:12:06 +00:00
elihahah666
2471996f26 feat(web): add product-wide UI animations (#3294)
* feat(web): add product-wide UI animations

Add motion library (Framer Motion) for modal/toast/popover exit
animations via AnimatePresence, and a comprehensive CSS entrance
animation system (entrance.css) for all other UI surfaces.

Motion is limited to low-frequency mount/unmount components (modals,
toasts, popovers) to avoid re-render overhead. All high-frequency
components (grids, lists, nav, inputs) use pure CSS @keyframes with
staggered nth-child delays — zero JS cost.

Covers: home hero sequence, card grids, design kanban, marketplace,
settings, memory panels, MCP picker, inspector/comment slide-in,
chat history, file viewer, connector drawer, error banners, agent
cards, automation history, skill rows, breadcrumbs, workspace tabs,
staged attachments, code viewer, and button press feedback.

Respects prefers-reduced-motion system preference.

* fix(web): prevent thumbnail flash on tab switch

Replace individual card/row stagger animations (opacity 0→1 per child)
with container-level fades. The child animations replayed on every
React remount when switching tabs, causing all preview thumbnails to
flash from invisible.

Now only containers animate (fast fade-in); individual items inside
grids/lists stay fully visible immediately.

* fix: remove fade animations from tab-switchable content to prevent flash on remount

Grids, tab panels, entry sections, viewer panels, and other content that
remounts on tab switch no longer start from opacity:0. Only true one-time
mounts (page load, home hero), overlays/popovers, slide-in panels, and
error banners retain entrance animations.

* fix: keep tab content mounted with display:none to prevent thumbnail reload on tab switch

Projects, Tasks, Plugins, and Design Systems tabs now stay mounted in the
DOM when inactive, hidden via display:none. This preserves loaded images
and prevents the visible flash caused by thumbnails re-fetching on every
tab switch. Home and Integrations tabs keep conditional rendering since
they have no persistent media to preserve.

* fix: keep HomeView mounted to preserve recent project thumbnails on tab switch

* perf: use content-visibility:hidden for inactive tabs + add card press feedback

Replace display:none with content-visibility:hidden so the browser skips
rendering computation for hidden tabs while preserving the full DOM tree.
Add :active scale(0.98) press feedback to design-card, plugins-home__card,
and recent-projects__card for tactile click response.

* fix: pin dependency versions to exact (no caret ranges)

* chore: update lockfile for pinned dependency versions

* fix: replace PreviewModal motion with CSS animation, add motion test mock

- PreviewModal no longer uses motion/react — prevents test failures from
  AnimatePresence exit animations never completing in test env
- Add CSS animations for .ds-modal-backdrop (fade-in) and .ds-modal (scale-in)
- Add vitest alias to mock motion/react so AnimatePresence in other
  components (UpdaterPopup, ExamplesTab) completes synchronously in tests

* chore(nix): refresh pnpm deps hash

* test(e2e): scope visual-home locators to their entry view

The redesigned entry shell keeps every view mounted (only the active one
is visible) so tab switches don't reload thumbnails. That makes testids
like `plugins-home-section` and text like "Launchpad dashboard" exist in
more than one view at once, breaking Playwright strict-mode locators.
Tag each view container with `data-testid="entry-view-<name>"` and scope
the affected visual specs to the relevant view so the locators stay
unambiguous.

* test(e2e): scope entry-chrome home starters locators to their view

The redesigned entry shell keeps every view mounted (only the active one
is visible), so `plugins-home-section` and its children render in both the
home and plugins views at once, breaking Playwright strict-mode locators.
Scope the home starters assertions in entry-chrome-flows to the
`entry-view-home` container so the locators stay unambiguous, matching the
existing visual-home spec fix.

Generated-By: looper 0.8.1 (runner=fixer, agent=claude-code)

* fix(web): close a11y/reduced-motion/exit gaps in animation layer

Address three review findings on the product-wide animation PR:

- EntryShell kept every tab view mounted via content-visibility:hidden,
  but that only skips paint — inactive Home/Projects/Plugins subtrees
  stayed in the tab order and accessibility tree. Mark inactive view
  wrappers inert + aria-hidden so keyboard users and screen readers only
  see the active view while the cached DOM survives.
- The motion/react variants animated unconditionally; the CSS
  prefers-reduced-motion block did not cover them. Wrap the app root in
  MotionConfig reducedMotion="user" so dialogs/toasts/popovers honor the
  OS preference, and add a focused regression test for the wiring.
- NewProjectModal short-circuited with `if (!open) return null`, so its
  exit variants never ran. Render the body inside AnimatePresence gated
  on `open` so the close animation plays before unmount.

Generated-By: looper 0.8.1 (runner=fixer, agent=claude-code)

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-03 13:20:18 +00:00
lefarcen
e93689a247 chore(nix): refresh pnpm deps hash for merged lock (tmp@0.2.7 + release deps)
The back-merge lock combines release's deps with main's #3379 tmp@0.2.7 bump,
so the daemon/web fetchPnpmDeps FODs hash to values distinct from either
branch. Set to the hashes nix flake check reported for the merged tree.
2026-06-03 17:47:02 +08:00
Gateway
13d4612f63 security: resolve vulnerable tmp transitive dependency (#3379)
* security: override tmp to patched version

* chore: refresh nix pnpm deps hash

---------

Co-authored-by: Gateway <gateway@users.noreply.github.com>
Co-authored-by: a1chzt <chizblank@gmail.com>
2026-06-03 09:15:20 +00:00
kami
333a62cda6 fix: link od bin after fresh install (#2069)
* fix: link od bin after fresh install

* test: lock root od bin shim path

* test: cover root workspace deps in postinstall scan

* chore(nix): refresh pnpm deps hash
2026-05-31 04:36:49 +00:00
Caprika
76c7d31c53 chore: bump vela cli to 0.0.4 (#3239)
* chore: bump vela cli to 0.0.4-test.0

* chore: refresh lockfile for vela cli 0.0.4-test.0

* chore(nix): refresh pnpm deps hash

* fix: materialize electron before mac release checks

* fix: rebuild electron when mac framework links are invalid

* revert: drop release workflow experiments

* chore(nix): refresh pnpm deps hash

* fix: stop blocking beta mac release on electron symlink preflight

* fix: stop using custom electron dist for beta mac packaging

* fix: guard oversized chat images and opencode overflow

* chore: bump vela cli to 0.0.4

* chore(nix): refresh pnpm deps hash

* fix(daemon): surface prompt-image stat failures instead of dropping them

resolveSafePromptImagePaths only swallowed unresolvable path input; once a
path was confirmed inside UPLOAD_DIR and existed, a statSync failure
(EACCES/EPERM, a file vanishing mid-run) silently dropped the image and let
the run continue without that prompt context. Since this helper is now also
the 1 MB enforcement point, that turned an infra/validation failure into a
'successful' run with missing required context.

Collect those into a new failedImages bucket and fail the run with
INTERNAL_ERROR at the call site, mirroring the oversized-image guard. Add a
unit test covering statSync throwing.
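
A sketch of the bucketing described above, under stated assumptions: `classifyPromptImages`, the injected `stat` callback, and the strict `>` comparison against 1 MB are hypothetical. The real `resolveSafePromptImagePaths` also validates paths against `UPLOAD_DIR`, which is omitted here.

```typescript
interface ImageCheckResult {
  safe: string[];
  oversized: string[];
  failed: string[];
}

// Injected so the sketch stays testable without touching the filesystem;
// in a daemon this would typically wrap fs.statSync.
type StatFn = (path: string) => { size: number };

const MAX_IMAGE_BYTES = 1024 * 1024; // the 1 MB limit mentioned in the commit message

function classifyPromptImages(
  paths: readonly string[],
  stat: StatFn,
  maxBytes: number = MAX_IMAGE_BYTES,
): ImageCheckResult {
  const result: ImageCheckResult = { safe: [], oversized: [], failed: [] };
  for (const path of paths) {
    try {
      const { size } = stat(path);
      if (size > maxBytes) result.oversized.push(path);
      else result.safe.push(path);
    } catch {
      // Record the failure instead of silently dropping the image.
      result.failed.push(path);
    }
  }
  return result;
}
```

A caller would then fail the run when either `oversized` or `failed` is non-empty, rather than continuing with missing prompt context.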

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: lefarcen <935902669@qq.com>
2026-05-29 06:41:17 +00:00
lefarcen
df8a0faff6 feat(runtimes): register AMR (vela) as an ACP stdio agent (#2355)
* feat(runtimes): register AMR (vela) as an ACP stdio agent

AMR is the vela CLI's ACP runtime mode. `vela agent run --runtime opencode`
speaks ACP JSON-RPC over stdio (see vela's
`specs/current/runtime/manual-agent-run-openrouter.md`); per
`docs/new-agent-runtime-acp.md` we expose it through the same `streamFormat:
'acp-json-rpc'` transport that already powers Hermes, Devin, Kimi, etc.

The new `defs/amr.ts` is the entire wiring — `buildArgs` returns
`['agent', 'run', '--runtime', 'opencode']`, `fetchModels` reuses
`detectAcpModels`, and the fallback list seeds the OpenRouter ids vela's
e2e baseline uses. `executables.ts`/`app-config.ts`/`metadata.ts` get the
matching `VELA_BIN`/`VELA_LINK_URL`/`VELA_RUNTIME_KEY`/`VELA_OPENCODE_BIN`
allowlist + install/docs URLs, so users can configure the per-agent env in
Settings without leaking into other adapters.

Coverage: `tests/fixtures/fake-vela.mjs` is a minimal ACP stub that returns
the documented `initialize` / `session/new` / `session/set_model` /
`session/prompt` shapes; `tests/amr-acp-integration.test.ts` spawns it via
`child_process.spawn` and drives a full turn through `attachAcpSession` and
`detectAcpModels`, so the ACP transport contract for AMR is end-to-end
verified locally even before a real `vela` binary is installed.

Validated:
- pnpm guard
- pnpm typecheck (all workspace projects)
- pnpm --filter @open-design/daemon test (2881/2881)

Deferred: real OpenRouter-backed turn through a built `vela` binary —
the runtime def needs no changes for that path, only `VELA_RUNTIME_KEY`
and `VELA_LINK_URL` in env (or Settings).

* fix(runtimes/amr): pin a concrete default model and bare openai ids

End-to-end validation against a freshly-built `vela` (nexu-io/vela@main)
+ OpenRouter surfaced two contract details the first AMR runtime def
got wrong:

1. vela rejects `session/prompt` with `session/set_model must be called
   before session/prompt`. attachAcpSession in apps/daemon/src/acp.ts
   skips set_model whenever the picked model is the synthetic 'default'
   id, so AMR's fallback list must NOT include DEFAULT_MODEL_OPTION. The
   def now ships a concrete `gpt-5.4-mini` as both `fetchModels`'
   default option and `fallbackModels[0]`, which makes attachAcpSession
   always send a real `session/set_model` for AMR turns.

2. `vela --runtime opencode` auto-prepends `openai/` to whatever modelId
   it forwards to opencode's openai provider. With OpenRouter-style ids
   like `openai/gpt-5.4-mini`, opencode receives the double-prefixed
   `openai/openai/gpt-5.4-mini` and replies `ProviderModelNotFoundError`.
   The new fallback list ships the bare ids opencode's openai registry
   actually knows about (gpt-5.4, gpt-5.4-mini, gpt-5.4-fast, etc.).
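
The double-prefix failure in item 2 can be reproduced with a small model of the behavior. Both functions are illustrative: `forwardedModelId` mimics the unconditional prepend the commit attributes to vela, and `toBareModelId` is a hypothetical normalizer. The commit itself fixes the issue by shipping bare ids in the fallback list, not by stripping prefixes.

```typescript
const PROVIDER_PREFIX = 'openai/';

// Models the described behavior: the prefix is prepended unconditionally.
function forwardedModelId(modelId: string): string {
  return PROVIDER_PREFIX + modelId;
}

// Hypothetical helper: strip one leading provider prefix if present.
function toBareModelId(modelId: string): string {
  return modelId.startsWith(PROVIDER_PREFIX)
    ? modelId.slice(PROVIDER_PREFIX.length)
    : modelId;
}
```

With an OpenRouter-style id the forwarded value becomes `openai/openai/gpt-5.4-mini`, while a bare id forwards as `openai/gpt-5.4-mini`.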

Stub + tests:
- tests/fixtures/fake-vela.mjs now enforces the set_model gate the same
  way real vela does, so a regression that silently goes back to
  model: 'default' would surface as a fatal error in tests instead of a
  hidden production failure.
- tests/amr-acp-integration.test.ts pins both contracts: no 'default' /
  no 'openai/' prefix in fallbackModels, and a negative case that
  asserts session/prompt fails when no model is set.
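
The set_model gate the stub enforces can be sketched as a tiny state machine. This is not the contents of `fake-vela.mjs`: the class name, the request and response shapes, and the `-32000` error code are assumptions. Only the error message text and the method names come from the commit message.

```typescript
interface RpcRequest {
  id: number;
  method: string;
  params?: { modelId?: string };
}

type RpcResponse =
  | { id: number; result: Record<string, unknown> }
  | { id: number; error: { code: number; message: string } };

class FakeVelaSession {
  private modelId: string | null = null;

  handle(req: RpcRequest): RpcResponse {
    switch (req.method) {
      case 'session/set_model': {
        const modelId = req.params?.modelId;
        if (!modelId) {
          // -32602 is the JSON-RPC "Invalid params" code.
          return { id: req.id, error: { code: -32602, message: 'modelId is required' } };
        }
        this.modelId = modelId;
        return { id: req.id, result: {} };
      }
      case 'session/prompt':
        if (this.modelId === null) {
          // Assumed implementation-defined error code.
          return {
            id: req.id,
            error: { code: -32000, message: 'session/set_model must be called before session/prompt' },
          };
        }
        return { id: req.id, result: { modelId: this.modelId } };
      default:
        // -32601 is the JSON-RPC "Method not found" code.
        return { id: req.id, error: { code: -32601, message: `Method not found: ${req.method}` } };
    }
  }
}
```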

Adds `apps/daemon/scripts/verify-amr-real-vela.mjs` — a small dev-time
runner that drives `attachAcpSession` against a real `vela` binary and
prints the daemon's chat events, so future protocol drift can be checked
against an actual OpenRouter call.

Verified locally: `vela agent run --runtime opencode` + OpenRouter
returns the prompted string ("AMR-E2E-PASS") through the full daemon
pipeline; daemon test suite stays 2883/2883.

* fix(runtimes/amr): substitute concrete model when chat run sends 'default'

A plugin-driven AMR run from the UI surfaced a real-world hole in the
prior commit:

  json-rpc id 3: session/set_model must be called before session/prompt

The Default-design-router plugin (and any caller that doesn't pin a
real model) sends `model: 'default'` straight through, which the AMR
runtime def cannot accept — vela rejects `session/prompt` without
`session/set_model` and attachAcpSession skips set_model whenever
model === 'default'. Just leaving DEFAULT_MODEL_OPTION out of the
adapter's `fallbackModels` is not enough: the chat-run handler in
server.ts still forwarded 'default' verbatim.

This adds `resolveModelForAgent(def, resolved, env?)` as the
single source of truth for the substitution:

  1. If the caller picked a real id, pass it through.
  2. Else, if `def.defaultModelEnvVar` is set and the daemon process
     env has a non-empty value for it, return that (operator escape
     hatch — see below).
  3. Else, if the def's `fallbackModels` does NOT contain a 'default'
     id, return `fallbackModels[0].id`.
  4. Else, return the original value (the historic shape — defs that
     list 'default' themselves are untouched).
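
The four steps above can be sketched as follows. This is an illustration, not the repository's `resolveModelForAgent`: the `AgentDefSketch` type, the `Sketch` suffix, and the empty-object default for `env` are assumptions (the real resolver reads the daemon's process environment).

```typescript
interface ModelOption {
  id: string;
}

// Hypothetical subset of a runtime def; only the fields the steps mention.
interface AgentDefSketch {
  fallbackModels: ModelOption[];
  defaultModelEnvVar?: string;
}

const DEFAULT_MODEL_ID = 'default';

function resolveModelForAgentSketch(
  def: AgentDefSketch,
  requested: string | null | undefined,
  env: Record<string, string | undefined> = {},
): string | null {
  // 1. A real id picked by the caller passes through untouched.
  if (requested && requested !== DEFAULT_MODEL_ID) return requested;

  // 2. Operator override from the daemon environment, if configured and non-empty.
  if (def.defaultModelEnvVar) {
    const override = env[def.defaultModelEnvVar]?.trim();
    if (override) return override;
  }

  // 3. Defs that do not list 'default' get their first concrete fallback.
  const listsDefault = def.fallbackModels.some((m) => m.id === DEFAULT_MODEL_ID);
  const first = def.fallbackModels[0];
  if (!listsDefault && first) return first.id;

  // 4. Historic shape: return the original value unchanged.
  return requested ?? null;
}
```

Keeping the substitution in one function means every call site that hands a model to `attachAcpSession` can share the same rule.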

AMR sets `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`, so when
opencode's openai-provider registry deprecates `gpt-5.4-mini`
upstream, an operator can swap the fallback id without a code change
by exporting `VELA_DEFAULT_MODEL=gpt-5.5` before launching tools-dev
/ od. Note that the env var must live in the daemon's `process.env`
(Settings-UI per-agent env values only reach the spawned child, not
the daemon's resolver); the new field's docblock spells this out.

Coverage:
- `tests/runtimes/resolve-model.test.ts` — 8 unit tests covering all
  four resolver branches plus the env-override happy path / fallback /
  ignore-when-user-picked-a-real-id case.
- `pnpm --filter @open-design/daemon typecheck` clean.

* chore(runtimes/amr): move AMR to the top of the base agent list

So `AMR (vela)` shows up first in the agent picker / status views,
ahead of claude / codex. Pure ordering change; no behavior delta.

* feat(amr): Sign-in / Sign-out button on the AMR Settings card

The first half of the AMR work assumed the operator would set
VELA_RUNTIME_KEY / VELA_LINK_URL on the daemon process and never
surfaced login state to users. This adds the missing UX so a fresh
install can drive the full path from Settings:

  - GET  /api/integrations/vela/status   reads ~/.vela/config.json
    for the active profile and returns { loggedIn, profile, user }
    (without leaking the runtime/control keys themselves).
  - POST /api/integrations/vela/login    spawns `vela login` once
    (409 if one is already in flight). The vela CLI opens the user's
    browser to the device-authorization page itself — Open Design
    only needs to kick the subprocess off.
  - POST /api/integrations/vela/logout   removes ~/.vela/config.json
    so the next status read returns logged-out.
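
A sketch of the status projection, assuming a config layout that this log does not confirm: `activeProfile`, `profiles`, and the `email` / `plan` user fields are hypothetical names. What it illustrates is the stated contract: `loggedIn` depends on a runtime key being present, and neither key is copied into the response.

```typescript
// Assumed on-disk shape of ~/.vela/config.json; field names are hypothetical.
interface VelaProfile {
  runtimeKey?: string;
  controlKey?: string;
  user?: { email?: string; plan?: string };
}

interface VelaConfig {
  activeProfile?: string;
  profiles?: Record<string, VelaProfile>;
}

interface VelaStatus {
  loggedIn: boolean;
  profile: string | null;
  user: { email: string | null; plan: string | null } | null;
}

function projectVelaStatus(config: VelaConfig | null): VelaStatus {
  const name = config?.activeProfile ?? null;
  const profile = name ? config?.profiles?.[name] : undefined;
  // User info without a runtime key still counts as logged out.
  const loggedIn = Boolean(profile?.runtimeKey);
  // Copy only allow-listed fields so secrets can never reach the payload.
  const user =
    loggedIn && profile?.user
      ? { email: profile.user.email ?? null, plan: profile.user.plan ?? null }
      : null;
  return { loggedIn, profile: name, user };
}
```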

`AmrAgentCard` is a dedicated agent-card component for AMR because
the existing `<button>` row can't host an interactive sub-control
(nested interactive elements). It polls /status after a login click
until the daemon reports loggedIn=true (or 5 minutes elapse), and
exposes a Sign-out action on hover. Other adapters (claude, codex,
hermes, …) keep their existing `<button>` card.

i18n: 8 new keys (settings.amrLogin / Logout / LoggingIn / etc.)
added to en + zh-CN. Other locales spread `en` and inherit the
English copy until translations land.

Coverage:
- `tests/integrations/vela.test.ts` pins the config.json reader
  against a tmp HOME — including the negative case where a profile
  has user info but no runtimeKey (still logged-out), and the
  secret-leak guard ("rt-secret-*" must not appear in the projection
  payload).
- `tests/components/AmrAgentCard.test.tsx` covers all four UI
  states (logged-out, logging-in, logged-in, logging-out) plus the
  click-propagation invariant the divergent card was built to keep.

`pnpm --filter @open-design/daemon test` 2901 / 2901 passing.
`pnpm --filter @open-design/web test` 1719 / 1719 passing.
`pnpm typecheck` + `pnpm guard` clean.

Dev script side-effects: `apps/daemon/scripts/verify-amr-real-vela.mjs`
no longer requires both VELA_RUNTIME_KEY and VELA_LINK_URL — if
VELA_PROFILE is set, the vela CLI is allowed to resolve credentials
from `~/.vela/config.json`. Added the two AMR `.mjs` fixtures to
`scripts/guard.ts` allowlist with the executable-fixture / dev-runner
rationale.

* fix(connection-test): substitute model for AMR before attachAcpSession

The chat-run path in server.ts already routes the requested model through
`resolveModelForAgent` so AMR / vela (whose CLI demands an explicit
`session/set_model` before `session/prompt`) gets the def's first
concrete fallback id when the chat run ships `model: 'default'`.
`connectionTest.ts` was wiring `attachAcpSession({ ..., model: model ?? null })`
directly, which made the Test Connection button on the AMR Settings
card deadlock with the same `session/set_model must be called before
session/prompt` error the chat-run path already handles — surfaced as a
permanent "Testing connection…" spinner in the UI.

Reuse the same helper here so Test Connection mirrors chat-run behavior.

* test(amr): three-layer end-to-end coverage for the AMR login + turn flow

The PR up to this point shipped runtime + UI code with unit-level Vitest
coverage. This commit adds the cross-layer regression net the live demo
relied on:

1. apps/daemon/tests/integrations/vela.routes.test.ts (HTTP, Vitest)
   Spins up the real daemon Express app via `startServer({port:0,...})`,
   persists `agentCliEnv.amr.VELA_BIN = <fake>` into app-config.json,
   and exercises every /api/integrations/vela/* endpoint against the
   extended fake-vela stub:
     - status reads ~/.vela/config.json under various states
     - login spawns the fake, waits for config.json to appear, returns
       pid + startedAt + profile
     - 409 already-running guard with the stub's delay knob
     - logout removes the file (idempotent)
     - secrets (runtimeKey / controlKey) never leak in the projection
     - login → status round-trip flips loggedIn=false → true

2. e2e/tests/amr/turn.test.ts (tools-dev orchestrated, Vitest)
   Boots a namespaced daemon + web pair through `createSmokeSuite`,
   inlines a self-contained fake `vela` binary that handles BOTH
   `vela login` (writes ~/.vela/config.json) and
   `vela agent run --runtime opencode` (ACP stdio with the
   `session/set_model must precede session/prompt` gate the real binary
   enforces), then drives a complete /api/runs lifecycle for
   `agentId: 'amr', model: 'default'` and asserts the assistant message
   captures the fake's streamed text. This is the test that would have
   surfaced today's plugin-default-model regression (the `set_model
   before prompt` error) at PR time instead of demo time.

3. e2e/ui/amr-login-pill.test.ts (Playwright)
   Mocks /api/agents + /api/integrations/vela/{status,login,logout}
   to drive the Settings AMR card through the full Sign in → Signed in
   → Sign out cycle. Pins the AmrLoginPill polling contract and the
   aria-label semantics (the pill's accessible name is "Sign out" once
   logged in, regardless of which label the hover-state text shows).

fake-vela.mjs extensions:
   - Handles `vela login` argv by writing
     ~/.vela/config.json for the active VELA_PROFILE and exiting 0 —
     mirrors real vela's on-disk side-effect without the device-auth
     loop.
   - FAKE_VELA_LOGIN_DELAY_MS knob so route tests can observe the
     in-flight state of the spawn lifecycle.
   - FAKE_VELA_LOGIN_USER_EMAIL / _USER_PLAN to assert the surfaced
     user fields end-to-end.

Validated:
   - `pnpm guard` + `pnpm typecheck` (all workspace projects)
   - `pnpm --filter @open-design/daemon test`: 2998 / 2998 passing,
     including the new 8-test integration suite.
   - `cd e2e && pnpm test tests/amr`: 1 / 1 passing.
   - `cd e2e && pnpm exec playwright test ui/amr-login-pill.test.ts`:
     1 / 1 passing (6.7s).

* feat(amr): package native cli and refine login ui

* feat(amr): wire vela cli beta packaging

* docs(amr): document vela ci packaging review

* docs(amr): refine vela ci integration review

* fix(ci): refresh nix pnpm dependency hashes

* fix(pack): clean up Vela CLI packaging

* fix(pack): bundle Vela CLI support files

* fix(amr): recover login attempts from stale auth state

* test: expand AMR and automations coverage

* fix(amr): address review follow-ups

* test(web): align tasks fixtures with contracts

* fix(daemon): type wildcard route params

* fix(ci): refresh PR merge validation

* fix(amr): clear env credentials on logout

* feat(settings): inline local CLI model configuration

* fix(amr): recognize daemon env credentials

* [codex] Fix Vela companion packaging (#2979)

* Fix Vela companion packaging

* Update Nix pnpm dependency hashes

* [codex] Surface AMR account failures (#2980)

* fix: surface AMR account failures

* fix: cover AMR recovery error guidance

* chore: bump beta base version to 0.8.1 (#2990)

* Fix AMR profile and packaged runtime review issues

* Detect packaged AMR OpenCode companion tree

* feat(web): polish AMR frontend flows

* Polish AMR onboarding card

* fix: read AMR login state from dot-amr config (#3048)

* test: tighten AMR credential and packaging coverage

* test: restore AMR executable test env helper

* [codex] Fix packaged mac Dock identity and AMR label (#3076)

* Fix packaged mac sidecar Dock identity

* Rename AMR assistant label

* Fix AMR live models and dot-amr login state (#3073)

* fix: read AMR login state from dot-amr config

* fix: load live AMR models before runs

* fix: point AMR onboarding link to production wallet

* fix: address AMR model review feedback

* fix: persist live AMR model fallback

* [codex] Fix AMR link catalog model ids (#3088)

* Fix packaged mac sidecar Dock identity

* Rename AMR assistant label

* Fix AMR link catalog model ids

* Fix AMR model normalization typecheck

* Use live AMR model for default runs

* fix: polish AMR runtime settings UI

* Accelerate AMR startup defaults (#3092)

* Surface AMR insufficient balance wallet URL (#3099)

* fix(web): polish onboarding controls (#3112)

* fix(web): show CLI scan loading state

* Avoid duplicate AMR wallet recharge links (#3117)

* Avoid duplicate AMR wallet recharge links

* Use Vela CLI 0.0.3 test package

* chore(nix): refresh pnpm deps hash

* Fix AMR wallet guidance display

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>

* chore(pack): pin Vela CLI 0.0.3-test.1 (#3127)

* chore(nix): refresh pnpm deps hash

* chore(pack): pin Vela CLI 0.0.3

* chore(nix): refresh pnpm deps hash

* fix(web): suppress AMR exit 130 fallback (#3136)

* feat(web): nudge users to hosted AMR on model/auth/quota failures (#3083)

* feat(web): nudge users to hosted AMR on model/auth/quota failures

When a non-AMR agent run fails with an auth / quota / upstream model
error, surface an inline nudge under the error pill linking to Open
Design's hosted AMR gateway (https://open-design.ai/amr). The nudge
fires `surface_view` (element=run_failed_toast) on impression and
`ui_click` (element=go_amr) on the link.

Also teach the daemon to classify CLI-agent auth/quota/upstream failures
(Claude Code, codex, ...) into specific API error codes
(AGENT_AUTH_REQUIRED / RATE_LIMITED / UPSTREAM_UNAVAILABLE) instead of
the generic AGENT_EXECUTION_FAILED, so both the error message and the
nudge key off accurate codes. AMR's own runs are excluded from the
nudge — they keep the dedicated sign-in / recharge affordances.

* feat(web): rework failed-run AMR guidance into per-case error UI

Replace the single inline nudge with a per-case failed-run experience
driven by the run's error code + agent:

- The error card is now neutral gray (was red) and always carries a
  retry button; it is driven by the persisted per-message error event so
  it survives a reload.
- Non-AMR agent hitting a model/auth/quota wall: a theme-color promotion
  card under the error card offers "switch to AMR & retry" — switches the
  run to AMR, opens Settings on the AMR card, and auto-retries once the
  account signs in (ProjectView polls vela login status, independent of
  the Settings pill lifecycle, with success / 5-min-timeout / unmount
  exits).
- AMR agent unauthorized: clearer copy + an "authorize & retry" button.
- AMR agent out of balance: clearer copy + a "top up" button to the AMR
  wallet, with manual retry.
- Settings AMR card: when opened from the nudge, it scrolls into view and
  pulses, and an authorize-button coachmark (a fake hand cursor that
  rises in and dismisses on hover) points at the sign-in control when not
  yet authorized.

analytics: surface_view (run_failed_toast) on the promotion card and
ui_click (go_amr) on its action are retained. i18n adds chat.amrCard.*
and chat.amrError.* (en / zh-CN / zh-TW translated; other locales fall
back to en) and drops the old chat.amrErrorGuidance keys.

* fix(daemon): require status context for numeric service-failure codes

Per review on #3083: the model-service classifier matched bare HTTP
status numbers (`500`, `502`, `429`, `401`), so ordinary CLI output like
`line 500`, `read 502 bytes`, or `exit code 401` could be misclassified
as a provider outage / auth wall and wrongly surface the AMR nudge. Now
a status number only counts when it carries explicit context (`HTTP 500`,
`status 503`, `code: 401`, `502 Bad Gateway`); textual provider phrases
(overloaded, bad gateway, service unavailable, rate limit, …) are
unchanged. Adds fixtures proving unrelated numeric output stays null.

* fix(web): keep error pill for failed runs ChatPane's card doesn't cover

Per review on #3083: the per-message gray error pill was suppressed for
every persisted error status event, but ChatPane only renders the
replacement top-level error card for `retryableAssistantMessage` (the
last failed assistant). So a failed turn that is no longer last (after a
follow-up) or an older failed run in history showed neither the pill nor
the card — its error detail vanished, undercutting reload/history
survival. ChatPane now passes `errorCardOwnerId` (the assistant id whose
error the card represents); AssistantMessage suppresses only that one
pill and keeps rendering StatusPill for all other error events.

* fix(daemon): don't treat a process exit code as an HTTP status

Follow-up to review on #3083: the status-context helper accepted a bare
`code` prefix, so `exit code 401` / `process exited with code 429` still
matched and got classified as AGENT_AUTH_REQUIRED / RATE_LIMITED (the
very `exit code 401` case the comment calls out as noise). `code` now
only counts when qualified (`status code` / `error code` / `response
code`) or punctuation-bound (`code: 401`); bare `exit code N` no longer
matches. Adds fixtures for exit-code lines returning null.
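
The status-context rule from the two classifier fixes above can be sketched as follows. This is a minimal illustration, not the daemon's implementation: `extractHttpStatus` and the three regexes are hypothetical names, and the reason-phrase list is an assumed subset.

```typescript
// Hypothetical helper: a status number only counts when it carries explicit
// context. Bare numbers and process exit codes yield null.
const QUALIFIED_STATUS_RE =
  /\b(?:HTTP|status(?:\s+code)?|(?:error|response)\s+code)[\s:=]+([1-5]\d{2})\b/i;
// `code` on its own only counts when punctuation binds it to the number.
const PUNCTUATED_CODE_RE = /\bcode\s*[:=]\s*([1-5]\d{2})\b/i;
// A number followed by a well-known reason phrase (assumed subset).
const REASON_PHRASE_RE =
  /\b([1-5]\d{2})\s+(?:Bad Gateway|Service Unavailable|Too Many Requests|Unauthorized|Internal Server Error)\b/i;

function extractHttpStatus(line: string): number | null {
  for (const re of [QUALIFIED_STATUS_RE, PUNCTUATED_CODE_RE, REASON_PHRASE_RE]) {
    const match = re.exec(line);
    if (match) return Number(match[1]);
  }
  return null;
}
```

Under these assumptions `HTTP 500`, `status 503`, `code: 401` and `502 Bad Gateway` resolve to a status, while `line 500`, `read 502 bytes` and `exit code 401` stay null.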

* chore(web): translate AMR card / error keys for 16 remaining locales

PR #3083 added 10 new `chat.amrCard.*` / `chat.amrError.*` keys but only
provided en/zh-CN/zh-TW translations; the other 16 locales fell back to
English. Translate the card title/body, three chips, primary CTA, and
the AMR self-error (auth / balance) messages and buttons for ar, de,
es-ES, fa, fr, hu, id, it, ja, ko, pl, pt-BR, ru, th, tr, uk.

* fix(amr): address review feedback on #2355

Targeted fixes for the unresolved review threads on #2355. Each fix
includes / updates a focused test.

- runtimes/executables.ts: `packagedVelaOpenCodeCompanionTree` now
  verifies the inner `opencode` executable exists + is runnable, not
  just the directory. This closes the false-positive availability path
  that let `detectAgents()` surface AMR as available even when the
  packaged companion was empty / partially copied (mrcfps, 4 threads).

- runtimes/executables.ts: `resolveAmrOpenCodeExecutable` now prefers
  the bundled `<OD_RESOURCE_ROOT>/bin/libexec/opencode/opencode` over a
  stale `opencode` on the user's PATH, so packaged AMR builds can't be
  hijacked by a global installation.

- web/EntryShell.tsx: when the Local CLI scan returns an available
  agent and the previously-selected agent is AMR, switch the selection
  to the first available local agent so the runtime and persisted
  agent agree before Continue.

- server.ts (model-probe branch): for AMR, check `readVelaLoginStatus`
  BEFORE rejecting on an empty live-model catalog — a signed-out user
  was getting `AMR_MODEL_UNAVAILABLE` ("choose a model") instead of
  the correct `AMR_AUTH_REQUIRED` (sign-in affordance).

- server.ts (default model fallback): if the user asked for the AMR
  agent default and the cached id is no longer in the FRESH catalog,
  fall back to `liveModels[0]` from the probe instead of rejecting the
  run as `AMR_MODEL_UNAVAILABLE`.

- integrations/vela.ts: route `vela login` through
  `createCommandInvocation` so an npm/Node-style `vela.cmd` / `.bat`
  shim on Windows gets the correct `cmd.exe /d /s /c …` wrapping with
  verbatim args (matches `execAgentFile` / chat-run spawning).

- tools/pack/src/linux.ts: in containerized Linux builds, bind-mount
  the host directory of `OPEN_DESIGN_VELA_CLI_BIN` and rewrite the env
  to the container-side path. The host path was being passed in as-is
  even though the default container only mounts /project, /tools-pack
  and cache/home — `copyOptionalVelaCliBinary` saw a missing path.
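
The two server.ts items above (auth check before the empty-catalog rejection, and the `liveModels[0]` fallback for default runs) can be sketched as one pure function. Everything here is an assumption for illustration: `preflightAmrModel`, its input shape, and the handling of a non-default model that is missing from the catalog are hypothetical; only the error codes and the `liveModels[0]` fallback come from the list.

```typescript
type AmrPreflight =
  | { ok: true; model: string }
  | { ok: false; code: "AMR_AUTH_REQUIRED" | "AMR_MODEL_UNAVAILABLE" };

// Hypothetical input shape; `loggedIn` stands in for the readVelaLoginStatus result.
interface AmrPreflightInput {
  loggedIn: boolean;
  liveModels: string[]; // fresh catalog from the model probe
  requestedModel: string; // "default" or a concrete id
  cachedDefault: string | null;
}

function preflightAmrModel(input: AmrPreflightInput): AmrPreflight {
  const { loggedIn, liveModels, requestedModel, cachedDefault } = input;
  const firstLive = liveModels[0];
  if (firstLive === undefined) {
    // Empty catalog: consult login state before blaming the model choice.
    return { ok: false, code: loggedIn ? "AMR_MODEL_UNAVAILABLE" : "AMR_AUTH_REQUIRED" };
  }
  if (requestedModel === "default") {
    // Keep the cached default only while the fresh catalog still lists it.
    if (cachedDefault !== null && liveModels.indexOf(cachedDefault) >= 0) {
      return { ok: true, model: cachedDefault };
    }
    return { ok: true, model: firstLive };
  }
  // Assumed behavior for an explicit model id that is not in the catalog.
  return liveModels.indexOf(requestedModel) >= 0
    ? { ok: true, model: requestedModel }
    : { ok: false, code: "AMR_MODEL_UNAVAILABLE" };
}
```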

Deferred (out of scope for this PR):
- `od amr status/login/logout/cancel` CLI subcommands (AGENTS.md
  UI/CLI dual-track rule, server.ts:5763) — sizable surface; tracked
  for a separate focused PR.
- Strict `--require-vela-cli` for Windows + mac-x64 beta builds:
  prematurely blocked — `@powerformer/vela-cli` only publishes the
  `darwin-arm64` platform binary today; adding the flag elsewhere
  would fail the builds. Revisit once win/x64/linux binaries ship.

* fix(amr): hoist sendAmrAccountFailure above the AMR catalog preflight (TDZ)

The new signed-out AMR branch in the catalog preflight at server.ts:10875
calls `sendAmrAccountFailure(...)` to emit AMR_AUTH_REQUIRED, but the
const declaration sat ~100 lines below at the outer function scope. Because
`const` bindings stay in the temporal dead zone (TDZ) until their
declaration runs, that branch would have thrown `ReferenceError:
Cannot access 'sendAmrAccountFailure' before initialization` for the
exact users it tries to help — defeating the original intent.

Hoist the helper to just above the AMR preflight block so it's available
to every AMR code path in this function. Behavior elsewhere is unchanged.
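
A minimal reproduction of the failure mode, assuming a simplified handler: `preflightBeforeHoist`, `preflightAfterHoist` and `sendAccountFailure` are hypothetical stand-ins, not the code in server.ts. The use sits inside a closure, which is one way such a reference can pass the type checker and still fail at runtime.

```typescript
// Before the hoist: the closure runs while the const is still uninitialized.
function preflightBeforeHoist(signedOut: boolean): string {
  const preflight = (): string =>
    signedOut ? sendAccountFailure("AMR_AUTH_REQUIRED") : "catalog-ok";
  let outcome: string;
  try {
    outcome = preflight();
  } catch (err) {
    // On ES2015+ targets the error is a ReferenceError (temporal dead zone).
    outcome = err instanceof Error ? `threw ${err.name}` : "threw";
  }
  const sendAccountFailure = (code: string): string => `sent ${code}`;
  return outcome;
}

// After the hoist: the helper is initialized before any AMR branch can call it.
function preflightAfterHoist(signedOut: boolean): string {
  const sendAccountFailure = (code: string): string => `sent ${code}`;
  const preflight = (): string =>
    signedOut ? sendAccountFailure("AMR_AUTH_REQUIRED") : "catalog-ok";
  return preflight();
}
```

Only the signed-out path touches the helper, which is why the signed-in path in the first version still works.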

Also fix a fixture the daemon test rerun exposed: `launch.test.ts > resolveAgentLaunch
uses packaged built-in Vela for AMR` was creating the
`<resourceRoot>/bin/libexec/opencode/` companion *directory* only, but
this PR's earlier tightening of `packagedVelaOpenCodeCompanionTree`
also requires the inner `opencode` executable. Add it to that fixture
to match the new contract; the test was a sibling of the executables /
env-and-detection fixtures already updated in 13fc4f4.

Addresses #2355 review (mrcfps, 2026-05-28).

* feat(web): add hover cancel for AMR login (#3158)

* feat(web): add hover cancel for AMR login

* fix(web): don't bounce AmrLoginPill back to 'Signing in…' after local cancel

Both codex-connector (P2) and looper (CHANGES_REQUESTED) on this PR
flagged the same race in the new local-cancel path: `handleCancelLogin`
dispatches `notifyAmrLoginStatusChanged('login-canceled')` immediately
after `/login/cancel` returns, but the `AMR_LOGIN_STATUS_EVENT` listener
unconditionally re-enters `refresh()` and then restarts polling
whenever `/api/integrations/vela/status` still reports
`loginInFlight: true`.

That is a real race because the daemon's `cancelVelaLogin()` only sends
SIGTERM (escalating to SIGKILL after `LOGIN_CANCEL_KILL_GRACE_MS` =
2000 ms) and keeps the child in `activeLoginProcs` until it actually
exits — so the first `/status` read after a successful cancel can
legally still come back as in-flight. Under that window the pill flips
back to 'Signing in…' and can later surface the timeout/error path even
though the user already canceled, defeating the behavior promised in
the PR description.

Fix the listener instead of every dispatch site: in the
`login-canceled` branch, after the local reset (stopPolling +
setPending(null) + clear refs), optimistically mark every subscribed
pill instance as not-in-flight (`setStatus((c) => c ? { ...c,
loginInFlight: false } : c)`) and `return` — skip the
refresh-and-reconcile branch below entirely. The next explicit refresh
(component mount, user interaction, or a `status-changed` event) will
pick up the daemon's confirmed state once the child has actually
exited.
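
The listener change can be sketched as a pure decision function. This is an illustration under assumptions: `reduceLoginEvent`, `VelaStatusView` and `ListenerOutcome` are hypothetical names, and the real listener also stops polling and clears refs, which is omitted here.

```typescript
interface VelaStatusView {
  loggedIn: boolean;
  loginInFlight: boolean;
}
type AmrLoginEvent = "login-canceled" | "status-changed";
interface ListenerOutcome {
  nextStatus: VelaStatusView | null;
  shouldRefresh: boolean;
}

function reduceLoginEvent(current: VelaStatusView | null, event: AmrLoginEvent): ListenerOutcome {
  if (event === "login-canceled") {
    // Trust the local cancel: mark not-in-flight and skip the /status re-read,
    // since the daemon may still report the child until it actually exits.
    return {
      nextStatus: current ? { ...current, loginInFlight: false } : current,
      shouldRefresh: false,
    };
  }
  // Every other event still reconciles against the daemon.
  return { nextStatus: current, shouldRefresh: true };
}
```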

Add a focused regression test that holds `/api/integrations/vela/status`
at `loginInFlight: true` even after a successful `/login/cancel`,
asserting that the pill stays at the Canceled → Authorize sequence and
never bounces back to 'Signing in…'. This test fails on the pre-fix
listener and passes on the new behavior; existing
'cancels an in-flight AMR sign-in…' and 'reconciles late AMR browser
completion to Signed in after local cancel' tests continue to pass.

Addresses review feedback on #3158 (chatgpt-codex-connector, nettee).

---------

Co-authored-by: lefarcen <935902669@qq.com>

---------

Co-authored-by: a1chzt <chizblank@gmail.com>
Co-authored-by: Amy <1184569493@qq.com>
Co-authored-by: Mason <jinmeihong0201@gmail.com>
Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com>
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-28 05:09:55 +00:00
Marc Chan
d5659d82d4 chore(nix): streamline pnpm deps hash maintenance (#2919)
* chore(nix): streamline pnpm deps hash maintenance

Generated-By: looper 0.9.0 (runner=worker, agent=opencode)

* fix(ci): satisfy actionlint in nix hash autofix

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* fix(ci): allow nix hash autofix on fork PRs

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* fix(ci): follow up nix hash review

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* fix(ci): tolerate nix hash bot token failures

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)
2026-05-26 07:35:38 +00:00
Patrick A
7bc11b398d chore(deps): upgrade express 4 -> 5 in daemon (#2311)
* chore(deps): upgrade express 4.22.1 -> 5.2.1 and @types/express

Breaking changes addressed:
- Renamed all bare wildcard route segments from * to *splat across
  src/server.ts, src/static-resource-routes.ts, src/project-routes.ts,
  src/import-export-routes.ts, and all three test stubs that define
  app.get/options/delete routes using /raw/* patterns
- Updated wildcard param access from (req.params as any)[0] / req.params[0]
  to Array.isArray(req.params.splat) ? req.params.splat.join('/') : String(...)
  to handle the Express 5 / path-to-regexp v8 change where wildcard params
  are now string[] instead of string
- Updated app.get('*') SPA fallback to app.get('/*splat') in server.ts
- Annotated five connector route handlers with Request<{ connectorId: string }>
  so the typed param resolves as string, not string | string[], fixing the
  10 TS2345 / TS2322 errors that surfaced when @types/express moved to 5.0.6
- Fixed two app.listen() beforeAll callbacks in origin-validation.test.ts to
  accept and propagate the optional Error argument Express 5 now passes to
  the listen callback, resolving TS2769 overload mismatch
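
The wildcard param handling from the list above can be sketched as a small helper. `wildcardPath` and `WildcardParams` are hypothetical names, and the empty-string fallback for a missing param is an assumption; the commit elides what it passes to `String(...)`.

```typescript
// Hypothetical helper; `splat` matches the `*splat` route segment name.
type WildcardParams = { splat?: string | string[] };

function wildcardPath(params: WildcardParams): string {
  const { splat } = params;
  // Express 5 / path-to-regexp v8 delivers wildcard segments as string[].
  if (Array.isArray(splat)) return splat.join("/");
  // Tolerate a plain string (older shape) or a missing param.
  return String(splat ?? "");
}
```

A route such as `/raw/*splat` would then read its path with `wildcardPath(req.params)` in place of `req.params[0]`.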

* chore(nix): refresh daemonHash for rebased lockfile

* fix(daemon): await res.sendFile() in async route handlers for Express 5 compatibility

Express 5 res.sendFile() returns a Promise. Without await, async route
handlers return before the response is sent, causing Express to call
next() and fall through to a 404. Add await to all res.sendFile() calls
in async handlers in static-resource-routes.ts and server.ts.

* fix(daemon): use readFile+send for spritesheet route instead of sendFile

Express 5 res.sendFile() returns undefined (not a Promise). ENOENT errors
call next() asynchronously after the route handler's try/catch has returned,
causing unhandled 404 responses. Replacing with fs.promises.readFile + res.send
keeps the error path fully within the handler's try/catch.

---------

Co-authored-by: Patrick A <259201958+eefynet@users.noreply.github.com>
2026-05-26 03:16:48 +00:00
Denis Redozubov
128b62f863 chore: patch qs advisory (#2833)
* chore: patch qs advisory

* chore: update daemon pnpm deps hash
2026-05-25 05:49:33 +00:00
PerishFire
34165ff189 chore: retire tools-pr (#2867) 2026-05-25 05:15:04 +00:00
Marc Chan
a5b47c5f76 fix(ci): narrow workflow scope and reuse setup steps (#2708)
* fix(ci): narrow workflow scope and reuse setup steps

* fix(ci): narrow workflow scope and reuse setup steps

Repair Nix fixed-output hashes for the filtered daemon and web source trees.

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): narrow workflow scope and reuse setup steps

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): narrow workflow scope and reuse setup steps

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): repair daemon and nix checks

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
2026-05-22 18:58:53 +08:00
lefarcen
7f03030f3f perf(landing): self-host fonts + inline critical CSS (#2599)
* perf(landing): self-host fonts + inline critical CSS

PageSpeed Insights flagged ~2.3s of render-blocking on /:
  globals.css   12.9 KB external link, 160ms
  fonts CSS     2.2 KB  fonts.googleapis.com, 750ms
  + 4 woff2     ~1200ms each from fonts.gstatic.com

Two changes drop that whole chain:

1. Self-host fonts via @fontsource-variable/{inter,inter-tight,
   playfair-display,jetbrains-mono}. Each family ships a single variable
   woff2 (covers all weights we use) that Astro bundles into /_astro/*
   alongside the rest of the build, served same-origin through CF Pages —
   no separate TLS handshake, no Google Fonts CSS round-trip. The CSS
   variable names get an extra alias in front (`'Inter Tight Variable',
   'Inter Tight', ...`) so a system fallback still works if the package
   ever ships under a different family name.

2. `astro.config.ts: build.inlineStylesheets: 'always'` inlines every
   emitted stylesheet into the HTML <head> as a <style> block instead of
   linking a separate /_astro/*.css file. The HTML grows from ~13KB to
   ~28KB (gzip) but
   loses one stylesheet round-trip + the entire @font-face chain that
   used to gate text rendering.

Component cleanup: the `<FontStylesheet>` component (preconnect + link to
fonts.googleapis.com) is no longer needed and is deleted, removed from
all 7 places that mounted it. og.astro keeps its own font setup since
it renders to a screenshot.

Expected effect (from PageSpeed Insights "Render-blocking requests"
diagnostic on the previous build):
  FCP  1.9s → ~1.2s
  LCP  2.2s → ~1.5s

Verified: pnpm typecheck 0 errors, pnpm build 1853 pages 78s, preview
serves /_astro/*.woff2 as font/woff2 same-origin, 0 fonts.googleapis or
fonts.gstatic references in the built HTML.

* perf(landing): include Playfair italic + bump nix pnpm-deps hash

Two follow-ups on the self-host fonts PR:

1. globals.css imported only `@fontsource-variable/playfair-display`,
   which ships @font-face for font-style: normal only. The previous
   Google Fonts URL included the italic axis (`ital,wght@0,500;1,400;
   ...`) and several rules (.roman, .work-rule .roman, .sec-rule .roman,
   plus 8 other italics across globals.css + sub-pages.css) render
   Playfair italics via `font-family: var(--serif); font-style: italic`.
   Without the italic face self-hosted, those would fall through to
   Times New Roman italic or browser synthesis. Adding
   `wght-italic.css` keeps the typography visually equivalent.

2. nix/pnpm-deps.nix uses a fixed-output derivation hash that has to
   match the pnpm vendored store; adding the four fontsource packages
   changed pnpm-lock.yaml so the hash has to be bumped to the value Nix
   reported in CI.

Codex (Looper reviewer) flagged #1 as non-blocking.

* perf(landing): pin fontsource versions exactly per repo guard

`pnpm add` defaulted to caret ranges (`^5.2.8`) but repo guard rejects
non-exact specs ("dependency specs must be exact versions like 1.2.3 or
workspace:*"). That was the actual cause of the Preflight + Validate
workspace failures — pinning to the locked versions Codex reviewer
called out:

  @fontsource-variable/inter             5.2.8
  @fontsource-variable/inter-tight       5.2.7
  @fontsource-variable/jetbrains-mono    5.2.8
  @fontsource-variable/playfair-display  5.2.8

`pnpm guard` now passes locally (6/6 tests).
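
The exact-version rule quoted above can be sketched as a single predicate. This is a guess at the shape of the check, not the repo guard's code: `isExactSpec` and the regex are hypothetical, and accepting prerelease suffixes is an assumption.

```typescript
// Hypothetical check: accept `workspace:*` or a bare x.y.z version,
// optionally with a prerelease suffix. Range operators are rejected.
const EXACT_VERSION_RE = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$/;

function isExactSpec(spec: string): boolean {
  return spec === "workspace:*" || EXACT_VERSION_RE.test(spec);
}
```

Under this sketch `5.2.8` passes and the caret range `^5.2.8` that `pnpm add` wrote does not.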
2026-05-22 11:49:16 +08:00
Marc Chan
10192dcc52 fix(ci): catch nix hash drift before merge (#2530)
* fix(ci): catch nix hash drift before merge

* fix(nix): add pnpm hash refresh helper

* chore(nix): drop redundant hash alias

* fix(nix): raise update-hash output buffer

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(nix): handle current pnpm deps hash

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(nix): reject non-mismatch hash updates

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)
2026-05-21 16:08:13 +08:00
Marc Chan
c45c5c9764 fix(ci): align visual selectors and nix hashes (#2471)
* fix(ci): align visual selectors and nix hashes

* fix(ci): add strict PR visual verification

* fix(ci): repair visual-home captures

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)
2026-05-21 10:45:37 +08:00
Chris Tam
7b1cc16988 fix(nix): force http:// scheme on bundled caddy site address (#2485)
A bare `host:port` site address lets Caddy pick the listener scheme by
port heuristic, which fights `auto_https off` and surfaces as TLS errors
when the browser hits plain HTTP on a non-standard port. Hardcode the
`http://` prefix in both the Home Manager and NixOS Caddyfile templates
— the bundled proxy is plaintext-only by design, so users who need TLS
run their own front-end with `webFrontend.enable = false`.
2026-05-21 10:44:51 +08:00
lefarcen
80d305858b feat(diagnostics): add one-click log export from Settings → About (#798)
* feat(diagnostics): add one-click log export from Settings → About

Adds a new "Export diagnostics" entry under the About section that bundles
daemon/web/desktop logs, machine info, and recent macOS crash reports into
a zip the user can share when reporting issues.

- Browser hits a new daemon HTTP endpoint and triggers a download.
- Electron uses an IPC bridge with the native save dialog and reveals the
  saved file in Finder/Explorer; the Help menu also exposes it as a
  fallback when the daemon is unresponsive.

Packaging + redaction lives in a new @open-design/diagnostics package so
both surfaces share it. Sensitive JSON keys, URL query secrets, and the
current user's home path are redacted before packaging.

* build(nix): include packages/diagnostics in daemon build targets

The Nix daemon derivation builds workspace siblings in dependency order
before compiling apps/daemon. Without @open-design/diagnostics in that
list, the daemon TypeScript build fails inside the Nix sandbox with
`Cannot find module '@open-design/diagnostics'` because pnpm install
only creates the symlink — the dist output that the package.json
exports point at isn't produced until each sibling's build script runs.

* build(tools-pack): include @open-design/diagnostics in packaged INTERNAL_PACKAGES

Without this, packaged win/mac/linux builds fail with `npm error 404` when
the post-build `npm install --omit=dev --no-package-lock` step in the
assembled app tries to resolve `@open-design/diagnostics@0.2.0` from the
public npm registry. The package is workspace-private, so it has to be
tarballed via `pnpm pack` and file:-referenced from the assembled
package.json like every other internal workspace dep that daemon/desktop
depend on.

Also wires the package's `pnpm --filter ... build` into the pre-pack
workspace build step so the dist/ exists before pnpm pack runs, and
updates the two test fixtures (`win-app.test.ts`, `workspace-build.test.ts`)
that mirror INTERNAL_PACKAGES.

The diagnostics package itself is repinned to exact dependency versions
already used elsewhere in the workspace (`jszip 3.10.1`, `@types/node
20.19.39`, `esbuild 0.28.0`, `typescript 5.9.3`, `vitest 4.1.6`) so it
passes the new `pnpm guard` exact-version rule and produces a minimal
lockfile diff vs main (additions only, no resolution-string churn).

* fix(diagnostics): include `~` in bearer-token redaction char class

RFC 6750's `b64token` grammar (the same character set as RFC 7235's
token68) allows `~`, so tokens like `Authorization: Bearer
abcd~efgh` were only partially matched by `HTTP_AUTH_SCHEME_RE`. The
regex stopped at the first `~`, leaving the tail (`~efgh`) un-redacted in
the exported diagnostics zip — a clear leak since this feature explicitly
generates support bundles for external sharing.

Add `~` to the character class and a regression test.
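
A minimal sketch of the fixed character class, assuming a simplified pattern: `BEARER_TOKEN_RE` and `redactBearerTokens` are hypothetical, and the real `HTTP_AUTH_SCHEME_RE` is not shown in this log and likely covers more schemes.

```typescript
// Hypothetical pattern: the class covers the characters RFC 6750 allows in a
// bearer token, including `~`, followed by optional `=` padding.
const BEARER_TOKEN_RE = /\b(Bearer)\s+[A-Za-z0-9\-._~+\/]+=*/gi;

function redactBearerTokens(text: string): string {
  // Keep the scheme name, replace the whole token.
  return text.replace(BEARER_TOKEN_RE, "$1 [REDACTED]");
}
```

With `~` missing from the class, the match for `Bearer abcd~efgh` would stop at `abcd` and leave `~efgh` in the output.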

* fix(diagnostics): only collect renderer.log from desktop

`buildSidecarLogSources` unconditionally added `logs/${app}/renderer.log`
for daemon/web/desktop, but only the desktop runtime writes a renderer
log (see apps/desktop/src/main/runtime.ts) — daemon and web are pure
Node services with no Electron renderer. Every export therefore produced
missing-file placeholders and manifest warnings for the two phantom
paths, polluting the bundle.

Gate the renderer.log source on APP_KEYS.DESKTOP so the daemon-side
collector matches the desktop-side collector in apps/desktop/src/main/
diagnostics.ts:63.

* fix(diagnostics): mirror desktop-side renderer.log gate

The previous fix only updated the daemon-side `buildSidecarLogSources`
in `apps/daemon/src/diagnostics-export.ts`. The desktop-side collector
at `apps/desktop/src/main/diagnostics.ts` had an identical copy of the
same bug that I overlooked: it also unconditionally added
`logs/${appKey}/renderer.log` for daemon/web/desktop, producing
missing-file placeholders + manifest warnings for the two phantom paths
on every desktop-initiated export.

Apply the same `appKey === APP_KEYS.DESKTOP` gate here so both export
entry points (browser via daemon HTTP, Electron via native save dialog)
emit the same clean manifest.
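
The gate shared by both collectors can be sketched like this. The shapes are assumptions: the `APP_KEYS` values, `logSourcesFor` and the `latest.log` file name are hypothetical; only `APP_KEYS.DESKTOP` and the `logs/${app}/renderer.log` path pattern come from the commit messages.

```typescript
// Assumed key values; only APP_KEYS.DESKTOP is named in the commits.
const APP_KEYS = { DAEMON: "daemon", WEB: "web", DESKTOP: "desktop" } as const;
type AppKey = (typeof APP_KEYS)[keyof typeof APP_KEYS];

function logSourcesFor(appKey: AppKey): string[] {
  // `latest.log` is a placeholder for whatever each app always writes.
  const sources = [`logs/${appKey}/latest.log`];
  // Only the desktop runtime has an Electron renderer, so only it gets this source.
  if (appKey === APP_KEYS.DESKTOP) sources.push(`logs/${appKey}/renderer.log`);
  return sources;
}
```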

* feat(diagnostics): add `od diagnostics export` CLI subcommand

AGENTS.md's dual-track capability-exposure contract requires every
user-facing feature to ship on both the web UI and the `od` CLI. The
diagnostics export was only reachable through Settings → About and the
desktop Help menu; this commit closes the loop with an `od diagnostics
export [<path>] [--json]` subcommand registered in SUBCOMMAND_MAP.

The CLI is a thin shell over the existing GET /api/diagnostics/export
endpoint — same zip output, same redaction, same crash-report scope.
Defaults to writing `open-design-diagnostics-<timestamp>.zip` in the
current directory; `--output <path>` or a positional arg overrides.
`--json` prints `{path, sizeBytes}` for shell pipelines.

Use cases this unlocks:
- A CI script can `od diagnostics export ~/artifacts/bundle.zip` after
  a failed run.
- Bug reporters on headless boxes can grab a bundle without booting
  the web UI.
- `od doctor` follow-ups can collect a full snapshot when a probe fails.

* fix(diagnostics): surface non-sidecar launch in manifest warnings

`buildSidecarLogSources()` returns `[]` when the daemon has no sidecar
runtime context, which is the standard `od` (plain) launch path —
`runDaemonCliStartup()` -> `startDaemonRuntime()` does not pass a
runtime. Settings → About and the new `od diagnostics export` previously
reported success but produced a bundle with only the summary JSONs, so
operators could not tell "no logs because plain launch" from "no logs
because something genuinely broke."

- Extend `DiagnosticsContext` with an optional upstream `warnings:
  string[]` that `buildManifest` merges into the manifest warnings.
- Emit STANDALONE_LAUNCH_WARNING from the daemon handler when
  `options.runtime == null`. The warning names the limitation and
  points the user at the sidecar entry points that DO capture logs.
- Add a regression spec at `apps/daemon/tests/diagnostics-export.test.ts`
  that drives the handler with `runtime: null` and asserts the warning
  surfaces in `summary/manifest.json` (and that `files` is empty so a
  user reading the bundle does not confuse "no log sources" with
  "missing files").
2026-05-20 09:10:51 +08:00
Marc Chan
60bd58a1f3 fix(nix): prebuild host package for web build (#2280) 2026-05-19 23:59:48 +08:00
PerishFire
bb13eee765 chore: optimize CI and beta release runtime (#2231)
* chore(ci): add runtime trace summaries

* chore(ci): tighten measured workspace steps

* chore(release): tighten beta setup steps

* chore(release): slim beta windows smoke

* chore(ci): shard daemon tests

* chore(ci): harden runtime trace lookup

* chore(release): avoid mac pnpm cache in beta

* chore(ci): split critical playwright checks

* chore(release): publish beta platforms from builders

* test(e2e): update beta release workflow expectation

* chore(ci): stop gating PRs on nix check

* fix(release): keep beta latest complete
2026-05-19 18:06:28 +08:00
PerishFire
bd48c597b0 chore: pin dependency versions and harden CI caches (#2189)
* chore: pin dependency versions

* ci: enforce pinned dependency specs

* ci: fix pnpm executable invocation
2026-05-19 13:58:27 +08:00
lefarcen
14cff69435 fix(nix): refresh pnpmDepsHash after merging main
The merge brought new entries into pnpm-lock.yaml (Leonardo.ai provider,
custom select primitive, Critique Theater wireup, etc.), so the vendored
pnpm store hash changed. CI computed the new hash from the freshly built
fixed-output derivation.

specified: sha256-mzGiD5K1l5SEFpQ++L4wq775ne+ViG4ruXYZYEdT6zQ= (pre-merge)
got:       sha256-lROdH5HgKFf3R7DYGbc8n/GrmINwLbfVwC4Xp7SrHN4= (post-merge)
2026-05-15 19:16:01 +08:00
PerishCode
545aed642e Merge remote-tracking branch 'origin/preview/0.8.0' into preview/v0.8.0
# Conflicts:
#	nix/package-daemon.nix
#	scripts/postinstall.mjs
2026-05-14 21:46:05 +08:00
PerishCode
883598f556 Build registry protocol in packaged workspaces 2026-05-14 21:23:45 +08:00
PerishCode
dbfe08eaf3 Update Nix pnpm deps hash 2026-05-14 21:15:27 +08:00
Tom Huang
76defffb93 Garnet hemisphere (#1702)
* feat(chat-composer): enhance mention handling and input overlay

- Introduced a new overlay for inline mentions in the chat composer, improving user experience by visually indicating mentions as users type.
- Updated the `ChatComposer` component to manage mention entities and integrate them into the input field, allowing for better context and interaction.
- Enhanced the `AssistantMessage` component to support the display of plugin action panels based on the current project context, facilitating easier plugin management.
- Refactored related components to ensure consistent handling of project files and mentions across the application.

This update significantly improves the chat interaction model, making it more intuitive for users to engage with mentions and plugins.

* feat(plugin-management): enhance plugin action panels and UI components

- Updated the `AssistantMessage` component to include plugin action panels based on the latest project context, improving user interaction with generated plugins.
- Refactored the `PluginsView` to support detailed views for available marketplace entries, allowing users to access more information and actions for each plugin.
- Introduced new CSS styles for improved visual representation of plugin-related UI elements, enhancing overall user experience.
- Enhanced the `listPlugins` function to include an option for fetching hidden plugins, providing more flexibility in plugin management.

This update significantly improves the usability and functionality of the plugin management system, making it easier for users to interact with and manage their plugins.

* fix(assistant-message): refine plugin folder candidate selection logic

- Updated the `pluginFoldersTouchedThisTurn` function to improve the logic for selecting plugin folder candidates based on touched paths and message content.
- Introduced a new helper function, `pathMatchesFolderFileBasename`, to enhance the matching criteria for folder candidates.
- Added a check for explicit folder matches before falling back to a single candidate, improving accuracy in folder selection.
- Modified the `shouldRenderSlotAsText` function in `HomeHero` to include the name parameter, refining the rendering logic for slot text.

These changes enhance the functionality and reliability of the assistant message component in managing plugin folder candidates.

* feat(plugin-folder-actions): implement agent-routed CLI actions for plugin management

- Introduced a new `PluginFolderAgentAction` type to streamline actions related to plugin folders, including install, publish, and contribute.
- Updated the `DesignFilesPanel`, `FileWorkspace`, and `AssistantMessage` components to utilize the new agent action handling, improving user interaction with generated plugins.
- Refactored the action handling logic to send commands to the agent, enhancing the workflow for managing plugin folders.
- Added corresponding tests to ensure the new functionality works as expected and integrates seamlessly with existing components.

This update significantly enhances the plugin management experience by routing actions through the agent, allowing for a more cohesive and interactive user experience.

* Fix PR 1702 CI blockers

* Fix PR 1702 remaining CI checks

* Prebuild AGUI adapter after install

* Restore plugin project snapshot wiring

* feat(marketplace): refactor marketplace URL handling and enhance fetching logic

- Introduced new functions to normalize marketplace URLs and manage fetching of marketplace manifests, improving the reliability of marketplace integrations.
- Updated the server and plugin logic to utilize the new fetching mechanisms, ensuring consistent handling of marketplace data.
- Enhanced tests to cover new URL normalization and fetching scenarios, ensuring robustness in marketplace management.

This update significantly improves the marketplace experience by streamlining URL handling and enhancing data fetching capabilities.

* Fix project auto-send cleanup spec
2026-05-14 21:12:50 +08:00
Nagendhra Madishetti
38a5ab69e6 feat(daemon): Critique Theater Phase 12 (9 Prometheus metrics + 6 log events + OTel span + Grafana dashboard) (#1485)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)

Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.

Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
  sticky against further lifecycle events for the same run, except
  for parser_warning which can land late and is recorded in a side
  channel without changing phase.
- A new run_started for a different runId at any time discards the
  prior state and reboots, so the UI can launch consecutive runs
  without an explicit reset action.
- Events whose runId does not match the active run return the same
  state reference, so React's useReducer doesn't re-render
  subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
  so an out-of-order panelist_dim for round 1 arriving after a
  round 2 dim does not corrupt the round 2 bucket.

Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
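
The defensive rules above can be sketched in a trimmed-down form. This is a minimal illustration only: `SketchEvent`, `SketchState`, and `reduce` are hypothetical names, and the real PanelEvent and CritiqueState carry per-round panelist data that is omitted here.

```typescript
// Hypothetical, reduced shapes; not the contracts-level types.
type Phase = 'idle' | 'running' | 'shipped' | 'degraded' | 'interrupted' | 'failed';
type EventType = 'run_started' | 'ship' | 'degraded' | 'interrupted' | 'failed' | 'parser_warning';

interface SketchEvent { type: EventType; runId: string; }
interface SketchState { phase: Phase; runId: string | null; warnings: number; }

const TERMINAL_PHASES: ReadonlyArray<Phase> = ['shipped', 'degraded', 'interrupted', 'failed'];

const initialState: SketchState = { phase: 'idle', runId: null, warnings: 0 };

function reduce(state: SketchState, event: SketchEvent): SketchState {
  // A run_started for a different runId discards the prior state at any time.
  if (event.type === 'run_started' && event.runId !== state.runId) {
    return { phase: 'running', runId: event.runId, warnings: 0 };
  }
  // Stray traffic for another run: return the same reference so subscribers do not re-render.
  if (event.runId !== state.runId) return state;
  // parser_warning may land late; record it in a side channel without changing phase.
  if (event.type === 'parser_warning') return { ...state, warnings: state.warnings + 1 };
  // Terminal phases are sticky against further lifecycle events for the same run.
  if (TERMINAL_PHASES.indexOf(state.phase) !== -1) return state;
  switch (event.type) {
    case 'ship': return { ...state, phase: 'shipped' };
    case 'degraded': return { ...state, phase: 'degraded' };
    case 'interrupted': return { ...state, phase: 'interrupted' };
    case 'failed': return { ...state, phase: 'failed' };
    default: return state; // duplicate run_started for the active run
  }
}
```

Returning the identical `state` reference in the guarded branches is what gives the stable-identity behaviour the tests pin.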

* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)

createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.

useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.

Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
  CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
  shape back to PanelEvent; malformed JSON is swallowed and does
  not stop the stream; exponential backoff schedule and ready-reset
  semantics are pinned with a setTimeout seam; close() cancels
  pending reconnects and shuts the live source; no-op fallback
  when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
  reducer driven by synthetic actions, no connection when disabled
  or projectId is null, clean close on unmount, projectId change
  reopens cleanly.
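
The reconnect schedule described above (1s → 30s, reset on `ready`) might look roughly like the sketch below. The doubling factor and the names `backoffDelayMs` and `ReconnectSchedule` are assumptions for illustration; only the 1s floor, the 30s cap, and the reset semantics come from the commit text.

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

// attempt 0 waits 1s, then the delay doubles (assumed factor) until it hits the 30s cap.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS);
}

// Hypothetical bookkeeping object; the real connection manager owns timers as well.
class ReconnectSchedule {
  private attempt = 0;

  // Delay to wait before the next reconnect attempt.
  next(): number {
    const delay = backoffDelayMs(this.attempt);
    this.attempt += 1;
    return delay;
  }

  // Called when the stream reports `ready`, so the next failure starts again at 1s.
  reset(): void {
    this.attempt = 0;
  }
}
```

Keeping the schedule as pure arithmetic, separate from the timer seam, is what makes it straightforward to pin in a unit test.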

* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)

Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.

The speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
  finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
  can watch the run unfold.
- paused parses the transcript but holds events back until the
  caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
  feature, currently treated as instant; replay timestamps are not
  yet persisted with each event so honest pacing requires a
  follow-up Phase 7+ task.

gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.

Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
  shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
  done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
  still land.
- .gz transcripts route through the gunzip seam.

Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
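
As a rough sketch of the transcript parsing step, the function below splits NDJSON into lines and keeps only the lines that parse and pass a predicate. `looksLikePanelEvent` is a hypothetical stand-in for the shared `isPanelEvent` predicate (which validates more), and `parseTranscript` is an illustrative name.

```typescript
interface MinimalPanelEvent { type: string; runId: string; }

// Stand-in predicate: checks only for a string type and a non-empty string runId.
function looksLikePanelEvent(value: unknown): value is MinimalPanelEvent {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { [key: string]: unknown };
  const type = candidate.type;
  const runId = candidate.runId;
  return typeof type === 'string' && typeof runId === 'string' && runId.length > 0;
}

function parseTranscript(ndjson: string): MinimalPanelEvent[] {
  const events: MinimalPanelEvent[] = [];
  for (const line of ndjson.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue; // tolerate blank lines and a trailing newline
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (looksLikePanelEvent(parsed)) events.push(parsed);
    } catch {
      // Malformed line: skip it so the valid events around it still land.
    }
  }
  return events;
}
```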

* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)

Two P1 fixes from lefarcen's review on PR #1307:

SSE payload override

`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.

Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
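
The spread-order fix can be illustrated with a minimal sketch. `decodeFrame` and `DecodedEvent` are hypothetical names, and the runId check stands in for the fuller `isPanelEvent` revalidation.

```typescript
const CHANNEL_PREFIX = 'critique.';

interface DecodedEvent { type: string; runId: string; [key: string]: unknown; }

function decodeFrame(channel: string, data: { [key: string]: unknown }): DecodedEvent | null {
  if (channel.indexOf(CHANNEL_PREFIX) !== 0) return null;
  const type = channel.slice(CHANNEL_PREFIX.length);
  const runId = data.runId;
  // Reduced validation: drop frames with a missing or empty runId.
  if (typeof runId !== 'string' || runId === '') return null;
  // Spread the payload first and set `type` last, so a payload-provided
  // `type` can never override the channel-derived one.
  return { ...data, type, runId };
}
```

With the payload spread last instead (`{ type, ...data }`), a hostile `type: 'ship'` in the payload would win, which is the gap the fix closes.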

Replay pause/resume

`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).

Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
  reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
  the next scheduled timer, flips to instant, and confirms the
  remaining transcript drains exactly once (cursor was preserved)

Doc consistency

The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive zero-delay timers via setTimeoutFn) because
transcripts do not yet carry per-event timestamps. Honest
timestamp-driven pacing is queued as a Phase 7+ follow-up.

Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.

* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)

* feat(web): Theater PanelistLane component (Phase 8.1)

* feat(web): Theater ScoreTicker component (Phase 8.2)

* feat(web): Theater RoundDivider component (Phase 8.3)

* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)

* feat(web): Theater TheaterDegraded chip (Phase 8.5)

* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)

* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)

* feat(web): Theater TheaterStage top-level container (Phase 8.8)

* feat(web): Theater CSS using existing semantic tokens (no hex literals)

* feat(web): Theater public exports barrel

* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)

Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.

State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
   Host hooks dispatch it when their gating prop changes so a stale
   run from a prior project / transcript cannot bleed into the next
   context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
   connection effect, so a workspace switch from project A (which
   streamed a critique) to project B clears the reducer before the
   new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
   parse effect, so transcriptUrl swaps (including swap-to-null after
   a replay reached `shipped`) lift the reducer back to idle before
   the new fetch starts.

SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
   check after the cheap `isPanelEvent` predicate. A
   `critique.ship` frame missing `composite` / `round` / `status` /
   `artifactRef` is rejected before reaching the reducer, so
   TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
   Every variant's required fields are validated: run_started
   (protocolVersion, non-empty cast, maxRounds, threshold, scale),
   panelist_* (round, role, plus variant-specific shape), round_end
   (round, composite, mustFix, decision in {continue,ship}, reason),
   ship (round, composite, status, artifactRef.{projectId,artifactId},
   summary), degraded (reason, adapter), interrupted (bestRound,
   composite), failed (cause), parser_warning (kind, position).

Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
   view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
   the in-progress lane the instant the tag opens. Before this, a
   stream that emitted only `panelist_open` after `run_started` left
   `rounds = []` and the UI rendered no current round until a later
   `panelist_dim` arrived.

Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
   `var(--purple, var(--accent))`. `--purple` is actually defined
   across the design systems; `--magenta` is not, so Brand was
   silently falling through to `--accent` and looking identical to
   Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
   interrupted-collapse copy ("Interrupted at round N, best
   composite X.X"). Previously the interrupted branch reused
   `shippedSummary` and the UI read "Shipped at round..." for a run
   that specifically did not ship. Native value in en + zh-CN; other
   locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
   hardcoded `theater-degraded-heading`, so two chips rendered on
   the same page (chat history with multiple completed runs) keep
   their aria-labelledby references unambiguous.

Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.

Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales

* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)

* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)

* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)

Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.

Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
  /api/projects/:id/critique/:runId/interrupt alongside the
  optimistic local dispatch. Before this fix, clicking Interrupt
  only flipped the React state to interrupted while the daemon job
  kept running. The fetch is best-effort: a 404 (endpoint not wired
  yet, lands in Phase 15) is swallowed with a dev-mode console.warn
  so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
  and simulate the "daemon not ready yet" path. Two tests pin both:
  the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
  rejected fetch still flips the UI.

interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
  one we saw; when it changes, interruptPending is cleared. A user
  who interrupts run-1 and then triggers run-2 from the same mount
  now gets a fresh, enabled kill button instead of one stuck in
  "Interrupting…". Pinned by a new mount test.

Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
  input, textarea, select, or contenteditable element is ignored
  (and any ancestor of those via closest() is treated the same
  way). Body-level focus still fires the keybind so the Theater
  area's affordance keeps working. Four new tests cover textarea,
  input, contenteditable, and the body-focus positive case.
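
A minimal sketch of the target check, written against a structural type so it does not depend on DOM typings. `shouldIgnoreEscape` and the selector string are assumptions for illustration; in particular, this selector treats any element carrying a `contenteditable` attribute as editable, which may be broader than the real check.

```typescript
// Anything exposing a DOM-style closest() method qualifies; real Elements do.
interface ClosestCapable { closest(selector: string): unknown; }

const EDITABLE_SELECTOR = 'input, textarea, select, [contenteditable]';

function shouldIgnoreEscape(target: unknown): boolean {
  if (typeof target !== 'object' || target === null) return false;
  const element = target as Partial<ClosestCapable>;
  if (typeof element.closest !== 'function') return false;
  // closest() matches the element itself or any ancestor, which covers
  // both "is an editable field" and "sits inside one".
  return element.closest(EDITABLE_SELECTOR) != null;
}
```

A keydown handler would call this with `event.target` and return early when it yields true, leaving body-level Escape presses to fire the interrupt.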

userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
  critiqueTheater.userFacingName key so the "Design Jury" label can
  be renamed without touching code. Phase 8 introduced
  critiqueTheater.title by mistake; renamed across types.ts, en.ts,
  zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
  TheaterStage.tsx. The locale alignment test stays green.

Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
  the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
  the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.

* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)

In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.

Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.

7/7 vitest cases green.
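
A registry of this shape could be sketched as follows. `createDegradedRegistry`, `markDegraded`, and `isDegraded` are hypothetical names; the reason tokens, the 24h default, `clearDegraded`, and the injectable clock come from the description above.

```typescript
type DegradedReason = 'malformed_block' | 'oversize_block' | 'missing_artifact';

interface DegradedRecord { reason: DegradedReason; source: string; expiresAt: number; }

const DEFAULT_TTL_MS = 24 * 60 * 60 * 1000;

// `now` is the clock seam: tests pass a fake clock, production uses Date.now.
function createDegradedRegistry(now: () => number = Date.now, ttlMs: number = DEFAULT_TTL_MS) {
  const records: { [adapterId: string]: DegradedRecord } = Object.create(null);
  return {
    markDegraded(adapterId: string, reason: DegradedReason, source: string): void {
      records[adapterId] = { reason, source, expiresAt: now() + ttlMs };
    },
    // Expiry is evaluated on read, so an entry stops counting without any cleanup call.
    isDegraded(adapterId: string): boolean {
      const record = records[adapterId];
      return record !== undefined && now() < record.expiresAt;
    },
    clearDegraded(adapterId: string): void {
      delete records[adapterId];
    },
  };
}
```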

* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)

Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.

* feat(daemon): adapter conformance harness (Phase 10.3)

runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.

5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.

* test(e2e): Critique Theater Playwright suite (Phase 11)

Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.

Coverage:

  1. Happy path: a single mounted theater plays the full
     fixture (1 run_started, 5 panelists open / dim / must_fix /
     close, 1 round_end, 1 ship) and ends on the score badge.
  2. Interrupt mid-run: the panelist that is open at the time
     the interrupt button is clicked closes with an interrupted
     marker and the transcript freezes there.
  3. Visual regression at 375x720 mobile.
  4. Visual regression at 768x1024 tablet.
  5. Visual regression at 1280x800 desktop.
  6. A11y role tree: the theater region exposes a labelled
     landmark, each panelist lane is a group with an accessible
     name, the score is a status live region.

All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.

* test(web): reducer p99 bench at 10k iterations (Phase 13.1)

Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.

The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
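
The gate arithmetic can be sketched as below. The nearest-rank percentile method and the names `percentile` and `withinGate` are assumptions; the 2ms budget and the 2x gate come from the text. Collecting the per-iteration timings is left out.

```typescript
const DOCUMENTED_BUDGET_MS = 2;
const GATE_MS = DOCUMENTED_BUDGET_MS * 2; // 2x headroom for noisy CI runners

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile requires at least one sample');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function withinGate(durationsMs: number[]): boolean {
  return percentile(durationsMs, 99) < GATE_MS;
}
```

Under this method a single slow outlier in 100 samples does not move p99, while a second one does, which matches the intent of tolerating noise but tripping on a real regression.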

* test(web): critique surface coverage walker (Phase 13.2)

Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.

62 assertions, one per symbol per corpus.

* docs: Critique Theater user guide (Phase 14.1)

Seven sections aimed at end users (not contributors):

  1. What is Design Jury
  2. How it works (the five panelists, auto-converging rounds,
     the composite formula)
  3. Settings (the M1 toggle and what it does)
  4. Reading the score badge
  5. Replay surface
  6. Troubleshooting (degraded, interrupted, failed)
  7. FAQ

The composite formula is documented as
    designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
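
As a worked illustration of the documented weights, a minimal sketch follows. The `Role` type, `WEIGHTS`, and `composite` are hypothetical names; the weights themselves are copied from the formula above, including the zero weight on designer.

```typescript
type Role = 'designer' | 'critic' | 'brand' | 'a11y' | 'copy';

// Weights as documented; they sum to 1.0 with designer contributing nothing.
const WEIGHTS: Record<Role, number> = {
  designer: 0,
  critic: 0.4,
  brand: 0.2,
  a11y: 0.2,
  copy: 0.2,
};

function composite(scores: Record<Role, number>): number {
  let total = 0;
  for (const role of Object.keys(WEIGHTS) as Role[]) {
    total += WEIGHTS[role] * scores[role];
  }
  return total;
}
```

Because the sum is floating-point, callers comparing against a threshold or formatting the badge would typically round first rather than compare for exact equality.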

* docs(daemon): critique module AGENTS map (Phase 14.2)

Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tabulates every file and the invariant it owns, plus the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.

* docs(web): Theater module AGENTS map (Phase 14.3)

Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.

* feat(daemon): rollout flag resolver (Phase 15.1)

Single decision point every caller consults to know whether the
orchestrator should wire the critique pipeline for a given run.
Priority:

  1. Skill-level policy (required forces the pipeline on, opt-out forces it off)
  2. Per-project override from the Settings toggle
  3. OD_CRITIQUE_ENABLED env override
  4. Rollout phase default
       M0 dark-launch      false
       M1 settings only    false (toggle is off until the user flips it)
       M2 per-skill        true if skill opted in
       M3 global default   true

OD_CRITIQUE_ROLLOUT_PHASE parser defaults to M0 on unknown input
so a fresh install never surprises a user with the feature on.

10/10 vitest cases green covering every cell of the matrix.
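
The priority order above can be sketched as a single function. This is an illustration under stated assumptions: `isCritiqueEnabledSketch`, `RolloutInputs`, and `parseRolloutPhaseSketch` are hypothetical names, overrides are modelled as `boolean | null`, and the phase parser here does an exact string match, which may differ from the real parser's tolerance.

```typescript
type RolloutPhase = 'M0' | 'M1' | 'M2' | 'M3';
type SkillPolicy = 'required' | 'opt-in' | 'opt-out' | null;

interface RolloutInputs {
  phase: RolloutPhase;
  skillPolicy: SkillPolicy;
  projectOverride: boolean | null;
  envOverride: boolean | null;
}

function isCritiqueEnabledSketch(inputs: RolloutInputs): boolean {
  // 1. Skill-level policy: required and opt-out are decisive.
  if (inputs.skillPolicy === 'required') return true;
  if (inputs.skillPolicy === 'opt-out') return false;
  // 2. Per-project override from the Settings toggle.
  if (inputs.projectOverride !== null) return inputs.projectOverride;
  // 3. Environment override.
  if (inputs.envOverride !== null) return inputs.envOverride;
  // 4. Rollout phase default.
  if (inputs.phase === 'M3') return true;
  if (inputs.phase === 'M2') return inputs.skillPolicy === 'opt-in';
  return false; // M0 and M1
}

// Unknown or missing input falls back to M0 so the feature stays off.
function parseRolloutPhaseSketch(raw: string | undefined): RolloutPhase {
  return raw === 'M1' || raw === 'M2' || raw === 'M3' ? raw : 'M0';
}
```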

* feat(web): Settings toggle hook for Critique Theater (Phase 15.2)

React hook that reads critiqueTheaterEnabled from the existing
open-design:config localStorage blob and stays in sync via:

  - the platform storage event (cross-tab)
  - an open-design:critique-theater-toggle CustomEvent (same-tab)

Same-tab event is the one that fires when the Settings panel saves
in the current window: the toggle and every mounted theater update
without a page reload.

setCritiqueTheaterEnabled(next) is the imperative setter the Settings
panel calls. It preserves the rest of the stored config (mode, apiKey,
etc.) and dispatches the same-tab event after the localStorage write.

The web hook reflects what the user toggled; the daemon-side
isCritiqueEnabled is the final routing authority (project override,
env, rollout phase). When they disagree, the daemon wins for backend
gating and the web reflects the toggle state.

6/6 vitest cases green covering first read, stored read, same-tab
event flip, config preservation, corrupted JSON tolerance, and
cross-tab storage event.

* test(web): Phase 15 toggle hook failure-mode coverage (PR #1320)

lefarcen P2 on PR #1320 flagged that the PR body claimed safe
behavior for disabled localStorage, non-object JSON, and missing
CustomEvent shim, but the suite only covered corrupt JSON plus
happy-path storage events. Added four failure-mode tests so the
swallowed errors are not silently traded for a throw in a future
refactor:

1. Returns false on a stored JSON value that parses to an array
   (non-object). Catches a regression where the guard treats
   anything truthy as a config blob.
2. Returns false on a stored JSON value of literal 'null'.
   typeof null === 'object' in JS, so the guard has to check null
   explicitly; this test pins that check.
3. Returns false when localStorage.getItem throws (private mode /
   disabled storage / SecurityError). The hook must swallow and
   return false so the rest of the app keeps rendering.
4. setCritiqueTheaterEnabled still dispatches the same-tab
   CustomEvent when localStorage.setItem throws (quota exceeded /
   disabled storage). The dispatch path is the in-session
   broadcast that keeps every mounted hook coherent even when
   persistence is unavailable; verified by mounting two probes
   and asserting both flip after the setter is called with a
   throwing setItem.

10/10 vitest cases green (6 existing + 4 new).
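
The read-side guards these tests pin can be sketched as below. `readToggleFromRaw` and `readToggleSketch` are hypothetical names, and the storage accessor is passed in as a function so the sketch does not depend on a browser; the `open-design:config` key and the `critiqueTheaterEnabled` field come from the commit text.

```typescript
const CONFIG_KEY = 'open-design:config';

function readToggleFromRaw(raw: string | null): boolean {
  if (raw === null) return false;
  try {
    const parsed: unknown = JSON.parse(raw);
    // typeof null === 'object' and arrays are objects too, so both need explicit checks.
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) return false;
    return (parsed as { critiqueTheaterEnabled?: unknown }).critiqueTheaterEnabled === true;
  } catch {
    return false; // corrupted JSON
  }
}

// `getItem` stands in for localStorage.getItem, which can throw when storage is disabled.
function readToggleSketch(getItem: (key: string) => string | null): boolean {
  try {
    return readToggleFromRaw(getItem(CONFIG_KEY));
  } catch {
    return false;
  }
}
```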

* fix(web): honor CustomEvent payload in toggle hook listener (PR #1320)

Both Siri-Ray (blocking) and lefarcen (P2 new) caught the same
real bug in the failure-mode test I added in affcdd27: the test
asserts the in-session UI flips when localStorage.setItem throws,
but the CustomEvent listener was ignoring the event's typed
detail and just calling readToggle(). Under a throwing setItem
the localStorage value is stale (or absent), so the listener
would see the OLD value and the test would fail (or worse, the
production claim 'in-session event keeps mounts coherent' was
hollow).

Fixed the hook, not the test: the listener now reads
event.detail.enabled when it is a boolean, falling back to
readToggle() only for malformed events or for cross-tab storage
events (which do not carry a typed payload). The setter already
dispatched the detail; the listener just was not consuming it.

Test changes:

  - The existing 'setItem throws' test now asserts the right
    behavior for the right reason. Updated the inline comment to
    say the listener reads from detail, not localStorage.
  - New test 'falls back to readToggle when the CustomEvent
    carries no usable detail' pins the fallback path: a
    malformed dispatcher (no detail, or detail.enabled not a
    boolean) degrades cleanly instead of throwing or being
    silently ignored.

11 / 11 vitest cases green (10 prior + 1 new fallback).

* feat(daemon): route critique spawn-path eligibility through the rollout resolver

The wireup edit Phase 10 and Phase 15 carved out: today server.ts gates
the critique pipeline on critiqueCfg.enabled, which is just the
OD_CRITIQUE_ENABLED env var. After this commit it gates on
isCritiqueEnabled(...) from the Phase 15 resolver, so the full
priority matrix is live:

  1. Per-skill od.critique.policy veto (opt-out / required)
  2. Per-project override (M1 Settings toggle, written through the
     existing Phase 6 settings endpoint)
  3. OD_CRITIQUE_ENABLED env override (power-user lane / CI fixtures)
  4. OD_CRITIQUE_ROLLOUT_PHASE default
       M0 dark-launch      false
       M1 settings only    false
       M2 per-skill        only when skillPolicy === 'opt-in'
       M3 global default   true

Default behaviour on a fresh install is unchanged: the resolver
returns false at M0 without an env override or a project override,
so prod traffic falls through to the legacy single-pass path
exactly the way it did before.

Inputs threaded today: phase from OD_CRITIQUE_ROLLOUT_PHASE,
envOverride from OD_CRITIQUE_ENABLED. skillPolicy and projectOverride
are passed as null for the v1 cutover; the daemon-side handler that
round-trips critiqueTheaterEnabled on the project settings row and
the od.critique.policy frontmatter resolver land as the next two
commits in this branch.

The three call sites that used critiqueCfg.enabled (the brand-thread
guard, the skill-thread guard, the top-line critiqueShouldRun
compound) now read from a single locally-scoped critiqueEnabledForRun
boolean, so the eligibility check is computed exactly once per spawn
and the prompt composer + orchestrator stay in lockstep the way
the existing comment already promised.

Tests still green: daemon vitest 22 / 22 across rollout +
conformance + adapter-degraded. Daemon typecheck clean.

* feat(web): mount CritiqueTheaterMount in ProjectView

The web counterpart of the daemon wireup. ProjectView now renders
<CritiqueTheaterMount projectId={project.id} enabled={...} /> as a
sibling of <AppChromeHeader> inside the top-level <div className="app">.

The mount is the drop-in from the Phase 9 stack: it owns the SSE
subscription, the kill-request handshake, and the phase-aware swap
from the live <TheaterStage> to the collapsed badge once a run
settles. The mount returns null until the daemon emits a
critique.run_started for the active project, so the visual surface
is byte-for-byte unchanged for users who have not opted in.

Enabled wiring: useCritiqueTheaterEnabled() reads the M1 Settings
toggle from the existing open-design:config localStorage blob and
stays in sync with both the platform storage event (cross-tab) and
the same-tab open-design:critique-theater-toggle CustomEvent the
Phase 15 setter dispatches. The hook honors the event payload
directly so a private-mode browser that cannot persist the toggle
still updates the in-session UI correctly.

The daemon-side gate (isCritiqueEnabled in apps/daemon/src/server.ts)
remains the authority for whether a run is actually wired through
the critique pipeline. This hook only governs whether the web layer
renders the resulting SSE stream when the daemon emits one. The
two-layer gate is intentional: an integrator embedding the Theater
in a custom UI can flip the web visibility independent of the
daemon's routing decision, and a daemon-side env override flips
backend gating without touching the web's localStorage.

Tests still green: web Theater suite 181 / 181 across 16 files.
Web typecheck clean.

* feat(daemon): resolve od.critique.policy frontmatter at the spawn site

The next step in the wireup branch's ladder: replace the placeholder
`skillPolicy: null` with the actual value parsed from the active
skill's SKILL.md frontmatter.

Three small edits, one new field on a public type:

1. SkillInfo gains a `critiquePolicy: SkillCritiquePolicy` field
   carrying the parsed `od.critique.policy` token (required /
   opt-in / opt-out / null). The field is null when the skill has
   no opinion, which lets the lower-priority resolver tiers
   (projectOverride, envOverride, phase default) decide.

2. listSkills() populates the new field via a small
   `normalizeCritiquePolicy` helper that tolerates the YAML
   scalar's casing and trims whitespace. Unknown tokens collapse
   to null so a typo in SKILL.md cannot accidentally force the
   panel on or off; it just falls through. Derived example cards
   inherit the parent's policy.

3. server.ts captures `skill.critiquePolicy` into a hoisted
   `skillCritiquePolicy` variable inside the existing skill-load
   block, then threads it into the isCritiqueEnabled call as the
   skillPolicy input. The hoisting keeps the variable in scope at
   the resolver call site without restructuring the spawn handler.
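
A minimal sketch of the normalization described in item 2, assuming
the frontmatter value arrives as an untyped YAML scalar. The body is
illustrative only; the helper that ships in the daemon may differ in
detail.

```typescript
// Illustrative sketch, not the daemon's actual implementation.
type SkillCritiquePolicy = "required" | "opt-in" | "opt-out" | null;

const KNOWN_POLICIES: readonly string[] = ["required", "opt-in", "opt-out"];

function normalizeCritiquePolicy(raw: unknown): SkillCritiquePolicy {
  // Non-string scalars (numbers, booleans, missing keys) carry no opinion.
  if (typeof raw !== "string") return null;
  // Tolerate casing and surrounding whitespace in the YAML scalar.
  const token = raw.trim().toLowerCase();
  // Unknown tokens collapse to null so a typo cannot force the panel on or off.
  return KNOWN_POLICIES.indexOf(token) !== -1
    ? (token as SkillCritiquePolicy)
    : null;
}
```

With this shape, a misspelled token such as `requried` yields null and
the lower-priority tiers decide.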

After this commit, the priority matrix the rollout resolver was
designed for is live for its top tier. The previous commit wired
env + phase; this one wires skill. The projectOverride input
remains null pending the next commit that extends the Phase 6
settings endpoint.

Daemon vitest: 10 / 10 rollout cases pass against the new wiring.
Daemon typecheck: clean.

* feat(daemon): feed projectOverride into the rollout resolver from project metadata

Replaces the placeholder `projectOverride: null` in the spawn
handler with the actual value the Settings panel writes onto the
project's metadata blob: `critiqueTheaterEnabled?: boolean`.

The read is defensive at the boundary: the metadata object is
typed loosely (it round-trips through SQLite as a free-form JSON
blob), so the spawn handler narrows to `boolean` and falls
through to `null` for any other shape. A missing key, a malformed
value, or a project that has never visited Settings collapses to
`null`, which is exactly the resolver's "no opinion, fall
through to env / phase" signal.
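
The boundary narrowing can be sketched roughly as follows. The helper
name `readProjectOverride` is hypothetical; only the
`critiqueTheaterEnabled` key and the boolean-or-null contract come
from this commit.

```typescript
// Hypothetical helper name; the spawn handler may inline this check.
function readProjectOverride(metadata: unknown): boolean | null {
  // The metadata blob round-trips through SQLite as free-form JSON,
  // so nothing about its shape is trusted here.
  if (typeof metadata !== "object" || metadata === null) return null;
  const value = (metadata as Record<string, unknown>)["critiqueTheaterEnabled"];
  // Only a real boolean counts as an opinion; anything else falls through.
  return typeof value === "boolean" ? value : null;
}
```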

The `critique` frontmatter slot also gets typed on the
SkillFrontmatter shape so the `od.critique.policy` chain the
previous commit introduced no longer needs a bracket-access
cast. Same pattern as the existing `craft`, `preview`, and
`design_system` nested-record slots.

After this commit, every tier of the rollout resolver's priority
matrix is wired:

  1. skillPolicy   (from SKILL.md od.critique.policy)
  2. projectOverride (from project metadata critiqueTheaterEnabled)
  3. envOverride   (from OD_CRITIQUE_ENABLED)
  4. rollout phase (from OD_CRITIQUE_ROLLOUT_PHASE)
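
As a rough illustration of the priority order, a first-opinion-wins
resolver could look like the sketch below. The function name, the
input shape, and in particular the mapping of policy tokens to
decisions ("required" forces on, "opt-out" forces off, "opt-in"
defers to lower tiers) are assumptions made for the example, not a
description of isCritiqueEnabled.

```typescript
// Illustrative only: token semantics and names are assumptions.
type SkillPolicy = "required" | "opt-in" | "opt-out" | null;

interface RolloutInputs {
  skillPolicy: SkillPolicy;
  projectOverride: boolean | null;
  envOverride: boolean | null;
  phaseDefault: boolean; // what the rollout phase decides when nobody else does
}

function resolveCritiqueEnabledSketch(inputs: RolloutInputs): boolean {
  // Tier 1: skill policy (assumed semantics, see lead-in).
  if (inputs.skillPolicy === "required") return true;
  if (inputs.skillPolicy === "opt-out") return false;
  // Tier 2: project override from project metadata.
  if (inputs.projectOverride !== null) return inputs.projectOverride;
  // Tier 3: environment override.
  if (inputs.envOverride !== null) return inputs.envOverride;
  // Tier 4: rollout phase default.
  return inputs.phaseDefault;
}
```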

The write path for projectOverride still flows through the
existing project-update handler the Settings panel already uses
to persist project metadata; no new endpoint is needed. The
Settings UI button that calls setCritiqueTheaterEnabled and
posts the new field is the next commit on this branch.

Daemon typecheck: clean. Daemon vitest: 10 / 10 rollout cases
still green against the new wiring.

* fix(daemon): forward critique events to project sinks + align composer gate (PR #1338)

Two codex review items addressed in one commit since they share the
same root cause (resolver-enabled run hits a transport / prompt
contract that was still env-gated):

P1 (transport mismatch). The daemon emits critique.* SSE frames
through critiqueBus -> design.runs.emit, which fans out on
/api/runs/:runId/events. The web CritiqueTheaterMount subscribes to
/api/projects/:projectId/events (it's project-scoped, not run-
scoped, because the mount lives at the project workspace and
follows the user across runs). Result: in production the mount
never sees a real frame and the e2e tests' stubbed routes hide the
mismatch.

Fixed by extending critiqueBus.emit to fan out to BOTH sinks: the
existing runs.emit transport, AND the per-project event-sinks map.
The project-events route emits via sse.send(payload.type, payload),
so we pack the SSE channel name onto payload.type and let the sink
push the right channel. The web sseToPanelEvent overwrites type
from the channel name on the way back into a PanelEvent, so the
round-trip stays correct.
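
A simplified sketch of the dual fan-out, assuming each project keeps a
set of sink callbacks keyed by project id. The types and the
`makeCritiqueEmitter` factory are hypothetical stand-ins; the real
critiqueBus wiring is not reproduced here.

```typescript
// Hypothetical types; only the "emit to both transports" idea is from the commit.
interface FrameBody { runId: string; projectId: string; data: unknown }
interface CritiquePayload extends FrameBody { type: string }
type ProjectSink = (payload: CritiquePayload) => void;

function makeCritiqueEmitter(
  emitToRun: (runId: string, channel: string, payload: CritiquePayload) => void,
  projectSinks: Map<string, Set<ProjectSink>>,
) {
  return function emit(channel: string, frame: FrameBody): void {
    // Pack the SSE channel name onto payload.type so a project sink
    // that sends (payload.type, payload) lands on the right channel.
    const payload: CritiquePayload = { ...frame, type: channel };
    // Sink 1: the existing run-scoped transport.
    emitToRun(payload.runId, channel, payload);
    // Sink 2: every listener registered for the owning project, if any.
    const sinks = projectSinks.get(payload.projectId);
    if (sinks) sinks.forEach((sink) => sink(payload));
  };
}
```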

P2 (prompt gate misalignment). composeSystemPrompt reads
cfg.enabled to decide whether to append the panel addendum, but
critiqueCfg.enabled is loaded from OD_CRITIQUE_ENABLED only. A run
the resolver enabled via phase / project / skill (env unset) would
have critiqueShouldRun = true while critiqueCfg.enabled remained
false. The composer dropped the panel prompt while the run still
routed through runOrchestrator, so the parser waited for tags that
never arrived and the run degraded.

Fixed by passing a derived config { ...critiqueCfg, enabled: true }
to the composer when critiqueShouldRun is true. The composer's own
gate now agrees with the resolver decision on every input the
spec defines.
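
The derivation is small enough to show in full. In this sketch the
`maxRounds` field and the helper name are hypothetical and exist only
to show that the spread preserves the rest of the config.

```typescript
// Hypothetical config shape; only `enabled` is taken from the commit text.
interface CritiqueConfig { enabled: boolean; maxRounds: number }

function configForComposer(cfg: CritiqueConfig, critiqueShouldRun: boolean): CritiqueConfig {
  // Copy rather than mutate: the env-loaded config stays untouched,
  // while the composer sees a config that agrees with the resolver.
  return critiqueShouldRun ? { ...cfg, enabled: true } : cfg;
}
```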

Daemon typecheck: clean. Daemon vitest: 10 / 10 rollout cases
still green against the new wiring.

* fix: address PerishCode P1 + P2 follow-ups on PR #1338

Two follow-up items PerishCode flagged on the activation PR.
Non-blocking but both are real:

1. Phase 11 e2e suite was wired into test:ui:extended but lands
   the user on '/' (home route) where ProjectView (and therefore
   CritiqueTheaterMount) is never rendered. With the suite as
   written, every assertion would time out the first time the
   lane runs in CI, contradicting the PR body's claim that the
   suite stays parked behind test.describe.fixme.

   The state diverged from my earlier Phase 11 work because the
   merge from main on commit 4ab719c6 brought in #1307's
   squash-merged version of the e2e file (the pre-fixme shape).

   Re-applied test.describe.fixme to the describe block plus
   removed ui/critique-theater.test.ts from the test:ui:extended
   script in e2e/package.json. Added a file-header docblock
   explaining what the follow-up commit needs to do: replace
   goto('/') with /projects/:id navigation similar to
   app-design-files.test.ts, split the SSE fixture into a live
   prefix and terminal suffix (Codex P2 on PR #1320), and commit
   the first PNG baselines.

2. bestRoundOf in CritiqueTheaterMount returned the LAST round
   with a numeric composite, not the round with the HIGHEST
   composite, while bestCompositeOf correctly returned the max.
   A run that closed round 1 at 8.5 and round 2 at 6.0 would
   dispatch interrupted { bestRound: 2, composite: 8.5 } on a
   user-clicked interrupt.

   Folded the two helpers into a single bestRoundAndComposite
   that walks state.rounds once and returns the matching pair so
   the two values cannot drift. The onInterrupt callback now
   destructures from one helper instead of two independent reads.
   Falls back to (state.activeRound, 0) when no round has closed
   with a composite yet.
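
A sketch of the single-pass helper from item 2, assuming state.rounds
is an array of records with a round number `n` and an optional numeric
`composite`. That state shape is an assumption for illustration; the
reducer's real types may differ.

```typescript
// Assumed state shape; field names are illustrative.
interface RoundState { n: number; composite?: number }
interface TheaterState { activeRound: number; rounds: RoundState[] }

function bestRoundAndComposite(state: TheaterState): { bestRound: number; composite: number } {
  let best: { bestRound: number; composite: number } | null = null;
  for (const round of state.rounds) {
    // Rounds that have not closed with a composite are skipped.
    if (typeof round.composite !== "number") continue;
    // Round number and composite are captured together, so they cannot drift.
    if (best === null || round.composite > best.composite) {
      best = { bestRound: round.n, composite: round.composite };
    }
  }
  // Fallback when no round has closed with a composite yet.
  return best !== null ? best : { bestRound: state.activeRound, composite: 0 };
}
```

For the example above (round 1 at 8.5, round 2 at 6.0) this returns
round 1 paired with 8.5 rather than round 2 paired with 8.5.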

Web typecheck: clean. CritiqueTheaterMount.test.tsx: 7 / 7 cases
still green against the new helper.

* fix: wire M1 project override end-to-end + correct deferred-surface doc claims (PR #1338)

Three lefarcen P2s on the latest review pass, all real:

1. M1 project override was half-wired: the daemon read
   metadata.critiqueTheaterEnabled but the web setter only
   wrote localStorage. A user opt-in would render the Theater
   on the web (localStorage was set) while the daemon resolved
   projectOverride=null and skipped critique unless env / phase
   already permitted. Two halves talking past each other.

   Extended setCritiqueTheaterEnabled to accept an optional
   { projectId, fetchProjectSettings } options bag. When a
   projectId is supplied, the setter ALSO sends a
   PATCH /api/projects/:id with { metadata: { critiqueTheaterEnabled
   } } so the daemon's spawn-time resolver picks the same value up
   on the next generation. The existing project-routes endpoint
   already accepts arbitrary metadata patches, so no new endpoint
   is needed. The local write + the CustomEvent dispatch still
   fire before the PATCH, so a network failure does not unwind
   the in-session UI flip. Three new vitest cases pin the new
   path: PATCHes when projectId is provided, skips when it is
   not, swallows a rejected PATCH so the in-session UI still
   flips.

2. Rollout docs (docs/critique-theater.md section 3) claimed the
   Settings toggle persists into the daemon settings store, but
   the previous implementation only had a localStorage reader /
   writer plus a daemon read of project metadata, with no
   round-trip. Rewrote the section to lead with the four-tier
   resolver (skill policy / project override / env / phase),
   document that the setter now round-trips via the existing
   PATCH endpoint when given a projectId, and call out the
   Settings panel UI control as a deliberate follow-up.

3. Troubleshooting table pointed users at /api/metrics/critique
   (Phase 12, deferred) and 'od adapters clear-degraded <id>'
   (CLI wrapper that does not exist). Replaced the metrics
   reference with the local conformance harness command
   (pnpm --filter @open-design/daemon vitest run
   tests/critique-conformance.test.ts) that ships today, with a
   note that the Phase 12 dashboard surfaces this status as a
   series once that PR lands. Replaced the CLI command with the
   programmatic clearDegraded() helper that exists today and
   flagged the CLI wrapper as planned follow-up.

Web typecheck: clean. Toggle hook tests: 14 / 14 green (11
existing + 3 new for the round-trip path).

* test(web): multi-round interrupt regression for bestRoundAndComposite (PR #1338)

lefarcen P3 follow-up to the previous bestRoundAndComposite fix:
the existing CritiqueTheaterMount.test.tsx interrupt cases only
exercised a single-round state, so a future refactor back to two
independent helpers wouldn't be caught by the test suite even
though it'd reintroduce the round / composite drift bug.

Added a regression case that:

  1. Drives the reducer through two complete rounds with the
     full 5-role cast closing at distinct composites: round 1
     at 8.5, round 2 at 6.0 (the high-composite round is NOT the
     most recent one).
  2. Clicks Interrupt + waits for the daemon ack via the test
     seam fetcher returning 204.
  3. Asserts the collapsed badge displays "round 1" (the
     correct best-composite round), and queryByText for
     "round 2 ... 8.5" returns null (the buggy pairing
     would have produced that string).

The bestRoundAndComposite helper walks state.rounds in one pass
and returns the matching pair, so the round number and the
composite cannot drift apart. This test locks the fix in: a
refactor that splits the helpers back into independent walks
will be caught here.

8 / 8 vitest cases green on the file.

* fix(web): read-merge-write the project metadata in setCritiqueTheaterEnabled (PerishCode P2 on PR #1338)

The previous round-trip sent { metadata: { critiqueTheaterEnabled: next } }
as the entire PATCH body. The daemon's project-routes handler only
re-stamps three immutable fields (baseDir, importedFrom,
fromTrustedPicker) before calling updateProject(db, id, patch),
which then does a shallow { ...existing, ...patch } in apps/daemon/
src/db.ts. So patch.metadata replaces the row's metadata wholesale,
dropping kind, templateId, linkedDirs, and every other field the rest
of the app reads.

No in-tree caller passes projectId today (only vitest cases), so the
bug had not surfaced yet. But the surface is documented in
docs/critique-theater.md section 3 and the function's own JSDoc as
the M1 round-trip path, so it would have shipped as a latent footgun
for the next integrator: a Settings UI follow-up, or any third party
that wires the setter into a project-aware surface.

Fix: read-merge-write rather than a bare patch.

- GET /api/projects/:id to read the row's current metadata.
- Spread that metadata into the PATCH body and overlay
  critiqueTheaterEnabled: next on top, mirroring the partial-metadata
  pattern already used in ChatComposer.tsx for linkedDirs.
- PATCH the merged object.

Failure handling:
- GET fails: skip the PATCH entirely. We cannot construct a safe
  merged body without the current state, and a bare patch would
  wipe other metadata. The in-session CustomEvent fired earlier in
  the setter still keeps every mounted hook consistent; the next
  save retries the round-trip.
- PATCH fails: log in dev. The in-session UI is already correct via
  the CustomEvent.
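
The flow and its two failure modes can be sketched as below. The
injected `FetchLike` type, the function name, the string return value,
and the assumption that the GET response exposes a top-level
`metadata` object are all illustrative; the real setter also performs
the localStorage write and CustomEvent dispatch before this step.

```typescript
// Minimal fetch-like seam so the sketch runs without a network.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

type PersistResult = "patched" | "skipped" | "patch_failed";

async function persistCritiqueToggle(
  projectId: string,
  next: boolean,
  fetchImpl: FetchLike,
): Promise<PersistResult> {
  let current: Record<string, unknown> = {};
  try {
    // Step 1: read the row's current metadata.
    const res = await fetchImpl(`/api/projects/${projectId}`);
    if (!res.ok) return "skipped";
    const body = (await res.json()) as { metadata?: unknown } | null;
    const metadata = body ? body.metadata : undefined; // assumed response shape
    if (typeof metadata === "object" && metadata !== null) {
      current = metadata as Record<string, unknown>;
    }
  } catch {
    // GET failed: a bare patch would wipe other metadata, so do nothing.
    return "skipped";
  }
  try {
    // Steps 2 and 3: merge the toggle on top, then PATCH the merged object.
    const res = await fetchImpl(`/api/projects/${projectId}`, {
      method: "PATCH",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ metadata: { ...current, critiqueTheaterEnabled: next } }),
    });
    return res.ok ? "patched" : "patch_failed";
  } catch {
    // PATCH failed: swallowed, the in-session UI is already correct.
    return "patch_failed";
  }
}
```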

Tests (TDD, red-first):

- 'GETs the project then PATCHes with merged metadata when a
  projectId is supplied': stubs a GET that returns
  { kind: 'template', templateId: 'modern-blog', linkedDirs: [...] }
  and asserts the PATCH body equals the merge plus the toggle.
- 'PATCHes with just the toggle when the project has no prior
  metadata': stubs a GET that returns no metadata block.
- 'skips the PATCH (does not stomp metadata) when the prefetch GET
  fails': stubs a rejecting GET and asserts only the GET fires.
- 'swallows a rejected PATCH after a successful prefetch': stubs a
  successful GET and a rejecting PATCH; asserts the in-session UI
  still flips via the CustomEvent.

Doc updated on the setter's JSDoc to describe the new three-step
flow (localStorage, CustomEvent, read-merge-write PATCH) and the
two failure modes.

Verified:
- pnpm --filter @open-design/web typecheck clean.
- pnpm --filter @open-design/web test: 111 files / 1055 tests green
  (was 1052, +3 from the new merge-flow cases).

* fix(web): restore wait-for-daemon-ack pattern on Theater interrupt

Same regression as flagged on PR #1316 post-main-merge: the
optimistic local dispatch fired before the POST resolved, so a
daemon 404 / 409 still terminalized the UI and the real SSE
terminal event got ignored by the sticky interrupted phase.

Snapshot runId / bestRound / composite at click time, dispatch
interrupted only on res.ok, clear interruptPending on rejection or
non-2xx so the user can retry. Tests cover rejection + 404 leaving
the run on the live stage; the 204 path waits for the ack.
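
In outline, the restored pattern looks like the sketch below. The
`InterruptUi` interface and the injected `post` function are
hypothetical seams used to keep the example self-contained; the real
component works against its reducer and fetch.

```typescript
// Hypothetical seams standing in for the component's reducer and fetch.
interface InterruptSnapshot { runId: string; bestRound: number; composite: number }
interface InterruptUi {
  setPending(pending: boolean): void;
  dispatchInterrupted(snapshot: InterruptSnapshot): void;
}

async function requestInterrupt(
  snapshot: InterruptSnapshot, // captured at click time, before any await
  post: (runId: string) => Promise<{ ok: boolean }>,
  ui: InterruptUi,
): Promise<boolean> {
  ui.setPending(true);
  try {
    const res = await post(snapshot.runId);
    if (res.ok) {
      // Only a 2xx ack terminalizes the UI.
      ui.dispatchInterrupted(snapshot);
      return true;
    }
  } catch {
    // Network rejection: handled the same way as a non-2xx response.
  }
  // 404 / 409 / rejection: stay on the live stage and allow a retry.
  ui.setPending(false);
  return false;
}
```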

* feat(daemon): Critique Theater Phase 12 observability foundations

Lands the metrics registry, the structured logger, the /api/metrics
route, and the adapter-degraded bump that wires up the first data
point. The orchestrator-side bumps for runs / rounds / composite /
must-fix / interrupted / parser_errors / protocol_version land in a
follow-up commit on this branch (kept separate so the wiring diff
reads cleanly against the registry shape).

Surfaces added:

- apps/daemon/src/metrics/index.ts: 9 Prometheus series under the
  open_design_critique_* namespace with the histogram buckets the
  spec calls out (round_duration_ms at 100 / 250 / 500 / 1000 /
  2500 / 5000 / 10000 / 30000 / 60000 ms; composite_score at
  0-10 integer steps).
- apps/daemon/src/logging/critique.ts: 6 typed events, one JSON line
  per call on stdout, namespaced critique. Matches the JSON-per-line
  convention cli.ts already uses; no new logger framework.
- apps/daemon/src/server.ts: GET /api/metrics route. Honors
  OD_METRICS_ENDPOINT=disabled to opt out for air-gapped installs.
- apps/daemon/src/critique/adapter-degraded.ts: markDegraded now
  bumps degraded_total so the adapter-health dashboard panel
  reflects every TTL refresh and every fresh mark.
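
For reference, the bucket boundaries listed above can be written out
as plain arrays. The sketch below does not use prom-client; the
constant names and the `upperBoundFor` helper are hypothetical and
only illustrate which `le` bucket is the first to count a given
observation.

```typescript
// Bucket upper bounds as listed in this commit; names are illustrative.
const ROUND_DURATION_BUCKETS_MS: readonly number[] = [
  100, 250, 500, 1000, 2500, 5000, 10000, 30000, 60000,
];
// 0-10 in integer steps: [0, 1, 2, ..., 10].
const COMPOSITE_SCORE_BUCKETS: readonly number[] = Array.from({ length: 11 }, (_, i) => i);

// Returns the smallest bucket bound that contains the value. Prometheus
// histograms are cumulative, so every larger bucket counts it as well.
function upperBoundFor(value: number, buckets: readonly number[]): number {
  for (const le of buckets) {
    if (value <= le) return le;
  }
  return Infinity; // the implicit +Inf bucket
}
```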

Deps: prom-client ^15.1.0, @opentelemetry/api ^1.9.0 added to
apps/daemon/package.json. Both are zero-config no-ops without an
exporter wired; daemon bundle size impact is ~150 KB uncompressed.
The @opentelemetry/api dep is in place ahead of the OTel-spans
follow-up commit; it adds no behavior on this commit.

Tests:
- tests/metrics/critique.test.ts (3 cases): registry shape +
  exposition text + reset-between-tests
- tests/logging/critique.test.ts (4 cases): event shape + ordering
  + newline framing + namespace stamping

Verification (Windows-local):
- pnpm --filter @open-design/daemon typecheck: clean
- New metrics + logging suites: 7 / 7 green
- Existing adapter-degraded + conformance + rollout suites:
  22 / 22 green; the bump is non-breaking

* feat(daemon): wire Critique Theater metrics + structured logs from the orchestrator

Lights up the bump sites the Phase 12 foundations PR registered the
series for. Every panel event the parser surfaces now reaches the
matching Prometheus counter / histogram and the matching JSON log
line on stdout.

Switch-loop bumps + logs:

- run_started: log run_started, set protocol_version gauge to the
  observed protocol version (small-integer cardinality).
- panelist_open: record the first-open wall-clock per round so
  round_end can compute round_duration_ms; subsequent opens in the
  same round leave the start time untouched.
- panelist_must_fix: bump must_fix_total with the panelist role.
  The wire event does not yet carry a dim name, so the label is
  'unspecified' for now; a future parser revision can drop in the
  real dim without a metric rename.
- round_end: bump rounds_total, observe composite_score, observe
  round_duration_ms (current ms minus the tracked start), log
  round_closed with the composite / mustFix / decision triple.
- parser_warning (parser-yielded): bump parser_errors_total with
  the kind label, log parser_recover with kind + position.
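
The first-open timing rule for panelist_open / round_end can be
sketched as a small tracker. The `RoundTimer` class and its method
names are hypothetical, and the clock is passed in explicitly so the
example is deterministic.

```typescript
// Hypothetical tracker; the orchestrator may keep this state inline.
class RoundTimer {
  private readonly startedAt = new Map<number, number>();

  // Called on panelist_open: only the first open in a round sets the start.
  open(round: number, nowMs: number): void {
    if (!this.startedAt.has(round)) this.startedAt.set(round, nowMs);
  }

  // Called on round_end: returns round_duration_ms, or null when the
  // round never saw a panelist_open (nothing sensible to observe).
  close(round: number, nowMs: number): number | null {
    const start = this.startedAt.get(round);
    if (start === undefined) return null;
    this.startedAt.delete(round);
    return nowMs - start;
  }
}
```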

Orchestrator-side parser warnings (composite_mismatch and
duplicate_ship from the daemon-authoritative scoring checks) go
through a new emitParserWarning helper so the bus emit, the
collectedEvents push, the metric bump, and the log line stay in
lockstep. Three inline emission sites collapse to one-line helper
calls.

After the try/catch, a single terminal-status switch bumps
runs_total{status, adapter, skill} once per run, with branch-
specific log + counter:

- shipped / below_threshold: log run_shipped
- interrupted: bump interrupted_total, log run_failed{cause: interrupted}
- timed_out: log run_failed{cause: timed_out}
- failed: log run_failed{cause: orchestrator_internal}
- degraded: log degraded{reason: orchestrator_classified}
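
The branch table above maps naturally onto a switch. The sketch below
returns a description of what each branch logs and bumps instead of
calling a real logger or registry; the `TerminalAction` shape and the
function name are hypothetical.

```typescript
type TerminalStatus =
  | "shipped" | "below_threshold" | "interrupted"
  | "timed_out" | "failed" | "degraded";

// Hypothetical description of the per-branch side effects.
interface TerminalAction {
  logEvent: "run_shipped" | "run_failed" | "degraded";
  detail?: string;          // cause or reason attached to the log line
  bumpInterrupted: boolean; // whether interrupted_total is bumped
}

function terminalActionFor(status: TerminalStatus): TerminalAction {
  switch (status) {
    case "shipped":
    case "below_threshold":
      return { logEvent: "run_shipped", bumpInterrupted: false };
    case "interrupted":
      return { logEvent: "run_failed", detail: "interrupted", bumpInterrupted: true };
    case "timed_out":
      return { logEvent: "run_failed", detail: "timed_out", bumpInterrupted: false };
    case "failed":
      return { logEvent: "run_failed", detail: "orchestrator_internal", bumpInterrupted: false };
    case "degraded":
      return { logEvent: "degraded", detail: "orchestrator_classified", bumpInterrupted: false };
  }
}
```

runs_total is bumped once for every status regardless of branch, so it
is left out of the per-branch description.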

OrchestratorParams gains optional skill: string for the label;
defaults to 'unknown' so spawn sites that have not yet threaded it
keep working without a metric shape change.

Tests:
- The new metrics + logging suites (7 / 7) verify registry shape
  and event framing; orchestrator-side metric integration is
  exercised through the existing critique-conformance and
  critique-adapter-degraded suites (22 / 22 still green).
- Logger test reassigns process.stdout.write directly instead of
  vi.spyOn so the Node overloaded write signature does not
  collide with MockInstance<unknown>.

* feat(observability): Grafana dashboard JSON for Critique Theater

Three default rows mapping to the metrics this branch wires up:

1. Fleet quality: composite score p50 / p90 / p99 line graph by
   adapter, plus a heatmap of the composite distribution. The
   line graph answers 'are my agents getting better over time';
   the heatmap answers 'are the bad runs clustered around one
   adapter or smeared across the fleet'.

2. Adapter health: stacked bar charts for degraded marks (by
   adapter / reason) and parser errors (by adapter / kind) over
   a 5-minute window. The two queries together let an operator
   see 'is this adapter degraded because of malformed wire output
   or because of oversize blocks' without flipping panels.

3. Brief throughput: runs-per-hour by terminal status, an average
   rounds-per-run stat per adapter, and a round-duration ms p50 /
   p90 / p99 line. Throughput numbers fall straight out of the
   runs_total / rounds_total counters; the duration histogram is
   the same one the runs feed.

The dashboard uses a templated $datasource var (defaults to
'prometheus') so an operator with multiple Prometheus instances
can switch without editing JSON. Schema version 39 (Grafana 11).

Operators import via:

  pnpm dlx @grafana/cli dashboard import tools/dev/dashboards/critique.json

or paste into a provisioned dashboards directory. The file is
checked into the repo as a starting artifact; alert rules and
SLO panels ship after the first 1000 runs inform the right
thresholds. JSON validates with node -e 'JSON.parse(...)' (sanity
checked locally).

* feat(daemon): OpenTelemetry outer span around the critique run

Wraps each runOrchestrator call in a 'critique.run' span via the
existing @opentelemetry/api dep added in the Phase 12 foundations
commit. Attributes set on the span:

- critique.run_id, critique.adapter, critique.skill at start
- critique.final_status, critique.final_composite on terminal
  resolution
- span status flipped to ERROR for failed / timed_out runs so a
  Tempo / Honeycomb / Jaeger filter on traces.status=error
  surfaces the right slice without joining back to Prometheus

No exporter is wired by default; @opentelemetry/api is the API
package and intentionally splits from @opentelemetry/sdk-*, so
the span is zero-overhead until an operator attaches an SDK
through their runtime config.

Inner per-round / parse_chunk / scoreboard_eval / persist_round /
ship.persist spans defined in the Phase 12 plan are a follow-up:
the outer span alone gives the trace a duration + final status +
adapter/skill labels, which is the 80% value for dashboards that
correlate runs across services. Adding child spans inside the
existing 600-line orchestrator without restructuring is a separate
careful change.

Verification:
- pnpm --filter @open-design/daemon typecheck: clean
- 29 / 29 critique + metrics + logging tests still green

* fix(nix): bump pnpmDepsHash for prom-client + @opentelemetry/api lockfile bump

nix-check failed on PR #1485 with hash mismatch in
open-design-daemon-pnpm-deps and open-design-web-pnpm-deps after
the Phase 12 foundations commit (2b8b7445) added prom-client and
@opentelemetry/api to apps/daemon/package.json and refreshed
pnpm-lock.yaml.

CI reported the new sha:
  specified: HFLm+8hv3o5x3Xem4MXNsNclIgiVRc70+EBafL0rVn8=
  got:       7R1sQC38gOT0gsZ2oNOviCZ486cbbGJGJCis6WI8z9s=

Both nix files pin the same workspace lockfile, so both flip in
lockstep. No other Nix surface changes required.

* fix(daemon): four Phase 12 review findings (Codex P2 x2 + Siri-Ray P2 + lefarcen P2)

1. Siri-Ray P2 in orchestrator.ts (round metric / log used untrusted
   agent values). The new observability path now records rs.composite
   and rs.mustFix (daemon-authoritative) instead of event.composite
   and event.mustFix when rs exists, and skips the bumps + log
   entirely when rs is missing (a degenerate round_end without any
   matching panelist_open). The dashboard p50 / p90 / p99 now agree
   with persistence and ship decisions; an adapter reporting <ROUND_END
   composite='10'> while the daemon computed 6 logs 6 and still emits
   the composite_mismatch parser warning the prior block was already
   producing.

2. Codex P2 in server.ts (skill label always 'unknown'). The spawn
   path called runOrchestrator without passing the resolved skill id,
   so every live run bumped open_design_critique_*{skill='unknown'}
   and the per-skill dashboard breakdown was always empty. Threaded
   effectiveSkillId (already computed at the same handler scope as
   the project skill fallback) through skill: . . . so the metric
   reflects the real skill when one is assigned, and the orchestrator
   default of 'unknown' only fires for runs that genuinely have none.

3. Codex P2 in conformance.ts (protocol-version mismatch let through).
   An adapter that emitted <CRITIQUE_RUN version='2'> followed by a
   valid SHIP classified as shipped because the harness only watched
   for terminal events. Added a guard inside the parse loop: if a
   run_started carries protocolVersion !== CRITIQUE_PROTOCOL_VERSION,
   mark the adapter degraded with reason 'protocol_version_mismatch'
   (already in DEGRADED_REASONS) and return early. ConformanceOutcome
   union widened to accept the new reason.

4. lefarcen P2 in tools/dev/dashboards/critique.json (runs-per-hour
   panel under-reported by 3600x). 'rate(...[1h])' returns per-second.
   Multiplied by 3600 so the panel title and unit match the actual
   value rendered.
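
The guard from item 3 can be sketched as follows. The event and
outcome types are simplified stand-ins for the harness's real unions,
and the expected version is passed in as a parameter because the
actual value of CRITIQUE_PROTOCOL_VERSION is not stated here.

```typescript
// Simplified stand-ins for the harness types.
type HarnessEvent =
  | { type: "run_started"; protocolVersion: number }
  | { type: "ship" }
  | { type: "other" };

type SketchOutcome =
  | { status: "shipped" }
  | { status: "degraded"; reason: "protocol_version_mismatch" }
  | { status: "incomplete" };

function classifyConformance(events: HarnessEvent[], expectedVersion: number): SketchOutcome {
  for (const event of events) {
    // Guard inside the parse loop: a mismatched version returns early,
    // even if a valid SHIP follows later in the stream.
    if (event.type === "run_started" && event.protocolVersion !== expectedVersion) {
      return { status: "degraded", reason: "protocol_version_mismatch" };
    }
    if (event.type === "ship") return { status: "shipped" };
  }
  return { status: "incomplete" };
}
```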

Verification:
- pnpm --filter @open-design/daemon typecheck: clean
- New metrics + logging suites (7), existing adapter-degraded (7),
  conformance (5), rollout (10): 29 / 29 green
- Grafana JSON re-parses with node -e 'JSON.parse(...)'

* fix(nix): set pnpmDepsHash to fakeHash so CI surfaces the real hash for the regenerated lockfile (lefarcen P1 on PR #1485)

* fix(nix): pin pnpmDepsHash to sha256-NtXbiRU0YZ4EVJVNC6N3sR1S0ozA3BvCwgXI0L0OMH4= from CI nix-check output

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
2026-05-13 22:11:27 +08:00
lefarcen
a1859b7f40 fix(nix): update pnpmDepsHash for merged lockfile
The merged pnpm-lock.yaml (release/v0.7.0 contents on top of main) has a
different hash than either parent. Adopt the value Nix computed in CI.
2026-05-13 18:26:30 +08:00
nettee
f621dbbfea feat(web): Add Tailwind foundation (#1388)
2026-05-12 21:48:16 +08:00
Chris Tam
c61ba320fd feat(nix): Add official flake with home-manager and NixOS support (#402)
* nix: add official flake with home-manager and nixos modules

* Pin pnpm version

* Format README.md

* Populate PATH files to discover installed CLIs

* Revert "Populate PATH files to discover installed CLIs"

This reverts commit 18d88781a88b8781913cf5a8b680dfb38eabf7e4.

* Fix missing sqlite issue

* Fix system issue

* Reapply "Populate PATH files to discover installed CLIs"

This reverts commit d02ea994e6.

* Handle different ports for web frontend

* Provide documentation for getting pnpm hash

* Enable nix flake checks for code changes

* Set `OD_WEB_PORT` on daemon when declared

* fix: Fix environmentFile for macOS targets

* chore: Ignore nix and direnv related files

* fix: Read version directly from `package.json`

* feat: Make nix shell entry prettier

* chore: Update pnpm hashes

* chore: Bump `pnpm` hashes

* docs: Add blurb about dev shell in `README.md`

* Address review comments

* Add support for `OD_WEB_ORIGINS`

* Fix `isLocalSameOrigin`

* Update pnpm checksums

* docs: Update documentation on host origins

* Move allowedOrigins mapping out of the webFrontend.enable guard

* fix: Bump pnpm hashes

* Remove changes to `daemon` with `main` changes

`main` merged a feature that addressed our need for allowed origins.
Since this feature branch no longer needs it, remove any remaining
changes in `daemon` code so that this is a pure Nix change.

* Update documentation around `OD_DAEMON_URL`

* Rewrite option docs to match same-origin proxy contract

The port, webFrontend, and webFrontend.port option descriptions still
described OD_DAEMON_URL as the runtime contract for the SPA, but the
SPA issues relative /api/*, /artifacts/*, /frames/* requests and there
is no runtime daemon-URL injection. Rewrite the three blocks to
describe what the caddy / custom proxy must actually do.

* Document daemon-side requirements for custom-server proxy paths

The bring-your-own-server path in section (3) and the same-origin
contract in section (4) understated what the daemon needs: any proxy
whose origin differs from the daemon's bind (including loopback
split-port like 127.0.0.1:8080 while the daemon stays on :7457) is
403'd by the daemon's same-origin gate until told about that origin.

Add a callout under section (3)'s table, expand section (4) with a
decision table covering same-port, loopback split-port (OD_WEB_PORT or
webFrontend.allowedOrigins), and non-loopback (webFrontend.allowedOrigins)
cases, and rewrite the webFrontend.allowedOrigins option description to
enumerate the cases where it's required and surface OD_WEB_PORT as an
alternative for the loopback split-port case.

---------

Co-authored-by: lefarcen <935902669@qq.com>
2026-05-09 23:50:16 +08:00