Author SHA1 Message Date
lefarcen
59bca72f7e feat(export): programmatic screenshot-based PPTX/PDF export (#4604)
* feat(export): programmatic screenshot-based PPTX/PDF export

Replace the prompt-driven "ask the agent to run python-pptx" PPTX export
with a deterministic, programmatic pipeline. The daemon renders each deck
slide to a pixel-perfect PNG via the desktop's bundled Electron Chromium
(reused over sidecar IPC — no second headless engine, so the packaged app
does not grow) and assembles a one-image-per-slide .pptx (PptxGenJS) or
raster .pdf (pdf-lib).

- sidecar-proto: RENDER_SLIDES message + DesktopRenderSlides{Input,Result}
- daemon: deck-export.ts assembler (decode + pptx + pdf), POST
  /export/pptx and /export/pdf-image routes, desktopSlideRenderer wiring
- desktop: deck-capture.ts renders the deck off-screen and captures one
  PNG per `.deck > .slide` (skips presenter-mode .mini-slide clones)
- web: exportProjectAsPptx() fetch+blob download; ProjectView swaps the
  prompt path for it
- cli: `od export pptx|pdf` dual-track closure
- remove the now-dead build-pptx-export-prompt lib + test

Tests: deck-export assembler unit tests + exportProjectAsPptx web tests.
Screenshot mode ships first; editable-rebuild modes are follow-up.

* chore(nix): refresh pnpm deps hash

* fix(export): deck PDF blank pages on opacity decks + image export capturing the modal

Two pre-existing deck-export defects surfaced while validating the new export:

- Vector PDF (Print-ready) left blank pages for decks that show one slide at a
  time via opacity: DECK_PRINT_CSS forced each .slide onto its own page but
  never reset opacity/visibility, and CSS transitions animated opacity from 0,
  so inactive slides printed blank. Force opacity/visibility/animation/
  transition in print media (both the web and desktop DECK_PRINT_CSS).

- "Export as image" captured Open Design's own format-chooser modal: the host
  compositor snapshot ran after the modal opened, so the overlay leaked into
  the PNG. Capture the clean preview before showing the modal; the cached
  snapshot is reused so opening the modal never re-captures over the overlay.

* feat(export): dual-mode renderer + reuse screenshot for image/PDF export

Unify all screenshot exports on one off-screen renderer and make them
viewport-independent:

- Renderer (deck-capture.ts) now has two modes: deck → one 1920x1080 PNG per
  slide (optionally just the slide at `index`); page (no `.slide` sections,
  e.g. a website) → a single full-document PNG at natural size. Adds `index`
  to the render input and `mode` to the result.
- Image export now renders through the daemon off-screen renderer (deck → the
  current slide at fixed size; website → the whole page as one long image),
  so the exported size no longer depends on the preview pane and can never
  capture Open Design's own UI. Falls back to the host/iframe snapshot on web.
- "Export as PDF" (UI) now produces a pixel-perfect screenshot PDF that matches
  the preview (same renderer as PPTX/image), falling back to the vector
  print path on web or on failure.
- New POST /export/image route; PPTX on a non-deck returns a clear 422.

* feat(export): smart full-page image capture + PDF back to print-view default

Full-page image export now auto-selects the capture technique (users only see
"full page"):
- captureBeyondViewport (one clean off-screen pass, no fixed-element
  duplication) when the output fits the machine's real GPU texture limit —
  queried at runtime via WebGL MAX_TEXTURE_SIZE, not hard-coded — and
  below-the-fold content actually rendered.
- scroll-segment stitch otherwise (too tall, or blank-below-fold scroll-driven
  pages like parallax landings): scrolls a viewport at a time, captures each
  frame, and stitches by real scroll offset into one long PNG. RAM-bound (a
  plain buffer, not a GPU texture), capped by a memory budget; encoded with a
  tiny dependency-free PNG encoder (node:zlib) so it bundles cleanly into the
  ESM packaged main and has no Skia dimension cap.
- Output scale derives from the window DPR / actual captured chunk, fixing a
  double-scale bug (DPR x clip.scale) that produced 4x-sized images.
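
The selection rule in the bullets above can be sketched as a pure function. This is only an illustration under stated assumptions, not the project's code: `chooseCaptureTechnique` and its field names are hypothetical, and the GPU texture limit is assumed to have been queried elsewhere and passed in.

```typescript
type CaptureTechnique = "beyond-viewport" | "scroll-stitch";

// Hypothetical input shape; none of these names come from the codebase.
interface CaptureInput {
  cssWidth: number;           // document width in CSS pixels
  cssHeight: number;          // document height in CSS pixels
  scale: number;              // output scale (device pixel ratio)
  maxTextureSize: number;     // GPU limit queried at runtime, passed in
  belowFoldRendered: boolean; // did below-the-fold content actually paint?
}

function chooseCaptureTechnique(input: CaptureInput): CaptureTechnique {
  const outW = Math.ceil(input.cssWidth * input.scale);
  const outH = Math.ceil(input.cssHeight * input.scale);
  const fitsTexture =
    outW <= input.maxTextureSize && outH <= input.maxTextureSize;
  // One clean off-screen pass only when both conditions hold;
  // anything else falls back to the scroll-segment stitch.
  return fitsTexture && input.belowFoldRendered
    ? "beyond-viewport"
    : "scroll-stitch";
}
```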

"Export as PDF" reverts to the print-ready vector path (instant, selectable
text) as the default; the pixel-perfect screenshot PDF stays available via
`od export pdf`.

* fix(export): PDF defaults to the CJK-safe screenshot path, not vector printToPDF

Chromium's vector printToPDF embeds no fonts in the packaged runtime and drops
CJK glyphs entirely — a Chinese page exported to "PDF" lost all its Chinese text
(only Latin survived). The off-screen screenshot renderer (already used for
image/PPTX) rasterizes the real browser render, so CJK is always correct.

"Export as PDF" now produces a pixel-perfect screenshot PDF that matches the
preview (one page per deck slide, or the whole page for a website), falling back
to the vector/browser print path only on web or on failure. Verified: a Chinese
site that lost its text under vector printToPDF renders fully under the raster
path.

* perf(export): stitch scroll-segments with Electron's native PNG encoder

The scroll-segment path was slow (~100s for a long parallax page) because of a
hand-written PNG encoder: a per-pixel JS BGRA→RGBA loop over tens of millions of
pixels plus zlib.deflate of a ~110MB high-entropy buffer.

Replace both with Electron's native image pipeline: stitch chunks as BGRA with
one Buffer.copy per chunk (capturePage already returns BGRA, which is what
createFromBitmap wants — no channel swap, no per-pixel JS) and encode once via
nativeImage.createFromBitmap(...).toPNG(). createFromBitmap is a CPU bitmap, not
a GPU texture, so it is not bound by the texture limit. Removes the hand-written
PNG encoder (crc32 / chunk framing / node:zlib).

Measured on a long parallax page: image 44s→17s (~2.6x), PDF 101s→19s (~5x), and the
native encoder also compresses better (55MB→26MB PNG).
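
The "one bulk copy per chunk" idea can be sketched with plain typed arrays. This is a minimal sketch that assumes the chunks are full-width, non-overlapping, and already in top-to-bottom order. `BgraChunk` and `stitchBgraChunks` are hypothetical names, and `Uint8Array.set` stands in for the `Buffer.copy` call the commit describes.

```typescript
// Hypothetical chunk shape: raw BGRA pixels, 4 bytes per pixel.
interface BgraChunk {
  width: number;    // device pixels
  height: number;   // device pixels
  data: Uint8Array; // width * height * 4 bytes
}

function stitchBgraChunks(chunks: BgraChunk[]): BgraChunk {
  if (chunks.length === 0) throw new Error("no chunks to stitch");
  const width = chunks[0].width;
  const totalHeight = chunks.reduce((sum, c) => sum + c.height, 0);
  const out = new Uint8Array(width * totalHeight * 4);
  let rowOffset = 0;
  for (const chunk of chunks) {
    if (chunk.width !== width) throw new Error("chunk widths differ");
    if (chunk.data.length !== chunk.width * chunk.height * 4) {
      throw new Error("chunk data length does not match its dimensions");
    }
    // One bulk copy per chunk: no channel swap, no per-pixel loop.
    out.set(chunk.data, rowOffset * width * 4);
    rowOffset += chunk.height;
  }
  return { width, height: totalHeight, data: out };
}
```

In the desktop process the stitched bitmap would then be encoded once, which per the description above happens via `nativeImage.createFromBitmap(...).toPNG()`.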

* fix(export): render the image only on Save, after the format is chosen

Image export was rendering eagerly when the modal opened (a holdover from when
the capture had to run before the modal to avoid catching the overlay). Now the
desktop path renders off-screen and can't see the modal, so capture moves to the
Save click: open modal → pick format → Save → render + encode + download.

The Save button shows the "Saving…" state during the render; for the web-only
host-compositor fallback the modal is hidden during the brief capture so it
can't leak into the image. The snapshot is cached, so switching format after a
render re-encodes without re-capturing.

* perf(export): raster PDF embeds full pages as JPEG (52MB -> a few MB)

The screenshot PDF embedded full-page captures as PNG, so a photo-heavy long
page produced a ~52MB PDF. Full pages now render as JPEG (quality 82) — visually
near-identical for web screenshots but ~10x smaller. Deck slides stay PNG (crisp
text/graphics); image export still uses a lossless PNG source the client
re-encodes to the user's chosen format.

Threads a `pageImageFormat` hint through the render input; the desktop renderer
encodes page mode as JPEG (CDP captureScreenshot format:jpeg / nativeImage
toJPEG) and the daemon assembler embeds with embedJpg vs embedPng per image.

* feat(export): pre-pass for reveal-on-scroll + deck image as one long image

Two improvements found while testing real landing pages:

1. Pre-pass before full-page capture: freeze animations/transitions and scroll
   the whole page once (then back to top) so reveal-on-scroll content
   (IntersectionObserver / AOS / lazy images) is triggered and holds. This is
   the standard full-page-screenshot technique. Result: pages that previously
   came back blank-below-the-fold (and fell to the duplicate-prone scroll-
   stitch) now succeed with a clean single-pass captureBeyondViewport.
   Verified: a reveal-on-scroll landing page renders fully in one shot (no black,
   no duplicates); only pure JS-scrollY parallax (which re-hides at scrollY=0)
   still falls back to scroll-stitch.

2. Image export of a deck now stitches every slide top-to-bottom into one tall
   image (the "whole deck as one picture") via the native BGRA stitch, capped so
   a long deck can't exceed the bitmap limit. Ordinary pages remain a single
   full-page capture; a specific slide index is still honored if given.

* fix(export): show the loading toast at the start (not after) + track real completion

The "Exporting" toast was set inside the promise's .then(), so for a multi-second
screenshot export it only appeared AFTER the export finished — looking frozen the
whole time. Now fireShareExport shows a loading toast immediately, clears it on
success, and shows an error toast on failure. The loading toast TTL is raised to
60s so it survives a long export, and PPTX threads its real promise (was fire-
and-forget) so the toast reflects actual completion. Adds fileViewer.exportFailed
across all locales.

* fix(export): detect scroll-driven pages by comparison, not by color

The blank-below-fold heuristic (flat-color fraction) was unreliable: a dark-
themed page that renders fine (Mindloop, 89% near-black below the fold) looked
"blanker" than a scroll-driven page that genuinely fails (Luxury, 78%), so the
black middle slipped through and exported with a black band.

Replace it with a color-independent comparison: render the document's MIDDLE
band two ways — scrolled into view (real content) vs captureBeyondViewport at
scroll 0 (what the one-shot produces). If they differ significantly, the page is
scroll-driven and we use scroll-segment stitch; otherwise the clean one-shot.
Verified: Luxury now exports full content (stitch), Mindloop stays a clean
one-shot, dark designs are no longer false-flagged.
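
The color-independent comparison can be sketched as a pixel-difference fraction over two captures of the same band. This is an illustration only: the function names, the per-channel tolerance, and the 20% threshold are assumptions, not values from the codebase.

```typescript
// Fraction of pixels whose color differs by more than `channelTolerance`
// in any of the first three channels (alpha is ignored).
function differingFraction(
  a: Uint8Array,
  b: Uint8Array,
  channelTolerance = 16, // assumed value
): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must be same-sized 4-byte-per-pixel bitmaps");
  }
  const pixels = a.length / 4;
  if (pixels === 0) return 0;
  let differing = 0;
  for (let i = 0; i < a.length; i += 4) {
    const d = Math.max(
      Math.abs(a[i] - b[i]),
      Math.abs(a[i + 1] - b[i + 1]),
      Math.abs(a[i + 2] - b[i + 2]),
    );
    if (d > channelTolerance) differing++;
  }
  return differing / pixels;
}

// A page counts as scroll-driven when the scrolled-into-view band and the
// one-shot band disagree on a significant share of pixels.
function isScrollDriven(
  scrolledBand: Uint8Array,
  oneShotBand: Uint8Array,
  threshold = 0.2, // assumed value
): boolean {
  return differingFraction(scrolledBand, oneShotBand) > threshold;
}
```

Because only the difference between the two renders matters, a uniformly dark page that renders identically both ways is not flagged.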

* fix(export): recognize nested-`.slide` decks (export all slides + show PPTX)

A deck whose slides are `.slide` nested under `.deck-viewport`/`.deck-stage`
(not direct children of `.deck`/`body`) was missed: PDF/image exported only the
first slide and the PPTX option was hidden, even though the slide pager showed
"1 / 9".

- Renderer: find slides via `.slide` anywhere, filtering out presenter-mode
  clones (`.mini-slide`/`.overview`/`.thumb`) in-page, instead of the rigid
  `.deck > .slide` selector. showSlide now also sets `visibility:visible` and
  toggles the common active-slide classes (active/visible/is-active/current) so
  decks that hide via `visibility:hidden` and gate reveals on `.visible` render
  every slide; animations are frozen so reveals reach final state instantly.
- UI: PPTX export shows whenever the artifact is deck-like (incl. the
  content-detected `.slide` decks that drive the pager), matching the pager.

Verified on a nested-`.slide` deck: PPTX = 9 slides, image = 9 stitched, all
with content (previously only slide 1).

* perf(export): cache /raw/ assets + halve per-slide round trips

Two performance wins for screenshot export (covers + live preview benefit too):

1. /raw/ now emits ETag + Last-Modified + Cache-Control: no-cache, and answers
   conditional GETs with 304. Covers, live preview, and the screenshot export
   window all load project HTML + its fonts/CSS/images through /raw/, and in the
   packaged app the hidden export window shares the same Chromium session/cache
   as the web UI — so a second load reuses already-downloaded bytes instead of
   re-fetching every asset. The validators are derived from file size+mtime, so
   any agent rewrite changes them and busts the cache immediately (no-cache keeps
   it always-revalidate, never silently stale). Previously /raw/ sent no cache
   headers at all, so nothing was reusable.

2. Deck slide capture merges the slide-show DOM toggle and its two-frame settle
   into a single executeJavaScript round trip (showSlide returns the settle
   Promise) instead of two separate main<->renderer hops. Output is identical;
   the loop is measurably faster and the saving scales with slide count.

Tests: apps/daemon/tests/project-raw-cache.test.ts covers the validators, 304 on
If-None-Match / If-Modified-Since, cache-bust after rewrite, and the streamed
media path. The merge is correctness-preserving by construction (same DOM ops,
same two-frame settle).
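
Deriving validators from file size and mtime can be sketched as follows. This is an assumption-laden illustration: `makeValidators` is a hypothetical name and the exact ETag format used by /raw/ is not shown in this message.

```typescript
interface Validators {
  etag: string;         // strong ETag built from size + mtime (assumed format)
  lastModified: string; // HTTP-date, which only has second granularity
}

function makeValidators(sizeBytes: number, mtimeMs: number): Validators {
  const ms = Math.floor(mtimeMs);
  const etag = `"${sizeBytes.toString(16)}-${ms.toString(16)}"`;
  const lastModified = new Date(Math.floor(ms / 1000) * 1000).toUTCString();
  return { etag, lastModified };
}
```

Any rewrite that changes the size or the mtime changes the ETag, so a conditional GET stops matching and the cached copy is replaced.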

* perf(export): hand rendered images to the daemon as files, not base64 IPC

The desktop renderer used to return every rendered slide/page as a base64 data
URL inside the JSON IPC reply. For large images (photo-heavy decks) that means a
1.33x base64 blow-up plus JSON.stringify/parse of a multi-MB string plus a
multi-MB socket transfer across the desktop->daemon sidecar bridge, with a
matching RAM spike. The pixels only exist in the desktop process (it owns the
Chromium that captures them), so they must cross to the daemon — but they should
cross as a file on the shared filesystem, not as base64 in a JSON message.

Now the daemon picks a unique scratch dir under its data root
(<RUNTIME_DATA_DIR>/export-render/<id>), passes it as `outputDir` in the
RENDER_SLIDES request, the desktop writes the images there and returns their
paths in `slideFiles`, and the daemon reads them back and deletes the dir in a
finally. The desktop only ever writes to the absolute path the daemon handed it, so
this works identically in dev and packaged (desktop never infers the data root).
A unique per-request id means concurrent exports never collide. Base64 data URLs
remain a fallback for older desktop builds that don't honor outputDir.

- sidecar-proto: DesktopRenderSlidesInput.outputDir + DesktopRenderSlidesResult.slideFiles
- deck-capture: emitImages() writes files when outputDir is set (all 3 paths:
  deck per-slide, deck stitch, full-page incl. scroll-segment)
- deck-export: readSlideFiles() reads the handoff files (companion to decodeSlideDataUrls)
- import-export-routes: create/own/clean the scratch dir; prefer slideFiles

Tests: readSlideFiles unit tests; a route-level test that asserts the renderer
is handed an outputDir under the data root, the image returns, the scratch dir is
deleted after the response, and concurrent exports each get a unique dir.

* chore(export): one-line per-phase timing logs for screenshot export

A slow export now leaves a diagnosable trail instead of guesswork:

- desktop `[od-export] render`: load / assets(fonts+images) / prepare / render
  phase breakdown + total, plus mode and whether the handoff used files.
- daemon `[od-export] assemble`: renderer(IPC) / read(file handoff vs base64) /
  assemble(pptx/pdf build) + total + byte size.

These immediately surfaced that a slow image export was dominated by the
artifact's own in-browser compile (Babel/Tailwind CDN) and uncacheable external
media — not the export pipeline (file read was ~2ms). One info line per export.

* fix(export): normalize PDF page size to points; honor --title in CLI output name

Addresses review feedback on PR #4604:

- buildScreenshotPdf sized each PDF page by the captured image's pixel
  dimensions, so the nominal page size scaled with the capture's device pixel
  ratio (a 2x retina capture produced a page twice as large as 1x). Normalize
  each page to a fixed longest-side in points (960pt; a 16:9 slide => 960x540pt,
  matching PowerPoint) with the image's aspect ratio. The image still embeds at
  full pixel resolution; only the page's size in points changes.
- `od export pptx --title "X"` forwarded the title to the server but always saved
  the local file under the source HTML's basename. Name the output after the
  slugified title when --output is not given.

Tests: PDF page-size normalization assertion (loads the PDF, checks 960pt not
the 1px capture size); sidecar-proto render-slides IPC validation (outputDir,
enum, boolean, unknown-key rejection, minimal round-trip).
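
The page-size normalization amounts to scaling the longest side to a fixed number of points. A minimal sketch, assuming the 960pt target stated above; `pdfPageSizePt` is a hypothetical name, not the helper inside buildScreenshotPdf.

```typescript
interface PageSizePt {
  width: number;
  height: number;
}

// Scale the captured pixel dimensions so the longest side is `longestSidePt`
// points, keeping the aspect ratio. The capture's DPR no longer matters.
function pdfPageSizePt(
  widthPx: number,
  heightPx: number,
  longestSidePt = 960,
): PageSizePt {
  if (!(widthPx > 0 && heightPx > 0)) {
    throw new Error("image dimensions must be positive");
  }
  const scale = longestSidePt / Math.max(widthPx, heightPx);
  return { width: widthPx * scale, height: heightPx * scale };
}
```

A 1920x1080 capture and a 3840x2160 (2x) capture of the same slide both map to a 960x540pt page.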

* test(export): cover the server Content-Disposition filename branch

The exportProjectAsPptx happy-path test only exercised the no-header local
fallback name; production always returns a Content-Disposition. Add a test that
pins the branch the desktop download actually uses (server filename wins).

* feat(export): support arbitrary-aspect decks (not just 16:9)

Screenshot deck export no longer assumes every deck is 16:9. The renderer
measures the deck's authored slide box (the rendered rect of the first slide
with layout, so fit-to-viewport decks report the stage they actually paint),
sizes the capture window + pinned stage to it, and clips capturePage to it. The
measured pixel dimensions flow to the PPTX assembler, which derives the slide
layout from the real aspect ratio (13.333" wide, height = width/aspect) instead
of hardcoding LAYOUT_16x9 — so 4:3, square, and portrait decks export
correctly-proportioned slides and PDFs instead of being letterboxed or clipped.

Falls back to 1920x1080 / 16:9 when the slide box can't be measured or is out of
a sane range, so existing 16:9 decks are unchanged.

Verified: demo-deck measures 1920x1080 (16:9, unchanged); a 1024x768 deck
measures 4:3. Tests: PPTX layout follows 16:9 / 4:3 / 9:16 aspect (asserted via
the slide cx/cy in presentation.xml).
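
The layout derivation (fixed 13.333" width, height = width / aspect) can be sketched like this. `pptxLayoutInches` is a hypothetical name, and the fallback here only covers non-finite or non-positive measurements; the real "sane range" check is not shown in this message.

```typescript
const PPTX_WIDTH_IN = 13.333; // slide width in inches, from the description above

interface LayoutInches {
  width: number;
  height: number;
}

function pptxLayoutInches(widthPx: number, heightPx: number): LayoutInches {
  const measurable =
    Number.isFinite(widthPx) && Number.isFinite(heightPx) &&
    widthPx > 0 && heightPx > 0;
  // Fall back to 16:9 when the slide box could not be measured.
  const aspect = measurable ? widthPx / heightPx : 16 / 9;
  return { width: PPTX_WIDTH_IN, height: PPTX_WIDTH_IN / aspect };
}
```

The resulting width and height could then be handed to a custom PptxGenJS layout (for example through `defineLayout`) instead of the built-in 16:9 preset.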

* fix(export): capture off-screen carousel slides (translated-strip decks)

showSlide only toggled the active class/opacity, so decks that paginate by
translating a flex-strip container (e.g. html-ppt-zhangzara-grove) left slide 2+
off-screen and capturePage kept grabbing the first viewport region — exporting
the wrong slide or a blank page.

showSlide now reports where the active slide actually landed; if it is off the
top-left capture stage, showDeckSlide restacks just that slide into the viewport
(clears ancestor transforms + pins it fixed at 0,0) and re-settles before
capture. This branch only runs when the slide is genuinely off-stage, so
transform-scaled fit-to-viewport decks (active slide already at 0,0, and which
DO rely on an ancestor scale) are never touched.

Verified: a 3-slide flex-strip carousel — slide 0 stays at 0,0 (untouched),
slides 1/2 detected off-stage (x=1920) and restacked to 0,0 before capture.

* fix(export): gate PPTX on a host runtime; unify + center the image toast

- PPTX export has no web-only fallback (it needs the daemon's Electron-Chromium
  screenshot renderer), so a web-only deployment showed a PPTX button that always
  failed with 501. Gate `showPptxExport`/`canPptx` on `isOpenDesignHostAvailable()`
  so the action only appears where it can succeed. Image/PDF keep their web
  fallbacks and stay shown.
- Image export showed an in-modal spinner and a separate, non-portaled "saved"
  toast that rendered off-center (its `position:fixed` resolved against the
  preview pane's transform). Route image export progress through the same
  portaled, viewport-centered `exportToast` used by PPTX/PDF: close the modal on
  Save, show a loading toast, then success/error — one consistent, centered toast
  style. Removes the now-dead imageExportBusy/imageExportCapturing/savedToast.

* fix(export): screenshot the current deck slide; never drop slides when stitching

Two more review findings on the screenshot export path:

- captureExportImageSnapshot() routed deck snapshots through the daemon without a
  slide index, so /export/image fell into the stitch-whole-deck branch even for
  "Copy screenshot" and "Export as image" — which both promise "the current
  preview". Pass the active slide index for decks so both capture the current
  slide. Stitching the whole deck into one long image is reserved for an explicit
  action (a follow-up modal toggle).
- stitchDeckSlides() capped the output at DECK_STITCH_MAX_H by stopping the loop
  and still returning ok:true, silently dropping trailing slides (~13+ on a 2x
  capture) — partial-success data loss. It now captures slide 0 to learn the
  native size, picks one uniform downscale so all `count` slides fit under the
  cap, and stitches every slide (long decks just get a smaller per-slide size).

* fix(export): drop dead `scale` param; keep deck PDF slides PNG (not JPEG)

Two more review findings:

- The render-slides contract accepted a `scale` field (validated in sidecar-proto,
  forwarded by handleScreenshotExport) that the desktop renderer never read — a
  broken protocol surface on the feature's first release. Remove it from the
  proto, the daemon route, and BuildDeckRenderInputOptions; the capture resolution
  comes from the measured stage size and host DPR. (No scale multiplier is needed
  today; if one is added later it must actually be applied in the renderer.)
- The deck branch derived its image encoding from `pageImageFormat`, so the
  screenshot-PDF path (which sets pageImageFormat='jpeg') made deck slides lossy
  JPEG — contradicting the contract ("deck slides stay PNG; JPEG is a full-page
  page-mode optimization") and adding compression artifacts to text-heavy slides.
  The deck branch now always encodes PNG; only `page` mode honors JPEG.

* fix(export): no silent truncation for tall pages; deterministic deck slide index

- The full-page scroll-stitch path clamped the document height to the RAM budget
  and returned ok:true, silently dropping everything below the cap on very tall
  pages. It now refuses with a clear "page is too tall — export as PDF instead"
  error instead of returning a truncated image as success; pages within budget
  still stitch their full height. (Decks downscale to fit since they are discrete
  slides; a continuous page fails outright rather than being seam-spliced at reduced scale.)
- Deck screenshots now always send a concrete slide index
  (slideState?.active ?? cached ?? 0) so a fresh open — or a deck detected only
  from `.slide` markup that never emits od:slide-state — captures the current
  slide instead of falling into the stitch-whole-deck branch.

* fix(export): explicit page-vs-deck signal; surface semantic export failures

Two review findings:

- Treating any `.slide` element as proof of a deck was too broad for the generic
  /export/image and /export/pdf-image routes — an ordinary page with carousel or
  testimonial `.slide` markup would skip full-page capture and stitch those
  elements as slides. The caller now passes an explicit `deck` flag (the web
  knows `effectiveDeck`; PPTX is deck-only): `deck:false` forces full-page
  capture, `deck:true` forces slide capture, and the `.slide`-count heuristic
  remains only as the no-signal fallback (e.g. the CLI).
- `exportProjectImageDataUrl()` returned null for every non-OK response, so a
  semantic failure (e.g. the daemon's new "page is too tall — export as PDF")
  was treated as "renderer unavailable" and silently downgraded to a partial
  visible-viewport screenshot. It now returns a discriminated result; the caller
  only falls back to a web capture when the off-screen renderer is genuinely
  unavailable (501/no-host/network) and surfaces the real error otherwise (Copy
  screenshot + Export as image both show the message).

Plumbs `deck` through sidecar-proto, the daemon route/options, exports.ts
(image + pptx + screenshot-pdf), FileViewer, and ProjectView. Proto test covers
deck round-trip + rejection.
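
The three-way `deck` signal can be sketched as a small resolver. This is an illustration, not the daemon's code: `resolveCaptureMode` is a hypothetical name, and treating "at least one slide" as the heuristic threshold is an assumption.

```typescript
type CaptureMode = "deck" | "page";

// deck === true  -> always slide capture
// deck === false -> always full-page capture
// undefined      -> no signal from the caller: use the slide-count heuristic
function resolveCaptureMode(
  deck: boolean | undefined,
  slideCount: number,
): CaptureMode {
  if (deck === true) return "deck";
  if (deck === false) return "page";
  return slideCount > 0 ? "deck" : "page"; // assumed threshold
}
```

With an explicit `false`, an ordinary page that happens to contain carousel `.slide` markup is still captured as a full page.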

* fix(export): harden the file handoff (path confinement) + narrow unavailable

Three security/contract findings on the render-slides file handoff:

- sidecar-proto now rejects a non-absolute `outputDir` (was: any non-empty
  string), so a malformed render-slides request can't make desktop main mkdir +
  write outside the daemon scratch area. Negative proto test added.
- The daemon canonicalizes every returned `slideFiles` path and requires it to
  stay under the canonical `renderOutputDir` before reading — a buggy/malicious
  renderer response can no longer make /export/{pptx,pdf-image,image} read and
  stream back arbitrary files (path traversal / symlink escape). Returns 502 on
  an out-of-scope path; handoff test proves an out-of-tree path is refused and
  its bytes never reach the response.
- exportProjectImageDataUrl wrapped the whole flow in one try/catch, so a 200
  with a corrupt/unreadable payload was reported as `unavailable` and silently
  downgraded to the viewport screenshot. The `unavailable` path is now narrowed
  to transport-level failures (the fetch itself); a bad 200 payload returns a
  semantic `error` so the real failure surfaces.
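
The confinement check on returned paths can be sketched lexically. Important caveat: this sketch handles POSIX-style paths only and does not resolve symlinks, so it covers `..` traversal but not the symlink escape mentioned above; a real check would canonicalize through the filesystem first. Both function names are hypothetical.

```typescript
// Collapse "." and ".." segments of an absolute POSIX path.
// Returns null for non-absolute input.
function normalizePosix(path: string): string | null {
  if (!path.startsWith("/")) return null;
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// True only when `candidate` is strictly inside `dir` after normalization.
function isInsideDir(dir: string, candidate: string): boolean {
  const root = normalizePosix(dir);
  const target = normalizePosix(candidate);
  if (root === null || target === null) return false;
  const prefix = root === "/" ? "/" : root + "/";
  // The trailing slash stops "/a/bc" from passing as a child of "/a/b".
  return target.startsWith(prefix);
}
```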

* fix(export): CLI page/deck flag; reject out-of-range slide index

Two review follow-ups:

- `od export pdf|pptx` now accepts `--deck` / `--page` and forwards the signal in
  the request body, so the CLI hits the route with the same page-vs-deck
  semantics the UI uses (which sends effectiveDeck). Previously the CLI fell back
  to the daemon's `.slide` heuristic, so an ordinary HTML file with carousel
  markup could export as a deck from the CLI but a full page from the UI. (PPTX
  stays deck-only server-side; the flag matters for PDF.) `--deck` and `--page`
  are mutually exclusive; omitting both keeps the heuristic fallback.
- renderDeckSlides did not reject an out-of-range `index`: it fell back to
  range(count) and the daemon returned slide 0 with 200 for image export, so
  asking for slide 99 of a 3-slide deck silently returned slide 0. It now fails
  with a clear "slide index N is out of range" error.
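
The stricter index handling can be sketched as follows. `slideIndicesToRender` is a hypothetical name; only the behavior (all slides when no index is given, an error for an out-of-range index) comes from the description above.

```typescript
// Returns the slide indices to capture: every slide when `index` is omitted,
// exactly one when it is valid, and an error otherwise (never a silent slide 0).
function slideIndicesToRender(count: number, index?: number): number[] {
  if (index === undefined) {
    return Array.from({ length: count }, (_, i) => i);
  }
  if (!Number.isInteger(index) || index < 0 || index >= count) {
    throw new RangeError(
      `slide index ${index} is out of range (deck has ${count} slides)`,
    );
  }
  return [index];
}
```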

* fix(export): If-None-Match precedence; renderer IPC outage -> 502 not 400

- rawRequestIsFresh fell through to If-Modified-Since even when the request sent
  a non-matching If-None-Match, so a same-second rewrite (ETag changes, but
  Last-Modified is identical at second granularity) could 304 changed bytes when
  both headers were sent. If-None-Match is now authoritative when present
  (RFC 9110 §13.1.3) — freshness is the ETag match alone. Regression test sends a
  stale ETag + the current If-Modified-Since and expects 200.
- A rejection from desktopSlideRenderer (a 600s requestJsonIpc) — missing desktop
  process, broken socket, timeout — landed in the outer catch and became
  400 BAD_REQUEST, making renderer outages look like caller errors to retries /
  monitoring. The IPC call is now wrapped and translated to 502
  UPSTREAM_UNAVAILABLE, matching the !rendered.ok branch; the outer 400 stays for
  real request-validation / assembly errors.
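
The If-None-Match precedence rule can be sketched as a pure freshness check. This is a simplified illustration, not rawRequestIsFresh itself: `isFresh` and `ConditionalHeaders` are hypothetical names, and header parsing is reduced to a comma split.

```typescript
interface ConditionalHeaders {
  ifNoneMatch?: string;
  ifModifiedSince?: string;
}

// If-None-Match uses weak comparison, so a leading "W/" is ignored.
function stripWeak(tag: string): string {
  return tag.replace(/^W\//, "");
}

function isFresh(
  req: ConditionalHeaders,
  etag: string,
  lastModified: string,
): boolean {
  if (req.ifNoneMatch !== undefined) {
    // Authoritative when present: If-Modified-Since is not consulted at all.
    const candidates = req.ifNoneMatch.split(",").map((s) => s.trim());
    return (
      candidates.includes("*") ||
      candidates.some((c) => stripWeak(c) === stripWeak(etag))
    );
  }
  if (req.ifModifiedSince !== undefined) {
    const since = Date.parse(req.ifModifiedSince);
    const modified = Date.parse(lastModified);
    return !Number.isNaN(since) && !Number.isNaN(modified) && modified <= since;
  }
  return false;
}
```

A request carrying a stale ETag plus a still-current If-Modified-Since is therefore not fresh, which is the same-second rewrite case described above.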

* fix(export): full-page stitch corrupts on fractional DPR (125%/150% scaling)

scrollSegmentStitch rounded the device pixel ratio to an integer
(`Math.round(size.width / PAGE_W)`), so on fractional display scaling (1.25x,
1.5x) the output width and every row offset were wrong — the stitched full-page
screenshot (/export/image and the raster PDF page path) came back cropped
horizontally or with vertical gaps/overlap even though the page rendered fine.

Derive width/height/placement from the REAL captured device width and its true
(possibly fractional) ratio instead. Extracted scrollStitchGeometry /
scrollStitchRowOffset as pure helpers with a non-integer-DPR regression test
(1x / 1.25x / 1.5x / 2x).
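
The geometry fix can be sketched with two pure helpers that keep the ratio fractional. These are hypothetical stand-ins: the real scrollStitchGeometry / scrollStitchRowOffset signatures are not shown in this message, so the names and parameters below are assumptions.

```typescript
interface StitchGeometry {
  ratio: number;    // true device-pixel ratio, possibly fractional
  widthPx: number;  // output width in device pixels
  heightPx: number; // output height in device pixels
}

// Derive everything from the width the capture actually returned,
// instead of rounding the ratio to an integer first.
function stitchGeometrySketch(
  capturedWidthPx: number,
  pageWidthCss: number,
  docHeightCss: number,
): StitchGeometry {
  const ratio = capturedWidthPx / pageWidthCss;
  return {
    ratio,
    widthPx: capturedWidthPx,
    heightPx: Math.round(docHeightCss * ratio),
  };
}

// Row at which a frame captured at CSS scroll offset `scrollYCss` is placed.
function stitchRowOffsetSketch(scrollYCss: number, ratio: number): number {
  return Math.round(scrollYCss * ratio);
}
```

At 125% scaling a 1280px-wide page is captured 1600px wide, so a frame scrolled to 720 CSS pixels belongs at row 900; an integer-rounded ratio of 1 would have placed it at row 720.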

* fix(export): broaden deck slide selector; content ETag for transformed HTML

- The renderer only recognized `.slide`, but shipped decks use other slide
  contracts the print/export path already supports (e.g.
  html-ppt-zhangzara-creative-mode uses `<section data-screen-label=...>`), so an
  explicit deck export of those silently downgraded to a single full-page
  capture. Broaden SLIDE_SELECTOR to the pdf-export family
  (`.slide, [data-screen-label], .deck-slide, .ppt-slide`), and when
  `deck === true` finds no slide surfaces, fail fast with a clear error instead
  of capturing a page.
- /raw/ revalidation used the source file's mtime ETag even when the response is
  substituted by a transform (Vite dev-entry -> dist/index.html, or preview
  bridge injection). A change to dist/index.html with an unchanged source entry
  could then return a stale 304. Compute a content ETag from the actual sent
  bytes for transformed HTML; assets/fonts/images/streamed media keep the fast
  mtime ETag + early 304. Regression: rewriting only dist/index.html returns 200.

* fix(export): gate PPTX on explicit deck; page-mode DOM intact; stitch RAM budget

Four review findings:

- PPTX action was gated on the `.slide` regex (`effectiveDeck`/`looksLikeDeck`),
  so ordinary pages with carousel/testimonial `.slide` markup surfaced PPTX and
  were forced through the deck renderer (hardcoded `deck: true`). Gate
  show/canPptx on the EXPLICIT deck signal (`isDeckArtifact`: deck renderer / kind
  / presentation) instead; real decks keep PPTX, pages don't, and `deck: true`
  is now always correct. Image/PDF stay on the broader signal (they handle pages).
- renderDeckSlides ran prepareDeck (hide chrome + freeze animations) BEFORE
  deciding page vs deck, so page-mode exports rendered on a mutated DOM (content
  using generic `.notes`/`.overview` classes vanished). Split the non-mutating
  slide count from the deck-only DOM prep; page mode now captures the original
  document.
- stitchDeckSlides capped only output height, so a wide/high-DPR deck could still
  allocate >1 GiB (8192px stage @2x => W~16384 * 30000 * 4). Add a RAM byte budget
  (320MB, like the page stitcher): downscale by min(heightScale, byteScale).
- sidecar-proto render-slides test now covers the `index` field (success + reject
  negative / fractional / non-number).
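
The `min(heightScale, byteScale)` rule can be sketched as one uniform downscale factor. This is a sketch under assumptions: `deckStitchScale` is a hypothetical name, the caps are passed in as parameters rather than read from the real constants, and the output is assumed to be a 4-byte-per-pixel bitmap.

```typescript
// One uniform scale so that every slide fits under both the height cap and
// the RAM byte budget. Never upscales (result is at most 1).
function deckStitchScale(
  slideWidthPx: number,
  slideHeightPx: number,
  count: number,
  maxHeightPx: number,
  maxBytes: number,
): number {
  const totalHeight = slideHeightPx * count;
  const heightScale = maxHeightPx / totalHeight;
  // Bytes grow with the square of the scale (width and height both shrink),
  // hence the square root.
  const byteScale = Math.sqrt(maxBytes / (slideWidthPx * totalHeight * 4));
  return Math.min(1, heightScale, byteScale);
}
```

For a wide, high-DPR deck the byte budget is the tighter bound, so the height cap alone would not have prevented an oversized allocation.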

* fix(export): image/PDF deck flag from explicit signal, not .slide heuristic

The image and screenshot-PDF exports still passed `deck: effectiveDeck` (the
`.slide` regex), so an ordinary HTML page with carousel/testimonial `.slide`
markup exported only the current card instead of the full page. Drive both off
the explicit `isDeckArtifact` signal (same as PPTX): a real deck → per-slide, a
page → full-page capture. Extracted `shouldCaptureAsDeck()` as a pure helper with
a regression test (page + slides + deck:false => page, not per-slide).

* fix(export): screenshot PDF download must prompt Save As (.pdf in allowlist)

The default Export PDF flow now streams a .pdf download via
exportProjectScreenshotPdf, but the will-download Save As hook only intercepted
.pptx + image extensions — so PDF silently wrote to the OS Downloads folder
while PPTX/images prompted. Add .pdf to SAVE_AS_EXTENSIONS with a PDF filter,
and extract saveAsDialogOptionsForFilename() as a pure helper with a runtime test
(PDF/PPTX/image prompt; uppercase matched; other extensions pass through).

* fix(export): single-shot guard for image export (no double-click duplicates)

The toast-based image export closes the modal and starts the save without an
in-flight guard (the old in-modal busy/disabled states were removed), so a fast
double-click / Enter-repeat on Save could enqueue two concurrent exports
(duplicate captures, downloads, and fireImageExportResult bookkeeping) before the
modal-close re-render removed the button. Add an imageExportInFlightRef guard
that returns early on re-entry and resets in finally — mirrors the existing
screenshotInFlightRef pattern.

* fix(export): If-Range guard on /raw/ stream; block image-modal reopen mid-export

Two non-blocking correctness issues:

- /raw/ honored Range unconditionally even with the new ETag/Last-Modified, so a
  client resuming a cached font/media download after the file changed could
  splice stale + fresh bytes. Gate Range on If-Range (RFC 9110 §13.1.5): serve
  206 only when the If-Range validator (ETag or date) still matches the current
  file, else fall back to a full 200. Regression test: stale If-Range + Range
  returns 200 with the new full length.
- The image-export single-shot guard covered handleImageExportSave, but reopening
  the modal mid-export reset the shared request/result refs, mis-attributing or
  dropping the in-flight export's analytics result. openImageExportModal now
  no-ops while an export is in flight.
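
The If-Range gate can be sketched as a validator match that decides between 206 and a full 200. This is a simplified illustration with a hypothetical name (`ifRangeMatches`); it assumes the current ETag is strong and compares dates for exact equality.

```typescript
// True when the If-Range validator still identifies the current file, in
// which case the Range request may be served as 206. Otherwise the caller
// should ignore Range and send the full 200 response.
function ifRangeMatches(
  ifRange: string,
  etag: string,
  lastModified: string,
): boolean {
  const value = ifRange.trim();
  if (value.startsWith("W/")) return false;          // weak validators never match
  if (value.startsWith('"')) return value === etag;  // strong ETag: exact match
  const sent = Date.parse(value);
  const current = Date.parse(lastModified);
  return !Number.isNaN(sent) && sent === current;    // date: exact match only
}
```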

* fix(export): drive image/PDF deck decision off the viewer signal (effectiveDeck)

The desktop screenshot image/PDF paths were gated on isDeckArtifact while the
vector-PDF fallback (and the viewer's own prev/next/Present) use effectiveDeck.
That diverged: a metadata-free `.slide` deck rendered as a deck in preview but
exported as a single full page on a desktop host, yet as a deck via the browser
fallback — same artifact, different output depending on host.

Drive image + screenshot-PDF off effectiveDeck (the viewer's deck decision), so
export matches what the user sees and is host-independent. PPTX keeps the
narrower isDeckArtifact: it is deck-only with no vector fallback, so it can't
diverge, and it must not offer slide export for incidental carousel markup.
Removes the now-dead isDeckForExport binding.

* test(web): update image-export specs for capture-on-Save modal flow

The image-export modal was redesigned in this PR from eager-capture-on-open
(preview + live format re-render + in-modal alert + disabled-until-ready Save)
to capture-on-Save unified with the PPTX/PDF portaled-toast flow: the dialog
just picks a format, and Save closes it and runs the single capture behind the
export toast. The 9 specs in file-viewer-image-export.test.tsx still drove the
old eager flow and failed in CI (Web workspace tests). Updated each to click
Save before asserting capture, pick the format before Save, assert the portaled
toast (role=alert error text unchanged) instead of the removed in-modal alert,
and replaced the obsolete "preparing label" spec with one proving no eager
capture happens on open or on format change.

* fix(cli): od export honors the server Content-Disposition filename

The web download helper prefers the daemon's Content-Disposition filename and
only falls back to a locally derived name. `od export` ignored it and always
synthesized the name from --title/basename, so the two surfaces could write
different filenames for the same export. Parse the header (RFC 5987 filename*
and plain filename, reduced to a hardened basename so an odd header can't steer
the write outside the cwd) and prefer it when --output is not given, keeping the
title-slug/basename fallback. Mirrors apps/web/src/runtime/exports.ts.
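
A rough sketch of the header handling described above, under stated assumptions: `filenameFromContentDisposition` is a hypothetical name, and the exact hardening rules of the real CLI parser are not shown in this log. It prefers `filename*`, falls back to plain `filename`, and reduces the result to a basename.

```typescript
// Hypothetical helper; illustrates the idea only.
function filenameFromContentDisposition(header: string | null): string | null {
  if (!header) return null;
  let raw: string | null = null;
  // Prefer the extended form: filename*=UTF-8''percent-encoded-name
  const ext = /filename\*\s*=\s*UTF-8''([^;]+)/i.exec(header);
  if (ext) {
    try {
      raw = decodeURIComponent((ext[1] || "").trim());
    } catch {
      raw = null; // malformed percent-encoding: try the plain form instead
    }
  }
  if (raw === null) {
    const plain = /filename\s*=\s*(?:"([^"]*)"|([^;]+))/i.exec(header);
    if (plain) raw = (plain[1] || plain[2] || "").trim();
  }
  if (!raw) return null;
  // Harden: keep only the last path segment and strip control characters.
  const base = raw.split(/[\\/]/).pop() || "";
  const cleaned = base.replace(/[\u0000-\u001f]/g, "");
  if (cleaned === "" || cleaned === "." || cleaned === "..") return null;
  return cleaned;
}
```

A caller would use the returned name only when `--output` is absent, and fall back to the title slug when the helper returns `null`.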

* fix(export): detect runtime-managed decks; image=whole deck; de-dup long pages

QA found three blocking export-fidelity issues on this PR:

1. Horizontal decks export only slide 1 (image: all such templates; PDF:
   some). Runtime-managed decks (`<deck-stage>` web component with slotted
   `<section data-screen-label>` children toggled via `data-deck-active`)
   carry no literal `class="slide"`, so the viewer's `looksLikeDeck` regex
   misses them and the UI sent an authoritative `deck:false`. The host then
   force-captured page mode (`mode:'page', slides:1`) — a full-page shot of
   whatever slide was visible. PDF same path: `deck:false` skips the host
   DECK_PRINT_CSS, so decks without their own `@media print` print one page.
   Fix: a broader EXPORT-only signal `sourceLooksLikeExportableDeck` /
   `deckExportSignal` mirroring the host's slide-surface family
   (`.slide`/`[data-screen-label]`/`.deck-slide`/`.ppt-slide`) plus
   `<deck-stage>`. Kept OUT of `effectiveDeck` so the host's deck-stage-
   incompatible prev/next nav is not surfaced as a dead "— / —" control.

2. "Export as image" of a deck returned the current slide only. It now
   stitches every slide into one long image (matching the slide count the
   viewer reports); Copy screenshot / Mark-Draw capture keep the current
   slide via `captureExportImageSnapshot({ wholeDeck })`.

3. Long-page image/PDF export duplicated a fixed/sticky hero down the
   output: the scroll-segment stitch captures the viewport per offset, so a
   pinned element was copied into every segment. `preparePageForCapture` now
   neutralizes `position:fixed`->absolute and `sticky`->static before
   measuring/capturing, so each renders once (captureBeyondViewport already
   de-dupes; applied uniformly for consistency).

Red specs: exports.test.ts (deck detection), neutralize-positioning.test.ts
(fixed/sticky normalization).
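
The broader export-only signal from item 1 might look roughly like the following. This is a sketch under assumptions: the function name is taken from the text, but its real signature and matching rules are not shown here. Class names are tokenized so that an unrelated class such as `mini-slide` does not count as a slide surface.

```typescript
// Slide-surface classes named in the commit text.
const SLIDE_CLASSES = ["slide", "deck-slide", "ppt-slide"];

// Hypothetical implementation: a source-level check, not a DOM query.
function sourceLooksLikeExportableDeck(html: string): boolean {
  if (/<deck-stage[\s>]/i.test(html)) return true;
  if (/<[a-z][^>]*\sdata-screen-label[\s=>\/]/i.test(html)) return true;
  const classAttr = /\sclass\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  let match: RegExpExecArray | null;
  while ((match = classAttr.exec(html)) !== null) {
    const tokens = (match[1] || match[2] || "").split(/\s+/);
    for (const token of tokens) {
      if (SLIDE_CLASSES.indexOf(token) !== -1) return true;
    }
  }
  return false;
}
```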

* chore: re-trigger CI on updated main — needs-validation gate moved to merge_group (#4714)

* fix(sidecar): decode IPC frames with StringDecoder (multibyte UTF-8 corruption)

Exported CJK artifacts intermittently showed `???` / `◆?` (U+FFFD) in place of a
character — e.g. "拥挤" rendered as "拥���", "交付边界" as "交付���界". The bad
character varied between exports, the source bytes on disk were correct, and the
daemon /raw/ serve was byte-identical, so it was not a font or storage problem.

Root cause is in the generic JSON-IPC transport. Both the server and client
socket readers did `buffer += chunk.toString()` into a STRING. A render request
carries the full artifact HTML over the desktop IPC; when the payload spans
multiple `data` events, a multibyte UTF-8 character (CJK = 3 bytes) straddling a
chunk boundary is decoded per-chunk, turning each partial half into U+FFFD. Small
payloads never hit a boundary (hence "works in my repro, breaks on the real
file"); large real artifacts do, at whichever character lands on the split.

Fix: feed each chunk through a per-connection `StringDecoder("utf8")`, which
holds an incomplete trailing byte sequence until the next chunk completes it.

Verified end-to-end against the QA "Blog Post" artifact in a packaged client:
"拥���" → "拥挤" after the fix. Red spec: a ~1.3 MB CJK payload round-tripped
through createJsonIpcServer/requestJsonIpc (forces multi-chunk delivery) is now
byte-exact; it fails on the pre-fix reader.
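
The failure mode and the fix can be reproduced in a few lines with Node's `string_decoder` module. The sketch below is illustrative only: it splits a two-character CJK string inside the second character to imitate a chunk boundary, without any socket involved.

```typescript
import { StringDecoder } from "node:string_decoder";

const payload = Buffer.from("拥挤", "utf8"); // 6 bytes, 3 per character
// Split inside the second character to imitate two `data` events.
const chunks = [payload.subarray(0, 4), payload.subarray(4)];

// Pre-fix shape: each chunk is decoded on its own, so the partial byte
// sequences on either side of the split turn into U+FFFD.
let naive = "";
for (const chunk of chunks) naive += chunk.toString("utf8");

// Post-fix shape: one decoder per connection holds the incomplete tail
// until the next chunk completes it.
const decoder = new StringDecoder("utf8");
let decoded = "";
for (const chunk of chunks) decoded += decoder.write(chunk);
decoded += decoder.end();
```

Here `naive` keeps "拥" but replaces the split character with U+FFFD, while `decoded` round-trips the original string.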

* fix(export): vector deck PDF rendered only the first slide

A deck exported via the vector PDF fallback (POST /export/pdf →
exportPdfFromHtml) collapsed to a single page: only the runtime-active slide
appeared. Decks gate visibility with `.slide:not(.active){display:none!important}`
(specificity 0,2,0); the host DECK_PRINT_CSS `.slide{}` rule (0,1,0) cannot win
that cascade, so every non-active slide stayed `display:none` in print.

Fix: before printToPDF, mark every slide surface active (the same class set the
screenshot path toggles in deck-capture's showSlide), so the deck's own
`.slide.active` styling applies to all slides and DECK_PRINT_CSS paginates them
one per page. Shadow-DOM `<deck-stage>` decks are unaffected (their own
`@media print` already lays out every slide).

Verified with an offscreen printToPDF of a 12-slide `.slide`-class deck: 1 page
-> 12 pages, each a distinct centered slide.

* fix(export): screenshot PDF fails fast instead of masking errors as vector PDF

Per review: the raster-PDF path fell back to the vector `exportProjectAsPdf` for
EVERY non-ok screenshot result, so a semantic failure (bad deck routing, a 422,
a renderer-side 502, "page too tall", unreadable output) silently handed the user
a different (vector) PDF — the exact fidelity/CJK-glyph class of bug the
screenshot path exists to avoid.

exportProjectAsPptx now returns the same tri-state as exportProjectImageDataUrl:
`{ok:true}` / `{ok:false,unavailable:true}` (501 or transport — caller may fall
back) / `{ok:false,error}` (semantic — must surface). The PDF action only falls
through to the vector path on `unavailable`; a semantic error throws and is shown
in the export toast (onErr now prefers the export's own user-facing message).
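
The tri-state maps naturally onto a discriminated union. In the sketch below the result shape follows the text above, while `decideNextStep` is a hypothetical caller added for illustration and is not the project's PDF action.

```typescript
// Result shape as described in the commit text.
type ExportResult =
  | { ok: true }
  | { ok: false; unavailable: true }
  | { ok: false; error: string };

// Hypothetical caller: only `unavailable` may fall back to the vector path.
function decideNextStep(result: ExportResult): "done" | "fallback-to-vector" {
  if (result.ok) return "done";
  if ("unavailable" in result) return "fallback-to-vector";
  // Semantic failure: throw so the message reaches the export toast.
  throw new Error(result.error);
}
```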

* chore(nix): refresh pnpm deps hash

* fix(export): guard deck capture against stale-frame duplicate pages

QA saw a deck export with duplicate pages (e.g. two identical 目录 pages, a
slide silently missing). Root cause is a compositor race: after showing slide i,
`capturePage()` can return the PREVIOUS slide's frame when the new slide hasn't
painted yet (more likely on slower / loaded machines and slides with heavier
reveal content), so the loop emits an exact duplicate of the prior page. The
source has 12 distinct slides and live navigation is fine — the race is purely in
the offscreen capture loop.

Fix: after each capture, compare a cheap sampled checksum to the previous
slide's; if the two match (which should not happen for visually distinct
slides), wait for more frames and re-capture (bounded, 4 attempts × ~60ms). Two
genuinely identical adjacent slides exhaust the retries and emit once. Applied
to both the per-slide (PDF/PPTX) and stitch (whole-deck image) loops.

Test: imageSignature distinguishes captures by content and length. (The race
itself is timing-dependent and not reproducible on a fast/idle machine — both
file:// and packaged-http exports of the reported deck render 12 unique pages
here — so the guard hardens the failure mode rather than relying on local repro.)

* fix(export): paginate tall pages for raster PDF instead of refusing

Per review: the single-image RAM/texture guard in capturePage refused any page
taller than the budget with "page is too tall — export as PDF instead". That is
right for /export/image, but /export/pdf-image routes ordinary-page PDFs through
the same branch — and since the screenshot-PDF path now fails fast (no silent
vector fallback), a long landing page exported as PDF hit a self-contradictory
hard error and regressed tall-website PDF export.

Fix: the PDF path (`jpeg`) now paginates a too-tall page into a multi-page raster
PDF — captureBeyondViewport per texture/RAM-bounded chunk, one image per chunk,
which the daemon assembles into one PDF page each. /export/image (png) keeps its
refusal (it has nowhere to paginate to). tallPageChunkHeights extracted + tested.

Verified offscreen: a ~20400px page → PDF path returns 3 paginated pages
(ok/page), image path still refuses.
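
The chunking step can be sketched as below. Only the helper name comes from the text; its signature is hypothetical, and the 8192px budget in the usage note is an assumed value, not the project's actual texture/RAM limit.

```typescript
// Hypothetical signature: split a page height into chunk heights, each at
// most `maxChunkHeight`, covering the page top to bottom.
function tallPageChunkHeights(pageHeight: number, maxChunkHeight: number): number[] {
  if (pageHeight <= 0 || maxChunkHeight <= 0) return [];
  const heights: number[] = [];
  for (let remaining = pageHeight; remaining > 0; remaining -= maxChunkHeight) {
    heights.push(Math.min(remaining, maxChunkHeight));
  }
  return heights;
}
```

With an assumed 8192px budget, `tallPageChunkHeights(20400, 8192)` yields `[8192, 8192, 4016]`, i.e. three chunks and therefore three PDF pages.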

* fix(export): capture deck slides via CDP (structural fix for duplicate pages)

Replaces the pixel-compare/retry guard (88d21c7) with a structural fix, per
review feedback: comparing each capture to the previous slide is the wrong
abstraction (it can't tell a stale frame from two genuinely-identical adjacent
slides, and wastes retries on the latter).

Root cause: the deck path used `webContents.capturePage()`, which grabs the last
COMPOSITED frame and can return the previous slide's frame when the just-shown
slide hasn't composited yet — emitting an exact-duplicate page. The page path
never had this because it uses CDP `Page.captureScreenshot`, which renders the
CURRENT DOM to a fresh frame.

Fix: deck capture now uses CDP `Page.captureScreenshot` too (attach the debugger
once around the deck loop; fall back to capturePage if it can't attach). The
captured pixels always reflect the slide just shown — no compare, no retry, no
identical-slide edge case. Animations/transitions are already frozen
(prepareDeckStage), so each slide is captured at its final state, never a
mid page-turn frame. Removed imageSignature + the retry loop.

Verified: 12-slide deck still stitches to 12 distinct slides at the correct dims.

* fix(export): current-slide capture of runtime decks uses the visible slide

Per review: deckExportSignal makes runtime-managed decks (<deck-stage> /
data-screen-label) exportable, but the current-slide path (Copy screenshot /
annotation capture) still resolved the slide index as `slideState?.active ?? 0`.
Those decks are deliberately kept out of effectiveDeck, so the viewer never
receives their active-slide bridge and slideState is null — meaning Copy
screenshot always off-screen-rendered slide 0 instead of the slide on screen,
inconsistent with the PPTX/PDF fix on the same templates.

Fix: planDeckImageCapture() decides per capture — whole-deck (Export as image),
ordinary pages, and tracked .slide decks render off-screen (with the active index
when tracked); an untracked deck's current-slide capture skips the off-screen
path and falls through to the visible host snapshot (which IS the current slide).

Tests: planDeckImageCapture unit cases (exports.test.ts) + a FileViewer
regression — Copy screenshot of a data-screen-label deck with no tracked slide
uses the host snapshot and does NOT off-screen-render slide 0.
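
The per-capture decision, as this commit describes it, could be sketched like this. The input and output shapes are assumptions made for illustration; only the function name and the four cases come from the text.

```typescript
// Assumed input shape.
interface CaptureRequest {
  wholeDeck: boolean;         // true for "Export as image"
  isDeck: boolean;            // export-level deck signal
  activeIndex: number | null; // null when the viewer does not track the slide
}

// Assumed output shape.
type CapturePlan =
  | { useOffscreen: true; index?: number }
  | { useOffscreen: false }; // fall through to the visible host snapshot

function planDeckImageCapture(req: CaptureRequest): CapturePlan {
  if (req.wholeDeck) return { useOffscreen: true };   // every slide
  if (!req.isDeck) return { useOffscreen: true };     // ordinary page
  if (req.activeIndex !== null) {
    return { useOffscreen: true, index: req.activeIndex }; // tracked deck
  }
  return { useOffscreen: false };                     // untracked deck
}
```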

* fix(export): don't mask post-response failures / debugger-less tall PDF as fallback

Two review edge cases:

- exportProjectAsPptx wrapped resp.blob() + triggerDownload() in the same
  try/catch that maps to `{unavailable:true}`, so a corrupt body or a
  client-side download failure (after a 200) was reported as "renderer
  unavailable" — letting the PDF caller silently downgrade to the vector path.
  Only the fetch (transport) and 501 now map to `unavailable`; post-response
  failures return `{error}` so they surface. Unit test added.

- capturePage's no-debugger fallback still returned "page is too tall — export
  as PDF instead" for the PDF path (jpeg). Pagination needs CDP, and we only
  reach this branch when the debugger can't attach, so it now surfaces a
  distinct retryable error instead of telling the user to switch to the format
  they already chose. (The debugger attaches in normal packaged exports; this is
  a rare transient.)

* fix(export): distinguish CDP attach failure from later CDP command failure

Per review: when the debugger attached but a later CDP command threw (a real
Chromium/GPU/clip error), the broad catch swallowed it and the too-tall PDF
refusal reported "renderer is busy, please retry" — hiding the actionable error
and sending users into a pointless retry loop. The retryable busy message is
only correct when the attach itself failed.

Track the caught CDP error (cdpError) separately: the too-tall PDF branch now
surfaces the real CDP error message when the debugger was available but a command
failed, and reserves the retryable "busy" message for true attach contention.

* fix(export): reject `od export pptx --page`; test the tall-PDF error split

Two review items:

- CLI: `od export pptx --page` advertised a page mode that can never work (the
  daemon forces deck mode for /export/pptx). Reject `--page` for pptx with a
  clear contract error pointing at `od export pdf --page` instead of silently
  ignoring it.

- Lock down the cdpError split with a regression: extract tooTallPdfErrorMessage
  and unit-test both branches — attach failure → retryable "busy" message;
  attached-but-CDP-command-failed → the real Chromium/GPU error surfaces (and
  neither tells the user to "export as PDF", which they already chose).
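
A minimal sketch of the two-branch split, assuming a hypothetical input shape and placeholder wording; only the helper name and the branch semantics come from the text.

```typescript
// Assumed input shape for illustration.
interface TallPdfFailure {
  attached: boolean;      // did the debugger attach at all?
  cdpError: Error | null; // error thrown by a later CDP command, if any
}

function tooTallPdfErrorMessage(failure: TallPdfFailure): string {
  if (failure.attached && failure.cdpError) {
    // Attached but a CDP command failed: surface the real cause.
    return `Could not render this page to PDF: ${failure.cdpError.message}`;
  }
  // Attach itself failed: transient contention, so a retry is reasonable.
  return "The renderer is busy. Please retry the export in a moment.";
}
```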

* fix(export): keep current-view captures viewport-based; reject weak If-Range

Two review items:

- planDeckImageCapture sent ordinary-page Copy screenshot / captureViewport
  annotation through the off-screen renderer (useOffscreen:true, no index), which
  renders the WHOLE document instead of the visible region — a regression for
  screenshot/annotation viewport semantics. Now: Export-as-image (wholeDeck) and
  tracked-deck current-slide still render off-screen; an ordinary page's
  current-view capture (and an untracked deck's) falls back to the visible host
  snapshot. Tests updated.

- ifRangeAllowsPartial accepted weak entity-tags for a 206, but RFC 9110 §13.1.5
  requires a strong validator and our /raw/ ETag is weak (W/"size-mtime"). A
  same-size rewrite / mtime collision could splice stale + fresh bytes under a
  matching weak tag. Now any entity-tag If-Range falls back to full 200; only the
  date form authorizes a range. project-raw-cache.test.ts pins it (weak-ETag
  If-Range → 200, fresh date → 206, stale date → 200).
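
The date-only rule from the second item can be sketched as follows. The signature is hypothetical, and two details are assumptions rather than facts from this log: a request without `If-Range` still honors `Range`, and the date must match `Last-Modified` exactly at one-second resolution.

```typescript
// Hypothetical signature: may this Range request be answered with a 206?
function ifRangeAllowsPartial(ifRange: string | undefined, lastModifiedMs: number): boolean {
  if (ifRange === undefined) return true; // no precondition on the range
  const value = ifRange.trim();
  // Entity-tag form, strong ("...") or weak (W/"..."): never authorizes a
  // range here, because the served ETag is weak.
  if (value.charAt(0) === '"' || value.slice(0, 2) === "W/") return false;
  const since = Date.parse(value);
  if (isNaN(since)) return false;
  // HTTP dates carry whole seconds, so truncate the file mtime to match.
  return Math.floor(lastModifiedMs / 1000) * 1000 === since;
}
```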

* fix(export): resolve imported-folder project files via metadata.baseDir

Per review: the new screenshot export routes (and the vector /export/pdf) read
the source with readProjectFile() without passing metadata, so the lookup fell
back to <OD_DATA_DIR>/projects/:id and returned FILE_NOT_FOUND for
imported-folder projects (whose workspace lives at metadata.baseDir) even though
the file renders in the UI.

Thread project metadata through: BuildDeckRenderInputOptions and
BuildDesktopPdfExportInputOptions gain a `metadata` field passed to
readProjectFile; handleScreenshotExport and the /export/pdf route load it via
getProject(db, id)?.metadata. HTTP regression added: an imported-folder project
(created through /api/import/folder) hitting /export/image now returns 200 with
the rendered image instead of 404.

* chore(nix): refresh pnpm deps hash

* Show PPTX export for detected decks

* Fix deck export detection for page captures

* Route CLI image export through screenshot renderer

* Route legacy image export through screenshot renderer

* fix(export): per-viewport PDF pagination + parallax-faithful image capture

A long non-deck page exported to PDF came out as one giant page, and the same
page exported as an image dropped its scroll-pinned text. Both stemmed from the
page-capture path: PDF assembled one PDF page from a single tall capture, and
the image path flattened fixed/sticky positioning (fixed->absolute,
sticky->static), which deleted parallax headline/foreground text.

- PDF: add a `paginate` render-slides input. A non-deck page now captures one
  image PER VIEWPORT, top to bottom, and the daemon assembles a multi-page PDF
  (one screen per page). Decks still paginate per slide; `paginate` applies to
  page mode only.
- Image: capture each viewport live at its real scroll offset and stitch into
  one tall image, keeping fixed/sticky CSS as authored -- the SAME capture logic
  as the PDF path, differing only in assembly. Drop the captureBeyondViewport
  one-shot and its isScrollBound heuristic (it rendered the whole document at
  scroll 0 and got parallax/reveal-on-scroll content wrong), and drop the
  fixed-neutralization step (it dropped pinned text).

Adds paginateViewportBand unit coverage and a paginate IPC round-trip/rejection
case; removes the now-unused neutralizeFixedAndStickyPositioning helper and test.

* fix(export): capture deck-stage at authored size; share pptx in contracts

Addresses two review findings on the screenshot export surface.

- deck-stage fidelity (blocking): the <deck-stage> runtime fits its canvas to
  the viewport with `transform: scale(...)` by default and documents that PPTX
  export must set the `noscale` attribute so the DOM is captured at the authored
  slide size. The renderer never set it, so a deck whose authored canvas differs
  from the 1920x1080 capture viewport was measured + captured at the preview-
  scaled size. prepareDeckStage now sets `noscale` on every <deck-stage> (a
  no-op for plain `.slide` decks).

- contract boundary: `pptx` was a first-class CLI/daemon export format but the
  shared `EXPORT_FORMATS` in `@open-design/contracts` still declared only
  `['pdf', 'image']`, so the capability was typed through an ad hoc local union.
  Add `pptx` to the shared contract, import it in the CLI instead of a local
  duplicate, and route `pptx` through the generic `/export` route (to the
  screenshot renderer) alongside `image`.

* fix(export): route CLI --format pdf through the raster screenshot PDF path

`od export --format pdf` still posted to the generic `/export` route, whose
desktopArtifactExporter renders vector PDF via printToPDF() and drops CJK glyphs
in the packaged runtime. The web UI was deliberately switched to the raster
`/export/pdf-image` path for that reason, so the CLI diverged from the UI on the
exact decks/pages this feature targets.

Route all three CLI formats through the screenshot renderer (pdf →
/export/pdf-image, matching the UI). Extract the format→route mapping into a
pure `exportRoutePath` helper so it is unit-testable without executing the CLI
entrypoint, and assert no format falls through to the vector `/export` route.
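
A pure mapping helper of this kind might look like the sketch below. Only `pdf` to `/export/pdf-image` is stated in the text; the `image` and `pptx` paths are assumptions based on route names mentioned elsewhere in this log, and the real helper's signature may differ.

```typescript
type ExportFormat = "pdf" | "image" | "pptx";

// pdf -> /export/pdf-image comes from the commit text; the other two
// entries are assumed for illustration.
const EXPORT_ROUTES: Record<ExportFormat, string> = {
  pdf: "/export/pdf-image",
  image: "/export/image",
  pptx: "/export/pptx",
};

function exportRoutePath(format: ExportFormat): string {
  return EXPORT_ROUTES[format];
}
```

Keeping the table as data makes the "no format falls through to the vector `/export` route" property a one-line assertion over all formats.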

* fix(export): route generic POST /export pdf through the raster screenshot path

The shared ExportRequest contract advertises `pdf` as part of the screenshot-
rendered export surface, but the generic `/export` route still sent `format:
'pdf'` to desktopArtifactExporter's vector printToPDF() path, which drops CJK
glyphs in the packaged runtime. So a contract caller hitting POST /export got the
lower-fidelity PDF while the dedicated /export/pdf-image route, the UI, and the
CLI all use the raster screenshot PDF — the API surface was internally
inconsistent.

Route every /export format (pdf included) through handleScreenshotExport so the
generic endpoint matches the dedicated routes and the contract; drop the
now-unused desktopArtifactExporter / buildDesktopArtifactExportInput wiring from the
route. Add an HTTP-level regression asserting POST /export with format:'pdf'
runs the screenshot renderer and streams back a real (%PDF) raster PDF.

* Restore editable PPTX export

* Clarify authored slide measurement

* Enable PPTX export from browser

* Stabilize large editable PPTX text

* Use workspace root for PPTX export resource

* Let CLI exports auto-detect decks

* Avoid tracking generated PPTX bundle

* Fix generic export deck routing

* Fix deck export routing regressions

* Add CLI page-mode export flag

* Preserve authored deck capture DOM

* Load PPTX vendor bundle from gzip resource

* Harden export CLI and PPTX bundle loading

* Preserve editable PPTX slide background images

* Preserve export render sizing contract

* Classify screenshot export request errors

* Preserve freeform slide deck exports

* Preserve UTF-8 export filenames

* Align export routing and CLI JSON contract

* Preserve export compatibility paths

* Keep PDF export on screenshot renderer

* chore(nix): refresh pnpm deps hash

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-29 13:15:14 +00:00
PerishFire
8b9b169673 ci: add cost-sensitive runner tiering (#4764)
* test persistent contabo runner poc

* test persistent runner web workspace poc

* test persistent runner local cache setup

* test persistent runner aggressive setup poc

* test persistent runner web postinstall profile

* test restore stable persistent runner poc

* test keep persistent runner web tests default

* test route persistent poc to zxiyun runner

* test relax web vitest timeout on persistent poc

* test harden web vitest cases for persistent runner

* test add runner perf probe

* test isolate runner perf probe dispatch

* test pin runner perf probe node

* test expand runner perf probe matrix

* ci: route heavy jobs through runner modes

* ci: keep web workspace tests on blacksmith

* test: update web runner mode assertion

* ci: route cost-sensitive runner tiers

* ci: move preflight to hosted runner

* ci: downshift blacksmith ui runners to 4vcpu

* test: fix file viewer timeout after merge

* ci: route e2e vitest to hosted default

* ci: centralize runner profile selection

* test: type runner profile assertions

* ci: route e2e vitest to default blacksmith

* ci: remove runner experiment probes

* test: restore postinstall guard coverage

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

* test: cover postinstall script contract

* test: move postinstall guard to e2e

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-25 08:55:51 +00:00
Tom Huang
29b138f7a3 feat(brands): turn any brand into a reusable design system (#4691)
* Implement brand management routes and CLI support

- Added `brand-routes.ts` to handle HTTP endpoints for brand operations: listing, extracting, retrieving details, deleting, and serving logos.
- Introduced `brands-cli-help.ts` for CLI commands related to brand management, including usage instructions for listing, creating, retrieving, and deleting brands.
- Updated `cli.ts` to integrate brand commands into the existing CLI structure, allowing headless management of brands via the command line.
- Created supporting files for brand metadata handling, including `design-md.ts` for rendering brand information in markdown format and `index.ts` for the brand engine API.
- Implemented `prefetch.ts` to fetch and process brand material from specified URLs, ensuring a streamlined extraction process.
- Enhanced server setup in `server.ts` to register brand routes and manage brand-related data effectively.

This commit establishes a comprehensive framework for managing brands within the application, facilitating both HTTP and CLI interactions.

* Enhance memory management and onboarding experience

- Introduced canonical profile labels to ensure consistent handling of user input in profile forms, preventing duplicate entries.
- Updated the `parseProfileBody` and `captureProfileFromForm` functions to utilize the new canonical label matching.
- Added a memory callout section in the onboarding view to highlight the benefits of memory usage, including personalized responses and reduced setup questions.
- Implemented new UI elements in the onboarding view to improve user engagement with memory features.
- Expanded i18n support for new onboarding messages related to memory benefits across multiple languages.

* Refactor onboarding flow and enhance design system integration

- Updated the onboarding process to include a new brand extraction step, replacing the previous newsletter step.
- Adjusted the tracking logic to reflect the new onboarding steps, ensuring accurate analytics for user progress.
- Improved the UI for the onboarding view, including new input fields for email collection during the brand extraction phase.
- Refined the EntryShell component to remove outdated comments and clarify the onboarding renderer's purpose.
- Enhanced CSS styles for the onboarding steps to improve layout and user experience.
- Updated internationalization strings across multiple languages to reflect changes in the onboarding flow and brand extraction messaging.

* Add brand management features and enhance font handling

- Introduced new modules for managing brand assets, including `chrome.ts` for headless Chrome operations and `fonts.ts` for self-hosting web fonts.
- Implemented `prefetch.ts` to streamline the brand material extraction process, allowing for efficient harvesting of colors, fonts, and logos.
- Enhanced the brand system with new schema definitions in `schema.ts` to support brand color and font management.
- Developed the `engine` module to integrate brand building and rendering processes, including token derivation and artifact generation.
- Improved the overall structure and organization of brand-related files for better maintainability and scalability.

* Enhance brand extraction and project management features

- Updated `brand-routes.ts` to include new dependencies for project management, allowing for the registration of brand-related projects.
- Modified the `extractBrand` function to support project ID and system files, improving the brand extraction process.
- Enhanced the CLI commands in `cli.ts` to handle project IDs during brand creation, enabling better tracking of brand projects.
- Updated the server setup in `server.ts` to register new project-related routes.
- Improved the UI components to display project information associated with brands, including buttons for opening projects in the `BrandDetailView` and `BrandsTab`.
- Added new metadata fields in the contracts to support project tracking and management for brands.

This commit establishes a more robust framework for managing brand projects, enhancing both backend and frontend functionalities.

* Enhance onboarding profile management and memory persistence

- Added new canonical profile labels for 'Organization size', 'Use cases', and 'Discovery source' to improve user input consistency.
- Introduced `OnboardingProfileState` type to manage onboarding profile data more effectively.
- Implemented functions to build and persist the onboarding profile body to memory, ensuring user selections are saved accurately.
- Updated the `OnboardingView` component to handle profile persistence during navigation and submission steps.
- Enhanced tests to verify that user selections are correctly persisted to the memory profile.

This commit improves the onboarding experience by ensuring that user inputs are consistently captured and stored, enhancing overall user engagement with the application.

* Reflow brand extraction into an agent-driven, live flow

Replace the deterministic SSE prefetch/preview/system pipeline with an
agent-driven extraction: POST /api/brands now reserves the brand and stands
up a backing project with the target site open in an in-app browser tab plus
a seeded prompt, so the agent measures, synthesizes brand.json incrementally,
and the user can clear anti-bot walls by hand. New /preview and /finalize
routes let the agent render the kit page live and register the resulting
user:<id> design system, so extracted brand facts persist as a structured,
reusable brand kit instead of a one-shot deterministic guess.

Adds the brand-extract skill (SKILL.md + brand-kit.html template), kit-render
engine, brand-extraction-engine tests, brand project covers in the Designs
tab, onboarding extract handoff, and the matching od brand extract/preview/
finalize CLI subcommands and contract updates.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Sediment finalized brands into structured memory

Reflow a finalized brand into the memory store (brandToMemoryEntries +
reflowBrandToMemory) so future chats can ground vague requests in the
brand's palette, type, voice and rules. finalizeBrand now wires through
the runtime dataDir and best-effort persists the brand, MemoryChangeEvent
gains a 'brand' source, and the brand kit render hardens its inline JSON
escaping. Adds brand.previewEmpty / brand.viewDetails across all locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Implement logo fallback and imagery support in brand extraction

- Introduced a deterministic logo fallback mechanism to ensure that brand extraction processes can retrieve and save site logos, even when the agent fails to do so.
- Enhanced the `startBrandExtraction` and `finalizeBrand` functions to utilize the new logo fallback, allowing for better handling of logo assets.
- Added support for imagery samples in brand validation, enabling the inclusion of representative images in the brand kit.
- Updated the brand kit rendering to include self-hosted fonts and imagery, improving the overall presentation of brand assets.

This commit strengthens the brand extraction workflow by ensuring that logos and imagery are reliably captured and displayed, enhancing the user experience in brand management.

* Enhance memory management with rule proposal and verification features

- Introduced new functionality for distilling annotations into rule proposals, allowing users to suggest rules based on in-canvas annotations through the `od memory rule suggest` command.
- Implemented a verification system that programmatically enforces compliance with active rules during artifact generation, ensuring that all active rules are covered in the self-verify scorecard.
- Added endpoints for managing verification outcomes, including listing, removing, and clearing verification records, enhancing the transparency of the verification process.
- Updated the memory management system to support the retrieval of active rule entries, ensuring that only linked rules are considered during verification.
- Enhanced tests for both rule proposal generation and verification processes to ensure reliability and correctness.

This commit strengthens the memory management capabilities by integrating rule proposals and verification, improving the overall user experience in managing design rules and ensuring compliance.

* Distill review annotations into memory and enforce self-verify scorecard

Add distillAnnotationsToMemory to mine inline preview comments/highlights/
marks into durable feedback + rule memory via a dedicated distiller prompt,
threaded through the existing extract pipeline with an 'annotation' change
source. Tighten the self-verify prompt (daemon + contracts) to state the
daemon programmatically checks the scorecard, so a missing or uncovered
scorecard on an artifact turn is an enforcement failure. Cover the rule
suggest and verification-history routes with tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Apply brand design system through web config on "Use in new chat"

Thread onApplyDesignSystem from the entry shell into BrandsTab so the brand's
registered design system is applied via the web config channel instead of a
bare daemon PATCH that left the Home composer stale. Add a transient
home-intent latch + event so the Brands tab can request the Prototype chip on
the already-mounted HomeView, which consumes it once the plugin catalog loads.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Wire annotation distillation into background memory extraction

Add a background distill pass that mines inline review annotations
(comments / highlights / drawn marks) from a turn into durable memory
alongside the general LLM extraction, surface an `annotation` memory
toast source in the web UI, and cover the flow with a unit test.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix brand design system not applying to composer on "Use in new chat"

Selecting a brand's "Use in new chat" applies the brand's design system as
the default and fires the Prototype chip intent in the same synchronous click
handler. HomeView consumed that intent inside the event listener, so `pickChip`
ran before React committed the config change and seeded the composer's
design-system field from the stale (empty) default — the composer showed
"No design system" instead of the brand until a reload.

Split the intent handling: the listener now only bumps a tick, and a separate
effect consumes the chip after the re-render lands, so the seeded design system
reflects the freshly-applied brand. Add the previously-untracked home-intent
latch test coverage.
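
A minimal sketch of a consume-once latch of the kind described above, not the actual home-intent module. `createIntentLatch`, `request`, `consume`, and `subscribe` are hypothetical names. It illustrates why deferring consumption helps: the listener only records that something happened, and the intent is read after the surrounding state has been updated.

```typescript
// Hypothetical consume-once latch; names are illustrative, not the real home-intent API.
type Listener = () => void;

function createIntentLatch<T>() {
  let pending: T | null = null;
  const listeners = new Set<Listener>();
  return {
    // Store the intent and notify subscribers that something is waiting.
    request(intent: T): void {
      pending = intent;
      listeners.forEach((l) => l());
    },
    // Return the pending intent once, then clear it.
    consume(): T | null {
      const intent = pending;
      pending = null;
      return intent;
    },
    subscribe(listener: Listener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// The listener only bumps a counter; consumption happens later,
// after the surrounding state (here: designSystem) has been updated.
const latch = createIntentLatch<string>();
let tick = 0;
latch.subscribe(() => {
  tick += 1;
});

let designSystem = "";          // stale default
latch.request("prototype");     // fires synchronously in the click handler
designSystem = "user:brand-1";  // the config change lands afterwards

const chip = latch.consume();   // deferred consumption sees the fresh value
const seeded = { chip, designSystem };
```

In a React component the deferred step would typically live in an effect keyed on the tick, which is the split the fix describes.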

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): rework Brands into Brand Kit and add Home create entry

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(brands): harvest real cover/hero images for the Images module

The brand kit's Images gallery only populated when the extraction agent
remembered to save imagery — so a forgetful or bot-blocked agent (and the
pre-imagery "Open Design" brand) left it empty. Add a deterministic,
server-side imagery fallback (imagery-fallback.ts), mirroring the logo
fallback: it parses og:image/twitter:image, large <img> (highest-res
srcset/<picture>), <link rel=preload as=image>, and CSS background-image
hero blocks, fetches candidates with browser-shaped headers, decodes
PNG/GIF/JPEG/WebP dimensions to keep only big representative images
(dropping icons/sprites/logos/tracking pixels), dedupes by content hash,
and saves up to 8 of the largest into imagery/ with labeled samples.
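
The filter, dedupe, and cap steps above can be sketched roughly as follows. This is an illustration, not the code in imagery-fallback.ts: the `ImageCandidate` shape, the `minSide` cutoff, and the FNV-1a hash (a small stand-in for a real content hash) are all assumptions.

```typescript
// Illustrative only: field names, thresholds, and the hash function are assumptions.
interface ImageCandidate {
  url: string;
  width: number;
  height: number;
  bytes: Uint8Array;
}

// Small non-cryptographic hash used here as a stand-in for a real content hash.
function fnv1a(bytes: Uint8Array): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < bytes.length; i++) {
    h ^= bytes[i];
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

function selectImagery(
  candidates: ImageCandidate[],
  minSide = 400, // assumed cutoff for "big representative" images
  limit = 8,
): ImageCandidate[] {
  const seen = new Set<string>();
  const kept: ImageCandidate[] = [];
  for (const c of candidates) {
    // Drop icons, sprites, and tracking pixels by decoded size.
    if (Math.min(c.width, c.height) < minSide) continue;
    // Skip identical bytes served from a different URL.
    const key = fnv1a(c.bytes);
    if (seen.has(key)) continue;
    seen.add(key);
    kept.push(c);
  }
  // Largest area first, then cap the result.
  return kept
    .sort((a, b) => b.width * b.height - a.width * a.height)
    .slice(0, limit);
}
```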

finalizeBrand runs it as a timeout-bounded, failure-tolerant safety net
(injectable so tests stay offline) when the agent captured too few
samples, first adopting any on-disk images. The extraction prompt and
brand-extract SKILL now explicitly direct the agent to harvest the site's
large/cover/hero images, filtered by rendered size.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(qa): implement deck layout validation and safety checks

Add a new QA module for validating the layout of generated brand decks to ensure robustness against clipping and truncation issues. The `analyseDeckLayout` function checks for critical layout invariants, including the presence of `.slide` sections, correct container types, and necessary runtime layers. Introduce `assertDeckLayoutSafe` to enforce these checks during brand system rebuilds, preventing the deployment of decks that fail validation. Additionally, create comprehensive tests to verify the functionality of the new layout validation features.

* fix(brands): apply deck shrink-to-fit synchronously so slides never clip

The no-clip runtime scheduled its fit pass through requestAnimationFrame,
whose callbacks are throttled while the deck is offscreen or occluded. A
slide could therefore stay unscaled — and clip its content — until first
paint. Fit synchronously on resize/load/fonts-ready with a trailing
setTimeout settle pass for late reflow, removing the rAF dependency.

Verified at the previously-broken 1024x620 viewport: container-type:size,
zero truncations, runtime auto-applies scale (Problem 0.71, ASK 0.87,
Product 0.97, Competition 0.97) and frame clip count drops to 0.
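
As a rough illustration of the shrink-to-fit idea, assuming sizes are already measured and passed in as plain numbers (the real runtime reads them from the DOM). `fitScale`, `fitNowAndSettle`, and the 120 ms settle delay are hypothetical.

```typescript
// Sketch only: names and the settle delay are assumptions, not the shipped runtime.
interface Size {
  width: number;
  height: number;
}

// Scale factor that makes `content` fit inside `frame` without ever enlarging it.
function fitScale(content: Size, frame: Size): number {
  if (content.width <= 0 || content.height <= 0) return 1;
  const scale = Math.min(
    frame.width / content.width,
    frame.height / content.height,
    1,
  );
  return Math.round(scale * 100) / 100;
}

// Run the fit pass immediately, then once more after a short delay for late reflow.
// No requestAnimationFrame is involved, so throttling of offscreen frames cannot delay it.
function fitNowAndSettle(fit: () => void, settleMs = 120): void {
  fit();
  setTimeout(fit, settleMs);
}
```

For example, content measuring 1024x873 inside a 1024x620 frame yields a scale of 0.71, while content that already fits stays at 1.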

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): let New Brand modal embed scrollable brand reference picker

Add a fillHeight mode to BrandReferencePicker so the heading, quick-pick
row and controls stay pinned while only the gallery scrolls inside a
bounded-height parent. Wire it into NewBrandModal with a stable, spacious
dialog and refresh the related newBrand/brandPicker copy across all 18
locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(brands): enhance brand extraction with deterministic seed harvesting

Introduce a new `seed-fallback` module to provide a server-side deterministic palette and typography seed during brand extraction. This ensures that the brand kit's initial display includes a harvested logo, an approximate color palette, and font families, improving the user experience by reducing the all-skeleton appearance during the first paint. Update the `startBrandExtraction` function to utilize this new module, allowing for a more seamless and visually appealing brand extraction process.

Additionally, enhance the `BrandReferencePicker` component to reflect loading states and errors during brand extraction, ensuring users receive immediate feedback on their actions. Update related tests to verify the idempotency of the `finalizeBrand` function, ensuring that re-finalizing a brand correctly reuses the existing design system without duplication.

* feat(brand-extract): enhance BrandReferencePicker and localization updates

Updated the BrandReferencePicker component to reflect loading states and errors during brand extraction, improving user feedback. Added a new localization key for the brand extraction process and updated existing translations in English, Simplified Chinese, and Traditional Chinese to enhance clarity and user experience. Additionally, introduced new styles for better interaction with brand assets in the brand kit template.

* feat(brands): wire in-page lightbox/masonry/asset preview + refine seed

Brand-kit preview improvements for the live extraction kit:
- brand-kit.html: add in-page overlay system (sandboxed iframe has no
  top-nav) — clickable image lightbox with prev/next, a "view all"
  masonry modal, and a full-page asset preview modal that loads
  system/artifacts/<kind>.html in an iframe. Defer auto-reload while an
  overlay is open so it never yanks the modal out mid-interaction.
- seed-fallback.ts: prefer vivid mid-luminance hues for the seeded
  accent/accent-secondary, and drop icon/symbol faces (Remix Icon etc.)
  from the typography seed so specimens never render glyph soup.

Co-authored-by: Cursor <cursoragent@cursor.com>

* i18n(web): add brandPicker.opening across remaining locales + picker test

Completes the brand-reference picker i18n key that was committed only for
en/zh-CN/zh-TW, so every locale satisfies the typed Dict, and lands the
BrandReferencePicker extraction-feedback test left untracked by the
concurrent worker.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(EntryShell): enhance AMR cloud card visibility post-detection

Updated the EntryShell component so the AMR cloud card remains visible after detection settles, even when the AMR runtime is unavailable. The card no longer disappears; it degrades gracefully to fallback content and the sign-in flow. Added tests to verify the new behavior.

* feat(library): OD Library asset registry + OD Clipper extension

Add a global, cross-project asset registry (OD Library) and a Chrome MV3
capture extension (OD Clipper), wiring the full HTTP + CLI + Web UI three-track
loop per specs/od-clipper.md.

- contracts: LibraryAsset/Source/Kind, ingest, search, pairing, task DTOs
- daemon: 6 additive SQLite tables, content-addressed owned storage, the
  idempotent registerLibraryAsset hook (hash dedup + append-source),
  programmatic enrichment (mime/size/image dims/domain/tags), pairing tokens
  with a persisted extension-origin allowlist, /api/library/* routes, and
  /api/tools/library/{search,apply} for in-task agent reuse
- cli: `od library list|get|rm|search|import|pair`
- web: Library tab (grid, source badges, filters, search, live SSE updates,
  extension pairing affordance)
- clipper/: standalone MV3 extension (background SW, content toolbar, popup)
- skills/library-curator: utility skill for agent-driven asset reuse

Origin middleware now honors paired chrome-extension:// origins (seeded from
SQLite on boot) and exempts the pairing-confirm handshake. Enrichment AI stages
(caption/OCR/embedding) are recorded as skipped pending a configured model.
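
A minimal in-memory sketch of the "hash dedup + append-source" behaviour, under the assumption that the caller has already computed the content hash. The daemon persists to SQLite; `AssetRegistry`, `AssetSource`, and `LibraryAssetRecord` here are hypothetical stand-ins, not the real contracts.

```typescript
// Hypothetical in-memory model; the real registry is backed by SQLite.
interface AssetSource {
  kind: string;
  ref: string;
}

interface LibraryAssetRecord {
  id: string;
  hash: string;
  sources: AssetSource[];
}

class AssetRegistry {
  private byHash = new Map<string, LibraryAssetRecord>();

  // Idempotent: the same content hash always maps to one record,
  // and a source that is already recorded is not appended twice.
  register(
    hash: string,
    source: AssetSource,
  ): { asset: LibraryAssetRecord; created: boolean } {
    const existing = this.byHash.get(hash);
    if (!existing) {
      const asset: LibraryAssetRecord = {
        id: `asset_${this.byHash.size + 1}`,
        hash,
        sources: [source],
      };
      this.byHash.set(hash, asset);
      return { asset, created: true };
    }
    const known = existing.sources.some(
      (s) => s.kind === source.kind && s.ref === source.ref,
    );
    if (!known) existing.sources.push(source);
    return { asset: existing, created: false };
  }
}
```

Registering the same bytes from a second place adds a source to the existing record rather than creating a new asset.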

* feat(brands): programmatic-first design system extraction + rename

Make brand extraction two-phase so a usable design system is ready the
moment the user enters a URL — the instant "aha" — instead of waiting on
the AI agent:

1. PROGRAMMATIC-FIRST (synchronous): startBrandExtraction now harvests the
   site deterministically (logo, palette, typography, one-line description,
   cover imagery, source URL) via prefetchBrand, synthesizes a valid design
   system with brandFromMaterial (no LLM), and finalizes + registers it
   before returning. finalizeBrand is refactored into a reusable
   finalizeBrandCore shared by both the programmatic path and the agent path.
2. ASYNC AI ENRICHMENT: the seeded agent prompt is reframed to enrich the
   already-usable design system and re-finalize in place (same user:<id>),
   updating every artifact/template.

Bounded + best-effort: a blocked/unreachable origin skips phase 1 and stays
`extracting` for the agent to drive. Gated on userDesignSystemsRoot so the
legacy agent-only path stays intact for tests.

Also rename the user-facing "Brand Kit" surface to "Design System" across
en + zh-CN strings, project names, and the enrichment prompt.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(library): enhance asset import and management features

- Updated the `import` command to allow multiple local files and remote URLs, with restrictions on supported formats.
- Added new commands: `apply` for copying assets into project design files, `edit-as-page` for converting HTML assets into editable projects, and `figma` for exporting Figma captures.
- Introduced sidecar functionality for storing derived data alongside owned assets, including Figma capture IR and element HTML.
- Enhanced server configuration to support larger ingest payloads for asset captures.
- Improved error handling and user feedback during asset import and application processes.

* feat(asset-management): enhance asset dropzone and introduce chat-to-design feature

- Updated the DesignSystemAssetDropzone component to improve file preview handling with new functions for creating and revoking object URLs.
- Adjusted CSS for better layout and spacing in the asset dropzone.
- Added a new "Chat to design" button in the LibrarySection component, allowing users to send selected assets to the Home chat composer for project creation.
- Updated localization strings across multiple languages to reflect changes in asset import terminology.
- Enhanced the HomeView component to handle asset staging from the chat composer.

* feat(library): enhance asset application with element markup support

- Updated the `applyLibraryAsset` function to include an `includeElement` option, allowing the capture of element markup alongside assets.
- Modified related components (e.g., `ChatComposer`, `LibrarySection`, `FileWorkspace`) to handle the new element markup feature, ensuring both asset paths and optional element paths are returned and processed.
- Introduced a new function, `fetchLibraryAssetElementHtml`, to retrieve the captured HTML for element-pick assets.
- Enhanced the UI to display element markup inline within the chat composer, improving user interaction with captured elements.
- Updated API contracts to reflect changes in asset application responses, including optional element markup paths.

* feat(library): enhance asset filtering and preview handling

- Updated the LibraryPicker and LibrarySection components to implement a badge-aware kind filter, allowing for more precise asset filtering based on badge kind.
- Introduced a new `matchesKindFilter` function to streamline the filtering logic across components.
- Enhanced the DesignSystemAssetDropzone to ensure proper handling of image previews, addressing issues with broken thumbnails under React StrictMode.
- Added CSS styles for kind badges to improve asset representation in the UI.
- Implemented tests for the DesignSystemAssetDropzone to ensure correct preview lifecycle management.

* feat(library): hydrate single asset on SSE ingest

Add fetchLibraryAsset(id) so the Library grid can merge just the one
asset an `ingest` SSE event references instead of refetching the whole
list on every capture. Returns null on miss/error.
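
One way the grid could merge a single hydrated asset, sketched under assumptions: `GridAsset` and `mergeAsset` are hypothetical, and prepending unseen assets is a guess at the ordering. A null result (miss or error) leaves the list unchanged.

```typescript
// Hypothetical shapes; only the "merge one asset, tolerate null" idea comes from the commit.
interface GridAsset {
  id: string;
  title: string;
}

// Replace the matching entry, or prepend the asset if it is new.
function mergeAsset(list: GridAsset[], fetched: GridAsset | null): GridAsset[] {
  if (fetched === null) return list;
  const index = list.findIndex((a) => a.id === fetched.id);
  if (index === -1) return [fetched, ...list];
  const next = list.slice();
  next[index] = fetched;
  return next;
}
```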

* feat(clipper): richer in-page image picker

Collect CSS background-image url()s in addition to <img> (so hero/section
art painted as backgrounds is no longer silently missed), defer thumbnail
decode to visible cells via IntersectionObserver, draw downscaled canvas
thumbnails instead of second full-res decodes, and add locate-on-page
highlighting so a picked image can be traced back to its DOM source.

* feat(library): implement lazy loading for thumbnails and enhance asset filtering

- Introduced a `LibraryThumb` component to lazily load heavy content (images, videos, iframes) only when they are near the viewport, improving performance.
- Added a debounced search feature to optimize asset filtering, reducing unnecessary network requests during rapid input.
- Enhanced the asset filtering logic to track active filters using a ref, ensuring efficient updates during live events.
- Updated the `snapshotCardRects` and `cardIdsInBand` functions to support improved hit-testing for drag-and-drop interactions.

* feat(library): lazy picker thumbnails + debounced search

Extend the Library grid's lazy-thumbnail + 250ms debounced-search pattern
to the composer LibraryPicker so opening it no longer fires one full-bytes
request per asset, and tidy the clipper content-script image collection.
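
A generic trailing-edge debounce, as a sketch of the 250ms pattern rather than the picker's actual code. The `debounce` helper and its `flush` method are illustrative names.

```typescript
// Generic trailing debounce; only the last call within the wait window runs.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let lastArgs: A | undefined;

  const run = (): void => {
    timer = undefined;
    if (lastArgs) {
      const args = lastArgs;
      lastArgs = undefined;
      fn(...args);
    }
  };

  const debounced = (...args: A): void => {
    lastArgs = args;
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(run, waitMs);
  };

  // Run any pending call immediately (useful on blur or in tests).
  debounced.flush = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    run();
  };

  return debounced;
}
```

Typing "b", "br", "brand" in quick succession through `debounce(search, 250)` issues a single search for "brand".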

* feat(clipper): compress and budget capture inlining

Re-encode large raster images to downscaled WebP and inline smallest-first
within a fixed budget, dropping only the secondary Figma IR past a safe body
size, so an image-heavy page (e.g. a news front page) always saves as an
editable HTML capture instead of 413-failing the ingest.
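
The smallest-first budgeting can be sketched as a greedy pass. `planInlining` and `CaptureImage` are hypothetical, and what happens to images that do not fit (here they are simply reported as skipped) is an assumption.

```typescript
// Illustrative greedy planner; names and the handling of skipped images are assumptions.
interface CaptureImage {
  url: string;
  bytes: number;
}

// Inline the smallest images first so as many as possible fit in the byte budget.
function planInlining(images: CaptureImage[], budgetBytes: number) {
  const inlined: string[] = [];
  const skipped: string[] = [];
  let used = 0;
  const bySize = images.slice().sort((a, b) => a.bytes - b.bytes);
  for (const img of bySize) {
    if (used + img.bytes <= budgetBytes) {
      inlined.push(img.url);
      used += img.bytes;
    } else {
      skipped.push(img.url); // does not fit in the remaining budget
    }
  }
  return { inlined, skipped, used };
}
```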

* test(library): LibraryPicker debounce + lazy-thumbnail coverage

Cover the composer picker's 250ms debounced search and its lazy <img>
mount (deferred until the card is in view), matching the grid's perf test.

* feat(design-system): enhance asset handling and UI for design systems

- Updated the CLI to support additional asset kinds, including 'design-system'.
- Enhanced the DesignSystemProvenance type to include source URLs, improving provenance tracking.
- Modified the design system generation jobs to correctly summarize source links and GitHub repositories.
- Updated UI components to reflect changes in asset handling, including new source link management in the DesignSystemFlow.
- Improved tests to cover new functionality for adding source links and ensuring proper handling of design system assets.

* refactor(library): rename 'design-system' to 'brand kit' and enhance thumbnail loading

- Updated labels and filters in Library components to replace 'design-system' with 'brand kit'.
- Introduced a shimmer skeleton for lazy-loaded thumbnails in the LibraryPicker to improve user experience during asset loading.
- Enhanced the PickerCard component for better performance by memoizing individual asset cards.
- Updated tests to ensure proper handling of brand kit assets and their visibility in the LibraryPicker.

* feat(clipper): implement internationalization for toolbar and popup

- Added i18n support to the clipper, enabling localization of UI elements and tooltips.
- Introduced a new i18n.js file to manage translations for various languages.
- Updated content.js and popup.js to utilize the i18n functions for dynamic text rendering.
- Enhanced accessibility by ensuring aria-labels and tooltips are also localized.
- Improved user experience by providing localized messages for actions and statuses.

* feat(clipper): enhance brand kit extraction and localization support

- Updated the brand kit extraction process to include improved handling of assets and localization for various UI elements.
- Added internationalization support for the brand kit feature, allowing for dynamic text rendering based on user locale.
- Enhanced the user experience by ensuring that all relevant messages and tooltips are localized.
- Updated tests to cover new localization features and ensure proper functionality of the brand kit extraction process.

* feat(clipper): enhance brand color derivation and update localization

- Introduced new functions for color manipulation, including linear interpolation and clamping, to improve brand color derivation.
- Updated the deriveBrandColors function to better map observed palettes to semantic roles, ensuring consistent brand representation.
- Revised localization strings in i18n.js to reflect changes from 'brand kit' to 'design system', enhancing clarity and user experience.
- Improved overall code organization and readability by refactoring existing functions and adding new utility methods.
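
For reference, linear interpolation and clamping over colors might look like the sketch below. These helpers (`clamp`, `lerp`, `parseHex`, `toHex`, `mix`) are illustrative, handle only six-digit hex strings, and are not the functions added to the clipper.

```typescript
// Illustrative helpers; only six-digit "#rrggbb" input is supported.
type Rgb = [number, number, number];

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Interpolate from a to b; t is clamped to [0, 1].
function lerp(a: number, b: number, t: number): number {
  return a + (b - a) * clamp(t, 0, 1);
}

function parseHex(hex: string): Rgb {
  const n = parseInt(hex.replace("#", ""), 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

function toHex(rgb: Rgb): string {
  const parts = rgb.map((c) => {
    const v = clamp(Math.round(c), 0, 255);
    return ("0" + v.toString(16)).slice(-2);
  });
  return "#" + parts.join("");
}

// Blend two colors channel by channel.
function mix(from: string, to: string, t: number): string {
  const a = parseHex(from);
  const b = parseHex(to);
  return toHex([lerp(a[0], b[0], t), lerp(a[1], b[1], t), lerp(a[2], b[2], t)]);
}
```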

* refactor(clipper): update terminology from 'brand kit' to 'design system'

- Replaced all instances of 'brand kit' with 'design system' across various components and localization files for consistency.
- Updated UI elements, tooltips, and documentation to reflect the new terminology.
- Enhanced user experience by ensuring clarity in the design system extraction process and related functionalities.
- Adjusted localization strings in multiple languages to align with the updated terminology.

* feat(clipper): enhance image fill handling and normalization

- Introduced functions to normalize image fills by converting non-PNG/JPEG formats (SVG, WebP, GIF, AVIF) to PNG before import, ensuring all images are properly rendered in Figma.
- Updated the UI to report the number of images converted and dropped during the import process, improving user feedback.
- Enhanced the overall image processing workflow to prevent silent failures when unsupported formats are encountered.
- Revised documentation to reflect the new image handling capabilities and supported formats.

* feat(clipper): enhance UI kit and busy state feedback

- Updated the UI kit to include new components such as inputs, selection, and overlays, improving the overall design system representation.
- Enhanced the busy state feedback during capture processes with localized messages and a step-by-step progress indicator, providing users with clearer status updates.
- Revised localization strings to support new UI elements and improve user experience across multiple languages.
- Improved documentation to reflect changes in the UI kit and busy state handling.

* fix(brands): restore design-systems nav entry + reconcile BrandsTab on re-activation

Address review feedback on PR #4260:

1. EntryNavRail dropped the only control that reached view==='design-systems'
   when Brands replaced it in the rail, so the still-rendered, still-routed
   design-system manager was reachable only by deep link (the
   entry-nav-design-systems e2e specs assert this). Restore a reachable rail
   entry (blocks icon, existing navDesignSystems key) alongside Brands.

2. BrandsTab only fetched once on mount, but EntryShell keeps sub-views
   mounted and toggles visibility, so a brand finishing extraction in its
   backing project never reconciled until a full reload. Refresh whenever the
   Brands view becomes active again, and poll while any brand is extracting
   (torn down once settled / when hidden).

Red spec: tests/components/BrandsTab.refresh.test.tsx (fails pre-fix:
fetchBrands called once, not twice).

* Update clipper/brand-capture.js

* fix(clipper): improve busy state handling and UI feedback

- Adjusted the spinner CSS to use flex properties for better layout control.
- Enhanced the reclampIfMoved function to preserve user position during busy state transitions.
- Added loading toast notifications for popup-launched captures to ensure progress visibility even when the on-page bar is hidden.

* feat(daemon): add kiwi-schema dependency and enhance Figma API integration

- Added kiwi-schema package to the daemon for improved schema handling.
- Updated FigmaApiNode interface and related functions to support shared functionality with the offline decoder, ensuring consistency in node processing.
- Refactored capture functions in clipper to maintain UI visibility during DOM/IR snapshots, enhancing user experience during capture operations.

* fix(web): surface missing backing projects

* fix(web): re-enable brand actions after use

* fix(daemon): serve brand logos from data roots

* fix(brands): reconcile failed extractions

* feat(daemon): implement offline Figma import and decoding functionality

- Added support for importing `.fig` files directly into the daemon, enabling offline processing without requiring a Figma account.
- Introduced a new `fig-decode.ts` module for decoding `.fig` files, handling both ZIP-wrapped and raw formats.
- Created `figma-import.ts` to orchestrate the import process, generating a canonical snapshot and associated metadata.
- Enhanced the server to handle Figma file uploads and integrate with the new decoding logic.
- Updated package dependencies to include `kiwi-schema`, `html2canvas`, and `jspdf` for improved functionality.
- Added tests for the new Figma import features to ensure reliability and correctness.
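
Telling a ZIP-wrapped file from a raw one can be done by checking the ZIP local-file-header signature (`PK\x03\x04`). The sketch below shows only that check; `detectFigContainer` is a hypothetical name, treating everything else as "raw" is an assumption, and the actual decoding in `fig-decode.ts` is not shown.

```typescript
// Sketch of container detection only; decoding is out of scope here.
type FigContainer = "zip" | "raw";

function detectFigContainer(bytes: Uint8Array): FigContainer {
  // ZIP archives start with the local file header signature 0x50 0x4B 0x03 0x04.
  const isZip =
    bytes.length >= 4 &&
    bytes[0] === 0x50 &&
    bytes[1] === 0x4b &&
    bytes[2] === 0x03 &&
    bytes[3] === 0x04;
  // Assumption: anything that is not a ZIP is treated as a raw payload.
  return isZip ? "zip" : "raw";
}
```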

* feat(clipper): reload-proof capture progress badge on the extension icon

The on-page progress strip dies if a page reloads itself mid-capture
(aggressive paywall sites like economist.com do this), leaving no
loading signal. Add a per-tab '•••' badge on the extension icon for the
lifetime of any capture message — it lives on the action icon, so a page
navigation can't take it down. Verified end-to-end via a real loaded
extension.

* feat(daemon): add export functionality for Figma and enhance PDF export process

- Introduced `runFigma` command for importing Figma designs, supporting both local `.fig` files and Figma URLs.
- Added detailed usage instructions for the `od figma import` command.
- Implemented `runExport` command for programmatic export of HTML/deck artifacts to PDF, PPTX, or image formats.
- Enhanced error handling and user feedback during export processes.
- Removed obsolete `build-pptx-export-prompt` module and related tests to streamline the codebase.

* feat(daemon): enhance library synchronization and export capabilities

- Implemented `reconcileLibrary` to mirror design systems and agent-produced project deliverables into the Library as referenced assets.
- Added support for programmatic export of artifacts via the `od export` command, including detailed usage instructions.
- Introduced new functions for handling Figma imports and exports, improving integration with design workflows.
- Enhanced error handling and user feedback during synchronization and export processes.
- Added tests for new features to ensure reliability and correctness.

* feat(web): PPTX export for any shareable artifact + Library toolbar tooltips

* chore(nix): refresh pnpm deps hash

* refactor(web): enhance onboarding view and file export progress indicators

- Updated the onboarding view layout for improved accessibility and visual hierarchy, including adjustments to spacing, typography, and button styles.
- Introduced a loading toast for file export progress, displaying elapsed time and estimated time remaining for slide exports.
- Added new translation keys for export progress messages in multiple languages.
- Refactored the export progress handling to provide real-time updates during the export process, improving user feedback and experience.

* refactor(web): streamline export capture bridge and update connector styles

- Removed unused loading logic for html2canvas in the export capture bridge, simplifying the code.
- Updated CSS for the onboarding view connector to improve visual clarity and ensure it does not overlap with node numbers.

* refactor(web): remove html2canvas dependency and enhance Figma URL handling

- Removed the html2canvas package from the project, including its references in the lock file and related components.
- Added functionality to manage Figma URLs within the Design System flow, allowing users to add, remove, and validate Figma file links.
- Improved drag-and-drop handling to prevent unintended file navigation when dropping files outside designated areas.
- Updated UI components to accommodate new Figma URL features, enhancing user experience and accessibility.

* refactor(web): unify brand and design system flows

- Merged the brand extraction process into the design system creation workflow, allowing users to start from a brand directly within the design system wizard.
- Updated routing to redirect legacy brand links to the unified design systems tab.
- Enhanced the onboarding experience by removing the separate Brand Kit tab and integrating brand selection into the design system creation process.
- Improved UI components to reflect these changes, ensuring a seamless user experience across the application.

* feat(web): introduce brand enrichment banner and picker modal

- Added a new BrandEnrichmentBanner component to allow users to refine programmatically-extracted design systems with AI by selecting design-system skills.
- Implemented a BrandPickerModal for selecting brands from a searchable gallery, enhancing the design system creation flow.
- Updated ChatPane to conditionally display the enrichment banner for eligible brand projects, improving user engagement.
- Enhanced the design system flow to support the new brand enrichment features, ensuring a seamless experience for users.

* feat(web): enhance BrandPickerModal and DesignSystemAssetDropzone

- Updated the BrandPickerModal to allow scrolling of the entire picker area, improving user experience by creating a unified scrolling surface.
- Added new props to the BrandReferencePicker for action labels and scroll root reference, enhancing flexibility in brand selection.
- Introduced a new DesignKitView component for rendering design kits consistently across different surfaces.
- Enhanced the DesignSystemAssetDropzone to support a wider variety of file types with appropriate previews, improving asset management during design system creation.
- Updated styles for better visual clarity and responsiveness across components.

* feat(web): update Design Systems tab actions and enhance localization

- Changed the button label in the DesignSystemsTab from "Edit" to "Open" for better clarity in user actions.
- Added a new translation key for 'dsManager.openSystem' across multiple languages to support the updated button label.
- Enhanced the FileWorkspace component to ensure the Design Files tab aligns correctly with the Design System tab, improving UI consistency.
- Implemented a new design system editing feature that allows users to fetch and save design system content from DESIGN.md, enhancing the design workflow.

* fix(merge): repair post-merge regressions after origin/main integration

Follow-up fixes on top of the origin/main merge (886f925cd) addressing
regressions the conflict resolution introduced. main's web suite is the
oracle (100% green); resolution principle was main's engine/backend +
HEAD's UI, unioned.

daemon:
- library-sync.ts: correct design-systems import to ./design-systems/index.js
  (design-systems became a directory module on main).
- tests/server-bootstrap-regression: add LIBRARY_DIR to the PathDeps fixture
  (main-added test x HEAD-added LIBRARY_DIR field).

web:
- WorkspaceTabsBar: union — restore main's onboarding Search-popover close
  behaviour + guards, keep HEAD's library/brands nav entries.
- HomeView: restore main's composer sending-state (await onSubmit, widen its
  return type to Promise<boolean>|boolean|void, pass submitting to HomeHero).
- MemorySection.test: take main's version to match main's two-loop memory
  component.
- i18n: restore dropped settings.onboardingRoleMarketing key across types.ts
  and all locales.
- App/BrandsTab/EntryNavRail/router/home-intent: union fixes restoring main
  features dropped during conflict resolution (needs_input handling, etc.).

Validation: pnpm guard + full pnpm typecheck (all 23 packages) green.
Known-red: EntryShell onboarding step 3 intentionally retains HEAD's "build"
step rather than main's brand-extract step; 8 EntryShell.onboarding /
App.onboarding-amr-e2e tests stay red pending that onboarding decision.

* fix(merge): keep HEAD's unified brand flow (revert main's separate Brands tab)

Follow-up to 688544ff7. Per the chosen product direction (brand creation
unified into the design-system create wizard, not a standalone Brands tab),
revert the brand-flow routing/nav that the post-merge repair had restored
from main:

- router.ts: keep HEAD's brand routing (brands folded into design-systems),
  drop main's standalone /brands and /brands/:id view routing.
- EntryNavRail.tsx: drop main's standalone "Brands" nav button.
- runtime/home-intent.ts: drop main's brand "Use in new chat" confirmation
  notice plumbing (tied to the separate Brands flow).

Kept from the repair commit (additive, non-conflicting): App.tsx
loadedActiveProject correctness, composer Sending… state, WorkspaceTabsBar
onboarding popover behaviour, two-loop memory test, restored i18n keys,
brand needs_input STATUS handling, server.ts plugin-route infrastructure.

* feat(library-ui): implement conditional rendering based on LIBRARY_UI_VISIBLE

- Updated router.ts to conditionally render the library view based on the LIBRARY_UI_VISIBLE flag.
- Modified ComposerPlusMenu.tsx, DesignFilesPanel.tsx, and DesignSystemAssetDropzone.tsx to show the "Select from library" button only when LIBRARY_UI_VISIBLE is true.
- Adjusted EntryNavRail.tsx and EntryShell.tsx to include the library navigation button and section conditionally based on the LIBRARY_UI_VISIBLE state.
- Enhanced HomeHero.tsx to allow starting a blank project directly, improving user experience by providing more options for project creation.

This commit introduces a feature toggle for the library UI, allowing for better control over its visibility during development and testing.

* feat(home-hero): implement edge auto-scroll for horizontal overflow

- Introduced `useEdgeAutoScroll` hook to manage auto-scrolling behavior for horizontally overflowing components in HomeHero.
- Updated `PluginPromptPresets` and `RailGroup` components to utilize the new auto-scroll functionality, enhancing accessibility for users without trackpads.
- Added `EdgeScrollZones` component to provide interactive edge zones for scrolling.
- Enhanced CSS styles to support the new scrolling layout and ensure proper positioning of elements.

This commit improves user experience by making overflow content more accessible and easier to navigate.

* feat(design-systems): add project creation from design system and enhance UI components

- Implemented `handleCreateProjectFromDesignSystem` function in `AppInner` to facilitate project creation directly from a selected design system.
- Updated `DesignKitView` to wrap the iframe in a span for better layout control.
- Refactored CSS for `BrandPreviewCard` and `DesignSystemsTab` to improve styling and responsiveness.
- Introduced a new `TemplatePicker` component in `HomeHero` for selecting project-type templates, enhancing user experience.
- Updated various components to support asynchronous handlers for design system actions, improving overall functionality.

This commit enhances the design system integration and user interface, making project creation more intuitive and accessible.

* feat(brand-routes): enhance brand reservation API and add DESIGN.md support

- Updated the POST /api/brands endpoint to accept optional fields: description and designMd, allowing for more flexible brand reservations.
- Modified validation to require either a URL or designMd for brand extraction.
- Introduced a new design-md-input module to handle parsing and validation of DESIGN.md content.
- Enhanced startBrandExtraction function to support processing of DESIGN.md, improving integration with design systems.
- Added utility functions for managing DESIGN.md input and output, streamlining the brand creation process.

This commit improves the brand extraction workflow by integrating DESIGN.md support, making it easier for users to create and manage brands.

* feat(chat-pane, design-kit-view): enhance chat functionality and design preview features

- Added `handleNextStepPromptAction` to `ChatPane` for setting draft prompts, improving user interaction.
- Introduced `nextStepVariant` to differentiate design system projects in `ChatPane`.
- Updated `DesignKitView` to include a button for previewing design kit covers, enhancing user experience.
- Implemented a modal for displaying design kit previews, allowing users to view content in a dedicated space.

These changes improve the chat interface and design kit interactions, making the application more intuitive and user-friendly.

* feat(brand-extraction): enhance DESIGN.md support and testing

- Added a new test case to validate brand extraction from DESIGN.md input without requiring a website.
- Implemented functionality to register brands directly from DESIGN.md, improving the brand creation workflow.
- Updated the `ChatPane` and `NextStepActions` components to handle design system-specific actions for projects, enhancing user experience.
- Enhanced localization files with new carousel hints and project brief options across multiple languages.

These changes streamline the brand extraction process and improve the overall functionality of the design system integration.

* feat(wireframe-examples): add annotated and greybox wireframe examples

- Introduced new wireframe examples for annotated and greybox styles, enhancing design system capabilities.
- Added HTML and JSON files for both wireframe types, providing templates for low-fidelity design mockups.
- Implemented SKILL.md documentation for each wireframe example, detailing usage and design specifications.

These additions improve the design toolkit, offering users more options for creating wireframes in various styles.

* feat(brand-extraction): refine Chrome fallback and enhance error handling

- Updated the Chrome fallback logic in the prefetch pipeline to clarify its purpose and usage as a diagnostic tool.
- Introduced environment variable checks to enable or disable system Chrome usage, improving control over the extraction process.
- Enhanced error messages in the DesignSystemCreationFlow component to provide clearer guidance on required inputs for creating a design system.
- Added regression tests to ensure that prompts do not instruct the agent to invoke a non-existent `brand-extract` skill, preventing potential failures during brand extraction.

These changes improve the robustness of the brand extraction process and enhance user experience by providing clearer instructions and error handling.

* feat(brand-extraction): enhance DESIGN.md input handling and introduce brand ready prompt

- Updated the BrandFromDesignMdInput interface so the optional description property explicitly allows undefined.
- Enhanced the brand extraction prompts to clarify the inline brand-extract workflow, preventing confusion during the extraction process.
- Added a new BrandReadyPrompt component to notify users when a design system is ready for preview, improving user experience.
- Introduced CSS styles for the BrandReadyPrompt to ensure a visually appealing and user-friendly interface.
- Updated localization files to support new strings related to the brand ready prompt across multiple languages.

These changes improve the clarity and usability of the brand extraction process, providing users with timely feedback and a more intuitive interface.

* feat(brand-extraction): improve design system focus handling and localization updates

- Refactored the handling of browser tabs in the brand extraction tests to ensure proper validation of tab states.
- Enhanced the AppInner component to refresh design systems alongside templates, ensuring users see the latest updates without page reloads.
- Introduced a pending focus state in the DesignSystemsTab to manage design system selection more effectively after brand extraction.
- Added a BrandReadyPrompt in the ProjectView to notify users when a design system is ready for preview, improving user engagement.
- Updated localization files for Chinese (Simplified and Traditional) to reflect changes in terminology related to design systems.

These changes enhance the user experience by providing timely feedback and ensuring that the design system selection process is seamless and intuitive.

* fix(styles): adjust letter-spacing and enhance plus-menu trigger styles

- Set letter-spacing to 0 in design-system-flow.css for improved text clarity.
- Added styles for plus-menu trigger in plus-menu.css, including background, border, and hover effects to enhance user interaction and visual consistency.

These changes refine the design aesthetics and improve the usability of the plus-menu component.

* feat(tests): add design-system focus handoff tests

- Introduced a new test suite for validating the design-system focus handoff functionality.
- Implemented tests to ensure that the focus ID is correctly set, read, and cleared from session storage, preventing user selection hijacking.
- Added checks for scenarios where no focus ID is pending, enhancing test coverage for the design system's behavior.

These tests ensure the reliability of the design-system focus handling, contributing to a more robust user experience.
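
The set / read / clear cycle these tests cover could look roughly like the sketch below. It is illustrative only: the storage key, the function names, and the in-memory store are assumptions, not the identifiers used in the repository.

```typescript
// Hypothetical sketch of a one-shot focus handoff. All names are assumed.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const PENDING_FOCUS_KEY = "od:pending-design-system-focus"; // assumed key name

// Record which design system should be focused on the next visit.
function setPendingFocus(store: KeyValueStore, id: string): void {
  store.setItem(PENDING_FOCUS_KEY, id);
}

// Read the pending ID once and clear it, so a stale value cannot
// override whatever the user selects afterwards.
function consumePendingFocus(store: KeyValueStore): string | null {
  const id = store.getItem(PENDING_FOCUS_KEY);
  if (id !== null) store.removeItem(PENDING_FOCUS_KEY);
  return id;
}

// In-memory stand-in for sessionStorage, useful in unit tests.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}
```

Consuming the value on first read is one way to keep the handoff from hijacking a later user selection; in the browser the store argument would be `sessionStorage`.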

* feat(export): restrict image format options to PNG and JPEG

- Updated the image format options in the export functionality to only allow PNG and JPEG, removing WebP to prevent silent downgrades.
- Enhanced error handling to provide clear feedback when an unsupported image format is specified.
- Adjusted related documentation and comments to reflect the changes in supported formats across the application.

These changes ensure consistency in image export behavior and improve user experience by providing immediate validation errors for unsupported formats.
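
A minimal sketch of this kind of up-front validation, assuming a hypothetical `parseImageFormat` helper and error wording (neither is taken from the codebase):

```typescript
// Hypothetical sketch: reject unsupported formats instead of downgrading silently.
const SUPPORTED_IMAGE_FORMATS = ["png", "jpeg"] as const;
type ImageFormat = (typeof SUPPORTED_IMAGE_FORMATS)[number];

function parseImageFormat(input: string): ImageFormat {
  const normalized = input.trim().toLowerCase();
  const match = SUPPORTED_IMAGE_FORMATS.find((format) => format === normalized);
  if (match === undefined) {
    // Fail immediately so the caller sees a validation error, not a different format.
    throw new Error(
      `Unsupported image format "${input}"; expected one of: ${SUPPORTED_IMAGE_FORMATS.join(", ")}`,
    );
  }
  return match;
}
```

With this shape, a request for `webp` raises an error at the boundary rather than quietly producing a PNG.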

* feat(origin-validation): implement zero-config OD Clipper bypass for library requests

- Added a new function `isZeroConfigClipperLibraryRequest` to validate requests from locally-installed browser extensions targeting the `/library/` path.
- Updated the origin validation middleware to utilize this function, allowing unpaired browser extensions to access the `/api/library/ingest` endpoint while blocking other cross-origin requests.
- Enhanced tests to cover the new bypass functionality, ensuring correct behavior for both valid and invalid origins.

These changes improve the integration of browser extensions with the local daemon, enhancing user experience while maintaining security.
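
The narrow shape of such a bypass can be sketched as follows. This is not the actual `isZeroConfigClipperLibraryRequest` implementation: the function name, the extension origin schemes, and the path prefix are assumptions chosen for illustration.

```typescript
// Hypothetical sketch of a narrowly scoped origin bypass. All names are assumed.
interface RequestLike {
  origin: string | undefined;
  path: string;
}

const EXTENSION_ORIGIN_PREFIXES = ["chrome-extension://", "moz-extension://"]; // assumed schemes
const LIBRARY_PATH_PREFIX = "/api/library/"; // assumed from the ingest endpoint named above

function isExtensionLibraryRequest(req: RequestLike): boolean {
  const origin = req.origin;
  if (origin === undefined) return false;
  const fromExtension = EXTENSION_ORIGIN_PREFIXES.some((prefix) => origin.startsWith(prefix));
  // Both conditions must hold: an extension origin AND a library path.
  return fromExtension && req.path.startsWith(LIBRARY_PATH_PREFIX);
}
```

Requiring both conditions keeps every other cross-origin request, and every non-library route, on the normal validation path.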

* feat(design-systems): add download functionality for design systems

- Implemented a new command `od design-systems download <id>` to allow users to download design systems as a .zip file, including all system files and a generated SKILLS.md usage guide.
- Updated the CLI help documentation to include usage instructions for the new download command.
- Enhanced the design systems API to support the download feature, ensuring only user design systems are accessible while handling errors for non-existent presets.
- Added localization strings for the new download functionality across multiple languages.

These changes enhance the usability of design systems by providing a straightforward method for users to obtain and share their design assets.

* feat(design-systems): enhance design system management and localization

- Introduced new UI components and styles for managing design systems, including buttons for downloading, refreshing, and resetting edits.
- Updated the DesignKitView to support direct actions for DESIGN.md editing, improving user interaction with design systems.
- Enhanced the DesignSystemDetail component to include download functionality and improved state management for design system edits.
- Added localization strings for new features, ensuring consistent user experience across multiple languages.
- Improved error handling and user feedback for design system operations, including download failures.

These changes streamline the design system management process, making it more intuitive and user-friendly while ensuring robust localization support.

* feat(tests): add comprehensive tests for design system archive functionality

- Introduced a new test suite for validating the `buildUserDesignSystemArchive` and `buildDesignSystemSkillsMarkdown` functions.
- Implemented tests to ensure correct packing of design system files, including the generation of a `SKILLS.md` guide and exclusion of internal metadata.
- Added checks for handling non-user IDs and scenarios where a design system already includes its own `SKILLS.md`.
- Enhanced the overall test coverage for design system functionalities, ensuring reliability and correctness in the design system archive process.

These changes improve the robustness of the design system features by ensuring thorough testing of critical functionalities.

* feat(figma-import): enhance CLI output and add Figma import endpoint

- Updated the CLI to conditionally log detailed import information based on the `--json` flag, improving usability for users who prefer JSON output.
- Introduced a new API endpoint for importing Figma files, handling file uploads and validating project existence, with appropriate error responses for missing files or invalid URLs.
- Added a dedicated route for the Figma import functionality, ensuring seamless integration with existing project workflows.

These changes improve the Figma import experience by providing clearer output options and robust error handling, enhancing overall user interaction with the CLI and API.

* feat(design-files): enhance DesignFilesPanel with new actions and styles

- Added new action buttons for opening a browser and creating a design system in the DesignFilesPanel, improving user interaction in the empty state.
- Updated styles for action buttons to enhance visual distinction and usability.
- Enhanced tests to verify the functionality of new actions in the DesignFilesPanel, ensuring they trigger correctly.

These changes improve the user experience by providing additional functionality and clearer visual cues in the design files interface.

* fix(ci): restore new project modal flow

* fix(ci): align design kit and onboarding checks

* fix(ci): sync bake preview workflow action

* fix(ci): include plugin preview helper scripts

* fix(review): harden brand source and preview flows

* fix(ci): stabilize web workspace tests

* fix(review): address latest blocking feedback

* chore(ci): retrigger validation after label update

* chore: re-trigger CI on updated main — needs-validation gate moved to merge_group (#4714)

* refactor(lightbox): implement portal for overlays to resolve z-index issues

- Updated the lightbox component to use React's createPortal for rendering overlays directly to the <body>, ensuring proper z-index stacking.
- Removed session mode toggle from HomeHero and adjusted related styles and tests accordingly.
- Cleaned up CSS by removing unused styles related to session mode toggle.
- Updated tests to reflect changes in the HomeHero component and its interaction with the design router.

* style(home-hero): remove focus halo from template search input

- Updated CSS to eliminate the global input focus outline and box-shadow for the template search field in the HomeHero component.
- Added a test to verify that the template picker search field maintains a clean appearance when focused.

* feat(design-system): add create design CTA and enhance design kit functionality

- Introduced a new `DesignSystemCreateCta` component to facilitate creating new designs from an active design system, enhancing user experience in the chat interface.
- Updated `ChatPane` to include the new CTA, allowing users to create designs directly from the chat.
- Enhanced `DesignKitView` with sticky header functionality for better accessibility while scrolling.
- Added new CSS styles for the `DesignSystemCreateCta` component to ensure a visually appealing and consistent design.
- Updated internationalization files to include new strings for the design system creation feature.

* feat(upload): enhance file upload handling and error recovery

- Introduced `sanitizePath` to preserve directory structures during file uploads, preventing issues with subdirectory paths.
- Updated `DesignKitView` and related components to utilize the new `sanitizePath` function for improved file name resolution.
- Added `KitErrorBoundary` component to gracefully handle rendering errors in the design kit, providing a user-friendly fallback.
- Implemented internationalization updates for new error messages and action confirmations related to uploads and error handling.
- Enhanced CSS styles for better visual feedback during error states and improved user experience.
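
One way to keep subdirectories while dropping unsafe segments is sketched below. The helper name `sanitizeUploadPath` and its exact rules are assumptions; the real `sanitizePath` may behave differently.

```typescript
// Hypothetical sketch: normalize an uploaded file's relative path while
// preserving its directory structure.
function sanitizeUploadPath(raw: string): string {
  return raw
    .split(/[\\/]+/) // accept both POSIX and Windows separators
    .filter((segment) => segment !== "" && segment !== "." && segment !== "..")
    .join("/");
}
```

Under these assumed rules, `assets/icons/logo.svg` is returned unchanged, while `..` segments are dropped so the result cannot climb out of the upload root.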

* feat(design-kit): add keyboard shortcuts hint and enhance key handling

- Introduced a new keyboard shortcuts hint in the DesignKitView, providing users with quick access to essential actions (E edit, C copy, U upload, R refresh, ⌫ delete logo).
- Implemented a keydown event handler to manage keyboard shortcuts contextually within the design kit, improving user interaction and accessibility.
- Updated CSS for the shortcuts hint to ensure it remains low-contrast until hovered, enhancing the UI experience.
- Added internationalization support for new shortcut labels and hints across multiple languages.
- Adjusted DesignSystemsTab to prefer user logos for their systems, improving visual consistency.

* feat(design-system): introduce DesignSystemExtractionPanel and enhance design system interactions

- Added the `DesignSystemExtractionPanel` component to facilitate user interactions during design system extraction, providing a synthesized conversation view and next steps.
- Updated `ChatPane` to render the new extraction panel when a design system is active, enhancing user guidance.
- Introduced a new utility function `designSystemExtractionSource` to derive human-readable labels for design system sources.
- Enhanced internationalization support with new strings for extraction-related actions and prompts across multiple languages.
- Updated various components and tests to reflect changes in terminology and functionality, improving overall user experience.

* feat(project): add project deletion functionality and enhance design system interactions

- Introduced `onDeleteProject` prop in `ProjectView` to handle project deletion, improving project management capabilities.
- Updated `AppInner` to include the new delete project handler, enhancing user experience in project interactions.
- Enhanced `DesignKitView` and `DesignSystemsTab` with loading states and improved visual feedback during design system resolution.
- Removed deprecated `DesignSystemCreateCta` component and associated styles, streamlining the codebase.
- Updated internationalization files to reflect changes in project management terminology and actions.

* feat(design-kit): enhance internationalization and user feedback in DesignKitView

- Updated various labels and error messages in the DesignKitView to utilize internationalization functions, improving accessibility and user experience.
- Enhanced color input validation messages and added confirmation prompts for design system deletions in DesignSystemsTab and FileWorkspace.
- Introduced new props for handling design system project deletions, streamlining project management.
- Updated internationalization files to reflect new strings and translations for improved user guidance across multiple languages.

* refactor(design-kit): remove keyboard shortcuts hint and streamline header menu

- Eliminated the keyboard shortcuts hint from the DesignKitView, simplifying the header menu.
- Updated the sticky-header overflow menu to exclude upload, full-system preview, and shortcut help actions, focusing on essential project operations.
- Adjusted related tests to reflect the removal of the shortcuts hint and ensure accurate menu item visibility.

* feat(brand-routes): add extract-from-html endpoint for brand extraction

- Introduced a new POST endpoint `/api/brands/:id/extract-from-html` to re-run brand extraction using HTML rendered from the in-app browser after clearing anti-bot walls.
- Implemented error handling for missing HTML and brand not found scenarios.
- Enhanced the `extractBrandFromHtml` function to process the provided HTML and optional CSS, integrating it into the existing brand extraction workflow.
- Updated `prefetch` functionality to support extraction from pre-rendered HTML, improving the overall brand data retrieval process.

* chore(nix): refresh pnpm deps hash

* feat(brand-cli): add extract-from-html command for brand extraction

- Introduced a new CLI command `od brand extract-from-html` to facilitate brand extraction from pre-captured rendered HTML, allowing users to bypass anti-bot walls.
- Enhanced the command to accept optional CSS and base URL parameters, improving flexibility in extraction scenarios.
- Implemented error handling for missing HTML input and invalid brand IDs, ensuring robust user feedback.
- Updated the `BRAND_USAGE` documentation to reflect the new command and its usage details.
- Adjusted server configuration to accommodate larger payloads for the new extraction endpoint.

* feat(design-system): enhance design system extraction and browser tools

- Added a new script to collect CSS styles from rendered pages, improving brand extraction capabilities by capturing computed styles from cross-origin stylesheets.
- Removed the `DesignSystemExtractionPanel` component and its associated styles, streamlining the codebase.
- Updated `ProjectView` and `FileWorkspace` components to enhance design system interactions and improve user experience.
- Introduced new internationalization strings for design system phases and actions, ensuring better user guidance across multiple languages.

* feat(brand-assist): implement browser assist for brand extraction

- Added support for a client-side confirmation mechanism for the brand-browser-assist od-card, allowing users to extract brand information from the unblocked browser DOM.
- Enhanced the `ProjectView`, `ChatPane`, and `AssistantMessage` components to handle the new assist functionality, improving user interaction during brand extraction.
- Introduced new internationalization strings for browser assist prompts and messages, ensuring clarity and guidance across multiple languages.
- Updated the `useBrandReadyPrompt` hook to manage the state of the browser assist, providing a seamless user experience when dealing with anti-bot walls.

* feat(brand-prompt): enhance BrandReadyPrompt with refinement options

- Updated the BrandReadyPrompt component to include options for AI optimization and manual editing, allowing users to refine extracted brand systems.
- Added a new refinement nudge to inform users that automatic extraction may miss details, improving user guidance.
- Adjusted styles for the prompt and dismiss button for better alignment and visual consistency.
- Introduced new internationalization strings for the refinement features, ensuring clarity across multiple languages.
- Removed deprecated PPTX export functionality from the FileViewer component, streamlining the export options.

* refactor(export): remove PPTX export functionality and streamline export options

- Eliminated PPTX export support across various components, including CLI, desktop, and web, to simplify export formats.
- Updated documentation and help messages to reflect the removal of PPTX, ensuring clarity for users.
- Adjusted export-related types and constants to focus on PDF and image formats only, enhancing code maintainability.
- Improved user experience by refining export options and related UI elements.

* refactor(export): remove PPTX references and update export functionality

- Removed all instances of PPTX export functionality from the codebase, including related dependencies and comments.
- Updated export options to focus solely on PDF and image formats, enhancing clarity and maintainability.
- Adjusted UI components and tests to reflect the removal of PPTX, ensuring a streamlined user experience.
- Improved internationalization strings and documentation to align with the new export capabilities.

* chore(nix): refresh pnpm deps hash

* fix(onboarding): preserve selected runtime

* fix(brand): localize generated kit copy

* fix(onboarding): align first-run flow with main

* fix(nav): use palette icon for design systems

* fix(analytics): use design system onboarding step

* fix(ui): remove design system guide toggle

* fix(ui): position design system ready prompt

* fix(ui): space plugin task notice

* fix(web): restore home ask mode and design kit preview

* test(e2e): align onboarding visual capture

* test(e2e): align amr onboarding checks

* fix(brand): remove blocked reference brands

* feat(onboarding): show profile choices as chips

* fix(home): prefer design system cover art in recents

* test(e2e): select onboarding profile chips

* feat(brand-extraction): implement programmatic extraction transcript and UI enhancements for design systems

* feat(brand-extraction): enhance programmatic extraction with transcript agent support and UI improvements

* feat(brand-extraction): add transcript agent resolution and improve message handling in brand extraction

* fix(design-systems): stabilize loading state coverage

* test(e2e): align design system detail visual

* fix(brand-extraction): backfill programmatic transcripts

* fix(web): refresh ready brand design systems

* fix(brands): stabilize extraction handoff and seed colors

* fix(brands): return extraction transcript immediately

* fix(web): open new project modal from entry rail

* fix(editing): expose content edits for plain targets

* feat(file-viewer): implement manual edit draft dirty state tracking and reset logic

* feat(design-system): enhance project creation flow with conversation ID handling

* feat(brands): implement light theme handling for color extraction and seed generation

* feat(brands): add finalizeBrandProject function for brand project completion

* feat(file-workspace): add designSystemBrandId prop and update DesignSystemProjectPanel to use it

* Fix manual editing for brand kits

* fix(design-system): wait for project refreshes

* fix(web): open new project modal from rail

* fix(web): restore home ask mode toggle

* fix(web): sync brand color edits with seeds

* fix(web): stabilize design system workspace tests

* test(tools-pack): relax Windows resource cache timeout

* chore(pr): retrigger review after validation

* fix(web): surface design kit action progress

* fix(web): clarify brand next-step actions

* fix(web): cancel programmatic brand extraction

* fix(web): add design systems tab action feedback

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: xne998808-ai <xne998808@gmail.com>
Co-authored-by: PerishCode <perishcode@gmail.com>
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: lefarcen <935902669@qq.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-06-25 03:56:14 +00:00
lefarcen
f4fe5ad757 feat(ci): implement the plugin-preview bake pipeline (spec 4548) (#4700)
* feat(ci): bake pipeline slice 1 — previews-diff guard + single rolling PR

Per specs/change/20260618-plugin-preview-bake-pipeline/spec.md rollout step 1
(the smallest change that stops the bleeding on the stacked bake-PR backlog).

- scripts/plugin-previews-diff.mjs: decide whether a manifest's `previews`
  subtree changed, ignoring the per-run `generatedAt` timestamp. node:test
  coverage in plugin-previews-diff.test.mjs (the #4261 timestamp-only noise
  case, entry change, add/remove, key-order stability).
- bake-plugin-previews.yml: open a review PR only when `previews` actually
  changed (was: whole-file `git diff` that fired every run because of
  generatedAt), and reuse ONE rolling branch (chore/plugin-previews) /
  force-update the open PR in place instead of stacking
  chore/plugin-previews-<run_id> per run.
- guard.ts: allowlist the two new CI-only .mjs scripts.

Deferred to later slices (same spec): pre-merge same-repo coupling job,
release-cut full bake, tag-union GC, directory-layered artifact keys.
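
The guard's core idea, comparing only the `previews` subtree with a key-order-stable serialization, can be sketched like this. The manifest shape and all function names are assumptions for illustration; this is not the contents of plugin-previews-diff.mjs.

```typescript
// Hypothetical sketch: detect real `previews` changes, ignoring `generatedAt`
// and object key order.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted so key order never affects the comparison.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  const keys = Object.keys(value).sort();
  const parts = keys.map((key) => `${JSON.stringify(key)}:${stableStringify(value[key] ?? null)}`);
  return `{${parts.join(",")}}`;
}

interface Manifest {
  generatedAt?: string; // per-run timestamp, deliberately never compared
  previews?: Json;
}

function previewsChanged(before: Manifest, after: Manifest): boolean {
  return stableStringify(before.previews ?? null) !== stableStringify(after.previews ?? null);
}
```

Because `generatedAt` is never read, a run that only refreshes the timestamp compares as unchanged and does not open a review PR.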

* feat(ci): bake pipeline slices 2-4 — pre-merge coupling, release full bake, GC, directory keys

Completes the spec (specs/change/20260618-plugin-preview-bake-pipeline) on top
of slice 1, all in one PR (GC ships dry-run so staged ENABLEMENT still holds):

- .github/actions/bake-previews: composite action for the shared render + R2
  upload core, so the three bake workflows stop duplicating it.
- bake-plugin-previews.yml (post-merge): refactored onto the composite; role is
  now uploader + fork path + nightly backstop (rolling PR unchanged).
- bake-plugin-previews-pr.yml (slice 2): pre-merge bake for SAME-REPO PRs —
  renders, uploads, and commits the manifest INTO the author's branch so it
  rides with the code change. Loop guard checks the head COMMIT author
  (git log -1 %ae of head.sha, fetched via full checkout), not head.user.login,
  plus a no-op previews-diff guard. Forks fall through to post-merge.
- bake-plugin-previews-release.yml (slice 3): release-cut full bake committing
  the authoritative manifest onto release/**, with paths-ignore + bot-author
  loop guards.
- scripts/bake-plugin-previews.mjs (slice 4a): directory-layered, content-
  addressed keys <id>/<fingerprint>/preview.mp4 (+ poster.jpg), prefix-relative
  in the manifest. Daemon consumer already resolves base+key / path.join(dir,key)
  so no consumer change; reused flat entries are left untouched (additive).
- scripts/plugin-previews-gc.mjs (slice 4b) + .github/workflows/
  bake-plugin-previews-gc.yml: weekly R2 GC. Protected set = keys referenced by
  every tag + live release/** HEAD + main; deletes orphans older than a 90d
  grace window. DRY-RUN by default (needs --delete AND GC_ENABLE_DELETE=1).
  Pure protected-set/orphan logic covered by node:test.
- guard.ts: allowlist the new CI-only .mjs files.

* fix(ci): capture diff-guard result on its own line so a helper error fails the step

Per review: `if [ "$(node plugin-previews-diff.mjs ...)" != changed ]` swallows
the helper's exit 2 (bad args / unreadable manifest) inside command substitution,
so an error reads as empty string → 'unchanged' branch → the manifest PR/commit
is silently skipped despite a successful bake. Capture into diff_result on its
own line (so `set -e` aborts on a helper error) and `case` on the value, treating
unexpected output as a workflow failure. Applied to all three bake workflows.

* fix(ci): satisfy actionlint — quote gc description colon + route PR/dispatch context through env

- bake-plugin-previews-gc.yml: quote the `delete` input description (the
  '(default: ...)' colon broke YAML parsing) and pass dispatch inputs via env
  (GC_DELETE/GC_GRACE_DAYS) instead of interpolating into the run body.
- bake-plugin-previews-pr.yml: route head.sha/head.ref through HEAD_SHA/HEAD_REF
  env vars to avoid the script-injection lint on untrusted PR context.

* fix(ci): hard-gate the release bake job to release/** branches

workflow_dispatch can fire from any ref; the commit step pushes the authoritative
manifest to the triggering ref with contents:write, so a dispatch on main would
write straight to main and bypass the release back-merge. Add a job-level
`if: startsWith(github.ref, 'refs/heads/release/')` guard.

* fix(ci): GC fails closed on partial protected-ref data

Per review (non-blocking but real once deletion is armed): the GC workflow's
protected-ref fetches ended with '|| true', so a transient fetch failure could
leave the tag/release/main protected set incomplete and, with GC_ENABLE_DELETE=1,
prune clips a live release/main still references. Drop the '|| true' (fail the
job if the protected refs can't be fetched), and add a script-side guard that
refuses to delete when the protected set is empty or origin/main's manifest is
unreadable.
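
The fail-closed orphan selection can be sketched as a pure function, as below. The types, names, and option shape are assumptions, and only the empty-protected-set guard is modelled (the manifest-readability check is omitted); this is not the code in plugin-previews-gc.mjs.

```typescript
// Hypothetical sketch of fail-closed GC planning. All names are assumed.
interface StoredObject {
  key: string;
  lastModifiedMs: number;
}

interface GcPlan {
  orphans: string[];
  willDelete: boolean;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function planGc(
  objects: StoredObject[],
  protectedKeys: ReadonlySet<string>,
  options: { nowMs: number; graceDays: number; deleteFlag: boolean; enableDeleteEnv: boolean },
): GcPlan {
  // Fail closed: an empty protected set most likely means a fetch failed,
  // not that every stored object is unreferenced.
  if (protectedKeys.size === 0) {
    throw new Error("protected set is empty; refusing to plan deletions");
  }
  const cutoffMs = options.nowMs - options.graceDays * DAY_MS;
  const orphans = objects
    .filter((object) => !protectedKeys.has(object.key) && object.lastModifiedMs < cutoffMs)
    .map((object) => object.key);
  // Deletion needs both the CLI flag and the environment switch; otherwise dry-run.
  return { orphans, willDelete: options.deleteFlag && options.enableDeleteEnv };
}
```

Keeping the selection pure makes the protected-set and grace-window rules easy to unit test without touching storage.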
2026-06-23 15:39:16 +00:00
PerishFire
0bf1b6d6b8 [codex] converge release workflows and stable dry-runs (#4390)
* fix(tools-pack): use junctions for Windows standalone peer deps

* fix(desktop): expose IPC during startup

* fix(tools-pack): preserve Windows inspect diagnostics

* fix(tools-pack): report Windows inspect status errors

* fix(packaged): use Electron net fetch for app protocol

* fix(packaged): load Windows renderer from web sidecar

* fix(desktop): show Windows packaged window during startup

* fix(packaged): disable Windows GPU startup

* fix(tools-pack): keep Windows core smoke observable

* fix(packaged): remove Windows startup probes

* fix(tools-pack): trace Windows desktop IPC status

* fix(tools-pack): add Windows IPC diagnose loop

* fix(release): default beta-s Windows updater feed

* chore: clean merged test eof

* refactor(release): unify prerelease channel model

* chore(release): close prerelease doc escape hatches

* refactor(release): converge release channel workflows

* fix(release): install toolchain in metadata jobs

* fix(release): build release package before contracts

* chore(release): bump development version to 0.10.1

* fix(e2e): seed windows packaged smoke runtime config

* fix(release): install toolchain for metadata publish

* fix(release): materialize betas metadata checkout

* chore(release): bump development version to 0.10.2

* fix(release): allow betas metadata cold start from s3

* fix(e2e): support betas packaged update scenarios

* fix(release): pass betas channel into packaged smoke

* fix(release): set betas channel during self-hosted builds

* fix(release): verify counted channel reservations

* fix(release): use pnpm cmd for betas windows publish

* fix(release): add betas manifest artifact fallback

* fix(release): skip beta-s public metadata fetch

* fix(release): read beta-s manifests from storage

* fix(release): cache beta windows tools-pack builds

* fix(release): inline beta mac tools-pack builds

* fix(pack): deep sign unsigned mac bundles

* docs(pack): document payload-first beta updater validation

* fix(release): align preview tools-pack cache flow

* fix(release): align prerelease tools-pack cache flow

* fix(release): pass github token to prerelease metadata

* fix(release): setup pnpm before feishu notify

* fix(release): add stable dry-run prepublish flow

* fix(release): accept completed prerelease metadata gate

* fix(release): require stable release branches

* fix(release): converge r2 access checks

* fix(updater): use release channel parser for defaults

* fix(updater): harden windows payload relaunch

* fix(release): converge updater smoke fixture contract

* test(e2e): require silent updater fixture output

* fix(release): align stable windows smoke build path

* fix(ci): include release workspace in validation

* fix(ci): repair release validation lanes

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

* fix(ci): restore zero-install Feishu notification

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-23 06:13:21 +00:00
PerishFire
fa544f2836 [codex] Prune unused automation and repair metrics publishing (#4612)
* chore(workflows): prune unused automation

* chore(workflows): update github app token action

* chore(workflows): use github app client id
2026-06-23 04:01:49 +00:00
Marc Chan
d5d9a761ca refactor(daemon): organize source modules (#4454)
* docs(daemon): add agent guidance

* refactor(daemon): organize source modules

* chore(guard): update daemon design system import

* refactor(daemon): shorten domain module filenames

* fix(daemon): correct pitch-deck manifest path after test move

The pitch-deck manifest test moved one directory deeper into
tests/design-systems/ but kept the old '../../..' repoRoot, resolving
to apps/ instead of the repo root and failing to find
plugins/_official/examples/html-ppt-pitch-deck/open-design.json.

* fix(daemon): retarget media model verifier

Generated-By: looper 0.9.10 (runner=fixer, agent=codex)

* docs(daemon): refresh AGENTS legacy paths after module reorg

* test(e2e): allow manual edit P0 CI scaling

Generated-By: looper 0.9.10 (runner=fixer, agent=codex)

* test(daemon): fix frontmatter import path

Generated-By: looper 0.9.10 (runner=fixer, agent=opencode)

* fix(web): avoid replaying legacy terminal runs

Generated-By: looper 0.9.10 (runner=fixer, agent=opencode)

* fix(web): preserve legacy terminal replay recovery

Generated-By: looper 0.9.10 (runner=fixer, agent=opencode)

* fix(daemon): correct live-artifacts test data root

* fix(daemon): cover moved route registrars in context contract

Generated-By: looper 0.9.10 (runner=fixer, agent=opencode)

* fix(daemon): restore moved runtime artifact imports

Generated-By: looper 0.9.10 (runner=fixer, agent=opencode)
2026-06-20 16:08:20 +00:00
PerishFire
358a426b2b ci: rename atom workflows (#4522) 2026-06-19 03:22:43 +00:00
PerishFire
c1701f709a [codex] codify workflow handoff capabilities (#4503)
* codify workflow handoff capabilities

* Harden workflow marker jq lookups

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

* harden autofix head validation

* Preserve Nix hash fallback guidance

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

* Gate Nix hash fallback comments

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

* Clear stale Nix fallback comments after autofix

Generated-By: looper 0.9.10+codex.autoclean (runner=fixer, agent=codex)

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-18 08:37:18 +00:00
PerishFire
c782aeb3bb ci: stop duplicate post-merge validation (#4469) 2026-06-17 10:04:03 +00:00
PerishFire
b49c9b7932 Refine CI hot and full validation modes (#4466) 2026-06-17 09:32:29 +00:00
PerishFire
15d3759834 [codex] Optimize CI topology and Playwright suites (#4460)
* Consolidate PR validation into ci

* Tune absorbed CI runner tiers

* Fold visual validation into CI Playwright shards

* Run PR shards during manual CI dispatch

* Optimize CI topology and scopes

* Increase UI P0 Playwright workers

* Relax visual Playwright timeout
2026-06-17 08:24:52 +00:00
PerishFire
4b9a15734d [codex] Consolidate PR validation into CI (#4451)
* Consolidate PR validation into ci

* Tune absorbed CI runner tiers

* Fold visual validation into CI Playwright shards

* Run PR shards during manual CI dispatch
2026-06-17 14:21:04 +08:00
PerishFire
891981d460 [codex] Optimize CI runtime topology (#4450)
* Optimize e2e tools-dev runtime parallelism

* Remove visual networkidle wait

* Optimize e2e vitest parallelism

* Optimize Nix flake caching

* Test CI on Blacksmith runners

* Allow parallel manual CI runs

* Tier CI runner sizes

* Temporarily narrow CI debug scope

* Instrument watcher CI timeouts

* Instrument watcher event diagnostics

* Avoid default polling in watcher tests

* Skip runtime trace during watcher debug

* Probe ARM runner tiers for CI

* Focus CI runner probe on x64 browser

* Probe browser workers by runner size

* Probe Playwright file parallelism

* Probe Playwright worker scaling on 8v

* Reshape CI topology

* Fix split E2E Vitest CI lane

* Simplify daemon CI topology

* Optimize Windows payload CI setup

* Revert "Optimize Windows payload CI setup"

This reverts commit 5cbc48c0af.

* Cache better-sqlite3 Nix binding separately

* Revert "Cache better-sqlite3 Nix binding separately"

This reverts commit 0384e3787e.

* Remove unused Nix cache setup from CI

* Use Blacksmith ARM for lightweight CI jobs
2026-06-17 12:53:06 +08:00
lefarcen
66d9cf9fb6 fix(release): accept releaseVersion/releaseNumber in beta metadata reader (#4383)
The daily beta build (notify-daily-feishu) reads beta/latest/metadata.json
via scripts/release-beta.ts, which only understood the legacy
betaVersion/betaNumber fields. The unified release publisher and the
in-flight tools-release rewrite stamp the feed with generic
releaseVersion/releaseNumber instead, so once a build from that tooling
wrote the feed the daily reader died with 'must include betaVersion or
baseVersion+betaNumber' and the scheduled beta + Feishu card stopped.

Accept either field spelling so the reader survives whichever publisher
last wrote the feed.
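
A minimal sketch of the "accept either spelling" idea, not the actual code in scripts/release-beta.ts. The four version/number field names and `baseVersion` come from the message above; the `FeedMetadata` and `FeedVersion` types and the `readFeedVersion` name are hypothetical.

```typescript
// Hypothetical feed shape: only the field names are taken from the commit message.
interface FeedMetadata {
  betaVersion?: string;
  betaNumber?: number;
  releaseVersion?: string;
  releaseNumber?: number;
  baseVersion?: string;
}

// Hypothetical result type: either a full version string, or a base + counter pair.
type FeedVersion =
  | { kind: "full"; version: string }
  | { kind: "base"; baseVersion: string; number: number };

function readFeedVersion(meta: FeedMetadata): FeedVersion {
  // Prefer the legacy spelling, fall back to the generic one.
  const version = meta.betaVersion ?? meta.releaseVersion;
  if (version !== undefined) return { kind: "full", version };

  const number = meta.betaNumber ?? meta.releaseNumber;
  if (meta.baseVersion !== undefined && number !== undefined) {
    return { kind: "base", baseVersion: meta.baseVersion, number };
  }
  throw new Error("metadata must include a version, or baseVersion plus a number");
}
```

How the base version and number are combined into a final version string is left out, since the message does not state it.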
2026-06-16 05:29:52 +00:00
Jeshua:D
a0beba92f9 [codex] lint craft references in guard (#4239)
* chore: lint craft references

* fix: validate bundled plugin craft references

---------

Co-authored-by: Jeshua09090 <jeshuaelpro@outlok.com>
2026-06-14 08:53:54 +00:00
Denis Redozubov
34053d95bf Pin sandbox contract shapes and ownership guards (#3429)
* Pin sandbox contract shapes and ownership guards

* fix(web): guard alias import escapes
2026-06-13 13:07:57 +00:00
CHENGLONG WANG
8359fb6d2c landing: skill-spec restyle, upstream parity, newsletter + nav updates (#4158)
* landing: skill-spec restyle, upstream parity, newsletter + nav copy updates

Landing-page-only rebuild of the home-refresh branch on top of current
main (the previous branch carried unrelated apps/web changes and
conflicted with main):

- Homepage: newsletter band between CTA and FAQ (localized in all 18
  locales, client-side stub until a provider lands), AMR logo in the
  nav, hero subtitle now reads 'best open-source alternative' in every
  non-English locale, hero/nav download labels swapped.
- /plugins/: upstream hub structure (3 tiles); /plugins/templates/ +
  kind pages: breadcrumb and counter chip removed, scene chip
  active-state contrast fix, tpl-card styles with token colors.
- /agents/: design-spec pass (token spacing incl. 68px max module gap,
  font scale, 8px radius), 3-up card grid, list-indent fix; /solutions/,
  /agents/, /plugins/ unified on one 1180px centered safe area.
- /blog/: cream/terracotta leftovers replaced with token colors, 8px
  card radius.
- Header dropdowns: full 21-agent list, upstream caret/gap parity;
  neutralized the homepage section padding leak inside info pages.
- package.json stays on main's shape; cobe / matter-js / @types/
  matter-js added as exact pins (homepage globe + FallingText) with
  matching pnpm-lock entries. nix/pnpm-deps.nix hash NOT refreshed
  (no nix locally) — needs the Validate workspace gate or a follow-up.

* landing: fix CI gates while keeping the homepage refresh intact

Rebased onto current main and resolved the blocking review + red checks on
this branch without dropping the design refresh:

- Zero-client-JS gate: the cobe globe and matter-js FallingText enhancers
  used ES `import`, so Astro emitted external `/_astro/*.js` module bundles
  and tripped `Verify zero external JavaScript`. Both effects are preserved:
  index.astro now reads each library's prebuilt runtime from node_modules at
  build time and inlines it verbatim as a classic `<script is:inline>`
  (cobe's single ESM default export is rebound to `window.__cobe`; matter-js
  ships UMD), mirroring locale-switcher-script.astro. No bundled/module JS.

- Staging/preview noindex: restored the `staging-noindex-headers` Astro
  integration that appends `X-Robots-Tag: noindex, nofollow` to `_headers`
  for OD_LANDING_NOINDEX builds, so PR previews stay out of search indexes.

- Blob guard: removed 14 stray .bak/.prev/.leftpad backup files and 9
  unreferenced PNG originals, converted 12 referenced >1MB PNGs to WebP
  (each <1MB) with their refs updated, and recompressed lab-video.mp4 /
  lab-hyperframes.mp4 under 1MB. No changed tracked file now exceeds 1MB.

- Residual-JS guard: allowlisted public/community/_site-nav.js, a verbatim
  browser script for the static /community/ pages (same precedent as the
  web notifications service worker).

- Homepage-art gate: the refresh serves its hero/gallery/method art as
  optimized origin-hosted WebP instead of Cloudflare Image Resizing variants,
  so the homepage no longer emits cdn-cgi URLs. The gate now requires the
  same floor of 16 optimized WebP references (ci/staging/production) so a
  regression to raw, unoptimized art is still caught.

* landing: stop 18x-localizing the skills/systems/templates catalog

The PR preview deploy hit Cloudflare Pages' 20,000-file-per-deployment cap:
the branch added `[locale]/{skills,systems,templates}` detail/listing
generators plus per-locale plugin previews, all fanned out across the 18
LANDING_LOCALES, producing ~29,300 output files.

This is exactly the locale-explosion the branch's own `public/_redirects`
already anticipated — its "Catalog migration" section says the standalone
skills/systems/templates generators "were removed" and are 301'd to
`/plugins/*`, but the generator routes were left in the tree, and the
localized 301 rules only covered 4 of the 18 locales.

Fix, matching that stated intent and Astro's guidance for bounded static
multilingual builds (prune the prerender matrix; signal i18n via
hreflang/sitemap, which this site already does — non-default locales are
sitemap-excluded):

- Remove the localized `[locale]/{skills,systems,templates}` detail, index
  and facet generators, and the localized `[locale]/plugins/previews`
  duplicate (previews are language-agnostic visual assets).
- Extend the localized catalog 301s in `public/_redirects` from {zh,zh-tw,
  ja,ko} to every prefixed locale, and add `/<loc>/plugins/previews/*` →
  the canonical English previews. Kept-localized plugin hub/detail pages
  whose links degrade to these paths now resolve via a single 301 hop.

English catalog pages, all localized marketing pages, the localized plugins
hub, and the homepage design are unchanged. Output drops to ~19.2k files,
back under the deploy cap.

* landing: canonicalize legacy locale aliases in the header language switcher

The shared sub-page header language switcher builds its targets with
`localePath(target, Astro.url.pathname)`, and `stripLocaleFromPath` only
recognized canonical `LANDING_LOCALES`. The catch-all route still emits
legacy / alias locale pages (`app/_lib/i18n.ts` LOCALES: `/zh-CN/`, `/es-ES/`,
`/fa/`, `/hu/`, `/th/`, plus upper-cased `zh-TW`/`pt-BR`), so on those pages
the switcher produced doubled, non-existent targets like `/zh/zh-CN/blog/`.

Strip those alias prefixes too (case-insensitive, canonicalizing to the
default locale) so switching language from `/zh-CN/blog/` now lands on an
existing URL (`/blog/`, `/zh/blog/`, …). Verified against the static build:
the doubled `/<loc>/{zh-CN,es-ES,fa,hu,th}/…` targets are gone (0 matches),
and canonical `LANDING_LOCALES` pages are unchanged.
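
The alias stripping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in app/i18n.ts: the default locale is assumed to be `en`, `CANONICAL_LOCALES` is an illustrative subset of the 18 locales, and `stripLocalePrefix` / `localePathSketch` are hypothetical names.

```typescript
const DEFAULT_LOCALE = "en"; // assumption: the unprefixed default locale
const CANONICAL_LOCALES = new Set(["zh", "zh-tw", "ja", "ko"]); // illustrative subset only
const LEGACY_ALIASES = new Set(["zh-cn", "es-es", "fa", "hu", "th"]); // compared lower-cased

// Split a pathname into its locale and the locale-free remainder.
function stripLocalePrefix(pathname: string): { locale: string; rest: string } {
  const segments = pathname.split("/"); // "/zh-CN/blog/" -> ["", "zh-CN", "blog", ""]
  const first = (segments[1] ?? "").toLowerCase();
  const rest = "/" + segments.slice(2).join("/");
  if (CANONICAL_LOCALES.has(first)) return { locale: first, rest };
  // Legacy aliases are stripped too and canonicalized to the default locale.
  if (LEGACY_ALIASES.has(first)) return { locale: DEFAULT_LOCALE, rest };
  return { locale: DEFAULT_LOCALE, rest: pathname };
}

// Build the switcher target for `target` from the current pathname.
function localePathSketch(target: string, pathname: string): string {
  const { rest } = stripLocalePrefix(pathname);
  return target === DEFAULT_LOCALE ? rest : `/${target}${rest}`;
}
```

With this shape, switching to `zh` from `/zh-CN/blog/` yields `/zh/blog/` instead of the doubled `/zh/zh-CN/blog/`.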

* landing: optimize homepage first-screen — fonts → subset woff2, hero → responsive webp

Resource-weight pass on the homepage, focused on the first screen.

Fonts (≈861 KB → ≈106 KB, and no longer render-blocking):
- Albert Sans regular/italic shipped as TTF (125 KB + 138 KB). Added woff2
  builds (50 KB + 55 KB) as the primary `src`, TTF kept only as a fallback.
- `remixicon.ttf` was a 598 KB icon font loaded `font-display: block` (it hid
  the icons — and text in the same paint — until the whole font downloaded).
  The site only renders 11 of its glyphs, so it's now subset to a 1.3 KB woff2.
- All faces are `font-display: swap` now, and the above-the-fold Albert Sans
  woff2 is `<link rel=preload>`ed at high priority, so text paints immediately
  in the fallback and swaps without a blocking round-trip.

Hero imagery (responsive WebP instead of one oversized file for all devices):
- The hero backdrop was a single 326 KB / 2880px PNG served to every viewport.
  Now responsive q90 WebP variants (960→33 KB, 1440→47 KB, 1920→89 KB,
  2880→110 KB); `og:image`/`twitter:image` keep the PNG since social crawlers
  don't reliably accept WebP. The LCP preload points at the WebP set with
  `imagesizes="100vw"` to match the full-bleed `<img>`.
- The hero product shot now has a responsive srcset (800/1280/1920/2508);
  the 2508px master is the pristine original (no re-encode), phones pull the
  800px (66 KB) variant.

Quality is kept high (q90) — the savings come from format + right-sizing per
viewport, not heavier compression. Build output is unchanged structurally
(zero external/module JS still verified; ~19.2k files, under the deploy cap).

* landing: include all 25 used Remix Icon glyphs in the subset

The first subset only covered the 11 codepoints declared via CSS `content:`
and missed the 15 declared as JS \u escapes in page.tsx's `RI` glyph map
(download \uec5a, github \uedcb, star \uf18b, the arrow/chevron set, …), so
the hero download / Star CTA icons and several nav arrows rendered blank.
Re-subset to the full union of 25 glyphs (still ~2KB woff2).

* landing: add the locale-switcher caret glyph to the Remix Icon subset

A full scan of the built site (18,912 HTML files, every rendered Private-Use
codepoint) surfaced one more glyph the 25-glyph subset was missing: U+EA4E
(arrow-down-s-line), the language-switcher dropdown caret in header.tsx — it's
inserted as a literal char (not via the RI map), so the source grep missed it.
Re-subset to 26 glyphs (≈2.1KB); the same scan now reports zero missing
glyphs site-wide.

* landing: convert remaining raster art to WebP site-wide

After a site-wide image audit, converted the last referenced PNG/JPG raster
art to WebP (q88, no visible quality loss — the source files are lossless so
there's no generation loss):

- `lab-stage-bg` (homepage lab section): 710KB JPG → 323KB WebP.
- Blog study plates 09–12 and 16–22 (the hero/inline art in the Figma- and
  Claude-alternative posts, referenced across every localized section):
  ~4.9MB of 1024² PNGs → ~2.6MB WebP. All `<img src>` references in the two
  blog markdown files and the blog index were repointed; the originals are
  removed.

Combined with the earlier hero/font pass, the homepage and blog now ship only
WebP raster art (PNG kept solely for the social `og:image`). Verified: build
clean, no dangling references to the removed files, zero external/module JS,
~19.2k output files (under the deploy cap).

* landing: make homepage og:image / twitter:image absolute URLs

`heroImage` is a same-origin path (`/hero-home.png?v=2`); social crawlers
(Facebook, Twitter, LinkedIn) need an absolute URL for rich link previews.
Resolve it against `Astro.site` at the metadata call site so the share card
renders as `https://open-design.ai/hero-home.png?v=2`. Per @PerishCode's
review note; non-blocking but correct.
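
For illustration, resolving a same-origin path against a site origin with the standard `URL` constructor looks like this. The `absoluteShareImage` helper is a hypothetical name; in the real page the base would be `Astro.site`.

```typescript
// Resolve a share-image path against the site origin.
// An input that is already absolute ignores the base and is returned unchanged.
function absoluteShareImage(image: string, site: string): string {
  return new URL(image, site).href;
}

const site = "https://open-design.ai";
console.log(absoluteShareImage("/hero-home.png?v=2", site));
// -> https://open-design.ai/hero-home.png?v=2
```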

* landing: absolute share image on sub-pages + canonical-locale plugin search.json

Two non-blocking output regressions from @PerishCode's review:

- sub-page-layout.astro wrote `og:image`/`twitter:image` from the raw
  `ogImage ?? heroImage` (origin-relative), so every sub-page's share card
  (plugins, agents, download, …) emitted `content="/hero-home.png?v=2"`.
  Resolve against `Astro.site` like the homepage already does; `new URL()`
  leaves an already-absolute `ogImage` untouched.

- [locale]/plugins/search.json.ts keyed off `_lib/i18n` PREFIXED_LOCALES
  (the legacy alias table: zh-CN, es-ES, pt-BR …), so it emitted
  `/zh-CN/plugins/search.json` whose entries linked to `/zh-CN/plugins/<slug>/`
  detail pages that don't exist, while the canonical `/zh/plugins/search.json`
  was missing. Drive it from `app/i18n` LANDING_LOCALES + localePath so each
  endpoint matches the localized plugin detail routes and its hrefs resolve.

Verified: sub-page og:image is now absolute, /<canonical-locale>/plugins/
search.json exists with resolvable hrefs (legacy /zh-CN/ variants gone),
zero external/module JS, ~19.2k files.

* landing: preserve query + hash when switching language

The locale switcher's click handler only persisted the choice and let the
anchor's bare server-rendered href ('/<locale>/<path>/') drive navigation, so
a plain left-click dropped the current query string and hash — switching
language on '/plugins/templates/?kind=deck#gallery' landed on
'/<locale>/plugins/templates/'. Intercept the plain left-click and route
through the existing selectLocale(), which carries window.location.search +
hash across the switch via targetFor(); modified-click / non-left-button /
right-click still fall through to the real <a> (open-in-new-tab, etc.).

Per @PerishCode's review note. Verified: build clean, homepage still ships no
external/module JS, typecheck 0 errors.
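
A rough sketch of the two decisions described above: which clicks to intercept, and how the target keeps the query and hash. Both helper names and the `ClickLike` type are hypothetical and stand in for the real `selectLocale()` / `targetFor()` logic, which is not shown in this message.

```typescript
// Minimal structural stand-in for the MouseEvent fields the check needs.
interface ClickLike {
  button: number;
  metaKey: boolean;
  ctrlKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
}

// Only a plain left-click is intercepted; modified or non-left clicks
// fall through to the real <a> (open in new tab, context menu, ...).
function isPlainLeftClick(e: ClickLike): boolean {
  return e.button === 0 && !e.metaKey && !e.ctrlKey && !e.shiftKey && !e.altKey;
}

// Append the current query string and hash to the localized path.
// `search` and `hash` are expected in window.location form ("?a=b", "#x", or "").
function targetWithQueryAndHash(localizedPath: string, search: string, hash: string): string {
  return `${localizedPath}${search}${hash}`;
}
```

For the example in the message, `targetWithQueryAndHash("/ja/plugins/templates/", "?kind=deck", "#gallery")` gives `/ja/plugins/templates/?kind=deck#gallery`.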

* landing: handle legacy locale aliases in the switcher's path recompute

The previous commit routed plain locale-switch clicks through selectLocale(),
which recomputes the target via basePathFromCurrent(). That helper only treated
canonical LANDING_LOCALES as a strippable first segment, so on a legacy alias
page (/zh-CN/, /es-ES/, /fa/, /hu/, /th/ — still emitted by the catch-all)
clicking a language built a nested, non-existent URL like /zh/zh-CN/blog/.

Teach the switcher the same legacy-alias set as stripLocaleFromPath in
app/i18n.ts (zh-cn, es-es, fa, hu, th) via an isLocaleSegment() helper, so the
alias prefix is stripped before the target is rebuilt: /zh-CN/blog/ → ja now
yields /ja/blog/ (+ preserved query/hash), not /ja/zh-CN/blog/.

Per @PerishCode. Verified: build clean, switcher script carries the alias
handling, homepage still ships no external/module JS, typecheck 0 errors.

* landing: make plugin detail canonical-only + guard the deploy file count

The localized plugin detail wrappers (`[locale]/plugins/[...slug].astro` and
`[locale]/plugins/[slug]/index.astro`) regenerated the full ~450-plugin detail
set across all 17 non-default LANDING_LOCALES, so `build:static` emitted 19,234
files with `public/previews` empty — only ~766 below Cloudflare Pages' 20k-file
deploy cap, which the CI-generated preview thumbnails would erode further. This
is the same locale-explosion the PR set out to fix, left in the plugin subtree.

- Remove both localized plugin detail generators; plugin detail stays canonical
  (English). Output drops 19,234 → 4,287 files (huge headroom).
- Add `/<loc>/plugins/* -> /plugins/:splat` 301s for every prefixed locale so
  localized plugin links (still emitted by the kept localized hub pages) resolve
  to the canonical detail page. Kept hub pages (/<loc>/plugins/,
  /<loc>/plugins/{skills,systems,templates}/) have static files and win over
  the redirect (static assets take precedence over _redirects on CF Pages).
- Add a post-build "deploy file count under Cloudflare Pages cap" step to the
  ci / staging / production landing workflows (fail at >=19,000) so this limit
  can never regress silently again.

Per @PerishCode. Verified: 4,287 files, localized hubs + canonical detail
intact, homepage zero external/module JS, typecheck 0 errors.
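
The guard's threshold logic can be sketched as below. The 19,000 fail threshold and the 20,000 cap come from the message; the function name and return shape are hypothetical, and counting the files in the build output (a directory walk) is left to the caller.

```typescript
const PAGES_FILE_CAP = 20000; // Cloudflare Pages per-deployment file cap cited above
const FAIL_AT = 19000;        // guard threshold from the message: fail at >= 19,000

// Decide whether a build's output file count passes the guard,
// and report how far it is below the hard cap.
function checkDeployFileCount(fileCount: number): { ok: boolean; headroom: number } {
  return { ok: fileCount < FAIL_AT, headroom: PAGES_FILE_CAP - fileCount };
}
```

Applied to the numbers in this commit, the earlier 19,234-file build would fail the guard with 766 files of headroom, while the 4,287-file build passes.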

* landing: one canonical plugin detail route (drop namespace catch-all)

`plugins/[...slug].astro` keyed getStaticPaths off `plugin.slug`, which keeps
the registry namespace (`open-design/<name>`), so it emitted nested pages like
/plugins/open-design/<name>/ and self-canonicalized them via Astro.url.pathname.
But catalog cards and the search JSON link to `plugin.detailHref`
(/plugins/<last-segment>/), which the sibling `plugins/[slug]/index.astro`
already generates from `detailSlug`. The catch-all therefore produced ~440
orphan, indexable duplicate URLs that no surface links to — extra sitemap
entries and extra files against the Cloudflare Pages cap.

Delete the catch-all; `[slug]/index.astro` remains the single canonical detail
route at the exact `detailHref` URL the catalog/search already target. These
namespace URLs are net-new to this branch (never shipped/indexed), so no
redirect is needed. Output drops 4,287 → 3,848 files; canonical detail pages
(445) and the templates/[kind] hubs are intact.

Per @PerishCode. Verified: /plugins/open-design/* gone, canonical detail
renders, file-count guard passes, homepage zero external/module JS, typecheck
0 errors.

* landing: fix dead locale-prefixed links to canonical detail pages and migrated catalogs

Plugin/skill/template detail pages are canonical-only since the catalog
migration, but the card components still ran detailHref through
localizedHref, 404ing every card click on all 17 non-default locales
(430+ links per locale). The homepage lab dock, the design-systems stat
card, and the html-video CTA also still pointed at the legacy /skills
and /systems paths. Cards now link detail pages canonically; homepage
and html-video links target the /plugins/* catalogs and kind facets.

* landing: anchor the community dropdown into the hub sections again

Contributors / Ambassadors / Moderators in the header dropdown linked
straight to the standalone static pages, surfacing them on a plain
menu click. Restore upstream main's destinations — /community/#... hub
anchors — so the dropdown scrolls the hub; the standalone pages stay
reachable from the hub cards only.

---------

Co-authored-by: wangchenglong <879618852@qq.com>
Co-authored-by: lefarcen <ontf116@gmail.com>
2026-06-12 12:11:56 +00:00
Max Hsu
055f3f8e0c fix(postinstall): skip build targets without tsconfig.json so partial install contexts survive (#4039)
deploy/Dockerfile runs pnpm install at a stage where only
apps/daemon/package.json has been copied (no tsconfig.json or src/), so the
root postinstall's unconditional build of every target aborts the image build
with TS5058. Skip targets whose tsconfig.json is absent — a no-op for normal
installs, where every build target ships its tsconfig — and let such contexts
run the real build later once sources are in place.
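
A minimal sketch of the skip rule, assuming the postinstall script holds a list of target directories. `selectBuildTargets` is a hypothetical name, and the existence check is injected so the sketch stays independent of how the real script touches the filesystem.

```typescript
// Keep only the targets whose tsconfig.json is present.
// `fileExists` would typically be fs.existsSync in a Node script.
function selectBuildTargets(
  targetDirs: readonly string[],
  fileExists: (path: string) => boolean,
): string[] {
  return targetDirs.filter((dir) => fileExists(`${dir}/tsconfig.json`));
}
```

In a normal install every target has its tsconfig, so the filter returns the full list and behavior is unchanged.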

Part of #4012 (the docker compose build failure; the 401 API_TOKEN_REQUIRED
behavior is a separate auth-design question).

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:18:16 +00:00
lefarcen
8450a530fc Merge pull request #4115 from nexu-io/release/v0.10.0
release: Open Design 0.10.0 — the all-in-one Agentic design workspace, with a whole design team built in
2026-06-11 19:32:11 +08:00
lefarcen
3c3d45aa3d fix(release): add skip_nightly_gate escape hatch for stable promotion (#4150)
The stable-promotion gate (validateStableNightlyMetadata) validates a long list
of fields on the candidate nightly's metadata. After fixing stableVersion /
github.* / manifest staleness, it still trips on r2.reportZipUrl (hard-coded
null in publish-metadata.ts), with more fields (report.url, platforms.*) behind
it. To stop the field-by-field whack-a-mole and ship 0.10.0, add a
skip_nightly_gate input (default false) that bypasses the gate. The stable build
still produces freshly built + signed artifacts and still runs publish-metadata's
own per-platform manifest consistency checks; only the nightly-metadata
precondition is skipped.
2026-06-11 18:44:40 +08:00
Amy
d055737de7 [codex] Add PR UI P0 gate (#4142)
* add PR UI P0 gate

* fix(ci): keep critical UI check in required workflow

Generated-By: looper 0.9.7 (runner=fixer, agent=codex)
2026-06-11 09:09:00 +00:00
yinjialu
e94acc2702 add cross-app import boundary check to pnpm guard (#4069)
* feat(guard): enforce the cross-app import boundary in pnpm guard

Adds a cross-app import check that scans apps/* sources for imports
reaching into another app (relative paths, @open-design/<app> package
specifiers, or apps/-rooted paths) and fails pnpm guard with the
offending file:line. The web/daemon integration boundary (HTTP APIs +
packages/contracts) was previously enforced only by human review.

The deliberate packaged -> desktop ./main export is allowlisted with a
documented reason; further exceptions must be added the same way.
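
As a rough illustration of the scan, assuming a simple per-line regular expression rather than whatever parser the real guard uses. It covers only the `@open-design/<app>` and `apps/`-rooted cases; relative-path resolution and the allowlist are omitted, and all names here are hypothetical.

```typescript
interface Violation {
  line: number;      // 1-based line number within the scanned source
  specifier: string; // the offending import specifier
}

// Matches `from "x"`, `import "x"`, `import("x")` and `require("x")`.
const SPECIFIER = /(?:\bfrom\s+|\bimport\s*\(?\s*|\brequire\s*\(\s*)["']([^"']+)["']/g;

function findCrossAppImports(
  ownApp: string,
  source: string,
  appNames: readonly string[],
): Violation[] {
  const others = appNames.filter((name) => name !== ownApp);
  const violations: Violation[] = [];
  source.split("\n").forEach((text, index) => {
    const pattern = new RegExp(SPECIFIER.source, "g"); // fresh lastIndex per line
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      const specifier = match[1] ?? "";
      const hit = others.some(
        (name) =>
          specifier === `@open-design/${name}` ||
          specifier.startsWith(`@open-design/${name}/`) ||
          specifier.startsWith(`apps/${name}/`),
      );
      if (hit) violations.push({ line: index + 1, specifier });
    }
  });
  return violations;
}
```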

* fix(guard): harden cross-app import scanning

Generated-By: looper 0.9.6 (runner=fixer, agent=codex)

* fix(guard): fail on unreadable app manifests

Generated-By: looper 0.9.6 (runner=fixer, agent=codex)

* fix(guard): reject unnamed app manifests

Generated-By: looper 0.9.6 (runner=fixer, agent=codex)

* fix(guard): catch require.resolve app imports

Generated-By: looper 0.9.6 (runner=fixer, agent=codex)

* fix(guard): detect createRequire app imports

Generated-By: looper 0.9.6 (runner=fixer, agent=codex)

* fix(guard): catch CommonJS node module createRequire

Generated-By: looper 0.9.6 (runner=fixer, agent=codex)
2026-06-10 11:40:31 +00:00
lefarcen
6869b1208b fix(plugins-home): correct deck/scroll preview capture + smoother gallery playback (#4044)
* fix(bake): classify deck-vs-scroll by viewport, real-wheel pan, motion config

Systematic audit of all 126 baked previews surfaced three capture bugs:

- 2 vertical pages misread as decks (the input probe wheel-scrolled them and the
  scroll-driven animation looked like a slide change), so they got walked
  sideways. Classify by viewport height instead: a fixed-viewport page is a deck,
  a vertically-scrollable page is a landing page (pan it) even with a horizontal
  marquee/carousel sub-component.
- 9 scroll-hijack landing pages (custom/transform scroll) that window.scrollTo
  can't move, so the pan was static. Pan those with REAL wheel events
  (page.mouse.wheel), which drive the page's own scroll handler.
- single-screen pages now hold (static) instead of being forced down a deck path.

Plus an opt-in override: authors can declare od.preview.motion ('scroll' | 'deck'
| 'static') and the bake honors it, auto-detecting only when it's absent. Schema
+ plugins-spec document the field. (Also strips a stray NUL byte from the hash
line that made the file read as binary.) BAKE_VERSION -> 4 re-bakes everything.

* perf(plugins-home): only decode visible gallery clips + stream first frame

Two cheap wins for the baked gallery videos:

- Decouple mount from play. The tile mounts the <video> across the wider inView
  margin (so scroll-in/hover never remounts + reloads), but only PLAYS while
  truly visible — off-screen tiles in the mount margin hold their poster frame
  paused instead of all running a simultaneous decode. Adds a 0-margin visible
  observer in PreviewSurface alongside the existing near one.
- preload=metadata instead of auto: paints the first frame off the +faststart
  header instead of eagerly buffering the whole clip up front, so tiles show fast
  and don't saturate the network. The idle hold buffers the pan before hover.

* perf(plugins-home): keep baked clips mounted across a scroll window

Scrolling a tile out of view and back re-showed a load even though the clip
bytes are HTTP-cached (R2 immutable): the <video> unmounted at the tight 120px
margin, so scroll-back remounted a fresh element that re-fetches metadata and
re-decodes the first frame. Add a wide keep-mounted observer (~1500/1800px) so a
clip stays mounted for a few screens — instant scroll-back — while iframes keep
the tight margin and play stays gated to the truly-visible zone (paused, not
unmounted, off screen).

* fix(bake,contracts): probe scroll mechanism before recording; validate motion

Address review:
- Move the window.scrollTo probe before Page.startScreencast so its scrollTo
  160 -> 0 jump isn't baked into the head of the pan as a visible lurch.
- Type od.preview.motion in the Zod PluginManifestSchema (enum scroll|deck|
  static) so an invalid value fails doctor/install instead of silently parsing
  via passthrough and being ignored by the bake; add contract test coverage.

* fix(bake): auto-detect single-screen fixed pages as static, not deck

A fixed-viewport page is only a deck if an input actually advances it; probe the
driver during auto-detect and fall back to 'static' (default viewport + a hold)
when nothing moves it, instead of routing every non-scrollable page through the
deck path where walkSlides(null) just held at the deck-sized capture. Extracted
the arrow/wheel probe into probeDeckDriver(). Verified: a waitlist page now bakes
a 2.5s static hold, guizang still walks, acreage still pans.
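
Taken together with the manifest override, the auto-detect order reads roughly as follows. This is a simplified sketch: `PageProbe` and `resolveMotion` are hypothetical names, the probe result is passed in as a boolean, and scroll-hijack pages (panned with real wheel events) are not modeled.

```typescript
type Motion = "scroll" | "deck" | "static";

interface PageProbe {
  declaredMotion?: Motion;      // od.preview.motion from the manifest, if the author set it
  scrollHeight: number;         // document height in CSS pixels
  viewportHeight: number;       // capture viewport height in CSS pixels
  driverAdvancesSlide: boolean; // hypothetical result of the arrow/wheel probe
}

function resolveMotion(page: PageProbe): Motion {
  // An explicit author override always wins.
  if (page.declaredMotion !== undefined) return page.declaredMotion;
  // A vertically scrollable page is treated as a landing page and panned.
  if (page.scrollHeight > page.viewportHeight) return "scroll";
  // A fixed-viewport page is a deck only if some input actually advances it.
  return page.driverAdvancesSlide ? "deck" : "static";
}
```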

---------

Co-authored-by: audit <a@b.c>
2026-06-10 05:43:50 +00:00
lefarcen
9b06b6df78 perf(bake): bound the deck slide-walk so huge-DOM decks don't run long (#4029)
A deck that renders every slide side-by-side in one giant rail
(#deck{width:10000vw}) has thousands of DOM nodes. deckSignal called
getBoundingClientRect + getComputedStyle on ALL of them to fingerprint the
current slide, which is O(n) layout reads — fast locally but ~3s/scan in CI,
dragging guizang's walk (and its clip) out to 21.7s while every other deck
stayed <=10s.

Cap deckSignal's scan to the first 600 elements (the slide rail/track is a
structural node near the top of the DOM, so this still detects slide changes —
verified guizang still walks to its later slides), and add an 8s wall-time
backstop on the walk so a clip can never run long even if signal reads are slow.
BAKE_VERSION -> 3 re-bakes everything.
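
The two bounds can be sketched like this, under the assumption that the fingerprint is a joined string of per-element reads. `fingerprint` and `walkWithBudget` are hypothetical names; the real deckSignal reads layout and computed style from DOM nodes, which is abstracted here as a `read` callback.

```typescript
const SCAN_CAP = 600;         // element cap from the message
const WALK_BUDGET_MS = 8000;  // wall-time backstop from the message

// Fingerprint at most `cap` elements so cost no longer grows with DOM size.
function fingerprint<T>(
  elements: readonly T[],
  read: (el: T) => string,
  cap: number = SCAN_CAP,
): string {
  return elements.slice(0, cap).map(read).join("|");
}

// Advance slides until `advance` reports no further slide or the budget runs out.
// Returns the number of successful advances.
function walkWithBudget(
  advance: () => boolean,
  now: () => number,
  budgetMs: number = WALK_BUDGET_MS,
): number {
  const start = now();
  let steps = 0;
  while (now() - start < budgetMs && advance()) steps++;
  return steps;
}
```

The clock is injected so the backstop can be exercised without real waiting; in a script `now` would typically be `Date.now`.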

Co-authored-by: audit <a@b.c>
2026-06-10 01:53:12 +00:00
lefarcen
31cf0e0f0a fix(plugins-home): deck slide-tour bakes, CJK font fix, framing + reveal height (#4020)
* fix(plugins-home): anchor baked preset previews to the hero top

The example-prompt preset tiles render wider than the gallery's 1.31:1 baked
clip, so object-fit:cover was cropping the hero headline out of the vertical
middle. Anchor the preset video/poster to the top so the hero stays in frame.
Scoped to the preset cards; the Community grid (which matches the clip aspect)
is untouched.

* fix(plugins-home): bake decks as slide tours, fix CJK font tofu, unclip reveal

Three gallery-preview fixes found reviewing the baked Community gallery:

- Decks (fixed-viewport PPT/slideshow pages driven by arrow/wheel, not scroll)
  were vertically panned like tall landing pages — capturing ~20s of a dead
  WebGL background at the wrong 1.31 aspect (the page's compact responsive
  variant, hero headline gone). The bake now detects them (page no taller than
  the viewport), re-renders at 16:9, and walks their slides (arrow keys, falling
  back to wheel then a "next" control, detected by a structural slide signal)
  for a slide-tour the gallery loops on hover. BAKE_VERSION -> 2 re-bakes all.
- CJK-heavy templates baked tofu boxes: they pull Noto Serif/Sans SC from Google
  Fonts with display=swap and the CI runner had no CJK fallback. The bake job now
  installs fonts-noto-cjk + emoji, and the bake double-awaits fonts.ready around a
  force-load so a late-registered display face isn't captured as its fallback.
- The Home templates reveal clipped its last rows: a fixed max-height:6000px
  ceiling + overflow:hidden, but the gallery is ~7300px and grows. Switched to the
  repo's canonical grid-template-rows 0fr->1fr auto-height pattern (inner wrapper
  owns the overflow) so it expands to the gallery's natural height.

* fix(bake): probe-detect deck nav + capture at the 1.31 tile aspect

Refines the deck path after live review:

- Detection was scrollHeight-based, so a deck whose DOM stacks slides
  vertically got misread as a scroll page and vertically panned. Now PROBE the
  navigation: press the arrow key, then nudge the wheel, and use whichever
  actually moves a slide (deckSignal ignores the WebGL background); only a page
  that neither navigates nor fits the viewport keeps the linear pan.
- Decks were captured at 16:9, which object-fit:cover then side-cropped in the
  1.31 tile (hero edges cut). The compact variant is triggered by the normal
  1440 width, NOT by the 1.31 aspect, so capture at 1.31 (1760x1344, wide enough
  to clear the breakpoint): full layout AND a clip that fills the tile, no crop.

---------

Co-authored-by: audit <a@b.c>
2026-06-09 19:04:46 +00:00
lefarcen
c2d16727cd feat(plugins-home): content-hashed preview filenames + immutable CDN caching (#4007)
Baked clips are served from R2 behind Cloudflare's CDN (cf-cache-status: HIT).
Previously a re-bake overwrote the same key (example-x.mp4), so the edge kept
serving the stale clip until its TTL expired. Name each clip
example-x.<contentHash>.mp4 instead: a content change ships a NEW URL the
manifest points at, the old edge entry is simply no longer referenced, and the
upload sets Cache-Control: public, max-age=31536000, immutable so the CDN can
cache each clip forever.
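
The naming scheme above can be sketched roughly as follows. This is a minimal
illustration, not the bake script's actual code: `clipUploadFor` and
`ClipUpload` are hypothetical names, and the hash is assumed to be computed
elsewhere. Only the key pattern and the Cache-Control value come from this
message.

```typescript
// Sketch only: `ClipUpload` and `clipUploadFor` are hypothetical names.
interface ClipUpload {
  key: string; // object key the manifest would point at
  cacheControl: string; // header sent with the upload
}

// Header value quoted from the commit message.
const IMMUTABLE_CACHE_CONTROL = "public, max-age=31536000, immutable";

// Embed the content hash in the key so a changed clip gets a new URL.
function clipUploadFor(baseName: string, contentHash: string): ClipUpload {
  return {
    key: `${baseName}.${contentHash}.mp4`,
    cacheControl: IMMUTABLE_CACHE_CONTROL,
  };
}

// Two bakes of the same plugin with different content never share a key,
// so the stale edge entry is simply never requested again.
const oldClip = clipUploadFor("example-x", "1a2b3c");
const newClip = clipUploadFor("example-x", "4d5e6f");
console.log(oldClip.key, newClip.key);
```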

Co-authored-by: audit <a@b.c>
2026-06-09 14:12:05 +00:00
lefarcen
d0f350e825 feat(plugins-home): pre-baked hover-pan preview clips for the gallery (#3994)
* feat(plugins-home): pre-baked hover-pan preview clips for the gallery

The Community gallery renders every html plugin as a live, scaled
example.html iframe that animates + auto-pans on hover. That is
GPU-expensive at scale (each tile is its own out-of-process document
re-compositing a tall page) and renders inconsistently for tricky pages
(WebGL noise, video backgrounds, lazy content).

Pre-render each preview to a tiny H.264 clip + first-frame poster:

- scripts/bake-plugin-previews.mjs: headless-Chrome screencast of a
  [hold@top in-place animation][linear pan top->bottom] capture, waiting
  on fonts + <img> + CSS background-images + <video> backgrounds first,
  then ffmpeg -> CFR H.264 mp4 + poster.jpg + manifest.json. Velocity is
  pre-computed from page height so the pan always finishes within ~10s.
  Runtime deps (puppeteer-core / Chrome / ffmpeg) stay out of package.json
  and are provided by the CI environment.
- daemon: serves <out> at /api/plugin-previews and attaches the clip to a
  plugin record under od.bakedPreview (a SEPARATE field — the detail modal
  still reads od.preview and opens the live, interactive page).
- web: inferPluginPreview(record, { preferBaked: true }) lets gallery tiles
  opt into the clip; MediaSurface loops the in-place [0, holdMs] span while
  idle and plays the pan on hover, one always-mounted <video> (no black
  flash), looped frame-accurately via requestVideoFrameCallback, with no
  native controls. Plugins without a bake keep the live-iframe fallback.

CI upload to R2 + the post-merge/nightly bake workflow (with content-hash
skip) and the daemon on-demand path land in follow-ups.

* feat(plugins-home): content-hash skip so unchanged plugins reuse their baked clip

The bake now hashes each plugin's preview HTML + a BAKE_VERSION and stores it
in the manifest. Re-running skips any plugin whose hash is unchanged (no render,
no re-encode, and the CI step re-uploads nothing) — editing the page or bumping
BAKE_VERSION invalidates it. Verified: a second pass over already-baked plugins
reuses all of them in ~1s instead of re-rendering.
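
A rough sketch of the skip decision, assuming the manifest can be reduced to a
map from plugin id to stored hash. The `planBake` name and both map shapes are
assumptions for illustration; how the hash is derived from the preview HTML and
BAKE_VERSION is left out.

```typescript
// Sketch only: the map shape and `planBake` are assumptions.
type HashByPlugin = Record<string, string>;

interface BakePlan {
  bake: string[]; // plugins whose hash changed or that have no entry yet
  reuse: string[]; // plugins whose stored hash still matches
}

// `manifest` holds hashes from the previous run; `current` holds hashes
// freshly computed from each plugin's preview HTML plus BAKE_VERSION.
function planBake(manifest: HashByPlugin, current: HashByPlugin): BakePlan {
  const plan: BakePlan = { bake: [], reuse: [] };
  for (const pluginId of Object.keys(current)) {
    if (manifest[pluginId] === current[pluginId]) {
      plan.reuse.push(pluginId);
    } else {
      plan.bake.push(pluginId);
    }
  }
  return plan;
}
```

Because BAKE_VERSION feeds into every hash, bumping it changes all of the
`current` values at once and so moves every plugin into `bake`.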

* feat(plugins-home): point baked-preview URLs at R2 in production

bakedPreviewBlock now builds its poster/video URLs from
OD_PLUGIN_PREVIEWS_BASE_URL (the R2 public origin) when set, falling back to
the daemon's own /api/plugin-previews static route for local dev. The CI bake
uploads the clips to R2 and the deployed daemon points at them there.

* feat(plugins-home): CI workflow to bake + publish plugin previews

Adds .github/workflows/bake-plugin-previews.yml: post-merge (paths:
plugins/_official) + nightly + manual. Each run starts the daemon, bakes ALL
plugins (the content-hash skip makes that cheap — only changed pages re-render,
PREVIEW_REMOTE trusts the manifest hash since CI clips live on R2 not on disk),
`aws s3 cp`s the new clips to R2 (no --delete, so untouched clips stay), and
commits the refreshed manifest back to main.

- The daemon now reads the checked-in manifest from data/plugin-previews/ by
  default (binaries stay on R2; OD_PLUGIN_PREVIEWS_DIR still overrides locally),
  seeded here with an empty manifest so every plugin starts on the live-iframe
  fallback until the first bake lands.

Verification needs a real CI run (R2 secrets + a daemon in CI); the bake script,
hash skip, daemon injection, and web display are all already verified locally.

* ci(plugin-previews): open a reviewed PR for the manifest instead of pushing to main

Protected main can't take a direct push, and the manifest is version-pinned
(ships with the build), so the bake now opens a PR with the refreshed
data/plugin-previews/manifest.json and requests review from @lefarcen rather
than committing straight to main. Only fires when a plugin actually changed.

* ci(plugin-previews): fix invalid YAML — single-line PR body (@ in body broke the literal block)

* ci(plugin-previews): TEMP branch trigger + debug guards (limit 3, skip publish off-main)

* ci(plugin-previews): install puppeteer-core via pnpm (npm chokes on workspace:*)

* ci(plugin-previews): correct R2 secrets (repository-assets bucket) + TEMP branch publish test

* ci(plugin-previews): revert temp branch-debug toggles (workflow verified in CI)

* fix(plugins-home): gate bakedPreview on a fetchable source + fix workflow shellcheck

Review feedback (nettee):
- bakedPreviewBlock now only attaches a baked preview when a remote origin
  (OD_PLUGIN_PREVIEWS_BASE_URL) is set OR the clip files exist on disk. A
  deployment reading the checked-in manifest without the base URL set would
  otherwise emit /api/plugin-previews URLs that 404 (binaries live on R2),
  breaking tiles instead of falling back to the live iframe.
- workflow: `for _` instead of unused `for i` (SC2034) and split the CHROME
  declare/export (SC2155) so the actionlint gate passes.
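
The gate described in the first review item might look roughly like this. It
is a sketch under assumptions: `previewBaseFor` is a hypothetical name, the
on-disk check is passed in as a boolean, and trimming trailing slashes is an
illustrative extra rather than something this commit states.

```typescript
// Route named in the commit message for local dev.
const LOCAL_PREVIEW_ROUTE = "/api/plugin-previews";

// Returns the URL base to build poster/video URLs from, or null when no
// fetchable source exists and the tile should keep the live iframe.
// `remoteBaseUrl` stands in for OD_PLUGIN_PREVIEWS_BASE_URL.
function previewBaseFor(
  remoteBaseUrl: string | undefined,
  clipFilesOnDisk: boolean,
): string | null {
  if (remoteBaseUrl && remoteBaseUrl.length > 0) {
    // Assumption: normalise trailing slashes before joining file names.
    return remoteBaseUrl.replace(/\/+$/, "");
  }
  return clipFilesOnDisk ? LOCAL_PREVIEW_ROUTE : null;
}
```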

* fix(plugins-home): log manifest load failures instead of swallowing them

Review feedback (nettee, non-blocking): loadManifest() caught every read/parse
error and returned {}, so a malformed manifest would silently disable all baked
previews with no trace. Warn so it's diagnosable.

* feat(plugins-home): use baked previews on the example-prompt preset tiles too

The HomeHero '示例提示词' ("Example prompts") preset tiles render the same plugin previews via
PreviewSurface; pass preferBaked so they get the cheap poster + hover-pan clip
instead of a live iframe, matching the gallery.

* ci(plugin-previews): grant pull-requests: write so the manifest PR step can open its PR

Review feedback (nettee): the permissions block only set contents: write, so
pull-requests defaulted to none and gh pr create would 403 on the first run that
changes the manifest.

---------

Co-authored-by: audit <a@b.c>
2026-06-09 13:14:54 +00:00
PerishFire
f9433a7a07 release: harden packaged launcher validation (#3995)
* Unify release metadata publishing

* Allow empty release asset suffix

* Harden beta-s release publishing

* Clean beta Windows release namespace

* Clean Windows launcher namespace

* Retry transient mac notarization uploads

* Control mac notarization upload mode

* Notarize release mac DMGs after build

* Harden release mac DMG notarization retries

* Make beta-s mac updates payload-first

* Bake mac updater metadata URL

* Add unsafe DMG install helper

* Fix beta-s feed and About version

* Improve release validation observability

* Build launcher proto before packaging

* Notarize release-stable mac artifacts

* Enable mac notarization in tools-pack

* test: update packaged release workflow assertions
2026-06-09 11:50:01 +00:00
PerishFire
7231c891f8 Allow release-stable preflight dry runs (#3857)
* Allow release stable preflight dry runs

* Fix stable dry-run nightly branch validation

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-08 04:46:02 +00:00
chaoxiaoche
e2dc6423cd design-systems: guard component manifest token references (#3834)
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-06-07 12:46:16 +00:00
chaoxiaoche
a2809ea336 design-systems: backfill 2.0 package batch 09 (#3791)
* design-systems: backfill 2.0 package batch 09

* fix: repair design token source references

* fix: update latest design token source references

* fix: tighten design token source validation

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-06-07 10:17:03 +00:00
elihahah666
a3a222c6c0 fix: small UX/correctness fixes (composer, media settings, run status, nav tabs) (#3814)
* fix(web): drop the border around the composer "+" trigger

The base button{} rule applied a 1px border to the ComposerPlusMenu
trigger, leaving a visible box around the home composer's "+" icon. Make
the trigger transparent (border + background) with a subtle hover, so it
reads as a plain icon button consistent with the project composer.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

* feat(web): defer media generation settings to in-task AI questions

HyperFrames / Image / Video / Audio composer modes no longer surface
inline pre-flight dropdowns (aspect ratio, duration, model, resolution,
audio kind). Image/Video keep only the design-system picker; HyperFrames
and Audio keep none. Those settings are now asked for by the agent during
the run via the existing question-form / AskUserQuestion flow, mirroring
how Prototype and Slide deck already defer their settings.

- footerInputNamesForChip: image/video -> ['designSystem'], hyperframes/
  audio -> [].
- metadataForHomeMediaComposer no longer seeds imageAspect/videoAspect/
  videoLength/audioKind/audioDuration; only kind (+ the hyperframes-html
  route discriminator) and any picked prompt template.
- queryTemplateForSurface drops the {{ratio}}/{{duration}}/{{model}}/
  {{resolution}}/{{voice}} slots from the prompt body.
- system.ts: image now prints "(unknown - ask: ...)" for model/aspect
  instead of silently defaulting, so the agent asks (video/audio already
  did).

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

* fix(daemon): don't mark a run failed when it produced an artifact

classifyChatRunCloseStatus decided run status purely from the process
exit (code/signal/turnCompletedCleanly) and never checked whether the run
actually produced an artifact. A non-zero exit during teardown (e.g. a
SessionEnd hook) after the deliverable was already written showed a red
"failed" card for work that succeeded.

Add an artifact-aware carve-out: a non-zero NORMAL exit (real exit code,
never a signal) that produced a confirmed artifact this run is classified
succeeded. Gated on code != null && code !== 0 so a signal kill
(OOM / external kill / container shutdown) is never flipped, preserving
the existing guard. The signal comes from scanRunEventsForRetrySideEffects
(artifactWriteSeen || liveArtifactSeen).
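
As a minimal sketch of just the carve-out predicate (not the full
classifyChatRunCloseStatus logic, which is not shown here), assuming the close
event is reduced to the fields named above. `RunClose` and
`artifactCarveOutApplies` are hypothetical names.

```typescript
// Sketch only: a reduced view of the inputs the carve-out needs.
interface RunClose {
  code: number | null; // null when the process was killed by a signal
  signal: string | null;
  artifactWriteSeen: boolean;
  liveArtifactSeen: boolean;
}

// True only for a non-zero NORMAL exit that produced an artifact this run.
function artifactCarveOutApplies(close: RunClose): boolean {
  const artifactProduced = close.artifactWriteSeen || close.liveArtifactSeen;
  // `code != null` excludes signal kills, so those are never flipped.
  const nonZeroNormalExit = close.code != null && close.code !== 0;
  return nonZeroNormalExit && artifactProduced;
}
```

A caller would consult this only after the existing exit-based rules have
decided "failed"; a clean exit (code 0) never reaches the carve-out.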

Also add a one-shot backfill script (pnpm backfill:failed-runs) that
repairs existing rows: failed runs whose project dir contains an
*.artifact.json are corrected to succeeded.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

* fix(web): persist the entry nav-rail expanded state

The rail's open/collapsed state lived only in EntryShell's useState, so it
reset to collapsed whenever EntryShell remounted — e.g. returning to home
after visiting a project, or a reload. Persist it to localStorage
(SSR-guarded read/write; App is loaded with ssr:false so there's no
hydration mismatch) so the rail keeps its state across navigation and
reloads.
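
A guarded read/write pair along these lines is one way to do it. This is an
illustrative sketch, not the EntryShell code: the storage key and function
names are hypothetical, and the store is passed in so the guard is explicit.

```typescript
// Minimal subset of the Web Storage interface the sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const RAIL_KEY = "entry-nav-rail-expanded"; // hypothetical key name

// Read the persisted state, falling back when storage is missing or throws.
function readRailExpanded(store: KeyValueStore | undefined, fallback: boolean): boolean {
  if (!store) return fallback;
  try {
    const raw = store.getItem(RAIL_KEY);
    return raw === null ? fallback : raw === "true";
  } catch {
    return fallback;
  }
}

// Persist the state; storage errors (quota, privacy mode) are ignored.
function writeRailExpanded(store: KeyValueStore | undefined, expanded: boolean): void {
  if (!store) return;
  try {
    store.setItem(RAIL_KEY, String(expanded));
  } catch {
    // Losing persistence is acceptable; the rail still works in memory.
  }
}
```

In a browser the store argument would be something like
`typeof window === "undefined" ? undefined : window.localStorage`, used as the
lazy initial value of the useState call and written on every toggle.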

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

* fix(web): collapse all entry sections into the single leftmost tab

Clicking a sidebar section (Projects / Automation / Design systems /
Plugins / Integrations) opened a second workspace tab instead of reusing
the leftmost one. Treat the leftmost tab as a single entry-tab singleton
(any view) rather than a Home-view singleton: navigating to any entry view
switches that one tab's view in place, and its label follows the view.
Project / marketplace tabs are unchanged and still open to the right.

- syncStateToRoute: any kind:'home' route reuses the entry tab found by
  kind alone and updates its view; project-route guard broadened so
  opening a project from any entry view appends rather than replaces.
- normalizeTabsState: dedup / must-exist / leftmost-pin invariants keyed
  by kind==='entry' instead of view==='home'.
- The non-closable / pinned / drag-pin / new-tab-reuse sites switch from
  view==='home' to kind==='entry'.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

* fix(scripts): correlate backfill repair to the run that produced the artifact

The failed-runs backfill was scoped only by projectId: a single
*.artifact.json anywhere under <dataDir>/projects/<id> flipped every
failed message row in that project to succeeded. Project dirs are
long-lived and accumulate sidecars from unrelated runs, so one
successful artifact reclassified unrelated real failures — an
irreversible data-integrity bug once run without --dry-run.

Now a row is repaired only when the failed message's own
produced_files_json lists a file whose <name>.artifact.json sidecar
still exists on disk, mirroring the daemon's artifactProducedThisRun
carve-out. Path-traversal/absolute refs are rejected.
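
The per-row check could be sketched like this. Names are hypothetical, the
filesystem lookup is injected as a callback, and forming the sidecar path by
appending `.artifact.json` to the listed file name is an assumption about what
`<name>` means above.

```typescript
// Reject absolute paths and any ".." segment in a produced-file reference.
function isSafeRelativeRef(ref: string): boolean {
  if (ref.length === 0) return false;
  if (/^[\\/]/.test(ref)) return false; // leading slash or backslash
  if (/^[A-Za-z]:/.test(ref)) return false; // Windows drive prefix
  return !ref.split(/[\\/]+/).some((segment) => segment === "..");
}

// A failed row is repairable only if one of ITS OWN produced files still has
// a sidecar on disk. `sidecarExists` receives a project-relative path.
function shouldRepair(
  producedFiles: string[],
  sidecarExists: (relativePath: string) => boolean,
): boolean {
  return producedFiles.some(
    (file) => isSafeRelativeRef(file) && sidecarExists(`${file}.artifact.json`),
  );
}
```

A row whose produced_files_json is empty is never repaired, no matter how many
sidecars other runs left in the same project directory.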

Generated-By: looper 0.8.1 (runner=fixer, agent=claude-code)

* fix(web): strip deferred media defaults from the forwarded run inputs

Hiding the media footer pills (model / ratio / resolution / duration /
audioType / voice) stopped them rendering but did not stop their seeded
defaults from reaching the run. buildHomeMediaComposer still seeds
`ratio: 16:9`, `duration: 5`, `audioType: speech`, etc., and submit()
forwarded submittedActive.inputs as pluginInputs after stripping only
fidelity / slideCount / speakerNotes — so an Image/Video/Audio/HyperFrames
run arrived with baked-in defaults and the first-turn AskUserQuestion
discovery flow had nothing left to ask.

Extend ARTIFACT_FOOTER_FIELD_NAMES to cover the deferred media fields, and
split submit() into apply-inputs vs forwarded-inputs: the plugin is still
(re)applied with the full inputs (od-media-generation validates `subject`),
while only the stripped set is forwarded to onSubmit. Computing
inputsEqual against the apply-inputs avoids a spurious re-apply round-trip
on submit. subject / style / aspect / mediaKind are intentionally kept.
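
The apply/forward split can be pictured with a small sketch. The field list is
taken from the names mentioned in this message; `splitSubmitInputs` and the
constant name are hypothetical, and the real ARTIFACT_FOOTER_FIELD_NAMES may
differ.

```typescript
type PluginInputs = Record<string, unknown>;

// Footer fields already stripped before this change, plus the deferred media
// fields named in this commit message.
const DEFERRED_FIELD_NAMES: readonly string[] = [
  "fidelity", "slideCount", "speakerNotes",
  "model", "ratio", "resolution", "duration", "audioType", "voice",
];

// The plugin is applied with the full inputs; only the stripped copy is
// forwarded to the run.
function splitSubmitInputs(inputs: PluginInputs): {
  applyInputs: PluginInputs;
  forwardedInputs: PluginInputs;
} {
  const forwardedInputs: PluginInputs = {};
  for (const name of Object.keys(inputs)) {
    if (DEFERRED_FIELD_NAMES.indexOf(name) === -1) {
      forwardedInputs[name] = inputs[name];
    }
  }
  return { applyInputs: inputs, forwardedInputs };
}
```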

Red spec: tests/components/HomeView.media-options.test.tsx "strips deferred
media settings from the forwarded pluginInputs" — red on the pre-fix
branch (pluginInputs.model === doubao-seedance-2-0-260128), green here.

Generated-By: looper 0.8.1 (runner=fixer, agent=claude-code)

* fix(web): assert prototype fidelity deferral on the forwarded run inputs

The deferred-media-defaults change re-applies the plugin with its full
inputs (hyperframes needs `model` to pick its pipeline, od-media-generation
needs `subject`), so the prototype apply call now legitimately carries
`fidelity`. HomeView.prefill's "binds the Home rail Prototype chip" case
still asserted `fidelity` was absent from the apply-call body, which is the
wrong surface — the deferral invariant is that `fidelity` must not reach the
run, not that it is stripped before apply.

Move the assertion onto the forwarded `pluginInputs` from onSubmit, where the
deferral actually holds, and note in HomeView why hyperframes keeps `model`
at apply time. No source behavior change.

Generated-By: looper 0.8.1 (runner=fixer, agent=claude-code)

* fix(web): strip deferred media defaults from the run-facing snapshot too

Stripping the deferred footer/media fields from onSubmit.pluginInputs was
not enough: submit() still re-applied the plugin from the FULL inputs and
forwarded that snapshot's id as appliedPluginSnapshotId. The daemon renders
`## Plugin inputs` verbatim from snapshot.inputs (server.ts pluginPromptBlock,
packages/contracts plugin-block.ts) and tells the agent not to re-ask about
anything listed there, so an Image/Video/Audio/HyperFrames run still arrived
with `ratio: 16:9` / `duration: 5` / `model: …` baked into the prompt and the
first-turn AskUserQuestion discovery flow stayed suppressed.

Resolve the run-facing snapshot from the deferral-stripped inputs and compare
inputsEqual against the cached snapshot's inputs so the strip forces a fresh
apply. Stripping only removes non-required fields (subject / style / aspect /
mediaKind survive, and no scenario plugin requires a stripped field), so the
od-media-generation apply still validates.

Regression at the prompt/run boundary: tests/components/HomeView.media-options
"resolves the run-facing snapshot from inputs with the deferred media settings
stripped" asserts the apply-call body that yields appliedPluginSnapshotId (the
source of snapshot.inputs) carries no deferred field while subject survives.

Generated-By: looper 0.8.1 (runner=fixer, agent=claude-code)

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
Co-authored-by: Claude Opus 4 <noreply@anthropic.com>
2026-06-07 08:35:31 +00:00
PerishFire
7a1ad1a438 feat(packaged): add launcher payload updates (#3812)
* feat(packaged): add launcher payload runtime

* ci: focus launcher payload windows smoke

* ci: publish beta mac x64 launcher payload

* test(e2e): harden payload updater smoke

* fix(pack): inject launch config for installed windows smoke

* fix(daemon): allow packaged payload resource roots

* fix(updater): relaunch after payload update

* test(e2e): refresh windows updater UI status

* test(e2e): ensure windows smoke reaches main shell

* test(e2e): skip onboarding before windows updater smoke

* fix(updater): detach windows payload relaunch helper

* test(e2e): run payload smoke without updater dry run

* fix(desktop): relaunch windows payload updates on quit

* fix(updater): preserve installed payload relaunch target

* fix(updater): require launcher target for payload updates

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* test(e2e): align mac x64 artifact mode assertion

* test(e2e): canonicalize launcher relaunch root

* fix(updater): validate launcher payload config before activation

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

* fix(updater): require live launcher target for payload updates

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-07 07:38:57 +00:00
chaoxiaoche
b3b5bbeced design-systems: backfill 2.0 package batch 01 (#3781)
* design-systems: backfill 2.0 package batch 01

* fix: repair design-system token audit references

* fix: update design token source references

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-06-06 14:49:28 +00:00
chaoxiaoche
67a8d7cc8e Add derived design-system token outputs (#3734)
* Add derived design system token outputs

* Harden derived design token output guards

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-06-05 11:33:22 +00:00
chaoxiaoche
2df76ee0c9 Improve design system token import contract (#3719)
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-06-05 07:12:40 +00:00
PerishFire
8f6cdaaba1 Harden release-beta-s self-hosted beta lane (#3641)
* Fix Windows smoke launcher payload layout

* Harden release-beta-s runner probe

* Use cmd shell for release-beta-s probe

* Add win beta build script for self-hosted lane

* Keep release-beta-s checkout lightweight

* Use explicit local beta version for release-beta-s

* Install dependencies in beta build script

* Repair Electron dist before Windows beta build

* Add release-beta-s timings and diagnostics

* Polish release-beta-s observability guardrails

* Trace tools-pack cache materialization

* Reuse materialized workspace build cache

* Trace Windows packaging segments

* Trace Windows archive process details

* Reuse installer unpacked materialization

* Preserve reusable Windows unpacked projection

* Add fast smoke telemetry for self-hosted beta

* Cache Windows archive outputs for beta lane

* Tighten fast Windows smoke lane

* Scope Windows beta artifact reporting by target

* Clean stale Windows target artifacts

* Wire Windows Authenticode signing

* Expose self-hosted beta validation modes

* Add core Windows smoke profile

* Decouple core smoke from updater fixture version

* Trace Windows smoke lifecycle steps

* Skip unpacked materialization on NSIS cache hit

* Add Nexu S3 beta publisher

* Harden self-hosted Windows beta lane

* Use Windows PowerShell for publish probe

* Run publish probe via repo script

* Require explicit self-hosted S3 public origin

* Run real installer acceptance for external feeds

* Skip ready-prompt checks for external feeds

* Relax optional self-hosted update inputs

* Tolerate BOM in self-hosted publish probe

* Converge self-hosted beta metadata and signing flow

* Run self-hosted beta metadata prep on the runner lane

* Use Windows PowerShell for self-hosted beta metadata checks

* Bypass execution policy in self-hosted beta metadata checks

* Split self-hosted beta metadata publish

* Suppress mc output in beta publish helpers

* Split Windows NSIS payload cache layers

* Require fresh tool dist metadata

* Preserve Windows overlay payload paths

* Default beta-s gate to core

* Optimize win beta packaging reuse

* simplify windows full smoke lifecycle

* preserve onboarding state after windows upgrade smoke

* fix windows tools-pack test portability

* recover stale tools-pack locks

* Allow auto signing fallback for beta builds

* Add self-hosted beta mac arm64 lane

* Quote self-hosted signing choices

* Run self-hosted beta metadata on mac lane

* Use shallow metadata checkout on self-hosted beta

* Use sparse beta metadata checkout

* Make mac release profile bash-compatible

* Allow file sparse metadata checkout

* Isolate self-hosted metadata checkout

* Clear sparse skip-worktree before mac build

* Isolate self-hosted mac build checkout

* Run self-hosted beta publish without aws cli

* Use unsigned beta prefix for mixed platform publish

* Sparse checkout self-hosted beta publish

* Publish mac beta platform before metadata

* Pass mac publish release profile

* Probe mac signing keychain import

* Expand mac signing preflight variants

* Use sudo keychain helper for self-hosted mac signing

* Route self-hosted mac codesign through sudo wrapper

* Normalize mac signing identity name

* Harden self-hosted mac notarization

* Optimize self-hosted mac release cache

* Fix mac signing helper cleanup

* Retry transient mac notarization uploads

* Keep mac notarization fail-fast

* Reduce mac symlink diagnostics noise

* Normalize self-hosted beta platform inputs

* chore: bump beta base version to 0.9.1

* chore: tune release tools-pack cache retention

* test: add temporary cache validation marker

* chore: key windows nsis base payload by content

* Revert "test: add temporary cache validation marker"

This reverts commit e8487bfd3e.

* chore: probe windows nsis payload cache before materializing

* chore: align self-hosted beta smoke modes

* chore: skip mac smoke report upload when disabled

* chore: default self-hosted beta to publish both platforms

* chore: make beta signing modes explicit

* Add release report artifacts and metatool metadata

* Fix Windows release report zip publishing

* Remove artifact handoff from release beta self-hosted lane

* Fix release beta merge follow-ups

* Fix release beta install regressions

* Restore web build dependencies

* Route Windows beta downloads through mirrors

* Use PowerShell 7 for Windows beta lane

* Fix release beta preflight gates

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Fix Windows beta smoke update reuse

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Fix release beta review follow-ups

* Refresh Nix pnpm dependency hashes

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Fix Windows signtool fallback resolution

* Reject stale beta platform manifests

* Validate beta platform manifest run identity

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Validate beta manifest commit identity

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Allow beta metadata publish reruns

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Build tools-pack fully in beta Windows bootstrap

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Emit r2 metadata for Windows beta publish

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Stabilize beta publish browser test

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Stabilize Windows beta metadata smoke test

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* Restore multi-platform Vela CLI package

* Refresh Nix pnpm dependency hashes

Generated-By: looper 0.9.5 (runner=fixer, agent=codex)

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-05 03:00:55 +00:00
PerishFire
7403dc681e refactor: move daemon routes into routes dir (#3568)
* refactor: move daemon routes into routes dir

* fix: restore components typecheck dependencies

* fix: update host tools route test import

Generated-By: looper 0.9.4 (runner=fixer, agent=codex)

* fix: restore review follow-up patches

* fix: tighten daemon route typing

---------

Co-authored-by: Looper <looper@noreply.github.com>
2026-06-04 07:39:18 +00:00
elihahah666
67077fd36f chore(docs): move translated docs into docs/i18n/ (#3621)
* chore(docs): move translated docs into docs/i18n/

Collect the translated README/QUICKSTART/CONTRIBUTING/MAINTAINERS files
(including the Korean set) into docs/i18n/, leaving only the English sources
in the repo root so the GitHub project home page file list stays clean.
Rewrite internal links for the new layout (../../ for repo-root resources,
sibling filenames between translations), update both switcher conventions,
the i18n-check mixed-layout support, the contributors-wall workflow globs,
TRANSLATIONS.md guidance, and drop now-dead root translation paths from the
fork-PR docs allowlist.

* fix(docs): correct root-relative links in Korean contribution guide

Prefix repo-root targets (scripts/sync-design-systems.ts, TRANSLATIONS.md,
package.json, .github/pull_request_template.md) with ../../ so they resolve
from the new docs/i18n/ depth; sibling translated docs stay bare.

Generated-By: looper 0.8.1 (runner=fixer, agent=claude-code)

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-06-04 06:55:16 +00:00
Thinh Nguyen
30f58ce824 fix: update anthropics/skills upstream URLs to use skills/ subdirectory (#3548)
2026-06-04 03:58:27 +00:00
Tom Huang
c6722d2671 feat: Lexical composer, interactive terminals, comment attachments & browser reference board (#3516)
* feat(daemon): add interactive terminal support with node-pty

- Introduced a new terminal service to manage interactive terminal sessions.
- Added routes for creating, streaming, and managing terminal sessions in the daemon.
- Integrated terminal functionality into the CLI, allowing users to open interactive shells.
- Updated project routes to support conversation seeding for side chats.
- Enhanced the web application to include terminal tab functionality and UI components.

This feature enables users to interact with a terminal directly within the application, enhancing the overall user experience and providing a more integrated development environment.

* feat(web): implement embedded browser module in Design Files workspace

- Added a `+` icon to the Design Files tab for opening a new Browser module.
- The Browser module supports navigation features including back, forward, refresh, and address input.
- Integrated a curated list of design reference URLs for user convenience.
- Implemented browser data clearing functionality via IPC.
- Enhanced desktop runtime to support embedded browser with appropriate security measures.
- Added tests for browser functionality and URL handling.

This commit establishes a new workspace for browsing and referencing design resources directly within the application, improving user experience and accessibility to design tools.

* refactor(web): enhance add tab menu functionality in FileWorkspace

- Updated the add tab button to toggle the visibility of the menu with improved accessibility attributes.
- Refactored the add menu to be portaled to the document body for better positioning and visibility.
- Adjusted CSS styles for the add menu to use fixed positioning and increased z-index for proper layering.
- Minor CSS adjustments in entry layout for consistent padding.

These changes improve the user experience when adding new modules in the FileWorkspace, ensuring the add menu is more accessible and visually consistent.

* feat(daemon): introduce session mode for conversations

- Added a new `session_mode` column to the `conversations` table with a default value of 'design'.
- Implemented logic to handle `session_mode` in conversation creation, updates, and retrieval.
- Enhanced the API to support `session_mode` in conversation requests, allowing for 'chat' or 'design' modes.
- Updated the web application to include a session mode toggle, enabling users to switch between chat and design modes seamlessly.
- Adjusted system prompts to reflect the current session mode, providing context-aware responses.

This feature enhances the user experience by allowing for more flexible conversation management, catering to different interaction styles.

* feat(web): enhance navigation and settings functionality in DesignBrowserPanel and EntryShell

- Introduced a navigation stack in DesignBrowserPanel to manage back and forward navigation states.
- Updated the browser navigation logic to handle URL history and improve user experience.
- Added a settings menu in EntryShell for quick access to language and appearance options.
- Implemented CSS styles for the new settings menu, ensuring a consistent and user-friendly interface.
- Enhanced tests for navigation functionality and settings menu interactions.

These changes improve the overall usability of the application by streamlining navigation and providing easy access to settings.

* feat(daemon): enhance conversation session mode handling

- Added a new `mode` flag to CLI commands for project and conversation creation, allowing users to specify 'design' or 'chat' modes.
- Implemented `normalizeChatSessionModeFlag` function to validate and normalize session mode inputs.
- Updated project routes to handle session mode during conversation creation and updates.
- Enhanced web components to support session mode changes, including new props and handlers for managing session modes in conversations.
- Adjusted UI elements to reflect the current session mode, improving user experience and interaction flexibility.

This update provides a more robust framework for managing conversation modes, catering to diverse user needs and enhancing overall functionality.

* feat(web): enhance HandoffButton and DesignBrowserPanel with improved functionality and styling

- Updated HandoffButton to support framework-specific CLI prompts and improved local project path handling.
- Enhanced DesignBrowserPanel to manage browser history with favicon support and improved address display.
- Introduced new utility functions for formatting addresses and extracting hostnames.
- Refactored CSS styles for better layout and responsiveness across components.
- Added tests for new functionalities in HandoffButton and DesignBrowserPanel, ensuring robust behavior.

These changes improve user experience by streamlining the handoff process and enhancing the design browsing capabilities within the application.

* feat(web): enhance HandoffButton and ProjectView with improved instructions handling and UI updates

- Updated HandoffButton to include a tabbed interface for switching between editor and CLI options, enhancing user experience.
- Added support for opening the AMR website directly from the HandoffButton.
- Refactored ProjectView to implement a modal for custom instructions, allowing users to edit and review instructions more intuitively.
- Improved CSS styles for project instructions modal and handoff menu, ensuring better layout and responsiveness.
- Added keyboard accessibility for closing the instructions modal with the Escape key.

These changes streamline the handoff process and improve the usability of custom instructions within the application.

* feat(daemon): add social share functionality and project folder management

- Introduced a new `share` command in the CLI for building localized social-share targets for Open Design projects.
- Implemented `printShareUsage` and `runShare` functions to handle share requests and display usage instructions.
- Added API routes for social sharing, allowing users to create shareable links for projects.
- Enhanced project routes with new endpoints for listing and creating project folders, improving project organization.
- Updated relevant files and tests to support new functionalities, ensuring robust behavior.

These changes enhance user experience by facilitating social sharing and better project management within the application.

* feat(web): enhance DesignBrowserPanel and DesignFilesPanel with improved address formatting and drag-and-drop functionality

- Refactored `formatAddressDisplay` to utilize a new `formatAddressDisplayParts` function, separating URL and title handling for better clarity.
- Updated `DesignFilesPanel` to improve drag-and-drop interactions, including enhanced directory navigation and file management features.
- Adjusted CSS styles for better visual consistency and responsiveness across components.
- Added tests for new functionalities in `DesignBrowserPanel` and `DesignFilesPanel`, ensuring robust behavior.

These changes improve user experience by streamlining address formatting and enhancing file management capabilities within the application.
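
As a rough sketch of the split described above, one way to separate URL and title handling is shown below. The function names come from this commit, but the parameter list, the returned shape, the `www.` stripping, and the " - " separator are assumptions made for illustration.

```typescript
// Hypothetical shape: the real return type is not shown in this commit.
interface AddressDisplayParts {
  url: string;
  title: string | null;
}

// Produces the two display parts independently so callers can style them apart.
function formatAddressDisplayParts(rawUrl: string, rawTitle?: string | null): AddressDisplayParts {
  let url = rawUrl.trim();
  try {
    const parsed = new URL(url);
    const host = parsed.hostname.replace(/^www\./, "");
    const path = parsed.pathname === "/" ? "" : parsed.pathname;
    url = `${host}${path}`;
  } catch (_err) {
    // Not an absolute URL (for example a search term): show it as typed.
  }
  const title = rawTitle?.trim() || null;
  return { url, title };
}

// Thin wrapper that joins the parts into a single string.
function formatAddressDisplay(rawUrl: string, rawTitle?: string | null): string {
  const { url, title } = formatAddressDisplayParts(rawUrl, rawTitle);
  return title ? `${title} - ${url}` : url;
}
```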

* feat(web): enhance DesignBrowserPanel and DesignFilesPanel with new reference categories and improved folder creation

- Added new reference categories in DesignBrowserPanel, including 'Inspiration', 'Real Interfaces', 'Color', 'Typography', and more, each with curated design resources.
- Improved folder creation logic in DesignFilesPanel to suggest names based on existing folders, enhancing user experience.
- Updated CSS styles for better layout and responsiveness across components, particularly in control rows and search functionalities.
- Added tests for new reference categories and folder creation features, ensuring robust functionality.

These changes enrich the design resource catalog and streamline folder management, improving overall usability within the application.
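
A minimal sketch of name suggestion based on existing folders, under stated assumptions: `suggestFolderName`, the "New folder" base name, and the numeric-suffix scheme are hypothetical and not taken from DesignFilesPanel.

```typescript
// Hypothetical helper: suggests the first unused name, comparing
// case-insensitively so "assets" and "Assets" count as the same folder.
function suggestFolderName(existing: readonly string[], base = "New folder"): string {
  const taken = new Set(existing.map((name) => name.trim().toLowerCase()));
  if (!taken.has(base.toLowerCase())) return base;
  // Try "base 2", "base 3", ... until a free name is found.
  for (let n = 2; ; n += 1) {
    const candidate = `${base} ${n}`;
    if (!taken.has(candidate.toLowerCase())) return candidate;
  }
}
```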

* feat(web): implement reference icon functionality and enhance social share features

- Replaced favicon URL generation with a new `referenceIconUrl` function to provide reliable icons for curated design sites, improving visual consistency in the DesignBrowserPanel.
- Updated the FileViewer component to enhance social share functionality, including clearer messaging for protected deployments and improved UI for sharing links.
- Added CSS styles to support new visual states for social share components, ensuring better user experience.
- Expanded tests for the new `referenceIconUrl` function and social share interactions, ensuring robust functionality.

These changes enhance the design resource presentation and improve the sharing experience within the application.

* feat(web): enhance DesignFilesPanel and FileViewer with improved folder management and social share functionality

- Added optimistic folder path management in DesignFilesPanel to improve user experience during folder creation.
- Updated social share logic in FileViewer to handle protected deployments more effectively, ensuring clearer messaging and UI updates.
- Refactored CSS styles for better layout and responsiveness in both components, enhancing overall usability.
- Expanded tests for new folder management features and social share interactions, ensuring robust functionality.

These changes streamline folder management and enhance the social sharing experience within the application.

* feat(daemon): enhance project tab management with new state handling

- Updated database schema to include a new `state_json` column in the `tabs_state` table for improved project tab state management.
- Implemented functions to normalize and parse project tab states, including handling browser workspace tabs.
- Modified `listTabs` and `setTabs` functions to utilize the new state management features, allowing for better tracking of active tabs and saved states.
- Refactored related types in the contracts and web applications to support the new tab state structure.

These changes improve the functionality and user experience of managing project tabs within the application.

* feat(web): enhance project tab management and browser tab functionality

- Updated project routes and server logic to include `browserTabs` in the request body, improving tab state management.
- Implemented validation for `browserTabs` to ensure it is an array, enhancing error handling.
- Refactored `setTabs` function to accommodate the new structure for managing tabs, including browser-specific tabs.
- Added tests for browser tab state persistence and management in the new `project-tabs-state.test.ts` file, ensuring robust functionality.

These changes improve the user experience by providing better management and persistence of project and browser tabs within the application.
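
The array check on `browserTabs` could be sketched as follows. This is an illustration under assumptions: `parseBrowserTabs`, the `{ id, url }` tab shape, and the result type are hypothetical, and the real route may reject malformed entries rather than drop them.

```typescript
// Hypothetical tab shape; the persisted structure may carry more fields.
interface BrowserTab {
  id: string;
  url: string;
}

type ParseResult =
  | { ok: true; browserTabs: BrowserTab[] }
  | { ok: false; error: string };

// Validates the `browserTabs` field of a request body before it is stored.
function parseBrowserTabs(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be an object" };
  }
  const raw = (body as Record<string, unknown>).browserTabs;
  if (raw === undefined) return { ok: true, browserTabs: [] };
  if (!Array.isArray(raw)) {
    return { ok: false, error: "browserTabs must be an array" };
  }
  const browserTabs: BrowserTab[] = [];
  for (const item of raw) {
    if (typeof item !== "object" || item === null) continue;
    const { id, url } = item as Record<string, unknown>;
    // Keep only well-formed entries (an assumption of this sketch).
    if (typeof id === "string" && typeof url === "string") {
      browserTabs.push({ id, url });
    }
  }
  return { ok: true, browserTabs };
}
```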

* feat(web): enhance ProjectView with improved chat send management and UI updates

- Added support for queue-only chat sends, allowing for better handling of messages during busy conversations.
- Refactored chat send functions to improve state management and persistence of queued messages.
- Updated CSS styles in the DesignFilesPanel for better layout and responsiveness, including search control enhancements.
- Expanded tests for new chat send functionalities and search interactions, ensuring robust behavior.

These changes improve the user experience by streamlining chat interactions and enhancing the overall design file management interface.

* feat(web): improve project tab state management and caching

- Enhanced the `loadTabs` function to utilize cached tab states, improving performance and user experience during data fetching.
- Implemented `normalizeTabsState` and caching functions to manage tab states effectively, including validation and error handling.
- Updated `writeCachedTabs` to ensure the latest state is stored in local storage, facilitating better persistence of tab information.
- Modified `listTabs` in the daemon to include `updatedAt` in the state retrieval, allowing for more accurate tracking of tab updates.

These changes streamline tab management and enhance the overall responsiveness of the application.

* feat(web): integrate Lexical for enhanced text composition and mention handling

- Added Lexical as a dependency to improve text composition capabilities within the chat interface.
- Implemented mention functionality with the creation of `MentionNode` and related serialization/deserialization logic for inline mentions.
- Enhanced `ChatComposer` and `ChatPane` components to support queue-only message sending and improved state management for queued messages.
- Updated `DesignBrowserPanel` and `PreviewDrawOverlay` to incorporate new features for capturing and annotating browser snapshots.
- Refactored various components to streamline interactions and improve user experience during chat and design tasks.

These changes significantly enhance the text editing experience and provide better management of chat interactions, improving overall usability in the application.

* feat(workspace): integrate terminal viewer and enhance chat functionality

- Added a new terminal viewer component with customizable themes and improved styling for better user experience.
- Integrated terminal functionality into the workspace, allowing users to interact with a terminal directly within the application.
- Updated chat components to support active conversation states, enabling seamless message handling and interaction.
- Refactored chat-related props and state management to enhance performance and maintainability.
- Removed deprecated file tree explorer components to streamline the workspace interface.

This update enhances the overall functionality of the workspace, providing users with a more integrated and responsive environment for both terminal and chat interactions.

* feat(workspace): enhance chat and session mode functionality

- Introduced a new `+` launcher in the workspace for easy access to project files and new tabs, including Side Chat and Terminal.
- Added Side Chat functionality that allows users to create context-aware conversations based on existing chats.
- Implemented a new terminal tab type for interactive terminal sessions using `node-pty`, enabling users to run shell commands directly within the workspace.
- Enhanced chat functionality with a new session mode toggle, allowing users to switch between 'design' and 'chat' modes seamlessly.
- Added a feature to copy chat responses as markdown to the clipboard, improving usability and sharing capabilities.
- Updated various components and styles to support these new features, ensuring a cohesive user experience.

This update significantly improves the workspace's interactivity and usability, providing users with more tools for collaboration and development.

* feat(workspace): enhance session mode toggle and chat functionality

- Improved the SessionModeToggle component to include localized guidance cards that provide context-aware descriptions for each mode.
- Updated the UI to display a popover with mode descriptions when hovering over options, enhancing user understanding of the available modes.
- Added a new copy button in the AssistantMessage footer to allow users to copy responses as raw Markdown, improving usability for external documentation.
- Enhanced localization support by updating i18n keys and translations for various languages, ensuring consistent user experience across different locales.
- Refactored styles for the session mode toggle and associated components to improve layout and responsiveness.

This update significantly enhances the user experience by providing clearer guidance and improved functionality in chat interactions.

* feat(styles): enhance session mode toggle styling for improved visibility

- Added new CSS rules to ensure the session mode toggle popover and hover cards are displayed correctly with increased z-index and visibility.
- Updated styles for various chat components to maintain consistent positioning and overflow behavior when session mode elements are present.
- Improved overall layout responsiveness for chat interactions, enhancing user experience during mode transitions.

This update refines the visual presentation of session mode toggles, ensuring they are more accessible and user-friendly.

* feat(web): finish Lexical composer input with atomic mention pills

* fix(web): make browser reference board scroll with a pinned toolbar

The browser tab's default Reference Board (.db-start) is wrapped by
PreviewDrawOverlay's position:absolute container, which is not a flex
parent — so .db-start's flex:1 1 auto never bounded its height and the
board grew to its content height instead of scrolling (and the sticky
.db-reference-toolbar could not pin). Fill the overlay with height:100%
like the .db-webview/.db-fallback siblings already do, restoring scroll
and the sticky toolbar.

* test(daemon): expect updatedAt in persisted tab state round-trip

listTabs() returns the tabs_state row's updatedAt timestamp on the
saved-state path, but these two toEqual assertions predated that field,
so their strict equality checks failed. Match the real shape with
expect.any(Number).

* feat(chat): enhance ChatComposer with session mode management and UI improvements

- Introduced a new `sessionMode` prop to the ChatComposer, allowing users to switch between different session modes (e.g., 'design').
- Added a SessionModeToggle component for improved user interaction and visibility of session options.
- Updated the ToolsTab type to reflect the removal of the pet option, streamlining the tools available in the composer.
- Refactored styles to enhance the visibility and positioning of session mode elements, ensuring a better user experience during mode transitions.
- Improved the handling of draft state and user interactions within the composer, enhancing overall functionality.

These changes significantly improve the ChatComposer's usability and flexibility, providing users with clearer options and a more responsive interface.

* feat(chat): implement caret floating layer for mention and slash popovers

- Introduced a new `CaretFloatingLayer` component to manage the positioning of mention and slash command popovers relative to the caret.
- Enhanced `ChatComposer` to utilize the caret rectangle for accurate popover placement, improving user experience during text input.
- Updated `LexicalComposerInput` to pass caret position data to the trigger handling logic, allowing for dynamic popover adjustments.
- Refactored styles for popovers to ensure consistent appearance and behavior, including improved animations and responsiveness.
- Added accessibility features to mention and slash popovers, enhancing usability for keyboard navigation.

These changes significantly improve the interaction model for mentions and commands within the chat interface, providing a more intuitive and responsive user experience.

* feat(chat): update mention tab order and improve search functionality

- Reordered the mention tab sections in the ChatComposer to prioritize 'Design files' over other categories, enhancing user experience during mentions.
- Updated the search prompt to reflect the new tab order, ensuring clarity in search functionality.
- Enhanced the mention selection logic to accommodate the new tab structure, allowing for a more intuitive navigation experience.
- Added tests to verify the correct display and functionality of the updated mention tabs and search behavior.

These changes significantly improve the usability of the mention feature within the chat interface, making it easier for users to find and select relevant items.

* feat(chat): enhance context management in ChatComposer and HomeHero

- Added functionality to manage MCP servers and connectors within the ChatComposer, allowing users to remove these contexts seamlessly.
- Updated the HomeHero component to support the selection and removal of MCP servers and connectors, improving context handling in user interactions.
- Enhanced the search prompt to include files, ensuring users can search across all relevant categories.
- Refactored related components and styles for better integration and user experience.
- Added tests to verify the correct functionality of the new context management features.

These changes significantly improve the usability of context features in the chat interface, making it easier for users to manage their interactions effectively.

* feat(chat): enhance message handling with session mode and plugin snapshot support

- Added `session_mode` and `applied_plugin_snapshot_json` columns to the database schema to support enhanced message context.
- Updated the `upsertMessage` and `listMessages` functions to handle the new fields, ensuring messages can store and retrieve session mode and plugin snapshot data.
- Enhanced the `ChatComposer` to manage and send the applied plugin snapshot as part of the message context, improving user interaction with plugins.
- Introduced a new `MessageSessionModeChip` component to visually represent the session mode in the chat interface.
- Updated styles for better presentation of session mode and plugin context within messages, enhancing user experience.

These changes significantly improve the context management capabilities in the chat interface, allowing for richer interactions and better tracking of session-specific data.

* feat(chat): add annotation handling and UI improvements in ChatComposer and PreviewDrawOverlay

- Implemented functionality to stage draw annotations into the composer input without sending, enhancing user interaction.
- Added a new button in the PreviewDrawOverlay to allow users to append notes to the input, improving workflow flexibility.
- Updated the ChatComposer tests to verify the correct staging of annotations and their integration into the input.
- Enhanced internationalization support by adding new translation keys for annotation actions across multiple languages.

These changes significantly improve the user experience by providing more intuitive annotation handling and better integration within the chat interface.

* feat(capture): implement page capture functionality and enhance folder management dialogs

- Added new capture functionality to allow users to take snapshots of the current page, improving user interaction with visual content.
- Introduced in-app dialogs for folder creation and moving files, replacing the unsupported window.prompt in the Electron desktop host, enhancing usability across platforms.
- Updated the DesignFilesPanel to support these new dialogs, ensuring a seamless experience for managing project files.
- Enhanced internationalization support by adding new translation keys for folder management actions across multiple languages.

These changes significantly improve the user experience by providing intuitive capture options and streamlined file management within the application.

* feat(projects): implement folder deletion and enhance project file management

- Added a new API endpoint to delete project folders, improving the file management capabilities within the application.
- Introduced utility functions for ensuring project subdirectories and safely deleting folders, enhancing the robustness of folder operations.
- Updated the DesignFilesPanel and FileWorkspace components to support folder deletion actions, providing users with a more intuitive interface for managing project files.
- Enhanced internationalization support by adding new translation keys for folder management actions.

These changes significantly improve the user experience by streamlining folder management and providing clearer options for users to organize their projects effectively.

* feat(composer): enhance BoardComposerPopover with image attachment functionality

- Added support for attaching images to comments, allowing users to upload and preview images directly within the composer.
- Implemented new handlers for image input changes and clipboard pasting, improving the user experience for image uploads.
- Updated the component's props to include image-related callbacks and state management for attached images.
- Enhanced styles for image thumbnails and removal buttons, ensuring a cohesive design with the existing comment popover interface.

These changes significantly improve the functionality of the BoardComposerPopover, providing users with a more interactive and visually rich commenting experience.

* feat(file-viewer): enhance comment attachment functionality with image support

- Updated the `onSendBoardCommentAttachments` prop to accept an additional `images` parameter, allowing for image attachments alongside comments.
- Introduced state management for handling attached images, including functions to add and remove images, and to generate previews.
- Implemented a modal for previewing attached images, improving user interaction when managing comment attachments.
- Updated the `FileWorkspace` component to reflect changes in the props, ensuring consistency across components.

These enhancements significantly improve the commenting experience by enabling users to attach and preview images directly within the file viewer.

* feat(home-hero, project-view, styles): enhance functionality and user experience

- Updated the HomeHero component to prevent unnecessary state changes during programmatic updates, improving user interaction with prompts.
- Enhanced the ProjectView component to support image attachments alongside comments, allowing for a more versatile commenting experience.
- Implemented a new image upload process that queues tasks efficiently, ensuring smooth handling of comment attachments.
- Added CSS to reserve scrollbar space in design files, preventing layout shifts when scrollbars appear, thus enhancing visual stability.

These changes collectively improve the user experience by streamlining interactions and ensuring consistent UI behavior across components.

* feat(preview-comments, chat): enhance comment functionality with attachment support

- Added support for `attachments_json` in the `preview_comments` table, allowing users to attach files to comments.
- Updated relevant functions to handle attachments, including `upsertPreviewComment` and `listPreviewComments`, ensuring attachments are properly managed and displayed.
- Enhanced the `CommentSidePanel` to render attached files, providing users with a visual representation of their attachments.
- Improved the `BoardComposerPopover` and `ChatPane` components to support image attachments, including drag-and-drop functionality for reordering queued sends.

These changes significantly enhance the commenting experience by enabling users to attach and manage files directly within the chat interface, improving overall usability and interaction.

* refactor(comment-attachments): rename attachment normalization functions for clarity

- Updated the `normalizeCommentAttachments` function to `normalizePreviewCommentAttachments` for better context in handling preview comment attachments.
- Adjusted the `upsertPreviewComment` and `normalizePreviewComment` functions to utilize the renamed attachment normalization function.
- Added tests to ensure that image attachments are correctly saved and retrieved, addressing a regression issue with attachment persistence.

These changes enhance code clarity and maintainability while ensuring the functionality for handling comment attachments remains robust.

* feat(comment-attachments): enhance comment submission with image and note validation

- Updated the `upsertPreviewComment` function to require either a text note or at least one image attachment for comment submission, improving validation logic.
- Modified the `BoardComposerPopover` to allow saving comments with only images, enhancing user experience by simplifying the commenting process.
- Adjusted the `FileViewer` to support saving comments with image-only notes, ensuring consistency across components.
- Improved styles in the chat and home hero components for better visual representation of attachments and comments.

These changes collectively enhance the commenting functionality, providing users with more flexibility in how they submit comments while ensuring robust validation.

* feat(project-view): implement auto-start for queued chat sends

- Replaced the `startingQueuedChatSendIdRef` with a state variable `queuedAutoStartBlocked` to manage the auto-start behavior of queued chat sends.
- Updated the `useEffect` to ensure that queued sends are processed one at a time after the active conversation completes, enhancing the chat experience.
- Added a test to verify that queued sends auto-start correctly after the active run finishes, ensuring reliable functionality.

These changes improve the handling of queued chat messages, providing a smoother user experience during conversations.

* feat(project-view): refine auto-start logic for queued chat sends

- Replaced the state variable `queuedAutoStartBlocked` with a reference `startingQueuedChatSendIdRef` to manage the auto-start behavior of queued chat sends more effectively.
- Updated the `useEffect` to ensure that queued sends are processed sequentially, improving the handling of chat messages during active conversations.
- Introduced a new state variable `queuedAutoStartTick` to track the auto-start process, enhancing the responsiveness of the chat interface.

These changes improve the reliability and user experience of the chat functionality by ensuring queued messages are handled smoothly and efficiently.

* feat(comment-attachments): improve attachment handling in preview comments

- Updated the `upsertPreviewComment` function to merge existing and incoming attachments, ensuring that image attachments are preserved when updating comments without new files.
- Introduced a new helper function, `mergePreviewCommentAttachments`, to handle the merging of attachments without duplicates, enhancing the robustness of attachment management.
- Added tests to verify the correct merging of attachments and the preservation of existing attachments during comment updates, improving overall functionality and user experience.

These changes enhance the commenting system by providing better management of image attachments, ensuring users can update comments seamlessly while retaining their attached images.
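
A sketch of duplicate-free merging in the spirit of `mergePreviewCommentAttachments`. The attachment shape and the choice of `path` as the identity key are assumptions; the actual helper may key on something else.

```typescript
// Hypothetical attachment shape for illustration.
interface PreviewCommentAttachment {
  path: string;
  name?: string;
}

// Keeps existing attachments first, appends incoming ones, and drops any
// entry whose path was already seen. An update with no new files therefore
// returns the existing attachments unchanged.
function mergePreviewCommentAttachments(
  existing: readonly PreviewCommentAttachment[],
  incoming: readonly PreviewCommentAttachment[] = [],
): PreviewCommentAttachment[] {
  const seen = new Set<string>();
  const merged: PreviewCommentAttachment[] = [];
  for (const attachment of [...existing, ...incoming]) {
    if (seen.has(attachment.path)) continue;
    seen.add(attachment.path);
    merged.push(attachment);
  }
  return merged;
}
```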

* feat(BoardComposerPopover): enhance popover positioning and measurement

- Updated the `popoverAnchorStyle` function to incorporate viewport scroll positions, ensuring the popover remains within visible bounds.
- Introduced a new `PopoverSize` type to manage measured dimensions, improving the accuracy of popover placement.
- Implemented a `useLayoutEffect` to dynamically measure the popover size and adjust its position accordingly, enhancing user experience during interactions.
- Added tests to verify that the popover correctly adjusts its position based on target and viewport dimensions, ensuring it remains fully visible.

These changes improve the usability of the BoardComposerPopover by ensuring it is properly positioned within the viewport, enhancing the overall commenting experience.
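
The positioning idea can be sketched as a clamp over page coordinates. `clampPopoverPosition`, the 8px margin, and the "below the target" default placement are assumptions standing in for the real `popoverAnchorStyle`; only the `PopoverSize` name comes from this commit.

```typescript
// Target rectangle in viewport coordinates (as from getBoundingClientRect).
interface TargetRect { left: number; top: number; width: number; height: number; }
interface PopoverSize { width: number; height: number; }
interface ViewportInfo { scrollX: number; scrollY: number; width: number; height: number; }

const POPOVER_MARGIN = 8; // assumed gap between popover and viewport edge

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Places the popover just below the target, then clamps it so the measured
// size stays inside the visible part of the page.
function clampPopoverPosition(
  target: TargetRect,
  size: PopoverSize,
  viewport: ViewportInfo,
): { left: number; top: number } {
  const preferredLeft = target.left + viewport.scrollX;
  const preferredTop = target.top + target.height + POPOVER_MARGIN + viewport.scrollY;
  const minLeft = viewport.scrollX + POPOVER_MARGIN;
  const maxLeft = viewport.scrollX + viewport.width - size.width - POPOVER_MARGIN;
  const minTop = viewport.scrollY + POPOVER_MARGIN;
  const maxTop = viewport.scrollY + viewport.height - size.height - POPOVER_MARGIN;
  return {
    // Math.max guards the case where the popover is larger than the viewport.
    left: clamp(preferredLeft, minLeft, Math.max(minLeft, maxLeft)),
    top: clamp(preferredTop, minTop, Math.max(minTop, maxTop)),
  };
}
```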

* feat(comment-attachments): enhance image attachment handling in comments

- Updated the `normalizeCommentAttachments` function to include image attachments in the comment payload, allowing comments to be submitted with only images.
- Introduced a fallback message for comments that consist solely of image attachments, improving user guidance.
- Enhanced the `renderCommentAttachmentHint` function to display image attachment details, ensuring users are informed about attached images.
- Added tests to verify that image attachments are preserved in comment submissions and correctly rendered in hints, improving overall functionality and user experience.

These changes enhance the commenting system by providing better support for image-only comments, ensuring users can effectively utilize image attachments in their interactions.

* feat(comment-attachments): refine comment context handling and enhance drag-and-drop functionality

- Updated the `normalizeCommentAttachments` function to conditionally omit comment text when the context is set to 'query', improving clarity in comment submissions.
- Enhanced the `renderCommentAttachmentHint` function to only display comments when not in 'query' context, ensuring a cleaner output.
- Implemented drag-and-drop functionality in the `CommentSidePanel` for reordering comments, improving user interaction and organization of comments.
- Added tests to verify the correct handling of comment context and the functionality of the drag-and-drop feature, ensuring robust performance.

These changes enhance the commenting system by providing clearer context management and improved usability through drag-and-drop capabilities.

* feat(FileViewer, ProjectView): enhance comment attachment handling and status updates

- Updated the `FileViewer` component to manage the state of sent comment IDs, ensuring that active previews are correctly updated after sending comments.
- Refined the `ProjectView` component to filter out comment attachments from the board-batch source and update their status to 'applying' during processing, improving user feedback on attachment handling.
- Introduced logic to handle the removal of sent comment IDs from the preview, enhancing the overall user experience when managing comments and attachments.

These changes improve the functionality and responsiveness of the commenting system, providing clearer feedback and better management of comment attachments.

* feat(conversation-forking): introduce message-based conversation forking

- Added the ability to fork conversations from a specific message, enhancing the chat experience by allowing users to create new conversations that inherit context up to a chosen point.
- Updated the CLI commands and help documentation to reflect the new `--fork-after` option, which specifies the message ID at which copying stops.
- Enhanced the backend to handle the new forking logic, ensuring that only messages up to the specified ID are included in the new conversation.
- Implemented tests to verify the forking functionality, ensuring robust performance and correct behavior when forking conversations.

These changes improve the flexibility of conversation management, allowing users to create tailored discussions based on previous interactions.
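
A minimal sketch of the message selection behind a fork. The `messagesForFork` name and message shape are hypothetical, and treating the specified message as included in the copy is an assumption based on the option name.

```typescript
// Hypothetical message shape for illustration.
interface ForkableMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
}

// Returns the messages a fork would inherit. Without an ID the whole
// history is copied; with one, copying stops at that message (inclusive
// here, which is an assumption of this sketch).
function messagesForFork(
  messages: readonly ForkableMessage[],
  forkAfterId?: string,
): ForkableMessage[] {
  if (forkAfterId === undefined) return messages.slice();
  const index = messages.findIndex((message) => message.id === forkAfterId);
  if (index === -1) {
    throw new Error(`message not found: ${forkAfterId}`);
  }
  return messages.slice(0, index + 1);
}
```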

* feat(terminal-service): enhance session output management and memory handling

- Introduced new parameters for managing session output, including `maxBufferBytes`, `exitTailBytes`, `flushIntervalMs`, and `flushThresholdBytes`, to optimize memory usage and performance.
- Implemented a `trimBuffer` function to efficiently evict old events based on byte and count limits, improving memory management during active sessions.
- Added logic to coalesce buffered PTY output into single `data` events, reducing the frequency of event emissions and enhancing performance during high-throughput scenarios.
- Updated session event structure to include `byteLength`, allowing for better tracking of output size and memory usage.

These changes improve the efficiency and responsiveness of terminal sessions, ensuring better resource management and user experience.
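
The eviction step could look roughly like the sketch below. `maxBufferBytes` and `byteLength` are named in this commit; `maxEvents`, the event shape, and the rule of always keeping the newest event are assumptions.

```typescript
// Hypothetical event shape; `byteLength` is precomputed when the event is buffered.
interface SessionEvent {
  seq: number;
  data: string;
  byteLength: number;
}

interface BufferLimits {
  maxBufferBytes: number;
  maxEvents: number; // assumed name for the count limit
}

// Evicts the oldest events in place until both limits hold, always keeping
// the newest event. Returns the number of bytes still buffered.
function trimBuffer(events: SessionEvent[], limits: BufferLimits): number {
  let totalBytes = 0;
  for (const event of events) totalBytes += event.byteLength;

  let evict = 0;
  while (
    evict < events.length - 1 &&
    (totalBytes > limits.maxBufferBytes || events.length - evict > limits.maxEvents)
  ) {
    const oldest = events[evict];
    if (oldest === undefined) break;
    totalBytes -= oldest.byteLength;
    evict += 1;
  }
  // One splice instead of repeated shift() calls keeps eviction linear.
  events.splice(0, evict);
  return totalBytes;
}
```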

* feat(workspace-context): enhance project attachment handling and workspace context management

- Introduced `formatProjectAttachmentHint` function to render project attachments in a user-visible order, improving clarity for users referencing attachments.
- Added `normalizeWorkspaceContextItems` function to standardize workspace context items, ensuring consistent handling of various context types.
- Updated `mergeRunContextSelections` to include workspace items, enhancing the context management during chat interactions.
- Enhanced `renderRunContextPrompt` to display active workspace context, providing users with better visibility of their current workspace state.
- Implemented tests for new functions to ensure robust functionality and correct behavior in handling project attachments and workspace contexts.

These changes improve the user experience by providing clearer context and better management of project attachments within the application.

* feat(database, chat): enhance message and conversation data structure

- Added `run_context_json` to the `messages` table schema to store contextual information for each message.
- Updated migration logic to include the new `run_context_json` field and ensure backward compatibility.
- Enhanced conversation retrieval to include `messageCount`, providing better insights into the number of messages per conversation.
- Improved attachment handling in the `ChatComposer` component by introducing ordering logic for attachments, ensuring a consistent user experience.
- Refactored `SessionModeToggle` to simplify state management and improve tooltip visibility.

These changes enhance data management and user interaction within the chat application, providing clearer context and improved functionality.

* feat(attachment-handling): improve attachment sorting and context management

- Introduced `sortAttachmentsByUserOrder` function to ensure attachments are displayed in a user-defined order, enhancing clarity and usability.
- Updated `historyWithApiAttachmentContext` to utilize the new sorting function, improving the context provided with message histories.
- Refactored `buildAnthropicMessageContent` to apply sorting for image attachments, ensuring consistent handling across different message types.

These changes enhance the user experience by providing better organization and visibility of attachments within the chat application.

* feat(chat): enhance conversation listing and loading states

- Improved the SQL query for listing conversations to include message counts, providing better insights into conversation activity.
- Added loading states to the ChatPane component, enhancing user experience during data fetching.
- Implemented a search feature in the conversation history, allowing users to filter conversations by title for easier navigation.
- Updated styles for loading indicators and conversation list to improve visual feedback during loading states.

These changes enhance the usability and responsiveness of the chat interface, providing users with clearer context and improved interaction capabilities.

* feat(chat): optimize suggestion filtering and enhance design system integration

- Refactored suggestion filtering in the ChatComposer component to utilize `useMemo`, improving performance by memoizing results based on dependencies.
- Added new props in ChatPane for handling plugin and design system details, enhancing the integration of these features within the chat interface.
- Updated the FileWorkspace component to manage tab states for design system and browser tabs, improving user navigation and context management.
- Introduced modals for displaying plugin and design system details in the ProjectView, enhancing user experience by providing contextual information.

These changes improve the efficiency of suggestion handling and enhance the overall user experience in managing plugins and design systems within the chat application.

* fix(chat): adjust styling and layout for improved chat interface

- Reduced the conversation row height in ChatPane for better alignment with design specifications.
- Updated CSS styles across various components to enhance layout consistency, including adjustments to margins, padding, and flex properties.
- Improved visual feedback and responsiveness in chat elements, ensuring a more cohesive user experience.

These changes refine the chat interface, making it more visually appealing and user-friendly.

* feat(comments): implement structural equality checks for comment snapshots

- Added `commentSnapshotOverlayEqual` and `commentSnapshotEqual` functions to compare comment snapshots based on their structural properties, improving performance by avoiding unnecessary state updates in the `FileViewer` component.
- Updated `HtmlViewer` to utilize these equality checks, optimizing the handling of live comment targets during pointer movements and hover events.
- Enhanced the overall responsiveness of the comment system by preventing redundant re-renders when comment snapshots remain unchanged.

These changes enhance the efficiency of comment handling and improve user experience in the commenting interface.

* feat(chat): enhance comment attachment sorting and order management

- Introduced `sortChatCommentAttachmentsByOrder` function to ensure comment attachments are displayed in a user-defined order, improving clarity and usability.
- Updated the `currentCommentAttachments` function to utilize the new sorting logic, enhancing the organization of attachments.
- Adjusted the order assignment for visual attachments to ensure consistent handling based on existing attachment orders.

These changes improve the user experience by providing better organization and visibility of comment attachments within the chat application.

* feat(chat): enhance queued send strip with overflow handling and styling improvements

- Added overflow handling to the queued send strip, allowing for better visibility of additional queued items when the list exceeds the visible limit.
- Updated CSS styles for the chat components, including adjustments to layout, padding, and font sizes to improve overall aesthetics and usability.
- Refactored the structure of the queued send row to utilize a grid layout, enhancing alignment and responsiveness of the elements within the chat interface.

These changes improve the user experience by providing clearer visibility of queued messages and a more polished interface.

* feat(database): add index for created_at and enhance comment retrieval

- Introduced a new index on `preview_comments` for `created_at` to optimize query performance when retrieving comments.
- Updated the `listPreviewComments` function to order results by `created_at` and `rowid`, improving the organization of displayed comments.
- Enhanced the test suite to verify the injection of the new URL preview selection bridge and its functionality in various scenarios.

These changes improve the efficiency of comment retrieval and enhance the user experience in the commenting interface.

* feat(tooltip): implement tooltip system for enhanced user guidance

- Introduced a new `TooltipLayer` component to manage tooltip display across the application, improving user interaction by providing contextual information on hover and focus.
- Updated various components to utilize the tooltip system, including buttons and icons, ensuring consistent tooltip behavior and styling.
- Enhanced CSS styles for tooltips, improving visibility and responsiveness, while maintaining a cohesive design across the application.

These changes enhance the user experience by providing clearer guidance and improving the overall usability of interactive elements.

* feat(tooltip): enhance tooltip integration across components

- Updated various components to include tooltip functionality, improving user guidance with contextual information on hover and focus.
- Added `data-tooltip` and `data-tooltip-placement` attributes to buttons and interactive elements for consistent tooltip behavior.
- Enhanced CSS styles to ensure tooltips are displayed correctly and responsively, maintaining a cohesive design across the application.

These changes improve the overall user experience by providing clearer guidance and enhancing the usability of interactive elements.

* feat(project-view): refactor comment handling and enhance tooltip positioning

- Introduced `mergeSavedPreviewComment` function to streamline the management of preview comments, improving clarity and maintainability.
- Updated `ProjectView` to utilize the new comment merging function, enhancing the efficiency of comment updates.
- Refactored `TooltipLayer` to use x and y coordinates for positioning, improving tooltip display accuracy and responsiveness.
- Enhanced CSS styles for tooltips, ensuring better visibility and layout consistency across the application.

These changes improve the user experience by providing more efficient comment handling and refined tooltip interactions.

* feat(design-files): enhance workspace hint and file handling in project context

- Introduced `formatDesignFilesWorkspaceHint` function to provide a detailed overview of the current Design Files workspace, including folder and file listings.
- Added limits for the number of folders and files displayed to improve clarity and prevent overwhelming users with excessive information.
- Updated the `startServer` function to integrate the new workspace hint, ensuring that the context of existing project files and folders is communicated effectively.
- Enhanced tests for the new workspace hint functionality to ensure accurate representation of project context.

These changes improve user experience by providing clearer insights into the Design Files workspace and facilitating better project management.

* feat(composer): atomic @mention keyboard navigation and deletion

* feat(database): add attachments_json field to comments and enhance migration logic

- Introduced `attachments_json` field in the database schema for comments to support attachment storage.
- Updated migration functions to include the new field, ensuring existing data is properly migrated.
- Refactored related SQL queries to accommodate the new field, improving data handling for comments.

These changes enhance the comment functionality by allowing attachments to be stored and retrieved effectively.

* chore(nix): refresh pnpm deps hash

* feat(chat-composer): implement design toolbox for enhanced design actions

- Added a new design toolbox feature to the ChatComposer, allowing users to access various design actions such as 'auto-match', 'motion', and 'visual-polish'.
- Introduced a state management system for the design toolbox, including hooks for opening and closing the toolbox.
- Enhanced the user interface with new components and styles for the design toolbox, improving accessibility and usability.
- Implemented functionality to apply design actions directly from the toolbox, streamlining the design workflow.
- Added tests to ensure the correct behavior of the design toolbox and its interactions within the ChatComposer.

These changes significantly enhance the design capabilities within the ChatComposer, providing users with a more efficient and intuitive design experience.

* feat(chat-composer): enhance design toolbox resource management

- Expanded the design toolbox functionality in ChatComposer to include a comprehensive resource index, allowing for better organization and retrieval of skills, plugins, MCP servers, templates, connectors, and project files.
- Introduced new types and interfaces to support the expanded resource management, improving type safety and clarity in the codebase.
- Updated the design toolbox action descriptions and search functionality to reflect the new resource capabilities, enhancing user experience.
- Added tests to validate the new resource indexing and search features, ensuring robust functionality.

These enhancements significantly improve the design workflow by providing users with a more organized and efficient way to access various design resources.

* feat(workspace): enhance workspace context management and UI integration

- Introduced a new function `renderWorkspaceContextToolHints` to provide contextual hints based on the type of workspace items (browser, terminal, files, live artifacts).
- Updated `ChatComposer`, `ChatPane`, and `FileWorkspace` components to support and display workspace context items, improving user interaction and accessibility.
- Enhanced the `QuickSwitcher` and `TabLauncherMenu` components to include workspace context items in search results, allowing users to navigate between tabs and files more efficiently.
- Added new translations and updated existing ones to reflect the inclusion of workspace tabs in the user interface.

These enhancements significantly improve the usability and functionality of the workspace, providing users with better context and navigation options.

* test(e2e): assert Lexical composer content with toHaveText, not toHaveValue

The chat composer and home hero input are now Lexical contenteditable
editors, not native form controls, so Playwright's toHaveValue (form-only)
fails with "Not an input element". Switch all chat-composer-input and
home-hero-input content assertions to toHaveText, and assert multi-line
soft-break cases with separate toContainText checks since the editor's
textContent collapses the newline (the newline reaching the sent payload
is already covered by downstream message/payload assertions).

* chore(nix): refresh pnpm deps hash

* chore(nix): refresh pnpm deps hash

* test(e2e): open settings through the entry settings menu

The home settings entry is now a menu (EntrySettingsMenu): clicking the
gear opens a popover whose "Settings" item opens the full execution-mode
dialog. Update the three specs that assumed a single click opened the
dialog directly to go through the menu trigger + open-details item.

* test(web): drive HomeView context picker through the Lexical helper

The home hero input is a Lexical contenteditable, so fireEvent.change /
reading .value throws "element does not have a value setter". Switch the
MCP+connector first-turn-context spec to setHomeHeroPrompt / homeHeroPromptText
like the rest of the file already does.

* Fix PR review blockers

* Update Nix pnpm deps hash

* Enhance project folder routes with error handling and tests

- Added checks for project existence in GET, POST, and DELETE folder routes, returning a 404 error if the project is not found.
- Updated tests to verify 404 responses for unknown project IDs in folder operations.
- Improved folder metadata handling in project routes.
- Refactored the TabLauncherMenu component for better UI structure and scrolling behavior.
- Adjusted styles for the TabLauncherMenu to improve usability and visual consistency.

* Update ProjectView component and enhance design toolbox localization

- Modified the ProjectView component to include workspacePanelTrack in chat panel width adjustments.
- Updated localization files to add new keys and translations for the design toolbox in English, Simplified Chinese, and Traditional Chinese.
- Enhanced user experience by providing comprehensive tooltips and prompts for design toolbox actions.

* Add browser use prompt launcher

* Cover browser use prompt launcher

* Add design toolbox badge translations

* Refactor ChatComposer and DesignBrowserPanel components

- Removed the design system picker from the ChatComposer component and integrated it into the StagedRunContexts for better context handling.
- Updated the DesignBrowserPanel to include a new function for determining viewport icons and modified the browser use categories to reflect changes in action prompts and titles.
- Enhanced the HandoffButton component with a new feature to copy the project path to the clipboard, improving user experience.
- Added new styles for search input and empty states in design files, enhancing the UI consistency across components.

* Enhance ChatPane and DesignBrowserPanel components with new features and styles

- Added scroll handling features to the ChatPane component, including scrollable state management and improved user experience during chat interactions.
- Updated the DesignBrowserPanel to include new localization keys and improved category titles for better clarity in the browser use prompt.
- Enhanced styles for chat log scrolling behavior, providing a more intuitive interface for users.
- Implemented keyboard shortcuts for tab navigation in the WorkspaceTabsBar, improving accessibility and user efficiency.

* Fix web workspace test hang

* Update styles and functionality for workspace tabs and chat components

- Adjusted CSS for the MAC_WINDOW_CHROME to improve spacing and margins for better layout.
- Enhanced the ChatComposer component to ensure proper tab order and mention handling, improving user experience.
- Implemented keyboard shortcuts for navigating workspace tabs, allowing for more efficient tab management.
- Updated localization strings for clarity in search prompts across multiple languages.
- Improved the HandoffButton component's UI for better visibility and interaction when copying project paths.

* Update visual settings flow

* Refresh Nix pnpm deps hashes

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: qiongyu1999 <2694684348@qq.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-06-03 15:46:22 +00:00
Amy
8fefebbbaf test(e2e): add priority tiers and main UI alerts (#3574)
* test(e2e): add priority tiers and stabilize p0 coverage

* test(e2e): align restoration artifact reopen path

* test(e2e): stabilize p1 workspace flows

* ci(e2e): run extended UI on main and notify failures

* fix(e2e): repair priority preflight checks

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)

* fix(e2e): restore AMR login pill coverage

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)

* fix(e2e): add full UI test script

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)

* fix(e2e): remove disallowed ui script

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)

* fix(e2e): allow full UI script in guard

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)
2026-06-03 12:08:06 +00:00
Marc Chan
b6105a8cab refactor(web): extract shared UI primitives (#2879)
* refactor(web): extract shared UI primitives

* fix(web): include components in packaged builds

* fix(nix): refresh daemon pnpm deps hash

* fix(components): trim unused primitive exports

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* refactor(components): split primitives into modules

* fix(web): restore finalize cancel link

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* docs: prefer shared web components

* refactor(web): migrate easy primitives

* fix(web): preserve migrated control styling

* fix(nix): refresh daemon pnpm deps hash

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix(web): keep Continue in CLI action styling

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix(web): keep finalize action styling

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* chore(nix): refresh pnpm deps hash

* fix(web): preserve MCP row label styling

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix(web): resolve ChatComposer merge conflict

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(e2e): stop leaked tools-dev runtime before retry

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* ci: include packages/components in change scopes

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(pack): include components in linux internal packages

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(pack): ship components package in packed installs

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(pack): align win app test with internal packages

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(components): use global visually hidden class

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(components): add development export for source builds

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(components): restore selected custom select state

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(tools-pack): rebuild components for packaged web builds

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* test(tools-pack): cover components workspace artifacts

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

* fix(ci): include components in tools-pack scope

Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-06-03 03:22:29 +00:00
Denis Redozubov
3da33f92a1 Harden sandbox orchestration daemon chokepoints (#3420)
* Harden sandbox orchestration chokepoints

* Cover web app public copy in neutrality guard
2026-06-02 07:33:12 +00:00
kami
333a62cda6 fix: link od bin after fresh install (#2069)
* fix: link od bin after fresh install

* test: lock root od bin shim path

* test: cover root workspace deps in postinstall scan

* chore(nix): refresh pnpm deps hash
2026-05-31 04:36:49 +00:00
lefarcen
da19ff3ca0 feat(mocks): replay-based mock CLIs for 14 of OD's supported agents (opencode/codex/claude/gemini/cursor-agent/deepseek/qwen/grok + ACP family devin/hermes/kilo/kimi/kiro/vibe) (#3241)
* feat(mocks): replay-based mock CLIs for opencode/claude/codex/deepseek/qwen/grok

Drops in a `mocks/` top-level dir that pretends to be the real agent
CLIs by streaming pre-recorded sessions in each CLI's native stdout
protocol. Zero LLM tokens.

## Use cases

- **E2E tests** in `apps/daemon/tests/` — exercise the full chat-server
  pipeline against a known trace, assert UI events / artifacts.
- **Self-validation during dev** — iterate on `claude-stream.ts` /
  `json-event-stream.ts` parser changes without burning provider budget.
- **Regression harness** — replay the same trace before and after a
  charter / parser change; diff the daemon events the UI surfaces.
- **Demo / onboarding** — show what a 17-tool claude editing session
  looks like end-to-end, offline.

## How

- 6 bash wrappers (`mocks/bin/`) shadow the real CLIs when PATH-overlaid.
- `mocks/mock-agent.mjs` reads `mocks/recordings/<trace>.jsonl`, picks
  one via env var (`SYNCLO_EXPLORE_MOCK_TRACE` / `_POOL` /
  `_BY_PROMPT_HASH`), streams the trace in the requested format.
- Each format renderer matches the EXACT JSON shape the OD daemon
  parser expects, verified line-by-line against
  `apps/daemon/src/{json-event-stream,claude-stream}.ts`:

  | CLI                       | streamFormat              | parser source                              |
  | ------------------------- | ------------------------- | ------------------------------------------ |
  | `opencode`                | `json-event-stream`       | `handleOpenCodeEvent`                      |
  | `codex`                   | `json-event-stream`       | `handleCodexEvent`                         |
  | `claude`                  | `claude-stream-json`      | `createClaudeStreamHandler`                |
  | `deepseek` `qwen` `grok`  | `plain`                   | `server.ts` (raw stdout)                   |

## Quick start

```bash
export PATH="$PWD/mocks/bin:$PATH"
export SYNCLO_EXPLORE_MOCK_TRACE=04097377   # 8-char prefix OK
export SYNCLO_EXPLORE_MOCK_NO_DELAY=1

echo "any prompt" | opencode run
echo "any prompt" | claude -p --output-format=stream-json
echo "any prompt" | codex exec
```

The mock binary announces the picked trace id on stderr:
`[mock-opencode] picked 04097377… via fixed`.

Recording selection (env, in priority order):
- `SYNCLO_EXPLORE_MOCK_TRACE=<id>` — fixed (prefix OK)
- `SYNCLO_EXPLORE_MOCK_BY_PROMPT_HASH=1` + stdin prompt — `sha256(prompt) % N`
- `SYNCLO_EXPLORE_MOCK_POOL=<tag>` — random within `agent:claude` /
  `skill:agent-browser` / `outcome:failed` / etc.
- (default) uniform random
- `SYNCLO_EXPLORE_MOCK_SEED=<str>` — reproducible "random"
- `SYNCLO_EXPLORE_MOCK_NO_DELAY=1` — skip inter-event waits
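
The priority order above could be resolved roughly as follows. This is a sketch under assumptions: `resolveSelection` and the mode names other than `fixed` are hypothetical, the `=== '1'` checks are a guess at how the flags are parsed, and the actual hash computation is left to the caller.

```javascript
// Hypothetical sketch, not the actual mock-agent.mjs code.
function resolveSelection(env, prompt) {
  // Modifiers that apply regardless of which strategy wins.
  const seed = env.SYNCLO_EXPLORE_MOCK_SEED ?? null; // reproducible "random"
  const noDelay = env.SYNCLO_EXPLORE_MOCK_NO_DELAY === '1'; // skip inter-event waits

  let strategy;
  if (env.SYNCLO_EXPLORE_MOCK_TRACE) {
    // 1. Fixed trace id (a prefix is enough).
    strategy = { mode: 'fixed', idPrefix: env.SYNCLO_EXPLORE_MOCK_TRACE };
  } else if (env.SYNCLO_EXPLORE_MOCK_BY_PROMPT_HASH === '1' && prompt) {
    // 2. Deterministic pick; the caller would compute sha256(prompt) % N.
    strategy = { mode: 'prompt-hash', prompt };
  } else if (env.SYNCLO_EXPLORE_MOCK_POOL) {
    // 3. Random within a tagged pool, e.g. "agent:claude".
    strategy = { mode: 'pool', tag: env.SYNCLO_EXPLORE_MOCK_POOL };
  } else {
    // 4. Default: uniform random over all recordings.
    strategy = { mode: 'random' };
  }
  return { ...strategy, seed, noDelay };
}
```

For example, `resolveSelection({ SYNCLO_EXPLORE_MOCK_TRACE: '04097377', SYNCLO_EXPLORE_MOCK_POOL: 'agent:claude' }, 'hi')` yields the `fixed` mode, because the trace id outranks the pool.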

## Dataset

179 anonymized Langfuse traces from this project's own production
telemetry:

- 9 agents: claude 57 · opencode 41 · codex 38 · gemini 25 ·
  cursor-agent 11 · qwen 2 · copilot 2 · deepseek 2 · antigravity 1
- outcomes: succeeded 144 · failed 35
- skills: default 71 · ad-creative 50 · algorithmic-art 30 ·
  agent-browser 22 · video-hyperframes 2 · plus magazine-web-ppt /
  brainstorming / data-report / penpot-flutter-design-source 1 each
- 124 multi-turn (sessions with ≥2 turns)
- 18 produce `<artifact>` output
- ~4.5 MB on disk total

Anonymization: `/Users/<name>/` → `${HOME}/`,
`C:\Users\<name>\` → `%USERPROFILE%\`, project UUIDs →
stable `proj-001`, `proj-002`, …. Tool input/output payloads
preserved verbatim (templated UI, no cell-level PII).
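
The three rewrites above can be sketched as below. This is an illustration, not the exporter's code: `createAnonymizer` and the regular expressions are hypothetical, and the sketch assumes every UUID in the text is a project UUID.

```javascript
// Hypothetical sketch of the anonymization rewrites.
function createAnonymizer() {
  const projectIds = new Map(); // UUID -> stable "proj-NNN" label
  const uuid = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;

  return function anonymize(text) {
    return text
      // /Users/<name>/ -> ${HOME}/
      .replace(/\/Users\/[^/\s]+\//g, () => '${HOME}/')
      // C:\Users\<name>\ -> %USERPROFILE%\
      .replace(/C:\\Users\\[^\\\s]+\\/g, () => '%USERPROFILE%\\')
      // UUIDs -> proj-001, proj-002, ... (the same UUID always gets the same label)
      .replace(uuid, (match) => {
        const key = match.toLowerCase();
        if (!projectIds.has(key)) {
          const n = String(projectIds.size + 1).padStart(3, '0');
          projectIds.set(key, `proj-${n}`);
        }
        return projectIds.get(key);
      });
  };
}
```

Keeping the UUID map inside one anonymizer instance is what makes the labels stable across every line of a recording.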

## Smoke test

`bash mocks/scripts/smoke-test.sh` — 6 checks across all 6 agents.
All pass on this branch (verified locally):

```
  ✓ opencode first event = step_start
  ✓ codex first event = thread.started
  ✓ claude first event = system
  ✓ deepseek emitted plain text (144 chars on first line)
  ✓ qwen emitted plain text (144 chars on first line)
  ✓ grok emitted plain text (144 chars on first line)
All mock CLIs working. 
```

## Adding more recordings

The exporter that produced this set lives in
[nexu-io/agent-pr-explore](https://github.com/nexu-io/agent-pr-explore)
(see `cli/src/local/orchestrator/langfuse-import.ts` + the `local
langfuse-import` CLI command). Operators with the Langfuse keys can pull
more by tag / outcome / artifact / multi-turn filter, then run
`local recordings anonymize --out-dir ~/Documents/open-design/mocks/recordings`.
`mocks/README.md` has the full instructions.

## Out of scope (follow-ups)

- **ACP agents** (`devin`, `hermes`, `kilo`, `kimi`, `kiro`, `vibe`) need
  a JSON-RPC server on stdio rather than a one-shot stream — separate
  `format-acp.mjs` module not yet written.
- **Per-agent json-event-stream variants** (`cursor-agent`, `gemini`,
  `qoder`, `copilot`, `pi`) currently fall back to the `plain` renderer;
  their parsers are in `apps/daemon/src/json-event-stream.ts` and follow
  the same template as `format-codex.mjs`.

## AGENTS.md updates

- Added `mocks/` to the top-level content directories listing
- Added a Validation strategy bullet pointing here for agent-stream /
  parser changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks): add opencode-cli/kiro-cli/vibe-acp bin aliases and unref ACP timeout

- Add mocks/bin/opencode-cli, kiro-cli, vibe-acp wrappers for the primary
  RuntimeAgentDef bin names OD resolves before any fallback. Without these,
  a PATH-overlaid OD daemon run bypasses the mock entirely (opencode-cli,
  kiro-cli) or cannot find the mock at all (vibe-acp, which has no fallback).
- Include opencode-cli, kiro-cli, vibe-acp in the smoke-test ACP/JSON loop
  so coverage is verified end-to-end.
- Call .unref() on the 30s safety timeout in format-acp.mjs so a completed
  ACP session exits promptly instead of waiting the full 30 seconds.
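
The `.unref()` change can be illustrated with a short sketch. `armSafetyTimeout` and its parameters are hypothetical names; `timer.unref()` itself is the standard Node.js timer API.

```javascript
// Sketch: a safety timeout that does not keep the Node.js process alive.
function armSafetyTimeout(onTimeout, ms = 30_000) {
  const timer = setTimeout(onTimeout, ms);
  // unref() tells the event loop not to wait for this timer. Once all other
  // work is finished, the process can exit without sitting out the delay.
  timer.unref();
  return timer;
}
```

The timer still fires if the process is kept alive by something else, so the safety net remains in place for sessions that hang.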

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* feat(mocks): add vela (AMR) — login / models / ACP with strict set_model gate

Extends mocks/ to cover OD's own AMR runtime. `vela` is the bin name
`apps/daemon/src/runtimes/defs/amr.ts` specifies (`bin: 'vela'`,
`streamFormat: 'acp-json-rpc'`). It's richer than the generic ACP
agents — covers full login + models + chat-session lifecycle.

### What vela does (mirrored from apps/daemon/tests/fixtures/fake-vela.mjs)

1. `vela login` — writes ~/.amr/config.json with a fake profile (controlKey,
   runtimeKey, user{email,name,plan}, profile-specific apiUrl/linkUrl).
   The on-disk projection is what OD's daemon login route and the
   AmrLoginPill poller read. Production goes through device-auth; the
   mock skips straight to the file write.

2. `vela models` — prints the production-shaped public model catalog as
   newline-separated `public_model_*    vela` lines. Override via
   FAKE_VELA_MODELS env.

3. `vela agent run --runtime opencode` — ACP JSON-RPC server with three
   vela-specific protocol extensions:

   a. `initialize` response carries `agentCapabilities`
      (`promptCapabilities.embeddedContext`) + `models`
      (`currentModelId` + `availableModels`).
   b. `session/new` response carries the same `models` block.
   c. **Strict set_model gate**: `session/prompt` is rejected with
      JSON-RPC -32602 ("session/set_model must be called before
      session/prompt") UNLESS `session/set_model` (or
      `session/set_config_option`) has been called for the current
      sessionId. Mirrors real vela 0.0.1 contract; catches regressions
      in `attachAcpSession` that silently skip set_model.
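
A minimal sketch of the strict gate in item (c), assuming a per-session set of "model selected" flags. `createSetModelGate`, the `requireSetModel` option, and the empty `result` object are hypothetical; only the method names, the error code, and the error message come from the description above.

```javascript
// Hypothetical sketch, not the actual format-vela.mjs implementation.
function createSetModelGate({ requireSetModel = true } = {}) {
  const ready = new Set(); // sessionIds that have already selected a model

  return function handle(request) {
    const { id, method, params = {} } = request;
    if (method === 'session/set_model' || method === 'session/set_config_option') {
      // Either call opens the gate for this session.
      ready.add(params.sessionId);
      return { jsonrpc: '2.0', id, result: {} };
    }
    if (method === 'session/prompt' && requireSetModel && !ready.has(params.sessionId)) {
      // Reject the prompt with JSON-RPC "Invalid params".
      return {
        jsonrpc: '2.0',
        id,
        error: { code: -32602, message: 'session/set_model must be called before session/prompt' },
      };
    }
    return null; // not gated: the normal handler takes over
  };
}
```

Passing `{ requireSetModel: false }` corresponds to the legacy behavior in which prompts are accepted without a prior `session/set_model`.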

### Error injection envs (in sync with fake-vela.mjs)

  FAKE_VELA_SESSION_ID            - sessionId returned by session/new
  FAKE_VELA_TEXT                  - override assistant text
  FAKE_VELA_THOUGHT               - optional thought_chunk before text
  FAKE_VELA_SESSION_NEW_ERROR     - fail session/new
  FAKE_VELA_SET_MODEL_ERROR       - fail session/set_model
  FAKE_VELA_PROMPT_ERROR          - fail session/prompt
  FAKE_VELA_REQUIRE_SET_MODEL='0' - disable the strict gate (legacy)
  FAKE_VELA_LOGIN_USER_EMAIL      - email written into config profile
  FAKE_VELA_LOGIN_USER_PLAN       - plan written into config profile
  FAKE_VELA_LOGIN_DELAY_MS        - sleep before write (test in-flight)
  FAKE_VELA_LOGIN_FAIL            - print + exit 1
  FAKE_VELA_MODELS                - override models stdout
  VELA_PROFILE                    - profile slot (prod | test | local)

### Components

`mocks/lib/format-vela.mjs` (~205 LOC)
  - Full ACP server with vela protocol extensions
  - Strict set_model gate
  - Error injection plumbing

`mocks/lib/vela-subcommands.mjs` (~90 LOC)
  - runVelaLogin() — writes ~/.amr/config.json
  - runVelaModels() — prints catalog

`mocks/bin/vela` — dispatcher wrapper. Forwards `vela <subcmd>` to
mock-agent.mjs which routes to login/models or falls through to ACP.

`mocks/mock-agent.mjs` — parseArgs now collects positionals so the vela
dispatcher can read subcommand from there; switch case added for vela.

`mocks/scripts/smoke-test.sh` — +4 assertions:
  vela models prints ≥10 catalog lines
  vela login writes ~/.amr/config.json with the requested email
  vela agent run ACP roundtrip (initialize+models+set_model+stream+result)
  vela strict set_model gate rejects prompt without prior set_model

### Verified locally

  ✓ vela models printed 15 catalog lines
  ✓ vela login wrote ~/.amr/config.json with profile.prod.user.email
  ✓ vela agent run ACP roundtrip (initialize+models, set_model accepted, prompt streamed)
  ✓ vela strict set_model gate rejects session/prompt without prior set_model

All 21 smoke checks pass (up from 17 with previous P3 ACP commit).

### AGENTS.md + README updates

  AGENTS.md — mention `vela (AMR — vela CLI)` alongside ACP agents in
  the directory listing entry.
  mocks/README.md — protocol table row + dedicated vela section with
  subcommand contract, strict gate explanation, env-injection cheat
  sheet. Mock-tree listing updated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks): honor REPORT_FILE env when --report-file flag not given

Some harnesses spawn the mock without translating their report-path
contract into the mock's CLI flag. The notable case is
nexu-io/agent-pr-explore's orchestrator, which passes REPORT_FILE as an
env var, following the convention of the existing opencode/claude/codex
agent launchers. For those harnesses no report file was written, so the
harness's "agent exit 0 but produced no report" check always fired and
marked mock runs as failures even though the stdout stream was complete.

Fix: in mock-agent.mjs parseArgs, fall through to process.env.REPORT_FILE
when --report-file wasn't provided on argv. Each format renderer already
accepts opts.reportFile and writes the recording's final assistant text
to it (`format-*.mjs` already had this — only the wiring was missing).
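
The fallback described in the fix could look roughly like this. It is a sketch, not the real `parseArgs`: `resolveReportFile` is a hypothetical helper, and the two flag spellings it accepts (`--report-file <path>` and `--report-file=<path>`) are assumptions.

```javascript
// Hypothetical sketch of the argv-then-env lookup.
function resolveReportFile(argv, env) {
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    // An explicit flag on argv always wins.
    if (arg === '--report-file' && i + 1 < argv.length) return argv[i + 1];
    if (arg.startsWith('--report-file=')) return arg.slice('--report-file='.length);
  }
  // No flag on argv: fall through to the environment variable.
  return env.REPORT_FILE || null;
}
```

With this order, a harness that only sets `REPORT_FILE` gets its report, while a caller that passes the flag keeps full control.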

Verified: synclo-explore run with `mock=true, mock_trace=04097377`
against the opencode wrapper now produces a plan.md with the recording's
17-tool claude editing session report. ~1.5s per run vs ~70s for real opencode.

* mocks: move recordings to Cloudflare R2; PR→main→Action upload path

The 179-recording corpus (~4.5 MB raw, ~280 KB after compression) has
been moved off git into Cloudflare R2 at the bucket open-design-mocks
under recordings/v1/. The repo now ships:

- mocks/manifest.json — the canonical catalog (renamed from
  recordings/index.json) with sha256 + storage hints; consumers
  fetch this to discover what exists, then pull individual jsonl
  files on demand
- mocks/scripts/fetch-recordings.sh — parallel, sha256-verified,
  idempotent puller for the public r2.dev URL
- mocks/scripts/add-recording.sh — local maintainer helper that
  validates a new .jsonl and copies it into recordings-staging/
  (no R2 calls; no credentials needed)
- mocks/scripts/upload-to-r2.mjs — called only by the CI workflow
- mocks/scripts/lib/manifest-utils.mjs — shared sha256/meta/
  rebuild-histograms logic, used by both add-recording (preview)
  and upload-to-r2 (actual write) so the entry shape never drifts
- .github/workflows/sync-mocks-to-r2.yml — fires on push to main
  when mocks/recordings-staging/ changes; uploads to R2, updates
  manifest, commits cleanup back; serialized via concurrency group

Trust model: R2 write credentials (CLOUDFLARE_API_TOKEN,
CLOUDFLARE_ACCOUNT_ID) are repo secrets; nobody can push from a
laptop. Read stays public via the r2.dev URL.

Why the fetch is not wired into pnpm install: contributors who do not
touch agent code should not pay the fetch cost. The fetch happens on the
first smoke-test run (auto-fallback) or when a mock spawn needs data.

Repo size: -4.55 MB net (delete 179 jsonl, +280 KB manifest +
scripts). Smoke test (21 checks) still green against the fetched
corpus.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: scope R2 write token to a dedicated secret name

Use CLOUDFLARE_R2_MOCKS_TOKEN (instead of reusing the shared
CLOUDFLARE_API_TOKEN that landing-page-*.yml uses for Pages deploys)
so the R2 write capability can be scoped to just the
open-design-mocks bucket without bleeding extra capability into the
Pages workflows.

Also hardcode the powerformer CF account_id directly in the workflow
(account IDs are not secret and the shared CLOUDFLARE_ACCOUNT_ID
secret may point at a different account).

Workflow now fails fast with an actionable error message + dashboard
link if the secret is unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: switch R2 sync to S3-compat API (wrangler getMemberships gate)

wrangler 4.x calls /memberships before any r2 action, requiring
user:read scope. R2 "Object Read & Write" tokens deliberately lack
that scope (defense in depth — a leaked token should not enumerate
account-level resources). The workflow now uses the aws CLI talking
straight to the R2 S3-compatible endpoint with SigV4, no membership
lookup.

Secret rotation: CLOUDFLARE_R2_MOCKS_TOKEN (Bearer) is replaced by
CLOUDFLARE_R2_MOCKS_AK / CLOUDFLARE_R2_MOCKS_SK (matching the
existing CLOUDFLARE_R2_RELEASES_AK/SK naming convention). End-to-end
tested locally: PUT recording → manifest rebuild → manifest PUT →
staging cleanup all green.

aws CLI is pre-installed on ubuntu-latest, so no install step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: scrub synclo namespace; use OD_MOCKS_* env prefix throughout

These mocks were copy-pasted from synclo-explore, where they
originated, and inherited the SYNCLO_EXPLORE_MOCK_* env-var
convention. That brand-bleed is not appropriate in OD: rename the
public env surface to OD_MOCKS_* (matching OD-native prefixes like
OD_MOCKS_CACHE_DIR, OD_TRACE_R2_UPLOAD, OD_EXPECT_TIMEOUT_SECONDS).

Renames:
  SYNCLO_EXPLORE_MOCK_TRACE             → OD_MOCKS_TRACE
  SYNCLO_EXPLORE_MOCK_BY_PROMPT_HASH    → OD_MOCKS_BY_PROMPT_HASH
  SYNCLO_EXPLORE_MOCK_POOL              → OD_MOCKS_POOL
  SYNCLO_EXPLORE_MOCK_SEED              → OD_MOCKS_SEED
  SYNCLO_EXPLORE_MOCK_NO_DELAY          → OD_MOCKS_NO_DELAY
  SYNCLO_EXPLORE_MOCK_RECORDINGS_DIR    → OD_MOCKS_RECORDINGS_DIR
  SYNCLO_EXPLORE_MOCK_SMOKE_TRACE       → OD_MOCKS_SMOKE_TRACE
  SYNCLO_OD_MOCKS_I_KNOW_WHAT_IM_DOING  → OD_MOCKS_ALLOW_LOCAL_UPLOAD

Also drop the inline harvester usage from README. The harvester is an
external CLI in nexu-io/agent-pr-explore — its README is the right
place for langfuse-import flags, anonymization options, etc. OD only
documents its own staging→PR→Action workflow.

Smoke test (21 checks) still green; OD_MOCKS_TRACE end-to-end
verified to route correctly.

Consumers of the OLD env names (notably the orchestrator in
nexu-io/agent-pr-explore) need a matching rename. No back-compat
shim here — the explore side has zero external users today and a
one-line follow-up is cleaner than a permanent deprecation layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* AGENTS.md: align mock env names with mocks/ rename (SYNCLO_* → OD_MOCKS_*)

Missed in the prior commit (a30b868a) — only grepped mocks/ subdir.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: drop staging dir + GH Action; back to local-script upload

The staging-dir + Action design (added earlier in this PR) had a flaw
the user caught: new recordings briefly entered the repo on their way
through staging, leaving them in git history forever even after the
Action cleanup commit removed them from HEAD. That defeats the whole
point of moving recordings to R2.

Replace with the simpler local-maintainer flow:

  bash mocks/scripts/upload-recording.sh /path/to/<trace>.jsonl
  # → validates, wrangler r2 put, updates manifest.json, wrangler r2 put manifest
  git add mocks/manifest.json && git commit && git push
  # → only the ~200B manifest delta enters git

The wrangler-OAuth gate replaces the CI secret + Action duo. For a
solo / small maintainer team this collapses the trust chain down to
"do you have wrangler login to the powerformer account?" — no GH
secrets to rotate, no concurrency window to worry about, no
inevitable repo-history bloat.

Deletes:
- .github/workflows/sync-mocks-to-r2.yml
- mocks/scripts/upload-to-r2.mjs   (CI-only)
- mocks/scripts/add-recording.sh   (staging helper, now obsolete)
- mocks/recordings-staging/        (empty dir, never to be repopulated)

Adds:
- mocks/scripts/upload-recording.sh

Kept:
- mocks/scripts/fetch-recordings.sh
- mocks/scripts/lib/manifest-utils.mjs (still used by upload-recording.sh)
- mocks/manifest.json (committed; the only mocks artifact in git)

End-to-end tested locally: re-uploading an existing recording is
idempotent, manifest math is stable, fetch + smoke test still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: address review — guard allowlist + safe ~/.amr + loud OD_MOCKS_TRACE typo

Three concrete issues raised across recent Siri-Ray (Looper) review
threads on #3241:

1. scripts/guard.ts only allowlisted mocks/lib/ + mocks/mock-agent.mjs,
   leaving mocks/scripts/lib/manifest-utils.mjs outside the residual-
   JS guard. Result: Preflight fail on every push. Extend the allowlist
   to mocks/scripts/ — same precedent as the lib/ entry directly above.

2. mocks/scripts/smoke-test.sh moved the caller's real ~/.amr to
   ~/.amr-smoke-backup, ran vela login (which writes a fake config),
   then ran rm -rf on the fake .amr and restored the backup. Two
   failure modes: a crash mid-run loses the user's real config, and
   re-running before the restore overwrites the backup with the fake
   login. Fix: sandbox vela login into a mktemp -d HOME via env
   (HOME=$amr_sandbox vela login). The script never touches the real
   ~/.amr at all, and a trap handles cleanup.

3. mocks/lib/recording-picker.mjs silently fell through to
   prompt-hash → pool → random when OD_MOCKS_TRACE was set but did
   not match any recording (typo, prefix too short, corpus not
   fetched). Tests using a pinned trace would silently get a
   different trace, hiding regressions. Fix: throw an explicit error
   with the failing value + a pointer at fetch-recordings.sh.

Verified locally: pnpm guard prints "Residual JavaScript check
passed", smoke-test still 21/21, ~/.amr mtime unchanged after run,
typo on OD_MOCKS_TRACE now produces "mock-agent: OD_MOCKS_TRACE=...
set but no matching recording in <dir>" on stderr.
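
The loud-fail policy from item 3 can be sketched as follows. This is an illustration, not the code in recording-picker.mjs: the `TraceRecording` shape, the `pickByTrace` name, and prefix matching on the trace id are assumptions. Only the leading part of the error text follows the stderr line quoted above.

```typescript
// Hypothetical recording shape for illustration.
interface TraceRecording {
  traceId: string;
}

// Callers are assumed to invoke this only when OD_MOCKS_TRACE is non-empty.
function pickByTrace(
  recordings: TraceRecording[],
  trace: string,
  dir: string
): TraceRecording {
  for (const rec of recordings) {
    // Assumption: a pinned trace matches by id prefix.
    if (rec.traceId.indexOf(trace) === 0) return rec;
  }
  // No silent fallback to prompt-hash, pool, or random selection.
  throw new Error(
    "mock-agent: OD_MOCKS_TRACE=" + trace +
      " set but no matching recording in " + dir +
      " (has mocks/scripts/fetch-recordings.sh been run?)"
  );
}
```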

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fetch-recordings: detect empty filter result before line-counting

printf '%s\n' on an empty string emits a single empty line, so the
previous TOTAL=$(printf ... | grep -c "") math returned 1 on an
empty $ENTRIES_TSV — a typo like `--agent no-such-agent` printed
"Fetching up to 1 recordings", downloaded zero, and exited 0
("ready"). Check `-z $ENTRIES_TSV` first.

Reproduced + fix verified per the reviewer thread.
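
The script itself is shell, but the same off-by-one appears wherever lines are counted without first checking for empty input. A TypeScript analogue, offered only as an illustration (the `countEntries` name is hypothetical):

```typescript
// Splitting "" on newlines yields [""], which is one "line": the same
// miscount the shell pipeline produced. Check for empty input first.
function countEntries(entriesTsv: string): number {
  if (entriesTsv.trim() === "") return 0;
  return entriesTsv.split("\n").filter((line) => line !== "").length;
}
```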

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: address mrcfps review — goldens + provenance + contract check

Three durability improvements suggested in the PR #3241 top-level
review:

## 1. Golden daemon-event snapshots (mocks/golden/*.events.json + apps/daemon/tests/mocks-golden.test.ts)

Smoke-test verified that mocks RUN; that catches crashes but not a
parser change that semantically reshapes the events the daemon emits.

Commit the daemon-event sequence for 3 representative traces:
- claude  314d6833 — median-complexity agent-browser session
- codex   dcdff3b3 — 14-tool refactor
- opencode 9a9522ec — 7-tool data-report

apps/daemon/tests/mocks-golden.test.ts spawns the mock, feeds stdout
through the real createClaudeStreamHandler / createJsonEventStreamHandler,
normalizes per-spawn volatile fields (only sessionId today, only on
claude), and deep-equals against the committed snapshot. A parser
regression fails the test loudly.
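
A rough sketch of the normalize-then-compare step, under stated assumptions: events are treated as flat objects, only top-level `sessionId` is volatile, and `JSON.stringify` stands in for the deep-equal the real test uses. The names `normalizeEvents` and `matchesGolden` are illustrative, not taken from the test file.

```typescript
type DaemonEvent = { [key: string]: unknown };

// Fields that change on every spawn; only sessionId today.
const VOLATILE_FIELDS = ["sessionId"];

function normalizeEvents(events: DaemonEvent[]): DaemonEvent[] {
  return events.map((event) => {
    const copy: DaemonEvent = {};
    for (const key of Object.keys(event)) {
      // Replace volatile values with a fixed placeholder, keep key order.
      copy[key] = VOLATILE_FIELDS.indexOf(key) !== -1 ? "<normalized>" : event[key];
    }
    return copy;
  });
}

function matchesGolden(actual: DaemonEvent[], golden: DaemonEvent[]): boolean {
  // String comparison is order-sensitive; a real test would deep-equal.
  return (
    JSON.stringify(normalizeEvents(actual)) ===
    JSON.stringify(normalizeEvents(golden))
  );
}
```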

After an intentional parser change, regenerate:

  MOCKS_GOLDEN_UPDATE=1 pnpm --filter @open-design/daemon test mocks-golden
  git diff mocks/golden/
  # eyeball; commit if shapes match intent

## 2. Provenance fields on every manifest entry (mocks/scripts/lib/manifest-utils.mjs + mocks/manifest.json)

Augment inspectRecording() to write:

  captured_at         — ISO 8601 from existing meta.timestamp
  cli_version         — null until harvester writes it
  protocol_version    — null until harvester writes it
  anonymization_version — null until harvester writes it

captured_at is now populated for all 179 existing entries from the
meta event the harvester already emits. The harvester in
nexu-io/agent-pr-explore is the next step for cli_version /
protocol_version / anonymization_version — once those are
populated, consumers can detect when a recording is more than ~1
minor version behind the live CLI and flag it for re-harvest.

No matrix of (cli_version × agent) recordings — that explodes
maintenance. Just metadata per recording so trust decay is visible.
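
One way a consumer might use `cli_version` once the harvester populates it. This is a sketch only: the `needsReharvest` helper, the `major.minor` parsing, and the "more than one minor behind" threshold are assumptions drawn from the wording above, not existing code.

```typescript
// Parse "1.4.2" or "v1.4.2" into [major, minor]; null if unparseable.
function parseMajorMinor(version: string): [number, number] | null {
  const match = /^v?(\d+)\.(\d+)/.exec(version);
  if (!match) return null;
  return [Number(match[1]), Number(match[2])];
}

// Returns null when staleness cannot be judged (e.g. cli_version is null).
function needsReharvest(recordedCli: string | null, liveCli: string): boolean | null {
  if (recordedCli === null) return null;
  const recorded = parseMajorMinor(recordedCli);
  const live = parseMajorMinor(liveCli);
  if (!recorded || !live) return null;
  // Any older major counts as stale.
  if (recorded[0] !== live[0]) return recorded[0] < live[0];
  // Same major: stale when more than one minor version behind.
  return live[1] - recorded[1] > 1;
}
```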

## 3. Real-CLI contract check (mocks/scripts/contract-check.sh + docs/MOCKS-CONTRACT-CHECK.md)

Mocks catch parser regressions against recordings; they do NOT
catch recordings drifting away from the live agent CLI as that CLI
evolves. The contract check spawns the real CLI alongside the mock
with a fixed deterministic prompt + diffs top-level event-type
distributions.
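
The comparison step could look roughly like this. contract-check.sh is a shell script, so this TypeScript is only a model of the idea: count events by top-level `type` on each side, then report types that appear on one side only or with different counts. All names here are hypothetical.

```typescript
interface StreamEvent {
  type: string;
}

type Histogram = { [eventType: string]: number };

function hasKey(hist: Histogram, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(hist, key);
}

function eventTypeHistogram(events: StreamEvent[]): Histogram {
  // Object.create(null) avoids collisions with inherited keys like "constructor".
  const hist: Histogram = Object.create(null);
  for (const event of events) {
    hist[event.type] = (hasKey(hist, event.type) ? hist[event.type] : 0) + 1;
  }
  return hist;
}

interface HistogramDiff {
  onlyInReal: string[];
  onlyInMock: string[];
  countDiffers: string[];
}

function diffHistograms(real: Histogram, mock: Histogram): HistogramDiff {
  const diff: HistogramDiff = { onlyInReal: [], onlyInMock: [], countDiffers: [] };
  for (const type of Object.keys(real)) {
    if (!hasKey(mock, type)) diff.onlyInReal.push(type);
    else if (real[type] !== mock[type]) diff.countDiffers.push(type);
  }
  for (const type of Object.keys(mock)) {
    if (!hasKey(real, type)) diff.onlyInMock.push(type);
  }
  return diff;
}
```

A type that exists on only one side is the stronger drift signal; differing counts are expected to some degree because the real CLI's output is not deterministic.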

Deliberately human-driven, not cron-scheduled:
- costs real LLM tokens per invocation
- requires real CLI auth
- maintainer reads the output, not a regex

Suggested triggers per doc: real-CLI release notes mentioning
"output format" / "stream" / "JSON" / "events"; before a parser
refactor; ad-hoc when something looks off.

## Coverage note

README updated to position mocks as "deterministic protocol/parser
coverage" (not "e2e replacement") per mrcfps framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks-golden test): drop import of non-exported ParserKind

Use plain string (the type alias is `string` anyway) — Preflight
typecheck on a31fa71a failed:

  tests/mocks-golden.test.ts(29,8): error TS2459: Module
  "../src/json-event-stream.js" declares "ParserKind" locally, but
  it is not exported.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* recording-picker: structured OD_MOCKS_POOL + hard-fail no-match

Siri-Ray review: `OD_MOCKS_POOL=outcome:failed` was documented as a
supported selection knob, but the matcher only checked tags and
`meta.agent`, so the negative-path pool found 0 candidates and
silently fell through to global random, validating against any
recording instead of a failed trace.

Fix:
- Parse the `<dim>:<value>` shape and route each dim to the right meta
  field: `outcome` → `meta.outcome`, `agent` → `meta.agent`,
  `skill` → `tags[]`. Bare values still fall back to tag substring.
- If the env was set and matched nothing, throw with the failing
  value and a jq one-liner for inspection. Same loud-fail policy as
  OD_MOCKS_TRACE — silent fallback was the original bug.
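
The routing above can be sketched like this. It is not the code in recording-picker.mjs: the `PoolRecording` shape, exact-match semantics for `skill`, treating an unknown dim as a bare value, and the error text are all assumptions for illustration.

```typescript
// Hypothetical recording shape.
interface PoolRecording {
  id: string;
  meta: { agent?: string; outcome?: string };
  tags: string[];
}

function matchesPool(rec: PoolRecording, selector: string): boolean {
  const sep = selector.indexOf(":");
  if (sep > 0) {
    const dim = selector.slice(0, sep);
    const value = selector.slice(sep + 1);
    if (dim === "outcome") return rec.meta.outcome === value;
    if (dim === "agent") return rec.meta.agent === value;
    if (dim === "skill") return rec.tags.indexOf(value) !== -1;
  }
  // Bare value (or, in this sketch, an unknown dim): tag substring match.
  return rec.tags.some((tag) => tag.indexOf(selector) !== -1);
}

function pickPool(recordings: PoolRecording[], selector: string): PoolRecording[] {
  const candidates = recordings.filter((rec) => matchesPool(rec, selector));
  if (candidates.length === 0) {
    // Loud failure instead of falling through to global random.
    throw new Error(
      "mock-agent: OD_MOCKS_POOL=" + selector + " set but matched no recording"
    );
  }
  return candidates;
}
```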

Verified locally: outcome:failed, agent:codex, skill:agent-browser
all route correctly; outcome:nonsense throws the explicit error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* contract-check.sh: fix lost $PROMPT in mock invocation

Siri-Ray review on e576074a: the mock side wrapped its pipeline in
`bash -c "printf %s \"\$PROMPT\" | ..."` — but $PROMPT was a parent
shell variable, not exported, so the child bash expanded it to an
empty string. Result: the contract check sent the real prompt to the
real CLI and an empty string to the mock, defeating the
same-input invariant the whole script rests on. The bug also let the
mock randomly select a different trace whenever a maintainer happened
to have OD_MOCKS_BY_PROMPT_HASH=1 in their env.

Fix: drop the inner bash -c entirely; use a subshell that scopes the
PATH overlay and pipes printf into the PATH-resolved mock binary
directly. The subshell limits the PATH change without var-passing.

Verified locally: with prompt-A the mock picks trace 54ec02ee via
hash; prompt-B → 2667e851 via hash; empty prompt (old broken
behavior) → random — confirms the prompt is now actually reaching
the mock under PATH overlay.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 07:17:20 +00:00
lefarcen
df8a0faff6 feat(runtimes): register AMR (vela) as an ACP stdio agent (#2355)
* feat(runtimes): register AMR (vela) as an ACP stdio agent

AMR is the vela CLI's ACP runtime mode. `vela agent run --runtime opencode`
speaks ACP JSON-RPC over stdio (see vela's
`specs/current/runtime/manual-agent-run-openrouter.md`); per
`docs/new-agent-runtime-acp.md` we expose it through the same `streamFormat:
'acp-json-rpc'` transport that already powers Hermes, Devin, Kimi, etc.

The new `defs/amr.ts` is the entire wiring — `buildArgs` returns
`['agent', 'run', '--runtime', 'opencode']`, `fetchModels` reuses
`detectAcpModels`, and the fallback list seeds the OpenRouter ids vela's
e2e baseline uses. `executables.ts`/`app-config.ts`/`metadata.ts` get the
matching `VELA_BIN`/`VELA_LINK_URL`/`VELA_RUNTIME_KEY`/`VELA_OPENCODE_BIN`
allowlist + install/docs URLs, so users can configure the per-agent env in
Settings without leaking into other adapters.

Coverage: `tests/fixtures/fake-vela.mjs` is a minimal ACP stub that returns
the documented `initialize` / `session/new` / `session/set_model` /
`session/prompt` shapes; `tests/amr-acp-integration.test.ts` spawns it via
`child_process.spawn` and drives a full turn through `attachAcpSession` and
`detectAcpModels`, so the ACP transport contract for AMR is end-to-end
verified locally even before a real `vela` binary is installed.

Validated:
- pnpm guard
- pnpm typecheck (all workspace projects)
- pnpm --filter @open-design/daemon test (2881/2881)

Deferred: real OpenRouter-backed turn through a built `vela` binary —
the runtime def needs no changes for that path, only `VELA_RUNTIME_KEY`
and `VELA_LINK_URL` in env (or Settings).

* fix(runtimes/amr): pin a concrete default model and bare openai ids

End-to-end validation against a freshly-built `vela` (nexu-io/vela@main)
+ OpenRouter surfaced two contract details the first AMR runtime def
got wrong:

1. vela rejects `session/prompt` with `session/set_model must be called
   before session/prompt`. attachAcpSession in apps/daemon/src/acp.ts
   skips set_model whenever the picked model is the synthetic 'default'
   id, so AMR's fallback list must NOT include DEFAULT_MODEL_OPTION. The
   def now ships a concrete `gpt-5.4-mini` as both `fetchModels`'
   default option and `fallbackModels[0]`, which makes attachAcpSession
   always send a real `session/set_model` for AMR turns.

2. `vela --runtime opencode` auto-prepends `openai/` to whatever modelId
   it forwards to opencode's openai provider. With OpenRouter-style ids
   like `openai/gpt-5.4-mini`, opencode receives the double-prefixed
   `openai/openai/gpt-5.4-mini` and replies `ProviderModelNotFoundError`.
   The new fallback list ships the bare ids opencode's openai registry
   actually knows about (gpt-5.4, gpt-5.4-mini, gpt-5.4-fast, etc.).

Stub + tests:
- tests/fixtures/fake-vela.mjs now enforces the set_model gate the same
  way real vela does, so a regression that silently goes back to
  model: 'default' would surface as a fatal error in tests instead of a
  hidden production failure.
- tests/amr-acp-integration.test.ts pins both contracts: no 'default' /
  no 'openai/' prefix in fallbackModels, and a negative case that
  asserts session/prompt fails when no model is set.
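
The two contracts pinned above can be expressed as a small check. This is a sketch, not the assertion in tests/amr-acp-integration.test.ts; `forwardedToOpencode` only models the prefixing behavior this commit attributes to `vela --runtime opencode`, and both function names are hypothetical.

```typescript
// Models the described behavior: vela prepends "openai/" to the model id.
function forwardedToOpencode(modelId: string): string {
  return "openai/" + modelId;
}

// Returns the ids that would break an AMR turn: the synthetic 'default'
// (set_model gets skipped) or an already-prefixed id (double prefix).
function invalidFallbackIds(ids: string[]): string[] {
  return ids.filter((id) => id === "default" || id.indexOf("openai/") === 0);
}
```

With an OpenRouter-style id, `forwardedToOpencode("openai/gpt-5.4-mini")` yields the double-prefixed id that opencode rejects, which is why the fallback list carries bare ids.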

Adds `apps/daemon/scripts/verify-amr-real-vela.mjs` — a small dev-time
runner that drives `attachAcpSession` against a real `vela` binary and
prints the daemon's chat events, so future protocol drift can be checked
against an actual OpenRouter call.

Verified locally: `vela agent run --runtime opencode` + OpenRouter
returns the prompted string ("AMR-E2E-PASS") through the full daemon
pipeline; daemon test suite stays 2883/2883.

* fix(runtimes/amr): substitute concrete model when chat run sends 'default'

A plugin-driven AMR run from the UI surfaced a real-world hole in the
prior commit:

  json-rpc id 3: session/set_model must be called before session/prompt

The Default-design-router plugin (and any caller that doesn't pin a
real model) sends `model: 'default'` straight through, which the AMR
runtime def cannot accept — vela rejects `session/prompt` without
`session/set_model` and attachAcpSession skips set_model whenever
model === 'default'. Just leaving DEFAULT_MODEL_OPTION out of the
adapter's `fallbackModels` is not enough: the chat-run handler in
server.ts still forwarded 'default' verbatim.

This adds `resolveModelForAgent(def, resolved, env?)` as the
single source of truth for the substitution:

  1. If the caller picked a real id, pass it through.
  2. Else, if `def.defaultModelEnvVar` is set and the daemon process
     env has a non-empty value for it, return that (operator escape
     hatch — see below).
  3. Else, if the def's `fallbackModels` does NOT contain a 'default'
     id, return `fallbackModels[0].id`.
  4. Else, return the original value (the historic shape — defs that
     list 'default' themselves are untouched).
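
The four branches can be sketched as below. The function name and parameter order follow the commit text; the `AgentDef` and `ModelOption` shapes, the treatment of null/undefined as "no real id", and the explicit `env` default are assumptions for a self-contained example (the real resolver reads the daemon's `process.env`).

```typescript
interface ModelOption {
  id: string;
}

// Hypothetical subset of a runtime def.
interface AgentDef {
  fallbackModels: ModelOption[];
  defaultModelEnvVar?: string;
}

function resolveModelForAgent(
  def: AgentDef,
  resolved: string | null | undefined,
  env: { [key: string]: string | undefined } = {}
): string | null {
  // 1. A real id picked by the caller passes through.
  if (resolved && resolved !== "default") return resolved;

  // 2. Operator escape hatch via the def's env var.
  if (def.defaultModelEnvVar) {
    const override = env[def.defaultModelEnvVar];
    if (override && override.trim() !== "") return override.trim();
  }

  // 3. Defs that do not list 'default' get their first concrete fallback.
  const listsDefault = def.fallbackModels.some((m) => m.id === "default");
  const first = def.fallbackModels[0];
  if (!listsDefault && first) return first.id;

  // 4. Historic shape: leave the value untouched.
  return resolved === undefined ? null : resolved;
}
```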

AMR sets `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`, so when
opencode's openai-provider registry deprecates `gpt-5.4-mini`
upstream, an operator can swap the fallback id without a code change
by exporting `VELA_DEFAULT_MODEL=gpt-5.5` before launching tools-dev
/ od. Note that the env var must live in the daemon's `process.env`:
Settings-UI per-agent env values only reach the spawned child, not
the daemon's resolver. The new field's docblock spells this out.

Coverage:
- `tests/runtimes/resolve-model.test.ts` — 8 unit tests covering all
  four resolver branches plus the env-override happy path / fallback /
  ignore-when-user-picked-a-real-id case.
- `pnpm --filter @open-design/daemon typecheck` clean.

* chore(runtimes/amr): move AMR to the top of the base agent list

So `AMR (vela)` shows up first in the agent picker / status views,
ahead of claude / codex. Pure ordering change; no behavior delta.

* feat(amr): Sign-in / Sign-out button on the AMR Settings card

The first half of the AMR work assumed the operator would set
VELA_RUNTIME_KEY / VELA_LINK_URL on the daemon process and never
surfaced login state to users. This adds the missing UX so a fresh
install can drive the full path from Settings:

  - GET  /api/integrations/vela/status   reads ~/.vela/config.json
    for the active profile and returns { loggedIn, profile, user }
    (without leaking the runtime/control keys themselves).
  - POST /api/integrations/vela/login    spawns `vela login` once
    (409 if one is already in flight). The vela CLI opens the user's
    browser to the device-authorization page itself — Open Design
    only needs to kick the subprocess off.
  - POST /api/integrations/vela/logout   removes ~/.vela/config.json
    so the next status read returns logged-out.
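
The "409 if one is already in flight" rule amounts to a single-flight guard. A minimal sketch follows; `LoginGate` and `loginStatusCode` are hypothetical names, and the 200 success status is an assumption, since the commit only specifies the 409 case.

```typescript
class LoginGate {
  private inFlight = false;

  // Returns false when a login subprocess is already running.
  tryStart(): boolean {
    if (this.inFlight) return false;
    this.inFlight = true;
    return true;
  }

  // Call when the spawned `vela login` exits, whatever the outcome.
  finish(): void {
    this.inFlight = false;
  }
}

// Hypothetical route helper: 409 while a login runs, 200 otherwise.
function loginStatusCode(gate: LoginGate): number {
  return gate.tryStart() ? 200 : 409;
}
```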

`AmrAgentCard` is a dedicated agent-card component for AMR because
the existing `<button>` row can't host an interactive sub-control
(nested interactive elements). It polls /status after a login click
until the daemon reports loggedIn=true (or 5 minutes elapse), and
exposes a Sign-out action on hover. Other adapters (claude, codex,
hermes, …) keep their existing `<button>` card.

i18n: 8 new keys (settings.amrLogin / Logout / LoggingIn / etc.)
added to en + zh-CN. Other locales spread `en` and inherit the
English copy until translations land.

Coverage:
- `tests/integrations/vela.test.ts` pins the config.json reader
  against a tmp HOME — including the negative case where a profile
  has user info but no runtimeKey (still logged-out), and the
  secret-leak guard ("rt-secret-*" must not appear in the projection
  payload).
- `tests/components/AmrAgentCard.test.tsx` covers all four UI
  states (logged-out, logging-in, logged-in, logging-out) plus the
  click-propagation invariant the divergent card was built to keep.
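
The status projection the tests pin can be sketched as follows. The config.json layout used here (`activeProfile`, `profiles`, `runtimeKey`, `controlKey`, `user`) is a hypothetical shape for illustration and may not match the file vela writes; the two rules it demonstrates come from the coverage notes above: no runtimeKey means logged-out, and keys never appear in the payload.

```typescript
interface VelaUser { email?: string; plan?: string }
interface VelaProfile { runtimeKey?: string; controlKey?: string; user?: VelaUser }
interface VelaConfig {
  activeProfile?: string;
  profiles?: { [name: string]: VelaProfile };
}

interface VelaStatus {
  loggedIn: boolean;
  profile: string | null;
  user: { email: string | null; plan: string | null } | null;
}

function projectStatus(config: VelaConfig | null): VelaStatus {
  const name = config && config.activeProfile ? config.activeProfile : null;
  const profiles: { [name: string]: VelaProfile } =
    config && config.profiles ? config.profiles : {};
  const profile = name !== null ? profiles[name] : undefined;
  if (name === null || profile === undefined) {
    return { loggedIn: false, profile: name, user: null };
  }
  // Copy only the user fields; the keys themselves are never projected.
  const user = profile.user
    ? { email: profile.user.email ?? null, plan: profile.user.plan ?? null }
    : null;
  // User info without a runtimeKey still counts as logged-out.
  const loggedIn =
    typeof profile.runtimeKey === "string" && profile.runtimeKey !== "";
  return { loggedIn, profile: name, user };
}
```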

`pnpm --filter @open-design/daemon test` 2901 / 2901 passing.
`pnpm --filter @open-design/web test` 1719 / 1719 passing.
`pnpm typecheck` + `pnpm guard` clean.

Dev script side-effects: `apps/daemon/scripts/verify-amr-real-vela.mjs`
no longer requires both VELA_RUNTIME_KEY and VELA_LINK_URL — if
VELA_PROFILE is set, the vela CLI is allowed to resolve credentials
from `~/.vela/config.json`. Added the two AMR `.mjs` fixtures to
`scripts/guard.ts` allowlist with the executable-fixture / dev-runner
rationale.

* fix(connection-test): substitute model for AMR before attachAcpSession

The chat-run path in server.ts already routes the requested model through
`resolveModelForAgent` so AMR / vela (whose CLI demands an explicit
`session/set_model` before `session/prompt`) gets the def's first
concrete fallback id when the chat run ships `model: 'default'`.
`connectionTest.ts` was wiring `attachAcpSession({ ..., model: model ?? null })`
directly, which made the Test Connection button on the AMR Settings
card deadlock with the same `session/set_model must be called before
session/prompt` error the chat-run path already handles — surfaced as a
permanent "Testing connection…" spinner in the UI.

Reuse the same helper here so Test Connection mirrors chat-run behavior.

* test(amr): three-layer end-to-end coverage for the AMR login + turn flow

The PR up to this point shipped runtime + UI code with unit-level Vitest
coverage. This commit adds the cross-layer regression net the live demo
relied on:

1. apps/daemon/tests/integrations/vela.routes.test.ts (HTTP, Vitest)
   Spins up the real daemon Express app via `startServer({port:0,...})`,
   persists `agentCliEnv.amr.VELA_BIN = <fake>` into app-config.json,
   and exercises every /api/integrations/vela/* endpoint against the
   extended fake-vela stub:
     - status reads ~/.vela/config.json under various states
     - login spawns the fake, waits for config.json to appear, returns
       pid + startedAt + profile
     - 409 already-running guard with the stub's delay knob
     - logout removes the file (idempotent)
     - secrets (runtimeKey / controlKey) never leak in the projection
     - login → status round-trip flips loggedIn=false → true

2. e2e/tests/amr/turn.test.ts (tools-dev orchestrated, Vitest)
   Boots a namespaced daemon + web pair through `createSmokeSuite`,
   inlines a self-contained fake `vela` binary that handles BOTH
   `vela login` (writes ~/.vela/config.json) and
   `vela agent run --runtime opencode` (ACP stdio with the
   `session/set_model must precede session/prompt` gate the real binary
   enforces), then drives a complete /api/runs lifecycle for
   `agentId: 'amr', model: 'default'` and asserts the assistant message
   captures the fake's streamed text. This is the test that would have
   surfaced today's plugin-default-model regression (the `set_model
   before prompt` error) at PR time instead of demo time.

3. e2e/ui/amr-login-pill.test.ts (Playwright)
   Mocks /api/agents + /api/integrations/vela/{status,login,logout}
   to drive the Settings AMR card through the full Sign in → Signed in
   → Sign out cycle. Pins the AmrLoginPill polling contract and the
   aria-label semantics (the pill's accessible name is "Sign out" once
   logged in, regardless of which label the hover-state text shows).

fake-vela.mjs extensions:
   - Handles `vela login` argv by writing
     ~/.vela/config.json for the active VELA_PROFILE and exiting 0 —
     mirrors real vela's on-disk side-effect without the device-auth
     loop.
   - FAKE_VELA_LOGIN_DELAY_MS knob so route tests can observe the
     in-flight state of the spawn lifecycle.
   - FAKE_VELA_LOGIN_USER_EMAIL / _USER_PLAN to assert the surfaced
     user fields end-to-end.

Validated:
   - `pnpm guard` + `pnpm typecheck` (all workspace projects)
   - `pnpm --filter @open-design/daemon test`: 2998 / 2998 passing,
     including the new 8-test integration suite.
   - `cd e2e && pnpm test tests/amr`: 1 / 1 passing.
   - `cd e2e && pnpm exec playwright test ui/amr-login-pill.test.ts`:
     1 / 1 passing (6.7s).

* feat(amr): package native cli and refine login ui

* feat(amr): wire vela cli beta packaging

* docs(amr): document vela ci packaging review

* docs(amr): refine vela ci integration review

* fix(ci): refresh nix pnpm dependency hashes

* fix(pack): clean up Vela CLI packaging

* fix(pack): bundle Vela CLI support files

* fix(amr): recover login attempts from stale auth state

* test: expand AMR and automations coverage

* fix(amr): address review follow-ups

* test(web): align tasks fixtures with contracts

* fix(daemon): type wildcard route params

* fix(ci): refresh PR merge validation

* fix(amr): clear env credentials on logout

* feat(settings): inline local CLI model configuration

* fix(amr): recognize daemon env credentials

* [codex] Fix Vela companion packaging (#2979)

* Fix Vela companion packaging

* Update Nix pnpm dependency hashes

* [codex] Surface AMR account failures (#2980)

* fix: surface AMR account failures

* fix: cover AMR recovery error guidance

* chore: bump beta base version to 0.8.1 (#2990)

* Fix AMR profile and packaged runtime review issues

* Detect packaged AMR OpenCode companion tree

* feat(web): polish AMR frontend flows

* Polish AMR onboarding card

* fix: read AMR login state from dot-amr config (#3048)

* test: tighten AMR credential and packaging coverage

* test: restore AMR executable test env helper

* [codex] Fix packaged mac Dock identity and AMR label (#3076)

* Fix packaged mac sidecar Dock identity

* Rename AMR assistant label

* Fix AMR live models and dot-amr login state (#3073)

* fix: read AMR login state from dot-amr config

* fix: load live AMR models before runs

* fix: point AMR onboarding link to production wallet

* fix: address AMR model review feedback

* fix: persist live AMR model fallback

* [codex] Fix AMR link catalog model ids (#3088)

* Fix packaged mac sidecar Dock identity

* Rename AMR assistant label

* Fix AMR link catalog model ids

* Fix AMR model normalization typecheck

* Use live AMR model for default runs

* fix: polish AMR runtime settings UI

* Accelerate AMR startup defaults (#3092)

* Surface AMR insufficient balance wallet URL (#3099)

* fix(web): polish onboarding controls (#3112)

* fix(web): show CLI scan loading state

* Avoid duplicate AMR wallet recharge links (#3117)

* Avoid duplicate AMR wallet recharge links

* Use Vela CLI 0.0.3 test package

* chore(nix): refresh pnpm deps hash

* Fix AMR wallet guidance display

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>

* chore(pack): pin Vela CLI 0.0.3-test.1 (#3127)

* chore(nix): refresh pnpm deps hash

* chore(pack): pin Vela CLI 0.0.3

* chore(nix): refresh pnpm deps hash

* fix(web): suppress AMR exit 130 fallback (#3136)

* feat(web): nudge users to hosted AMR on model/auth/quota failures (#3083)

* feat(web): nudge users to hosted AMR on model/auth/quota failures

When a non-AMR agent run fails with an auth / quota / upstream model
error, surface an inline nudge under the error pill linking to Open
Design's hosted AMR gateway (https://open-design.ai/amr). The nudge
fires `surface_view` (element=run_failed_toast) on impression and
`ui_click` (element=go_amr) on the link.

Also teach the daemon to classify CLI-agent auth/quota/upstream failures
(Claude Code, codex, ...) into specific API error codes
(AGENT_AUTH_REQUIRED / RATE_LIMITED / UPSTREAM_UNAVAILABLE) instead of
the generic AGENT_EXECUTION_FAILED, so both the error message and the
nudge key off accurate codes. AMR's own runs are excluded from the
nudge — they keep the dedicated sign-in / recharge affordances.

* feat(web): rework failed-run AMR guidance into per-case error UI

Replace the single inline nudge with a per-case failed-run experience
driven by the run's error code + agent:

- The error card is now neutral gray (was red) and always carries a
  retry button; it is driven by the persisted per-message error event so
  it survives a reload.
- Non-AMR agent hitting a model/auth/quota wall: a theme-color promotion
  card under the error card offers "switch to AMR & retry" — switches the
  run to AMR, opens Settings on the AMR card, and auto-retries once the
  account signs in (ProjectView polls vela login status, independent of
  the Settings pill lifecycle, with success / 5-min-timeout / unmount
  exits).
- AMR agent unauthorized: clearer copy + an "authorize & retry" button.
- AMR agent out of balance: clearer copy + a "top up" button to the AMR
  wallet, with manual retry.
- Settings AMR card: when opened from the nudge, it scrolls into view and
  pulses, and an authorize-button coachmark (a fake hand cursor that
  rises in and dismisses on hover) points at the sign-in control when not
  yet authorized.

analytics: surface_view (run_failed_toast) on the promotion card and
ui_click (go_amr) on its action are retained. i18n adds chat.amrCard.*
and chat.amrError.* (en / zh-CN / zh-TW translated; other locales fall
back to en) and drops the old chat.amrErrorGuidance keys.

* fix(daemon): require status context for numeric service-failure codes

Per review on #3083: the model-service classifier matched bare HTTP
status numbers (`500`, `502`, `429`, `401`), so ordinary CLI output like
`line 500`, `read 502 bytes`, or `exit code 401` could be misclassified
as a provider outage / auth wall and wrongly surface the AMR nudge. Now
a status number only counts when it carries explicit context (`HTTP 500`,
`status 503`, `code: 401`, `502 Bad Gateway`); textual provider phrases
(overloaded, bad gateway, service unavailable, rate limit, …) are
unchanged. Adds fixtures proving unrelated numeric output stays null.

* fix(web): keep error pill for failed runs ChatPane's card doesn't cover

Per review on #3083: the per-message gray error pill was suppressed for
every persisted error status event, but ChatPane only renders the
replacement top-level error card for `retryableAssistantMessage` (the
last failed assistant). So a failed turn that is no longer last (after a
follow-up) or an older failed run in history showed neither the pill nor
the card — its error detail vanished, undercutting reload/history
survival. ChatPane now passes `errorCardOwnerId` (the assistant id whose
error the card represents); AssistantMessage suppresses only that one
pill and keeps rendering StatusPill for all other error events.
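
The ownership rule reduces to one comparison. A sketch with hypothetical names (`shouldRenderStatusPill`, plain string ids) rather than the actual component props:

```typescript
// errorCardOwnerId is the assistant message whose error the top-level card
// already shows; null means no card is rendered.
function shouldRenderStatusPill(
  messageId: string,
  errorCardOwnerId: string | null
): boolean {
  return messageId !== errorCardOwnerId;
}

// Which failed messages keep their own pill, given the card's owner.
function messagesWithPill(
  failedMessageIds: string[],
  errorCardOwnerId: string | null
): string[] {
  return failedMessageIds.filter((id) => shouldRenderStatusPill(id, errorCardOwnerId));
}
```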

* fix(daemon): don't treat a process exit code as an HTTP status

Follow-up to review on #3083: the status-context helper accepted a bare
`code` prefix, so `exit code 401` / `process exited with code 429` still
matched and got classified as AGENT_AUTH_REQUIRED / RATE_LIMITED (the
very `exit code 401` case the comment calls out as noise). `code` now
only counts when qualified (`status code` / `error code` / `response
code`) or punctuation-bound (`code: 401`); bare `exit code N` no longer
matches. Adds fixtures for exit-code lines returning null.
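
The resulting rule set can be approximated with a few context-bound patterns. This is a sketch built from the examples quoted in these two fixes, not the daemon's classifier: the regexes, the function names, and the status-to-code mapping are assumptions, and the sketch does not special-case a line such as `exit code: 401`.

```typescript
// A three-digit number counts only when one of these contexts surrounds it.
const STATUS_PATTERNS: RegExp[] = [
  /\bHTTP(?:\/\d(?:\.\d)?)?\s+(\d{3})\b/i, // "HTTP 500", "HTTP/1.1 502"
  /\b(?:status|error|response)\s+code\s*[:=]?\s*(\d{3})\b/i, // "status code 429"
  /\bstatus\s*[:=]?\s*(\d{3})\b/i, // "status 503"
  /\bcode:\s*(\d{3})\b/i, // "code: 401" (punctuation-bound)
  /\b(\d{3})\s+(?:Bad Gateway|Service Unavailable|Too Many Requests|Unauthorized)\b/i,
];

function extractHttpStatus(line: string): number | null {
  for (const pattern of STATUS_PATTERNS) {
    const match = pattern.exec(line);
    if (match) return Number(match[1]);
  }
  return null;
}

type FailureCode = "AGENT_AUTH_REQUIRED" | "RATE_LIMITED" | "UPSTREAM_UNAVAILABLE";

// Assumed mapping: 401 is auth, 429 is rate limiting, 5xx is upstream.
function classifyStatusLine(line: string): FailureCode | null {
  const status = extractHttpStatus(line);
  if (status === null) return null;
  if (status === 401) return "AGENT_AUTH_REQUIRED";
  if (status === 429) return "RATE_LIMITED";
  if (status >= 500 && status <= 599) return "UPSTREAM_UNAVAILABLE";
  return null;
}
```

Lines such as `line 500`, `read 502 bytes`, and `exit code 401` match none of the patterns and classify as null.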

* chore(web): translate AMR card / error keys for 16 remaining locales

PR #3083 added 10 new `chat.amrCard.*` / `chat.amrError.*` keys but only
provided en/zh-CN/zh-TW translations; the other 16 locales fell back to
English. Translate the card title/body, three chips, primary CTA, and
the AMR self-error (auth / balance) messages and buttons for ar, de,
es-ES, fa, fr, hu, id, it, ja, ko, pl, pt-BR, ru, th, tr, uk.

* fix(amr): address review feedback on #2355

Targeted fixes for the unresolved review threads on #2355. Each fix
includes / updates a focused test.

- runtimes/executables.ts: `packagedVelaOpenCodeCompanionTree` now
  verifies the inner `opencode` executable exists + is runnable, not
  just the directory. This closes the false-positive availability path
  that let `detectAgents()` surface AMR as available even when the
  packaged companion was empty / partially copied (mrcfps, 4 threads).

- runtimes/executables.ts: `resolveAmrOpenCodeExecutable` now prefers
  the bundled `<OD_RESOURCE_ROOT>/bin/libexec/opencode/opencode` over a
  stale `opencode` on the user's PATH, so packaged AMR builds can't be
  hijacked by a global installation.

- web/EntryShell.tsx: when the Local CLI scan returns an available
  agent and the previously-selected agent is AMR, switch the selection
  to the first available local agent so the runtime and persisted
  agent agree before Continue.

- server.ts (model-probe branch): for AMR, check `readVelaLoginStatus`
  BEFORE rejecting on an empty live-model catalog — a signed-out user
  was getting `AMR_MODEL_UNAVAILABLE` ("choose a model") instead of
  the correct `AMR_AUTH_REQUIRED` (sign-in affordance).

- server.ts (default model fallback): if the user asked for the AMR
  agent default and the cached id is no longer in the FRESH catalog,
  fall back to `liveModels[0]` from the probe instead of rejecting the
  run as `AMR_MODEL_UNAVAILABLE`.

- integrations/vela.ts: route `vela login` through
  `createCommandInvocation` so an npm/Node-style `vela.cmd` / `.bat`
  shim on Windows gets the correct `cmd.exe /d /s /c …` wrapping with
  verbatim args (matches `execAgentFile` / chat-run spawning).

- tools/pack/src/linux.ts: in containerized Linux builds, bind-mount
  the host directory of `OPEN_DESIGN_VELA_CLI_BIN` and rewrite the env
  to the container-side path. The host path was being passed in as-is
  even though the default container only mounts /project, /tools-pack
  and cache/home — `copyOptionalVelaCliBinary` saw a missing path.
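
The two server.ts bullets above (auth check before the empty-catalog rejection, and the `liveModels[0]` fallback for the agent default) can be sketched together as follows. This is a simplified illustration, not the server.ts code: `amrPreflight`, its input shape, and the handling of an explicitly requested model that is missing from the catalog are assumptions.

```typescript
type PreflightResult =
  | { ok: true; model: string }
  | { ok: false; code: "AMR_AUTH_REQUIRED" | "AMR_MODEL_UNAVAILABLE" };

// Hypothetical input shape for illustration.
interface PreflightInput {
  signedIn: boolean; // stand-in for the result of readVelaLoginStatus
  liveModels: string[]; // fresh catalog from the model probe
  requestedModel: string | null; // null means "use the AMR agent default"
  cachedDefault: string | null; // previously cached default model id
}

function amrPreflight(input: PreflightInput): PreflightResult {
  // Auth is checked first, so a signed-out user gets the sign-in affordance
  // rather than a "choose a model" error caused by an empty catalog.
  if (!input.signedIn) return { ok: false, code: "AMR_AUTH_REQUIRED" };

  const [firstLive] = input.liveModels;
  if (firstLive === undefined) return { ok: false, code: "AMR_MODEL_UNAVAILABLE" };

  if (input.requestedModel === null) {
    // Agent default: keep the cached id if it is still live, else fall back.
    const cached = input.cachedDefault;
    const model = cached !== null && input.liveModels.includes(cached) ? cached : firstLive;
    return { ok: true, model };
  }

  // Assumption: an explicitly requested model must still be in the catalog.
  return input.liveModels.includes(input.requestedModel)
    ? { ok: true, model: input.requestedModel }
    : { ok: false, code: "AMR_MODEL_UNAVAILABLE" };
}
```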

Deferred (out of scope for this PR):
- `od amr status/login/logout/cancel` CLI subcommands (AGENTS.md
  UI/CLI dual-track rule, server.ts:5763) — sizable surface; tracked
  for a separate focused PR.
- Strict `--require-vela-cli` for Windows + mac-x64 beta builds:
  prematurely blocked — `@powerformer/vela-cli` only publishes the
  `darwin-arm64` platform binary today; adding the flag elsewhere
  would fail the builds. Revisit once win/x64/linux binaries ship.

* fix(amr): hoist sendAmrAccountFailure above the AMR catalog preflight (TDZ)

The new signed-out AMR branch in the catalog preflight at server.ts:10875
calls `sendAmrAccountFailure(...)` to emit AMR_AUTH_REQUIRED, but the
const declaration sat ~100 lines below at the outer function scope. Because
`const` bindings stay in the temporal dead zone (TDZ) until their
declaration is evaluated, that branch would have thrown `ReferenceError:
Cannot access 'sendAmrAccountFailure' before initialization` for the
exact users it tries to help, defeating the original intent.

Hoist the helper to just above the AMR preflight block so it's available
to every AMR code path in this function. Behavior elsewhere is unchanged.
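
A reduced sketch of the hazard and the hoist, for illustration only. The function names and shapes are hypothetical, and the early call is placed inside a nested callback here because TypeScript rejects a direct use-before-declaration in the same scope; the real server.ts layout is not shown in this log.

```typescript
// Broken ordering: the helper is declared after the branch that needs it.
function runBroken(signedIn: boolean): string {
  const preflight = (): string | null =>
    // `sendFailure` is in scope but still uninitialized when this runs early.
    signedIn ? null : sendFailure("AMR_AUTH_REQUIRED");

  const early = preflight(); // throws for signed-out users
  if (early !== null) return early;

  const sendFailure = (code: string): string => `failed: ${code}`;
  return signedIn ? "ok" : sendFailure("UNREACHABLE");
}

// Fixed ordering: the helper is hoisted above every path that calls it.
function runFixed(signedIn: boolean): string {
  const sendFailure = (code: string): string => `failed: ${code}`;

  const preflight = (): string | null =>
    signedIn ? null : sendFailure("AMR_AUTH_REQUIRED");

  const early = preflight();
  if (early !== null) return early;
  return "ok";
}
```

The signed-in path never touches the helper before its declaration, which is why the broken ordering only fails for signed-out users.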

Rerunning the daemon test suite also surfaced a stale fixture:
`launch.test.ts > resolveAgentLaunch uses packaged built-in Vela for AMR`
created only the `<resourceRoot>/bin/libexec/opencode/` companion
*directory*, but this PR's earlier tightening of
`packagedVelaOpenCodeCompanionTree` also requires the inner `opencode`
executable. Add it to that fixture to match the new contract; the test is
a sibling of the executables / env-and-detection fixtures already updated
in 13fc4f4.

Addresses #2355 review (mrcfps, 2026-05-28).

* feat(web): add hover cancel for AMR login (#3158)

* feat(web): add hover cancel for AMR login

* fix(web): don't bounce AmrLoginPill back to 'Signing in…' after local cancel

Both codex-connector (P2) and looper (CHANGES_REQUESTED) on this PR
flagged the same race in the new local-cancel path: `handleCancelLogin`
dispatches `notifyAmrLoginStatusChanged('login-canceled')` immediately
after `/login/cancel` returns, but the `AMR_LOGIN_STATUS_EVENT` listener
unconditionally re-enters `refresh()` and then restarts polling
whenever `/api/integrations/vela/status` still reports
`loginInFlight: true`.

That is a real race because the daemon's `cancelVelaLogin()` only sends
SIGTERM (escalating to SIGKILL after `LOGIN_CANCEL_KILL_GRACE_MS` =
2000 ms) and keeps the child in `activeLoginProcs` until it actually
exits — so the first `/status` read after a successful cancel can
legally still come back as in-flight. Under that window the pill flips
back to 'Signing in…' and can later surface the timeout/error path even
though the user already canceled, defeating the behavior promised in
the PR description.

Fix the listener instead of every dispatch site: in the
`login-canceled` branch, after the local reset (stopPolling +
setPending(null) + clear refs), optimistically mark every subscribed
pill instance as not-in-flight (`setStatus((c) => c ? { ...c,
loginInFlight: false } : c)`) and `return` — skip the
refresh-and-reconcile branch below entirely. The next explicit refresh
(component mount, user interaction, or a `status-changed` event) will
pick up the daemon's confirmed state once the child has actually
exited.
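
The listener change can be sketched as a pure state transition. This is a hedged illustration rather than the AmrLoginPill code: `PillState`, `VelaStatus`, and `reduceLoginEvent` are hypothetical names, and the real listener works through React setters and refs instead of returning a value.

```typescript
// Hypothetical shapes; only `loginInFlight` and the event names come from the commit message.
interface VelaStatus {
  loginInFlight: boolean;
  signedIn: boolean;
}

interface PillState {
  status: VelaStatus | null;
  pending: string | null;
  polling: boolean;
}

type LoginEvent = "login-canceled" | "status-changed";

// Returns the next local state and whether the listener should go on to refresh().
function reduceLoginEvent(
  state: PillState,
  event: LoginEvent,
): { next: PillState; refresh: boolean } {
  if (event === "login-canceled") {
    return {
      next: {
        polling: false, // stopPolling
        pending: null, // setPending(null)
        // Optimistically mark the login as no longer in flight.
        status: state.status ? { ...state.status, loginInFlight: false } : state.status,
      },
      // Skip refresh-and-reconcile: the daemon may still report in-flight
      // until the child process has actually exited.
      refresh: false,
    };
  }
  return { next: state, refresh: true };
}
```

Because the cancel branch returns without refreshing, a `/status` response that still says `loginInFlight: true` during the kill grace window cannot restart polling.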

Add a focused regression test that holds `/api/integrations/vela/status`
at `loginInFlight: true` even after a successful `/login/cancel`,
asserting that the pill stays at the Canceled → Authorize sequence and
never bounces back to 'Signing in…'. This test fails on the pre-fix
listener and passes on the new behavior; existing
'cancels an in-flight AMR sign-in…' and 'reconciles late AMR browser
completion to Signed in after local cancel' tests continue to pass.

Addresses review feedback on #3158 (chatgpt-codex-connector, nettee).

---------

Co-authored-by: lefarcen <935902669@qq.com>

---------

Co-authored-by: a1chzt <chizblank@gmail.com>
Co-authored-by: Amy <1184569493@qq.com>
Co-authored-by: Mason <jinmeihong0201@gmail.com>
Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com>
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-28 05:09:55 +00:00
Marc Chan
125dcd0174 fix(ci): run fork visual reports from trusted code (#2935)
* fix: run fork visual reports from trusted code

* fix: auto-approve strict web visual capture

* fix: address visual report review feedback

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: propagate visual report storage failures

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: validate PR screenshots before upload

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: validate visual PR identity before comment

* fix: harden fork visual report validation

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: address remaining fork visual report review feedback

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: handle stale fork visual report lookup

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: allow stale fork visual report fallback

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)
2026-05-26 06:17:04 +00:00
Jiannanya
f4af51d550 fix(skills): update d3-visualization skill upstream to snow-d3 and expand skill metadata (#1981)
* update d3 visualization skill

* update d3 skill info

* fix(skills/d3): align seed triggers and clone path with SKILL.md

- Add 'd3 scroll' to the d3-visualization triggers array in
  seed-curated-design-skills.ts so it matches the 16 triggers
  already present in skills/d3-visualization/SKILL.md.
- Change `git clone` target from `.` to `skills/snow-d3` so
  the install command produces the path described by the prose.
2026-05-25 10:58:23 +00:00