vas3k-TaxHacker/docker-compose.yml
Vasily Zubarev 0bed4a6e84 feat: new app - email/smtp listener (#102)
* feat: initial email impl

* feat: IMAP email ingest (builds on the scaffold) (#100)

* chore: add imap-simple, mailparser, vitest

* feat: AES-256-GCM helpers for email credentials

* feat: extract ingestUnsortedFile helper, reuse in upload action

* chore: gitignore .worktrees/

* feat: email-sync types and pure attachment/search filters

* feat: imap-simple + mailparser client wrapper

* feat: email sync orchestration with UID watermark + status persistence

* feat: encrypt email credentials at rest, add UID/addedAt fields

* feat: real IMAP test-connection, scoped sync-now, thin cron entry

* docs: update email app README to match real IMAP/encryption/UID behavior

* fix: nest SINCE search criteria and guard missing addedAt for first-run sync

* fix: show last-sync time and error detail from sync in server card

* fix: skip storage recompute when no attachments ingested

Avoids an ENOENT crash on first sync when the user's uploads dir does not exist yet and nothing was ingested; this was also masking the real per-server error. Adds regression tests for the guard.

* feat: configurable initial-grab window (fetch-since date)

First sync is bounded by a user-chosen 'Fetch emails since' date instead of the server's addedAt; blank = entire mailbox (IMAP ALL). The UID watermark takes over after the first run.
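
A minimal sketch of how the first-run window and the UID watermark could translate into search criteria, assuming the nested-array criteria format that node-imap and imap-simple accept. `buildSearchCriteria`, its parameters, and the `Criterion` type are hypothetical names for illustration, not the project's actual helper.

```typescript
// One search criterion: a bare keyword ("ALL") or a nested [key, value] pair.
type Criterion = string | [string, string | Date];

// Hypothetical helper: pick the search criteria for one sync run.
function buildSearchCriteria(
  lastProcessedUid: number | undefined,
  fetchSince: Date | undefined,
): Criterion[] {
  // After the first run, the UID watermark takes over.
  if (lastProcessedUid !== undefined && lastProcessedUid > 0) {
    return [["UID", `${lastProcessedUid + 1}:*`]];
  }
  // First run with a "Fetch emails since" date: SINCE must be nested in its own array.
  if (fetchSince) {
    return [["SINCE", fetchSince]];
  }
  // First run with a blank date: the entire mailbox.
  return ["ALL"];
}
```

Note that the nested form `[["SINCE", date]]` differs from the flat `["SINCE", date]`, which would be read as two separate criteria.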

* fix: add missing @langchain/core dependency

@langchain/core is only a peer dep of the @langchain/* packages and was not installed on a clean npm install, breaking the build (e.g. /unsorted via ai/analyze).

* fix: harden email sync — UID dedup guard, locked status write, graceful decrypt, scrypt memo

Addresses review findings: skip messages at/below the UID watermark (defends against the IMAP `n:*` re-fetch quirk); lock the app_data row with SELECT FOR UPDATE so concurrent cron/manual syncs can't clobber each other; return a friendly error when a stored password can't be decrypted (e.g. after BETTER_AUTH_SECRET rotation) and document the coupling; memoize the scrypt-derived key.
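
Three of these hardening steps can be sketched as follows, using only Node's built-in `crypto` module. This is an illustration under stated assumptions, not the project's code: the function names, the fixed salt string, and the `iv.tag.ciphertext` payload layout are all hypothetical. The `SELECT FOR UPDATE` row lock is not covered here because it needs a live database.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// UID dedup guard: a `n:*` search can still return the newest message even when
// its UID is below n, so anything at or below the watermark is dropped.
function uidsAboveWatermark(uids: number[], lastProcessedUid: number): number[] {
  return uids.filter((uid) => uid > lastProcessedUid).sort((a, b) => a - b);
}

// scrypt is deliberately slow, so the derived key is memoized per secret.
const keyCache = new Map<string, Buffer>();
function deriveKey(secret: string): Buffer {
  let key = keyCache.get(secret);
  if (!key) {
    key = scryptSync(secret, "email-credentials", 32); // salt value is an assumption
    keyCache.set(secret, key);
  }
  return key;
}

// AES-256-GCM encryption; the payload layout is assumed for this sketch.
function encrypt(plain: string, secret: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(secret), iv);
  const body = Buffer.concat([cipher.update(plain, "utf8"), cipher.final()]);
  return [iv, cipher.getAuthTag(), body].map((b) => b.toString("base64")).join(".");
}

type DecryptResult = { ok: true; value: string } | { ok: false; error: string };

// Graceful decrypt: a wrong key or malformed payload yields a readable error
// instead of an uncaught exception.
function tryDecrypt(payload: string, secret: string): DecryptResult {
  try {
    const [iv, tag, body] = payload.split(".").map((p) => Buffer.from(p, "base64"));
    const decipher = createDecipheriv("aes-256-gcm", deriveKey(secret), iv);
    decipher.setAuthTag(tag);
    const plain = Buffer.concat([decipher.update(body), decipher.final()]);
    return { ok: true, value: plain.toString("utf8") };
  } catch {
    return {
      ok: false,
      error: "Stored password could not be decrypted. Was BETTER_AUTH_SECRET rotated?",
    };
  }
}
```

Because the key is derived from the secret, a rotated secret makes GCM authentication fail in `decipher.final()`, which is the case the friendly error is meant to catch.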

* feat: enforce per-server syncInterval on cron; skip non-Buffer attachments

The cron now honors each server's syncInterval (manual Sync Now bypasses the throttle), so the configured interval is no longer ignored. Attachments whose parsed content is not a Buffer are skipped instead of throwing on .length. Adds throttle regression tests.
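
A rough sketch of both behaviors, assuming the parsed attachment exposes a `content` field of unknown type. `isSyncDue`, `usableAttachments`, and `ParsedAttachment` are hypothetical names, and the interval is passed in milliseconds to stay neutral about the stored unit.

```typescript
// Shape assumed for a parsed attachment; only `content` matters here.
interface ParsedAttachment {
  filename?: string;
  content: unknown;
}

// Keep only attachments whose content really is a Buffer, so later code can
// read `.length` safely.
function usableAttachments(
  attachments: ParsedAttachment[],
): Array<{ filename: string; content: Buffer }> {
  const usable: Array<{ filename: string; content: Buffer }> = [];
  for (const attachment of attachments) {
    if (!Buffer.isBuffer(attachment.content)) continue; // skip instead of throwing
    usable.push({
      filename: attachment.filename ?? "attachment",
      content: attachment.content,
    });
  }
  return usable;
}

// Throttle: the cron path waits for the per-server interval to elapse,
// while a manual "Sync Now" always runs.
function isSyncDue(
  lastSyncAt: Date | undefined,
  intervalMs: number,
  now: Date,
  manual = false,
): boolean {
  if (manual || !lastSyncAt) return true;
  return now.getTime() - lastSyncAt.getTime() >= intervalMs;
}
```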

* refactor: remove dead lastProcessedMessageId field; clarify cron throttle in README

lastProcessedMessageId was superseded by the lastProcessedUid watermark and never read; dropped from the type and form state. README now describes the per-server interval as an app-level throttle (manual Sync Now bypasses).

* feat(email): UI-selectable sync frequency + working cron heartbeat

Replace the per-server sync-interval number input with a dropdown of
presets (15m/30m/hourly/6h/12h/daily). Switch the stored unit from hours
to minutes and update the throttle accordingly.

Make the cron actually run: heartbeat now fires every 5 minutes as the
resolution floor while each mailbox's UI frequency gates real fetches.
Propagate env into cron jobs via /etc/cron.env (cron strips the
environment) and add BETTER_AUTH_SECRET to the email-sync service in the
dev/build compose files so stored passwords can be decrypted.
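
To illustrate how a fixed 5-minute heartbeat can act as the resolution floor while the per-mailbox frequency gates real fetches, here is a small simulation. The preset keys and the `fetchTimes` helper are hypothetical; only the preset durations and the 5-minute heartbeat come from the description above.

```typescript
// Preset durations in minutes; the key names are assumed labels.
const SYNC_PRESETS_MINUTES = {
  "15m": 15,
  "30m": 30,
  hourly: 60,
  "6h": 360,
  "12h": 720,
  daily: 1440,
} as const;

type SyncPreset = keyof typeof SYNC_PRESETS_MINUTES;

const HEARTBEAT_MINUTES = 5;

// Simulate heartbeat ticks from minute 0 to totalMinutes and return the
// minutes at which a real fetch would happen for the given preset.
function fetchTimes(preset: SyncPreset, totalMinutes: number): number[] {
  const interval = SYNC_PRESETS_MINUTES[preset];
  const fetched: number[] = [];
  let lastFetch = Number.NEGATIVE_INFINITY; // never synced yet
  for (let t = 0; t <= totalMinutes; t += HEARTBEAT_MINUTES) {
    if (t - lastFetch >= interval) {
      fetched.push(t);
      lastFetch = t;
    }
  }
  return fetched;
}

console.log(fetchTimes("15m", 30)); // [ 0, 15, 30 ]
```

In this idealized model every preset is a multiple of the heartbeat, so fetches land exactly on schedule; in a real deployment a fetch can slip by up to one heartbeat.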

* fix(email): reset Add Server dialog to provider selection on close

Radix's onOpenChange only toggled isOpen, so closing the dialog via Esc,
overlay click, or the X left the step/selectedProvider state intact.
Reopening then jumped straight to the previous provider's config form
instead of the provider-selection screen. Route every close through
handleClose() to reset the step.
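
The fix can be modeled without React as a small state machine. This is a framework-free sketch: the state shape, the step names, and the `"gmail"` provider id are assumptions, and only `handleClose`, `onOpenChange`, `isOpen`, `step`, and `selectedProvider` are names taken from the description above.

```typescript
type Step = "select-provider" | "configure";

interface DialogState {
  isOpen: boolean;
  step: Step;
  selectedProvider: string | null;
}

const initialState: DialogState = {
  isOpen: false,
  step: "select-provider",
  selectedProvider: null,
};

// Choosing a provider advances to its config form.
function selectProvider(state: DialogState, provider: string): DialogState {
  return { ...state, step: "configure", selectedProvider: provider };
}

// Every close path (Esc, overlay click, X) resets the whole dialog.
function handleClose(): DialogState {
  return { ...initialState };
}

// The handler wired to the dialog: opening only flips isOpen,
// closing is routed through handleClose().
function onOpenChange(state: DialogState, open: boolean): DialogState {
  return open ? { ...state, isOpen: true } : handleClose();
}
```

Reopening after any close then starts from the provider-selection step rather than the previous provider's form.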

---------

Co-authored-by: Evgenii Burmakin <Freika@users.noreply.github.com>
2026-06-18 23:30:38 +02:00

68 lines
1.8 KiB
YAML

services:
  app:
    image: ghcr.io/vas3k/taxhacker:latest
    ports:
      - "7331:7331"
    environment:
      - NODE_ENV=production
      - SELF_HOSTED_MODE=true
      - UPLOAD_PATH=/app/data/uploads
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/taxhacker
    volumes:
      - ./data:/app/data
    restart: unless-stopped
    depends_on:
      - postgres
    logging:
      driver: "local"
      options:
        max-size: "100M"
        max-file: "3"

  email-sync:
    image: ghcr.io/vas3k/taxhacker:latest
    environment:
      - NODE_ENV=production
      - SELF_HOSTED_MODE=true
      - UPLOAD_PATH=/app/data/uploads
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/taxhacker
      # Must match the app's secret: it derives the key that decrypts stored mailbox passwords.
      - BETTER_AUTH_SECRET=${BETTER_AUTH_SECRET}
    volumes:
      - ./data:/app/data
      - ./etc/crontab:/etc/cron.d/email-sync:ro
    restart: unless-stopped
    depends_on:
      - postgres
      - app
    command: >
      sh -c "
        apt-get update && apt-get install -y cron &&
        printenv | grep -E '^(DATABASE_URL|BETTER_AUTH_SECRET|UPLOAD_PATH|NODE_ENV|SELF_HOSTED_MODE|BASE_URL|PATH)=' > /etc/cron.env &&
        chmod 0644 /etc/cron.d/email-sync &&
        crontab /etc/cron.d/email-sync &&
        touch /var/log/email-sync.log &&
        cron &&
        tail -f /var/log/email-sync.log
      "
    logging:
      driver: "local"
      options:
        max-size: "100M"
        max-file: "3"

  postgres:
    image: postgres:17-alpine
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=taxhacker
    volumes:
      - ./pgdata:/var/lib/postgresql/data
    restart: unless-stopped
    logging:
      driver: "local"
      options:
        max-size: "100M"
        max-file: "3"