larksuite-cli/shortcuts/doc/clipboard_test.go
河伯 bc6590abef feat(doc): add --from-clipboard flag to docs +media-insert (#508)
* feat(doc): add --from-clipboard flag to docs +media-insert

Allow users to upload the current clipboard image directly to a Lark
document without saving to a local file first.

- New --from-clipboard bool flag (mutually exclusive with --file)
- shortcuts/doc/clipboard.go: readClipboardToTempFile() with per-OS impl
    macOS   — osascript (built-in, no extra deps)
    Windows — PowerShell + System.Windows.Forms (built-in)
    Linux   — tries xclip / wl-paste / xsel in order; clear install hint
              on failure
- No new Go dependencies, no Cgo
- Temp file is created before upload and removed via defer cleanup()
- --file changed from Required:true to optional; Validate enforces
  exactly-one of --file / --from-clipboard
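
The exactly-one rule for --file / --from-clipboard can be sketched in a few lines of Go. This is a minimal illustration only: validateSource and its error wording are hypothetical and are not taken from the repository's Validate implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// validateSource is a hypothetical helper: it accepts the two flag values
// and enforces that exactly one image source was selected.
func validateSource(file string, fromClipboard bool) error {
	switch {
	case file != "" && fromClipboard:
		return errors.New("--file and --from-clipboard are mutually exclusive")
	case file == "" && !fromClipboard:
		return errors.New("one of --file or --from-clipboard is required")
	}
	return nil
}

func main() {
	fmt.Println(validateSource("shot.png", true))  // both set: error
	fmt.Println(validateSource("", false))         // neither set: error
	fmt.Println(validateSource("shot.png", false)) // exactly one: <nil>
}
```

Checking in one place keeps the flag definitions themselves optional, which is why --file no longer needs Required:true.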

* fix(doc): fix clipboard image read on macOS for screenshots and browser-copied images

- Add TIFF fallback (macOS screenshots default to TIFF, not PNG)
- Add HTML base64 fallback (images copied from Feishu/browser embed data URI)
- Use current directory for temp file so FileIO path validation passes

* fix(doc): scan HTML/RTF/text clipboard formats for base64 image data URIs

Extend attempt-3 fallback to iterate all text-based clipboard formats
(HTML, RTF, UTF-8, plain text) rather than only HTML.  Any format that
contains a "data:<mime>;base64,<data>" pattern is accepted, covering
images copied from Feishu, Chrome, Safari, and other apps that embed
base64 in non-HTML clipboard slots.  Also handle URL-safe base64.

* test(doc): add unit tests for clipboard helpers to meet 60% coverage threshold

Cover decodeHex, hexVal, decodeOsascriptData, reBase64DataURI, and
extractBase64ImageFromClipboard (via fake osascript on PATH).
Package coverage: 57% → 61.2%.
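
For readers unfamiliar with the osascript literal format these tests exercise, the sketch below decodes a «data HTML<hex>» literal. It assumes the shape used in the test file (guillemets, the word "data", a 4-character type code, then hex digits). decodeLiteral is a hypothetical name, and it leans on encoding/hex rather than the repository's own decodeHex.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// decodeLiteral is a hypothetical helper. Input that is not wrapped in
// guillemets is returned unchanged; wrapped input is hex-decoded after
// dropping the "data " keyword and the 4-character type code.
func decodeLiteral(s string) ([]byte, error) {
	const open, closing = "\u00ab", "\u00bb" // « and » in UTF-8
	if !strings.HasPrefix(s, open) || !strings.HasSuffix(s, closing) {
		return []byte(s), nil
	}
	body := strings.TrimSuffix(strings.TrimPrefix(s, open), closing)
	body = strings.TrimPrefix(body, "data ")
	if len(body) < 4 {
		return nil, errors.New("literal too short for a 4-character type code")
	}
	return hex.DecodeString(body[4:]) // accepts upper- and lower-case digits
}

func main() {
	got, err := decodeLiteral("\u00abdata HTML3C696D673E\u00bb")
	fmt.Printf("%q %v\n", got, err)
}
```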

* fix(doc): address CodeRabbit review comments on clipboard feature

- Extend reBase64DataURI regex to cover URL-safe base64 chars (-_) so
  URL-safe payloads are matched before decoding is attempted
- Fix readClipboardLinux to continue to next tool when a found tool
  returns empty output instead of failing immediately
- Guard fake-osascript test with runtime.GOOS == "darwin" skip
- Use os.PathListSeparator instead of hardcoded ":" in test PATH setup

* fix(doc): replace os.* temp-file clipboard path with in-memory streaming

Fixes forbidigo lint violations in shortcuts/doc: os.CreateTemp, os.Remove,
os.Stat, os.WriteFile are banned in shortcuts/; replaced with vfs.* equivalents
for sips TIFF→PNG conversion, and eliminated temp files entirely elsewhere by
having platform clipboard readers return []byte directly.

- readClipboardDarwin: osascript outputs hex literals decoded in Go (no file I/O)
- readClipboardWindows: PowerShell outputs base64 to stdout, decoded in Go
- readClipboardLinux: tool stdout bytes returned directly
- convertTIFFToPNGViaSips: still needs temp files — uses vfs.CreateTemp/Remove
- DriveMediaUploadAllConfig/DriveMediaMultipartUploadConfig: add Content io.Reader
  field so in-memory clipboard bytes skip FileIO.Open() path
- Fix ineffassign in clipboard_test.go (scriptBody double-assignment)
- Update TestReadClipboardLinux_NoToolsReturnsError for new signature

* fix(doc): address CodeRabbit review comments on Linux clipboard path

- Update --from-clipboard flag description to list xclip, xsel and wl-paste
- Preserve last backend-specific error in readClipboardLinux so users see
  a meaningful message when a tool is found but fails
- Validate PNG magic bytes for xsel output (xsel cannot negotiate MIME types)
- Add URL-safe base64 regression test for reBase64DataURI
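
The fallback behaviour described for readClipboardLinux (skip tools that are not installed, move on when a tool returns nothing, and keep the last backend-specific error) can be sketched as below. This is an assumption-laden outline: backend, errNotInstalled and readFirstImage are hypothetical names, and the real code shells out to xclip, wl-paste and xsel instead of calling injected functions.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInstalled is a hypothetical sentinel for "tool not on PATH".
var errNotInstalled = errors.New("tool not installed")

// backend is a hypothetical stand-in for one clipboard tool.
type backend struct {
	name string
	read func() ([]byte, error)
}

// readFirstImage tries each backend in order and remembers the most recent
// backend-specific failure so the caller sees why the last found tool failed.
func readFirstImage(backends []backend) ([]byte, error) {
	var lastErr error
	for _, b := range backends {
		data, err := b.read()
		switch {
		case errors.Is(err, errNotInstalled):
			continue // tool missing: try the next one
		case err != nil:
			lastErr = fmt.Errorf("%s: %w", b.name, err)
			continue
		case len(data) == 0:
			lastErr = fmt.Errorf("%s: clipboard contains no image data", b.name)
			continue
		}
		return data, nil
	}
	if lastErr != nil {
		return nil, lastErr
	}
	return nil, errors.New("no clipboard tool found; install xclip, wl-paste, or xsel")
}

func main() {
	empty := backend{"xclip", func() ([]byte, error) { return nil, nil }}
	good := backend{"xsel", func() ([]byte, error) { return []byte("png bytes"), nil }}
	data, err := readFirstImage([]backend{empty, good})
	fmt.Println(len(data), err)
}
```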

* fix(doc): strip whitespace from base64 payload before decoding clipboard data URI

HTML and RTF clipboard content often line-wraps base64 at 76 characters.
FindSubmatch returns the raw wrapped token so direct decode would fail.
Normalize whitespace with strings.Fields before passing to base64.Decode.
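
A minimal sketch of the normalisation step, assuming the payload has already been captured as a string. decodeWrappedBase64 is a hypothetical name and is not the helper used in clipboard.go.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeWrappedBase64 is a hypothetical helper: strings.Fields splits on any
// run of whitespace, so joining the fields removes spaces, tabs and line
// breaks before the token reaches the decoder.
func decodeWrappedBase64(token string) ([]byte, error) {
	compact := strings.Join(strings.Fields(token), "")
	return base64.StdEncoding.DecodeString(compact)
}

func main() {
	b64 := base64.StdEncoding.EncodeToString([]byte("clipboard image bytes"))
	folded := b64[:8] + "\r\n\t" + b64[8:] // simulate a folded, indented payload

	_, rawErr := base64.StdEncoding.DecodeString(folded)
	got, err := decodeWrappedBase64(folded)
	fmt.Println(rawErr != nil, string(got), err)
}
```

Go's decoder already skips \r and \n on its own, but spaces and tabs still produce an error, so stripping all whitespace up front covers indented folds as well.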

* fix(doc): drop TIFF fallback and internal/vfs import on macOS clipboard

depguard rule shortcuts-no-vfs forbids shortcuts/ from importing
internal/vfs directly. The only caller was the sips TIFF→PNG
conversion, which was already a fragile best-effort fallback that
required temp files.

Remove the TIFF fallback entirely; the remaining two attempts cover
the real-world cases:
  1. osascript → PNG hex literal — native screenshots and most apps
  2. scan text clipboard formats for base64 data URI — Feishu/browsers

* test(doc): cover readClipboardLinux xsel PNG validation and dispatcher path

Added tests:
- TestReadClipboardLinux_XselRejectsNonPNG: fake xsel that returns plain
  text is rejected by the PNG-magic check, preventing text from being
  uploaded as an "image".
- TestHasPNGMagic: table-driven coverage of the PNG signature check.
- TestReadClipboardImageBytes_UnsupportedPlatform: exercises the shared
  dispatcher post-processing and asserts the (nil, nil) invariant.

Raises clipboard.go diff coverage and brings the package from 61.6% to
63.8% overall.

* test: cover in-memory Content upload paths for clipboard feature

Adds unit tests for the new Content io.Reader branches introduced by
the clipboard feature:

- UploadDriveMediaAll with in-memory Content (drive_media_upload.go 87.5%)
- UploadDriveMediaMultipart with in-memory Content (84.6%)
- uploadDocMediaFile single-part and multipart with clipboard bytes
  (doc_media_upload.go 0% -> 88.9%)

Adds TestNewRuntimeContextForAPI helper that wires Factory, context,
and bot identity so package tests can invoke DoAPI without mounting
the full cobra command tree.

* test: cover clipboard Validate/DryRun branches and testing helper

Adds unit tests for the clipboard-related Validate/DryRun paths that
Codecov patch-coverage was flagging as uncovered:

- Validate error when neither --file nor --from-clipboard is supplied
- Validate error when both are supplied (mutual exclusion)
- DryRun output contains <clipboard image> placeholder
- Self-test for TestNewRuntimeContextForAPI so shortcuts/common
  sees coverage for the new helper (not just shortcuts/doc)

* test: cover Execute clipboard branch via injectable readClipboardImage

Makes readClipboardImageBytes swappable in tests by routing the call
through a package-level variable readClipboardImage. Tests inject a
synthetic PNG payload so the full Execute clipboard flow
(resolve → create block → upload in-memory bytes → bind) runs under
unit test without a real pasteboard.

Covers:
- TestDocMediaInsertExecuteFromClipboard: end-to-end happy path
- TestDocMediaInsertExecuteClipboardReadError: early-return on
  readClipboardImage() failure
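
The seam described above is the usual Go pattern of calling through a package-level function variable. The sketch below reuses the readClipboardImage name from this commit message, but the function bodies, insertFromClipboard and withFakeClipboard are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// readClipboardImage is the swappable seam. In this sketch the default
// implementation always fails because there is no real pasteboard.
var readClipboardImage = func() ([]byte, error) {
	return nil, errors.New("no pasteboard in this sketch")
}

// insertFromClipboard is a hypothetical caller that only goes through the seam.
func insertFromClipboard() (int, error) {
	data, err := readClipboardImage()
	if err != nil {
		return 0, fmt.Errorf("read clipboard: %w", err)
	}
	return len(data), nil
}

// withFakeClipboard swaps in a fake reader, runs fn, and restores the original.
func withFakeClipboard(fake func() ([]byte, error), fn func()) {
	orig := readClipboardImage
	defer func() { readClipboardImage = orig }()
	readClipboardImage = fake
	fn()
}

func main() {
	png := []byte{0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a}
	withFakeClipboard(func() ([]byte, error) { return png, nil }, func() {
		fmt.Println(insertFromClipboard())
	})
	fmt.Println(insertFromClipboard()) // original reader is back in place
}
```

In a real test the restore step would normally be registered with t.Cleanup so it runs even when the test fails early.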

* ci: re-trigger pull_request workflow for PR #508

Previous push to 9dedb7a did not trigger the main CI workflow via
the pull_request event (only PR Labels ran). The workflow_dispatch
run I triggered manually lacks PR-scoped secrets so security and
e2e-live failed. An empty commit replays the pull_request event so
the full matrix (deadcode, license-header, security, e2e-live) runs
with proper context.

* test(doc): guard info.Size() behind err check to prevent nil-deref

CodeRabbit flagged that 't.Fatalf("... size=%d err=%v", info.Size(), err)'
evaluates info.Size() even when os.Stat returned (nil, err), which nil-derefs.
Split the check into two stages so the error-path t.Fatalf does not touch
info.
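
A small sketch of the two-stage check. checkNonEmpty is a hypothetical helper written for illustration; the test in the repository performs the same split inline with t.Fatalf.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// checkNonEmpty is a hypothetical helper. Stage one handles the error and
// returns before info is touched; stage two may safely call info.Size().
func checkNonEmpty(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		// info is nil here, so formatting info.Size() would panic.
		return fmt.Errorf("stat %s: %w", path, err)
	}
	if info.Size() == 0 {
		return errors.New("file is empty: " + path)
	}
	return nil
}

func main() {
	fmt.Println(checkNonEmpty("does-not-exist.png"))
}
```

The original one-line form fails because Go evaluates every argument of t.Fatalf before the call, including info.Size() on the nil FileInfo.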

* fix(doc): address fangshuyu-768 review on clipboard PR

Seven code changes driven by review feedback:

1. clipboard.go: stop using CombinedOutput() on osascript / powershell.
   Stdout is decoded, stderr is captured separately via cmd.Stderr and
   surfaced in the terminal error message, so locale warnings or
   AppleEvent permission prompts no longer pollute the hex/base64
   payload or mask the real failure.

2. clipboard.go: validate decoded base64 data URI bytes against known
   image magic headers (PNG/JPEG/GIF/WebP/BMP). A text clipboard that
   happens to contain a literal 'data:image/...;base64,...' fragment
   (documentation, tutorials, pasted HTML source) no longer silently
   becomes an image upload.

3. clipboard.go: simplify the Linux 'no tool found' install hint to a
   distro-agnostic phrasing instead of apt/yum only.

4. clipboard_test.go: delete the stale TestReadClipboardToTempFile_*
   tests. They referenced a readClipboardToTempFile function that no
   longer exists and only exercised os.CreateTemp/os.Remove. Replace
   with TestReadClipboardImageBytes_EmptyResultReturnsError which
   actually locks in the 'empty clipboard' → error contract of the
   current API (Linux-only since mac/Windows need a real pasteboard).

5. doc_media_upload.go: introduce UploadDocMediaFileConfig struct so
   uploadDocMediaFile takes a named config instead of 8 positional
   params. Drops the //nolint:lll the old call site had to carry.

6. doc_media_insert.go: convert the clipboard upload call to the new
   config struct and only set Config.Content when the clipboard branch
   actually produced bytes — this also fixes a latent typed-nil bug
   where a nil *bytes.Reader was being passed through an io.Reader
   parameter, which tripped the 'if cfg.Content != nil' check in
   UploadDriveMediaAll and crashed --file uploads.

7. shortcuts/common/testing.go: TestNewRuntimeContextForAPI now takes
   the identity as an explicit core.Identity parameter instead of
   hardcoding core.AsBot, and its self-test covers both AsBot and
   AsUser. Existing call sites pass core.AsBot explicitly.
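
Item 2 above can be illustrated with a small signature check. This is a sketch under stated assumptions: sniffImage is a hypothetical name, and the byte patterns are the widely documented file signatures for each format, not code copied from hasKnownImageMagic.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffImage is a hypothetical helper returning a format name when the
// bytes start with a known image signature, or "" otherwise.
func sniffImage(b []byte) string {
	switch {
	case bytes.HasPrefix(b, []byte("\x89PNG\r\n\x1a\n")):
		return "png"
	case bytes.HasPrefix(b, []byte{0xff, 0xd8, 0xff}):
		return "jpeg"
	case bytes.HasPrefix(b, []byte("GIF87a")), bytes.HasPrefix(b, []byte("GIF89a")):
		return "gif"
	case len(b) >= 12 && string(b[:4]) == "RIFF" && string(b[8:12]) == "WEBP":
		return "webp"
	case bytes.HasPrefix(b, []byte("BM")):
		return "bmp" // weak two-byte signature; text starting with "BM" also matches
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", sniffImage([]byte("\x89PNG\r\n\x1a\n...")))
	fmt.Printf("%q\n", sniffImage([]byte("tutorial text, not an image")))
}
```

A check like this only inspects the first few bytes, so it rejects plain text but cannot detect a truncated image.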

Also annotates DryRun output with an 'upload_size_note' when
--from-clipboard is set, since DryRun never reads the pasteboard and
can't predict whether the payload will take the single-part or
multipart path.
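
The typed-nil problem mentioned in item 6 is easy to reproduce in isolation. In the sketch below, uploadConfig, buggyConfig and fixedConfig are hypothetical names; only the Content io.Reader field mirrors what the commit message describes.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// uploadConfig is a hypothetical config with an optional in-memory body.
type uploadConfig struct {
	Content io.Reader
}

// buggyConfig always assigns the pointer. A nil *bytes.Reader stored in an
// io.Reader yields an interface value that is NOT equal to nil.
func buggyConfig(data []byte) uploadConfig {
	var r *bytes.Reader
	if len(data) > 0 {
		r = bytes.NewReader(data)
	}
	return uploadConfig{Content: r}
}

// fixedConfig sets Content only when there are bytes to send.
func fixedConfig(data []byte) uploadConfig {
	var cfg uploadConfig
	if len(data) > 0 {
		cfg.Content = bytes.NewReader(data)
	}
	return cfg
}

func main() {
	fmt.Println(buggyConfig(nil).Content != nil) // true: the check is tripped
	fmt.Println(fixedConfig(nil).Content != nil) // false: file path is used
}
```

An interface value is nil only when both its dynamic type and its value are nil, which is why the conditional assignment matters.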

* fix(doc): capture line-wrapped base64 in clipboard data URI regex (#586)

HTML and RTF clipboard content commonly folds base64 payloads at
76 chars (standard MIME folding). The previous character class
[A-Za-z0-9+/\-_]+=* stopped at the first \n, so the downstream
strings.Fields normalisation was a no-op (nothing to strip) and
extractBase64ImageFromClipboard silently uploaded a truncated
payload whose 8-byte prefix happened to pass hasKnownImageMagic.

Extend the class to include \s so the Fields strip actually has
whitespace to remove before base64 decoding. Terminators (", <,
), ;) remain outside the class so the match still ends at the
URI boundary.

Add TestReBase64DataURI_LineWrapped covering \n, \r\n, and \t
folds, full round-trip byte-equality, and the terminator-boundary
invariant so any future regression trips a failing test.
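
The difference between the two character classes can be seen side by side. Both patterns below are illustrative assumptions built from the class quoted above; they are not the repository's reBase64DataURI, and payload is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// reNarrow stops at the first whitespace; reWide also accepts \s inside the
// payload. Quotes, angle brackets and semicolons stay outside both classes.
var (
	reNarrow = regexp.MustCompile(`data:(image/[a-zA-Z0-9.+-]+);base64,([A-Za-z0-9+/\-_]+=*)`)
	reWide   = regexp.MustCompile(`data:(image/[a-zA-Z0-9.+-]+);base64,([A-Za-z0-9+/\-_\s]+=*)`)
)

// payload returns the captured base64 token with whitespace removed,
// or "" when the pattern does not match.
func payload(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return strings.Join(strings.Fields(m[2]), "")
}

func main() {
	html := "<img src=\"data:image/png;base64,AAAA\nBBBB\r\n\tCC==\">"
	fmt.Println(payload(reNarrow, html)) // AAAA (truncated at the first fold)
	fmt.Println(payload(reWide, html))   // AAAABBBBCC== (full payload)
}
```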

* docs(skill): add clipboard-empty fallback guidance for +media-insert

When --from-clipboard returns 'no image data' (empty clipboard, non-image
content, or Linux without xclip/wl-paste/xsel), the agent must NOT silently
swallow the error. It should tell the user the clipboard had no image, ask
for a local file path, then retry the same insert command with --file.

Lists three anti-patterns (silent success, guessing a file path, pre-emptive
save-then-file workaround) that agents have been tempted into.

* docs(skill): user-stated source trumps clipboard/file heuristic

The heuristic table (prefer --from-clipboard when image is on the
clipboard) is a fallback for when the user is vague. If the user
explicitly says 'use the screenshot I just copied' → clipboard; if
they give a path → --file. Agent must not silently swap sources even
when the other looks 'better'.

---------

Co-authored-by: fangshuyu-768 <shuyufang768@outlook.com>
2026-04-22 22:05:33 +08:00

320 lines
10 KiB
Go

// Copyright (c) 2026 Lark Technologies Pte. Ltd.
// SPDX-License-Identifier: MIT

package doc

import (
	"bytes"
	"encoding/base64"
	"os"
	"runtime"
	"strings"
	"testing"
)

// TestReadClipboardImageBytes_EmptyResultReturnsError locks in the contract
// that readClipboardImageBytes surfaces a clear error (instead of silently
// succeeding with empty bytes) whenever the platform layer produced no image
// data. On Linux runners this is exercised by reusing the "no clipboard tool
// found" path, which is the only portable way to force an empty result
// without a display/pasteboard.
func TestReadClipboardImageBytes_EmptyResultReturnsError(t *testing.T) {
	if runtime.GOOS != "linux" {
		t.Skip("portable empty-result check only runs on Linux; macOS/Windows require a real pasteboard")
	}
	orig := os.Getenv("PATH")
	t.Cleanup(func() { os.Setenv("PATH", orig) })
	os.Setenv("PATH", "")

	data, err := readClipboardImageBytes()
	if err == nil {
		t.Fatalf("expected error on empty clipboard, got data=%d bytes", len(data))
	}
	if len(data) != 0 {
		t.Errorf("expected no data when readClipboardImageBytes errors, got %d bytes", len(data))
	}
}

func TestReadClipboardLinux_NoToolsReturnsError(t *testing.T) {
	// Override PATH so none of xclip/wl-paste/xsel can be found.
	orig := os.Getenv("PATH")
	t.Cleanup(func() { os.Setenv("PATH", orig) })
	os.Setenv("PATH", "")

	_, err := readClipboardLinux()
	if err == nil {
		t.Fatal("expected error when no clipboard tool is available, got nil")
	}
}

func TestReadClipboardLinux_XselRejectsNonPNG(t *testing.T) {
	// Fake xsel that returns plain text (non-PNG). It should be rejected by
	// the PNG-magic validation so the user does not upload text as an "image".
	tmpDir := t.TempDir()
	fakeXsel := tmpDir + "/xsel"
	if err := os.WriteFile(fakeXsel, []byte("#!/bin/sh\nprintf 'not a png'\n"), 0755); err != nil {
		t.Fatalf("write fake xsel: %v", err)
	}
	orig := os.Getenv("PATH")
	t.Cleanup(func() { os.Setenv("PATH", orig) })
	os.Setenv("PATH", tmpDir) // no xclip, no wl-paste; only our fake xsel

	_, err := readClipboardLinux()
	if err == nil {
		t.Fatal("expected error when xsel returns non-PNG bytes, got nil")
	}
}

func TestHasPNGMagic(t *testing.T) {
	tests := []struct {
		name string
		in   []byte
		want bool
	}{
		{"exact PNG signature", []byte{0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a}, true},
		{"PNG signature plus payload", []byte{0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0xde, 0xad}, true},
		{"plain text", []byte("not a png"), false},
		{"empty", []byte{}, false},
		{"too short", []byte{0x89, 0x50, 0x4e, 0x47}, false},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := hasPNGMagic(tt.in); got != tt.want {
				t.Errorf("hasPNGMagic(%v) = %v, want %v", tt.in, got, tt.want)
			}
		})
	}
}

func TestReadClipboardImageBytes_UnsupportedPlatform(t *testing.T) {
	// The dispatcher returns a clear error on platforms we do not support.
	// We cannot flip runtime.GOOS, but we can cover the shared post-processing
	// by invoking the function on any platform and asserting its contract:
	// it returns either data (unlikely in CI) or an error, never (nil, nil).
	data, err := readClipboardImageBytes()
	if err == nil && len(data) == 0 {
		t.Fatal("readClipboardImageBytes returned (nil, nil); must return error when data is empty")
	}
}

func TestDecodeHex(t *testing.T) {
	tests := []struct {
		name    string
		input   string
		want    []byte
		wantErr bool
	}{
		{"empty", "", []byte{}, false},
		{"single byte lower", "2f", []byte{0x2f}, false},
		{"single byte upper", "2F", []byte{0x2f}, false},
		{"multi byte", "48656C6C6F", []byte("Hello"), false},
		{"odd length", "abc", nil, true},
		{"invalid char", "GG", nil, true},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := decodeHex(tt.input)
			if (err != nil) != tt.wantErr {
				t.Fatalf("decodeHex(%q) error=%v, wantErr=%v", tt.input, err, tt.wantErr)
			}
			if !tt.wantErr && string(got) != string(tt.want) {
				t.Errorf("decodeHex(%q) = %v, want %v", tt.input, got, tt.want)
			}
		})
	}
}

func TestDecodeOsascriptData(t *testing.T) {
	// Build a real «data HTML<hex>» literal for the string "<img>"
	raw := []byte("<img>")
	hexStr := ""
	for _, b := range raw {
		hexStr += string([]byte{hexNibble(b >> 4), hexNibble(b & 0xf)})
	}
	// «data HTML3C696D673E» (« = \xc2\xab, » = \xc2\xbb)
	literal := "\xc2\xab" + "data HTML" + hexStr + "\xc2\xbb"

	tests := []struct {
		name  string
		input string
		want  string
	}{
		{"plain string passthrough", "hello world", "hello world"},
		{"osascript hex literal", literal, "<img>"},
		{"empty string", "", ""},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := decodeOsascriptData(tt.input)
			if err != nil {
				t.Fatalf("decodeOsascriptData(%q) unexpected error: %v", tt.input, err)
			}
			if string(got) != tt.want {
				t.Errorf("decodeOsascriptData(%q) = %q, want %q", tt.input, got, tt.want)
			}
		})
	}
}

func TestReBase64DataURI_Match(t *testing.T) {
	imgBytes := []byte{0x89, 0x50, 0x4e, 0x47} // first four bytes of the PNG signature
	b64 := base64.StdEncoding.EncodeToString(imgBytes)
	html := `<img src="data:image/png;base64,` + b64 + `">`

	m := reBase64DataURI.FindSubmatch([]byte(html))
	if m == nil {
		t.Fatal("expected regex to match base64 data URI in HTML")
	}
	if string(m[1]) != "image/png" {
		t.Errorf("mime type = %q, want %q", m[1], "image/png")
	}
	if string(m[2]) != b64 {
		t.Errorf("base64 payload mismatch")
	}
}

func TestReBase64DataURI_URLSafeMatch(t *testing.T) {
	// URL-safe base64 uses '-' and '_' instead of '+' and '/'.
	// Construct a payload that contains both characters.
	// base64url of 0xFB 0xFF 0xFE → "-__-" in URL-safe alphabet.
	urlSafePayload := "-__-"
	html := `<img src="data:image/jpeg;base64,` + urlSafePayload + `">`

	m := reBase64DataURI.FindSubmatch([]byte(html))
	if m == nil {
		t.Fatal("expected regex to match URL-safe base64 data URI")
	}
	if string(m[1]) != "image/jpeg" {
		t.Errorf("mime type = %q, want %q", m[1], "image/jpeg")
	}
	if string(m[2]) != urlSafePayload {
		t.Errorf("URL-safe base64 payload = %q, want %q", m[2], urlSafePayload)
	}
}

func TestReBase64DataURI_NoMatch(t *testing.T) {
	if reBase64DataURI.Match([]byte("no image here")) {
		t.Error("expected no match for plain text")
	}
}

// TestReBase64DataURI_LineWrapped exercises the common real-world case where
// HTML or RTF clipboards fold a base64 payload at 76 chars (standard MIME
// line wrapping). The regex must capture whitespace inside the payload so
// strings.Fields can strip it before base64 decoding; otherwise the match is
// truncated at the first newline and the decoded prefix happens to pass
// hasKnownImageMagic (since PNG magic is just 8 bytes), silently uploading a
// corrupt payload.
func TestReBase64DataURI_LineWrapped(t *testing.T) {
	// Build a deterministic payload larger than one wrap line so we force a
	// fold. The exact bytes don't matter; the full round-trip does.
	payload := make([]byte, 180)
	for i := range payload {
		payload[i] = byte(i * 7)
	}
	b64 := base64.StdEncoding.EncodeToString(payload)

	// Insert realistic folding: a mix of \n, \r\n, and \t within a single
	// payload, to catch regressions regardless of the clipboard source
	// (HTML tends to use \n; RTF \par wraps use \r\n; some editors indent).
	if len(b64) < 120 {
		t.Fatalf("test payload too small for folding: len=%d", len(b64))
	}
	wrapped := b64[:40] + "\n " + b64[40:80] + "\r\n\t" + b64[80:]
	html := `<img src="data:image/png;base64,` + wrapped + `">`

	m := reBase64DataURI.FindSubmatch([]byte(html))
	if m == nil {
		t.Fatal("expected regex to match line-wrapped base64 payload")
	}
	if string(m[1]) != "image/png" {
		t.Errorf("mime type = %q, want %q", m[1], "image/png")
	}

	// The whole point of extending the character class: the downstream
	// Fields strip must see the folding and normalise it away.
	normalized := strings.Join(strings.Fields(string(m[2])), "")
	if normalized != b64 {
		t.Fatalf("normalized payload mismatch\n got: %q\nwant: %q", normalized, b64)
	}
	got, err := base64.StdEncoding.DecodeString(normalized)
	if err != nil {
		t.Fatalf("decode after normalisation failed: %v", err)
	}
	if !bytes.Equal(got, payload) {
		t.Error("decoded bytes differ from original payload (truncation regression)")
	}

	// The match must still stop at the URI boundary; extending the class
	// with \s should not let the capture run off the end of the attribute.
	if strings.Contains(string(m[0]), `">`) {
		t.Errorf("regex captured past the URI terminator: %q", m[0])
	}
}

func TestExtractBase64ImageFromClipboard_WithFakeOsascript(t *testing.T) {
	if runtime.GOOS != "darwin" {
		t.Skip("fake osascript test only runs on macOS")
	}

	// Use only the 8-byte PNG signature as the image payload; that is enough
	// to embed as base64 in the fake HTML output.
	pngBytes := []byte{
		0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, // PNG signature
	}
	b64 := base64.StdEncoding.EncodeToString(pngBytes)
	htmlContent := `<img src="data:image/png;base64,` + b64 + `">`

	// Encode htmlContent as a «data HTML<hex>» literal the way osascript would.
	hexStr := ""
	for _, c := range []byte(htmlContent) {
		hexStr += string([]byte{hexNibble(c >> 4), hexNibble(c & 0xf)})
	}
	fakeOutput := "\xc2\xab" + "data HTML" + hexStr + "\xc2\xbb"

	// Write a fake osascript that prints fakeOutput and exits 0.
	// Use a pre-written output file to avoid shell-escaping issues with binary data.
	tmpDir := t.TempDir()
	outputFile := tmpDir + "/output.txt"
	if err := os.WriteFile(outputFile, []byte(fakeOutput), 0600); err != nil {
		t.Fatalf("write output file: %v", err)
	}
	fakeScript := tmpDir + "/osascript"
	scriptBody := "#!/bin/sh\ncat " + outputFile + "\n"
	if err := os.WriteFile(fakeScript, []byte(scriptBody), 0755); err != nil {
		t.Fatalf("write fake osascript: %v", err)
	}

	// Prepend tmpDir to PATH so our fake osascript is found first.
	orig := os.Getenv("PATH")
	t.Cleanup(func() { os.Setenv("PATH", orig) })
	os.Setenv("PATH", tmpDir+string(os.PathListSeparator)+orig)

	got := extractBase64ImageFromClipboard()
	if got == nil {
		t.Fatal("expected image data, got nil")
	}
	if string(got) != string(pngBytes) {
		t.Errorf("decoded image = %v, want %v", got, pngBytes)
	}
}

func TestExtractBase64ImageFromClipboard_NoOsascript(t *testing.T) {
	orig := os.Getenv("PATH")
	t.Cleanup(func() { os.Setenv("PATH", orig) })
	os.Setenv("PATH", "")

	got := extractBase64ImageFromClipboard()
	if got != nil {
		t.Errorf("expected nil when osascript unavailable, got %v", got)
	}
}

// hexNibble converts a 4-bit value to its uppercase hex character.
func hexNibble(n byte) byte {
	if n < 10 {
		return '0' + n
	}
	return 'A' + n - 10
}