Conversation
After `boxel pull` downloads files, it now creates a sync manifest recording each file's MD5 hash and remote mtime. This prevents `boxel sync` from treating all pulled files as new and re-uploading the entire workspace on the first sync after a pull. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
README.md, .claude/CLAUDE.md, AGENTS.md now call out that `boxel pull <url> ./local` writes the sync manifest automatically, so `boxel sync .` works immediately against a freshly-pulled workspace with no manual intermediate step. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
gstack writes session state, analytics, and project metadata into a .gstack/ dir when its skills run. Keep it out of git. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Before: all uploads went through /_atomic with text/plain encoding,
corrupting binary files (images, fonts) and causing plain-text files
(.md, .txt, .csv, .yaml) to be rejected by the realm's module
compiler as "invalid source".
After: new content-type.ts module detects per-extension MIME types.
Binary files take the individual POST path with octet-stream encoding.
Plain-text files take the POST path with their true content type.
Only .json plus compilable source (.gts, .ts, .css, .html) go through
/_atomic.
Accept header now branches: compilable source asks for
application/vnd.card+source, everything else asks for */*.
Tests added: jpg uploads as binary, csv uploads as text.
Note: one pre-existing test failure on HEAD
('falls back to file type for invalid JSON') is unrelated and
predates this change.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
boxel push now supports --batch (and --batch-size <n>, default 10). Definitions (.gts) upload individually in dependency order so the realm indexer sees FieldDefs before the CardDefs that contain them. Instances (.json) batch through /_atomic in N-at-a-time groups. Meaningful speedup when pushing dozens of files to a fresh workspace, and reduces the UI-flashing that happens when each card indexes on its own POST. Builds on the upload routing from the previous commit. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Patterns covering every category that historically leaked into this repo while working in Boxel workspaces with CWD inside boxel-cli. None of these belong here: they belong to realm-server, host, workspaces, or are JQXL engine code that lives in another repo. Fail-closed: if any of them reappear, they stay untracked instead of being silently committed. Also ignored: - boxel-workspaces/, down*/, up/ (workspace dirs that end up at root) - .claude/scheduled_tasks.lock (runtime) Verified: 21 drift vectors caught, 13 legitimate files unaffected (existing .claude/commands/* skills, docs/realm-repair.md, src/**, test/**, README, package.json all still tracked normally). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
5744822 to
45d0e62
Compare
There was a problem hiding this comment.
Pull request overview
This PR extends Boxel CLI sync behavior by writing a sync manifest after pull, and introduces richer upload handling (batching + per-file content-type) to support binary and non-source text files more safely.
Changes:
- Add automatic
.boxel-sync.jsongeneration after successfulboxel pulldownloads. - Introduce content-type aware upload payload handling, including binary-safe single-file uploads and improved batching behavior.
- Add
--batch/--batch-sizepush flags plus various documentation + ignore-list updates.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| test/lib/batch-upload.test.ts | Adds tests covering binary vs text behavior for uploadSingleFile. |
| src/lib/realm-sync-base.ts | Switches uploads to use readFileForUpload() for content-type-aware payloads. |
| src/lib/content-type.ts | New utility for inferring upload content-type and choosing string vs Buffer payloads. |
| src/lib/batch-upload.ts | Adds file-type routing (atomic vs individual), dependency-aware sorting, and content-type-aware single-file uploads. |
| src/index.ts | Adds CLI options for batch push mode (--batch, --batch-size). |
| src/commands/status.ts | Hashes local files as bytes (Buffer) to support non-UTF8/binary files reliably. |
| src/commands/push.ts | Adds batch push mode and wires batch options through. |
| src/commands/pull.ts | Writes .boxel-sync.json after pull so sync can run immediately. |
| src/commands/check.ts | Hashes local files as bytes (Buffer) to support non-UTF8/binary files reliably. |
| package.json | Bumps package version to 1.0.1. |
| README.md | Documents that pull writes the manifest automatically. |
| AGENTS.md | Updates command semantics to note pull writes the manifest. |
| .gitignore | Expands ignore rules to prevent workspace/drift artifacts from being committed. |
| .claude/CLAUDE.md | Mirrors README note about pull writing the manifest. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .option('--batch', 'Use atomic batch upload for faster bulk operations (10 files per batch)') | ||
| .option('--batch-size <n>', 'Files per batch when using --batch (default: 10)', parseInt) | ||
| .action(async (localDir: string, workspaceUrl: string, options: { delete?: boolean; dryRun?: boolean; force?: boolean; batch?: boolean; batchSize?: number }) => { |
There was a problem hiding this comment.
--batch-size uses parseInt directly; if parsing fails, Commander will pass NaN, and later code treats NaN as a valid value (so ?? 10 won’t apply). Validate the parsed value (finite, >= 1) and fall back to the default or throw a user-friendly error.
| { | ||
| "name": "boxel-cli", | ||
| "version": "1.0.0", | ||
| "version": "1.0.1", |
There was a problem hiding this comment.
package.json version was bumped to 1.0.1, but the CLI still reports 1.0.0 via program.version('1.0.0') in src/index.ts. Consider sourcing the version from package.json (or updating the hardcoded string) so boxel --version matches the published package version.
| // Update manifest for successfully uploaded files | ||
| if (result.uploaded > 0) { | ||
| for (const file of instances) { | ||
| if (fs.existsSync(file.localPath)) { | ||
| newManifest.files[file.relativePath] = computeFileHash(file.localPath); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
In batch mode, the manifest is updated for every instances file whenever result.uploaded > 0, even if some of those files failed to upload. This can mark failed uploads as "in sync" and prevent them from being retried on the next run. Update the manifest only for files known to have succeeded (e.g., derive a success set from result.errors or skip manifest writes when result.failed > 0).
| // Create sync manifest so subsequent `boxel sync` knows files are in sync | ||
| if (!this.options.dryRun && downloadedFiles.length > 0) { | ||
| const remoteMtimes = await this.getRemoteMtimes(); | ||
| const manifest: SyncManifest = { | ||
| workspaceUrl: this.options.workspaceUrl, | ||
| lastSyncTime: Date.now(), | ||
| files: {}, | ||
| }; | ||
|
|
||
| for (const relativePath of downloadedFiles) { | ||
| const localPath = path.join(this.options.localDir, relativePath); | ||
| if (fs.existsSync(localPath)) { | ||
| manifest.files[relativePath] = { | ||
| localHash: computeFileHash(localPath), | ||
| remoteMtime: remoteMtimes.get(relativePath) || Math.floor(Date.now() / 1000), | ||
| }; | ||
| } | ||
| } | ||
|
|
||
| const manifestPath = path.join(this.options.localDir, '.boxel-sync.json'); | ||
| fs.writeFileSync(manifestPath, JSON.stringify(manifest, null, 2)); | ||
| } |
There was a problem hiding this comment.
boxel pull now writes .boxel-sync.json in the new sync-manifest shape (files[relPath] = { localHash, remoteMtime }), but push still expects the old push-only manifest format (files[relPath] = string). This will cause push to treat every file as changed (and/or overwrite the manifest back to the old format). Align manifest reading/writing across commands (or add migration logic in push similar to sync.ts).
| } catch { | ||
| // If parsing fails, fall back to source format | ||
| return { | ||
| op: file.operation, | ||
| href: `${realmUrl}${file.relativePath}`, | ||
| data: { | ||
| type: 'file', | ||
| type: 'source', | ||
| attributes: { | ||
| content: content, | ||
| }, | ||
| }, | ||
| }; |
There was a problem hiding this comment.
The fallback path for invalid .json content now emits data.type: 'source', but existing tests (and the AtomicOperation type union including 'file') indicate the previous behavior expected 'file'. As-is, this will break the current unit test and may not match what the _atomic endpoint accepts for non-card JSON. Either update tests + types to reflect 'source', or restore/adjust the fallback type to the correct resource type for invalid JSON.
| function isBinaryFile(file: FileToUpload): boolean { | ||
| return !isTextFile(getContentType(file.relativePath)); | ||
| } | ||
|
|
||
| const ATOMIC_SOURCE_EXTENSIONS = new Set([ | ||
| '.gts', | ||
| '.ts', | ||
| '.tsx', | ||
| '.js', | ||
| '.jsx', | ||
| '.mjs', | ||
| '.cjs', | ||
| '.css', | ||
| '.scss', | ||
| '.less', | ||
| '.sass', | ||
| '.html', | ||
| ]); |
There was a problem hiding this comment.
ATOMIC_SOURCE_EXTENSIONS includes extensions like .tsx, .jsx, .cjs, .scss, .less, .sass, but getContentType() doesn’t recognize most of these and will return application/octet-stream. That makes isBinaryFile() treat them as binary and routes them through individual uploads instead of _atomic, contradicting the intent of ATOMIC_SOURCE_EXTENSIONS. Add these extensions to EXTENSION_MAP (and treat them as text in isTextFile) or change isBinaryFile() to classify atomic-source extensions as text.
| const inDegree = new Map<string, number>(); | ||
| for (const name of byName.keys()) inDegree.set(name, 0); | ||
| for (const [, depSet] of deps) { | ||
| for (const dep of depSet) { | ||
| inDegree.set(dep, (inDegree.get(dep) ?? 0) + 1); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
The inDegree map computed here is never used (the algorithm uses inDeg later). This dead code is confusing and makes it harder to validate the topological sort logic. Remove the unused inDegree block or use it consistently.
| const inDegree = new Map<string, number>(); | |
| for (const name of byName.keys()) inDegree.set(name, 0); | |
| for (const [, depSet] of deps) { | |
| for (const dep of depSet) { | |
| inDegree.set(dep, (inDegree.get(dep) ?? 0) + 1); | |
| } | |
| } |
Six fixes in one commit, addressing every Copilot comment except #7 (dead code, deferred to a later cleanup pass). #4 — manifest shape mismatch (critical): push.ts was still reading/writing the old manifest format (files[path] = hashString) while pull.ts and sync.ts use the new {localHash, remoteMtime} shape. push now uses the new shape, migrates old manifests on read (mirrors sync.ts detector), and refreshes remoteMtime via getRemoteMtimes() after a successful upload so the next pull/sync sees a consistent picture. #3 — partial-success manifest (important): In --batch mode we were marking every instance as synced when result.uploaded > 0, even if some failed. Now we derive failed paths from result.errors and only add successes to the manifest. Failed uploads stay out and get retried on the next run. #6 — ATOMIC_SOURCE_EXTENSIONS gap (important): EXTENSION_MAP was missing .tsx/.jsx/.cjs/.scss/.less/.sass, so those extensions got application/octet-stream and were routed through individual POST instead of /_atomic despite being listed as atomic-source compatible. Now mapped to application/typescript/javascript and text/x-* respectively, so isTextFile() returns true and they take the atomic path. #2 — CLI --version out of sync: program.version() was hardcoded to '1.0.0' while package.json is at 1.0.1. Now reads from package.json via createRequire so boxel --version stays accurate on every release. #1 — --batch-size NaN guard: parseInt on bad input (e.g. "abc") returned NaN, which bypasses `?? 10` and flows into the uploader as NaN. Now uses a parsePositiveInt parser that throws InvalidArgumentError with a friendly message on non-positive-integer input. #5 — invalid-JSON test expectation: The test expected data.type='file' but the code (correctly) emits 'source' because /_atomic only accepts 'card' and 'source' resource types. Test updated to match the correct contract. All 164 tests pass (was 163/164 before this commit). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- README.md + .claude/CLAUDE.md: add push --batch / --batch-size to
the command examples plus a callout on when the --batch path
applies (source only; binary and plain text stay on per-file POST).
- AGENTS.md: add a contributor-facing Content-Type Routing table
(file-class → path → headers) so anyone touching batch-upload.ts
knows which extensions belong where. Add a Manifest Shape section
documenting the new {localHash, remoteMtime} format and the
migration from the pre-1.0.1 bare-string form.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Start a proper CHANGELOG. Entries for both 1.0.1 (this branch) and 1.0.0 (backfilled from the commit history of the initial public release). Future releases land here first. Also update CONTRIBUTING section of README to ask for a CHANGELOG bullet on every PR, and add a "Release notes" link so users can find it. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Six fixes in one commit, addressing every Copilot comment except #7 (dead code, deferred to a later cleanup pass). #4 — manifest shape mismatch (critical): push.ts was still reading/writing the old manifest format (files[path] = hashString) while pull.ts and sync.ts use the new {localHash, remoteMtime} shape. push now uses the new shape, migrates old manifests on read (mirrors sync.ts detector), and refreshes remoteMtime via getRemoteMtimes() after a successful upload so the next pull/sync sees a consistent picture. #3 — partial-success manifest (important): In --batch mode we were marking every instance as synced when result.uploaded > 0, even if some failed. Now we derive failed paths from result.errors and only add successes to the manifest. Failed uploads stay out and get retried on the next run. #6 — ATOMIC_SOURCE_EXTENSIONS gap (important): EXTENSION_MAP was missing .tsx/.jsx/.cjs/.scss/.less/.sass, so those extensions got application/octet-stream and were routed through individual POST instead of /_atomic despite being listed as atomic-source compatible. Now mapped to application/typescript/javascript and text/x-* respectively, so isTextFile() returns true and they take the atomic path. #2 — CLI --version out of sync: program.version() was hardcoded to '1.0.0' while package.json is at 1.0.1. Now reads from package.json via createRequire so boxel --version stays accurate on every release. #1 — --batch-size NaN guard: parseInt on bad input (e.g. "abc") returned NaN, which bypasses `?? 10` and flows into the uploader as NaN. Now uses a parsePositiveInt parser that throws InvalidArgumentError with a friendly message on non-positive-integer input. #5 — invalid-JSON test expectation: The test expected data.type='file' but the code (correctly) emits 'source' because /_atomic only accepts 'card' and 'source' resource types. Test updated to match the correct contract. All 164 tests pass (was 163/164 before this commit). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Summary
boxel pullis now sync-ready. Fresh download writes.boxel-sync.jsonautomatically; you canboxel sync .immediately against a pulled directory. Fixes the "no manifest found" foot-gun..md,.csv,.yaml) no longer rejected as "invalid source". Only.jsonand compilable source go through/_atomic.push --batchflag. Dependency-ordered.gtsthen/_atomic-batched.json. Meaningful speedup for bulk pushes.Release notes
See CHANGELOG.md — 1.0.1 for the full per-version entry.
What shipped (9 commits)
b5eddb6.boxel-sync.jsonafter download88fbf0576bbf44.gstack/306b657content-type.ts)fdcb6b4push --batchflag45d0e62.gitignoredrift guardse1c201db70962c--batch, content-type routing, manifest shapedbf0efcTest plan
npm run build— cleannpm test— 164/164 pass ✓boxel pull <url> /tmp/fresh/ && boxel sync /tmp/fresh/— no manifest error.md+.jpg+.gts+.json— all land with correctContent-Type; binaries render after round-tripboxel push <dir> <url> --batch --batch-size 5— definitions upload first in dep order; instances batch 5-at-a-timeboxel --version— reports1.0.1boxel push --batch-size abc— fails with clear error (not NaN)Notes for reviewers
src/lib/content-type.tsis load-bearing for commit306b657. The--batchcommitfdcb6b4depends on it. Review306b657first if stepping through.88fbf05(andboxel --versionnow actually prints it aftere1c201d)..gitignorestanza in45d0e62covers workspace dirs + misplaced platform docs that historically leaked into the repo; see commit message for the 21-vector coverage list.🤖 Generated with Claude Code