Skip to content

Feature/pull creates sync manifest#17

Merged
christse merged 9 commits intomainfrom
feature/pull-creates-sync-manifest
Apr 20, 2026
Merged

Feature/pull creates sync manifest#17
christse merged 9 commits intomainfrom
feature/pull-creates-sync-manifest

Conversation

@christse
Copy link
Copy Markdown
Contributor

@christse christse commented Apr 20, 2026

Summary

  • boxel pull is now sync-ready. Fresh download writes .boxel-sync.json automatically; you can boxel sync . immediately against a pulled directory. Fixes the "no manifest found" foot-gun.
  • Uploads route correctly by content type. Binary files (images, fonts) no longer corrupted, plain-text files (.md, .csv, .yaml) no longer rejected as "invalid source". Only .json and compilable source go through /_atomic.
  • push --batch flag. Dependency-ordered .gts then /_atomic-batched .json. Meaningful speedup for bulk pushes.

Release notes

See CHANGELOG.md — 1.0.1 for the full per-version entry.

What shipped (9 commits)

# Type What
b5eddb6 feat Pull creates .boxel-sync.json after download
88fbf05 docs Docs + bump to 1.0.1
76bbf44 chore ignore .gstack/
306b657 feat Content-type-aware upload routing (+ new content-type.ts)
fdcb6b4 feat push --batch flag
45d0e62 chore .gitignore drift guards
e1c201d fix Address Copilot review (6 issues: manifest shape, partial-success, extension map, version sync, input validation, test expectation)
b70962c docs Document --batch, content-type routing, manifest shape
dbf0efc docs Add CHANGELOG.md

Test plan

  • npm run build — clean
  • npm test164/164 pass
  • boxel pull <url> /tmp/fresh/ && boxel sync /tmp/fresh/ — no manifest error
  • Push a workspace with .md + .jpg + .gts + .json — all land with correct Content-Type; binaries render after round-trip
  • boxel push <dir> <url> --batch --batch-size 5 — definitions upload first in dep order; instances batch 5-at-a-time
  • boxel --version — reports 1.0.1
  • boxel push --batch-size abc — fails with clear error (not NaN)

Notes for reviewers

  • src/lib/content-type.ts is load-bearing for commit 306b657. The --batch commit fdcb6b4 depends on it. Review 306b657 first if stepping through.
  • VERSION bumped to 1.0.1 in 88fbf05 (and boxel --version now actually prints it after e1c201d).
  • Drift-guard .gitignore stanza in 45d0e62 covers workspace dirs + misplaced platform docs that historically leaked into the repo; see commit message for the 21-vector coverage list.

🤖 Generated with Claude Code

Chris and others added 6 commits April 20, 2026 10:22
After `boxel pull` downloads files, it now creates a sync manifest
recording each file's MD5 hash and remote mtime. This prevents
`boxel sync` from treating all pulled files as new and re-uploading
the entire workspace on the first sync after a pull.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
README.md, .claude/CLAUDE.md, AGENTS.md now call out that
`boxel pull <url> ./local` writes the sync manifest automatically,
so `boxel sync .` works immediately against a freshly-pulled
workspace with no manual intermediate step.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
gstack writes session state, analytics, and project metadata into a
.gstack/ dir when its skills run. Keep it out of git.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Before: all uploads went through /_atomic with text/plain encoding,
corrupting binary files (images, fonts) and causing plain-text files
(.md, .txt, .csv, .yaml) to be rejected by the realm's module
compiler as "invalid source".

After: new content-type.ts module detects per-extension MIME types.
Binary files take the individual POST path with octet-stream encoding.
Plain-text files take the POST path with their true content type.
Only .json plus compilable source (.gts, .ts, .css, .html) go through
/_atomic.

Accept header now branches: compilable source asks for
application/vnd.card+source, everything else asks for */*.

Tests added: jpg uploads as binary, csv uploads as text.

Note: one pre-existing test failure on HEAD
('falls back to file type for invalid JSON') is unrelated and
predates this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
boxel push now supports --batch (and --batch-size <n>, default 10).
Definitions (.gts) upload individually in dependency order so the
realm indexer sees FieldDefs before the CardDefs that contain them.
Instances (.json) batch through /_atomic in N-at-a-time groups.

Meaningful speedup when pushing dozens of files to a fresh workspace,
and reduces the UI-flashing that happens when each card indexes on
its own POST.

Builds on the upload routing from the previous commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Patterns covering every category that historically leaked into this
repo while working in Boxel workspaces with CWD inside boxel-cli.
None of these belong here: they belong to realm-server, host,
workspaces, or are JQXL engine code that lives in another repo.

Fail-closed: if any of them reappear, they stay untracked instead
of being silently committed.

Also ignored:
- boxel-workspaces/, down*/, up/ (workspace dirs that end up at root)
- .claude/scheduled_tasks.lock (runtime)

Verified: 21 drift vectors caught, 13 legitimate files unaffected
(existing .claude/commands/* skills, docs/realm-repair.md, src/**,
test/**, README, package.json all still tracked normally).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@christse christse force-pushed the feature/pull-creates-sync-manifest branch from 5744822 to 45d0e62 Compare April 20, 2026 14:23
@christse christse requested a review from Copilot April 20, 2026 14:25
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends Boxel CLI sync behavior by writing a sync manifest after pull, and introduces richer upload handling (batching + per-file content-type) to support binary and non-source text files more safely.

Changes:

  • Add automatic .boxel-sync.json generation after successful boxel pull downloads.
  • Introduce content-type aware upload payload handling, including binary-safe single-file uploads and improved batching behavior.
  • Add --batch / --batch-size push flags plus various documentation + ignore-list updates.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
test/lib/batch-upload.test.ts Adds tests covering binary vs text behavior for uploadSingleFile.
src/lib/realm-sync-base.ts Switches uploads to use readFileForUpload() for content-type-aware payloads.
src/lib/content-type.ts New utility for inferring upload content-type and choosing string vs Buffer payloads.
src/lib/batch-upload.ts Adds file-type routing (atomic vs individual), dependency-aware sorting, and content-type-aware single-file uploads.
src/index.ts Adds CLI options for batch push mode (--batch, --batch-size).
src/commands/status.ts Hashes local files as bytes (Buffer) to support non-UTF8/binary files reliably.
src/commands/push.ts Adds batch push mode and wires batch options through.
src/commands/pull.ts Writes .boxel-sync.json after pull so sync can run immediately.
src/commands/check.ts Hashes local files as bytes (Buffer) to support non-UTF8/binary files reliably.
package.json Bumps package version to 1.0.1.
README.md Documents that pull writes the manifest automatically.
AGENTS.md Updates command semantics to note pull writes the manifest.
.gitignore Expands ignore rules to prevent workspace/drift artifacts from being committed.
.claude/CLAUDE.md Mirrors README note about pull writing the manifest.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/index.ts
Comment on lines +68 to +70
.option('--batch', 'Use atomic batch upload for faster bulk operations (10 files per batch)')
.option('--batch-size <n>', 'Files per batch when using --batch (default: 10)', parseInt)
.action(async (localDir: string, workspaceUrl: string, options: { delete?: boolean; dryRun?: boolean; force?: boolean; batch?: boolean; batchSize?: number }) => {
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--batch-size uses parseInt directly; if parsing fails, Commander will pass NaN, and later code treats NaN as a valid value (so ?? 10 won’t apply). Validate the parsed value (finite, >= 1) and fall back to the default or throw a user-friendly error.

Copilot uses AI. Check for mistakes.
Comment thread package.json
{
"name": "boxel-cli",
"version": "1.0.0",
"version": "1.0.1",
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package.json version was bumped to 1.0.1, but the CLI still reports 1.0.0 via program.version('1.0.0') in src/index.ts. Consider sourcing the version from package.json (or updating the hardcoded string) so boxel --version matches the published package version.

Copilot uses AI. Check for mistakes.
Comment thread src/commands/push.ts Outdated
Comment on lines +168 to +175
// Update manifest for successfully uploaded files
if (result.uploaded > 0) {
for (const file of instances) {
if (fs.existsSync(file.localPath)) {
newManifest.files[file.relativePath] = computeFileHash(file.localPath);
}
}
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In batch mode, the manifest is updated for every instances file whenever result.uploaded > 0, even if some of those files failed to upload. This can mark failed uploads as "in sync" and prevent them from being retried on the next run. Update the manifest only for files known to have succeeded (e.g., derive a success set from result.errors or skip manifest writes when result.failed > 0).

Copilot uses AI. Check for mistakes.
Comment thread src/commands/pull.ts
Comment on lines +119 to +140
// Create sync manifest so subsequent `boxel sync` knows files are in sync
if (!this.options.dryRun && downloadedFiles.length > 0) {
const remoteMtimes = await this.getRemoteMtimes();
const manifest: SyncManifest = {
workspaceUrl: this.options.workspaceUrl,
lastSyncTime: Date.now(),
files: {},
};

for (const relativePath of downloadedFiles) {
const localPath = path.join(this.options.localDir, relativePath);
if (fs.existsSync(localPath)) {
manifest.files[relativePath] = {
localHash: computeFileHash(localPath),
remoteMtime: remoteMtimes.get(relativePath) || Math.floor(Date.now() / 1000),
};
}
}

const manifestPath = path.join(this.options.localDir, '.boxel-sync.json');
fs.writeFileSync(manifestPath, JSON.stringify(manifest, null, 2));
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boxel pull now writes .boxel-sync.json in the new sync-manifest shape (files[relPath] = { localHash, remoteMtime }), but push still expects the old push-only manifest format (files[relPath] = string). This will cause push to treat every file as changed (and/or overwrite the manifest back to the old format). Align manifest reading/writing across commands (or add migration logic in push similar to sync.ts).

Copilot uses AI. Check for mistakes.
Comment thread src/lib/batch-upload.ts
Comment on lines 326 to 337
} catch {
// If parsing fails, fall back to source format
return {
op: file.operation,
href: `${realmUrl}${file.relativePath}`,
data: {
type: 'file',
type: 'source',
attributes: {
content: content,
},
},
};
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback path for invalid .json content now emits data.type: 'source', but existing tests (and the AtomicOperation type union including 'file') indicate the previous behavior expected 'file'. As-is, this will break the current unit test and may not match what the _atomic endpoint accepts for non-card JSON. Either update tests + types to reflect 'source', or restore/adjust the fallback type to the correct resource type for invalid JSON.

Copilot uses AI. Check for mistakes.
Comment thread src/lib/batch-upload.ts
Comment on lines +17 to +34
function isBinaryFile(file: FileToUpload): boolean {
return !isTextFile(getContentType(file.relativePath));
}

const ATOMIC_SOURCE_EXTENSIONS = new Set([
'.gts',
'.ts',
'.tsx',
'.js',
'.jsx',
'.mjs',
'.cjs',
'.css',
'.scss',
'.less',
'.sass',
'.html',
]);
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ATOMIC_SOURCE_EXTENSIONS includes extensions like .tsx, .jsx, .cjs, .scss, .less, .sass, but getContentType() doesn’t recognize most of these and will return application/octet-stream. That makes isBinaryFile() treat them as binary and routes them through individual uploads instead of _atomic, contradicting the intent of ATOMIC_SOURCE_EXTENSIONS. Add these extensions to EXTENSION_MAP (and treat them as text in isTextFile) or change isBinaryFile() to classify atomic-source extensions as text.

Copilot uses AI. Check for mistakes.
Comment thread src/lib/batch-upload.ts
Comment on lines +192 to +199
const inDegree = new Map<string, number>();
for (const name of byName.keys()) inDegree.set(name, 0);
for (const [, depSet] of deps) {
for (const dep of depSet) {
inDegree.set(dep, (inDegree.get(dep) ?? 0) + 1);
}
}

Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inDegree map computed here is never used (the algorithm uses inDeg later). This dead code is confusing and makes it harder to validate the topological sort logic. Remove the unused inDegree block or use it consistently.

Suggested change
const inDegree = new Map<string, number>();
for (const name of byName.keys()) inDegree.set(name, 0);
for (const [, depSet] of deps) {
for (const dep of depSet) {
inDegree.set(dep, (inDegree.get(dep) ?? 0) + 1);
}
}

Copilot uses AI. Check for mistakes.
Chris and others added 3 commits April 20, 2026 15:23
Six fixes in one commit, addressing every Copilot comment except #7
(dead code, deferred to a later cleanup pass).

#4 — manifest shape mismatch (critical):
  push.ts was still reading/writing the old manifest format
  (files[path] = hashString) while pull.ts and sync.ts use the new
  {localHash, remoteMtime} shape. push now uses the new shape,
  migrates old manifests on read (mirrors sync.ts detector), and
  refreshes remoteMtime via getRemoteMtimes() after a successful
  upload so the next pull/sync sees a consistent picture.

#3 — partial-success manifest (important):
  In --batch mode we were marking every instance as synced when
  result.uploaded > 0, even if some failed. Now we derive failed
  paths from result.errors and only add successes to the manifest.
  Failed uploads stay out and get retried on the next run.

#6 — ATOMIC_SOURCE_EXTENSIONS gap (important):
  EXTENSION_MAP was missing .tsx/.jsx/.cjs/.scss/.less/.sass, so
  those extensions got application/octet-stream and were routed
  through individual POST instead of /_atomic despite being listed
  as atomic-source compatible. Now mapped to
  application/typescript/javascript and text/x-* respectively, so
  isTextFile() returns true and they take the atomic path.

#2 — CLI --version out of sync:
  program.version() was hardcoded to '1.0.0' while package.json is
  at 1.0.1. Now reads from package.json via createRequire so
  boxel --version stays accurate on every release.

#1 — --batch-size NaN guard:
  parseInt on bad input (e.g. "abc") returned NaN, which bypasses
  `?? 10` and flows into the uploader as NaN. Now uses a
  parsePositiveInt parser that throws InvalidArgumentError with
  a friendly message on non-positive-integer input.

#5 — invalid-JSON test expectation:
  The test expected data.type='file' but the code (correctly) emits
  'source' because /_atomic only accepts 'card' and 'source'
  resource types. Test updated to match the correct contract.

All 164 tests pass (was 163/164 before this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- README.md + .claude/CLAUDE.md: add push --batch / --batch-size to
  the command examples plus a callout on when the --batch path
  applies (source only; binary and plain text stay on per-file POST).
- AGENTS.md: add a contributor-facing Content-Type Routing table
  (file-class → path → headers) so anyone touching batch-upload.ts
  knows which extensions belong where. Add a Manifest Shape section
  documenting the new {localHash, remoteMtime} format and the
  migration from the pre-1.0.1 bare-string form.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Start a proper CHANGELOG. Entries for both 1.0.1 (this branch) and
1.0.0 (backfilled from the commit history of the initial public
release). Future releases land here first.

Also update CONTRIBUTING section of README to ask for a CHANGELOG
bullet on every PR, and add a "Release notes" link so users can
find it.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@christse christse merged commit b4d2999 into main Apr 20, 2026
christse pushed a commit that referenced this pull request Apr 20, 2026
Six fixes in one commit, addressing every Copilot comment except #7
(dead code, deferred to a later cleanup pass).

#4 — manifest shape mismatch (critical):
  push.ts was still reading/writing the old manifest format
  (files[path] = hashString) while pull.ts and sync.ts use the new
  {localHash, remoteMtime} shape. push now uses the new shape,
  migrates old manifests on read (mirrors sync.ts detector), and
  refreshes remoteMtime via getRemoteMtimes() after a successful
  upload so the next pull/sync sees a consistent picture.

#3 — partial-success manifest (important):
  In --batch mode we were marking every instance as synced when
  result.uploaded > 0, even if some failed. Now we derive failed
  paths from result.errors and only add successes to the manifest.
  Failed uploads stay out and get retried on the next run.

#6 — ATOMIC_SOURCE_EXTENSIONS gap (important):
  EXTENSION_MAP was missing .tsx/.jsx/.cjs/.scss/.less/.sass, so
  those extensions got application/octet-stream and were routed
  through individual POST instead of /_atomic despite being listed
  as atomic-source compatible. Now mapped to
  application/typescript/javascript and text/x-* respectively, so
  isTextFile() returns true and they take the atomic path.

#2 — CLI --version out of sync:
  program.version() was hardcoded to '1.0.0' while package.json is
  at 1.0.1. Now reads from package.json via createRequire so
  boxel --version stays accurate on every release.

#1 — --batch-size NaN guard:
  parseInt on bad input (e.g. "abc") returned NaN, which bypasses
  `?? 10` and flows into the uploader as NaN. Now uses a
  parsePositiveInt parser that throws InvalidArgumentError with
  a friendly message on non-positive-integer input.

#5 — invalid-JSON test expectation:
  The test expected data.type='file' but the code (correctly) emits
  'source' because /_atomic only accepts 'card' and 'source'
  resource types. Test updated to match the correct contract.

All 164 tests pass (was 163/164 before this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@christse christse deleted the feature/pull-creates-sync-manifest branch April 20, 2026 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants