Mag L1C - Fix mixed norm+burst day-boundary gap collapse (#2925) by sapols · Pull Request #5 · sapols/imap_processing

sapols · 2026-04-15T21:47:27Z

Summary

Addresses part of issue IMAP-Science-Operations-Center#2925: Alastair's December 4-7 2025 MAG L1C plots show large multi-hour gaps between stacked daily products even though the parent burst L1B files contain data in those intervals.

Root cause: on mixed norm+burst days, mag_l1c() was short-circuiting day_to_process to None before calling process_mag_l1c(). That prevented find_all_gaps() from emitting leading/trailing day-window gaps, so burst coverage preceding or following the NM file was silently dropped.

The fix passes day_to_process through unconditionally when burst data is present. All the supporting gap machinery (rate-aware generate_missing_timestamps, _find_rate_segments, usable_burst_end_epoch) is already in place from PR IMAP-Science-Operations-Center#2899, so this change is a one-line logic fix plus regression tests.

Base branch

This PR is stacked on IMAP-Science-Operations-Center#2899 (mag-l1c-fix) because that PR is not merged yet and this fix depends on the gap-machinery improvements it introduces. Rebase onto dev once IMAP-Science-Operations-Center#2899 lands.

Scope

mag_l1c.py: remove the day_to_process_arg = ... if normal_mode_dataset is None else None conditional; pass day_to_process through unconditionally. Prune TODO comments that are now handled.
test_mag_l1c.py: three new regression tests
- test_process_mag_l1c_leading_burst_only_coverage
- test_process_mag_l1c_trailing_burst_only_coverage
- test_mag_l1c_mixed_input_uses_day_to_process (public API)
test_mag_l1c.py: existing test_mag_l1c / test_mag_attributes now pass np.datetime64("2000-01-01") instead of "2025-01-01" so the (now live) ±30 min day window stays near TTJ2000 zero — the tiny synthetic epochs in the existing fixtures were fine against the old no-op path but would have generated ~1.6B ticks against a 2025 day boundary.
.gitignore: add .claude/.

Scope NOT included

Cross-day L1B fetching via imap_data_access.query() for true T017/T018/T019 inherited-timeline cases is not in this PR. Tracing the December screenshot shows the data needed to close the plotted gaps is already inside the existing per-day burst L1B files (the ±30 min day window around day_to_process is enough to cover Dec 5's 23:30 → 14:53 leading burst span). The SDC Validation Doc (section 4.3.3) still has no test data for T017-T019, so those cases are deferred.

Verification

pytest imap_processing/tests/mag/test_mag_l1c.py — 24 passed (including 3 new regression tests).
pytest imap_processing/tests/mag/test_mag_validation.py -k "013 or 014 or 015 or 016 or 024" — 10 passed (no regression on PR Fix MAG L1C gap-fill bugs for validation tests T015 and T016 IMAP-Science-Operations-Center/imap_processing#2899's T015/T016 rate-transition fixes).
Full MAG suite (pytest imap_processing/tests/mag) — 114 passed.
mag-l1c-refactor-assessment skill — T015 mago/magi PASS, T016 mago/magi PASS, T024 mago/magi PASS.

Real-day CDF regeneration (Dec 4-7 2025)

Regenerated via python -m imap_processing.cli from this branch with the same parent L1B files used for the archived products. Results:

Day	Archived rows	Regen rows	Archived first / last	Regen first / last
Dec 5	69,184	179,941	14:53:03 → 00:29:34	23:30:05 → 00:29:34
Dec 6	104,256	180,023	23:29:51 → 13:58:38	23:29:51 → 00:30:02 Dec 7
Dec 4, 7	burst-only	unchanged	—	—

Combined Dec 4-7 gap check:

Archived: 2 gaps > 5s (51,783s on Dec 5 and 34,287s on Dec 6).
Regenerated v922: 0 gaps > 5s.

Test plan

Unit tests pass on this branch
PR Fix MAG L1C gap-fill bugs for validation tests T015 and T016 IMAP-Science-Operations-Center/imap_processing#2899's validation tests (T013/14/15/16/24) still pass
Refactor-assessment skill PASS for T015/T016/T024
Real-day regeneration for Dec 4-7 closes the visible gaps
Rebase onto dev once Fix MAG L1C gap-fill bugs for validation tests T015 and T016 IMAP-Science-Operations-Center/imap_processing#2899 merges
Re-target PR to origin/dev once ready for broader review

…e-Operations-Center#2925) On days with both norm-mode and burst-mode L1B inputs, mag_l1c() was short-circuiting day_to_process to None before calling process_mag_l1c(), so find_all_gaps() never emitted leading/trailing day-window gaps and burst coverage preceding or following the NM file was silently dropped. Pass day_to_process through unconditionally, and add regression tests for the leading burst-only, trailing burst-only, and public mag_l1c mixed-input paths. Existing mag_l1c tests now anchor their synthetic epochs to 2000-01-01 so the (now live) ±30 min day window stays close to TTJ2000 zero and bounded in magnitude. Also adds .claude/ to .gitignore.

sapols · 2026-04-15T21:59:58Z

 .vscode/

+# Claude
+.claude/


Not strictly needed but I’d like for this repo to start ignoring .claude/

sapols · 2026-04-15T23:07:29Z

Before vs. after: Dec 4–7 2025 L1C gaps

Before: Archived L1C (what @alastairtree flagged in IMAP-Science-Operations-Center#1993):

Two clear gaps show up as the thin straight-line segments matplotlib draws across non-contiguous samples:

Dec 5, 00:30 → 14:53 (~14.5 h of burst-only coverage dropped before the NM file started)
Dec 6, 13:58 → 23:30 (~9.5 h of burst-only coverage dropped after the NM file ended)

After: Regenerated L1C from this branch (v922):

Dense, continuous Bx/By/Bz across the full Dec 4 → Dec 7 span — both previously-broken regions are now filled with real burst data, no thin-line segments anywhere.

Numeric confirmation (Dec 4–7 combined, gaps > 5 s):

Archived: 2 gaps (51,782.78 s and 34,287.03 s — i.e. the two visible holes above)
Regenerated v922: 0 gaps

Per-file row counts for the two worst days:

Dec 5 — archived: 69,184 rows (14:53:03 → 00:29:34); regen: 179,941 rows (23:30:05 → 00:29:34)
Dec 6 — archived: 104,256 rows (23:29:51 → 13:58:38); regen: 180,023 rows (23:29:51 → 00:30:02 Dec 7)
The leading/trailing burst-only coverage that used to be silently dropped on mixed norm+burst days is now being picked up by find_all_gaps() and filled via interpolate_gaps() exactly as the algorithm doc (§7.3.4) describes.

alastairtree

It is a small fix but was very deliberate so be interested to know why it was there like this in the first place.
Does this enable any of the MAG validation tests to now pass? Can they be added in as passing test cases now if so? I assume they are currently commented out.
Have added some small queries in the comments.

Thanks @sapols we appreciate your effort.

alastairtree · 2026-04-27T17:04:02Z

+    # 2000-01-01 anchors the ±30 min day window near TTJ2000's zero point so the
+    # leading/trailing day-boundary gap fill stays bounded given the fixtures use
+    # small synthetic epoch values.
+    l1c = mag_l1c(burst_dataset, np.datetime64("2000-01-01"), norm_dataset)


Don't really understand why using year 2000 is necessary, and probably the process should error if the data and the date are wildly different so not a great test case. Better to fix whatever is causing it to fail with the given data?

Since the J2000 epoch is 2000-01-01, this is the expected day for data that is small numbers, which is what is in burst_dataset (generated fixture)

Yeah fair to call that out. It was compensating for the fact that the old unit-test fixtures (norm_dataset / burst_dataset) used tiny synthetic TTJ2000-relative timestamps while the test passed a 2025 day_to_process. Once this PR made the mixed NM+BM path actually honor day_to_process, that mismatch caused the code to try to build an enormous day-boundary gap. So the date was just masking that old test setup.

I updated the tests: they now build synthetic NM/BM epochs aligned with the day_to_process they pass in, and the new regression tests use 2025-01-01 as well.

I agree in spirit a separate production guard for wildly mismatched input dates could be useful, but we'd have to define the allowed +/-30 minute and cross-day cases carefully. I think that's out of scope for this ticket but could be a follow-up hardening step.

alastairtree · 2026-04-27T17:06:33Z

    interp_function = InterpolationFunction[configuration.L1C_INTERPOLATION_METHOD]
    if burst_mode_dataset is not None:
-        # Only use day_to_process if there is no norm data
-        day_to_process_arg = day_to_process if normal_mode_dataset is None else None


This was obviously very deliberate and now removed - any idea why this was added like this originally?

Does this still work on a day where you only have BM data and not NM?

This was obviously very deliberate and now removed - any idea why this was added like this originally?

You know, I was curious about that too. I'm waiting for @maxinelasp to chime in here about this.

I traced this back to IMAP-Science-Operations-Center#2274 / f12e361, which added support for producing MAG L1C when there is BM data but no NM data. Before that change, that path was effectively unsupported.

The commit history suggests this conditional was added late in that work to keep the new day_to_process behavior limited to the no-NM case, probably as a conservative way to avoid changing existing mixed NM+BM behavior?

I couldn't find anything in the algorithm doc saying mixed NM+BM days should ignore the processing day window. In fact, the algorithm doc says gaps can occur at the beginning, middle, end, or entirety of the processing window, and that the resulting timeline should be regular across day boundaries and BM/NM gaps. So my guess is that the old conditional was probably a conservative implementation choice from the no-NM fix, not a spec requirement?

Does this still work on a day where you only have BM data and not NM?

It should still work, yes, because the BM-only path should be unchanged: before this PR, when normal_mode_dataset is None, day_to_process_arg was set to day_to_process; now we just pass day_to_process directly, so BM-only receives the same value as before. The no-NM branch inside process_mag_l1c() is also still there and still treats the whole day window as the gap to fill from BM. There is existing coverage via test_missing_norm_file, and the stronger T024 validation case runs for both mago and magi on this branch. But since you asked, I'll try to whip up a test that proves it.

Added test_mag_l1c_burst_only_generates_l1c_output(): to prove the BM-only case.

Ha, I got to the bottom of that PR and saw that you had linked it here.

I don't remember why we included this. If it's valid to use day_to_process in burst mode only processing, then it should be valid to use it for mixed NM+BM, like Shawn said.

It seems likely to me that this either a) an attempt to keep the new BM functionality from interfering with the mostly-working BM+NM, or b) a fix against failing validation data, which has since been fixed in Shawn's other L1C branch.

My main concern with this change is the same as the main concern with the other PR - that this will result in data extrapolation. But, if we are confident that the other code won't extrapolate, then this won't make it start. So I think this change is good, but definitely dependent on the other PR.

Add explicit burst-only L1C coverage for the no-normal-mode path and update existing mixed-input MAG L1C tests to build epochs aligned with their requested processing day instead of relying on a year-2000 workaround.

maxinelasp · 2026-04-28T20:37:50Z

+    # 2000-01-01 anchors the ±30 min day window near TTJ2000's zero point so the
+    # leading/trailing day-boundary gap fill stays bounded given the fixtures use
+    # small synthetic epoch values.
+    l1c = mag_l1c(burst_dataset, np.datetime64("2000-01-01"), norm_dataset)


Since the J2000 epoch is 2000-01-01, this is the expected day for data that is small numbers, which is what is in burst_dataset (generated fixture)

maxinelasp · 2026-04-28T22:08:18Z

    interp_function = InterpolationFunction[configuration.L1C_INTERPOLATION_METHOD]
    if burst_mode_dataset is not None:
-        # Only use day_to_process if there is no norm data
-        day_to_process_arg = day_to_process if normal_mode_dataset is None else None


Ha, I got to the bottom of that PR and saw that you had linked it here.

I don't remember why we included this. If it's valid to use day_to_process in burst mode only processing, then it should be valid to use it for mixed NM+BM, like Shawn said.

It seems likely to me that this either a) an attempt to keep the new BM functionality from interfering with the mostly-working BM+NM, or b) a fix against failing validation data, which has since been fixed in Shawn's other L1C branch.

My main concern with this change is the same as the main concern with the other PR - that this will result in data extrapolation. But, if we are confident that the other code won't extrapolate, then this won't make it start. So I think this change is good, but definitely dependent on the other PR.

maxinelasp · 2026-04-28T22:12:01Z

+    )
+    norm = _build_mag_l1b(nm_epochs, "imap_mag_l1b_norm-mago", "0:2")
+
+    # Burst covers day_start_ns through 10 minutes into the day at 8 vec/s.


Can you move this (or add another test) so burst doesn't cover the entire beginning of the day, and we can check if there is still a gap? So for example, if the first 10 seconds of a day don't have BM or NM coverage, they should have no data.

Good point. I updated this test so BM starts 10 seconds after the day-window start instead of exactly at the boundary. The test now checks both pieces of behavior: timestamps before the first BM sample remain MISSING, and the BM-covered portion before the NM window is still filled as BURST. Since this is testing process_mag_l1c() directly, the no-coverage portion is represented by ModeFlags.MISSING; the public mag_l1c() path removes those rows afterward.

Good?

sapols · 2026-04-28T22:15:15Z

Does this enable any of the MAG validation tests to now pass? Can they be added in as passing test cases now if so? I assume they are currently commented out.

This PR is stacked on top of the broader MAG L1C fix branch, so the validation-test state is a little mixed.

On the stacked branch, the available L1C validation cases T013/T014/T015/T016/T024 are already enabled and passing for both mago and magi. The T015/T016 enablement comes from the base MAG L1C fix branch rather than from this one-line day-boundary change.

For T017/T018/T019 specifically, I do not think there are passing cases to add yet: the SDC validation doc still lists those as “No data provided.”

This touches the broader question I have for @maxinelasp: is this PR good enough for issue IMAP-Science-Operations-Center#2925? Or should we try to expand it to include the stuff from the Scope NOT included section? This PR will remain "Draft" until we answer that question.

maxinelasp · 2026-04-28T22:31:56Z

@sapols I think this PR is good to merge as-is, but it doesn't completely finish the day boundary question.

There is still functionality that the MAG team requested, where if there is a gap at the beginning of the day, we could go back to the previous day's NM data and retrieve the rate to match the previous day's rate with the present day's rate.

To quote Alastair on IMAP-Science-Operations-Center#1993: "downsample so that each L1C day continues the previous. Currently It starts a fresh instead of trying to continue the cadence from the previous day. Ideally every day of L1C data continues a clean pattern of 2 vectors a second (say) so 500ms gap between any 2 adjacent vectors in a month (say), meaning all downsampled Burst->Normal sections carry on a time series from the previous normal section if possible, even if that started in the previous day."

Maybe @alastairtree can comment on the priority of that bug compared to IMAP-Science-Operations-Center#3068.

Overall, this PR is good as is and represents a good fix, but there's a question of future work and closing the high level L1C validation ticket.

sapols · 2026-04-28T22:41:09Z

@maxinelasp gotcha, thanks. I'll focus on getting the other L1C PR merged first, and then I'll make this a real PR against dev (not draft). But I'll leave IMAP-Science-Operations-Center#2925 open after merging this, since it sounds like there will be substantial future work to incorporate that day-boundary stuff (I have no idea how we'll implement that but it sounds like a big change).

alastairtree · 2026-04-28T23:39:04Z

Any gaps that get filled in by this pr is good news for us so yes agree with Maxine. If a beginning of the day gap gets filled with only 2-2 say and not the previous days rate, that's ok. We would always prefer a consistent stream of vectors in the same cadence and the same rate over day boundaries but this is an edge case for us now given current operations (constant 64-8) so now low priority.

Our number one priority is filling the gaps/timeouts/large NaN windows that mean l1d and below are not available - all this mode changing stuff is secondary to that.

Thank you!

maxinelasp · 2026-04-29T14:12:28Z

Great @alastairtree - thanks for that insight. That is what Shawn and I's next priority is - I've been given the go-ahead to prioritize MAG's infrastructure work for the next month. Full steam ahead!!

sapols · 2026-04-29T18:32:29Z

@maxinelasp @alastairtree So I realized that given the stacked nature of this branch, there's no clean way to merge/promote this draft PR without causing problems/conflicts for PR #2899. So instead, I'll wait until we resolve the extrapolation question in PR 2899 and merge it, then I'll rebase this onto dev and open a clean non-draft PR against upstream.

sapols force-pushed the mag-l1c-t017-19-fix branch from 39046dd to c54f1d8 Compare April 15, 2026 21:57

sapols commented Apr 15, 2026

View reviewed changes

sapols mentioned this pull request Apr 16, 2026

BUG - MAG L1C validation tests IMAP-Science-Operations-Center/imap_processing#1993

Open

alastairtree reviewed Apr 27, 2026

View reviewed changes

test: add and align MAG L1C day-boundary tests

a34fd68

Add explicit burst-only L1C coverage for the no-normal-mode path and update existing mixed-input MAG L1C tests to build epochs aligned with their requested processing day instead of relying on a year-2000 workaround.

sapols requested a review from maxinelasp April 28, 2026 22:06

maxinelasp reviewed Apr 28, 2026

View reviewed changes

test: preserve uncovered MAG L1C leading gaps

a198d38

sapols mentioned this pull request Apr 30, 2026

Fix MAG L1C gap-fill bugs for validation tests T015 and T016 IMAP-Science-Operations-Center/imap_processing#2899

Open

Conversation

sapols commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Base branch

Scope

Scope NOT included

Verification

Real-day CDF regeneration (Dec 4-7 2025)

Test plan

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sapols commented Apr 15, 2026

Uh oh!

alastairtree left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sapols Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sapols commented Apr 28, 2026

Uh oh!

maxinelasp commented Apr 28, 2026

Uh oh!

sapols commented Apr 28, 2026

Uh oh!

alastairtree commented Apr 28, 2026

Uh oh!

maxinelasp commented Apr 29, 2026

Uh oh!

sapols commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

sapols commented Apr 15, 2026 •

edited

Loading

alastairtree left a comment •

edited

Loading

sapols Apr 28, 2026 •

edited

Loading