DE-2447: ModelPack decoder schema + coffeecup parity coverage #82
Validate HAL's ModelPack anchor-grid runtime end-to-end against real production models on a non-square 270x480 input.

* Four synthetic schema fixtures (det/seg/multitask logical + det smart) with build-time tests asserting SchemaV2 parse and DecoderBuilder build for ModelPack 4.0 schemas.
* Two coffeecup parity fixtures (TFLite int8 smart + ONNX float) bundle raw model outputs, schema, and reference post-NMS detections from the canonical ModelPack Python decoder. Strict-parity tests assert HAL matches the reference: IoU>=0.95 + score within 0.02 (int8), IoU>=0.99 + score within 0.001 (float).
* New scripts/decoder_generate_modelpack_fixture.py regenerates both fixtures from source TFLite/ONNX models.
* .gitignore: add *.onnx alongside *.tflite to keep source models local-only.

Signed-off-by: Sébastien Taylor <[email protected]>
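The strict-parity criterion above (an IoU floor plus a score tolerance per detection) can be sketched as follows. This is a minimal illustration, not the Rust test itself: the helper names `iou_xyxy` and `matches_reference`, the `(x1, y1, x2, y2)` box layout, and the greedy one-to-one matching are all assumptions made for the example.

```python
def iou_xyxy(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def matches_reference(got, ref, min_iou, max_score_diff):
    """Return True if every reference detection has its own match in `got`.

    `got` and `ref` are lists of (box, score) pairs. Matching is greedy and
    one-to-one (an assumption of this sketch): each decoded detection can
    satisfy at most one reference detection.
    """
    if len(got) != len(ref):
        return False
    unused = list(range(len(got)))
    for ref_box, ref_score in ref:
        hit = next(
            (i for i in unused
             if iou_xyxy(got[i][0], ref_box) >= min_iou
             and abs(got[i][1] - ref_score) <= max_score_diff),
            None,
        )
        if hit is None:
            return False
        unused.remove(hit)
    return True


# Tolerances quoted in the commit message.
INT8_TOLERANCE = {"min_iou": 0.95, "max_score_diff": 0.02}
FLOAT_TOLERANCE = {"min_iou": 0.99, "max_score_diff": 0.001}
```

Under these assumptions, a call such as `matches_reference(decoded, reference, **INT8_TOLERANCE)` expresses the int8 check, and the same call with `FLOAT_TOLERANCE` expresses the tighter float check.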
Pull request overview
Adds ModelPack decoder coverage to edgefirst-decoder by introducing parity tests against a canonical Python reference (via LFS-tracked .safetensors fixtures) and adding schema-build tests that exercise additional ModelPack 4.0 output layouts.
Changes:
- Add end-to-end parity tests for a real non-square ModelPack detection model using embedded reference outputs in .safetensors fixtures.
- Add schema parsing/build tests plus four synthetic ModelPack v2 schema fixtures (det/seg/multitask logical, det smart).
- Add a Python fixture generator script and update .gitignore to keep source .tflite/.onnx models local-only.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| crates/decoder/tests/modelpack_coffeecup_parity.rs | Runs HAL decode on coffeecup fixtures and asserts parity vs embedded reference detections |
| crates/decoder/tests/modelpack_decoder_schemas.rs | Validates SchemaV2 parse + DecoderBuilder build for four ModelPack schema fixtures |
| scripts/decoder_generate_modelpack_fixture.py | Generates .safetensors fixtures by running inference + NumPy reference decode/NMS |
| testdata/decoder/modelpack_det_logical.json | Synthetic ModelPack detection logical schema fixture |
| testdata/decoder/modelpack_det_smart.json | Synthetic ModelPack detection “smart” (quantized, no nested outputs) schema fixture |
| testdata/decoder/modelpack_seg_logical.json | Synthetic ModelPack segmentation logical schema fixture |
| testdata/decoder/modelpack_multitask_logical.json | Synthetic ModelPack multitask (det+seg) logical schema fixture |
| testdata/decoder/coffeecup-mpk-det-relu-t-27d6.safetensors | LFS fixture for float ONNX export outputs + reference decode |
| testdata/decoder/coffeecup-mpk-det-relu-t-27d6_quant-u8-i8_smart.safetensors | LFS fixture for quantized TFLite export outputs + reference decode |
| .gitignore | Ignores *.onnx alongside *.tflite to keep source models out of the repo |

    def infer_tflite(model_path: Path, image_uint8_nhwc: np.ndarray) -> dict[tuple[int, ...], np.ndarray]:
        """Run TFLite inference; return {shape: raw_int8_tensor}.

        Binds outputs by shape (TFLite output names like ``PartitionedCall:0``
        don't map to schema names like ``output_0``).
        """
        try:
            from tflite_runtime.interpreter import Interpreter
        except ImportError:
            from tensorflow.lite.python.interpreter import Interpreter
        interp = Interpreter(model_path=str(model_path))
        interp.allocate_tensors()
        in_det = interp.get_input_details()[0]
        interp.set_tensor(in_det["index"], image_uint8_nhwc)
        interp.invoke()
        raw = {}
        quants = {}
        for od in interp.get_output_details():
            t = interp.get_tensor(od["index"])
            raw[tuple(int(x) for x in t.shape)] = t
            quants[tuple(int(x) for x in t.shape)] = od["quantization"]
        return raw, quants

Dropped the unused second return value from infer_tflite; type annotation now matches actual return.
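For context, shape-keyed binding with a single dict return might look like the sketch below. The helper name `bind_by_shape` and the duplicate-shape guard are assumptions added for illustration and are not taken from the script; the stand-in objects carry only a `.shape` attribute, and the shapes are made up.

```python
from types import SimpleNamespace


def bind_by_shape(tensors):
    """Return {shape: tensor} for a sequence of objects exposing `.shape`.

    Raises ValueError if two tensors share a shape, because a shape-keyed
    dict would otherwise keep only the last one without any warning.
    """
    bound = {}
    for t in tensors:
        key = tuple(int(d) for d in t.shape)
        if key in bound:
            raise ValueError(f"ambiguous binding: two outputs have shape {key}")
        bound[key] = t
    return bound


# Stand-ins for model outputs; only `.shape` matters here.
outputs = [
    SimpleNamespace(shape=(1, 9, 15, 18)),
    SimpleNamespace(shape=(1, 17, 30, 18)),
]
bound = bind_by_shape(outputs)
```

Returning one dict keeps the annotation and the call sites in agreement; if quantization parameters are needed later, a second shape-keyed dict built the same way would be one option.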

    def infer_onnx(model_path: Path, image_uint8_nhwc: np.ndarray) -> dict[tuple[int, ...], np.ndarray]:
        """Run ONNX inference; return {shape: float32_tensor} in NHWC layout.

        ModelPack ONNX takes NCHW float32 input but the heads emit NHWC outputs.
        """
        import onnxruntime as ort
        sess = ort.InferenceSession(str(model_path), providers=["CPUExecutionProvider"])
        in_meta = sess.get_inputs()[0]
        # NCHW float32 [1, 3, H, W]; normalize uint8/255.0
        img_nchw = (image_uint8_nhwc.astype(np.float32) / 255.0).transpose(0, 3, 1, 2)
        outs = sess.run(None, {in_meta.name: img_nchw})
        raw = {tuple(int(x) for x in t.shape): t for t in outs}
        return raw, {}

Same fix as #3306578774 — infer_onnx now returns only the raw dict.

    "mode": "class_agnostic",
    "score_threshold": cfg.score_threshold,
    "iou_threshold": cfg.iou_threshold,
    "max_output": cfg.max_output,

Renamed JSON key to max_detections so PerScaleFixture::load() picks up the fixture-time cap. Regenerated both safetensors.
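The failure mode behind this rename can be illustrated with a small Python stand-in for the loader. Everything here is hypothetical: the real loader is `PerScaleFixture::load()` in Rust, and the default value, the `{"nms": {...}}` metadata layout, and both function names are assumptions made for the sketch.

```python
import json

DEFAULT_MAX_DETECTIONS = 300  # hypothetical fallback, not the real default

KNOWN_NMS_KEYS = {"mode", "score_threshold", "iou_threshold", "max_detections"}


def load_nms_cap(metadata_json):
    """Lenient read: an unknown key such as `max_output` is silently ignored."""
    nms = json.loads(metadata_json)["nms"]
    return int(nms.get("max_detections", DEFAULT_MAX_DETECTIONS))


def load_nms_cap_strict(metadata_json):
    """Strict read: unknown or missing keys raise instead of falling back."""
    nms = json.loads(metadata_json)["nms"]
    unknown = set(nms) - KNOWN_NMS_KEYS
    if unknown:
        raise KeyError(f"unexpected nms keys: {sorted(unknown)}")
    return int(nms["max_detections"])


old_metadata = json.dumps({"nms": {"mode": "class_agnostic", "max_output": 50}})
new_metadata = json.dumps({"nms": {"mode": "class_agnostic", "max_detections": 50}})
```

With a lenient reader the old key yields the fallback cap rather than the fixture-time value, which matches the behaviour described in the review; a strict reader would have surfaced the mismatch at load time.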

    let nms = fix.nms_config();
    let decoder = DecoderBuilder::default()
        .with_schema(schema)
        .with_iou_threshold(nms.iou_threshold)
        .with_score_threshold(nms.score_threshold)
        .build()
        .expect("build decoder");

Added .with_max_det(nms.max_detections as usize) (and sized the Vec capacity from it) so the fixture's NMS settings flow into the decoder.
Four review comments, all valid:

* infer_tflite / infer_onnx returned (raw, _unused) tuples but were annotated as returning a single dict. Drop the unused second return value so callers and type checkers agree.
* Fixture metadata wrote nms.max_output, but PerScaleFixture::load reads max_detections, so the loader silently ignored the fixture-time cap. Rename the key.
* The parity test read nms.max_detections but never threaded it through DecoderBuilder. Add with_max_det(nms.max_detections) and size the box buffer from it so the test honours the fixture config.

Regenerated both coffeecup fixtures to pick up the renamed JSON key.

Signed-off-by: Sébastien Taylor <[email protected]>
Summary
coffeecup-mpk-det-relu-t-27d6).

What's included
- crates/decoder/tests/modelpack_coffeecup_parity.rs runs Decoder::decode on raw outputs captured from both the TFLite int8 smart export and the ONNX float export, then asserts the post-NMS detections match the Python reference embedded in each fixture. Tolerances: IoU >= 0.95 + score within 0.02 (int8), IoU >= 0.99 + score within 0.001 (float).
- crates/decoder/tests/modelpack_decoder_schemas.rs asserts SchemaV2::parse_file + DecoderBuilder::with_schema(...).build() succeeds on four ModelPack 4.0 schema shapes (det/seg/multitask logical, det smart) with the expected output topology.
- scripts/decoder_generate_modelpack_fixture.py runs inference on either a TFLite (int8) or ONNX (float) ModelPack model, executes the canonical reference decode in NumPy, and packs raw + intermediate + reference into a .safetensors consumable by the existing PerScaleFixture loader.
- *.onnx alongside the existing *.tflite rule so source models stay local-only (only the generated fixtures are committed, via LFS).

Files
- crates/decoder/tests/modelpack_coffeecup_parity.rs (new, +163)
- crates/decoder/tests/modelpack_decoder_schemas.rs (new, +156)
- scripts/decoder_generate_modelpack_fixture.py (new, +492)
- testdata/decoder/modelpack_{det_logical,det_smart,seg_logical,multitask_logical}.json (new schema fixtures)
- testdata/decoder/coffeecup-mpk-det-relu-t-27d6{,_quant-u8-i8_smart}.safetensors (new, LFS-tracked, ~947 KB total)
- .gitignore (+*.onnx)

Why this matters
The existing modelpack tests only exercised synthetic 320x320 schemas. This change is the first proof that HAL's ModelPack runtime (dequant + sigmoid + anchor-grid decode + class-agnostic NMS) reproduces the canonical ModelPack reference on a real model with non-square input and per-tensor int8 quantization.
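The stages named above (dequantize, sigmoid, class-agnostic NMS) can be sketched in plain Python. This is an illustration under stated assumptions, not the canonical ModelPack decoder: the anchor-grid box decode is omitted because its exact formula is not given here, and the threshold comparisons (`>=` for the score, `>` for IoU suppression), the `(x1, y1, x2, y2)` box layout, and all function names are choices made for the example.

```python
import math


def dequantize(q, scale, zero_point):
    """Per-tensor affine dequantization: real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def iou_xyxy(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def class_agnostic_nms(dets, iou_threshold, score_threshold, max_detections):
    """Greedy NMS over (box, score, class_id) tuples.

    "Class-agnostic" means class_id is ignored during suppression: a box is
    dropped if it overlaps any higher-scoring kept box, whatever its class.
    """
    candidates = sorted(
        (d for d in dets if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for det in candidates:
        if len(kept) >= max_detections:
            break
        if all(iou_xyxy(det[0], k[0]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

In a pipeline of this shape, an int8 class logit would pass through `dequantize` and then `sigmoid` to become a score before NMS, and the `max_detections` cap is the same setting that the fixture metadata carries.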
Test plan
- cargo test -p edgefirst-decoder --test modelpack_coffeecup_parity: 2/2 pass
- cargo test -p edgefirst-decoder --test modelpack_decoder_schemas: 4/4 pass
- cargo test -p edgefirst-decoder: full suite 480/480 (no regressions)
- make format lint check clean

Regenerating the fixtures

    source venv/bin/activate
    python scripts/decoder_generate_modelpack_fixture.py \
        coffeecup-mpk-det-relu-t-27d6_quant-u8-i8_smart.tflite \
        testdata/coffeecup.jpg \
        --output testdata/decoder/coffeecup-mpk-det-relu-t-27d6_quant-u8-i8_smart.safetensors
    python scripts/decoder_generate_modelpack_fixture.py \
        coffeecup-mpk-det-relu-t-27d6.onnx \
        testdata/coffeecup.jpg \
        --output testdata/decoder/coffeecup-mpk-det-relu-t-27d6.safetensors

The source .tflite/.onnx are intentionally not committed; they live at the repo root for reviewers who want to reproduce locally.