Skip to content

Fix flaky DataLoaderPerformanceTest with deterministic synchronization#4299

Merged
andimarek merged 3 commits into
masterfrom
fix/flaky-dataloader-performance-test
Mar 8, 2026
Merged

Fix flaky DataLoaderPerformanceTest with deterministic synchronization#4299
andimarek merged 3 commits into
masterfrom
fix/flaky-dataloader-performance-test

Conversation

@andimarek
Copy link
Copy Markdown
Member

Summary

  • Fixes flaky "expensive query" tests in DataLoaderPerformanceTest and DataLoaderPerformanceWithChainedInstrumentationTest that intermittently failed with departmentsForShopsBatchLoaderCounter > 2
  • Root cause: random-sleep async data fetchers (shops / expensiveShops) completing at different times caused ExhaustedDataLoaderDispatchStrategy to dispatch prematurely, splitting what should be a single batch into multiple batches
  • Uses a two-latch synchronization approach in BatchCompareDataFetchers:
    1. Root fetcher rendezvous (CountDownLatch): ensures both root data fetchers complete simultaneously on a dedicated thread pool
    2. Completion overlap latch (CountDownLatch): ensures both root fields are inside their startComplete/stopComplete window before either proceeds, fully eliminating the race between thenApply callbacks
  • Removes @Ignore from the equivalent chained instrumentation test that was disabled for the same flakiness
  • Tightens assertions from <= 2 to == 1 since synchronized fetching guarantees optimal batching

Test plan

  • Verified 10/10 passes on Java 25 (default)
  • Verified 5/5 passes on Java 11
  • Verified 5/5 passes on Java 17
  • Verified 5/5 passes on Java 21
  • All 32 tests pass consistently (including previously @Ignored test)

🤖 Generated with Claude Code

The "expensive query" tests with multiple root fields (shops + expensiveShops)
were flaky because random-sleep async data fetchers could complete at different
times, causing ExhaustedDataLoaderDispatchStrategy to dispatch prematurely and
split batches.

Uses a two-latch synchronization approach:
1. Root fetcher rendezvous: ensures both root data fetchers complete simultaneously
2. Completion overlap latch: ensures both root fields are inside their
   startComplete/stopComplete window before either proceeds, eliminating
   the race between thenApply callbacks entirely

Also removes @ignore from the equivalent test in
DataLoaderPerformanceWithChainedInstrumentationTest and tightens assertions
from <= 2 to == 1 since synchronized fetching guarantees optimal batching.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 7, 2026

Test Report

Test Results

Java Version Total Passed Failed Errors Skipped
Java 11 5680 (+1 🟢) 5624 (+2 🟢) 0 (±0) 0 (±0) 56 (-1)
Java 17 5680 (+1 🟢) 5623 (+2 🟢) 0 (±0) 0 (±0) 57 (-1)
Java 21 5680 (+1 🟢) 5623 (+2 🟢) 0 (±0) 0 (±0) 57 (-1)
Java 25 5680 (+1 🟢) 5623 (+2 🟢) 0 (±0) 0 (±0) 57 (-1)
jcstress 32 (±0) 32 (±0) 0 (±0) 0 (±0) 0 (±0)
Total 22752 (+4 🟢) 22525 (+8 🟢) 0 (±0) 0 (±0) 227 (-4)

Code Coverage (Java 25)

Metric Covered Missed Coverage vs Master
Lines 28698 3126 90.2% ±0.0%
Branches 8332 1510 84.7% ±0.0%
Methods 7681 1224 86.3% ±0.0%

No per-class coverage changes detected.

Full HTML report: build artifact jacoco-html-report

Updated: 2026-03-08 02:32:40 UTC

andimarek pushed a commit that referenced this pull request Mar 8, 2026
Add 8 deterministic tests that directly exercise the concurrent state
machine in ExhaustedDataLoaderDispatchStrategy, recovering branch and
line coverage lost when PR #4299 replaced race-condition-dependent
timing with deterministic latch-based synchronization.

Tests cover:
- Basic dispatch cycle (finishedFetching triggers dispatch)
- Early return in newDataLoaderInvocation when flag already set
- Dispatch from newDataLoaderInvocation when objectRunningCount is 0
- Multiple dispatch rounds with chained data loader invocations
- executionSerialStrategy state reset
- Subscription alternative call context with separate call stack
- startComplete/stopComplete dispatch coordination
- Deferred call context lazy call stack creation via computeIfAbsent

https://claude.ai/code/session_01WwTVwjXS1wakJQmZYsXLvY
@andimarek andimarek merged commit bd3914d into master Mar 8, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant