fix(postgres): Use end_date in synthetic entity_df for non-entity retrieval #6110

Open
YassinNouh21 wants to merge 3 commits into feast-dev:master from YassinNouh21:fix/postgres-entity-df-timestamp
YassinNouh21 (Collaborator) commented Mar 15, 2026

Summary

  • The non-entity retrieval path (entity_df=None) built a synthetic entity_df with pd.date_range(start=start_date, ...)[:1], so start_date became the event timestamp
  • PIT joins use MAX(entity_timestamp) as the upper bound, so using start_date made end_date unreachable: no features after start_date would be returned
  • Fix: use [end_date] directly, matching the ClickHouse (#6066, "feat: Add non-entity retrieval support for ClickHouse offline store") and Dask implementations
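
The effect of the upper bound can be illustrated with a small standard-library sketch. This is a simplified model of the point-in-time filter, not Feast's SQL template; the helper `pit_visible_rows` and the sample rows are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical query window and feature rows: (event_timestamp, value).
start_date = datetime(2026, 1, 1)
end_date = datetime(2026, 1, 8)
feature_rows = [(start_date + timedelta(days=d), d) for d in range(8)]


def pit_visible_rows(entity_timestamps, rows):
    """Keep rows at or before the latest entity timestamp.

    Simplified stand-in for using MAX(entity_timestamp) as the upper bound.
    """
    upper_bound = max(entity_timestamps)
    return [row for row in rows if row[0] <= upper_bound]


# Before the fix: the synthetic entity_df carried start_date.
before = pit_visible_rows([start_date], feature_rows)
# After the fix: the synthetic entity_df carries end_date.
after = pit_visible_rows([end_date], feature_rows)

print(len(before), len(after))
```

Under these assumptions only the row at start_date is visible before the fix, while the whole window up to end_date is visible after it.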

Test plan

  • Added regression test test_non_entity_entity_df_uses_end_date that captures the synthetic entity_df and asserts its timestamp equals end_date
  • All existing TestNonEntityRetrieval tests pass
  • No changes to the query template or other code paths
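
A rough sketch of what such a regression test could look like, assuming a capture-by-mock approach. `retrieve_without_entities` and `run_query` are hypothetical stand-ins, not the real Feast functions; the actual test in this PR may be structured differently.

```python
from datetime import datetime
from unittest.mock import MagicMock


def retrieve_without_entities(start_date, end_date, run_query):
    """Hypothetical stand-in for the non-entity path; not Feast's real function."""
    # The fixed behavior: the synthetic entity_df holds end_date only.
    synthetic_entity_df = [{"event_timestamp": end_date}]
    return run_query(entity_df=synthetic_entity_df)


def test_non_entity_entity_df_uses_end_date():
    start_date = datetime(2026, 1, 1)
    end_date = datetime(2026, 1, 8)
    run_query = MagicMock(return_value="job")

    retrieve_without_entities(start_date, end_date, run_query)

    # Capture the entity_df that was handed to the (mocked) query step.
    captured = run_query.call_args.kwargs["entity_df"]
    assert [row["event_timestamp"] for row in captured] == [end_date]


test_non_entity_entity_df_uses_end_date()
```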

Fixes the bug identified during review of #6066, referenced in #6057.



YassinNouh21 requested a review from a team as a code owner March 15, 2026 17:51
YassinNouh21 self-assigned this Mar 15, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

…_df for non-entity retrieval

The non-entity retrieval path created a synthetic entity_df using
pd.date_range(start=start_date, ...)[:1], which placed start_date as
the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the
upper bound for feature data filtering, using start_date made end_date
unreachable — no features after start_date would be returned.

Fix: use [end_date] directly, matching the ClickHouse implementation
(PR feast-dev#6066) and the Dask offline store behavior.

Signed-off-by: yassinnouh21 <[email protected]>
The entity_df fix alone would cause min_event_timestamp to be computed
as end_date - TTL (instead of start_date - TTL), clipping valid data
from the query window. Override entity_df_event_timestamp_range to
(start_date, end_date) in non-entity mode so the full range is used.

Also fix ruff formatting in the test file.

Signed-off-by: yassinnouh21 <[email protected]>
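
The window clipping described in the second commit message can be sketched with plain datetime arithmetic. This is an illustration under assumptions: the 7-day TTL, the dates, and the helper `min_event_timestamp` are hypothetical and do not reproduce the offline store's actual code.

```python
from datetime import datetime, timedelta

start_date = datetime(2026, 1, 1)
end_date = datetime(2026, 1, 31)
ttl = timedelta(days=7)  # hypothetical feature view TTL


def min_event_timestamp(entity_ts_range, ttl):
    """Lower bound of the scan window: earliest entity timestamp minus TTL."""
    return entity_ts_range[0] - ttl


# Range inferred from a synthetic entity_df that holds only end_date.
inferred_range = (end_date, end_date)
# Range after overriding it to (start_date, end_date) in non-entity mode.
overridden_range = (start_date, end_date)

clipped = min_event_timestamp(inferred_range, ttl)
full = min_event_timestamp(overridden_range, ttl)

print(clipped.date(), full.date())
```

With the inferred range, the lower bound falls after start_date, so rows between start_date and end_date - TTL would be dropped; with the override, the lower bound is start_date - TTL.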
ntkathole force-pushed the fix/postgres-entity-df-timestamp branch from 0f4df15 to e82371d March 16, 2026 03:50
franciscojavierarceo (Member) left a comment

Can you add an integration test for this so we can confirm the behavior?
