Skip to content

[Internal review] Proposal draft: anchored composability invariant (do not merge, do not notify upstream)#24

Draft
ColinDKelley wants to merge 13 commits into
mainfrom
invoca/proposal-anchored-composability-invariant
Draft

[Internal review] Proposal draft: anchored composability invariant (do not merge, do not notify upstream)#24
ColinDKelley wants to merge 13 commits into
mainfrom
invoca/proposal-anchored-composability-invariant

Conversation

@ColinDKelley
Copy link
Copy Markdown

@ColinDKelley ColinDKelley commented May 11, 2026

Internal review: upstream proposal draft

This PR exists purely for internal Invoca review of a proposal we are considering filing upstream against prometheus/prometheus (and possibly prometheus/proposals as an amendment to PROM-52).

  • Base: Invoca/prometheus:mainnothing in this PR is destined for merge into our fork. We will close it without merging once the draft is finalized.
  • Head: Invoca/prometheus:invoca/proposal-anchored-composability-invariant
  • Do not notify or request review from upstream Prometheus maintainers until we have internal sign-off.

This is a continuation of #23, which was auto-closed when the head branch was renamed from invoca/proposal-anchored-linearity-invariant to invoca/proposal-anchored-composability-invariant. No commits or content were lost; same head commit (1f4bc92).

What this proposes (upstream)

Formally document and regression-test the composability invariant that anchored + increase satisfies:

increase(m[r₁] anchored) @ T₁
  +
increase(m[r₂] anchored) @ T₂
  =
increase(m[r₁ + r₂] anchored) @ T₂

This is the mathematical content of the "composability" goal that PROM-52 mentions but never states or tests. It is also the design contract of Invoca's yrate family, which PROM-52 already cites as prior art.

What reviewers should look at

The full draft lives in docs/proposals/anchored-composability-invariant.md. It contains:

  • Summary + precise invariant statement
  • Why it matters (ties to existing PROM-52 language on composability)
  • Proof sketch (collapsible)
  • Related prior art (Invoca yrate / the y-rate design doc)
  • Proposed additions: user-facing docs, proposal amendment, regression tests
  • A "Reviewer notes" section at the bottom with open questions on tone, attribution, scope, venue, and timing — please weigh in there

What I need from reviewers

  1. Does the framing land the way we want upstream? (Especially the attribution to Invoca's yrate work.)
  2. Anything technically wrong or underspecified in the proof sketch?
  3. Preferred venue: issue on prometheus/prometheus first, or proposal amendment PR on prometheus/proposals first?
  4. Strip the "Reviewer notes" section before filing, or keep a trimmed version as an appendix?

No rush — happy to sit on this for days/weeks before filing anything externally.

ColinDKelley and others added 10 commits April 18, 2026 21:47
Draft for internal Invoca review before filing upstream against
prometheus/prometheus and/or prometheus/proposals.

Proposes formalizing and testing the compositional invariant that
anchored + increase satisfies:

  increase(m[D12] anchored) @ T2 + increase(m[D23] anchored) @ T3
    = increase(m[D13] anchored) @ T3

This property is the design goal of Invoca's yrate family (cited as
prior art in PROM-52) and the "composability" goal PROM-52 mentions
but never states or tests.

Includes a proof sketch, proposed doc/test additions, and reviewer
notes on tone, scope, venue, and timing.

Made-with: Cursor
The previous wording ("left-open / right-closed range semantics
plus 'last sample at or before start as the boundary anchor'")
conflated two separate concepts and read as if the anchor
implied left-closed range semantics.

Reword to make the distinction explicit:

- Range membership is (start, end] — left-open, right-closed. A
  sample at t = start is NOT a member of the range.
- The anchor is a separate lookup OUTSIDE the range, reaching
  backward from start up to lookback_delta, returning the latest
  sample with t <= start.

Also sharpen the boundary-ambiguity discussion and proof-sketch
item (2) to spell out that a sample at t = T2 plays a dual role
(last member of the earlier range AND anchor for the later one)
without ever being a member of both ranges. That dual role — not
double-membership — is what makes the additivity exact.

Made-with: Cursor
Replace the delta symbol with r_ij for range durations, matching
Prometheus/PromQL's canonical term ("range selector", "range
vector", "--query.lookback-delta"). lookback_delta stays as-is
since it is literal Prometheus terminology.

No semantic changes; pure notational cleanup so readers don't
mentally translate from physics/math notation to Prometheus
terminology while parsing the invariant.

Made-with: Cursor
Reframe the invariant statement and proof in terms Prometheus
already uses natively:

- Range selectors m[r_a], m[r_b], m[r_a + r_b] — PromQL syntax.
- @ T_a, @ T_b — the @ modifier that users already know.
- The only time arithmetic is addition (T_b = T_a + r_b),
  expressing that the two anchored windows are adjacent.

The previous notation defined range durations as timestamp
subtractions (r_ij = T_j - T_i), which read as physics/math
shorthand rather than PromQL. Drop that in favor of taking
ranges as primitives and deriving the later evaluation instant
via addition.

Proof sketch refactor:

- Rename generic-range placeholders from (T_a, T_b] to (s, e]
  so they don't collide with the evaluation-instant variables.
- Introduce T_start = T_a - r_a locally as the left boundary of
  the combined window; the three anchored windows then cover
  (T_start, T_a], (T_a, T_b], and (T_start, T_b].
- Rename the three `f`-terms to f_earlier/f_later/f_combined
  (descriptive, matches the framing of "earlier + later =
  combined").
- Fix two straggler T_2 references in the regression-tests
  section.

Pure notational change; invariant, proof, and scope are
unchanged.

Made-with: Cursor
Previous commit introduced T_a / T_b / r_a / r_b via ASCII-underscore
notation, which is visually noisier than Unicode subscripts and lost
the numeric-subscript convention the earlier drafts used. Revert to:

- Statement: T_1, T_2, r_1, r_2 (Unicode subscripts), with the
  adjacency constraint T_2 = T_1 + r_2.
- Proof: introduce T_0 = T_1 - r_1 as the left boundary of the
  combined window; three anchored windows cover (T_0, T_1],
  (T_1, T_2], (T_0, T_2].
- Function values: f_01, f_12, f_02 (Unicode subscript pairs
  consistent with the interval notation) instead of the
  ASCII-underscore descriptive names f_earlier / f_later / f_combined.
- Regression-test bullets: T_1 instead of T_a.

Generic range-placeholder names (s, e] in the proof sketch are kept
unchanged; they are short, descriptive, and collision-free with the
concrete T_i subscripts.

No change to the invariant itself or the proof structure.

Made-with: Cursor
Previous wording "T₂ = T₁ + r₂" made T₂ look like a derived
quantity. In practice T₁ and T₂ are both primary — they are the
two evaluation instants a user would pick — and r₂ is what gets
determined by choosing them.

Reorder the definitions to make that explicit:

- T₁ < T₂: two independent evaluation instants.
- r₁: any range duration bounded by lookback_delta (free choice).
- r₂: the length of the interval (T₁, T₂], described in prose
  rather than as T₂ − T₁, keeping with the earlier decision to
  avoid timestamp arithmetic in the statement.

The adjacency condition ("later window's left boundary falls
exactly on T₁, no gap and no overlap") is now stated directly
rather than implied by an equation.

The precondition "r₁ ≤ lookback_delta" migrates from a trailing
parenthetical into r₁'s own definition line, since it is a
constraint on r₁ specifically.

No change to the invariant, proof, or scope.

Made-with: Cursor
Promote T₀ to a first-class timestamp alongside T₁ and T₂, so the
setup reads naturally as "pick three timestamps, the ranges are
the lengths of the two adjacent intervals." That makes clear
upfront what intervals r₁ and r₂ cover without forcing the
reader to derive them.

New setup:

- T₀ < T₁ < T₂ — three timestamps.
- r₁ = length of (T₀, T₁], bounded by lookback_delta.
- r₂ = length of (T₁, T₂].

Added a small table showing which interval each anchored window
covers; the table pairs window expression with interval in a
way that leaves no room for ambiguity about which range is
which.

Proof sketch simplifies correspondingly: the "Let T₀ = T₁ − r₁"
definition line is dropped since T₀ is now primary. The three
f terms and partition/anchor identities already referenced T₀,
so no other changes were needed there.

No change to the invariant itself or the proof structure.

Made-with: Cursor
Add a one-line "PromQL rule of thumb" reminder before the
windows-vs-intervals table: `m[duration] anchored @ T` covers
(T − duration, T], so the @ instant is always the right edge
and the range extends backward from it.

Without this reminder, readers who aren't fluent in the @
modifier may wonder why the combined window uses `@ T₂` rather
than `@ T₀` or `@ T₁` — the only way to get a range ending at
T₂ is to evaluate it @ T₂, independent of how far back the
range extends.

Everything else unchanged.

Made-with: Cursor
Adopt the term composability for the upstream-facing draft, with
additivity over adjacent ranges as the precise mathematical anchor.

Rationale:

* PROM-52 already names "composability" as a stated goal twice
  (lines 147 and 222) but never pins it down. Meeting upstream at
  their own vocabulary makes the proposal a definition of an
  existing concept rather than an introduction of a new one --
  lower-stakes and easier to land.

* Strict "linearity" overclaims; the property at issue is finite
  additivity over a partition of adjacent intervals, not full
  linearity over a vector space. A measure-theory-literate reviewer
  would flag the term, and the substance of the proposal should
  not get derailed into that pedantic objection.

* Internal Invoca terminology (branch names, commit messages, the
  yrate code comments, and the yrate design doc) continues to use
  "linearity" -- that shorthand has years of established cache
  among engineers who already know what it means. The branch name
  and PR remain on "linearity"; only the artifact destined for
  upstream consumption uses "composability".

The body of the proposal already used "composability" and
"additivity" appropriately, so the only edit needed was the H1
title and the filename.

Co-authored-by: Cursor <[email protected]>
ColinDKelley and others added 3 commits May 11, 2026 10:21
The previous wording told the reader the windows 'cover the intervals
their names suggest' immediately before the table that defines those
intervals -- a circular construction. The reader has no prior basis
for what the names suggest until the table provides it.

Rewording to point forward to the table directly:
'The three anchored windows below cover these intervals.'

Co-authored-by: Cursor <[email protected]>
Three related edits, all on the upstream-facing draft:

1. Rename 'Regression tests' (Proposed additions #3) to 'Invariant
   tests'. 'Regression' connoted defensive backsliding-protection,
   which understates the role: these tests codify the contract being
   proposed. They are the contract; regression defense is the side
   effect.

2. Add a permalink in §3 to Invoca's already-landed additivity_*
   block on the invoca-2.55.1/align-yrate-to-3x-range-boundary branch.
   Pinned to the commit SHA so the L531-L593 range stays stable as
   the file grows. Notes that uniform / earlier-reset / later-reset
   are covered; off-cadence and partial-dataset still to add.

3. In Related prior art, expand the '~5 years in production'
   sentence to actually say what we learned: composability via
   additivity is the most valuable property of the family in
   practice (dashboards zoom, recording rules roll up, alerts fire
   without partition-boundary artifacts). The original wording named
   the duration but not the lesson.

4. Tighten 'Happy to contribute' since the porting claim is now
   redundant with §3's concrete link.

Co-authored-by: Cursor <[email protected]>
Our existing additivity_* block already covers what §3 asks for:

* Off-cadence boundary timestamps -- deliberate, called out in the
  block header.
* Counter resets in the earlier window (Scenario 2) and the later
  window (Scenario 3).
* Partial-dataset / missed-scrape semantics -- the last_in / anchor
  lookups treat off-cadence boundaries and missing samples through
  the same engine code paths, so the existing scenarios exercise
  that behavior without needing a separate 'gap' test.

Rewording §3's link sentence to claim full coverage rather than
'three scenarios, more to add on top'. Reviewer can verify against
the L531-L593 permalink.

Co-authored-by: Cursor <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant