Tags: replit/river-python

v0.17.20

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
codegen: emit guards for every anyOf variant to fix mypy union-attr on array-containing unions (#182)

Why
===

The encoder generator for non-discriminated anyOf unions emits a chain
of ternary expressions, with the last variant historically rendered as
the unguarded `else` branch. That works for simple unions like `object |
str | list` (mypy can negative-narrow `x` to `list` in the final
branch), but it breaks for deeper unions where the array variant is
last, e.g.

```
str | float | bool | list[scalar] | None
```

When mypy fails to fully narrow `x` to `list[...]` through the prior
`isinstance` checks (`isinstance(x, (int, float))` plus `bool`
subclassing `int` make this tricky), it complains that scalar items of
the union have no `__iter__` attribute:

```
error: Item "float" of "str | float | bool | None | list[...]"
    has no attribute "__iter__" (not iterable)  [union-attr]
error: Item "bool"  of ... has no attribute "__iter__"  [union-attr]
error: Item "object" of ... has no attribute "__iter__"  [union-attr]
```

This is the exact failure that has been blocking ai-infra's
`codegen-latest-pid2-schema.yml` auto-update workflow since 2026-05-04,
when repl-it-web#78355 widened
`agentToolPostgreSQL.executeSqlCommand.params` from a flat scalar union
to `array<scalar | array<scalar>>`. Every run since has failed on the
regenerated `executeSqlCommand.py` at the `for y in x` iteration inside
`encode_ExecutesqlcommandInputParams`. The committed pid2 client in
ai-infra has been kept current by hand (see replit/ai-infra#12813), but
the bot has been red for ~2.5 weeks.

What changed
============

`src/replit_river/codegen/client.py`: in the non-discriminated-anyOf
branch of `encode_type`, emit an explicit `isinstance` / `is None` guard
for every entry in `encoder_parts` — including the last one — and append
a `cast(Any, x)` fallback. mypy no longer has to negative-narrow into
the iterating branch, so deep unions with an array variant lint cleanly.
`Any` and `cast` are already part of `FILE_HEADER` so no import
bookkeeping changes.

Concretely, for the failing executeSqlCommand schema, the encoder now
ends with:

```python
return (
    x if isinstance(x, str)
    else x if isinstance(x, (int, float))
    else x if isinstance(x, bool)
    else None if x is None
    else [encode_..._AnyOf_4(y) for y in x]
        if isinstance(x, list)
        else cast(Any, x)
)
```

Test plan
=========

- Existing `tests/v1/codegen/snapshot/test_anyof_mixed.py` snapshot
updated to show the new `if isinstance(x, list) else cast(Any, x)` tail
on its `obj | str | list[str]` encoder (the change is additive — the
runtime behavior is unchanged).
- New snapshot test
`tests/v1/codegen/snapshot/test_anyof_array_in_union.py` added with a
schema that mirrors `executeSqlCommand.params` (`array<scalar |
array<scalar>>`) and locks in the fixed output. This is the regression
test for ai-infra's CI failure.
- `uv run pytest` is green (67 passed, including all v1 and v2 codegen
tests).
- `make lint` is clean apart from a pre-existing `pyright` `grpc` import
error in `tests/v1/test_communication.py` that also fails on `main`
(unrelated).
- End-to-end verification against ai-infra: pointed ai-infra's
`./pkgs/pid2_client/scripts/generate.sh` at this branch via
`RIVER_CODEGEN_PATH=/tmp/opencode/river-python` and reran the full lint
pipeline that the auto-update workflow runs in CI; `[mypy] completed in
15.19s` and the script exited `OK.` instead of the historical
`union-attr` failure.

Once this is released (e.g. `v0.17.20`) ai-infra can bump `replit-river`
in `pkgs/pid2_client/pyproject.toml` and the auto-update workflow will
start producing green PRs again.

~ written by Zerg 👾
([ascendant-goliath-6d2f](https://zerg.zergrush.dev/chat?id=ascendant-goliath-6d2f))

v0.17.19

Fix codegen with `from` keyword (#177)

Why
===

Some schemas have `from` and `to` properties. `from` is a Python reserved
keyword, so the river-python codegen was generating invalid Python
like:

```
class Rewrites(TypedDict):
    from: NotRequired[str | None]   # SyntaxError!
```


What changed
============

- src/replit_river/codegen/typing.py — Added `import keyword` and extended
`normalize_special_chars` to append `_` to Python keywords (e.g., `from` ->
`from_`). The existing alias logic in client.py already handles setting
`Field(alias="from")` for BaseModel when the field name is normalized, so
no changes are needed there.

- tests/v1/codegen/test_input_special_chars.py — Added two new tests
(test_python_keyword_field_names_basemodel and
test_python_keyword_field_names_typeddict) that verify the codegen
produces valid Python when schema fields use reserved keywords like
from, class, and import.
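
The keyword check described above can be sketched with the standard-library `keyword` module. This is a minimal illustration only: `normalize_field_name` is a hypothetical helper, not the body of `normalize_special_chars`.

```python
import keyword


def normalize_field_name(name: str) -> str:
    """Append an underscore when the name is a reserved Python keyword.

    Hypothetical helper for illustration; the real normalization handles
    other special characters as well.
    """
    return f"{name}_" if keyword.iskeyword(name) else name
```

`keyword.iskeyword` covers only hard keywords. Soft keywords such as `match` are still valid identifiers, so they need no renaming.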

Test plan
=========

Added new tests

v0.17.18

Use .get() for discriminator access in generated union encoders (#176)

Why
===

Follow-up to #175. The discriminator field in a discriminated union may
be `NotRequired` in the TypedDict. Direct key access (`x["shapeType"]`)
triggers pyright's `reportTypedDictNotRequiredAccess` error.

This broke the pid2 codegen CI when the scribe schema added
discriminated union variants where the discriminator field is optional.

What changed
============

Use `x.get("key")` instead of `x["key"]` for discriminator checks in the
generated ternary chain. This is safe because a missing key returns
`None`, which won't match any discriminator value and falls through to
the next branch.

Test plan
=========

- All 64 tests pass
- Updated snapshot for `test_unknown_enum`
- Tested end-to-end against the pid2 schema from ai-infra — codegen,
mypy, and pyright all pass

~ written by Zerg 👾

v0.17.17

Fix mypy arg-type errors in generated discriminated union encoders (#175)

Why
===

The codegen for discriminated union TypedDict encoders produces ternary
chains like:

```python
encode_Foo(x) if x["kind"] == "foo" else encode_Bar(x)
```

mypy can't narrow union types through these ternary conditions, so it
flags every encoder call as receiving the wrong type (`arg-type`).

This broke the pid2 codegen CI when new discriminated union variants
were added to a schema.

What changed
============

Use `cast()` to explicitly narrow the type to the correct variant after
the discriminator check, instead of suppressing with `# type:
ignore[arg-type]`. This preserves type safety in the generated code.

Before:
```python
encode_Foo(x)  # type: ignore[arg-type]
if x["kind"] == "foo"
else encode_Bar(x)  # type: ignore[arg-type]
```

After:
```python
encode_Foo(cast('Foo', x))
if x["kind"] == "foo"
else encode_Bar(cast('Bar', x))
```

Affects both the single-variant and multi-variant discriminator code
paths.

Test plan
=========

CI

v0.17.16

feat: Propagate OTel context via WebSocket HTTP upgrade headers (#174)

Why
===

River's WebSocket connections don't carry any OTel context (traceparent,
tracestate, baggage) from client to server. This means distributed
tracing and baggage propagation are broken at the WebSocket boundary —
the server has no way to inherit the caller's trace context or read OTel
baggage entries.

What changed
============

Uses the standard W3C HTTP header approach — the same mechanism any HTTP
service uses for OTel propagation — applied to the WebSocket upgrade
request.

**Client side (`client_transport.py`, `v2/session.py`)**
- Before calling `websockets.connect()`, inject the current OTel context
into a headers dict via `propagate.inject()`.
- Pass those headers as `extra_headers` (v1 legacy API) /
`additional_headers` (v2 asyncio API) to the connect call.
- This automatically includes `traceparent`, `tracestate`, and `baggage`
headers if the corresponding propagators are configured in the global
textmap.

**Server side (`server.py`)**
- In `Server.serve()`, extract the OTel context from
`websocket.request_headers` via `propagate.extract()`.
- Attach the extracted context as the ambient OTel context for the
lifetime of the connection using `context.attach()` /
`context.detach()`.
- Any handler code running within the connection can now read baggage
via `baggage.get_all()` and inherits the caller's trace context.

**Tests (`tests/v1/test_opentelemetry.py`)**
- `test_baggage_propagated_via_ws_headers`: Sets two baggage entries on
the client, verifies the server handler can read them.
- `test_no_baggage_when_none_set`: Verifies clean behavior when no
baggage is set.
- `test_traceparent_propagated_via_ws_headers`: Sets both an active span
and baggage on the client, verifies both propagate.

Test plan
=========

```
$ uv run pytest tests/ -v
64 passed in 8.46s
```

All existing tests pass unchanged. The 3 new tests verify end-to-end
OTel context propagation through the WebSocket connection.

Revertibility
=============

Safe to revert — only adds new `extra_headers`/`additional_headers` to
`websockets.connect()` and a `propagate.extract()` + `context.attach()`
wrapper on the server. No wire protocol changes, no schema changes, no
data mutations.

~ written by Zerg 👾

v0.17.15

Fix codegen crashes for intersection types and complex list inner types (#172)

Why
===



Codegen fails when a schema contains intersection types (`allOf`) or
lists with complex inner types (e.g. `list[dict[str, Any]]`). Both crash
with `Complex type must be put through render_type_expr!` or `Unexpected
expression when expecting a type name: DictTypeExpr(...)` because
`TypeName` objects are used directly in f-strings or passed to
`ensure_literal_type` which only accepts simple `TypeName` values.

What changed
============



- Fix `TypeName.__str__()` crash in the `RiverIntersectionType` encoder
by wrapping `encoder_name` with `render_literal_type()`, matching the
existing pattern used by `RiverUnionType` (line 625) and
`RiverConcreteType` (line 654)
- Fix `ensure_literal_type` crash when a list's inner type is a complex
expression (e.g. `list[dict[str, Any]]`) by guarding the `ListTypeExpr`
match to only enter the encoding branch for `TypeName` inner types,
falling through to `list(x)` for composite types that don't need
encoding

Test plan
=========


v0.17.14

Fix codegen for non-discriminated anyOf unions with mixed types (#171)

Why
===

The encoder generation for TypedDict inputs produces malformed Python
code when handling `anyOf` unions containing mixed types like `[object,
string, array]`.

Before:
```python
return (
    encode_...AnyOf_0(x)
    x
     if isinstance(x, str) else
    encode_str(x)
)
```

After:
```python
return (
    encode_...AnyOf_0(x) if isinstance(x, dict) else
    x if isinstance(x, str) else
    list(x)
)
```


What changed
============


- Collect `(type_check, encoder_expr)` pairs for each union member
- Build a proper ternary chain with `isinstance` checks
- Handle primitive array items by returning `list(x)` instead of undefined encoder calls
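
The pair-collection approach can be sketched as a small string builder. This is an illustration under assumptions, not the code in `client.py`: `build_ternary_chain` is a hypothetical name, and the last pair's check is dropped so it becomes the bare `else`, matching the "After" output above.

```python
def build_ternary_chain(parts: list[tuple[str, str]]) -> str:
    """Join (type_check, encoder_expr) source fragments into one expression.

    The final pair is emitted without its guard, as the trailing else branch.
    Raises ValueError if `parts` is empty.
    """
    *guarded, (_, last_expr) = parts
    chain = "".join(f"{expr} if {check} else " for check, expr in guarded)
    return chain + last_expr


parts = [
    ("isinstance(x, dict)", "encode_obj(x)"),  # encode_obj is a placeholder name
    ("isinstance(x, str)", "x"),
    ("isinstance(x, list)", "list(x)"),
]
source = build_ternary_chain(parts)
```

Keeping the check and the expression together in one pair makes it hard to emit an encoder call without its condition, which is the malformed shape shown in the "Before" output.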

Test plan
=========

CI

v0.17.13

feat: recursive types (#170)

Why
===

Recursive types weren't supported in River codegen. When a type
referenced itself (like a tree node with children of the same type), it
would generate `list[Any]` instead of proper forward references.

What changed
============

Added support for JSON Schema's `$id`/`$ref` to handle recursive types.
Now generates proper forward references like `list["TreeNode"]` instead
of `list[Any]`.

Test plan
=========

Added a test with a recursive schema (tree node with children). All
existing tests pass.

v0.17.12

Upgrade pydantic version (#168)

Why
===

Our pydantic version is getting a little out of date, and we were
pinning to a specific version.

What changed
============

Made the package less prescriptive regarding the pydantic version, and
changed the minimum Python version to 3.12.

Test plan
=========

CI/CD

v0.17.11

Properly close streams on exception (#167)

Why
===

We've had persistent timeout errors in AI-Infra, and I suspect that it's
related to not handling bumps in the connection correctly.

What changed
============

- WebSocket drops or send failures left _streams populated, so any
in-flight RPC hung until the session fully shut down. That meant clients
didn’t see an abort signal and could block indefinitely even though the
transport was already defunct.
- Added _abort_all_streams() in src/replit_river/session.py#L289 and
call it from both client_session.serve() and server_session.serve() on
ConnectionClosed, FailedSendingMessageException, or any other unexpected
exception (src/replit_river/client_session.py#L95,
src/replit_river/server_session.py#L82). This immediately closes every
active channel and clears _streams, ensuring callers are notified right
away when the socket dies so they can retry or surface an error.
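
The abort-on-failure pattern can be sketched with plain `asyncio` futures. Everything here is a simplified assumption: the `Session` class, `StreamAborted`, and `open_stream` are hypothetical, and the real sessions track channels rather than futures.

```python
import asyncio
from collections.abc import Awaitable, Callable


class StreamAborted(Exception):
    """Raised to callers whose stream was cut off by a transport failure."""


class Session:
    def __init__(self) -> None:
        # stream id -> future that an in-flight RPC is waiting on
        self._streams: dict[str, asyncio.Future[str]] = {}

    def open_stream(self, stream_id: str) -> asyncio.Future[str]:
        fut: asyncio.Future[str] = asyncio.get_running_loop().create_future()
        self._streams[stream_id] = fut
        return fut

    def _abort_all_streams(self) -> None:
        # Wake every waiter with an error, then forget the streams.
        for fut in self._streams.values():
            if not fut.done():
                fut.set_exception(StreamAborted("transport closed"))
        self._streams.clear()

    async def serve(self, transport: Callable[[], Awaitable[None]]) -> None:
        try:
            await transport()
        except Exception:
            # Any transport failure aborts in-flight streams immediately
            # (the sketch swallows the error to keep the demo short).
            self._abort_all_streams()


async def demo() -> str:
    session = Session()
    pending = session.open_stream("rpc-1")

    async def broken_transport() -> None:
        raise ConnectionError("socket dropped")

    await session.serve(broken_transport)
    try:
        await pending
    except StreamAborted:
        return "aborted"
    return "completed"
```

Running `asyncio.run(demo())` shows the waiting caller receiving an error as soon as the transport fails, rather than blocking until session shutdown.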

Test plan
=========

CI/CD, ran against an internal branch with no issues 3x without flake.