
Update ctypes from v3.14.3 and make _ctypes compatible #7165

Merged
youknowone merged 4 commits into RustPython:main from youknowone:ctypes
Feb 17, 2026

Conversation

@youknowone
Member

@youknowone youknowone commented Feb 16, 2026

Summary by CodeRabbit

  • New Features

    • Raw memory access: added memoryview helper and RawMemoryBuffer; exposed memoryview_at_addr.
    • pointer_type property added across ctypes types; CData convenience constructors (from_address/from_buffer/from_buffer_copy/in_dll) exposed.
    • Bitfield support for structs/unions and improved lifetime handling for pointer/char buffers.
  • Bug Fixes

    • Clearer type/comparison error messages.
    • More consistent parsing behavior and idempotent type initialization; deprecation warning for certain pack usages.
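The CData convenience constructors listed above follow CPython's ctypes semantics, which the sketch below demonstrates on standard CPython: from_buffer aliases the backing object, while from_buffer_copy snapshots it.

```python
import ctypes
import sys

# from_buffer shares memory with the backing object (no copy)
buf = bytearray(4)
ci = ctypes.c_int.from_buffer(buf)
ci.value = 0x11223344
assert int.from_bytes(buf, sys.byteorder) == 0x11223344

# from_buffer_copy takes an independent snapshot instead
snap = ctypes.c_int.from_buffer_copy(buf)
buf[:] = b"\x00\x00\x00\x00"   # zeroing buf also zeroes ci (shared memory)...
assert ci.value == 0
assert snap.value == 0x11223344  # ...but the snapshot is unaffected
```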

@coderabbitai
Contributor

coderabbitai bot commented Feb 16, 2026

📝 Walkthrough

This PR tightens whitespace handling in buffer-format parsing, standardizes the TypeError message for unsupported binary operations, and makes extensive ctypes changes: pointer-type caching with pointer_type accessors, kept-alive lifetime management, RawMemoryBuffer plus memoryview_at_addr, bitfield-aware fields, and new CData-style constructors.

Changes

Cohort / File(s) Summary
Buffer Parsing
crates/vm/src/buffer.rs
FormatCode::parse now skips leading whitespace before parsing repeat counts/format chars and removes the previous post-repeat whitespace skip.
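This aligns FormatCode::parse with CPython's struct module, where whitespace between format codes is ignored. A quick check on standard CPython:

```python
import struct

# Whitespace between format codes is ignored, so these are equivalent
# ("<" disables alignment: 2 * 4-byte ints + 2 * 2-byte shorts = 12 bytes)
assert struct.calcsize("<2i 2h") == struct.calcsize("<2i2h") == 12
```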
Comparison/Error Messages
crates/vm/src/protocol/object.rs, crates/vm/src/vm/vm_new.rs
Unsupported binary-op branch now raises TypeError with a standardized message literal; control flow unchanged.
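The standardized message matches CPython's wording for unsupported operands, which can be observed directly:

```python
# CPython's standardized message for an unsupported binary operation
try:
    object() + 1
except TypeError as exc:
    msg = str(exc)

# e.g. "unsupported operand type(s) for +: 'object' and 'int'"
assert "unsupported operand type(s) for +" in msg
```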
ctypes: module surface & exports
crates/vm/src/stdlib/ctypes.rs
Expose _memoryview_at_addr and RawMemoryBuffer; add is_initialized() accessor and update check_not_initialized error text.
ctypes: core & fields
crates/vm/src/stdlib/ctypes/base.rs, crates/vm/src/stdlib/ctypes/array.rs
Add StgInfo.pointer_type, PyCData.kept_refs + keep_alive, redesigned PyCField with bitfield support and byte_size; ensure_z_null_terminated returns (kept_alive, ptr); array paths retain kept-alive refs.
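The kept-alive machinery mirrors what CPython's ctypes does for pointer-valued fields: assigning to a c_char_p field stores a reference on the instance so the pointed-to bytes outlive the assignment expression. A minimal illustration on standard CPython:

```python
import ctypes

class Record(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char_p)]

r = Record()
r.name = b"hello"   # the bytes object is retained by the instance,
                    # so the pointer stored in the struct stays valid
assert r.name == b"hello"
```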
ctypes: pointer/simple types
crates/vm/src/stdlib/ctypes/pointer.rs, crates/vm/src/stdlib/ctypes/simple.rs
Pointer types cache element linkage in StgInfo; add __pointer_type__ getter/setter; platform-specific SIMPLE_TYPE_CHARS; z/Z/P handling switched to kept_alive semantics.
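Caching pointer types matches CPython, where repeated POINTER calls for the same type return the identical cached class (the __pointer_type__ attribute itself is new in CPython 3.14 and not asserted here):

```python
import ctypes

# POINTER results are cached: repeated calls yield the same type object
IntPtr = ctypes.POINTER(ctypes.c_int)
assert ctypes.POINTER(ctypes.c_int) is IntPtr

p = ctypes.pointer(ctypes.c_int(7))
assert isinstance(p, IntPtr)
assert p.contents.value == 7
```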
ctypes: function & memoryview
crates/vm/src/stdlib/ctypes/function.rs
Add INTERNAL_MEMORYVIEW_AT_ADDR, RawMemoryBuffer pyclass, memoryview_at_impl for zero-copy memoryviews, route internal function to it, and expose pointer-type accessors on PyCFuncPtrType.
ctypes: structs & unions
crates/vm/src/stdlib/ctypes/structure.rs, crates/vm/src/stdlib/ctypes/union.rs
Add bitfield layout handling (MSVC/SystemV), circular self-reference checks, clear pointer_type on subclassing, __pointer_type__ accessors, delegate CData-like constructors, and deprecation warning for _pack_ without _layout_ on non-Windows.
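The bitfield layout being implemented is the one CPython's ctypes exposes via an optional third element in each _fields_ tuple; adjacent small fields pack into shared storage:

```python
import ctypes

class Flags(ctypes.Structure):
    _fields_ = [
        ("mode", ctypes.c_uint8, 3),    # 3-bit field
        ("level", ctypes.c_uint8, 5),   # 5-bit field
    ]

# 3 + 5 bits fit in a single c_uint8 storage unit
assert ctypes.sizeof(Flags) == 1
f = Flags(mode=5, level=17)
assert (f.mode, f.level) == (5, 17)
```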
Minor VM / stdlib tweaks
crates/vm/src/stdlib/sys.rs
Small formatting consolidation in excepthook; no behavior change.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant PyCode as Python code
    participant VM as VirtualMachine
    participant CTYPES as _ctypes module
    participant Buffer as RawMemoryBuffer
    participant Memview as memoryview

    rect rgba(100,150,240,0.5)
    PyCode->>VM: call internal(MEMORYVIEW_AT_ADDR, ptr, size, readonly)
    VM->>CTYPES: handle_internal_func(MEMORYVIEW_AT_ADDR, ...)
    CTYPES->>Buffer: create RawMemoryBuffer(ptr,size,readonly)
    Buffer->>Memview: PyMemoryView_FromObject(buffer)
    Memview-->>PyCode: return memoryview(obj)
    end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • arihant2math
  • ShaharNaveh
  • coolreader18

Poem

🐰 I hopped through pointers, bitfields, and views,
Kept bytes snug so they never lose,
A raw buffer shared without a copy,
I nudged the VM and did a floppy hop,
Celebrate small fixes with a tiny muse! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The PR title accurately describes the main focus of the changeset: updating the ctypes module from v3.14.3 and ensuring _ctypes compatibility.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.


@github-actions
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • cargo fmt --all
    Please pull the latest changes before pushing again:
git pull origin ctypes

@youknowone youknowone force-pushed the ctypes branch 5 times, most recently from f8d89d9 to c116926 on February 17, 2026 11:06
@youknowone youknowone marked this pull request as ready for review February 17, 2026 12:34
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/stdlib/ctypes/base.rs`:
- Around line 466-469: The kept_refs Vec grows unbounded because keep_alive
always pushes and never replaces prior entries; change kept_refs from a Vec to a
keyed store (e.g., HashMap or Vec<Option<...>>) and have keep_alive
accept/derive a stable unique_key (or index) for the owned buffer so calls
replace the previous entry instead of always pushing; update the keep_alive call
sites and the kept_refs type (refer to kept_refs, keep_alive and the
unique_key/index used when assigning c_char_p or z[] fields) so assignments
overwrite the old keep-alive rather than accumulating entries.
- Around line 1474-1503: The bitfield creation code currently allows negative
bit_size/bit_offset because it only checks overflow; update the validation where
bit_size_val and bit_offset_val are handled (the variables bit_size_val,
bit_offset_val and the branch that calls Self::new_bitfield) to reject negative
values: if bs < 0 or bo < 0 return Err(vm.new_value_error("number of bits
invalid for bit field")) (or an equivalent ValueError message), otherwise
proceed with the existing overflow check and call to Self::new_bitfield(name,
field_type, byte_offset, byte_size, bs as u16, bo as u16, index).

In `@crates/vm/src/stdlib/ctypes/function.rs`:
- Around line 574-578: extract_ptr_from_arg currently unwraps a CArgObject but
drops carg.offset, causing byref(obj, offset) to resolve to the base address;
update extract_ptr_from_arg to check if
arg.downcast_ref::<super::_ctypes::CArgObject>() yields a CArgObject whose inner
obj is a PyCData (e.g., downcast to PyCData) and, if so, compute and return the
pointer + carg.offset (apply the offset to the base pointer) instead of
recursing; only recurse into extract_ptr_from_arg(&carg.obj, vm) when the
wrapped object is not PyCData so offsets are preserved for
string_at/wstring_at/memoryview_at/byref semantics.
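The validation asked for in the base.rs comment above matches CPython's behavior, where an invalid bit width in a _fields_ tuple raises ValueError at class-creation time:

```python
import ctypes

err = None
try:
    class Bad(ctypes.Structure):
        _fields_ = [("a", ctypes.c_int, -1)]  # invalid (negative) bit width
except ValueError as exc:
    err = exc

# CPython rejects this with a "bit field" ValueError
assert err is not None and "bit field" in str(err)
```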

---

Duplicate comments:
In `@crates/vm/src/stdlib/ctypes/union.rs`:
- Around line 169-181: The deprecation warning emission should match the pattern
used in structure.rs; update the block that checks pack and _layout_ (using
pack, cfg!(windows), cls.as_object().get_attr("_layout_", vm)) so it calls the
same warnings API and parameters as structure.rs (i.e., the identical
warnings::warn invocation/signature and stacklevel used there) — replace the
current warnings::warn(...) usage with the exact helper/call pattern from
structure.rs to ensure consistent warning behavior and API usage.
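The replace-not-append semantics requested for kept_refs above match CPython, where a Structure's keep-alive store is keyed per field, so reassignment overwrites the old entry (inspecting _objects is a CPython implementation detail, shown only for illustration):

```python
import ctypes

class S(ctypes.Structure):
    _fields_ = [("p", ctypes.c_char_p)]

s = S()
s.p = b"first"
s.p = b"second"                # overwrites the kept reference for this field
assert s.p == b"second"
assert len(s._objects) == 1    # one keep-alive entry, not two (CPython detail)
```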

Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/ctypes/union.rs (1)

206-315: ⚠️ Potential issue | 🟠 Major

Bug: bitfield bit_offset accumulates across union fields instead of resetting.

In a C union, every field starts at byte offset 0, and bitfields within a union each begin at bit 0 of their storage unit independently. The current code accumulates bitfield_bit_offset across consecutive bitfield members (lines 297–301), so the second bitfield in a union would incorrectly start at the end of the first one — that's structure semantics, not union semantics.

For unions, bitfield_bit_offset should reset to 0 for each field.

🐛 Proposed fix
         // Check for bitfield size (optional 3rd element in tuple)
             let c_field = if field_tuple.len() > 2 {
                 let bit_size_obj = field_tuple.get(2).expect("len checked");
                 let bit_size = bit_size_obj
                     .try_int(vm)?
                     .as_bigint()
                     .to_u16()
                     .ok_or_else(|| {
                         vm.new_value_error("number of bits invalid for bit field".to_string())
                     })?;
                 has_bitfield = true;

-                // Track bit_offset within storage unit
-                let type_bits = (size * 8) as u16;
-                if bitfield_bit_offset + bit_size > type_bits {
-                    bitfield_bit_offset = 0;
-                }
-                let bit_offset = bitfield_bit_offset;
-                bitfield_bit_offset += bit_size;
+                // Union: all fields start at offset 0, including bit offset
+                let bit_offset: u16 = 0;

                 PyCField::new_bitfield(
                     name.clone(),
                     field_type_ref,
                     0, // Union fields always at offset 0
                     size as isize,
                     bit_size,
                     bit_offset,
                     index,
                 )
             } else {
-                bitfield_bit_offset = 0;
                 PyCField::new(name.clone(), field_type_ref, 0, size as isize, index)
             };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/union.rs` around lines 206 - 315, The bug is that
bitfield_bit_offset is accumulated across union fields; inside the fields
iteration for unions (the loop handling fields, using bitfield_bit_offset and
creating PyCField via PyCField::new_bitfield), reset bitfield_bit_offset to 0
for each field before computing bit_size/bit_offset so each union bitfield
starts at bit 0; specifically update the branch that checks field_tuple.len() >
2 (the bitfield handling block that calls PyCField::new_bitfield) to always set
bitfield_bit_offset = 0 (or initialize a local bit_offset = 0) before
calculating type_bits, bit_offset, and incrementing bitfield_bit_offset,
ensuring union semantics are used when creating the PyCField.
🧹 Nitpick comments (4)
crates/vm/src/stdlib/ctypes/pointer.rs (1)

633-638: Negative index cast to usize in keep_ref produces a wrapped key.

setitem_by_index receives index: isize (negative indices are valid for pointers). Casting to usize on line 638 (and line 653) wraps negative values to large numbers. While this is only used as a dictionary key and won't corrupt data, it silently differs from CPython's behavior where negative indices are stored as-is. Consider using index directly as a signed string key.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/pointer.rs` around lines 633 - 638, The code
casts a potentially negative `index: isize` to `usize` when calling
`zelf.0.keep_ref(index as usize, ...)`, which wraps negative indices into large
unsigned values; instead preserve the signed index as the dictionary key by
converting `index` to a signed string (e.g. format!("{}", index)) and pass that
string key to the existing ref/store API. Update the `setitem_by_index` call
sites where `keep_ref` is invoked (the lines using `zelf.0.keep_ref(index as
usize, value.clone(), vm)`) to use the signed string key, keeping the
surrounding logic (calling `super::base::ensure_z_null_terminated`, assigning
`*(addr as *mut usize) = ptr_val`, and `zelf.0.keep_alive(kept_alive)`)
unchanged.
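Negative pointer indices being valid, as the comment above notes, is easy to confirm on standard CPython by pointing into the middle of an array:

```python
import ctypes

arr = (ctypes.c_int * 3)(10, 20, 30)

# Point at the middle element, then index in both directions
mid = ctypes.cast(
    ctypes.addressof(arr) + ctypes.sizeof(ctypes.c_int),
    ctypes.POINTER(ctypes.c_int),
)
assert mid[0] == 20
assert mid[-1] == 10   # negative pointer indices are valid in ctypes
assert mid[1] == 30
```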
crates/vm/src/stdlib/ctypes/base.rs (2)

999-1032: Error wrapping loses the original exception type.

When array construction fails (line 1014–1024), the original exception is discarded and replaced with a RuntimeError. This loses specificity — e.g., a TypeError from a bad element becomes a generic RuntimeError. Consider either re-raising the original or at least chaining it as the cause.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/base.rs` around lines 999 - 1032, The map_err
closure that wraps array construction failures (the closure passed to
proto_type.as_object().call(...).map_err(...)) is discarding the original
exception and returning a new RuntimeError; change it to preserve the original
exception type by propagating the original error (return Err(e) from the call
chain) or, if you must wrap, create the RuntimeError but attach the original
exception as its cause using the interpreter's exception-chaining API (set
__cause__ or the VM helper) so code that matches on the original exception class
still works; update the map_err closure around proto_type.as_object().call(...)
/ array_obj handling accordingly (refer to proto_type, array_obj, map_err, and
vm.new_runtime_error in the closure).

1876-1884: Legacy size encoding packs bitfield info into a single integer.

The legacy size getter encodes (bitfield_size << 16) | bit_offset for bitfields. This matches CPython's historical behavior but is fragile — ensure this is only used for backward compatibility and that new code uses the dedicated bit_size/bit_offset accessors instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/base.rs` around lines 1876 - 1884, The size()
getter currently packs bitfield info as ((bitfield_size << 16) | bit_offset)
when bitfield_size > 0; preserve that exact legacy encoding for backward
compatibility but mark it as legacy and steer new code to use the dedicated
accessors (e.g., bit_size and bit_offset) instead — update the size() doc
comment or add a #[deprecated] annotation on size() to document it is
fragile/legacy, ensure references to bitfield_size, bit_offset_val and
byte_size_val remain unchanged so the semantics stay identical, and add/adjust
tests to assert legacy behavior while preferring bit_size/bit_offset in new code
paths.
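The legacy packing noted above is simple shift-and-mask arithmetic; a worked sketch of the encoding and its inverse:

```python
# Legacy layout: high 16 bits hold the bit size, low 16 bits the bit offset
def encode(bit_size: int, bit_offset: int) -> int:
    return (bit_size << 16) | bit_offset

packed = encode(3, 5)
assert packed >> 16 == 3        # recover the bit size
assert packed & 0xFFFF == 5     # recover the bit offset
```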
crates/vm/src/stdlib/ctypes/union.rs (1)

420-467: Duplicated downcast-to-cls boilerplate across all four CDataType delegations.

from_address, from_buffer, from_buffer_copy, and in_dll all repeat the same zelf.downcast().map_err(...) pattern. Consider extracting a small helper (e.g., fn downcast_cls(zelf: PyObjectRef, vm: &VirtualMachine) -> PyResult<PyTypeRef>) to reduce repetition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/union.rs` around lines 420 - 467, All four
CDataType delegating methods (from_address, from_buffer, from_buffer_copy,
in_dll) duplicate the same downcast boilerplate; add a small helper fn
downcast_cls(zelf: PyObjectRef, vm: &VirtualMachine) -> PyResult<PyTypeRef> that
performs zelf.downcast().map_err(|_| vm.new_type_error("expected a type")), then
replace the repeated downcast code in each method with a single call to
downcast_cls and pass the returned PyTypeRef to the corresponding
PyCData::from_address / PyCData::from_buffer / PyCData::from_buffer_copy /
PyCData::in_dll calls to remove repetition.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/vm/src/stdlib/ctypes/simple.rs (2)

1273-1289: ⚠️ Potential issue | 🟡 Minor

Same z/Z asymmetry in set_value.

The z path (lines 1278–1279) stores the original value in objects and kept_alive in base. The Z path (line 1287) stores holder (the wchar buffer) in objects and doesn't touch base, losing the reference to the original PyStr.

Proposed fix
         } else if type_code == "Z"
             && let Some(s) = value.downcast_ref::<PyStr>()
         {
             let (holder, ptr) = super::base::str_to_wchar_bytes(s.as_str(), vm);
             *zelf.0.buffer.write() = alloc::borrow::Cow::Owned(ptr.to_ne_bytes().to_vec());
-            *zelf.0.objects.write() = Some(holder);
+            *zelf.0.objects.write() = Some(value);
+            *zelf.0.base.write() = Some(holder);
             return Ok(());
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/simple.rs` around lines 1273 - 1289, The Z branch
in set_value currently stores the wchar buffer holder in objects and doesn't set
base, dropping the original PyStr; change the Z handling to mirror the z path by
calling super::base::str_to_wchar_bytes(s.as_str(), vm) as you do now, assign
the byte buffer to zelf.0.buffer, then store the original PyStr (value) in
zelf.0.objects and store the returned holder in zelf.0.base (i.e., set objects =
Some(value) and base = Some(holder)) so both the original string and the
wchar-holder are kept alive; keep using the same buffer assignment with
ptr.to_ne_bytes().

1040-1057: ⚠️ Potential issue | 🟡 Minor

Asymmetric lifetime storage between z and Z in slot_new.

For z (line 1046–1048): the original PyBytes goes into objects and the null-terminated copy (kept_alive) goes into base.
For Z (line 1055): the wchar buffer (holder) goes into objects and base is not set.

This asymmetry means for Z, the original PyStr is not kept alive anywhere on the PyCData. In CPython, both the converted buffer and the original object are retained. If any code path later tries to retrieve the original string from objects, it will get the internal wchar buffer instead.

Consider aligning the Z path with the z pattern:

Proposed fix
             } else if _type_ == "Z"
                 && let Some(s) = v.downcast_ref::<PyStr>()
             {
                 let (holder, ptr) = super::base::str_to_wchar_bytes(s.as_str(), vm);
                 let buffer = ptr.to_ne_bytes().to_vec();
-                let cdata = PyCData::from_bytes(buffer, Some(holder));
+                let cdata = PyCData::from_bytes(buffer, Some(v.clone()));
+                *cdata.base.write() = Some(holder);
                 return PyCSimple(cdata).into_ref_with_type(vm, cls).map(Into::into);
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/simple.rs` around lines 1040 - 1057, In
slot_new's 'Z' branch you must mirror the 'z' lifetime handling: call
super::base::str_to_wchar_bytes(s.as_str(), vm) returns (holder, ptr) but
currently you pass holder into PyCData::from_bytes and never set cdata.base;
instead pass the original PyStr (s.clone()) as the "objects" argument to
PyCData::from_bytes and then store the wchar-holder into *cdata.base.write()
(like the z path does with kept_alive) so both the converted buffer (base) and
the original PyStr (objects) are retained.
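The symmetric lifetime behavior the two comments above ask for is what CPython exhibits for both the narrow (z) and wide (Z) cases: values assigned to c_char_p and c_wchar_p fields remain retrievable because the instance retains the backing objects.

```python
import ctypes

class S(ctypes.Structure):
    _fields_ = [("narrow", ctypes.c_char_p), ("wide", ctypes.c_wchar_p)]

s = S()
s.narrow = b"bytes value"
s.wide = "wide value"
# Both the narrow and the wide string must stay retrievable (and alive)
assert s.narrow == b"bytes value"
assert s.wide == "wide value"
```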
🧹 Nitpick comments (2)
crates/vm/src/stdlib/ctypes/union.rs (1)

420-467: Consider extracting the repeated zelf downcast into a helper.

All four delegated methods (from_address, from_buffer, from_buffer_copy, in_dll) duplicate the same zelf.downcast() + error mapping pattern.

♻️ Extract helper
impl PyCUnionType {
    fn downcast_cls(zelf: PyObjectRef, vm: &VirtualMachine) -> PyResult<PyTypeRef> {
        zelf.downcast()
            .map_err(|_| vm.new_type_error("expected a type"))
    }
}

Then each method becomes a one-liner delegation:

fn from_address(zelf: PyObjectRef, address: isize, vm: &VirtualMachine) -> PyResult {
    PyCData::from_address(Self::downcast_cls(zelf, vm)?, address, vm)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/union.rs` around lines 420 - 467, Extract the
repeated zelf.downcast() pattern into a small helper on PyCUnionType (e.g., fn
downcast_cls(zelf: PyObjectRef, vm: &VirtualMachine) -> PyResult<PyTypeRef>)
that performs zelf.downcast().map_err(|_| vm.new_type_error("expected a type")),
then update the four methods from_address, from_buffer, from_buffer_copy, and
in_dll to call that helper and forward its result to PyCData::from_address /
from_buffer / from_buffer_copy / in_dll respectively (using
Self::downcast_cls(zelf, vm)? to get the PyTypeRef).
crates/vm/src/stdlib/ctypes/simple.rs (1)

340-353: Inconsistent variable naming: kept_alive for z/P handlers vs holder for Z handlers.

The z and P code paths (lines 344, 398) were updated to use kept_alive, but the Z code paths (lines 369, 410) still use holder. Both variables serve the same lifetime-management purpose. While functionally correct, this inconsistency hurts readability.

♻️ Rename for consistency
 Some("Z") => {
     if let Some(s) = value.downcast_ref::<PyStr>() {
-        let (holder, ptr) = super::base::str_to_wchar_bytes(s.as_str(), vm);
+        let (kept_alive, ptr) = super::base::str_to_wchar_bytes(s.as_str(), vm);
         return Ok(CArgObject {
             tag: b'Z',
-            value: FfiArgValue::OwnedPointer(ptr, holder),
+            value: FfiArgValue::OwnedPointer(ptr, kept_alive),
             obj: value.clone(),

Apply the same rename at line 410 in the "P" handler's Z sub-path.

Also applies to: 366-378, 396-418

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/simple.rs` around lines 340 - 353, Handlers for
the c_char_p variants ("z", "P") use the variable name kept_alive while the
corresponding "Z" sub-paths use holder; rename the lifetime-management variable
in the "Z" branches to kept_alive for consistency. Locate the "Z" handler code
paths that call super::base::ensure_z_null_terminated (and the analogous
ensure_* helpers) and replace the local variable named holder with kept_alive,
then pass kept_alive into the CArgObject::OwnedPointer (and any
closures/returns) exactly where holder was used so the lifetime semantics and
types (used with CArgObject, FfiArgValue::OwnedPointer, and the surrounding
functions) remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/stdlib/ctypes/base.rs`:
- Around line 466-471: kept_refs currently uses a flat usize key so nested
fields/array elements can collide and drop live buffers; change kept_refs (the
PyRwLock<HashMap<usize, PyObjectRef>> field) to use the same hierarchical key
scheme used by keep_ref (e.g., a Vec<usize> or a composite key type that encodes
parent path + index) so keys include the full path to the field/element, update
all accesses (in methods that insert/remove from kept_refs) to build and use the
hierarchical key, and apply the same change to the other kept_refs usage sites
mentioned (the second occurrence following the keep_ref pattern).
- Around line 1003-1034: The temporary PyCArray created from tuple/list is
copied into the struct bytes but its element-level keep-alive references
(_objects) are not retained, causing dangling pointers; after successfully
downcasting to super::array::PyCArray and before returning, retain a reference
to the array object so its _objects survive (e.g., call the struct instance's
reference-retention helper such as self.keep_ref or equivalent API with the
field offset and array_obj) instead of returning immediately; ensure you pass
the original array_obj (or a clone) to the retention call after
write_bytes_at_offset so the PyCArray._objects remain live for the lifetime of
the parent struct.

In `@crates/vm/src/stdlib/ctypes/union.rs`:
- Around line 282-315: The union bitfield handling currently accumulates
bitfield_bit_offset across fields and passes that raw value to
PyCField::new_bitfield, which is wrong for unions and ignores endianness; update
the branch that handles the 3rd tuple element (bit_size) so that
bitfield_bit_offset is reset per field (do not persist or increment across union
fields) and compute bit_offset using is_swapped: for little-endian set
bit_offset = 0, for big-endian set bit_offset = (size * 8) as u16 - bit_size
(convert to the expected type), then pass that computed bit_offset to
PyCField::new_bitfield (keeping the field offset as 0 and preserving
has_bitfield and size handling).

---

Outside diff comments:
In `@crates/vm/src/stdlib/ctypes/simple.rs`:
- Around line 1273-1289: The Z branch in set_value currently stores the wchar
buffer holder in objects and doesn't set base, dropping the original PyStr;
change the Z handling to mirror the z path by calling
super::base::str_to_wchar_bytes(s.as_str(), vm) as you do now, assign the byte
buffer to zelf.0.buffer, then store the original PyStr (value) in zelf.0.objects
and store the returned holder in zelf.0.base (i.e., set objects = Some(value)
and base = Some(holder)) so both the original string and the wchar-holder are
kept alive; keep using the same buffer assignment with ptr.to_ne_bytes().
- Around line 1040-1057: In slot_new's 'Z' branch you must mirror the 'z'
lifetime handling: call super::base::str_to_wchar_bytes(s.as_str(), vm) returns
(holder, ptr) but currently you pass holder into PyCData::from_bytes and never
set cdata.base; instead pass the original PyStr (s.clone()) as the "objects"
argument to PyCData::from_bytes and then store the wchar-holder into
*cdata.base.write() (like the z path does with kept_alive) so both the converted
buffer (base) and the original PyStr (objects) are retained.

---

Nitpick comments:
In `@crates/vm/src/stdlib/ctypes/simple.rs`:
- Around line 340-353: Handlers for the c_char_p variants ("z", "P") use the
variable name kept_alive while the corresponding "Z" sub-paths use holder;
rename the lifetime-management variable in the "Z" branches to kept_alive for
consistency. Locate the "Z" handler code paths that call
super::base::ensure_z_null_terminated (and the analogous ensure_* helpers) and
replace the local variable named holder with kept_alive, then pass kept_alive
into the CArgObject::OwnedPointer (and any closures/returns) exactly where
holder was used so the lifetime semantics and types (used with CArgObject,
FfiArgValue::OwnedPointer, and the surrounding functions) remain unchanged.

In `@crates/vm/src/stdlib/ctypes/union.rs`:
- Around line 420-467: Extract the repeated zelf.downcast() pattern into a small
helper on PyCUnionType (e.g., fn downcast_cls(zelf: PyObjectRef, vm:
&VirtualMachine) -> PyResult<PyTypeRef>) that performs
zelf.downcast().map_err(|_| vm.new_type_error("expected a type")), then update
the four methods from_address, from_buffer, from_buffer_copy, and in_dll to call
that helper and forward its result to PyCData::from_address / from_buffer /
from_buffer_copy / in_dll respectively (using Self::downcast_cls(zelf, vm)? to
get the PyTypeRef).
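A minimal sketch of the proposed refactor, with `PyObj` and a plain `String` class name standing in for PyObjectRef/PyTypeRef and the error type simplified; only the shape of the extraction is meant to match the suggestion.

```rust
// Stand-in for PyObjectRef/PyTypeRef; the shape of the refactor is what
// matters, not these definitions.
#[derive(Clone, Debug, PartialEq)]
enum PyObj {
    Type(String), // a type object carrying its name
    Other,        // anything that is not a type
}

fn downcast_cls(zelf: PyObj) -> Result<String, String> {
    match zelf {
        PyObj::Type(name) => Ok(name),
        PyObj::Other => Err("expected a type".to_string()),
    }
}

// Each of from_address / from_buffer / from_buffer_copy / in_dll then
// shrinks to a one-liner that forwards the downcast result:
fn from_address(zelf: PyObj, addr: usize) -> Result<String, String> {
    let cls = downcast_cls(zelf)?; // shared "expected a type" handling
    Ok(format!("{cls}@{addr:#x}"))
}
```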

CPython Developers and others added 3 commits February 18, 2026 03:09
- Add __pointer_type__ getter/setter and StgInfo cache
- Add kept_refs (HashMap-keyed) for c_char_p lifetime
- Simplify ensure_z_null_terminated to 2-value return
- Add RawMemoryBuffer for zero-copy memoryview_at_addr
- Add bitfield support in PyCField with validation
- Add CArgObject offset handling in extract_ptr_from_arg
- Add deprecation warning for _pack_ without _layout_
- Fix comparison TypeError message format
- Fix unsupported binop error message format
- Mark failing CI tests as expectedFailure/skip
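The pointer-type caching in the first bullet can be modeled as a toy keyed cache: repeated requests for a pointer to the same inner type return the same cached entry instead of minting a new type. Names and the `usize` "type id" are illustrative; the real cache lives on StgInfo inside RustPython.

```rust
use std::collections::HashMap;

// Toy model of pointer-type caching; not RustPython's actual StgInfo code.
struct PointerTypeCache {
    cache: HashMap<String, usize>, // inner type name -> pointer-type id
    next_id: usize,
}

impl PointerTypeCache {
    fn new() -> Self {
        Self { cache: HashMap::new(), next_id: 0 }
    }

    fn pointer_type(&mut self, inner: &str) -> usize {
        if let Some(&id) = self.cache.get(inner) {
            return id; // cache hit: same pointer type every time
        }
        let id = self.next_id;
        self.next_id += 1;
        self.cache.insert(inner.to_string(), id);
        id
    }
}
```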
@github-actions

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] lib: cpython/Lib/ctypes
[ ] test: cpython/Lib/test/test_ctypes (TODO: 11)
[x] test: cpython/Lib/test/test_stable_abi_ctypes.py

dependencies:

  • ctypes

dependent tests: (1 tests)

  • ctypes: test_ctypes

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
crates/vm/src/stdlib/ctypes/union.rs (1)

327-334: Circular self-reference detection relies on field finalization side-effect — subtle but correct.

The check works because line 265 sets DICTFLAG_FINAL on each field's type during iteration. If cls appears as its own field, it gets finalized there, and this check catches it before the new StgInfo is written at line 337. Worth a brief inline comment explaining this mechanism for future maintainers.

💡 Suggested comment
         // Check for circular self-reference
+        // If `cls` appeared as one of its own fields, the loop above (line 264-265)
+        // will have set DICTFLAG_FINAL on cls's StgInfo, which we detect here.
         if let Some(stg_info) = cls.get_type_data::<StgInfo>()
             && stg_info.is_final()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/ctypes/union.rs` around lines 327 - 334, The circular
self-reference check using cls.get_type_data::<StgInfo>() and
stg_info.is_final() relies on the fact that each field's type is marked final
(DICTFLAG_FINAL) earlier during field iteration, so if cls appears as its own
field it will already be finalized and detected here; add a short inline comment
above this if-block (referencing StgInfo, DICTFLAG_FINAL and the
field-finalization step) noting that the detection depends on that side-effect
and that the new StgInfo is written later, to help future maintainers understand
the ordering dependency.
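The side-effect the comment describes can be modeled with a plain set: marking each field's type "final" while iterating is exactly what lets a self-referential class be caught afterwards. A toy sketch, standing in for the real StgInfo/DICTFLAG_FINAL bookkeeping:

```rust
use std::collections::HashSet;

// Toy model: each field's type is marked "final" during iteration (the
// side-effect), so a class listing itself as a field is already marked
// by the time the final check runs.
fn detect_self_reference(cls: &str, fields: &[&str]) -> bool {
    let mut finalized: HashSet<&str> = HashSet::new();
    for &field_ty in fields {
        finalized.insert(field_ty); // mirrors setting DICTFLAG_FINAL per field
    }
    finalized.contains(cls) // true iff cls appeared among its own fields
}
```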

@youknowone youknowone merged commit 8e70048 into RustPython:main Feb 17, 2026
14 checks passed
@youknowone youknowone deleted the ctypes branch February 17, 2026 23:21