Update PyO3 to 0.29 & enable c-api tests#8000
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThe PR updates RustPython’s C-API configuration, adds new exported functions and modules, extends ctypes return handling, and enables disabled Rust tests across multiple C-API modules. ChangesRustPython C-API surface expansion and test activation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/ctypes dependencies:
dependent tests: (34 tests)
Legend:
|
43529c0 to
a46eb71
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
Cargo.toml (1)
122-122: ⚡ Quick winPin PyO3 git
revinCargo.toml
Cargo.lockalready resolves all PyO3 crates (pyo3,pyo3-ffi,pyo3-build-config,pyo3-macros,pyo3-macros-backend) to the same commit (de5816ada27bfa38cb3176d91bd6c93e206dcb3f), butCargo.tomlstill uses bare git URLs for bothpyo3andpyo3-ffi. Addrev = "de5816ada27bfa38cb3176d91bd6c93e206dcb3f"to both entries to make the manifest explicitly reproducible.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Cargo.toml` at line 122, Update the Cargo.toml git dependencies for pyo3 and pyo3-ffi to pin them to the resolved commit: add rev = "de5816ada27bfa38cb3176d91bd6c93e206dcb3f" to the pyo3 = { git = "https://github.com/PyO3/pyo3" } and pyo3-ffi = { git = "https://github.com/PyO3/pyo3" } entries so the manifest explicitly matches the commit resolved in Cargo.lock.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/capi/src/import.rs`:
- Around line 24-31: The current logic collapses all sys_modules.get_item
failures into creating a new module; instead, explicitly handle each case: call
sys_modules.get_item(name, vm) and if it returns Err, propagate or return that
error rather than creating a module; if it returns Ok(Some(value)) but
value.downcast_ref::<PyModule>() is None, return a type/mismatch error instead
of overwriting sys.modules; only when get_item returns Ok(None) should you call
vm.new_module(name, vm.ctx.new_dict(), None) and sys_modules.set_item(name,
module.clone().into(), vm)?; update the block around sys_modules.get_item,
downcast_ref::<PyModule>(), vm.new_module, and sys_modules.set_item accordingly.
- Around line 16-18: The extern "C" function PyImport_AddModuleRef currently
calls CStr::from_ptr(name).to_str().expect(...), which can panic on invalid
UTF-8; change it to return an FfiResult error instead of panicking by replacing
the expect with a to_str().map_err(|_| vm.new_system_error("Name is not valid
UTF-8"))? expression (using the existing with_vm/FfiResult flow) so invalid
UTF-8 is converted into a Python SystemError via vm.new_system_error rather than
causing a panic.
In `@crates/capi/src/object.rs`:
- Around line 205-227: PyObject_GenericGetDict and PyObject_GenericSetDict
incorrectly call obj.get_attr / obj.set_attr which go through attribute
machinery and allow __getattribute__/descriptors to intercept; instead wire
these C-API functions to the VM helpers that bypass attribute hooks. Replace the
calls in PyObject_GenericGetDict and PyObject_GenericSetDict to use the direct
helpers (object_get_dict and object_generic_set_dict from
crates/vm/src/builtins/object.rs) that read/write the underlying instance dict
(obj.dict()/obj.instance_dict()) without routing through get_attr/set_attr so
the real instance dict is returned/updated.
In `@crates/capi/src/pyerrors.rs`:
- Around line 303-308: PyException_SetCause currently clones the borrowed raw
pointer (using to_owned()) which violates CPython’s reference-stealing semantics
and leaks/increfs; change it to take ownership of the incoming raw pointer for
`cause` (use the equivalent of a from_raw/from_owned conversion on the raw `*mut
PyObject`) so the caller’s reference is consumed, map NULL to Python None as
before for clearing, and then pass that owned PyBaseException into
`exc.set___cause__` instead of cloning; ensure no extra incref remains and the
raw pointer is not used after ownership is transferred.
In `@crates/capi/src/pylifecycle.rs`:
- Around line 61-63: The function Py_GetVersion currently returns String::as_ptr
which may lack a terminating NUL; change the static LazyLock to hold a CString
(e.g. static VERSION: LazyLock<CString> = LazyLock::new(||
CString::new(format!("{MAJOR}.{MINOR}.{MICRO}")).unwrap());) and return
VERSION.as_ptr() as *const c_char so callers get a stable, NUL-terminated C
string from Py_GetVersion. Ensure you import std::ffi::CString and keep the
LazyLock so the CString lives for the program duration.
---
Nitpick comments:
In `@Cargo.toml`:
- Line 122: Update the Cargo.toml git dependencies for pyo3 and pyo3-ffi to pin
them to the resolved commit: add rev =
"de5816ada27bfa38cb3176d91bd6c93e206dcb3f" to the pyo3 = { git =
"https://github.com/PyO3/pyo3" } and pyo3-ffi = { git =
"https://github.com/PyO3/pyo3" } entries so the manifest explicitly matches the
commit resolved in Cargo.lock.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 4e562741-82e6-4a0e-ad6f-0b276776160f
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.lockLib/test/test_ctypes/test_python_api.pyis excluded by!Lib/**
📒 Files selected for processing (22)
Cargo.tomlcrates/capi/pyo3-rustpython.configcrates/capi/src/abstract_.rscrates/capi/src/bytesobject.rscrates/capi/src/ceval.rscrates/capi/src/complexobject.rscrates/capi/src/dictobject.rscrates/capi/src/floatobject.rscrates/capi/src/import.rscrates/capi/src/lib.rscrates/capi/src/listobject.rscrates/capi/src/longobject.rscrates/capi/src/methodobject.rscrates/capi/src/moduleobject.rscrates/capi/src/object.rscrates/capi/src/pycapsule.rscrates/capi/src/pyerrors.rscrates/capi/src/pylifecycle.rscrates/capi/src/refcount.rscrates/capi/src/tupleobject.rscrates/capi/src/unicodeobject.rscrates/vm/src/stdlib/builtins.rs
07d29f9 to
2149dcb
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/vm/src/stdlib/_ctypes/function.rs`:
- Around line 1435-1455: The code incorrectly maps a NULL py_object to
Some(None) and also leaks a refcount by using ManuallyDrop + clone; update the
convert_raw_result branch that handles restype_type "_type_" == "O" so that when
ptr == 0 you raise a ValueError("PyObject is NULL") on the vm and return None
(propagating an error), and when ptr != 0 construct the PyObjectRef by moving
ownership with PyObjectRef::from_raw(NonNull::new_unchecked(ptr as *mut
PyObject)) and return Some(obj) directly (do NOT wrap in ManuallyDrop or clone),
referencing the restype_type check, RawResult variants, and
PyObjectRef::from_raw to locate and fix the code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: bd14ff21-a37e-48fe-bff9-21046b4badbd
📒 Files selected for processing (2)
.github/workflows/ci.yamlcrates/vm/src/stdlib/_ctypes/function.rs
|
CI failures seem unrelated? |
ShaharNaveh
left a comment
There was a problem hiding this comment.
TYSM for all the work on this!
just a minor request:)
26b96e3 to
e77a566
Compare
25a2860 to
b8b775a
Compare
b8b775a to
04b8b1a
Compare
0.29 & enable c-api tests
| implementation=CPython | ||
| version=3.14 | ||
| implementation=RustPython | ||
| version=3.15 |
There was a problem hiding this comment.
abi3t is only supported from 3.15 onwards. This does not require actual 3.15 runtime support.
|
@coderabbitai resume |
✅ Action performedReviews resumed. |
04b8b1a to
901b388
Compare
|
@youknowone This is ready for review |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/vm/src/builtins/int.rs (1)
318-334: 💤 Low valueConsider fixing
as_u32_maskto match the correct pattern inas_u64_mask.The loop
out = out.wrapping_shl(32) | digithas a subtle issue: foru32,wrapping_shl(32)shifts by32 % 32 = 0bits (per Rust semantics), so it effectively ORs all digits together rather than extracting only the lower 32 bits.The new
as_u64_maskcorrectly takes only the first N digits. Consider applying the same approach here:Suggested fix
pub fn as_u32_mask(&self) -> u32 { let v = self.as_bigint(); v.to_u32() .or_else(|| v.to_i32().map(|i| i as u32)) .unwrap_or_else(|| { - let mut out = 0u32; - for digit in v.iter_u32_digits() { - out = out.wrapping_shl(32) | digit; - } + let out = v.iter_u32_digits().next().unwrap_or(0); match v.sign() { - Sign::Minus => out * -1i32 as u32, + Sign::Minus => out.wrapping_neg(), _ => out, } }) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/vm/src/builtins/int.rs` around lines 318 - 334, The loop in the `as_u32_mask` method has a bug where `wrapping_shl(32)` on a u32 shifts by 0 bits (since 32 % 32 = 0 in Rust), so it ORs all digits together rather than extracting only the lower 32 bits. Fix this by applying the same pattern used in `as_u64_mask`, which correctly takes only the first N digits needed. Replace the loop logic to extract just the lower 32 bits from the bigint digits instead of accumulating all digits with an ineffective shift operation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/vm/src/builtins/int.rs`:
- Around line 318-334: The loop in the `as_u32_mask` method has a bug where
`wrapping_shl(32)` on a u32 shifts by 0 bits (since 32 % 32 = 0 in Rust), so it
ORs all digits together rather than extracting only the lower 32 bits. Fix this
by applying the same pattern used in `as_u64_mask`, which correctly takes only
the first N digits needed. Replace the loop logic to extract just the lower 32
bits from the bigint digits instead of accumulating all digits with an
ineffective shift operation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 7decf53d-4d5a-46fc-b2b9-f48922df202c
📒 Files selected for processing (2)
crates/capi/src/longobject.rscrates/vm/src/builtins/int.rs
ShaharNaveh
left a comment
There was a problem hiding this comment.
The capi came out really clean. I like it a lot!
Thanks for all the work you've put into it<3
2ff30a0 to
1c0fc55
Compare
|
@bschoenmaeckers is this ready to go? |
Yes this is ready to merge! |
|
@youknowone can you take a look at this? |
1c0fc55 to
46e5005
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/capi/src/dictobject.rs`:
- Around line 77-86: The current implementation performs double-hashing by
calling inner_getitem_opt() and then inner_setitem() separately on the dict
object. Replace this approach with dict.entries.setdefault_entry() which
performs a single hash evaluation for both lookup and insertion operations. Pass
a closure to setdefault_entry() that wraps the default_value, and handle the
returned tuple containing both the final value and the key. This eliminates the
double-hashing and preserves the atomicity contract of the C API, which is
important for keys with observable side effects in __hash__ or __eq__.
In `@crates/capi/src/pystate.rs`:
- Around line 56-63: The PyInterpreterState_Get function currently uses
get_main_interpreter() which always returns the main interpreter regardless of
the calling thread, violating the CPython specification that requires returning
the current thread's interpreter. Replace the implementation to use
with_current_vm() instead of get_main_interpreter(), which will properly access
the current thread's interpreter state and automatically fatal-error (panic)
when no thread is attached to a VM, as required by the CPython API
specification.
- Around line 66-72: The `PyInterpreterState_GetID` function currently returns a
derived value from the pointer address (`interp as usize as i64`), which
violates the Python C API contract that requires a unique interpreter-owned ID
and can cause overflow issues when pointer addresses exceed i64 max value. To
fix this, add a unique ID field to the `Interpreter` struct, assign each
interpreter a stable and unique ID when it is created, and modify the
`PyInterpreterState_GetID` function to retrieve and return this ID field from
the interpreter instead of deriving it from the pointer address.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: a3750c8e-1626-4eed-be6a-e38fc864cc7a
📒 Files selected for processing (40)
.cspell.dict/cpython.txt.github/workflows/ci.yaml.github/workflows/update-caches.ymlcrates/capi/Cargo.tomlcrates/capi/pyo3-rustpython.configcrates/capi/src/abstract_.rscrates/capi/src/abstract_/iter.rscrates/capi/src/abstract_/mapping.rscrates/capi/src/abstract_/number.rscrates/capi/src/abstract_/sequence.rscrates/capi/src/bytearrayobject.rscrates/capi/src/bytesobject.rscrates/capi/src/ceval.rscrates/capi/src/complexobject.rscrates/capi/src/critical_section.rscrates/capi/src/descrobject.rscrates/capi/src/dictobject.rscrates/capi/src/floatobject.rscrates/capi/src/genericaliasobject.rscrates/capi/src/import.rscrates/capi/src/lib.rscrates/capi/src/listobject.rscrates/capi/src/longobject.rscrates/capi/src/methodobject.rscrates/capi/src/object.rscrates/capi/src/osmodule.rscrates/capi/src/pycapsule.rscrates/capi/src/pyerrors.rscrates/capi/src/pylifecycle.rscrates/capi/src/pystate.rscrates/capi/src/refcount.rscrates/capi/src/setobject.rscrates/capi/src/sliceobject.rscrates/capi/src/tupleobject.rscrates/capi/src/unicodeobject.rscrates/capi/src/util.rscrates/capi/src/warnings.rscrates/capi/src/weakrefobject.rscrates/vm/src/builtins/int.rscrates/vm/src/stdlib/_ctypes/function.rs
✅ Files skipped from review due to trivial changes (14)
- .cspell.dict/cpython.txt
- .github/workflows/update-caches.yml
- crates/capi/src/warnings.rs
- crates/capi/src/bytearrayobject.rs
- crates/capi/src/abstract_/iter.rs
- crates/capi/src/abstract_/mapping.rs
- crates/capi/src/bytesobject.rs
- crates/capi/src/unicodeobject.rs
- crates/capi/src/setobject.rs
- .github/workflows/ci.yaml
- crates/capi/src/weakrefobject.rs
- crates/capi/src/floatobject.rs
- crates/capi/src/listobject.rs
- crates/capi/src/abstract_.rs
🚧 Files skipped from review as they are similar to previous changes (24)
- crates/capi/pyo3-rustpython.config
- crates/capi/src/sliceobject.rs
- crates/capi/Cargo.toml
- crates/capi/src/complexobject.rs
- crates/capi/src/pycapsule.rs
- crates/capi/src/tupleobject.rs
- crates/capi/src/critical_section.rs
- crates/capi/src/object.rs
- crates/capi/src/util.rs
- crates/capi/src/abstract_/number.rs
- crates/capi/src/descrobject.rs
- crates/capi/src/genericaliasobject.rs
- crates/capi/src/osmodule.rs
- crates/capi/src/abstract_/sequence.rs
- crates/capi/src/lib.rs
- crates/capi/src/ceval.rs
- crates/vm/src/builtins/int.rs
- crates/capi/src/methodobject.rs
- crates/capi/src/refcount.rs
- crates/vm/src/stdlib/_ctypes/function.rs
- crates/capi/src/pylifecycle.rs
- crates/capi/src/import.rs
- crates/capi/src/longobject.rs
- crates/capi/src/pyerrors.rs
| if let Some(value) = dict.inner_getitem_opt(key, vm)? { | ||
| if let Some(result) = result { | ||
| unsafe { | ||
| result.write(value.into_raw().as_ptr()); | ||
| } | ||
| } | ||
| Ok(true) | ||
| } else { | ||
| let value = unsafe { &*default_value }.to_owned(); | ||
| dict.inner_setitem(key, value.clone(), vm)?; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd crates/capi/src && wc -l dictobject.rsRepository: RustPython/RustPython
Length of output: 83
🏁 Script executed:
cd crates/capi/src && sed -n '60,100p' dictobject.rsRepository: RustPython/RustPython
Length of output: 1332
🏁 Script executed:
# Search for setdefault or entry patterns in dict implementation
fd -e rs | xargs rg -l "setdefault|entry" | head -5Repository: RustPython/RustPython
Length of output: 182
🏁 Script executed:
# Look for the dict module and check for setdefault implementations
fd -e rs | xargs rg -A 5 "fn setdefault" | head -50Repository: RustPython/RustPython
Length of output: 1597
🏁 Script executed:
# Get full context of setdefault_entry
fd -e rs dict_inner.rs | xargs sed -n '1,500p' | grep -A 30 "pub(crate) fn setdefault_entry"Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search in dict_inner.rs for the complete setdefault_entry and setdefault implementations
rg -A 20 "pub\(crate\) fn setdefault" crates/vm/src/dict_inner.rsRepository: RustPython/RustPython
Length of output: 1599
🏁 Script executed:
# Check the PyDict struct and see what methods are available
rg -B 5 -A 15 "impl PyDict" crates/pyobject/src/py_dict.rs | head -100Repository: RustPython/RustPython
Length of output: 136
🏁 Script executed:
# Look for inner_getitem_opt and inner_setitem in PyDict to understand the current approach
rg "inner_getitem_opt|inner_setitem" crates/pyobject/src/py_dict.rs | head -20Repository: RustPython/RustPython
Length of output: 136
🏁 Script executed:
# Find where PyDict is defined
fd -e rs | xargs rg "struct PyDict" | head -10Repository: RustPython/RustPython
Length of output: 116
🏁 Script executed:
# Look at the full setdefault_entry implementation to see what it returns
rg -A 30 "pub\(crate\) fn setdefault_entry" crates/vm/src/dict_inner.rsRepository: RustPython/RustPython
Length of output: 1245
🏁 Script executed:
# Check what dict module/PyDict implementations look like
fd -e rs | xargs rg "impl.*PyDict" | head -20Repository: RustPython/RustPython
Length of output: 1244
🏁 Script executed:
# Find where inner_getitem_opt and inner_setitem are defined for PyDict
rg "fn inner_getitem_opt|fn inner_setitem" crates --type rs | head -20Repository: RustPython/RustPython
Length of output: 92
🏁 Script executed:
# Search in builtins/dict.rs for inner_getitem_opt and inner_setitem
rg "fn inner_getitem_opt|fn inner_setitem" crates/vm/src/builtins/dict.rsRepository: RustPython/RustPython
Length of output: 163
🏁 Script executed:
# Get the full view of how PyDict methods work
sed -n '1,100p' crates/vm/src/builtins/dict.rsRepository: RustPython/RustPython
Length of output: 3266
🏁 Script executed:
# Check what the setdefault implementation in PyDict looks like
rg -B 5 -A 20 "fn setdefault" crates/vm/src/builtins/dict.rsRepository: RustPython/RustPython
Length of output: 609
🏁 Script executed:
# Look for how inner_getitem_opt delegates and what it calls
rg -A 10 "inner_getitem_opt" crates/vm/src/builtins/dict.rs | head -50Repository: RustPython/RustPython
Length of output: 296
🏁 Script executed:
# See what inner_setitem actually does
rg -A 15 "pub fn inner_setitem" crates/vm/src/builtins/dict.rsRepository: RustPython/RustPython
Length of output: 458
🏁 Script executed:
# Verify that inner_getitem_opt and inner_setitem each trigger separate dict operations
rg -B 2 -A 8 "pub fn inner_getitem_opt" crates/vm/src/builtins/dict.rsRepository: RustPython/RustPython
Length of output: 268
🏁 Script executed:
# Check the full setdefault_entry to see if we can adapt it for the C API use case
rg -A 40 "pub\(crate\) fn setdefault_entry" crates/vm/src/dict_inner.rs | head -60Repository: RustPython/RustPython
Length of output: 1439
Use dict_inner::Dict::setdefault_entry to avoid double-hashing the key.
The current implementation calls inner_getitem_opt() and then inner_setitem(), which performs two separate dictionary operations. dict_inner::Dict::setdefault_entry() exists and performs a single hash evaluation (let hash = key.key_hash(vm)?;) for both lookup and insertion. This matters for keys with observable __hash__/__eq__ side effects and preserves the atomicity contract of the C API. Route through dict.entries.setdefault_entry() with a closure that wraps the default value, which returns (PyObjectRef, PyObjectRef) containing both the final value and the key for error handling.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/capi/src/dictobject.rs` around lines 77 - 86, The current
implementation performs double-hashing by calling inner_getitem_opt() and then
inner_setitem() separately on the dict object. Replace this approach with
dict.entries.setdefault_entry() which performs a single hash evaluation for both
lookup and insertion operations. Pass a closure to setdefault_entry() that wraps
the default_value, and handle the returned tuple containing both the final value
and the key. This eliminates the double-hashing and preserves the atomicity
contract of the C API, which is important for keys with observable side effects
in __hash__ or __eq__.
46e5005 to
bee8a2c
Compare
|
Could you please rebase this branch onto a fresh copy of |
c46a8a5 to
341d490
Compare
Enable `abi3t` support in PyO3 Add left over c-api functions Fix PyLong_AsUnsignedLongLongMask Update pyo3 to `0.29` Add `PyInt::as_u64_mask` Fix `PyInt::as_u32_mask`
341d490 to
570176c
Compare
Summary by CodeRabbit
os.fspathsupport, critical-section stubs, plus additional numeric and long-mask helpers.ctypeshandling for"O"return types.cargo testruns.