basic hostenv impl#7604
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new rustpython-host_env crate that centralizes broad cross-platform OS abstractions (ctypes, fcntl/posix, mmap, sockets, select/epoll, signal, time, termios, locale, faulthandler, multiprocessing, Windows APIs, etc.), updates Cargo manifests, and rewires stdlib/vm to delegate platform-specific behavior to host_env, including wasm32/non‑WASI unsupported stubs. ChangesHost environment crate + stdlib/VM wiring
Sequence Diagram(s)sequenceDiagram
participant Stdlib as Stdlib (crate)
participant VM as VM runtime
participant Host as rustpython-host_env
participant OS as OS / libc / Win32
Stdlib->>VM: Python syscall wrapper (e.g., mmap, socket, termios)
VM->>Host: call host_* API (host_mmap/map_file, host_socket/recvmsg, host_termios/tcgetattr)
Host->>OS: platform syscall/FFI (libc/Win32/rustix/rustix-epoll)
OS-->>Host: syscall result / handle / error
Host-->>VM: normalized result or io::Error
VM-->>Stdlib: raise/return Python-level value
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 9
🧹 Nitpick comments (3)
crates/host_env/src/msvcrt.rs (2)
58-60:chartou16cast silently truncates supplementary plane characters.Characters above U+FFFF (supplementary planes) will be truncated when cast to
u16. While this matches Windows' 16-bitwchar_tlimitation, the silent data loss could be surprising.📝 Consider documenting or validating BMP range
pub fn putwch(c: char) { + // Note: Windows wchar_t is 16-bit; only BMP characters (U+0000-U+FFFF) supported + debug_assert!(c as u32 <= 0xFFFF, "putwch only supports BMP characters"); unsafe { suppress_iph!(_putwch(c as u16)) }; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/msvcrt.rs` around lines 58 - 60, The putwch function silently truncates non-BMP characters when casting char to u16; locate putwch (and its call to suppress_iph!(_putwch(...))) and add validation or documentation: check if c as u32 <= 0xFFFF before calling _putwch, and if not either map to a replacement (e.g. '\u{FFFD}') or return/propagate an error/log a warning (choose consistent behavior with your crate's error handling); alternatively add a clear doc comment on putwch stating it accepts only BMP characters and that callers must handle supplementary-plane code points themselves.
36-52: Panic on invalid Unicode could be problematic in edge cases.
char::from_u32will returnNonefor surrogate code units (0xD800-0xDFFF). If_getwch/_getwchereturns a surrogate (which shouldn't happen for console input, but could occur with unusual input methods), this will panic.Consider returning a
Resultor using a replacement character instead of panicking.♻️ Alternative: Return Result or use replacement char
pub fn getwch() -> String { let value = unsafe { _getwch() }; - char::from_u32(value) - .unwrap_or_else(|| panic!("invalid unicode {value:`#x`} from _getwch")) - .to_string() + char::from_u32(value) + .unwrap_or(char::REPLACEMENT_CHARACTER) + .to_string() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/msvcrt.rs` around lines 36 - 52, The current implementations of getwch and getwche panic if char::from_u32 returns None (surrogate/invalid codepoints); to avoid this, stop panicking and normalize invalid values to the Unicode replacement character instead so the API stays unchanged: in getwch and getwche replace the unwrap_or_else panic path with a fallback to '\u{FFFD}' before .to_string(); keep getche as-is (it returns raw bytes). Ensure you remove the panic messages and any panic-causing closures in functions getwch and getwche so invalid input yields U+FFFD rather than aborting.crates/host_env/src/fcntl.rs (1)
57-75: Function namedlockfusesflock-style flags, which could cause confusion.The function is named
lockfbut accepts acmdparameter that's checked againstlibc::LOCK_NB(aflock-style flag). POSIXlockf()uses different command constants (F_LOCK,F_ULOCK,F_TLOCK,F_TEST), whileLOCK_NBis fromflock().This conflation of APIs could confuse callers. Consider renaming to
fcntl_lockor documenting that this expects flock-style flags.📝 Suggested rename for clarity
+/// Performs file locking via fcntl using flock-style flags. +/// If `cmd` includes `LOCK_NB`, uses non-blocking `F_SETLK`; otherwise uses blocking `F_SETLKW`. #[cfg(not(any(target_os = "wasi", target_os = "redox")))] -pub fn lockf(fd: i32, cmd: i32, lock: &libc::flock) -> io::Result<i32> { +pub fn fcntl_lock(fd: i32, cmd: i32, lock: &libc::flock) -> io::Result<i32> {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/fcntl.rs` around lines 57 - 75, The function lockf currently mixes flock-style flags (checks libc::LOCK_NB on the cmd parameter) with fcntl operations (using F_SETLK/F_SETLKW), which is confusing and potentially incorrect; either rename the function to something like fcntl_lock (or fcntl_lockf) and update its docstring to state it expects flock-style nonblocking flag semantics, or change the cmd handling to accept POSIX lockf commands (F_LOCK/F_TLOCK/F_ULOCK/F_TEST) and map them appropriately to F_SETLK/F_SETLKW; locate the function named lockf and its cmd parameter and adjust the name and/or the flag checks and documentation accordingly so callers aren’t misled by the mixed APIs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/host_env/src/select.rs`:
- Around line 171-176: sec_to_timeval can produce negative tv_usec when sec is
negative which yields invalid timeval for select(); update sec_to_timeval so
negative timeouts are handled by clamping to a non-negative timeval (e.g., if
sec <= 0.0 return timeval { tv_sec: 0, tv_usec: 0 }), otherwise compute tv_sec =
sec.trunc() as _ and tv_usec = (sec.fract() * 1e6) as _ (ensuring tv_usec is
non-negative and fits the timeval field); reference sec_to_timeval and timeval
(and select()) when making the change.
In `@crates/host_env/src/shm.rs`:
- Around line 5-6: Replace the infallible conversion using try_into().unwrap()
with a direct cast on FreeBSD: change the conversion of mode in the
#[cfg(target_os = "freebsd")] branch so that the local variable `mode` is cast
to the target type (since `libc::c_uint` and `libc::mode_t` are both u32 on
FreeBSD) rather than calling `try_into().unwrap()`, updating the `mode` binding
used later in functions in this module (shm.rs) accordingly.
In `@crates/host_env/src/syslog.rs`:
- Around line 62-64: The implementation of pub const fn log_mask(pri: i32)
shifts the wrong operand; change the shift to compute 1 << pri (not pri << 1) so
LOG_MASK(0) yields 1 and all masks are correct. Update log_mask to perform the
shift with a literal one of the correct integer type (e.g., 1i32 << pri or (1 as
i32) << pri) and keep the function signature and const semantics intact. Ensure
the return type stays i32 and handle pri as-is (no additional checks needed
here).
- Around line 35-43: syslog() and closelog() have race conditions: ensure
syslog() holds the global_ident read lock while calling libc::syslog() so the
ident Box<CStr> cannot be freed concurrently, and change closelog() to acquire
the global_ident write lock and perform the is_open() check and drop the ident
while holding that write lock (preventing an openlog() from racing in between).
Also correct log_mask() to use the POSIX LOG_MASK formula (return 1 << pri, not
pri << 1). Update references: openlog (stores GlobalIdent), syslog
(acquire/read-lock around libc::syslog call), closelog (do is_open check and
drop under write lock), global_ident (use its locks accordingly), and log_mask
(use 1 << pri).
In `@crates/host_env/src/time.rs`:
- Around line 24-27: get_tz_info currently ignores GetTimeZoneInformation's
return value and returns a zeroed TIME_ZONE_INFORMATION on failure; change
get_tz_info to return a
Result<windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION, std::io::Error>
(or similar error type), call GetTimeZoneInformation and check its return value
against TIME_ZONE_ID_INVALID, and on failure call GetLastError to construct and
return an Err(std::io::Error::from_raw_os_error(...)); update all callers in
crates/vm/src/stdlib/time.rs (the wrapper function and the functions altzone,
timezone, daylight, tzname and the platform module) to handle the Result by
propagating errors or mapping the Ok value to the previous behavior so failures
are not silently converted to a zeroed struct. Ensure unique symbols:
get_tz_info, wrapper function in stdlib/time.rs, altzone, timezone, daylight,
tzname, and the platform module are updated accordingly.
- Around line 31-45: checked_tm_from_struct_time currently only checks
tm.tm_wday < 0 causing a potential out-of-bounds when asctime_from_tm indexes
WDAY_NAME; update checked_tm_from_struct_time to validate the full range for
tm.tm_wday (>=0 && <=6) and ensure tm.tm_mon is validated for 0..=11 as well so
both WDAY_NAME[tm.tm_wday as usize] and MON_NAME[tm.tm_mon as usize] cannot
panic; locate and update the checks around tm.tm_wday and tm.tm_mon in the
checked_tm_from_struct_time function accordingly.
In `@crates/stdlib/src/openssl.rs`:
- Around line 1671-1681: The fallback arm of the map_err on the
rustpython_host_env::fileutils::fopen call currently returns
vm.new_os_error(e.to_string()) which drops the original errno and prevents
mapping to specific OSError subtypes; change that fallback to return the
original IO error converted to a Python exception by calling
e.to_pyexception(vm) so errno is preserved (i.e., replace
vm.new_os_error(e.to_string()) with e.to_pyexception(vm) inside the map_err
closure).
In `@crates/stdlib/src/select.rs`:
- Around line 106-108: The select path is reading the wrong Windows error
namespace; update the error retrieval in the host_select::select() helper
(crates/host_env/src/select.rs) to call WSAGetLastError() on Windows and
construct the io::Error from that code instead of using
GetLastError()/io::Error::last_os_error(); ensure the returned Error codes
propagate into select.rs::select() and socket.rs::sock_select() so the
interrupted-retry logic can detect Winsock-specific errors (e.g.,
WSAEINTR/WSAEINPROGRESS) correctly. Use conditional compilation for Windows
targets to call the WinSock API (WSAGetLastError) and fallback to the existing
path on non-Windows platforms so behavior is unchanged elsewhere.
In `@crates/vm/src/stdlib/msvcrt.rs`:
- Around line 46-54: The wide-char wrappers (getwch, getche, getwche) currently
treat MSVCRT UTF-16 code units as Rust char and can panic on surrogate halves;
update the plumbing to handle UTF-16 code units instead of Unicode scalars:
change the host_msvcrt side bindings for _getwch/_getwche/_putwch/_ungetwch to
use u16/u32 (i.e., raw UTF-16 code units), stop calling char::from_u32 on the
returned values, and in the VM wrappers (getwch, getche, getwche) either return
the raw u16 code units (Vec<u16>) or safely decode sequences using
std::char::decode_utf16 to produce a String without panicking (handling Err by
using replacement U+FFFD); likewise, when sending to putwch/ungetwch, accept
UTF-16 code units (u16/u32) rather than truncating Rust char.
---
Nitpick comments:
In `@crates/host_env/src/fcntl.rs`:
- Around line 57-75: The function lockf currently mixes flock-style flags
(checks libc::LOCK_NB on the cmd parameter) with fcntl operations (using
F_SETLK/F_SETLKW), which is confusing and potentially incorrect; either rename
the function to something like fcntl_lock (or fcntl_lockf) and update its
docstring to state it expects flock-style nonblocking flag semantics, or change
the cmd handling to accept POSIX lockf commands (F_LOCK/F_TLOCK/F_ULOCK/F_TEST)
and map them appropriately to F_SETLK/F_SETLKW; locate the function named lockf
and its cmd parameter and adjust the name and/or the flag checks and
documentation accordingly so callers aren’t misled by the mixed APIs.
In `@crates/host_env/src/msvcrt.rs`:
- Around line 58-60: The putwch function silently truncates non-BMP characters
when casting char to u16; locate putwch (and its call to
suppress_iph!(_putwch(...))) and add validation or documentation: check if c as
u32 <= 0xFFFF before calling _putwch, and if not either map to a replacement
(e.g. '\u{FFFD}') or return/propagate an error/log a warning (choose consistent
behavior with your crate's error handling); alternatively add a clear doc
comment on putwch stating it accepts only BMP characters and that callers must
handle supplementary-plane code points themselves.
- Around line 36-52: The current implementations of getwch and getwche panic if
char::from_u32 returns None (surrogate/invalid codepoints); to avoid this, stop
panicking and normalize invalid values to the Unicode replacement character
instead so the API stays unchanged: in getwch and getwche replace the
unwrap_or_else panic path with a fallback to '\u{FFFD}' before .to_string();
keep getche as-is (it returns raw bytes). Ensure you remove the panic messages
and any panic-causing closures in functions getwch and getwche so invalid input
yields U+FFFD rather than aborting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 32126555-6507-4aa3-9b49-7cd150eda503
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (56)
Cargo.tomlcrates/common/Cargo.tomlcrates/common/src/lib.rscrates/host_env/Cargo.tomlcrates/host_env/src/crt_fd.rscrates/host_env/src/fcntl.rscrates/host_env/src/fileutils.rscrates/host_env/src/lib.rscrates/host_env/src/macros.rscrates/host_env/src/msvcrt.rscrates/host_env/src/nt.rscrates/host_env/src/os.rscrates/host_env/src/posix.rscrates/host_env/src/select.rscrates/host_env/src/shm.rscrates/host_env/src/signal.rscrates/host_env/src/syslog.rscrates/host_env/src/termios.rscrates/host_env/src/time.rscrates/host_env/src/winapi.rscrates/host_env/src/windows.rscrates/stdlib/Cargo.tomlcrates/stdlib/src/faulthandler.rscrates/stdlib/src/fcntl.rscrates/stdlib/src/mmap.rscrates/stdlib/src/openssl.rscrates/stdlib/src/overlapped.rscrates/stdlib/src/posixshmem.rscrates/stdlib/src/select.rscrates/stdlib/src/socket.rscrates/stdlib/src/syslog.rscrates/stdlib/src/termios.rscrates/vm/Cargo.tomlcrates/vm/src/exceptions.rscrates/vm/src/function/fspath.rscrates/vm/src/lib.rscrates/vm/src/ospath.rscrates/vm/src/stdlib/_codecs.rscrates/vm/src/stdlib/_ctypes/function.rscrates/vm/src/stdlib/_io.rscrates/vm/src/stdlib/_signal.rscrates/vm/src/stdlib/_winapi.rscrates/vm/src/stdlib/msvcrt.rscrates/vm/src/stdlib/nt.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rscrates/vm/src/stdlib/posix_compat.rscrates/vm/src/stdlib/sys.rscrates/vm/src/stdlib/time.rscrates/vm/src/stdlib/winreg.rscrates/vm/src/stdlib/winsound.rscrates/vm/src/vm/vm_new.rscrates/vm/src/windows.rsexamples/generator.rsexamples/package_embed.rssrc/lib.rs
💤 Files with no reviewable changes (2)
- crates/common/Cargo.toml
- crates/common/src/lib.rs
| pub fn sec_to_timeval(sec: f64) -> timeval { | ||
| timeval { | ||
| tv_sec: sec.trunc() as _, | ||
| tv_usec: (sec.fract() * 1e6) as _, | ||
| } | ||
| } |
There was a problem hiding this comment.
Negative timeout values could produce invalid timeval.
If sec is negative, sec.fract() will also be negative, resulting in a negative tv_usec value. This could cause undefined behavior when passed to select().
🛡️ Proposed fix to handle negative values
pub fn sec_to_timeval(sec: f64) -> timeval {
+ // Clamp negative values to zero to avoid invalid timeval
+ let sec = sec.max(0.0);
timeval {
tv_sec: sec.trunc() as _,
tv_usec: (sec.fract() * 1e6) as _,
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub fn sec_to_timeval(sec: f64) -> timeval { | |
| timeval { | |
| tv_sec: sec.trunc() as _, | |
| tv_usec: (sec.fract() * 1e6) as _, | |
| } | |
| } | |
| pub fn sec_to_timeval(sec: f64) -> timeval { | |
| // Clamp negative values to zero to avoid invalid timeval | |
| let sec = sec.max(0.0); | |
| timeval { | |
| tv_sec: sec.trunc() as _, | |
| tv_usec: (sec.fract() * 1e6) as _, | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/select.rs` around lines 171 - 176, sec_to_timeval can
produce negative tv_usec when sec is negative which yields invalid timeval for
select(); update sec_to_timeval so negative timeouts are handled by clamping to
a non-negative timeval (e.g., if sec <= 0.0 return timeval { tv_sec: 0, tv_usec:
0 }), otherwise compute tv_sec = sec.trunc() as _ and tv_usec = (sec.fract() *
1e6) as _ (ensuring tv_usec is non-negative and fits the timeval field);
reference sec_to_timeval and timeval (and select()) when making the change.
| #[cfg(target_os = "freebsd")] | ||
| let mode = mode.try_into().unwrap(); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Rust libc bindings on FreeBSD, what is the modeparameter type forshm_open, and can converting from libc::c_uintviaTryInto fail?
💡 Result:
In Rust's libc bindings on FreeBSD, the mode parameter for shm_open is of type libc::mode_t, which is u32 (type alias). The libc::shm_open function is declared as pub unsafe extern "C" fn shm_open(name: *const c_char, oflag: c_int, ...) -> c_int, using variadic arguments (...), so the mode value passed after oflag must be compatible with FreeBSD's C mode_t (also u32). Since libc::c_uint is also u32 on FreeBSD, converting from libc::c_uint to libc::mode_t via TryInto cannot fail, as it is a zero-cost type alias conversion between identical underlying types (u32 to u32).
Citations:
- 1: https://docs.rs/libc/0.2.77/x86_64-apple-darwin/libc/fn.shm_open.html
- 2: https://man.bsd.lv/FreeBSD-13.0/shm_open.2
- 3: https://man.freebsd.org/cgi/man.cgi?apropos=0&manpath=FreeBSD+15.0-RELEASE+and+Ports&query=shm_open&sektion=2
- 4: https://github.com/rust-lang/libc/blob/master/src/unix/bsd/freebsdlike/freebsd/mod.rs
- 5: https://docs.rs/libc/latest/libc/type.mode_t.html
- 6: https://github.com/sfackler/rust-posix-ipc/blob/master/src/shm.rs
🏁 Script executed:
# Find and examine the shm.rs file
fd -type f -name "shm.rs" | head -20Repository: RustPython/RustPython
Length of output: 235
🏁 Script executed:
# Find the shm.rs file in the repository
fd shm.rsRepository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Read the shm.rs file to see the function and its context
cat -n crates/host_env/src/shm.rsRepository: RustPython/RustPython
Length of output: 801
🏁 Script executed:
# Check the libc crate definitions for FreeBSD mode_t and c_uint
# First, let's see what's in the Cargo.lock or Cargo.toml to find the libc version
fd -name "Cargo.toml" | xargs grep -l "libc" | head -5Repository: RustPython/RustPython
Length of output: 295
🏁 Script executed:
# Find Cargo.toml files that reference libc
find . -name "Cargo.toml" -type f | head -5Repository: RustPython/RustPython
Length of output: 202
🏁 Script executed:
# Check the main Cargo.toml and the host_env Cargo.toml for libc dependency
grep -r "libc" ./Cargo.toml ./crates/host_env/Cargo.toml 2>/dev/null | head -10Repository: RustPython/RustPython
Length of output: 192
🏁 Script executed:
# Let's verify the actual type definitions for FreeBSD
# Check if there's any documentation or test that shows the types
rg "mode_t|c_uint" crates/host_env/ -A 2 -B 2Repository: RustPython/RustPython
Length of output: 933
Simplify FreeBSD mode conversion using a direct cast.
The try_into().unwrap() is unnecessary since both libc::c_uint and libc::mode_t are u32 on FreeBSD, making the conversion infallible. Use a direct cast instead for clarity:
Proposed fix
pub fn shm_open(name: &CStr, flags: libc::c_int, mode: libc::c_uint) -> io::Result<libc::c_int> {
#[cfg(target_os = "freebsd")]
- let mode = mode.try_into().unwrap();
+ let mode = mode as libc::mode_t;
let fd = unsafe { libc::shm_open(name.as_ptr(), flags, mode) };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #[cfg(target_os = "freebsd")] | |
| let mode = mode.try_into().unwrap(); | |
| pub fn shm_open(name: &CStr, flags: libc::c_int, mode: libc::c_uint) -> io::Result<libc::c_int> { | |
| #[cfg(target_os = "freebsd")] | |
| let mode = mode as libc::mode_t; | |
| let fd = unsafe { libc::shm_open(name.as_ptr(), flags, mode) }; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/shm.rs` around lines 5 - 6, Replace the infallible
conversion using try_into().unwrap() with a direct cast on FreeBSD: change the
conversion of mode in the #[cfg(target_os = "freebsd")] branch so that the local
variable `mode` is cast to the target type (since `libc::c_uint` and
`libc::mode_t` are both u32 on FreeBSD) rather than calling
`try_into().unwrap()`, updating the `mode` binding used later in functions in
this module (shm.rs) accordingly.
| pub fn get_tz_info() -> windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION { | ||
| let mut info = unsafe { core::mem::zeroed() }; | ||
| unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) }; | ||
| info |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
For Win32 GetTimeZoneInformation, what return value indicates failure, and should callers check GetLastError in that case?
💡 Result:
The return value indicating failure for Win32 GetTimeZoneInformation is TIME_ZONE_ID_INVALID (0xFFFFFFFF). Callers should check GetLastError in that case to get extended error information, such as for out-of-memory errors.
Citations:
- 1: https://learn.microsoft.com/tr-tr/windows/win32/api/timezoneapi/nf-timezoneapi-gettimezoneinformation
- 2: https://learn.microsoft.com/en-us/windows/win32/api/timezoneapi/nf-timezoneapi-gettimezoneinformation
- 3: https://learn.microsoft.com/en-us/windows/win32/api/timezoneapi/nf-timezoneapi-getdynamictimezoneinformation
- 4: https://learn.microsoft.com/en-us/previous-versions/ms908466(v=msdn.10)
🏁 Script executed:
# Read the time.rs file to verify the current code state
cat -n crates/host_env/src/time.rsRepository: RustPython/RustPython
Length of output: 1776
🏁 Script executed:
# Search for all usages of get_tz_info to understand call sites
rg "get_tz_info" --type rs -B 2 -A 2Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Search for all usages of get_tz_info
rg "get_tz_info" -A 2 -B 2Repository: RustPython/RustPython
Length of output: 3079
Handle GetTimeZoneInformation failure explicitly.
The return value is currently ignored; on failure this returns a zeroed struct, which silently produces incorrect timezone behavior. The Windows API signals failure with TIME_ZONE_ID_INVALID and requires checking GetLastError for extended error information.
All call sites in crates/vm/src/stdlib/time.rs (wrapper function, altzone, timezone, daylight, tzname, and platform module) currently expect a direct struct return and will need to be updated to handle the Result type.
Proposed fix
+use std::io;
+
#[cfg(target_env = "msvc")]
#[cfg(not(target_arch = "wasm32"))]
-pub fn get_tz_info() -> windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION {
+pub fn get_tz_info() -> io::Result<windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION> {
let mut info = unsafe { core::mem::zeroed() };
- unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) };
- info
+ let status = unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) };
+ if status == windows_sys::Win32::System::Time::TIME_ZONE_ID_INVALID {
+ Err(io::Error::last_os_error())
+ } else {
+ Ok(info)
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/time.rs` around lines 24 - 27, get_tz_info currently
ignores GetTimeZoneInformation's return value and returns a zeroed
TIME_ZONE_INFORMATION on failure; change get_tz_info to return a
Result<windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION, std::io::Error>
(or similar error type), call GetTimeZoneInformation and check its return value
against TIME_ZONE_ID_INVALID, and on failure call GetLastError to construct and
return an Err(std::io::Error::from_raw_os_error(...)); update all callers in
crates/vm/src/stdlib/time.rs (the wrapper function and the functions altzone,
timezone, daylight, tzname and the platform module) to handle the Result by
propagating errors or mapping the Ok value to the previous behavior so failures
are not silently converted to a zeroed struct. Ensure unique symbols:
get_tz_info, wrapper function in stdlib/time.rs, altzone, timezone, daylight,
tzname, and the platform module are updated accordingly.
| pub fn asctime_from_tm(tm: &libc::tm) -> String { | ||
| const WDAY_NAME: [&str; 7] = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]; | ||
| const MON_NAME: [&str; 12] = [ | ||
| "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", | ||
| ]; | ||
| format!( | ||
| "{} {}{:>3} {:02}:{:02}:{:02} {}", | ||
| WDAY_NAME[tm.tm_wday as usize], | ||
| MON_NAME[tm.tm_mon as usize], | ||
| tm.tm_mday, | ||
| tm.tm_hour, | ||
| tm.tm_min, | ||
| tm.tm_sec, | ||
| tm.tm_year + 1900 | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -n --type rust 'asctime_from_tm\('
rg -n --type rust -C 4 'tm_wday|tm_mon'Repository: RustPython/RustPython
Length of output: 1784
🏁 Script executed:
rg -n --type rust 'asctime_from_tm' --max-count 20Repository: RustPython/RustPython
Length of output: 392
🏁 Script executed:
rg -n --type rust 'tm_wday|tm_mon' -C 3 --max-count 30Repository: RustPython/RustPython
Length of output: 13157
🌐 Web query:
libc::tm tm_wday tm_mon range valid values documentation
💡 Result:
In the C standard library (libc) struct tm, the fields tm_wday and tm_mon have the following valid ranges: tm_wday: days since Sunday [0, 6] (0=Sunday, 1=Monday, ..., 6=Saturday) tm_mon: months since January [0, 11] (0=January, 1=February, ..., 11=December) These ranges are standard across C (cppreference.com citing C11/C17/C23), POSIX (Open Group Base Specifications), and libc man pages (Linux/Ubuntu). Functions like mktime may normalize or ignore tm_wday values outside this range, but the documented valid input/output range is 0-6 for tm_wday and 0-11 for tm_mon.
Citations:
- 1: https://w.cppreference.com/c/chrono/tm.html
- 2: https://manpages.ubuntu.com/manpages/noble/man3/tm.3type.html
- 3: https://man7.org/linux/man-pages/man3/tm.3type.html
- 4: https://pubs.opengroup.org/onlinepubs/009695099/basedefs/time.h.html
- 5: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/time.h.html
🏁 Script executed:
sed -n '580,600p' crates/vm/src/stdlib/time.rs
sed -n '600,620p' crates/vm/src/stdlib/time.rs
sed -n '350,430p' crates/vm/src/stdlib/time.rsRepository: RustPython/RustPython
Length of output: 4219
🏁 Script executed:
sed -n '440,450p' crates/vm/src/stdlib/time.rsRepository: RustPython/RustPython
Length of output: 448
Add upper bound validation for tm_wday in checked_tm_from_struct_time.
While asctime_from_tm call sites do pass through checked_tm_from_struct_time validation, the bounds check for tm_wday is incomplete. Line 442 only checks tm.tm_wday < 0 but omits the upper bound—should also validate tm.tm_wday > 6 to prevent panic when indexing WDAY_NAME[7]. Similarly, confirm tm_mon validation correctly enforces [0, 11] range (line 425 looks correct).
Validation gaps in crates/vm/src/stdlib/time.rs
Line 442-444 (incomplete):
if tm.tm_wday < 0 {
return Err(vm.new_value_error("day of week out of range"));
}Should be:
if tm.tm_wday < 0 || tm.tm_wday > 6 {
return Err(vm.new_value_error("day of week out of range"));
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/time.rs` around lines 31 - 45,
checked_tm_from_struct_time currently only checks tm.tm_wday < 0 causing a
potential out-of-bounds when asctime_from_tm indexes WDAY_NAME; update
checked_tm_from_struct_time to validate the full range for tm.tm_wday (>=0 &&
<=6) and ensure tm.tm_mon is validated for 0..=11 as well so both
WDAY_NAME[tm.tm_wday as usize] and MON_NAME[tm.tm_mon as usize] cannot panic;
locate and update the checks around tm.tm_wday and tm.tm_mon in the
checked_tm_from_struct_time function accordingly.
| rustpython_host_env::fileutils::fopen(path.as_path(), "rb").map_err(|e| match e | ||
| .kind() | ||
| { | ||
| std::io::ErrorKind::NotFound => vm | ||
| .new_os_subtype_error( | ||
| vm.ctx.exceptions.file_not_found_error.to_owned(), | ||
| Some(libc::ENOENT), | ||
| e.to_string(), | ||
| ) | ||
| .upcast(), | ||
| _ => vm.new_os_error(e.to_string()), |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the current load_dh_params mapping.
fd -a 'openssl\.rs$' . -x sed -n '1666,1682p' {}
# Inspect the generic OSError constructor.
rg -n -C3 --type=rust '\b(pub\s+)?fn new_os_error\b'
# Inspect how std::io::Error is normally translated into Python exceptions.
rg -n -C4 --type=rust 'raw_os_error\(\)|to_pyexception\(.*vm\)|into_pyexception\(.*vm\)|file_not_found_error|permission_error'Repository: RustPython/RustPython
Length of output: 50378
Replace vm.new_os_error() with e.to_pyexception() in the fallback arm to preserve errno.
The _ => vm.new_os_error(e.to_string()) branch discards the errno code from the std::io::Error, mapping failures like EACCES, EISDIR, and ENOTDIR to a plain OSError instead of their proper subtypes (PermissionError, IsADirectoryError, NotADirectoryError). The codebase uses e.to_pyexception(vm) throughout (_io.rs, nt.rs, _winapi.rs) to preserve this information for correct exception type selection.
Suggested fix
- _ => vm.new_os_error(e.to_string()),
+ _ => e.to_pyexception(vm),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rustpython_host_env::fileutils::fopen(path.as_path(), "rb").map_err(|e| match e | |
| .kind() | |
| { | |
| std::io::ErrorKind::NotFound => vm | |
| .new_os_subtype_error( | |
| vm.ctx.exceptions.file_not_found_error.to_owned(), | |
| Some(libc::ENOENT), | |
| e.to_string(), | |
| ) | |
| .upcast(), | |
| _ => vm.new_os_error(e.to_string()), | |
| rustpython_host_env::fileutils::fopen(path.as_path(), "rb").map_err(|e| match e | |
| .kind() | |
| { | |
| std::io::ErrorKind::NotFound => vm | |
| .new_os_subtype_error( | |
| vm.ctx.exceptions.file_not_found_error.to_owned(), | |
| Some(libc::ENOENT), | |
| e.to_string(), | |
| ) | |
| .upcast(), | |
| _ => e.to_pyexception(vm), |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/stdlib/src/openssl.rs` around lines 1671 - 1681, The fallback arm of
the map_err on the rustpython_host_env::fileutils::fopen call currently returns
vm.new_os_error(e.to_string()) which drops the original errno and prevents
mapping to specific OSError subtypes; change that fallback to return the
original IO error converted to a Python exception by calling
e.to_pyexception(vm) so errno is preserved (i.e., replace
vm.new_os_error(e.to_string()) with e.to_pyexception(vm) inside the map_err
closure).
| host_msvcrt::getwch() | ||
| } | ||
| #[pyfunction] | ||
| fn getche() -> Vec<u8> { | ||
| let c = unsafe { _getche() }; | ||
| vec![c as u8] | ||
| host_msvcrt::getche() | ||
| } | ||
| #[pyfunction] | ||
| fn getwche() -> String { | ||
| let c = unsafe { _getwche() }; | ||
| char::from_u32(c).unwrap().to_string() | ||
| host_msvcrt::getwche() |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
sed -n '1,180p' crates/host_env/src/msvcrt.rs
printf '\n--- wide-char wrapper sites ---\n'
rg -n "getwch|getwche|putwch|ungetwch|char::from_u32|as u16|_ungetwch" crates/host_env/src/msvcrt.rs crates/vm/src/stdlib/msvcrt.rsRepository: RustPython/RustPython
Length of output: 5075
The wide-char wrappers incorrectly model UTF-16 console I/O as Unicode scalars.
crates/host_env/src/msvcrt.rs treats _getwch, _getwche, _putwch, and _ungetwch as operating on Unicode code points via char, but MSVCRT APIs work with UTF-16 code units. _getwch and _getwche return u32 values that char::from_u32() cannot handle when they are UTF-16 surrogate code units (0xD800–0xDFFF). An emoji like 😀 arrives as a surrogate pair; the first half will panic. Similarly, putwch and ungetwch truncate char to u16/u32, losing supplementary plane input entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/msvcrt.rs` around lines 46 - 54, The wide-char wrappers
(getwch, getche, getwche) currently treat MSVCRT UTF-16 code units as Rust char
and can panic on surrogate halves; update the plumbing to handle UTF-16 code
units instead of Unicode scalars: change the host_msvcrt side bindings for
_getwch/_getwche/_putwch/_ungetwch to use u16/u32 (i.e., raw UTF-16 code units),
stop calling char::from_u32 on the returned values, and in the VM wrappers
(getwch, getche, getwche) either return the raw u16 code units (Vec<u16>) or
safely decode sequences using std::char::decode_utf16 to produce a String
without panicking (handling Err by using replacement U+FFFD); likewise, when
sending to putwch/ungetwch, accept UTF-16 code units (u16/u32) rather than
truncating Rust char.
There was a problem hiding this comment.
Actionable comments posted: 20
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/host_env/src/msvcrt.rs (1)
42-54:⚠️ Potential issue | 🟡 MinorPotential panic if
_getwch()returns an invalid Unicode code point.
char::from_u32()returnsNonefor values that aren't valid Unicode scalar values (e.g., surrogate code points 0xD800-0xDFFF). If the Windows console returns such a value, this will panic.Consider using
char::from_u32(...).unwrap_or(char::REPLACEMENT_CHARACTER)or restoring error handling to gracefully handle invalid input.Suggested defensive fix
pub fn getwch() -> String { let value = unsafe { _getwch() }; - char::from_u32(value).unwrap().to_string() + char::from_u32(value).unwrap_or(char::REPLACEMENT_CHARACTER).to_string() } // ... pub fn getwche() -> String { let value = unsafe { _getwche() }; - char::from_u32(value).unwrap().to_string() + char::from_u32(value).unwrap_or(char::REPLACEMENT_CHARACTER).to_string() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/msvcrt.rs` around lines 42 - 54, The getwch and getwche functions currently call char::from_u32(...).unwrap(), which can panic if the underlying unsafe calls _getwch() or _getwche() return an invalid Unicode scalar (e.g., surrogate values); change both to handle invalid code points by using char::from_u32(value).unwrap_or(char::REPLACEMENT_CHARACTER) (or equivalent fallback) before calling to_string(), and keep getche as-is (it returns raw u8) or document its behavior; ensure you reference the unsafe calls _getwch and _getwche and the wrapper functions getwch and getwche when making the change.
♻️ Duplicate comments (3)
crates/host_env/src/syslog.rs (1)
46-49:⚠️ Potential issue | 🔴 CriticalFix TOCTOU in
closelog()by locking before checking open state.
closelog()checksis_open()on Line 46 before acquiring the write lock on Line 47. A concurrentopenlog()can race in that gap and get closed/reset unexpectedly. Lock first, then check state under the same lock.Proposed fix
pub fn closelog() { - if is_open() { - let mut locked_ident = global_ident().write(); + let mut locked_ident = global_ident().write(); + if locked_ident.is_some() { unsafe { libc::closelog() }; *locked_ident = None; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/syslog.rs` around lines 46 - 49, The TOCTOU bug occurs because closelog() calls is_open() before acquiring the global_ident() write lock, allowing a concurrent openlog() to race; modify closelog() to acquire the write lock on global_ident() first (e.g., let mut locked_ident = global_ident().write()), then check whether the saved identifier is Some via locked_ident.as_ref() or is_open() logic under that lock, call unsafe { libc::closelog() } and set *locked_ident = None while still holding the lock so close and state update happen atomically with respect to openlog().crates/host_env/src/time.rs (2)
264-267:⚠️ Potential issue | 🟠 MajorHandle
GetTimeZoneInformationfailure explicitly.Win32 reports failure from
GetTimeZoneInformationviaTIME_ZONE_ID_INVALIDand expects callers to inspectGetLastError; ignoring the status here means a failed call is silently decoded from a zeroed struct. (learn.microsoft.com)Suggested fix
+use std::io; + #[cfg(target_env = "msvc")] #[cfg(not(target_arch = "wasm32"))] -pub fn get_tz_info() -> WindowsTimeZoneInfo { +pub fn get_tz_info() -> io::Result<WindowsTimeZoneInfo> { let mut info = unsafe { core::mem::zeroed() }; - unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) }; - WindowsTimeZoneInfo { + let status = + unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) }; + if status == windows_sys::Win32::System::Time::TIME_ZONE_ID_INVALID { + return Err(io::Error::last_os_error()); + } + Ok(WindowsTimeZoneInfo { bias: info.Bias as i32, standard_bias: info.StandardBias as i32, daylight_bias: info.DaylightBias as i32, standard_name: decode_tz_name(&info.StandardName), daylight_name: decode_tz_name(&info.DaylightName), - } + }) }#!/bin/bash # Verify the current implementation and locate all callers that would need Result handling. sed -n '256,274p' crates/host_env/src/time.rs rg -n --type rust '\bget_tz_info\s*\('🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/time.rs` around lines 264 - 267, get_tz_info currently ignores the return code of GetTimeZoneInformation and decodes a zeroed struct on failure; change get_tz_info to check the return value against TIME_ZONE_ID_INVALID, call GetLastError() when invalid, and return a Result<WindowsTimeZoneInfo, std::io::Error> (or another appropriate error type) instead of unconditionally returning WindowsTimeZoneInfo; update callers of get_tz_info to handle the Result and propagate or map the error, and ensure the WindowsTimeZoneInfo construction only happens on a successful GetTimeZoneInformation return.
376-384:⚠️ Potential issue | 🟠 MajorReject
tm_wday > 6beforeasctime_from_tmindexesWDAY_NAME.Line 482 only rejects negative weekdays.
tm_wday = 7still gets through and then Line 383 panics onWDAY_NAME[tm.tm_wday as usize].Suggested fix
- if tm.tm_wday < 0 { + if !(0..=6).contains(&tm.tm_wday) { return Err(CheckedTmError::DayOfWeekOutOfRange); }Also applies to: 482-484
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/time.rs` around lines 376 - 384, asctime_from_tm currently only guards against negative weekday values but still indexes WDAY_NAME with tm.tm_wday (and similarly MON_NAME with tm.tm_mon) which can panic for tm.tm_wday == 7 or tm.tm_mon > 11; update asctime_from_tm to validate both tm.tm_wday is within 0..=6 and tm.tm_mon is within 0..=11 before using WDAY_NAME[tm.tm_wday as usize] and MON_NAME[tm.tm_mon as usize], and handle out-of-range values the same way the function handles negative values (reject/return the existing fallback) to prevent panics.
🟡 Minor comments (6)
crates/host_env/src/signal.rs-299-303 (1)
299-303:⚠️ Potential issue | 🟡 MinorPotential truncation casting
SOCKETtolibc::writefd parameter.On 64-bit Windows,
libc::SOCKETisusize(64-bit), butlibc::writeexpects a Cint(32-bit) as the first parameter. Large socket handles could truncate. Consider validating the cast or using an expliciti32::try_from.Suggested fix
} else { + let fd = wakeup_fd as i32; // SOCKET values for CRT fds should fit in i32 unsafe { - let _ = libc::write(wakeup_fd as _, &sigbyte as *const u8 as *const _, 1); + let _ = libc::write(fd, &sigbyte as *const u8 as *const _, 1); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/signal.rs` around lines 299 - 303, The write call is casting wakeup_fd to the C int parameter which can truncate on 64-bit Windows; change the unsafe libc::write(wakeup_fd as _, &sigbyte... ) usage to first attempt a safe conversion like i32::try_from(wakeup_fd) and only call libc::write with the converted fd, handling the Err case (e.g., skip the write and log or return) to avoid truncation; reference the wakeup_fd, sigbyte, and the libc::write invocation when making this change.crates/host_env/src/wmi.rs-660-661 (1)
660-661:⚠️ Potential issue | 🟡 MinorClose handles only if they were successfully created.
CloseHandleis called unconditionally, but ifCreateEventWfailed earlier, these handles could be null. WhileCloseHandle(null)may fail silently on some Windows versions, it's technically undefined behavior.Suggested fix
- CloseHandle(init_event); - CloseHandle(connect_event); + if !init_event.is_null() { + CloseHandle(init_event); + } + if !connect_event.is_null() { + CloseHandle(connect_event); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/wmi.rs` around lines 660 - 661, The code calls CloseHandle(init_event) and CloseHandle(connect_event) unconditionally; change it to only call CloseHandle when the corresponding event handle was successfully created by CreateEventW (i.e., check that init_event and connect_event are not null/INVALID_HANDLE_VALUE before calling CloseHandle). Update the cleanup logic around the CreateEventW calls and any early-return paths so each handle is closed only if it was actually allocated, referencing the symbols CreateEventW, init_event, connect_event, and CloseHandle to locate the relevant code.crates/host_env/src/winapi.rs-627-636 (1)
627-636:⚠️ Potential issue | 🟡 MinorSuspicious thread stack size of 1 byte.
CreateThreadis called with a stack size of 1 byte (3rd parameter). While Windows rounds up to a minimum, this is non-idiomatic. A value of 0 uses the executable's default stack size, which is typically more appropriate.🔧 Proposed fix
let thread = unsafe { CreateThread( core::ptr::null(), - 1, + 0, Some(batch_wait_thread), Arc::as_ptr(data) as *const _ as *mut _, 4, core::ptr::null_mut(), ) };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/winapi.rs` around lines 627 - 636, The CreateThread call currently passes a stack size of 1 (in the call that creates the thread for batch_wait_thread), which is non-idiomatic and suspicious; change the third argument to 0 so the thread uses the process/executable default stack size (i.e., replace the literal 1 with 0 in the CreateThread invocation that passes batch_wait_thread and Arc::as_ptr(data)).crates/host_env/src/overlapped.rs-536-536 (1)
536-536:⚠️ Potential issue | 🟡 MinorPotential panic if Winsock extensions not initialized.
The
unwrap()calls onACCEPT_EX.get(),CONNECT_EX.get(), etc. will panic ifinitialize_winsock_extensions()was not called beforehand. This creates a hidden precondition.Consider returning
io::Resultwith an appropriate error instead of panicking, or document this precondition clearly.🛡️ Example defensive approach
- let accept_ex: AcceptExFn = unsafe { core::mem::transmute(*ACCEPT_EX.get().unwrap()) }; + let accept_ex: AcceptExFn = unsafe { + core::mem::transmute( + *ACCEPT_EX + .get() + .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "Winsock extensions not initialized"))?, + ) + };This would require changing the return type to
io::Result<u32>.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/overlapped.rs` at line 536, The current code uses unwrap() on ACCEPT_EX.get() (and similar CONNECT_EX.get(), etc.), which will panic if initialize_winsock_extensions() was not called; change the API to return an io::Result (e.g., change the function signature to io::Result<u32>) and replace unwrap() with a checked lookup that returns Err(io::Error::new(...)) when the extension pointer is None, or alternatively call initialize_winsock_extensions() lazily and return an error on failure; reference ACCEPT_EX.get(), CONNECT_EX.get(), and initialize_winsock_extensions() when making the change so the caller-visible precondition is removed and errors are propagated instead of panicking.crates/host_env/src/nt.rs-1546-1559 (1)
1546-1559:⚠️ Potential issue | 🟡 MinorPotential panic if username contains non-UTF-8 characters.
The
unwrap()onto_str()at line 1556-1557 will panic if the Windows username cannot be represented as valid UTF-8. While rare, this is possible with certain character encodings.🛡️ Proposed fix using lossy conversion
pub fn getlogin() -> io::Result<String> { let mut buffer = [0u16; 257]; let mut size = buffer.len() as u32; let ok = unsafe { windows_sys::Win32::System::WindowsProgramming::GetUserNameW(buffer.as_mut_ptr(), &mut size) }; if ok == 0 { return Err(io::Error::last_os_error()); } - Ok(OsString::from_wide(&buffer[..(size - 1) as usize]) - .to_str() - .unwrap() - .to_string()) + OsString::from_wide(&buffer[..(size - 1) as usize]) + .into_string() + .map_err(|_| io::Error::new(io::ErrorKind::InvalidData, "username is not valid UTF-8")) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/nt.rs` around lines 1546 - 1559, The getlogin function currently calls OsString::from_wide(...).to_str().unwrap() which can panic if the username contains non-UTF-8 sequences; change the conversion to a loss-tolerant form (e.g., use OsString/OsStr -> to_string_lossy().into_owned()) so invalid UTF-8 is replaced rather than panicking. Locate the getlogin function and replace the to_str().unwrap().to_string() chain with a to_string_lossy().into_owned() call on the OsString/OsStr produced from OsString::from_wide(&buffer[..(size - 1) as usize]) to return a safe String.crates/host_env/src/windows.rs-133-144 (1)
133-144:⚠️ Potential issue | 🟡 MinorOff-by-one error in service pack string extraction.
The
lastvariable holds the index of the last non-null character, butfrom_wide(&version.szCSDVersion[..last])excludes that index. For a non-empty string, this truncates the last character.🐛 Proposed fix
let service_pack = { let (last, _) = version .szCSDVersion .iter() .take_while(|&x| x != &0) .enumerate() .last() .unwrap_or((0, &0)); - let sp = OsString::from_wide(&version.szCSDVersion[..last]); + let len = version + .szCSDVersion + .iter() + .position(|&x| x == 0) + .unwrap_or(version.szCSDVersion.len()); + let sp = OsString::from_wide(&version.szCSDVersion[..len]); sp.into_string() .map_err(|_| io::Error::other("service pack is not ASCII"))? };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/windows.rs` around lines 133 - 144, The slice used to build service_pack omits the last non-null wchar because `last` is the index of that character; change the slice passed to OsString::from_wide to include that character (use last + 1) when calling from_wide on version.szCSDVersion so service_pack contains the full string; update the block that computes service_pack (variables `last`, `sp`, and the call to OsString::from_wide) to handle empty cases safely and use `&version.szCSDVersion[..last + 1]` instead of `&version.szCSDVersion[..last]`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/host_env/src/ctypes.rs`:
- Around line 1883-1904: The function copy_com_pointer currently uses
debug_assert! to validate the IUnknown vtable which is removed in release
builds, causing UB when src_ptr is non-zero but vtable is null or bogus; change
the runtime checks so that after computing iunknown and vtable you explicitly
test if vtable.is_null() (and also check the function pointer at vtable.add(1)
is non-null) and return HRESULT_E_POINTER if invalid, and only then perform the
transmute and call addref_fn; ensure you do not dereference *vtable until those
runtime checks pass and likewise avoid transmuting a null function pointer.
- Around line 375-402: The code in wstring_from_bytes and
wchar_array_field_value treats each WCHAR as an independent Unicode scalar via
char::from_u32, which drops surrogate pairs; instead collect the sequence of u16
code units produced by wchar_from_bytes (using WCHAR_SIZE to read each unit),
stop at a terminating 0, then decode the whole Vec<u16> as UTF-16 using either
String::from_utf16 or std::char::decode_utf16 to produce the String (use
decode_utf16 to gracefully handle invalid/ lone surrogates if you want
replacement chars); update both wstring_from_bytes and wchar_array_field_value
to build a Vec<u16> and run the proper UTF-16 decoder rather than mapping each
code unit to char individually.
- Around line 1981-1991: The function call_result_bytes currently hard-codes the
reported size for CallResult::Value as core::mem::size_of::<i64>(), which can be
wrong on some targets; instead use the actual serialized byte length produced by
to_ne_bytes(). Update call_result_bytes (both CallResult::Pointer and
CallResult::Value branches) to compute the reported width from the length of the
bytes returned by ptr.to_ne_bytes()/val.to_ne_bytes() (e.g., use bytes.len() or
equivalent) and return that length instead of a hard-coded type size.
- Around line 208-210: simple_type_align currently returns simple_type_size
which is wrong for ABIs where alignment != size; replace the forwarding with a
real alignment mapping inside simple_type_align: match on the type strings used
elsewhere (e.g. the same tokens handled by simple_type_size) and return the C
alignment for each token (e.g. i8/u8 => 1, i16/u16 => 2, i32/u32/ f32 => 4,
i64/u64/f64 => 8, ptr types => pointer alignment based on target pointer width,
etc.), returning Some(alignment) for recognized types and None as the fallback
(or use simple_type_size only when appropriate); reference the function name
simple_type_align and reuse simple_type_size only as a fallback, not as the
primary alignment source.
- Around line 1278-1298: The match arms in write_callback_result() (and
similarly in read_value_at_address() at the other noted location) currently
treat "l"/"L" as fixed 64-bit/32-bit types causing truncation/overrun on
ILP32/LP64 targets; change the memory access to use libc::c_long and
libc::c_ulong for the "l" and "L" branches (use *(result as *mut libc::c_long)
and *(result as *mut libc::c_ulong) respectively) and only cast between
CallbackResultValue's integer variants and libc::c_long/c_ulong at the API
boundary so the in-memory read/write respects the platform c_long/c_ulong width.
- Around line 102-115: with_swapped_errno currently writes the ctypes-local
errno into the OS slot, calls f, stores the post-call errno into
CTYPES_LOCAL_ERRNO, but fails to restore the original OS errno so the embedding
thread sees the foreign function's errno; fix by capturing the original OS errno
before setting it (save_old_os = crate::os::get_errno()), set the OS errno to
the ctypes-local value as now, call f, capture new_error =
crate::os::get_errno(), restore the original OS errno
(crate::os::set_errno(save_old_os)), then set CTYPES_LOCAL_ERRNO to new_error
and return result; apply the exact same pattern to the other helper in the
153-165 region (the corresponding swapped LastError helper) using
crate::os::get_last_error()/set_last_error() and the CTYPES_LOCAL_LAST_ERROR
slot.
- Around line 707-775: In read_array_element, guard against slicing
buffer[offset..] before verifying offset and available bytes: for type "u"
return DecodedValue::String(String::new()) if offset >= buffer.len(); for "f"
return DecodedValue::Float(0.0) if offset + 4 > buffer.len(); for "d"/"g" return
DecodedValue::Float(0.0) if offset + 8 > buffer.len(); and in the
integer/default branch use buffer.get(offset..) (or check offset + element_size
<= buffer.len()) before calling get(..element_size) and int_from_bytes so you
return the existing Signed(0) fallback instead of panicking; update the checks
around usages of wchar_from_bytes, first_chunk::<4>(), first_chunk::<8>(), and
int_from_bytes accordingly.
- Around line 172-184: LONG_DOUBLE_SIZE and the "g" format handlers are
inconsistent: LONG_DOUBLE_SIZE should represent the platform long double size
(16 bytes on x86_64/aarch64 Unix) and the "g" encoder/decoder and FFI mapping
must use the proper long double type instead of marshaling as an 8-byte f64.
Update the LONG_DOUBLE_SIZE constant to reflect c_longdouble size where
appropriate, change the "g" encoding path (the block that currently encodes as
8-byte float then pads) to encode a c_longdouble value of the correct size and
preserve its bytes (no zero-padding/truncation), change the "g" decoding path
(the block that currently reads 8 bytes into c_double) to read the full
c_longdouble size and reconstruct a c_longdouble, and replace any uses of
Type::f64() for the "g"/long double FFI mapping with the proper long double
mapping (Type::longdouble() / libffi ffi_type_longdouble) so libffi receives the
correct ffi_type_longdouble for FFI calls.
- Around line 2409-2416: The get_pointer() implementation in SharedLibrary must
not transmute the safe libloading::Library; instead persist a stable raw handle
or synthetic ID when the library is created and return that here. Modify the
SharedLibrary struct to either (A) capture the raw handle up front by converting
the platform-specific libloading::os::unix::Library to a raw handle via
into_raw()/from_raw() and store that usize (update creation/loading code to call
into_raw and reconstruct the safe wrapper with from_raw), or (B) add a dedicated
stable identifier field (AtomicUsize counter or UUID) assigned when the library
is loaded and return that from get_pointer(). Replace the transmute_copy usage
in get_pointer() to read and return the stored raw_handle or stable_id and
ensure destruction/reconstruction uses from_raw if you chose the raw-handle
approach.
In `@crates/host_env/src/faulthandler.rs`:
- Around line 196-209: The loop in enable_fatal_handlers iterates
FATAL_SIGNAL_HANDLERS and may return false after a failed install_sigaction,
leaving earlier entries enabled; update enable_fatal_handlers to roll back any
installs when a subsequent install fails by iterating the already-enabled
entries and restoring their previous handlers (using entry.previous and the same
mechanism as install_sigaction or its counterpart) and mark them disabled before
returning false; apply the same rollback logic to the analogous code region
(lines ~383-397) so partial installs are undone on failure.
- Around line 269-307: Validate the incoming signum before casting/indexing in
register_user_signal: check that the original signum (libc::c_int) is
non-negative and that (signum as usize) < USER_SIGNAL_CAPACITY; if not, return
an appropriate io::Error (e.g., from_raw_os_error(libc::EINVAL)) instead of
indexing USER_SIGNALS or USER_SIGNAL_CAPACITY. Use the validated usize only
after this check and keep references to USER_SIGNALS, USER_SIGNAL_CAPACITY, and
register_user_signal consistent.
- Around line 347-356: The code currently captures saved_errno after
restore_sigaction and then restores errno using errno_after_raise, which loses
the caller's original errno; move the capture of the caller's errno to before
you call restore_sigaction (call crate::os::get_errno() into saved_errno before
restore_sigaction), continue to call raise_signal and re-install the handler
(install_sigaction), then after re-installation call
crate::os::set_errno(saved_errno) instead of restoring errno_after_raise so the
original errno is preserved; update/remove the temporary errno_after_raise usage
accordingly and keep restore_sigaction, raise_signal, and install_sigaction
calls in the same sequence.
- Line 9: The static mutable FATAL_SIGNAL_HANDLERS must be protected from
concurrent access; replace the current static mut with a synchronization
primitive (e.g., wrap the underlying table in a Mutex or RwLock) and update the
public functions enable_fatal_handlers() and disable_fatal_handlers() to acquire
the lock before reading/modifying the table (or alternatively make those
functions unsafe and document the synchronization contract); ensure you
reference and lock the same guarded static (FATAL_SIGNAL_HANDLERS) in all helper
routines that touch it so no aliased mutable references or data races can occur.
In `@crates/host_env/src/io.rs`:
- Around line 216-218: The is_would_block_error function currently only checks
raw_os_error() == Some(libc::EAGAIN); change it to detect would-block using
Rust's platform-agnostic io::ErrorKind by checking err.kind() ==
io::ErrorKind::WouldBlock (and optionally OR with the existing raw_os_error() ==
Some(libc::EAGAIN) if you want to preserve the explicit errno check). Update the
function signature/reference to use std::io::ErrorKind so is_would_block_error
reliably returns true across platforms.
In `@crates/host_env/src/posix.rs`:
- Around line 457-468: The utimes function's tv closure incorrectly uses
d.as_micros() which gives total microseconds instead of the fractional
microseconds for timeval.tv_usec; update the tv closure in utimes so tv_sec
remains d.as_secs() as _ and tv_usec is the subsecond component (use
d.subsec_micros() or d.subsec_nanos()/1_000) cast to the correct type, ensuring
timeval.tv_usec contains only the microsecond remainder; adjust only the tv
closure inside the utimes function (references: utimes, tv closure,
timeval.tv_sec and timeval.tv_usec).
In `@crates/host_env/src/select.rs`:
- Around line 267-277: The function wait clears events then calls
rustix::event::epoll::wait which writes into events' spare capacity but doesn't
update events.len(); update events to reflect the number of entries written
before returning. Specifically, in the Ok(n) arm of select::wait (the function
named wait, operating on events: &mut Vec<Event> and calling
rustix::event::epoll::wait), set the vector length to n (e.g., advance or
set_len) so callers see the populated entries, then return Ok(n).
- Around line 86-88: Check the FD_SETSIZE bound before modifying set.__nfds to
avoid temporarily leaving the struct inconsistent: when adding a file
descriptor, ensure you validate n = set.__nfds against FD_SETSIZE (e.g., if n >=
FD_SETSIZE return an error) before doing set.__nfds = n + 1 and set.__fds[n] =
fd; in other words, perform the FD_SETSIZE check first and only increment __nfds
and write to __fds after the bound check succeeds (referencing set.__nfds,
set.__fds, and FD_SETSIZE).
In `@crates/host_env/src/socket.rs`:
- Around line 530-531: Replace the incorrect alias of SO_EXCLUSIVEADDRUSE to
SO_REUSEADDR: update the constant SO_EXCLUSIVEADDRUSE so it uses the
Windows-specific numeric value -5 (0xFFFFFFFB) instead of referencing
SO_REUSEADDR; locate the declaration of pub const SO_EXCLUSIVEADDRUSE and change
its value to the correct integer literal for Windows.
In `@crates/host_env/src/time.rs`:
- Around line 138-149: The code calls libc::sysconf(libc::_SC_CLK_TCK) into
tick_for_second and then divides process time fields by it without checking for
non-positive/zero values; guard the syscall result by checking the raw return
(from libc::sysconf) for <= 0 (or error) before converting to f64, return an
Err(std::io::Error::new(...)) on invalid tick count, and only compute
ProcessTimes.user/system/children_user/children_system/elapsed after confirming
tick_for_second is positive; reference the tick_for_second variable, the
libc::sysconf(libc::_SC_CLK_TCK) call, and the ProcessTimes construction to
locate where to add the check and early return.
In `@crates/host_env/src/winapi.rs`:
- Around line 670-678: The branch that detects a SIGINT event has an off-by-one:
when sigint_event is pushed into thread_handles_raw it occupies index
thread_handles_raw.len() - 1, so change the comparison against result to use
WAIT_OBJECT_0 + (thread_handles_raw.len() - 1) instead of WAIT_OBJECT_0 +
thread_handles_raw.len(); update the condition that references sigint_event,
thread_handles_raw and result (and KEEP using WAIT_OBJECT_0 and
ERROR_CONTROL_C_EXIT) so it correctly matches the signaled index for the SIGINT
handle.
---
Outside diff comments:
In `@crates/host_env/src/msvcrt.rs`:
- Around line 42-54: The getwch and getwche functions currently call
char::from_u32(...).unwrap(), which can panic if the underlying unsafe calls
_getwch() or _getwche() return an invalid Unicode scalar (e.g., surrogate
values); change both to handle invalid code points by using
char::from_u32(value).unwrap_or(char::REPLACEMENT_CHARACTER) (or equivalent
fallback) before calling to_string(), and keep getche as-is (it returns raw u8)
or document its behavior; ensure you reference the unsafe calls _getwch and
_getwche and the wrapper functions getwch and getwche when making the change.
---
Minor comments:
In `@crates/host_env/src/nt.rs`:
- Around line 1546-1559: The getlogin function currently calls
OsString::from_wide(...).to_str().unwrap() which can panic if the username
contains non-UTF-8 sequences; change the conversion to a loss-tolerant form
(e.g., use OsString/OsStr -> to_string_lossy().into_owned()) so invalid UTF-8 is
replaced rather than panicking. Locate the getlogin function and replace the
to_str().unwrap().to_string() chain with a to_string_lossy().into_owned() call
on the OsString/OsStr produced from OsString::from_wide(&buffer[..(size - 1) as
usize]) to return a safe String.
In `@crates/host_env/src/overlapped.rs`:
- Line 536: The current code uses unwrap() on ACCEPT_EX.get() (and similar
CONNECT_EX.get(), etc.), which will panic if initialize_winsock_extensions() was
not called; change the API to return an io::Result (e.g., change the function
signature to io::Result<u32>) and replace unwrap() with a checked lookup that
returns Err(io::Error::new(...)) when the extension pointer is None, or
alternatively call initialize_winsock_extensions() lazily and return an error on
failure; reference ACCEPT_EX.get(), CONNECT_EX.get(), and
initialize_winsock_extensions() when making the change so the caller-visible
precondition is removed and errors are propagated instead of panicking.
In `@crates/host_env/src/signal.rs`:
- Around line 299-303: The write call is casting wakeup_fd to the C int
parameter which can truncate on 64-bit Windows; change the unsafe
libc::write(wakeup_fd as _, &sigbyte... ) usage to first attempt a safe
conversion like i32::try_from(wakeup_fd) and only call libc::write with the
converted fd, handling the Err case (e.g., skip the write and log or return) to
avoid truncation; reference the wakeup_fd, sigbyte, and the libc::write
invocation when making this change.
In `@crates/host_env/src/winapi.rs`:
- Around line 627-636: The CreateThread call currently passes a stack size of 1
(in the call that creates the thread for batch_wait_thread), which is
non-idiomatic and suspicious; change the third argument to 0 so the thread uses
the process/executable default stack size (i.e., replace the literal 1 with 0 in
the CreateThread invocation that passes batch_wait_thread and
Arc::as_ptr(data)).
In `@crates/host_env/src/windows.rs`:
- Around line 133-144: The slice used to build service_pack omits the last
non-null wchar because `last` is the index of that character; change the slice
passed to OsString::from_wide to include that character (use last + 1) when
calling from_wide on version.szCSDVersion so service_pack contains the full
string; update the block that computes service_pack (variables `last`, `sp`, and
the call to OsString::from_wide) to handle empty cases safely and use
`&version.szCSDVersion[..last + 1]` instead of `&version.szCSDVersion[..last]`.
In `@crates/host_env/src/wmi.rs`:
- Around line 660-661: The code calls CloseHandle(init_event) and
CloseHandle(connect_event) unconditionally; change it to only call CloseHandle
when the corresponding event handle was successfully created by CreateEventW
(i.e., check that init_event and connect_event are not null/INVALID_HANDLE_VALUE
before calling CloseHandle). Update the cleanup logic around the CreateEventW
calls and any early-return paths so each handle is closed only if it was
actually allocated, referencing the symbols CreateEventW, init_event,
connect_event, and CloseHandle to locate the relevant code.
---
Duplicate comments:
In `@crates/host_env/src/syslog.rs`:
- Around line 46-49: The TOCTOU bug occurs because closelog() calls is_open()
before acquiring the global_ident() write lock, allowing a concurrent openlog()
to race; modify closelog() to acquire the write lock on global_ident() first
(e.g., let mut locked_ident = global_ident().write()), then check whether the
saved identifier is Some via locked_ident.as_ref() or is_open() logic under that
lock, call unsafe { libc::closelog() } and set *locked_ident = None while still
holding the lock so close and state update happen atomically with respect to
openlog().
In `@crates/host_env/src/time.rs`:
- Around line 264-267: get_tz_info currently ignores the return code of
GetTimeZoneInformation and decodes a zeroed struct on failure; change
get_tz_info to check the return value against TIME_ZONE_ID_INVALID, call
GetLastError() when invalid, and return a Result<WindowsTimeZoneInfo,
std::io::Error> (or another appropriate error type) instead of unconditionally
returning WindowsTimeZoneInfo; update callers of get_tz_info to handle the
Result and propagate or map the error, and ensure the WindowsTimeZoneInfo
construction only happens on a successful GetTimeZoneInformation return.
- Around line 376-384: asctime_from_tm currently only guards against negative
weekday values but still indexes WDAY_NAME with tm.tm_wday (and similarly
MON_NAME with tm.tm_mon) which can panic for tm.tm_wday == 7 or tm.tm_mon > 11;
update asctime_from_tm to validate both tm.tm_wday is within 0..=6 and tm.tm_mon
is within 0..=11 before using WDAY_NAME[tm.tm_wday as usize] and
MON_NAME[tm.tm_mon as usize], and handle out-of-range values the same way the
function handles negative values (reject/return the existing fallback) to
prevent panics.
---
Nitpick comments:
In `@crates/host_env/src/cert_store.rs`:
- Around line 90-124: The enum_crls function only opens a single system store
(defaulting to CurrentUser) causing CRLs from LocalMachine to be missed; modify
enum_crls to mirror enum_certificates by iterating both store locations
(CurrentUser and LocalMachine) when opening the store (either call
CertOpenSystemStoreW twice with the two system store flags or use CertOpenStore
with CERT_SYSTEM_STORE_CURRENT_USER and CERT_SYSTEM_STORE_LOCAL_MACHINE),
enumerate CRLs with CertEnumCRLsInStore for each opened store, push CrlEntry
items (der and encoding_type(crl.dwCertEncodingType)) into the same result Vec,
and ensure each opened store is closed with CertCloseStore; keep using the same
types/functions (enum_crls, CertEnumCRLsInStore, CertCloseStore, CrlEntry,
encoding_type) to locate and update the code.
In `@crates/host_env/src/faulthandler.rs`:
- Around line 163-164: In current_thread_id(), replace the explicit cast "as
u64" on libc::pthread_self() with a generic cast "as _" to avoid tripping
clippy::trivial_numeric_casts while preserving portability (keep the unsafe {
libc::pthread_self() as _ } expression and function signature returning u64);
this mirrors the pattern used elsewhere (e.g., RustPython's _thread) and avoids
introducing a lint suppression.
In `@crates/host_env/src/io.rs`:
- Line 1: The import CStr and the function/variable fd_is_own are only used on
specific targets and should be cfg-gated to avoid cross-target unused-item
lints; wrap use core::ffi::CStr with #[cfg(any(unix, target_os = "wasi"))] (or
the exact cfg used elsewhere) and add #[cfg(not(windows))] or #[cfg(any(unix,
target_os = "wasi"))] to the fd_is_own definition (and similarly guard the code
at the 174-178 region) so the symbols are only compiled on targets that need
them, keeping function names like fd_is_own and usages of CStr unchanged.
In `@crates/host_env/src/mmap.rs`:
- Around line 91-94: The unsafe impls for Send and Sync on NamedMmap lack
documented safety invariants; add a clear comment above the unsafe impls for
NamedMmap that states why sharing across threads is safe (e.g., the internal
Windows HANDLE is thread-safe, the underlying mapping is process-wide, any
mutable access is synchronized or the mapping is immutable/only exposes
atomic-safe access, and Drop does not race with concurrent use), reference the
fields used (the raw HANDLE and mapping pointer/len) and mention any required
caller guarantees; ensure the comment explains that these invariants must hold
for Send/Sync to be sound and update tests or reviewers if there are additional
assumptions about lifetime, ownership, or synchronization.
- Around line 274-284: The two identical functions map_anon should be
consolidated: replace the separate #[cfg(unix)] and #[cfg(windows)] blocks with
a single #[cfg(any(unix, windows))] attribute on one map_anon implementation
that uses MmapOptions::new(), sets len(size), calls map_anon(), and maps to
MappedFile::Write; ensure the function signature pub fn map_anon(size: usize) ->
io::Result<MappedFile> and the internal calls to MmapOptions, map_anon, and
MappedFile::Write remain unchanged.
In `@crates/host_env/src/multiprocessing.rs`:
- Around line 420-434: Replace the conditional debug-only check with a hard
assert so passing a deadline on Apple fails loudly: change the
debug_assert!(deadline.is_none()) in the Apple-specific branch of the code
handling sem_wait/sem_timedwait to assert!(deadline.is_none()) so the invariant
is enforced in release builds; ensure you update the branch that calls unsafe {
libc::sem_wait(handle) } and returns WaitStatus::Acquired /
WaitStatus::Interrupted / WaitStatus::Error(SemError::from_errno(err))
accordingly.
- Around line 199-209: current_thread_id has platform-dependent return types
(unix: u64 via pthread_self, windows: u32 via GetCurrentThreadId) causing an
inconsistent API; change the Windows implementation of current_thread_id to
return u64 as well by casting GetCurrentThreadId() to u64 so both cfg(unix) and
cfg(windows) expose the same u64 signature, and update the function signature
for the Windows variant (and any callers if needed) to match current_thread_id
-> u64 while keeping the unix implementation using pthread_self() as before.
In `@crates/host_env/src/nt.rs`:
- Around line 1082-1127: The two functions path_exists_via_open and
test_file_exists_by_name duplicate CreateFileW/opening logic and the regular
reparse point retry; extract a small helper (e.g., open_check_exists_or_reopen)
that accepts the wide path, initial flags (FILE_FLAG_BACKUP_SEMANTICS | optional
FILE_FLAG_OPEN_REPARSE_POINT), desired access (FILE_READ_ATTRIBUTES), and a
boolean indicating whether to follow links, performs the CreateFileW call, runs
test_file_type_by_handle(handle, TestType::RegularReparsePoint, false), ensures
CloseHandle is always called, and returns true/false indicating existence
(including the retry open with FILE_FLAG_BACKUP_SEMANTICS if a regular reparse
point was observed). Replace the duplicated blocks in path_exists_via_open and
test_file_exists_by_name with calls to this helper.
In `@crates/host_env/src/os.rs`:
- Around line 118-126: The cfg attributes on the two cpu_count functions are
using a confusing double-negation; update the second attribute to the clearer
equivalent by replacing #[cfg(not(any(not(target_arch = "wasm32"), target_os =
"wasi")))] with #[cfg(all(target_arch = "wasm32", not(target_os = "wasi")))] so
cpu_count() implementations read as "non-wasm32 or wasi" (the existing first fn)
and "wasm32 and not wasi" (the second fn) respectively; leave the function names
and bodies unchanged.
In `@crates/host_env/src/overlapped.rs`:
- Around line 74-75: The unsafe impls for Sync and Send on Operation are missing
a SAFETY justification even though Operation contains mutable fields (pending,
completed, read_buffer, write_buffer) that are mutated without internal
synchronization; add a clear `// SAFETY:` comment above `unsafe impl Sync for
Operation {}` and `unsafe impl Send for Operation {}` that describes why sharing
across threads is sound (e.g., external synchronization guarantees,
atomic/lock-free usage, or that only immutable methods are called concurrently),
or alternatively document on the `Operation` type that callers must ensure
exclusive access when invoking mutating methods like those that touch `pending`,
`completed`, `read_buffer`, and `write_buffer`. Include references to the
`Operation` struct and the specific fields/methods so future reviewers can
verify the invariants.
- Around line 1084-1116: The format_message function in overlapped.rs duplicates
windows.rs::format_error_message; remove the duplicated implementation and
replace callers of overlapped::format_message with a call to
windows::format_error_message(error_code) (or move the shared logic into a new
common helper and have both modules call that), ensuring you update imports to
bring windows::format_error_message into scope and delete the
LocalFree/FormatMessageW-specific code in overlapped.rs so only the single
canonical implementation remains.
In `@crates/host_env/src/posix.rs`:
- Around line 401-403: The pipe2 function currently returns nix::Result but
other functions use std::io::Result; change pub fn pipe2(...) ->
nix::Result<...> to pub fn pipe2(...) -> std::io::Result<(std::os::fd::OwnedFd,
std::os::fd::OwnedFd)> and convert the nix error to std::io::Error (e.g., via
.map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))) on the
nix::unistd::pipe2(...) call so callers get a consistent std::io::Result from
pipe2.
- Around line 1147-1154: Remove the unnecessary rewinddir call in the Drop
implementation for FdDirStream: in the impl Drop for FdDirStream (the drop
method that currently calls libc::rewinddir and libc::closedir), delete the
libc::rewinddir(self.0.as_ptr()) call and keep only the
libc::closedir(self.0.as_ptr()) inside the unsafe block so the directory stream
is closed without resetting its position first.
- Around line 1564-1569: The helper fn should_keep currently returns true when
an fd should be closed (fd > above && not found in keep), which is
misleading—rename the function to something like should_close or is_closable
(e.g., fn should_close(above: i32, keep: &[BorrowedFd<'_>], fd: i32) -> bool)
and update all call sites that reference should_keep to the new name (the places
where it filters/decides which FDs to close). Keep the implementation identical
but use the new, semantically-correct identifier to avoid confusion.
- Around line 766-768: Change the chdir function to return std::io::Result<()>
instead of nix::Result<()>, and map any nix::Error into an std::io::Error before
returning; specifically, update the signature of chdir and replace the direct
nix::unistd::chdir(cwd) return with something like
nix::unistd::chdir(cwd).map_err(|e|
std::io::Error::new(std::io::ErrorKind::Other, e)) so the module API remains
consistent while preserving the original error information (referencing the
chdir function and nix::unistd::chdir call).
- Around line 801-841: Change the listed functions to return std::io::Result<()>
instead of nix::Result<()>, and convert any nix/libc errors to std::io::Error
before returning; specifically update signatures for setsid_if_needed,
setpgid_if_needed, setgroups_if_needed, setregid_if_needed and
setreuid_if_needed to -> std::io::Result<()>, then map/convert errors from calls
like nix::unistd::setsid(), nix::unistd::setpgid(...),
nix::unistd::setgroups(...), and nix::Error::result(ret) into std::io::Error
(e.g. using io::Error::from_raw_os_error from the errno or io::Error::new with
the nix error message) so the public API is consistent with the rest of the
module.
In `@crates/host_env/src/pwd.rs`:
- Around line 38-40: getpwnam currently swallows errors via .ok().flatten(),
making it inconsistent with getpwuid; change getpwnam signature to return
io::Result<Option<Passwd>> and propagate errors from User::from_name instead of
dropping them. Replace the .ok().flatten() pattern with proper error handling
(e.g., map the Result from User::from_name to io::Result and then map/transpose
into Option<Passwd>) so callers receive Err on underlying failures while still
returning Ok(None) when the user is absent; reference the getpwnam function and
User::from_name and match the error behavior of getpwuid.
In `@crates/host_env/src/select.rs`:
- Around line 183-209: The helper functions search_poll_fd, insert_poll_fd,
get_poll_fd_mut, and remove_poll_fd rely on the vector being sorted by PollFd.fd
but that invariant is implicit; either add a short rustdoc on each function (or
on the module) stating "fds must be sorted by fd" and the consequences, or
better encapsulate the Vec<PollFd> in a small wrapper type (e.g. PollFdList)
that owns the Vec and exposes methods search, insert, get_mut, remove which
maintain/validate the sorted-by-fd invariant internally; update these functions
to be methods on that wrapper (or call its validation) so accidental misuse is
prevented.
In `@crates/host_env/src/signal.rs`:
- Line 336: Replace the magic number 21 used in the signal mapping with the
SIGBREAK constant to improve consistency: locate the mapping or match arm that
currently has the entry `21 => "Break"` and change it to use the previously
defined SIGBREAK symbol (defined on line ~216) so it reads `SIGBREAK =>
"Break"`, ensuring imports/visibility of SIGBREAK are respected.
- Around line 176-192: The syscall result in pidfd_send_signal unnecessarily
casts the libc::syscall return to libc::c_long; remove the redundant cast so the
unsafe block assigns the syscall(...) directly to ret (which is declared as
libc::c_long), updating the assignment in the pidfd_send_signal function to
return the syscall result without the "as libc::c_long" cast.
- Around line 71-72: The call to siginterrupt(signalnum, 1) currently discards
its Result with let _ = ...; add a brief inline comment next to that statement
explaining why errors are intentionally ignored (for example: non-critical
best-effort to set SA_RESTART behavior, older platforms where siginterrupt may
be unavailable, or errors are expected and safe to ignore) so future readers
know this is deliberate; reference the siginterrupt call and the signalnum
variable when adding the comment.
- Around line 204-227: VALID_SIGNALS currently embeds the literal 21 instead of
the named constant SIGBREAK because SIGBREAK is defined below; move the SIGBREAK
declaration above VALID_SIGNALS (or define SIGBREAK before VALID_SIGNALS) and
replace the literal 21 in VALID_SIGNALS with SIGBREAK so the signal list
consistently uses the named constant (refer to VALID_SIGNALS and SIGBREAK to
locate and update the definitions).
- Around line 230-235: The init_winsock function currently requests WinSock 1.1
(WSAStartup called with 0x0101); update the requested version to WinSock 2.2 by
passing 0x0202 (or using MAKEWORD(2,2)) to the WSAStartup call in the
init_winsock block (symbols: init_winsock, WSA_INIT, WSAStartup, wsa_data) so
the code initializes WinSock 2.2 instead of 1.1.
- Around line 35-45: Remove the platform-specific ffi module that declares
unsafe extern "C" getitimer/setitimer and instead use the libc::getitimer and
libc::setitimer symbols uniformly on all Unix targets; locate the mod ffi block
and replace its usage sites (calls referencing ffi::getitimer / ffi::setitimer)
to call libc::getitimer and libc::setitimer directly, ensuring any unsafe blocks
and argument types match the existing calls that already use libc on other Unix
branches (see usages around getitimer and setitimer in this file).
In `@crates/host_env/src/socket.rs`:
- Around line 150-160: The cfg attribute on the socket-related code currently
redundantly lists many Unix target_os values alongside `unix`; replace the long
predicate with a single, maintainable condition—either #[cfg(unix)] or, if you
intend to exclude redox, use #[cfg(all(unix, not(target_os = "redox")))]—by
updating the #[cfg(...)] on the block in this file (the attribute immediately
above the socket-related definitions) to the chosen simplified form.
- Around line 194-206: The functions gai_error_string and h_error_string may
call CStr::from_ptr on a null pointer if libc::gai_strerror or libc::hstrerror
return null; change each function to call the libc function into a local pointer
(e.g., let p = libc::gai_strerror(err)), check p for null (std::ptr::null())
inside the unsafe block, and if null return a safe fallback string like
format!("Unknown gai error: {}", err) (and similarly for h_error_string:
"Unknown herror: {}"); otherwise construct the CStr and return its
to_string_lossy().into_owned() as before.
In `@crates/host_env/src/thread.rs`:
- Around line 4-6: The current_thread_id function casts libc::pthread_self()
with as u64 which triggers Clippy's trivial_numeric_casts on Linux; update the
cast to use as _ so the compiler infers the return type from the function
signature (i.e., change the unsafe block that returns libc::pthread_self() as
u64 to returning libc::pthread_self() as _), keeping the function name
current_thread_id and the libc::pthread_self reference intact.
In `@crates/host_env/src/time.rs`:
- Around line 22-27: The two cfg-annotated definitions of TimeT are identical
and should be collapsed into a single unconditional type alias (remove
#[cfg(unix)]/#[cfg(windows)] around TimeT); do the same for the identical cfg
branches that implement current_time_t and the other duplicated block around
lines ~73-77 by extracting any platform-specific values only and invoking the
shared logic once (reference symbols: TimeT and current_time_t) so duplicate
byte-for-byte branches are eliminated.
In `@crates/host_env/src/winapi.rs`:
- Around line 578-579: Add a SAFETY comment above the unsafe impls for BatchData
explaining why implementing Send and Sync is sound: note that BatchData contains
raw handles and an UnsafeCell, and document the required invariants (e.g., all
mutable writes to BatchData occur-before the thread is resumed, and all reads
occur-after the thread completes/joins; no concurrent access occurs; ownership
of raw handles is not aliased across threads). Reference the symbols BatchData
and the unsafe impl Send/Sync to make clear which types and impls the comment
applies to.
In `@crates/host_env/src/winreg.rs`:
- Around line 89-96: The function bytes_as_wide_slice relies on a debug-only
assert and must be made explicit about alignment guarantees; change
bytes_as_wide_slice to an unsafe fn bytes_as_wide_slice(bytes: &[u8]) -> &[u16]
and add a doc comment stating callers must guarantee the input is u16-aligned
and valid UTF-16/registry data, keep or tighten the existing
debug_assert!(prefix.is_empty() && suffix.is_empty(), ...) for debug builds, and
update all call sites to call it within an unsafe block so alignment
responsibility is clearly documented and enforced at the call boundary.
- Around line 421-425: The code handling the initial query_value_ex result masks
real errors by forcing buf_size = 256 when buf_size == 0 even if the call
returned an error other than Foundation::ERROR_MORE_DATA; update the logic in
the query loop (the block that inspects res and buf_size) to preserve and check
the original res: only treat ERROR_MORE_DATA (or a zero-size success) as a
resize-and-retry case, but if res != 0 && res != Foundation::ERROR_MORE_DATA
then return Err(res) immediately (or return the preserved original error)
instead of overwriting buf_size; ensure the condition references the same res
from query_value_ex and that query_value_ex, buf_size, and
Foundation::ERROR_MORE_DATA are the symbols used to locate the change.
In `@crates/host_env/src/wmi.rs`:
- Around line 670-675: The code can underflow when offset is odd (e.g., offset
== 1) because (offset as usize) / 2 - 1 can wrap; update the calculation to be
defensive: compute let bytes = offset as usize, then guard bytes < 2 => return
Ok(String::new()), or use let char_count = bytes / 2 .saturating_sub(1) (or
adjust bytes = bytes.saturating_sub(bytes % 2)) before indexing buffer; apply
this fix where offset, buffer and char_count are computed so you never perform a
subtraction that can underflow.
In `@crates/stdlib/src/mmap.rs`:
- Around line 406-437: The AccessMode-to-host_mmap::AccessMode mapping is
duplicated in multiple places (e.g., the closure that calls host_mmap::map_file
in the mmap constructor and other locations), so add a single conversion method
or impl to centralize it; implement either an impl AccessMode { fn
to_host(&self) -> host_mmap::AccessMode } or impl From<AccessMode> for
host_mmap::AccessMode, then replace the repeated match blocks wherever
AccessMode is converted (references: AccessMode, host_mmap::AccessMode,
host_mmap::map_file) to call the new conversion method or From::from to avoid
repetition.
In `@crates/stdlib/src/openssl.rs`:
- Around line 4107-4137: The code in enum_certificates currently ignores
CertificateEntries.had_open_store returned by
rustpython_host_env::cert_store::enum_certificates and silently returns an empty
list; update enum_certificates to check the returned
CertificateEntries.had_open_store and, if false, raise an OSError via the VM
(e.g. return Err(vm.new_os_error("could not open certificate store")) or
similar) before mapping entries; reference the enum_certificates function, the
CertificateEntries.had_open_store field, and
rustpython_host_env::cert_store::enum_certificates when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: dc2397b2-4158-44fc-87bb-b3699bcf4196
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (83)
.cspell.json.github/workflows/ci.yamlcrates/host_env/Cargo.tomlcrates/host_env/src/cert_store.rscrates/host_env/src/ctypes.rscrates/host_env/src/errno.rscrates/host_env/src/faulthandler.rscrates/host_env/src/fcntl.rscrates/host_env/src/grp.rscrates/host_env/src/io.rscrates/host_env/src/lib.rscrates/host_env/src/locale.rscrates/host_env/src/mmap.rscrates/host_env/src/msvcrt.rscrates/host_env/src/multiprocessing.rscrates/host_env/src/nt.rscrates/host_env/src/os.rscrates/host_env/src/overlapped.rscrates/host_env/src/posix.rscrates/host_env/src/pwd.rscrates/host_env/src/resource.rscrates/host_env/src/select.rscrates/host_env/src/signal.rscrates/host_env/src/socket.rscrates/host_env/src/syslog.rscrates/host_env/src/termios.rscrates/host_env/src/testconsole.rscrates/host_env/src/thread.rscrates/host_env/src/time.rscrates/host_env/src/winapi.rscrates/host_env/src/windows.rscrates/host_env/src/winreg.rscrates/host_env/src/wmi.rscrates/stdlib/Cargo.tomlcrates/stdlib/src/_testconsole.rscrates/stdlib/src/faulthandler.rscrates/stdlib/src/fcntl.rscrates/stdlib/src/grp.rscrates/stdlib/src/locale.rscrates/stdlib/src/mmap.rscrates/stdlib/src/multiprocessing.rscrates/stdlib/src/openssl.rscrates/stdlib/src/overlapped.rscrates/stdlib/src/posixsubprocess.rscrates/stdlib/src/resource.rscrates/stdlib/src/select.rscrates/stdlib/src/socket.rscrates/stdlib/src/ssl.rscrates/stdlib/src/termios.rscrates/vm/Cargo.tomlcrates/vm/src/exceptions.rscrates/vm/src/getpath.rscrates/vm/src/readline.rscrates/vm/src/stdlib/_codecs.rscrates/vm/src/stdlib/_ctypes.rscrates/vm/src/stdlib/_ctypes/array.rscrates/vm/src/stdlib/_ctypes/base.rscrates/vm/src/stdlib/_ctypes/function.rscrates/vm/src/stdlib/_ctypes/library.rscrates/vm/src/stdlib/_ctypes/pointer.rscrates/vm/src/stdlib/_ctypes/simple.rscrates/vm/src/stdlib/_io.rscrates/vm/src/stdlib/_signal.rscrates/vm/src/stdlib/_stat.rscrates/vm/src/stdlib/_thread.rscrates/vm/src/stdlib/_winapi.rscrates/vm/src/stdlib/_wmi.rscrates/vm/src/stdlib/builtins.rscrates/vm/src/stdlib/errno.rscrates/vm/src/stdlib/msvcrt.rscrates/vm/src/stdlib/nt.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rscrates/vm/src/stdlib/pwd.rscrates/vm/src/stdlib/sys.rscrates/vm/src/stdlib/time.rscrates/vm/src/stdlib/winreg.rscrates/vm/src/vm/mod.rscrates/vm/src/windows.rshost_env_proposal.mdtest_bad_new_super_debug.pytest_full_comparison.pytest_notes_debug.py
💤 Files with no reviewable changes (1)
- crates/stdlib/Cargo.toml
✅ Files skipped from review due to trivial changes (1)
- .cspell.json
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/host_env/Cargo.toml
- crates/stdlib/src/faulthandler.rs
| const LONG_DOUBLE_SIZE: usize = core::mem::size_of::<u128>(); | ||
|
|
||
| #[cfg(target_os = "windows")] | ||
| const LONG_DOUBLE_SIZE: usize = core::mem::size_of::<c_double>(); | ||
|
|
||
| #[cfg(not(any( | ||
| all( | ||
| any(target_arch = "x86_64", target_arch = "aarch64"), | ||
| not(target_os = "windows") | ||
| ), | ||
| target_os = "windows" | ||
| )))] | ||
| const LONG_DOUBLE_SIZE: usize = core::mem::size_of::<c_double>(); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does the Rust libfficrate expose a distinctTypefor Clong double, and what is the correct ABI representation of long double on x86_64/aarch64 Unix?
💡 Result:
Yes, the Rust libffi crates expose a distinct Type for C long double. The libffi-sys crate (low-level bindings) provides a static ffi_type_longdouble: ffi_type, distinct from ffi_type_double. This is available on x86_64-unknown-linux-gnu and aarch64-unknown-linux-gnu targets. The libffi crate (higher-level) middle layer provides low::types::longdouble, which wraps ffi_type_longdouble. On x86_64 Unix (System V ABI, FFI_UNIX64), long double is the 80-bit extended precision type (x87 registers: X86_64_X87_CLASS + X86_64_X87UP_CLASS for args, UNIX64_RET_X87[_2] for returns). Size is typically 16 bytes (with padding), alignment 16. On aarch64 Unix (AAPCS64, FFI_SYSV), long double is identical to double (f64): 8 bytes, passed in FP registers like double. libffi defines it distinctly but classifies/passes as double per ABI.
Citations:
- 1: https://docs.rs/libffi-sys/latest/libffi_sys/static.ffi_type_longdouble.html
- 2: https://docs.rs/libffi-sys/latest/libffi_sys/index.html
- 3: https://github.com/libffi/libffi/blob/master/src/x86/ffi64.c
- 4: https://github.com/libffi/libffi/blob/master/src/types.c
- 5: https://docs.rs/libffi/latest/src/libffi/middle/types.rs.html
- 6: https://github.com/libffi/libffi/blob/master/src/x86/ffitarget.h
- 7: https://github.com/libffi/libffi/blob/master/src/aarch64/ffitarget.h
🏁 Script executed:
cat -n crates/host_env/src/ctypes.rs | sed -n '160,200p'Repository: RustPython/RustPython
Length of output: 1789
🏁 Script executed:
cat -n crates/host_env/src/ctypes.rs | sed -n '990,1020p'Repository: RustPython/RustPython
Length of output: 1369
🏁 Script executed:
cat -n crates/host_env/src/ctypes.rs | sed -n '1140,1170p'Repository: RustPython/RustPython
Length of output: 1459
🏁 Script executed:
cat -n crates/host_env/src/ctypes.rs | sed -n '1415,1465p'Repository: RustPython/RustPython
Length of output: 1781
Fix "g" format to use proper long double type instead of marshaling as 8-byte f64.
LONG_DOUBLE_SIZE advertises 16 bytes on x86_64/aarch64 Unix, but the "g" format is marshaled identically to "d" (8-byte double) across all operations:
- Encoding (lines 999-1010): encodes as 8-byte float then pads with zeros
- Decoding (lines 1151-1158): reads only 8 bytes, treating as
c_double - FFI type mapping (line 1452): uses
Type::f64()instead ofType::longdouble()
This causes an ABI mismatch: 16 bytes allocated but only 8 bytes marshaled, corrupting FFI boundaries on function calls. libffi exposes a distinct ffi_type_longdouble type. The fix requires using the proper long double type from libffi across all four sections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/ctypes.rs` around lines 172 - 184, LONG_DOUBLE_SIZE and
the "g" format handlers are inconsistent: LONG_DOUBLE_SIZE should represent the
platform long double size (16 bytes on x86_64/aarch64 Unix) and the "g"
encoder/decoder and FFI mapping must use the proper long double type instead of
marshaling as an 8-byte f64. Update the LONG_DOUBLE_SIZE constant to reflect
c_longdouble size where appropriate, change the "g" encoding path (the block
that currently encodes as 8-byte float then pads) to encode a c_longdouble value
of the correct size and preserve its bytes (no zero-padding/truncation), change
the "g" decoding path (the block that currently reads 8 bytes into c_double) to
read the full c_longdouble size and reconstruct a c_longdouble, and replace any
uses of Type::f64() for the "g"/long double FFI mapping with the proper long
double mapping (Type::longdouble() / libffi ffi_type_longdouble) so libffi
receives the correct ffi_type_longdouble for FFI calls.
| pub fn wstring_from_bytes(buffer: &[u8]) -> String { | ||
| let mut chars = Vec::new(); | ||
| for chunk in buffer.chunks(WCHAR_SIZE) { | ||
| if chunk.len() < WCHAR_SIZE { | ||
| break; | ||
| } | ||
| let Some(code) = wchar_from_bytes(chunk) else { | ||
| break; | ||
| }; | ||
| if code == 0 { | ||
| break; | ||
| } | ||
| if let Some(ch) = char::from_u32(code) { | ||
| chars.push(ch); | ||
| } | ||
| } | ||
| chars.into_iter().collect() | ||
| } | ||
|
|
||
| pub fn wchar_array_field_value(buffer: &[u8]) -> String { | ||
| buffer | ||
| .chunks(WCHAR_SIZE) | ||
| .filter_map(|chunk| { | ||
| wchar_from_bytes(chunk) | ||
| .filter(|&wchar| wchar != 0) | ||
| .and_then(char::from_u32) | ||
| }) | ||
| .collect() |
There was a problem hiding this comment.
Windows wide-string decoding is lossy here.
These helpers convert each wchar_t through char::from_u32. On Windows, wchar_t is a UTF-16 code unit, so non-BMP characters arrive as surrogate pairs and get dropped; lone surrogates are lost too. This needs a UTF-16/WTF-8 path instead of per-code-unit char conversion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/ctypes.rs` around lines 375 - 402, The code in
wstring_from_bytes and wchar_array_field_value treats each WCHAR as an
independent Unicode scalar via char::from_u32, which drops surrogate pairs;
instead collect the sequence of u16 code units produced by wchar_from_bytes
(using WCHAR_SIZE to read each unit), stop at a terminating 0, then decode the
whole Vec<u16> as UTF-16 using either String::from_utf16 or
std::char::decode_utf16 to produce the String (use decode_utf16 to gracefully
handle invalid/ lone surrogates if you want replacement chars); update both
wstring_from_bytes and wchar_array_field_value to build a Vec<u16> and run the
proper UTF-16 decoder rather than mapping each code unit to char individually.
9e07122 to
f34b749
Compare
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [ ] lib: cpython/Lib/ctypes dependencies:
dependent tests: (34 tests)
[x] test: cpython/Lib/test/test_sys.py (TODO: 7) dependencies: dependent tests: (227 tests)
Legend:
|
255a25d to
e964a9d
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (11)
crates/host_env/src/posix.rs (1)
458-469:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
tv_usecbug:as_micros()returns total microseconds, not the fractional part.For e.g.
Duration::new(1, 500_000_000), this yieldstv_sec=1, tv_usec=1_500_000, whichutimeswill reject (EINVAL) or interpret incorrectly. Usesubsec_micros()to match the convention already used byset_file_times_atbelow (line 481).🐛 Proposed fix
let tv = |d: core::time::Duration| libc::timeval { tv_sec: d.as_secs() as _, - tv_usec: d.as_micros() as _, + tv_usec: d.subsec_micros() as _, };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/posix.rs` around lines 458 - 469, The utimes function's timeval builder closure tv uses d.as_micros() for tv_usec which gives total microseconds instead of the fractional microsecond part, causing invalid tv_usec values; change tv to use d.subsec_micros() (or compute d.as_micros() % 1_000_000) for tv_usec so tv_sec is d.as_secs() and tv_usec is the sub-second microsecond portion, matching the convention used by set_file_times_at and preventing EINVAL from utimes.crates/host_env/src/select.rs (2)
96-105:⚠️ Potential issue | 🔴 CriticalCheck
FD_SETSIZEbefore bumping__nfds.This still increments
__nfdsbefore validating capacity. On a full set,set.__fds[n]will panic after the count was already advanced.Suggested fix
let n = set.__nfds; - set.__nfds = n + 1; - set.__fds[n] = fd; + if n < FD_SETSIZE { + set.__fds[n] = fd; + set.__nfds = n + 1; + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/select.rs` around lines 96 - 105, In FD_SET, validate capacity before mutating __nfds: compute n = set.__nfds, check that n < FD_SETSIZE (and/or that set.__fds has room) before writing to set.__fds[n] and incrementing set.__nfds; if the set is full return early (or handle overflow) so you never advance __nfds and then index out of bounds. Reference: function FD_SET, fields __nfds and __fds, and constant FD_SETSIZE.
287-295:⚠️ Potential issue | 🔴 CriticalPublish the entries written into
eventsbefore returning.
rustix::event::epoll::waitwrites into the spare capacity, butevents.len()stays0here. Callers can getOk(n)and still observe an empty vector.Suggested fix
pub fn wait( epoll: &OwnedFd, events: &mut Vec<Event>, timeout: Option<&Timespec>, ) -> Result<usize, WaitError> { events.clear(); match rustix::event::epoll::wait(epoll, rustix::buffer::spare_capacity(events), timeout) { - Ok(n) => Ok(n), + Ok(n) => { + unsafe { events.set_len(n) }; + Ok(n) + } Err(rustix::io::Errno::INTR) => Err(WaitError::Interrupted), Err(err) => Err(WaitError::Io(err.into())), } }Does rustix::event::epoll::wait update a Vec's length when it is given rustix::buffer::spare_capacity(events), or does the caller need to set the Vec length after the call?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/select.rs` around lines 287 - 295, The epoll wait writes into the Vec's spare capacity but never updates events.len(), so callers see an empty vector; in the wait function, after calling rustix::event::epoll::wait and receiving Ok(n), set the Vec length to n (e.g. unsafe { events.set_len(n); }) before returning Ok(n) so the written entries are published (refer to the wait function, the events Vec, and rustix::event::epoll::wait/spare_capacity usage).crates/host_env/src/ctypes.rs (7)
1926-1938:⚠️ Potential issue | 🟠 MajorValidate the COM vtable in release builds before dereferencing it.
The null-vtable check is only a
debug_assert!. In release, a non-null COM pointer with a null/bad vtable turns the dereference and transmute into UB.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 1926 - 1938, In copy_com_pointer, the vtable is only checked with debug_assert! which is skipped in release builds; add a real runtime null-check on the vtable (the value loaded into `vtable` from `iunknown`) before dereferencing or transmuting it, and return an appropriate HRESULT (e.g. HRESULT_E_POINTER) if the vtable is null or invalid; ensure the check happens immediately after computing `iunknown`/`vtable` and before the transmute to `addref_fn` so the unsafe block never dereferences a null/bad vtable in release.
760-815:⚠️ Potential issue | 🟠 MajorGuard
offsetbefore slicingbuffer[offset..].Several branches still slice with
buffer[offset..]before checking thatoffset <= buffer.len(). A bad Python index should hit the existing fallback path, not panic the VM.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 760 - 815, In read_array_element ensure we guard offset before any buffer[offset..] slice to avoid panics: add an early check (e.g. if offset >= buffer.len() return DecodedValue::None) or replace raw slices with safe indexed/get calls where used in the Some("u"), Some("f"), Some("d"|"g") branches and in the default path before calling buffer[offset..].first_chunk or get(..element_size); update the branches using wchar_from_bytes(&buffer[offset..]), first_chunk::<4/8>(), and buffer[offset..].get(..element_size) to use the guarded/safe slice so invalid Python indices hit the existing fallback instead of panicking the VM.
201-205:⚠️ Potential issue | 🔴 Critical
"g"is still marshaled asf64, not Clong double.
LONG_DOUBLE_SIZEadvertises a wider object on non-Windows x86_64/aarch64, but the encoder, decoder, and libffi mapping still treat"g"as an 8-bytef64. That leaves storage size and FFI ABI out of sync.Does the Rust libffi crate expose a distinct Type for C long double, and what are the ABI size/representation rules for C long double on x86_64 and aarch64 Unix targets?Also applies to: 1049-1060, 1198-1205, 1389-1496
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 201 - 205, The code defines LONG_DOUBLE_SIZE as a wider type but the encoder/decoder and libffi mapping still treat the format code "g" as f64; update the marshal logic so "g" maps to a proper C long double representation: modify the encoder/decoder functions that handle format code "g" to branch based on LONG_DOUBLE_SIZE and target ABI (x86_64/aarch64 non-Windows) and use the Rust libffi crate's long-double Type when available (or a size-specific raw buffer fallback), ensuring the libffi type mapping, ffi call setup, and serializer/deserializer all use the same size/representation (80-bit x86 extended, 128-bit on platforms that use 16-byte long double) instead of always using f64; adjust any code that references the "g" case (encoder/decoder handlers and the libffi mapping table) to keep ABI size/representation consistent with LONG_DOUBLE_SIZE and add unit tests for both x86_64 and aarch64 branches.
428-498:⚠️ Potential issue | 🟠 MajorWindows
wchar_thelpers need UTF-16 code-unit handling, not per-code-point casts.These helpers still process Windows wide strings one
u16at a time or cast full code points directly tou16. That drops surrogate pairs and corrupts non-BMP text on Windows.On Windows, is wchar_t a UTF-16 code unit, and how should surrogate pairs be handled when converting wide strings to and from Rust strings/bytes?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 428 - 498, All helpers treat Windows wchar_t as full Unicode code points instead of UTF‑16 code units, which corrupts non‑BMP characters; update wstring_from_bytes, wchar_array_field_value, write_wchar_array_value, encode_wtf8_to_wchar_padded, and wchar_null_terminated_bytes to operate on u16 code units (WCHAR) not u32 code points: read/write chunks via u16::from_le_bytes/into_le_bytes (respecting WCHAR_SIZE), decode sequences using core::char::decode_utf16 (or std::char::decode_utf16) to correctly combine surrogates into Rust chars, and when encoding use Iterator::encode_utf16 on the Rust string or Wtf8 code points to produce u16 units before serializing to bytes; also ensure buffer length checks use u16 counts and write a u16 zero terminator when space permits.
2431-2464:⚠️ Potential issue | 🟠 MajorDon't derive the library key by
transmute_copyinglibloading::Library.This cache key still depends on libloading's private struct layout. A crate update can silently change the value or invalidate the handle semantics behind the map key.
Does libloading::Library provide a supported API to extract a raw handle, or should platform-specific libloading::os::{unix,windows}::Library into_raw/from_raw APIs be used instead?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 2431 - 2464, The current get_pointer uses core::mem::transmute_copy on libloading::Library which relies on private layout; instead, add a dedicated raw handle field to SharedLibrary (e.g., raw_handle: Mutex<Option<usize>> or platform-specific handle type) and set that field when constructing the SharedLibrary in new, new_with_mode and from_raw_handle (use UnixLibrary::into_raw/Library::into_raw on unix/windows or convert the os-specific Library into a raw handle there), then have get_pointer return the stored raw_handle (under lock) rather than transmuting the Library; update constructors to extract and store the raw handle and avoid transmute_copy entirely.
134-147:⚠️ Potential issue | 🔴 CriticalRestore the host error slot after the swapped call returns.
Both swapped helpers copy the ctypes-local value into the OS slot and capture the post-call error, but neither restores the caller's original host
errno/LastError. That leaks foreign-call state back into the embedding thread.Also applies to: 185-198
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 134 - 147, The helpers leak the foreign-call OS error back into the host by not restoring the original OS slot; in with_swapped_errno (and the analogous with_swapped_last_error helper) capture the post-call OS error and store it into CTYPES_LOCAL_ERRNO (or the ctypes-local LastError), then restore the saved host OS error slot (call crate::os::set_errno(saved) / the equivalent set_last_error with the previously saved value) before returning so the embedding thread's OS error state is preserved.
1325-1341:⚠️ Potential issue | 🔴 CriticalUse
c_long/c_ulongwidths for"l"and"L"memory access.These branches still mix fixed-width integers with platform-width C longs. That truncates on LP64 and can overrun callback/result storage on ILP32 targets.
What are the widths of C long and unsigned long on LP64 vs ILP32 targets, and should FFI memory access use libc::c_long/c_ulong instead of fixed i32/i64 types?Also applies to: 2247-2285
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 1325 - 1341, The match arms for type codes "l"/"L" currently write fixed-width i64/u64 but must use platform C long widths; split the combined ("l" | "q") and ("L" | "Q") arms so "l" maps to libc::c_long and "L" to libc::c_ulong (cast the CallbackResultValue v to libc::c_long/ libc::c_ulong and write to result as *mut libc::c_long/*c_ulong), while keeping "q"/"Q" mapped to i64/u64 as before; update the match in ctypes.rs (the arms handling type_code "l","L","q","Q" and CallbackResultValue) so FFI memory access uses libc::c_long/c_ulong for 'l'/'L' and fixed-width i64/u64 for 'q'/'Q'.crates/host_env/src/syslog.rs (1)
41-50:⚠️ Potential issue | 🔴 Critical
syslog()can still racecloselog()and free the ident string too early.Switching to
parking_lot::RwLockremoves poisoning, but it does not fix the existing lifetime race:libc::syslog()still runs without a read lock, andcloselog()still checksis_open()before taking the write lock. A concurrent close can still invalidate the storedidentwhile libc may use it.Proposed fix
pub fn syslog(priority: i32, msg: &CStr) { + let _ident_guard = global_ident().read(); let cformat = c"%s"; unsafe { libc::syslog(priority, cformat.as_ptr(), msg.as_ptr()) }; } pub fn closelog() { - if is_open() { - let mut locked_ident = global_ident().write(); + let mut locked_ident = global_ident().write(); + if locked_ident.is_some() { unsafe { libc::closelog() }; *locked_ident = None; } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/syslog.rs` around lines 41 - 50, The race happens because libc::syslog() is called without holding the ident read lock while closelog() can free the ident; fix by holding the global_ident() RwLock read guard across the libc::syslog() call (i.e. acquire read lock in syslog() and use the held reference/pointer for the call so the ident cannot be dropped concurrently), and change closelog() to acquire the write lock unconditionally and clear/closelog() under that write lock (remove the pre-check via is_open() or make it check under the write lock) so the write lock serializes with syslog()’s read lock; reference functions: syslog, closelog, is_open, global_ident.
🧹 Nitpick comments (5)
crates/host_env/src/io.rs (1)
215-221: ⚡ Quick win
is_interrupted_errorshould useio::ErrorKind::Interruptedfor symmetry with the WouldBlock pattern.Both
is_interrupted_errorandis_would_block_erroruseraw_os_error()comparisons against specificlibcconstants, but Rust'sstd::ioalready abstracts these into portableErrorKindvariants. UsingErrorKindis more idiomatic and avoids missing platform-specific errno aliases.♻️ Proposed fix
pub fn is_interrupted_error(err: &io::Error) -> bool { - err.raw_os_error() == Some(libc::EINTR) + err.kind() == io::ErrorKind::Interrupted } pub fn is_would_block_error(err: &io::Error) -> bool { - err.raw_os_error() == Some(libc::EAGAIN) + err.kind() == io::ErrorKind::WouldBlock }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/io.rs` around lines 215 - 221, Replace the OS errno checks with ErrorKind checks for symmetry and portability: update is_interrupted_error to return err.kind() == io::ErrorKind::Interrupted (matching the would-block pattern used by is_would_block_error) and ensure is_would_block_error uses err.kind() == io::ErrorKind::WouldBlock so both functions use std::io::ErrorKind rather than raw_os_error()/libc constants.crates/host_env/src/posix.rs (3)
705-705: ⚖️ Poor tradeoff
#[allow(clippy::too_many_arguments)]is a smell — wrap the fd plumbing in a struct.Nine arguments, six of which are
i32fds with overlapping meanings, makes call sites very easy to get wrong (especially given thec2pwrite/errwritereordering concern above). A smallChildFds { p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, errpipe_read }struct would let callers use named fields and let you drop the lint suppression.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/posix.rs` at line 705, Replace the #[allow(clippy::too_many_arguments)] suppression by introducing a small struct (e.g. ChildFds { p2c_read, p2c_write, c2p_read, c2p_write, err_read, err_write, errpipe_read }) to encapsulate the six i32 fd parameters and related values; update the functions that currently take nine arguments (the one guarded by the attribute and any helpers that pass those fds) to accept a ChildFds instance (and any remaining non-fd params) and change all call sites to construct ChildFds with named fields, removing the lint suppression and eliminating positional fd confusion.
723-728: 💤 Low valueClosing raw
i32fds withnix::unistd::closeis fine, but failures here abortsetup_child_fdsmid-way.If
close(p2cwrite)succeeds butclose(c2pread)fails, we return early with the remaining fds (errread,errpipe_read) still open in the child. Given this runs post-fork in a child process where most callers will_exiton error, that is probably acceptable, but it would be safer to attempt all closes and only return the first error — matching CPython'schild_execordering.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/posix.rs` around lines 723 - 728, The loop that closes p2cwrite, c2pread, errread and the subsequent close(errpipe_read) should attempt to close every fd and only return the first encountered error instead of bailing out immediately; in setup_child_fds, replace the current direct map_err? usage with logic that iterates the fds [p2cwrite, c2pread, errread, errpipe_read], for each fd >= 0 call nix::unistd::close(fd) and if it returns Err capture the first std::io::Error (via std::io::Error::from) but continue closing the rest, and after all closes return Ok(()) if no error was captured or Err(first_error) if one was.
770-829: ⚡ Quick winInconsistent error type: these return
nix::Resultwhile the rest of the surface returnsstd::io::Result.
chdir,setsid_if_needed,setpgid_if_needed, andsetgroups_if_neededleaknix::Result/nix::Errorinto the public API while almost every other helper in this file maps tostd::io::Error. That forces every caller invm/stdlibto either pull innixdirectly or do their own conversion, defeating the abstraction goal ofhost_env. Same applies tosetregid_if_needed/setreuid_if_needed(lines 831–845).♻️ Proposed change (pattern)
-pub fn chdir(cwd: &CStr) -> nix::Result<()> { - nix::unistd::chdir(cwd) +pub fn chdir(cwd: &CStr) -> std::io::Result<()> { + nix::unistd::chdir(cwd).map_err(std::io::Error::from) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/posix.rs` around lines 770 - 829, The listed functions (chdir, setsid_if_needed, setpgid_if_needed, setgroups_if_needed and the similar setregid_if_needed/setreuid_if_needed) expose nix::Result/nix::Error; change their public signatures to return std::io::Result<(),> and map any nix errors into std::io::Error before returning. Implement mapping like: on nix Err -> if e.as_errno().is_some() use std::io::Error::from_raw_os_error(errno as i32) else use std::io::Error::new(io::ErrorKind::Other, e.to_string()); keep the existing logic and use ?-style by replacing `?` with `.map_err(|e| /*map as described*/)?` so callers get std::io::Result consistently across host_env.crates/host_env/src/posix_wasi.rs (1)
5-103: ⚡ Quick winSignificant duplication with
posix.rs— consider consolidating.Every function defined here (
make_dir,make_dir_at,remove_dir_at,stat_path,stat_fd,set_file_times_at) has a byte-for-byte (or near-identical) counterpart incrates/host_env/src/posix.rs. Notably,posix.rs::set_file_times_atis gated on#[cfg(all(any(target_os = "wasi", unix), not(target_os = "redox")))]andposix.rs::stat_path/make_dirare gated on#[cfg(not(windows))], all of which include wasi — yetlib.rsswaps in this file via#[path = "posix_wasi.rs"]undertarget_os = "wasi", so the wasi-eligible bodies inposix.rsare effectively dead, and any future bug fix (e.g., theutimes/tv_usecissue raised onposix.rs) has to be done in two places.Two viable cleanups:
- Drop this file and tighten the cfgs in
posix.rsso wasi simply uses the shared bodies (preferred where the implementations are identical).- Or, keep the wasi-specific module but factor the common helpers (negative-return →
io::Error::last_os_error(),ts(Duration)builder, theMaybeUninit<libc::stat>boilerplate) into a shared private module so both sides delegate to it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/posix_wasi.rs` around lines 5 - 103, The posix_wasi.rs file duplicates implementations found in posix.rs (make_dir, make_dir_at, remove_dir_at, stat_path, stat_fd, set_file_times_at); remove posix_wasi.rs and consolidate by adjusting cfg attributes in posix.rs so WASI builds use the shared implementations (e.g., expand #[cfg(not(windows))] / #[cfg(all(any(target_os = "wasi", unix), not(target_os = "redox")))] to include target_os = "wasi" where needed), update lib.rs so it refers to the single posix module, and run a build/test to verify there are no cfg conflicts; alternatively, if any WASI-specific divergence is required, extract common helpers (error-from-neg, timespec builder, MaybeUninit<libc::stat> handling) into a small private helper module and have both posix.rs and posix_wasi.rs call those helpers instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/host_env/src/crt_fd_unsupported.rs`:
- Around line 65-71: The try_from_raw constructor currently only treats -1 as
invalid; change its validation to reject any negative Raw value (fd < 0) and
return ebadf() instead of constructing the wrapper; apply the same fix to the
other try_* constructor(s) in this file (see the other
try_from_raw/try_from_raw-like function around the 121-127 region) so all
negative file descriptors (e.g., -2, -3) produce Err(ebadf()) rather than
Ok(...).
In `@crates/host_env/src/io_unsupported.rs`:
- Around line 84-126: In parse_fileio_mode the branch handling the binary
modifier (b'b' => {}) currently allows it to appear multiple times (e.g.,
"rbb"); add a boolean flag (e.g., b_seen) initialized false and update the b
branch to return Err(FileModeError::Invalid) if b_seen is already true,
otherwise set b_seen = true (so the loop rejects duplicate 'b' modifiers);
reference the parse_fileio_mode function and the b'b' match arm when making this
change.
In `@crates/host_env/src/posix_wasi.rs`:
- Around line 32-68: stat_path currently treats a CString::new(...) failure
(embedded NUL) as Ok(None); change it to return an error instead by converting
the CString error into an io::Error with InvalidInput (or EINVAL) so callers can
distinguish interior-NUL input from "no result"—update the early match in
stat_path (the CString::new(...) branch) to return
Err(io::Error::from(io::ErrorKind::InvalidInput)) and make the same change in
the duplicate implementation in posix.rs (the stat_path there) for consistency.
In `@crates/host_env/src/posix.rs`:
- Around line 1655-1672: close_fds_brute_force assumes keep is in ascending
order (and should_keep uses binary_search_by_key), so if keep is unsorted some
fds will not be closed; either enforce the precondition at the public entry
(close_fds) by sorting the slice (or collecting and sorting by
BorrowedFd::as_raw_fd) before passing to should_keep/close_fds_brute_force, or
add a debug-assert in close_fds that verifies keep is sorted ascending (compare
successive as_raw_fd values) and document the requirement in close_fds'
docstring; update callers if you change the API contract.
In `@crates/stdlib/src/mmap.rs`:
- Around line 468-484: The code incorrectly treats fileno == 0 as anonymous;
change the branch that computes fh so only fileno == -1 is considered anonymous
(i.e. remove the fileno != 0 check) so that fileno 0 goes through
host_nt::handle_from_fd and the subsequent host_mmap::is_invalid_handle_value
check; update the surrounding comment to reflect that only -1 signals anonymous
mapping and keep using host_nt::handle_from_fd,
host_mmap::is_invalid_handle_value and vm.new_os_error as before.
---
Duplicate comments:
In `@crates/host_env/src/ctypes.rs`:
- Around line 1926-1938: In copy_com_pointer, the vtable is only checked with
debug_assert! which is skipped in release builds; add a real runtime null-check
on the vtable (the value loaded into `vtable` from `iunknown`) before
dereferencing or transmuting it, and return an appropriate HRESULT (e.g.
HRESULT_E_POINTER) if the vtable is null or invalid; ensure the check happens
immediately after computing `iunknown`/`vtable` and before the transmute to
`addref_fn` so the unsafe block never dereferences a null/bad vtable in release.
- Around line 760-815: In read_array_element ensure we guard offset before any
buffer[offset..] slice to avoid panics: add an early check (e.g. if offset >=
buffer.len() return DecodedValue::None) or replace raw slices with safe
indexed/get calls where used in the Some("u"), Some("f"), Some("d"|"g") branches
and in the default path before calling buffer[offset..].first_chunk or
get(..element_size); update the branches using
wchar_from_bytes(&buffer[offset..]), first_chunk::<4/8>(), and
buffer[offset..].get(..element_size) to use the guarded/safe slice so invalid
Python indices hit the existing fallback instead of panicking the VM.
- Around line 201-205: The code defines LONG_DOUBLE_SIZE as a wider type but the
encoder/decoder and libffi mapping still treat the format code "g" as f64;
update the marshal logic so "g" maps to a proper C long double representation:
modify the encoder/decoder functions that handle format code "g" to branch based
on LONG_DOUBLE_SIZE and target ABI (x86_64/aarch64 non-Windows) and use the Rust
libffi crate's long-double Type when available (or a size-specific raw buffer
fallback), ensuring the libffi type mapping, ffi call setup, and
serializer/deserializer all use the same size/representation (80-bit x86
extended, 128-bit on platforms that use 16-byte long double) instead of always
using f64; adjust any code that references the "g" case (encoder/decoder
handlers and the libffi mapping table) to keep ABI size/representation
consistent with LONG_DOUBLE_SIZE and add unit tests for both x86_64 and aarch64
branches.
- Around line 428-498: All helpers treat Windows wchar_t as full Unicode code
points instead of UTF‑16 code units, which corrupts non‑BMP characters; update
wstring_from_bytes, wchar_array_field_value, write_wchar_array_value,
encode_wtf8_to_wchar_padded, and wchar_null_terminated_bytes to operate on u16
code units (WCHAR) not u32 code points: read/write chunks via
u16::from_le_bytes/into_le_bytes (respecting WCHAR_SIZE), decode sequences using
core::char::decode_utf16 (or std::char::decode_utf16) to correctly combine
surrogates into Rust chars, and when encoding use Iterator::encode_utf16 on the
Rust string or Wtf8 code points to produce u16 units before serializing to
bytes; also ensure buffer length checks use u16 counts and write a u16 zero
terminator when space permits.
- Around line 2431-2464: The current get_pointer uses core::mem::transmute_copy
on libloading::Library which relies on private layout; instead, add a dedicated
raw handle field to SharedLibrary (e.g., raw_handle: Mutex<Option<usize>> or
platform-specific handle type) and set that field when constructing the
SharedLibrary in new, new_with_mode and from_raw_handle (use
UnixLibrary::into_raw/Library::into_raw on unix/windows or convert the
os-specific Library into a raw handle there), then have get_pointer return the
stored raw_handle (under lock) rather than transmuting the Library; update
constructors to extract and store the raw handle and avoid transmute_copy
entirely.
- Around line 134-147: The helpers leak the foreign-call OS error back into the
host by not restoring the original OS slot; in with_swapped_errno (and the
analogous with_swapped_last_error helper) capture the post-call OS error and
store it into CTYPES_LOCAL_ERRNO (or the ctypes-local LastError), then restore
the saved host OS error slot (call crate::os::set_errno(saved) / the equivalent
set_last_error with the previously saved value) before returning so the
embedding thread's OS error state is preserved.
- Around line 1325-1341: The match arms for type codes "l"/"L" currently write
fixed-width i64/u64 but must use platform C long widths; split the combined ("l"
| "q") and ("L" | "Q") arms so "l" maps to libc::c_long and "L" to libc::c_ulong
(cast the CallbackResultValue v to libc::c_long/ libc::c_ulong and write to
result as *mut libc::c_long/*c_ulong), while keeping "q"/"Q" mapped to i64/u64
as before; update the match in ctypes.rs (the arms handling type_code
"l","L","q","Q" and CallbackResultValue) so FFI memory access uses
libc::c_long/c_ulong for 'l'/'L' and fixed-width i64/u64 for 'q'/'Q'.
In `@crates/host_env/src/posix.rs`:
- Around line 458-469: The utimes function's timeval builder closure tv uses
d.as_micros() for tv_usec which gives total microseconds instead of the
fractional microsecond part, causing invalid tv_usec values; change tv to use
d.subsec_micros() (or compute d.as_micros() % 1_000_000) for tv_usec so tv_sec
is d.as_secs() and tv_usec is the sub-second microsecond portion, matching the
convention used by set_file_times_at and preventing EINVAL from utimes.
In `@crates/host_env/src/select.rs`:
- Around line 96-105: In FD_SET, validate capacity before mutating __nfds:
compute n = set.__nfds, check that n < FD_SETSIZE (and/or that set.__fds has
room) before writing to set.__fds[n] and incrementing set.__nfds; if the set is
full return early (or handle overflow) so you never advance __nfds and then
index out of bounds. Reference: function FD_SET, fields __nfds and __fds, and
constant FD_SETSIZE.
- Around line 287-295: The epoll wait writes into the Vec's spare capacity but
never updates events.len(), so callers see an empty vector; in the wait
function, after calling rustix::event::epoll::wait and receiving Ok(n), set the
Vec length to n (e.g. unsafe { events.set_len(n); }) before returning Ok(n) so
the written entries are published (refer to the wait function, the events Vec,
and rustix::event::epoll::wait/spare_capacity usage).
In `@crates/host_env/src/syslog.rs`:
- Around line 41-50: The race happens because libc::syslog() is called without
holding the ident read lock while closelog() can free the ident; fix by holding
the global_ident() RwLock read guard across the libc::syslog() call (i.e.
acquire read lock in syslog() and use the held reference/pointer for the call so
the ident cannot be dropped concurrently), and change closelog() to acquire the
write lock unconditionally and clear/closelog() under that write lock (remove
the pre-check via is_open() or make it check under the write lock) so the write
lock serializes with syslog()’s read lock; reference functions: syslog,
closelog, is_open, global_ident.
---
Nitpick comments:
In `@crates/host_env/src/io.rs`:
- Around line 215-221: Replace the OS errno checks with ErrorKind checks for
symmetry and portability: update is_interrupted_error to return err.kind() ==
io::ErrorKind::Interrupted (matching the would-block pattern used by
is_would_block_error) and ensure is_would_block_error uses err.kind() ==
io::ErrorKind::WouldBlock so both functions use std::io::ErrorKind rather than
raw_os_error()/libc constants.
In `@crates/host_env/src/posix_wasi.rs`:
- Around line 5-103: The posix_wasi.rs file duplicates implementations found in
posix.rs (make_dir, make_dir_at, remove_dir_at, stat_path, stat_fd,
set_file_times_at); remove posix_wasi.rs and consolidate by adjusting cfg
attributes in posix.rs so WASI builds use the shared implementations (e.g.,
expand #[cfg(not(windows))] / #[cfg(all(any(target_os = "wasi", unix),
not(target_os = "redox")))] to include target_os = "wasi" where needed), update
lib.rs so it refers to the single posix module, and run a build/test to verify
there are no cfg conflicts; alternatively, if any WASI-specific divergence is
required, extract common helpers (error-from-neg, timespec builder,
MaybeUninit<libc::stat> handling) into a small private helper module and have
both posix.rs and posix_wasi.rs call those helpers instead.
In `@crates/host_env/src/posix.rs`:
- Line 705: Replace the #[allow(clippy::too_many_arguments)] suppression by
introducing a small struct (e.g. ChildFds { p2c_read, p2c_write, c2p_read,
c2p_write, err_read, err_write, errpipe_read }) to encapsulate the six i32 fd
parameters and related values; update the functions that currently take nine
arguments (the one guarded by the attribute and any helpers that pass those fds)
to accept a ChildFds instance (and any remaining non-fd params) and change all
call sites to construct ChildFds with named fields, removing the lint
suppression and eliminating positional fd confusion.
- Around line 723-728: The loop that closes p2cwrite, c2pread, errread and the
subsequent close(errpipe_read) should attempt to close every fd and only return
the first encountered error instead of bailing out immediately; in
setup_child_fds, replace the current direct map_err? usage with logic that
iterates the fds [p2cwrite, c2pread, errread, errpipe_read], for each fd >= 0
call nix::unistd::close(fd) and if it returns Err capture the first
std::io::Error (via std::io::Error::from) but continue closing the rest, and
after all closes return Ok(()) if no error was captured or Err(first_error) if
one was.
- Around line 770-829: The listed functions (chdir, setsid_if_needed,
setpgid_if_needed, setgroups_if_needed and the similar
setregid_if_needed/setreuid_if_needed) expose nix::Result/nix::Error; change
their public signatures to return std::io::Result<(),> and map any nix errors
into std::io::Error before returning. Implement mapping like: on nix Err -> if
e.as_errno().is_some() use std::io::Error::from_raw_os_error(errno as i32) else
use std::io::Error::new(io::ErrorKind::Other, e.to_string()); keep the existing
logic and use ?-style by replacing `?` with `.map_err(|e| /*map as
described*/)?` so callers get std::io::Result consistently across host_env.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: ad3c2519-cf50-4a2f-8b2f-9ea56675d28c
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.lockLib/test/test_ctypes/test_dlerror.pyis excluded by!Lib/**
📒 Files selected for processing (79)
.cspell.json.github/workflows/ci.yamlcrates/host_env/Cargo.tomlcrates/host_env/src/cert_store.rscrates/host_env/src/crt_fd_unsupported.rscrates/host_env/src/ctypes.rscrates/host_env/src/errno.rscrates/host_env/src/faulthandler.rscrates/host_env/src/fcntl.rscrates/host_env/src/grp.rscrates/host_env/src/io.rscrates/host_env/src/io_unsupported.rscrates/host_env/src/lib.rscrates/host_env/src/locale.rscrates/host_env/src/mmap.rscrates/host_env/src/msvcrt.rscrates/host_env/src/multiprocessing.rscrates/host_env/src/nt.rscrates/host_env/src/os.rscrates/host_env/src/overlapped.rscrates/host_env/src/posix.rscrates/host_env/src/posix_wasi.rscrates/host_env/src/pwd.rscrates/host_env/src/resource.rscrates/host_env/src/select.rscrates/host_env/src/signal.rscrates/host_env/src/socket.rscrates/host_env/src/syslog.rscrates/host_env/src/termios.rscrates/host_env/src/testconsole.rscrates/host_env/src/thread.rscrates/host_env/src/time.rscrates/host_env/src/winapi.rscrates/host_env/src/windows.rscrates/host_env/src/winreg.rscrates/host_env/src/wmi.rscrates/stdlib/Cargo.tomlcrates/stdlib/src/_testconsole.rscrates/stdlib/src/faulthandler.rscrates/stdlib/src/fcntl.rscrates/stdlib/src/grp.rscrates/stdlib/src/locale.rscrates/stdlib/src/mmap.rscrates/stdlib/src/multiprocessing.rscrates/stdlib/src/openssl.rscrates/stdlib/src/overlapped.rscrates/stdlib/src/posixsubprocess.rscrates/stdlib/src/resource.rscrates/stdlib/src/select.rscrates/stdlib/src/socket.rscrates/stdlib/src/ssl.rscrates/stdlib/src/termios.rscrates/vm/Cargo.tomlcrates/vm/src/exceptions.rscrates/vm/src/getpath.rscrates/vm/src/readline.rscrates/vm/src/signal.rscrates/vm/src/stdlib/_codecs.rscrates/vm/src/stdlib/_ctypes.rscrates/vm/src/stdlib/_ctypes/array.rscrates/vm/src/stdlib/_ctypes/base.rscrates/vm/src/stdlib/_ctypes/function.rscrates/vm/src/stdlib/_ctypes/library.rscrates/vm/src/stdlib/_ctypes/pointer.rscrates/vm/src/stdlib/_ctypes/simple.rscrates/vm/src/stdlib/_io.rscrates/vm/src/stdlib/_signal.rscrates/vm/src/stdlib/_stat.rscrates/vm/src/stdlib/_thread.rscrates/vm/src/stdlib/_winapi.rscrates/vm/src/stdlib/_wmi.rscrates/vm/src/stdlib/builtins.rscrates/vm/src/stdlib/errno.rscrates/vm/src/stdlib/msvcrt.rscrates/vm/src/stdlib/nt.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rscrates/vm/src/stdlib/pwd.rscrates/vm/src/stdlib/sys.rs
✅ Files skipped from review due to trivial changes (3)
- .cspell.json
- crates/stdlib/src/fcntl.rs
- crates/stdlib/src/_testconsole.rs
🚧 Files skipped from review as they are similar to previous changes (27)
- .github/workflows/ci.yaml
- crates/host_env/src/testconsole.rs
- crates/host_env/src/pwd.rs
- crates/host_env/src/wmi.rs
- crates/host_env/src/cert_store.rs
- crates/host_env/src/msvcrt.rs
- crates/host_env/src/thread.rs
- crates/stdlib/src/locale.rs
- crates/host_env/src/os.rs
- crates/host_env/src/errno.rs
- crates/host_env/src/grp.rs
- crates/stdlib/src/grp.rs
- crates/host_env/src/windows.rs
- crates/host_env/src/fcntl.rs
- crates/host_env/src/locale.rs
- crates/host_env/src/winreg.rs
- crates/host_env/src/resource.rs
- crates/host_env/src/mmap.rs
- crates/host_env/src/faulthandler.rs
- crates/host_env/src/lib.rs
- crates/host_env/src/overlapped.rs
- crates/host_env/src/multiprocessing.rs
- crates/stdlib/src/faulthandler.rs
- crates/host_env/src/signal.rs
- crates/host_env/src/time.rs
- crates/host_env/src/winapi.rs
- crates/host_env/src/socket.rs
| // Get file handle from fileno | ||
| // fileno -1 or 0 means anonymous mapping | ||
| let fh: Option<HANDLE> = if fileno != -1 && fileno != 0 { | ||
| // Convert CRT file descriptor to Windows HANDLE | ||
| let fh: Option<host_mmap::Handle> = if fileno != -1 && fileno != 0 { | ||
| // Convert CRT file descriptor to a Windows file mapping handle. | ||
| // Use suppress_iph! to avoid crashes when the fd is invalid. | ||
| // This is critical because socket fds wrapped via _open_osfhandle | ||
| // may cause crashes in _get_osfhandle on Windows. | ||
| // See Python bug https://bugs.python.org/issue30114 | ||
| let handle = unsafe { suppress_iph!(libc::get_osfhandle(fileno)) }; | ||
| let handle = host_nt::handle_from_fd(fileno); | ||
| // Check for invalid handle value (-1 on Windows) | ||
| if handle == -1 || handle == INVALID_HANDLE_VALUE as isize { | ||
| if host_mmap::is_invalid_handle_value(handle as isize) { | ||
| return Err(vm.new_os_error(format!("Invalid file descriptor: {}", fileno))); | ||
| } | ||
| Some(handle as HANDLE) | ||
| Some(handle as host_mmap::Handle) | ||
| } else { | ||
| None | ||
| }; |
There was a problem hiding this comment.
Don't treat file descriptor 0 as anonymous mapping on Windows.
-1 is the anonymous sentinel here. 0 is still a valid CRT fd, so this branch currently turns mmap.mmap(0, ...) into an anonymous map instead of using the underlying handle.
Proposed fix
- let fh: Option<host_mmap::Handle> = if fileno != -1 && fileno != 0 {
+ let fh: Option<host_mmap::Handle> = if fileno != -1 {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Get file handle from fileno | |
| // fileno -1 or 0 means anonymous mapping | |
| let fh: Option<HANDLE> = if fileno != -1 && fileno != 0 { | |
| // Convert CRT file descriptor to Windows HANDLE | |
| let fh: Option<host_mmap::Handle> = if fileno != -1 && fileno != 0 { | |
| // Convert CRT file descriptor to a Windows file mapping handle. | |
| // Use suppress_iph! to avoid crashes when the fd is invalid. | |
| // This is critical because socket fds wrapped via _open_osfhandle | |
| // may cause crashes in _get_osfhandle on Windows. | |
| // See Python bug https://bugs.python.org/issue30114 | |
| let handle = unsafe { suppress_iph!(libc::get_osfhandle(fileno)) }; | |
| let handle = host_nt::handle_from_fd(fileno); | |
| // Check for invalid handle value (-1 on Windows) | |
| if handle == -1 || handle == INVALID_HANDLE_VALUE as isize { | |
| if host_mmap::is_invalid_handle_value(handle as isize) { | |
| return Err(vm.new_os_error(format!("Invalid file descriptor: {}", fileno))); | |
| } | |
| Some(handle as HANDLE) | |
| Some(handle as host_mmap::Handle) | |
| } else { | |
| None | |
| }; | |
| // Get file handle from fileno | |
| // fileno -1 means anonymous mapping | |
| let fh: Option<host_mmap::Handle> = if fileno != -1 { | |
| // Convert CRT file descriptor to a Windows file mapping handle. | |
| // Use suppress_iph! to avoid crashes when the fd is invalid. | |
| // This is critical because socket fds wrapped via _open_osfhandle | |
| // may cause crashes in _get_osfhandle on Windows. | |
| // See Python bug https://bugs.python.org/issue30114 | |
| let handle = host_nt::handle_from_fd(fileno); | |
| // Check for invalid handle value (-1 on Windows) | |
| if host_mmap::is_invalid_handle_value(handle as isize) { | |
| return Err(vm.new_os_error(format!("Invalid file descriptor: {}", fileno))); | |
| } | |
| Some(handle as host_mmap::Handle) | |
| } else { | |
| None | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/stdlib/src/mmap.rs` around lines 468 - 484, The code incorrectly
treats fileno == 0 as anonymous; change the branch that computes fh so only
fileno == -1 is considered anonymous (i.e. remove the fileno != 0 check) so that
fileno 0 goes through host_nt::handle_from_fd and the subsequent
host_mmap::is_invalid_handle_value check; update the surrounding comment to
reflect that only -1 signals anonymous mapping and keep using
host_nt::handle_from_fd, host_mmap::is_invalid_handle_value and vm.new_os_error
as before.
There was a problem hiding this comment.
Actionable comments posted: 14
♻️ Duplicate comments (4)
crates/host_env/src/ctypes.rs (3)
1973-1983:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate the AddRef slot before transmuting it.
The null-vtable guard is good, but
*vtable.add(1)can still be null. Transmuting and calling that in a safe function is still UB in release builds.Suggested fix
if src_ptr != 0 { unsafe { let iunknown = src_ptr as *mut *const usize; let vtable = *iunknown; if vtable.is_null() { return HRESULT_E_POINTER; } - let addref_fn: extern "system" fn(*mut c_void) -> u32 = - core::mem::transmute(*vtable.add(1)); + let addref = *vtable.add(1); + if addref == 0 { + return HRESULT_E_POINTER; + } + let addref_fn: extern "system" fn(*mut c_void) -> u32 = + core::mem::transmute(addref); addref_fn(src_ptr as *mut c_void); } }For a COM IUnknown pointer, is it valid to call AddRef without checking that the vtable slot is non-null first, or should a safe wrapper reject a null AddRef slot?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 1973 - 1983, The code checks vtable nullity but still directly dereferences *vtable.add(1) then transmutes it to addref_fn; instead validate the AddRef slot before transmuting by reading let slot = *vtable.add(1) (or equivalent), check if slot.is_null() and return HRESULT_E_POINTER if so, only then transmute slot into the extern "system" fn and call it; refer to symbols src_ptr, iunknown, vtable, addref_fn, and HRESULT_E_POINTER when making the change.
2510-2517:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDon't key the library cache from
libloading::Library's private layout.
transmute_copy::<Library, usize>is not a supported libloading API. The cache key can change under a libloading update, which makes dedup and unload behavior depend on private struct layout instead of a stable handle/ID.Does libloading's safe Library wrapper expose a supported raw-handle API, or should callers use libloading::os::{unix,windows}::Library::into_raw/from_raw (or a separate stable ID) instead of transmuting Library?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 2510 - 2517, The get_pointer method is currently using unsafe transmute_copy on libloading::Library (get_pointer) which relies on Library's private layout; replace this by deriving a stable, supported cache key: either convert the Library to a raw handle using the platform-specific supported API (libloading::os::{unix, windows}::Library::into_raw and from_raw) or maintain a separate stable ID (e.g., incrementing usize or UUID) stored alongside the Library in the cache; update get_pointer to return that stable handle/ID (or the platform raw handle obtained via the supported into_raw/from_raw/appropriate accessor) instead of transmuting Library. Ensure you do not accidentally move/consume the Library when getting the key (if using into_raw/from_raw, reconstruct the Library with from_raw immediately) and update any cache insert/remove logic that currently expects the transmuted value.
205-221:⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift
"g"is still ABI-incompatible on non-Windows 64-bit targets.
LONG_DOUBLE_SIZEadvertises a widerlong double, but serialization, deserialization, and libffi type mapping still all go through 8-bytef64. That leaves the"g"path with the wrong in-memory layout and wrong FFI ABI.Does the Rust libffi crate expose a distinct long-double type, and what is the ABI width/layout of C long double on x86_64/aarch64 Unix targets?Also applies to: 1085-1097, 1234-1241, 1431-1537
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/ctypes.rs` around lines 205 - 221, The code advertises LONG_DOUBLE_SIZE as wider on Unix x86_64/aarch64 but still treats format code "g" as f64 everywhere; update all places that serialize/deserialize and map C types for the "g" (long double) ABI to use LONG_DOUBLE_SIZE and the correct Rust/C type (c_longdouble or a u128/bytes buffer matching LONG_DOUBLE_SIZE) and map to the libffi long-double ffi type (or create a custom ffi::Type for long double) instead of f64; specifically change the handling sites that reference LONG_DOUBLE_SIZE and the format code "g" (serialization/deserialization routines and FFI type mapping functions where "g" is currently sent to f64) so they use LONG_DOUBLE_SIZE, use libc::c_longdouble or an appropriately sized byte representation, and select the libffi long-double variant (or return an error if the libffi crate lacks long-double support).crates/host_env/src/posix.rs (1)
693-703:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThe fast paths still assume
keepis sorted.
should_keep()usesbinary_search_by_key(), and bothclose_dir_fds()/close_filetable_fds()call it before the brute-force fallback. The new debug-assert inclose_fds_brute_force()therefore doesn't protect the common path.Suggested fix
pub fn close_fds(above: i32, keep: &[BorrowedFd<'_>]) { + let mut keep = keep.to_vec(); + keep.sort_unstable_by_key(BorrowedFd::as_raw_fd); + #[cfg(not(target_os = "redox"))] - if close_dir_fds(above, keep).is_ok() { + if close_dir_fds(above, &keep).is_ok() { return; } #[cfg(target_os = "redox")] - if close_filetable_fds(above, keep).is_ok() { + if close_filetable_fds(above, &keep).is_ok() { return; } - close_fds_brute_force(above, keep) + close_fds_brute_force(above, &keep) }Also applies to: 1571-1575, 1578-1615, 1617-1678
🧹 Nitpick comments (1)
crates/host_env/src/thread.rs (1)
12-13: ⚡ Quick win
unwrap_or(c_name)is unreachable dead code.
c_name.as_bytes()is already null-byte-free (guaranteed by theCString::new(name)call above), so any sub-slice of it also has no null bytes, andCString::new(&c_name.as_bytes()[..15])can never returnErr. Theunwrap_orfallback is never taken.Replace with
unwrap()(or.expect(...)) to make the invariant explicit:♻️ Proposed fix
- let truncated = if c_name.as_bytes().len() > 15 { - CString::new(&c_name.as_bytes()[..15]).unwrap_or(c_name) - } else { - c_name - }; + let truncated = if c_name.as_bytes().len() > 15 { + // SAFETY: the slice of a null-free byte string is also null-free. + CString::new(&c_name.as_bytes()[..15]).expect("slice of null-free CString is null-free") + } else { + c_name + };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/thread.rs` around lines 12 - 13, The fallback unwrap_or(c_name) is unreachable because c_name was constructed via CString::new so its byte sub-slices cannot contain nulls; update the truncation to use CString::new(&c_name.as_bytes()[..15]).unwrap() (or .expect("...")) so the invariant is explicit and dead code is removed—change the expression that assigns truncated (using c_name and CString::new) to call unwrap()/expect() instead of unwrap_or(...).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/host_env/src/cert_store.rs`:
- Around line 101-121: When CertEnumCRLsInStore(store, crl_context) returns NULL
you must call GetLastError() to distinguish normal end-of-enumeration from an
actual error; update the loop/exit logic around crl_context so that after
breaking on a NULL you call GetLastError() and if it returns ERROR_SUCCESS (0)
continue to return Ok(result), otherwise propagate an Err (convert the Win32
error code into your crate's error type). Ensure CertCloseStore(store, 0) still
runs in both success and error paths (use a single cleanup block or early-return
pattern that closes the store before returning the Err). Reference:
CertEnumCRLsInStore, crl_context, CertCloseStore, GetLastError, ERROR_SUCCESS.
- Around line 41-47: The encoding_type function currently does exact-match
matching and must instead test dwCertEncodingType bitflags: change fn
encoding_type(raw: u32) -> EncodingType to use bitwise checks (e.g., if raw &
X509_ASN_ENCODING != 0 { return EncodingType::X509Asn } else if raw &
PKCS_7_ASN_ENCODING != 0 { return EncodingType::Pkcs7Asn } else {
EncodingType::Other(raw) }), so combined flags like X509_ASN_ENCODING |
PKCS_7_ASN_ENCODING are correctly classified (prefer X509Asn when both are
present, or adjust preference if you have a different rule).
In `@crates/host_env/src/faulthandler.rs`:
- Around line 396-411: The Windows path in enable_fatal_handlers currently
returns false on the first install_signal_handler failure but leaves earlier
entries marked enabled (leaking partial installs); modify enable_fatal_handlers
to mirror the Unix rollback: keep a local Vec (imported with #[cfg(any(unix,
windows))]) of indices or entries that were successfully installed, and if an
install_signal_handler call returns Err, iterate the recorded installs and call
disable_fatal_signal (or otherwise restore entry.previous and set entry.enabled
= false) for each to revert them, then return false; ensure you update each
FATAL_SIGNAL_HANDLERS entry's previous/ enabled fields only after a successful
install and clear them during rollback.
- Line 90: USER_SIGNAL_CAPACITY is set to 64 which excludes SIGRTMAX (typically
64) — bump USER_SIGNAL_CAPACITY to 65 so the valid signum space includes 64,
update any signum-range validations such as the C-level check (the guard using
`signum as usize >= USER_SIGNAL_CAPACITY`) and corresponding Python-level ranges
(replace exclusive upper bounds like `1..64` with inclusive ranges or
`1..=64`/adjusted checks) so the highest real-time signal can be registered.
In `@crates/host_env/src/locale.rs`:
- Around line 25-30: Remove the six unused trailing fields int_p_cs_precedes,
int_p_sep_by_space, int_n_cs_precedes, int_n_sep_by_space, int_p_sign_posn,
int_n_sign_posn from the RawLconv struct to match the actual runtime lconv
layout (ensure localeconv_data and LocaleConv use only fields up through
n_sign_posn); update strxfrm so it returns a Vec<u8> that does NOT include the
trailing NUL byte (or change the caller in stdlib's strxfrm usage so it strips
the final NUL before calling String::from_utf8), and change decode_ansi_bytes so
that an empty input returns Some(String::new()) instead of None.
- Around line 116-124: The strxfrm function currently allocates twice and
returns a buffer with a trailing NUL; change it to query required size once by
calling libc::strxfrm with a null dest (size 0) to get n, allocate a single
buffer of n+1 bytes, call libc::strxfrm once to fill it, then return a Vec<u8>
truncated to n (omitting the terminating NUL) so consumers (e.g., the code using
String::from_utf8(buff)) receive no embedded NUL; update the function strxfrm to
perform the null-query, single allocation, one fill call, and return only the n
bytes.
In `@crates/host_env/src/multiprocessing.rs`:
- Around line 128-131: The unlink=true branch currently calls unsafe {
libc::sem_unlink(cname.as_ptr()) } and ignores failures, leaking the name;
update this path in the function (the create/open routine that handles the
unlink flag) to check the return value of libc::sem_unlink, and on non‑zero
return convert errno into a Rust error (e.g., std::io::Error::from_raw_os_error
or nix::Error) and return Err instead of proceeding; reference the cname
variable and sem_unlink call so the error is propagated to callers rather than
returning success with a leaked name.
- Around line 383-394: In deadline_from_timeout, first validate the input:
reject non-finite or negative timeouts (use timeout.is_finite() and timeout >=
0.0) and return Err(SemError) for invalid values; then compute seconds and
nanoseconds with safe conversions and checked arithmetic instead of direct casts
— use timeout.floor() to get sec_f64 and convert to a libc time type with a
checked/saturating conversion, compute nsec = round((timeout - sec_f64) * 1e9)
into a libc integer, and use checked_add/checked_mul when combining tv.tv_usec *
1000 + nsec and when adding sec to tv.tv_sec; normalize tv_nsec (<
1_000_000_000) and if any checked operation overflows return Err(SemError).
Reference deadline_from_timeout, gettimeofday, tv_sec, tv_usec and tv_nsec when
applying these checks and replacements.
- Around line 88-96: Replace use of POSIX/Win32 last error for WinSock calls: in
paths handling closesocket, recv, and send, call WSAGetLastError() (from
windows_sys::Win32::Networking::WinSock) instead of
GetLastError()/last_os_error(); update any error logging/returns to use that
value. Also prevent truncating usize to i32 for socket buffer lengths in the
recv/send call sites (and any places that cast buffer lengths): check the usize
against i32::MAX and either clamp or return an appropriate error if it exceeds
i32::MAX, or use try_into() with explicit handling to avoid wrapping; look for
usages of closesocket, recv, send and the existing usize as i32 casts and fix
them accordingly. Ensure imports are updated to include WSAGetLastError and
remove reliance on INVALID GetLastError where appropriate.
In `@crates/host_env/src/overlapped.rs`:
- Around line 169-223: The Operation's completed flag must be cleared before
beginning a new async call; in connect_named_pipe, write, and read set
self.completed = false immediately before initiating the OS call (i.e. before
calling start_connect_named_pipe / start_write_file / start_read_file) so a
reused Operation won't treat a fresh I/O as already completed and truncate/zero
buffers prematurely; keep existing error handling and pending logic unchanged.
- Around line 54-61: The Drop impl currently frees Operation while kernel I/O
may still be pending; ensure Drop does not free overlapped/ buffers until the
kernel finishes: when scheduling read()/write()/connect_named_pipe() set
Operation.completed = false and Operation.pending = true on ERROR_IO_PENDING; in
Drop, if pending is true call CancelIoEx(self.handle,
self.overlapped.as_mut_ptr()) and then wait for completion (e.g.,
WaitForSingleObject on overlapped.hEvent or GetOverlappedResult with bWait =
TRUE) before closing the event handle and freeing memory so the kernel cannot
write into freed buffers; also reset completed = false whenever reusing
Operation for a new read/write so get_result(false) cannot return stale state.
In `@crates/host_env/src/testconsole.rs`:
- Around line 24-39: The loop calling WriteConsoleInputW in testconsole.rs can
spin if the call succeeds but writes zero events; after the unsafe call and
after checking res == 0, add a guard that if wrote == 0 you return
Err(io::Error::new(io::ErrorKind::WriteZero, "WriteConsoleInputW made no
progress")) so the function does not loop forever; update the block around the
WriteConsoleInputW call (the wrote and total variables and their handling) to
return this WriteZero error when no progress is observed.
In `@crates/host_env/src/thread.rs`:
- Around line 12-13: The current byte-slice truncation
(&c_name.as_bytes()[..15]) can cut a multi-byte UTF‑8 character; change to
truncate at the last valid UTF‑8 character boundary <=15 bytes: compute a byte
index end = 15 then while !c_name.is_char_boundary(end) { end -= 1 } (guarding
if end becomes 0) and use CString::new(&c_name[..end]).unwrap_or(c_name) to
build truncated; keep the same variable names (c_name, truncated) so the rest of
the code is unchanged.
- Around line 1-2: Replace the alloc import with use std::ffi::CString; and
adjust the thread-name truncation to avoid slicing in the middle of UTF-8
codepoints: convert the name to a String and truncate by characters (e.g.,
name.chars().take(15).collect::<String>()) before calling CString::new; then
remove the redundant unwrap_or(c_name) since CString::new on a null-free &str
cannot fail. Update references to the old import, the .as_bytes()[..15] slice,
and the CString::new(...).unwrap_or(c_name) call accordingly.
---
Duplicate comments:
In `@crates/host_env/src/ctypes.rs`:
- Around line 1973-1983: The code checks vtable nullity but still directly
dereferences *vtable.add(1) then transmutes it to addref_fn; instead validate
the AddRef slot before transmuting by reading let slot = *vtable.add(1) (or
equivalent), check if slot.is_null() and return HRESULT_E_POINTER if so, only
then transmute slot into the extern "system" fn and call it; refer to symbols
src_ptr, iunknown, vtable, addref_fn, and HRESULT_E_POINTER when making the
change.
- Around line 2510-2517: The get_pointer method is currently using unsafe
transmute_copy on libloading::Library (get_pointer) which relies on Library's
private layout; replace this by deriving a stable, supported cache key: either
convert the Library to a raw handle using the platform-specific supported API
(libloading::os::{unix, windows}::Library::into_raw and from_raw) or maintain a
separate stable ID (e.g., incrementing usize or UUID) stored alongside the
Library in the cache; update get_pointer to return that stable handle/ID (or the
platform raw handle obtained via the supported into_raw/from_raw/appropriate
accessor) instead of transmuting Library. Ensure you do not accidentally
move/consume the Library when getting the key (if using into_raw/from_raw,
reconstruct the Library with from_raw immediately) and update any cache
insert/remove logic that currently expects the transmuted value.
- Around line 205-221: The code advertises LONG_DOUBLE_SIZE as wider on Unix
x86_64/aarch64 but still treats format code "g" as f64 everywhere; update all
places that serialize/deserialize and map C types for the "g" (long double) ABI
to use LONG_DOUBLE_SIZE and the correct Rust/C type (c_longdouble or a
u128/bytes buffer matching LONG_DOUBLE_SIZE) and map to the libffi long-double
ffi type (or create a custom ffi::Type for long double) instead of f64;
specifically change the handling sites that reference LONG_DOUBLE_SIZE and the
format code "g" (serialization/deserialization routines and FFI type mapping
functions where "g" is currently sent to f64) so they use LONG_DOUBLE_SIZE, use
libc::c_longdouble or an appropriately sized byte representation, and select the
libffi long-double variant (or return an error if the libffi crate lacks
long-double support).
---
Nitpick comments:
In `@crates/host_env/src/thread.rs`:
- Around line 12-13: The fallback unwrap_or(c_name) is unreachable because
c_name was constructed via CString::new so its byte sub-slices cannot contain
nulls; update the truncation to use
CString::new(&c_name.as_bytes()[..15]).unwrap() (or .expect("...")) so the
invariant is explicit and dead code is removed—change the expression that
assigns truncated (using c_name and CString::new) to call unwrap()/expect()
instead of unwrap_or(...).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: efeb91ac-3b83-47d4-ad1f-ee1aca42766c
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.lockLib/test/test_ctypes/test_dlerror.pyis excluded by!Lib/**
📒 Files selected for processing (84)
.cspell.json.github/workflows/ci.yamlcrates/host_env/Cargo.tomlcrates/host_env/src/cert_store.rscrates/host_env/src/crt_fd_unsupported.rscrates/host_env/src/ctypes.rscrates/host_env/src/errno.rscrates/host_env/src/faulthandler.rscrates/host_env/src/fcntl.rscrates/host_env/src/grp.rscrates/host_env/src/io.rscrates/host_env/src/io_unsupported.rscrates/host_env/src/lib.rscrates/host_env/src/locale.rscrates/host_env/src/mmap.rscrates/host_env/src/msvcrt.rscrates/host_env/src/multiprocessing.rscrates/host_env/src/nt.rscrates/host_env/src/os.rscrates/host_env/src/overlapped.rscrates/host_env/src/posix.rscrates/host_env/src/posix_wasi.rscrates/host_env/src/pwd.rscrates/host_env/src/resource.rscrates/host_env/src/select.rscrates/host_env/src/signal.rscrates/host_env/src/socket.rscrates/host_env/src/syslog.rscrates/host_env/src/termios.rscrates/host_env/src/testconsole.rscrates/host_env/src/thread.rscrates/host_env/src/time.rscrates/host_env/src/winapi.rscrates/host_env/src/windows.rscrates/host_env/src/winreg.rscrates/host_env/src/wmi.rscrates/stdlib/Cargo.tomlcrates/stdlib/src/_testconsole.rscrates/stdlib/src/faulthandler.rscrates/stdlib/src/fcntl.rscrates/stdlib/src/grp.rscrates/stdlib/src/locale.rscrates/stdlib/src/mmap.rscrates/stdlib/src/multiprocessing.rscrates/stdlib/src/openssl.rscrates/stdlib/src/overlapped.rscrates/stdlib/src/posixsubprocess.rscrates/stdlib/src/resource.rscrates/stdlib/src/select.rscrates/stdlib/src/socket.rscrates/stdlib/src/ssl.rscrates/stdlib/src/termios.rscrates/vm/Cargo.tomlcrates/vm/src/exceptions.rscrates/vm/src/getpath.rscrates/vm/src/readline.rscrates/vm/src/signal.rscrates/vm/src/stdlib/_codecs.rscrates/vm/src/stdlib/_ctypes.rscrates/vm/src/stdlib/_ctypes/array.rscrates/vm/src/stdlib/_ctypes/base.rscrates/vm/src/stdlib/_ctypes/function.rscrates/vm/src/stdlib/_ctypes/library.rscrates/vm/src/stdlib/_ctypes/pointer.rscrates/vm/src/stdlib/_ctypes/simple.rscrates/vm/src/stdlib/_io.rscrates/vm/src/stdlib/_signal.rscrates/vm/src/stdlib/_stat.rscrates/vm/src/stdlib/_thread.rscrates/vm/src/stdlib/_winapi.rscrates/vm/src/stdlib/_wmi.rscrates/vm/src/stdlib/builtins.rscrates/vm/src/stdlib/errno.rscrates/vm/src/stdlib/msvcrt.rscrates/vm/src/stdlib/nt.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rscrates/vm/src/stdlib/pwd.rscrates/vm/src/stdlib/sys.rscrates/vm/src/stdlib/time.rscrates/vm/src/stdlib/winreg.rscrates/vm/src/vm/mod.rscrates/vm/src/windows.rshost_env_proposal.md
✅ Files skipped from review due to trivial changes (4)
- .cspell.json
- crates/host_env/src/msvcrt.rs
- crates/host_env/src/termios.rs
- crates/stdlib/src/mmap.rs
🚧 Files skipped from review as they are similar to previous changes (28)
- .github/workflows/ci.yaml
- crates/host_env/src/resource.rs
- crates/host_env/src/grp.rs
- crates/stdlib/src/fcntl.rs
- crates/host_env/src/io_unsupported.rs
- crates/host_env/src/wmi.rs
- crates/host_env/src/posix_wasi.rs
- crates/host_env/src/pwd.rs
- crates/host_env/src/errno.rs
- crates/stdlib/src/locale.rs
- crates/host_env/src/mmap.rs
- crates/stdlib/src/_testconsole.rs
- crates/host_env/src/os.rs
- crates/host_env/src/crt_fd_unsupported.rs
- crates/stdlib/Cargo.toml
- crates/host_env/src/select.rs
- crates/host_env/src/windows.rs
- crates/host_env/src/io.rs
- crates/host_env/src/fcntl.rs
- crates/stdlib/src/grp.rs
- crates/host_env/src/nt.rs
- crates/host_env/src/time.rs
- crates/host_env/src/winapi.rs
- crates/stdlib/src/faulthandler.rs
- crates/host_env/src/socket.rs
- crates/host_env/src/lib.rs
- crates/host_env/src/signal.rs
- crates/host_env/src/winreg.rs
| ]; | ||
|
|
||
| #[cfg(unix)] | ||
| const USER_SIGNAL_CAPACITY: usize = 64; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm what _NSIG / SIGRTMAX evaluate to on the target platforms and how stdlib's register() validates signum.
rg -nP --type=rust -C3 'check_signum|NSIG|SIGRTMAX|USER_SIGNAL_CAPACITY'Repository: RustPython/RustPython
Length of output: 13605
🏁 Script executed:
# Check the actual validation and how register_user_signal works
head -n 300 crates/host_env/src/faulthandler.rs | tail -n 50Repository: RustPython/RustPython
Length of output: 1462
🏁 Script executed:
# Examine the complete register_user_signal function
sed -n '275,310p' crates/host_env/src/faulthandler.rsRepository: RustPython/RustPython
Length of output: 1121
🏁 Script executed:
# Check how this function is called from the Python layer
rg -n "register_user_signal" --type=rust -B 2 -A 2Repository: RustPython/RustPython
Length of output: 1374
🏁 Script executed:
# Verify what _NSIG and SIGRTMAX values are on Linux systems
web_search
_NSIG SIGRTMAX Linux values definitionRepository: RustPython/RustPython
Length of output: 160
🏁 Script executed:
# Look at the check_signum function in the Python layer to see if it validates before calling register_user_signal
sed -n '784,815p' crates/stdlib/src/faulthandler.rsRepository: RustPython/RustPython
Length of output: 1009
🏁 Script executed:
# Check if there are other signal-related files that might define NSIG or signal limits
fd -t f "signal" crates/ --type f | head -20Repository: RustPython/RustPython
Length of output: 151
🏁 Script executed:
# Look at what libc definitions are available for _NSIG or SIGRTMAX
rg -n "SIGRTMAX|_NSIG|libc::" crates/host_env/src/faulthandler.rs | head -20Repository: RustPython/RustPython
Length of output: 1232
🏁 Script executed:
# Check if register_user_signal is called from anywhere else without check_signum
rg -n "register_user_signal" crates/ --type=rustRepository: RustPython/RustPython
Length of output: 422
🏁 Script executed:
# Examine how CPython handles this by checking if there are comments about NSIG
rg -n "NSIG|signal.*65|SIGRTMAX" crates/host_env/src/ --type=rust -B 2 -A 2Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check what the actual libc definitions are for signals
cat << 'EOF' > /tmp/check_signals.rs
fn main() {
#[cfg(unix)]
{
// On Linux: NSIG is typically 65, SIGRTMAX is 64
println!("Signal capacity should be: 65 for full NSIG coverage");
}
}
EOF
cat /tmp/check_signals.rsRepository: RustPython/RustPython
Length of output: 240
🏁 Script executed:
# Look at the exact bounds check in check_signum again
sed -n '796,800p' crates/stdlib/src/faulthandler.rsRepository: RustPython/RustPython
Length of output: 98
🏁 Script executed:
# Let me double-check the validation range more carefully
sed -n '784,802p' crates/stdlib/src/faulthandler.rs | cat -nRepository: RustPython/RustPython
Length of output: 825
🏁 Script executed:
# Also check unregister to see if it has the same issue
sed -n '839,845p' crates/stdlib/src/faulthandler.rsRepository: RustPython/RustPython
Length of output: 276
🏁 Script executed:
# Verify the actual USER_SIGNAL_CAPACITY and how it's used
grep -n "USER_SIGNAL_CAPACITY" crates/host_env/src/faulthandler.rsRepository: RustPython/RustPython
Length of output: 263
Signal 64 (SIGRTMAX) is rejected by the validation range.
The checks at both the C level (line 281: signum as usize >= USER_SIGNAL_CAPACITY) and Python level (line 796: !(1..64).contains(&signum)) restrict signals to the range [1, 63]. On Linux, where SIGRTMAX is typically 64, this excludes the highest real-time signal from being registered. CPython sizes its signal handler table with NSIG (typically 65) to accommodate the full range. Consider bumping USER_SIGNAL_CAPACITY to 65 to match the expected signal space.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/host_env/src/faulthandler.rs` at line 90, USER_SIGNAL_CAPACITY is set
to 64 which excludes SIGRTMAX (typically 64) — bump USER_SIGNAL_CAPACITY to 65
so the valid signum space includes 64, update any signum-range validations such
as the C-level check (the guard using `signum as usize >= USER_SIGNAL_CAPACITY`)
and corresponding Python-level ranges (replace exclusive upper bounds like
`1..64` with inclusive ranges or `1..=64`/adjusted checks) so the highest
real-time signal can be registered.
| pub fn enable_fatal_handlers(handler: extern "C" fn(libc::c_int), _flags: libc::c_int) -> bool { | ||
| unsafe { | ||
| for entry in FATAL_SIGNAL_HANDLERS.iter_mut() { | ||
| if entry.enabled { | ||
| continue; | ||
| } | ||
|
|
||
| let Ok(previous) = install_signal_handler(entry.signum, handler) else { | ||
| return false; | ||
| }; | ||
| entry.previous = previous; | ||
| entry.enabled = true; | ||
| } | ||
| } | ||
| true | ||
| } |
There was a problem hiding this comment.
Windows enable_fatal_handlers is missing the rollback that Unix now has.
The Unix version (lines 197–213) was updated to track installed entries and restore them via disable_fatal_signal when a later install_sigaction fails. The Windows path here still bails on the first install_signal_handler failure and leaks the earlier installs as enabled, so the caller sees false while the process already has a partial set of fatal handlers replaced.
🛡️ Suggested fix to mirror the Unix rollback
#[cfg(windows)]
pub fn enable_fatal_handlers(handler: extern "C" fn(libc::c_int), _flags: libc::c_int) -> bool {
unsafe {
+ let mut installed: Vec<libc::c_int> = Vec::new();
for entry in FATAL_SIGNAL_HANDLERS.iter_mut() {
if entry.enabled {
continue;
}
let Ok(previous) = install_signal_handler(entry.signum, handler) else {
+ for signum in installed {
+ disable_fatal_signal(signum);
+ }
return false;
};
entry.previous = previous;
entry.enabled = true;
+ installed.push(entry.signum);
}
}
true
}Note: Vec is currently only imported under #[cfg(unix)] at line 12 — broaden that cfg (e.g. #[cfg(any(unix, windows))]) if you adopt this fix.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/host_env/src/faulthandler.rs` around lines 396 - 411, The Windows path
in enable_fatal_handlers currently returns false on the first
install_signal_handler failure but leaves earlier entries marked enabled
(leaking partial installs); modify enable_fatal_handlers to mirror the Unix
rollback: keep a local Vec (imported with #[cfg(any(unix, windows))]) of indices
or entries that were successfully installed, and if an install_signal_handler
call returns Err, iterate the recorded installs and call disable_fatal_signal
(or otherwise restore entry.previous and set entry.enabled = false) for each to
revert them, then return false; ensure you update each FATAL_SIGNAL_HANDLERS
entry's previous/ enabled fields only after a successful install and clear them
during rollback.
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (1)
crates/host_env/src/multiprocessing.rs (1)
393-404:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate timeout input and use checked arithmetic in
deadline_from_timeout.
Line 396–Line 403uses unchecked casts/additions; non-finite or very large values can yield bogus deadlines via saturation/wrap instead of a clean error.Suggested fix
pub fn deadline_from_timeout(timeout: f64) -> Result<libc::timespec, SemError> { - let timeout = if timeout < 0.0 { 0.0 } else { timeout }; + let timeout = if timeout < 0.0 { 0.0 } else { timeout }; + if !timeout.is_finite() { + return Err(SemError::InvalidInput); + } let tv = gettimeofday()?; let sec = timeout as libc::c_long; let nsec = (1e9 * (timeout - sec as f64) + 0.5) as libc::c_long; + let tv_nsec = (tv.tv_usec as libc::c_long) + .checked_mul(1000) + .and_then(|base| base.checked_add(nsec)) + .ok_or(SemError::InvalidInput)?; + let tv_sec = tv + .tv_sec + .checked_add(sec as libc::time_t) + .ok_or(SemError::InvalidInput)?; let mut deadline = libc::timespec { - tv_sec: tv.tv_sec + sec as libc::time_t, - tv_nsec: (tv.tv_usec as libc::c_long * 1000 + nsec) as _, + tv_sec, + tv_nsec: tv_nsec as _, };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/multiprocessing.rs` around lines 393 - 404, Validate that the timeout is finite and non-negative in deadline_from_timeout, then perform checked conversions and arithmetic instead of unchecked casts: return an Err(SemError) for non-finite or negative inputs, compute whole seconds and fractional nanoseconds using safe operations (e.g., timeout.floor() / fract or similar), use checked casts to libc::time_t and libc::c_long for sec/nsec and checked_add for combining with gettimeofday()'s tv_sec/tv_usec to detect overflow, normalize tv_nsec into [0,1_000_000_000) using checked math, and return Err(SemError) on any overflow/overflow-prone step; refer to deadline_from_timeout, gettimeofday, libc::timespec and SemError when locating the code to update.
🧹 Nitpick comments (1)
crates/host_env/src/overlapped.rs (1)
1079-1106: ⚡ Quick win
unparse_addressshould accept*const SOCKADDRinstead of&SOCKADDR_IN6.The function handles both
AF_INETandAF_INET6addresses, but its parameter type is&SOCKADDR_IN6. For IPv4 callers this forces an unsafe reinterpret-cast at the call site to produce an&SOCKADDR_IN6from what is actuallySOCKADDR_INmemory. Using*const SOCKADDR(or a*const u8/ length pair) would correctly reflect the polymorphic input type and eliminate the hidden unsafe casting requirement.Additionally,
_addr_lenis silently ignored; it could guard against under-sized buffers before dereferencing.♻️ Proposed signature change
-pub fn unparse_address(addr: &SOCKADDR_IN6, _addr_len: i32) -> io::Result<SocketAddress> { - use core::net::{Ipv4Addr, Ipv6Addr}; - let family = addr.sin6_family; - if family == AF_INET { - let addr_in = unsafe { &*(addr as *const SOCKADDR_IN6 as *const SOCKADDR_IN) }; +pub fn unparse_address(addr: *const SOCKADDR, addr_len: i32) -> io::Result<SocketAddress> { + use core::net::{Ipv4Addr, Ipv6Addr}; + let family = unsafe { (*addr).sa_family }; + if family == AF_INET { + if addr_len < core::mem::size_of::<SOCKADDR_IN>() as i32 { + return Err(io::Error::new(io::ErrorKind::InvalidInput, "address buffer too small for AF_INET")); + } + let addr_in = unsafe { &*(addr as *const SOCKADDR_IN) };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/host_env/src/overlapped.rs` around lines 1079 - 1106, Change unparse_address to take a raw pointer representing polymorphic sockaddr memory (e.g., *const SOCKADDR and an addr_len) instead of &SOCKADDR_IN6, validate addr_len before dereferencing, and branch on the family field to safely cast to SOCKADDR_IN or SOCKADDR_IN6; specifically, update the signature of unparse_address to accept (*const SOCKADDR, i32) (or *const u8 plus length), check that addr_len is at least the size needed for the family and for the chosen target (sizeof(SOCKADDR_IN) for AF_INET, sizeof(SOCKADDR_IN6) for AF_INET6) before performing unsafe casts, and then use the existing logic that reads sin_family, casts to SOCKADDR_IN or SOCKADDR_IN6, and constructs SocketAddress; keep the same error handling for unsupported families and return an InvalidInput error when addr_len is too small.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/host_env/src/locale.rs`:
- Around line 144-166: The code casts bytes.len() to i32 which overflows for
inputs > i32::MAX; before calling MultiByteToWideChar, guard the input length by
checking if bytes.len() > i32::MAX as usize and return None (or an error) for
too-large slices, then compute a safe_i32_len = bytes.len() as i32 and use that
variable in both MultiByteToWideChar calls (replacing the inline casts). Do this
in the function that converts the &[u8] to wide (the block that declares len and
wide and calls MultiByteToWideChar) so the API never receives a negative
cbMultiByte.
In `@crates/host_env/src/multiprocessing.rs`:
- Around line 210-214: The Windows implementation of current_thread_id() returns
u32 while other targets and callers expect u64; change the function signature
pub fn current_thread_id() -> u64 and cast the unsafe GetCurrentThreadId()
result to u64 so the function consistently returns u64 across targets (update
the #[cfg(windows)] current_thread_id implementation that calls
GetCurrentThreadId()).
In `@crates/host_env/src/overlapped.rs`:
- Line 545: The public functions start_accept_ex, start_connect_ex,
start_disconnect_ex, and start_transmit_file currently call .get().unwrap() on
their OnceLock extension statics (e.g. ACCEPT_EX), which panics if
initialize_winsock_extensions() was not run; change each to safely handle a
missing initializer by replacing the unwrap with a checked .get() and returning
a proper Winsock error (e.g. WSAEOPNOTSUPP or another appropriate error code) or
attempting to call initialize_winsock_extensions() then rechecking, so the
functions fail via the normal error path instead of panicking; apply the same
pattern to all four functions and/or factor the logic into a small helper like
init_or_error! to centralize the check and error return.
- Around line 188-227: write() and read() replace their owned
write_buffer/read_buffer Vecs unconditionally which drops buffers while a kernel
async I/O may still be using their pointers (when pending == true), causing
use-after-free; fix by refusing to replace buffers or starting a new operation
when an I/O is pending: add a check at the start of overlapped::write and
overlapped::read that returns an Err (or io::ErrorKind::WouldBlock) if
self.pending is true, or alternatively provide and use a public
is_pending()/ensure_not_pending() helper so callers can check before calling;
update callers/tests accordingly and ensure the check is performed before
assigning to self.write_buffer/self.read_buffer and before calling
start_write_file/start_read_file to avoid dropping live buffers.
- Around line 855-908: The code leaks the Arc passed as data_ptr when a
registered wait is cancelled without the callback running; update the
WaitCallbackData struct to include an AtomicBool fired and also store the raw
pointer (data_ptr/raw_ptr) in the registry entry; in post_to_queue_callback set
data.fired.store(true, Ordering::Release) before consuming the Arc via
Arc::from_raw, and in cleanup_wait_callback_data after removing the registry
entry check if !entry.fired.load(Ordering::Acquire) and if so reclaim the leaked
raw Arc by calling unsafe { Arc::from_raw(entry.raw_ptr) } so the allocation is
dropped when the callback never fires.
In `@crates/host_env/src/thread.rs`:
- Around line 4-7: Add a Windows implementation of current_thread_id() and
normalize its public return type to u64 across platforms: keep the existing
#[cfg(unix)] pub fn current_thread_id() -> u64 as-is and add a #[cfg(windows)]
pub fn current_thread_id() -> u64 that calls the Windows API to obtain the
thread id (e.g., GetCurrentThreadId or equivalent) and casts the returned
DWORD/u32 to u64 before returning; ensure the symbol name is identical
(current_thread_id) so callers (including code in multiprocessing.rs that
currently uses u32 IDs) see a consistent u64 API.
---
Duplicate comments:
In `@crates/host_env/src/multiprocessing.rs`:
- Around line 393-404: Validate that the timeout is finite and non-negative in
deadline_from_timeout, then perform checked conversions and arithmetic instead
of unchecked casts: return an Err(SemError) for non-finite or negative inputs,
compute whole seconds and fractional nanoseconds using safe operations (e.g.,
timeout.floor() / fract or similar), use checked casts to libc::time_t and
libc::c_long for sec/nsec and checked_add for combining with gettimeofday()'s
tv_sec/tv_usec to detect overflow, normalize tv_nsec into [0,1_000_000_000)
using checked math, and return Err(SemError) on any overflow/overflow-prone
step; refer to deadline_from_timeout, gettimeofday, libc::timespec and SemError
when locating the code to update.
---
Nitpick comments:
In `@crates/host_env/src/overlapped.rs`:
- Around line 1079-1106: Change unparse_address to take a raw pointer
representing polymorphic sockaddr memory (e.g., *const SOCKADDR and an addr_len)
instead of &SOCKADDR_IN6, validate addr_len before dereferencing, and branch on
the family field to safely cast to SOCKADDR_IN or SOCKADDR_IN6; specifically,
update the signature of unparse_address to accept (*const SOCKADDR, i32) (or
*const u8 plus length), check that addr_len is at least the size needed for the
family and for the chosen target (sizeof(SOCKADDR_IN) for AF_INET,
sizeof(SOCKADDR_IN6) for AF_INET6) before performing unsafe casts, and then use
the existing logic that reads sin_family, casts to SOCKADDR_IN or SOCKADDR_IN6,
and constructs SocketAddress; keep the same error handling for unsupported
families and return an InvalidInput error when addr_len is too small.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 7e97ed5a-a7a4-4951-b413-0f5d3e3ca6f2
📒 Files selected for processing (6)
crates/host_env/src/cert_store.rscrates/host_env/src/locale.rscrates/host_env/src/multiprocessing.rscrates/host_env/src/overlapped.rscrates/host_env/src/testconsole.rscrates/host_env/src/thread.rs
# Conflicts: # Cargo.lock # crates/stdlib/Cargo.toml # crates/stdlib/src/faulthandler.rs # crates/stdlib/src/overlapped.rs # crates/stdlib/src/resource.rs # crates/vm/Cargo.toml # crates/vm/src/stdlib/_codecs.rs # crates/vm/src/stdlib/_ctypes/array.rs # crates/vm/src/stdlib/_ctypes/base.rs # crates/vm/src/stdlib/posix.rs # crates/vm/src/windows.rs
- Fix u32/u64 type mismatch in Windows multiprocessing.rs by returning u32 from current_thread_id on Windows - Fix clippy map_unwrap_or and explicit_iter_loop lints in host_env and _ctypes - Remove unused rustix dep from stdlib and unused errno workspace dep - Drop expectedFailure on test_log_mask which now passes
- nt.rs: collapse map().unwrap_or() to map_or() on Windows-only attribute_data_to_stat - overlapped.rs: use core::sync::atomic instead of std, collapse nested if-let chain - _ctypes/array.rs: rustfmt single-line code
Summary by CodeRabbit