Fix deadlock after fork by youknowone · Pull Request #7205 · RustPython/RustPython

youknowone · 2026-02-23T04:25:41Z

Summary by CodeRabbit

Release Notes

Bug Fixes
- Enhanced fork safety to prevent potential deadlocks and lock contention in child processes following multiprocessing operations.
Chores
- Improved internal synchronization to better integrate with Python's threading model for increased reliability.

coderabbitai · 2026-02-23T04:25:48Z

📝 Walkthrough

Walkthrough

Replaces standard synchronization primitives with Python-threaded equivalents in the GC state structures and adds fork-safety utilities to forcibly release locks in child processes after fork operations, preventing potential deadlocks.

Changes

Cohort / File(s)	Summary
GC State Fork-Safety Implementation `crates/vm/src/gc_state.rs`	Replaces Mutex and RwLock with PyMutex and PyRwLock across GC state structures; adds `force_unlock_after_fork()` and `force_unlock_stats_after_fork()` methods for Unix; introduces helper function to forcibly unlock PyRwLock variants; updates object tracking, untracking, finalization, and resurrection logic to use new locking primitives.
Post-Fork Cleanup Integration `crates/vm/src/stdlib/posix.rs`	Integrates GC state fork-safety by calling `force_unlock_after_fork()` in post-fork child process cleanup routine.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

force_unlock_after_fork #7186: Implements the same fork-safety pattern (force_unlock_after_fork) for other VM subsystems like codecs and string pool registries.
clear_after_fork #6933: Adds Unix-only post-fork reset/cleanup calls to py_os_after_fork_child for other VM state management.
Use try_lock in py_os_after_fork_child #7178: Also modifies py_os_after_fork_child but applies try_lock pattern for mutex handling instead.

Suggested reviewers

ShaharNaveh

Poem

🐰 With forks that split and locks that bind,
We force them free, no deadlock kind!
PyRwLocks dance where mutexes stood,
The garbage collected—all is good! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Fix deadlock after fork' directly corresponds to the main objective of the changeset, which addresses deadlock prevention through fork-safety utilities and lock management in the GC state.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

ℹ️ Review info

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ccfa938 and 45e47ad.

📒 Files selected for processing (2)

crates/vm/src/gc_state.rs
crates/vm/src/stdlib/posix.rs

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/gc_state.rs`:
- Around line 793-856: The code incorrectly uses parking_lot's
force_unlock_read/force_unlock_write (and force_unlock) in
force_unlock_after_fork and force_unlock_rwlock_after_fork to recover locks held
by other threads after fork, which is undefined behavior; remove any calls to
force_unlock_read/force_unlock_write/force_unlock and stop trying to forcibly
recover locks from dead threads. Instead, change force_unlock_after_fork and
force_unlock_rwlock_after_fork to not perform unsafe forcible unlocks: either
(A) make them a no-op that logs/returns an Err indicating fork-without-exec is
unsupported, or (B) abort/panic with a clear message instructing callers to use
pthread_atfork handlers or ensure fork is followed by exec; also update any
callers to handle the error or avoid calling these functions after fork. Ensure
references: update function force_unlock_after_fork, helper
force_unlock_rwlock_after_fork, and remove usages of
PyRwLock::force_unlock_read, PyRwLock::force_unlock_write, and
Mutex::force_unlock in this module.

coderabbitai · 2026-02-23T07:20:24Z

crates/vm/src/gc_state.rs

+
+    /// Force-unlock all locks after fork() in the child process.
+    ///
+    /// After fork(), only the forking thread survives. Any lock held by another
+    /// thread is permanently stuck. This method releases all such stuck locks.
+    ///
+    /// # Safety
+    /// Must only be called after fork() in the child process when no other
+    /// threads exist. The calling thread must NOT hold any of these locks.
+    #[cfg(unix)]
+    pub unsafe fn force_unlock_after_fork(&self) {
+        // Force-unlock the collecting mutex
+        if self.collecting.try_lock().is_none() {
+            unsafe { self.collecting.force_unlock() };
+        }
+
+        // Force-unlock garbage and callbacks mutexes
+        if self.garbage.try_lock().is_none() {
+            unsafe { self.garbage.force_unlock() };
+        }
+        if self.callbacks.try_lock().is_none() {
+            unsafe { self.callbacks.force_unlock() };
+        }
+
+        // Force-unlock generation stats mutexes
+        for generation in &self.generations {
+            unsafe { generation.force_unlock_stats_after_fork() };
+        }
+        unsafe { self.permanent.force_unlock_stats_after_fork() };
+
+        // Force-unlock RwLocks
+        for rw in &self.generation_objects {
+            unsafe { force_unlock_rwlock_after_fork(rw) };
+        }
+        unsafe { force_unlock_rwlock_after_fork(&self.permanent_objects) };
+        unsafe { force_unlock_rwlock_after_fork(&self.tracked_objects) };
+        unsafe { force_unlock_rwlock_after_fork(&self.finalized_objects) };
+    }
+}
+
+/// Force-unlock a PyRwLock after fork() in the child process.
+///
+/// # Safety
+/// Must only be called after fork() in the child process when no other
+/// threads exist. The calling thread must NOT hold this lock.
+#[cfg(unix)]
+unsafe fn force_unlock_rwlock_after_fork<T>(lock: &PyRwLock<T>) {
+    if lock.try_write().is_some() {
+        return;
+    }
+    let is_shared = lock.try_read().is_some();
+    if is_shared {
+        loop {
+            // SAFETY: Lock is shared-locked by dead thread(s).
+            unsafe { lock.force_unlock_read() };
+            if lock.try_write().is_some() {
+                return;
            }
-            self.generations[2].count.fetch_add(count, Ordering::SeqCst);
        }
+    } else {
+        // SAFETY: Lock is exclusively locked by a dead thread.
+        unsafe { lock.force_unlock_write() };
    }
 }


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -nP 'fn force_unlock_read|fn force_unlock_write' --type=rust -C3

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

rg -nP 'type PyRwLock|struct PyRwLock|impl.*PyRwLock' --type=rust -C5

Repository: RustPython/RustPython

Length of output: 2107

🏁 Script executed:

rg -nP 'type RawRwLock|struct RawRwLock|impl.*RawRwLock|fn force_unlock' --type=rust -C3

Repository: RustPython/RustPython

Length of output: 5373

🏁 Script executed:

rg -nP 'force_unlock_read|force_unlock_write' --type=rust -B2 -A2

Repository: RustPython/RustPython

Length of output: 2026

🏁 Script executed:

rg -nP 'fn force_unlock_read|fn force_unlock_write' --type=rust -B5 -A5

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

rg -nP 'parking_lot|force_unlock' crates/common/src/lock.rs -B3 -A3

Repository: RustPython/RustPython

Length of output: 306

🏁 Script executed:

rg -nP 'trait.*force_unlock|extension|impl.*RwLock' --type=rust -C3

Repository: RustPython/RustPython

Length of output: 50377

🏁 Script executed:

cat crates/common/src/lock/cell_lock.rs | grep -A10 "force_unlock"

Repository: RustPython/RustPython

Length of output: 47

🌐 Web query:

parking_lot RwLock force_unlock_read force_unlock_write fork safety

💡 Result:

parking_lot’s RwLock::{force_unlock_read, force_unlock_write} (actually provided by lock_api::RwLock) are not “recovery” tools for fork().

They are unsafe escape hatches meant only for the case where the current thread still logically owns the lock but you intentionally leaked the guard (e.g., mem::forget(guard) for FFI), and you need to unlock without having the guard object anymore. Calling them when the lock isn’t held in that mode is undefined behavior. [1]

After fork() in a multi-threaded process, only the forking thread exists in the child, but the child inherits the memory state of all mutex/rwlock objects. If another (now-nonexistent) thread held a lock at fork time, that lock can remain locked forever in the child, and attempting to use such locks can deadlock or worse. [2]

POSIX’s intended mitigation is “fork then immediately exec” (avoid touching library code/allocators/locks in between), or to use pthread_atfork() handlers to put your process into a consistent lock state across fork—but that only works if you can reliably coordinate all relevant locks. [2][3]

So: do not use force_unlock_read/write to “make the child safe after fork”. It violates the ownership assumptions those methods require [1], and it cannot make the protected data consistent if a different thread was in the middle of updating it at fork time. [2]

Sources: [1] [2] [3]

Critical unsafe code misuse: force_unlock_read and force_unlock_write are not safe to use for fork() recovery.

The force_unlock_read() and force_unlock_write() methods from parking_lot are designed only for cases where the current thread still logically owns the lock but intentionally leaked the guard (e.g., via mem::forget). They are not meant to recover from locks held by dead threads after fork(), as they violate the ownership assumptions these methods require. Calling them when a different (now-nonexistent) thread held the lock is undefined behavior.

After fork(), using pthread_atfork() handlers or relying on the POSIX principle of "fork then immediately exec" are the intended mitigations. The current approach of force-unlocking locks held by other threads is unsafe and should not be used.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/gc_state.rs` around lines 793 - 856, The code incorrectly uses parking_lot's force_unlock_read/force_unlock_write (and force_unlock) in force_unlock_after_fork and force_unlock_rwlock_after_fork to recover locks held by other threads after fork, which is undefined behavior; remove any calls to force_unlock_read/force_unlock_write/force_unlock and stop trying to forcibly recover locks from dead threads. Instead, change force_unlock_after_fork and force_unlock_rwlock_after_fork to not perform unsafe forcible unlocks: either (A) make them a no-op that logs/returns an Err indicating fork-without-exec is unsupported, or (B) abort/panic with a clear message instructing callers to use pthread_atfork handlers or ensure fork is followed by exec; also update any callers to handle the error or avoid calling these functions after fork. Ensure references: update function force_unlock_after_fork, helper force_unlock_rwlock_after_fork, and remove usages of PyRwLock::force_unlock_read, PyRwLock::force_unlock_write, and Mutex::force_unlock in this module.

Fix deadlock after fork

45e47ad

youknowone marked this pull request as ready for review February 23, 2026 07:13

coderabbitai bot reviewed Feb 23, 2026

View reviewed changes

youknowone merged commit f096bb7 into RustPython:main Feb 23, 2026
13 checks passed

youknowone deleted the fork-deadlock branch February 23, 2026 07:22

This was referenced Mar 3, 2026

reinit-after-fork #7321

Merged

Add monitoring lock reinit and use raw() accessor #7335

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix deadlock after fork#7205

Fix deadlock after fork#7205
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:fork-deadlock

youknowone commented Feb 23, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 23, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

youknowone commented Feb 23, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

youknowone commented Feb 23, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 23, 2026 •

edited

Loading