Skip to content

Newtype var_num oparg#7400

Merged
youknowone merged 2 commits intoRustPython:mainfrom
ShaharNaveh:bytecode-varnum-newtype
Mar 12, 2026
Merged

Newtype var_num oparg#7400
youknowone merged 2 commits intoRustPython:mainfrom
ShaharNaveh:bytecode-varnum-newtype

Conversation

@ShaharNaveh
Copy link
Contributor

@ShaharNaveh ShaharNaveh commented Mar 11, 2026

Summary by CodeRabbit

  • Refactor
    • Improved type system for variable handling in the compiler infrastructure by introducing specialized type representations for variable indices. This enhances code consistency, type safety, and maintainability across compiler components while maintaining backward compatibility with existing functionality.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Walkthrough

This PR introduces a new oparg::VarNum typed newtype to replace generic numeric indices for variable references throughout the compiler. Methods, instruction variants, and indexing operations are updated across codegen, bytecode, and runtime modules to use this more type-safe representation for variable number arguments.

Changes

Cohort / File(s) Summary
Type System Foundation
crates/compiler-core/src/bytecode/oparg.rs
Introduces VarNum(u32) newtype via macro-based pattern alongside refactored ConstIdx. Adds from_u32, as_u32, and as_usize accessors with trait impls (From, OpArgType) for both types.
Bytecode Indexing & Traits
crates/compiler-core/src/bytecode.rs
Adds Index and IndexMut implementations for slices using oparg::VarNum. Updates InstrDisplayContext trait and CodeObject impl to accept VarNum instead of usize for variable name lookups. Adjusts internal name mapping logic in map_bag.
Instruction Variant Definitions
crates/compiler-core/src/bytecode/instruction.rs
Changes var_num field type from Arg<NameIdx> to Arg<oparg::VarNum> in six instruction variants: DeleteFast, LoadFast, LoadFastAndClear, LoadFastBorrow, LoadFastCheck, StoreFast. Updates disassembly formatting logic to convert arguments for display.
Code Generation
crates/codegen/src/compile.rs, crates/codegen/src/ir.rs
Updates get_local_var_index and varname methods to return oparg::VarNum instead of u32/NameIdx. Updates _name_inner return type to u32 with conversions to VarNum at call sites. Changes get_varname signature in IR context to accept VarNum with .as_usize() indexing.
Runtime & JIT
crates/jit/src/instructions.rs, crates/vm/src/frame.rs
Updates store_variable signature to accept oparg::VarNum. Refactors pattern bindings in VM frame dispatch to use var_num directly instead of nested idx binding, updating all .get(arg) call sites to use the new binding.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • youknowone

Poem

🐰 A rabbit hops through types so neat,
Where VarNum makes indices complete!
From plain ol' u32 we say goodbye,
Type safety's here—no more guessing why!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Newtype var_num oparg' directly reflects the main change: introducing a newtype for var_num in the oparg module to improve type safety in variable indexing across the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ShaharNaveh ShaharNaveh marked this pull request as ready for review March 11, 2026 13:34
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)

4140-4142: Look up the synthetic fast locals by name here.

This loop ties the emitted LoadFasts to .defaults / .kwdefaults living in slots 0..num_typeparam_args, but those names are appended to metadata.varnames after push_output() has already seeded the scope from the symbol table. Resolving the slots by name keeps this wrapper robust if that scope ever carries any other locals.

🛠️ Proposed refactor
-            // Load defaults/kwdefaults with LOAD_FAST
-            for i in 0..num_typeparam_args {
-                let var_num = oparg::VarNum::from(i as u32);
-                emit!(self, Instruction::LoadFast { var_num });
-            }
+            // Load defaults/kwdefaults with LOAD_FAST
+            if funcflags.contains(bytecode::MakeFunctionFlags::DEFAULTS) {
+                let var_num = self.get_local_var_index(".defaults")?;
+                emit!(self, Instruction::LoadFast { var_num });
+            }
+            if funcflags.contains(bytecode::MakeFunctionFlags::KW_ONLY_DEFAULTS) {
+                let var_num = self.get_local_var_index(".kwdefaults")?;
+                emit!(self, Instruction::LoadFast { var_num });
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 4140 - 4142, The current loop
emits LoadFast by assuming the synthetic typeparam defaults live in slots
0..num_typeparam_args; instead, resolve each synthetic fast local by name using
metadata.varnames so the correct var slot is used even if other locals exist.
For each index i in 0..num_typeparam_args, compute the synthetic name you
appended (the same name used when push_output() seeded the scope), find its
position in self.metadata.varnames (e.g., .iter().position(|n| n == &name)),
convert that position to oparg::VarNum and then emit Instruction::LoadFast with
that VarNum; handle the case where the name is not found with an appropriate
error or fallback.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 1112-1113: get_varname currently indexes only metadata.varnames
and will panic for VarNum values that refer into the "locals-plus" table
(cellvars/freevars created for LOAD_CLOSURE); update get_varname to resolve
VarNum against the combined locals-plus layout: if var_num.as_usize() <
self.metadata.varnames.len() return that entry, otherwise subtract
varnames.len() and if the remaining index is < self.metadata.cellvars.len()
return from metadata.cellvars, else subtract cellvars.len() and return from
metadata.freevars; ensure indexes use as_usize() and do bounds-safe lookups so
it never panics when encountering LOAD_CLOSURE-created slots.

In `@crates/jit/src/instructions.rs`:
- Around line 111-115: The code directly indexes self.variables with a VarNum
(typed u32) which can panic on malformed bytecode; change both occurrences
(e.g., in store_variable and the other site where self.variables[idx] is used)
to first convert the VarNum to usize and use checked access
(self.variables.get_mut(idx_usize) or self.variables.get(idx_usize)), returning
Err(JitCompileError::BadBytecode) when None. Inside the Some(slot) branch, keep
the existing logic (declare_var via builder and set the slot) but perform it via
the checked slot reference instead of indexing to avoid panics.

---

Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 4140-4142: The current loop emits LoadFast by assuming the
synthetic typeparam defaults live in slots 0..num_typeparam_args; instead,
resolve each synthetic fast local by name using metadata.varnames so the correct
var slot is used even if other locals exist. For each index i in
0..num_typeparam_args, compute the synthetic name you appended (the same name
used when push_output() seeded the scope), find its position in
self.metadata.varnames (e.g., .iter().position(|n| n == &name)), convert that
position to oparg::VarNum and then emit Instruction::LoadFast with that VarNum;
handle the case where the name is not found with an appropriate error or
fallback.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 443a5187-8634-44ef-9123-5f743fb1aa0a

📥 Commits

Reviewing files that changed from the base of the PR and between d248a04 and 87eeba9.

📒 Files selected for processing (7)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/compiler-core/src/bytecode.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/compiler-core/src/bytecode/oparg.rs
  • crates/jit/src/instructions.rs
  • crates/vm/src/frame.rs

Comment on lines +1112 to +1113
fn get_varname(&self, var_num: oparg::VarNum) -> &str {
self.metadata.varnames[var_num.as_usize()].as_ref()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Resolve VarNum against the full locals-plus table.

Line 1113 still assumes every VarNum lives in metadata.varnames, but Line 1797 creates LOAD_FAST slots as varnames_len + ... when lowering LOAD_CLOSURE. That means displaying valid closure bytecode can walk past varnames and panic. Please fall back into cellvars/freevars here instead of indexing only varnames.

🛠️ Suggested fix
 fn get_varname(&self, var_num: oparg::VarNum) -> &str {
-    self.metadata.varnames[var_num.as_usize()].as_ref()
+    let idx = var_num.as_usize();
+    if let Some(name) = self.metadata.varnames.get_index(idx) {
+        name.as_ref()
+    } else {
+        let cell_idx = idx - self.metadata.varnames.len();
+        self.metadata
+            .cellvars
+            .get_index(cell_idx)
+            .or_else(|| self.metadata.freevars.get_index(cell_idx - self.metadata.cellvars.len()))
+            .map(String::as_ref)
+            .unwrap_or("<invalid var num>")
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1112 - 1113, get_varname currently
indexes only metadata.varnames and will panic for VarNum values that refer into
the "locals-plus" table (cellvars/freevars created for LOAD_CLOSURE); update
get_varname to resolve VarNum against the combined locals-plus layout: if
var_num.as_usize() < self.metadata.varnames.len() return that entry, otherwise
subtract varnames.len() and if the remaining index is <
self.metadata.cellvars.len() return from metadata.cellvars, else subtract
cellvars.len() and return from metadata.freevars; ensure indexes use as_usize()
and do bounds-safe lookups so it never panics when encountering
LOAD_CLOSURE-created slots.

Comment on lines +111 to 115
fn store_variable(&mut self, idx: oparg::VarNum, val: JitValue) -> Result<(), JitCompileError> {
let builder = &mut self.builder;
let ty = val.to_jit_type().ok_or(JitCompileError::NotSupported)?;
let local = self.variables[idx as usize].get_or_insert_with(|| {
let local = self.variables[idx].get_or_insert_with(|| {
let var = builder.declare_var(ty.to_cranelift());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use checked local-slot access instead of panicking on bad bytecode.

Line 114 and Line 651 index self.variables directly, but VarNum is only a typed u32, not a proof that the slot exists. A malformed or stale code object will panic the JIT here instead of taking the existing JitCompileError::BadBytecode path.

🛠️ Suggested fix
 fn store_variable(&mut self, idx: oparg::VarNum, val: JitValue) -> Result<(), JitCompileError> {
     let builder = &mut self.builder;
     let ty = val.to_jit_type().ok_or(JitCompileError::NotSupported)?;
-    let local = self.variables[idx].get_or_insert_with(|| {
+    let slot = self
+        .variables
+        .get_mut(idx.as_usize())
+        .ok_or(JitCompileError::BadBytecode)?;
+    let local = slot.get_or_insert_with(|| {
         let var = builder.declare_var(ty.to_cranelift());
         Local {
             var,
             ty: ty.clone(),
         }
@@
             Instruction::LoadFast { var_num } | Instruction::LoadFastBorrow { var_num } => {
-                let local = self.variables[var_num.get(arg)]
-                    .as_ref()
+                let local = self
+                    .variables
+                    .get(var_num.get(arg).as_usize())
+                    .and_then(Option::as_ref)
                     .ok_or(JitCompileError::BadBytecode)?;
                 self.stack.push(JitValue::from_type_and_value(
                     local.ty.clone(),
                     self.builder.use_var(local.var),
                 ));

Also applies to: 650-652

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/jit/src/instructions.rs` around lines 111 - 115, The code directly
indexes self.variables with a VarNum (typed u32) which can panic on malformed
bytecode; change both occurrences (e.g., in store_variable and the other site
where self.variables[idx] is used) to first convert the VarNum to usize and use
checked access (self.variables.get_mut(idx_usize) or
self.variables.get(idx_usize)), returning Err(JitCompileError::BadBytecode) when
None. Inside the Some(slot) branch, keep the existing logic (declare_var via
builder and set the slot) but perform it via the checked slot reference instead
of indexing to avoid panics.

@youknowone youknowone merged commit 5c631e5 into RustPython:main Mar 12, 2026
23 of 24 checks passed
This was referenced Mar 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants