fix: disable debug=True / Werkzeug debugger exposure (closes #9)#10
fix: disable debug=True / Werkzeug debugger exposure (closes #9)#10timon0305 wants to merge 2 commits into
Conversation
…ppalliance#9) The Werkzeug debugger is a documented remote-code-execution primitive. app.py was hard-coding `debug=True`, which exposed RCE to anyone who could reach the listening port — a misconfigured `--host`, an SSH tunnel, or a careless reverse proxy was enough. - Remove the `debug=True` literal from app.py. - Default debug OFF. Opt-in via either `--debug` CLI flag or `FLASK_DEBUG=1` env var (truthy = "1" / "true" / "yes", case-insensitive, whitespace-tolerant). - Print a stderr WARNING when debug is enabled, naming the RCE risk and reminding the operator to bind only to loopback. - Gate the auto-reloader on the same flag. Live-tested all four matrix cells: (default off / --debug / FLASK_DEBUG=1 / FLASK_DEBUG=0). Bogus paths under debug-off return a plain Flask 404, not the Werkzeug debugger console. Helper `resolve_debug_flag(env_value, cli_flag)` lives in `utils/debug_flag.py` so it can be unit-tested without importing Flask (matching the existing test convention in tests/test_cli_args.py). Regression coverage in tests/test_cli_args.py adds 8 cases: - default-off, env-truthy, env-falsey, CLI override - argparse `--debug` default + explicit - source-level guard that fails if `debug=True` is reintroduced
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThe PR makes Flask debug opt-in by adding a CLI ChangesDebug Mode Opt-in and Safety
sequenceDiagram
participant User
participant CLI_Parser
participant ResolveDebug
participant App_Run
User->>CLI_Parser: run app.py [--debug]
CLI_Parser->>ResolveDebug: env_value=FLASK_DEBUG, cli_flag=args.debug
ResolveDebug->>ResolveDebug: apply opt-in logic (CLI OR truthy env)
ResolveDebug->>App_Run: return debug_enabled (bool)
alt debug_enabled == True
App_Run->>User: write RCE warning to stderr (print)
end
App_Run->>App_Run: set debug=debug_enabled<br/>set use_reloader accordingly
App_Run->>User: start Flask server
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_cli_args.py (1)
37-47:⚠️ Potential issue | 🟠 Major | ⚡ Quick winExercise the real app parser instead of a test-local clone.
Lines 289-295 currently prove the behavior of
_build_app_parser(), not the parser thatapp.pyactually uses. That means a future regression in the real--debugdefinition can stay green here as long as this copy is not changed in lockstep. Please move parser construction into a Flask-free helper and import that from bothapp.pyand this test so the security-sensitive flag is covered by one source of truth.Also applies to: 289-295
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_cli_args.py` around lines 37 - 47, Extract the argparse construction currently duplicated in _build_app_parser into a shared, Flask-free helper function (e.g., create_app_parser) in a module that neither imports Flask nor app runtime; update app.py to import and use create_app_parser instead of building its own parser, and change tests/test_cli_args.py to import and call create_app_parser (removing the local _build_app_parser) so the real parser (including the --debug flag) is exercised from one source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_cli_args.py`:
- Around line 300-304: Replace the brittle substring check in
test_app_py_does_not_hardcode_debug_true with an AST-based check: parse the file
contents (src) with ast.parse, walk all ast.Call nodes and inspect their
.keywords for any keyword with arg == "debug" whose value is a literal True
(handle both ast.Constant(value=True) and older ast.NameConstant/ast.Name
patterns), and assert that no such call exists; keep the function name
test_app_py_does_not_hardcode_debug_true and fail the test if any debug=True
literal is found in a call site.
---
Outside diff comments:
In `@tests/test_cli_args.py`:
- Around line 37-47: Extract the argparse construction currently duplicated in
_build_app_parser into a shared, Flask-free helper function (e.g.,
create_app_parser) in a module that neither imports Flask nor app runtime;
update app.py to import and use create_app_parser instead of building its own
parser, and change tests/test_cli_args.py to import and call create_app_parser
(removing the local _build_app_parser) so the real parser (including the --debug
flag) is exercised from one source of truth.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6e397f53-d775-411f-befc-0b9ce51f68a7
📒 Files selected for processing (3)
app.pytests/test_cli_args.pyutils/debug_flag.py
…lliance#10) Old guard: `self.assertNotIn("debug=True", src)` — substring match. That misses cosmetic variants like `debug = True` (with spaces), multi-line `debug=\n True`, or any other form that produces the same runtime semantics. CodeRabbit correctly flagged it as evadable. Replaced with an `ast.walk(tree)` over the parsed app.py: find any `ast.Call` whose keywords contain `debug=True` as a literal Constant. Catches every cosmetic variant by definition. Failure message includes the offending line number(s) and the rationale (issue cppalliance#9), so a future CI break is immediately debuggable. Verified by injecting `debug = True` (with spaces — the form the old check missed) into app.py: - Old check: would have passed (false negative). - New check: failed with `[136]` and the issue-cppalliance#9 message. Then reverted the inject; test passes again. 42/42 tests still pass on the actual app.py.
|
Closed in favour of the in-repo equivalent so CI can run. |
* fix: gate Flask debug / Werkzeug debugger behind opt-in flag (closes #9) The Werkzeug debugger is a documented remote-code-execution primitive. app.py was hard-coding `debug=True`, which exposed RCE to anyone who could reach the listening port — a misconfigured `--host`, an SSH tunnel, or a careless reverse proxy was enough. - Remove the `debug=True` literal from app.py. - Default debug OFF. Opt-in via either `--debug` CLI flag or `FLASK_DEBUG=1` env var (truthy = "1" / "true" / "yes", case-insensitive, whitespace-tolerant). - Print a stderr WARNING when debug is enabled, naming the RCE risk and reminding the operator to bind only to loopback. - Gate the auto-reloader on the same flag. Live-tested all four matrix cells: (default off / --debug / FLASK_DEBUG=1 / FLASK_DEBUG=0). Bogus paths under debug-off return a plain Flask 404, not the Werkzeug debugger console. Helper `resolve_debug_flag(env_value, cli_flag)` lives in `utils/debug_flag.py` so it can be unit-tested without importing Flask (matching the existing test convention in tests/test_cli_args.py). Regression coverage in tests/test_cli_args.py adds 8 cases: - default-off, env-truthy, env-falsey, CLI override - argparse `--debug` default + explicit - source-level guard that fails if `debug=True` is reintroduced * test: AST-walk the debug=True regression guard (CodeRabbit on PR #10) Old guard: `self.assertNotIn("debug=True", src)` — substring match. That misses cosmetic variants like `debug = True` (with spaces), multi-line `debug=\n True`, or any other form that produces the same runtime semantics. CodeRabbit correctly flagged it as evadable. Replaced with an `ast.walk(tree)` over the parsed app.py: find any `ast.Call` whose keywords contain `debug=True` as a literal Constant. Catches every cosmetic variant by definition. Failure message includes the offending line number(s) and the rationale (issue #9), so a future CI break is immediately debuggable. Verified by injecting `debug = True` (with spaces — the form the old check missed) into app.py: - Old check: would have passed (false negative). - New check: failed with `[136]` and the issue-#9 message. Then reverted the inject; test passes again. 42/42 tests still pass on the actual app.py. * review: address PR #20 nits — broaden debug=True guard + FLASK_DEBUG note - AST guard now handles ast.NameConstant (Py3.7) and **{"debug":True} dict-spread bypass; helper extracted for unit testing. - README: opt-in note for the Werkzeug debugger, including that FLASK_ENV=development is NOT consulted (only FLASK_DEBUG=1). - Replace em dashes in app.py comments with ASCII to silence GitHub's non-ASCII banner on review.
Problem
app.pywas hard-codingdebug=Truein itsapp.run(...)call. The Werkzeug debugger that ships with Flask debug mode is a documented remote-code-execution primitive — anyone reaching the listening port (and guessing the PIN, which is generated from machine-stable inputs) can execute arbitrary Python in the server process. The default127.0.0.1bind is the only thing standing between the operator and RCE; a misconfigured--host, an SSH tunnel, or a careless reverse proxy is enough to expose it.There was also no way to opt out — the literal was unconditional.
Change
Three files:
app.py— removeddebug=True. Default debug now off. Opt-in via either--debugCLI flag orFLASK_DEBUG=1env var. When debug is enabled, prints a stderr WARNING naming the RCE risk + reminding the operator to bind only to loopback. Auto-reloader gated on the same flag.utils/debug_flag.py— new tiny module withresolve_debug_flag(env_value, cli_flag). Lives outsideapp.pyso it can be unit-tested without importing Flask (matching the existing_build_app_parsermirror convention in the test suite). Truthy env values:"1","true","yes"— case-insensitive, whitespace-tolerant. CLI flag wins.tests/test_cli_args.py— newTestDebugFlagGatingclass: 8 cases covering env-truthy / env-falsey / env-unset / CLI-override + argparse--debugdefault & explicit + a source-level guard that fails ifdebug=Trueis ever re-introduced as a literal inapp.py.Test plan
Unit:
python3 -m unittest tests.test_cli_args→ 42/42 OK (was 34, +8 new).Live smoke (all four matrix cells, on a real
flask>=3.0install):Debug modeDebugger is active!--debugFLASK_DEBUG=1FLASK_DEBUG=0Bogus paths in trials A/D return a plain Flask 404 — not the orange Werkzeug debugger console. The WARNING prints once per process (parent + reloader child = 2× when the reloader is on, which is correct).
Severity
Critical — Werkzeug debugger exposure is a documented RCE pathway. Listed as a Critical / 1pt item in the eval week-1 plan for
cppa-cursor-browser.Closes #9.
Summary by CodeRabbit
New Features
--debugCLI flag to opt in to debug modeImprovements
Tests