fix: disable debug=True / Werkzeug debugger exposure (closes #9) by timon0305 · Pull Request #10 · cppalliance/cppa-cursor-browser

timon0305 · 2026-05-04T21:25:49Z

Problem

app.py was hard-coding debug=True in its app.run(...) call. The Werkzeug debugger that ships with Flask debug mode is a documented remote-code-execution primitive — anyone reaching the listening port (and guessing the PIN, which is generated from machine-stable inputs) can execute arbitrary Python in the server process. The default 127.0.0.1 bind is the only thing standing between the operator and RCE; a misconfigured --host, an SSH tunnel, or a careless reverse proxy is enough to expose it.

There was also no way to opt out — the literal was unconditional.

Change

Three files:

app.py — removed debug=True. Default debug now off. Opt-in via either --debug CLI flag or FLASK_DEBUG=1 env var. When debug is enabled, prints a stderr WARNING naming the RCE risk + reminding the operator to bind only to loopback. Auto-reloader gated on the same flag.
utils/debug_flag.py — new tiny module with resolve_debug_flag(env_value, cli_flag). Lives outside app.py so it can be unit-tested without importing Flask (matching the existing _build_app_parser mirror convention in the test suite). Truthy env values: "1", "true", "yes" — case-insensitive, whitespace-tolerant. CLI flag wins.
tests/test_cli_args.py — new TestDebugFlagGating class: 8 cases covering env-truthy / env-falsey / env-unset / CLI-override + argparse --debug default & explicit + a source-level guard that fails if debug=True is ever re-introduced as a literal in app.py.

Test plan

Unit: python3 -m unittest tests.test_cli_args → 42/42 OK (was 34, +8 new).

Live smoke (all four matrix cells, on a real flask>=3.0 install):

Trial	Setting	`Debug mode`	WARNING	`Debugger is active!`
A	(default — no env, no flag)	off	—	—
B	`--debug`	on	✅	✅
C	`FLASK_DEBUG=1`	on	✅	✅
D	`FLASK_DEBUG=0`	off	—	—

Bogus paths in trials A/D return a plain Flask 404 — not the orange Werkzeug debugger console. The WARNING prints once per process (parent + reloader child = 2× when the reloader is on, which is correct).

Severity

Critical — Werkzeug debugger exposure is a documented RCE pathway. Listed as a Critical / 1pt item in the eval week-1 plan for cppa-cursor-browser.

Closes #9.

Summary by CodeRabbit

New Features
- Added a --debug CLI flag to opt in to debug mode
- Displays a security warning when debug mode is enabled
Improvements
- Debug mode now defaults to off instead of always on
- Auto-reloader behavior respects platform-specific requirements
Tests
- Added tests covering debug-flag resolution across environment values and CLI overrides

…ppalliance#9) The Werkzeug debugger is a documented remote-code-execution primitive. app.py was hard-coding `debug=True`, which exposed RCE to anyone who could reach the listening port — a misconfigured `--host`, an SSH tunnel, or a careless reverse proxy was enough. - Remove the `debug=True` literal from app.py. - Default debug OFF. Opt-in via either `--debug` CLI flag or `FLASK_DEBUG=1` env var (truthy = "1" / "true" / "yes", case-insensitive, whitespace-tolerant). - Print a stderr WARNING when debug is enabled, naming the RCE risk and reminding the operator to bind only to loopback. - Gate the auto-reloader on the same flag. Live-tested all four matrix cells: (default off / --debug / FLASK_DEBUG=1 / FLASK_DEBUG=0). Bogus paths under debug-off return a plain Flask 404, not the Werkzeug debugger console. Helper `resolve_debug_flag(env_value, cli_flag)` lives in `utils/debug_flag.py` so it can be unit-tested without importing Flask (matching the existing test convention in tests/test_cli_args.py). Regression coverage in tests/test_cli_args.py adds 8 cases: - default-off, env-truthy, env-falsey, CLI override - argparse `--debug` default + explicit - source-level guard that fails if `debug=True` is reintroduced

coderabbitai · 2026-05-04T21:26:03Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 48e40c46-12c3-4cee-8e25-f550d09f8e12

📥 Commits

Reviewing files that changed from the base of the PR and between 26258c0 and c2b4a21.

📒 Files selected for processing (1)

tests/test_cli_args.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_cli_args.py

📝 Walkthrough

Walkthrough

The PR makes Flask debug opt-in by adding a CLI --debug flag and a resolve_debug_flag utility that combines FLASK_DEBUG and the CLI flag. The app warns to stderr when debug is enabled and passes debug_enabled and a gated use_reloader into app.run. Tests verify behavior and forbid a literal debug=True.

Changes

Debug Mode Opt-in and Safety

Layer / File(s)	Summary
Data / Decision Shape `utils/debug_flag.py`	New `resolve_debug_flag(env_value, cli_flag) -> bool`: CLI forces true; `env_value is None` -> false; otherwise truthy strings (`"1"`, `"true"`, `"yes"`) enable debug.
CLI Surface `app.py`	Adds `--debug` argparse boolean (`action="store_true"`) to the app's CLI.
Integration / Wiring `app.py`	Imports `resolve_debug_flag`, computes `debug_enabled = resolve_debug_flag(os.environ.get("FLASK_DEBUG"), args.debug)`, prints a stderr RCE warning when enabled.
Runtime Behavior `app.py`	Replaces literal `debug=True` with `debug=debug_enabled` and `use_reloader=debug_enabled and (sys.platform != "win32")` in `app.run(...)`.
Tests / Regression `tests/test_cli_args.py`	Adds `TestDebugFlagGating` and updates `_build_app_parser()` to include `--debug`. Tests cover env unset/empty/falsey/truthy, CLI precedence, parser defaults when `--debug` absent/present, and AST check forbidding literal `debug=True` in `app.py`.

sequenceDiagram
    participant User
    participant CLI_Parser
    participant ResolveDebug
    participant App_Run
    User->>CLI_Parser: run app.py [--debug]
    CLI_Parser->>ResolveDebug: env_value=FLASK_DEBUG, cli_flag=args.debug
    ResolveDebug->>ResolveDebug: apply opt-in logic (CLI OR truthy env)
    ResolveDebug->>App_Run: return debug_enabled (bool)
    alt debug_enabled == True
        App_Run->>User: write RCE warning to stderr (print)
    end
    App_Run->>App_Run: set debug=debug_enabled<br/>set use_reloader accordingly
    App_Run->>User: start Flask server

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble flags and nibble code,

FLASK_DEBUG whispers on the road.
--debug hops in, but only by choice,
I warn of sparks with a cautious voice.
Safe starts, small drumbeat — a rabbit's rejoice.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and accurately describes the main change: disabling the hardcoded debug=True that exposed the Werkzeug debugger vulnerability, and it directly references the linked issue `#9`.
Linked Issues check	✅ Passed	All core requirements from issue `#9` are implemented: debug=True removed, default disabled, --debug flag added, FLASK_DEBUG env var support, stderr warning on debug enabled, auto-reloader gated, regression tests included.
Out of Scope Changes check	✅ Passed	All changes directly address the security vulnerability in issue `#9`. The PR adds necessary files (debug_flag.py), updates app.py to use opt-in debug, and adds comprehensive test coverage—all aligned with objectives.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/test_cli_args.py (1)
37-47: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Exercise the real app parser instead of a test-local clone.

Lines 289-295 currently prove the behavior of _build_app_parser(), not the parser that app.py actually uses. That means a future regression in the real --debug definition can stay green here as long as this copy is not changed in lockstep. Please move parser construction into a Flask-free helper and import that from both app.py and this test so the security-sensitive flag is covered by one source of truth.

Also applies to: 289-295
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cli_args.py` around lines 37 - 47, Extract the argparse
construction currently duplicated in _build_app_parser into a shared, Flask-free
helper function (e.g., create_app_parser) in a module that neither imports Flask
nor app runtime; update app.py to import and use create_app_parser instead of
building its own parser, and change tests/test_cli_args.py to import and call
create_app_parser (removing the local _build_app_parser) so the real parser
(including the --debug flag) is exercised from one source of truth.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_cli_args.py`:
- Around line 300-304: Replace the brittle substring check in
test_app_py_does_not_hardcode_debug_true with an AST-based check: parse the file
contents (src) with ast.parse, walk all ast.Call nodes and inspect their
.keywords for any keyword with arg == "debug" whose value is a literal True
(handle both ast.Constant(value=True) and older ast.NameConstant/ast.Name
patterns), and assert that no such call exists; keep the function name
test_app_py_does_not_hardcode_debug_true and fail the test if any debug=True
literal is found in a call site.

---

Outside diff comments:
In `@tests/test_cli_args.py`:
- Around line 37-47: Extract the argparse construction currently duplicated in
_build_app_parser into a shared, Flask-free helper function (e.g.,
create_app_parser) in a module that neither imports Flask nor app runtime;
update app.py to import and use create_app_parser instead of building its own
parser, and change tests/test_cli_args.py to import and call create_app_parser
(removing the local _build_app_parser) so the real parser (including the --debug
flag) is exercised from one source of truth.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6e397f53-d775-411f-befc-0b9ce51f68a7

📥 Commits

Reviewing files that changed from the base of the PR and between f8b3cb3 and 26258c0.

📒 Files selected for processing (3)

app.py
tests/test_cli_args.py
utils/debug_flag.py

…lliance#10) Old guard: `self.assertNotIn("debug=True", src)` — substring match. That misses cosmetic variants like `debug = True` (with spaces), multi-line `debug=\n True`, or any other form that produces the same runtime semantics. CodeRabbit correctly flagged it as evadable. Replaced with an `ast.walk(tree)` over the parsed app.py: find any `ast.Call` whose keywords contain `debug=True` as a literal Constant. Catches every cosmetic variant by definition. Failure message includes the offending line number(s) and the rationale (issue cppalliance#9), so a future CI break is immediately debuggable. Verified by injecting `debug = True` (with spaces — the form the old check missed) into app.py: - Old check: would have passed (false negative). - New check: failed with `[136]` and the issue-cppalliance#9 message. Then reverted the inject; test passes again. 42/42 tests still pass on the actual app.py.

timon0305 · 2026-05-07T21:56:48Z

Closed in favour of the in-repo equivalent so CI can run.

* fix: gate Flask debug / Werkzeug debugger behind opt-in flag (closes #9) The Werkzeug debugger is a documented remote-code-execution primitive. app.py was hard-coding `debug=True`, which exposed RCE to anyone who could reach the listening port — a misconfigured `--host`, an SSH tunnel, or a careless reverse proxy was enough. - Remove the `debug=True` literal from app.py. - Default debug OFF. Opt-in via either `--debug` CLI flag or `FLASK_DEBUG=1` env var (truthy = "1" / "true" / "yes", case-insensitive, whitespace-tolerant). - Print a stderr WARNING when debug is enabled, naming the RCE risk and reminding the operator to bind only to loopback. - Gate the auto-reloader on the same flag. Live-tested all four matrix cells: (default off / --debug / FLASK_DEBUG=1 / FLASK_DEBUG=0). Bogus paths under debug-off return a plain Flask 404, not the Werkzeug debugger console. Helper `resolve_debug_flag(env_value, cli_flag)` lives in `utils/debug_flag.py` so it can be unit-tested without importing Flask (matching the existing test convention in tests/test_cli_args.py). Regression coverage in tests/test_cli_args.py adds 8 cases: - default-off, env-truthy, env-falsey, CLI override - argparse `--debug` default + explicit - source-level guard that fails if `debug=True` is reintroduced * test: AST-walk the debug=True regression guard (CodeRabbit on PR #10) Old guard: `self.assertNotIn("debug=True", src)` — substring match. That misses cosmetic variants like `debug = True` (with spaces), multi-line `debug=\n True`, or any other form that produces the same runtime semantics. CodeRabbit correctly flagged it as evadable. Replaced with an `ast.walk(tree)` over the parsed app.py: find any `ast.Call` whose keywords contain `debug=True` as a literal Constant. Catches every cosmetic variant by definition. Failure message includes the offending line number(s) and the rationale (issue #9), so a future CI break is immediately debuggable. Verified by injecting `debug = True` (with spaces — the form the old check missed) into app.py: - Old check: would have passed (false negative). - New check: failed with `[136]` and the issue-#9 message. Then reverted the inject; test passes again. 42/42 tests still pass on the actual app.py. * review: address PR #20 nits — broaden debug=True guard + FLASK_DEBUG note - AST guard now handles ast.NameConstant (Py3.7) and **{"debug":True} dict-spread bypass; helper extracted for unit testing. - README: opt-in note for the Werkzeug debugger, including that FLASK_ENV=development is NOT consulted (only FLASK_DEBUG=1). - Replace em dashes in app.py comments with ASCII to silence GitHub's non-ASCII banner on review.

coderabbitai Bot reviewed May 4, 2026

View reviewed changes

Comment thread tests/test_cli_args.py Outdated

timon0305 self-assigned this May 7, 2026

timon0305 closed this May 7, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: disable debug=True / Werkzeug debugger exposure (closes #9)#10

fix: disable debug=True / Werkzeug debugger exposure (closes #9)#10
timon0305 wants to merge 2 commits into
cppalliance:masterfrom
timon0305:fix/disable-debug-werkzeug-9

timon0305 commented May 4, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 4, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

timon0305 commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

timon0305 commented May 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Change

Test plan

Severity

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

timon0305 commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

timon0305 commented May 4, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 4, 2026 •

edited

Loading