Skip to content

Commit e608c65

Browse files
Richard Taylorclaude
authored andcommitted
DownloadManager: add redact_url= kwarg to hide auth-bearing URLs in logs
`DownloadManager.download_url` currently prints the full request URL three times per download (start, finished, exception path), plus the entire response headers dict. That's useful debug output for the typical callers fetching public URLs (app icons, OS updates, weather data) — but it silently leaks any auth secret embedded in a URL's path or query string. Two real cases discovered in the wild: 1. Lightning Piggy's on-chain wallet (LightningPiggyApp PR #25) queries `https://btc1.trezor.io/api/v2/xpub/zpub6q...?details=txs&tokens=derived` to fetch balance + transactions for a watch-only xpub. The xpub itself is in the path. A leaked xpub exposes the wallet's entire past + future address derivation tree to whoever reads the log — non-recoverable confidentiality damage even though funds custody is unaffected. 2. Any future caller using API-key-in-URL or OAuth-token-in-URL authentication. Pattern is common enough (LNBits, BlueWallet's LNDHub, some HTTPS RPC endpoints) that the leak surface is wider than just Lightning Piggy. The Lightning Piggy app already scrubs xpubs from its own print() calls, but DownloadManager's prints sit below that and re-leak the URL on every poll. This PR adds a `redact_url=False` kwarg (default preserves current behaviour). When set True: - The URL is logged as `scheme://host[:port]/...REDACTED...` instead of full. The host is intentionally kept so failure triage (DNS, connectivity, wrong endpoint) is still possible. - The response-headers dump is suppressed entirely (replaced with `<redacted>`). Response headers often contain `set-cookie`, `cf-ray`, and other correlatable session tokens. - Exception messages have the URL substring scrubbed before printing (aiohttp's ClientConnectorError typically embeds the URL). Default False is deliberate — most callers fetch public URLs and want the full log line for diagnostics. Callers opt in: await DownloadManager.download_url(url, redact_url=True) Tests (`tests/test_download_manager.py`): - 6 unit tests for the new `_safe_url` helper (path-with-query strip, port handling, naked-host pass-through, malformed-URL placeholder, belt-and-braces "secret substrings never appear in output" guard). - 2 mock-surface tests confirming the kwarg flows through correctly and defaults to False. - All 22 existing tests continue to pass (default behaviour unchanged). $ bash tests/unittest.sh tests/test_download_manager.py Ran 30 tests OK The Lightning Piggy on-chain wallet PR will follow up on its side to pass `redact_url=True` once this lands; that's a one-line change in `onchain_wallet.py:fetch_balance_and_payments()`. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
1 parent 32a36af commit e608c65

4 files changed

Lines changed: 145 additions & 18 deletions

File tree

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,9 @@ Board Support:
99
OS:
1010
- Disable the repl on hardware uart for esp32s3 targets (USB serial still works)
1111

12+
Frameworks:
13+
- `DownloadManager.download_url`: add `redact_url=True` kwarg for callers fetching URLs that embed an auth secret (API key, OAuth token, LNBits readkey, xpub/ypub/zpub). When set, the URL is logged as `scheme://host/...REDACTED...`, the response-headers dump is suppressed, and exception messages have any embedded URL scrubbed. Default `False` preserves existing debug output for callers fetching public URLs (app icons, OS updates, etc.). Use case: prevents serial / REPL logs from leaking the secret-bearing URL even when DEBUG-level chatter is on.
14+
1215
0.9.5
1316
=====
1417

internal_filesystem/lib/mpos/net/download_manager.py

Lines changed: 67 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -28,12 +28,12 @@ class DownloadManager:
2828
@classmethod
2929
def download_url(cls, url, outfile=None, total_size=None,
3030
progress_callback=None, chunk_callback=None, headers=None,
31-
speed_callback=None):
31+
speed_callback=None, redact_url=False):
3232
"""Download a URL with flexible output modes (sync or async wrapper).
33-
33+
3434
This method automatically detects whether it's being called from an async context
3535
and either returns a coroutine (for await) or runs synchronously.
36-
36+
3737
Args:
3838
url (str): URL to download (required)
3939
outfile (str, optional): Path to write file. If None, returns bytes.
@@ -42,12 +42,21 @@ def download_url(cls, url, outfile=None, total_size=None,
4242
chunk_callback (coroutine, optional): async def callback(chunk: bytes)
4343
headers (dict, optional): HTTP headers (e.g., {'Range': 'bytes=1000-'})
4444
speed_callback (coroutine, optional): async def callback(bytes_per_second: float)
45-
45+
redact_url (bool, optional): Opt in to redacting the URL in log
46+
output and the response-headers dump. Set True whenever the
47+
URL embeds an auth secret in its path or query string —
48+
e.g. an API key, an OAuth token, an LNBits readkey, or an
49+
xpub/ypub/zpub (which exposes the wallet's whole derivation
50+
tree). Only the `scheme://host[:port]` prefix is kept in
51+
logs; path + query are replaced with "/...REDACTED...".
52+
Defaults to False to preserve current debug output for
53+
callers fetching public URLs (app icons, OS updates, etc.).
54+
4655
Returns:
4756
bytes: Downloaded content (if outfile and chunk_callback are None)
4857
bool: True if successful (when using outfile or chunk_callback)
4958
coroutine: If called from async context, returns awaitable
50-
59+
5160
Raises:
5261
ValueError: If both outfile and chunk_callback are provided
5362
"""
@@ -59,20 +68,44 @@ def download_url(cls, url, outfile=None, total_size=None,
5968
# We're in an async context, return the coroutine
6069
return cls._download_url_async(url, outfile, total_size,
6170
progress_callback, chunk_callback, headers,
62-
speed_callback)
71+
speed_callback, redact_url)
6372
except RuntimeError:
6473
# No running event loop, run synchronously
6574
return asyncio.run(cls._download_url_async(url, outfile, total_size,
6675
progress_callback, chunk_callback, headers,
67-
speed_callback))
76+
speed_callback, redact_url))
6877
except ImportError:
6978
# asyncio not available, shouldn't happen but handle gracefully
7079
raise ImportError("asyncio module not available")
71-
80+
81+
@staticmethod
82+
def _safe_url(url):
83+
"""Return a log-safe rendering of `url` for use when the original URL
84+
carries a secret in its path or query string. Strips everything
85+
after `scheme://host[:port]` and replaces it with "/...REDACTED...".
86+
87+
Examples:
88+
https://example.com/api/v2/xpub/zpub6q... -> https://example.com/...REDACTED...
89+
https://api.example.com:8080/p?key=abc -> https://api.example.com:8080/...REDACTED...
90+
https://example.com -> https://example.com (no path to redact)
91+
not-a-url -> ...REDACTED...
92+
"""
93+
try:
94+
scheme_end = url.find("://")
95+
if scheme_end < 0:
96+
return "...REDACTED..."
97+
path_start = url.find("/", scheme_end + 3)
98+
if path_start < 0:
99+
# No path component — nothing sensitive to strip.
100+
return url
101+
return url[:path_start] + "/...REDACTED..."
102+
except Exception:
103+
return "...REDACTED..."
104+
72105
@classmethod
73106
async def _download_url_async(cls, url, outfile=None, total_size=None,
74107
progress_callback=None, chunk_callback=None, headers=None,
75-
speed_callback=None):
108+
speed_callback=None, redact_url=False):
76109
"""Download a URL with flexible output modes.
77110
78111
Args:
@@ -83,11 +116,14 @@ async def _download_url_async(cls, url, outfile=None, total_size=None,
83116
chunk_callback (coroutine, optional): async def callback(chunk: bytes)
84117
headers (dict, optional): HTTP headers (e.g., {'Range': 'bytes=1000-'})
85118
speed_callback (coroutine, optional): async def callback(bytes_per_second: float)
86-
119+
redact_url (bool, optional): When True, log a redacted URL
120+
(scheme://host only) and suppress the response-headers dump.
121+
See `download_url` for details and use cases.
122+
87123
Returns:
88124
bytes: Downloaded content (if outfile and chunk_callback are None)
89125
bool: True if successful (when using outfile or chunk_callback)
90-
126+
91127
Raises:
92128
ValueError: If both outfile and chunk_callback are provided
93129
"""
@@ -98,6 +134,11 @@ async def _download_url_async(cls, url, outfile=None, total_size=None,
98134
"Use outfile for saving to disk, or chunk_callback for streaming."
99135
)
100136

137+
# Compute the log-safe rendering once; used for every URL-bearing print
138+
# below. When redact_url is False this is just the original URL, so
139+
# existing behaviour is preserved verbatim.
140+
log_url = cls._safe_url(url) if redact_url else url
141+
101142
import aiohttp
102143
session = aiohttp.ClientSession()
103144
sslctx = None # for http
@@ -106,7 +147,7 @@ async def _download_url_async(cls, url, outfile=None, total_size=None,
106147
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
107148
sslctx.verify_mode = ssl.CERT_OPTIONAL # CERT_REQUIRED might fail because MBEDTLS_ERR_SSL_CA_CHAIN_REQUIRED
108149

109-
print(f"DownloadManager: Downloading {url}")
150+
print(f"DownloadManager: Downloading {log_url}")
110151

111152
fd = None
112153
try:
@@ -120,7 +161,13 @@ async def _download_url_async(cls, url, outfile=None, total_size=None,
120161
raise RuntimeError(f"HTTP {response.status}")
121162

122163
# Figure out total size and starting offset (for resume support)
123-
print("DownloadManager: Response headers:", response.headers)
164+
# When redacting, suppress the headers dump entirely — response
165+
# headers can include `set-cookie`, `cf-ray` and other tokens
166+
# that correlate to the request's secret-bearing URL.
167+
if redact_url:
168+
print("DownloadManager: Response headers: <redacted>")
169+
else:
170+
print("DownloadManager: Response headers:", response.headers)
124171
resume_offset = 0 # Starting byte offset (0 for new downloads, >0 for resumed)
125172

126173
if total_size is None:
@@ -241,7 +288,7 @@ async def _download_url_async(cls, url, outfile=None, total_size=None,
241288
speed_last_update_time = current_time
242289
else:
243290
# Chunk is None, download complete
244-
print(f"DownloadManager: Finished downloading {url}")
291+
print(f"DownloadManager: Finished downloading {log_url}")
245292
if fd:
246293
fd.close()
247294
fd = None
@@ -252,7 +299,12 @@ async def _download_url_async(cls, url, outfile=None, total_size=None,
252299
return b''.join(chunks)
253300

254301
except Exception as e:
255-
print(f"DownloadManager: Exception during download: {e}")
302+
# Exception strings from aiohttp often embed the full URL —
303+
# scrub it before printing when the caller asked for redaction.
304+
err_str = str(e)
305+
if redact_url and url in err_str:
306+
err_str = err_str.replace(url, log_url)
307+
print(f"DownloadManager: Exception during download: {err_str}")
256308
if fd:
257309
fd.close()
258310
raise # Re-raise the exception instead of suppressing it

internal_filesystem/lib/mpos/testing/mocks.py

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -747,19 +747,21 @@ def __init__(self):
747747

748748
async def download_url(self, url, outfile=None, total_size=None,
749749
progress_callback=None, chunk_callback=None, headers=None,
750-
speed_callback=None):
750+
speed_callback=None, redact_url=False):
751751
"""Mock async download with flexible output modes."""
752752
self.url_received = url
753753
self.headers_received = headers
754-
754+
self.redact_url_received = redact_url
755+
755756
self.call_history.append({
756757
'url': url,
757758
'outfile': outfile,
758759
'total_size': total_size,
759760
'headers': headers,
760761
'has_progress_callback': progress_callback is not None,
761762
'has_chunk_callback': chunk_callback is not None,
762-
'has_speed_callback': speed_callback is not None
763+
'has_speed_callback': speed_callback is not None,
764+
'redact_url': redact_url,
763765
})
764766

765767
if self.should_fail:

tests/test_download_manager.py

Lines changed: 70 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -547,3 +547,73 @@ async def track_progress(percent):
547547
# Verify progress values are in valid range
548548
for pct in progress_calls:
549549
self.assertTrue(0 <= pct <= 100)
550+
551+
552+
class TestSafeUrl(unittest.TestCase):
553+
"""Unit tests for the `_safe_url` helper used by `redact_url=True`."""
554+
555+
def test_https_with_path_and_query(self):
556+
u = "https://btc1.trezor.io/api/v2/xpub/zpub6q...secret...stuff?details=txs&tokens=derived"
557+
self.assertEqual(DownloadManager._safe_url(u), "https://btc1.trezor.io/...REDACTED...")
558+
559+
def test_http_with_port_and_path(self):
560+
u = "http://api.example.com:8080/path/secret?key=abc"
561+
self.assertEqual(DownloadManager._safe_url(u), "http://api.example.com:8080/...REDACTED...")
562+
563+
def test_naked_host_no_path_returned_unchanged(self):
564+
# Nothing sensitive after the host — nothing to strip.
565+
u = "https://example.com"
566+
self.assertEqual(DownloadManager._safe_url(u), "https://example.com")
567+
568+
def test_trailing_slash_only(self):
569+
# `https://example.com/` has an empty path; safe to redact as the
570+
# function still finds a `/` after the scheme.
571+
u = "https://example.com/"
572+
self.assertEqual(DownloadManager._safe_url(u), "https://example.com/...REDACTED...")
573+
574+
def test_malformed_url_returns_generic_placeholder(self):
575+
# Anything without `://` is treated as untrusted — replace whole.
576+
self.assertEqual(DownloadManager._safe_url("not-a-url-at-all"), "...REDACTED...")
577+
578+
def test_secret_substrings_never_appear_in_redacted_output(self):
579+
# Belt-and-braces check: the secret-bearing parts of the input
580+
# must not appear in the redacted output. Catches future regressions
581+
# where the redaction logic might accidentally keep a substring.
582+
u = "https://idx.example.com/wallet/SECRETxpubABCDEFG?key=TOKEN_VALUE"
583+
safe = DownloadManager._safe_url(u)
584+
# MicroPython's unittest port has assertIn but not assertNotIn — use
585+
# assertFalse(... in ...) for the negative case.
586+
self.assertFalse("SECRETxpub" in safe)
587+
self.assertFalse("TOKEN_VALUE" in safe)
588+
self.assertIn("https://idx.example.com", safe) # host kept on purpose
589+
590+
591+
class TestRedactUrlKwarg(unittest.TestCase):
592+
"""Verify the redact_url kwarg flows through the call surface (sync wrapper
593+
+ async impl + mock) without changing default behaviour."""
594+
595+
def test_mock_records_redact_url_default_false(self):
596+
import asyncio
597+
mock = MockDownloadManager()
598+
# Configure a stub payload so download_url returns deterministic data
599+
mock.download_data = b"x"
600+
601+
async def go():
602+
await mock.download_url("https://example.com/path")
603+
asyncio.run(go())
604+
605+
# Default: redact_url not requested.
606+
self.assertEqual(mock.call_history[-1]['redact_url'], False)
607+
self.assertEqual(mock.redact_url_received, False)
608+
609+
def test_mock_records_redact_url_true_when_passed(self):
610+
import asyncio
611+
mock = MockDownloadManager()
612+
mock.download_data = b"x"
613+
614+
async def go():
615+
await mock.download_url("https://example.com/path", redact_url=True)
616+
asyncio.run(go())
617+
618+
self.assertEqual(mock.call_history[-1]['redact_url'], True)
619+
self.assertEqual(mock.redact_url_received, True)

0 commit comments

Comments
 (0)