perf: reduce memory baseline for busybox and sensor#19999
Draft
Skipping CI for Draft Pull Request.
🚀 Build Images Ready

Images are ready for commit 10e59b0. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.11.x-661-g10e59b0d79
Reduce init-time memory for the busybox binary by eliminating unnecessary imports, deferring allocations with sync.OnceValue, and breaking heavy transitive dependency chains.

Results (Linux amd64):
- Busybox: 16.1 MB -> 12.9 MB heap (-20%), 245K -> 173K mallocs (-29%)
- AC standalone: 9.1 MB -> 7.2 MB heap (-21%), 87K -> 51K mallocs (-41%)
- Binary size: 205 MB -> 194 MB (-5%)

Generated with assistance from AI

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
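To illustrate the deferral technique named in this commit message, here is a minimal sketch of sync.OnceValue (Go 1.21+) moving an allocation from package init to first use. The table and its contents are hypothetical and are not taken from the actual change.

```go
package main

import (
	"fmt"
	"sync"
)

// initCount records how many times the expensive initializer has run.
var initCount int

// lookupTable is a hypothetical heavy structure. Wrapping its construction in
// sync.OnceValue means nothing is allocated at package init; the map is built
// on the first call and the same value is returned on every later call.
var lookupTable = sync.OnceValue(func() map[string]int {
	initCount++
	m := make(map[string]int, 1024)
	for i := 0; i < 1024; i++ {
		m[fmt.Sprintf("key-%d", i)] = i
	}
	return m
})

func main() {
	fmt.Println("initializations before first use:", initCount)
	fmt.Println("key-7 ->", lookupTable()["key-7"])
	fmt.Println("key-9 ->", lookupTable()["key-9"])
	fmt.Println("initializations after two uses:", initCount)
}
```

A code path that never calls lookupTable never pays for the map, which is the property that matters for a multi-command binary where each invocation uses only a fraction of the linked packages.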
Each logger that writes to a file spawns a lumberjack goroutine for log rotation. With ~30 loggers writing to /var/log/stackrox/log.txt, that's 30 idle goroutines + 30 independent file handles to the same file.

In container environments, logs go to stdout and are collected by the container runtime; file logging is unnecessary overhead.

Set ROX_LOGGING_TO_FILE=false to disable file logging, saving:
- 30 goroutines and their stacks
- File I/O overhead
- lumberjack rotation processing

Default is true (unchanged behavior) for backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
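The opt-out described above could be read roughly as follows. This is a sketch only: the helper name fileLoggingEnabled and the choice to treat unparseable values as the default are assumptions, not the repository's actual env-setting machinery.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// fileLoggingEnabled reports whether file logging should be configured.
// The lookup function is injected so the logic can be exercised without
// touching the real process environment (os.LookupEnv fits the signature).
func fileLoggingEnabled(lookup func(string) (string, bool)) bool {
	raw, ok := lookup("ROX_LOGGING_TO_FILE")
	if !ok {
		return true // unset: keep the existing behavior
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true // assumption: an unparseable value falls back to the default
	}
	return enabled
}

func main() {
	if fileLoggingEnabled(os.LookupEnv) {
		fmt.Println("file logging: enabled")
	} else {
		fmt.Println("file logging: disabled, stdout only")
	}
}
```

Defaulting to true on both the unset and the malformed case keeps the change backward compatible, at the cost of silently ignoring typos in the variable's value.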
Each CreateLogger call created an independent lumberjack.Logger for the same log file, spawning its own rotation goroutine. With ~30 loggers, that's 30 goroutines + 30 file handles to the same file.

Share a single writer per path via a map. This reduces log rotation goroutines from 30 to 1 and eliminates potential corruption from concurrent uncoordinated writes to the same file.

GC sweet spot experiment findings (included in commit message for context):
- 128Mi: GC thrashing (84 GC/min, 200m CPU)
- 160Mi: Sweet spot (2 GC/min, 4m CPU)
- 192Mi: Comfortable (0 GC/min, 3m CPU)
- Rule: set limit to 1.3-1.5x natural heap size

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
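The "single writer per path via a map" idea can be sketched as a small mutex-guarded registry. The names writerRegistry, newWriterRegistry, and countingOpener are hypothetical, and an in-memory buffer stands in for lumberjack.Logger so the example has no external dependencies.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// writerRegistry hands out one shared writer per file path.
type writerRegistry struct {
	mu      sync.Mutex
	writers map[string]io.Writer
	open    func(path string) io.Writer // called at most once per path
}

func newWriterRegistry(open func(string) io.Writer) *writerRegistry {
	return &writerRegistry{writers: make(map[string]io.Writer), open: open}
}

// get returns the existing writer for path, opening it on first request.
func (r *writerRegistry) get(path string) io.Writer {
	r.mu.Lock()
	defer r.mu.Unlock()
	if w, ok := r.writers[path]; ok {
		return w
	}
	w := r.open(path)
	r.writers[path] = w
	return w
}

// countingOpener is a stand-in opener that counts how many writers it creates.
// In the real change the opener would construct the rotating file writer.
func countingOpener(opened *int) func(string) io.Writer {
	return func(string) io.Writer {
		*opened++
		return &bytes.Buffer{}
	}
}

func main() {
	opened := 0
	reg := newWriterRegistry(countingOpener(&opened))
	for i := 0; i < 30; i++ {
		fmt.Fprintf(reg.get("/var/log/stackrox/log.txt"), "logger %d ready\n", i)
	}
	fmt.Println("writers opened:", opened)
}
```

The registry only deduplicates creation; the shared writer itself still has to be safe for concurrent Write calls, which the in-memory buffer used here is not.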
Process enrichment LRU cache was hardcoded at 100K entries, a size designed for large enterprise clusters with thousands of containers. On a 50-container edge cluster, this is 2000x oversized.

Use pkg/sensor/queue.ScaleSize to scale based on ROX_MEMLIMIT:
- 128Mi limit → ~3K entries (sufficient for 50 containers)
- 4Gi limit → 100K entries (unchanged behavior)
- Minimum: 100 entries

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
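The figures in this commit message are consistent with linear scaling against a 4Gi reference and a floor of 100 entries. The sketch below works through that arithmetic. It is not the real pkg/sensor/queue.ScaleSize: the function name scaledCacheSize, the reference constant, and the handling of a missing limit are all assumptions.

```go
package main

import "fmt"

const (
	referenceLimit = 4 << 30 // 4Gi: limit at which the full base size applies (assumed)
	minEntries     = 100     // floor stated in the commit message
)

// scaledCacheSize scales base linearly with the memory limit, never exceeding
// base and never dropping below minEntries. A non-positive limit is treated
// as "no limit configured" and keeps the base size (assumption).
func scaledCacheSize(base int, memLimitBytes int64) int {
	if memLimitBytes <= 0 || memLimitBytes >= referenceLimit {
		return base
	}
	scaled := int(int64(base) * memLimitBytes / referenceLimit)
	if scaled < minEntries {
		return minEntries
	}
	return scaled
}

func main() {
	const base = 100_000
	fmt.Println("128Mi:", scaledCacheSize(base, 128<<20)) // 100000 * 128Mi / 4Gi = 3125
	fmt.Println("4Gi:  ", scaledCacheSize(base, 4<<30))   // full base size
	fmt.Println("1Mi:  ", scaledCacheSize(base, 1<<20))   // clamped to the 100-entry floor
}
```

Under these assumptions a 128Mi limit yields 3125 entries, which matches the "~3K" figure above.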
696081a to 10e59b0
Description
Umbrella PR for the memory baseline optimization series. This tracks the overall effort — individual changes are in separate PRs for focused review.
Merged
Open — Schema Lazy Loading
Open — Init-Time Reductions
Open — Logging Improvements
Not Yet Created
Combined Measurements (live GKE cluster)
Note: These measurements are from the zap sampler change only. Additional savings expected from schema lazy loading and init-time reductions once deployed.
User-facing documentation
Testing and quality
How I validated my change
This is a tracking PR. Individual PRs have their own validation.
AI-assisted.