Skip to content

diagnostics_channel: add opt-in subscriber suppression via suppressedBy and suppressed()#63651

Open
DivyanshuX9 wants to merge 16 commits into
nodejs:mainfrom
DivyanshuX9:diag/suppression-als
Open

diagnostics_channel: add opt-in subscriber suppression via suppressedBy and suppressed()#63651
DivyanshuX9 wants to merge 16 commits into
nodejs:mainfrom
DivyanshuX9:diag/suppression-als

Conversation

@DivyanshuX9
Copy link
Copy Markdown
Contributor

Summary

This change moves suppression tracking into the diagnostics channel runtime
using a per-async-context storage so suppression state survives Promise/timer
boundaries. It introduces a lazy AsyncLocalStorage-based context for suppression,
wires per-subscription/store opt-in, and exposes a suppressed(key, fn, thisArg, ...args) helper that runs fn with the given suppression key active for the
current async context.


What Changed

lib/diagnostics_channel.js

  • Use lazy-initialized AsyncLocalStorage for suppression context
  • Implemented withSuppressionsContext(set, fn, thisArg, args)
    and getSuppressionsStorage()
  • suppressed(key, fn, ...) now runs fn inside the ALS context so
    suppression persists across Promise/async boundaries
  • ActiveChannel.prototype.subscribe and bindStore accept an
    options.suppressedBy opt-in and validate key types
  • publish and store-scoping now check the active suppression set
    (via ALS.getStore()) and skip subscribers/stores whose
    suppressedBy key is active

Tests

  • Added test/parallel/test-diagnostics-channel-suppression.js
    covering 10 scenarios: sync, async, timer, and store cases

Note: A temporary debug console.error in suppressed() was added
during development and must be removed before merge.


Why

The previous stack-based approach lost suppression state across async
boundaries (Promise and timer continuations). Using AsyncLocalStorage
preserves suppression state across Promise chains and
microtask/macrotask boundaries while remaining safe to lazily initialize
during bootstrap.


Backward Compatibility

  • Suppression is opt-in per-subscriber and per-store — existing code
    is completely unaffected unless it explicitly uses suppressedBy or
    calls suppressed()
  • ALS initialization is lazy and wrapped in try/catch — if ALS is
    unavailable at runtime, the code falls back gracefully to a non-ALS
    path with no cross-async persistence, preventing snapshot/bootstrap
    failures
  • Behavior changes are strictly limited to code that uses
    suppressedBy or calls suppressed()

Testing Performed

  • Rebuilt Node so snapshot includes the modified JS
  • Ran python tools/test.py parallel/test-diagnostics-channel-suppression.js
  • Verified suppression across Promise and timer boundaries via standalone scripts
  • One harness mustCall mismatch was observed during debugging (now resolved)

Recommend a full CI run before landing.


Pre-Merge Checklist

  • Remove temporary console.error debug logging from suppressed()
    in diagnostics_channel.js
  • Run full test suite — at minimum:
    tools/test.py parallel/test-diagnostics-channel-suppression.js
    until all tests pass with no harness warnings
  • Clean up test-diagnostics-channel-suppression.js
    (remove any debug scaffolding used during development)
  • Update API docs — document:
    • suppressed(key, fn, thisArg, ...args)
    • subscribe(handler, { suppressedBy })
    • bindStore(als, transform, { suppressedBy })
  • Add changelog entry in CHANGELOG.md per repo process
  • Run microbenchmarks on heavily-subscribed channels to
    verify ALS overhead is acceptable
  • Request reviews from: @nodejs/diagnostics @nodejs/async-hooks
  • Squash/clean commits before landing if preferred

DivyanshuX9 and others added 2 commits May 24, 2026 21:33
Defer non-critical warnings to the next event loop iteration when
can_call_into_js() returns false. This prevents crashes when V8
emits warnings during REPL preview evaluation or other contexts
where JavaScript execution is temporarily forbidden.

When a warning is emitted inside DisallowJavascriptExecutionScope,
ProcessEmitWarningGeneric cannot be called immediately. Instead,
use env->SetImmediate() to queue the warning emission for after
the scope exits. This preserves full warning formatting, deprecation
codes, and --redirect-warnings routing.

Signed-off-by: Divyanshu Sharma <[email protected]>
Copilot AI review requested due to automatic review settings May 29, 2026 21:31
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. diagnostics_channel Issues and PRs related to diagnostics channel needs-ci PRs that need a full CI run. labels May 29, 2026
DivyanshuX9 added a commit to DivyanshuX9/node that referenced this pull request May 29, 2026
@DivyanshuX9 DivyanshuX9 force-pushed the diag/suppression-als branch from 5b4110a to 8b122c2 Compare May 29, 2026 21:49
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@DivyanshuX9
Copy link
Copy Markdown
Contributor Author

DivyanshuX9 commented May 29, 2026

@BridgeAR @bengl @Qard , I am opening this as a implementation
of #63623. Test file is included. Happy to adjust the API shape
or implementation based on your feedback before CI runs.

@DivyanshuX9 DivyanshuX9 force-pushed the diag/suppression-als branch from 8b122c2 to e4aea85 Compare May 29, 2026 22:37
@codecov
Copy link
Copy Markdown

codecov Bot commented May 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.35%. Comparing base (79def6d) to head (e630f72).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #63651      +/-   ##
==========================================
+ Coverage   90.32%   90.35%   +0.02%     
==========================================
  Files         732      732              
  Lines      236435   236573     +138     
  Branches    44527    44556      +29     
==========================================
+ Hits       213563   213749     +186     
+ Misses      14589    14539      -50     
- Partials     8283     8285       +2     
Files with missing lines Coverage Δ
lib/diagnostics_channel.js 98.31% <100.00%> (+0.17%) ⬆️

... and 50 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Member

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are missing a lot of necessary common.mustCall(fn) wrappers.

Also, it seems like this was just vibe-coded without reviewing the output before submitting the PR. Please ensure it is in a good state and that the test suite and lint passes before submitting.

Comment thread lib/diagnostics_channel.js Outdated
Comment thread src/node_errors.cc
Comment on lines 1067 to 1072
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary and would be a substantial behavioural change. This should be removed.

Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread lib/diagnostics_channel.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
@DivyanshuX9
Copy link
Copy Markdown
Contributor Author

DivyanshuX9 commented May 30, 2026

@Qard , thanks for the detailed review. So far the checklist of what i have fixed in the latest push:

[] Removed the impossible ALS try/catch fallback
[] Removed node_errors.cc changes as you suggested
[] Replaced all IIFE wrappers with plain blocks in tests
[] Added common.mustCall() / common.mustNotCall() to all handlers
[] Ran lint and tests locally , both passing

Ready for re-review and any changes if you want, when you have time please review it.

Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
Comment thread test/parallel/test-diagnostics-channel-suppression.js Outdated
@DivyanshuX9 DivyanshuX9 force-pushed the diag/suppression-als branch from 0f6e780 to 8fa6fd8 Compare May 30, 2026 12:27
@DivyanshuX9 DivyanshuX9 force-pushed the diag/suppression-als branch from 8fa6fd8 to b8194af Compare May 30, 2026 12:38
@Qard
Copy link
Copy Markdown
Member

Qard commented May 30, 2026

The node_errors.cc change still seems to be present.

@DivyanshuX9
Copy link
Copy Markdown
Contributor Author

DivyanshuX9 commented May 30, 2026

@Qard , ok i removed my changes from the node_errors.cc and put what was in the main

Copy link
Copy Markdown
Member

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM at this point, though I'm not sure the lazy-loading of ALS is really necessary.

Comment thread lib/diagnostics_channel.js
@DivyanshuX9
Copy link
Copy Markdown
Contributor Author

@Qard ,it turns out the lazy-loading IS necessary. Removing it caused
a snapshot generation failure:

Error: Should not query options before bootstrapping is done

diagnostics_channel loads early enough in bootstrap that
async_hooks isn't yet available at top-level require time.
The lazy getter defers ALS initialization until first use,
which is safely after bootstrap completes.

i would like to reverted to the lazy pattern. CI confirms this was the right call.

@Qard
Copy link
Copy Markdown
Member

Qard commented May 31, 2026

Alright, I figured that might be the case. Modules loaded during bootstrap are a bit finicky, so lazy-loading that is fine.

@DivyanshuX9 DivyanshuX9 requested a review from Copilot May 31, 2026 11:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@BridgeAR BridgeAR added the semver-minor PRs that contain new features and should be released in the next minor version. label May 31, 2026
Copy link
Copy Markdown
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality wise this would be working, but performance wise this is not what I imagined and I believe we should invest more time into name bikeshedding.

Instead of naming it suppressedBy and suppressed, what about bypass() and bypassId? That way it is consistent and I personally like bypass more after thinking about it for a bit.

We definitely also need documentation for the feature.

Comment on lines +687 to +692
if (key === null || key === undefined) {
throw new ERR_INVALID_ARG_TYPE('key', ['object', 'symbol', 'function'], key);
}
const t = typeof key;
if (t === 'string' || t === 'number' || t === 'bigint' || t === 'boolean') {
throw new ERR_INVALID_ARG_TYPE('key', ['object', 'symbol', 'function'], key);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (key === null || key === undefined) {
throw new ERR_INVALID_ARG_TYPE('key', ['object', 'symbol', 'function'], key);
}
const t = typeof key;
if (t === 'string' || t === 'number' || t === 'bigint' || t === 'boolean') {
throw new ERR_INVALID_ARG_TYPE('key', ['object', 'symbol', 'function'], key);
if (key == null) {
throw new ERR_INVALID_ARG_TYPE('key', ['object', 'symbol', 'function'], key);
}
const type = typeof key;
if (type !== 'object' && type !== 'symbol') {
throw new ERR_INVALID_ARG_TYPE('key', ['object', 'symbol'], key);

Comment on lines +191 to +195
const subscriberId = options && options.subscriberId !== undefined ? options.subscriberId : null;
if (subscriberId !== null) {
const t = typeof subscriberId;
if (t === 'string' || t === 'number' || t === 'bigint' || t === 'boolean') {
throw new ERR_INVALID_ARG_TYPE('subscriberId', ['object', 'symbol', 'function'], subscriberId);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const subscriberId = options && options.subscriberId !== undefined ? options.subscriberId : null;
if (subscriberId !== null) {
const t = typeof subscriberId;
if (t === 'string' || t === 'number' || t === 'bigint' || t === 'boolean') {
throw new ERR_INVALID_ARG_TYPE('subscriberId', ['object', 'symbol', 'function'], subscriberId);
const subscriberId = options?.subscriberId;
if (subscriberId !== undefined) {
if (typeof subscriberId !== 'object' &&
typeof subscriberId !== 'symbol' &&
subscriberId !== null) {
throw new ERR_INVALID_ARG_TYPE('subscriberId', ['object', 'symbol'], subscriberId);

const index = ArrayPrototypeIndexOf(this._subscribers, subscription);
// Find subscriber entry by handler identity.
let index = -1;
for (let i = 0; i < (this._subscribers?.length || 0); i++) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (let i = 0; i < (this._subscribers?.length || 0); i++) {
for (let i = 0; i < (this._subscribers.length); i++) {


publish(data) {
const subscribers = this._subscribers;
const activeKeys = getSuppressionsStorage().getStore();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not what I had in mind, since this just reimplements the logic we have in our APMs in Node.js core.

That in itself is a nicer spot because we do not need to reimplement the wheel, but it does not improve performance, which is a concern.

Instead, we can track if there are any bypassed / suppressed and only do the store lookup if we have bypassed ones.

  const keys = this.#subscriberKeys;
  const activeKeys = keys && getSuppressionsStorage().getStore();

if (this._index !== undefined) subscriberCounts[this._index]++;
}
this._stores.set(store, transform);
this._stores.set(store, { transform, subscriberId });
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather not switch to an object by default. This is not the default case, so we should optimize for the default instead which is the regular subscription without bypass / suppression.

We could use a separate lazy array. That way we only pay for the data when there are suppressed ones.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BridgeAR , ok i understood, I'll keep the default path unchanged
and use a separate lazy SafeMap for bypassed stores:

// Default path ( exactly as before, zero extra cost)
this._stores.set(store, transform);

// Bypass path ( separate structure, only allocated when needed)
if (!this._bypassedStores) this._bypassedStores = new SafeMap();
this._bypassedStores.set(store, { transform, bypassId });

Then in publish/runStores, only check _bypassedStores if it exists.
Does that match what you had in mind?

const subscriberId = options && options.subscriberId !== undefined ? options.subscriberId : null;
if (subscriberId !== null) {
const t = typeof subscriberId;
if (t === 'string' || t === 'number' || t === 'bigint' || t === 'boolean') {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is duplicating the validation in many spots, we can likely unify it. In the other comments I also suggested to drop function as type and to do a faster validation.

I wonder if we even want to allow symbols at all, if we want to allow objects. I personally would prefer a single data type over many.

Other opinions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BridgeAR , my preference would be Symbol only. Reasons:

  1. Symbols are guaranteed unique by the runtime ,so no accidental
    collision possible across vendors
  2. Objects work but require the caller to manage reference equality
    themselves ,->easy to accidentally create a new object each time
  3. A single allowed type makes the API simpler to document and
    reason about

const kMyTracer = Symbol('my-tracer') reads cleanly and is
idiomatic for this kind of identity token in JS.

Happy to go either way though what's your preference? and should we ask other member too?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The object would not be shared across vendors, since they create it in place. So that is not a risk I see.
  2. I think the risk of creating a new symbol each time is the same, no?
  3. Agreed, while I do not have a strong opinion on that

@isaacs would you go with objects only? And may I ask for not wanting symbols? I did not check, I guess objects could be faster, but I am unsure and I am curious about your reasoning.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so think i should wait for further clarification before making any changes....

Comment thread src/node_errors.cc Outdated
@DivyanshuX9
Copy link
Copy Markdown
Contributor Author

DivyanshuX9 commented May 31, 2026

Is this naming you want @BridgeAR ?

Old name New name
suppressed (function) bypass
subscriberId (option) bypassId
suppressionStorage bypassStorage
getSuppressionsStorage getBypassStorage

have a quick look so i would start changes

@DivyanshuX9
Copy link
Copy Markdown
Contributor Author

Also for the documentation i think we should open new issue/pr , is this fine @BridgeAR ?

@BridgeAR
Copy link
Copy Markdown
Member

@DivyanshuX9 yes, I like that naming more, while I think it would be agreed upon by @nodejs/diagnostics as a whole.

What about a short survey:

❤️ bypass
👍 suppress
🚀 ... TBD1 ...
🎉 ... TBD2 ...

In addition: we did not yet agree on the API. @bengl asked for #63623 (comment)

I do not think this would work as suggested, while the current callback API would (it is meant as drop in replacement for storage.run({ noop: true }, fn): suppressed(kMyTracer, fn). We could even introduce a using _ = suppressing(kMyTracer) as alternative to the callback (I would just not do that right now).

@DivyanshuX9
Copy link
Copy Markdown
Contributor Author

DivyanshuX9 commented May 31, 2026

@BridgeAR , i'll just voted ❤️ for bypass, agrees with your reasoning.

On the API shape , I implemented the callback style bypass(kMyTracer, fn) as a drop-in for storage.run({ noop: true }, fn) which I think is the cleaner approach.

Happy to wait for @bengl and @nodejs/diagnostics to weigh in before making further changes to the API shape.

Should I hold off on the rename until the vote settles, or go ahead with bypass now since that seems to be the leading preference?

Happy to rename again whenever the vote settles although I've gotten good at find-and-replace at this point 😄

(Note: currently still named suppressed/subscriberId in the code will rename to whatever the team settles on from the vote)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c++ Issues and PRs that require attention from people who are familiar with C++. diagnostics_channel Issues and PRs related to diagnostics channel needs-ci PRs that need a full CI run. semver-minor PRs that contain new features and should be released in the next minor version.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants