Add a forceIgnoring mechanism and apply it to the plugins (Spring Cloud Gateway, HttpClient, JDK-ForkJoinPool, JDK-Threading, JDK-ThreadPool, Toolkit-Trace, Toolkit-WebFlux) #689
Please write a proposal explaining what a force-ignoring mechanism is. Our earlier discussion covered only the context.
Do you have a proposal template?
For an official proposal, you could refer to SWIP (https://skywalking.apache.org/docs/main/next/en/swip/readme/), which is mandatory for OAP but not for the agent. A reminder: the way you made these changes has a significant impact on existing APIs, which may not be a good approach.
Added.
Added what?
Here: #689 (comment)
I think we need to separate the contexts. Cross-thread is different from cross-process. For cross-thread, the status should only be documented in the snapshot. But I am not clear about how you changed a sampled tracing context into an ignoring context.
I feel you are only doing a hijack, which should not be the official way.
This question can be divided into two parts:
|
I don't have a direct answer; if I had, I would have told you in the last discussion. The reason it doesn't exist is that the tracing context is created before the snapshot is continued. Your proposal should cover how to handle that. Cross-process is more complex, and sampling is service (process) oriented. There is a flag in the protocol about sampling, but you have to evaluate the agent code in all languages to see whether there will be an impact.
You are adding a mechanism for local span creation. This should not happen; if a context is switched out, it must be controlled by ContextManager.
Since this is about ContextManager, it should not affect local span creation in TracingContext. Only the snapshot should change, with ContextManager asking the TracingContext to switch to an IgnoringTracingContext with the same stack. Also, the plugins should not be changed, and none of the span and context manager APIs should be changed.
If the TracingContext switches to IgnoringTracingContext in the continued method, the span reference created by the ContextManager#createLocalSpan method cannot be changed, and the plugin still uses the span from before the switch.
I think you got it wrong. It should be in the ContextManager#continued method, which controls the context.
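To make the suggestion above concrete, here is a minimal sketch of a context manager that swaps the thread's context inside `continued` when the snapshot says the parent was ignored. Every type here (`ToyContext`, `ToyTracingContext`, `ToyIgnoredContext`, `ToySnapshot`, `ToyContextManager`) is a hypothetical stand-in written for illustration; none of them is the real agent class, and the real `ContextManager` may behave differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified stand-ins; NOT the real SkyWalking agent classes.
interface ToyContext {
    String createLocalSpan(String operationName);
    int depth();
}

// Records every span it creates.
final class ToyTracingContext implements ToyContext {
    private final Deque<String> activeSpans = new ArrayDeque<>();

    public String createLocalSpan(String operationName) {
        activeSpans.push(operationName);
        return operationName;
    }

    public int depth() {
        return activeSpans.size();
    }
}

// Hands out a shared no-op marker and only counts depth.
final class ToyIgnoredContext implements ToyContext {
    private int depth;

    ToyIgnoredContext(int initialDepth) {
        this.depth = initialDepth;
    }

    public String createLocalSpan(String operationName) {
        depth++;
        return "noop";
    }

    public int depth() {
        return depth;
    }
}

// Assumption: the snapshot carries a flag saying the parent trace was ignored.
final class ToySnapshot {
    final boolean ignored;

    ToySnapshot(boolean ignored) {
        this.ignored = ignored;
    }
}

final class ToyContextManager {
    private static final ThreadLocal<ToyContext> CONTEXT = new ThreadLocal<>();

    static String createLocalSpan(String operationName) {
        ToyContext ctx = CONTEXT.get();
        if (ctx == null) {
            ctx = new ToyTracingContext();
            CONTEXT.set(ctx);
        }
        return ctx.createLocalSpan(operationName);
    }

    // The switch happens here, so plugins and span APIs stay untouched.
    static void continued(ToySnapshot snapshot) {
        ToyContext current = CONTEXT.get();
        if (snapshot.ignored && current instanceof ToyTracingContext) {
            // Keep the same stack depth so later stop calls still balance.
            CONTEXT.set(new ToyIgnoredContext(current.depth()));
        }
    }

    static ToyContext current() {
        return CONTEXT.get();
    }

    static void reset() {
        CONTEXT.remove();
    }
}
```

In this toy model, a span created before `continued` is still the one the old context handed out, while spans created afterwards are no-ops. That matches the concern raised in the thread about span references held by plugins.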
This is forced sampling (unsampling), so we don't need to worry about the sampling counter.
Understood. How can the span used by the plugin be changed to a NoopSpan?
ForceSampling might not necessarily be true. I feel that it could still impact the count of the SamplingService.
It just counts one more, so what? Sampling is never meant to be accurate.
Consider creating
IgnoredTracerContext retains the activeSpanStack of TracingContext. Do you think there's no problem with that?
If there is no problem, the 'continued' method can indeed complete the transition of the context. However, for the leftover spans, I am not entirely sure whether they might cause other issues.
Which problem do you mean? No matter what spans they are, as they are not reported, they are fine to be GCed.
I have made the changes in a new way and submitted another PR. Can you check if there are any problems?
Please update here.
@@ -34,14 +34,21 @@ public class IgnoredTracerContext implements AbstractTracerContext {
    private static final NoopSpan NOOP_SPAN = new NoopSpan();
    private static final String IGNORE_TRACE = "Ignored_Trace";

    private LinkedList<AbstractSpan> activeSpanStack;
Why is this needed?
Because the old span is still in use, if it's discarded, span#prepareForAsync will throw an error.
This is not GC-friendly. We could add an ignored flag to the span and skip this check.
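A rough sketch of the "ignored flag" idea follows. The check being skipped is modeled here as "the span must still be on its owning context's active stack", which is an assumption made for illustration; `SketchSpan` and `SketchContext` are hypothetical names, not the agent's real span or context classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a tracing context that tracks its active spans.
final class SketchContext {
    final Deque<SketchSpan> activeSpans = new ArrayDeque<>();
}

// Hypothetical stand-in for a span; the real check in the agent may differ.
final class SketchSpan {
    private final SketchContext owner;
    private boolean ignored;
    private boolean asyncMode;

    SketchSpan(SketchContext owner) {
        this.owner = owner;
        owner.activeSpans.push(this);
    }

    // Called when the owning context has been switched to an ignored one.
    void markIgnored() {
        this.ignored = true;
    }

    SketchSpan prepareForAsync() {
        if (ignored) {
            // An ignored span is never reported, so there is nothing to validate.
            return this;
        }
        if (!owner.activeSpans.contains(this)) {
            throw new IllegalStateException("span is no longer active in its context");
        }
        asyncMode = true;
        return this;
    }

    boolean isAsyncMode() {
        return asyncMode;
    }
}
```

With a flag like this, the ignored context would not need to hold the old span stack just to keep the check from failing, which is the GC concern raised above.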
Another option may be to create a ForceIgnoreTracerContext to deal with cases where spans have already been created.
This only happens in very rare cases, so we should process it only after checking all active spans and finding one running in async mode.
Do you prefer replacing the span's context with IgnoredTracerContext for old spans with old context references, or would you rather ignore any operations involving context within the span?
I didn't dig in that deeply, as I feel this change is not easy.
I just provided two possibilities, with no preference. One key point: what you are adding does not happen all the time, so it should not make the agent kernel cost too much.
I don't know whether this is possible.
That is why I said this could be done easily if it only happened in your private fork. In general, though, I don't have a good idea for how to do it easily.
But it has to be very good, as your changes are going to affect every user for sure.
}

public IgnoredTracerContext(LinkedList<AbstractSpan> activeSpanStack) {
    this.activeSpanStack = activeSpanStack;
We only need the depth to be initialized based on stack depth, rather than holding all
this.activeSpanStack = new LinkedList<>();
this.correlationContext = new CorrelationContext();
this.extensionContext = new ExtensionContext();
this.profileStatusContext = ProfileStatusContext.createWithNone();
You just need to call the new constructor.
stackDepth++;
activeSpanStack.addLast(NOOP_SPAN);
All logic here should not be changed.
We just need depth, no further.
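The "depth only" suggestion could look roughly like the sketch below: the ignored context is seeded with the depth of the stack it replaces and then only counts, so it can tell when the last span has stopped without referencing any span object. `DepthOnlyIgnoredContext` and its method names are hypothetical and do not mirror the real `IgnoredTracerContext` API.

```java
// Hypothetical sketch; NOT the real IgnoredTracerContext.
final class DepthOnlyIgnoredContext {
    private int stackDepth;

    // inheritedDepth is the size of the active span stack being replaced.
    DepthOnlyIgnoredContext(int inheritedDepth) {
        if (inheritedDepth < 0) {
            throw new IllegalArgumentException("depth must be non-negative");
        }
        this.stackDepth = inheritedDepth;
    }

    // Creating a span only bumps the counter; no span object is retained.
    void createSpan() {
        stackDepth++;
    }

    // Returns true when the outermost span has stopped and the context is finished.
    boolean stopSpan() {
        stackDepth--;
        return stackDepth == 0;
    }

    int depth() {
        return stackDepth;
    }
}
```

Because nothing but an `int` is carried over, the spans left behind by the previous context are free to be collected.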
...pm-agent-core/src/main/java/org/apache/skywalking/apm/agent/core/context/ContextManager.java
...t-core/src/main/java/org/apache/skywalking/apm/agent/core/context/AbstractTracerContext.java
...re/src/main/java/org/apache/skywalking/apm/agent/core/context/trace/AbstractTracingSpan.java
It seems the UTs are failing. Please recheck and fix.
Please follow the comments to polish the code. Generally, this PR is good.
polish the codes. Co-authored-by: 吴晟 Wu Sheng <[email protected]>
Thanks for adding this.
Add a forceIgnoring mechanism and apply it to the plugins (Spring Cloud Gateway, HttpClient, JDK-ForkJoinPool, JDK-Threading, JDK-ThreadPool, Toolkit-Trace, Toolkit-WebFlux)
CHANGES
log.proposal
Motivation
context: apache/skywalking#12161.
Generally speaking, if the source Span is ignored, the downstream Span also needs to be ignored, such as in same-thread operations and cross-process operations. However, for Spans created by cross-thread operations, such as in the scenario of Spring Cloud Gateway, there is currently no mechanism to make them ignored along with the parent Span.
The force-ignoring mechanism is a supplement to the ignoring mechanism for cross-thread operations.
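The cross-thread gap can be illustrated with plain JDK primitives. In the sketch below, a `ThreadLocal` flag stands in for the thread-bound tracing context (an assumption for illustration, not how the agent stores its state), and a captured boolean stands in for a snapshot that records the ignored status.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CrossThreadSketch {
    // Hypothetical per-thread marker standing in for the thread-bound context.
    static final ThreadLocal<Boolean> IGNORED = ThreadLocal.withInitial(() -> false);

    // Runs a task on a fresh single-thread pool and returns its result.
    private static boolean runOnWorker(Callable<Boolean> task) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Without a snapshot, the worker thread cannot see the parent's ignored state.
    static boolean runWithoutSnapshot() {
        IGNORED.set(true);
        try {
            return runOnWorker(() -> IGNORED.get());
        } finally {
            IGNORED.remove();
        }
    }

    // With a snapshot, the state is captured on the parent thread and restored on the worker.
    static boolean runWithSnapshot() {
        IGNORED.set(true);
        try {
            final boolean snapshotIgnored = IGNORED.get();
            return runOnWorker(() -> {
                IGNORED.set(snapshotIgnored);
                try {
                    return IGNORED.get();
                } finally {
                    IGNORED.remove();
                }
            });
        } finally {
            IGNORED.remove();
        }
    }
}
```

Here `runWithoutSnapshot()` reports that the worker sees the default (not ignored) state, while `runWithSnapshot()` reports the parent's state, which is the behavior the force-ignoring mechanism aims to provide for spans continued on another thread.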
Architecture Graph
Proposed Changes
The change can only be made within the ContextManager#createLocalSpan method, not within the ContextManager#continued method, because the latter cannot change a Context and Span that have already been created.
I haven't found a way to support this feature without disrupting the existing API, because the tracing context is created before the snapshot is continued.
Currently modified plugins are:
httpasyncclient-4.x
httpclient-5.x
jdk-forkjoinpool
jdk-threading
jdk-threadpool
spring-cloud-gateway-2.0.x
spring-cloud-gateway-2.1.x
spring-cloud-gateway-3.x
spring-cloud-gateway-4.x
toolkit-trace-activation
toolkit-webflux-activation
Imported Dependencies libs and their licenses.
No new dependency.
Compatibility
In scenarios where it is necessary to keep the span node of the entire trace intact, such as in Spring Cloud Gateway, the call to the ContextManager#continued method needs to be replaced with the ContextManager#createLocalSpan method that includes a snapshot.
General usage docs