-
-
Notifications
You must be signed in to change notification settings - Fork 33.6k
gh-141976: Protect against non-progressing specializations in tracing JIT #141989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
gh-141976: Protect against non-progressing specializations in tracing JIT #141989
Conversation
|
This doesn't look correct to me. The tracing front-end should produce correct code. Working around it in the optimizer is a bad idea as it just hides a bug that we should be fixing. |
This is fixing a bug in the tracing front-end, not the optimizer. Please see that the function is We are not missing a guard. |
Not what you said here 😉 Maybe the tracer needs its own file. I'm still not convinced by this PR. It is probably right, but there a number of possible deopt/specialization possibilities and I'll like to see a full table/state machine to make sure we are handling all of them. |
|
These are the following states:
States 1 to 3 are safe, as the actual instruction executed is the right hand side. So the trace recorder recorded the correct instruction. State transition 4 is unsafe, as the actual instruction executed is _GENERIC, but since it did not unspecialize, it's recorded as _SPECIALIZED. This is problematic as it will cause unecessary contradictions in the optimizer. I'm quite confident this will also speed up some benchmarks that are unstable/highly polymorphic, as it was in the original design of the tracer recorder before we yanked out the counting. |
|
Transition 4 should be safe. We just end up recording the wrong specialization, so it will deopt immediately when the trace is executed. If it is unsafe, then there is some other bug in the trace recorder. |
markshannon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we better off just abandoning the trace in this case, and hoping it will be better behaved next time we optimize?
We should add stats for this, either way.
Yes it should now be safe after we fixed the optimizer. |
I do not have a test for this as it's too hard to test for. We require the following scenario to happen:
We don't have that many non-progressing specializations to begin with, so any test is a futile fight between the internal implementation details and the JIT.
I have verified this fixes the test case in the issue. I'm not using the test case in the issue as it's dependent on how much stack space we reserve and consume per stack chunk. If we ever change the stack chunks size, the test will break.
WITHIN_STACK_BOUNDS()inoptimize_uops#141976