-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Index content items asynchronously in the background to avoid blocking the current request. #18402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…g the current request.
| await context.WaitForHttpBackgroundJobsAsync(TestContext.Current.CancellationToken); | ||
|
|
||
| // Indexing may still be running, so we wait a bit to ensure the content item is indexed. | ||
| await Task.Delay(2000, TestContext.Current.CancellationToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gvkries can you think of a better way than delaying? This works locally but not here. guessing that the delay need to be much higher to pass the CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, our waiting methods should now be reliable. Both WaitForOutstandingDeferredTasksAsync and WaitForHttpBackgroundJobsAsync take the active count of operations into account and complete as quickly as possible.
One possible issue might be the call pattern you're using. It looks like you're starting an HTTP background job from within a deferred action, which could be causing a race condition with the wait calls.
It might be worth considering a refactor to rely solely on HTTP background tasks. This could simplify the flow and reduce the potential for timing-related issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change in the ContentIndexProfileHandler has to be executed in a deferred task to ensure we first save the changes before triggering background.
If the background executor does it what should ExecuteAtEndOrRequest, this won't be an issue :)
The background job helper should have one method called ExecuteAsync and the existing one. The existing one should not be executed until all the deferred tasks are completed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, and I think this should work as-is, it's not clear to me why waiting will fail. If a deferred task starts the HTTP job with ExecuteAfterEndOfRequestAsync, WaitForOutstandingDeferredTasksAsync should complete, but WaitForHttpBackgroundJobsAsync must wait until the indexing has been completed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll it should but it is not. hopfully the logging you are adding in the other PR will give us more clues on why it is not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done some initial debugging and it appears the issue isn't with the second item being added, but rather that the initial item is missing. The one with the display text "Some sort of blogpost in a Query!" is present, but the other expected item is not.
Additionally, I'm encountering frequent ObjectDisposedException errors from the writer used by LuceneIndexManager. This suggests a possible concurrency issue. I haven’t had time to investigate this in depth yet, but it might be worth looking into thread safety or lifecycle management around the writer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears the core issue isn't about waiting for background tasks to complete, but rather about concurrency in the indexing operations. Specifically, multiple HTTP background jobs are being triggered simultaneously:
RebuildIndexStepContentIndexProfileHandlerContentIndexInitializerService
Since these jobs run concurrently and without a guaranteed execution order, the Lucene index occasionally fails to include content items added by the recipe. The main culprit seems to be RebuildIndexStep, which resets and rebuilds the index in a way that may not be thread-safe:
await indexProfileManager.ResetAsync(index);
await indexProfileManager.UpdateAsync(index);
if (await indexManager.RebuildAsync(index))
{
await indexProfileManager.SynchronizeAsync(index);
}This sequence likely introduces race conditions, especially when other indexing tasks are running in parallel. Ensuring thread safety or introducing proper synchronization between these steps might help resolve the inconsistency.
No description provided.