-
Notifications
You must be signed in to change notification settings - Fork 303
Composite cv chaining #11540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Composite cv chaining #11540
Conversation
|
After a meeting with Adam who owns the related Dynflow implementation, we have these results:
|
47c7477 to
973cb03
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Consider extracting the large
auto_publish_composites!method into a dedicated service or command object to simplify the model and make the chaining logic easier to maintain and test. - The current approach to detecting scheduled and running tasks pulls full task sets into Ruby and filters them—consider pushing the composite CV filtering down into the database query (e.g. more restrictive WHERE clauses) for better performance.
- In the Sequel adapter patch, switch from embedding a raw NOT EXISTS SQL literal to using Sequel’s built-in DSL for EXISTS/NOT EXISTS to improve readability and guard against quoting issues.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider extracting the large `auto_publish_composites!` method into a dedicated service or command object to simplify the model and make the chaining logic easier to maintain and test.
- The current approach to detecting scheduled and running tasks pulls full task sets into Ruby and filters them—consider pushing the composite CV filtering down into the database query (e.g. more restrictive WHERE clauses) for better performance.
- In the Sequel adapter patch, switch from embedding a raw NOT EXISTS SQL literal to using Sequel’s built-in DSL for EXISTS/NOT EXISTS to improve readability and guard against quoting issues.
## Individual Comments
### Comment 1
<location> `app/models/katello/content_view_version.rb:427-433` </location>
<code_context>
+ # @param task [ForemanTasks::Task] The task to check
+ # @param composite_cv [Katello::ContentView] The composite content view
+ # @return [Boolean] true if task is for this composite CV
+ def scheduled_task_for_composite?(task, composite_cv)
+ delayed_plan = ForemanTasks.dynflow.world.persistence.load_delayed_plan(task.external_id)
+ args = delayed_plan.args
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Rescue block in scheduled_task_for_composite? may mask unexpected errors.
Rescuing StandardError will cause all errors to return false, potentially hiding unrelated issues. Rescue only the exceptions you expect to handle.
```suggestion
def scheduled_task_for_composite?(task, composite_cv)
begin
delayed_plan = ForemanTasks.dynflow.world.persistence.load_delayed_plan(task.external_id)
args = delayed_plan.args
args.first.is_a?(::Katello::ContentView) && args.first.id == composite_cv.id
rescue ForemanTasks::Dynflow::PersistenceError, NoMethodError
false
end
end
```
</issue_to_address>
### Comment 2
<location> `app/models/katello/content_view_version.rb:439` </location>
<code_context>
+ # @param composite_cv [Katello::ContentView] The composite content view
+ # @param sibling_task_ids [Array<String>] Task IDs to wait for
+ # @param description [String] Description for the publish task
+ def trigger_composite_publish(composite_cv, sibling_task_ids, description)
+ if sibling_task_ids.any?
+ trigger_chained_composite_publish(composite_cv, sibling_task_ids, description)
</code_context>
<issue_to_address>
**suggestion:** Consider logging the exception object for StandardError.
Logging the full exception, including its backtrace, will provide more context for troubleshooting.
</issue_to_address>
### Comment 3
<location> `test/models/events/auto_publish_composite_view_test.rb:33` </location>
<code_context>
end
def test_run_with_lock_error
+ metadata = { description: "Auto Publish - Test", version_id: component_version.id }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Suggestion to add a test for generic error handling in event run method.
Please add a test that raises a generic exception in the event run method to confirm proper error logging and handling.
Suggested implementation:
```ruby
def test_run_with_generic_error
metadata = { description: "Auto Publish - Test", version_id: component_version.id }
event = AutoPublishCompositeView.new(composite_view.id) do |instance|
instance.metadata = metadata
end
# Stub a method called during run to raise a generic error
event.stubs(:publish).raises(StandardError.new("Generic error"))
# Expect error to be logged, not raised
Rails.logger.expects(:error).with(regexp_matches(/Generic error/))
assert_nothing_raised do
event.run
end
end
def test_run_with_publish
metadata = { description: "Auto Publish - Test", version_id: component_version.id }
```
- If the `publish` method is not the one called during `run`, replace it with the correct method that should be stubbed to raise the error.
- Ensure that the event's `run` method actually logs errors using `Rails.logger.error`. If it does not, you may need to add error logging in the implementation.
- If you use a different logger or error handling mechanism, adjust the expectation accordingly.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| # A composite publish is currently running - we need to schedule another one | ||
| # after it finishes to include this new component version | ||
| Rails.logger.info("Composite CV #{composite_cv.name} publish running, scheduling event for retry") | ||
| schedule_auto_publish_event(composite_cv, description) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to follow the code path..Do we need this event if we are triggering composite publish right after this case block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a nil case here that runs trigger_composite_publish. We don't need to trigger another composite publish. In the case that there is one running, trigger_composite_publish will fail on the lock error, so we can just skip it.
qcjames53
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still wrapping my head around the chaining but I wanted to throw my hat in for this scenario: What happens if a user publishes components A and B in a CCV, and A finishes, but B fails. It seems like the CCV won't be updated with A's new content like in the old system. Wouldn't this leave us in a broken state with an outdated CCV?
| end | ||
|
|
||
| # Schedule an event to retry composite publish after current one finishes | ||
| # @param composite_cv [Katello::ContentView] The composite content view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeremylenz used to yell at me for putting these @param comments in my PRs. Is this acceptable now? I'm going to do it again if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on the fence, I like seeing info, but it stands out since not all methods will have em. I certainly wouldn't add them if I had written it instead of Claude. I'll probably remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like Claude is using https://yardoc.org/ formatting. Anyway, I don't think we need it since we don't use YARD.
53b2f26 to
746c701
Compare
See this comment here: Dynflow/dynflow#446 (comment) "(TODO) Error handling so that a chained task is removed from the schedule if a dependency fails (optional, but on by default)" If B fails, we don't want the composite to publish, the user will need to go in and investigate the situation. I think an outdated CCV in that case is a good thing - it's better than the publish happening and potentially including garbage from the failed B. What do you think? |
746c701 to
0dd3524
Compare
That makes sense to me. In fact it's the entire point of this PR. Is there a secret third option where after all child components either pass or fail, we publish only the working components? Or would that require reworking the CCV way too much? I'm imagining a list of auto-updated components "to_be_published" and a list of "published_successfully" and once the former list empties itself it includes everything on list 2 for the autopublish? Or is this dumb? |
|
Oh also I want to confirm this only updates components set to auto-publish. If there's a newer version of C available, will A and B updating also cause C to update? This was an issue in a bug earlier this year. |
That might be a bit complicated since the composite publish is already scheduled by this point - its dynflow inputs would need to be edited, or the scheduled task would need to be replaced by a newer one. I think it's possible though & worthy of an incremental improvement on this feature. Dynflow could potentially add a feature where the scheduled task gets 'hot swapped' with a new definition. I think we discussed this in a meeting in fact with @adamruzicka and @sjha4 . We'd want to consider the UX impacts of allowing these failures to be more transparent versus halting the entire process to have the user go in and inspect.
This feature does indeed only affect auto-publish items where, on the composite, the content view is set to be always at the latest version. Worth a test to double check though! |
0dd3524 to
545590b
Compare
48386bd to
c0a9243
Compare
|
Here is a hackmd with more information about how this change works: https://hackmd.io/Ab6mGkicTVKURKYlL5fwjw?both |
fe6acea to
42da244
Compare
|
|
||
| args = delayed_plan.args | ||
| args.first.is_a?(::Katello::ContentView) && args.first.id == composite_cv.id | ||
| rescue NoMethodError, TypeError, Dynflow::Error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| rescue NoMethodError, TypeError, Dynflow::Error | |
| rescue NoMethodError, TypeError, Dynflow::Error => e |
* All composite publishes now go through event system (not just :running case) * Event handler performs coordination and chaining * Pass calling_task_id through event metadata for sibling exclusion
42da244 to
4799c95
Compare
4799c95 to
4489865
Compare
Note: this PR contains some code generated by Claude Code.
Note: this PR requires theforeman/foreman-tasks#786 and Dynflow/dynflow#446.
Note, as of writing, this PR requires the following patch on top of the Dynflow one:
Note: everything in config/initializers/foreman_tasks_chaining.rb should actually exist in ForemanTasks. For this prototype I'm defining it here to make the test setup easier.
What are the changes introduced in this pull request?
Implements Dynflow task chaining to improve composite content view auto-publish.
Imagine a composite content view (CCV) with 3 components, A, B, and C.
If A and B are publishing at the same time, and A finishes first, CCV will start publishing while B is still publishing. This is bad because version B.latest exists but is partially-created. So, this PR 'chains' the CCV auto-publish to any publishing children.
There is yet another problem: even if the CCV publish is chained to A and B publishing, two CCV publishes will be triggered because auto-publish is triggered for each child. This doesn't make sense when A and B are publishing together though - there should only be one publish - any more are extra. So, this PR also causes composite publishes to be skipped if the composite has a scheduled publish already waiting.
If a CCV publish is running and then A publishing triggers another CCV publish, then the existing code will:
Considerations taken when implementing this change?
The 'when to skip vs when to use polling' is up for debate. At the moment I don't see any chances for content to be missed, but please consider it. Could auto-publish be skipped when it shouldn't have been?
What are the testing steps for this pull request?
a. Don't forget to set the CCV to always use the latest version of the components
Summary by Sourcery
Implement Dynflow-based task chaining for composite content view auto-publishing, preventing race conditions by skipping duplicates, scheduling retries, and ensuring a single coordinated publish per composite.
New Features:
Enhancements:
Tests:
Chores:
Summary by Sourcery
Add Dynflow task chaining to composite content view auto-publish to coordinate component publishes, avoid race conditions, and eliminate duplicate publishes. Detect existing scheduled or running composite publishes to skip or reschedule accordingly. Propagate the triggering component version ID through the publish action and event metadata. Include comprehensive tests covering immediate vs chained triggers, skip/schedule logic, lock conflicts, and task discovery.
New Features:
Enhancements:
Tests: