Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clusterimpl: update picker synchronously on config update #7652

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

aranjans
Copy link
Contributor

#5469 recommends an audit of existing LB policies to ensure that they update their pickers synchronously upon receipt of a configuration update. clusterimpl does not update picker synchronously before because of multiple reasons, one is on reciept of configuration update, process of switching the child policies was aysnchronous, and second is it was not waiting for the child policy config to be update first, before returning from UpdateClientConnState.

What does this PR do?
This PR ensures that every time configuration update is triggered, picker is updated synchronously.

RELEASE NOTES:

  • clusterimpl: update picker synchronously on receipt of configuration update.

@aranjans aranjans added Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. Type: Bug labels Sep 20, 2024
@aranjans aranjans added this to the 1.68 Release milestone Sep 20, 2024
Copy link

codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 86.36364% with 3 lines in your changes missing coverage. Please review.

Project coverage is 81.83%. Comparing base (67e47fc) to head (64521e0).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
xds/internal/balancer/clusterimpl/clusterimpl.go 86.36% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7652      +/-   ##
==========================================
- Coverage   81.86%   81.83%   -0.04%     
==========================================
  Files         361      361              
  Lines       27822    27834      +12     
==========================================
+ Hits        22777    22778       +1     
- Misses       3849     3856       +7     
- Partials     1196     1200       +4     
Files with missing lines Coverage Δ
xds/internal/balancer/clusterimpl/clusterimpl.go 89.77% <86.36%> (-1.78%) ⬇️

... and 16 files with indirect coverage changes

@aranjans aranjans marked this pull request as ready for review September 20, 2024 05:33
@aranjans aranjans assigned easwars and arjan-bal and unassigned easwars Sep 20, 2024
requestCountMax: b.requestCountMax,
}),
})
if !b.inhibitPickerUpdates {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can invert this condition to return early, reducing one level of indentation.

if b.inhibitPickerUpdates {
    return
}
// update picker.

Comment on lines 257 to 258
b.inhibitPickerUpdates = true
defer func() { b.inhibitPickerUpdates = false }()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both update state and UpdateClientConnState are using the same serializer. So when the updateClientConnState callback is being executed in the serializer, not other callback (including the one scheduled by UpdateState) can be executed until updateClientConnState has returned. So inhibiting updates seems redundant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arjan is right. Since the same serializer is used to handle updates from the parent and handle updates from the child, we will not be able to handle updates from the child until we return from updateClientConnState. This means that we need to refactor things out a little bit.

The childState field is the only shared field that is accessed from UpdateState. So, I think we might have to do the following:

  • add a mutex that guards access to childState and inhibitPickerUpdates, and change existing methods to grab this mutex when they need to access these fields
  • do not run UpdateState in the serializer, run it inline

Let me know what you think. Please don't take my word as is. Please ensure that the above approach would actually work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @easwars for your thoughts. I agree that currently since UpdateState, UpdateClientConnState and few other methods are using the same serializer, it will be executed one after the other. And hence child config can't be handled before updateClientConnState.
I agree that we could make the UpdateState asynchronous by running it inline and not use serializer for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you haven't assigned the PR back to me, I'm assuming that you are still working on it. I skimmed through it and it was still not going what I expect it to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I was still making changes, hence i didn't assigned back to you.

@arjan-bal arjan-bal removed their assignment Sep 24, 2024
@easwars easwars assigned aranjans and unassigned easwars Oct 2, 2024
@@ -102,6 +105,12 @@ type clusterImplBalancer struct {
lrsServer *bootstrap.ServerConfig
loadWrapper *loadstore.Wrapper

// Set during UpdateClientConnState when pushing updates to child policies.
// Prevents state updates from child policies causing new pickers to be sent
// up the channel. Cleared after all child policies have processed the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clusterimpl has a single child if I'm not mistaken?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. Type: Bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants