Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Build::compile_objects on error and when used in multithreading #779

Closed
wants to merge 1 commit into from
Closed

Fix Build::compile_objects on error and when used in multithreading #779

wants to merge 1 commit into from

Conversation

NobodyXu
Copy link
Collaborator

@NobodyXu NobodyXu commented Jan 28, 2023

Before this commit, each call to Build::compile_objects release a raw
token regardless of whether there is a token acquired by this process.

As such, if Build::compile_objects is used in multithreading context,
then it would call Client::release_raw for each thread and thus
increasing the parallelism far beyond the limit.

This commit fixed the bug by not acquiring the implicit global token
when initializing global jobserver::Client and only obtain token when
compiling objects.

Signed-off-by: Jiahao XU [email protected]

@NobodyXu
Copy link
Collaborator Author

NobodyXu commented Jan 28, 2023

@thomcc Can we have this PR merged before the release #778?

I noticed this bug when working on adding multithreading to build.rs of ring, where enabling parallel feature of cc is not enough and I have to manually spawn thread to archive more parallelism, because ring compiles each source file separately.

@thomcc
Copy link
Member

thomcc commented Jan 28, 2023

This is complex enough that I'd like to have a bit more time to review. I just got some vaccines this weekend and am slightly under the weather, and would rather not r+ some unsafe code I haven't had time to think about (sorry!). I'll review over the week and try to cut a release next weekend if it seems good.

@NobodyXu
Copy link
Collaborator Author

Thanks @thomcc

@NobodyXu
Copy link
Collaborator Author

I noticed this bug when working on adding multithreading to build.rs of ring, where enabling parallel feature of cc is not enough and I have to manually spawn thread to archive more parallelism, because ring compiles each source file separately.

I've submit the PR to ring that will trigger the bug briansmith/ring#1591

@NobodyXu
Copy link
Collaborator Author

@thomcc Friendly ping as this PR and #780 has been stale for really long time.

@NobodyXu NobodyXu changed the title Fix Build::compile_objects when used in multithreading Fix Build::compile_objects on error and when used in multithreading May 14, 2023
@joshtriplett
Copy link
Member

Initial reaction: I don't think we should be using a static mut for this. Is there some structure by which we could make this a field of the relevant object instead?

@NobodyXu

This comment was marked as off-topic.

@NobodyXu
Copy link
Collaborator Author

Initial reaction: I don't think we should be using a static mut for this. Is there some structure by which we could make this a field of the relevant object instead?

I've fixed this by having an immutable global variable for token:

static TOKEN: Mutex<Option<jobserver::Acquired>> = Mutex::new(None);

the jobserver still uses mutable global variable as before, but that is due to cc having a msrv of 1.46 so we cannot use std::sync::OnceLock and cc also don't want to have once_cell as a dependency.

@NobodyXu
Copy link
Collaborator Author

I've updated the PR to only obtain jobserver token when actually compiling objects, so there is no implicit global token now.

Also, I've replaced use of Option with MaybeUninit and eliminate an unwrap() inside jobserver.

Comment on lines +1288 to +1297
match client.available() {
Ok(0) | Err(_) => {
// There is no available tokens.
// This is probably because `cc::Build::compile_objects` is called in
// `build.rs` where the `cargo` holds the token.
//
// In that case, we will release one raw token just to prevent deadlock,
// but this will increase the allowed global concurrency by one.
let _ = client.release_raw();
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshtriplett I implemented what you requested (only acquire and release token when actually compiling) but with a catch.

cargo would acquire one token before running the build.rs, so the available token could be 0 when we obtained the jobserver via jobserver::Client::from_env and causing deadlock.

To prevent this, I added this piece of code which release one token into the jobserver, but it cannot be re-acquired unless I added back the global state for the obtained token.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, the CI is still in deadlock.

Before this commit, each call to `Build::compile_objects` release a raw
token regardless of whether there is a token acquired by this process.

As such, if `Build::compile_objects` is used in multithreading context,
then it would call `Client::release_raw` for each thread and thus
increasing the parallelism far beyond the limit.

This commit fixed the bug by not acquiring the implicit global token
when initializing global `jobserver::Client` and only obtain token when
compiling objects.

It also replaced use of `Option<jobserver::Client>` with
`MaybeUninit<jobserver::Client>` and eliminate an `Option::unwrap()`
inside fn `jobserver`.

Signed-off-by: Jiahao XU <[email protected]>
@NobodyXu
Copy link
Collaborator Author

Hmmm given the deadlock caused by PR and the global state required to fix it, I suppose that it's better leave it as-is for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants