-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(pam/integration-tests): Improve reliability of PAM integration tests and actually run them in race mode #560
Conversation
d14c1b9
to
5c1feec
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #560 +/- ##
==========================================
- Coverage 84.79% 84.72% -0.07%
==========================================
Files 79 79
Lines 6963 6987 +24
Branches 75 75
==========================================
+ Hits 5904 5920 +16
- Misses 739 746 +7
- Partials 320 321 +1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big one, although the review is quite small. I have some nitpicks, but nothing blocking so I'll leave my approval already. Nice work! Hopefully our tests get more stable now!
It's so frustrating to see how much we need to work around because we don't have the "wait for output" feature in vhs yet, but well... It's not something we can control, right? 🥲
5c1feec
to
9ef77b8
Compare
when I run it without setting
I then tried running it with I'll now try with |
9ef77b8
to
f95eae2
Compare
Do you have the generated golden file from that log? It should be in |
It failed:
No, I don't have the golden files from that run. It's not in |
Here is another error from running without
Full test logs: gotest.log |
Ok the |
Not our bug: vhs (or the called chrome/chromium) doesn't seem to start, so not something we can fix. |
Another failure:
Full test logs: gotest.log |
@adombeck looks like another vhs upstream issue, please report it there. |
1f884b2
to
db2664e
Compare
@adombeck, @denisonbarbosa: If there are no further blockers I'd proceed merging this (later tomorrow/today) so that I'm not blocked in further cleanups branch I've + other tests. This seems more stable in CI and locally to me, but if there further failures I'm happy to analyze them (in general they require just some extra sleeps here and there, sadly) |
We repeat the same for all the files and it's hard to define specific values, so avoid repeating the same for each file but instead generate the contents programmatically so that it can be changed per test. This generalization makes adding new tests easier and will permit to change the tape data before running it.
We were doing this manually anyways, so just do it forever so that we don't forget when the function is re-used
Depending on the testing conditions we may need to change the sleeping time, since VHS doesn't yet support a wait-for feature, it's just better to make possible to define the sleep programmatically so that we can change it at once without having to refactor all the tape files
We were missing the last line
We were not waiting enough to show the view again, causing the result arriving too early and not being printed in a new line as expected
This makes the test more reliable when we are in a slower context
… change We need the broker to do some work so let's wait more
…izer Instead of relying on cgo, we can just get from go if asan support is enabled, so rely on the build parameter instead of calling the C code. This can't be done everywhere though, as the value isn't set when running the gdm model tests and so we still need to keep around pam_test.IsAddressSanitizerActive() to be used there
If the thread sanitizer is enabled we may take different paths, so enable the detection of it as we do with ASAN
We are running integration tests in race mode, but we don't actually run the binaries compiled with thread sanitizer in such case since it was not detected. Fix this by adding the -race flag if the tests are running in such mode
The integration tests relying on sleeping, and so they become quite unreliable when the environment is not optimal. To prevent unexpected failures, make possible to define a sleep multiplier via an environment variable and define some default values depending whether we're building with address or thread sanitizers, since these builds are slower
Tests may take longer time, so avoid issues in slower builders
While -dwarflocationlists is enabled by default, we may want to disable further optimizations in the PAM cflags build
If we fail, show the log and then exit
…own in tests We have various MFA based tests where we wait for a new method selection for being shown, but we used too long sleeps in the tape files compared to the one of the broker, so some stages were not reflected by the golden files. Increase the sleep time on the broker side so that the client has time to show the items
We are always accessing to the example users map to check if the user is tracked, so just lock the map all the times and not just on specific cases. Fixes the race: WARNING: DATA RACE Read at 0x00c000253680 by goroutine 335: runtime.mapaccess2_faststr() /opt/hostedtoolcache/go/1.23.0/x64/src/runtime/map_faststr.go:117 +0x0 github.com/ubuntu/authd/examplebroker.(*Broker).handleIsAuthenticated() /home/runner/work/authd/authd/examplebroker/broker.go:679 +0x1cbd github.com/ubuntu/authd/examplebroker.(*Broker).IsAuthenticated() /home/runner/work/authd/authd/examplebroker/broker.go:530 +0x412 github.com/ubuntu/authd/examplebroker.(*Bus).IsAuthenticated() /home/runner/work/authd/authd/examplebroker/dbus.go:106 +0x91 runtime.call64() /opt/hostedtoolcache/go/1.23.0/x64/src/runtime/asm_amd64.s:777 +0x42 reflect.Value.Call() /opt/hostedtoolcache/go/1.23.0/x64/src/reflect/value.go:365 +0xb5 github.com/godbus/dbus/v5.exportedMethod.Call() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/default_handler.go:128 +0x239 github.com/godbus/dbus/v5.(*exportedMethod).Call() <autogenerated>:1 +0x84 github.com/godbus/dbus/v5.(*Conn).handleCall() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/export.go:193 +0x6f2 github.com/godbus/dbus/v5.(*Conn).inWorker.gowrap1() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/conn.go:435 +0x44 Previous write at 0x00c000253680 by goroutine 350: runtime.mapassign_faststr() /opt/hostedtoolcache/go/1.23.0/x64/src/runtime/map_faststr.go:223 +0x0 github.com/ubuntu/authd/examplebroker.(*Broker).handleIsAuthenticated() /home/runner/work/authd/authd/examplebroker/broker.go:655 +0x128b github.com/ubuntu/authd/examplebroker.(*Broker).IsAuthenticated() /home/runner/work/authd/authd/examplebroker/broker.go:530 +0x412 github.com/ubuntu/authd/examplebroker.(*Bus).IsAuthenticated() /home/runner/work/authd/authd/examplebroker/dbus.go:106 +0x91 runtime.call64() /opt/hostedtoolcache/go/1.23.0/x64/src/runtime/asm_amd64.s:777 +0x42 reflect.Value.Call() /opt/hostedtoolcache/go/1.23.0/x64/src/reflect/value.go:365 +0xb5 github.com/godbus/dbus/v5.exportedMethod.Call() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/default_handler.go:128 +0x239 github.com/godbus/dbus/v5.(*exportedMethod).Call() <autogenerated>:1 +0x84 github.com/godbus/dbus/v5.(*Conn).handleCall() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/export.go:193 +0x6f2 github.com/godbus/dbus/v5.(*Conn).inWorker.gowrap1() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/conn.go:435 +0x44 Goroutine 335 (running) created at: github.com/godbus/dbus/v5.(*Conn).inWorker() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/conn.go:435 +0x46a github.com/godbus/dbus/v5.(*Conn).Auth.gowrap1() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/auth.go:118 +0x33 Goroutine 350 (finished) created at: github.com/godbus/dbus/v5.(*Conn).inWorker() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/conn.go:435 +0x46a github.com/godbus/dbus/v5.(*Conn).Auth.gowrap1() /home/runner/go/pkg/mod/github.com/godbus/dbus/[email protected]/auth.go:118 +0x33
Permit IsAuthenticated being called concurrently by reading the user info just once, instead of making subsequent calls being blocked waiting for the user info being read
… duration Add a sleep variable matching the default example broker sleep value so that we don't have to guess it in the tape files but it's clearly related
We are relying on local groups content in order to call gpasswd, but in case there are two concurrent writes, the gpasswd file content we're reading may not be updated, causing us to call --add or --remove twice for the same content. So use a mutex to avoid that we're trying to read from the filesystem before that the current write operation has been completed.
We were sleeping after typing and before actually submitting the new password, leading to potential failures. Do the other way around instead since it's after we hit Enter that we could have slowdowns
db2664e
to
89e8861
Compare
Since we rely a lot on sleeping on the integration tests, due to the fact we can't yet go with charmbracelet/vhs#257, we need to be able to tune this value depending on the context we're running the tests.
In general sanitizers slows down the runtime quite a bit (especially the thread one), so make possible to use dynamic sleep times values depending on variables that we adjust depending on the context we're running in.
Tests can now slowed up/down using
AUTHD_TESTS_SLEEP_MULTIPLIER
variable too (this can be used for helping local testing too, e.g. personally I can reliably run the tests faster in my machine - with no failures - usingAUTHD_TESTS_SLEEP_MULTIPLIER=0.3
; but also it helps to quickly "fix" slower machines).Tune the tape files so that we don't miss some changing contents (as in the MFA/QrCode tests) and make the example broker to be a bit lazier.
Then, I noticed that we were not actually running the integration tests in
-race
mode, so fix this. Indeed this implies slower tests, but at least now we're fully checking for races both the daemon and the client.This worked fine in various builds both in my fork and in a private repo fork I did where the builders are way slower than the public ones.
@adombeck please check if this improves the results in your local environment too, where you had failures
UDENG-4793