-
Notifications
You must be signed in to change notification settings - Fork 422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test(performance): make tests more deterministic by relying more on system counts #5786
base: master
Are you sure you want to change the base?
Conversation
We also have a fetch (http/network calls) abstraction which could be useful in the future. |
@@ -28,17 +29,17 @@ export async function prepareRepoData( | |||
repoRootPaths: string[], | |||
workspaceFolders: CurrentWsFolders, | |||
telemetry: TelemetryHelper, | |||
span: Span<AmazonqCreateUpload> | |||
span: Span<AmazonqCreateUpload>, | |||
zip: AdmZip = new AdmZip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: AdmZip will be replaced by zip.js #4769
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I see. Do you recommend I remove the AdmZip call counting in the meantime or leave it until that other PR is done?
/runIntegrationTests |
Problem
Performance tests are currently flaky, making them a poor signal for performance regressions within the code base. Additionally, false alarms slow down development of unrelated features as test failures keep CI from being "green".
Therefore, rather than relying only on system usage thresholds for performance regressions, we can count the number of high-risk / potentially slow operations made by the code as a deterministic measure of its performance. However, directly counting each individual potentially slow operation used within the performance tests highly couples the test to the implementation details, making the tests less effective if the implementation changes.
Therefore, the goal of this PR is the following:
Solution
To meet goal (1), we increase the thresholds of the tests to decrease the changes of a false alarm.
To meet goal (2), we count expensive operations. But, to avoid tying it to the implementation details, we count the expensive operations using somewhat-loose upper bounds. Thus, implementation changes modifying the exact number of expensive operations by a small constant factor do not set a false alarm. However, if they increase the number of expensive operations by a multiplicative factor, the upper bound will alert us.
As an example, we don't want the test to fail if it makes 5-10 more file system calls when working with a few hundred files, but we do want the test to fail if it makes 2x the number of files system calls. Therefore, in the code the bounds are often described as "operations per file", since it is the multiplicative increase we are concerned about. This allows us to achieve goal (3).
Implementation Details
fs
module). Some other examples include the use ofzip
libraries or loading large files into memory.read
andwrite
bounds. This granularity allows us to assert that specific code paths do not modify any files.Open Question
License: I confirm that my contribution is made under the terms of the Apache 2.0 license.