fabtests/rdm_atomic, ubertest: move atomic verification to common code and use in functional test #10155

Open
wants to merge 11 commits into
base: main

Commits on Oct 4, 2024

  1. fabtests/common: move ubertest atomic validation code to common

    This allows fabtests to make use of the atomic validation code.
    
    There were many Windows atomics bugs, inconsistencies, and missing
    definitions. This patch also cleans up the entire ofi_atomic.c
    implementation for Unix and Windows.
    
    The following changes are included:
    - Replace reads with memcpy, as setting complex values on Windows is not allowed
    - Remove CHECK_LOCAL and do memcmp instead to reduce code and calls
    - Remove duplicated ofi_complex definitions in ofi_atomic (already in the osd.h file)
    - Add general check_atomic and fill_atomic calls and use them in ubertest
    - Replace FT_FILL set with memcpy to be compatible with Windows complex types
    - Add EXPAND ( x) x define to work nicely with Windows VA_ARGS handling
    - Fix inconsistency with ofi_complex_type/or naming ('complex' should always come first)
    - Fix inconsistency with op names: "equ" and "mul" -> "eq" and "prod"
    - Add missing lxor complex op definitions on Windows
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    2041d60
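Two of the changes listed in this commit, the EXPAND define and the switch to memcpy/memcmp, can be illustrated with a minimal sketch. Every name below (SKETCH_EXPAND, SKETCH_COUNT, sketch_complex, sketch_fill, sketch_check) is hypothetical and not taken from the patch, and the struct is only a stand-in for a complex element; the actual fabtests definitions may differ.

```c
#include <stddef.h>
#include <string.h>

/* Force one extra macro expansion pass. MSVC's traditional preprocessor
 * can forward __VA_ARGS__ to a nested macro as a single argument unless
 * the nested call is wrapped like this. */
#define SKETCH_EXPAND(x) x

/* Select the 4th argument; used to count 1 to 3 variadic arguments. */
#define SKETCH_PICK4(a, b, c, n, ...) n
#define SKETCH_COUNT(...) \
	SKETCH_EXPAND(SKETCH_PICK4(__VA_ARGS__, 3, 2, 1, 0))

/* Hypothetical stand-in for a complex element (real and imaginary part). */
struct sketch_complex {
	float re;
	float im;
};

/* Fill a buffer by copying a whole element with memcpy instead of
 * assigning to the complex value directly. */
static void sketch_fill(struct sketch_complex *dst, size_t count,
			float re, float im)
{
	struct sketch_complex val = { re, im };
	size_t i;

	for (i = 0; i < count; i++)
		memcpy(&dst[i], &val, sizeof(val));
}

/* Validate with a single memcmp over the buffer; returns 0 on a match
 * and -1 on a mismatch. */
static int sketch_check(const void *expected, const void *actual, size_t len)
{
	return memcmp(expected, actual, len) ? -1 : 0;
}
```

With these definitions, SKETCH_COUNT(7, 8) expands to 2 on both GCC/Clang and MSVC, and sketch_check flags any element that differs between the expected and actual buffers.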
  2. fabtests/common: add hmem support to common atomic validation

    To properly validate atomic data, we need host bounce buffers for the
    result and compare buffers in addition to the regular bounce buffer for
    the tx/rx bufs.
    This adds two extra bufs allocated only for atomic purposes and adds hmem
    support to the common atomic validation path.
    It also renames the alloc/free_tx_buf calls to the generic alloc/free_host_bufs,
    which handle all three buffers at once.
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    3f9553b
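The three host bounce buffers described in this commit could be managed roughly as follows. This is a sketch under assumptions: the struct and function names are hypothetical, plain calloc stands in for whatever allocation fabtests really uses, and the device-to-host copies that would fill these buffers are omitted.

```c
#include <stdlib.h>

/* Hypothetical container for the host-side bounce buffers. */
struct sketch_host_bufs {
	void *buf;	/* bounce buffer for the tx/rx data */
	void *result;	/* bounce buffer for fetched results */
	void *compare;	/* bounce buffer for compare operands */
	size_t size;	/* size of each buffer in bytes */
};

/* Release all three buffers and reset the container. */
static void sketch_free_host_bufs(struct sketch_host_bufs *b)
{
	free(b->buf);
	free(b->result);
	free(b->compare);
	b->buf = NULL;
	b->result = NULL;
	b->compare = NULL;
	b->size = 0;
}

/* Allocate all three buffers at once; on any failure nothing is leaked.
 * Returns 0 on success and -1 on allocation failure. */
static int sketch_alloc_host_bufs(struct sketch_host_bufs *b, size_t size)
{
	b->size = 0;
	b->buf = calloc(1, size);
	b->result = calloc(1, size);
	b->compare = calloc(1, size);
	if (!b->buf || !b->result || !b->compare) {
		sketch_free_host_bufs(b);
		return -1;
	}
	b->size = size;
	return 0;
}
```

Allocating and freeing the buffers as a group keeps the validation path from having to track which of the three exist at any point.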
  3. fabtests/common: fix atomic buffer

    ft_post_atomic posted "buf" which is the base address for the entire
    send and recv buffer allocation. The first half of the allocation is the
    receive buffer and the second half is the send buffer. Posting just "buf"
    meant it was sending the receive buffer.
    This changes it to send the tx buf and perform the atomic on the rx buf, which allows
    us to properly do atomic validation.
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    4ecafac
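The buffer layout this commit message describes, one allocation whose first half is the receive buffer and whose second half is the send buffer, can be sketched as below. The names are hypothetical, and the real fabtests layout may include details such as prefixes or alignment that are left out here.

```c
#include <stddef.h>

/* Hypothetical view of one allocation split into two halves. */
struct sketch_layout {
	char *rx_buf;	/* first half: receive buffer, target of the atomic */
	char *tx_buf;	/* second half: send buffer, source of the atomic */
};

/* Derive both halves from the base address. Posting the base address
 * itself would post the receive half, which is the bug described above. */
static struct sketch_layout sketch_split(char *buf, size_t half_size)
{
	struct sketch_layout l;

	l.rx_buf = buf;
	l.tx_buf = buf + half_size;
	return l;
}
```

Keeping the source (tx_buf) and target (rx_buf) distinct is what lets a validator predict the target contents after the operation.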
  4. fabtests/common: change sync message to be 0 bytes instead of 1 byte

    This allows us to post the rx buf without corrupting memory in case it is needed
    for validation.
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    1e80404
  5. fabtests/hmem: change ZE memset to use uint8

    Match the behavior of memset() where the value passed in
    is an int, but it is interpreted as a char.
    While ZE can technically handle this scenario, others may not,
    so we need to standardize across ifaces.
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    e19219c
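The memset() behavior this commit matches is that the fill value is passed as an int but converted to unsigned char before it is written. A small sketch of a fill routine with the same semantics follows; the function names are hypothetical and no ZE calls are shown.

```c
#include <stdint.h>
#include <string.h>

/* Narrow the int fill value to one byte, as memset() does. */
static uint8_t sketch_fill_byte(int value)
{
	return (uint8_t) value;
}

/* Byte-wise fill that uses only the low 8 bits of value. */
static void sketch_memset(void *dst, int value, size_t len)
{
	uint8_t byte = sketch_fill_byte(value);
	uint8_t *p = dst;
	size_t i;

	for (i = 0; i < len; i++)
		p[i] = byte;
}
```

For example, a fill value of 0x1FF writes 0xFF into every byte, exactly as memset(dst, 0x1FF, len) would.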
  6. functional/rdm_atomic: add data validation

    Add data validation to the atomic test by using the newly added atomic fill and check
    support imported from ubertest. This code uses a macro that switches on datatype for
    filling and checking the buffer contents.
    
    The atomic validation path requires an extra buffer to copy the contents of the original
    atomic buffer in order to recreate the atomic function locally and check the buffer against
    the simulated atomic operation.
    
    This patch also refactors the entire test to remove the extremely confusing macros used
    for the base/fetch/compare operations. The macros made the code extremely difficult to
    read and debug and also made it difficult to add data validation. Separating it into
    three explicit functions is about the same amount of code and significantly more readable.
    
    Synchronization messages are added in the validation case to ensure the atomic operation
    completed on both sides before validation occurs. This requires the addition of the
    FI_ORDER_SAW and FI_ORDER_SAR message orderings to ensure that we get the completion for
    the send/recv sync after the atomic message is processed.
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    75bde84
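The validation approach described in this commit (keep a copy of the original target buffer, replay the atomic locally with a macro that switches on datatype, then compare) can be sketched as follows. This is an illustration only: the enum, macro, and function names are hypothetical, only a sum operation on three datatypes is shown, and the real fabtests fill and check code is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical datatype tags; the real test uses libfabric datatypes. */
enum sketch_datatype {
	SKETCH_INT32,
	SKETCH_UINT64,
	SKETCH_DOUBLE
};

/* Replay dst[i] += src[i] for one concrete C type. */
#define SKETCH_REPLAY_SUM(type, dst, src, cnt)			\
	do {							\
		type *d_ = (type *) (dst);			\
		const type *s_ = (const type *) (src);		\
		size_t i_;					\
		for (i_ = 0; i_ < (cnt); i_++)			\
			d_[i_] += s_[i_];			\
	} while (0)

/* Simulate the sum on orig_copy (a copy of the target taken before the
 * atomic), then compare it with the target's actual contents.
 * Returns 0 on a match and -1 on a mismatch or unknown datatype. */
static int sketch_check_sum(enum sketch_datatype dt, void *orig_copy,
			    const void *src, const void *actual, size_t cnt)
{
	size_t elem_size;

	switch (dt) {
	case SKETCH_INT32:
		SKETCH_REPLAY_SUM(int32_t, orig_copy, src, cnt);
		elem_size = sizeof(int32_t);
		break;
	case SKETCH_UINT64:
		SKETCH_REPLAY_SUM(uint64_t, orig_copy, src, cnt);
		elem_size = sizeof(uint64_t);
		break;
	case SKETCH_DOUBLE:
		SKETCH_REPLAY_SUM(double, orig_copy, src, cnt);
		elem_size = sizeof(double);
		break;
	default:
		return -1;
	}
	return memcmp(orig_copy, actual, cnt * elem_size) ? -1 : 0;
}
```

The extra copy is needed because the atomic overwrites the target in place, so the pre-operation values are otherwise gone by the time validation runs.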
  7. fabtests/runfabtests.sh: add rdm_atomic validation tests

    Run fi_rdm_atomic with data validation in standard and short test suites
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    c7111d3
  8. prov/psm3: disable complex comparison combinations

    Comparison of complex numbers is undefined and not a valid
    combination of atomic ops. Disable in psm3
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    e3b73f8
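Because complex numbers have no ordering, a provider can reject ordering-based ops for complex datatypes when it reports what it supports. The sketch below shows one way to express that check. The enums are hypothetical local stand-ins, and the exact set of ops psm3 disables is not listed in the commit message; treating equality-based swaps as still valid is an assumption.

```c
#include <stdbool.h>

/* Hypothetical op and datatype tags, not the libfabric definitions. */
enum sketch_op {
	SKETCH_OP_MIN,
	SKETCH_OP_MAX,
	SKETCH_OP_SUM,
	SKETCH_OP_PROD,
	SKETCH_OP_CSWAP,	/* swap if equal */
	SKETCH_OP_CSWAP_NE,	/* swap if not equal */
	SKETCH_OP_CSWAP_LT,
	SKETCH_OP_CSWAP_LE,
	SKETCH_OP_CSWAP_GT,
	SKETCH_OP_CSWAP_GE
};

enum sketch_type {
	SKETCH_TYPE_INT64,
	SKETCH_TYPE_DOUBLE,
	SKETCH_TYPE_DOUBLE_COMPLEX
};

/* True if the op needs a less-than/greater-than ordering of its operands. */
static bool sketch_op_needs_ordering(enum sketch_op op)
{
	switch (op) {
	case SKETCH_OP_MIN:
	case SKETCH_OP_MAX:
	case SKETCH_OP_CSWAP_LT:
	case SKETCH_OP_CSWAP_LE:
	case SKETCH_OP_CSWAP_GT:
	case SKETCH_OP_CSWAP_GE:
		return true;
	default:
		return false;
	}
}

/* Reject ordering-based ops on complex types; allow everything else. */
static bool sketch_combo_valid(enum sketch_op op, enum sketch_type type)
{
	if (type == SKETCH_TYPE_DOUBLE_COMPLEX && sketch_op_needs_ordering(op))
		return false;
	return true;
}
```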
  9. prov/psm3: check atomic op error code

    Report atomic op errors back to the application.
    Some datatype/op combinations were falsely being reported
    as supported to the application but failing when the atomic
    was being performed. These failures were silently treated as
    successful because the errors were not passed back. Check the
    error code to catch future issues.
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    abf4321
  10. prov/psm3: fix logical atomic function calls

    psm3 advertises support for logical ops (lor, land, lxor)
    with all datatypes but the functions are only defined for
    integer types. When the atomic op is called with a non-integer
    type, it drops down to the default case and returns an error
    (FI_ENOTSUPP)
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    10c1bd9
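The failure mode this commit message describes, a logical op defined only for integer types that drops to the default case for anything else, can be sketched as below. All names are hypothetical, SKETCH_ENOTSUPP is a stand-in constant rather than the provider's real error code, and only logical OR on three datatypes is shown. The sketch also includes a support query that advertises only what the dispatch implements, which is one way to keep the two in sync.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the provider's "not supported" error code. */
#define SKETCH_ENOTSUPP 95

enum sketch_dt {
	SKETCH_DT_INT32,
	SKETCH_DT_UINT64,
	SKETCH_DT_FLOAT
};

/* Advertise logical OR only for the types implemented in sketch_lor(). */
static int sketch_supports_lor(enum sketch_dt dt)
{
	return dt == SKETCH_DT_INT32 || dt == SKETCH_DT_UINT64;
}

/* dst[i] = dst[i] || src[i] for integer types. Any other datatype falls
 * through to the default case and returns an error, which the caller is
 * expected to pass back rather than ignore. */
static int sketch_lor(enum sketch_dt dt, void *dst, const void *src,
		      size_t cnt)
{
	size_t i;

	switch (dt) {
	case SKETCH_DT_INT32: {
		int32_t *d = dst;
		const int32_t *s = src;

		for (i = 0; i < cnt; i++)
			d[i] = (d[i] || s[i]);
		return 0;
	}
	case SKETCH_DT_UINT64: {
		uint64_t *d = dst;
		const uint64_t *s = src;

		for (i = 0; i < cnt; i++)
			d[i] = (d[i] || s[i]);
		return 0;
	}
	default:
		return -SKETCH_ENOTSUPP;
	}
}
```

A caller that checks the return value of sketch_lor sees the error for the float case instead of treating the untouched buffer as a successful result.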
  11. prov/psm3: disable long double reads and compares

    Atomic long double and complex long double reads and swaps
    are failing data verification. Only part of the data is
    getting copied. Disable in psm3 until root caused.
    
    Signed-off-by: Alexia Ingerson <[email protected]>
    aingerson committed Oct 4, 2024
    fc9cf4e