Sparse algebra support with OOP API #760

jalvesz · 2024-01-10T21:33:29Z

PR related to:
#38
#749
#189

Here I propose the OOP API upfront as a means to centralize the data-containment format within stdlib, which I feel would be a (the) big added value. I'm migrating the proposal started in FSPARSE and adapting it to stdlib.

Current proposal design:

Derived Types:

sparse_type (abstract base type)
COO_type, CSR_type, CSC_type, ELL_type, SELLC_type (DTs with no data buffer)
COO_?p_type, CSR_?p_type, CSC_?p_type, ELL_?p_type, SELLC_?p_type (DTs with a data buffer, where ? denotes the kind precision, including complex values)

Matrix-vector product:

call spmv( matrix, x, y [,alpha,beta] ) !> y = beta * y + alpha * matrix * x

Data accessors:

val = matrix%at( i , j ) !> to get the value at (i,j) position, if (i,j) not in the sparse pattern, value = 0, if out-of-bounds val = NaN
call matrix%add( i , j, val ) !> to set the value at (i,j) position, if (i,j) not in the current structure, skip
!> OR add a data block
call matrix%add( dofs_i(:), dofs_j(:), mat(:,:) ) !> to add the dense block mat(:,:) at rows dofs_i(:) and columns dofs_j(:)
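
The accessor semantics described above (0 outside the sparsity pattern, NaN when out of bounds) can be sketched as follows. This is only an illustration under stated assumptions: `coo_sketch_type`, its components and the linear search are hypothetical and are not the types or the implementation of this PR.

```fortran
module coo_at_sketch
    use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan
    implicit none
    private
    public :: coo_sketch_type, dp

    integer, parameter :: dp = kind(1.0d0)

    !> Hypothetical COO container (not the PR's type):
    !> index(1,k) is the row and index(2,k) the column of stored entry k.
    type :: coo_sketch_type
        integer :: nrows = 0, ncols = 0
        integer,  allocatable :: index(:,:)
        real(dp), allocatable :: data(:)
    contains
        procedure :: at => coo_at
    end type

contains

    function coo_at(self, i, j) result(val)
        class(coo_sketch_type), intent(in) :: self
        integer, intent(in) :: i, j
        real(dp) :: val
        integer :: k

        ! Out of bounds: signal with a quiet NaN
        if (i < 1 .or. i > self%nrows .or. j < 1 .or. j > self%ncols) then
            val = ieee_value(0.0_dp, ieee_quiet_nan)
            return
        end if

        ! In bounds but not stored: the entry is a structural zero
        val = 0.0_dp
        do k = 1, size(self%data)
            if (self%index(1,k) == i .and. self%index(2,k) == j) then
                val = self%data(k)
                return
            end if
        end do
    end function

end module

program demo_coo_at
    use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
    use coo_at_sketch
    implicit none
    type(coo_sketch_type) :: A

    ! 3x3 matrix with entries (1,1)=4, (2,3)=5, (3,2)=6
    A%nrows = 3; A%ncols = 3
    A%index = reshape([1,1, 2,3, 3,2], [2,3])
    A%data  = [4.0_dp, 5.0_dp, 6.0_dp]

    print *, A%at(2,3)               ! stored entry: 5.0
    print *, A%at(2,2)               ! not in the pattern: 0.0
    print *, ieee_is_nan(A%at(4,1))  ! out of bounds: T
end program
```

A linear search keeps the sketch short; a sorted COO or a CSR layout would allow a faster per-row lookup.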

Conversions :

call coo2ordered(COO [, sort_data]) !> sort and remove duplicates from the COO data structure.
 
call dense2coo( dense , COO )  !> dense is a plain 2d array
call coo2dense( COO  , dense )

call coo2csr( COO , CSR ) !> assumes ordered and unique indexes for COO ... to implement a sorting algorithm before transferring data
call csr2coo( CSR , COO ) !> assumes ordered and unique indexes for CSR 

call coo2csc( COO , CSC ) !> assumes ordered and unique indexes for COO 
call csc2coo( CSC , COO ) !> assumes ordered and unique indexes for CSC

call csr2sellc( CSR , SELLC [, chunk ]  ) !> chunk sizes for spmv [4, 8, 16]

call diag( <sparse> , diag ) !> extract the diagonal components of the sparse matrix
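
To make the "ordered and unique" assumption of `coo2csr` concrete, here is a minimal sketch of the conversion on plain arrays. The routine name `coo_to_csr_arrays` and its argument list are hypothetical and chosen for illustration; the PR's `coo2csr` operates on the derived types instead.

```fortran
module coo2csr_sketch
    implicit none
    private
    public :: coo_to_csr_arrays, dp

    integer, parameter :: dp = kind(1.0d0)

contains

    !> Build CSR arrays from COO triplets that are already sorted by
    !> (row, column) and contain no duplicates. Hypothetical helper.
    subroutine coo_to_csr_arrays(nrows, row, col, val, rowptr, csr_col, csr_val)
        integer,  intent(in) :: nrows
        integer,  intent(in) :: row(:), col(:)
        real(dp), intent(in) :: val(:)
        integer,  allocatable, intent(out) :: rowptr(:), csr_col(:)
        real(dp), allocatable, intent(out) :: csr_val(:)
        integer :: i, k

        ! Sorted input: column indices and values carry over unchanged
        csr_col = col
        csr_val = val

        ! Count the entries of each row, stored one slot to the right
        allocate(rowptr(nrows+1))
        rowptr = 0
        do k = 1, size(row)
            rowptr(row(k)+1) = rowptr(row(k)+1) + 1
        end do

        ! A prefix sum turns the counts into 1-based row start offsets
        rowptr(1) = 1
        do i = 1, nrows
            rowptr(i+1) = rowptr(i+1) + rowptr(i)
        end do
    end subroutine

end module

program demo_coo2csr
    use coo2csr_sketch
    implicit none
    integer,  allocatable :: rowptr(:), col(:)
    real(dp), allocatable :: val(:)

    ! 3x3 matrix with entries (1,1), (1,3), (3,2); row 2 is empty
    call coo_to_csr_arrays(3, [1, 1, 3], [1, 3, 2], &
                           [10.0_dp, 20.0_dp, 30.0_dp], rowptr, col, val)
    print *, rowptr   ! 1 3 3 4
end program
```

If the COO data were unsorted, the plain copy of `col` and `val` would be wrong, which is why a prior call to `coo2ordered` (or an internal sort) is needed.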

zerothi · 2024-04-04T07:29:40Z

Wouldn't it be useful if sparse_t had an interface such as mat%to_coo|csr|... etc. instead of using a function call? Since this is going OOP, let's keep it like that.

jalvesz · 2024-04-04T08:32:31Z

This could be done even shorter with a %to(...) interface as the types are declared before calling the conversion. I've done that before but did not propose it here upfront as I wanted to hear some opinions on what the OOP API in stdlib should look like.

zerothi · 2024-04-04T08:35:26Z

Agreed, %to would be even better, or %convert for clarity? Hmm...

ftucciarone · 2024-05-17T20:46:10Z

Is it a good idea to have a matvec product that does not initialize the result vector? I have to say that I spent a good hour debugging a code, and the source of divergence in the solver was the fact that I was not initializing the result vector before passing it to the subroutine. While I understand that this is a great way to perform an Axpy operation, I would advise having separate matvec (with initialization) and Axpy (without) routines.

    subroutine matvec_csr_1d_sp(vec_y,matrix,vec_x)
        type(CSR_sp), intent(in) :: matrix
        real(sp), intent(in)    :: vec_x(:)
        real(sp), intent(inout) :: vec_y(:)
        integer :: i, j
        real(sp) :: aux

        associate( data => matrix%data, col => matrix%col, rowptr => matrix%rowptr, &
            & nnz => matrix%nnz, nrows => matrix%nrows, ncols => matrix%ncols, sym => matrix%sym )
            if( sym == k_NOSYMMETRY) then
                do concurrent(i=1:nrows)
                    do j = rowptr(i), rowptr(i+1)-1
                        vec_y(i) = vec_y(i) + data(j) * vec_x(col(j))
                    end do
                end do

            else if( sym == k_SYMTRIINF )then
                do i = 1 , nrows
                    aux  = 0._sp
                    do j = rowptr(i), rowptr(i+1)-2
                        aux = aux + data(j) * vec_x(col(j))
                        vec_y(col(j)) = vec_y(col(j)) + data(j) * vec_x(i)
                    end do
                    aux = aux + data(j) * vec_x(i)
                    vec_y(i) = vec_y(i) + aux
                end do

            else if( sym == k_SYMTRISUP )then
                do i = 1 , nrows
                    aux  = vec_x(i) * data(rowptr(i))
                    do j = rowptr(i)+1, rowptr(i+1)-1
                        aux = aux + data(j) * vec_x(col(j))
                        vec_y(col(j)) = vec_y(col(j)) + data(j) * vec_x(i)
                    end do
                    vec_y(i) = vec_y(i) + aux
                end do

            end if
        end associate
    end subroutine

jalvesz · 2024-05-18T08:14:16Z

Hi @ftucciarone, thanks for checking out the current draft and sorry for the inconvenience.

Yes, I intentionally designed the API not to initialize the vector. In the current version within FSPARSE (https://github.com/jalvesz/FSPARSE?tab=readme-ov-file#sparse-matrix-vector-product) I updated the README to be clear about it, but forgot to update it here.

The reason for that choice is that it allows maximum flexibility for preprocessing linear systems that need operations with the same matrix operator that will be used by the solver. So it basically places the "responsibility" on the user to set y=0 just before the call if a clean vector is indeed needed.

This is of course totally open to debate, I just proposed following my experience reworking solvers. My feeling is that doing that can avoid having to manage different implementations or optionals, when the solution can be to explicitly state that the interface is updating (y = y + M * x) instead of overwriting (y = M * x).

ftucciarone · 2024-05-18T10:18:08Z

Hi @jalvesz, it was actually a pleasure to look into this draft. I am learning a lot and I look forward to discussing it with you if possible.

I have to say that I have particularly strong feelings about the "non-initialization" problem, for two reasons.
First, if y = y + M*x is the chosen way to go, then it must be written very large, at the very beginning of the documentation, to make sure everyone sees it. The test examples should also take care of that, showing that the result is wrong if the vector is not initialized beforehand.
Second, I have the feeling that doing y=0_dp first and then call matvec(y, A, x) might add an overhead for very large problems due to the separate initialization of y, while initializing the component y(i) during the matvec might be faster. I have to test this, though, because I'm not 100% sure about it.

However, I think that separating matvec and Axpy could be feasible at this stage (I can take care of that with the correct guidance) and potentially doable by preprocessing as well (I'm thinking of a fypp if).

Cheers, Francesco

Edit: example of splitting matvec and matAxpy with fypp

#:include "../include/common.fypp"
#:set RANKS = range(1, 2+1)
#:set KINDS_TYPES = REAL_KINDS_TYPES
#! define the names of the generated operations
#:set OPERATIONS = ['matvec', 'matAxpy']
#! define ranks without parentheses
#:def rksfx2(rank)
#{if rank > 0}#${":," + ":," * (rank - 1)}$#{endif}#
#:enddef

!! matvec_csr
    #:for k1, t1 in (KINDS_TYPES)
    #:for rank in RANKS
    #! iterate over the operation names
    #:for idx, opname in enumerate(OPERATIONS)
    subroutine ${opname}$_csr_${rank}$d_${k1}$(vec_y,matrix,vec_x)
        type(CSR_${k1}$), intent(in) :: matrix
        ${t1}$, intent(in)    :: vec_x${ranksuffix(rank)}$
        ${t1}$, intent(inout) :: vec_y${ranksuffix(rank)}$
        integer :: i, j
        #:if rank == 1
        ${t1}$ :: aux
        #:else
        ${t1}$ :: aux(size(vec_x,dim=1))
        #:endif

        associate( data => matrix%data, col => matrix%col, rowptr => matrix%rowptr, &
            & nnz => matrix%nnz, nrows => matrix%nrows, ncols => matrix%ncols, sym => matrix%sym )
            if( sym == k_NOSYMMETRY) then
                do concurrent(i=1:nrows)
                    #:if idx==0
                    #! matvec: clear row i once so that the loop below can accumulate
                    vec_y(${rksfx2(rank-1)}$i) = 0._${k1}$
                    #:endif
                    do j = rowptr(i), rowptr(i+1)-1
                        vec_y(${rksfx2(rank-1)}$i) = vec_y(${rksfx2(rank-1)}$i) + data(j) * vec_x(${rksfx2(rank-1)}$col(j))
                    end do
                end do

            else if( sym == k_SYMTRIINF )then
                do i = 1 , nrows
                    aux  = 0._${k1}$
                    do j = rowptr(i), rowptr(i+1)-2
                        aux = aux + data(j) * vec_x(${rksfx2(rank-1)}$col(j))
                        vec_y(${rksfx2(rank-1)}$col(j)) = vec_y(${rksfx2(rank-1)}$col(j)) + data(j) * vec_x(${rksfx2(rank-1)}$i)
                    end do
                    aux = aux + data(j) * vec_x(${rksfx2(rank-1)}$i)
                    #:if idx==0
                    #! matvec: overwrite row i
                    vec_y(${rksfx2(rank-1)}$i) = aux
                    #:else
                    #! matAxpy: accumulate into row i
                    vec_y(${rksfx2(rank-1)}$i) = vec_y(${rksfx2(rank-1)}$i) + aux
                    #:endif
                end do

            else if( sym == k_SYMTRISUP )then
                #:if idx==0
                #! matvec: zero the whole result first, because the scatter below
                #! writes into rows that are only visited later in the loop
                vec_y = 0._${k1}$
                #:endif
                do i = 1 , nrows
                    aux  = vec_x(${rksfx2(rank-1)}$i) * data(rowptr(i))
                    do j = rowptr(i)+1, rowptr(i+1)-1
                        aux = aux + data(j) * vec_x(${rksfx2(rank-1)}$col(j))
                        vec_y(${rksfx2(rank-1)}$col(j)) = vec_y(${rksfx2(rank-1)}$col(j)) + data(j) * vec_x(${rksfx2(rank-1)}$i)
                    end do
                    vec_y(${rksfx2(rank-1)}$i) = vec_y(${rksfx2(rank-1)}$i) + aux
                end do

            end if
        end associate
    end subroutine
    
    #:endfor
    #:endfor
    #:endfor

jalvesz · 2024-05-18T16:26:13Z

@ftucciarone thanks for the idea! I will also bring parts of a discussion I had with @ivan-pi on exactly this topic, where he brought to my attention the following:

The MKL library offers y := beta y + alpha A x, and you then pick either beta = 0 or beta = 1, depending on what you want.

In Apple's Accelerate Framework on the other hand, they offer two functions:

SparseMultiply(A,x,y)           // y = A x
SparseMultiplyAdd(A,x,y)     // y += A x

In PSBLAS, spmm covers both vectors and matrices (rank-2 dense arrays):

call psb_spmm(alpha, a, x, beta, y, desc_a, info)
call psb_spmm(alpha, a, x, beta, y, desc_a, info, trans, work)

https://psctoolkit.github.io/psblasguide/userhtmlse4.html#x18-550004

Taking all those ideas together, I could imagine using optionals to cover:

call matvec( A , vec_x, vec_y ) !> vec_y = vec_y + A * vec_x
call matvec( A , vec_x, vec_y , overwrite = .true. ) !> vec_y = A * vec_x
call matvec( A , vec_x, vec_y , alpha=value, beta=value ) !> vec_y = beta * vec_y + alpha * A * vec_x

Alternatively, the default could be to overwrite, with an optional argument for addition; or simply passing beta = 1 could select the addition, with if(present(beta)) determining that automatically.

Let me know your thoughts on that.
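
One of the variants discussed above can be sketched with optional arguments on plain CSR arrays. The defaults chosen here (alpha = 1 and, when beta is absent, overwrite) are an assumption made for this sketch only; the thread leaves the default open. The name `csr_matvec` and its argument list are hypothetical.

```fortran
module matvec_optional_sketch
    implicit none
    private
    public :: csr_matvec, dp

    integer, parameter :: dp = kind(1.0d0)

contains

    !> vec_y = beta * vec_y + alpha * A * vec_x, with A given as CSR arrays.
    !> Assumed defaults for this sketch: alpha = 1, beta absent => overwrite.
    subroutine csr_matvec(rowptr, col, val, vec_x, vec_y, alpha, beta)
        integer,  intent(in)    :: rowptr(:), col(:)
        real(dp), intent(in)    :: val(:), vec_x(:)
        real(dp), intent(inout) :: vec_y(:)
        real(dp), intent(in), optional :: alpha, beta
        real(dp) :: alpha_, aux
        integer  :: i, j

        alpha_ = 1.0_dp
        if (present(alpha)) alpha_ = alpha

        ! Without beta the incoming vec_y is never read, so an
        ! uninitialised result vector cannot leak into the output
        if (present(beta)) then
            vec_y = beta * vec_y
        else
            vec_y = 0.0_dp
        end if

        do i = 1, size(rowptr) - 1
            aux = 0.0_dp
            do j = rowptr(i), rowptr(i+1) - 1
                aux = aux + val(j) * vec_x(col(j))
            end do
            vec_y(i) = vec_y(i) + alpha_ * aux
        end do
    end subroutine

end module

program demo_matvec_optional
    use matvec_optional_sketch
    implicit none
    ! A = [[1, 2], [0, 3]] in CSR form
    integer,  parameter :: rowptr(3) = [1, 3, 4], col(3) = [1, 2, 2]
    real(dp), parameter :: val(3) = [1.0_dp, 2.0_dp, 3.0_dp]
    real(dp) :: x(2), y(2)

    x = 1.0_dp

    y = 10.0_dp
    call csr_matvec(rowptr, col, val, x, y)                  ! y = A x
    print *, y                                               ! 3.0 3.0

    y = 10.0_dp
    call csr_matvec(rowptr, col, val, x, y, beta=1.0_dp)     ! y = y + A x
    print *, y                                               ! 13.0 13.0

    y = 10.0_dp
    call csr_matvec(rowptr, col, val, x, y, alpha=2.0_dp, beta=0.5_dp)
    print *, y                                               ! 11.0 11.0
end program
```

With this convention the accumulating form is always an explicit request (beta=1), which addresses the uninitialised-vector pitfall raised earlier while keeping a single routine.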

certik · 2024-07-02T18:51:40Z

Thanks for the PR! I think it looks in the right direction.

In order for me to start using these routines (for parallel sparse matmul, various conversions, etc.), I would need a version that does not use stdlib's custom derived type, since I have my own derived type in my code. Is there a way to do that?

The only way I know is to split the API into two parts:

low level that does not use derived types
high level, OO style (this PR)

Maybe there is another way, but the above way seems robust. Then I can start using the low level API, and just give it the CSR arrays that I have in my code from my own derived type.

Why not structure this PR in this low level / high level style? It's almost there, but not quite.
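
The two-level split suggested above could look roughly like the following sketch: a low-level kernel that only sees plain arrays, and a thin type-bound wrapper that forwards the components of a derived type. All names (`csr_spmv_arrays`, `csr_sketch_type`, `spmv`) are hypothetical, and the kernel overwrites `y` purely to keep the example short.

```fortran
module two_level_sketch
    implicit none
    private
    public :: csr_spmv_arrays, csr_sketch_type, dp

    integer, parameter :: dp = kind(1.0d0)

    !> High level: a hypothetical container, not the PR's actual type
    type :: csr_sketch_type
        integer,  allocatable :: rowptr(:), col(:)
        real(dp), allocatable :: data(:)
    contains
        procedure :: spmv => csr_spmv_type
    end type

contains

    !> Low level: plain arrays only, computes y = A * x
    subroutine csr_spmv_arrays(rowptr, col, val, x, y)
        integer,  intent(in)  :: rowptr(:), col(:)
        real(dp), intent(in)  :: val(:), x(:)
        real(dp), intent(out) :: y(:)
        real(dp) :: aux
        integer  :: i, j

        do i = 1, size(rowptr) - 1
            aux = 0.0_dp
            do j = rowptr(i), rowptr(i+1) - 1
                aux = aux + val(j) * x(col(j))
            end do
            y(i) = aux
        end do
    end subroutine

    !> High level: forwards the components to the low-level kernel
    subroutine csr_spmv_type(self, x, y)
        class(csr_sketch_type), intent(in) :: self
        real(dp), intent(in)  :: x(:)
        real(dp), intent(out) :: y(:)

        call csr_spmv_arrays(self%rowptr, self%col, self%data, x, y)
    end subroutine

end module

program demo_two_level
    use two_level_sketch
    implicit none
    type(csr_sketch_type) :: A
    real(dp) :: x(2), y(2)

    ! A = [[1, 2], [0, 3]]
    A%rowptr = [1, 3, 4]
    A%col    = [1, 2, 2]
    A%data   = [1.0_dp, 2.0_dp, 3.0_dp]
    x = 1.0_dp

    ! High level: through the derived type
    call A%spmv(x, y)
    print *, y   ! 3.0 3.0

    ! Low level: the same kernel fed with arrays owned by user code
    call csr_spmv_arrays([1, 3, 4], [1, 2, 2], [1.0_dp, 2.0_dp, 3.0_dp], x, y)
    print *, y   ! 3.0 3.0
end program
```

In such a layout, code with its own sparse type would call the array-based routine directly, while the OO interface stays a wrapper whose back end can change without affecting users.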

jvdp1 · 2024-07-08T17:19:00Z

@certik Thank you for your comment. I agree with you that these features should be split into two parts (I also have my own sparse library, and I might be interested in calling some of the stdlib features):

low level procedures (e.g., spmv)
high level for OO

However, @jalvesz started with the OO approach, and I think it is quite close to being merged as is, pending a few "minor" changes. Restructuring this PR might be a huge task.

Therefore, I advise reviewing and merging this PR as is, maybe after implementing some suggestions where we think the API level could already be lowered a bit.
The merged procedures could then be reviewed in a second stage, based on user feedback and with this two-level approach in mind. The high-level OO approach would not be modified for the user. Only the low-level approach would be added, based on what already exists.

I am afraid that if we go for the two-level API approach in one go, we will end up with nothing in stdlib due to too many discussions, lack of time, @jalvesz's motivation, and so on. This topic of sparse matrices has been on the table for a long time, with many unsuccessful attempts. I think @jalvesz's implementation is the closest one to being merged. At this stage, I prefer to have something usable in stdlib, even if it only includes the OO API. Users may start to use it, provide feedback, and hopefully contribute to stdlib.
Such an approach has been used for other stdlib features (e.g., sorting, hash maps, loggers) with some success.
I will be happy to hear your feedback.

jvdp1

Here are some comments on the specs only (note: I have not looked at the code yet, so some comments might just be dumb).

jvdp1 · 2024-07-08T14:56:41Z

doc/specs/stdlib_sparse.md

+
+`COO`, `intent(inout)`: Shall be any `COO` type. The same object will be returned with the arrays reallocated to the correct size after removing duplicates.
+
+`sort_data`, `logical(in)`, `optional`:: Shall be an optional `logical` argument to determine whether data in the COO graph should be sorted while sorting the index array, default `.false.`.


This is not mentioned in the previous syntax section.

jvdp1

Here are some additional quick comments.

jalvesz · 2024-07-10T19:46:19Z

@certik I understand and agree that a low level API is also desirable. Ideally we should eventually be able to link against MKL following their interfaces, which are quite similar, but not identical, to the Sparse BLAS syntax: https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-fortran/2024-0/sparse-blas-level-2-and-level-3-routines-002.html

Now, this is a huge amount of work. My feeling is that it is doable by changing the low-level back-ends without having to modify the high-level API (or with minimal breaking changes). I brought this proposal to kick off interest and see if that could happen eventually and progressively once the DTs are in shape. Also, the idea of the DTs is to have sparse objects at the same level as a dense array, such that higher-level interfaces can then be designed to work on either dense or sparse matrices.

jalvesz and others added 11 commits January 10, 2024 00:47

sparse API 1st commit

5d0ff52

Merge branch 'fortran-lang:master' into sparse

b1dcbf6

cmake build

c63c0dd

Merge branch 'sparse' of https://github.com/jalvesz/stdlib into sparse

d940ebd

add data accessors set and get

72481be

fix typo

a7cb6be

change ij accessor as subroutine

d48dde5

Merge branch 'fortran-lang:master' into sparse

93c5c3a

fix missing i,j integer declaration

6241ca3

Merge branch 'fortran-lang:master' into sparse

0f2ee3b

Merge branch 'fortran-lang:master' into sparse

c1a85c4

Merge branch 'fortran-lang:master' into sparse

2606691

jalvesz marked this pull request as draft April 20, 2024 09:14

jalvesz and others added 6 commits April 23, 2024 23:45

Merge branch 'fortran-lang:master' into sparse

c34ac70

Merge branch 'fortran-lang:master' into sparse

3126d16

Merge branch 'fortran-lang:master' into sparse

c0bbabb

add comments and change _t for _type

2c2431d

revert matvec convention to (matrix,x,y)

d64a045

Merge branch 'fortran-lang:master' into sparse

6786859

jalvesz and others added 2 commits May 18, 2024 18:11

Merge branch 'fortran-lang:master' into sparse

d926581

Merge branch 'sparse' of https://github.com/jalvesz/stdlib into sparse

22b477c

jalvesz added 2 commits May 18, 2024 19:50

upgrade sparse support with SELLC format and more tests add suffix for function naming

d165b8b

include alpha and beta parameters in sparse matvec

8f72559

Merge branch 'fortran-lang:master' into sparse

14e2c17

jvdp1 reviewed Jul 8, 2024

jalvesz and others added 9 commits July 9, 2024 07:09

Update doc/specs/stdlib_sparse.md

b53eca2

Co-authored-by: Jeremie Vandenplas

Update doc/specs/stdlib_sparse.md

65e3fcb

Co-authored-by: Jeremie Vandenplas

Update doc/specs/stdlib_sparse.md

2f56cd4

Co-authored-by: Jeremie Vandenplas

Update doc/specs/stdlib_sparse.md

941de3a

Co-authored-by: Jeremie Vandenplas

Update doc/specs/stdlib_sparse.md

575c426

Co-authored-by: Jeremie Vandenplas

Update src/stdlib_sparse_kinds.fypp

db73fdc

Co-authored-by: Jeremie Vandenplas

Update src/stdlib_sparse_kinds.fypp

697afa2

Co-authored-by: Jeremie Vandenplas

Update src/stdlib_sparse_kinds.fypp

c97e665

Co-authored-by: Jeremie Vandenplas

Merge branch 'fortran-lang:master' into sparse

ac100a1

jvdp1 reviewed Jul 9, 2024

Merge branch 'fortran-lang:master' into sparse

9879a9c

jvdp1 mentioned this pull request Jul 9, 2024

sort_index: use of only int_index iterators inside sort_index #848

Merged

refactor spmv as submodule to keep parameters private, rework specs

dde88a7

jalvesz and others added 11 commits July 10, 2024 22:06

add an ilp parameter to change in the future for int64 if needed for large arrays

6ae038b

add the _type suffix to all sparse types

3596f3f

Merge branch 'fortran-lang:master' into sparse

a21d1e8

rollback on submodules

c8d94a3

forgotten file in cmake

82dbe02

Merge branch 'fortran-lang:master' into sparse

a8aa247

Merge branch 'fortran-lang:master' into sparse

66b0ce2

Merge branch 'fortran-lang:master' into sparse

4b41aa1

Merge branch 'fortran-lang:master' into sparse

ab112e6

add csc/coo conversions and diagonal extraction

bc0021b

Merge branch 'fortran-lang:master' into sparse

7279461
