I'm trying to create a sparse square matrix in Matlab through a mex function (written in Fortran). I want something like A = sparse(I,J,K). My triplets look like this; note that some (i,j) pairs are repeated:
femi = [1, 2, 3, 2, 2, 4, 5, 5, 4, 6, 6, 5, 5, 2]
femj = [2, 2, 1, 1, 1, 3, 3, 6, 3, 1, 1, 2, 2, 4]
femk = [2, 1, 5, 4, 2, 4, 5, 7, 2, 1, 6, 2, 1, 4]
I've written a rough piece of code. It works for small matrix dimensions, but it's much slower than Matlab's built-in sparse. Since I have almost no background in coding, I don't know what I'm doing wrong (wrong way to allocate variables? too many do loops?). Any help is appreciated. Thank you. This is the mex computational subroutine. It returns the pr, ir and jc arrays that define the sparse matrix:
subroutine new_sparse(femi, femj, femk, pr, ir, jc, n, m)
implicit none
intrinsic:: SUM, COUNT, ANY
integer :: i, j, k, n, indjc, m
real*8 :: femi(n), femj(n), femk(n)
real*8 :: pr(n)
integer :: ir(n),jc(m+1)
logical :: indices(n)
indices = .false.
k = 1
indjc = 0
jc(1) = 0
do j=1,m
do i =1,m
indices = [femi==i .and. femj==j]
if (ANY(indices .eqv. .true.)) then
ir(k) = i-1
pr(k) = SUM(femk, indices)
k = k+1
indjc = indjc + 1
end if
end do
if (indjc/=0) then
jc(j+1) = jc(j) + indjc
indjc = 0
else
jc(j+1) = jc(j)
end if
end do
return
end
Edit:
As suggested by users #jack and #veryreverie in the comments below, it's better to sort femi, femj and femk directly. I guess that ranking/sorting femi first (and reordering femj and femk according to femi) and then ranking/sorting femj (and reordering femi and femk according to femj) provides the desired result, as long as the second sort is stable. The only thing left is to deal with duplicates.
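A minimal Python sketch of that sort-then-merge idea, for illustration only (it is not the Fortran mex code, and the name `sort_and_merge` is made up here). It relies on Python's `sorted` being stable, so sorting by row and then by column leaves the triplets ordered by (column, row), with duplicates adjacent and easy to sum:

```python
def sort_and_merge(femi, femj, femk):
    """Order triplets by (column, row) and sum the values of repeated (row, column) pairs."""
    # First sort the positions by row index, then stably by column index.
    order = sorted(range(len(femi)), key=lambda a: femi[a])
    order = sorted(order, key=lambda a: femj[a])

    rows, cols, vals = [], [], []
    for a in order:
        if rows and rows[-1] == femi[a] and cols[-1] == femj[a]:
            # Same (row, column) as the previous triplet: accumulate.
            vals[-1] += femk[a]
        else:
            rows.append(femi[a])
            cols.append(femj[a])
            vals.append(femk[a])
    return rows, cols, vals


femi = [1, 2, 3, 2, 2, 4, 5, 5, 4, 6, 6, 5, 5, 2]
femj = [2, 2, 1, 1, 1, 3, 3, 6, 3, 1, 1, 2, 2, 4]
femk = [2, 1, 5, 4, 2, 4, 5, 7, 2, 1, 6, 2, 1, 4]
rows, cols, vals = sort_and_merge(femi, femj, femk)
```

With the sample triplets above, the 14 inputs collapse to 10 unique entries.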
Edit #2 :
I translated line by line the serial version of the C code by Engblom and Lukarski. Their document explains their reasoning very clearly, and I think it's useful for beginners like me. However, due to my inexperience, I was unable to translate the parallelized version of the code. Maybe that warrants another question.
subroutine new_sparse(ir, jcS, pr, MatI, MatJ, MatK, n, m)
! use omp_lib
implicit none
integer, parameter :: dp = selected_real_kind(15,300)
integer, intent(in) :: n, m
real(dp), intent(in) :: MatK(n), MatI(n), MatJ(n)
! integer*8, intent(out) :: nnew
integer :: i, k, col, row, c, r !, nthreads
integer :: hcol(m+1), jcS(m+1), jrS(m+1)
integer :: ixijs, irank(n), rank(n)
real*8 :: pr(*)
integer :: ir(*)
hcol = 0
jcS = 0
jrS = 0
do i = 1,n
jrS(MatI(i)+1) = jrS(MatI(i)+1)+1
end do
do r = 2,m+1
jrS(r) = jrS(r) + jrS(r-1)
end do
do i = 1,n
rank(jrS(MatI(i))+1) = i
jrS(MatI(i)) = jrS(MatI(i)) + 1
end do
k = 1
do row = 1,m
do i = k , jrS(row)
ixijs = rank(i)
col = MatJ(ixijs)
if (hcol(col) < row) then
hcol(col) = row
jcS(col+1) = jcS(col+1)+1
end if
irank(ixijs) = jcS(col+1)
k = k+1
end do
end do
do c = 2,m+1
jcS(c) = jcS(c) + jcS(c-1)
end do
do i = 1,n
irank(i) = irank(i) + jcS(MatJ(i))
end do
ir(irank) = MatI-1
do i = 1,n
pr(irank(i)) = pr(irank(i)) + MatK(i)
end do
return
end
This should work:
module test
implicit none
! This should probably be whatever floating point format Matlab uses.
integer, parameter :: dp = selected_real_kind(15,300)
contains
subroutine new_sparse(femi, femj, femk, pr, ir, jc, n, m)
integer, intent(in) :: n ! The size of femi, femj, femk.
integer, intent(in) :: m ! The no. of rows (and cols) in the matrix.
integer, intent(in) :: femi(n) ! The input i indices.
integer, intent(in) :: femj(n) ! The input j indices.
real(dp), intent(in) :: femk(n) ! The input values.
real(dp), intent(out) :: pr(n) ! The output values.
integer, intent(out) :: ir(n) ! The output i indices.
integer, intent(out) :: jc(m+1) ! Column j has jc(j+1)-jc(j) non-zero entries
! loop indices.
integer :: a,b
! Initialise jc.
! All elements of `jc` are `1` as the output initially contains no elements.
jc = 1
! Loop over the input elements.
do_a : do a=1,n
associate(i=>femi(a), j=>femj(a), k=>femk(a))
! Loop over the stored entries in column j of the output,
! looking for element (i,j).
do b=jc(j),jc(j+1)-1
! Element (i,j) is already in the output, update the output and cycle.
if (ir(b)==i) then
pr(b) = pr(b) + femk(a)
cycle do_a
endif
enddo
! Element (i,j) is not already in the output.
! First make room for the new element in ir and pr,
! then add the element to ir and pr,
! then update jc.
ir(jc(j+1)+1:jc(m+1)) = ir(jc(j+1):jc(m+1)-1)
pr(jc(j+1)+1:jc(m+1)) = pr(jc(j+1):jc(m+1)-1)
ir(jc(j+1)) = i
pr(jc(j+1)) = k
jc(j+1:) = jc(j+1:) + 1
end associate
enddo do_a
end subroutine
end module
program prog
use test
implicit none
integer, parameter :: n = 14
integer, parameter :: m = 6
integer :: femi(n), femj(n)
real(dp) :: femk(n)
real(dp) :: pr(n)
integer :: ir(n),jc(m+1)
integer :: a,b
femi = [1, 2, 3, 2, 2, 4, 5, 5, 4, 6, 6, 5, 5, 2]
femj = [2, 2, 1, 1, 1, 3, 3, 6, 3, 1, 1, 2, 2, 4]
femk = real([2, 1, 5, 4, 2, 4, 5, 7, 2, 1, 6, 2, 1, 4], dp)
write(*,*) 'Input:'
do a=1,n
write(*,'(a,i0,a,i0,a,f2.0)') '(',femi(a),',',femj(a),') : ',femk(a)
enddo
write(*,*)
call new_sparse(femi,femj,femk,pr,ir,jc,n,m)
write(*,*) 'Output:'
do a=1,m
do b=jc(a),jc(a+1)-1
write(*,'(a,i0,a,i0,a,f2.0)') '(',ir(b),',',a,') : ',pr(b)
enddo
enddo
end program
This writes:
Input:
(1,2) : 2.
(2,2) : 1.
(3,1) : 5.
(2,1) : 4.
(2,1) : 2.
(4,3) : 4.
(5,3) : 5.
(5,6) : 7.
(4,3) : 2.
(6,1) : 1.
(6,1) : 6.
(5,2) : 2.
(5,2) : 1.
(2,4) : 4.
Output:
(3,1) : 5.
(2,1) : 6.
(6,1) : 7.
(1,2) : 2.
(2,2) : 1.
(5,2) : 3.
(4,3) : 6.
(5,3) : 5.
(2,4) : 4.
(5,6) : 7.
The bottleneck in your algorithm comes from the instructions indices = [femi==i .and. femj==j], any(indices .eqv. .true.) and sum(femk, indices). Each of these takes O(n) operations, and since they sit inside a double loop in which both i and j run from 1 to m, the overall cost of the subroutine is O(m^2*n).
My algorithm works in two stages. The first stage, the do b=jc(j),jc(j+1)-1 loop, compares each element in the input with each element in the matching column of the output. A column holds at most m stored entries, so over n input elements this costs at most O(mn) operations. If the input element is found in the output, then the value is updated and nothing more needs to be done.
If the input element is not found in the output, then it needs to be added to the output. This is handled by the second stage, the code after the do b... loop. Since this needs to move the output elements in order to make space for the new element, this stage has a maximum of O(n'^2) operations, where n' is the number of unique elements in the input, which should satisfy n'<=n and n'<<m^2 for a sparse matrix.
My algorithm should run a lot faster for large m and n, but it certainly has a lot of scope for improvement. I suspect it is worth using an intermediate data structure for storing ir and pr, so that new elements can be inserted without having to re-arrange all the elements to do so.
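As a rough illustration of the intermediate-data-structure idea, here is a Python sketch (not Fortran, and not a drop-in for the mex routine). It assumes a hash map keyed by (column, row) is an acceptable intermediate store; the function name `triplets_to_csc` is hypothetical. Duplicates are summed as they arrive, and the compressed-column arrays are built once at the end, so no elements ever have to be shifted:

```python
from collections import defaultdict


def triplets_to_csc(femi, femj, femk, m):
    """Build zero-based pr, ir, jc arrays from one-based (i, j, value) triplets."""
    # Accumulate duplicates; keying on (column, row) makes the later sort column-major.
    acc = defaultdict(float)
    for i, j, v in zip(femi, femj, femk):
        acc[(j, i)] += v

    keys = sorted(acc)                 # ordered by column, then row
    pr = [acc[key] for key in keys]    # summed values
    ir = [i - 1 for (_, i) in keys]    # zero-based row indices

    # Count the entries per column, then take a running sum to get column offsets.
    jc = [0] * (m + 1)
    for (j, _) in keys:
        jc[j] += 1                     # one-based column j is counted in slot j
    for c in range(1, m + 1):
        jc[c] += jc[c - 1]
    return pr, ir, jc


femi = [1, 2, 3, 2, 2, 4, 5, 5, 4, 6, 6, 5, 5, 2]
femj = [2, 2, 1, 1, 1, 3, 3, 6, 3, 1, 1, 2, 2, 4]
femk = [2, 1, 5, 4, 2, 4, 5, 7, 2, 1, 6, 2, 1, 4]
pr, ir, jc = triplets_to_csc(femi, femj, femk, 6)
```

Unlike the Fortran routine above, which keeps one-based indices and stores entries in insertion order within each column, this sketch returns zero-based ir and jc with rows sorted inside each column. The cost is dominated by the sort over the n' unique entries, roughly O(n + n' log n').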
I am trying to write a function which returns all s-tuples of non-negative integers with a given sum $n$ (that is, the entries of each tuple add up to $n$). In the program, I need to use s nested loops:
for i1 from 0 to n do
for i2 from 0 to n do
...
for is from 0 to n do
end for;
end for;
end for;
How could I use only a few loops instead of s loops? Thank you very much.
I suggest the combinat:-composition() command. On its own, the command won't include zero terms, but you can instead compute the compositions of n+s into s parts and subtract 1 from each term at the end:
restart;
partitions := proc( n :: posint, s :: posint, { allowzero :: truefalse := false } )
local P, u:
if allowzero = false then
P := convert( combinat:-composition( n, s ), 'list' ):
return select( u -> numelems(u) = s, P ):
else
P := procname( n + s, s, ':-allowzero'=false ):
return map( u -> u -~ 1, P ):
end if:
end proc:
partitions( 5, 2, ':-allowzero'=false ); # [ [1,4], [2,3], [3,2], [4,1] ]
partitions( 5, 2, ':-allowzero'=true ); # [ [0,5], [1,4], [2,3], [3,2], [4,1], [5,0] ]
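For comparison, the same enumeration can be sketched in Python with a single recursive generator instead of s nested loops. This is only an illustration of the idea, not a translation of the Maple procedure, and the name `tuples_with_sum` is made up here:

```python
def tuples_with_sum(n, s):
    """Yield every s-tuple of non-negative integers whose entries sum to n."""
    if s == 1:
        # Only one slot left: it must take whatever remains.
        yield (n,)
        return
    for first in range(n + 1):
        # Fix the first entry, then fill the other s-1 slots with the remainder.
        for rest in tuples_with_sum(n - first, s - 1):
            yield (first,) + rest


pairs = list(tuples_with_sum(5, 2))
```

For n = 5 and s = 2 this yields the same six tuples as the allowzero=true call above. The recursion depth equals s, so the number of loops written in the source stays fixed regardless of s.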
I just started programming in the Nim language (which I really like so far). As a learning exercise I am writing a small matrix library. I have a lot more code, but I'll just show the part that's relevant to this question.
type
Matrix*[T; nrows, ncols: static[int]] = array[0 .. (nrows * ncols - 1), T]
# Get the index in the flattened array corresponding
# to row r and column c in the matrix
proc index(mat: Matrix, r, c: int): int =
result = r * mat.ncols + c
# Return the element at r, c
proc `[]`(mat: Matrix, r, c: int): Matrix.T =
result = mat[mat.index(r, c)]
# Set the element at r, c
proc `[]=`(mat: var Matrix, r, c: int, val: Matrix.T) =
mat[mat.index(r, c)] = val
# Add a value to every element in the matrix
proc `+=`(mat: var Matrix, val: Matrix.T) =
for i in 0 .. mat.high:
mat[i] += val
# Add a value to element at r, c
proc `[]+=`(mat: var Matrix, r, c: int, val: Matrix.T) =
mat[mat.index(r, c)] += val
# A test case
var mat: Matrix[float, 3, 4] # matrix with 3 rows and 4 columns
mat[1, 3] = 7.0
mat += 1.0
# add 8.0 to entry 1, 3 in matrix
`[]+=`(mat, 1, 3, 8.0) # works fine
All this works fine, but I'd like to be able to replace the last line with something like
mat[1, 3] += 4.0
This doesn't work (I wasn't expecting it to). If I try it, I get
Error: for a 'var' type a variable needs to be passed
How would I create an addition assignment operator that has this behavior? I'm guessing I need something other than a proc to accomplish this.
There are two ways you can do this:
Overload [] for var Matrix and return a var T (This requires the current devel branch of Nim):
proc `[]`(mat: Matrix, r, c: int): Matrix.T =
result = mat[mat.index(r, c)]
proc `[]`(mat: var Matrix, r, c: int): var Matrix.T =
result = mat[mat.index(r, c)]
Make [] a template instead:
template `[]`(mat: Matrix, r, c: int): expr =
mat[mat.index(r, c)]
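This causes a problem when mat is not a plain variable but a more complex expression: the template body mentions mat twice, so the expression is substituted, and therefore evaluated, twice: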
proc x: Matrix[float, 2, 2] =
echo "x()"
var y = x()[1, 0]
This prints x() twice.