I don't know why but I am getting this error:
Error in mr_lsbpex (line 3)
dlen = uint32(0) ;
Output argument "a" (and maybe others) not assigned during call to "E:\path\mr_lsbpex.m>mr_lsbpex"
I have tested "dlen = uint32(0) ;" in the MATLAB environment (outside of this function) and everything was OK. Here is my code:
function a = mr_lsbpex ( r, p )
% extract from an array
dlen = uint32(0) ;
s = size (r) ;
rnd = rand (s(1),s(2)) ;
rd = 32 ;
rl = s(2) ;
for i=1:s(2)
if rnd(1,i)<rd/rl
d = bitget (round(r(1,i)/p),1);
dlen = bitset (dlen,rd,d);
rd = rd -1 ;
end
rl = rl -1 ;
end
if (dlen > 10000000 )
clear a ;
return ;
end
a = uint8(zeros(dlen,1)) ;
rd = double(dlen * 8) ;
rl = double(s(1)*s(2)-s(2)) ;
for i=2:s(1)
for j=1:s(2)
if rnd(i,j)<rd/rl
d = bitget (round(r(i,j)/p) ,1) ;
a = z_set_bit (a,rd,d) ;
rd = rd - 1 ;
end
rl = rl - 1 ;
end
end
Remember: a must ALWAYS be assigned before the function returns!
The error is not in that specific line, but in the function as a whole.
The problem is that MATLAB reaches the end of the call without a having been assigned. In your code that happens when dlen > 10000000: at that point a does not exist yet, and the function returns.
Adding the following line at the beginning of your function should do the trick:
a=0; % well, or a=NaN; or whatever you want to return
Additionally, don't clear a in if (dlen > 10000000 ).
I'm writing my first assembly language program for class using EASy68K.
I'm using if-else branching to replicate this pseudocode:
IF (P > 12)
P = P * 8 + 3
ELSE
P = P - Q
PRINT P
But I think I have my branches wrong: without the first halt in my code, the program runs through the IF branch anyway, even after the CMP finds that P < 12. Am I missing something here, or is this a generally accepted way of doing this?
Here is my assembly code:
START: ORG $1000 ; Program starts at loc $1000
MOVE P, D1 ; [D1] <- P
MOVE Q, D2 ; [D2] <- Q
* Program code here
CMP #12, D1 ; is P > 12?
BGT IF ;
SUB D2, D1 ; P = P - Q
MOVE #3, D0 ; assign read command
TRAP #15 ;
SIMHALT ; halt simulator
IF ASL #3, D1 ; P = P * 8
ADD #3, D1 ; P = P + 3
ENDIF
MOVE #3, D0 ; assign read command
TRAP #15 ;
SIMHALT ; halt simulator
* Data and Variables
ORG $2000 ; Data starts at loc $2000
P DC.W 5 ;
Q DC.W 7 ;
END START ; last line of source
To do if..else, you need two jumps; one at the start, and one at the end of the first block.
While it doesn't affect correctness, it is also conventional to retain source order, which means negating the condition.
MOVE P, D1 ; [D1] <- P
MOVE Q, D2 ; [D2] <- Q
* Program code here
CMP #12, D1 ; is P > 12?
BLE ELSE ; P is <= 12
IF
ASL #3, D1 ; P = P * 8
ADD #3, D1 ; P = P + 3
BRA ENDIF
ELSE
SUB D2, D1 ; P = P - Q
ENDIF
MOVE #3, D0 ; task 3: display D1 as a signed decimal
TRAP #15 ;
SIMHALT ; halt simulator
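As a cross-check on what the assembly should print, the pseudocode from the question can be written as a small reference function. This is only a sketch in Python, not EASy68K code: the name branch_reference is hypothetical, and 16-bit word overflow is ignored.

```python
def branch_reference(p, q):
    """Reference for: IF (P > 12) P = P * 8 + 3 ELSE P = P - Q."""
    if p > 12:
        # ASL #3 shifts left by three bits, which multiplies by 8
        return (p << 3) + 3
    # ELSE block: exactly one of the two blocks runs, never both
    return p - q


print(branch_reference(5, 7))   # P and Q from the question's DC.W lines: -2
print(branch_reference(13, 7))  # a value that takes the IF block: 107
```

With P = 5 and Q = 7, as in the question, the assembly should therefore display -2.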
EASy68K supports structured assembly.
OPT SEX
IF.W P <GT> #12 THEN
ELSE
ENDI
Add the option SEX to expand the structured code during assembly if you wish to view the compare and branch instructions used to implement the structured code.
I've been trying to run a code that uses MPI I/O on a large number of cores. The time each core needs to read from and write to a single file (the same file for all cores) increases with the number of cores used. I'm currently using 512 cores, and this problem is making my project infeasible. The problem appears even on 8 cores, where it takes about 0.2 seconds to read the first real number in the file. On 32 cores it takes more than 30 seconds to write one real number. I'm running it here: https://www.msi.umn.edu/hpc/itasca. The following simple code reproduces the problem (counting the number of elements in the file might seem unnecessary here, but it is necessary in my actual code):
PROGRAM MAIN
USE MPI
IMPLICIT NONE
! INITIALIZING VARIABLES
REAL(8) :: A, B
INTEGER :: COUNT_IO, i, j, ST, GO, tag, t, nb_bytes, N, d_each, d_start, d_end, NN
REAL(8) :: time_start, time_end
! VARIABLES RELATED TO MPI
INTEGER :: ierror ! returns error messages from the mpi subroutines
INTEGER :: rank ! identification number of each processor
INTEGER :: nproc ! number of processors
INTEGER, DIMENSION(mpi_status_size):: status
INTEGER(kind= MPI_OFFSET_KIND ) :: offset
INTEGER :: fh ! file handle
! EXECUTABLE
! INITIALIZE THE MPI ENVIRONMENT
CALL MPI_INIT(ierror) ! initialize MPI
CALL MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierror) ! obtain rank for each node
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierror) ! obtain the number of nodes
CALL MPI_TYPE_SIZE(MPI_REAL8,nb_bytes,ierror)
CALL MPI_FILE_OPEN (MPI_COMM_WORLD,"file.dat",MPI_MODE_RDWR+MPI_MODE_UNIQUE_OPEN,MPI_INFO_NULL,fh,ierror)
NN = 2048
DO d_each=1,NN
IF (d_each*nproc>=NN) EXIT
END DO
d_start = rank*d_each+1
d_end = MIN((rank+1)*d_each,NN)
DO t = d_start,d_end
! READING ONE THREAD AT A TIME
tag = 1
GO = 0
IF (rank .gt. 0) THEN
CALL MPI_RECV (GO,1,MPI_INTEGER,rank-1,tag, MPI_COMM_WORLD ,status,ierror)
ENDIF
time_start = MPI_WTIME()
i = 0
ST = 0
COUNT_IO = 0
DO WHILE ((i .lt. 100000) .AND. (ST .eq. 0))
i = i+1
offset = nb_bytes*(i-1)
CALL MPI_FILE_READ_AT (fh,offset,A,1,MPI_REAL8,status,ierror)
IF (status(1) .eq. 0) THEN
COUNT_IO = i
ST = 1
ELSE
COUNT_IO = 0
END IF
ENDDO
N = (COUNT_IO - 1)
IF (N .gt. 0) THEN
offset = 0
CALL MPI_FILE_READ_AT (fh,offset,B,1,MPI_REAL8,status,ierror)
ENDIF
time_end = MPI_WTIME()
PRINT *, 'My rank is', rank, 'Time for read =',time_end-time_start
GO = 1
IF (rank .lt. nproc-1) THEN
CALL MPI_SEND (GO,1, MPI_INTEGER ,rank+1,tag, MPI_COMM_WORLD ,ierror)
ENDIF
CALL MPI_BARRIER(MPI_COMM_WORLD,ierror)
! WRITING ONE THREAD AT A TIME
tag = 2
GO = 0
IF (rank .gt. 0) THEN
CALL MPI_RECV (GO,1,MPI_INTEGER,rank-1,tag, MPI_COMM_WORLD ,status,ierror)
ENDIF
time_start = MPI_WTIME()
i = 0
ST = 0
COUNT_IO = 0
DO WHILE ((i .lt. 100000) .AND. (ST .eq. 0))
i = i+1
offset = nb_bytes*(i-1)
CALL MPI_FILE_READ_AT (fh,offset,A,1,MPI_REAL8,status,ierror)
IF (status(1) .eq. 0) THEN
COUNT_IO = i
ST = 1
ELSE
COUNT_IO = 0
END IF
ENDDO
N = (COUNT_IO - 1)
offset = nb_bytes*N
CALL MPI_FILE_WRITE_AT (fh,offset,0.0D0,1,MPI_REAL8,status,ierror)
time_end = MPI_WTIME()
PRINT *, 'My rank is', rank, 'Time for write =',time_end-time_start
GO = 1
IF (rank .lt. nproc-1) THEN
CALL MPI_SEND (GO,1, MPI_INTEGER ,rank+1,tag, MPI_COMM_WORLD ,ierror)
ENDIF
CALL MPI_BARRIER(MPI_COMM_WORLD,ierror)
ENDDO
CALL MPI_FILE_CLOSE (fh,ierror)
CALL MPI_FINALIZE(ierror)
END PROGRAM MAIN
The main thing to realize here is that you can read in the data in one fell swoop (or, if memory is a problem, in chunks - but it can be in much larger chunks than individual doubles!) and that you don't need to skip to the end of the file one double at a time.
Here's an example which reads in the data in arbitrary chunk sizes, lets you process the data as you will, and appends some data (in this case, every rank just adds 4 copies of its rank to the end of the file). For simplicity, little Python scripts help with writing and displaying test data.
$ ./writedata.py
$ ./readdata.py
[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.
15. 16. 17. 18. 19. 20. 21. 22. 23. 24.]
$ mpirun -np 3 ./usepario
rank: 0 got data: 0.000... 24.000
rank: 1 got data: 0.000... 24.000
rank: 2 got data: 0.000... 24.000
$ ./readdata.py
[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.
15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 0. 0. 0. 0. 1.
1. 1. 1. 2. 2. 2. 2.]
usepario.f90:
module pario
contains
function openFile(filename)
use mpi
implicit none
integer :: openFile, ierr
character(len=*) :: filename
integer(MPI_OFFSET_KIND) :: off = 0
call MPI_File_open(MPI_COMM_WORLD, filename, &
ior(MPI_MODE_RDWR, MPI_MODE_UNIQUE_OPEN), &
MPI_INFO_NULL, openFile, ierr)
call MPI_File_set_view(openFile, off, &
MPI_DOUBLE_PRECISION, MPI_DOUBLE_PRECISION, &
"native", MPI_INFO_NULL, ierr)
end function openFile
subroutine closeFile(fh)
use mpi
implicit none
integer :: fh, ierr
call MPI_File_close(fh, ierr)
end subroutine closeFile
function filesizedoubles(fh)
use mpi
implicit none
integer :: fh, ierr
integer(MPI_OFFSET_KIND) :: filesize, filesizedoubles
integer :: dblsize
call MPI_File_get_size(fh, filesize, ierr)
call MPI_type_size(MPI_DOUBLE_PRECISION, dblsize, ierr)
filesizedoubles = filesize / dblsize
end function filesizedoubles
subroutine getdatablock(fh, blocksize, datablock, datasize)
use mpi
implicit none
integer :: fh, ierr
integer :: blocksize, datasize
double precision, dimension(:) :: datablock
integer(MPI_OFFSET_KIND) :: fileloc
integer, dimension(MPI_STATUS_SIZE) :: rstatus
! MPI_File_read is an independent (non-collective) read; you can also
! experiment with the collective MPI_File_read_all
call MPI_File_read(fh, datablock, blocksize, MPI_DOUBLE_PRECISION, &
rstatus, ierr)
call MPI_Get_count(rstatus, MPI_DOUBLE_PRECISION, datasize, ierr)
end subroutine getdatablock
subroutine eachappend(fh, filesize, numitems, newdata)
use mpi
implicit none
integer :: fh, numitems
integer(MPI_OFFSET_KIND) :: filesize
double precision, dimension(:) :: newdata
integer :: rank, ierr
integer(MPI_OFFSET_KIND) :: offset
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
offset = filesize + rank*numitems
call MPI_File_write_at_all(fh, offset, newdata, numitems, &
MPI_DOUBLE_PRECISION, &
MPI_STATUS_IGNORE, ierr)
end subroutine eachappend
end module pario
program usepario
use mpi
use pario
implicit none
integer :: fileh
integer, parameter :: bufsize=1000, newsize=4
integer(MPI_OFFSET_KIND) :: filesize
double precision, allocatable, dimension(:) :: curdata, newdata
integer :: datasize
integer :: rank, ierr
call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
allocate(curdata(bufsize))
fileh = openFile("data.dat")
filesize = filesizedoubles(fileh)
do
call getdatablock(fileh, bufsize, curdata, datasize)
!!
!! process data here
!!
!! do i=1,datasize
!! ...dostuff...
!! end do
!!
print '(1X,A,I3,A,F8.3,A,F8.3)', 'rank: ', rank, ' got data: ', curdata(1), '...', curdata(datasize)
if (datasize /= bufsize) exit
end do
deallocate(curdata)
allocate(newdata(newsize))
newdata = rank
call eachappend(fileh, filesize, newsize, newdata)
call closeFile(fileh)
call MPI_Finalize(ierr)
end program usepario
writedata.py:
#!/usr/bin/env python
import numpy
numdoubles = 25
data = numpy.arange(numdoubles,dtype=numpy.float64)
data.tofile("data.dat")
readdata.py:
#!/usr/bin/env python
import numpy
data = numpy.fromfile("data.dat",dtype=numpy.float64)
print(data)
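The two ideas in this answer (get the element count from the file size instead of probing one double at a time, and read in large blocks) can also be sketched without MPI. This is a serial illustration only, using the Python standard library; the helper names count_doubles, read_chunks and append_offset are hypothetical, and it assumes the file holds native doubles as written by writedata.py.

```python
import os
from array import array

DOUBLE_SIZE = array("d").itemsize  # bytes per native double


def count_doubles(path):
    """Number of doubles in the file, taken from its size (no reads needed)."""
    return os.path.getsize(path) // DOUBLE_SIZE


def read_chunks(path, chunk_items):
    """Yield lists of up to chunk_items doubles until the file is exhausted."""
    with open(path, "rb") as f:
        while True:
            raw = f.read(chunk_items * DOUBLE_SIZE)
            if not raw:
                break
            block = array("d")
            block.frombytes(raw)  # assumes the file size is a multiple of 8
            yield block.tolist()


def append_offset(n_existing, rank, items_per_rank):
    """Element offset where a rank appends: filesize + rank*numitems."""
    return n_existing + rank * items_per_rank
```

In the Fortran version the same count comes from MPI_File_get_size, and append_offset mirrors the offset computed in eachappend, so every rank writes to its own non-overlapping region.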
Normally this function should give me the values 1, 2, 3 or 4, but when I use it I get 0, 1 or 2. Could you help me find where the problem is?
function Vecteur_retour = var_Test(Test)
AA = Test;
var_Test = zeros(1,2000);
for i=3:1:2000
if AA(i)<=AA(i-1) && AA(i-1)<=AA(i-2)
var_Test(i)=1;
else
if AA(i)<=AA(i-1) && AA(i-1)>AA(i-2)
var_Test(i)=2;
if AA(i)>AA(i-1) && AA(i-1)<=AA(i-2)
var_Test(i)=3;
else
if AA(i)>AA(i-1) && AA(i-1)>AA(i-2)
var_Test(i)=4;
end
end
end
end
end
Vecteur_retour = var_Test;
Vector comparisons will be much faster:
var_Test = ones(1,2000);
delta_Test = diff(Test);
var_Test([0 0 delta_Test(1:end-1)] > 0) = 2;
var_Test([0 delta_Test] > 0) = var_Test([0 delta_Test] > 0) + 2;
var_Test(1:2) = 0;
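The vectorized lines above work because the four cases form a two-flag code: start at 1, add 1 if the previous step rose, and add 2 if the current step rose. A minimal sketch of that encoding, written in plain Python for illustration (the function name classify_by_flags is hypothetical):

```python
def classify_by_flags(values):
    """Code each position as 1 + (previous step rose) + 2 * (current step rose)."""
    codes = [0] * len(values)  # the first two positions stay 0
    for i in range(2, len(values)):
        rose_before = values[i - 1] > values[i - 2]  # adds 1
        rose_now = values[i] > values[i - 1]         # adds 2
        codes[i] = 1 + int(rose_before) + 2 * int(rose_now)
    return codes


print(classify_by_flags([5, 4, 3, 6, 7, 2, 8]))  # [0, 0, 1, 3, 4, 2, 3]
```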
Probably because you never reach the cases var_Test(i) = 3 or var_Test(i) = 4.
You have a problem with your if and end blocks. The way you have it, case 3 is only reached if case 2 is hit first, but these are contradictory.
You want code more like this:
function Vecteur_retour = var_Test(Test)
AA = Test;
var_Test = zeros(1,2000);
for i=3:1:2000
if AA(i)<=AA(i-1) && AA(i-1)<=AA(i-2)
var_Test(i)=1;
else
if AA(i)<=AA(i-1) && AA(i-1)>AA(i-2)
var_Test(i)=2;
else % you forgot this else
if AA(i)>AA(i-1) && AA(i-1)<=AA(i-2)
var_Test(i)=3;
else
if AA(i)>AA(i-1) && AA(i-1)>AA(i-2)
var_Test(i)=4;
end
end
end
end
end
Vecteur_retour = var_Test;
Careful indentation would have helped here.
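A flat chain of branches makes this kind of mistake harder to write, because no case can end up nested inside another. MATLAB has elseif for this; the sketch below shows the same shape in Python, with a hypothetical function name classify and 0-based indexing:

```python
def classify(values):
    """Return codes 1 to 4 per position from the last two steps; first two stay 0."""
    codes = [0] * len(values)
    for i in range(2, len(values)):
        not_rising_now = values[i] <= values[i - 1]
        not_rising_before = values[i - 1] <= values[i - 2]
        if not_rising_now and not_rising_before:
            codes[i] = 1
        elif not_rising_now:        # previous step rose
            codes[i] = 2
        elif not_rising_before:     # current step rose, previous did not
            codes[i] = 3
        else:                       # both steps rose
            codes[i] = 4
    return codes


print(classify([5, 4, 3, 6, 7, 2, 8]))  # [0, 0, 1, 3, 4, 2, 3]
```

Each branch is tested only when all earlier ones failed, so the four cases are mutually exclusive by construction.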