Reading a text file as a input and then writing text files with the data in 2D matrices as output - fortran90

I am the real novice in FORTRAN90, of course in any other programming language.
However, I need to make a program for my work.
So I am asking any comment from anybody.
Thanks in advance.
From now on, I will explain my problem.
The original data is just following as;
15 1r 2 1r 70 22r 2 2r 15 1r 2 1r 8 8r 15 1 3r
This is could be written again in another way like below;
15 15 2 2 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 2 2 2 15 15 2 2
8 8 8 8 8 8 8 8 8 15 1 1 1 1
So the total number of data elements in the above example can be counted 48.
The total number of the original data is 12,113,640.
And then the original data was made of 220(z) slices of 322(x) x 171(y).
Each slices has 55,062 of elements.
If it is possible, just like the way of count of the element number, the program count by 55,062 elements and then make output file(txt) and then totally 220 slices of the data should be 2D array.
So I have to extract 220 slices from the one whole data file. And also the data should be 2D array in each slice.

It will be something like this - I assume you know how to read lines from the file
program main
character(len=120):: compressed
character ch
integer value, rval, repeat, ix, complen
! Pretend this is the line read from the file
compressed = '15 1r 2 1r 70 22r 2 2r 15 1r 2 1r 8 8r 15 1 3r'
complen = len(trim(compressed))
ix = 1
do while (ix .lt. complen)
! Skip spaces
do
ch = compressed(ix:ix)
if (ch .ne. ' ') exit
ix = ix + 1
end do
! Extract the number
value = 0
do
ch = compressed(ix:ix)
if (ch .ge. '0' .and. ch .le. '9') then
value = value * 10 + ichar(ch) - ichar('0')
else
exit
end if
ix = ix + 1
if (ix .gt. complen) exit
end do
! is it a repeater
repeat = 1
if (ch .eq. 'r') then
! reduce by 1 since one has already been printed
repeat = value - 1
ix = ix + 1
else
rval = value
end if
! assume there won't be more than 100 repeats
write(*, '(100I3)', advance='no') (rval, ii = 1, repeat)
end do
stop
end program

Related

Dividing a matrix into two parts

I am trying to classify my dataset. To do this, I will use the 4th column of my dataset. If the 4th column of the dataset is equal to 1, that row will added in new matrix called Q1. If the 4th column of the dataset is equal to 2, that row will be added to matrix Q2.
My code:
i = input('Enter a start row: ');
j = input('Enter a end row: ');
search = importfiledataset('search-queries-features.csv',i,j);
[n, p] = size(search);
if j>n
disp('Please enter a smaller number!');
end
for s = i:j
class_id = search(s,4);
if class_id == 1
Q1 = search(s,1:4)
elseif class_id ==2
Q2 = search(s,1:4)
end
end
This calculates the Q1 and Q2 matrices, but they all are 1x4 and when it gives new Q1 the old one is deleted. I need to add new row and make it 2x4 if conditions are true. I need to expand my Q1 matrix.
Briefly I am trying to divide my dataset into two parts using for loops and if statements.
Dataset:
I need outcome like:
Q1 = [30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 58 30 1
34 60 1 1
34 61 10 1]
Q2 = [34 59 0 2
34 66 9 2]
How can I prevent my code from deleting previous rows of Q1 and Q2 and obtain the entire matrices?
The main problem in your calculation is that you overwrite Q1 and Q2 each loop iteration. Best solution: get rid of the loops and use logical indexing.
You can use logical indexing to quickly determine where a column is equal to 1 or 2:
search = [
30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 59 0 2
34 66 9 2
34 58 30 1
34 60 1 1
34 61 10 1
];
Q1 = search(search(:,4)==1,:) % == compares each entry in the fourth column to 1
Q2 = search(search(:,4)==2,:)
Q1 =
30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 58 30 1
34 60 1 1
34 61 10 1
Q2 =
34 59 0 2
34 66 9 2
Warning: Slow solution!
If you are hell bent on using loops, make sure to not overwrite your variables. Either extend them each iteration (which is very, very slow):
Q1=[];
Q2=[];
for ii = 1:size(search,1) % loop over all rows
if search(ii,4)==1
Q1 = [Q1;search(ii,:)];
end
if search(ii,4)==2
Q2 = [Q2;search(ii,:)];
end
end
MATLAB will put orange wiggles beneath Q1 and Q2, because it's a bad idea to grow arrays in-place. Alternatively, you can preallocate them as large as search and strip off the excess:
Q1 = zeros(size(search)); % Initialise to be as large as search
Q2 = zeros(size(search));
Q1kk = 1; % Intialiase counters
Q2kk = 1;
for ii = 1:size(search,1) % loop over all rows
if search(ii,4)==1
Q1(Q1kk,:) = search(ii,:); % store
Q1kk = Q1kk + 1; % Increase row counter
end
if search(ii,4)==2
Q2(Q2kk,:) = search(ii,:);
Q2kk = Q2kk + 1;
end
end
Q1 = Q1(1:Q1kk-1,:); % strip off excess rows
Q2 = Q2(1:Q2kk-1,:);
Another option using accumarray, if Q is your original matrix:
Q = accumarray(Q(:,4),1:size(Q,1),[],#(x){Q(x,:)});
You can access the result with Q{1} (for class_id = 1), Q{2} (for class_id = 2) and so on...

All unique multiplication products

I'd like to obtain all unique products for a given vector.
For example, given a:
a = [4,10,12,3,6]
I want to obtain a matrix that contains the results of:
4*10
4*12
4*3
4*6
10*12
10*3
10*6
12*3
12*6
3*6
Is there a short and/or quick way of doing this in MATLAB?
EDIT: a may contain duplicate numbers, giving duplicate products - and these must be kept.
Given:
a =
4 10 12 3 6
Construct the matrix of all pairwise products:
>> all_products = a .* a.'
all_products =
16 40 48 12 24
40 100 120 30 60
48 120 144 36 72
12 30 36 9 18
24 60 72 18 36
Now, construct a mask to keep only those values below the main diagonal:
>> mask = tril(true(size(all_products)), -1)
mask =
0 0 0 0 0
1 0 0 0 0
1 1 0 0 0
1 1 1 0 0
1 1 1 1 0
and apply the mask to the product matrix:
>> unique_products = all_products(mask)
unique_products =
40
48
12
24
120
30
60
36
72
18
If you have the Statistics Toolbox, you can abuse pdist, which considers only one of the two possible orders for each pair:
result = pdist(a(:), #times);
One option involves nchoosek, which returns all combinations of k elements out of a vector, each row is one combination. prod computes the product of rows or columns:
a = [4,10,12,3,6];
b = nchoosek(a,2);
b = prod(b,2); % 2 indicates rows
Try starting with this. Have the unique function filter out the result of multiplying a by itself.
b = unique(a*a')

Matlab isn't incrementing my variable

I have the following matrix declared in Matlab:
EmployeeData =
1 20 100000 42 14
2 15 95000 35 14
3 18 70000 28 14
4 10 85000 35 14
5 10 40000 21 12
6 4 45000 14 8
7 3 50000 21 10
8 5 55000 21 14
9 1 25000 14 7
10 2 50000 21 9
42 4 100000 42 10
Where column 1 represents ID numbers, 2 represents years, 3 is salary, 4 is vacation days, and 5 is sick days. I am trying to find the maximum value of a column (in this case the salary column), and print out the ID associated with that value. If more than one employee has the maximum value, all the IDs with that maximum are supposed to be shown. So here is how I naively implemented a way to do it:
>> maxVal = [];
>> j = 1;
>> for i = EmployeeData(:, 3)
if i == max(EmployeeData(:, 3))
maxVal = [maxVal EmployeeData(j, 1)];
end
j = j + 1;
end
But it shows maxVal to be [] in my workspace variables, instead of [1 42] as I expected. Upon inserting a disp(i) in the for loop above the if to debug, I get the following output:
100000
95000
70000
85000
40000
45000
50000
55000
25000
50000
Just like I expected. But when I switch out that disp(i) with a disp(j), I get this for my output:
1
What am I doing wrong? Should this not work?
MATLAB for loops operate on rows, not columns. You should try replacing your for loop with:
for i = EmployeeData(:, 3)' % NOTE THE TRANSPOSE
...
end
EDIT: Note that you can do what you're trying to do without a forloop:
maxVal = EmployeeData(EmployeeData(:,3) == max(EmployeeData(:,3)),1);
Is this what you want?
>> EmployeeData(EmployeeData(:,3)==max(EmployeeData(:,3)),1)
ans =
1
42

Find closest matching distances for a set of points in a distance matrix in Matlab

I have a matrix of measured angles between M planes
0 52 77 79
52 0 10 14
77 10 0 3
79 14 3 0
I have a list of known angles between planes, which is an N-by-N matrix which I name rho. Here's is a subset of it (it's too large to display):
0 51 68 75 78 81 82
51 0 17 24 28 30 32
68 17 0 7 11 13 15
75 24 7 0 4 6 8
78 28 11 4 0 2 4
81 30 13 6 2 0 2
82 32 15 8 4 2 0
My mission is to find the set of M planes whose angles in rho are nearest to the measured angles.
For example, the measured angles for the planes shown above are relatively close to the known angles between planes 1, 2, 4 and 6.
Put differently, I need to find a set of points in a distance matrix (which uses cosine-related distances) which matches a set of distances I measured. This can also be thought of as matching a pattern to a mold.
In my problem, I have M=5 and N=415.
I really tried to get my head around it but have run out of time. So currently I'm using the simplest method: iterating over every possible combination of 3 planes but this is slow and currently written only for M=3. I then return a list of matching planes sorted by a matching score:
function [scores] = which_zones(rho, angles)
N = size(rho,1);
scores = zeros(N^3, 4);
index = 1;
for i=1:N-2
for j=(i+1):N-1
for k=(j+1):N
found_angles = [rho(i,j) rho(i,k) rho(j,k)];
score = sqrt(sum((found_angles-angles).^2));
scores(index,:)=[score i j k];
index = index + 1;
end
end;
end
scores=scores(1:(index-1),:); % was too lazy to pre-calculate #
scores=sortrows(scores, 1);
end
I have a feeling pdist2 might help but not sure how. I would appreciate any help in figuring this out.
There is http://www.mathworks.nl/help/matlab/ref/dsearchn.html for closest point search, but that requires same dimensionality. I think you have to bruteforce find it anyway because it's just a special problem.
Here's a way to bruteforce iterate over all unique combinations of the second matrix and calculate the score, after that you can find the one with the minimum score.
A=[ 0 52 77 79;
52 0 10 14;
77 10 0 3;
79 14 3 0];
B=[ 0 51 68 75 78 81 82;
51 0 17 24 28 30 32;
68 17 0 7 11 13 15;
75 24 7 0 4 6 8;
78 28 11 4 0 2 4;
81 30 13 6 2 0 2;
82 32 15 8 4 2 0];
M = size(A,1);
N = size(B,1);
% find all unique permutations of `1:M`
idx = nchoosek(1:N,M);
K = size(idx,1); % number of combinations = valid candidates for matching A
score = NaN(K,1);
idx_triu = triu(true(M,M),1);
Atriu = A(idx_triu);
for ii=1:K
partB = B(idx(ii,:),idx(ii,:));
partB_triu = partB(idx_triu);
score = norm(Atriu-partB_triu,2);
end
[~, best_match_idx] = min(score);
best_match = idx(best_match_idx,:);
The solution of your example actually is [1 2 3 4], so the upperleft part of B and not [1 2 4 6].
This would theoretically solve your problem, and I don't know how to make this algorithm any faster. But it will still be slow for large numbers. For example for your case of M=5 and N=415, there are 100 128 170 583 combinations of B which are a possible solution; just generating the selector indices is impossible in 32-bit because you can't address them all.
I think the real optimization here lies in cutting away some of the planes in the NxN matrix in a preceding filtering part.

How do I select n elements of a sequence in windows of m ? (matlab)

Quick MATLAB question.
What would be the best/most efficient way to select a certain number of elements, 'n' in windows of 'm'. In other words, I want to select the first 50 elements of a sequence, then elements 10-60, then elements 20-70 ect.
Right now, my sequence is in vector format(but this can easily be changed).
EDIT:
The sequences that I am dealing with are too long to be stored in my RAM. I need to be able to create the windows, and then call upon the window that I want to analyze/preform another command on.
Do you have enough RAM to store a 50-by-nWindow array in memory? In that case, you can generate your windows in one go, and then apply your processing on each column
%# idxMatrix has 1:50 in first col, 11:60 in second col etc
idxMatrix = bsxfun(#plus,(1:50)',0:10:length(yourVector)-50); %'#
%# reshapedData is a 50-by-numberOfWindows array
reshapedData = yourVector(idxMatrix);
%# now you can do processing on each column, e.g.
maximumOfEachWindow = max(reshapedData,[],1);
To complement Kerrek's answer: if you want to do it in a loop, you can use something like
n = 50
m = 10;
for i=1:m:length(v)
w = v(i:i+n);
% Do something with w
end
There's a slight issue with the description of your problem. You say that you want "to select the first 50 elements of a sequence, then elements 10-60..."; however, this would translate to selecting elements:
1-50
10-60
20-70
etc.
That first sequence should be 0-10 to fit the pattern which of course in MATLAB would not make sense since arrays use one-indexing. To address this, the algorithm below uses a variable called startIndex to indicate which element to start the sequence sampling from.
You could accomplish this in a vectorized way by constructing an index array. Create a vector consisting of the starting indices of each sequence. For reuse sake, I put the length of the sequence, the step size between sequence starts, and the start of the last sequence as variables. In the example you describe, the length of the sequence should be 50, the step size should be 10 and the start of the last sequence depends on the size of the input data and your needs.
>> startIndex = 10;
>> sequenceSize = 5;
>> finalSequenceStart = 20;
Create some sample data:
>> sampleData = randi(100, 1, 28)
sampleData =
Columns 1 through 18
8 53 10 82 82 73 15 66 52 98 65 81 46 44 83 9 14 18
Columns 19 through 28
40 84 81 7 40 53 42 66 63 30
Create a vector of the start indices of the sequences:
>> sequenceStart = startIndex:sequenceSize:finalSequenceStart
sequenceStart =
10 15 20
Create an array of indices to index into the data array:
>> index = cumsum(ones(sequenceSize, length(sequenceStart)))
index =
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
>> index = index + repmat(sequenceStart, sequenceSize, 1) - 1
index =
10 15 20
11 16 21
12 17 22
13 18 23
14 19 24
Finally, use this index array to reference the data array:
>> sampleData(index)
ans =
98 83 84
65 9 81
81 14 7
46 18 40
44 40 53
Use (start : step : end) indexing: v(1:1:50), v(10:1:60), etc. If the step is 1, you can omit it: v(1:50).
Consider the following vectorized code:
x = 1:100; %# an example sequence of numbers
nwind = 50; %# window size
noverlap = 40; %# number of overlapping elements
nx = length(x); %# length of sequence
ncol = fix((nx-noverlap)/(nwind-noverlap)); %# number of sliding windows
colindex = 1 + (0:(ncol-1))*(nwind-noverlap); %# starting index of each
%# indices to put sequence into columns with the proper offset
idx = bsxfun(#plus, (1:nwind)', colindex)-1; %'
%# apply the indices on the sequence
slidingWindows = x(idx)
The result (truncated for brevity):
slidingWindows =
1 11 21 31 41 51
2 12 22 32 42 52
3 13 23 33 43 53
...
48 58 68 78 88 98
49 59 69 79 89 99
50 60 70 80 90 100
In fact, the code was adapted from the now deprecated SPECGRAM function from the Signal Processing Toolbox (just do edit specgram.m to see the code).
I omitted parts that zero-pad the sequence in case the sliding windows do not evenly divide the entire sequence (for example x=1:105), but you can easily add them again if you need that functionality...