csvwrite formatting (strings & overwriting) problems, Matlab

I have two arrays: data1 is a header cell array of strings and data2 is the numeric data array.
data1 = {'#','Area','C Xp','C Yp','Length','B #','R','L','Ch','E1 Xp','E1 Yp','E2 Xp','E2 Yp'};
data2 = [1 939 -397 586 99 2 2 0 -1 -450 588 -352 572
2 1185 -287 294 145 2 1 1 0 -317 359 -235 244
3 592 -242 486 77 3 2 1 0 -278 488 -202 477
4 818 -144 480 60 2 0 2 1 -181 488 -135 451
5 377 -23 -443 37 1 0 1 0 -42 -459 -12 -460
6 923 32 -234 67 1 0 0 0 -3 -260 60 -212
7 812 150 -148 54 1 0 1 0 136 -130 169 -161
8 5968 428 432 402 3 3 0 -1 224 468 622 356
9 617 714 13 63 1 0 1 0 687 35 702 -22];
csvwrite('file.xlsx', data1, 0, 0);
csvwrite('file.xlsx', data2, 0, 1);
My first problem is data1 prints to the spreadsheet as an array of chars (example: '#','A','r','e','... each in their own cells). How do I get it to print as the strings I am passing?
My second problem is when I csvwrite data2, data1's info is erased or overwritten. How can I write both to the same file?

Hidden away in the Tips section of the csvwrite documentation:
csvwrite does not accept cell arrays for the input matrix M. To export a cell array that contains only numeric data, use cell2mat to convert the cell array to a numeric matrix before calling csvwrite. To export cell arrays with mixed alphabetic and numeric data, where each cell contains a single element, you can create an Excel® spreadsheet (if your system has Excel installed) using xlswrite. For all other cases, you must use low-level export functions to write your data. For more information, see Export Cell Array to Text File in the MATLAB® Data Import and Export documentation.
I'd say, use xlswrite.
If you can't use xlswrite, it looks like you are stuck doing it manually as described on the page, Export Cell Array to Text File. Something along the lines of:
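% write headers (last entry separately, so the row has no trailing comma)
fid = fopen('test.csv','w');
fprintf(fid,'%s,',data1{1:end-1});
fprintf(fid,'%s\n',data1{end});
% write data: fprintf consumes values column by column, so pass the transpose
fprintf(fid,[repmat('%d,',1,size(data2,2)-1) '%d\n'],data2.');
fclose(fid);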
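For the xlswrite route suggested above, one common pattern is to merge the header and the numbers into a single cell array and write it in one call, which also sidesteps the overwriting problem. This is only a minimal sketch: it assumes Excel is available (as the quoted tip says, xlswrite needs it for mixed cell arrays), it uses a shortened version of the data, and the variable name combined is made up for illustration.

```matlab
% Shortened stand-ins for the header and data arrays from the question
data1 = {'#','Area','C Xp'};
data2 = [1 939 -397
         2 1185 -287];

% num2cell puts every number in its own cell so it can sit below the header row
combined = [data1; num2cell(data2)];

% One call writes header and data together to the first sheet
% (assumption: Excel is installed, otherwise mixed cell arrays are not supported)
xlswrite('file.xlsx', combined);
```

Because everything goes out in a single call, there is no second write that could replace the header.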

Plot selected rows with the average and standard deviation (GNUPlot)

I have a csv file with experiment results that goes like this:
64 4 8 1 1 2 1 ttt 62391 4055430 333 0.0001 10 161 108 288 0
64 4 8 1 1 2 1 ttt 60966 3962810 322 0.0001 10 164 112 295 0
64 4 8 1 1 2 1 ttt 61530 3999475 325 0.0001 10 162 112 291 0
64 4 8 1 1 2 1 ttt 61430 4054428 332 0.0001 10 158 110 286 0
64 4 8 1 1 2 1 ttt 63891 4152938 339 0.0001 9 149 109 274 0
64 4 32 1 1 2 1 ttt 63699 4204182 345 0.0001 4 43 179 240 0
64 4 32 1 1 2 1 ttt 63326 4116218 336 0.0001 4 45 183 248 0
64 4 32 1 1 2 1 ttt 62654 4135211 340 0.0001 4 48 178 248 0
64 4 32 1 1 2 1 ttt 63192 4107506 339 0.0001 4 49 175 245 0
64 4 32 1 1 2 1 ttt 62707 4138666 345 0.0001 4 46 179 245 0
64 4 64 1 1 2 1 ttt 60968 3962929 323 0.0001 4 46 191 256 0
64 4 64 1 1 2 1 ttt 58765 3819787 305 0.0001 4 50 196 267 0
64 4 64 1 1 2 1 ttt 58946 3831499 308 0.0001 5 52 187 260 0
64 4 64 1 1 2 1 ttt 60646 3942047 321 0.0001 4 47 187 254 0
64 4 64 1 1 2 1 ttt 59723 3882044 311 0.0001 4 46 201 269 0
64 8 8 1 1 2 1 ttt 63414 4185382 382 0.0001 33 517 109 643 0
64 8 8 1 1 2 1 ttt 62429 4057899 372 0.0001 33 538 110 667 0
64 8 8 1 1 2 1 ttt 60622 3940452 384 0.0001 33 556 115 689 0
64 8 8 1 1 2 1 ttt 64433 4188192 369 0.0001 33 519 110 644 0
My goal is to plot selectable combinations of the columns before "ttt" (in separate charts), showing the average and standard deviation of selectable columns after "ttt", with the rows grouped by the values of the columns before "ttt".
Is this possible in GNUPlot and if yes how? If not, do you have any alternate suggestions regarding my problem?
Here is a completely revised and more general version.
Since you want to filter by 3 columns you need to have 3 properties to distinguish the data in the plot. This would be for example color, x-position and pointtype. What the script basically does:
Generates random data for testing (take your file instead)
$Data looks like this:
8 64 57773 0
4 32 64721 2
8 32 56757 1
4 16 56226 2
8 8 56055 1
8 64 59874 0
8 32 58733 0
4 16 55525 2
8 32 58869 0
8 64 64470 0
4 32 60930 1
8 64 57073 2
...
The variables ColX, ColC, ColP, and ColS define which columns are used for x-position, color, pointtype and statistics.
Find the unique values of ColX, ColC and ColP (check help smooth frequency) and put them into the datablocks $ColX, $ColC, and $ColP.
Put the unique values into the arrays ArrX, ArrC, ArrP.
Loop over all possible combinations, do statistics on ColS and write the results to $Data2, with 3 extra columns at the beginning for color, x-position and pointtype.
$Data2 looks like this:
1 1 1 0 8 4 61639.4 2788.4
1 1 2 0 8 8 59282.1 2740.2
1 2 1 0 16 4 59372.3 2808.6
1 2 2 0 16 8 60502.3 2825.0
1 3 1 0 32 4 59850.7 2603.8
1 3 2 0 32 8 60617.7 1979.8
1 4 1 0 64 4 60399.4 3273.6
1 4 2 0 64 8 59930.7 2919.8
2 1 1 1 8 4 59172.6 2288.2
2 1 2 1 8 8 58992.2 2888.0
2 2 1 1 16 4 59350.1 2364.6
2 2 2 1 16 8 61034.0 2368.5
2 3 1 1 32 4 59920.8 2867.6
2 3 2 1 32 8 59711.9 3464.2
2 4 1 1 64 4 60936.7 3439.7
2 4 2 1 64 8 61078.7 2349.3
3 1 1 2 8 4 58976.0 2376.3
3 1 2 2 8 8 61731.5 1635.7
3 2 1 2 16 4 58276.0 2101.7
3 2 2 2 16 8 58594.5 3358.5
3 3 1 2 32 4 60471.5 3737.6
3 3 2 2 32 8 59909.1 2024.0
3 4 1 2 64 4 62044.2 1446.7
3 4 2 2 64 8 60454.0 3215.1
Finally, plot the data. I couldn't figure out how the plotting style with yerror works properly together with variable pointtypes. So, instead I split it into two plot commands, one with vectors and one with points. The third one, keyentry, is just to get an empty line in the legend, and the fourth one is to get the pointtype into the legend.
I hope you can figure out all the other details and adapt it to your data.
Code:
### grouped statistics on filtered (unsorted) data
reset session
set colorsequence classic
# generate some random test data
rand1(n) = 2**(int(rand(0)*2)+2) # values 4,8
rand2(n) = 2**(int(rand(0)*4)+3) # values 8,16,32,64
rand3(n) = int(rand(0)*10000)+55000 # values 55000 to 65000
rand4(n) = int(rand(0)*3) # values 0,1,2
set print $Data
do for [i=1:200] {
print sprintf("% 3d% 4d% 7d% 3d", rand1(0), rand2(0), rand3(0), rand4(0))
}
set print
print $Data # (just for test purpose)
ColX = 2 # column for x
ColC = 4 # column for color
ColP = 1 # column for pointtype
ColS = 3 # column for statistics
# get unique values of the columns
set table $ColX
plot $Data u (column(ColX)) smooth freq
unset table
set table $ColC
plot $Data u (column(ColC)) smooth freq
unset table
set table $ColP
plot $Data u (column(ColP)) smooth freq
unset table
# put unique values into arrays
set table $Dummy
array ArrX[|$ColX|-6] # gnuplot creates 6 extra lines
array ArrC[|$ColC|-6]
array ArrP[|$ColP|-6]
plot $ColX u (ArrX[$0+1]=$1)
plot $ColC u (ArrC[$0+1]=$1)
plot $ColP u (ArrP[$0+1]=$1)
unset table
print ArrX, ArrC, ArrP # just for test purpose
# define filter function
Filter(c,x,p) = ArrX[x]==column(ColX) && ArrC[c]==column(ColC) && \
ArrP[p]==column(ColP) ? column(ColS) : NaN
# loop all values and do statistics, write data into $Data2
set print $Data2
do for [c=1:|ArrC|] {
    do for [x=1:|ArrX|] {
        do for [p=1:|ArrP|] {
            undef var STATS*
            stats $Data u (Filter(c,x,p)) nooutput
            if (exists('STATS_mean') && exists('STATS_stddev')) {
                print sprintf("% 3d% 3d% 3d% 3d% 3d% 3d% 9.1f % 7.1f", c, x, p, ArrC[c], ArrX[x], ArrP[p], STATS_mean, STATS_stddev)
            }
        }
    }
    print ""; print ""
}
set print
# print $Data2 # just for testing purpose
set xlabel sprintf("Column %d", ColX)
set ylabel sprintf("Column %d", ColS)
set xrange[0.5:|ArrX|+1]
set xtics () # remove all xtics
do for [x=1:|ArrX|] { set xtics add (sprintf("%d",ArrX[x]) x)} # set xtics "manually"
# function for x position and offsets,
# actually not dependent on 'n' but to shorten plot command
# columns in $Data2: 1=color, 2=x, 3=pointtype
width = 0.5 # float number!
step = width/(|ArrC|-1)
PosX(n) = column(2) - width/2.0 + step*(column(1)-1) + (column(3)-1)*step*0.3
plot \
for [c=1:|ArrC|] $Data2 u (PosX(0)):($7-$8):(0):(2*$8) index c-1 w vectors \
heads size 0.04,90 lw 2 lc c ti sprintf("%g",ArrC[c]),\
for [c=1:|ArrC|] '' u (PosX(0)):7:($3*2+4):(c) index c-1 w p ps 1.5 pt var lc var not, \
keyentry w p ps 0 ti "\n", \
for [p=1:|ArrP|] '' u (0):(NaN) w p pt p*2+4 ps 1.5 lc rgb "black" ti sprintf("%g",ArrP[p])
### end of code
I do not think gnuplot can produce exactly what you are asking for in a single plot command. I will show you two alternatives in the hope that one or both is a useful starting point.
Alternative 1: standard boxplot
spacing = 1.0
width = 0.25
unset key
set xlabel "Column 3"
set ylabel "Column 9"
plot 'data' using (spacing):9:(width):3 with boxplot lw 2
This collects points based on the value in column 3 and produces a boxplot for each such value. This is a widely used method of showing the distribution of point values in a category, but it is a quartile analysis, not a display of mean + standard deviation.
Alternative 2: calculate mean and standard deviation for categories known in advance
The boxplot analysis has the advantage that you do not need to know in advance what values may be present in column 3. Gnuplot can calculate mean and standard deviation based on a column 3 value, but you need to specify in advance what that value is. Here is a set of commands tailored to the specific example file you provided. It calculates, but does not plot, the requested categorical mean and standard deviation. You can use these numbers to construct a plot, but that will require additional commands. You could, for example, save the values for each category in a new file, or array, or datablock and then go back and plot these together.
col3entry = "8 32 64"
do for [i in col3entry] {
stats "data" using ($3 == real(i) ? $9 : NaN) name "Condition".i nooutput
print i, ": ", value("Condition".i."_mean"), value("Condition".i."_stddev")
}
output:
8: 62345.1111111111 1259.34784220021
32: 63115.6 392.552977316438
64: 59809.6 881.583711283279
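The numbers printed above can be cross-checked outside gnuplot. The following MATLAB sketch is only an illustration and not part of either answer: it groups column 9 of the sample rows by column 3 and uses the population standard deviation (std with a second argument of 1), which is what reproduces the values printed above. All variable names are made up for illustration.

```matlab
% Columns 3 and 9 of the sample rows from the question
col3 = [8 8 8 8 8 32 32 32 32 32 64 64 64 64 64 8 8 8 8].';
col9 = [62391 60966 61530 61430 63891 63699 63326 62654 63192 62707 ...
        60968 58765 58946 60646 59723 63414 62429 60622 64433].';

[groups, ~, idx] = unique(col3);                     % idx maps each row to its group
means = accumarray(idx, col9, [], @mean);            % per-group mean
stds  = accumarray(idx, col9, [], @(v) std(v, 1));   % per-group population std

% Print one line per group: value of column 3, mean, standard deviation
for k = 1:numel(groups)
    fprintf('%d: %.4f %.4f\n', groups(k), means(k), stds(k));
end
```

Unlike the gnuplot loop, unique discovers the categories itself, so they do not have to be listed in advance.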

Are graycomatrix's NumLevels and GrayLimits the same thing? (MATLAB)

I've been looking at implementing GLCM within MATLAB using graycomatrix. There are two arguments that I have discovered (NumLevels and GrayLimits), but in my research and implementation they seem to achieve the same result.
GrayLimits specifies bins within a range [low high], giving a restricted set of gray levels.
NumLevels declares the number of gray levels in an image.
Could someone please explain the difference between these two arguments, as I don't understand why there would be two arguments that achieve the same result.
From the documentation:
'GrayLimits': Range used scaling input image into gray levels, specified as a two-element vector [low high]. If N is the number of gray levels (see parameter 'NumLevels') to use for scaling, the range [low high] is divided into N equal width bins and values in a bin get mapped to a single gray level.
'NumLevels': Number of gray levels, specified as an integer.
Thus the first parameter sets the input gray level range to be used (defaults to the min and max values in the image), and the second parameter sets the number of unique gray levels considered (and thus the size of the output matrix, defaults to 8, or 2 for binary images).
For example:
>> graycomatrix(img,'NumLevels',8,'GrayLimits',[0,255])
ans =
17687 1587 81 31 7 0 0 0
1498 7347 1566 399 105 8 0 0
62 1690 3891 1546 298 38 1 0
12 335 1645 4388 1320 145 4 0
2 76 305 1349 4894 959 18 0
0 16 40 135 965 7567 415 0
0 0 0 2 15 421 2410 0
0 0 0 0 0 0 0 0
>> graycomatrix(img,'NumLevels',8,'GrayLimits',[0,127])
ans =
1 9 0 0 0 0 0 0
7 17670 1431 156 50 31 23 15
1 1369 3765 970 350 142 84 92
0 128 1037 1575 750 324 169 167
0 46 361 836 1218 747 335 260
0 16 163 330 772 1154 741 547
0 10 74 150 370 787 1353 1208
0 4 67 136 294 539 1247 21199
>> graycomatrix(img,'NumLevels',4,'GrayLimits',[0,255])
ans =
28119 2077 120 0
2099 11470 1801 5
94 1829 14385 433
0 2 436 2410
As you can see, these parameters modify the output in different ways:
In the first case above, the range [0,255] was mapped to columns/rows 1-8, putting 32 different input grey values into each.
In the second case, the smaller range [0,127] was mapped to 8 indices, putting 16 different input grey values into each, and putting the remaining grey values 128-255 into the 8th index.
In the third case, the range [0,255] was mapped to 4 indices, putting 64 different input grey values into each.
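The bin arithmetic in these three cases can be reproduced without the toolbox. The following is only a rough model of the scaling: scaleToLevels is a made-up helper, and the exact rounding used inside graycomatrix is an assumption here, so values sitting right on a bin edge may land differently in the real function.

```matlab
% Hypothetical helper (not a toolbox function): map intensities to level indices 1..N.
% Values outside [low high] are clipped to the first or last level.
scaleToLevels = @(I, limits, N) ...
    min(max(floor((double(I) - limits(1)) ./ (limits(2) - limits(1)) .* N) + 1, 1), N);

vals = [0 31 32 127 128 255];            % a few sample gray values

a = scaleToLevels(vals, [0 255], 8);     % first case:  8 levels over 0..255
b = scaleToLevels(vals, [0 127], 8);     % second case: 8 levels over 0..127
c = scaleToLevels(vals, [0 255], 4);     % third case:  4 levels over 0..255

disp([vals; a; b; c]);
```

In this model, GrayLimits changes where the bin edges fall and what gets clipped, while NumLevels changes how many bins there are. If the Image Processing Toolbox is available, the second output of graycomatrix returns the scaled image that was actually used, which is the authoritative way to check the mapping.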

Read a big data file with headlines into a matrix

I have a file that looks like this (with real data and much bigger):
A B C D E F G H I
1 105.28 1 22 84 2 10.55 21 2
2 357.01 0 32 34 1 11.43 28 1
3 150.23 3 78 22 0 12.02 11 0
4 357.01 0 32 34 1 11.43 28 1
5 357.01 0 32 34 1 11.43 28 1
6 357.01 0 32 34 1 11.43 28 1
...
17000 357.01 0 32 34 1 11.43 28 1
I want to import all the numerical values into a matrix, skipping the header line. For that purpose I use this code:
Filename = 'test.txt';
A = dlmread(Filename,' ',1,0); %Imports the whole data into a matrix
The problem is that A comes out as a 17000 x 1 vector instead of a matrix with several columns. If I manually edit the data file to remove the header line and run just this, it works:
A = dlmread(Filename); %Imports the whole data into a matrix
But I would prefer not to do this, since the header line is used later on in the code. Any advice on how to get this to work?
edit: solved by using
' '
instead of just
' '
Use the import tool.
Make sure you choose the data.
Generate script.
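If a scripted route is preferred over the import tool, one option is to read the header line separately and let textscan parse the rest, which keeps the column names available for later use. A minimal sketch, assuming whitespace-separated numeric columns; the demo file name and its contents are made up for illustration.

```matlab
% Create a small demo file (a stand-in for the real test.txt)
Filename = 'test_demo.txt';
fid = fopen(Filename, 'w');
fprintf(fid, 'A B C\n1 105.28 1\n2 357.01 0\n3 150.23 3\n');
fclose(fid);

fid = fopen(Filename, 'r');
headers = strsplit(strtrim(fgetl(fid)));         % column names from the first line
fmt = repmat('%f', 1, numel(headers));           % one numeric field per column
C = textscan(fid, fmt, 'CollectOutput', true);   % any amount of whitespace separates fields
fclose(fid);

A = C{1};                                        % numeric matrix, one row per data line
```

Because textscan treats runs of whitespace as a single separator by default, this does not depend on exactly how many spaces sit between the columns.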

Matlab: replace consecutive zero values with other values

I have this matrix:
A = [92 92 92 91 91 91 146 146 146 0
0 0 112 112 112 127 127 127 35 35
16 16 121 121 121 55 55 55 148 148
0 0 0 96 96 0 0 0 0 0
0 16 16 16 140 140 140 0 0 0]
How can I replace the consecutive zero values with shuffled consecutive values from matrix B?
B = [3 3 3 5 5 6 6 2 2 2 7 7 7]
The required result is some matrix like this:
A = [92 92 92 91 91 91 146 146 146 0
6 6 112 112 112 127 127 127 35 35
16 16 121 121 121 55 55 55 148 148
7 7 7 96 96 5 5 3 3 3
0 16 16 16 140 140 140 2 2 2]
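You can simply do it like this (note that this fills each zero individually and does not keep the values of B grouped into runs):
[M,N]=size(A);
for i=1:M
    for j=1:N
        if A(i,j)==0
            % wrap the index so that it never exceeds numel(B)
            A(i,j)=B(mod(i+j-1,numel(B))+1);
        end
    end
end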
If I understand it correctly from what you've described, your solution is going to need the following steps:
Loop over the rows of your matrix, e.g. for row = 1:size(A, 1)
Loop over the elements of each row, identify where each run of zeroes starts and store the indices and the length of the run. For example you might end up with a matrix like: consecutiveZeroes = [ 2 1 2 ; 4 1 3 ; 4 6 5 ; 5 8 3 ] indicating that you have a run at (2, 1) of length 2, a run at (4, 1) of length 3, a run at (4, 6) of length 5, and a run at (5, 8) of length 3.
Now loop over the elements of B counting up how many elements there are of each value. For example you might store this as replacementValues = [ 3 3 ; 2 5 ; 2 6 ; 3 2 ; 3 7 ] meaning three 3's, two 5's, two 6's etc.
Now take a row from your consecutiveZeroes matrix and randomly choose a row of replacementValues that specifies the same number of elements, replace the zeroes in A with the values from replacementValues, and delete the row from replacementValues to show that you've used it.
If there isn't a row in replacementValues that describes a long enough run of values to replace one of your runs of zeroes, find a combination of two or more rows from replacementValues that will work.
You can't do this with just a single pass through the matrix, because presumably you could have a matrix A like [ 15 7 0 0 0 0 0 0 3 ; 2 0 0 0 5 0 0 0 9 ] and a vector B like [ 2 2 2 3 3 3 7 7 5 5 5 5 ], where you can only achieve what you want if you use the four 5's and two 7's and not the three 2's and three 3's to substitute for the run of six zeroes, because you have to leave the 2's and 3's for the two runs of three zeroes in the next row. The easiest approach if efficiency is not critical would probably be to run the algorithm multiple times trying different random combinations until you get one that works - but you'll need to decide how many times to try before giving up in case the input data actually has no solution.
If you get stuck on any of these steps I suggest asking a new, more specific question.
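The bookkeeping in the first three steps could be sketched as follows. This is only an illustration of those steps under the assumption that a "run" means two or more zeros in a row (which is what the example matrices above imply); the random matching and the retry logic are left out.

```matlab
A = [92 92 92 91 91 91 146 146 146 0
     0 0 112 112 112 127 127 127 35 35
     16 16 121 121 121 55 55 55 148 148
     0 0 0 96 96 0 0 0 0 0
     0 16 16 16 140 140 140 0 0 0];
B = [3 3 3 5 5 6 6 2 2 2 7 7 7];

% Steps 1-2: find runs of two or more zeros, stored as [row, startColumn, length]
consecutiveZeroes = zeros(0, 3);
for row = 1:size(A, 1)
    padded = [0, A(row, :) == 0, 0];     % pad so runs touching the edges are detected
    d = diff(padded);
    starts = find(d == 1);               % 0 -> 1 transition: first zero of a run
    lens = find(d == -1) - starts;       % 1 -> 0 transition: one past the last zero
    keep = lens >= 2;                    % isolated zeros are not treated as runs
    consecutiveZeroes = [consecutiveZeroes; ...
        repmat(row, nnz(keep), 1), starts(keep).', lens(keep).'];
end

% Step 3: run-length encode B as [count, value]
runStart = find([true, diff(B) ~= 0]);   % index where each block of equal values begins
counts = diff([runStart, numel(B) + 1]);
replacementValues = [counts.', B(runStart).'];
```

For the matrices in the question this gives the same consecutiveZeroes and replacementValues as in the description above.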

How to load a text file in Matlab when the number of values in every line is different

I have a non-rectangular text file, A, which has 10 values in the first line, 14 values in the 2nd line, 16 values in the 3rd line and so on. Here is an example of 4 lines of my text file:
line1:
1.68595314026 -1.48498177528 2.39820933342 27 20 15 2 4 62 -487.471069336 -517.781921387 5 96 -524.886108398 -485.697143555
Line2:
1.24980998039 -0.988095104694 1.89048337936 212 209 191 2 1 989 -641.149658203 -249.001220703 3 1036 -608.681762695 -300.815673828
Line3:
8.10434532166 -4.81520080566 4.90576314926 118 115 96 3 0 1703 749.967773438 -754.015136719 1 1359 1276.73632813 -941.855895996 2 1497 1338.98852539 -837.659179688
Line 4:
0.795098006725 -0.98456710577 1.89322447777 213 200 68 5 0 1438 -1386.39111328 -747.421386719 1 1565 -1153.50915527 -342.951965332 2 1481 -1341.57043457 -519.307800293 3 1920 -1058.8828125 -371.696960449 4 1303 -1466.5802002 -308.764587402
Now, I want to load this text file into a matrix M in Matlab. I tried to use the importdata function to load it
M = importdata('A.txt');
but it loads the file in a rectangular matrix (all rows have same number of columns!!!) which is not right. The expected created matrix size should be like this:
size(M(1,:))= 1 10
size(M(2,:))= 1 14
size(M(3,:))= 1 16
How can I load this text file in a correct way into Matlab?
As @Jens suggested, you should use a cell array. Assuming your file contains only numeric values separated by whitespace, for instance:
1 3 6
7 8 9 12 15
1 2
0 3 7
You can parse it into a cell array like this:
% Read full file
str = fileread('A.txt');
% Convert text into a cell array of strings, one per line
c = textscan(str, '%s', 'Delimiter', '\n');
c = c{1};
% Convert 'string' elements to 'double'
n = cellfun(@str2num, c, 'UniformOutput', false)
You can then access individual lines like this:
>> n{1}
ans =
1 3 6
>> n{2}
ans =
7 8 9 12 15
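str2num evaluates each line as MATLAB code, so a stricter alternative is to parse every line with sscanf. A minimal sketch, assuming purely numeric, whitespace-separated lines; the demo file name is made up for illustration.

```matlab
% Create a small ragged demo file (a stand-in for the real A.txt)
fid = fopen('A_demo.txt', 'w');
fprintf(fid, '1 3 6\n7 8 9 12 15\n1 2\n0 3 7\n');
fclose(fid);

% Split the file into lines and drop any empty ones
c = textscan(fileread('A_demo.txt'), '%s', 'Delimiter', '\n');
c = c{1};
c = c(~cellfun(@isempty, c));

% Parse each line into a numeric row vector (sscanf returns a column, hence the transpose)
n = cellfun(@(s) sscanf(s, '%f').', c, 'UniformOutput', false);

% Number of values found on each line
rowLengths = cellfun(@numel, n);
```

rowLengths plays the role of the size(M(k,:)) checks from the question: each cell of n holds one line with its own length.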