matlab edit text file, replacing numbers by their - matlab

I want to use Matlab to replace every floating point number in a text file with another number. (let's say half the original value)
Other data (integer and string) should not change.
A few lines of my text file (each variable is in a new line):
VERTEX
8
0
10
0.000000
20
110.500000
42
0.000000
0
VERTEX
8
0
10
0.000000
20
0.000000
42
0.000000
0
VERTEX
8
0
10
124.000000
20
0.000000
42
0.000000
0
VERTEX
8
0
10
248.000000
20
0.000000
42
0.000000
0
VERTEX
8
0
10
248.000000
20
110.500000
42
0.000000
0
VERTEX
8
0
10
248.000000
20
221.000000
42
0.000000
0
Any help is appreciated.

Here is a solution using fgetl and regexp:
rid = fopen('test.txt','r');
wid = fopen('test2.txt','w');
while ~feof(rid)
s = fgetl(rid); % read a line
if ~isempty(regexp(s, '\d+\.\d+', 'once')) % float found
fprintf(wid, '42\n'); % write the replacement value (42 is a placeholder)
else
fprintf(wid, '%s\n', s); % write original data
end
end
fclose(rid);
fclose(wid);
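To write half of the original value, as the question asks, the replacement can be computed from the matched line. The following is a minimal sketch that works on a few sample lines held in a cell array rather than on the files; the names `lines` and `out` are illustrative, and it assumes each float sits alone on its line, as in the sample data.

```matlab
% Sample lines taken from the question (one value per line)
lines = {'VERTEX', '8', '110.500000', '0'};
out = lines;
for k = 1:numel(lines)
    s = lines{k};
    % Treat a line as a float only if it is digits, a dot, digits and nothing else
    if ~isempty(regexp(s, '^\s*\d+\.\d+\s*$', 'once'))
        % Halve the value and keep six decimals, like the input format
        out{k} = sprintf('%f', str2double(s) / 2);
    end
end
disp(out)
```

Inside the file loop, the same `sprintf` result would be passed to `fprintf(wid, '%s\n', ...)` in place of the fixed placeholder.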

Related

Plot selected rows with the average and standard deviation (GNUPlot)

I have a csv file with experiment results that goes like this:
64 4 8 1 1 2 1 ttt 62391 4055430 333 0.0001 10 161 108 288 0
64 4 8 1 1 2 1 ttt 60966 3962810 322 0.0001 10 164 112 295 0
64 4 8 1 1 2 1 ttt 61530 3999475 325 0.0001 10 162 112 291 0
64 4 8 1 1 2 1 ttt 61430 4054428 332 0.0001 10 158 110 286 0
64 4 8 1 1 2 1 ttt 63891 4152938 339 0.0001 9 149 109 274 0
64 4 32 1 1 2 1 ttt 63699 4204182 345 0.0001 4 43 179 240 0
64 4 32 1 1 2 1 ttt 63326 4116218 336 0.0001 4 45 183 248 0
64 4 32 1 1 2 1 ttt 62654 4135211 340 0.0001 4 48 178 248 0
64 4 32 1 1 2 1 ttt 63192 4107506 339 0.0001 4 49 175 245 0
64 4 32 1 1 2 1 ttt 62707 4138666 345 0.0001 4 46 179 245 0
64 4 64 1 1 2 1 ttt 60968 3962929 323 0.0001 4 46 191 256 0
64 4 64 1 1 2 1 ttt 58765 3819787 305 0.0001 4 50 196 267 0
64 4 64 1 1 2 1 ttt 58946 3831499 308 0.0001 5 52 187 260 0
64 4 64 1 1 2 1 ttt 60646 3942047 321 0.0001 4 47 187 254 0
64 4 64 1 1 2 1 ttt 59723 3882044 311 0.0001 4 46 201 269 0
64 8 8 1 1 2 1 ttt 63414 4185382 382 0.0001 33 517 109 643 0
64 8 8 1 1 2 1 ttt 62429 4057899 372 0.0001 33 538 110 667 0
64 8 8 1 1 2 1 ttt 60622 3940452 384 0.0001 33 556 115 689 0
64 8 8 1 1 2 1 ttt 64433 4188192 369 0.0001 33 519 110 644 0
My goal is to plot various combinations of the columns before "ttt" (choosing which ones, in different charts), together with the average and standard deviation of selected columns after "ttt", grouped by the chosen columns before "ttt".
Is this possible in gnuplot, and if so, how? If not, do you have any alternative suggestions?
Here is a completely revised and more general version.
Since you want to filter by 3 columns you need to have 3 properties to distinguish the data in the plot. This would be for example color, x-position and pointtype. What the script basically does:
Generates random data for testing (take your file instead)
$Data looks like this:
8 64 57773 0
4 32 64721 2
8 32 56757 1
4 16 56226 2
8 8 56055 1
8 64 59874 0
8 32 58733 0
4 16 55525 2
8 32 58869 0
8 64 64470 0
4 32 60930 1
8 64 57073 2
...
the variables ColX, ColC, ColP, and ColS define which columns are taken for x-position, color, pointtype and statistics.
find unique values of ColX, ColC, ColP, (check help smooth frequency) and put them to datablocks $ColX, $ColC, and $ColP.
put the unique values to arrays ArrX, ArrC, ArrP
loop all possible combinations and do statistics on ColS and put it to $Data2. Add 3 columns at the beginning for color, x-position and pointtype.
$Data2 looks like this:
1 1 1 0 8 4 61639.4 2788.4
1 1 2 0 8 8 59282.1 2740.2
1 2 1 0 16 4 59372.3 2808.6
1 2 2 0 16 8 60502.3 2825.0
1 3 1 0 32 4 59850.7 2603.8
1 3 2 0 32 8 60617.7 1979.8
1 4 1 0 64 4 60399.4 3273.6
1 4 2 0 64 8 59930.7 2919.8
2 1 1 1 8 4 59172.6 2288.2
2 1 2 1 8 8 58992.2 2888.0
2 2 1 1 16 4 59350.1 2364.6
2 2 2 1 16 8 61034.0 2368.5
2 3 1 1 32 4 59920.8 2867.6
2 3 2 1 32 8 59711.9 3464.2
2 4 1 1 64 4 60936.7 3439.7
2 4 2 1 64 8 61078.7 2349.3
3 1 1 2 8 4 58976.0 2376.3
3 1 2 2 8 8 61731.5 1635.7
3 2 1 2 16 4 58276.0 2101.7
3 2 2 2 16 8 58594.5 3358.5
3 3 1 2 32 4 60471.5 3737.6
3 3 2 2 32 8 59909.1 2024.0
3 4 1 2 64 4 62044.2 1446.7
3 4 2 2 64 8 60454.0 3215.1
Finally, plot the data. I couldn't figure out how the plotting style with yerror works together with variable pointtypes, so I split it into two plot commands: one with vectors and one with points. The third one, keyentry, is just there to get an empty line in the legend, and the fourth one gets the pointtype into the legend.
I hope you can figure out all the other details and adapt it to your data.
Code:
### grouped statistics on filtered (unsorted) data
reset session
set colorsequence classic
# generate some random test data
rand1(n) = 2**(int(rand(0)*2)+2) # values 4,8
rand2(n) = 2**(int(rand(0)*4)+3) # values 8,16,32,64
rand3(n) = int(rand(0)*10000)+55000 # values 55000 to 65000
rand4(n) = int(rand(0)*3) # values 0,1,2
set print $Data
do for [i=1:200] {
print sprintf("% 3d% 4d% 7d% 3d", rand1(0), rand2(0), rand3(0), rand4(0))
}
set print
print $Data # (just for test purpose)
ColX = 2 # column for x
ColC = 4 # column for color
ColP = 1 # column for pointtype
ColS = 3 # column for statistics
# get unique values of the columns
set table $ColX
plot $Data u (column(ColX)) smooth freq
unset table
set table $ColC
plot $Data u (column(ColC)) smooth freq
unset table
set table $ColP
plot $Data u (column(ColP)) smooth freq
unset table
# put unique values into arrays
set table $Dummy
array ArrX[|$ColX|-6] # gnuplot creates 6 extra lines
array ArrC[|$ColC|-6]
array ArrP[|$ColP|-6]
plot $ColX u (ArrX[$0+1]=$1)
plot $ColC u (ArrC[$0+1]=$1)
plot $ColP u (ArrP[$0+1]=$1)
unset table
print ArrX, ArrC, ArrP # just for test purpose
# define filter function
Filter(c,x,p) = ArrX[x]==column(ColX) && ArrC[c]==column(ColC) && \
ArrP[p]==column(ColP) ? column(ColS) : NaN
# loop all values and do statistics, write data into $Data2
set print $Data2
do for [c=1:|ArrC|] {
do for [x=1:|ArrX|] {
do for [p=1:|ArrP|] {
undef var STATS*
stats $Data u (Filter(c,x,p)) nooutput
if (exists('STATS_mean') && exists('STATS_stddev')) {
print sprintf("% 3d% 3d% 3d% 3d% 3d% 3d% 9.1f % 7.1f", c, x, p, ArrC[c], ArrX[x], ArrP[p], STATS_mean, STATS_stddev)
}
}
}
print ""; print ""
}
set print
# print $Data2 # just for testing purpose
set xlabel sprintf("Column %d", ColX)
set ylabel sprintf("Column %d", ColS)
set xrange[0.5:|ArrX|+1]
set xtics () # remove all xtics
do for [x=1:|ArrX|] { set xtics add (sprintf("%d",ArrX[x]) x)} # set xtics "manually"
# function for x position and offsets,
# actually not dependent on 'n' but to shorten plot command
# columns in $Data2: 1=color, 2=x, 3=pointtype
width = 0.5 # float number!
step = width/(|ArrC|-1)
PosX(n) = column(2) - width/2.0 + step*(column(1)-1) + (column(3)-1)*step*0.3
plot \
for [c=1:|ArrC|] $Data2 u (PosX(0)):($7-$8):(0):(2*$8) index c-1 w vectors \
heads size 0.04,90 lw 2 lc c ti sprintf("%g",ArrC[c]),\
for [c=1:|ArrC|] '' u (PosX(0)):7:($3*2+4):(c) index c-1 w p ps 1.5 pt var lc var not, \
keyentry w p ps 0 ti "\n", \
for [p=1:|ArrP|] '' u (0):(NaN) w p pt p*2+4 ps 1.5 lc rgb "black" ti sprintf("%g",ArrP[p])
### end of code
Result:
I do not think gnuplot can produce exactly what you are asking for in a single plot command. I will show you two alternatives in the hope that one or both is a useful starting point.
Alternative 1: standard boxplot
spacing = 1.0
width = 0.25
unset key
set xlabel "Column 3"
set ylabel "Column 9"
plot 'data' using (spacing):9:(width):3 with boxplot lw 2
This collects points based on the value in column 3 and for each such value it produces a boxplot. This is a widely used method of showing the distribution of point values in a category, but it is a quartile analysis not a display of mean + standard deviation.
Alternative 2: calculate mean and standard deviation for categories known in advance
The boxplot analysis has the advantage that you do not need to know in advance what values may be present in column 3. Gnuplot can calculate mean and standard deviation based on a column 3 value, but you need to specify in advance what that value is. Here is a set of commands tailored to the specific example file you provided. It calculates, but does not plot, the requested categorical mean and standard deviation. You can use these numbers to construct a plot, but that will require additional commands. You could, for example, save the values for each category in a new file, or array, or datablock and then go back and plot these together.
col3entry = "8 32 64"
do for [i in col3entry] {
stats "data" using ($3 == real(i) ? $9 : NaN) name "Condition".i nooutput
print i, ": ", value("Condition".i."_mean"), value("Condition".i."_stddev")
}
output:
8: 62345.1111111111 1259.34784220021
32: 63115.6 392.552977316438
64: 59809.6 881.583711283279
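As a cross-check outside gnuplot, the same per-category statistics can be sketched in MATLAB without listing the categories in advance. This is only an illustration: `key` and `val` are columns 3 and 9 copied by hand from the sample rows in the question, and `std(v, 1)` (normalised by N) is used because it matches the standard deviations printed above.

```matlab
% Column 3 (group key) and column 9 (value), copied from the sample rows
key = [8 8 8 8 8 32 32 32 32 32 64 64 64 64 64 8 8 8 8]';
val = [62391 60966 61530 61430 63891 63699 63326 62654 63192 62707 ...
       60968 58765 58946 60646 59723 63414 62429 60622 64433]';

[groups, ~, idx] = unique(key);                  % idx maps each row to its group
mu = accumarray(idx, val, [], @mean);            % per-group mean
sd = accumarray(idx, val, [], @(v) std(v, 1));   % per-group std, normalised by N

disp([groups mu sd])                             % one row per category: key, mean, std
```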

I am searching a loop which stores values in a matrix

I have an input table with 3 columns and several rows. The first column holds the x-coordinates and the second column the y-coordinates of my start points.
Each start point is the bottom-left corner of a rectangle, and I want to draw the rectangle from that point (height and width should be constant). I am having a lot of trouble with loops and matrices.
My output should look like this:
AllPoints= [0,0;1,0;1,1;0,1;5,5;10,5;10,10;5,10;2,2;4,2;4,4;2,4];
-> Explanation see screenshot
clc % clears the screen
clear all % clears all variables
%Table -> (start Points...)
%12 0 10
%14 0 30
%16 0 54
%18 0 51
%20 0 35
%22 0 12
%14 2 25
%16 2 35
Input_Matrix = readtable('Testbeispiel_Rainflow.dat',...
'Delimiter','\t','ReadVariableNames',false)%,'Format','%f%f%f')
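A loop that fills such a matrix could look like the sketch below. It is built on assumptions: the start points and sizes are read off the expected `AllPoints` above (where the rectangles have different sizes), so `w` and `h` are per-rectangle vectors here. For a truly constant size, every entry of `w` and `h` would be the same. The names `starts`, `w`, `h` and `rows` are illustrative.

```matlab
starts = [0 0; 5 5; 2 2];   % bottom-left corners (x, y), read off the expected output
w = [1; 5; 2];              % widths  (assumed, one per rectangle)
h = [1; 5; 2];              % heights (assumed, one per rectangle)

n = size(starts, 1);
AllPoints = zeros(4*n, 2);  % four corner points per rectangle
for k = 1:n
    x = starts(k, 1);
    y = starts(k, 2);
    rows = 4*(k-1) + (1:4); % rows of AllPoints that belong to rectangle k
    % corners in counter-clockwise order, starting at the bottom-left
    AllPoints(rows, :) = [x y; x+w(k) y; x+w(k) y+h(k); x y+h(k)];
end
disp(AllPoints)
```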

Plot Zero Value on Contour Plot in MATLAB

I'm plotting z values, 0 to 10, on a contour plot.
When I include data 1 or greater, I obtain a contour plot. Like the following:
longitude = [80 82 95]
latitude = [30 32 35]
temp = [1 4 6; 1 2 7; 3 5 7]
contourf(longitude,latitude,temp)
Now, I want to plot the ZERO VALUE also on the contour plot. While I was expecting one color representing the zero value, instead I obtained a white square.
longitude = [80 82 95]
latitude = [30 32 35]
temp = [0 0 0; 0 0 0; 0 0 0]
contourf(longitude,latitude,temp)
Thanks a lot,
Amanda
As Issac mentioned, contourf cannot plot constant data.
If you try, MATLAB issues this warning:
temp =
0 0 0
0 0 0
0 0 0
Warning: Contour not rendered for constant ZData
> In contourf>parseargs at 458
In contourf at 63
In TESTrandom at 45
However, if only some of the values are 0, contourf works fine:
longitude = [80 82 95];
latitude = [30 32 35];
temp = [0 4 6; 1 0 7; 0 5 9];
contourf(longitude,latitude,temp);
hcb = colorbar('horiz'); % colour bar
set(get(hcb,'Xlabel'),'String','Contourf Bar.')
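One way to avoid the blank plot is to test for constant data before calling contourf. This is a minimal sketch, not part of the original answer: the `isConstant` flag is an illustrative name, and what to draw in the constant case (a message here) is left open.

```matlab
longitude = [80 82 95];
latitude  = [30 32 35];
temp      = zeros(3);       % the all-zero case from the question

% ZData is constant when its smallest and largest values coincide
isConstant = (max(temp(:)) == min(temp(:)));
if isConstant
    fprintf('ZData is constant (%g); contourf has no contour levels to draw.\n', temp(1));
else
    contourf(longitude, latitude, temp);
end
```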

Matlab, Image compression

I am unsure what this is asking me to do in MATLAB. What does it mean to encode? What format should the answer be in? Can anyone help me work it out, please?
Encode the 8x8 image patch and print out the results
I have an 8x8 image:
symbols=[0 20 50 99];
p=[32 8 16 8];
p = p/sum(p);
[dict, avglen] = huffmandict(symbols, p);
A = ...
[99 99 99 99 99 99 99 99 ...
20 20 20 20 20 20 20 20 ...
0 0 0 0 0 0 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 0 0 0 0 0 0];
comp=huffmanenco(A,dict);
ratio=(8*8*8)/length(comp)
Do you understand the principle of Huffman coding?
To put it simply, it is an algorithm used to compress data (like images in your case). This means that the input of the algorithm is an image and the output is a numeric code that is smaller in size than the input: hence the compression.
The principle of Huffman coding is (roughly) to replace symbols in the original data (in your case the value of each pixel of the image) by a numeric code that is attributed according to the probability of the symbol. The most probable (i.e. the most common) symbol will be replaced by shorter codes in order to realize a compression of the data.
To solve your problem, Matlab has two functions in the Communications Toolbox: huffmandict and huffmanenco.
huffmandict: this function builds a dictionary that is used to translate symbols from the original data into their numeric Huffman codewords. To build this dictionary, huffmandict needs the list of symbols used in the data and their probability of appearance, which is the number of times each symbol is used divided by the total number of symbols in your data.
huffmanenco: this function translates your original data using the dictionary built by huffmandict. Each symbol in the original data is translated to a numeric Huffman code. To measure the gain in size of this compression method, you can compute the compression ratio, which is the ratio between the number of bits used to describe your original data and the number of bits of the corresponding Huffman code. In your case, inferring from your computation of the compression ratio, you have an 8 by 8 image using 8-bit integers to describe each pixel, and the corresponding Huffman code uses length(comp) bits.
With all this in mind, you could read your code in this way:
% Original image
A = ...
[99 99 99 99 99 99 99 99 ...
20 20 20 20 20 20 20 20 ...
0 0 0 0 0 0 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 50 50 50 50 0 0 ...
0 0 0 0 0 0 0 0];
% First step: extract the symbols used in the original image
% and their probabilities (number of occurrences / total number of symbols)
symbols=[0 20 50 99];
p=[32 8 16 8];
p=p/sum(p);
% Alternatively, the following extracts the symbols and their probabilities
% automatically (hist returns the counts first, then the bin centers)
symbols=unique(A);
p=hist(A,symbols);
p=p/sum(p);
% Second step: build the Huffman dictionary
[dict,avglen]=huffmandict(symbols,p);
% Third step: encode your original image with the dictionary you just built
comp=huffmanenco(A,dict);
% Finally you can compute the compression ratio
ratio=(8*8*8)/length(comp)
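The expected compression ratio can also be worked out by hand, without the Communications Toolbox. In the sketch below the codeword lengths in `codeLen` are derived manually from the probabilities 0.5, 0.125, 0.25 and 0.125 (merging the two rarest symbols first), so they are an assumption of this example, not output from huffmandict.

```matlab
% The 8x8 patch from the question, flattened to a row vector
A = [repmat(99,1,8) repmat(20,1,8) zeros(1,8) ...
     repmat([0 0 50 50 50 50 0 0],1,4) zeros(1,8)];

[symbols, ~, idx] = unique(A);        % symbols = [0 20 50 99]
counts = accumarray(idx(:), 1)';      % occurrences of each symbol
p = counts / sum(counts);             % probabilities

codeLen = [1 3 2 3];                  % Huffman codeword lengths for 0, 20, 50, 99 (worked by hand)
avglen = sum(p .* codeLen);           % average bits per pixel after coding
totalBits = numel(A) * avglen;        % size of the encoded patch in bits
ratio = (8*8*8) / totalBits           % original bits / encoded bits
```

With these lengths the average is 1.75 bits per pixel, so the 512-bit patch shrinks to 112 bits.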

How can I perform this cumulative sum in MATLAB?

I want to calculate a cumulative sum of the values in column 2 of dat.txt below for each string of ones in column 1. The desired output is shown as dat2.txt:
dat.txt dat2.txt
1 20 1 20 20 % 20 + 0
1 22 1 22 42 % 20 + 22
1 20 1 20 62 % 42 + 20
0 11 0 11 11
0 12 0 12 12
1 99 1 99 99 % 99 + 0
1 20 1 20 119 % 20 + 99
1 50 1 50 169 % 50 + 119
Here's my initial attempt:
fid=fopen('dat.txt');
A =textscan(fid,'%f%f');
in =cell2mat(A);
fclose(fid);
i = find(in(2:end,1) == 1 & in(1:end-1,1)==1)+1;
out = in;
cumulative =in;
cumulative(i,2)=cumulative (i-1,2)+ cumulative(i,2);
fid = fopen('dat2.txt','wt');
format short g;
fprintf(fid,'%g\t%g\t%g\n',[out cumulative(:)]');
fclose(fid);
Here's a completely vectorized (albeit somewhat confusing-looking) solution that uses the functions CUMSUM and DIFF along with logical indexing to produce the results you want:
>> data = [1 20;... %# Initial data
1 22;...
1 20;...
0 11;...
0 12;...
1 99;...
1 20;...
1 50];
>> data(:,3) = cumsum(data(:,2)); %# Add a third column containing the
%# cumulative sum of column 2
>> index = (diff([0; data(:,1)]) > 0); %# Find a logical index showing where
%# continuous groups of ones start
>> offset = cumsum(index.*(data(:,3)-data(:,2))); %# An adjustment required to
%# zero the cumulative sum
%# at the start of a group
%# of ones
>> data(:,3) = data(:,3)-offset; %# Apply the offset adjustment
>> index = (data(:,1) == 0); %# Find a logical index showing where
%# the first column is zero
>> data(index,3) = data(index,2) %# For each zero in column 1 set the
%# value in column 3 to be equal to
data = %# the value in column 2
1 20 20
1 22 42
1 20 62
0 11 11
0 12 12
1 99 99
1 20 119
1 50 169
This solution is not completely vectorized (it loops over the segments of sequential 1s), but it should be faster. For your data the loop runs only twice. It uses MATLAB's CUMSUM function.
%# d is the N-by-2 data matrix (columns 1 and 2 of dat.txt)
istart = find(diff([0; d(:,1)])==1); %# start indices of sequential 1s
iend = find(diff([d(:,1); 0])==-1); %# end indices of sequential 1s
dcum = d(:,2);
for ind = 1:numel(istart)
dcum(istart(ind):iend(ind)) = cumsum(dcum(istart(ind):iend(ind)));
end
dlmwrite('dat2.txt',[d dcum],'\t') %# write the tab-delimited file
d=[
1 20
1 22
1 20
0 11
0 12
1 99
1 20
1 50
];
disp(d)
out=d;
%add a column
out(:,3)=0;
csum=0;
for(ind=1:length(d(:,2)))
if(d(ind,1)==0)
csum=0;
out(ind,3)=d(ind,2);
else
csum=csum+d(ind,2);
out(ind,3)=csum;
end
end
disp(out)
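The same reset-per-run idea can also be expressed by labelling each group and subtracting the running total that had accumulated before the group started. This is a sketch under the assumption that column 1 contains only 0s and 1s; `newGroup`, `groupId`, `base` and `dcum` are illustrative names.

```matlab
d = [1 20; 1 22; 1 20; 0 11; 0 12; 1 99; 1 20; 1 50];
flag = d(:,1);

% A new group starts at every 0 row and at the first 1 of each run of 1s
newGroup = (flag == 0) | (diff([0; flag]) == 1);
groupId  = cumsum(newGroup);          % group label for every row

total = cumsum(d(:,2));               % plain running total of column 2
startTotal = total - d(:,2);          % running total just before each row
base = startTotal(newGroup);          % total accumulated before each group began

dcum = total - base(groupId);         % cumulative sum restarted in every group
disp([d dcum])
```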