Extract only numerical data from MATLAB from a text file into a matrix - matlab

I have a code which is producing output files containing information about some mesh which I need to analyse using MATLAB.
The output files look like this.
Vertex 1 1.3 -2.1 0 {z=(1.3e+0 -2.1e+0) mu=(1.4e-3 2.0e-3) uv=(-0.6 0.4)}
Vertex 2 1.4 -2.1 0 {z=(1.4e+0 -2.1e+0) mu=(2.8e-3 1.5e-3) uv=(-0.6 0.4)}
Vertex 3 -1.9 1.9 0 {z=(-1.9e+0 1.9e+0) mu=(-8.9e-2 1.4e-1) uv=( 0.7 -0.2)}
.
.
.
I would like my MATLAB code to read in this data file and form a matrix containing all the numbers
in the order specified.
For example, I would want the above 3 lines to be processed into the matrix
1 1.3 -2.1 0 1.3e+0 -2.1e+0 1.4e-3 2.0e-3 -0.6 0.4
2 1.4 -2.1 0 1.4e+0 -2.1e+0 2.8e-3 1.5e-3 -0.6 0.4
3 -1.9 1.9 0 -1.9e+0 1.9e+0 -8.9e-2 1.4e-1 0.7 -0.2
Is there some convenient MATLAB facility/command to do this?

I think you could use textscan for this:
Example data.txt:
Vertex 1 1.3 -2.1 0 {z=(1.3e+0 -2.1e+0) mu=(1.4e-3 2.0e-3) uv=(-0.6 0.4)}
Vertex 2 1.4 -2.1 0 {z=(1.4e+0 -2.1e+0) mu=(2.8e-3 1.5e-3) uv=(-0.6 0.4)}
Vertex 3 -1.9 1.9 0 {z=(-1.9e+0 1.9e+0) mu=(-8.9e-2 1.4e-1) uv=( 0.7 -0.2)}
Code:
fileID = fopen('data.txt');
C = textscan(fileID,'Vertex %f %f %f %f {z=(%f %f) mu=(%f %f) uv=(%f %f)}');
fclose(fileID);
mtxC = [C{:}];
Result:
mtxC =
1.0000 1.3000 -2.1000 0 1.3000 -2.1000 0.0014 0.0020 -0.6000 0.4000
2.0000 1.4000 -2.1000 0 1.4000 -2.1000 0.0028 0.0015 -0.6000 0.4000
3.0000 -1.9000 1.9000 0 -1.9000 1.9000 -0.0890 0.1400 0.7000 -0.2000

MATLAB Option (partly tested)
I had to do something similar with a CMM once and it was easy to do in Python (see below). You could use the MATLAB command regexp(text, expression) to match a regular expression that gets what you want. This will return string data though, which you can save to a data file and then load that data file, or convert to numbers using str2double.
To use this, you first have to get your data file into MATLAB as series of strings. You can do this with fgetl.
in_fid = fopen('my_input_file.txt', 'r');
out_fid = fopen('my_output_file.txt', 'w');
data = [];
tline = fgetl(in_fid); % named tline to avoid shadowing the built-in line function
while ischar(tline)
    match = regexp(tline, '[+-]?\d+\.?\d*e?[+-]?\d*', 'match'); % find all matches
    % Write to text file
    fprintf(out_fid, '%s\n', strjoin(match, '\t')); % one tab-separated row per input line
    % Or save to an array locally
    data = [data; str2double(match)];
    tline = fgetl(in_fid); % grab the next line
end
fclose(in_fid);
fclose(out_fid);
% If you wrote to a text file, retrieve the data
data = dlmread('my_output_file.txt', '\t'); % the delimiter is the second argument
Note that this will not match numbers that begin with a decimal point with no preceding digit, i.e. .2. Also note that this will match numbers that match the pattern in any file that you feed it, so it is generalized. For how to match floating point numbers, see this site (I changed it a bit though to add the e portion for scientific notation).
I was able to test the regexp and str2double operations on a remote machine, and it looks like building your data array directly works. I was unable to test the file I/O portion, so there may be some bugs there still.
Python Option (my favorite)
I suggest using regular expressions in Python for this sort of thing. I had to do something similar with a CMM once and it was easy to do in Python with something like:
import re
# Make pattern to match scientific notation numbers
pattern = re.compile(r"[+-]?\d+\.?\d*e?[+-]?\d*")
with open("your_input_file.txt", "r") as in_file:
    with open("your_output_file.txt", "w") as out_file:
        for line in in_file:
            match = pattern.findall(line)  # find all matches in the line
            out_file.write("\t".join(match) + "\n")  # write the results to a line in your output
For a good introduction to regex in Python, see Dive Into Python 3, which I recommend just about everybody reads. I tested this on your example file and it gives me:
1 1.3 -2.1 0 1.3e+0 -2.1e+0 1.4e-3 2.0e-3 -0.6 0.4
2 1.4 -2.1 0 1.4e+0 -2.1e+0 2.8e-3 1.5e-3 -0.6 0.4
3 -1.9 1.9 0 -1.9e+0 1.9e+0 -8.9e-2 1.4e-1 0.7 -0.2
in your_output_file.txt, so I think it works! The last step is then to call dlmread('your_output_file.txt', '\t') in MATLAB and you should be good to go.
If you want to get fancy, you could upgrade your Python script so that it can be called from the command line with your input and output files as arguments (look into the sys.argv list). This gets a bit more complicated, though, and it is easy enough to just open the script and change the filename manually, unless you need to do this all the time on differently-named files, in which case arguments are a good route. There is a good example of this here.
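As a rough sketch of the command-line variant described above: the script name extract_numbers.py and the helpers extract_numbers and convert are made up for illustration. The pattern is also a slightly modified one (an assumption of this sketch, not the pattern tested above) that accepts numbers such as .2 and requires digits after the exponent marker.

```python
import re
import sys

# Modified pattern (assumption of this sketch): also accepts ".2" and
# only treats "e" as an exponent when digits follow it.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def extract_numbers(line):
    """Return every numeric token found in one line of text, as strings."""
    return NUMBER.findall(line)


def convert(in_path, out_path):
    """Write the numbers of each input line as one tab-separated output line."""
    with open(in_path, "r") as in_file, open(out_path, "w") as out_file:
        for line in in_file:
            out_file.write("\t".join(extract_numbers(line)) + "\n")


if __name__ == "__main__":
    # sys.argv[0] is the script name; the two file names follow it.
    if len(sys.argv) == 3:
        convert(sys.argv[1], sys.argv[2])
    else:
        print("usage: python extract_numbers.py INPUT_FILE OUTPUT_FILE")
```

It would be run as python extract_numbers.py data.txt numbers.txt, after which the tab-separated output file can be loaded in MATLAB as before.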

Related

How do I output only two numbers after a decimal place even for complex numbers?

I'm having trouble outputting a number with only 2 decimal spaces after the decimal point.
For example, you divide 46/3 and get 15.3333. How would I change the output to 15.33 if I'm using the display (disp) function?
a = 46;
b = 3;
disp(a/b) % 15.3333 <- this should be displayed as 15.33
Also this should work with complex numbers:
a = 46i;
b = 3;
disp(a/b) % 0.0000 +15.3333i but I need 0.00 +15.33i
To print complex numbers with two decimal places in both the real and imaginary parts, I suggest using fprintf with the real and imaginary parts passed separately. For example, the following anonymous function implements this 2-decimal printing, similar to the default MATLAB format:
disp2 = @(Z) fprintf('%.2f%+.2fi\n', real(Z), imag(Z));
Usage example:
a = 10i;
b = 3;
disp2(a/b); % 0.00+3.33i
disp2(-b-a); % -3.00-10.00i
You can also replace the fprintf with sprintf and use the returned string inside a disp() function.
And this is a more complicated version of this function with nicer output:
disp2 = @(z) fprintf('% 5.2f %c% 6.2fi\n', real(z), subsref(sprintf('%+d', imag(z)),struct('type','()','subs',{{1}})), abs(imag(z)));
E.g.
disp2(a/b); % 0.00 + 3.33i
disp2(-b-a); % -3.00 - 10.00i

integral2 returns 0 when it shouldn't

I am having the following issue. I am trying to integrate a pdf function of bivariate normal over the entire support. However, Matlab returns 0 for d, which shouldn't be the case. What is wrong with my approach?
The code is below:
mu1=100;
mu2=500;
sigma1=1;
sigma2=2;
rho=0.5;
var1=sigma1^2;
var2=sigma2^2;
pdf = @(x,y) (1/(2*pi*sigma1*sigma2*sqrt(1-rho^2))).*exp((-1/(2*(1-rho^2)))*(((x-mu1).^2/var1)+((y-mu2).^2/var2)-((2*rho*(x-mu1).*(y-mu2))/(sigma1*sigma2))));
d = integral2(pdf,-Inf,Inf,-Inf,Inf)
As @Andras Deak commented, the "exponentials cut off very fast away from the peak".
As a matter of fact, you can visualize it:
mu1=100;
mu2=500;
sigma1=1;
sigma2=2;
rho=0.5;
var1=sigma1^2;
var2=sigma2^2;
pdf = @(x,y) (1/(2*pi*sigma1*sigma2*sqrt(1-rho^2))).*exp((-1/(2*(1-rho^2)))*(((x-mu1).^2/var1)+((y-mu2).^2/var2)-((2*rho*(x-mu1).*(y-mu2))/(sigma1*sigma2))));
figure
fsurf(pdf,[90 110 490 510])
figure
fsurf(pdf,[0 200 400 600])
In the first figure, the limits are close to the means you provided. You can see the shape of the bivariate normal:
If you extend the limits, you will see what looks like a discontinuity:
The built-in integral functions evaluate the integrand at a limited number of sample points. If your limits are -Inf and Inf, your function is zero almost everywhere, with a narrow peak close to the means that behaves like a discontinuity and can be missed entirely.
To treat singularities, you should break your domain, as suggested by MATLAB. Since the function is zero almost everywhere, you can integrate only around the means:
d = integral2(pdf,90,110,490,510)
> d =
>
> 1.0000
You can also write it as a function of your variables. The empirical rule states that 99.7% of your data is within 3 standard deviations from the means, so:
d = integral2(pdf,mu1-3*sigma1,mu1+3*sigma1,mu2-3*sigma2,mu2+3*sigma2)
> d =
>
> 0.9948
which will get you a pretty good result.
We can elaborate further. On the Wikipedia page for the empirical rule, the expression
erf(x/sqrt(2))
gives the "Expected fraction of population inside range mu+-x*sigma". With the short format that MATLAB uses for display by default, if you choose, say, x=5, you will get:
x = 5;
erf(x/sqrt(2))
> ans =
>
> 1.0000
Practically all of the data lies within 5 standard deviations of the means. So you may neglect the domain outside this range in the double integration to avoid the (almost) singularity.
d = integral2(pdf,mu1-x*sigma1,mu1+x*sigma1,mu2-x*sigma2,mu2+x*sigma2)
> d =
>
> 1.0000
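As an independent cross-check outside MATLAB, the same density can be integrated over a k-sigma box with a plain midpoint rule. This is only a sketch in Python (as used elsewhere on this page), not a reimplementation of integral2. The names box_integral, k and n are made up for illustration, and the grid size is an arbitrary choice.

```python
import math

mu1, mu2 = 100.0, 500.0
sigma1, sigma2 = 1.0, 2.0
rho = 0.5


def pdf(x, y):
    """Bivariate normal density with the parameters from the question."""
    zx = (x - mu1) / sigma1
    zy = (y - mu2) / sigma2
    norm = 1.0 / (2.0 * math.pi * sigma1 * sigma2 * math.sqrt(1.0 - rho**2))
    return norm * math.exp(-(zx * zx + zy * zy - 2.0 * rho * zx * zy) / (2.0 * (1.0 - rho**2)))


def box_integral(k, n=400):
    """Midpoint-rule integral of pdf over mu +/- k*sigma in both variables."""
    hx = 2.0 * k * sigma1 / n
    hy = 2.0 * k * sigma2 / n
    total = 0.0
    for i in range(n):
        x = mu1 - k * sigma1 + (i + 0.5) * hx
        for j in range(n):
            y = mu2 - k * sigma2 + (j + 0.5) * hy
            total += pdf(x, y)
    return total * hx * hy


print(box_integral(3))  # should land close to the 3-sigma value reported above
print(box_integral(5))  # should be very close to 1
print(pdf(0.0, 400.0))  # far from the peak the exponential underflows to zero
```

The last line illustrates why the infinite domain is hard for an adaptive integrator: away from the means, the integrand evaluates to exactly zero in double precision.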

the difference between makecform('srgb2xyz') and rgb2xyz() in matlab

I wonder what MATLAB does when using the function rgb2xyz()?
I cannot reproduce the results using the rgb2xyz conversion matrix.
Moreover, is there any difference between using makecform('srgb2xyz') and using rgb2xyz()? They produce different results.
The default white point for makecform('srgb2xyz') appears to be D50, whereas rgb2xyz defaults to D65.
>> applycform([.2 .3 .4],makecform('srgb2xyz','AdaptedWhitePoint',whitepoint('D65')))
ans =
0.0638 0.0690 0.1356
>> rgb2xyz([.2 .3 .4])
ans =
0.0638 0.0690 0.1356
>> applycform([.2 .3 .4],makecform('srgb2xyz'))
ans =
0.0617 0.0679 0.1024
>> rgb2xyz([.2 .3 .4],'WhitePoint','D50')
ans =
0.0616 0.0679 0.1025
Note that the documentation for makecform suggests using the more recent rgb2xyz instead. As for your comment about reproducing the results using a matrix, note that the matrices are generally derived from, and applied to, linear data. If you want to reproduce the results, you'll need to model the sRGB gamma correction as well.
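To make the gamma step concrete, here is a minimal sketch in Python (as used elsewhere on this page) that first linearizes the sRGB values and only then applies the matrix. It assumes the standard sRGB transfer function and the commonly quoted 4-digit D65 matrix. Whether MATLAB uses exactly these constants internally is an assumption, and the function names are made up for illustration.

```python
# Linear RGB -> XYZ matrix for sRGB primaries with a D65 white point
# (commonly quoted 4-digit values; assumed, not taken from MATLAB).
M = [
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
]


def srgb_to_linear(c):
    """Undo the sRGB gamma encoding for one channel value in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def srgb_to_xyz(rgb):
    """Linearize each channel, then apply the 3x3 matrix."""
    lin = [srgb_to_linear(c) for c in rgb]
    return [sum(m * v for m, v in zip(row, lin)) for row in M]


print(srgb_to_xyz([0.2, 0.3, 0.4]))
```

For [.2 .3 .4] this should agree with the D65 values shown above to about three decimal places. Applying the matrix directly to the gamma-encoded values, without the linearization step, gives clearly different numbers.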

MATLAB resulted MTX format

I am writing a function in MATLAB, but I don't like the format of the result: MATLAB factors a constant out of the returned matrix.
The number I am looking for is 0.7584, as the second element of the returned matrix. However, the result was displayed as
1.0e+04 * [2.0059 0.0001 0.0004].
When I asked for the second element of the result, it was the value I wanted.
The question is: how can I make the function/MATLAB display the result as a plain matrix, without multiplying it by that constant?
Any thoughts?
You can use format to set the output style of the command window.
>> a = rand(1, 3)*10e-4;
>> format short
>> disp(a)
1.0e-03 *
0.9649 0.1576 0.9706
>> format longe
>> disp(a)
9.648885351992765e-04 1.576130816775483e-04 9.705927817606157e-04
Try out the different options to find what suits you.

Vertcat error using ODE45

I am trying to numerically simulate a system using ODE45. I cannot seem to figure out why I'm getting the following error:
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
[t,x] = ode45(@NL_hw3c, [0,20], [1, 2, 3]);
function sys = NL_hw3c(t,x)
sys = [x(2)+ x(1)^2;
x(3)+log10(x(2)^2 + 1);
-3*x(1)-5*x(2)-3*x(3)+4*cos(t)-5*x(1)^2 -3*log10(x(2)^2 + 1) -6*x(1)*x(2) -6*x(1)^3 -(2*x(2)*(x(3)+ log10(x(2)^2 + 1)))/((x(2)^2 + 1)*log(10)) -2*x(1)*x(3) -2*x(1)*log10(x(2)^2 + 1) -2*x(2)^2 -8*x(1)^2*x(2) -6*x(1)^4];
end
Googled and couldn't find a similar solution. Any help would be appreciated.
Thanks
I had to separate each of the variables in your array for it to work:
function s = NL_hw3c(t,x)
s1 = x(2)+ x(1)^2;
s2 = x(3)+log10(x(2)^2 + 1);
s3 = -3*x(1)-5*x(2)-3*x(3)+4*cos(t)-5*x(1)^2 -3*log10(x(2)^2 + 1) -6*x(1)*x(2) -6*x(1)^3 -(2*x(2)*(x(3)+ log10(x(2)^2 + 1)))/((x(2)^2 + 1)*log(10)) -2*x(1)*x(3) -2*x(1)*log10(x(2)^2 + 1) -2*x(2)^2 -8*x(1)^2*x(2) -6*x(1)^4;
s = [s1;s2;s3];
end
I got the following output:
t =
0
0.0018
0.0037
0.0055
0.0074
0.0166
...
...
19.7647
19.8431
19.9216
20.0000
x =
1.0000 2.0000 3.0000
1.0055 2.0067 2.8493
1.0111 2.0131 2.6987
1.0167 2.0192 2.5481
1.0224 2.0251 2.3975
...
...
0.7926 -0.0187 -1.7587
0.8380 -0.1567 -1.7624
0.8781 -0.2928 -1.7534
0.9129 -0.4253 -1.7299
Your function didn't work because, inside square brackets, MATLAB treats whitespace as a column separator. In the last row of your array, every minus sign that has a space before it but none after it is parsed as a unary minus that starts a new element. The first two rows of your matrix therefore consist of 1 element each, while the last expression, used exactly as written, tries to place 10 elements in the last row. I'm assuming you want a 3 x 1 matrix, so the last row violates the size of the matrix you want to create, and this is why you get the error.
I'm assuming you want the last value to be a single expression, so you need to evaluate it as a separate statement outside the brackets and then place the result into your array.
To make the code more readable, I've placed all of the entries as separate variables before making the array.