MATLAB: reading numbers that also contain characters

I have to read a text file which contains a list of company codes. The format of the text file is:
[1233A12; 1233B88; 2342Q85; 2266738]
Once I have read the file, is it possible to compare these codes with regular numbers? I have the codes from two different databases: one of them has regular firm numbers (no characters) and the other has characters inside the firm numbers.
By the way, the file is big (50+ MB).
Edit: I have added an additional number to the example, because not all the numbers have a character inside.

If you want to compare part of a string with a number, you could do it as follows:
combiString = '1234AB56';
myNumber = 1234;
str2num(combiString(1:4)) == myNumber   % true: the first four characters are '1234'
str2num(combiString(7:8)) == myNumber   % false: the last two characters are '56'

You can achieve this result by using regular expressions. For example, if str = '1233A12' you can write
nums = regexp(str, '(\d+)[A-Z]*(\d+)', 'tokens');
str1 = nums{1}(1);
num1 = str2num(str1{1});
str2 = nums{1}(2);
num2 = str2num(str2{1});
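One caveat with the pattern above: for a code without letters, such as '2266738', the second (\d+) still has to match at least one digit, so backtracking splits the code into '226673' and '8'. The following is a minimal sketch of one way around that, assuming every code starts with digits, optionally followed by letters and more digits. The variable names (codes, otherDb, leading, hasLetter) are made up for illustration, and otherDb stands in for the firm numbers from the second database.

```matlab
% Sample codes from the question; in practice these would come from the file.
codes = {'1233A12', '1233B88', '2342Q85', '2266738'};

% Anchored pattern: leading digits, optional letters, optional trailing digits.
% With a cell array input and 'once', regexp returns one token cell per code.
tok = regexp(codes, '^(\d+)([A-Za-z]*)(\d*)$', 'tokens', 'once');

% Numeric value of the leading digit run of each code.
leading = cellfun(@(t) str2double(t{1}), tok);

% Flag the codes that contain letters.
hasLetter = cellfun(@(t) ~isempty(t{2}), tok);

% Hypothetical firm numbers from the purely numeric database.
otherDb = [1233 2266738 9999];

% Which codes have a leading number that also appears in the other database?
inOther = ismember(leading, otherDb);
```

Because regexp accepts the whole cell array at once, no explicit loop over the codes is needed, which should help with a large file.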

Related

How do I read comma separated values from a .txt file in MATLAB using textscan()?

I have a .txt file with rows consisting of three elements, a word and two numbers, separated by commas.
For example:
a,142,5
aa,3,0
abb,5,0
ability,3,0
about,2,0
I want to read the file and put the words in one variable, the first numbers in another, and the second numbers in another but I am having trouble with textscan.
This is what I have so far:
File = [LOCAL_DIR 'filetoread.txt'];
FID_File = fopen(File,'r');
[words,var1,var2] = textscan(File,'%s %f %f','Delimiter',',');
fclose(FID_File);
I can't seem to figure out how to use a delimiter with textscan.
horchler is indeed correct. You first need to open the file with fopen, which returns a file ID for the actual file, and pass that ID (not the file name) to textscan. You also only need one output variable, because textscan returns a cell array in which each "column" of the file is a separate cell. Because the columns are separated by commas, set the Delimiter option of textscan to ','. Once you're done, close the file with fclose.
As such, you just do this:
File = [LOCAL_DIR 'filetoread.txt'];
f = fopen(File, 'r');
C = textscan(f, '%s%f%f', 'Delimiter', ',');
fclose(f);
Take note that the format string needs no spaces between the specifiers, because the Delimiter option takes care of separating the columns. C will contain a cell array of columns. Now if you want to split up the columns into separate variables, just access the right cells:
names = C{1};
num1 = C{2};
num2 = C{3};
These are what the variables look like now by putting the text you provided in your post to a file called filetoread.txt:
>> names
names =
'a'
'aa'
'abb'
'ability'
'about'
>> num1
num1 =
142
3
5
3
2
>> num2
num2 =
5
0
0
0
0
Take note that names is a cell array of names, so accessing the right name is done by simply doing n = names{ii}; where ii is the index of the name you want to access. You'd access the values in the other two variables using the normal indexing notation (i.e. n = num1(ii); or n = num2(ii);).
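For a version that runs end to end, here is a minimal sketch that writes the sample rows to a temporary file and reads them back. The temporary file name (tmpFile) and the check on the fopen result are additions for illustration, not part of the original answer.

```matlab
% Write the sample rows from the question to a temporary file.
tmpFile = [tempname '.txt'];           % hypothetical location for the demo
fid = fopen(tmpFile, 'w');
fprintf(fid, 'a,142,5\naa,3,0\nabb,5,0\nability,3,0\nabout,2,0\n');
fclose(fid);

% Open the file, check the ID, and read the three comma-separated columns.
fid = fopen(tmpFile, 'r');
if fid == -1
    error('Could not open %s', tmpFile);
end
C = textscan(fid, '%s%f%f', 'Delimiter', ',');
fclose(fid);
delete(tmpFile);

names = C{1};   % cell array of strings
num1  = C{2};   % numeric column vector
num2  = C{3};   % numeric column vector

% Access one row: curly braces for the cell array, parentheses for the numbers.
ii = 4;
fprintf('%s: %d, %d\n', names{ii}, num1(ii), num2(ii));
```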

Using fscanf in MATLAB to read an unknown number of columns

I want to use fscanf for reading a text file containing 4 rows with an unknown number of columns. The newline is represented by two consecutive spaces.
It was suggested that I pass : as the sizeA parameter but it doesn't work.
How can I read in my data?
Update: The file format is
String1 String2 String3
10 20 30
a b c
1 2 3
I have to fill 4 arrays, one for each row.
See if this will work for your application.
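fid1 = fopen('test.txt');
tokens = {};                      % collected tokens (avoids shadowing the built-in "string")
str = fscanf(fid1, '%s', 1);      % read one whitespace-delimited token
while ~isempty(str)               % fscanf returns an empty result at the end of the file
    tokens{end+1} = str;
    str = fscanf(fid1, '%s', 1);
end
fclose(fid1);
% One column of X per row of the file; this assumes 4 rows of equal length.
X = reshape(tokens, [], 4);
ar1 = X(:,1)
ar2 = X(:,2)
ar3 = X(:,3)
ar4 = X(:,4)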
Once you have 'ar1','ar2','ar3','ar4' you can parse them however you want.
I have found a solution. I don't know if it is the only one, but it works fine:
A=fscanf(fid,'%[^\n] *\n')
B=sscanf(A,'%c ')
Z=fscanf(fid,'%[^\n] *\n')
C=sscanf(Z,'%d')
....
You could use
rawText = fgetl(fid);                       % read one line of text from the file
lines = regexp(rawText, ' {2}', 'split');   % two consecutive spaces mark a row break (as described in the question)
tokens = {};
for ix = 1:numel(lines)
    tokens{end+1} = regexp(lines{ix}, ' ', 'split');   % a single space separates the columns
end
This will give you a cell array of strings having the row and column shape of your original data.
The idea is to read an arbitrary line of text and then break it up according to the formatting information you have available. My example uses a single space character as the column separator.
This uses regular expressions to define the separators. Regular expressions are powerful but too complex to describe here. See the MATLAB help for regexp and regular expressions.
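As a minimal sketch of the splitting idea, the following works on a literal string instead of a file, so it can be run as is. It assumes that rows are separated by exactly two spaces and columns by one space, as the question describes. The sample text and the variable names are made up for illustration.

```matlab
% Sample data from the question, with two spaces standing in for each newline.
rawText = 'String1 String2 String3  10 20 30  a b c  1 2 3';

% First split into rows on runs of exactly two spaces.
rows = regexp(rawText, ' {2}', 'split');

% Then split each row into its columns on single spaces.
tokens = cell(1, numel(rows));
for ix = 1:numel(rows)
    tokens{ix} = regexp(rows{ix}, ' ', 'split');
end

% Numeric rows can be converted afterwards; str2double accepts a cell array.
secondRow = str2double(tokens{2});
fourthRow = str2double(tokens{4});
```

The number of columns never has to be known in advance, because each row is simply split into however many tokens it contains.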

Function to split string in matlab and return second number

I have a string and I need two characters to be returned.
I tried with strsplit but the delimiter must be a string and I don't have any delimiters in my string. Instead, I always want to get the second number in my string. The number is always 2 digits.
Example: 001a02.jpg I use the fileparts function to delete the extension of the image (jpg), so I get this string: 001a02
The expected return value is 02
Another example: 001A43a . Return values: 43
Another one: 002A12. Return values: 12
All the filenames are in a matrix 1002x1. Maybe I can use textscan but in the second example, it gives "43a" as a result.
(Just so this question doesn't remain unanswered, here's a possible approach: )
One way to go about this uses splitting with regular expressions (MATLAB's strsplit which you mentioned):
str = '001a02.jpg';
C = strsplit(str,'[a-zA-Z.]','DelimiterType','RegularExpression');
Results in:
C =
'001' '02' ''
In older versions of MATLAB, before strsplit was introduced, similar functionality was achieved using regexp(...,'split').
If you want to learn more about regular expressions (abbreviated as "regex" or "regexp"), there are many online resources.
In your case, if you only need to take the 5th and 6th characters from the string you could use:
D = str(5:6);
... and if you want to convert those into numbers you could use:
E = str2double(str(5:6));
If your number is always at a certain position in the string, you can simply index this position.
In the examples you gave, the number is always the 5th and 6th characters in the string.
filename = '002A12';
num = str2num(filename(5:6));
Otherwise, if the formatting is more complex, you may want to use a regular expression. There is a similar question, "matlab - extracting numbers from (odd) string". Modifying the code found there, you can do the following:
all_num = regexp(filename, '\d+', 'match'); %Find all numbers in the filename
num = str2num(all_num{2}) %Convert second number from str
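Since the question mentions a 1002x1 list of file names, here is a hedged sketch of applying the same idea to a whole cell array at once. It assumes every name has the shape digits, one letter, then at least two digits. The sample names come from the question; the variable names are made up for illustration.

```matlab
% A few of the file names from the question, stored as a column cell array.
files = {'001a02.jpg'; '001A43a'; '002A12'};

% Capture the two digits that follow the first letter.
% With a cell array input and 'once', each result is a 1x1 cell holding the token.
tok = regexp(files, '^\d+[A-Za-z](\d{2})', 'tokens', 'once');

% Unwrap the tokens. This assumes every name matched; an unmatched name
% would give an empty cell and t{1} would raise an error.
secondStr = cellfun(@(t) t{1}, tok, 'UniformOutput', false);

% Convert to numbers if needed (the leading zero in '02' is lost).
secondNum = str2double(secondStr);
```

Keeping secondStr as text preserves values such as '02', which matches the return value the question asks for.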

converting a string array into a struct

I have a string array with the contents of the row in the following manner.
X ='Xmole(1)=0.0Xmole(2)=1.0rho(1)=2343rho(2)=2343'
Now I need a struct data.Massdensity which should look like this
<data.Massdensity = Xmole(1)=0.0
Xmole(2)=1.0
rho(1)=2343
rho(2)=2343>
I did use cell2struct, which gave me a struct like this
data.Massdensity ='Xmole(1)=0.0Xmole(2)=1.0rho(1)=2343rho(2)=2343'
Is there any way I can get a struct like the one above?
I am reading a textfile whose contents look like this
MassDensity{
Xmole(1) = 0.0
Xmole(2) = 1.0
rho(1) = 2343 # [kg/m^3]
rho(2) = 2343 # [kg/m^3]
}
I am using fileread to read this into a single string.
Is there a better way of doing this?
The problem with the initial way you presented your data is that there are no obvious delimiters, whereas within the original file you have the option (one presumes) of using the line ends as delimiters.
1) Read the individual lines into a cell array as separate strings (this may require some splitting or reassembly in MATLAB). With textscan you can set a range of delimiters and other settings, so make full use of the options.
For example:
a = textscan(fid,'%s','Delimiter',...
{'\n','{','}','#'},'CommentStyle','#','MultipleDelimsAsOne',1);
a = a{1}
Ideally you want to end up with:
a{1} = 'MassDensity'
a{2} = 'Xmole(1) = 0.0'
...
a{5} = 'rho(2) = 2343'
You may need to do some trimming of whitespace.
2) Create your struct, using dynamic field naming:
data.(a{1})=a(2:end);
data.MassDensity{1}
ans =
Xmole(1) = 0.0
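If the file has already been read into a single string with fileread, as the question describes, a similar result can be sketched without textscan. This is only one possible approach: it assumes '#' always starts a comment and that the block name is a valid MATLAB field name. The literal txt below stands in for the output of fileread.

```matlab
% Stand-in for the text returned by fileread on the file from the question.
txt = sprintf(['MassDensity{\nXmole(1) = 0.0\nXmole(2) = 1.0\n' ...
               'rho(1) = 2343 # [kg/m^3]\nrho(2) = 2343 # [kg/m^3]\n}\n']);

% Remove everything from '#' to the end of each line.
txt = regexprep(txt, '#[^\n]*', '');

% Split on newlines and braces, trim whitespace, and drop empty pieces.
parts = strtrim(regexp(txt, '[\n{}]+', 'split'));
parts = parts(~cellfun(@isempty, parts));

% The first piece is the block name; the rest are its entries.
% Dynamic field naming creates data.MassDensity as a 4x1 cell array.
data.(parts{1}) = parts(2:end)';

disp(data.MassDensity{1});
```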

How should I write a matrix onto a text in Hex format?

I have a matrix, say:
M = [1000 1350;2000 2040;3000 1400];
I wish to write this matrix onto a text file in the hex format, like this:
0x000003e8 0x00000546
0x000007d0 0x000007f8
0x00000bb8 0x00000578
I considered using the function dec2hex but it's very slow and inefficient. It also gives me the output as a string, which I don't know how to restructure for my above required format.
MATLAB directly converts hex numbers to decimal when reading from a text file, e.g. when using the function fscanf(fid,'%x').
Can we do the exact same thing while writing a matrix?
You can use the %x format string. For the sake of demonstration, see an example with sprintf below. If you want to write to a file, you will have to use fprintf.
M = [1000 1350;2000 2040;3000 1400];
str = sprintf('0x%08x\t0x%08x\n', M')
This results in
str =
0x000003e8 0x00000546
0x000007d0 0x000007f8
0x00000bb8 0x00000578
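To write to a file with fprintf instead, a minimal sketch could look like the following. It builds the format string from the number of columns, so the field does not have to be repeated by hand. The output path (outFile) is a hypothetical temporary file, and the values are assumed to be non-negative integers, since %x is meant for those.

```matlab
M = [1000 1350; 2000 2040; 3000 1400];

% One '0x%08x' field per column, tab-separated, ending with a newline.
nCols = size(M, 2);
fmt = [repmat('0x%08x\t', 1, nCols - 1) '0x%08x\n'];

% Write the matrix row by row. fprintf consumes its data column by column,
% so the matrix is transposed first.
outFile = [tempname '.txt'];          % hypothetical output location
fid = fopen(outFile, 'w');
if fid == -1
    error('Could not open %s for writing', outFile);
end
fprintf(fid, fmt, M.');
fclose(fid);

% Read the text back to inspect the result, then clean up.
txt = fileread(outFile);
delete(outFile);
disp(txt);
```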
You may use num2str with a format string:
str = num2str(M, '0x%08x ');
which returns
str =
0x000003e8 0x00000546
0x000007d0 0x000007f8
0x00000bb8 0x00000578
Using this instead of sprintf you do not need to repeat the format string for each column.