Matlab read one digit at a time from text file


I have a file that contains byte values 0 or 1 that are formatted without any whitespace between, like 1010111101010010010101. I want to make a [1, 0, 1, ...] vector out of those, reading one digit at a time. How can I do that? I tried using fscanf(fileId,'%c') but I get ASCII codes instead of actual values. '%d' on the other hand reads the entire file as one number.
I also tried writing to file:
fprintf(file1,'%d ',matrix); % notice the space after %d
and reading
fscanf(file2,'%d');
but I get a Nx1 matrix and I want to keep it as 1xN.
I could transpose it to be horizontal, but I still need to add space between digits, and I don't want to do that if possible.

You can convert easily from ascii char code to integer format as follows:
text = fscanf(fileId,'%c') - '0' ;
Note that you will also pick up end-of-line characters this way if there are any.
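If the file may end with a newline, one option is to drop every character that is not a '0' or '1' before subtracting. This is a minimal sketch, assuming a file called test.txt (an example name only) and assuming that any other characters can simply be ignored:

```matlab
% Write a small sample file so the sketch is self-contained
% ('test.txt' is just an example name).
fid = fopen('test.txt', 'w');
fprintf(fid, '1010111\n');
fclose(fid);

% Read every character, including the trailing newline.
fid = fopen('test.txt', 'r');
raw = fscanf(fid, '%c');
fclose(fid);

% Keep only the digit characters, then convert char codes to 0/1.
isDigit = (raw == '0') | (raw == '1');
bits = raw(isDigit) - '0';   % numeric vector of zeros and ones
bits = bits(:).';            % force a 1xN row vector
```

The final reshape is there only to guarantee the 1xN orientation asked for in the question, whatever shape the read returns.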
If you only have 0/1 in your file, using fileread will accomplish the same thing but also catches EOL characters:
text = fileread('test.txt');
text = text - '0'; % fileread returns a 1xN char vector, so no transpose is needed for a row
You can also read the entire file with textread:
text = textread('test.txt','%s');
text = char(text) - '0' ;
Now lines are returned in a cell array with one row per line. char then converts the cell array to a regular char array. This will not capture EOL characters but char will append blank spaces (ascii code 32) if the lines are not all equal in length.
Finally, you can also read line by line by looping and applying fgetl at each iteration until the function returns -1.
digits = [];
c = fgetl(fileId);
while ischar(c) % fgetl returns the number -1 at end of file
    digits = [digits, c - '0']; % append this line's digits
    c = fgetl(fileId);
end
This avoids reading EOL characters and appending blank space, but you need to handle concatenating the data yourself.

Related

How to fix extra space that MATLAB displays after first iteration

I have an fprintf statement which loops 3 times in order to display some data. After the first iteration, MATLAB displays a mysterious space even though I have not added an extra \t. It acts as if I had an if statement to display a different fprintf statement after the first iteration, but I have nothing like that on the code. See picture on the link for the result it displays
% Display results
fprintf('Panel\tPressure Cl\tCd\t| Panel\tPressure Cl\tCd\n')
for q = 1:length(AOA)
fprintf('--------------\t-------\t------- |--------------\t-------\t-------\n')
fprintf('AOA %.0f°\t\t%.4f\t%.4f\t|AOA %.0f°\t\t%.4f\t%.4f\n'...
,AOA(q),Cl(q),CD(q),AOA(q),ClFinal(q),CDFinal(q))
fprintf('--------------\t-------\t------- |--------------\t-------\t-------\n')
for j = 1:length(pressure{1})
fprintf('%.0f\t%.4f\t |\t |\t|%.0f\t%.4f\n',j+1,pressure{q}(j),j+1,pFinal{q}(j))
end
end

When you fprintf a \t character, the output advances to the next tab stop, which here falls every 4 columns. If the text before the tab has fewer than 4 characters, it is padded out to column 4; if it has more, the padding runs to the next multiple of 4 (8, 12, 16, and so on). The tab itself is still a single character in the output; it is only displayed as several spaces.
Here is what your question is really about:
fprintf('Panel\tPressure Cl\tCd\t| Panel\tPressure Cl\tCd\n')
The first string Panel has 5 characters, and therefore will be space padded with the equivalent of 3 spaces at the end of the first Panel. However, the second string | Panel has 7 characters, and therefore will only need the equivalent of 1 space at the end of the second string.
To remove your spacing issue and get more uniform spacing between your text headers, you can place a tab character after every header you want, and change the formatting of your other fprintf statements accordingly:
fprintf('Panel\tPressure\tCl\t\tCd\t\t|\tPanel\tPressure\tCl\t\tCd\n')
You can also view this link for another example of how space padding works.
Also, here is the MATLAB Documentation on Formatting Text.
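Because tab stops depend on where the cursor already is, an alternative that can be more predictable is fixed-width fields. The sketch below assumes made-up values for AOA, Cl and Cd and a field width of 10 chosen for illustration; it only shows the width specifiers and is not the asker's full table:

```matlab
% Hypothetical sample data, for illustration only.
AOA = [0 5 10];
Cl  = [0.1234 0.5678 0.9012];
Cd  = [0.0123 0.0234 0.0345];

% '%-10s' left-justifies text in a 10-character field, so every
% column starts at the same position regardless of the header length.
fprintf('%-10s%-10s%-10s\n', 'AOA', 'Cl', 'Cd');
for q = 1:length(AOA)
    % '%-10.0f' and '%-10.4f' pad the numbers to the same width.
    fprintf('%-10.0f%-10.4f%-10.4f\n', AOA(q), Cl(q), Cd(q));
end

% sprintf accepts the same specifiers, which makes the width easy to check.
headerLine = sprintf('%-10s%-10s%-10s', 'AOA', 'Cl', 'Cd');
```

With fixed widths, a long header in one column no longer shifts the columns to its right, as long as each entry fits in its field.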

relation between size of text and position in file

Let us suppose that we have the following text file, 'badpoem.txt', which contains these lines:
Oranges and lemons,
Pineapples and tea.
Orangutans and monkeys,
Dragonflys or fleas.
I determined the size of each line in bytes:
whos
  Name        Size      Bytes  Class     Attributes
  ans         1x1           8  double
  fid         1x1           8  double
  tline1      1x19         38  char
  tline2      1x19         38  char
  tline3      1x23         46  char
where tline1, tline2 and tline3 are the corresponding lines. I opened the file, read three lines, and checked the current file position after each read. Here is the position right after opening:
fid = fopen('badpoem.txt');
ftell(fid)
ans = 0
The file has just been opened, so that is fine. Now read the first line:
tline1 = fgetl(fid) % read the first line
ftell(fid)
tline1 =
'Oranges and lemons,'
ans =
21
Now let's read the second line:
tline2 = fgetl(fid)
ftell(fid)
tline2 =
'Pineapples and tea.'
ans =
42
and finally last one
tline3 = fgetl(fid)
ftell(fid)
tline3 =
'Orangutans and monkeys,'
ans =
67
Is there any relation between the size of the text and the position? Thanks in advance.

For text files, Windows adds two characters at the end of each line (a carriage return and a line feed); other systems add one. When reading a line, Matlab leaves these out of the returned string, but because Windows adds two characters instead of one, you get different position values on Windows than those shown in the Matlab example here:
https://www.mathworks.com/help/matlab/ref/ftell.html
Char strings are saved in files using one byte for each character but are stored in Matlab's memory as 16 bit words or 2 bytes for each character which doubles the apparent size of char strings.

Very nice question, indeed. Actually, I think what confuses you is the fact that you are dealing with many different problems mixed up all together. Let's analyze them one by one.
1) TXT File Format under Windows
Usually (Asian locales and advanced text editors are common exceptions), text files under Windows are ANSI encoded (where ANSI is a loose term for the single-byte Windows code pages, such as Windows-1252, which are closely related to the ISO/IEC 8859 encodings). Within this encoding framework, from a binary point of view, each character is represented by a single byte. If you open such a TXT file with Notepad and paste a few Chinese ideograms into it, this is the message you will see when trying to save your changes:
This file contains characters in Unicode format which will be lost if
you save this file as ANSI encoded text file. To keep the Unicode
information, click Cancel below and then select one of the Unicode
option from the Encoding drop down list. Continue?
2) Line Separators under Windows
As other users already pointed out, in Windows the default line break is represented by a combination of two characters: a carriage return (better known as \r or 0xD) and a line feed (better known as \n or 0xA). Here is an example based on your text:
Oranges and lemons,\r\nPineapples and tea.\r\nOrangutans and monkeys,\r\nDragonflys or fleas.
This doesn't happen with other operating systems like Linux and macOS, in which a single line feed is used:
Oranges and lemons,\nPineapples and tea.\nOrangutans and monkeys,\nDragonflys or fleas.
3) Storage of Strings under Matlab
Matlab stores characters in memory as Unicode 16-bit unsigned integers that take up two bytes each. This does not depend on the current Matlab encoding (which can be retrieved by executing the command feature('DefaultCharacterSet') and, by default, corresponds to the current operating system encoding).
4) The fgetl Function
As per the official documentation, the fgetl function reads a single line from a file (specifically, from a valid file identifier), excluding line breaks. This means that Matlab reads the whole line, including all the line-break characters, but trims them from the output string returned by the function.
The difference between fgetl and fgets is that the former trims the line breaks while the latter doesn't.
All this being said, let's analyze step-by-step what is happening in your code. First, you open the file and the pointer is being placed at the beginning of the stream:
fid = fopen('data.txt','r');
ftell(fid) % 0
Then, you read the first line:
tline1 = fgetl(fid)
ftell(fid) % 21
The line contains 19 characters (the size you get from the whos table) that, memory-side, are stored using 38 bytes because Matlab uses two bytes per character. The ftell call displays the number 21 because fgetl read the whole line, including the two line-break characters that were trimmed from the output (0 + 19 + 2 = 21).
Then, you read the second line:
tline2 = fgetl(fid)
ftell(fid) % 42
The line contains 19 characters that, memory-side, are stored using 38 bytes. The ftell call displays the number 42 because fgetl read the whole line, including the two line-break characters that were trimmed from the output. From the previous offset, 21 + 19 + 2 = 42.
Finally, you read the third line:
tline3 = fgetl(fid)
ftell(fid) % 67
The line contains 23 characters that, memory-side, are stored using 46 bytes. The ftell call displays the number 67 because fgetl read the whole line, including the two line-break characters that were trimmed from the output. From the previous offset, 42 + 23 + 2 = 67.
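The arithmetic above can be checked with a short script. This is only a sketch: it writes the first three lines itself with explicit \r\n terminators into a hypothetical file named poem_crlf.txt, so that the byte counts do not depend on the platform it runs on:

```matlab
% Write the first three lines with explicit Windows line endings (\r\n).
% 'poem_crlf.txt' is a hypothetical file name used only for this sketch.
fid = fopen('poem_crlf.txt', 'w');   % no text-mode translation of \n
fprintf(fid, 'Oranges and lemons,\r\n');
fprintf(fid, 'Pineapples and tea.\r\n');
fprintf(fid, 'Orangutans and monkeys,\r\n');
fclose(fid);

% Read the lines back and record the file position after each one.
fid = fopen('poem_crlf.txt', 'r');
pos = zeros(1, 3);
for k = 1:3
    tline  = fgetl(fid);   % line text, terminator not returned
    pos(k) = ftell(fid);   % bytes consumed so far, terminators included
end
fclose(fid);

% Differences between successive positions are the on-disk line sizes,
% that is, the number of characters plus the two terminator bytes.
bytesPerLine = diff([0 pos]);
```

Here pos should reproduce the offsets 21, 42 and 67 from the question, and bytesPerLine gives 21, 21 and 25, which is each line length plus 2.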

How can i get 6 digits after comma (matlab)?

I read some comma-separated values from a text file:
-8.618643,41.141412
-8.639847,41.159826
...
I wrote the script below:
get_in = zeros(lendata,2);
nums = str2num(line); % automatic comma separation (two values)
for x=1:2
get_in(i,x)=nums(x);
end
It automatically rounds the numbers. For example, the first row is converted to "-8.6186 , 41.1414".
How can I avoid the rounding?
I want to get 6 digits after the decimal point.
I tried "str2double" after splitting the line on the comma delimiter.
I tried the Import Data tool.
But the values were always rounded to 4 digits, too.

As one of the replies has already said, the values aren't actually rounded, just the displayed values (for ease of reading them). As suggested, if you just enter 'format long' into the command window that should help.
The following link might help with displaying individual values to certain decimal places though: https://uk.mathworks.com/matlabcentral/newsreader/view_thread/118222
It suggests using the sprintf function. For example, sprintf('%.6f',data) would display the value of 'data' to 6 decimal places.
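To see that the stored values keep their digits and only the default display shortens them, here is a minimal sketch using the first row from the question. The variable names are made up for illustration:

```matlab
% One line of the input, as text.
textLine = '-8.618643,41.141412';

% Split on the comma and convert; the doubles keep all the digits.
parts = strsplit(textLine, ',');
nums  = str2double(parts);

% Format with six digits after the decimal point for display.
shown = sprintf('%.6f,%.6f', nums(1), nums(2));
disp(shown)
```

The default short format would show these as -8.6186 and 41.1414, yet formatting the same variables with '%.6f' recovers the original text, so nothing was lost on input.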

dlmwriter puts space between each character

I am trying to write a quite large binary array to text file. My data's dimension is 1 X 35,000 and it is like :
0 0 0 1 0 0 0 .... 0 0 0 1
What I want to do is first add a string in the beginning of this array let's say ROW1 and then export this array to a text file with space delimiter.
What I have tried so far:
fww1 = strcat({'ROW_'},int2str(1));
fww2 = strtrim(cellstr(num2str((full(array(1,:)))'))');
new = [fww1 fww2];
dlmwrite('text1.txt', new,'delimiter',' ','-append', 'newline', 'pc');
As a result of this code I got:
R O W _ 1 0 0 0 0 1 ....
How can I get it as below:
ROW_1 0 0 0 0 1 ....

The most flexible way of writing to text files is using fprintf. There is a bit of a learning curve (you'll need to figure out the format specifiers, i.e. the %d etc.) but it's definitely worth it, and many other programming languages have some implementation of fprintf.
So for your problem, let's do the following. First, we'll open a file for writing.
fid = fopen('text1.txt', 'wt');
The 'wt' means that we'll open the file for writing in text mode. Next, let's write this string you wanted:
row_no = 1;
fprintf(fid, 'ROW_%d', row_no);
The %d is a format specifier that tells fprintf to replace it with a decimal representation of the given number. In this case it behaves a lot like int2str (maybe num2str is a better analogy, since it also works on non-integers).
Next, we'll write the row of data. Again, we'll use %d to specify that we want a decimal representation of the boolean array.
fprintf(fid, ' %d', array(row_no,:));
A couple of things to note. First, the format specifier also includes a space in front of every number, so that takes care of the delimiter. Second, we only specified a single format but passed an array of numbers. When faced with this, fprintf will just go on repeating the format until it runs out of numbers.
Next, we'll write a newline to indicate the end of the row (\n is one of the special characters recognized by fprintf):
fprintf(fid, '\n');
If you have more lines to write, you can put a for loop over these fprintf statements. Finally, we'll close the file so that the operating system knows we're done writing to it.
fclose(fid);
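Put together with the loop mentioned above, the steps might look like the following sketch. The 3x5 matrix is a made-up stand-in for the real data; if the real array is sparse, as the use of full in the question suggests, it would probably need to be wrapped in full(...) before being passed to fprintf:

```matlab
% Hypothetical 3x5 binary matrix standing in for the real data.
array = [0 0 0 1 0; 1 0 0 0 1; 0 1 1 0 0];

fid = fopen('text1.txt', 'wt');
for row_no = 1:size(array, 1)
    fprintf(fid, 'ROW_%d', row_no);         % row label
    fprintf(fid, ' %d', array(row_no, :));  % space-delimited values
    fprintf(fid, '\n');                     % end of row
end
fclose(fid);

% Read the first line back to check the layout.
fid = fopen('text1.txt', 'rt');
firstLine = fgetl(fid);
fclose(fid);
```

Note that 'wt' overwrites an existing file; opening with 'at' instead would append, which is closer to the '-append' option used in the question.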

Dimensions of matrices being concatenated are not consistent using array with characters

I'm trying to initialize
labels =['dh';'Dh';'gj';'Gj';'ll';'Ll';'nj';'Nj';'rr';'Rr';'sh';'Sh';'th';'Th';'xh';'Xh';'zh';'Zh';'ç';'Ç';'ë';'Ë'];
But it shows me the error in the title. When I try with numbers it all works perfectly, but not with characters. What could be the problem?

If you wish to eliminate any padding, you can also store it into a cell as follows.
labels = {'dh';'Dh';'gj';'Gj';
'll';'Ll';'nj';'Nj';
'rr';'Rr';'sh';'Sh';
'th';'Th';'xh';'Xh';
'zh';'Zh';'ç';'Ç';
'ë';'Ë'};
Then you can reference the "i"th element using labels{i} instead of labels(i,:) which is simpler. You can further run more string operations using cellfun and not interfere with any existing values that you've stored.

I agree with krisdestruction that using a cell array makes the code accessing the strings simpler and is generally more idiomatic. That is what I would also recommend unless there is a compelling reason to do something else.
For completeness, you could use the char function to add the padding automatically for you if you really want a character array:
>> char('aa','bb','c')
ans =
aa
bb
c
where the last row is 'c '. From the char documentation:
S = char(A1,...,AN) converts the arrays A1,...,AN into a single character array. After conversion to characters, the input arrays become rows in S. Each row is automatically padded with blanks as needed. An empty string becomes a row of blanks.
(Emphasis mine)
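A small sketch of the padding behaviour described above, using the same three strings. It also shows cellstr, which converts a padded character array back to a cell array and removes the trailing blanks:

```matlab
% Build a padded character array from strings of different lengths.
S  = char('aa', 'bb', 'c');
sz = size(S);        % rows x columns of the padded array

% Row 3 is 'c' followed by one blank added by char.
lastRow = S(3, :);

% cellstr converts back to a cell array and drops trailing blanks.
C = cellstr(S);
```

This makes it easy to move between the two representations if some later code expects a character array and other code expects a cell array.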

From the Mathworks documentation:
Apply the MATLAB concatenation operator, []. Separate each row with a semicolon (;). Each row must contain the same number of characters. For example, combine three strings of equal length:
You can try padding like this to make every row 2 characters:
labels = ['dh';'Dh';'gj';'Gj';
'll';'Ll';'nj';'Nj';
'rr';'Rr';'sh';'Sh';
'th';'Th';'xh';'Xh';
'zh';'Zh';'ç ';'Ç ';
'ë ';'Ë '];
