Matlab misreading ASCII text file

I have a problem analyzing some text files with Matlab, which garbles some of the text. I am using R2017a (9.2.0.538062) 64-bit (maci64). Please note the accented characters.
Other text editors are reading the file ("War and Peace.txt") correctly (Textmate, Emacs, Textedit, and GNU Octave), as well as other programs (Python, Ruby, Mathematica).
It was in July, 1805, and the speaker was the well-known Anna Pávlovna Schérer, maid of honor and favorite of the Empress Márya Fëdorovna.
Whereas in Matlab
It was in July, 1805, and the speaker was the well-known Anna Pávlovna Schérer, maid of honor and favorite of the Empress Márya Fëdorovna.
My Question
Is there a Matlab setting (in preferences?) that will read ASCII text accurately? Matlab appears to be garbling valid ASCII characters (mostly in the 200-256 range).

I faced the same problem when trying to read strings from a text file. In my case, the cause was that I had saved the .txt file in ANSI encoding. After many trials, I came up with a solution. First, save the file in UTF-8 encoding.
Then, in your MATLAB code, specify the encoding in the fopen command.
A test code can be something like:
close all;clearvars;clc;
fileID = fopen('text.txt', 'r', 'n', 'UTF-8');
C = textscan(fileID, '%s');
fclose(fileID);
celldisp(C)
The output of this code would be:
C{1}{1} =
It
C{1}{2} =
was
C{1}{3} =
in
C{1}{4} =
July,
C{1}{5} =
1805,
C{1}{6} =
and
C{1}{7} =
the
C{1}{8} =
speaker
C{1}{9} =
was
C{1}{10} =
the
C{1}{11} =
well-known
C{1}{12} =
Anna
C{1}{13} =
Pávlovna
C{1}{14} =
Schérer,
C{1}{15} =
maid
C{1}{16} =
of
C{1}{17} =
honor
C{1}{18} =
and
C{1}{19} =
favorite
C{1}{20} =
of
C{1}{21} =
the
C{1}{22} =
Empress
C{1}{23} =
Márya
C{1}{24} =
Fëdorovna.
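To see why the encoding argument matters, here is a minimal round-trip sketch, assuming MATLAB. The temporary file and the variable names (fn, word, readBack, misread) are made up for illustration and are not part of the original answer. It writes one accented word as UTF-8, reads it back with the matching encoding, and then reads it again with a mismatched single-byte encoding.

```matlab
% Hypothetical temporary file; 'word' stands in for text with an accented letter.
fn = [tempname() '.txt'];
word = ['P' char(225) 'vlovna'];      % "Pavlovna" with an accented a (code point 225)

% Write the word as UTF-8.
fid = fopen(fn, 'w', 'n', 'UTF-8');
fprintf(fid, '%s\n', word);
fclose(fid);

% Read it back with the matching encoding.
fid = fopen(fn, 'r', 'n', 'UTF-8');
readBack = fgetl(fid);
fclose(fid);

% Read it again with a mismatched single-byte encoding (an assumption chosen
% only to provoke the garbling).
fid = fopen(fn, 'r', 'n', 'ISO-8859-1');
misread = fgetl(fid);
fclose(fid);

delete(fn);
disp(isequal(readBack, word));        % matching encoding should round-trip
disp(numel(misread) > numel(word));   % mismatch should yield extra characters
```

With a mismatched encoding, each multi-byte UTF-8 character is decoded as several separate characters, which is the kind of garbling described in the question.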

Related

matlab text read and write %s character (without escaping)

Dear All (with many thanks in advance),
The following script has trouble reading (and therefore writing) the %s character in the file 'master.py'.
I get that matlab thinks the %s is an escape character, so perhaps an option is to modify the terminator, but I have found this difficult.
(EDIT: Forgot to mention the file master.py is not in my control, so I can't modify the file to %%s for example).
%matlab script
%===============
fileID = fopen('script.py','w');
yMax=5;
fprintf(fileID,'yOverallDim = %d\n', -1*yMax);
%READ IN "master.py" for rest of script
fileID2 = fopen('master.py','r');
currentLine = fgets(fileID2);
while ischar(currentLine)
    fprintf(fileID,currentLine);
    currentLine = fgets(fileID2);
end
fclose(fileID);
fclose(fileID2);
The file 'master.py' looks like this (the problem is on line 6, 'setName ="Set-%s"%(i+1)'):
i=0
for yPos in range (0,yOverallDim,yVoxelSize):
    yCoordinate=yPos+(yVoxelSize/2) #
    for xPos in range (0,xOverallDim,xVoxelSize):
        xCoordinate=xPos+(xVoxelSize/2)
        setName ="Set-%s"%(i+1)
        p = mdb.models['Model-1'].parts['Part-1']
        # p = mdb.models['Model-1'].parts['Part-2']
        c = p.cells
        cells = c.findAt(((xCoordinate, yCoordinate, 10.0), ))
        region = p.Set(cells=cells, name=setName)
        p.SectionAssignment(region=region, sectionName='Section-1', offset=0.0, offsetType=MIDDLE_SURFACE, offsetField='', thicknessAssignment=FROM_SECTION)
        i+=1
In the documentation of fprintf you'll find this:
fprintf(fileID,formatSpec,A1,...,An) applies the formatSpec to all elements of arrays A1,...An in column order, and writes the data to a text file.
So in your script, fprintf uses currentLine as the format specification, which produces unexpected output for line 6. Passing an explicit formatSpec fixes this and doesn't require any replace operations:
fprintf(fileID, '%s', currentLine);
Your script has no trouble reading the % characters correctly. The "problem" is with fprintf(). This function correctly interprets the percent signs in the string as formatting characters. Therefore, I think you have to manually escape every single % character in your currentLine string:
currentLine = strrep(currentLine, '%', '%%');
At least, it worked when I checked it on your example data.
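The two approaches can be compared side by side with sprintf, which uses the same format rules as fprintf but returns the text instead of writing it. This is only an illustrative sketch: the variable names (srcLine, viaFormat, viaEscape, withSlash, keptSlash) and the second sample line are made up.

```matlab
% Line 6 of master.py from the question.
srcLine = 'setName ="Set-%s"%(i+1)';

% Approach 1: pass the text as data to an explicit '%s' format.
viaFormat = sprintf('%s', srcLine);

% Approach 2: escape every percent sign, then use the text as the format.
viaEscape = sprintf(strrep(srcLine, '%', '%%'));

% A hypothetical line containing a literal backslash sequence.
withSlash = 'print("a\nb")';
keptSlash = sprintf('%s', withSlash);   % data arguments are not escape-processed

disp(isequal(viaFormat, srcLine));
disp(isequal(viaEscape, srcLine));
disp(isequal(keptSlash, withSlash));
```

Both approaches reproduce this particular line. The '%s' form has the advantage that backslash sequences in the copied text are also left alone, whereas escaping only the percent signs would still let the format parser interpret a sequence such as \n.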
Thanks applesoup for identifying my fundamental oversight - the problem is in the fprintf - not in the file read
Thanks serial for enhancing the fprintf

Read data to matlab with for loop

I want to read the data of a file with size about 60 MB into matlab in some variables, but I get errors. This is my code:
clear all ;
clc ;
% Reading Input File
Dataz = importdata('leak0.lis');
%Dataz = load('leak0.lis');
for k = 1:1370
    foundPosition = 1 ;
    for i=1:size(Dataz,1)
        strp = sprintf('I%dz=',k);
        fprintf(strp);
        findValue = strfind(Dataz{i}, strp) ;
        if ~isempty(findValue)
            eval_param = strp + '(foundPosition) = sscanf(Dataz{i},''%*c%*c%*f%*c%*c%f'') ;';
            disp(eval_param);
            % str(foundPosition) = sscanf(Dataz{i},'%*c%*c%*f%*c%*c%f') ;
            eval(eval_param);
            foundPosition = foundPosition + 1 ;
        end
    end
end
When I debugged it, I found that Dataz is empty, so the code doesn't proceed to the next lines. I replaced importdata with fopen, load, etc., but that didn't work either.
Based on the Matlab help files, importdata is likely failing because it doesn't understand your file format.
From the help files:
Name and extension of the file to import, specified as a string. If importdata recognizes the file extension, it calls the MATLAB helper function designed to import the associated file format (such as load for MAT-files or xlsread for spreadsheets). Otherwise, importdata interprets the file as a delimited ASCII file.
For ASCII files and spreadsheets, importdata expects to find numeric
data in a rectangular form (that is, like a matrix). Text headers can
appear above or to the left of the numeric data, as follows:
Assuming that your .lis file actually contains delimited text, you should adjust the delimiter in the importdata call so that Matlab can understand your file:
filename = 'myfile01.txt';
delimiterIn = ' ';
headerlinesIn = 1;
A = importdata(filename,delimiterIn,headerlinesIn);
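If the file turns out not to be rectangular delimited data, another option (not from the answer above, just a sketch) is to read it line by line into a cell array, which matches the Dataz{i} indexing in the question. The layout of 'leak0.lis' is not shown, so the sample content below is invented, as are the names fn, textLines and hits.

```matlab
% Hypothetical stand-in for 'leak0.lis'; the real file's layout is unknown.
fn = [tempname() '.lis'];
fid = fopen(fn, 'w');
fprintf(fid, 'I1z= 0.5\nother text\nI1z= 0.7\n');
fclose(fid);

% Open for reading and fail loudly if the file cannot be opened.
fid = fopen(fn, 'r');
if fid == -1
    error('Could not open %s', fn);
end

% Collect every line as one cell; fgetl returns -1 (not char) at end of file.
textLines = {};
tline = fgetl(fid);
while ischar(tline)
    textLines{end+1, 1} = tline;
    tline = fgetl(fid);
end
fclose(fid);
delete(fn);

% Keep only the lines that contain the marker of interest.
hits = textLines(~cellfun(@isempty, strfind(textLines, 'I1z=')));
disp(numel(textLines));
disp(numel(hits));
```

Checking the return value of fopen also helps distinguish "file not found" from "file read but nothing matched", which an empty result alone cannot tell you.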

Octave: Problems with load

I'm currently doing a program in Octave where I want the user to be able to insert the file that he wants to load. The files in question are .mat files and are loaded with
load ("filename.mat")
I was thinking about doing something like this:
file=input("Whats the file name: ")
load ("file")
But that didn't work...
Anyone got any tips?
That's likely because you need to input the file name enclosed in single quotation marks: 'filename'. (Note: I use MATLAB, but that should work just the same in Octave.)
As an alternative, you can use inputdlg to request user input. It gives you a lot of flexibility, since you can add fields to the prompt, such as the file extension.
Here is a simple example:
clear
clc
prompt = {'Enter file name'};
dlg_title = 'Input';
num_lines = 1;
def = {'Dummy file'};
answer = inputdlg(prompt,dlg_title,num_lines,def)
You can fetch the answer like so:
name = answer{1};
And finally add the extension to load the .mat file:
filename = strcat(name,'.mat')
S = load(filename)
To do it in one go with the file extension:
prompt = {'Enter file name'; 'Enter file extension'};
dlg_title = 'Input';
num_lines = 1;
def = {'Dummy file'; '.mat'};
answer = inputdlg(prompt,dlg_title,num_lines,def)
name = answer{1};
extension = answer{2};
filename = strcat(name,extension)
S = load(filename)
Hope that helps!
I used Benoit_11's method but changed it to input instead since inputdlg doesn't seem to work in Octave.
clear
clc
name=input('Enter the file name, without the file extension: ','s')
filename = strcat(name,'.mat')
S = load(filename)
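Note that the original attempt, load ("file"), asks for a file literally named "file" rather than using the contents of the variable. A small non-interactive sketch of passing the variable instead, with a made-up temporary file and a made-up variable x standing in for the user's data:

```matlab
% Hypothetical file name standing in for what the user would type.
filename = [tempname() '.mat'];

% Create a small file to load back.
x = 42;
save(filename, 'x');

% Pass the variable itself, not the quoted text "filename".
S = load(filename);
delete(filename);
disp(S.x);
```

Assigning the result to S keeps the loaded variables in a struct, so they do not silently overwrite variables in the workspace.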

How to read digits from file to matrix, no delimiter

I have data stored in the format below, with no delimiter; the digits are drawn from {0,1}. Using Octave, I am having trouble reading the digits and storing them in a matrix, and I have not managed either scenario below. How can I read those digits and store them in a matrix as described below?
Data in File, 32 x 32 digits
00000000000000000000000000000000
00000000001111110000000000000000
...
00000010000000100001000000000000
how to store data
matrix[1, 1:32] = 00000000000000000000000000000000
matrix[2, 1:32] = 00000000001111110000000000000000
. . .
matrix[32, 1:32] = 00000010000000100001000000000000
OR
matrix[1, 1:32] = 00000000000000000000000000000000
matrix[1, 33:64] = 00000000001111110000000000000000
. . .
matrix[1, 993:1024] = 00000010000000100001000000000000
One possible solution is to read the data as a string first:
octave> textread('foo.dat', '%s', 'headerlines', 2)
ans =
{
[1,1] = 00000000000000000000000000000000
[2,1] = 00000000001111110000000000000000
...
}
If these are binary representations of decimals, you may find bin2dec() useful.
This would do the trick (though I don't know how well that third input to fread and arrayfun work with Octave, tested this on Matlab):
fid = fopen('a.txt','rt');
str = fread(fid,inf,'char=>char');
st = fclose(fid);
qrn = str==10|str==13;
str(qrn) = [];
yourMat = reshape(arrayfun(@str2num,str),find(qrn,1)-1,[]).'
Assuming you don't have header lines, you can read the text in as a cell array of strings like so:
C = textread('names.txt', '%s');
Then, in general for all digits from 0 to 9, you can transform this into a matrix like so:
M = vertcat(C{:})-'0';
If performance is an issue you can look into other ways to import the strings, but this should get the job done.
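A small worked example of the character arithmetic, using a made-up 3 x 4 input in place of the 32 x 32 file (the names C, M and rowVec are illustrative). It covers both layouts asked for in the question: one row per line, and everything in a single row.

```matlab
% Hypothetical digit strings, as they would come back from reading the file.
C = {'0010'; '1100'; '0001'};

% Stack the strings into a char matrix, then subtract the code of '0'
% so that '0' becomes 0 and '1' becomes 1.
M = vertcat(C{:}) - '0';

% Second layout: all rows laid end to end in a single row vector.
% Transposing first makes reshape walk the original rows in order.
rowVec = reshape(M.', 1, []);

disp(M);
disp(rowVec);
```

vertcat requires all strings to have the same length, which holds for the fixed-width lines described in the question.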
I have never used Matlab, but assuming it reads files the same way Octave does, and if using an external tool is OK, you could try replacing the characters to add a delimiter using a text editor. You could change every "0" to "0," and every "1" to "1," and then simply load the file.
(This would add a delimiter at the end of every line. In case that creates a problem, you could try replacing your text by pairs instead "00"->"0,0" "10" -> "1,0" and so on)
In case the file is too big for a normal editor, you might even try replacing the characters with sed:
sed -i 's/charactertoreplace/newcharacter/g' yourfile.txt

saving common strings to a new text file in MATLAB

To get similar files among different text files I've used ismember()
file1 = {'DSC01605.bmp';'Hampi8.bmp';'DSC01633.bmp';...
'DSC01198.bmp';'DSC01619.bmp'}
file2 = {'DSC01605.bmp';'Hampi8.bmp';'DSC01633.bmp'}
file3 = {'DSC01605.bmp';'Hampi8.bmp'}
matching12 = ismember(file1, file2)
matching13 = ismember(file1, file3)
matchesAll3 = matching12 & matching13
allMatchingStrings = file1(matchesAll3)
Now allMatchingStrings contains
'DSC01605.bmp'
'Hampi8.bmp'
How can I write these file names to a new text file, all.txt? The complication is this: suppose allMatchingStrings contains around 10 file names, but I need only 5 of those 10. I need to save those 5 to a new text file, say all.txt. How can I do that?
A quick way to write them to disk is with the fprintf command.
fid = fopen('all.txt', 'w');
fprintf(fid, '%s\n', allMatchingStrings{:});
fclose(fid);
If you only wanted to write the first 2 filenames in allMatchingStrings then you could limit like this:
filenamesIWant = 1:2;
fid = fopen('all2.txt', 'w');
fprintf(fid, '%s\n', allMatchingStrings{filenamesIWant});
fclose(fid);
This works because the fprintf command repeats the format for each string you give it. The only trick is getting the curly brackets in the right place.
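The same format recycling can be seen with sprintf, which returns the text instead of writing it to a file. In this sketch the index vector keep is an arbitrary, made-up selection, showing that the chosen entries do not have to be the first ones:

```matlab
% File names taken from the question.
names = {'DSC01605.bmp'; 'Hampi8.bmp'; 'DSC01633.bmp'};

% Hypothetical selection: any index vector works, not just 1:n.
keep = [1 3];

% names{keep} expands to separate arguments, and the format is reused for each.
out = sprintf('%s\n', names{keep});
disp(out);
```

Replacing sprintf with fprintf(fid, ...) writes the same text to the open file.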