Printing data to text file - matlab

I am using the fprintf command to store the contents of a .mat file to a .txt. The .mat file contains strings. My code prints the data in the same column.
fid = fopen('exp.txt','wt');
for i=1:275
fprintf (fid,classes{i}{1})
end
fclose(fid);
When I use the \n and the '\r\n' options, they don't print anything to the file. I'd appreciate the help!

Some text editors will show those newline characters and some won't. This happens because different software and operating systems follow different standards, for example:
end of line sequences
Windows end of line sequence: \r\n
Unix end of line sequence: \n
Classic Mac OS (before OS X) end of line sequence: \r
So if you really want a good human-readable format, either settle on one operating system/software and use the line endings that system expects, or, if you want uniformity in the reports, write the files as standard HTML instead :)
Adding a "br" tag and naming the file .html is as simple as writing out '\n' and naming it .txt!
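For the loop in the question, one likely cause (an assumption, since the failing variant isn't shown) is that the newline never became part of the format string. A minimal sketch that passes each string as an argument to an explicit '%s\n' format; the sample contents of classes are hypothetical:

```matlab
% Hypothetical sample data standing in for the asker's `classes` cell array
classes = {{'cat'}, {'dog'}, {'bird'}};

fid = fopen('exp.txt', 'wt');   % 't' = text mode
for i = 1:numel(classes)
    % Use an explicit format with a trailing newline and pass the string
    % as data, instead of using the string itself as the format
    fprintf(fid, '%s\n', classes{i}{1});
end
fclose(fid);
```

Passing the string as data also avoids surprises when a string happens to contain % or \ characters, which fprintf would otherwise interpret as part of the format.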

Related

Strange behaviour when writing RTF file

I have a strange behaviour in Matlab dealing with RTF files.
The rtf file is read with this instruction:
cin = textread(filename, '%s', 'delimiter', '\n');
(cin) is a Nx1 cell where N is the number of rows of the file,
so I can edit some specific row.
I write the file RTF with the function below:
function dum= cell2rtf(cin, filename)
[row, col]= size(cin);
fout= fopen(filename, 'w');
for ii=1:row
    if(ii<row)
        fprintf(fout, '%s\r\n', cin{ii});
    else
        fprintf(fout, '%s', cin{ii});
    end
end
fclose(fout);
The strange behaviour is this one:
If the row cin{x} is a string with content
'19°\cell 19°\cell \cell \cell \cell 70°'
the same row appears like below when the file is written by the function
'19°\cell 19°\cell \cell \cell \cell 70°'
I can't understand why the char '°' becomes '°' in every occurrence
and I'd like to know how this can be corrected.
The problem is that textread() works on plain text files, and RTF files are not plain text in that sense: they contain markup and formatting codes alongside the visible text. textread() is probably encountering those formatting/structural codes and misinterpreting them as plain text characters, and that's where your junk characters are coming from.
Could you just save your RTF file as a plain text file and read from that?
Otherwise, you'll need to write an RTF parser or find an RTF parsing library and use that. Matlab works (kind of) easily with Java libraries, so you could use the Apache Tika library's RTFParser or RTFParserKit.

Compare filenames with different encoding in Octave

I'm trying to accomplish following task in Octave:
Read filename from text file
Search for this file in particular location on hard drive
My script works for most files, but for certain files containing unicode characters I'm unable to match the filename from textfile with filename as it appears in the file system.
Filenames in textfile are in UTF-8 encoding and I read them in Octave with function fgetl().
Filenames from file system are obtained via function readdir(). I'm on Windows, NTFS file system.
For example, one problematic filename contains character "Č".
When printed out in Octave console, the characters appear exactly the same. However, a HEX viewer reveals that the characters are not actually the same. In the first case the character is encoded as 0x010C, in the second case as 0x0043 + 0x030C. Comparing both of them via strcmp() fails, of course.
What I tried to do is to omit all non-ASCII characters from the filenames and then compare them. But this didn't work, probably because in the second variant the first part of the character (0x0043) is actually ASCII.
Now I'm looking for some way of converting one format to another to be able to compare them. Any ideas?
EDIT:
As I discovered later, the character Č in the filename on Windows is actually written as C+ˇ, which is just another way to write that character. So the difference probably isn't in the encoding standard, but in two different ways to produce one visible character (glyph).
The question then basically becomes a task of matching a character written "at once" against the corresponding pair of letter + combining character.
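The matching described in the edit can be sketched on vectors of Unicode code points. This is only an illustration under stated assumptions: the file name is hypothetical, only the single pair from the question (0x0043 + 0x030C to 0x010C) is handled, and converting real strings to code points is left out because it differs between MATLAB and Octave. A general solution would apply Unicode normalization (for example to NFC) to both names before comparing.

```matlab
% Code points quoted in the question
composed = 268;   % 0x010C, the precomposed character
base     = 67;    % 0x0043, plain "C"
mark     = 780;   % 0x030C, combining caron

% Hypothetical file name in both spellings, as code point vectors
tail      = double('aj.txt');
from_text = [composed, tail];     % as read from the text file
from_disk = [base, mark, tail];   % as returned by the file system

% Replace every base+mark pair with the precomposed code point
normalized = zeros(1, 0);
k = 1;
while k <= numel(from_disk)
    if k < numel(from_disk) && from_disk(k) == base && from_disk(k + 1) == mark
        normalized(end + 1) = composed;
        k = k + 2;
    else
        normalized(end + 1) = from_disk(k);
        k = k + 1;
    end
end

same_before = isequal(from_text, from_disk);    % the raw sequences differ
same_after  = isequal(from_text, normalized);   % they match after composing
```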

fprintf causing commas in Matlab

I am trying to output table data to a .dat file where I separate the rows by newlines and the column data by commas. I have this written for the first few rows:
fileID = fopen(strcat(filename,'.dat'), 'wt');
fprintf(fileID, '"","","","","","","",""\n');
fprintf(fileID, '"TIMESTAMP","RECORD","MuxAddress","Averages"\n');
fclose(fileID);
This should generate this text in the file:
"","","","","","","",""
"TIMESTAMP","RECORD","MuxAddress","Averages"
Unfortunately, the code actually generates this text:
"","","","","","","","",
"TIMESTAMP","RECORD","MuxAddress","Averages",
Which you can see has commas at the end of each line. This issue breaks a viewer program that I am using, and I can see no way to fix it. I have not found anyone else saying they have this issue either.
I have done some testing, and if I do a fprintf by itself with a newline, it does not put a comma, but as soon as I put a second fprintf, it creates commas at the end of both lines.
So it turns out that it all came down to the permission used to open the file. When the code opened the file for overwriting rather than appending, it added the commas. I never discovered WHY the commas were being added, but I did find that they went away if I appended to the file instead of overwriting it.
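The workaround described above amounts to changing the permission argument of fopen from 'wt' to 'at'. A minimal sketch, with a hypothetical value for filename:

```matlab
filename = 'example';   % hypothetical base name, not from the original post

% 'at' opens the file for appending in text mode instead of overwriting it
fileID = fopen(strcat(filename, '.dat'), 'at');
fprintf(fileID, '"","","","","","","",""\n');
fprintf(fileID, '"TIMESTAMP","RECORD","MuxAddress","Averages"\n');
fclose(fileID);
```

Because append mode keeps whatever the file already contains, an old copy has to be deleted first if a fresh file is wanted.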

What code format shows proper line breaks?

I am exporting some Access tables to txt files and there are a lot of problems with the txt file. One of those problems being line breaks not visible in the txt file itself. If I copy a line with a line break into Notepad++ from Notepad, it'll break into 2 lines.
So I believe this may be a code format problem, but I can't find the proper one to resolve this. I'm currently exporting to the default Western European, but should I export to UTF, Unicode, ASCII or something else?
When exporting from MS Access (or VB/VBA in general), make sure you're using vbCrLf constant (Carriage Return plus Line Feed) for line breaks. That constant corresponds to HEX values 0D 0A.
On Windows, the convention is to use the above two characters together as a line break, while many other platforms, such as Unix, Linux and macOS, typically use just 0A.
That brings up an issue: Notepad, the standard Windows text file viewer, historically could not deal with 0A alone and did not treat it as a line break (this changed in newer Windows 10 versions of Notepad). More advanced editors, such as Notepad++ or UltraEdit, display such files correctly, though.
The CSV export function in Microsoft Office applications (Excel, Access) terminates each data row with CR+LF and writes just LF for a line break within a data value (a multi-line string). (I think just CR was written for such line breaks in versions of Office before Office 2007.)
Most text editors detect those LFs without CR (or CRs without LF) and convert them to CR+LF when loading the CSV file. Viewed in a text editor, the file then appears to contain broken CSV lines, because rows whose data values contain a line break are split and no longer hold the right number of values.
However, newline characters within a double-quoted value are valid CSV, as described in the Wikipedia article on comma-separated values.
But most applications that support importing CSV files do not support newline characters within a double-quoted value, so some data values are imported incorrectly. Regular expression replaces are also hard to run on such a file, because the number of separator characters is not constant across lines.
UltraEdit has a special configuration setting for editing such CSV files with only LF (or CR) as the line break within a data value. At Advanced - Configuration - File Handling - DOS/Unix/Mac Handling, select either the option Never prompt to convert files to DOS format, or the option Prompt to convert if file is not DOS format (and click No when the prompt is displayed). Additionally, enable Only recognize DOS terminated lines (CR/LF) as new lines for editing.
With those settings, UltraEdit loads a CSV file that uses CR+LF at the end of each data row and only LF (or CR) for a line break within a data value so that the number of lines equals the number of data rows. The line feeds without carriage return (or carriage returns without line feed) are displayed within the lines as a small rectangle, because no font defines a glyph for a carriage return or line feed; they are whitespace characters with no width. A Perl regular expression search for \r(?!\n)|(?<!\r)\n can now be used to find those line breaks within data values and replace them with something else, such as a space character, or remove them.
Which character encoding (ASCII, ANSI, Unicode (UTF-16), UTF-8) to use on export depends on which characters can occur in the string values. A Unicode encoding is necessary if string values can contain characters not included in the local code page.
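The same kind of search and replace can be tried outside UltraEdit. As a rough sketch, assuming the CSV text is already in a character vector (the sample data here is made up), regexprep can replace every CR not followed by LF and every LF not preceded by CR with a space, leaving the CR+LF row terminators alone:

```matlab
CR = char(13);
LF = char(10);

% Made-up CSV text: rows end in CR+LF, values contain a lone LF or a lone CR
csv = ['a,"multi', LF, 'line",b', CR, LF, 'c,"x', CR, 'y",d', CR, LF];

% \r(?!\n)  matches a CR that is not followed by LF
% (?<!\r)\n matches an LF that is not preceded by CR
cleaned = regexprep(csv, '\r(?!\n)|(?<!\r)\n', ' ');
```

Note that the lookbehind has to come before the \n it guards; placed after it, the assertion would inspect the \n itself and match every line feed.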

fprintf not printing new line

I am trying to write a [2 x N] array of doubles to a text file using the fprintf() command. My problem is that fprintf() is not recognizing the newline command (\n) or the carriage return command (\r). The code I am using is
fid = fopen([Image.Dir,'CtlPts_',Image.Files{k},'.txt'],'w');
fprintf(fid,'%.4f\t%.4f\n',control_points{k});
fclose(fid);
where the data I am trying to print is in the cell control_points{k}.
The tab gets printed fine, but everything in the text file gets printed on one line, so this is why I am assuming that it is ignoring my new line character.
Is there something wrong with my syntax that I am not seeing?
On many systems, \n alone is not enough to produce a visible line break, so you may have to write \r\n instead.
An alternative solution is to open the file in text mode, that way MATLAB automatically inserts a carriage return \r before any newline \n character in the output on Windows systems:
fid = fopen('file.txt', 'wt');
fprintf(fid, '%f\t%f\n', rand(10,2));
fclose(fid);
Note that this is somewhat unnecessary, since most editors (with the exception of Microsoft Notepad) recognize Unix/Mac/Windows line endings.
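One detail worth keeping in mind with a two-column format: fprintf walks through a matrix in column order. A 2-by-N array as in the question therefore already prints one column per line, while an N-by-2 array needs a transpose to print one row per line. A small sketch with made-up data and a hypothetical file name:

```matlab
% Made-up N-by-2 data: one point per row
pts = [1 2; 3 4; 5 6];

fid = fopen('points.txt', 'wt');   % text mode, as suggested above
% fprintf consumes values column by column, so transpose the matrix
% to make each row of pts fill one '%.4f\t%.4f\n' line
fprintf(fid, '%.4f\t%.4f\n', pts.');
fclose(fid);
```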