Appending data to a file in Matlab, removing before a symbol

I have a file that is written from MATLAB from a vector M of binary data values. The file is written with MATLAB's fwrite in the following function myGenFile.m, with the signature function myGenFile(fName, M):
% open output file
fId = fopen(fName, 'W');
% start by writing some things to the file
fprintf(fId, '{DATA BITLENGTH:%d}', length(M));
fprintf(fId, '{DATA LIST-%d:#', ceil(length(M) / 8) + 1);
% pad to full bytes
% (note: this adds a full zero byte when length(M) is already a multiple of 8)
lenRest = mod(length(M), 8);
M = [M, zeros(1, 8 - lenRest)];
% number of bytes after padding
len8 = length(M) / 8;
% reverse order in bytes
M = reshape(M, 8, len8);
MReversed = zeros(8, len8);
for i = 1:8
    MReversed(i,:) = M(9-i,:);
end
MM = reshape(MReversed, 1, 8*len8);
fwrite(fId, MM, 'ubit1');
% write some ending of the file
fprintf(fId, '}');
fclose(fId);
Now I want to write a file myAppendFile.m, which appends some values to the existing file and has the following form: function myAppendFile(newData, fName). To do this I will have to remove the trailing '}':
fId = fopen(fName,'r');
oldData = textscan(fId, '%s', 'Delimiter', '\n');
% remove the last character of the file; aka the ending '}'
oldData{end}{end} = oldData{end}{end}(1:end-1);
The problem now is writing oldData back into the file, since it is a cell array of cell arrays containing strings. (Writing newData should be trivial, since it is a vector of binary data like M.)
How could I overcome this issue and append the new data correctly?

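Instead of using textscan, which reads the whole file into memory so that you then have to write it back, you can use fseek to move the file pointer to where you want to continue writing. Open the file with permission 'r+' so that the existing content is kept and the pointer can be moved, put the pointer one character before the end of the file, and continue writing from there.
fId = fopen(fName, 'r+');
fseek(fId, -1, 'eof');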
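To see the fseek approach end to end, here is a minimal sketch. The file name fseek_demo.txt and the text written are made up for the example, and it uses plain text rather than the bit-packed data from the question, so treat it as an illustration of overwriting the trailing '}' only.

```matlab
% Hypothetical demo file; any writable path would do.
fName = fullfile(tempdir, 'fseek_demo.txt');

% Create a small file that ends with a closing brace.
fId = fopen(fName, 'w');
fprintf(fId, '{DATA:abc}');
fclose(fId);

% Reopen for reading and writing without truncating the file,
% then step back one byte from the end so the next write overwrites '}'.
fId = fopen(fName, 'r+');
status = fseek(fId, -1, 'eof');   % fseek returns 0 on success, -1 on failure
assert(status == 0, 'fseek failed');
fprintf(fId, '%s}', 'def');       % append new content and restore the '}'
fclose(fId);

% Read the file back to check the result.
result = fileread(fName);
```

In the original setting, the header fields written by myGenFile (such as the bit length) would presumably need updating as well; that part is not covered here.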

Related

How to write multiple matrixes in a textfile in MATLAB?

I have three matrices of different sizes and I need to write them to a text file. This is what I tried:
fileID = fopen('results.txt','w');
fprintf(fileID,'HEADER\n');
fprintf(fileID,'\nmatrix1 = ');
fprintf(fileID,'%d',m1,'\n');
fprintf(fileID,'\nmatrix2 = ');
fprintf(fileID,'%d',m2,'\n');
fprintf(fileID,'\nresult = ');
fprintf(fileID,'%d',m3,'\n');
fclose(fileID);
The result is:
HEADER
matrix1 = 1111121121111111111132133113132333223333213233222212112411442341243123122112323313342431432334132434333341241424433334334333412414244333343321321212221211211222213213122212112112222132232232222231222222344333342243323232333224333343324223233443243343433343333334432433434333433333233443443434443444443444344343433443434443244224343444344444443341442444434434333413133242131123132234344433432434334433124313312212222124222241243323223113222323323343212434321111433213223121241442414334232433243434434412211241211113211121224333412141433122334444444444444444444492110
matrix2 = 1221314111312212211134432433434333433333211212122212112112224334432434244434444492110
result = 1041111041031091131061021111001071031011021061081001091059792110
But this isn't what I need. matrix1's size is 20x28, matrix2's size is 20x4 and matrix3's size is 1x20. They should look like matrices in the text file.
I also need to write many more matrices to the same file, so writing something new must not delete the data that is already there.
Here's a function that wraps MathWorks' existing dlmwrite, which represents matrices in a file in the way you want. The wrapper is necessary to allow different naming of multiple variables in the file:
function mwrite(filename, variableName, data, mode)
if nargin < 4, mode = 'w'; end % pass mode 'w' to overwrite or 'a' to append
f = fopen(filename, mode);
fprintf(f, '%s = [', variableName);
if numel(data) == size(data, 2)
    % row vector: write it on a single line
    fprintf(f, '%s];\n', num2str(data) );
    fclose(f); % close here too, otherwise the file handle leaks
else
    % matrix: let dlmwrite append the rows, then close the bracket
    fprintf(f, '\n');
    fclose(f);
    dlmwrite(filename, data, '-append');
    f = fopen(filename, 'a');
    fprintf(f, '];\n\n');
    fclose(f);
end
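For comparison, the row-by-row layout can also be produced with fprintf alone. This is a minimal sketch, assuming an integer-valued matrix; the file name results_demo.txt and the variable m are made up for the example. It relies on fprintf cycling through its format string over the data in column order, which is why the matrix is transposed.

```matlab
% Hypothetical demo data and file name.
m = [1 2 3; 4 5 6];
fName = fullfile(tempdir, 'results_demo.txt');

% Start the file with a header ('w' discards any previous content).
fileID = fopen(fName, 'w');
fprintf(fileID, 'HEADER\n');
fclose(fileID);

% Reopen with 'a' so the header written earlier is kept.
fileID = fopen(fName, 'a');
fprintf(fileID, 'matrix1 =\n');
% One '%d' per column, separated by spaces, ending each row with a newline.
rowFormat = [repmat('%d ', 1, size(m, 2) - 1), '%d\n'];
% fprintf consumes elements column by column, so pass the transpose
% to get the rows of m printed one per line.
fprintf(fileID, rowFormat, m.');
fclose(fileID);

written = fileread(fName);
```

Calling the append part again with another matrix adds it below the existing content instead of replacing it.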

How to read a single character in file using MATLAB?

In my file data.txt, I have the string abcdefgh. I want to read just one character without reading the whole string. How can I do this in MATLAB?
For example, to take the first character I tried c = fscanf(data.txt, '%c'); and c = textscan(data.txt, '%c'); but both read the whole line in data.txt. I know that c(1) is my answer, but I don't want to do that.
You can limit the number of characters that are read in using the third input to either fscanf or textscan.
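fid = fopen('data.txt', 'r');
c = fscanf(fid, '%c', 1);   % first character, returned as a char
frewind(fid);               % the read above advanced the file position, so go back to the start
c = textscan(fid, '%c', 1); % alternative: first character, wrapped in a 1x1 cell array
fclose(fid);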
You could also just use a lower-level function such as fread to do this.
fid = fopen('data.txt', 'r');
c = fread(fid, 1, '*char');
fclose(fid);
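A self-contained sketch of the same idea, which also shows how to pick a character other than the first by moving the file position with fseek. The file name data_demo.txt is made up for the example, and the offset arithmetic assumes one byte per character (plain ASCII).

```matlab
% Hypothetical demo file containing the string from the question.
fName = fullfile(tempdir, 'data_demo.txt');
fid = fopen(fName, 'w');
fprintf(fid, 'abcdefgh');
fclose(fid);

fid = fopen(fName, 'r');
% Read exactly one character from the start of the file.
firstChar = fscanf(fid, '%c', 1);
% Jump to byte offset 4 (zero-based) and read one character from there.
fseek(fid, 4, 'bof');
fifthChar = fread(fid, 1, '*char');
fclose(fid);
```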

MATLAB reads UNICODE CSV with spaces between characters

I am using the fgetl command to read a .csv file but instead of returning the results I wanted as:
"HIST",1,1,27,PWH,"1"
it returned with additional space between each character:
" H I S T " , 1 , 1 , 2 7 , P W H , " 1 "
I know that I can replace the space with regexprep, but my file contains billions of lines so the added expression might consume considerably more time. I had a feeling that this is a unicode issue and someone pointed out the same issue when he used Java and it was related to unicode. I wonder if anyone knows a better way to deal with the problem in MATLAB?
Update:
It is most likely a Unicode issue, because the .csv file is the output of another program, and the spaces are added when I read it using fgetl. However, if I save the .csv file again using Excel and read it with fgetl again, it returns the results I want.
I am not able to provide an example because the .csv file is very large, and I cannot make a small sample because the problem disappears when I open and save the file in Excel.
For the purpose of demonstration, let's consider a demo file - demo.csv:
"GIST",1,6,17,PWH,"1"
"FIST",0,4,72,WPH,"2"
"MIST",3,2,27,WHP,"3"
You have some options:
textscan (for any text file with a known structure):
fID = fopen('demo.csv');
C = textscan(fID,'%s%d%d%d%s%s','Delimiter',{',','"'},'MultipleDelimsAsOne',1);
fclose(fID);
Which results in:
C =
{3x1 cell} [3x1 int32] [3x1 int32] [3x1 int32] {3x1 cell} {3x1 cell}
Import helper + generate script (AKA overkill is an understatement):
Which results in:
%% Import data from text file.
% Script for importing data from the following text file:
%
% F:\demo.csv
%
% To extend the code to different selected data or a different text file, generate a
% function instead of a script.
% Auto-generated by MATLAB on 2016/04/20 19:51:32
%% Initialize variables.
filename = 'F:\demo.csv';
delimiter = ',';
%% Read columns of data as strings:
% For more information, see the TEXTSCAN documentation.
formatSpec = '%q%q%q%q%q%q%[^\n\r]';
%% Open the text file.
fileID = fopen(filename,'r');
%% Read columns of data according to format string.
% This call is based on the structure of the file used to generate this code. If an error
% occurs for a different file, try regenerating the code from the Import Tool.
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'ReturnOnError', false);
%% Close the text file.
fclose(fileID);
%% Convert the contents of columns containing numeric strings to numbers.
% Replace non-numeric strings with NaN.
raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
for col=1:length(dataArray)-1
    raw(1:length(dataArray{col}),col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[2,3,4,6]
    % Converts strings in the input cell array to numbers. Replaced non-numeric strings with
    % NaN.
    rawData = dataArray{col};
    for row=1:size(rawData, 1);
        % Create a regular expression to detect and remove non-numeric prefixes and suffixes.
        regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
        try
            result = regexp(rawData{row}, regexstr, 'names');
            numbers = result.numbers;
            % Detected commas in non-thousand locations.
            invalidThousandsSeparator = false;
            if any(numbers==',');
                thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
                if isempty(regexp(numbers, thousandsRegExp, 'once'));
                    numbers = NaN;
                    invalidThousandsSeparator = true;
                end
            end
            % Convert numeric strings to numbers.
            if ~invalidThousandsSeparator;
                numbers = textscan(strrep(numbers, ',', ''), '%f');
                numericData(row, col) = numbers{1};
                raw{row, col} = numbers{1};
            end
        catch me
        end
    end
end
%% Split data into numeric and cell columns.
rawNumericColumns = raw(:, [2,3,4,6]);
rawCellColumns = raw(:, [1,5]);
%% Allocate imported array to column variable names
GIST = rawCellColumns(:, 1);
VarName2 = cell2mat(rawNumericColumns(:, 1));
VarName3 = cell2mat(rawNumericColumns(:, 2));
VarName4 = cell2mat(rawNumericColumns(:, 3));
PWH = rawCellColumns(:, 2);
VarName6 = cell2mat(rawNumericColumns(:, 4));
%% Clear temporary variables
clearvars filename delimiter formatSpec fileID dataArray ans raw col numericData rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me rawNumericColumns rawCellColumns;
csvread (for numeric values only; which means it is not applicable here).
I happened to have the same issue. I opened a .csv file using textscan and it added one whitespace on both sides of every character. I also noticed that, when opening the variable storing the read data, the font was different from the usual one in MATLAB.
We managed to solve this by opening the .csv file in Notepad++ and changing the encoding to UTF-8.
Hope it helps!
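If the file is in fact UTF-16 (an assumption; the thread only suspects a Unicode issue), every ASCII character is stored as two bytes, one of them zero, which would explain the apparent spaces. A rough sketch of reading such a file directly, without re-saving it, is to read 16-bit code units instead of bytes. The file name utf16_demo.csv is made up, and little-endian byte order with a leading byte-order mark is assumed.

```matlab
% Build a small UTF-16LE demo file (hypothetical name), including a BOM.
fName = fullfile(tempdir, 'utf16_demo.csv');
txt = sprintf('"HIST",1,1,27,PWH,"1"\n');
fid = fopen(fName, 'w', 'l');                 % 'l' = little-endian byte order
fwrite(fid, [65279, double(txt)], 'uint16');  % 65279 = 0xFEFF, the byte-order mark
fclose(fid);

% Read the file as 16-bit code units and convert them to characters.
fid = fopen(fName, 'r', 'l');
raw = fread(fid, Inf, 'uint16=>char').';
fclose(fid);

% Drop the byte-order mark if present.
if ~isempty(raw) && raw(1) == char(65279)
    raw(1) = [];
end

% Split into lines; no extra spaces between characters remain.
lines = strsplit(strtrim(raw), sprintf('\n'));
```

For a very large file, the same fread call could be given a finite element count and run in a loop, so that only one chunk is held in memory at a time.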

MATLAB CSV File Read

I have a CSV file with the following content:
Header line1
Space
Space
Space
,1,2,3,
1,81,82,83
And I am trying to read the data portion into a numeric matrix.
Here is the code I have implemented, however I am having issues.
%To get the number of rows in the file
for i = 1:9
headerline = fgetl(fid);
headerline = strsplit(headerline,',')
end
fclose(fid);
fopen(fid);
% to get the data
C = textscan(fid,'%s','headerline',4,'EmptyValue',=Inf)
rowsize = size(C{1});
data = []
% to store data in matrix
for i = 1:rowsize
data = [data, strsplit(C{1}{i},',')];
end
Can anybody recommend a better way to just read the whole file into a numeric matrix? Thanks!
All you really need is this:
fid = fopen('your.csv');
data = cell2mat(textscan(fid, '%f%f%f%f', 'Delimiter', ',', 'HeaderLines', 4));
fclose(fid);
You could also use csvread (https://www.mathworks.com/help/matlab/ref/csvread.html) if your csv just contains numeric values.
M = csvread(filename,R1,C1) reads data from the file starting at row offset R1 and column offset C1. For example, the offsets R1=0, C1=0 specify the first value in the file.
So in this case:
data = csvread('filename.csv', 4, 0)
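A runnable sketch of the textscan variant, for trying it out. The file name header_demo.csv and its contents are made up: four header lines followed by purely numeric rows, which is simpler than the file in the question (no leading or trailing empty fields).

```matlab
% Write a small demo file: four header lines, then two numeric rows.
fName = fullfile(tempdir, 'header_demo.csv');
fid = fopen(fName, 'w');
fprintf(fid, 'Header line1\nSpace\nSpace\nSpace\n');
fprintf(fid, '1,81,82,83\n2,91,92,93\n');
fclose(fid);

% Skip the header lines and read four comma-separated numeric columns.
fid = fopen(fName, 'r');
C = textscan(fid, '%f%f%f%f', 'Delimiter', ',', 'HeaderLines', 4);
fclose(fid);

% textscan returns one cell per column; join them into a numeric matrix.
data = cell2mat(C);
```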

Modifying PDF file with Matlab by fopen

Is it possible to use MATLAB to fopen a PDF file and replace a string ('Helvetica') with a new string ('Arial')? Probably because the file is part binary and part ASCII, if I do the following:
fid = fopen(filename, 'r');
str = fread(fid, '*char')';
fclose(fid);
newStr = strrep(str, 'Helvetica', 'Arial');
fid = fopen(filename, 'w');
fprintf(fid, '%s', newStr);
fclose(fid);
the PDF becomes completely unusable. Is there a way to avoid this?
PS: 1) The PDF file may have very different sizes, so skipping a certain amount of binary data may be difficult;
2) I know how to do it in python, but I'd really like to see whether it could be done by pure MATLAB...
Thanks!
One way of doing this is to read the PDF as uint8 instead of char, and to write it out with fwrite:
fid = fopen(filename, 'r');
bytes = fread(fid, 'uint8')';
fclose(fid);
% Do the replacement
% NB: strrep complains about the byte array but works anyway
% You could do replacement without using string function
% but this works.
output = strrep(bytes,'Helvetica','Arial');
% Write out the modified pdf
fid = fopen(filename, 'w');
fwrite(fid, output);
fclose(fid);
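A variant that avoids passing a numeric array to strrep is to map each byte to the character with the same code, do the replacement, and map back before writing. The sketch below uses a made-up binary file (bytes_demo.bin) rather than a real PDF, just to check that all 256 byte values survive the round trip. Keep in mind that changing the length of a string shifts byte offsets, which a PDF's cross-reference table depends on, so a real PDF may still need repair after such an edit.

```matlab
% Build a demo file with every byte value plus the text to replace.
fName = fullfile(tempdir, 'bytes_demo.bin');
original = [uint8(0:255), uint8('/BaseFont /Helvetica'), uint8([255 0 128])];
fid = fopen(fName, 'w');
fwrite(fid, original, 'uint8');
fclose(fid);

% Read raw bytes and map each byte to the char with the same code (0..255),
% so no character-encoding conversion is applied.
fid = fopen(fName, 'r');
str = fread(fid, Inf, 'uint8=>char').';
fclose(fid);

% Do the replacement on the char vector.
newStr = strrep(str, 'Helvetica', 'Arial');

% Convert back to bytes and write them out unchanged.
fid = fopen(fName, 'w');
fwrite(fid, uint8(newStr), 'uint8');
fclose(fid);

% Read back for checking.
fid = fopen(fName, 'r');
check = fread(fid, Inf, '*uint8').';
fclose(fid);
expected = [uint8(0:255), uint8('/BaseFont /Arial'), uint8([255 0 128])];
```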