Modifying PDF file with Matlab by fopen - matlab

Is it possible to use MATLAB to fopen a PDF file and replace one string ('Helvetica') with another ('Arial')? Probably because the file is part binary and part ASCII, if I run
fid = fopen(filename, 'r');
str = fread(fid, '*char')';
fclose(fid);
newStr = strrep(str, 'Helvetica', 'Arial');
fid = fopen(filename, 'w');
fprintf(fid, '%s', newStr);
fclose(fid);
The PDF then becomes completely unusable. Is there a way to avoid this?
PS: 1) The PDF files vary widely in size, so skipping a fixed amount of binary data would be difficult;
2) I know how to do it in python, but I'd really like to see whether it could be done by pure MATLAB...
Thanks!

One way of doing this is to read the PDF as uint8 bytes instead of char and write the result back with fwrite:
fid = fopen(filename, 'r');
bytes = fread(fid, Inf, '*uint8')';
fclose(fid);
% Do the replacement.
% NB: older MATLAB releases let strrep operate on the numeric byte array
% (with a warning); newer releases may reject non-text input, so convert
% to char and back. char/uint8 map the values 0-255 one to one.
output = uint8(strrep(char(bytes), 'Helvetica', 'Arial'));
% Write out the modified pdf
fid = fopen(filename, 'w');
fwrite(fid, output);
fclose(fid);
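As a quick check of the byte-oriented approach above, here is a minimal sketch that round-trips every byte value through char and back. The temporary file and its contents are made up for the demonstration; it is not a real PDF. Keep in mind that swapping a 9-byte string for a 5-byte one shortens the file, and since a PDF's cross-reference table records byte offsets, a real PDF may still need repairing by the viewer.

```matlab
% Hypothetical test file (not a real PDF): all 256 byte values around the target string.
tmpFile = [tempname '.bin'];
original = uint8([0:255, double('/BaseFont /Helvetica'), 255:-1:0]);
fid = fopen(tmpFile, 'w');
fwrite(fid, original, 'uint8');
fclose(fid);

% Read raw bytes; '*uint8' keeps them as uint8 without any text decoding.
fid = fopen(tmpFile, 'r');
bytes = fread(fid, Inf, '*uint8')';
fclose(fid);

% char() and uint8() map the values 0-255 one to one, so strrep sees text
% while the binary parts pass through unchanged.
patched = uint8(strrep(char(bytes), 'Helvetica', 'Arial'));

% Write the patched bytes back, then clean up the demo file.
fid = fopen(tmpFile, 'w');
fwrite(fid, patched, 'uint8');
fclose(fid);
delete(tmpFile);
```

The binary prefix and suffix come out byte for byte identical; only the text in the middle changes.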

Related

Issue with format specification while reading from file Matlab

I have a .dat file with a table containing data in the following format:
0,000E+0 4,069E-2 -5,954E+0 1,851E-2
What I need to do is to read this data with matlab and then somehow handle it.
Here is my code:
path = 'C:/Users/user/Desktop/file1.dat';
fileID = fopen(path,'r');
formatSpec = '%e';
A = fscanf(fileID,formatSpec);
fclose(fileID);
disp(A);
Unfortunately, it doesn't work. What did I do wrong?
After replacing the decimal commas with dots in the data, you can read it with the dlmread function:
M = dlmread('filename', ' ');
M then holds the numeric matrix you want.
For the first part, replacing a character, you can use the following code:
% read the file
fid = fopen('input.txt','r');
f=fread(fid,'*char')';
fclose(fid);
%replace the char
f = strrep(f,',','.');
% write to another file
fid = fopen('output.txt','w');
fprintf(fid,'%s',f);
fclose(fid);
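If the intermediate output file is not needed, the same replacement can be done in memory. The following is only a sketch: the temporary file name is made up, and it assumes commas appear only as decimal separators, never as field delimiters.

```matlab
% Hypothetical sample file with decimal commas, as in the question.
datFile = [tempname '.dat'];
fid = fopen(datFile, 'w');
fprintf(fid, '0,000E+0 4,069E-2 -5,954E+0 1,851E-2\n');
fclose(fid);

txt = fileread(datFile);        % whole file as one char vector
txt = strrep(txt, ',', '.');    % decimal comma -> decimal point
A = sscanf(txt, '%f');          % column vector of doubles
delete(datFile);
```

Here sscanf parses the corrected text directly, so no second file has to be written and re-read.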

How to read a single character in file using MATLAB?

In my file data.txt, I have the string abcdefgh. I want to read just one character without reading the whole string. How can I do this in MATLAB?
For example, to get the first character I tried c = fscanf(data.txt, '%c'); and c = textscan(data.txt, '%c'); but both read the whole line in data.txt. I know that c(1) would give me the answer, but I don't want to do it that way.
You can limit the number of characters that are read in using the third input to either fscanf or textscan.
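fid = fopen('data.txt', 'r');
c = fscanf(fid, '%c', 1);      % read exactly one character
% Alternatively (textscan returns a cell array, so index into it):
% c = textscan(fid, '%c', 1); c = c{1};
fclose(fid);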
You could also just use a lower-level function such as fread to do this.
fid = fopen('data.txt', 'r');
c = fread(fid, 1, '*char');
fclose(fid);
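The same idea extends to a character that is not the first one: move the file position with fseek, then read a single character. This is a sketch with a made-up temporary file; it assumes a single-byte encoding, so that byte offsets match character positions.

```matlab
% Hypothetical demo file containing the string from the question.
dataFile = [tempname '.txt'];
fid = fopen(dataFile, 'w');
fprintf(fid, 'abcdefgh');
fclose(fid);

n = 3;                          % 1-based index of the character to read
fid = fopen(dataFile, 'r');
fseek(fid, n - 1, 'bof');       % skip the first n-1 bytes
c = fread(fid, 1, '*char');     % read exactly one character
fclose(fid);
delete(dataFile);
```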

Appending data to a file in Matlab, removing before a symbol

I have a file which is written via Matlab from a vector M with binary data values. This file is written with Matlab's fwrite in the following script myGenFile.m of the form function myGenFile(fName, M):
% open output file
fId = fopen(fName, 'W');
% start by writing some things to the file
fprintf(fId, '{DATA BITLENGTH:%d}', length(M));
fprintf(fId, '{DATA LIST-%d:#', ceil(length(M) / 8) + 1);
% pad to full bytes
lenRest = mod(length(M), 8);
M = [M, zeros(1, 8 - lenRest)];
% reverse order in bytes
M = reshape(M, 8, []);
MReversed = zeros(size(M));
for i = 1:8
MReversed(i,:) = M(9-i,:);
end
% len8 was undefined here; let reshape infer the length instead
MM = reshape(MReversed, 1, []);
fwrite(fId, MM, 'ubit1');
% write some ending of the file
fprintf(fId, '}');
fclose(fId);
Now I want to write a file myAppendFile.m, which appends some values to the existing file and has the following form: function myAppendFile(newData, fName). To do this I will have to remove the trailing '}':
fId = fopen(fName,'r');
oldData = textscan(fId, '%s', 'Delimiter', '\n');
% remove the last character of the file; aka the ending '}'
oldData{end}{end} = oldData{end}{end}(1:end-1);
The problem arises when I try to write oldData back to the file, because it is a cell array of cell arrays containing strings. (Writing newData should be trivial, since it is also a vector of binary data like M.)
How could I overcome this issue and append the new data correctly?
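Instead of using textscan, which copies the whole file into memory only to write it back, you could use fseek to set the file position to where you want to continue writing. Open the file for reading and writing ('r+'; append mode 'a' writes at the end of the file regardless of the position), move to one byte before the end of the file, and continue writing from there, overwriting the '}'.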
fseek(fid, -1, 'eof');
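Put together, the fseek approach might look like the sketch below. The file layout is a simplified stand-in for the one in the question, and all names are illustrative. Note that the header fields written earlier (BITLENGTH and the LIST count) would still describe the old length and would need updating separately.

```matlab
% Hypothetical file: a text header, some binary payload, and a closing '}'.
fName = [tempname '.dat'];
fid = fopen(fName, 'w');
fprintf(fid, '{DATA:#');
fwrite(fid, uint8([1 2 3]), 'uint8');
fprintf(fid, '}');
fclose(fid);

% Reopen for update ('r+'), step back over the trailing '}' and overwrite it.
fid = fopen(fName, 'r+');
fseek(fid, -1, 'eof');
fwrite(fid, uint8([4 5]), 'uint8');   % the new payload bytes
fprintf(fid, '}');                    % restore the closing brace
fclose(fid);

% Read everything back to inspect the result.
fid = fopen(fName, 'r');
result = fread(fid, Inf, '*uint8')';
fclose(fid);
delete(fName);
```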

Loop through text files, replace consecutive asterisks with 0.00

all,
I am writing a MATLAB program to read in text data and rearrange it, and I have run into a new problem.
When I write the data out to a CSV file, randomly missing values appear as ******, as shown below, which causes my program to terminate.
2055 6 17 24.2 29.57 7.02****** 0.99 2.65 2.73 4.09 0.11
Can anyone help me with a small program that loops through all the text files in a folder and replaces the consecutive stars with 0.00? The stars are always in columns 33 to 38, occupying 6 characters. I want them replaced by two spaces followed by 0.00.
Thanks,
James
For a given text file, you can read it into memory, replace the asterisks with the desired text, and then overwrite the original text file:
filename = 'blah.txt';
% Read it into memory
fid = fopen(filename, 'r');
scanned_fields = textscan(fid, '%s', 'Delimiter','\n');
fclose(fid);
% The first (and only) field of textscan will be our cell array of text
lines = scanned_fields{1};
% Replace the asterisks with the desired text
lines = strrep(lines, '******', '  0.00');
% Overwrite the original file
fid = fopen(filename, 'w');
fprintf(fid, '%s\n', lines{:});
fclose(fid);
To do this for all of the text files in your directory, you can use dir to get a list of files in your current directory that end in ".txt":
files = dir('*.txt');
filenames = {files.name};
And then loop over the files:
for ii = 1:length(filenames)
filename = filenames{ii};
% Read it into memory
fid = fopen(filename, 'r');
scanned_fields = textscan(fid, '%s', 'Delimiter','\n');
fclose(fid);
lines = scanned_fields{1};
% Replace the asterisks with the desired text
lines = strrep(lines, '******', '  0.00');
% Overwrite the original file
fid = fopen(filename, 'w');
fprintf(fid, '%s\n', lines{:});
fclose(fid);
% Go on to the next file
end
And of course, I would recommend creating a backup copy of this directory before running this, just in case something unexpected comes up.
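Since the question states that the stars always occupy columns 33 to 38, the replacement can also be restricted to exactly those columns, which avoids touching asterisks anywhere else on a line. A minimal sketch on made-up sample lines:

```matlab
% Hypothetical sample lines; the first has the missing-value marker in columns 33-38.
lines = {[repmat('1', 1, 32), '******', '  0.99']; ...
         [repmat('2', 1, 32), '  7.02', '  0.99']};

for k = 1:numel(lines)
    row = lines{k};
    % Only touch rows that are long enough and have stars in the expected columns.
    if numel(row) >= 38 && strcmp(row(33:38), '******')
        row(33:38) = '  0.00';   % same width, so later columns do not shift
        lines{k} = row;
    end
end
```

Because the replacement has the same width as the marker, the fixed-column layout of the rest of the line is preserved.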

Remove Characters from EOF while Writing to File in Matlab

In Matlab, after creating a certain number of lines and printing them to a file, I have the need to delete a line and rewrite the rest of the data to that same file. When I do so, the new data overwrites the previous data, but since the data is shorter than the original, there are still remnants of the original data. Does anyone have any idea what the best/most efficient way to delete that extra data is?
Here is a simplified example of what I'm trying to do:
fid = fopen('file.txt','w');
for i=1:10
fprintf(fid,'%i\r\t',i);
end
frewind(fid);
for i=3:5
fprintf(fid,'%i\r\t',i);
end
fprintf(fid,'EOF');
fclose(fid);
I've looked all over, but I can't seem to find the solution to my question. Any suggestions?
Without using any temp files, you can do the following:
fid = fopen('file.txt', 'wt');
for i=1:10
fprintf(fid, '%i\n', i);
end
frewind(fid);
for i=3:5
fprintf(fid, '%i\n', i);
end
pos = ftell(fid); % get current position in file
fclose(fid);
% read from beginning to pos
fid = fopen('file.txt', 'r');
data = fread(fid, pos);
fclose(fid);
% overwrite file with data read
fid = fopen('file.txt', 'w');
fwrite(fid, data);
fclose(fid);
Printing "EOF" won't work - nice try!
There are Unix system calls truncate and ftruncate that will do that, given either a path (truncate) or an open file descriptor (ftruncate) as the first argument and the desired length as the second.
I'd try and see if MATLAB supports ftruncate. Failing that, if worst came to worst, you could copy the file to a new file, stopping and closing the new file when you hit what you consider the end of the data.
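One way to truncate in place from within MATLAB is to go through its Java interface. This is an assumption-laden sketch, not a built-in MATLAB file function: it requires MATLAB to be running with the JVM enabled, and it uses java.io.RandomAccessFile and its setLength method. The file name is made up for the demonstration.

```matlab
% Hypothetical demo file; an absolute path is used because Java may not
% share MATLAB's current folder.
fName = [tempname '.txt'];
fid = fopen(fName, 'w');
fprintf(fid, '%d\n', 1:10);    % original, longer content
frewind(fid);
fprintf(fid, '%d\n', 3:5);     % shorter content written over the start
keep = ftell(fid);             % number of bytes that should survive
fclose(fid);

% Truncate the file to 'keep' bytes via Java (assumes the JVM is available).
raf = java.io.RandomAccessFile(fName, 'rw');
raf.setLength(keep);
raf.close();

contents = fileread(fName);
delete(fName);
```

After the call, the leftover tail of the original data is gone and the file holds only the three rewritten lines.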
To follow up on Carl Smotricz's suggestion of using two files, you can use MATLAB's DELETE and MOVEFILE commands to avoid system calls:
fid = fopen('file.txt','wt');
for i=1:10
fprintf(fid,'\t%i\r',i);
end
fclose(fid);
fid = fopen('file.txt','rt');
fidNew = fopen('fileNew.txt', 'wt');
for i = 1:2
s = fgetl(fid);
fprintf(fidNew, '%s\r', s);
end
for i=4:10
fprintf(fidNew, '\t%i\r', i);
end
fclose(fid);
fclose(fidNew);
delete('file.txt');
movefile('fileNew.txt', 'file.txt');