Modify the value of a specific position in a text file in Matlab - matlab

AIR, ID
AIR.SIT
50 1 1 1 0 0 2 1
43.57 -116.24 1. 857.7
Hi, All,
I have a text file like above. Now in Matlab, I want to create 5000 text files, changing the number "2" (the specific number in the 3rd row) from 1 to 5000 in each file, while keeping other contents the same. In every loop, the changed number is the same with the loop number. And the output in every loop is saved into a new text file, with the name like AIR_LoopNumber.SIT.
I've spent some time writing on that. But it is kind of difficult for a newby. Here is what I have:
% - Read source file.
fid = fopen ('Air.SIT');
n = 1;
textline={};
while (~feof(fid))
textline(n,1)={fgetl(fid)};
end
FileName=Air;
% - Replace characters when relevant.
for i = 1 : 5000
filename = sprintf('%s_%d.SIT','filename',i);
end
Anybody can help on finishing the program?
Thanks,
James

If your file is so short you do not have to read it line by line. Just read the full thing in one variable, modify only the necessary part of it before each write, then write the full variable back in one go.
%% // read the full file as a long sequence of 'char'
fid = fopen ('Air.SIT');
fulltext = fread(fid,Inf,'char') ;
fclose(fid) ;
%% // add a few blank placeholder (3 exactly) to hold the 4 digits when we'll be counting 5000
fulltext = [fulltext(1:49) ; 32 ; 32 ; 32 ; fulltext(50:end) ] ;
idx2replace = 50:53 ; %// save the index of the characters which will be modified each loop
%% // Go for it
baseFileName = 'AIR_%d.SIT' ;
for iFile = 1:1000:5000
%// build filename
filename = sprintf(baseFileName,iFile);
%// modify the string to write
fulltext(idx2replace) = num2str(iFile,'%04d').' ; %//'
%// write the file
fidw = fopen( filename , 'w' ) ;
fwrite(fidw,fulltext) ;
fclose(fidw) ;
end
This example works with the text in your example, you may have to adjust slightly the indices of the characters to replace if your real case is different.
Also I set a step of 1000 for the loop to let you try and see if it works without writing 1000's of file. When you are satisfied with the result, remove the 1000 step in the for loop.
Edit:
The format specifier %04d I gave in the first solution insure the output will take 4 characters, and it will pad any smaller number with zero (ex: 23 => 0023). It is sometimes desirable to keep the length constant, and in your particular example it made things easier because the output string would be exactly the same length for all the files.
However it is not mandatory at all, if you do not want the loop number to be padded with zero, you can use the simple format %d. This will only use the required number of digits.
The side effect is that the output string will be of different length for different loop number, so we cannot use one string for all the iterations, we have to recreate a string at each iteration. So the simple modifications are as follow. Keep the first paragraph of the solution above as is, and replace the last 2 paragraphs with the following:
%% // prepare the block of text before and after the character to change
textBefore = fulltext(1:49) ;
textAfter = fulltext(51:end) ;
%% // Go for it
baseFileName = 'AIR_%d.SIT' ;
for iFile = 1:500:5000
%// build filename
filename = sprintf(baseFileName,iFile);
%// rebuild the string to write
fulltext = [textBefore ; num2str(iFile,'%d').' ; textAfter ]; %//'
%// write the file
fidw = fopen( filename , 'w' ) ;
fwrite(fidw,fulltext) ;
fclose(fidw) ;
end
Note:
The constant length of character for a number may not be important in the file, but it can be very useful for your file names to be named AIR_0001 ... AIR_0023 ... AIR_849 ... AIR_4357 etc ... because in a list they will appear properly ordered in any explorer windows.
If you want your files named with constant length numbers, the just use:
baseFileName = 'AIR_%04d.SIT' ;
instead of the current line.

Related

Merge specific text files

I have a problem in my code. I am using some loops in order to create mulptile .txt files with specific names, eg mR1.txt, mR2.txt,...mR100.txt. After that I merge them into one file with the following commands:
name=regexp(fileread('names.txt'), '\r?\n', 'split') .';
file1=regexp(fileread('data.txt'), '\r?\n', 'split') .';
.....
for n=1:numel(name);
for z=1:size(file1)
........ %making caluclations
FP=fopen(sprintf('mR%g0.txt',z),'wt');
fprintf(FP,'%s\t',file1{z},num2str(number),num2str(number);
fclose(FP);
txtFiles = dir('mR*.txt') ; % get the text files in the present folder
N = length(txtFiles) ; % Total number of text files
iwant = cell(N,1) ; % initlaize the data required
% loop for each file
for i = 1:N
thisFile = txtFiles(i).name ;
iwant{i} = importdata(thisFile) ; % read data of the text file
end
iwant = cell2mat(iwant) ;
outFile = strcat('finalR',num2str(n),'.txt') ;
dlmwrite(outFile,iwant,'delimiter','\t')
end
end
I would like to create for each iteration M1_mR1.txt,...,M1_mr100.txt, and M1_mR1.txt,..., M2_mR100.txt. After that I would like to merge all the files that Have prefix M1_.....txt into one file and all the files that Have prefix M2_.....txt into one file.
How could I do this?
This can be done in multiple ways.
Note about sorting: The default sorting in files will give higher value to file2 than file100, so you need to sort by actual number value. This is done by converting the filename numbers to first code and second code, and then sort by the second code with sortrows.
So this will solve the problem:
clear
textfiles_dir=dir('M*_mR*.txt');
textfiles_name={textfiles_dir.name}; %group the filenames into cell array
textfiles_codes=cellfun(#(fname)textscan(fname,'M%d_mR%d.txt'),textfiles_name,'Uniform',false);
textfiles_codes=cell2mat(cat(1,textfiles_codes{:})); %separate bw first and second code
disp(textfiles_codes)
[sorted_codes,orig_idx]=sortrows(textfiles_codes,2); %sort by second code
first_codes=sorted_codes(:,1);
second_codes=sorted_codes(:,2);
unique_firstcodes=unique(first_codes);
for ifirstcode=1:length(unique_firstcodes)
firstcode=unique_firstcodes(ifirstcode);
fprintf('firstcode=%d: ',firstcode)
target_name=sprintf('Merged%d.txt',firstcode);
f_target=fopen(target_name,'w');
idxs=orig_idx(first_codes==firstcode); % already sorted in increasing order
for i=1:length(idxs)
idx=idxs(i);
fname=textfiles_name{idx};
fprintf(' %s ',fname)
f_src=fopen(fname,'r');
src_text=fread(f_src,'*char')';
fclose(f_src);
fwrite( f_target,src_text);
end
fclose(f_target);
fprintf('\n')
end
fprintf('Done!\n')

joining arrays in Matlab and writing to file using dlmwrite( ) adds extra space

I am generating 2500 values in Matlab in format (time,heart_rate, resp_rate) by using below code
numberOfSeconds = 2500;
time = 1:numberOfSeconds;
newTime = transpose(time);
number0 = size(newTime, 1)
% generating heart rates
heart_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intHeartRate = int64(heart_rate);
number1 = size(intHeartRate, 1)
% hist(heart_rate)
% generating resp rates
resp_rate = 50 +(70-50) * rand (numberOfSeconds,1);
intRespRate = int64(resp_rate);
number2 = size(intRespRate, 1)
% hist(heart_rate)
% joining time and sensor data
joinedStream = strcat(num2str(newTime),{','},num2str(intHeartRate),{','},num2str(intRespRate))
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', joinedStream,'delimiter','');
The data shown in the console is alright, but when I save this data to a .txt file, it contains extra spaces in beginning. Hence I am not able to parse the .txt file to generate input stream. Please help
Replace the last two lines of your code with the following. No need to use strcat if you want a CSV output file.
dlmwrite('/Users/amar/Desktop/geenrated/rate.txt', [newTime intHeartRate intRespRate]);
π‘‡β„Žπ‘’ π‘ π‘œπ‘™π‘’π‘‘π‘–π‘œπ‘› 𝑠𝑒𝑔𝑔𝑒𝑠𝑑𝑒𝑑 𝑏𝑦 π‘ƒπΎπ‘œ 𝑖𝑠 π‘‘β„Žπ‘’ π‘ π‘–π‘šπ‘π‘™π‘’π‘ π‘‘ π‘“π‘œπ‘Ÿ π‘¦π‘œπ‘’π‘Ÿ π‘π‘Žπ‘ π‘’. π‘‡β„Žπ‘–π‘  π‘Žπ‘›π‘ π‘€π‘’π‘Ÿ 𝑒π‘₯π‘π‘™π‘Žπ‘–π‘›π‘  π‘€β„Žπ‘¦ π‘¦π‘œπ‘’ 𝑔𝑒𝑑 π‘‘β„Žπ‘’ 𝑒𝑛𝑒π‘₯𝑝𝑒𝑐𝑑𝑒𝑑 π‘œπ‘’π‘‘π‘π‘’π‘‘.
The data written in the file is exactly what is shown in the console.
>> joinedStream(1) %The exact output will differ since 'rand' is used
ans =
cell
' 1,60,63'
num2str basically converts a matrix into a character array. Hence number of characters in its each row must be same. So for each column of the original matrix, the row with the maximum number of characters is set as a standard for all the rows with less characters and the deficiency is filled by spaces. Columns are separated by 2 spaces. Take a look at the following smaller example to understand:
>> num2str([44, 42314; 4, 1212421])
ans =
2Γ—11 char array
'44 42314'
' 4 1212421'

read complicated format .txt file into Matlab

I have a txt file that I want to read into Matlab. Data format is like below:
term2 2015-07-31-15_58_25_612 [0.9934343, 0.3423043, 0.2343433, 0.2342323]
term0 2015-07-31-15_58_25_620 [12]
term3 2015-07-31-15_58_25_625 [2.3333, 3.4444, 4.5555]
...
How can I read these data in the following way?
name = [term2 term0 term3] or namenum = [2 0 3]
time = [2015-07-31-15_58_25_612 2015-07-31-15_58_25_620 2015-07-31-15_58_25_625]
data = {[0.9934343, 0.3423043, 0.2343433, 0.2342323], [12], [2.3333, 3.4444, 4.5555]}
I tried to use textscan in this way 'term%d %s [%f, %f...]', but for the last data part I cannot specify the length because they are different. Then how can I read it? My Matlab version is R2012b.
Thanks a lot in advance if anyone could help!
There may be a way to do that in one single pass, but for me these kind of problems are easier to sort with a 2 pass approach.
Pass 1: Read all the columns with a constant format according to their type (string, integer, etc ...) and read the non constant part in a separate column which will be processed in second pass.
Pass 2: Process your irregular column according to its specificities.
In a case with your sample data, it looks like this:
%% // read file
fid = fopen('Test.txt','r') ;
M = textscan( fid , 'term%d %s %*c %[^]] %*[^\n]' ) ;
fclose(fid) ;
%% // dispatch data into variables
name = M{1,1} ;
time = M{1,2} ;
data = cellfun( #(s) textscan(s,'%f',Inf,'Delimiter',',') , M{1,3} ) ;
What happened:
The first textscan instruction reads the full file. In the format specifier:
term%d read the integer after the literal expression 'term'.
%s read a string representing the date.
%*c ignore one character (to ignore the character '[').
%[^]] read everything (as a string) until it finds the character ']'.
%*[^\n] ignore everything until the next newline ('\n') character. (to not capture the last ']'.
After that, the first 2 columns are easily dispatched into their own variable. The 3rd column of the result cell array M contains strings of different lengths containing different number of floating point number. We use cellfun in combination with another textscan to read the numbers in each cell and return a cell array containing double:
Bonus:
If you want your time to be a numeric value as well (instead of a string), use the following extension of the code:
%% // read file
fid = fopen('Test.txt','r') ;
M = textscan( fid , 'term%d %f-%f-%f-%f_%f_%f_%f %*c %[^]] %*[^\n]' ) ;
fclose(fid) ;
%% // dispatch data
name = M{1,1} ;
time_vec = cell2mat( M(1,2:7) ) ;
time_ms = M{1,8} ./ (24*3600*1000) ; %// take care of the millisecond separatly as they are not handled by "datenum"
time = datenum( time_vec ) + time_ms ;
data = cellfun( #(s) textscan(s,'%f',Inf,'Delimiter',',') , M{1,end} ) ;
This will give you an array time with a Matlab time serial number (often easier to use than strings). To show you the serial number still represent the right time:
>> datestr(time,'yyyy-mm-dd HH:MM:SS.FFF')
ans =
2015-07-31 15:58:25.612
2015-07-31 15:58:25.620
2015-07-31 15:58:25.625
For comlicated string parsing situations like such it is best to use regexp. In this case assuming you have the data in file data.txt the following code should do what you are looking for:
txt = fileread('data.txt')
tokens = regexp(txt,'term(\d+)\s(\S*)\s\[(.*)\]','tokens','dotexceptnewline')
% Convert namenum to numeric type
namenum = cellfun(#(x)str2double(x{1}),tokens)
% Get time stamps from the second row of all the tokens
time = cellfun(#(x)x{2},tokens,'UniformOutput',false);
% Split the numbers in the third column
data = cellfun(#(x)str2double(strsplit(x{3},',')),tokens,'UniformOutput',false)

Read specific hex data in CSV file in MATLAB

I've looked through the posts on StackOverflow and can't seem to find the answer I am looking for. I have a large CSV file (450 MB) with hex data that looks like this:
63C000CF,6000002F,603000AF,6000C06F,617300EF,6C7C001F,6000009F,0%,63C000CF...
That is a very truncated example, but basically I have approximately 78 different hex values separated by commas, then there will be the '0%', then 78 more hex values. This will continue for a very long time. I've been using textscan like this:
data = textscan(fid, '%s', 1, 'delimiter', '%');
data = textscan(data{1}{1}, '%s', 'delimiter', ',');
data = data{1};
count = size(data);
outstring = ['%', sprintf('\n')];
for idx = 1:count(1)
string = data{idx};
stringSize = size(string);
if stringSize(2) > 1
outstring = [outstring, string, sprintf('\n')];
end
end
fprintf(output_fid, '%s', outstring)
This allowed me to format the csv file in a way to which I could use fgetl() to analyze whether or not I was looking at the data I needed. Because the data repeats itself, I can use fseek() to jump to the next occurrence before calling fgetl() again.
What I need is a way to skip to the ending. I want to just be able to use something like fgetl() but have it only return the first hex value it encounters. I will know how many bytes to shift through the file. Then I need to make sure I can read other hex values. Is what I'm asking possible? My code using textscan above takes far too long on a csv file that is 90 MB let alone 450 MB.
Answer obtained from user Cedric Wannaz on the Mathworks MATLAB Central Answers page.
NEW solution
Here is a more efficient solution; I am using a 122MB file, so you have an idea about the timing
% One line for reading the whole file. To perform once only.
tic ;
content = fileread( 'adam_1.txt' ) ;
fprintf( 'Time for reading the file : %.2fs\n', toc ) ;
% One line for defining an extraction function. To perform once only.
extract = #(label) content(bsxfun( #plus, ...
strfind( content, [label,','] ).' - 6, ...
0 : 5 )) ;
% Then it is one call per label to extract data.
tic ;
data = extract( 'CF' ) ;
fprintf( 'Time for extracting one label: %.2fs\n', toc ) ;
Running this, I obtain
Time for reading the file : 0.52s
Time for extracting one label: 0.62s
FORMER solution
Would the following work for you?
% Read file content. To do once only.
content = fileread( 'myFile.txt' ) ;
% Define regexp-based extraction function. To do once only.
getByLabel = #(label) regexp( content, sprintf( '\\w{6}(?=%s)', label ), ...
'match' ) ;
% Get all entries for e.g. label 'CF'.
entries_CF = getByLabel( 'CF' ) ;
% Get all entries for e.g. label '6F'.
entries_6F = getByLabel( '6F' ) ;
I am not completely clear on what you need to achieve ultimately; if I had to design a GUI where users can choose a label and get corresponding data, I would process the data much further during the init phase, e.g. by grouping them by label in a cell array. Regexp is not the most efficient approach in this case I guess, but the principle would be..
labels = {'CF', '6F', 'AF', ..} ;
nLabels = numel( labels ) ;
data = cell{ 1, nLabels ) ;
for lId = 1 : nLabels
data{lId} = getByLabel( labels{lId} ) ;
end
and then when a user selects 'CF' ..
lId = strcmpi( label, labels ) ;
dataForThisLabel = data{lId} ;

Extract variables from a textfile in matlab

I have a large text file (~3500 lines) which contains output data from an instrument. This consists of repeated entries from different measurements which are all formatted as follows:
* LogFrame Start *
variablename1: value
variablename2: value
...
variablename35: value
* LogFrame End *
Each logframe contains 35 variables. From these I would like to extract two, 'VName' and 'EMGMARKER' and put the associated values in columns in a matlab array, (i.e. the array should be (VName,EMGMARKER) which I can then associate with data from another output file which I already have in a matlab array. I have no idea where to start with this in order to extract the variables from this file, so hence my searches on the internet so far have been unsuccessful. Any advice on this would be much appreciated.
You could use textscan:
C = textscan(file_id,'%s %f');
Then you extract the variables of interest like this:
VName_indices = strcmp(C{1},'VName:');
VName = C{2}(VName_indices);
If the variable names, so 'VName' and 'EMGMARKER' , are always the same you can just read through the file and search for lines containing them and extract data from there. I don't know what values they contain so you might have to use cells instead of arrays when extracting them.
fileID = fopen([target_path fname]); % open .txt file
% read the first line of the specified .txt file
current_line = fgetl(fileID);
counter = 1;
% go through the whole file
while(ischar(current_line))
if ~isempty(strfind(current_line, VName))
% now you find the location of the white space delimiter between name and value
d_loc = strfind(current_line,char(9));% if it's tab separated - so \t
d_loc = strfind(current_line,' ');% if it's a simple white space
variable_name{counter} = strtok(current_line);% strtok returns everything up until the first white space
variable_value{counter} = str2double(strtok(current_line(d_loc(1):end)));% returns everything form first to second white space and converts it into a double
end
% same block for EMGMARKER
counter = counter+1;
current_line = fgetl(fileID);% next line
end
You do the same for EMGMARKER and you have cells containing the data you need.