I'm using the following code to select a bunch of csv files, load each one and append to a new csv:
[filenames, folder] = uigetfile('*.csv','Select the data file','MultiSelect','on')
% Create output file name in the same folder.
outputFileName = fullfile(folder, 'new_file.csv') % NAME THE NEW FILE
fidOutput = fopen(outputFileName, 'wt');
for k = 1 : length(filenames)
% Get this file name.
thisFileName = fullfile(folder, filenames{k})
% Open input file:
fidInput = fopen(thisFileName);
% Read text from it
thisText = fread(fidInput, '*char');
% Copy to output file:
fwrite(fidOutput, thisText);
fclose(fidInput); % close the input file
end
fido=fclose(fidOutput);
However, when I inspect the new file, the first line of each file is appended in front of the last line of the previous file. See image for example with 3 files containing 5 lines each.
I need everything to be aligned in order to load each column in a subsequent step.
fwrite does not seem to have a \n input to create a line break. Is there a way to correct this? I'm on matlab 2015b
Related
I would like to read text files inside a zip file without unzipping using Matlab
Read the data of CSV file inside Zip File without extracting the contents in Matlab
The suggested above is working and I get list of cells for file.
zipFilename = 'C:\ZippedData.zip';
zipJavaFile = java.io.File(zipFilename);
% Create a Java ZipFile
zipFile = org.apache.tools.zip.ZipFile(zipJavaFile);
% Extract the entries from the ZipFile.
entries = zipFile.getEntries;
cnt = 1;
% Get Zip File Paths
while entries.hasMoreElements
tempObj = entries.nextElement;
file{cnt,1} = tempObj.getName.toCharArray';
cnt = cnt+ 1;
end
% Extract File Name
ind = regexp(file,'textfile.*');
ind = find(~cellfun(#isempty,ind)); % Find Non Empty Cell Index
file = file(ind);
% Create Absolute Path so that Windows consider as Directory
file = cellfun(#(x) fullfile('.',x),file,'UniformOutput',false);
\file1 , .\file2 ,..., .\filen , but them how do I use that in fopen and say textscan? something like fileID = fopen([zipFilename filesep file{1}]); ?.
I have a txt file that has a lot of content and in this file there is a lot of "include" word and I want to get data from all three lines after that.
myFile.txt:
"-include:
-6.5 6.5
sin(x^2)
diff
-include
-5 5
cos(x^4)
diff"
How do I get this data in an array?
Based on your example (first include has : at the end, second one doesn't) you could use something like this.
fID = fopen('myFile.txt'); % Open the file for reading
textString = textscan(fID, '%s', 'Delimiter', '\n'); % Read all lines into cells
fclose(fID); % Close the file for reading
textString = textString{1}; % Use just the contents on the first cell (each line inside will be one cell)
includeStart = find(contains(textString,'include'))+1; % Find all lines with the word include. Add +1 to get the line after
includeEnd = [includeStart(2:end)-2; length(textString)]; % Get the position of the last line before the next include (and add the last line in the file)
parsedText = cell(length(includeStart), 1); % Create a new cell to store the output
% Loop through all the includes and concatenate the text with strjoin
for it = 1:length(includeStart)
parsedText{it} = strjoin(textString(includeStart(it):includeEnd(it)),'\n');
% Display the output
fprintf('Include block %d:\n', it);
disp(parsedText{it});
end
Which results in the following output:
Include block 1:
-6.5 6.5
sin(x^2)
diff
Include block 2:
-5 5
cos(x^4)
diff
You can tune the loop to suit your needs. If you just want line numbers, use includeStart and includeEnd variables.
So I have to modify a .dxf file (an Autocad file) by changing some data in it for another one we choose previously. Changing some lines of a .txt file in Matlab is not pretty difficult.
However, I cannot change a specific line when the new input's length is larger than the old one.
This is what I have and I want to change only 1D57:
TEXT
5
1D57
330
1D52
100
AcDbEntity
8
0
If I have as an input BBBB, everything goes right since both strings have the same length. The same does not apply when I try with BBBBbbbbbbbbbb:
TEXT
5
BBBBbbbbbbbbbb2
100
AcDbEntity
8
0
It deletes everything after it until the string stops. It happens the same when the input is shorter: it does not change the line for the new string but it writes until the new input stops. For example, in our case with AAA as an input, the result would be AAA7.
This is basically the code I am using to modify the file:
fID = fopen('copia.dxf','r+');
for i = 1:2
LineToReplace = TextIndex(i);
for k = 1:((LineToReplace) - 1);
fgetl(fID);
end
fseek(fID, 0, 'cof');
fprintf (fID, [Data{i}, '\n']);
end
fclose(fID);
You need to overwrite at least the rest of the file in order to change it (unless exact number of characters is replaced), as explained in jodag's comment. For instance,
% String to change and it's replacement
% (can readily be automated for more replacements)
str_old = '1D52';
str_new = 'BBBBbbbbbbbbbb';
% Open input and output files
fIN = fopen('copia.dxf','r');
fOUT = fopen('copia_new.dxf','w');
% Temporary line
tline = fgets(fIN);
% Read the entire file line by line
% Write it to the new file
% Replace str_old with str_new when encountered - note, if there is more
% than one occurence of str_old in the file all will be replaced - this can
% be handled with a proper flag
while (ischar(tline))
% char(10) is MATLAB's newline character representation
if strcmp(tline, [str_old, char(10)])
fprintf(fOUT, '%s \n', str_new);
else
% No need for \n - it's already there as we're using fgets
fprintf(fOUT, '%s', tline);
end
tline = fgets(fIN);
end
% Close the files
fclose(fIN);
fclose(fOUT);
% Copy the new file into the original
movefile 'copia_new.dxf' 'copia.dxf'
In practice, it is often far easier to simply overwrite the whole file.
As written in the notes - this can be automated for more replacements and it would also need an additional flag to only replace a given string once.
I have a very large data set which is divided in folders, I have 100 folders with approximately 200 text files each. I have been trying the for loop first of all importing one and then in another command importing the rest. But I am not interested in a dataarray but rather conserving each file with its name as I have to then match the dates among all the files and each file does not have the same amount of columns.
Each text file has is like the one I have attached, where the data I need is from the row 23 until column 13.
The data names are saves as 010010.txt, 010030.txt, 010050.txt ......until 014957.txt , they are not sequential
Apart from this I have created a script for importing one file but I would like to know how to repeat the same script for the rest.
filename = 'C:*\010010.txt';
startRow = 22;
formatSpec = '%4f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f%[^\n\r]';
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, 'Delimiter', '', 'WhiteSpace', '', 'HeaderLines' ,startRow-1, 'ReturnOnError', false);
fclose(fileID);
Untitled (010010) = [dataArray{1:end-1}];
I would like to repeat the same import process but for the rest files. I would appreciate any suggestion
The text files have the following format:
I only need from row 23 and column 13 and each txt file has different number of rows as some have data from 1992 - 2014 and other have only 2000 - 2014. The first column is the year and column 2 to 13 are months.
I guess you know the basepath under which all your folders are. You can then use something like this:
% First find all folders
folders = cell(0); % empty cell to save folder names
nFolders = 0;
allFolders = ls(basePath); % find all files and folders
for k=1:size(allFolders,1)
curFolder = fullfile(basePath,strtrim(allFolders(k,:)));
if isdir(curFolder) % find out if it is a folder
if ~(allFolders(k,1) == '.') % ignore '.' and '..'
folders{nFolders+1,1} = curFolder; % Save folder path
nFolders = nFolders + 1;
end
end
end
% Then find all files inside these folders
files = cell(0); % empty cell array for file names
nFiles = 0;
for k=1:nFolders % go through all folders
allFiles = ls(folders{k,1});
for l=1:size(allFiles,1) % go through all found files/subfolders
curFile = fullfile(folders{k},strtrim(allFiles(l,:)));
if ~isdir(curFile) % only select files
files{nFiles+1,1} = curFile; % and save it to the cell
nFiles = nFiles + 1;
end
end
end
Now you can iterate through the files cell and read all files according to your script. I see you are interested in the file name. You can extract the file name by
[path,filename,extension] = fileparts(files{k,1});
To import text files, you can use dlmread, which I think is more intuitive than textscan (but has more limitations, of course). For that you don't have to open the file using fopen, you can directly supply the file name.
value = dlmread(fileName,' ',[23,13,23,13]);
The delimiter is now a white space and only the value at row=23 / col=13 is read. Note that the range starts at row/col=0, not 1 like normally in Matlab - so maybe you'll have to change it to [22,12,22,12].
I have a list of CSV files. The filenames of these files have been stored in the form [year '_MDA8_mat.dat']. I want to read in each of these files into MATLAB and save the output. How can I write the code so that each year is considered in turn and the output .mat file will be saved for each year?
Here's what I have for reading in one of the files:
flist = fopen('2006_MDA8_mat.dat'); % Open the list of file names - CSV files of states with data under consideration
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
for i = 1:27299 % Number of files
fname = fgetl(flist); % Reads next line of list, which is the name of the next data file
disp(fname); % Stores name as string in fname
nt = nt+1; % Time index
load (fname, 'site_data'); % Load current file. It is all the data for one site for one year
O3_data{i} = site_data;
% Do some more stuff
end
save ('2006_MDA8_1990_2014.mat', '-v7.3')
I tried to write a for loop like this:
year = 2006:2014
for y = 1:9
flist = fopen([year(y) '_MDA8_mat.dat']);
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
for i = 1:1500 % Number of files
% Same as above
end
end
save ([year '_MDA8_1990_2014.mat'], '-v7.3')
end
However, when I run this, it doesn't do the same thing as it did for the one file script. I'm not quite sure where the error occurs, but MATLAB tells me there's an error with feof, which doesn't seem to make sense.
When combining numbers and strings, you need to do a num2str on the number:
[num2str(year) '_MDA8_1990_2014.mat']