I have four .txt files. Each one has 250 lines, and each line has 4 values separated by commas. Below are the first 5 lines of one of the files; all four files have the same structure:
NaN,NaN,NaN,-1
792.98,419.48,333.35,245.63
787.13,408.59,345.05,251.48
798.3,414.17,333.36,245.63
803.61,414.43,333.35,239.78
One of the four files is the reference file, named groundtruth.txt. I want to read each line from the other three files and compare it with the values on the same line number in groundtruth.txt, then save the differences to a file for further processing. The result should be 3 new files holding the differences, each with 250 lines. For example, the first line of the result file holding the difference between the ground truth and the first file would look like this: 79.8,9.42,22.35,10.63
So if anyone could please advise.
If I understand correctly, this should be the thing you are after:
groundtruth = dlmread('groundtruth.txt');
file1 = dlmread('file_01.txt');
file2 = dlmread('file_02.txt');
file3 = dlmread('file_03.txt');
dlmwrite('diff_01.txt', file1 - groundtruth);
dlmwrite('diff_02.txt', file2 - groundtruth);
dlmwrite('diff_03.txt', file3 - groundtruth);
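The same subtraction can also be written as a loop, which scales better if more files are added later. This is only a minimal sketch: it assumes the placeholder file names used in the answer above (file_01.txt and so on, which are not given in the question) and MATLAB R2019a or later for readmatrix and writematrix. Any NaN in either file, such as the first line of the sample, gives NaN in the difference.

```matlab
% Sketch: subtract the reference from each file in a loop.
% The file names follow the placeholders used in the answer above.
groundtruth = readmatrix('groundtruth.txt', 'Delimiter', ',', 'NumHeaderLines', 0);
for k = 1:3
    inName  = sprintf('file_%02d.txt', k);   % file_01.txt, file_02.txt, file_03.txt
    outName = sprintf('diff_%02d.txt', k);   % diff_01.txt, diff_02.txt, diff_03.txt
    data = readmatrix(inName, 'Delimiter', ',', 'NumHeaderLines', 0);
    % Entries that are NaN in either input are NaN in the output.
    writematrix(data - groundtruth, outName, 'Delimiter', ',');
end
```

Swap the operands (groundtruth - data) if the opposite sign convention is wanted.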
Related
I am trying to read hundreds of .dat files while skipping the header lines (I do not know beforehand how many of them I need to skip). The number of header lines varies from 1 to 20, and each header line begins with either "$" or "!".
A sample file (left column: node, right column: microstructure) always has two columns and looks like the following:
!===
!Comment
$Material
1 1.452E-001
2 1.446E-001
3 1.459E-001
I tried the following code, assuming I know beforehand that there are 3 lines in the header:
fid = fopen('Graphite_Node_Test.dat') ;
data = textscan(fid,'%f %f','HeaderLines',3) ;
fclose(fid);
This solution works if the number of header lines is known. How can I change the code so that it can read the .dat file without knowing the number of header lines beginning with either "$" or "!" sign?
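One possible approach, sketched here under stated assumptions rather than as a definitive answer, is to read the file as text, drop every line that starts with "!" or "$", and parse what remains. The sketch assumes that every header line starts with one of those two characters and that every other non-empty line holds exactly two numbers; the variable names are illustrative.

```matlab
% Sketch: read a two-column .dat file with an unknown number of header lines.
% Assumption: header lines start with '!' or '$'; all other non-empty
% lines contain exactly two numbers.
txt   = fileread('Graphite_Node_Test.dat');        % whole file as one char vector
lines = strtrim(regexp(txt, '\r?\n', 'split'));    % cell array, one entry per line
isHeader  = ~cellfun(@isempty, regexp(lines, '^[!$]', 'once'));
isBlank   = cellfun(@isempty, lines);
dataLines = lines(~isHeader & ~isBlank);           % keep only the numeric lines
values = sscanf(strjoin(dataLines, ' '), '%f');    % all numbers in file order
data   = reshape(values, 2, []).';                 % N-by-2: [node, microstructure]
```

Because the filter looks at the content of each line instead of counting lines, it also skips comment lines that appear between data rows.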
I have a CSV file with many rows, each having 101 columns, with the 101st column being a char, while the rest of the columns are doubles. E.g.
1,-2.2,3 ... 98,99,100,N
I implemented a filter to operate on the numbers and wrote the result to a different file, but now I need to map the last column of my old CSV to my new CSV. How should I approach this?
I did the original loading using loadcsv, but that didn't seem to load the character column, so how should I proceed?
In MATLAB there are many ways to do it; this answer expands on the use of tables:
Input
test.csv
1,2,5,A
2,3,5,G
5,6,8,C
8,9,7,T
test2.csv
1,2,1.2
2,3,8
5,6,56
8,9,3
Script
t1 = readtable('test.csv','ReadVariableNames',false); % Read the csv file (it has no header row)
lastcol = t1{:,end}; % Extract the last column
t2 = readtable('test2.csv','ReadVariableNames',false); % Read the second csv file
t2.addedvar = lastcol; % Add the last column of the first file to the table from the second file
writetable(t2,'test3.csv','Delimiter',',','WriteVariableNames',false) % write the new table in a file
Note that test3.csv is a new file but you could also overwrite test2.csv
'WriteVariableNames',false allows you to write the csv file without the headers of the table.
Output
test3.csv
1,2,1.2,A
2,3,8,G
5,6,56,C
8,9,3,T
I have several .csv files that I read with MATLAB using textscan, because csvread and xlsread do not support files of this size (200 MB to 600 MB).
I use this line to read them:
C = textscan(fileID,'%s%d%s%f%f%d%d%d%d%d%d%d','delimiter',',');
The problem I have found is that sometimes the data is not in this format, and then textscan stops reading at that line without any error.
So what I have done is to read it this way:
C = textscan(fileID,'%s%d%s%f%f%s%s%s%s%s%s%s%s%s%s%s','delimiter',',');
This way I can see that in 2 rows out of 3 million the format is different.
I want to read all the lines except the bad/different lines.
In addition, is it possible to read only the lines whose first string is 'PAA'?
I have tried to load the file directly into MATLAB, but it is very slow and sometimes it gets stuck. For the really big files it reports a memory problem.
Any recommendations?
For large files which are still small enough to fit your memory, parsing all lines at once is typically the best choice.
f = fopen('data.txt');
g = textscan(f,'%s','delimiter','\n');
fclose(f);
In the next step, identify the lines starting with PAA using strncmp.
Now having your data filtered, apply your textscan expression above to each line. If it fails, try the other.
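The steps above could be sketched as follows. This is an illustration under assumptions, not a tuned solution: it reuses data.txt and the 12-field format from the posts above, uses the comma count only as a rough screen for badly formatted lines, and simply skips those lines instead of trying a second format. A per-line loop over millions of lines will be slow.

```matlab
% Sketch: keep only lines starting with 'PAA' and parse them one by one.
% 'data.txt' and the 12-field format are taken from the posts above.
f = fopen('data.txt');
g = textscan(f, '%s', 'Delimiter', '\n');
fclose(f);
lines = g{1};                                  % cell array, one line per cell
lines = lines(strncmp(lines, 'PAA', 3));       % lines whose first 3 chars are 'PAA'

fmt  = '%s%d%s%f%f%d%d%d%d%d%d%d';
good = cell(numel(lines), 1);                  % parsed rows; left empty for bad lines
for k = 1:numel(lines)
    nCommas = sum(lines{k} == ',');            % 12 fields means 11 commas
    if nCommas == 11
        good{k} = textscan(lines{k}, fmt, 'Delimiter', ',');
    end
end
good = good(~cellfun(@isempty, good));         % drop lines with a different layout
```

A line with the right number of commas can still contain a non-numeric value in a numeric field, so the parsed cells may need a further check before use.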
Matlab is slow with this kind of thing because it needs to load everything into memory. I would suggest using grep/bash/cmd lines to reduce your file to readable lines before processing them in Matlab. In Linux you can use:
awk '/^PAA/' yourfile.csv > yourNewFile.csv
# This gives you a new file with all the lines that start with PAA (NOTE: case sensitive)
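To find lines that do not have the same format, you can use:
awk -F ',' 'NF != 12 {print NR, $0}' yourfile.csv > yourNewFile.csv
This line counts the comma-separated fields on each line and prints the line number and content of every line that does not have exactly 12 fields (that is, 11 "," delimiters), which is the field count of the textscan format above.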
Hi, is it possible to open and read text files that are listed in another file? For example,
"file1.txt" contains 2 columns and the data are:
1, "file4.txt"
2, "file5.txt"
3, "file6.txt"
and I want to display column 2 from file4, file5 and file6.
Any idea how to implement it?
Thanks guys
You could first read the contents of 'file1.txt' like this:
fid = fopen('file1.txt');
fileContents = textscan(fid,'%d %q','Delimiter',',');
And then iterate over the second column (the file names) of the file's content
fileNames = fileContents{2};
for i = 1:length(fileNames)
    % fileNames{i} will be 'file4.txt', 'file5.txt', 'file6.txt' respectively in
    % each iteration
    fid2 = fopen(fileNames{i});
    %%%%% put code to read second column here %%%%
    fclose(fid2);
end
fclose(fid);
Sorry, my reputation is too low to comment, hence I am answering.
I think your question is that you have a text file file1.txt, and in that file you have the data of file4 and file5, right? Either you have a link to file4.txt or you have its data. In both cases you need to filter out that part (either the file path to file4 or its data) and then store its content in an array so you can modify it according to your needs later. Please be more specific about your problem when asking.
This is my problem.
I need to copy 2 columns each from 7 different files to the same output file.
All input and output files are CSV files.
And I need to add each new pair of columns beside the columns that have already been copied, so that at the end the output file has 14 columns.
I believe I cannot use
open(FILEHANDLE,">>file.csv").
Also, all 7 CSV files have nearly 20,000 rows each, so I'm reading and writing the files line by line.
It would be a great help if you could give me an idea as to what I should do.
Thanks a lot in advance.
Provided that your lines are 1:1 (Meaning you're combining data from line 1 of File_1, File_2, etc):
open all 7 files for input
open output file
while there are lines left in the input files:
    read line of data from all input files
    write line of combined data to output file
Text::CSV is probably the way to access CSV files.
You could define a CSV handler for each file (including the output), use the getline or getline_hr (returns a hashref) methods to fetch data, combine it into arrayrefs, then use print.