I'm trying to obtain a table from a CSV file in MATLAB. The file is available at the following link: http://vincentarelbundock.github.io/Rdatasets/csv/carData/SLID.csv
fid = fopen('SLID.csv', 'r');
C = textscan(fid, '%s %f %f %d %s %s', 'Delimiter', ',', ...
'headerLines', 1, 'TreatAsEmpty','NA');
fclose(fid);
T = cell2table(C,...
'VariableNames',{'id' 'wages' 'education' 'age' 'sex' 'language'});
whos T
But this gives me a 1x6 table, where each element is a 7425x1 cell. How can I obtain a 7425x6 table instead?
You can get the table you want using the table command:
T = table(C{1},C{2},C{3},C{4},C{5},C{6})
After that, you can set the column names using the table properties:
T.Properties.VariableNames{'Var2'} = 'wages';
etc.
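As a possible shortcut, table also accepts a 'VariableNames' argument, so the columns can be named when the table is built. The sketch below uses a small made-up stand-in for C (three illustrative rows, not taken from SLID.csv) so that it runs on its own:

```matlab
% Stand-in for the textscan output: a 1x6 cell, one column vector per field.
% The values are invented for illustration only.
C = {{'1'; '2'; '3'}, ...                    % id (cell array of strings)
     [10.5; 11.0; NaN], ...                  % wages
     [15.0; 13.2; 16.0], ...                 % education
     int32([40; 19; 49]), ...                % age (%d yields int32)
     {'Male'; 'Male'; 'Female'}, ...         % sex
     {'English'; 'Other'; 'French'}};        % language

names = {'id', 'wages', 'education', 'age', 'sex', 'language'};

% C{:} expands to six separate arguments, one per table variable.
T = table(C{:}, 'VariableNames', names);
disp(size(T))
```

Each element of C becomes one variable of the table, which is why the result has one row per record rather than a single row of cells.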
Also, you may want to import the data using the %q specifier, which will remove the double quotes when reading the values from the file:
C = textscan(fid, '%q%f%f%d%q%q', 'Delimiter', ',',...
'headerLines', 1, 'TreatAsEmpty','NA')
But that depends on how you will work with the data later.
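To see the difference between %s and %q, here is a small sketch that parses an in-memory string instead of a file (textscan accepts text as its first argument). The two sample rows are made up for illustration:

```matlab
% Two CSV-style rows with quoted string fields (illustrative data).
txt = sprintf('"1",10.5,"Male"\n"2",11,"Female"');

% %s keeps the surrounding double quotes as part of the string.
Cs = textscan(txt, '%s %f %s', 'Delimiter', ',');

% %q strips the surrounding double quotes.
Cq = textscan(txt, '%q %f %q', 'Delimiter', ',');

disp(Cs{1}{1})   % first id as read with %s
disp(Cq{1}{1})   % first id as read with %q
```

If later code compares values such as the sex column against 'Male', the %q version avoids having to match the quote characters as well.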
I read a CSV file with textscan, and when I try to write to a file I get this error: Error using horzcat. Dimensions of matrices being concatenated are not consistent.
If I change the first format specifier in textscan (%s) to %f, the error vanishes.
The error occurs when MATLAB evaluates [datatest{1} probability].
probability is a 1000x1 double.
datatest{1} is a 1000x1 cell.
datatest=textscan(FileID,'%s %*f %f %f %*s %*s %*s %*s %*s %*s %*s %*s %*s %f %f %f %f %f %f %f %f %f %f',1000,'headerlines',1,'delimiter',',');
csvwrite('output.csv',[datatest{1} probability]);
Your variable datatest{1} contains 1000 cells, each of which holds a string (the strings may or may not be the same length).
In the statement [datatest{1} probability] you are trying to concatenate a cell array (containing strings) with a numeric double array, which does not work: the concatenation operator needs operands of compatible types.
Even if you created a cell array containing all your desired columns, myCellArray={datatest{1} probability}, that would not help, because a cell array cannot be passed to csvwrite.
csvwrite, and its more flexible sibling dlmwrite, do not accept cell arrays. You would have to convert the cell values into numeric values. Since you want to write both strings and numeric values, those two functions are out, and you need a low-level function such as fprintf.
In your case, the following code writes the file you were expecting:
col1 = datatest{1} ; %// extract the column of interest for easier indexing later on
fidw = fopen('output.csv','w') ; %// get a handle on a file to write (necessary with "fprintf")
for iline = 1:numel(probability) %// loop on each line
fprintf( fidw , '%s, %f\n' , col1{iline} , probability(iline) ) ; %// write the line
end
fclose(fidw) ; %// close the file - IMPORTANT - (necessary with "fprintf")
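If your MATLAB release has tables (R2013b or later), a possible alternative to the loop is to put both columns in a table and let writetable handle the mixed types. This is only a sketch: col1 and probability below are short made-up stand-ins for datatest{1} and your probability vector, and the variable names 'label' and 'probability' are arbitrary.

```matlab
% Stand-ins for datatest{1} and probability (illustrative values only).
col1 = {'a'; 'b'; 'c'};
probability = [0.1; 0.5; 0.9];

% A table can hold a cell-array-of-strings column next to a numeric column.
T = table(col1, probability, 'VariableNames', {'label', 'probability'});

% Write to a temporary file; use 'output.csv' in real code.
outFile = [tempname '.csv'];
writetable(T, outFile);
```

Note that writetable writes a header row with the variable names by default, whereas the fprintf loop above writes data rows only.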
Hi I am using the following code to read some values from lines containing 'GPGGA' from data.txt
fid = fopen('D:\data.txt','r');
A=textscan(fid,'%s %*s %f %s %f %s %*s %*s %*s %*s %*s %*s %*s %*s %*s','Delimiter',',');
fclose(fid);
Loc = [A{[2, 4]}];
row_idxs = cellfun( @(s) strcmp(s, '$GPGGA'), A{1});
Loc = Loc(row_idxs, :);
display(Loc);
The code works perfectly if the last line in data.txt is deleted. Not sure why it throws this error when the last line is included in the text file. What is the reason? I'm confused!
"??? Error using ==> horzcat
CAT arguments dimensions are not consistent.
Error in ==> test at 4
Loc = [A{[2, 4]}];"
data.txt
$GPGSV,4,1,16,05,15,046,23,29,47,071,21,16,31,291,18,31,39,202,18*73
$GPGSV,4,1,16,05,15,046,23,29,47,071,21,16,31,291,18,31,39,202,18*73
$GPGSV,4,1,16,05,15,046,23,29,47,071,21,16,31,291,18,31,39,202,18*73
$GPGSV,4,1,16,05,15,046,23,29,47,071,21,16,31,291,18,31,39,202,18*73
$GPGSV,4,2,16,23,13,298,17,25,15,119,17,06,22,247,16,03,04,251,14*75
$GPGSV,4,2,16,23,13,298,17,25,15,119,17,06,22,247,16,03,04,251,14*75
$GPGSV,4,2,16,23,13,298,17,25,15,119,17,06,22,247,16,03,04,251,14*75
$GPGSV,4,2,16,23,13,298,17,25,15,119,17,06,22,247,16,03,04,251,14*75
$GPGGA,1.8,98.90,S,18.0014,E,1,04,1.0,87.8,M,48.0,M,,*76
$GPGGA,1.3,98.91,S,18.0015,E,1,04,1.0,100.7,M,48.0,M,,*40
$GPGGA,1.3,98.92,S,18.0016,E,1,04,1.0,105.4,M,48.0,M,,*4F
$GPGGA,1.8,98.93,S,18.0017,E,1,04,1.0,87.8,M,48.0,M,,*76
$GPGGA,1.8,98.94,S,18.0018,E,1,04,1.0,87.8,M,48.0,M,,*76
$GPGSV,4,4,16,27,,,,26,,,,24,,,,22,,,*79
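Your format string is the problem. It describes only 15 columns, but the $GPGSV lines in the sample data you've posted have 20 columns, so textscan gets out of step with the line boundaries and the output columns can end up with different lengths. I suggest using the following code (which runs without error on my machine) instead: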
fid = fopen('D:\data.txt','r');
A=textscan(fid,'%s %*s %f %s %f %s %*[^\n]', 'Delimiter',',');
fclose(fid);
Loc = [A{[2, 4]}];
row_idxs = cellfun( @(s) strcmp(s, '$GPGGA'), A{1});
Loc = Loc(row_idxs, :);
display(Loc);
Note the construct %*[^\n] in my format string. This tells textscan to ignore all columns from this point onwards. It is much neater than writing out lots of %*s over and over. Also, it means you're less likely to miscount the number of columns when building the format string :-)
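To illustrate how %*[^\n] copes with lines of different lengths, here is a small self-contained sketch that parses three of the lines from the question as an in-memory string (textscan accepts text in place of a file identifier). It also uses strcmp directly on the cell array, which is an optional simplification of the cellfun call:

```matlab
% Three lines taken from the sample data: two 20-field $GPGSV lines
% (one with empty fields) and one 15-field $GPGGA line.
lines = {'$GPGSV,4,1,16,05,15,046,23,29,47,071,21,16,31,291,18,31,39,202,18*73', ...
         '$GPGGA,1.8,98.90,S,18.0014,E,1,04,1.0,87.8,M,48.0,M,,*76', ...
         '$GPGSV,4,4,16,27,,,,26,,,,24,,,,22,,,*79'};
txt = strjoin(lines, char(10));   % char(10) is the newline character

% Read the first six fields, then discard everything up to the end of line.
A = textscan(txt, '%s %*s %f %s %f %s %*[^\n]', 'Delimiter', ',');

% Every output column now has one entry per input line.
Loc = [A{[2, 4]}];

% strcmp works element-wise on a cell array of strings.
row_idxs = strcmp(A{1}, '$GPGGA');
Loc = Loc(row_idxs, :);
disp(Loc)
```

Because each line contributes exactly one row to every column, the horizontal concatenation no longer depends on how many fields the line has.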
I have a problem converting my code from the textread function to textscan.
Contents of data.txt (note: I've replaced all actual coordinates with dddd.mmmmmm,ddddd.mmmmmm):
$GPGGA,104005.3,dddd.mmmmmm,N,ddddd.mmmmmm,W,1,05,4.4,73.4,M,48.0,M,,*7E
$GPGGA,104006.3,dddd.mmmmmm,N,ddddd.mmmmmm,W,1,05,2.1,73.5,M,48.0,M,,*7F
$GPGGA,104007.3,dddd.mmmmmm,N,ddddd.mmmmmm,W,1,05,2.1,74.0,M,48.0,M,,*70
$GPGGA,104008.3,dddd.mmmmmm,N,ddddd.mmmmmm,W,1,05,2.4,73.9,M,48.0,M,,*7C
$GPGGA,104009.3,dddd.mmmmmm,N,ddddd.mmmmmm,W,1,04,2.4,73.9,M,48.0,M,,*75
Code:
fid = fopen('E:\data.txt','r');
Location=zeros(2,);
Block = 1;
while(~feof(fid))
A=textscan(fid,'%*s %*s %s %*s %s %*s %*s %*s %*s %*s','delimiter',',','delimiter','\n');
Location(:)=[%s %s]';
x=Location(1,:);
y=Location(2,:);
Block = Block+1;
end
display(Location);
The new code is wrong. I'm using 2 delimiters here. I want to take out the latitude and longitude values from each line if they are not null. How can I correct it? Also what do I need to do to take Lat Long values only from lines starting with $GPGGA if there are many different lines in the text file?
This code should work for both your requirements and put in the correct signs (please check):
fid = fopen('data.txt','r');
A=textscan(fid,'%s %*s %f %s %f %s %*s %*s %*s %*s %*s %*s %*s %*s %*s','Delimiter',',');
fclose(fid);
Location = [A{[2, 4]}];
row_idxs = cellfun( @(s) strcmp(s, '$GPGGA'), A{1});
Location = Location(row_idxs, :);
LatSigns = -2*cellfun(@(dir) strcmp(dir, 'S'), A{3}(row_idxs))+1;
LongSigns = -2*cellfun(@(dir) strcmp(dir, 'W'), A{5}(row_idxs))+1;
Location = Location .* [LatSigns LongSigns];
Location = Location .* [LatSigns LongSigns];
display(Location);
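The sign handling can be checked in isolation. The sketch below applies the same mapping (S and W become -1, everything else +1) to a few hand-written rows; the direction letters and coordinates are made up for illustration, and strcmp is applied to the whole cell array at once, which should be equivalent to the cellfun form:

```matlab
% Illustrative inputs: hemisphere letters and unsigned coordinates.
latDir = {'S'; 'N'; 'S'};
lonDir = {'E'; 'W'; 'E'};
coords = [98.90 18.0014; ...
          98.91 18.0015; ...
          98.92 18.0016];

% strcmp returns 1 where the letter matches, so 1 - 2*match is -1 or +1.
latSign = 1 - 2*strcmp(latDir, 'S');
lonSign = 1 - 2*strcmp(lonDir, 'W');

% Apply the signs column by column.
signedCoords = coords .* [latSign lonSign];
disp(signedCoords)
```

Southern latitudes and western longitudes come out negative, which matches the usual signed-degrees convention.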