I want to write a list of data to a text file, and for that I use:
fprintf(fid, '%d %s %d\n',ii, names{ii},vals(ii));
The problem is that some names in my data are longer than others, so I get results in this form:
1 XXY 5
2 NHDMUCY 44
3 LL 96
...
How can I change the fprintf line so that the results come out in this form:
1 XXY      5
2 NHDMUCY 44
3 LL      96
...
Something like this before the start of the loop -
%// values printed in the first column
col1 = 1:numel(names);
%// width of each first-column number and the corresponding whitespace padding
lens0 = cellfun('length',cellfun(@(x) num2str(x),num2cell(col1),'Uni',0));
pad_ws_col1 = max(lens0) - lens0;
%// width of each names string and the corresponding whitespace padding
lens1 = cellfun('length',names);
pad_ws_col2 = max(lens1) - lens1;
Then, inside the loop -
fprintf(fid, '%d %s %s %s %d\n',col1(ii), repmat(' ',1,pad_ws_col1(ii)), ...
    names{ii},repmat(' ',1,pad_ws_col2(ii)),vals(ii));
Output would be -
1  XXY      5
2  NHDMUCY  44
3  LL       96
For a range 99 - 101, it would be -
99   XXY      5
100  NHDMUCY  44
101  LL       96
Please note that the third column numerals start at a fixed distance instead of ending at a fixed distance from the start of each row as asked in the question. But, assuming that the whole idea of the question was to present the data in a more readable way, this could work for you.
You can use the function char to convert a cell array of strings into a character array in which every row is padded to the length of the longest one.
So for you:
charNames = char( names ) ;
then you can use fprintf :
fprintf(fid, '%d %s %d\n',ii, charNames(ii,:) , vals(ii)) ;
Just make sure your cell array is a column before you convert it to char.
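The padding idea in both answers is not specific to MATLAB. As a rough illustration only, here is the same layout sketched in Python, with the value column right-aligned as the question asked. `format_rows` is a hypothetical helper name, and the sample data is taken from the question.

```python
def format_rows(names, vals, start=1):
    """Return one text line per row, with every column padded to a fixed width."""
    indices = range(start, start + len(names))
    idx_width = len(str(start + len(names) - 1))   # width of the largest index
    name_width = max(len(n) for n in names)        # width of the longest name
    val_width = max(len(str(v)) for v in vals)     # width of the longest value
    return [
        # '<' pads on the right (left-aligned), '>' pads on the left (right-aligned)
        f"{i:<{idx_width}} {name:<{name_width}} {val:>{val_width}}"
        for i, name, val in zip(indices, names, vals)
    ]


for line in format_rows(["XXY", "NHDMUCY", "LL"], [5, 44, 96]):
    print(line)
```

The widths are computed once, before any line is written, which mirrors the "before the start of the loop" step above.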
I have consecutive .dat files which I want to read and input into a single matrix by concatenating the files vertically. The code I have so far works fine for simple numeric files with only tabs as delimiter.
import=[];
data=[];
for i = 1:32
data1=[import dlmread(sprintf('%d.dat',i))];
data=vertcat(data, data1);
clear data1;
end
and I get the correct output in the data matrix. But my file format is as follows:
first second third
0 11/15 08:57:42.000 54 67 82
1 11/15 09:48:47.010 49 32 31
...
As you can see I have three delimiters (: \t /) and headers only in the last three columns which are essentially the ones I want to read, that is I want a matrix:
54 67 82
49 32 31
...
I tried specifying the delimiters in dlmread, along with how many rows/columns to skip, but an error occurs in sprintf ('delimiter = sprintf(delimiter); % Interpret \t (if necessary)'). Does anyone have any idea how to go about this?
UPDATE:
I managed to get a little further
data=[];
for i = 1:32
filename = sprintf( '%d.dat',i );
data1=importdata(filename);%creates a cell array
data2=cell2mat(data1(3:end,:));%converts it to char
%The data, without the header, start from the 3rd row.
data=vertcat(data, data2); %concatenate vertically all the files
clear data1; clear data2;
end
%the data
a1=str2num(data(1:end,20:25));%the first data column is in char 20-25
a2=str2num(data(1:end,30:35));%the second data column is in char 30-35
The thing is that the last part takes too much time: over an hour had passed before I manually stopped it. Does anyone know a simpler and faster way to do this?
I managed to solve this myself, so I am posting it here for future reference:
data = [];
for i = 1:32
    filename = sprintf( '%d.dat',i );
    data1 = dlmread(filename,'',2,3);%skip the first 2 rows and first 3 columns (offsets are zero-based)
    data=vertcat(data, data1);
    clear data1;
end
Now the data matrix contains only my data columns and it runs in a few seconds.
I have a data set that I would like to store and be able to load in Octave
18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"
15.0 8 350.0 165.0 3693. 11.5 70 1 "buick skylark 320"
18.0 8 318.0 150.0 3436. 11.0 70 1 "plymouth satellite"
16.0 8 304.0 150.0 3433. 12.0 70 1 "amc rebel sst"
17.0 8 302.0 140.0 3449. 10.5 70 1 "ford torino"
15.0 8 429.0 198.0 4341. 10.0 70 1 "ford galaxie 500"
14.0 8 454.0 220.0 4354. 9.0 70 1 "chevrolet impala"
14.0 8 440.0 215.0 4312. 8.5 70 1 "plymouth fury iii"
14.0 8 455.0 225.0 4425. 10.0 70 1 "pontiac catalina"
15.0 8 390.0 190.0 3850. 8.5 70 1 "amc ambassador dpl"
It does not work immediately when I try to use:
data = load('auto.txt')
Is there a way to load from a text file with the given format, or do I need to convert it to e.g.
18.0,8,307.0,130.0,3504.0,12.0,70,1
...
EDIT:
Deleting the last row, fixing the 'half' numbers (e.g. 3504. -> 3504.0), and then using:
data = load('-ascii','autocleaned.txt');
loaded the data as wanted into a matrix in Octave.
load is usually meant for loading Octave and MATLAB binary files, but it can also be used for loading textual data like yours. You can load your data using the "-ascii" option, but you would have to reformat your file slightly first: use a consistent column separator (i.e. just a tab or a comma), use full numbers (3850.0, not 3850.), and don't use strings.
Then you can do something like this to get it to work:
DATA = load("-ascii", "auto.txt");
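The reformatting described above can be scripted rather than done by hand. The following is only a sketch, assuming every line has the layout shown in the question (numeric fields followed by one double-quoted string); `clean_line` is a hypothetical helper name.

```python
def clean_line(line):
    """Drop the trailing quoted string and return the numeric fields, comma-separated."""
    numeric_part = line.split('"')[0]   # everything before the first double quote
    fields = numeric_part.split()       # split on any run of whitespace
    # Turn "half" numbers such as "3504." into "3504.0"
    fields = [f + "0" if f.endswith(".") else f for f in fields]
    return ",".join(fields)


sample = '18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"'
print(clean_line(sample))
```

Applying this to each line of the file and writing the results out would give a file with a single separator, full numbers, and no strings.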
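If the final string field is removed from each line, the file can be read with:
filename = 'stack25148040_1.txt';
fid = fopen(filename, 'r');
if (fid == -1)
  error('could not open %s', filename);
endif
% 8 numeric fields per line; fscanf fills column by column, so transpose afterwards
[x, count] = fscanf(fid, '%f', [8, Inf]);
x = x';
fclose(fid);
Alternatively, the whole file could be read in as one column and reshaped.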
I haven't figured out how to read both the numeric fields and the string field. For that I've had to fall back on Python with more general purpose file reading tools.
Here is a Python script that reads the file, creates a numpy structured array, writes that to a .mat file, which Octave can then read:
import csv
import numpy as np
from scipy.io import savemat

data = []
with open('stack25148040.txt', newline='') as f:
    r = csv.reader(f, delimiter=' ')
    # csv handles quoted strings with white space
    for l in r:
        # remove empty strings from the split on ' '
        row = [x for x in l if x]
        if row:  # skip empty lines (e.g. an empty last line)
            data.append(row)
print(data[0])
for dd in data:
    # convert 8 of the strings (per line) to float
    dd[:] = [float(d) for d in dd[:8]] + dd[-1:]
print(data[0])
print()
# make a structured array, with numbers and a string
dt = np.dtype("f8,i4,f8,f8,f8,f8,i4,i4,|S25")
A = np.array([tuple(d) for d in data], dtype=dt)
print(A)
savemat('stack25148040.mat', {'A': A})
In Octave this could be read with
load stack25148040.mat
A
# A = 1x10 struct array containing the fields:
# f0 f1 ... f8
A.f8 # string field
A(1) # 1st row
# scalar structure containing the fields:
# f0 = 18
# f1 = 8
...
# f8 = chevrolet chevelle malibu
Newer Octave (3.8) has an importdata function. It handles the original data file without any extra arguments and returns a structure with 2 fields.
x.data is a (10,11) matrix. x.data(:,1:8) is the desired numerical data. x.data(:,9:11) is a mix of NA and random numbers. The NA stand in for the words at the end of the lines. x.textdata is a (24,1) cell with those words. The quoted strings could be reassembled from those words, using the NA and quotes to determine how many words belong to which line.
To read the numeric data it uses dlmread. Since the rest of importdata is written in Octave, it could be used as the starting point for a custom function that handles the string data properly.
dlmread ('stack25148040.txt')(:,1:8)
importdata ('stack25148040.txt').data(:,1:8)
textread ('stack25148040.txt','')(:,1:8)
https://octave.org/doc/v4.0.0/Simple-File-I_002fO.html
Try this,
data = importdata('Auto.data')
I'm having an issue getting what I want from sscanf,
e.g. getting varname, year, month, day from a filename:
filename = 'stn2014021412598cjgafe.cnv'
format = '%3s%4d%2d%2d%5d%*10s';
test = sscanf(filename,format);
and I get the result:
test =
115
116
110
2014
2
14
12598
but what I want is the
varname = 'stn'
year = 2014
month = 2
day = 14
and then record or not the 5 digits
num = 12598
and skip everything else.
However, I don't understand why I get those 3 numbers 115, 116, 110.
Those first three values are the character codes for 's', 't' and 'n'. The sscanf documentation explains why it comes out this way for your format specifier.
Mixing character and numeric conversion specifications causes the
resulting matrix to be numeric and any characters read to show up
as their numeric values, one character per MATLAB matrix element.
In other words:
>> char(test(1:3))'
ans =
stn
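The correspondence between those three numbers and the letters can be checked in any language that exposes character codes. A small Python sketch, for illustration only:

```python
# Character codes of the three leading letters in the filename
codes = [ord(c) for c in "stn"]
print(codes)

# Going the other way: turn the numeric values back into text
recovered = "".join(chr(n) for n in [115, 116, 110])
print(recovered)
```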
An easier solution is probably textscan since it stores the components in a cell array, allowing different types:
>> C = textscan(filename,format)
C =
{1x1 cell} [2014] [2] [14] [12598]
>> C{1}
ans =
'stn'
I have a number like this - 778310098 - and I want to read 2 bytes at a time. So, I am expecting my output to be 77; 83; 10; 09; 8. I tried using the below:
uint16(fread(fileID,inf, 'ubit8'))
and the output I get is the ASCII value of the individual numbers:
55
55
56
51
49
48
48
57
56
What do I need to do to get the desired output?
To read pairs of ASCII digits from a text file (we tend to describe text files in characters, not bytes), use:
[10 1] * (fread(fileID,[2 inf], 'char') - 48)
To read bytes pairwise from a binary file, try
fread(fileID,inf, '*uint16')
One method is to convert it to a string, then process the string, then convert it back to an integer. While this may not be particularly elegant or perfect, will this do the trick?
a = 778310098;
b = num2str(a);
for i = 1:2:length(b)
if i == length(b) % to handle the case for odd input
split = str2num(b(i))
else
split = str2num(b(i:i+1)) % handle all others
end
end
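The same string-based approach can be written compactly. This is a sketch in Python rather than MATLAB, with `split_pairs` as a hypothetical name; note that converting back to integers drops the leading zero, so "09" comes out as 9.

```python
def split_pairs(number):
    """Split the decimal digits of `number` into two-digit chunks, left to right."""
    digits = str(number)
    # Slicing past the end is safe, so an odd trailing digit forms its own chunk
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]


print(split_pairs(778310098))
```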
How can I convert the numbers in the range 1 through 26 to their respective letter position in the alphabet?
1 = A
2 = B
...
26 = Z
CHR(#) will give you the ASCII character, you just need to offset it based on the ASCII table:
e.g. A = 65, so you will need to add 64 to 1:
CHR(64 + #) = A if # is 1
ASCII code is the numerical representation of a character such as 'a' or 'Z'. By looking at the table, one can see that capital A has a value of 65 and Z has a value of 90. Adding 64 to each value in the range 1-26 will therefore give you the corresponding letter.
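As a concrete instance of the offset, here is a minimal sketch in Python, where the built-in chr plays the role of CHR above; `number_to_letter` is a hypothetical name.

```python
def number_to_letter(n):
    """Map 1..26 to 'A'..'Z' by adding the ASCII offset 64."""
    if not 1 <= n <= 26:
        raise ValueError("n must be in the range 1..26")
    return chr(64 + n)


print([number_to_letter(n) for n in (1, 2, 26)])
```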