Why does readmatrix in Matlab skip the first n lines? - matlab

In my simulation I am writing data to file using writematrix, then later reading it back using readmatrix. I am appending to a single file at each time step, and each line is the same length as or longer than the previous line.
For some reason when using readmatrix on the output file, the first n lines are skipped entirely, as in not read at all. For example, my file looks like this:
...
11.8,1,2,3,4,5,6,7,8,9,10,2
11.9,1,2,3,4,5,6,7,8,9,10,2
...
12.3,1,2,3,4,5,6,7,8,9,10,2
12.4,7,8,9,10,7,8,9,10,1,2,1,1,2,3,4,5,6,3,4,5,6,1
12.5,7,8,9,10,7,8,9,10,1,2,1,1,2,3,4,5,6,3,4,5,6,1
...
30.5,7,8,9,10,7,8,9,10,1,2,2,1,2,3,4,5,6,3,4,5,6,2
30.6,7,8,9,10,7,8,9,10,1,2,2,1,2,3,4,5,6,3,4,5,6,2
30.7,17,18,19,20,1,2,7,8,9,10,1,1,2,3,4,5,6,3,4,5,6,2,11,12,13,14,15,16,7,8,9,10,1
30.8,17,18,19,20,1,2,7,8,9,10,1,1,2,3,4,5,6,3,4,5,6,2,11,12,13,14,15,16,7,8,9,10,1
...
(the first column is a time stamp, so the first ellipsis represents t=0 to t=11.7. At t=30.7 there is another step jump in the number of entries), and when I read using the command
data = readmatrix('/path/to/file/data.csv');
the matrix data looks like
12.4 7 8 9 10 7 8 9 10 1 2 1 1 2 3 4 5 6 3 4 5 6 1
12.5 7 8 9 10 7 8 9 10 1 2 1 1 2 3 4 5 6 3 4 5 6 1
12.6 7 8 9 10 7 8 9 10 1 2 1 1 2 3 4 5 6 3 4 5 6 1
...
30.5 7 8 9 10 7 8 9 10 1 2 2 1 2 3 4 5 6 3 4 5 6 2
30.6 7 8 9 10 7 8 9 10 1 2 2 1 2 3 4 5 6 3 4 5 6 2
30.7 17 18 19 20 1 2 7 8 9 10 1 1 2 3 4 5 6 3 4 5 6 2 11 12 13 14 15 16 7 8 9 10 1
30.8 17 18 19 20 1 2 7 8 9 10 1 1 2 3 4 5 6 3 4 5 6 2 11 12 13 14 15 16 7 8 9 10 1
...
That is to say, all the entries before t=12.4 (i.e. the first step jump in line length) are skipped.
In the file, if I delete everything before the first step jump (i.e. everything before t=12.4), then I get the same matrix data, so we can conclude the subsequent step jumps cause no issue. If I delete everything from the second step jump (i.e. everything after t=30.6) then it still skips all the entries before t=12.4. If I have no step jumps (i.e. only t=0 to t=12.3) then it happily reads in the first lines.
I've tried reading the same file using csvread and it returns all of the data from the beginning of the file (albeit padded with zeros instead of NaNs), so I'm confident the issue isn't with the file.
Why is this happening?
A minimum working example is the first code block without the ellipses.
For reference, the first lines have 12 comma-separated values, and each step jump increases that by 11.
Edit:
Output from detectImportOptions
ans =
DelimitedTextImportOptions with properties:
Format Properties:
Delimiter: {','}
Whitespace: '\b\t '
LineEnding: {'\n' '\r' '\r\n'}
CommentStyle: {}
ConsecutiveDelimitersRule: 'split'
LeadingDelimitersRule: 'keep'
EmptyLineRule: 'skip'
Encoding: 'UTF-8'
Replacement Properties:
MissingRule: 'fill'
ImportErrorRule: 'fill'
ExtraColumnsRule: 'addvars'
Variable Import Properties: Set types by name using setvartype
VariableNames: {'Var1', 'Var2', 'Var3' ... and 20 more}
VariableTypes: {'double', 'double', 'double' ... and 20 more}
SelectedVariableNames: {'Var1', 'Var2', 'Var3' ... and 20 more}
VariableOptions: Show all 23 VariableOptions
Access VariableOptions sub-properties using setvaropts/getvaropts
PreserveVariableNames: false
Location Properties:
DataLines: [4 Inf]
VariableNamesLine: 0
RowNamesColumn: 0
VariableUnitsLine: 0
VariableDescriptionsLine: 0
To display a preview of the table, use preview

Matlab's readmatrix is trying to be smart and locate a 2-D matrix within the data model of the CSV file you're passing it. It looks like it's passing over the first few lines which don't have explicit trailing empty "cells".
You can control this by setting the import options. Run opts = detectImportOptions(...); on your file and have a look at the DataLines property. If it doesn't start at 1, set it to [1 Inf] to force readmatrix to read in all the lines. Then call readmatrix, explicitly passing in that options object.
To do this compactly (and probably more efficiently), call readmatrix with an explicit option right off the bat like this:
readmatrix(path2mat,delimitedTextImportOptions('DataLines',[1,Inf]))
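The two-step version described above can be sketched as follows. This is a minimal illustration, not a definitive fix: the temporary file name and the sample rows are made up for the example, and the exact DataLines value that detectImportOptions guesses may differ between files and MATLAB releases.

```matlab
% Write a small ragged CSV resembling the one in the question
% (hypothetical temp file and made-up sample rows).
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, '11.8,1,2,3\n11.9,1,2,3\n12.0,1,2,3\n');
fprintf(fid, '12.4,7,8,9,10,7,8\n12.5,7,8,9,10,7,8\n');
fclose(fid);

opts = detectImportOptions(csvFile);  % let MATLAB guess the file layout
disp(opts.DataLines)                  % inspect where it thinks the data starts
opts.DataLines = [1 Inf];             % force reading from the first line
data = readmatrix(csvFile, opts);     % shorter rows are filled with NaN

delete(csvFile);                      % clean up the temporary file
```

Inspecting opts.DataLines before overriding it shows whether the detection step is what skipped the leading rows.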

Related

KDB+/Q:Input agnostic function for single and multi row tables

I have tried using the following function to derive a table consisting of 3 columns with one column data holding a list of an arbitrary schema.
fn:{
flip `time`data`id!(x`b;(x`a`b`c`d`e);x`a)
};
which works well on input with multiple rows i.e.:
q)x:flip `a`b`c`d`e!(5#enlist 5?10)
q)fn[`time`data`id!(x`b;(x`a`b`c`d`e);x`a)]
time data id
-----------------
8 8 5 2 8 6 8
5 8 5 2 8 6 5
2 8 5 2 8 6 2
8 8 5 2 8 6 8
6 8 5 2 8 6 6
However fails when using input with a single row i.e.
q)x:`a`b`c`d`e!5?10
q)fn[`time`data`id!(x`b;(x`a`b`c`d`e);x`a)]
time data id
------------
8 7 7
8 8 7
8 4 7
8 4 7
8 6 7
which is obviously incorrect.
One might fix this by using enlist i.e.
q)x:enlist `a`b`c`d`e!5?10
q)fn[`time`data`id!(x`b;(x`a`b`c`d`e);x`a)]
time| 8
data| 7 8 4 4 6
id | 7
Which is correct, however if one were to apply this in the function i.e.
fn:{
flip enlist `time`data`id!(x`b;(x`a`b`c`d`e);x`a)
};
...
time| 2 5 8 7 9
data| 2 5 8 7 9 2 5 8 7 9 2 5 8 7 9 2 5 8 7 9 2 5 8 7 9
id | 2 5 8 7 9
Which has the wrong format of data values.
My question here is how might one avert this conversion issue and derive the same field values whether the argument is a multi row or single row table.
Or otherwise what is the canonical implementation of this in kdb+/q
Thanks
Edit:
To clarify: my problem isn't necessarily with the data input as one could just apply enlist if it is only one row. My question pertains to how one might use enlist in the fn function to make single row input conform to the logic seen when using multi row tables. i.e. how to replace fn enlist input with fn data (how to make the function input agnostic) Thanks
Are you meaning to flip the data perpendicular to the rest of the table? Your 5 row example works because there are 5 rows and 5 columns. The single row doesn't work due to 1 row to 5 columns.
Correct me if I'm wrong but I think this is what you want:
fn:{([]time:x`b;data:flip x`a`b`c`d`e;id:x`a)};
--------------------------------------------------
t1:flip `a`b`c`d`e!(5#enlist til 5);
a b c d e
---------
0 0 0 0 0
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
fn[t1]
time data id
-----------------
0 0 0 0 0 0 0
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
--------------------------------------------------
t2:enlist `a`b`c`d`e!til 5;
a b c d e
---------
0 1 2 3 4
fn[t2]
time data id
-----------------
1 0 1 2 3 4 0
Note without the flip you get this:
([]time:t1`b;data:t1`a`b`c`d`e;id:t1`a)
time data id
-----------------
0 0 1 2 3 4 0
1 0 1 2 3 4 1
2 0 1 2 3 4 2
3 0 1 2 3 4 3
4 0 1 2 3 4 4
In this case the time is no longer in line with the data but it works because of 5 row and cols.
Edit - I can't think of a better way to convert a dictionary to a table when needed other than using count first in a conditional. Note if the first key is a nested list this wouldn't work
{ $[1 = count first x;enlist x;x] } `a`b`c`d`e!til 5
Note, your provided function doesn't work with this:
{
flip `time`data`id!(x`b;(x`a`b`c`d`e);x`a)
}{$[1 = count first x;enlist x;x]} `a`b`c`d`e!til 5

Take a column of a matrix and make it a row in kdb

Consider the matrix:
1 2 3
4 5 6
7 8 9
I'd like to take the middle column, assign it to a variable, and replace the middle row with it, giving me
1 2 3
2 5 8
7 8 9
I'm extracting the middle column using
a:m[;enlist 1]
which returns
2
5
8
How do I replace the middle row with a? Is a flip necessary?
Thanks.
If you want to update the matrix in place you can use
q)show m:(3;3)#1+til 10
1 2 3
4 5 6
7 8 9
q)a:m[;1]
q)m[1]:a
q)show m
1 2 3
2 5 8
7 8 9
q)
cutting out "a" all you need is:
m[1]:m[;1]
You can use dot amend -
q)show m:(3;3)#1+til 10
1 2 3
4 5 6
7 8 9
q)show a:m[;1]
2 5 8
q).[m;(1;::);:;a]
1 2 3
2 5 8
7 8 9
Can see documentation here:
http://code.kx.com/wiki/Reference/DotSymbol
http://code.kx.com/wiki/JB:QforMortals2/functions#Functional_Forms_of_Amend
Making it slightly more generic where you can define the operation, row, and column
q)m:3 cut 1+til 9
1 2 3
4 5 6
7 8 9
Assigning the middle column to middle row :
q){[ m;o;i1;i2] .[m;enlist i1;o; flip[m] i2 ] }[m;:;1;1]
1 2 3
2 5 8
7 8 9
Adding the middle column to middle row by passing o as +
q){[ m;o;i1;i2] .[m;enlist i1;o; flip[m] i2 ] }[m;+;1;1]
1 2 3
6 10 14
7 8 9

Extracting max value from each column in a table

I have generated a table of data with time in one column with attempts 1-10 in the next series of columns. I want to be able to extract the max value in each attempt for further analysis.
I have tried for table MGA
max = max(MGA(:, []))
I get the following error -- "You cannot subscript a table using only one subscript. Table subscripting requires both row and variable subscripts."
First off: never do max = max();. That creates a variable named max which shadows the built-in function, and you won't be able to call max again until you clear the variable.
And to answer the question, you can do as follows (notice that I've kept the values in the first columns):
MGA
MGA =
1 5 3 8 9
2 4 7 3 3
3 8 7 6 9
4 8 2 7 3
5 2 2 9 10
6 5 5 10 4
7 5 10 6 2
8 7 4 2 3
9 8 6 2 7
10 8 3 3 5
max_values = [MGA(:,1), max(MGA(:,2:end),[],2)]
max_values =
1 9
2 7
3 9
4 8
5 10
6 10
7 10
8 7
9 8
10 8
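The error message in the question suggests MGA is a table rather than a plain matrix, in which case the numeric values have to be extracted first. The sketch below assumes a small made-up table (the variable names time, attempt1, and so on are hypothetical) and shows both the per-attempt (column) maximum and the per-time-step (row) maximum used in the answer above.

```matlab
% Hypothetical table with a time column and three attempt columns.
MGA = array2table([1 5 3 8; 2 4 7 3; 3 8 7 6], ...
    'VariableNames', {'time', 'attempt1', 'attempt2', 'attempt3'});

attempts = MGA{:, 2:end};              % brace indexing extracts a numeric matrix
maxPerAttempt = max(attempts, [], 1);  % one maximum per attempt (down each column)
maxPerTime = max(attempts, [], 2);     % one maximum per time step (along each row)

% Keep the time stamps next to the row-wise maxima, as in the answer.
max_values = [MGA.time, maxPerTime];
```

Choosing dimension 1 or 2 in the third argument of max is what decides whether the result is per attempt or per time step.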

period in sequence of element in vector

I have a periodic vector v and I want to write a program which associates with each period the sequence of elements in that period and gives its cardinal (the period length).
For example:
For the vector v=(2 3 7 2 7 3 2 3 7 2 7 3) and cardinal 6, give me only the vector P=(2 3 7 2 7 3).
For The vector v=(2 3 7 5 8 6 10 11 10 6 8 5 7 3 2 3 7 5 8 6) and the cardinal 14, give me P=(2 3 7 5 8 6 10 11 10 6 8 5 7 3).
If I understood you correctly, you can use the function seqperiod(v) (part of the Signal Processing Toolbox).
In your case, for example:
v=[2 3 7 5 8 6 10 11 10 6 8 5 7 3 2 3 7 5 8 6];
>> seqperiod(v)
ans =
14
An interesting point: your second example does not contain a full repetition (20 elements against a period of 14), so strictly speaking we can't say whether it is periodic. seqperiod still works, though, and returns 14 as you wanted.
Going further, you can request a second output:
[p, num] = seqperiod(v);
p = 14
num = 1.4286
num is the number of repetitions of the period in v (here 20/14).
OK, now you say you need not only the cardinal but also the vector. That is easy to get:
result = v(1:p);
Hope it helps!
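If the toolbox function is not available, the period can also be found with a plain loop. This is a hand-rolled sketch, not how seqperiod is implemented: it looks for the smallest shift k at which the vector agrees with itself on the overlapping part, and falls back to the full length when no shorter shift fits.

```matlab
v = [2 3 7 5 8 6 10 11 10 6 8 5 7 3 2 3 7 5 8 6];
n = numel(v);

p = n;                                  % fallback: no shorter period found
for k = 1:n-1
    % v has period k if it matches itself shifted by k on the overlap
    if isequal(v(1:n-k), v(k+1:n))
        p = k;
        break
    end
end

P = v(1:p);                             % the elements of one period
```

For the second example vector from the question this gives the same period length as the seqperiod call above.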

Matlab: creating a matrix whose rows consist of a linspace or similar pattern

Anybody know a fast way to produce a matrix consisting of a linspace for each row? For example, the sort of pattern I'm looking for in this matrix is:
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
...
1 2 3 4 5 6 7 8 9 10
Anyone know any fast tricks to produce this WITHOUT using a for loop?
I just figured this out, so just in case anyone else was troubled by this, we can achieve this exact pattern by:
a=linspace(1,10,10);
b=ones(3,1)*a;
This will give:
>> a = 1 2 3 4 5 6 7 8 9 10
>> b = 1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
You need to use repmat.
Example:
>> B = repmat(1:10,[3 1])
B =
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
Change the 3 to the number of rows you want.
Another shortcut I can recommend is similar to repmat: first define a base array, a = 1:10;. Then index its first dimension with a vector of 1s, which selects row 1 repeatedly and produces a matrix with as many rows as you want, each row being the base array a. As such:
%// number of times to replicate
n = 4;
a = 1:10;
a = a(ones(1,n),:);
Result:
a =
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
Use this command: transpose(ndgrid(1:10,1:n));, where n is the number of rows desired in the result.
You can consider these solutions:
With basic matrix indexing:
b=a([1:size(a,1)]' * ones(1,NumToReplicate), :) %row-wise replication
b=a(:, ones(NumToReplicate, 1)) %column-wise replication
With bsxfun:
bsxfun(@times,a,(ones(1,NumToReplicate))') %row-wise replication
bsxfun(@times,a',(ones(1,NumToReplicate))) %column-wise replication
You are welcome to benchmark the above two solutions against repmat.
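As a quick sanity check, the approaches from the answers above can be run side by side to confirm they build the same matrix. The variable names B1 to B5 are only for this illustration, and n = 4 is an arbitrary choice.

```matlab
n = 4;                                  % number of rows to produce
a = 1:10;                               % base row vector

B1 = repmat(a, n, 1);                   % repmat
B2 = ones(n, 1) * a;                    % outer product with a column of ones
B3 = a(ones(n, 1), :);                  % indexing row 1 repeatedly
B4 = bsxfun(@times, a, ones(n, 1));     % bsxfun with singleton expansion
B5 = transpose(ndgrid(1:10, 1:n));      % ndgrid, then transpose

allEqual = isequal(B1, B2, B3, B4, B5);
```

Which one is fastest depends on the MATLAB release and the matrix size, so timing them (for example with timeit) on your own data is the safest way to decide.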