Matlab dlmread adds random zeros - matlab

I need to load data into an array in MATLAB. I am trying to use dlmread, but it adds random zeros. How can I define the row length?
My file:
1 65.058 5 0
2 80.661 46 0
3 102.083 197 1
4 80.529 111 5
5 88.331 160 6
My line:
X = dlmread(Data, ' ', 0, 0);
Output:
1.0000 65.0580 5.0000
0 0 0
2.0000 80.6610 46.0000
0 0 0
3.0000 102.0830 197.0000
1.0000 0 0
4.0000 80.5290 111.0000
5.0000 0 0

There are two consecutive spaces in the first line of your file. This causes dlmread to add an additional column. I can't recreate your output (my version is R2015b), but I'm suspecting this is the culprit. You don't need to (and can't) define the number of rows or columns with dlmread; it's supposed to figure it out for itself by design. This shouldn't be a problem when your input data matches the expected format.
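To illustrate the point about consecutive spaces, here is a minimal sketch. It assumes the doubled space is the only formatting problem; the sample file contents and the temporary file name are made up for the example. When dlmread is called without a delimiter argument, it infers the delimiter and treats repeated whitespace as a single separator, whereas an explicitly passed ' ' makes every space count as its own delimiter.

```matlab
% Write a hypothetical sample file with a doubled space on the first line.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1  65.058 5 0\n2 80.661 46 0\n3 102.083 197 1\n');
fclose(fid);

% No delimiter argument: repeated whitespace is treated as a single delimiter,
% so the doubled space does not create an extra column.
X = dlmread(fname);

delete(fname);  % clean up the temporary file
```

If the file cannot be cleaned up at the source, dropping the explicit ' ' argument is probably the smallest change to the original call.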

Related

MATLAB: Filter struct based on column value

I'm new to MATLAB and too used to Python, and I'm having difficulty finding a way to filter a struct based on a condition, similar to how I can filter a pandas DataFrame in Python.
Matlab
a = arrayfun(@(x) x.value == 10, Data);
Data_10 = Data(a);
Error using arrayfun
Non-scalar in Uniform output, at index 1, output 1.
Set 'UniformOutput' to false.
How I would do so in Python:
Data_10 = Data[Data.value == 10]
Try this:
Data_10 = zeros(size(Data.value));
Data_10(Data.value == 10) = 10;
This writes the value 10 into every position of Data_10 where Data.value equals 10 and leaves the rest as 0.
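The error message in the question suggests that x.value is not a scalar, so the layout of Data matters. The sketch below covers two assumed layouts, since the question does not show the struct itself: a scalar struct whose fields are equal-length columns, and a struct array with one scalar value per element. The field name `name` and all sample values are hypothetical.

```matlab
% Case 1: hypothetical scalar struct whose fields are equal-length columns
% (the layout closest to a pandas DataFrame).
Data.value = [10; 3; 10; 7];
Data.name  = {'a'; 'b'; 'c'; 'd'};

mask = Data.value == 10;  % logical row selector
% Apply the same row mask to every field; the result is again a struct.
Data_10 = structfun(@(f) f(mask), Data, 'UniformOutput', false);

% Case 2: hypothetical struct array with one scalar value per element.
S = struct('value', {10, 3, 10});
S_10 = S([S.value] == 10);  % keeps elements 1 and 3
```

In both cases the filter is a logical mask, which is the closest MATLAB counterpart to the boolean indexing used in the pandas one-liner.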
I am not sure I fully understood your question. Here is my understanding:
You want to filter certain values of a matrix.
Let's imagine we have a matrix A filled with values. You want to filter out values smaller than lowthresh = 0 and greater than upthresh = 5.
A = [3 6 -2.4 1; 0 34 4.76 0.5; 84 3 2.32 4; 1 -1 2 3.99];
lowthresh = 0;
upthresh = 5;
A(A<lowthresh | A>upthresh) = NaN; % NaN is a good flag
Output:
A =
3.0000 NaN NaN 1.0000
0 NaN 4.7600 0.5000
NaN 3.0000 2.3200 4.0000
1.0000 NaN 2.0000 3.9900
Having substituted the values, you can apply basic functions that ignore NaNs, for instance the mean:
mean(A,'omitnan')
ans =
1.3333 3.0000 3.0267 2.3725
I hope this addresses your question. Note that you can do this with any expression that returns a logical array of the same size (isnan(), ...), even if that array is not computed from the matrix being modified.
Let's say we have two matrices that have the same size but different numbers:
A =
1 1 0
1 1 0
0 0 0
B =
0 0 0
0 0 0
0 0 0
We can easily say:
B(A==1) = 2
B =
2 2 0
2 2 0
0 0 0
I hope it helped a bit,
cheers Pablo

How to accumulate (average) data based on multiple criteria

I have a set of data where I have recorded values in sets of 3 readings (so as to be able to obtain a general idea of the SEM). I have them recorded in a list that looks as follows, which I am trying to collapse into averages of each set of 3 points:
I want to collapse essentially each 3 rows into one row where the average data value is given for that set. In essence, it would look as follows:
This is something I know how to do basically in Excel (i.e. using a Pivot table) but I am not sure how to do the same in MATLAB. I have tried using accumarray but struggle with knowing how to incorporate multiple conditions essentially. I would need to create a subs array where its number corresponds to each unique set of 3 data points. By brute force, I could create an array such as:
subs = [1 1 1; 2 2 2; 3 3 3; 4 4 4; ...]'
using some looping and have that as my subs array, but it isn't tied to the data itself, and there may be strange hiccups throughout (e.g. more than 3 data points per set, or missing data). I know there must be some way to have this sort of Pivot-table-esque grouping for something like this, but I need some help to get it off the ground. Thanks.
Here is the input data in text form:
Subject Flow On/Off Values
1 10 1 2.20
1 10 1 2.50
1 10 1 2.60
1 20 1 5.50
1 20 1 6.10
1 20 1 5.90
1 30 1 10.10
1 30 1 10.50
1 30 1 10.50
1 10 0 1.90
1 10 0 2.20
1 10 0 2.30
1 20 0 5.20
1 20 0 5.80
1 20 0 5.60
1 30 0 9.80
1 30 0 10.20
1 30 0 10.20
2 10 1 5.70
2 10 1 6.00
2 10 1 6.10
2 20 1 9.00
2 20 1 9.60
2 20 1 9.40
2 30 1 13.60
2 30 1 14.00
2 30 1 14.00
2 10 0 5.40
2 10 0 5.70
2 10 0 5.80
2 20 0 8.70
2 20 0 9.30
2 20 0 9.10
2 30 0 13.30
2 30 0 13.70
2 30 0 13.70
You can use unique and accumarray like so to maintain the order of your rows of data:
[newData, ~, subs] = unique(data(:, 1:3), 'rows', 'stable');
newData(:, 4) = accumarray(subs, data(:, 4), [], @mean);
newData =
1.0000 10.0000 1.0000 2.4333
1.0000 20.0000 1.0000 5.8333
1.0000 30.0000 1.0000 10.3667
1.0000 10.0000 0 2.1333
1.0000 20.0000 0 5.5333
1.0000 30.0000 0 10.0667
2.0000 10.0000 1.0000 5.9333
2.0000 20.0000 1.0000 9.3333
2.0000 30.0000 1.0000 13.8667
2.0000 10.0000 0 5.6333
2.0000 20.0000 0 9.0333
2.0000 30.0000 0 13.5667
I assume that
You want to average based on unique values of the first three columns (not on groups of three rows, although the two criteria coincide in your example);
Order is determined by column 1, then 3, then 2.
Then, denoting your data as x,
[~, ~, subs] = unique(x(:, [1 3 2]), 'rows', 'sorted');
result = accumarray(subs, x(:,end), [], @mean);
gives
result =
2.1333
5.5333
10.0667
2.4333
5.8333
10.3667
5.6333
9.0333
13.5667
5.9333
9.3333
13.8667
As you see, I am using the third output of unique with the 'rows' and 'sorted' options. This creates the subs grouping vector based on first three columns of your data in the desired order. Then, passing that to accumarray computes the means.
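Both answers above build a grouping vector with unique and pass it to accumarray. As a minimal sketch on a hypothetical six-row subset of the data, the same grouping vector can also report how many readings each set has and an SEM per set. The count and SEM columns are additions for illustration, prompted by the question's mention of SEM and of sets with missing or extra readings; they are not part of either answer.

```matlab
% Hypothetical subset of the data: [Subject Flow OnOff Value]
data = [1 10 1 2.2
        1 10 1 2.5
        1 10 1 2.6
        1 20 1 5.5
        1 20 1 6.1
        1 20 1 5.9];

% One group index per unique (Subject, Flow, OnOff) row, in order of appearance.
[groups, ~, subs] = unique(data(:, 1:3), 'rows', 'stable');

means  = accumarray(subs, data(:, 4), [], @mean);
counts = accumarray(subs, 1);  % readings per group; flags sets without exactly 3
sems   = accumarray(subs, data(:, 4), [], @(v) std(v) / sqrt(numel(v)));

% Columns: Subject, Flow, OnOff, mean, SEM, number of readings.
summary = [groups, means, sems, counts];
```

Because the groups come from the data rather than from row positions, a set with two or four readings still ends up in its own row.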
accumarray is indeed the way to go. First, you'll need to assign an index to each set of values with unique:
[unique_subjects, ~, ind_subjects] = unique(vect_subjects);
[unique_flows, ~, ind_flows] = unique(vect_flows);
[unique_on_off, ~, ind_on_off] = unique(vect_on_off);
So you now have ind_subjects, ind_flows and ind_on_off, whose values lie in [1..2], [1..3] and [1..2] respectively.
Now you can compute the mean values in a [3x2x2] array (in your example):
mean_values = accumarray([ind_flows, ind_on_off, ind_subjects], vect_values, [], @mean);
mean_values = mean_values(:);
Note: the order is set according to your example.
Then you can construct the summary:
[ind1, ind2, ind3] = ndgrid(1:numel(unique_flows), 1:numel(unique_on_off), 1:numel(unique_subjects));
flows_summary = unique_flows(ind1(:));
on_off_summary = unique_on_off(ind2(:));
subjects_summary = unique_subjects(ind3(:));
Note: this also works with non-numeric values.
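Put together, the steps of this answer might look like the sketch below. The input vectors are hypothetical, and the NaN fill value is an addition not present in the answer: accumarray fills combinations that never occur in the data with 0 by default, so the sketch marks them with NaN and drops them from the summary.

```matlab
% Hypothetical inputs: two readings for each of four observed combinations.
vect_subjects = [1; 1; 1; 1; 2; 2; 2; 2];
vect_flows    = [10; 10; 20; 20; 10; 10; 20; 20];
vect_on_off   = [1; 1; 0; 0; 1; 1; 0; 0];
vect_values   = [2; 4; 6; 8; 10; 12; 14; 16];

[unique_subjects, ~, ind_subjects] = unique(vect_subjects);
[unique_flows, ~, ind_flows]       = unique(vect_flows);
[unique_on_off, ~, ind_on_off]     = unique(vect_on_off);

% Mean per (flow, on/off, subject) cell; unobserved cells become NaN.
mean_values = accumarray([ind_flows, ind_on_off, ind_subjects], ...
                         vect_values, [], @mean, NaN);

% Enumerate the cells in the same order as mean_values(:).
[ind1, ind2, ind3] = ndgrid(1:numel(unique_flows), ...
                            1:numel(unique_on_off), ...
                            1:numel(unique_subjects));
summary = [unique_subjects(ind3(:)), unique_flows(ind1(:)), ...
           unique_on_off(ind2(:)), mean_values(:)];

% Keep only combinations that actually occur in the data.
summary = summary(~isnan(summary(:, 4)), :);
```

The numeric summary matrix only works when all three keys are numeric; with non-numeric keys the summary columns would need to be kept as separate variables or placed in a table.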
You should also try checking out the findgroups and splitapply reference pages. The easiest way to use them here is probably to place your data in a table:
>> T = array2table(data, 'VariableNames', { 'Subject', 'Flow', 'On_Off', 'Values'});
>> [gid,Tgrp] = findgroups(T(:,1:3));
>> Tgrp.MeanValue = splitapply(@mean, T(:,4), gid)
Tgrp =
12×4 table
Subject Flow On_Off MeanValue
_______ ____ ______ _________
1 10 0 2.1333
1 10 1 2.4333
1 20 0 5.5333
1 20 1 5.8333
1 30 0 10.067
1 30 1 10.367
2 10 0 5.6333
2 10 1 5.9333
2 20 0 9.0333
2 20 1 9.3333
2 30 0 13.567
2 30 1 13.867

Store values from existing matrix into new matrix

I have a matrix containing 8 cols and 80k rows. (From an excel file)
Each row has an ID.
I want to store all data with ID no. 1 in a new matrix. And all data with ID no. 2 in a second matrix etc. So each time an ID changes I want to save all the data of a new ID in a new matrix.
There are above 800 ID's.
I've tried several things without luck, among others:
k = zeros(117,8)
for i =1:80000
k(i) = i + Dataset(1:i,:)
end
The above was only to see whether I could get the first 117 rows saved in another matrix, which didn't succeed.
If one of the 8 columns contains the ID then you can use logical indexing. For example if column 1 contains the ID, we can first find a list of all different ID values:
uniqueIDs = unique(Dataset(:, 1));
Then we can create cell array, with the lists of items of a given ID:
listsByID = cell(length(uniqueIDs), 1);
for idx = 1:length(uniqueIDs)
listsByID{idx} = Dataset(Dataset(:, 1) == uniqueIDs(idx), :);
end
Running the above on an example dataset:
Dataset = [1 0.1 10
1 0.2 20
2 0.3 30
3 0.4 40
2 0.5 50
2 0.6 60];
Results in (the contents of listsByID{1}, listsByID{2} and listsByID{3}, one after another):
1.0000 0.1000 10.0000
1.0000 0.2000 20.0000
2.0000 0.3000 30.0000
2.0000 0.5000 50.0000
2.0000 0.6000 60.0000
3.0000 0.4000 40.0000
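The loop above can also be written as a single arrayfun call. This is only a sketch of an equivalent formulation on the same example dataset, assuming as before that column 1 holds the ID; the variable rowsPerID is introduced here for illustration.

```matlab
Dataset = [1 0.1 10
           1 0.2 20
           2 0.3 30
           3 0.4 40
           2 0.5 50
           2 0.6 60];

uniqueIDs = unique(Dataset(:, 1));

% One cell per ID, each holding all rows of Dataset with that ID.
listsByID = arrayfun(@(id) Dataset(Dataset(:, 1) == id, :), ...
                     uniqueIDs, 'UniformOutput', false);

% Number of rows stored for each ID (hypothetical helper variable).
rowsPerID = cellfun(@(m) size(m, 1), listsByID);
```

Keeping the per-ID matrices in one cell array avoids creating 800 separately named variables, and listsByID{k} always corresponds to uniqueIDs(k).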

Normalization of inputs of a feedforward Neural network

Let's say I have an m-by-n matrix of different features of a time series signal (column 1 represents linear regression of the last n samples, column 2 represents the average of the last n samples, column 3 represents the local max values of a different time series but correlated signal, etc). How should I normalize these inputs? All the inputs fall into different categories, so they have different ranges. One ranges from 0 to 1, another ranges from -5 to 50, and so on.
Should I normalize the WHOLE matrix? Or should I normalize each set of inputs one by one individually?
Note: I usually use mapminmax function from MATLAB for the normalization.
You should normalise each vector/column of your matrix individually, they represent different data types and shouldn't be mixed up together.
You could for example transpose your matrix to have your 3 different data types in the rows instead of in the columns of your matrix and still use mapminmax:
A = [0 0.1 -5; 0.2 0.3 50; 0.8 0.8 10; 0.7 0.9 20];
A =
0 0.1000 -5.0000
0.2000 0.3000 50.0000
0.8000 0.8000 10.0000
0.7000 0.9000 20.0000
B = mapminmax(A')
B =
-1.0000 -0.5000 1.0000 0.7500
-1.0000 -0.5000 0.7500 1.0000
-1.0000 1.0000 -0.4545 -0.0909
You should normalize each feature independently.
"column 1 represents linear regression of the last n samples, column 2 represents the average of the last n samples, column 3 represents the local max values of a different time series but correlated signal, etc"
I can't say for sure about your particular problem, but generally, you should normalize each feature independently. So normalize column 1, then column 2 etc.
"Should I normalize the WHOLE matrix? Or should I normalize each set of inputs one by one individually?"
I'm not sure what you mean here. What is an input? If by that you mean an instance (a row of your matrix), then no, you should not normalize rows individually, but columns.
I don't know how you would do this in Matlab, but I took your question more as a theoretical one than an implementation one.
If you want a range of [0,1] for all the columns, normalized within each column, you can use mapminmax like so (assuming A is the 2D input array) -
out = mapminmax(A.',0,1).'
You can also use bsxfun for the same output, like so -
Aoffsetted = bsxfun(@minus,A,min(A,[],1))
out = bsxfun(@rdivide,Aoffsetted,max(Aoffsetted,[],1))
Sample run -
>> A
A =
3 7 4 2 7
1 3 4 5 7
1 9 7 5 3
8 1 8 6 7
>> mapminmax(A.',0,1).'
ans =
0.28571 0.75 0 0 1
0 0.25 0 0.75 1
0 1 0.75 0.75 0
1 0 1 1 1
>> Aoffsetted = bsxfun(@minus,A,min(A,[],1));
>> bsxfun(@rdivide,Aoffsetted,max(Aoffsetted,[],1))
ans =
0.28571 0.75 0 0 1
0 0.25 0 0.75 1
0 1 0.75 0.75 0
1 0 1 1 1
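On MATLAB R2016b or later, the same per-column scaling can be written with implicit expansion instead of bsxfun. The sketch below reuses the sample matrix from the run above; keeping colMin and colRange so that later samples are scaled with the same parameters is an assumption about how the network will be used, and newRow is a made-up sample.

```matlab
A = [3 7 4 2 7
     1 3 4 5 7
     1 9 7 5 3
     8 1 8 6 7];

% Per-column scaling parameters, computed once from the training data.
colMin   = min(A, [], 1);
colRange = max(A, [], 1) - colMin;

% Implicit expansion (R2016b+) broadcasts the 1-by-5 rows over all rows of A.
% A constant column would give 0/0 = NaN and needs separate handling.
out = (A - colMin) ./ colRange;

% Hypothetical new sample, scaled with the stored parameters.
newRow    = [4.5 5 6 4 5];
newScaled = (newRow - colMin) ./ colRange;
```

Values in a new sample that fall outside the training range will land outside [0,1], which is expected with this kind of scaling.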

Find the same values in another column in matlab

I want to find values that appear in every column of a matrix.
For example, I have a matrix:
A = [1 11 0.17
2 1 78
3 4 90
45 5 14
10 10 1]
As you can see, the number 1 in column 1 also appears in column 2 and column 3, so I want to pick that number and put it into another cell or matrix:
B= [1]
and then perform another operation, C/B, where C is equal to:
C= [1
3
5
7
9]
and you will have:
D= [1 11 0.17 1
2 1 78 3
3 4 90 5
45 5 14 7
10 10 1 9]
After that, the values in column 4 have equivalent numbers that we can define, but we will choose only those in rows that contain the number 1 (that is, B).
define:
1-->23
3 -->56
9 --> 78
then we have, see image below:
So how can I do that? Is it possible? Thanks.
Let's break your problem into steps.
Step #1 - Determine if there is a value shared by all columns
We can do this with bsxfun, unique, permute, any and all.
We first need to use unique so that we can generate all possible unique values in the matrix A. Once we do this, we can look at each value of the unique values and see if all columns in A contain this value. If this is the case, then this is the number we need to focus on.
As such, do something like this first:
Aun = unique(A);
eqs_mat = bsxfun(@eq, A, permute(Aun, [3 2 1]));
eqs_mat would generate a 3D matrix where each slice figures out where a particular value in the unique array appeared. As such, for each slice, each column will have a bunch of false values but at least one true value where this true value tells you the position in the column that matched a unique value. The next thing you'll want to do is go through each slice of this result and determine whether there is at least one non-zero value for each column.
For a value to be shared along all columns, a slice should have a non-zero value per column.
We can eloquently determine which value we need to extract by:
ind = squeeze(all(any(eqs_mat,1),2));
Given your example data, we have this for our unique values:
>> Aun
Aun =
0.1700
1.0000
2.0000
3.0000
4.0000
5.0000
10.0000
11.0000
14.0000
45.0000
78.0000
90.0000
Also, the last statement I executed above gives us:
>> ind
ind =
0
1
0
0
0
0
0
0
0
0
0
0
The above means that the second location of the unique array is the value we want, and this corresponds to 1. Therefore, we can extract the particular value we want by:
val = Aun(ind);
val contains the value that is shared along all columns.
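As a cross-check on Step #1, the shared value can also be found by intersecting the columns one after another. This is a different technique from the bsxfun approach above, sketched here on the example matrix; like that approach, it relies on exact equality, so values that differ by floating-point rounding would not match.

```matlab
A = [1 11 0.17
     2 1 78
     3 4 90
     45 5 14
     10 10 1];

% Start with the values of column 1 and keep only those that also
% appear in each of the remaining columns.
common = A(:, 1);
for k = 2:size(A, 2)
    common = intersect(common, A(:, k));
end
```

For this matrix the first intersection leaves 1 and 10, and the third column then removes 10, so common ends up holding the single value 1.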
Step #2 - Given the value B, take a vector C and divide by B.
That's pretty straight forward. Make sure that C is the same size as the total number of rows as A, so:
C = [1 3 5 7 9].';
B = val;
col = C / B;
Step #3 - For each location in A that shares the common value, we want to generate a new fifth column that gives a new value for each corresponding row.
You can do that by declaring a vector of... say... zeroes, then find the right rows that share the common value and replace the values in this fifth column with the values you want:
zer = zeros(size(A,1), 1);
D = [23; 56; 78];
ind2 = any(A == val, 2);
zer(ind2) = D;
%// Create final matrix
fin = [A col zer];
We finally get:
>> fin
fin =
1.0000 11.0000 0.1700 1.0000 23.0000
2.0000 1.0000 78.0000 3.0000 56.0000
3.0000 4.0000 90.0000 5.0000 0
45.0000 5.0000 14.0000 7.0000 0
10.0000 10.0000 1.0000 9.0000 78.0000
Take note that the vector you assign into the fifth column must have as many elements as there are rows containing the shared value (nnz(ind2), which is 3 here).
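The question defines the new values as a mapping from column 4 (1 to 23, 3 to 56, 9 to 78) rather than by position. A minimal sketch of Step #3 that uses this mapping directly is given below; it assumes val and col from the earlier steps, and the names keys, vals, found, loc and sel are introduced here for illustration.

```matlab
A   = [1 11 0.17; 2 1 78; 3 4 90; 45 5 14; 10 10 1];
val = 1;                 % shared value found in Step #1
col = [1; 3; 5; 7; 9];   % column 4 from Step #2 (C / B with B = 1)

keys = [1; 3; 9];        % lookup from the question: 1 -> 23, 3 -> 56, 9 -> 78
vals = [23; 56; 78];

ind2 = any(A == val, 2);             % rows that contain the shared value
[found, loc] = ismember(col, keys);  % which column-4 entries have a mapping
sel = ind2 & found;                  % rows that satisfy both conditions

zer = zeros(size(A, 1), 1);
zer(sel) = vals(loc(sel));           % look up the mapped value per row

fin = [A col zer];
```

With the lookup, the assignment no longer depends on D having exactly nnz(ind2) elements in the right order; rows without a mapping simply stay 0.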