How to create a table with NaNs in Matlab? - matlab

I am trying to create a table that is 10 x 5 with only NaNs. I start by creating an array with NaNs:
N = NaN(10, 5);
then I try converting it to a table:
T = table(N);
It puts all cells into one column, but I need the table to be 5 columns with one NaN in each cell. Does anyone know how to do that?

array2table works just fine. table(N) stores the whole matrix as a single variable, whereas array2table takes a matrix and converts it to a table in which each column of the matrix becomes a separate variable (column) of the output table:
>> N = NaN(10, 5);
>> T = array2table(N)
T =
N1 N2 N3 N4 N5
___ ___ ___ ___ ___
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN

What you want is:
t = array2table(NaN(10,5))
Bonus (so our answers are slightly different :P) You can rename the variables to anything you want with something like:
t.Properties.VariableNames = {'x1','x2','x3','x4','x5'};

How to get utility Matrix from initial dataset?

While applying Alternating Least Squares, I found that I need a utility matrix.
I'm working with the 20 million MovieLens dataset, which contains a ratings file (userId, movieId, rating).
I know the utility matrix is M x N, where M is the number of users and N is the number of movies.
My question: how do I build the utility matrix from the ratings file?
Since the 20M dataset didn't fit in my computer's memory during the pivot call, I am showing the process for the 1M dataset.
import re
import os
import zipfile
import numpy as np
import pandas as pd
from sklearn import preprocessing
from urllib.request import urlretrieve
# Creating required folders, if they don't exist
def create_dir(dirname):
    if os.path.exists(dirname):
        print(f"Directory {dirname} already exists.")
    else:
        os.mkdir(dirname)
create_dir('Datasets')
print("Downloading movielens data...")
urlretrieve("http://files.grouplens.org/datasets/movielens/ml-1m.zip", "movielens.zip")
zip_ref = zipfile.ZipFile('movielens.zip', "r")
zip_ref.extractall()
print("Extraction done")
# Loading ratings dataset and renamed extracted folder
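# A multi-character separator needs the Python parsing engine (avoids a ParserWarning)
ratings = pd.read_csv('ml-1m/ratings.dat', sep='::', engine='python', names=['userId', 'movieId', 'rating', 'timestamp'])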
ratings = ratings.drop(columns=['timestamp'])
ratings.to_csv('Datasets/ratings.csv', index=False)
print(ratings.shape)
pivot_table = ratings.pivot_table(index=['userId'], columns=['movieId'], values='rating')
pivot_table.to_csv('Datasets/user_vs_movies.csv')  # keep the userId index in the CSV
pivot_table.head()
Output:
Downloading movielens data...
Extraction done
(1000209, 3)
movieId 1 2 3 4 5 6 7 8 9 10 ... 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952
userId
1 5.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN 2.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
5 rows × 3706 columns
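
If the pivot call is the bottleneck, the same matrix can also be filled directly from the (userId, movieId, rating) triples. The following is only a minimal sketch with NumPy, not the approach used above: the function name build_utility_matrix and the toy ratings are made up for illustration, and unrated cells are left as NaN just like in the pivot output.

```python
import numpy as np

def build_utility_matrix(user_ids, movie_ids, ratings):
    """Return (matrix, users, movies); cells without a rating are NaN."""
    # return_inverse maps every raw id to a dense 0-based row/column index
    users, row_idx = np.unique(user_ids, return_inverse=True)
    movies, col_idx = np.unique(movie_ids, return_inverse=True)
    matrix = np.full((users.size, movies.size), np.nan)
    matrix[row_idx, col_idx] = ratings
    return matrix, users, movies

# Toy ratings (hypothetical values, not taken from MovieLens)
user_ids = np.array([1, 1, 2, 5])
movie_ids = np.array([10, 30, 10, 20])
ratings = np.array([5.0, 3.0, 4.0, 2.0])

matrix, users, movies = build_utility_matrix(user_ids, movie_ids, ratings)
print(users)   # row labels (sorted unique user ids)
print(movies)  # column labels (sorted unique movie ids)
print(matrix)
```

Note that if a (user, movie) pair occurs more than once, the last rating wins here, whereas pivot_table averages duplicates by default. A dense array is unlikely to fit in memory for the 20M dataset either; the same row and column indices could instead be passed to a sparse format such as scipy.sparse.coo_matrix.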

ode45 outputs are all NaN

I'm trying to solve this simple ODE system:
dydpdt = 1*(-f{2}-f{1}*dydp);
This is stored in a function called funsensitivity (f is a cell array containing a 765x765 sparse matrix and a 765x1 vector, downloadable as a .MAT file here). I call it using:
dydp0 = zeros(size(f{2}));
[t2,JJ]=ode45(@(t,y)funsensitivity(t,y,f),0:4000:100000,dydp0);
JJ is the right size, I get no errors, but all values in JJ are NaN. I have no idea why this could happen. What am I doing wrong?
You are using {} to index, which is for cell arrays. You should be using ():
dydpdt = 1*(-f(2)-f(1)*dydp);
and also
dydp0 = zeros(size(f(2)));
[t2,JJ]=ode45(@(t,y)funsensitivity(t,y,f),0:tstep:tfinal,dydp0);
EDIT based on comments:
I have tried running your code in Octave (don't have MATLAB), and it looks like your problem is unstable by looking at the solution:
JJ =
Columns 1 through 10:
0.0000e+000 0.0000e+000 0.0000e+000 0.0000e+000 0.0000e+000 0.0000e+000 0.0000e+000 0.0000e+000 0.0000e+000 0.0000e+000
5.1645e+001 1.0181e+004 1.2727e+003 -1.1492e+004 -1.2900e+001 -7.2862e+001 7.2502e+001 7.7228e-003 1.1269e-002 1.4324e-003
1.5631e+031 1.3173e+033 1.6466e+032 -1.4936e+033 -3.7860e+030 -2.2204e+031 2.2012e+031 7.4674e+026 1.0897e+027 1.3851e+026
7.0857e+060 3.7159e+062 4.6449e+061 -4.2334e+062 -1.6685e+060 -1.0125e+061 1.0005e+061 1.9564e+056 2.8548e+056 3.6287e+055
3.4854e+090 1.3800e+092 1.7250e+091 -1.5787e+092 -7.9162e+089 -5.0174e+090 4.9378e+090 7.0373e+085 1.0269e+086 1.3053e+085
-5.3460e+120 5.8857e+121 7.3572e+120 -6.2253e+121 1.3755e+120 7.4917e+120 -7.4825e+120 3.1352e+115 4.5748e+115 5.8151e+114
-2.3670e+152 2.7260e+151 3.4075e+150 1.4565e+152 5.7833e+151 3.3559e+152 -3.3303e+152 9.3359e+145 1.3623e+146 1.7316e+145
-7.6120e+183 1.3332e+181 1.6682e+180 5.6622e+183 1.8358e+183 1.0823e+184 -1.0724e+184 3.3770e+177 4.9277e+177 6.2636e+176
-2.4360e+215 6.7970e+210 9.2304e+209 1.8189e+215 5.8022e+214 3.4726e+215 -3.4359e+215 1.4223e+209 2.0755e+209 2.6381e+208
-7.7953e+246 4.4692e+240 3.6568e+240 5.8278e+246 1.8337e+246 1.1142e+247 -1.1008e+247 6.0021e+240 8.7582e+240 1.1133e+240
-2.4947e+278 4.0992e+271 1.3587e+272 1.8673e+278 5.7955e+277 3.5749e+278 -3.5270e+278 2.5328e+272 3.6959e+272 4.6979e+271
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
You might want to try a stiff solver such as ode15s to see if it improves things, or using a smaller time step in your time vector, but it looks like the problem is fundamentally wrong (numerically at least).

Matlab contourf plots interpolation

Hi, if I have data like this, e.g.:
x=[1:1:7];
y=[5:-1:1]';
z=[NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN
0.955113030084974 0.948571658876062 0.942624899410361 NaN NaN NaN NaN
0.937493758208870 0.928392864395896 0.920119550965773 0.910466888808695 0.901586502842837 0.892741292179595 NaN
0.879644551679863 0.862126561405869 0.846200299426160 0.827622958701087 0.810531605135333 0.793507569055583 0.775604152867929
];
I'd like to generate a contourf plot (i.e. contourf(x,y,z);) that gets rid of the steps, i.e. the result should be a smooth curve at the border.
You could use imagesc instead, but the reason that there are such harsh steps is that you don't have enough data points. To change that, one option is to interpolate more data points in between what you have.
x=[1:1:7];
y=[5:-1:1]';
z=[NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN
0.955113030084974 0.948571658876062 0.942624899410361 NaN NaN NaN NaN
0.937493758208870 0.928392864395896 0.920119550965773 0.910466888808695 0.901586502842837 0.892741292179595 NaN
0.879644551679863 0.862126561405869 0.846200299426160 0.827622958701087 0.810531605135333 0.793507569055583 0.775604152867929];
xn = 1:.01:7;
yn = [5:-.01:1]';
zn = interp2(x,y,z,xn,yn);
imagesc(xn,yn,zn);

Make a vector equal to another by filling 'NaN' without interpolation

I have a time stamp vector as follows.
Time =
243.0000
243.0069
243.0139
243.0208
243.0278
243.0347
243.0417
243.0486
243.0556
243.0625
243.0694
243.0764
243.0833
243.0903
243.0972
243.1042
243.1111
243.1181
243.1250
243.1319
243.1389
243.1458
243.1528
243.1597
243.1667
243.1736
243.1806
243.1875
243.1944
Now I have another two-column matrix.
ab =
243.0300 0.5814
243.0717 0.6405
243.1134 0.6000
243.1550 0.5848
243.1967 0.5869
First column is 'Time2' and second column is 'Conc'.
Time2 = ab(:,1);
Conc = ab(:,2);
Now I want to align 'Conc' with 'Time' based on 'Time2', filling the gaps only with 'NaN'. Note that the values in 'Time2' do not exactly match those in 'Time'. I can use something like the following:
Conc_interpolated = interp1(Time2,Conc,Time)
but that interpolates, which produces artificial data. I only want to match the vector length by filling 'Conc' with 'NaN', not with interpolated data. Any recommendations? Thanks
I'll try to guess what you want:
you have time vector A:
TimeA = ...
[243.0000;
243.0069;
...
243.1875;
243.1944];
and probably some data A:
DataA = rand(length(TimeA),1);
now you want to implement your second time vector B:
TimeB = ...
[243.0300;
243.0717;
243.1134;
243.1550;
243.1967];
and the corresponding data:
DataB = ...
[0.5814;
0.6405;
0.6000;
0.5848;
0.5869];
finally everything should be merged together and sorted:
X = [ TimeA, DataA , NaN(size(DataA)) ;
TimeB, NaN(size(DataB)) , DataB ]
Y = sortrows(X,1);
This results in:
Y =
243.0000 0.8852 NaN
243.0069 0.9133 NaN
243.0139 0.7962 NaN
243.0208 0.0987 NaN
243.0278 0.2619 NaN
243.0300 NaN 0.5814
243.0347 0.3354 NaN
243.0417 0.6797 NaN
243.0486 0.1366 NaN
243.0556 0.7212 NaN
243.0625 0.1068 NaN
243.0694 0.6538 NaN
243.0717 NaN 0.6405
243.0764 0.4942 NaN
243.0833 0.7791 NaN
243.0903 0.7150 NaN
243.0972 0.9037 NaN
243.1042 0.8909 NaN
243.1111 0.3342 NaN
243.1134 NaN 0.6000
243.1181 0.6987 NaN
243.1250 0.1978 NaN
243.1319 0.0305 NaN
243.1389 0.7441 NaN
243.1458 0.5000 NaN
243.1528 0.4799 NaN
243.1550 NaN 0.5848
243.1597 0.9047 NaN
243.1667 0.6099 NaN
243.1736 0.6177 NaN
243.1806 0.8594 NaN
243.1875 0.8055 NaN
243.1944 0.5767 NaN
243.1967 NaN 0.5869
is that right?
My understanding is a little different: this doesn't add rows to Time but rather assigns each Conc to the nearest Time based on its Time2:
ind = zeros(size(ab,1),1); %//preallocate memory
for ii = 1:size(ab,1)
[~, ind(ii)] = min(abs(ab(ii,1)-Time)); %//Based on this FEX entry: http://www.mathworks.com/matlabcentral/fileexchange/30029-findnearest-algorithm/content/findNearest.m
end
Time(:,2) = NaN; %// Prefill with NaN
Time(ind, 2) = ab(:,2)
This results in:
Time =
243.00000 NaN
243.00690 NaN
243.01390 NaN
243.02080 NaN
243.02780 0.58140
243.03470 NaN
243.04170 NaN
243.04860 NaN
243.05560 NaN
243.06250 NaN
243.06940 0.64050
243.07640 NaN
243.08330 NaN
243.09030 NaN
243.09720 NaN
243.10420 NaN
243.11110 0.60000
243.11810 NaN
243.12500 NaN
243.13190 NaN
243.13890 NaN
243.14580 NaN
243.15280 0.58480
243.15970 NaN
243.16670 NaN
243.17360 NaN
243.18060 NaN
243.18750 NaN
243.19440 0.58690
for your example inputs

How to extract valid value from a sparse vector in Matlab?

I have lots of vectors like the one below: very sparse, with lots of 'NaN' entries. What I intend to do is extract the valid numbers from such a vector and put them into a separate vector with no 'NaN' values.
Every vector has its valid numbers at different positions, so I can't put them into a matrix and then extract rows.
Please help me with this!
10459865
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
8751943
NaN
NaN
NaN
NaN
NaN
NaN
6951680
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
5991217
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
5327653
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
4740048
NaN
NaN
4265221
NaN
NaN
3973280
Assuming that vector is stored in variable a,
a(isfinite(a))
will extract just the valid (finite) entries.
You can use the isnan() function to find out whether an entry is NaN. Then something like
x = [10459865; NaN; NaN; 8751943]; % example vector of values
new_x = x(~isnan(x));
new_x is a vector with only the valid numbers.