I have the following table ('ABC'):
Goal: For each date, create 2 new variables (e.g. 'LSum' and 'USum'). 'LSum' should calculate the sum of all cell values across the column universe (4-281), but only with those values whose header is in the cell array of ABC.L, for that specific date. In the same fashion, 'USum' should calculate the sum of all cell values across the columns, but only with those values whose header is in the cell array of ABC.U, for that specific date.
% load content
load ('ABC.mat');
% run through every date, starting from the top
for row=1:size(ABC,1);
% for-loop for 'L' that determines for what specific cells (of col. 4-281) the following calculation has to be done: how?
% for-loop for 'U' that determines for what specific cells (of col. 4-281) the following calculation has to be done: how?
% now generate new variables
LSum = sum(); % But how can I use if clause here to select only eligible cells that enter into the sum calculation?
USum = sum(); % Same problem here as LSum
end;
% Concatenate table ABC and the newly formed variables into 1 table
ABC = [ABC(:,1:3) LSum USum ABC(:,3+1:end)];
Thanks for your help, especially for the looping through date and the cell arrays of 'L' and 'U' at the same time.
One way to avoid loops is to use the rowfun() function. Suppose we have a table similar to yours.
Augment the table with a column that holds the table's variable names.
tb = horzcat(tb,repmat({tb.Properties.VariableNames},1,height(tb))');
Define a function that is applied to every row.
function out = selectiveSum(varargin)
[~,~,sumColsId] = intersect(varargin{2},varargin{end});
out = sum(cell2mat(varargin(sumColsId)));
end
Run rowfun().
sums = rowfun(@selectiveSum,tb);
The function handles just one selector array (the L column). Add the second one and have selectiveSum() return a 1x2 vector.
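For comparison, a plain loop version may be easier to adapt. The following is only a minimal sketch: it assumes that L and U hold, for each row, a cell array of column names and that the value columns start at column 4. The small table and its column names A, B and C are made up for illustration and stand in for the real ABC.

```matlab
% Hypothetical stand-in for ABC: columns 1-3 are Date, L, U; values start at column 4.
Date = datetime(2020,1,(1:2)');
L = {{'A','B'}; {'C'}};          % per-row cell array of column names to sum for LSum
U = {{'C'}; {'A','B','C'}};      % per-row cell array of column names to sum for USum
A = [1;4]; B = [2;5]; C = [3;6];
ABC = table(Date, L, U, A, B, C);

valueNames = ABC.Properties.VariableNames(4:end);  % headers of the value columns
values = ABC{:, valueNames};                       % numeric matrix of the value columns

LSum = zeros(height(ABC),1);
USum = zeros(height(ABC),1);
for r = 1:height(ABC)
    % ismember marks the value columns whose header appears in this row's list
    LSum(r) = sum(values(r, ismember(valueNames, ABC.L{r})));
    USum(r) = sum(values(r, ismember(valueNames, ABC.U{r})));
end

% Insert the two new variables after the first three columns
ABC = [ABC(:,1:3) table(LSum, USum) ABC(:,4:end)];
```

With the real table, only the construction of the stand-in data would be dropped; the loop itself does not depend on the number of value columns.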
I have a table with 2 columns (x and y value) and 100 rows (the x values repeat in a certain interval).
Therefore I would like to perform the following tasks for tables of varying size (only the number of rows changes):
I want to determine the number of repetitions of the x values and save it in a variable named n. Here the number of repetitions is 5 (each x value occurs 5 times in total).
I want to know the range of the x values within one repetition cycle and save this information as R = height(range); here the x range is [0,20].
With the above information I would like to create smaller tables in which only one repetition of the x values is present.
How could I implement this in matlab?
Stay safe and healthy,
Greta
This approach converts the Table to an array/matrix using the table2array() function for further processing. To find the repeated pattern in the x-values the unique() function is used to retrieve the vector that is repeated multiple times. The range of the values can be calculated by using the min() and max() functions and concatenating the values in a 2 element array. The assignin() function can then be used to create a set of smaller tables that separate the y-values according to the x-value repetitions.
Table Used to Test Script:
x = repmat((1:20).',[5 1]);
y = rand(100,1);
Table = array2table([x y]);
Script:
Array = table2array(Table);
Unique_X_Values = unique(Array(:,1));
Number_Of_Repetitions = length(Array)/length(Unique_X_Values);
Range = [min(Array(:,1)) max(Array(:,1))];
Y_Reshaped = reshape(Array(:,2),[numel(Array(:,2))/Number_Of_Repetitions Number_Of_Repetitions]);
for Column_Index = 1: Number_Of_Repetitions
Variable_Name = ['Small_Tables_' num2str(Column_Index)];
assignin('base',Variable_Name,array2table(Y_Reshaped(:,Column_Index)));
eval(Variable_Name);
end
fprintf("Number of repetitions: %d\n",Number_Of_Repetitions);
fprintf("Range: [%d,%d]\n",Range(1),Range(2));
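As an alternative to creating numbered variables with assignin(), the smaller tables could be collected in a cell array. This is only a sketch under the assumption that the x values repeat in contiguous blocks of equal length; the names smallTables and blockId are chosen here for illustration. Unlike the script above, each small table keeps both the x and the y column.

```matlab
% Test table as above, with named columns
x = repmat((1:20).',[5 1]);
y = rand(100,1);
Table = array2table([x y], 'VariableNames', {'x','y'});

nUnique = numel(unique(Table.x));       % length of one repetition cycle
n = height(Table) / nUnique;            % number of repetitions
Range = [min(Table.x) max(Table.x)];    % range of the x values

% Label every row with the repetition (block) it belongs to
blockId = repelem((1:n).', nUnique);

% One small table per repetition, stored in a cell array
smallTables = cell(n,1);
for k = 1:n
    smallTables{k} = Table(blockId == k, :);
end
```

The k-th repetition is then available as smallTables{k}, which avoids eval() and works for any number of rows that is a multiple of the cycle length.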
I have a cell array of size 360*1 where each element is a 330*3 timetable. The column names of every timetable are 'hour', 'volume' and 'price'. For each timetable, I want to delete the volume observations that are duplicated (and, of course, the corresponding price and hour as well). How can I do it? Unfortunately, the function 'unique' works when I have a vector, but not for a cell array.
Thanks in advance!
Here I provide a sample code of one timetable,
Date = datetime({'2015-12-18 08:03:05';'2015-12-18 10:03:17';'2015-12-18 12:03:13';'2015-12-18 12:04:13';'2015-12-18 12:05:13'});
Hour = [1;1;1;1;1];
Volume = [152;152;300;400;500];
Price = [13.4;6.5;7.3;10;11];
TT = timetable(Date,Hour,Volume,Price)
The objective would be to get rid of the two 152 volume observations, and this for all the timetables contained on the cell array.
This is pretty much just a question on how to delete elements from a table.
Here is your MVE:
dts = [datetime('yesterday')
datetime('today')
datetime('now')
datetime('tomorrow')];
T = timetable(dts,rand(length(dts),1),rand(length(dts),1),'VariableNames',{'price','volume'});
T.volume(4) = T.volume(2);
Note that the 4th entry of volume is the same as the second entry. Further I have assumed that volume is a vector (sounded reasonable)...
% find unique entries of the vector T.volume
[~, idx] = unique(T.volume);
% delete other rows / better: keep unique rows of the table
T = T(idx,:);
If you are dealing with a cell array of many tables, just loop over it. Assuming your 360x1 cell array is called C:
for i = 1:length(C)
% get table from cell
T = C{i};
% do the stuff above
%...
% assign cropped table back to the cell
C{i} = T;
end
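Filling in the loop above with the sample timetable from the question might look like the following sketch. Two readings of "duplicated" are shown, since the question can be understood either way: keeping the first occurrence of each volume, or dropping every row whose volume occurs more than once. The 'stable' option is an addition here; it keeps the original row order, whereas plain unique() returns the rows sorted by volume. The cell array C is a two-element stand-in for the real 360x1 one.

```matlab
Date = datetime({'2015-12-18 08:03:05';'2015-12-18 10:03:17';'2015-12-18 12:03:13'; ...
                 '2015-12-18 12:04:13';'2015-12-18 12:05:13'}, ...
                'InputFormat','yyyy-MM-dd HH:mm:ss');
Hour = [1;1;1;1;1];
Volume = [152;152;300;400;500];
Price = [13.4;6.5;7.3;10;11];
TT = timetable(Date,Hour,Volume,Price);
C = {TT; TT};                  % stand-in for the 360x1 cell array

keepFirst = cell(size(C));     % variant 1: keep the first row of each volume
dropAll   = cell(size(C));     % variant 2: drop all rows with a repeated volume
for i = 1:numel(C)
    T = C{i};

    % Variant 1: index of the first occurrence of each volume, in original order
    [~, idx] = unique(T.Volume, 'stable');
    keepFirst{i} = T(idx, :);

    % Variant 2: count how often each volume occurs and keep the singletons
    [~, ~, grp] = unique(T.Volume);
    counts = accumarray(grp, 1);
    dropAll{i} = T(counts(grp) == 1, :);
end
```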
I'm working in MATLAB and trying to assign a matrix to one cell of a cell array, but something always goes wrong. Here is the code:
C = {};
myMatrix = xlsread('myexcelfile');
C{'ID', 'info'} = myMatrix;
Then matlab prompted that
"Expected one output from a curly brace or dot indexing expression, but there were 12 results."
But if I don't use 'ID' and 'Info' but use '1' and '2' instead, the matrix could be assigned successfully.
Could anyone help me? Thanks!
Assume we have three persons, each with a name and an ID number, and that the data belonging to each person is 2x3. I use a cell array to store the data and fill it with random numbers. (In your case you should use xlsread('myexcelfile') to fill this cell array.) Each ID number is prefixed with a string because a purely numeric string such as '38' is not a valid variable name for a table.
clc;clear all;close all;
% assuming we have got three persons in the dataset
cell_data=cell(3,3); % I use cell instead of matrix for storing data
ID_number=[38;48;58];% this vector contains the ID numbers of each person
for i=1:numel(ID_number);rng('shuffle');cell_data{i,i}=rand(2,3);end % using random number as dataset
ID=strcat('ID ',string(ID_number));%'38' is not a valid variable name so concat number with 'ID ' string
Customer = string({'Jones';'Brown';'Smith'});
Customer = cellstr(Customer);
T = table('RowNames',Customer);
for i=1:numel(ID_number)
T.(char(ID(i)))=cell_data(:,i);
end
%
After creating our table we can get input as follows:
input_cell = inputdlg({'Name','ID number'});% 2x1 cell
ID_input=strcat('ID ',input_cell{2,1});
T( {input_cell{1,1}} , {ID_input} )
If the inputs match the names used in the table, we get output like this:
  table
               ID48
           ____________
  Brown    [2×3 double]
You can add some checks to the script for the cases where the inputs do not match the table's row or variable names.
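As background to the error in the question: a character vector used as a subscript is interpreted as its character codes, so cell arrays cannot be indexed by name. If a table is more than is needed, a containers.Map keyed by text is another option. The following is only a sketch; the key 'ID_info' is a made-up name and magic(3) stands in for the matrix returned by xlsread.

```matlab
% Characters used as subscripts are converted to their numeric codes
codes = double('ID');            % the two subscripts 'ID' would turn into

% Stand-in for myMatrix = xlsread('myexcelfile');
myMatrix = magic(3);

% A map accepts text keys and can store a whole matrix per key
M = containers.Map('KeyType','char','ValueType','any');
M('ID_info') = myMatrix;         % hypothetical key combining both labels

out = M('ID_info');              % retrieve the matrix by name
```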
I want to generate a 3D cell array called timeData so that timeData(:,:,a) for some integer a is an nx1 matrix of data, and the number of rows n varies with the value of a in a 1:1 correspondence. To do this, I am generating a 2D array of data called data that is nx1. This assignment statement takes place within a for loop as follows:
% Before iterating, I define an array of indices where I want to store the
% data sets in timeData. This choice of storage location is for
% organizational purposes.
A = [2, 5, 9, 21, 34, 100]; % Notice they are in ascending order, but have
% gaps that have no predictability.
sizeA = size(A);
numIter = sizeA(2); % number of entries in A
for m = 1:numIter % numIter is the number of data sets that I need to store
% in timeData
% At this point, some code that is entirely irrelevant to my question
% generates a nx1 array of data. One example of this data array is below.
data = [1.1;2.3;5.5;4.4]; % This is one example of what data could be. Its
% number of rows, n, changes each iteration, as
% do its contents.
B = size(data);
timeData(1:B(1),1,A(m)) = num2cell(data);
end
This code does put all contents of data in the appropriate locations within timeData as I want. However, it also adds {0x0 double} rows to all 2D arrays of timeData(:,:,a) for any a whose corresponding number of rows n was not the largest number of rows. Thus, there are many of these 2D arrays that have 10 to a couple hundred 0-valued rows that I don't want. For values of a that did not have a corresponding data set, the content of timeData(:,:,a) is an nx1 array of {0x0 double}.
I need to iterate over the contents of timeData in subsequent code, and I need to be able to find the size of the data set that is in timeData(:,:,a) without somehow discounting all the {0x0 double}.
How can I modify my assignment statement to fix this?
Edit: Desired output of the above example is the following with n = 5. Let this data set be represented by a = 9.
timeData(:,:,9) = {[1.1]}
{[2.3]}
{[5.5]}
{[8.6]}
{[4.4]}
Now, consider the possibility that a previous or subsequent value of the A matrix had a data set with n = 7, and n = 7 is the largest data set (largest n value). timeData(:,:,9) outputs like so in my code:
timeData(:,:,9) = {[1.1]}
{[2.3]}
{[5.5]}
{[8.6]}
{[4.4]}
{[0x0 double]}
{[0x0 double]}
@Dev-iL, as I understand it, your answer gives me the ability to delete the cells that have {[0x0 double]} in them (this is what I mean by "discounting"). This is a good plan B, but is there a way to prevent the {[0x0 double]} cells from showing up in the first place?
Edit 2: Update to the above statement "your answer gives me the ability to delete the cells that have {[0x0 double]} in them (this is what I mean by "discounting")". The cellfun(@isempty... ) function makes the {[0x0 double]} cells go to {[0x0 cell]}; it does not remove them. In other words, size(timeData(:,:,9)) is the same before and after the command is performed. This is not what I want. I want size(timeData(:,:,9)) to be 5x1 no matter what n is for any other value of a.
Edit 3: I just realized that the most desired output would be the following:
timeData(:,:,9) = {[1.1;2.3;5.5;8.6;4.4]} % An n x 1 column matrix within
% the cell.
but I can work with this outcome or the outcome as described above.
Unfortunately, I don't understand the structure of your dataset, which is why I can't suggest a better assignment method. However, I'd like to point out an operation that can help you deal with your data after it's been created:
cellfun(@isempty,timeData);
What the above does is return a logical array the size of timeData, indicating which cells contain something "empty". Typically, an array of arbitrary datatype is considered "empty" when it has at least one dimension that is equal to 0.
How can you use it to your advantage?
%% Example 1: counting non-empty cells:
nData = sum(~cellfun(@isempty,timeData(:)));
%% Example 2: assigning empty cells in place of empty double arrays:
timeData(cellfun(@isempty,timeData)) = {{}};
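Regarding the layout described in Edit 3 of the question, one way to avoid the padding altogether might be to store each data set as a single column vector in one cell, instead of spreading it over n cells with num2cell(). The sketch below assumes that layout is acceptable; rand(m+2,1) is only a stand-in for the real data of varying length.

```matlab
A = [2, 5, 9, 21, 34, 100];         % storage indices from the question
timeData = cell(1, 1, max(A));      % one cell per index, empty by default

for m = 1:numel(A)
    data = rand(m+2, 1);            % stand-in: n x 1 data, n changes each iteration
    timeData{1,1,A(m)} = data;      % whole column vector goes into a single cell
end

% The size of each data set is independent of all the others
n9 = size(timeData{1,1,9}, 1);

% Indices without a data set simply hold an empty array
nStored = sum(~cellfun(@isempty, timeData(:)));
```

Because every data set lives in its own cell, a longer data set at another index no longer forces {0x0 double} rows into the shorter ones.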
I have an Excel sheet containing 1838 records and I need to RANDOMLY split these records into 3 Excel Sheets. I am trying to use Matlab but I am quite new to it and I have just managed the following code:
[xlsn, xlst, raw] = xlsread('data.xls');
numrows = 1838;
randindex = ceil(3*rand(numrows, 1));
raw1 = raw(:,randindex==1);
raw2 = raw(:,randindex==2);
raw3 = raw(:,randindex==3);
Your general procedure will be to read the spreadsheet into some matlab variables, operate on those matrices such that you end up with three thirds and then write each third back out.
So you've got the read covered with xlsread, which returns the numeric data and the text data in two separate arrays (xlsn and xlst in your code). I would suggest using the syntax
[~, ~, raw] = xlsread('data.xls');
In the xlsread help file (you can access this by typing doc xlsread into the command window) it says that the three output arguments hold the numeric cells, the text cells and the whole lot. This is because a matlab matrix can only hold one type of value and a spreadsheet will usually be expected to have text or numbers. The raw value will hold all of the values but in a 'cell array' instead, a different kind of matlab data type.
So then you will have a cell array called raw. From here you want to do the following things:
work out how many rows you have (I assume each record is a row) by using the size function and specifying the appropriate dimension (again check the help file to see how to do this)
create an index of random numbers between 1 and 3 inclusive, which you can use as a mask
randindex = ceil(3*rand(numrows, 1));
apply the mask to your cell array to extract the records matching each index
raw1 = raw(randindex==1,:); % mask the rows (records); do the same for the other two index values
write each cell back to a file
xlswrite('output1.xls', raw1);
You will probably have to fettle the arguments to get it to work the way you want but be sure to check the doc functionname page to get the syntax just right. Your main concern will be to get the indexing correct - matlab indexes row-first whereas spreadsheets tend to be column-first (e.g. cell A2 is column A and row 2, but matlab matrix element M(1,2) is the first row and the second column of matrix M, i.e. cell B1).
UPDATE: splitting the file evenly is surprisingly more trouble: because we're using random numbers for the index, it's not guaranteed to split evenly. So instead we can generate a vector of random floats and then pick out the lowest 33% of them to make index 1, the highest 33% to make index 3 and let the rest be 2.
randvec = rand(numrows, 1); % float between 0 and 1
pct33 = prctile(randvec,100/3); % value of 33rd percentile
pct67 = prctile(randvec,200/3); % value of 67th percentile
randindex = ones(numrows,1);
randindex(randvec>pct33) = 2;
randindex(randvec>pct67) = 3;
It probably still won't be absolutely even - 1838 isn't a multiple of 3. You can see how many members each group has this way
numel(find(randindex==1))
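Another way to get groups whose sizes differ by at most one is to shuffle the row numbers with randperm() and deal them out in turn. This is a swapped-in technique rather than the percentile approach above, and only a sketch: the cell array raw is filled with random numbers here as a stand-in for the output of xlsread, and groupId is a name chosen for illustration.

```matlab
numrows = 1838;
raw = num2cell(rand(numrows, 4));     % stand-in for the raw output of xlsread

% Shuffle the row numbers, then assign groups 1,2,3 in rotation
shuffled = randperm(numrows);
groupId = zeros(numrows, 1);
groupId(shuffled) = mod((1:numrows).', 3) + 1;

% Select the rows (records) of each group
raw1 = raw(groupId == 1, :);
raw2 = raw(groupId == 2, :);
raw3 = raw(groupId == 3, :);

% Group sizes, for checking the split
counts = [size(raw1,1) size(raw2,1) size(raw3,1)];

% Each part could then be written out as in the answer, e.g.
% xlswrite('output1.xls', raw1);
```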