How to obtain rows of an incomplete Matlab table? - matlab

After getting answers to this question, I realized that I do have a problem with importing data into Matlab, but it has nothing to do with NaNs. The cause is the different data types stored in the table.
In the same example I used in the other question, importing an Excel table using
measurementTable = readtable('MWE.xlsx','ReadVariableNames',false,'ReadRowNames',true);
leads to the Matlab table
As you can see, the values in columns 1 to 4 are of type cell, while the values in column 5 are of type double. If I now try to obtain a single row of the table by using
measurementTable{'DATE',:}
I get the error message:
Cannot concatenate the table variables 'Var5' and 'Var1', because their types are double and cell.
How can I tackle this problem?

As you worked out, the command fails because Matlab tries to concatenate the cell and double values into a single array.
Since the row holds multiple data types, you need to store it in a cell array.
You can obtain a single row of mixed data with:
table2cell(measurementTable('DATE',:))
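A minimal sketch of the difference between the two indexing styles, using a small hypothetical table in place of MWE.xlsx (the variable names and values below are made up for illustration and are not taken from the original spreadsheet):

```matlab
% Hypothetical stand-in for the imported spreadsheet: Var1 is a cell
% column of text, Var2 is a numeric (double) column.
Var1 = {'01.01.2017'; '12:00'};
Var2 = [NaN; 3.5];
measurementTable = table(Var1, Var2, 'RowNames', {'DATE', 'TIME'});

% Parenthesis indexing keeps the result a table, so mixed types are fine.
rowAsTable = measurementTable('DATE', :);

% table2cell converts that one-row table into a 1-by-2 cell array,
% with each element keeping its original type.
rowAsCell = table2cell(rowAsTable);

% Individual values are then read with brace indexing on the cell array.
dateText = rowAsCell{1};   % text from the cell column
numValue = rowAsCell{2};   % value from the double column
```

Brace indexing on the table itself, as in measurementTable{'DATE',:}, would still raise the concatenation error for this table, because it has to return one homogeneous array.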

Related

How to splay a column of type dictionary

I have a column in my table whose values are dictionaries. The type in the meta of that column is " ".
I want to know how to splay this table. When I try to splay it, I get a type error. I am aware only vectors can be splayed, however, I have seen a table where a column holds dictionaries splayed before, so I know it's possible, but I am not sure how it is done.
Dictionaries are only supported in kdb version 3.6 and later.
If you are running 3.6/4.0, double-check that you are enumerating the table before splaying it.
`:path/to/table/ set .Q.en[`:hdb;table]
On versions before 3.6, storing the dictionaries as JSON strings is a good alternative, although it is not recommended for large tables because .j.k is slow.

How to extract specific value in a dictionary column with multiple lists

I'm trying to extract a specific value from a column in a dataframe, as you can see in the next image, without any success. The answers to a similar question still didn't work for my code.
Is there any way to extract the values as [Culture, Climate change, technology, ...]?
Data
First Try
I have tried the split() function, but I reached a dead end: I still need the exact value after the word "name", and the new dataframe contains 75 columns. My latest idea is a for loop that extracts the value after the word "name".

Compare 2 spark data frame cell by cell in scala

I'm comparing the data ingested into a Hive table with the data at the source and storing the differences in MariaDB. The tables have no primary keys, and I would like an optimised solution. Although I've used the except method to find the differing rows, I'm finding it difficult to print out which columns differ within the same row.
As far as I can tell, it's not possible to solve your problem in the absence of a primary key. Without one, each row of one DataFrame is potentially different from every row of the other DataFrame, and in practice you wouldn't want to report the differences against each row of the other DataFrame.

How to properly remove NaN values from table

After reading an Excel spreadsheet in Matlab I unfortunately have NaNs included in my resulting table. So for example this Excel table:
would result in this table:
where an additional column of NaNs occurs. I tried to remove the NaNs with the following code snippet:
measurementCells = readtable('MWE.xlsx','ReadVariableNames',false,'ReadRowNames',true);
measurementCells = measurementCells(any(isstruct(measurementCells('TIME',1)),1),:);
However this results in a 0x6 table, without any values present anymore. How can I properly remove the NaNs without removing any data from the table?
Either this:
tab = tab(~any(ismissing(tab),2),:);
or:
tab = rmmissing(tab);
if you want to remove rows that contain one or more missing values.
If you want instead to replace missing values with other values, read about how the fillmissing (https://mathworks.com/help/matlab/ref/fillmissing.html) and standardizeMissing (https://mathworks.com/help/matlab/ref/standardizemissing.html) functions work. The examples are exhaustive and should help you find the solution that best fits your needs.
A last option is to spot NaN values (and handle them in whatever way you prefer) within the call to the readtable function, using the EmptyValue parameter. This works only for numeric data.
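As a rough sketch of why row-wise removal can empty the table, and of one possible alternative for the case described in the question (an extra column consisting only of NaNs), consider the hypothetical table below. The variable names A, B and C are illustrative assumptions, not the contents of MWE.xlsx:

```matlab
% Hypothetical table with one column that is entirely NaN.
A = [1; 2; 3];
B = [NaN; NaN; NaN];
C = {'x'; 'y'; 'z'};
tab = table(A, B, C);

% ismissing returns a logical array with the same size as the table.
missingMask = ismissing(tab);

% Keep only the columns that are not missing in every row.
tabNoEmptyCols = tab(:, ~all(missingMask, 1));

% For comparison: every row contains a NaN, so removing rows with
% missing values leaves no rows at all.
tabNoMissingRows = rmmissing(tab);
```

Here tabNoEmptyCols keeps A and C with all three rows, while tabNoMissingRows has zero rows, which is similar to the empty result reported in the question.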

Aggregate/sum function of a table in Matlab

In matlab I have read in a table from a csv file, then moved two columns I am interested in into a new table. These columns are "ID" (of a person, 1-400) and then another ID to represent their occupation (1-12).
What I want to do is create a simple table with 12 records and 2 columns: one record per job, holding the number of user IDs that have this job, so that the table can easily be shown as a bar chart. At the moment I have 400 user records, each with a user ID and one of the 12 possible job IDs.
So much like an SQL aggregate/sum function, but I want to do it in Matlab, with a table object. The problem I am having is finding how to do this without using a cell array or something similar.
Thanks!
I know that you found an answer yourself, but I would like to mention the histc function, which avoids the loop (and is faster for larger matrices):
JobCounts = histc(OccupationTable(:,2), 1:NumberOfJobs);
Combining this with the job number gives the desired result:
result = [(1:NumberOfJobs)' JobCounts];
Never mind, solved it. I just looped through the job numbers and ran "sum" where the ID was equal to the one I wanted:
for i = 1:NumberOfJobs
    JobCounts(i,:) = sum(OccupationTable(:,2) == i);
end
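Both snippets above index with OccupationTable(:,2), which yields a numeric column only if OccupationTable is a matrix. For a table object, as described in the question, one possible approach is sketched below using accumarray. The column name JobID and the sample data are assumptions made for illustration. Note also that current MATLAB documentation lists histc as not recommended.

```matlab
% Hypothetical data: 6 people, job IDs in the range 1..4.
ID = (1:6)';
JobID = [1; 3; 3; 2; 3; 1];
OccupationTable = table(ID, JobID);
NumberOfJobs = 4;

% Dot indexing extracts the column as a numeric vector
% (parenthesis indexing on a table would return another table).
jobs = OccupationTable.JobID;

% accumarray adds 1 to the bin of each job ID, giving counts per job,
% including zeros for jobs that nobody holds.
JobCounts = accumarray(jobs, 1, [NumberOfJobs 1]);

% Collect the result in a table with one record per job.
result = table((1:NumberOfJobs)', JobCounts, ...
    'VariableNames', {'JobID', 'Count'});
```

The Count column can then be plotted with bar(result.Count).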