Plotting group variable excluding some categories based on condition - matlab

I want to plot an histogram of a group variable. For this I can use categorical.
I am using an example, using summary(Group), to explain:
variable.group={'one','three','four','one','three','four','two','three','four','one','three','four','one','three','four','one','three','four','one','three','four','one','three','four','one','three','four','one','three','four'}
Group=categorical(variable.group)
summary(Group)
figure,histogram(Group),title('Summary Group')
I could also use tabulate(Group) to give this result:
Group_tabulated =
4×3 cell array
'four' [10] [33.3333]
'one' [ 9] [ 30]
'three' [10] [33.3333]
'two' [ 1] [ 3.3333]
Now, as we see in above plot, there is a group with very little occurrence; I want to exclude that category to focus on the 3 most important ones.
Now using condition on tabulate, I have managed to nearly done it. But I have an issue since the category I want to exclude is still showing... just at 0 now.
Group_tabulated = tabulate(Group)
idx_largest=cell2mat(Group_tabulated(:,2))>3
Group_to_display=Group_tabulated(idx_largest,1)
Learn1_1n_largest=Group(ismember(Group,Group_tabulated(idx_largest,1)))
summary(Learn1_1n_largest)
figure,histogram(Learn1_1n_largest),title('Summary Group largest only')
2 questions:
Can we make the solution with tabulate work, so it exclude that category data from Learn1_1n_largest?
Can I use a different method just using some conditions on either categorical or on histogram for this?
Thanks in advance!

I have found the solution to my first question. It is possible to use removecats in order to remove all unused categories. It is simple by just not specifying anything else that the categorical as such Learn1_1n_largest=removecats(Learn1_1n_largest). The full solution to my first question is:
variable.group={'one','three','four','one','three','four','two','three','four','one','three','four','one','three','four','one','three','four', ...
'one','three','four','one','three','four','one','three','four','one','three','four'}
Group=categorical(variable.group)
summary(Group)
figure,histogram(Group),title('Summary Group')
Group_tabulated = tabulate(Group)
idx_largest=cell2mat(Group_tabulated(:,2))>3
Group_to_display=Group_tabulated(idx_largest,1)
Learn1_1n_largest=Group(ismember(Group,Group_tabulated(idx_largest,1)))
figure,histogram(Learn1_1n_largest),title('Summary Group largest only')
Learn1_1n_largest=removecats(Learn1_1n_largest)
figure,histogram(Learn1_1n_largest),title('Summary Group largest only')
I would be glad if anyone has the solution of my second question, as I hope that there would be a simpler solution without having to use both categorical & tabulate.
Thanks!

Related

How to create a variable in Matlab where certain subjects are coded as 1?

I want to create a variable called 'flag_artifact' where certain subjects from my dataset (for whom I know have bad quality images) are coded as e.g., 1. My dataset is stored in a table T with a certain number of rows and 'subject' is the 1st column in the table.
I managed to do it by creating a for loop. Surely there is a more efficient way to do this by perhaps directly creating the variable and using less lines of code? Could anyone give some advice?
Thank you very much! Here is what I have:
flag_artifact = {'T_300'}; %flagging subject number 300 for example
for i = 1:size(T,1)
if isequal(table2cell(T(i, 1)), flag_artifact)
T(i, 1) = {'1'};
end
end
However, when creating the variable flag_artifact = {'T_300'}, I would like it to include more than one subject. I tried using flag_artifact = {'T_300'}; {'T_301'}, as well as flag_artifact = {'T_300', 'T_301'} but it doesn't work because these subject identifiers do not get replaced with 1s.

How to create a list of length x with identical elements?

I would like to create a list in q/kdb of variable length x which contains the same element e repeated. For example:
x:4;
e:`this;
expected_result:`this`this`this`this
As mentioned by all, # is the best solution in the singular case. If you wanted to duplicate multiple items into a larger single list, then where can achieve this nicely
q)`this`that where 4 2
`this`this`this`this`that`that
Take is what you're looking for:
https://code.kx.com/v2/ref/take/
q)x:4
q)e:`this
q)x#e
`this`this`this`this
You can do this using # https://code.kx.com/v2/ref/take/
q)n:4
q)vals:`this
q)n#vals
`this`this`this`this
Use '#'(take) function:
q) x:4
q) e:`this
q) x#e

kdb q - lookup in nested list

Is there a neat way of looking up the key of a dictionary by an atom value if that atom is inside a value list ?
Assumption: The value lists of the dictionary have each unique elements
Example:
d:`tech`fin!(`aapl`msft;`gs`jpm) / would like to get key `fin by looking up `jpm
d?`gs`jpm / returns `fin as expected
d?`jpm / this doesn't work unfortunately
$[`jpm in d`fin;`fin;`tech] / this is the only way I can come up with
The last option does not scale well with the number of keys
Thanks!
You can take advantage of how where operates with dictionaries, and use in :
where `jpm in/:d
,`fin
Note this will return a list, so you might need to do first on the output if you want to replicate what you have above.
Why are you making this difficult on yourself? Use a table!
q)t:([] c:`tech`tech`fin`fin; sym:`aapl`msfw`gs`jpm)
q)first exec c from t where sym=`jpm
You can of course do what you're asking:
first where `jpm in'd
but this doesn't extend well to vectors while the table-approach does!
q)exec c from t where sym in `jpm`gs
I think you can take advantage of the value & key keywords to find what you're after:
q)key[d]where any value[d]in `jpm
,`fin
Hope that helps!
Jemma
The answers you have received so far are excellent. Here's my contribution building on Ryan's answer:
{[val;dict]raze {where y in/:x}[dict]'[val]}[`msft`jpm`gs;d]
The main difference is that you can pass a list of values to be evaluated and the result will be a list of keys.
[`msft`jpm`gs;d]
Output:
`tech`fin`fin

How to display distinct list of column values from OData?

I have an OData model with a property column 'category'. In twenty rows are i. e. 3 different categories. Now I want to display a list of all different categories to use as filter for a table. How can I do that?
Thanks
I started to answer this earlier today but then didnt complete it as it may not be a full answer but this certainly is a good place to start...
Two options I guess: get a function import that just returns a set of the categories and push the problem to the server.
Or process on the client side by using reduce on the column in question.
The best way to do that is explained here.
So adapting that answer:
var categories = ["SAPUI5","OpenUI5","JavaScript","NodeJS","SAP HANA","JavaScript","SAPUI5"];
var uniq = categories.reduce(function (a,b) {
if (a.indexOf(b) < 0 ) a.push(b);
return a;
}, []);
console.log(uniq); // ["SAPUI5", "OpenUI5", "JavaScript", "NodeJS", "SAP HANA"]

combing colls together - MaxMSP

I work on a project with MaxMSP where I have multiple colls. I want to combine all the lists in there in one single coll. Is there a way to do that directly without unpacking and repacking everything?
In order to be more clear, let’s say I have two colls, with the first one being:
0, 2
1, 4
2, 4
….
99, 9
while the second one is:
100, 8
101, 4
…
199, 7
I would like the final coll to be one list from 0-199.
Please keep in mind I don’t want to unpack everything ( with uzi for instance) cause my lists are very long and I find that it is problematic for the cpu to use colls with such long lists.That’s why I broke my huge list into sublists/subcolls in the first place
Hope that’s clear enough.
If the two colls do not have overlapping indices, then you can just dump one into the other, like this:
----------begin_max5_patcher----------
524.3ocyU0tSiCCD72IOEQV7ybnZmFJ28pfPUNI6AlKwIxeTZEh28ydsCDNB
hzdGbTolTOd20yXOd6CoIjp98flj8irqxRRdHMIAg7.IwwIjN995VtFCizAZ
M+FfjGly.6MHdisaXDTZ6DxVvfYvhfCbS8sB4MaUPsIrhWxNeUdFsf5esFex
bPYW+bc5slwBQinhFbA6qt6aaFWwPXlCCPnxDxSEQaNzhnDhG3wzT+i7+R4p
AS1YziUvTV44W3+r1ozxUnrKNdYW9gKaIbuagdkpGTv.HalU1z26bl8cTpkk
GufK9eI35911LMT2ephtnbs+0l2ybu90hl81hNex241.hHd1usga3QgGUteB
qDoYQdDYLpqv3dJR2L+BNLQodjc7VajJzrqivgs5YSkMaprkjZwroVLI03Oc
0HtKv2AMac6etChsbiQIprlPKto6.PWEfa0zX5+i8L+TnzlS7dBEaLPC8GNN
OC8qkm4MLMKx0Pm21PWjugNuwg9A6bv8URqP9m+mJdX6weocR2aU0imPwyO+
cpHiZ.sQH4FQubRLtt+YOaItUzz.3zqFyRn4UsANtZVa8RYyKWo4YSwmFane
oXSwBXC6SiMaV.anmHaBlZ9vvNPoikDIhqa3c8J+vM43PgLLDqHQA6Diwisp
Hbkqimwc8xpBMc1e4EjPp8MfRZEw6UtU9wzeCz5RFED
-----------end_max5_patcher-----------
mzed's answer works, as stated if the lists have no overlapping indices which they shouldn't based on the design you specify.
If you are treating your 'huge list' as multiple lists, or vice versa, that might help come up with an answer. One question some may ask is "why are you merging it again?"
you consider your program to have one large list
that large list is really an interface that handles how you interact with several sub-lists for efficiency sake
the interface to your data persistence (the lists) for storing and retrieval then acts like one large list but works with several under-the-hood
an insertion and retrieval mechanism for handling the multiple lists as one list should exist for your interface then
save and reload the sublists individually as well
If you wrap this into a poly~, the voice acts as the sublist, so when I say voice I basically mean sublist:
You could use a universal send/receive in and out of a poly~ abstraction that contains your sublist's unique coll, the voice# from poly~ can append uniquely to your sublist filename that is reading/saving to for that voice's [coll].
With that set up, you could specify the number of sublists (voices) and master list length you want in the poly~ arguments like:
[poly~ sublist_manager.maxpat 10 1000] // 10 sublists emulating a 1000-length list
The math for index lookup is:
//main variables for master list creation/usage
master_list_length = 1000
sublist_count = 10
sublist_length = master_list_length/sublist_count;
//variables created when inserting/looking up an index
sublist_number = (desired_index/sublist_count); //integer divide to get the base sublist you'll be performing the lookup in
sublist_index = (desired_index%sublist_length); //actual index within your sublist to access
If the above ^ is closer to what you're looking for I can work on a patch for that. cheers