Extracting kdb list values based on some condition - kdb

Say we have a kdb list
L1:(1 2 3 4 5)
Apply condition
L1 < 3
And how can I retrieve result in another list (1 2)

You can use the where keyword for this:
q)l1 where l1<3
1 2
Applying l1<3 will return a list of booleans 11000b. Using where on this list will return the index of every 1b
q)where 11000b
0 1
Then indexing back into the original list will return the result in another list.

Related

kdb - how to create sum a list of dynamic columns using functional select

I want to be able to construct (+; (+; `a; `b); `c) given a list of `a`b`c
Similarly if I have a list of `a`b`c`d, I want to be able to construct another nest and so on and so fourth.
I've been trying to use scan but I cant get it right
q)fsum:(+;;)/
enlist[+;;]/
q)fsum `a`b`c`d
+
(+;(+;`a;`b);`c)
`d
If you only want the raw parse tree output, one way is to form the equivalent string and use parse. This isn't recommended for more complex examples, but in this case it is clear.
{parse "+" sv string x}[`a`b`c`d]
+
`d
(+;`c;(+;`b;`a))
If you are looking to use this in a functional select, we can use +/ instead of adding each column individually, like how you specified in your example
q)parse"+/[(a;b;c;d)]"
(/;+)
(enlist;`a;`b;`c;`d)
q)f:{[t;c] ?[t;();0b;enlist[`res]!enlist (+/;(enlist,c))]};
q)t:([]a:1 2 3;b:4 5 6;c:7 8 9;d:10 11 12)
q)f[t;`a`b`c]
res
---
12
15
18
q)f[t;`a`b]
res
---
5
7
9
q)f[t;`a`b`c]~?[t;();0b;enlist[`res]!enlist (+;(+;`a;`b);`c)]
1b
You can also get the sum by indexing directly to return a list of each column values and sum over these. We use (), to turn any input into a list, otherwise it will sum the values in that single column and return only a single value
q)f:{[t;c] sum t (),c}
q)f[t;`a`b`c]
12 15 18

Mongdb combined limit and sort when using find function

I have db mongdb example with document a and document b
a_id type
1 1
2 2
3 3
4 4
Now. I want to extract the last N (1,2,3,4,5,....) values in table b in the same order as in the example above. But if I use skip function :
b.find().skip(M)
if M > N then result empty => wrong. I want dynamic M.
If I use sort and limit then it does not give the correct order.
b.find().sort({$natural:-1}).limit(M)
result:
4 4
3 3
I want a solution!
You can use the same skip() to access the last N documents in the collection.
N = Last N documents to be accessed
So the query is
b.find().skip(b.count() - N).pretty()
or you can play with the mongo shell just as javascript like
var totalCount = b.count()
db.find().skip(totalCount - N).pretty()

Selecting all columns in a cell array that contain a certain value in the first row?

I currently have a 4x3500 cell array. First row is a single number, 2 row is a single string, 3rd and 4th rows are also single numbers.
Ex:
1 1 2 3 3 4 5 5 5 6
hi no ya he ........ % you get the idea
28 34 18 0 3 ......
55 2 4 42 24 .....
I would like to be able to select all columns that have a certain value in the first row. ie if I wanted '1' as the first row value, it would return
1 1
hi no
28 34
55 2
Then I would like to sort based on the 2nd row's string. ie if I wanted to have'hi', it would return:
1
hi
28
55
I have attempted to do:
variable = cellArray{:,find(cellArray{1,:} == 1)}
However I keep getting:
Error using find
Too many input arguments.
or
Error using ==
Too many input arguments.
Any help would be much appreciated! :)
{} indexing will return a comma separated list which will provide multiple outputs. When you pass this to find, it's the same as passing each element of your cell array as a separate input. This is what leads to the error about to many input arguments.
You will want to surround the comma-separated list with [] to create an array or numbers. Also, you don't need find because you can just use logical indexing to grab the columns you want. Additionally, you will want to index using () to grab the relevant rows, again to avoid the comma-separated list.
variable = cellArray(:, [cellArray{1,:}] == 1)

Reshape (#) doesn't work with a dynamic argument

To form a matrix consisting of identical rows, one could use
x:1 2 3
2 3#x,x
which produces (1 2 3i;1 2 3i) as expected. However, attempting to generalise this thus:
2 (count x)#x,x
produces a type error although the types are equal:
(type 3) ~ type count x
returns 1b. Why doesn't this work?
The following should work.
q)(2;count x)#x,x
1 2 3
1 2 3
If you look at the parse tree of both your statements you can see that the second is evaluated differently. In the second only the result of count is passed as an argument to #.
q)parse"2 3#x,x"
#
2 3
(,;`x;`x)
q)parse"2 (count x)#x,x"
2
(#;(#:;`x);(,;`x;`x))
If you're looking to build matrices with identical rows you might be better off using
rownum#enlist x
q)x:100000?100
q)\ts do[100;v1:5 100000#x,x]
157 5767696j
q)\ts do[100;v2:5#enlist x]
0 992j
q)v1~v2
1b
I for one find this more natural (and its faster!)

Using SUM and UNIQUE to count occurrences of value within subset of a matrix

So, presume a matrix like so:
20 2
20 2
30 2
30 1
40 1
40 1
I want to count the number of times 1 occurs for each unique value of column 1. I could do this the long way by [sum(x(1:2,2)==1)] for each value, but I think this would be the perfect use for the UNIQUE function. How could I fix it so that I could get an output like this:
20 0
30 1
40 2
Sorry if the solution seems obvious, my grasp of loops is very poor.
Indeed unique is a good option:
u=unique(x(:,1))
res=arrayfun(#(y)length(x(x(:,1)==y & x(:,2)==1)),u)
Taking apart that last line:
arrayfun(fun,array) applies fun to each element in the array, and puts it in a new array, which it returns.
This function is the function #(y)length(x(x(:,1)==y & x(:,2)==1)) which finds the length of the portion of x where the condition x(:,1)==y & x(:,2)==1) holds (called logical indexing). So for each of the unique elements, it finds the row in X where the first is the unique element, and the second is one.
Try this (as specified in this answer):
>>> [c,~,d] = unique(a(a(:,2)==1))
c =
30
40
d =
1
3
>>> counts = accumarray(d(:),1,[],#sum)
counts =
1
2
>>> res = [c,counts]
Consider you have an array of various integers in 'array'
the tabulate function will sort the unique values and count the occurances.
table = tabulate(array)
look for your unique counts in col 2 of table.