I want to be able to construct (+; (+; `a; `b); `c) given a list of `a`b`c
Similarly if I have a list of `a`b`c`d, I want to be able to construct another nest and so on and so fourth.
I've been trying to use scan but I cant get it right
q)fsum:(+;;)/
enlist[+;;]/
q)fsum `a`b`c`d
+
(+;(+;`a;`b);`c)
`d
If you only want the raw parse tree output, one way is to form the equivalent string and use parse. This isn't recommended for more complex examples, but in this case it is clear.
{parse "+" sv string x}[`a`b`c`d]
+
`d
(+;`c;(+;`b;`a))
If you are looking to use this in a functional select, we can use +/ instead of adding each column individually, like how you specified in your example
q)parse"+/[(a;b;c;d)]"
(/;+)
(enlist;`a;`b;`c;`d)
q)f:{[t;c] ?[t;();0b;enlist[`res]!enlist (+/;(enlist,c))]};
q)t:([]a:1 2 3;b:4 5 6;c:7 8 9;d:10 11 12)
q)f[t;`a`b`c]
res
---
12
15
18
q)f[t;`a`b]
res
---
5
7
9
q)f[t;`a`b`c]~?[t;();0b;enlist[`res]!enlist (+;(+;`a;`b);`c)]
1b
You can also get the sum by indexing directly to return a list of each column values and sum over these. We use (), to turn any input into a list, otherwise it will sum the values in that single column and return only a single value
q)f:{[t;c] sum t (),c}
q)f[t;`a`b`c]
12 15 18
Say I have a list
b:1 1 2 3 4
and I want to find the location of the element in list b using another list
a:1 2
When I type in b in\ a, I got
11000b
00000b
where it should be
11000b
00100b
What is going on and how to get the desired answer?
Thanks in advance!
You need to use each-right /:
q)b in/:a
11000b
00100b
With b in\a the first output is getting passed back in as b. Effectively:
q)1 1 2 3 4 in 1
11000b
q)11000b in 2
00000b
You can also use each-both ':
q)in[b]'[a]
11000b
00100b
I have a file which contain multiple rows of item codes as follows. There are 1 million rows similar to these
1. 123,134,256,345,789.....
2. 123,256,345,678,789......
.
.
I would like to find the count of all the pair of words/items per row in the file using q in kdb+. i.e. any two pair of words that occur in the same row can be considered a word pair.
e.g:
(123,134),(123,256),(134,256), (123,345) (123,789), (134,789) are some of the word pairs in row 1
(123,256),(123,345),(123,345),(678,789),(345,789) are some of the word pairs in row 2
word/item pair count
`123,134----1
123,256---2
345,789---2`
I am reading the file using read0 and have been able to convert each line into list using vs and using count each group to count the number of words, but now I want to find the count of all the word pairs per row in the file.
Thanks in advance for your help
I'm not 100% I understand your definition of a word-pair. Perhaps you could expand a little if my logic doesn't match what you were looking for.
In the example below, I've created a 5x5 matrice of symbols for testing - selected distinct pairs of values from each row, and then checked how many rows each of these appeared in, in total.
Please double check with your own results.
q)test:5 cut`$string 25?5
q)test
2 0 1 0 0
2 4 4 2 0
1 0 0 3 4
2 1 1 4 4
3 0 3 4 0
q)count each group raze {l[where(count'[l:distinct distinct each asc'[x cross x:distinct x]])>1]} each test
0 2| 2
1 2| 2
0 1| 2
2 4| 2
0 4| 3
1 3| 1
1 4| 2
0 3| 2
3 4| 2
To add some other cases to Matthew's answer above, if what you want is to break the list down into pairs in this way:
l:"a,b,c,d,e,f,g"
becomes
"a,b"
"b,c"
"c,d"
"d,e"
"e,f"
"f,g"
so only taking valid pairs, you could use something like this:
f:{count each group b flip 0 1+\:til 1+count[b:","vs x]-1}
q)f l
,"a" ,"b"| 1
,"b" ,"c"| 1
,"c" ,"d"| 1
,"d" ,"e"| 1
,"e" ,"f"| 1
,"f" ,"g"| 1
where we're splitting the input list on ".", then using indexing to get a list of each element and the element directly to its right, then grouping the resultant list of pairs to count the distinct pairs. If you want to split it so l becomes
"a,b"
"c,d"
"e,f"
then you could use this:
g:{count each group b flip 0 1+\:2*til count[b:","vs x]div 2}
q)g l
,"a" ,"b"| 1
,"c" ,"d"| 1
,"e" ,"f"| 1
Which uses a similar approach, starting with the even-positioned elements and getting those to their right, and repeating as above.
You can easily apply these to the rows read with read0:
r:read0`:file.txt
f each r
will output a dictionary of the counts of each pair for each row, and this can be summed to give the total count of each word pair with each method throughout the file.
Hope this helps - it's still not clear what you mean by pairs, so if neither my answer not Matthew's is of some use, you could edit in a more complete explanation of what you'd like and we can help with that.
If you want to consider all possible combinations of 2 pairs in each row then this may be of help. The following function can be used to give distinct combinations, where x is the size of the list and y is the length of the combination:
q)comb:{$[x=y;enlist til x;1=y;flip enlist til x;.z.s[x;y],.z.s[x;y-1],'x-:1]}
q)comb[3;2]
0 1
0 2
1 2
From here we can index into each list to get the pairs, then raze to give a single list of all pairs, group to get the indices where each pair occurs and then count the number of indices in each group:
q)a
123 134 256 345 789
123 256 345 678 789
q)count each group raze{x comb[count x;2]}'[a]
123 134| 1
123 256| 2
134 256| 1
...
345 789| 2
...
I currently have a 4x3500 cell array. First row is a single number, 2 row is a single string, 3rd and 4th rows are also single numbers.
Ex:
1 1 2 3 3 4 5 5 5 6
hi no ya he ........ % you get the idea
28 34 18 0 3 ......
55 2 4 42 24 .....
I would like to be able to select all columns that have a certain value in the first row. ie if I wanted '1' as the first row value, it would return
1 1
hi no
28 34
55 2
Then I would like to sort based on the 2nd row's string. ie if I wanted to have'hi', it would return:
1
hi
28
55
I have attempted to do:
variable = cellArray{:,find(cellArray{1,:} == 1)}
However I keep getting:
Error using find
Too many input arguments.
or
Error using ==
Too many input arguments.
Any help would be much appreciated! :)
{} indexing will return a comma separated list which will provide multiple outputs. When you pass this to find, it's the same as passing each element of your cell array as a separate input. This is what leads to the error about to many input arguments.
You will want to surround the comma-separated list with [] to create an array or numbers. Also, you don't need find because you can just use logical indexing to grab the columns you want. Additionally, you will want to index using () to grab the relevant rows, again to avoid the comma-separated list.
variable = cellArray(:, [cellArray{1,:}] == 1)
To form a matrix consisting of identical rows, one could use
x:1 2 3
2 3#x,x
which produces (1 2 3i;1 2 3i) as expected. However, attempting to generalise this thus:
2 (count x)#x,x
produces a type error although the types are equal:
(type 3) ~ type count x
returns 1b. Why doesn't this work?
The following should work.
q)(2;count x)#x,x
1 2 3
1 2 3
If you look at the parse tree of both your statements you can see that the second is evaluated differently. In the second only the result of count is passed as an argument to #.
q)parse"2 3#x,x"
#
2 3
(,;`x;`x)
q)parse"2 (count x)#x,x"
2
(#;(#:;`x);(,;`x;`x))
If you're looking to build matrices with identical rows you might be better off using
rownum#enlist x
q)x:100000?100
q)\ts do[100;v1:5 100000#x,x]
157 5767696j
q)\ts do[100;v2:5#enlist x]
0 992j
q)v1~v2
1b
I for one find this more natural (and its faster!)