Select non-empty string rows in KDB

q)tab
items sales prices detail
------------------------------
nut   6     10     "blah"
bolt  8     20     ""
cam   0     15     "some text"
cog   3     20     ""
nut   6     10     ""
bolt  8     20     ""
I would like to select only the rows where "detail" is non-empty. It seems fairly straightforward, but I can't get it to work.
q) select from tab where count[detail] > 0
This gives all the rows still.
Alternatively, I tried:
q) select from tab where not null detail
This gives me a type error.
How can one query for non-empty string fields in KDB?

Rather than use adverbs, you can simplify this with the use of like.
q)select from tab where not detail like ""
items sales prices detail
------------------------------
nut   1     10     "blah"
cam   5     9      "some text"

As you need to perform the check row-wise, use each:
select from tab where 0 < count each detail
This yields the following table:
items sales prices detail
------------------------------
nut   6     10     "blah"
cam   0     15     "some text"

Use the adverb each:
q)select from ([]detail:("blah";"";"some text")) where 0<count each detail
detail
-----------
"blah"
"some text"

I would use the following approach:
select from tab where not detail~\:""
Here each detail value is compared to the empty string. The approach with not null detail does not work because q treats a string as a character array and checks whether each element of the array is null. For example, null "abc" returns the boolean array 000b, but the where clause expects a single boolean value for each row.
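The difference between a per-character result and a per-row result can be sketched outside q as well. The following is only a rough Python analogy, not q code: the helper names null_flags and is_non_empty are made up for illustration, and the space character stands in for q's null char.

```python
def null_flags(s):
    """Per-character flags, loosely like q's `null` applied to one string.

    Assumption: a space plays the role of the null character.
    """
    return [ch == " " for ch in s]


def is_non_empty(s):
    """A single flag per string, like `0 < count` applied to each row."""
    return len(s) > 0


detail = ["blah", "", "some text", ""]

# One list of flags per string: a nested result, unusable as a row filter.
print([null_flags(s) for s in detail])

# One flag per string: exactly what a row filter needs.
print([is_non_empty(s) for s in detail])

# Keep only the rows whose flag is true.
print([s for s in detail if is_non_empty(s)])
```

The nested result of the first print is the shape that the where clause cannot use, while the second print has one boolean per row.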

If your table is not big, another option is to cast the column to symbols in the where clause.
q)select from ([]detail:("blah";"";"some text")) where `<>`$detail
detail
-----------
"blah"
"some text"
Or simply
q)select from ([]detail:("blah";"";"some text")) where not null `$detail
detail
-----------
"blah"
"some text"

Related

KDB: how to compare strings?

I have a column of type C. How do I compare each value to the previous value in the same column? I tried col1 like prev col1, but it returns a length error. I also created another column, newCol: prev col1, but still could not perform the comparison. I also tried = with no luck. How can I do this?
Sample data:
col1
Paris
London
London
New York
Singapore
Ha Noi
Could you use the prior keyword?
q)t
col1
-----------
"Paris"
"London"
"London"
"New York"
"Singapore"
"Ha Noi"
q)select (~) prior col1 from t
col1
----
0
0
1
0
0
0
When you compare two strings with =, q compares them character by character if they are the same length and returns a list of booleans showing where the characters agree. If the strings have different lengths, you get a length error. To test whether two strings are exactly the same, use ~ (match), which works regardless of the lengths and returns a single boolean.
Use each prior: https://code.kx.com/q/ref/maps/#each-prior
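As a rough illustration of the two behaviours just described, here is a small Python model. It is an analogy only, not q: the function names equal_elementwise, match and match_prior are hypothetical, and the ValueError merely stands in for q's length error.

```python
def equal_elementwise(a, b):
    """Model of element-wise comparison: one flag per character.

    Raises ValueError for different lengths (stand-in for a length error).
    """
    if len(a) != len(b):
        raise ValueError("length")
    return [x == y for x, y in zip(a, b)]


def match(a, b):
    """Model of match: a single flag, whatever the lengths."""
    return a == b


def match_prior(col):
    """Compare each item with its predecessor; the first item has none."""
    return [i > 0 and match(col[i], col[i - 1]) for i in range(len(col))]


col1 = ["Paris", "London", "London", "New York", "Singapore", "Ha Noi"]
print(equal_elementwise("London", "Londen"))  # one flag per character
print(match("Paris", "London"))               # a single flag
print(match_prior(col1))                      # one flag per row
```

Only the third "London" row is flagged by match_prior, which agrees with the prior-based output shown above.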
With match: https://code.kx.com/q/basics/comparison/#match
q)tab:([]col1:("Paris";"London";"London";"New York"))
q)select col1,compare:(~':)col1 from tab
col1 compare
------------------
"Paris" 0
"London" 0
"London" 1
"New York" 0
You should use like' instead of like, because you are comparing against a list rather than a single value.
update comparison: col1 like' prev col1 from ([]col1:("Paris";"London";"London";"New York";"Singapore";"Ha Noi"))
Although this is essentially the same as Matthews and jomahony's answers, the differ keyword can arguably make it easier to read/understand:
q)select not differ col1 from ([]col1:("Paris";"London";"London";"New York"))
col1
----
0
0
1
0

Aggregate on another aggregation

I have a dataframe like this:
user_id items
1 item1
1 item2
1 item3
2 item1
2 item5
3 item4
3 item2
If I put user_id on rows and items on columns, I get this:
user_id number_of_items
1 3
2 2
3 2
Now I would like to group this result again, like this:
number_of_user_id number_of_items
1 3
2 2
How can I do this, as a calculated field or in a graph (maybe a histogram)?
First, create the following calculated field, called users_per_item:
{ fixed items : countd(user_id) }
Then select the new measure, users_per_item, in the data pane and right-click to create bins. Set the bin size to 1, or whatever value you like. That creates a dimension called users_per_item (bin).
Finally, use the bin field to build the view you want; for example, place users_per_item (bin) on the Columns shelf and CNTD(items) on the Rows shelf.
This is a natural use of LOD calculations for a two-stage analysis.
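The two-stage idea can also be checked outside Tableau. The sketch below is plain Python, not a Tableau calculation, and it groups by user_id first, as in the question's tables; the function names items_per_user and users_per_item_count are made up for illustration.

```python
from collections import Counter

# Sample rows from the question: (user_id, item).
rows = [
    (1, "item1"), (1, "item2"), (1, "item3"),
    (2, "item1"), (2, "item5"),
    (3, "item4"), (3, "item2"),
]


def items_per_user(rows):
    """Stage 1: count the distinct items for each user_id."""
    seen = {}
    for user_id, item in rows:
        seen.setdefault(user_id, set()).add(item)
    return {user_id: len(items) for user_id, items in seen.items()}


def users_per_item_count(rows):
    """Stage 2: count how many users share each stage-1 count."""
    return Counter(items_per_user(rows).values())


print(items_per_user(rows))
print(users_per_item_count(rows))
```

Stage 1 plays the role of the fixed-level calculation, and stage 2 plays the role of binning that result and counting within each bin.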

Get substring into a new column

I have a table "s" with a column "title" that holds data in the following format:
title
ab.123
ab.321
cde.456
cde.654
fghi.789
fghi.987
I am trying to get a unique list of the characters that come before the ".", so that I end up with this:
ab
cde
fghi
I have tried selecting the initial column into a table and then doing an update to create a new column holding the position of the dot using "ss", something like this:
t: select title from s
update thedot: (title ss `.)[0] from t
I was then going to add a third column holding the first N characters of "title", where N is the value stored in the "thedot" column.
All I get when I try the update is a "type" error.
Any ideas? I am very new to kdb, so I am no doubt doing something simple in a very silly way.
The reason you get the type error is that ss only works on strings, not symbols. Also, ss is not a vector function, so you need to combine it with each (').
q)update thedot:string[title] ss' "." from t
title thedot
---------------
ab.123 2
ab.321 2
cde.456 3
cde.654 3
fghi.789 4
There are a few ways to solve your problem:
q)select distinct(`$"." vs' string title)[;0] from t
x
----
ab
cde
fghi
q)select distinct(` vs' title)[;0] from t
x
----
ab
cde
fghi
You can read here for more info: http://code.kx.com/q/ref/casting/#vs
An alternative is to use the 0: operator to parse around the "." delimiter. This operator is especially useful if you have a fixed number of 'columns', as in a csv file. In this case, where there is a fixed number of columns and we only want the first, a list of the distinct prefixes before the "." can be returned with:
exec distinct raze("S ";".")0:string title from t
`ab`cde`fghi
OR:
distinct raze("S ";".")0:string t`title
`ab`cde`fghi
Here "S " defines the type of each column and "." is the field delimiter. For records with differing numbers of columns, it would be better to use the vs operator.
A variation of WooiKent's answer using each-right (/:) :
q)exec distinct (` vs/:title)[;0] from t
`ab`cde`fghi
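For comparison, the same split-then-distinct logic can be sketched in Python. This is an illustration of the idea rather than q code, and the function name distinct_prefixes is hypothetical.

```python
def distinct_prefixes(titles, sep="."):
    """Return the text before the first separator, without duplicates.

    First-occurrence order is kept by using a dict as an ordered set.
    """
    prefixes = (title.split(sep, 1)[0] for title in titles)
    return list(dict.fromkeys(prefixes))


titles = ["ab.123", "ab.321", "cde.456", "cde.654", "fghi.789", "fghi.987"]
print(distinct_prefixes(titles))
```

Splitting only on the first separator means a title with more than one "." still yields just its leading part.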

Crystal Report Cross Tab Calculated Member as text

I've created a cross-tab report in CR 2011 with two calculated members that show the difference between two columns and the percentage of this difference. What I want to achieve is a new column that displays text depending on the difference value.
Here is an example:
Col1 Col2 Difference Percentage Action
200  0    -200       100        DROPPED
100  100  0          0
0    300  300        100        ADDED
How can I create this Action column? A calculated member only accepts a numeric value, so I cannot output text in the formula.
Thanks in advance for your help
I finally found the solution.
I can use the Display String formula in the Format Field properties (Common tab). There I just check the column and return the string I want; otherwise I just format the number.
IF GetColumnGroupIndexOf(CurrentColumnIndex) = 1
   AND CurrentColumnIndex = 4 THEN
    IF GridValueAt(CurrentRowIndex, CurrentColumnIndex, CurrentSummaryIndex) = 2 THEN "DROPPED"
    ELSE "ADDED"
ELSE
    ToText(GridValueAt(CurrentRowIndex, CurrentColumnIndex, CurrentSummaryIndex), 2, ",")

Jasper - How to remove complete row when 1 particular field is null?

I want to print a PDF that looks something like this:
Name Class RollNo
------- ---------- -----------
John 5 <null>
Mark 5 103
Robert 6 104
I need to add a condition so that if RollNo is null, that row is removed from the 'detail' band.
You can use the report's filter expression or the detail band's print when expression. The filter expression completely skips the record, which is not counted and does not participate in aggregations, while the band's print when expression simply inhibits the band from printing.
<filterExpression>$F{RollNo} != null</filterExpression>
...OR...
<detail>
<band height="x">
<printWhenExpression>$F{RollNo} != null</printWhenExpression>
<textField>
...