I have a table of data in kdb and I would like to use q to remove the rows which contain a duplicate value in one column.
For example, if I have the following table where there is a duplicate value in the Age column:
Name Age Degree
---------------------
Alice 26 Science
Bob 34 Arts
Carrie 26 Engineering
How would I delete the third row so I end up with the following:
Name Age Degree
---------------------
Alice 26 Science
Bob 34 Arts
Thanks!
You could do
select from t where i=(first;i)fby Age
You can delete the duplicates in any column using this:
q)delete from t where ({not x in 1#x};i) fby Age
Name Age Degree
-----------------
Alice 26 Science
Bob 34 Arts
This could also be solved using a by clause instead of fby. Because select ... by keeps the last row of each group, you have to reverse the table first to get the first occurrence of each age:
q)0!select by Age from reverse t
Age Name Degree
-----------------
26 Alice Science
34 Bob Arts
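For readers less familiar with q, the idea behind the answers above (keep only the first row seen for each Age) can be sketched in Python. This is an illustration of the logic, not a translation of fby; the names `rows` and `dedupe_first` are made up for the example.

```python
# Rows mirror the example table from the question.
rows = [
    {"Name": "Alice", "Age": 26, "Degree": "Science"},
    {"Name": "Bob", "Age": 34, "Degree": "Arts"},
    {"Name": "Carrie", "Age": 26, "Degree": "Engineering"},
]


def dedupe_first(records, key):
    """Keep the first record seen for each distinct value of `key`."""
    seen = set()
    kept = []
    for rec in records:
        if rec[key] not in seen:
            seen.add(rec[key])
            kept.append(rec)
    return kept


result = dedupe_first(rows, "Age")
for rec in result:
    print(rec["Name"], rec["Age"], rec["Degree"])
```

Row order is preserved, so Carrie (the second row with Age 26) is the one dropped.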
I know that LISTAGG almost does what I want; however, it puts all of the values into one column.
For example if I have something like this
SubjectID StudentName
---------- -------------
1 Mary
1 John
1 Sam
2 Alaina
2 Edward
I am hoping to get something like this
SubjectID StudentName StudentName1 StudentName2
---------- ----------- ------------ ------------
1 Mary John Sam
2 Alaina Edward 0
This is called pivoting and is perfectly described in this article
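As a rough illustration of what pivoting does to the sample data, here is a minimal Python sketch (not SQL, and not taken from the linked article). The variable names are made up, and missing cells are padded with None rather than the 0 shown in the question.

```python
from collections import defaultdict

# (SubjectID, StudentName) pairs from the question.
pairs = [(1, "Mary"), (1, "John"), (1, "Sam"), (2, "Alaina"), (2, "Edward")]

# Collect the student names for each subject, preserving input order.
groups = defaultdict(list)
for subject_id, student in pairs:
    groups[subject_id].append(student)

# Pad every row to the width of the largest group so all rows have the same columns.
width = max(len(names) for names in groups.values())
pivoted = []
for subject_id, names in sorted(groups.items()):
    padding = [None] * (width - len(names))
    pivoted.append((subject_id, *names, *padding))

for row in pivoted:
    print(row)
```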
In Apache Pig 0.15, I have two simple lists (WITHOUT id/primary key, etc.) that I want to merge together to create one list of tuples with two columns. Example:
Names
-----
Peter
John
Anne
Ages
-----
45
23
44
I want to end up with:
Names Age
---------------
Peter 45
John 23
Anne 44
I know I can use RANK on both lists and then JOIN, but that looks way too costly as I have millions of entries in these lists. I kind of want to do a JOIN with "merge" without having a join parameter...
Any idea about how to do this efficiently in Apache Pig?
If you do not care about the mapping between Age and Name, you can try a cross join between the two relations: after the cross join, group by name and keep any one row from each group. However, IMO this may be more costly (or rather, more resource-intensive) than the RANK approach you mentioned above.
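To make the trade-off concrete, here is a small Python sketch (not Pig) contrasting the two approaches on the sample lists. The enumerate-then-join part only mimics what RANK followed by JOIN would do; all variable names are illustrative.

```python
from itertools import product

names = ["Peter", "John", "Anne"]
ages = [45, 23, 44]

# Analogue of RANK + JOIN: number both lists, then join on the rank.
ranked_names = {rank: name for rank, name in enumerate(names, start=1)}
ranked_ages = {rank: age for rank, age in enumerate(ages, start=1)}
joined = [
    (ranked_names[rank], ranked_ages[rank])
    for rank in sorted(ranked_names.keys() & ranked_ages.keys())
]

# Analogue of a cross join: every name paired with every age.
crossed = list(product(names, ages))

print(joined)
print(len(crossed))
```

The cross join yields len(names) * len(ages) pairs and keeps no record of which age belonged to which position, which is why it only helps if the pairing does not matter.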
I have the following data returned from a stored procedure
Staff Category Amount
----- ------- ------
Bob Art 123
Bob Sport 777
Bob Music 342
Jeff Art 0
Jeff Sport 11
Jeff Music 27
All Categories will always be returned for all Staff, even if the Amount is zero.
What I want to do on my Crystal Report is output this:-
Staff Art Sport Music
----- --- ----- -----
Bob 123 777 342
Jeff 0 11 27
I effectively want to Transpose the data in the Category rows as headers or columns in my report.
I do not want to use a Cross Tab as I have other things I need to add which will not fit nicely into a Cross Tab
Any thoughts on how I can do this in Crystal? I'm using version 11
You should be able to achieve this in your stored procedure with a PIVOT table. A help file on PIVOT tables can be found here
Group the report by Staff and place Staff, Art, Sport, Music as text fields in the group header.
Now, in the details section, place the data as:
Staff, formula 1 (If Category='Art' then Amount), formula 2 (If Category='Sport' then Amount), formula 3 (If Category='Music' then Amount)
If each Staff has only one row, that is fine as is; otherwise, place Staff in the group footer and take the sum of each formula's values there (don't remove formulas 1, 2 and 3 from the details section).
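The formula-plus-sum approach is the same idea as conditional aggregation in SQL. Below is a minimal sketch using Python's built-in sqlite3 module rather than Crystal Reports or the original stored procedure; the table name `staff_amounts` and the column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff_amounts (staff TEXT, category TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO staff_amounts VALUES (?, ?, ?)",
    [
        ("Bob", "Art", 123), ("Bob", "Sport", 777), ("Bob", "Music", 342),
        ("Jeff", "Art", 0), ("Jeff", "Sport", 11), ("Jeff", "Music", 27),
    ],
)

# One CASE expression per category plays the role of formulas 1-3;
# SUM ... GROUP BY staff plays the role of the group-footer totals.
query = """
SELECT staff,
       SUM(CASE WHEN category = 'Art'   THEN amount ELSE 0 END) AS art,
       SUM(CASE WHEN category = 'Sport' THEN amount ELSE 0 END) AS sport,
       SUM(CASE WHEN category = 'Music' THEN amount ELSE 0 END) AS music
FROM staff_amounts
GROUP BY staff
ORDER BY staff
"""
pivot_rows = conn.execute(query).fetchall()
for row in pivot_rows:
    print(row)
```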
Given the following two tables:
NAMES
NAME NUMBER
---------- -------
Wayne Gretzky 99
Jaromir Jagr 68
Bobby Orr 4
Bobby Hull 23
Mario Lemieux 66
POINTS
NAME POINTS
---------- ------
Wayne Gretzky 244
Bobby Orr 129
Brett Hull 121
Mario Lemieux 189
Joe Sakic 94
How many rows would be returned using the following statement?
SELECT name FROM names, points
Can someone explain why the answer is 25?
Thanks in advance for any help provided
Listing two tables in FROM without a join condition is equivalent to a cross join in standard SQL. Hence the number of records returned is 5 records in names * 5 records in points = 25.
Also known as the "Cartesian Product"
"The Cartesian product, also referred to as a cross-join, returns all the rows in all the tables listed in the query. Each row in the first table is paired with all the rows in the second table. This happens when there is no relationship defined between the two tables."
from:
http://www.dba-oracle.com/t_garmany_9_sql_cross_join.htm
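The row count can be checked with a small sketch using Python's built-in sqlite3 module (an assumption for illustration; the question does not say which database is used). The column is written as `names.name` here because both tables have a `name` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT, number INTEGER)")
conn.execute("CREATE TABLE points (name TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [
        ("Wayne Gretzky", 99), ("Jaromir Jagr", 68), ("Bobby Orr", 4),
        ("Bobby Hull", 23), ("Mario Lemieux", 66),
    ],
)
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [
        ("Wayne Gretzky", 244), ("Bobby Orr", 129), ("Brett Hull", 121),
        ("Mario Lemieux", 189), ("Joe Sakic", 94),
    ],
)

# No join condition: each row of names is paired with every row of points.
cross_rows = conn.execute("SELECT names.name FROM names, points").fetchall()
print(len(cross_rows))
```

Each of the 5 names appears 5 times in the result, once per row of points.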
I have a list like:
Name Age
Charles 18
Anna 20
Anna 19
Tomas 44
Karla 13
Charles 88
I would like to write a JPQL statement that gives me:
Charles 18
Anna 20
Tomas 44
Karla 13
In other words, how can I get a list with unique names, where I don't care which age is returned?
Best regards
Carl
If you really don't care about the age, don't select it:
select distinct u.name from User u
If you'd like to get a valid age with each user, but don't care which one, select the min or max of the ages:
select u.name, max(u.age) from User u group by u.name
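The two queries can be tried out in plain SQL, which behaves the same way for this case. The sketch below uses Python's built-in sqlite3 module instead of JPA; the table name `person` stands in for the `User` entity and is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [
        ("Charles", 18), ("Anna", 20), ("Anna", 19),
        ("Tomas", 44), ("Karla", 13), ("Charles", 88),
    ],
)

# Names only: DISTINCT removes the repeated Anna and Charles.
distinct_names = conn.execute(
    "SELECT DISTINCT name FROM person ORDER BY name"
).fetchall()

# One row per name with a valid age: here the maximum age per name.
name_with_max_age = conn.execute(
    "SELECT name, MAX(age) FROM person GROUP BY name ORDER BY name"
).fetchall()

print(distinct_names)
print(name_with_max_age)
```

Note that max picks 88 for Charles, while the desired output in the question shows 18; min would return 18 for Charles but 19 for Anna, so neither aggregate reproduces the "first row seen" exactly.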