Extract value from a super field - amazon-redshift

I have a field event_params that is type super.
Example value of event_params:
[
{"key":"type","value":{"string_value":"blah"}},
{"key":"value","value":{"double_value":5.99}},
{"key":"category","value":{"string_value":"add-ons"}},
{"key":"event_id","value":{"string_value":"956bb654-be90-5334-91a3-ee97074da0eb"}}
]
I would like to get event_id without unnesting the whole array; I just want to drill down into it and pull out the event ID.
After reading the docs and experimenting, I can do this:
select
event_params[4].value,
event_params[4].key,
event_params
from "myschema"."mytable"
where event_name = 'purchase'
and event_timestamp = 1660193632449615;
Here I used index 4, but the data will not necessarily be in the same order. Instead, I want to filter on the key being event_id.
How can I modify this: event_params[4].key to instead filter based on the name of the key I want, in this case event_id?

You can try something like this:
select
event_name,
ep
from
"myschema"."mytable" m,
m.event_params as ep
where
event_name = 'purchase'
and event_timestamp = 1660193632449615
and ep.key = 'event_id';
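The unnest-and-filter idea in the answer above can be tried locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, not Redshift itself: SQLite's json_each plays the role of iterating over the SUPER array, and json_extract plays the role of dot navigation. It assumes the SQLite build bundled with Python includes the JSON functions. The table and column names mirror the question. In Redshift, selecting ep.value.string_value instead of ep should similarly return just the ID, though that is untested here.

```python
import json
import sqlite3

# Sample array from the question, stored as JSON text in SQLite.
event_params = [
    {"key": "type", "value": {"string_value": "blah"}},
    {"key": "value", "value": {"double_value": 5.99}},
    {"key": "category", "value": {"string_value": "add-ons"}},
    {"key": "event_id", "value": {"string_value": "956bb654-be90-5334-91a3-ee97074da0eb"}},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (event_name TEXT, event_timestamp INTEGER, event_params TEXT)"
)
conn.execute(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    ("purchase", 1660193632449615, json.dumps(event_params)),
)

# json_each yields one row per array element; filter on the element's key
# instead of relying on its position in the array.
query = """
SELECT json_extract(ep.value, '$.value.string_value') AS event_id
FROM mytable AS m, json_each(m.event_params) AS ep
WHERE m.event_name = 'purchase'
  AND m.event_timestamp = 1660193632449615
  AND json_extract(ep.value, '$.key') = 'event_id'
"""
event_ids = [row[0] for row in conn.execute(query)]
print(event_ids)
```

Because the filter is on the key, the result does not depend on where event_id sits in the array.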

Related

Get Max value of a DATE and an ID in order to filter results

I'm trying to get the latest "date", i.e. the max value of "date", and from the same table I also want the max value of "stand" for the same ID.
I have many dates and stands for one ID, but I only want to extract the latest.
I'm trying to save it into a function, though I don't know yet whether that's the best idea. The rest of my query is made up of inner joins.
Datum is of type date.
stand is decimal(18,6)
DECLARE @MAXDATE DATE
DECLARE @MAXSTAND decimal(18,6)
SELECT @MAXDATE = MAX(Datum) FROM [dbo].[1] WHERE ID = ID
SELECT @MAXSTAND = Stand FROM [dbo].[2] WHERE ID = ID
Result I get: @MAXDATE: 2106-10-13
Result I get: @MAXSTAND: 0.000000
Result I want: @MAXDATE: 2018-01-16
Result I want: @MAXSTAND: 1098.000000
Assuming SQL Server 2012 or higher, you can use the first_value window function:
SELECT FIRST_VALUE(Stand) OVER(ORDER BY Datum DESC)
FROM TableName
WHERE Id = @Id
This will return the value of Stand where the Datum column has the latest value for the specific Id.
With the data provided, I don't think you need anything more than this:
SELECT MAX(Datum), MAX(Stand)
FROM TableName
WHERE ID = @MyId
Edit: if you want it per ID, you can do this:
SELECT MAX(Datum), MAX(Stand), ID
FROM TableName
GROUP BY ID
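The two answers above can give different results when the largest Stand does not belong to the row with the latest Datum. The sketch below illustrates this with Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) rather than SQL Server. The table name readings and every row except the 2018-01-16 / 1098 pair from the question are made up for illustration. DISTINCT is added because FIRST_VALUE without it returns one row per input row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ID INTEGER, Datum TEXT, Stand REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, "2018-01-14", 1200.0),  # largest Stand, but not the latest date
        (1, "2018-01-15", 1050.0),
        (1, "2018-01-16", 1098.0),  # latest date
        (2, "2018-01-10", 500.0),   # different ID, must be ignored
    ],
)

# Independent maxima: the two values may come from different rows.
independent = conn.execute(
    "SELECT MAX(Datum), MAX(Stand) FROM readings WHERE ID = ?", (1,)
).fetchone()

# Both values taken from the row that has the latest Datum.
latest_row = conn.execute(
    """
    SELECT DISTINCT
        FIRST_VALUE(Datum) OVER (ORDER BY Datum DESC),
        FIRST_VALUE(Stand) OVER (ORDER BY Datum DESC)
    FROM readings
    WHERE ID = ?
    """,
    (1,),
).fetchone()

print(independent)
print(latest_row)
```

If Stand only ever increases over time, both approaches agree; otherwise the window-function version is the one that returns the Stand recorded on the latest date.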

OrientDB: select edge where out=(select ??) does not work

I have a problem. I think that this is supposed to work, otherwise someone else would have run into this problem.
The following command works perfectly:
// suppose my record id is #10:0
select from MyEdgeType where out=#10:0
This works.
select from MyNodeType where name="this"
> returns obj with @rid = #10:0
The following does not work:
select from MyEdgeType where out=(select from MyNodeType where name="this")
select from MyEdgeType where out=(select @rid from (select from MyNodeType where name="this"))
select from MyEdgeType let $rec = (select from MyNodeType...) where out=$rec.rid
... etc.
Nothing works. Nothing. How do I select from edges such that I do not have to know the record id which is incident to the edges I would like to grab ahead of time?
You're comparing a single field against a result set (it's like comparing a string to an array). Try something like this:
select from MyEdgeType where out IN (select from MyNodeType where name="this")
I got this to work.
Since my nodes are unique (this is a constraint), I used the unique property to ID them during the filtration, rather than the record id from a subquery:
select from MyEdgeType where out.unique_identifier=...
worked.

Processing record type from a jsonb_each query

I store some data as JSON.
I want to flatten the data using jsonb_each.
The new column type is RECORD, but I don't know how to extract values from it.
SELECT T FROM (
SELECT json_each_text(skills::json->'prizes') FROM users) AS T;
The output is
jsonb_each
---------------------------------
(compliance,2)
(incentives,3)
(compliance,0)
(legal,3)
(legal,2)
(international-contributions,3)
The type is RECORD.
pg_typeof
-----------
record
I want to do an aggregate with GROUP BY, but I cannot figure out how to extract the first element (the string) and the second element (the value).
Here is a workaround I have found: JSON -> ROW -> JSON -> (string, integer), and then aggregate. But I am wondering whether there is a shortcut that skips the ROW -> JSON conversion.
SELECT U.key, AVG(U.value::int) FROM
(SELECT row_to_json(T)->'s'->>'key' AS key,
row_to_json(T)->'s'->>'value' AS value
FROM
(SELECT jsonb_each(skills::jsonb->'prizes') AS s
FROM users) AS T
) AS U
GROUP BY key;
Thanks a lot, @Arnaud; this seems like a fairly uncommon problem. I wasn't sure about the JSON structure produced by the row_to_json function, so I first checked it with:
SELECT row_to_json(T) FROM
(SELECT jsonb_each((data->'app_metadata'->>'results')::jsonb)
FROM temp) AS T;
And once I got the keys structure, I could replicate your approach:
SELECT row_to_json(T)->'jsonb_each'->>'key' as key, row_to_json(T)->'jsonb_each'->>'value' as value
FROM (select jsonb_each((data->'app_metadata'->>'results')::jsonb) FROM temp) AS T

In Apache Pig, select DISTINCT rows based on a single column

Let's say I have a table such as the one below, that may or may not contain duplicates for a given field:
ID URL
--- ------------------
001 http://example.com/adam
002 http://example.com/beth
002 http://example.com/beth?extra=blah
003 http://example.com/charlie
I would like to write a Pig script to find only DISTINCT rows, based on the value of a single field. For instance, filtering the table above by ID should return something like the following:
ID URL
--- ------------------
001 http://example.com/adam
002 http://example.com/beth
003 http://example.com/charlie
The Pig GROUP BY operator returns a bag of tuples grouped by ID, which would work if I knew how to get just the first tuple per bag (perhaps a separate question).
The Pig DISTINCT operator works on the entire row, so in this case all four rows would be considered unique, which is not what I want.
For my purposes, I do not care which of the rows with ID 002 are returned.
I found one way to do this, using the GROUP BY and the TOP operators:
my_table = LOAD 'my_table_file' AS (A, B);
my_table_grouped = GROUP my_table BY A;
my_table_distinct = FOREACH my_table_grouped {
-- For each group $0 refers to the group name, (A)
-- and $1 refers to a bag of entire rows {(A, B), (A, B), ...}.
-- Here, we take only the first (top 1) row in the bag:
result = TOP(1, 0, $1);
GENERATE FLATTEN(result);
}
DUMP my_table_distinct;
This results in one distinct row per ID column:
(001,http://example.com/adam)
(002,http://example.com/beth?extra=blah)
(003,http://example.com/charlie)
I don't know if there is a better approach, but this works for me. I hope this helps others starting out with Pig.
(Reference: http://pig.apache.org/docs/r0.12.1/func.html#topx)
I have found that you can do this with a nested grouping and LIMIT. Using Arel's example:
my_table = LOAD 'my_table_file' AS (A, B);
-- Nested foreach grouping generates bags with same A,
-- limit bags to 1
my_table_distinct = FOREACH (GROUP my_table BY A) {
result = LIMIT my_table 1;
GENERATE FLATTEN(result);
}
DUMP my_table_distinct;
You can use FirstTupleFromBag from Apache DataFu™ (incubating):
register datafu-pig-incubating-1.3.1.jar
define FirstTupleFromBag datafu.pig.bags.FirstTupleFromBag();
my_table_grouped = GROUP my_table BY A;
my_table_grouped_first_tuple = foreach my_table_grouped generate flatten(FirstTupleFromBag(my_table,null));
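All three answers above share the same idea: group the rows by ID and keep one row per group. As a rough illustration of that logic outside Pig, here is a small Python sketch; the function name distinct_by_first_field is hypothetical. Unlike the Pig versions, which make no promise about which row survives, this sketch always keeps the first row seen for each ID.

```python
rows = [
    ("001", "http://example.com/adam"),
    ("002", "http://example.com/beth"),
    ("002", "http://example.com/beth?extra=blah"),
    ("003", "http://example.com/charlie"),
]


def distinct_by_first_field(records):
    """Keep the first record seen for each value of the first field."""
    first_seen = {}
    for record in records:
        # setdefault stores the record only if this ID has not been seen yet.
        first_seen.setdefault(record[0], record)
    # Dicts preserve insertion order, so IDs come out in first-seen order.
    return list(first_seen.values())


distinct_rows = distinct_by_first_field(rows)
for row in distinct_rows:
    print(row)
```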

How do I count the number of objects in SQL Server 2000 and get its modified date?

I need to count the number of objects in a SQL Server 2000 database after restoring it, to make sure that the restore includes the latest updates. I also want to get the latest date on which an object was created or modified.
Specifically, I want counts for the number of tables, views, UDFs and sprocs, along with the date each was created or modified.
select
count(xtype) as [MyCounts],
crdate as [CreateDate],
refdate as [ModifiedDate]
from sysobjects
where xtype like 'U%'
--does not appear to be working correctly.
A very basic solution would simply use grouping and aggregating, like this:
SELECT
xtype,
total_count = COUNT(*),
last_crdate = MAX(crdate),
last_refdate = MAX(refdate)
FROM sysobjects
GROUP BY xtype
This, however, returns information on all types of objects in the current database, including those you didn't mention in your question, like constraints, keys etc.
So you might want to narrow the resulting list by applying a filter on xtype, like this:
SELECT
xtype,
total_count = COUNT(*),
last_crdate = MAX(crdate),
last_refdate = MAX(refdate)
FROM sysobjects
WHERE xtype IN ('U', 'V', 'FN', 'TF', 'IF', 'P')
GROUP BY xtype
Note that there are three types of UDFs in SQL Server. They are designated in sysobjects as follows:
FN – scalar function
TF – multi-statement table-valued function
IF – inline table-valued function
Accordingly the information about functions will be scattered in three rows if you use the above script. If you'd like to group those results in one row, your query would have to be slightly more sophisticated. For example, like this:
SELECT
type,
total_count = COUNT(*),
last_crdate = MAX(crdate),
last_refdate = MAX(refdate)
FROM (
SELECT
type = CASE
WHEN xtype = 'U' THEN 'table'
WHEN xtype = 'V' THEN 'view'
WHEN xtype = 'P' THEN 'proc'
WHEN xtype IN ('FN', 'TF', 'IF') THEN 'udf'
END,
crdate,
refdate
FROM sysobjects
WHERE xtype IN ('FN', 'TF', 'IF', 'P', 'U', 'V')
) s
GROUP BY type
Here the original types are first replaced by custom types based on the xtype value. All rows pertaining to functions are marked simply as udf, regardless of the actual function type, so in the end you can simply group by the custom type column and get the necessary totals, the information on functions now being gathered in one row.
Reference:
sys.sysobjects (Transact-SQL)
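The map-then-group technique from the last query can be tried without a SQL Server instance. The sketch below runs an equivalent query in SQLite through Python's built-in sqlite3 module. The table sysobjects_demo and all of its rows are invented for illustration, and the aliases use SQLite's "expr AS alias" form because the T-SQL "alias = expr" form is not available there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sysobjects_demo (name TEXT, xtype TEXT, crdate TEXT, refdate TEXT)"
)
conn.executemany(
    "INSERT INTO sysobjects_demo VALUES (?, ?, ?, ?)",
    [
        ("Orders", "U", "2010-01-01", "2010-01-01"),
        ("Customers", "U", "2010-02-01", "2010-02-01"),
        ("vOrders", "V", "2010-03-01", "2010-03-05"),
        ("fnTax", "FN", "2010-04-01", "2010-04-01"),
        ("tfItems", "TF", "2010-05-01", "2010-05-02"),
        ("ifRecent", "IF", "2010-06-01", "2010-06-01"),
        ("spLoad", "P", "2010-07-01", "2010-07-01"),
        ("PK_Orders", "PK", "2010-01-01", "2010-01-01"),  # filtered out below
    ],
)

# Map the raw xtype codes to custom labels first, then group by the label,
# so the three function types collapse into a single 'udf' row.
query = """
SELECT object_type,
       COUNT(*)     AS total_count,
       MAX(crdate)  AS last_crdate,
       MAX(refdate) AS last_refdate
FROM (
    SELECT CASE
               WHEN xtype = 'U' THEN 'table'
               WHEN xtype = 'V' THEN 'view'
               WHEN xtype = 'P' THEN 'proc'
               WHEN xtype IN ('FN', 'TF', 'IF') THEN 'udf'
           END AS object_type,
           crdate,
           refdate
    FROM sysobjects_demo
    WHERE xtype IN ('FN', 'TF', 'IF', 'P', 'U', 'V')
) AS s
GROUP BY object_type
ORDER BY object_type
"""
summary = {row[0]: row[1:] for row in conn.execute(query)}
for object_type, stats in summary.items():
    print(object_type, stats)
```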
You should post the errors that you're getting so it's easier to diagnose. However, looking at your query, it's likely because you're missing a GROUP BY to accompany the COUNT aggregation you're attempting to do.
Now, the real question is how do you display a COUNT aggregation along with line-specific information like the created date?
If there are 5 views and 4 procs, what does each line look like? What are the column headers? Is the COUNT shown on each row along with the detail for that item, like this?
select
c.cnt as [MyCounts],
s.name as [Name],
s.xtype as [Type],
s.crdate as [CreateDate],
s.refdate as [ModifiedDate]
from
sysobjects s
inner join (select COUNT(1) cnt, xtype from sysobjects group by xtype) c
on s.xtype = c.xtype
where
s.xtype like 'U%'