Fetch the second max date JSON object using a SQL query - PostgreSQL

I'm trying to fetch the JSON object with the second-latest date from a jsonb column.
Here is the jsonb column:
--------
value
--------
{
  "id": "90909",
  "records": [
    {
      "name": "john",
      "date": "2016-06-16"
    },
    {
      "name": "kiran",
      "date": "2017-06-16"
    },
    {
      "name": "koiy",
      "date": "2018-06-16"
    }
  ]
}
How do I select the JSON object with the second maximum date?
Expected output:
{
  "name": "kiran",
  "date": "2017-06-16"
}
If records contains only one object, that object counts as the second max date.
Any suggestions would also be helpful.

My main suggestion would be this: if your data is structured, do not store it as JSON. It will be much easier to work with if you structure it as relational tables.
But anyhow, here's one way to get the object with the second-latest date. First unpack the array, then sort by date descending and take the second row:
SELECT obj.*
FROM your_table, jsonb_array_elements(value->'records') obj
ORDER BY obj->'date' DESC
LIMIT 1 OFFSET 1;
obj
-----------------------------------------
{"date": "2017-06-16", "name": "kiran"}
(1 row)
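Note that LIMIT 1 OFFSET 1 returns no row at all when records holds a single element. A minimal sketch of the asker's fallback rule (return the only object when there is just one), done client-side with psycopg2; the connection string and the names your_table/value are assumptions:
```python
import json
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection string

with conn.cursor() as cur:
    cur.execute("SELECT value->'records' FROM your_table")
    records = cur.fetchone()[0]  # psycopg2 returns jsonb as Python objects

# ISO dates sort correctly as plain strings, so no date parsing is needed.
records.sort(key=lambda r: r["date"], reverse=True)

# Second-latest object if there are two or more, otherwise the only one.
second_max = records[1] if len(records) > 1 else records[0]
print(json.dumps(second_max))  # {"name": "kiran", "date": "2017-06-16"}
```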

Related

ADF: use the output from a lookup activity on another activity in Data Factory

I have a lookup activity (Get_ID) that returns:
{
  "count": 2,
  "value": [
    {
      "TRGT_VAL": "10000"
    },
    {
      "TRGT_VAL": "52000"
    }
  ],
  (...)
I want to use these two values from TRGT_VAL in a WHERE clause of a query in another activity. I'm using
@concat('SELECT * FROM table WHERE column in ', activity('Get_ID').output.value[0].TRGT_VAL)
but only the first value, 10000, is being taken into account. How do I get the whole list?
I solved it by using a lot of replaces:
@concat('(',replace(replace(replace(replace(replace(replace(replace(string(activity('Get_ID').output.value),'{',''),' ',''),'"',''),'TRGT_VAL:',''),'[',''),'}',''),']',''),')')
Output:
{
  "name": "AptitudeCF",
  "value": "(10000,52000)"
}
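For clarity, here is what that replace chain computes, as a plain Python sketch over the lookup output shown above:
```python
# The lookup output from Get_ID (truncated fields omitted).
output = {"count": 2, "value": [{"TRGT_VAL": "10000"}, {"TRGT_VAL": "52000"}]}

# Pull each TRGT_VAL and wrap the comma-separated list in parentheses.
in_list = "(" + ",".join(row["TRGT_VAL"] for row in output["value"]) + ")"
print("SELECT * FROM table WHERE column in " + in_list)
# SELECT * FROM table WHERE column in (10000,52000)
```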
Instead of a big expression with a lot of replace functions, you can use string interpolation syntax and frame your query. Below is a query you can consider:
SELECT * FROM table WHERE column in (@{activity('Get_ID').output.value[0].TRGT_VAL},@{activity('Get_ID').output.value[1].TRGT_VAL})

Aggregate results based on array of strings in JSON?

I have a table with a field called 'keywords'. It is a JSONB field with an array of keyword metadata, including the keyword's name.
What I would like is to query the counts of all these keywords by name, i.e. aggregate on keyword name and count(id). All the examples of GROUP BY queries I can find just group on the full list (i.e. they only give me counts where two records have the same set of keywords).
So is it possible to somehow expand the list of keywords in a way that lets me get these counts?
If not, I am still at the planning stage and could refactor my schema to better handle this.
"keywords": [
{
"addedAt": "2017-04-07T21:11:00+0000",
"addedBy": {
"email": "foo#bar.com"
},
"keyword": {
"name": "Animal"
}
},
{
"addedAt": "2017-04-07T20:54:00+0000",
"addedBy": {
"email": "foo#bar.comm"
},
"keyword": {
"name": "Mammal"
}
}
]
step-by-step demo: db<>fiddle
SELECT
    elems -> 'keyword' ->> 'name' AS keyword,            -- 2
    COUNT(*) AS count
FROM
    mytable t,
    jsonb_array_elements(myjson -> 'keywords') AS elems  -- 1
GROUP BY 1                                               -- 3
1. Expand the array of records into one row per element.
2. Get each keyword's name.
3. Group by these text values.
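Against the sample array above (assuming it is the only row in mytable), the query returns one row per distinct keyword name:
```
 keyword | count
---------+-------
 Animal  |     1
 Mammal  |     1
```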

Building query in Postgres 9.4.2 for JSONB datatype using builtin function

I have a table schema as follows:
DummyTable
-------------
someData JSONB
All my values will be JSON objects. For example, a select * from DummyTable would look like:
someData(JSONB)
------------------
{"values":["P1","P2","P3"],"key":"ProductOne"}
{"values":["P3"],"key":"ProductTwo"}
I want a query which will give me result set as follows:
[
  {
    "values": ["P1","P2","P3"],
    "key": "ProductOne"
  },
  {
    "values": ["P4"],
    "key": "ProductTwo"
  }
]
I'm using Postgres version 9.4.2. I looked at its documentation page, but could not find a query that would give the above result.
I can build the JSON in my API by iterating over the rows, but I would prefer the query to do it. I tried json_build_array and row_to_json on the result of select * from table_name, but no luck.
Any help would be appreciated.
Here is the link I looked for to write a query for JSONB
You can use json_agg or jsonb_agg:
create table dummytable(somedata jsonb not null);
insert into dummytable(somedata) values
('{"values":["P1","P2","P3"],"key":"ProductOne"}'),
('{"values":["P3"],"key":"ProductTwo"}');
select jsonb_pretty(jsonb_agg(somedata)) from dummytable;
Result:
[
    {
        "key": "ProductOne",
        "values": [
            "P1",
            "P2",
            "P3"
        ]
    },
    {
        "key": "ProductTwo",
        "values": [
            "P3"
        ]
    }
]
However, retrieving the data row by row and building the array on the client side can be more efficient, as the server can start to send data much sooner: right after it retrieves the first matching row from storage. If it has to build the JSON array first, it needs to retrieve all the rows and merge them before it can start sending anything.
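A minimal sketch of that client-side approach with psycopg2 (the connection string is an assumption); rows can be consumed as they stream in, instead of waiting for jsonb_agg to finish on the server:
```python
import json
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection string

with conn.cursor() as cur:
    cur.execute("SELECT somedata FROM dummytable")
    # Each row's jsonb arrives already parsed; collect into one list.
    result = [row[0] for row in cur]

print(json.dumps(result, indent=2))
```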

Using pymongo, how can I deal with nested JSON?

To be more specific, I loaded the data into MongoDB with pymongo using this script:
header = ['id', 'info']
for each in reader:
    row = {}
    for field in header:
        row[field] = each[field]
    db.segment.insert_one(row)
The id column holds each user's unique ID, and the info column is nested JSON.
For example, here is a document in the db:
{
  u'_id': ObjectId('111'),
  u'id': u'123',
  u'info': {
    "TYPE": "food",
    "dishes": "166",
    "cc": "20160327 040001",
    "country": "japan",
    "money": 3521,
    "info2": [{"type": "dishes", "number": "2"}]
  }
}
What I want to do is read the values inside the nested JSON.
So what I did is:
pipe = [{"$group": {"_id": "$id", "Totalmoney": {"$sum": "$info.money"}}}]
total_money = db.segment.aggregate(pipeline=pipe)
but the result of the sum is always 0 for every id.
What am I doing wrong? How can I fix it?
I have to use MongoDB because the data is too big to handle in Python.
Thank you in advance.
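One common cause, offered only as a guess: if reader is a csv.DictReader, then each[field] is a plain string, so info would be stored as one JSON string rather than a subdocument, and "$info.money" would resolve to nothing ($sum over a missing field yields 0). A sketch of the loading script with the string parsed before insert (reader and db as in the original snippet):
```python
import json

header = ['id', 'info']
for each in reader:
    row = {}
    for field in header:
        row[field] = each[field]
    # Parse the JSON text so 'info' becomes a real subdocument; dotted
    # paths like "$info.money" only resolve against embedded documents.
    row['info'] = json.loads(row['info'])
    db.segment.insert_one(row)
```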

momentjs date collation from a json table

Background
momentjs 2.8.3
angularjs
collating dates in a date table
Problem
Trevor wishes to get the global timespan of dates in a date table, where each record contains a start date and an end date.
Goal
The goal is to get a global timespan, such that the earliest part of the timespan reflects the earliest date in any row of the table, and the latest part reflects the latest date in any row of the table.
Trevor does not know in advance how the dates are arranged in the table, other than that they are all formatted as 'YYYY-MM-DD'.
Trevor is sold on momentjs as the most effective js library for handling this kind of problem, but he is open to using any others.
Details
The data is all encoded in JSON and structured as below.
```
dataroot {
  "datedemo_data_table": [
    {
      "datebeg": "2014-01-15",
      "dateend": "2014-02-15"
    },
    {
      "datebeg": "2014-03-15",
      "dateend": "2015-01-01"
    },
    {
      "datebeg": "2015-06-15",
      "dateend": "2015-07-20"
    },
    {
      "datebeg": "2012-08-15",
      "dateend": "2013-08-15"
    },
    {
      "datebeg": "2013-01-15",
      "dateend": "2013-01-16"
    }
  ],
  "datedemosummary_data_dict": {
    "x": "x",
    "ds_soonst_date": "",
    "ds_latest_date": ""
  }
}
```
The goal is to populate the ds_soonst_date and ds_latest_date with the correct date values.
Questions
Is momentjs the best library for a task such as this?
Are there any performance implications for large data tables (over 10k records)?
You actually don't need moment (or any library) for this. Since the values are in YYYY-MM-DD format, they are sortable as strings. Simple array/object manipulation will work.
var data = JSON.parse('{"datedemo_data_table":[{"datebeg":"2014-01-15","dateend":"2014-02-15"},{"datebeg":"2014-03-15","dateend":"2015-01-01"},{"datebeg":"2015-06-15","dateend":"2015-07-20"},{"datebeg":"2012-08-15","dateend":"2013-08-15"},{"datebeg":"2013-01-15","dateend":"2013-01-16"}],"datedemosummary_data_dict":{"x":"x","ds_soonst_date":"","ds_latest_date":""}}');
var firstBegDate = data.datedemo_data_table
    .map(function(x){ return x.datebeg; })
    .sort().shift();
var lastEndDate = data.datedemo_data_table
    .map(function(x){ return x.dateend; })
    .sort().pop();
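With the sample table above, these evaluate to the earliest begin date and the latest end date:
```
firstBegDate: "2012-08-15"
lastEndDate:  "2015-07-20"
```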
As far as performance goes - if you have 10k items in a single JSON, that's probably an issue right there. You will always have O(n) performance with any approach unless you use an index to reduce the data to start with.
Answer
momentjs is an excellent choice, as it is a well-documented and feature-rich library.
The performance question is not addressed here; perhaps someone else can chime in on that.
Nevertheless, with a small table of a few values, you can get a quick result by collating the dates into a single JavaScript array and then extracting the min and max using the relevant momentjs functions.
This can be done easily as follows:
Solution
var fmt = 'YYYY-MM-DD'
   ,ddtemp = $scope.dataroot.datedemosummary_data_dict
   ,aatemp_dates = []
   ;
$scope.dataroot.datedemo_data_table.forEach(function(currow, ixx, arr) {
    aatemp_dates.push(moment(currow.datebeg, fmt));
    aatemp_dates.push(moment(currow.dateend, fmt));
}, ddtemp);
ddtemp.ds_soonst_date = moment.min(aatemp_dates).format(fmt);
ddtemp.ds_latest_date = moment.max(aatemp_dates).format(fmt);
Result
dataroot {
  "datedemo_data_table": [
    {
      "datebeg": "2014-01-15",
      "dateend": "2014-02-15"
    },
    {
      "datebeg": "2014-03-15",
      "dateend": "2015-01-01"
    },
    {
      "datebeg": "2015-06-15",
      "dateend": "2015-07-20"
    },
    {
      "datebeg": "2012-08-15",
      "dateend": "2013-08-15"
    },
    {
      "datebeg": "2013-01-15",
      "dateend": "2013-01-16"
    }
  ],
  "datedemosummary_data_dict": {
    "x": "x",
    "ds_soonst_date": "2012-08-15",
    "ds_latest_date": "2015-07-20"
  }
}
See also
momentjs #min
momentjs #max
momentjs range addon library by gf3: https://github.com/gf3/moment-range