How to improve this query using array or json constructor? - postgresql

I am wondering how to improve this query to return something better to work with. Let me show the tables, the current query and my idea first:
Tables
users
nfts (here owner_id is a FK to users.id)
users_nfts (here I save all the creators of the NFT; one NFT can have more than one creator. This is used to calculate royalties later.)
Current query and explanation
To be able to code a "buy" nft process (in nodejs) I want to retrieve some data about the nft to buy:
Its price
Its current owner
The creators (to calculate the royalties and update their balances)
SELECT nfts.id, price, owner_id, owner.balance as owner_balance, creators.user_id, users.balance
FROM nfts
INNER JOIN users_nfts as creators
ON nfts.id = creators.nft_id
INNER JOIN users
ON creators.user_id = users.id
INNER JOIN users as owner
ON nfts.owner_id = owner.id
WHERE nfts.id = ${nft_id}
The query works, but it returns repeated data: the NFT and owner columns are duplicated once per creator row, and that duplication is what I want to eliminate.
What I would like to achieve
I would like a query that returns all the data about the NFT in a single row. To do that, the creators' user_id and balance need to come back as an array of tuples or as JSON.
The result in my backend could be something like (any ideas here are welcome):
{
"id": "ea850c65-818e-40bd-bb06-af69eaeda4a6", // nft id
"price": 42,
"owner_id": "1134e9e0-02ae-4567-9adf-220ead36a6ef",
"owner_balance": 100,
"creators": [
{
"user_id": "1134e9e0-02ae-4567-9adf-220ead36a6ef",
"balance": 100
},
{
"user_id": "2134e9e0-02ae-4567-9adf-220ead36a6ea",
"balance": 35
}
]
}
Thanks in advance for any tips :)

I achieved what I wanted with this query, using json_agg and json_build_object:
SELECT nfts.id, price, owner_id, owner.balance AS owner_balance,
       json_agg(json_build_object('user_id', creators.user_id, 'balance', users.balance)) AS creators
FROM nfts
INNER JOIN users_nfts as creators
ON nfts.id = creators.nft_id
INNER JOIN users
ON creators.user_id = users.id
INNER JOIN users as owner
ON nfts.owner_id = owner.id
WHERE nfts.id = ${nft_id}
GROUP BY nfts.id, owner.id;
The query produces the following output:
{
"rows": [
{
"id": "d87716ec-4005-4ccb-9970-6769adec3aa1",
"price": 42,
"owner_id": "7dd619dd-b997-4351-9541-4d8989c58667",
"owner_balance": 58,
"creators": [
{
"user_id": "1134e9e0-02ae-4567-9adf-220ead36a6ef",
"balance": 137.8
},
{
"user_id": "492851bb-dead-4c9d-b9f6-271dcf07a8bb",
"balance": 104.2
}
]
}
]
}
Hope someone else can find this useful.
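Since the goal is a "buy" process in Node, here is a hedged sketch of consuming that single row to compute the balance updates. The royaltyRate and the equal split among creators are assumptions for illustration only; the question does not state the actual royalty rule.

```javascript
// Hypothetical buy-process step over the single row returned by the
// json_agg query. royaltyRate and the equal split are assumptions --
// the question does not specify the real royalty formula.
const row = {
  id: "d87716ec-4005-4ccb-9970-6769adec3aa1",
  price: 42,
  owner_id: "7dd619dd-b997-4351-9541-4d8989c58667",
  owner_balance: 58,
  creators: [
    { user_id: "1134e9e0-02ae-4567-9adf-220ead36a6ef", balance: 137.8 },
    { user_id: "492851bb-dead-4c9d-b9f6-271dcf07a8bb", balance: 104.2 },
  ],
};
const royaltyRate = 0.1; // assumed 10% royalty pool
const royaltyPool = row.price * royaltyRate;
const perCreator = royaltyPool / row.creators.length; // assumed equal split
const updates = row.creators.map((c) => ({
  user_id: c.user_id,
  newBalance: c.balance + perCreator,
}));
const ownerProceeds = row.price - royaltyPool; // what the seller receives
```

Because the creators arrive as a real array instead of repeated rows, this logic needs no deduplication before updating balances.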

Related

Sort by json element at nested level for jsonb data - postgresql

I have the table below in PostgreSQL, which stores JSON data in a jsonb column.
CREATE TABLE "Trial" (
id SERIAL PRIMARY KEY,
data jsonb
);
Below is the sample JSON structure:
{
"id": "000000007001593061",
"core": {
"groupCode": "DVL",
"productType": "ZDPS",
"productGroup": "005001000"
},
"plants": [
{
"core": {
"mrpGroup": "ZMTS",
"mrpTypeDesc": "MRP",
"supLeadTime": 777
},
"storageLocation": [
{
"core": {
"storageLocation": "H050"
}
},
{
"core": {
"storageLocation": "H990"
}
},
{
"core": {
"storageLocation": "HM35"
}
}
]
}
],
"discriminator": "Material"
}
These are the scripts for inserting the JSON data:
INSERT INTO "Trial"(data)
VALUES(CAST('{"id":"000000007001593061","core":{"groupCode":"DVL","productType":"ZDPS","productGroup":"005001000"},"plants":[{"core":{"mrpGroup":"ZMTS","mrpTypeDesc":"MRP","supLeadTime":777},"storageLocation":[{"core":{"storageLocation":"H050"}},{"core":{"storageLocation":"H990"}},{"core":{"storageLocation":"HM35"}}]}],"discriminator":"Material"}' AS JSON))
INSERT INTO "Trial"(data)
VALUES(CAST('{"id":"000000000104107816","core":{"groupCode":"ELC","productType":"ZDPS","productGroup":"005001000"},"plants":[{"core":{"mrpGroup":"ZCOM","mrpTypeDesc":"MRP","supLeadTime":28},"storageLocation":[{"core":{"storageLocation":"H050"}},{"core":{"storageLocation":"H990"}}]}],"discriminator":"Material"}' AS JSON))
INSERT INTO "Trial"(data)
VALUES(CAST('{"id":"000000000104107818","core":{"groupCode":"DVK","productType":"ZDPS","productGroup":"005001000"},"plants":[{"core":{"mrpGroup":"ZMTL","mrpTypeDesc":"MRP","supLeadTime":28},"storageLocation":[{"core":{"storageLocation":"H050"}},{"core":{"storageLocation":"H990"}}]}]}' AS JSON))
Sorting at the first level works:
select id,data->'core'->'groupCode'
from "Trial"
order by data->'core'->'groupCode' desc
But when I try to sort at a nested level with the script below, it doesn't work. I'm sure I'm doing something wrong here, but I don't know what:
select id,data->'plants'
from sap."Trial"
order by data->'plants'->'core'->'mrpGroup' desc
The query below works for me. The earlier attempt fails because data->'plants' is a JSON array, so data->'plants'->'core' yields NULL for every row; jsonb_path_query_array (available since PostgreSQL 12) walks the array and collects every mrpGroup into an array that can be sorted:
SELECT id, data
FROM "Trial"
ORDER BY jsonb_path_query_array(data, '$.plants[*].core[*].mrpGroup') DESC
LIMIT 100
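For intuition about what that jsonpath is doing, here is a rough plain-JavaScript analogue run against documents shaped like the samples above (an illustration of the traversal only, not how PostgreSQL actually evaluates it):

```javascript
// Collect every mrpGroup under plants (like '$.plants[*].core[*].mrpGroup'),
// then sort rows by the resulting array of values, descending.
const rows = [
  { id: 1, data: { plants: [{ core: { mrpGroup: "ZMTS" } }] } },
  { id: 2, data: { plants: [{ core: { mrpGroup: "ZCOM" } }] } },
  { id: 3, data: { plants: [{ core: { mrpGroup: "ZMTL" } }] } },
];
const key = (data) => (data.plants ?? []).map((p) => p.core?.mrpGroup ?? "");
const cmp = (x, y) => (x < y ? -1 : x > y ? 1 : 0);
const sorted = [...rows].sort((a, b) =>
  cmp(JSON.stringify(key(b.data)), JSON.stringify(key(a.data)))
);
// descending: ZMTS, then ZMTL, then ZCOM
```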

Sequelize association not returning value

I am trying to understand associations in Sequelize. I am able to get one association to work, but when I replicate the same query nothing gets returned. I don't understand the relationships that well, even after reading the user guide.
I can get the data from "stages" but am not able to successfully join the "status" table and retrieve its data.
I was able to find a workaround where you can create the association from within findAll():
db.leads
  .findAll({
    attributes: ["id", "name", "title", "company", "workPhone", "mobilePhone", "otherPhone", "email", "dateCreated"],
    include: [
      {
        model: db.stages,
        association: db.leads.hasMany(db.stages, { foreignKey: "id", targetKey: "id" }),
        on: {
          [Op.and]: [
            db.sequelize.where(
              db.sequelize.col("stages.id"),
              Op.eq, // '='
              db.sequelize.col("leads.stageID")
            ),
          ],
        },
        attributes: ["name"],
      },
      {
        model: db.status,
        association: db.leads.belongsToMany(db.status, { through: "id" }),
        on: {
          [Op.and]: [
            db.sequelize.where(
              db.sequelize.col("status.id"),
              Op.eq, // '='
              db.sequelize.col("leads.statusID")
            ),
          ],
        },
        attributes: ["name"],
      },
    ],
    where: {
      ownerID: req.query.ownerID,
    },
    subQuery: false,
    duplicating: false,
  })
This is what gets returned:
{
"id": "920cc536-48ae-40ee-8c5b-e1bfedbec602",
"name": "Dummy Lead",
"title": "Dummy",
"company": "Dummy",
"workPhone": "000-000-0000",
"mobilePhone": "000-000-0000",
"otherPhone": "000-000-0000",
"email": "Dummy#Dummy.com",
"dateCreated": "2022-06-18T13:30:09.676Z",
"stages": [
{
"name": "Qualify"
}
],
"status_types": []
}
Below are my tables: Leads, Stages, Status (screenshots omitted).
In general, I believe the recommended approach to associations is to set them up in your models. This will make your queries easier to write, and also help from a DRY perspective. With the associations set up in models, your include would look something like
include: [
  {
    model: db.stages,
    attributes: ["name"],
  },
]
This would LEFT JOIN Stages with Leads. (NB: By adding a property of required: true to the object, you can make that perform an INNER JOIN).
The associations you're trying to call in these queries also do not really seem to match your table structure. If Leads hasMany Stages, then why is there a stageID in the Leads table? Having the foreign key in Leads suggests that this is a hasOne or belongsTo relation from the Leads side. If Leads SHOULD have many Stages, then the foreign key should exist on the Stages table, so that multiple Stages can be associated with a single Lead.
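The foreign-key placement argument can be illustrated with plain objects (hypothetical data, not Sequelize itself): because each lead row carries stageID, resolving a lead's stage is a child-to-parent lookup, which is exactly what belongsTo models.

```javascript
// Each lead holds the FK (stageID), so many leads can point at one stage:
// that is a Lead belongsTo Stage relation, not Lead hasMany Stage.
const stages = [{ id: 1, name: "Qualify" }];
const leads = [
  { id: "a", name: "Dummy Lead", stageID: 1 },
  { id: "b", name: "Other Lead", stageID: 1 },
];
// A belongsTo include is effectively this lookup from the child's FK:
const withStage = leads.map((lead) => ({
  ...lead,
  stage: stages.find((s) => s.id === lead.stageID) ?? null,
}));
// both leads resolve to the single "Qualify" stage
```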
Defining Leads with a belongsToMany relationship on Status also appears to be incorrect. BelongsToMany is the association method for Many to Many relationships through a junction table. If you do want to have a Many to Many relationship here (Where a Lead can have many Statuses, and Statuses can belong to many Leads), you will need to create that belongsToMany association on each side with a junction table (LeadStatuses). If a Lead should only have one status, however, the association should be lead.belongsTo() and status.hasMany().
I would take a long read about Sequelize Associations to try to take all of this in. It's challenging stuff that you will likely need to revisit several times, but it will make your life much easier once implemented correctly.

Creating an AND query on a list of items in Azure Cosmos

I'm building an application in Azure Cosmos and I'm having trouble creating a query. Using the dataset below, I want to create a query that only finds CharacterId "Susan" by searching for all characters that have the TraitId of "Athletic" and "Slim".
Here is my JSON data set
[
{
"characterId": "Bob",
"traits": [
{ "traitId": "Athletic" },
{ "traitId": "Overweight" }
]
},
{
"characterId": "Susan",
"traits": [
{ "traitId": "Athletic" },
{ "traitId": "Slim" }
]
},
{
"characterId": "Jerry",
"traits": [
{ "traitId": "Slim" },
{ "traitId": "Strong" }
]
}
]
The closest I've come is this query, but it acts as an OR statement and what I want is an AND:
SELECT * FROM Characters f WHERE f.traits IN ("Athletic", "Slim")
Any help is greatly appreciated.
EDITED: I figured out the answer to this question. If anyone is interested, this query gives the results I was looking for:
SELECT * FROM Characters f
WHERE EXISTS (SELECT VALUE t FROM t IN f.traits WHERE t.traitId = 'Athletic')
AND EXISTS (SELECT VALUE t FROM t IN f.traits WHERE t.traitId = 'Slim')
The answer that worked for me is to use EXISTS subqueries that search the traits list. In my program I can use a StringBuilder to build the SQL statement, concatenating an AND EXISTS clause for each of the traits I want to find.
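For intuition, the AND-of-EXISTS semantics can be mimicked in plain JavaScript (an illustration of the logic, not how Cosmos executes the query):

```javascript
// A character matches only when EVERY required trait appears
// somewhere in its traits array -- AND semantics, not OR.
const characters = [
  { characterId: "Bob", traits: [{ traitId: "Athletic" }, { traitId: "Overweight" }] },
  { characterId: "Susan", traits: [{ traitId: "Athletic" }, { traitId: "Slim" }] },
  { characterId: "Jerry", traits: [{ traitId: "Slim" }, { traitId: "Strong" }] },
];
const required = ["Athletic", "Slim"];
const matches = characters.filter((c) =>
  required.every((want) => c.traits.some((t) => t.traitId === want))
);
// only "Susan" carries both traits
```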

how to gather several fields from linked data in one query?

I have a graph DB with Members and Pages; some members can be ExpertOf some pages. I am trying to build a query that returns the experts of a given page, together with all the pages they are experts of.
To sum up, I have a simple db : Member---(ExpertOf)--->Page
My (mostly) working query is:
SELECT #rid,title,out('ExpertOf') AS expertises FROM
(SELECT expand(in('ExpertOf')) FROM Page WHERE #rid=16:299)
It works as expected, returning:
{
"result": [
{
"#type": "d",
"#rid": "#-2:0",
"#version": 0,
"rid": "#17:0",
"title": "John Doe",
"expertises": [
"#16:299",
"#16:221",
"#15:160",
"#16:94",
"#16:714"
],
"#fieldTypes": "rid=x,expertises=z"
}
],
"notification": "Query executed in 0.057 sec. Returned 1 record(s)"
}
(btw, I wonder what this #rid of #-2:0 is...)
But now, instead of having pages #rid, I would like to have both #rid and title...
I've tried:
SELECT #rid,title,out('ExpertOf').title AS expertises FROM
(SELECT expand(in('ExpertOf')) FROM Page WHERE #rid=16:299)
or (same result)
SELECT #rid,title,out('ExpertOf').include('title') AS expertises FROM
(SELECT expand(in('ExpertOf')) FROM Page WHERE #rid=16:299)
which gives:
"expertises": [
"Wave on string",
"USE A SLOPE",
"Spin coating",
"Gas hydrate",
"Mpemba effect"
],
then
SELECT #rid,title, out('ExpertOf').include('#rid','title') AS
expertises FROM (SELECT expand(in('ExpertOf')) FROM Page WHERE #rid=16:299)
which returns:
"expertises": [
"#16:299",
"#16:221",
"#15:160",
"#16:94",
"#16:714"
],
whereas I would have hoped
"expertises": [
{ "#rid":"#16:299", "title":"Wave on string" },
{ "#rid":"#16:221", "title":"USE A SLOPE" },
{ "#rid":"#15:160", "title":"Spin coating" },
{ "#rid":"#16:94", "title":"Gas hydrate" },
{ "#rid":"#16:714", "title":"Mpemba effect" }
],
I've tried expand(out('ExpertOf').include('#rid','title')), unwind expertises, and unionAll(out('ExpertOf').#rid, out('ExpertOf').title) as explained elsewhere, and so on, but no query has given the hoped-for result.
Is there a way to get this kind of result? (I've succeeded in making it work with a function calling a query on Page, but I am wondering if this can be done in one query, and whether my solution is efficient.)
Thanks
If you are using HTTP and you want to obtain a nested JSON, you can use fetchplans:
SELECT #rid,title,out('ExpertOf') AS expertises FROM
(SELECT expand(in('ExpertOf')) FROM Page WHERE #rid=16:299)
fetchplan *:0 expertises.rid:1 expertises.title:1 expertises:-2
Instead of this:
SELECT #rid,title,out('ExpertOf').include('title') AS expertises FROM
(SELECT expand(in('ExpertOf')) FROM Page WHERE #rid=16:299)
try this one:
SELECT #rid,title,out('ExpertOf')['title'] AS expertises FROM
(SELECT expand(in('ExpertOf')) FROM Page WHERE #rid=16:299)

How would you model this in MongoDB?

There are products with a name and price.
Users log the products they have bought.
# option 1: embed logs
product = { id, name, price }
user = { id,
name,
logs : [{ product_id_1, quantity, datetime, comment },
{ product_id_2, quantity, datetime, comment },
... ,
{ product_id_n, quantity, datetime, comment }]
}
I like this. But if product ids are 12 bytes long, quantity and datetime are 32-bit (4-byte) integers, and comments average 100 bytes, then the size of one log is 12+4+4+100 = 120 bytes. The maximum size of a document is 4 MB, so the maximum number of logs per user is 4 MB / 120 bytes = 33,333. If we assume a user logs 10 purchases per day, the 4 MB limit is reached in 33,333/10 = 3,333 days, roughly 9 years. Well, 9 years is probably fine, but what if we needed to store even more data? What if the user logs 100 purchases per day?
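A quick sanity check of the arithmetic above, taking 4 MB as 4,000,000 bytes as the estimate does:

```javascript
// Size of one embedded log entry, per the estimate in the text.
const logBytes = 12 + 4 + 4 + 100; // ObjectId + quantity + datetime + comment
const maxLogs = Math.floor(4_000_000 / logBytes);
const daysAt10 = Math.floor(maxLogs / 10);
const daysAt100 = Math.floor(maxLogs / 100);
// 120 bytes per log, ~33,333 logs; ~3,333 days (~9 years) at 10 logs/day,
// but only ~333 days (under a year) at 100 logs/day
```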
What is the other option here? Do I have to normalize this fully?
# option 2: normalized
product = { id, name, price }
log = { id, user_id, product_id, quantity, datetime, comment }
user = { id, name }
Meh. We are back to relational.
If size is the main concern, you can go ahead with option 2 using Mongo DBRefs.
logs : [{ product_id_1, quantity, datetime, comment },
{ product_id_2, quantity, datetime, comment },
... ,
{ product_id_n, quantity, datetime, comment }]
and embed these logs inside the user using DBRefs, something like:
var log = { product_id: "xxx", quantity: "2", comment: "something" };
db.logs.save(log);
var user = { id: "xx", name: "Joe", logs: [new DBRef("logs", log._id)] };
db.users.save(user);
Yes, option 2 is your best bet. Yes, you're back to a relational model, but then, your data is best modeled that way. I don't see a particular downside to option 2; it's your data that is requiring you to go that way, not a bad design process.