Using Oracle Path Expressions on JSON arrays - oracle12c

I am attempting to write a simple query with path expressions against an Oracle JSON data structure that will return the student name and the name of their CS220 teacher (if they are taking that class).
The JSON:
{
'studentName': 'John Smith',
'classes': [
{
'className': 'CS115',
'teacherName': 'Sally Wilson'
},
{
'className': 'CS220',
'teacherName': 'Jason Wu'
}
]
}
Expected output:
Student Name    Professor
John Smith      Jason Wu
Jane Doe                        << Not taking CS220
Ajay Kumar      Robert Kroll
The query I would hope to write:
Select
jsonfield.studentName,
jsonfield.classes.<some path expression to find the CS220 professor here>
from mytable
The only solution I have found is to project out the nested 'classes' into a table and join that to the query above to get the professor. I would have thought that Oracle's json path implementation would be able to solve this without the overhead/complexity of a second query.

In 12cR1 you could do something like:
select jt.studentname,
max(case when jt.classname = 'CS220' then jt.teachername end) as teachername
from mytable mt
cross join json_table (
mt.jsonfield,
'$'
columns (
studentname varchar2(30) path '$.studentName',
nested path '$.classes[*]' columns (
classname varchar2(30) path '$.className',
teachername varchar2(30) path '$.teacherName'
)
)
) jt
group by jt.studentname;
The json_table() splits the JSON into relational columns; the nested path means you get one row per class (per student), with the relevant class names and teacher names.
The select list then uses a case expression to change the teacher name to null for any other classes - so John Smith gets one row with CS220 and Jason Wu, and one row with CS115 and null. Aggregating that with max() collapses those so all the irrelevant teachers are ignored.
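The case-plus-max() step can be tried in isolation. This is only a rough sketch, assuming the JSON has already been flattened to one row per student and class (the job json_table() does above); the table name flat and the use of SQLite through Python are my own stand-ins, not part of the Oracle solution.

```python
import sqlite3

# Hypothetical flattened rows, shaped like the json_table() output:
# one row per (student, class).
rows = [
    ("John Smith", "CS115", "Sally Wilson"),
    ("John Smith", "CS220", "Jason Wu"),
    ("Jane Doe", "CS115", "Sally Wilson"),
    ("Ajay Kumar", "CS220", "Robert Kroll"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table flat (studentname text, classname text, teachername text)")
conn.executemany("insert into flat values (?, ?, ?)", rows)

# The case expression yields null for every class other than CS220;
# max() ignores nulls, so each student collapses to a single row.
result = dict(conn.execute(
    """
    select studentname,
           max(case when classname = 'CS220' then teachername end) as teachername
    from flat
    group by studentname
    """
).fetchall())

print(result)
```

A student with no CS220 row still appears, with None (null) as the teacher, which matches the blank entry for Jane Doe in the expected output.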
With some expanded sample data:
create table mytable (jsonfield clob check (jsonfield is json));
insert into mytable a(jsonfield) values (q'#{
'studentName': 'John Smith',
'classes': [
{
'className': 'CS115',
'teacherName': 'Sally Wilson'
},
{
'className': 'CS220',
'teacherName': 'Jason Wu'
}
]
}#');
insert into mytable a(jsonfield) values (q'#{
'studentName': 'Jane Doe',
'classes': [
{
'className': 'CS115',
'teacherName': 'Sally Wilson'
}
]
}#');
insert into mytable a(jsonfield) values (q'#{
'studentName': 'Ajay Kumar',
'classes': [
{
'className': 'CS220',
'teacherName': 'Robert Kroll'
}
]
}#');
The basic json_table() call gets:
select jt.*,
case when jt.classname = 'CS220' then jt.teachername end as adjusted_teachername
from mytable mt
cross join json_table (
mt.jsonfield,
'$'
columns (
studentname varchar2(30) path '$.studentName',
nested path '$.classes[*]' columns (
classname varchar2(30) path '$.className',
teachername varchar2(30) path '$.teacherName'
)
)
) jt;
STUDENTNAME  CLASSNAME  TEACHERNAME   ADJUSTED_TEACHERNAME
-----------  ---------  ------------  --------------------
John Smith   CS115      Sally Wilson
John Smith   CS220      Jason Wu      Jason Wu
Jane Doe     CS115      Sally Wilson
Ajay Kumar   CS220      Robert Kroll  Robert Kroll
Adding the aggregation step gets:
select jt.studentname,
max(case when jt.classname = 'CS220' then jt.teachername end) as teachername
from mytable mt
cross join json_table (
mt.jsonfield,
'$'
columns (
studentname varchar2(30) path '$.studentName',
nested path '$.classes[*]' columns (
classname varchar2(30) path '$.className',
teachername varchar2(30) path '$.teacherName'
)
)
) jt
group by jt.studentname;
STUDENTNAME  TEACHERNAME
-----------  ------------
John Smith   Jason Wu
Jane Doe
Ajay Kumar   Robert Kroll
In 12cR2 I thought you might be able to do something like this instead, with a filter inside the JSON path (which isn't allowed in 12cR1):
select jt.*
from mytable mt
cross join json_table (
mt.jsonfield,
'$'
columns (
studentname varchar2(30) path '$.studentName',
nested path '$.classes[*]?(@.className=="CS220")' columns (
teachername varchar2(30) path '$.teacherName'
)
)
) jt;
... but it turns out that gets "ORA-40553: path expression with predicates not supported in this operation" and "Only JSON_EXISTS supports predicates".

Related

Query to return multiple MAX values with HAVING clause

I want to write a query that will return the name of students who did the most projects with the count of the project. I want the query to return a table like this:
student_name    max_project_count
John Doe        2
Anna Do         2
This is the code I have so far but it's only giving me the 2 column names student_name and count, but not the result.
SELECT s.student_name, COUNT(student_name)
FROM student s
GROUP BY student_name
HAVING COUNT(student_name) = (
SELECT MAX(count)
FROM (SELECT s.student_name, COUNT(*) AS count
FROM student_project k, student s
WHERE s.student_id = k.student_id
GROUP BY student_name) AS foo)
Result I have right now:
student_name    max_project_count
These are the tables I have in my database:
student
student_id    student_name
jd123         John Doe
ad456         Anna Do
js678         Jess Smith
dk789         Daniel Kim
school_project
project_id    project_name
math_1023     Math Comp.
sci_9872      Science Comp.
student_project
student_id    project_id
jd123         math_1023
ad456         math_1023
jd123         sci_9872
ad456         sci_9872
js678         sci_9872
dk789         sci_9872
with projects as (
Select student_id, count(*) as pcount from student_project group by 1),
max_proj as (
Select max(pcount) as max_project_count from projects)
Select
student_name, max_project_count
from student s,projects p,max_proj m
where
s.student_id=p.student_id and pcount=max_project_count
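To check the idea against the sample tables from the question, here is a small sketch run through SQLite from Python. The choice of SQLite and the explicit join syntax are my own assumptions; the structure (count per student, take the maximum, keep students matching it) is the same as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (student_id text, student_name text);
create table student_project (student_id text, project_id text);
insert into student values
  ('jd123', 'John Doe'), ('ad456', 'Anna Do'),
  ('js678', 'Jess Smith'), ('dk789', 'Daniel Kim');
insert into student_project values
  ('jd123', 'math_1023'), ('ad456', 'math_1023'),
  ('jd123', 'sci_9872'), ('ad456', 'sci_9872'),
  ('js678', 'sci_9872'), ('dk789', 'sci_9872');
""")

top_students = conn.execute("""
    with projects as (
        -- one row per student with their project count
        select student_id, count(*) as pcount
        from student_project
        group by student_id
    ),
    max_proj as (
        -- the single highest count across all students
        select max(pcount) as max_project_count from projects
    )
    select s.student_name, m.max_project_count
    from student s
    join projects p on p.student_id = s.student_id
    join max_proj m on p.pcount = m.max_project_count
    order by s.student_name
""").fetchall()

print(top_students)
```

Because the filter compares against the maximum rather than using a limit, every student tied for the top count is returned.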

Unpivot Columns with Most Recent Record

Student records are updated per subject, each with its own update date. A student can be enrolled in one or multiple subjects. I would like to get, for each student and subject, the record with the most recent update date and its status.
CREATE TABLE Student
(
StudentID int,
FirstName varchar(100),
LastName varchar(100),
FullAddress varchar(100),
CityState varchar(100),
MathStatus varchar(100),
MUpdateDate datetime2,
ScienceStatus varchar(100),
SUpdateDate datetime2,
EnglishStatus varchar(100),
EUpdateDate datetime2
);
The desired query output is below. I am using a CTE method but trying to find a better alternative.
SELECT StudentID, FirstName, LastName, FullAddress, CityState, [SubjectStatus], UpdateDate
FROM Student
;WITH original AS
(SELECT * FROM Student)
,Math as
(
SELECT DISTINCT StudentID, FirstName, LastName, FullAddress, CityState,
ROW_NUMBER() OVER (PARTITION BY StudentID, MathStatus ORDER BY MUpdateDate DESC) as rn
, _o.MathStatus as SubjectStatus, _o.MUpdateDate as UpdateDate
FROM original as o
left join original as _o on o.StudentID = _o.StudentID
where _o.MathStatus is not null and _o.MUpdateDate is not null
)
,Science AS
(
...--Same as Math
)
,English AS
(
...--Same As Math
)
SELECT * FROM Math WHERE rn = 1
UNION
SELECT * FROM Science WHERE rn = 1
UNION
SELECT * FROM English WHERE rn = 1
First: storing data in a denormalized form is not recommended. Some data model redesign might be in order. There are multiple resources about data normalization available on the web.
Now then, I made some guesses about how your source table is populated based on the query you wrote. I generated some sample data that could show how the source data is created. Besides that I also reduced the number of columns to reduce my typing efforts. The general approach should still be valid.
Sample data
create table Student
(
StudentId int,
StudentName varchar(15),
MathStat varchar(5),
MathDate date,
ScienceStat varchar(5),
ScienceDate date
);
insert into Student (StudentID, StudentName, MathStat, MathDate, ScienceStat, ScienceDate) values
(1, 'John Smith', 'A', '2020-01-01', 'B', '2020-05-01'),
(1, 'John Smith', 'A', '2020-01-01', 'B+', '2020-06-01'), -- B for Science was updated to B+ month later
(2, 'Peter Parker', 'F', '2020-01-01', 'A', '2020-05-01'),
(2, 'Peter Parker', 'A+', '2020-03-01', 'A', '2020-05-01'), -- Spider-Man would never fail Math, fixed...
(3, 'Tom Holland', null, null, 'A', '2020-05-01'),
(3, 'Tom Holland', 'A-', '2020-07-01', 'A', '2020-05-01'); -- Tom was sick for Math, but got a second chance
Solution
Your question title already contains the word unpivot. That word actually exists in T-SQL as a keyword. You can learn about the unpivot keyword in the documentation. Your own solution already uses common table expressions, so these constructions should look familiar.
Steps:
cte_unpivot = unpivot all rows, create a Subject column and place the corresponding values (SubjectStat, Date) next to it with a case expression.
cte_recent = number the rows to find the most recent row per student and subject.
Select only those most recent rows.
This gives:
with cte_unpivot as
(
select up.StudentId,
up.StudentName,
case up.[Subject]
when 'MathStat' then 'Math'
when 'ScienceStat' then 'Science'
end as [Subject],
up.SubjectStat,
case up.[Subject]
when 'MathStat' then up.MathDate
when 'ScienceStat' then up.ScienceDate
end as [Date]
from Student s
unpivot ([SubjectStat] for [Subject] in ([MathStat], [ScienceStat])) up
),
cte_recent as
(
select cu.StudentId, cu.StudentName, cu.[Subject], cu.SubjectStat, cu.[Date],
row_number() over (partition by cu.StudentId, cu.[Subject] order by cu.[Date] desc) as [RowNum]
from cte_unpivot cu
)
select cr.StudentId, cr.StudentName, cr.[Subject], cr.SubjectStat, cr.[Date]
from cte_recent cr
where cr.RowNum = 1;
Result
StudentId StudentName Subject SubjectStat Date
----------- --------------- ------- ----------- ----------
1 John Smith Math A 2020-01-01
1 John Smith Science B+ 2020-06-01
2 Peter Parker Math A+ 2020-03-01
2 Peter Parker Science A 2020-05-01
3 Tom Holland Math A- 2020-07-01
3 Tom Holland Science A 2020-05-01
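For anyone without a SQL Server instance at hand, the same three steps can be sketched with SQLite from Python. Note the assumptions: UNION ALL stands in for the T-SQL UNPIVOT keyword (SQLite has no UNPIVOT), the date column is renamed UpdDate, dates are stored as ISO text, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Student (StudentId int, StudentName text,
                      MathStat text, MathDate text,
                      ScienceStat text, ScienceDate text);
insert into Student values
  (1, 'John Smith',   'A',  '2020-01-01', 'B',  '2020-05-01'),
  (1, 'John Smith',   'A',  '2020-01-01', 'B+', '2020-06-01'),
  (2, 'Peter Parker', 'F',  '2020-01-01', 'A',  '2020-05-01'),
  (2, 'Peter Parker', 'A+', '2020-03-01', 'A',  '2020-05-01'),
  (3, 'Tom Holland',  null, null,         'A',  '2020-05-01'),
  (3, 'Tom Holland',  'A-', '2020-07-01', 'A',  '2020-05-01');
""")

latest = conn.execute("""
    with cte_unpivot as (
        -- UNION ALL stands in for UNPIVOT; null statuses are skipped, as UNPIVOT does
        select StudentId, StudentName, 'Math' as Subject,
               MathStat as SubjectStat, MathDate as UpdDate
        from Student where MathStat is not null
        union all
        select StudentId, StudentName, 'Science', ScienceStat, ScienceDate
        from Student where ScienceStat is not null
    ),
    cte_recent as (
        -- number the rows per student and subject, newest first
        select *, row_number() over (
                   partition by StudentId, Subject order by UpdDate desc) as RowNum
        from cte_unpivot
    )
    select StudentId, StudentName, Subject, SubjectStat, UpdDate
    from cte_recent
    where RowNum = 1
    order by StudentId, Subject
""").fetchall()

for row in latest:
    print(row)
```

The rows returned match the result table above: one row per student and subject, carrying the status from the most recent date.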

Concat Names against row_number() or similar function

My data repeats rows for individual relationships between people. For example, the below states that John Smith is known by 3 employees:
Person EmployeeWhoKnowsPerson
John Smith Derek Jones
John Smith Adrian Daniels
John Smith Peter Low
I am looking to do the following:
1) Count the number of people who know John Smith. I have done this via the row_number() function and it appears to be behaving:
select Person, MAX(rowrank) as rowrank
from (
select Person, EmployeeWhoKnowsPerson, rowrank=ROW_NUMBER() over (partition by Person order by EmployeeWhoKnowsPerson desc)
from Data
) as t
group by Person
Which returns:
Person rowrank
John Smith 3
But now I am looking at concatenating the EmployeeWhoKnowsPerson column and was wondering how this might be possible:
Person rowrank EmployeesWhoKnow
John Smith 3 Derek Jones, Adrian Daniels, Peter Low
For SQL Server 2017+:
select
person,
count(*) as KnowsCount,
string_agg(EmployeeWhoKnowsPerson, ',') WITHIN GROUP (ORDER BY EmployeeWhoKnowsPerson ASC) AS EmployeesWhoKnowPerson
from
data
group by person;
For prior versions:
select
person,
count(*) as KnowsCount,
stuff((select ',' + EmployeeWhoKnowsPerson
from data as dd
where dd.Person = d.Person
order by EmployeeWhoKnowsPerson
for xml path('')), 1, 1, '') AS EmployeesWhoKnowPerson
from
data as d
group by person;
And you're overthinking that whole count of who knows piece.
Here's a SQL Fiddle Demo with an extra name thrown in.
If 2017+, you can use string_agg() in a simple group by
Example
Declare @YourTable Table ([Person] varchar(50),[EmployeeWhoKnowsPerson] varchar(50))
Insert Into @YourTable Values
('John Smith','Derek Jones')
,('John Smith','Adrian Daniels')
,('John Smith','Peter Low')
Select Person
,rowrank = sum(1)
,[EmployeeWhoKnowsPerson] = string_agg([EmployeeWhoKnowsPerson],', ')
From @YourTable
Group By Person
Returns
Person rowrank EmployeeWhoKnowsPerson
John Smith 3 Derek Jones, Adrian Daniels, Peter Low
If <2017 ... use the stuff()/xml approach
Select Person
,rowrank = sum(1)
,[EmployeeWhoKnowsPerson] = stuff((Select ', ' + [EmployeeWhoKnowsPerson]
From @YourTable
Where Person=A.Person
For XML Path ('')),1,2,'')
From @YourTable A
Group By Person
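The count-plus-concatenate pattern is not specific to SQL Server. As a rough illustration, here it is in SQLite via Python, where group_concat() plays the role of string_agg(). That substitution is my own, and SQLite's group_concat() does not promise any particular order within the group, so the sketch sorts the names in Python before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table data (Person text, EmployeeWhoKnowsPerson text)")
conn.executemany("insert into data values (?, ?)", [
    ("John Smith", "Derek Jones"),
    ("John Smith", "Adrian Daniels"),
    ("John Smith", "Peter Low"),
])

# A plain group by gives both the count and the concatenated list;
# no row_number() is needed for the count.
person, knows_count, employees = conn.execute("""
    select Person,
           count(*) as KnowsCount,
           group_concat(EmployeeWhoKnowsPerson, ', ') as EmployeesWhoKnowPerson
    from data
    group by Person
""").fetchone()

# Sort in Python because the aggregate's internal order is unspecified.
employee_names = sorted(employees.split(", "))
print(person, knows_count, employee_names)
```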

Postgres - jsonb : Update key in column with value taken from another table

I am using postgres 9.5. I have a profile table, which lists the names:
public.profiles:
id | first_name | last_name
---+------------+----------
1  | Jason      | Bourne
2  | Jhonny     | Quest
I have an invoices table:
public.invoices:
invoice_id | billing_address                           | profile_id
-----------+-------------------------------------------+-----------
1          | {                                         | 2
           |   "address_line1": "445 Mount Eden Road", |
           |   "city": "Mount Eden",                   |
           |   "country": "Auckland"                   |
           | }                                         |
I want to update the billing_address column of the invoices table with the first_name and last_name from the profile table, like:
public.invoices:
invoice_id | billing_address                           | profile_id
-----------+-------------------------------------------+-----------
1          | {                                         | 2
           |   "name": "Jhonny Quest",                 |
           |   "address_line1": "445 Mount Eden Road", |
           |   "city": "Mount Eden",                   |
           |   "country": "Auckland"                   |
           | }                                         |
To do so, I have tried using jsonb_set:
UPDATE invoices AS i SET billing_address = jsonb_set(billing_address,'{name}', SELECT t::jsonb FROM (SELECT CONCAT (p.first_name,p.middle_name, p.last_name) FROM profiles p WHERE p.id = i.profile_id)t )
It throws an error at SELECT. TBH I am not even sure if any of that statement is legal. Looking for any guidance.
UPDATE invoices i
SET billing_address = s.new_billing_address
FROM (
SELECT
i.invoice_id,
jsonb_set(
billing_address,
'{name}'::text[],
to_jsonb(concat_ws(' ', first_name, last_name))
) AS new_billing_address
FROM
invoices i
JOIN profiles p ON i.profile_id = p.id
) s
WHERE s.invoice_id = i.invoice_id;
First build a SELECT that joins the second table. From there you can create the new JSON value out of the name parts using to_jsonb() and concat_ws() (the concat operator || would also work), and jsonb_set() writes it into the existing object under the name key.
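If it helps to see what the update does to each row, here is a plain-Python sketch of the same transformation. This only mimics the effect on hypothetical in-memory copies of the two tables; it is not how Postgres executes it, and the dict and list names are made up for the illustration.

```python
import json

# Hypothetical in-memory copies of the profiles and invoices tables.
profiles = {1: ("Jason", "Bourne"), 2: ("Jhonny", "Quest")}
invoices = [
    {
        "invoice_id": 1,
        "billing_address": json.dumps({
            "address_line1": "445 Mount Eden Road",
            "city": "Mount Eden",
            "country": "Auckland",
        }),
        "profile_id": 2,
    },
]

for invoice in invoices:
    # The join: look up the profile that belongs to this invoice.
    name_parts = profiles[invoice["profile_id"]]
    address = json.loads(invoice["billing_address"])
    # Like concat_ws(' ', ...), skip missing (null) parts before joining.
    full_name = " ".join(part for part in name_parts if part is not None)
    # Same effect as jsonb_set(billing_address, '{name}', ...) for a top-level key:
    # the key is added, and every other key is left untouched.
    address["name"] = full_name
    invoice["billing_address"] = json.dumps(address)

updated_address = json.loads(invoices[0]["billing_address"])
print(updated_address)
```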

PostgreSQL - How to display a corresponding string on every entry in string_agg()?

I have 2 tables:
Employee
ID Name
1 John
2 Ben
3 Adam
Employer
ID Name
1 James
2 Rob
3 Paul
I want to string_agg() and concatenate the two tables into one record as a single column. I also want another column that shows where each string came from: "Employee" if the name comes from the Employee table, and "Employer" if it comes from the Employer table.
Here's my code for displaying the table:
SELECT string_agg(e.Name, CHR(10)) || CHR(10) || string_agg(er.Name, CHR(10)), PERSON_STATUS
FROM Employee e, Employer er
Here's my expected output:
ID Name PERSON_STATUS
1 John Employee
Ben Employee
Adam Employee
James Employer
Rob Employer
Paul Employer
NOTE: I know this can be done by adding another column in the table but that's not the case of this scenario. This is just an example to illustrate my problem.
Based on your sample, I'd say that you need UNION ALL rather than an aggregate:
SELECT id, name, 'Employee'::text AS person_status
FROM employee
UNION ALL
SELECT id, name, 'Employer'::text
from employer;
SELECT 1 AS id, STRING_AGG(name, E'\r\n') AS name, STRING_AGG(person_status, E'\r\n') AS person_status
FROM (
SELECT name, 'Employee' AS person_status
FROM employee
UNION ALL
SELECT name, 'Employer'
FROM employer
) data
Ok, so first we merge our 2 tables into 3 columns. We can select arbitrary values this way.
select
"ID", -- Double quotes are necessary for capitalised aliases
"Name",
'Employee' as "PERSON_STATUS"
from
employee
union
select
"ID",
"Name",
'Employer'
from
employer
We then subquery this and perform our string operations as required.
select
string_agg(concat(people."Name", ' ', people."PERSON_STATUS"), chr(10))
from
(
select
"ID",
"Name",
'Employee' as "PERSON_STATUS"
from
employee
union
select
"ID",
"Name",
'Employer'
from
employer
) as people
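The two steps (label each table's rows in a union, then combine them) can be sketched with SQLite from Python. A few assumptions here: I use UNION ALL with an explicit source_order column (a name I made up) so the output order is predictable, lower-case unquoted column names, and I do the final newline join in Python rather than with string_agg() so the ordering is explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table employee (id int, name text);
create table employer (id int, name text);
insert into employee values (1, 'John'), (2, 'Ben'), (3, 'Adam');
insert into employer values (1, 'James'), (2, 'Rob'), (3, 'Paul');
""")

# Step 1: merge both tables and tag each row with its source.
people = conn.execute("""
    select name, person_status from (
        select id, name, 'Employee' as person_status, 1 as source_order from employee
        union all
        select id, name, 'Employer', 2 from employer
    )
    order by source_order, id
""").fetchall()

# Step 2: combine into a single newline-separated string.
combined = "\n".join(f"{name} {status}" for name, status in people)
print(combined)
```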