Table scoping difficulty: Joining on a subquery with a where clause from a table outside the subquery - tsql

I am trying to adapt a stored procedure into a view and I've run into what appears to be a table scoping issue in SQL Server 2008 R2.
In the sproc, a parameter specified the companyId. Now we want to retrieve the same data for every company, grouped by companyId rather than limited to a single one.
However, my attempt at broadening it has failed.
Select Distinct [T].[TruckID], [T].[VIN], [T].[Make], [T].[Model],
[T].[ModelYear], [T].[RFIDNo], [SQCT].[VendorCode], [SQCT].[UserLabel],
[T].[IsActive], [T].[IsFleetTruck], [SQDA].[DriverAssociations],
dbo.GetCurrentPlate([T].[TruckID]) [Plate],
[TCT].[CompanyId] -- // <- Now we're including it
From [dbo].[Truck] as [T]
Inner Join [dbo].[TruckerCompanyTruck] as [TCT]
On [T].[TruckID] = [TCT].[TruckId]
Left Outer Join (
Select [CT].[CompanyTruckId], [CT].[CompanyId],
[CT].[TruckId], [CT].[VendorCode], [CT].[UserLabel],
[CT].[CreateDate], [CT].[CreateBy], [CT].[UpdateDate],
[CT].[UpdateBy], [CT].[Version]
From [dbo].[CompanyTruck] as [CT]
) as [SQCT]
On [TCT].[CompanyId] = [SQCT].[CompanyId]
and [TCT].[TruckId] = [SQCT].[TruckId]
Left Join (
Select Distinct [TCTC].[TruckId], Count(*) [DriverAssociations]
From [TruckerCompanyTruck] [TCTC]
Inner Join [Trucker] [D]
On [TCTC].[TruckerId] = [D].[TruckerId]
Inner Join [TruckerContract] [TC]
On [TCTC].[TruckerId] = [TC].[TruckerId]
Where [TCTC].[CompanyId] = [TCT].[CompanyId] -- // Error!
and [TC].[CompanyId] = [TCT].[CompanyId] -- // Error!
Group By [TCTC].[TruckId]
) as [SQDA]
On [SQDA].[TruckId] = [T].[TruckId]
The "Error!" lines throw "The multi-part identifier "TCT.CompanyId" could not be bound."
The ultimate goal is that running this query will produce multiple rows for a truck that has multiple company associations and return the correct driver count.
In the end, this query of the view...
Select *
From [FlattenedTruck]
Where [CompanyId] = 28
Order By [VIN]
...should produce the same output as this.
Exec [GetTrucksForGridWithAssociationCounts] @companyId = 28
But they differ. The sproc returns 1 driver association for companyId 28, whereas the view returns 1246 driver associations, which is all of the associations for that truck regardless of company. (The outputs also differ because the sproc does not return the companyId, since it's passed in as a parameter.)
TruckID -> VIN -> Make -> Model -> ModelYear -> RFIDNo -> VendorCode -> UserLabel -> IsActive -> IsFleetTruck -> DriverAssociations -> Plate -> CompanyId
26 -> NULL -> NULL -> NULL -> NULL -> NULL -> NULL -> NULL -> 1 -> 0 -> 1246 -> US-WA-9D68812 -> 28
26 -> NULL -> NULL -> NULL -> NULL -> NULL -> NULL -> NULL -> 1 -> 0 -> 1 -> US-WA-9D68812

Can you try OUTER APPLY instead of the LEFT JOIN to the SQDA subquery? You'll want to add the condition of [TCTC].[TruckId] = [T].[TruckId] to the WHERE clause within the subquery, as the APPLY doesn't take an ON condition.
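As a rough sketch of that suggestion (untested against the real schema, and trimmed to three output columns for brevity), the SQDA subquery could be moved into an OUTER APPLY like this. Because an applied subquery is evaluated per outer row, it is allowed to reference [T] and [TCT]:

```sql
-- Sketch only: table and column names are taken from the question.
Select [T].[TruckID], [TCT].[CompanyId], [SQDA].[DriverAssociations]
From [dbo].[Truck] As [T]
Inner Join [dbo].[TruckerCompanyTruck] As [TCT]
    On [T].[TruckID] = [TCT].[TruckId]
Outer Apply (
    -- Count the drivers for this truck within this company only.
    Select Count(*) As [DriverAssociations]
    From [dbo].[TruckerCompanyTruck] As [TCTC]
    Inner Join [dbo].[Trucker] As [D]
        On [TCTC].[TruckerId] = [D].[TruckerId]
    Inner Join [dbo].[TruckerContract] As [TC]
        On [TCTC].[TruckerId] = [TC].[TruckerId]
    Where [TCTC].[TruckId] = [T].[TruckID]          -- replaces the old ON condition
      And [TCTC].[CompanyId] = [TCT].[CompanyId]    -- outer alias is visible here
      And [TC].[CompanyId] = [TCT].[CompanyId]
) As [SQDA]
```

One difference to be aware of: with no GROUP BY, Count(*) always returns a row, so trucks without matching drivers would show 0 here rather than the NULL that the original LEFT JOIN would give.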

When joining with subqueries, it is important to remember that joins inside the subquery are not aware of tables joined outside the subquery.
The error is being thrown because you are attempting to reference table TCT inside the parentheses of the SQDA subquery, but the join aliased to TCT is outside the parentheses. You are joining that same table inside the parentheses as TCTC so it is a simple matter of changing any use of TCT inside the subquery to TCTC, or moving it outside as appropriate.
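So your WHERE clause would become WHERE [TC].[CompanyId] = [TCTC].[CompanyId] inside the subquery. The other condition would reduce to WHERE [TCTC].[CompanyId] = [TCTC].[CompanyId], which is pointless, so move it outside instead: have the subquery also return (and group by) [TCTC].[CompanyId], and compare that column to [TCT].[CompanyId] in the ON clause that joins the subquery.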
That will make the query work; without knowing more about the underlying data model I can't guarantee it will actually return what you intend it to return. But that should be sufficient to get you past the error you are currently getting.
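A minimal sketch of that rewrite, assuming the table and column names from the question and trimmed to a few output columns. Whether the counts match the stored procedure still depends on the data model:

```sql
-- Sketch only: the derived table no longer references [TCT]; it exposes
-- CompanyId so that the company match can happen in the outer ON clause.
Select [T].[TruckID], [TCT].[CompanyId], [SQDA].[DriverAssociations]
From [dbo].[Truck] As [T]
Inner Join [dbo].[TruckerCompanyTruck] As [TCT]
    On [T].[TruckID] = [TCT].[TruckId]
Left Join (
    Select [TCTC].[TruckId], [TCTC].[CompanyId], Count(*) As [DriverAssociations]
    From [dbo].[TruckerCompanyTruck] As [TCTC]
    Inner Join [dbo].[Trucker] As [D]
        On [TCTC].[TruckerId] = [D].[TruckerId]
    Inner Join [dbo].[TruckerContract] As [TC]
        On [TCTC].[TruckerId] = [TC].[TruckerId]
    Where [TC].[CompanyId] = [TCTC].[CompanyId]
    Group By [TCTC].[TruckId], [TCTC].[CompanyId]
) As [SQDA]
    On [SQDA].[TruckId] = [T].[TruckID]
   And [SQDA].[CompanyId] = [TCT].[CompanyId]
```

Grouping by both TruckId and CompanyId yields one count per truck and company, which is what lets a filter such as Where [CompanyId] = 28 on the view pick out a single company's count.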

Related

Is there any way to write this code avoiding intermediate steps/views in PostgreSQL coming from different tables?

I am working on a large query and would like to eliminate intermediate steps, so I am trying to combine the two queries below into one.
The first query (QUERY 1) selects the grid id from a table called tiles. This gives me a UUID that corresponds to a value in a second table, so to obtain the real value of the grid id I have to run a second query (QUERY 2). I have tried to cast everything as I did in the third query (QUERY 3) for other values, but that approach doesn't work.
Does anyone have an idea how I can do this (queries 1 and 2) in just one query?
QUERY 1:
SELECT
jsonb_array_elements(grid_id_tile.tiledata -> '34cfea5d-c2c0-11ea-9026-02e7594ce0a0'::text) ->> 'resourceId'::text AS grid_id
FROM mv_geojson_geoms mv
LEFT JOIN tiles grid_id_tile ON tv1.resourceinstanceid = grid_id_tile.resourceinstanceid
WHERE (( SELECT resource_instances.graphid
FROM resource_instances
WHERE mv.resourceinstanceid = resource_instances.resourceinstanceid)) = '34cfe98e-c2c0-11ea-9026-02e7594ce0a0'::uuid;
QUERY 2:
SELECT grid_id.legacyid AS grid_id,
FROM table 1 (Where I have obtained the grid id)
LEFT JOIN resource_instances grid_id ON hb1.grid_id = grid_id.resourceinstanceid::text
QUERY 3:
( SELECT "values".value
FROM "values"
WHERE ((name_ft_tile.tiledata ->> '34cfea97-c2c0-11ea-9026-02e7594ce0a0'::text)::uuid) = "values".valueid) AS nametype,
FROM mv_geojson_geoms mv
LEFT JOIN tiles name_ft_tile ON mv.resourceinstanceid = name_ft_tile.resourceinstanceid AND (name_ft_tile.tiledata ->> '34cfea97-c2c0-11ea-9026-02e7594ce0a0'::text) <> ''::text
WHERE (( SELECT resource_instances.graphid
FROM resource_instances
WHERE mv.resourceinstanceid = resource_instances.resourceinstanceid)) = '34cfe98e-c2c0-11ea-9026-02e7594ce0a0'::uuid
These are the data I am working with at the moment:
This is the tiles table, where the jsonb column holds the UUID of the feature whose grid id I would like to get.
This is the resource_instances table, which holds the legacyid.
From query 1 I get this result, where grid_id is a UUID.
From query 2 I get this result, with the grid_id code.
That is what I obtain now, and I would like to get the grid_id value directly, without intermediate steps.
The third query is a sample of a similar approach I used elsewhere: in one query I get the value instead of the UUID, which is what I would like to do with the grid_id.
But when I run similar code for the grid id I get the following error, because I am extracting the elements from an array:
ERROR: cannot extract elements from a scalar
CONTEXT: parallel worker
SQL state: 22023
You can literally inline query 1 as a subquery where you've written "table 1 (Where I have obtained the grid id)":
SELECT grid_id.legacyid AS grid_id
FROM (
SELECT jsonb_array_elements(grid_id_tile.tiledata -> '34cfea5d-c2c0-11ea-9026-02e7594ce0a0'::text) ->> 'resourceId'::text AS grid_id
FROM mv_geojson_geoms mv
LEFT JOIN tiles grid_id_tile ON mv.resourceinstanceid = grid_id_tile.resourceinstanceid
JOIN resource_instances ON mv.resourceinstanceid = resource_instances.resourceinstanceid
WHERE resource_instances.graphid = '34cfe98e-c2c0-11ea-9026-02e7594ce0a0'::uuid
) AS hb1
LEFT JOIN resource_instances grid_id ON hb1.grid_id = grid_id.resourceinstanceid::text;

PostGres joins Using JSONB

I have two tables that look like this:
Organizations
Id (primary_key. big int)
Name (text)
CustomInformations
Id (primary_key. big int)
ConnectedIdentifiers (JSONB)
Info (text)
CustomInformations's ConnectedIdentifiers column contains a JSONB structure which looks like
{ 'organizations': [1,2,3] }
This means that the organizations with ids 1, 2 and 3 are related to that particular CustomInformation.
I'm trying to do a JOIN that, given a CustomInformation Id, will also get me all of the related Organizations' names.
I tried this after looking at some examples:
SELECT * FROM CustomInformations ci
INNER JOIN Organizations o on jsonb_array_elements(ci.ConnectedIdentifiers->'19') = o.id
WHERE
ci.id = 5
I got the error "No operator matches the given name and argument type(s). You might need to add explicit type casts."
Is this the right approach? And if so what is wrong with my syntax?
Thanks
You cannot use jsonb_array_elements() in this way because the function returns a set of rows. It should be placed in a lateral join instead. Use jsonb_array_elements_text() to get the array elements as text and cast these elements to bigint:
select ci.*, o.*
from custominformations ci
-- lateral join
cross join jsonb_array_elements_text(ci.connectedidentifiers->'organizations') ar(elem)
join organizations o
on elem::bigint = o.id
where ci.id = 5

Flag a record as current using UPDATE with DISTINCT and ORDER

I have a table which contains data for properties; each property can have multiple entries, which are recorded by date. I want to add a column to the table that flags the current entry to make it easier to select, something like a column named current_record with the value 'Y'.
What I've tried:
Creating a view which selects just the latest records, then using that view:
UPDATE properties
SET current_record = 'Y'
FROM current_properties
WHERE current_properties.building_reference_number = properties.building_reference_number;
building_reference_number is indexed on both and has type integer. This results in a cost of 434537.04, which seems very high. properties has 15.6m rows; current_properties has 12.8m.
I tried a second approach without creating a view:
UPDATE properties
SET current_record = 'Y'
WHERE inspection_date IN (
SELECT inspection_date FROM properties
WHERE inspection_date IN (SELECT distinct on (building_reference_number)inspection_date )
ORDER BY inspection_date desc
FOR UPDATE
But the cost of this is even higher. Here's the EXPLAIN output for the first method (I'm new to Postgres):
Update on properties (cost=434537.04..4930742.06 rows=15019542 width=608)
-> Hash Join (cost=434537.04..4930742.06 rows=15019542 width=608)
Hash Cond: (properties.building_reference_number = yy_current_epc_dmv.building_reference_number)
-> Seq Scan on properties (cost=0.00..2233904.25 rows=17313725 width=378)
-> Hash (cost=210745.13..210745.13 rows=12874313 width=14)
-> Seq Scan on current_properties (cost=0.00..210745.13 rows=12874313 width=14)
The view definition is:
SELECT DISTINCT ON (properties.building_reference_number) properties.building_reference_number,
properties.inspection_date
FROM properties
ORDER BY properties.building_reference_number, properties.inspection_date DESC;

EFCore returning too many columns for a simple LEFT OUTER join

I am currently using EFCore 1.1 (preview release) with SQL Server.
I am doing what I thought was a simple OUTER JOIN between an Order and OrderItem table.
var orders = from order in ctx.Order
join orderItem in ctx.OrderItem
on order.OrderId equals orderItem.OrderId into tmp
from oi in tmp.DefaultIfEmpty()
select new
{
order.OrderDt,
Sku = (oi == null) ? null : oi.Sku,
Qty = (oi == null) ? (int?) null : oi.Qty
};
The actual data returned is correct (I know earlier versions had issues with OUTER JOINS not working at all). However the SQL is horrible and includes every column in Order and OrderItem which is problematic considering one of them is a large XML Blob.
SELECT [order].[OrderId], [order].[OrderStatusTypeId],
[order].[OrderSummary], [order].[OrderTotal], [order].[OrderTypeId],
[order].[ParentFSPId], [order].[ParentOrderId],
[order].[PayPalECToken], [order].[PaymentFailureTypeId] ....
...[orderItem].[OrderId], [orderItem].[OrderItemType], [orderItem].[Qty],
[orderItem].[SKU] FROM [Order] AS [order] LEFT JOIN [OrderItem] AS
[orderItem] ON [order].[OrderId] = [orderItem].[OrderId] ORDER BY
[order].[OrderId]
(There are many more columns not shown here.)
On the other hand - if I make it an INNER JOIN then the SQL is as expected with only the columns in my select clause:
SELECT [order].[OrderDt], [orderItem].[SKU], [orderItem].[Qty] FROM
[Order] AS [order] INNER JOIN [OrderItem] AS [orderItem] ON
[order].[OrderId] = [orderItem].[OrderId]
I tried reverting to EFCore 1.0.1, but got some horrible NuGet package errors and gave up on that.
It's not clear whether this is an actual regression or an incomplete feature in EFCore, and I couldn't find any further information about it elsewhere.
Edit: EFCore 2.1 has addressed a lot of issues with grouping and also N+1 type issues where a separate query is made for every child entity. Very impressed with the performance in fact.
3/14/18 - 2.1 Preview 1 of EFCore isn't recommended because the GROUP BY SQL has some issues when using OrderBy() but it's fixed in nightly builds and Preview 2.
The following applies to EF Core 1.1.0 (release).
Although one shouldn't have to do such things, I tried several alternative query syntaxes (using a navigation property instead of a manual join, joining subqueries containing an anonymous type projection, using let / an intermediate Select, using Concat / Union to emulate a left join, the alternative left join syntax etc.). The result was either the same as in the post, and/or more than one query being executed, and/or invalid SQL queries, and/or strange runtime exceptions like IndexOutOfRange, InvalidArgument etc.
What I can say based on tests is that the problem is most likely related to bug(s) (regression, incomplete implementation - does it really matter) in the GroupJoin translation. For instance, #7003: Wrong SQL generated for query with group join on a subquery that is not present in the final projection, or #6647 - Left Join (GroupJoin) always materializes elements resulting in unnecessary data pulling, etc.
Until it gets fixed (when?), as a (far from perfect) workaround I could suggest using the alternative left outer join syntax (from a in A from b in B.Where(b => b.Key == a.Key).DefaultIfEmpty()):
var orders = from o in ctx.Order
from oi in ctx.OrderItem.Where(oi => oi.OrderId == o.OrderId).DefaultIfEmpty()
select new
{
OrderDt = o.OrderDt,
Sku = oi.Sku,
Qty = (int?)oi.Qty
};
which produces the following SQL:
SELECT [o].[OrderDt], [t1].[Sku], [t1].[Qty]
FROM [Order] AS [o]
CROSS APPLY (
SELECT [t0].*
FROM (
SELECT NULL AS [empty]
) AS [empty0]
LEFT JOIN (
SELECT [oi0].*
FROM [OrderItem] AS [oi0]
WHERE [oi0].[OrderId] = [o].[OrderId]
) AS [t0] ON 1 = 1
) AS [t1]
As you can see, the projection is ok, but instead of LEFT JOIN it uses strange CROSS APPLY which might introduce another performance issue.
Also note that you have to use casts for value types and nothing for strings when accessing the right joined table as shown above. If you use null checks as in the original query, you'll get ArgumentNullException at runtime (yet another bug).
Using "into" creates a temporary identifier to store the results.
Reference: MSDN, into (C# Reference)
So removing the "into tmp from oi in tmp.DefaultIfEmpty()" part will result in clean SQL with just the three columns. Note that without DefaultIfEmpty() this is an inner join, so orders with no items are no longer returned.
var orders = from order in ctx.Order
join orderItem in ctx.OrderItem
on order.OrderId equals orderItem.OrderId
select new
{
order.OrderDt,
Sku = orderItem.Sku,
Qty = orderItem.Qty
};

PostgreSQL - jsonb_each

I have just started to play around with jsonb on Postgres and am finding examples hard to come by online, as it is a relatively new concept. I am trying to use jsonb_each_text to print out a table of keys and values, but I get comma-separated values in a single column.
I have the JSON below saved as jsonb and am using it to test my queries.
{
"lookup_id": "730fca0c-2984-4d5c-8fab-2a9aa2144534",
"service_type": "XXX",
"metadata": "sampledata2",
"matrix": [
{
"payment_selection": "type",
"offer_currencies": [
{
"currency_code": "EUR",
"value": 1220.42
}
]
}
]
}
I can gain access to offer_currencies array with
SELECT element -> 'offer_currencies' -> 0
FROM test t, jsonb_array_elements(t.json -> 'matrix') AS element
WHERE element ->> 'payment_selection' = 'type'
which gives a result of "{"value": 1220.42, "currency_code": "EUR"}", so if I run the query below I get the following (I have to change " to '):
select * from jsonb_each_text('{"value": 1220.42, "currency_code": "EUR"}')
Key | Value
---------------|----------
"value" | "1220.42"
"currency_code"| "EUR"
So, building on the above, I created this query:
SELECT jsonb_each_text(data)
FROM (SELECT element -> 'offer_currencies' -> 0 AS data
FROM test t, jsonb_array_elements(t.json -> 'matrix') AS element
WHERE element ->> 'payment_selection' = 'type') AS dummy;
But this prints comma-separated values in one column:
record
---------------------
"(value,1220.42)"
"(currency_code,EUR)"
The primary problem here is that you select the whole row as a single column (PostgreSQL allows that). You can fix that with SELECT (jsonb_each_text(data)).* ....
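Applied to the query from the question, that fix might look like the sketch below. It assumes the same test table and json column as the question; the only change is the (...).* expansion, which splits the returned record into separate key and value columns:

```sql
-- Sketch: expand the record returned by jsonb_each_text into its columns.
SELECT (jsonb_each_text(data)).*
FROM (SELECT element -> 'offer_currencies' -> 0 AS data
      FROM test t, jsonb_array_elements(t.json -> 'matrix') AS element
      WHERE element ->> 'payment_selection' = 'type') AS dummy;
```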
But: don't put set-returning functions in the SELECT list, as that can often lead to errors (or unexpected results). Instead, use, for example, LATERAL joins/sub-queries:
select first_currency.*
from test t
, jsonb_array_elements(t.json -> 'matrix') element
, jsonb_each_text(element -> 'offer_currencies' -> 0) first_currency
where element ->> 'payment_selection' = 'type'
Note: function calls in the FROM clause are implicit LATERAL joins (here: CROSS JOINs).
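For comparison, here is the same query with the joins spelled out. This is only a sketch of the equivalent explicit form, using the table and column names from the question; the LATERAL keyword is what allows each function call to refer to columns of the items to its left:

```sql
-- Sketch: explicit CROSS JOIN LATERAL instead of comma-separated function calls.
SELECT first_currency.key, first_currency.value
FROM test t
CROSS JOIN LATERAL jsonb_array_elements(t.json -> 'matrix') AS element
CROSS JOIN LATERAL jsonb_each_text(element -> 'offer_currencies' -> 0) AS first_currency
WHERE element ->> 'payment_selection' = 'type';
```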
WITH testa AS(
select jsonb_array_elements
(t.json -> 'matrix') -> 'offer_currencies' -> 0 as jsonbcolumn from test t)
SELECT d.key, d.value FROM testa
join jsonb_each_text(testa.jsonbcolumn) d ON true
ORDER BY 1, 2;
The testa CTE holds the intermediate jsonb data. A lateral join then transforms that jsonb data into table format.