Aggregating Data with MAX & GROUP BY in VIEWS with a WHERE restriction - tsql

Here is my request.
Our application generates a statement that queries a view in a SQL Server database.
The statement selects DISTINCT or GROUP BY data fields from the view,
passing a restricted subset of IDs.
Background:
I am using SQL Server 2008 R2 running on Windows Server 2008 R2.
I will try to explain the problem with an example table.
Given this example table: [TabA]
ID [INT] DATA [VARCHAR(8)]
--- ----
51 A1
50 A1
110 A5
100 A5
We then create the following view:
CREATE VIEW ViewOnTabA
AS
SELECT
MAX(ta.ID) As ID, ta.DATA
FROM
TabA ta
GROUP BY ta.DATA
GO
With this statement, generated by our application,
we call the view passing some ID values:
SELECT
ID, DATA
FROM ViewOnTabA
WHERE ID in (51,50,110,100)
The result is OK.
The DATA fields are grouped and complete:
ID DATA
-- ----
51 A1
110 A5
If, instead of all IDs, we pass only the smaller ID of a group (50 without 51):
...
WHERE ID in (50,110,100)
The result is incomplete (the row for ID 50 is missing):
ID DATA
-- ----
110 A5
But we expected
ID DATA
-- ----
50 A1
110 A5
It seems that in a view the GROUP BY is executed before the WHERE condition.
As we cannot invoke stored procedures from our application,
we have to rely on calling a view.
Is there another way to get a DISTINCT or GROUP BY result
on the DATA field (in the example) within a view?
The ID fields do not necessarily have to be reduced with the MAX aggregate function,
but the WHERE restriction has to be applied to them.
P.S.
Executing this SELECT as a normal query, with
the (WHERE ID in ...) restriction included, works fine:
SELECT
MAX(ta.ID) As ID, ta.DATA
FROM
TabA ta
WHERE ID in (50,110,100)
GROUP BY ta.DATA
Output:
ID DATA
-- ----
50 A1
110 A5
To reproduce this example,
here are the CREATE and INSERT statements:
create table TabA
(
ID int ,
DATA varchar(8)
)
go
insert into TabA values (51,'A1')
insert into TabA values (50,'A1')
insert into TabA values (110,'A5')
insert into TabA values (100,'A5')
Any help would be greatly appreciated.
Regards,
Alberto

For this to work the way you expect, you need to remove the GROUP BY from your view. The view computes MAX(ID) per DATA value first, so the A1 group is exposed only as ID 51. The outer WHERE then filters that aggregated row, and since 51 is not in your list, the whole A1 group disappears; the row with ID 50 never reaches the WHERE clause.
You will either need to change your view (if nothing else is using it) or create a new view like this:
CREATE VIEW ViewOnTabA /* or use a new name */
AS
SELECT ta.ID, ta.DATA
FROM TabA ta
GO
... and then run the select statement with the where clause on this view.
SELECT
MAX(ta.ID) As ID, ta.DATA
FROM
ViewOnTabA ta /* or use the new view's name */
WHERE ta.ID in (50,110,100)
GROUP BY ta.DATA
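
The difference between the two approaches can be reproduced outside SQL Server. The following is only a sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in engine; the view names GroupedView and PlainView and the helper query_ids are made up for this illustration and are not part of the original question.

```python
import sqlite3

def query_ids(filter_ids):
    """Run the same ID filter against a grouped view and an ungrouped view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TabA (ID INTEGER, DATA TEXT);
        INSERT INTO TabA VALUES (51, 'A1'), (50, 'A1'), (110, 'A5'), (100, 'A5');
        -- Hypothetical name: the view from the question (aggregates first)
        CREATE VIEW GroupedView AS
            SELECT MAX(ID) AS ID, DATA FROM TabA GROUP BY DATA;
        -- Hypothetical name: the view from the answer (no aggregation)
        CREATE VIEW PlainView AS
            SELECT ID, DATA FROM TabA;
    """)
    marks = ",".join("?" * len(filter_ids))
    # The filter is applied to the already aggregated rows.
    grouped = conn.execute(
        f"SELECT ID, DATA FROM GroupedView WHERE ID IN ({marks}) ORDER BY DATA",
        filter_ids,
    ).fetchall()
    # The filter is applied to the raw rows, and the grouping happens afterwards.
    plain = conn.execute(
        f"SELECT MAX(ID) AS ID, DATA FROM PlainView "
        f"WHERE ID IN ({marks}) GROUP BY DATA ORDER BY DATA",
        filter_ids,
    ).fetchall()
    conn.close()
    return grouped, plain

grouped, plain = query_ids([50, 110, 100])
print(grouped)  # [(110, 'A5')]
print(plain)    # [(50, 'A1'), (110, 'A5')]
```

With the grouped view, the A1 group is represented only by ID 51, so filtering on 50 drops it. With the plain view, 50 survives the filter and becomes the maximum of the remaining A1 rows.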

Related

Update Variable based on Group

I need to update a field in a table with a variable, but I need the variable to change when the group changes. It is just an INT. In the example below I want to update the Texas records with 1 and the Florida records with the next number, 2:
UPDATE table
set StateNum = @Count
FROM table
where xxxxx
GROUP BY state
Group Update Variable
Texas 1
Texas 1
Florida 2
Florida 2
Florida 2
I think you should use a lookup table that holds each state and its number (StateNum). Your table should then store this number instead of the state name.
You can use DENSE_RANK within an updatable CTE:
--mockup data
DECLARE @tbl TABLE([state] VARCHAR(100),StateNum INT);
INSERT INTO @tbl([state]) VALUES
('Texas'),('Florida'),('Texas'),('Nevada');
--your update-statement
WITH updateableCTE AS
(
SELECT StateNum
,DENSE_RANK() OVER(ORDER BY [state]) AS NewValue
FROM @tbl
)
UPDATE updateableCTE SET StateNum=NewValue;
--check the result
SELECT * FROM @tbl;
You can then use this to get the data for your lookup table:
SELECT StateNum,[state] FROM @tbl GROUP BY StateNum,[state];
Then drop the state-column from your original table and let the StateNum be a foreign key.
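
To see what numbering a dense rank over the state name produces, here is a small sketch using Python's sqlite3 module. It is an assumption-laden stand-in, not the T-SQL above: instead of DENSE_RANK it counts the distinct state names that sort at or before the current one, which yields the same numbers and also runs on SQLite builds without window functions. The helper name number_states is hypothetical.

```python
import sqlite3

def number_states(states):
    """Assign one number per distinct state, ordered by state name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (state TEXT, StateNum INTEGER)")
    conn.executemany("INSERT INTO tbl (state) VALUES (?)", [(s,) for s in states])
    # Dense-rank equivalent: number of distinct states <= the current state.
    conn.execute("""
        UPDATE tbl
        SET StateNum = (SELECT COUNT(DISTINCT t2.state)
                        FROM tbl AS t2
                        WHERE t2.state <= tbl.state)
    """)
    rows = conn.execute(
        "SELECT DISTINCT state, StateNum FROM tbl ORDER BY StateNum"
    ).fetchall()
    conn.close()
    return rows

rows = number_states(["Texas", "Florida", "Texas", "Nevada"])
print(rows)  # [('Florida', 1), ('Nevada', 2), ('Texas', 3)]
```

Note that the numbers follow the alphabetical order of the state names, so Florida receives 1 here rather than Texas. If a specific numbering is required, the ordering expression has to encode it.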

Insert into Table using JOIN T-SQL

I want to insert into a specific column of my table A, which belongs to DB 1,
from table B in my DB 2.
Table A has a unique ID field called F6, and table B has one called F68. The two fields hold the same values (one is simply a copy of the other), which gives me the opportunity to join on them.
So far so good. What I want now is to insert the values from table B's F64 into field F110 of my table A. Since I join on the IDs, the values should end up in the right rows.
All fields are of type VARCHAR.
INSERT INTO [D061_15018659].[dbo].[A](F110)
SELECT v.F64,v.F68
FROM [VFM6010061V960P].[dbo].[B] v LEFT JOIN
ON v.F68 = F6
The problem is that I get an error at "ON", and I can't figure out why.
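Your SELECT returns two columns, but the INSERT targets only one (F110), so you need to concatenate the columns (or select just one of them).
You also need to name table A in the JOIN clause; the ON has nothing to join to otherwise.
Try this (in T-SQL, VARCHAR values are concatenated with +):
INSERT INTO [D061_15018659].[dbo].[A] (F110)
SELECT
v.F64 + v.F68 as theNewF110
FROM
[VFM6010061V960P].[dbo].[B] v
LEFT JOIN
[D061_15018659].[dbo].[A] w ON v.F68 = w.F6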

OrientDB Traverse Sum and Group By Top-Most Record

We have Orders that include "caused_order" edges from Order to Order because friends can refer other friends to make purchases. We know from the links we generate for the friends that Order ID 42 caused Order ID 47, so we create a "caused_order" edge between the two Order vertices.
We're looking to identify the people that are generating the most referral business. Right now we just loop through in C# and figure it out because our datasets are relatively small. But I'd like to figure out if there's a way to use the Traverse SQL to accomplish this instead.
The problem I'm running into is getting an accurate count/sum for each Original Order ID.
Consider the following scenario:
Order 42 caused four other Orders, including Order 47. Order 47 caused 2 additional Orders. And Order 51, unrelated to 42 or 47, caused 3 Orders.
I can run the following SQL to get the best referrers for this specific {ProductId}:
select in_caused_order[0].id as OrderID, count(*) as ReferCount, sum(amount) as ReferSum
from ( traverse out('caused_order') from Order )
where out_includes.id = '{ProductId}' and $depth >= 1
group by in_caused_order[0].id
EDIT: the schema is a bit more complex than this, I was just including the out_includes WHERE clause to show that there's a bit of filtering of the Orders. But it's a bit like:
Product(V) <-- includes(E) <-- Order(V) --> caused_order(E) --> Order(V)
(the Order vertex has "amount" as a property, which stores the money spent and is being SUM'd in the SELECT, along with a few fields like date which aren't important)
But that will result in something like:
OrderID | ReferCount | ReferSum
42 | 4 | 525
47 | 2 | 130
51 | 3 | 250
Except that's not quite right, is it? Because Order 42 also technically caused 47's two orders. So we'd want to see something like:
OrderID | ReferCount | ReferSum | ExtendedCount | ExtendedSum
42 | 4 | 525 | 2 | 130
47 | 2 | 130 | 0 | 0
51 | 3 | 250 | 0 | 0
I recognize that the two "Extended" count/sum columns might be tricky. We might have to run the query twice, once with $depth = 1, and again with $depth > 1, and then assemble the results of those two queries in C#, which is fine.
But I can't even figure out how to get the overall total calculated correctly. The first step would even be to see something like:
OrderID | ReferCount | ReferSum
42 | 6 | 655 <-- includes its 4 orders + 47's 2 orders
47 | 2 | 130
51 | 3 | 250
And since this can be n levels deep, I can't just chain in_caused_order.in_caused_order.in_caused_order in the SQL, because I don't know how deep it will go. Order 83 could be caused by Order 47, and Order 105 could be caused by Order 83, and so on.
Any help would be much appreciated. Or maybe the answer is, Traverse can't handle this, and we'll have to figure something else out entirely.
I tried your use case. The following is my test data:
create class caused_order extends e
create class Order extends v
create property Order.id integer
create property Order.amount integer
begin
create vertex Order set id=1 ,amount=1
create vertex Order set id=2 ,amount=5
create vertex Order set id=3 ,amount=11
create vertex Order set id=4 ,amount=23
create vertex Order set id=5 ,amount=31
create vertex Order set id=6 ,amount=49
create vertex Order set id=7 ,amount=4
create vertex Order set id=8 ,amount=74
create vertex Order set id=9 ,amount=87
create edge caused_order from (select from Order where id=1) to (select from Order where id=2)
create edge caused_order from (select from Order where id=1) to (select from Order where id=3)
create edge caused_order from (select from Order where id=2) to (select from Order where id=4)
create edge caused_order from (select from Order where id=2) to (select from Order where id=5)
create edge caused_order from (select from Order where id=6) to (select from Order where id=7)
create edge caused_order from (select from Order where id=6) to (select from Order where id=8)
commit retry 20
Then I wrote these two queries to show each order with its ReferSum and ReferCount.
The first one includes the head order in the count:
select id as OrderID, $a[0].Amount as ReferSum, $a[0].Count as ReferCount from Order
let $a=(select sum(amount) as Amount, count(*) as Count from (traverse out('caused_order') from $parent.$current) group by Amount)
The second one excludes the head:
select id as OrderID, $a[0].Amount as ReferSum, $a[0].Count as ReferCount from Order
let $a=(select sum(amount) as Amount, count(*) as Count from (select from (traverse out('caused_order') from $parent.$current) where $depth>=1) group by Amount)
EDIT
I've added this to my data:
create class includes extends E
create class Product extends V
create property Product.id Integer
create vertex Product set id = 101
create vertex Product set id = 102
create vertex Product set id = 103
create vertex Product set id = 104
create edge includes from (select from Order where id=1) to (select from Product where id=101)
create edge includes from (select from Order where id=2) to (select from Product where id=102)
create edge includes from (select from Order where id=3) to (select from Product where id=103)
create edge includes from (select from Order where id=4) to (select from Product where id=104)
create edge includes from (select from Order where id=5) to (select from Product where id=101)
create edge includes from (select from Order where id=6) to (select from Product where id=102)
create edge includes from (select from Order where id=7) to (select from Product where id=103)
create edge includes from (select from Order where id=8) to (select from Product where id=104)
create edge includes from (select from Order where id=9) to (select from Product where id=101)
create edge includes from (select from Order where id=1) to (select from Product where id=102)
create edge includes from (select from Order where id=1) to (select from Product where id=103)
create edge includes from (select from Order where id=2) to (select from Product where id=104)
And these are the modified queries (I added while out('includes').id contains {prodID_number} to the traverse, and where out('includes').id contains {prodID_number} to the outer query):
select id as OrderID, $a[0].Amount as ReferSum, $a[0].Count as ReferCount from Order
let $a=(select sum(amount) as Amount, count(*) as Count from (traverse out('caused_order') from $parent.$current while out('includes').id contains 102) group by Amount)
where out('includes').id contains 102
select id as OrderID, $a[0].Amount as ReferSum, $a[0].Count as ReferCount from Order
let $a=(select sum(amount) as Amount, count(*) as Count from (traverse out('caused_order') from $parent.$current while out('includes').id contains 102) where $depth >= 1 group by Amount)
where out('includes').id contains 102
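
For comparison with the C# loop mentioned in the question, the totals that the "excluding the head" query is meant to produce can be sketched in a few lines of Python. This is not OrientDB code: it is an illustrative walk over the caused_order edges of the test data above (without the product filter), and the names amounts, edges, children and referral_totals are assumptions made for this example.

```python
from collections import defaultdict

# Test data from the answer: order id -> amount, and caused_order edges.
amounts = {1: 1, 2: 5, 3: 11, 4: 23, 5: 31, 6: 49, 7: 4, 8: 74, 9: 87}
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (6, 7), (6, 8)]

# Adjacency list: order id -> orders it directly caused.
children = defaultdict(list)
for parent, child in edges:
    children[parent].append(child)

def referral_totals(order_id):
    """Count and sum every order caused directly or indirectly by order_id."""
    count, total = 0, 0
    seen = set()
    stack = list(children[order_id])  # start below the head order (depth >= 1)
    while stack:
        current = stack.pop()
        if current in seen:  # guard against revisiting an order
            continue
        seen.add(current)
        count += 1
        total += amounts[current]
        stack.extend(children[current])
    return count, total

for order_id in sorted(amounts):
    print(order_id, referral_totals(order_id))
```

In this data, order 1 is credited with orders 2 and 3 as well as 4 and 5 (caused through order 2), which is the n-levels-deep total the question asks for.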

Need an efficient select query

I would like to know an efficient way to fetch the data in the following case.
There are two tables, say Table1 and Table2, that have two fields in common, country and pincode, and another table, "Table3", that holds the key fields of the first two tables (DNO, MPNO).
Here is the little glitch: in the Table3 data, a row that has a DNO won't have an MPNO.
So whatever the user enters on the selection screen (picture no. 2), the result should be as follows:
**MFID | DNO | MPNO | COUNTRY | PINCODE**
----------
00001 | 10011 | novalue | IN | 4444
00002 | Novalue | 1200 | IN | 5555
00003 | 300 | novalue | US | 9999
(As you can observe, if DNO is present there is no MPNO, and vice versa.)
Please have a look at the pictures for a clear picture :-)
Table Relation:
Selection screen with select options:
The code shouldn't be long.
PSEUDO CODE:
Select queries:
SELECT * FROM table3 INTO TABLE it_table3.
SELECT * FROM table1 INTO TABLE it_table1
FOR ALL ENTRIES IN it_table3
WHERE dno = it_table3-dno.
SELECT * FROM table2 INTO TABLE it_table2
FOR ALL ENTRIES IN it_table3
WHERE mpno = it_table3-mpno.
Loop at internal table 3 and build final table.
" wa_table1 and wa_table2 are assumed work areas, declared elsewhere
LOOP AT it_table3 INTO wa_table3.
IF wa_table3-dno IS NOT INITIAL.
READ TABLE it_table1 INTO wa_table1 WITH KEY dno = wa_table3-dno.
ELSE.
READ TABLE it_table2 INTO wa_table2 WITH KEY mpno = wa_table3-mpno.
ENDIF.
ENDLOOP.
Hope this was the answer you were hoping to find!
Building an efficient SELECT requires knowing which fields on your selection screen are obligatory, as well as the expected production size of all three tables. Without this information, let's assume that table1 and table2 are reference tables and table3 is a transaction table, as one can infer from their structure. It would then be sensible to build the selection in the following way:
Select the data from the reference tables first. Since, as you said, the fields DNO and MPNO are mutually exclusive, a country/pincode pair will not produce hits in both reference tables, so a JOIN is useless here. However, we can merge the two result sets into a single internal table without violating any constraints.
TYPES: BEGIN OF tt_result,
dno TYPE table1-dno,
mpno TYPE table2-mpno,
country TYPE table1-country,
pincode TYPE table1-pincode,
...other field from table3
END OF tt_result.
DATA: itab_result TYPE STANDARD TABLE OF tt_result.
SELECT dno country pincode
FROM table1
INTO CORRESPONDING FIELDS OF TABLE itab_result
WHERE pincode IN so_pincode
AND country IN so_country.
SELECT mpno country pincode
FROM table2
APPENDING CORRESPONDING FIELDS OF TABLE itab_result
WHERE pincode IN so_pincode
AND country IN so_country.
The FOR ALL ENTRIES addition allows the same internal table to be specified in both the FOR ALL ENTRIES clause and the INTO clause, so we can fill our result table with the missing table3 data by the DNO/MPNO key.
SELECT *
FROM table3
INTO CORRESPONDING FIELDS OF TABLE itab_result
FOR ALL ENTRIES IN itab_result
WHERE dno = itab_result-dno
AND mpno = itab_result-mpno.

Finding exact matches to a requested set of values

Hi, I'm facing a challenge. There is a table named progress.
User_id | Assessment_id
-----------------------
1 | Test_1
2 | Test_1
3 | Test_1
1 | Test_2
2 | Test_2
1 | Test_3
3 | Test_3
I need to pull out the user_id of users who have completed only Test_1 and Test_2 (i.e. User_id 2). The input parameter would be the list of assessment IDs.
Edit:
I want those who have completed all the assessments on the list, but no others.
User 3 did not complete Test_2, and so is excluded.
User 1 completed an extra test, and is also excluded.
Only User 2 has completed exactly those assessments requested.
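You don't need a complicated join or even subqueries. Simply use the INTERSECT operator, which returns every user who has completed both tests (including users who also took other tests, such as user 1):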
select user_id from progress where assessment_id = 'Test_1'
intersect
select user_id from progress where assessment_id = 'Test_2'
I interpreted your question to mean that you want users who have completed all of the tests in your assessment list, but not any other tests. I'll use a technique called common table expressions so that you can follow step by step, but it is all one query statement.
Let's say you supply your assessment list as rows in a table called Checktests. We can count those values to find out how many tests are needed.
If we use a LEFT OUTER JOIN, the values from the right-side table will be null wherever there is no match. So the test_matched column will be null if an assessment the user took is not on your list. COUNT() ignores null values, so we can use it to find out how many of the tests taken were on the list, and then compare this to the number of all tests the user took.
with x as
(select count(assessment_id) as tests_needed
from checktests
),
dtl as
(select p.user_id,
p.assessment_id as test_taken,
c.assessment_id as test_matched
from progress p
left join checktests c on p.assessment_id = c.assessment_id
),
y as
(select user_id,
count(test_taken) as all_tests,
count(test_matched) as wanted_tests -- count() ignores nulls
from dtl
group by user_id
)
select user_id
from y
join x on y.wanted_tests = x.tests_needed
where y.wanted_tests = y.all_tests ;
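
The exact-match logic can be checked against the sample data with a short script. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module rather than the asker's database, folds the common table expressions into a single GROUP BY ... HAVING, and assumes that neither table contains duplicate rows. The helper name exact_matches is hypothetical.

```python
import sqlite3

def exact_matches(progress_rows, wanted):
    """Return users who completed every assessment in `wanted` and nothing else."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE progress (user_id INTEGER, assessment_id TEXT)")
    conn.execute("CREATE TABLE checktests (assessment_id TEXT)")
    conn.executemany("INSERT INTO progress VALUES (?, ?)", progress_rows)
    conn.executemany("INSERT INTO checktests VALUES (?)", [(w,) for w in wanted])
    rows = conn.execute("""
        SELECT p.user_id
        FROM progress p
        LEFT JOIN checktests c ON p.assessment_id = c.assessment_id
        GROUP BY p.user_id
        -- all wanted tests were taken ...
        HAVING COUNT(c.assessment_id) = (SELECT COUNT(*) FROM checktests)
        -- ... and no test outside the list was taken (COUNT ignores NULLs)
           AND COUNT(p.assessment_id) = COUNT(c.assessment_id)
        ORDER BY p.user_id
    """).fetchall()
    conn.close()
    return [user_id for (user_id,) in rows]

progress_rows = [
    (1, "Test_1"), (2, "Test_1"), (3, "Test_1"),
    (1, "Test_2"), (2, "Test_2"),
    (1, "Test_3"), (3, "Test_3"),
]
result = exact_matches(progress_rows, ["Test_1", "Test_2"])
print(result)  # [2]
```

User 1 matches both wanted tests but took three in total, and user 3 matches only one of the two, so only user 2 is returned.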