Duplicate values returned with joins - tsql

I was wondering if there is a way, using a T-SQL join (or any other available option), to display certain values only once. I will try to explain exactly what I mean.
My database has tables called Job, consign, dechead and decitem. Job, consign and dechead will only ever have one row per record, but decitem can have multiple records, all tied to dechead by a foreign key. I am writing a query that pulls various values from each table. This is fine for all the tables except decitem. From dechead I need to pull an invoice value, and from decitem I need to grab the net weights. When the results are returned, if a dechead has multiple child decitem rows, the dechead values are repeated on every row. What I need is for the dechead values to be displayed only once, followed by all the decitem values.
e.g.
1 ¦123¦£2000¦15.00¦1
2 ¦--¦------¦20.00¦2
3 ¦--¦------¦25.00¦3
Line 1 displays values from dechead and the first joined row from decitem. Lines 2 and 3 display values from decitem only. If I then export the query to, say, Excel, I will not have duplicate values in the first two fields of lines 2 and 3, which is what I get at the moment,
e.g.
1 ¦123¦£2000¦15.00¦1
2 ¦123¦£2000¦20.00¦2
3 ¦123¦£2000¦25.00¦3
Thanks in advance.

Check out 'group by' for your RDBMS http://msdn.microsoft.com/en-US/library/ms177673%28v=SQL.90%29.aspx

This is a task best left to the application, but if you must do it in SQL, try this:
SELECT
    CASE
        WHEN dt.RowVal = 1 THEN dt.Col1
        ELSE NULL
    END AS Col1
    ,CASE
        WHEN dt.RowVal = 1 THEN dt.Col2
        ELSE NULL
    END AS Col2
    ,dt.Col3
    ,dt.Col4
FROM (SELECT
          Col1, Col2, Col3, Col4
          ,ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col4) AS RowVal
      FROM ...rest of your big query here...
     ) dt
ORDER BY dt.Col1, dt.Col4
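For the "leave it to the application" route, here is a minimal Python sketch of blanking the repeated parent values on the client side. The row layout (dechead id, invoice value, net weight, item number) and the function name blank_repeated_parent are assumptions made for illustration; the second dechead (124) is invented sample data.

```python
from itertools import groupby

# Hypothetical rows as the joined query might return them:
# (dechead id, invoice value, net weight, item number)
rows = [
    (123, 2000, 15.00, 1),
    (123, 2000, 20.00, 2),
    (123, 2000, 25.00, 3),
    (124, 500, 7.50, 1),
]

def blank_repeated_parent(rows, parent_cols=2):
    """Keep the first `parent_cols` values only on the first row of each group."""
    out = []
    # groupby only merges adjacent rows, so the input must already be
    # ordered by the parent columns (as the query's ORDER BY would do).
    for _, group in groupby(rows, key=lambda r: r[:parent_cols]):
        for i, row in enumerate(group):
            if i == 0:
                out.append(row)
            else:
                out.append((None,) * parent_cols + row[parent_cols:])
    return out

display_rows = blank_repeated_parent(rows)
for r in display_rows:
    print(r)
```

The query itself stays a plain join, and only the display layer decides which cells to leave empty.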


How to sort table alphabetically by name initial?

I have a table containing the columns 'employeename' and 'id'. How can I sort the 'employeename' column alphabetically by the initial letter of each name?
Say the table is like this now:
employeename rid eid
Dave 1 1
Ben 4 2
Chloe 6 6
I tried ORDER BY and it shows what I want, but when I query the data again with a plain SELECT, the table data shown is the same as the original, indicating that ORDER BY does not modify the data. Is this correct?
SELECT *
FROM employee
ORDER BY employeename ASC;
I expect the table data to be modified (sorted by names alphabetical order) like this:
employeename rid eid
Ben 4 2
Chloe 6 6
Dave 1 1
the table data shown is the same as the original, indicating that ORDER BY does not modify the data. Is this correct?
Yes, this is correct. A SELECT statement does not change the data in a table. Only UPDATE, DELETE, INSERT or TRUNCATE statements will change the data.
However, your question shows a misconception about how a relational database works.
Rows in a table (of a relational database) are not sorted in any way. You can picture them as balls in a basket.
If you want to display data in a specific sort order, the only (really: the only) way to do that is to use an ORDER BY in your SELECT statement. There is no alternative to that.
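A small sketch of that point, using Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the question is about Postgres, but the behaviour shown is the same). The sorted order exists only in the result set of the query that asks for it, and the stored rows are left untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employeename TEXT, rid INTEGER, eid INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Dave", 1, 1), ("Ben", 4, 2), ("Chloe", 6, 6)],
)

# The ordering applies to this result set only.
sorted_names = [
    row[0]
    for row in conn.execute(
        "SELECT employeename FROM employee ORDER BY employeename ASC"
    )
]

# The table still holds the same rows; without ORDER BY their order is
# simply unspecified, so compare them as a set.
stored_names = {row[0] for row in conn.execute("SELECT employeename FROM employee")}

print(sorted_names)
print(stored_names)
```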
Postgres allows you to define a VIEW that includes an ORDER BY, which might be an acceptable workaround for you:
CREATE VIEW sorted_employees
AS
SELECT *
FROM employee
ORDER BY employeename ASC;
Then you can simply use
select *
from sorted_employees;
But be aware of the drawbacks. If you run select * from sorted_employees order by id then the data will be sorted twice. Postgres is not smart enough to remove the (useless) order by from the view's definition.
Some related questions:
Default row order in SELECT query - SQL Server 2008 vs SQL 2012
What is the default SQL result sort order with 'select *'?
Is PostgreSQL order fully guaranteed if sorting on a non-unique attribute?
Why do results from a SQL query not come back in the order I expect?

Merging in powerquery

Although I selected a Full Outer join, I couldn't get all the rows from both tables.
How can I get all rows from both tables (all 12093 rows)?
Would another join type help?
let
Source = Table.NestedJoin(#"Beton Irsaliye Kumulatif",{"Proje No & Adi", "Firma Kodu"},#"Beton Muhasebe Kumulatif",{"Proje No & Adi", "Hesap No"},"Beton Muhasebe Kumulatif",JoinKind.FullOuter)
in
Source
Your merge is accounting for all your rows. It's just that 4 of the rows in the first table don't have matches in the second table.
Here's a simple example of what is happening. Here, I have two tables: Table1 and Table2. Both have 10 rows. In fact, both are exactly the same.
If I choose to do a Full Outer join with these, using Col1 and Col2 for matching, I'll see this:
It tells me that 10 of the rows from the first table (Table1) match rows of the second table (Table2).
Now, if I change the last two rows of Table1 (specifically, the last two rows of Col2 of Table1) like this:
Then when I try to do a Full Outer join the same way, I'll see this:
Only 8 of the rows from the first table (Table1) match rows of the second table (Table2).
But when I continue with the merge, I'll see Table1's information in a table with Table2's matching information as embedded tables in column "NewColumn" of that table:
When I then expand "NewColumn", I see all the info from Table1, as before, and all matching info from Table2, as well as rows that don't have matches between the two tables.
All rows of both tables are accounted for.
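The same row accounting can be sketched in plain Python. This is only an illustration of how a full outer join counts rows, not of Power Query's implementation; the table contents and the function name full_outer_join are made up to mirror the 10-row example above.

```python
# Hypothetical stand-ins for Table1 and Table2: each row is (Col1, Col2, payload).
table1 = [(i, f"k{i}", f"left{i}") for i in range(10)]
table2 = [(i, f"k{i}", f"right{i}") for i in range(10)]

# Change Col2 in the last two rows of table1 so they no longer match.
table1[8] = (8, "changed", "left8")
table1[9] = (9, "changed", "left9")

def full_outer_join(left, right):
    """Join on (Col1, Col2); unmatched rows from either side are kept with None."""
    right_by_key = {}
    for r in right:
        right_by_key.setdefault(r[:2], []).append(r)

    result = []
    matched_keys = set()
    for l in left:
        matches = right_by_key.get(l[:2])
        if matches:
            matched_keys.add(l[:2])
            result.extend((l, r) for r in matches)
        else:
            result.append((l, None))  # left row without a partner

    for key, rs in right_by_key.items():
        if key not in matched_keys:
            result.extend((None, r) for r in rs)  # right rows without a partner
    return result

joined = full_outer_join(table1, table2)
matched = sum(1 for l, r in joined if l is not None and r is not None)
left_only = sum(1 for l, r in joined if r is None)
right_only = sum(1 for l, r in joined if l is None)
print(matched, left_only, right_only, len(joined))  # 8 2 2 12
```

The "8 of 10 match" message corresponds to the matched count; the unmatched rows from each side still appear in the expanded result, paired with nulls.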

Condensing Left Join Result Set into one row

I have a sql query which currently, due to a couple of Left joins, returns multiple rows:-
Id Action Group
12345 NULL NULL
12345 ADD NULL
12345 NULL ABC Group
How do I go about condensing these 3 rows into one e.g.
12345 ADD ABC Group
The constraints of your question are a bit unclear. If for every ID, there is only one possible non-NULL value of the other columns, you could use MIN or MAX to pull those values out. For example, put the query with the LEFT JOINS into a CTE, then do
SELECT ID, MAX(col2),MAX(col3)
FROM CTE
GROUP BY ID
If there is the potential for multiple non-NULL values per column, you will need to be more specific about what you would want the output to look like.
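A runnable sketch of the MAX approach, using Python's sqlite3 module with an in-memory table standing in for the CTE. The table name joined and the column name grp are assumptions (GROUP is a reserved word, so it is renamed here). Aggregates such as MAX skip NULLs, which is what makes the collapse work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the result of the query with the LEFT JOINs.
conn.execute("CREATE TABLE joined (id INTEGER, action TEXT, grp TEXT)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [(12345, None, None), (12345, "ADD", None), (12345, None, "ABC Group")],
)

# MAX ignores NULLs, so each column keeps its single non-NULL value per id.
collapsed = conn.execute(
    "SELECT id, MAX(action) AS action, MAX(grp) AS grp FROM joined GROUP BY id"
).fetchall()

print(collapsed)  # [(12345, 'ADD', 'ABC Group')]
```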

Grouping data in postgresql

If I have a table with multiple entries with the same name, I want to show each name only once. The result should have as many rows as the table, but the name should appear only on the first row of its group and be blank on the other rows:
table             expected result
----------------  ------------------
col1  col2        col1  col2
a     5           a     5
a     6                 6
a     8                 8
b     3           b     3
b     4                 4
I'm using PostgreSQL 9.2.
You could use row_number to determine the first occurrence of each group, and from there, it's just a case away from not displaying it:
SELECT CASE t.rn WHEN 1 THEN t.col1 ELSE NULL END, t.col2
FROM (SELECT col1,
             col2,
             ROW_NUMBER() OVER (PARTITION BY col1
                                ORDER BY col2 ASC) AS rn
      FROM my_table) t
ORDER BY t.col1, t.col2
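To try the ROW_NUMBER approach on the sample data without a Postgres server, here is a sketch using Python's sqlite3 module. This assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added; the alias shown_col1 is introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", 5), ("a", 6), ("a", 8), ("b", 3), ("b", 4)],
)

query = """
SELECT CASE WHEN t.rn = 1 THEN t.col1 END AS shown_col1, t.col2
FROM (SELECT col1, col2,
             ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn
      FROM my_table) AS t
ORDER BY t.col1, t.col2   -- sort on the real col1, not the blanked one
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

The final ORDER BY sits on the outer query and uses the unblanked t.col1, so the rows with a NULL in the first column stay next to their group.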
First, I should say that I have no experience with PostgreSQL, just some basic SQL knowledge. You should not change the data in the original table itself; what you want is a 'view' of the data. Such things are usually done after the data set has been returned to the client: it is a question of how to display the data (a presentation matter), and it belongs on the client side rather than in the SQL query. If you do want to make the server do it, I would do the following: create a copy of the table (it can be a temp table), then clear the values in col1 on every row that is not the first of its group when the records are ordered by col2. By the way, your table does not have a primary key, so this will be hard to implement, since you cannot identify the parent record in the subsequent select.
So the idea of achieving what you need on the client side (via a data cursor), simply traversing the records one by one, has even more going for it.

Postgres: Distinct but only for one column

I have a table in PostgreSQL with names (more than 1 million rows), but it also contains many duplicates. I select 3 fields: id, name, metadata.
I want to select them randomly with ORDER BY RANDOM() and LIMIT 1000, so I do this in many steps to save some memory in my PHP script.
But how can I do that so that it only gives me a list with no duplicate names?
For example [1,"Michael Fox","2003-03-03,34,M,4545"] will be returned but not [2,"Michael Fox","1989-02-23,M,5633"]. The name field is the most important and must be unique in the list every time I do the select, and the selection must be random.
I tried GROUP BY name, but then it expects me to have id and metadata in the GROUP BY as well, or in an aggregate function, and I don't want them to be filtered in any way.
Does anyone know how to fetch many columns but do a distinct on only one column?
To do a distinct on only one (or n) column(s):
select distinct on (name)
name, col1, col2
from names
This will return any of the rows containing the name. If you want to control which of the rows will be returned you need to order:
select distinct on (name)
name, col1, col2
from names
order by name, col1
This will return, for each name, the first row when ordered by col1.
distinct on:
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
Does anyone know how to fetch many columns but do a distinct on only one column?
You want the DISTINCT ON clause.
You didn't provide sample data or a complete query so I don't have anything to show you. You want to write something like:
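SELECT DISTINCT ON (name) id, name, metadata FROM the_table;
This will return an unpredictable (but not "random") set of rows. If you want to make it predictable, add an ORDER BY per Clodaldo's answer. If you want the row kept for each name to be truly random, use ORDER BY name, random(); the DISTINCT ON expression has to come first in the ORDER BY. To also shuffle the resulting list, wrap the query in an outer SELECT with its own ORDER BY random().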
To do a distinct on n columns:
select distinct on (col1, col2) col1, col2, col3, col4 from names
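DISTINCT ON is specific to Postgres. As a portable sketch of the same "one random row per name, in random order" idea, the example below swaps in ROW_NUMBER() with a random ordering inside each name partition. It runs on Python's sqlite3 module (assuming SQLite 3.25+ for window functions). The two "Michael Fox" rows come from the question; the other rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER, name TEXT, metadata TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [
        (1, "Michael Fox", "2003-03-03,34,M,4545"),
        (2, "Michael Fox", "1989-02-23,M,5633"),
        (3, "Jane Doe", "hypothetical row"),
        (4, "Jane Doe", "another hypothetical row"),
        (5, "Sam Lee", "hypothetical row"),
    ],
)

query = """
SELECT id, name, metadata
FROM (SELECT id, name, metadata,
             -- number the rows of each name in random order
             ROW_NUMBER() OVER (PARTITION BY name ORDER BY random()) AS rn
      FROM names) AS t
WHERE rn = 1          -- keep one randomly chosen row per name
ORDER BY random()     -- shuffle the resulting list
LIMIT 1000
"""
picked = conn.execute(query).fetchall()
picked_names = [row[1] for row in picked]
print(picked)
```

Which id is returned for a duplicated name varies from run to run, but each name appears at most once.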
SELECT NAME,MAX(ID) as ID,MAX(METADATA) as METADATA
from SOMETABLE
GROUP BY NAME