SQL date ORDER BY problem

I have an image table with two or more rows that share the same date. When I ORDER BY created_date DESC, it works and the rows keep the same positions, but when I change the query and run it again, those rows come back in different positions. I don't have any other ORDER BY field, so I'm a bit confused about why this happens and how to fix it.
Can you please help with this?

To get reproducible results you need to have columns in your order by clause that together are unique. Do you have an ID column? You can use that to tie-break:
ORDER BY created_date DESC, id

I suspect that this is happening because MySQL is not given any ordering information other than ORDER BY created_date DESC, so it does whatever is most convenient for MySQL depending on its complicated inner workings (caching, indexing, etc.). Assuming you have a unique key id, you could do:
SELECT * FROM table t ORDER BY t.created_date DESC, t.id ASC
This gives you the same result every time: each additional comma-separated expression after ORDER BY is a secondary ordering rule, applied only when the preceding rule leaves two rows tied.

To get consistent results, you need to add at least one more column to the ORDER BY clause. Since the values in the created_date column are not unique, there is no defined order among rows that share a value. Storing created_date as a timestamp rather than a date makes ties less likely, but it does not guarantee uniqueness, so a unique tie-breaker column is still the reliable fix.
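As a minimal sketch of the tie-breaker idea, the query below builds a small inline table (the images name, the id column, and the sample rows are assumptions for illustration, not the asker's schema) and sorts it with a unique second key. It should run on MySQL 8.0+ or PostgreSQL.

```sql
-- Hypothetical sample data: ids 1 and 2 share the same created_date.
WITH images (id, created_date) AS (
    SELECT 1, '2012-01-05' UNION ALL
    SELECT 2, '2012-01-05' UNION ALL
    SELECT 3, '2012-01-04'
)
SELECT id, created_date
FROM images
-- created_date alone leaves ids 1 and 2 tied; id breaks the tie.
ORDER BY created_date DESC, id ASC;
```

With the tie-breaker in place, the two rows that share a date always come back as id 1 and then id 2, followed by id 3, no matter how the rest of the query changes.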

Related

GROUP BY and ordering by date that was extracted as timestamp

I have a rather simple query:
SELECT table.foo, array_agg(ARRAY[EXTRACT(epoch FROM table.date), table.bar]) AS array
FROM table
GROUP BY table.foo
ORDER BY table.date ASC;
When I run this query I get an error:
ERROR: column "table.date" must appear in the GROUP BY clause or be used in an aggregate function
I don't quite understand why that is happening because date appears in aggregate function. Is there any way to achieve that grouping?
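You can't order by a column that doesn't exist in the grouped result: after GROUP BY table.foo, each output row covers many values of table.date, so there is no single date to sort by. If you want to order the values inside the aggregation, put the ORDER BY inside the aggregate call:
SELECT table.foo, array_agg(ARRAY[EXTRACT(epoch FROM table.date), table.bar] ORDER BY table.date ASC) AS array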
FROM table
GROUP BY table.foo;
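Here is a self-contained PostgreSQL sketch of the same technique. The readings table, the measured_at and bar columns, and the sample values are made up for illustration. It also shows one possible way to order the groups themselves, by sorting on an aggregate such as min(), which is an addition and not part of the answer above.

```sql
-- Hypothetical sample data standing in for the asker's table.
WITH readings (foo, measured_at, bar) AS (
    VALUES
        ('a', TIMESTAMP '2020-01-02 00:00:00', 2.0),
        ('a', TIMESTAMP '2020-01-01 00:00:00', 1.0),
        ('b', TIMESTAMP '2020-01-03 00:00:00', 3.0)
)
SELECT foo,
       -- ORDER BY inside the aggregate sorts the elements of each array.
       array_agg(ARRAY[EXTRACT(epoch FROM measured_at), bar]
                 ORDER BY measured_at ASC) AS points
FROM readings
GROUP BY foo
-- Sorting the groups needs a grouped column or an aggregate.
ORDER BY min(measured_at) ASC;
```

Aggregating arrays into a two-dimensional array with array_agg requires PostgreSQL 9.5 or later, and every inner array must have the same length.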

PostgreSQL - order randomly, but with NULLs first

I have a query that takes all rows out of a table, and joins with another table I am updating. The other table has some items that have been checked (these get a value), and some which are not yet checked. I am trying to implement a way to update all records, but make sure any NULLs get sorted as quickly as possible. I have the following query:
SELECT * FROM posts
LEFT JOIN post_stats
ON post_stats.post_id = posts.id
ORDER BY RANDOM() NULLS FIRST LIMIT 10
However, this is ordering everything randomly. Is there a way to order everything randomly, but any NULLs get shown first?
Note that you don't even specify which column can contain NULLs in your query. This is an indicator that something is going wrong.
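The following query (replace <YOUR_COLUMN> with the column that can be NULL) should do what you want. In PostgreSQL, false sorts before true, so rows where the column IS NULL come first, and RANDOM() shuffles the rows within each of the two groups.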
SELECT *
FROM posts
LEFT JOIN post_stats ON post_stats.post_id = posts.id
ORDER BY <YOUR_COLUMN> IS NOT NULL, RANDOM()
LIMIT 10;
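To try this without touching real data, here is a small PostgreSQL sketch with inline tables. The checked_at column and the sample rows are assumptions; the question does not say which column of post_stats holds the NULLs.

```sql
-- Hypothetical data: posts 2 and 4 have no post_stats row yet.
WITH posts (id) AS (
    VALUES (1), (2), (3), (4)
),
post_stats (post_id, checked_at) AS (
    VALUES (1, DATE '2020-01-01'), (3, DATE '2020-01-02')
)
SELECT posts.id, post_stats.checked_at
FROM posts
LEFT JOIN post_stats ON post_stats.post_id = posts.id
-- IS NOT NULL is false for unchecked posts, and false sorts first.
ORDER BY post_stats.checked_at IS NOT NULL, random()
LIMIT 10;
```

Posts 2 and 4 always appear before posts 1 and 3, while the order inside each pair changes from run to run.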

Will Postgres' DISTINCT function always return null as the first element?

I'm selecting distinct values from tables through Java's JDBC connector, and it seems that the NULL value (if there is one) is always the first row in the ResultSet.
I need to remove this NULL from the List where I load this ResultSet. The logic looks only at the first element and if it's null then ignores it.
I'm not using any ORDER BY in the query, can I still trust that logic? I can't find any reference in Postgres' documentation about this.
You can filter out NULLs in the query itself with an IS NOT NULL check, like this:
select distinct columnName
from Tablename
where columnName IS NOT NULL
Also, if you do not provide an ORDER BY clause, the order in which you get the results is not guaranteed, so you cannot rely on it. It is better, and recommended, to provide an ORDER BY clause if you want your output in a particular order (i.e., ascending or descending).
If you are looking for a reference Postgresql document then it says:
If ORDER BY is not given, the rows are returned in whatever order the
system finds fastest to produce.
If it is not stated in the manual, I wouldn't trust it. However, just for fun, and to try to figure out what logic is being used: running the following query does bring the NULL to the top (for no apparent reason), while all other values come back in an apparently random order:
with t(n) as (values (1),(2),(1),(3),(null),(8),(0))
select distinct * from t
However, cross joining the table with a modified version of itself brings two NULLs to the top, with the other NULLs dispersed seemingly at random throughout the result set. So there does not seem to be any clear-cut logic that clumps all NULL values at the top.
with t(n) as (values (1),(2),(1),(3),(null),(8),(0))
select distinct * from t
cross join (select n+3 from t) t2

TSQL Keyword Previous or Last or something similar

This question is geared for those who have more SQL experience than me.
I am writing a query (it will eventually be a stored procedure, but that should be irrelevant) where I want to count rows for as long as the most recent entry's value matches the one entered just before it, and to keep counting until I hit an entry with a different value. (That is poorly explained, so here is an example.)
My table has a column 'Product_Id'. When the query runs, I want it to take the latest product_id and compare it to the previously entered product_id; if they are the same, add one, and keep checking earlier entries until it runs into a different product_id.
I'm hoping it sounds more complicated than it is, and the query would look something like
Select count(Product_ID)
FROM dbo.myTable
Where Product_Id = previous(Product_Id)
Now, I know that previous isn't a keyword in TSQL, and neither is Last, but I'm hoping someone knows a keyword that does what I am asking.
Edit for Sam
USE DbName;
GO
WITH OrderedCount as
(
select ROW_NUMBER() OVER (Order by dbo.Line_Production.Run_Date DESC) as RowNumber,
Line_Production.Product_ID
From dbo.Line_Production
)
Select RowNumber, COUNT(OrderedCount.Product_ID) as PalletCount
From OrderedCount
WHERE OrderedCount.RowNumber + 1 = RowNumber
and Product_ID = Product_ID
Group by RowNumber
The OrderedCount portion works and returns the data the way I want. I'm now having trouble comparing the Product_IDs for different RowNumbers; my WHERE clause is wrong.
There's no keyword. That would be a nice magic solution, but it doesn't exist, at least in part because there is no guaranteed ordering (okay, you could have the keyword only if there is an ORDER BY...). I can write you a query, but that'll take time, so for now I'll give you a few steps and I'll come back and see if you still need help in a bit.
Figure out an ORDER BY, otherwise no order is guaranteed. If there is a time entered field, that's a good choice, or an index, that works too.
Learn to use Row_Number.
Compare the table (with Row_Number) to itself where instance1.row - 1 = instance2.row.
If product_id is an identity column, couldn't you just do product_id - 1? In other words, if it's sequential, it's the same as using ROW_NUMBER mentioned in the previous comment.
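As one possible way to put those steps together, here is a sketch that numbers the rows newest first, finds the first row whose product differs from the newest one, and counts everything before it. It is not the answerer's promised query. The inline Line_Production data is invented for illustration, and the sketch assumes Run_Date values are unique so the ordering is well defined.

```sql
-- Hypothetical sample data: the three newest rows share Product_ID 20.
WITH Line_Production (Run_Date, Product_ID) AS (
    SELECT CAST('2020-01-01' AS date), 10 UNION ALL
    SELECT CAST('2020-01-02' AS date), 20 UNION ALL
    SELECT CAST('2020-01-03' AS date), 20 UNION ALL
    SELECT CAST('2020-01-04' AS date), 20
),
Ordered AS (
    -- Row 1 is the most recent entry.
    SELECT Product_ID,
           ROW_NUMBER() OVER (ORDER BY Run_Date DESC) AS RowNumber
    FROM Line_Production
),
FirstBreak AS (
    -- First row whose product differs from the newest row's product;
    -- NULL when every row has the same product.
    SELECT MIN(o.RowNumber) AS BreakRow
    FROM Ordered AS o
    JOIN Ordered AS newest ON newest.RowNumber = 1
    WHERE o.Product_ID <> newest.Product_ID
)
SELECT COUNT(*) AS PalletCount
FROM Ordered AS o
CROSS JOIN FirstBreak AS b
WHERE b.BreakRow IS NULL OR o.RowNumber < b.BreakRow;
```

For the sample rows this counts 3, the length of the most recent run of Product_ID 20. If several rows can share a Run_Date, add a unique tie-breaker column to the ORDER BY inside ROW_NUMBER.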

group by date aggregate function in postgresql

I'm getting an error running this query
SELECT date(updated_at), count(updated_at) as total_count
FROM "persons"
WHERE ("persons"."updated_at" BETWEEN '2012-10-17 00:00:00.000000' AND '2012-11-07 12:25:04.082224')
GROUP BY date(updated_at)
ORDER BY persons.updated_at DESC
I get the error ERROR: column "persons.updated_at" must appear in the GROUP BY clause or be used in an aggregate function LINE 5: ORDER BY persons.updated_at DESC
This works if I remove the date() function from the GROUP BY, but I'm using the date function because I want to group by date, not datetime.
Any ideas?
At the moment it is unclear what you want Postgres to return. You say it should order by persons.updated_at but you do not retrieve that field from the database.
I think, what you want to do is:
SELECT date(updated_at), count(updated_at) as total_count
FROM "persons"
WHERE ("persons"."updated_at" BETWEEN '2012-10-17 00:00:00.000000' AND '2012-11-07 12:25:04.082224')
GROUP BY date(updated_at)
ORDER BY count(updated_at) DESC -- this line changed!
Now you are explicitly telling the DB to sort by the resulting value of the COUNT aggregate. You could also use ORDER BY 2 DESC, which tells the database to sort by the second column in the result set. However, I strongly prefer stating the column explicitly for clarity.
Note that I'm currently unable to test this query, but I do think this should work.
The problem is that, because you are grouping by date(updated_at), updated_at itself is not unique within a group: different values of updated_at can return the same value for date(updated_at). You need to tell the database which of the possible values it should use, or alternatively order by the value produced by the GROUP BY. Probably one of:
SELECT date(updated_at) FROM persons GROUP BY date(updated_at)
ORDER BY date(updated_at)
or
SELECT date(updated_at) FROM persons GROUP BY date(updated_at)
ORDER BY min(updated_at)
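Applied to the original query, the first option might look like the sketch below. The inline persons rows are made up so the query can be run on its own in PostgreSQL, and the updated_on alias is an assumption added for readability.

```sql
-- Hypothetical sample data: two updates on one day, one on the next.
WITH persons (updated_at) AS (
    VALUES (TIMESTAMP '2012-10-17 09:00:00'),
           (TIMESTAMP '2012-10-17 15:30:00'),
           (TIMESTAMP '2012-10-18 08:00:00')
)
SELECT date(updated_at) AS updated_on,
       count(updated_at) AS total_count
FROM persons
GROUP BY date(updated_at)
-- Sort by the grouped expression, not the raw timestamp.
ORDER BY date(updated_at) DESC;
```

This returns one row per day, newest day first: 2012-10-18 with a count of 1, then 2012-10-17 with a count of 2.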