I have a SQL table with a datetime field that can be null. I want the query results sorted in ascending order by that field, but with the rows where the field is null at the end of the list rather than at the beginning.
Is there a simple way to accomplish that?
select MyDate
from MyTable
order by case when MyDate is null then 1 else 0 end, MyDate
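For anyone who wants to try the CASE approach without a server at hand, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (the question does not name a DBMS), and the sample dates are made up.

```python
import sqlite3

def sort_nulls_last(dates):
    """Return the dates sorted ascending, with NULLs (None) at the end."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (MyDate TEXT)")
    conn.executemany("INSERT INTO MyTable VALUES (?)", [(d,) for d in dates])
    rows = conn.execute(
        "SELECT MyDate FROM MyTable "
        # The CASE yields 0 for real values and 1 for NULLs, so NULLs sort last;
        # MyDate then orders the non-null rows among themselves.
        "ORDER BY CASE WHEN MyDate IS NULL THEN 1 ELSE 0 END, MyDate"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

result = sort_nulls_last(["2014-03-05", None, "2013-11-15", None])
print(result)  # ['2013-11-15', '2014-03-05', None, None]
```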
(A "bit" late, but this hasn't been mentioned at all)
You didn't specify your DBMS.
In standard SQL (and most modern DBMS like Oracle, PostgreSQL, DB2, Firebird, Apache Derby, HSQLDB and H2) you can specify NULLS LAST or NULLS FIRST:
Use NULLS LAST to sort them to the end:
select *
from some_table
order by some_column DESC NULLS LAST
I also just stumbled across this and the following seems to do the trick for me, on MySQL and PostgreSQL:
ORDER BY date IS NULL, date DESC
as found at https://stackoverflow.com/a/7055259/496209
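A quick way to see why this works: the expression date IS NULL evaluates to 0 or 1, so it acts as the first sort key. The sketch below checks it with Python's sqlite3 module; SQLite is my assumption (the answer above only mentions MySQL and PostgreSQL), and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "events" and "event_date" are made-up names for this illustration.
conn.execute("CREATE TABLE events (event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("2013-11-15",), (None,), ("2014-03-05",), ("2012-06-01",)],
)

# First key: 0 for non-null rows, 1 for NULL rows. Second key: newest first.
newest_first = [
    row[0]
    for row in conn.execute(
        "SELECT event_date FROM events ORDER BY event_date IS NULL, event_date DESC"
    )
]
print(newest_first)  # ['2014-03-05', '2013-11-15', '2012-06-01', None]
```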
If your engine allows ORDER BY x IS NULL, x or ORDER BY x NULLS LAST, use that. If it doesn't, these might help:
If you're sorting by a numeric type you can do this: (Borrowing the schema from another answer.)
SELECT *
FROM Employees
ORDER BY ISNULL(DepartmentId*0,1), DepartmentId;
Any non-null number becomes 0, and nulls become 1, which sorts nulls last because 0 < 1.
You can also do this for strings:
SELECT *
FROM Employees
ORDER BY ISNULL(LEFT(LastName,0),'a'), LastName
Any non-null string becomes '', and nulls become 'a', which sorts nulls last because '' < 'a'.
This even works with dates by coercing to a nullable int and using the method for ints above:
SELECT *
FROM Employees
ORDER BY ISNULL(CONVERT(INT, HireDate)*0, 1), HireDate
(Let's pretend the schema has HireDate.)
These methods avoid the issue of having to come up with or manage a "maximum" value of every type or fix queries if the data type (and the maximum) changes (both issues that other ISNULL solutions suffer). Plus they're much shorter than a CASE.
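Here is a rough, runnable sketch of the numeric and string tricks above. It uses Python's sqlite3 module, so two things are swapped in and are my assumptions rather than part of the answer: SQLite's IFNULL(x, y) stands in for the two-argument ISNULL, and substr(LastName, 1, 0) stands in for LEFT(LastName, 0). The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (LastName TEXT, DepartmentId INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("Zed", 3), (None, None), ("Adams", 1)],
)

# Numeric: any non-null number times 0 is 0, NULL times 0 stays NULL and becomes 1.
by_department = [
    row[0]
    for row in conn.execute(
        "SELECT DepartmentId FROM Employees "
        "ORDER BY IFNULL(DepartmentId * 0, 1), DepartmentId"
    )
]

# String: a zero-length prefix of any non-null string is '', NULL becomes 'a'.
by_last_name = [
    row[0]
    for row in conn.execute(
        "SELECT LastName FROM Employees "
        "ORDER BY IFNULL(substr(LastName, 1, 0), 'a'), LastName"
    )
]

print(by_department)  # [1, 3, None]
print(by_last_name)   # ['Adams', 'Zed', None]
```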
You can use the built-in ISNULL function to check whether the value is null, as below. I tested it and it works fine.
select MyDate from MyTable order by ISNULL(MyDate,1) DESC, MyDate ASC;
order by coalesce(date-time-field,large date in future)
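A small sketch of the COALESCE idea, run through Python's sqlite3 module. The sentinel '9999-12-31' and the ISO-formatted text dates are assumptions chosen for the demo; on a real DBMS you would pick a sentinel of the column's own date type that is later than any stored value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (MyDate TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?)",
    [("2014-03-05",), (None,), ("2013-11-15",)],
)

# NULLs are replaced by a far-future sentinel for sorting only;
# the selected column still shows the original NULL.
sentinel_sorted = [
    row[0]
    for row in conn.execute(
        "SELECT MyDate FROM MyTable ORDER BY COALESCE(MyDate, '9999-12-31')"
    )
]
print(sentinel_sorted)  # ['2013-11-15', '2014-03-05', None]
```

The drawback is the one mentioned elsewhere in this thread: the sentinel has to be maintained per data type.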
When your order column is numeric (like a rank) you can multiply it by -1 and then order descending. On engines that sort NULLs lowest, this keeps the order you're expecting but puts the NULLs last.
select *
from table
order by -rank desc
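To make the negation trick concrete, here is a minimal check with Python's sqlite3 module. SQLite sorts NULLs lowest, which is what the trick relies on; the table name "scores" and column name "rank_value" are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (rank_value INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)", [(3,), (None,), (1,), (2,)])

# -rank_value turns 1, 2, 3 into -1, -2, -3; sorting those descending gives
# 1, 2, 3 again. -NULL is still NULL, which sorts lowest and so comes last.
ranks = [
    row[0]
    for row in conn.execute("SELECT rank_value FROM scores ORDER BY -rank_value DESC")
]
print(ranks)  # [1, 2, 3, None]
```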
In Oracle, you can use NULLS FIRST or NULLS LAST, which specify that NULL values should be returned before or after non-NULL values:
ORDER BY column-Name [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
For example:
ORDER BY date DESC NULLS LAST
Ref: http://docs.oracle.com/javadb/10.8.3.0/ref/rrefsqlj13658.html
If you're using MariaDB, they mention the following in the NULL Values
documentation.
Ordering
When you order by a field that may contain NULL values, any NULLs are
considered to have the lowest value. So ordering in DESC order will see the
NULLs appearing last. To force NULLs to be regarded as highest values, one can
add another column which has a higher value when the main field is NULL.
Example:
SELECT col1 FROM tab ORDER BY ISNULL(col1), col1;
Descending order, with NULLs first:
SELECT col1 FROM tab ORDER BY IF(col1 IS NULL, 0, 1), col1 DESC;
All NULL values are also regarded as equivalent for the purposes of the
DISTINCT and GROUP BY clauses.
The above shows two ways to order by NULL values; you can combine these with the
ASC and DESC keywords as well. For example, the other way to get the NULL values
first would be:
SELECT col1 FROM tab ORDER BY ISNULL(col1) DESC, col1;
--                                         ^^^^
SELECT *
FROM Employees
ORDER BY ISNULL(DepartmentId, 99999);
See this blog post.
Thanks to RedFilter for providing an excellent solution to the nagging issue of sorting a nullable datetime field.
I am using a SQL Server database for my project.
Changing the null datetime value to '1' does solve the sorting problem for a datetime column. However, it does not work for columns of other data types.
To sort a varchar column, I tried using 'ZZZZZZZ', as I knew the column does not have values beginning with 'Z'. It worked as expected.
Along the same lines, I used the maximum value + 1 for int and other data types to get the expected sort. This also gave me the required results.
However, it would be ideal to have something simpler in the database engine itself that could do something like:
Order by Col1 Asc Nulls Last, Col2 Asc Nulls First
As mentioned in the answer provided by a_horse_with_no_name.
The solution using CASE is universal, but it does not use indexes.
order by case when MyDate is null then 1 else 0 end, MyDate
In my case, I needed performance.
SELECT someCol1,someCol2
FROM someSch.someTab
WHERE someCol2 = 2101 and ( someCol1 IS NULL )
UNION
SELECT someCol1,someCol2
FROM someSch.someTab
WHERE someCol2 = 2101 and ( someCol1 IS NOT NULL)
Use the NVL function:
select * from MyTable order by NVL(MyDate, to_date('1-1-1','DD-MM-YYYY'))
Here are the alternatives to NVL in the most popular DBMSs
order by -cast([nativeDateModify] as bigint) desc
For a project, I want to get all results grouped by day.
Here is my query:
SELECT MAX(id) AS id,
SUM(value) AS value,
country,
cast(TO_CHAR(date, 'dd/mm/yyyy') AS DATE) AS date
FROM records
GROUP BY date, country
My problem is that records are not grouped correctly when I use my "date" alias; instead, it seems to group by the field name.
Results with group by alias
It works if I use indices instead of the alias, but I'd like to have the column's name in my result:
SELECT MAX(id) AS id,
SUM(value) AS value,
country,
cast(TO_CHAR(date, 'dd/mm/yyyy') AS DATE) AS date
FROM records
GROUP BY 3, 4
Results with group by indices
Does anyone have an idea why it works this way?
Quote from the manual
An expression used inside a grouping_element can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name
(emphasis mine)
So the (input) column names always have precedence over column aliases.
The two GROUP BY clauses are not equivalent.
In both, the SELECT clause is:
SELECT
MAX(id) AS id,
SUM(value) AS value,
country,
cast(TO_CHAR(date, 'dd/mm/yyyy') AS DATE) AS date
So the columns will be (id, value, country, date).
The first query groups by date then country:
GROUP BY date, country
The second groups by country then date:
GROUP BY 3, 4
With different hierarchy of GROUP BY you'll get different results, such as what you show.
I have a strange problem retrieving records from the database when I compare a field truncated with date_trunc().
This query doesn't return any data:
select id from my_db_log
where date_trunc('day',creation_date) >= to_date('2014-03-05'::text,'yyyy-mm-dd');
But if I select the column creation_date along with id, then it returns data (i.e. select id, creation_date...).
I have another column, last_update_date, of the same type, and when I use that one I get the same behavior.
select id from my_db_log
where date_trunc('day',last_update_date) >= to_date('2014-03-05'::text,'yyyy-mm-dd');
As with the previous one, it also returns records if I select id, last_update_date.
To dig further, I added both creation_date and last_update_date to my where clause, and this time it requires both of them in my select clause to return records (i.e. select id, creation_date, last_update_date).
Has anyone ever encountered the same problem? The same thing works with my other tables that have columns of this type!
If it helps, here is my table schema:
id serial NOT NULL,
creation_date timestamp without time zone NOT NULL DEFAULT now(),
last_update_date timestamp without time zone NOT NULL DEFAULT now(),
CONSTRAINT db_log_pkey PRIMARY KEY (id),
I asked a different question earlier that didn't get any answer. This problem may be related to that one. If you are interested in that one, here is the link.
EDIT: EXPLAIN (FORMAT XML) with select * returns:
<explain xmlns="http://www.postgresql.org/2009/explain">
<Query>
<Plan>
<Node-Type>Result</Node-Type>
<Startup-Cost>0.00</Startup-Cost>
<Total-Cost>0.00</Total-Cost>
<Plan-Rows>1000</Plan-Rows>
<Plan-Width>658</Plan-Width>
<Plans>
<Plan>
<Node-Type>Result</Node-Type>
<Parent-Relationship>Outer</Parent-Relationship>
<Alias>my_db_log</Alias>
<Startup-Cost>0.00</Startup-Cost>
<Total-Cost>0.00</Total-Cost>
<Plan-Rows>1000</Plan-Rows>
<Plan-Width>658</Plan-Width>
<Node/s>datanode1</Node/s>
<Coordinator-quals>(date_trunc('day'::text, creation_date) >= to_date('2014-03-05'::text, 'yyyy-mm-dd'::text))</Coordinator-quals>
</Plan>
</Plans>
</Plan>
</Query>
</explain>
"Impossible" phenomenon
The number of rows returned is completely independent of items in the SELECT clause. (But see @Craig's comment about SRFs.) Something must be broken in your db.
Maybe a broken covering index? When you throw in the additional column, you force Postgres to visit the table itself. Try to re-index:
REINDEX TABLE my_db_log;
The manual on REINDEX. Or:
VACUUM FULL ANALYZE my_db_log;
Better query
Either way, use instead:
select id from my_db_log
where creation_date >= '2014-03-05'::date
Or:
select id from my_db_log
where creation_date >= '2014-03-05 00:00'::timestamp
'2014-03-05' is in ISO 8601 format. You can just cast this string literal to date. No need for to_date(), works with any locale. The date is coerced to timestamp [without time zone] automatically when compared to creation_date (being timestamp [without time zone]). More details about timestamps in Postgres here:
Ignoring timezones altogether in Rails and PostgreSQL
Also, you gain nothing by throwing in date_trunc() here. On the contrary, your query will be slower and any plain index on the column cannot be used (potentially making this much slower).
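The point about comparing the bare column can be tried out in miniature with Python's sqlite3 module. This is only an analogy and rests on several assumptions: SQLite stands in for Postgres, date() stands in for date_trunc('day', ...), timestamps are stored as ISO text, and the index name is made up. Both predicates select the same rows; the printed plan lines typically show an index search only for the bare column, though the exact wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_db_log (id INTEGER PRIMARY KEY, creation_date TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_creation_date ON my_db_log (creation_date)")
conn.executemany(
    "INSERT INTO my_db_log (creation_date) VALUES (?)",
    [("2014-03-04 23:59:59",), ("2014-03-05 00:00:00",), ("2014-03-06 12:30:00",)],
)

wrapped = "SELECT id FROM my_db_log WHERE date(creation_date) >= '2014-03-05'"
plain = "SELECT id FROM my_db_log WHERE creation_date >= '2014-03-05'"

def ids(sql):
    """Run the query and return the matching ids in ascending order."""
    return sorted(row[0] for row in conn.execute(sql))

def first_plan_step(sql):
    """Return the detail text of the first EXPLAIN QUERY PLAN row."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][3]

wrapped_ids = ids(wrapped)
plain_ids = ids(plain)
print(wrapped_ids, plain_ids)    # [2, 3] [2, 3]
print(first_plan_step(wrapped))  # typically a full SCAN
print(first_plan_step(plain))    # typically a SEARCH using the index
```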
I have created a table with a column date_time of type varchar2(40), but when I try to insert the current system date and time it doesn't work and gives an error (too many values). Please tell me what's wrong with the insert statement.
create table HR (type varchar2 (20), raised_by number (6), complaint varchar2 (500), date_time varchar2(40))
insert into HR values ('request',6785,'good morning',sysdate,'YYYY/MM/DD:HH:MI:SSAM')
The immediate cause of the error is that you have too many values, as the message says; that is, more elements in your values clause than there are columns. It is better to explicitly list the column names to avoid future problems and confusion, so you're really doing this:
insert into HR (type, raised_by, complaint, date_time)
values ('request',6785,'good morning',sysdate,'YYYY/MM/DD:HH:MI:SSAM')
... so you have four columns, but five values. You're trying to insert the current date/time as a string so you would need to use the to_char() function:
insert into HR (type, raised_by, complaint, date_time)
values ('request',6785,'good morning',
to_char(sysdate,'YYYY/MM/DD:HH:MI:SSAM'))
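The column/value mismatch is easy to reproduce outside Oracle. The sketch below uses Python's sqlite3 module as a stand-in (an assumption; the error text differs from Oracle's), with a fixed, made-up timestamp instead of sysdate and Python's strftime in place of to_char(). The formatted string matches the question's layout as long as the locale renders %p as AM/PM.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HR (type TEXT, raised_by INTEGER, complaint TEXT, date_time TEXT)"
)

# Five values for four columns: the statement is rejected before anything is stored.
try:
    conn.execute(
        "INSERT INTO HR VALUES (?, ?, ?, ?, ?)",
        ("request", 6785, "good morning", "a date", "YYYY/MM/DD:HH:MI:SSAM"),
    )
    error = None
except sqlite3.Error as exc:
    error = str(exc)
print(error)

# Format the date into ONE string first, then insert four values for four columns.
stamp = datetime(2013, 11, 15, 8, 44, 35).strftime("%Y/%m/%d:%I:%M:%S%p")
conn.execute(
    "INSERT INTO HR (type, raised_by, complaint, date_time) VALUES (?, ?, ?, ?)",
    ("request", 6785, "good morning", stamp),
)
stored = conn.execute("SELECT date_time FROM HR").fetchall()
print(stored)
```

That gets the insert working with the table exactly as it was defined in the question.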
But it is bad practice to store a date (or any other structured data, such as a number) as a string. As the documentation notes:
Each value manipulated by Oracle Database has a data type. The data
type of a value associates a fixed set of properties with the value.
These properties cause Oracle to treat values of one data type
differently from values of another. For example, you can add values of
NUMBER data type, but not values of RAW data type.
If you use a string then you can put invalid values in. If you use a proper DATE data type then you cannot accidentally put an invalid or confusing value in. Oracle will also be able to optimise the use of the column, and will be able to compare values safely and efficiently. Although the format you're using is better than some, using string comparison you still can't easily compare two values to see which is earlier, so you can't properly order by the date_time column for example.
Say you inserted two rows with values 2013/11/15:09:00:00AM and 2013/11/15:08:00:00PM - which is earlier? You need to look at the AM/PM marker to realise the first one is earlier; with a string comparison you'd get it wrong because 8 would be sorted before 9. Using HH24 instead of HH and AM avoids that, but would still be less efficient than a true date.
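The two example values can be compared directly to see the problem. This is a small Python sketch rather than Oracle SQL; the format string is my translation of YYYY/MM/DD:HH:MI:SSAM into strptime directives.

```python
from datetime import datetime

# Python equivalent (assumed) of the Oracle mask YYYY/MM/DD:HH:MI:SSAM.
FORMAT = "%Y/%m/%d:%I:%M:%S%p"

morning = "2013/11/15:09:00:00AM"
evening = "2013/11/15:08:00:00PM"

# Plain string sort compares character by character, so "08" beats "09".
string_order = sorted([morning, evening])

# Sorting on the parsed value takes the AM/PM marker into account.
date_order = sorted([morning, evening], key=lambda s: datetime.strptime(s, FORMAT))

print(string_order)  # ['2013/11/15:08:00:00PM', '2013/11/15:09:00:00AM']
print(date_order)    # ['2013/11/15:09:00:00AM', '2013/11/15:08:00:00PM']
```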
If you need to store a date with a time component you can use the DATE data type, which has precision down to the second; or if you need fractional seconds too then you can use TIMESTAMP. Then your table and insert would be:
create table HR (type varchar2 (20), raised_by number (6),
complaint varchar2 (500), date_time date);
insert into HR (type, raised_by, complaint, date_time)
values ('request',6785,'good morning',sysdate);
You can still get the value in the format you wanted for display purposes as part of a query:
select type, raised_by, complaint,
to_char(date_time, 'YYYY/MM/DD:HH:MI:SSAM') as date_time
from HR
order by date_time;
TYPE RAISED_BY COMPLAINT DATE_TIME
-------------------- ---------- -------------------- ---------------------
request 6785 good morning 2013/11/15:08:44:35AM
Only treat a date as a string for display.
You can use the TO_DATE(), TO_TIMESTAMP() or TO_CHAR() functions:
insert into HR values ('request',6785,'good morning',TO_DATE(sysdate, 'yyyy/mm/dd hh24:mi:ss'))
insert into HR values ('request',6785,'good morning',TO_TIMESTAMP(systimestamp, 'yyyy/mm/dd hh24:mi:ss'))
sysdate - gives the date with the time.
systimestamp - gives the date and time with fractional seconds.
To_date() - converts a string to a date.
To_char() - converts a date to a string.
Here you probably have to use To_char(), because your table definition has a varchar2 type for the date_time column.
Use TIMESTAMP datatype for date_time. And while inserting use the current timestamp.
create table HR (type varchar2(20), raised_by number(6), complaint varchar2(500), date_time timestamp);
insert into HR values ('request',6785,'good morning', systimestamp);
For other options: http://psoug.org/reference/timestamp.html
In this data - there are multiple DATA_ID values associated with time-series data. I am trying to exclude all data from any DATA_ID values that return a NULL value for USE for any timestamp value.
In other words, I only want to return DATA_ID values (and their data) if they have complete (not any NULL) values for all timestamp values.
Sample query given below:
SELECT
My.Table.DATA_ID,
MY.Table.timestamp,
My.Table.USE
FROM
My.TABLE
WHERE timestamp BETWEEN '2012-06-01 00:00:00' AND '2012-06-02 23:59:59'
-- Something here that says exclude all data from DATA_ID(s)
-- with any missing USE data, i.e. USE=NULL
ORDER BY DATA_ID, timestamp
Assuming I understand your question correctly, you want to exclude whole batches of samples (those sharing a data_id within the timestamp range) that contain a null value:
SELECT
o.DATA_ID,
o.timestamp,
o.USE
FROM
My.TABLE o
WHERE o.timestamp BETWEEN '2012-06-01 00:00:00' AND '2012-06-02 23:59:59'
and not exists (select 1 from My.TABLE i
where i.use is null
and i.data_id = o.data_id
and i.timestamp BETWEEN '2012-06-01 00:00:00' AND '2012-06-02 23:59:59')
ORDER BY o.DATA_ID, o.timestamp
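Here is a self-contained sketch of the NOT EXISTS filter, using Python's sqlite3 module so it can be run as-is. The table name "readings", the column names "ts" and "use_value", and the sample rows are all hypothetical stand-ins for the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (data_id INTEGER, ts TEXT, use_value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, "2012-06-01 00:00:00", 5.0),
        (1, "2012-06-01 01:00:00", 6.0),
        (2, "2012-06-01 00:00:00", 7.0),
        (2, "2012-06-01 01:00:00", None),  # data_id 2 has a gap, so all of it is dropped
        (3, "2012-06-02 10:00:00", 8.0),
    ],
)

start, end = "2012-06-01 00:00:00", "2012-06-02 23:59:59"
complete_rows = conn.execute(
    """
    SELECT o.data_id, o.ts, o.use_value
    FROM readings o
    WHERE o.ts BETWEEN ? AND ?
      AND NOT EXISTS (SELECT 1
                      FROM readings i
                      WHERE i.use_value IS NULL
                        AND i.data_id = o.data_id
                        AND i.ts BETWEEN ? AND ?)
    ORDER BY o.data_id, o.ts
    """,
    (start, end, start, end),
).fetchall()

kept_ids = [row[0] for row in complete_rows]
print(kept_ids)  # [1, 1, 3]
```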
The simple thing to do is something like this:
CREATE FUNCTION missing_info(MY.TABLE)
RETURNS BOOL
LANGUAGE SQL AS
$$ select $1.use is null -- chain conditions together with or.
-- no from clause needed. no where clause needed.
$$;
Then you can just add:
where (My.Table).missing_info is not true;
And when you need to change the logic for what counts as missing info, you can just change it in the function and everything still works.
This is the sort of encapsulation of derived information where ORDBMSs like PostgreSQL really shine.
Edit: Re-reading your example, it looks like what you are looking for is the IS NULL operator. However if you need to re-use some sort of logic, see the above example. NULL never "equals" NULL (because we can't say whether two unknown values are the same). But IS NULL tells you whether it is NULL or not.
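The difference between = NULL and IS NULL can be checked in a few lines. This sketch uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption; both follow the same three-valued logic here), with hypothetical table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL is unknown (returned as None); NULL IS NULL is true (returned as 1).
equals_result, is_null_result = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL"
).fetchone()
print(equals_result, is_null_result)  # None 1

conn.execute("CREATE TABLE readings (data_id INTEGER, use_value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [(1, 5.0), (2, None)])

# "= NULL" never matches a row, while "IS NULL" finds the missing value.
with_equals = conn.execute(
    "SELECT data_id FROM readings WHERE use_value = NULL"
).fetchall()
with_is_null = conn.execute(
    "SELECT data_id FROM readings WHERE use_value IS NULL"
).fetchall()
print(with_equals, with_is_null)  # [] [(2,)]
```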