What happens when I GROUP BY a nullable column in Teradata? Are the records whose value in that column is NULL eliminated?
Please explain with an example!
Every DBMS treats NULLs in the GROUP BY columns like any other value and puts them all into one group. What gets eliminated are NULLs in the aggregated columns.
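A minimal illustration (generic SQL; the table and data are made up):

CREATE TABLE t (grp VARCHAR(10), val INTEGER);
INSERT INTO t VALUES ('a', 1);
INSERT INTO t VALUES ('a', NULL);
INSERT INTO t VALUES (NULL, 2);
INSERT INTO t VALUES (NULL, 3);

SELECT grp, COUNT(*) AS rows_in_group, COUNT(val) AS non_null_vals
FROM t
GROUP BY grp;

-- grp  | rows_in_group | non_null_vals
-- a    |       2       |       1        (the NULL val is skipped by COUNT(val))
-- NULL |       2       |       2        (rows with a NULL grp are kept, as one group)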
Why don't you simply test it?
I have a database I am importing, and it has a unique index on a field that is null. In PostgreSQL 13 this was not an issue, but in 14 it no longer allows the import, as if null were no longer null but a value.
Is there a setting where null is treated the way it should be instead of as a value?
The behavior has not changed in PostgreSQL v14. If the import doesn't work in the database, the only possible explanation is that you have defined the column NOT NULL in one database, but not in the other one (or used a similar check constraint).
PostgreSQL v15 introduces this standard-conforming additional clause for unique constraints:
UNIQUE NULLS [NOT] DISTINCT
If you define a unique constraint with NULLS NOT DISTINCT in v15, it will behave differently from prior versions. However, the default is still UNIQUE NULLS DISTINCT.
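A minimal sketch of the difference (assuming a v15 server; the table and column are made up):

CREATE TABLE t (x integer UNIQUE NULLS NOT DISTINCT);
INSERT INTO t VALUES (NULL);  -- succeeds
INSERT INTO t VALUES (NULL);  -- fails: duplicate key value violates unique constraint

With the default UNIQUE NULLS DISTINCT (and on all prior versions), both inserts succeed, because two NULLs are never considered equal.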
When I query the table, the null values are being set to '' during the import, so it fails after the first row. Not sure what changed (other than upgrading to 14.5). Going to reach out to the importer company; I can insert multiple null values myself, so something's up on their end.
Prerequisite: the schema of the MongoDB documents is unknown.
The documents contain both null values and missing values.
For instance, CCNO in the following documents
(1) has a value,
(2) has a null value,
(3) is missing.
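(As a hypothetical illustration, since the question's actual documents are not shown, the three cases would look like:)

(1) { "_id": 1, "CCNO": "1234" }
(2) { "_id": 2, "CCNO": null }
(3) { "_id": 3 }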
I am fetching them in Spark using MongoSpark.load(SparkSession, ReadConfig), but it replaces the missing values with nulls in the DataFrame.
Please suggest a way to differentiate between the manually inserted nulls and the Spark-inserted nulls (for the missing values).
I have a query like this:
SELECT companies.id, companies.code, MAX(disclosures.filed_at) disclosure_filed_at
FROM "companies" INNER JOIN "disclosures" ON "disclosures"."company_id" = "companies"."id"
GROUP BY companies.id
This query works in PostgreSQL 9.5.2, but it fails in version 8.4.20 with an error:
PG::GroupingError: ERROR: column "companies.code" must appear in the GROUP BY clause or be used in an aggregate function
If I add companies.code to the GROUP BY clause, then it works. But when I select companies.*, I can't group by companies.*.
Do I have to list every column in the GROUP BY clause in version 8.4 if I want to use *?
The Postgres behavior is supported by the ANSI standard. The reason is that id not only identifies each row in companies, it is declared to do so (using a unique or primary key constraint, although I'm not sure if this works in Postgres for a unique constraint).
Hence, the database knows that it can safely refer to any other column from the same row. This is called "functional dependency".
This feature has also now been added to MySQL (documented here). You might find that documentation easier to follow than the Postgres description:
When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
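To make that concrete, here is a sketch (assuming companies.id is the primary key; the functional-dependency exception was added in PostgreSQL 9.1, which would explain why 8.4.20 rejects the query while 9.5.2 accepts it):

-- accepted on PostgreSQL 9.1+: companies.code is functionally
-- dependent on the grouped primary key companies.id
SELECT c.id, c.code, MAX(d.filed_at) AS disclosure_filed_at
FROM companies c
INNER JOIN disclosures d ON d.company_id = c.id
GROUP BY c.id;

-- equivalent that also runs on 8.4: list every non-aggregated column
SELECT c.id, c.code, MAX(d.filed_at) AS disclosure_filed_at
FROM companies c
INNER JOIN disclosures d ON d.company_id = c.id
GROUP BY c.id, c.code;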
Alright, this query will not run and returns a data type mismatch error. This error did not start until I attempted to join two tables within the sub-query in the WHERE clause.
The two tables I am attempting to join are the exact two tables that are joined in the primary query. Each table contains a column, CRD Number. In one table it is stored as text; in the other, as a number. This is why I have used the CStr() function to cast the numeric column to text. This worked beautifully for the primary query (thanks, Stack Overflow!), but when I attempted the same exact join in the sub-query, I received the data type mismatch error.
Here is a picture, with a small arrow and text box to highlight the area which I (99% sure) believe is causing the problem. Again, the problem only arose when I attempted to join these two tables in the sub-query. If the join is removed, the query will run. (Although the sub-query will not return the correct results, thus making my primary query useless)
*All my Access DBs are set to accept standard T-SQL syntax, so I will also tag this as T-SQL.
I think the difference is probably that in the primary query you are excluding the cases where CRD Number is null, but in the subquery you are not. I don't have Access installed to be able to test, but I would bet it will work if you add this to the subquery's WHERE clause:
sp.CRD_NUMBER is not null and dtp.CRD_NUMBER is not null
I suspect that the CStr() function is having a problem with null values, but then again, I can't test this to see whether that would make a difference.
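For what it's worth, a hedged sketch of the suggested fix (sp and dtp are the aliases from the condition above; the table names and selected column are placeholders, since the actual subquery isn't shown):

SELECT sp.SOME_COLUMN
FROM stock_table AS sp
INNER JOIN dtp_table AS dtp
  ON CStr(sp.CRD_NUMBER) = dtp.CRD_NUMBER
WHERE sp.CRD_NUMBER IS NOT NULL
  AND dtp.CRD_NUMBER IS NOT NULL

The idea is that CStr() cannot convert Null (in VBA, CStr(Null) raises "Invalid use of Null"), so filtering out the NULL rows before the conversion should sidestep the mismatch.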
I've noticed that while I can use %dictionary.compiledclass to get a table with schema names and table names, which allows querying for their existence, I cannot do the same for columns. I have yet to find the query that allows verifying whether a column exists, or retrieving multiple column names using LIKE 'prefix%'.
Is there even such a thing? Or an alternative?
You can use the %Dictionary.CompiledProperty table and its SqlFieldName column.
For example, to find the tables that have a column 'ColumnName', you can use this query:
select parent->SqlTableName
from %dictionary.compiledproperty
where SqlFieldName='ColumnName'
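Since SqlFieldName is an ordinary column, the same table also covers the LIKE part of the question (a sketch; substitute your own prefix):

select parent->SqlTableName, SqlFieldName
from %dictionary.compiledproperty
where SqlFieldName like 'prefix%'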
Execute this query:
select * from %dictionary.compiledproperty where parent='TableName' and SqlFieldName='ColumnName'
Then check the row count: 0 means the column does not exist.