How to handle NULL in a visualization - Qlik Sense

I have two tables (a simplified example, of course) that I loaded into the app from different sources via the load script.

Table 1:

ID | Attribute T1
1  | 100
3  | 200

Table 2:

ID | Attribute T2
1  | Value 1
2  | Value 2

In a sheet I create a table with the columns ID, Attribute T1, and Attribute T2. Finally I have this table:

ID | Attribute T1 | Attribute T2
1  | 100          | Value 1
2  | -            | Value 2
3  | 200          | -
So, as you know, this limits my filtering and analysis: for example, I can't show all the data that isn't represented in Table 1, or all the data where Attribute T1 is not equal to 100.
I tried to use NullAsValue, but it didn't help. I would appreciate any ideas on how to handle my case.

To achieve what you're attempting, you'll need to Join or Concatenate your tables. The reason is that Null means something different depending on how the data is loaded.
There are basically two "types" of Null:
"Implied" Null
When you associate several tables in your data model, as you've done in your example, Qlik is essentially treating that as a natural outer join between the tables. But since it's not an actual join that happens when the script executes, the Nulls that arise from data incongruencies (like in your example) are basically implied, since there really is an absence of data there. There's nothing in the data or script that actually says "there are no Attribute T1 values for ID of 2." Because of that, you can't use a function like NullAsValue() or Coalesce() to replace Nulls with another value because those Nulls aren't even there -- there's nothing to actually replace.
The above tables don't have any actual Nulls -- just implied ones from their association and the fact that the ID fields in either table don't have all the same values.
"Realized" Null
If, instead of just using associations, you actually combine the tables using the Join or Concatenate prefixes, then Qlik is forced to actually generate a Null value in the absence of data. Instead of Null being implied, it's actually there in the data model -- it's been realized. In this case, we can actually use functions like NullAsValue() or Coalesce() or Alt() to replace Nulls with another value since we actually have something in our table to replace.
The above joined table has actual Nulls that are realized in the data model, so they can be replaced.
To replace Nulls at that point, you can use the NullAsValue() or Coalesce() functions like this in the Data Load Editor:
table1:
load * inline [
ID , Attribute T1
1 , 100
3 , 200
];

table2:
join load * inline [
ID , Attribute T2
1 , Value 1
2 , Value 2
];

NullAsValue [Attribute T1];
Set NullValue = '-NULL-';

new_table:
NoConcatenate load
    ID
    , [Attribute T1]
    , Coalesce([Attribute T2], '-AlsoNULL-') as [Attribute T2]
Resident table1;

Drop Table table1;
That will result in a table like this:

ID | Attribute T1 | Attribute T2
1  | 100          | Value 1
2  | -NULL-       | Value 2
3  | 200          | -AlsoNULL-
The Coalesce() and Alt() functions are also available in chart expressions.
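Outside of Qlik, the same "realize the Null, then replace it" pattern works in plain SQL too: an outer join materializes the missing values as NULLs, and COALESCE() swaps them out. A minimal runnable sketch (SQLite via Python, with table and column names invented to mirror the example; the full outer join is emulated with two LEFT JOINs because older SQLite versions lack FULL OUTER JOIN):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t1 (id INTEGER, attr_t1 INTEGER);
    INSERT INTO t1 VALUES (1, 100), (3, 200);
    CREATE TABLE t2 (id INTEGER, attr_t2 TEXT);
    INSERT INTO t2 VALUES (1, 'Value 1'), (2, 'Value 2');
""")

# Each LEFT JOIN realizes NULLs for the rows missing on one side;
# UNION merges both directions (and deduplicates the overlap), and
# COALESCE replaces the now-realized NULLs with placeholder values.
rows = con.execute("""
    SELECT id,
           COALESCE(attr_t1, '-NULL-')     AS attr_t1,
           COALESCE(attr_t2, '-AlsoNULL-') AS attr_t2
    FROM t1 LEFT JOIN t2 USING (id)
    UNION
    SELECT id,
           COALESCE(attr_t1, '-NULL-'),
           COALESCE(attr_t2, '-AlsoNULL-')
    FROM t2 LEFT JOIN t1 USING (id)
    ORDER BY id
""").fetchall()
for r in rows:
    print(r)
```

The key point is the same as in the Qlik script: COALESCE can only replace a NULL that actually exists in the joined result, not one that is merely implied by an association.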
Here are some quick links to the things discussed here:
Qlik Null interpretation
Qlik table associations
NullAsValue() function
Coalesce() function
Alt() function

Related

How to Link 2 Sheets that have the same fields

I am looking for some help with linking 2 sheets that share a number of filters I have set up, but which sit in separate tables. The reason is that I have a number of aggregated columns that differ between the 2 tables, and I want to keep them separate as I will be building more sheets as I go along.
The filters that are the same within the 2 sheets are the following:
we_date
product
manager
patch
Through the Data manager I managed to create an association between the 2 tables on we_date, but from reading this site and other searches on Google, I can't make any further associations between these tables, and this is where I am stuck.
The 2 sheets now let me filter on we_date, but if I use the filters for product, manager or patch, nothing happens on my 2nd sheet, as they are not linked.
Currently in my data load editor I have 2 sections of select queries like the following:
// Table1
QUALIFY *;
w:
SELECT *
FROM table1;
UNQUALIFY *;

// Table2
QUALIFY *;
w_c:
SELECT *
FROM table2;
UNQUALIFY *;
I would really appreciate if somebody could advise a fix on the issue I am having.
In Qlik, fields with identical names from different tables are automatically associated.
When you call Qualify *, you are actually renaming every field and explicitly saying NOT to associate them.
Take a look at the Qlik Sense documentation on Qualify *:
The automatic join between fields with the same name in different
tables can be suspended by means of the qualify statement, which
qualifies the field name with its table name. If qualified, the field
name(s) will be renamed when found in a table. The new name will be in
the form of tablename.fieldname. Tablename is equivalent to the label
of the current table, or, if no label exists, to the name appearing
after from in LOAD and SELECT statements.
We can use as to manually reassign field names.
SELECT customer_id, private_info as "private_info_1", favorite_dog from table1;
SELECT customer_id, private_info as "private_info_2", car from table2;
Or we can use Qualify correctly. Example:
table1 and table2 each have a customer_id field and a private_info field. We want customer_id to be the associative field, and private_info not to be. We would use QUALIFY on private_info, which Qlik would then rename based on the table name.
QUALIFY private_info;
SELECT * from table1;
SELECT * from table2;
The following field names would then be: customer_id (associated), and table1.private_info, and table2.private_info

Merge multiple tables having different columns

I have 4 tables, each with a different number of columns, as listed below.
tableA - 34
tableB - 47
tableC - 26
tableD - 16
Every table has a common column called id. Now I need to perform a union; the problem is that since the tables have different numbers of entirely different columns, I can't do a union.
Based on id alone I can get the details from every table, so how should I approach this?
What is the optimal way to solve this? I tried a full join, but that takes too much time.
Tried so far:
SELECT * FROM tableA
FULL JOIN tableB USING (id)
FULL JOIN tableC USING (id)
FULL JOIN tableD USING (id)
WHERE tableA.id = 123 OR
      tableB.id = 123 OR
      tableC.id = 123 OR
      tableD.id = 123
Snowflake does have a declared limitation on the use of set operators (such as UNION):
When using these operators:
Make sure that each query selects the same number of columns.
[...]
However, since the column names are well known, it is possible to come up with a superset of all unique column names required in the final result and project them explicitly from each query.
There's not enough information in the question about how many columns overlap (as few as 47 unique columns?) or whether they are all different apart from id (46 + 33 + 25 + 15 = 119 unique non-id columns, plus id?). The answer to this determines the amount of effort required to write out each query, as it involves adapting a query from the following form:
SELECT * FROM t1
Into an explicit form with dummy columns defined with acceptable defaults that match the data type on tables where they are present:
SELECT
present_col1,
NULL AS absent_col2,
0.0 AS absent_col3,
present_col4,
[...]
FROM
t1
You can also use some meta programming with stored procedures to "generate" such an altered query by inspecting independent result's column names using the Statement::getColumnCount(), Statement::getColumnName(), etc. APIs and forming a superset union version with default/empty values.
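Under the assumption of a small two-table version (names invented for illustration), the padding approach looks like this as a runnable SQLite/Python sketch: each branch of the UNION ALL projects the same superset of columns and fills in NULL for the columns its table lacks, so the branches line up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ta (id INTEGER, a1 TEXT);   -- has a1, lacks b1
    CREATE TABLE tb (id INTEGER, b1 REAL);   -- has b1, lacks a1
    INSERT INTO ta VALUES (123, 'from ta');
    INSERT INTO tb VALUES (123, 9.5), (456, 1.0);
""")

# Superset of columns is (id, a1, b1); each branch pads the missing
# columns with NULL so the SELECTs have the same width and order.
rows = con.execute("""
    SELECT id, a1,         NULL AS b1 FROM ta
    UNION ALL
    SELECT id, NULL AS a1, b1         FROM tb
""").fetchall()

# Details for one id across all tables, as in the question:
matches = [r for r in rows if r[0] == 123]
print(matches)
```

This scales linearly in the number of columns rather than multiplying rows the way chained FULL JOINs can, which is why it tends to perform better for this "details by id from every table" use case.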

Will Postgres' DISTINCT function always return null as the first element?

I'm selecting distinct values from tables through Java's JDBC connector, and it seems that the NULL value (if there is one) is always the first row in the ResultSet.
I need to remove this NULL from the List into which I load the ResultSet. The logic looks only at the first element and ignores it if it's null.
I'm not using any ORDER BY in the query; can I still trust that logic? I can't find any reference to this in Postgres' documentation.
You can add a check for NOT NULL, simply like:
select distinct columnName
from Tablename
where columnName IS NOT NULL
Also, if you do not provide an ORDER BY clause, the order in which you get the results is not guaranteed, so you cannot rely on it. It is better and recommended to provide an ORDER BY clause if you want your results in a particular order (i.e., ascending or descending).
If you are looking for a reference Postgresql document then it says:
If ORDER BY is not given, the rows are returned in whatever order the
system finds fastest to produce.
If it is not stated in the manual, I wouldn't trust it. However, just for fun, and to try to figure out what logic is being used: running the following query does bring the NULL to the top (for no apparent reason), while all other values come out in an apparently random order:
with t(n) as (values (1),(2),(1),(3),(null),(8),(0))
select distinct * from t
However, cross joining the table with a modified version of itself brings two NULLs to the top but leaves other NULLs dispersed throughout the result set. So there doesn't seem to be a clear-cut rule that clumps all NULL values at the top.
with t(n) as (values (1),(2),(1),(3),(null),(8),(0))
select distinct * from t
cross join (select n+3 from t) t2
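For what it's worth, even with an ORDER BY the position of NULL is engine-specific: SQLite sorts NULLs first in ascending order, while PostgreSQL sorts them last unless you write NULLS FIRST. A small SQLite sketch of both the ordered query and the IS NOT NULL filter recommended above, using the same sample values as the experiment:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t (n INTEGER);
    INSERT INTO t VALUES (1), (2), (1), (3), (NULL), (8), (0);
""")

# With ORDER BY, SQLite puts NULL first in ascending order.
# PostgreSQL would put it last by default; only NULLS FIRST / NULLS LAST
# makes the placement explicit and portable.
rows = con.execute("SELECT DISTINCT n FROM t ORDER BY n").fetchall()
print(rows)

# The robust fix: filter NULLs out in the query itself rather than
# relying on where the driver happens to put them.
non_null = con.execute(
    "SELECT DISTINCT n FROM t WHERE n IS NOT NULL ORDER BY n"
).fetchall()
print(non_null)
```

Filtering with IS NOT NULL removes the dependence on row order entirely, which is the safest option for the Java code in the question.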

T-SQL LEFT JOIN on bigint id return only ids lower than 101 on right table

I have two tables in a SQL Server 2008 database: ownership, with 3 fields, and case, with another 3 fields. I need to join them on the ID field (bigint).
For testing purposes I'm only using one field from each table. This field is bigint and holds values from 1 to 170 (for now).
My query is:
SELECT DISTINCT
ownership.fCase,
case.id
FROM
ownership LEFT JOIN case ON (case.id=ownership.fCase)
WHERE
ownership.dUser='demo'
This was expected to return 4 rows with the same values in both columns. The problem is that the last row on the right side comes back as NULL, for fCase = 140; this is the only value above 100.
If I run the query without the WHERE clause, it shows all rows from the left table, but the values on the right only appear if they are below 101; otherwise it shows NULL.
Can someone help me? Am I doing something wrong, or is this a limitation or a bug?
CASE is a reserved keyword in T-SQL, so the parser may be getting confused. Try wrapping your table and column names in brackets, e.g. [case].[id] = [ownership].[fCase]. Are you double-check sure that [case].[id] and [ownership].[fCase] are both bigint? And if your current values are 1-170, why bigint (max 9,223,372,036,854,775,807)? Does that column accept NULLs?
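To see the bracket-quoting advice in action, here is a small runnable sketch; it uses SQLite (which accepts the same [bracket] identifier quoting as T-SQL) and invented sample data, including the problematic value 140:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# "case" is a reserved keyword, so the table name must be bracket-quoted
# everywhere it appears.
con.executescript("""
    CREATE TABLE [case] (id INTEGER);
    CREATE TABLE ownership (fCase INTEGER, dUser TEXT);
    INSERT INTO [case] VALUES (7), (140);
    INSERT INTO ownership VALUES (7, 'demo'), (140, 'demo');
""")

rows = con.execute("""
    SELECT DISTINCT ownership.fCase, [case].id
    FROM ownership LEFT JOIN [case] ON [case].id = ownership.fCase
    WHERE ownership.dUser = 'demo'
    ORDER BY ownership.fCase
""").fetchall()
print(rows)
```

With the identifiers quoted, the join matches values above 100 just as well as those below it, which supports the suspicion that the original symptom comes from a naming or type mismatch rather than any limit on bigint joins.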

Duplicate values returned with joins

I was wondering if there is a way using TSQL join statement (or any other available option) to only display certain values. I will try and explain exactly what I mean.
My database has tables called Job, consign, dechead, and decitem. Job, consign, and dechead will only ever have one row per record, but decitem can have multiple rows, all tied to dechead with a foreign key. I am writing a query that pulls various values from each table. This is fine with all the tables except decitem. From dechead I need to pull an invoice value, and from decitem I need to grab the net weights. When the results are returned, if a dechead has multiple child decitem rows, the query repeats the values from both tables on every row. What I need is to display the dechead values once, and then all the decitem values.
e.g.
1 |123|£2000|15.00|1
2 |---|-----|20.00|2
3 |---|-----|25.00|3
Line 1 displays values from dechead and the first decitem row. Lines 2 and 3 display values from decitem only. Then if I export the query to, say, Excel, I do not have duplicate values in the first two fields of lines 2 and 3:
e.g.
1 |123|£2000|15.00|1
2 |123|£2000|20.00|2
3 |123|£2000|25.00|3
Thanks in advance.
Check out 'group by' for your RDBMS http://msdn.microsoft.com/en-US/library/ms177673%28v=SQL.90%29.aspx
This is a task best left to the application, but if you must do it in SQL, try this:
SELECT
    CASE WHEN RowVal = 1 THEN dt.Col1 ELSE NULL END AS Col1,
    CASE WHEN RowVal = 1 THEN dt.Col2 ELSE NULL END AS Col2,
    dt.Col3,
    dt.Col4
FROM (
    SELECT
        Col1, Col2, Col3, Col4,
        ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col1, Col4) AS RowVal
    FROM ...rest of your big query here...
) dt
ORDER BY dt.Col1, dt.Col4
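To see the blanking technique run end to end, here is a minimal sketch using SQLite window functions via Python, with invented column names that mirror the invoice/weight example above:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dec (invoice INTEGER, amount TEXT, weight REAL, line INTEGER);
    INSERT INTO dec VALUES
        (123, '£2000', 15.0, 1),
        (123, '£2000', 20.0, 2),
        (123, '£2000', 25.0, 3);
""")

# Number the rows within each invoice group, then blank out the repeated
# header columns (invoice, amount) on every row after the first.
rows = con.execute("""
    SELECT
        CASE WHEN RowVal = 1 THEN invoice END AS invoice,
        CASE WHEN RowVal = 1 THEN amount  END AS amount,
        weight,
        line
    FROM (
        SELECT invoice, amount, weight, line,
               ROW_NUMBER() OVER (PARTITION BY invoice ORDER BY line) AS RowVal
        FROM dec
    ) AS dt
    ORDER BY line
""").fetchall()
for r in rows:
    print(r)
```

The header values appear only on the first row of each group and come back NULL on the rest, which is exactly the display shape asked for; note that this makes the output order-sensitive, which is part of why the job is often better done in the application layer.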