Tableau Union Joins - Can you un-merge automatically merged fields? - tableau-api

I am attempting to union join 2 tables in my Microsoft NAV data source within Tableau. However, I have two fields named "No." that do not contain the same data.
When I apply a union join, Tableau automatically merges these fields and I cannot un-merge them.
Is there a way to un-merge these fields?
Or is there a way of doing a manual union join?
I have tried renaming the field before dragging the second table into the worksheet; however, I can see that the "Remote Field Name" still remains the same.
Thanks

One approach is to let Tableau merge the fields and then use the generated fields to distinguish between them.
When you perform a Union in Tableau, it adds a few fields to your data source so you can tell which data rows came from which tables. The most useful in your case is called [Table Name]. So when you build your visualizations, you can use the [Table Name] field to know how to interpret the [No.] field.
If that is awkward, you can create 2 calculated fields to represent only those [No.] values that have the same role. For example, define [No. Type 1] as IF [Table Name] = "Table 1" THEN [No.] END, and define [No. Type 2] similarly. Then you can hide the original [No.] field.
These new fields will only have values for the appropriate data rows, and will be null otherwise. Aggregate functions like SUM(), AVG() etc ignore nulls, so you can use those fields as measures easily.
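The effect of those two calculated fields can be sketched outside Tableau. The following is only an illustration in Python, not Tableau syntax: the sample rows and the helper names `split_by_table` and `sum_ignoring_nulls` are assumptions made up for the example.

```python
# Hypothetical rows as they might look after a union: each row carries the
# merged "No." value plus the generated "Table Name" field.
rows = [
    {"Table Name": "Table 1", "No.": 10},
    {"Table Name": "Table 1", "No.": 20},
    {"Table Name": "Table 2", "No.": 7},
]

def split_by_table(row, table_name):
    """Mimic IF [Table Name] = "<table_name>" THEN [No.] END (null otherwise)."""
    return row["No."] if row["Table Name"] == table_name else None

def sum_ignoring_nulls(values):
    """Mimic an aggregate such as SUM() that skips nulls."""
    return sum(v for v in values if v is not None)

no_type_1 = [split_by_table(r, "Table 1") for r in rows]
no_type_2 = [split_by_table(r, "Table 2") for r in rows]

print(no_type_1)                      # [10, 20, None]
print(sum_ignoring_nulls(no_type_1))  # 30
print(sum_ignoring_nulls(no_type_2))  # 7
```

Each derived column only holds values from its own table, so the two meanings of [No.] never mix in an aggregate.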
If you want to use a calculation in a join clause, say after making a union, first specify the tables (or unions of tables) to join. Then click the Venn diagram to specify the join keys and open either the left or the right list of fields. At the bottom of that list, in small print, is the option to create or edit your join calculation.

Related

Doctrine : PostgreSQL group by different than select

I have two tables :
user and activityHistory (that has a key to the user)
I am trying to query activityHistory while grouping by user. PostgreSQL does not allow me to do that, while SQLite does.
return $qb
    ->select('a.id')
    ->leftJoin('a.user', 'user')
    ->leftJoin(
        ActivityHistory::class,
        'b',
        'WITH',
        'a.id = b.id AND a.createdAt > b.createdAt'
    )
    ->groupBy('user.id')
    ->orderBy('a.createdAt', 'ASC')
    ->getQuery()->getArrayResult();
I am getting this error only with PostgreSQL: Grouping error: 7 ERROR: column "a0_.id" must appear in the GROUP BY clause or be used in an aggregate function
The point is that I don't want to group by the activityHistory id, I only want it to be selected. How can I do that? (I heard about aggregates, but I think they only work with functions like SUM etc.)
First of all, let's clarify how aggregation works. Aggregation means grouping rows by certain field(s) and then selecting either those grouped fields or the results of aggregate functions applied to the ungrouped fields.
You misunderstand how this works, hence the question, so let me provide a few very simple examples:
Example 1
Let's consider that there is a town and there are individuals living in that town. Each individual has an eye color, but, if you are wondering what the eye color of the people of the town is, then your question does not make sense, because the group itself does not have an eye color, unless specified otherwise.
Example 2
Let's modify the example above by grouping the people of the town by eye color. Such an aggregation will have one row for each eye color that occurs, and you can select the eye color along with the average age, number of individuals, etc. as you like, because you are grouping by eye color.
Your example
You have users and they are performing actions. So, an activity is performed by a single user, but a user may perform many activities. So, if you want to group by your user id, then the "eye color" that you are not grouping by here is the history id.
You will have a single record for each user, so you are grouping multiple history items into the same row. After the grouping, asking for the history item's id does not make sense, because the group does not have one single id.
But you can use string_agg(some_column, ','), which takes all the values in the group and puts them into one string separated by commas. Note that string_agg expects text, so a non-text column such as an integer id has to be cast to text first.
You can call explode(',', $yourvalues) in PHP to convert such a value into an array.
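To make the idea concrete, here is a minimal sketch of "aggregate the ids per user, then split them again". It uses Python with an in-memory SQLite database as a stand-in, so `group_concat` plays the role that `string_agg` plays in PostgreSQL, and `str.split` plays the role of PHP's `explode`. The table name `activity_history`, its columns and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity_history (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO activity_history (id, user_id) VALUES (1, 100), (2, 100), (3, 200);
""")

# One row per user; the ungrouped id column is passed through an aggregate.
# In PostgreSQL the equivalent expression would be string_agg(id::text, ',').
grouped = conn.execute(
    "SELECT user_id, group_concat(id, ',') "
    "FROM activity_history GROUP BY user_id ORDER BY user_id"
).fetchall()

# Split each aggregated string back into a list, like explode(',', ...) in PHP.
# The ids are sorted because the concatenation order is not guaranteed.
ids_per_user = {
    user_id: sorted(int(part) for part in ids.split(","))
    for user_id, ids in grouped
}

print(ids_per_user)  # {100: [1, 2], 200: [3]}
```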

Ignoring space characters when linking tables

I’m experiencing a problem when trying to link two tables in the Database Expert. The two fields that link the tables have exactly the same information, except that one table always has an additional space. For example:
Table 1 = Multivitamin/Tablets
Table 2 = Multivitamin//Tablets
Each '/' represents a space.
Formulas won’t help (e.g. extractstring etc.) as it’s the tables themselves I need to link together.
This is preventing me from retrieving the information I need. Any advice on how I can get around this?
There are a few ways to get around this:
Consider using a command as datasource instead of tables. When writing the query of the command you can define the join condition yourself.
If you have access to the data source, you could add a calculated field to the tables to contain the normalized field values and then use these for linking in CR.
Alternatively, one could create views in the database, either adding normalized "linking fields" or providing the joined tables results.
If it's only a few rows in CR, you could consider using SQL fields or subreports to retrieve data from Table 2.
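Whichever of these options is used, the core idea is to link on a normalized version of the field. A rough sketch in Python of what such a normalized "linking field" would do; the rows, the column names and the helper `normalize_key` are assumptions for illustration, and in practice the normalization would live in the command's SQL or in a database view.

```python
import re

def normalize_key(value):
    """Collapse every run of whitespace to a single space and trim the ends."""
    return re.sub(r"\s+", " ", value).strip()

# Hypothetical rows: the name in table2 carries one extra space.
table1 = [{"name": "Multivitamin Tablets", "price": 5}]
table2 = [{"name": "Multivitamin  Tablets", "stock": 12}]

# Index table2 by the normalized name, then look each table1 row up in it.
stock_by_key = {normalize_key(row["name"]): row["stock"] for row in table2}
linked = [
    {**row, "stock": stock_by_key.get(normalize_key(row["name"]))}
    for row in table1
]

print(linked)  # [{'name': 'Multivitamin Tablets', 'price': 5, 'stock': 12}]
```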

data merging in Tableau

I have two sheets in excel. One has CBG (neighborhood) IDs as shown below.
The second sheet has state and county names and IDs as shown below.
Now the first 5 digits in the CBG ID are just the corresponding state and county IDs for that CBG.
I need to join this data together in Tableau so that I would have the state and county on the CBG sheet for each CBG.
Basically I tried to blend the data and it didn't work. I also tried to perform a join calculation using the 5-digit code in the second sheet and the LEFT function to extract the 5-digits in the CBG code but it didn't seem to work either.
To fix it, I just needed to fix the join calculation on both sides of the join.
Also, it seems that both variables to be joined need to be the same data type.
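The data type point can be shown with a small sketch in Python rather than Tableau. It assumes 12-digit CBG IDs and a 5-digit state-and-county code stored as text. The sample rows and the helper `county_key` are made up for the example. If the CBG ID was read as a number, its leading zero is lost, so the key is cast to text and padded before the first 5 characters are taken.

```python
# Hypothetical sample data: one county row, and two CBG rows where the first
# ID was read as a number (leading zero lost) and the second as text.
counties = [{"fips": "01001", "state": "Alabama", "county": "Autauga"}]
cbgs = [{"cbg_id": 10010201001}, {"cbg_id": "010010201002"}]

def county_key(cbg_id):
    """Text equivalent of LEFT(cbg, 5): cast to string, restore leading
    zeros up to 12 digits, and keep the first 5 characters."""
    return str(cbg_id).zfill(12)[:5]

# Both sides of the join now use the same type (text) for the key.
by_fips = {c["fips"]: c for c in counties}
joined = [{**row, **by_fips.get(county_key(row["cbg_id"]), {})} for row in cbgs]

for row in joined:
    print(row["cbg_id"], row.get("state"), row.get("county"))
```

Comparing the numeric 1001 with the text "01001" would never match, which is one way a join calculation like this can silently return nothing.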
The data that you analyze in Tableau is often made up of a collection of tables that are related by specific fields (that is, columns). Joining is a method for combining the related data on those common fields. The result of combining data using a join is a virtual table that is typically extended horizontally by adding columns of data.
When joining tables, the fields that you join on must have the same data type. If you change the data type after you join the tables, the join will break.
Follow the steps below to join tables:
In Tableau Desktop: on the start page, under Connect, click a connector to connect to your data. This step creates the first connection in the Tableau data source.
In web authoring: Select New Workbook and connect to your data. This step creates the first connection in the Tableau data source.
Select the file, database, or schema, and then double-click or drag a table to the canvas.
Double-click or drag another table to the canvas, and then click the join relationship to add join clauses and select your join type.
Add one or more join clauses by selecting a field from one of the available tables used in the data source, a join operator, and a field from the added table. Inspect the join clause to make sure it reflects how you want to connect the tables.
When you are finished, close the Join dialog.
Thank you.

Multiple optional query parameters with PostgreSQL

I use PostgreSQL 10 and I want to build queries that have multiple optional parameters.
A user must input an area name, but then it is optional to pick none or any combination of the following: event, event date, category, category date, style.
So a full query could be "all the banks (category), constructed in 1990 (category date) with modern architecture (style), that got renovated in 1992 (event and event date) in the area of NYC (area) ".
My problem is that all those are in different tables, connected by many-to-many tables, so I cannot do something like
SELECT * FROM mytable
WHERE (Event IS NULL OR Event = event)
I don't know if any good will come if I just join four tables.
I can easily find the area id, since it is required, but I don't know what the user chose besides that.
Any suggestions on how to approach this with PostgreSQL?
Thanks
It might be optimal to build the entire query dynamically and only join in tables that you know you're going to need in order to apply the user's filters, but it's impractical. You're better off creating a view on the full set of tables. Use LEFT OUTER JOINs to ensure that you don't accidentally filter out valid combinations and index your tables to ensure that the query planner can navigate the table graph quickly. Then query the view with a WHERE clause reflecting only the filters you want to apply.
If performance becomes a concern and you don't mind having non-realtime data, you could use a materialized view to cache the results. Materialized views can be indexed directly, but this is a pretty radical change so don't do this unless you have to.
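A minimal sketch of the "view plus a WHERE clause reflecting only the chosen filters" idea follows. It is written in Python against an in-memory SQLite database purely so that it runs anywhere. The schema, the view name `building_view`, the column names and the helper `build_query` are all assumptions for illustration, and a PostgreSQL driver would typically use a different placeholder style than `?`.

```python
import sqlite3

# Hypothetical filter columns exposed by the view. Only these whitelisted names
# are ever interpolated into SQL; values always go through placeholders.
FILTERABLE = ("event", "event_date", "category", "category_date", "style")

def build_query(area, **filters):
    """Return (sql, params) for the required area plus any supplied filters."""
    unknown = set(filters) - set(FILTERABLE)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    clauses, params = ["area = ?"], [area]
    for column in FILTERABLE:
        value = filters.get(column)
        if value is not None:  # skip filters the user left empty
            clauses.append(f"{column} = ?")
            params.append(value)
    return "SELECT name FROM building_view WHERE " + " AND ".join(clauses), params

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE building (id INTEGER PRIMARY KEY, name TEXT, area TEXT,
                           category TEXT, category_date INTEGER, style TEXT);
    CREATE TABLE event (building_id INTEGER, event TEXT, event_date INTEGER);
    INSERT INTO building VALUES (1, 'First Bank', 'NYC', 'bank', 1990, 'modern'),
                                (2, 'Old Mill', 'NYC', 'mill', 1880, 'gothic');
    INSERT INTO event VALUES (1, 'renovated', 1992);
    -- LEFT OUTER JOIN keeps buildings that have no event at all.
    CREATE VIEW building_view AS
        SELECT b.name, b.area, b.category, b.category_date, b.style,
               e.event, e.event_date
        FROM building b LEFT OUTER JOIN event e ON e.building_id = b.id;
""")

sql, params = build_query("NYC", category="bank", style="modern", event_date=1992)
filtered = conn.execute(sql, params).fetchall()
sql_all, params_all = build_query("NYC")
everything = sorted(conn.execute(sql_all, params_all).fetchall())

print(filtered)    # [('First Bank',)]
print(everything)  # [('First Bank',), ('Old Mill',)]
```

Only the WHERE clause changes between requests. The joins stay fixed inside the view, which is the trade-off the answer above describes.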

Google refine cross-reference between row and column

I'm not sure if this can be achieved in Google Refine at all. But basically, I have data like this.
The first table is the table of all the users. The second table shows all the friends. However, not all the ids in the "friends" column of the second table exist in the first table, and I want to get rid of those. So, how can I look up each id in the friends column of the second table and remove the ones that don't exist in table 1?
Put the two tables in different projects (we'll call them Table1 and Table2).
In Table2, on the friends column:
use "split multi-valued cells" to get each value on a separate row
convert the visitors column to numbers (or conversely user_id in Table1 to string)
use "add a new column based on this column" with the expression cross(cell,'Table1','user_id').length()
This will return 0 if there's no match, 1 if there's a match or N>1 if there are duplicates in Table1
If you want the data back in the original format, set up a facet to filter on the validity column, blank out all the bad values and then use "join multi-valued cells" to reverse the split operation you did up front.
I fixed some caching bugs with cross() for OpenRefine 2.6, so if the cross doesn't work, try stopping and restarting the Refine server.
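For comparison, the same split, look up, filter and re-join logic can be sketched in plain Python. This is not GREL and not how Refine does it internally. The sample rows, the comma separator and the helper `keep_known_friends` are assumptions for illustration. The ids are compared as strings on both sides, mirroring the type conversion step above.

```python
# Hypothetical stand-ins for the two projects.
table1_user_ids = {"1", "2", "3"}
table2 = [
    {"user_id": "1", "friends": "2,9,3"},
    {"user_id": "2", "friends": "7"},
]

def keep_known_friends(friends_cell, known_ids, sep=","):
    """Split the multi-valued cell, keep only ids present in known_ids,
    and join the survivors back together."""
    parts = [part.strip() for part in friends_cell.split(sep)]
    return sep.join(part for part in parts if part in known_ids)

cleaned = [
    {**row, "friends": keep_known_friends(row["friends"], table1_user_ids)}
    for row in table2
]

print(cleaned)
# [{'user_id': '1', 'friends': '2,3'}, {'user_id': '2', 'friends': ''}]
```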