Convert single row to multiple rows in tableau based on some delimiter - tableau-api

I had a column with data as => a,b,c,d,e
I need to display(in worksheet) as
a
b
c
d
e
Note: need to be split based on ','
Do I need to to use calculation field or any other approach is there???
Went through split function but is used to generate new columns, I want to store in a single column.

is this something that could work? (you said it's just a matter of visualization without altering data, right?)
I just created a CF like this:
REPLACE(value,",","
")
EDIT: since it seems that your need involves a data manipulation (you want multiple row instead of one) I think that the best way is using the split function even though, as you noticed, it will create new columns.
Otherwise if it's just a visualization need, you could use the solution posted before which shows your data ("a,b,c,d,e") in the same cell with the same horizontal alignment, just replacing commas with CR

Related

How can I alias labels (using a query) in Grafana?

I'm using Grafana v9.3.2.2 on Azure Grafana
I have a line chart with labels of an ID. I also have an SQL table in which the IDs are mapped to simple strings. I want to alias the IDs in the label to the strings from the SQL
I am trying to look for a transformation to do the conversion.
There is a transformation called “rename by regex”, but that will require me to hardcode for each case. Is there something similar with which I don't have to hardcode for each case.
There is something similar for variables - https://grafana.com/blog/2019/07/17/ask-us-anything-how-to-alias-dashboard-variables-in-grafana-in-sql/. But I don't see anything for transformations.
Use 2 queries in the panel - one for data with IDs and seconds one for mapping ID to string. Then add transformation Outer join and use that field ID to join queries results into one result.
You may need to use also Organize fields transformation to rename, hide unwanted fields, so only right fields will be used in the label at the end.

How to properly remove NaN values from table

After reading an Excel spreadsheet in Matlab I unfortunately have NaNs included in my resulting table. So for example this Excel table:
would result in this table:
where an additional column of NaNs occurs. I tried to remove the NaNs with the following code snippet:
measurementCells = readtable('MWE.xlsx','ReadVariableNames',false,'ReadRowNames',true);
measurementCells = measurementCells(any(isstruct(measurementCells('TIME',1)),1),:);
However this results in a 0x6 table, without any values present anymore. How can I properly remove the NaNs without removing any data from the table?
Either this:
tab = tab(~any(ismissing(tab),2),:);
or:
tab = rmmissing(tab);
if you want to remove rows that contain one or more missing value.
If you want instead to replace missing values with other values, read about how fillmissing (https://mathworks.com/help/matlab/ref/fillmissing.html) and standardizeMissing (https://mathworks.com/help/matlab/ref/standardizemissing.html) functions work. The examples are exhaustive and should help you to find the solution that best fits your needs.
One last solution you have is to spot (and manipulate in the way you prefer) NaN values within the call to the readtable function using the EmptyValue parameter. But this works only against numeric data.

Pivot data in Talend

I have some data which I need to pivot in Talend. This is a sample:
brandname,metric,value
A,xyz,2
B,xyz,2
A,abc,3
C,def,1
C,ghi,6
A,ghi,1
Now I need this data to be pivoted on the metric column like this:
brandname,abc,def,ghi,xyz
A,3,null,1,2
B,null,null,null,2
C,null,1,6,null
Currently I am using tPivotToColumnsDelimited to pivot the data to a file and reading back from that file. However having to store data on an external file and reading back is messy and unnecessary overhead.
Is there a way to do this with Talend without writing to an external file? I tried to use tDenormalize but as far as I understand, it will return the rows as 1 column which is not what I need. I also looked for some 3rd party component in TalendExchange but couldn't find anything useful.
Thank you for your help.
Assuming that your metrics are fixed, you can use their names as columns of the output. The solution to do the pivot has two parts: first, a tMap that transposes the value of each input-row in into the corresponding column in the output-row out and second, a tAggregate that groups the map's output-rows according to the brandname.
For the tMap you'd have to fill the columns conditionally like this, example for output colum named "abc":
out.abc = "abc".equals(in.metric)?in.value:null
In the tAggregate you'd have to group by out.brandname and aggregate each column as sum ignoring nulls.

Transpose data using Talend

I have this kind of data:
I need to transpose this data into something like this using Talend:
Help would be much appreciated.
dbh's suggestion should work indeed, but I did not try it.
However, I have another solution which doesn't require to change input format and is not too complicated to implement. Indeed the job has only 2 transformation components (tDenormalize and tMap).
The job looks like the following:
Explanation :
Your input is read from a CSV file (could be a database or any other kind of input)
tDenormalize component will Denormalize your column value (column 2), based on value on id column (column 1), separating fields with a specific delimiter (";" in my case), resulting as shown in 2 rows.
tMap : split the aggregated column into multiple columns, by using java's String.split() method and spreading the resulting array into multiple columns. The tMap should like like this:
Since Talend doesn't accept to store Array objects, make sure to store the splitted String in Object format. Then, cast that object into Array on the right side of the Map.
That approach should give you the expected result.
IMPORTANT:
tNormalize might shuffle the rows, meaning for bigger input, you might encounter unsorted output. Make sure to sort it if needed or use tDenormalizeSortedRow instead.
tNormalize is similar to an aggregation component meaning it scans the whole input before processing, which results into possible performance issues with particularly big inputs (tens of millions of records).
Your input is probably wrong (you have 5 entries with 1 as id, and 6 entries with 2 as id). 6 columns are expected meaning you should always have 6 lines per id. If not, then you should implement dbh's solution, and you probably HAVE TO add a column with a key.
You can use Talend's tPivotToColumnsDelimited component to achieve this. You will most likely need an additional column in your data to represent the field name.
Like "Identifier, field name, value "
Then you can use this component to pivot the data and write a file as output. If you need to process the data further, read the resulting file with tFileInoutDelimited .
See docs and an example at
https://help.talend.com/display/TalendOpenStudioComponentsReferenceGuide521EN/13.43+tPivotToColumnsDelimited

GROUP BY CLAUSE using SYNCSORT

I have some content in a file on which I must generate statistics such as how many of records are of type - 1, type - 2 etc. Number of types can change and is unknown to the code until file arrives. In a SQL system, I can do this using COUNT and GROUP BY clause. But I am not sure if I can do this using SYNCSORT or COBOL program. Would anyone here have an idea on how I can implement 'GROUP BY' type query on a file using SYNCSORT.
Sample Data:
TYPE001 SUBTYPE001 TYPE01-DESC
TYPE001 SUBTYPE002 TYPE01-DESC
TYPE001 SUBTYPE003 TYPE01-DESC
TYPE002 SUBTYPE001 TYPE02-DESC
TYPE002 SUBTYPE004 TYPE02-DESC
TYPE002 SUBTYPE008 TYPE02-DESC
I want to get the information such as TYPE001 ==> 3 Records, TYPE002 ==> 3 Records. What the code doesn't know until runtime is the TYPENNN value
You show data already in sequence, so there is no need to sort the data itself, which makes SUM FIELDS= with SORT a poor solution if anyone suggests it (plus code for the formatting).
MERGE with a single input file and SUM FIELDS= would be better, but still require the code for formatting.
The simplest way to produce output which may suit you is to use OUTFIL reporting functions:
OPTION COPY
OUTFIL NODETAIL,
REMOVECC,
SECTIONS=(1,7,
TRAILER3=(1,7,
' ==> ',
COUNT=(M10,LENGTH=3),
' Records'))
The NODETAIL says "remove all the data lines". The REMOVECC says "although it is a report, don't use printer-control characters on position one of the output records". The SECTIONS says "we're going to use control-breaks, and here they (it in this case) are". In this case, your control-field is 1,7. The TRAILER3 defines the output which will be produced at each control-break: COUNT here is the number of records in that particular break. M10 is an editing mask which will change leading zeros to blanks. The LENGTH gives a length to the output of COUNT, three is chosen from your sample data with sub-types being unique and having three digits as the unique part of the data. Change to whatever suits your actual data.
You've not been clear, and perhaps you want the output "floating" (3bb instead of bb3, where b represents a blank)? That would require more code...