How can I rename several columns in dataprep? - google-cloud-dataprep

I have more than 100 columns in dataprep whose names are like:
my column name 1
my column name 2
I would like to rename the columns to:
my_column_name_1
my_column_name_2
I have tried a rename that replaces " " with "_". However, Dataprep only changes the first whitespace! Is there any way to change all of the whitespace?
Another question: when I apply a function like rename, it applies to just one column. I can add more columns by typing the name of each column. Is there any way to select all columns without typing all the names?
Thank you so much!

You can shift-select multiple columns to Transform when the data is in column view mode.
Select the columns to apply to and then choose the transformation.

JSDBroughton's answer did the trick for me, although it's not so clear how to do it. Change your view to Columns (second icon from the left on the toolbar). Select the first column, then hold Shift and select the last column. You should now have all columns selected. Then right-click and select Rename. A new Recipe step will be added with all your columns already included. Then set the Option to "Find and replace".
As for removing all the spaces, I couldn't find any Cloud Dataprep pattern or Regular Expression that would replace every space in my column names. Having said that, my columns had a maximum of 4 spaces, so I simply added the same step multiple times. I used the Regular Expression \s to match spaces and replaced them with underscores.
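For comparison, the intended result (every whitespace character replaced, in every column name) can be sketched outside Dataprep in a few lines of Python. This only illustrates the logic and is not Dataprep syntax; the function name normalize_column_name is made up for the example.

```python
import re


def normalize_column_name(name: str) -> str:
    """Replace every whitespace character with an underscore (hypothetical helper)."""
    return re.sub(r"\s", "_", name)


# Column names taken from the question.
columns = ["my column name 1", "my column name 2"]

# Apply the same replacement to all columns at once.
renamed = [normalize_column_name(c) for c in columns]
print(renamed)  # ['my_column_name_1', 'my_column_name_2']
```

Unlike the repeated recipe step, a global substitution like this does not depend on knowing the maximum number of spaces in advance.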

Related

Copying contents of columns with Field calculator in Qgis

I have to split the content of a column into 2 different columns using the QGIS Field Calculator. Basically, my table is something like this:
I have to work with the descriptio column, omitting characters 1-12 and then copying the next 8 characters (in this case "AgilisSi") into the PresLACAGI column.
The other element to copy is the final number in the descriptio column, which is 1 to 3 characters long. Possibly the best thing would be a syntax that reproduces in the CodiClapa column the number after ": ", including the space in the syntax.
Thanks a lot!
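Use the Field Calculator: check "Update existing field", select the column from the drop-down, and type the expression for each field in the Expression window:
PresLACAGI: substr(descriptio,13,8)
CodiClapa: right(descriptio,3)
Note that substr counts from 1, so starting at 13 skips characters 1-12. Also, right(descriptio,3) always returns three characters, so for shorter numbers it also picks up the preceding space or colon.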
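The two extractions can also be sketched in Python to make the indexing explicit. This is not QGIS expression syntax, and the sample value of descriptio is invented for illustration, since the original table is not shown; only "AgilisSi" and the ": " separator come from the question.

```python
import re

# Hypothetical sample value: 12 leading characters, then 8 to keep,
# then a trailing number after ": ".
descriptio = "ABCDEFGHIJKLAgilisSi rest of text: 42"

# Skip characters 1-12 and take the next 8 (Python slices are 0-based).
pres_lacagi = descriptio[12:20]

# Capture only the 1-3 digit number that follows ": " at the end of the string.
match = re.search(r": (\d{1,3})$", descriptio)
codi_clapa = match.group(1) if match else None

print(pres_lacagi)  # AgilisSi
print(codi_clapa)   # 42
```

Matching the digits after ": " avoids carrying over the space or colon when the number is shorter than three characters.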

ADF map source columns startswith to sink columns in SQL table

I have an ADF data flow with many csv files as a source and a SQL database as a sink. The data in the csv files is similar, at 170-plus columns wide; however, not all of the files have the same columns. Additionally, some column names are different in each file, but each column name starts with the same corresponding 3 digits. Example: 203-student name, 644-student GPA.
Is it possible to map source columns using the first 3 characters?
Go back to the data flow designer and edit the data flow.
Click on the parameters tab
Create a new parameter and choose string array data type
For the default value as per your requirement, enter ['203-student name','203-student grade','203-student-marks']
Add a Select transformation. The Select transformation will be used to map incoming columns to new column names for output.
We're going to change the first 3 column names to the new names defined in the parameter
To do this, add 3 rule-based mapping entries in the bottom pane
For the first column, the matching rule will be position==1 and the name will be $parameter1[1]
Follow the same pattern for columns 2 and 3
Click on the Inspect and Data Preview tabs of the Select transformation to view the new column name.
Reference - https://learn.microsoft.com/en-us/azure/data-factory/tutorial-data-flow-dynamic-columns#parameterized-column-mapping
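The underlying idea in the question, matching source columns to sink columns by their first 3 characters, can be sketched in plain Python. This is not ADF data flow expression syntax; the function map_by_prefix, the sink column names, and the second file's column names are assumptions made up for the example.

```python
def map_by_prefix(source_columns, prefix_to_sink, prefix_len=3):
    """Map each source column to a sink column using its leading prefix.

    Columns whose prefix is not in prefix_to_sink are left unmapped.
    """
    mapping = {}
    for col in source_columns:
        sink = prefix_to_sink.get(col[:prefix_len])
        if sink is not None:
            mapping[col] = sink
    return mapping


# Hypothetical sink column names keyed by the 3-digit prefix.
prefix_to_sink = {"203": "student_name", "644": "student_gpa"}

# Two files whose column names differ but share the same prefixes.
file_a = ["203-student name", "644-student GPA"]
file_b = ["203-name of student", "999-unmapped column"]

print(map_by_prefix(file_a, prefix_to_sink))
print(map_by_prefix(file_b, prefix_to_sink))
```

Keying the lookup on the prefix means files with differently worded column names still land in the same sink columns.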

Postgresql : How can I determine how many characters in a text

A column has type text and its data looks like "{U}{R}" or "{3}{U}{U}{U}".
How can I determine how many "U" this column contains?
I want to select the rows that have at least one and at most three {U}.
You can remove the character U and compare the length before and after removing it. The difference is the number of occurrences.
select length('{3}{U}{U}{U}{R}{R}')-length(translate('{3}{U}{U}{U}{R}{R}','U','')) AS U_CNT;
--> 3
or, more generally (colname and tablename are placeholders):
select length(colname)-length(translate(colname,'U','')) AS U_CNT from tablename;
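The same length-difference trick, together with the "at least one and at most three" filter from the question, can be sketched in Python. This is an illustration rather than PostgreSQL syntax; the helper count_char and the sample rows beyond the two values quoted in the question are assumptions.

```python
def count_char(value: str, char: str) -> int:
    """Count occurrences of char as the length lost when it is removed."""
    return len(value) - len(value.replace(char, ""))


# The first two values come from the question; the rest are made-up edge cases.
rows = ["{U}{R}", "{3}{U}{U}{U}", "{R}", "{U}{U}{U}{U}"]

# Keep rows that contain between one and three "U" characters.
selected = [r for r in rows if 1 <= count_char(r, "U") <= 3]
print(selected)  # ['{U}{R}', '{3}{U}{U}{U}']
```

In SQL, the equivalent filter would compare the same length difference against the bounds 1 and 3 in a WHERE clause.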

In Tableau, Is it possible to give Column number in Column Shelf

I'm using Tableau. Instead of giving the [Column_name], is it possible to give a [Column_number] in the Columns shelf?
- Hariharasudhan. R.
No -- for good reason.
Think of the data source as a template for a potential SQL (or MDX or TQL) query; specifying tables, joins, unions and possibly some where/having clauses for data source filters.
The actual SQL generated for any particular view will be an (optimized) query that only selects columns that are actually needed for that particular view, adds where/having clauses based on the filters being used etc.
So a column doesn't have a fixed number. The same column may be the first field selected in one situation, the last field in another situation, and left off completely in another.
If you want to change the name shown on the Columns shelf:
Duplicate the field, use the duplicate in place of the original, and give it whatever name you need by right-clicking and choosing Edit Aliases.
Go to Data Source
In the middle right corner, check Show aliases
Right-click the column and choose Rename

Merge all the data in the second column for each unique value in the first column

I have two columns of data. Some of the data in the first column repeats (they represent questions). The data in the second column is unique (they represent multiple answers to the same question).
I need to merge all the data in the second column for each unique value in the first column. e.g.:
Q,A
1,yes.
1,is possible.
2,no.
2,not possible.
2,cannot do this.
2,impossible.
3,maybe.
merged to:
Q,A
1,yes.is possible.
2,no.not possible.cannot do this.impossible.
3,maybe.
Something like this is crude but may be adequate:
=IF(A1=A2,C1&B2,B2)
copied down to suit. Then select the last entry (identifiable with something like =A1=A2 copied down to suit) for each Question number.
Questions in column A sorted in order
Answers in column B
In C1 use =B1
In C2 use =IF(A2=A1,C1&B2,B2)
Drag the formula in C2 down.
It will keep adding the lines together as long as the question remains the same. When it gets to a new question, it'll start a new string. The last time each question is listed will be the complete string in column C.
Create a 2 column project in Google Refine
Sort by Q column (if not already sorted) and make sort permanent
Blank Down on Q column to remove duplicate values
On A column, do Edit Cells -> Merge multi-valued cells
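For reference, the merge described in the question can also be sketched in Python, using the sample data from the question. This is an alternative illustration and not part of the spreadsheet or Google Refine answers above.

```python
# (question, answer) pairs from the example in the question.
rows = [
    ("1", "yes."),
    ("1", "is possible."),
    ("2", "no."),
    ("2", "not possible."),
    ("2", "cannot do this."),
    ("2", "impossible."),
    ("3", "maybe."),
]

# Concatenate the answers for each question, keeping first-seen order.
merged = {}
for question, answer in rows:
    merged[question] = merged.get(question, "") + answer

for question, answers in merged.items():
    print(f"{question},{answers}")
```

Unlike the formula approach, this does not require the rows to be sorted by question first, because the dictionary accumulates answers per key.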