Google Cloud Dataprep - How to replace a pattern in a string - google-cloud-dataprep

How can I replace a pattern in a string in one column with a value from another column in Cloud Dataprep?
To be precise, I have a column A with the same pattern in every string of the column, and I want to replace that pattern inside a string with corresponding value (when i say corresponding I mean the value in the same row) from another column.
Any idea?

I think you should be able to do this in two steps:
1.- Splitting the column (using the pattern in order to take it away)
2.- Merging the column (gathering the pieces together into the one desired column)

Related

Can I create a parameter in a Local?

I created a Derived Column with a Expression
(dummy sample)
iif(columnX=='true',1,0)
This expression will be util in anothers Derived Columns, so I'd like create a Local with this Expression, but in the place of columnX I'll put a parameter for another column
Is it possible? How?
I tried creating a data flow parameter (param2) and added your expression to it using a different parameter (param1) instead of ColumnX.
But I was not able to change the parameter value in the expression later, it was only taking the default value. Also did not find any related documents to assign a column value to a parameter in the data flow.
The only way I could think of is, using the expression multiple times in different derived columns taking different columns in place of ColumnX.
Derived column1: Added expression to the new column (col1)
Preview of Derived column1: Evaluating expression against Sample1 column.
Derived column2: Reusing same column name Col1 to evaluate the expression against Sample2 column. If the previous value is needed, you can assign the previous value of Col1 to the new column (previous_col1) in derived column2 as shown in the below snip.
Preview of derived column2

Is there a way of creating a Serial Number based on other inputs on a MS access form?

I have some samples I need to take.
In order to create a good identifier/serial number for the samples, I want it to be a product of its characteristics.
For example, if the sample was taken from India and the temperature was 40 degrees then I would click dropdowns in the form to create those two entries and then a serial number would be spat out in the form "Ind40".
Assuming that your form is bound to a table, you can create a calculated column in the table that concatenates the values from other columns into a single value.
For instance, create a new column and give it a name (for example, SerialNbr). Then for Data Type select "Calculated". An expression builder window will appear:
Enter the columns you'd like to concatenate and separate them with &. Here is an example of how the expression could look:
Left([Country],3) & [Temperature]
This expression takes the first 3 chars from the Country column and combines it with the value from Temperature column to create the value in column SerialNbr. The calculated column will automatically update when values are entered into the other fields. I'd also suggest adding another value to the calculated expression to help avoid duplicates, such as date/time of submission.

How can I rename several columns in dataprep?

I have more than 100 columns in dataprep whose names are like:
my column name 1
my column name 2
I would like to rename the name of the columns to be:
my_column_name_1
my_column_name_2
I have tried to do a rename, changing " " by "_". However, dataprep only changes the first whitespace! Is there any way to change all the whitespaces?
Another question, when I do a function like rename, it is done just for a column. I can add more columns writing the name of de column. Is there any way to select all columns without writing all the names?
thank you so much!
You can shift-select multiple columns to Transform when the data is in column view mode.
Select the columns to apply to and then choose the transformation.
JSDBroughton answer did the trick for me although it's not so clear how to do it. Change your view to Columns (second icon from the left on the toolbar). Select the first column, then hold Shift and select the last column. You should now have all columns selected. Then right clock and select Rename. A new Recipe step will be added with all your columns already added. Then set the Option to "Find and replace".
In terms removing all the spaces I couldn't find any Cloud Dataprep pattern or Regular Expression which let me replace all my spaces in my columns. Having said that my columns had a maximum of 4 spaces so I simply added the same step multiple times. I used the Regular Expression \s to match spaces and I replaced them underscores.

How to convert a fixed length string to multiple columns in Tableau

Using Tableau Desktop 10.x I have a column that contain a fixed length string starting with a letter and followed by 5 numbers.
For example: A12345. I would like to separate this column into two columns, where the first column would have the letter (A) and the second would have the 5 number string (12345). How can this be done in Tableau?
It can be achieved using this by creating Calculated fields .
click on analysis
1 write the name of the new column
2 in the box write RIGHT
3 choose the function RIGHT
4 add the original column name inside the square brackets, and follow with a comma followed by 1 (for the amount of elements)
More details

Merge all the data in the second column for each unique value in the first column

I have two columns of data. Some of the data in the first column repeats (they represent questions). The data in the second column is unique (they represent multiple answers to the same question).
I need to merge all the data in the second column for each unique value in the first column. e.g.:
Q,A
1,yes.
1,is possible.
2,no.
2,not possible.
2,cannot do this.
2,impossible.
3,maybe.
merged to:
Q,A
1,yes.is possible.
2,no.not possible.cannot do this.impossible.
3,maybe.
Something like this is crude but may be adequate:
=IF(A1=A2,C1&B2,B2)
copied down to suit. Then select the last entry (identifiable with something like =A1=A2 copied down to suit) for each Question number.
Questions in column A sorted in order
Answers in column B
In C1 use =B1
In C2 use =if(a2=A1,C1&B2,B2)
Drag down formula in C2.
It will keep adding the lines together as long as the question remains the same. When it gets to a new question, it'll start a new string. The last time each question is listed will be the complete string in column C.
Create a 2 column project in Google Refine
Sort by Q column (if not already sorted) and make sort permanent
Blank Down on Q column to remove duplicate values
On A column, do Edit Cells -> Merge multi-valued cells