I need to run this command:
df.groupBy('docid', 'vehicle_vin').pivot('intervaltype').agg(first('NAME_COLUMN1').alias('NAME_COLUMN1'))
I want to pass the column names dynamically; they are stored in a list:
(NAME_COLUMN1,NAME_COLUMN2,NAME_COLUMN3.....)
I am trying to create several users using AWX, but whenever I run the job I get errors like this:
the value ['user1', 'user2', 'user3'] (type list) in a string field was converted to "['user1', 'user2', 'user3']" (type string). If this does not look like what you expect, quote the entire value to ensure it does not change.
This is how I added the variable in the AWX UI.
I have also tried passing it as JSON.
I have two Get Metadata activities in ADF that fetch file names from two different folders. I need to use their outputs for a file name comparison in a Databricks notebook and return true if all the files are present.
How can I pass the output of the Get Metadata activities to Databricks, perform the string comparison,
and return true if all files are present or false if even one file is missing?
The answer below uses one Get Metadata activity; the same approach can be replicated for more than one.
Create an ADF pipeline with the activities below.
In the Get Metadata activity, add childItems as an argument in the Field list so that the output of Get Metadata can be passed to the notebook, as shown below.
In the Databricks Notebook activity, add the parameter below as a Base Parameter (named metadata_output here); it captures the output of Get Metadata and passes it as an input parameter to the notebook. This value is normally of object datatype, but I converted it to a string so the file names can be read in the notebook, as shown below.
@string(activity('Get Metadata1').output.childItems)
Now the Get Metadata output can be accessed as a string in the notebook.
import ast
required_filenames = ['File1.csv', 'File2.csv', 'File3.csv']  # Expected file names, compared against the Get Metadata output.
metadata_value = dbutils.widgets.get('metadata_output')  # Read the Get Metadata output (a string) through Databricks widgets.
metadata_list = ast.literal_eval(metadata_value)  # Convert the string back into a list of dictionaries.
blob_output_list = []  # Collects the file names returned by the Get Metadata activity.
for i in metadata_list:
    blob_output_list.append(i['name'])  # Add each file name from blob storage to the list.
validateif = all(item in blob_output_list for item in required_filenames)  # True only if every required file name is in the list, otherwise False.
I tried the approach above and was able to meet the requirement. Hope this helps.
Please upvote the answer if it helps with your requirement.
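The comparison in the notebook code above can also be sketched as a small function that runs outside Databricks, which makes it easier to check. This is only a sketch under some assumptions: the string passed in is taken to be the JSON array that string() produces from childItems, json.loads is swapped in for ast.literal_eval, and the function name all_files_present and the sample data are made up for illustration.

```python
import json


def all_files_present(metadata_json, required_filenames):
    """Return True if every required file name appears in the childItems list."""
    # childItems is expected to look like [{"name": "...", "type": "File"}, ...]
    child_items = json.loads(metadata_json)
    present = {item["name"] for item in child_items}
    return set(required_filenames) <= present


# Hypothetical sample of what the Base Parameter might contain.
sample = '[{"name": "File1.csv", "type": "File"}, {"name": "File2.csv", "type": "File"}]'

print(all_files_present(sample, ["File1.csv", "File2.csv"]))
print(all_files_present(sample, ["File1.csv", "File3.csv"]))

# Inside the notebook the input would come from the widget, and the result
# could be handed back to the pipeline, for example:
#   result = all_files_present(dbutils.widgets.get("metadata_output"), required)
#   dbutils.notebook.exit(str(result))
```

If the result is returned with dbutils.notebook.exit, the pipeline should be able to read it from the Notebook activity's output (runOutput) and branch on it, for example in an If Condition activity.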
I would like to pass a column name into a Q function to query a loaded table.
Example:
getDistinct:{[x] select count x from raw}
getDistinct "HEADER"
This doesn't work, as the q documentation says I cannot pass columns as arguments. Is there a way around this?
When q interprets x it treats it as a string; it has no reference to the column, so your output would just be count "HEADER".
If you want to pass in the column as a string, you need to build the whole select statement as a string and then use value:
{value "select count ",x," from tab"} "HEADER"
However, the recommended method would be to use a functional select. Below I use parse to build the functional select equivalent using the parse tree.
/Create sample table
tab:([]inst:10?`MSFT`GOOG`AAPL;time:10?.z.p;price:10?10f)
/Generate my parse tree to get my functional form
.Q.s parse "select count i by inst from tab"
/Build this into my function
{?[`tab;();(enlist x)!enlist x;(enlist `countDistinct)!enlist (#:;`i)]} `inst
Note that you have to pass the column in as a symbol. Additionally, #: is just the k equivalent of count, so (#:;`i) corresponds to count i.
Update for multiple columns
tab:([]inst:10?`MSFT`GOOG`AAPL;time:10?.z.p;price:10?10f;cntr:10?`HK`SG`UK`US)
{?[`tab;();(x)!x;(enlist `countDistinct)!enlist (#:;`i)]} `inst`cntr
To get the functional form of a select statement, I recommend using buildSelect. Also, reduce the use of parentheses, i.e. use enlist[`countDistinct] instead of (enlist `countDistinct).
I have a Rundeck job called "TEST".
It has an option called country.
This option retrieves a list of name/value pairs from a remote URL:
[
{"name":"FRANCE", "value":"FR"},
{"name":"ITALY", "value":"IT"},
{"name":"ALGERIA", "value":"DZ"}
]
I would like to use both the name and the value in a job step:
echo ${option.country.name}
echo ${option.country.value}
But this doesn't work, and I am not able to get the name of the selected entry.
Getting the value works with ${option.country}.
Is there any trick to get the name?
Just for the record: maybe the best approach is to create a script step that reads the JSON file and extracts the name. Alternatively, you can use the same text for both the name and the value (of course, this is not applicable in all cases).
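A script step along those lines might look like the sketch below. It is not a confirmed Rundeck recipe: the JSON is embedded in the script instead of being read from the remote URL, the helper name name_for_value is made up, and reading the selected value from an RD_OPTION_COUNTRY environment variable is an assumption about how the option reaches the script (with "FR" as a fallback for local runs).

```python
import json
import os

# Same list as the remote URL returns in the question (embedded here as an assumption).
OPTIONS_JSON = """[
  {"name": "FRANCE", "value": "FR"},
  {"name": "ITALY", "value": "IT"},
  {"name": "ALGERIA", "value": "DZ"}
]"""


def name_for_value(options_json, value):
    """Return the name whose value matches, or None if there is no match."""
    for entry in json.loads(options_json):
        if entry["value"] == value:
            return entry["name"]
    return None


# Assumed: the selected option value is available as an environment variable.
selected = os.environ.get("RD_OPTION_COUNTRY", "FR")
print(selected, name_for_value(OPTIONS_JSON, selected))
```

The lookup is a plain linear scan, which is enough for a short option list; for the real job the JSON would be loaded from wherever the option's remote URL gets it.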
I am trying to write the values of a feeder that supplies IDs into a .txt file. Is there any way to extract values directly from the feeder without having to extract the ID from each session?
I'm not sure what you mean, but to extract values from a feeder you can use the following:
val creditCard = "creditCard"
feed(tsv("CreditCard.txt").random)
Inside the file "CreditCard.txt", the first line (the column name) must exactly match the value of the variable, i.e. "creditCard".
This way you can use it as "${creditCard}" in your script.