Convert map variable to dict format - kubernetes-helm

My code returns the value of info as follows:
info: '[map[usa:map[DrRegions:uk region:usa region_key:usa] uk:map[DrRegions:usa region:uk]]]'
How do I convert "info" to the format below?
info:
  usa:
    DrRegions: uk
    region: usa
  uk:
    DrRegions: usa
    region: uk
I tried using toYaml, but it does not return what I expect:
regionInfo: 'usa: DrRegions: uk region: usa uk: DrRegions: usa region: uk


How do I sort hierarchical data in Palantir?

Let's say I have flight data (from Foundry Academy).
Starting dataset:
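| Date | flight_id | origin_state | carrier_name |
| ---- | --------- | ------------ | ------------ |
| jan | 000000001 | California | delta air |
| jan | 000000002 | Alabama | delta air |
| jan | 000000003 | California | southwest |
| feb | 000000004 | California | southwest |
| ... | ... | ... | ... |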
I'm aggregating the data monthly by state and by carrier. The head of my aggregated data looks like this:
| origin state | carrier name | jan | feb | ... |
| ------------ | ------------ | --- | --- | --- |
| Alabama | delta air | 1 | 0 | ... |
| California | delta air | 1 | 0 | ... |
| California | southwest | 1 | 1 | ... |
I need to get subtotals for each state;
I need to sort by most flights;
and I want it to be sorted by states, then by carrier.
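Desired output:
| origin state | carrier name | jan | feb | ... |
| ------------ | ------------ | --- | --- | --- |
| California | null | 2 | 1 | ... |
| California | delta air | 1 | 0 | ... |
| California | southwest | 1 | 1 | ... |
| Alabama | null | 1 | 0 | ... |
| Alabama | delta air | 1 | 0 | ... |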
PIVOT - doesn't provide subtotals for categories;
EXPRESSION - doesn't offer a way to split the date column into separate columns.
I solved it in Contour. It's not the prettiest solution, but it works.
I created two paths from the same dataset:
| Date | flight_id | origin_state | carrier_name |
| ---- | --------- | ------------ | ------------ |
| ... | ... | ... | ... |
The first path calculates the full aggregation, using a pivot table and then switching to pivoted data:
Switch to pivoted data: using column "date",
grouped by "origin_state" and "carrier_name",
aggregated by Count
The second path calculates the subtotals:
Switch to pivoted data: using column "date",
grouped by "origin_state",
aggregated by Count
Afterwards I added an empty column "carrier_name" to the second dataset and made a union of both datasets:
Add rows that appear in "second_path" by column name
After that I added an additional column with an expression:
Add new column "order" from max("Jan") OVER (PARTITION BY "origin_state")
Then I sorted the resulting dataset:
Sort dataset by "order" descending, then by "Jan" descending
This gives the result, but it has an additional column, and now I would like to change the row formatting of the subtotals.
Other approaches are welcome, as my real data has more hierarchical levels.
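For comparison, the same steps (count per state and carrier, count per state, union of both, an "order" value per state, sort) can be sketched outside Contour in plain Python. This is only an illustration under assumptions: the helper name aggregate_with_subtotals and the tuple layout are made up for this example, only the four sample rows from the question are used, and ties are broken by carrier name rather than by "Jan" descending so that the row order is deterministic.

```python
from collections import Counter

# Sample rows from the question: (date, flight_id, origin_state, carrier_name)
flights = [
    ("jan", "000000001", "California", "delta air"),
    ("jan", "000000002", "Alabama", "delta air"),
    ("jan", "000000003", "California", "southwest"),
    ("feb", "000000004", "California", "southwest"),
]
MONTHS = ["jan", "feb"]


def aggregate_with_subtotals(rows, months):
    """Hypothetical helper: monthly counts per (state, carrier) plus per-state subtotals."""
    detail = Counter()    # (state, carrier, month) -> number of flights
    subtotal = Counter()  # (state, month) -> number of flights
    for month, _flight_id, state, carrier in rows:
        detail[(state, carrier, month)] += 1
        subtotal[(state, month)] += 1

    # Union of both "paths": subtotal rows carry carrier=None.
    table = []
    for state in {s for s, _ in subtotal}:
        table.append((state, None, [subtotal[(state, m)] for m in months]))
    for state, carrier in {(s, c) for s, c, _ in detail}:
        table.append((state, carrier, [detail[(state, carrier, m)] for m in months]))

    # "order" value: max of the first month within each state (i.e. the subtotal).
    order = {}
    for state, _carrier, counts in table:
        order[state] = max(order.get(state, 0), counts[0])

    # Busiest state first, subtotal row before its carriers, carriers by name.
    table.sort(key=lambda r: (-order[r[0]], r[0], r[1] is not None, r[1] or ""))
    return table


for row in aggregate_with_subtotals(flights, MONTHS):
    print(row)
```

Because the "order" value lives in a separate dictionary and is only used as a sort key, it never shows up as an extra column in the result.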

Value of date changes in scala when reading from yaml file

I have a YAML file with the following data:
time:
- days : 50
- date : 2020-02-30
I am reading both values in my Scala program:
val t = yml.get("time").asInstanceOf[util.ArrayList[util.LinkedHashMap[String, Any]]]
val date = t.get(1)
The output of date is {date=Sat Feb 29 19:00:00 EST 2020}.
If I put the value in the YAML file as date: 2020-02-25, the output is {date=Mon Feb 24 19:00:00 EST 2020} and not {date=Tue Feb 25 19:00:00 EST 2020}.
Why is the date value always reduced by 1?
Any help is highly appreciated.
PS - I want to validate the date; that is why the input is date: 2020-02-30
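One likely explanation, assuming the file is loaded with SnakeYAML (the question does not name the parser): an unquoted value such as 2020-02-25 is read as a timestamp at midnight UTC, and java.util.Date.toString() then prints it in the JVM's default time zone. Midnight UTC is 19:00 on the previous day in EST (UTC-5). The invalid 2020-02-30 appears to be rolled over to March 1 instead of being rejected, which is consistent with the Feb 29 19:00 EST output. The arithmetic can be sketched in Python (used here only for illustration, this is not the asker's Scala code, and the helper names are made up):

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset labelled "EST", matching the zone shown in the question's output.
EST = timezone(timedelta(hours=-5), "EST")


def as_seen_in_est(year, month, day):
    """Render midnight UTC of the given date the way an EST wall clock shows it."""
    midnight_utc = datetime(year, month, day, tzinfo=timezone.utc)
    return midnight_utc.astimezone(EST).strftime("%a %b %d %H:%M:%S %Z %Y")


def is_valid_date(text):
    """Strictly validate a yyyy-mm-dd string; impossible dates are rejected."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


print(as_seen_in_est(2020, 2, 25))  # Mon Feb 24 19:00:00 EST 2020
print(as_seen_in_est(2020, 3, 1))   # Sat Feb 29 19:00:00 EST 2020
print(is_valid_date("2020-02-30"))  # False
```

If the goal is validation, one option is to quote the value in the YAML file (date: '2020-02-30') so that it arrives as a plain string, and then parse it strictly; in Scala, java.time.LocalDate.parse rejects February 30 with a DateTimeParseException.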

How to query with "IN" in Q (kdb)?

Let's assume that I have a table in KDB named "Automotive" with the following data:
Manufacturer Country Sales Id
Mercedes United States 002
Mercedes Canada 002
Mercedes Germany 003
Mercedes Switzerland 003
Mercedes Japan 004
BMW United States 002
BMW Canada 002
BMW Germany 003
BMW Switzerland 003
BMW Japan 004
How would I structure a query in Q such that I can fetch the records matching United States and Canada without using an OR clause?
In SQL, it would look something like:
SELECT Manufacturer, Country from Automotive WHERE Country IN ('United States', 'Canada')
Thanks in advance for helping this Q beginner!
It's basically the same in kdb. The way you write your query depends on the data type. See below an example where manufacturer is a symbol and country is a string.
q)tbl:([]manufacturer:`Merc`Merc`BMW`BMW`BMW;country:("United States";"Canada";"United States";"Germany";"Japan");ID:til 5)
q)
q)tbl
manufacturer country ID
-------------------------------
Merc "United States" 0
Merc "Canada" 1
BMW "United States" 2
BMW "Germany" 3
BMW "Japan" 4
q)meta tbl
c | t f a
------------| -----
manufacturer| s
country | C
ID | j
q)select from tbl where manufacturer in `Merc`Ford
manufacturer country ID
-------------------------------
Merc "United States" 0
Merc "Canada" 1
q)
q)select from tbl where country in ("United States";"Canada")
manufacturer country ID
-------------------------------
Merc "United States" 0
Merc "Canada" 1
BMW "United States" 2
Check out how to use Q-sql here: https://code.kx.com/q4m3/9_Queries_q-sql/

Remove duplicates in spark with 90 percent column match

I want to compare rows in a Spark DataFrame and remove a row if 90 percent of its columns match another row (for example, 9 out of 10 columns). How can I do this?
Name Country City Married Salary
Tony India Delhi Yes 30000
Carol USA Chicago Yes 35000
Shuaib France Paris No 25000
Dimitris Spain Madrid No 28000
Richard Italy Milan Yes 32000
Adam Portugal Lisbon Yes 36000
Tony India Delhi Yes 22000 <--
Carol USA Chicago Yes 21000 <--
Shuaib France Paris No 20000 <--
The marked rows have to be removed because 4 out of 5 of their column values match rows that already exist. How can I do this with a PySpark DataFrame? TIA
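The matching rule itself can be sketched in plain Python before translating it to Spark. This is not a PySpark implementation: it compares every row against the rows kept so far, which is quadratic and only practical for small data. The function name drop_near_duplicates and the threshold parameter are assumptions made for this example. Note that 4 out of 5 columns is 80 percent, so the sample below uses a threshold of 0.8; for the 10-column case from the question it would be 0.9.

```python
# Sample rows from the question: (Name, Country, City, Married, Salary)
rows = [
    ("Tony", "India", "Delhi", "Yes", 30000),
    ("Carol", "USA", "Chicago", "Yes", 35000),
    ("Shuaib", "France", "Paris", "No", 25000),
    ("Dimitris", "Spain", "Madrid", "No", 28000),
    ("Richard", "Italy", "Milan", "Yes", 32000),
    ("Adam", "Portugal", "Lisbon", "Yes", 36000),
    ("Tony", "India", "Delhi", "Yes", 22000),
    ("Carol", "USA", "Chicago", "Yes", 21000),
    ("Shuaib", "France", "Paris", "No", 20000),
]


def drop_near_duplicates(rows, threshold):
    """Keep a row only if no already-kept row matches it in >= threshold of the columns."""
    kept = []
    for row in rows:
        is_duplicate = any(
            sum(a == b for a, b in zip(row, other)) / len(row) >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(row)
    return kept


for row in drop_near_duplicates(rows, threshold=0.8):
    print(row)
```

The first occurrence of each near-duplicate group is the one that survives, which matches the rows marked for removal in the question.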

Tableau: How to display the maximum sales category name using a group function?

I have data like the following:
Input
Year Region Sales Team Name
2014 East 30 Team1
2014 East 26 Team2
2014 East 28 Team1
2014 West 40 Team1
2014 West 34 Team2
2014 North 56 Team1
2014 North 50 Team2
2014 South 24 Team1
2014 South 32 Team2
2014 South 19 Team3
2015 East 35 Team1
2015 East 42 Team2
2015 East 54 Team3
2015 West 41 Team1
2015 West 43 Team2
2015 West 40 Team3
2015 North 38 Team1
2015 North 32 Team2
2015 North 41 Team3
2015 South 28 Team1
2015 South 29 Team2
I am trying to achieve the following output:
Output
2014 East Team1
2014 West Team1
2014 North Team1
2014 South Team2
2015 East Team3
2015 West Team2
2015 North Team3
2015 South Team2
For each region within a year, I have to display the name of the team that made the maximum sales. I can display the maximum sales quantity in a table using MAX(SALES), but I cannot display the team name: when I include the team name in the columns or in the Text field, I get all the other rows for that year as well.
Kindly help me in this regard.
Use the RANK formula and then filter on that for 1. Adjust the Compute Using as needed depending on your layout.
rank(sum([Sales]))
Take a look at the sample workbook here: https://dl.dropboxusercontent.com/u/60455118/161116%20stack%20question.twbx
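What rank(sum([Sales])) filtered to 1 computes, when the rank is partitioned by year and region, can be sketched in plain Python. This is not Tableau syntax, only an illustration of the logic; the function name top_team_per_region is made up for this example, and ties are not handled (the first team to reach the maximum wins). For the sample rows, 2015 West resolves to Team2 (43 against 41 and 40).

```python
from collections import defaultdict

# Sample rows from the question: (year, region, sales, team)
sales = [
    (2014, "East", 30, "Team1"), (2014, "East", 26, "Team2"), (2014, "East", 28, "Team1"),
    (2014, "West", 40, "Team1"), (2014, "West", 34, "Team2"),
    (2014, "North", 56, "Team1"), (2014, "North", 50, "Team2"),
    (2014, "South", 24, "Team1"), (2014, "South", 32, "Team2"), (2014, "South", 19, "Team3"),
    (2015, "East", 35, "Team1"), (2015, "East", 42, "Team2"), (2015, "East", 54, "Team3"),
    (2015, "West", 41, "Team1"), (2015, "West", 43, "Team2"), (2015, "West", 40, "Team3"),
    (2015, "North", 38, "Team1"), (2015, "North", 32, "Team2"), (2015, "North", 41, "Team3"),
    (2015, "South", 28, "Team1"), (2015, "South", 29, "Team2"),
]


def top_team_per_region(rows):
    """Sum sales per (year, region, team), then keep the rank-1 team per (year, region)."""
    totals = defaultdict(int)  # (year, region, team) -> summed sales
    for year, region, amount, team in rows:
        totals[(year, region, team)] += amount

    best = {}  # (year, region) -> (team, total) of the current leader
    for (year, region, team), total in totals.items():
        leader = best.get((year, region))
        if leader is None or total > leader[1]:
            best[(year, region)] = (team, total)
    return {key: team for key, (team, _total) in best.items()}


for (year, region), team in top_team_per_region(sales).items():
    print(year, region, team)
```

The summing step matters for 2014 East, where Team1 appears twice (30 and 28) and is ranked on the total of 58.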