LibreOffice - RANDBETWEEN return a name - libreoffice

I have a two-column list like this:
+----+-------+
| Nr | Name |
+----+-------+
| 1 | Alice |
| 2 | Bob |
| 3 | Joe |
| 4 | Ann |
| 5 | Jane |
+----+-------+
I would like to generate a random name from this list.
For now I can only select a number at random and then manually pick out the corresponding name, using this function: =RANDBETWEEN(A2;A10). How can I pick out the name instead?

Assuming that the data of your table are in cells E7:F11, the following formula does what you need (the final 0 requests an exact match, so it also works if the Nr column is not sorted):
=VLOOKUP(RANDBETWEEN(1;5);E7:F11;2;0)
Further, if you need a random permutation of the names, you can also use the Calc extension Permutate at https://sourceforge.net/projects/permutate/.
Hope that helps.

Assuming your data starts with the Nr header in A1, I suggest:
=INDEX(B$2:B$6;RANDBETWEEN(1;5))
That way the Nr column is not needed for the selection.
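
For readers who think in code, the INDEX/RANDBETWEEN formula amounts to drawing a random position and reading the name stored there. A minimal Python sketch of that idea follows; `names` and `pick_random_name` are illustrative names only and have nothing to do with LibreOffice itself.

```python
import random

# Names from column B of the example table (rows 2 to 6).
names = ["Alice", "Bob", "Joe", "Ann", "Jane"]


def pick_random_name(names, rng=random):
    """Mimic =INDEX(B$2:B$6;RANDBETWEEN(1;5)) for a Python list."""
    # RANDBETWEEN(1;5) is inclusive on both ends, like randint(1, 5).
    position = rng.randint(1, len(names))
    # INDEX is 1-based, Python lists are 0-based.
    return names[position - 1]


print(pick_random_name(names))
```

As in Calc, the result changes on every evaluation; passing a seeded `random.Random` as `rng` makes the draw repeatable.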


SPSS group by rows and concatenate string into one variable

I'm trying to export SPSS metadata to a custom format using SPSS syntax. The dataset with value labels contains one or more labels for the variables.
However, now I want to concatenate the value labels into one string per variable. For example for the variable SEX combine or group the rows F/Female and M/Male into one variable F=Female;M=Male;. I already concatenated the code and labels into a new variable using Compute CodeValueLabel = concat(Code,'=',ValueLabel).
So the starting point for the source dataset looks like this:
+--------------+------+----------------+------------------+
| VarName | Code | ValueLabel | CodeValueLabel |
+--------------+------+----------------+------------------+
| SEX | F | Female | F=Female |
| SEX | M | Male | M=Male |
| ICFORM | 1 | Yes | 1=Yes |
| LIMIT_DETECT | 0 | Too low | 0=Too low |
| LIMIT_DETECT | 1 | Normal | 1=Normal |
| LIMIT_DETECT | 2 | Too high | 2=Too high |
| LIMIT_DETECT | 9 | Not applicable | 9=Not applicable |
+--------------+------+----------------+------------------+
The goal is to get a dataset something like this:
+--------------+-------------------------------------------------+
| VarName | group_and_concatenate |
+--------------+-------------------------------------------------+
| SEX | F=Female;M=Male; |
| ICFORM | 1=Yes; |
| LIMIT_DETECT | 0=Too low;1=Normal;2=Too high;9=Not applicable; |
+--------------+-------------------------------------------------+
I tried using CASESTOVARS, but that creates several separate variables rather than one single string variable. I'm starting to suspect that I'm running up against the limits of what SPSS can do, although maybe it's possible using some AGGREGATE or OMS trickery. Any ideas on how to do this?

First I recreate your example to demonstrate on:
data list list/varName CodeValueLabel (2a30).
begin data
"SEX" "F=Female"
"SEX" "M=Male"
"ICFORM" "1=Yes"
"LIMIT_DETECT" "0=Too low"
"LIMIT_DETECT" "1=Normal"
"LIMIT_DETECT" "2=Too high"
"LIMIT_DETECT" "9=Not applicable"
end data.
Now to work:
* sorting to make sure all labels are bunched together.
sort cases by varName CodeValueLabel.
string combineall (a300).
* adding ";" .
compute combineall=concat(rtrim(CodeValueLabel), ";").
* if this is the same varname as in the previous row, append to the running string.
* no separator is added, so the result matches the requested "F=Female;M=Male;" format.
if $casenum>1 and varName=lag(varName)
 combineall=concat(rtrim(lag(combineall)), rtrim(combineall)).
exe.
*now to select only relevant lines - first I identify them.
match files /file=* /last=selectthis /by varName.
*now we can delete the rest.
select if selectthis=1.
exe.
NOTE: make combineall wide enough to contain all the values of your most populated variable.
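
The same sort, running-concatenation and keep-last-row logic can be sketched outside SPSS. The Python below is only an illustration under the assumption that the data is available as (variable name, "code=label") pairs; `rows` and `combine_labels` are made-up names.

```python
rows = [
    ("SEX", "F=Female"),
    ("SEX", "M=Male"),
    ("ICFORM", "1=Yes"),
    ("LIMIT_DETECT", "0=Too low"),
    ("LIMIT_DETECT", "1=Normal"),
    ("LIMIT_DETECT", "2=Too high"),
    ("LIMIT_DETECT", "9=Not applicable"),
]


def combine_labels(rows):
    """Return {var_name: 'code=label;code=label;'} via a running concatenation."""
    combined = {}
    prev_name, running = None, ""
    # Sort so that all labels of one variable are adjacent (like SORT CASES).
    for var_name, label in sorted(rows):
        piece = label + ";"
        # Same variable as the previous row: append to the running string (like LAG).
        running = running + piece if var_name == prev_name else piece
        # Overwriting keeps only the last, complete string per variable
        # (like MATCH FILES /LAST followed by SELECT IF).
        combined[var_name] = running
        prev_name = var_name
    return combined


for var_name, labels in combine_labels(rows).items():
    print(var_name, labels)
```

Unlike the SPSS string variable, a Python string has no fixed width, so the note about making combineall wide enough has no counterpart here.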

Spark scala finding value in another dataframe

Hello I'm fairly new to spark and I need help with this little exercise. I want to find certain values in another dataframe but if those values aren't present I want to reduce the length of each value until I find the match. I have these dataframes:
----------------
|values_to_find|
----------------
| ABCDE |
| CBDEA |
| ACDEA |
| EACBA |
----------------
------------------
| list | Id |
------------------
| EAC | 1 |
| ACDE | 2 |
| CBDEA | 3 |
| ABC | 4 |
------------------
And I expect the next output:
--------------------------------
| Id | list | values_to_find |
--------------------------------
| 4 | ABC | ABCDE |
| 3 | CBDEA | CBDEA |
| 2 | ACDE | ACDEA |
| 1 | EAC | EACBA |
--------------------------------
For example, ABCDE isn't present, so I reduce its length by one (ABCD). That doesn't match either, so I reduce it again and get ABC, which matches, so I use that value to join and form a new dataframe. There is no need to worry about duplicate values when reducing the length, but I need to find the exact match. Also, I would like to avoid using a UDF if possible.
I'm using a foreach to get every value in the first dataframe and I can do a substring there (if there is no match) but I'm not sure how to lookup these values in the 2nd dataframe. What's the best way to do it? I've seen tons of UDFs that could do the trick but I want to avoid that as stated before.
df1.foreach { row =>
  row.getString(0).substring(0, 4) }
Edit: Those dataframes are examples, I have many more values, the solution should be dynamic... iterate over some values and find their match in another dataframe with the catch that I need to reduce their length if not present.
Thanks for the help!
You can register the dataframes as temporary views and write SQL against them. Are you implementing this scenario for the first time in Spark, or did you already implement it in a legacy system before Spark? With Spark you are free to write a UDF in Scala or to use SQL. Sorry, I don't have a solution handy, so this is just a pointer.

The following will help you. Note that it hard-codes a 3-character prefix, which is enough for the sample data but does not shorten the values step by step:
val dataDF1 = Seq((4,"ABC"),(3,"CBDEA"),(2,"ACDE"),(1,"EAC")).toDF("Id","list")
val dataDF2 = Seq(("ABCDE"),("CBDEA"),("ACDEA"),("EACBA")).toDF("compare")
dataDF1.createOrReplaceTempView("table1")
dataDF2.createOrReplaceTempView("table2")
spark.sql("select * from table1 inner join table2 on table1.list like concat('%',SUBSTRING(table2.compare,1,3),'%')").show()
Output:
+---+-----+-------+
| Id| list|compare|
+---+-----+-------+
| 4| ABC| ABCDE|
| 3|CBDEA| CBDEA|
| 2| ACDE| ACDEA|
| 1| EAC| EACBA|
+---+-----+-------+
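
The query above relies on a fixed 3-character prefix. The step-by-step shortening described in the question is a longest-prefix match, and the plain Python sketch below shows that logic on the sample data. It is not Spark code, and `lookup`, `longest_prefix_match` and `match_all` are hypothetical names chosen for illustration.

```python
# Second dataframe from the question as a dict: list value -> Id.
lookup = {"EAC": 1, "ACDE": 2, "CBDEA": 3, "ABC": 4}
values_to_find = ["ABCDE", "CBDEA", "ACDEA", "EACBA"]


def longest_prefix_match(value, lookup):
    """Shorten value from the right until it is an exact key of lookup."""
    for end in range(len(value), 0, -1):
        candidate = value[:end]
        if candidate in lookup:
            return lookup[candidate], candidate
    return None  # no prefix of any length matched


def match_all(values, lookup):
    """Return (Id, list, values_to_find) tuples for every value that matches."""
    matches = []
    for value in values:
        hit = longest_prefix_match(value, lookup)
        if hit is not None:
            matches.append((hit[0], hit[1], value))
    return matches


for row in match_all(values_to_find, lookup):
    print(row)
```

In Spark SQL the same idea could probably be expressed without a UDF by joining on the condition that `compare` starts with `list` and keeping the longest `list` per value, but that variant is untested here.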

Combine multiple rows into single row in Google Data Prep

I have a table which has multiple payload values in separate rows. I want to combine those rows into a single row to have all the data together. Table looks something like this.
+------------+--------------+------+----+----+----+----+
| Date | Time | User | D1 | D2 | D3 | D4 |
+------------+--------------+------+----+----+----+----+
| 2020-04-15 | 05:39:45 UTC | A | 2 | | | |
| 2020-04-15 | 05:39:45 UTC | A | | 5 | | |
| 2020-04-15 | 05:39:45 UTC | A | | | 8 | |
| 2020-04-15 | 05:39:45 UTC | A | | | | 7 |
+------------+--------------+------+----+----+----+----+
And I want to convert it to something like this.
+------------+--------------+------+----+----+----+----+
| Date | Time | User | D1 | D2 | D3 | D4 |
+------------+--------------+------+----+----+----+----+
| 2020-04-15 | 05:39:45 UTC | A | 2 | 5 | 8 | 7 |
+------------+--------------+------+----+----+----+----+
I tried "set" and "aggregate" but they didn't work as I wanted them to and I am not sure how to go forward.
Any help would be appreciated.
Thanks.
tl;dr:
Use the fill() function to fill all empty values in each of the D1-D4 columns within the wanted group (i.e. the columns Date + Time + User), then dedup/aggregate to your heart's content.
Long version:
The quickest way to do this is with a window function called "fill()".
For each cell in a column, this function tells it:
"Look down. Look up. Find the closest non-empty value, and copy it!"
You can of course limit its sight (look only 3 rows above, for example), but this example doesn't need the limitation, so your fill function will look like this:
FILL($col, -1, -1)
The "$col" references all the chosen columns, and each "-1" says "unlimited sight".
Finally, the "~" in the column selection (D1~D4) says "from column D1 to column D4".
So the function is applied to columns D1~D4 and grouped by Date, Time and User, which in turn makes every row of the group carry all four values (D1=2, D2=5, D3=8, D4=7).
Now you can use the "dedup" transformation to remove the duplicates, so that only one copy of each group remains.
Alternatively, if you still want to use "group by", you can do that as well.
Hope this helps =]
P.S.
There are more ways to do this, which involve the "pivot" transformation and array unnesting. But in the process you'll lose your columns' names and will need to rename them.
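
To make the fill-then-dedup idea concrete, here is a small Python sketch of the same outcome. It is not Dataprep syntax, and it assumes that each D column holds at most one non-empty value per Date/Time/User group, as in the sample table. `rows`, `KEYS`, `VALUES` and `fill_and_dedup` are illustrative names.

```python
KEYS = ("Date", "Time", "User")
VALUES = ("D1", "D2", "D3", "D4")


def make_row(**payload):
    """Build one sample row; payload columns that are not given stay empty (None)."""
    row = {"Date": "2020-04-15", "Time": "05:39:45 UTC", "User": "A"}
    row.update({column: payload.get(column) for column in VALUES})
    return row


rows = [make_row(D1=2), make_row(D2=5), make_row(D3=8), make_row(D4=7)]


def fill_and_dedup(rows, keys=KEYS, values=VALUES):
    """Collapse each key group into one row holding the non-empty values."""
    merged = {}
    for row in rows:
        group = tuple(row[k] for k in keys)
        # First time we see a group: start from its keys with empty payload.
        target = merged.setdefault(
            group, {**{k: row[k] for k in keys}, **{v: None for v in values}}
        )
        for v in values:
            # Copy a value only into a still-empty cell (the "fill" part).
            if target[v] is None and row[v] is not None:
                target[v] = row[v]
    # One row per group remains (the "dedup" part).
    return list(merged.values())


print(fill_and_dedup(rows))
```

If a column could hold several different values within one group, this sketch would silently keep the first one, so that case would need an explicit rule.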

How to aggregate Postgres table so that ID is unique and column values are collected in array?

I'm not sure what to call what I'm trying to do, so trying to look it up didn't work very well. I would like to aggregate my table based on one column and have all the rows from another column collapsed into an array by unique ID.
| ID | some_other_value |
-------------------------
| 1 | A |
| 1 | B |
| 2 | C |
| .. | ... |
To return
| ID | values_array |
-------------------------
| 1 | {A, B} |
| 2 | {C} |
Sorry for the bad explanation, I'm really lacking the vocabulary here. Any help with writing a query that achieves what's in the example would be very much appreciated.
Try the following:
select id, array_agg(some_other_value order by some_other_value) as values_array from <yourTableName> group by id;
See Aggregate Functions documentation.
SELECT
id,
array_agg(some_other_value)
FROM
the_table
GROUP BY
id;
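
For intuition, what array_agg with GROUP BY does to the sample rows can be mimicked in a few lines of Python. This is only an illustration of the grouping, not a replacement for the query; `rows` and `array_agg_by_id` are made-up names.

```python
from collections import defaultdict

# (ID, some_other_value) pairs from the example table.
rows = [(1, "A"), (1, "B"), (2, "C")]


def array_agg_by_id(rows):
    """Collect the values of each ID into a list, in input order."""
    grouped = defaultdict(list)
    for row_id, value in rows:
        grouped[row_id].append(value)
    return dict(grouped)


print(array_agg_by_id(rows))
```

In PostgreSQL the order of elements inside the array is not guaranteed unless an ORDER BY is given inside array_agg, whereas this sketch simply keeps the input order.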

Tableau - Show multiple discrete string (dropdown) dimensions side-by-side in a single table

I have a list of survey results that looks similar to the following:
| Email | Question 1 | Question 2 |
| ----------------- | ---------- | ---------- |
| test#example.com | Always | Sometimes |
| test2#example.com | Always | Always |
| test3#example.com | Sometimes | Never |
Question 1 and Question 2 (and a few others) have the same discrete set of values (from a dropdown list on the survey).
I want to show the data in the following format in Tableau (a table is fine, but a heatmap or highlight table would be best):
| | Always | Sometimes | Never |
| ---------- | ------ | --------- | ----- |
| Question 1 | 2 | 1 | 0 |
| Question 2 | 1 | 1 | 1 |
How can I achieve this? I've tried various combinations of rows and columns and I just can't seem to get close to this layout. Do I need to use a calculated value?
As far as I know, this is not natively possible in Tableau, because what you have is already a kind of pivot table.
What you can do is unpivot the whole table as explained here: https://stackoverflow.com/a/20543651/5130012. Then you can load the data into Tableau and create the table you want.
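
To show what unpivoting means for the survey data in the question, here is a minimal Python sketch that reshapes the wide rows into (Email, Question, Answer) records and then counts answers per question. It only illustrates the reshaping and is not the method from the linked answer; `survey`, `unpivot` and `crosstab` are illustrative names.

```python
from collections import Counter

survey = [
    {"Email": "test#example.com", "Question 1": "Always", "Question 2": "Sometimes"},
    {"Email": "test2#example.com", "Question 1": "Always", "Question 2": "Always"},
    {"Email": "test3#example.com", "Question 1": "Sometimes", "Question 2": "Never"},
]


def unpivot(rows, id_column="Email"):
    """Turn one wide row per respondent into one long row per answer."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column != id_column:
                long_rows.append((row[id_column], column, value))
    return long_rows


def crosstab(long_rows):
    """Count how often each (question, answer) pair occurs."""
    return Counter((question, answer) for _, question, answer in long_rows)


counts = crosstab(unpivot(survey))
for question in ("Question 1", "Question 2"):
    # A Counter returns 0 for pairs that never occur, e.g. ("Question 1", "Never").
    print(question, [counts[(question, a)] for a in ("Always", "Sometimes", "Never")])
```

Once the data is in this long shape, the question and the answer are ordinary dimensions, which is what makes the requested layout possible.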
I made some dummy data and tried it.
That's my "unpivoted" table:
Row,Column,Value
test,q1,always
test,q2,sometimes
test1,q1,sometimes
test1,q2,never
test10,q1,always
test10,q2,always
test11,q1,sometimes
test11,q2,never
And in Tableau it can then be laid out as the table you want. [Screenshot not preserved]