QUERY IMPORTRANGE with concat - adding 2 of the returned columns together

I am trying to import specific data using QUERY with IMPORTRANGE, but at the same time I want to reduce the need for additional calculation columns by using CONCAT or something similar to add 2 columns together with a space in between, i.e. first name 'bob' and last name 'smith' return 'bob smith' in 1 column.
=QUERY({IMPORTRANGE("https://docs.google.com/spreadsheets/d/1oaZP3-p1cI4d1QyLQ2qM5sMwnVGz8S0bhe29W4QqH6g/edit#gid=1908577977","Sheet7!A2:c"),"select Col1&" "&Col2,Col3",0})
I've tried the above, but it returns a formula parse error.
https://docs.google.com/spreadsheets/d/1oaZP3-p1cI4d1QyLQ2qM5sMwnVGz8S0bhe29W4QqH6g/edit?usp=sharing

In a post-IMPORTRANGE QUERY you can join two columns only like this:
=FLATTEN(QUERY(TRANSPOSE(QUERY(
IMPORTRANGE("13Ptmj3sejlOADvwhgfBPxRy_H-RGCxLX4r2jecbceIE", "Sheet7!A2:C"),
"select Col1,Col2", )),,9^9))
so for 3 columns:
={FLATTEN(QUERY(TRANSPOSE(QUERY(
IMPORTRANGE("13Ptmj3sejlOADvwhgfBPxRy_H-RGCxLX4r2jecbceIE", "Sheet7!A2:C"),
"select Col1,Col2", )),,9^9)),
IMPORTRANGE("13Ptmj3sejlOADvwhgfBPxRy_H-RGCxLX4r2jecbceIE", "Sheet7!C2:C")}


How to take the left-side string from a particular position using PostgreSQL

I have a table with a column called eq1.sources. In that column, the entries look like the ones mentioned below. I would like to extract the string from the left side up to the card slot number only.
Example:
fdn:realm:pam:network:55.150.40.841:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
for this entry I need only
fdn:realm:pam:network:55.150.40.841:shelf-1:cardSlot-1
similarly
fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1:card
for this entry I need
fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1
I have tried substring(eq1.sources, 0, position(':card:daughter' in eq1.sources)). This works only for rows 1, 2, 4, 5, 6, 7, 9 and 10; rows 3, 8 and 11 do not work because those entries do not continue with ':card:daughter'.
The column name for the below entries is eq1.sources.
1.fdn:realm:pam:network:55.150.40.841:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
2.fdn:realm:pam:network:35.250.40.824:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
3.fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1:card
4.fdn:realm:pam:network:55.159.40.994:shelf-1:cardSlot-2:card:daughterCardSlot-1:daughterCard
5.fdn:realm:pam:network:35.250.140.104:shelf-1:cardSlot-2:card:daughterCardSlot-1:daughterCard
6.fdn:realm:pam:network:55.170.40.1:shelf-1:cardSlot-2:card:daughterCardSlot-1:daughterCard
7.fdn:realm:pam:network:35.450.40.24:shelf-1:cardSlot-3:card:daughterCardSlot-1:daughterCard
8.fdn:realm:sam:network:35.250.40.14:shelf-1:cardSlot-3:card
9.fdn:realm:pam:network:55.150.40.854:shelf-1:cardSlot-4:card:daughterCardSlot-1:daughterCard
10.fdn:realm:pam:network:35.250.40.84:shelf-1:cardSlot-5:card:daughterCardSlot-1:daughterCard
11.fdn:realm:sam:network:35.250.40.84:shelf-1:cardSlot-6:card
Expecting a PostgreSQL query to extract the left-side substring up to a particular position in each row.
For example, from
1. fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1:card:daughterCardSlot-1:daughterCard
2. fdn:realm:sam:network:35.250.40.14:shelf-1:cardSlot-3:card
the expected output is
1. fdn:realm:sam:network:35.250.40.834:shelf-1:cardSlot-1
2. fdn:realm:sam:network:35.250.40.14:shelf-1:cardSlot-3
First split the string into an array with ':' as the delimiter (this is the t subquery), then pick the first 7 array elements and join them back into a string with ':' as the delimiter.
select array_to_string(arr[1:7], ':') as sources
from (
    select string_to_array(sources, ':') as arr
    from the_table
) as t;
See demo.
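A single regular expression can express the same "first 7 fields" idea; this is a minimal sketch, assuming the same the_table and sources names as above:

-- Capture the first 7 ':'-separated fields, i.e. up to and including cardSlot-N;
-- the inner (?:...) group is non-capturing, so substring() returns the outer capture.
select substring(sources from '^((?:[^:]+:){6}[^:]+)') as sources
from the_table;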

Postgres get null if row doesn't exist in where clause

I have a Postgres table with data like this:
Name   | Attendance
Jackie | 2
Jade   | 5
Xi     | 10
Chan   | 15
In my query I want the attendance for each name, and if a name doesn't exist, I want NULL returned instead of no row for that particular name.
E.g. a query with
where Name in ('Jackie', 'Jade', 'Cha', 'Xi')
should return
Name   | Attendance
Jackie | 2
Jade   | 5
NULL   | NULL
Xi     | 10
To produce the desired rows, you need to join with a table or set of rows which has all those names.
You can do this by inserting the names into a temp table and joining on that, but in Postgres you can turn an array of names into a set of rows using unnest. Then left join with the table to return a row for every value in the array.
select attendances.*
from
unnest(ARRAY['Jackie','Jade','Cha','Xi']) as names(name)
left join attendances on names.name = attendances.name;
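If you prefer the temp-table route mentioned above, the same left join also works against an inline VALUES list; a minimal sketch, assuming the same attendances table:

-- Same idea, with the name list spelled out as inline rows instead of unnest()
select attendances.*
from (values ('Jackie'), ('Jade'), ('Cha'), ('Xi')) as names(name)
left join attendances on names.name = attendances.name;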

Create rows from part of column names

Source data
I am working on an ELT project to load data from CSV files into PostgreSQL, where I will transform it. The CSV files have many columns that are consistent across files, but they also contain activity columns that are inconsistent across files, with names like Date (05/19/2020), Type (05/19/2020), etc.
In the loading script I am merging all of the columns with dates in the column name into one jsonb column so I don't have to constantly add new columns to the raw data table.
The resulting jsonb column in the raw data table looks like this:
id       | activity
12345678 | {"Date (05/19/2020)": null, "Type (05/19/2020)": null, "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"}
98765432 | {"Date (05/19/2020)": "05/18/2020", "Type (05/19/2020)": "B", "Date (10/23/2020)": "10/26/2020", "Type (10/23/2020)": "T"}
JSON to columns
Using the amazing create_jsonb_flat_view function from this post I can convert the jsonb to columns like this:
id       | Date (05/19/2020) | Type (05/19/2020) | Date (06/03/2020) | Type (06/03/2020) | Date (10/23/2020) | Type (10/23/2020)
12345678 | null              | null              | 06/01/2020        | E                 |                   |
98765432 | 05/18/2020        | B                 |                   |                   | 10/26/2020        | T
Need to move part of column name to row
Now, this is where I'm stuck. I need to remove the portion of the column name that is the Activity Date (e.g. (05/19/2020)) and create a row for each id and ActivityDate with additional columns for Date and Type like this:
id       | ActivityDate | Date       | Type
12345678 | 05/19/2020   | null       | null
12345678 | 06/03/2020   | 06/01/2020 | E
98765432 | 05/19/2020   | 05/18/2020 | B
98765432 | 10/23/2020   | 10/26/2020 | T
I followed your link to the create_jsonb_flat_view article yesterday and then forgot this question. While I thank you for pointing me there, I think that mentioning it worked against you.
A more conventional approach using regexp_replace() works here. I left the date values as strings, but you can convert them with to_date() if needed:
with parse as (
    select id, e.k, e.v,
           regexp_replace(e.k, '\s+\([0-9/]{10}\)', '') as k_no_date,
           regexp_replace(e.k, '^.+([0-9/]{10}).+', '\1') as k_date_only
    from rawinput
    cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       k_date_only as activity_date,
       min(v) filter (where k_no_date = 'Date') as date,
       min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
db<>fiddle here
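If the strings should become real dates, here is a minimal sketch of the to_date() conversion mentioned above, assuming the MM/DD/YYYY format shown in the sample data:

with parse as (
    select id, e.k, e.v,
           regexp_replace(e.k, '\s+\([0-9/]{10}\)', '') as k_no_date,
           regexp_replace(e.k, '^.+([0-9/]{10}).+', '\1') as k_date_only
    from rawinput
    cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       -- to_date() turns the MM/DD/YYYY strings into date values; NULLs pass through
       to_date(k_date_only, 'MM/DD/YYYY') as activity_date,
       to_date(min(v) filter (where k_no_date = 'Date'), 'MM/DD/YYYY') as date,
       min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;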
@Mike-Organek's answer works beautifully!
However, I was curious whether the regexp_replace() calls might be slowing the query down a bit, and it seemed I could get the same results using a simpler function.
Since Mike gave me a great example to start with, I modified it to split on the space between Date and (05/19/2020).
For 20,000 rows, it went from taking an avg of 7 sec on my local machine to an avg of 0.9 sec.
Here is the resulting query:
with parse as (
    select id, e.k, e.v,
           split_part(e.k, ' ', 1) as k_no_date,
           trim(split_part(e.k, ' ', 2), '()') as k_date_only
    from rawinput
    cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       k_date_only as activity_date,
       min(v) filter (where k_no_date = 'Date') as date,
       min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
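To reproduce the timing comparison on your own data, either variant can be prefixed with explain analyze; a minimal sketch using the split_part version (the numbers will of course differ by machine):

-- Reports planning and execution time for the split_part variant;
-- run the regexp_replace version the same way to compare.
explain analyze
with parse as (
    select id, e.k, e.v,
           split_part(e.k, ' ', 1) as k_no_date,
           trim(split_part(e.k, ' ', 2), '()') as k_date_only
    from rawinput
    cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
       k_date_only as activity_date,
       min(v) filter (where k_no_date = 'Date') as date,
       min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;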

Putting keyword data into a CSV file in MATLAB

Given a table of the following format in MATLAB:
userid | itemid | keywords
A = [ 3 10 'book'
3 10 'briefcase'
3 10 'boat'
12 20 'windows'
12 20 'picture'
12 35 'love'
4 10 'day'
12 10 'working day'
... ... ... ];
where A is a table of size 58000×3. I want to write the data to a CSV file with the following format:
csv.file
itemid keywords
10 book, briefcase, boat, day, working day, ...
20 windows, picture, ...
35 love, ...
where the list of itemids is stored in Iids = [10,20,35,...].
I would like to avoid using loops for this since, as you can imagine, the matrix is large. Any idea is appreciated.
I wasn't able to think of a solution without loops. But you can optimize your loop by:
using logical indexing
running such loop only M times (if M is the number of unique itemid elements) instead of N times (if N is the number of elements in your table).
The solution I came up with is this.
First of all, create your table
A=table([3;3;3;12;12;12;4;12], [10;10;10;20;20;35;10;10],{'book','briefcase','boat','windows','picture','love','day','working day'}','VariableNames',{'userid','itemid','keywords'});
Select the unique values for column itemid (your Iids):
Iids=unique(A.itemid);
Create a new, empty, table which will contain the results:
NewTable=table();
And now the minimal loop I've come up with:
for id=Iids'
% select rows with given itemid value
RowsWithGivenId=A(A.itemid==id,:);
% create new row in NewTable with the id and the (joined together) keywords from the selected rows
NewTable=[NewTable; table(id,{strjoin(RowsWithGivenId.keywords,', ')})];
end
Finally, set the new column names in NewTable:
NewTable.Properties.VariableNames = {'itemid','keywords'};
And now NewTable contains one row per itemid, with all of its keywords joined together in the second column.
Please note: because the keywords in the new table are themselves separated by commas, a CSV file is not the format I recommend. If you export it with writetable() as writetable(NewTable,'myfile.csv'); the commas inside the keywords column will clash with the CSV field separators. If you instead use a semicolon as the separator in strjoin() (rather than a comma), you'll get a cleaner result.

Get substring into a new column

I have a table that contains a column with data in the following format - let's call the column "title" and the table "s":
title
ab.123
ab.321
cde.456
cde.654
fghi.789
fghi.987
I am trying to get a unique list of the characters that come before the "." so that I end up with this:
ab
cde
fghi
I have tried selecting the initial column into a table, then doing an update to create a new column that holds the position of the dot, using "ss" - something like this:
t: select title from s
update thedot: (title ss `.)[0] from t
I was then going to try a third column that would take "N" number of characters from "title", where N is the value stored in the "thedot" column.
All I get when I try the update is a "type" error.
Any ideas? I am very new to kdb, so no doubt I am doing something simple in a very silly way.
The reason you get the type error is that ss only works on the string type, not symbols. Also, ss is not a vector-based function, so you need to combine it with each (').
q)update thedot:string[title] ss' "." from t
title thedot
---------------
ab.123 2
ab.321 2
cde.456 3
cde.654 3
fghi.789 4
fghi.987 4
There are a few ways to solve your problem:
q)select distinct(`$"." vs' string title)[;0] from t
x
----
ab
cde
fghi
q)select distinct(` vs' title)[;0] from t
x
----
ab
cde
fghi
You can read here for more info: http://code.kx.com/q/ref/casting/#vs
An alternative is to make use of the 0: operator, to parse around the "." delimiter. This operator is especially useful if you have a fixed number of 'columns' like in a csv file. In this case where there is a fixed number of columns and we only want the first, a list of distinct characters before the "." can be returned with:
exec distinct raze("S ";".")0:string title from t
`ab`cde`fghi
OR:
distinct raze("S ";".")0:string t`title
`ab`cde`fghi
Where "S " defines the types of each column and "." is the record delimiter. For records with differing number of columns it would be better to use the vs operator.
A variation of WooiKent's answer using each-right (/:) :
q)exec distinct (` vs/:title)[;0] from t
`ab`cde`fghi