How to create multiple dictionaries from a list of strings? - kdb

I have a variable that contains a list of strings. For each string in the list, I would like to create a new dictionary and then update that dictionary with the string value. This would result in 100 new dictionaries being created. For example, my variable looks similar to this:
q)stock
"APPL"
"TSLA"
"AMZN"
"MSFT"
"NVDA"
..
q)count stock
100
I know how to do this for a single string in a variable...
q)newdict:()!()
q)stock:"AAPL"
q)newdict[`ticker]:stock
q)newdict
ticker|"AAPL"
But how would I do this for a variable that is a list of 100 strings? Also, if I wanted to update the dictionary to include additional key:value pairs from another variable with the same count (like below) what is the correct syntax to use for this operation?
q)date
2019.05.23T13:03:55.271474000 2019.05.23T13:03:55.271474000...
q)count date
100
Expected output for each string in variable would create a new dictionary similar to the following...
q)newdictAPPL
ticker|"APPL"
datetime|2019.05.23T13:03:55.271474000
q)newdictTSLA
ticker|"TSLA"
datetime|2019.05.23T13:05:33.727845200
q)newdictAMZN
ticker|"AMZN"
datetime|2019.05.23T13:08:27.742968000

I think what you're most likely to be interested in is a table. Starting with some sample data similar to what you have:
q)stock:20?("AAPL";"TSLA";"MSFT";"AMZN")
q)date:20?.z.Z
q)0N!stock;
("AMZN";"MSFT";"AMZN";"MSFT";"AAPL";"TSLA";"TSLA";"MSFT";"TSLA";"AAPL";"TSLA"..
q)date
2016.09.16T16:06:23.573 2010.10.04T23:28:53.863 2001.03.12T15:16:04.379 2005...
We can construct a table quite simply in kdb:
q)t:([]ticker:stock;datetime:date)
q)t
ticker datetime
------------------------------
"AMZN" 2016.09.16T16:06:23.573
"MSFT" 2010.10.04T23:28:53.863
"AMZN" 2001.03.12T15:16:04.379
"MSFT" 2005.07.17T04:02:58.577
"AAPL" 2012.12.17T10:48:15.839
"TSLA" 2017.09.16T11:06:02.579
"TSLA" 2002.11.18T00:03:57.945
"MSFT" 2009.06.02T08:28:32.680
"TSLA" 2013.10.24T06:50:31.420
"AAPL" 2007.06.12T07:04:33.058
"TSLA" 2006.08.10T03:45:58.748
"AAPL" 2001.01.17T11:04:57.387
"AAPL" 2010.08.29T21:47:39.564
"MSFT" 2003.10.19T01:58:58.820
"AMZN" 2010.11.21T00:05:03.256
"MSFT" 2001.05.13T21:03:21.293
"TSLA" 2004.02.13T07:49:57.013
"AAPL" 2015.01.31T08:13:03.986
"MSFT" 2009.05.24T06:34:05.044
"TSLA" 2013.03.28T22:11:03.641
A table in kdb is a list of dictionaries, so we can index into it and get individual dictionaries:
q)t[0]
ticker | "AMZN"
datetime| 2016.09.16T16:06:23.573
q)t[1]
ticker | "MSFT"
datetime| 2010.10.04T23:28:53.863
We can also serialise to JSON using the built-in .j.j function:
q).j.j t
"[{\"ticker\":\"AMZN\",\"datetime\":\"2016-09-16T16:06:23.573\"},{\"ticker\":..
Or, if we want each dictionary as an individual JSON string:
q).j.j each t
"{\"ticker\":\"AMZN\",\"datetime\":\"2016-09-16T16:06:23.573\"}"
"{\"ticker\":\"MSFT\",\"datetime\":\"2010-10-04T23:28:53.863\"}"
"{\"ticker\":\"AMZN\",\"datetime\":\"2001-03-12T15:16:04.379\"}"
"{\"ticker\":\"MSFT\",\"datetime\":\"2005-07-17T04:02:58.577\"}"
"{\"ticker\":\"AAPL\",\"datetime\":\"2012-12-17T10:48:15.839\"}"
"{\"ticker\":\"TSLA\",\"datetime\":\"2017-09-16T11:06:02.579\"}"
"{\"ticker\":\"TSLA\",\"datetime\":\"2002-11-18T00:03:57.945\"}"
..
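For readers more familiar with Python than q, the same idea (one dictionary per row, built from two parallel lists, then serialised to JSON) can be sketched roughly as follows. This is only an analogue, not kdb code, and the sample values are hypothetical stand-ins for the stock and date lists:

```python
import json

# Hypothetical sample data standing in for the `stock` and `date` lists.
stock = ["AMZN", "MSFT", "AAPL"]
date = [
    "2016-09-16T16:06:23.573",
    "2010-10-04T23:28:53.863",
    "2012-12-17T10:48:15.839",
]

# One dictionary per row, built by pairing the two lists element-wise.
rows = [{"ticker": s, "datetime": d} for s, d in zip(stock, date)]

# Index to get an individual dictionary, much like t[0] in q.
first = rows[0]

# Serialise the whole list at once, or each dictionary separately.
as_json = json.dumps(rows)
each_json = [json.dumps(r) for r in rows]
```

As in the table approach, nothing here creates 100 separately named variables; the rows live in one collection and are picked out by index.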

Related

String match in Postgresql

I am trying to make separate columns in my query result for values stored in a single column. It is a string field that contains a variety of similar values stored like this:
["john"] or ["john", "jane"] or ["john", "john smith", "jane"], etc., where each of the values in quotes is a distinct value. I cannot seem to isolate just ["john"] in a way that will return john and not john smith. The john smith value would be in a separate column. Essentially, I want a column for each value in quotes. Note that I would like the results to contain neither the quotes nor the brackets.
I started with:
Select name
From namestbl
Where name like '%["john"]%';
I think this is heading in the wrong direction. I think this should be in select instead of where.
Sorry about the format; I gave up trying to figure out the completely useless error message I get when I try to save this with table markdown.
Your data examples are valid JSON array syntax, so cast them to JSONB arrays and access individual elements by position (JSON arrays are zero-based). The t CTE mimics the real data. In the illustration below the number of columns is limited to 6.
with t(s) as
(
values
('["john", "john smith", "jane"]'),
('["john", "jane"]'),
('["john"]')
)
select s::jsonb->>0 name1, s::jsonb->>1 name2, s::jsonb->>2 name3,
s::jsonb->>3 name4, s::jsonb->>4 name5, s::jsonb->>5 name6
from t;
Here is the result.
name1 | name2      | name3 | name4 | name5 | name6
------+------------+-------+-------+-------+------
john  | john smith | jane  |       |       |
john  | jane       |       |       |       |
john  |            |       |       |       |
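If it helps to see the logic outside SQL, here is a rough Python sketch of what the query does: parse each string as a JSON array, then read elements by position, with missing positions coming back empty. The helper name split_names and the width parameter are made up for illustration:

```python
import json

def split_names(s, width=6):
    """Parse a JSON array string and pad it with None up to `width` columns."""
    names = json.loads(s)[:width]
    return names + [None] * (width - len(names))

# Same sample values as the CTE above.
data = ['["john", "john smith", "jane"]', '["john", "jane"]', '["john"]']
table = [split_names(s) for s in data]
```

Because the values are parsed as JSON rather than pattern-matched as text, "john" and "john smith" are separate elements and cannot be confused with each other.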

kdb+/q table - convert string to number

Assume you have a table
tbl:([] id:("123"; ""; "invalid"))
And want to parse this string into a number.
Invalid values - in the example above, both the empty string "" as well as the value "invalid", should be parsed to null (0Nj).
How can you best do it? My initial approach was
select id:.[value;;0Nj] each enlist each id from tbl
But while that will parse both the "123" and the "invalid" entries correctly, it will return the unary operator :: instead of null when trying to parse the row with the empty string.
Of course I could do something like
select id:.[value;;0Nj] each enlist each id from update id:string (count id)#`invalid from tbl where id like ""
but this seems kind of.. ugly/inefficient. Is there any better way to do this?
Thanks
Try "J"$ to cast the column
q)select "J"$id from tbl
id
---
123
https://code.kx.com/v2/ref/tok/
How about just casting it to long?
q)update id:"J"$id from `tbl
`tbl
q)select from tbl where not null id
id
---
123
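For comparison only, the "parse or give null" behaviour that "J"$ provides can be imitated in Python along these lines. The function name parse_long is hypothetical, and Python's int() does not accept exactly the same inputs as q's parser, so treat this as a sketch of the idea rather than an equivalent:

```python
def parse_long(s):
    """Return int(s), or None when s is empty or not a valid integer."""
    try:
        return int(s)
    except ValueError:
        return None

# Same three values as the id column above.
ids = ["123", "", "invalid"]
parsed = [parse_long(s) for s in ids]
```

The point is the same as in the q answers: both the empty string and the non-numeric string end up as a null, with no special-casing of "".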

How to use ts_query with ANY(anyarray)

I currently have a query in PostgreSQL like:
SELECT
name
FROM
ingredients
WHERE
name = ANY('{"string value",tomato,other}')
My ingredients table is simply a list of names:
name
----------
jalapeno
tomatoes
avocados
lime
My issue is that plural values in the array will not match single values in the query. To solve this, I created a tsvector column on the table:
name | tokens
---------------+--------------
jalapeno | 'jalapeno':1
tomatoes | 'tomato':1
avocados | 'avocado':1
lime | 'lime':1
I'm able to correctly query single values from the table like this:
SELECT
name,
ts_rank_cd(tokens, plainto_tsquery('tomato'), 16) AS rank
FROM
ingredients
WHERE
tokens @@ plainto_tsquery('tomato')
ORDER BY
rank DESC;
However, I need to query values from the entire array. The array is generated by another function, so I have control over the type of each item in the array.
How can I use the @@ operator with ANY(anyarray)?
That should be straightforward:
WHERE tokens @@ ANY
(ARRAY[
plainto_tsquery('tomato'),
plainto_tsquery('celery'),
plainto_tsquery('vodka')
])

Get substring into a new column

I have a table that contains a column with data in the following format. Let's call the column "title" and the table "s".
title
ab.123
ab.321
cde.456
cde.654
fghi.789
fghi.987
I am trying to get a unique list of the characters that come before the "." so that I end up with this:
ab
cde
fghi
I have tried selecting the initial column into a table and then doing an update to create a new column holding the position of the dot, using "ss".
Something like this:
t: select title from s
update thedot: (title ss `.)[0] from t
I was then going to add a third column containing the first N characters of "title", where N is the value stored in the "thedot" column.
All I get when I try the update is a "type" error.
Any ideas? I am very new to kdb, so no doubt I am doing something simple in a very silly way.
The reason you get the type error is that ss only works on strings, not symbols. Also, ss is not a vector-based function, so you need to combine it with each (').
q)update thedot:string[title] ss' "." from t
title thedot
---------------
ab.123 2
ab.321 2
cde.456 3
cde.654 3
fghi.789 4
There are a few ways to solve your problem:
q)select distinct(`$"." vs' string title)[;0] from t
x
----
ab
cde
fghi
q)select distinct(` vs' title)[;0] from t
x
----
ab
cde
fghi
You can read here for more info: http://code.kx.com/q/ref/casting/#vs
An alternative is to make use of the 0: operator, to parse around the "." delimiter. This operator is especially useful if you have a fixed number of 'columns' like in a csv file. In this case where there is a fixed number of columns and we only want the first, a list of distinct characters before the "." can be returned with:
exec distinct raze("S ";".")0:string title from t
`ab`cde`fghi
OR:
distinct raze("S ";".")0:string t`title
`ab`cde`fghi
Here "S " defines the type of each column (S reads the first field as a symbol and the blank skips the second) and "." is the field delimiter. For records with differing numbers of columns it would be better to use the vs operator.
A variation of WooiKent's answer using each-right (/:) :
q)exec distinct (` vs/:title)[;0] from t
`ab`cde`fghi
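All of the q answers above follow the same recipe: split each title on the ".", keep the first piece, then take the distinct values. As a plain illustration of that recipe (in Python, not q, with the titles copied from the question), it looks something like this:

```python
# Titles copied from the question.
titles = ["ab.123", "ab.321", "cde.456", "cde.654", "fghi.789", "fghi.987"]

# Keep only the part before the first "." in each title.
prefixes = [t.split(".", 1)[0] for t in titles]

# Distinct values in first-seen order, which is also what q's distinct gives.
unique_prefixes = list(dict.fromkeys(prefixes))
```

Note there is no need to compute the position of the dot first; splitting on the delimiter gets the prefix directly, which is why the vs and 0: answers are shorter than the ss approach.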

How to create and store array of objects in postgresql

In PostgreSQL, the allowed array types are integer and text. But I need to create an array of objects. How can I do that?
myarray text[]; //for text ['a','b','c']
myarray integer[]; //for integer[1,2,3]
I need to create the array like below
[{'dsad':1},{'sdsad':34.6},{'sdsad':23}]
I don't want to use the JSON type. I need to store the array of objects using an array type.
If you're running Postgres 9.2+, you can use the JSON type.
For example, we could do
create table jsontest (id serial primary key, data json);
insert into jsontest (data) values ('[{"dsad":1},{"sdsad":34.6},{"sdsad":23}]');
And query the data with
select data->1 from jsontest;
{"sdsad":34.6}
You say:
I dont want to use JSON type
but you cannot use an ordinary array, as PostgreSQL arrays must be of homogenous types. You can't have a 2-dimensional array of text and integer.
What you could do if you don't want to use json is to create a composite type:
CREATE TYPE my_pair AS (blah text, blah2 integer);
SELECT ARRAY[ ROW('dasd',2), ROW('sdsad', 34.6), ROW('sdsad', 23) ]::my_pair[]
which will emit:
array
----------------------------------------
{"(dasd,2)","(sdsad,35)","(sdsad,23)"}
(1 row)
If you don't want that, then json is probably your best bet. Or hstore:
SELECT hstore(ARRAY['a','b','c'], ARRAY[1,2,3]::text[])
hstore
------------------------------
"a"=>"1", "b"=>"2", "c"=>"3"
(1 row)
JSON is the preferable answer; here is some more information as to why.
You can do something like:
SELECT array_agg(v)
FROM mytable v;
However you get something that looks like this:
{"(dsad,1)","(sdsad,34.6)","(""sdsad,var"",23)"}
It is then up to you to know how to decode this (i.e. column order). This is possible to do programmatically but is much easier with JSON.
It's hacky, but what about using an array for each property in the object (with its corresponding scalar type)? If you have a data model layer, your get/read method could put the arrays "back together" into an array of objects, and your save method would break your objects apart into synchronized arrays. This might be complicated by your example, where the objects do not all have the same properties; I don't know how you'd store undefined for a property unless you're willing for null to mean the same thing semantically.
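To make the "one array per property" idea concrete, here is a minimal sketch of what such a data model layer might do, written in Python. The helper names to_columns and to_objects are invented for illustration, and it assumes that a missing property and a null value may be treated as the same thing:

```python
# The objects from the question.
objects = [{"dsad": 1}, {"sdsad": 34.6}, {"sdsad": 23}]

def to_columns(objs):
    """Break objects apart into one synchronized list per property."""
    keys = list(dict.fromkeys(k for o in objs for k in o))
    # A missing property is stored as None (assumed equivalent to null).
    return {k: [o.get(k) for o in objs] for k in keys}

def to_objects(cols):
    """Put the synchronized lists back together, dropping None entries."""
    n = len(next(iter(cols.values()), []))
    return [
        {k: v[i] for k, v in cols.items() if v[i] is not None}
        for i in range(n)
    ]

columns = to_columns(objects)
restored = to_objects(columns)
```

Each list in columns would map to one array column in the table. The round trip only works because None is never a real value here, which is exactly the caveat about null raised above.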
It's not entirely clear if you mean json:
# select '[{"dsad":1},{"sdsad":34.6},{"sdsad":23}]'::json;
json
------------------------------------------
[{"dsad":1},{"sdsad":34.6},{"sdsad":23}]
(1 row)
Or an array of json:
# select array['{"dsad":1}', '{"sdsad":34.6}', '{"sdsad":23}']::json[];
array
------------------------------------------------------
{"{\"dsad\":1}","{\"sdsad\":34.6}","{\"sdsad\":23}"}
(1 row)
Or perhaps hstore? If the latter, it's only for key-values pairs, but you can likewise use an array of hstore values.
You can do something like:
SELECT JSON_AGG(v) FROM mytable v;
However you get something that looks like this:
["00000000-0000-0000-0000-000000000001","00000000-0000-0000-0000-000000000002", "00000000-0000-0000-0000-000000000003"]
Example:
SELECT title, (select JSON_AGG(v.video_id) FROM videos v WHERE v.channel_id = c.channel_id) AS videos FROM channel AS c
Use text[] myarray instead of myarray text[].