Amazon Redshift get all keys from JSON

I looked at the Amazon Redshift documentation and could not find a function that does what I want.
https://docs.aws.amazon.com/redshift/latest/dg/json-functions.html
I have a column in my database which contains JSON like this
{'en_IN-foo':'bla bla', 'en_US-foo':'bla bla'}
I want to extract all keys from the JSON that contain foo. So I want to extract
en_IN-foo
en_US-foo
How can I get what I want? The closest thing to my requirement is the JSON_EXTRACT_PATH_TEXT function, but it can only extract a value when you already know the key name. In my case I want all keys that match a pattern, and I don't know the key names in advance.
I also tried abandoning the JSON functions and going the regex way. I wrote this code
select distinct regexp_substr('{en_in-foo:FOO, en_US-foo:BAR}','[^.]{5}-foo')
but this finds only the first match. I need all the matches.

Redshift is not flexible with JSON, so I don't think getting keys from an arbitrary JSON document is possible. You need to know the keys up front.
Option 1
If possible, change your JSON document to have a static schema:
{"locale":"en_IN", "foo": "bla bla"}
Or even
{"locale":"en_IN", "name": "foo", "value": "bla bla"}
Option 2
I can see that your prefix may be known to you as it looks like the locale. What you could do is to create a static table of locales, and then CROSS JOIN it with your JSON column.
locales_table:
Id | locale
----------------
1 | en_US
2 | en_IN
The query would look like this:
SELECT
JSON_EXTRACT_PATH_TEXT(json_column, locale || '-foo', TRUE) as foo_at_locale
FROM json_table
CROSS JOIN locales_table
WHERE foo_at_locale IS NOT NULL
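
If the key names really cannot be known in advance, one fallback is to pull the keys out in a general-purpose language, for example client-side after fetching the column. The sketch below is in Python. The function name keys_containing and the sample document are hypothetical, and it assumes the column holds valid JSON with double-quoted strings (the single-quoted sample in the question would not parse). The same logic might be adapted as the body of a Redshift Python UDF, but that is an assumption and is not tested here.

```python
import json


def keys_containing(json_text, fragment):
    """Return the top-level keys of a JSON object that contain `fragment`."""
    try:
        doc = json.loads(json_text)
    except (TypeError, ValueError):
        # Invalid or missing JSON: report no keys instead of failing.
        return []
    if not isinstance(doc, dict):
        return []
    return [key for key in doc if fragment in key]


# Hypothetical sample row, written as valid JSON.
sample = '{"en_IN-foo": "bla bla", "en_US-foo": "bla bla", "other": "x"}'
matching_keys = keys_containing(sample, "foo")
print(matching_keys)
```

Only top-level keys are inspected; nested objects would need a recursive walk.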

Related

How can I use key/value dashboard variables in Grafana + InfluxDB?

I’m trying to suss out how to format my key/value pair dashboard variable. I’ve got a variable whose definitions are:
sensor_list = 4431,8298,11041,13781
sensor_kv = 4431 : Storage,8298 : Stairs,11041 : Closet,13781 : Attic
However, I can't seem to use it effectively for queries and dashboard formatting with InfluxDB. For example, I've got a panel whose query is this:
SELECT last("battery_ok") FROM "autogen"."Acurite-Tower" WHERE ("id" =~ /^$sensor_list$/) AND $timeFilter GROUP BY time($__interval) fill(null)
That works, but if I replace it with the KV, I can't get the value:
SELECT last("battery_ok") FROM "autogen"."Acurite-Tower" WHERE ("id" =~ /^$sensor_kv$/) AND $timeFilter GROUP BY time($__interval) fill(null)
^ that comes back with no data.
I'm also at a loss as to how to access the value of the KV pair in, say, the template values for a repeating panel. ${sensor_kv:text} returns the word "All" but ${sensor_kv:value} actually causes a straight up error: "Error: Variable format value not found"
My goal here is twofold:
To use the key side of the kv map as the ID to query from in the DB
To use the value side as the label of the stat panel and also as the alias of the measurement if I'm querying in a graph
I’ve read the formatting docs and all they mention are lists; there are no key/value examples on there, and certainly none that do this. It’s clearly a new-ish feature (here is the GH issue where its implementation is merged) so I’m hoping there’s just a doc miss somewhere.

In the PR that you linked there is a small comment saying that a key/value pair has to contain spaces around the colon.
So when you define comma-separated pairs in the Values field, they should look like
key1 : value1, key2 : value2
These will not work
key1:value1, key2:value2
key1 :value1, key2 :value2
key1: value1, key2: value2
Let's say that the name of the custom variable is var1.
Then you can access the key with ${var1}, $var1, ${var1:text} or [[var1:text]]
(some data sources will be satisfied with $var1; others will understand only ${var1:text}).
And you can access the value with ${var1:value} or [[var1:value]].
Tested in Grafana 8.4.7
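
The spacing rule described above can be illustrated with a small Python sketch. This is not Grafana's own code: the regular expression, the function name parse_custom_values, and the fallback for entries that do not match (the whole entry is used as both key and value) are assumptions made for illustration.

```python
import re

# A pair needs whitespace on both sides of the colon: "key : value".
PAIR = re.compile(r"^(.+)\s:\s(.+)$")


def parse_custom_values(definition):
    """Split a comma-separated definition into (key, value) tuples."""
    options = []
    for raw in definition.split(","):
        entry = raw.strip()
        if not entry:
            continue
        match = PAIR.match(entry)
        if match:
            options.append((match.group(1).strip(), match.group(2).strip()))
        else:
            # Assumed fallback: no " : " separator, so key and value coincide.
            options.append((entry, entry))
    return options


print(parse_custom_values("4431 : Storage,8298 : Stairs"))
print(parse_custom_values("key1:value1, key2 :value2, key3: value3"))
```

With the first definition each entry splits into two parts. With the second, none of the entries is recognised as a pair, which matches the "these will not work" list above.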

I realise this might not be all the information you're after, but hope it will be useful. I came across this question when trying to implement something similar myself (also using InfluxDB), and I have managed to access both keys and values in a query
My query looks like this:
SELECT
"Foo.${VariableName:text}.Bar.${VariableName:value}"
FROM "db"
WHERE (filters, filters) AND $timeFilter GROUP BY "bas"
So as you see, my use case was a bit different from what you're trying to achieve, but it demonstrates that it's basically possible to access both the key and the value in a query.

Key/value variables work with some data sources where it makes sense, e.g. MySQL https://grafana.com/docs/grafana/latest/datasources/mysql/:
Another option is a query that can create a key/value variable. The query should return two columns that are named __text and __value. The __text column value should be unique (if it is not unique then the first value is used). The options in the dropdown will have a text and value that allows you to have a friendly name as text and an id as the value.
But that's not the case for InfluxDB: https://grafana.com/docs/grafana/latest/datasources/influxdb/ InfluxDB can't return a key=>value result. It returns only time series (which are not key=>value pairs), or only values, or only keys.
Workarounds:
1.) Use a supported DB (MySQL, PostgreSQL) just to get correct key=>value results. You don't need to create a table for that; a combination of SELECT, UNION, ... will give you the desired result.
2.) Use a hidden variable that "translates" the value to the key, which is then used in the query. E.g. https://community.grafana.com/t/how-to-alias-a-template-variable-value/10929/3
Of course, everything has pros and cons; for example, multi-value variables may not work as expected.

Sequelize how to use aggregate function on Postgres JSONB column

I have created a table with a JSONB column named "data".
A sample value of that column is
[{field_id:1, value:10},{field_id:2, value:"some string"}]
There are multiple rows like this.
What do I want?
I want to use aggregate functions on the "data" column so that I get
Sum of all values where field_id = 1;
Avg of value where field_id = 1;
I have searched a lot on Google but have not been able to find a proper solution.
Sometimes it says "Field doesn't exist" and sometimes it says "from clause missing".
I tried referring to it as data.value, then data -> value, and lastly data ->> value,
but nothing works.
Please let me know the solution if anyone knows it.
Thanks in advance.

Your attributes should be something like this, so you instruct it to run the function on a specific value. Note that ->> returns text, so the value is cast to numeric before it is aggregated:
attributes: [
[sequelize.fn('sum', sequelize.literal("(data->>'value')::numeric")), 'json_sum'],
[sequelize.fn('avg', sequelize.literal("(data->>'value')::numeric")), 'json_avg']
]
Then in WHERE, you reference field_id in a similar way, using literal() (again with a cast, because ->> yields text):
where: sequelize.literal("(data->>'field_id')::int = 1")
Your example also included a string for the value of "value", which of course won't work. But if the basic Sequelize setup works on a good set of data, you can enhance the WHERE clause to test for numeric "value" data. There are good examples here: Postgres query to check a string is a number
Hopefully this gets you close. In my experience with Sequelize + Postgres, it helps to run the program in such a way that you see what queries it creates, like in a terminal where the output is streaming. On the way to a working statement, you'll either create objects which Sequelize doesn't like, or Sequelize will create bad queries which Postgres doesn't like. If the query looks close, take it into pgAdmin for further work, then try to reproduce your adjustments in Sequelize. Good luck!
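
To pin down what the aggregate is expected to return, here is a plain Python sketch of the same computation over in-memory rows. It is only a reference for checking query results, not Sequelize or Postgres code. The second row and the helper name values_for_field are hypothetical. Because each row in the question holds an array of objects, the sketch walks every element of every row; in Postgres that usually means expanding the array first (for example with jsonb_array_elements), which the answer above does not cover.

```python
from numbers import Number

# Each row's "data" column is a list of {field_id, value} objects.
rows = [
    [{"field_id": 1, "value": 10}, {"field_id": 2, "value": "some string"}],
    [{"field_id": 1, "value": 30}, {"field_id": 2, "value": "other"}],  # hypothetical row
]


def values_for_field(rows, field_id):
    """Collect the numeric values stored under the given field_id."""
    return [
        item["value"]
        for row in rows
        for item in row
        if item.get("field_id") == field_id
        and isinstance(item.get("value"), Number)
        and not isinstance(item.get("value"), bool)
    ]


values = values_for_field(rows, 1)
total = sum(values)
average = total / len(values) if values else None
print(total, average)
```

Non-numeric values such as "some string" are skipped, which mirrors the advice to filter for numeric data before aggregating.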

PostgreSql Search JSON And Match Whole Term

I have a Postgres database using JSON storage. I have a table of cameras and lenses with a single property to search against called BrandAndModel. The relevant JSON portion looks like this and is stored in a column called "data":
"BrandAndModel": "nikon nikkor 50mm f/1.4 ai-s"
I have a LIKE query running against this brand and model string, but it only returns a result if the exact sequence of characters matches. For instance, the above does get results for "nikkor 50mm" but NOT "nikon 50mm".
I'm no SQL expert and I'm not sure what I need to use to match more possible combinations.
My query looks like this
SELECT * FROM listing where data ->> 'Product' ->> 'BrandAndModel' like '%nikon 50mm%'
How could I get this query to match "nikon 50mm"?

You may use ANY with an array for multiple comparisons.
LIKE ANY(ARRAY['%nikon%ai-s%', 'nikon%50mm%', '%nikkor%50mm%'])
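
The patterns above place % between the search words so that other words may sit in between. A minimal Python sketch of building such a pattern from user input follows; the helper names like_pattern and like_matches are hypothetical, and like_matches only emulates the % wildcard of LIKE (no _ and no escape handling) so the idea can be checked without a database.

```python
import re


def like_pattern(search):
    """Turn 'nikon 50mm' into '%nikon%50mm%'."""
    words = search.split()
    return "%" + "%".join(words) + "%"


def like_matches(value, pattern):
    """Rough emulation of SQL LIKE that supports only the % wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


brand_and_model = "nikon nikkor 50mm f/1.4 ai-s"
print(like_pattern("nikon 50mm"))
print(like_matches(brand_and_model, like_pattern("nikon 50mm")))
print(like_matches(brand_and_model, "%nikon 50mm%"))
```

A pattern built this way is order-dependent: "50mm nikon" would not match the sample string. In a real query the generated pattern should be passed as a bound parameter, not concatenated into the SQL text.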

why is this postgresql full text search query returning ts_rank of 0?

Before I invest in using solr or lucene or sphinx, I wanted to try to implement a search capability on my system using postgresql full text search.
I have a national list of businesses in a table that I want to search. I created a ts vector that combines the business name and city so that I can do a search like "outback atlanta".
I am also implementing an auto-completion function using the prefix-matching capability of the search, by appending ":*" to the search pattern and inserting " & " between keywords, so the search pattern "outback atl" turns into "outback & atl:*" before getting converted into a query using to_tsquery().
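
The transformation described in the previous paragraph might look like the following Python sketch. The function name to_prefix_tsquery is hypothetical, and the sketch assumes the input contains only plain words; it does not strip characters that have special meaning to to_tsquery().

```python
def to_prefix_tsquery(search):
    """Join keywords with ' & ' and mark the last one as a prefix match."""
    words = search.split()
    if not words:
        return ""
    return " & ".join(words[:-1] + [words[-1] + ":*"])


print(to_prefix_tsquery("outback atl"))
print(to_prefix_tsquery("ou"))
```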
Here's the problem that I am currently running into:
If the search pattern is entered as "ou", many "Outback Steakhouse" records are returned.
If the search pattern is entered as "out", no results are returned.
If the search pattern is entered as "outb", many "Outback Steakhouse" records are returned.
Doing a little debugging, I came up with this:
select ts_rank(to_tsvector('Outback Steakhouse'),to_tsquery('ou:*')) as "ou",
ts_rank(to_tsvector('Outback Steakhouse'),to_tsquery('out:*')) as "out",
ts_rank(to_tsvector('Outback Steakhouse'),to_tsquery('outb:*')) as "outb"
which results in this:
ou | out | outb
-----------+-----+-----------
0.0607927 | 0 | 0.0607927
What am I doing wrong?
Is this a limitation of pg full text search?
Is there something that I can do with my dictionary or configuration to get around this anomaly?
UPDATE:
I think that "out" may be a stop word.
When I run this debug query, I don't get any lexemes for "out":
SELECT * FROM ts_debug('english','out back outback');
alias | description | token | dictionaries | dictionary | lexemes
-----------+-----------------+---------+----------------+--------------+-----------
asciiword | Word, all ASCII | out | {english_stem} | english_stem | {}
blank | Space symbols | | {} | |
asciiword | Word, all ASCII | back | {english_stem} | english_stem | {back}
blank | Space symbols | | {} | |
asciiword | Word, all ASCII | outback | {english_stem} | english_stem | {outback}
So now I ask how do I modify the stop word list to remove a word?
UPDATE:
Here is the query that I am currently using:
select id,name,address,city,state,likes
from view_business_favorite_count
where textsearchable_index_col @@ to_tsquery('simple',$1)
ORDER BY ts_rank(textsearchable_index_col, to_tsquery('simple',$1)) DESC
When I execute the query (I'm using Strongloop Loopback + Express + Node), I pass the pattern in to replace the $1 param. The pattern (as stated above) will look something like "keyword:*" or "keyword1 & keyword2 & ... & keywordN:*"
thanks

The problem here is that you are searching against business names and, as @Daniel correctly pointed out, the 'english' dictionary will not help you find a "fuzzy" match for non-dictionary words like "Outback Steakhouse".
'simple' dictionary
The 'simple' dictionary on its own will not help you either; in your case business names will match only exactly, because all words are unstemmed.
'simple' dictionary + pg_trgm
But if you use the 'simple' dictionary together with the pg_trgm module, it will be exactly what you need. In particular:
for to_tsvector('simple','<business name>') you don't need to worry about a stop-word "hack"; you will get all the lexemes unstemmed;
using similarity() from pg_trgm you will get the highest "rank" for the best match.
Look at this:
WITH pg_trgm_test(business_name,search_pattern) AS ( VALUES
('Outback Steakhouse','ou'),
('Outback Steakhouse','out'),
('Outback Steakhouse','outb')
)
SELECT business_name,search_pattern,similarity(business_name,search_pattern)
FROM pg_trgm_test;
result:
business_name | search_pattern | similarity
--------------------+----------------+------------
Outback Steakhouse | ou | 0.1
Outback Steakhouse | out | 0.15
Outback Steakhouse | outb | 0.2
(3 rows)
Ordering by similarity DESC you will be able to get what you need.
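
To see where numbers like 0.1, 0.15 and 0.2 come from, here is a Python sketch of trigram similarity. It approximates the rule pg_trgm documents (lower-case the text, treat non-alphanumeric characters as word separators, pad each word with two spaces in front and one behind, then divide shared trigrams by total distinct trigrams). It is an illustration only, not the extension's implementation, and may differ from it on non-ASCII input.

```python
import re


def trigrams(text):
    """Return the set of 3-character substrings of each padded word."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


for pattern in ("ou", "out", "outb"):
    print(pattern, round(similarity("Outback Steakhouse", pattern), 2))
```

"Outback Steakhouse" yields 19 distinct trigrams. Each search pattern adds one trigram of its own (its padded ending), so the denominator stays at 20 while the shared count grows from 2 to 3 to 4.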
UPDATE
For your situation there are 2 possible options.
Option #1.
Just create a trgm index on the name column of view_business_favorite_count; the index definition may be the following:
CREATE INDEX name_trgm_idx ON view_business_favorite_count USING gin (name gin_trgm_ops);
The query will look something like this:
SELECT
id,
name,
address,
city,
state,
likes,
similarity(name,$1) AS trgm_rank -- similarity score
FROM
view_business_favorite_count
WHERE
name % $1 -- trgm search
ORDER BY trgm_rank DESC;
Option #2.
With full text search, you need to:
create a separate table, for example unnested_business_names, with 2 columns: the 1st column keeps all lexemes from the to_tsvector('simple',name) function, and the 2nd column holds vbfc_id (an FK to id in the view_business_favorite_count table);
add a trgm index on the column that contains the lexemes;
add a trigger that inserts, updates, or deletes rows in unnested_business_names whenever view_business_favorite_count changes, to keep all words up to date.

How to create like query in Cassandra?

In my keyspace:
posts = {
    # key
    'post1': {
        # columns and value
        'url': 'foobar.com/post1',
        'body': 'Currently has client support FOOBAR for the following programming languages..',
    },
    'post2': {
        'url': 'foobar.com/post2',
        'body': 'The table with the following table FOOBAR structure...',
    },
    # ... ,
}
How do I create a LIKE query in Cassandra to get all posts that contain the word 'FOOBAR'?
In SQL it is SELECT * FROM POST WHERE BODY LIKE '%FOOBAR%', but what about Cassandra?

The only way to do this efficiently is to use a full-text search engine like https://github.com/tjake/Solandra (Solr-on-cassandra). Of course you can roll your own using the same techniques manually, but usually this is not called for.
Note that this is true for SQL databases too: they will translate %FOO% to a table scan, unless you use an FTS extension like PostgreSQL's tsearch2.
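
As a rough idea of what "rolling your own" means, full-text engines keep an inverted index that maps each word to the keys of the rows containing it. The Python sketch below builds such an index in memory. The names build_index and post3 are hypothetical, and in Cassandra the index would live in its own table or column family keyed by word, not in a dict.

```python
import re
from collections import defaultdict

posts = {
    "post1": {"url": "foobar.com/post1",
              "body": "Currently has client support FOOBAR for the following programming languages.."},
    "post2": {"url": "foobar.com/post2",
              "body": "The table with the following table FOOBAR structure..."},
    "post3": {"url": "foobar.com/post3",  # hypothetical row without the word
              "body": "Nothing relevant here"},
}


def build_index(posts):
    """Map every lower-cased word in a body to the set of post keys using it."""
    index = defaultdict(set)
    for key, columns in posts.items():
        for word in re.findall(r"\w+", columns["body"].lower()):
            index[word].add(key)
    return index


index = build_index(posts)
matches = sorted(index.get("foobar", set()))
print(matches)
```

This finds whole words only. Unlike LIKE '%FOOBAR%' it will not match a substring inside a longer word, which is one reason a dedicated search engine is usually the better choice.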

You might create another column family where the keys are the domains, and the values are the keys in your original column family. That way you could refer to records within a specific domain directly.

Cassandra 3.4 added support for LIKE in CQL (on columns with a SASI index), so it is finally available natively.
