how to filter fields from a jsonb column while querying in postgresql

Here is my table (simplified, only significant columns):
CREATE TABLE details(
  id serial primary key,
  name text,
  address jsonb
);
And some sample Data
# Select * from details
id | name | Address
----+----------+-----------------------------------------------------------
1 | Batman | {"city":"Gotham City","street":"1007 Mountain Drive"}
2 | Superman | {"city":"Metropolis","street":"344 Clinton Street"}
3 | Flash | {"city":"Central City","street":"122 Englewood street"}
Now I would like to select only name and the city field of Address. The query would be:
Select name, Address -> 'city' as Address from details
name | Address
----------+------------------
Batman | "Gotham City"
Superman | "Metropolis"
Flash | "Central City"
But I want it to be filtered as shown below.
name | Address
----------+-------------------------
Batman | {"city":"Gotham City"}
Superman | {"city":"Metropolis"}
Flash | {"city":"Central City"}
Is it possible to select only some fields from a jsonb column? If so, what would the query be?

If you want to include only one field, the query is fairly simple:
select name, jsonb_build_object('city', address -> 'city') address
from details
However, if you want to include multiple fields, things get more complex. You could, for example, remove unwanted keys one by one with the - operator, as in jsonb_column - 'key1' - 'key2':
select name, address - 'street' address
from details
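Against the sample data this should already give the desired output, since street is the only other key:
 name     | address
----------+-------------------------
 Batman   | {"city":"Gotham City"}
 Superman | {"city":"Metropolis"}
 Flash    | {"city":"Central City"}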
But this only works well when there are relatively few fields inside the JSON column (and they are well defined).
If you want a general solution, you should use some aggregation:
select name, (select jsonb_object_agg(e.key, e.value)
              from jsonb_each(address) e
              where e.key in ('city')) as address
from details
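To keep more than one field, you only have to extend the IN list; for example (a sketch keeping both keys from the sample data):
select name, (select jsonb_object_agg(e.key, e.value)
              from jsonb_each(address) e
              where e.key in ('city', 'street')) as address
from details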

Related

Aggregate function to extract all fields based on maximum date

In one table I have duplicate values that I would like to group, exporting only the rows where the value in the "published_at" field is the most up-to-date (the latest possible date). Do I understand correctly that if I use the MAX aggregate function, the other fields I extract will come from the row with that maximum, or will they be taken from the first row found in the table?
Let me demonstrate this with a simple example (in the real-world case I am also joining two different tables). I would like to group by id and extract all fields, but only those relating to the row with the maximum published_at. My query would be:
SELECT "t1"."id", "t1"."field", MAX("t1"."published_at") as "published_at"
FROM "t1"
GROUP BY "t1"."id"
| id | field     | published_at |
|----|-----------|--------------|
| 1  | document1 | 2022-01-10   |
| 1  | document2 | 2022-01-11   |
| 1  | document3 | 2022-01-12   |
The result I want is:
1 - document3 - 2022-01-12
One more question: why am I getting the error "ERROR: column "t1"."field" must appear in the GROUP BY clause or be used in an aggregate function"? Can I use the MAX function on a string-type column?
If you want the latest row for each id, you can use DISTINCT ON. For example:
select distinct on (id) *
from t
order by id, published_at desc
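Run against the sample data in the question (using its table name t1), this should return exactly the desired row:
select distinct on (id) id, field, published_at
from t1
order by id, published_at desc;

 id |   field   | published_at
----+-----------+--------------
  1 | document3 | 2022-01-12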
If you just want the latest row in the whole result set you can use LIMIT. For example:
select *
from t
order by published_at desc
limit 1
As for the error message: MAX does work on a text column (values are compared alphabetically), but every column in the SELECT list that is not inside an aggregate must appear in the GROUP BY clause, which is why "t1"."field" is rejected.

Looking for a value in a jsonb list of keys/values

I have a postgresql table of cities (1 row = 1 city) with a jsonb column containing the name of the city in different languages (as a list of key/value pairs, not an array). For example, for Paris (France) I have:
id_city (integer) = 7444
name_city (text) = Paris
names_i18n (jsonb) = {"name:fr":"Paris","name:zh":"巴黎","name:it":"Parigi",...}
In reality my table has around 20 different languages. I am trying to find a city by looking for any name:xx value that matches a parameter given by the user, but I can't figure out how to query the jsonb column that way. I've tried something like the query below, but it doesn't seem to be the right syntax:
select * from jsonb_each_text(select names_i18n from CityTable)
where value ilike 'Parigi'
I have also tried the following
select * from CityTable where names_i18n ? 'Parigi';
But it seems to work only on the keys of the jsonb; is there a similar operator for the values? I also need a way to know which name:XX key was matched, not only the city name.
Does anyone have a clue?
with CityTable (id_city, name_city, names_i18n) as (
  values (
    7444, 'Paris',
    '{"name:fr":"Paris","name:zh":"巴黎","name:it":"Parigi"}'::jsonb
  )
)
select *
from CityTable, jsonb_each_text(names_i18n) jbet (key, value)
where value ilike 'Parigi';
id_city | name_city | names_i18n | key | value
---------+-----------+--------------------------------------------------------------+---------+--------
7444 | Paris | {"name:fr": "Paris", "name:it": "Parigi", "name:zh": "巴黎"} | name:it | Parigi
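Outside the demo CTE, the same lateral call works directly against the real table (a sketch, reusing the table and column names from the question):
select c.id_city, c.name_city, t.key, t.value
from CityTable c
cross join lateral jsonb_each_text(c.names_i18n) t(key, value)
where t.value ilike 'Parigi';
The key column then tells you which name:XX entry was matched.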

Search inside full search column using certain letters

I want to search inside a full search column using certain letters, I mean:
select "Name","Country","_score" from datatable where match("Country", 'China');
This returns many rows and is fine. My question is, how can I search with something like:
select "Name","Country","_score" from datatable where match("Country", 'Ch');
I want to see China, Chile, etc.
I think match_type phrase_prefix could be the answer, but I don't know how to use it (the correct syntax).
The match predicate supports different match types via using match_type [with (match_parameter = [value])].
So in your example using the phrase_prefix match type:
select "Name","Country","_score" from datatable where match("Country", 'Ch') using phrase_prefix;
gives you your desired results.
See the match predicate documentation: https://crate.io/docs/en/latest/sql/fulltext.html?#match-predicate
If you just need to match the beginning of a string column, you don't need a fulltext analyzed column. You can use the LIKE operator instead, e.g.:
cr> create table names_table (name string, country string);
CREATE OK (0.840 sec)
cr> insert into names_table (name, country) values ('foo', 'China'), ('bar','Chile'), ('foobar', 'Austria');
INSERT OK, 3 rows affected (0.049 sec)
cr> select * from names_table where country like 'Ch%';
+---------+------+
| country | name |
+---------+------+
| Chile | bar |
| China | foo |
+---------+------+
SELECT 2 rows in set (0.037 sec)

psql - order alphabetically on either field

Say I have a query like this
SELECT "contacts".name, "email_addresses".address
FROM "contacts"
LEFT JOIN "email_addresses" ON ("contacts"."id" = "email_addresses"."contact_id")
WHERE (("contacts"."account_id" = 1) AND ("public" IS TRUE))
ORDER BY "contacts"."name", "email_addresses"."address"
Some of the results might have a null name or a null email address. I want to order alphabetically on a sort of computed display_name property that sorts by the name if there is one, and otherwise by the email address, so I would get results like this:
name        | email
------------+-----------------
null        | aaron#gmail.com
Ben Jones   | null
Colin Cowan | zach#gmail.com
etc.
You can use the COALESCE function, which returns the first non-null value among its arguments:
order by coalesce("contacts"."name","email_addresses"."address")
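Applied to the query in the question, the full statement would look something like this (a sketch; you may still want secondary sort keys after the COALESCE):
SELECT "contacts".name, "email_addresses".address
FROM "contacts"
LEFT JOIN "email_addresses" ON ("contacts"."id" = "email_addresses"."contact_id")
WHERE (("contacts"."account_id" = 1) AND ("public" IS TRUE))
ORDER BY COALESCE("contacts"."name", "email_addresses"."address")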

zend search lucene

I have a database that I would like to leverage with Zend_Search_Lucene. However, I am having difficulty creating a "fully searchable" document for Lucene.
Each Zend_Search_Lucene document pulls information from two relational database tables (Table_One and Table_Two). Table_One has basic information (id, owner_id, title, description, location, etc.), Table_Two has a 1:N relationship to Table_One (meaning, for each entry in Table_One, there could be one or more entries in Table_Two). Table_Two contains: id, listing_id, bedrooms, bathrooms, price_min, price_max, date_available. See Figure 1.
Figure 1
Table_One
id (Primary Key)
owner_id
title
description
location
etc...
Table_Two
id (Primary Key)
listing_id (Foreign Key to Table_One)
bedrooms (int)
bathrooms (int)
price_min (int)
price_max (int)
date_available (datetime)
The problem is that there are multiple Table_Two entries for each Table_One entry. [Question 1] How do I create a Zend_Search_Lucene document where each field is unique? (See Figure 2)
Figure 2
Lucene Document
id:Keyword
owner_id:Keyword
title:UnStored
description:UnStored
location: UnStored
date_registered:Keyword
... (other Table_One information)
bedrooms: UnStored
bathrooms: UnStored
price_min: UnStored
price_max: UnStored
date_available: Keyword
bedrooms_1: <- Would prefer not to have to do this, as it makes bedrooms harder to search.
Next, I need to be able to do a range query on the bedrooms, bathrooms, price_min and price_max fields (for example, finding documents that have between 1 and 3 bedrooms). Zend_Search_Lucene only allows range searches within a single field. From my understanding, this means each field I want to run a range query on can contain only one value (for example, bedrooms:"1 bedroom").
What I have now within the Lucene document is the bedrooms, bathrooms, price_min, price_max and date_available fields stored as space-delimited values.
Example:
Sample Table_One Entry:
| 5 | 2 | "Sample Title" | "Sample Description" | "Sample Location" | 2008-01-12
Sample Table_Two Entries:
| 10 | 5 | 3 | 1 | 900 | 1000 | 2009-10-01
| 11 | 5 | 2 | 1 | 800 | 850 | 2009-08-11
| 12 | 5 | 1 | 1 | 650 | 650 | 2009-09-15
Sample Lucene Document
id:5
owner_id:2
title: "Sample Title"
description: "Sample Description"
location: "Sample Location"
date_registered: [datetime stamp YYYY-MM-DD]
bedrooms: "3 bedroom 2 bedroom 1 bedroom"
bathrooms: "1 bathroom 1 bathroom 1 bathroom"
price_min: "900 800 650"
price_max: "1000 850 650"
date_available: "2009-10-01 2009-08-11 2009-09-15"
[Question 2] Can you do a range query on the bedrooms, bathrooms, price_min, price_max and date_available fields as they are shown above, or does each range-query field have to contain only one value (e.g. "1 bedroom")? I have not been able to get the range query to work in its current form. I am at a loss here.
Thanks in advance.
I suggest you create a separate Lucene document for each entry in Table_Two. This will cause some duplication of the Table_One information common to those entries, but that is not a high price to pay for a much simpler index structure in Lucene.
Use a boolean query to combine several range queries. The number-valued fields should be something like this:
bedrooms: 3
price_min: 900
and a sample query in Lucene syntax will be:
date_available:[20100101 TO 20100301] AND price_min:[600 TO 1000]
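Putting the two suggestions together, the first Table_Two entry from the sample data would become its own document along these lines (a sketch based on the sample values above):
id: 10
listing_id: 5
owner_id: 2
title: "Sample Title"
description: "Sample Description"
location: "Sample Location"
date_registered: 2008-01-12
bedrooms: 3
bathrooms: 1
price_min: 900
price_max: 1000
date_available: 2009-10-01
A boolean range query like the one above can then be evaluated against each listing variant individually.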