Pattern matching Postgres to replace misspelled words of a column - postgresql

I have a table A:
name        | renamed_name
------------+-------------
HON/A       |
HONDA TRUCK |
GMC         |
and I have a renaming rules table B:
rule        | correct_name
------------+-------------
HON/A       | HONDA
HONDA TRUCK | HONDA
^GMC.+      | GMC
I need to update table A and set A.renamed_name to B.correct_name wherever A.name matches any B.rule.
When I use the following update query:
UPDATE A SET renamed_name = B.correct_name FROM B WHERE A.name ~* ANY (ARRAY[B.rule]);
I get this result:
name        | renamed_name
------------+-------------
HON/A       | HONDA
HONDA TRUCK | HONDA
GMC         | NULL
The last row is not updated even though my condition includes a regular expression. Please let me know where I might be going wrong, or whether there is an alternative solution.
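One likely cause can be sketched in Python rather than SQL: the pattern ^GMC.+ requires at least one character after GMC, so the bare name GMC cannot match it, whereas ^GMC.* would. This assumes Python's re module treats these simple patterns the same way the ~* operator does. The rename helper and the RULES list are illustrative names, not part of the original schema.

```python
import re

# Rules copied from table B: (rule, correct_name).
RULES = [
    ("HON/A", "HONDA"),
    ("HONDA TRUCK", "HONDA"),
    ("^GMC.+", "GMC"),
]


def rename(name, rules):
    """Return the correct_name of the first rule matching name, else None.

    re.IGNORECASE stands in for the case-insensitive ~* operator.
    """
    for pattern, correct_name in rules:
        if re.search(pattern, name, flags=re.IGNORECASE):
            return correct_name
    return None


# ".+" needs one or more characters after "GMC", so "GMC" alone stays unmatched.
unmatched = rename("GMC", RULES)

# ".*" allows zero or more characters, so "GMC" alone now matches.
relaxed_rules = RULES[:2] + [("^GMC.*", "GMC")]
matched = rename("GMC", relaxed_rules)
```

If this is indeed the cause, changing the stored rule from ^GMC.+ to ^GMC.* (or to ^GMC) should let the last row be updated without changing the query.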

Related

Postgres 13 join from another table on JSONB array of String

I have a table with a JSONB column. In this column we store identifiers from another table as a JSON array of strings. How can I join the tables?
Table Customer:
CustomerID | Name | Campaigns (JSONB)
-----------+------+----------------------------------------
1          | John | [ "rxuatoak", "vsnxcvdsl", "jkiasokd" ]
2          | Mick | [ "jdywmsks", "nxbsvwios", "jkiasokd" ]
Table Campaign:
CampaignID | Identifier | CampaignName
-----------+------------+-------------
1          | rxuatoak   | Alpha
2          | vsnxcvdsl  | Bravo
3          | jkiasokd   | Charlie
4          | jdywmsks   | Delta
5          | nxbsvwios  | Echo
Result something like:
CustomerID | Name | CampaignNames
-----------+------+----------------------
1          | John | Alpha, Bravo, Charlie
2          | Mick | Delta, Echo, Charlie
I tried many ways, and could only find online help with json objects inside the jsonb column. My jsonb column has simple array of strings.
Using POSTGRES 13
You can JOIN the two tables on the condition that a campaign's identifier is found in the customer's Campaigns array (using the ? operator). Then aggregate the campaign names with STRING_AGG, grouping by "CustomerID" and "Name".
SELECT customer.CustomerID,
       customer.Name_,
       STRING_AGG(campaign.CampaignName, ',') AS CampaignNames
FROM customer
INNER JOIN campaign
        ON customer.Campaigns ? campaign.Identifier
GROUP BY customer.CustomerID,
         customer.Name_
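The logic of that join can be mimicked in plain Python as a rough illustration: each customer's JSON array is parsed, each identifier is looked up among the campaigns, and the matching names are joined into one string. The function name campaign_names and the in-memory tuples are assumptions made for this sketch. Unlike STRING_AGG without an ORDER BY, which gives no ordering guarantee, the sketch keeps the order of the JSON array.

```python
import json

# Rows copied from the question: (CustomerID, Name, Campaigns as JSON text).
CUSTOMERS = [
    (1, "John", '["rxuatoak", "vsnxcvdsl", "jkiasokd"]'),
    (2, "Mick", '["jdywmsks", "nxbsvwios", "jkiasokd"]'),
]

# Rows copied from the question: (CampaignID, Identifier, CampaignName).
CAMPAIGNS = [
    (1, "rxuatoak", "Alpha"),
    (2, "vsnxcvdsl", "Bravo"),
    (3, "jkiasokd", "Charlie"),
    (4, "jdywmsks", "Delta"),
    (5, "nxbsvwios", "Echo"),
]


def campaign_names(customers, campaigns):
    """Return (CustomerID, Name, comma-separated campaign names) per customer."""
    name_by_identifier = {ident: cname for _, ident, cname in campaigns}
    rows = []
    for customer_id, name, campaigns_json in customers:
        identifiers = json.loads(campaigns_json)
        # Keep only identifiers that exist in the campaign table (inner join).
        names = [name_by_identifier[i] for i in identifiers if i in name_by_identifier]
        rows.append((customer_id, name, ", ".join(names)))
    return rows


result = campaign_names(CUSTOMERS, CAMPAIGNS)
```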

How to use ts_query with ANY(anyarray)

I currently have a query in PostgreSQL like:
SELECT
    name
FROM
    ingredients
WHERE
    name = ANY('{"string value",tomato,other}')
My ingredients table is simply a list of names:
name
----------
jalapeno
tomatoes
avocados
lime
My issue is that singular values in the array (such as tomato) do not match the plural names stored in the table (such as tomatoes). To solve this, I created a tsvector column on the table:
name     | tokens
---------+--------------
jalapeno | 'jalapeno':1
tomatoes | 'tomato':1
avocados | 'avocado':1
lime     | 'lime':1
I'm able to correctly query single values from the table like this:
SELECT
    name,
    ts_rank_cd(tokens, plainto_tsquery('tomato'), 16) AS rank
FROM
    ingredients
WHERE
    tokens @@ plainto_tsquery('tomato')
ORDER BY
    rank DESC;
However, I need to query values from the entire array. The array is generated by another function, so I have control over the type of each of the items in the array.
How can I use the @@ operator with ANY(anyarray)?
That should be straightforward:
WHERE tokens @@ ANY
    (ARRAY[
        plainto_tsquery('tomato'),
        plainto_tsquery('celery'),
        plainto_tsquery('vodka')
    ])

Extract some digits after word id

I have a table with a Name column and a Log column.
Name Log
Michelle Bad 222 news travels id 54585 fast.
Lucy Barking 333 dogs id 545584 seldom bite.
Green Beauty is 444 in the id 85955 eyes of the beholder.
Gail Beggars 123 can't be ID 4658 choosers.
I want to extract only the ID digits from log column. Note that the word ID could be capitalized or not. Hence, the output should be like this:
Name ID
Michelle 54585
Lucy 545584
Green 85955
Gail 4658
I tried to use the following query:
select name
, substring(log from E'^(.*?)[id< ]') as id
from mytable;
However, this does not produce the output I need.
Something along these lines should work. Since you didn't provide CREATE TABLE and INSERT statements, I just used one row in a common table expression.
with data as (
select 'Michelle' as name, 'Bad 333 id 54342 wibble' as log
)
select name, substring(substring(log::text, '((id|ID) [0-9]+)'), '[0-9]+')
from data
where log::text ~* 'id [0-9]+';
The nested substring() calls first return the id number along with the string 'id', then return just the id number.
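The same extraction idea can be checked outside the database with a short Python sketch. It uses a single capture group instead of nested calls and a case-insensitive flag so that both id and ID are covered. The function name extract_id is an assumption for illustration, and \b is Python's word-boundary syntax, which is not necessarily spelled the same way in other regex dialects.

```python
import re

# Sample rows copied from the question: (Name, Log).
ROWS = [
    ("Michelle", "Bad 222 news travels id 54585 fast."),
    ("Lucy", "Barking 333 dogs id 545584 seldom bite."),
    ("Green", "Beauty is 444 in the id 85955 eyes of the beholder."),
    ("Gail", "Beggars 123 can't be ID 4658 choosers."),
]

# The word "id" (any letter case), whitespace, then the digits to capture.
ID_PATTERN = re.compile(r"\bid\s+(\d+)", re.IGNORECASE)


def extract_id(log):
    """Return the digits that follow the word 'id', or None if there are none."""
    match = ID_PATTERN.search(log)
    return match.group(1) if match else None


extracted = [(name, extract_id(log)) for name, log in ROWS]
```

The word boundary keeps other numbers in the log (222, 333, 444, 123) from being picked up, since none of them directly follows a standalone id.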

Calculate value based on existence of records matching given criteria - FileMaker Pro 13

How can I write a calculation field in a table that outputs '1' if there are other (related) records in the same table that meet a given set of criteria and '0' otherwise?
Here's my problem explained in more detail:
I have a table containing 'students' and another containing 'exam results'. The 'exam results' table looks like this:
StudentID SubjectID Level Result
3234 1 2 A-
3234 2 4 B+
4739 1 4 C+
A student can only pass a Level 4 exam in subject 2 if they have also passed a Level 2 exam in subject 1 with a B+ or higher. I want to define a field in the 'students' table that contains a '1' if there exists an exam result belonging to the right student that meets these criteria and a '0' otherwise.
What would be the best way to do this?
Let us take an example of a Results table where the results are also calculated as a numeric value, e.g.
StudentID SubjectID Level Result cResultNum
3234 1 2 A- 95
3234 2 4 B+ 85
4739 1 4 C+ 75
and an Exams table with the following fields (among others):
RequiredSubjectID
RequiredLevel
RequiredResultNum
Given these, you can construct a relationship between Exams and (another occurrence of) Results as:
Exams::RequiredSubjectID = Results 2::SubjectID
AND
Exams::RequiredLevel = Results 2::Level
AND
Exams::RequiredResultNum ≤ Results 2::cResultNum
This allows each exam record to calculate a list of the students who are eligible to take that exam as:
List ( Results 2::StudentID )
"I want to define a field in the 'students' table that contains a '1' if there exists an exam result belonging to the right student that meets these criteria and a '0' otherwise."
This request is unclear, because there are many exams a student may want to take, and a field in the Students table can calculate only one result.
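To make the relationship's three predicates concrete, here is a rough Python analogue, not FileMaker code. It filters the sample results by subject, level and minimum numeric result, lists the eligible students, and derives a 1/0 flag for one student and one exam requirement. The function names and the threshold of 85 for B+ (taken from the cResultNum column in the sample table) are assumptions of this sketch.

```python
# Sample rows copied from the Results table in this answer.
RESULTS = [
    {"StudentID": 3234, "SubjectID": 1, "Level": 2, "Result": "A-", "cResultNum": 95},
    {"StudentID": 3234, "SubjectID": 2, "Level": 4, "Result": "B+", "cResultNum": 85},
    {"StudentID": 4739, "SubjectID": 1, "Level": 4, "Result": "C+", "cResultNum": 75},
]


def eligible_students(results, required_subject, required_level, required_result_num):
    """Mirror the relationship: subject and level equal, result at least the minimum."""
    return [
        r["StudentID"]
        for r in results
        if r["SubjectID"] == required_subject
        and r["Level"] == required_level
        and r["cResultNum"] >= required_result_num
    ]


def eligibility_flag(student_id, results, required_subject, required_level, required_result_num):
    """Return 1 if the student has a qualifying result for this requirement, else 0."""
    eligible = eligible_students(results, required_subject, required_level, required_result_num)
    return 1 if student_id in eligible else 0


# Requirement from the question: Level 2 in subject 1 with B+ (85) or higher.
eligible = eligible_students(RESULTS, 1, 2, 85)
flag_3234 = eligibility_flag(3234, RESULTS, 1, 2, 85)
flag_4739 = eligibility_flag(4739, RESULTS, 1, 2, 85)
```

The flag depends on both the student and the exam requirement, which is why a single field in the Students table can only answer the question for one exam at a time.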
You need to do a self-join in the table for the field you want to check, for example:
Exam::Level = Exam2::Level
Exam::Student = Exam2::Student
And for the "was passed" criteria I think you could do an "If" on the calculation like this:
If ( Last(Exam2::Result) = "D" and ...(all the pass values) ; 1 ; 0 )
Edit:
It is simpler to test only for the failing value, which I missed at first. It would look like this:
If ( Last(Exam2::Result) = "F" ; 0 ; 1 )
I hope this helps you.

zend search lucene

I have a database that I would like to leverage with Zend_Search_Lucene. However, I am having difficulty creating a "fully searchable" document for Lucene.
Each Zend_Search_Lucene document pulls information from two relational database tables (Table_One and Table_Two). Table_One has basic information (id, owner_id, title, description, location, etc.), Table_Two has a 1:N relationship to Table_One (meaning, for each entry in Table_One, there could be one or more entries in Table_Two). Table_Two contains: id, listing_id, bedrooms, bathrooms, price_min, price_max, date_available. See Figure 1.
Figure 1
Table_One
id (Primary Key)
owner_id
title
description
location
etc...
Table_Two
id (Primary Key)
listing_id (Foreign Key to Table_One)
bedrooms (int)
bathrooms (int)
price_min (int)
price_max (int)
date_available (datetime)
The problem is, there are multiple Table_Two entries for each Table_One entry. [Question 1] How to create a Zend_Search_Lucene document where each field is unique? (See Figure 2)
Figure 2
Lucene Document
id:Keyword
owner_id:Keyword
title:UnStored
description:UnStored
location: UnStored
date_registered:Keyword
... (other Table_One information)
bedrooms: UnStored
bathrooms: UnStored
price_min: UnStored
price_max: UnStored
date_available: Keyword
bedrooms_1: <- Would prefer not to have to do this, as it makes the bedrooms harder to search.
Next, I need to be able to do a Range Query on the bedrooms, bathrooms, price_min and price_max fields (for example, finding documents that have between 1 and 3 bedrooms). Zend_Search_Lucene will only allow ranged searches on the same field. From my understanding, this means each field I want to run a ranged query on can only contain one value (example: bedrooms:"1 bedroom").
What I have now in the Lucene document is the bedrooms, bathrooms, price_min, price_max and date_available fields stored as space-delimited values.
Example:
Sample Table_One Entry:
| 5 | 2 | "Sample Title" | "Sample Description" | "Sample Location" | 2008-01-12
Sample Table_Two Entries:
| 10 | 5 | 3 | 1 | 900 | 1000 | 2009-10-01
| 11 | 5 | 2 | 1 | 800 | 850 | 2009-08-11
| 12 | 5 | 1 | 1 | 650 | 650 | 2009-09-15
Sample Lucene Document
id:5
owner_id:2
title: "Sample Title"
description: "Sample Description"
location: "Sample Location"
date_registered: [datetime stamp YYYY-MM-DD]
bedrooms: "3 bedroom 2 bedroom 1 bedroom"
bathrooms: "1 bathroom 1 bathroom 1 bathroom"
price_min: "900 800 650"
price_max: "1000 850 650"
date_available: "2009-10-01 2009-08-11 2009-09-15"
[Question 2] Can you do a Range Query search on the bedrooms, bathrooms, price_min, price_max and date_available fields as they are shown above, or does each range query field have to contain only one value (e.g. "1 bedroom")? I have not been able to get the Range Query to work in its current form. I am at a loss here.
Thanks in advance.
I suggest you create a separate Lucene document for each entry in Table_Two. This will duplicate the Table_One information common to these entries, but that is a small price to pay for a much simpler index structure in Lucene.
Use a boolean query to combine several range queries. The number-valued fields should look something like this:
bedrooms: 3
price_min: 900
and a sample query in Lucene syntax will be:
date_available:[20100101 TO 20100301] AND price_min:[600 TO 1000]
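The suggested index layout can be sketched in Python as a rough model, not as Zend_Search_Lucene code: one document per Table_Two row, each carrying a copy of the Table_One fields, with range conditions combined by AND. The names build_documents and search, the unit_id and listing_id keys, and the in-memory dictionaries are assumptions for illustration. Dates are stored as YYYYMMDD strings, as in the sample query above, so they compare correctly as text.

```python
# Sample Table_One entry and Table_Two entries copied from the question.
LISTING = {"id": 5, "owner_id": 2, "title": "Sample Title", "location": "Sample Location"}
UNITS = [
    {"id": 10, "listing_id": 5, "bedrooms": 3, "bathrooms": 1,
     "price_min": 900, "price_max": 1000, "date_available": "2009-10-01"},
    {"id": 11, "listing_id": 5, "bedrooms": 2, "bathrooms": 1,
     "price_min": 800, "price_max": 850, "date_available": "2009-08-11"},
    {"id": 12, "listing_id": 5, "bedrooms": 1, "bathrooms": 1,
     "price_min": 650, "price_max": 650, "date_available": "2009-09-15"},
]


def build_documents(listing, units):
    """Create one flat document per Table_Two row, duplicating the listing fields."""
    docs = []
    for unit in units:
        docs.append({
            "unit_id": unit["id"],
            "listing_id": listing["id"],
            "owner_id": listing["owner_id"],
            "title": listing["title"],
            "location": listing["location"],
            "bedrooms": unit["bedrooms"],
            "bathrooms": unit["bathrooms"],
            "price_min": unit["price_min"],
            "price_max": unit["price_max"],
            # Fixed-width YYYYMMDD sorts the same as the date itself.
            "date_available": unit["date_available"].replace("-", ""),
        })
    return docs


def search(docs, **ranges):
    """Return documents where every named field lies in its inclusive (low, high) range."""
    return [
        doc for doc in docs
        if all(low <= doc[field] <= high for field, (low, high) in ranges.items())
    ]


docs = build_documents(LISTING, UNITS)
by_price_and_beds = [d["unit_id"] for d in search(docs, price_min=(600, 1000), bedrooms=(1, 2))]
by_date = [d["unit_id"] for d in search(docs, date_available=("20090901", "20091031"))]
```

Because every document holds exactly one value per field, each range condition is unambiguous, which is the point of splitting the listing into one document per Table_Two row.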