Is place_id in Nominatim a unique column? - openstreetmap

Is the place_id column in a reverse lookup response a unique column?
[Long story...]
The reason I need to know: I have made a monumental mistake. I reverse-geocoded 40 million records but forgot to include osm_type in the saved results.
This is a problem because I am updating a SQL table with the results of my reverse geocoding, but osm_id is only unique within an element type, and since I don't know the element type of the row being updated, I've landed myself in a big mess!
So now I need a way to match these rows.
I have the place_id saved, so this could be my savior (if it's a unique column?).
Otherwise, if it isn't, is there any other way to implicitly infer the element type of a Nominatim response, e.g. from the presence of another column?
Here is a sample reverse geocoded result:
{
  "placeId": "90367351",
  "osmId": "109378817",
  "boundingBox": "",
  "polygonPoints": "",
  "displayName": "Bugallon, Calbayog, Samar, Eastern Visayas, 6710, Philippines",
  "road": "Bugallon",
  "neighbourhood": "",
  "suburb": "",
  "cityDistrict": "",
  "county": "",
  "state": "Samar",
  "country": "Philippines",
  "countryCode": "ph",
  "continent": "",
  "house": "",
  "village": "",
  "town": "",
  "city": "Calbayog",
  "lat": "12.0666148",
  "lon": "124.5958354"
}
And here are two sample rows from my SQL table (same OsmId but different OsmElementType):
Id OsmId OsmKey OsmValue OsmNodeLat OsmNodeLng OsmInnerXml OsmXmlPath OsmXmlCountry OsmXmlFileNumber OsmElementType OsmPlaceId OsmDisplayName OsmRoad OsmNeighbourhood OsmSuburb OsmCityDistrict OsmCounty OsmState OsmCountry OsmCountryCode OsmContinent OsmHouse OsmVillage OsmTown OsmCity OsmBoundingBox OsmPolygonPoints FriendlyUrlTitle ViewsNum NumFavourites Text DateCreated UserIdCsvWhoViewedProfile IpCsvWhoViewedProfile Country_Id Owner_Id
6518 255653806 place town -3.3383462 35.6735367 NULL africa 105 N 769219 Karatu, Arusha, Northern, Tanzania Arusha Tanzania tz Karatu karatu-arusha-northern-tanzania 0 0 Karatu 1900-01-01 00:00:00.000 NULL NULL 170 2
3078707 255653806 landuse residential 0 0 PG5kIHJlZj0iMjYxMzI1OTE3NyIgLz48bmQgcmVmPSIyNjEzMjU5MTc4IiAvPjxuZCByZWY9IjI2MTMyNTkxNzkiIC8+PG5kIHJlZj0iMjYxMzI1OTE4MCIgLz48bmQgcmVmPSIyNjEzMjU5MTc3IiAvPjx0YWcgaz0ibGFuZHVzZSIgdj0icmVzaWRlbnRpYWwiIC8+PHRhZyBrPSJuYW1lIiB2PSLlj7DljJflt7TloZ7pmobntI0iIC8+ NULL asia 124 W NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 0 0 ?????? 1900-01-01 00:00:00.000 NULL NULL NULL 2

Looks like it is unique:
https://github.com/twain47/Nominatim/blob/master/sql/tables.sql
CREATE UNIQUE INDEX idx_place_id ON placex USING BTREE (place_id);
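Since place_id is unique, one hedged recovery sketch: load the saved results into a staging table and join on the stored place_id instead of (osm_id, osm_type). The staging table geocode_results and the target table name below are hypothetical, and the UPDATE ... FROM form is PostgreSQL-style; adapt to your RDBMS.
UPDATE MyOsmTable AS t
SET OsmDisplayName = g.displayName,
    OsmCity        = g.city,
    OsmCountry     = g.country
FROM geocode_results AS g          -- hypothetical staging table holding the saved results
WHERE t.OsmPlaceId = g.placeId;    -- safe join key: place_id is unique in placex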

Related

Convert individual postgres jsonb array elements to row elements

I have to query a table with 2 columns, id and content. id is just a UUID and the content column looks like:
{
  "fields": [
    {
      "001": "mig00004139229"
    },
    {
      "856": {
        "ind1": " ",
        "ind2": " ",
        "subfields": [
          {
            "u": "https://some.domain.com"
          },
          {
            "z": "some text"
          }
        ]
      }
    },
    {
      "999": {
        "subfields": [
          {
            "i": "81be1acf-11df-4d13-a5c6-4838e3a808ee"
          },
          {
            "s": "3a6aa357-8fd6-4451-aedc-13453c1f2296"
          }
        ]
      }
    }
  ]
}
I need to select the id, 001, and 856 elements where the subfield "u" domain matches the string "domain.com", so the output would be:
| id | 001 | 856 |
|----|-----|-----|
| 81be1acf-11df-4d13-a5c6-4838e3a808ee | mig00004139229 | https://some.domain.com |
If this were a flat table, the query would correspond to "select id, 001, 856 from table where 856 like '%domain.com%'".
I can select individual columns based on the criteria I need, but they appear on separate rows, except the id, which appears alongside each individual field in a regular select statement. How would I get the other fields to appear in the same row, since they are part of the same record?
Unfortunately, my postgres version doesn't support jsonb_path_query, so I've been trying something along the lines of:
SELECT id,
       jsonb_array_elements(content -> 'fields') -> '001',
       jsonb_array_elements(content -> 'fields') -> '856' -> 'subfields'
FROM mytable
WHERE....
This method returns the data I need, but the individual elements arrive on separate rows, with the id in the first column and nulls for every element that is neither the 001 nor the 856, e.g.:
| id | 001 | 856 |
|----|-----|-----|
| id_for_first_record | 001_first_record | null |
| id_for_first_record | null | null |
| id_for_first_record | null | null |
| id_for_first_record | null | 856_first_record |
| id_for_second_record | 001_second_record | null |
| id_for_second_record | null | null |
| id_for_second_record | null | null |
| id_for_second_record | null | 856_second_record |
Usable, but clunky, so I'm looking for something better.
I think my query can help you. There are different ways to resolve this; I am not sure if this is the best approach.
I use the jsonb_path_query() function with the path to the specified JSON value.
SELECT id,
       jsonb_path_query(content, '$.fields[*]."001"') AS "001",
       jsonb_path_query(content, '$.fields[*]."856".subfields[*].u') AS "856"
FROM t
WHERE jsonb_path_query_first(content, '$.fields[*]."856".subfields[*].u')::text ILIKE '%domain%';
Output:
| id | 001 | 856 |
|----|-----|-----|
| 81be1acf-11df-4d13-a5c6-4838e3a808ee | "mig00004139229" | "https://some.domain.com" |
UPDATE: since the PostgreSQL version is prior to 12 (so jsonb_path_query() is unavailable), you could try something like this, though I think there must be a better approach:
SELECT t.id,
       max(sq1."001") AS "001",
       max(sq2."856") AS "856"
FROM t
INNER JOIN (SELECT id, (jsonb_array_elements(content -> 'fields') -> '001')::text AS "001"
            FROM t) AS sq1 ON t.id = sq1.id
INNER JOIN (SELECT id, (jsonb_array_elements(jsonb_array_elements(content -> 'fields') -> '856' -> 'subfields') -> 'u')::text AS "856"
            FROM t) AS sq2 ON t.id = sq2.id
WHERE sq2."856" ILIKE '%domain%'
GROUP BY t.id;
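A possibly tidier pre-12 variant, offered only as a sketch against the same assumed table t and column content: expand the fields array once with LATERAL, then aggregate per id so each source row yields a single output row.
SELECT t.id,
       max(f.elem ->> '001') AS "001",   -- non-001 elements yield NULL, which max() ignores
       max(s.sub ->> 'u')    AS "856"
FROM t
CROSS JOIN LATERAL jsonb_array_elements(t.content -> 'fields') AS f(elem)
LEFT JOIN LATERAL jsonb_array_elements(f.elem -> '856' -> 'subfields') AS s(sub) ON true
GROUP BY t.id
HAVING max(s.sub ->> 'u') ILIKE '%domain.com%';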

Postgresql: Can the minus operator not be used with a parameter? Only hardcoded values?

The following query deletes an entry using a hardcoded index:
const deleteGameQuery = `
update users
set games = games - 1
where username = $1
`
If I pass the index as a parameter, nothing is deleted:
const gameIndex = rowsCopy[0].games.findIndex(obj => obj.game == gameID).toString();
const deleteGameQuery = `
update users
set games = games - $1
where username = $2
`
const { rows } = await query(deleteGameQuery, [gameIndex, username]);
ctx.body = rows;
The gameIndex parameter is just a string, the same as if I typed it. So why doesn't it seem to read the value? Is this not allowed?
The column games is a jsonb data type with the following data:
[
  {
    "game": "cyberpunk-2077",
    "status": "Backlog",
    "platform": "Any"
  },
  {
    "game": "new-pokemon-snap",
    "status": "Backlog",
    "platform": "Any"
  }
]
The problem is that you're passing text instead of an integer. I'm not sure exactly how your database interface passes integers, but try removing toString() and ensuring gameIndex is a Number:
const gameIndex = rowsCopy[0].games.findIndex(obj => obj.game == gameID);
array - integer and array - text mean two different things.
array - 1 removes the second element from the array:
select '[1,2,3]'::jsonb - 1;
-- [1, 3]
array - '1' searches for the entry '1' and removes it:
select '["1","2","3"]'::jsonb - '1';
-- ["2", "3"]
-- Here, nothing is removed because 1 != '1'.
select '[1,2,3]'::jsonb - '1';
-- [1, 2, 3]
When you pass in a parameter, it is translated by query() according to its type. If you pass a Number, it will be translated as 1. If you pass a String, it will be translated as '1'. (Or at least that's how it should work; I'm not totally familiar with JavaScript database libraries.)
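If the driver's type inference is in doubt, a hedged workaround is to cast the parameter inside the SQL itself, so the jsonb minus operator always sees an integer:
-- Sketch: force the integer (delete-by-index) form of jsonb's - operator,
-- regardless of whether the driver sends $1 as text or as a number.
update users
set games = games - $1::int
where username = $2;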
As a side note, this sort of data is better handled as a join table.
create table games (
id bigserial primary key,
name text not null,
status text not null,
platform text not null
);
create table users (
id bigserial primary key,
username text not null
);
create table game_users (
game_id bigint not null references games,
user_id bigint not null references users,
-- If a user can only have each game once.
unique(game_id, user_id)
);
-- User 1 has games 1 and 2. User 2 has game 2.
insert into game_users (game_id, user_id) values (1, 1), (2, 1), (2,2);
-- User 1 no longer has game 1.
delete from game_users where game_id = 1 and user_id = 1;
You would also have a platforms table and a game_platforms join table.
Join tables are a little mind-bending, but they're how SQL stores relationships. JSONB is very useful, but it is not a substitute for relationships.
You can try to avoid decomposing objects outside of Postgres and manipulate the jsonb structure inside the query, like this:
create table gameplayers as (select 1 as id, '[
{
"game": "cyberpunk-2077",
"status": "Backlog",
"platform": "Any"
},
{
"game": "new-pokemon-snap",
"status": "Backlog",
"platform": "Any"
},
{
"game": "gameone",
"status": "Backlog",
"platform": "Any"
}
]'::jsonb games);
with
ungroupped as (select * from gameplayers g, jsonb_to_recordset(g.games)
as (game text, status text, platform text)),
filtered as (select id,
                    jsonb_agg(
                        json_build_object('game', game,
                                          'status', status,
                                          'platform', platform)
                    ) as games
             from ungroupped where game not like 'cyberpunk-2077' group by id)
UPDATE gameplayers as g set games=f.games
from filtered f where f.id=g.id;

Postgres where json column "in" casting json to uuid

Given the following data structure, I am trying to count the number of answers to question-type messages:
questions are identified by a non-null node or options column;
answers are identified by a non-null previous column.
Ideally I'm hoping to return
| message | answer | count |
|---------------------|---------------|-------|
| Stuffed crust? | Crunchy crust | 2 |
| Stuffed crust? | More cheese! | 1 |
| Pineapple on pizza? | No | 3 |
| Pineapple on pizza? | Yes | 2 |
I assume that once I work out how to get around the casting error below, I can work out the counting and grouping, but I can't seem to get that far yet.
Query 1 ERROR: ERROR: operator does not exist: json = uuid
LINE 24: where previous->'id'::text in (
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
WITH data (
id,
message,
node,
options,
previous
) AS (
VALUES
('5f0a50c7-2736-45a2-81c0-fad1ca62cbdc'::uuid, 'No', null, null, '{"id": "20c98b37-6cf3-47d1-b93a-606b99bb341a", "node": "pineapple"}'::json),
('ec7cd365-e206-4f21-be37-495914458313'::uuid, 'Yes', null, null, '{"id": "20c98b37-6cf3-47d1-b93a-606b99bb341a", "node": "pineapple"}'::json),
('56240ea2-6bc7-435e-b76f-c874084a234c'::uuid, 'No', null, null, '{"id": "20c98b37-6cf3-47d1-b93a-606b99bb341a", "node": "pineapple"}'::json),
('670d6d09-89d6-4063-ace7-e606f18c2cc2'::uuid, 'Yes', null, null, '{"id": "20c98b37-6cf3-47d1-b93a-606b99bb341a", "node": "pineapple"}'::json),
('25acbc4c-dd27-412c-86b2-8882c80b9c73'::uuid, 'No', null, null, '{"id": "20c98b37-6cf3-47d1-b93a-606b99bb341a", "node": "pineapple"}'::json),
('e7ff8b2b-cc4d-4006-a3c4-9efdc8e458db'::uuid, 'More cheese!', null, null, '{"id": "b18059f0-6d38-4898-bbb7-ebdd7e175b82", "node": "stuffed_crust"}'::json),
('c3aee52f-e30e-4c83-8c90-9ff890dd0e72'::uuid, 'Crunchy crust', null, null, '{"id": "b18059f0-6d38-4898-bbb7-ebdd7e175b82", "node": "stuffed_crust"}'::json),
('965f9936-284f-4e57-838d-bcf90f119a9c'::uuid, 'Crunchy crust', null, null, '{"id": "b18059f0-6d38-4898-bbb7-ebdd7e175b82", "node": "stuffed_crust"}'::json),
-- questions
('b18059f0-6d38-4898-bbb7-ebdd7e175b82'::uuid, 'Stuffed crust?', 'stuffed_crust', '["Crunchy crust","More cheese!"]'::json, null::json),
('20c98b37-6cf3-47d1-b93a-606b99bb341a'::uuid, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json)
)
SELECT * from data
where previous->'id'::uuid in (
SELECT id::uuid FROM data WHERE options is not null
);
Update
Having had my casting question answered, the query I used to achieve the results I wanted is as follows:
select d2.message as question, data.message, count(data.message)
from data
join data as d2 on (data.previous->>'id')::uuid = d2.id
where (data.previous->>'id')::uuid in (
SELECT id FROM data WHERE options is not null
)
group by question, data.message;
The :: operator has a higher precedence than the -> operator, so you need to use parentheses there. You also need to get the ID as text (->>), not as JSON, as there is no direct cast from json to uuid:
where (previous->>'id')::uuid IN (...)
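A minimal, self-contained illustration of both points:
-- previous->'id'::text parses as previous -> ('id'::text), which still yields
-- json and so triggers "operator does not exist: json = uuid".
-- Parenthesizing and using ->> extracts text first, which casts cleanly:
select ('{"id": "20c98b37-6cf3-47d1-b93a-606b99bb341a"}'::json ->> 'id')::uuid;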

How can I update a jsonb column while leaving existing fields intact in postgresql

How can I refactor this update statement so that it updates a jsonb column in my postgresql table?
update contacts as c
set latitude = v.latitude, longitude = v.longitude,
    home_house_num = v.home_house_num, home_predirection = v.home_predirection,
    home_street_name = v.home_street_name, home_street_type = v.home_street_type
from (values
    (16247746, 40.814140, -74.259250, '25', null, 'Moran', 'Rd'),
    (16247747, 20.900840, -156.373700, '581', 'South', 'Pili Loko', 'St')
) as v(contact_id, latitude, longitude, home_house_num, home_predirection, home_street_name, home_street_type)
where c.contact_id = v.contact_id
My table looks like this ...
| contact_id (bigint) | data (jsonb) |
|---------------------|--------------|
| 111231 | {"email": "albsmith@gmail.com", "home_zip": "07052", "lastname": "Smith", "firstname": "Al", "home_phone": "1111111111", "middlename": "B", "home_address1": "25 Moran Rd", "home_street_name": "Moran Rd"} |
Note that I do not want to overwrite any other fields that may already exist in the jsonb object but are not specified in the update statement. For example, I would not want to overwrite the email or name fields here.
You could restructure your update into something like:
update
contacts c
set
data = c.data || v.data
FROM
(values
(
16247746,
'{
"latitude": 40.814140,
"longitude": -74.259250,
"home_house_num": "25",
"home_predirection": null,
"home_street_name": "Moran",
"home_street_type": "Rd"
}'::JSONB
),
(
16247747,
'{
"latitude": 20.900840,
"longitude": -156.373700,
"home_house_num": "581",
"home_predirection": "Sourth",
"home_street_name": "Pili Loko",
"home_street_type": "St"
}'::JSONB
)
) as v(
contact_id,
data
)
where
c.contact_id = v.contact_id
;
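The key ingredient is jsonb's || concatenation operator: it merges the right-hand object into the left-hand one, overwriting only the keys present on the right and leaving everything else intact. A quick self-contained check:
select '{"email": "a@b.c", "home_zip": "07052"}'::jsonb
       || '{"home_zip": "07050", "latitude": 40.814140}'::jsonb;
-- {"email": "a@b.c", "home_zip": "07050", "latitude": 40.814140}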

Recursive PostgreSQL dynamic average

I have a PostgreSQL dynamic-averaging problem that I cannot solve.
I have data for individuals, with start and end dates for employment, as follows:
"parentid" "Name" "startdate" "enddate"
"01e7de72-843d-4aa5-b3ae-2e2887d1b342" "Isabelle Smith" "2011-05-23" "2016-04-16"
"027ee658-8c4d-4910-b93e-62c0900f2147" "Emelie Blogs" "2012-09-17" "2016-03-16"
"02cbb478-adf3-4a8b-a5aa-ae9f03943ce4" "Joshauh Jow" "2015-04-04" NULL
"0328f382-2845-4623-a940-ab68af5d11cc" "VICTORIA Fred" "2015-05-11" NULL
"03823a20-51bc-4ae5-ab73-79056355ea36" "Elin Tree" "2014-03-24" NULL
"03878ef8-1c3a-4310-b3d5-7b8d18634707" "Michaela Apple" "2011-07-08" NULL
"03c36926-395b-4e3c-9f77-c6214ce763a2" "Immad Cheese" "2012-05-15" NULL
"0436824c-29a6-4140-ba4a-d0f56facd8fc" "Burak Teal" "2009-06-22" NULL
"04d7a07a-0ad4-4091-98d2-a7ff35798b6f" "Roberto Purple" "2015-03-30" "2016-03-01"
"04f32c2f-887f-4e03-be67-bc023aa3a7c2" "Iftikar Orange" "2012-06-27" NULL
"055b690a-153a-49c8-8ac0-112681f79551" "Josef Red" "2014-02-21" "2016-04-13"
"055be2f6-baec-4626-b876-7ff16dc95464" "Harry Green" "2016-03-27" NULL
"05a570b0-ec76-49d9-a742-5bf08f215fec" "Sofie Blue" "2010-06-15" "2016-05-16"
"05c92e7a-efde-44f0-a57c-298cbe129259" "BANARAS Yellow" "2015-06-22" NULL
"05fe0113-9bda-407b-bd72-5bf2a9deae15" "Bengt Drury" "2015-03-30" "2016-06-16"
"063c454f-2e97-48a8-96fc-9e84d29f5d96" "Son That" "2016-03-27" NULL
"07b76b47-8086-4df6-a3da-50dcfcd2de89" "Sam This" "2015-03-21" "2016-05-24"
"082771ee-2f02-4623-abc2-696447f9f791" "Felix This" "2014-11-24" "2016-05-31"
"08e39639-176b-4f44-ae75-1025219730c6" "ROBIN That" "2015-10-26" NULL
"09ab8491-9d9a-4091-b448-8315e3b5d3f0" "Kaziah This" "2016-05-14" NULL
"0a74dd0c-e1ee-4b32-a893-c486f7402363" "Luke Him" "2015-12-16" NULL
"0b098799-7d92-47df-9778-b48edf948af9" "MARIA Her" "2015-05-11" NULL
"0b480b25-8d2b-441b-8039-48b4e9188769" "That Adebayor" "2015-04-09" NULL
"0b86b44e-f3e0-4ddf-8e72-e0d7f9470279" "This Ă…lund" "2012-02-07" "2016-06-05"
"0c3e13d0-f602-41da-b10c-f70072605e63" "First Ekmark" "2013-02-08" NULL
"0d2367f4-a6b4-4381-b7dc-3e0c9063285f" "Anna Check" "2015-03-13" NULL
"0e31731b-0384-43ef-adeb-503ad5a137f9" "Assign Test1" "2015-05-22" NULL
"0e3f8b57-cba2-4240-abd4-d157832ef421" "Ramises Person "2016-10-11" NULL
"0f6af1c8-7672-4f0b-912c-91675cf52845" "Lars Surname" "2016-03-28" NULL
For this report, a user would input two dates, startOfPeriod and endOfPeriod.
I need an SQL statement that, for those dynamic dates, would give me a week-by-week count of the people who were employed during each week of that period.
(A week would constitute each 7 days from the startOfPeriod date.)
Is this possible in PostgreSQL and how would I do it?
Use the type daterange and the overlap operator &&.
The first query in the WITH defines the period; the second generates a series of weeks:
with period(start_of_period, end_of_period) as (
values ('2012-01-20'::date, '2012-02-15'::date)
),
weeks as (
select daterange(d::date, d::date+ 7) a_week
from period,
lateral generate_series (start_of_period, end_of_period, '7d'::interval) d
)
select lower(a_week) start_of_week, count(startdate) -- count matched rows; count(*) would report 1 for weeks with no matches
from weeks
left join a_table
on daterange(startdate, enddate) && a_week
group by 1
order by 1;
start_of_week | count
---------------+-------
2012-01-20 | 4
2012-01-27 | 4
2012-02-03 | 5
2012-02-10 | 5
(4 rows)
The idea is to generate a series of weeks between the start and end dates, compute the starting and ending week of each employment, then count per week.
I've not tested the boundary cases, but it's something the OP could start with:
WITH startDate(d) as (VALUES ('2010-01-01'::DATE))
, endDate(d) as (VALUES ('2016-06-06'::DATE))
, weeks as (select to_char(startDate.d+s.a,'YYYY-WW') as w
from startDate,endDate,generate_series(0,(endDate.d - startDate.d),7) as s(a))
, emp as (select name,to_char(sd,'YYYY-WW') as sw
, to_char(coalesce(ed,endDate.d),'YYYY-WW') as ew
from startDate,endDate,public.so where sd > startDate.d )
SELECT
w.w
,(select ARRAY_AGG(name) from emp Where w.w BETWEEN sw AND ew ) as emps
,(select count(name) from emp Where w.w BETWEEN sw AND ew ) as empCount
FROM weeks w
Test setup
create table public.so (
name TEXT
,sd DATE
,ed DATE
);
INSERT INTO public.so (name,sd,ed) VALUES
('a','2011-05-23','2016-04-16')
,('b','2012-09-17','2016-03-16')
,('c','2009-12-12',null)
,('d','2015-03-30','2016-03-01')
,('e','2012-06-27',null)
,('f','2014-02-21','2016-04-13')
,('g','2016-03-27',null)
,('h','2010-06-15','2016-05-16')
;
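For what it's worth, a sketch combining the two answers: the first answer's daterange/&& query run against this test table (with sd/ed substituted for startdate/enddate):
with period(start_of_period, end_of_period) as (
    values ('2012-01-20'::date, '2012-02-15'::date)
),
weeks as (
    select daterange(d::date, d::date + 7) as a_week
    from period,
         lateral generate_series(start_of_period, end_of_period, '7d'::interval) d
)
select lower(a_week) as start_of_week, count(sd) as employed
from weeks
left join public.so on daterange(sd, ed) && a_week
group by 1
order by 1;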