Update telephone number format to include country identifier - postgresql

Being a beginner in SQL, I am trying to update the telephone fields so they follow the format '"+" + country identifier + telephone number'.
UPDATE public.contact
SET phone_number = CASE
        WHEN country_code = 'FR'
             AND phone_number NOT LIKE '+33%'
             AND phone_number <> NULL
        THEN CONCAT('+33', phone_number)
        WHEN country_code = 'GB'
             AND phone_number NOT LIKE '+44%'
             AND phone_number <> NULL
        THEN CONCAT('+44', phone_number)
        ELSE phone_number
    END;
I want to update the telephone number format to include the country identifier, e.g. 0606080905 -> +33606080905 if country_code = 'FR'. I am looking for a faster and less complex way than what I did.

You can do this with a regular expression using regexp_replace.
Imagine your data being:
Table 'numbers':
+---------+--------------+
| country | phone        |
+---------+--------------+
| FR      | 0606080905   |
| FR      | +33606080906 |
| GB      | 0123456789   |
| GB      | +44987654321 |
| GB      | NULL         |
+---------+--------------+
Then the following update would replace the leading 0 with the country code +33 for all numbers that do not start with a +xx and have FR as country.
UPDATE numbers
SET phone = REGEXP_REPLACE(trim(phone), '^(0)', '+33')
WHERE country = 'FR'
Explained:
the ^ means start of the string
the (0) is the match that gets replaced (leading zero)
the +33 is the string that is used to replace it
the trim() is just added for safety, in case there are leading spaces
NULL phone numbers won't be affected, as they do not match
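For instance, applied to a single value:
SELECT regexp_replace(trim(' 0606080905'), '^(0)', '+33');  -- returns +33606080905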
You could do this now as you did before, with a CASE WHEN or something similar for each of the different possibilities. But since the expression is always the same, an easier way would be to have your country codes and their numerical mapping in a separate table:
Table 'mapping':
+---------+--------+
| country | prefix |
+---------+--------+
| FR      | +33    |
| GB      | +44    |
+---------+--------+
You could then do
UPDATE numbers n
SET phone = REGEXP_REPLACE(trim(phone), '^(0)', m.prefix)
FROM mapping m
WHERE m.country = n.country
and update all your numbers in one go:
+---------+--------------+
| country | phone        |
+---------+--------------+
| FR      | +33606080905 |
| FR      | +33606080906 |
| GB      | +44123456789 |
| GB      | +44987654321 |
| GB      | NULL         |
+---------+--------------+
EDIT: Previously, I had this needlessly complicated answer. You may need something like this if your phone number patterns are more diverse...
The following update would replace the leading 0 with the country code +33 for all numbers that do not start with a +xx and have FR as country.
UPDATE numbers
SET phone = REGEXP_REPLACE(trim(phone), '^(?<![+]\d{2})(0)', '+33')
WHERE country = 'FR'
Explained:
the (?<![+]\d{2}) is a negative lookbehind assertion that makes sure the regex only matches if the 0 is not preceded by a + followed by two digits (redundant here, since ^ already anchors the match to the start of the string)
the (0) is the match that gets replaced
the +33 is the string that is used to replace it
the trim() is just added for safety, in case there are leading spaces
NULL phone numbers won't be affected, as they do not match

That's about as simple as it gets.
The only way I can imagine to speed up processing is to add a WHERE condition that avoids updating the rows that don't have to be modified.
You could also run several such statements in parallel, where each modifies a different part of the table.
As mentioned in the comment, <> NULL is never true; use IS NOT NULL instead.
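For instance, a minimal sketch along those lines (reusing the numbers and mapping tables from the answer above) that skips the rows that don't need modification:
UPDATE numbers n
SET phone = regexp_replace(trim(n.phone), '^0', m.prefix)
FROM mapping m
WHERE m.country = n.country
  AND n.phone IS NOT NULL     -- IS NOT NULL, never <> NULL
  AND n.phone NOT LIKE '+%';  -- already-prefixed rows are not touched at all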

Related

create JSONB array grouped from column values with incrementing integers

For a PostgreSQL table, suppose the following data is in table A:
key_path        | key     | value
----------------+---------+-----------
foo[1]__scrog   | scrog   | apple
foo[2]__scrog   | scrog   | orange
bar             | bar     | peach
baz[1]__biscuit | biscuit | watermelon
The goal is to group data when there is an incrementing number present for an otherwise identical value for column key_path.
For context, key_path is a JSON key path and key is the leaf key. The desired outcome would be:
key_path_group                 | key     | values
-------------------------------+---------+----------------
[foo[1]__scrog, foo[2]__scrog] | scrog   | [apple, orange]
bar                            | bar     | peach
[baz[1]__biscuit]              | biscuit | [watermelon]
Also noting that for key_path=baz[1]__biscuit even though there is only a single incrementing value, it still triggers casting to an array of length 1.
Any tips or suggestions much appreciated!
May have answered my own question (sometimes just typing it out helps). The following gets very close to, if not exactly, what I'm looking for:
select
    regexp_replace(key_path, '(.*)\[(\d+)\](.*)', '\1[x]\3') as key_path_group,
    key,
    jsonb_agg(value) as "values"  -- quoted because VALUES is a reserved word
from A
group by key_path_group, key;
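If the original key paths should also be collected, as in the desired outcome, a hedged variant that groups on the normalized expression and aggregates both columns (every group, including 'bar', then comes back as an array):
select
    jsonb_agg(key_path) as key_path_group,
    key,
    jsonb_agg(value) as "values"
from A
group by regexp_replace(key_path, '(.*)\[(\d+)\](.*)', '\1[x]\3'), key;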

Power Query: some values are counted from zero when count should start at a higher value

I have a column with values counting occurrences.
I am trying to continue the series in Power Query.
I am thus trying to add 1 to the max of the given column.
The ID column has rows with letter tags: AB or BE. Specific numeric ranges follow these letters. For both AB and BE, numbers range from 0000 to 3000 and from 3001 to 6000.
I thus have the following possibilities: from AB0000 to AB3000, from AB3001 to AB6000, from BE0000 to BE3000, from BE3001 to BE6000.
Each category matches a specific item in my geography column, from the other workbook: from AB0000 to AB3000, it is ItalyZ; from AB3001 to AB6000, it is ItalyB; from BE0000 to BE3000, it is UKY; from BE3001 to BE6000, it is UKM.
I am thus trying to find the highest number associated with the first AB category, the second AB category, the first BE category, and the second.
My issue is that for some values, there is simply "nothing" yet in the source file.
This means that there is no occurrence yet of UKM for example.
Here is an example with no UKM or UKY:
|------|-----------|
| Max  | Geography |
|------|-----------|
| 0562 | ItalyZ    |
| 0563 | ItalyZ    |
|------|-----------|
Hence, I have the following result:
|-----------|--------|
| Increment | Place  |
|-----------|--------|
| 0564      | ItalyZ |
| 0565      | ItalyZ |
| 0565      | ItalyZ |
| null      | UKM    |
|-----------|--------|
Here is the used power query code:
let
    Source = #table({"Prefix", "Seq_Start", "Seq_End", "GeoLocation"}, {{"AB", 0, 2999, "ItalyZ"}, {"AB", 3000, 6000, "ItalyB"}, {"BE", 0, 2999, "UKY"}, {"BE", 3000, 6000, "UKM"}}),
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Seq_Start", Int64.Type}, {"Seq_End", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Prefix"}, HighestID, {"Prefix"}, "HighestID", JoinKind.LeftOuter),
    #"Expanded HighestID" = Table.ExpandTableColumn(#"Merged Queries", "HighestID", {"Number"}, {"Number"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded HighestID", each [Number] >= [Seq_Start] and [Number] <= [Seq_End]),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Prefix", "Seq_Start", "Seq_End", "GeoLocation"}, {{"NextSeq", each List.Max([Number]) + 1, type number}})
in
    #"Grouped Rows"
I would like to know how I could ensure that when I have the first occurrence of a value, I get "0000" (or 0) rather than "null", and so on for the next occurrences.
Because, for example, if I have 0 occurrences of UKY before, I do not know why, but the end result will be as follows:
|-----------|-------|
| Increment | Place |
|-----------|-------|
| 1         | UKM   |
| 2         | UKM   |
|-----------|-------|
Which is not ideal, because UKM should start at 3000. And because I had no values recorded before, it starts with "null" only and then 1, 2... rather than 3001 and 3002.

Computation of the number of uppercase letters in a string

I've bumped into a seemingly simple problem that I'm unable to solve. I would like to determine whether the number of uppercase letters is greater than the number of lowercase letters (ignoring special characters, spaces, etc.).
Example
id | text        | upper_greater_lower | note
---+-------------+---------------------+---------------------------
1  | Hello World | False               | because |HW| < |elloorld|
2  | The XYZ     | True                | because |TXYZ| > |he|
3  | Foo!!!      | False               | because |F| < |oo|
4  | BAr???      | True                | because |BA| > |r|
My initial idea was to determine the number of lowercase letters, then the number of uppercase letters, and finally compare them. However, I'm unable to do so in any elegant and efficient way.
I expect handling ~30M rows with ~300 character each.
What would you suggest?
Thanks!
Using regular expression magic, that could be:
SELECT length(regexp_replace(textcol, '[^[:upper:]]', '', 'g'))
> length(regexp_replace(textcol, '[^[:lower:]]', '', 'g'))
FROM atable;
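To see what each side counts, a quick check against one of the example strings:
SELECT length(regexp_replace('The XYZ', '[^[:upper:]]', '', 'g'));  -- 4 (T, X, Y, Z)
SELECT length(regexp_replace('The XYZ', '[^[:lower:]]', '', 'g'));  -- 2 (h, e)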

Sane way to store different data types within same column in postgres?

I'm currently attempting to modify an existing API that interacts with a postgres database. Long story short, it essentially stores descriptors/metadata to determine where an actual 'asset' (typically a file of some sort) is stored on the server's hard disk.
Currently, it's possible to 'tag' these 'assets' with any number of undefined key-value pairs (i.e. uploadedBy, addedOn, assetType, etc.). These tags are stored in a separate table with a structure similar to the following:
+---------------+----------------+-------------+
|assetid (text) | tagid(integer) | value(text) |
|---------------+----------------+-------------|
|someStringValue| 1234 | someValue |
|---------------+----------------+-------------|
|aDiffStringKey | 1235 | a username |
|---------------+----------------+-------------|
|aDiffStrKey | 1236 | Nov 5, 1605 |
+---------------+----------------+-------------+
assetid and tagid are foreign keys from other tables. Think of the assetid representing a file and the tagid/value pair is a map of descriptors.
Right now, the API (which is in Java) creates all these key-value pairs as a Map object. This includes things like timestamps/dates. What we'd like to do is to somehow be able to store different types of data for the value in the key-value pair. Or at least, storing it differently within the database, so that if we needed to, we could run queries checking date-ranges and the like on these tags. However, if they're stored as text items in the db, then we'd have to a.) Know that this is actually a date/time/timestamp item, and b.) convert into something that we could actually run such a query on.
There is only 1 idea I could think of thus far, without completely changing the layout of the db.
It is to expand the assettag table (shown above) to have additional columns for various types (numeric, text, timestamp), allow them to be null, and then on insert, checking the corresponding 'key' to figure out what type of data it really is. However, I can see a lot of problems with that sort of implementation.
Can any PostgreSQL-Ninjas out there offer a suggestion on how to approach this problem? I'm only recently getting thrown back into the deep-end of database interactions, so I admit I'm a bit rusty.
You've basically got two choices:
Option 1: A sparse table
Have one column for each data type, but only use the column that matches the data type you want to store. Of course this leads to most columns being null - a waste of space, but the purists like it because of the strong typing. It's a bit clunky having to check each column for null to figure out which datatype applies. Also, too bad if you actually want to store a null - then you must choose a specific value that "means null" - more clunkiness.
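A minimal sketch of Option 1, with column names assumed (the original assettag columns plus one column per type):
CREATE TABLE assettag (
    assetid         text,
    tagid           integer,
    value_text      text,
    value_numeric   numeric,
    value_ts        timestamp,
    -- enforce that exactly one typed column is set (num_nonnulls needs PostgreSQL 9.6+)
    CHECK (num_nonnulls(value_text, value_numeric, value_ts) = 1)
);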
Option 2: Two columns - one for content, one for type
Everything can be expressed as text, so have a text column for the value, and another column (int or text) for the type, so your app code can restore the correct value in the correct type object. Good things are you don't have lots of nulls, but importantly you can easily extend the types to something beyond SQL data types to application classes by storing their value as json and their type as the class name.
I have used option 2 several times in my career and it was always very successful.
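A comparable sketch of Option 2 (again, table and column names assumed):
CREATE TABLE assettag2 (
    assetid    text,
    tagid      integer,
    value      text,  -- everything stored as text (or json)
    value_type text   -- e.g. 'text', 'numeric', 'timestamp', or an application class name
);

-- a date-range query then casts on the fly:
SELECT assetid
FROM assettag2
WHERE value_type = 'timestamp'
  AND value::timestamp BETWEEN '2019-01-01' AND '2019-12-31';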
Another option, depending on what you're doing, could be to have just one value column but store some json around the value...
This could look something like:
{
    "type": "datetime",
    "value": "2019-05-31 13:51:36"
}
That could even go a step further, using a Json or XML column.
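If the value column is jsonb, such wrapped values stay directly queryable; a hedged sketch, with the column names assumed from the examples above:
SELECT assetid
FROM assettag
WHERE value->>'type' = 'datetime'             -- ->> extracts the field as text
  AND (value->>'value')::timestamp < now();   -- cast on the fly for range checks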
I'm not in any way a PostgreSQL ninja, but I think that instead of two columns (one for the value and one for the type) you could look at the hstore data type:
data type for storing sets of key/value pairs within a single
PostgreSQL value. This can be useful in various scenarios, such as
rows with many attributes that are rarely examined, or semi-structured
data. Keys and values are simply text strings.
Of course, you have to check how dates/timestamps convert into and out of this type and see if it works for you.
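A minimal hstore sketch (the extension must be installed first; key names taken from the question):
CREATE EXTENSION IF NOT EXISTS hstore;
SELECT 'uploadedBy => alice, addedOn => "2019-05-31"'::hstore -> 'addedOn';
-- returns the text '2019-05-31'; typed use still needs an explicit cast, e.g. (...)::date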
You can use 2 different techniques:
If the value type can vary for every tagid
Define a table name and row ID for every tagid-assetid combination, plus the actual data tables:
maintable:
+---------------+----------------+-----------------+---------------+
|assetid (text) | tagid(integer) | tablename(text) | table_id(int) |
|---------------+----------------+-----------------+---------------|
|someStringValue| 1234 | tablebool | 123 |
|---------------+----------------+-----------------+---------------|
|aDiffStringKey | 1235 | tablefloat | 123 |
|---------------+----------------+-----------------+---------------|
|aDiffStrKey | 1236 | tablestring | 123 |
+---------------+----------------+-----------------+---------------+
tablebool
+-------------+-------------+
| id(integer) | value(bool) |
|-------------+-------------|
| 123 | False |
+-------------+-------------+
tablefloat
+-------------+--------------+
| id(integer) | value(float) |
|-------------+--------------|
| 123 | 12.345 |
+-------------+--------------+
tablestring
+-------------+---------------+
| id(integer) | value(string) |
|-------------+---------------|
| 123 | 'text' |
+-------------+---------------+
In case every tagid has a fixed type
Create a tagid description table:
tag descriptors
+---------------+----------------+-----------------+
|assetid (text) | tagid(integer) | tablename(text) |
|---------------+----------------+-----------------|
|someStringValue| 1234 | tablebool |
|---------------+----------------+-----------------|
|aDiffStringKey | 1235 | tablefloat |
|---------------+----------------+-----------------|
|aDiffStrKey | 1236 | tablestring |
+---------------+----------------+-----------------+
and corresponding data tables
tablebool
+-------------+----------------+-------------+
| id(integer) | tagid(integer) | value(bool) |
|-------------+----------------+-------------|
| 123 | 1234 | False |
+-------------+----------------+-------------+
tablefloat
+-------------+----------------+--------------+
| id(integer) | tagid(integer) | value(float) |
|-------------+----------------+--------------|
| 123 | 1235 | 12.345 |
+-------------+----------------+--------------+
tablestring
+-------------+----------------+---------------+
| id(integer) | tagid(integer) | value(string) |
|-------------+----------------+---------------|
| 123 | 1236 | 'text' |
+-------------+----------------+---------------+
All this is just a general idea. You should adapt it to your needs.
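For instance, reading the float values back out of the first layout might look like this (names taken from the tables above):
SELECT m.assetid, f.value
FROM maintable m
JOIN tablefloat f ON f.id = m.table_id   -- table_id points at the row in the typed table
WHERE m.tablename = 'tablefloat';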

Update a single value in a database table through form submission

Here is my table in the database :
id | account_name | account_number | account_type | address | email | ifsc_code | is_default_account | phone_num | User
-----+--------------+----------------+--------------+---------+------------------------------+-----------+--------------------+-------------+----------
201 | helloi32irn | 55265766432454 | Savings | | mypal.appa99721989#gmail.com | 5545 | f | 98654567876 | abc
195 | hello | 55265766435523 | Savings | | mypal.1989#gmail.com | 5545 | t | 98654567876 | axyz
203 | what | 01010101010101 | Current | | guillaume#sample.com | 6123 | f | 09099990 | abc
The form in the view posts only a single parameter, which in my case is name="activate" and corresponds to the column "is_default_account" in the table.
I want to change the value of "is_default_account" from "t" to "f". For example, in the table here, for account_name "hello" it is "t". I want to deactivate it, i.e. make it "f", and activate whichever account has been sent through the form.
This will update your table and make account 'what' default (assuming that is_default_account is BOOLEAN field):
UPDATE table
SET is_default_account = (account_name = 'what')
You may want to limit updates if the table holds more than just the few rows you listed, like this:
UPDATE table
SET is_default_account = (account_name = 'what')
WHERE is_default_account != (account_name = 'what')
AND <limit updates by some other criteria like user name>
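The same idea keyed by id rather than name, using the sample rows from the question (id 203 is account 'what'; the table name is assumed, the "User" column is taken from the sample):
UPDATE accounts
SET is_default_account = (id = 203)
WHERE "User" = 'abc';  -- limit the flip to this user's accounts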
I think to accomplish what you want, you should send at least two values from the form: one for the id of the account you want to update, and the other for the action ("activate" here). You could also just send the id and have it toggle. There are many ways to do this, but I can't figure out exactly what you are trying to do and whether you want SQL or Play Framework code. Without limiting your update somewhere (like by id) you can't precisely control which specific rows get updated. Please clarify your question and add some more code if you want help on the Play Framework side, which I would think you do.