Convert value to unique value (ex John to John_1) - postgresql

The user enters a name and I want to store it in the database. If the name is already in the database, I want to append a postfix, i.e. convert 'John' to the first available name among 'John_1', 'John_2', etc.
This is my way of doing it so far, but I'm sure there's a better way.
select n from
(
select 'John' n ,0 v
union
select 'John'||'_'||generate_series(1,100),generate_series(1,100)
) possible_names
where n not in
(select my_name from all_names u)
order by v
limit 1
Any suggestions?

If you need to worry about concurrency, the simplest way to guarantee uniqueness is by issuing insert statements until one succeeds. (This assumes you've a unique constraint, of course.)
Pseudocode:
counter = 0
postfix = ''
while true
  if db.execute(insert_sql, [..., name + postfix, ...])
    break
  end
  counter += 1
  postfix = '_' + counter
end
You can make the procedure run in a shorter amount of time by starting at the maximum existing postfix (see the other answers with approaches to do that).
An awkward alternative would be to find the maximum existing postfix using a select statement, and then to try to acquire an advisory lock on something unique to the applicable name and postfix, e.g. 'username:' + name + postfix. It's much less robust though, because it opens up the possibility of two transactions finding the same max_postfix, and then one transaction trying to acquire the lock immediately after the other is done committing its insert and releasing that lock -- thus resulting in a duplicate.
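The insert-until-one-succeeds loop from this answer could be sketched as follows. This is only an illustration under assumptions: Python's built-in sqlite3 module stands in for PostgreSQL, the function name insert_unique_name and the max_tries limit are hypothetical, and all_names.my_name is assumed to carry a UNIQUE constraint.

```python
import sqlite3


def insert_unique_name(conn, name, max_tries=1000):
    """Insert name, appending _1, _2, ... until the UNIQUE constraint accepts it.

    Assumes all_names.my_name has a UNIQUE constraint (hypothetical schema).
    Returns the name that was actually stored.
    """
    for counter in range(max_tries + 1):
        candidate = name if counter == 0 else f"{name}_{counter}"
        try:
            # "with conn" commits on success and rolls back on error
            with conn:
                conn.execute(
                    "INSERT INTO all_names (my_name) VALUES (?)", (candidate,)
                )
            return candidate
        except sqlite3.IntegrityError:
            # the candidate is taken, so move on to the next postfix
            continue
    raise RuntimeError(f"no free name found for {name!r}")
```

Because the database itself rejects duplicates, two concurrent callers can never end up with the same stored name; the loser simply retries with the next postfix.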

SELECT CASE WHEN num IS NULL THEN 'John' ELSE 'John' || '_' || num END AS new_name
FROM (
SELECT max(substr(my_name, position('_' in my_name) + 1)::int) + 1 AS num
FROM all_names
WHERE my_name ilike 'John' || '!_%' escape '!'
) new_number
With all three instances of 'John' being where you pass in the name entered. (This is assuming that the user can't make an underscore part of their name and a number will always follow the underscore.)
Edit: This is also assuming that 'John' and 'john' should be treated the same. If they shouldn't, then replace the ilike with like instead.

CREATE FUNCTION get_username_proposal(text) RETURNS text AS $$
SELECT
CASE WHEN (SELECT COUNT(*) FROM all_names WHERE my_name = $1)=0 THEN
$1
ELSE
$1 || '_' || COALESCE(MAX(LTRIM(SUBSTRING(my_name FROM '_[0-9]+$'), '_')::int), 0)+1
END
FROM
all_names
WHERE
my_name ~ ('^' || $1 || '_[0-9]+$');
$$ LANGUAGE SQL STABLE;


How to make a self-referential window function

I have a table like this:
amount  type  app  owe
1       a     10   10
2       a     8    -2
3       a     20   12
4       i     30   10
5       a     40   10
owe is:
(type == 'a') ? app - sum(owe where amount < current row's amount) : max(app - sum(owe where amount < current row's amount), 0)
So I need a window function that reads the very column it is computing. A frame such as rows between unbounded preceding and the prior row would work, but only over a different column, not the column I'm summing. Is there a way to reference the same column the window function is defining?
I tried an alias
case
when type = a
then app - sum(owe)over(ROWS BETWEEN UNBOUNDED PRECEDING AND 1 preceding) as owe
else
greatest(0,app - sum(owe)over(ROWS BETWEEN UNBOUNDED PRECEDING AND 1 preceding))
end as owe
But since owe doesn't exist when I made it, I get:
owe doesn't exist.
Is there some other way?
You cannot do that with window functions. Your only chance using SQL is a recursive CTE:
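WITH RECURSIVE ordered AS (
   -- number the rows so that each step can pick up the next one
   SELECT amount, type, app,
          row_number() OVER (ORDER BY amount) AS rn
   FROM tab
), tab_owe AS (
   SELECT rn, amount, type, app,
          CASE WHEN type = 'a'
               THEN app
               ELSE GREATEST(app, 0)
          END AS owe,
          CASE WHEN type = 'a'
               THEN app
               ELSE GREATEST(app, 0)
          END AS total   -- running sum of owe so far
   FROM ordered
   WHERE rn = 1
   UNION ALL
   -- aggregates are not allowed in the recursive term, so carry the sum along
   SELECT o.rn, o.amount, o.type, o.app,
          CASE WHEN o.type = 'a'
               THEN o.app - p.total
               ELSE GREATEST(o.app - p.total, 0)
          END,
          p.total + CASE WHEN o.type = 'a'
                         THEN o.app - p.total
                         ELSE GREATEST(o.app - p.total, 0)
                    END
   FROM tab_owe AS p
      JOIN ordered AS o ON o.rn = p.rn + 1
)
SELECT amount, type, app, owe
FROM tab_owe
ORDER BY amount;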
(untested)
This would be much easier to write in procedural code, so consider using a table function.
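To see why procedural code is the easier route, here is a minimal sketch of the same calculation as a plain loop. The function name compute_owe and the tuple layout are assumptions made for illustration; the rule itself is the one stated in the question.

```python
def compute_owe(rows):
    """Compute owe for each (amount, type, app) row, processed in amount order.

    owe = app - (sum of all earlier owe values); for any type other
    than 'a' the result is clamped so that it never drops below 0.
    """
    total = 0  # running sum of owe over the rows handled so far
    result = []
    for amount, kind, app in sorted(rows, key=lambda r: r[0]):
        owe = app - total
        if kind != "a":
            owe = max(owe, 0)
        total += owe
        result.append((amount, kind, app, owe))
    return result


sample = [(1, "a", 10), (2, "a", 8), (3, "a", 20), (4, "i", 30), (5, "a", 40)]
for row in compute_owe(sample):
    print(row)
```

Run on the sample rows from the question, the loop reproduces the owe column shown there (10, -2, 12, 10, 10).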
This is what I came up with. Of course, I'm not a real programmer, so I'm sure there's a smarter way:
insert into mort (amount, "type", app)
values
(1,'a',10),
(2,'a',8),
(3,'a',20),
(4,'i',30),
(5,'a',40);
CREATE OR REPLACE FUNCTION mort_v ()
RETURNS TABLE (
zamount int,
ztype text,
zapp int,
zowe double precision
) AS $$
DECLARE
var_r record;
charlie double precision;
sam double precision;
BEGIN
charlie = 0;
FOR var_r IN(SELECT
amount,
"type",
app
FROM mort order by 1)
LOOP
zamount = var_r.amount;
ztype = var_r.type;
zapp = var_r.app;
sam = var_r.app - charlie;
if ztype = 'a' then
zowe = sam;
else
zowe = greatest(sam, 0);
end if;
charlie = charlie + zowe;
RETURN NEXT;
END LOOP;
END; $$
LANGUAGE 'plpgsql';
select * from mort_v();
So with my limited skills you'll notice I had to add a 'z' in front of the columns that are already in the table so I can spit them out again. If your table has 30 columns you'd normally have to do this 30 times. But I asked a real engineer and he mentioned that if you just return the primary key together with the calculated column, you can join it back to the original table. That's smarter than what I have. If there's an even better solution, that would be great. This does serve as a nice reference for how to do something like a cursor in PostgreSQL and how to declare variables without an '@' in front, as you would in SQL Server.

Postgres reverse LIKE lookup indexing and performance

We have a musicians table containing records with multiple string fields, say:
"Jimi", "Hendrix", "Guitar"
"Phil", "Collins", "Drums"
"Sting", "", "Bass"
"Ringo", "Starr", "Drums"
"Paul", "McCartney", "Bass"
I want to pass postgres a long string, say:
"It is known that Jimi liked to set light to his guitar and smash up
all the drums while on stage."
and I want to get back the records that have any matches, preferably ordered with the most matches first:
"Jimi", "Hendrix", "Guitar"
"Phil", "Collins", "Drums"
"Ringo", "Starr", "Drums"
Because I need the search to be case-insensitive, I'm constructing a query like this:
select * from musicians where lowercase_string like '%'||firstname||'%' or lowercase_string like '%'||lastname||'%' or lowercase_string like '%'||instrument||'%'
and then looping through the results (in Ruby in my case) to find the records with the most matches.
This is, however, very slow in the SQL stage (1 minute+).
I've tried adding a lower-case GIN index using pg_trgm as suggested here, but it's not helping, presumably because the LIKE query is back to front?
Thanks!
With my testing, it seems that no trigram index could help your query at all. And no other index type could possibly speed up an (I)LIKE / FTS based search.
I should mention that all of the queries below can use trigram indexes when they are run "reversed", i.e. when the table contains the document (which is indexed) and your parameter is the query. The (I)LIKE variant, for example, is 2-3 times faster with it.
These are the queries I've tested:
select *
from musicians
where :input_string ilike '%' || firstname || '%'
or :input_string ilike '%' || lastname || '%'
or :input_string ilike '%' || instrument || '%'
At first, FTS seemed like a great idea, but my testing shows that even without ranking, it is 60-100 times slower than the (I)LIKE variant. (So even though you don't have to post-process the results with these methods, they are not worth it.)
select *
from musicians
where to_tsvector(:input_string) @@ (plainto_tsquery(firstname) || plainto_tsquery(lastname) || plainto_tsquery(instrument))
However, ORDER BY rank doesn't slow it down much further: it is 70-120 times slower than the (I)LIKE variant.
select *
from musicians
where to_tsvector(:input_string) @@ (plainto_tsquery(firstname) || plainto_tsquery(lastname) || plainto_tsquery(instrument))
order by ts_rank(to_tsvector(:input_string), plainto_tsquery(firstname) || plainto_tsquery(lastname) || plainto_tsquery(instrument))
Then, for a last effort, I tried the (fairly new) "word similarity" operators of the trigram module: <% and %> (available from PostgreSQL 9.6).
select *
from musicians
where :input_string %> firstname
or :input_string %> lastname
or :input_string %> instrument
select *
from musicians
where firstname <% :input_string
or lastname <% :input_string
or instrument <% :input_string
These were somewhat faster than FTS: around 50-70 times slower than the (I)LIKE variant.
(Partially working) rextester: it is run against PostgreSQL 9.5, so the 9.6 operators obviously won't run here.
Update: If a full word match is enough for you, you can actually reverse your query so that it can use indexes. You'll need to "parse" your query (i.e. the "long string") though:
with long_string(ls) as (
values (:input_string)
),
words(word) as (
select s
from long_string, regexp_split_to_table(ls, '[^[:alnum:]]+') s
where s <> ''
)
select musicians.*
from musicians, words
where firstname ilike word
or lastname ilike word
or instrument ilike word
group by musicians.id
Note: I parsed the query for every complete word. You can have some other logic there, or it can even be parsed in client side.
The default, btree index shines here, as it is much faster than the trigram index with (I)LIKE (we won't need them anyway, as we are looking for complete word match here):
with long_string(ls) as (
values (:input_string)
),
words(word) as (
select s
from long_string, regexp_split_to_table(lower(ls), '[^[:alnum:]]+') s
where s <> ''
)
select musicians.*
from musicians, words
where lower(firstname) = word
or lower(lastname) = word
or lower(instrument) = word
group by musicians.id
http://rextester.com/PSABJ6745
You could even get the match count with something like
sum((lower(firstname) = word)::int
+ (lower(lastname) = word)::int
+ (lower(instrument) = word)::int)
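The note above mentions that the long string can also be parsed on the client side. A minimal sketch of that idea, assuming the rows have already been fetched as (firstname, lastname, instrument) tuples; the function name rank_musicians is hypothetical, and the regular expression only splits on non-ASCII-alphanumeric characters.

```python
import re


def rank_musicians(musicians, long_string):
    """Return (row, match_count) pairs for rows with at least one whole-word match.

    A field counts as a match when its lower-cased value equals one of the
    words of long_string; empty fields never match. Rows are ordered by
    match count, highest first, keeping input order among ties.
    """
    words = {w for w in re.split(r"[^0-9A-Za-z]+", long_string.lower()) if w}
    ranked = []
    for row in musicians:
        matches = sum(1 for field in row if field and field.lower() in words)
        if matches > 0:
            ranked.append((row, matches))
    # sorted() is stable, so ties keep their original order
    return sorted(ranked, key=lambda pair: -pair[1])


musicians = [
    ("Jimi", "Hendrix", "Guitar"),
    ("Phil", "Collins", "Drums"),
    ("Sting", "", "Bass"),
    ("Ringo", "Starr", "Drums"),
    ("Paul", "McCartney", "Bass"),
]
text = ("It is known that Jimi liked to set light to his guitar and smash up "
        "all the drums while on stage.")
for row, matches in rank_musicians(musicians, text):
    print(row, matches)
```

This trades the index lookup for a full pass over the fetched rows, so it only makes sense when the table is small enough to load into the client.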
The ilike option with match ordering:
with long_string (ls) as (values
('It is known that Jimi liked to set light to his guitar and smash up all the drums while on stage.')
)
select musicians.*, matches
from
musicians
cross join
long_string
cross join lateral
(select
(ls ilike format ('%%%s%%', first_name) and first_name != '')::int +
(ls ilike format ('%%%s%%', last_name) and last_name != '')::int +
(ls ilike format ('%%%s%%', instrument) and instrument != '')::int
as matches
) m
where matches > 0
order by matches desc
;
first_name | last_name | instrument | matches
------------+-----------+------------+---------
Jimi | Hendrix | Guitar | 2
Phil | Collins | Drums | 1
Ringo | Starr | Drums | 1

SQL Server trigger not evaluating as insert or update properly

I want to have one trigger to handle updates and inserts. Most of the SQL actions in the trigger apply to both. The only exception is the fields I'm using to record the date and username for an insert and an update. This is what I have, but the updates of the fields used to track update and insert are not firing right. If I insert a new record, I get CreatedBy, CreatedOn, LastEditedBy and LastEditedOn populated, with LastEditedOn 1 second after CreatedOn (which I don't want to happen). When I update the record, only LastEditedBy and LastEditedOn change (which is correct). I'm including my full trigger for reference:
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
-- =================================================================================
-- Author: Paul J. Scipione
-- Create date: 2/15/2012
-- Update date: 6/5/2012
-- Description: To concatenate several fields into a set formatted UnitDescription,
-- to total Span & Loop footages, to set appropriate AcctCode, & track
-- user inserts
-- =================================================================================
IF OBJECT_ID('ProcessCable', 'TR') IS NOT NULL
DROP TRIGGER ProcessCable
GO
CREATE TRIGGER ProcessCable
ON Cable
AFTER INSERT, UPDATE
AS
BEGIN
SET NOCOUNT ON;
-- IF TRIGGER_NESTLEVEL() > 1 RETURN
IF ((SELECT TRIGGER_NESTLEVEL()) > 1 )
RETURN
ELSE
BEGIN
-- record user and date of insert or update
IF EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
ELSE IF NOT EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
-- reset Suffix if applicable
UPDATE Cable SET Suffix = NULL WHERE Suffix = 'n/a'
-- create UnitDescription value
UPDATE Cable SET UnitDescription =
isnull (Type, '') +
isnull (CONVERT (NVARCHAR (10), Size), '') +
'-' +
isnull (CONVERT (NVARCHAR (10), Gauge), '') +
CASE
WHEN ExtraTrench IS NOT NULL AND ExtraTrench > 0 THEN
CASE
WHEN Suffix IS NULL THEN 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')'
ELSE 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')' + Suffix
END
ELSE isnull (Suffix, '')
END
-- convert any accidental negative numbers entered
UPDATE Cable SET Length = ABS(Length)
-- sum Length with LoopFootage into TotalFootage
UPDATE Cable SET TotalFootage = isnull(Length, 0) + isnull(LoopFootage, 0)
-- set proper AcctCode based on Type
UPDATE Cable SET AcctCode =
CASE
WHEN Type IN ('SEA', 'CW', 'CJ') THEN '32.2421.2'
WHEN Type IN ('BFC', 'BJ', 'SEB') THEN '32.2423.2'
WHEN Type IN ('TIP','UF') THEN '32.2422.2'
WHEN Type = 'unknown' OR Type IS NULL THEN 'unknown'
END
WHERE AcctCode IS NULL OR AcctCode = ' '
END
END
GO
A few things jump out at me when I look at your trigger:
You are doing several additional updates rather than a single update (performance-wise, a single update would be better).
Your update statements are unconstrained (there is no JOIN to the inserted/deleted tables to limit the number of records that you perform these additional updates on).
Most of this logic feels like it should be in the application layer rather than in the database; Or, perhaps in some cases implemented differently.
Some quick examples:
Suffix of "n/a" should be removed before inserted.
Cable length absolute value should be done before inserted (with a CHECK CONSTRAINT to verify that bad data cannot be inserted).
TotalFootage should be a computed column so it is always correct.
The Type/AcctCode relationship seems like it should be a column value in a foreign key reference.
But ultimately, I think the reason you are seeing the unexpected dates is because of the unconstrained updates. Without addressing any of the other concerns I brought up above, the statement that sets the audit fields should be more like this:
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN deleted on Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN inserted on Cable.PrimaryKeyColumn = inserted.PrimaryKeyColumn
LEFT JOIN deleted on Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn
WHERE deleted.PrimaryKeyColumn IS NULL

SELECT..CASE - Refactor T-SQL

Can I refactor the SQL CASE expressions below into a single one for each case?
SELECT
CASE RDV.DOMAIN_CODE WHEN 'L' THEN CN.FAMILY_NAME ELSE NULL END AS [LEGAL_FAMILY_NAME],
CASE RDV.DOMAIN_CODE WHEN 'L' THEN CN.GIVEN_NAME ELSE NULL END AS [LEGAL_GIVEN_NAME],
CASE RDV.DOMAIN_CODE WHEN 'L' THEN CN.MIDDLE_NAMES ELSE NULL END AS [LEGAL_MIDDLE_NAMES],
CASE RDV.DOMAIN_CODE WHEN 'L' THEN CN.NAME_TITLE ELSE NULL END AS [LEGAL_NAME_TITLE],
CASE RDV.DOMAIN_CODE WHEN 'P' THEN CN.FAMILY_NAME ELSE NULL END AS [PREFERRED_FAMILY_NAME],
CASE RDV.DOMAIN_CODE WHEN 'P' THEN CN.GIVEN_NAME ELSE NULL END AS [PREFERRED_GIVEN_NAME],
CASE RDV.DOMAIN_CODE WHEN 'P' THEN CN.MIDDLE_NAMES ELSE NULL END AS [PREFERRED_MIDDLE_NAMES],
CASE RDV.DOMAIN_CODE WHEN 'P' THEN CN.NAME_TITLE ELSE NULL END AS [PREFERRED_NAME_TITLE]
FROM dbo.CLIENT_NAME CN
JOIN dbo.REFERENCE_DOMAIN_VALUE RDV
ON CN.NAME_TYPE_CODE = RDV.DOMAIN_CODE AND RDV.REFERENCE_DOMAIN_ID = '7966'
No, you will require 8 separate statements as case and other such variants can only be used in a select to modify the results of a single column, not a series of columns.
If RDV.DOMAIN_CODE can only be 'P' or 'L', use NULLIF. It's cleaner.
NULLIF ( expression , expression )
NULLIF is equivalent to a searched CASE expression in which the two expressions are equal and the resulting expression is NULL.
SELECT
NullIf('P', RDV.DOMAIN_CODE) AS [LEGAL_FAMILY_NAME],
...
NullIf('L', RDV.DOMAIN_CODE) AS [PREFERRED_FAMILY_NAME],
...
Since a CASE expression returns a single value, you cannot take eight CASE expressions returning 8 values and make a single CASE expression that returns all eight.
A less efficient alternative with no cases:
SELECT LEGAL_FAMILY_NAME, LEGAL_GIVEN_NAME, LEGAL_MIDDLE_NAMES, LEGAL_NAME_TITLE,
PREFERRED_FAMILY_NAME, PREFERRED_GIVEN_NAME, PREFERRED_MIDDLE_NAMES, PREFERRED_NAME_TITLE
FROM dbo.REFERENCE_DOMAIN_VALUE RDV
LEFT OUTER JOIN
( SELECT
NAME_TYPE_CODE,
FAMILY_NAME AS [LEGAL_FAMILY_NAME],
GIVEN_NAME AS [LEGAL_GIVEN_NAME],
MIDDLE_NAMES AS [LEGAL_MIDDLE_NAMES],
NAME_TITLE AS [LEGAL_NAME_TITLE]
FROM dbo.CLIENT_NAME
WHERE NAME_TYPE_CODE = 'L') LN ON RDV.DOMAIN_CODE = LN.NAME_TYPE_CODE
LEFT OUTER JOIN
( SELECT
NAME_TYPE_CODE,
FAMILY_NAME AS [PREFERRED_FAMILY_NAME],
GIVEN_NAME AS [PREFERRED_GIVEN_NAME],
MIDDLE_NAMES AS [PREFERRED_MIDDLE_NAMES],
NAME_TITLE AS [PREFERRED_NAME_TITLE]
FROM dbo.CLIENT_NAME
WHERE NAME_TYPE_CODE = 'P') PN ON RDV.DOMAIN_CODE = PN.NAME_TYPE_CODE
WHERE RDV.REFERENCE_DOMAIN_ID = '7966'
You could also use a temp table or table variable with all 8 columns and then do two inserts. You could also use a UNION ALL. My guess is that the 8 case statements are the most efficient way. This is especially true if you have some key where you will want some type of ClientID, Legal Names, Preferred Names, so you will wrap a MAX around the cases or something and group by a ClientID.
You could generate the SQL script using syscols/INFORMATION_SCHEMA.columns with two case whens for each column: 'CASE WHEN DOMAIN_CODE = ''P'' THEN ' + COLUMN_NAME + ' ELSE NULL END AS PREFERRED_' + COLUMN_NAME and then another for L. You could make a loop so that the same code executes once for P and once for L. Then you could EXEC the string directly, or PRINT it and paste it into your SQL script. Anyway, for just 8 columns I would cut/paste the case statements.
In general, T-SQL is limited in its ability to do a "for each" over columns. Either you generate the script using a code generator (written in T-SQL or another programming language) or you rethink your problem in another way. But many times you get better performance from duplicated cut/paste code, and many times it isn't worth the hassle of writing an external code generator (i.e. just for 8 case statements).
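For what it's worth, the external code generator mentioned above can be very small. A sketch in Python, assuming the column names and the 'L'/'P' codes from the question; the function name case_columns and its parameters are hypothetical.

```python
def case_columns(columns, prefixes, code_expr="RDV.DOMAIN_CODE", alias="CN"):
    """Build one CASE expression per (domain code, column) pair.

    prefixes maps a domain code such as 'L' to the output prefix such as
    'LEGAL'. The result is the text of a SELECT list, one column per line.
    """
    lines = []
    for code, prefix in prefixes.items():
        for col in columns:
            lines.append(
                f"CASE {code_expr} WHEN '{code}' THEN {alias}.{col} "
                f"ELSE NULL END AS [{prefix}_{col}]"
            )
    return ",\n".join(lines)


print(case_columns(
    ["FAMILY_NAME", "GIVEN_NAME", "MIDDLE_NAMES", "NAME_TITLE"],
    {"L": "LEGAL", "P": "PREFERRED"},
))
```

The printed text can be pasted into the SELECT list of the original query. The names are interpolated without any escaping, so this is only suitable for trusted, hand-written column lists.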

DESCENDING/ASCENDING Parameter to a stored procedure

I have the following SP
CREATE PROCEDURE GetAllHouses
-- parameter types are assumed here
@webRegionID int = 2,
@sortBy varchar(50) = 'case_no',
@sortDirection varchar(4) = 'ASC'
AS
BEGIN
Select
tbl_houses.*
from tbl_houses
where
postal in (select zipcode from crm_zipcodes where web_region_id = @webRegionID)
ORDER BY
CASE UPPER(@sortBy)
when 'CASE_NO' then case_no
when 'AREA' then area
when 'FURNISHED' then furnished
when 'TYPE' then [type]
when 'SQUAREFEETS' then squarefeets
when 'BEDROOMS' then bedrooms
when 'LIVINGROOMS' then livingrooms
when 'BATHROOMS' then bathrooms
when 'LEASE_FROM' then lease_from
when 'RENT' then rent
else case_no
END
END
GO
Now everything in that SP works, but I want to be able to choose whether to sort ASCENDING or DESCENDING.
I really can't find a solution for that using SQL and can't find anything on Google.
As you can see, I have the parameter sortDirection and I have tried using it in multiple ways, but always with errors. I tried CASE statements, IF statements and so on, but it is complicated by the fact that I want to insert a keyword.
Help would be very much appreciated. I have tried most of the things that come to mind but haven't been able to get it right.
You could use two order by fields:
CASE @sortDirection WHEN 'ASC' THEN
CASE UPPER(@sortBy)
...
END
END ASC,
CASE @sortDirection WHEN 'DESC' THEN
CASE UPPER(@sortBy)
...
END
END DESC
A CASE will evaluate as NULL if none of the WHEN clauses match, so that causes one of the two fields to evaluate to NULL for every row (not affecting the sort order) and the other has the appropriate direction.
One drawback, though, is that you'd need to duplicate your @sortBy CASE statement. You could achieve the same thing using dynamic SQL with sp_executesql and writing an 'ASC' or 'DESC' literal depending on the parameter.
That code is going to get very unmanageable very quickly, as you'll need to double-nest your CASE WHENs: one set for the column to order by, and a nested set for whether it's ASC or DESC.
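The two-sort-key trick can be tried out quickly outside SQL Server. The sketch below uses Python's built-in sqlite3 module as a stand-in, which is an assumption: SQLite accepts the same ORDER BY CASE construct, but the table is cut down to two columns, the sample values are invented, and the function name sorted_rents is hypothetical.

```python
import sqlite3


def sorted_rents(direction):
    """Return the rent values ordered ascending or descending.

    Exactly one of the two CASE sort keys is non-NULL for every row, so
    the other key is constant and has no effect on the order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl_houses (case_no INTEGER, rent INTEGER)")
    conn.executemany(
        "INSERT INTO tbl_houses VALUES (?, ?)",
        [(1, 900), (2, 700), (3, 1200)],  # made-up sample data
    )
    rows = conn.execute(
        """
        SELECT rent
        FROM tbl_houses
        ORDER BY
            CASE :dir WHEN 'ASC' THEN rent END ASC,
            CASE :dir WHEN 'DESC' THEN rent END DESC
        """,
        {"dir": direction},
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(sorted_rents("ASC"))
print(sorted_rents("DESC"))
```

The direction is passed as an ordinary bound parameter, so no SQL text is built from user input.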
Might be better to consider using Dynamic SQL here...
DECLARE @sql nvarchar(max)
SET @sql = '
Select
tbl_houses.*
from tbl_houses
where
postal in (select zipcode from crm_zipcodes where web_region_id = ' + CONVERT(nvarchar(10), @webRegionID) + ') ORDER BY '
SET @sql = @sql + ' ' + @sortBy + ' ' + @sortDirection
EXEC (@sql)
You could do it with some dynamic SQL and calling it with an EXEC. Beware SQL injection though if the user has any control over the parameters.
CREATE PROCEDURE GetAllHouses
-- parameter types are assumed here
@webRegionID int = 2,
@sortBy varchar(50) = 'case_no',
@sortDirection varchar(4) = 'ASC'
AS
BEGIN
DECLARE @dynamicSQL NVARCHAR(MAX)
SET @dynamicSQL =
'
SELECT
tbl_houses.*
FROM
tbl_houses
WHERE
postal
IN
(
SELECT
zipcode
FROM
crm_zipcodes
WHERE
web_region_id = ' + CONVERT(nvarchar(10), @webRegionID) + '
)
ORDER BY
' + @sortBy + ' ' + @sortDirection
EXEC(@dynamicSQL)
END
GO
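Regarding the SQL injection warning: if the ORDER BY clause is assembled as text, one common safeguard is to accept only known column names and directions. A sketch of that check on the calling side, assuming the column list from the question; the names ALLOWED_SORT_COLUMNS and build_order_by are hypothetical.

```python
# Columns the caller may sort by, taken from the CASE list in the question.
ALLOWED_SORT_COLUMNS = {
    "case_no", "area", "furnished", "type", "squarefeets",
    "bedrooms", "livingrooms", "bathrooms", "lease_from", "rent",
}


def build_order_by(sort_by, sort_direction):
    """Return a safe ORDER BY clause.

    Unknown columns fall back to case_no and anything other than DESC
    falls back to ASC, so no user-supplied text reaches the SQL string.
    """
    column = sort_by.strip().lower()
    if column not in ALLOWED_SORT_COLUMNS:
        column = "case_no"
    direction = "DESC" if sort_direction.strip().upper() == "DESC" else "ASC"
    # square brackets guard reserved words such as [type] in T-SQL
    return f"ORDER BY [{column}] {direction}"


print(build_order_by("RENT", "desc"))
print(build_order_by("1; DROP TABLE tbl_houses", "ASC"))
```

The same whitelist idea can be applied inside the procedure itself, before the string is concatenated.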