Why does the Postgres position() function give different results? - postgresql

I am baffled at to why the Postgresql position function gives different results for what seems to be a straightforward test.
Here is query #1:
SELECT count(*)
FROM
dnasample D, ibg_studies ST, subjects S
WHERE
D.studyindex=ST.studyindex
AND ST.studyabrv='CONGER'
AND D.subjectidkey=S.id
AND D.projectindex IS NULL
AND POSITION('Previous subjectid:' in D.comment) IS NULL
which returns a result of 246.
Then here is query #2:
SELECT count(*)
FROM
dnasample D, ibg_studies ST, subjects S
WHERE
D.studyindex=ST.studyindex
AND ST.studyabrv='CONGER'
AND D.subjectidkey=S.id
AND D.projectindex IS NULL
AND POSITION('Previous subjectid:' in D.comment)=0
I do not see why these return such different results?
I've tried reading the Postgres documentation to clarify the difference between zero and the null string, but not much luck there yet...
Thanks in advance,
--Rick

The position(needle in haystack) function will return:
Zero if needle is not NULL and not in haystack and haystack is not NULL.
A positive value if needle is in haystack with one being the beginning of haystack.
NULL if needle or haystack are NULL.
For example:
=> select position('a' in 'a' ) as a,
position('a' in 'b' ) as b,
position('a' in null) as n1,
position(null in 'a' ) as n2,
position(null in null) as n3;
a | b | n1 | n2 | n3
---+---+----+----+----
1 | 0 | | |
That empty spots under n1 through n3 are NULLs.
So this condition:
POSITION('Previous subjectid:' in D.comment) IS NULL
is equivalent to:
D.comment IS NULL
whereas this:
POSITION('Previous subjectid:' in D.comment)=0
will be true if and only if D.comment IS NOT NULL and D.comment does not contain 'Previous subjectid:'.
So the two conditions are quite different.

NULL is actually a no value.
0 is a pool of bit data.
This is the difference.

Related

PostGIS returns record as datatype. This is unexpected

I have this query
WITH buffered AS (
SELECT
ST_Buffer(geom , 10, 'endcap=round join=round') AS geom,
id
FROM line),
hexagons AS (
SELECT
ST_HexagonGrid(10, buffered.geom) AS hex,
buffered.id
FROM buffered
) SELECT * FROM hexagons;
This gives the datatype record in the column hex. This is unexpected. I expect geometry as a datatype. Why is that?
According to the documentation, the function ST_HexagonGrid returns a setof record. These records contain however a geometry attribute called geom, so in order to access the geometry of this record you have to wrap the variable with parenthesis () and call the attribute with a dot ., e.g.
SELECT (hex).geom FROM hexagons;
or just access fetch all attributes using * (in this case, i,j and geom):
SELECT (hex).* FROM hexagons;
Demo (PostGIS 3.1):
WITH j (hex) AS (
SELECT
ST_HexagonGrid(
10,ST_Buffer('LINESTRING(-105.55 41.11,-115.48 37.16,-109.29 29.38,-98.34 27.13)',1))
)
SELECT ST_AsText((hex).geom,2) FROM j;
st_astext
----------------------------------------------------------------------------------------
POLYGON((-130 34.64,-125 25.98,-115 25.98,-110 34.64,-115 43.3,-125 43.3,-130 34.64))
POLYGON((-115 25.98,-110 17.32,-100 17.32,-95 25.98,-100 34.64,-110 34.64,-115 25.98))
POLYGON((-115 43.3,-110 34.64,-100 34.64,-95 43.3,-100 51.96,-110 51.96,-115 43.3))
POLYGON((-100 34.64,-95 25.98,-85 25.98,-80 34.64,-85 43.3,-95 43.3,-100 34.64))
As ST_HexagonGrid returns a setof record, you can access the record atributes using a LATERAL as described here, or just call the function in the FROM clause:
SELECT i,j,ST_AsText(geom,2) FROM
ST_HexagonGrid(
10,ST_Buffer('LINESTRING(-105.55 41.11,-115.48 37.16,-109.29 29.38,-98.34 27.13)',1));
i | j | st_astext
----+---+----------------------------------------------------------------------------------------
-8 | 2 | POLYGON((-130 34.64,-125 25.98,-115 25.98,-110 34.64,-115 43.3,-125 43.3,-130 34.64))
-7 | 1 | POLYGON((-115 25.98,-110 17.32,-100 17.32,-95 25.98,-100 34.64,-110 34.64,-115 25.98))
-7 | 2 | POLYGON((-115 43.3,-110 34.64,-100 34.64,-95 43.3,-100 51.96,-110 51.96,-115 43.3))
-6 | 2 | POLYGON((-100 34.64,-95 25.98,-85 25.98,-80 34.64,-85 43.3,-95 43.3,-100 34.64))
Further reading: How to divide world into cells (grid)

Syntax Error Raised When Trying to update value with XOR in Postgresql

I'm trying to update a column of a salary table in my Postgres database. The script I'm trying to use is:
UPDATE "LC".sex
SET sex = CHAR(ASCII('f') ^ ASCII('m') ^ ASCII(sex));
As this worked in MySQL. However, I got a syntax error:
ERROR: syntax error at or near "ASCII"
LINE 2: SET sex = CHAR(ASCII('f') ^ ASCII('m') ^ ASCII(sex));"
I tried to dig around and tried my luck with CHR()function and then got this:
function chr(double precision) does not exist
I've nearly given up until I tried this:
SELECT CHAR(ASCII('f') ^ ASCII('m') ^ ASCII('f'));
And that gave me the same syntax error, however, SELECT CHAR(ASCII('f') ^ ASCII('m'); does work in Postgres. So I'm critically stumped. What am I doing wrong?
Thanks.
The ^ operator is for exponentiation in PostgreSQL, you'd want # for bitwise XOR, see Mathematical Operators in the fine manual for details.
So you could say:
update "LC".sex
set sex = chr(ascii('f') # ascii('m') # ascii(sex));
However, I'm a little curious about what you're trying to accomplish with all the bit wrangling. If sex is 'f' then you get 'm'; if sex is 'm' then you get 'f'; if sex is null you get null; if sex is anything else then you get nonsense:
=> select sex, chr(ascii('f') # ascii('m') # ascii(sex))
from (values ('f'), ('m'), ('F'), ('M'), (null), ('X'), ('y')) t(sex);
sex | chr
-----+-----
f | m
m | f
F | M
M | F
|
X | S
y | r
(7 rows)
If you just want to flip the sexes then why not say so:
update "LC".sex
set sex = case lower(sex) when 'f' then 'm' when 'm' then 'f' else null end;
A minor modification will preserve case if that's an issue. That will convert anything other than 'f', 'F', 'm', and 'M' to null but presumably that's not an issue.

DB2: Converting varchar to money

I have 2 varchar(64) values that are decimals in this case (say COLUMN1 and COLUMN2, both varchars, both decimal numbers(money)). I need to create a where clause where I say this:
COLUMN1 < COLUMN2
I believe I have to convert these 2 varchar columns to a different data types to compare them like that, but I'm not sure how to go about that. I tried a straight forward CAST:
CAST(COLUMN1 AS DECIMAL(9,2)) < CAST(COLUMN2 AS DECIMAL(9,2))
But I had to know that would be too easy. Any help is appreciated. Thanks!
You can create a UDF like this to check which values can't be cast to DECIMAL
CREATE OR REPLACE FUNCTION IS_DECIMAL(i VARCHAR(64)) RETURNS INTEGER
CONTAINS SQL
--ALLOW PARALLEL -- can use this on Db2 11.5 or above
NO EXTERNAL ACTION
DETERMINISTIC
BEGIN
DECLARE NOT_VALID CONDITION FOR SQLSTATE '22018';
DECLARE EXIT HANDLER FOR NOT_VALID RETURN 0;
RETURN CASE WHEN CAST(i AS DECIMAL(31,8)) IS NOT NULL THEN 1 END;
END
For example
CREATE TABLE S ( C VARCHAR(32) );
INSERT INTO S VALUES ( ' 123.45 '),('-00.12'),('£546'),('12,456.88');
SELECT C FROM S WHERE IS_DECIMAL(c) = 0;
would return
C
---------
£546
12,456.88
It really is that easy...this works fine...
select cast('10.15' as decimal(9,2)) - 1
from sysibm.sysdummy1;
You've got something besides a valid numerical character in your data..
And it's something besides leading or trailing whitespace...
Try the following...
select *
from table
where translate(column1, ' ','0123456789.')
<> ' '
or translate(column2, ' ','0123456789.')
<> ' '
That will show you the rows with alpha characters...
If the above does't return anything, then you've probably got a string with double decimal points or something...
You could use a regex to find those.
There is a built-in ability to do this without UDFs.
The xmlcast function below does "safe" casting between (var)char and decfloat (you may use as double or as decimal(X, Y) instead, if you want). It returns NULL if it's impossible to cast.
You may use such an expression twice in the WHERE clause.
SELECT
S
, xmlcast(xmlquery('if ($v castable as xs:decimal) then xs:decimal($v) else ()' passing S as "v") as decfloat) D
FROM (VALUES ( ' 123.45 '),('-00.12'),('£546'),('12,456.88')) T (S);
|S |D |
|---------|------------------------------------------|
| 123.45 |123.45 |
|-00.12 |-0.12 |
|£546 | |
|12,456.88| |

Extracting values from non-standard markup strings in PostgreSQL

Unfortunately, I have a table like the following:
DROP TABLE IF EXISTS my_list;
CREATE TABLE my_list (index int PRIMARY KEY, mystring text, status text);
INSERT INTO my_list
(index, mystring, status) VALUES
(12, '', 'D'),
(14, '[id] 5', 'A'),
(15, '[id] 12[num] 03952145815', 'C'),
(16, '[id] 314[num] 03952145815[name] Sweet', 'E'),
(19, '[id] 01211[num] 03952145815[name] Home[oth] Alabama', 'B');
Is there any trick to get out number of [id] as integer from the mystring text shown above? As though I ran the following query:
SELECT index, extract_id_function(mystring), status FROM my_list;
and got results like:
12 0 D
14 5 A
15 12 C
16 314 E
19 1211 B
Preferably with only simple string functions and if not regular expression will be fine.
If I understand correctly, you have a rather unconventional markup format where [id] is followed by a space, then a series of digits that represents a numeric identifier. There is no closing tag, the next non-numeric field ends the ID.
If so, you're going to be able to do this with non-regexp string ops, but only quite badly. What you'd really need is the SQL equivalent of strtol, which consumes input up to the first non-digit and just returns that. A cast to integer will not do that, it'll report an error if it sees non-numeric garbage after the number. (As it happens I just wrote a C extension that exposes strtol for decoding hex values, but I'm guessing you don't want to use C extensions if you don't even want regex...)
It can be done with string ops if you make the simplifying assumption that an [id] nnnn tag always ends with either end of string or another tag, so it's always [ at the end of the number. We also assume that you're only interested in the first [id] if multiple appear in a string. That way you can write something like the following horrible monstrosity:
select
"index",
case
when next_tag_idx > 0 then substring(cut_id from 0 for next_tag_idx)
else cut_id
end AS "my_id",
"status"
from (
select
position('[' in cut_id) AS next_tag_idx,
*
from (
select
case
when id_offset = 0 then null
else substring(mystring from id_offset + 4)
end AS cut_id,
*
from (
select
position('[id] ' in mystring) AS id_offset,
*
from my_list
) x
) y
) z;
(If anybody ever actually uses that query for anything, kittens will fall from the sky and splat upon the pavement, wailing in horror all the way down).
Or you can be sensible and just use a regular expression for this kind of string processing, in which case your query (assuming you only want the first [id]) is:
regress=> SELECT
"index",
coalesce((SELECT (regexp_matches(mystring, '\[id\]\s?(\d+)'))[1])::integer, 0) AS my_id,
status
FROM my_list;
index | my_id | status
-------+----------------+--------
12 | 0 | D
14 | 5 | A
15 | 12 | C
16 | 314 | E
19 | 01211 | B
(5 rows)
Update: If you're having issues with unicode handling in regex, upgrade to Pg 9.2. See https://stackoverflow.com/a/14293924/398670

Split a string and populate a table for all records in table in SQL Server 2008 R2

I have a table EmployeeMoves:
| EmployeeID | CityIDs
+------------------------------
| 24 | 23,21,22
| 25 | 25,12,14
| 29 | 1,2,5
| 31 | 7
| 55 | 11,34
| 60 | 7,9,21,23,30
I'm trying to figure out how to expand the comma-delimited values from the EmployeeMoves.CityIDs column to populate an EmployeeCities table, which should look like this:
| EmployeeID | CityID
+------------------------------
| 24 | 23
| 24 | 21
| 24 | 22
| 25 | 25
| 25 | 12
| 25 | 14
| ... and so on
I already have a function called SplitADelimitedList that splits a comma-delimited list of integers into a rowset. It takes the delimited list as a parameter. The SQL below will give me a table with split values under the column Value:
select value from dbo.SplitADelimitedList ('23,21,1,4');
| Value
+-----------
| 23
| 21
| 1
| 4
The question is: How do I populate EmployeeCities from EmployeeMoves with a single (even if complex) SQL statement using the comma-delimited list of CityIDs from each row in the EmployeeMoves table, but without any cursors or looping in T-SQL? I could have 100 records in the EmployeeMoves table for 100 different employees.
This is how I tried to solve this problem. It seems to work and is very quick in performance.
INSERT INTO EmployeeCities
SELECT
em.EmployeeID,
c.Value
FROM EmployeeMoves em
CROSS APPLY dbo.SplitADelimitedList(em.CityIDs) c;
UPDATE 1:
This update provides the definition of the user-defined function dbo.SplitADelimitedList. This function is used in above query to split a comma-delimited list to table of integer values.
CREATE FUNCTION dbo.fn_SplitADelimitedList1
(
#String NVARCHAR(MAX)
)
RETURNS #SplittedValues TABLE(
Value INT
)
AS
BEGIN
DECLARE #SplitLength INT
DECLARE #Delimiter VARCHAR(10)
SET #Delimiter = ',' --set this to the delimiter you are using
WHILE len(#String) > 0
BEGIN
SELECT #SplitLength = (CASE charindex(#Delimiter, #String)
WHEN 0 THEN
datalength(#String) / 2
ELSE
charindex(#Delimiter, #String) - 1
END)
INSERT INTO #SplittedValues
SELECT cast(substring(#String, 1, #SplitLength) AS INTEGER)
WHERE
ltrim(rtrim(isnull(substring(#String, 1, #SplitLength), ''))) <> '';
SELECT #String = (CASE ((datalength(#String) / 2) - #SplitLength)
WHEN 0 THEN
''
ELSE
right(#String, (datalength(#String) / 2) - #SplitLength - 1)
END)
END
RETURN
END
Preface
This is not the right way to do it. You shouldn't create comma-delimited lists in SQL Server. This violates first normal form, which should sound like an unbelievably vile expletive to you.
It is trivial for a client-side application to select rows of employees and related cities and display this as a comma-separated list. It shouldn't be done in the database. Please do everything you can to avoid this kind of construction in the future. If at all possible, you should refactor your database.
The Right Answer
To get the list of cities, properly expanded, from a table containing lists of cities, you can do this:
INSERT dbo.EmployeeCities
SELECT
M.EmployeeID,
C.CityID
FROM
EmployeeMoves M
CROSS APPLY dbo.SplitADelimitedList(M.CityIDs) C
;
The Wrong Answer
I wrote this answer due to a misunderstanding of what you wanted: I thought you were trying to query against properly-stored data to produce a list of comma-separated CityIDs. But I realize now you wanted the reverse: to query the list of cities using existing comma-separated values already stored in a column.
WITH EmployeeData AS (
SELECT
M.EmployeeID,
M.CityID
FROM
dbo.SplitADelimitedList ('23,21,1,4') C
INNER JOIN dbo.EmployeeMoves M
ON Convert(int, C.Value) = M.CityID
)
SELECT
E.EmployeeID,
CityIDs = Substring((
SELECT ',' + Convert(varchar(max), CityID)
FROM EmployeeData C
WHERE E.EmployeeID = C.EmployeeID
FOR XML PATH (''), TYPE
).value('.[1]', 'varchar(max)'), 2, 2147483647)
FROM
(SELECT DISTINCT EmployeeID FROM EmployeeData) E
;
Part of my difficulty in understanding is that your question is a bit disorganized. Next time, please clearly label your example data and show what you have, and what you're trying to work toward. Since you put the data for EmployeeCities last, it looked like it was what you were trying to achieve. It's not a good use of people's time when questions are not laid out well.