Get substring from a column with delimiter and varying number of elements

A column in my table in Postgres has varchar values in the format: 'str1/str2' or 'str1/str2/str3' where str represents any string.
I want to write a SELECT query that returns str2. I searched but couldn't find a suitable function.

Use split_part():
SELECT split_part(col, '/', 2) AS result
FROM tbl;
As Victoria pointed out, the index is 1-based.
Obviously, the delimiter needs to be unambiguous. It (/ in your example) cannot be part of a substring. (Unless that's to the right of what you extract, which is ignored anyway.)
Related:
Split comma separated column data into additional columns
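
A minimal, self-contained sketch of the query above, using an inline VALUES list in place of the real table (the sample strings and the column name col are made up for illustration):

```sql
-- Hypothetical sample data standing in for tbl.col
SELECT col,
       split_part(col, '/', 2) AS result   -- second field, counting from 1
FROM  (VALUES ('str1/str2'), ('str1/str2/str3'), ('no_delimiter')) AS t(col);
```

The first two rows should both yield str2. For a value with no second field, split_part() returns an empty string rather than NULL, so wrap the call in NULLIF(..., '') if you would rather get NULL.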

Related

How to add a leading zero when the length of the column is unknown?

How can I add a leading zero to a varchar column in the table and I don't know the length of the column. If the column is not null, then I should add a leading zero.
Examples:
345 - output should be 0345
4567 - output should be 04567
I tried:
SELECT lpad(column1,WHAT TO SPECIFY HERE?, '0')
from table_name;
I will run an update query after I get this.

You may be overthinking this. Use plain concatenation:
SELECT '0' || column1 AS padded_col1 FROM table_name;
If the column is NULL, nothing happens: concatenating anything to NULL returns NULL.
In particular, don't use concat(). You would get '0' for NULL columns, which you do not want.
If you also have empty strings (''), you may need to do more, depending on what you want.
And since you mentioned your plan to update the table: consider not doing this. You would be adding noise that could just as well be added for display with the simple expression above. A VIEW might come in handy for this.
If all your varchar values are in fact valid numbers, use an appropriate numeric data type instead and format for display with the same expression as above. The concatenation automatically produces a text result.
If circumstances should force your hand and you need to update anyway, consider this:
UPDATE table_name
SET column1 = '0' || column1
WHERE column1 IS DISTINCT FROM '0' || column1;
The added WHERE clause avoids empty updates: rows where column1 is NULL are skipped, since the expression would not change them. Compare:
How do I (or can I) SELECT DISTINCT on multiple columns?
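
To see the difference in NULL handling side by side, here is a small sketch that runs without any table. The inline VALUES list and the output column names are assumptions for illustration only:

```sql
-- Hypothetical sample data standing in for table_name.column1
SELECT column1,
       '0' || column1       AS with_operator,  -- NULL input stays NULL
       concat('0', column1) AS with_concat     -- NULL input becomes '0'
FROM  (VALUES ('345'), ('4567'), (NULL)) AS t(column1);
```

The || operator leaves the NULL row untouched, while concat() treats NULL as an empty string and returns '0' for it.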

try concat instead?..
SELECT concat(0::text,column1) from table_name;

Alphanumeric Sorting in PostgreSQL

I have this table with a character varying column in Postgres 9.6:
id | column
------------
1 |IR ABC-1
2 |IR ABC-2
3 |IR ABC-10
I see some solutions typecasting the column as bytea.
select * from table order by column::bytea.
But it always results in:
id | column
------------
1 |IR ABC-1
2 |IR ABC-10
3 |IR ABC-2
I don't know why '10' always comes before '2'. How do I sort this table, assuming the basis for ordering is the last whole number of the string, regardless of what the character before that number is?

When sorting character data types, collation rules apply - unless you work with locale "C", which sorts characters by their byte values. Applying collation rules may or may not be desirable. It makes sorting more expensive in any case. If you want to sort without collation rules, don't cast to bytea, use COLLATE "C" instead:
SELECT * FROM table ORDER BY column COLLATE "C";
However, this does not yet solve the problem with numbers in the string you mention. Split the string and sort the numeric part as number.
SELECT *
FROM table
ORDER BY split_part(column, '-', 2)::numeric;
Or, if all your numbers fit into bigint or even integer, use that instead (cheaper).
I ignored the leading part because you write:
... the basis for ordering is the last whole number of the string, regardless of what the character before that number is.
Related:
Alphanumeric sorting with PostgreSQL
Split comma separated column data into additional columns
What is the impact of LC_CTYPE on a PostgreSQL database?
Typically, it's best to save distinct parts of a string in separate columns as proper respective data types to avoid any such confusion.
And if the leading string is identical for all columns, consider just dropping the redundant noise. You can always use a VIEW to prepend a string for display, or do it on-the-fly, cheaply.
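
If the character before the number is not always '-', one possible variant is to extract the trailing digits with a regular expression instead of split_part(). This is only a sketch: it assumes every value ends in a whole number, and it uses an inline VALUES list with the made-up column name col, since column is a reserved word:

```sql
-- Hypothetical sample data mirroring the table in the question
SELECT *
FROM  (VALUES (1, 'IR ABC-1'), (2, 'IR ABC-2'), (3, 'IR ABC-10')) AS t(id, col)
ORDER  BY substring(col FROM '\d+$')::int;  -- trailing digits, cast to integer
```

Values without trailing digits would produce NULL here and sort last in the default ascending order.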

As suggested in the comments, split the string and cast the integer part:
select *
from
table
cross join lateral
regexp_split_to_array(column, '-') r (a)
order by a[1], a[2]::integer

PostgreSQL query on a text column ignoring special characters

I have a table which contains a text column, say vehicle number.
Now I want to query the table for fields which contain a particular vehicle number.
While matching I do not want to consider non-alphanumeric characters.
example: query condition - DEL123
should match - DEL-123, DEL/123, DEL#123, etc...

If you know which characters to skip, put them as the second parameter of this translate() call (which is faster than regexp functions):
select *
from a_table
where translate(code, '-/#', '') = 'DEL123';
Else, you can compare only alphanumeric characters using regexp_replace():
select *
from a_table
where regexp_replace(code, '[^[:alnum:]]', '', 'g') = 'DEL123';
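
A quick way to compare the two expressions is to run them against a few literal values. The sample codes below are invented for illustration; the last one contains a space, which is not in the translate() list:

```sql
-- Hypothetical sample data standing in for a_table.code
SELECT code,
       translate(code, '-/#', '')                    AS via_translate,
       regexp_replace(code, '[^[:alnum:]]', '', 'g') AS via_regexp
FROM  (VALUES ('DEL-123'), ('DEL/123'), ('DEL#123'), ('DEL 123')) AS t(code);
```

translate() only strips the characters you list, so 'DEL 123' keeps its space, while regexp_replace() removes every non-alphanumeric character.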

@klin's answer is great, but it is not sargable. If you're searching through millions of records (maybe not your case, but perhaps that of someone else with a similar question), a left-anchored regular expression will likely perform much better.
The following can use an index on code (for example a B-tree index created with text_pattern_ops, or any B-tree index in the C locale, since the pattern is anchored at the start), significantly reducing the number of rows tested:
select *
from a_table
where code ~ '^DEL[^[:alnum:]]*123$';
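
Another option, not mentioned in the answers above, is an expression index on the normalized value, which would let the regexp_replace() comparison itself use an index. A rough sketch, assuming the a_table and code names from the queries above (the index name is made up):

```sql
-- Hypothetical index name; indexes the code with non-alphanumerics stripped
CREATE INDEX a_table_code_alnum_idx
ON a_table (regexp_replace(code, '[^[:alnum:]]', '', 'g'));

-- The WHERE expression must match the indexed expression exactly
SELECT *
FROM   a_table
WHERE  regexp_replace(code, '[^[:alnum:]]', '', 'g') = 'DEL123';
```

Whether the planner actually picks the index depends on table size and statistics, so check with EXPLAIN on your own data.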

Similarity in tsv column

I need some help getting the SQL to work in PostgreSQL 9.5.1 using pgAdmin III. I have a column status (data type text) of Facebook statuses in the format they were typed, and another column status_tsv which stores a tsvector of the status column with stop words removed and the words stemmed.
I'd like to find similar statuses by comparing the similarity of the tsvector column in a self-join.
Thus far I have tried using a regexp_replace function combined with the pg_trgm similarity search to keep only the a-zA-Z character set in the tsvector column, but this didn't work because regexp_replace can't operate on tsvector columns, so I've changed the data type of the tsv column to text.
The problem now is that it only compares the similarity of the first word in each row and ignores the rest. Obviously this is no use; I need it to compare the whole row.
My SQL just now looks like
`SELECT * FROM status_table AS x
JOIN status_table AS y
ON ST_Dwithin (x.geom54032, y.geom54032,5000)
WHERE status_similarity (x.tsvector_status, y.tsvector_status) > 0.7
AND x.status_id != y.status_id;`
The status_similarity does this: `(regexp_replace(x.tsvector_status, '[^a-zA-Z]', '', 'g'), regexp_replace(y.tsvector_status, '[^a-zA-Z]', '', 'g'))`, which I'm sure keeps only the a-zA-Z characters from the tsvector_status column.
What must I change to get this to return similar statuses?

Order by char column numerically

How do I sort a character column numerically?
I have a column of numbers stored as chars. When I do an ORDER BY on this column I get the following:
100D
131A
200
21B
30
31000A
etc.
There may be a single letter at the end of a value.
How can I order these chars numerically? Do I need to convert something or is there already an SQL command or function for this?

You could use something like:
ORDER BY Cast(regexp_replace(yourcolumn, '[^0-9]', '', 'g') as integer)
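
Here is a self-contained sketch of that idea using the values from the question. The NULLIF() wrapper and the tie-breaker are additions of mine, not part of the answer above: without NULLIF(), a value containing no digits at all would become an empty string and make the cast fail.

```sql
-- Sample values from the question; col is a made-up column name
SELECT col
FROM  (VALUES ('100D'), ('131A'), ('200'), ('21B'), ('30'), ('31000A')) AS t(col)
ORDER  BY NULLIF(regexp_replace(col, '[^0-9]', '', 'g'), '')::integer,  -- numeric part
          col;                                                          -- tie-breaker
```

This should return 21B, 30, 100D, 131A, 200, 31000A. If the numbers can exceed the integer range, cast to bigint or numeric instead.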
