I am trying to run a regular expression search on a table in PostgreSQL, but I am not able to find the right syntax to do it in a single statement.
I also don't know whether, if I create an index column, I will be able to search it by regular expression. Could anyone help me, please?
The query is like this:
SELECT *
FROM flight_info
where to_tsvector('english', event_id || ' ' || type || ' ' || category || ' ' || number || ' ' || make || ' ' || model) @@ to_tsquery('Incident');
Here 'Incident' is matched as exact text, but I need to search by regular expression, the way LIKE does.
Or is using a LIKE clause on every column the only way?
Or is there a way to write one LIKE-style condition that applies to many columns in the query?
You will not be able to mix regular expressions and full text search: it is not possible to use a regex within to_tsquery().
You can use regexp_matches() instead:
SELECT regexp_matches('foobarbequebaz', '(bar)(beque)');
regexp_matches
----------------
{bar,beque}
(1 row)
SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g');
regexp_matches
----------------
{bar,beque}
{bazil,barf}
(2 rows)
More info at https://www.postgresql.org/docs/9.6/static/functions-matching.html
Update:
To use a regex in the WHERE clause, use the ~ operator:
-- Sample schema
CREATE TABLE sample_table (
id serial,
col1 text,
col2 text,
col3 text
);
INSERT INTO sample_table (col1, col2, col3) VALUES ('this', 'is', 'foobarbequebazilbarfbonk');
INSERT INTO sample_table (col1, col2, col3) VALUES ('apple foobarbequebazequeb', 'rocky', 'sunny');
INSERT INTO sample_table (col1, col2, col3) VALUES ('not', 'a', 'match');
-- Query
SELECT *
FROM sample_table
WHERE (col1 || col2 || col3) ~ '(bar)(beque)';
which returns:
id | col1 | col2 | col3
----+---------------------------+-------+--------------------------
1 | this | is | foobarbequebazilbarfbonk
2 | apple foobarbequebazequeb | rocky | sunny
(2 rows)
You can use ~* instead to make it case insensitive.
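Applied to the query from the question, a rough sketch could look like the following (assuming the listed flight_info columns; concat_ws converts its arguments to text and skips NULLs, so a NULL column does not blank out the whole string, but it requires 9.1+):
SELECT *
FROM flight_info
WHERE concat_ws(' ', event_id, type, category, number, make, model) ~* 'inci.ent';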
More info: https://www.postgresql.org/docs/current/static/functions-matching.html
begin;
create table test101(col1 int default 2, col2 text default 'hello world');
insert into test101 values (1,default);
insert into test101 values (default,default);
insert into test101 values (default,'dummy');
insert into test101 values (5,'dummy');
commit;
UPDATE with DEFAULT works:
update test101 set col2 = default where col1 = 4;
SELECT and DELETE do not:
select * from test101 where col1 = COALESCE (col1,default);
delete from test101 where col1 = COALESCE (col1,default);
error code:
ERROR: 42601: DEFAULT is not allowed in this context
LINE 1: delete from test101 where col1 = COALESCE (col1,default);
^
LOCATION: transformExprRecurse, parse_expr.c:285
I also tried: delete from test101 where col1 = default;
The default value is not easy to get at directly. Given Get the default values of table columns in Postgres?, a SELECT or DELETE that compares against a column's default does not seem that far-fetched.
In the question you linked to they do:
SELECT column_name, column_default
FROM information_schema.columns
WHERE (table_schema, table_name) = ('public', 'test101')
ORDER BY ordinal_position;
which produces something like:
column_name | column_default
-------------+---------------------
col1 | 2
col2 | 'hello world'::text
Maybe you can combine this query with your query? (But I would not recommend it, because ...)
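For completeness, a rough sketch of such a combination, assuming the test101 table above and a default that is a plain literal (it will not work for expression defaults such as nextval(...), and the text value has to be cast back to the column type):
DELETE FROM test101
WHERE col1 = (
    SELECT column_default::int
    FROM information_schema.columns
    WHERE (table_schema, table_name, column_name) = ('public', 'test101', 'col1')
);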
Does anyone know how string_agg results need to be "massaged" so they can be used in an IN clause?
The following is some sample code. Thanks for your time.
P.S.: Before scratching your head and asking why, I'm only using this code to illustrate the string_agg problem; as you can see, the query is otherwise a bit pointless.
Henry
WITH TEMP AS
(
SELECT 'John' AS col1
UNION ALL
SELECT 'Peter' AS col1
UNION ALL
SELECT 'Henry' AS col1
UNION ALL
SELECT 'Mo' AS col1
)
-- results that are being used in the IN statement
--SELECT string_agg('''' || col1::TEXT || '''',',') AS col1 FROM TEMP
SELECT col1 FROM TEMP
WHERE col1 IN
(
SELECT string_agg('''' || col1::TEXT || '''',',') AS col1
FROM TEMP
)
You can't mix dynamic code with static code. Your example is not very clear as to what exactly it is that you want to do. Your sample could be written as:
WITH TEMP(col1) AS (values ('John'), ('Peter'), ('Henry'), ('Mo'))
SELECT col1 FROM TEMP
WHERE col1 IN (SELECT col1 FROM TEMP)
or using an array:
WITH TEMP(col1) AS (values ('John'), ('Peter'), ('Henry'), ('Mo'))
SELECT col1 FROM TEMP
WHERE col1 = ANY (ARRAY(SELECT col1 FROM TEMP))
or simply (in this case, since the main FROM and the subselect use the same table without any filters):
WITH TEMP(col1) AS (values ('John'), ('Peter'), ('Henry'), ('Mo'))
SELECT col1 FROM TEMP
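If the goal really is to take a comma-separated string apart again and use its elements in an IN test, here is a sketch using string_to_array and unnest (instead of injecting quotes into the aggregated string):
WITH TEMP(col1) AS (values ('John'), ('Peter'), ('Henry'), ('Mo')),
     agg AS (SELECT string_agg(col1, ',') AS s FROM TEMP)
SELECT col1 FROM TEMP
WHERE col1 IN (SELECT unnest(string_to_array(s, ',')) FROM agg)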
We are using PostgreSQL 8 / psql version 8.4. I am querying information_schema.columns for the column names and would like to generate a text/output file like so:
UNLOAD ('select
col1,
col2,
col3,
col4,
col5)
to more text here
or
UNLOAD ( 'select col1,col2,col3,col4,col5) to more text here
So I'm basically looking to output each column name followed by a comma: colname,
Is this possible? Thanks for the help.
This will create a string like this:
SELECT 'UNLOAD ( ''select ' ||
array_to_string(array_agg(column_name::text), ',') ||
' to more text here'
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'whatever'
;
You can use \o to send it to a file, or use COPY (...) TO STDOUT.
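For example, in psql the output can be redirected to a file with \o (the path below is only an illustration):
\o /tmp/column_list.sql
SELECT 'UNLOAD ( ''select ' ||
       array_to_string(array_agg(column_name::text), ',') ||
       ' to more text here'
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 'whatever';
\o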
In Postgres 8.4 or higher, what is the most efficient way to get a row of data populated by defaults without actually creating the row? E.g., as a transaction (pseudocode):
create table "mytable"
(
id serial PRIMARY KEY NOT NULL,
parent_id integer NOT NULL DEFAULT 1,
random_id integer NOT NULL DEFAULT random()
)
begin transaction
fake_row = insert into mytable (id) values (0) returning *;
delete from mytable where id=0;
return fake_row;
end transaction
Basically I'd expect a query returning a single row where parent_id is 1 and random_id is a random number (or other function return value), but I don't want this record to persist in the table or affect the primary key sequence serial_id_seq.
My options seem to be using a transaction like above or creating views which are copies of the table with the fake row added but I don't know all the pros and cons of each or whether a better way exists.
I'm looking for an answer that assumes no prior knowledge of the datatypes or default values of any column except id, nor of the number or ordering of the columns. Only the table name will be known, and that a record with id 0 should not exist in the table.
In the past I created the fake record 0 as a permanent record but I've come to consider this record a type of pollution (since I typically have to filter it out of future queries).
You can copy the table definition and defaults to a temp table with:
CREATE TEMP TABLE table_name_rt (LIKE table_name INCLUDING DEFAULTS);
And use this temp table to generate dummy rows. Such a table will be dropped at the end of the session (or at the end of the transaction, if created with ON COMMIT DROP) and will only be visible to the current session.
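A minimal sketch of using it, assuming the mytable definition from the question; supplying id explicitly keeps the copied nextval() default from touching the real sequence, while the remaining columns fall back to their copied defaults:
BEGIN;
CREATE TEMP TABLE mytable_rt (LIKE mytable INCLUDING DEFAULTS) ON COMMIT DROP;
INSERT INTO mytable_rt (id) VALUES (0);
SELECT * FROM mytable_rt;   -- the "fake" row; mytable itself is untouched
COMMIT;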
You can query the catalog and build a dynamic query.
Say we have this table:
create table test10(
id serial primary key,
first_name varchar( 100 ),
last_name varchar( 100 ) default 'Tom',
age int not null default 38,
salary float default 100.22
);
When you run the following query:
SELECT string_agg( txt, ' ' order by id )
FROM (
select 1 id, 'SELECT ' txt
union all
select 2, -9999 || ' as id '
union all
select 3, ', '
|| coalesce( column_default, 'null'||'::'||c.data_type )
|| ' as ' || c.column_name
from information_schema.columns c
where table_schema = 'public'
and table_name = 'test10'
and ordinal_position > 1
) xx
;
you will get this string as a result:
"SELECT -9999 as id , null::character varying as first_name ,
'Tom'::character varying as last_name , 38 as age , 100.22 as salary"
Then execute this query and you will get the "phantom row".
We can build a function that builds and executes the query and returns our row as a result:
CREATE OR REPLACE FUNCTION get_phantom_rec (p_i test10.id%type )
returns test10 as $$
DECLARE
v_sql text;
myrow test10%rowtype;
begin
SELECT string_agg( txt, ' ' order by id )
INTO v_sql
FROM (
select 1 id, 'SELECT ' txt
union all
select 2, p_i || ' as id '
union all
select 3, ', '
|| coalesce( column_default, 'null'||'::'||c.data_type )
|| ' as ' || c.column_name
from information_schema.columns c
where table_schema = 'public'
and table_name = 'test10'
and ordinal_position > 1
) xx
;
EXECUTE v_sql INTO myrow;
RETURN myrow;
END$$ LANGUAGE plpgsql ;
and then this simple query gives you what you want:
select * from get_phantom_rec ( -9999 );
id | first_name | last_name | age | salary
-------+------------+-----------+-----+--------
-9999 | | Tom | 38 | 100.22
I would just select the fake values as literals:
select 1 id, 1 parent_id, 1 user_id
The returned row will be (virtually) indistinguishable from a real row.
To get the values from the catalog:
select
0 as id, -- special case for serial type, just return 0
(select column_default::int -- Cast to int, because we know the column is int
from INFORMATION_SCHEMA.COLUMNS
where table_name = 'mytable'
and column_name = 'parent_id') as parent_id,
(select column_default::int -- Cast to int, because we know the column is int
from INFORMATION_SCHEMA.COLUMNS
where table_name = 'mytable'
and column_name = 'user_id') as user_id;
Note that you must know what the columns are and their types, but this is reasonable. If you change the table schema (other than a default value), you would need to tweak the query.
See the above as a SQLFiddle.
I have an interesting conundrum which I believe can be solved purely in SQL. I have tables similar to the following:
responses:
user_id | question_id | body
----------------------------
1 | 1 | Yes
2 | 1 | Yes
1 | 2 | Yes
2 | 2 | No
1 | 3 | No
2 | 3 | No
questions:
id | body
-------------------------
1 | Do you like apples?
2 | Do you like oranges?
3 | Do you like carrots?
and I would like to get the following output
user_id | Do you like apples? | Do you like oranges? | Do you like carrots?
---------------------------------------------------------------------------
1 | Yes | Yes | No
2 | Yes | No | No
I don't know how many questions there will be, and they will be dynamic, so I can't just code for every question. I am using PostgreSQL and I believe this is called transposition, but I can't seem to find anything describing the standard way of doing this in SQL. I remember doing this in my database class back in college, but it was in MySQL and I honestly don't remember how we did it.
I'm assuming it will be a combination of joins and a GROUP BY statement, but I can't even figure out how to start.
Anybody know how to do this? Thanks very much!
Edit 1: I found some information about using a crosstab which seems to be what I want, but I'm having trouble making sense of it. Links to better articles would be greatly appreciated!
Use:
SELECT r.user_id,
MAX(CASE WHEN r.question_id = 1 THEN r.body ELSE NULL END) AS "Do you like apples?",
MAX(CASE WHEN r.question_id = 2 THEN r.body ELSE NULL END) AS "Do you like oranges?",
MAX(CASE WHEN r.question_id = 3 THEN r.body ELSE NULL END) AS "Do you like carrots?"
FROM RESPONSES r
JOIN QUESTIONS q ON q.id = r.question_id
GROUP BY r.user_id
This is a standard pivot query, because you are "pivoting" the data from rows to columnar data.
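As a side note, on PostgreSQL 9.4 and later the same pivot can also be written with FILTER instead of CASE; a sketch:
SELECT r.user_id,
       MAX(r.body) FILTER (WHERE r.question_id = 1) AS "Do you like apples?",
       MAX(r.body) FILTER (WHERE r.question_id = 2) AS "Do you like oranges?",
       MAX(r.body) FILTER (WHERE r.question_id = 3) AS "Do you like carrots?"
FROM responses r
GROUP BY r.user_id;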
I implemented a truly dynamic function to handle this problem without having to hard code any specific class of answers or use external modules/extensions. It also gives full control over column ordering and supports multiple key and class/attribute columns.
You can find it here: https://github.com/jumpstarter-io/colpivot
Example that solves this particular problem:
begin;
create temporary table responses (
user_id integer,
question_id integer,
body text
) on commit drop;
create temporary table questions (
id integer,
body text
) on commit drop;
insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');
select colpivot('_output', $$
select r.user_id, q.body q, r.body a from responses r
join questions q on q.id = r.question_id
$$, array['user_id'], array['q'], '#.a', null);
select * from _output;
rollback;
This outputs:
user_id | 'Do you like apples?' | 'Do you like carrots?' | 'Do you like oranges?'
---------+-----------------------+------------------------+------------------------
1 | Yes | No | Yes
2 | Yes | No | No
You can solve this example with the crosstab function in this way:
drop table if exists responses;
create table responses (
user_id integer,
question_id integer,
body text
);
drop table if exists questions;
create table questions (
id integer,
body text
);
insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');
select * from crosstab(
    'select responses.user_id, questions.body, responses.body
     from responses, questions
     where questions.id = responses.question_id
     order by user_id'
) as ct(userid integer,
        "Do you like apples?" text,
        "Do you like oranges?" text,
        "Do you like carrots?" text);
First, you must install the tablefunc extension. Since version 9.1 you can do it with CREATE EXTENSION:
CREATE EXTENSION tablefunc;
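As a variation, the two-argument form crosstab(source_sql, category_sql) lets a separate query pin the category order, which keeps the columns aligned even if a user has no row for some question; a sketch:
SELECT *
FROM crosstab(
    'select r.user_id, q.body, r.body
     from responses r join questions q on q.id = r.question_id
     order by r.user_id',
    'select body from questions order by id'
) AS ct(user_id integer,
        "Do you like apples?" text,
        "Do you like oranges?" text,
        "Do you like carrots?" text);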
I wrote a function to generate the dynamic query.
It generates the SQL for the crosstab and creates a view (dropping it first if it already exists).
You can then select from the view to get your results.
Here is the function:
CREATE OR REPLACE FUNCTION public.c_crosstab (
    eavsql_inarg varchar,
    resview varchar,
    rowid varchar,
    colid varchar,
    val varchar,
    agr varchar
)
RETURNS void AS
$body$
DECLARE
    casesql varchar;
    dynsql varchar;
    r record;
BEGIN
    dynsql = '';
    FOR r IN
        SELECT * FROM pg_views WHERE lower(viewname) = lower(resview)
    LOOP
        EXECUTE 'DROP VIEW ' || resview;
    END LOOP;
    casesql = 'SELECT DISTINCT ' || colid || ' AS v from (' || eavsql_inarg || ') eav ORDER BY ' || colid;
    FOR r IN EXECUTE casesql LOOP
        dynsql = dynsql || ', ' || agr || '(CASE WHEN ' || colid || '=''' || r.v || ''' THEN ' || val || ' ELSE NULL END) AS ' || agr || '_' || r.v;
    END LOOP;
    dynsql = 'CREATE VIEW ' || resview || ' AS SELECT ' || rowid || dynsql || ' from (' || eavsql_inarg || ') eav GROUP BY ' || rowid;
    RAISE NOTICE 'dynsql %', dynsql;
    EXECUTE dynsql;
END
$body$
LANGUAGE plpgsql
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;
And here is how I use it:
SELECT c_crosstab('query_txt', 'view_name', 'entity_column_name', 'attribute_column_name', 'value_column_name', 'first');
Example:
First you run:
SELECT c_crosstab('Select * from table', 'ct_view', 'usr_id', 'question_id', 'response_value', 'first');
Then:
Select * from ct_view;
There is an example of this in contrib/tablefunc/.