Using IN statement with jsonb type in postgres - postgresql

I am trying to write a PostgreSQL query that selects jsonb fields from my table, and I am wondering whether I can use an IN clause with the @> jsonb operator.
The query I have is:
SELECT data FROM catalog WHERE data @> '{"name":{"firstname":"myname"}}'
This works fine with one value in the WHERE condition. Is it possible to use multiple JSON values in the WHERE condition? Along with '{"name":{"firstname":"myname"}}', I also want to return records for '{"name":{"firstname":"yourname"}}'.
I can do something like below
Select *
FROM catalog
WHERE data ->'name' ->> 'firstname' IN ('myname','yourname')
What's the best way to do it?

Starting in the soon to be released v12, you can use JSONPATH to do that.
SELECT data FROM catalog WHERE data @@ '$.name.firstname=="myname" || $.name.firstname=="yourname"';
There may be a better way to write that JSONPATH without the repetition; I'm not an expert there.
Your other choices are the IN you have already shown, and multiple @> conditions connected by OR. The different operations are supported by different indexes. If you care about performance, then the "best" way to do it depends on what indexes you already have, or are willing to build, and on how you prefer to write your queries. (Indeed, I'd argue the "best" way is not to use JSON in the first place.) To use the IN list, you would need an expression index like:
create index on catalog ((data ->'name' ->> 'firstname') );
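As a rough sketch of the OR alternative mentioned above, assuming the catalog table from the question with a jsonb column named data (the sample rows, the index name, and the choice of a plain GIN index are illustrative assumptions, not part of the original post):

```sql
-- Hypothetical setup mirroring the question's table.
CREATE TABLE catalog (data jsonb);
INSERT INTO catalog (data) VALUES
    ('{"name": {"firstname": "myname"}}'),
    ('{"name": {"firstname": "yourname"}}'),
    ('{"name": {"firstname": "other"}}');

-- A GIN index is the kind of index that can support the containment operator @>.
CREATE INDEX catalog_data_gin ON catalog USING gin (data);

-- Multiple containment tests connected by OR.
SELECT data
FROM catalog
WHERE data @> '{"name": {"firstname": "myname"}}'
   OR data @> '{"name": {"firstname": "yourname"}}';

-- On v12 or later, a jsonpath filter expression avoids repeating the path.
SELECT data
FROM catalog
WHERE data @? '$.name.firstname ? (@ == "myname" || @ == "yourname")';
```

Both queries should return the first two sample rows. Whether the planner actually uses the index depends on table size and statistics, so it is worth checking with EXPLAIN on real data.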

Related

PostgreSQL, allow to filter by not existing fields

I'm using PostgreSQL with a Go driver. Sometimes I need to query fields that may not exist, just to check whether something exists in the DB. Before querying, I can't tell whether a given field exists. Example:
where size=10 or length=10
By default I get the error column "length" does not exist; however, the size column could exist and I could get some results.
Is it possible to handle such cases and return whatever is possible?
EDIT:
Yes, I could get all the existing columns first. But the initial queries can be rather complex and not created by me directly, I can only modify them.
That means the query can be simple like the previous example and can be much more complex like this:
WHERE size=10 OR (length=10 AND n='example') OR (c BETWEEN 1 and 5 AND p='Mars')
If the missing columns are length and c, does that mean I have to parse the SQL, split it by OR (or other operators), check every part of the query, remove any part with missing columns, and finally generate a new SQL query?
Any easier way?
I would check the information schema first:
select column_name from information_schema.columns where table_name = 'table_name';
And then build the query based on the result.
Why don't you get a list of columns that are in the table first? Like this
select column_name
from information_schema.columns
where table_name = 'table_name' and (column_name = 'size' or column_name = 'length');
The result will be the columns that exist.
There is no way to do what you want, except for constructing an SQL string from the list of available columns, which you can get by querying information_schema.columns.
SQL statements are parsed before they are executed, and there is no conditional compilation or short-circuiting at that stage, so you get an error if a non-existent column is referenced.
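A minimal sketch of that string-building approach, done in SQL itself. The table name items, the public schema, and the simple "column = 10" conditions are assumptions for illustration; a query with nested AND/OR groups would still need real parsing on the client side:

```sql
-- Hypothetical table: it has "size" but no "length" column.
CREATE TABLE items (id int, size int);

-- Build the query text from whichever candidate columns actually exist.
SELECT format(
           'SELECT * FROM %I WHERE %s',
           'items',
           string_agg(format('%I = 10', column_name), ' OR ')
       ) AS query_text
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 'items'
  AND column_name IN ('size', 'length');
```

The resulting text contains a condition only for the columns that were found, and the client can then execute it as a second statement. If none of the candidate columns exist, string_agg yields NULL and the WHERE clause comes out empty, so the caller has to check for that case before running the generated query.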

How to access a HSTORE column using PostgreSQL C library (libpq)?

I cannot find any documentation regarding HSTORE data access using the C library. Currently I'm considering just converting the HSTORE columns into arrays in my queries, but is there a way to avoid such conversions?
libpqtypes appears to have some support for hstore.
Another option is to avoid directly interacting with hstore in your code. You can still benefit from it in the database without dealing with its text representation on the client side. Say you want to fetch a hstore field; you just use:
SELECT t.id, k, v FROM thetable t, LATERAL each(t.hstorefield);
or on old PostgreSQL versions you can use the quirky and nonstandard set-returning-function-in-SELECT form:
SELECT t.id, each(t.hstorefield) FROM thetable t;
(but watch out: if you select multiple records from t this way you'll get weird results, whereas LATERAL will be fine).
Another option is to use hstore_to_array or hstore_to_matrix when querying, if you're comfortable dealing with PostgreSQL array representation.
To create hstore values you can use the hstore constructors that take arrays. Those arrays can in turn be created with array_agg over a VALUES clause if you don't want to deal with PostgreSQL's array representation in your code.
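For instance, a small sketch of the array-based constructor fed by array_agg over a VALUES clause (the keys and values here are made up; in client code they would typically be bound parameters, possibly with explicit ::text casts):

```sql
-- Requires the hstore extension.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Build an hstore from key/value rows without writing an array literal by hand.
SELECT hstore(array_agg(k), array_agg(v)) AS h
FROM (VALUES ('colour', 'red'), ('size', 'large')) AS kv(k, v);
```

This uses the two-argument hstore(text[], text[]) form, which pairs the keys array with the values array.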
All this mess should go away in future, as PostgreSQL 9.4 is likely to have much better interoperation between hstore and json types, allowing you to just use the json representation when interacting with hstore.
The binary protocol for hstore is not complicated.
See the _send and _recv functions from its IO code.
Of course, that means requesting (or binding) it in binary format in libpq.
(see the paramFormats[] and resultFormat arguments to PQexecParams)

Fast search within strings in PostgreSQL

Which is the fastest way to search within a string in PostgreSQL (case-insensitively):
SELECT col FROM table WHERE another_col ILIKE '%typed%'
or
SELECT col FROM table WHERE another_col ~* 'typed'
How can I turn on display of the time a query needs to return results? Something like what MySQL shows by default (I am thinking of the CLI client).
Both queries perform about the same. PostgreSQL rewrites ILIKE to the ~~* operator internally; check the output of EXPLAIN to see this behaviour.
I'm not sure about your second question, but the psql client can show you query timing with \timing.
Regarding the timing:
One solution is to use the psql switch that Frank has already mentioned.
When you use EXPLAIN ANALYZE, the output also includes the total runtime of the query on the server.
I prefer this when comparing the runtime for different versions of a query as it removes the network from the equation.
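A short psql session sketch of both options. The table name tbl is a placeholder (the question's "table" is a reserved word), and note that EXPLAIN ANALYZE really executes the statement:

```sql
-- psql meta-command: turn on client-side timing for every statement.
\timing on

-- Server-side timing, independent of network latency.
-- The plan output ends with a runtime line ("Total runtime" on older
-- releases, "Execution Time" on newer ones).
EXPLAIN ANALYZE
SELECT col FROM tbl WHERE another_col ILIKE '%typed%';

-- Same measurement for the regex variant, for comparison.
EXPLAIN ANALYZE
SELECT col FROM tbl WHERE another_col ~* 'typed';
```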

Is it possible to use CASE with IN?

I'm trying to construct a T-SQL statement with a WHERE clause determined by an input parameter. Something like:
SELECT * FROM table
WHERE id IN
CASE WHEN @param THEN
(1,2,4,5,8)
ELSE
(9,7,3)
END
I've tried every combination of moving the IN, CASE etc. around that I can think of. Is this (or something like it) possible?
Try this:
SELECT * FROM table
WHERE (@param='??' AND id IN (1,2,4,5,8))
OR (@param!='??' AND id IN (9,7,3))
This will have a problem using an index.
The key with dynamic search conditions is to make sure an index is used, rather than how to easily reuse code, eliminate duplication in a query, or do everything with the same query. Here is a very comprehensive article on how to handle this topic:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
It covers all the issues and methods of trying to write queries with multiple optional search conditions. The main thing you need to be concerned with is not the duplication of code, but the use of an index. If your query fails to use an index, it will perform poorly. There are several techniques that can be used, which may or may not allow an index to be used.
Here is the table of contents:
Introduction
    The Case Study: Searching Orders
    The Northgale Database
Dynamic SQL
    Introduction
    Using sp_executesql
    Using the CLR
    Using EXEC()
    When Caching Is Not Really What You Want
Static SQL
    Introduction
    x = @x OR @x IS NULL
    Using IF statements
    Umachandar's Bag of Tricks
    Using Temp Tables
    x = @x AND @x IS NOT NULL
    Handling Complex Conditions
Hybrid Solutions – Using both Static and Dynamic SQL
    Using Views
    Using Inline Table Functions
Conclusion
Feedback and Acknowledgements
Revision History
If you are on the proper version of SQL Server 2008, there is an additional technique that can be used; see: Dynamic Search Conditions in T-SQL Version for SQL 2008 (SP1 CU5 and later)
If you are on that release of SQL Server 2008, you can just add OPTION (RECOMPILE) to the query, and the local variable's value at run time is used for the optimization.
Consider this: OPTION (RECOMPILE) will take this code (where no index can be used with this mess of ORs):
WHERE
(@search1 IS NULL or Column1=@Search1)
AND (@search2 IS NULL or Column2=@Search2)
AND (@search3 IS NULL or Column3=@Search3)
and optimize it at run time to be (provided that only @Search2 was passed in with a value):
WHERE
Column2=@Search2
and an index can be used (if you have one defined on Column2).
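Put together as a complete statement, the pattern might look like the following T-SQL sketch. The table dbo.SomeTable, its columns, and the index name are hypothetical; only the shape of the WHERE clause and the query hint come from the discussion above:

```sql
-- Hypothetical table and index; only the pattern matters.
CREATE TABLE dbo.SomeTable (Column1 int, Column2 int, Column3 int);
CREATE INDEX IX_SomeTable_Column2 ON dbo.SomeTable (Column2);

-- Only @Search2 carries a value; the other filters are "not supplied".
DECLARE @Search1 int = NULL,
        @Search2 int = 42,
        @Search3 int = NULL;

SELECT *
FROM dbo.SomeTable
WHERE (@Search1 IS NULL OR Column1 = @Search1)
  AND (@Search2 IS NULL OR Column2 = @Search2)
  AND (@Search3 IS NULL OR Column3 = @Search3)
OPTION (RECOMPILE);  -- plan is compiled using the variables' run-time values
```

The trade-off is a compilation on every execution, which is usually acceptable for search screens but worth measuring for very frequently run queries.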
if @param = 'whatever'
select * from tbl where id in (1,2,4,5,8)
else
select * from tbl where id in (9,7,3)

PostgreSql XML Text search

I have a text column in a table. We store XML in this column. Now I want to search for tags and values
Example data:
<bank>
<name>Citi Bank</name>
.....
.....
</bank>
I would like to run the following query:
select * from xxxx where to_tsvector('english',xml_column) @@ to_tsquery('<name>Citi Bank</name>')
This works, but it also matches tags like name1, or the text with no tag at all.
How do I have to set up my search so that I get an exact match for the tag and value?
You could use the xpath function like this (the XPath expression is the first argument, and the text column has to be cast to xml):
select *
from xxxx
where (xpath('/bank/name/text()', xml_column::xml))[1]::text = 'Citi Bank';
BUT it won't use the full-text search index. You could use a subquery to find probable matches and avoid full scans, with the xpath expression picking out the correct answers, or create a functional index if the queries are always going to be the same.
You might want to reconsider storing XML in a database. Instead, you could look at inserting the data into related tables, since XML is a poor replacement for a relational store. Even if you go with XML in the database, use the XML type, not the TEXT type, and create an index like this (yes, basically you'd need an index per xpath expression):
CREATE INDEX my_funcidx ON my_table USING GIN ( CAST(xpath('/bank/name/text()', xmlfield) AS TEXT[]) );
then, query it like this:
SELECT * FROM my_table WHERE CAST(xpath('/bank/name/text()', xmlfield) AS TEXT[]) @> '{Citi Bank}'::TEXT[];
and this will use the index, as EXPLAIN will indicate.
The important part is the CASTing to TEXT[], as XML[], which the xpath function returns, isn't indexable by default.