Indexing on jsonb keys in PostgreSQL

I'm using PostgreSQL.
Is there any way to create an index on just the dictionary keys, not the values?
For example imagine a jsonb column like:
select data from tablename where id = 0;
answer: {1:'v1', 2:'v2'}
I want to index the key set (or key list), which is [1, 2], to speed up queries like:
select count(*) from tablename where data ? '2';
As you can see in docs, there is a way for indexing the column entirely (keys + values):
CREATE INDEX idxgin ON api USING GIN (jdoc);
This is not good for me, considering that I store a large amount of data in values.
I tried this before:
CREATE INDEX test ON tablename (jsonb_object_keys(data));
The error was:
ERROR: set-returning functions are not allowed in index expressions
Also, I don't want to store keys in the dictionary as a value.
Can you help me?

Your example doesn't make much sense, as your WHERE clause isn't specifying a JSON operation, and your example output is not valid JSON syntax.
You can hide the set-returning function (and the aggregate) into an IMMUTABLE function:
create function object_keys(jsonb) returns text[] language SQL immutable as $$
select array_agg(jsonb_object_keys) from jsonb_object_keys($1)
$$;
create index on tablename using gin ( object_keys(data));
If you did it this way, you could then query it formulated like this:
select * from tablename where object_keys(data) @> ARRAY['2'];
You could instead make the function return a JSONB containing an array rather than returning a PostgreSQL text array, if you would rather query it that way:
select * from tablename where object_keys_jsonb(data) @> '"2"';
You can't apply ? to the text[] version, because ? is defined only for jsonb. If you really want to use ?, make the function return jsonb: either a jsonb array of the keys (? also matches top-level string elements of an array), or the original object with every value replaced by JSON null or an empty string, so that the values take up less space.

Related

Postgres - Indexing long character varying array type column using GIN?

I'm trying to add an index to a table with a column of character varying array type.
CREATE TABLE test1 (abc character varying(5000)[]);
INSERT INTO test1 VALUES ('{"ok"}');
INSERT INTO test1 VALUES ('{"E7j0JY8vBgNzdzXHI8gvJ5eWe0HphmiOXhkazbogMrhZmZa84sblywEWVVMGD3jjwq3Etcy87ZITCrIJWyLAIBfynHCavdzpfTutGXmbc9JkuZEX8YuUbjeJc2TvktyNZJXxFlh1GaogibFkxd0qDNM07qk05QuCeX1UudrsFnLLZXc0s4JeP6heekAlJQBsKt2NSM0hcLHQyYq61iofXraGOV7bnPrcfeP6fEq2LDrAb9QVzzJ0MdYduiFxUfmoSoBs9HjqN5ofUMD8B6qi81cGdMTxYN359n6yf65htfIywSPPvRDbTBpgfdejuTtyfXoEY9awt2zzlkdJDXvMm1FoaEz4hHtmsg92C91UR2yE9z37Ws0yaNWCI0KPxGVEAMuzwh6JGDljgLXNo9MZTBmJd24Vhr4HR9YDHazpaWXje9ImUMiurRRKjcUcP5hZIWWl2kKpcfxUMrQsTnqgcTIj7WymW62DgqnIzHCFrXc6ykthIieynqtevghDZNVPiuWtDoquE0dzrDIFsdYUm74hFeTB5NdIsk0dZBjwmKBour5qJiVMJPz33AbZEPhrI1JbBNhiovrTh2Q5tz6PNv9BExM9zsHB19uU8AtHhlYF3XMwSHjKzMgYobohSl3hfpFuDoqKXhgnuR9Ni4bvoyUcIyyp9rnhs1yawm1r4922Ule3BAoDDBvVeREcvRUT2u2j1eKWmTiXQzx03XdukpgLc0rZ1DKB7V7US403XXVGiyv6QEpPeMTlkPdsSwqA4BmW73qQlsXqVg2COzVE8uAGjXqYZxLsiyndTum3b3VXCkXcogp0fJNGi47M2xkW6novZ1ZfCqL0pMspgAiDLSKHzu8GFvzs3wi3FJLKJrmmLp201U4zXhsF3K2TPw0QrTto3ArUzijt0Ea8pi2LzB9pw9ZpTmc2yUvaL2q8sdeF8TIwPptR0SxOL4NTxNij7oieCpT74NaMXUqkwbKHO59TeGNd6hRlp2sj6JRicibtNBN1WQYWiivMfXDylt9hLr0Cgw2V9b586PXf8noLH62SOE97ua9axU2mhmznhELWDmnISTYoNVsoLcHOGKK2EFJKovEc5rtQghHAti3E4ha5PftWRCMleindEPEJAjokXDkIQxx9Q8AcpZrJw389eag85ClGQnu0jhyn0aRcIe1f9wAK5wUfvzAD39WGUewzDogmPGgbRU9UtR7aRd2v09idFzmPFPxRUgeSQlz7uEwYzsLzqcuahIKH5kcc6B3OAOWv5c5bS7MO74ko7ToUvtdHr6Ah0C3DeN9Br4QLOjSKkQKVhWw66JGGJQvjKIhzntDFLXsSuN9iD32h3q62DW88RFcdx2MjL8SXKeZm11NkAoRERuQBvXLOWo9JA1tq8J2CZE95vXlFuvge0xxiLS7vB0GMtYzWvMgH3C9qshNMQfeAl4Bm8pgB3M1SjxBz0CgCXhy69482BQFTRCBBVwzrc5VLmOXZEA4iKuMXxkUGKkB8PzXjspeo3VyIWjDZGpbDBvfdDViCGo8OOHU43T81vSeRHaN3U4JYc3UW0mPEwF2HOY4hKbgbELKIMsg9HbwPqcv3SmCIuVqORSrnq0sysJL7TwZt8mheSZ4nSNIXDOBLdO6V29X3XFVjRJIK0bdLfxs59E0SiWwGGqGEI0MtcCbbSM5QfzQTALG7E0Mbf2pSEbw7ZwXKLh4Il3mLaqTmtJt3p18wTJOfCi7Qkwem66emBjx5YAixDhp6D73e8X9mTNZvClkS9COrAgqcj9XwnyDlaw1Vgg7UJk6uaOXgW57LJIqkAruI3lrZEOfYe8gYgTaUUDzXHPKLj5BzIqQpU8agA5m11NiDvdocr9yrsJLdfNMWyjTeXmq61mi9Ok31sWUxmuRlolfOMtjm1ev3yJzaTdrq3Xekedtmn889KVCw2AvV8KvT15h2bQOtqTPQRjNQGVyRspE9j5tPupqR98Y
robNXrcZE9Wifd5JsFX1vlgIgJxISQMrOZx42Win944RK0IYBKo64cyfugH6hjL0QUCLTJqio9P3kB56H10iTi8iizoJ5qZZa8jt9qAJBqmSMeZn1yw5eVbQluWKxWUyXw1iOvm8SqfWZ2UWr5G0X2XNXj8HTBH0u25T9yUsCPoIxfJ63sBoOYW3DHzOKex4DUR9I7wGyT75kRkRO3Qmqzxa3aonD5CYe3cfn9f6DRiFlinoaVXHFiOKOP81NrUei9uKzmpOfBv6ats3hOacP9wdBKVfw1ZOpCGsDDhoX2I3wn9wREojlMz9fDXr9HHYoSfVH9xDkbBsTvqGAaOsYlNQEvDno9pkWm1SI6KxkdyBAHEXFyN3272kgFFFUPEDmXRfaFvy3FB3reqaBwAZeM0h6b9Wm47jXFKil0AaRhUTf0pactOHd98doNhe4m5HCdPJQ5bvSS8c46gn0vRjAwLMzZ9Q0w9OEpTHexy5R1cxSs27pAN8LSNjt0DcGM1hbJ08wnMfn7zCmhIsqlq1F0mwc39Ndf2cJdHgpLr24Q7OxAlbCjbnM3Uoiu4GeWVFEhaUDkwXKyC0zAoggFIe8QQtpvD8eWpvw8NxkkB70H65N5xNmgKaabbF2fytiyV0geWleopRW2kohncl8wxkneVuNeJulYuKQLH6oGHmcBRA1t8fRjBfcYofxO50Gicx0NGEKHozDOD3lubc7VdjAsSne5W4zYMu8UUm6dxbqFW7QY1S9EKuhHsdXm0S6iNLaE8Zceg33u8JYSBRtvFRSYVeHHYQTBhaKZOsiLgyDWiu8JgQ0yPfkZlLQA5oQVhOrJ4ZBfO7SUTEPsAvHSlgWph4sBrSw4rRUjYFuCZvyniFJOyvS165ogqqpuVHj1LZ9ATlT7aFN0ujmzCuX3gLZdCKwWEGpZA59auM1HxcY9Q0Mn5SSh7HDw0lZaSpRF3VCkehAGjhxYEs3CgihRiIWtZfjxKFLHMGTY7bfY2vpE65xTgTsTR86dJdvDy6sCughSMkcL2pZfvEg6JGrdBw88YPoEtgzCBjAYMXC7L09ZLVKen4TdbnV0qAKnFZRkvPhlMFW4tLVx0b1LPR4xACWLYtU74BDMVGt0BYx2XzA4Xf4CMcjK6s6BdjdTI4kMV04i13lQG3ffuFey1GNqvPLaiYh1Iay9N5beCPlyVmQHlXWd9UEMMrtJ66JKW07dn3O8jfGPd4JNv1V7YU3qHD54Xm60JNbC1nhOCX5xC9L0t2gVJ3RGH0wlmVQk1hrb1mIfdXAuI2qKOu04HAv1XWOCWYyrhcacuzLyGjDJlfUcPe9qOeMXVE6p56drl0EEWxQL7vU9KcwBR6YNoCEChHgoratnDibiFDp9hlDmaif2Jm0ruww3J5cTc1EOe59VCeqFvccdbH1CzGSoJkPFwlxJxiMU2kVGjeSGdCDhS5SOGutZlfmlhnKxMmhmPGNe2OMjYfNrJlVjUEyuZw4PhZNUuqYDLxfOt6teVzVx3OeylQd7mnvSFm27Yn03Dwv3v7FU3z8s7ZqudUtNj2DxHNcsgg2xnvnGP5YD7FjYx64O4nnPqP6jhRMnXO1mJpXzdcy8rvWEnbIIVdfVXKGorhVCDugSOipMnmpEqiLtvBnMSrwhe8sU3MGdJOy4PA5KEcyLSdeCCbBbTiNt3kOMOSaiByEUhDsyyhoo2w2aEV3QWAiJn2P8NdZ5JWZ1lWgslwL9NSpK8jADyl5u4tOoyADKohVKCadsiRe8weAKd2fvgrye10PTbI1hOkWhABqFsG6Cgx9OFWOh7dBITo1I5v1YQnKyUC3aeNqx0uxGjbe5KWsB3LPqN8O7SkiJ5XEQbVZBD2htvOis386BoUuCfyvRYweyXrwOdWoHfWXDPDx6F5oBaNvteWeqgjkLpAvdS34Idk7fMPgTMwchXp8ekyJxG6JTtKLA4zDhn71TRAo4YSPtcd5SrsytkBUyf40YIIajYGFYLKtdGGevrd
wZDH0QmykUw88GApWqwwiGr3ygvtAxpyVL9Jy3nqcgF8QOSzEI1YcD5cX1j1MncKouCfuipcfxmnOZQR3Wwzn9YKQXv6OuKKKDRSfJnndkG6ivEFZ6Vfd7smewTfnZccxdYkdaEedCc8wWU4NipJadMsDi9BCGlpuzCu3t4M6PNBndUHUuWmNfAo2OEIXjtNxNrEenIuttvWDwF3kRDT03rymsi1yUsQuk9v4VpDLDjadb6eXIUS9W5vL3yVRUJNX8hp5HRD4PQbFoHOQVFJa31Nj3dlI4yvzcPSlP7DEQLqtP2nKWCYEbqBwJFytiuQqdE8sXtUc9p8pwXeFQSk2fIzXMocmCLesUH5TgphOJEcAN6sQyH2v7Pvtp9iMKPSFzGytFOSpQrofrLGJyDIwXlNHMCBa9AYIgqguFOAh9ltT5VJPCFTb7Po9LwTM4uDKZuLsMLxmohjeVV0Txv6aDhETnrfQalYgk4BlAjLFoZJhyoniM0dIyJdW8lB2kshuwYEguI2iggdv65X8DXxNyrUAIsYmYXWjNeXFyyLP4cfLIvK8C8t994LerlvhqwgwriHFmII4nAR5sdfWiRqdfjPFR7oQxOB0dOtWfC3lSomvgo4INgs6Lneg5QbiFfVXVKZ8PYhQ0Y6ynazBbNN6LEOvY2BgquHgSM59ebsVwnqDcZChn5N3oOYcqHC9TvXQfhQ8DJSS9n9b5RCduWIn42Uxy4eHSEpYGbepMKcyGVMEb9O94AxkdL1K81QLwNkw1Yt3xftOr93K4YTl2OhP04COcl3HSHDe7aOnA04MWsFgNSnUKR8I16KgwkVUsvgQJe6ROHfNFJtSYvPeqTtWkr7RDHQPaEeIPCMUVo4pMxhTMAz5J5vEQwwNDZ0qaVlPCRVF7tDXZJThAro9rGZyzdWc5ctk5E4PQr2Z7Oq3hiHLiuoxrpSZ7qrRX6TCyLJyrMUB0vQ3MpLoQ5tJ5GQ6lQ7Rrjsfhpuyc94yKu2kO6FdgoWqVu39sRq2XgxMTcGohRF9", "alpha", 
"VWyRaMaUucKNnmadMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsG
FqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOIN
FHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhUvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaMnhpAOvLlJvorfNFAqCwQLsiKbSmxgpJhvoRCaOINFHyKgeTsqBAPUvHmMkrRTIdEhRVWyRMaUucKNnmdMXZgtJppEceutFgsoLOFStAjnQHOuBZXixVQEObHndzfsGFqBjeVkFFtwmfFIQgYMNaM"}');
create index idx_test1 on test1 using GIN(abc);
Creating the index fails with this error message:
ERROR: index row size 5016 exceeds maximum 2712 for index "idx_test1"
Is there a way to increase this maximum size? Or is there an alternative way to add an index to such columns?
Also, is it wise to add an index to such long varchar array columns?
Note: I'm using PostgreSQL 10.7
You could create a helper function like this:
CREATE FUNCTION hash_array(text[]) RETURNS integer[]
LANGUAGE sql IMMUTABLE AS
$$SELECT array_agg(hashtext(u.e)) FROM unnest($1) AS u(e)$$;
Then you can index an array of the hashes of the text strings:
CREATE INDEX idx_test1 ON test1 USING gin (hash_array(abc::text[]));
Now you can search like
SELECT ... FROM test1
WHERE hash_array(abc::text[]) @> ARRAY[hashtext('ok')]
AND abc::text[] @> ARRAY['ok'];
The first condition can use the index, and the second condition removes any false positives from hash collisions.
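To see why both conditions are needed, here is a minimal Python sketch of the hash-then-recheck pattern. It is only a model: `zlib.crc32` stands in for `hashtext` (PostgreSQL uses a different hash), and the `buckets` parameter is a made-up knob that shrinks the hash space so that collisions are easy to provoke.

```python
import zlib

def toy_hash(value, buckets):
    # Stand-in for hashtext(); PostgreSQL uses a different hash function.
    return zlib.crc32(value.encode("utf-8")) % buckets

def hash_array(values, buckets):
    # Mirrors the helper function: one hash per array element.
    return {toy_hash(v, buckets) for v in values}

def search(rows, needle, buckets=2**32):
    wanted = toy_hash(needle, buckets)
    # Step 1: the indexable condition, which compares hashes only.
    candidates = [row for row in rows if wanted in hash_array(row, buckets)]
    # Step 2: the recheck on the real strings drops hash collisions.
    matches = [row for row in candidates if needle in row]
    return candidates, matches

rows = [["ok"], ["alpha", "beta"], ["ok", "gamma"]]
# With a single bucket every string collides, so step 1 keeps every row.
candidates, matches = search(rows, "ok", buckets=1)
print(len(candidates), matches)   # 3 [['ok'], ['ok', 'gamma']]
```

With a realistic hash space the candidate list is usually already exact, but the recheck is what makes the result correct rather than merely likely.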
I recommend that you use text rather than character varying(5000).

ILIKE query with indexing for jsonb array data in postgres

I have a table with a jsonb column named city that holds a JSON array like the one below:
[{"name":"manchester",..},{"name":"liverpool",....}]
Now I want to query the table on the "name" key with an ILIKE query.
I have tried the query below, but it does not work for me:
select * from data where city->>'name' ILIKE '%man%'
I know I can search for an exact match with the query below:
select * from data where city @> '[{"name":"manchester"}]'
I also know we can use jsonb functions to flatten the data and search that, but then no index will be used.
Is there any way to search the data with ILIKE so that it also uses an index?
Index support will be difficult; for that, a schema that adheres to the first normal form would be beneficial.
Other than that, you can use the JSONPATH language from v12 on:
WITH t(c) AS (
SELECT '[{"name":"manchester"},{"name":"liverpool"}]'::jsonb
)
SELECT jsonb_path_exists(
c,
'$.**.name ? (@ like_regex "man" flag "i")'::jsonpath
)
FROM t;
jsonb_path_exists
═══════════════════
t
(1 row)
You should really store your data differently.
You can do the ilike query "naturally" but without index support, like this:
select * from data where exists (select 1 from jsonb_array_elements(city) f(x) where x->>'name' ILIKE '%man%');
You can get index support like this:
create index on data using gin ((city::text) gin_trgm_ops);
select * from data where city::text ilike '%man%';
But it will find matches within the text of the keys as well as the values, and within irrelevant keys/values if any are present. You could get around this by creating a function that returns just the values, all banged together into one string, and then use a functional index. But the index will get less effective as the string gets longer, as there will be more false positives that need to be tracked down and weeded out.
create or replace function concat_val(jsonb, text) returns text immutable language sql as $$
select string_agg(x->>$2,' ') from jsonb_array_elements($1) f(x)
$$ parallel safe;
create index on data using gin (concat_val(city,'name') gin_trgm_ops);
select * from data where concat_val(city,'name') ilike '%man%';
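The false positives mentioned above come from the way a trigram index works: it can only establish that every three-character piece of the search string occurs somewhere, not that the pieces occur together. A rough Python sketch of that idea follows. It is a simplified model (real pg_trgm also pads each word with blanks), and the function names and the sample names are made up for illustration.

```python
def trigrams(text):
    # Simplified: real pg_trgm also pads each word with blanks.
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}

def index_candidate(haystack, needle):
    # What a trigram index can establish: every trigram of the needle occurs.
    return trigrams(needle) <= trigrams(haystack)

def ilike_contains(haystack, needle):
    # What ILIKE '%needle%' actually requires.
    return needle.lower() in haystack.lower()

# Two names joined with a space, the way concat_val joins them.
names = "hermanus manchester"
needle = "chesterman"
print(index_candidate(names, needle), ilike_contains(names, needle))   # True False
```

The row is a candidate according to the trigrams but fails the real ILIKE test, so it has to be fetched and discarded. The longer the concatenated string, the more often this happens.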
You should really store your data differently.

What PostgreSQL type is good for storing an array of strings and offering fast lookup afterwards

I am using PostgreSQL 11.9
I have a table containing a jsonb column with an arbitrary number of key-value pairs. There is a requirement that when we perform a search, all values from this column are included as well. Searching in jsonb is quite slow, so my plan is to create a trigger which will extract all the values from the jsonb column:
select t.* from app.t1, jsonb_each(column_jsonb) as t(k,v)
with something like this. And then insert the values in a newly created column in the same table so I can use this column for faster searches.
My question is what type would be most suitable for storing the keys and then searching within them. Currently the search looks like this:
CASE
WHEN something IS NOT NULL
THEN EXISTS(SELECT value FROM jsonb_each(column_jsonb) WHERE value::text ILIKE search_term)
END
where the search_term is what the user entered from the front end.
This is not going to be pretty, and normalizing the data model would be better.
You can define a function
CREATE FUNCTION jsonb_values_to_string(
j jsonb,
separator text DEFAULT ','
) RETURNS text LANGUAGE sql IMMUTABLE STRICT
AS 'SELECT string_agg(value->>0, $2) FROM jsonb_each($1)';
Then you can query like
WHERE jsonb_values_to_string(column_jsonb, '|') ILIKE 'search_term'
and you can define a trigram index on the left hand side expression to speed it up.
Make sure that you choose a separator that does not occur in the data or the pattern...
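The separator matters because a pattern can otherwise match across the boundary between two values. A small Python sketch of the effect, where `values_to_string` is only a rough analogue of `jsonb_values_to_string` for a flat dictionary of strings and the sample row is invented:

```python
def values_to_string(obj, separator=","):
    # Rough analogue of jsonb_values_to_string for a flat dict of strings.
    return separator.join(str(v) for v in obj.values())

def ilike(haystack, needle):
    # Models ILIKE '%needle%' (case-insensitive substring match).
    return needle.lower() in haystack.lower()

row = {"color": "red car", "animal": "pet"}
needle = "car pet"   # no single value contains this

print(ilike(values_to_string(row, " "), needle))   # True: the match spans two values
print(ilike(values_to_string(row, "|"), needle))   # False
```

A separator that can appear in the data or in the pattern produces matches that no individual value would produce.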

Postgresql: query on jsonb column - index doesn't make it quicker

There is a table in PostgreSQL 9.6; a query on its jsonb column is slow compared to a relational table, and adding a GIN index on it doesn't make it quicker.
Table:
-- create table
create table dummy_jsonb (
id serial8,
data jsonb,
primary key (id)
);
-- create index
CREATE INDEX dummy_jsonb_data_index ON dummy_jsonb USING gin (data);
-- CREATE INDEX dummy_jsonb_data_index ON dummy_jsonb USING gin (data jsonb_path_ops);
Generate data:
-- generate data,
CREATE OR REPLACE FUNCTION dummy_jsonb_gen_data(n integer) RETURNS integer AS $$
DECLARE
i integer:=1;
name varchar;
create_at varchar;
json_str varchar;
BEGIN
WHILE i<=n LOOP
name:='dummy_' || i::text;
create_at:=EXTRACT(EPOCH FROM date_trunc('milliseconds', now())) * 1000;
json_str:='{
"name": "' || name || '",
"size": ' || i || ',
"create_at": ' || create_at || '
}';
insert into dummy_jsonb(data) values
(json_str::jsonb
);
i:= i + 1;
END LOOP;
return n;
END;
$$ LANGUAGE plpgsql;
-- call function,
select dummy_jsonb_gen_data(1000000);
-- drop function,
DROP FUNCTION IF EXISTS dummy_jsonb_gen_data(integer);
Query:
select * from dummy_jsonb
where data->>'name' like 'dummy_%' and data->>'size' >= '500000'
order by data->>'size' desc
offset 50000 limit 10;
Test result:
The query takes 1.8 seconds on a slow vm.
Adding or removing the index doesn't make a difference.
Changing the index to gin with jsonb_path_ops also doesn't make a difference.
Questions:
Is it possible to make the query quicker, by improving either the index or the SQL?
If not, does that mean that within pg a relational table is more appropriate in this case?
And in my test mongodb performs better; does that mean mongodb is more appropriate for this kind of storage and query?
Quote from the manual
The default GIN operator class for jsonb supports queries with top-level key-exists operators ?, ?& and ?| operators and path/value-exists operator @> [...] The non-default GIN operator class jsonb_path_ops supports indexing the @> operator only.
Your query uses LIKE and a string comparison with >= (which is probably not correct to begin with, because text values are compared character by character rather than numerically); neither of those is supported by a GIN index.
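The problem with comparing the size as text can be shown in a few lines of Python. This is only an analogy: Python compares strings by code point, while PostgreSQL uses the collation of the expression, but I would expect digit strings to order the same way under the usual collations. The sample values are made up.

```python
sizes = ["9", "60", "500000", "1000000"]

# Text comparison goes character by character, like data->>'size' >= '500000'.
as_text = [s for s in sizes if s >= "500000"]

# Numeric comparison after a cast, the analogue of (data->>'size')::int.
as_number = [s for s in sizes if int(s) >= 500000]

print(as_text)     # ['9', '60', '500000']
print(as_number)   # ['500000', '1000000']
```

The text comparison keeps 9 and 60 and drops 1000000, which is almost certainly not what the query intends. The same applies to the order by on the text value.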
But even an index on (data ->> 'name') wouldn't be used for the condition data->>'name' like 'dummy_%' as that is true for all rows because every name starts with dummy.
You can create a regular btree index on the name:
CREATE INDEX ON dummy_jsonb ( (data ->> 'name') varchar_pattern_ops);
Which will be used if the condition is restrictive enough, e.g.:
where data->>'name' like 'dummy_9549%'
If you need to query for the size, you can create an index on ((data ->> 'size')::int) and then use something like this:
where (data->>'size')::int >= 500000
However, your use of limit and offset will always force the database to read all rows, sort them and then limit the result. This is never going to be very fast. You might want to read this article for more information why limit/offset is not very efficient.
JSON is a nice addition to the relational world, but only if you use it appropriately. If you don't need dynamic attributes for a row, then use standard columns and data types. Even though JSON support in Postgres is extremely good, this doesn't mean one should use it for everything, just because it's the current hype. Postgres is still a relational database and should be used as such.
Unrelated, but: your function to generate the test data can be simplified to a single SQL statement. You might not have been aware of the generate_series() function for things like that:
insert into dummy_jsonb(data)
select jsonb_build_object('name', 'dummy_'||i,
'size', i::text,
'created_at', (EXTRACT(EPOCH FROM date_trunc('milliseconds', clock_timestamp())) * 1000)::text)
from generate_series(1,1000000) as t(i);
While a btree index (the default PostgreSQL index type, based on B-trees, which keep their entries in sorted order) is able to optimize ordering-based conditions like >= '500000', the gin index, which uses an inverted index structure, is meant to quickly find rows containing specific elements (it is commonly used, for example, in text search to find rows containing given words), so (AFAIK) it can't be used for the query you provide.
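As a conceptual illustration of that difference (not a description of how PostgreSQL stores either index), here is a small Python sketch. The inverted index maps each key/value pair to the rows containing it, while the range condition is answered from a sorted list, which is the property a btree provides. The sample rows are invented.

```python
import bisect
from collections import defaultdict

docs = {
    1: {"name": "dummy_1", "size": 1},
    2: {"name": "dummy_2", "size": 2},
    3: {"name": "dummy_3", "size": 3},
}

# Inverted index: each (key, value) pair points at the ids of rows containing it.
inverted = defaultdict(set)
for doc_id, doc in docs.items():
    for item in doc.items():
        inverted[item].add(doc_id)

# Containment ("which rows have name = dummy_2?") is a single lookup.
print(inverted[("name", "dummy_2")])   # {2}

# A range condition has no single entry to look up. A sorted structure
# answers it by finding the start position and reading on from there.
by_size = sorted((doc["size"], doc_id) for doc_id, doc in docs.items())
sizes = [size for size, _ in by_size]
start = bisect.bisect_left(sizes, 2)
in_range = [doc_id for _, doc_id in by_size[start:]]
print(in_range)   # [2, 3]
```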
The PostgreSQL docs on jsonb indexing indicate for which WHERE conditions the index may be applied. As pointed out there, you can create a btree index on specific elements in a jsonb column: indexes on the specific elements referenced in the WHERE clause should work for the query you indicate.
Also, as commented above, think whether you actually need JSON for your use case.

Extracting XML element value from Postgres

Given:
CREATE TABLE xmltest(xtxt xml);
And:
INSERT INTO xmltest values ('<EMP><NAME>Mike</NAME><HIREDATE>12-FEB-96</HIREDATE></EMP><EMP><NAME>Bob</NAME><HIREDATE>13-AUG-97</HIREDATE></EMP><EMP><NAME>Paul</NAME><HIREDATE>17-JUN-94</HIREDATE></EMP><EMP><NAME>Jim</NAME><HIREDATE>01-JUN-94</HIREDATE></EMP>');
Using the base functionality of Postgres 9.2, how would I write a SELECT statement that returned only the employee names, 1 name per row in the result set? Or would I have to write a function in PL/PGSQL to do that?
You can extract fields of interest into an array using the xpath function, and then from there you can use the unnest builtin to split this array into multiple rows:
SELECT unnest(xpath('//NAME/text()', xtxt))
FROM xmltest;
(Slightly borrowed from this question)
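One detail to watch: element names in XPath are case-sensitive, so the path has to spell NAME the way the data does. A lowercase //name would select nothing from this sample. The Python sketch below shows the effect with the standard library. The `ROOT` wrapper element is an assumption added only because ElementTree needs a single root element, and only two of the sample employees are included.

```python
import xml.etree.ElementTree as ET

fragment = (
    "<EMP><NAME>Mike</NAME><HIREDATE>12-FEB-96</HIREDATE></EMP>"
    "<EMP><NAME>Bob</NAME><HIREDATE>13-AUG-97</HIREDATE></EMP>"
)
# ElementTree needs one root element, so the fragment is wrapped (assumption).
root = ET.fromstring("<ROOT>" + fragment + "</ROOT>")

upper = [e.text for e in root.findall(".//NAME")]
lower = [e.text for e in root.findall(".//name")]
print(upper)   # ['Mike', 'Bob']
print(lower)   # []
```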