How to convert tsvector?

How to convert tsvector? - postgresql

A typical and relevant application of tsvectot is to query and summarize information about the set of occurred words and about its frequency... And JSONB is the natural choice (!) to represent tsvectot datatype for these "querying applications"... So,
There are a simple workaround to cast tsvector into JSONB?
Example: counting global frequency of words of a cached tsvectot's, will be something like this query
SELECT r.key as word, SUM(r.value) as occurrences
FROM (
SELECT jsonb_each(kx_tsvectot::jsonb) as r FROM terms
) t
GROUP BY 1;

You can use ts_stat() function, which will give you exactly what you need
word text — the value of a lexeme
ndoc integer — number of documents (tsvectors) the word occurred in
nentry integer — total number of occurrences of the word
Example may be the following:
CREATE TABLE t (
tsv TSVECTOR
);
INSERT INTO t VALUES
('word'::TSVECTOR),
('second word'::TSVECTOR),
('third word'::TSVECTOR);
SELECT * FROM
ts_stat('SELECT tsv FROM t');
Result:
word | ndoc | nentry
--------+------+--------
word | 3 | 3
third | 1 | 1
second | 1 | 1
(3 rows)
If you still want to convert it to jsonb you can use cast word from text to jsonb.

Related

Postgresql - Get MAX Numeric Value on Character Varying Column

I have a column in a Postgresql table that is unique and character varying(10) type. The table contains old alpha-numeric values that I need to keep. Every time a new row is created from this point forward, I want it to be numeric only. I would like to get the max numeric-only value from this table for this column then create a new row with that max value incremented by 1.
Is there a way to query this table for the max numeric value only for this column?
For example, if this column currently has the values:
1111
A1111A
1234
1234A
3331
B3332
C-3333
33-D33
3**333*
Is there a query that will return 3333, AKA cut out all the non-numeric characters from the values and then perform a MAX() on them?

Not precisely what you asking, but something that I think will work better for you.
To go over all the columns, convert each to numbers, and then cast it to integer & return max.:
SELECT MAX(regexp_replace(my_column, '[^0-9]', '', 'g')::int) FROM public.foobar;
This gets you your max value... say 2999.
Now, going forward, consider making the default for your column a serial-like value, and convert it to text... that way you set the "MAX" once, and then let postgres do all the work for future values.
-- create simple integer sequence
CREATE SEQUENCE public.foobar_my_column_seq
INCREMENT 1
MINVALUE 1
MAXVALUE 9223372036854775807
START 1
CACHE 0;
-- use new sequence as default value for column __and__ convert to text
ALTER TABLE foobar
ALTER COLUMN my_column
SET DEFAULT nextval('publc.foobar_my_column_seq'::regclass)::text;
-- initialize "next value" of sequence to whatever is larger than
-- what you already have in your data ... say 3000:
ALTER SEQUENCE public.foobar_my_column_seq RESTART WITH 3000;
Because you're simply setting default, you don't change your current alpha-numeric values.

I figured it out. The following query works.
select text_value, regexp_replace(text_value, '[^0-9]+', '') as new_value from the_table;
Result:
text_value | new_value
-----------------------+-------------
4*215474 | 4215474
740024 | 740024
4*100535 | 4100535
42356 | 42356
CASH |
4*215474 | 4215474
740025 | 740025
740026 | 740026
4*5089655798 | 45089655798
4*15680 | 415680
4*224034 | 4224034
4*265718708 | 4265718708

PostgreSQL convert varchar to numeric and get average

I have a column that I want to get an average of, the column is varchar(200). I keep getting this error. How do I convert the column to numeric and get an average of it.
Values in the column look like
16,000.00
15,000.00
16,000.00 etc
When I execute
select CAST((COALESCE( bonus,'0')) AS numeric)
from tableone
... I get
ERROR: invalid input syntax for type numeric:

The standard way to represent (as text) a numeric in SQL is something like:
16000.00
15000.00
16000.00
So, your commas in the text are hurting you.
The most sensible way to solve this problem would be to store the data just as a numeric instead of using a string (text, varchar, character) type, as already suggested by a_horse_with_no_name.
However, assuming this is done for a good reason, such as you inherited a design you cannot change, one possibility is to get rid of all the characters which are not a (minus sign, digit, period) before casting to numeric:
Let's assume this is your input data
CREATE TABLE tableone
(
bonus text
) ;
INSERT INTO tableone(bonus)
VALUES
('16,000.00'),
('15,000.00'),
('16,000.00'),
('something strange 25'),
('why do you actually use a "text" column if you could just define it as numeric(15,0)?'),
(NULL) ;
You can remove all the straneous chars with a regexp_replace and the proper regular expression ([^-0-9.]), and do it globally:
SELECT
CAST(
COALESCE(
NULLIF(
regexp_replace(bonus, '[^-0-9.]+', '', 'g'),
''),
'0')
AS numeric)
FROM
tableone ;
| coalesce |
| -------: |
| 16000.00 |
| 15000.00 |
| 16000.00 |
| 25 |
| 150 |
| 0 |
See what happens to the 15,0 (this may NOT be what you want).
Check everything at dbfiddle here

I'm going to go out on a limb and say that it might be because you have Empty strings rather than nulls in your column; this would result in the error you are seeing. Try wrapping the column name in a nullif:
SELECT CAST(coalesce(NULLIF(bonus, ''), '0') AS integer) as new_field
But I would really question your schema that you have numeric values stored in a varchar column...

Is it possible in PL/pgSQL to evaluate a string as an expression, not a statement?

I have two database tables:
# \d table_1
Table "public.table_1"
Column | Type | Modifiers
------------+---------+-----------
id | integer |
value | integer |
date_one | date |
date_two | date |
date_three | date |
# \d table_2
Table "public.table_2"
Column | Type | Modifiers
------------+---------+-----------
id | integer |
table_1_id | integer |
selector | text |
The values in table_2.selector can be one of one, two, or three, and are used to select one of the date columns in table_1.
My first implementation used a CASE:
SELECT value
FROM table_1
INNER JOIN table_2 ON table_2.table_1_id = table_1.id
WHERE CASE table_2.selector
WHEN 'one' THEN
table_1.date_one
WHEN 'two' THEN
table_1.date_two
WHEN 'three' THEN
table_1.date_three
ELSE
table_1.date_one
END BETWEEN ? AND ?
The values for selector are such that I could identify the column of interest as eval(date_#{table_2.selector}), if PL/pgSQL allows evaluation of strings as expressions.
The closest I've been able to find is EXECUTE string, which evaluates entire statements. Is there a way to evaluate expressions?

In the plpgsql function you can dynamically create any expression. This does not apply, however, in the case you described. The query must be explicitly defined before it is executed, while the choice of the field occurs while the query is executed.
Your query is the best approach. You may try to use a function, but it will not bring any benefits as the essence of the issue will remain unchanged.

Search inside full search column using certain letters

I want to search inside a full search column using certain letters, I mean:
select "Name","Country","_score" from datatable where match("Country", 'China');
Returns many rows and is ok. My question is, how can I search for example:
select "Name","Country","_score" from datatable where match("Country", 'Ch');
I want to see, China, Chile, etc.
I think that match_type phrase_prefix can be the answer, but I don't know how I can use (correct syntax).

The match predicate supports different types by use of using match_type [with (match_parameter = [value])].
So in your example using the phrase_prefix match type:
select "Name","Country","_score" from datatable where match("Country", 'Ch') using phrase_prefix;
gives you your desired results.
See the match predicate documentation: https://crate.io/docs/en/latest/sql/fulltext.html?#match-predicate

If you just need to match the beginning of a string column, you don't need a fulltext analyzed column. You can use the LIKE operator instead, e.g.:
cr> create table names_table (name string, country string);
CREATE OK (0.840 sec)
cr> insert into names_table (name, country) values ('foo', 'China'), ('bar','Chile'), ('foobar', 'Austria');
INSERT OK, 3 rows affected (0.049 sec)
cr> select * from names_table where country like 'Ch%';
+---------+------+
| country | name |
+---------+------+
| Chile | bar |
| China | foo |
+---------+------+
SELECT 2 rows in set (0.037 sec)

Function in Postgres to convert a varchar to a big integer

I have a varchar column in Postgres 8.3 that holds values like: '0100011101111000'
I need a function that would consider that string to be a number in base 2 and spits out the numeric in base 10. Makes sense?
So, for instance:
'000001' -> 1.0
'000010' -> 2.0
'000011' -> 3.0
Thanks!

Cast to a bit string then to an integer.
An example:
'1110'::bit(4)::integer -> 14
Though you had varying length examples, and were after bigint, so instead use bit(64) and pad the input with zeroes using the lpad function.
lpad('0100011101111000',64,'0')::bit(64)::bigint
Here's a complete example...
create temp table examples (val varchar(64));
insert into examples values('0100011101111000');
insert into examples values('000001');
insert into examples values('000010');
insert into examples values('000011');
select val,lpad(val,64,'0')::bit(64)::bigint as result from examples;
The result of the select is:
val | result
------------------+--------
0100011101111000 | 18296
000001 | 1
000010 | 2
000011 | 3
(4 rows)