Display float column with leading sign - postgresql

I am using PHP with PostgreSQL. I have the following query:
SELECT ra, de, concat(ra, de) AS com, count(*) OVER() AS total_rows
FROM mdust
LIMIT :pagesize OFFSET :starts
The columns ra and de are floats, and de can be positive or negative. However, de is not returned with a leading + sign; it does return the - sign when negative. What I want is for de within concat(ra, de) to carry its positive or negative sign.
I was looking at the documentation for PostgreSQL, which provides to_char(1, 'S9'). That is exactly what I want, but it only works for integers. I was unable to find such a function for floats.

to_char() works for float as well. You just have to define the desired output format. The simple pattern S9 would truncate fractional digits and fail for numbers greater than 9.
test=> SELECT to_char(float '0.123' , 'FMS9999990.099999')
test-> , to_char(float '123' , 'FMS9999990.099999')
test-> , to_char(float '123.123', 'FMS9999990.099999');
to_char | to_char | to_char
---------+---------+----------
+0.123 | +123.0 | +123.123
(1 row)
The added FM modifier stands for "fill mode" and suppresses insignificant trailing zeroes (unless forced by a 0 symbol instead of 9) and padding blanks.
Add as many 9s before and after the decimal point as you want to allow that many digits.
You can tailor the desired output format pretty much any way you want. Details are in the manual here.
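Applied to the query from the question, it might look like this (the format pattern is just the one from the example above; widen it as needed for your data):
SELECT ra, de
     , concat(ra, to_char(de, 'FMS9999990.099999')) AS com
     , count(*) OVER () AS total_rows
FROM   mdust
LIMIT  :pagesize OFFSET :starts;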
Aside: There are more efficient solutions for paging than LIMIT :pagesize OFFSET :starts:
Optimize query with OFFSET on large table
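For example, keyset pagination avoids reading and discarding all the offset rows. A rough sketch, assuming mdust has an indexed id column (not shown in the question):
SELECT ra, de, concat(ra, to_char(de, 'FMS9999990.099999')) AS com
FROM   mdust
WHERE  id > :last_seen_id   -- highest id from the previous page
ORDER  BY id
LIMIT  :pagesize;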

Related

Postgres large numeric value operations

I am trying some operations on large numeric values, such as 2^89.
The Postgres numeric data type can store up to 131072 digits to the left of the decimal point and 16383 digits to the right.
I tried something like this and it worked:
select 0.037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037::numeric;
But when I apply an operator, it rounds off the value after 14 digits.
select (2^89)::numeric(40,0);
numeric
-----------------------------
618970019642690000000000000
(1 row)
I know the value from elsewhere is:
>>> 2**89
618970019642690137449562112
Why this strange behavior? It is not letting me enter values beyond 14 digits of precision into the database.
insert into x select (2^89-1)::numeric;
select * from x;
x
-----------------------------
618970019642690000000000000
(1 row)
Is there any way to circumvent this?
Thanks in advance.
You should cast not the result but one operand of the operation, to make clear that this should be an exact numeric operation:
select (2^89::numeric);
Otherwise PostgreSQL takes the 2 and the 89 as type integer and, since there is no integer ^ operator, computes the power in double precision. A double precision value is only accurate to about 15 significant digits, and your cast is applied to that already inaccurate result, so it cannot recover the lost digits.
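A quick before-and-after sketch (the exact value is the one quoted in the question; the numeric result may be displayed with additional trailing zero scale):
select (2^89)::numeric(40,0);        -- 618970019642690000000000000 (rounded in double precision)
select (2^89::numeric);              -- 618970019642690137449562112 (exact numeric power)
insert into x select (2^89::numeric) - 1;   -- stays exact as well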

Alphanumeric Sorting in PostgreSQL

I have this table with a character varying column in Postgres 9.6:
id | column
------------
1 |IR ABC-1
2 |IR ABC-2
3 |IR ABC-10
I see some solutions typecasting the column as bytea.
select * from table order by column::bytea.
But it always results in:
id | column
------------
1 |IR ABC-1
2 |IR ABC-10
3 |IR ABC-2
I don't know why '10' always comes before '2'. How do I sort this table, assuming the basis for ordering is the last whole number of the string, regardless of what the character before that number is?
When sorting character data types, collation rules apply - unless you work with locale "C", which sorts characters by their byte values. Applying collation rules may or may not be desirable. It makes sorting more expensive in any case. If you want to sort without collation rules, don't cast to bytea; use COLLATE "C" instead:
SELECT * FROM table ORDER BY column COLLATE "C";
However, this does not yet solve the problem with numbers in the string you mention. Split the string and sort the numeric part as number.
SELECT *
FROM table
ORDER BY split_part(column, '-', 2)::numeric;
Or, if all your numbers fit into bigint or even integer, use that instead (cheaper).
I ignored the leading part because you write:
... the basis for ordering is the last whole number of the string, regardless of what the character before that number is.
Related:
Alphanumeric sorting with PostgreSQL
Split comma separated column data into additional columns
What is the impact of LC_CTYPE on a PostgreSQL database?
Typically, it's best to save distinct parts of a string in separate columns as proper respective data types to avoid any such confusion.
And if the leading string is identical for all columns, consider just dropping the redundant noise. You can always use a VIEW to prepend a string for display, or do it on-the-fly, cheaply.
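A minimal sketch of that idea, assuming the number is stored on its own in an integer column (the items table and the item_no and label names are made up):
create table items (id serial primary key, item_no integer not null);

create view items_display as
select id, 'IR ABC-' || item_no as label   -- prepend the constant prefix for display only
from   items
order  by item_no;                         -- plain numeric sort, no parsing needed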
As suggested in the comments, split the string and cast the integer part:
select *
from table
cross join lateral regexp_split_to_array(column, '-') as r(a)
order by a[1], a[2]::integer;

Conversion does not give the decimal places when the result is a whole number in postgres

I am new to Postgres, presently migrating from SQL Server to Postgres, and facing some problems. Kindly help me with this.
I am not able to get decimal places whenever the answer is a whole number: the decimal conversion returns only the integer part.
For example: if the result is 48, the decimal conversion gives 48, whereas I want 48.00.
You can start by using numeric(4,2) - i.e. with an explicit scale - instead of plain decimal, e.g.:
t=# select 48::numeric(4,2);
numeric
---------
48.00
(1 row)
or even:
t=# select 48*1.00;
?column?
----------
48.00
(1 row)
But keep in mind that the fact that you don't see zeroes does not mean the number is not decimal. E.g. here it is still a float:
t=# select 48::float;
float8
--------
48
(1 row)
You can round the value by using the statement
select round(48,2);
It will return 48.00. You can also round to more decimal places.
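If the 48 is itself the result of a computation, note also that dividing two integers in Postgres truncates the fraction, so cast an operand (or fix the scale) before the division. A small sketch:
select 96 / 2;                       -- 48     (integer division)
select (96 / 2)::numeric(4,2);       -- 48.00
select round(96::numeric / 2, 2);    -- 48.00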

PostgreSQL - making ts_rank take the ts_vector position as-is or defining a custom ts_rank function

I'm performing weighted search on a series of items in an e-commerce platform. The problem I have is ts_rank is giving me the exact same value for different combinations of words, even if the ts_vector gives different positions for each set of words.
Let me illustrate this with an example:
If I give ts_vector the word camas, it gives me the following:
'cam':1
If I give ts_vector the word sofas camas, it gives me the following:
'cam':2 'sof':1
So camas is getting different positions depending on the words combination.
When I execute the following statement:
select ts_rank(to_tsvector('camas'),to_tsquery('spanish','cama'));
PostgreSQL gives me 0.0607927 as the ts_rank computed value, whereas the computed value for the following statement:
select ts_rank(to_tsvector('sofas camas'),to_tsquery('spanish','cama'));
is the same value: 0.0607927.
How can this be?
The question I have in mind is the following: is there a way for ts_rank to consider the position for the words contained in the ts_vector structure as-is or is there a way to define a custom ts_rank function for me to take the position for the words as explained?
Any help would be greatly appreciated.
As the documentation says about the functions ts_rank and ts_rank_cd:
they consider how often the query terms appear in the document, how close together the terms are in the document, and how important is the part of the document where they occur
That is, these functions ignore words that are not part of the query when computing the rank. For example, you can get different results for these queries:
postgres=# select ts_rank(to_tsvector('spanish', 'famoso sofas camas'),to_tsquery('spanish','famoso & cama'));
ts_rank
-----------
0.0985009
(1 row)
postgres=# select ts_rank(to_tsvector('spanish', 'famoso camas'),to_tsquery('spanish','famoso & cama'));
ts_rank
-----------
0.0991032
(1 row)
postgres=# select ts_rank(to_tsvector('spanish', 'sofas camas camas'),to_tsquery('spanish','cama'));
ts_rank
-----------
0.0759909
(1 row)
The documentation also says:
Different applications might require additional information for ranking, e.g., document modification time. The built-in ranking functions are only examples. You can write your own ranking functions and/or combine their results with additional factors to fit your specific needs.
You can get the PostgreSQL source code from GitHub; the function you need is ts_rank_tt.
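Alternatively, the combination with additional factors can be done directly in SQL, without writing any C. A rough sketch (the items table, its tsv column and the sales_count boost are hypothetical):
select id,
       ts_rank(tsv, query) * (1 + log(1 + sales_count)) as score
from   items, to_tsquery('spanish', 'cama') as query
where  tsv @@ query
order  by score desc;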
You can also change the normalization options to take the document length into account, which is ignored by default.
For example, if you add 1 as the third parameter, it divides the rank by 1 + the logarithm of the document length. With your example:
postgres=# select ts_rank(to_tsvector('spanish', 'camas'),to_tsquery('spanish','camas'), 1);
ts_rank
-----------
0.0607927
(1 row)
postgres=# select ts_rank(to_tsvector('spanish', 'sofas camas'),to_tsquery('spanish','camas'), 1);
ts_rank
-----------
0.0383559
(1 row)
Documentation: https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-RANKING

Total number of "1s" in a Postgres bitmask

Is there a way to get the total number of 1's in a Postgres "bit string" type?
# select length(replace(x::text, '0', '')) from ( values ('1010111101'::bit varying) ) as something(x);
length
--------
7
(1 row)
And an approach without string conversion:
# select count(*)
  from (
      select x, generate_series(1, length(x)) as i
      from ( values ('1010111101'::bit varying) ) as something(x)
  ) as q
  where substring(x, i, 1) = B'1';
count
-------
7
(1 row)
If you need it to be really efficient, here's a discussion: Efficiently determining the number of bits set in the contents of a VARBIT field
I know this is already an old topic, but I found a cool answer here: https://stackoverflow.com/a/38971017/4420662
So adapted it would be:
=# select length(regexp_replace((B'1010111101')::text, '[^1]', '', 'g'));
length
--------
7
(1 row)
Based on the discussion here, the above-mentioned thread Efficiently determining the number of bits set in the contents of a VARBIT field, and the Bit Twiddling Hacks page, I published a PostgreSQL extension: pg_bitcount.
If you install that extension (see instructions there), you can count the number of bits set in a bitstring using:
-- Register the extension in PostgreSQL
create extension pg_bitcount;
-- Use the pg_bitcount function
select public.pg_bitcount(127::bit(8));
select public.pg_bitcount(B'101010101');
I compared a number of different algorithms for performance and using a table lookup seems to be the fastest. They are all much faster than converting to text and replacing '0' with ''.
You have a simple way using plpgsql here.
The one / first bit? Or the total number of bits flipped on? The former: bit-mask it with & 1. The latter: a nasty query (for an integer bitmask), like:
SELECT ((myBit & 1) + ((myBit >> 1) & 1) + ((myBit >> 2) & 1)) AS bitCount FROM myBitTable;
I suppose you could also cast to a string and count the 1's in PL/pgSQL.
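A minimal sketch of that loop-and-count idea in PL/pgSQL (the function name is made up):
create or replace function bit_count_loop(b bit varying)
returns integer
language plpgsql
immutable
as $$
declare
    n integer := 0;
begin
    -- walk the bit string and count the positions holding a 1
    for i in 1 .. length(b) loop
        if substring(b from i for 1) = B'1' then
            n := n + 1;
        end if;
    end loop;
    return n;
end;
$$;

select bit_count_loop(B'1010111101');   -- 7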