So i learned in my Sql course last week how to turn a string into an integer. the table we used for this was timezone based. so it was '-5' hours offset.
in order to do this we had to cast the string to a DECIMAL and then to an SMALLINT. It was pretty simple once I knew that , thats not where my question lies.
what im curious about is why a SMALLlNT wouldnt take a negative sign but A Decimal could do it. according to the specs a SMALLINT still can go to -32768. so does anyone know if this persists in all coding languages or is it just SQL specific? As well as what wont allow it to cast
I don't see why you would bother doing any casting to begin with? According to the documentation (see table 1/3 down the page), T-SQL supports implicit conversion between varchar and smallint.
DECLARE #negative_varchar VARCHAR(10) = '-5'
DECLARE #negative_smallint SMALLINT = CONVERT(SMALLINT, #negative_varchar)
DECLARE #negative_smallint_implicit SMALLINT = #negative_varchar
SELECT #negative_varchar, #negative_smallint, #negative_smallint_implicit
Produces
---------- ------ ------
-5 -5 -5
no this is what im saying (declare #s as nvarchar
#s=-5
(50) = CAST (#s AS SMALL INT)
again you have to cast from decimal then small int. im asking about the underlying code behind the process.
such as does one of the bytes hold whether a number is positive or negative etc...
I was looking for a way to create a table with unsigned integer (I know I will have only positive integers, so why not to increase the range twofold). To create an integer field, I do:
create table funny_table(
my_field bigint
);
So I thought that using my_field bigint unsigned will solve my problem, but syntax error tells me otherwise. Looking though the documentation tells nothing about unsigned integers. Is it even possible?
Unfortunately Amazon Redshift doesn't support unsigned integer. As a workaround, we are using numeric(20,0) for bigint unsigned data. Here is an example.
create table funny_table(
my_field numeric(20, 0)
);
insert into funny_table values ( 18446744073709551614 );
select * from funny_table;
my_field
----------------------
18446744073709551614
(1 row)
See here for the details of Numeric type.
As already mentioned, Redshift does not support unsigned. Given that, please take a closer at what you need to achieve.
bigint occupies 8 bytes giving you a range of -9223372036854775808 to 9223372036854775807
numeric occupies 128-bit (Variable, up to 128 bits) but offers a bigger range at the expense of memory.
I believe the idea behind using unsigned is doubling the range WITHOUT EXTRA EXPENSE TO STORAGE. So if you are comfortable with the highest positive value of 2^63 - 1 go with bigint and forget about the unsigned because it costs 8 bytes anyway.
If you have bigger positive integers go with numeric(20, 0) (or higher precision) but you need to be aware that it is still signed and occupies more than 8 bytes.
For anyone that is running into the issue that they cant copy the unsigned int value from the source table:
You have to change the tupple mapping from previously:
("id", "int", "id", "int")
to
("id", "id", "decimal(20,0)")
We always had null values in our redshift cluster for the id. The change in the mapping changed this behaviour, which resulted in correctly copying the value from the source table.
Bigint in postgresql is 8 byte integer. which is has half the range as uint64 (as one bit is used to sign the integer)
I need to do a lot of aggregation on the column and I am under the impression that aggregation on NUMERIC type is slow in comparison to integer types.
How should I optimize my storage in this case?
Unless you have a concrete reason, just use NUMERIC. It is slower, quite a lot slower, but that might not matter as much as you think.
You don't really have any alternative, as PostgreSQL doesn't support unsigned 64-bit integers at the SQL level. You could add a new datatype as an extension module, but it'd be a lot of work.
You could shove the unsigned 64-bit int bitwise into a 64-bit signed int, so values above maxuint64/2 are negative. But that'll be totally broken for aggregation, and would generally be horribly ugly.
sum() will return numeric if the input is bigint so it will not overflow
select sum(a)
from (values (9223372036854775807::bigint), (9223372036854775807)) s(a)
;
sum
----------------------
18446744073709551614
http://www.postgresql.org/docs/current/static/functions-aggregate.html
There is also an extension to provide an additional uint64 datatype in postgresql. See Github
It is by Peter Eisentraut
I'm working with postgresql-9.1 recently.
For some reason I have to use a tech which does not support data type numeric but decimal. Unfortunately, the data type of columns which I've assigned decimal to them in my Postgresql are always numeric. I tried to alter the type, but it did not work though I've got the messages just like "Query returned successfully with no result in 12 ms".
SO, I want to know how can I get the columns to be decimal.
Any help will be highly appreciate.
e.g.
My creating clauses:
CREATE TABLE IF NOT EXISTS htest
(
dsizemin decimal(8,3) NOT NULL,
dsizemax decimal(8,3) NOT NULL,
hidentifier character varying(10) NOT NULL,
tgrade character varying(10) NOT NULL,
fdvalue decimal(8,3),
CONSTRAINT htest_pkey PRIMARY KEY (dsizemin , dsizemax , hidentifier , tgrade )
);
My altering clauses:
ALTER TABLE htest
ALTER COLUMN dsizemin TYPE decimal(8,3);
But it does not work.
In PostgreSQL, "decimal" is an alias for "numeric" which poses some problems when your app thinks it expects a type called "decimal" from the database. As Craig noted above, you can't even create a domain called "decimal"
There is no good workaround in the database side. The only thing you can do is change the application to expect a numeric data type back.
Use Numeric (precision, scale) to store decimals
precision represents the total number of expected digits on either side of the decimal point. scale is the number decimals you wish to store.
This Numeric (5,5) would imply you only want numbers less than 1 (negative or positive) with 5 decimal points. Debug, it may be Numeric (6,5) if the postgre sql errors out because it things the leading 0 is a decimal.
0.12345 would be an example of the above.
1.12345 would need a field Numeric (6,5)
100.12345 would need a field Numeric (8,5)
-100.12345 would need a field Numeric (8,5)
When you write a select statement to see the decimals, it rounds to 2; but if you do something like Select 100 * [field] from [table], then extra decimals should start appearing....
I'd like to make a random string for use in session verification using PostgreSQL. I know I can get a random number with SELECT random(), so I tried SELECT md5(random()), but that doesn't work. How can I do this?
You can fix your initial attempt like this:
SELECT md5(random()::text);
Much simpler than some of the other suggestions. :-)
I'd suggest this simple solution:
This is a quite simple function that returns a random string of the given length:
Create or replace function random_string(length integer) returns text as
$$
declare
chars text[] := '{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}';
result text := '';
i integer := 0;
begin
if length < 0 then
raise exception 'Given length cannot be less than 0';
end if;
for i in 1..length loop
result := result || chars[1+random()*(array_length(chars, 1)-1)];
end loop;
return result;
end;
$$ language plpgsql;
And the usage:
select random_string(15);
Example output:
select random_string(15) from generate_series(1,15);
random_string
-----------------
5emZKMYUB9C2vT6
3i4JfnKraWduR0J
R5xEfIZEllNynJR
tMAxfql0iMWMIxM
aPSYd7pDLcyibl2
3fPDd54P5llb84Z
VeywDb53oQfn9GZ
BJGaXtfaIkN4NV8
w1mvxzX33NTiBby
knI1Opt4QDonHCJ
P9KC5IBcLE0owBQ
vvEEwc4qfV4VJLg
ckpwwuG8YbMYQJi
rFf6TchXTO3XsLs
axdQvaLBitm6SDP
(15 rows)
You can get 128 bits of random from a UUID. This is the method to get the job done in modern PostgreSQL.
CREATE EXTENSION pgcrypto;
SELECT gen_random_uuid();
gen_random_uuid
--------------------------------------
202ed325-b8b1-477f-8494-02475973a28f
May be worth reading the docs on UUID too
The data type uuid stores Universally Unique Identifiers (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. (Some systems refer to this data type as a globally unique identifier, or GUID, instead.) This identifier is a 128-bit quantity that is generated by an algorithm chosen to make it very unlikely that the same identifier will be generated by anyone else in the known universe using the same algorithm. Therefore, for distributed systems, these identifiers provide a better uniqueness guarantee than sequence generators, which are only unique within a single database.
How rare is a collision with UUID, or guessable? Assuming they're random,
About 100 trillion version 4 UUIDs would need to be generated to have a 1 in a billion chance of a single duplicate ("collision"). The chance of one collision rises to 50% only after 261 UUIDs (2.3 x 10^18 or 2.3 quintillion) have been generated. Relating these numbers to databases, and considering the issue of whether the probability of a Version 4 UUID collision is negligible, consider a file containing 2.3 quintillion Version 4 UUIDs, with a 50% chance of containing one UUID collision. It would be 36 exabytes in size, assuming no other data or overhead, thousands of times larger than the largest databases currently in existence, which are on the order of petabytes. At the rate of 1 billion UUIDs generated per second, it would take 73 years to generate the UUIDs for the file. It would also require about 3.6 million 10-terabyte hard drives or tape cartridges to store it, assuming no backups or redundancy. Reading the file at a typical "disk-to-buffer" transfer rate of 1 gigabit per second would require over 3000 years for a single processor. Since the unrecoverable read error rate of drives is 1 bit per 1018 bits read, at best, while the file would contain about 1020 bits, just reading the file once from end to end would result, at least, in about 100 times more mis-read UUIDs than duplicates. Storage, network, power, and other hardware and software errors would undoubtedly be thousands of times more frequent than UUID duplication problems.
source: wikipedia
In summary,
UUID is standardized.
gen_random_uuid() is 128 bits of random stored in 128 bits (2**128 combinations). 0-waste.
random() only generates 52 bits of random in PostgreSQL (2**52 combinations).
md5() stored as UUID is 128 bits, but it can only be as random as its input (52 bits if using random())
md5() stored as text is 288 bits, but it only can only be as random as its input (52 bits if using random()) - over twice the size of a UUID and a fraction of the randomness)
md5() as a hash, can be so optimized that it doesn't effectively do much.
UUID is highly efficient for storage: PostgreSQL provides a type that is exactly 128 bits. Unlike text and varchar, etc which store as a varlena which has overhead for the length of the string.
PostgreSQL nifty UUID comes with some default operators, castings, and features.
Building on Marcin's solution, you could do this to use an arbitrary alphabet (in this case, all 62 ASCII alphanumeric characters):
SELECT array_to_string(array
(
select substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', trunc(random() * 62)::integer + 1, 1)
FROM generate_series(1, 12)), '');
Please use string_agg!
SELECT string_agg (substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', ceil (random() * 62)::integer, 1), '')
FROM generate_series(1, 45);
I'm using this with MD5 to generate a UUID also. I just want a random value with more bits than a random () integer.
I was playing with PostgreSQL recently, and I think I've found a little better solution, using only built-in PostgreSQL methods - no pl/pgsql. The only limitation is it currently generates only UPCASE strings, or numbers, or lower case strings.
template1=> SELECT array_to_string(ARRAY(SELECT chr((65 + round(random() * 25)) :: integer) FROM generate_series(1,12)), '');
array_to_string
-----------------
TFBEGODDVTDM
template1=> SELECT array_to_string(ARRAY(SELECT chr((48 + round(random() * 9)) :: integer) FROM generate_series(1,12)), '');
array_to_string
-----------------
868778103681
The second argument to the generate_series method dictates the length of the string.
While not active by default, you could activate one of the core extensions:
CREATE EXTENSION IF NOT EXISTS pgcrypto;
Then your statement becomes a simple call to gen_salt() which generates a random string:
select gen_salt('md5') from generate_series(1,4);
gen_salt
-----------
$1$M.QRlF4U
$1$cv7bNJDM
$1$av34779p
$1$ZQkrCXHD
The leading number is a hash identifier. Several algorithms are available each with their own identifier:
md5: $1$
bf: $2a$06$
des: no identifier
xdes: _J9..
More information on extensions:
pgCrypto: http://www.postgresql.org/docs/9.2/static/pgcrypto.html
Included Extensions: http://www.postgresql.org/docs/9.2/static/contrib.html
EDIT
As indicated by Evan Carrol, as of v9.4 you can use gen_random_uuid()
http://www.postgresql.org/docs/9.4/static/pgcrypto.html
#Kavius recommended using pgcrypto, but instead of gen_salt, what about gen_random_bytes? And how about sha512 instead of md5?
create extension if not exists pgcrypto;
select digest(gen_random_bytes(1024), 'sha512');
Docs:
F.25.5. Random-Data Functions
gen_random_bytes(count integer) returns bytea
Returns count cryptographically strong random bytes. At most 1024
bytes can be extracted at a time. This is to avoid draining the
randomness generator pool.
The INTEGER parameter defines the length of the string. Guaranteed to cover all 62 alphanum characters with equal probability (unlike some other solutions floating around on the Internet).
CREATE OR REPLACE FUNCTION random_string(INTEGER)
RETURNS TEXT AS
$BODY$
SELECT array_to_string(
ARRAY (
SELECT substring(
'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
FROM (ceil(random()*62))::int FOR 1
)
FROM generate_series(1, $1)
),
''
)
$BODY$
LANGUAGE sql VOLATILE;
I do not think that you are looking for a random string per se. What you would need for session verification is a string that is guaranteed to be unique. Do you store session verification information for auditing? In that case you need the string to be unique between sessions. I know of two, rather simple approaches:
Use a sequence. Good for use on a single database.
Use an UUID. Universally unique, so good on distributed environments too.
UUIDs are guaranteed to be unique by virtue of their algorithm for generation; effectively it is extremely unlikely that you will generate two identical numbers on any machine, at any time, ever (note that this is much stronger than on random strings, which have a far smaller periodicity than UUIDs).
You need to load the uuid-ossp extension to use UUIDs. Once installed, call any of the available uuid_generate_vXXX() functions in your SELECT, INSERT or UPDATE calls. The uuid type is a 16-byte numeral, but it also has a string representation.
create extension if not exists pgcrypto;
then
SELECT encode(gen_random_bytes(20),'base64')
or even
SELECT encode(gen_random_bytes(20),'hex')
This is for 20 bytes = 160 bits of randomness (as long as sha1 for example).
select * from md5(to_char(random(), '0.9999999999999999'));
select encode(decode(md5(random()::text), 'hex')||decode(md5(random()::text), 'hex'), 'base64')