The query below returns two different lengths for what look like the same numbers, in IBM Db2 SQL.
Why are there two different lengths for the same value?
select
decimal(TRIM(cast(15 as char(2)))||TRIM(LPAD(cast(7 as char(2)),2,'0'))||TRIM(LPAD(cast(13 as char(2)),2,'0'))),
length(decimal(TRIM(cast(15 as char(2)))||TRIM(LPAD(cast(7 as char(2)),2,'0'))||TRIM(LPAD(cast(13 as char(2)),2,'0')))),
decimal(TRIM(substr(replace(char(current_date -1 days,ISO),'-',''),3,6)),6,0),
length(decimal(TRIM(substr(replace(char(current_date -1 days,ISO),'-',''),3,6)),6,0))
from sysibm.sysdummy1
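These numbers are not the same value: the first one is 15713 and the second is 150713. In the first expression, cast(7 as char(2)) yields '7 ' (already two characters, with a trailing blank), so LPAD adds no leading zero and TRIM leaves just '7'.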
DECIMAL(x,p,s) returns a packed decimal value of precision p, scale s.
A packed decimal(p,s) only takes p/2 + 1 bytes of memory (using integer division).
So 6/2 + 1 = 4, which is the value LENGTH() returns for packed decimal expressions per the (DB2 for IBM i) manual.
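The byte count above can be checked with a short sketch. The function name `packed_decimal_bytes` is made up for illustration; it only encodes the p/2 + 1 rule stated in the answer and does not query Db2.

```python
def packed_decimal_bytes(precision: int) -> int:
    """Bytes used by a packed decimal of the given precision (p/2 + 1 rule)."""
    # Each digit takes half a byte and the sign takes another half byte,
    # so integer division by 2 plus 1 rounds up to whole bytes.
    return precision // 2 + 1


print(packed_decimal_bytes(6))  # 4, the LENGTH() reported for DECIMAL(..., 6, 0)
```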
How can I create a uint256 data type in Postgres? It looks like they only support up to 8 bytes for integers natively.
They offer decimal and numeric types with user-specified precision. For my app, the values are money, so I would assume I would use numeric over decimal, or does that not matter?
NUMERIC(precision, scale)
So would I use NUMERIC(78, 0)? (2^256 is 78 digits) Or do I need to do NUMERIC(155, 0) and force it to always be >= 0 (2^512, 155 digits, with the extra bit representing the sign)? OR should I be using decimal?
numeric(78,0) has a max value of 9.999... * 10^77 > 2^256 so that is sufficient.
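The digit count is easy to verify with arbitrary-precision integers. This is only a sanity check of the arithmetic in Python, not PostgreSQL code; the constant names are illustrative.

```python
UINT256_MAX = 2**256 - 1        # largest unsigned 256-bit value
NUMERIC_78_MAX = 10**78 - 1     # largest value with 78 digits and scale 0
NUMERIC_77_MAX = 10**77 - 1     # largest value with 77 digits and scale 0

print(len(str(UINT256_MAX)))          # 78
print(NUMERIC_78_MAX >= UINT256_MAX)  # True: 78 digits are enough
print(NUMERIC_77_MAX >= UINT256_MAX)  # False: 77 digits are not
```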
You can create a domain.
CREATE DOMAIN uint_256 AS NUMERIC NOT NULL
CHECK (VALUE >= 0 AND VALUE < 2::numeric ^ 256) -- cast so the power is computed in numeric, not double precision
CHECK (SCALE(VALUE) = 0)
This creates a reusable uint_256 datatype which is constrained to be within the 2^256 limit and also prevents rounding errors by only allowing the scale of the number to be 0 (i.e. throws an error with decimal values). There is nothing like NULL in Solidity so the datatype should not be nullable.
I have a 16-bit WORD and I want to read the status of a specific bit or several bits.
I've tried a method that divides the word by the bit that I want, converts the result to two values, an integer and a real, and compares the two. If they are not equal, then it equates to false. This appears to only work if I am looking for a bit that is the last 'TRUE' bit in the word. If there are any successive TRUE bits, it fails. Perhaps I just haven't done it right. I don't have the ability to use code, just basic math, boolean operations, and type conversion. Any ideas? I hope this isn't a dumb question but I have a feeling it is.
eg:
WORD 0010000100100100 = 9348
I want to know the value of bit 2. How can I determine it from 9348?
There are many ways, depending on what operations you can use. It appears you don't have much to choose from. But this should work, using just integer division and multiplication, and a test for equality.
(pseudocode):
x = 9348 (binary 0010000100100100, bit 0 = 0, bit 1 = 0, bit 2 = 1, ...)
x = x / 4 (now x is 1000010010010000)
y = (x / 2) * 2 (y is 0000010010010000)
if (x == y) {
(bit 2 must have been 0)
} else {
(bit 2 must have been 1)
}
Every time you divide by 2, you move the bits to the left one position (in your representation, which writes the least significant bit first). Every time you multiply by 2, you move the bits to the right one position. Odd numbers will have 1 in the least significant position. Even numbers will have 0 in the least significant position. If you divide an odd number by 2 in integer math, and then multiply by 2, you lose the odd bit if there was one. So the idea above is to first move the bit you want to know about into the least significant position. Then, divide by 2 and then multiply by 2. If the result is the same as what you had before, then there must have been a 0 in the bit you care about. If the result is not the same as what you had before, then there must have been a 1 in the bit you care about.
Having explained the idea, we can simplify to
((x / 8) * 2) <> (x / 4)
which will resolve to true if the bit was set, and false if the bit was not set.
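As a rough sketch of the same idea, here it is in Python using only integer division, multiplication and a comparison. The function name `bit_is_set` and the zero-based bit numbering are assumptions made for the example.

```python
def bit_is_set(word: int, n: int) -> bool:
    """Test bit n (counting from 0) without any bitwise operators."""
    place = 2 ** n                 # 4 for bit 2, 8 for bit 3, ...
    shifted = word // place        # bring bit n into the least significant position
    # Dividing by 2 and multiplying back drops the lowest bit if it was 1.
    return (shifted // 2) * 2 != shifted


print(bit_is_set(9348, 2))  # True
print(bit_is_set(9348, 3))  # False
```

For bit 2 this is exactly the simplified expression ((x / 8) * 2) <> (x / 4).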
AND the word with a mask [1].
In your example, you're interested in the second bit, so the mask (in binary) is
00000010. (Which is 2 in decimal.)
In binary, your word 9348 is 0010010010000100 [2]
0010010010000100 (your word)
AND 0000000000000010 (mask)
----------------
0000000000000000 (result of ANDing your word and the mask)
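Because the value is equal to zero, the bit is not set. If it were different from zero, the bit would be set. Note that this counts bits starting from 1. If you count from 0, as in the question's example, the mask for bit 2 is 0000000000000100 (4 in decimal), the AND gives 4, and so that bit is set.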
This technique works for extracting one bit at a time. You can however use it repeatedly with different masks if you're interested in extracting multiple bits.
[1] For more information on masking techniques see http://en.wikipedia.org/wiki/Mask_(computing)
[2] See http://www.binaryhexconverter.com/decimal-to-binary-converter
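A minimal sketch of the masking technique, assuming a language with a bitwise AND operator (Python's `&` here). The helper name `masked_bit` is hypothetical, and bits are numbered from 0.

```python
WORD = 9348  # 0010010010000100, most significant bit first


def masked_bit(word: int, n: int) -> bool:
    """Return True if bit n (counting from 0) is set, using an AND mask."""
    mask = 1 << n              # a word with only bit n set
    return (word & mask) != 0


print(WORD & 2)  # 0 -> the bit with value 2 is not set
print(WORD & 4)  # 4 -> the bit with value 4 is set

# Repeating the test with different masks extracts several bits.
print([n for n in range(16) if masked_bit(WORD, n)])  # [2, 7, 10, 13]
```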
The nth bit is equal to the word divided by 2^n mod 2
I think you'll have to test each bit, 0 through 15 inclusive.
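The formula can be tried out directly. This sketch assumes integer division and bits numbered from 0; `nth_bit` is just an illustrative name.

```python
def nth_bit(word: int, n: int) -> int:
    """Bit n of word: integer-divide by 2^n, then take the remainder mod 2."""
    return (word // 2 ** n) % 2


# Test each bit, 0 through 15 inclusive, and print them with bit 0 first.
bits = [nth_bit(9348, n) for n in range(16)]
print("".join(str(b) for b in bits))  # 0010000100100100
```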
You could try 9348 AND 4 (equivalent of 1<<2 - index of the bit you wanted)
9348 AND 4
should give 4 if bit is set, 0 if not.
So here is what I have come up with: 3 solutions. One is Hatchet's as proposed above, and his answer helped me immensely with actually understanding HOW this works, which is of utmost importance to me! The proposed AND masking solutions could have worked if my system supports bitwise operators, but it apparently does not.
Original technique:
( ( ( INT ( TAG / BIT ) ) / 2 ) - ( INT ( ( INT ( TAG / BIT ) ) / 2 ) ) <> 0 )
Explanation:
In the first part of the equation, integer division is performed on TAG/BIT, then REAL division by 2. In the second part, integer division is performed on TAG/BIT, then integer division again by 2. The difference between these two results is compared to 0. If the difference is not 0, then the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337 w/ integer division. Then 2337/2 = 1168.5 w/ REAL division but 1168 w/ integer division. 1168.5-1168 <> 0, so the result is TRUE.
My modified technique:
( INT ( TAG / BIT ) / 2 ) <> ( INT ( INT ( TAG / BIT ) / 2 ) )
Explanation:
Effectively the same as above, but instead of subtracting the two results and comparing them to 0, I am just comparing the two results themselves. If they are not equal, the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337 w/ integer division. Then 2337/2 = 1168.5 w/ REAL division but 1168 w/ integer division. 1168.5 <> 1168, so the result is TRUE.
Hatchet's technique as it applies to my system:
( INT ( TAG / BIT )) <> ( INT ( INT ( TAG / BIT ) / 2 ) * 2 )
Explanation:
In the first part of the equation, integer division is performed on TAG/BIT. In the second part, integer division is performed on TAG/BIT, then integer division again by 2, then multiplication by 2. The two results are compared. If they are not equal, the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337. Then 2337/2 = 1168 w/ integer division. Then 1168x2=2336. 2337 <> 2336 so the result is TRUE. As Hatchet stated, this method 'drops the odd bit'.
Note - 9348/4 = 2337 w/ both REAL and integer division, but it is important that these parts of the formula use integer division and not REAL division (12164/32 = 380 w/ integer division and 380.125 w/ REAL division).
I feel it important to note for any future readers that the BIT value in the equations above is not the bit number, but the actual value of the resulting decimal if the bit in the desired position was the only TRUE bit in the binary string (bit 2 = 4 (2^2), bit 6 = 64 (2^6)).
This explanation may be a bit too verbose for some, but may be perfect for others :)
Please feel free to comment/critique/correct me if necessary!
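For readers who want to try the three formulas, here is a rough Python transcription. It assumes that INT(...) truncates like Python's `int()` and that `/` is REAL division; the function names are made up for the example, and BIT is the bit's value (4, 32, ...), not its number.

```python
def original_technique(tag: int, bit: int) -> bool:
    q = int(tag / bit)                     # integer division of TAG / BIT
    return (q / 2) - int(q / 2) != 0       # REAL half minus integer half


def modified_technique(tag: int, bit: int) -> bool:
    q = int(tag / bit)
    return (q / 2) != int(q / 2)           # compare the two halves directly


def hatchet_technique(tag: int, bit: int) -> bool:
    q = int(tag / bit)
    return q != int(q / 2) * 2             # halving and doubling drops the odd bit


for technique in (original_technique, modified_technique, hatchet_technique):
    print(technique(9348, 4), technique(12164, 32))  # True False
```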
I just needed to resolve an integer status code to a bit state in order to interface with some hardware. Here's a method that works for me:
private bool resolveBitState(int value, int bitNumber)
{
return (value & (1 << (bitNumber - 1))) != 0;
}
I like it, because it's non-iterative, requires no cast operations and essentially translates directly to machine code operations like Shift, And and Comparison, which probably means it's really optimal.
To explain in a little more detail, I'm comparing the bitwise value to a mask for the bit I am interested in (value & mask) using an AND operation. If the bitwise AND operation result is zero, then the bit is not set (return false). If the AND operation result is not zero, then the bit is set (return true). The result of the AND operation is either zero or the value of the bit (1, 2, 4, 8, 16, 32...). Hence the boolean evaluation comparing the AND operation result and 0. The mask is created by taking the number 1 and shifting it left (bit wise), by the appropriate number of binary places (1 << n). The number of places is the number of the bit targeted minus 1. If it's bit #1, I want to shift the 1 left by 0 and if it's #2, I want to shift it left 1 place, etc.
I'm surprised no one rates my solution. I think it's the most logical and succinct... and works.
Sometimes there is a need to have multiple values in one variable or database field, even though that violates relational normalization principles. In Python and other languages that support lists, that's easy. In others it is not. See "insert multiple values in single attribute".
One common technique is to concatenate values into a comma-delimited string, "1,2,3" or "English,French,Spanish", and then extract the values by parsing.
When the valid values come from an enumerated list, is there another way that does not require parsing?
Yes. Use prime numbers, multiply them together, then factor them out.
The field type to use is integer or large integer
Use code values that are prime numbers
3 == English
5 == French
7 == Spanish
11 == Italian
Store the product of all that apply into the field.
21 == English and Spanish
385 == French, Spanish and Italian
Use modulo functions to determine which values are in the field
if ( field % 3 == 0 ) { english() ;}
if ! (field % 5) { french() ;}
=IF(NOT(MOD(A203,5)),"French","")
The same value can appear multiple times
9 == English, English
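A small sketch of the prime-product encoding, using the code values listed above. The helper names `encode`, `decode` and `multiplicity` are invented for illustration.

```python
CODES = {"English": 3, "French": 5, "Spanish": 7, "Italian": 11}


def encode(values):
    """Multiply together the prime codes of all values that apply."""
    product = 1
    for value in values:
        product *= CODES[value]
    return product


def decode(field):
    """List every value whose prime divides the stored field."""
    return [name for name, prime in CODES.items() if field % prime == 0]


def multiplicity(field, name):
    """Count how many times a value was stored (how often its prime divides)."""
    prime, count = CODES[name], 0
    while field % prime == 0:
        field //= prime
        count += 1
    return count


print(encode(["English", "Spanish"]))  # 21
print(decode(385))                     # ['French', 'Spanish', 'Italian']
print(multiplicity(9, "English"))      # 2
```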
I first used this technique to store dimensions.
3 == time
5 == length
7 == mass
11 == charge
13 == temperature
17 == moles
For example, a "first moment" lever-arm would have a dimension value of 35 == mass * length.
To store fractional dimensions in an integer, I multiplied fractional dimensions by the product of all of them and dealt with it in processing.
255255 == 3*5*7*11*13*17
force == mass * length / (second^2)
force == ( 7 * 5 / ( 3 * 3 ) ) * 255255 * 255255
force == 253381002875
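The arithmetic for force can be reproduced with exact fractions. This only re-derives the number shown above; it is not the original APL code, and the variable names are illustrative.

```python
from fractions import Fraction
from math import prod

DIMENSIONS = {"time": 3, "length": 5, "mass": 7,
              "charge": 11, "temperature": 13, "moles": 17}

SCALE = prod(DIMENSIONS.values())
print(SCALE)  # 255255

# force == mass * length / time^2, scaled twice so the result is an integer
force = Fraction(7 * 5, 3 * 3) * SCALE * SCALE
print(force)  # 253381002875
```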
The reason I used integers was to avoid dealing with invalid equality comparisons due to rounding errors.
Please do not ask for the code to extract the fractional dimensions. All this was 40 years ago in APL/360.
If you don't need to allow for multiples of the same value, then you could use a bit map. Depending on whether there are up to 8, 16, 32, 64, or 128 allowed values, they could fit in a 1, 2, 4, 8, or 16 byte integer.
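One way to sketch the bit map in Python is with `enum.IntFlag`, where each allowed value gets its own bit. The `Language` class and its members are assumptions for the example.

```python
from enum import IntFlag


class Language(IntFlag):
    ENGLISH = 1   # bit 0
    FRENCH = 2    # bit 1
    SPANISH = 4   # bit 2
    ITALIAN = 8   # bit 3


field = Language.ENGLISH | Language.SPANISH
stored = int(field)          # the plain integer you would store in the field
print(stored)                # 5

restored = Language(stored)  # rebuild the flags from the stored integer
print(Language.SPANISH in restored)  # True
print(Language.FRENCH in restored)   # False
```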
I am working with a table in a PostgreSQL database that has several boolean columns that determine some state (e.g. published, visible, etc.). I want to make a single status column that will store all these values as well as possible new ones in a form of a bitmask. Is there any difference between integer and bit(n) in this case?
This is going to be a rather big table, because it stores objects that users create via a web-interface. So I think I will have to use (partial) indexes for this column.
If you only have a few variables I would consider keeping separate boolean columns.
Indexing is easy, including indexes on expressions and partial indexes.
Conditions for queries are easy to write and read and meaningful.
A boolean column occupies 1 byte (no alignment padding). For only a few variables this occupies the least space.
Unlike other options boolean columns allow NULL values for individual bits if you should need that. You can always define columns NOT NULL if you don't.
If you have more than a handful of variables but no more than 32, an integer column may serve best. (Or a bigint for up to 64 variables.)
Occupies 4 bytes on disk (may require alignment padding, depending on preceding columns).
Very fast indexing for exact matches ( = operator).
Handling individual values may be slower / less convenient than with varbit or boolean.
With even more variables, or if you want to manipulate the values a lot, or if you don't have huge tables or disk space / RAM is not an issue, or if you are not sure what to pick, I would consider bit(n) or bit varying(n) (short: varbit(n)).
Occupies at least 5 bytes (or 8 for very long strings) plus 1 byte for each group of 8 bits (rounded up).
You can use bit string functions and operators directly, and some standard SQL functions as well.
For just 3 bits of information, individual boolean columns get by with 3 bytes, an integer needs 4 bytes (maybe additional alignment padding) and a bit string 6 bytes (5 + 1).
For 32 bits of information, an integer still needs 4 bytes (+ padding), a bit string occupies 9 bytes for the same (5 + 4) and boolean columns occupy 32 bytes.
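The size comparison above follows from simple arithmetic, sketched here in Python. It uses only the figures quoted in this answer (1 byte per boolean, 4 bytes per integer, 5 bytes of overhead plus 1 byte per 8 bits for a bit string) and ignores alignment padding; `storage_bytes` is a hypothetical helper.

```python
import math


def storage_bytes(n_flags: int) -> dict:
    """Approximate bytes per row for n_flags bits of information (no padding)."""
    return {
        "boolean columns": n_flags,                   # 1 byte each
        "integer": 4,                                 # holds up to 32 flags
        "bit string": 5 + math.ceil(n_flags / 8),     # overhead + 1 byte per 8 bits
    }


print(storage_bytes(3))   # {'boolean columns': 3, 'integer': 4, 'bit string': 6}
print(storage_bytes(32))  # {'boolean columns': 32, 'integer': 4, 'bit string': 9}
```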
To optimize disk space further you need to understand the storage mechanisms of PostgreSQL, especially data alignment. More in this related answer.
This answer on how to transform the types boolean, bit(n) and integer may be of help, too.
You can apply the bit string functions directly to a bit string without the need to cast from an integer.
With the advent of GENERATED columns in PostgreSQL (as of version 12), you could do something like this (all of the code below is available on the fiddle here):
Base table:
CREATE TABLE test
(
t_id INTEGER GENERATED BY DEFAULT AS IDENTITY,
data TEXT,
bitmask VARBIT(9)
);
but, with GENERATED columns, you can now do:
CREATE TABLE test
(
t_id INTEGER GENERATED BY DEFAULT AS IDENTITY,
data TEXT,
bitmask VARBIT(9), -- choose 9 because it's not 8, to show that you don't have to
-- select an INT or even a SMALLINT
published BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 0)::BOOLEAN) STORED,
visible BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 1)::BOOLEAN) STORED,
rubbish BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 2)::BOOLEAN) STORED,
masterpiece BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 3)::BOOLEAN) STORED,
meh BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 4)::BOOLEAN) STORED,
arts BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 5)::BOOLEAN) STORED,
legal BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 6)::BOOLEAN) STORED,
sport BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 7)::BOOLEAN) STORED,
politics BOOLEAN GENERATED ALWAYS AS (GET_BIT(bitmask, 8)::BOOLEAN) STORED,
CONSTRAINT subject_ck -- so you can't have conflicting subjects - just for demo purposes
CHECK -- a document can't be art and legal at the same time!
(
CASE
WHEN GET_BIT(bitmask, 5) = 1
THEN GET_BIT(bitmask, 6) = 0 AND GET_BIT(bitmask, 7) = 0 AND GET_BIT(bitmask, 8) = 0
WHEN GET_BIT(bitmask, 6) = 1
THEN GET_BIT(bitmask, 5) = 0 AND GET_BIT(bitmask, 7) = 0 AND GET_BIT(bitmask, 8) = 0
WHEN GET_BIT(bitmask, 7) = 1
THEN GET_BIT(bitmask, 6) = 0 AND GET_BIT(bitmask, 5) = 0 AND GET_BIT(bitmask, 8) = 0
WHEN GET_BIT(bitmask, 8) = 1
THEN GET_BIT(bitmask, 5) = 0 AND GET_BIT(bitmask, 6) = 0 AND GET_BIT(bitmask, 7) = 0
END
)
);
Now, the extra 9 booleans add about 12 bytes to the size of the table - if that isn't a problem, then we're good to go! Also, when (I presume shortly - as of writing 2022-09-09) PostgreSQL is enhanced with VIRTUAL columns, there'll be no space overhead at all.
The benefit of doing it like this is that it makes your SQL short and readable. Instead of having to write a bunch of ugly CASE statements, you'll simply be able to do the following:
INSERT INTO test (data, bitmask) VALUES
('Document 1', '000100000'),
('Document 2', '100000000'),
('Document 3', '101000001');
and also, stuff like this:
CREATE INDEX legal_ix ON test (legal) WHERE legal;
So now, obtaining all of the records is far easier on the eye - much more legible:
SELECT * FROM test;
Result:
t_id  data        bitmask    published  visible  rubbish  masterpiece  meh  arts  legal  sport  politics
1     Document 1  000100000  f          f        f        t            f    f     f      f      f
2     Document 2  100000000  t          f        f        f            f    f     f      f      f
3     Document 3  101000001  t          f        t        f            f    f     f      f      t
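To make the mapping from bit positions to columns explicit, here is a small Python sketch that decodes a bitmask string the same way the generated columns do, with position 0 as the leftmost character. The `FLAGS` list and `decode_bitmask` helper are illustrative and live outside the database.

```python
FLAGS = ["published", "visible", "rubbish", "masterpiece", "meh",
         "arts", "legal", "sport", "politics"]


def decode_bitmask(bitmask: str) -> dict:
    """Map each flag name to True/False; bit 0 is the leftmost character."""
    return {name: bitmask[i] == "1" for i, name in enumerate(FLAGS)}


doc3 = decode_bitmask("101000001")
print([name for name, is_set in doc3.items() if is_set])
# ['published', 'rubbish', 'politics']
```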
You can also do:
BEGIN TRANSACTION; -- can't update the other way round or CHECK constraint will fail
-- CHECK constraints are not deferrable - can't be 8 & 6 simultaneously
UPDATE test
SET bitmask = SET_BIT(bitmask, 8, 0) WHERE data = 'Document 3';
UPDATE test
SET bitmask = SET_BIT(bitmask, 6, 1) WHERE data = 'Document 3';
COMMIT;
Result:
SELECT t_id, published, legal FROM test; -- legal has gone from f -> t
There are a few other bits and pieces in the fiddle.
The thing is that the first number is already an Oracle LONG, the second one is a date (SQL DATE, no extra timestamp info), and the last one is a short value in the range 1000-100'000.
How can I create a sort of hash value that will be unique for each combination, optimally?
String concatenation and converting to long later is not what I want, because of cases like this, for example:
Day Month
12 1 --> 121
1 12 --> 121
When you have a few numeric values and need to have a single "unique" (that is, statistically improbable duplicate) value out of them you can usually use a formula like:
h = (a*P1 + b)*P2 + c
where P1 and P2 are either well-chosen numbers (e.g. if you know 'b' is always in the 1-31 range, you can use P1=32) or, when you know nothing particular about the allowable ranges of a, b, c, the best approach is to have P1 and P2 as big prime numbers (they have the least chance to generate values that collide).
For an optimal solution the math is a bit more complex than that, but using prime numbers you can usually have a decent solution.
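A minimal sketch of the formula for the case where the ranges are known. It assumes b is a day in 1..31 (so P1 = 32) and c is a month in 1..12 (so P2 = 13); under those assumptions the value is unique and can even be split back apart. The function names are hypothetical.

```python
P1 = 32  # b (day) is always below 32
P2 = 13  # c (month) is always below 13


def combine(a: int, b: int, c: int) -> int:
    """h = (a*P1 + b)*P2 + c"""
    return (a * P1 + b) * P2 + c


def split(h: int):
    """Recover (a, b, c), valid only while 0 <= b < P1 and 0 <= c < P2."""
    h, c = divmod(h, P2)
    a, b = divmod(h, P1)
    return a, b, c


# Day 12 / month 1 and day 1 / month 12 no longer collide.
print(combine(2015, 12, 1) != combine(2015, 1, 12))  # True
print(split(combine(2015, 12, 1)))                   # (2015, 12, 1)
```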
For example, the Java implementation of .hashCode() for a String (or, with a starting value of 1, for an array) is something like:
h = 0;
for (int i = 0; i < a.length; ++i)
h = h * 31 + a[i];
Personally, though, I would have chosen a prime bigger than 31, as values inside a String can easily collide, since a delta of 31 places can be quite common, e.g.:
"BB".hashCode() == "Aa".hashCode() == 2112
Your
12 1 --> 121
1 12 --> 121
problem is easily fixed by zero-padding your input numbers to the maximum width expected for each input field.
For example, if the first field can range from 0 to 10000 and the second field can range from 0 to 100, your example becomes:
00012 001 --> 00012001
00001 012 --> 00001012
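The padding can be sketched with Python format strings. The widths 5 and 3 follow the ranges in the example above, and `padded_key` is an illustrative name.

```python
def padded_key(first: int, second: int) -> str:
    """Concatenate both fields, zero-padded to fixed widths of 5 and 3."""
    return f"{first:05d}{second:03d}"


print(padded_key(12, 1))  # 00012001
print(padded_key(1, 12))  # 00001012
```

Converting the padded strings to integers (12001 and 1012) keeps them distinct, because the width of the second field is fixed.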
In Python, you can use this:
#pip install pairing
import pairing as pf
n = [12,6,20,19]
print(n)
key = pf.pair(pf.pair(n[0],n[1]),
pf.pair(n[2], n[3]))
print(key)
m = [pf.depair(pf.depair(key)[0]),
pf.depair(pf.depair(key)[1])]
print(m)
Output is:
[12, 6, 20, 19]
477575
[(12, 6), (20, 19)]
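If installing a package is not an option, the same numbers can be reproduced with a hand-written Cantor pairing function. This is a sketch under the assumption that the library uses the Cantor formula, which is consistent with the output shown above; the functions below are not the library's own code.

```python
from math import isqrt


def pair(x: int, y: int) -> int:
    """Cantor pairing: map two non-negative integers to one unique integer."""
    return (x + y) * (x + y + 1) // 2 + y


def depair(z: int):
    """Invert the Cantor pairing."""
    w = (isqrt(8 * z + 1) - 1) // 2   # index of the diagonal that z lies on
    y = z - w * (w + 1) // 2
    return (w - y, y)


n = [12, 6, 20, 19]
key = pair(pair(n[0], n[1]), pair(n[2], n[3]))
print(key)  # 477575

left, right = depair(key)
print([depair(left), depair(right)])  # [(12, 6), (20, 19)]
```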