How do you put multiple values into one variable or field without using a list and then parsing? - enumerable

Sometimes there is a need to have multiple values in one variable or database field, even though that violates relational normalization principles. In Python and other languages that support lists, that's easy. In others it is not. See insert multiple values in single attribute
One common technique is to concatenate the values into a comma-delimited string, such as "1,2,3" or "English,French,Spanish", and then extract them by parsing.
When the valid values come from an enumerated list, is there another way that does not require parsing?

Yes. Use prime numbers, multiply them together, then factor them out.
The field type to use is integer or large integer
Use code values that are prime numbers
3 == English
5 == French
7 == Spanish
11 == Italian
Store the product of all that apply into the field.
21 == English and Spanish
385 == French, Spanish and Italian
Use modulo functions to determine which values are in the field
if ( field % 3 == 0 ) { english() ;}
if ! (field % 5) { french() ;}
=IF(NOT(MOD(A203,5)),"French","")
The same value can appear multiple times
9 == English, English
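The steps above can be sketched in Python as follows. This is only an illustration: the `CODES` table and the function names are made up for the example, and the primes are the ones listed above.

```python
# Hypothetical code table; the primes match the example values above.
CODES = {"English": 3, "French": 5, "Spanish": 7, "Italian": 11}

def encode(names):
    """Multiply the prime codes of all names (repeats are allowed)."""
    product = 1
    for name in names:
        product *= CODES[name]
    return product

def decode(field):
    """Return each name once for every time its prime divides the field."""
    if field < 1:
        raise ValueError("field must be a positive integer")
    found = []
    for name, prime in CODES.items():
        while field % prime == 0:
            found.append(name)
            field //= prime
    return found

print(encode(["English", "Spanish"]))  # 21
print(decode(385))                     # ['French', 'Spanish', 'Italian']
print(decode(9))                       # ['English', 'English']
```

An empty selection encodes as 1, since that is the product of no primes.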
I first used this technique to store dimensions.
3 == time
5 == length
7 == mass
11 == charge
13 == temperature
17 == moles
For example, a "first moment" lever-arm would have a dimension value of 35 == mass * length.
To store fractional dimensions in an integer, I multiplied fractional dimensions by the product of all of them and dealt with it in processing.
255255 == 3*5*7*11*13*17
force == mass * length / (second^2)
force == ( 7 * 5 / ( 3 * 3 ) ) * 255255 * 255255
force == 253381002875
The reason I used integers was to avoid dealing with invalid equality comparisons due to rounding errors.
Please do not ask for the code to extract the fractional dimensions. All this was 40 years ago in APL/360.

If you don't need to allow for multiples of the same value, then you could use a bit map. Depending on whether there are up to 8, 16, 32, 64, or 128 allowed values, they could fit in a 1, 2, 4, 8, or 16 byte integer.
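A minimal sketch of the bit map idea, assuming each allowed value is given a fixed bit position (the `LANGUAGES` list and function names are hypothetical):

```python
# Assumed ordering: the list index is the bit position of each value.
LANGUAGES = ["English", "French", "Spanish", "Italian"]

def to_bitmap(names):
    """Set one bit per name; a repeated name has no extra effect."""
    bits = 0
    for name in names:
        bits |= 1 << LANGUAGES.index(name)
    return bits

def from_bitmap(bits):
    """List the names whose bit is set."""
    return [name for i, name in enumerate(LANGUAGES) if bits & (1 << i)]

print(to_bitmap(["English", "Spanish"]))  # 5  (binary 0101)
print(from_bitmap(14))                    # ['French', 'Spanish', 'Italian']
```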

Related

How can I extract a specific bit from a 16-bit register using math ONLY?

I have a 16-bit WORD and I want to read the status of a specific bit or several bits.
I've tried a method that divides the word by the value of the bit that I want, converts the result to two values (an integer and a real), and compares the two. If they are not equal, then it equates to false. This appears to work only if I am looking for a bit that is the last 'TRUE' bit in the word. If there are any successive TRUE bits, it fails. Perhaps I just haven't done it right. I don't have the ability to use code, just basic math, boolean operations, and type conversion. Any ideas? I hope this isn't a dumb question but I have a feeling it is.
eg:
WORD 0010000100100100 = 9348
I want to know the value of bit 2. How can I determine it from 9348?
There are many ways, depending on what operations you can use. It appears you don't have much to choose from. But this should work, using just integer division and multiplication, and a test for equality.
(pseudocode):
x = 9348 (binary 0010000100100100, bit 0 = 0, bit 1 = 0, bit 2 = 1, ...)
x = x / 4 (now x is 1000010010010000)
y = (x / 2) * 2 (y is 0000010010010000)
if (x == y) {
    (bit 2 must have been 0)
} else {
    (bit 2 must have been 1)
}
Every time you divide by 2, you move the bits to the left one position (in your representation, which writes the least significant bit first). Every time you multiply by 2, you move the bits to the right one position. Odd numbers will have 1 in the least significant position. Even numbers will have 0 in the least significant position. If you divide an odd number by 2 in integer math, and then multiply by 2, you lose the odd bit if there was one. So the idea above is to first move the bit you want to know about into the least significant position. Then, divide by 2 and then multiply by 2. If the result is the same as what you had before, then there must have been a 0 in the bit you care about. If the result is not the same as what you had before, then there must have been a 1 in the bit you care about.
Having explained the idea, we can simplify to
((x / 8) * 2) <> (x / 4)
which will resolve to true if the bit was set, and false if the bit was not set.
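For readers who can run code, the same idea can be checked with a short Python sketch. The function name is hypothetical, and `//` stands in for the integer division described above.

```python
def bit_is_set(word, n):
    """Test bit n (bit 0 = least significant) using only integer
    division, multiplication and a comparison."""
    shifted = word // (2 ** n)            # move bit n into the lowest position
    return (shifted // 2) * 2 != shifted  # dropping the odd bit changes the value

print(bit_is_set(9348, 2))  # True
print(bit_is_set(9348, 1))  # False
```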
AND the word with a mask [1].
In your example, you're interested in bit 2 (counting from bit 0), so the mask (in binary) is
0000000000000100. (Which is 4 in decimal.)
In binary, your word 9348 is 0010010010000100 [2]
    0010010010000100 (your word)
AND 0000000000000100 (mask)
    ----------------
    0000000000000100 (result of ANDing your word and the mask)
Because the result is different from zero, the bit is set. If it were equal to zero, the bit would not be set.
This technique works for extracting one bit at a time. You can however use it repeatedly with different masks if you're interested in extracting multiple bits.
[1] For more information on masking techniques see http://en.wikipedia.org/wiki/Mask_(computing)
[2] See http://www.binaryhexconverter.com/decimal-to-binary-converter
The nth bit (counting from 0) is equal to the word integer-divided by 2^n, mod 2: (word / 2^n) mod 2
I think you'll have to test each bit, 0 through 15 inclusive.
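As a rough illustration of testing every bit with that formula, here is a Python sketch (the variable names are made up for the example):

```python
word = 9348

# (word // 2**n) % 2 for n = 0..15, least significant bit first
bits = [(word // 2 ** n) % 2 for n in range(16)]
set_positions = [n for n, b in enumerate(bits) if b == 1]

print("".join(str(b) for b in bits))  # 0010000100100100
print(set_positions)                  # [2, 7, 10, 13]
```

Written least significant bit first, the result matches the bit string given in the question.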
You could try 9348 AND 4 (equivalent of 1<<2 - index of the bit you wanted)
9348 AND 4
should give 4 if bit is set, 0 if not.
So here is what I have come up with: 3 solutions. One is Hatchet's as proposed above, and his answer helped me immensely with actually understanding HOW this works, which is of utmost importance to me! The proposed AND masking solutions could have worked if my system supports bitwise operators, but it apparently does not.
Original technique:
( ( ( INT ( TAG / BIT ) ) / 2 ) - ( INT ( ( INT ( TAG / BIT ) ) / 2 ) ) <> 0 )
Explanation:
In the first part of the equation, integer division is performed on TAG/BIT, then REAL division by 2. In the second part, integer division is performed on TAG/BIT, then integer division again by 2. The difference between these two results is compared to 0. If the difference is not 0, then the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337 w/ integer division. Then 2337/2 = 1168.5 w/ REAL division but 1168 w/ integer division. 1168.5-1168 <> 0, so the result is TRUE.
My modified technique:
( INT ( TAG / BIT ) / 2 ) <> ( INT ( INT ( TAG / BIT ) / 2 ) )
Explanation:
effectively the same as above, but instead of subtracting the two results and comparing them to 0, I am just comparing the two results themselves. If they are not equal, the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337 w/ integer division. Then 2337/2 = 1168.5 w/ REAL division but 1168 w/ integer division. 1168.5 <> 1168, so the result is TRUE.
Hatchet's technique as it applies to my system:
( INT ( TAG / BIT )) <> ( INT ( INT ( TAG / BIT ) / 2 ) * 2 )
Explanation:
In the first part of the equation, integer division is performed on TAG/BIT. In the second part, integer division is performed on TAG/BIT, then integer division again by 2, then multiplication by 2. The two results are compared. If they are not equal, the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337. Then 2337/2 = 1168 w/ integer division. Then 1168x2=2336. 2337 <> 2336 so the result is TRUE. As Hatchet stated, this method 'drops the odd bit'.
Note - 9348/4 = 2337 w/ both REAL and integer division, but it is important that these parts of the formula use integer division and not REAL division (12164/32 = 380 w/ integer division and 380.125 w/ REAL division)
I feel it important to note for any future readers that the BIT value in the equations above is not the bit number, but the actual value of the resulting decimal if the bit in the desired position was the only TRUE bit in the binary string (bit 2 = 4 (2^2), bit 6 = 64 (2^6))
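The three formulas can be compared side by side in Python. This is only a sketch under the assumption that `INT` truncates toward zero; here `/` is REAL division and `int()` plays the role of `INT`. The function names are hypothetical.

```python
def original(tag, bit):
    # ((INT(TAG/BIT)) / 2) - INT((INT(TAG/BIT)) / 2) <> 0
    return (int(tag / bit) / 2) - int(int(tag / bit) / 2) != 0

def modified(tag, bit):
    # (INT(TAG/BIT) / 2) <> INT(INT(TAG/BIT) / 2)
    return int(tag / bit) / 2 != int(int(tag / bit) / 2)

def hatchet(tag, bit):
    # INT(TAG/BIT) <> INT(INT(TAG/BIT) / 2) * 2
    return int(tag / bit) != int(int(tag / bit) / 2) * 2

# BIT is the bit's value (4 for bit 2, 32 for bit 5), not the bit number.
print(original(9348, 4), modified(9348, 4), hatchet(9348, 4))     # True True True
print(original(12164, 32), modified(12164, 32), hatchet(12164, 32))  # False False False
```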
This explanation may be a bit too verbose for some, but may be perfect for others :)
Please feel free to comment/critique/correct me if necessary!
I just needed to resolve an integer status code to a bit state in order to interface with some hardware. Here's a method that works for me:
private bool resolveBitState(int value, int bitNumber)
{
    return (value & (1 << (bitNumber - 1))) != 0;
}
I like it, because it's non-iterative, requires no cast operations and essentially translates directly to machine code operations like Shift, And and Comparison, which probably means it's really optimal.
To explain in a little more detail, I'm comparing the bitwise value to a mask for the bit I am interested in (value & mask) using an AND operation. If the bitwise AND operation result is zero, then the bit is not set (return false). If the AND operation result is not zero, then the bit is set (return true). The result of the AND operation is either zero or the value of the bit (1, 2, 4, 8, 16, 32...). Hence the boolean evaluation comparing the AND operation result and 0. The mask is created by taking the number 1 and shifting it left (bit wise), by the appropriate number of binary places (1 << n). The number of places is the number of the bit targeted minus 1. If it's bit #1, I want to shift the 1 left by 0 and if it's #2, I want to shift it left 1 place, etc.
I'm surprised no one rates my solution. I think it's most logical and succinct... and works.

How to take one particular number or a range of particular number from a set of number?

I am looking to take one particular number or a range of numbers from a set of numbers.
Example
A = [-10,-2,-3,-8, 0 ,1, 2, 3, 4 ,5,7, 8, 9, 10, -100];
How can I take just the number 5 from the set of numbers above, and
how can I take a range of numbers, for example from -3 to 4, from A?
Please help.
Thanks
I don't know what you are trying to accomplish by this. But you could check each entry of the set and test if it's in the specified range of numbers. The test for a single number could be accomplished by testing each number explicitly or as a special case of the range check where the lower and the upper bound are the same number.
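A small Python sketch of that range check, where the single-number case is just a range whose two bounds coincide (the function name is made up for the example, and the language is an assumption since the question does not name one):

```python
A = [-10, -2, -3, -8, 0, 1, 2, 3, 4, 5, 7, 8, 9, 10, -100]

def in_range(values, low, high):
    """Keep every entry whose value lies between low and high, inclusive."""
    return [v for v in values if low <= v <= high]

print(in_range(A, 5, 5))   # [5]
print(in_range(A, -3, 4))  # [-2, -3, 0, 1, 2, 3, 4]
```

Note that this selects by value, so -2 is included and -8 is not, regardless of where they sit in the list.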
Looping and testing, no matter what the programming language is, although most programming languages have built-in methods for accomplishing this type of task (so you may want to specify what language you are supposed to use for your homework):
procfun get_element:
    index = 0
    for element in set:
        if element is 5 then return (element, index)
        increment index
your "5" is in element and at set[index]
getting a range:
procfun getrange:
    subset = []
    index = 0
    while index < length(set):
        if set[index] is -3:
            #found the start; copy elements until we meet "4"
            while index < length(set):
                push set[index] in subset
                if set[index] is 4:
                    return subset
                increment index
            #if we met "-3" but we didn't meet "4" then there's no such range
            return None
        #keep searching for a "-3"
        increment index
    return None
If run against A, subset would be [-3,-8, 0 ,1, 2, 3, 4]; this is a "first matched, first grabbed" poor man's algorithm. On sorted sets the algorithms can get smarter and faster.

binary to decimal in objective-c

I want to convert the decimal number 27 into binary in such a way that first the digit 2 is converted and its binary value is placed in an array, and then the digit 7 is converted and its binary value is placed in that array. What should I do?
thanks in advance
That's called binary-coded decimal. It's easiest to work right-to-left. Take the value modulo 10 (% operator in C/C++/ObjC) and put it in the array. Then integer-divide the value by 10 (/ operator in C/C++/ObjC). Continue until your value is zero. Then reverse the array if you need most-significant digit first.
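The procedure is language-independent, so here is a rough sketch in Python rather than Objective-C. The function name is hypothetical and the input is assumed to be a non-negative integer.

```python
def to_bcd(value):
    """Return one 4-bit binary string per decimal digit,
    most significant digit first."""
    if value == 0:
        return ["0000"]
    groups = []
    while value > 0:
        digit = value % 10                   # rightmost decimal digit
        groups.append(format(digit, "04b"))  # that digit as 4 binary digits
        value //= 10                         # drop the digit just handled
    groups.reverse()                         # digits were collected right to left
    return groups

print(to_bcd(27))  # ['0010', '0111']
```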
If I understand your question correctly, you want to go from 27 to an array that looks like {0010, 0111}.
If you understand how base systems work (specifically the decimal system), this should be simple.
First, you find the remainder of your number when divided by 10. Your number 27 in this case would result with 7.
Then you integer divide your number by 10 and store it back in that variable. Your number 27 would result in 2.
How many times do you do this?
You do this until you have no more digits.
How many digits can you have?
Well, if you think about the number 100, it has 3 digits because the number needs to remember that one 10^2 exists in the number. On the other hand, 99 does not.
The answer to the previous question is 1 + floor of Log base 10 of the input number.
Log of 100 is 2, plus 1 is 3, which equals number of digits.
Log of 99 is a little less than 2, but flooring it is 1, plus 1 is 2.
In java it is like this:
int input = 27;
int number = 0;
// Math.floor returns a double, so cast before storing the digit count in an int
int numDigits = (int) Math.floor(Log(10, input)) + 1;
int[] digitArray = new int[numDigits];
for (int i = 0; i < numDigits; i++) {
    number = input % 10;
    digitArray[numDigits - i - 1] = number;
    input = input / 10;
}
return digitArray;
Java's Math class provides logarithms only for base e (Math.log) and base 10 (Math.log10), not for an arbitrary base, but it is trivial to write a function for that.
double Log( double base, double value ) {
    return Math.log(value)/Math.log(base);
}
Good luck.

how to create unique integer number from 3 different integers numbers(1 Oracle Long, 1 Date Field, 1 Short)

The thing is that the first number is already an ORACLE LONG,
the second one is a Date (SQL DATE, with no extra timestamp info), and the last one is a Short value in the range 1000-100'000.
How can I create some sort of hash value that will be unique for each combination, optimally?
String concatenation and converting to long later:
I don't want this, for example.
Day Month
12 1 --> 121
1 12 --> 121
When you have a few numeric values and need to have a single "unique" (that is, statistically improbable duplicate) value out of them you can usually use a formula like:
h = (a*P1 + b)*P2 + c
where P1 and P2 are either well-chosen numbers (e.g. if you know 'b' is always in the 1-31 range, you can use P1=32) or, when you know nothing particular about the allowable ranges of a, b, c, big prime numbers (they have the least chance of generating values that collide).
For an optimal solution the math is a bit more complex than that, but using prime numbers you can usually have a decent solution.
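When the ranges are known, the formula can be made exactly reversible by choosing each multiplier larger than the largest value that is added after it. Here is a Python sketch for the three fields in the question. The bounds `DAY_SPAN` and `SHORT_SPAN` and the function names are assumptions for illustration, and the result may not fit in 64 bits if the first number is large.

```python
from datetime import date

# Assumed bounds: date.toordinal() stays below 4,000,000 for any valid date,
# and the short value is at most 100000 as stated in the question.
DAY_SPAN = 4_000_000
SHORT_SPAN = 100_001

def combine(long_id, day, short_value):
    """h = (a*P1 + b)*P2 + c with P1 > max(b) and P2 > max(c)."""
    b = day.toordinal()
    return (long_id * DAY_SPAN + b) * SHORT_SPAN + short_value

def split(h):
    """Recover the three fields from the combined value."""
    rest, short_value = divmod(h, SHORT_SPAN)
    long_id, b = divmod(rest, DAY_SPAN)
    return long_id, date.fromordinal(b), short_value

key = combine(123456789, date(2012, 1, 12), 1000)
print(split(key))  # (123456789, datetime.date(2012, 1, 12), 1000)
```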
For example, Java implementation for .hashCode() for an array (or a String) is something like:
h = 0;
for (int i = 0; i < a.length; ++i)
h = h * 31 + a[i];
Personally, though, I would have chosen a prime bigger than 31, as values inside a String can easily collide, since a delta of 31 places can be quite common, e.g.:
"BB".hashCode() == "Aa".hashCode() == 2112
Your
12 1 --> 121
1 12 --> 121
problem is easily fixed by zero-padding your input numbers to the maximum width expected for each input field.
For example, if the first field can range from 0 to 10000 and the second field can range from 0 to 100, your example becomes:
00012 001 --> 00012001
00001 012 --> 00001012
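In Python, for instance, the padding can be done with format specifiers. This is a sketch that assumes the same field widths as the example (five and three digits); the function name is made up.

```python
def padded_key(first, second):
    """Concatenate the two fields after zero-padding them to fixed widths."""
    return f"{first:05d}{second:03d}"

print(padded_key(12, 1))  # 00012001
print(padded_key(1, 12))  # 00001012
```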
In python, you can use this:
#pip install pairing
import pairing as pf
n = [12,6,20,19]
print(n)
key = pf.pair(pf.pair(n[0],n[1]),
pf.pair(n[2], n[3]))
print(key)
m = [pf.depair(pf.depair(key)[0]),
pf.depair(pf.depair(key)[1])]
print(m)
Output is:
[12, 6, 20, 19]
477575
[(12, 6), (20, 19)]
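If installing a package is not an option, the same numbers can be reproduced by hand. The sketch below assumes the Cantor pairing function, which yields the key 477575 shown above; I have not checked the package's source, so treat the match as an observation rather than a description of its implementation.

```python
from math import isqrt

def pair(x, y):
    """Cantor pairing: map two non-negative integers to one."""
    return (x + y) * (x + y + 1) // 2 + y

def depair(z):
    """Invert the Cantor pairing."""
    w = (isqrt(8 * z + 1) - 1) // 2  # index of the diagonal containing z
    t = w * (w + 1) // 2             # first value on that diagonal
    y = z - t
    x = w - y
    return x, y

key = pair(pair(12, 6), pair(20, 19))
print(key)                               # 477575
left, right = depair(key)
print([depair(left), depair(right)])     # [(12, 6), (20, 19)]
```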

Generate a hash sum for several integers

I am facing the problem of having several integers, and I have to generate one using them. For example.
Int 1: 14
Int 2: 4
Int 3: 8
Int 4: 4
Hash Sum: 43
I have some restrictions on the values: the maximum value that an attribute can have is 30, and the sum of all of them is always 30. The attributes are also always positive.
The key is that I want to generate the same hash sum for similar integers, for example if I have the integers, 14, 4, 10, 2 then I want to generate the same hash sum, in the case above 43. But of course if the integers are very different (4, 4, 2, 20) then I should have a different hash sum. Also it needs to be fast.
Ideally I would like that the output of the hash sum is between 0 and 512, and it should evenly distributed. With my restrictions I can have around 5K different possibilities, so what I would like to have is around 10 per bucket.
I am sure there are many algorithms that do this, but I could not find a way of googling this thing. Can anyone please post an algorithm to do this?
Some more information
The whole thing with this is that those integers are attributes for a function. I want to store the values of the function in a table, but I do not have enough memory to store all the different options. That is why I want to generalize between similar attributes.
The reason why 10, 5, 15 are totally different from 5, 10, 15, it is because if you imagine this in 3d then both points are a totally different point
Some more information 2
Some answers try to solve the problem using hashing. But I do not think this is so complex. Thanks to one of the comments I have realized that this is a clustering algorithm problem. If we have only 3 attributes and we imagine the problem in 3d, what I just need is divide the space in blocks.
In fact this can be solved with rules of this type
if (att[0] < 5 && att[1] < 5 && att[2] < 5 && att[3] < 5)
Block = 21
if ( (5 < att[0] < 10) && (5 < att[1] < 10) && (5 < att[2] < 10) && (5 < att[3] < 10))
Block = 45
The problem is that I need a fast and general way to generate those ifs; I cannot write out all the possibilities.
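One general way to get the effect of those rules without writing them out is to divide each attribute by the block width and treat the quotients as digits of a number. A Python sketch, assuming a block width of 5 and values from 0 to 30 (the constants and the function name are hypothetical):

```python
BLOCK_SIZE = 5                                 # assumed width of a block
MAX_VALUE = 30                                 # from the question
BLOCKS_PER_AXIS = MAX_VALUE // BLOCK_SIZE + 1  # quotients 0..6

def block_index(att):
    """Combine the per-attribute block numbers into one integer."""
    index = 0
    for value in att:
        index = index * BLOCKS_PER_AXIS + value // BLOCK_SIZE
    return index

print(block_index([14, 4, 8, 4]))  # 693
print(block_index([13, 3, 9, 4]))  # 693, same block
print(block_index([4, 4, 2, 20]))  # 4, a different block
```

With these settings the index can reach 7^4 - 1 = 2400, so a wider block or a final modulo would be needed to stay within 512 values.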
The simple solution:
Convert the integers to strings separated by commas, and hash the resulting string using a common hashing algorithm (md5, sha, etc).
If you really want to roll-your-own, I would do something like:
Generate large prime P
Generate random numbers 0 < a[i] < P (for each dimension you have)
To generate hash, calculate: sum(a[i] * x[i]) mod P
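A sketch of that recipe in Python. The prime, the fixed seed and the final reduction to 512 buckets are assumptions chosen for the example. Note that this spreads inputs evenly but does not group similar inputs together.

```python
import random

P = 2_147_483_647        # 2**31 - 1, a prime
rng = random.Random(42)  # fixed seed (assumed) so the hash is reproducible
COEFFS = [rng.randrange(1, P) for _ in range(4)]  # one a[i] per dimension

def vector_hash(xs, buckets=512):
    """sum(a[i] * x[i]) mod P, then reduced to the table size."""
    return sum(a * x for a, x in zip(COEFFS, xs)) % P % buckets

print(vector_hash([14, 4, 8, 4]))
```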
Given the inputs a, b, c, and d, each ranging in value from 0 to 30 (5 bits), the following will produce a number in the range of 0 to 255 (8 bits).
bucket = ((a & 0x18) << 3) | ((b & 0x18) << 1) | ((c & 0x18) >> 1) | ((d & 0x18) >> 3)
Whether the general approach is appropriate depends on how the question is interpreted. The 3 least significant bits are dropped, grouping 0-7 in the same set, 8-15 in the next, and so forth.
0-7,0-7,0-7,0-7 -> bucket 0
0-7,0-7,0-7,8-15 -> bucket 1
0-7,0-7,0-7,16-23 -> bucket 2
...
24-30,24-30,24-30,24-30 -> bucket 255
Trivially tested with:
for (int a = 0; a <= 30; a++)
    for (int b = 0; b <= 30; b++)
        for (int c = 0; c <= 30; c++)
            for (int d = 0; d <= 30; d++) {
                int bucket = ((a & 0x18) << 3) |
                             ((b & 0x18) << 1) |
                             ((c & 0x18) >> 1) |
                             ((d & 0x18) >> 3);
                printf("%d, %d, %d, %d -> %d\n",
                       a, b, c, d, bucket);
            }
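To see how evenly the buckets fill under the question's constraint that the four values always add up to 30, the loop can be restricted accordingly. This Python sketch assumes each value may be anywhere from 0 to 30; the names are made up for the example.

```python
from collections import Counter

def bucket(a, b, c, d):
    """Keep the two high bits of each 5-bit input and pack them into 8 bits."""
    return (((a & 0x18) << 3) | ((b & 0x18) << 1) |
            ((c & 0x18) >> 1) | ((d & 0x18) >> 3))

counts = Counter()
for a in range(31):
    for b in range(31 - a):
        for c in range(31 - a - b):
            d = 30 - a - b - c  # the fourth value is fixed by the sum
            counts[bucket(a, b, c, d)] += 1

print(sum(counts.values()))  # 5456 combinations in total
print(len(counts), "buckets in use; the fullest holds", max(counts.values()))
```

Because the sum is fixed, many of the 256 buckets can never occur (for example, all four values in 24-30), so the distribution is worth inspecting before relying on it.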
You want a hash function that depends on the order of inputs and where similar sets of numbers will generate the same hash? That is, you want 50 5 5 10 and 5 5 10 50 to generate different values, but you want 52 7 4 12 to generate the same hash as 50 5 5 10? A simple way to do something like this is:
long hash = 13;
for (int i = 0; i < array.length; i++) {
    hash = hash * 37 + array[i] / 5;
}
This is imperfect, but should give you an idea of one way to implement what you want. It will treat the values 50 - 54 as the same value, but it will treat 49 and 50 as different values.
If you want the hash to be independent of the order of the inputs (so the hash of 5 10 20 and 20 10 5 are the same) then one way to do this is to sort the array of integers into ascending order before applying the hash. Another way would be to replace
hash = hash * 37 + array[i] / 5;
with
hash += array[i] / 5;
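The order-dependent variant can be tried out with a Python sketch like the following. The `step` and `buckets` parameters are assumptions added for illustration (the final modulo maps the result to a table of 512 entries), and the inputs are assumed to be non-negative.

```python
def quantized_hash(values, step=5, buckets=512):
    """Order-dependent hash in which values within the same multiple-of-step
    interval (e.g. 50-54) are treated as equal."""
    h = 13
    for v in values:
        h = h * 37 + v // step
    return h % buckets

print(quantized_hash([50, 5, 5, 10]) == quantized_hash([52, 7, 6, 12]))  # True
print(quantized_hash([50, 5, 5, 10]) == quantized_hash([10, 5, 5, 50]))  # False
```

Values on either side of an interval boundary, such as 4 and 5, still land in different groups, which is the imperfection mentioned above.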
EDIT: Taking into account your comments in response to this answer, it sounds like my attempt above may serve your needs well enough. It won't be ideal, nor perfect. If you need high performance you have some research and experimentation to do.
To summarize, order is important, so 5 10 20 differs from 20 10 5. Also, you would ideally store each "vector" separately in your hash table, but to handle space limitations you want to store some groups of values in one table entry.
An ideal hash function would return a number evenly spread across the possible values based on your table size. Doing this right depends on the expected size of your table and on the number of and expected maximum value of the input vector values. If you can have negative values as "coordinate" values then this may affect how you compute your hash. If, given your range of input values and the hash function chosen, your maximum hash value is less than your hash table size, then you need to change the hash function to generate a larger hash value.
You might want to try using vectors to describe each number set as the hash value.
EDIT:
Since you're not describing why you don't want to run the function itself, I'm guessing it's long running. You also haven't described the breadth of the argument set, so here are two options.
If every value is expected then a full lookup table in a database might be faster.
If you're expecting repeated calls with the same arguments and little overall variation, then you could look at memoizing, so only the first run for an argument set is expensive, and each additional request is fast, with less memory usage.
You would need to define what you mean by "similar". Hashes are generally designed to create unique results from unique input.
One approach would be to normalize your input and then generate a hash from the results.
Generating the same hash sum is called a collision, and is a bad thing for a hash to have. It makes it less useful.
If you want similar values to give the same output, you can divide the input by however close you want them to count. If the order makes a difference, use a different divisor for each number. The following function does what you describe:
int SqueezedSum( int a, int b, int c, int d )
{
return (a/11) + (b/7) + (c/5) + (d/3);
}
This is not a hash, but does what you describe.
You want to look into geometric hashing. In "standard" hashing you want
a short key
inverse resistance
collision resistance
With geometric hashing you substitute number 3 with something which is almost the opposite; namely, close initial values give close hash values.
Another way to view my problem is using multidimensional scaling (MS). In MS we start with a matrix of items, and what we want is to assign a location for each item in an N-dimensional space, reducing the number of dimensions in this way.
http://en.wikipedia.org/wiki/Multidimensional_scaling