I am using Dart's Int8List in place of Java's byte type. I want to do bitwise operations with Int8List, but bitwise operations aren't defined in the Int8List class. How can I solve this?
My block of code, for reference:
List<Int8List> myList;
int offset = 0;

int read16Bit() {
  int x = ((myList[offset] & 0xFE) << 8) |
      (myList[offset + 1] & 0xFE);
  offset += 2;
  return x;
}
Related
I'm writing to a piece of hardware using Bluetooth and need to format my data in a specific way.
When I get the value from the device, I have to do a little bit of shifting to get the correct answer.
Here is a breakdown of the values I am getting back from the device.
byte[1] = (unsigned char)temp;
byte[2] = (unsigned char)(temp>>8);
byte[3] = (unsigned char)(temp>>16);
byte[4] = (unsigned char)(temp>>24);
It is a List with a size of 4. A real world example would be this:
byte[1] = '46';
byte[2] = '2';
byte[3] = '0';
byte[4] = '0';
This should work out to be
558
My working code to get this is:
int _shiftLeft(int n, int amount) {
  return n << amount;
}

int _getValue(List<int> list) {
  int temp;
  temp = list[1];
  temp += _shiftLeft(list[2], 8);
  temp += _shiftLeft(list[3], 16);
  temp += _shiftLeft(list[4], 24);
  return temp;
}
The actual list I get back from the device is quite large but I only need values 1-4.
This works great and gets me the correct value back. Now I have to write to the device. So if I have a value of 558, I need to build a list of size 4 with the same bit shifting, following the exact method above but in reverse. What is the best way to do this?
Basically, if I pass a method a value of 558, I need to get back a List<int> of [46, 2, 0, 0].
You can get just the lower 8 bits with the bitwise AND operation & 255 (or & 0xFF).
Combining this with bit shifting will do: for example, 558 & 255 is 46 and (558 >> 8) & 255 is 2, which gives [46, 2, 0, 0].
int _shiftRight(int n, int amount) {
  return n >> amount;
}

List<int> _getList(int value) {
  final list = <int>[];
  list.add(value & 255);
  list.add(_shiftRight(value, 8) & 255);
  list.add(_shiftRight(value, 16) & 255);
  list.add(_shiftRight(value, 24) & 255);
  return list;
}
It can be simplified using a for loop as follows:
List<int> _getList(int value) {
  final list = <int>[];
  for (int i = 0; i < 4; i++) {
    list.add(value >> i * 8 & 255);
  }
  return list;
}
I have two variable bit-shifting code fragments that I want to SSE-vectorize by some means:
1) a = 1 << b (where b = 0..7 exactly), i.e. 0/1/2/3/4/5/6/7 -> 1/2/4/8/16/32/64/128
2) a = 1 << (8 * b) (where b = 0..7 exactly), i.e. 0/1/2/3/4/5/6/7 -> 1/0x100/0x10000/etc
OK, I know that AMD's XOP VPSHLQ would do this, as would AVX2's VPSLLVQ. But my challenge here is whether this can be achieved on 'normal' (i.e. up to SSE4.2) SSE.
So, is there some funky SSE-family opcode sequence that will achieve the effect of either of these code fragments? These only need yield the listed output values for the specific input values (0-7).
Update: here's my attempt at 1), based on Peter Cordes' suggestion of using the floating point exponent to do simple variable bitshifting:
#include <stdint.h>

typedef union
{
    int32_t i;
    float f;
} uSpec;

void do_pow2(uint64_t *in_array, uint64_t *out_array, int num_loops)
{
    uSpec u;
    for (int i = 0; i < num_loops; i++)
    {
        int32_t x = *(int32_t *)&in_array[i];
        u.i = (127 + x) << 23;        // build the bit pattern of the float 2.0^x
        int32_t r = (int32_t) u.f;    // convert the float back to an integer
        out_array[i] = r;
    }
}
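For what it's worth, here is a rough sketch of how the same exponent trick might look with SSE2 intrinsics; the helper name pow2_epi32 is just mine, and I've only thought it through for the b = 0..7 range of case 1:

#include <emmintrin.h>  /* SSE2 intrinsics */

/* Sketch: compute 1 << b for four packed 32-bit lanes, valid for b = 0..7
   (any b whose power of two is exactly representable as a float). */
static inline __m128i pow2_epi32(__m128i b)
{
    const __m128i bias = _mm_set1_epi32(127);                        /* float exponent bias        */
    __m128i exponents  = _mm_slli_epi32(_mm_add_epi32(b, bias), 23); /* bit pattern of 2.0^b       */
    return _mm_cvttps_epi32(_mm_castsi128_ps(exponents));            /* reinterpret, convert to int */
}

Case 2 would overflow 32-bit lanes for b >= 4 (1 << (8*b) reaches 2^56), so it needs a different trick, e.g. building the result bytes with a shuffle.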
Hi, I have been wondering if there is a way to convert binary numbers into decimal values.
I know how to change base, for example through this code:
String binary = "11110010";
//I'd like to change this line so it produces a decimal value
String denary = int.parse(binary, radix: 2).toRadixString(10);
If anyone is still wondering how to convert decimal to binary and the inverse:
print(55.toRadixString(2)); // Outputs 110111
print(int.parse("110111", radix: 2)); // Outputs 55
#include <iostream>
using namespace std;

int binaryToDecimal(int n)
{
    int num = n;
    int dec_value = 0;
    // Initializing base value to 1, i.e. 2^0
    int base = 1;
    int temp = num;
    while (temp) {
        int last_digit = temp % 10;
        temp = temp / 10;
        dec_value += last_digit * base;
        base = base * 2;
    }
    return dec_value;
}

int main()
{
    int num = 10101001;
    cout << binaryToDecimal(num) << endl;
}
This is my C++ solution, but you can implement it in any language.
I have two image blocks stored as 1D arrays and have to do the following bitwise AND operations among their elements.
int compare(unsigned char *a, int a_pitch,
            unsigned char *b, int b_pitch, int a_lenx, int a_leny)
{
    int overlap = 0;
    for (int y = 0; y < a_leny; y++)
        for (int x = 0; x < a_lenx; x++)
        {
            if (a[x + y * a_pitch] & b[x + y * b_pitch])
                overlap++;
        }
    return overlap;
}
Actually, I have to do this job about 220,000 times, so it becomes very slow on iPhone devices.
How could I accelerate this job on the iPhone?
I heard that NEON could be useful, but I'm not really familiar with it. In addition, it seems that NEON doesn't have bitwise AND...
Option 1 - Work in the native width of your platform (it's faster to fetch 32-bits into a register and then do operations on that register than it is to fetch and compare data one byte at a time):
#include <stdint.h>

int compare(unsigned char *a, int a_pitch,
            unsigned char *b, int b_pitch, int a_lenx, int a_leny)
{
    int overlap = 0;
    uint32_t* a_int = (uint32_t*)a;
    uint32_t* b_int = (uint32_t*)b;
    // Process 4 bytes per 32-bit word, so only the x dimension and the
    // pitches shrink by a factor of 4 (they must be multiples of 4).
    int a_lenx_int = a_lenx / 4;
    int a_pitch_int = a_pitch / 4;
    int b_pitch_int = b_pitch / 4;
    for (int y = 0; y < a_leny; y++)
        for (int x = 0; x < a_lenx_int; x++)
        {
            uint32_t aVal = a_int[x + y * a_pitch_int];
            uint32_t bVal = b_int[x + y * b_pitch_int];
            if ((aVal & 0xFF) & (bVal & 0xFF))
                overlap++;
            if (((aVal >> 8) & 0xFF) & ((bVal >> 8) & 0xFF))
                overlap++;
            if (((aVal >> 16) & 0xFF) & ((bVal >> 16) & 0xFF))
                overlap++;
            if (((aVal >> 24) & 0xFF) & ((bVal >> 24) & 0xFF))
                overlap++;
        }
    return overlap;
}
Option 2 - Use a heuristic to get an approximate result using fewer calculations (a good approach if the absolute difference between 101 overlaps and 100 overlaps is not important to your application):
int compare(unsigned char *a, int a_pitch,
            unsigned char *b, int b_pitch, int a_lenx, int a_leny)
{
    int overlap = 0;
    for (int y = 0; y < a_leny; y += 10)
        for (int x = 0; x < a_lenx; x += 10)
        {
            // we compare 1% of all the pixels, and use that as the result
            if (a[x + y * a_pitch] & b[x + y * b_pitch])
                overlap++;
        }
    return overlap * 100;
}
Option 3 - Rewrite your function in inline assembly code. You're on your own for this one.
Your code is Rambo for the CPU - its worst nightmare:
Byte access: like aroth mentioned, ARM is VERY slow at reading single bytes from memory.
Random access: two absolutely unnecessary multiply/add operations per pixel, on top of the already steep performance penalty random access incurs by its nature.
Simply put, everything that can be wrong is wrong.
Don't call me rude. Let me be your angel instead.
First, I'll provide you a working NEON version. Then an optimized C version showing you exactly what you did wrong.
Just give me some time. I have to go to bed right now, and I have an important meeting tomorrow.
Why don't you learn ARM assembly? It's much easier and more useful than x86 assembly.
It will also improve your C programming capabilities by a huge step.
Strongly recommended
cya
==============================================================================
Ok, here is an optimized version written in C with ARM assembly in mind.
Please note that both the pitches AND a_lenx have to be multiples of 4. Otherwise, it won't work properly.
There isn't much room left for optimization with ARM assembly on top of this version. (NEON is a different story - coming soon.)
Take a careful look at how to handle variable declarations, loop, memory access, and AND operations.
And make sure that this function runs in ARM mode and not Thumb for best results.
unsigned int compare(unsigned int *a, unsigned int a_pitch,
                     unsigned int *b, unsigned int b_pitch,
                     unsigned int a_lenx, unsigned int a_leny)
{
    unsigned int overlap = 0;
    unsigned int a_gap = (a_pitch - a_lenx) >> 2;
    unsigned int b_gap = (b_pitch - a_lenx) >> 2;
    unsigned int aval, bval, xcount;
    do
    {
        xcount = (a_lenx >> 2);
        do
        {
            aval = *a++;
            // ldr aval, [a], #4
            bval = *b++;
            // ldr bval, [b], #4
            aval &= bval;
            // and aval, aval, bval
            if (aval & 0x000000ff) overlap += 1;
            // tst aval, #0x000000ff
            // addne overlap, overlap, #1
            if (aval & 0x0000ff00) overlap += 1;
            // tst aval, #0x0000ff00
            // addne overlap, overlap, #1
            if (aval & 0x00ff0000) overlap += 1;
            // tst aval, #0x00ff0000
            // addne overlap, overlap, #1
            if (aval & 0xff000000) overlap += 1;
            // tst aval, #0xff000000
            // addne overlap, overlap, #1
        } while (--xcount);
        a += a_gap;
        b += b_gap;
    } while (--a_leny);
    return overlap;
}
First of all, why the double loop? You can do it with a single loop and a couple of pointers.
Also, you don't need to calculate x+y*pitch for every single pixel; just increment two pointers by one. Incrementing by one is a lot faster than x+y*pitch.
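To make that concrete, here is a minimal sketch of the single-loop version (my own illustration, with a hypothetical name compare_flat, assuming the pitches equal a_lenx so the rows are contiguous; otherwise you would advance the pointers past the per-row gap as in the answer above):

#include <stddef.h>

/* Single loop, two walking pointers, no x + y * pitch per pixel.
   Assumes a_pitch == b_pitch == a_lenx (contiguous rows). */
int compare_flat(const unsigned char *a, const unsigned char *b,
                 int a_lenx, int a_leny)
{
    int overlap = 0;
    size_t n = (size_t)a_lenx * (size_t)a_leny;
    while (n--)
    {
        if (*a++ & *b++)   /* same per-pixel test as the original */
            overlap++;
    }
    return overlap;
}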
Why exactly do you need to perform this operation? I would make sure there are no high-level optimizations/changes available before looking into a low-level solution like NEON.
This is an interview question I saw on some site.
It was mentioned that the answer involves forming a recurrence of log2() as follows:
double log2(double x)
{
    if (x <= 2) return 1;
    if (IsSquareNum(x))
        return log2(sqrt(x)) * 2;
    return log2(sqrt(x)) * 2 + 1; // Why the plus one here?
}
As for the recurrence, clearly the +1 is wrong, and the base case is erroneous as well.
Does anyone know a better answer?
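For what it's worth, here is a minimal sketch of what I think a corrected recurrence would look like (my own illustration, not necessarily the intended interview answer): log2(x) = 2 * log2(sqrt(x)) holds for every positive x, square or not, so the +1 branch disappears, and repeated square roots push x toward 1, where log2(x) is approximately (x - 1)/ln 2.

#include <math.h>

/* Sketch only: log2 via repeated square roots, for x > 0. */
double log2_rec(double x)
{
    if (fabs(x - 1.0) < 1e-9)                    /* base case: x is close to 1  */
        return (x - 1.0) / 0.6931471805599453;   /* log2(x) ~ (x - 1) / ln(2)   */
    return 2.0 * log2_rec(sqrt(x));              /* log2(x) = 2 * log2(sqrt(x)) */
}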
Perhaps I have found the exact answer the interviewers were looking for. For my part, I would say it's a little bit difficult to derive this under interview pressure. The idea is: say you want to find log2(13); you know that it lies between 3 and 4, since 3 = log2(8) and 4 = log2(16).
From the properties of logarithms, we know that log2(sqrt(8*16)) = (log2(8) + log2(16))/2 = (3+4)/2 = 3.5.
Now, sqrt(8*16) = 11.3137 and log2(11.3137) = 3.5. Since 11.3137 < 13, we know that our desired log2(13) must lie between 3.5 and 4, and we proceed to narrow it down. It is easy to notice that this is a binary search, and we iterate until our value converges to the value whose log2() we wish to find. The code is given below:
#include <cmath>

double Log2(double val)
{
    int lox, hix;
    double rval, lval;
    hix = 0;
    while ((1 << hix) < val)
        hix++;
    lox = hix - 1;
    lval = (1 << lox);
    rval = (1 << hix);
    double lo = lox, hi = hix;
    // cout << lox << " " << hix << endl;
    // cout << lval << " " << rval;
    while (fabs(lval - val) > 1e-7)
    {
        double mid = (lo + hi) / 2;
        double midValue = sqrt(lval * rval);
        if (midValue > val)
        {
            hi = mid;
            rval = midValue;
        }
        else {
            lo = mid;
            lval = midValue;
        }
    }
    return lo;
}
It's been a long time since I've written pure C, so here it is in C++ (I think the only difference is the output function, so you should be able to follow it):
#include <iostream>
#include <cmath>
using namespace std;

const static double CUTOFF = 1e-10;

double log2_aux(double x, double power, double twoToTheMinusN, unsigned int accumulator) {
    if (twoToTheMinusN < CUTOFF)
        return accumulator * twoToTheMinusN * 2;
    else {
        int thisBit;
        if (x > power) {
            thisBit = 1;
            x /= power;
        }
        else
            thisBit = 0;
        accumulator = (accumulator << 1) + thisBit;
        return log2_aux(x, sqrt(power), twoToTheMinusN / 2.0, accumulator);
    }
}

double mylog2(double x) {
    if (x < 1)
        return -mylog2(1.0 / x);
    else if (x == 1)
        return 0;
    else if (x > 2.0)
        return mylog2(x / 2.0) + 1;
    else
        return log2_aux(x, 2.0, 1.0, 0);
}

int main() {
    cout << "5 " << mylog2(5) << "\n";
    cout << "1.25 " << mylog2(1.25) << "\n";
    return 0;
}
The function 'mylog2' does some simple log trickery to get a related number which is between 1 and 2, then calls log2_aux with that number.
The log2_aux more or less follows the algorithm that Scorpi0 linked to above. At each step, you get 1 bit of the result. When you have enough bits, stop.
If you can get a hold of a copy, the Feynman Lectures on Physics, number 23, starts off with a great explanation of logs and more or less how to do this conversion. Vastly superior to the Wikipedia article.