Save frame from TangoService_connectOnFrameAvailable - android-camera

How can I save a frame via TangoService_connectOnFrameAvailable() and display it correctly on my computer? As this reference page mentions, the pixels are stored in the HAL_PIXEL_FORMAT_YV12 format. In my callback function for TangoService_connectOnFrameAvailable, I save the frame like this:
static void onColorFrameAvailable(void* context, TangoCameraId id, const TangoImageBuffer* buffer)
{
...
std::ofstream fp;
fp.open(imagefile, std::ios::out | std::ios::binary );
int offset = 0;
for(int i = 0; i < buffer->height*2 + 1; i++) {
fp.write((char*)(buffer->data + offset), buffer->width);
offset += buffer->stride;
}
fp.close();
}
Then, to get rid of the metadata in the first row and to display the image, I run:
$ dd if="input.raw" of="new.raw" bs=1 skip=1280
$ vooya new.raw
I was careful to make sure in vooya that the channel order is yvu. The resulting output is:
What am I doing wrong in saving the image and displaying it?
UPDATE per Mark Mullin's response:
int offset = buffer->stride; // header offset
// copy Y channel
for(int i = 0; i < buffer->height; i++) {
fp.write((char*)(buffer->data + offset), buffer->width);
offset += buffer->stride;
}
// copy V channel
for(int i = 0; i < buffer->height / 2; i++) {
fp.write((char*)(buffer->data + offset), buffer->width / 2);
offset += buffer->stride / 2;
}
// copy U channel
for(int i = 0; i < buffer->height / 2; i++) {
fp.write((char*)(buffer->data + offset), buffer->width / 2);
offset += buffer->stride / 2;
}
This now shows the picture below, but there are still some artifacts; I wonder if that's from the Tango tablet camera or my processing of the raw data... any thoughts?

I can't say exactly what you're doing wrong, and Tango images often have artifacts in them. Yours are new to me, but I often see baby blue where glare seems to be annoying deeper systems, and as the camera begins to lose sync with the depth system under load, you'll often see what looks like a shiny grid (it's the IR pattern, I think). In the end, every rational attempt to handle the image with OpenCV etc. failed, so I hand-wrote the decoder with some help from an SO thread here.
That said, given that the image buffer contains a pointer to the raw data from Tango, and that the other variables like height and stride are filled in from the data received in the callback, this logic will create an RGBA map. Yeah, I optimized the math in it, so it's a little ugly; its slower but functionally equivalent twin is listed second. My own experience says it's a horrible idea to try to do this decode right in the callback (I believe Tango is capable of losing sync with the flash for depth for purely spiteful reasons), so mine runs at the render stage.
Fast
uchar* pData = TangoData::cameraImageBuffer;
uchar* iData = TangoData::cameraImageBufferRGBA;
int size = (int)(TangoData::imageBufferStride * TangoData::imageBufferHeight);
float invByte = 0.0039215686274509803921568627451; // ( 1 / 255)
int halfi, uvOffset, halfj, uvOffsetHalfj;
float y_scaled, v_scaled, u_scaled;
int uOffset = size / 4 + size;
int halfstride = TangoData::imageBufferStride / 2;
for (int i = 0; i < TangoData::imageBufferHeight; ++i)
{
halfi = i / 2;
uvOffset = halfi * halfstride;
for (int j = 0; j < TangoData::imageBufferWidth; ++j)
{
halfj = j / 2;
uvOffsetHalfj = uvOffset + halfj;
y_scaled = pData[i * TangoData::imageBufferStride + j] * invByte;
v_scaled = 2 * (pData[uvOffsetHalfj + size] * invByte - 0.5f) * Vmax;
u_scaled = 2 * (pData[uvOffsetHalfj + uOffset] * invByte - 0.5f) * Umax;
*iData++ = (uchar)((y_scaled + 1.13983f * v_scaled) * 255.0);
*iData++ = (uchar)((y_scaled - 0.39465f * u_scaled - 0.58060f * v_scaled) * 255.0);
*iData++ = (uchar)((y_scaled + 2.03211f * u_scaled) * 255.0);
*iData++ = 255;
}
}
Understandable
for (int i = 0; i < TangoData::imageBufferHeight; ++i)
{
for (int j = 0; j < TangoData::imageBufferWidth; ++j)
{
uchar y = pData[i * TangoData::imageBufferStride + j];
uchar v = pData[(i / 2) * (TangoData::imageBufferStride / 2) + (j / 2) + size];
uchar u = pData[(i / 2) * (TangoData::imageBufferStride / 2) + (j / 2) + size + (size / 4)];
YUV2RGB(y, u, v); // presumably converts in place, so y, u, v now hold R, G, B
*iData++ = y;
*iData++ = u;
*iData++ = v;
*iData++ = 255;
}
}
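For readers who want to try the decode offline, here is a minimal Python sketch of the same per-pixel logic. It assumes the plane layout used in the snippets above (Y plane first, V plane at offset stride * height, U plane a quarter of that size later, chroma rows half a stride wide). Umax and Vmax are not defined in the snippets, so the values 0.436 and 0.615 below are an assumption, as are the function names.

```python
U_MAX = 0.436  # assumed value; Umax is not defined in the snippets above
V_MAX = 0.615  # assumed value; Vmax is not defined in the snippets above


def clamp_byte(value):
    """Truncate to int and clamp into the 0..255 range."""
    return max(0, min(255, int(value)))


def yv12_to_rgba(data, width, height, stride):
    """Decode a Y + V + U planar buffer into packed RGBA bytes."""
    size = stride * height          # bytes in the Y plane
    half_stride = stride // 2       # bytes per chroma row
    v_base = size                   # V plane follows the Y plane
    u_base = size + size // 4       # U plane follows the V plane
    out = bytearray(width * height * 4)
    k = 0
    for i in range(height):
        uv_row = (i // 2) * half_stride
        for j in range(width):
            uv_index = uv_row + j // 2
            y = data[i * stride + j] / 255.0
            v = 2 * (data[v_base + uv_index] / 255.0 - 0.5) * V_MAX
            u = 2 * (data[u_base + uv_index] / 255.0 - 0.5) * U_MAX
            out[k] = clamp_byte((y + 1.13983 * v) * 255.0)
            out[k + 1] = clamp_byte((y - 0.39465 * u - 0.58060 * v) * 255.0)
            out[k + 2] = clamp_byte((y + 2.03211 * u) * 255.0)
            out[k + 3] = 255
            k += 4
    return bytes(out)
```

Unlike the C++ versions, this sketch clamps each channel, because the raw formula can fall slightly outside 0..255 for saturated colors.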

I think there is a better way to do this if you can do it offline.
The best way to save the image should be something like this (don't forget to create the Pictures folder, or nothing will be saved):
void onFrameAvailableRouter(void* context, TangoCameraId id, const TangoImageBuffer* buffer) {
//To write the image in a txt file.
std::stringstream name_stream;
name_stream.setf(std::ios_base::fixed, std::ios_base::floatfield);
name_stream.precision(3);
name_stream << "/storage/emulated/0/Pictures/"
<<cur_frame_timstamp_
<<".txt";
std::fstream f(name_stream.str().c_str(), std::ios::out | std::ios::binary);
// size = 1280*720*1.5 to save YUV or 1280*720 to save grayscale
int size = stride_ * height_ * 1.5;
f.write((const char *) buffer->data,size * sizeof(uint8_t));
f.close();
}
Then to convert the .txt file to png you can do this
import glob
from os import listdir, makedirs

import cv2
import numpy as np

# Frame geometry; must match the saved buffers (1280x720 as in the C++ comment above).
width = 1280
height = 720
channels = 4  # BGRA output

inputFolder = "input"
outputFolderRGB = "output/rgb"
outputFolderGray = "output/gray"

if "input" in glob.glob("*"):
    allFile = listdir(inputFolder)
    numberOfFile = len(allFile)
    # Create the output directories if they do not exist yet
    makedirs(outputFolderRGB, exist_ok=True)
    makedirs(outputFolderGray, exist_ok=True)
    count = 0
    for file in allFile:
        count += 1
        print("current file : ", count, "/", numberOfFile)
        input_filename = file
        output_filename = input_filename[0:(len(input_filename) - 3)] + "png"
        # load file into buffer
        data = np.fromfile(inputFolder + "/" + input_filename, dtype=np.uint8)
        # To get an RGB image, first view the buffer as a YUV image
        yuv = np.ndarray((height + height // 2, width), dtype=np.uint8, buffer=data)
        # create a height x width x channels matrix with the datatype uint8 for rgb image
        img = np.zeros((height, width, channels), dtype=np.uint8)
        # convert yuv image to rgb image
        cv2.cvtColor(yuv, cv2.COLOR_YUV2BGRA_NV21, img, channels)
        cv2.imwrite(outputFolderRGB + "/" + output_filename, img)
        # If you saved the image in grayscale, use this part instead
        # yuvReal = np.ndarray((height, width), dtype=np.uint8, buffer=data)
        # cv2.imwrite(outputFolderGray + "/" + output_filename, yuvReal)
else:
    print("not any input")
You just have to put your .txt files in a folder named input.
It's a Python script, but a C++ version would be very close if you prefer that.
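As a sanity check before converting, the expected size of each dump can be computed from the frame geometry. The sketch below assumes the buffer really is YV12 as the question states, and uses the layout rules from Android's YV12 documentation (chroma stride is half the luma stride rounded up to a multiple of 16, V plane before U plane). The function names are made up for illustration.

```python
def align(value, boundary):
    """Round value up to the next multiple of boundary."""
    return (value + boundary - 1) // boundary * boundary


def yv12_layout(height, stride):
    """Return plane sizes and offsets, in bytes, for a YV12 buffer."""
    y_size = stride * height
    c_stride = align(stride // 2, 16)
    c_size = c_stride * height // 2
    return {
        "y_size": y_size,
        "c_stride": c_stride,
        "v_offset": y_size,            # V (Cr) plane comes first
        "u_offset": y_size + c_size,   # U (Cb) plane follows
        "total": y_size + 2 * c_size,
    }
```

For a 1280x720 frame with a stride of 1280 the total is 1280 * 720 * 1.5 bytes, so a saved file of a different size suggests a header row or padding that still needs to be stripped.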

Related

How to convert RGB pixmap to ui.Image in Dart?

Currently I have a Uint8List, formatted like [R,G,B,R,G,B,...] for all the pixels of the image. And of course I have its width and height.
I found decodeImageFromPixels while searching but it only takes RGBA/BGRA format. I converted my pixmap from RGB to RGBA and this function works fine.
However, my code now looks like this:
Uint8List rawPixel = raw.value.asTypedList(w * h * channel);
List<int> rgba = [];
for (int i = 0; i < rawPixel.length; i++) {
rgba.add(rawPixel[i]);
if ((i + 1) % 3 == 0) {
rgba.add(0);
}
}
Uint8List rgbaList = Uint8List.fromList(rgba);
Completer<Image> c = Completer<Image>();
decodeImageFromPixels(rgbaList, w, h, PixelFormat.rgba8888, (Image img) {
c.complete(img);
});
I have to make a new list (a waste of space) and iterate through the entire list (a waste of time).
This is too inefficient in my opinion, is there any way to make this more elegant? Like add a new PixelFormat.rgb888?
Thanks in advance.
You may find that this loop is faster as it doesn't keep appending to the list and then copy it at the end.
final rawPixel = raw.value.asTypedList(w * h * channel);
final rgbaList = Uint8List(w * h * 4); // create the Uint8List directly as we know the width and height
for (var i = 0; i < w * h; i++) {
final rgbOffset = i * 3;
final rgbaOffset = i * 4;
rgbaList[rgbaOffset] = rawPixel[rgbOffset]; // red
rgbaList[rgbaOffset + 1] = rawPixel[rgbOffset + 1]; // green
rgbaList[rgbaOffset + 2] = rawPixel[rgbOffset + 2]; // blue
rgbaList[rgbaOffset + 3] = 255; // a
}
An alternative is to prepend the array with a BMP header by adapting this answer (though it would be simpler, as there would be no palette) and to pass that bitmap to instantiateImageCodec, as that code is presumably highly optimized for parsing bitmaps.
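To show what the BMP route involves, here is a small Python sketch of the byte layout for an uncompressed 24-bit bitmap. The function name is hypothetical, and whether Flutter's codec decodes this faster than the loop above is not something the sketch demonstrates. Note that a plain prepend is only enough when the data is already in BGR order with rows padded to 4 bytes; otherwise the pixel rows have to be rewritten as below.

```python
import struct


def rgb_to_bmp(rgb, width, height):
    """Wrap tightly packed top-down RGB bytes in a 24-bit BMP container."""
    row_bytes = width * 3
    padding = (4 - row_bytes % 4) % 4      # BMP rows are 4-byte aligned
    image_size = (row_bytes + padding) * height
    # BITMAPFILEHEADER: magic, file size, two reserved fields, pixel data offset
    file_header = struct.pack("<2sIHHI", b"BM", 54 + image_size, 0, 0, 54)
    # BITMAPINFOHEADER: size, width, height, planes, bit count, compression,
    # image size, x/y pixels per metre, colours used, important colours
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24, 0,
                              image_size, 2835, 2835, 0, 0)
    pixels = bytearray()
    for y in range(height - 1, -1, -1):    # positive height means bottom-up rows
        row = bytearray(rgb[y * row_bytes:(y + 1) * row_bytes])
        row[0::3], row[2::3] = row[2::3], row[0::3]   # RGB -> BGR
        pixels += row + bytes(padding)
    return file_header + info_header + bytes(pixels)
```

The header is 54 bytes in total (14 + 40), and there is no palette because 24-bit bitmaps store the colour of each pixel directly.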

How to optimize flutter CameraImage to TensorImage?

This function is too slow. Is there a more efficient way to convert a Flutter CameraImage to a TensorImage in Dart?
var img = imglib.Image(image.width, image.height); // Create Image buffer
Plane plane = image.planes[0];
const int shift = (0xFF << 24);
// Fill image buffer with plane[0] from YUV420_888
for (int x = 0; x < image.width; x++) {
for (int planeOffset = 0;
planeOffset < image.height * image.width;
planeOffset += image.width) {
final pixelColor = plane.bytes[planeOffset + x];
// color: 0x FF FF FF FF
// A B G R
// Calculate pixel color
var newVal =
shift | (pixelColor << 16) | (pixelColor << 8) | pixelColor;
img.data[planeOffset + x] = newVal;
}
}
return img;
}
It seems your for loop is inefficient. The data for a whole row (same planeOffset, different x) will be cached at once, so it would be faster to switch the order of the two loops.
for (int y = 0; y < image.height; y++) {
for (int x = 0; x < image.width; x++) {
final pixelColor = plane.bytes[y * image.width + x];
// ...
}
}
However, your code does not seem to be reading from the actual camera stream. Please refer to this thread for converting CameraImage to Image:
How to convert Camera Image to Image in Flutter?
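The row-major traversal and the pixel packing from the question can be sketched in Python as follows. The row_stride parameter is an assumption added for illustration: it covers the case where each row of the plane is padded beyond the image width, which the original code does not handle.

```python
def luma_to_abgr(plane, width, height, row_stride):
    """Pack each luma byte into an opaque 0xAABBGGRR grey pixel, row by row."""
    pixels = []
    for y in range(height):
        row_start = y * row_stride      # start of this row in the plane
        for x in range(width):          # walk adjacent bytes within the row
            grey = plane[row_start + x]
            pixels.append(0xFF000000 | (grey << 16) | (grey << 8) | grey)
    return pixels
```

The inner loop touches consecutive bytes, which is the access pattern the reordered Dart loops aim for.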

How do I get the initBricks function to accept floats for better precision when placing bricks?

I'm trying to be very precise with the positioning of my bricks when using initBricks. However, they are not being placed exactly where they should be (out by small fractions), and I'm pretty certain it is because my code is using ints where I'm trying to get it to use floats. Or perhaps the prototype only accepts ints, in which case, how can I get my bricks to be properly spaced given that I have to have 10 ROWS and space between the bricks?
/**
* Initializes window with a grid of bricks.
*/
void initBricks(GWindow window)
{
float brickWidth = WIDTH / (COLS + 1);
float brickHeight = HEIGHT / (3*ROWS);
float brickSpace = WIDTH / (20*(ROWS + 1));
for (int j = 0; j < ROWS; j++)
{
for (int i = 0; i < COLS; i++)
{
GRect brick = newGRect(i*brickWidth + i*brickSpace + brickSpace, j*brickHeight + j*brickSpace + brickSpace,
brickWidth, brickHeight);
setColor(brick, "RED");
setFilled(brick, true);
add(window, brick);
}
}
}
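One possible cause, assuming WIDTH, HEIGHT, ROWS and COLS are integer constants (their definitions are not shown): in C, an expression such as WIDTH / (COLS + 1) is evaluated with integer division before the result is stored in the float, so the fraction is already lost. The Python sketch below imitates this with the truncating // operator; the values 400 and 10 used in the example are hypothetical.

```python
def brick_left_edges(width, cols, rows, truncate):
    """Return the x coordinate of each brick in a row, as computed in initBricks."""
    if truncate:
        # What C does when both operands are ints
        brick_width = width // (cols + 1)
        brick_space = width // (20 * (rows + 1))
    else:
        # What the question intends: real-valued division
        brick_width = width / (cols + 1)
        brick_space = width / (20 * (rows + 1))
    return [i * brick_width + i * brick_space + brick_space for i in range(cols)]


truncated = brick_left_edges(400, 10, 10, truncate=True)
exact = brick_left_edges(400, 10, 10, truncate=False)
print(truncated[-1], exact[-1])  # the last brick drifts by several pixels
```

If that is what is happening, casting one operand in the C code, for example (float) WIDTH / (COLS + 1), would keep the fractional part.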

Perceptual (or average) image hashing

I need to calculate the perceptual hash of an image and should do it without using any external libraries.
I tried using pHash (http://phash.org/) but I wasn't able to compile it for iOS (5) and I haven't found a real tutorial on how to do it.
One (library-dependent) solution is to use the pHashing functionality added to ImageMagick in version 6.8.8.3, which has iOS binaries available. Usage examples are documented here.
Here's also a simple reference function (in C#) for generating your own comparable image average hash, found on this blog.
public static ulong AverageHash(System.Drawing.Image theImage)
// Calculate a hash of an image based on visual characteristics.
// Described at http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
{
// Squeeze the image down to an 8x8 image.
// Chant the ancient incantations to create the correct data structures.
Bitmap squeezedImage = new Bitmap(8, 8, PixelFormat.Format32bppRgb);
Graphics drawingArea = Graphics.FromImage(squeezedImage);
drawingArea.CompositingQuality = CompositingQuality.HighQuality;
drawingArea.InterpolationMode = InterpolationMode.HighQualityBilinear;
drawingArea.SmoothingMode = SmoothingMode.HighQuality;
drawingArea.DrawImage(theImage, 0, 0, 8, 8);
byte[] grayScaleImage = new byte[64];
uint averageValue = 0;
ulong finalHash = 0;
// Reduce to 8-bit grayscale and calculate the average pixel value.
for(int y = 0; y < 8; y++)
{
for(int x = 0; x < 8; x++)
{
Color pixelColour = squeezedImage.GetPixel(x,y);
uint grayTone = ((uint)((pixelColour.R * 0.3) + (pixelColour.G * 0.59) + (pixelColour.B * 0.11)));
grayScaleImage[x + y*8] = (byte)grayTone;
averageValue += grayTone;
}
}
averageValue /= 64;
// Return 1-bits when the tone is equal to or above the average,
// and 0-bits when it's below the average.
for(int k = 0; k < 64; k++)
{
if(grayScaleImage[k] >= averageValue)
{
finalHash |= (1UL << (63-k));
}
}
return finalHash;
}
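Two such hashes are usually compared by counting the bits in which they differ (the Hamming distance). The following Python sketch mirrors the thresholding step of the C# function on an already reduced 8x8 grayscale image and adds that comparison; the function names are made up, and the resizing step is left out because it needs an imaging library.

```python
def average_hash(gray64):
    """Build a 64-bit hash from 64 grayscale values (row-major 8x8 image)."""
    if len(gray64) != 64:
        raise ValueError("expected 64 grayscale values")
    average = sum(gray64) // 64          # integer average, as in the C# code
    result = 0
    for k, tone in enumerate(gray64):
        if tone >= average:
            result |= 1 << (63 - k)      # first pixel maps to the highest bit
    return result


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")
```

A distance of 0 means the reduced images are identical under this hash; the threshold for treating two images as similar is a tuning choice that depends on the data.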

Teaching a Neural Net: Bipolar XOR

I'm trying to teach a neural net of 2 inputs, 4 hidden nodes (all in the same layer) and 1 output node. The binary representation works fine, but I have problems with the bipolar one. I can't figure out why, but the total error will sometimes converge to the same number, around 2.xx. My sigmoid is 2/(1 + exp(-x)) - 1. Perhaps I'm sigmoiding in the wrong place. For example, to calculate the output error, should I be comparing the sigmoided output with the expected value or with the sigmoided expected value?
I was following this website: http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html , but they use different functions than I was instructed to use. Even when I did try to implement their functions, I still ran into the same problem. Either way, I get stuck about half the time at the same number (a different number for different implementations). Please tell me if I have made a mistake in my code somewhere or if this is normal (I don't see how it could be). Momentum is set to 0. Is this a common zero-momentum problem? The error functions we are supposed to be using are:
if ui is an output unit
Error(i) = (Ci - ui ) * f'(Si )
if ui is a hidden unit
Error(i) = Error(Output) * weight(i to output) * f'(Si)
public double sigmoid( double x ) {
double fBipolar, fBinary, temp;
temp = (1 + Math.exp(-x));
fBipolar = (2 / temp) - 1;
fBinary = 1 / temp;
if(bipolar){
return fBipolar;
}else{
return fBinary;
}
}
// Initialize the weights to random values.
private void initializeWeights(double neg, double pos) {
for(int i = 0; i < numInputs + 1; i++){
for(int j = 0; j < numHiddenNeurons; j++){
inputWeights[i][j] = Math.random() - pos;
if(inputWeights[i][j] < neg || inputWeights[i][j] > pos){
print("ERROR ");
print(inputWeights[i][j]);
}
}
}
for(int i = 0; i < numHiddenNeurons + 1; i++){
hiddenWeights[i] = Math.random() - pos;
if(hiddenWeights[i] < neg || hiddenWeights[i] > pos){
print("ERROR ");
print(hiddenWeights[i]);
}
}
}
// Computes output of the NN without training. I.e. a forward pass
public double outputFor ( double[] argInputVector ) {
for(int i = 0; i < numInputs; i++){
inputs[i] = argInputVector[i];
}
double weightedSum = 0;
for(int i = 0; i < numHiddenNeurons; i++){
weightedSum = 0;
for(int j = 0; j < numInputs + 1; j++){
weightedSum += inputWeights[j][i] * inputs[j];
}
hiddenActivation[i] = sigmoid(weightedSum);
}
weightedSum = 0;
for(int j = 0; j < numHiddenNeurons + 1; j++){
weightedSum += (hiddenActivation[j] * hiddenWeights[j]);
}
return sigmoid(weightedSum);
}
//Computes the derivative of f
public static double fPrime(double u){
double fBipolar, fBinary;
fBipolar = 0.5 * (1 - Math.pow(u,2));
fBinary = u * (1 - u);
if(bipolar){
return fBipolar;
}else{
return fBinary;
}
}
// This method is used to update the weights of the neural net.
public double train ( double [] argInputVector, double argTargetOutput ){
double output = outputFor(argInputVector);
double lastDelta;
double outputError = (argTargetOutput - output) * fPrime(output);
if(outputError != 0){
for(int i = 0; i < numHiddenNeurons + 1; i++){
hiddenError[i] = hiddenWeights[i] * outputError * fPrime(hiddenActivation[i]);
deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
hiddenWeights[i] += deltaHiddenWeights[i];
}
for(int in = 0; in < numInputs + 1; in++){
for(int hid = 0; hid < numHiddenNeurons; hid++){
lastDelta = deltaInputWeights[in][hid];
deltaInputWeights[in][hid] = learningRate * hiddenError[hid] * inputs[in] + (momentum * lastDelta);
inputWeights[in][hid] += deltaInputWeights[in][hid];
}
}
}
return 0.5 * (argTargetOutput - output) * (argTargetOutput - output);
}
General coding comments:
initializeWeights(-1.0, 1.0);
may not actually get the initial values you were expecting.
initializeWeights should probably have:
inputWeights[i][j] = Math.random() * (pos - neg) + neg;
// ...
hiddenWeights[i] = (Math.random() * (pos - neg)) + neg;
instead of:
Math.random() - pos;
so that this works:
initializeWeights(0.0, 1.0);
and gives you initial values between 0.0 and 1.0 rather than between -1.0 and 0.0.
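The difference between the two formulas can be checked with a short Python sketch. The function names are illustrative, and random.Random stands in for Java's Math.random().

```python
import random


def init_weights_original(count, pos, rng):
    """The formula from the question: values land in [-pos, 1 - pos)."""
    return [rng.random() - pos for _ in range(count)]


def init_weights_scaled(count, neg, pos, rng):
    """The suggested formula: values land in [neg, pos)."""
    return [rng.random() * (pos - neg) + neg for _ in range(count)]


rng = random.Random(0)
old = init_weights_original(1000, 1.0, rng)
new = init_weights_scaled(1000, -1.0, 1.0, rng)
print(min(old), max(old))  # stays below zero
print(min(new), max(new))  # spread over both signs
```

With the original formula and pos = 1.0, every weight starts out negative, so the network never starts from a symmetric spread around zero.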
lastDelta is used before it has been assigned a value:
deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
I'm not sure if the + 1 on numInputs + 1 and numHiddenNeurons + 1 are necessary.
Remember to watch out for rounding of ints: 5/2 = 2, not 2.5!
Use 5.0/2.0 instead. In general, add the .0 in your code when the output should be a double.
Most importantly, have you trained the NeuralNet long enough?
Try running it with numInputs = 2, numHiddenNeurons = 4, learningRate = 0.9, and train for 1,000 or 10,000 times.
Using numHiddenNeurons = 2, it sometimes gets "stuck" when trying to solve the XOR problem.
See also XOR problem - simulation
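On the question of where the sigmoid belongs: fPrime in the question's code takes the activation (the sigmoid's output), not the weighted sum, which matches the identities f'(x) = 0.5 * (1 - f(x)^2) for the bipolar sigmoid and f'(x) = f(x) * (1 - f(x)) for the binary one. A small Python sketch, with illustrative function names, checks both identities against a numerical derivative.

```python
import math


def sigmoid(x, bipolar):
    """Bipolar sigmoid ranges over (-1, 1); binary sigmoid over (0, 1)."""
    binary = 1.0 / (1.0 + math.exp(-x))
    return 2.0 * binary - 1.0 if bipolar else binary


def f_prime(activation, bipolar):
    """Derivative expressed in terms of the sigmoid's output, as in the question."""
    if bipolar:
        return 0.5 * (1.0 - activation ** 2)
    return activation * (1.0 - activation)


def numeric_derivative(x, bipolar, step=1e-6):
    """Central difference approximation of the sigmoid's slope at x."""
    return (sigmoid(x + step, bipolar) - sigmoid(x - step, bipolar)) / (2 * step)
```

Since the bipolar sigmoid already produces values in (-1, 1), the target values -1 and 1 can be compared with the sigmoided output directly, as the code in the question does.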