CopyPixels method in Bgr32 format - BitmapSource

I'm just doing a simple job: converting a bitmap into an array, then recreating the bitmap from that array with the BitmapSource.Create method.
However, I'm getting the error: "Value does not fall within the expected range". Here's my code:
Dim width As Integer = bitmapImage.PixelWidth
Dim height As Integer = bitmapImage.PixelHeight
Dim bytesPerPixel As Integer = bitmapImage.Format.BitsPerPixel / 8
Dim stride As Integer = width * bytesPerPixel
Dim pixelBuffer = New Byte(height * stride - 1) {}
bitmapImage.CopyPixels(pixelBuffer, stride, 0)
Dim bmpSource As BitmapSource = BitmapSource.Create(width, height, 96, 96, PixelFormats.Bgr32, Nothing, pixelBuffer, width)
Image2.Source = bmpSource
Any help regarding that will be appreciated, thank you.

Dim pixelBuffer = New Byte(height * stride - 1) {} is actually fine in VB.NET: the number in parentheses is the upper bound, not the length, so this allocates exactly height * stride bytes. (As an example, a 4x4 pixel image with 4 bytes per pixel needs 64 bytes, and New Byte(63) {} gives you indices 0 through 63, i.e. 64 bytes; it's in C# that new byte[63] would be one byte short.)
Also, you're using Bgr32 (4-byte pixels) here so you're safe, but with other pixel formats the stride may need to be rounded up to the next 4-byte boundary.
The actual cause of the error: BitmapSource.Create takes the stride as its last parameter, not the width.
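With that, the corrected call would look like this (a sketch using the question's variables; keeping PixelFormats.Bgr32 assumes the source image really is in that format):
Dim bmpSource As BitmapSource = BitmapSource.Create(width, height, 96, 96, PixelFormats.Bgr32, Nothing, pixelBuffer, stride) ' pass stride, not width
Image2.Source = bmpSource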

Related

CIConvolution chain leads to integer overflow

I've been doing some work with Core Image's convolution filters and I've noticed that sufficiently long chains of convolutions lead to unexpected outputs that I suspect are the result of numerical overflow in the underlying integer, float, or half-float type being used to hold the pixel data. This is especially unexpected because the documentation says that every convolution's output value is "clamped to the range between 0.0 and 1.0", so ever-larger values should not accumulate over successive passes of the filter, but that's exactly what seems to be happening.
I've got some sample code here that demonstrates this surprising behavior. You should be able to paste it as-is into just about any Xcode project, set a breakpoint at the end of it, run it on the appropriate platform (I'm using an iPhone Xs, not a simulator), and then, when the break occurs, use Quick Look to inspect the filter chain.
import CoreImage
import CoreImage.CIFilterBuiltins
// --------------------
// CREATE A WHITE IMAGE
// --------------------
// the desired size of the image
let size = CGSize(width: 300, height: 300)
// create a pixel buffer to use as input; every pixel is bgra(0,0,0,0) by default
var pixelBufferOut: CVPixelBuffer?
CVPixelBufferCreate(kCFAllocatorDefault, Int(size.width), Int(size.height), kCVPixelFormatType_32BGRA, nil, &pixelBufferOut)
let input = pixelBufferOut!
// create an image from the input
let image = CIImage(cvImageBuffer: input)
// create a color matrix filter that will turn every pixel white
// bgra(0,0,0,0) becomes bgra(1,1,1,1)
let matrixFilter = CIFilter.colorMatrix()
matrixFilter.biasVector = CIVector(string: "1 1 1 1")
// turn the image white
matrixFilter.inputImage = image
let whiteImage = matrixFilter.outputImage!
// the matrix filter sets the image's extent to infinity
// crop it back to the original size so Quick Look can display the image
let cropped = whiteImage.cropped(to: CGRect(origin: .zero, size: size))
// ------------------------------
// CONVOLVE THE IMAGE SEVEN TIMES
// ------------------------------
// create a 3x3 convolution filter with every weight set to 1
let convolutionFilter = CIFilter.convolution3X3()
convolutionFilter.weights = CIVector(string: "1 1 1 1 1 1 1 1 1")
// 1
convolutionFilter.inputImage = cropped
let convolved = convolutionFilter.outputImage!
// 2
convolutionFilter.inputImage = convolved
let convolvedx2 = convolutionFilter.outputImage!
// 3
convolutionFilter.inputImage = convolvedx2
let convolvedx3 = convolutionFilter.outputImage!
// 4
convolutionFilter.inputImage = convolvedx3
let convolvedx4 = convolutionFilter.outputImage!
// 5
convolutionFilter.inputImage = convolvedx4
let convolvedx5 = convolutionFilter.outputImage!
// 6
convolutionFilter.inputImage = convolvedx5
let convolvedx6 = convolutionFilter.outputImage!
// 7
convolutionFilter.inputImage = convolvedx6
let convolvedx7 = convolutionFilter.outputImage!
// <-- put a breakpoint here
// when you run the code you can hover over the variables
// to see what the image looks like at various stages through
// the filter chain; you will find that the image is still white
// up until the seventh convolution, at which point it turns black
Further evidence that this is an overflow issue is that if I use a CIContext to render the image to an output pixel buffer, I have the opportunity to set the actual numerical type used during the render via the CIContextOption.workingFormat option. On my platform the default value is CIFormat.RGBAh which means each color channel uses a 16 bit float. If instead I use CIFormat.RGBAf which uses full 32 bit floats this problem goes away because it takes a lot more to overflow 32 bits than it does 16.
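For reference, a sketch of creating such a context (assuming the option value is passed as an NSNumber-wrapped CIFormat, per the CIContextOption.workingFormat documentation):
// render with full 32-bit floats per channel instead of the default half floats
let context = CIContext(options: [.workingFormat: NSNumber(value: CIFormat.RGBAf.rawValue)])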
Is my insight into what's going on here correct or am I totally off? Is the documentation about clamping wrong or is this a bug with the filters?
It seems the documentation is outdated. Maybe it comes from a time when Core Image used 8-bit unsigned byte texture formats by default on iOS, because those are clamped between 0.0 and 1.0.
With the float-typed formats, the values aren't clamped anymore and are stored as returned by the kernel. And since you started with white (1.0) and applied 7 consecutive convolutions with unnormalized weights (1 instead of 1/9), you end up with values of 9^7 = 4,782,969 per channel, which is well outside the range of a 16-bit float (whose maximum is 65,504).
To avoid something like that, you should normalize your convolution weights so that they sum up to 1.0.
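For example (a sketch, not from the original answer): the question's 3x3 filter with its weights normalized to sum to 1.0, i.e. a box blur, which cannot grow channel values over repeated passes:
let ninth = CGFloat(1.0 / 9.0)
let normalizedWeights = [CGFloat](repeating: ninth, count: 9)
convolutionFilter.weights = CIVector(values: normalizedWeights, count: 9) // weights sum to 1.0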
By the way: to create a white image of a certain size, simply do this:
let image = CIImage(color: .white).cropped(to: CGRect(origin: .zero, size: CGSize(width: 300, height: 300)))
🙂

pass matlab image to open3d three::Image in a mex script

I am trying to load an image in a mex script and cast it to the corresponding format that the Open3D library uses, i.e. three::Image. I am using the following code:
uint8_t *rgb_image = (uint8_t *) mxGetData(prhs[3]); // raw uint8 image data
const mwSize *dims = mxGetDimensions(prhs[3]);
int height = (int) dims[0]; // MATLAB is column-major: rows first
int width = (int) dims[1];
int channels = (int) dims[2];
int imsize = height * width;
Image image;
image.PrepareImage(height, width, 3, sizeof(uint8_t)); // parameters: height, width, num_of_channels, bytes_per_channel
memcpy(image.data_.data(), rgb_image, image.data_.size());
The above works fine when I pass a grayscale image and specify num_of_channels as 1, but not for 3-channel images.
Then I tried writing a function that manually loops through the raw data and assigns it to the output image:
auto image_ptr = std::make_shared<Image>();
image_ptr->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(image_ptr->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *rgb_image++;
}
But now the color channels seem to be wrongly assigned.
Any idea how to address this issue? It seems like it should be something easy, but since my knowledge of C++ and pointers is quite limited, I cannot figure it out straightforwardly.
I also found this solution (Reading image in matlab in a format acceptable to mex), but I am not sure exactly how to use it. To be honest, I am quite confused.
OK, the solution was quite straightforward, as I thought in the first place. It was just a matter of handling the pointers correctly:
std::shared_ptr<Image> CreateRGBImageFromMat(uint8_t *mat_image, int width, int height, int channels)
{
auto open3d_image = std::make_shared<Image>();
open3d_image->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(open3d_image->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *(mat_image + i);
*p++ = *(mat_image + i + height*width);
*p++ = *(mat_image + i + height*width*2);
}
return open3d_image;
}
since three::Image expects the data interleaved in row x col x channel order, while from MATLAB the image arrives in planar blocks: all of channel 1, then channel 2, then channel 3 (after you transpose the image, since MATLAB is column-major). My question now, though, is whether I can do the same with memcpy() or std::copy(), copying the block data into interleaved form so that I bypass the for loop.
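As a sketch of why a single call cannot do it: each MATLAB channel plane is contiguous, but its destination bytes are channels apart, so the write side is strided, and memcpy() / std::copy() with raw pointers only handle contiguous runs. A per-channel variant of the same loop (hypothetical, equivalent to CreateRGBImageFromMat above):
for (int c = 0; c < channels; c++) {
    const uint8_t *src = mat_image + c * height * width;  // start of channel plane c
    uint8_t *dst = open3d_image->data_.data() + c;        // first byte of channel c
    for (int i = 0; i < height * width; i++)
        dst[i * channels] = src[i];                       // strided interleave
}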

ASTC texture compression in Metal – what should I use as the bytes per row?

I'm writing a program that's using compressed textures in Metal. I'm having a bit of trouble with the replaceRegion() function of MTLTexture. The parameter bytesPerRow just doesn't seem to make sense. It says that for compressed textures, "bytesPerRow is the number of bytes from the beginning of one row of blocks to the beginning of the next."
Now I'm using ASTC with 4x4 blocks, which means I have 8 bpp. 4*4 is 16 texels, and at 8 bits (one byte) each, I'm guessing each block is 16 bytes. But when I enter 16, I get a failed assertion that requires the minimum value to be 4096. What's going on?
Thank you very much.
bytesPerRow = texelsPerRow / blockFootprint.x * 16
uint32_t bytes_per_row(uint32_t texture_width, uint32_t block_width) {
    // round the width up to a whole number of blocks; every ASTC block is 16 bytes
    return (texture_width + block_width - 1) / block_width * 16;
}
This rounds the texture width up to a whole number of blocks first. E.g. a 1024x1024 texture encoded with a block size of 6x6 corresponds to 2736 bytes per row: 1024 rounds up to 1026 texels, and 1026 / 6 * 16 == 2736.
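Applied to the 4x4 case in the question, and assuming the texture is 1024 texels wide (which would explain the 4096-byte minimum in the assertion):
uint32_t bpr = bytes_per_row(1024, 4); // 1024 / 4 * 16 == 4096 bytes per row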

How do I use vDSP functions for Short Time Fourier Transform?

I'm trying to understand how to use vDSP functions for an STFT. I'm using the FFT code from Apple's examples, and I can get the FFT of the first 1024 frames, but how do I get the FFT of the next frames, 1024-2047, and so on until the end of the file? (In this case, imagine the size of the file is int f = 10000.)
// vDSP variables
DOUBLE_COMPLEX_SPLIT A;
FFTSetupD setupReal;
uint32_t log2n;
uint32_t n, nOver2;
int32_t stride;
double *buffer;
double *obtainedReal;
double scale;
log2n = N; // e.g. N = 10 for a 1024-point FFT
n = 1 << log2n;
stride = 1;
nOver2 = n / 2;
int f = 10000;
buffer = malloc(f * sizeof(double));
obtainedReal = malloc(f * sizeof(double));
A.realp = malloc(f * sizeof(double));
A.imagp = malloc(f * sizeof(double));
vDSP_ctozD((DOUBLE_COMPLEX*) buffer, 2, &A, 1, nOver2);
setupReal = vDSP_create_fftsetupD(log2n, FFT_RADIX2);
if (setupReal == NULL) {
NSLog(#"fft_setup failed to allocate enough memory for real FFT\n");
return 0 ;
}
vDSP_fft_zripD(setupReal, &A, stride, log2n, FFT_FORWARD);
scale = (double) 1.0 / (2 * n);
vDSP_vsmulD(A.realp, 1, &scale, A.realp, 1, nOver2);
vDSP_vsmulD(A.imagp, 1, &scale, A.imagp, 1, nOver2);
vDSP_ztocD(&A, 1, (DOUBLE_COMPLEX *) obtainedReal, 2, nOver2);
If you simply want the FFT of the next 1024 elements, add nOver2 to A.realp and to A.imagp, then perform another vDSP_fft_zripD and another vDSP_ztocD. You will probably want to advance obtainedReal too, or the new results will overwrite the old results.
Note that changing A.realp and A.imagp loses the starting addresses, so you will not be able to free this memory unless you recalculate the starting addresses or save them elsewhere before changing A.realp and A.imagp.
Also, 10,000 is not an integer multiple of 1024, so your last portion will not have 1024 elements, so you need to figure out an alternative, such as getting more data or padding the data with zeroes.
You are allocating too much memory for A.realp and A.imagp. Each of them receives half of the elements in buffer, so each of them only needs half as much memory.
Even that much memory is not needed. You can use vDSP_ctozD to move just 1024 elements into A.realp and A.imagp (512 each), then perform an FFT, then move the data to obtainedReal using vDSP_ztocD, then move on to the next group by using vDSP_ctozD to move 1024 new elements into the same space in A.realp and A.imagp that was used before.
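A minimal sketch of that reuse pattern, using the question's variables and assuming f is a multiple of n (here n = 1024, so A.realp and A.imagp each need only nOver2 doubles):
for (int offset = 0; offset + n <= f; offset += n) {
    // pack the next 1024 real samples into split-complex form (512 in each half)
    vDSP_ctozD((DOUBLE_COMPLEX *)(buffer + offset), 2, &A, 1, nOver2);
    vDSP_fft_zripD(setupReal, &A, stride, log2n, FFT_FORWARD);
    vDSP_vsmulD(A.realp, 1, &scale, A.realp, 1, nOver2);
    vDSP_vsmulD(A.imagp, 1, &scale, A.imagp, 1, nOver2);
    // unpack this frame's spectrum into its own slice of the output
    vDSP_ztocD(&A, 1, (DOUBLE_COMPLEX *)(obtainedReal + offset), 2, nOver2);
}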

How to obtain and modify a pixel value here?

Listing 2 of Apple's Q&A shows an example of how to modify pixels in a CGImageRef. The problem is: they don't show how to obtain a pixel and modify its R, G, B, and A values.
The interesting part is here:
void *data = CGBitmapContextGetData (cgctx);
if (data != NULL)
{
// **** You have a pointer to the image data ****
// **** Do stuff with the data here ****
}
Now, let's say I want to read red, green, blue, and alpha from the pixel at x = 100, y = 50. How do I get access to that pixel and its R, G, B, and A components?
First, you need to know the bytesPerRow of your bitmap, as well as the data type and color format of its pixels. bytesPerRow can differ from width_in_pixels * bytesPerPixel because there might be padding at the end of each row. The pixels can be 16 bits or 32 bits, or possibly some other size. The format can be ARGB or BGRA, or some other layout.
For 32-bit ARGB data:
unsigned char *p = (unsigned char *)data; // the pointer from CGBitmapContextGetData
long int i = bytesPerRow * y + 4 * x;     // byte offset of pixel (x, y) for 32-bit pixels
unsigned char alpha = p[i];               // ARGB channel order
unsigned char red   = p[i+1];
unsigned char green = p[i+2];
unsigned char blue  = p[i+3];
Note that depending on your view transform, the Y axis may be flipped from what you expect.
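Writing is the mirror image; a sketch assuming the same 32-bit ARGB layout (here, making the pixel opaque red):
p[i]   = 255; // alpha
p[i+1] = 255; // red
p[i+2] = 0;   // green
p[i+3] = 0;   // blue
// the bitmap context's backing store is modified in place; call
// CGBitmapContextCreateImage(cgctx) afterwards to get an image with the change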