Bitmap from byte[] array - unity3d

I want to create a bitmap from a byte[]. My problem is that I can't use a BitmapSource in Unity, and if I use a MemoryStream I get an error from Unity.
I tried it with this:
Bitmap bitmap = new Bitmap(512, 424);
var data = bitmap.LockBits(new Rectangle(Point.Empty, bitmap.Size),
                           ImageLockMode.WriteOnly, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
Marshal.Copy(arrayData, 0, data.Scan0, arrayData.Length);
bitmap.UnlockBits(data);
It works, but the Bitmap I get is the wrong way up. Can someone explain why and suggest a solution?

This can be one of two things, perhaps combined: the choice of coordinate system, and endianness.
There's a convention (I believe universal) to list pixels from left to right, but there's none regarding vertical orientation. Some programs and APIs put the Y coordinate at zero at the bottom, increasing upwards, while others do the exact opposite. I don't know where you get the byte[] from, but some APIs let you configure the pixel orientation when writing, reading or using textures. Otherwise, you'll have to manually re-arrange the rows, as in the sketch below.
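For example, a minimal sketch of re-arranging the rows of a raw buffer in place (the method name and the 4-bytes-per-pixel assumption are mine):

static void FlipRowsInPlace(byte[] pixels, int width, int height)
{
    // Swap whole rows top-for-bottom; assumes 4 bytes per pixel (e.g. ARGB32).
    int stride = width * 4;
    byte[] rowBuffer = new byte[stride];
    for (int top = 0, bottom = height - 1; top < bottom; top++, bottom--)
    {
        System.Array.Copy(pixels, top * stride, rowBuffer, 0, stride);
        System.Array.Copy(pixels, bottom * stride, pixels, top * stride, stride);
        System.Array.Copy(rowBuffer, 0, pixels, bottom * stride, stride);
    }
}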
The same applies to endianness: ARGB sometimes means Blue is the last byte, sometimes the first. Some classes, like BitConverter, have built-in solutions too.
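Likewise for the byte order, a hedged helper that simply reverses each 4-byte pixel (whether you actually need this depends on where your data comes from):

static void ReversePixelByteOrder(byte[] pixels)
{
    // Turns e.g. A,R,G,B into B,G,R,A for every pixel; assumes 4 bytes per pixel.
    for (int i = 0; i < pixels.Length; i += 4)
        System.Array.Reverse(pixels, i, 4);
}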
Unity uses big-endian, bottom-up textures. In fact, Unity handles lots of this stuff under the hood, and has to re-order rows and flip bytes when importing bitmap files. Unity also provides methods like LoadImage and EncodeToPNG that take care of both problems.
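For instance, if your byte[] is an encoded PNG or JPG rather than raw pixels, something like this (the variable name is a placeholder) sidesteps the whole issue:

// LoadImage decodes PNG/JPG bytes and takes care of row order and channel order for you.
Texture2D tex = new Texture2D(2, 2);    // the dimensions are replaced by LoadImage
tex.LoadImage(encodedImageBytes);       // 'encodedImageBytes' is a placeholder byte[]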
To illustrate what happens to the byte[], this sample code saves the same image in three different ways (but you need to import them as Truecolor to see them properly in Unity):
using UnityEngine;
using UnityEditor;
using System.Drawing;
using System.Drawing.Imaging;

public class CreateTexture2D : MonoBehaviour {

    public void Start () {
        int texWidth = 4, texHeight = 4;

        // Raw 4x4 bitmap data, in bottom-up big-endian ARGB byte order. It's transparent black for the most part.
        byte[] rawBitmap = new byte[] {
            // Red corner (bottom-left) is written first
            255,255,0,0,   0,0,0,0,     0,0,0,0,     0,0,0,0,
            0,0,0,0,       0,0,0,0,     0,0,0,0,     0,0,0,0,
            0,0,0,0,       0,0,0,0,     0,0,0,0,     0,0,0,0,
            255,0,0,255,   255,0,0,255, 255,0,0,255, 255,0,0,255
            // Blue border (top) is the last "row" of the array
        };

        // We create a Texture2D from the rawBitmap
        Texture2D texture = new Texture2D(texWidth, texHeight, TextureFormat.ARGB32, false);
        texture.LoadRawTextureData(rawBitmap);
        texture.Apply();

        // 1.- We save it directly as a Unity asset (btw, this is useful if you only use it inside Unity)
        UnityEditor.AssetDatabase.CreateAsset(texture, "Assets/TextureAsset.asset");

        // 2.- We save the texture to a file, but letting Unity handle formatting
        byte[] textureAsPNG = texture.EncodeToPNG();
        System.IO.File.WriteAllBytes(Application.dataPath + "/EncodedByUnity.png", textureAsPNG);

        // 3.- Rearrange the rawBitmap manually into a top-down little-endian ARGB byte order,
        //     then write it into a Bitmap and save it to disk.
        //     Bonus: this permutation is its own inverse, so it works both ways.
        byte[] rearrangedBM = new byte[rawBitmap.Length];
        for (int row = 0; row < texHeight; row++)
            for (int col = 0; col < texWidth; col++)
                for (int i = 0; i < 4; i++)
                    rearrangedBM[row * 4 * texWidth + 4 * col + i] = rawBitmap[(texHeight - 1 - row) * 4 * texWidth + 4 * col + (3 - i)];

        Bitmap bitmap = new Bitmap(texWidth, texHeight, PixelFormat.Format32bppArgb);
        var data = bitmap.LockBits(new Rectangle(0, 0, texWidth, texHeight), ImageLockMode.WriteOnly, PixelFormat.Format32bppArgb);
        System.Runtime.InteropServices.Marshal.Copy(rearrangedBM, 0, data.Scan0, rearrangedBM.Length);
        bitmap.UnlockBits(data);
        bitmap.Save(Application.dataPath + "/SavedBitmap.png", ImageFormat.Png);
    }
}

Related

In Unity, how to segment the user's voice from microphone based on loudness?

I need to collect voice pieces from a continuous audio stream. I need to later process the voice piece the user has just said (not for speech recognition). What I am focusing on is only segmenting the voice based on its loudness.
If after at least 1 second of silence, his voice becomes loud enough for a while, and then silent again for at least 1 second, I say this is a sentence and the voice should be segmented here.
I just know I can get raw audio data from the AudioClip created by Microphone.Start(). I want to write some code like this:
void Start()
{
    audio = Microphone.Start(deviceName, true, 10, 16000);
}

void Update()
{
    audio.GetData(fdata, 0);
    for (int i = 0; i < fdata.Length; i++) {
        u16data[i] = Convert.ToUInt16(fdata[i] * 65535);
    }
    // ... Process u16data
}
But what I'm not sure is:
Every frame, when I call audio.GetData(fdata, 0), do I get the latest 10 seconds of sound data if fdata is big enough, or less than 10 seconds if fdata is not big enough? Is that right?
fdata is a float array, and what I need is a 16 kHz, 16 bit PCM buffer. Is it right to convert the data like: u16data[i] = fdata[i] * 65535?
What is the right way to detect loud moments and silent moments in fdata?
No, you have to read starting at the current position within the AudioClip, using Microphone.GetPosition:
Get the position in samples of the recording.
and pass the obtained index to AudioClip.GetData:
Use the offsetSamples parameter to start the read from a specific position in the clip
fdata = new float[audio.samples * audio.channels];
var currentIndex = Microphone.GetPosition(null);
audio.GetData(fdata, currentIndex);
I don't understand what exactly you are converting this for. fdata will contain
floats ranging from -1.0f to 1.0f (AudioClip.GetData)
so if for some reason you need to get values between short.MinValue (= -32768) and short.MaxValue (= 32767), then yes, you can do that using
u16data[i] = Convert.ToUInt16(fdata[i] * short.MaxValue);
note however that Convert.ToUInt16(float):
value, rounded to the nearest 16-bit unsigned integer. If value is halfway between two whole numbers, the even number is returned; that is, 4.5 is converted to 4, and 5.5 is converted to 6.
you might rather want to use Mathf.RoundToInt first so that a value like 4.5 is also rounded up:
u16data[i] = Convert.ToUInt16(Mathf.RoundToInt(fdata[i] * short.MaxValue));
Your naming however suggests that you are actually trying to get unsigned ushort (or UInt16) values. For this you cannot have negative values! So you have to shift the float values up in order to map the range (-1.0f | 1.0f) to the range (0.0f | 1.0f) before multiplying by ushort.MaxValue (= 65535):
u16data[i] = Convert.ToUInt16(Mathf.RoundToInt((fdata[i] + 1) / 2 * ushort.MaxValue));
What you receive from AudioClip.GetData are the amplitude values of the audio track, between -1.0f and 1.0f.
So a "loud" moment would be one where
Mathf.Abs(fdata[i]) >= aCertainLoudThreshold;
and a "silent" moment would be one where
Mathf.Abs(fdata[i]) <= aCertainSilentThreshold;
where aCertainSilentThreshold might e.g. be 0.2f and aCertainLoudThreshold might e.g. be 0.8f.
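Putting this together, here is only a rough sketch of the segmentation you describe (the thresholds, the 16 kHz sample rate and the buffering are assumptions to tune, and the fields and method would live inside your MonoBehaviour):

const float silentThreshold = 0.2f;   // assumed values, tune for your microphone
const float loudThreshold = 0.8f;
const int sampleRate = 16000;

int silentSamples = 0;
bool heardSomething = false;

void ProcessBlock(float[] block)      // a block of samples obtained via GetData
{
    foreach (float sample in block)
    {
        float level = Mathf.Abs(sample);
        if (level >= loudThreshold)
        {
            heardSomething = true;
            silentSamples = 0;
        }
        else if (level <= silentThreshold)
        {
            silentSamples++;
            if (heardSomething && silentSamples >= sampleRate)   // roughly 1 second of silence
            {
                // Sentence boundary: hand the samples collected so far to your processing.
                heardSomething = false;
                silentSamples = 0;
            }
        }
    }
}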

Using C++ AMP with Direct2D

Is it possible to use a texture generated by C++ AMP as a screen buffer?
I would like to generate an image with my C++ AMP code (already done) and use this image to fill the entire screen of Windows 8 metro app. The image is updated 60 times per second.
I'm not at all fluent in Direct3D. I used Direct2d template app as a starting point.
First I tried to manipulate the buffer from swap chain in the C++ AMP code directly, but any attempt to write to that texture caused an error.
Processing data with AMP on GPU, then moving it to CPU memory to create a bitmap that I can use in D2D API seems way inefficient.
Can somebody share a piece of code that would allow me to manipulate swap chain buffer texture with C++ AMP directly (without data leaving the GPU) or at least populate that buffer with data from another texture that doesn't leave the GPU?
You can interop between an AMP texture<> and an ID3D11Texture2D buffer. The complete code and other examples of interop can be found in the Chapter 11 samples here.
// Get a D3D texture resource from an AMP texture.
texture<int, 2> text(100, 100);

CComPtr<ID3D11Texture2D> texture;
IUnknown* unkRes = get_texture(text);
hr = unkRes->QueryInterface(__uuidof(ID3D11Texture2D),
                            reinterpret_cast<LPVOID*>(&texture));
assert(SUCCEEDED(hr));

// Create a texture from a D3D texture resource
const int height = 100;
const int width = 100;

D3D11_TEXTURE2D_DESC desc;
ZeroMemory(&desc, sizeof(desc));
desc.Height = height;
desc.Width = width;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UINT;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;
desc.CPUAccessFlags = 0;
desc.MiscFlags = 0;

CComPtr<ID3D11Texture2D> dxTexture = nullptr;
hr = device->CreateTexture2D(&desc, nullptr, &dxTexture);
assert(SUCCEEDED(hr));

texture<uint4, 2> ampTexture = make_texture<uint4, 2>(dxView, dxTexture);

Quartz 2d Drawing: Perfect on simulator / bad on device. Distribution Vs Debug

I've just finished the waveform drawing code for my app. I'm pretty happy with it and on the simulator it looks great.
The problem I have is that when I run it on an iPad it doesn't draw properly. On the simulator the drawing looks like a nice regular waveform, whereas on the iPad the waveform just looks like one big rectangle.
I'm very unsure how I could even begin to troubleshoot and resolve something like this.
Can you offer any suggestions as to why it's working on the simulator and not the iPad?
If I can submit any more information that might help, please let me know.
calculation
- (void)plotwaveform:(AudioSourceOBJ)source
{
    int count = source->framecount;
    int blocksize = count / resolution;
    currentmaxvalue = 0;
    int readindex = 0;

    CGRect *addrects = malloc(resolution * sizeof(CGRect));
    float *heights = malloc(resolution * sizeof(float));

    for (int i = 0; i < resolution; i++) {
        AudioUnitSampleType *blockofaudio;
        blockofaudio = malloc(blocksize * sizeof(AudioUnitSampleType));
        memcpy(blockofaudio, &source->leftoutput[readindex], (blocksize * sizeof(AudioUnitSampleType)));

        float sample = [self getRMS:blockofaudio blocksize:blocksize];
        heights[i] = sample;
        readindex += blocksize;
    }

    for (int scale = 0; scale < resolution; scale++) {
        float h = heights[scale];
        h = (h / currentmaxvalue) * 45;
        addrects[scale] = CGRectMake(scale, 0, 1, h);
    }

    if (waveform) {
        [waveform release];
        [waveform removeFromSuperview];
        waveform = nil;
    }

    CGMutablePathRef halfpath = CGPathCreateMutable();
    CGPathAddRects(halfpath, NULL, addrects, resolution);

    CGMutablePathRef path = CGPathCreateMutable();
    CGAffineTransform xf = CGAffineTransformIdentity;
    xf = CGAffineTransformTranslate(xf, 0.0, 45);
    CGPathAddPath(path, &xf, halfpath);

    xf = CGAffineTransformIdentity;
    xf = CGAffineTransformTranslate(xf, 0.0, 45);
    xf = CGAffineTransformScale(xf, 1.0, -1);
    CGPathAddPath(path, &xf, halfpath);

    CGPathRelease(halfpath);
    free(addrects);

    waveform = [[Waveform alloc] initWithFrameAndPlotdata:CGRectMake(0, 0, 400, 90) thepoints:path];
    [self.view addSubview:waveform];
}

- (float)getRMS:(AudioUnitSampleType *)blockofaudio blocksize:(int)blocksize
{
    float output;
    float sqsummed;
    float sqrootofsum;
    float val;

    for (int i = 0; i < blocksize; i++) {
        val = blockofaudio[i];
        sqsummed += val * val;
    }

    sqrootofsum = sqsummed / blocksize;
    output = sqrt(sqrootofsum);

    // find the max
    if (output > currentmaxvalue)
    {
        currentmaxvalue = output;
    }

    return output;
}
Drawing
- (void)drawRect:(CGRect)rect
{
    CGContextRef ctx = UIGraphicsGetCurrentContext();
    CGContextSetRGBFillColor(ctx, 0, 0, 0, .5);
    CGContextBeginPath(ctx);
    CGContextAddPath(ctx, mutatablepath);
    //CGContextStrokePath(ctx);
    CGContextFillPath(ctx);
    CFRelease(mutatablepath);
}
EDIT
I pass a bunch of audio data to the plotwaveform function and divide it into chunks. For each chunk of audio I calculate the RMS and keep track of the maximum value. When all that is done I use the max value to scale my RMS values to fit my viewport.
I have noticed a strange thing: if I NSLog the values of the "output" variable in the getRMS function, the waveform draws fine on the device. If I do not NSLog the values, the waveform does not draw properly?!?
That to me is bizarre.
One major error I see is that you never initialize sqsummed inside the getRMS:blocksize: method, so its initial value is garbage. What the garbage happens to be depends on the details of the surrounding code, how the compiler allocates registers for variables, and so on. Adding an NSLog statement could well change what the garbage is next time around the loop.
If the garbage happens to always correspond to a very small float value you'll get expected behavior, while if it happens to always correspond to some extremely large float value (large enough to swamp the actual samples) you'll get one big rectangle, while if it happens to vary you'll get a noise-like output.
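To illustrate the shape of the fix (written in C# here only because the rest of this page uses it; the idea carries over one-to-one to the Objective-C method), with the accumulator explicitly initialized before the loop:

static float RootMeanSquare(float[] block)
{
    float sumOfSquares = 0f;   // the missing initialization: start from zero, not garbage
    for (int i = 0; i < block.Length; i++)
        sumOfSquares += block[i] * block[i];
    return block.Length > 0 ? (float)System.Math.Sqrt(sumOfSquares / block.Length) : 0f;
}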
In any case, please remember that the simulator has your entire Mac's RAM and CPU power to work with. Processing capacity is sadly not emulated in the iPhone/iPad simulator.

Compiled Error in Xcode for iphone game and other questions

I am a newbie to Xcode and Objective-C and I have a few questions regarding the code sample of a game attached below. It is written in Objective-C in Xcode, for the iPhone 4 simulator. It is part of the code of a "ball bounces against bricks" game. Instead of creating the images in IB, the code is supposed to create (programmatically) 5 x 4 bricks using 4 different kinds of brick pictures (bricktype1.png...). I have the bricks defined properly in the .h file and the method written in the .m.
My questions are for the following code:
- (void)initializeBricks
{
    brickTypes[0] = @"bricktype1.png";
    brickTypes[1] = @"bricktype2.png";
    brickTypes[2] = @"bricktype3.png";
    brickTypes[3] = @"bricktype4.png";

    int count = 0;

    for (int y = 0; y < BRICKS_HEIGHT; y++)
    {
        for (int x = 0; x < BRICKS_WIDTH; x++)
        {
            UIImage *image = [ImageCache loadImage:brickTypes[count++ % 4]];        // Line 1
            bricks[x][y] = [[[UIImageView alloc] initWithImage:image] autorelease]; // Line 2
            CGRect newFrame = bricks[x][y].frame;                                   // Line 3
            newFrame.origin = CGPointMake(x * 64, (y * 40) + 50);                   // Line 4
            bricks[x][y].frame = newFrame;                                          // Line 5
            [self.view addSubview:bricks[x][y]];
        }
    }
}
1) When it is compiled, I get the error "ImageCache undeclared" at Line 1. But I have already added the .png files to the project. What is the problem and how do I fix it? (If possible, please suggest code and explain what it does and where to put it.)
2) How does the following in Line 1 work? Does it assign the element (name of .png) of brickType to image?
brickTypes[count ++ % 4]
For instance, does it return one of the file names, bricktype1.png, to the image object? If true, what is the max value of "count"? Does it end at 5 (as x increments 5 times for each y)? But then "count" would exceed the max index of brickTypes, which is 3!
3) In Line 2, does the image object that is being allocated already have a name and is it linked with the .png at this line, before it is assigned to bricks[x][y]?
4) What do Line 3 and Line 5 do? Why is newFrame on the left in Line 3 but on the right in Line 5?
5) What does Line 4 do?
When it is compiled, I get the error "ImageCache undeclared" at Line 1. But I have already added the .png files to the project. What is the problem and how do I fix it?
ImageCache is the name of an object you're supposed to create. Since you haven't created one, it's undefined.
How does the following in Line 1 work? Does it assign the element (name of .png) of brickType to image?
It uses the count modulo 4 (% is the modulus operator) as the index to the array and then increments count. It will not exceed the array size - that's what the modulus operation is preventing. Suggest you study: Modulo Operation
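A tiny illustration (C# syntax, but the operators behave the same way here) of how count++ % 4 cycles through the four indices:

int count = 0;
for (int i = 0; i < 10; i++)
{
    int index = count++ % 4;   // uses the current value of count, then increments it
    // index runs 0, 1, 2, 3, 0, 1, 2, 3, 0, 1 - it never exceeds 3
}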
In Line 2, does the image object that is being allocated already have a name and is it linked with the .png at this line, before it is assigned to bricks[x][y]?
Not sure I understand the question, but yes, the image has been loaded.
What do Line 3 and Line 5 do? Why is newFrame on the left in Line 3 but on the right in Line 5?
They set newFrame to the same frame as an existing image and then create a CGPoint with which they set a new origin for newFrame. Lines 3, 4, and 5 get the frame of an image, set its origin to a new value, and then assign newFrame back to the image's frame.
1) ImageCache is not a standard class, so the class must be defined somewhere in the game's project. You are missing an #import "ImageCache.h" (or whichever header file ImageCache is defined in).
2) First, the count modulo 4 is taken. That is: divide count by 4 and take the remainder. That remainder is used as an index into brickTypes, so you will always get a value between 0 and 3 (inclusive). Then, count is increased by 1 (the postfix ++ operator first returns the variable's value and afterwards increases the variable by one). Since brickTypes seems to be of type NSString *brickTypes[4] (you haven't shown us the declaration), this code will always return one of the strings @"bricktype1.png" ... @"bricktype4.png".
3) I don't understand that question, sorry. Please try to explain.
4) First, the position and size of the brick are queried (Line 3). Then, the position is changed, while leaving the size unmodified (Line 4). Lastly, the changed position and size are assigned back to the brick (Line 5). In effect, this just moves the brick to a new position. It must be done this way because frame is a property of type CGRect (that is: it's really a method called setFrame:(CGRect)rect, but the compiler provides a more convenient way to access it), which is a struct containing other structs, so one can't just do bricks[x][y].frame.origin.x = x * 64.
5) It assigns a new position to the brick (or rather, to the struct queried from the brick). The CGPointMake(x, y) function returns a struct of type CGPoint. The result is assigned to the frame's member origin. One could also write:
newFrame.origin.x = x * 64;
newFrame.origin.y = (y * 40) + 50;
(Here you can do the assignments directly because newFrame is a struct on your stack, not a property like bricks[x][y].frame.)

EXC_BAD_ACCESS when calling avcodec_encode_video

I have an Objective-C class (although I don't believe this is anything Obj-C specific) that I am using to write a video out to disk from a series of CGImages. (The code I am using at the top to get the pixel data comes right from Apple: http://developer.apple.com/mac/library/qa/qa2007/qa1509.html). I successfully create the codec and context - everything is going fine until it gets to avcodec_encode_video, when I get EXC_BAD_ACCESS. I think this should be a simple fix, but I just can't figure out where I am going wrong.
I took out some error checking for succinctness. 'c' is an AVCodecContext*, which is created successfully.
- (void)addFrame:(CGImageRef)img
{
    CFDataRef bitmapData = CGDataProviderCopyData(CGImageGetDataProvider(img));
    long dataLength = CFDataGetLength(bitmapData);

    uint8_t *picture_buff = (uint8_t *)malloc(dataLength);
    CFDataGetBytes(bitmapData, CFRangeMake(0, dataLength), picture_buff);

    AVFrame *picture = avcodec_alloc_frame();
    avpicture_fill((AVPicture *)picture, picture_buff, c->pix_fmt, c->width, c->height);

    int outbuf_size = avpicture_get_size(c->pix_fmt, c->width, c->height);
    uint8_t *outbuf = (uint8_t *)av_malloc(outbuf_size);

    out_size = avcodec_encode_video(c, outbuf, outbuf_size, picture); // ERROR occurs here
    printf("encoding frame %3d (size=%5d)\n", i, out_size);
    fwrite(outbuf, 1, out_size, f);

    CFRelease(bitmapData);
    free(picture_buff);
    free(outbuf);
    av_free(picture);
    i++;
}
I have stepped through it dozens of times. Here are some numbers...
dataLength = 408960
picture_buff = 0x5c85000
picture->data[0] = 0x5c85000 -- which I take to mean that avpicture_fill worked...
outbuf_size = 408960
and then I get EXC_BAD_ACCESS at avcodec_encode_video. Not sure if it's relevant, but most of this code comes from api-example.c. I am using XCode, compiling for armv6/armv7 on Snow Leopard.
Thanks so much in advance for help!
I don't have enough information here to point to the exact error, but I think the problem is that the input picture contains less data than avcodec_encode_video() expects:
avpicture_fill() only sets some pointers and numeric values in the AVFrame structure. It does not copy anything, and does not check whether the buffer is large enough (and it cannot, since the buffer size is not passed to it). It does something like this (copied from ffmpeg source):
size = picture->linesize[0] * height;
picture->data[0] = ptr;
picture->data[1] = picture->data[0] + size;
picture->data[2] = picture->data[1] + size2;
picture->data[3] = picture->data[1] + size2 + size2;
Note that the width and height are passed from the variable "c" (the AVCodecContext, I assume), so they may be larger than the actual size of the input frame.
It is also possible that the width/height is fine, but the pixel format of the input frame differs from what is passed to avpicture_fill() (note that the pixel format also comes from the AVCodecContext, which may differ from the input). For example, if c->pix_fmt is RGBA and the input buffer is in YUV420 format (or, more likely for iPhone, a biplanar YCbCr), then the size of the input buffer is width*height*1.5, but avpicture_fill() expects width*height*4.
So checking the input/output geometry and pixel formats should lead you to the cause of the error. If it does not help, I suggest that you should try to compile for i386 first. It is tricky to compile FFMPEG for the iPhone properly.
Does the codec you are encoding support the RGB color space? You may need to use libswscale to convert to I420 before encoding. What codec are you using? Can you post the code where you initialize your codec context?
The function RGBtoYUV420P may help you.
http://www.mail-archive.com/libav-user#mplayerhq.hu/msg03956.html