How to generate audio wave form programmatically while recording Voice in iOS? - iphone

I'm working on voice modulation (audio frequency) in iOS and everything is working fine. I just need a simple, reliable way to draw an audio waveform as noise is detected.
Please don't refer me to the SpeakHere and aurioTouch code tutorials; I'd like suggestions from native app developers.
I have recorded the audio and can play it back after recording. I have created a waveform image (screenshot attached), but it needs to be drawn in the view while recording is in progress.
-(UIImage *) audioImageGraph:(SInt16 *) samples
normalizeMax:(SInt16) normalizeMax
sampleCount:(NSInteger) sampleCount
channelCount:(NSInteger) channelCount
imageHeight:(float) imageHeight {
CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
UIGraphicsBeginImageContext(imageSize);
CGContextRef context = UIGraphicsGetCurrentContext();
CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
CGContextSetAlpha(context,1.0);
CGRect rect;
rect.size = imageSize;
rect.origin.x = 0;
rect.origin.y = 0;
CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
CGColorRef rightcolor = [[UIColor redColor] CGColor];
CGContextFillRect(context, rect);
CGContextSetLineWidth(context, 1.0);
float halfGraphHeight = (imageHeight / 2) / (float) channelCount ;
float centerLeft = halfGraphHeight;
float centerRight = (halfGraphHeight*3) ;
float sampleAdjustmentFactor = (imageHeight/ (float) channelCount) / (float) normalizeMax;
for (NSInteger intSample = 0 ; intSample < sampleCount ; intSample ++ ) {
SInt16 left = *samples++;
float pixels = (float) left;
pixels *= sampleAdjustmentFactor;
CGContextMoveToPoint(context, intSample, centerLeft-pixels);
CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
CGContextSetStrokeColorWithColor(context, leftcolor);
CGContextStrokePath(context);
if (channelCount==2) {
SInt16 right = *samples++;
float pixels = (float) right;
pixels *= sampleAdjustmentFactor;
CGContextMoveToPoint(context, intSample, centerRight - pixels);
CGContextAddLineToPoint(context, intSample, centerRight + pixels);
CGContextSetStrokeColorWithColor(context, rightcolor);
CGContextStrokePath(context);
}
}
// Create new image
UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();
// Tidy up
UIGraphicsEndImageContext();
return newImage;
}
Next, a method that takes an AVURLAsset and returns PNG data:
- (NSData *) renderPNGAudioPictogramForAsset:(AVURLAsset *)songAsset {
NSError * error = nil;
AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];
NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
// [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
// [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
[NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
nil];
AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
[reader addOutput:output];
[output release];
UInt32 sampleRate,channelCount;
NSArray* formatDesc = songTrack.formatDescriptions;
for(unsigned int i = 0; i < [formatDesc count]; ++i) {
CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
if(fmtDesc ) {
sampleRate = fmtDesc->mSampleRate;
channelCount = fmtDesc->mChannelsPerFrame;
// NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
}
}
UInt32 bytesPerSample = 2 * channelCount;
SInt16 normalizeMax = 0;
NSMutableData * fullSongData = [[NSMutableData alloc] init];
[reader startReading];
UInt64 totalBytes = 0;
SInt64 totalLeft = 0;
SInt64 totalRight = 0;
NSInteger sampleTally = 0;
NSInteger samplesPerPixel = sampleRate / 50;
while (reader.status == AVAssetReaderStatusReading){
AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];
if (sampleBufferRef){
CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);
size_t length = CMBlockBufferGetDataLength(blockBufferRef);
totalBytes += length;
NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];
NSMutableData * data = [NSMutableData dataWithLength:length];
CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);
SInt16 * samples = (SInt16 *) data.mutableBytes;
int sampleCount = length / bytesPerSample;
for (int i = 0; i < sampleCount ; i ++) {
SInt16 left = *samples++;
totalLeft += left;
SInt16 right;
if (channelCount==2) {
right = *samples++;
totalRight += right;
}
sampleTally++;
if (sampleTally > samplesPerPixel) {
left = totalLeft / sampleTally;
SInt16 fix = abs(left);
if (fix > normalizeMax) {
normalizeMax = fix;
}
[fullSongData appendBytes:&left length:sizeof(left)];
if (channelCount==2) {
right = totalRight / sampleTally;
SInt16 fix = abs(right);
if (fix > normalizeMax) {
normalizeMax = fix;
}
[fullSongData appendBytes:&right length:sizeof(right)];
}
totalLeft = 0;
totalRight = 0;
sampleTally = 0;
}
}
[wader drain];
CMSampleBufferInvalidate(sampleBufferRef);
CFRelease(sampleBufferRef);
}
}
NSData * finalData = nil;
if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
// Something went wrong. return nil
return nil;
}
if (reader.status == AVAssetReaderStatusCompleted){
NSLog(@"rendering output graphics using normalizeMax %d",normalizeMax);
UIImage *test = [self audioImageGraph:(SInt16 *)
fullSongData.bytes
normalizeMax:normalizeMax
sampleCount:fullSongData.length / 4
channelCount:2
imageHeight:100];
finalData = imageToData(test);
}
[fullSongData release];
[reader release];
return finalData;
}
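As an aside, the bucketing logic inside the reader loop above boils down to a short routine. Here is a minimal, self-contained C sketch of it, assuming mono 16-bit samples and a fixed bucket size (the function name is mine, not from any framework):

```c
#include <stdlib.h>

/* Average consecutive groups of `samplesPerPixel` 16-bit samples into one
   value per pixel column, tracking the peak magnitude for normalization.
   Returns the number of averaged values written to `out`; *normalizeMax
   receives the largest absolute averaged value. */
int downsampleForWaveform(const short *samples, int sampleCount,
                          int samplesPerPixel, short *out,
                          short *normalizeMax) {
    int outCount = 0;
    long total = 0;
    int tally = 0;
    *normalizeMax = 0;
    for (int i = 0; i < sampleCount; i++) {
        total += samples[i];
        tally++;
        if (tally == samplesPerPixel) {
            short avg = (short)(total / tally);
            short mag = (short)(avg < 0 ? -avg : avg);
            if (mag > *normalizeMax) *normalizeMax = mag;
            out[outCount++] = avg;
            total = 0;
            tally = 0;
        }
    }
    return outCount;
}
```

The stereo path in the method above does the same thing twice, once per interleaved channel.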

If you want real-time graphics derived from mic input, use the RemoteIO Audio Unit, which is what most native iOS app developers use for low-latency audio, and Metal or OpenGL for drawing waveforms, which will give you the highest frame rates. You will need completely different code from what's in your question, as AVAsset recording, Core Graphics line drawing, and PNG rendering are far too slow for that.
Update: with iOS 8 and newer, the Metal API may be able to render graphic visualizations with even greater performance than OpenGL.
Update 2: Here are some code snippets for recording live audio using Audio Units and drawing bit maps using Metal in Swift 3: https://gist.github.com/hotpaw2/f108a3c785c7287293d7e1e81390c20b
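To give a flavor of what such a real-time pipeline computes: each audio callback typically reduces its buffer to one peak (or RMS) value that the drawing code consumes. A minimal C sketch of that reduction, with an illustrative function name (this is not an Apple API):

```c
#include <stddef.h>
#include <math.h>

/* Reduce one buffer of float samples to its peak magnitude.
   A waveform view would append one such value per callback. */
float bufferPeak(const float *samples, size_t count) {
    float peak = 0.0f;
    for (size_t i = 0; i < count; i++) {
        float mag = fabsf(samples[i]);
        if (mag > peak) peak = mag;
    }
    return peak;
}
```

The key design point is that only this tiny reduction runs on the audio thread; the actual drawing happens later on the main/GPU side.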

You should check out EZAudio (https://github.com/syedhali/EZAudio), specifically the EZRecorder and the EZAudioPlot (or GPU-accelerated EZAudioPlotGL).
There is also an example project that does exactly what you want, https://github.com/syedhali/EZAudio/tree/master/EZAudioExamples/iOS/EZAudioRecordExample
EDIT: Here's the code inline
/// In your interface
/**
Use a OpenGL based plot to visualize the data coming in
*/
@property (nonatomic,weak) IBOutlet EZAudioPlotGL *audioPlot;
/**
The microphone component
*/
@property (nonatomic,strong) EZMicrophone *microphone;
/**
The recorder component
*/
@property (nonatomic,strong) EZRecorder *recorder;
...
/// In your implementation
// Create an instance of the microphone and tell it to use this view controller instance as the delegate
-(void)viewDidLoad {
self.microphone = [EZMicrophone microphoneWithDelegate:self startsImmediately:YES];
}
// EZMicrophoneDelegate will provide these callbacks
-(void)microphone:(EZMicrophone *)microphone
hasAudioReceived:(float **)buffer
withBufferSize:(UInt32)bufferSize
withNumberOfChannels:(UInt32)numberOfChannels {
dispatch_async(dispatch_get_main_queue(),^{
// Updates the audio plot with the waveform data
[self.audioPlot updateBuffer:buffer[0] withBufferSize:bufferSize];
});
}
-(void)microphone:(EZMicrophone *)microphone hasAudioStreamBasicDescription:(AudioStreamBasicDescription)audioStreamBasicDescription {
// The AudioStreamBasicDescription of the microphone stream. This is useful when configuring the EZRecorder or telling another component what audio format type to expect.
// We can initialize the recorder with this ASBD
self.recorder = [EZRecorder recorderWithDestinationURL:[self testFilePathURL]
andSourceFormat:audioStreamBasicDescription];
}
-(void)microphone:(EZMicrophone *)microphone
hasBufferList:(AudioBufferList *)bufferList
withBufferSize:(UInt32)bufferSize
withNumberOfChannels:(UInt32)numberOfChannels {
// Getting audio data as a buffer list that can be directly fed into the EZRecorder. This is happening on the audio thread - any UI updating needs a GCD main queue block. This will keep appending data to the tail of the audio file.
if( self.isRecording ){
[self.recorder appendDataFromBufferList:bufferList
withBufferSize:bufferSize];
}
}

I was searching for the same thing (drawing a waveform from audio-recorder data). I found some libraries that might be helpful; it's worth reading their code to understand the logic behind this.
The calculation is all based on sine and simple math formulas. It's much simpler once you take a look at the code!
https://github.com/stefanceriu/SCSiriWaveformView
or
https://github.com/raffael/SISinusWaveView
These are only a few of the examples you can find on the web.

Related

How to compare images using opencv in iOS (iPhone)

I want to compare two images taken with the iPhone camera in my project. I am using OpenCV for that. Is there a better way to do it? It would be great to get a percentage similarity.
I am using the following OpenCV code for image comparison:
-(void)opencvImageCompare{
NSMutableArray *valuesArray=[[NSMutableArray alloc]init];
IplImage *img = [self CreateIplImageFromUIImage:imageView.image];
// always check camera image
if(img == 0) {
printf("Cannot load camera img");
}
IplImage *res;
CvPoint minloc, maxloc;
double minval, maxval;
double values;
UIImage *imageTocompare = [UIImage imageNamed:@"MyImageName"];
IplImage *imageTocompareIpl = [self CreateIplImageFromUIImage:imageTocompare];
// always check server image
if(imageTocompareIpl == 0) {
printf("Cannot load serverIplImageArray image");
}
if(img->width-imageTocompareIpl->width<=0 && img->height-imageTocompareIpl->height<=0){
int balWidth=imageTocompareIpl->width-img->width;
int balHeight=imageTocompareIpl->height-img->height;
img->width=img->width+balWidth+100;
img->height=img->height+balHeight+100;
}
CvSize size = cvSize(
img->width - imageTocompareIpl->width + 1,
img->height - imageTocompareIpl->height + 1
);
res = cvCreateImage(size, IPL_DEPTH_32F, 1);
// CV_TM_SQDIFF CV_TM_SQDIFF_NORMED
// CV_TM_CCORR CV_TM_CCORR_NORMED
// CV_TM_CCOEFF CV_TM_CCOEFF_NORMED
cvMatchTemplate(img, imageTocompareIpl, res,CV_TM_CCOEFF);
cvMinMaxLoc(res, &minval, &maxval, &minloc, &maxloc, 0);
printf("\n value %f", maxval-minval);
values=maxval-minval;
NSString *valString=[NSString stringWithFormat:@"%f",values];
[valuesArray addObject:valString];
weedObject.values=[valString doubleValue];
printf("\n------------------------------");
cvReleaseImage(&imageTocompareIpl);
cvReleaseImage(&res);
}
cvReleaseImage(&img);
}
For the same image I get a non-zero result (14956...), and if I pass a different image it crashes.
Try this code; it compares the images bit by bit, i.e. for a 100% match:
UIImage *img1 = // Some photo;
UIImage *img2 = // Some photo;
NSData *imgdata1 = UIImagePNGRepresentation(img1);
NSData *imgdata2 = UIImagePNGRepresentation(img2);
if ([imgdata1 isEqualToData:imgdata2])
{
NSLog(@"Same Image");
}
Try this code; it compares the images pixel by pixel:
-(void)removeindicator :(UIImage *)image
{
for (int i =0; i < [imageArray count]; i++)
{
CGFloat width =100.0f;
CGFloat height=100.0f;
CGSize newSize1 = CGSizeMake(width, height); //whaterver size
UIGraphicsBeginImageContext(newSize1);
[[UIImage imageNamed:[imageArray objectAtIndex:i]] drawInRect:CGRectMake(0, 0, newSize1.width, newSize1.height)];
UIImage *newImage1 = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
UIImageView *imageview_camera=(UIImageView *)[self.view viewWithTag:-3];
CGSize newSize2 = CGSizeMake(width, height); //whaterver size
UIGraphicsBeginImageContext(newSize2);
[[imageview_camera image] drawInRect:CGRectMake(0, 0, newSize2.width, newSize2.height)];
UIImage *newImage2 = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
float numDifferences = 0.0f;
float totalCompares = width * height;
NSArray *img1RGB=[[NSArray alloc]init];
NSArray *img2RGB=[[NSArray alloc]init];
for (int yCoord = 0; yCoord < height; yCoord += 1)
{
for (int xCoord = 0; xCoord < width; xCoord += 1)
{
img1RGB = [self getRGBAsFromImage:newImage1 atX:xCoord andY:yCoord];
img2RGB = [self getRGBAsFromImage:newImage2 atX:xCoord andY:yCoord];
if (([[img1RGB objectAtIndex:0]floatValue] - [[img2RGB objectAtIndex:0]floatValue]) == 0 || ([[img1RGB objectAtIndex:1]floatValue] - [[img2RGB objectAtIndex:1]floatValue]) == 0 || ([[img1RGB objectAtIndex:2]floatValue] - [[img2RGB objectAtIndex:2]floatValue]) == 0)
{
//one or more pixel components match exactly
numDifferences++;
}
}
}
// It will show result in percentage at last
CGFloat percentage_similar=((numDifferences*100)/totalCompares);
NSString *str=NULL;
if (percentage_similar>=10.0f)
{
str=[[NSString alloc]initWithString:[NSString stringWithFormat:@"%i%@ Identical", (int)((numDifferences*100)/totalCompares),@"%"]];
UIAlertView *alertview=[[UIAlertView alloc]initWithTitle:@"i-App" message:[NSString stringWithFormat:@"%@ Images are same",str] delegate:nil cancelButtonTitle:@"Ok" otherButtonTitles:nil];
[alertview show];
break;
}
else
{
str=[[NSString alloc]initWithString:[NSString stringWithFormat:@"Result: %i%@ Identical",(int)((numDifferences*100)/totalCompares),@"%"]];
UIAlertView *alertview=[[UIAlertView alloc]initWithTitle:@"i-App" message:[NSString stringWithFormat:@"%@ Images are not same",str] delegate:nil cancelButtonTitle:@"OK" otherButtonTitles:nil];
[alertview show];
}
}
}
-(NSArray*)getRGBAsFromImage:(UIImage*)image atX:(int)xx andY:(int)yy
{
//NSArray *result = [[NSArray alloc]init];
// First get the image into your data buffer
CGImageRef imageRef = [image CGImage];
NSUInteger width = CGImageGetWidth(imageRef);
NSUInteger height = CGImageGetHeight(imageRef);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
unsigned char *rawData = (unsigned char*) calloc(height * width * 4, sizeof(unsigned char));
NSUInteger bytesPerPixel = 4;
NSUInteger bytesPerRow = bytesPerPixel * width;
NSUInteger bitsPerComponent = 8;
CGContextRef context = CGBitmapContextCreate(rawData, width, height,
bitsPerComponent, bytesPerRow, colorSpace,
kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGColorSpaceRelease(colorSpace);
CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
CGContextRelease(context);
// Now your rawData contains the image data in the RGBA8888 pixel format.
int byteIndex = (bytesPerRow * yy) + xx * bytesPerPixel;
// for (int ii = 0 ; ii < count ; ++ii)
// {
CGFloat red = (rawData[byteIndex] * 1.0) / 255.0;
CGFloat green = (rawData[byteIndex + 1] * 1.0) / 255.0;
CGFloat blue = (rawData[byteIndex + 2] * 1.0) / 255.0;
//CGFloat alpha = (rawData[byteIndex + 3] * 1.0) / 255.0;
byteIndex += 4;
// UIColor *acolor = [UIColor colorWithRed:red green:green blue:blue alpha:alpha];
//}
free(rawData);
NSArray *result = [NSArray arrayWithObjects:
[NSNumber numberWithFloat:red],
[NSNumber numberWithFloat:green],
[NSNumber numberWithFloat:blue],nil];
return result;
}
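For what it's worth, the per-pixel comparison these two methods implement can be expressed as one pass over two same-sized RGBA buffers, without re-drawing into a CGContext for every pixel. A hedged C sketch (the helper name and the tolerance semantics are mine):

```c
#include <stddef.h>
#include <stdlib.h>

/* Fraction of pixels whose R, G, and B channels each differ by at most
   `tolerance` between two tightly packed RGBA buffers of `pixelCount`
   pixels. Alpha is ignored. Returns a value in 0..1. */
double matchRatioRGBA(const unsigned char *a, const unsigned char *b,
                      size_t pixelCount, int tolerance) {
    size_t matches = 0;
    for (size_t i = 0; i < pixelCount; i++) {
        const unsigned char *pa = a + i * 4;
        const unsigned char *pb = b + i * 4;
        int same = 1;
        for (int c = 0; c < 3; c++) {   /* compare R, G, B only */
            if (abs((int)pa[c] - (int)pb[c]) > tolerance) {
                same = 0;
                break;
            }
        }
        matches += same;
    }
    return pixelCount ? (double)matches / (double)pixelCount : 0.0;
}
```

Extracting the raw bytes once per image (as `getRGBAsFromImage:` does internally) and then comparing buffers avoids allocating and freeing the full bitmap for every single pixel.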

Capturing a OpenGL view to an AVAssetWriterInputPixelBufferAdaptor [duplicate]

This question already has answers here:
OpenGL ES 2.0 to Video on iPad/iPhone
(7 answers)
Closed 2 years ago.
I am trying to create an AVAssetWriter to screen-capture an OpenGL project. I have never written an AVAssetWriter or an AVAssetWriterInputPixelBufferAdaptor, so I am not sure if I did anything correctly.
- (id) initWithOutputFileURL:(NSURL *)anOutputFileURL {
if ((self = [super init])) {
NSError *error;
movieWriter = [[AVAssetWriter alloc] initWithURL:anOutputFileURL fileType:AVFileTypeMPEG4 error:&error];
NSDictionary *videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
AVVideoCodecH264, AVVideoCodecKey,
[NSNumber numberWithInt:640], AVVideoWidthKey,
[NSNumber numberWithInt:480], AVVideoHeightKey,
nil];
writerInput = [[AVAssetWriterInput
assetWriterInputWithMediaType:AVMediaTypeVideo
outputSettings:videoSettings] retain];
writer = [[AVAssetWriterInputPixelBufferAdaptor alloc] initWithAssetWriterInput:writerInput sourcePixelBufferAttributes:[NSDictionary dictionaryWithObjectsAndKeys:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA], kCVPixelBufferPixelFormatTypeKey,nil]];
[movieWriter addInput:writerInput];
writerInput.expectsMediaDataInRealTime = YES;
}
return self;
}
Other parts of the class:
- (void)getFrame:(CVPixelBufferRef)SampleBuffer:(int64_t)frame{
frameNumber = frame;
[writer appendPixelBuffer:SampleBuffer withPresentationTime:CMTimeMake(frame, 24)];
}
- (void)startRecording {
[movieWriter startWriting];
[movieWriter startSessionAtSourceTime:kCMTimeZero];
}
- (void)stopRecording {
[writerInput markAsFinished];
[movieWriter endSessionAtSourceTime:CMTimeMake(frameNumber, 24)];
[movieWriter finishWriting];
}
The asset writer is initialized by:
NSURL *outputFileURL = [NSURL fileURLWithPath:[NSString stringWithFormat:@"%@%@", NSTemporaryDirectory(), @"output.mov"]];
recorder = [[GLRecorder alloc] initWithOutputFileURL:outputFileURL];
The view is recorded this way:
glReadPixels(0, 0, 480, 320, GL_RGBA, GL_UNSIGNED_BYTE, buffer);
for(int y = 0; y <320; y++) {
for(int x = 0; x <480 * 4; x++) {
int b2 = ((320 - 1 - y) * 480 * 4 + x);
int b1 = (y * 4 * 480 + x);
buffer2[b2] = buffer[b1];
}
}
pixelBuffer = NULL;
CVPixelBufferCreateWithBytes (NULL,480,320,kCVPixelFormatType_32BGRA,buffer2,1920,NULL,0,NULL,&pixelBuffer);
[recorder getFrame:pixelBuffer :framenumber];
framenumber++;
Note:
pixelBuffer is a CVPixelBufferRef.
framenumber is an int64_t.
buffer and buffer2 are GLubyte pointers.
I get no errors, but when I finish recording there is no file. The OpenGL view renders a live feed from the camera. I've been able to save the screen as a UIImage, but I want to get a movie of what I created. Any help or links would be greatly appreciated.
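Incidentally, the per-byte vertical flip in the snippet above can be done one row at a time with memcpy, which is much faster. A minimal C sketch, assuming tightly packed RGBA rows as in the question:

```c
#include <string.h>

/* Vertically flip a tightly packed RGBA image: copy each source row
   into the mirrored destination row in one memcpy. */
void flipRowsRGBA(const unsigned char *src, unsigned char *dst,
                  int width, int height) {
    size_t bytesPerRow = (size_t)width * 4;
    for (int y = 0; y < height; y++) {
        memcpy(dst + (size_t)(height - 1 - y) * bytesPerRow,
               src + (size_t)y * bytesPerRow,
               bytesPerRow);
    }
}
```

For the question's 480x320 capture this replaces roughly 614,400 byte assignments with 320 row copies.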
If you're writing RGBA frames, I think you may need to use an AVAssetWriterInputPixelBufferAdaptor to write them out. This class is supposed to manage a pool of pixel buffers, but I get the impression that it actually massages your data into YUV.
If that works, then I think you'll find that your colours are all swapped, at which point you'll probably have to write a pixel shader to convert them to BGRA. Or (shudder) do it on the CPU. Up to you.
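If you do end up doing the swap on the CPU as a last resort, it is just exchanging bytes 0 and 2 of every 4-byte pixel. A minimal in-place C sketch (the function name is illustrative):

```c
#include <stddef.h>

/* Swap the R and B channels of every pixel in a tightly packed
   RGBA buffer, converting it to BGRA in place (and vice versa). */
void rgbaToBgraInPlace(unsigned char *pixels, size_t pixelCount) {
    for (size_t i = 0; i < pixelCount; i++) {
        unsigned char *p = pixels + i * 4;
        unsigned char tmp = p[0];  /* save R */
        p[0] = p[2];               /* B into byte 0 */
        p[2] = tmp;                /* R into byte 2 */
    }
}
```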

Video from Set of Images have RGB problem

Hi guys,
I used ffmpeg to create a video from a sequence of images. Here is my code:
-(void)imageToMov:(NSString*)videoName imageNumber:(int)imageNumber{
[[NSFileManager defaultManager]createDirectoryAtPath:[Utilities documentsPath:[NSString stringWithFormat:@"CoachedFiles/"]] attributes:nil];
[[NSFileManager defaultManager]createDirectoryAtPath:[Utilities documentsPath:[NSString stringWithFormat:@"CoachedFiles/%@/",videoName]] attributes:nil];
// create the file
[[NSFileManager defaultManager]createFileAtPath:[Utilities documentsPath:[NSString stringWithFormat:@"CoachedFiles/%@/%@.mov",videoName,videoName]] contents:nil attributes:nil];
//[[NSFileManager defaultManager]createFileAtPath:[Utilities documentsPath:[NSString stringWithFormat:@"temp/temp.mov"]] contents:nil attributes:nil];
const char *outfilename = [[Utilities documentsPath:[NSString stringWithFormat:@"CoachedFiles/%@/%@.mov",videoName,videoName]]UTF8String];
UIImage * tempImage = [UIImage imageWithContentsOfFile:[Utilities documentsPath:[NSString stringWithFormat:@"temp/temp0000.jpeg"]]];
AVFormatContext *pFormatCtxEnc;
AVCodecContext *pCodecCtxEnc;
AVCodec *pCodecEnc;
AVFrame *pFrameEnc;
AVOutputFormat *pOutputFormat;
AVStream *video_st;
int i;
int outbuf_size;
uint8_t *outbuf;
int out_size;
// Register all formats and codecs
av_register_all();
// auto detect the output format from the name. default is mpeg.
pOutputFormat = av_guess_format(NULL, outfilename, NULL);
if (pOutputFormat == NULL)
return;
// allocate the output media context
pFormatCtxEnc = avformat_alloc_context();
if (pFormatCtxEnc == NULL)
return;
pFormatCtxEnc->oformat = pOutputFormat;
sprintf(pFormatCtxEnc->filename, "%s", outfilename);
video_st = av_new_stream(pFormatCtxEnc, 0); // 0 for video
pCodecCtxEnc = video_st->codec;
pCodecCtxEnc->codec_id = pOutputFormat->video_codec;
pCodecCtxEnc->codec_type = CODEC_TYPE_VIDEO;
// put sample parameters
pCodecCtxEnc->bit_rate = 500000;
// resolution must be a multiple of two
pCodecCtxEnc->width = tempImage.size.width;
pCodecCtxEnc->height = tempImage.size.height;
// frames per second
pCodecCtxEnc->time_base.den = 1;
pCodecCtxEnc->time_base.num = 1;
pCodecCtxEnc->pix_fmt = PIX_FMT_YUV420P;
pCodecCtxEnc->gop_size = 12; /* emit one intra frame every ten frames */
if (pCodecCtxEnc->codec_id == CODEC_ID_MPEG1VIDEO){
/* needed to avoid using macroblocks in which some coeffs overflow
this doesn't happen with normal video, it just happens here as the
motion of the chroma plane doesn't match the luma plane */
pCodecCtxEnc->mb_decision=2;
}
// some formats want stream headers to be separate
if(!strcmp(pFormatCtxEnc->oformat->name, "mp4") || !strcmp(pFormatCtxEnc->oformat->name, "mov") || !strcmp(pFormatCtxEnc->oformat->name, "3gp"))
pCodecCtxEnc->flags |= CODEC_FLAG_GLOBAL_HEADER;
// set the output parameters (must be done even if no parameters).
if (av_set_parameters(pFormatCtxEnc, NULL) < 0) {
return;
}
// find the video encoder
pCodecEnc = avcodec_find_encoder(pCodecCtxEnc->codec_id);
if (pCodecEnc == NULL)
return;
// open it
if (avcodec_open(pCodecCtxEnc, pCodecEnc) < 0) {
return;
}
if (!(pFormatCtxEnc->oformat->flags & AVFMT_RAWPICTURE)) {
/* allocate output buffer */
/* XXX: API change will be done */
outbuf_size = 500000;
outbuf = av_malloc(outbuf_size);
}
pFrameEnc= avcodec_alloc_frame();
// open the output file, if needed
if (!(pOutputFormat->flags & AVFMT_NOFILE)) {
if (url_fopen(&pFormatCtxEnc->pb, outfilename, URL_WRONLY) < 0) {
//fprintf(stderr, "Could not open '%s'\n", filename);
return;
}
}
// write the stream header, if any
av_write_header(pFormatCtxEnc);
// Read frames and save frames to disk
int size = pCodecCtxEnc->width * pCodecCtxEnc->height;
uint8_t * picture_buf;
picture_buf = malloc((size * 3) / 2);
pFrameEnc->data[0] = picture_buf;
pFrameEnc->data[1] = pFrameEnc->data[0] + size;
pFrameEnc->data[2] = pFrameEnc->data[1] + size / 4;
pFrameEnc->linesize[0] = pCodecCtxEnc->width;
pFrameEnc->linesize[1] = pCodecCtxEnc->width / 2;
pFrameEnc->linesize[2] = pCodecCtxEnc->width / 2;
for (i=0;i<imageNumber;i++){
NSString *imgName = [NSString stringWithFormat:@"temp/temp%04d.jpeg",i];
NSLog(@"%@",imgName);
UIImage * image = [UIImage imageWithContentsOfFile:[Utilities documentsPath:imgName]];
// note: imgName is autoreleased, so it must not be released here
// create an AVPicture
AVPicture pict;
// keep the BGRA format; the remaining arguments are the input image's width and height
avpicture_alloc(&pict, PIX_FMT_BGRA, image.size.width, image.size.height);
// read the image data
CGImageRef cgimage = [image CGImage];
CGDataProviderRef dataProvider = CGImageGetDataProvider(cgimage);
CFDataRef data = CGDataProviderCopyData(dataProvider);
const uint8_t * imagedata = CFDataGetBytePtr(data);
// fill the AVPicture with the image data
avpicture_fill(&pict, imagedata, PIX_FMT_BGRA, image.size.width, image.size.height);
// set up the conversion context, from BGRA to YUV420
static int sws_flags = SWS_FAST_BILINEAR;
struct SwsContext * img_convert_ctx = sws_getContext(image.size.width,
image.size.height,
PIX_FMT_BGRA,
image.size.width,
image.size.height,
PIX_FMT_YUV420P,
sws_flags, NULL, NULL, NULL);
// convert the image data format
sws_scale (img_convert_ctx, pict.data, pict.linesize,
0, image.size.height,
pFrameEnc->data, pFrameEnc->linesize);
if (pFormatCtxEnc->oformat->flags & AVFMT_RAWPICTURE) {
/* raw video case. The API will change slightly in the near
future for that */
AVPacket pkt;
av_init_packet(&pkt);
pkt.flags |= PKT_FLAG_KEY;
pkt.stream_index= video_st->index;
pkt.data= (uint8_t *)pFrameEnc;
pkt.size= sizeof(AVPicture);
av_write_frame(pFormatCtxEnc, &pkt);
} else {
// encode the image
out_size = avcodec_encode_video(pCodecCtxEnc, outbuf, outbuf_size, pFrameEnc);
// if zero size, it means the image was buffered
if (out_size != 0) {
AVPacket pkt;
av_init_packet(&pkt);
pkt.pts= pCodecCtxEnc->coded_frame->pts;
if(pCodecCtxEnc->coded_frame->key_frame)
pkt.flags |= PKT_FLAG_KEY;
pkt.stream_index= video_st->index;
pkt.data= outbuf;
pkt.size= out_size;
// write the compressed frame in the media file
av_write_frame(pFormatCtxEnc, &pkt);
}
}
}
// get the delayed frames
for(; out_size; i++) {
out_size = avcodec_encode_video(pCodecCtxEnc, outbuf, outbuf_size, NULL);
if (out_size != 0) {
AVPacket pkt;
av_init_packet(&pkt);
pkt.pts= pCodecCtxEnc->coded_frame->pts;
if(pCodecCtxEnc->coded_frame->key_frame)
pkt.flags |= PKT_FLAG_KEY;
pkt.stream_index= video_st->index;
pkt.data= outbuf;
pkt.size= out_size;
// write the compressed frame in the media file
av_write_frame(pFormatCtxEnc, &pkt);
}
}
// Close the codec
//avcodec_close(pCodecCtxDec);
avcodec_close(pCodecCtxEnc);
// Free the YUV frame
//av_free(pFrameDec);
av_free(pFrameEnc);
av_free(outbuf);
// write the trailer, if any
av_write_trailer(pFormatCtxEnc);
// free the streams
for(i = 0; i < pFormatCtxEnc->nb_streams; i++) {
av_freep(&pFormatCtxEnc->streams[i]->codec);
av_freep(&pFormatCtxEnc->streams[i]);
}
if (!(pOutputFormat->flags & AVFMT_NOFILE)) {
/* close the output file */
//comment out this code to fix the record video issue. Kevin 2010-07-11
//url_fclose(&pFormatCtxEnc->pb);
}
/* free the stream */
av_free(pFormatCtxEnc);
if([[NSFileManager defaultManager]fileExistsAtPath:[Utilities documentsPath:[NSString stringWithFormat:#"temp/"]] isDirectory:NULL]){
[[NSFileManager defaultManager] removeItemAtPath:[Utilities documentsPath:[NSString stringWithFormat:#"temp/"]] error:nil];
}
//[self MergeVideoFileWithVideoName:videoName];
[self SaveFileDetails:videoName];
[alertView dismissWithClickedButtonIndex:0 animated:YES];
}
Now the problem: the video is created successfully, but the colors come out greenish. Please point out my mistake in this code.
I am not an expert on colors, but I think your problem might be in the pixel format you use for your img_convert_ctx. You are using PIX_FMT_BGRA. That's a different channel order than RGB: it's BGR, with the red and blue channels reversed. Try using PIX_FMT_RGBA instead.
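To see why the channel order matters: the same four bytes hold red under one layout and blue under the other, so declaring the wrong source format tints the whole frame. A toy C illustration (the accessor names are mine):

```c
/* Bytes laid out as RGBA but interpreted as BGRA have their red and
   blue channels swapped. These two accessors make the mix-up concrete. */
unsigned char redChannelAsRGBA(const unsigned char *pixel) {
    return pixel[0]; /* RGBA: red is byte 0 */
}

unsigned char redChannelAsBGRA(const unsigned char *pixel) {
    return pixel[2]; /* BGRA: red is byte 2 */
}
```

A pure-red RGBA pixel read through the BGRA accessor reports zero red, which is exactly the kind of wrong-tint output described in the question.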

Drawing waveform with AVAssetReader

I'm reading a song from the iPod library using its asset URL (named audioUrl in the code).
I can play it many ways, I can cut it, I can do some processing with it, but...
I really don't understand what to do with this CMSampleBufferRef to get the data for drawing a waveform! I need info about the peak values. How can I get them, this (or maybe another) way?
AVAssetTrack * songTrack = [audioUrl.tracks objectAtIndex:0];
AVAssetReaderTrackOutput * output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:nil];
[reader addOutput:output];
[output release];
NSMutableData * fullSongData = [[NSMutableData alloc] init];
[reader startReading];
while (reader.status == AVAssetReaderStatusReading){
AVAssetReaderTrackOutput * trackOutput =
(AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];
if (sampleBufferRef){ /* what am I supposed to do with this? */ }
Please help me!
I was searching for a similar thing and decided to "roll my own."
I realize this is an old post, but in case anyone else is searching for this, here is my solution. It is relatively quick and dirty, and it normalizes the image to "full scale".
The images it creates are "wide", i.e. you need to put them in a UIScrollView or otherwise manage the display.
this is based on some answers given to this question
Sample Output
EDIT: I have added a logarithmic version of the averaging and render methods, see the end of this message for the alternate version & comparison outputs. I personally prefer the original linear version, but have decided to post it, in case someone can improve on the algorithm used.
You'll need these imports:
#import <MediaPlayer/MediaPlayer.h>
#import <AVFoundation/AVFoundation.h>
First, a generic rendering method that takes a pointer to averaged sample data and returns a UIImage. Note that these samples are not playable audio samples.
-(UIImage *) audioImageGraph:(SInt16 *) samples
normalizeMax:(SInt16) normalizeMax
sampleCount:(NSInteger) sampleCount
channelCount:(NSInteger) channelCount
imageHeight:(float) imageHeight {
CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
UIGraphicsBeginImageContext(imageSize);
CGContextRef context = UIGraphicsGetCurrentContext();
CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
CGContextSetAlpha(context,1.0);
CGRect rect;
rect.size = imageSize;
rect.origin.x = 0;
rect.origin.y = 0;
CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
CGColorRef rightcolor = [[UIColor redColor] CGColor];
CGContextFillRect(context, rect);
CGContextSetLineWidth(context, 1.0);
float halfGraphHeight = (imageHeight / 2) / (float) channelCount ;
float centerLeft = halfGraphHeight;
float centerRight = (halfGraphHeight*3) ;
float sampleAdjustmentFactor = (imageHeight/ (float) channelCount) / (float) normalizeMax;
for (NSInteger intSample = 0 ; intSample < sampleCount ; intSample ++ ) {
SInt16 left = *samples++;
float pixels = (float) left;
pixels *= sampleAdjustmentFactor;
CGContextMoveToPoint(context, intSample, centerLeft-pixels);
CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
CGContextSetStrokeColorWithColor(context, leftcolor);
CGContextStrokePath(context);
if (channelCount==2) {
SInt16 right = *samples++;
float pixels = (float) right;
pixels *= sampleAdjustmentFactor;
CGContextMoveToPoint(context, intSample, centerRight - pixels);
CGContextAddLineToPoint(context, intSample, centerRight + pixels);
CGContextSetStrokeColorWithColor(context, rightcolor);
CGContextStrokePath(context);
}
}
// Create new image
UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();
// Tidy up
UIGraphicsEndImageContext();
return newImage;
}
Next, a method that takes an AVURLAsset and returns PNG image data:
- (NSData *) renderPNGAudioPictogramForAsset:(AVURLAsset *)songAsset {
NSError * error = nil;
AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];
NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
// [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
// [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
[NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
nil];
AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
[reader addOutput:output];
[output release];
UInt32 sampleRate,channelCount;
NSArray* formatDesc = songTrack.formatDescriptions;
for(unsigned int i = 0; i < [formatDesc count]; ++i) {
CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
if(fmtDesc ) {
sampleRate = fmtDesc->mSampleRate;
channelCount = fmtDesc->mChannelsPerFrame;
// NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
}
}
UInt32 bytesPerSample = 2 * channelCount;
SInt16 normalizeMax = 0;
NSMutableData * fullSongData = [[NSMutableData alloc] init];
[reader startReading];
UInt64 totalBytes = 0;
SInt64 totalLeft = 0;
SInt64 totalRight = 0;
NSInteger sampleTally = 0;
NSInteger samplesPerPixel = sampleRate / 50;
while (reader.status == AVAssetReaderStatusReading){
AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];
if (sampleBufferRef){
CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);
size_t length = CMBlockBufferGetDataLength(blockBufferRef);
totalBytes += length;
NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];
NSMutableData * data = [NSMutableData dataWithLength:length];
CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);
SInt16 * samples = (SInt16 *) data.mutableBytes;
int sampleCount = length / bytesPerSample;
for (int i = 0; i < sampleCount ; i ++) {
SInt16 left = *samples++;
totalLeft += left;
SInt16 right;
if (channelCount==2) {
right = *samples++;
totalRight += right;
}
sampleTally++;
if (sampleTally > samplesPerPixel) {
left = totalLeft / sampleTally;
SInt16 fix = abs(left);
if (fix > normalizeMax) {
normalizeMax = fix;
}
[fullSongData appendBytes:&left length:sizeof(left)];
if (channelCount==2) {
right = totalRight / sampleTally;
SInt16 fix = abs(right);
if (fix > normalizeMax) {
normalizeMax = fix;
}
[fullSongData appendBytes:&right length:sizeof(right)];
}
totalLeft = 0;
totalRight = 0;
sampleTally = 0;
}
}
[wader drain];
CMSampleBufferInvalidate(sampleBufferRef);
CFRelease(sampleBufferRef);
}
}
NSData * finalData = nil;
if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
// Something went wrong. return nil
return nil;
}
if (reader.status == AVAssetReaderStatusCompleted){
NSLog(@"rendering output graphics using normalizeMax %d",normalizeMax);
UIImage *test = [self audioImageGraph:(SInt16 *)
fullSongData.bytes
normalizeMax:normalizeMax
sampleCount:fullSongData.length / 4
channelCount:2
imageHeight:100];
finalData = imageToData(test);
}
[fullSongData release];
[reader release];
return finalData;
}
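The heart of the scan above is the bucket-averaging step: every samplesPerPixel samples collapse into one averaged value, while normalizeMax tracks the largest magnitude written out. Here is a minimal, standalone C sketch of that step for the mono case (the function name and signature are my own, not part of the answer's code):

```c
#include <stdint.h>
#include <stdlib.h>

/* Collapse every `samplesPerPixel` input samples into one averaged value,
   tracking the largest magnitude emitted in *normalizeMax.
   Returns the number of averaged values written to `out`. */
size_t average_buckets(const int16_t *samples, size_t count,
                       size_t samplesPerPixel,
                       int16_t *out, int16_t *normalizeMax) {
    int64_t total = 0;
    size_t tally = 0, written = 0;
    *normalizeMax = 0;
    for (size_t i = 0; i < count; i++) {
        total += samples[i];
        if (++tally >= samplesPerPixel) {
            int16_t avg = (int16_t)(total / (int64_t)tally);
            int16_t mag = (int16_t)abs(avg);
            if (mag > *normalizeMax) *normalizeMax = mag;
            out[written++] = avg;
            total = 0;
            tally = 0;
        }
    }
    return written; /* a trailing partial bucket is dropped, as in the original */
}
```

With sampleRate / 50 as the bucket size, this yields roughly 50 averaged values (one image column each) per second of audio.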
Advanced Option:
Finally, if you want to be able to play the audio using AVAudioPlayer, you'll need to cache
it to your app's caches folder. Since I was doing that, I decided to cache the image data
as well, and wrapped the whole thing into a UIImage category. You need to include this open source offering to extract the audio, and some code from here to handle some background threading features.
First, some defines, and a few generic class methods for handling path names etc.:
//#define imgExt @"jpg"
//#define imageToData(x) UIImageJPEGRepresentation(x,4)
#define imgExt @"png"
#define imageToData(x) UIImagePNGRepresentation(x)
+ (NSString *) assetCacheFolder {
NSArray *assetFolderRoot = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
return [NSString stringWithFormat:@"%@/audio", [assetFolderRoot objectAtIndex:0]];
}
+ (NSString *) cachedAudioPictogramPathForMPMediaItem:(MPMediaItem*) item {
NSString *assetFolder = [[self class] assetCacheFolder];
NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];
NSString *assetPictogramFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,imgExt];
return [NSString stringWithFormat:@"%@/%@", assetFolder, assetPictogramFilename];
}
+ (NSString *) cachedAudioFilepathForMPMediaItem:(MPMediaItem*) item {
NSString *assetFolder = [[self class] assetCacheFolder];
NSURL * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];
NSString *assetFileExt = [[[assetURL path] lastPathComponent] pathExtension];
NSString *assetFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,assetFileExt];
return [NSString stringWithFormat:@"%@/%@", assetFolder, assetFilename];
}
+ (NSURL *) cachedAudioURLForMPMediaItem:(MPMediaItem*) item {
NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
return [NSURL fileURLWithPath:assetFilepath];
}
Now the init method that does "the business"
- (id) initWithMPMediaItem:(MPMediaItem*) item
completionBlock:(void (^)(UIImage* delayedImagePreparation))completionBlock {
NSFileManager *fman = [NSFileManager defaultManager];
NSString *assetPictogramFilepath = [[self class] cachedAudioPictogramPathForMPMediaItem:item];
if ([fman fileExistsAtPath:assetPictogramFilepath]) {
NSLog(@"Returning cached waveform pictogram: %@",[assetPictogramFilepath lastPathComponent]);
self = [self initWithContentsOfFile:assetPictogramFilepath];
return self;
}
NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
NSURL *assetFileURL = [NSURL fileURLWithPath:assetFilepath];
if ([fman fileExistsAtPath:assetFilepath]) {
NSLog(@"scanning cached audio data to create UIImage file: %@",[assetFilepath lastPathComponent]);
[assetFileURL retain];
[assetPictogramFilepath retain];
[NSThread MCSM_performBlockInBackground: ^{
AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];
[waveFormData writeToFile:assetPictogramFilepath atomically:YES];
[assetFileURL release];
[assetPictogramFilepath release];
if (completionBlock) {
[waveFormData retain];
[NSThread MCSM_performBlockOnMainThread:^{
UIImage *result = [UIImage imageWithData:waveFormData];
NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0f x %0.0f pixels)",waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);
completionBlock(result);
[waveFormData release];
}];
}
}];
return nil;
} else {
NSString *assetFolder = [[self class] assetCacheFolder];
[fman createDirectoryAtPath:assetFolder withIntermediateDirectories:YES attributes:nil error:nil];
NSLog(@"Preparing to import audio asset data %@",[assetFilepath lastPathComponent]);
[assetPictogramFilepath retain];
[assetFileURL retain];
TSLibraryImport* import = [[TSLibraryImport alloc] init];
NSURL * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
[import importAsset:assetURL toURL:assetFileURL completionBlock:^(TSLibraryImport* import) {
//check the status and error properties of
//TSLibraryImport
if (import.error) {
NSLog (@"audio data import failed:%@",import.error);
} else{
NSLog (@"Creating waveform pictogram file: %@", [assetPictogramFilepath lastPathComponent]);
AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];
[waveFormData writeToFile:assetPictogramFilepath atomically:YES];
if (completionBlock) {
[waveFormData retain];
[NSThread MCSM_performBlockOnMainThread:^{
UIImage *result = [UIImage imageWithData:waveFormData];
NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0f x %0.0f pixels)",waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);
completionBlock(result);
[waveFormData release];
}];
}
}
[assetPictogramFilepath release];
[assetFileURL release];
} ];
return nil;
}
}
An example of invoking this:
-(void) importMediaItem {
MPMediaItem* item = [self mediaItem];
// since we will be needing this for playback, save the url to the cached audio.
[url release];
url = [[UIImage cachedAudioURLForMPMediaItem:item] retain];
[waveFormImage release];
waveFormImage = [[UIImage alloc ] initWithMPMediaItem:item completionBlock:^(UIImage* delayedImagePreparation){
waveFormImage = [delayedImagePreparation retain];
[self displayWaveFormImage];
}];
if (waveFormImage) {
[waveFormImage retain];
[self displayWaveFormImage];
}
}
Logarithmic versions of the averaging and render methods:
#define absX(x) (x<0?0-x:x)
#define minMaxX(x,mn,mx) (x<=mn?mn:(x>=mx?mx:x))
#define noiseFloor (-90.0)
#define decibel(amplitude) (20.0 * log10(absX(amplitude)/32767.0))
-(UIImage *) audioImageLogGraph:(Float32 *) samples
normalizeMax:(Float32) normalizeMax
sampleCount:(NSInteger) sampleCount
channelCount:(NSInteger) channelCount
imageHeight:(float) imageHeight {
CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
UIGraphicsBeginImageContext(imageSize);
CGContextRef context = UIGraphicsGetCurrentContext();
CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
CGContextSetAlpha(context,1.0);
CGRect rect;
rect.size = imageSize;
rect.origin.x = 0;
rect.origin.y = 0;
CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
CGColorRef rightcolor = [[UIColor redColor] CGColor];
CGContextFillRect(context, rect);
CGContextSetLineWidth(context, 1.0);
float halfGraphHeight = (imageHeight / 2) / (float) channelCount ;
float centerLeft = halfGraphHeight;
float centerRight = (halfGraphHeight*3) ;
float sampleAdjustmentFactor = (imageHeight/ (float) channelCount) / (normalizeMax - noiseFloor) / 2;
for (NSInteger intSample = 0 ; intSample < sampleCount ; intSample ++ ) {
Float32 left = *samples++;
float pixels = (left - noiseFloor) * sampleAdjustmentFactor;
CGContextMoveToPoint(context, intSample, centerLeft-pixels);
CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
CGContextSetStrokeColorWithColor(context, leftcolor);
CGContextStrokePath(context);
if (channelCount==2) {
Float32 right = *samples++;
float pixels = (right - noiseFloor) * sampleAdjustmentFactor;
CGContextMoveToPoint(context, intSample, centerRight - pixels);
CGContextAddLineToPoint(context, intSample, centerRight + pixels);
CGContextSetStrokeColorWithColor(context, rightcolor);
CGContextStrokePath(context);
}
}
// Create new image
UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();
// Tidy up
UIGraphicsEndImageContext();
return newImage;
}
- (NSData *) renderPNGAudioPictogramLogForAsset:(AVURLAsset *)songAsset {
NSError * error = nil;
AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];
NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
// [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
// [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
[NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
nil];
AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
[reader addOutput:output];
[output release];
UInt32 sampleRate,channelCount;
NSArray* formatDesc = songTrack.formatDescriptions;
for(unsigned int i = 0; i < [formatDesc count]; ++i) {
CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
if(fmtDesc ) {
sampleRate = fmtDesc->mSampleRate;
channelCount = fmtDesc->mChannelsPerFrame;
// NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
}
}
UInt32 bytesPerSample = 2 * channelCount;
Float32 normalizeMax = noiseFloor;
NSLog(@"normalizeMax = %f",normalizeMax);
NSMutableData * fullSongData = [[NSMutableData alloc] init];
[reader startReading];
UInt64 totalBytes = 0;
Float64 totalLeft = 0;
Float64 totalRight = 0;
Float32 sampleTally = 0;
NSInteger samplesPerPixel = sampleRate / 50;
while (reader.status == AVAssetReaderStatusReading){
AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];
if (sampleBufferRef){
CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);
size_t length = CMBlockBufferGetDataLength(blockBufferRef);
totalBytes += length;
NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];
NSMutableData * data = [NSMutableData dataWithLength:length];
CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);
SInt16 * samples = (SInt16 *) data.mutableBytes;
int sampleCount = length / bytesPerSample;
for (int i = 0; i < sampleCount ; i ++) {
Float32 left = (Float32) *samples++;
left = decibel(left);
left = minMaxX(left,noiseFloor,0);
totalLeft += left;
Float32 right;
if (channelCount==2) {
right = (Float32) *samples++;
right = decibel(right);
right = minMaxX(right,noiseFloor,0);
totalRight += right;
}
sampleTally++;
if (sampleTally > samplesPerPixel) {
left = totalLeft / sampleTally;
if (left > normalizeMax) {
normalizeMax = left;
}
// NSLog(@"left average = %f, normalizeMax = %f",left,normalizeMax);
[fullSongData appendBytes:&left length:sizeof(left)];
if (channelCount==2) {
right = totalRight / sampleTally;
if (right > normalizeMax) {
normalizeMax = right;
}
[fullSongData appendBytes:&right length:sizeof(right)];
}
totalLeft = 0;
totalRight = 0;
sampleTally = 0;
}
}
[wader drain];
CMSampleBufferInvalidate(sampleBufferRef);
CFRelease(sampleBufferRef);
}
}
NSData * finalData = nil;
if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
// Something went wrong. Handle it.
}
if (reader.status == AVAssetReaderStatusCompleted){
// You're done. It worked.
NSLog(@"rendering output graphics using normalizeMax %f",normalizeMax);
UIImage *test = [self audioImageLogGraph:(Float32 *) fullSongData.bytes
normalizeMax:normalizeMax
sampleCount:fullSongData.length / (sizeof(Float32) * 2)
channelCount:2
imageHeight:100];
finalData = imageToData(test);
}
[fullSongData release];
[reader release];
return finalData;
}
Comparison outputs:
[Image: linear plot for the start of "Warm It Up" by Acme Swing Company]
[Image: logarithmic plot for the start of "Warm It Up" by Acme Swing Company]
You should be able to get a buffer of audio from your sampleBufferRef and then iterate through those values to build your waveform:
CMBlockBufferRef buffer = CMSampleBufferGetDataBuffer( sampleBufferRef );
CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
AudioBufferList audioBufferList;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
sampleBufferRef,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
NULL,
kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
&buffer
);
// this copies your audio out to a temp buffer but you should be able to iterate through this buffer instead
SInt32* readBuffer = (SInt32 *)malloc(numSamplesInBuffer * sizeof(SInt32));
memcpy( readBuffer, audioBufferList.mBuffers[0].mData, numSamplesInBuffer*sizeof(SInt32));
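Whichever way you pull the bytes out, a live waveform is typically built by reducing each captured chunk to a single value as it arrives. Assuming interleaved 16-bit samples (the snippet above copies SInt32 values, so adjust the type to match your stream format), a minimal C sketch of that per-chunk reduction:

```c
#include <stdint.h>
#include <stdlib.h>

/* Reduce one captured chunk to its peak magnitude: the kind of value
   you would append to a live waveform view after each audio callback. */
int16_t chunk_peak(const int16_t *samples, size_t sampleCount) {
    int16_t peak = 0;
    for (size_t i = 0; i < sampleCount; i++) {
        int16_t mag = (int16_t)abs(samples[i]);
        if (mag > peak) peak = mag;
    }
    return peak;
}
```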
Another approach, using Swift 5 and AVAudioFile:
///Gets the audio file from a URL, downsamples it, and draws it into the sound layer.
func drawSoundWave(fromURL url:URL, fromPosition:Int64, totalSeconds:UInt32, samplesSecond:CGFloat) throws{
print("\(logClassName) Drawing sound from \(url)")
do{
waveViewInfo.samplesSeconds = samplesSecond
//Get audio file and format from URL
let audioFile = try AVAudioFile(forReading: url)
waveViewInfo.format = audioFile.processingFormat
audioFile.framePosition = fromPosition * Int64(waveViewInfo.format.sampleRate)
//Getting the buffer
let frameCapacity:UInt32 = totalSeconds * UInt32(waveViewInfo.format.sampleRate)
guard let audioPCMBuffer = AVAudioPCMBuffer(pcmFormat: waveViewInfo.format, frameCapacity: frameCapacity) else{ throw AppError("Unable to get the AVAudioPCMBuffer") }
try audioFile.read(into: audioPCMBuffer, frameCount: frameCapacity)
let audioPCMBufferFloatValues:[Float] = Array(UnsafeBufferPointer(start: audioPCMBuffer.floatChannelData?.pointee,
count: Int(audioPCMBuffer.frameLength)))
waveViewInfo.points = []
waveViewInfo.maxValue = 0
for index in stride(from: 0, to: audioPCMBufferFloatValues.count, by: Int(audioFile.fileFormat.sampleRate) / Int(waveViewInfo.samplesSeconds)){
let aSample = CGFloat(audioPCMBufferFloatValues[index])
waveViewInfo.points.append(aSample)
let fix = abs(aSample)
if fix > waveViewInfo.maxValue{
waveViewInfo.maxValue = fix
}
}
print("\(logClassName) Finished the points - Count = \(waveViewInfo.points.count) / Max = \(waveViewInfo.maxValue)")
populateSoundImageView(with: waveViewInfo)
}
catch{
throw error
}
}
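Unlike the averaging approach earlier, the stride loop above decimates: it keeps every Nth sample and tracks the peak magnitude for later normalization. A C sketch of the same idea (names are illustrative):

```c
#include <math.h>
#include <stddef.h>

/* Keep every `step`-th sample and record the peak magnitude seen among
   the kept samples. Returns the number of samples written to `out`. */
size_t decimate_peaks(const float *samples, size_t count, size_t step,
                      float *out, float *maxValue) {
    size_t written = 0;
    *maxValue = 0.0f;
    for (size_t i = 0; i < count; i += step) {
        float s = samples[i];
        out[written++] = s;
        if (fabsf(s) > *maxValue) *maxValue = fabsf(s);
    }
    return written;
}
```

Decimation is cheaper but can miss transients that fall between kept samples; averaging (or taking per-bucket peaks) is more faithful.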
///Converts the sound wave into a UIImage
func populateSoundImageView(with waveViewInfo:WaveViewInfo){
let imageSize:CGSize = CGSize(width: CGFloat(waveViewInfo.points.count),//CGFloat(waveViewInfo.points.count) * waveViewInfo.sampleSpace,
height: frame.height)
let drawingRect = CGRect(origin: .zero, size: imageSize)
UIGraphicsBeginImageContextWithOptions(imageSize, false, 0)
defer {
UIGraphicsEndImageContext()
}
print("\(logClassName) Converting sound view in rect \(drawingRect)")
guard let context:CGContext = UIGraphicsGetCurrentContext() else{ return }
context.setFillColor(waveViewInfo.backgroundColor.cgColor)
context.setAlpha(1.0)
context.fill(drawingRect)
context.setLineWidth(1.0)
// context.setLineWidth(waveViewInfo.lineWidth)
let sampleAdjustFactor = imageSize.height / waveViewInfo.maxValue
for pointIndex in waveViewInfo.points.indices{
let pixel = waveViewInfo.points[pointIndex] * sampleAdjustFactor
context.move(to: CGPoint(x: CGFloat(pointIndex), y: middleY - pixel))
context.addLine(to: CGPoint(x: CGFloat(pointIndex), y: middleY + pixel))
context.setStrokeColor(waveViewInfo.strokeColor.cgColor)
context.strokePath()
}
// for pointIndex in waveViewInfo.points.indices{
//
// let pixel = waveViewInfo.points[pointIndex] * sampleAdjustFactor
//
// context.move(to: CGPoint(x: CGFloat(pointIndex) * waveViewInfo.sampleSpace, y: middleY - pixel))
// context.addLine(to: CGPoint(x: CGFloat(pointIndex) * waveViewInfo.sampleSpace, y: middleY + pixel))
//
// context.setStrokeColor(waveViewInfo.strokeColor.cgColor)
// context.strokePath()
//
// }
// var xIncrement:CGFloat = 0
// for point in waveViewInfo.points{
//
// let normalizedPoint = point * sampleAdjustFactor
//
// context.move(to: CGPoint(x: xIncrement, y: middleY - normalizedPoint))
// context.addLine(to: CGPoint(x: xIncrement, y: middleX + normalizedPoint))
// context.setStrokeColor(waveViewInfo.strokeColor.cgColor)
// context.strokePath()
//
// xIncrement += waveViewInfo.sampleSpace
//
// }
guard let soundWaveImage = UIGraphicsGetImageFromCurrentImageContext() else{ return }
soundWaveImageView.image = soundWaveImage
// //In case of handling sample space in for
// updateWidthConstraintValue(soundWaveImage.size.width)
updateWidthConstraintValue(soundWaveImage.size.width * waveViewInfo.sampleSpace)
}
WHERE
class WaveViewInfo {
var format:AVAudioFormat!
var samplesSeconds:CGFloat = 50
var lineWidth:CGFloat = 0.20
var sampleSpace:CGFloat = 0.20
var strokeColor:UIColor = .red
var backgroundColor:UIColor = .clear
var maxValue:CGFloat = 0
var points:[CGFloat] = [CGFloat]()
}
At the moment it only draws one sound wave, but it can be extended. The good part is that you can draw an audio track in parts.
A little refactoring of the above answers (using AVAudioFile):
import AVFoundation
import CoreGraphics
import Foundation
import UIKit
class WaveGenerator {
private func readBuffer(_ audioUrl: URL) -> [Float] {
let file = try! AVAudioFile(forReading: audioUrl)
let audioFormat = file.processingFormat
let audioFrameCount = UInt32(file.length)
guard let buffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)
else { return [] }
do {
try file.read(into: buffer)
} catch {
print(error)
}
// Copy into an Array: returning an UnsafeBufferPointer here would dangle
// once the AVAudioPCMBuffer is deallocated at the end of this method
let floatArray = Array(UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength)))
return floatArray
}
private func generateWaveImage(
_ samples: [Float],
_ imageSize: CGSize,
_ strokeColor: UIColor,
_ backgroundColor: UIColor
) -> UIImage? {
let drawingRect = CGRect(origin: .zero, size: imageSize)
UIGraphicsBeginImageContextWithOptions(imageSize, false, 0)
let middleY = imageSize.height / 2
guard let context: CGContext = UIGraphicsGetCurrentContext() else { return nil }
context.setFillColor(backgroundColor.cgColor)
context.setAlpha(1.0)
context.fill(drawingRect)
context.setLineWidth(0.25)
let max: CGFloat = CGFloat(samples.max() ?? 0)
let heightNormalizationFactor = imageSize.height / max / 2
let widthNormalizationFactor = imageSize.width / CGFloat(samples.count)
for index in 0 ..< samples.count {
let pixel = CGFloat(samples[index]) * heightNormalizationFactor
let x = CGFloat(index) * widthNormalizationFactor
context.move(to: CGPoint(x: x, y: middleY - pixel))
context.addLine(to: CGPoint(x: x, y: middleY + pixel))
context.setStrokeColor(strokeColor.cgColor)
context.strokePath()
}
guard let soundWaveImage = UIGraphicsGetImageFromCurrentImageContext() else { return nil }
UIGraphicsEndImageContext()
return soundWaveImage
}
func generateWaveImage(from audioUrl: URL, in imageSize: CGSize) -> UIImage? {
let samples = readBuffer(audioUrl)
let img = generateWaveImage(samples, imageSize, UIColor.blue, UIColor.white)
return img
}
}
Usage
let url = Bundle.main.url(forResource: "TEST1", withExtension: "mp3")!
let img = waveGenerator.generateWaveImage(from: url, in: CGSize(width: 600, height: 200))
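The two normalization factors inside generateWaveImage define the map from (sample index, sample value) to image coordinates: each sample becomes a vertical stroke mirrored about the midline. A C sketch of that mapping (the struct and function are illustrative, not part of the answer's API):

```c
#include <stddef.h>

typedef struct { float x, yTop, yBottom; } WaveLine;

/* Map one sample to its stroke endpoints in an image of the given size,
   mirroring generateWaveImage's height/width normalization factors. */
WaveLine wave_line(float sample, size_t index, size_t sampleCount,
                   float imageWidth, float imageHeight, float maxSample) {
    float heightFactor = imageHeight / maxSample / 2.0f;
    float widthFactor = imageWidth / (float)sampleCount;
    float middleY = imageHeight / 2.0f;
    float pixel = sample * heightFactor;
    WaveLine line = { (float)index * widthFactor,
                      middleY - pixel, middleY + pixel };
    return line;
}
```

Dividing the height factor by 2 keeps a full-scale sample's stroke inside the image; note the original does not guard against maxSample being 0 for a silent file.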

How do I make my own recordedPath in GLPaint Sample Code

I've recently downloaded the GLPaint sample code and looked at a very interesting part in it. There is a recordedPaths NSMutableArray that has points in it that are then read and drawn by GLPaint.
It's declared here:
NSMutableArray *recordedPaths;
recordedPaths = [NSMutableArray arrayWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"Recording" ofType:@"data"]];
if([recordedPaths count])
[self performSelector:@selector(playback:) withObject:recordedPaths afterDelay:0.2];
This is the code for playback:
- (void) playback:(NSMutableArray*)recordedPaths {
NSData* data = [recordedPaths objectAtIndex:0];
CGPoint* point = (CGPoint*)[data bytes];
NSUInteger count = [data length] / sizeof(CGPoint),
i;
//Render the current path
for(i = 0; i < count - 1; ++i, ++point)
[self renderLineFromPoint:*point toPoint:*(point + 1)];
//Render the next path after a short delay
[recordedPaths removeObjectAtIndex:0];
if([recordedPaths count])
[self performSelector:@selector(playback:) withObject:recordedPaths afterDelay:0.01];
}
From this I understand that recordedPaths is a mutable array that holds C arrays of CGPoint, which are then read and rendered.
I'd like to put in my own array, and I've been having trouble with that.
I tried changing the recordedPaths declaration to this:
NSMutableArray *myArray = [[NSMutableArray alloc] init];
CGPoint* points;
CGPoint a = CGPointMake(50,50);
int i;
for (i=0; i<100; i++,points++) {
a = CGPointMake(i,i);
points = &a;
}
NSData *data = [NSData dataWithBytes:&points length:sizeof(*points)];
[myArray addObject:data];
This didn't work though...
Any advice?
If you look at the Recording.data you will notice that each line is its own array. To capture the ink and play it back you need an array of arrays. For the purposes of this demo, declare a mutable array, writRay:
@synthesize writRay;
//init in code
writRay = [[NSMutableArray alloc]init];
Capture the ink
// Handles the continuation of a touch.
- (void)touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event
{
CGRect bounds = [self bounds];
UITouch* touch = [[event touchesForView:self] anyObject];
// Convert touch point from UIView referential to OpenGL one (upside-down flip)
if (firstTouch) {
firstTouch = NO;
previousLocation = [touch previousLocationInView:self];
previousLocation.y = bounds.size.height - previousLocation.y;
/******************* create a new array for this stroke's points **************/
[writRay addObject:[[NSMutableArray alloc]init]];
/***** add 1st point *********/
[[writRay objectAtIndex:[writRay count] -1]addObject:[NSValue valueWithCGPoint:previousLocation]];
} else {
location = [touch locationInView:self];
location.y = bounds.size.height - location.y;
previousLocation = [touch previousLocationInView:self];
previousLocation.y = bounds.size.height - previousLocation.y;
/********* add additional points *********/
[[writRay objectAtIndex:[writRay count] -1]addObject:[NSValue valueWithCGPoint:previousLocation]];
}
// Render the stroke
[self renderLineFromPoint:previousLocation toPoint:location];
}
Playback the ink.
- (void)playRay{
if(writRay != NULL){
for(int l = 0; l < [writRay count]; l++){
//replays my writRay -1 because of location point
for(int p = 0; p < [[writRay objectAtIndex:l]count] -1; p ++){
[self renderLineFromPoint:[[[writRay objectAtIndex:l]objectAtIndex:p]CGPointValue] toPoint:[[[writRay objectAtIndex:l]objectAtIndex:p + 1]CGPointValue]];
}
}
}
}
For best effect, shake the screen to clear, then call playRay from changeBrushColor in the AppController.
CGPoint* points;
CGPoint a = CGPointMake(50,50);
int i;
for (i=0; i<100; i++,points++) {
a = CGPointMake(i,i);
points = &a;
}
NSData *data = [NSData dataWithBytes:&points length:sizeof(*points)];
Wrong code.
(1) You need an array of points. Simply declaring CGPoint* points; won't create an array of points, just an uninitialized pointer to CGPoint. You need to allocate space for the array, either with
CGPoint points[100];
or
CGPoint* points = malloc(sizeof(CGPoint)*100);
Remember to free the points if you choose the malloc way.
(2) To copy a value to the content of the pointer you need to use
*points = a;
But I suggest you keep the pointer points invariant in the loop, since you're going to reuse it later. Use the array syntax points[i].
(3)
sizeof(*points)
Since *points is just one CGPoint, sizeof gives only the size of a single point (8 bytes on 32-bit iOS). You need to multiply the result by 100 to get the correct length.
(4)
[NSData dataWithBytes:&points ...
points is already a pointer to the actual data. You don't need to take its address again. Just pass points directly.
So the final code should look like
CGPoint* points = malloc(sizeof(CGPoint)*100); // make a cast if the compiler warns.
CGPoint a;
int i;
for (i=0; i<100; i++) {
a = CGPointMake(i,i);
points[i] = a;
}
NSData *data = [NSData dataWithBytes:points length:sizeof(*points)*100];
free(points);
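dataWithBytes:length: is just a byte copy, and the playback method then reinterprets those bytes as CGPoints. A C sketch of that round trip, with a plain Point struct standing in for CGPoint:

```c
#include <stddef.h>
#include <string.h>

typedef struct { double x, y; } Point; /* stand-in for CGPoint */

/* Copy `count` points into a byte buffer (what dataWithBytes: does).
   Returns the number of bytes written. */
size_t pack_points(const Point *points, size_t count, unsigned char *dst) {
    memcpy(dst, points, count * sizeof(Point));
    return count * sizeof(Point);
}

/* Reinterpret the bytes as points again (what playback: does with [data bytes]). */
const Point *unpack_points(const unsigned char *src, size_t length, size_t *count) {
    *count = length / sizeof(Point);
    return (const Point *)src;
}
```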
I was just reading over this post as I am trying to achieve a similar thing. With modifications to the original project by Apple, I was able to create a new 'shape' by modifying the code accordingly.
Note that it only draws a diagonal line... several times. But the theory is there to create your own drawing.
I've taken the code from KennyTM's post and incorporated it into the 'playback' function; this could be adapted to create the array in the initWithCoder function and then send it through like the original code, but for now this will get you a result.
CGPoint* points = malloc(sizeof(CGPoint)*100);
CGPoint a;
int iter;
for (iter=0; iter<100; iter++) { // must not exceed the 100 points malloc'd above
a = CGPointMake(iter,iter);
points[iter] = a;
}
NSData *data = [NSData dataWithBytes:points length:sizeof(*points)*100];
free(points);
CGPoint* point = (CGPoint*)[data bytes];
NSUInteger count = [data length] / sizeof(CGPoint),
i;
for(i = 0; i < count - 1; ++i, ++point)
[self renderLineFromPoint:*point toPoint:*(point + 1)];
[recordedPaths removeObjectAtIndex:0];
if([recordedPaths count])
[self performSelector:@selector(playback:) withObject:recordedPaths afterDelay:0.01];
I am still in my first weeks of OpenGL coding, so forgive any glaring mistakes / bad methods, and thanks for the help!
Hope this helps