I am trying to use the SoX vad (voice activity detection) effect to analyze a WAV file and determine whether it contains speech. However, I am using it on the command line on a Linux server that has no audio device. I would expect to be able to run the command and capture the output somehow, but the vad feature seems to depend on using the "play" command, which in turn appears to depend on an audio device.
Is there a way that I can do this without an audio device?
Works here; how did you run it? Here's what I did:
sox infile.wav outfile.wav vad
outfile.wav is trimmed at the front until voice is detected.
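Note that sox only needs a sound card when you use the play/rec aliases (or explicitly output to an audio device); plain file-to-file processing like the command above works fine on a headless server. If you also want a scriptable yes/no answer, one rough option (assuming a SoX build that has the vad and reverse effects plus the soxi tool) is to trim non-speech from both ends and compare durations:

sox infile.wav outfile.wav vad reverse vad reverse
soxi -D infile.wav
soxi -D outfile.wav

If outfile.wav ends up empty or much shorter than infile.wav, no speech was detected; otherwise the remaining duration gives a rough measure of how much speech there is.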
I would like to use Chrome's speech recognition (webkitSpeechRecognition()) with an audio file as input, for testing purposes. I could use a virtual microphone, but that is really hacky and hard to automate; when I tested it, though, everything worked fine and the speech recognition converted my audio file to text. So instead I wanted to use the following Chrome arguments:
--use-file-for-fake-audio-capture="C:/url/to/audio.wav"
--use-fake-device-for-media-stream
--use-fake-ui-for-media-stream
This worked fine on voice recorder sites, for example, and I could hear the audio file play when I replayed the recording. But for some reason, when I try to use this with Chrome's webkitSpeechRecognition, it doesn't use the fake audio device but my actual microphone instead. Is there any way I can fix this, or otherwise test my audio files against the website? I am using C#, and I couldn't really find any useful information on automatically adding, managing, and configuring virtual audio devices. What approaches could I take?
Thanks in advance.
Well, it turns out this is not possible, because Chrome and Google check whether you are using a fake mic etc.; they do this specifically to prevent this kind of behavior, so that people cannot get free speech-to-text. There is a paid API available from Google (the first 60 minutes per month are free).
I've started writing a Java synthesizer using the JSyn library. It works well on Windows and OSX, but not when I run it on Raspbian. When starting the program I do notice some activity on the headphone output: it starts to output some faint noise, but no clear, loud sawtooth wave like it does on Windows and OSX. Which sound device is the correct one to select as output when starting the synthesizer if I want to use the headphone jack? There are 4 available when I run AudioDeviceManager.getDeviceCount().
It is hard to know which of the 4 devices to use. This example will list them by name and also indicate which one is the default input or output.
https://github.com/philburk/jsyn/blob/master/tests/com/jsyn/examples/ListAudioDevices.java
It is also possible that the CPU cannot keep up. Try just playing a single oscillator. A sine wave is good because then you can easily hear any click or distortion. Here is an example that does that:
https://github.com/philburk/jsyn/blob/master/tests/com/jsyn/examples/PlayTone.java
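If it helps, here is a rough sketch that combines those two examples: it prints the available devices, then starts the synthesizer on an explicitly chosen output device and plays a plain sine tone. Treat it as a sketch against the current JSyn API; on Raspbian the headphone jack is usually the entry whose name mentions the built-in bcm2835 output, so adjust outputDeviceID to whatever the printed list shows.

import com.jsyn.JSyn;
import com.jsyn.Synthesizer;
import com.jsyn.devices.AudioDeviceFactory;
import com.jsyn.devices.AudioDeviceManager;
import com.jsyn.unitgen.LineOut;
import com.jsyn.unitgen.SineOscillator;

public class HeadphoneTest {
    public static void main(String[] args) throws InterruptedException {
        // List the available devices, as in the ListAudioDevices example.
        AudioDeviceManager audioManager = AudioDeviceFactory.createAudioDeviceManager(true);
        for (int i = 0; i < audioManager.getDeviceCount(); i++) {
            System.out.println(i + ": " + audioManager.getDeviceName(i)
                    + " (maxOutputs = " + audioManager.getMaxOutputChannels(i) + ")");
        }

        // Replace this with the index of the Pi's headphone output from the list above.
        int outputDeviceID = audioManager.getDefaultOutputDeviceID();

        Synthesizer synth = JSyn.createSynthesizer(audioManager);
        SineOscillator osc = new SineOscillator();
        LineOut lineOut = new LineOut();
        synth.add(osc);
        synth.add(lineOut);
        osc.frequency.set(440.0);
        osc.amplitude.set(0.5);
        osc.output.connect(0, lineOut.input, 0);
        osc.output.connect(0, lineOut.input, 1);

        // start(frameRate, inputDeviceID, numInputChannels, outputDeviceID, numOutputChannels)
        synth.start(44100, AudioDeviceManager.USE_DEFAULT_DEVICE, 0, outputDeviceID, 2);
        lineOut.start();
        synth.sleepFor(4.0); // listen for clicks or distortion while the tone plays
        synth.stop();
    }
}

If the sine tone is clean on the headphone jack but the full sawtooth patch is not, that points to the CPU rather than the device selection.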
I am attempting to record live audio via USB microphone to be converted to WAV and uploaded to a server. I am using Chrome Canary (latest build) on Windows XP. I have based my development on the example at http://webaudiodemos.appspot.com/AudioRecorder/index.html
I see that when I activate the recording, the onaudioprocess event input buffers (e.inputBuffer.getChannelData(0), for example) contain all zero-value data. Naturally, no sound is output or recorded when this is the case. I have verified the rest of the code by replacing the input buffer data with data that produces a tone, and that tone shows up in the output WAV file. When I use approaches other than createMediaStreamSource, things work correctly. For example, I can use createObjectURL, set an audio element's src to it, and successfully hear my live audio played back in real time. I can also load an audio file and, using createBufferSource, see that during playback (which I can hear) the inputBuffer has non-zero data in it, of course.
Since most of the web-audio recording demos I have seen on the web rely upon createMediaStreamSource, I am guessing this has been inadvertently broken in some subsequent release of Chrome. Can anyone confirm this, or suggest how to overcome the problem?
It's probably not the version of Chrome. Live input still has some high requirements right now:
1) Input and output sample rates need to be the same on Windows
2) Windows 7+ only - I don't believe it will work on Windows XP, which is likely what is breaking you.
3) Input device must be stereo (or >2 channels) - many, if not most, USB microphones show up as a mono device, and Web Audio isn't working with them yet.
I'm presuming, of course, that my AudioRecorder demo isn't working for you either.
These limitations will be removed over time.
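For reference, this is roughly the live-input path being discussed, as a bare sketch rather than the full AudioRecorder demo (it is written against the unprefixed API; Canary at the time needed webkitGetUserMedia/webkitAudioContext). On a machine that hits the limitations above, the samples logged here come out as all zeros:

navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
    var context = new AudioContext();
    var source = context.createMediaStreamSource(stream);
    // ScriptProcessorNode with 4096-sample buffers, 1 input channel, 1 output channel.
    var processor = context.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = function (e) {
        var samples = e.inputBuffer.getChannelData(0);
        // With broken live input these samples are all zero; otherwise real microphone data.
        console.log('first sample of this buffer:', samples[0]);
    };
    source.connect(processor);
    processor.connect(context.destination); // keeps the processor running in the graph
}).catch(function (err) {
    console.error('getUserMedia failed:', err);
});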
Recently I have been researching the software architecture for iPhone streaming (based on the MMS protocol).
As we know, in order to play back an MMS audio stream, we call libMMS to read the WMA stream data from the remote media server, then call FFmpeg to decode that data from WMA format into a PCM buffer, and finally enqueue the PCM buffer into the iPhone's Audio Queue to generate the actual sound.
The introduction above just describes the basic workflow of iPhone streaming. If we only need to implement this simple functionality, it is not difficult: just call libMMS, FFmpeg, and the Audio Queue step by step as described, and the streaming works. In fact, I implemented that code last week.
But what I need is not just a simple streaming function! I need a software architecture that lets FFmpeg access libMMS just as if it were accessing the local filesystem.
Does anybody know how to hook the libMMS interfaces like mms_read/mms_seek onto FFmpeg's file-access interfaces like av_read_frame/av_seek_frame?
I think I have to answer my own question again this time...
After several weeks of research and debugging, I finally found the answer.
Actually, we don't need to "hook" libMMS onto FFmpeg at all. Why? Because FFmpeg already has its own native MMS protocol module, "mms_protocol" (see mms_protocol.c in FFmpeg).
All we need to do is configure FFmpeg to enable the MMS module, like this (see config.h in FFmpeg):
#define ENABLE_MMS_PROTOCOL 1
#define CONFIG_MMS_PROTOCOL 1
After this configuration, FFmpeg adds the MMS protocol to its protocol list (the list already contains the "local file system" protocol). As a result, FFmpeg can treat an "mms://hostserver/abc" media URL just like a local media file. Therefore, we can still open and read the MMS media stream using:
av_open_input_file();
av_read_frame();
just like we did with local media files before!
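For completeness, here is a minimal sketch of that usage with the same (old) libavformat API; mms://hostserver/abc is just the placeholder URL from above, and the actual decoding of the packets with libavcodec is left out:

#include <libavformat/avformat.h>

int main(void)
{
    AVFormatContext *ctx = NULL;
    AVPacket pkt;

    av_register_all(); /* registers demuxers and protocols, including mms_protocol */

    if (av_open_input_file(&ctx, "mms://hostserver/abc", NULL, 0, NULL) != 0)
        return 1;
    if (av_find_stream_info(ctx) < 0)
        return 1;

    while (av_read_frame(ctx, &pkt) >= 0) {
        /* pkt.data / pkt.size hold one compressed (e.g. WMA) packet;
           hand it to the decoder here, then release it. */
        av_free_packet(&pkt);
    }

    av_close_input_file(ctx);
    return 0;
}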
By the way, in my FFmpeg version there are still quite a few bugs in the libavformat module's handling of the MMS protocol. It took me a week to debug them; I think it will take much less time for someone as smart as you. :-)
I am trying to understand the concept behind beat detection (BeatDetektor), and I found that it works by detecting sound through the microphone. So my first question is: isn't it a disadvantage to detect sound through the microphone? When we are using the device, other sounds from the environment are picked up as well, so the actual beats of the track will not come out correctly.
My second question (actually where I got stuck): I found that this beat detection is not able to access the iPod library. Will I be able to produce beats if I fetch a song from the iPod library in my app and then use it with the beat detection?
http://www.cubicproductions.com/index.php?option=com_content&view=article&id=67&Itemid=82
http://www.gearslutz.com/board/product-alerts-older-than-2-months/457617-beatdetektor-iphone-app-open-source-algorithm-bpm-detection.html
I would be very thankful for any reference/link, other than the ones provided above, that would help me understand beat detection better...
Edit 1
I have got the code for the above from this link, but the code is in C++, and it says there that we have to convert it into an Xcode project using CMake. I somehow managed to convert the code into an Xcode project, but then I only have .cpp files, so how should I run the program on the iPhone?
OK, I was somehow able to solve my problem with Apple's sample code aurioTouch.
I implemented a song in that example and produced the song's beats from it. On the iPhone we can only access the sound for beats through the mic, so aurioTouch uses the same approach for beat detection.