Check microphone for silence - unity3d

While recording the user's voice, I want to know when they stop talking so I can end the recording and send the audio file to the Google speech recognition API.
I found the thread below and tried to use its solution, but I always get the same value from the average of the spectrum data, which is 5.004574E-08:
Unity - Microphone check if silent
This is the code I am using to get the GetSpectrumData value:
public void StartRecordingSpeech()
{
    // If there is a microphone
    if (micConnected)
    {
        if (!Microphone.IsRecording(null))
        {
            goAudioSource.clip = Microphone.Start(null, true, 10, 44100); // Currently set for a 10 second clip max
            goAudioSource.Play();
            StartCoroutine(StartRecordingSpeechCo());
        }
    }
    else
    {
        Debug.LogError("No microphone is available");
    }
}

IEnumerator StartRecordingSpeechCo()
{
    while (Microphone.IsRecording(null))
    {
        float[] clipSampleData = new float[128];
        goAudioSource.GetSpectrumData(clipSampleData, 0, FFTWindow.Rectangular);
        Debug.Log(clipSampleData.Average());
        yield return null;
    }
}
PS: I am able to record the user's voice, save it, and get the right response from the voice recognition API.

The following method is what worked for me. It detects the volume of the microphone and turns it into decibels. It does not need to play the recorded audio or anything. (Credit goes to this old thread on Unity Answers: https://forum.unity.com/threads/check-current-microphone-input-volume.133501/.)
public float LevelMax()
{
    float levelMax = 0;
    float[] waveData = new float[_sampleWindow];
    int micPosition = Microphone.GetPosition(null) - (_sampleWindow + 1); // null means the first microphone
    if (micPosition < 0) return 0;
    goAudioSource.clip.GetData(waveData, micPosition);
    // Getting a peak on the last 128 samples
    for (int i = 0; i < _sampleWindow; i++)
    {
        float wavePeak = waveData[i] * waveData[i];
        if (levelMax < wavePeak)
        {
            levelMax = wavePeak;
        }
    }
    float db = 20 * Mathf.Log10(Mathf.Abs(levelMax));
    return db;
}
In my case, if the value is bigger than -40, the user is talking! If it's 0 or bigger, there is a loud noise; other than that, it's silence!
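To tie this back to the original question, here is a minimal sketch of how LevelMax() could be polled to end the recording after a stretch of silence. The 1-second window and the -40 dB threshold are assumptions to tune, and the hand-off to the speech API is only indicated by a comment:
IEnumerator DetectSilenceCo()
{
    float silenceTimer = 0f;
    while (Microphone.IsRecording(null))
    {
        float db = LevelMax();
        if (db > -40f)
        {
            silenceTimer = 0f;              // user is talking, reset the timer
        }
        else
        {
            silenceTimer += Time.deltaTime; // accumulate silence time
        }
        if (silenceTimer >= 1f)             // assumed: ~1 second of silence ends the recording
        {
            Microphone.End(null);
            // hypothetical: hand goAudioSource.clip to the speech recognition API here
            yield break;
        }
        yield return null;
    }
}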

If you are interested in the volume then GetSpectrumData is actually not really what you want. It is used for frequency analysis and returns - as the name says - a frequency spectrum, i.e. how loud each frequency is within a given frequency range.
What you rather want to use is GetOutputData, which afaik returns an array of amplitudes from -1 to 1. So you have to square all values, take the average, and take the square root of that result (the RMS) (source):
float[] clipSampleData = new float[128];
goAudioSource.GetOutputData(clipSampleData, 0);
Debug.Log(Mathf.Sqrt(clipSampleData.Select(f => f*f).Average()));
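If you still want a decibel-style threshold like in the answer above, the RMS can be converted with 20 * log10. A rough sketch (note that Select and Average require using System.Linq; the 0.1 reference level and -40 dB threshold are assumptions):
// Rough sketch: RMS of the output samples converted to dB for a simple silence check.
// Requires: using System.Linq; and using UnityEngine;
float[] samples = new float[128];
goAudioSource.GetOutputData(samples, 0);
float rms = Mathf.Sqrt(samples.Select(f => f * f).Average());
float db = 20f * Mathf.Log10(rms / 0.1f); // 0.1f is an assumed reference level
bool isTalking = db > -40f;               // assumed threshold, tune per microphone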

Related

Web audio playback contains clicks

I am trying to build a MIDI player using the Web Audio API. I used Tone.js to parse the MIDI file into JSON. I am using MP3 files to play notes. The following are the relevant parts of the code:
//create audio samples
static async setupSample(audioContext, filepath) {
    const response = await fetch(filepath);
    const arrayBuffer = await response.arrayBuffer();
    const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);
    return audioBuffer;
}
//play a single sample
static playSample(audioContext, audioBuffer, time) {
    const sampleSource = new AudioBufferSourceNode(audioContext, {
        buffer: audioBuffer,
        playbackRate: 1,
    });
    sampleSource.connect(audioContext.destination);
    sampleSource.start(time);
    return sampleSource;
}
Scheduling samples:
async start() {
    this.startTime = this.audioCtx.currentTime;
    this.play();
}
play() {
    let nextNote = this.notes[this.noteIndex];
    //schedule samples
    while ((nextNote.time + this.startTime) - this.audioCtx.currentTime <= 0.250) {
        let s = Audio.playSample(this.audioCtx, this.samples[nextNote.midi], this.startTime + nextNote.time);
        s.stop(this.startTime + nextNote.time + nextNote.duration);
        this.noteIndex++;
        if (this.noteIndex == this.notes.length) {
            break;
        }
        nextNote = this.notes[this.noteIndex];
    }
    if (this.noteIndex == this.notes.length) {
        return;
    }
    requestAnimationFrame(() => {
        this.play();
    });
}
I am testing the code with a MIDI file which contains a C major scale. I have tested the MIDI file using timidity and it is fine.
The code does play the MIDI file correctly except for a small problem: I hear some clicking sounds during playback. The clicking increases with increasing tempo but does not completely go away even at a tempo as low as 50 bpm. Any ideas what could be going wrong?
Full code can be viewed at: https://test.meedee.in/
Nothing is "wrong". You are observing a phenomenon intrinsic to the physics of audio.
Chopping audio samples arbitrarily like this creates clicks at the transitions. Any instantaneous change in level is heard as a click. To get rid of the clicks, apply an envelope to the sample, blend adjacent notes, or apply a low-pass filter.

AR camera distance measurement

I have a question about AR (Augmented Reality).
I want to know how to show the distance (in centimeters, for example) between the AR camera and a target object, using a smartphone.
Can I do that in Unity? Should I use AR Foundation? And with ARCore? How do I write the code?
I tried finding some related code (below), but it seems to just print the distance between one object and another, nothing about the "AR camera"...
var other : Transform;
if (other) {
    var dist = Vector3.Distance(other.position, transform.position);
    print("Distance to other: " + dist);
}
Thanks again!
Here is how to do it in Unity with AR Foundation 4.1.
This example script prints the depth in meters at the depth texture's center and works with both ARCore and ARKit:
using System;
using System.Collections;
using UnityEngine;
using UnityEngine.Assertions;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;
public class GetDepthOfCenterPixel : MonoBehaviour {
    // assign this field in inspector
    [SerializeField] AROcclusionManager manager = null;

    IEnumerator Start() {
        while (ARSession.state < ARSessionState.SessionInitializing) {
            // manager.descriptor.supportsEnvironmentDepthImage will return a correct value if ARSession.state >= ARSessionState.SessionInitializing
            yield return null;
        }
        if (!manager.descriptor.supportsEnvironmentDepthImage) {
            Debug.LogError("!manager.descriptor.supportsEnvironmentDepthImage");
            yield break;
        }
        while (true) {
            if (manager.TryAcquireEnvironmentDepthCpuImage(out var cpuImage) && cpuImage.valid) {
                using (cpuImage) {
                    Assert.IsTrue(cpuImage.planeCount == 1);
                    var plane = cpuImage.GetPlane(0);
                    var dataLength = plane.data.Length;
                    var pixelStride = plane.pixelStride;
                    var rowStride = plane.rowStride;
                    Assert.AreEqual(0, dataLength % rowStride, "dataLength should be divisible by rowStride without a remainder");
                    Assert.AreEqual(0, rowStride % pixelStride, "rowStride should be divisible by pixelStride without a remainder");
                    var numOfRows = dataLength / rowStride;
                    var centerRowIndex = numOfRows / 2;
                    var centerPixelIndex = rowStride / (pixelStride * 2);
                    var centerPixelData = plane.data.GetSubArray(centerRowIndex * rowStride + centerPixelIndex * pixelStride, pixelStride);
                    var depthInMeters = convertPixelDataToDistanceInMeters(centerPixelData.ToArray(), cpuImage.format);
                    print($"depth texture size: ({cpuImage.width},{cpuImage.height}), pixelStride: {pixelStride}, rowStride: {rowStride}, pixel pos: ({centerPixelIndex}, {centerRowIndex}), depthInMeters of the center pixel: {depthInMeters}");
                }
            }
            yield return null;
        }
    }

    float convertPixelDataToDistanceInMeters(byte[] data, XRCpuImage.Format format) {
        switch (format) {
            case XRCpuImage.Format.DepthUint16:
                return BitConverter.ToUInt16(data, 0) / 1000f;
            case XRCpuImage.Format.DepthFloat32:
                return BitConverter.ToSingle(data, 0);
            default:
                throw new Exception($"Format not supported: {format}");
        }
    }
}
I'm working on AR depth images as well, and the basic idea is:
Acquire an image using the API; normally it's in the Depth16 format.
Split the image into short buffers, as Depth16 means each pixel is 16 bits.
Get the distance value, which is stored in the lower 13 bits of each short; you can do this with (shortValue & 0x1FFF), and then you have the distance for each pixel, normally in millimeters.
By doing this for all the pixels, you can create a depth image and store it as JPG or another format. Here is sample code using AR Engine to get the distance:
try (Image depthImage = arFrame.acquireDepthImage()) {
    int imwidth = depthImage.getWidth();
    int imheight = depthImage.getHeight();
    Image.Plane plane = depthImage.getPlanes()[0];
    ShortBuffer shortDepthBuffer = plane.getBuffer().asShortBuffer();
    File sdCardFile = Environment.getExternalStorageDirectory();
    Log.i(TAG, "The storage path is " + sdCardFile);
    File file = new File(sdCardFile, "RawdepthImage.jpg");
    Bitmap disBitmap = Bitmap.createBitmap(imwidth, imheight, Bitmap.Config.RGB_565);
    for (int i = 0; i < imheight; i++) {
        for (int j = 0; j < imwidth; j++) {
            int index = (i * imwidth + j);
            shortDepthBuffer.position(index);
            short depthSample = shortDepthBuffer.get();
            short depthRange = (short) (depthSample & 0x1FFF);
            // If you only want the distance value, here it is
            byte value = (byte) depthRange;
            disBitmap.setPixel(j, i, Color.rgb(value, value, value));
        }
    }
    // I rotate the image for a better view
    Matrix matrix = new Matrix();
    matrix.setRotate(90);
    Bitmap rotatedBitmap = Bitmap.createBitmap(disBitmap, 0, 0, imwidth, imheight, matrix, true);
    try {
        FileOutputStream out = new FileOutputStream(file);
        rotatedBitmap.compress(Bitmap.CompressFormat.JPEG, 90, out);
        out.flush();
        out.close();
        MainActivity.num++;
    } catch (Exception e) {
        e.printStackTrace();
    }
} catch (Exception e) {
    e.printStackTrace();
}
While the answers are great, they may be too complicated and advanced for this question, which is about the distance between the ARCamera and another object, and not about the depth of pixels and their occlusion.
transform.position gives you the position of whatever game object you attach the script to in the hierarchy, so attach the script to the ARCamera object. And obviously, other should be the target object.
Alternatively, you can get references to the two game objects using Inspector variables or GetComponent, as sketched below.
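Put together, a minimal sketch of such a script might look like this (the target field name is just an example, not part of the original answer):
// Minimal sketch: attach this to the AR Camera and assign the target in the Inspector.
using UnityEngine;

public class DistanceToTarget : MonoBehaviour
{
    [SerializeField] Transform target; // the object you want to measure the distance to

    void Update()
    {
        float dist = Vector3.Distance(transform.position, target.position);
        Debug.Log("Distance to target: " + dist + " m");
    }
}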
// raycasting should be done in Update()
RaycastHit info;
Ray ray = new Ray(cam.transform.position, cam.transform.forward);
if (Physics.Raycast(ray, out info, 50f, layerMaskAR)) // 50 meter detection range because of 50f
{
    distanca.text = string.Format("{0}: {1:N2}m", info.collider.name, info.distance);
}
This function does what you need; distanca is a UI Text element, and the layer is assigned to the object/prefab.
int layerMaskAR = 1 << 6; (6 here because the 6th layer is my custom layer "layerMaskAR")
This raycasts only onto objects in that layer; the rest are ignored (if you don't want to ignore anything, remove the layerMask from the raycast and it will print the name of anything with a collider).
Totally doable by this line of code
Vector3.Distance(gameObject.transform.position, Camera.main.transform.position)

How to offset note scheduling for interactive recording of notes via user controls

In the code below I have a note scheduler that increments a variable named current16thNote up to 16 and then loops back around to 1. The ultimate goal of the application is to allow the user to click a drum pad and push the current16thNote value to an array. On each iteration of current16thNote a loop is run over the track arrays looking for the current current16thNote value; if it is found, the sound plays.
//_________________________________________________________General variable declarations
var isPlaying = false,
    tempo = 120.0,       // tempo (in beats per minute)
    current16thNote = 1,
    futureTickTime = 0.0,
    timerID = 0,
    noteLength = 0.05;   // length of "beep" (in seconds)
//_________________________________________________________END General variable declarations
//_________________________________________________________Load sounds
var kick = audioFileLoader("sounds/kick.mp3"),
    snare = audioFileLoader("sounds/snare.mp3"),
    hihat = audioFileLoader("sounds/hihat.mp3"),
    shaker = audioFileLoader("sounds/shaker.mp3");
//_________________________________________________________END Load sounds
//_________________________________________________________Track arrays
var track1 = [],
    track2 = [5, 13],
    track3 = [],
    track4 = [1, 3, 5, 7, 9, 11, 13, 15];
//_________________________________________________________END Track arrays
//_________________________________________________________Future tick
function futureTick() {
    var secondsPerBeat = 60.0 / tempo;
    futureTickTime += 0.25 * secondsPerBeat;
    current16thNote += 1;
    if (current16thNote > 16) {
        current16thNote = 1;
    }
}
//_________________________________________________________END Future tick
function checkIfRecordedAndPlay(trackArr, sound, beatDivisionNumber, time) {
    for (var i = 0; i < trackArr.length; i += 1) {
        if (beatDivisionNumber === trackArr[i]) {
            sound.play(time);
        }
    }
}
//__________________________________________________________Schedule note
function scheduleNote(beatDivisionNumber, time) {
    var osc = audioContext.createOscillator(); //____Metronome
    if (beatDivisionNumber === 1) {
        osc.frequency.value = 800;
    } else {
        osc.frequency.value = 400;
    }
    osc.connect(audioContext.destination);
    osc.start(time);
    osc.stop(time + noteLength); //___________________END Metronome
    checkIfRecordedAndPlay(track1, kick, beatDivisionNumber, time);
    checkIfRecordedAndPlay(track2, snare, beatDivisionNumber, time);
    checkIfRecordedAndPlay(track3, hihat, beatDivisionNumber, time);
    checkIfRecordedAndPlay(track4, shaker, beatDivisionNumber, time);
}
//_________________________________________________________END schedule note
//_________________________________________________________Scheduler
function scheduler() {
    while (futureTickTime < audioContext.currentTime + 0.1) {
        scheduleNote(current16thNote, futureTickTime);
        futureTick();
    }
    timerID = window.requestAnimationFrame(scheduler);
}
//_________________________________________________________END Scheduler
The Problem
In addition to the previous code I have some user interface controls as shown in the following image.
When a user mousedowns on a “drum pad” I want to do two things. The first is to hear the sound immediately, and the second is to push the current16thNote value to the respective array.
If I use the following code to do this a few problems emerge.
$("#kick").on("mousedown", function() {
kick.play(audioContext.currentTime)
track1.push(current16thNote)
})
The first problem is that the sound plays twice. This is because when the note is pushed to the array it is immediately picked up by the next iteration of the note scheduler and plays again. I fixed this by creating a delay with setTimeout to offset the push to the array.
$("#kick").on("mousedown", function() {
kick.play(audioContext.currentTime)
window.setTimeout(function() {
track1.push(note)
}, 500)
})
The second problem is musical.
When a user clicks a drum pad, the value actually recorded is one 16th later than the one the user intends. In other words, if you listen to the metronome and click on the kick drum pad with the intent of landing right on the 1/1 down beat, this won't happen. Instead, when the metronome loops back around it will have been “recorded” one 16th increment later.
This can be remedied by writing code that intentionally offsets the value that is pushed to the array by -1.
I wrote a helper function named pushNote to do this.
$("#kick").on("mousedown", function() {
var note = current16thNote;
kick.play(audioContext.currentTime)
window.setTimeout(function() {
pushNote(track1, note)
}, 500)
})
//________________________________________________Helper
function pushNote(trackArr, note) {
    if (note - 1 === 0) {
        trackArr.push(16);
    } else {
        trackArr.push(note - 1);
    }
}
//________________________________________________END Helper
My question is really a basic one: is there a way to solve this problem without creating these odd “offsets”?
I suspect there is a way to set/write/place the current16thNote increment without having to create offsets in other parts of the program, but I'm hazy on what it could be.
In the world of professional audio recording there isn't one tick per 16th division; you usually have 480 ticks per quarter note. I want to begin exploring writing my apps with this larger resolution, but I want to resolve this "offset" issue before I go down that rabbit hole.

Why is comparing float values so difficult?

I am a newbie on the Unity platform. I have a 2D game that contains 10 boxes vertically following each other in a chain. When a box goes off screen, I change its position to above the box at the top, so the chain repeats infinitely, like a repeating parallax scrolling background.
I check whether a box has gone off screen by comparing its position with a specified float value. I am sharing my code below.
void Update () {
    offSet = currentSquareLine.transform.position;
    currentSquareLine.transform.position = new Vector2 (0f, -2f) + offSet;
    Vector2 vectorOne = currentSquareLine.transform.position;
    Vector2 vectorTwo = new Vector2 (0f, -54f);
    if (vectorOne.y < vectorTwo.y) {
        string name = currentSquareLine.name;
        int squareLineNumber = int.Parse(name.Substring (11));
        if (squareLineNumber < 10) {
            squareLineNumber++;
        } else {
            squareLineNumber = 1;
        }
        GameObject squareLineAbove = GameObject.Find ("Square_Line" + squareLineNumber);
        offSet = (Vector2) squareLineAbove.transform.position + new Vector2(0f, 1.1f);
        currentSquareLine.transform.position = offSet;
    }
}
As you can see, when I compare vectorOne.y and vectorTwo.y, things get ugly. Some boxes lengthen and some boxes shorten the distance between each other, even though I give the exact vector values in the code above.
I've searched for a solution for a week and tried lots of code like Mathf.Approximately and Mathf.Round, but none of it managed to compare the float values properly. If Unity never compares float values the way I expect, I think I need to change my approach.
I am waiting for your godlike advice, thanks!
EDIT
Here is my screen. I have 10 box lines that move downwards vertically.
When Square_Line10 goes off screen, I update its position to above Square_Line1, but the distance between them increases unexpectedly.
Okay, I found a solution that works like a charm.
I need to use an array and check the boxes in two for loops: the first one moves the boxes and the second one checks whether a box went off screen, like below.
public GameObject[] box;
float boundary = -5.5f;
float boxDistance = 1.1f;
float speed = -0.1f;
// Update is called once per frame
void Update () {
    for (int i = 0; i < box.Length; i++) {
        box[i].transform.position = box[i].transform.position + new Vector3(0, speed, 0);
    }
    for (int i = 0; i < box.Length; i++)
    {
        if (box[i].transform.position.y < boundary)
        {
            int topIndex = (i + 1) % box.Length;
            box[i].transform.position = new Vector3(box[i].transform.position.x, box[topIndex].transform.position.y + boxDistance, box[i].transform.position.z);
            break;
        }
    }
}
I attached it to MainCamera.
Try this solution:
bool IsApproximately(float a, float b, float tolerance = 0.01f) {
    return Mathf.Abs(a - b) < tolerance;
}
The reason is that the tolerances used by the built-in comparisons aren't well suited to this use. Pass a lower tolerance value in the function call if you need more precision.
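For example, instead of relying on an exact match of the y positions, the check could be written with a tolerance (the -54f value is just the boundary from the question, used purely for illustration):
// Usage sketch: compare y positions with a tolerance instead of an exact match
float targetY = -54f; // boundary value from the question
if (vectorOne.y < targetY || IsApproximately(vectorOne.y, targetY))
{
    // treat the box as having reached the boundary
}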

Controlling robot makes a shaking movement

I'm trying to control a robot by sending positions at 100 Hz. It makes a shaking movement when I send that many positions. When I send one position that is about 50 mm from its start position, it moves smoothly. When I use my sensor to steer (so it sends every position from 0 to 50 mm), it shakes. I'm probably sending something like X0-X1-X2-X1-X2-X3-X4-X5-X4-X5, and this might be the reason why it shakes. How can I solve this so the robot moves smoothly when I use my mouse to steer it?
The robot asks for positions at 125 Hz.
The IR sensor sends at 100 Hz.
Could the 25 Hz difference be the cause?
Here is my code.
while (true) {
    // If sensor 1 is recording IR light.
    if (listen1.newdata == true)
    {
        coX1 = (int) listen1.get1X();
        coY1 = (int) listen1.get1Y();
        newdata = true;
    } else {
        coX1 = 450;
        coY1 = 300;
    }
    if (listen2.newdata == true)
    {
        coX2 = (int) listen2.get1X();
        coY2 = (int) listen2.get1Y();
        newdata = true;
    } else {
        coY2 = 150;
    }
    // If the sensor gets further than the workspace, it will automatically correct it to these
    // coordinates.
    if (newdata == true)
    {
        if (coX1 < 200 || coX1 > 680)
        {
            coX1 = 450;
        }
        if (coY1 < 200 || coY1 > 680)
        {
            coY1 = 300;
        }
        if (coY2 < 80 || coY2 > 300)
        {
            coY2 = 150;
        }
    }
    // This is the actual command sent to the robot.
    Gcode = String.format("movej(p[0.%d,-0.%d, 0.%d, -0.5121, -3.08, 0.0005])" + "\n", coX1, coY1, coY2);
    // sends message to server
    send(Gcode, out);
    System.out.println(Gcode);
    newdata = false;
}
}

private static void send(String movel, PrintWriter out) {
    try {
        out.println(movel); /* Writes to server */
        // System.out.println("Writing: " + movel);
        // Thread.sleep(250);
    }
    catch (Exception e) {
        System.out.print("Error Connecting to Server\n");
    }
}
}
# Edit
I discovered a way I can do this: via min and max. So basically what I think I have to do is:
* put every individual coordinate in an array (12 coordinates)
* get the min and max out of this array
* output the average of the min and max
Without knowing more about your robot's characteristics and how you can control it, here are some general considerations:
To get smooth motion out of your robot, you should control it in speed with a well-designed PID controller algorithm.
If you can only control it in position, the best you can do is monitor the position and wait for it to be "near enough" to the targeted position before sending the next one.
If you want a more detailed answer, please give more information on the command you send to the robot (movej); I suspect you can do much more than just sending [x,y] coordinates.