Use GPU for NSSpeechSynthesizer and NSSpeechRecognizer on OS X - Swift

I just ran an interesting test: a speech recogniser service that echoes back what I said using NSSpeechSynthesizer.
However, NSSpeechSynthesizer is notorious for being slow and unresponsive, and I wanted to know if anyone has tried optimising this by specifying either a core, a thread or the GPU (using Metal) to process both recognition and synthesis.
I've been reading the following article to better understand how to pipeline values through a Metal buffer:
http://memkite.com/blog/2014/12/30/example-of-sharing-memory-between-gpu-and-cpu-with-swift-and-metal-for-ios8/
The author used Metal to offload the sigmoid function used in ML, which makes complete sense, as vector maths is what GPUs do best.
However, I would like to know if anyone has explored the possibility of sending other types of data, such as float values from a waveform (i.e. rendering the synthesis through the GPU).
In particular, has anyone tried this for NSSpeechRecognizer or NSSpeechSynthesizer?
As it stands now, I have a full 3D scene with 3D HRTF sound, and both recognition and synthesis work, but sometimes there's a noticeable lag, so maybe dedicating a buffer pipeline through the GPU (an MTLDevice) and then back to play the audio might work?

Related

Can the Modelica "Fluid" library handle choked flow?

I'd like to start off by saying that I'm new to StackOverflow and to Modelica.
My goal is to simulate the injector system of a Rotating Detonation Engine. Essentially this is a piping system from a tank to a rocket engine. This system will change depending on the experimental setup, so I chose Modelica (specifically OpenModelica) because of the re-usability of components. The flows encountered will be at high pressures and high flow rates (sustaining a detonation requires this), and choked flow will occur.
My question is this: does the standard "Fluid" library in Modelica allow for choked flow? I understand that a few valves model this, but will the current library be able to capture "choking" in a long rough pipe, or the small end of a converging pipe (basically anywhere choking can happen, despite it not being the design location for a choke)?
If yes, excellent. If not, is there a non-standard library available? Should I be looking at something other than Modelica? I am happy to work on making a new library, but before going through that work I thought I would check to see if anything already existed.
I have read through most of the "Media" library and the basics of the "Fluid" library, and I get the feeling that compressible flow is modeled as a means of increasing accuracy over incompressible flow, but not in a way that actually handles choked flow.
Thank you for your time. I hope everyone is keeping safe!
The pipe model in the Modelica Standard Library does not handle choked flow.
Adding a standard orifice in series with the pipe should help, provided the 'zeta' value is adjusted so that the velocity at the orifice matches the speed of sound in the gas. In other words, the standard library does not provide a ready-made means of modeling choked flow in pipes.
However, I found a very interesting library called FreeFluids (https://github.com/CarlosTrujilloGonzalez/FreeFluidsModelica) which does have a very good model for choked pipes. An example is provided with the library for choked air flow in a 10 m long, 50 mm diameter circular pipe. The model returns correct values for air.
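As a rough sanity check of the orifice workaround mentioned above, the sonic condition for an ideal gas can be computed outside Modelica. Here is a minimal Python sketch, assuming ideal-gas air, a hypothetical 50 mm orifice, example stagnation conditions and a discharge coefficient of 1 (all assumptions to be adjusted to your rig):

    import math

    # Ideal-gas air (assumed): ratio of specific heats and specific gas constant
    gamma = 1.4
    R = 287.0            # J/(kg*K)

    # Hypothetical upstream stagnation conditions and orifice size -- adjust to your setup
    T0 = 300.0           # K
    p0 = 20e5            # Pa
    d = 0.05             # m, orifice diameter
    A = math.pi * d**2 / 4.0

    # Throat (sonic) temperature and the speed of sound the orifice velocity must reach
    T_star = T0 * 2.0 / (gamma + 1.0)
    a_star = math.sqrt(gamma * R * T_star)

    # Choked mass flow rate for an ideal gas, discharge coefficient assumed to be 1
    m_dot = A * p0 * math.sqrt(gamma / (R * T0)) * (2.0 / (gamma + 1.0)) ** ((gamma + 1.0) / (2.0 * (gamma - 1.0)))

    print(f"sonic velocity at the throat: {a_star:.1f} m/s")
    print(f"choked mass flow rate:        {m_dot:.3f} kg/s")

If the velocity reported by the orifice model stays below a_star for your chosen zeta, the orifice is not actually choking and the workaround will not reproduce the flow limit.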

Measure the electricity consumed by a browser to render a webpage

Is there a way to calculate the electricity consumed to load and render a webpage (frontend)? I was thinking of a 'test' built with PhantomJS, for example:
load a web page
scroll to the bottom
and measure how much electricity was needed. I could perhaps extrapolate from CPU cycles, but PhantomJS is headless and rendering in a real browser is certainly different. Perhaps it's impossible to do real measurements... but with an index it may be possible to compare websites.
Do you have other suggestions?
It's pretty much impossible to measure this internally in modern processors (anything more recent than a 286). By internally, I mean by counting cycles. This is because different parts of the processor consume different levels of energy per cycle, depending upon the instruction.
That said, you can make your own measurements. Stick a power meter between the wall outlet and the computer. Here's a procedure:
Measure the baseline energy usage, i.e. nothing running except the OS and the browser, with the browser completely static (i.e. not doing anything). You need to make sure that everything is in steady state (SS), meaning you start your measurements only after several minutes of idle.
Measure the usage while doing the operation you want. Again, you want to avoid any start-up and shutdown work, so make sure you start measuring at least 15 seconds after you start the operation. Stopping isn't an issue, since the browser will execute any termination code after you finish your measurement.
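If your meter reports average power rather than cumulative energy, isolating the operation is just a subtraction and a multiplication; here is a minimal Python sketch with made-up numbers standing in for real meter readings:

    # All numbers are illustrative -- substitute your own meter readings.
    baseline_power_w  = 38.2   # W, OS + static browser, after settling to steady state
    operation_power_w = 41.7   # W, average while the page loads and scrolls
    duration_s        = 120.0  # s, length of the measurement window

    net_power_w = operation_power_w - baseline_power_w   # power attributable to the operation
    energy_j    = net_power_w * duration_s               # energy in joules
    energy_mwh  = energy_j / 3.6                         # 1 mWh = 3.6 J

    print(f"net power: {net_power_w:.2f} W")
    print(f"energy:    {energy_j:.1f} J ({energy_mwh:.1f} mWh)")

Because of the signal-to-noise problem described below, you would repeat this over many runs and average the results.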
Sounds simple, right? Unfortunately, because of the nature of your measurements, there are some gotchas.
Do you recall your physics classes (or EE classes) that talked about signal-to-noise ratios? Well, a scroll-down uses very little energy, so the signal (scrolling) is buried in the noise (normal background processes). This means you have to take a LOT of samples to get anything useful.
Your browser startup energy usage, or anything else that uses a decent amount of processing, is much easier to measure (better signal to noise ratio).
Also, make sure you understand the underlying electronics. For example, real power is V*A (voltage times amperage) only when voltage and current are in phase; otherwise the power factor comes into play. I don't think this will be an issue, since I'm pretty sure they are close to in phase for computers, and any decent power meter accounts for the difference anyway.
I'm guessing you intend to do this for mobile devices. Your measurements will only be roughly the same from processor to processor. This is due to architectural differences from generation to generation, and from manufacturer to manufacturer.
Good luck.

2D multi-robot simulation libraries?

Background
I'm working on a group project to simulate some consensus algorithms used by a group of independent robots to form an arbitrary shape on a 2D plane. The robots are modeled as unit disks, and all run the same algorithm. Basically, each robot can move, wait, or observe its local environment at any moment, but cannot communicate explicitly with any other robot. We'd like to find a simulation or even a 2D graphics library to help us avoid writing too much from scratch.
Question
Can anyone recommend a simulation library meeting the requirements below, which could be used for a multi-robot 2D simulation?
I've never coded a simulation before, so it's possible some of my concerns are readily addressed by many existing libraries. However, the Mason project is the only resource I've found that seems promising so far. Unfortunately, a few of our team members are not very proficient in Java, so I'd like to find something suitable in a different language, if possible.
Requirements
* Language preference (descending order): Python, C++, (maybe) Java
* Open source/FOSS recommendations only
* Options/flags to disable visualization: We plan on running several thousand trials of randomly generated shapes against each algorithm, so for the bulk of trials we don't care about any visual representation, just data. In other words, the simulation logic has to be decoupled from the graphics components.
* Collision detection
* Customizable visual representations: Within a simulation, we'd like to have several views (or toggles for a single view) that present additional information about each robot, like its current state, the area it's currently observing, etc.
For such simple graphics you can surely get away with either PyQt or wxPython.
The simulation itself should be its own python module; the GUI should just load the module, then call its "timestep" function at regular intervals (timer, GUI idle callback, etc); the step function should evolve the robot system by one small time step.
The GUI should just display the simulation state. Avoid mixing everything (display and simulation) in one module, it'll get pretty messy, plus if your simulation engine is a separate module you can then also run it directly from the command line and look at the output file.
It would be pretty easy to write a Python script that reads such an output file and generates commands to represent it graphically in either Excel or PowerPoint using win32com, in which case you don't even need PyQt or wxPython.
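As an illustration of that separation, here is a minimal sketch of what the headless simulation module might look like; the class names and the trivial "drift towards the origin" rule are placeholders of my own, not part of any consensus algorithm:

    # sim.py -- simulation only; no GUI imports in this module.
    import math
    import random

    class Robot:
        def __init__(self, x, y, radius=1.0):
            self.x, self.y, self.radius = x, y, radius
            self.state = "move"            # "move", "wait" or "observe"

    class Simulation:
        def __init__(self, n_robots, seed=0):
            rng = random.Random(seed)
            self.robots = [Robot(rng.uniform(-50, 50), rng.uniform(-50, 50))
                           for _ in range(n_robots)]
            self.time = 0.0

        def step(self, dt=0.1):
            """Advance the whole robot system by one small time step."""
            for r in self.robots:
                if r.state == "move":
                    heading = math.atan2(-r.y, -r.x)   # placeholder rule: drift towards the origin
                    r.x += math.cos(heading) * dt
                    r.y += math.sin(heading) * dt
            self.time += dt

    if __name__ == "__main__":
        # Headless batch run: write positions to a file that a GUI (or Excel) can replay later.
        sim = Simulation(n_robots=20)
        with open("trajectory.csv", "w") as out:
            for _ in range(1000):
                sim.step()
                out.write(",".join(f"{r.x:.3f};{r.y:.3f}" for r in sim.robots) + "\n")

A GUI front end would instead import the module, create a Simulation, and call step() from a timer or idle callback before each redraw.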
For the collision detection, look at pybox2d.

Sound Wave Simulation in a 3D Environment

Hi, I want to do a sound wave simulation that includes wave propagation, absorption and reflection in 3D space.
I did some searching and found this question on Stack Overflow, but it talks about electromagnetic waves, not sound waves.
I know I can reimplement the FDTD method for sound waves, but what about the sources, and do they act like electromagnetic waves? Are there any resources to start with?
Thanks in advance.
Hope this gives you some input...
As far as I know, in EM simulations obstacles (and thus terrain) are not considered at all. With sound you have to consider reflection, diffraction, etc.
There are different standards for calculating the noise originating from different sources (I'll list the European ones, the ones I know of):
Traffic: NMPB (NMPB-Routes-96) is THE standard. All the noise calculations have to be done with that one (at least in my country). Results aren't very good. A "new" algorithm is SonRoad (I think it uses inverse ray tracing)... from my tests it works great.
Trains: Schall 03
Industries: ISO 9613
Here is a list of all the models used in CadnaA (a professional package), so you can google them all: http://www.datakustik.com/en/products/cadnaa/modeling-and-calculation/calculation-standards/
Another professional package is SoundPlan; somewhere on the web there is a free "SoundPlan-ReferenceManual.pdf", 800 pages with the mathematical descriptions of the implemented algorithms... I haven't had any luck with Google today though.
An easy way to do this is to use the SoundPlan software. Multiple sound propagation methods, such as ISO 9613-2, CONCAWE and Nord2000, are implemented. It has basic 3D visualization with sound pressure level contours.
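If you do end up reimplementing FDTD for acoustics as mentioned in the question, the update runs on coupled pressure/velocity fields rather than on Maxwell's curl equations, and a point source is usually just a pressure pulse injected at one cell. Here is a minimal 2D Python/NumPy sketch with rigid (fully reflecting) walls and illustrative constants, purely as a starting point:

    import numpy as np

    # Grid and medium -- illustrative values for air
    nx, ny = 200, 200
    dx  = 0.01                        # m, cell size
    c   = 343.0                       # m/s, speed of sound
    rho = 1.2                         # kg/m^3, air density
    dt  = 0.9 * dx / (c * np.sqrt(2)) # s, time step below the 2D CFL limit

    p  = np.zeros((nx, ny))           # pressure
    vx = np.zeros((nx + 1, ny))       # x particle velocity (staggered grid)
    vy = np.zeros((nx, ny + 1))       # y particle velocity (staggered grid)

    src = (nx // 2, ny // 2)          # point source location

    for n in range(500):
        # Velocity update from the pressure gradient
        vx[1:-1, :] -= dt / (rho * dx) * (p[1:, :] - p[:-1, :])
        vy[:, 1:-1] -= dt / (rho * dx) * (p[:, 1:] - p[:, :-1])
        # Edge velocities stay zero -> rigid, fully reflecting walls

        # Pressure update from the velocity divergence
        p -= dt * rho * c**2 / dx * ((vx[1:, :] - vx[:-1, :]) + (vy[:, 1:] - vy[:, :-1]))

        # Soft source: a short Gaussian pressure pulse at one cell
        p[src] += np.exp(-((n - 40) / 10.0) ** 2)

    print("peak |p| after 500 steps:", float(np.abs(p).max()))

Frequency-dependent wall absorption and open (absorbing) boundaries are where the acoustic case needs the most extra work compared to this toy example.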

Creating music visualizer [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
So how does someone create a music visualizer? I've looked on Google but I haven't really found anything that talks about the actual programming; mostly just links to plug-ins or visualizing applications.
I use iTunes but I realize that I need Xcode to program for that (I'm currently deployed in Iraq and can't download that large of a file). So right now I'm just interested in learning "the theory" behind it, like processing the frequencies and whatever else is required.
As a visualizer plays a song file, it reads the audio data in very short time slices (usually less than 20 milliseconds). The visualizer does a Fourier transform on each slice, extracting the frequency components, and updates the visual display using the frequency information.
How the visual display is updated in response to the frequency info is up to the programmer. Generally, the graphics methods have to be extremely fast and lightweight in order to update the visuals in time with the music (and not bog down the PC). In the early days (and still), visualizers often modified the color palette in Windows directly to achieve some pretty cool effects.
One characteristic of frequency-component-based visualizers is that they don't often seem to respond to the "beats" of music (like percussion hits, for example) very well. More interesting and responsive visualizers can be written that combine the frequency-domain information with an awareness of "spikes" in the audio that often correspond to percussion hits.
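As a concrete sketch of that slice-and-FFT loop, here is a minimal Python/NumPy example; the synthetic test tone stands in for decoded audio samples, and a real visualizer would hand the magnitudes to its drawing code instead of printing them:

    import numpy as np

    sample_rate = 44100
    slice_len   = 1024                 # ~23 ms per slice at 44.1 kHz

    # Synthetic stand-in for a decoded song: a 440 Hz tone plus a little noise
    t = np.arange(sample_rate * 2) / sample_rate
    samples = 0.6 * np.sin(2 * np.pi * 440 * t) + 0.05 * np.random.randn(t.size)

    window = np.hanning(slice_len)
    for start in range(0, samples.size - slice_len, slice_len):
        chunk    = samples[start:start + slice_len] * window
        spectrum = np.abs(np.fft.rfft(chunk))              # magnitude per frequency bin
        freqs    = np.fft.rfftfreq(slice_len, 1.0 / sample_rate)
        dominant = freqs[1 + np.argmax(spectrum[1:])]      # loudest bin, ignoring DC
        if start % (slice_len * 20) == 0:                  # print only a sample of the slices
            print(f"{start / sample_rate:6.3f} s  dominant ~ {dominant:6.1f} Hz")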
For creating BeatHarness (http://www.beatharness.com) I 'simply' used an FFT to get the audio spectrum, then some filtering and edge/onset detectors.
About the Fast Fourier Transform:
http://en.wikipedia.org/wiki/Fast_Fourier_transform
If you're comfortable with math you might want to read Paul Bourke's page:
http://local.wasp.uwa.edu.au/~pbourke/miscellaneous/dft/
(Paul Bourke is a name you want to google anyway, he has a lot of information about topics you either want to know right now or probably in the next 2 years ;))
If you want to read about beat/tempo-detection google for Masataka Goto, he's written some interesting papers about it.
Edit:
His homepage: http://staff.aist.go.jp/m.goto/
Interesting read: http://staff.aist.go.jp/m.goto/PROJ/bts.html
Once you have some values for, for example, bass, midtones, treble and volume (left and right), it's all up to your imagination what to do with them.
Display a picture and multiply its size by the bass, for example: you'll get a picture that zooms in on the beat, etc.
Typically, you take a certain amount of the audio data, run a frequency analysis over it, and use that data to modify some graphic that's being displayed over and over. The obvious way to do the frequency analysis is with an FFT, but simple tone detection can work just as well, with a lower computational overhead.
So, for example, you write a routine that continually draws a series of shapes arranged in a circle. You then use the dominant frequencies to determine the color of the circles, and use the volume to set the size.
There are a variety of ways of processing the audio data, the simplest of which is just to display it as a rapidly changing waveform, and then apply some graphical effect to that. Similarly, things like the volume can be calculated (and passed as a parameter to some graphics routine) without doing a Fast Fourier Transform to get frequencies: just calculate the average amplitude of the signal.
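For the no-FFT route just mentioned, "average amplitude" is usually the RMS level of each block of samples; a tiny sketch (NumPy again, with a made-up signal in place of real audio):

    import numpy as np

    def block_levels(samples, block_len=1024):
        """Yield one loudness value per block of samples -- no FFT involved."""
        for start in range(0, samples.size - block_len, block_len):
            block = samples[start:start + block_len]
            yield float(np.sqrt(np.mean(block ** 2)))   # RMS amplitude

    # Example: map each block's level straight onto a graphics parameter, e.g. a circle radius
    t = np.linspace(0.0, 2.0, 88200)
    samples = np.sin(2 * np.pi * 220 * t) * np.abs(np.sin(2 * np.pi * 1 * t))  # slowly pulsing tone
    radii = [20 + 200 * level for level in block_levels(samples)]
    print(f"{len(radii)} blocks, radius range {min(radii):.1f}..{max(radii):.1f}")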
Converting the data to the frequency domain using an FFT or otherwise allows more sophisticated effects, including things like spectrograms. It's deceptively tricky, though, to detect even quite 'obvious' things like the timing of drum beats or the pitch of notes directly from the FFT output.
Reliable beat-detection and tone-detection are hard problems, especially in real time. I'm no expert, but this page runs through some simple example algorithms and their results.
Devise an algorithm to draw something interesting on the screen given a set of variables
Devise a way to convert an audio stream into a set of variables, analysing things such as beats per minute, different frequency ranges, tone, etc.
Plug the variables into your algorithm and watch it draw.
A simple visualization would be one that changes the colour of the screen every time the music goes over a certain frequency threshold, or just writes the BPM onto the screen, or just displays an oscilloscope.
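A throwaway sketch of that colour-flash idea, assuming pygame is installed and using a synthetic bass pulse in place of a real decoded song (actual audio decoding and proper beat detection are deliberately left out):

    import numpy as np
    import pygame   # assumed available; any simple 2D library would do

    sample_rate, block_len = 44100, 1024
    threshold = 5.0                      # bass-energy threshold, tune for your own audio

    # Synthetic stand-in for music: short 60 Hz bursts, two per second
    t = np.arange(sample_rate * 4) / sample_rate
    samples = np.sin(2 * np.pi * 60 * t) * (np.sin(2 * np.pi * 2 * t) > 0.95)

    pygame.init()
    screen = pygame.display.set_mode((400, 400))
    clock = pygame.time.Clock()

    for start in range(0, samples.size - block_len, block_len):
        pygame.event.pump()                              # keep the window responsive
        block    = samples[start:start + block_len] * np.hanning(block_len)
        spectrum = np.abs(np.fft.rfft(block))
        freqs    = np.fft.rfftfreq(block_len, 1.0 / sample_rate)
        bass     = spectrum[(freqs > 20) & (freqs < 150)].mean()

        screen.fill((255, 40, 40) if bass > threshold else (10, 10, 30))
        pygame.display.flip()
        clock.tick(sample_rate // block_len)             # roughly real-time pacing

    pygame.quit()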
Check out this Wikipedia article.
As suggested by @Pragmaticyankee, Processing is indeed an interesting way to visualize your music. You could load your music into Ableton Live and use an EQ to filter out the high, middle and low frequencies. You could then use an envelope-following VST plugin to convert audio envelopes into MIDI CC messages, such as Gatefish by Mokafix Audio (works on Windows) or PizMidi's midiAudioToCC plugin (works on Mac). You can then send these MIDI CC messages to a light-emitting hardware tool that supports MIDI, for instance Percussa AudioCubes. You could use a cube for every frequency you want to display, and assign a colour to the cube. Have a look at this post:
http://www.percussa.com/2012/08/18/how-do-i-generate-rgb-light-effects-using-audio-signals-featured-question/
We have recently added DirectSound-based audio data input routines to the LightningChart data visualization library. The LightningChart SDK is a set of components for Visual Studio .NET (WPF and WinForms); you may find it useful.
With the AudioInput component, you can get real-time waveform data samples from the sound device. You can play the sound from any source, like Spotify, WinAmp or a CD/DVD player, or use the mic-in connector.
With the SpectrumCalculator component, you can get the power spectrum (FFT conversion), which is handy in many visualizations.
With the LightningChartUltimate component you can visualize data in many different forms, like waveform graphs, bar graphs, heatmaps, spectrograms, 3D spectrograms, 3D lines etc., and they can be combined. All rendering takes place through Direct3D acceleration.
Our own examples in the SDK take a scientific approach and don't really have much of an entertainment aspect, but the SDK can definitely be used for awesome entertainment visualizations too.
We also have a configurable SignalGenerator (sweeps, multi-channel configurations, sine, square, triangle and noise waveforms, WAV real-time streaming) and DirectX audio output components for sending wave data out through speakers or line-out.
[I'm CTO of LightningChart components, doing this stuff just because I like it :-) ]