Why Does MIDI Offer 127 Notes - midi

Are the 127 note values in MIDI musically significant (a certain number of octaves or something)? Or was it set at 127 due to the binary file format, i.e. for the purposes of computing?

In the MIDI protocol there are status bytes (think commands, such as note-on or note-off) and there are data bytes (think parameters, such as pitch value and velocity). The way to determine the difference between them is by the first bit. If that first bit is 1, then it is a status byte. If the first bit is 0, then it is a data byte. This leaves only 7 bits available for the rest of the status or data byte value.
So to answer your question in short, this has more to do with the protocol specification, but it just so happens to line up nicely with a good number of available pitch values.
Now, these pitch values do not correspond to specific pitches. Yes it is true that typically a pitch value of 60 will give you C4, or middle C. Most synths work this way, but certainly not all. It isn't even a requirement that the synth uses the pitch value for pitches! MIDI doesn't care... it is just a protocol. You may be wondering how alternate tunings work... they work just fine. It is up to the synthesizer to produce the correct pitches for these alternate tunings. MIDI simply provides for a selection of 128 different values to be sent.
Also, if you are wondering why it is so important for that first bit to signify what the data is... There are system realtime messages that can be interjected in the middle of some other command. These are things like the timing clock which is often used to sync up LFOs among other things.
You can read more about the types of MIDI messages here: http://www.midi.org/techspecs/midimessages.php
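The top-bit rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_midi_bytes` is made up for this example, and it treats the bytes 0xF8 to 0xFF as the single-byte System Real-Time messages that may appear in the middle of another message.

```python
def classify_midi_bytes(stream):
    """Label each byte as 'status', 'realtime', or 'data' using its top bit.

    Hypothetical helper for illustration; not part of any MIDI library.
    """
    labels = []
    for byte in stream:
        if byte >= 0xF8:
            # 0xF8-0xFF: single-byte System Real-Time messages (0xF8 is Timing Clock)
            labels.append("realtime")
        elif byte & 0x80:
            # Top bit set: status byte (a command such as Note On)
            labels.append("status")
        else:
            # Top bit clear: data byte, carrying a 7-bit value 0..127
            labels.append("data")
    return labels


# A Note On (0x90) whose two data bytes are interrupted by a timing clock (0xF8)
print(classify_midi_bytes([0x90, 0x3C, 0xF8, 0x40]))
```

Because a data byte can never have its top bit set, the receiver can pick the clock byte out of the stream without losing track of the Note On it interrupted.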

127 = 2^7 - 1
It's the maximum positive value of an 8-bit signed integer, and so is a meaningful limit in file formats--it's the highest value you can store in a byte (on most systems) without making it unsigned.

I think what you are missing is that MIDI was created in the early 1980s, not to run on personal computers, but to run on musical instruments with extremely limited processing and storage capabilities. Storing 127 values seemed GIANT back then, especially when the largest keyboard typically has only 88 keys, and most electronic instruments only had 48. If you think MIDI is doing something in a strange way, that likely stems from its Jurassic heritage.
"Yes it is true that typically a pitch value of 60 will give you C4, or middle C. Most synths work this way, but certainly not all."
Yes ... there has always been a disagreement about where middle C is in MIDI. On Yamaha keyboards note 60 is labeled C3, on Roland keyboards it is labeled C4. Yamaha did it one way and Roland did it another.
"Now, these pitch values do not correspond to specific pitches."
Not originally. However, in the "General MIDI" standard, A = 440, which is standard tuning. General MIDI also describes which patch is a piano, which is a guitar, and so on, so that MIDI files become portable across multitimbral sound sources.

Simple efficiency.
As a serial protocol MIDI was designed around simple serial chips of the time which would take 8 data bits in and transmit them as a stream out of one separate serial data pin at a prescribed rate. In the MIDI world this was 31,250 bits per second (31.25 kbaud). It added stop and start bits so all data could travel over one wire.
It was designed to be cheap and simple and the simplicity was extended into the data format.
The most significant bit of the 8 data bits was used to signal if the data byte was a command or data. So-
To send a middle C Note On on channel 1 at a velocity of 56, a command byte is sent first.
The command for Note On was the upper 4 bits of that command byte: 1001. Notice the 1 in the most significant bit. This was followed by the channel ID for channel 1, 0000 (computers preferring to start counting from 0):
10010000 or 128 + 16 = 144
This was followed by the actual note data:
60 for middle C or 00111100
and then the velocity data, again specified in the range 0 - 127 with a 0 MSB:
56 in our case
00111000
So what would go down the wire (ignoring start, stop & sync bits) was
144, 60, 56
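As a rough sketch, the three bytes of this Note On message can be assembled as follows. The function name `note_on` is invented for this example, and it uses note number 60, the value usually mapped to middle C (as mentioned earlier in this thread).

```python
def note_on(channel, note, velocity):
    """Build the three bytes of a MIDI Note On message.

    Hypothetical helper: channel is 1..16 as shown on a front panel,
    note and velocity are 7-bit values (0..127).
    """
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1..16")
    if not 0 <= note <= 127 or not 0 <= velocity <= 127:
        raise ValueError("note and velocity must fit in 7 bits (0..127)")
    # Upper nibble 1001 = Note On, lower nibble = channel counted from 0
    status = 0x90 | (channel - 1)
    return bytes([status, note, velocity])


message = note_on(1, 60, 56)
print(list(message))                          # decimal values of the three bytes
print([format(b, "08b") for b in message])    # the same bytes as bit patterns
```

Only the first byte has its most significant bit set, which is what lets a receiver tell the command from its two parameters.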
For the almost brain dead microcomputers of the time in electronic keyboards the ability to separate command from data by simply looking at the first bit was a godsend.
As has been stated, 128 note values cover pretty much any Western keyboard you care to mention. So it made perfectly logical sense, and the protocol's survival long after many serial protocols have disappeared into obscurity is a great compliment to Dave Smith (http://en.wikipedia.org/wiki/Dave_Smith_(engineer)) of Sequential Circuits, who started the discussions with other manufacturers to set all this in place.
Modern music and composition would be considerably different without him and them.
Enjoy!

127 is enough to cover all piano keys

0 ~ 127 fits nicely for ADC conversions.
Many MIDI hardware devices rely on performing Analog to Digital conversions (ADC). Considering MIDI is a real time communication protocol, when performing an ADC conversion using successive-approximation (a commonly used algorithm), a good rule of thumb is to use 10 bit resolution for fast computation. This will yield values in the 0 ~ 1023 range, which can be converted to MIDI range by dividing by 8 (integer division, giving 0 ~ 127).
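A minimal sketch of that scaling step, assuming a raw reading in the 0 to 1023 range (the function name `adc_to_midi` and the `adc_bits` parameter are made up for this example). Dividing by 8 is the same as dropping the three lowest bits:

```python
def adc_to_midi(reading, adc_bits=10):
    """Scale a raw ADC reading down to a 7-bit MIDI value (0..127).

    Hypothetical helper; assumes an unsigned reading of adc_bits bits.
    """
    if adc_bits < 7:
        raise ValueError("need at least 7 bits of ADC resolution")
    if not 0 <= reading < (1 << adc_bits):
        raise ValueError("reading out of range for this ADC resolution")
    # For a 10-bit ADC this shifts right by 3, i.e. integer division by 8
    return reading >> (adc_bits - 7)


print(adc_to_midi(0), adc_to_midi(512), adc_to_midi(1023))
```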

Related

Compression, using invalid data to represent chess board states

I am making a chess program and I have been trying to optimize the game board state to use the least amount of data storage possible. I've realized there is a set of data that makes sense (can be decompressed) with my compression algorithm but is also invalid, for example a rook that is eligible for a castling move but is not in its starting position, or a pawn that is not in the fourth/fifth row but is eligible for an En passant move. I am trying to figure out an effective way to have those invalid positions encode a string of 0's between 1 and 61 bits long, to represent empty board spaces.
So given the input either 10110 or 11110 (input might be irrelevant to this problem?) what is a good way to represent between 1 and 61 bits of 0's? Any bits of your choosing can follow so long as it, plus the input, is shorter than the equivalent number of zeros it would be taking the place of.
This would be used optionally in place of the string of zeros, so for example if there was three 0's it would make more sense to encode it as 000 rather than 10110(your bits here). But if it was say, 16 zeros, then it would be potentially cheaper to encode it as 10110(your bits here). At compression time the decision would be made which to use, and by nature of it at decompression time both would be decompressed to mean the same thing.
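One possible approach, offered only as a sketch and not as an answer from the original thread: follow the 5-bit escape prefix with a fixed 6-bit run length (6 bits hold 0 to 63, enough for 1 to 61), and fall back to literal zeros whenever the escaped form is not strictly shorter. The names `PREFIX`, `COUNT_BITS` and `encode_zero_run` are invented for this example.

```python
PREFIX = "10110"   # one of the "invalid state" bit patterns from the question
COUNT_BITS = 6     # assumption: fixed-width count, 6 bits cover run lengths 1..61


def encode_zero_run(run_length):
    """Return the cheaper of literal zeros or PREFIX + 6-bit count, as a bit string."""
    if not 1 <= run_length <= 61:
        raise ValueError("run length must be 1..61")
    literal = "0" * run_length
    escaped = PREFIX + format(run_length, "0{}b".format(COUNT_BITS))
    # The escaped form is always 11 bits, so it only pays off for runs of 12 or more
    return escaped if len(escaped) < len(literal) else literal


print(encode_zero_run(3))    # short run: stays literal
print(encode_zero_run(16))   # long run: prefix plus count
```

With a fixed 11-bit escaped form the break-even point is a run of 12 zeros; a variable-length count could lower that threshold at the cost of a more complex decoder.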

How much faster is sending 16 bit vs 32 bit over BLE?

I am working on a project where I am sending information over BLE from a phone to a Raspberry Pi Zero. I can fit all the information I need into 16 bit messages, however, down the line I may need more bits, though I probably won’t. Would I be better off sending only 16 bit packets than 32 bit? Is it that much faster to send and parse 16 bits for a RPi Zero over BLE? I am only entertaining the idea of 32 bits because if I do need more information in the future, updating the code would be much easier.
The packets contain position data of the phone and will be sent every 0.1 seconds. I am using Bleno on the Pi to receive data.
Dude, those two extra bytes won't kill your energy budget. It's wise to keep reserved space for future use. It enables backwards compatibility and ease of future development.
There's not really any difference in the length of the on-air packet transmission due to the big overhead of BLE, and you won't experience any difference due to the nature of connection intervals. We're talking 16 bits / (10^6 bits/s) = 16 µs in 1 Mbps mode and 8 µs in 2 Mbps mode.
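The arithmetic above is easy to reproduce. This sketch only counts raw payload bits at the PHY bit rate and ignores all BLE packet overhead; the function name `airtime_us` is made up for this example.

```python
def airtime_us(payload_bits, phy_bits_per_second):
    """Time in microseconds to put payload_bits on the air at the given bit rate."""
    return payload_bits * 1_000_000 / phy_bits_per_second


extra_1m = airtime_us(32, 1_000_000) - airtime_us(16, 1_000_000)
extra_2m = airtime_us(32, 2_000_000) - airtime_us(16, 2_000_000)
print(extra_1m, extra_2m)  # extra microseconds for the additional 16 bits
```

Against a send interval of 0.1 s (100,000 µs), a difference of this size is negligible.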

Binary integral data compression

I need to transmit integral data types over the network but don't want to transfer all 32 (or 64) bits all the time. The data fits into just one byte 99% of the time, so it looks like I need to compress it somehow. For example, the first bit of a byte is 0 if the other 7 bits hold the whole value (0-127); otherwise (if the first bit is 1) I need to shift these 7 bits left and read the second byte to repeat the same process.
Is there some common way to do this? I don't want to reinvent a wheel...
Thank you.
The scheme you describe (which is essentially a base-128 encoding: each byte is a 7-bit base-128 "digit" and a single bit flag to indicate whether or not it is the final digit) is a common way of doing this.
For example, see:
the section on "LEB128" in the DWARF spec (§7.6);
"Base 128 Varints" in Google's protocol buffers;
"Variable Width Integers" in the LLVM bitcode format (various different widths are used in various different places there).
Just about any data compression algorithm would be able to compress that kind of data stream very well. Use whatever compression libraries your language provides.
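The base-128 scheme discussed in this thread can be sketched as follows. This example uses the unsigned LEB128 byte order (least significant 7-bit group first), which differs slightly from the "shift left and read the next byte" order in the question; the function names are made up for illustration.

```python
def encode_varint(n):
    """Encode a non-negative integer as base-128 bytes, low group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported in this sketch")
    out = bytearray()
    while True:
        low = n & 0x7F          # next 7-bit "digit"
        n >>= 7
        if n:
            out.append(low | 0x80)   # top bit set: more bytes follow
        else:
            out.append(low)          # top bit clear: final byte
            return bytes(out)


def decode_varint(data):
    """Decode one varint from the start of data; return (value, bytes_consumed)."""
    value = 0
    shift = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, index + 1
        shift += 7
    raise ValueError("input ended before the final byte of the varint")


print(encode_varint(100).hex(), encode_varint(300).hex())
print(decode_varint(encode_varint(300)))
```

Values up to 127 take a single byte, which matches the "fits into one byte 99% of the time" case from the question.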

Transferring data using ultrasound

Yamaha InfoSound and the ShopKick application use technologies that transfer data using ultrasound, that is, by playing an inaudible signal (>18 kHz) that can be picked up by modern mobile phones (iOS, Android).
What approach do such technologies use? What kind of modulation do they use?
I see several problems with this approach. First, 18 kHz is not inaudible. Many people cannot hear it, especially as they age, but I know I certainly can (I do regular hearing tests, work-related). Also, most phones have different low-pass filters on their A/D converters, and many devices, especially older Android ones (I've personally seen that happen), filter out everything above 16 kHz or so. Your app therefore is not guaranteed to work on any hardware. The iPhone should probably be able to do it.
In terms of modulation, it could be anything really, but I would definitely rule out AM. Sound has next to zero robustness when it comes to volume. If I were to implement something like that, I would go with FSK. I would think that PSK would fail due to acoustic reflections and such. The difficulty is that you're working with non-robust energy transfer within a very narrow bandwidth. I certainly do not doubt that it can be achieved, but I don't see something like this proving reliable. Just IMHO, that is.
Update: Now that I think about it, a plain on-off would work with a single tone if you're not transferring any data, just some short signals.
Can't say for Yamaha InfoSound and ShopKick, but what we used in our project was a variation of frequency modulation: the frequency of the carrier is modulated by a digital binary signal, where 0 and 1 correspond to 17 kHz and 18 kHz respectively. As for demodulator, we tried heterodyne. More details you could find here: http://rnd.azoft.com/mobile-app-transering-data-using-ultrasound/
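A minimal sketch of the transmit side of such a scheme, mapping 0 and 1 to 17 kHz and 18 kHz tones. The sample rate, symbol length and the name `fsk_modulate` are assumptions chosen for this example and are not taken from the linked project.

```python
import math

SAMPLE_RATE = 44100                        # assumed audio sample rate in Hz
FREQ_FOR_BIT = {0: 17000.0, 1: 18000.0}    # tone mapping described above
SAMPLES_PER_BIT = 441                      # assumed symbol length (10 ms)


def fsk_modulate(bits):
    """Return float samples in [-1, 1] that encode the bits as two tones.

    The phase accumulator carries over between symbols, so the waveform
    has no jump where the frequency changes.
    """
    samples = []
    phase = 0.0
    for bit in bits:
        step = 2.0 * math.pi * FREQ_FOR_BIT[bit] / SAMPLE_RATE
        for _ in range(SAMPLES_PER_BIT):
            samples.append(math.sin(phase))
            phase += step
    return samples


waveform = fsk_modulate([0, 1, 1, 0])
print(len(waveform))  # number of samples to hand to an audio output
```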
There's nothing special in being ultrasound, the principle is the same as data transmission through a modem, so any digital modulation is, in principle, feasible. You only have a specific frequency band (above 18 kHz) and some practical requirements (the medium is very unreliable, I guess) that suggest using a simple, robust scheme with a low bit rate.
I don't know how they do it but this is how I do it:
If it is a string then make sure it's not a long one (the longer it is, the higher the error probability). Let's assume we're working with the vital part of the ASCII code, namely up to character number 127, then all you need is 7 bits per character. Transform this character into bits and modulate those bits using QFSK (there are several modulations to choose from, frequency shift based ones have turned out to be the most robust I've tried from the conventional ones... I've created my own modulation scheme for this use case). Select the carrier frequencies as 18.5, 19, 19.5, and 20 kHz (if you want to be mathematically strict in your design, select frequency values that assure you both orthogonality and phase continuity at symbol transitions; if you can't, a good workaround to avoid abrupt symbol transitions is to multiply your symbols by a window of the same size, e.g. a Gaussian or Bartlett). In my experience you can move these values in the range from 17.5 to 20.5 kHz (if you go lower it will start to bother people using your app, if you go higher the average microphone's frequency response will attenuate your transmission and induce unwanted errors).
On the receiver side implement a correlation or matched filter receiver (an FFT receiver works as well, especially a zero padded one, but it might be a little bit slower; I wouldn't recommend Goertzel because frequency shift due to Doppler effect or speaker-microphone non-linearities could affect your reception). Once you have received the bit stream, make characters with them and you will recover your message.
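A correlation receiver for the four tones can be sketched roughly like this. It assumes a 48 kHz sample rate, 10 ms symbols (so each tone completes a whole number of cycles per symbol) and perfect symbol alignment, none of which comes from the answer; all names are invented for the example, and `synth_symbol` exists only to produce a test signal.

```python
import math

SAMPLE_RATE = 48000                               # assumed sample rate in Hz
TONES_HZ = [18500.0, 19000.0, 19500.0, 20000.0]   # carriers named in the answer
SAMPLES_PER_SYMBOL = 480                          # assumed: 10 ms per symbol


def tone_energy(block, freq):
    """Correlate a block with sine and cosine at freq; return squared magnitude."""
    i_sum = 0.0
    q_sum = 0.0
    for n, x in enumerate(block):
        angle = 2.0 * math.pi * freq * n / SAMPLE_RATE
        i_sum += x * math.cos(angle)
        q_sum += x * math.sin(angle)
    # Using both components makes the result independent of the tone's phase
    return i_sum * i_sum + q_sum * q_sum


def demodulate(samples):
    """Return the index (0..3) of the strongest tone in each full symbol block."""
    symbols = []
    for start in range(0, len(samples) - SAMPLES_PER_SYMBOL + 1, SAMPLES_PER_SYMBOL):
        block = samples[start:start + SAMPLES_PER_SYMBOL]
        energies = [tone_energy(block, f) for f in TONES_HZ]
        symbols.append(energies.index(max(energies)))
    return symbols


def synth_symbol(index, phase=0.0):
    """Test helper: one clean symbol of the chosen tone with an arbitrary phase."""
    f = TONES_HZ[index]
    return [math.sin(2.0 * math.pi * f * n / SAMPLE_RATE + phase)
            for n in range(SAMPLES_PER_SYMBOL)]


signal = []
for s in [2, 0, 3, 1]:
    signal += synth_symbol(s, phase=0.7)
print(demodulate(signal))
```

A real receiver also has to find the symbol boundaries and cope with noise and echoes, which this sketch does not attempt.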
If you face too many broadcasting errors, try selecting a higher amount of samples per symbol or band-pass filtering each frequency value before giving them to the demodulator, using an error correction code such as BCH or Reed Solomon is sometimes the only way to assure an error free communication.
One topic everybody always forgets to talk about is synchronization (knowing on the receiver side when the transmission has begun). You have to be creative here and run a lot of tests with a lot of phones before you can derive an actual detection threshold that works on all of them. Notice that this might also be distance dependent.
If you are unfamiliar with these subjects I would recommend a couple of great books:
Digital Modulation Techniques by Fuqin Xiong
Digital Communications: Fundamentals and Applications by Bernard Sklar
Digital Communications by John G. Proakis
You might have luck with a library I created for sound-based modems, libquiet. It gives you a handful of profiles to work from, including a slow "Ultrasonic whisper" profile with spectral content above 19kHz. The library is written in C but would require some work to interface with iOS.

Variable-byte encoding clarification

I am very new to the world of byte encoding so please excuse me (and by all means, correct me) if I am using/expressing simple concepts in the wrong way.
I am trying to understand variable-byte encoding. I have read the Wikipedia article (http://en.wikipedia.org/wiki/Variable-width_encoding) as well as a book chapter from an Information Retrieval textbook. I think I understand how to encode a decimal integer. For example, if I wanted to provide variable-byte encoding for the integer 60, I would have the following result:
1 0 1 1 1 1 0 0
(please let me know if the above is incorrect). If I understand the scheme, then I'm not completely sure how the information is compressed. Is it because usually we would use 32 bits to represent an integer, so that representing 60 would result in 1 1 1 1 0 0 preceded by 26 zeros, thus wasting that space as opposed to representing it with just 8 bits instead?
Thank you in advance for the clarifications.
The way you do it is by reserving one of the bits to mean "I'm not done with the value." Usually, that's the most significant bit.
When you read a byte, you process the lower 7 bits. If the most significant bit is 1, then you know there's one more byte to read, and you repeat the process, adding the next 7 bits to the current 7 bits.
The MIDI format uses that exact encoding to represent lengths of MIDI events, in the following manner:
1. ExpectedValue = 0
2. byte = ReadFromFile
3. ExpectedValue = ExpectedValue + (byte AND 0x7f)
4. if byte > 127 then
   ExpectedValue = ExpectedValue SHL 7
   Goto 2
5. Done
For example, the value 0x80 would be represented using the bytes 0x81 0x00. You can try running the algorithm on those two bytes, and you see you'll get the right value.
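The pseudocode above can be sketched in Python as a loop. The function name `read_vlq` and its `(value, next_position)` return shape are choices made for this example, not part of any MIDI library.

```python
def read_vlq(data, pos=0):
    """Read one variable-length quantity from data starting at pos.

    Returns (value, position_after_the_quantity). Each byte contributes
    its low 7 bits; a set top bit means another byte follows.
    """
    value = 0
    while True:
        if pos >= len(data):
            raise ValueError("data ended in the middle of a variable-length quantity")
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos


print(read_vlq(bytes([0x81, 0x00])))   # the two-byte example from the text
print(read_vlq(bytes([0x3C])))         # a value below 128 fits in one byte
```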
UTF-8 works similarly, but it uses a slightly more complex scheme to tell you how many bytes you should be expecting. This allows for some error detection, since you can easily tell if the bytes you're getting match the length claimed. Wikipedia describes the structure quite well.
You hit the nail on the head.
There are many encoding schemes, such as the Elias gamma and Elias delta codes. These are bit-level codes, as opposed to the byte-level code you used, and are useful when you have a strong skew towards small numbers (which can often be achieved by encoding deltas instead of absolute values).
Bit-level encoding schemes are much more difficult to implement than byte-level schemes and the additional CPU burden may outweigh the time saved by having less data to read, though most modern CPUs have "highest-bit" and "lowest-bit" instructions that dramatically improve the performance of bit-level codecs. As CPU speeds continue to outpace RAM speeds, bit-level schemes will become more attractive, though the simplicity of byte-level codecs is a big factor too.
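To give a feel for a bit-level code, here is a small sketch of the Elias gamma code for positive integers: write the number in binary and put one zero in front for every bit after the first. The function name `elias_gamma` is made up, and the result is returned as a string of '0' and '1' characters purely for readability.

```python
def elias_gamma(n):
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma is defined for integers >= 1")
    binary = format(n, "b")                 # e.g. 5 -> "101"
    # len(binary) - 1 leading zeros tell the decoder how many bits follow the first 1
    return "0" * (len(binary) - 1) + binary


for value in (1, 2, 5, 60):
    print(value, elias_gamma(value))
```

Small numbers get very short codes (1 is a single bit), which is why such codes suit data skewed towards small values.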
Yes, you are right, you save space by encoding using one byte instead of 4.
Generally, you will save memory if the values you are encoding are much smaller than the maximum value that would have fit in your original fixed-width encoding.
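The encoding of 60 from the question can be checked with a short sketch. It assumes the convention the question's bit pattern suggests, where the top bit is set on the last byte of each number and clear on any earlier bytes (the opposite flag polarity to the MIDI scheme above). The function name `vb_encode` is chosen for this example.

```python
def vb_encode(n):
    """Variable-byte encode a non-negative integer, top bit set on the last byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported in this sketch")
    digits = [n % 128]
    n //= 128
    while n > 0:
        digits.insert(0, n % 128)   # more significant 7-bit groups go in front
        n //= 128
    digits[-1] += 128               # mark the final byte
    return bytes(digits)


for value in (60, 824):
    encoded = vb_encode(value)
    print(value, " ".join(format(b, "08b") for b in encoded), len(encoded), "byte(s)")
```

Compared with a fixed 4-byte integer, 60 takes one byte and 824 takes two, which is where the saving comes from.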