Synchronization in Manchester coding - encoding

Lately I have been reading about Manchester encoding and I think I'm beginning to understand most of it now, but I still have a few questions that need addressing. Mainly three for the moment:
1) Most articles on the Internet introduce Manchester coding by telling how bad NRZI really was, and one of the disadvantages mentioned is that synchronization becomes a problem when long runs of 1's or 0's are sent. Why is that a problem, given that most places where NRZI is used have separate clock and data lines? As long as the clock signal is there, why should that ever be a problem?
2) Also, is Manchester supposed to work at a fixed frequency? Or can it work like I2C, where the clock frequency can be variable?
3) The good thing mentioned about Manchester encoding is that it does not require a separate clock line: the clock is embedded in the data and can be recovered by the receiver. The frequent transitions in Manchester help with synchronization, and since a transition happens in the middle of every bit, the clock can be recovered from the transitions. But my question is: if there are repeated 1's or 0's, a transition can happen in the middle and at the end as well (see the attached waveform picture, look at the transitions when sending 111). So when a receiver sees a transition, how does it figure out whether it is in the middle or at the end of a bit?
If I'm talking rubbish I would love to be corrected.

Regarding your third question: I'm also brushing up on Manchester, and it appears that to recover a clock you need a differential signal:
Reference: "Data Communications, Computer Networks and Open Systems" by Fred Halsall, page 104, figure 3.8

For the third question:
Whenever a signal is transmitted, a few redundant bits that carry the clock information are sent first.
For example 1111: the receiver now knows that the real data will arrive next, and from those redundant bits it extracts the clock signal as well as the "notification" that a transmission is coming.
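To make the mid-bit versus cell-boundary question concrete, here is a minimal Python sketch (my own illustration, not from this thread) of a decoder that already knows the bit period, e.g. because it has measured it from such a preamble. The polarity convention (a high-to-low mid-bit transition encodes 1, as in the G. E. Thomas convention; IEEE 802.3 uses the opposite), the oversampling rate and the 0.75 threshold are all assumptions.

def manchester_decode(samples, samples_per_bit):
    # positions where the line level changes
    edges = [i for i in range(1, len(samples)) if samples[i] != samples[i - 1]]
    bits, last_mid_edge = [], None
    for e in edges:
        if last_mid_edge is None or e - last_mid_edge > 0.75 * samples_per_bit:
            # roughly one bit period since the last accepted edge: a mid-bit (data) edge
            bits.append(1 if samples[e - 1] == 1 else 0)
            last_mid_edge = e
        # otherwise the edge is only ~half a period away: a cell boundary
        # between two equal bits, so it carries no data and is skipped
    return bits

# sending "111" at 8 samples per bit: boundary edges appear between the cells,
# but only the mid-bit edges are decoded
wave = [1, 1, 1, 1, 0, 0, 0, 0] * 3
print(manchester_decode(wave, 8))   # -> [1, 1, 1]

If the decoded preamble then reads back wrong (e.g. 0000 instead of the expected 1111), the receiver knows it has locked onto the cell-boundary edges instead of the mid-bit ones and shifts its reference by half a bit period; that is exactly the ambiguity the known preamble resolves.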
As for question 1, an NRZ scheme can send long runs of 1's and 0's, but the problem is actually with long runs of 1's: if you try sending a long run of 1's with some modulation scheme and a dipole antenna, you will observe that the power of the carrier signal starts decaying exponentially.
The other reason is the power needed to send that many consecutive 1's, which is not favourable!
For question 2, yes, it is possible to use a variable clock frequency, but the condition is that you must send redundant bits before changing it, so that the receiver understands that the clock changes from that point onwards.
Hope it’s clear now ;)

Machine Learning to predict time-series multi-class signal changes

I would like to predict the switching behavior of time-dependent signals. Currently the signal has 3 states (1, 2, 3), but it could be that this will change in the future. For the moment, however, it is absolutely okay to assume three states.
I can make the following assumptions about these states (see picture):
the signals repeat periodically, possibly with variations concerning the time of day.
the duration of state 2 is always constant and relatively short for all signals.
the durations of states 1 and 3 are also constant, but vary between the different signals.
the switching sequence is always the same: 1 --> 2 --> 3 --> 2 --> 1 --> [...]
there is a constant but unknown time reference between the different signals.
There is no constant time reference between my observations for the different signals. They are simply measured one after the other, but always at different times.
I am able to rebuild my model periodically after I have obtained more samples.
I have the following problems:
I can only observe one signal at a time.
I can only observe the signals at different times.
I cannot trigger my measurement with the state transition. That means, when I measure, I am always "in the middle" of a state. Therefore I don't know when this state has started and also not exactly when this state will end.
I cannot observe a certain signal for a long duration, so I am not able to observe a complete period.
My samples (observations) are widespread in time.
I would like to get a prediction of either the state change or the current state for the current time. It is likely that I will never have measured my signals at exactly that requested time.
So far I have tested the TimeSeriesPredictor from the ML.NET Toolbox, as it seemed suitable to me. However, in my opinion, this algorithm requires that you always pass the data of only one signal. This means that assumption 5 is not included in the prediction, which is probably suboptimal. Also, I had the problem that the prediction did not change when I queried multiple predictions, even though it should change time-dependently. This behaviour led me to believe that only the order of the values entered the model, but not the associated timestamp. If I have understood everything correctly, then exactly this timestamp is my most important "feature"...
So far I have not done any tests on regression-based approaches, e.g. FastTree, since my data is not linear but keeps changing states. Maybe this assumption is not valid and regression-based methods could also be suitable?
I also don't know whether a multiclass classifier is required, because I had understood that the TimeSeriesPredictor would also be suitable for this, since it works with the single data type. Whether the prediction is 1.3 or exactly 1.0 would be fine for me.
To sum it up:
I am looking for an algorithm which is able to recognize the switching patterns based on loose and widespread samples. It would be okay to define boundaries, e.g. the duration of state 3 of signal 1 will never be longer than 30 s, or the duration of state 1 of signal 3 will never be longer than 60 s.
Then, after the algorithm has obtained an approximate model of the switching behaviour, I would like to request a prediction of a certain signal's state at a certain time.
Which methods can I use to get the best prediction, preferably using the ML.NET toolbox or based on matlab?
Not sure if this is quite what you're looking for, but if detecting spikes and changes in signals is what you need, check out the anomaly detection algorithms in ML.NET. Here are two tutorials that show how to use them.
Detect anomalies in product sales
Spike detection
Change point detection
Detect anomalies in time series
Detect anomaly period
Detect anomaly
One way to approach this would be to first determine the periodicity of each of the signals independently. This could be done by looking at the frequency distribution of time differences between measurements of state 2 only, separately for each signal.
This will give a multimodal distribution. The shortest time difference will be the duration of the switching event (after discarding time differences less than the maximum duration of state 2). The second shortest peak will be the duration between the end of one switching event and the start of the next.
Once you have the periodicity of each of the 3 signals, you can simply calculate the offsets between them. Given that you have the timestamps of the state-2 measurements for each signal, you should be able to calculate the switching times for all the other signals.
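A rough numpy sketch of the histogram step just described (my own addition; the function name, bin width and the crude peak-picking rule are invented), assuming the observations for one signal are available as (timestamp in seconds, state) pairs:

import numpy as np

def state2_difference_peaks(observations, bin_width=1.0):
    """observations: iterable of (timestamp_s, state) for ONE signal.
    Returns the locations (in seconds) of the peaks in the histogram of
    pairwise time differences between state-2 observations."""
    t2 = np.sort(np.array([t for t, s in observations if s == 2], dtype=float))
    if len(t2) < 2:
        return np.array([])
    i, j = np.triu_indices(len(t2), k=1)
    diffs = t2[j] - t2[i]                              # all pairwise differences
    bins = np.arange(0.0, diffs.max() + bin_width, bin_width)
    hist, edges = np.histogram(diffs, bins=bins)
    peak_mask = hist > hist.mean() + 2 * hist.std()    # crude peak picking
    return edges[:-1][peak_mask]

The peak locations can then be interpreted as above (the shortest corresponds to the switching event itself, the next to the gap between events), and comparing the state-2 timestamps across signals gives the offsets between them.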

Why do we need to specify the number of flash wait cycles?

Especially when working with "faster" devices like the STM32F4xx/F7xx, we need to specify the number of flash wait cycles based on the supply voltage and the system clock frequency.
When the CPU fetches instructions or constants, this is done through the FLITF. Am I right in assuming that the FLITF stalls a CPU request until it can provide the requested data, making it impossible for other bus masters to access the flash in the meantime?
If that is true, why should any interface need to know the number of flash wait cycles? The cache preloads instructions anyway, regardless of whether it knows how long to wait, no?
Because the flash interface isn't magic.
It has to meet the necessary setup and hold times for addressing and reading out the flash cells, which will vary somewhat depending on voltage. Taking the STM32F411 as an example (because I have that TRM handy), doing some maths with the voltage/frequency/wait-state table implies that a read from flash on one of those takes on the order of ~30 ns above 2.7 V, down to ~60 ns below 2.1 V.
Since the flash interface doesn't have its own asynchronous nanosecond-precision timekeeping ability (because that would be needlessly complicated, power-hungry, and silly), that translates to asserting its signals for n clock cycles, after which it can assume the data signals from the cells are stable enough to read back*. How does it know what the clock frequency is, and therefore what n should be? Simple: you, as the programmer who set the clock, tell it. Some hardware things are just infinitely easier to let software deal with.
* and then going through the further shenanigans of extracting the relevant 8, 16 or 32 bits out of the 128-bit line it's read, to finally spit that out the other side onto the AHB bus to the waiting CPU, obviously.
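To make the "n clock cycles" point concrete, here is a small illustrative calculation (my own sketch; the wait-state table in the reference manual is the authority and includes margin, so the real settings can be higher than this bare arithmetic suggests):

import math

# Sketch of the reasoning only: the flash interface must hold its signals for
# enough HCLK cycles to cover the (voltage-dependent) cell access time, so
#   wait_states = ceil(t_access / t_hclk) - 1
def flash_wait_states(hclk_hz, t_access_ns):
    t_hclk_ns = 1e9 / hclk_hz
    return max(0, math.ceil(t_access_ns / t_hclk_ns) - 1)

# using the rough figures quoted above, at a 100 MHz HCLK:
print(flash_wait_states(100e6, 30))   # ~30 ns access time -> 2
print(flash_wait_states(100e6, 60))   # ~60 ns access time -> 5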

How to synchronize readout of binary streams on serial port of Matlab

I'm having an issue which is partially Matlab- and partially general programming-related, I'm hoping that somebody can help me brainstorm for solutions.
I have an external microcontroller that generates a large stream of binary data (~40 kB) every 400 ms and sends it via UART to a PC running Matlab scripts. The data is not encoded as hex or decimal characters but as raw binary (hence there is no terminator defined, as all 256 values are possible, valid combinations of data). The baud rate is set at 1024000. In short, it takes roughly 375 ms for a whole stream of data to be sent, with 25 ms of dead time between streams.
In Matlab, the serial port is configured correctly (also 1024000 baud, 8 data bits, 1 stop bit, no parity, no hardware flow control, etc.). I am able to read out the data I'm sending from the microcontroller correctly (i.e. there is no corruption of data), but I have not been able to synchronize the serial readout in Matlab. My script is as follows:
function DATA_show = GetDATA
    % Close any serial objects left over from previous runs
    if ~isempty(instrfind)
        fclose(instrfind);
    end
    DATA_TOTAL_SIZE = 38400;
    DATA_buffer = uint8(zeros(DATA_TOTAL_SIZE,1));
    DATA_show = reshape(DATA_buffer(1:2:end)',[160,120])';
    f_data_in = false;
    f_data_out = true;
    serialport = serial('COM11','BaudRate',1024000,'DataBits',8,'FlowControl','none','Parity','none','StopBits',1,...
        'BytesAvailableFcnCount',DATA_TOTAL_SIZE,'BytesAvailableFcnMode','byte','InputBufferSize',DATA_TOTAL_SIZE * 2,...
        'BytesAvailableFcn',@GetPortData);
    fopen(serialport);
    while (get(serialport,'BytesAvailable') ~= 0) % Skip first packet which might be incomplete
        fread(serialport,DATA_TOTAL_SIZE,'uint8');
    end
    f_data_out = true;
    while (1)
        if (f_data_in)
            DATA_buffer = fread(serialport,DATA_TOTAL_SIZE,'uint8');
            DATA_show = reshape(DATA_buffer(1:2:end)',[160,120])'; % Reshape array as matrix
            DATAsc(DATA_show);                                     % Display routine
            disp('DATA');
        end
        pause(0.01);
    end
    fclose(serialport);
    delete(serialport);
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Callback fired once DATA_TOTAL_SIZE bytes are available
    function GetPortData (obj,~)
        if f_data_out
            f_data_in = true;
        end
    end
end
The problem I see is that what I end up reading is always the correct size, but belongs to multiple streams, because I haven't found a way to tell Matlab that these 25ms of no data should be used to synchronize (i.e. data from before and after that blank period should belong to different streams).
Does anyone have any suggestions for this?
Thanks a lot!
For completeness, I would like to post my current implementation for fixing this issue, which is probably not a suitable solution in all cases but might be useful in some.
The approach I took consists of moving to a bidirectional communication protocol, in which Matlab initiates the streaming by sending a very short command as a trigger (e.g. a single, non-printable character). Given the high baud rate, this does not add significant delay due to processing on the microcontroller's side.
The microcontroller, upon reception of this trigger, transmits exactly one full packet (as opposed to continuously streaming packets at a 5 Hz rate). By forcing Matlab to pick up a serial packet of the known length right after issuing the trigger, it ensures that only one packet is received, without synchronization issues.
Then it becomes just a matter of encapsulating the Matlab script in a routine driven by a 5 Hz timer, in which the sequence is repeated (send trigger, retrieve packet, do whatever processing, and repeat).
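Purely for illustration, the same handshake sketched in Python with pyserial (the trigger byte is an assumption; port, baud rate, packet size and the 5 Hz pacing are taken from the thread; the actual implementation used was the MATLAB approach described above):

import serial, time

PACKET_SIZE = 38400          # one full frame, as in the MATLAB script above
TRIGGER = b'\x01'            # assumed single non-printable trigger byte

ser = serial.Serial('COM11', baudrate=1024000, timeout=1.0)

while True:
    t0 = time.monotonic()
    ser.reset_input_buffer()             # discard anything stale
    ser.write(TRIGGER)                   # ask the MCU for exactly one packet
    packet = ser.read(PACKET_SIZE)       # blocks until a full packet or timeout
    if len(packet) == PACKET_SIZE:
        pass                             # ... process the frame here ...
    # pace the loop at roughly 5 Hz
    time.sleep(max(0.0, 0.2 - (time.monotonic() - t0)))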
Advantages of this:
It solves the synchronization problems
Disadvantages of this:
Having Matlab run on a timer tick does not ensure perfect periodicity, so the triggers might not always be sent at exactly 5 Hz. If triggers are sent at "inconvenient" times for the microcontroller, packets might need to be skipped in order to avoid a packet being updated in memory while it is still being transmitted (since transmission takes up a significant part of the 200 ms time slot).
From experience, performance can vary a lot depending on what the PC running Matlab is doing. For example, it works fine when the PC is left on its own to do the acquisition, but if another program is used (e.g. Chrome), Matlab begins to lag and that results in delays in transmission of triggers.
As mentioned above, it's not a complete answer, but it is an approach that might be sufficient in some situations. If someone has a more efficient option, please feel free to share!

Transferring data using ultrasound

Yamaha InfoSound and the ShopKick application use technologies that allow data to be transferred using ultrasound, that is, by playing an inaudible signal (>18 kHz) that can be picked up by modern mobile phones (iOS, Android).
What is the approach used in such technologies? What kind of modulation do they use?
I see several problems with this approach. First, 18 kHz is not inaudible. Many people cannot hear it, especially as they age, but I know I certainly can (I do regular hearing tests, work-related). Also, most phones have different low-pass filters on their A/D converters, and many devices, especially older Android ones (I've personally seen this happen), filter out everything above 16 kHz or so. Your app is therefore not guaranteed to work on every device. The iPhone should probably be able to do it.
In terms of modulation, it could be anything really, but I would definitely rule out AM. Sound has next to zero robustness when it comes to volume. If I were to implement something like that, I would go with FSK. I would think that PSK would fail due to acoustic reflections and such. The difficulty is that you're working with non-robust energy transfer within a very narrow bandwidth. I certainly do not doubt that it can be achieved, but I don't see something like this proving reliable. Just IMHO, that is.
Update: Now that I think about it, plain on-off keying of a single tone would work if you're not transferring any data, just some short signals.
I can't say for Yamaha InfoSound and ShopKick, but what we used in our project was a variation of frequency modulation: the frequency of the carrier is modulated by a digital binary signal, where 0 and 1 correspond to 17 kHz and 18 kHz respectively. As for the demodulator, we tried a heterodyne one. You can find more details here: http://rnd.azoft.com/mobile-app-transering-data-using-ultrasound/
There's nothing special about it being ultrasound; the principle is the same as data transmission through a modem, so any digital modulation is, in principle, feasible. You only have a specific frequency band (above 18 kHz) and some practical constraints (the medium is very unreliable, I guess) that suggest using a simple, robust scheme with a low bit rate.
I don't know how they do it but this is how I do it:
If it is a string, make sure it's not a long one (the longer it is, the higher the error probability). Let's assume we're working with the vital part of the ASCII code, namely up to character number 127; then all you need is 7 bits per character. Transform each character into bits and modulate those bits using QFSK (there are several modulations to choose from; frequency-shift-based ones have turned out to be the most robust of the conventional ones I've tried, and I've also created my own modulation scheme for this use case). Select the carrier frequencies as 18.5, 19, 19.5 and 20 kHz (if you want to be mathematically strict in your design, select frequency values that assure both orthogonality and phase continuity at symbol transitions; if you can't, a good workaround to avoid abrupt symbol transitions is to multiply your symbols by a window of the same size, e.g. a Gaussian or Bartlett window). In my experience you can move these values within the range from 17.5 to 20.5 kHz (if you go lower it will start to bother people using your app; if you go higher, the frequency response of the average microphone will attenuate your transmission and induce unwanted errors).
On the receiver side, implement a correlation or matched-filter receiver (an FFT receiver works as well, especially a zero-padded one, but it might be a little slower; I wouldn't recommend Goertzel, because frequency shifts due to the Doppler effect or speaker/microphone non-linearities could affect your reception). Once you have received the bit stream, assemble characters from the bits and you will recover your message.
If you face too many transmission errors, try selecting a higher number of samples per symbol or band-pass filtering each frequency before handing it to the demodulator; using an error-correcting code such as BCH or Reed-Solomon is sometimes the only way to ensure error-free communication.
One topic everybody always forgets to talk about is synchronization (knowing, on the receiver side, when the transmission has begun). You have to be creative here and run a lot of tests with a lot of phones before you can derive a detection threshold that works on all of them; note that this may also be distance dependent.
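A bare-bones numpy sketch of the 4-FSK transmit / correlation-receive idea described above (entirely my own illustration: the sample rate, symbol length and Bartlett window are arbitrary choices, and the synchronization and error-correction steps just discussed are left out, so it assumes the receiver already knows where the symbols start):

import numpy as np

FS = 44100                               # assumed sample rate
TONES = [18500, 19000, 19500, 20000]     # the four carriers suggested above
SPS = 2048                               # samples per symbol (arbitrary)

def text_to_symbols(text):
    bits = ''.join(format(ord(c) & 0x7F, '07b') for c in text)
    bits += '0' * (len(bits) % 2)        # pad to whole 2-bit symbols
    return [int(bits[i:i + 2], 2) for i in range(0, len(bits), 2)]

def modulate(text):
    t = np.arange(SPS) / FS
    win = np.bartlett(SPS)               # soften symbol edges, as suggested above
    return np.concatenate([win * np.sin(2 * np.pi * TONES[s] * t)
                           for s in text_to_symbols(text)])

def demodulate(sig, n_symbols):
    t = np.arange(SPS) / FS
    refs = [(np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)) for f in TONES]
    bits = ''
    for k in range(n_symbols):
        chunk = sig[k * SPS:(k + 1) * SPS]
        # correlation receiver: energy against each carrier, phase-insensitive
        energy = [np.dot(chunk, s) ** 2 + np.dot(chunk, c) ** 2 for s, c in refs]
        bits += format(int(np.argmax(energy)), '02b')
    return ''.join(chr(int(bits[i:i + 7], 2))
                   for i in range(0, len(bits) - 6, 7))

msg = "hi"
assert demodulate(modulate(msg), len(text_to_symbols(msg))) == msg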
If you are unfamiliar with these subjects, I would recommend a few great books:
Digital Modulation Techniques by Fuqin Xiong
Digital Communications: Fundamentals and Applications by Bernard Sklar
Digital Communications by John G. Proakis
You might have luck with a library I created for sound-based modems, libquiet. It gives you a handful of profiles to work from, including a slow "Ultrasonic whisper" profile with spectral content above 19kHz. The library is written in C but would require some work to interface with iOS.

redundant encoding?

This is more of a computer science / information theory question than a straightforward programming one, so if anyone knows of a better site to post this, please let me know.
Let's say I have an N-bit piece of data that will be sent redundantly in M messages, where at least M-1 of those messages will be received successfully. I am interested in different ways of encoding the N-bit piece of data in fewer bits per message. (This is similar to RAID, but at a much smaller scale, where N = 8, 16 or 32.)
Example: suppose N = 16 and M = 4. Then I could use the following algorithm:
1st and 3rd message: send "0" + bits 0-7
2nd and 4th message: send "1" + bits 8-15
If I can guarantee that 3 of the 4 messages will get through, then at least one message from each group will get through. Thus I can make this work with 9 bits or fewer per message; there's probably a way to do this with fewer total bits, but I'm not sure how.
Are there some simple encoding/decoding algorithms to do this kind of thing? Does this problem have a name? (if I know what it's called, I can google it!)
note: in my particular case, the messages either arrive correctly or do not arrive at all (no messages arrive with errors).
(edit: moved 2nd part to a separate question)
(Incomplete answer follows. I may add more later.)
The term you may be interested in is channel coding: adding redundancy to a source in order to make it robust during transmission over a noisy channel. In information theory, the complementary problem to channel coding is source coding: reducing the redundancy in a source to represent it using fewer bits. (The combination of these two problems is called joint source-channel coding.)
Your first question asks to find a channel code. The simple example you give is similar to a repetition code, i.e., you send the same message more than twice (usually an odd number of times), and then the message which is received most often is accepted as the original message.
This code is inefficient. To use standard notation, let k = number of bits in original message, and n = number of bits in the transmitted message. For your example, k = 16 and n = 36. A measure of coding efficiency is k/n, where higher means more efficient. In your case, k/n = 0.44. This is low.
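For concreteness, a toy Python sketch (my addition, not part of the original answer) of the repetition idea and the k/n measure just described:

from collections import Counter

def rep_encode(bits, times=3):
    return [b for b in bits for _ in range(times)]

def rep_decode(coded, times=3):
    # majority vote over each group of repeated bits
    return [Counter(coded[i:i + times]).most_common(1)[0][0]
            for i in range(0, len(coded), times)]

msg = [1, 0, 1, 1]
tx = rep_encode(msg)            # k = 4, n = 12, so k/n = 0.33
rx = tx[:]
rx[5] ^= 1                      # corrupt one transmitted bit
assert rep_decode(rx) == msg    # the majority vote still recovers the message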
The repetition code is a simple kind of block code, i.e., redundancy is added to each block of k bits to create a codeword of n bits. So are the Hamming and Reed-Solomon codes as others mentioned. Hamming codes are relatively easy to understand with some basic linear algebra.
These should be enough terms for you to search on your own. Good luck.
I'm not sure if I understood all the details of your question correctly, but your problem is definitely about designing some kind of error-correcting code. This is a vast area of computer science, and thick tomes have been written about it. Start with Wikipedia and see if you can get any simple schemes (like Hamming or Reed-Solomon codes) to work in your case.
If you want to deal not only with symbol corruption but also with deletion of symbols, you should look at erasure codes; this is definitely a more difficult task, but good methods exist in many cases.
EDIT: This material from hackersdelight.org seems a nice introduction.
See erasure codes.
You're looking for a packet erasure code. There are only two useful packet erasure codes that are not totally encumbered by patents, and there's only one open-source library to implement those. Find it here: http://planete-bcast.inrialpes.fr/rubrique.php3?id_rubrique=5
Here's a trivially simple scheme that's almost twice as efficient as your example.
You chopped the message into blocks of (N/M)*2 bits. Instead, chop it into N/(M-1)-bit blocks (round up if necessary), giving M-1 source blocks. The first block encodes as itself: enc[0] = src[0]. Likewise, the last encoded packet is just the last source block: enc[M-1] = src[M-2]. Each of the other packets is its block XORed with its left neighbour: enc[i] = src[i-1] ^ src[i].
Prefix each encoded block with a log(M)-bit sequence number, essentially as you did, so the receiver can tell which was dropped. (If you can be sure that whichever blocks arrive will arrive in order, then a 1-bit sequence number will do. Just alternate 0 and 1.)
To decode, successively XOR from the left and the right until you hit the dropped block. E.g. src[1] == enc[0]^enc[1]. (Dropping one of the endpoint blocks isn't a special case -- e.g. if the first block is dropped, the scan from the right recovers it, and the scan from the left is of length 0.)
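A compact Python sketch of this scheme (my own illustration), treating the M-1 source blocks as small integers; the actual bit-level splitting/padding and the log(M)-bit header packing are left out:

def encode(src):                    # src: M-1 blocks -> M packets
    m = len(src) + 1
    enc = [src[0]]
    enc += [src[i - 1] ^ src[i] for i in range(1, m - 1)]
    enc.append(src[-1])
    return list(enumerate(enc))     # attach a sequence number to each packet

def decode(packets, m):
    got = dict(packets)             # sequence number -> payload, one may be missing
    src = [None] * (m - 1)
    # scan from the left until the dropped packet is hit
    acc = 0
    for i in range(m - 1):
        if i not in got:
            break
        acc = got[i] if i == 0 else acc ^ got[i]
        src[i] = acc
    # scan from the right for whatever is left
    acc = 0
    for i in range(m - 1, 0, -1):
        if src[i - 1] is not None or i not in got:
            break
        acc = got[i] if i == m - 1 else acc ^ got[i]
        src[i - 1] = acc
    return src

# N = 16, M = 4: three ~6-bit blocks; any single packet may be dropped
blocks = [0b101010, 0b010101, 0b111000]
packets = encode(blocks)
for drop in range(4):
    received = [p for p in packets if p[0] != drop]
    assert decode(received, 4) == blocks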