How does a computer distinguish between different types of data? - cpu-architecture

Any type of data, be it audio, video, or text, is represented and stored inside a computer as a sequence of 0s and 1s. How, then, does a computer know that one particular sequence is text and another is an audio file?
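To make the premise concrete, here is a minimal Python sketch (standard library only) in which the same four bytes are read as text, as integers, and as a floating-point number. The byte values are arbitrary and chosen only for illustration; the point is that the bytes carry no type of their own, and the interpretation comes from whatever code reads them.

```python
import struct

# Four arbitrary bytes, chosen only for illustration.
raw = bytes([0x42, 0x48, 0x49, 0x21])

as_text = raw.decode("ascii")             # read as four ASCII characters
as_uint32 = struct.unpack("<I", raw)[0]   # one little-endian unsigned 32-bit integer
as_int16 = struct.unpack("<2h", raw)      # two little-endian signed 16-bit integers
as_float = struct.unpack("<f", raw)[0]    # one little-endian IEEE 754 single

print(repr(as_text), as_uint32, as_int16, as_float)
```

In practice the reading program picks one of these interpretations based on context such as a file extension, a header, or other metadata, not on the bits themselves.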

Related

How to decode 16-bit signed binary file in IEEE754 standard

I have a file in the .ogpr format (openGPR, a dead format used for ground radar data), and I'm trying to read it and convert it into a matrix using MATLAB(R).
The first part of the file is a JSON header that describes the characteristics of the data acquisition (number of traces, position, etc.), and the second part holds two different data blocks.
The first block contains the 'real' GPR data, and I know that they are formatted as follows:
Multibyte binary data are little-endian
Floating point binary data follow the IEEE 754 standard
Integer data follow the two’s complement encoding
I also know the total number of bytes and the number of bytes for each single 'slice' (we have 512 samples * 10 channels * 3971 slices [x 2 bytes per sample]).
Furthermore: 'A Data Block of type Radar Volume stores a 3D array of radar Samples. At the moment, each sample value is stored in a 16-bit signed integer. Each Sample value is in volts in the range [-20, 20].'
The second block contains geolocation information.
I'd like to read the Data Block and convert it from that encoding, but it isn't clear to me how many bytes make up each value or how to convert them into numbers.
I tried to use this part of code:
bin_data = ogpr_data(48:(length(ogpr_data)-1),1);
writematrix(bin_data, 'bin_data.txt');
fileID = fopen('bin_data.txt', 'r', 'ieee-le');
format = 'uint16';
Data = fread(fileID, Inf, format);
fclose(fileID)
Looks like your posted code is mixing text files and binary files. The writematrix( ) routine writes values as comma-delimited text. Then you turn around and try to use fopen( ) and fread( ) to read this as a binary file in IEEE little-endian format. These are two totally different things. You need to pick one format and use it consistently: either human-readable comma-delimited text files, or machine-readable binary IEEE format files.
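As a rough illustration of staying in binary throughout, here is a minimal sketch in Python (not the MATLAB from the question, standard library only). It assumes the data block is already available as raw bytes and follows the format notes quoted in the question: little-endian, two's-complement, 16 bits per sample. The helper name decode_int16_le and the three-sample demo block are made up for this example. The scaling from integer counts to volts is not specified in the question, so it is left out.

```python
import struct

def decode_int16_le(block: bytes) -> list:
    """Interpret raw bytes as little-endian two's-complement 16-bit integers."""
    if len(block) % 2:
        raise ValueError("block length must be a multiple of 2 bytes")
    count = len(block) // 2
    return list(struct.unpack("<%dh" % count, block))

# Payload size implied by the figures in the question:
# 512 samples * 10 channels * 3971 slices * 2 bytes per sample.
expected_bytes = 512 * 10 * 3971 * 2

# Hypothetical 3-sample block, built here only so the sketch runs.
demo_block = struct.pack("<3h", -32768, -1, 20000)
demo_values = decode_int16_le(demo_block)

print(expected_bytes, demo_values)
```

In MATLAB the analogous step would presumably be a single fread on the original binary file with a signed 16-bit precision and little-endian byte order, with no text file in between.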

What video containers are 'sequential' in their file encoding?

Suppose I have two video files with similar characteristics (file type, encoding, resolution, etc.) that start at the same point, but A runs for 10 seconds while B runs for 20. If A is 10 MB and B is 20 MB, and I read the first 5 MB of each, will the binary sequences match for that 5 MB in the major video encoding formats?
E.g. MP4, AVI, MOV, WMV?
No. Different containers work differently, and the first X bytes will not contain the same number of frames. In some cases, such as MP4, you may get audio or metadata and no video at all, or you may get bytes that cannot be interpreted without information that comes later in the file.
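The MP4 case can be sketched in Python (standard library only). An MP4 file is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-byte type. The helper names and the synthetic file below are made up for illustration; the payloads are dummy bytes, not real video. The layout puts the index box (moov) after the media data (mdat), which is one arrangement real files can have.

```python
import struct

def list_top_level_boxes(data: bytes) -> list:
    """Return (type, declared size) for each top-level box header found in data."""
    boxes = []
    pos = 0
    while pos + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if pos + 16 > len(data):
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - pos
        if size < header:
            break  # malformed header; stop scanning
        boxes.append((kind.decode("latin-1"), size))
        pos += size
    return boxes

def make_box(kind: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload

# Synthetic file: file-type box, media data, then the index at the end.
synthetic = (make_box(b"ftyp", b"isom")
             + make_box(b"mdat", b"\x00" * 32)
             + make_box(b"moov", b"\x00" * 8))

full_layout = list_top_level_boxes(synthetic)
prefix_layout = list_top_level_boxes(synthetic[:40])
print(full_layout, prefix_layout)
```

Reading only the first 40 bytes of this synthetic file never reaches the moov box, so a reader holding just that prefix has media bytes without the information needed to interpret them.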

Read and represent mp3 files using memmapfile in matlab

I have to analyze bioacoustic audio files using MATLAB. Eventually I want to be able to find anomalies in the audio, so I need a way to represent the audio that lets me extract and compare features. I'm dealing with MP3 files of up to 150 MB. These files are too large for MATLAB to read into its memory, so I want to use the memmapfile() function. I used the following code and a small MP3 file to find out how it actually works.
[testR, ~] = audioread('test.mp3');
testM = memmapfile('test.mp3');
disp(testM.Data);
disp(testR);
The actual values of testM.Data and testR are different. audioread() returns a 7483391 x 2 matrix and memmapfile() a 4113874 x 1 matrix.
I'm not really sure how memmapfile() works; I expected the two to be equal. Is there a way to read MP3 files with memmapfile() in the same format that audioread() returns? And what does memmapfile() actually return in the case of an audio file? Maybe it's also usable in vector format for anomaly detection?
Thanks in advance!
NOTE: The original files were in WAV IMA ADPCM format, with sizes from 1.5 up to 2.5 GB. Since MATLAB can't deal with that format or with files of that size, I converted them to 8-bit MP3 files.
I think the problem is that memmapfile reads data in uint8 format by default, while the audioread function reads the data in a different way.
As you can see here, you can specify the data format when you read with memmapfile, so try experimenting with different values. From the documentation, I read that you can read data in double format, so try adjusting the memmapfile data format and the audioread data format.
One last thing: memmapfile always organizes the data as an N x 1 matrix, so if you want the original shape you need to use something like reshape.
In any case, if you work with big data I suggest trying something other than memmapfile, because it is very slow.
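The difference between mapping a file and decoding it can be sketched in Python (standard library only). This is an illustration on a tiny uncompressed WAV generated on the spot, not on an MP3: the standard library has no MP3 decoder, and the frame values and file name are made up. A memory map exposes every byte of the file as a flat sequence, header included, while a decoder parses the header and returns samples per channel.

```python
import mmap
import os
import struct
import tempfile
import wave

# Build a small stereo 16-bit PCM WAV so the sketch is self-contained.
frames = [(100, -100), (200, -200), (300, -300)]  # (left, right) pairs
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(path, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b"".join(struct.pack("<2h", l, r) for l, r in frames))

# Decoder view: the header is parsed and only the sample data comes back.
with wave.open(path, "rb") as w:
    pcm = w.readframes(w.getnframes())
decoded = [struct.unpack_from("<2h", pcm, 4 * i) for i in range(len(pcm) // 4)]

# Memory-map view: all bytes of the file as one flat sequence.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        raw = m[:]

print(len(raw), len(pcm), decoded)
```

For uncompressed PCM the mapped bytes after the header can be reinterpreted and reshaped into channels. For a compressed format such as MP3 the mapped bytes are the encoded stream, so choosing a different data type alone would not be expected to reproduce what a decoder returns.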

Serial communication with simulink

I'm trying to send and receive data through a serial port using Simulink (MATLAB 7.1) and dSPACE. The values I want to send and receive are doubles. Unfortunately for me, the send and receive blocks use uint8 values. How can I convert doubles into an array of uint8 values and vice versa? Are there Simulink blocks for this, or should I use Embedded MATLAB functions?
Use the aptly named Data Type Conversion block, which does just that.
EDIT following discussion in the comments
Regarding scaling, here's a snapshot of something I did a long time ago. It's using CAN rather than serial, but the principle is the same. Here, it's slightly easier in that the signals are always positive, so I don't have to worry about scaling a negative number. 65535 is the max value for a uint16, and I would do the reverse scaling on the receiving end. When converting to uint16 (or uint8, as in your case), it automatically rounds the value, and you can specify that behaviour in the block mask.
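The scaling idea can be written out as a minimal Python sketch (not Simulink). It assumes a signal that is always positive and has a known maximum; the FULL_SCALE value of 100.0 and the function names are hypothetical. The sender maps the value onto the 0 to 65535 range and rounds, and the receiver applies the reverse scaling.

```python
UINT16_MAX = 65535

def to_uint16(value: float, full_scale: float) -> int:
    """Map a value in [0, full_scale] to an integer in [0, 65535], rounding to nearest."""
    scaled = round(value / full_scale * UINT16_MAX)
    return max(0, min(UINT16_MAX, scaled))  # saturate rather than overflow

def from_uint16(code: int, full_scale: float) -> float:
    """Reverse scaling, as applied on the receiving end."""
    return code / UINT16_MAX * full_scale

FULL_SCALE = 100.0  # hypothetical maximum of the signal
code = to_uint16(25.0, FULL_SCALE)
restored = from_uint16(code, FULL_SCALE)
print(code, restored)
```

The round trip is not exact: the restored value differs from the original by at most about one quantization step (FULL_SCALE / 65535), which is the price of sending a double through a 16-bit integer.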
There are pack and unpack blocks in Simulink; search for them in the Simulink Library Browser. You may need some additional product, but I'm not sure which.
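What packing does, as opposed to scaling, can be sketched in Python (standard library only; this is not the Simulink blocks themselves). A double occupies 8 bytes, so it can be sent as 8 uint8 values and reassembled bit for bit on the other side. The little-endian byte order and the function names are assumptions of this example; both ends simply have to agree on the order.

```python
import struct

def double_to_bytes(value: float) -> list:
    """Split a double into its 8 raw bytes (little-endian), each in 0..255."""
    return list(struct.pack("<d", value))

def bytes_to_double(octets: list) -> float:
    """Reassemble 8 raw bytes (little-endian) into the original double."""
    return struct.unpack("<d", bytes(octets))[0]

octets = double_to_bytes(3.14159)
roundtrip = bytes_to_double(octets)
print(octets, roundtrip)
```

Unlike the Data Type Conversion approach, nothing is rounded here: the receiver gets exactly the value that was sent, at the cost of 8 bytes per number.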

Playing sound in Matlab at +30dB

As far as I know, when I load WAV files into MATLAB with the command:
song = wavread('file.wav');
the array song has elements with values from -1 to 1. This file (and the hardware) is set up to be played at 80 dB. I need to add +30 dB to reach 110 dB.
I get +10 dB by multiplying by sqrt(10), so to get +30 dB I do:
song = song*10*sqrt(10);
which is the same as
song = song*sqrt(10)*sqrt(10)*sqrt(10);
Now the values in the array song are much greater than the range -1 to 1, and I hear distorted sound.
Is it because of these values outside [-1, 1], or because of the quality of my speakers/headphones?
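The arithmetic in the question can be checked with a short Python sketch. It uses the standard amplitude relation gain = 10^(dB/20); the function name is made up for this example.

```python
import math

def db_to_amplitude_gain(db: float) -> float:
    """Amplitude (not power) multiplier for a level change given in dB."""
    return 10.0 ** (db / 20.0)

gain_10 = db_to_amplitude_gain(10.0)   # equals sqrt(10), about 3.16
gain_30 = db_to_amplitude_gain(30.0)   # equals 10*sqrt(10), about 31.6

# A full-scale sample of 1.0 after +30 dB lies far outside [-1, 1].
peak_after_gain = 1.0 * gain_30
print(gain_10, gain_30, peak_after_gain)
```

So the factor 10*sqrt(10) is the right one for +30 dB, and it pushes any sample above roughly 0.032 in magnitude out of the -1 to 1 range.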
The distortion occurs because your values exceed +/-1. The float values are converted to DAC counts, which are either +/-32768 (for a 16-bit DAC) or +/-8388608 (for a right-justified 24-bit DAC) or +/-2147483648 (for a left-justified 24-bit DAC). For a 16-bit DAC, this is usually accomplished by an operation like dacSample = (short int)(32768.0*floatSample); in C. If floatSample is > +1 or < -1, this will cause wraparound in the short int cast, which is the distortion you hear. The cast is necessary because the DAC expects 16-bit digital samples.
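The effect of the conversion can be sketched in Python. The 16-bit wraparound is modeled explicitly here as an assumption, since what an out-of-range cast actually does in C is not guaranteed; a clipping variant is shown next to it for comparison. All function names are made up for this example.

```python
def wrap_int16(value: int) -> int:
    """Model two's-complement wraparound into the range -32768..32767."""
    return (value + 32768) % 65536 - 32768

def float_to_int16_wrapping(sample: float) -> int:
    """Scale by 32768 and truncate, then wrap as a 16-bit cast is assumed to."""
    return wrap_int16(int(32768.0 * sample))

def float_to_int16_clipping(sample: float) -> int:
    """Scale by 32768 and truncate, then saturate at the 16-bit limits."""
    return max(-32768, min(32767, int(32768.0 * sample)))

in_range = float_to_int16_wrapping(0.5)    # well inside [-1, 1]
wrapped = float_to_int16_wrapping(1.5)     # too large: sign flips
clipped = float_to_int16_clipping(1.5)     # too large: pinned at the maximum
print(in_range, wrapped, clipped)
```

Either way the waveform is no longer the scaled original once samples leave [-1, 1]: wrapping flips a loud positive peak to a negative value, and clipping flattens it.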
You will need to adjust your amplifier/speaker settings to get the sound level you desire.
Conversely, you could create a copy of your file, lower it by 30 dB, adjust your amplifier/speakers to play the new file at 80 dB, then play the original file at the same amp/speaker settings. This will cause the original file to be played at 110 dB.
As Paul R noted in his comment, I am guessing here that you are using dB as shorthand for dB SPL when referring to the actual analog sound level produced by the full signal chain.