Create a variable of a specific length and populate it with 0's and 1's - matlab

I am trying to use MATLAB in order to simulate a communications encoding and decoding mechanism. Hence all of the data will be 0's or 1's.
Initially I created a vector of a specific length and populated with 0's and 1's using
source_data = rand(1,8192)<.7;
For encoding I need to perform XOR operations multiple times which I was able to do without any issue.
For the decoding operation I need to implement the Gaussian Elimination method to solve the set of equations where I realized this vector representation is not very helpful. I tried to use strcat to append multiple 0's and 1's to a variable a using a for loop:
for i=1:8192
if(mod(i,2)==0)
a = strcat(a,'0');
else
a = strcat(a,'1');
end
i = i+1;
disp(i);
end
when I tried length(a) after this I found that the length was 16384, which is twice 8192. I am not sure where I am going wrong or how best to tackle this.

Did you reinitialize a before the example code? Sounds like you ran it twice without clearing a in between, or started with a already 8192 long.
Growing an array in a loop like this in Matlab is inefficient. You can usually find a vectorized way to do stuff like this. In your case, to get an 8192-long array of alternating ones and zeros, you can just do this.
len = 8192;
a = double(mod(1:len,2) == 0);
And logicals might be more suited to your code, so you could skip the double() call.

There are probably a few answer/questions here. Firstly, how can one go from an arbitrary vector containing {0,1} elements to a string? One way would be to use cellfun with the converter num2str:
dataDbl = rand(1,8192)<.7; %see the original question
dataStr = cellfun(#num2str, num2cell(dataDbl));
Note that cellfun concatenates uniform outputs.

Related

Encoding a binary vector in a suitable way in Matlab

The context and the problem below are only examples that can help to visualize the question.
Context: Let's say that I'm continously generating random binary vectors G with length 1x64 (whose values are either 0 or 1).
Problem: I don't want to check vectors that I've already checked, so I want to create a kind of table that can identify what vectors are already generated before.
So, how can I identify each vector in an optimized way?
My first idea was to convert the binary vectors into decimal numbers. Due to the maximum length of the vectors, I would need 2^64 = 1.8447e+19 numbers to encode them. That's huge, so I need an alternative.
I thought about using hexadecimal coding. In that case, if I'm not wrong, I would need nchoosek(16+16-1,16) = 300540195 elements, which is also huge.
So, there are better alternatives? For example, a kind of hash function that can identify that vectors without repeating values?
So you have 64 bit values (or vectors) and you need a data structure in order to efficiently check if a new value is already existing?
Hash sets or binary trees come to mind, depending on if ordering is important or not.
Matlab has a hash table in containers.Map.
Here is a example:
tic;
n = 1e5; % number of random elements
keys = uint64(rand(n, 1) * 2^64); % random uint64
% check and add key if not already existing (using a containers.Map)
map = containers.Map('KeyType', 'uint64', 'ValueType', 'logical');
for i = 1 : n
key = keys(i);
if ~isKey(map, key)
map(key) = true;
end
end
toc;
However, depending on why you really need that and when you really need to check, the Matlab function unique might also be something for you.
Just throwing out duplicates once at the end like:
tic;
unique_keys = unique(keys);
toc;
is in this example 300 times faster than checking every time.

Splitting an audio file in Matlab

I'm trying to split an audio file into 30 millisecond disjoint intervals using Matlab. I have the following code at the moment:
clear all
close all
% load the audio file and get its sampling rate
[y, fs] = audioread('JFK_ES156.wav');
for m = 1 : 6000
[t(m), fs] = audioread('JFK_ES156.wav', [(m*(0.03)*fs) ((m+1)*(0.03)*fs)]);
end
But the problem is that I get the following error:
In an assignment A(I) = B, the number of elements in B and I
must be the same.
Error in splitting (line 12)
[t(m), fs] = audioread('JFK_ES156.wav', [(m*(0.03)*fs)
((m+1)*(0.03)*fs)]);
I don't see why there's a mismatch in the number of elements in B and I and how to solve this. How can I get past this error? Or is there just an easier way to split the audio file (maybe another function I don't know about or something)?
I think the easiest way to split audio is to just load it and use the vec2mat function. so you would have something like this;
[X,Fs] = audioread('JFK_ES156.wav');
%Calculate how many samples you need to capture 30ms of audio
matSize = Fs*0.3;
%Pay attention to that apostrophe. Makes sure samples are stored in columns
%rather than rows.
output = vec2mat(x,matSize)';
%You can now have your audio split up into the different columns of your matrix.
%You can call them by using the column calling command for matrices.
%Plot first 30ms of audio
plot(output(:,1));
%You can join the audio back together using this command.
output = output(:);
Hope that helps. Another good thing about this method is that it keeps all your data in one place!
Edit : One thing I thought of, you may get a problem with this depending on your vector size. But I think vec2mat actually zeroPads your vector. Not a big thing, but if you're moving back and forth between the two, then it might be a good idea to have another variable that stores the original length of your signal.
You should just use the variable y and reshape it to form your split audio. For example,
chunk_size = fs*0.03;
y_chunks = reshape(y, chunk_size, 6000);
That will give you a matrix with each column a 30 ms chunk. This code will also be faster than reading small segments from file in a loop.
As hiandbaii suggested you could also use cell array. Make sure you clear your existing variables before that. Not clearing the array t is probably the reason you got the error "Cell contents assignment to a non-cell array object."
Your original error is because you cannot assign a vector with scalar indexing. That is, 'm' is a scalar, but your audioread call is returning a vector. This is what the error says about mismatch in size of I and B. You could also fix that by making t a 2-D array and use an assignment like
[t(m,:), fs] =
It appears that each 30 ms segment is not equal to one sample. That would be the only case where your code works. i.e. 0.03*fs != 1.
You could try using cells instead.. i.e. replace t(m) with t{m}

Addressing multiple ranges via indices in a vector

To begin, this problem is easily solvable with a for-loop. However, I'm trying to force/teach myself to think vector-wise to take advantage of what Matlab does best.
Simplified, here is the problem explanation:
I have a vector with data in it.
I have a 2xN array of start/stop indices that represent ranges of interesting data in the vector.
I want to perform calculations on each of those ranges, resulting in a number (N results, corresponding to each start/stop range.)
In code, here's a pseudoexample of what I'd like to have at the end:
A = 1:10000;
startIndicies = [5 100 1000];
stopIndicies = [10 200 5000];
...
calculatedResults = [func(A(5:10)) func(A(100:200)) func(A(1000:5000))]
The length of A, and of the start/stop index array is variable.
Like I said, I can easily solve this with a for loop. However since could be used with a large data set, I'd like to know if there's a good solution without a for loop.
Here is one possible solution, although, I won't call it a fully vectorized solution, rather a one liner one.
out = cellfun(#(i,j) fun(A(i:j)), num2cell(startIndicies), num2cell(stopIndicies) );
or, if you plan to have homogeneous outputs,
out = arrayfun(#(i,j) fun(A(i:j)), startIndicies, stopIndicies);

Concatenate equivalent in MATLAB for a single value

I am trying to use MATLAB in order to generate a variable whose elements are either 0 or 1. I want to define this variable using some kind of concatenation (equivalent of Java string append) so that I can add as many 0's and 1's according to some upper limit.
I can only think of using a for loop to append values to an existing variable. Something like
variable=1;
for i=1:N
if ( i%2==0)
variable = variable.append('0')
else
variable = variable.append('1')
i=i+1;
end
Is there a better way to do this?
In MATLAB, you can almost always avoid a loop by treating arrays in a vectorized way.
The result of pseudo-code you provided can be obtained in a single line as:
variable = mod((1:N),2);
The above line generates a row vector [1,2,...,N] (with the code (1:N), use (1:N)' if you need a column vector) and the mod function (as most MATLAB functions) is applied to each element when it receives an array.
That's not valid Matlab code:
The % indicates the start of a comment, hence introducing a syntax error.
There is no append method (at least not for arrays).
Theres no need to increment the index in a for loop.
Aside of that it's a bad idea to have Matlab "grow" variables, as memory needs to be reallocated at each time, slowing it down considerably. The correct approach is:
variable=zeros(N,1);
for i=1:N
variable(i)=mod(i,2);
end
If you really do want to grow variables (some times it is inevitable) you can use this:
variable=[variable;1];
Use ; for appending rows, use , for appending columns (does the same as vertcat and horzcat). Use cat if you have more than 2 dimensions in your array.

What's the best way to iterate through columns of a matrix?

I want to apply a function to all columns in a matrix with MATLAB. For example, I'd like to be able to call smooth on every column of a matrix, instead of having smooth treat the matrix as a vector (which is the default behaviour if you call smooth(matrix)).
I'm sure there must be a more idiomatic way to do this, but I can't find it, so I've defined a map_column function:
function result = map_column(m, func)
result = m;
for col = 1:size(m,2)
result(:,col) = func(m(:,col));
end
end
which I can call with:
smoothed = map_column(input, #(c) (smooth(c, 9)));
Is there anything wrong with this code? How could I improve it?
The MATLAB "for" statement actually loops over the columns of whatever's supplied - normally, this just results in a sequence of scalars since the vector passed into for (as in your example above) is a row vector. This means that you can rewrite the above code like this:
function result = map_column(m, func)
result = [];
for m_col = m
result = horzcat(result, func(m_col));
end
If func does not return a column vector, then you can add something like
f = func(m_col);
result = horzcat(result, f(:));
to force it into a column.
Your solution is fine.
Note that horizcat exacts a substantial performance penalty for large matrices. It makes the code be O(N^2) instead of O(N). For a 100x10,000 matrix, your implementation takes 2.6s on my machine, the horizcat one takes 64.5s. For a 100x5000 matrix, the horizcat implementation takes 15.7s.
If you wanted, you could generalize your function a little and make it be able to iterate over the final dimension or even over arbitrary dimensions (not just columns).
Maybe you could always transform the matrix with the ' operator and then transform the result back.
smoothed = smooth(input', 9)';
That at least works with the fft function.
A way to cause an implicit loop across the columns of a matrix is to use cellfun. That is, you must first convert the matrix to a cell array, each cell will hold one column. Then call cellfun. For example:
A = randn(10,5);
See that here I've computed the standard deviation for each column.
cellfun(#std,mat2cell(A,size(A,1),ones(1,size(A,2))))
ans =
0.78681 1.1473 0.89789 0.66635 1.3482
Of course, many functions in MATLAB are already set up to work on rows or columns of an array as the user indicates. This is true of std of course, but this is a convenient way to test that cellfun worked successfully.
std(A,[],1)
ans =
0.78681 1.1473 0.89789 0.66635 1.3482
Don't forget to preallocate the result matrix if you are dealing with large matrices. Otherwise your CPU will spend lots of cycles repeatedly re-allocating the matrix every time it adds a new row/column.
If this is a common use-case for your function, it would perhaps be a good idea to make the function iterate through the columns automatically if the input is not a vector.
This doesn't exactly solve your problem but it would simplify the functions' usage. In that case, the output should be a matrix, too.
You can also transform the matrix to one long column by using m(:,:) = m(:). However, it depends on your function if this would make sense.