Matlab code to split a random data - matlab

In Matlab, how can I split random data into two matrices? For example, X(i) is a random vector with i=1:100, and every data symbol is formed from four bits, where x(1) and x(2) are the MSBs (most significant bits) and x(3) and x(4) are the LSBs (least significant bits). I want to split them into two new matrices, y1 (for the MSBs) and y2 (for the LSBs).
EDIT
Here is some example code but for some reason it does not seem to work,
M=16;
N=10;
c=randi([0 M-1],1,N);
xx=dec2bin(c);
for k = 1:N-1
for j= 1:4
y1(k)=xx(k);
y1(k+1)=xx(k+1);
y2(k+2)= xx(k+2);
y2(k+3)= xx(k+3);
end
end

The code you wrote has some issues. I think some of the problems are based on a misunderstanding of Matlab. Here is a short list:
1) dec2bin returns a char array, not a string object (Matlab had no string class at all before R2016b). Further, Matlab does not use pointers or references the way Java or C++ do. This means you cannot have a vector of char arrays as you have there, so y1 and y2 must be a Matlab cell or a matrix to store the data.
2) If c is a vector, then xx will be a char matrix with size(xx) == [length(c), length(dec2bin(max(c)))]. That is, each string of binary values is a row, and every row is exactly wide enough for the largest value to fit, e.g.
a = [13,257];
b = dec2bin(a);
where b is a 2x9 matrix since 257 needs nine bits. So, to your problem: I will vectorize the solution and also use the extra argument in dec2bin to lock the number of bits to 4. Try this:
function [msBit, lsBit] = test()
M=16;
N=10;
if M>16
error('M must not be greater than 4 bits');
end
c=randi([0 M-1],1,N);
xx = dec2bin(c,4);
disp xx
disp(xx)
disp ' '
% Take the 2 most significant bits from every row
msBit = xx(:,1:2);
% Take the 2 least significant bits from every row
lsBit = xx(:,3:4);
disp msb
disp(msBit);
disp ' '
disp lsb
disp(lsBit);
It is of course possible to work with int8 as well, but then we need to use the bitwise operation functions. This is more difficult and the result will of course be an integer, so 1100 will be represented by 12. This does not seem to be what you are after though, so I will not do this here.
Hope it works and good luck!
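For completeness, the integer route mentioned above can be sketched with bitshift and bitand. This is only an illustration, assuming the symbols are stored as uint8 values in the range 0 to 15. The names c, msb and lsb are made up for the example and the input values are fixed rather than random so the result is easy to check by hand.

```matlab
% Assumed input: four 4-bit symbols stored as unsigned integers
c = uint8([12 5 3 10]);        % binary 1100, 0101, 0011, 1010

% Shift right by two bits to keep only the two most significant bits
msb = bitshift(c, -2);

% Mask with binary 0011 to keep only the two least significant bits
lsb = bitand(c, uint8(3));
```

Here each result is a number from 0 to 3 rather than a pair of '0'/'1' characters, which is the trade-off described in the answer.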

Related

Draw non full matrix of random numbers

I am doing a Monte-Carlo simulation, where each repetition requires the sum or product of a random number of random variables. My problem is how to do this efficiently as the entire simulation should be as vectorized as possible.
For example, say we want to take the sum of 5, 10 and 3 random numbers, represented by the vector len = [5;10;3]. Then what I am currently doing is drawing a full matrix of random numbers:
A = randn(length(len),max(len));
Creating a mask of the non-needed numbers:
lenlen = repmat(len,1,max(len));
idx = repmat(1:max(len),length(len),1);
mask = idx>lenlen;
and then I can "pad" the matrix. Since I am interested in the sum, the padding has to be zero (for the product, the padding would have to be 1):
A(mask)=0;
To obtain:
A =
1.7708 -1.4609 -1.5637 -0.0340 0.9796 0 0 0 0 0
1.8034 -1.5467 0.3938 0.8777 0.6813 1.0594 -0.3469 1.7472 -0.4697 -0.3635
1.5937 -0.1170 1.5629 0 0 0 0 0 0 0
Whereafter I can sum them together
B = sum(A,2);
However, I find it rather wasteful that I have to draw too many random numbers and then throw them away. In the real case, I need on the order of hundreds of thousands of repetitions and the vector len might vary a lot, i.e. it can easily happen that I have to draw two or three times as many random numbers as are actually needed.
You can generate the exact amount of random numbers required, create a grouping variable with repelem, and compute the sum of each group using accumarray:
len = [5; 10; 3];
B = accumarray(repelem(1:numel(len), len).', randn(sum(len),1));
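The question also mentions products. A minimal sketch of the same grouping idea for that case, assuming accumarray's optional function-handle argument is acceptable here; the fixed vector vals stands in for the random draws only so the result can be checked by hand, and the names g, S and P are illustrative.

```matlab
len  = [2; 3; 1];
vals = [1; 2; 3; 4; 5; 6];               % stand-in for randn(sum(len),1)

% Group label for every value: 1 1 2 2 2 3
g = repelem((1:numel(len)).', len);

S = accumarray(g, vals);                 % per-group sums (default behaviour)
P = accumarray(g, vals, [], @prod);      % per-group products
```

No padding value is needed in either case because only the required numbers are ever generated.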
You could just use arrayfun or a loop. You say "efficient" and "vectorized" in the same breath, but they are not necessarily the same thing - since the new(ish) JIT compiler, loops are pretty fast in MATLAB. arrayfun is basically a loop in disguise, but means you could create B like so:
len = [5;10;3];
B = arrayfun( @(x) sum( randn(x,1) ), len );
For each element in len, this creates a vector of length len(i) and takes the sum. The output is an array with one value for each value in len.
This will certainly be a lot more memory friendly for large values and largely different values within len. It may therefore be quicker, your mileage may vary but it cuts out a lot of the operations you're doing.
You mention wanting to take the product sometimes, in which case use prod in place of sum.
Edit: rough and ready benchmark to compare arrayfun and a loop...
len = randi([1e3, 1e7], 100, 1);
tic;
B = arrayfun( @(x) sum( randn(x,1) ), len );
toc % ~8.77 seconds
tic;
out=zeros(size(len));
for ii = 1:numel(len)
out(ii) = sum(randn(len(ii),1));
end
toc % ~8.80 seconds
The "advantage" of the loop over arrayfun is that you can pre-generate all of the random numbers in one go, then index. This isn't necessarily quicker because you're addressing much bigger chunks of memory, and the call to randn is the main bottleneck anyway!
tic;
out = zeros(size(len));
rnd = randn(sum(len),1);
idx = [0; cumsum(len)]; % note: cumsum is very quick (~0.001sec here) so negligible
for ii = 1:numel(len)
out(ii) = sum(rnd(idx(ii)+1:idx(ii+1)),1);
end
toc % ~10.2 sec! Slower because of massive call to randn and the indexing into large array.
As stated at the top, arrayfun and looping are basically the same under the hood, so no reason to expect a big time difference.
The sum of multiple random numbers drawn from a specific distribution is also a random number with a (different) specific distribution. Therefore you can just cut the middleman and draw directly from the latter distribution.
In your case you are summing 5, 10 and 3 numbers drawn from a N(0,1) distribution. As explained here, the resulting distributions therefore are N(0,5), N(0,10) and N(0,3), where the second parameter is the variance. This page explains how you can draw from non-standard normal distributions in Matlab. As such, we can in this case generate those numbers with randn(3,1).*sqrt([5; 10; 3]).
In case you would want 1000 triples, you could then use
randn(3,1000).*sqrt([5; 10; 3])
or, before Matlab R2016b (which introduced implicit expansion),
bsxfun(@times, randn(3,1000), sqrt([5; 10; 3]))
which is of course very fast.
Different distributions have different summation rules, but as long as you are not summing up numbers drawn from different distributions the rules are usually quite simple and found quickly with google.
You can do this using a combination of cumsum and diff. The plan is:
Create all the random numbers in a single call to randn up front
Then, use cumsum to produce a vector of cumulative summations
Use cumsum on the list of number-of-samples-per-result to work out where to read out the results
We also need diff to correct for the prior summations.
Note that this method might lose accuracy if you weren't using randn for the random samples, as cumsum would then build up arithmetic rounding errors.
% We want 100 sums of random numbers
numSamples = 100;
% Here's where we define how many random samples contribute to each sum
numRandsPerSample = randi(5, 1, numSamples);
% Let's make all the random numbers in one call
allRands = randn(1, sum(numRandsPerSample));
% Use CUMSUM to build up a cumulative sum of the whole of allRands. We also
% need a leading 0 for the first sum.
allRandsCS = [0, cumsum(allRands)];
% Use CUMSUM again to pick out the places we need to pick from
% allRandsCS
endIdxs = 1 + [0, cumsum(numRandsPerSample)];
% Use DIFF to subtract the prior sums from the result.
result = diff(allRandsCS(endIdxs))

Optimize nested for loop for calculating xcorr of matrix rows

I have 2 nested loops which do the following:
Get two rows of a matrix
Check if indices meet a condition or not
If they do: calculate xcorr between the two rows and put it into new vector
Find the index of the maximum value of sub vector and replace element of LAG matrix with this value
I don't know how I can speed this code up, by vectorizing or otherwise.
b=size(data,1);
F=size(data,2);
LAG= zeros(b,b);
for i=1:b
for j=1:b
if j>i
x=data(i,:);
y=data(j,:);
d=xcorr(x,y);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(i,j)=I-1;
d=xcorr(y,x);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(j,i)=I-1;
end
end
end
First, a note on floating point precision...
You mention in a comment that your data contains the integers 0, 1, and 2. You would therefore expect a cross-correlation to give integer results. However, since the calculation is being done in double-precision, there appears to be some floating-point error introduced. This error can cause the results to be ever so slightly larger or smaller than integer values.
Since your calculations involve looking for the location of the maxima, then you could get slightly different results if there are repeated maximal integer values with added precision errors. For example, let's say you expect the value 10 to be the maximum and appear in indices 2 and 4 of a vector d. You might calculate d one way and get d(2) = 10 and d(4) = 10.00000000000001, with some added precision error. The maximum would therefore be located in index 4. If you use a different method to calculate d, you might get d(2) = 10 and d(4) = 9.99999999999999, with the error going in the opposite direction, causing the maximum to be located in index 2.
The solution? Round your cross-correlation data first:
d = round(xcorr(x, y));
This will eliminate the floating-point errors and give you the integer results you expect.
Now, on to the actual solutions...
Solution 1: Non-loop option
You can pass a matrix to xcorr and it will perform the cross-correlation for every pairwise combination of columns. Using this, you can forego your loops altogether like so:
d = round(xcorr(data.'));
[~, I] = max(d(F:(2*F)-1,:), [], 1);
LAG = reshape(I-1, b, b).';
Solution 2: Improved loop option
There are limits to how large data can be for the above solution, since it will produce large intermediate and output variables that can exceed the maximum array size available. In such a case for loops may be unavoidable, but you can improve upon the for-loop solution above. Specifically, you can compute the cross-correlation once for a pair (x, y), then just flip the result for the pair (y, x):
% Loop over rows:
for row = 1:b
% Loop over upper matrix triangle:
for col = (row+1):b
% Cross-correlation for upper triangle:
d = round(xcorr(data(row, :), data(col, :)));
[~, I] = max(d(:, F:(2*F)-1));
LAG(row, col) = I-1;
% Cross-correlation for lower triangle:
d = fliplr(d);
[~, I] = max(d(:, F:(2*F)-1));
LAG(col, row) = I-1;
end
end
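The flip trick in Solution 2 can be checked on a small example without the Signal Processing Toolbox. The sketch below assumes real-valued, equal-length row vectors and uses conv(x, fliplr(y)) as a stand-in for xcorr(x, y); this substitution is an assumption made for illustration, not part of the answer's code.

```matlab
% Small integer-valued rows, as in the question's data (values 0, 1, 2)
x = [0 1 2 1];
y = [2 0 1 1];

% Cross-correlation over all lags, written as a convolution with a
% reversed second argument (valid for real, equal-length vectors)
dxy = conv(x, fliplr(y));   % plays the role of xcorr(x, y)
dyx = conv(y, fliplr(x));   % plays the role of xcorr(y, x)

% Swapping the arguments mirrors the lag axis
sameAfterFlip = isequal(dyx, fliplr(dxy));
```

Because the inputs are small integers, both results are exact here, so isequal is a fair comparison. With real xcorr output the rounding step described above still applies.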

Convert a decimal number that is not integer to base 4 in Matlab?

Is there a way to convert a decimal number between $0$ and $1$ that is not integer to base 4 in Matlab? E.g. if I put 2/5 I want to get 0.12121212... (with some approximation I guess)
The function dec2base only works for integers.
Listed in this post is a vectorized approach that works through all possible combinations of digits to select the best one for the final output as a string. Please note that because of its very nature of creating all possible combinations, it would be memory intensive and slower than a recursive approach, but I guess it could be used just for fun or educational purposes!
Here's the function implementation -
function s = dec2base_float(d,b,nde)
%DEC2BASE_FLOAT Convert floating point numbers to base B string.
% DEC2BASE_FLOAT(D,B) returns the representation of D as a string in
% base B. D must be a floating point array between 0 and 1.
%
% DEC2BASE_FLOAT(D,B,N) produces a representation with at least N decimal digits.
%
% Examples
% dec2base_float(2/5,4,4) returns '0.1212'
% dec2base_float(2/5,3,6) returns '0.101211'
%// Get "base power-ed scaled" digits
scale = b.^(-1:-1:-nde);
%// Calculate all possible combinations
P = dec2base(0:b^nde-1,b,nde)-'0';
%// Get the best possible combination ID. Index into P with it and thus get
%// based converted number with it
[~,idx] = min(abs(P*scale(:) - d));
s = ['0.',num2str(P(idx,:),'%0.f')];
return;
Sample runs -
>> dec2base_float(2/5,4,4)
ans =
0.1212
>> dec2base_float(2/5,4,6)
ans =
0.121212
>> dec2base_float(2/5,3,6)
ans =
0.101211
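For comparison, the digit-by-digit alternative alluded to above can be sketched as a short loop: multiply the fraction by the base, take the integer part as the next digit, and repeat. This is a different technique from the function above. It truncates rather than picking the closest representation, and this version assumes 0 <= d < 1 and a base of at most 10. The names r, dig and s are illustrative.

```matlab
d = 2/5;      % number to convert, assumed to lie in [0, 1)
b = 4;        % target base, assumed to be at most 10
n = 6;        % number of fractional digits to produce

dig = zeros(1, n);
r = d;
for k = 1:n
    r = r * b;             % shift the next digit in front of the point
    dig(k) = floor(r);     % that digit is the integer part
    r = r - dig(k);        % keep the remaining fraction
end

% Turn the numeric digits 0..9 into characters
s = ['0.', char(dig + '0')];
```

Memory use is proportional to n here, whereas the all-combinations approach grows as b^n.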

Extract data from multidimentional array into 2 dims based on index

I have a huge (1000000x100x7) matrix and I need to create a (1000000x100x1) matrix based on an index vector (100x1) which holds 1, 2, 3, 4, 5, 6 or 7 for each location.
I do not want to use loops.
The problem (I think)
First, let me try create a minimum working example that I think captures what you want to do. You have a matrix A and an index vector index:
A = rand(1000000, 100, 7);
index = randi(7, [100, 1]);
And you would like to do something like this:
[I,J,K] = size(A);
B = zeros(I,J);
for i=1:I
for j=1:J
B(i,j) = A(i,j,index(j));
end
end
Only you'd like to do so without the loops.
Linear indexing
One way to do this is by using linear indexing. This is kinda a tricky thing that depends on how the matrix is laid out in memory, and I'm gonna do a really terrible job explaining it, but you can also check out the documentation for the sub2ind and ind2sub functions.
Anyways, it means that given your (1,000,000 x 100 x 7) matrix stored in column-major format, you can refer to the same element in many different ways, i.e.:
A(i, j, k)
A(i, j + 100*(k-1))
A(i + 1000000*(j-1 + 100*(k-1)))
all refer to the same element of the matrix. Anyways, the punchline is:
linear_index = (1:J)' + J*(index-1);
B_noloop = A(:, linear_index);
And of course we should verify that this produces the same answer:
>> isequal(B, B_noloop)
ans =
1
Yay!
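The same selection can also be written with sub2ind, which was mentioned above. This is a sketch on a deliberately small array so it runs instantly; the sizes, the fixed index vector and the helper names rowIdx, colIdx and pageIdx are assumptions for illustration. For the full-size problem it builds three index arrays as large as the output, so it is likely to use more memory than the column-index version.

```matlab
A = rand(4, 3, 7);            % small stand-in for the 1000000x100x7 array
index = [2; 7; 1];            % one page number per column of A

[I, J, ~] = size(A);

% Row and column subscripts for every element of the I-by-J output
[rowIdx, colIdx] = ndgrid(1:I, 1:J);

% Page subscript: column j always reads from page index(j)
pageIdx = repmat(index.', I, 1);

% Convert the three subscripts to linear indices and read all at once
B_sub2ind = A(sub2ind(size(A), rowIdx, colIdx, pageIdx));
```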
Performance vs. readability
So testing this on my computer, the nested loops took 5.37 seconds and the no-loop version took 0.29 seconds. However, it's kinda hard to tell what's going on in that code. Perhaps a more reasonable compromise would be:
B_oneloop = zeros(I,J);
for j=1:J
B_oneloop(:,j) = A(:,j,index(j));
end
which vectorizes the longest dimension of the matrix and thus gets most of the way there (0.43 seconds), but maintains the readability of the original code.

matlab Pythagorean Theorem without using for

I am doing a Matlab homework assignment and I solved the following problem. The grader says it is a correct answer, but I used for in the program and we haven't covered it yet in the course. Can someone suggest a program without for or if?
Write a function called pitty that takes a matrix called ab as an input argument. The matrix ab has exactly two columns. The function should return a column vector c that contains positive values each of which satisfies the Pythagorean Theorem, a^2 + b^2 = c^2, for the corresponding row of ab assuming that the two elements on each row of ab correspond to one pair, a and b, respectively, in the theorem. Note that the built-in MATLAB function sqrt computes the square root and you are allowed to use it.
my code
function c = pitty(ab)
[n , m] = size(ab)
for i = 1:n
c(i) = sqrt(ab(i,1)^2 + ab(i,2)^2)
end
c = c'
end
You can square each element of the matrix with the .^2 operator, then sum along each row with sum(...,2), and finally take the square root.
ab = [1,2;3,4;5,6]
c = sqrt(sum(ab.^2,2));
No for needed for that.
MATLAB has a function for this called hypot, short for hypotenuse. The main reason it exists is that it takes care of overflow (and underflow) problems. If the input values are too large (or small), their squares (or the sum of their squares) can be larger (smaller) than the largest (smallest) representable floating-point value, even though the corresponding c value is representable. In your case you can use it like this:
c=hypot(ab(:,1), ab(:,2));
Cleve Moler, one of the founders of MathWorks and original author of MATLAB, tells the story behind hypot in this article.
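To see why overflow and underflow can be avoided at all, here is a rough sketch of one common rescaling idea: factor out the larger of the two magnitudes so that only a ratio between 0 and 1 gets squared. This is an illustration under that assumption and is not claimed to be how hypot is implemented internally. The names big, small and ratio are made up for the example.

```matlab
ab = [1e154 1e154; 3e-200 4e-200; 3 4];

a = abs(ab(:,1));
b = abs(ab(:,2));

big   = max(a, b);            % larger magnitude in each row
small = min(a, b);            % smaller magnitude in each row

ratio = small ./ big;         % always between 0 and 1 when big > 0
ratio(big == 0) = 0;          % avoid 0/0 when both entries are zero

% sqrt(a^2 + b^2) rewritten as big * sqrt(1 + (small/big)^2)
c = big .* sqrt(1 + ratio.^2);
```

For everyday use hypot remains the simpler choice.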
I'd recommend hypot as in Mohsen's answer.
Just for some variety, here's another approach, using complex numbers. This approach avoids overflow and underflow, just like hypot does:
abs(ab*[1; 1j])
Examples (taken from Cleve Moler's post):
>> ab = [1e154 1e154]; %// LARGE VALUES: possible overflow
>> sqrt(sum(ab.^2,2))
ans =
Inf %// overflow
>> hypot(ab(:,1), ab(:,2))
ans =
1.414213562373095e+154 %// correct result
>> abs(ab*[1; 1j])
ans =
1.414213562373095e+154 %// correct result
>> ab = [3e-200 4e-200]; %// SMALL VALUES: possible underflow
>> sqrt(sum(ab.^2,2))
ans =
0 %// underflow
>> hypot(ab(:,1), ab(:,2))
ans =
5.000000000000000e-200 %// correct result
>> abs(ab*[1; 1j])
ans =
5.000000000000000e-200 %// correct result