Matlab binary encoding - matlab

I have a vector containing a series of integers, and what I want to do is take all numbers, convert them into their corresponding binary forms, and concatenate all of the resulting binary values together. Is there any easy way to do this?
e.g. a=[1 2 3 4] --> b=[00000001 00000010 00000011 00000100] --> c=00000001000000100000001100000100

Try:
b = dec2bin(a)

As pointed out by the other answers, the function DEC2BIN is one option that you have to solve this problem. However, as pointed out by this other SO question, it can be a very slow option when converting a large number of values.
For a faster solution, you can instead use the function BITGET as follows:
a = [1 2 3 4]; %# Your array of values
nBits = 8; %# The number of bits to get for each value
nValues = numel(a); %# The number of values in a
c = zeros(1,nValues*nBits); %# Initialize c to an array of zeroes
for iBit = 1:nBits %# Loop over the bits
c(iBit:nBits:end) = bitget(a,nBits-iBit+1); %# Get the bit values
end
The result c will be an array of zeroes and ones. If you want to turn this into a character string, you can use the function CHAR as follows:
c = char(c+48);
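If you would rather avoid the loop, a sketch along the following lines should give the same string. It assumes a holds non-negative integers smaller than 2^nBits; the names bitPos and bits are only illustrative.

```matlab
a = [1 2 3 4];                                 % values to encode
nBits = 8;                                     % bits per value
bitPos = repmat((nBits:-1:1).', 1, numel(a));  % bit positions, MSB first, one column per value
bits = bitget(repmat(a, nBits, 1), bitPos);    % nBits-by-numel(a) matrix of 0s and 1s
c = char(bits(:).' + '0');                     % read column by column, convert to a string
```

Because MATLAB reads bits(:) column by column, the bits of each value stay together and in MSB-first order.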

Yes, use dec2bin, followed by string concatenation.
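A minimal sketch of that idea, assuming every value fits in 8 bits: the second argument of dec2bin pads each row to a fixed width, and reshape on the transposed matrix joins the rows in order.

```matlab
a = [1 2 3 4];
b = dec2bin(a, 8);        % 4-by-8 char matrix, one row per value
c = reshape(b.', 1, []);  % transpose first so the rows are read one after another
```

Without the transpose, reshape would interleave the characters, since it reads the matrix column by column.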

Related

How to perform XOR in a recursive scenario

I have a 1x5 char matrix. I need to perform a bitwise XOR operation on all the elements in the matrix. If T is the char matrix, I need a matrix T' such that
T'(i) = T(i) XOR T'(i-1) for i > 1
T'(1) = T(1)
Let the char matrix be T
T=['0000000000110111' '0000000001000001' '0000000001001010' '0000000010111000' '0000000000101111']
T'=['0000000000110111' '0000000001110110' '0000000000111100' '0000000010000100' '0000000010101011']
That is, leaving the first element as is, I need to XOR each of the other elements with the previous element of the newly formed matrix. I tried the following code but I'm unable to get the correct result.
Yxor1d = [T(1) cellfun(@(a,b) char((a ~= b) + '0'), T(2:end), T'(1:end-1), 'UniformOutput', false)]
I need to perform the XOR operation such that , for obtaining the elements of T'
T' (2)= T(2) XOR T' (1)
T' (3)= T(3) XOR T' (2)
It'll be really helpful to know where I went wrong. Thanks.
cellfun expects cell arrays as input, but you are passing a character array. Placing those 5 strings inside [] concatenates them into a single 1x80 character array rather than keeping them as 5 separate strings.
You probably don't want that. To fix this, all you have to do is make T a cell array by placing {} characters instead of array ([]) characters to declare your characters:
T={'0000000000110111' '0000000001000001' '0000000001001010' '0000000010111000' '0000000000101111'};
Because you have edited your post after I provided my answer, my previous answer using cellfun is now incorrect. Because you are using a recurrence relation where you are referring to the previous output rather than input, you can no longer use cellfun. You'll need to use a for loop. There are probably more elegant ways to do it, but this is the easiest if you want to get something working.
As such, initialize an output cell array that is the same size as the input cell array like above, then you'll need to initialize the first cell to be the first cell of the input, then iterate through each pair of input and output elements yourself.
So do something like this:
Yxor1d = cell(1,numel(T));
Yxor1d{1} = T{1};
for idx = 2 : numel(T)
Yxor1d{idx} = char(double(T{idx} ~= Yxor1d{idx-1}) + '0');
end
For each index i > 1, we XOR the current input T{i} with the previous output T'{i-1}.
Using the above code with your input cell array T, we get:
Yxor1d =
Columns 1 through 3
'0000000000110111' '0000000001110110' '0000000000111100'
Columns 4 through 5
'0000000010000100' '0000000010101011'
This matches with your specifications in your modified post.
Edit: There is a solution without a loop:
T=['0000000000110111';'0000000001000001';'0000000001001010';'0000000010111000' ;'0000000000101111'];
Yxor = dec2bin(bi2de(mod(cumsum(de2bi(bin2dec(T))),2)),16)
Yxor =
0000000000110111
0000000001110110
0000000000111100
0000000010000100
0000000010101011
This uses the fact that you effectively want a cumulative xor operation on the elements of your array.
A chained XOR over N booleans is true exactly when an odd number of them are true. So if you take the cumulative sum of each bit column, the cumulative XOR is 1 wherever that running sum is odd.
The one-liner above can be decomposed as follows:
Y = bin2dec(T) ; %// convert char array T into decimal numbers
Y = de2bi( Y ) ; %// convert decimal array Y into an array of bits
Y = cumsum(Y) ; %// do the cumulative sum on each bit column
Y = mod(Y,2) ; %// convert all "even" numbers to '0', and 'odd' numbers to '1'
Y = bi2de(Y) ; %// re-assemble the bits into decimal numbers
Yxor = dec2bin(Y,16) ; %// get their string representation
Note that if you are happy to handle arrays of bits (boolean) instead of character arrays, you can shave off a few lines from above ;-)
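As a rough sketch of that shortcut: the bits can be read straight off the character array, which skips the decimal round trip (and, as far as I know, the Communications Toolbox functions de2bi and bi2de). The names bits and Ybits are only illustrative.

```matlab
T = ['0000000000110111'; '0000000001000001'; '0000000001001010'; ...
     '0000000010111000'; '0000000000101111'];
bits  = T - '0';                    % 5-by-16 numeric matrix of 0s and 1s
Ybits = mod(cumsum(bits, 1), 2);    % running parity down each column = cumulative XOR
Yxor  = char(Ybits + '0');          % back to a char array, only if needed
```

If the data is already stored as bits, only the middle line is needed.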
Initial answer (simpler to grasp, but with a loop):
You can use the bitxor function, but you have to convert your char array in numeric value first:
T=['0000000000110111';'0000000001000001';'0000000001001010' ;'0000000010111000' ;'0000000000101111'];
Tbin = bin2dec(T) ; %// convert to numeric values
Ybin = Tbin ; %// pre-assign result, then loop ...
for idx = 2 : numel(Tbin)
Ybin(idx) = bitxor( Ybin(idx) , Ybin(idx-1) ) ;
end
Ychar = dec2bin(Ybin,16) %// convert back to 16bit char array representation if necessary
Ychar =
0000000000110111
0000000001110110
0000000000111100
0000000010000100
0000000010101011
edited answer after you redefined your problem

Char array to numeric array in matlab

I have the following matrix A of size 3x2:
A = [12; 34; 56];
But the data is stored as chars. I want to convert it to a numeric array. str2num doesn't work for this. Is there another method to do that?
Well, your array doesn't look like a 3-by-2 array. In any case, you are looking for a casting function:
A = double(A);
should convert your chars to double.
If I understand correctly, you have
A = ['12'; '34'; '56']; %// strings
and want to get
B = [1 2; 3 4; 5 6]; %// numbers
This could be done as follows: convert A to double to produce each character's ASCII code, and then subtract the code of character '0' to obtain the desired numbers. In fact, conversion to double is done implicitly when you subtract chars, so you can just use
B = A-'0';
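If the goal is instead one number per row (12, 34, 56), the digit matrix can be weighted by powers of ten. This is only a sketch, assuming every row is a two-digit decimal string; N is a name introduced here for illustration.

```matlab
A = ['12'; '34'; '56'];   % 3-by-2 char array
B = A - '0';              % digit matrix: [1 2; 3 4; 5 6]
N = B * [10; 1];          % weight tens and units: [12; 34; 56]
```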

Bitwise XOR operation to scramble two character matrices by generating a truth table

I need to perform the XOR operation for four characters, each of which has a bit representation as follows:
A = 00
G = 01
C = 10
T = 11
I need to create a table that XORs two characters together which gives the values for all combinations of XORing pairs of characters in the following way.
XOR A G C T
A A G C T
G G A T C
C C T A G
T T C G A
To obtain the output, you need to convert each character into its bit representation, XOR the bits, then use the result and convert it back into the right character. For example, consulting the third row and second column of the table, by XORing C and G:
C = 10
G = 01
C XOR G = 10 XOR 01 = 11 --> T
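The whole table can be reproduced with a few lines. This is a sketch that assumes the letters are ordered so that position minus one equals the two-bit code (A=0, G=1, C=2, T=3); the names letters and tbl are illustrative.

```matlab
letters = 'AGCT';                    % index-1 is the 2-bit code of each letter
[r, c] = ndgrid(0:3);                % every pair of row code and column code
tbl = letters(bitxor(r, c) + 1);     % 4-by-4 char matrix holding the XOR table
```

Row 3, column 2 of tbl is the C XOR G entry worked out above.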
I would ultimately like to apply this rule to scrambling characters in a 5 x 5 matrix.
As an example:
A = 'GATT' 'AACT' 'ACAC' 'TTGA' 'GGCT'
'GCAC' 'TCAT' 'GTTC' 'GCCT' 'TTTA'
'AACG' 'GTTA' 'ACGT' 'CGTC' 'TGGA'
'CTAC' 'AAAA' 'GGGC' 'CCCT' 'TCGT'
'GTGT' 'GCGG' 'GTTT' 'TTGC' 'ATTA'
B = 'ATAC' 'AAAT' 'AGCT' 'AAGC' 'AAGT'
'TAGG' 'AAGT' 'ATGA' 'AAAG' 'AAGA'
'TAGC' 'CAGT' 'AGAT' 'GAAG' 'TCGA'
'GCTA' 'TTAC' 'GCCA' 'CCCC' 'TTTC'
'CCAA' 'AGGA' 'GCAG' 'CAGC' 'TAAA'
I would like to generate a matrix C such that each element of A gets XORed with its corresponding element in B.
For example, considering the first row and first column:
A{1,1} XOR B{1,1} = GATT XOR ATAC = GTTG
How can I do this for the entire matrix?
Looks like you're back for some more!
First, let's define the function letterXOR, which takes two 4-character strings and XORs them according to the table that you have. Recalling from our previous post, let's set up a lookup table where a unique two-bit string corresponds to a letter. We can use the containers.Map class to help us do this. We will also need the inverse lookup table, again a containers.Map, which produces a two-bit string when given a letter. We use the inverse lookup to convert each letter into its two-bit representation. After that, we XOR the bits individually, then use the forward lookup table to get back to letters. As such:
function [out] = letterXOR(A,B)
codebook = containers.Map({'00','01','10','11'},{'A','G','C','T'}); %// Lookup
invCodebook = containers.Map({'A','G','C','T'},{'00','01','10','11'}); %// Inv-lookup
lettersA = arrayfun(@(x) x, A, 'uni', 0); %// Split up each letter into a cell
lettersB = arrayfun(@(x) x, B, 'uni', 0);
valuesA = values(invCodebook, lettersA); %// Obtain the binary bit strings
valuesB = values(invCodebook, lettersB);
%// Convert each into a matrix
valuesAMatrix = cellfun(@(x) double(x) - 48, valuesA, 'uni', 0);
valuesBMatrix = cellfun(@(x) double(x) - 48, valuesB, 'uni', 0);
%// XOR the bits now
XORedBits = arrayfun(@(x) bitxor(valuesAMatrix{x}, valuesBMatrix{x}), 1:numel(A), 'uni', 0);
%// Convert each bit pair into a string
XORedString = cellfun(@(x) char(x + 48), XORedBits, 'uni', 0);
%// Access lookup, then concatenate as a string
out = cellfun(@(x) codebook(x), XORedString);
Let's go through the above code slowly. The inputs into letterXOR are expected to be character arrays composed of the letters A, T, G and C. We first define the forward and reverse lookups. We then split up each character of the input strings A and B into a cell array of individual characters, as looking up multiple keys in the codebook requires it to be this way. We then figure out what the bits are for each character in each string. These bits are actually strings, so we need to convert each string of bits into an array of numbers. We simply cast the string to double and subtract 48, which is the ASCII code for the character '0'. Casting to double gives either 48 or 49, which is why we subtract 48.
As such, each pair of bits is converted into a 1 x 2 array of bits. We then take each 1 x 2 array of bits between A and B, use bitxor to XOR the bits. The outputs at this point are still 1 x 2 arrays. As such, we need to convert each array into a string of bits, then use our forward lookup table to look up the character equivalent of these bits. After this, we concatenate all of the characters together to make the final string for the output.
Make sure you save the above in a file called letterXOR.m. Once we have this, we simply need one arrayfun call that XORs each four-character string in your cell arrays and produces the final matrix. The input into arrayfun will be a 5 x 5 matrix of column-major linear indices. We do this because MATLAB can access elements in a 2D array using a single value, which is the column-major index of the element in the matrix. We define a vector that goes from 1 to 25, then use reshape to get it into the right 2D form. This ensures that the output matrix (which is C in your example) is structured in the same way as the inputs. As such:
ind = reshape(1:25, 5, 5); %// Define column major indices
C = arrayfun(@(x) letterXOR(A{x},B{x}), ind, 'uni', 0); %// Get our output matrix
Our final output C is:
C =
'GTTG' 'AACA' 'ATCG' 'TTAC' 'GGTA'
'CCGT' 'TCGA' 'GACC' 'GCCC' 'TTCA'
'TATT' 'TTCT' 'ATGA' 'TGTT' 'ATAA'
'TGTC' 'TTAC' 'ATTC' 'AAAG' 'AGCG'
'TGGT' 'GTAG' 'AGTC' 'GTAA' 'TTTA'
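For comparison, here is a more compact sketch of the same idea that maps letters to numeric codes instead of bit strings. It is not the function above, just an alternative under the assumption that the inputs contain only A, G, C and T; the names letters, code and xorStr are illustrative, and only the first two cells of each matrix are used to keep it short.

```matlab
letters = 'AGCT';                        % index-1 is the 2-bit code: A=0, G=1, C=2, T=3
code = zeros(1, 256);                    % lookup from character code to 2-bit code
code(double(letters)) = 0:3;
% XOR two equal-length strings letter by letter, then map back to letters
xorStr = @(s, t) letters(bitxor(code(double(s)), code(double(t))) + 1);
A = {'GATT', 'AACT'};                    % first two cells of the first row of A
B = {'ATAC', 'AAAT'};                    % first two cells of the first row of B
C = cellfun(xorStr, A, B, 'UniformOutput', false);
```

Since cellfun preserves the shape of its inputs, the same call should work unchanged on the full 5 x 5 cell arrays.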
Good luck!

Vectorizing the Notion of Colon (:) - values between two vectors in MATLAB

I have two vectors, idx1 and idx2, and I want to obtain the values between them. If idx1 and idx2 were numbers and not vectors, I could do that the following way:
idx1=1;
idx2=5;
values=idx1:idx2
% Result
% values =
%
% 1 2 3 4 5
But in my case, idx1 and idx2 are vectors of variable length. For example, for length=2:
idx1=[5,9];
idx2=[9 11];
Can I use the colon operator to directly obtain the values in between? This is, something similar to the following:
values = [5 6 7 8 9 9 10 11]
I know I can do idx1(1):idx2(1) and idx1(2):idx2(2), this is, extract the values for each column separately, so if there is no other solution, I can do this with a for-loop, but maybe Matlab can do this more easily.
Your sample output is not legal. A matrix cannot have rows of different length. What you can do is create a cell array using arrayfun:
values = arrayfun(@colon, idx1, idx2, 'Uniform', false)
To convert the resulting cell array into a vector, you can use cell2mat:
values = cell2mat(values);
Alternatively, if all vectors in the resulting cell array have the same length, you can construct an output matrix as follows:
values = vertcat(values{:});
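Try taking the union of the sets. Given the values of idx1 and idx2 you supplied, run
values = union(idx1(1):idx2(1), idx1(2):idx2(2));
which will yield the vector [5 6 7 8 9 10 11]. Note that union sorts the values and removes duplicates, so 9 appears only once here, unlike in the output shown in the question.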
I couldn't get @Eitan's solution to work; apparently you need to specify parameters to colon. The small modification that follows got it working on my R2010b version:
step = 1;
idx1 = [5, 9];
idx2 = [9, 11];
values = arrayfun(@(x,y)colon(x, step, y), idx1, idx2, 'UniformOutput', false);
values=vertcat(cell2mat(values));
Note that step = 1 is actually the default value in colon, and Uniform can be used in place of UniformOutput, but I've included these for the sake of completeness.
There is a great blog post by Loren called Vectorizing the Notion of Colon (:). It includes an answer that is about 5 times faster (for large arrays) than using arrayfun or a for-loop and is similar to run-length-decoding:
The idea is to expand the colon sequences out. I know the lengths of
each sequence so I know the starting points in the output array. Fill
the values after the start values with 1s. Then I figure out how much
to jump from the end of one sequence to the beginning of the next one.
If there are repeated start values, the jumps might be negative. Once
this array is filled, the output is simply the cumulative sum or
cumsum of the sequence.
function x = coloncatrld(start, stop)
% COLONCAT Concatenate colon expressions
% X = COLONCAT(START,STOP) returns a vector containing the values
% [START(1):STOP(1) START(2):STOP(2) START(END):STOP(END)].
% Based on Peter Acklam's code for run length decoding.
len = stop - start + 1;
% keep only sequences whose length is positive
pos = len > 0;
start = start(pos);
stop = stop(pos);
len = len(pos);
if isempty(len)
x = [];
return;
end
% expand out the colon expressions
endlocs = cumsum(len);
incr = ones(1, endlocs(end));
jumps = start(2:end) - stop(1:end-1);
incr(endlocs(1:end-1)+1) = jumps;
incr(1) = start(1);
x = cumsum(incr);
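To make the trick concrete, here is the core of that function traced on the vectors from the question. It is a sketch that assumes every sequence has positive length, so the filtering step is left out.

```matlab
start = [5 9];
stop  = [9 11];
len     = stop - start + 1;            % [5 3], length of each sequence
endlocs = cumsum(len);                 % [5 8], where each sequence ends in the output
incr    = ones(1, endlocs(end));       % default step of 1 everywhere
incr(endlocs(1:end-1) + 1) = start(2:end) - stop(1:end-1);  % jump of 0 from 9 to 9
incr(1) = start(1);                    % incr is now [5 1 1 1 1 0 1 1]
x = cumsum(incr);                      % [5 6 7 8 9 9 10 11]
```

The zero in incr is what repeats the 9 at the boundary between the two sequences.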

Get string index into matrix

I have the following string in matlab
V= 'abcdefghijklmnñopqrstuvwxyz';
Then I have a word of 9 characters consisting of chars from my 'V' alphabet.
k = 'peligroso';
I want to create a square matrix (3x3) with the indices of my word 'k' according to my alphabet, this would be the output. (Note that the range I'm considering is 0 to 26, so 'a' char does have index 0)
16 4 11
8 6 18
15 19 15
My code for doing this is:
K = [findstr(V, k(1))-1 findstr(V, k(2))-1 findstr(V, k(3))-1;findstr(V, k(4))-1 findstr(V, k(5))-1 findstr(V, k(6))-1; findstr(V, k(7))-1 findstr(V, k(8))-1 findstr(V, k(9))-1];
But I think there must be a more elegant solution to achieve the same, any ideas?
PS: I'm not using ASCII values since char 'ñ' must be inside my alphabet
For a loop-free solution, you can use ISMEMBER, which works on strings as well as on numbers:
K = zeros(3); %# create 3x3 array of zeros
[~,K(:)] = ismember(k,V); %# fill in indices
K = K'-1; %# make K conform to expected output
Since strings are just arrays of characters, it is easy to manipulate them using the usual array-processing functions.
For example, we can use arrayfun to create a new array by applying the specified function, which produces an output array of the same size. Using reshape we can form the desired 3x3 shape. Note that we transpose at the end since MATLAB's reshape handles arrays in column-major order.
K = reshape(arrayfun(@(x) findstr(V, x)-1, k), 3,3)'
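To see why the transpose is needed, here is a small illustration with plain numbers; v, M and Mt are names made up for this example.

```matlab
v  = 1:9;
M  = reshape(v, 3, 3);     % filled column by column: [1 4 7; 2 5 8; 3 6 9]
Mt = reshape(v, 3, 3).';   % transposing gives row-by-row order: [1 2 3; 4 5 6; 7 8 9]
```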
Alternatively, since MATLAB lets you index matrices using a single index, which reads the entries of the matrix in column major order, we can construct an empty matrix and build its entries up one-by-one.
K = zeros(3,3)
for i=1:9
K(i) = findstr(V, k(i))-1;
end
K = K'
I am fond of @Jonas' solution (ismember); I think it's the most elegant way to go here.
But, just to provide another solution:
V = 'abcdefghijklmnñopqrstuvwxyz';
k = 'peligroso';
K = reshape( bsxfun(@eq, (k-0).', V-0) * (1:numel(V)).', 3,3).'
The advantage of this would be that it uses built-in functions exclusively (ismember is not built-in, at least not on my Matlab R2010b). This means that this solution might be faster than ismember, but you'll have to test whether that is actually true and, if it is, you should have cases complex and large enough to justify losing the readability of ismember.
Note that indices in Matlab are 1-based, meaning that V(1) = a. The solution above produces a 1-based answer, while you provide a 0-based example. Just subtract 1 from the line above if you really need 0-based indices.