How to perform XOR in a recursive scenario - matlab

I have a 1x5 char matrix and need to perform a bitwise XOR operation across its elements. If T is the char matrix, I need a matrix T' such that
T'(i) = T(i) XOR T'(i-1) for i > 1
T'(1) = T(1)
Let the char matrix be T
T=['0000000000110111' '0000000001000001' '0000000001001010' '0000000010111000' '0000000000101111']
T'=['0000000000110111' '0000000001110110' '0000000000111100' '0000000010000100' '0000000010101011']
That is, leaving the first element as is, I need to XOR each remaining element with the previous element of the newly formed matrix. I tried the following code, but I'm unable to get the correct result.
Yxor1d = [T(1) cellfun(@(a,b) char((a ~= b) + '0'), T(2:end), T'(1:end-1), 'UniformOutput', false)]
To obtain the elements of T', the XOR operation must be applied as follows:
T'(2) = T(2) XOR T'(1)
T'(3) = T(3) XOR T'(2)
It would be really helpful to know where I went wrong. Thanks.

cellfun expects a cell array as its input, but you are passing a character array. Writing the five strings inside square brackets concatenates them into a single 1x80 character array rather than keeping them as five separate strings.
You probably don't want that. To fix this, declare T as a cell array by using curly braces ({}) instead of square brackets ([]):
T={'0000000000110111' '0000000001000001' '0000000001001010' '0000000010111000' '0000000000101111'};
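The difference between the two bracket styles can be checked directly. This is a minimal sketch using two of the strings from the question; the names Tchar and Tcell are illustrative only.

```matlab
% Square brackets concatenate char rows into one long char array.
Tchar = ['0000000000110111' '0000000001000001'];
% Curly braces keep each string as its own cell.
Tcell = {'0000000000110111' '0000000001000001'};

disp(size(Tchar))    % 1 32
disp(numel(Tcell))   % 2
disp(class(Tcell))   % cell
```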
Because you edited your post after I provided my answer, my previous cellfun-based answer is no longer correct. You now have a recurrence relation in which each output depends on the previous output rather than the previous input, so cellfun no longer applies and you'll need a for loop. There are probably more elegant ways to do it, but this is the easiest if you want to get something working.
Initialize an output cell array the same size as the input cell array, set its first cell to the first cell of the input, then iterate over the remaining elements, combining each input with the previous output.
So do something like this:
Yxor1d = cell(1,numel(T));
Yxor1d{1} = T{1};
for idx = 2 : numel(T)
    Yxor1d{idx} = char(double(T{idx} ~= Yxor1d{idx-1}) + '0');
end
For each index i, we XOR the current input T{i} with the previous output T'{i-1} to obtain T'{i}.
Using the above with your input cell array T, we get:
Yxor1d =
Columns 1 through 3
'0000000000110111' '0000000001110110' '0000000000111100'
Columns 4 through 5
'0000000010000100' '0000000010101011'
This matches the specification in your modified post.

Edit: There is a solution without a loop:
T=['0000000000110111';'0000000001000001';'0000000001001010';'0000000010111000' ;'0000000000101111'];
Yxor = dec2bin(bi2de(mod(cumsum(de2bi(bin2dec(T))),2)),16)
Yxor =
0000000000110111
0000000001110110
0000000000111100
0000000010000100
0000000010101011
This uses the fact that you effectively want a cumulative XOR over the elements of your array. Note that de2bi and bi2de are part of the Communications Toolbox.
The XOR of N booleans is true exactly when an odd number of them are true. So if you take a cumulative sum down each bit column, the cumulative XOR is 1 wherever that sum is odd.
The one-liner above can be decomposed like this:
Y = bin2dec(T) ; %// convert char array T into decimal numbers
Y = de2bi( Y ) ; %// convert the decimal numbers in Y into a matrix of bits (one row per number)
Y = cumsum(Y) ; %// do the cumulative sum on each bit column
Y = mod(Y,2) ; %// convert all "even" numbers to '0', and 'odd' numbers to '1'
Y = bi2de(Y) ; %// re-assemble the bits into decimal numbers
Yxor = dec2bin(Y,16) ; %// get their string representation
Note that if you are happy to handle arrays of bits (boolean) instead of character arrays, you can shave off a few lines from above ;-)
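As a sketch of that bit-array idea, the same cumulative XOR can be computed without de2bi and bi2de by working on the characters directly. The variable names bits and running are illustrative, and this is an alternative formulation rather than the answer's own code.

```matlab
T = ['0000000000110111'; '0000000001000001'; '0000000001001010'; ...
     '0000000010111000'; '0000000000101111'];

bits = T - '0';                       % 5x16 numeric matrix of 0s and 1s
running = mod(cumsum(bits, 1), 2);    % parity of the running sum = cumulative XOR
Yxor = char(running + '0');           % back to a 5x16 char array
disp(Yxor)
```

Because each bit column is handled independently, this works for any word length without specifying the 16 explicitly.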
Initial answer (simpler to grasp, but with a loop):
You can use the bitxor function, but you have to convert your char array into numeric values first:
T=['0000000000110111';'0000000001000001';'0000000001001010' ;'0000000010111000' ;'0000000000101111'];
Tbin = bin2dec(T) ; %// convert to numeric values
Ybin = Tbin ; %// pre-assign result, then loop ...
for idx = 2 : numel(Tbin)
    Ybin(idx) = bitxor( Ybin(idx) , Ybin(idx-1) ) ;
end
Ychar = dec2bin(Ybin,16) %// convert back to 16bit char array representation if necessary
Ychar =
0000000000110111
0000000001110110
0000000000111100
0000000010000100
0000000010101011
edited answer after you redefined your problem

Related

How to find a matching element (either number or string) in a multi level cell?

I am trying to search a cell of cell arrays for a matching number (for example, 2) or string ('text'). Example for a cell:
A = {1 {2; 3};4 {5 'text' 7;8 9 10}};
There is a similar question. However, its solution works only if you want to find a numeric value in the cell. I need a solution that works for both numbers and strings.
The output should be 1 or 0 (whether or not the value is in the cell A), along with the cell level (depth) at which the matching element was found.
For your example input, you can match character vectors as well as numbers by replacing ismember in the linked solution with isequal. You can get the depth at which the search value was found by tracking how many times the function has to go round the while loop.
function [isPresent, depth] = is_in_cell(cellArray, value)
    depth = 1;
    f = @(c) isequal(value, c);
    cellIndex = cellfun(@iscell, cellArray);
    isPresent = any(cellfun(f, cellArray(~cellIndex)));
    while ~isPresent
        depth = depth + 1;
        cellArray = [cellArray{cellIndex}];
        cellIndex = cellfun(@iscell, cellArray);
        isPresent = any(cellfun(f, cellArray(~cellIndex)));
        if ~any(cellIndex)
            break
        end
    end
end
Using isequal works because f is only called for elements of cellArray that are not themselves cell arrays. Use isequaln if you want to be able to search for NaN values.
Note this now won't search inside numeric, logical or string arrays:
>> A = {1 {2; 3};4 {5 'text' 7;8 9 [10 11 12]}};
>> is_in_cell(A, 10)
ans =
logical
0
If you want that, you can define f as
f = @(c) isequal(value, c) || isequal(class(value), class(c)) && ismember(value, c);
which avoids calling ismember with incompatible data types, because of the 'short-circuiting' behaviour of || and &&. This last solution is still a bit inconsistent in how it matches strings with character vectors, just in case that's important to you - see if you can figure out how to fix that.
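A recursive variant is sketched below as an alternative; it is not the answer's method. The name find_in_cell and the convention of returning depth 0 when nothing is found are assumptions made for illustration. It visits nested cells one at a time, so it does not rely on the nested cells having sizes that can be concatenated.

```matlab
function [isPresent, depth] = find_in_cell(c, value)
% Hypothetical helper: search cell array c (nested to any level) for value.
% depth is the shallowest level at which value occurs, or 0 if it is absent.
    isPresent = false;
    depth = 0;
    for k = 1:numel(c)
        item = c{k};
        if iscell(item)
            % Search the nested cell; a hit there is one level deeper.
            [found, d] = find_in_cell(item, value);
            if found && (~isPresent || d + 1 < depth)
                isPresent = true;
                depth = d + 1;
            end
        elseif isequal(item, value)
            % A hit at this level cannot be beaten by a deeper one.
            isPresent = true;
            depth = 1;
            return
        end
    end
end
```

With A = {1 {2; 3};4 {5 'text' 7;8 9 10}}, calling find_in_cell(A, 'text') should report a match at depth 2, and find_in_cell(A, 4) a match at depth 1.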

Adding each element in a vector yields no number

I have a vector,
a2 = [8 10 18 18]
I want to add all individual digits in this vector, i.e.
8 + 1+0 + 1+8 + 1+8 = 27
I decided to use the following piece of code:
a3 = num2str(a2)
sum2 = 0;
for k = 1:numel(a3)
    sum2 = sum2 + str2num(a3(k));
end
sum2
However, when I output this I get sum2 = []. What exactly is going wrong here? Apparently, a3 has 13 elements, which means the spaces must be 2 elements wide. Does the issue lie there?
Recommended Solution:
Use num2str, cellstr, str2double, and sum with the omitnan flag.
req = num2str(a2);
req = sum(str2double(cellstr(req(:))),'omitnan');
num2str converts the given matrix a2 into a character array. req(:) reshapes the character array req into a column vector, which still contains spaces. cellstr converts the column character array into a cell array of one-character strings so that str2double can be applied to each element. str2double returns NaN for the entries that were spaces and the corresponding double for each digit character. sum with the omitnan flag ignores the NaN values during the addition.
Just another Solution:
It can also be done using just num2str, str2num, and sum. But str2num uses eval and hence it should be avoided. Anyhow just for the fun of it:
req = num2str(a2);
req = sum(str2num(req(:)));
Just like the previous solution, when str2num is applied on the column character array containing spaces, spaces get removed and the remaining char numbers are converted into respective doubles. The operation of the sum function is obvious.
Why does your code not work?
When str2num is applied to a space character, it returns []. Adding [] to any number also gives []. Since a3 contains spaces in your code, the running sum becomes [] at the first space and stays that way.
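A minimal sketch of the original loop with the spaces skipped, which shows both the failure mode and one way around it. Swapping str2num for str2double is a choice made here to avoid eval, not something the question requires.

```matlab
a2 = [8 10 18 18];
a3 = num2str(a2);

disp(isempty(str2num(' ')))   % 1: a space converts to []
disp(isempty(5 + []))         % 1: adding [] to a number gives []

sum2 = 0;
for k = 1:numel(a3)
    if a3(k) ~= ' '                        % skip the padding spaces
        sum2 = sum2 + str2double(a3(k));   % convert one digit character
    end
end
disp(sum2)   % 27
```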
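You can exploit the ASCII mapping. Subtracting '0' turns each digit character into its numeric value; the padding space in the first row becomes -16, which the uint64 cast saturates to 0: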
b = uint64(num2str(a2') - '0')
b =
4×2 uint64 matrix
0 8
1 0
1 8
1 8
and then sum:
sum(b(:))
ans =
27
Just for fun, a shorter, faster, less robust and less readable solution:
sum2 = sum(sprintf('%d',a2)-'0');
Breakdown:
sprintf converts all elements of a2 to a single string with no delimiter between them, unlike num2str, which pads with spaces.
Subtracting '0' implicitly casts the character array to its ASCII codes; subtracting the ASCII value for 0 then yields the digits 0-9.
sum() completes the operation.
Note that if a2 were a string to begin with, this solution would not give an error (the same goes for the other answer, by the way).
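The breakdown above can be followed step by step by splitting the one-liner into intermediate variables. The names s, digits and total are illustrative.

```matlab
a2 = [8 10 18 18];
s = sprintf('%d', a2);    % '8101818': all numbers printed back to back
digits = s - '0';         % [8 1 0 1 8 1 8]
total = sum(digits);      % 27
disp(total)
```

This assumes a2 holds non-negative integers. A minus sign or a decimal point would end up in s and be turned into a meaningless value by the subtraction, which is one reason the approach is less robust.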

How to convert char to number in Matlab

I am having trouble converting a character variable to a number in Matlab.
Each cell in the char variable contains one of two possible words. I need to convert word_one (for example) to represent '1', and word_two to represent '2'.
Is there a command that will let me do this?
So far I've tried:
%First I converted 'Word' from cell to char
Word = char(Word);
Word(Word == 'Word_one') = '1';
Word(Word == 'Word_two') = '2';
However, I get the following error:
Error using ==
Matrix dimensions must agree.
When I try to include the first letter only (ie. 'W'), it only changes the first letter in the full word (ie. 1ord_one).
Is there an easy way to do this?
Thanks for your help - any advice is much appreciated!
Use ismember:
possibleWords = {'Word_one', 'Word_two'}; %// template: words corresponding to 1, 2, ...
words = {'Word_two', 'Word_one', 'Word_two'}; %// data: words you need to convert
[~, result] = ismember(words, possibleWords);
In this example,
result =
2 1 2
If you need more flexibility, you can specify the value corresponding to each word:
possibleWords = {'Word_one', 'Word_two'}; %// template: words corresponding to 1, 2, ...
correspondingValues = [1.1, 2.2]; %// template: value corresponding to each word
words = {'Word_two', 'Word_one', 'Word_two'}; %// data: words you need to convert
[~, ind] = ismember(words, possibleWords);
result = correspondingValues(ind);
which gives
result =
2.2000 1.1000 2.2000
Looks like there are a couple of potential issues here.
Use strcmp() (string compare) in place of your current == comparison. Comparing character arrays with == works element by element: it returns a logical array, and it throws a dimension error when the two arrays have incompatible sizes. strcmp() instead compares the strings as a whole and returns a single logical value.
It's also probably not necessary for you to convert your cell array. You can maintain the cell array structure and address each cell individually.
Try something along the lines of the following snippet.
for i = 1:length(Word)
    if strcmp(Word{i},'Word_one')
        Word{i} = '1';
    elseif strcmp(Word{i},'Word_two')
        Word{i} = '2';
    end
end
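Since strcmp also accepts a cell array as one argument and then returns a logical array, the same mapping can be sketched without a loop. Here the result is numeric rather than the characters '1' and '2', which is an assumption about what is ultimately wanted.

```matlab
Word = {'Word_two', 'Word_one', 'Word_two'};

result = zeros(size(Word));                 % 0 marks an unrecognised word
result(strcmp(Word, 'Word_one')) = 1;       % logical mask selects matches
result(strcmp(Word, 'Word_two')) = 2;
disp(result)   % 2 1 2
```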
There are a number of ways to solve this problem. Here's my approach.
% define your words
words = {'word_one','word_two','word_two','word_one','word_one'};
% define a function to get the indexes of the words of interest
getindex = @(c, y) cellfun(@(x) strcmp(x,y), c);
% replace 'word_one' with '1'
words(getindex(words, 'word_one'))={'1'};
% replace 'word_two' with '2'
words(getindex(words, 'word_two'))={'2'};
words =
'1' '2' '2' '1' '1'
You can use a short and simple approach based on unique:
input_cellarr = {'Word_two','Word_one','Word_two','Word_two','Word_one','Word_one'}
[~,~,out] = unique(input_cellarr)
Sample run:
input_cellarr =
'Word_two' 'Word_one' 'Word_two' 'Word_two' 'Word_one' 'Word_one'
out =
2
1
2
2
1
1
Explanation: for numeric arrays, unique returns its result sorted in ascending order. For cell arrays of strings, that ascending order becomes alphabetical order. Thus, as long as both words occur in the input, unique(input_cellarr) is {'Word_one', 'Word_two'}, because 'one' sorts alphabetically before 'two'. The out indices therefore assign ID 1 to 'Word_one' and ID 2 to 'Word_two'.
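The numbering from unique depends on which words are actually present. A small sketch of that caveat, compared with ismember against a fixed template; the variable names are illustrative.

```matlab
onlyTwo = {'Word_two', 'Word_two'};    % 'Word_one' never occurs

% unique numbers the words it finds, so 'Word_two' becomes ID 1 here.
[~, ~, viaUnique] = unique(onlyTwo);

% ismember numbers against the template, so 'Word_two' stays ID 2.
[~, viaIsmember] = ismember(onlyTwo, {'Word_one', 'Word_two'});

disp(viaUnique(:).')    % 1 1
disp(viaIsmember)       % 2 2
```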

Bitwise XOR operation to scramble two character matrices by generating a truth table

I need to perform the XOR operation on four characters, each of which has a bit representation as follows:
A = 00
G = 01
C = 10
T = 11
I need to create a table that XORs two characters together which gives the values for all combinations of XORing pairs of characters in the following way.
XOR  A  G  C  T
A    A  G  C  T
G    G  A  T  C
C    C  T  A  G
T    T  C  G  A
To obtain the output, you need to convert each character into its bit representation, XOR the bits, then use the result and convert it back into the right character. For example, consulting the third row and second column of the table, by XORing C and G:
C = 10
G = 01
C XOR G = 10 XOR 01 = 11 --> T
I would ultimately like to apply this rule to scrambling characters in a 5 x 5 matrix.
As an example:
A = 'GATT' 'AACT' 'ACAC' 'TTGA' 'GGCT'
'GCAC' 'TCAT' 'GTTC' 'GCCT' 'TTTA'
'AACG' 'GTTA' 'ACGT' 'CGTC' 'TGGA'
'CTAC' 'AAAA' 'GGGC' 'CCCT' 'TCGT'
'GTGT' 'GCGG' 'GTTT' 'TTGC' 'ATTA'
B = 'ATAC' 'AAAT' 'AGCT' 'AAGC' 'AAGT'
'TAGG' 'AAGT' 'ATGA' 'AAAG' 'AAGA'
'TAGC' 'CAGT' 'AGAT' 'GAAG' 'TCGA'
'GCTA' 'TTAC' 'GCCA' 'CCCC' 'TTTC'
'CCAA' 'AGGA' 'GCAG' 'CAGC' 'TAAA'
I would like to generate a matrix C such that each element of A gets XORed with its corresponding element in B.
For example, considering the first row and first column:
A{1,1} XOR B{1,1} = GATT XOR ATAC = GTTG
How can I do this for the entire matrix?
Looks like you're back for some more!
First, let's define the function letterXOR that takes two 4-character strings and XORs them according to the table you have. Recalling from our previous post, let's set up a lookup table where a unique two-bit string corresponds to a letter. We can use the containers.Map class to help us do this. We will also need the inverse lookup table, again a containers.Map, which produces the two-bit string for a given letter. We need it because you want to convert each letter into its two-bit representation. Afterwards, we XOR the bits individually, then use the forward lookup table to get back to letters. As such:
function [out] = letterXOR(A,B)
    %// Bit codes follow the question: A = 00, G = 01, C = 10, T = 11
    codebook = containers.Map({'00','11','01','10'},{'A','T','G','C'}); %// Lookup
    invCodebook = containers.Map({'A','T','G','C'},{'00','11','01','10'}); %// Inv-lookup
    lettersA = arrayfun(@(x) x, A, 'uni', 0); %// Split up each letter into a cell
    lettersB = arrayfun(@(x) x, B, 'uni', 0);
    valuesA = values(invCodebook, lettersA); %// Obtain the binary bit strings
    valuesB = values(invCodebook, lettersB);
    %// Convert each into a matrix
    valuesAMatrix = cellfun(@(x) double(x) - 48, valuesA, 'uni', 0);
    valuesBMatrix = cellfun(@(x) double(x) - 48, valuesB, 'uni', 0);
    %// XOR the bits now
    XORedBits = arrayfun(@(x) bitxor(valuesAMatrix{x}, valuesBMatrix{x}), 1:numel(A), 'uni', 0);
    %// Convert each bit pair into a string
    XORedString = cellfun(@(x) char(x + 48), XORedBits, 'uni', 0);
    %// Access lookup, then concatenate as a string
    out = cellfun(@(x) codebook(x), XORedString);
Let's go through the above code slowly. The inputs into letterXOR are expected to be character arrays composed of the letters A, T, G and C. We first define the forward and reverse lookups. We then split each of the input strings A and B into a cell array of individual characters, because looking up multiple keys in the codebook requires a cell array. We then figure out what the bits are for each character in each string. These bits are actually strings, so we need to convert each string of bits into an array of numbers. We simply cast the string to double and subtract 48, which is the ASCII code for 0. Casting to double yields either 48 or 49, which is why we subtract 48.
As such, each pair of bits is converted into a 1 x 2 array of bits. We then take each 1 x 2 array of bits between A and B, use bitxor to XOR the bits. The outputs at this point are still 1 x 2 arrays. As such, we need to convert each array into a string of bits, then use our forward lookup table to look up the character equivalent of these bits. After this, we concatenate all of the characters together to make the final string for the output.
Make sure you save the above in a file called letterXOR.m. Once we have this, a single arrayfun call XORs each pair of four-character strings in your cell arrays and produces the final matrix. The input into arrayfun is a 5 x 5 matrix of column-major linear indices. We do this because MATLAB can access elements in a 2D array using a single value, namely the column-major index of the element in the matrix. We define a vector that goes from 1 to 25, then use reshape to get it into the right 2D form. This ensures that the output matrix (C in your example) has the same shape as A and B. As such:
ind = reshape(1:25, 5, 5); %// Define column major indices
C = arrayfun(@(x) letterXOR(A{x},B{x}), ind, 'uni', 0); %// Get our output matrix
Our final output C is:
C =
'GTTG' 'AACA' 'ATCG' 'TTAC' 'GGTA'
'CCGT' 'TCGA' 'GACC' 'GCCC' 'TTCA'
'TATT' 'TTCT' 'ATGA' 'TGTT' 'ATAA'
'TGTC' 'TTAC' 'ATTC' 'AAAG' 'AGCG'
'TGGT' 'GTAG' 'AGTC' 'GTAA' 'TTTA'
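For comparison, a more compact formulation is sketched here. It is an alternative technique, not the code above: it assumes the bit codes from the question (A=00, G=01, C=10, T=11) and uses a plain numeric lookup vector instead of containers.Map. The names letters, lut and xorStr are illustrative.

```matlab
letters = 'AGCT';            % position minus 1 is the code: A=0, G=1, C=2, T=3
lut = zeros(1, 128);
lut(double(letters)) = 0:3;  % lookup from ASCII code to 2-bit value

% XOR two equal-length strings over the alphabet, character by character.
xorStr = @(a, b) letters(bitxor(lut(double(a)), lut(double(b))) + 1);

% Rebuild the 4x4 truth table from the question as a check.
[r, c] = ndgrid(0:3, 0:3);
tbl = letters(bitxor(r, c) + 1);
disp(tbl)

% Apply element-wise to the top-left 2x2 corner of A and B.
A = {'GATT' 'AACT'; 'GCAC' 'TCAT'};
B = {'ATAC' 'AAAT'; 'TAGG' 'AAGT'};
C = cellfun(xorStr, A, B, 'UniformOutput', false);
disp(C)
```

Passing both cell arrays to cellfun keeps the output the same shape as the inputs, so no index matrix is needed in this version.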
Good luck!

string "cross correlation" in matlab

Assume that I have 2 strings of characters:
AACCCGGAAATTTGGAATTTTCCCCAAATACG
CGATGATCGATGAATTTTAGCGGATACGATTC
I want to find by how much I should move the second string such that it matches the first one the most.
There are 2 cases. In the first, we assume that the strings wrap around; in the second, we don't.
Is there a MATLAB function that returns either an N-element array or a 2N+1-element array of values indicating how well the shifted string 2 correlates with string 1?
If not, is there a faster/simpler method than something like
n = length(str1);   % assumes both strings have the same length
result = zeros(n, 1);
for i = 0:n-1
    result(i+1) = sum(str1 == circshift(str2, [0 i]));
end
You can convert each char into a binary column of size 4:
A -> [1;0;0;0]
C -> [0;1;0;0]
G -> [0;0;1;0]
T -> [0;0;0;1]
As a result a string of length n becomes a binary matrix of size 4-by-n.
You can now cross-correlate (along the X axis only) the two 4-by-n and 4-by-m matrices to get your result.
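A sketch of this idea for the wrapped-around case, assuming both strings have the same length and using an FFT-based circular correlation along each row. The FFT route and the variable names are choices made for this illustration; the answer itself does not prescribe how to compute the correlation.

```matlab
str1 = 'AACCCGGAAATTTGGAATTTTCCCCAAATACG';
str2 = 'CGATGATCGATGAATTTTAGCGGATACGATTC';
alphabet = 'ACGT';

% One-hot encode: row k is 1 wherever the string equals alphabet(k).
X1 = double(bsxfun(@eq, alphabet(:), str1));   % 4-by-n
X2 = double(bsxfun(@eq, alphabet(:), str2));   % 4-by-n

% Circular cross-correlation of each row pair via the FFT,
% then add the four rows to count matching characters per shift.
F = ifft(fft(X1, [], 2) .* conj(fft(X2, [], 2)), [], 2);
matches = round(real(sum(F, 1)));

% matches(k+1) is the number of positions where str1 agrees with
% str2 circularly shifted to the right by k.
[best, idx] = max(matches);
bestShift = idx - 1;
```

Rounding removes the small floating-point error left by the FFT, since the true values are integer counts.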
With a hat tip to John d'Errico:
str1 = 'CGATGATCGATGAATTTTAGCGGATACGATTC';
str2 = 'AACCCGGAAATTTGGAATTTTCCCCAAATACG';
% the circulant matrix
n = length(str2);
C = str2( mod(bsxfun(@plus,(0:n-1)',0:n-1),n)+1 ); %//'
% Find the maximum number of matching characters, and the amount
% by which to shift the string to achieve this result
% (row k of C is str2 rotated left by k-1, so the rotation is shift-1)
[score, shift] = max( sum(bsxfun(@eq, str1, C), 2) );
Faster yes, simpler...well, I'll leave that up to you to decide :)
Note that this method trades memory for speed. That is, it creates the matrix of all possible shifts in memory (efficiently), and compares the string to all rows of this matrix. That matrix will contain N² elements, so if N becomes large, it's better to use the loop (or Shai's method).
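For the second case in the question, where the strings do not wrap around, a plain loop over the lags is sketched below. It assumes both strings have the same length n and compares only the overlapping part at each lag; the names lags and score are illustrative.

```matlab
str1 = 'AACCCGGAAATTTGGAATTTTCCCCAAATACG';
str2 = 'CGATGATCGATGAATTTTAGCGGATACGATTC';

n = numel(str1);
lags = -(n-1):(n-1);            % negative lag: str2 moved to the left
score = zeros(size(lags));
for t = 1:numel(lags)
    k = lags(t);
    if k >= 0
        % str2 moved right by k: str2(j) lines up with str1(j+k)
        score(t) = sum(str1(1+k:n) == str2(1:n-k));
    else
        % str2 moved left by -k: str2(j-k) lines up with str1(j)
        score(t) = sum(str1(1:n+k) == str2(1-k:n));
    end
end
[best, idx] = max(score);
bestLag = lags(idx);
```

Only lags with at least one overlapping character are evaluated, which gives 2n-1 values.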