How to convert a string to a matrix in MATLAB

I'm trying to convert a string into a matrix, using a mapping like a=1, b=2, ..., "Space"=28, etc.
My question is: how would I convert a string to a matrix?
For example:
abc=[1,2,3]
I tried a for loop, which does convert the string into numbers.
Here is where I try to make it into a matrix:
String1=char(string)
String2=reshape(String1,[10,14]);
The error I get is
"To RESHAPE the number of elements must not change"
for the line String2=reshape(String1,[10,14]);

If you need a general coding from characters into numbers (not necessarily ASCII):
Define the coding by means of a string, such that the character that appears first corresponds to number 1, etc.
Use ismember to do the "reverse indexing" operation.
Code:
coding = 'abcdefghijklmnñopqrstuvwxyz .,;'; %// define coding: 'a' is 1, 'b' is 2 etc
str = 'abc xyz'; %// example text
[~, result] = ismember(str, coding);
In this example,
result =
1 2 3 28 25 26 27
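To arrange the resulting codes in a matrix, note that reshape only works when the number of elements equals rows times columns, which is what the error in the question complains about. A minimal sketch, assuming a plain 26-letter alphabet (so the space is 27 here, not 28), a hypothetical target height nRows, and padding with the code for a space:

```matlab
coding = 'abcdefghijklmnopqrstuvwxyz .,;';   % assumed alphabet: 'a' is 1, ..., ' ' is 27
str    = 'hello world';                      % example text, 11 characters
[~, codes] = ismember(str, coding);          % row vector of codes

nRows = 3;                                   % assumed (hypothetical) number of rows
nCols = ceil(numel(codes) / nRows);          % columns needed to hold every code
padValue = find(coding == ' ');              % pad with the code for a space
padded = [codes, repmat(padValue, 1, nRows*nCols - numel(codes))];
M = reshape(padded, nRows, nCols);           % element count now matches nRows*nCols
```

reshape fills column by column, so the first column of M holds the codes for 'h', 'e' and 'l'. Transpose the result if row-wise order is wanted.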

Related

Adding each element in a vector yields no number

I have a vector,
a2 = [8 10 18 18]
I want to add all individual digits in this vector, i.e.
8 + 1+0 + 1+8 + 1+8 = 27
I decided to use the following piece of code:
a3 = num2str(a2)
sum2 = 0;
for k = 1:numel(a3)
sum2 = sum2 + str2num(a3(k));
end
sum2
However, when I output this I get sum2 = []. What exactly is going wrong here? Apparently, a3 has 13 elements, which means the spaces must be 2 elements wide. Does the issue lie there?
Recommended Solution:
Use num2str, cellstr, str2double, and sum with the omitnan flag.
req = num2str(a2);
req = sum(str2double(cellstr(req(:))),'omitnan');
num2str converts the given matrix a2 into a character array. req(:) reshapes the character array req into a column vector, which still contains spaces. cellstr converts the column character array into a cell array so that str2double can be applied. str2double converts the spaces into NaN and the digit characters into the corresponding doubles. sum with the 'omitnan' flag ignores the NaN values during the addition.
Just another Solution:
It can also be done using just num2str, str2num, and sum. But str2num uses eval and hence it should be avoided. Anyhow just for the fun of it:
req = num2str(a2);
req = sum(str2num(req(:)));
Just like the previous solution, when str2num is applied on the column character array containing spaces, spaces get removed and the remaining char numbers are converted into respective doubles. The operation of the sum function is obvious.
Why does your code not work?
When str2num is applied to a space character, it returns []. Adding [] to any number also gives []. Since a3 contains spaces in your code, you get [] as the output.
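The failure and one possible fix can be sketched as follows. This keeps the loop from the question and simply skips the spaces; swapping str2num for str2double is my own choice here, not something the question requires:

```matlab
a2 = [8 10 18 18];
a3 = num2str(a2);                        % char row vector with spaces between numbers

emptyFromSpace = isempty(str2num(' '));  % true: a lone space evaluates to []
emptyFromSum   = isempty(5 + []);        % true: number + [] is []

sum2 = 0;
for k = 1:numel(a3)
    if a3(k) ~= ' '                      % skip the separators
        sum2 = sum2 + str2double(a3(k)); % add one digit at a time
    end
end
```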
You can exploit the ASCII mapping:
b = uint64(num2str(a2') - '0')
b =
4×2 uint64 matrix
0 8
1 0
1 8
1 8
and then sum:
sum(b(:))
ans =
27
Just for fun, a shorter, faster, less robust and less readable solution:
sum2 = sum(sprintf('%d',a2)-'0');
Breakdown:
sprintf converts all elements of a2 to a single string without spaces as delimiters (unlike num2str, which inserts them)
subtracting '0' implicitly casts the character array to its ASCII codes and subtracts the ASCII value of '0' (48), which results in the digits 0-9
sum() to complete the operation.
Note that if a2 was a string to begin with, this solution will not give an error (same for the other answer, by the way)
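One way to see why this one-liner is less robust is to feed it a negative number. The sketch below uses a made-up input and filters with isstrprop, which is one possible guard rather than part of the original answer:

```matlab
a2 = [-8 10 18];                             % hypothetical input with a negative value
raw = sprintf('%d', a2);                     % '-81018': the minus sign ends up in the string
naive = sum(raw - '0');                      % '-' is ASCII 45, so it contributes 45 - 48 = -3

digitsOnly = raw(isstrprop(raw, 'digit'));   % keep only the characters '0'..'9'
digitSum = sum(digitsOnly - '0');            % sum of the digits, sign ignored
```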

How to perform XOR in a recursive scenario

I have a 1x5 char matrix. I need to perform a bitwise XOR operation on all the elements in the matrix. If T is the char matrix, I need a matrix T' such that
T'(i) = T(i) XOR T'(i-1) for i > 1
T'(1) = T(1)
Let the char matrix be T
T=['0000000000110111' '0000000001000001' '0000000001001010' '0000000010111000' '0000000000101111']
T'=['0000000000110111' '0000000001110110' '0000000000111100' '0000000010000100' '0000000010101011']
That is, leaving the first element as is, I need to XOR each of the other elements with the previous element of the newly formed matrix. I tried the following code but I'm unable to get the correct result.
Yxor1d = [T(1) cellfun(@(a,b) char((a ~= b) + '0'), T(2:end), T'(1:end-1), 'UniformOutput', false)]
I need to perform the XOR operation such that, to obtain the elements of T':
T' (2)= T(2) XOR T' (1)
T' (3)= T(3) XOR T' (2)
It would be really helpful to know where I went wrong. Thanks.
cellfun expects a cell array as input, but you are passing a character array. With square brackets, those 5 strings are concatenated into a single long character array rather than stored as five separate elements.
You probably don't want that. To fix this, declare T as a cell array by using curly braces ({}) instead of square brackets ([]):
T={'0000000000110111' '0000000001000001' '0000000001001010' '0000000010111000' '0000000000101111'};
Because you have edited your post after I provided my answer, my previous answer using cellfun is now incorrect. Because you are using a recurrence relation where you are referring to the previous output rather than input, you can no longer use cellfun. You'll need to use a for loop. There are probably more elegant ways to do it, but this is the easiest if you want to get something working.
As such, initialize an output cell array that is the same size as the input cell array like above, then you'll need to initialize the first cell to be the first cell of the input, then iterate through each pair of input and output elements yourself.
So do something like this:
Yxor1d = cell(1,numel(T));
Yxor1d{1} = T{1};
for idx = 2 : numel(T)
Yxor1d{idx} = char(double(T{idx} ~= Yxor1d{idx-1}) + '0');
end
For each index i, we XOR the current input T{i} with the previous output T'{i-1}.
Using the above with your input cell array T, we get:
Yxor1d =
Columns 1 through 3
'0000000000110111' '0000000001110110' '0000000000111100'
Columns 4 through 5
'0000000010000100' '0000000010101011'
This matches with your specifications in your modified post.
Edit: There is a solution without a loop (de2bi and bi2de require the Communications Toolbox):
T=['0000000000110111';'0000000001000001';'0000000001001010';'0000000010111000' ;'0000000000101111'];
Yxor = dec2bin(bi2de(mod(cumsum(de2bi(bin2dec(T))),2)),16)
Yxor =
0000000000110111
0000000001110110
0000000000111100
0000000010000100
0000000010101011
This uses the fact that you effectively want a cumulative XOR operation on the elements of your array.
The XOR of N booleans is true exactly when an odd number of them are true. So if you do a cumulative sum of each of your bits, the cumulative XOR is 1 wherever that sum is odd.
The one-liner above can be decomposed as follows:
Y = bin2dec(T) ; %// convert char array T into decimal numbers
Y = de2bi( Y ) ; %// convert decimal array Y into an array of bits
Y = cumsum(Y) ; %// do the cumulative sum on each bit column
Y = mod(Y,2) ; %// convert all "even" numbers to '0', and 'odd' numbers to '1'
Y = bi2de(Y) ; %// re-assemble the bits into decimal numbers
Yxor = dec2bin(Y,16) ; %// get their string representation
Note that if you are happy to handle arrays of bits (boolean) instead of character arrays, you can shave off a few lines from above ;-)
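The same parity idea can be sketched directly on the character matrix, without the decimal round trip. This is a minimal variant under the assumption that T is a char matrix of '0'/'1' with one word per row. It uses only base MATLAB functions:

```matlab
T = ['0000000000110111'; '0000000001000001'; '0000000001001010'; ...
     '0000000010111000'; '0000000000101111'];

bits   = T - '0';                     % char '0'/'1' -> numeric 0/1, same size as T
parity = mod(cumsum(bits, 1), 2);     % running count of ones down each column, odd -> 1
Yxor   = char(parity + '0');          % back to a char matrix of '0'/'1'
```

Each row of Yxor is the XOR of all rows of T up to and including that row, which is the recurrence from the question.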
Initial answer (simpler to grasp, but with a loop):
You can use the bitxor function, but you have to convert your char array into numeric values first:
T=['0000000000110111';'0000000001000001';'0000000001001010' ;'0000000010111000' ;'0000000000101111'];
Tbin = bin2dec(T) ; %// convert to numeric values
Ybin = Tbin ; %// pre-assign result, then loop ...
for idx = 2 : numel(Tbin)
Ybin(idx) = bitxor( Ybin(idx) , Ybin(idx-1) ) ;
end
Ychar = dec2bin(Ybin,16) %// convert back to 16bit char array representation if necessary
Ychar =
0000000000110111
0000000001110110
0000000000111100
0000000010000100
0000000010101011
edited answer after you redefined your problem

Binary to DNA encoding

I have an 8 bit binary sequence. I need to encode this 8 bit binary sequence into DNA sequence.
E.g., I have 10011100, the encoding rule I'm following is,
A=00;T=11;G=10;C=01,
So I want it to be something like GCTA. Therefore I need 4 bit DNA sequence as result.
I need to do this for a 256 * 256 matrix where each element is an 8 bit binary sequence.
I've created the matrix using the following code
a=imread('C:\Users\Desktop\lena.png');
disp(a);
imshow(a);
for i=1:1:256
for j=1:1:256
b{i,j,1} = dec2bin(a(i,j),8);
end
end
disp(b)
Here's an approach without for loops. We can actually do this in three lines.
You already have the first step, which is to take each 8-bit number in your image and convert it into its binary representation. Take note that this is a 2D cell array that is the same size as the image you used for doing this conversion. Each cell holds the representation of the number as a string.
Now, all you really need to do now is create a lookup, then use this lookup to generate four characters per location in a new 2D cell array. As such, I would use the containers.Map() class to create a key-value lookup where each pair of bits gets mapped to a single character. Once we do this, we can then use cellfun and iterate over each 8 character string in your cell array, break up the bits into 2 element strings, and use these as keys into our lookup. We will inevitably get 4 separate cells for the output, so we'll need to use cell2mat to bring it all back together. As such, try doing this:
codebook = containers.Map({'00','11','10','01'},{'A','T','G','C'}); %// Lookup
outputCell = cellfun(@(x) values(codebook, {x(1:2),x(3:4),x(5:6),x(7:8)}), ...
    b, 'uni', 0);
finalOutput = cellfun(@cell2mat, outputCell, 'uni', 0);
As an example, let's say we had this 2 x 2 matrix of cell elements:
b = {'11111111', '10101010'; '11001100', '00001101'}
b =
'11111111' '10101010'
'11001100' '00001101'
Running through the above code, this is what we get:
finalOutput =
'TTTT' 'GGGG'
'TATA' 'AATC'
Similar to rayryeng's solution using a lookup table, but imho containers.Map() is overkill:
codebook = 'ACGT';
output = cellfun(@(x) codebook(bin2dec(reshape(x, 2, 4)') + 1), b, 'UniformOutput', false)
I don't think it gets much shorter if the input consists of "binary numbers" in the sense of 8-character 0/1-strings. reshape breaks the strings into 4 portions of 2 characters each, bin2dec transforms these into four numbers in the range 0 to 3, codebook(... + 1) translates these into the characters ACGT.
If the input consists of actual 8-bit binary numbers, e.g. the uint8 data a that you get from reading in that Lena image, you can save the detour through 0/1-strings and use base 4 from the start:
output = reshape(cellstr(codebook(dec2base(a, 4) - '0' + 1)), size(a))
Here dec2base(a, 4) represents the binary numbers as 4-character strings of characters '0' to '3', - '0' is a trick to get numbers 0 to 3, then the lookup as before, and finally some stuff to get everything in the cell-array-of-strings format.
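As a small check of the base-4 route, here is a sketch on a made-up 2 x 2 uint8 matrix instead of the image. Two details are my own additions: the explicit width argument (dec2base(..., 4, 4)), so that every value is padded to 4 digits even when all values are small, and the cast to double before the conversion:

```matlab
codebook = 'ACGT';                            % A=00, C=01, G=10, T=11
a = uint8([156 255; 0 13]);                   % hypothetical stand-in for the image data

digits4 = dec2base(double(a), 4, 4);          % one 4-digit base-4 row per element, column-major
letters = codebook(digits4 - '0' + 1);        % digits 0..3 -> indices 1..4 -> 'A','C','G','T'
output  = reshape(cellstr(letters), size(a)); % cell array with the same shape as a
```

156 is 10011100 in binary, so output{1,1} is 'GCTA', matching the example in the question.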

Bitwise XOR operation to scramble two character matrices by generating a truth table

I need to perform the XOR operation for four characters where each of them have a bit representation as follows:
A = 00
G = 01
C = 10
T = 11
I need to create a table that XORs two characters together which gives the values for all combinations of XORing pairs of characters in the following way.
XOR A G C T
A A G C T
G G A T C
C C T A G
T T C G A
To obtain the output, you need to convert each character into its bit representation, XOR the bits, then use the result and convert it back into the right character. For example, consulting the third row and second column of the table, by XORing C and G:
C = 10
G = 01
C XOR G = 10 XOR 01 = 11 --> T
I would ultimately like to apply this rule to scrambling characters in a 5 x 5 matrix.
As an example:
A = 'GATT' 'AACT' 'ACAC' 'TTGA' 'GGCT'
'GCAC' 'TCAT' 'GTTC' 'GCCT' 'TTTA'
'AACG' 'GTTA' 'ACGT' 'CGTC' 'TGGA'
'CTAC' 'AAAA' 'GGGC' 'CCCT' 'TCGT'
'GTGT' 'GCGG' 'GTTT' 'TTGC' 'ATTA'
B = 'ATAC' 'AAAT' 'AGCT' 'AAGC' 'AAGT'
'TAGG' 'AAGT' 'ATGA' 'AAAG' 'AAGA'
'TAGC' 'CAGT' 'AGAT' 'GAAG' 'TCGA'
'GCTA' 'TTAC' 'GCCA' 'CCCC' 'TTTC'
'CCAA' 'AGGA' 'GCAG' 'CAGC' 'TAAA'
I would like to generate a matrix C such that each element of A gets XORed with its corresponding element in B.
For example, considering the first row and first column:
A{1,1} XOR B{1,1} = GATT XOR ATAC = GTTG
How can I do this for the entire matrix?
Looks like you're back for some more!
First, let's define the function letterXOR that takes two 4-character strings and XORs both strings according to the table that you have. Recalling from our previous post, let's set up a lookup table where a unique two-bit string corresponds to a letter. We can use the containers.Map class to help us do this. We will also need the inverse lookup table, again a containers.Map, that produces a two-bit string when given a letter. We need this because you want to convert each letter into its two-bit representation. After that, we XOR the bits individually, then use the forward lookup table to get back to letters. As such:
function [out] = letterXOR(A,B)
    codebook = containers.Map({'00','11','10','01'},{'A','T','G','C'}); %// Lookup
    invCodebook = containers.Map({'A','T','G','C'},{'00','11','10','01'}); %// Inv-lookup
    lettersA = arrayfun(@(x) x, A, 'uni', 0); %// Split up each letter into a cell
    lettersB = arrayfun(@(x) x, B, 'uni', 0);
    valuesA = values(invCodebook, lettersA); %// Obtain the binary bit strings
    valuesB = values(invCodebook, lettersB);
    %// Convert each into a matrix
    valuesAMatrix = cellfun(@(x) double(x) - 48, valuesA, 'uni', 0);
    valuesBMatrix = cellfun(@(x) double(x) - 48, valuesB, 'uni', 0);
    %// XOR the bits now
    XORedBits = arrayfun(@(x) bitxor(valuesAMatrix{x}, valuesBMatrix{x}), 1:numel(A), 'uni', 0);
    %// Convert each bit pair into a string
    XORedString = cellfun(@(x) char(x + 48), XORedBits, 'uni', 0);
    %// Access lookup, then concatenate as a string
    out = cellfun(@(x) codebook(x), XORedString);
end
Let's go through the above code slowly. The inputs into letterXOR are expected to be a character array of letters that are composed of A, T, G and C. We first define the forward and reverse lookups. We then split up each character of the input strings A and B into a cell array of individual characters, as looking up multiple keys in your codebook requires it to be this way. We then figure out what the bits are for each character in each string. These bits are actually strings, and so what we need to do is convert each string of bits into an array of numbers. We simply cast the string to double and subtract by 48, which is the ASCII code for 0. By converting to double, you'll either get 48 or 49, which is why we need to subtract with 48.
As such, each pair of bits is converted into a 1 x 2 array of bits. We then take each 1 x 2 array of bits between A and B, use bitxor to XOR the bits. The outputs at this point are still 1 x 2 arrays. As such, we need to convert each array into a string of bits, then use our forward lookup table to look up the character equivalent of these bits. After this, we concatenate all of the characters together to make the final string for the output.
Make sure you save the above in a file called letterXOR.m. Once we have this, a single arrayfun call will XOR each four-character string in your cell arrays and produce the final matrix. The input into arrayfun will be a 5 x 5 matrix of column-major linear indices. We do this because MATLAB can access elements in a 2D array using a single value, which is the column-major index of the element. We define a vector that goes from 1 to 25, then use reshape to get this into the right 2D form, so that the output matrix (which is C in your example) has the same shape as the inputs. As such:
ind = reshape(1:25, 5, 5); %// Define column major indices
C = arrayfun(@(x) letterXOR(A{x},B{x}), ind, 'uni', 0); %// Get our output matrix
Our final output C is:
C =
'GTTG' 'AACA' 'ATCG' 'TTAC' 'GGTA'
'CCGT' 'TCGA' 'GACC' 'GCCC' 'TTCA'
'TATT' 'TTCT' 'ATGA' 'TGTT' 'ATAA'
'TGTC' 'TTAC' 'ATTC' 'AAAG' 'AGCG'
'TGGT' 'GTAG' 'AGTC' 'GTAA' 'TTTA'
Good luck!
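For comparison, here is a more compact sketch of the same idea that uses a numeric lookup vector instead of containers.Map. It is not the method from the answer above. The names alphabet, lut and letterXORCompact are made up for illustration, and the letter order follows the bit assignment given in the question (A=00, G=01, C=10, T=11). It also builds the truth table the question asks for:

```matlab
alphabet = 'AGCT';                       % position minus 1 is the 2-bit code: A=0, G=1, C=2, T=3
lut = zeros(1, 256);                     % character code -> 2-bit value
lut(double(alphabet)) = 0:3;

% Truth table: entry (i,j) is alphabet(i) XOR alphabet(j)
[r, c] = ndgrid(0:3);
xorTable = alphabet(bitxor(r, c) + 1);   % 4x4 char matrix

% XOR two equal-length strings letter by letter
letterXORCompact = @(a, b) alphabet(bitxor(lut(double(a)), lut(double(b))) + 1);

% Apply it element-wise to two small cell arrays taken from the example
A = {'GATT'; 'GCAC'};
B = {'ATAC'; 'TAGG'};
C = cellfun(letterXORCompact, A, B, 'UniformOutput', false);
```

Because cellfun walks both cell arrays in parallel and keeps their shape, the same call should work for the full 5 x 5 inputs without building an index matrix.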

Matlab binary encoding

I have a vector containing a series of integers, and what I want to do is take all numbers, convert them into their corresponding binary forms, and concatenate all of the resulting binary values together. Is there any easy way to do this?
e.g. a=[1 2 3 4] --> b=[00000001 00000010 00000011 00000100] --> c=00000001000000100000001100000100
Try:
b = dec2bin(a)
As pointed out by the other answers, the function DEC2BIN is one option that you have to solve this problem. However, as pointed out by this other SO question, it can be a very slow option when converting a large number of values.
For a faster solution, you can instead use the function BITGET as follows:
a = [1 2 3 4]; %# Your array of values
nBits = 8; %# The number of bits to get for each value
nValues = numel(a); %# The number of values in a
c = zeros(1,nValues*nBits); %# Initialize c to an array of zeroes
for iBit = 1:nBits %# Loop over the bits
c(iBit:nBits:end) = bitget(a,nBits-iBit+1); %# Get the bit values
end
The result c will be an array of zeroes and ones. If you want to turn this into a character string, you can use the function CHAR as follows:
c = char(c+48);
Yes, use dec2bin, followed by string concatenation.
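The dec2bin route can be sketched as follows, assuming a fixed width of 8 bits per value as in the example from the question. The transpose is needed because reshape reads the char matrix column by column:

```matlab
a = [1 2 3 4];               % values to encode
b = dec2bin(a, 8);           % 4x8 char matrix, one zero-padded 8-bit row per value
c = reshape(b.', 1, []);     % transpose so that reshape reads row by row, then flatten
```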