Transfer each letter of a string array into integers in MATLAB

I have the task of converting each letter of a given sequence into an integer vector in MATLAB. For instance, take the input sequence seq = 'TGCA'. Since there are 4 distinct letters in total, I plan to encode 'A' as '0001', 'T' as '0010', 'G' as '0100' and 'C' as '1000'. The whole sequence can then be encoded as the concatenation of all the encoded (0,1) vectors, so in this case it would be '0010010010000001'. Any comments would be appreciated. Many thanks.

The idea behind this solution is to define a key, which returns the expected output when compared to the string:
>> key='CGTA'
key =
CGTA
>> key=='A'
ans =
0 0 0 1
>> key=='T'
ans =
0 0 1 0
This basically solves it, now use bsxfun to vectorize:
E=reshape(bsxfun(@eq,key(:),seq(:).'),1,[])
This outputs a logical vector; if char is intended, use:
F=char(reshape(bsxfun(@eq,key(:),seq(:).'),1,[])+'0')
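A minimal end-to-end sketch of the comparison approach, using the example sequence from the question. The variable name onehot and the validity check are additions for illustration and are not part of the answer above.

```matlab
key = 'CGTA';                 % position of each letter defines its one-hot code
seq = 'TGCA';                 % example input from the question

% Compare every key letter (rows) against every sequence letter (columns).
onehot = bsxfun(@eq, key(:), seq(:).');   % 4-by-N logical matrix

% Assumed sanity check: each sequence letter must match exactly one key letter.
assert(all(sum(onehot, 1) == 1), 'seq contains a letter that is not in key');

E = reshape(onehot, 1, []);   % logical row vector, one 4-bit block per letter
F = char(E + '0');            % the same bits as a char row vector
```

Because reshape reads the 4-by-N matrix column by column, the four bits of each letter stay together in the result.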

Octave doesn't support containers.Map, so I'm gonna waste 80 rows out of an 84x4 matrix...
codes(['A','T','G','C'],:)=['0001';'0010';'0100';'1000'];
seq = 'ATACAGCTAGGATCA';
encodedSeq=codes(seq,:);
encodedSeq =
0001
0010
0001
1000
0001
0100
1000
0010
0001
0100
0100
0001
0010
1000
0001
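or, as a single row (transpose first, because reshape reads the matrix column by column and would otherwise interleave the codes):
reshape(encodedSeq.',1,[])
ans = 000100100001100000010100100000100001010001000001001010000001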
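The lookup table can also be indexed by position instead of by character code, which avoids the unused rows. The sketch below assumes ismember is acceptable for the lookup; the names letters, found, idx and encodedRow are illustrative only.

```matlab
letters = 'ATGC';
codes   = ['0001'; '0010'; '0100'; '1000'];   % row k encodes letters(k)
seq     = 'TGCA';                             % example input from the question

[found, idx] = ismember(seq, letters);        % idx(k) is the row of codes for seq(k)
assert(all(found), 'seq contains a letter that is not in letters');

encodedSeq = codes(idx, :);                   % N-by-4 char matrix, one row per letter
% reshape walks down columns, so transpose first to keep each code contiguous.
encodedRow = reshape(encodedSeq.', 1, []);
```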

Related

How to find 1's complement in MARIE?

I'm trying to write Checksum Algorithm using MARIE.js, but I'm stuck on doing 1's complement.
I saw other assembly languages have CMA code, but I couldn't find that information on MARIE.
So I defined G as hex 2F and subtracted it to find the checksum byte, but the output is not what I expected.
What did I miss or do something wrong?
Input /Takes user input
Store A /Stores user input to A
Input /Takes user input
Store B /Stores user input to B
Input /Takes user input
Store C /Stores user input to C
Input /Takes user input
Store D /Stores user input to D
Load A /Load A to AC
Add B /Add B to AC
Add C /Add C to AC
Add D /Add D to AC
Subt F /Subtract F from Sum of data 1,2,3,4
Store E /Sum of data 1,2,3,4 ignoring carry
Subt G
/Add ONE
Output /Print checksum byte
HALT /End program
/Variable declaration
A, HEX 0 /Data 1
B, HEX 0 /Data 2
C, HEX 0 /Data 3
D, HEX 0 /Data 4
E, HEX 0 /Checksum byte
F, HEX 100 /Ignore carry
G, HEX 2F
ONE, DEC 1
Two's complement, -n, is defined as one's complement + 1, e.g. ~n + 1
Therefore, since MARIE has subtraction you can make two's complement (e.g. 0-n) and subtracting 1 from that will yield one's complement, ~n.
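The identity can be checked numerically for a 16-bit word. This is only an arithmetic sketch in MATLAB, not MARIE code, and the value 47 (hex 2F) is chosen purely for illustration.

```matlab
n = uint16(47);                          % example value 0x2F, chosen only for illustration
viaBitcmp = double(bitcmp(n));           % one's complement computed directly

twosComp  = mod(0 - double(n), 65536);   % 0 - n, wrapped to 16 bits (two's complement)
viaSubt   = mod(twosComp - 1, 65536);    % subtracting 1 gives the one's complement
```

Both routes give the same word, so in MARIE a pair of Subt instructions (from zero, then ONE) should be enough to stand in for a missing complement instruction.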

Matlab: output format with fprintf

I want to have a list of data in a text file, and for that I use:
fprintf(fid, '%d %s %d\n',ii, names{ii},vals(ii));
The problem is that some names in my data are longer than others, so I get results in this form:
1 XXY 5
2 NHDMUCY 44
3 LL 96
...
How can I change the fprintf line of code to get the results in this form:
1 XXY      5
2 NHDMUCY 44
3 LL      96
...
Something like this before the start of the loop -
%// row labels printed in the first column (assumed here to be 1..N)
col1 = 1:numel(names)
%// length of each label string and the corresponding whitespace padding
lens0 = cellfun('length',cellfun(@(x) num2str(x),num2cell(col1),'Uni',0))
pad_ws_col1 = max(lens0) - lens0
%// length of each names string and the corresponding whitespace padding
lens1 = cellfun('length',names)
pad_ws_col2 = max(lens1) - lens1
Then, inside the loop -
fprintf(fid, '%d %s %s %s %d\n',col1(ii), repmat(' ',1,pad_ws_col1(ii)), ...
names{ii},repmat(' ',1,pad_ws_col2(ii)),vals(ii));
Output would be -
1  XXY      5
2  NHDMUCY  44
3  LL       96
For a range 99 - 101, it would be -
99   XXY      5
100  NHDMUCY  44
101  LL       96
Please note that the third column numerals start at a fixed distance instead of ending at a fixed distance from the start of each row as asked in the question. But, assuming that the whole idea of the question was to present the data in a more readable way, this could work for you.
You can use the function char to convert a cell array of strings into a character array in which every row is padded to the length of the longest one.
So for you:
charNames = char( names ) ;
then you can use fprintf :
fprintf(fid, '%d %s %d\n',ii, charNames(ii,:) , vals(ii)) ;
Just make sure your cell array is a column before you convert it to char.
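Another option is to let fprintf do the padding through field widths in the format string. The sketch below assumes non-negative integer values; the names w1, w2, w3, fmt and out are illustrative, and the text is collected with sprintf only so it can be inspected before printing.

```matlab
names = {'XXY', 'NHDMUCY', 'LL'};        % example data from the question
vals  = [5 44 96];

% Column widths (assumes non-negative integer indices and values).
w1 = numel(num2str(numel(names)));       % widest row index
w2 = max(cellfun('length', names));      % longest name
w3 = numel(num2str(max(vals)));          % widest value

% Build a format such as '%1d %-7s %2d\n': '-' left-aligns the name,
% and the numeric columns are right-aligned by default.
fmt = sprintf('%%%dd %%-%ds %%%dd\\n', w1, w2, w3);

out = '';
for ii = 1:numel(names)
    out = [out, sprintf(fmt, ii, names{ii}, vals(ii))]; %#ok<AGROW>
end
fprintf('%s', out);                      % or fprintf(fid, '%s', out) for a file
```

With these widths the last column ends at the same position on every row, which is the layout the question asks for.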

Convert text to binary and store in a single array in matlab

I need to convert a given text (not in file format) into binary values and store them in a single array that is passed as input to another function in MATLAB.
Example:
Hi how are you ?
It is to be converted into binary and stored in an array. I have used the dec2bin() function, but I did not succeed in getting the required output.
Sounds a bit like a trick question. In MATLAB, a character array (string) is just a different representation of 16-bit unsigned character codes.
>> str = 'Hi, how are you?'
str =
Hi, how are you?
>> whos str
Name Size Bytes Class Attributes
str 1x16 32 char
Note that the 16 characters occupy 32 bytes, or 2 bytes (16-bits) per character. From the documentation for char:
Valid codes range from 0 to 65535, where codes 0 through 127 correspond to 7-bit ASCII characters. The characters that MATLAB® can process (other than 7-bit ASCII characters) depend upon your current locale setting. To convert characters into a numeric array, use the double function.
Now, you could use double as it recommends to get the character codes into double arrays, but a minimal representation would simply involve uint16:
int16bStr = uint16(str)
To split this into bytes, typecast into 8-bit integers:
typecast(int16bStr,'uint8')
which yields 32 uint8 values (bytes), which are suitable for conversion to binary representation with dec2bin, if you want to see the binary (but these arrays are already binary data).
If you don't expect anything other than ASCII characters, just throw out the extra bits from the start:
>> int8bStr = uint8(str)
int8bStr =
72 105 44 32 104 111 119 32 97 114 101 32 121 111 117 63
>> binStr = reshape(dec2bin(int8bStr.'),1,[])
binStr =
110011101110111001111111111111110000001001001011111011000000 <...snip...>
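If the bits of each character should stay together in the final array, a variation is to pad every code to 8 bits and transpose before reshaping. This is a sketch that assumes the text is plain 7-bit ASCII; the names bits and binVec are illustrative.

```matlab
str  = 'Hi';                              % short example; assumes 7-bit ASCII text
bits = dec2bin(uint8(str), 8);            % one row of 8 '0'/'1' chars per character

% Transpose before reshaping so the bits of each character stay together.
binStr = reshape(bits.', 1, []);          % char row vector
binVec = binStr - '0';                    % numeric 0/1 row vector
```

Here 'H' is 72 (01001000) and 'i' is 105 (01101001), so binStr is those two bytes back to back, and binVec is the form to hand to a function that expects numbers rather than characters.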

How to convert an array of 16-bit unsigned integers into an ASCII string in MATLAB

I am looking for a way to convert an array of 16-bit unsigned integers into an ASCII char array. I am using char to do the conversion:
D=[65 65 65 65];
char(D)
which shows 4 'A' characters. However, since each number in D is 16-bit, I expect each number to be converted to 2 chars. For example, if I have
D=[16707]
char(D)
I expect it to give me two chars, 'A' and 'C'. But char always returns 1 character. Is there any way to force char to convert the way I described? Thanks.
For this, you need to write your own function.
You can use char() to convert most significant byte and least significant byte separately.
k = 16707;
first = char(bitand(bitshift(k, -8), 255));
second = char(bitand(k, 255));
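The same byte split can be applied to a whole vector at once. A minimal sketch, assuming the input is (or can be cast to) uint16; the names hi, lo and str are illustrative.

```matlab
D = uint16([16707 16708]);               % example values: 0x4143 and 0x4144

hi = bitshift(D, -8);                    % most significant byte of each element
lo = bitand(D, uint16(255));             % least significant byte of each element

% Interleave as hi1 lo1 hi2 lo2 ... and convert the byte values to characters.
str = char(reshape([hi; lo], 1, []));
```

Since the bytes are extracted arithmetically, the character order does not depend on the byte order of the machine.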
Have a look at
http://www.mathworks.com/help/matlab/ref/char.html
It clearly states that the char function is valid only for 8-bit numbers. You can convert each part of a cell of the array this way and concatenate the results for each pair of cells.
Use typecast to convert each uint16 to two uint8, and then apply char. Make sure that the input to typecast is really of type uint16.
If you need to reverse char order, use swapbytes on the uint16 vector.
>> D = [16707 16708];
>> char(typecast(uint16(D),'uint8'))
ans =
CADA
>> char(typecast(swapbytes(uint16(D)),'uint8'))
ans =
ACAD

Matlab - how to read 2 bytes at a time

I have a number like this - 778310098 - and I want to read 2 bytes at a time. So, I am expecting my output to be 77; 83; 10; 09; 8. I tried using the below:
uint16(fread(fileID,inf, 'ubit8')) and the output I get is the ASCII value of the individual numbers:
55
55
56
51
49
48
48
57
56
What do I need to do to get the desired output?
To read pairs of ASCII digits from a text file (we tend not to describe text files in bytes, but in characters), use:
[10 1] * (fread(fileID,[2 inf], 'char') - 48)
To read bytes pairwise from a binary file, try
fread(fileID,inf, '*uint16')
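The matrix form of the digit-pair read needs an even number of digits, while the number in the question has nine. A sketch of one way to handle the leftover digit, with the character data held in a variable (txt is a stand-in for what fread would return, and pairs is an illustrative name):

```matlab
txt = '778310098';                 % stands in for fread(fileID, inf, 'char').'
d   = double(txt) - 48;            % ASCII digit codes to the values 0..9
n   = numel(d);

pairs = 10 * d(1:2:n-1) + d(2:2:n);    % every complete two-digit pair
if mod(n, 2) == 1
    pairs(end+1) = d(end);             % leftover single digit
end
```

Note that the pair '09' comes out as the number 9, since the leading zero has no numeric meaning.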
One method is to convert it to a string, then process the string, then convert it back to an integer. While this may not be particularly elegant or perfect, will this do the trick?
a = 778310098;
b = num2str(a);
n = ceil(length(b)/2);            % number of pieces
parts = zeros(1, n);
for k = 1:n
    first = 2*k - 1;
    last = min(first + 1, length(b));   % final piece is one digit for odd-length input
    parts(k) = str2double(b(first:last));
end
parts                             % displays 77 83 10 9 8