Matlab : String vector - character subtraction - matlab

I'm trying to build a linear-algebra-based algorithm for the shift (Caesar) cipher. Suppose I have the string 'hello'. To convert it into a matrix of integers I do this:
'hello' - 'a'
And the result is
ans =
7 4 11 11 14
This is the desired result. But if I subtract the character 'g', the result is
ans =
1 -2 5 5 8
I'd like to ask what happens in MATLAB (or Octave) when I subtract a character and get the results above.

As Mohit Jain wrote, the results you get are based on a conversion to ASCII which is the most widely accepted way to numerically encode textual information. ASCII is also included as a subset in the current standard of Unicode, and on supporting platforms Matlab actually uses a 16-bit Unicode encoding, which enables it to not only represent the 95 printable characters of ASCII which support English text, but a large number of international scripts, special characters for applications in mathematics, typography and many other fields. Explicit conversion between numeric and character data in Matlab is done through char and double:
>> double('aAΔ')
ans =
97 65 916
A small latin letter 'a' has the ASCII code 97, a large latin letter 'A' the ASCII code 65, and a large greek letter Delta has the Unicode number 916. Since the latin letters are encoded in sequence with codes 97 to 122 for small letters and 65 to 90 for capitals, you can generate the English alphabet e.g. like this:
>> char(65 : 90)
ans =
ABCDEFGHIJKLMNOPQRSTUVWXYZ
When you apply an arithmetic operator like - to character strings, the characters are implicitly converted to numbers as if you had used double
>> double('hello')
ans =
104 101 108 108 111
>> double('g')
ans =
103
and therefore 'hello' - 'g' is the same as
>> [104 101 108 108 111] - 103
ans =
1 -2 5 5 8
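Since the question is about a shift cipher, the implicit conversion can be put to work directly. The following is only a minimal sketch, assuming lowercase input without spaces; the key value `shift = 3` is a hypothetical choice for illustration:

```matlab
plain = 'hello';
shift = 3;                                  % hypothetical key, chosen for illustration

idx = plain - 'a';                          % 0-based alphabet positions: 7 4 11 11 14
cipher = char(mod(idx + shift, 26) + 'a');  % shift, wrap around at 26, back to characters

% Decryption subtracts the same shift; mod keeps the result in 0..25
recovered = char(mod((cipher - 'a') - shift, 26) + 'a');

disp(cipher)
disp(recovered)
```

Subtracting 'a' rather than another letter is what makes the positions start at 0, which is why 'hello' - 'a' was the "desired result" in the question.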

It converts each character of the string to its ASCII value and then subtracts the values element by element.
'hello' - 'a' = 7 4 11 11 14 because h - a = 8 - 1 = 7
(these should be ASCII values, but I am using alphabet positions for simplicity, since only the differences matter)
e - a = 5 - 1 = 4
l - a = 12 - 1 = 11, and so on.
'hello' - 'g':
h - g = 8 - 7 = 1
e - g = 5 - 7 = -2, and so on.

Related

How to modify the last 3 bits of signed numbers

When I apply the function dwt2() on an image, I get the four subband coefficients. By choosing any of the four subbands, I work with a 2D matrix of signed numbers.
In each value of this matrix I want to embed 3 bits of information, i.e., the numbers 0 to 7 in decimal, in the last 3 least significant bits. However, I don't know how to do that when I deal with negative numbers. How can I modify the coefficients?
First of all, you want to use an Integer Wavelet Transform, so you only have to deal with integers. This will allow you a lossless transformation between the two spaces without having to round float numbers.
Embedding bits in integers is a straightforward problem for binary operations. Generally, you want to use the pattern
(number AND mask) OR bits
The bitwise AND operation clears out the desired bits of number, which are specified by mask. For example, if number is an 8-bit number and we want to zero out the last 3 bits, we'll use the mask 11111000. After the desired bits of our number have been cleared, we can replace them with the bits we want to embed using the bitwise OR operation.
Next, you need to know how signed numbers are represented in binary. Make sure you read the two's complement section. We can see that if we want to clear out the last 3 bits, we want to use the mask ...11111000, which is always -8. This is regardless of whether we're using 8, 16, 32 or 64 bits to represent our signed numbers. Generally, if you want to clear the last k bits of a signed number, your mask must be -2^k.
Let's put everything together with a simple example. First, we generate some numbers for our coefficient subband and embedding bitstream. Since the coefficient values can take any value in [-510, 510], we'll use 'int16' for the operations. The bitstream is an array of numbers in the range [0, 7], since that's the range of [000, 111] in decimal.
>> rng(4)
>> coeffs = randi(1021, [4 4]) - 511
coeffs =
477 202 -252 371
48 -290 -67 494
483 486 285 -343
219 -504 -309 99
>> bitstream = randi(8, [1 10]) - 1
bitstream =
0 3 0 7 3 7 6 6 1 0
We embed our bitstream by overwriting the necessary coefficients.
>> coeffs(1:numel(bitstream)) = bitor(bitand(coeffs(1:numel(bitstream)), -8, 'int16'), bitstream, 'int16')
coeffs =
472 203 -255 371
51 -289 -72 494
480 486 285 -343
223 -498 -309 99
We can then extract our bitstream by using the simple mask ...00000111 = 7.
>> bitand(coeffs(1:numel(bitstream)), 7, 'int16')
ans =
0 3 0 7 3 7 6 6 1 0
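As a cross-check on the masking pattern, the same embedding can be written with plain arithmetic. This is a sketch of an equivalent formulation, not the answer's method: for integer-valued inputs, clearing the last k bits in two's complement is the same as subtracting mod(x, 2^k), because MATLAB's mod returns a non-negative remainder for a positive divisor. The four coefficients and payload values below are taken from the example above.

```matlab
coeffs  = [477 -290 -252 -67];   % four of the coefficients from the example above
payload = [0 7 1 0];             % the 3-bit values embedded into them above
k = 3;                           % number of least significant bits to overwrite

% Clear the last k bits (same effect as AND with -2^k), then add the payload
stego = coeffs - mod(coeffs, 2^k) + payload;

% Extraction: the remainder modulo 2^k equals the last k bits
extracted = mod(stego, 2^k);

disp(stego)
disp(extracted)
```

The results agree with the bitand/bitor output for the same coefficients, including the negative ones.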

In Matlab, with an English text file, how to load the file and get a matrix as an ASCII encoding result?

Firstly it is a very simple example:
In a text file ('test1.txt'), the content is:
Formally, the
What I want to get is an array with the ASCII encoding result like:
dat_ascii = [70 111 114 109 97 108 108 121 44 32 116 104 101]
In the result, every character is translated to its ASCII code, including spaces and commas.
Now I have a text file of about 10 MB full of English text. I want to read it, translate every character to its ASCII code, and put the codes into a matrix (with 4096 characters per row, and many rows).
How can I do this in Matlab?
You can easily convert everything to ASCII with double: just cast your string to double.
To convert it back, just use char.
Example :
myStr = 'I have 2 apple.'
myStr =
I have 2 apple.
myASCII = double(myStr)
myASCII =
73 32 104 97 118 101 32 50 32 97 112 112 108 101 46
myChar = char(myASCII)
myChar =
I have 2 apple.
In order to read text file in MATLAB, you need to open the text file and read
>> filePtr = fopen('test1.txt')
and then use the file pointer to read the data and convert it to ASCII values (textscan returns a cell array, so index into it before calling double):
>> C = textscan(filePtr, '%c'); ASCIIValues = double(C{1})
Note: Use the appropriate formatting argument when you try to read a text file. In my case, I neglect all whitespaces. For documentation, read http://www.mathworks.com/help/matlab/ref/textscan.html
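If whitespace has to be kept and the codes arranged in fixed-width rows, as the question asks, one possible approach is to read the whole file with fileread and reshape the result. This is a sketch under assumptions: the file is plain ASCII, the temporary file is created only to make the example self-contained, `width` is set to 5 instead of 4096 to keep the output small, and padding the last row with zeros is my own choice.

```matlab
% Create a small sample file so the sketch runs on its own
fname = [tempname() '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s', 'Formally, the');
fclose(fid);

txt = fileread(fname);            % whole file as one char vector, whitespace kept
codes = double(txt(:).');         % row vector of character codes

width = 5;                        % the question uses 4096
padLen = mod(-numel(codes), width);
padded = [codes, zeros(1, padLen)];   % assumption: pad the last row with 0

% reshape fills column by column, so build width-by-N and transpose
mat = reshape(padded, width, []).';

delete(fname);
disp(mat)
```

Each row of mat then holds `width` consecutive characters of the file.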

DEFLATE Encoding with static Huffman Codes

I need some help understanding how DEFLATE encoding works. I know that it is a combination of the LZSS algorithm and Huffman coding.
So let's encode, for example, "Deflate late". Params: [Search buffer: 8kb and Look-ahead buffer 4kb]. The output of the LZSS algorithm is "Deflate <5, 4>". The next step uses static Huffman coding to reduce the redundancy. Here is my problem: I don't know how I should encode the pair <5, 4> with Huffman.
[Edited]
D 000
f 001
l 010
a 011
t 100
_ 101
e 11
According to this table, the string "Deflate " is written as 000 11 001 010 011 100 11 101. As a next step, let's encode the pair (5, 4). According to the book "Data Compression - The Complete Reference", the fixed prefix code for length 4 is 258, followed by the fixed prefix code for distance 5 (code 4 + 1 extra bit).
That can be summarized as:
length 4 -> 258 -> 0000010
distance 5 -> 4 + 1 extra bit -> 00100|0
So, the encoded string is written as [header: 1 01] 000 11 001 010 011 100 11 101 0000010 001000 [end-of-block: 0000000], BUT if I create a Huffman tree, it is not static Huffman anymore, right?
Good day
D 000
f 001
l 010
a 011
t 100
_ 101
e 11
is not the Deflate static code. The static literal/length codes are all 7, 8, or 9 bits, and the distance codes are all 5 bits. You asked about the static codes.
'Deflate late' encoded in static deflate format as the literals 'Deflate ' and a length 4, distance 5 match in hex is:
73 49 4d cb 49 2c 49 55 00 11 00
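That is broken down as follows (bits are read from the least significant part of each byte first; each group below is written with its first-read bit on the right, so the Huffman codes appear bit-reversed relative to the code tables):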
011 - 01 means fixed code, 1 means last block
00101110 - D
10101001 - e
01101001 - f
00111001 - l
10001001 - a
00100101 - t
10101001 - e
00001010 - space
0100000 - length 4
00100 - distance 5 or 6 depending on one extra bit
0 - extra bit -> distance 5
0000000 - end code
0 - fill bit to byte boundary
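To check the breakdown above, the eleven bytes can be decoded by hand. The following is a deliberately limited sketch, not a general inflate routine: it assumes a single fixed-code block and handles only 8-bit literal codes (literals 0..143), lengths 3..10 and distances 1..8, which is all this stream uses.

```matlab
% Bytes quoted in the answer above
bytes = hex2dec({'73','49','4d','cb','49','2c','49','55','00','11','00'}).';

% Unpack every byte least significant bit first, as DEFLATE reads them
bits = zeros(1, 8 * numel(bytes));
for k = 1:numel(bytes)
    for b = 0:7
        bits(8 * (k - 1) + b + 1) = mod(floor(bytes(k) / 2^b), 2);
    end
end

bfinal = bits(1);                 % 1 = last block
btype  = bits(2) + 2 * bits(3);   % 1 = fixed (static) Huffman codes
pos = 4;                          % next unread bit (1-based)
out = '';

while true
    % Huffman codes are stored most significant bit first
    v = bits(pos:pos+6) * 2.^(6:-1:0).';
    if v <= 23                    % 7-bit codes: symbols 256..279
        sym = 256 + v;
        pos = pos + 7;
    else                          % assume an 8-bit literal code (0..143)
        v = 2 * v + bits(pos + 7);
        sym = v - 48;
        pos = pos + 8;
        if sym > 143
            error('code outside the subset handled by this sketch');
        end
    end

    if sym == 256                 % end-of-block
        break;
    elseif sym < 256              % literal byte
        out(end + 1) = char(sym);
    elseif sym <= 264             % lengths 3..10 carry no extra bits
        len = sym - 254;
        dcode = bits(pos:pos+4) * 2.^(4:-1:0).';
        pos = pos + 5;
        if dcode <= 3             % distances 1..4, no extra bits
            dist = dcode + 1;
        elseif dcode <= 5         % distances 5..8, one extra bit
            dist = 5 + 2 * (dcode - 4) + bits(pos);
            pos = pos + 1;
        else
            error('distance code outside the subset handled by this sketch');
        end
        for n = 1:len             % copy byte by byte so overlapping matches work
            out(end + 1) = out(end - dist + 1);
        end
    else
        error('length code outside the subset handled by this sketch');
    end
end

disp(out)
```

The literal branch subtracts 48 because the fixed 8-bit codes for literals 0..143 start at 00110000.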

matlab get the value of char

From the MATLAB command line, when I type my variable a, it shows the values I expect:
a =
value_1
value_2
I would like to access each value of a. I tried a(1), but this gives me an empty result.
The type of a is 1x49 char.
How can I get value_1 and value_2?
whos('a')
Name Size Bytes Class Attributes
a 1x49 98 char
I get a from an XML file:
<flag ="value">
<flow>toto</flow>
<flow>titi</flow>
</flag>
a+0:
ans =
10 32 32 32 32 32 32 32 32 32 32 32 32 98,...
111 111 108 101 97 110 95 84 10 32 32 32 32 32,...
32 32 32 32 32 32 32 66 79 79 76 10 32 32,...
32 32 32 32 32 32 32
Perhaps a is a string with a newline in it. To make two separate variables, try:
values = strtrim(strread(a, '%s', 'delimiter', sprintf('\n')))
strread will split a into separate lines, and strtrim will remove leading/trailing whitespace.
Then you can access the lines using
values{1}
values{2}
(note that you must use curly brackets since this is a cell array of strings).
How are you reading in the XML file? If you're using xmlread, then MATLAB adds a lot of white space in there for you, which could be the cause of your problems.
http://www.mathworks.com/matlabcentral/fileexchange/28518-xml2struct
This will put your xml file into a struct where you should be able to access the elements in the array.
You seem to have a somewhat inconvenient character array. You can convert this array into a more manageable form by doing something like what @Richante said:
strings = strread(a, '%s', 'delimiter', sprintf('\n'));
Then you can reference to toto and titi by
>> b = strings{2}
b =
toto
>> c = strings{3}
c =
titi
Note that strings{1} is empty, since a starts with a newline character. Note also that you don't need a strtrim -- that is taken care of by strread already. You can circumvent the initial newlines by doing
strings = strread(a(2:end), '%s', 'delimiter', sprintf('\n'));
but I'd only do that if the first newline is consistently there for all cases. I'd much rather do
strings = strread(a, '%s', 'delimiter', sprintf('\n'));
strings = strings(~cellfun('isempty', strings))
Finally, if you'd rather use textscan instead of strread, you need to do 1 extra step:
strings = textscan(a, '%s', 'delimiter', sprintf('\n'));
strings = [strings{1}(2:end)];
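strread is listed as not recommended in current MATLAB documentation, so here is a sketch of the same idea using regexp with the 'split' option instead. It is a swapped-in technique, not what the answers above use, and the sample value of a is made up to resemble the whitespace-padded string from the question.

```matlab
% Hypothetical sample: newline-separated values with leading spaces
a = sprintf('\n        toto\n        titi\n    ');

parts = regexp(a, '\n', 'split');            % cell array, one entry per line
parts = strtrim(parts);                      % drop leading/trailing whitespace
parts = parts(~cellfun('isempty', parts));   % drop lines that are now empty

disp(parts{1})
disp(parts{2})
```

As with the strread versions, the result is a cell array, so the individual values are accessed with curly brackets.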

Agilent Vee or Matlab: 4 byte ASCII to floating point

I currently have an instrument that sends 4 bytes representing a floating point number of 32-bit in little endian format, the data looks like:
Gz*=
<«�=
N×e=
or this
à|ƒ=
Is there a way to do this conversion in MATLAB, in Agilent VEE, and manually?
To convert an array of char to single, you can use typecast. In MATLAB, typecast accepts only numeric input, so convert the characters to uint8 first:
c = 'Gz*=';
f = typecast(uint8(c), 'single')
f = 0.041621
Just implicitly!
>> data = ['Gz*=';'<«�=';'N×e=']
data =
Gz*=
<«�=
N×e=
>> data+0
ans =
71 122 42 61
60 171 65533 61
78 215 101 61
data+0 forces the characters to be interpreted as numbers, giving the four byte values of each reading.
If the bytes are interpreted in the wrong order (I'm not sure whether MATLAB is big or little endian), just use the swapbytes function.
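Since the question also asks how to do the conversion manually, the sketch below puts both routes side by side for the first reading, 'Gz*='. It assumes the four bytes arrive intact as values 0..255 and are in little-endian order, as stated in the question; the variable names are only illustrative.

```matlab
b = uint8('Gz*=');                 % the four raw bytes: 71 122 42 61

% Route 1: reinterpret the bytes as one 32-bit float (little-endian host assumed)
f = typecast(b, 'single');

% Route 2: manual IEEE 754 single-precision decoding
u = sum(double(b) .* 256.^(0:3));  % assemble the 32-bit word, lowest byte first
s = floor(u / 2^31);               % 1 sign bit
e = mod(floor(u / 2^23), 256);     % 8 exponent bits, biased by 127
m = mod(u, 2^23);                  % 23 fraction bits
manual = (-1)^s * (1 + m / 2^23) * 2^(e - 127);   % valid for normalized numbers only

% If the instrument sent the bytes in big-endian order instead,
% reversing them before the typecast gives the same value
fBig = typecast(fliplr(uint8([61 42 122 71])), 'single');

disp(f)
disp(manual)
```

The manual formula ignores the special cases (zero, denormals, Inf, NaN), which do not occur in the sample data.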