Include the trailing space in sprintf in MATLAB - matlab

I'm trying to implement the DES cipher in Matlab.
In order to have the bits for the plain text and key, I'm doing this:
binInput = hex2bin(sprintf('%x',input));
Where hex2bin is a function given to us by the professor.
This gives me the hex for the input, then its binary representation as a char array.
I noticed that when input has a trailing space, it is ignored, so my algorithm stops working because the block is no longer 64 bits long (I get a 1x15 char vector instead of a 1x16, for example).
How can I include this trailing space? I could not find anything online or in the help of sprintf.
Thanks in advance

sprintf does respect all whitespace regardless of whether it's trailing, leading, or in-between.
sprintf('%x', 'hello')
% 68656c6c6f
sprintf('%x', 'hello ')
% 68656c6c6f20
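If you need your input length to be a multiple of 64 bits (8 characters), you'll likely want to pad your data with null bytes. Note that char(0) is the null byte, whereas '0' is the digit zero (ASCII 48). If you hex-encode the padded data, use '%02x' rather than '%x' so that every byte yields exactly two hex digits.
str((end+1):(end+mod(-numel(str), 8))) = char(0);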
If anything is getting truncated, it is likely an issue with the hex2bin function your professor gave you.
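The padding and hex-encoding steps can be sketched together as follows. This is only an illustration, not the professor's hex2bin: the variable names padded and hexStr are made up here, and a block size of 8 characters (64 bits) is assumed.

```matlab
str = 'hello ';                                     % 6 characters, trailing space included
padded = str;
padded(end+1:end+mod(-numel(str), 8)) = char(0);    % pad with null bytes up to a multiple of 8
hexStr = sprintf('%02x', double(padded));           % two hex digits per byte, even for char(0)
```

For 'hello ' this produces a 16-character hex string, that is, one full 64-bit block.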

Related

MATLAB - literally convert decimal to string

I hope I have a simple question, I just couldn't figure it out.
I have several numbers which I want to be converted to string quite literally:
12.000 -> '12.000'
4.0 -> '4.0'
34.760000 -> '34.760000'
As you can see, I cannot simply pad zeros, since that highly depends on how many zero are given with the number.
Does anyone know how to do this?
Ahh, yes, this is easily accomplished with MATLAB's num2str function, like so:
num2str(12.000 ,'%.3f')
num2str(4.0, '%.1f')
num2str(34.760000,'%.6f')
Regarding '%.Nf', where N equals 3, 1, and 6 in the examples above: this is called the formatSpec, which I would encourage you to read more about in the num2str documentation. In this case, we are saying that the variable is a floating-point number and that we want to keep N digits after the decimal point. It is useful to know about format specifications for parsing text and for efficiently reading from and writing to files.
Edit: A point of clarification, as I'm sure you already know: single quotes (' ') in MATLAB yield a character array rather than a string. These are different data types. If you're really after a string, just wrap the num2str call in string, i.e.,
string(num2str(12.000 ,'%.3f'))
string(num2str(4.0, '%.1f'))
string(num2str(34.760000,'%.6f'))
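One caveat worth spelling out: a double does not remember how many trailing zeros were typed, so 12.000 and 12 are the same value and the number of decimals has to be supplied separately. A minimal sketch, assuming the digit counts are known up front (the names values, decimals and out are hypothetical):

```matlab
values   = [12, 4, 34.76];     % trailing zeros are not stored in a double
decimals = [3, 1, 6];          % assumed digit counts, one per value
out = cell(size(values));
for k = 1:numel(values)
    % '%.*f' takes the precision from the argument list
    out{k} = sprintf('%.*f', decimals(k), values(k));
end
```

If the numbers originate from a text file, reading them as text in the first place avoids the problem altogether.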

Transform a matrix of integers (0 to 30) to a matrix of emojis

I am working on transforming an image into a set of emojis, depending on how many colors are there. The Maths part is done. I have the matrix of numbers from 0 to 30, but I specifically need to convert the numbers into symbols and I was thinking about emojis since they are so used nowadays.
My question is how am I supposed to read a matrix of integers from a file, transform the matrix of integers into a matrix with different emojis (eventually, from a list of my choice) and put the output in another text file, this time containing the emojis? Is that possible? I guess it should be, but how do I do that? Does anyone have any suggestions?
The problem I face is actually with the emojis' Unicode encoding: I can't get them to display on the console. I just get "? ?" instead of a smiley face. That happens only for emojis; ASCII characters seem to work a bit better. The problem with ASCII characters is that I need, again, expressive images instead of numbers or random pipe shapes.
Here is the code:
%make sure you have the "1234567.jpg" in the same location as the .m file
imdata = imread('1234567.jpg');
[X_no_dither,map] = rgb2ind(imdata,30,'nodither');
imshow(X_no_dither,map)
% and there I try to put the output in a text file
dlmwrite('result.txt',X_no_dither,'delimiter',"\t");
Ok, and the output in the text file is:
0 0 0 0 26 26 ... etc.
And I wonder how am I supposed to write the code in such a way that I will get emojis instead of numbers.
🤔 🤔 🤔 🤔 💖 💖 ... etc.
That's how I'd want the output to be like. But, from what I tried yesterday, I cannot print them without getting warnings/errors.
What you need to do is create a table with your 30 emojis (this documentation page might be helpful), then index into that table. I'm using the compose function as indicated in the page above, it should also be possible to copy-paste emojis into your M-file. If you don't see the emojis in MATLAB's console, change the font you're using.
For example:
% named emojiTable so that it does not shadow MATLAB's built-in table function
emojiTable = [compose("\xD83D\xDE0E"),
    "B", % find the UTF-16 encoding for your emojis, or copy-paste them in
    "C",
    "D",
    ...
    ];
output = emojiTable(X_no_dither + 1); % indices are zero-based, MATLAB is one-based
f = fopen('result.txt', 'wt');
for ii = 1:size(output, 1)
    fprintf(f, '%s', output(ii, :));
    fprintf(f, '\n');
end
fclose(f);
This will write the file out in UTF-16 format, which is what MATLAB uses. If you're on Windows this might work well for you. On other platforms you might want to save as UTF-8 instead, which can be accomplished by opening the file in UTF-8 mode:
f = fopen('result.txt', 'wt', 'native', 'UTF-8');
Note that, even if you don't manage to get the emojis shown in the MATLAB command window, opening the text file in an editor will show the emojis correctly.
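The table lookup itself can be tried independently of any encoding issues. The sketch below uses plain ASCII stand-ins and a tiny made-up index matrix X in place of X_no_dither; the names symbols, X, cells and rowText are assumptions for illustration only.

```matlab
symbols = {'A', 'B', 'C', 'D'};          % hypothetical stand-ins for the emoji list
X = uint8([0 0 3; 1 2 3]);               % hypothetical zero-based indices, like rgb2ind output
cells = symbols(double(X) + 1);          % shift to MATLAB's one-based indexing
rowText = cell(size(cells, 1), 1);
for ii = 1:size(cells, 1)
    rowText{ii} = strjoin(cells(ii, :), ' ');   % one output line per image row
end
```

Once this works with stand-ins, swapping the cell entries for the real emojis only changes the contents of the lookup list, not the indexing.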

Use of strtok function

Based on MATLAB's code for strtok (see end):
"Here’s a more advanced example that finds the first token in a character string. A token is a set of characters delimited by whitespace or some other character. Given one input, the function assumes a default delimiter of whitespace; given two, it lets you specify another delimiter if desired. It also allows for two possible output argument lists"
I have a few questions:
1) Is a delimiter specified at the beginning or end of a token?
So for example, if I wanted to find the section of a text which gave me a certain date and the whole text was: "I like the date april 10 because it is close to May Day". I imagine the token is "april 10" but the starting delimiter would be "a" and the ending delimiter would be a digit?
You see I am confused as to what a "delimiter" is exactly in context. In MATLAB I would normally probably write the token as (\w*\s\d*) in order to locate the date (april 10) in the text since I do not know what date it would be (what letter it starts with or the digits after it). But is a delimiter that whole "april 10" or just an "a" at the beginning? How would this help if I do not know what month it is (april, may, june, etc) or does it basically just work as a "find" command?
I ran the program shown in the picture and tried it with 'hello my friend' as the string and 'o' as the delimiter and it gives:
token=hell
remainder=o my friend
So basically I am getting the impression delimiter are usually used at the end of fields or different regions in order to specify when the new field/section (remainder) begins? Basically a delimiter is commonly used as a simple one (or maybe more) character device to indicate the start of a new field or datum whereas using (/d/w*....etc) format is used for more specific extractions like dates where there is no "comma" or specific indicator in front of it? Are these two observations correct?
BUT then when I run it using "hello my fri" as the delimiter, it seems to arbitrarily select "I want to say hello my friend good man" as the remainder and "nd" as the token, which makes no sense, so I am wondering if there is a bug in this program or if it's just not set up to handle a delimiter that appears twice.
Also,
2) Can someone please explain why [9:13 32] is made the default for one input argument? If we're assuming whitespace is the delimiter, then what does that [9:13 32] mean?
3) Is there any purpose to using "any" since it is run inside a loop? Wouldn't it check each iteration anyway?
function [token, remainder] = strtok(string, delimiters)
%STRTOK Find token in string.
% TOKEN = STRTOK(STR) returns the first token in the string STR delimited
% by white-space characters. STRTOK ignores any leading white space.
% If STR is a cell array of strings, TOKEN is a cell array of tokens.
%
% TOKEN = STRTOK(STR,DELIM) returns the first token delimited by one of
% the characters in DELIM. STRTOK ignores any leading delimiters.
% Do not use escape sequences as delimiters. For example, use char(9)
% rather than '\t' for tab.
%
% [TOKEN,REMAIN] = STRTOK(...) returns the remainder of the original
% string.
%
% If the body of the input string does not contain any delimiter
% characters, STRTOK returns the entire string in TOKEN (excluding any
% leading delimiter characters), and REMAIN contains an empty string.
%
% Example:
%
% s = ' This is a simple example.';
% [token, remain] = strtok(s)
%
% returns
%
% token =
% This
% remain =
% is a simple example.
%
% See also ISSPACE, STRFIND, STRNCMP, STRCMP, TEXTSCAN.
% Copyright 1984-2009 The MathWorks, Inc.
if nargin<1
    error(message('MATLAB:strtok:NrInputArguments'));
end
token = ''; remainder = '';
len = length(string);
if len == 0
    return
end
if (nargin == 1)
    delimiters = [9:13 32]; % White space characters
end
i = 1;
while (any(string(i) == delimiters))
    i = i + 1;
    if (i > len),
        return,
    end
end
start = i;
while (~any(string(i) == delimiters))
    i = i + 1;
    if (i > len),
        break,
    end
end
finish = i - 1;
token = string(start:finish);
if (nargout == 2)
    remainder = string(finish + 1:length(string));
end
EDIT: I was not aware that strtok was a built in function. I was under the assumption it was a UDF the textbook was building as an example. This is why there are many ambiguities since the book does not specify clearly what the function does.
This, for example, was not specified in the text which only stated the function found the first token in a character string. --> token = strtok(str) parses input character vector str from left to right, returning part or all of that character vector in token. Using the white-space character as a delimiter, the token output begins at the start of str, skipping any delimiters that might appear at the start, and includes all characters up to either the next delimiter or the end of the character vector. White-space characters include space (ASCII 32), tab (ASCII 9), and carriage return (ASCII 13).
strtok is very much not going to help you here so I'm not going to answer your main question. I think you should use regular expression for this but I don't speak regex so I'll leave that to someone else.
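For what it's worth, a regular-expression version might look like the sketch below. The pattern is an assumption about what a date looks like in this text (letters, one whitespace character, then digits), and the variable names txt and dateStr are made up for the example.

```matlab
txt = 'I like the date april 10 because it is close to May Day';
% Assumed pattern: one or more letters, one whitespace character, one or more digits
dateStr = regexp(txt, '[A-Za-z]+\s\d+', 'match', 'once');
```

Here dateStr comes out as 'april 10'; "May Day" is not matched because it contains no digits.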
[9:13 32]
Why is the default delimiter set to [9:13 32]? From the comments, MATLAB is claiming that those are all the white space characters. In other words, the numbers 9, 10, 11, 12, 13 and 32 are the ASCII values for white space characters. For example, 32 is the value of a space. Prove this to yourself by casting one to an integer:
uint8(' ') % or even ' ' + 0
I don't know what all the others are but I'm pretty sure one must be the tab character. To check the ASCII value of a tab character you can do
uint8(sprintf('\t'))
which returns 9 which is indeed in the list.
So [9:13 32] is a list of all the white space characters, as the comment implies.
Actually there are many more white space characters that this doesn't cover: https://en.wikipedia.org/wiki/Whitespace_character
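To fill in the remaining codes: in ASCII, 9 is tab, 10 is line feed, 11 is vertical tab, 12 is form feed, 13 is carriage return and 32 is space. A quick check, with variable names chosen just for this sketch:

```matlab
codes = [9:13 32];
% tab, line feed, vertical tab, form feed, carriage return, space
allWhite = all(isspace(char(codes)));                             % every code counts as whitespace
sameAsEscapes = isequal(double(sprintf('\t\n\v\f\r ')), codes);   % escape sequences map to the same codes
```

Both allWhite and sameAsEscapes should come out true.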
any
When you say any, I'm assuming you mean lines like this: any(string(i) == delimiters). Yes, the loop ensures that only one character of string is compared at a time; however, delimiters can contain multiple values, for example all the white space characters mentioned above, or maybe you called strtok like this:
strtok('I like the date...', 'ad')
now both 'a' and 'd' are used as delimiters and so it returns
'I like the '
because it hit a 'd' first.
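The same reasoning explains the "hello my fri" result from the question: the second argument is treated as a set of single-character delimiters, not as a substring to search for. A small sketch using the question's own input (the output variable names are arbitrary):

```matlab
[tok1, rem1] = strtok('hello my friend', 'o');
% tok1 is 'hell' and rem1 is 'o my friend'

[tok2, rem2] = strtok('hello my friend', 'hello my fri');
% every character up to and including the 'e' of 'friend' is in the delimiter set,
% so they are all skipped as leading delimiters: tok2 is 'nd' and rem2 is empty
```

So the "nd" token is not a bug; 'n' and 'd' are simply the only characters of the input that are not in the delimiter set.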

How does one convert from char format to double format, when working with binary numbers?

I have a piece of code which outputs what I want but in the wrong format
for k=1:100
bin(k,:)=dec2bin(randi([0 31]),5);
end
I want the output to be a 100x5 double array, with one bit per cell (0 or 1 value).
I've tried using the double() function...
for k=1:100
bin(k,:)=double(dec2bin(randi([0 31]),5));
end
...but that returns the correct format, with the wrong values.
My jargon might be a bit off, I apologise (Am I using cell, double, etc in the wrong context?)
Thank you for helping me.
There are a lot of ways to do what you want. The simplest would actually be generating the binary array right from the start, without a loop (the comparison yields a logical array, so wrap it in double to get the requested type):
bin = double(rand(100, 5) > 0.5)
Other alternatives:
If you have an integer array and you want to convert it to bits, you can use bitget instead of dec2bin inside the loop:
bin(k, :) = bitget(randi([0 31]), 5:-1:1)
If you already have a char array representing binary numbers, and you want to operate on it, you can delimit the bits with spaces and then apply str2num:
bin = reshape(str2num(sprintf('%c ', bin)), size(bin))
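As for why double() gave the wrong values: it returns character codes, so '0' becomes 48 and '1' becomes 49. Subtracting '0' maps them back to 0 and 1. A minimal sketch of that alternative, with variable names chosen for illustration:

```matlab
n = randi([0 31], 100, 1);      % 100 random integers in 0..31
chars = dec2bin(n, 5);          % 100x5 char array of '0' and '1'
bits = chars - '0';             % 100x5 double array of 0 and 1
```

Each row of bits is the 5-bit binary representation of the corresponding entry of n, most significant bit first.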

Char (non ascii) in Matlab

I have three characters (bigger than 127) and I need to write them to a binary file.
For some reason, MATLAB and PHP/Python tend to write different characters.
For Python, I have:
s = chr(143)+chr(136);
f = open('pythonOut.txt', 'wb');
f.write(s);
f.close();
For MATLAB, I have:
s = strcat(char(143),char(136));
fid = fopen('matlabOut.txt');
fwrite(fid, s, 'char');
fclose(fid);
When I compare these two files, they're different. (using diff and/or cmp command).
Moreover, when I do
disp(char(hex2dec('88'))) //MATLAB prints
print chr(0x88) //PYTHON prints ˆ
Both outputs are different. I want to make my MATLAB code same as Python. What's wrong with MATLAB code?
You're trying to display extended ASCII characters, i.e., symbols with a code greater than 127. MATLAB does not use extended ASCII internally; it uses 16-bit Unicode instead.
If you want to write the same values as in the Python script, use native2unicode to obtain the characters you want. For example, native2unicode(136) returns ˆ on a system whose default encoding is Windows-1252.
The fact that the two files are different seems obvious; chr(134) is obviously different from char(136) :)
In Matlab, only the first 128 characters (codes 0 to 127) correspond to (non-extended) ASCII; anything after that is Unicode (UTF-16).
In Python, the first 256 characters (codes 0 to 255) correspond to (extended) ASCII (use unichr() for Unicode).
However, unicode 0x88 is the same as the extended ASCII 0x88 (as are most of the others). The reason Matlab does not display it correctly, is due to the fact that the Matlab command window does not treat Unicode very well by default, while Python (running in a terminal or so I presume) usually does a better job.
Try changing the font in Matlab's command window, or starting Matlab in a terminal and print the 0x88 character; it should be the same.
In any case, the output of the characters to file should not result in any difference; it is just a display issue.
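If the goal is simply a file containing the raw bytes 143 and 136, one way to sidestep character encoding is to write them as uint8. The sketch below is an assumption about what is wanted, not a drop-in fix for the question's script; note also that fopen with a single argument opens the file read-only, so an explicit 'w' is needed for writing. The scratch file name is made up for the example.

```matlab
bytes = uint8([143 136]);                 % the two byte values from the question
fname = [tempname, '.bin'];               % hypothetical scratch file name
fid = fopen(fname, 'w');                  % 'w' opens for writing; the default is read-only
fwrite(fid, bytes, 'uint8');              % write raw bytes, no character encoding involved
fclose(fid);

fid = fopen(fname, 'r');
readBack = fread(fid, Inf, 'uint8');      % column vector of doubles
fclose(fid);
delete(fname);
```

Reading the file back gives 143 and 136, which are the same two bytes the Python 2 script writes with chr(143)+chr(136).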