I'm fairly new to Matlab, although not to programming. I'm trying to hash a string, and get back a single value that acts as a unique id for that string. I'm using this DataHash function from FileExchange which returns the hash as an integer vector. So far the best solution I've found for converting this to a single numeric value goes:
hash_opts.Format = 'uint8';
hash_vector = DataHash(string, hash_opts);
hash_string = num2str(hash_vector);
% Use a simple regex to remove all whitespace from the string,
% takes it from '1 2 3 4' to '1234'
hash_string = regexprep(hash_string, '[\s]', '');
hashcode = str2double(hash_string);
A reproducible example that doesn't depend on DataHash:
hash_vector = [1, 23, 4, 567];
hash_string = num2str(hash_vector);
% Use a simple regex to remove all whitespace from the string,
% takes it from '1 2 3 4' to '1234'
hash_string = regexprep(hash_string, '[\s]', '');
hashcode = str2double(hash_string); % Output: 1234567
Are there more efficient ways of achieving this, without resorting to a regex?
Yes, Matlab's regex implementation isn't particularly fast. I suggest that you use strrep:
hashcode = str2double(strrep(hash_string,' ',''));
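As a quick sanity check, here is a minimal sketch comparing the two approaches on the example vector from the question. It assumes the padding that num2str inserts consists only of spaces, which is what strrep with ' ' relies on.

```matlab
hash_vector = [1, 23, 4, 567];
hash_string = num2str(hash_vector);

% Regex version from the question
via_regex = regexprep(hash_string, '\s', '');

% strrep version: only removes literal spaces (assumed to be the only padding)
via_strrep = strrep(hash_string, ' ', '');

same = strcmp(via_regex, via_strrep);
hashcode = str2double(via_strrep);
```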
Alternatively, you can use a string creation method that doesn't insert spaces in the first place:
hash_vector = [1, 23, 4, 567];
hashcode = str2double(sprintf('%d', hash_vector))
Just make sure that your hash number is less than 2^53 or the conversion to double might not be exact.
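A minimal sketch of how that limit could be checked, using the built-in flintmax (which returns 2^53 for doubles). The variable names digit_str and is_exact are just illustrative.

```matlab
hash_vector = [1, 23, 4, 567];
digit_str = sprintf('%d', hash_vector);   % '1234567', no spaces inserted
hashcode = str2double(digit_str);

% flintmax is the largest consecutive integer a double can represent (2^53)
is_exact = hashcode < flintmax;
```

For a real hash of 16 or more bytes the digit string will be far longer than 16 digits, so this check would fail and the double would only approximate the value.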
I've seen there's already an answer, though it loses precision because it omits leading zeros. I'm not sure whether that will cause you trouble, but I wouldn't want to rely on it.
Since your output is uint8, why not use hex values instead? This gives you exactly the same number, and converting back is also easy using dec2hex.
hash_vector = [1, 23, 4, 253]
hash_str=sprintf('%02x',hash_vector); % ensure every 8-bit value uses 2 hex digits
hash_dig=hex2dec(hash_str)
By the way, your sample hash contains 567, which is an impossible value for uint8.
Having looked at DataHash the question would also be why not use base64 or hex in the first place.
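To illustrate why the omitted leading zeros matter, here is a small sketch with two made-up byte vectors (not real DataHash output). Concatenating plain decimal digits maps both to the same string, while fixed-width hex keeps them apart.

```matlab
a = uint8([1 23]);
b = uint8([12 3]);

% Variable-width decimal: both vectors collapse to '123'
dec_a = sprintf('%d', a);
dec_b = sprintf('%d', b);

% Fixed-width hex: two digits per byte, so the strings stay distinct
hex_a = sprintf('%02x', a);   % '0117'
hex_b = sprintf('%02x', b);   % '0c03'

value_a = hex2dec(hex_a);     % 279
```

The same 2^53 limit applies to hex2dec, so this only yields an exact number for short vectors; for a full hash, keeping the hex string itself as the id may be the safer choice.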
Related
How can I connect these two parts?
In Excel if you say 'state'&2 you will get a combined phrase state2.
I want to join 'state' and 'i' where i is a number between e.g. 1,2,3...
Then I can end up with state1 or state5 for example depending on what i is equal to.
How can I do this?
You can
Use num2str to convert 2 to '2', and then concatenation to build your char array
Use sprintf to create a char array with a specified placeholder format
Use strings.
Importantly here I've made a distinction between strings ("double quotes") and character arrays ('single quotes') - read here for more details about their differences.
Corresponding code would look like
% 1. Use num2str and concatenation
str = ['state', num2str(2)]; % -> 'state2' (char)
% 2. Use sprintf
str = sprintf( 'state%d', 2 ); % -> 'state2' (char)
% 3. Use strings
str = "state" + 2 % -> "state2" (string)
I would opt for number 2, since I think it's cleaner than 1 and more flexible, and I have used MATLAB since before strings existed so I'm predisposed to dislike them!
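Since the question mentions i taking several values, here is a short sketch of option 2 in a loop. The cell array names is a hypothetical container chosen for the example.

```matlab
names = cell(1, 3);                    % preallocate one cell per name
for i = 1:3
    names{i} = sprintf('state%d', i);  % 'state1', 'state2', 'state3'
end
```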
The problem I have is that it only works partially, but what can I add to make it work?
A2 = [20 4 6 8 5];
A3 = [10 2 3 4 6];
Str=[];
formatSpec = 'P%d (%d,%d)';
for i=1:length(A2)
str = char(sprintf (formatSpec, i, A2(i),A3(i)));
Str=[Str;str];
end
set(handles.text2,'string',Str);
You are not concatenating strings but char arrays. Seen that way, the question answers itself: the char array for a two-digit number is one element longer than the one for a single-digit number, and you cannot vertically concatenate arrays of different widths.
The solution is fairly simple: use actual strings (string arrays were introduced in R2016b, the double-quote syntax in R2017a). Strings are written with "" instead of '', which denotes chars. So replace your char with string and it works fine. (Even better: provide the formatSpec as a ""-string and sprintf() will return a string right away.)
Side note:
By the way, you should always preallocate memory if you are looping. That is why Str has an orange squiggly underline. MATLAB stores arrays contiguously in RAM and has to copy the array to a larger block if it outgrows the current one.
So instead of Str=[], write Str = strings(length(A2),1), and index Str(i) = ... in the loop.
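Putting the string and preallocation suggestions together, a minimal sketch could look like this. It assumes a release with double-quoted string literals (R2017a or later); the final set call from the question is left as a comment because handles only exists inside the GUI.

```matlab
A2 = [20 4 6 8 5];
A3 = [10 2 3 4 6];

Str = strings(length(A2), 1);      % preallocated string array
formatSpec = "P%d (%d,%d)";        % string format, so sprintf returns a string

for i = 1:length(A2)
    Str(i) = sprintf(formatSpec, i, A2(i), A3(i));
end

% In the GUI: set(handles.text2, 'string', Str);
% cellstr(Str) gives a cell array of char vectors if the control needs one.
```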
Personally, I like num2str more than sprintf, but I cannot give a good reason for this, except that it also works without providing a format.
I'd like to have a function generate(n) that generates the first n lowercase characters of the alphabet appended in a string (therefore: 1<=n<=26)
For example:
generate(3) --> 'abc'
generate(5) --> 'abcde'
generate(9) --> 'abcdefghi'
I'm new to Matlab and I'd be happy if someone could show me an approach of how to write the function. For sure this will involve doing arithmetic with the ASCII-codes of the characters - but I've no idea how to do this and which types that Matlab provides to do this.
I would rely on ASCII codes for this. You can convert an integer to a character using char.
So for example if we want an "e", we could look up the ASCII code for "e" (101) and write:
char(101)
'e'
This also works for arrays:
char([101, 102])
'ef'
The nice thing in your case is that in ASCII, the lowercase letters are all the numbers between 97 ("a") and 122 ("z"). Thus the following code works by taking ASCII "a" (97) and creating an array of length n starting at 97. These numbers are then converted to characters using char. As an added bonus, the version below ensures that the array can only go to 122 (ASCII for "z").
function output = generate(n)
output = char(97:min(96 + n, 122));
end
Note: For the upper limit we use 96 + n because if n were 1, then we want 97:97 rather than 97:98 as the second would return "ab". This could be written as 97:(97 + n - 1) but the way I've written it, I've simply pulled the "-1" into the constant.
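The off-by-one reasoning in the note can be checked directly. This is only an illustration with n = 1; the variable names are made up for the example.

```matlab
n = 1;
code_a = double('a');                     % 97, the ASCII code of 'a'

with_offset = char(code_a:(96 + n));      % 97:97 gives 'a'
off_by_one  = char(code_a:(code_a + n));  % 97:98 gives 'ab', one letter too many
```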
You could also make this a simple anonymous function.
generate = @(n)char(97:min(96 + n, 122));
generate(3)
'abc'
To write the most portable and robust code, I would probably not want those hard-coded ASCII codes, so I would use something like the following:
output = 'a':char(min('a' + n - 1, 'z'));
...or, you can just generate the entire alphabet and take the part you want:
function str = generate(n)
alphabet = 'a':'z';
str = alphabet(1:n);
end
Note that this will fail with an index out of bounds error for n > 26, so you might want to check for that.
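One possible way to guard against that, sketched here as plain statements rather than a function, is to clamp n to the length of the alphabet before indexing. Clamping is just one design choice; raising an error for out-of-range input would be equally reasonable.

```matlab
n = 30;                                        % deliberately out of range
alphabet = 'a':'z';

n_clamped = max(0, min(n, numel(alphabet)));   % keep the index within 0..26
str = alphabet(1:n_clamped);
```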
You can use the char built-in function, which converts an integer value (or array) into a character array.
EDIT
Bug fixed (ref. Suever's comment)
function [str]=generate(n)
a=97;
% str=char(a:a+n)
str=char(a:a+n-1)
Hope this helps.
Qapla'
I think I'm losing my mind.
I have a 3 x 2 cell which looks exactly like below.
Region Code
US 1
EU 2
I then have the following code to determine the row number for the EU region.
eq_code_index = find(ismember(fund.type_des(:, 1), 'EU'));
eq_code = cell2mat(fund.type_des(eq_code_index, 2));
eq_code_index returns 3, which is correct (the header row is included in the count). So I want the value in row 3, column 2, which is 2. I then use cell2mat to convert it from a cell value to an integer; however, it doesn't work: the value is of type char. I haven't a clue why cell2mat isn't working.
Update
Even with the following two lines of code I can't get the codes into a vector; they turn into chars.
codes = fund.type_des(2:end, end);
codes = cell2mat(codes)
To access a single element in a cell array, use curly braces:
fund.type_des{eq_code_index, 2};
This is generally simpler than using cell2mat(). If the contents of the cell are chars and you want an integer, you have to perform the conversion. str2num() is one of many options for this:
eq_code = str2num(fund.type_des{eq_code_index, 2});
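A self-contained sketch of the whole lookup, assuming the codes are stored as char vectors (which would explain the behaviour in the question). The local variable type_des stands in for fund.type_des, and str2double is swapped in for str2num because it also converts a whole cell array of char vectors in one call.

```matlab
% Stand-in for fund.type_des, with the codes assumed to be chars
type_des = {'Region', 'Code'; 'US', '1'; 'EU', '2'};

eq_code_index = find(strcmp(type_des(:, 1), 'EU'));   % row of the EU entry
eq_code = str2double(type_des{eq_code_index, 2});     % curly braces unwrap the cell

% All codes at once, skipping the header row
codes = str2double(type_des(2:end, 2));
```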
I have a string and I need two characters to be returned.
I tried with strsplit but the delimiter must be a string and I don't have any delimiters in my string. Instead, I always want to get the second number in my string. The number is always 2 digits.
Example: 001a02.jpg I use the fileparts function to delete the extension of the image (jpg), so I get this string: 001a02
The expected return value is 02
Another example: 001A43a . Return values: 43
Another one: 002A12. Return values: 12
All the filenames are in a matrix 1002x1. Maybe I can use textscan but in the second example, it gives "43a" as a result.
(Just so this question doesn't remain unanswered, here's a possible approach: )
One way to go about this uses splitting with regular expressions (MATLAB's strsplit which you mentioned):
str = '001a02.jpg';
C = strsplit(str,'[a-zA-Z.]','DelimiterType','RegularExpression');
Results in:
C =
'001' '02' ''
In older versions of MATLAB, before strsplit was introduced, similar functionality was achieved using regexp(...,'split').
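A rough equivalent using regexp with the 'split' option might look like the following. The trailing + in the pattern is an addition made here so that a run of delimiter characters such as '.jpg' is treated as one separator, mimicking how strsplit collapses consecutive delimiters by default.

```matlab
str = '001a02.jpg';

% Split on runs of letters or dots
C = regexp(str, '[a-zA-Z.]+', 'split');

first_part  = C{1};   % '001'
second_part = C{2};   % '02'
```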
If you want to learn more about regular expressions (abbreviated as "regex" or "regexp"), there are many online resources (JGI..)
In your case, if you only need to take the 5th and 6th characters from the string you could use:
D = str(5:6);
... and if you want to convert those into numbers you could use:
E = str2double(str(5:6));
If your number is always at a certain position in the string, you can simply index this position.
In the examples you gave, the number is always the 5th and 6th characters in the string.
filename = '002A12';
num = str2num(filename(5:6));
Otherwise, if the formatting is more complex, you may want to use a regular expression. There is a similar question, "matlab - extracting numbers from (odd) string". Modifying the code found there, you can do the following:
all_num = regexp(filename, '\d+', 'match'); %Find all numbers in the filename
num = str2num(all_num{2}) %Convert second number from str
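Since the question has 1002 filenames, the same idea can be applied to a whole cell array at once. This is a sketch under the assumption that every name follows the pattern digits, one letter, two digits; a name that does not match would produce an empty token and make the cellfun call fail. The three sample names are taken from the question.

```matlab
filenames = {'001a02'; '001A43a'; '002A12'};

% Capture the two digits that follow the first letter
tokens = regexp(filenames, '^\d+[a-zA-Z](\d{2})', 'tokens', 'once');

% Each element of tokens is a 1x1 cell holding the captured text
nums = cellfun(@(t) str2double(t{1}), tokens);
```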