I'm working on a program that converts between number bases. For example Octal is 8, decimal is 10. Letters A to Z could be considered as base 26.
I want to convert a number like "A" into 0, Z into 25, "AA" into 27 and "BA" into 53.
Before I start coding I'm doing it on paper so I understand the process. To start out I'm trying to convert 533 to base 26.
What algorithm is best for doing this?
You need to assign a "digit" to each letter, like:
A = 0 N = 13
B = 1 O = 14
C = 2 P = 15
D = 3 Q = 16
E = 4 R = 17
F = 5 S = 18
G = 6 T = 19
H = 7 U = 20
I = 8 V = 21
J = 9 W = 22
K = 10 X = 23
L = 11 Y = 24
M = 12 Z = 25
Then, your {20,13} becomes UN.
Converting back is UN -> {20,13} -> (20 * 26 + 13) -> 533.
By way of further example, let's try the number 10163, just plucked out of the air at random.
Divide that by 26 until you get a number less than 26 (i.e., twice), and you get 15 with a fractional part of 0.03402366.
Multiply that by 26 and you get 0 with a fractional part of 0.88461516.
Multiply that by 26 and you get 23 (actually 22.99999416 on my calculator but, since the initial division was only two steps, we stop here - the very slight inaccuracy is due to the fact that the floating point numbers are being rounded).
So the "digits" are {15,0,23} which is the "number" PAX. Wow, what a coincidence?
To convert PAX back into decimal, it's
P * 26^2 + A * 26^1 + X * 26^0
or
(15 * 676) + (0 * 26) + 23
= 10140 + 0 + 23
= 10163
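The paper procedure above can also be carried out with integer division and remainders, which sidesteps the floating-point rounding seen in the calculator steps. Here is a minimal Python sketch, assuming the A = 0 ... Z = 25 table above; the function names to_letters and from_letters are made up for illustration:

```python
import string

LETTERS = string.ascii_uppercase  # 'A'..'Z' act as the digits 0..25


def to_letters(n):
    """Convert a non-negative integer to base-26 letters (A = 0, Z = 25)."""
    if n == 0:
        return LETTERS[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 26)  # peel off the least significant digit
        digits.append(LETTERS[remainder])
    return "".join(reversed(digits))  # most significant digit first


def from_letters(text):
    """Convert base-26 letters back to an integer."""
    value = 0
    for ch in text:
        value = value * 26 + LETTERS.index(ch)  # shift one place, add the digit
    return value


print(to_letters(533))      # UN
print(to_letters(10163))    # PAX
print(from_letters("PAX"))  # 10163
```

Taking remainders yields the digits from right to left, which is why the list is reversed at the end.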
Let's take a step back for a second, and look at decimal.
What does a number like "147" mean? Or rather, what do the characters '1', '4' and '7', when arranged like that, indicate?
There are ten digits in decimal, and after that, we add another digit to the left of the first, and so on as our number increases. So after "9" = 9*1, we get "10" = 1*10 + 0*1. So "147" is 1*10^2 + 4*10 + 7*1 = 147. Similarly, we can go backwards - 147/10^2 = 1, which maps to the character '1'. (147 % 10^2) / 10 = 4, which maps to the character '4'. And 147 % 10 = 7, which maps to the character '7'.
This works for any base N - if we get the number 0, that maps to the first character in our set. The number 1 maps to the second character, and so on until the number N-1 maps to the last character in our set of digits.
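As a rough illustration of the "divide by the highest power first" view described above, here is a small Python sketch. The function name to_base and the idea of passing the digit set as a string are assumptions made for this example, not something from the question:

```python
def to_base(n, digits):
    """Write non-negative integer n using the characters in digits.

    The base is len(digits); digits[0] stands for 0, digits[1] for 1, etc.
    """
    base = len(digits)
    # Find the highest power of the base that still fits into n.
    power = 1
    while power * base <= n:
        power *= base
    out = []
    while power > 0:
        out.append(digits[n // power])  # digit at this position
        n %= power                      # drop it and keep the rest
        power //= base
    return "".join(out)


print(to_base(147, "0123456789"))  # 147
print(to_base(147, "01234567"))    # 223
```

Passing the 26 uppercase letters as the digit set gives the base-26 case from the question.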
You convert 20 and 13 to the symbols that represent 20 and 13 in your base 26 notation. It sounds like you are using the letters of the alphabet so, that would be UN (where A is 0 and Z is 25).
What language are you writing this in? If you're doing this in Perl you can use the CPAN module Math::Fleximal that I wrote many years ago while I was bored. If you're using a language with arbitrary-precision integers, then life becomes much easier. All you have to do is take characters, convert them into an array of integers, then do the calculation to turn that into a number.
Want to convert the alphabet to numerical values and transform it back to alphabets using some mathematical techniques like fast Fourier transform in MATLAB.
Example:
The following is the text saved in "text2figure.txt" file
Hi how r u am fine take care of your health
thank u very much
am 2.0
Reading it in MATLAB:
data=fopen('text2figure.txt','r')
d=fscanf(data,'%s')
temp = fileread( 'text2figure.txt' )
temp = regexprep( temp, ' {6}', ' NaN' )
c=cellstr(temp(:))'
Now I wish to convert cell array with spaces to numerical values/integers:
coding = 'abcdefghijklmnñopqrstuvwxyz .,;'
str = temp %// example text
[~, result] = ismember(str, coding)
y=result
result =
Columns 1 through 18
0 9 28 8 16 24 28 19 28 22 28 1 13 28 6 9 14 5
Columns 19 through 36
28 21 1 11 5 28 3 1 19 5 28 16 6 28 26 16 22 19
Columns 37 through 54
28 8 5 1 12 21 8 28 0 0 21 8 1 14 11 28 22 28
Columns 55 through 71
23 5 19 26 28 13 22 3 8 0 0 1 13 28 0 29 0
Now I wish to convert the numerical values back to alphabets:
Hi how r u am fine take care of your health
thank u very much
am 2.0
How to write a MATLAB code to return the numerical values in the variable result to alphabets?
Most of the code in the question doesn't have any useful effects. These three lines are the ones that lead to result:
str = fileread('text2figure.txt');
coding = 'abcdefghijklmnñopqrstuvwxyz .,;';
[~, result] = ismember(str, coding);
ismember returns, in the second output argument, the indices into coding for each element of str. Thus, result contains indices that we can use to index into coding:
out = coding(result);
However, this does not work because some elements of str do not occur in coding, and for those elements ismember returns 0, which is not a valid index. We can replace the zeros with a new character:
coding = ['*',coding];
out = coding(result+1);
Basically, we're shifting each code by one, adding a new code for 1.
One of the characters we're missing here is the newline character. Thus the three lines have become one line. You can add a code for the newline character by adding it to the coding table:
str = fileread('text2figure.txt');
coding = ['abcdefghijklmnñopqrstuvwxyz .,;',char(10)]; % char(10) is the newline character
[~, result] = ismember(str, coding);
coding = ['*',coding];
out = coding(result+1);
All of this is easier to achieve just using the ASCII code table:
str = fileread('text2figure.txt');
result = double(str);
out = char(result);
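For readers who want to see the lookup-table idea outside MATLAB, here is a small Python sketch of the same mechanism. The names CODING, encode and decode are made up for this example, and the '*' placeholder for unknown characters follows the answer above:

```python
CODING = "abcdefghijklmnñopqrstuvwxyz .,;\n"  # same table, newline included


def encode(text):
    """Return 1-based positions in CODING, or 0 for characters not in it."""
    return [CODING.find(ch) + 1 for ch in text]  # find() gives -1 if missing


def decode(codes):
    """Map codes back to characters; code 0 becomes the placeholder '*'."""
    table = "*" + CODING  # shift by one so that code 0 is a valid index
    return "".join(table[c] for c in codes)


codes = encode("Hi how r u\nam 2.0")
print(codes[:2])            # [0, 9]
print(repr(decode(codes)))  # '*i how r u\nam *.*'
```

As in the MATLAB version, characters missing from the table (here 'H', '2' and '0') cannot be recovered; they all come back as '*'.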
a=magic(5)
k=a,3
When I print k, it simply shows a.
m=size(a,3)
n=size(a,6)
when I print m and n, they print different values.
Anyone please explain what this function is?
On Octave 4.2.1
k=a,3
assigns the matrix a to the variable k and displays it, then, as a second instruction, prints the value 3 in the Command Window.
The , (comma) is used in order to have two instructions on the same row.
An alternative could be replacing the , with the ; which has the effect of suppressing the output of the assignment k=a in the Command Window.
With respect to
m=size(a,3)
n=size(a,6)
the second parameter in the call to size specifies the dimension of the matrix (the first parameter) for which you want to know the size.
a is a two-dimensional matrix of size (5 x 5), while the instruction size(a,3) asks for the size of the third dimension of a.
In a similar way, size(a,6) asks for the size of a's sixth dimension. In these cases, a is treated as (5 x 5 x 1) and (5 x 5 x 1 x 1 x 1 x 1), respectively.
The return value, in both cases, is 1.
This is the output in the Command Window:
>> a=magic(5)
a =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> k=a,3
k =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
ans = 3
>> m=size(a,3)
m = 1
>> n=size(a,6)
n = 1
In matlab / octave, there are three ways to terminate an expression (e.g. 1+2):
With a semicolon ;
With a comma ,
With a newline (i.e. pressing enter)
The first one (i.e. the semicolon) when used, evaluates the expression, but suppresses its output. The other two (i.e. the comma and the newline), both evaluate the statement and also display its result.
Why have both a comma and a newline? Because, with a comma, you can evaluate multiple expressions on the same line (and have all of them display their results).
Note: Given the fact that most people write their expressions in separate lines, the comma tends not to be used very much, so it is less known.
Examples:
octave:1> 1+2, 3+4
ans = 3
ans = 7
octave:2> 1+2; 3+4;
octave:3> 1+2; 3+4
ans = 7
octave:4> 1+2, 3+4;
ans = 3
octave:5> for i = 1:3; i; end % output in each iteration is suppressed
octave:6> for i = 1:3; i, end % whereas with a comma, output is not suppressed
i = 1
i = 2
i = 3
Therefore your statements:
a = magic(5)
k = a, 3
are essentially equivalent to
a = magic(5) % newline used: display value of a after assignment
k = a, % comma used, assign value of a to k, then display k
3 % newline used: displays the value '3' after pressing enter
Furthermore the size function doesn't do what you think it does. size(a,3) returns the size of array a in the 3rd dimension.
I have a 20*120 matrix. For each column in the matrix I need to find the maximum value between all the values, and then sum the remaining values. Then I need to divide the maximum value by the summation of the remaining values. I tried the following code but the result was not correct. What is the problem?
s = 1:z %z=120
for i = 1:x %x=20
maximss = max(Pres_W); %maximum value
InterFss = (sum(Pres_W))-maximss; %remaining values
SIRk(:,s) = (maximss(:,s))./(InterFss(:,s));
end
Instead of answering "what's wrong", I'll first provide a solution explaining how this should be done:
Say we have an example matrix m as follows:
m =
8 5 9 14 10 7 5
10 8 12 11 9 9 12
10 3 7 7 8 4 6
13 11 6 15 13 11 9
Find the maximum value of each column:
col_max = max(m, [], 1)
col_max =
13 11 12 15 13 11 12
Sum all elements in each column, and subtract the maximum values:
col_sum = sum(m, 1) - col_max
col_sum =
28 16 22 32 27 20 20
Divide the maximum value by the sum of the other elements:
col_max ./ col_sum
ans =
0.46429 0.68750 0.54545 0.46875 0.48148 0.55000 0.60000
Or, as a one-liner:
max(m,[],1)./(sum(m,1)-max(m,[],1))
ans =
0.46429 0.68750 0.54545 0.46875 0.48148 0.55000 0.60000
By the way: Your code does exactly what you're explaining, it returns the maximum value divided by all values except the maximum value.
Notes regarding best practice:
Vectorize things like this, no need for loops.
max(m, [], 1) is the same as max(m) for 2D-arrays. However, if your matrix for some reason only has one row, max(m) will return the maximum value of the row, thus a single number, whereas max(m, [], 1) still works column by column.
sum(m,1) is the same as sum(m) for 2D-arrays. However, if your matrix for some reason only has one row, sum(m) will return the sum of the row, thus a single number, whereas sum(m,1) still works column by column.
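The column-wise computation can also be checked outside MATLAB. The following is a plain-Python sketch (the function name max_over_rest is made up for this example) applied to the example matrix m from above:

```python
def max_over_rest(matrix):
    """For each column, divide the maximum by the sum of the other values."""
    ratios = []
    for column in zip(*matrix):       # zip(*rows) iterates over the columns
        largest = max(column)
        rest = sum(column) - largest  # everything except one copy of the max
        ratios.append(largest / rest)
    return ratios


m = [
    [8, 5, 9, 14, 10, 7, 5],
    [10, 8, 12, 11, 9, 9, 12],
    [10, 3, 7, 7, 8, 4, 6],
    [13, 11, 6, 15, 13, 11, 9],
]
print([round(r, 5) for r in max_over_rest(m)])
# [0.46429, 0.6875, 0.54545, 0.46875, 0.48148, 0.55, 0.6]
```

Like the MATLAB one-liner, this subtracts the maximum once, so a maximum that occurs twice in a column still contributes one copy to the remaining sum.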
I have a matrix of 2d lets assume the values of the matrix
a =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
17 24 1 8 15
11 18 25 2 9
This matrix is going to be divided into three different matrices randomly let say
b =
17 24 1 8 15
23 5 7 14 16
c =
4 6 13 20 22
11 18 25 2 9
d =
10 12 19 21 3
17 24 1 8 15
How can I find the index of the vectors in matrix d, for example, in the original matrix a? Note that the values of the matrix can be duplicated.
For example, what if I want to know the index of {10 12 19 21 3} in matrix a?
Or the index of {17 24 1 8 15} in matrix a, where only one index value should be returned?
I would appreciate it so much if you can help me with this. Thank you in advance
You can use ismember with the 'rows' option. For example:
tf = ismember(a, c, 'rows')
Should produce:
tf =
0
0
1
0
0
1
To get the indices of the rows, you can apply find on the result of ismember (note that it's redundant if you're planning to use this vector for matrix indexing). Here find(tf) returns the vector [3; 6].
If you want to know the number of the row in matrix a that matches a single vector, you either use the method explained and apply find, or use the second output parameter of ismember. For example:
[tf, loc] = ismember([10 12 19 21 3], a, 'rows')
returns loc = 4 for your example. Note that here a is the second parameter, so that the output variable loc would hold a meaningful result.
Handling floating-point numbers
If your data contains floating-point numbers, the ismember approach is going to fail because floating-point comparisons are inaccurate. Here's a shorter variant of Amro's solution:
x = reshape(c', size(c, 2), 1, []);
tf = any(all(abs(bsxfun(@minus, a', x)) < eps), 3)';
Essentially this is a one-liner, but I've split it into two commands for clarity:
x is the target rows to be searched, concatenated along the third dimension.
bsxfun subtracts each target row in turn from all rows of a, and the magnitude of the result is compared to some small threshold value (e.g. eps). If all elements in a row fall below it, mark this row as "1".
It depends on how you build those divided matrices. For example:
a = magic(5);
d = a([2 1 2 3],:);
then the matching rows are obviously: 2 1 2 3
EDIT:
Let me expand on the idea of using ismember shown by @EitanT to handle floating-point comparisons:
tf = any(cell2mat(arrayfun(@(i) all(abs(bsxfun(@minus, a, d(i,:)))<1e-9,2), ...
1:size(d,1), 'UniformOutput',false)), 2)
not pretty but works :) This would be necessary for comparisons such as: 0.1*3 == 0.3
(basically it compares each row of d against all rows of a using an absolute difference)
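To make the tolerance idea concrete, here is a small Python sketch of the same row comparison. The function name matching_rows is hypothetical, and the default tolerance of 1e-9 is simply borrowed from the snippet above:

```python
def matching_rows(a, d, tol=1e-9):
    """For each row of a, report whether it matches any row of d within tol."""
    def same(row1, row2):
        # Rows count as equal when every element differs by less than tol.
        return all(abs(x - y) < tol for x, y in zip(row1, row2))

    return [any(same(row_a, row_d) for row_d in d) for row_a in a]


print(0.1 * 3 == 0.3)  # False: exact comparison fails
a = [[0.1 * 3, 1.0], [0.5, 2.0]]
d = [[0.3, 1.0]]
print(matching_rows(a, d))  # [True, False]
```

The sketch assumes all rows have the same length; a suitable tolerance depends on the scale of the data.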
In MATLAB, I would like to generate n pairs of random integers in the range [1, m], where each pair is unique. For uniqueness, I consider the order of the numbers in the pair to be irrelevant such that [3, 10] is equal to [10, 3].
Also, each pair should consist of two distinct integers; i.e. [3, 4] is ok but [3, 3] would be rejected.
EDIT: Each possible pair should be chosen with equal likelihood.
(Obviously a constraint on the parameters is that n <= m(m-1)/2.)
I have been able to successfully do this when m is small, like so:
m = 500; n = 10; % setting parameters
A = ((1:m)'*ones(1, m)); % each column has the numbers 1 -> m
idxs1 = squareform(tril(A', -1))';
idxs2 = squareform(tril(A, -1))';
all_pairs = [idxs1, idxs2]; % this contains all possible pairs
idx_to_use = randperm( size(all_pairs, 1), n ); % choosing random n pairs
pairs = all_pairs(idx_to_use, :)
pairs =
254 414
247 334
111 146
207 297
45 390
229 411
9 16
75 395
12 338
25 442
However, the matrix A is of size m x m, meaning when m becomes large (e.g. upwards of 10,000), MATLAB runs out of memory.
I considered generating a load of random numbers randi(m, [n, 2]), and repeatedly rejecting the rows which repeated, but I was concerned about getting stuck in a loop when n was close to m(m-1)/2.
Is there an easier, cleaner way of generating unique pairs of distinct integers?
Easy, peasy, when viewed in the proper way.
You wish to generate n pairs of integers, [p,q], such that p and q lie in the interval [1,m], and p < q.
How many possible pairs are there? The total number of pairs is just m*(m-1)/2. (I.e., the sum of the numbers from 1 to m-1.)
So we could generate n random integers in the range [1,m*(m-1)/2]. Randperm does this nicely. (Older matlab releases do not allow the second argument to randperm.)
k = randperm(m/2*(m-1),n);
(Note that I've written this expression with m in a funny way, dividing by 2 in perhaps a strange place. This avoids precision problems for some values of m near the upper limits.)
Now, if we associate each possible pair [p,q] with one of the integers in k, we can work backwards, from the integers generated in k, to a pair [p,q]. Thus the first few pairs in that list are:
{[1,2], [1,3], [2,3], [1,4], [2,4], [3,4], ..., [m-1,m]}
We can think of them as the elements in a strictly upper triangular array of size m by m, thus those elements above the main diagonal.
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
See that these formulas recover p and q from the unrolled elements in k. We can convince ourselves that this does indeed work, but perhaps a simple way here is just this test:
k = 1:21;
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
[k;p;q]'
ans =
1 1 2
2 1 3
3 2 3
4 1 4
5 2 4
6 3 4
7 1 5
8 2 5
9 3 5
10 4 5
11 1 6
12 2 6
13 3 6
14 4 6
15 5 6
16 1 7
17 2 7
18 3 7
19 4 7
20 5 7
21 6 7
Another way of testing it is to show that all pairs get generated for a small case.
m = 5;
n = 10;
k = randperm(m/2*(m-1),n);
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
sortrows([p;q]',[2 1])
ans =
1 2
1 3
2 3
1 4
2 4
3 4
1 5
2 5
3 5
4 5
Yup, it looks like everything works perfectly. Now try it for some large numbers for m and n to test the time used.
tic
m = 1e6;
n = 100000;
k = randperm(m/2*(m-1),n);
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
toc
Elapsed time is 0.014689 seconds.
This scheme will work for m as large as roughly 1e8, before it fails due to precision errors in double precision. The exact limit should be m no larger than 134217728 before m/2*(m-1) exceeds 2^53. A nice feature is that no rejection for repeat pairs need be done.
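The index-to-pair mapping can be sketched in Python as well. Two things here are assumptions of this example and not part of the answer above: the function names unrank_pair and random_pairs, and the use of math.isqrt (Python 3.8 or later) so that the square root is taken in exact integer arithmetic instead of double precision:

```python
import math
import random


def unrank_pair(k):
    """Map k = 1, 2, 3, ... to the k-th pair (p, q) with 1 <= p < q."""
    # Integer form of q = floor(sqrt(8*(k-1) + 1)/2 + 3/2).
    q = (math.isqrt(8 * (k - 1) + 1) + 3) // 2
    p = k - (q - 1) * (q - 2) // 2
    return p, q


def random_pairs(m, n):
    """Draw n distinct unordered pairs from 1..m, each equally likely."""
    total = m * (m - 1) // 2
    ks = random.sample(range(1, total + 1), n)  # n distinct indices
    return [unrank_pair(k) for k in ks]


print([unrank_pair(k) for k in range(1, 7)])
# [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4)]
print(sorted(random_pairs(5, 10)))  # all 10 pairs drawn from 1..5
```

Because random.sample draws the indices without replacement, no rejection step for repeated pairs is needed, just as with randperm.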
This is more of a general approach rather than a MATLAB solution.
How about you do the following: first you fill a vector like this.
x[n] = rand()
x[n + 1] = x[n] + rand() %% where rand can be equal to 0.
Then you do the following again
x[n][y] = x[n][y] + rand() + 1
And if
x[n] == x[n+1]
You would make sure that the same pair is not already selected.
After you are done you can run a permutation algorithm on the matrix if you want them to be randomly spaced.
This approach will give you all the possible pairs of 2 integers, and it runs in O(n), where n is the height of the matrix.
The following code does what you need:
n = 10000;
m = 500;
% Draw n candidate pairs in [1, m], sort each pair, and drop duplicate rows.
my_list = unique(sort(randi(m, n, 2), 2), 'rows');
% Discard pairs whose two entries are equal.
my_list = my_list(my_list(:,1) ~= my_list(:,2), :);
%temp = my_list; %In case you want to check what you initially generated.
while(size(my_list,1)~=n)
%my_list = unique([my_list;sort(randi(m,1,2),2)],'rows');
%Changed as per @jucestain's suggestion.
my_list = unique([my_list; sort(randi(m, n-size(my_list,1), 2), 2)], 'rows');
my_list = my_list(my_list(:,1) ~= my_list(:,2), :);
end