Search for a specific digit in an integer - matlab

I'm looking for a really quick method in MATLAB of searching for a specific digit within an integer, ideally in a given position. For example:
Simple case...
I want to look through an array of integers and return all those which contain the number 1 eg 1234, 4321, 6515, 847251737 etc
More complex case...
I want to loop through an array of integers and return all those which contain the number 1 in the third digit eg 6218473, 541846, 3115473 BUT 175846 would not be returned.
Any thoughts?

There are a few answers here already, but I'll throw my attempt into the pot.
Conversion to string can be expensive, so if it can be avoided, it should be.
n = 1:100000; % sample numbers
m = 3; % digit to check
x = 1; % number to find
% Length of the numbers in digits
num_length = floor(log10(abs(n)))+1;
% digit (from the left) to check
num_place = num_length-m;
% get the digit
digit_in_place = mod(floor(abs(n)./(10.^num_place)),10);
found_number = n(digit_in_place==x);
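As a quick check of the arithmetic above, here is a minimal sketch that applies the same formula to the numbers from the question. The helper name digit_from_left is made up for this illustration, and it assumes every input is a nonzero integer with at least m digits.

```matlab
% Hypothetical helper (not from the original answer): m-th digit from the left.
% Assumes n holds nonzero integers with at least m digits each.
digit_from_left = @(n, m) mod(floor(abs(n) ./ 10.^(floor(log10(abs(n))) + 1 - m)), 10);

n = [6218473, 541846, 3115473, 175846];   % numbers from the question
third = digit_from_left(n, 3);            % third digit of each number
found_number = n(third == 1);             % keep those whose third digit is 1
disp(found_number)
```

For numbers with fewer than m digits the exponent becomes negative and the extracted digit comes out as 0, so a search for x = 0 would need an extra length check.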

By casting to strings, the trick to vectorising is just to make sure x is a column vector. x(:) guarantees this. Also you need to left-align the strings which is done with the format specifier '%-d' where - is for left-alignment and d is for integers:
s = num2str(x(:), '%-d');
ind = s(:,3)=='1'
and this also allows you to easily solve your first case:
ind = any(s=='1',2)
in either case to recover your original number just go:
x(ind)

One way of getting there is to cast your numbers as strings and then check if the 3rd position of that string is '1'. It works perfectly fine in a loop, but I am confident that there is also a vectorized solution:
numbers = [6218473, 541846, 3115473, 175846]'
returned_numbers = [];
for i = 1:length(numbers)
number = numbers(i);
y = sprintf('%d', number) %// cast to string
%// add number to list, if its third character is 1
if strcmp(y(3), '1')
returned_numbers = [returned_numbers, number];
end
end
% // it returns:
returned_numbers =
6218473 541846 3115473
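The same string idea can be written without growing the result inside the loop. This is only a sketch: the variable names are illustrative, the value 12 is added to show the guard for numbers shorter than three digits, and negative inputs would need abs first because the minus sign counts as a character.

```matlab
numbers = [6218473, 541846, 3115473, 175846, 12];   % 12 exercises the length guard
% Convert each number to its decimal string
strs = arrayfun(@(v) sprintf('%d', v), numbers, 'UniformOutput', false);
% Simple case: any digit equals '1'
has_one = cellfun(@(s) any(s == '1'), strs);
% More complex case: third character is '1' (false if fewer than 3 characters)
third_is_one = cellfun(@(s) numel(s) >= 3 && s(3) == '1', strs);
contains_one = numbers(has_one);
returned_numbers = numbers(third_is_one);
disp(returned_numbers)
```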

Code
%// Input array
array1 = [-94341 1234 4321 6515 847251737 6218473 541846 3115473 175846]
N = numel(array1); %// number of elements in input array
digits_sep = num2str(array1(:))-'0'; %//' Separate the digits into a matrix
%// Simple case
output1 = array1(any(digits_sep==1,2))
%// More complex case output
col_num = 3;
%// Get column numbers for each row of the digits matrix and thus
%// the actual linear index corresponding to 3rd digit for each input element
ind1 =sub2ind(size(digits_sep),1:N,...
size(digits_sep,2)-floor(log10(abs(array1))-col_num+1));
%// Select the third digits, check which ones have `1` and use them to logically
%// index into input array to get the output
output2 = array1(digits_sep(ind1)==1)
Code run -
array1 =
-94341 1234 4321 6515 847251737 6218473 541846 3115473 175846
output1 =
-94341 1234 4321 6515 847251737 6218473 541846 3115473 175846
output2 =
6515 6218473 541846 3115473

Related

MATLAB. How to remove rows, if any of the values in the row is found in another row?

I have a matrix as shown in the image. In this matrix if any of the values in one row is found in another row we remove the shorter row. For example row 2 to row 5 all contain 3, therefore I want to keep only row 5(the row with most non-zero values) and remove all other rows...please suggest a solution.
Thanks
I believe the below code should work. The idea is to sort the matrix first according to the number of elements in the rows, then loop and remove the rows that have matches. Probably not the most efficient code but should work in principle.. see the comments for more explanation
% generating the data
M = zeros(6, 10);
M(2,1:3) = [3 8 10];
M(3,1:4) = [3 8 10 9];
M(4,1:5) = [3 8 10 9 7];
M(5,1:6) = [3 8 10 9 7 4];
M(6,1) = [5];
% sorting according to the number of non-zero elements
nr_of_nonzero = sum(M~=0, 2);
[~, sort_indices] = sort(nr_of_nonzero);
M_sorted = M(sort_indices,:);
M_sorted(M_sorted==0)=NaN; % should not compare 0s (?)
% get rid of the matches
for i=1:size(M_sorted, 1)-1
for j=(i+1):size(M_sorted, 1)
[C,ia,ib] = intersect(M_sorted(i,:),M_sorted(j,:));
if numel(C)>0
M_sorted(i,:) = NaN;
break; % row i is removed, no need to compare it with further rows
end
end
end
% reorder
M(sort_indices,:) = M_sorted;
% remove all NaN rows
M(all(isnan(M),2),:) = [];
% back to 0s
M(isnan(M)) = 0;
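The key test inside the loop is whether two rows have a value in common. Below is a minimal sketch of that check in isolation. Instead of replacing zeros with NaN it filters the zero padding out before calling intersect, which I am assuming is an acceptable alternative here; the helper name shares_value is hypothetical.

```matlab
r1 = [3 8 10 0 0 0];
r2 = [3 8 10 9 7 4];
r3 = [5 0 0 0 0 0];
% Hypothetical helper: true if two rows have a nonzero value in common
shares_value = @(a, b) ~isempty(intersect(a(a ~= 0), b(b ~= 0)));
disp(shares_value(r1, r2))   % r1 and r2 share 3, 8 and 10
disp(shares_value(r1, r3))   % r1 and r3 have nothing in common
```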
I'm not doing all the code here, but here's the steps that I would take to solve it. You will likely have to try different ways of doing it to obtain the intended result (i.e. vector operations, while loop, for loop, etc.).
Problem
Rows are repetitive and need to be reduced in a more compact form.
Solution
1. Look up mat2str.
2. Convert your vectors (rows) to strings. This can be done with temporary values like tmpstr1 = mat2str(yourMatrix(rowToBeCompared, :));
3. Parse the first string from beginning to end, while parsing the second string in the same way to make comparisons.
4. Use strcmp to see if the string characters (or strings themselves) are the same: http://www.mathworks.com/help/matlab/ref/strcmp.html
5. Delete a row if you find it appropriate with yourMatrix(rowToDelete, :) = [];
Try that and see if it works.
Note - Expansion of step 3:
if we have variable a = '[ab+11]';, we can select individual characters from the string like:
a(4)
ans = '+'
a(5)
ans = '1'
a(1)
ans = '['
Therefore, you can parse the string with a loop:
for n = 1 : length(a)
if a(n) == '1' || a(n) == '0'
str(n) = a(n);
end
end
Like Sardar Usama said, it's helpful to provide the code so that we can copy and paste into our own MATLAB workspaces.

how to sum digits in a multi-digit number Matlab

I wonder how to sum digits for a multi-digit number in Matlab.
For example 1241= 1+2+4+1 = 8
String-based answer:
>> n = 1241;
>> sum(int2str(n)-48)
ans =
8
The number is first converted to a string representation using int2str, then the ASCII code for '0' (i.e. 48) is subtracted from the ASCII code for each element of the string, producing a numeric vector. This is then summed to get the result.
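The intermediate values make the conversion easier to follow. This is a small sketch of the same one-liner split into steps; the handling of a negative input at the end is my own addition, not part of the answer.

```matlab
n = 1241;
str = int2str(n);        % character vector '1241'
digits = str - '0';      % same as subtracting 48; gives [1 2 4 1]
total = sum(digits);
disp(total)
% For a negative input, take abs first so that '-' is not treated as a digit
total_neg = sum(int2str(abs(-1241)) - '0');
disp(total_neg)
```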
A = 35356536576821;
A = abs(A);
xp = ceil(log10(A)):-1:1;
while ~isscalar(xp)
A = sum(fix(mod(A,10.^xp)./10.^[xp(2:end) 0]));
xp = ceil(log10(A)):-1:1;
end
This is the numeric approach.
And this is the character-based approach:
A = '35356536576821';
A = char(regexp(A,'\d+','match'));
while ~isscalar(A)
A = num2str(sum(A - '0'));
end
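Both approaches first take the absolute value (stripping the minus sign). The numeric one uses log10() to count the digits and extracts them with modulus and division before summing. The character one converts the characters to numeric digits by subtracting '0', sums them, and converts the result back to a string. Note that both while loops repeat the summation until a single digit remains, so when the first sum has more than one digit they return the repeated digit sum rather than the plain one.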
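To see the difference between one pass and the repeated summation, here is a short sketch based on the character version, using the same input. The variable names single_pass and repeated are mine.

```matlab
A = '35356536576821';
% One pass: plain sum of the digits
single_pass = sum(A - '0');
% Repeated passes, as in the while loop: keep summing until one digit is left
r = A;
while ~isscalar(r)
    r = num2str(sum(r - '0'));
end
repeated = str2double(r);
disp([single_pass repeated])
```

The digits add up to 65, and repeating the process gives 6+5 = 11 and then 1+1 = 2, so the two results differ for this input.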
Another all-arithmetic approach:
n = 1241; %// input
s = 0; %// initialize output
while n>0 %// while there is some digit left
s = s + mod(n,10); %// sum rightmost digit
n = floor(n/10); %// remove that digit
end
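As a sanity check, the arithmetic loop can be compared with the string-based one-liner on a few inputs, including some that contain zeros. This sketch takes the rightmost digit with mod(n, 10), so a zero digit contributes 0; the test values are arbitrary.

```matlab
test_values = [1241, 1005, 90, 7];
s_arith = zeros(size(test_values));
for k = 1:numel(test_values)
    n = test_values(k);
    s = 0;
    while n > 0
        s = s + mod(n, 10);   % rightmost digit, in the range 0..9
        n = floor(n / 10);    % drop that digit
    end
    s_arith(k) = s;
end
% Reference result from the string-based approach
s_str = arrayfun(@(v) sum(int2str(v) - '0'), test_values);
disp([s_arith; s_str])
```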
You can use this code
sum(int2str(n)-48)
where n, is your input number.

randomly disperse numbers in array

I am trying to randomly disperse different numbers in MATLAB array:
I have two 3's, four 2's and I want to randomly populate ones vector (size 10,1).
End result look something like this:
A = [1;3;1;2;3;2;2;1;1;2;1;1]
Then I want to fix the values in A but add more random elements but I can only replace with higher numbers:
For example, to the matrix above I will randomly add two more 2's and two more 3's giving something like this
A= [3;3;2;2;3;2;2;2;1;2;1;3]
M = [3;3;2;2;2;2];
M(end+1:end+4) = 1;
M=M(randperm(10))
The second half of your question needs a lot of clarification.
First part
You can use randsample for that:
A = ones(1,12); %// original values
v = [3 3 2 2 2 2]; %// values to "disperse" in A
ind_replace = randsample(1:numel(A), numel(v)); %// index of entries to be replaced
A(ind_replace) = v;
If you don't have randsample (which is part of the Statistics Toolbox), use randperm and select the first few elements:
ind_replace = randperm(numel(A));
ind_replace = ind_replace(1:numel(v));
A(ind_replace) = v;
Second part
To only replace entries which equal 1:
v = [2 2 3 3]; %// values to "disperse" among the 1 values in A
ind_ones = find(A==1); %// index of entries which equal one
ind_replace = randsample(1:numel(ind_ones), numel(v)); %// index within the above
%// Or: ind_replace = randperm(numel(ind_ones));
%// ind_replace = ind_replace(1:numel(v));
A(ind_ones(ind_replace)) = v;
Note this generalizes the first part, that is, it can also be used when all entries of A equal 1.
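Putting both parts together, here is a sketch that uses only randperm (no toolbox) and checks that no entry is ever lowered. It assumes a vector of twelve ones, matching the length of the example in the question, and A_before is a name I introduced for the check.

```matlab
A = ones(12, 1);                       % original values
v1 = [3 3 2 2 2 2];                    % first batch to disperse
p = randperm(numel(A));
A(p(1:numel(v1))) = v1;                % overwrite randomly chosen entries
A_before = A;                          % snapshot for the check below

v2 = [2 2 3 3];                        % second batch, may only replace ones
ind_ones = find(A == 1);               % positions still equal to 1
p = randperm(numel(ind_ones));
A(ind_ones(p(1:numel(v2)))) = v2;      % replace a random subset of the ones

disp(A.')
disp(all(A >= A_before))               % no value was decreased
```

The second step fails with an indexing error if fewer ones remain than there are values in v2, so a real implementation would want to check numel(ind_ones) first.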

matlab parse file into cell array

I have a file in the following format in matlab:
user_id_a: (item_1,rating),(item_2,rating),...(item_n,rating)
user_id_b: (item_25,rating),(item_50,rating),...(item_x,rating)
....
....
so each line has values separated by a colon where the value to the left of the colon is a number representing user_id and the values to the right are tuples of item_ids (also numbers) and rating (numbers not floats).
I would like to read this data into a matlab cell array or better yet ultimately convert it into a sparse matrix wherein the user_id represents the row index, and the item_id represents the column index and store the corresponding rating in that array index. (This would work as I know a-priori the number of users and items in my universe so ids cannot be greater than that ).
Any help would be appreciated.
I have thus far tried the textscan function as follows:
c = textscan(f,'%d %s','delimiter',':') %this creates two cells one with all the user_ids
%and another with all the remaining string values.
Now if I try to do something like str2mat(c{2}), it works but it stores the '(' and ')' characters also in the matrix. I would like to store a sparse matrix in the fashion that I described above.
I am fairly new to matlab and would appreciate any help regarding this matter.
f = fopen('data.txt','rt'); %// data file. Open as text ('t')
str = textscan(f,'%s'); %// gives a cell which contains a cell array of strings
str = str{1}; %// cell array of strings
r = str(1:2:end);
r = cellfun(@(s) str2num(s(1:end-1)), r); %// rows; numeric vector
pairs = str(2:2:end);
pairs = regexprep(pairs,'[(,)]',' ');
pairs = cellfun(@(s) str2num(s(1:end-1)), pairs, 'uni', 0);
%// pairs; cell array of numeric vectors
cols = cellfun(@(x) x(1:2:end), pairs, 'uni', 0);
%// columns; cell array of numeric vectors
vals = cellfun(@(x) x(2:2:end), pairs, 'uni', 0);
%// values; cell array of numeric vectors
rows = arrayfun(@(n) repmat(r(n),1,numel(cols{n})), 1:numel(r), 'uni', 0);
%// rows repeated to match cols; cell array of numeric vectors
matrix = sparse([rows{:}], [cols{:}], [vals{:}]);
%// concat rows, cols and vals into vectors and use as inputs to sparse
For the example file
1: (1,3),(2,4),(3,5)
10: (1,1),(2,2)
this gives the following sparse matrix:
matrix =
(1,1) 3
(10,1) 1
(1,2) 4
(10,2) 2
(1,3) 5
I think newer versions of Matlab have a strsplit function that makes this approach overkill, but the following works, if not quickly. It splits the file into user ids and "other stuff" as you show, initializes a large empty matrix, and then iterates through the other stuff, breaking it apart and placing each rating in the correct place in the matrix.
(I didn't see the previous answer when I opened this for some reason. It is more sophisticated than this one, though this may be a little easier to follow at the expense of slowness.) I throw the \s* into the regex in case the spacing is inconsistent, but otherwise don't perform much in the way of data sanity checking. The output is the full array, which you can then turn into a sparse array if desired.
% matlab_test.txt:
% 101: (1,42),(2,65),(5,0)
% 102: (25,78),(50,12),(6,143),(2,123)
% 103: (23,6),(56,3)
clear all;
fclose('all');
% your path will vary, of course
file = '<path>/matlab_test.txt';
f = fopen(file);
c = textscan(f,'%d %s','delimiter',':');
celldisp(c)
uids = c{1}
tuples = c{2}
% These are stated as known
num_users = 3;
num_items = 40;
desired_array = zeros(num_users, num_items);
expression = '\((\d+)\s*,\s*(\d+)\)'
% Assuming length(tuples) == num_users for simplicity
for k = 1:num_users
uid = uids(k)
tokens = regexp(tuples{k}, expression, 'tokens');
for l = 1:length(tokens)
item_id = str2num(tokens{l}{1})
rating = str2num(tokens{l}{2})
desired_array(uid, item_id) = rating;
end
end
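If the sparse matrix is the goal, the regexp tokens can be collected into index vectors and passed to sparse directly. This is a sketch under a few assumptions: the lines are held in memory rather than read from a file, the sample data is the two-line example from the other answer, and the names file_lines and ratings are made up.

```matlab
% Sample lines in the question's format (in memory instead of read from a file)
file_lines = {'1: (1,3),(2,4),(3,5)', '10: (1,1),(2,2)'};
rows = []; cols = []; vals = [];
for k = 1:numel(file_lines)
    % Split "user_id: rest" into its two parts
    parts = regexp(file_lines{k}, '^\s*(\d+)\s*:(.*)$', 'tokens', 'once');
    uid = str2double(parts{1});
    % Each token is an {item_id, rating} pair of digit strings
    tok = regexp(parts{2}, '\((\d+)\s*,\s*(\d+)\)', 'tokens');
    if isempty(tok)
        continue;   % user without any ratings
    end
    nums = str2double(vertcat(tok{:}));            % n-by-2: [item_id, rating]
    rows = [rows; repmat(uid, size(nums, 1), 1)];  % user id repeated per pair
    cols = [cols; nums(:, 1)];
    vals = [vals; nums(:, 2)];
end
ratings = sparse(rows, cols, vals);
disp(ratings)
```

A rating of 0 is not stored as an entry by sparse, and duplicate (user, item) pairs are summed, so either case would need separate handling.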

Vectorizing the Notion of Colon (:) - values between two vectors in MATLAB

I have two vectors, idx1 and idx2, and I want to obtain the values between them. If idx1 and idx2 were numbers and not vectors, I could do that the following way:
idx1=1;
idx2=5;
values=idx1:idx2
% Result
% values =
%
% 1 2 3 4 5
But in my case, idx1 and idx2 are vectors of variable length. For example, for length=2:
idx1=[5,9];
idx2=[9 11];
Can I use the colon operator to directly obtain the values in between? This is, something similar to the following:
values = [5 6 7 8 9 9 10 11]
I know I can do idx1(1):idx2(1) and idx1(2):idx2(2), this is, extract the values for each column separately, so if there is no other solution, I can do this with a for-loop, but maybe Matlab can do this more easily.
Your sample output is not legal. A matrix cannot have rows of different length. What you can do is create a cell array using arrayfun:
values = arrayfun(@colon, idx1, idx2, 'Uniform', false)
To convert the resulting cell array into a vector, you can use cell2mat:
values = cell2mat(values);
Alternatively, if all vectors in the resulting cell array have the same length, you can construct an output matrix as follows:
values = vertcat(values{:});
Try taking the union of the sets. Given the values of idx1 and idx2 you supplied, run
values = union(idx1(1):idx2(1), idx1(2):idx2(2));
which yields the vector [5 6 7 8 9 10 11]. Note that union sorts the result and removes duplicates, so the repeated 9 from the question's example is dropped.
I couldn't get @Eitan's solution to work, apparently you need to specify parameters to colon. The small modification that follows got it working on my R2010b version:
step = 1;
idx1 = [5, 9];
idx2 = [9, 11];
values = arrayfun(@(x,y)colon(x, step, y), idx1, idx2, 'UniformOutput', false);
values=vertcat(cell2mat(values));
Note that step = 1 is actually the default value in colon, and Uniform can be used in place of UniformOutput, but I've included these for the sake of completeness.
There is a great blog post by Loren called Vectorizing the Notion of Colon (:). It includes an answer that is about 5 times faster (for large arrays) than using arrayfun or a for-loop and is similar to run-length-decoding:
The idea is to expand the colon sequences out. I know the lengths of
each sequence so I know the starting points in the output array. Fill
the values after the start values with 1s. Then I figure out how much
to jump from the end of one sequence to the beginning of the next one.
If there are repeated start values, the jumps might be negative. Once
this array is filled, the output is simply the cumulative sum or
cumsum of the sequence.
function x = coloncatrld(start, stop)
% COLONCAT Concatenate colon expressions
% X = COLONCAT(START,STOP) returns a vector containing the values
% [START(1):STOP(1) START(2):STOP(2) START(END):STOP(END)].
% Based on Peter Acklam's code for run length decoding.
len = stop - start + 1;
% keep only sequences whose length is positive
pos = len > 0;
start = start(pos);
stop = stop(pos);
len = len(pos);
if isempty(len)
x = [];
return;
end
% expand out the colon expressions
endlocs = cumsum(len);
incr = ones(1, endlocs(end));
jumps = start(2:end) - stop(1:end-1);
incr(endlocs(1:end-1)+1) = jumps;
incr(1) = start(1);
x = cumsum(incr);
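To make the cumsum trick concrete, here is the same computation written out as a standalone script for the idx1 and idx2 from the question, compared against the arrayfun result. It is a sketch that assumes every range has positive length, so the filtering step of the function is left out.

```matlab
idx1 = [5 9];
idx2 = [9 11];
len = idx2 - idx1 + 1;                 % lengths of the two ranges: [5 3]
endlocs = cumsum(len);                 % last output position of each range: [5 8]
incr = ones(1, endlocs(end));          % default step of 1 everywhere
% At the start of each later range, jump from the previous stop to the new start
incr(endlocs(1:end-1) + 1) = idx1(2:end) - idx2(1:end-1);
incr(1) = idx1(1);                     % first output value
values = cumsum(incr);
% Reference result built range by range
ref = cell2mat(arrayfun(@(a, b) a:b, idx1, idx2, 'UniformOutput', false));
disp(values)
```

Here incr is [5 1 1 1 1 0 1 1]: the 0 at position 6 is the jump from 9 (end of the first range) to 9 (start of the second), which is why 9 appears twice in the output.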