Find in range of an IDL array? - matlab

I am trying to find all indices in an array A where the value is larger than time0 and less than time1. In MATLAB I can do:
[M,F] = mode( A((A>=time0) & (A<=time1)) ) %//only interested in range
I have something similar in IDL, but it is really slow:
tmpindex0 = where(A ge time0)
tmpindex1 = where(A lt time1)
M = setintersection(tmpindex0,tmpindex1)
where setintersection() is a function that finds the elements common to two arrays. What is a faster alternative implementation?

You can combine your conditions:
M = where(A ge time0 and A lt time1, count)
Then M will contain the indices into A of the elements that satisfy both conditions, while count will contain the number of such indices. Generally, you want to check count before using M, because where() returns -1 when nothing matches.

This works (a slight modification of mgalloy's answer):
M = where( (A ge time0) and (A lt time1), n_match, complement=F, n_complement=ncomp)
The parenthetical separation is not necessary but adds clarity. n_match contains the number of matches to your conditions whereas the complement F will contain the indices for the non-matches and ncomp will contain the number of non-matches.
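For readers who want to see the logic outside IDL, here is a minimal Python sketch of the same idea: test both bounds in one pass, then derive the count and the complement from the result. The sample data and the helper name indices_in_range are made up for illustration; the half-open range (ge time0, lt time1) follows the IDL code above.

```python
def indices_in_range(values, lo, hi):
    """Return the indices i where lo <= values[i] < hi, using a single pass."""
    return [i for i, v in enumerate(values) if lo <= v < hi]


# Made-up sample data; A, time0 and time1 mirror the names in the question.
A = [0.5, 1.0, 2.5, 3.0, 4.2, 5.0]
time0, time1 = 1.0, 4.2

M = indices_in_range(A, time0, time1)   # matching indices
n_match = len(M)                        # check this before using M

matched = set(M)
F = [i for i in range(len(A)) if i not in matched]  # complement (non-matches)
ncomp = len(F)
```

With this data M is [1, 2, 3] and F is [0, 4, 5]; an empty M plays the role of count being 0 in IDL.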

Related

Minizinc: declare explicit set in decision variable

I'm trying to implement the 'Sport Scheduling Problem' (with a Round-Robin approach to break symmetries). The actual problem is of no importance. I simply want to declare the value at x[1,1] to be the set {1,2} and base the sets in the same column upon the first set. This is modelled as in the code below. The output is included in a screenshot below it. The problem is that the first set is not printed as a set but rather some sort of range while the values at x[2,1] and x[3,1] are indeed printed as sets and x[4,1] again as a range. Why is this? I assume that in the declaration of x that set of 1..n is treated as an integer but if it is not, how to declare it as integers?
EDIT: ONLY the first column of the output is of importance.
int: n = 8;
int: nw = n-1;
int: np = n div 2;
array[1..np, 1..nw] of var set of 1..n: x;
% BEGIN FIX FIRST WEEK $
constraint(
x[1,1] = {1, 2}
);
constraint(
forall(t in 2..np) (x[t,1] = {t+1, n+2-t} )
);
solve satisfy;
output[
"\(x[p,w])" ++ if w == nw then "\n" else "\t" endif | p in 1..np, w in 1..nw
]
Backend solver: Gecode
(Here's a summary of my comments above.)
The range syntax is simply a shorthand for contiguous values in a set: 1..8 is shorthand for the set {1,2,3,4,5,6,7,8}, and 5..6 is shorthand for the set {5,6}.
The reason for this shorthand is probably that the short version is often, and arguably, easier to read than the full list, especially for a long list of integers, e.g. 1..1024. It also saves space in the output of solutions.
For two-element sets, e.g. {1,2}, the explicit enumeration might be clearer to read than 1..2, though I tend to prefer the shorthand version in all cases.
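To make the display rule concrete, here is a small Python sketch that rebuilds the first column from the model (n = 8) and prints contiguous sets as ranges. The format_set function is hypothetical: it only mimics the output described in the question and is not MiniZinc's actual printing code.

```python
def format_set(values):
    """Hypothetical formatter: contiguous sets print as lo..hi, others as {a,b,...}."""
    items = sorted(values)
    if not items:
        return "{}"
    # A set of distinct integers is contiguous when max - min + 1 equals its size.
    # (Single-element sets therefore print as v..v in this sketch, an arbitrary choice.)
    if items[-1] - items[0] + 1 == len(items):
        return f"{items[0]}..{items[-1]}"
    return "{" + ",".join(str(v) for v in items) + "}"


n = 8
num_periods = n // 2  # called np in the model

# x[1,1] = {1,2}; x[t,1] = {t+1, n+2-t} for t in 2..np
first_column = [{1, 2}] + [{t + 1, n + 2 - t} for t in range(2, num_periods + 1)]
printed = [format_set(s) for s in first_column]
```

Here printed is ['1..2', '{3,8}', '{4,7}', '5..6'], which matches the pattern in the question: only the first and last sets of the column happen to be contiguous.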

Fastest type to use for comparing hashes in matlab

I have a table in Matlab with some columns representing 128 bit hashes.
I would like to match rows, to one or more rows, based on these hashes.
Currently, the hashes are represented as hexadecimal strings, and compared with strcmp(). Still, it takes many seconds to process the table.
What is the fastest way to compare two hashes in matlab?
I have tried turning them into categorical variables, but that is much slower. Matlab as far as I know does not have a 128 bit numerical type. nominal and ordinal types are deprecated.
Are there any others that could work?
The code below is analogous to what I am doing:
nodetype = { 'type1'; 'type2'; 'type1'; 'type2' };
hash = {'d285e87940fb9383ec5e983041f8d7a6'; 'd285e87940fb9383ec5e983041f8d7a6'; 'ec9add3cf0f67f443d5820708adc0485'; '5dbdfa232b5b61c8b1e8c698a64e1cc9' };
entries = table(categorical(nodetype),hash,'VariableNames',{'type','hash'});
%nodes to match. filter by type or some other way so rows don't match to
%themselves.
A = entries(entries.type=='type1',:);
B = entries(entries.type=='type2',:);
%pick a node/row with a hash to find all counterparts of
row_to_match_in_A = A(1,:);
matching_rows_in_B = B(strcmp(B.hash,row_to_match_in_A.hash),:);
% do stuff with matching rows...
disp(matching_rows_in_B);
The hash strings are faithful representations of what I am using, but they are not necessarily read or stored as strings in the original source. They are just converted for this purpose because it's the fastest way I have found to do the comparison.
Optimization is nice, if you need it. Try it out yourself and measure the performance gain for relevant test cases.
Some suggestions:
Sorted arrays are easier/faster to search
Matlab's default numbers are double, but you can also construct integers. Why not use 2 uint64's instead of the 128bit column? First search for the upper 64bit, then for the lower; or even better: use ismember with the row option and put your hashes in rows:
A = uint64([0 0;
0 1;
1 0;
1 1;
2 0;
2 1]);
srch = uint64([1 1;
0 1]);
[ismatch, loc] = ismember(srch, A, 'rows')
> loc =
4
2
Look into the compare functions you use (e.g. edit ismember) and strip out unnecessary operations (e.g. sort) and safety checks that you know in advance won't pose a problem. Like this solution does. Or, if you intend to call a search function multiple times, sort in advance and skip the check/sort in the search function later on.
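The two-integer idea can be sketched outside MATLAB as well. The Python snippet below splits each 32-character hex digest into an (upper, lower) pair of 64-bit integers and, as a swapped-in technique rather than the ismember approach above, builds a dictionary once so each lookup avoids rescanning the table. The helper name split_hash is an assumption; the hashes are the type2 rows from the question.

```python
def split_hash(hex_digest):
    """Split a 128-bit hash given as 32 hex characters into (upper, lower) 64-bit ints."""
    if len(hex_digest) != 32:
        raise ValueError("expected a 128-bit hash as 32 hex characters")
    return int(hex_digest[:16], 16), int(hex_digest[16:], 16)


# Hashes of the 'type2' rows (table B in the question).
hashes_b = [
    "d285e87940fb9383ec5e983041f8d7a6",
    "5dbdfa232b5b61c8b1e8c698a64e1cc9",
]

# Build the lookup once: integer pair -> list of row numbers in B.
index = {}
for row, digest in enumerate(hashes_b):
    index.setdefault(split_hash(digest), []).append(row)

# Hash of the first 'type1' row (row_to_match_in_A in the question).
target = "d285e87940fb9383ec5e983041f8d7a6"
matching_rows_in_b = index.get(split_hash(target), [])
```

Whether this beats strcmp in your setting depends on how many lookups you do per table, so measure it on your own data.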

Finding the number of occurrences of a pattern within a cell in MATLAB?

I have a cell like this:
x = {'3D'
'B4'
'EF'
'D8'
'E7'
'6C'
'33'
'37'}
Let's assume that the cell is 1000x1. I want to find the number of occurrences of pattern = [30;30;64;63] within this cell, in the order shown. In other words, it should first check x{1,1},x{2,1},x{3,1},x{4,1},
then check x{2,1},x{3,1},x{4,1},x{5,1}, and so on until the end of the cell, and return the number of occurrences.
Here is my code, but it didn't work:
while (size (pattern)< size(x))
count = 0;
for i=1:size(x)-length(pattern)+1
if size(abs(x(i:i+size(pattern)-1)-x))==0
count = count+1;
end
end
end
Your example code has a couple of issues - foremost I don't believe you are doing any comparison operations, which would be necessary to identify the occurrence of the pattern within the search data (x). Also, there is a variable type mismatch between x and pattern - one is a cell array of strings, and the other is a decimal array.
One way to approach this problem would be to restructure x and pattern as strings, and then use strfind to find occurrences of pattern. This method will only work if there is no missing data in either of the variables.
x = {'3D';'B4';'EF';'D8';'E7';'6C';'33';'37';'xE';'FD';'8y'};
pattern = {'EF','D8'};
collated_x=[x{:}];
collated_pattern = [pattern{:}];
found_locations = strfind(collated_x, collated_pattern);
% Remove 'offset' matches that start at even locations
found_locations = found_locations(mod(found_locations,2)==1);
count = length(found_locations)
Use the string find function (strfind).
This is a fast and simple solution:
clear
str_pattern=['B4','EF']; %same as str_pattern=['B4EF'];
x = {'3D'
'B4'
'EF'
'D8'
'EB'
'4E'
'F3'
'B4'
'EF'
'37'} ;
str_x=horzcat(x{:});
inds0=strfind(str_x,str_pattern); %including in-middle
inds1=inds0(bitand(inds0,1)==1); %exclude all in-middle results
disp(str_x);
disp(str_pattern);
disp(inds0);
disp(inds1);
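As a cross-check on the two strfind answers, here is a minimal Python sketch that compares whole cells in a sliding window instead of concatenating them, so mid-cell matches such as 'EB','4E','F3' (which contains "B4EF" once joined) never arise. The function name count_pattern is made up; the data is the example from the answer above.

```python
def count_pattern(cells, pattern):
    """Count positions i where cells[i:i+len(pattern)] equals pattern element by element."""
    width = len(pattern)
    if width == 0:
        return 0
    return sum(
        1
        for i in range(len(cells) - width + 1)
        if cells[i:i + width] == pattern
    )


x = ['3D', 'B4', 'EF', 'D8', 'EB', '4E', 'F3', 'B4', 'EF', '37']
pattern = ['B4', 'EF']

count = count_pattern(x, pattern)
```

For this data count is 2: the pattern starts at the second and the eighth cell, and the straddling "B4EF" inside 'EB','4E','F3' is not counted.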

Scientific notation in MATLAB

Say I have an array that contains the following elements:
1.0e+14 *
1.3325 1.6485 2.0402 1.0485 1.2027 2.0615 1.7432 1.9709 1.4807 0.9012
Now, is there a way to grab 1.0e+14 * (base and exponent) individually?
If I do arr(10), then this will return 9.0120e+13 instead of 0.9012e+14.
Assuming the question is to grab any elements in the array with coefficient less than one. Is there a way to obtain 1.0e+14, so that I could just do arr(i) < 1.0e+14?
I assume you want string output.
Let a denote the input numeric array. You can do it this way, if you don't mind using evalc (a variant of eval, which is considered bad practice):
s = evalc('disp(a)');
s = regexp(s, '[\de+-\.]+', 'match');
This produces a cell array with the desired strings.
Example:
>> a = [1.2e-5 3.4e-6]
a =
1.0e-04 *
0.1200 0.0340
>> s = evalc('disp(a)');
>> s = regexp(s, '[\de+-\.]+', 'match')
s =
'1.0e-04' '0.1200' '0.0340'
Here is the original answer from Alain.
Basic math can tell you that:
floor(log10(N))
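The log base 10 of a number tells you approximately how many digits before the decimal point are in that number; for N >= 1, its floor is one less than that digit count, which is the exponent in scientific notation.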
For instance, 99987123459823754 is 9.998E+016
log10(99987123459823754) is 16.9999441, the floor of which is 16 - which can basically tell you "the exponent in scientific notation is 16, very close to being 17".
Now you have the exponent of the scientific notation. This should allow you to get to whatever your goal is ;-).
And depending on what you want to do with your exponent and the number, you could also define your own method. An example is described in this thread.
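The floor(log10(N)) idea can be sketched in Python against the array from the question. One assumption to flag: the sketch takes the displayed common factor (1.0e+14) to be the power of ten of the largest magnitude in the array, which matches the example but is not taken from MATLAB's documentation. The helper name common_exponent is made up.

```python
import math


def common_exponent(values):
    """Return floor(log10(max |v|)), the power of ten of the largest magnitude."""
    largest = max(abs(v) for v in values)
    if largest == 0:
        raise ValueError("log10 is undefined for an all-zero array")
    return math.floor(math.log10(largest))


arr = [1.3325e14, 1.6485e14, 2.0402e14, 1.0485e14, 1.2027e14,
       2.0615e14, 1.7432e14, 1.9709e14, 1.4807e14, 0.9012e14]

exponent = common_exponent(arr)   # assumed to correspond to the displayed factor
scale = 10.0 ** exponent

# Zero-based indices of elements whose coefficient relative to scale is below one.
below_one = [i for i, v in enumerate(arr) if abs(v) < scale]
```

Here exponent is 14 and below_one is [9], i.e. the last element (arr(10) in MATLAB's one-based indexing).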

Is this the simplified version of this boolean expression? Or is this reviewer wrong

I've tried doing the truth table, but unfortunately one expression has 3 literals and the other has 4, so I got confused.
F = (A+B+C)(A+B+D')+B'C;
and this is the simplified version:
F = A + B + C
http://www.belley.org/etc141/Boolean%20Sinplification%20Exercises/Boolean%20Simplification%20Exercise%20Questions.pdf
I think there's something wrong with this reviewer... or is it accurate?
By the way, is simplification different from minimizing from Sum of Minterms to Sum of Products?
Yes, the two expressions are equivalent.
Draw the truth table for both expressions, assuming that there are four input variables in both. The value of D will not play into the second truth table: values in cells with D=1 will match values in cells with D=0. In other words, you can think of the second expression as
F = A + B + C + (0)(D)
You will see that both tables match: the (A+B+C)(A+B+D') subexpression has zeros at ABCD = {0000, 0001, 0011}; (A+B+C) has zeros only at {0000, 0001}. Adding B'C patches the zero at 0011 in the first subexpression, so the results are equivalent.
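If drawing both tables by hand is error-prone, the comparison can be done by brute force over all 16 input combinations. This is a minimal Python sketch of that check; the function names are made up for illustration.

```python
from itertools import product


def product_of_sums(a, b, c, d):
    """(A+B+C)(A+B+D')"""
    return (a or b or c) and (a or b or not d)


def original(a, b, c, d):
    """F = (A+B+C)(A+B+D') + B'C"""
    return product_of_sums(a, b, c, d) or ((not b) and c)


def simplified(a, b, c, d):
    """F = A + B + C (D does not appear)"""
    return a or b or c


def as_bits(values):
    """Render a tuple of booleans as a string such as '0011' (order ABCD)."""
    return "".join(str(int(v)) for v in values)


rows = list(product([False, True], repeat=4))

# Rows where the two full expressions disagree (expected to be empty).
mismatches = [as_bits(r) for r in rows if bool(original(*r)) != bool(simplified(*r))]

# Zeros of each piece, to compare with the sets quoted in the answer.
zeros_pos = [as_bits(r) for r in rows if not product_of_sums(*r)]
zeros_simplified = [as_bits(r) for r in rows if not simplified(*r)]
```

Here mismatches is empty, zeros_pos is ['0000', '0001', '0011'] and zeros_simplified is ['0000', '0001'], which agrees with the argument above.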