How can I get the row index in Google OR-Tools?

I have a two-dimensional Boolean array, and for each column I want the row index (index + 1) of the element that is not 0. If all elements of a column are 0, the row index should be 0. Each column has at most one nonzero element. How can I do this?
The following code cannot handle the case where a column is all zeros:
from ortools.sat.python import cp_model
import numpy as np

model = cp_model.CpModel()
solver = cp_model.CpSolver()
shifts = {}
worker_shift = {}
for i in range(6):
    worker_shift[(i)] = model.NewIntVar(0, 6, "worker_shift(%i)" % (i))
    for j in range(6):
        shifts[(i, j)] = model.NewBoolVar("shifts(%i,%i)" % (i, j))
# at most one nonzero element per column
for j in range(6):
    model.Add(sum(shifts[(i, j)] for i in range(6)) <= 1)
for j in range(6):
    for i in range(6):
        ...  # loop body missing in the source
r1 = np.empty([6, 6], dtype=int)
r2 = np.empty([1, 6], dtype=int)
status = solver.Solve(model)
for i in range(6):
    for j in range(6):
        ...  # loop body missing in the source

Let's look at one column with Booleans b1, ..., bn, where i is the 1-based row index.
If we create
pi == n - bi * (n - i)
then pi equals i when bi is true and n otherwise, so with model.AddMinEquality(), min([pi]) computes the same thing as your code.
Now let's introduce a new Boolean variable e.
We want
e <=> bool_or(bi)
We code it the usual way:
for all i, bi => e (using model.AddImplication())
bool_or([b1, ..., bn, e.Not()]) (the reverse implication: if all bi are false, then e is false)
Now we can compute the correct min() using a new IntVar pe in [0..n]:
model.Add(pe == e * n)
model.AddMinEquality(target, [p1, ..., pn, pe])
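To see why the extra term pe is needed, the encoding can be checked by brute force in plain Python, without a solver. This is only a sketch: the helper names `row_index_naive` and `row_index_fixed` are made up for illustration, rows are numbered 1..n, and each column is assumed to contain at most one 1.

```python
def row_index_naive(column):
    """min over p_i = n - b_i * (n - i), with rows numbered 1..n."""
    n = len(column)
    return min(n - b * (n - i) for i, b in enumerate(column, start=1))


def row_index_fixed(column):
    """Same minimum, extended with pe = e * n where e <=> or(b_i)."""
    n = len(column)
    e = int(any(column))
    p = [n - b * (n - i) for i, b in enumerate(column, start=1)]
    return min(p + [e * n])


n = 6
# The empty column plus every column with exactly one 1.
columns = [[0] * n] + [[int(r == k) for r in range(n)] for k in range(n)]
for col in columns:
    expected = col.index(1) + 1 if 1 in col else 0
    assert row_index_fixed(col) == expected

print(row_index_naive([0] * n))  # 6: the plain minimum cannot report "empty"
print(row_index_fixed([0] * n))  # 0
```

In the solver model, the two `min` calls correspond to `AddMinEquality` without and with `pe` in the list.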

Thanks to @Laurent Perron and @Stradivari!
After studying the introduction to or-tools on @Stradivari's blog, I used the following code:
for j in range(6):
    model.Add(worker_shift[(j)] == 0).OnlyEnforceIf([shifts[(i, j)].Not() for i in range(6)])
    for i in range(6):
        ...  # loop body missing in the source
It seems to work well!
The complete code is as follows:
from ortools.sat.python import cp_model
import numpy as np

model = cp_model.CpModel()
solver = cp_model.CpSolver()
shifts = {}
worker_shift = {}
for i in range(6):
    worker_shift[(i)] = model.NewIntVar(0, 6, "worker_shift(%i)" % (i))
    for j in range(6):
        shifts[(i, j)] = model.NewBoolVar("shifts(%i,%i)" % (i, j))
# at most one nonzero element per column
for j in range(6):
    model.Add(sum(shifts[(i, j)] for i in range(6)) <= 1)
for j in range(6):
    # an all-zero column forces the row index to 0
    model.Add(worker_shift[(j)] == 0).OnlyEnforceIf([shifts[(i, j)].Not() for i in range(6)])
    for i in range(6):
        ...  # loop body missing in the source
r1 = np.empty([6, 6], dtype=int)
r2 = np.empty([1, 6], dtype=int)
status = solver.Solve(model)
for i in range(6):
    for j in range(6):
        ...  # loop body missing in the source


How to make K different elements in a bool array with or-tools?

I want an array to contain exactly K different values. I tried model.Add(len(set([shifts[(i)] for i in range(10)]))==4), but it does not work. How can I do this?
model = cp_model.CpModel()
solver = cp_model.CpSolver()
shifts = {}
ones = [model.NewBoolVar("") for _ in range(10)]
for i in range(10):
    shifts[(i)] = model.NewIntVar(0, 10, "shifts(%i)" % i)
for i in range(10):
    model.Add(shifts[(i)] > 0).OnlyEnforceIf(ones[(i)])
    model.Add(shifts[(i)] == 0).OnlyEnforceIf(ones[(i)].Not())
model.Add(sum(ones[(i)] for i in range(10)) == 5)
# I want 4 different values, but this does not work!
# model.Add(len(set([shifts[(i)] for i in range(10)])) == 4)
status = solver.Solve(model)
for i in range(10):
    ...  # loop body missing in the source
Encode your integers using Booleans, and add another Boolean for each value to mark it as used:
from ortools.sat.python import cp_model

model = cp_model.CpModel()
solver = cp_model.CpSolver()
shifts = {}
used = [model.NewBoolVar("") for j in range(10)]
for i in range(10):
    for j in range(10):
        shifts[i, j] = model.NewBoolVar(f"shifts({i}, {j})")
        # shifts[i,j] => used[j]
        model.AddImplication(shifts[i, j], used[j])
    model.Add(sum(shifts[i, j] for j in range(10)) == 1)
model.Add(sum(shifts[i, 0].Not() for i in range(10)) == 5)
for j in range(10):
    # all(shifts[_, j] == 0) => used[j].Not()
    model.AddBoolOr([shifts[i, j] for i in range(10)] + [used[j].Not()])
model.Add(sum(used) == 4)
status = solver.Solve(model)
print("status:", status)
res = []
for i in range(10):
    for j in range(10):
        if solver.Value(shifts[i, j]):
            ...  # body missing in the source
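The link between the one-hot Booleans and the `used` flags can be illustrated without a solver. The sketch below takes a fixed assignment (the list `values` is invented for the example), builds the same `shifts[i][j]` and `used[j]` indicators as in the model above, and confirms that `sum(used)` counts the distinct values. The helper names are hypothetical.

```python
def one_hot(values, domain_size):
    """shifts[i][j] == 1 exactly when element i takes value j."""
    return [[int(v == j) for j in range(domain_size)] for v in values]


def used_flags(shifts):
    """used[j] <=> OR over i of shifts[i][j]."""
    width = len(shifts[0])
    return [int(any(row[j] for row in shifts)) for j in range(width)]


values = [0, 0, 0, 0, 0, 3, 3, 7, 2, 7]  # hypothetical assignment
shifts = one_hot(values, 10)
used = used_flags(shifts)

# Each element takes exactly one value, and five elements are nonzero.
assert all(sum(row) == 1 for row in shifts)
assert sum(1 - row[0] for row in shifts) == 5

print(sum(used))  # 4
```

Note that with this encoding the value 0 counts as one of the used values whenever some element is 0.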

Element-wise matrix multiplication for multi-dimensional array

I want to realize component-wise matrix multiplication in MATLAB, which can be done using numpy.einsum in Python as below:
import numpy as np
M = 2
N = 4
I = 2000
J = 300
A = np.random.randn(M, M, I)
B = np.random.randn(M, M, N, J, I)
C = np.random.randn(M, J, I)
# using einsum
D = np.einsum('mki, klnji, lji -> mnji', A, B, C)
# naive for-loop
E = np.zeros((M, N, J, I))
for i in range(I):
    for j in range(J):
        for n in range(N):
            E[:, n, j, i] = A[:, :, i] @ B[:, :, n, j, i] @ C[:, j, i]
print(np.sum(np.abs(D - E)))  # expected small enough
So far I use for-loops over i, j, and n, but I would like to avoid them, or at least the loop over n.
Option 1: Calling numpy from MATLAB
Assuming your system is set up according to the documentation, and you have the numpy package installed, you could do (in MATLAB):
np = py.importlib.import_module('numpy');
M = 2;
N = 4;
I = 2000;
J = 300;
A = matpy.mat2nparray( randn(M, M, I) );
B = matpy.mat2nparray( randn(M, M, N, J, I) );
C = matpy.mat2nparray( randn(M, J, I) );
D = matpy.nparray2mat( np.einsum('mki, klnji, lji -> mnji', A, B, C) );
Where matpy can be found here.
Option 2: Native MATLAB
Here the most important part is to get the permutations right, so we need to keep track of our dimensions. We'll be using the following order:
I(1) J(2) K(3) L(4) M(5) N(6)
Now, I'll explain how I got the correct permute order (let's take the example of A): einsum expects the dimension order to be mki, which according to our numbering is 5 3 1. This tells us that the 1st dimension of A needs to be the 5th, the 2nd needs to be 3rd and the 3rd needs to be 1st (in short 1->5, 2->3, 3->1). This also means that the "sourceless dimensions" (meaning those that have no original dimensions becoming them; in this case 2 4 6) should be singleton. Using ipermute this is really simple to write:
pA = ipermute(A, [5,3,1,2,4,6]);
In the above example, 1->5 means we write 5 first, and the same goes for the other two dimensions (yielding [5,3,1]). Then we just add the singletons (2,4,6) at the end to get [5,3,1,2,4,6]. Finally:
A = randn(M, M, I);
B = randn(M, M, N, J, I);
C = randn(M, J, I);
% Reference dim order: I(1) J(2) K(3) L(4) M(5) N(6)
pA = ipermute(A, [5,3,1,2,4,6]); % 1->5, 2->3, 3->1; 2nd, 4th & 6th are singletons
pB = ipermute(B, [3,4,6,2,1,5]); % 1->3, 2->4, 3->6, 4->2, 5->1; 5th is singleton
pC = ipermute(C, [4,2,1,3,5,6]); % 1->4, 2->2, 3->1; 3rd, 5th & 6th are singletons
pD = sum( ...
  permute(pA .* pB .* pC, [5,6,2,1,3,4]), ... 1->5, 2->6, 3->2, 4->1; 3rd & 4th are singletons
  [5,6]);
(see note regarding sum at the bottom of the post.)
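The bookkeeping behind the permutation vectors can be written down mechanically. The following Python sketch (the function name `ipermute_order` is made up for illustration) maps each einsum subscript to its position in the reference order I(1) J(2) K(3) L(4) M(5) N(6) and appends the unused positions as singletons. It reproduces the vectors used above.

```python
def ipermute_order(subscripts, reference="ijklmn"):
    """1-based dimension order for an operand with the given einsum subscripts."""
    targets = [reference.index(s) + 1 for s in subscripts]
    singletons = [d for d in range(1, len(reference) + 1) if d not in targets]
    return targets + singletons


print(ipermute_order("mki"))    # [5, 3, 1, 2, 4, 6]
print(ipermute_order("klnji"))  # [3, 4, 6, 2, 1, 5]
print(ipermute_order("lji"))    # [4, 2, 1, 3, 5, 6]
print(ipermute_order("mnji"))   # [5, 6, 2, 1, 3, 4]
```

The last vector belongs to the output subscripts `mnji` and is the one passed to permute rather than ipermute.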
Another way to do it in MATLAB, as mentioned by @AndrasDeak, is the following:
rD = squeeze(sum(reshape(A, [M, M, 1, 1, 1, I]) .* ...
reshape(B, [1, M, M, N, J, I]) .* ...
... % same as: reshape(B, [1, size(B)]) .* ...
... % same as: shiftdim(B,-1) .* ...
reshape(C, [1, 1, M, 1, J, I]), [2, 3]));
See also: squeeze, reshape, permute, ipermute, shiftdim.
Here's a full example that tests whether these methods are equivalent:
function q55913093
M = 2;
N = 4;
I = 2000;
J = 300;
mA = randn(M, M, I);
mB = randn(M, M, N, J, I);
mC = randn(M, J, I);
%% Option 1 - using numpy:
np = py.importlib.import_module('numpy');
A = matpy.mat2nparray( mA );
B = matpy.mat2nparray( mB );
C = matpy.mat2nparray( mC );
D = matpy.nparray2mat( np.einsum('mki, klnji, lji -> mnji', A, B, C) );
%% Option 2 - native MATLAB:
%%% Reference dim order: I(1) J(2) K(3) L(4) M(5) N(6)
pA = ipermute(mA, [5,3,1,2,4,6]); % 1->5, 2->3, 3->1; 2nd, 4th & 6th are singletons
pB = ipermute(mB, [3,4,6,2,1,5]); % 1->3, 2->4, 3->6, 4->2, 5->1; 5th is singleton
pC = ipermute(mC, [4,2,1,3,5,6]); % 1->4, 2->2, 3->1; 3rd, 5th & 6th are singletons
pD = sum( permute( ...
  pA .* pB .* pC, [5,6,2,1,3,4]), ... % 1->5, 2->6, 3->2, 4->1; 3rd & 4th are singletons
  [5,6]);
rD = squeeze(sum(reshape(mA, [M, M, 1, 1, 1, I]) .* ...
reshape(mB, [1, M, M, N, J, I]) .* ...
reshape(mC, [1, 1, M, 1, J, I]), [2, 3]));
%% Comparisons:
sum(abs(pD-D), 'all')
Running the above we get that the results are indeed equivalent:
>> q55913093
ans =
ans =
Note that these two methods of calling sum were introduced in recent releases, so you might need to replace them if your MATLAB is relatively old:
S = sum(A,'all') % can be replaced by ` sum(A(:)) `
S = sum(A,vecdim) % can be replaced by ` sum( sum(A, dim1), dim2) `
As requested in the comments, here's a benchmark comparing the methods:
function t = q55913093_benchmark(M,N,I,J)
if nargin == 0
M = 2;
N = 4;
I = 2000;
J = 300;
% Define the arrays in MATLAB
mA = randn(M, M, I);
mB = randn(M, M, N, J, I);
mC = randn(M, J, I);
% Define the arrays in numpy
np = py.importlib.import_module('numpy');
pA = matpy.mat2nparray( mA );
pB = matpy.mat2nparray( mB );
pC = matpy.mat2nparray( mC );
% Test for equivalence
D = cat(5, M1(), M2(), M3());
assert( sum(abs(D(:,:,:,:,1) - D(:,:,:,:,2)), 'all') < 1E-8 );
assert( isequal (D(:,:,:,:,2), D(:,:,:,:,3)));
% Time
t = [ timeit(@M1,1), timeit(@M2,1), timeit(@M3,1)];
function out = M1()
out = matpy.nparray2mat( np.einsum('mki, klnji, lji -> mnji', pA, pB, pC) );
function out = M2()
out = permute( ...
sum( ...
ipermute(mA, [5,3,1,2,4,6]) .* ...
ipermute(mB, [3,4,6,2,1,5]) .* ...
ipermute(mC, [4,2,1,3,5,6]), [3,4]...
), [5,6,2,1,3,4]...
function out = M3()
out = squeeze(sum(reshape(mA, [M, M, 1, 1, 1, I]) .* ...
reshape(mB, [1, M, M, N, J, I]) .* ...
reshape(mC, [1, 1, M, 1, J, I]), [2, 3]));
On my system this results in:
>> q55913093_benchmark
ans =
1.3964 0.1864 0.2428
Which means that the 2nd method is preferable (at least for the default input sizes).

Gauss elimination to solve A*x = b linear system (MATLAB)

I'm trying to write code that solves linear systems of the form A*x = b.
I wrote the code below using Gaussian elimination, and it works every time if A doesn't contain any zeros. If A contains zeros, it sometimes works and sometimes doesn't. Basically, I'm looking for an alternative to "A\b" in MATLAB.
Is there a better/simpler way of doing this?
A = randn(5,5);
b = randn(5,1);
nn = size(A);
n = nn(1,1);
U = A;
u = b;
for c = 1:1:n
k = U(:,c);
for r = n:-1:c
if k(r,1) == 0
U(r,:) = U(r,:)/k(r,1);
u(r,1) = u(r,1)/k(r,1);
for r = n:-1:(c+1)
if k(r,1) == 0
U(r,:) = U(r,:) - U(r-1,:);
u(r,1) = u(r,1) - u(r-1,1);
x = zeros(size(b));
for r = n:-1:1
if r == n
x(r,1) = u(r,1);
x(r,1) = u(r,1);
x(r,1) = x(r,1) - U(r,r+1:n)*x(r+1:n,1);
error = A*x - b;
for i = 1:1:n
if abs(error(i)) > 0.001
Working example with 0's:
A = [1, 3, 1, 3;
3, 4, 4, 1;
3, 0, 3, 9;
0, 4, 0, 1];
b = [3;
Example that fails (A*x-b isn't [0])
A = [1, 3, 1, 3;
3, 4, 4, 1;
0, 0, 3, 9;
0, 4, 0, 1];
b = [3;
Explanation of my algorithm:
Let's say I have the following A matrix:
|4, 1, 9|
|3, 4, 5|
|1, 3, 5|
For the first column, I divide each line by the first number in the row, so every row starts with 1
|1, 1/4, 9/4|
|1, 4/3, 5/3|
|1, 3, 5|
Then I subtract from the last row the row above it, then do the same for the row above, and so on.
|1, 1/4, 9/4|
|0, 4/3-1/4, 5/3-9/4|
|0, 3-4/3, 5-5/3|
|1, 0.25, 2.250|
|0, 1.083, -0.5833|
|0, 1.667, 3.333|
Then I repeat the same for the rest of the columns.
|1, 0.25, 2.250|
|0, 1, -0.5385|
|0, 1, 1.999|
|1, 0.25, 2.250|
|0, 1, -0.5385|
|0, 0, -8.7700|
|1, 0.25, 2.250|
|0, 1, -0.5385|
|0, 0, 1|
The same operations I do in A I do in b so the system stays equivalent.
I added the code below right after "for c = 1:1:n".
Before doing anything else, it sorts the rows of A (and b) so that the entries of column "c" are in decreasing order of absolute value (zeros end up in the bottom rows of A). It now seems to work for any invertible square matrix, although I'm not sure it always will.
r = c;
a = r + 1;
while r <= n
if r == n
r = r + 1;
elseif a <= n
while a <= n
if abs(U(r,c)) < abs(U(a,c))
UU = U(r,:);
U(r,:) = U(a,:);
U(a,:) = UU;
uu = u(r,1);
u(r,1) = u(a,1);
u(a,1) = uu;
a = a+1;
r = r+1;
a = r+1;
Gaussian elimination with pivoting is as follows.
function [L,U,P] = my_lu_piv(A)
n = size(A,1);
I = eye(n);
O = zeros(n);
L = I;
U = O;
P = I;
function change_rows(k,p)
x = P(k,:); P(k,:) = P(p,:); P(p,:) = x;
x = A(k,:); A(k,:) = A(p,:); A(p,:) = x;
x = v(k); v(k) = v(p); v(p) = x;
function change_L(k,p)
x = L(k,1:k-1); L(k,1:k-1) = L(p,1:k-1);
L(p,1:k-1) = x;
for k = 1:n
if k == 1, v(k:n) = A(k:n,k);
z = L(1:k-1,1:k -1)\ A(1:k-1,k);
U(1:k-1,k) = z;
v(k:n) = A(k:n,k)-L(k:n,1:k-1)*z;
if k<n
x = v(k:n); p = (k-1)+find(abs(x) == max(abs(x))); % find index p
L(k+1:n,k) = v(k+1:n)/v(k);
if k > 1, change_L(k,p); end
U(k,k) = v(k);
In order to solve the system:
% Ax = b       (1) original system
% LU = PA      (2) factorization of PA or A(p,:) into the product LU
% PAx = Pb     (3) multiply both sides of (1) by P
% LUx = Pb     (4) substitute (2) into (3)
% let y = Ux   (5) define y as Ux
% let c = Pb   (6) define c as Pb
% Ly = c       (7) substitute (5) and (6) into (4)
% U*x = y      (8) a rewrite of (5)
To do this:
% [L, U, P] = lu(A);   % factorize
% y = L \ (P*b);       % forward solve of (7), a lower triangular system
% x = U \ y;           % backsolve of (8), an upper triangular system
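Steps (6) to (8) can be sketched in plain Python as a forward substitution followed by a back substitution. The function names and the small 2-by-2 factorization below are assumptions made for the example (the factors were worked out by hand for A = [2 1; 4 5] with one row swap); they are not output from MATLAB's lu.

```python
def forward_substitution(L, c):
    """Solve L*y = c for lower triangular L, top row first."""
    n = len(c)
    y = [0.0] * n
    for i in range(n):
        y[i] = (c[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    return y


def back_substitution(U, y):
    """Solve U*x = y for upper triangular U, bottom row first."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x


def lu_solve(L, U, P, b):
    """Solve A*x = b given P*A = L*U."""
    n = len(b)
    c = [sum(P[i][j] * b[j] for j in range(n)) for i in range(n)]  # c = P*b
    return back_substitution(U, forward_substitution(L, c))


# Hand-made factors of A = [[2, 1], [4, 5]] (illustrative only).
P = [[0, 1], [1, 0]]
L = [[1.0, 0.0], [0.5, 1.0]]
U = [[4.0, 5.0], [0.0, -1.5]]
x = lu_solve(L, U, P, [3.0, 9.0])
print(x)  # [1.0, 1.0]
```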
The Gaussian algorithm assumes that the matrix is converted to an upper triangular matrix. This does not happen in your example. The result of your algorithm is
A =
1 3 1 3
3 4 4 1
0 0 3 9
0 4 0 1
U =
1.00000 3.00000 1.00000 3.00000
-0.00000 1.00000 -0.20000 1.60000
0.00000 0.00000 1.00000 3.00000
0.00000 4.00000 -0.00000 1.00000
As you can see, it's not upper triangular. You are skipping rows if the pivot element is zero, which does not work. To fix this, you need to swap columns in the matrix and rows in the vector if the pivot element is zero. At the end, you have to swap the rows back in your result b (or u).
Gaussian algorithm is:
1 Set n = 1
2 Take pivot element (n, n)
3 If (n, n) == 0, swap column n with column m, so that m > n and (n, m) != 0 (swap row m and n in vector b)
4 Divide n-th row by pivot element (divide n-th row in vector b)
5 For each m > n
6 If (m, n) != 0
7 Divide row m by element (m, n) and subtract row n element-wise (same for vector b)
8 n = n + 1
9 If n <= number of rows, go to step 2
In terms of numerical stability it would be best to use the maximum of each row as pivot element. Also you can use the maximum of the matrix as pivot element by swapping columns and rows. But remember to swap in b and to swap back in your solution.
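As a rough illustration of elimination with pivoting, here is a minimal Python sketch. It deliberately uses a different pivoting rule from the steps above: rows are swapped so that the largest absolute value in the current column becomes the pivot (partial pivoting), which means the solution never has to be reordered afterwards. The function name `gauss_solve` is made up, and the right-hand side `b` is hypothetical, chosen so that the exact solution of the failing example matrix is [1, 2, 3, 4].

```python
def gauss_solve(A, b, tol=1e-12):
    """Solve A*x = b by Gaussian elimination with partial (row) pivoting."""
    n = len(A)
    # Augmented matrix [A | b] as floats, so A and b are not modified.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest absolute entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


A = [[1, 3, 1, 3],
     [3, 4, 4, 1],
     [0, 0, 3, 9],
     [0, 4, 0, 1]]
b = [22, 27, 45, 12]  # hypothetical: equals A times [1, 2, 3, 4]
x = gauss_solve(A, b)
residual = [sum(A[i][j] * x[j] for j in range(4)) - b[i] for i in range(4)]
assert max(abs(v) for v in residual) < 1e-9
```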
Try this:
Ab = [A,b] % Extended matrix of the system of equations
rref(Ab) % Result of applying the Gauss-Jordan elimination to the extended matrix
See rref documentation for more details and examples.

manipulating indices of matrix in parallel in matlab

Suppose I have an m-by-n-by-p matrix "A", where each element stores a real number. I want to create another matrix "B" with B(i, j, k) = f(A(i, j, k), i, j, k, otherVars). Is there a faster way to do this in MATLAB than looping through all the elements? (Notice that the function requires the indices (i, j, k).)
An example is as follows(The actual function f could be more complex):
A = rand(3, 4, 5);
B = zeros(size(A));
C = 10;
for x = 1:size(A, 1)
    for y = 1:size(A, 2)
        for z = 1:size(A, 3)
            B(x, y, z) = A(x,y,z) + x - y * z + C;
        end
    end
end
I've tried creating a cell "B", and
B{i, j, k} = [A(i, j, k), i, j, k];
I then applied cellfun() to do the computation in parallel, but it's even slower than a for-loop over the elements of A.
In my real implementation, function f is much more complex than B = A + X - Y.*Z + C; it takes four scalar values, and I don't want to modify it since it comes from an external package. Any suggestions?
Vectorize it by building an ndgrid of the appropriate values:
[X,Y,Z] = ndgrid(1:size(A,1), 1:size(A,2), 1:size(A,3));
B = A + X - Y.*Z + C;
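To make explicit what the index grids contain, here is a small pure-Python sketch. The helper `ndgrid` below is a hypothetical stand-in that mimics MATLAB's function for three index ranges using nested lists. It only shows that X, Y and Z carry the 1-based indices at every position, so the formula can be written purely in terms of same-shaped arrays.

```python
import random


def ndgrid(m, n, p):
    """Index grids with X[i][j][k] = i+1, Y[i][j][k] = j+1, Z[i][j][k] = k+1."""
    X = [[[i + 1] * p for _ in range(n)] for i in range(m)]
    Y = [[[j + 1] * p for j in range(n)] for _ in range(m)]
    Z = [[list(range(1, p + 1)) for _ in range(n)] for _ in range(m)]
    return X, Y, Z


m, n, p, C = 3, 4, 5, 10
random.seed(0)
A = [[[random.random() for _ in range(p)] for _ in range(n)] for _ in range(m)]
X, Y, Z = ndgrid(m, n, p)

# Formula written only with same-shaped arrays (A + X - Y.*Z + C in MATLAB).
B = [[[A[i][j][k] + X[i][j][k] - Y[i][j][k] * Z[i][j][k] + C
       for k in range(p)] for j in range(n)] for i in range(m)]

# Reference: explicit 1-based indices, as in the question's triple loop.
B_ref = [[[A[i][j][k] + (i + 1) - (j + 1) * (k + 1) + C
           for k in range(p)] for j in range(n)] for i in range(m)]
assert B == B_ref
```

In MATLAB the array expression replaces the comprehension entirely, which is where the speedup comes from.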

Vectorizing a nested for loop which fills a dynamic programming table

I was wondering if there was a way to vectorize the nested for loop in this function which is filling up the entries of the 2D dynamic programming table DP. I believe that at the very least the inner loop could be vectorized as each row only depends on the previous row. I'm not sure how to do it though. Note this function is called on large 2D arrays (images) so the nested for loop really doesn't cut it.
function [cols] = compute_seam(energy)
[r, c, ~] = size(energy);
cols = zeros(r);
DP = padarray(energy, [0, 1], Inf);
BP = zeros(r, c);
for i = 2 : r
for j = 1 : c
[x, l] = min([DP(i - 1, j), DP(i - 1, j + 1), DP(i - 1, j + 2)]);
DP(i, j + 1) = DP(i, j + 1) + x;
BP(i, j) = j + (l - 2);
[~, j] = min(DP(r, :));
j = j - 1;
for i = r : -1 : 1
cols(i) = j;
j = BP(i, j);
Vectorization of the innermost nested loop
You were right in postulating that at least the inner loop is vectorizable. Here's the modified code for the nested loops part -
rows_DP = size(DP,1); %// rows in DP
%// Get first row linear indices for a group of neighboring three columns,
%// which would be incremented as we move between rows with the row iterator
start_ind1 = bsxfun(@plus,[1:rows_DP:2*rows_DP+1]',[0:c-1]*rows_DP); %//'
for i = 2 : r
ind1 = start_ind1 + i-2; %// setup linear indices for the row of this iteration
[x,l] = min(DP(ind1),[],1); %// get x and l values in one go
DP(i,2:c+1) = DP(i,2:c+1) + x; %// set DP values of a row in one go
BP(i,1:c) = [1:c] + l-2; %// set BP values of a row in one go
Benchmarking Code -
N = 3000; %// Datasize
energy = rand(N);
[r, c, ~] = size(energy);
disp('------------------------------------- With Original Code')
DP = padarray(energy, [0, 1], Inf);
BP = zeros(r, c);
for i = 2 : r
for j = 1 : c
[x, l] = min([DP(i - 1, j), DP(i - 1, j + 1), DP(i - 1, j + 2)]);
DP(i, j + 1) = DP(i, j + 1) + x;
BP(i, j) = j + (l - 2);
toc,clear DP BP x l
disp('------------------------------------- With Vectorized Code')
DP = padarray(energy, [0, 1], Inf);
BP = zeros(r, c);
rows_DP = size(DP,1); %// rows in DP
start_ind1 = bsxfun(@plus,[1:rows_DP:2*rows_DP+1]',[0:c-1]*rows_DP); %//'
for i = 2 : r
ind1 = start_ind1 + i-2; %// setup linear indices for the row of this iteration
[x,l] = min(DP(ind1),[],1); %// get x and l values in one go
DP(i,2:c+1) = DP(i,2:c+1) + x; %// set DP values of a row in one go
BP(i,1:c) = [1:c] + l-2; %// set BP values of a row in one go
Results -
------------------------------------- With Original Code
Elapsed time is 44.200746 seconds.
------------------------------------- With Vectorized Code
Elapsed time is 1.694288 seconds.
Thus, you can expect roughly a 26x speedup from that small vectorization tweak.
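The row-at-a-time idea can also be sketched in plain Python: each row of the table is computed from three shifted views of the padded previous row, instead of one element at a time. This is an illustrative analogue rather than a translation of the MATLAB code above. The function name mirrors the question's `compute_seam`, columns are 0-based, and the small energy matrix is invented for the example.

```python
INF = float("inf")


def compute_seam(energy):
    """Return the column of the minimum-energy seam in each row (0-based)."""
    r, c = len(energy), len(energy[0])
    dp = [list(energy[0])]  # dp[i][j]: cheapest seam ending at (i, j)
    bp = []                 # bp[i-1][j]: column chosen in row i-1 for (i, j)
    for i in range(1, r):
        prev = [INF] + dp[-1] + [INF]                  # pad with Inf on both sides
        windows = list(zip(prev, prev[1:], prev[2:]))  # (left, up, right) per column
        best = [min(w) for w in windows]
        # w.index(m) picks the first minimum, like MATLAB's min
        offsets = [w.index(m) - 1 for w, m in zip(windows, best)]
        dp.append([e + m for e, m in zip(energy[i], best)])
        bp.append([j + o for j, o in enumerate(offsets)])
    # Backtrack from the cheapest entry in the last row.
    j = min(range(c), key=dp[-1].__getitem__)
    cols = [0] * r
    for i in range(r - 1, -1, -1):
        cols[i] = j
        if i > 0:
            j = bp[i - 1][j]
    return cols


energy = [[1, 4, 3],
          [5, 2, 6],
          [7, 8, 1]]  # hypothetical input
print(compute_seam(energy))  # [0, 1, 2]
```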
More tweaks
A few more optimization tweaks could be tried in your code for performance -
cols = zeros(r) could be replaced with cols(r,r) = 0.
DP = padarray(energy, [0, 1], Inf) could be replaced with
DP(:,2:end-1) = energy;
BP = zeros(r, c) could be replaced with BP(r, c) = 0.
The pre-allocation tweaks used here are inspired by this blog post.