I have large but sparse arrays and I want to rearrange them by swapping rows and columns. What is a good way to do this in scipy.sparse?
Some issues
I don't think that permutation matrices are well suited for this task, as they seem to change the sparsity structure unpredictably. Also, such a manipulation always 'multiplies' all columns or rows, even if only a few swaps are necessary.
What is the best sparse matrix representation in scipy.sparse for this task?
Suggestions for implementation are very welcome.
I have tagged this with Matlab as well, since this question might find an answer that is not necessarily scipy specific.
CSC format keeps a list of the row indices of all non-zero entries, CSR format keeps a list of the column indices of all non-zero entries. I think you can take advantage of that to swap things around as follows, and I think there shouldn't be any side-effects to it:
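import numpy as np
import scipy.sparse

def swap_rows(mat, a, b):
    # CSC keeps the row index of every non-zero entry in .indices;
    # copy=True so that an input already in CSC format is not modified
    mat_csc = scipy.sparse.csc_matrix(mat, copy=True)
    a_idx = np.where(mat_csc.indices == a)
    b_idx = np.where(mat_csc.indices == b)
    mat_csc.indices[a_idx] = b
    mat_csc.indices[b_idx] = a
    # row indices within a column may no longer be in ascending order
    mat_csc.has_sorted_indices = False
    return mat_csc.asformat(mat.format)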
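def swap_cols(mat, a, b):
    # CSR keeps the column index of every non-zero entry in .indices;
    # copy=True so that an input already in CSR format is not modified
    mat_csr = scipy.sparse.csr_matrix(mat, copy=True)
    a_idx = np.where(mat_csr.indices == a)
    b_idx = np.where(mat_csr.indices == b)
    mat_csr.indices[a_idx] = b
    mat_csr.indices[b_idx] = a
    # column indices within a row may no longer be in ascending order
    mat_csr.has_sorted_indices = False
    return mat_csr.asformat(mat.format)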
You could now do something like this:
>>> mat = np.zeros((5,5))
>>> mat[[1, 2, 3, 3], [0, 2, 2, 4]] = 1
>>> mat = scipy.sparse.lil_matrix(mat)
>>> mat.todense()
matrix([[ 0., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0.],
[ 0., 0., 1., 0., 1.],
[ 0., 0., 0., 0., 0.]])
>>> swap_rows(mat, 1, 3)
<5x5 sparse matrix of type '<type 'numpy.float64'>'
with 4 stored elements in LInked List format>
>>> swap_rows(mat, 1, 3).todense()
matrix([[ 0., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 1.],
[ 0., 0., 1., 0., 0.],
[ 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])
>>> swap_cols(mat, 0, 4)
<5x5 sparse matrix of type '<type 'numpy.float64'>'
with 4 stored elements in LInked List format>
>>> swap_cols(mat, 0, 4).todense()
matrix([[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1.],
[ 0., 0., 1., 0., 0.],
[ 1., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 0.]])
I have used a LIL matrix to show how you could preserve the type of your output. In your application you probably want to already be in CSC or CSR format, and select whether to swap rows or columns first based on it, to minimize conversions.
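To see why relabeling `indices` is enough, it may help to look at what CSC and CSR actually store. The following is only a small sketch using the same 5x5 example as above; the variable names are mine and not part of the answer.

```python
import numpy as np
import scipy.sparse

# Same example matrix as in the session above
dense = np.zeros((5, 5))
dense[[1, 2, 3, 3], [0, 2, 2, 4]] = 1

csc = scipy.sparse.csc_matrix(dense)
csr = scipy.sparse.csr_matrix(dense)

# CSC: one row index per stored entry, grouped column by column
csc_row_indices = csc.indices.tolist()   # [1, 2, 3, 3]
csc_col_pointers = csc.indptr.tolist()   # [0, 1, 1, 3, 3, 4]

# CSR: one column index per stored entry, grouped row by row
csr_col_indices = csr.indices.tolist()   # [0, 2, 2, 4]
csr_row_pointers = csr.indptr.tolist()   # [0, 0, 1, 2, 4, 4]

print(csc_row_indices, csc_col_pointers)
print(csr_col_indices, csr_row_pointers)
```

Swapping rows a and b in CSC only relabels entries of `indices`; `indptr` and `data` are left alone, so the number of stored elements cannot change.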
I've found using matrix operations to be the most efficient. Here's a function which will permute the rows and/or columns to a specified order. It can be modified to swap two specific rows/columns if you would like.
from scipy import sparse

def permute_sparse_matrix(M, row_order=None, col_order=None):
    """
    Reorders the rows and/or columns in a scipy sparse matrix to the specified order.
    Row k of M ends up at position row_order[k] (likewise for columns).
    """
    if row_order is None and col_order is None:
        return M

    new_M = M
    if row_order is not None:
        # Turn the identity into a row permutation matrix
        I = sparse.eye(M.shape[0]).tocoo()
        I.row = I.row[row_order]
        new_M = I.dot(new_M)
    if col_order is not None:
        # Turn the identity into a column permutation matrix
        I = sparse.eye(M.shape[1]).tocoo()
        I.col = I.col[col_order]
        new_M = new_M.dot(I)
    return new_M
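For the case of swapping just two rows or columns, the same idea can be sketched with a permutation matrix that differs from the identity in only two rows. This is an illustration under my own assumptions: `swap_permutation` is a hypothetical helper, not a scipy function, and the example matrix is the 5x5 one from the other answer.

```python
import numpy as np
from scipy import sparse

def swap_permutation(n, a, b):
    """Hypothetical helper: n x n sparse permutation matrix swapping a and b."""
    order = np.arange(n)
    order[[a, b]] = order[[b, a]]
    # Entry (i, order[i]) is 1, so (P @ M)[i, :] == M[order[i], :]
    return sparse.csr_matrix((np.ones(n), (np.arange(n), order)), shape=(n, n))

dense = np.zeros((5, 5))
dense[[1, 2, 3, 3], [0, 2, 2, 4]] = 1
M = sparse.csr_matrix(dense)

rows_swapped = swap_permutation(5, 1, 3) @ M   # left-multiply: permutes rows
cols_swapped = M @ swap_permutation(5, 0, 4)   # right-multiply: permutes columns

print(rows_swapped.toarray())
print(cols_swapped.toarray())
print(M.nnz, rows_swapped.nnz, cols_swapped.nnz)
```

In this example the number of stored elements is the same before and after the multiplication; only their positions move.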
In Matlab you can just index the columns and rows the way you like:
Matrix = speye(10);
mycolumnorder = [1 2 3 4 5 6 10 9 8 7];
myroworder = [4 3 2 1 5 6 7 8 9 10];
Myorderedmatrix = Matrix(myroworder,mycolumnorder);
I think this preserves sparsity... Don't know about scipy though...
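For comparison, a rough scipy counterpart of the Matlab snippet might look like the sketch below. It assumes the matrix is in CSR format, which supports indexing with integer arrays, and it uses 0-based indices where Matlab uses 1-based ones. The variable names are mine.

```python
import numpy as np
from scipy import sparse

M = sparse.identity(10, format="csr")

# Same orderings as the Matlab example, shifted to 0-based indexing
col_order = np.array([0, 1, 2, 3, 4, 5, 9, 8, 7, 6])
row_order = np.array([3, 2, 1, 0, 4, 5, 6, 7, 8, 9])

# Pick rows first, then columns; the result is still a sparse matrix
reordered = M[row_order, :][:, col_order]

print(type(reordered).__name__, reordered.nnz)
```

Here row k of the result is row `row_order[k]` of the original, which matches how Matlab interprets `Matrix(myroworder,mycolumnorder)`.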
I am a bit stumped on this one. I have used spline to smooth my data successfully, however it is just not working this time. Here is the snippet of the code that is not working. Any pointers would be highly appreciated.
In [46]: x
Out[46]:
array([ 600., 650., 700., 750., 800., 850., 900., 950.,
1000., 1050., 1100., 1150., 1200., 1250.])
In [47]: y
Out[47]:
array([ 2.68530481, 3.715443 , 4.11270841, 2.91720571, 1.49194971,
0.24770035, -0.64713611, -1.40938122, -2.24634466, -3.04577225,
-3.73914759, -4.35097303, -4.94702689, -5.56523414])
In [48]: x2=numpy.linspace(x.min(),x.max(),20)
In [49]: spline(x,y,x2)
Out[49]:
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0.])
Try using interp1d instead of spline, which is deprecated (*):
import numpy as np
from matplotlib import pyplot as plt
from scipy.interpolate import interp1d
plt.ion()
x = np.array([600., 650., 700., 750., 800., 850., 900., 950.,
1000., 1050., 1100., 1150., 1200., 1250.])
y = np.array([2.68530481, 3.715443, 4.11270841, 2.91720571, 1.49194971,
0.24770035, -0.64713611, -1.40938122, -2.24634466,
-3.04577225, -3.73914759, -4.35097303, -4.94702689,
-5.56523414])
plt.plot(x,y)
x2 = np.linspace(x.min(), x.max(), 20)
f = interp1d(x, y, kind='cubic')
y2 = f(x2)
plt.plot(x2,y2)
Output
In [20]: x2
Out[20]:
array([ 600. , 634.21052632, 668.42105263, 702.63157895,
736.84210526, 771.05263158, 805.26315789, 839.47368421,
873.68421053, 907.89473684, 942.10526316, 976.31578947,
1010.52631579, 1044.73684211, 1078.94736842, 1113.15789474,
1147.36842105, 1181.57894737, 1215.78947368, 1250. ])
In [21]: y2
Out[21]:
array([ 2.68530481, 3.35699957, 4.03277746, 4.08420565, 3.31233485,
2.29896296, 1.34965136, 0.48288214, -0.21322503, -0.76839036,
-1.28566315, -1.84433723, -2.42194321, -2.96633554, -3.45993064,
-3.90553288, -4.31968149, -4.7262301 , -5.13883472, -5.56523414])
(*) Under additional tools, scipy lists spline as:
Functions existing for backward compatibility (should not be used in new code):
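As a side note that is not part of the answer above, scipy.interpolate also offers CubicSpline, which can be used in much the same way. A minimal sketch with the data from the question:

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.array([600., 650., 700., 750., 800., 850., 900., 950.,
              1000., 1050., 1100., 1150., 1200., 1250.])
y = np.array([2.68530481, 3.715443, 4.11270841, 2.91720571, 1.49194971,
              0.24770035, -0.64713611, -1.40938122, -2.24634466,
              -3.04577225, -3.73914759, -4.35097303, -4.94702689,
              -5.56523414])

# Build the piecewise cubic interpolant once, then evaluate it anywhere in range
cs = CubicSpline(x, y)
x2 = np.linspace(x.min(), x.max(), 20)
y2 = cs(x2)

print(y2)
```

The interpolant passes through the original data points, so the first and last entries of y2 equal the first and last entries of y.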
I am trying to use a Dirac delta function within a system of equations so that h(t) increases at t=1.5, t=3, t=4.5 etc.
Here is my code below:
A:=30: a:=1: dm:=3: c3:=5: d0:=1/a: t0:=1/dm: h0:=B0: y:=A*a: cc:=t0*c3: #same as cc=c3/Bm
N:=8: T:=0.5:
sys_ode:= diff(h(t),t)=y*sum(Dirac(t-dm*n*T),n=0..N) - exp(1-d(t))*h(t), diff(d(t),t)=exp(1-d(t))*h(t) - cc*d(t);
ics:=h(0)=A*a, d(0)=0:
ND:=dsolve([sys_ode,ics],numeric); #numerical solution to the system
ND(1);
ND(2);
ND(3);
ND(4);
Currently when I run this I get:
ND(1);
[t = 1., d(t) = HFloat(2.6749858704773097),
h(t) = HFloat(23.164506116038023)]
ND(2);
[t = 2., d(t) = HFloat(2.5365091646635465),
h(t) = HFloat(18.95651519442652)]
ND(3);
[t = 3., d(t) = HFloat(2.376810307084265),
h(t) = HFloat(15.018803909414379)]
ND(4);
[t = 4., d(t) = HFloat(2.1927211646807137),
h(t) = HFloat(11.391114874494281)]
But in theory h(t) should be increasing, since there is input into the system at t=1.5 and at t=3; it should not have decayed all the way down to h(t)=11.39 at t=4.
Any ideas where I have gone wrong would be appreciated. Thanks.
Support for numerical integration of ODE problems containing 0-order Dirac functions was added for Maple 2015.0. Since this addition the results from the integration you show look like this:
> ND(1);
[t = 1., d(t) = 3.01973877409584, h(t) = 37.2561191650856]
> ND(2);
[t = 2., d(t) = 3.38932165514909, h(t) = 61.6360962313253]
> ND(3);
[t = 3., d(t) = 3.32743891599543, h(t) = 71.0940887774625]
> ND(4);
[t = 4., d(t) = 3.59829473587253, h(t) = 79.8444924691185]
The function is indeed increasing.
In prior versions of Maple, no special treatment of Dirac was performed for numeric integration, so unless one hits the Dirac-0 points exactly, they would be ignored, and if they were hit, integration would halt with an undefined. The old way would have worked just as though the Dirac functions were not even there, which is consistent with your result:
> sys_ode:= diff(h(t),t)= - exp(1-d(t))*h(t),
> diff(d(t),t)=exp(1-d(t))*h(t) - cc*d(t):
> ND:=dsolve([sys_ode,ics],numeric):
> ND(1);
[t = 1., d(t) = 2.67498587047731, h(t) = 23.1645061160380]
> ND(2);
[t = 2., d(t) = 2.53650916466355, h(t) = 18.9565151944265]
> ND(3);
[t = 3., d(t) = 2.37681030708426, h(t) = 15.0188039094144]
> ND(4);
[t = 4., d(t) = 2.19272116468071, h(t) = 11.3911148744943]
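For readers without Maple 2015, the effect of the Dirac terms can be imitated by integrating the smooth part of the system between impulse times and adding the jump to h by hand. The Python sketch below does this with scipy's solve_ivp. It is an illustration under my own assumptions, not a description of what Maple does internally: in particular it ignores the impulse at t=0, so its numbers are not expected to match the Maple 2015 output above. The function names are mine.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Parameters copied from the question
A, a, dm, c3 = 30.0, 1.0, 3.0, 5.0
y = A * a                  # size of each jump in h
cc = c3 / dm               # same as t0*c3 with t0 = 1/dm
N, T = 8, 0.5
impulse_times = [dm * n * T for n in range(N + 1)]   # 0, 1.5, 3, ...

def rhs(t, state):
    """Right-hand side of the system without the Dirac terms."""
    h, d = state
    flow = np.exp(1.0 - d) * h
    return [-flow, flow - cc * d]

def advance(t_start, t_stop, state):
    """Integrate the smooth system from t_start to t_stop."""
    sol = solve_ivp(rhs, (t_start, t_stop), state, rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].copy()

def integrate(t_end, jumps):
    """Integrate from 0 to t_end, adding y to h at each jump time in (0, t_end).

    Assumption: a jump at exactly t=0 or t=t_end is not applied.
    """
    t, state = 0.0, np.array([A * a, 0.0])
    for s in sorted(s for s in jumps if 0.0 < s < t_end):
        state = advance(t, s, state)
        state[0] += y      # effect of y*Dirac(t - s) on h
        t = s
    return advance(t, t_end, state)

h_plain, d_plain = integrate(4.0, [])
h_jump, d_jump = integrate(4.0, impulse_times)
print("h(4) without impulses:", h_plain)
print("h(4) with impulses at 1.5 and 3:", h_jump)
```

Without impulses this should reproduce the decaying values shown in the question, while with the jumps applied h(4) comes out larger.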
I want to do something like the following.
Equivalent code in NumPy
a = np.zeros(5)
a[np.array([1, 2, 4])] += [1, 2, 3]
a
array([ 0., 1., 2., 0., 3.])
I tried the following but it does not work.
val v = DenseVector.zeros[Double](5)
v(1, 2, 4) :+= DenseVector(1, 2, 3)
<console>:18: error: could not find implicit value for parameter op:breeze.linalg.operators.OpAdd.InPlaceImpl2[breeze.linalg.Vector[Double],breeze.linalg.DenseVector[Int]]
v(1, 2, 4) += DenseVector(1, 2, 3)
Any help would be appreciated
I have a list of vectors, like this:
{x = 7, y = 0.}, {x = 2.5, y = 0.}, {x = -2.3, y = 0.}, {x = 2.5, y = 2.7}, {x = 2.5, y = -2.7}
How do I convert these to data I can plot? I've been trying with the "convert" function, but can't get it to work.
When I manually convert it to something like [[7, 0], [2.5, 0], [-2.3, 0], [2.5, 2.7], [2.5, -2.7]] it works, though there has to be an automatic way, right?
A little more info about what I'm doing if you're interested:
I have a function U(x,y), of which I calculate the gradient and then check where it becomes 0, like this:
solve(convert(Gradient(U(x, y), [x, y]), set), {x, y});
that gives me my list of points. Now I would like to plot these points on a graph.
Thanks!
S:={x = 7, y = 0.}, {x = 2.5, y = 0.}, {x = -2.3, y = 0.},
{x = 2.5, y = 2.7}, {x = 2.5, y = -2.7}:
T:=map2(eval,[x,y],[S]);
[[7, 0.], [2.5, 0.], [-2.3, 0.], [2.5, 2.7], [2.5, -2.7]]
I need to emulate the MATLAB function find, which returns the linear indices for the nonzero elements of an array. For example:
>> a = zeros(4,4)
a =
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
>> a(1,1) = 1
>> a(4,4) = 1
>> find(a)
ans =
1
16
numpy has the similar function nonzero, but it returns a tuple of index arrays. For example:
In [1]: from numpy import *
In [2]: a = zeros((4,4))
In [3]: a[0,0] = 1
In [4]: a[3,3] = 1
In [5]: a
Out[5]:
array([[ 1., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 1.]])
In [6]: nonzero(a)
Out[6]: (array([0, 3]), array([0, 3]))
Is there a function that gives me the linear indices without calculating them myself?
numpy has you covered:
>>> np.flatnonzero(a)
array([ 0, 15])
Internally it's doing exactly what Sven Marnach suggested.
>>> print inspect.getsource(np.flatnonzero)
def flatnonzero(a):
    """
    Return indices that are non-zero in the flattened version of a.
    This is equivalent to a.ravel().nonzero()[0].
    [more documentation]
    """
    return a.ravel().nonzero()[0]
The easiest solution is to flatten the array before calling nonzero():
>>> a.ravel().nonzero()
(array([ 0, 15]),)
If you have matplotlib installed, find is probably already available in the matplotlib.mlab module, along with some other functions intended for compatibility with MATLAB. And yes, it's implemented the same way as flatnonzero.
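One caveat with the answers above: MATLAB numbers elements column by column starting at 1, while flatnonzero numbers them row by row starting at 0. For the diagonal example in the question the two orders happen to agree, but in general they do not. A small sketch of one way to get MATLAB-style indices, assuming that exact equivalence is what is wanted (the variable names are mine):

```python
import numpy as np

a = np.zeros((2, 3))
a[0, 1] = 1
a[1, 2] = 1

# NumPy default: row-major (C order), 0-based
numpy_style = np.flatnonzero(a)

# MATLAB-like: column-major (Fortran order), 1-based
matlab_style = np.flatnonzero(a.ravel(order="F")) + 1

print(numpy_style)    # [1 5]
print(matlab_style)   # [3 6]
```

If the indices are only used to index back into the same NumPy array, the default row-major result is usually the more convenient one.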