python - Iterating through a scipy.sparse vector (or matrix)
I'm wondering what the best way is to iterate over the nonzero entries of sparse matrices with scipy.sparse. For example, if I do the following:
    from scipy.sparse import lil_matrix

    x = lil_matrix( (20,1) )
    x[13,0] = 1
    x[15,0] = 2

    c = 0
    for i in x:
        print c, i
        c = c+1
The output is:

    0
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13 (0, 0) 1.0
    14
    15 (0, 0) 2.0
    16
    17
    18
    19
So it appears the iterator is touching every element, not just the nonzero entries. I've had a look at the API
http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.html
and searched around a bit, but I can't seem to find a solution that works.
Edit: bbtrb's method (using coo_matrix) is faster than my original suggestion of using nonzero. Sven Marnach's suggestion to use itertools.izip also improves the speed. The current fastest is using_tocoo_izip:
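    import scipy.sparse
    import random
    import itertools

    def using_nonzero(x):
        # Look up each nonzero entry by index (slow: one lookup per entry)
        rows,cols = x.nonzero()
        for row,col in zip(rows,cols):
            ((row,col), x[row,col])

    def using_coo(x):
        # Build a COO copy through the coo_matrix constructor
        cx = scipy.sparse.coo_matrix(x)
        for i,j,v in zip(cx.row, cx.col, cx.data):
            (i,j,v)

    def using_tocoo(x):
        # Convert to COO with the matrix's own tocoo() method
        cx = x.tocoo()
        for i,j,v in zip(cx.row, cx.col, cx.data):
            (i,j,v)

    def using_tocoo_izip(x):
        # Same as using_tocoo, but with a lazy izip (Python 2)
        cx = x.tocoo()
        for i,j,v in itertools.izip(cx.row, cx.col, cx.data):
            (i,j,v)

    # Test matrix: n x n with up to n random nonzero entries
    n=200
    x = scipy.sparse.lil_matrix( (n,n) )
    for _ in xrange(n):
        x[random.randint(0,n-1),random.randint(0,n-1)]=random.randint(1,100)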
This yields these timeit results:
    % python -mtimeit -s'import test' 'test.using_tocoo_izip(test.x)'
    1000 loops, best of 3: 670 usec per loop
    % python -mtimeit -s'import test' 'test.using_tocoo(test.x)'
    1000 loops, best of 3: 706 usec per loop
    % python -mtimeit -s'import test' 'test.using_coo(test.x)'
    1000 loops, best of 3: 802 usec per loop
    % python -mtimeit -s'import test' 'test.using_nonzero(test.x)'
    100 loops, best of 3: 5.25 msec per loop
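For anyone on Python 3, here is a minimal sketch of the tocoo approach applied to the 20x1 example from the question. The code above is Python 2; in Python 3, itertools.izip no longer exists and the built-in zip is already lazy, so plain zip plays the same role. The helper name iter_nonzero is made up for illustration and is not part of SciPy. Note that this walks the stored entries, so any explicitly stored zeros would be yielded as well.

```python
import scipy.sparse


def iter_nonzero(x):
    """Yield (row, col, value) for each stored entry of a sparse matrix.

    iter_nonzero is a hypothetical helper name, not a SciPy function.
    """
    cx = x.tocoo()  # COO format exposes parallel row/col/data arrays
    # In Python 3 the built-in zip is lazy, so itertools.izip is not needed.
    for i, j, v in zip(cx.row, cx.col, cx.data):
        # Convert NumPy scalars to plain Python types for readability.
        yield int(i), int(j), float(v)


# Same matrix as in the question: a 20x1 vector with two nonzero entries.
x = scipy.sparse.lil_matrix((20, 1))
x[13, 0] = 1
x[15, 0] = 2

entries = list(iter_nonzero(x))
for row, col, value in entries:
    print(row, col, value)
```

Only the two stored entries are visited here, rather than all 20 rows as with direct iteration over the lil_matrix.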