Slice specific characters in CSV using Python

April 15, 2011

I have data in a tab-delimited format that looks like this:

0/0:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00

I am interested in the first 3 characters of each entry (i.e. 0/0, 0/1). I figured the best way would be to use re.match together with genfromtxt in numpy. This example is as far as I have gotten:

    import re
    from numpy import genfromtxt

    csvfile = 'home/python/batch1.hg19.table'
    data = genfromtxt(csvfile, delimiter="\t", dtype=None)

    # Python 2 print statements; the trailing comma keeps output on one line
    for i in data[1]:
        m = re.match('[0-9]/[0-9]', i)
        if m:
            print m.group(0),
        else:
            print "na",

This works for the first row of the data, but I am having a hard time figuring out how to expand it to every row of the input file.

Should I make it a function and apply it to each row separately, or is there a more Pythonic way to do this?

NumPy is great when you want to load in an array of numbers. The format you have here is too complicated for NumPy to recognize, so you just get an array of strings. That's not playing to NumPy's strength.

Here's a simple way to do it without NumPy:

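    import re

    # csvfile is the path defined in the question
    result = []
    with open(csvfile, 'r') as f:
        for line in f:
            row = []
            # Split each line on tabs and look for a digit/digit token in every field
            for text in line.split('\t'):
                match = re.search('([0-9]/[0-9])', text)
                if match:
                    row.append(match.group(1))
                else:
                    row.append("na")
            result.append(row)
    print(result)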

This yields

# [['0/0', '0/1', '0/0'], ['na', '0/1', '0/0']]

on this data:

0/0:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00
---:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00
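
On the question of whether to make this a function: the loop above can be wrapped in one without much change. The sketch below is only one possible arrangement, not the only way to do it. The names `extract_prefixes`, `PREFIX`, and the `missing` parameter are made up for illustration, and the input is an in-memory copy of the sample rows rather than the real file. It also uses `re.match`, which anchors at the start of each field, where the loop above uses `re.search`, which would accept a digit/digit token anywhere in the field.

```python
import io
import re

# Hypothetical name: pattern for a leading "digit/digit" token such as 0/0 or 0/1.
PREFIX = re.compile(r'[0-9]/[0-9]')

def extract_prefixes(lines, missing="na"):
    """Return one list per line with the leading digit/digit token of each tab-separated field."""
    result = []
    for line in lines:
        # Drop the trailing newline so it does not end up in the last field
        fields = line.rstrip('\n').split('\t')
        row = []
        for field in fields:
            m = PREFIX.match(field)  # match() only looks at the start of the field
            row.append(m.group(0) if m else missing)
        result.append(row)
    return result

# In-memory stand-in for the tab-delimited file, using the two sample rows
sample = io.StringIO(
    "0/0:23:-1.03,-7.94,-83.75:69.15\t0/1:34:-1.01,-11.24,-127.51:99.00\t0/0:74:-1.02,-23.28,-301.81:99.00\n"
    "---:23:-1.03,-7.94,-83.75:69.15\t0/1:34:-1.01,-11.24,-127.51:99.00\t0/0:74:-1.02,-23.28,-301.81:99.00\n"
)
rows = extract_prefixes(sample)
print(rows)
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of `sample`, for example `with open(csvfile) as f: rows = extract_prefixes(f)`.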
