Slice specific characters in CSV using Python


I have data in a tab-delimited format that looks like this:

0/0:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00 

I am interested in the first 3 characters of each entry (i.e. 0/0, 0/1). I figured the best way would be to use match and genfromtxt in numpy. This example is as far as I have gotten:

import re
from numpy import genfromtxt

csvfile = 'home/python/batch1.hg19.table'
data = genfromtxt(csvfile, delimiter="\t", dtype=None)

# Python 2 syntax; only the row data[1] is processed here
for i in data[1]:
    m = re.match('[0-9]/[0-9]', i)
    if m:
        print m.group(0),
    else:
        print "na",

This works for the first row of data, but I am having a hard time figuring out how to expand it to every row of the input file.

Should I make it a function and apply it to each row separately, or is there a more Pythonic way to do this?

numpy is great when you want to load in an array of numbers. The format you have here is too complicated for numpy to recognize, so you just get an array of strings. That's not really playing to numpy's strength.

Here's a simple way to do it without numpy:

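import re

# csvfile is the path defined in the question
result = []
with open(csvfile, 'r') as f:
    for line in f:
        row = []
        # each line holds tab-separated entries such as 0/0:23:...
        for text in line.split('\t'):
            match = re.search('([0-9]/[0-9])', text)
            if match:
                row.append(match.group(1))
            else:
                row.append("na")
        result.append(row)
print(result)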

yields

# [['0/0', '0/1', '0/0'], ['na', '0/1', '0/0']] 

on this data:

0/0:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00
---:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00
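
On the question of whether to wrap this in a function: one possible shape is sketched below. It assumes Python 3, and the names GENOTYPE, extract_genotypes, and sample are made up for illustration. It uses re.match rather than re.search, so the pattern has to sit at the very start of each entry, which is closer to "the first 3 characters" as the question puts it. The function takes any iterable of lines, so an open file object should work as well as the in-memory sample used here.

```python
import re

# Hypothetical names for illustration; the pattern is the one from the question.
GENOTYPE = re.compile(r'[0-9]/[0-9]')

def extract_genotypes(lines, missing="na"):
    """Return one list per line with the leading genotype of each tab-separated entry."""
    result = []
    for line in lines:
        # Drop the trailing newline before splitting on tabs.
        fields = line.rstrip('\n').split('\t')
        row = []
        for field in fields:
            m = GENOTYPE.match(field)  # match() only succeeds at the start of the entry
            row.append(m.group(0) if m else missing)
        result.append(row)
    return result

# The two sample rows shown above, joined with real tab characters.
sample = [
    "0/0:23:-1.03,-7.94,-83.75:69.15\t0/1:34:-1.01,-11.24,-127.51:99.00\t0/0:74:-1.02,-23.28,-301.81:99.00\n",
    "---:23:-1.03,-7.94,-83.75:69.15\t0/1:34:-1.01,-11.24,-127.51:99.00\t0/0:74:-1.02,-23.28,-301.81:99.00\n",
]
print(extract_genotypes(sample))
# [['0/0', '0/1', '0/0'], ['na', '0/1', '0/0']]
```

For a real file, something like `with open(csvfile) as f: rows = extract_genotypes(f)` should give the same structure, one inner list per line of the file.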
