Using a list with Python multiprocessing
Can anyone help me out with sharing a list between multiple Python processes? The problem is getting self.id_list and self.mps_in_process to work in the following code.
import time, random
from multiprocessing import Process  #, Manager, Array, Queue

class mp_stuff():
    def __init__(self, parent, id):
        time.sleep(1 + random.random()*10)  # simulate data processing
        parent.killmp(id)

class paramhandler():
    def dofirstmp(self, ids):
        self.mps_in_process = []
        self.id_list = ids
        id = self.id_list.pop(0)
        p = Process(target=mp_stuff, args=(self, id))
        self.mps_in_process.append(id)
        p.start()

    def domp(self):
        for tmp in range(3):  # nr of concurrent processes
            if len(self.id_list) > 0:
                id = self.id_list.pop(0)
                p = Process(target=mp_stuff, args=(self, id))
                self.mps_in_process.append(id)
                p.start()

    def killmp(self, kill_id):
        self.mps_in_process.remove(kill_id)
        self.domp()

if __name__ == '__main__':
    id_list = [1, 2, 3, 4, 5, 6]
    paramset = paramhandler()
    paramset.dofirstmp(id_list)
Very shortly, what the code does: data (here, a random sleep time in mp_stuff) is processed according to a data id taken from self.id_list. To keep track of which data ids are currently being processed, self.mps_in_process is used (the number of concurrent processes is hardcoded here, but in reality it is dynamic).
The problem is sharing mps_in_process and id_list across multiple processes. The current code goes into a pretty much endless loop. What goes wrong is described in the multiprocessing library documentation:
"if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start() was called."
However, I'm not able to figure out how to get mps_in_process and id_list working. I cannot use a Queue, because the way elements are taken out of mps_in_process is random. I cannot use an Array, because .pop(0) does not work on it. I cannot use Manager().list(), because .remove() and len(id_list) do not seem to work then. Using threading instead of multiprocessing is no solution, because freeze_support() must be used later on.
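For what it's worth, a list proxy returned by Manager().list() does support pop(0), remove(), and len(). A minimal check (Python 3 syntax; the function name demo_listproxy is just for illustration):

```python
from multiprocessing import Manager

def demo_listproxy():
    """Show that a Manager ListProxy supports pop(0), remove(), and len()."""
    with Manager() as manager:
        shared = manager.list([1, 2, 3, 4, 5, 6])
        first = shared.pop(0)     # pop(0) works on a ListProxy
        shared.remove(4)          # remove() works as well
        return first, len(shared), list(shared)

if __name__ == '__main__':
    print(demo_listproxy())
```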
Therefore, any way to share a list among processes is welcome!
The Manager is actually working fine (including len()). The issue is the code in the main process: it doesn't wait until the processing ends, so the main process exits and the Manager is no longer accessible. I don't know about the atomicity of the ListProxy's pop, so maybe a lock would be handy.
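That failure mode can be reproduced in miniature: when the main process exits, the Manager's server process is shut down, and any child still holding a proxy loses its connection. Keeping the main process alive with join() avoids this. A sketch in Python 3 syntax (the names worker and run_with_join are made up for this example):

```python
import time
from multiprocessing import Manager, Process

def worker(shared):
    time.sleep(0.2)           # simulate work that outlives an early main-process exit
    shared.append("done")     # would fail if the Manager had already shut down

def run_with_join():
    with Manager() as manager:
        shared = manager.list()
        p = Process(target=worker, args=(shared,))
        p.start()
        p.join()              # keep main (and hence the Manager) alive until the child finishes
        return list(shared)

if __name__ == '__main__':
    print(run_with_join())
```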
The solution was p.join() at the end of dofirstmp. However, I am confused as to why p.join() there is enough. I would be happy if someone could explain why join on the first p returns only after all the computation is done, and not right after the first domp returns.
My code:
import time, random
from multiprocessing import Process, Manager

class mp_stuff():
    def __init__(self, parent, id):
        time.sleep(1 + random.random()*5)  # simulate data processing
        print id, "done"
        parent.killmp(id)

class paramhandler():
    def dofirstmp(self, ids):
        self.mps_in_process = []
        self.id_list = Manager().list(ids)
        id = self.id_list.pop(0)
        p = Process(target=mp_stuff, args=(self, id))
        self.mps_in_process.append(id)
        p.start()
        p.join()
        print "joined"

    def domp(self):
        for tmp in range(3):  # nr of concurrent processes
            print self.id_list
            if len(self.id_list) > 0:
                id = self.id_list.pop(0)
                p = Process(target=mp_stuff, args=(self, id))
                self.mps_in_process.append(id)
                p.start()

    def killmp(self, kill_id):
        print "kill", kill_id
        self.mps_in_process.remove(kill_id)
        self.domp()

if __name__ == '__main__':
    id_list = [1, 2, 3, 4, 5, 6]
    paramset = paramhandler()
    paramset.dofirstmp(id_list)
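One plausible explanation for why joining the first p waits for everything (an assumption worth checking against the multiprocessing documentation): non-daemonic child processes are joined automatically when the process that spawned them shuts down, so the first p cannot exit until the processes it started via killmp/domp have finished, and so on recursively down the chain. A minimal Python 3 sketch of that behavior, using made-up names grandchild/child/demo_join_chain:

```python
import time
from multiprocessing import Process

def grandchild():
    time.sleep(0.3)                      # work that outlives its parent's own code

def child():
    Process(target=grandchild).start()   # child's code returns immediately...
    # ...but the child process does not exit yet: at shutdown, multiprocessing
    # joins its non-daemonic child processes automatically.

def demo_join_chain():
    start = time.time()
    p = Process(target=child)
    p.start()
    p.join()                             # returns only once the grandchild is done too
    return time.time() - start

if __name__ == '__main__':
    print(demo_join_chain())
```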