Memory leak when using OpenMP
The test case below runs out of memory on 32-bit machines (throwing std::bad_alloc) in the loop following the "post mt section" message when OpenMP is used. However, if the OpenMP #pragmas are commented out, the code runs to completion fine. It appears that when memory is allocated in parallel threads, it is not freed correctly and the program runs out of memory.
The question is whether there is something wrong with the memory allocation and deletion code below, or whether this is a bug in gcc v4.2.2 or OpenMP. I tried gcc v4.3 and got the same failure.
#include <iostream>
#include <vector>

int main(int argc, char** argv)
{
    std::cout << "start " << std::endl;

    {
        std::vector<std::vector<int*> > nts(100);
        #pragma omp parallel
        {
            #pragma omp for
            for(int begin = 0; begin < int(nts.size()); ++begin)
            {
                for(int i = 0; i < 1000000; ++i)
                {
                    nts[begin].push_back(new int(5));
                }
            }
        }

        std::cout << " pre delete " << std::endl;
        for(int begin = 0; begin < int(nts.size()); ++begin)
        {
            for(int j = 0; j < int(nts[begin].size()); ++j)
            {
                delete nts[begin][j];
            }
        }
    }

    std::cout << "post mt section" << std::endl;

    {
        std::vector<std::vector<int*> > nts(100);
        int begin, i;
        try
        {
            for(begin = 0; begin < int(nts.size()); ++begin)
            {
                for(i = 0; i < 2000000; ++i)
                {
                    nts[begin].push_back(new int(5));
                }
            }
        }
        catch (std::bad_alloc &e)
        {
            std::cout << e.what() << std::endl;
            std::cout << "begin: " << begin << " i: " << i << std::endl;
            throw;
        }

        std::cout << "pre delete 1" << std::endl;
        for(int begin = 0; begin < int(nts.size()); ++begin)
        {
            for(int j = 0; j < int(nts[begin].size()); ++j)
            {
                delete nts[begin][j];
            }
        }
    }

    std::cout << "end of prog" << std::endl;

    char c;
    std::cin >> c;

    return 0;
}
Changing the first OpenMP loop from 1000000 to 2000000 will cause the same error. This indicates that the out-of-memory problem is related to the OpenMP stack limit.
Try setting the OpenMP stack limit to unlimited in bash with:
ulimit -s unlimited
You can also change the OpenMP environment variable OMP_STACKSIZE, setting it to 100 MB or more.
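For example, in bash this could look like the following (the value 100M is only an illustrative choice; OMP_STACKSIZE accepts sizes with K, M, or G suffixes, and you should pick a value that fits your workload and available memory):

export OMP_STACKSIZE=100M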
Update 1: Changing the first loop to
{
    std::vector<std::vector<int*> > nts(100);
    #pragma omp parallel for schedule(static) ordered
    for(int begin = 0; begin < int(nts.size()); ++begin)
    {
        for(int i = 0; i < 2000000; ++i)
        {
            nts[begin].push_back(new int(5));
        }
    }

    std::cout << " pre delete " << std::endl;
    for(int begin = 0; begin < int(nts.size()); ++begin)
    {
        for(int j = 0; j < int(nts[begin].size()); ++j)
        {
            delete nts[begin][j];
        }
    }
}
then the memory error occurs at i = 1574803 on the main thread.
Update 2: If you are using the Intel compiler, you can add the following to the top of the code and it will solve the problem (providing you have enough memory for the overhead).
std::cout << "previous stack size " << kmp_get_stacksize_s() << std::endl; kmp_set_stacksize_s(1000000000); std::cout << "now stack size " << kmp_get_stacksize_s() << std::endl;
Update 3: For completeness, as mentioned by another member, if you are performing numerical computation it is best to preallocate everything in a single new float[1000000] instead of using OpenMP to perform 1000000 separate allocations. This applies to allocating objects as well.
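As a rough sketch of what that preallocation could look like (illustrative only, not the original code; the buffer size, the value written, and the way work is split across threads are assumptions):

#include <cstddef>

int main()
{
    const int n = 1000000;

    // One allocation up front instead of n separate new calls inside the parallel loop.
    float* data = new float[n];

    #pragma omp parallel for
    for(int i = 0; i < n; ++i)
    {
        data[i] = 5.0f;  // each thread fills its own chunk; no per-element allocation
    }

    delete[] data;
    return 0;
}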