errors and re-runs

Post date: Jan 07, 2015 4:36:17 PM

Several of the runs hit numerical problems when sampling from a beta distribution because alpha or beta was occasionally equal to zero. I fixed this by adding GSL_DBL_MIN (GSL's smallest positive normalized double).

None of the CV runs transferred back, and I overwrote half of the CV logs. These were simple, stupid errors that I have fixed, and I am now rerunning the CV. Note that, for speed, I am running each scale of the CV as a separate run (see example below). Many of these runs hit segmentation faults after running for a while. I am not sure why, and I will need to keep an eye on this.

cd /local/scratch/

sleep 0

popanc -o outcv_s10F_r9f0.3gen_demog_gens10.hdf5 -m 10000 -b 5000 -t 5 -f 1 -w 0 -v 1 -c 10 /labs/evolution/projects/popanc_sims/sims/genoP0F_r9f0.3gen_demog_gens10.txt /labs/evolution/projects/popanc_sims/sims/genoP1F_r9f0.3gen_demog_gens10.txt /labs/evolution/projects/popanc_sims/sims/genoAdmxF_r9f0.3gen_demog_gens10.txt > /labs/evolution/projects/popanc_sims/mcmc/log_s10F_r9f0.3gen_demog_gens10

scp outcv_s10F_r9f0.3gen_demog_gens10.hdf5 /labs/evolution/projects/popanc_sims/mcmc/
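One way to guard against the two mistakes described above (overwritten logs, and output that never gets copied back) is to wrap the run in a small shell function. This is only a sketch, not what the job scripts actually do: `run_logged` is a hypothetical helper name, not part of popanc, and it assumes a POSIX shell.

```shell
# Hypothetical helper: run a command with stdout sent to a log file,
# but refuse to start if that log file already exists.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log="$1"
    shift
    if [ -e "$log" ]; then
        echo "refusing to overwrite existing log: $log" >&2
        return 2
    fi
    # The function's exit status is the command's exit status.
    "$@" > "$log"
}
```

Chaining the copy with `&&` after `run_logged` would then mean the output file is only transferred when the run exits cleanly, and a crashed run leaves its log in place for inspection.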

**** The problems were more widespread than I initially thought. Only 78 of the initial runs finished successfully. Many had segmentation faults or bus errors, or aborted because they failed to allocate a stack for the data. This could mean there is some sort of memory issue in the program, but given the recent work on the cluster, it may have been a more general problem. So I am going to rerun all 300 main (non-CV) jobs to see if the problem fixes itself. If not, I will look for a memory bug (I already checked all of the calloc calls and did not find an obvious issue). Note that the original results are in mcmc/run1/ (I kept them to see whether the same jobs fail again).
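To see whether the same jobs fail again after the rerun, it may help to list which expected output files are missing or empty in each results directory. The sketch below is an assumption about how one could do that, not the procedure used here: `list_incomplete` is a hypothetical helper, and the list of expected file names has to be supplied by the caller. A non-empty file does not prove the run finished, so this only catches the most obvious failures.

```shell
# Hypothetical helper: print each expected output file that is missing
# or has zero size in directory $1.
# Expected file names are read one per line from standard input.
list_incomplete() {
    dir="$1"
    while IFS= read -r name; do
        # -s is true only if the file exists and is non-empty.
        [ -s "$dir/$name" ] || printf '%s\n' "$name"
    done
}
```

Running this once against mcmc/run1/ and once against the rerun directory, then comparing the two sorted lists with `comm -12`, would show the jobs that failed both times.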
