The BSPlib installation seems to have gone OK for almost all of you. Some issues about the programming follow.

1. If you run the same code on p CPUs with the same seed (because you did not call srandom, or the seed value was not processor/process dependent), then you repeat the same experiment with the same random number sequence p times. You can avoid this with careful reseeding per process (involve, for example, bsp_pid() and/or the Unix process id). Time alone might be dangerous (even the process id might be), as all CPUs might have the same date/time.

2. Idea-wise, most of you did

       x = random() / (double) 2147483647;    /* 2^31-1 */
       y = random() / (double) 2147483647;
       if ((x-0.5)*(x-0.5) + (y-0.5)*(y-0.5) < 0.25) etc

   with optional common subexpression optimization (some of you did so). One alternative that can slightly speed things up is

       x = random() / (double) 2147483647;
       y = random() / (double) 2147483647;
       if (x*x + y*y < 1.0) etc

   (i.e. one quarter of a circle with center (0,0) and radius 1 is inscribed in the unit square). This eliminates some minor subtractions etc. Taking a sqrt is not required and can only slow things down!

3. Or do the following

       #define R ((double) 2147483647 * (double) 2147483647)   /* (2^31-1)^2; note that ^ is XOR in C */
       X = random();
       Y = random();
       if ((double) X * X + (double) Y * Y < R) etc

   computing the squares in double so that they do not overflow :-). The elimination of divisions can shave off, for example, 20-30 seconds from the runtime with 100,000,000 runs.

4. Optimization-wise, when you use gcc explore using -O3 or -O2, or

       gcc -O? -funroll-loops -mcpu=cputype

   where cputype is one of: pentium3, i486, i586, pentium, pentium4. With BSPlib always use -flibrary-level 2 in production code (not during debugging, where my suggestion is to use -flibrary-level 0).

alex