"Broken pipe" message from rpiece on multi-core Linux system

Hi @R_Fritz1 and @Andrew_McNeil2. I’m reviving this thread from the dead because it’s happening to me, too, and I’m curious if anyone has found a solution.

The problem comes up when I run rpiece via rad using Torque PBS on a Linux cluster. The first time I run rad with -N 8, I get this error:

rpict: system - write error in io_process: Broken pipe
rpict: 0 rays, 0.00% after 0.000u 0.000s 0.001r hours on xxxxxx (PID 18652)
rad: error rendering view v3

So I guess one of my processes terminated. If I run again without deleting the ambient file, I get the error 7 times. If I set -aa 0, then everything runs fine. I guess this means something is going wrong when the processes try to access the ambient file. However, I’ve never had an issue sharing ambient files across a large number of cores using rtrace in this environment, so I’m not sure why this is different.
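
For anyone following along, here is a minimal Python sketch of how a "Broken pipe" write error arises in general: a parent writes to a pipe after the process on the reading end has already exited. This is not Radiance code. The function name write_to_exited_child is made up for illustration, and the child is a throwaway Python process standing in for a renderer that died early.

```python
import subprocess
import sys


def write_to_exited_child() -> str:
    """Write to the stdin of a child process that has already exited."""
    # Hypothetical stand-in for a worker that terminates before reading input.
    child = subprocess.Popen([sys.executable, "-c", "pass"],
                             stdin=subprocess.PIPE)
    child.wait()  # the child is gone, so nothing holds the read end open

    try:
        child.stdin.write(b"some data\n")
        child.stdin.flush()  # the write actually reaches the OS here
    except BrokenPipeError as exc:
        return f"write error: {exc.strerror}"
    finally:
        try:
            child.stdin.close()  # close may retry the flush and fail again
        except OSError:
            pass
    return "write succeeded"


if __name__ == "__main__":
    print(write_to_exited_child())
```

If this reflects what is happening in my case, the message is a symptom: it is reported by whichever process was still writing, and the interesting question is why the process on the other end went away.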

If I run the script directly instead of submitting it to PBS, it runs without error but appears to run sequentially rather than in parallel. I assume this is consistent with the rad documentation, which says "The -N option instructs rad to run as many as npr rendering processes in parallel," so npr would be an upper limit rather than a guarantee, but I’m not clear on how it actually decides on the number of cores to use.
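
One thing that can be checked independently of rad is how many cores a process is actually allowed to use in each environment. The sketch below is a diagnostic only, and says nothing about how rad itself chooses its process count. It assumes a Linux node and that Torque sets the PBS_NODEFILE variable inside the job; the function name cpu_report and the dictionary keys are made up for illustration. Whether the scheduler restricts CPU affinity at all depends on how the cluster is configured.

```python
import os


def cpu_report(environ=None):
    """Collect the core counts visible to this process."""
    environ = os.environ if environ is None else environ

    # Cores physically present on the node (may be None if undetermined).
    report = {"cores_on_node": os.cpu_count()}

    # Cores this process may be scheduled on (Linux only).
    if hasattr(os, "sched_getaffinity"):
        report["cores_usable"] = len(os.sched_getaffinity(0))
    else:
        report["cores_usable"] = None

    # Assumption: inside a Torque job, PBS_NODEFILE names a file with one
    # line per allocated processor slot.
    nodefile = environ.get("PBS_NODEFILE")
    if nodefile and os.path.isfile(nodefile):
        with open(nodefile) as handle:
            report["pbs_slots"] = sum(1 for line in handle if line.strip())
    else:
        report["pbs_slots"] = None

    return report


if __name__ == "__main__":
    for name, value in cpu_report().items():
        print(f"{name}: {value}")
```

Running this once from a login shell and once from inside a submitted job would show whether the two environments expose the same number of cores.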

If I remove -N 8 from the command, everything runs fine on PBS but in series.

Any ideas what’s going on here? The goal is to create a production environment to run arbitrary scripts, and the script has worked on other machines, so I’d rather change settings in the environment than edit the script.

Nathaniel