Rendering by high performance computers

Marsien · March 5, 2020, 1:33pm

Dear Radiance Community,
I have a problem with rendering on a high-performance computing (HPC) machine.
I am using the following parameters on my 6-core laptop, and the render takes almost 30 hours to complete. With the same settings on a 20-core high-performance computer, the render is still running after 50 hours!
rtrace -dj 0.02 -ds 0.05 -dt 0.05 -dc 0.5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -x 12960 -y 12960

Should we use the -n option for rtrace in this case?
I would appreciate any help with this issue!

Greg_Ward · March 5, 2020, 3:44pm

Without the -n option, rtrace will use a single processor core. What is the input to your command, by the way? Have you thought about using rtpict instead, which calls vwrays and rtrace for you?

I should also ask if you are on a Unix (Linux, Mac) machine, as the rtrace -n option does not work under Windows.

I would recommend using the -n option set to however many physical cores you have available, and definitely set the -af option to a shared ambient file as well. Otherwise, you’ll be asking each process to recompute the interreflections from scratch.
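
A minimal sketch of this suggestion, assuming a Unix shell. The core count (20), the ambient file name scene.amb and the octree name Model.oct are placeholders, and most of the rendering options are left out. The command is only printed, so it can be checked before committing to a long run.

```shell
#!/bin/sh
# Placeholders (assumptions, not taken from an actual setup):
NPROC=20            # number of physical cores available
AMBFILE=scene.amb   # ambient file shared by all rtrace processes
OCTREE=Model.oct    # scene octree

# -n starts NPROC rendering processes (Unix only);
# -af makes them share one ambient cache instead of each
# recomputing the interreflections from scratch.
CMD="rtrace -n $NPROC -af $AMBFILE -ab 4 -aa 0.02 -ar 32 -fac $OCTREE"

# Print the command instead of running it.
echo "$CMD"
```

Ray origins and directions would still arrive on standard input, as in the original command.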

Marsien · March 7, 2020, 2:15pm

Hi,
On my own laptop I am using the Windows version (so without the -n option), but the high-performance computer (HPC) is a Unix machine.
The problem is that I received this response about using the -n option on the HPC:
“because rtrace uses a locking mechanism that HPC doesn’t have in order to guarantee that only one parallel process at a time writes in the output file. If they write simultaneously, it destroys the output file.”
The input to my command is the .oct file plus the viewpoint:
cnt 12960 12960 | rcalc -f .\view360stereo.cal -e "XD:12960;YD:12960;X:1.5;Y:2.9;Z:1.533;IPD:0.06;EX:0;EZ:0" | rtrace … -fac .\Model.oct > output.hdr

Best,
Marzieh

Greg_Ward · March 7, 2020, 4:06pm

Where did your quoted message come from? There have been versions of Linux that did not implement the NFS lock manager correctly, but I haven’t heard of any issues with the -af option in over a decade. I had assumed that this was fixed. What version of which operating system is your HPC machine running? Maybe someone else on the list knows more about this situation than I do and can respond…

Marsien · March 7, 2020, 6:04pm

Thank you for your response.
The quoted message was from a colleague in our IT department who is helping us to use the HPC machine. I will forward this thread to him; he certainly knows much more than I do about running jobs on the HPC and its limitations!
Best

Lars_Grobe · March 10, 2020, 3:21pm

Hi Marzieh,

Since, as far as I understand, you are using 20 local cores in an SMP
setup, you can avoid the whole locking issue (if it exists) by keeping
the ambient file on a fast local volume, e.g. /tmp. You can move it from
there once the computations are completed (if you ever need it later).

Having the ambient file on an NFS share makes sense only if you
distribute your computational load over several machines.
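
This workflow could look roughly like the sketch below. The directory and file names are hypothetical, and the rendering step is replaced by a placeholder that only creates an empty file, so the sketch runs without Radiance installed.

```shell
#!/bin/sh
# Private scratch directory on a local volume (falls back to /tmp).
SCRATCH=$(mktemp -d "${TMPDIR:-/tmp}/radiance.XXXXXX") || exit 1
AMBFILE="$SCRATCH/scene.amb"   # hypothetical ambient file name

# Hypothetical permanent location; a second temporary directory
# stands in for a home or project directory here.
KEEP=$(mktemp -d "${TMPDIR:-/tmp}/results.XXXXXX") || exit 1

# Placeholder for the real rendering step, which would pass
# -af "$AMBFILE" to rtrace. Here we only create an empty file.
: > "$AMBFILE"

# Once rendering has finished, move the ambient file off the
# scratch volume and clean up.
mv "$AMBFILE" "$KEEP/scene.amb"
rmdir "$SCRATCH"
echo "ambient file kept in $KEEP"
```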

Best, Lars.

Greg_Ward · March 10, 2020, 3:48pm

Just to be clear, Radiance ambient files use the fcntl() F_SETLKW call to lock and unlock the file. This is a general advisory lock mechanism, which may or may not be implemented with NFS file locks. It doesn’t matter where the actual file resides, though putting it on a local disk as Lars suggests is preferable if that is accessible by all processes. (Putting it on a RAMdisk is even better, but not usually called for unless you have a great many processes using it.)

Espen · March 11, 2020, 11:34am

Would the ParkControl software help to make use of all cores?

Lars_Grobe · March 11, 2020, 4:34pm

Hi Espen,

If your operating system does its job, and you launch as many processes
as you have CPU cores, they should all be busy - and theoretically used
to their maximum capacity, unless your system runs out of memory or your
input/output becomes the limiting factor. So I would not know how
software (besides the OS) could help.
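
On a Unix system, the number of processes to launch can be read from the OS rather than hard-coded. A small sketch, assuming getconf supports _NPROCESSORS_ONLN (it does on Linux and macOS): note that it reports logical processors, which can exceed the number of physical cores when hyper-threading is enabled, so the value may need lowering by hand.

```shell
#!/bin/sh
# Number of processors currently online (logical, not physical).
# Falls back to 1 if getconf does not know the variable.
NCORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || NCORES=1

# Print the option that would be passed to rtrace.
echo "rtrace -n $NCORES"
```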

Memory should not be an issue if you share the ambient cache, since the
model is loaded only once and shared by all processes. One exception
that I have had trouble with is the run-time generated sampling
distributions when using high-resolution, data-driven BSDFs. If you have
a model with several “tensor-tree” BSDFs, you may not be able to use all
cores in some cases. Besides that, Radiance has been implemented with
memory efficiency in mind (sometimes this is criticized since it
limits modularity), to make best use of your CPUs.

Finally, while it may appear to be less relevant in a time when even
laptops have multiple CPU cores, Radiance also supports distributed
memory systems (e.g. clusters), and thus scales far beyond what you can
install on a motherboard.

Best, Lars.