Mac OSX, xgrid and RADIANCE

Has anyone using Mac OSX with RADIANCE tried rigging the OSX built-in xgrid app to distribute rendering and general crunching over a cluster of Macs?

Apple makes it seem so simple to gather yourself up a little supercomputer.

kirk

···

------------------------------

Kirk L. Thibault, Ph.D.
[email protected]

p. 215.271.7720
f. 215.271.7740
c. 267.918.6908

skype. kirkthibault

Kirk Thibault wrote:

Has anyone using Mac OSX with RADIANCE tried rigging the OSX built-in xgrid app to distribute rendering and general crunching over a cluster of Macs?

http://www.apple.com/server/macosx/features/xgrid.html

Apple makes it seem so simple to gather yourself up a little supercomputer.

Yeah, just like the way they make their Windows support for XServe look simple. The reality is a whole lot different! In fairness, they do provide a bunch of excellent command-line tools for managing the samba support in OSX Server, it's just that they make it look like you never need anything but the GUI, which is pure fantasy. But this is a Radiance list so I'll stop talking about Windows and XServes...

Kirk, I hadn't seen this xGrid before; this is something they added with 10.4 (I'm still on 10.3 and an OLD Powerbook) and it looks interesting! However, there is a long history of file locking issues with Radiance and multiple machines. The archives probably contain two or three threads about this topic with respect to Linux. I'd be really interested to hear what Greg thinks of this new Apple tool, and its possibilities, though.

Rob Guglielmetti

Well, I really have nothing to say about xGrid as I've never tried it, so let me just say it.

I think it would be nice sometimes to have something that's monitoring system use so it would send my jobs to the right machines on LBNL's render farm. Usually, I just put the jobs there manually with ssh -f and coordinate with other users, but it's a bit haphazard and things don't usually finish together. That said, I'm not sure how it works or if it kills and restarts jobs or just stops them or expects small ones or what. Many Radiance jobs of course take hours (or even days).
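One way to picture the "send my jobs to the right machines" idea is a greedy longest-job-first assignment, which tends to make hosts finish at roughly the same time. This is only a sketch: the job names, run-time estimates and host names are hypothetical, and nothing here reflects how xGrid or LBNL's render farm actually schedules work. The `ssh -f` launch line mirrors the manual approach described above.

```python
import heapq


def assign_jobs(job_hours, host_busy_hours):
    """Greedy longest-job-first assignment.

    job_hours: dict of job name -> estimated run time in hours (guesses).
    host_busy_hours: dict of host name -> hours of work already queued.
    Returns a dict of host name -> list of job names, in launch order.
    """
    # Min-heap keyed on each host's projected finish time.
    heap = [(busy, host) for host, busy in host_busy_hours.items()]
    heapq.heapify(heap)
    plan = {host: [] for host in host_busy_hours}
    # Place the longest jobs first; each goes to the host that frees up soonest.
    for job, hours in sorted(job_hours.items(), key=lambda kv: kv[1], reverse=True):
        finish, host = heapq.heappop(heap)
        plan[host].append(job)
        heapq.heappush(heap, (finish + hours, host))
    return plan


def ssh_launch_command(host, command):
    """Argument list for starting a job in the background with ssh -f."""
    return ["ssh", "-f", host, command]
```

The estimates only need to be roughly right for this to beat placing jobs by hand, but it does nothing about jobs that are killed or restarted, which is the open question above.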

Historically, there haven't been any problems with NFS or the lock manager under OS X (or any BSD-derivative). It's only Linux that's been finicky on that count. Arguably, we should really replace the NFS lock manager with something more reliable, and we've gone back and forth on how to do that, but I keep hoping the problem will go away. Are people still struggling with it under Linux, or is it just the Windows version we have to start thinking about? Last time I spoke with Jack (of Visarc), I think he said it was still a problem.

-Greg

···

From: Rob Guglielmetti <[email protected]>
Date: September 21, 2005 3:48:38 PM PDT

Kirk Thibault wrote:

Has anyone using Mac OSX with RADIANCE tried rigging the OSX built-in xgrid app to distribute rendering and general crunching over a cluster of Macs?

http://www.apple.com/server/macosx/features/xgrid.html

Apple makes it seem so simple to gather yourself up a little supercomputer.

Yeah, just like the way they make their Windows support for XServe look simple. The reality is a whole lot different! In fairness, they do provide a bunch of excellent command-line tools for managing the samba support in OSX Server, it's just that they make it look like you never need anything but the GUI, which is pure fantasy. But this is a Radiance list so I'll stop talking about Windows and XServes...

Kirk, I hadn't seen this xGrid before; this is something they added with 10.4 (I'm still on 10.3 and an OLD Powerbook) and it looks interesting! However, there is a long history of file locking issues with Radiance and multiple machines. The archives probably contain two or three threads about this topic with respect to Linux. I'd be really interested to hear what Greg thinks of this new Apple tool, and its possibilities, though.
Rob Guglielmetti

SunOS, which is also a BSD derivative, used to have major problems with NFS locking, too--my guess is they are still there. NFS file locking is not a priority to any vendor or free/open development group I am aware of, so its operability is catch-as-catch-can.

I don't have any time to work on this (sigh), but it really is an issue that needs attention.

Randolph

···

On Sep 21, 2005, at 4:49 PM, Greg Ward wrote:

Historically, there haven't been any problems with NFS or the lock manager under OS X (or any BSD-derivative). It's only Linux that's been finicky on that count. Arguably, we should really replace the NFS lock manager with something more reliable, and we've gone back and forth on how to do that, but I keep hoping the problem will go away. Are people still struggling with it under Linux, or is it just the Windows version we have to start thinking about? Last time I spoke with Jack (of Visarc), I think he said it was still a problem.

Kirk Thibault wrote:

Has anyone using Mac OSX with RADIANCE tried rigging the OSX built-in xgrid app to distribute rendering and general crunching over a cluster of Macs?

http://www.apple.com/server/macosx/features/xgrid.html

[...]

Kirk, I hadn't seen this xGrid before; this is something they added with 10.4 (I'm still on 10.3 and an OLD Powerbook) and it looks interesting!

You can download xGrid for 10.3 as well (according to the PDF available on
the page above - nice and short intro).

However, there is a long history of file locking issues with Radiance and multiple machines.

Reading the man-page for xgrid I think file locking of the amb-file would
be the only problem as the controller process _copies_ the whole working
directory to the clients.

Everything else could be as simple as this example (from the man-page):

> Submit myscript with the files in the input directory. Send email to
> [email protected] on every job state change. Then retrieve the results
> and save the stdout and stderr streams in files instead of printing them
> out to the terminal and save the output files in the specified directory.
> Finally delete the job:
>
> $ xgrid -job submit -in ~/data/working -email [email protected] \
>       myscript param1 param2
> { jobIdentifier = 27; }
> $ xgrid -job results -id 27 -so job.out -se job.err -out job-outdir
> $ xgrid -job delete -id 27

Seems like no changes to the code are necessary. It would be nice
if someone with a flock of Macs at his/her disposal could test
this sometime ...
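For anyone who wants to script the three steps from the man-page example, here is a minimal Python sketch. It uses only the options that appear in the excerpt above and builds the argument lists without running anything. The helper names and the e-mail address in the usage note are placeholders, and the reply format is assumed to match the `{ jobIdentifier = 27; }` line shown in the excerpt.

```python
import re


def submit_cmd(workdir, email, script, *params):
    """Argument list for 'xgrid -job submit' as shown in the man-page example."""
    return ["xgrid", "-job", "submit", "-in", workdir, "-email", email, script, *params]


def results_cmd(job_id, stdout_file, stderr_file, outdir):
    """Argument list for fetching stdout, stderr and output files of a job."""
    return ["xgrid", "-job", "results", "-id", str(job_id),
            "-so", stdout_file, "-se", stderr_file, "-out", outdir]


def delete_cmd(job_id):
    """Argument list for deleting a finished job from the controller."""
    return ["xgrid", "-job", "delete", "-id", str(job_id)]


def parse_job_id(submit_output):
    """Pull the job number out of a reply such as '{ jobIdentifier = 27; }'."""
    match = re.search(r"jobIdentifier\s*=\s*(\d+)", submit_output)
    if match is None:
        raise ValueError("no jobIdentifier found in: %r" % submit_output)
    return int(match.group(1))
```

On a machine that has xgrid installed, the lists could be passed to `subprocess.run(cmd, capture_output=True, text=True)`, for example `submit_cmd("~/data/working", "user@example.com", "myscript", "param1", "param2")`, with the job id parsed from the captured stdout. None of this has been tested against a real controller.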

As a related note on OS X 10.4:

I'm trying to get a bit of performance from my G5 iMac. Compiling the
Radiance source is a matter of minutes. But when I try Mark's benchmarks
(http://mark.technolope.org/pages/rad_bench.html) the rpict time is worse
than his results for a 1.6 GHz P4-M.

I thought a 2 GHz PowerPC would perform much better than that. I have
only tested a few compiler optimizations so far but I don't think they
will change that much any more. Apple ships gcc 4.0 as default compiler
for 10.4. Could this be the reason?

What performance should I expect from the G5?

Thomas

···

On 22.09.2005, at 00:48, Rob Guglielmetti wrote:

Greg Ward wrote:

Historically, there haven't been any problems with NFS or the lock
manager under OS X (or any BSD-derivative). It's only Linux that's
been finicky on that count. Arguably, we should really replace the
NFS lock manager with something more reliable, and we've gone back
and forth on how to do that, but I keep hoping the problem will go
away.

It won't go away.

Theoretically (but not very realistically), file locking within
one system family (e.g. unixoids) might eventually become based on
a reliable standard that is implemented correctly everywhere.
But all bets are off once we mix system families. At the moment
the relevant candidates are unix and Windows systems.

In this situation, I think the final conclusion must be that only
lockfiles will give us reliable and portable protection.
And since Windows systems *will* become our most popular front-end
as soon as we support them, such a lock file solution is a
necessary part of porting Radiance.
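To make the lock file idea concrete, here is a minimal Python sketch of the usual pattern: create the lock file with an exclusive-create open, and treat "file already exists" as "someone else holds the lock". This is an illustration only, not Radiance code; the function names, timeout and poll interval are assumptions. Note that exclusive create has its own history over NFS: the Linux open(2) man page says O_EXCL is only supported on NFS with NFSv3 or later, and describes a link(2)-based alternative for older setups.

```python
import os
import time


def acquire_lock(lock_path, timeout=10.0, poll=0.05):
    """Try to create lock_path exclusively; return True on success.

    Returns False if the lock is still held by someone else after
    'timeout' seconds. Stale locks left by crashed processes are not handled.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
            continue
        # Record the owner's process id to help with manual cleanup.
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write("%d\n" % os.getpid())
        return True


def release_lock(lock_path):
    """Release the lock by removing the lock file."""
    os.remove(lock_path)
```

A writer would call `acquire_lock()` before appending to the shared file and `release_lock()` afterwards. Every acquire and release is at least one extra file operation across the network, which is where the overhead comes from.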

-schorsch

···

--
Georg Mischler -- simulations developer -- schorsch at schorsch com
+schorsch.com+ -- lighting design tools -- http://www.schorsch.com/

From: Georg Mischler <[email protected]>
Date: September 22, 2005 2:51:30 AM PDT

In this situation, I think the final conclusion must be that only
lockfiles will give us reliable and portable protection.
And since Windows systems *will* become our most popular front-end
as soon as we support them, such a lock file solution is a
necessary part of porting Radiance.

Well, I guess I'm overly optimistic. I keep hoping Windows will go away, too...

The problem with implementing our own lock manager using lock files is the overhead involved in setting the lock. I'm currently obtaining a lock every time I write out to the ambient file, and I'm worried that the time it takes to do our own locking if multiple cross-network file creations and checks are involved will totally kill the performance. If that's the case, we would be better off not sharing the ambient file at all, which is the current workaround.

The more sophisticated solution is to somehow measure the performance of our lock manager as we use it, and optimize our ambient value buffer size accordingly. This requires a lot more work on the programming side, and I figure it would take about a week plus another week of testing to come up with something workable. As usual, there are no funds for this, so it sits on the back burner (which is switched off) and grows mold.
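The arithmetic behind "optimize our ambient value buffer size accordingly" can be sketched as follows. If taking the lock costs t_lock seconds and computing one ambient value costs t_value seconds, then flushing every n values keeps locking below a fraction f of the run time when n >= t_lock / (f * t_value). The Python below is a hypothetical illustration of that rule, not the Radiance implementation; the 5% overhead target, the clamping limits and the smoothing factor are all assumptions.

```python
import math


def flush_interval(lock_seconds, seconds_per_value,
                   max_overhead=0.05, min_values=1, max_values=4096):
    """Number of ambient values to buffer between locked writes.

    Picks the smallest n with lock_seconds / (n * seconds_per_value)
    <= max_overhead, clamped to [min_values, max_values].
    """
    if seconds_per_value <= 0 or max_overhead <= 0:
        raise ValueError("seconds_per_value and max_overhead must be positive")
    if lock_seconds <= 0:
        return min_values
    n = math.ceil(lock_seconds / (max_overhead * seconds_per_value))
    return max(min_values, min(max_values, n))


def smooth(previous, measured, alpha=0.2):
    """Exponential moving average for the measured lock cost."""
    return (1.0 - alpha) * previous + alpha * measured
```

A renderer would time each lock acquisition, feed the measurement through `smooth()`, and recompute `flush_interval()` before filling the next buffer. A larger buffer means fewer locks, but also means other processes see new ambient values later.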

-Greg

Hi Thomas,

I hadn't realized that Mark Stock had finalized his new benchmark. It's really nice.

I gave it a try today, and also found my G5 running 10.4 to come up short in the tests. I have a 1.8 GHz G5 PowerMac, which performed just slightly better than a 1.6 GHz P4-M running Linux and compiled with gcc 3.3.2 -O3. (My copy was compiled with gcc 4.0.0 -O2.) What's even more surprising is that my 1.5 GHz G4 laptop is about 13% faster than my G5, when by the clock it should be at least 17% slower! I suspect there's something really wrong with gcc 4.0 on the G5, and since it's just come out, it hasn't been fixed. Compiling with -ffast is not usually a good idea, as that's always broken things in the past, but you could try it just to see what happens. (I wouldn't trust it on any paying jobs, though.) I might try -ffast-math tomorrow if I get time.
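The clock arithmetic above can be checked in a few lines of Python. The sketch assumes that "13% faster" means the G4 gets through 1.13 times as much work per second as the G5; if it meant 13% less wall-clock time, the numbers shift slightly. It also treats clock rate as the only expected difference between the two chips, which is a simplification.

```python
def expected_slowdown(slow_ghz, fast_ghz):
    """Fractional slowdown expected from the clock rates alone."""
    return 1.0 - slow_ghz / fast_ghz


def per_clock_ratio(g5_ghz, g4_ghz, g4_speedup):
    """Work per clock cycle on the G5 relative to the G4.

    g4_speedup is the measured fraction by which the G4 is faster,
    read here as a throughput ratio of (1 + g4_speedup).
    """
    return g4_ghz / (g5_ghz * (1.0 + g4_speedup))


if __name__ == "__main__":
    print("expected G4 slowdown: %.1f%%" % (100 * expected_slowdown(1.5, 1.8)))
    print("G5 work per clock vs G4: %.2f" % per_clock_ratio(1.8, 1.5, 0.13))
```

Under those assumptions the G5 build is doing only about three quarters of the work per clock cycle that the G4 build manages, which is the size of the gap a compiler problem would have to explain.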

-Greg
