Dear Radiance folks,
finally, I managed to get my MacBook that had been waiting since early January, and can share my first experiences with Radiance on this platform.
Some of this is generic, but it may be helpful for others who have not built Radiance from source before.
To get decent support for the new processor, I am relying on Apple’s clang rather than gcc here, so I first installed the Developer Tools with their Command Line Tools. To get X11 and OpenGL support, I installed macports, and within macports the packages xorg, libGLU, and mesa (I hope I did not miss a package here). I then added the following line to my ~/.zprofile:
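The line sets CPATH so the compiler searches the SDK headers. The exact wording of the line is an assumption here, reconstructed from the explanation below; `xcrun --show-sdk-path` prints the active SDK root on your system:

```shell
# Assumed form of the ~/.zprofile addition: point the compiler at the
# SDK headers instead of the (empty on macOS) /usr/include tree.
export CPATH="$(xcrun --show-sdk-path)/usr/include"
```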
This tells the compiler to look for include files not in the (unix-default) /usr tree, but in the software development kit (basically, this allows building for different environments if more than one SDK is installed).
I recommend logging out and back in at this point, to ensure that xorg is functional and CPATH is set. Now the compiler environment should be set up properly.
I keep self-compiled software (that is not updated by some package manager) in my home directory, e.g. sources in ~/src and binaries in ~/opt. I download the Radiance head release and the library files to ~/src/radiance and uncompress them there, giving me a source tree ~/src/radiance/ray. I also get the latest libtiff sources from osgeo.org (I took 4.3.0) and uncompress them to ~/src/tiff-4.3.0. I delete ~/src/radiance/ray/src/px/tiff and replace it with a symlink to my libtiff (in ~/src/radiance/ray/src/px I type ln -s ~/src/tiff-4.3.0 tiff).
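The directory shuffle above, as shell commands (the paths are just my convention; adjust to taste):

```shell
# Create the source and install trees.
mkdir -p ~/src ~/opt
# ... download and unpack the Radiance HEAD release into ~/src/radiance
#     and libtiff 4.3.0 into ~/src/tiff-4.3.0 first ...
# Replace the bundled libtiff with a symlink to the newer sources:
rm -rf ~/src/radiance/ray/src/px/tiff
ln -s ~/src/tiff-4.3.0 ~/src/radiance/ray/src/px/tiff
```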
Now, in the Radiance source tree ~/src/radiance/ray, I first type ./makeall clean to make sure that no build artefacts are left. I then start the build process by typing ./makeall install, go through the license and agree to it. When asked for the editor of my choice, I type nano (just a habit). The destinations for binaries and libraries / auxiliary files are ~/opt/radiance/bin and ~/opt/radiance/lib respectively. When the makeall script displays the build commands and asks whether I want to modify them, I enter “y” and adjust them:
Second line: I add “-j 4” after “make”. This tells make to start 4 processes in parallel, accelerating the building of the binaries.
Third line: I replace “-O2” by “-Ofast”.
Fourth line should read: “MACH=-DBSD -DNOSTEREO -Dfreebsd -mmacosx-version-min=11.2 -I/opt/local/include -L/opt/local/lib” \
I am not sure why I need to specify the minimum version of macos here, and this line may have to be adjusted. My guess is that it is to support linking against the macports binaries.
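Put together, the edited block looks roughly like this. Only the MACH line is verbatim from my session; the surrounding structure is a reconstruction from the three edits above and may differ in detail from what makeall shows you:

```shell
# Sketch of the edited makeall compile settings (reconstruction;
# only the MACH line is exact):
make -j 4 \
OPT=-Ofast \
MACH="-DBSD -DNOSTEREO -Dfreebsd -mmacosx-version-min=11.2 -I/opt/local/include -L/opt/local/lib" \
install
```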
I save by pressing Ctrl-O and exit with Ctrl-X. Now the binaries will be built (you will see the effect of the -j 4 option here; this goes really fast), and you should get a final “Done.” before the script ends.
Now, I add the following two lines to ~/.zprofile:
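These two lines put the new binaries on the search path and point Radiance at its library files. Their exact wording is an assumption here, based on the install destinations chosen above (RAYPATH is the environment variable Radiance uses to locate its auxiliary files):

```shell
# Assumed form of the two ~/.zprofile additions, based on the install
# destinations ~/opt/radiance/bin and ~/opt/radiance/lib:
export PATH="$HOME/opt/radiance/bin:$PATH"
export RAYPATH=".:$HOME/opt/radiance/lib"
```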
I also comment out the line added by macports to set the DISPLAY variable here, since in my experience it causes trouble and is not necessary.
To test the performance of the new machine, I get Mark Stock’s “bench4” benchmark from GitHub (markstock/Radiance-Benchmark4, a well-used benchmark scene for the Radiance pseudo-radiosity renderer). Knowing that the current M1 has 4 “performance” and 4 “efficiency” cores, I am doing 3 tests: one with one process (1 proc, “make”), one keeping all performance cores busy (4 proc, “NCPU=4 make smp”), and one also including the efficiency cores (8 proc, “NCPU=8 make smp”). The results are:
1 proc: 503 sec (best so far: Ryzen 9 3950X with 593 sec)
4 proc: 143 sec (best so far: Ryzen 9 3950X with 186 sec)
8 proc: 118 sec (best so far: Ryzen 9 3950X with 114 sec)
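For reference, the three runs as commands, exactly as described above (executed from inside the unpacked benchmark directory):

```shell
# Single process:
make
# All four performance cores:
NCPU=4 make smp
# All eight cores, including the efficiency cores:
NCPU=8 make smp
```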
If you compare this to the existing entries on the benchmark web pages, the little laptop appears to achieve by far the highest performance per “performance” core - and is competitive even as an 8-core machine, despite the lower performance of the “efficiency” cores. Fun fact: the fan was not even running during the benchmarks.
The M1-based devices, despite their very low energy demand, give astonishing performance based on their “performance” cores. These cores outperform most (all?) available x64 cores, even those in workstations with significantly higher power demand. As can be expected, the “efficiency” cores cannot compete with this, but even when all 8 cores (including the weaker “efficiency” cores) are used, the overall result is similar to a recent 8-core x64 system.
Since the currently available systems are very compact, I would personally not use them as computing nodes - but for those of us who do their everyday work on a laptop on battery or on a quiet desktop, and want enough power to run a simulation from time to time, the platform offers great performance. Since this is a rather new CPU, I expect better optimization in compilers in the near future. It is also important to note that this is the first generation of the platform, with a clear focus on energy demand, while configurations tuned for computational performance are yet to come.
I hope this is somehow helpful and not too much of an advertising report. I am also curious to see the next generations of Arm64 systems to come. Samsung seems to be working on something now, and testing the Arm64 systems offered by the big cloud providers might be worth a try for those in need of scalability.