Makefile inconsistency

File src/cv/Rmakefile defines the procedures for mgfilt, mgf2inv, 3ds2mgf
and 2 x libmgf.a. The executables need mgflib/libmgf.a but the procedure
to make ../lib/libmgf.a removes mgflib/libmgf.a by moving it. With a
linear build this doesn't matter as the three executables are built
before or after ../lib/libmgf.a but using 32 CPUs (T5120) and a parallel
build (dmake) this inconsistency is exploited.

I've attached a patch which cures the problem but it's weak because it
doesn't tidy the makefiles enough, e.g., it would probably be better to
just "make all" on the sub make file in mgflib.

I've also changed "make" to "$(MAKE)" so as not to hard code a particular
make.
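A minimal sketch of the ordering problem described above, using plain shell rather than make. The three functions are hypothetical stand-ins for the Rmakefile rules (the real rules compile and link; these only create, move and test for a file), and the temporary directory layout is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical stand-ins for the rules in src/cv/Rmakefile.
work=$(mktemp -d)
mkdir -p "$work/mgflib" "$work/lib"

# Stand-in for building mgflib/libmgf.a.
build_lib()   { : > "$work/mgflib/libmgf.a"; }
# Stand-in for the ../lib/libmgf.a rule, which moves the library away.
install_lib() { mv "$work/mgflib/libmgf.a" "$work/lib/libmgf.a"; }
# Stand-in for linking an executable, which needs mgflib/libmgf.a.
link_exe()    { [ -f "$work/mgflib/libmgf.a" ]; }

# Order 1: the executables are linked first, then the library is moved.
build_lib
if link_exe; then order1=ok; else order1=fail; fi
install_lib

# Order 2: the move happens first, so the link step finds nothing.
build_lib
install_lib
if link_exe; then order2=ok; else order2=fail; fi

echo "executables before move: $order1"
echo "move before executables: $order2"
rm -rf "$work"
```

A serial make happens to pick an order that works; a parallel make is free to pick the other one. Copying instead of moving, or declaring the dependency explicitly, would remove the conflict, though that is not necessarily what the attached patch does.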

Version 4.0

James.

src-cv-Rmakefile-patch (1.33 KB)

Hi James,

Thanks for figuring this out. Does dmake cause the build to fail with the current make files, or does this just make things slightly more efficient? I ask because the time to build Radiance even on a single-processor laptop has fallen to insignificance, where I would say that optimizing the build probably isn't important.

Best,
-Greg

···

From: James Lee <[email protected]>
Date: November 28, 2010 3:55:18 AM PST

File src/cv/Rmakefile defines the procedures for mgfilt, mgf2inv, 3ds2mgf
and 2 x libmgf.a. The executables need mgflib/libmgf.a but the procedure
to make ../lib/libmgf.a removes mgflib/libmgf.a by moving it. With a
linear build this doesn't matter as the three executables are built
before or after ../lib/libmgf.a but using 32 CPUs (T5120) and a parallel
build (dmake) this inconsistency is exploited.

I've attached a patch which cures the problem but it's weak because it
doesn't tidy the makefiles enough, eg, it would probably be better to
just "make all" on the sub make file in mgflib.

I've also changed "make" to "$(MAKE)" so as not to hard code a particular
make.

regarding Re: [Radiance-dev] Makefile inconsistency:

> Does dmake cause the build to fail with the current make files,

The bug has always been there, parallel make exploits/reveals it. It's
a bit like QA doesn't cause bugs in software.

> or does this just make things slightly more efficient? I ask because
> the time to build Radiance even on a single-processor laptop has
> fallen to insignificance, where I would say that optimizing the build
> probably isn't important.

There is nothing slight about a 10 fold increase in compile speed. Until
software runs in less time than it takes to press a key it will be
significant. I package a lot of software and by the time I die I will
have spent the equivalent of a whole year waiting for software to compile
but that will pale compared to the decade I will have spent debugging
other people's software.

As I'm building a radiance package I have to rebuild multiple times
because:
+ The package includes multiple architecture isaexec versions of the key
executables and the complete compile is done for each.
+ I first have to determine which arches, flags and compiler to use.
+ I've had to rebuild the whole package many times for the, ahem,
problems in radiance (not relocatable "define DEFMAPFILE
\"/usr/local/lib/ray/lib/arch.map\"", timegm, the ambient bug, version
still wrong, examples broken by hdr renaming, issuing different source
distributions with the same 4R0 name, etc.) Each change I make requires
a traceable patch and build, hence full package rebuild.

James.

···

On 28/11/10, 17:04:10, Gregory "J." Ward <[email protected]> wrote

Wow. Didn't mean to trigger a diatribe with my little question, which you didn't really answer. What I gather from your response is that the build set-up requires you to compile multiple times with dmake to resolve the issue, which is unacceptable in your case because you already have so many compiles to deal with. I get it. I will take your fix suggestion and see what I can do.

I do want to respond to the other issues you mentioned, though:

> As I'm building a radiance package I have to rebuild multiple times
> because:
> + The package includes multiple architecture isaexec versions of the key
> executables and the complete compile is done for each.

I'm not really familiar with Solaris, but on other systems (like OS X) you can compile in the different architectures in one go. Even that is a pain, so I sympathize.

> + I first have to determine which arches, flags and compiler to use.

Yes, well, we don't have all the different machines in the world to practice on, and this is an ongoing issue.

> + I've had to rebuild the whole package many times for the, ahem,
> problems in radiance (not relocatable "define DEFMAPFILE
> \"/usr/local/lib/ray/lib/arch.map\"", timegm, the ambient bug, version
> still wrong, examples broken by hdr renaming, issuing different source
> distributions with the same 4R0 name, etc.) Each change I make requires
> a traceable patch and build, hence full package rebuild.

One at a time:

> not relocatable "define DEFMAPFILE \"/usr/local/lib/ray/lib/arch.map\"

I actually didn't know about this because it's in a program that no one (as far as I know) uses anymore, arch2rad. The build for it should probably be disabled, rather than carrying whatever problems it has into the next release. I will do that.

> timegm

There is a replacement implementation for this GNU extension in src/common/timegm.c, which I assume you found. I shall add this to the COMPAT variable for Solaris.

> the ambient bug

Could you be more specific? Unless this is the recurring problem with the NFS lock manager, which I didn't think an issue under Solaris.

> version still wrong

??

> examples broken by hdr renaming

I am sorry about that. If you can point to specific documents that need fixing, I will fix them. One of the big problems with being the only maintainer of a package that has developed and evolved for 20+ years is that there's a lot of stuff buried in it that I don't think about anymore.

> issuing different source distributions with the same 4R0 name, etc.

Yes, I may be guilty of that. I have at times posted a couple of versions without renaming when I found one or more problems in the distribution. I try to fix these before my official announcement, but the release process is really a bit broken in the sense that there is no way to test out compiles on multiple platforms when we don't have multiple platforms available. We would like to change this in the direction of having a group of designated system testers who compile a prerelease package on specific platforms to look for problems before we make an official release. We have been relying on things being found in the HEAD distribution, but of course there are those who wait each time for the official release, and when problems show up then, we have no choice but to issue a patch. If the patch is minor and system-specific, I have forgone the rev change in the process, which is certainly not ISO 9000.

Thanks for your help and your feedback, James. I understand your job is not an easy one, and not always fun.

Best,
-Greg

regarding Re: [Radiance-dev] Makefile inconsistency:

> which you didn't really answer.

Your question was:

"Does dmake cause the build to fail with the current make files, or
does this just make things slightly more efficient?"

It's 2 questions.

Ans 1. No, dmake does not "cause" it to fail. Parallel build does fail
but the problem is latent. The makefiles ask for a file to be created
and for it to be destroyed which can't be satisfied consistently or at
the same time.

Ans 2. A 10 fold increase is not slight, so no it does not make it
slightly more efficient, it makes it many times more effective.

I expect gmake could fail similarly. I prefer dmake over gmake because
it dynamically sets and varies the number of jobs depending on the system
load. dmake is "Distributed Make" and can spawn jobs on other machines
for greater parallelism. It's really handy for creating many radiance
images, I create makefiles to define the work and let dmake farm it
out. This is less important now that one physical machine houses
several virtual machines which already share all the resources.

> I do want to respond to the other issues you mentioned, though:
>
>> As I'm building a radiance package I have to rebuild multiple times
>> because:
>> + The package includes multiple architecture isaexec versions of the key
>> executables and the complete compile is done for each.
>
> I'm not really familiar with Solaris, but on other systems (like OS X)
> you can compile in the different architectures in one go. Even that is
> a pain, so I sympathize.

I need to compile in multiple architectures. My package contains several
versions of rpict. The package has to run on many different CPUs. The
laziest solution and a usual one is to use the lowest common denominator
arch. For a CPU hog it's worth providing several speed optimised
binaries for different arches and letting isaexec pick the best one for
whichever machine is running the one package. For trivial utilities a
space optimised solution is best so I do another build for those. A
similar thing is done with libraries and the link loader uses the dynamic
token $ISALIST to pick the best libraries.
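The lookup described above can be emulated in a few lines of shell, as a rough sketch only. The real isaexec is a Solaris library routine, not a script; the function name `pick_isa_binary` and the directory layout (one subdirectory per instruction set, each holding a same-named binary) are assumptions made for illustration.

```shell
#!/bin/sh
# pick_isa_binary DIR NAME ISA...
# Prints the first DIR/ISA/NAME that is executable, trying the
# instruction sets in the order given (best first).
pick_isa_binary() {
    dir=$1 name=$2
    shift 2
    for isa in "$@"; do
        if [ -x "$dir/$isa/$name" ]; then
            printf '%s\n' "$dir/$isa/$name"
            return 0
        fi
    done
    # No variant matched any of the listed instruction sets.
    return 1
}
```

On Solaris the list could come from the `isalist` command, for example `pick_isa_binary /opt/radiance/bin rpict $(isalist)`, where the install path is again hypothetical.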

It's my usual technique: don't fight a build system, let it think it's
helping. Build multiple times and pick the needed parts for the package.
It's typically the easiest way to build both 32 and 64 bit libraries.

>> not relocatable "define DEFMAPFILE \"/usr/local/lib/ray/lib/arch.map\"
>
> I actually didn't know about this because it's in a program that no one
> (as far as I know) uses anymore, arch2rad. The build for it should
> probably be disabled, rather than carrying whatever problems it has
> into the next release. I will do that.

I've never used it either but if it's in my package it should work. I
edit the value to the location of arch.map in /opt but perhaps the code
should have used the env var RAYPATH.
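For illustration, a search along RAYPATH might look like the sketch below. This is shell, not the C that arch2rad would actually need, and the function name `find_in_raypath` and the fallback to the current directory when RAYPATH is unset are assumptions.

```shell
#!/bin/sh
# find_in_raypath FILE
# Prints the first DIR/FILE found among the colon-separated
# directories in RAYPATH (falls back to "." if RAYPATH is unset).
find_in_raypath() {
    file=$1
    old_ifs=$IFS
    IFS=:
    # Unquoted on purpose so the value is split at the colons.
    for dir in ${RAYPATH:-.}; do
        if [ -f "$dir/$file" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$file"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

With something like this, `find_in_raypath arch.map` would locate the map file wherever the package was installed, instead of relying on a compiled-in path.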

>> timegm
>
> There is a replacement implementation for this GNU extension in
> src/common/timegm.c, which I assume you found.

No, it's not there:

$ ls src/common/timegm.c
src/common/timegm.c: No such file or directory

$ find . -name \*gm\*
./src/meta/segment.c

...not it...

$ grep -l timegm **/*
src/common/header.c

...and that's it being used...

$ grep timegm **/*
src/common/header.c: *tloc = timegm(&tms);

>> the ambient bug
>
> Could you be more specific? Unless this is the recurring problem
> with the NFS lock manager, which I didn't think an issue under
> Solaris.

rpict: fatal - bad ambient file

It's fixable by using src/rt/ambient.c from the HEAD distribution.
Given it's a known crash bug, pretty please, how about a release 4.1
with a fix?

>> version still wrong
>
> ??

I see you've fixed this in the new 4R0, I was using an older 4R0. The
problems of not issuing 4.0.1!

>> examples broken by hdr renaming
>
> I am sorry about that. If you can point to specific documents that need
> fixing, I will fix them.

I've edited my examples and they look good except I've still a problem
with the cabin:

$ make
oconv -b -100 -100 -100 225 -r 8192 \
pattmats cabin bathroom furniture winpanes.rad mirrors.rad > cabin.oct
oconv -f -r 8192 -i cabin.oct summerday landscape lights.off \
daywindows > summercabin.oct
rvu -vf vf/plan -av .1 .1 .1 summercabin.oct

rvu: fatal - cannot find picture file "pinebark.pic"
*** Error code 1
make: Fatal error: Command failed for target `view'

but I can't find any references to pinebark. I'll keep looking.

> Thanks for your help and your feedback, James. I understand your job is
> not an easy one, and not always fun.

Radiance is always fun. If you want to get me steamed up tell me libtool
is good.

James.

···

On 29/11/10, 16:40:01, Gregory "J." Ward <[email protected]> wrote

Hi James,

Thanks for answering my question(s) -- I have updated the src/cv/Rmakefile for the next release. You can pick it up if you like from HEAD or using the CVS interface at:

  http://www.radiance-online.org/cgi-bin/viewcvs.cgi/ray/src/

A few more responses...

>>> timegm
>>
>> There is a replacement implementation for this GNU extension in
>> src/common/timegm.c, which I assume you found.
>
> No, it's not there:
>
> $ ls src/common/timegm.c
> src/common/timegm.c: No such file or directory
>
> $ find . -name \*gm\*
> ./src/meta/segment.c
>
> ...not it...
>
> $ grep -l timegm **/*
> src/common/header.c
>
> ...and that's it being used...
>
> $ grep timegm **/*
> src/common/header.c: *tloc = timegm(&tms);

Oops. Guess it was added right after the 4.0 release in response to some difficulties with the MINGW build. You can pick it up from CVS if you want it, or from HEAD.

>>> the ambient bug
>>
>> Could you be more specific? Unless this is the recurring problem
>> with the NFS lock manager, which I didn't think an issue under
>> Solaris.
>
> rpict: fatal - bad ambient file
>
> It's fixable by using src/rt/ambient.c from the HEAD distribution.
> Given it's a known crash bug, pretty please, how about a release 4.1
> with a fix?

If making a new release were less work, I would do it more often. As it is, it takes days of my time, and I never get it quite right. I don't really make new releases for bug fixes -- that's why we have HEAD. Since this bug has been there for years and years, and only recently manifested due to some change in FreeBSD, I don't think it affects all that many users. I do apologize for the inconvenience, but the next release probably won't happen until sometime in Spring.

>>> version still wrong
>>
>> ??
>
> I see you've fixed this in the new 4R0, I was using an older 4R0. The
> problems of not issuing 4.0.1!

Yes, sorry about that. As you have pointed out, the release process is a bit haphazard.

>>> examples broken by hdr renaming
>>
>> I am sorry about that. If you can point to specific documents that need
>> fixing, I will fix them.
>
> I've edited my examples and they look good except I've still a problem
> with the cabin:
>
> $ make
> oconv -b -100 -100 -100 225 -r 8192 \
> pattmats cabin bathroom furniture winpanes.rad mirrors.rad > cabin.oct
> oconv -f -r 8192 -i cabin.oct summerday landscape lights.off \
> daywindows > summercabin.oct
> rvu -vf vf/plan -av .1 .1 .1 summercabin.oct
>
> rvu: fatal - cannot find picture file "pinebark.pic"
> *** Error code 1
> make: Fatal error: Command failed for target `view'
>
> but I can't find any references to pinebark. I'll keep looking.

Ah. The file "pinebark.hdr" picture should be in with the supplementary files in the ray/lib directory:

  http://www.radiance-online.org/software/non-cvs/rad4R0supp.tar.gz

Unfortunately, I didn't fix the references in the ray/obj/cabin directory -- sorry about that! I will fix it for the next release.

Thanks again for your help!

-Greg

But it should be possible to do most parts of a new release automatically. My
guess is still that switching to a proper revision control system would make
things easier for you - using branches in cvs is a bit insane, but branches
would make it easy to prepare a new release.

···

On 11/29/2010 09:31 PM, Gregory J. Ward wrote:

>> It's fixable by using src/rt/ambient.c from the HEAD distribution.
>> Given it's a known crash bug, pretty please, how about a release 4.1
>> with a fix?
>
> If making a new release were less work, I would do it more often. As it is, it takes days of my time, and I never get it quite right. I don't really make new releases for bug fixes -- that's why we have HEAD. Since this bug has been there for years and years, and only recently manifested due to some change in FreeBSD, I don't think it affects all that many users. I do apologize for the inconvenience, but the next release probably won't happen until sometime in Spring.

--
Bernd Zeimetz Debian GNU/Linux Developer
http://bzed.de http://www.debian.org
GPG Fingerprint: ECA1 E3F2 8E11 2432 D485 DD95 EB36 171A 6FF9 435F

Hi Bernd,

Most of the hassle with putting together a new release can be resolved with website changes we're planning for next year. Right now, we have depositories in a few different places, automatic updates that sort-of work most of the time, and documentation that is spread all over the place. It's really more a matter of organization and putting others in charge than it is one of CVS and the way that works.

To give you an idea, posting a new release involves:

1) Putting together the latest source and creating a version in CVS (this is the easy part).

2) Updating documentation with new release information (easy to forget things, and I do).

3) Gathering the auxiliary data files together (easy to miss things here as well).

4) Making source-only, overlay, and combined tar balls (simple, but I've still screwed this up in the past).

5) Checking compile and building for the few supported systems (sometimes days of delay and looping back to step 1).

6) Uploading release to website locations and relinking everything, while archiving the old stuff (painful but straightforward enough).

7) Updating man pages on website (usually comes last and often forgotten altogether).

8) Making announcement to user group of new release availability.

9) Depending on feedback, possibly making a patch release to fix build problems encountered by users (sigh).

Very little of this can be automated, which is part of why it doesn't happen very often. Having a HEAD version has been a huge help, since people who want the latest bug fixes and feature adds can get them in real-time, at the expense of doing their own compiles and taking a little risk that their results will not agree with earlier runs. Radiance development benefits greatly from the feedback of these brave users and allows for build fixes along the way, so that patch releases are not as necessary as they used to be.

Even if we could streamline the release process, I still don't think we'd want to do it more than once a year or so. People who want the latest and are willing to deal with any problems along the way can grab the HEAD whenever they like, and those who prefer stable releases and precompiled binaries probably don't want to be updating their systems multiple times a year.

The one thing we will be soliciting help on is step 5 above, building and testing on different systems in advance of an official release. We need to gather volunteers and create some useful test cases for that process, and this is something we plan to do this coming year, given the time and personnel.

Cheers,
-Greg

···

From: Bernd Zeimetz <[email protected]>
Date: November 30, 2010 1:49:08 AM PST

On 11/29/2010 09:31 PM, Gregory J. Ward wrote:

>>> It's fixable by using src/rt/ambient.c from the HEAD distribution.
>>> Given it's a known crash bug, pretty please, how about a release 4.1
>>> with a fix?
>>
>> If making a new release were less work, I would do it more often. As it is, it takes days of my time, and I never get it quite right. I don't really make new releases for bug fixes -- that's why we have HEAD. Since this bug has been there for years and years, and only recently manifested due to some change in FreeBSD, I don't think it affects all that many users. I do apologize for the inconvenience, but the next release probably won't happen until sometime in Spring.
>
> But it should be possible to do most parts of a new release automatically. My
> guess is still that switching to a proper revision control system would make
> things easier for you - using branches in cvs is a bit insane, but branches
> would make it easy to prepare a new release.

Hi Greg!

> Most of the hassle with putting together a new release can be resolved with website changes we're planning for next year. Right now, we have depositories in a few different places, automatic updates that sort-of work most of the time, and documentation that is spread all over the place. It's really more a matter of organization and putting others in charge than it is one of CVS and the way that works.
>
> To give you an idea, posting a new release involves:
>
> 1) Putting together the latest source and creating a version in CVS (this is the easy part).
>
> 2) Updating documentation with new release information (easy to forget things, and I do).

A lot of software projects fix this by writing proper commit messages and then
dumping the commit messages for a release, perhaps just reformatting and fixing them.
Obviously that requires having one useful commit instead of a ton of tiny ones.
These things are easy with distributed RCSs - you just create a topic branch,
hack and then merge it into one useful commit and apply that to your master branch.

> 3) Gathering the auxiliary data files together (easy to miss things here as well).

What about keeping them in a revision control system?

> 4) Making source-only, overlay, and combined tar balls (simple, but I've still screwed this up in the past).

This is something which could be done automatically with make and friends. Once
defined properly, it should be easier to do and harder to forget things.
Also - looking at 3) there are tools like mr which allow you to work with several
independent repositories and merge them.
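As a rough sketch of how step 4 might be scripted, the function below builds the three archives from two directories. The function name, the `src` and `all` file names and the directory arguments are hypothetical; only the `rad4R0supp.tar.gz` naming pattern is taken from this thread.

```shell
#!/bin/sh
# make_tarballs SRC_DIR AUX_DIR OUT_DIR VERSION
# Writes source-only, overlay and combined tarballs into OUT_DIR.
make_tarballs() {
    src=$1 aux=$2 out=$3 ver=$4
    stage=$(mktemp -d) || return 1
    mkdir -p "$out" || return 1

    # Source-only tarball.
    tar -czf "$out/rad${ver}src.tar.gz" -C "$src" . || return 1
    # Overlay tarball holding only the auxiliary data files.
    tar -czf "$out/rad${ver}supp.tar.gz" -C "$aux" . || return 1
    # Combined tarball: stage both trees together, then archive them.
    cp -R "$src/." "$stage/" || return 1
    cp -R "$aux/." "$stage/" || return 1
    tar -czf "$out/rad${ver}all.tar.gz" -C "$stage" . || return 1

    rm -rf "$stage"
}

# Run directly only when all four arguments are supplied.
if [ $# -eq 4 ]; then
    make_tarballs "$@"
fi
```

Because every archive comes from the same two inputs in one pass, it becomes harder for the three tarballs to drift apart, which seems to be the kind of mistake described in step 4.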

> 5) Checking compile and building for the few supported systems (sometimes days of delay and looping back to step 1).

If it were easier to create a new release, you could release a -RC version
and wait for feedback. So finding a fast way to create releases would fix that
point for large parts.

> 6) Uploading release to website locations and relinking everything, while archiving the old stuff (painful but straightforward enough).

That sounds like a task for merging all locations into one so people could go to
one place and retrieve everything from there. Generally something like Redmine or
Trac would be nice to have as a bugtracker and central place to browse repositories.

> 7) Updating man pages on website (usually comes last and often forgotten altogether).

What about doing that with a cronjob which pulls the manpages from
cvs/git/whatever and builds the html version automatically?
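A minimal sketch of such a job, assuming the man pages sit in one directory of a checked-out tree. The function name, the `MAN2HTML` variable and every path in the crontab comment are hypothetical; `groff -mandoc -Thtml` is one converter that could fill the role, and any command that reads a man page on stdin and writes HTML to stdout would do.

```shell
#!/bin/sh
# Converter command; override MAN2HTML to use a different tool.
: "${MAN2HTML:=groff -mandoc -Thtml}"

# build_man_html MAN_DIR HTML_DIR
# Converts every MAN_DIR/*.[1-9] page to HTML_DIR/<page>.html.
build_man_html() {
    man_dir=$1 html_dir=$2
    mkdir -p "$html_dir" || return 1
    for page in "$man_dir"/*.[1-9]; do
        # Skip the unexpanded pattern when no pages are present.
        [ -f "$page" ] || continue
        name=$(basename "$page")
        # MAN2HTML is left unquoted so its options are split into words.
        $MAN2HTML < "$page" > "$html_dir/$name.html" || return 1
    done
}

# Example crontab entry (all paths hypothetical), run nightly at 03:15:
# 15 3 * * * cd /path/to/checkout && cvs -q update -d && sh /path/to/build_man_html.sh
```

The script only defines the function, so a real cron wrapper would also need to call `build_man_html` with the source and web directories.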

> 8) Making announcement to user group of new release availability.
>
> 9) Depending on feedback, possibly making a patch release to fix build problems encountered by users (sigh).

Same issue here - if making a release were easy and automated, this step
would be easy and fast to do, too.

> Very little of this can be automated, which is part of why it doesn't happen very often. Having a HEAD version has been a huge help, since people who want the latest bug fixes and feature adds can get them in real-time, at the expense of doing their own compiles and taking a little risk that their results will not agree with earlier runs. Radiance development benefits greatly from the feedback of these brave users and allows for build fixes along the way, so that patch releases are not as necessary as they used to be.

I think there are a lot of things which could be automated. I know projects
which are run by one developer mainly and they release several times a year -
which is pretty easy as the only thing you need to do is to change the version
number and type make dist. There are a lot of platform-independent tools which
help with such tasks these days.

Let me know if I can help you with these things.

Cheers,

Bernd

···

>> 5) Checking compile and building for the few supported systems
>> (sometimes days of delay and looping back to step 1).
>
> If it would be easier to create a new release, you could release a -RC
> version and wait for feedback. So finding a fast way to create releases
> would fix that point for large parts.

This is one of the major bugbears I have with Radiance releases. I've
never seen a single software package which has only HEAD or major
releases available because it just doesn't work as we've seen here.
Releasing betas and/or release candidates from the HEAD would allow
people to test using a known package & would make people a lot more
likely to build & test before a release. Also the bake time for betas
and/or release candidates give you time to fix known issues & the
pressure to make sure everything works for release should be eased by
this. The down side is that you'd either need to create a release branch
for each release which from what I've heard isn't easy in CVS (I'm used
to Perforce where everything is easy) or lock HEAD until a release is
ready.

If you're dead set against releasing patches/minor updates such as 4.0.1
etc then betas/release candidates are the only workable solution that I
can see.

Palbinder Sandher
Software Deployment & IT Administrator
T: +44 (0) 141 945 8500
F: +44 (0) 141 945 8501

http://www.iesve.com