new rtrace binaries give different results, depending on use of -af

Dear devs,

we recently had to re-run a large project that we initially worked on
in Sep 2017. We calculated daylight factors in rooms that are rather
highly obstructed. I believe the assessment back in Sep 2017 was
carried out with HEAD-20160926. The recent re-run was probably with
HEAD-20171201.

For our daylight factor simulations, we use mkillum with -ab 3.
Apparently, this wasn't quite good enough in this case due to the
highly obstructed nature of the site. -ab should have been increased
back then, but unfortunately wasn't. We also use an external ambient
file.
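To illustrate, a typical invocation of this kind might look as follows
(the file names here are placeholders, not from our actual scripts):

```shell
# Hypothetical daylight-factor setup; scene.oct, windows.rad and
# scene.amb are placeholder names, not from the actual project.
oconv materials.rad geometry.rad > scene.oct

# Compute illum distributions for the window surfaces with three
# ambient bounces, sharing an external ambient cache file via -af:
mkillum -ab 3 -af scene.amb scene.oct < windows.rad > illums.rad
```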

Either way, we are not able to replicate the Sep 2017 results with the
new binaries and the ambient file enabled. The new results (new
binaries, with -af) are up to 40% lower than the old ones.

With the old binaries without an ambient file, we get the same results
that we get with the new binaries (with and without an ambient file).

In other words: with the old binaries, the results differ depending on
whether the ambient file is enabled or not.
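For context, the percentages above are the relative drop of the new
results against the old ones, computed along these lines (the numbers
below are made-up illustrations, not our project data):

```python
def pct_reduction(old, new):
    """Average percentage reduction of new results relative to old."""
    assert len(old) == len(new) and len(old) > 0
    return 100.0 * sum((o - n) / o for o, n in zip(old, new)) / len(old)

# Illustrative daylight-factor values (percent), not real measurements:
old_df = [2.0, 1.5, 0.8]
new_df = [1.2, 0.9, 0.48]
print(round(pct_reduction(old_df, new_df), 1))  # 40.0
```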

In the Release Notes
(https://radiance-online.org/cgi-bin/viewcvs.cgi/ray/doc/notes/ReleaseNotes?view=markup)
on line 2086, I noticed this entry under 'Compatibility Change' for
version 5.1, which was released 8/17/2017:

"Enabled ambient cache value corral for all levels, not just final
two. This may reduce errors in certain pathological scenes."

This sounds like an explanation of what we are struggling with. Would
someone (Greg?) be able to offer some extra info on what 'ambient
cache value corral' is (I did google this, but could not find anyting
that seems relevant), and also what constitutes 'pathological scenes'.

Am I right in saying that the new, lower results are more accurate?

Thank you so much.

Cheers

Axel

Hi Axel,

The change you mention didn't happen between the two HEAD snapshots you listed. These are the changes in that range:

Revision 2.77 (Fri Apr 21 16:07:29 2017 UTC, greg; tagged rad5R1; +9 -5 lines):
Fixed issue where ambient super-samples were being left off deep ray trees

Revision 2.76 (Thu Jan 26 16:46:58 2017 UTC, greg; +3 -3 lines):
Fixed bug in scenes with zero octree size

Revision 2.75 (Sat Oct 15 14:54:39 2016 UTC, greg; +4 -4 lines):
Increased minimum sampling spacing slightly -- rejection still less than 1%

Revision 2.74 (Fri Oct 14 19:15:34 2016 UTC, greg; +10 -6 lines):
Tweaked sample collision test to use 1/10th of ambient division size

Revision 2.73 (Fri Oct 14 00:54:21 2016 UTC, greg; +40 -9 lines):
Fixed regression in genBSDF affecting Klems normalization

Of these, revision 2.73 corrected a bias in the calculation that could conceivably affect the results one way or another.

I am puzzled why using or not using the ambient file makes a difference. I assume you are not re-using an old ambient file for your new runs. How many processes are you running (mkillum -n setting)?

Can you check your calculation by disabling caching altogether by setting -aa 0 in mkillum? This might offer an indication of what's going on. I would be happy to try this as well if you want to send me your model.
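A sketch of such a check, with placeholder file names (the exact -ad
value is only a guess and would depend on the scene):

```shell
# Ambient cache disabled entirely: -aa 0 turns off irradiance caching
# and interpolation, so -af is dropped since no cache file is read or
# written.  -ad is raised to compensate for the loss of cached values.
# scene.oct and windows.rad are placeholder names.
mkillum -ab 3 -aa 0 -ad 4096 -n 80 scene.oct < windows.rad > illums.rad
```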

Cheers,
-Greg

···

From: Axel Jacobs <[email protected]>
Date: April 3, 2018 9:16:42 AM PDT


Hi Greg,

Thanks for your thoughts. We did clean out the old ambient files. In
our ADF script, mkillum is run with -n 80 (our servers have between 64
and 88 virtual processors).

With -aa 0, the results are on average 32% lower, compared to the 16%
reduction I mentioned before.

I was re-running different options over the last few days to get to
the bottom of this. Re-running an identical copy of the folder with
the same settings that allowed me to reproduce the results from Sep
(with the old binaries) does now give the same 16% reduction as all
other combinations of options. I'm rather puzzled. It seems as if
using the old binaries with otherwise identical settings does not
reliably give me the old results.

I'm hoping that this is not dependent on the server load, which would
be nearly impossible to track down. In the latest runs, loading the
server to roughly 100% gives the same results (16% reduction) as
loading it to 400%, i.e. running the same assessment in four
identical directories.

I'll run a few more tests next week, and will report back.

Have a lovely weekend

Axel

···

On 4 April 2018 at 00:51, Gregory J. Ward <[email protected]> wrote:

_______________________________________________
Radiance-dev mailing list
[email protected]
https://www.radiance-online.org/mailman/listinfo/radiance-dev

Hi Axel,

If the latest binaries are giving you consistent results (you might try decreasing the -lw setting when you use -aa 0 to see if you can get those results to line up), then I would say the problem is likely the change I mentioned earlier (ambcomp.c rev 2.73). The new Hessian code has to avoid coincident (or nearly coincident) ambient samples, and the code I originally wrote for this introduced some serious bias to the calculation that would show up for some scenes. Yours is probably a good test case. The newer code is much more robust and doesn't have this bias in it, so if it is giving you consistent results, then you're probably OK.
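One way to line those cache-free results up is to sweep -lw downward
and see when the numbers stabilize (placeholder file names again; the
specific cutoff values are only a guess):

```shell
# With -aa 0, progressively lower the ray weight cutoff -lw so that
# more ambient rays survive to the full -ab depth:
for lw in 1e-3 1e-4 1e-5; do
    mkillum -ab 3 -aa 0 -lw $lw scene.oct < windows.rad > illums_$lw.rad
done
```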

The thing I worry about is if you get different results with the current code depending on whether you use the ambient file. If your other settings are high enough, this should only affect run-time, not accuracy.

Cheers,
-Greg

···

From: Axel Jacobs <[email protected]>
Date: April 6, 2018 8:24:20 AM PDT
