Slow spawning of rtrace -n, mkillum -n with many light sources

Dear list,

I'm running some rtrace -n xx calculations, and noticed that there is
only one thread for 30 to 45 minutes, before the -n xx kicks in. My
scene contains some 6,000 artificial light sources.

The question I have is this: Is there something within rtrace/mkillum
that is not multi-threaded that is run before the actual ray tracing
part starts (which does honour the -n option)? I could think of some
light source visibility or intensity tests that need to be done before
the actual ray tracing.

Many thanks for your thoughts

Best regards

Axel

Hi Axel,

There is quite a bit of initialization code, the goal of which is to get as much common data into shared memory as possible before calling fork(). This reduces the memory footprint of your processes, as well as avoiding redundant work that wouldn't make it go faster, anyway.

Included in this preamble are initializing the photon maps (if any), loading the octree, marking light sources (including virtual sources), and preloading the ambient cache (if there is one). It also preloads all object data, including instanced octrees, meshes, pictures used in patterns, and so on.

Even with 6,000 light sources, marking light sources shouldn't take all that long, *unless* you have "mirror" or "prism" surfaces in your scene. These will create virtual light sources, multiplying the number of sources potentially by many times. (Mirror surfaces that face each other are the worst case.) The virtual light source preamble can take quite some time in such cases, as it tries to eliminate virtual source paths that would never pass light due to obstructions, etc.

If you don't have any mirror or prism surfaces, then I'm not sure why it would be taking so long.

Cheers,
-Greg
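
The load-first, fork-later pattern described in the message above can be sketched in a few lines. This is only an illustration of the general Unix technique, not Radiance's code: `preload_scene`, `render_in_children` and the dictionary standing in for the scene are all hypothetical names, and the example assumes a Unix-like system where `os.fork()` is available.

```python
import os

def preload_scene():
    """Stand-in for the serial start-up work (hypothetical data only)."""
    # A real renderer would load the octree, sources, meshes, etc. here.
    return {"sources": list(range(6000))}

def render_in_children(scene, nproc):
    """Fork nproc workers after loading; return each worker's exit code."""
    pids = []
    for rank in range(nproc):
        pid = os.fork()
        if pid == 0:
            # Child: uses the data inherited from the parent (typically
            # shared copy-on-write), so nothing is loaded a second time.
            share = scene["sources"][rank::nproc]
            os._exit(0 if share else 1)
        pids.append(pid)
    # Parent: wait for every worker and collect its exit status.
    return [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]

if __name__ == "__main__":
    scene = preload_scene()               # serial part, one process
    print(render_in_children(scene, 4))   # parallel part, four processes
```

Everything in `preload_scene` runs before the first fork, which is why only a single process is visible during start-up.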

···

From: Axel Jacobs <[email protected]>
Date: February 20, 2017 4:01:33 AM PST

Dear list,

I'm running some rtrace -n xx calculations, and noticed that there is
only one thread for 30 to 45 minutes, before the -n xx kicks in. My
scene contains some 6,000 artificial light sources.

The question I have is this: Is there something within rtrace/mkillum
that is not multi-threaded that is run before the actual ray tracing
part starts (which does honour the -n option)? I could think of some
light source visibility or intensity tests that need to be done before
the actual ray tracing.

Many thanks for your thoughts

Best regards

Axel

Thanks for your answer, Greg.

There are no mirror or prism surfaces in this scene. Just glass and
plastic. So this is weird, then.

I did notice this behaviour on previous projects where we had
thousands of light sources, but didn't look into it back then.

Cheers

Axel

···

On 20 February 2017 at 17:18, Greg Ward <[email protected]> wrote:

Hi Axel,

There is quite a bit of initialization code, the goal of which is to get as much common data into shared memory as possible before calling fork(). This reduces the memory footprint of your processes, as well as avoiding redundant work that wouldn't make it go faster, anyway.

Included in this preamble are initializing the photon maps (if any), loading the octree, marking light sources (including virtual sources), and preloading the ambient cache (if there is one). It also preloads all object data, including instanced octrees, meshes, pictures used in patterns, and so on.

Even with 6,000 light sources, marking light sources shouldn't take all that long, *unless* you have "mirror" or "prism" surfaces in your scene. These will create virtual light sources, multiplying the number of sources potentially by many times. (Mirror surfaces that face each other are the worst case.) The virtual light source preamble can take quite some time in such cases, as it tries to eliminate virtual source paths that would never pass light due to obstructions, etc.

If you don't have any mirror or prism surfaces, then I'm not sure why it would be taking so long.

Cheers,
-Greg

_______________________________________________
Radiance-general mailing list
[email protected]
http://www.radiance-online.org/mailman/listinfo/radiance-general

Oh, I just remembered -- there's also the shadow cache, which gets initialized in marksources() as well. This also ends up tracing some number of rays, about 400 per source, to look for near-source obstructions before the main calculation begins. So, that could be part of your slow-down. I guess 6,000 light sources would be about 2.5 million rays to trace, though I have trouble seeing how that would take 45 minutes even on one process. It should only take a couple of minutes on a modern processor with enough memory to hold the scene.

You could try recompiling with -DSHADCACHE=0, but I don't think you would want to, as the shadow cache is the thing that really saves you with so many light sources. I'd only do it to determine if that's what slows down the start-up. (Although it would be an interesting test of the shadow cache under extreme conditions.)

I don't know how to sample running processes on your system, but it would be interesting to find out where rtrace is spending all its time during start-up....

Cheers,
-Greg

P.S. More info on shadow cache in 2004 workshop program: http://www.radiance-online.org/community/workshops/2004-fribourg/Ward_talk.pdf
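
The estimate in the message above is easy to check with a little arithmetic. The sketch below uses only the figures quoted in the thread (6,000 sources, roughly 400 rays per source, 45 minutes observed); the 2-minute figure is an assumed reading of "a couple of minutes", and the function names are made up for illustration.

```python
def precheck_rays(n_sources, rays_per_source=400):
    """Approximate number of rays traced before the main calculation."""
    return n_sources * rays_per_source

def implied_rate(n_rays, minutes):
    """Rays per second needed to finish n_rays in the given time."""
    return n_rays / (minutes * 60.0)

rays = precheck_rays(6000)
print(rays)                           # 2400000, i.e. "about 2.5 million"
print(round(implied_rate(rays, 45)))  # rate implied by a 45-minute start-up
print(round(implied_rate(rays, 2)))   # rate implied by a 2-minute start-up
```

The gap between the two rates (a factor of more than 20) is what makes the 45-minute start-up look suspicious if only 400 rays per source are being traced.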

···

From: Axel Jacobs <[email protected]>
Date: February 20, 2017 9:28:36 AM PST

Thanks for your answer, Greg.

There are no mirror or prism surfaces in this scene. Just glass and
plastic. So this is weird, then.

I did notice this behaviour on previous projects where we had
thousands of light sources, but didn't look into it back then.

Cheers

Axel

A good trick from my friend and colleague Francesco Anselmo is to use glow with a specific distance, so that only lights close enough get cached. Use it with caution, as it is a simplified calculation, but it is faster.
G.
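
For readers unfamiliar with the trick above: the Radiance "glow" material takes four real arguments, red, green, blue and a maximum radius, and with a positive radius the surface is only tested as a direct light source for points within that distance. The helper below is a hypothetical convenience function that merely formats such a primitive; the material name, radiance values and radius are arbitrary examples, not recommendations.

```python
def glow_primitive(name, rgb, maxrad, modifier="void"):
    """Format a Radiance glow material with the given maximum radius."""
    r, g, b = rgb
    return (
        f"{modifier} glow {name}\n"
        "0\n"                                 # no string arguments
        "0\n"                                 # no integer arguments
        f"4 {r:g} {g:g} {b:g} {maxrad:g}\n"   # r g b maxrad
    )

# Example values only: a glow treated as a source within 2.5 scene units.
print(glow_primitive("lamp_glow", (100, 100, 100), 2.5))
```

Beyond the chosen radius the surface contributes only through the indirect calculation, which is the simplification the message warns about.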

···

On 20 Feb 2017, at 18:00, Greg Ward <[email protected]> wrote:

Oh, I just remembered -- there's also the shadow cache, which gets initialized in marksources() as well. This also ends up tracing some number of rays, about 400 per source, to look for near-source obstructions before the main calculation begins. So, that could be part of your slow-down. I guess 6,000 light sources would be about 2.5 million rays to trace, though I have trouble seeing how that would take 45 minutes even on one process. It should only take a couple of minutes on a modern processor with enough memory to hold the scene.

You could try recompiling with -DSHADCACHE=0, but I don't think you would want to, as the shadow cache is the thing that really saves you with so many light sources. I'd only do it to determine if that's what slows down the start-up. (Although it would be an interesting test of the shadow cache under extreme conditions.)

I don't know how to sample running processes on your system, but it would be interesting to find out where rtrace is spending all its time during start-up....

Cheers,
-Greg

P.S. More info on shadow cache in 2004 workshop program: http://www.radiance-online.org/community/workshops/2004-fribourg/Ward_talk.pdf

Hi Axel,

Thanks for sending me your model off-list. Indeed, it is the source obstructor cache initialization that is taking up most of the start-up time. It's tracing about 8.5 million rays, which would normally be done in about 10 minutes, but seems to be taking longer due to the preponderance of triangle mesh geometry in your model.

The Radiance triangle mesh (.rtm) format was designed to handle complex geometry in the smallest possible memory footprint. To achieve this, I had to compromise a little on execution time. In addition to the ray vector transformations that are also required by the "instance" primitive, tracing rays into a triangle mesh is maybe twice as expensive as rendering straight geometry. If simple geometry is stored in a mesh, then you are paying this price unnecessarily.

To reduce start-up time, I would recommend putting the geometry into octree instances rather than triangle meshes if it's not too complicated, or just use xform if you aren't making multiple copies of the geometry throughout your model. Having a single instance is wasteful of resources and costs overhead during rendering.

Another thing that will speed up the initialization is to use a frozen octree. Your 7000 light sources mean 7000 calls to the shell to run xform, and there's no reason to do that except the first time, during octree creation.

Cheers,
-Greg

P.S. More information on mesh primitive from 2003 workshop: http://www.radiance-online.org/community/workshops/2003-berkeley/presentations/Ward/Tutorial1.html
Download the PowerPoint tutorial and skip to Section II around slide 42.
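
To get a feel for the cost of an unfrozen octree mentioned above, one can count the command lines (those starting with "!") in a scene description, since each one is run again whenever the scene files are re-read. The sketch below does this on a generated example; `luminaire.rad` and the transform values are hypothetical. A frozen octree, which is produced by the `-f` option of oconv, stores the scene data itself, so these commands only run at octree creation.

```python
def count_commands(scene_text):
    """Count scene lines that start with '!' (commands run on loading)."""
    return sum(
        1 for line in scene_text.splitlines()
        if line.lstrip().startswith("!")
    )

# Hypothetical scene: one xform call per luminaire, 7000 luminaires.
example_scene = "\n".join(
    f"!xform -t {i} 0 3 luminaire.rad" for i in range(7000)
)
print(count_commands(example_scene))
```

Running the same count on a real scene file gives a quick estimate of how many shell calls freezing the octree would avoid at each start-up.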

···

From: Giulio Antonutto <[email protected]>
Date: February 20, 2017 11:49:45 AM PST

A good trick from my friend and colleague Francesco Anselmo is to use glow with a specific distance, so that only lights close enough get cached. Use it with caution, as it is a simplified calculation, but it is faster.
G.

I went ahead and added a limit on the source obstruction code, so it won't precheck more than 200 light sources. This is an optional optimization, anyway, so I figure it's better not to have this strange problem of taking forever to start up when you have a lot of sources.

Cheers,
-Greg
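
The effect of the limit described above can be illustrated with the rough per-source figure from earlier in the thread. This is not the Radiance implementation: the 400 rays per source is only the estimate quoted before, the thread does not say how the 200 prechecked sources are selected, and `rays_with_limit` is a made-up name.

```python
MAX_PRECHECKED = 200    # limit mentioned in the message above
RAYS_PER_SOURCE = 400   # rough estimate quoted earlier in the thread

def rays_with_limit(n_sources, limit=None):
    """Approximate precheck rays, optionally capping the sources checked."""
    checked = n_sources if limit is None else min(n_sources, limit)
    return checked * RAYS_PER_SOURCE

print(rays_with_limit(6000))                  # no limit
print(rays_with_limit(6000, MAX_PRECHECKED))  # with the 200-source limit
```

Under these assumptions the start-up ray count no longer grows with the number of sources once the scene has more than 200 of them.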

···

From: Greg Ward <[email protected]>
Date: February 21, 2017 1:33:18 PM PST

Hi Axel,

Thanks for sending me your model off-list. Indeed, it is the source obstructor cache initialization that is taking up most of the start-up time. It's tracing about 8.5 million rays, which would normally be done in about 10 minutes, but seems to be taking longer due to the preponderance of triangle mesh geometry in your model.

The Radiance triangle mesh (.rtm) format was designed to handle complex geometry in the smallest possible memory footprint. To achieve this, I had to compromise a little on execution time. In addition to the ray vector transformations that are also required by the "instance" primitive, tracing rays into a triangle mesh is maybe twice as expensive as rendering straight geometry. If simple geometry is stored in a mesh, then you are paying this price unnecessarily.

To reduce start-up time, I would recommend putting the geometry into octree instances rather than triangle meshes if it's not too complicated, or just use xform if you aren't making multiple copies of the geometry throughout your model. Having a single instance is wasteful of resources and costs overhead during rendering.

Another thing that will speed up the initialization is to use a frozen octree. Your 7000 light sources mean 7000 calls to the shell to run xform, and there's no reason to do that except the first time, during octree creation.

Cheers,
-Greg

P.S. More information on mesh primitive from 2003 workshop: http://www.radiance-online.org/community/workshops/2003-berkeley/presentations/Ward/Tutorial1.html
Download the PowerPoint tutorial and skip to Section II around slide 42.

Hi Greg,

thank you so much for solving this riddle. I have to admit I have
been blissfully unaware of the drawbacks of using RTMs. We use them as
much as possible now, since I have come to appreciate the fast compile
times and small octree sizes this can give us. However, there have
been quite a few projects where RTMs would take hours to compile,
generating octrees many GB in size. In such instances, we convert the
OBJs into polygons, but without the normals. Occasionally, we also
need to go via obj2rad, simply because obj2mesh fails to generate the
RTMs.

Looks as if we need to take a fresh look at our workflow.

Thank you for adding the limit on the source obstruction code. I'll
try this out in the next few days, and re-run the project

Thanks again

Axel

Hi Axel,

I'd be interested in looking at OBJ files where obj2mesh fails.

Cheers,
-Greg

···

From: Axel Jacobs <[email protected]>
Date: February 22, 2017 1:55:48 AM PST

Hi Greg,

thank you so much for solving this riddle. I have to admit I have
been blissfully unaware of the drawbacks of using RTMs. We use them as
much as possible now, since I have come to appreciate the fast compile
times and small octree sizes this can give us. However, there have
been quite a few projects where RTMs would take hours to compile,
generating octrees many GB in size. In such instances, we convert the
OBJs into polygons, but without the normals. Occasionally, we also
need to go via obj2rad, simply because obj2mesh fails to generate the
RTMs.

Looks as if we need to take a fresh look at our workflow.

Thank you for adding the limit on the source obstruction code. I'll
try this out in the next few days, and re-run the project.

Thanks again

Axel

On 20 February 2017 at 18:00, Greg Ward <[email protected]> wrote:

Oh, I just remembered -- there's also the shadow cache, which gets initialized in marksources() as well. This also ends up tracing some number of rays, about 400 per source, to look for near-source obstructions before the main calculation begins. So, that could be part of your slow-down. I guess 6,000 light sources would be about 2.5 million rays to trace, though I have trouble seeing how that would take 45 minutes even on one process. It should only take a couple of minutes on a modern processor with enough memory to hold the scene.

You could try recompiling with -DSHADCACHE=0, but I don't think you would want to, as the shadow cache is the thing that really saves you with so many light sources. I'd only do it to determine if that's what slows down the start-up. (Although it would be an interesting test of the shadow cache under extreme conditions.)

I don't know how to sample running processes on your system, but it would be interesting to find out where rtrace is spending all its time during start-up....

Cheers,
-Greg

P.S. More info on shadow cache in 2004 workshop program: http://www.radiance-online.org/community/workshops/2004-fribourg/Ward_talk.pdf

From: Axel Jacobs <[email protected]>
Date: February 20, 2017 9:28:36 AM PST

Thanks for your answer, Greg.

There are no mirror or prism surfaces in this scene. Just glass and
plastic. So this is weird, then.

I did notice this behaviour on previous projects where we had
thousands of light sources, but didn't look into it back then.

Cheers

Axel

On 20 February 2017 at 17:18, Greg Ward <[email protected]> wrote:

Hi Axel,

There is quite a bit of initialization code, the goal of which is to get as much common data into shared memory as possible before calling fork(). This reduces the memory footprint of your processes, as well as avoiding redundant work that wouldn't make it go faster, anyway.

Included in this preamble are initializing the photon maps (if any), loading the octree, marking light sources (including virtual sources), and preloading the ambient cache (if one). It also preloads all object data, including instanced octrees, meshes, pictures used in patterns, and so on.

Even with 6,000 light soruce, marking light sources shouldn't take all that long, *unless* you have "mirror" or "prism" surfaces in your scene. These will create virtual light sources, multiplying the number of sources potentially by many times. (Mirror surfaces that face each other are the worst case.) The virtual light source preamble can take quite some time in such cases, as it tries to eliminate virtual source paths that would never pass light due to obstructions, etc.

If you don't have any mirror or prism surfaces, then I'm not sure why it would be taking so long.

Cheers,
-Greg

From: Axel Jacobs <[email protected]>
Date: February 20, 2017 4:01:33 AM PST

Dear list,

I'm running some rtrace -n xx calculations, and noticed that there is
only one thread for 30 to 45 minutes, before the -n xx kicks in. My
scene contains some 6,000 artificial light sources.

The question I have is this: Is there something within rtrace/mkillum
that is not multi-threaded that is run before the actual ray tracing
part starts (which does honour the -n option)? I could think of some
light source visibility or intensity test that need to be done before
the actual ray tracing.

Many thanks for your thoughts

Best regards

Axel

_______________________________________________
Radiance-general mailing list
[email protected]
http://www.radiance-online.org/mailman/listinfo/radiance-general

Hi Greg,

I'll ask around and find you some projects. Give me a few days, and
I'll get back to you.

Cheers

Axel

···

On 22 February 2017 at 16:41, Greg Ward <[email protected]> wrote:

Hi Axel,

I'd be interested in looking at OBJ files where obj2mesh fails.

Cheers,
-Greg

From: Axel Jacobs <[email protected]>
Date: February 22, 2017 1:55:48 AM PST

Hi Greg,

thank you so much for solving this riddle. I have to admit I have
been blissfully unaware of the drawbacks of using RTMs. We use them as
much as possible now, since I have come to appreciate the fast compile
times and small octree sizes this can give us. However, there have
been quite a few projects where RTMs would take hours to compile,
generating octrees many GB in size. In such instances, we convert the
OBJs into polygons, but without the normals. Occasionally, we also
need to go via obj2rad, simply because obj2mesh fails to generate the
RTMs.

Looks as if we need to take a fresh look at our workflow.

Thank you for adding the limit on the source obstruction code. I'll
try this out in the next few days, and re-run the project

Thanks again

Axel

On 20 February 2017 at 18:00, Greg Ward <[email protected]> wrote:

Oh, I just remembered -- there's also the shadow cache, which gets initialized in marksources() as well. This also ends up tracing some number of rays, about 400 per source, to look for near-source obstructions before the main calculation begins. So, that could be part of your slow-down. I guess 6,000 light sources would be about 2.5 million rays to trace, though I have trouble seeing how that would take 45 minutes even on one process. It should only take a couple of minutes on a modern processor with enough memory to hold the scene.
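The estimate above can be checked with a little shell arithmetic. This is only a sketch: the 400 rays per source is the approximate figure quoted above, and the tracing rate of 20,000 rays per second is purely an assumed number for illustration, not a measured one. The function names are made up for this example.

```shell
# Number of shadow-cache test rays traced before the processes are forked.
# The default of 400 rays per source is the approximate figure quoted above.
preamble_rays() {
    sources=$1
    rays_per_source=${2:-400}
    echo $((sources * rays_per_source))
}

# Whole seconds needed to trace that many rays at a given single-process
# rate (rays per second). The rate is an assumption supplied by the caller.
preamble_seconds() {
    rays=$1
    rate=$2
    echo $((rays / rate))
}

preamble_rays 6000                # prints 2400000
preamble_seconds 2400000 20000    # prints 120
```

At the assumed rate the preamble comes out at about two minutes. Turned around, a 45-minute (2,700 s) start-up would mean fewer than 900 rays per second, which is why the shadow cache alone seems an unlikely explanation.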

You could try recompiling with -DSHADCACHE=0, but I don't think you would want to, as the shadow cache is the thing that really saves you with so many light sources. I'd only do it to determine if that's what slows down the start-up. (Although it would be an interesting test of the shadow cache under extreme conditions.)

I don't know how to sample running processes on your system, but it would be interesting to find out where rtrace is spending all its time during start-up....
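On a Linux system, one way to see where a running rtrace spends its start-up time is to attach a debugger a few times and print the call stack. The sketch below assumes that gdb is installed, that you are allowed to attach to the process, and that rtrace was built with symbols. sample_stacks is a hypothetical helper name, not part of Radiance.

```shell
# Print a few stack samples of a running process, one every few seconds.
# Arguments: process ID, number of samples (default 5), delay in seconds (default 2).
sample_stacks() {
    pid=$1
    samples=${2:-5}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Attach, print a backtrace of the current thread, then detach.
        gdb -p "$pid" -batch -ex "bt" 2>/dev/null
        sleep "$delay"
        i=$((i + 1))
    done
}

# Usage (pgrep -n picks the most recently started matching process):
#   sample_stacks "$(pgrep -n rtrace)" 10 3
```

If the same function names keep turning up near the top of the backtraces while only one process is running, that is most likely where the start-up time goes.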

Cheers,
-Greg

P.S. More info on shadow cache in 2004 workshop program: http://www.radiance-online.org/community/workshops/2004-fribourg/Ward_talk.pdf

From: Axel Jacobs <[email protected]>
Date: February 20, 2017 9:28:36 AM PST

Thanks for your answer, Greg.

There are no mirror or prism surfaces in this scene. Just glass and
plastic. So this is weird, then.

I did notice this behaviour on previous projects where we had
thousands of light sources, but didn't look into it back then.

Cheers

Axel

Hi Greg,

https://giauk.sharefile.com/d-s0cd3657a8b54fa9b

Here is the first scene. This one does actually compile, but takes
forever with RTMs.
Compiling from obj2rad output takes just over a minute and creates an
octree of 136MB. Compiling from RTMs (with some polygon files) takes
200 minutes, resulting in an octree 37GB in size. What we normally do
in such a situation is to either reduce the number of RTMs to one per
material, or to go via obj2rad.

I also noticed on a project I worked on a couple of years ago that the
order in which the RTMs are listed in our master xform file matters.
I was able back then to compile by listing the largest object
(terrain) first. Trying to replicate this behaviour today did not
get the intended result. The octree would compile either way.

So let me hunt down some more examples for you.

Let me know if you need the OBJs for your testing.

Cheers

Axel

Hi Axel,

I thought you said that the actual obj2mesh call was taking forever, not the oconv operation afterwards. If it's oconv, this is due to having many overlapping volumes, which oconv attempts to resolve. If you didn't have so many RTMs (instanced octrees are also an issue), this would not be a problem.

The rule of thumb for RTMs and instanced octrees is that if you don't have multiple occurrences of something, don't use either. And if you do, then still avoid them unless you can package together at least 10,000 surfaces or so per octree/RTM. Finally, it's best if your octree or RTM fills a more-or-less cubic volume, or is well separated from other instanced volumes. Overlapping volumes cause issues for oconv.
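A quick way to apply the 10,000-surface guideline before running obj2mesh is to count the faces in each OBJ file. The sketch below assumes that every face is written on a line starting with "f " and treats the face count as a rough stand-in for the surface count. Both function names are made up for this example.

```shell
# Count the faces in a Wavefront OBJ file (lines beginning with "f ").
obj_face_count() {
    grep -c '^f ' "$1"
}

# Succeed if the file holds at least min_faces faces (default 10000),
# i.e. if it is large enough to be worth packaging as its own RTM.
worth_meshing() {
    min_faces=${2:-10000}
    [ "$(obj_face_count "$1")" -ge "$min_faces" ]
}

# Usage:
#   if worth_meshing terrain.obj; then echo "RTM candidate"; else echo "use obj2rad"; fi
```

This only covers the size criterion. Whether the object occurs more than once, and whether its volume overlaps other instanced volumes, still has to be judged from the scene itself.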

If I understood correctly the first time, and it is obj2mesh that is taking a long time, let me know which files to check out.

Cheers,
-Greg

Hi Greg,

On 01/03/17 18:34, Greg Ward wrote:

> Hi Axel,
>
> I thought you said that the actual obj2mesh call was taking forever,
> not the oconv operation afterwards. If it's oconv, this is due to having
> many overlapping volumes, which oconv attempts to resolve. If you didn't
> have so many RTMs (instanced octrees are also an issue), this would not
> be a problem.

Sorry about this misunderstanding. obj2mesh is always very fast, but might fail. What can take so long is the oconv compilation.

> The rule of thumb for RTMs and instanced octrees is that if you don't
> have multiple occurrences of something, don't use either. And if you do,
> then still avoid them unless you can package together at least 10,000
> surfaces or so per octree/RTM. Finally, it's best if your octree or RTM
> fills a more-or-less cubic volume, or is well separated from other
> instanced volumes. Overlapping volumes cause issues for oconv.

This pretty much explains the issues we've been having occasionally. Thank you so much for clarifying. What I thought your mesh presentations at the workshops over the last few years were telling me is that RTMs are always better due to the small octree size and fast oconv times. Should have paid more attention to the small print, I guess.

> If I understood correctly the first time, and it is obj2mesh that is
> taking a long time, let me know which files to check out.

No, it's oconv. Problem solved, I think. Thank you so much for taking the time to look into this.

Cheers

Axel

No worries, Axel.

Large models are a challenge, no matter what you do. And it *can* be advantageous to use a Radiance triangle mesh for extremely complex geometry, even if it appears just once. (Not true of octree instances, however.) The intended use in such cases might be for a library object, such as a sculpture or highly detailed furnishing, which would be put in a mesh in a library. Some objects like this from the CSAIL database are provided in the Radiance library.

When obj2mesh fails, this can sometimes be remedied by changing the -r and/or -n options, but not always.
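Trying different settings can be automated with a small loop. The sketch below runs obj2mesh with a handful of -r and -n values until one succeeds. The candidate values are arbitrary illustrations, not recommended settings, and mesh_with_retries is a hypothetical helper name.

```shell
# Run obj2mesh with a series of -r and -n settings until one succeeds.
# Arguments: input OBJ file, output RTM file.
mesh_with_retries() {
    obj=$1
    rtm=$2
    for res in 16384 32768 65536; do        # candidate -r values (illustrative)
        for lim in 9 15 30; do              # candidate -n values (illustrative)
            if obj2mesh -r "$res" -n "$lim" "$obj" "$rtm" 2>/dev/null; then
                echo "ok: -r $res -n $lim"
                return 0
            fi
        done
    done
    echo "obj2mesh failed for all settings tried" >&2
    return 1
}

# Usage:
#   mesh_with_retries statue.obj statue.rtm
```

As noted above, this does not always help. If every combination fails, going via obj2rad remains the fallback.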

Cheers,
-Greg

Dear all,

I thought I'd give a quick update on this.

Traditionally, we have been exporting our geometry to individual RTMs,
which is typically one per layer and material. The project where I
first noticed this slow spawning behaviour with many artificial light
sources (around 7,000 in this case) had 40 individual RTMs, but on
some larger projects this can be several hundreds.

For each RTM, there is one rad file with only the mesh primitive in
it. All those rad files are finally called from a master xform file,
as we call it.

Running this project (only -ab 0) took around 40 minutes with this
approach. During almost all of this time, rtrace was only running as
a single thread, despite being called with -n 80.

Greg suggested that this approach is not ideal, and that it could be sped up by:
a) using polygons (obj2rad) instead of RTM geometry (obj2mesh)
b) expanding all calls to xform in the master file (xform -e master.rad > master2.rad)
c) compiling a frozen octree (oconv -f master2.rad > test.oct)
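
As a rough sketch of steps a) to c) above, the command lines could be assembled like this. It only builds and prints the commands, it does not run them. The OBJ file names are hypothetical, and it assumes obj2rad writes its result to standard output, as xform and oconv do in the commands quoted above.

```python
import os
import shlex


def build_commands(obj_files, master="master.rad",
                   expanded="master2.rad", octree="test.oct"):
    """Return shell command lines for steps a) to c) as plain strings."""
    cmds = []
    # a) convert each OBJ file to polygons with obj2rad (instead of obj2mesh)
    for obj in obj_files:
        rad = os.path.splitext(obj)[0] + ".rad"
        cmds.append(f"obj2rad {shlex.quote(obj)} > {shlex.quote(rad)}")
    # b) expand all xform calls in the master file
    cmds.append(f"xform -e {shlex.quote(master)} > {shlex.quote(expanded)}")
    # c) compile a frozen octree from the expanded scene
    cmds.append(f"oconv -f {shlex.quote(expanded)} > {shlex.quote(octree)}")
    return cmds


# Hypothetical layer files, for illustration only.
for line in build_commands(["walls.obj", "floor.obj"]):
    print(line)
```

The master file would presumably also need to point at the new polygon .rad files rather than at the files holding the mesh primitives; that step is not shown here.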

I ran some tests to find out which of the above has the biggest impact
on render times. Expectedly(?), it is the use of polygon geometry
instead of RTMs that makes all the difference here. Rendering time
went down from 40 mins to somewhere between 10 and 20 seconds. It is
also a good idea to expand all xform calls, but once this is done,
using frozen octrees does not make the calculation significantly
faster.

Have a jolly good weekend, everyone

Axel

On 22 February 2017 at 09:55, Axel Jacobs wrote:

Hi Greg,

thank you so much for solving this riddle. I have to admit I have
been blissfully unaware of the drawbacks of using RTMs. We use them as
much as possible now, since I have come to appreciate the fast compile
times and small octree sizes this can give us. However, there have
been quite a few projects where RTMs would take hours to compile,
generating octrees many GB in size. In such instances, we convert the
OBJs into polygons, but without the normals. Occasionally, we also
need to go via obj2rad, simply because obj2mesh fails to generate the
RTMs.

Looks as if we need to take a fresh look at our workflow.

Thank you for adding the limit on the source obstruction code. I'll
try this out in the next few days, and re-run the project

Thanks again

Axel

On 20 February 2017 at 18:00, Greg Ward wrote:

Oh, I just remembered -- there's also the shadow cache, which gets initialized in marksources() as well. This also ends up tracing some number of rays, about 400 per source, to look for near-source obstructions before the main calculation begins. So, that could be part of your slow-down. I guess 6,000 light sources would be about 2.5 million rays to trace, though I have trouble seeing how that would take 45 minutes even on one process. It should only take a couple of minutes on a modern processor with enough memory to hold the scene.
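
The arithmetic behind that estimate can be checked with a few lines. This is only a back-of-envelope sketch using the figures quoted above (about 400 rays per source, 6,000 sources, 45 minutes observed, "a couple of minutes" expected, taken here as 2 minutes); the function names are made up for illustration and nothing here is measured.

```python
RAYS_PER_SOURCE = 400   # "about 400 per source", from the message above
N_SOURCES = 6000        # light sources in Axel's scene


def preamble_rays(n_sources, rays_per_source=RAYS_PER_SOURCE):
    """Total shadow-cache test rays traced before the main calculation."""
    return n_sources * rays_per_source


def implied_rate(n_rays, minutes):
    """Average rays per second needed to finish n_rays in the given minutes."""
    return n_rays / (minutes * 60.0)


total = preamble_rays(N_SOURCES)
print(f"{total:,} rays to trace")                      # 2,400,000 rays to trace
print(f"{implied_rate(total, 45):.0f} rays/s at 45 min")  # 889 rays/s at 45 min
print(f"{implied_rate(total, 2):.0f} rays/s at 2 min")    # 20000 rays/s at 2 min
```

So the exact product is 2.4 million rays, and a 45-minute start-up would mean the preamble averaged under 1,000 rays per second, roughly 20 times slower than the rate implied by a two-minute start-up.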

_______________________________________________
Radiance-general mailing list
[email protected]
http://www.radiance-online.org/mailman/listinfo/radiance-general