setting buffer size on octree reads (setvbuf)

Hi Greg,

regarding the long time it takes to read octrees (several with sizes > 50 MB):

I've tested with setvbuf() calls after the fopen() in readoct and readobj, using a buffer size of 2 MB (the system default is 4 kB).

Well, the number of reads went down, but the reading time remained essentially unchanged: 2.47 min, down from 2.48 min.
It seems truly CPU (Xeon) limited, at least in this case, when reading from local disk (bandwidth > 20 MB/s).
Nevertheless, using a larger buffer than the default 4 kB may make sense in other scenarios, e.g. on NFS reads where the mount rsize is larger than 4k, with only a minor chance of portability problems ("The setbuf() and setvbuf() functions conform to C89 and C99", says the man page). Ah, well, setbuf is already in use for setting up the ambient file anyway. IMHO it might be a good thing to add to readoct, unless rendering is done on systems where an increased buffer takes away a significant part of valuable memory.

-Peter

PS: Tested on a fresh 'head' - cc'd to the dev list in case anyone stumbles across this; comments appreciated.

···

--
pab-opto, Freiburg, Germany, http://www.pab-opto.de
[see web page to check digital email signature]

Hi Peter,

I'm not surprised to hear that readoct is CPU-bound, since it does a fair amount of processing on the buffered input. Generally, you only reach the system performance limits when you are doing little more than reading data into memory, in which case memory mapping the file makes a lot of sense (for Unix systems that support it, anyway). Since readoct has to unpack the data and allocate octree nodes and leaves and so on, I'm not sure how best to speed up the process.

-Greg

···

From: Peter Apian-Bennewitz <[email protected]>
Date: April 28, 2008 11:39:52 AM PDT


[Reappearing from under a tub]

Since readoct has to unpack the data and allocate octree nodes and
leaves and so on, I'm not sure how best to speed up the process.

Perhaps the data doesn't need to be packed any more? There's hugely more memory available in modern systems than in the ones you were working with when you developed that file format. Though I suppose the best thing to do is the by-the-book "instrument, then code" approach.

Randolph Fritz
   design machine group
   architecture department
   university of washington
[email protected]

Hi Randolph,

There's no sensible way to read in an octree without doing some allocation along the way. The file format is designed more for compactness and portability than for speed. Peter A-B and I found out (after instrumenting the code a bit) that the slow loads in his case are from the modifier usage more than anything else. Peter will explain. (Peter?)

-Greg

···

From: R Fritz <[email protected]>
Date: April 29, 2008 1:30:06 PM PDT
