To determine what degree of effect discontiguous extents (fragmented segments) have on full table scan performance. To do so, I established a fresh, unused 1G tablespace and created in it a 500M table in different configurations. This table was populated with 5,810,041 rows.
SPARCStation 5 170 MHz HyperSPARC, 128M memory 512 swap, Solaris 2.7, Oracle 7.3.4 EE with no other significant processes running. Disk is Seagate Barracuda 8G 7200rpm with Oracle software on it and the entire database on one disk. This configuration was chosen because it is
The database is a pretty normal one, created for testing but with a new catalogue. Block size is 2048. For the painfully curious, these are the full params.
7,490 extents of 70k each with each extent positioned 70k away from any other extent by interleafing a second dummy table with alternating extent allocation. This is maximal fragmentation.
7,490 extents of 70k each with each extent positioned random(0 .. 6635) blocks away from its logical neighbour. This was produced by alternately allocating an extent for the test table and allocating an extent of some random size for the filler table, then deallocating the filler table extents and coalescing the tablespace. Repeat 4790 times.
The distribution acheived is about that of a badly managed table, where about half of the extents are contiguous, and the other half are widely distributed. A breakdown of the extent offsets is available here.
7,490 extents of 70k with each extent adjacent to its logically adjacent extent.
Single extent of 524300k.
In all configurations the pctfree nd all other parameters were left to default.
In each case, SQL*Loader was used to load directly from a perl script which generated rows like this:
"$n++","aaaaaaaaaaaaaaaaaaaaaa","12042001","01012001","12","34","Why am I doing this?"
where col 1 is a sequence number, 2 is a varchar2, 3 and 4 are dates, 5 and 6 are numbers, and 7 is a varchar2.
I then changed the setting of db_file_multiblock_read_count initially to 35 (because that would be a whole extent read and therefore maximal performance on calls to pread) then to the more conventional settings of 32, 16, and deafult 8.
After each change of db_file_multiblock_read_count, the database was restarted and then the test began.
Each test consisted of six executions of the following query [1]:
The first result was discarded and the remaining five executions were timed using the Time::HiRes perl module to get good millisecond accuracy. All six tests were performed with no activity in between, and within the same session.
select count(*) from multi_frag
The results
| mbrc |
Equi- Fragmented |
Random- Fragmented |
Contiguous | Single-extent | |
|---|---|---|---|---|---|
| 50 | Max | 113.0560150 | 145.8112080 | ||
| Avg | 111.9193334 | 145.6063335 | |||
| Min | 111.0810420 | 145.4509680 | |||
| 48 | Max | 114.9446550 | 147.5079680 | ||
| Avg | 113.6529900 | 146.6654857 | |||
| Min | 112.8780010 | 145.7271010 | |||
| 35 | Max | 118.6068630 | 113.6115420 | 145.9530180 | |
| Avg | 117.0498126 | 112.1446948 | 145.5263648 | ||
| Min | 115.5066920 | 109.3606230 | 145.0670800 | ||
| 32 | Max | 130.7510110 | 116.4618670 | 145.3982130 | 145.1209010 |
| Avg | 120.0842376 | 115.8266760 | 105.2245420 | 141.3603106 | |
| Min | 115.9592350 | 114.7451980 | 88.4458040 | 136.7923820 | |
| 24 | 116.3248760 | 135.5023560 | |||
| 116.1398850 | 134.7885750 | ||||
| 115.8248950 | 134.0072550 | ||||
| 16 | Max | 328.7927530 | 116.4866980 | 87.5720500 | 88.8285000 |
| Avg | 239.4676784 | 114.8729186 | 86.7090202 | 87.4721010 | |
| Min | 118.1331340 | 113.8363470 | 85.6402580 | 86.9118490 | |
| 8 | Max | 343.6603230 | 117.6407090 | 88.0483660 | 90.3987690 |
| Avg | 247.4077822 | 116.7668960 | 87.6826790 | 89.3174622 | |
| Min | 159.6123270 | 115.7156940 | 87.2905530 | 86.0655490 | |
During the initial discarded run of each bank of test, I ran truss against the dedicated server. In each of the fragmented and contiguous tests, a number of pread calls were followed by a semctl call presumably to another bg process (pmon?). This semctl call coincides with the beginning of a new extent being read. In the case of Single-extent, no calls were made other than preads and a single write (pipe return?).
I won't make any detailed comment on the coniguous vs. single-extent results here. I will though point out that the 'single-extent is the true path' brigade have something to think about when at 7,490 contiguous extents beats single-extent in every execution except one.
The fascinating part to me is the relatively good performance of the outrageously fragmented configuration. At best it is faster than contiguous. But when mbrc is lowered, fragmented falls off terribly while contiguous gets consistently better to the apex (16 blocks = 32k, the best read for my setup).
In due course I will run more trials with a better analysis of the underlying system calls, and update the experiment to 8.1.7 as a server comes available.
If you are mad enough or bored enough to want to replicate this test, I have written a couple of fairly straight-forward tools to make it easier, particularly loading 5.8 million rows of fake data and creating fragmented tables (I'm not sure how much use that is). I may upload them here, or I may not. If you want them though, merely email and ask for them.
In future, all these technical disucssions are to be moved to SQL.org.uk where there are free discussion lists as well.