On one of my first Oracle consulting assignments I had to come up with the sizing estimates for tables for a data warehouse for a telecommunications company in California. At that time, (1994) there were few products that provided sizing algorithms for Oracle tables and indexes. In fact, all that was available were the somewhat cryptic sizing formula provided by Oracle (discontinued from the manuals in version 9). Not being one to be easily dissuaded, I programmed the calculations into a sizing spreadsheet and I have been diligently maintaining that spreadsheet with each release starting with version 7.
I get requests form my spreadsheets about half a dozen times a month as people stumble upon old posts about them on various Oracle lists and forums. Having a minute or two to spare the other day, I pulled out the Oracle9i version of the sizing spreadsheets and plugged in some values for some known tables in Oracle10g. Much to my chagrin the sizes were way off, so I decided to do some verification work and see what could be done to bring these up to snuff.
First I needed to establish the way sizing changes as the Oracle block size changes. I created a table using Benchmark Factory with about 600K rows and an average row size of 126 bytes. I then created 2, 4, 8 and 16K blocksize versions of the table, analyzed them and then compared their sizes:
|
Table
|
rows
|
Blocks
|
Block Size (b)
|
Gross Size (m)
|
Avg Row Size
|
Raw Size (m)
|
Empty Blocks
|
Net Size (m)
|
|
LINEITEM_16K
|
600572
|
5378
|
16384
|
84.03
|
126
|
72.17
|
254
|
80.06
|
|
LINEITEM_8K
|
600572
|
10901
|
8192
|
85.16
|
126
|
72.17
|
363
|
82.33
|
|
LINEITEM_4K
|
600572
|
22424
|
4096
|
87.59
|
126
|
72.17
|
104
|
87.19
|
|
LINEITEM_2K
|
600572
|
47326
|
2048
|
92.43
|
126
|
72.17
|
1826
|
88.87
|
|
LINEITEM_16K
|
6001215
|
53641
|
16384
|
838.14
|
127
|
726.85
|
119
|
836.28
|
|
LINEITEM_8K
|
6001215
|
108017
|
8192
|
843.88
|
127
|
726.85
|
527
|
839.77
|
|
LINEITEM_4K
|
6001215
|
223836
|
4096
|
874.36
|
127
|
726.85
|
1444
|
868.72
|
|
LINEITEM_2K
|
6001215
|
473069
|
2048
|
923.96
|
127
|
726.85
|
2067
|
919.93
|
Table 1: Table sizes verses block sizes from statistics
You can see in below in Figure 1, there is a greater than 10 percent (8.8 megabytes) difference between the 2K table and an identically created 16K tables. Now some of this is no doubt due to the difference in available free size in the blocks but even accounting for that and allowing for the documented header components I was still way off in my spreadsheet.
Essentially it appears that a undocumented component has been added into the header for a table block that is dependent on the blocksize and probable the number of rows. I fudged with the calculation a bit and discovered that adding a constant of 10 for 2 and 16K blocksizes and 110 for 4 and 8K blocksizes seemed to give the best results. Here are the results from my spreadsheet and the actual measured table size data:
|
Net Size (m)
|
Avg Hdr Space
|
Calculated Freespace
|
Cal Blocks
|
Act. Blocks
|
Cal Size
|
Avg Freespace
|
Toad (m)
|
|
80.06
|
1696
|
14605.6
|
5181.031
|
5124
|
80.95
|
14688
|
85.3
|
|
82.33
|
875
|
7232.8
|
10462.35
|
10538
|
81.74
|
7317
|
85.3
|
|
87.19
|
466
|
3546.4
|
21337.71
|
22320
|
83.35
|
3630
|
85.3
|
|
88.87
|
252
|
1703.2
|
44429.35
|
45500
|
86.78
|
1796
|
85.3
|
|
836.28
|
1695
|
14564.6
|
52329.23
|
53522
|
817.64
|
14689
|
852.45
|
|
839.77
|
883
|
7191.8
|
105975.46
|
107490
|
827.93
|
7309
|
852.45
|
|
868.72
|
465
|
3505.4
|
217422.92
|
222392
|
849.31
|
3631
|
852.45
|
|
919.93
|
253
|
1662.2
|
458521.42
|
471002
|
895.55
|
1795
|
852.45
|
Table 2: Calculated verses Actual Sizes
Using a spreadsheet that utilizes a varying value based on empirical tests against actual tables the following results where generated:
|
600572 Rows
|
|
|
|
6001215 Rows
|
|
|
Block Size
|
Cal Blocks
|
Cal Size
|
Dif
|
Block Size
|
Cal Blocks
|
Cal Size
|
Dif
|
|
16384
|
5178.00
|
80.91
|
1.01
|
16384
|
51735.00
|
808.36
|
0.97
|
|
8192
|
10537.00
|
82.32
|
1.00
|
8192
|
105285.00
|
822.54
|
0.98
|
|
4096
|
22244.00
|
86.89
|
1.00
|
4096
|
222268.00
|
868.23
|
1.00
|
|
2048
|
46198.00
|
90.23
|
1.02
|
2048
|
461632.00
|
901.63
|
0.98
|
Table 3: Variable Formula spreadsheet results
For those that think graphically rather than tabular, here is a plot of the two, actual and calculated table sizes:


Figure 1: Calculated Verses Actual Table Sizes
As you can see, I got pretty close with my approximation, as we used to say in the Navy, “close enough for government work anyway.” As a point of comparison, starting with a zero row table in the Toad Estimate Table Size utility you get a value of 85.3 meg with an estimated row size of 126 bytes and 600572 rows and a value of 852.45 meg with a row size of 127 bytes and 6001215 rows using the LINEITEM base table as a model. Unfortunately there is no way to vary the blocksize in the Toad estimation utility so I couldn’t see the affect of varying the blocksize on its calculation.
For all three models used in testing table sizing (a constant formula, a varying formula and Toad’s proprietary formula) the difference is less than 3 percent across the different blocksizes for both a 600K and a 6 million row table.