I often see DBAs and developers who don’t know the first thing about tuning SQL. They ask, “Why do I need to know that? Can’t Oracle tune itself?” The truth is that Oracle can, for the most part, do a pretty decent job of tuning itself if it is given enough information and the queries or tuning tasks aren’t too complex. However, there are times when Johnny must tune.
What do I mean by “enough information”? Oracle uses a cost-based optimizer (CBO) in most releases (releases prior to 10g have both the rule-based and the cost-based optimizer). The CBO uses what it knows about the size and data distribution of database objects to determine the cost of accessing those objects through various paths. Other databases use this type of statistics-based optimization as well. In releases prior to 10g these statistics are mostly gathered manually, and their “freshness” determines how good Oracle’s access path decisions are for SQL statements. If the statistics are invalid, stale, or missing, the optimizer will make bad choices. In addition, Oracle assumes that data is evenly distributed across the key values. Where data is skewed, with more rows clustered around specific keys than an even spread would give, the optimizer will make bad choices unless histograms (which show the optimizer how data is clustered around the keys) are provided along with the object statistics. Also, from 9i onward CPU costing can come into play.
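To make the skew problem concrete, here is a minimal Python sketch. It is not how Oracle computes cardinality internally; it only contrasts an even-distribution guess (total rows divided by distinct keys) with the exact per-key counts that a frequency-style histogram would record. The STATUS column and its row counts are invented for illustration.

```python
from collections import Counter

def uniform_estimate(total_rows: int, distinct_keys: int) -> float:
    """Rows expected per key if data were spread evenly (no histogram)."""
    return total_rows / distinct_keys

def frequency_histogram(values):
    """Exact per-key row counts, roughly what a frequency histogram records."""
    return Counter(values)

# Hypothetical STATUS column: heavily skewed toward 'CLOSED'.
status = ["CLOSED"] * 9900 + ["OPEN"] * 90 + ["HOLD"] * 10

hist = frequency_histogram(status)
flat = uniform_estimate(len(status), len(hist))

# Compare the even-distribution guess with the real counts for each key.
for key in ("CLOSED", "OPEN", "HOLD"):
    print(f"{key:<7} uniform guess={flat:8.1f}  actual={hist[key]:5d}")
```

With these made-up numbers the even-distribution guess is the same for every key, so it is far too high for 'HOLD' and far too low for 'CLOSED'. That is the kind of misestimate that can push an optimizer toward a full scan where an index would do, or the reverse.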
CPU costing involves just a few basic statistics, such as CPU speed, the IO timing for single-block reads, and the IO timing for multiblock reads. Using these speed-related statistics, Oracle can determine the cost of a CPU-based path versus that of an IO-based path and then choose the cheaper of the two when determining SQL access paths. The statistics for CPU costing are gathered by calling the GATHER_SYSTEM_STATS procedure in the DBMS_STATS package.
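As a rough sketch of how those three statistics can be combined, the Python below is loosely modeled on the cost formula published in Oracle's Performance Tuning Guide: IO time plus CPU time, expressed in units of one single-block read. It is a simplification, not Oracle's actual code. The function name, the unit choices (milliseconds and cycles per millisecond), and every number are my own assumptions.

```python
def path_cost(single_reads, multi_reads, cpu_cycles,
              sreadtim_ms, mreadtim_ms, cpu_cycles_per_ms):
    """Estimated elapsed time of a path, in single-block-read units."""
    io_ms = single_reads * sreadtim_ms + multi_reads * mreadtim_ms
    cpu_ms = cpu_cycles / cpu_cycles_per_ms
    return (io_ms + cpu_ms) / sreadtim_ms

# Hypothetical system statistics (not measured on any real machine).
stats = dict(sreadtim_ms=5.0, mreadtim_ms=10.0, cpu_cycles_per_ms=1_000_000)

# Hypothetical full scan: few large reads, lots of CPU spent filtering rows.
full_scan = path_cost(single_reads=0, multi_reads=625,
                      cpu_cycles=2_000_000_000, **stats)

# Hypothetical index path: many single-block reads, very little CPU.
index_path = path_cost(single_reads=3_000, multi_reads=0,
                       cpu_cycles=50_000_000, **stats)

print(f"full scan cost : {full_scan:.1f}")
print(f"index path cost: {index_path:.1f}")
print("cheaper path   :", "full scan" if full_scan < index_path else "index")
```

The point of the sketch is that the winner depends on the measured timings: change the assumed single-block or multiblock read time and the cheaper path can flip.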
So it should be clear that up-to-date statistics are vital to getting proper optimization of SQL access paths out of the Oracle optimizer. The other half of the optimization puzzle is the complexity of the SQL being optimized. The number of paths that the optimizer will consider when looking for the optimal access path for a given SQL statement is limited by the OPTIMIZER_MAX_PERMUTATIONS parameter. In versions prior to Oracle9i the OPTIMIZER_MAX_PERMUTATIONS parameter was set at 80,000. In 9i it was reduced to 2000, and in 10g it was made into an undocumented setting and remained at 2000. But what exactly is this parameter?
The number of possible paths to reach a set of destinations is the factorial (n!) of the number of destinations. In Oracle the number of ways to process a given SQL statement is roughly the factorial of the number of tables in the statement. Things like the number of indexes, the types of indexes and other factors may push the number of possible paths above this factorial, but you get the picture. A factorial is the product of all integers from 1 to n, where n is the number of objects. So for a 3-table join the factorial is 1*2*3, which is 6, and for a 4-table join it is 1*2*3*4, or 24. This doesn’t seem so bad. However, at 6 tables you reach 720 and at 7 you get 5040 possible paths, just considering the tables in the join and not their indexes. Generally speaking, Oracle will consider the join order of the tables starting from the left side of the FROM clause and working its way right. What this left-to-right ordering means is that with a join of 6 or more tables you had better place the most important tables first (on the left side of the FROM clause). Can Oracle do that for you? Not really. It has no clue (generally speaking) about the relative importance of the tables to your query.
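The arithmetic is easy to check with a few lines of Python. This sketch counts join orders as a plain factorial of the table count and compares the result with the 2000 limit discussed in this article. It ignores indexes and access methods, and it makes no claim about how Oracle actually decides when to stop searching.

```python
from math import factorial

PERMUTATION_CAP = 2000  # the permutation limit discussed in this article

def join_orders(table_count: int) -> int:
    """Number of possible join orders for the given number of tables."""
    return factorial(table_count)

def first_join_size_over_cap(cap: int = PERMUTATION_CAP) -> int:
    """Smallest table count whose join orders exceed the cap."""
    n = 1
    while join_orders(n) <= cap:
        n += 1
    return n

for n in range(2, 9):
    orders = join_orders(n)
    status = "within cap" if orders <= PERMUTATION_CAP else "exceeds cap"
    print(f"{n} tables: {orders:>6,} join orders ({status})")

print("First join size over the cap:", first_join_size_over_cap())
```

Under these assumptions a 6-table join (720 orders) still fits under the limit, while a 7-table join (5040 orders) does not, which is why the order of the tables starts to matter at about that size.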
Another area that has an immense effect on the optimizer is the setting of various documented and undocumented initialization parameters. If you want to see what the optimizer considers when optimizing a SQL statement, use an Oracle 10053 trace. In my 10gR2 database, the following is the list of documented and undocumented parameters considered by the optimizer when making path choices for a simple two-table join:
Optimizer environment:
optimizer_mode_hinted = false
optimizer_features_hinted = 0.0.0
parallel_execution_enabled = true
parallel_query_forced_dop = 0
parallel_dml_forced_dop = 0
parallel_ddl_forced_degree = 0
parallel_ddl_forced_instances = 0
_query_rewrite_fudge = 90
optimizer_features_enable = 10.2.0.1
_optimizer_search_limit = 5
cpu_count = 2
active_instance_count = 1
parallel_threads_per_cpu = 2
hash_area_size = 131072
bitmap_merge_area_size = 1048576
sort_area_size = 65536
sort_area_retained_size = 0
_sort_elimination_cost_ratio = 0
_optimizer_block_size = 8192
_sort_multiblock_read_count = 2
_hash_multiblock_io_count = 0
_db_file_optimizer_read_count = 16
_optimizer_max_permutations = 2000
pga_aggregate_target = 198656 KB
_pga_max_size = 204800 KB
_query_rewrite_maxdisjunct = 257
_smm_auto_min_io_size = 56 KB
_smm_auto_max_io_size = 248 KB
_smm_min_size = 198 KB
_smm_max_size = 39731 KB
_smm_px_max_size = 99328 KB
_cpu_to_io = 0
_optimizer_undo_cost_change = 10.2.0.1
parallel_query_mode = enabled
parallel_dml_mode = disabled
parallel_ddl_mode = enabled
optimizer_mode = all_rows
sqlstat_enabled = false
_optimizer_percent_parallel = 101
_always_anti_join = choose
_always_semi_join = choose
_optimizer_mode_force = true
_partition_view_enabled = true
_always_star_transformation = false
_query_rewrite_or_error = false
_hash_join_enabled = true
cursor_sharing = exact
_b_tree_bitmap_plans = true
star_transformation_enabled = false
_optimizer_cost_model = choose
_new_sort_cost_estimate = true
_complex_view_merging = true
_unnest_subquery = true
_eliminate_common_subexpr = true
_pred_move_around = true
_convert_set_to_join = false
_push_join_predicate = true
_push_join_union_view = true
_fast_full_scan_enabled = true
_optim_enhance_nnull_detection = true
_parallel_broadcast_enabled = true
_px_broadcast_fudge_factor = 100
_ordered_nested_loop = true
_no_or_expansion = false
optimizer_index_cost_adj = 100
optimizer_index_caching = 0
_system_index_caching = 0
_disable_datalayer_sampling = false
query_rewrite_enabled = true
query_rewrite_integrity = enforced
_query_cost_rewrite = true
_query_rewrite_2 = true
_query_rewrite_1 = true
_query_rewrite_expression = true
_query_rewrite_jgmigrate = true
_query_rewrite_fpc = true
_query_rewrite_drj = true
_full_pwise_join_enabled = true
_partial_pwise_join_enabled = true
_left_nested_loops_random = true
_improved_row_length_enabled = true
_index_join_enabled = true
_enable_type_dep_selectivity = true
_improved_outerjoin_card = true
_optimizer_adjust_for_nulls = true
_optimizer_degree = 0
_use_column_stats_for_function = true
_subquery_pruning_enabled = true
_subquery_pruning_mv_enabled = false
_or_expand_nvl_predicate = true
_like_with_bind_as_equality = false
_table_scan_cost_plus_one = true
_cost_equality_semi_join = true
_default_non_equality_sel_check = true
_new_initial_join_orders