Hello, you are not logged in.  Login or sign up
Community >> Blogs
Search Toad World Search

Blogs
Toad and Database Commentaries

 Toad World blogs are a mix of insightful how-tos from Quest experts as well as their commentary on experiences with new database technologies.

Do you have a topic that you'd like discussed?  We'd love to hear from you.  Send us your idea for a blog topic.

A New Intelligent Test Run Function is under Research
 
Location: Blogs Richard To's Blog    
 RichardTo Friday, June 13, 2008 3:20 AM
  
A problem in the area of SQL optimization that has been around for a long time is the inaccuracy of the cost estimation of a SQL statement. A lot of people have the wrong conception that inaccurate cost estimations generated for a SQL statement by the database optimizer are caused by outdated statistics; whereas in fact, outdated or incorrect statistics is only one of many reasons that cause the database to generate an inaccurate SQL cost estimation. This means that even if you have 100% accurate statistics on your tables and indexes, you will not necessarily have 100% accurate cost estimation for your SQL statements.
 
Most of the poorly performing SQL statements found in production databases are caused by the wrong SQL cost estimation. For a more detailed discussion on this topic, see my previous blog Why SQL Tuning?
 
In Quest SQL Optimizer, we use an Artificial Intelligence technique to generate syntactically different, but semantically equivalent SQL rewrites. This provides several alterative SQL statements for the original SQL. But, since we cannot rely on the database SQL cost estimation to identify the best SQL alternative, we still have to test run those alternatives in a production database or similar environment to pick out the best alternative SQL statement. A close look at the testing results of several different SQL statements will show you that the best alternative SQL statement is not normally the one with the lowest cost in most of cases.  This is why Quest SQL Optimizer provides a test run function to help users to pick out the best alternative SQL statement. But a physical test run for all alternative SQL statements is time consuming, so some users may select only those SQL with relatively lower cost to test run without going through all the alternatives. This approach can save time, but it may miss some potentially better alternatives whose estimated costs have been wrongly estimated high. 
 
It has been on my mind for many years to try to find a way to quickly identify the best SQL alterative from all the alternatives generated by Quest SQL Optimizer. There is no easy solution to this problem. Since each database environment is unique and the cost estimation is not reliable, I knew that what was needed was an algorithm which could save users time when test running the alternatives while not missing any potentially good alternatives. Recently, I have been involved in a research project that uses an AI cluster technique to improve the traditional Fuzzy Self-learning algorithm. One of the benefits that have come out of this research is that I found a better way to automatically select which SQL alternatives to test. It is to use an AI cluster technique that we have developed which provides a simple and useful solution to this long standing problem in our Quest SQL Optimizer
 
Let me illustrate this using the following SQL statement:
select * from employee
where emp_dept in (select dpt_id from department
where dpt_name LIKE 'D%')
I put this SQL statement into Quest SQL Optimizer so that it could be rewritten by our optimization process using an AI engine. At the intelligence level 4, in less than 10 seconds, 15 alternatives were generated:
 
 
Let’s have a closer look at the cost of all the SQL alternatives and the cost distribution chart for the SQL statement and the 15 alternative SQL statements generated by Quest SQL Optimizer for Oracle:
 
 
As you can see in the chart, the cost distributions of all SQL alternatives are grouped into multiple regions. SQL statements with similar cost normally have similar execution plans which results in similar performance. 
 
Executing the original and alternative SQL statements produced the following run times.
 
In reviewing the run times, you can look at the Plan Cost and see that the Total Elapsed Time and Total CPU time for each of the alternatives with a similar cost are very close. Sometimes, because of how long it takes to execute a SQL statement, testing all of the alternatives is not practical. So, to save time when testing the alternatives, instead of testing only the SQL statements with the lowest cost, it is best to test one SQL statement from each group of SQL with a similar cost.
 
For this example, Alt #12 is the fastest SQL statement even though the estimated cost from Oracle is among the highest. The fastest SQL statement would have been missed if only the SQL statements with the lowest cost were selected for testing.
 
The behavior we see in this example, that SQL statements with similar cost have similar run time, is common across the broad whether the syntax of the SQL statement is simple or very complex.
 
A new Algorithm for select SQL to test
 
It is an easy task for a person to look at a group of SQL statements sorted by the cost value and pick one SQL statement from each group of SQL statements that have a similar cost. But it is not so easy to develop an algorithm that can correctly do the same thing, which is to start with a fixed number of points (in this case the Plan Cost values) that have a wide range with an indeterminate granularity of numbers and then group them so that one can be selected. In the field of Artificial Intelligence, this issue would be classified as a clustering problem.
 
The next generation of test running the SQL statements in Quest SQL Optimizer will have an Intelligent Test Run function. This function will be based on an algorithm, developed in the Fuzzy Self-learning project, which will simplify the clustering process (the process of grouping the SQL statements with similar costs) so that Quest SQL Optimizer will be able to select one SQL statement from a group of SQL with similar cost.
 
The algorithm is too technical and complex to describe here, but for sure, you will enjoy a faster and better SQL optimization product soon.
 
Permalink |  Trackback

Comment:
Add Comment   Cancel 
Search Blog Entries
 
Blogger and Topic List
 

 

All Recent Entries
 

 

Johannes Ahrends
Unicode

Steven Feuerstein
Oracle PL/SQL

Daniel Norwood
Toad for Data Analysts
John Pocknell
Toad for Oracle
Bert Scalzo
Toad for Oracle, Data Modeling, Benchmarking
Jeff Smith
Toad product family
Richard To
SQL Optimization
Jim Wankowski
DB2 - LUW and z/OS
John Weathington
Compliance
Doug Williams
Database Musings
  Henrik "Mauritz" Johnson
Toad Tips & Tricks on the "other" Toads
  Toad World Editor
Toad World issues

  Toad Data Modeler Opens in a new window
Data Modeling
 

Copyright 2008 by Quest Software  | Terms Of Use | Privacy Statement | Contact Us