I have been asked many times that why there are some dummy SQL transformations/rewrites that look meaningless, but it works in certain situations. Let me give you some examples that may help to explain what theory on behind of Quest SQL Optimizer.
The following rules are implemented in the Quest SQL Optimizer for different platforms to deal with individual database SQL optimizer’s behavior. Some rules may look puzzling. In order to understand the theory behind some of these transformations, you may need to have an in-depth understanding of database optimization theory and the design approach to optimization that each database vendor has incorporated into the database SQL optimizer along with the platform specific optimizer functions.

It is a transformation which may look puzzling. The original SQL statement uses a range scan of the employee table with the condition “emp_id > 123456”. For example; we know that IBM DB2 UDB or Microsoft SQL Server have an intelligent algorithm which can preview the value “123456” before the execution plan is generated. Consequently, if “emp_id>123456” returns a small subset of records from employee table, the database SQL optimizer should generate execution plan that uses an index search. In contrast, if the SQL statement returns almost all the records from the table, the database SQL optimizer should generate an execution plan using a full table scan to save the time of retrieving extra index pages. This works fine in most cases. But, there are several factors that can cause the database SQL optimizer to make a mistake. For example;
1) If the statistics are not up-to-date.
2) The data distribution is so skewed that the granularity of the histogram is too big to handle.
3) The costing algorithm fails to take into account the configuration of different machines' I/O thru-put, CPU processing, memory speed, and other system resources.
For Microsoft SQL Server, if you want to rectify the problem, you can use the INDEX hints to force the database SQL optimizer to pick up an index. But for IBM DB2 UDB, it is a little bit more difficult. Let’s look at the following transformations which uses a dummy operation COALESCE(123456,123456) or adds +0 to the literal 123456. The purpose of these dummy operations is to hide the value of 123456, so Microsoft SQL Server or IBM DB2 UDB will not be able to see the value while parsing the SQL statement. Therefore, they will make a rough estimation when they do not know the actual value. Erring on the side of caution, the database SQL optimizer will normally select the execution plan that uses an index search.

In modern RDBMS SQL optimizers, this IN sub-query can normally be transformed to a join SQL statement, which means that the join path can be either from A→B or B→A. For the Nested Loop join case, the path plays an important role in determining the speed of the SQL. For the Hash join or Sort Merge join cases, the join path (the order that the tables are accessed) may not be that significant. But, this transformation rule which adds a GROUP BY clause serves two purposes. The first purpose is that it forces the sub-query to be processed individually. If the original execution plan is a Nested Loop join, after the transformation, the execution plan will normally be changed to a Hash or a Sort Merge join. The other purpose is that the additional GROUP BY function will trim down the result set from B.key (if B.key is not unique) and the duplicate records will be eliminated first. Sometimes this will help to improve the join speed.