Minimize
Blogger List

Johannes Ahrends
Toad and Oracle

Ben Boise
Toad SC Discussions

Kevin Dalton
Benchmark Factory

Steven Feuerstein
PL/SQL Obsession

Devin Gallagher
Toad SC discussions

Stuart Hodgins
JProbe Discussions

  Henrik "Mauritz" Johnson
Toad Tips & Tricks on the "other" Toads
  Mark Kurtz
Toad SC discussions
  Michael Lumbard
Toad SC discussions
Daniel Norwood
Toad for Data Analysts,
Toad Extension for Visual Studio
Debbie Peabody
Toad for Data Analysts
Gary Piper
Toad Reports Manager
John Pocknell
Toad for Oracle, JProbe
Kuljit Sangha
Toad SC discussions
Bert Scalzo Indicates Oracle ACE status
Toad for Oracle, Data Modeling, Benchmarking
Jeff Smith
Toad product family
Richard To
SQL Optimization
Jim Wankowski
DB2 - LUW and z/OS
John Weathington
  Toad Data Modeler Opens in a new window
Data Modeling
 
  Real Automated Code Testing for Oracle
Quest Code Tester blog
 
Minimize
Blog Tags
toad for oracle (122)
oracle (62)
plsql (46)
sql optimization (37)
toad for data analysts (28)
code tester (19)
toad for ibm db2 (13)
automation (11)
batch optimizer (10)
virtualization (10)
schema browser (9)
toad for sql server (9)
data grid (8)
sql (8)
sql editor (8)
toad data modeler (8)
benchmark factory (7)
excel (7)
query builder (7)
report manager (7)
toad extension (7)
visual studio (7)
11g (6)
configuration (6)
freeware (6)
health check (6)
vmware (6)
connect (5)
dba module (5)
er diagrammer (5)
F4 (5)
linux (5)
refactoring (5)
spotlight (5)
unicode (5)
compare (4)
debugger (4)
export (4)
formatter (4)
make code (4)
rman (4)
strip code (4)
benchmark (3)
bfscript (3)
bulk collect (3)
code templates (3)
code xpert (3)
database browser (3)
db2 (3)
notebook (3)
oem (3)
RAC (3)
session browser (3)
speed (3)
sql optimizer (3)
toad for mysql (3)
tpc-c (3)
9.7 (2)
alert log (2)
app designer (2)
awr (2)
code insight (2)
code snippets (2)
collection (2)
compare and sync (2)
compliance (2)
data generator (2)
data warehouse (2)
database explorer (2)
database monitor (2)
explain (2)
forall (2)
ftp (2)
group execute (2)
handbook (2)
installation (2)
job scheduler (2)
multi-task (2)
nested table (2)
os command (2)
profiler (2)
recovery (2)
release history (2)
save as (2)
schema compare (2)
sql recall (2)
stats pack (2)
subversion (2)
team coding (2)
trace file browser (2)
while loop (2)
10g (1)
64 bit (1)
7zip (1)
action (1)
addm (1)
alter (1)
ansi join (1)
array (1)
ccleaner (1)
code coverage (1)
code road map (1)
CRON (1)
cursor for loop (1)
data browser (1)
data subset (1)
database probe (1)
dbms_flashback (1)
dbms_profiler (1)
ddl (1)
feuerstein (1)
filezilla (1)
flash drive (1)
flow control (1)
for loop (1)
group policy manager (1)
hints (1)
import (1)
index (1)
inheritance (1)
invoker rights (1)
ipad (1)
java (1)
latency (1)
log switch (1)
logical model (1)
ltrim (1)
master-detail browser (1)
monitor (1)
multi-select (1)
naming standards (1)
network (1)
object explorer (1)
OEBS (1)
package (1)
parser (1)
partitioning (1)
performance (1)
pragma (1)
project manager (1)
RAT (1)
revo (1)
REXEC (1)
schema report (1)
script manager (1)
search (1)
set operator (1)
sga (1)
slow (1)
sonarsource (1)
source control (1)
space projection (1)
sql monitor (1)
sql navigator (1)
sql script (1)
sql tracker (1)
sql*plus (1)
standards (1)
statistics (1)
stored procedure (1)
string parser (1)
sub-model (1)
sub-type (1)
synch (1)
synchback (1)
TELNET (1)
toad (1)
trace (1)
unit test (1)
unix (1)
usb (1)
utility (1)
v10 (1)
v9.5 (1)
version control (1)
waits (1)
workload replay (1)
workspace (1)
xml (1)
 
WELCOME, GUEST
 
 

Blogs
Toad and Database Commentaries

Toad World blogs are a mix of insightful how-tos from Quest experts as well as their commentary on experiences with new database technologies.  Have some views of your own to share?  Post your comments!  Note:  Comments are restricted to registered Toad World users.

Do you have a topic that you'd like discussed?  We'd love to hear from you.  Send us your idea for a blog topic.


Jun 23

Written by: StevenFeuersteinTW
Monday, June 23, 2008 7:34 AM  RssIcon

You learn something new every day, right? Well, I certainly do (more or less). Even about PL/SQL, about which I am sure many people think I already know everything. Far from it.
 
In fact, I learned just last week from the PL/SQL Product Manager, Bryn Llewellyn, that his recommendation regarding cursor FOR loops and bulk collect is different from mine – and for a very good reason.
 
Several years ago, he informed me (and provided a script to verify the statement: 10g_optimize_cfl.sql in the demo.zip) that in Oracle10g, the compiler now optimizes every cursor FOR loop to execute at a level of performance similar to BULK COLLECT.
 
This is a wonderful and non-trivial optimization. Rather than have to go through a manual rewrite of all cursor FOR loops, you can leave them in place and reap the benefits of the BULK COLLECT (sort of).
 
Well, that's not quite true – and it's even less true than I thought.
 
First of all, if your cursor FOR loop contains any DML statements, you will still want to explicitly convert to BULK COLLECT. The reason is that while the optimizer will improve the performance of the query step, it will not optimize the DML statements to execute like FORALLs (the bulk processing analogue for updates, inserts and deletes).
 
So if you have code that looks like (cfl_to_bulk_0.sql in demo.zip):
 
PROCEDURE upd_for_dept (
   dept_in     IN   employees.department_id%TYPE
 , newsal_in   IN   employees.salary%TYPE
)
IS
   CURSOR emp_cur
   IS
      SELECT employee_id, salary, hire_date
        FROM employees
       WHERE department_id = dept_in;
BEGIN
   FOR rec IN emp_cur
   LOOP
      BEGIN
         INSERT INTO employee_history
                     (employee_id, salary, hire_date
                     )
              VALUES (rec.employee_id, rec.salary, rec.hire_date
                     );
 
         adjust_compensation (rec.employee_id, rec.salary);
 
         UPDATE employees
            SET salary = newsal_in
          WHERE employee_id = rec.employee_id;
      EXCEPTION
         WHEN OTHERS
         THEN
            log_error;
      END;
   END LOOP;
END upd_for_dept;
 
You will need to change the cursor FOR loop to BULK COLLECT so that you can fill up collections with data from the query, which can then be used in the FORALL statement. You will, sadly (due to increased complexity and program length) end up with something like this (cfl_to_bulk_5.sql in demo.zip):
 
PROCEDURE upd_for_dept (
   dept_in     IN   employees.department_id%TYPE
 , newsal_in   IN   employees.salary%TYPE
)
IS
   bulk_errors    EXCEPTION;
   PRAGMA EXCEPTION_INIT (bulk_errors, -24381);
 
   TYPE employee_tt IS TABLE OF employees.employee_id%TYPE
      INDEX BY BINARY_INTEGER;
 
   employee_ids   employee_tt;
 
   TYPE salary_tt IS TABLE OF employees.salary%TYPE
     INDEX BY BINARY_INTEGER;
 
   salaries       salary_tt;
 
   TYPE hire_date_tt IS TABLE OF employees.hire_date%TYPE
      INDEX BY BINARY_INTEGER;
 
   hire_dates     hire_date_tt;
 
   CURSOR employees_cur
   IS
      SELECT     employee_id, salary, hire_date
            FROM employees
           WHERE department_id = dept_in
      FOR UPDATE;
 
   PROCEDURE fetch_data_quickly (
      limit_in           IN       PLS_INTEGER
    , employee_ids_out   OUT      employee_tt
    , salaries_out       OUT      salary_tt
    , hire_dates_out     OUT      hire_date_tt
   )
   IS
   BEGIN
      FETCH employees_cur
      BULK COLLECT INTO employee_ids_out, salaries_out, hire_dates_out LIMIT limit_in;
   END fetch_data_quickly;
 
   PROCEDURE adj_comp_for_arrays (
      employee_ids_io   IN OUT   employee_tt
    , salaries_io       IN OUT   salary_tt
   )
   IS
   BEGIN
      FOR indx IN 1 .. employee_ids_io.COUNT
      LOOP
         adjust_compensation (employee_ids_io (indx), salaries_io (indx));
      END LOOP;
   END adj_comp_for_arrays;
 
   PROCEDURE insert_history
   IS
   BEGIN
      FORALL indx IN employee_ids.FIRST .. employee_ids.LAST SAVE EXCEPTIONS
         INSERT INTO employee_history
                     (employee_id, salary, hire_date
                     )
              VALUES (employee_ids (indx), salaries (indx), hire_dates (indx)
                     );
   EXCEPTION
      WHEN bulk_errors
      THEN
         FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
         LOOP
            -- Log the error
            log_error
                     (   'Unable to insert history row for employee '
                      || employee_ids
                                     (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX)
                    , SQL%BULK_EXCEPTIONS (indx).ERROR_CODE
                     );
            /*
            Communicate this failure to the update phase:
            Delete this row so that the update will not take place.
            */
            employee_ids.DELETE (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);
         END LOOP;
   END insert_history;
 
   PROCEDURE update_employee
   IS
   BEGIN
      /*
        Use Oracle10g INDICES OF to avoid errors
        from a sparsely-populated employee_ids collection.
      */
      FORALL indx IN INDICES OF employee_ids SAVE EXCEPTIONS
         UPDATE employees
            SET salary = newsal_in
              , hire_date = hire_dates (indx)
          WHERE employee_id = employee_ids (indx);
   EXCEPTION
      WHEN bulk_errors
      THEN
         FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
         LOOP
            log_error
                     (   'Unable to update salary for employee '
                      || employee_ids
                                     (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX)
                    , SQL%BULK_EXCEPTIONS (indx).ERROR_CODE
                     );
         END LOOP;
   END update_employee;
BEGIN
   OPEN employees_cur;
 
   LOOP
      fetch_data_quickly (100, employee_ids, salaries, hire_dates);
      EXIT WHEN employee_ids.COUNT = 0;
      insert_history;
      adj_comp_for_arrays (employee_ids, salaries);
      update_employee;
   END LOOP;
END upd_for_dept;
 
Sorry, I know that I should probably explain this code but to be brutally honest my dad had heart bypass surgery and I am currently at the hospital and so you get the abbreviated version ....
 
Anyway, that is one scenario where you definitely need to convert from the cursor FOR loop.
 
But Bryn, to my surprise, recommends that you always convert explicitly to BULK COLLECT. Why is that? For performance reasons. Apparently, the cursor FOR loop optimization makes the code run faster, but not as quickly as BULK COLLECT.
 
In my tests (which you can reproduce in cfl_vs_bulkcollect.sql), I found that BULK COLLECT ran 33% faster than the cursor FOR loop:

 

  • Cursor For Loop Elapsed: 8.12 seconds.
  • Bulk Collect Elapsed: 5.46 seconds. 
So if you really want the best performance, take the time and make the effort to switch over to BULK COLLECT.
 

4 comment(s) so far...


Re: Always Bulk Collect

Steve, I was looking at this - and while I agree that if you are going to read/update in a loop - using forall is the way to go (if you cannot do it in a single sql statement, which you many times can - no code) - and therefore using bulk collect with a reasonable limit would be the right approach. However, when you say "always convert", I'm not so sure. I reviewed your test case (cfl_vs_bulkcollect) and my notes are: a) you bulk collect over a million records. That is about 620mb of pga for that array alone. I would be very upset at the development staff that needs a half a gigabyte of RAM for an array. b) you would use the simple cursor for loop when you wanted to process row by row - your example stuffs them all into an array, you would not do that in real life. In real life, you would just have "for x in (select) loop process; end loop;" c) you measure elapsed time, elapsed time is very variable depending on what is going on, I would like to measure cpu and elapsed time. So, I took your example and reimplemented it as follows ops$tkyte%ORA11GR1> CREATE TABLE employees_big 2 AS SELECT e1.* FROM hr.employees e1, hr.employees e2, hr.employees e3 3 / Table created. ops$tkyte%ORA11GR1> ops$tkyte%ORA11GR1> create or replace procedure simple( p_iters in number default 1 ) 2 as 3 l_start_cpu number := dbms_utility.get_cpu_time; 4 l_start_ela number := dbms_utility.get_time; 5 begin 6 for ITERS in 1 .. p_iters 7 loop 8 for x in ( select * from employees_big ) 9 loop 10 if ( x.employee_id+0.1 = x.employee_id ) 11 then 12 dbms_output.put_line( 'doh' ); 13 end if; 14 end loop; 15 dbms_output.put_line 16 ( iters || '-simple ' || 17 ' cpu = ' || (dbms_utility.get_cpu_time-l_start_cpu) || 18 ' ela = ' || (dbms_utility.get_time-l_start_ela) ); 19 end loop; 20 end; 21 / Procedure created. ops$tkyte%ORA11GR1> ops$tkyte%ORA11GR1> create or replace procedure complex( p_iters in number default 1 ) 2 as 3 l_start_cpu number := dbms_utility.get_cpu_time; 4 l_start_ela number := dbms_utility.get_time; 5 type array is table of employees_big%rowtype; 6 l_data array; 7 cursor c is select * from employees_big; 8 begin 9 for ITERS in 1 .. p_iters 10 loop 11 open c; 12 loop 13 fetch c bulk collect into l_data limit 100; 14 for i in 1 .. l_data.count 15 loop 16 if ( l_data(i).employee_id+0.1 = l_data(i).employee_id ) 17 then 18 dbms_output.put_line( 'doh' ); 19 end if; 20 end loop; 21 exit when c%notfound; 22 end loop; 23 close c; 24 dbms_output.put_line 25 ( iters || '-complex ' || 26 ' cpu = ' || (dbms_utility.get_cpu_time-l_start_cpu) || 27 ' ela = ' || (dbms_utility.get_time-l_start_ela) ); 28 end loop; 29 end; 30 / Procedure created. that compares apples to apples - we both want to process all of the rows in the table - we inspect values in the rows and do something. They both do equivalent work. First I was curious "how much memory" would that original approach without limit take? [continued below]

By ThomasKyte on   Tuesday, June 24, 2008 12:24 PM

Re: Always Bulk Collect

ops$tkyte%ORA11GR1> connect / Connected. ops$tkyte%ORA11GR1> column value new_val V ops$tkyte%ORA11GR1> select a.name, b.value 2 from v$statname a, v$mystat b 3 where a.statistic# = b.statistic# 4 and a.name = 'session pga memory max'; NAME VALUE ------------------------------ ---------- session pga memory max 1841420 ops$tkyte%ORA11GR1> set echo on ops$tkyte%ORA11GR1> declare 2 type array is table of employees_big%rowtype; 3 l_data array; 4 begin 5 select * bulk collect into l_data from employees_big; 6 end; 7 / PL/SQL procedure successfully completed. ops$tkyte%ORA11GR1> select a.name, to_char(b.value-&V)/1024/1024 || ' mb' 2 from v$statname a, v$mystat b 3 where a.statistic# = b.statistic# 4 and a.name = 'session pga memory max'; old 1: select a.name, to_char(b.value-&V)/1024/1024 || ' mb' new 1: select a.name, to_char(b.value- 1841420)/1024/1024 || ' mb' NAME TO_CHAR(B.VALUE-1841420)/1024/1024||'MB' ------------------------------ ------------------------------------------- session pga memory max 618.6875 mb that is way more than I am willing to give anyone for a single variable like that - get two or three of those going and bam. Then, I wanted to see the relative performance: ops$tkyte%ORA11GR1> connect / Connected. ops$tkyte%ORA11GR1> exec complex(5) 1-complex cpu = 543 ela = 596 2-complex cpu = 1075 ela = 1127 3-complex cpu = 1606 ela = 1658 4-complex cpu = 2142 ela = 2194 5-complex cpu = 2677 ela = 2730 PL/SQL procedure successfully completed. ops$tkyte%ORA11GR1> ops$tkyte%ORA11GR1> connect / Connected. ops$tkyte%ORA11GR1> exec simple(5) 1-simple cpu = 525 ela = 525 2-simple cpu = 1049 ela = 1049 3-simple cpu = 1572 ela = 1573 4-simple cpu = 2096 ela = 2097 5-simple cpu = 2622 ela = 2623 PL/SQL procedure successfully completed. Here, the simple approach was *marginally* faster - not enough to be significant - but - the simple code is not "much slower" than the complex code - they are just about the same (similar results in 10g - and if you run it often enough sometimes complex is a LITTLE faster than simple and vice versa - they are about the same) Thomas Kyte http://asktom.oracle.com/

By ThomasKyte on   Tuesday, June 24, 2008 12:25 PM

Re: Always Bulk Collect

Thanks, Tom! I ran your code and reproduced similar results. I have asked Bryn to also take a look. I definitely would be happy to not feel compelled to convert to bulk collect at all times. But the two of you will need to "duke it out".... :-) SF

By StevenFeuersteinTW on   Friday, July 04, 2008 9:53 AM

Re: Always Bulk Collect

I have put together a more complete script to compare cursor for loop, unlimited bulk collect, and bulk collect with various limit clauses. Here are the results I get: "Bulk Collect with LIMIT 100" Elapsed CPU time: 347 "Bulk Collect with LIMIT 500" Elapsed CPU time: 335 "Bulk Collect with LIMIT 10000" Elapsed CPU time: 323 "Cursor For Loop" Elapsed CPU time: 586 "Bulk Collect" Elapsed CPU time: 461 "Cursor For Loop" Elapsed CPU time: 584 "Bulk Collect" Elapsed CPU time: 461 "Bulk Collect with LIMIT 100" Elapsed CPU time: 331 "Bulk Collect with LIMIT 500" Elapsed CPU time: 312 "Bulk Collect with LIMIT 10000" Elapsed CPU time: 327 This shows that at least for the sort of code shown below, bulk collect with a limit clause works very efficiently, even faster than the unlimited bulk collect - which in this case likely has most to do with memory issues (this is a table of 1.19M rows). I also saw a consistent but small improvement will a limit of 500 over 100, but no reliable improvement in performance at the 10000 limit. I cannot explain why these results are at odds with Tom's code. My script: DROP TABLE employees_big / CREATE TABLE employees_big AS SELECT e1.* FROM employees e1, employees e2, employees e3 / CREATE OR REPLACE PROCEDURE cfl_vs_bulkcollect ( iterations_in IN PLS_INTEGER := 1 ) IS TYPE employee_tt IS TABLE OF employees_big%ROWTYPE INDEX BY BINARY_INTEGER; CURSOR big_cur IS SELECT * FROM employees_big; l_start_time PLS_INTEGER; PROCEDURE start_timing IS BEGIN DBMS_SESSION.free_unused_user_memory; l_start_time := DBMS_UTILITY.get_cpu_time; END start_timing; PROCEDURE show_elapsed_time (str IN VARCHAR2) IS BEGIN DBMS_OUTPUT.put_line ( '"' || str || '" Elapsed CPU time: ' || TO_CHAR ( DBMS_UTILITY.get_cpu_time - l_start_time ) ); END show_elapsed_time; PROCEDURE cfl IS l_employees employee_tt; BEGIN start_timing (); FOR indx IN 1 .. iterations_in LOOP FOR rec IN (SELECT * FROM employees_big) LOOP l_employees (NVL (l_employees.LAST, 0) + 1) := rec; END LOOP; END LOOP; show_elapsed_time ('Cursor For Loop'); END cfl; PROCEDURE bulk_collect IS l_employees employee_tt; BEGIN start_timing (); FOR indx IN 1 .. iterations_in LOOP SELECT * BULK COLLECT INTO l_employees FROM employees_big; END LOOP; show_elapsed_time ('Bulk Collect'); END bulk_collect; PROCEDURE bulk_with_limit (limit_in IN PLS_INTEGER) IS l_employees employee_tt; BEGIN start_timing (); FOR indx IN 1 .. iterations_in LOOP OPEN big_cur; LOOP FETCH big_cur BULK COLLECT INTO l_employees LIMIT limit_in; EXIT WHEN l_employees.COUNT = 0; END LOOP; CLOSE big_cur; END LOOP; show_elapsed_time ('Bulk Collect with LIMIT ' || limit_in); END bulk_with_limit; PROCEDURE run_all (order_in IN VARCHAR2) IS BEGIN CASE order_in WHEN 'bulk first' THEN bulk_with_limit (100); bulk_with_limit (500); bulk_with_limit (10000); cfl (); bulk_collect (); WHEN 'bulk last' THEN cfl (); bulk_collect (); bulk_with_limit (100); bulk_with_limit (500); bulk_with_limit (10000); END CASE; END run_all; BEGIN run_all ('bulk first'); run_all ('bulk last'); END; /

By StevenFeuersteinTW on   Wednesday, July 23, 2008 8:37 AM
Search Blog Entries