You learn something new every day, right? Well, I certainly do (more or less). Even about PL/SQL, about which I am sure many people think I already know everything. Far from it.
In fact, I learned just last week from the PL/SQL Product Manager, Bryn Llewellyn, that his recommendation regarding cursor FOR loops and bulk collect is different from mine – and for a very good reason.
Several years ago, he informed me (and provided a script to verify the statement: 10g_optimize_cfl.sql in the
demo.zip) that in Oracle10g, the compiler now optimizes every cursor FOR loop to execute at a level of performance similar to BULK COLLECT.
This is a wonderful and non-trivial optimization. Rather than have to go through a manual rewrite of all cursor FOR loops, you can leave them in place and reap the benefits of the BULK COLLECT (sort of).
Well, that's not quite true – and it's even less true than I thought.
First of all, if your cursor FOR loop contains any DML statements, you will still want to explicitly convert to BULK COLLECT. The reason is that while the optimizer will improve the performance of the query step, it will not optimize the DML statements to execute like FORALLs (the bulk processing analogue for updates, inserts and deletes).
So if you have code that looks like (cfl_to_bulk_0.sql in
demo.zip):
PROCEDURE upd_for_dept (
dept_in IN employees.department_id%TYPE
, newsal_in IN employees.salary%TYPE
)
IS
CURSOR emp_cur
IS
SELECT employee_id, salary, hire_date
FROM employees
WHERE department_id = dept_in;
BEGIN
FOR rec IN emp_cur
LOOP
BEGIN
INSERT INTO employee_history
(employee_id, salary, hire_date
)
VALUES (rec.employee_id, rec.salary, rec.hire_date
);
adjust_compensation (rec.employee_id, rec.salary);
UPDATE employees
SET salary = newsal_in
WHERE employee_id = rec.employee_id;
EXCEPTION
WHEN OTHERS
THEN
log_error;
END;
END LOOP;
END upd_for_dept;
You will need to change the cursor FOR loop to BULK COLLECT so that you can fill up collections with data from the query, which can then be used in the FORALL statement. You will, sadly (due to increased complexity and program length) end up with something like this (cfl_to_bulk_5.sql in
demo.zip):
PROCEDURE upd_for_dept (
dept_in IN employees.department_id%TYPE
, newsal_in IN employees.salary%TYPE
)
IS
bulk_errors EXCEPTION;
PRAGMA EXCEPTION_INIT (bulk_errors, -24381);
TYPE employee_tt IS TABLE OF employees.employee_id%TYPE
INDEX BY BINARY_INTEGER;
employee_ids employee_tt;
TYPE salary_tt IS TABLE OF employees.salary%TYPE
INDEX BY BINARY_INTEGER;
salaries salary_tt;
TYPE hire_date_tt IS TABLE OF employees.hire_date%TYPE
INDEX BY BINARY_INTEGER;
hire_dates hire_date_tt;
CURSOR employees_cur
IS
SELECT employee_id, salary, hire_date
FROM employees
WHERE department_id = dept_in
FOR UPDATE;
PROCEDURE fetch_data_quickly (
limit_in IN PLS_INTEGER
, employee_ids_out OUT employee_tt
, salaries_out OUT salary_tt
, hire_dates_out OUT hire_date_tt
)
IS
BEGIN
FETCH employees_cur
BULK COLLECT INTO employee_ids_out, salaries_out, hire_dates_out LIMIT limit_in;
END fetch_data_quickly;
PROCEDURE adj_comp_for_arrays (
employee_ids_io IN OUT employee_tt
, salaries_io IN OUT salary_tt
)
IS
BEGIN
FOR indx IN 1 .. employee_ids_io.COUNT
LOOP
adjust_compensation (employee_ids_io (indx), salaries_io (indx));
END LOOP;
END adj_comp_for_arrays;
PROCEDURE insert_history
IS
BEGIN
FORALL indx IN employee_ids.FIRST .. employee_ids.LAST SAVE EXCEPTIONS
INSERT INTO employee_history
(employee_id, salary, hire_date
)
VALUES (employee_ids (indx), salaries (indx), hire_dates (indx)
);
EXCEPTION
WHEN bulk_errors
THEN
FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
LOOP
-- Log the error
log_error
( 'Unable to insert history row for employee '
|| employee_ids
(SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX)
, SQL%BULK_EXCEPTIONS (indx).ERROR_CODE
);
/*
Communicate this failure to the update phase:
Delete this row so that the update will not take place.
*/
employee_ids.DELETE (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);
END LOOP;
END insert_history;
PROCEDURE update_employee
IS
BEGIN
/*
Use Oracle10g INDICES OF to avoid errors
from a sparsely-populated employee_ids collection.
*/
FORALL indx IN INDICES OF employee_ids SAVE EXCEPTIONS
UPDATE employees
SET salary = newsal_in
, hire_date = hire_dates (indx)
WHERE employee_id = employee_ids (indx);
EXCEPTION
WHEN bulk_errors
THEN
FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
LOOP
log_error
( 'Unable to update salary for employee '
|| employee_ids
(SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX)
, SQL%BULK_EXCEPTIONS (indx).ERROR_CODE
);
END LOOP;
END update_employee;
BEGIN
OPEN employees_cur;
LOOP
fetch_data_quickly (100, employee_ids, salaries, hire_dates);
EXIT WHEN employee_ids.COUNT = 0;
insert_history;
adj_comp_for_arrays (employee_ids, salaries);
update_employee;
END LOOP;
END upd_for_dept;
Sorry, I know that I should probably explain this code but to be brutally honest my dad had heart bypass surgery and I am currently at the hospital and so you get the abbreviated version ....
Anyway, that is one scenario where you definitely need to convert from the cursor FOR loop.
But Bryn, to my surprise, recommends that you always convert explicitly to BULK COLLECT. Why is that? For performance reasons. Apparently, the cursor FOR loop optimization makes the code run faster, but not as quickly as BULK COLLECT.
In my tests (which you can reproduce in cfl_vs_bulkcollect.sql), I found that BULK COLLECT ran 33% faster than the cursor FOR loop:
- Cursor For Loop Elapsed: 8.12 seconds.
- Bulk Collect Elapsed: 5.46 seconds.
So if you really want the best performance, take the time and make the effort to switch over to BULK COLLECT.