WELCOME, GUEST
Minimize
Blogger List

Steven Feuerstein Indicates Oracle ACE director status
PL/SQL Obsession

Guy Harrison Indicates Oracle ACE status
Database topics

Bert Scalzo Indicates Oracle ACE status
Toad for Oracle, Data Modeling, Benchmarking
Dan Hotka Indicates Oracle ACE director status
SQL Tuning & PL/SQL Tips

Valentin Baev
It's all about Toad

Ben Boise
Toad SC Discussions

Dan Clamage
SQL and PL/SQL

Kevin Dalton
Benchmark Factory

Peter Evans 
Business Intelligence, Data Integration, Cloud and Big Data

Vaclav Frolik  
Toad Data Modeler, Toad Extension for Eclipse

Devin Gallagher
Toad SC discussions

Stuart Hodgins
JProbe Discussions

Julie Hyman
Toad for Data Analysts

  Henrik "Mauritz" Johnson
Toad Tips & Tricks on the "other" Toads
  Mark Kurtz
Toad SC discussions
Daniel Norwood
Tips & Tricks on Toad Solutions
Amit Parikh
Toad for Oracle, Benchmark Factory,Quest Backup Reporter
Debbie Peabody
Toad Data Point
Gary Piper
Toad Reports Manager
John Pocknell
Toad Solutions
Jeff Podlasek
Toad for DB2
Kuljit Sangha
Toad SC discussions
Michael Sass 
Toad for DB2
Brad Wulf
Toad SC discussions
Richard To
SQL Optimization
  Toad Data Modeler Opens in a new window
Data Modeling
 
  Toad Higher Education
How Hi-Ed Uses Toad
  Real Automated Code Testing for Oracle
Quest Code Tester blog
  中文技术资料库
技术文章
 

Blogs

Toad World blogs are a mix of insightful how-tos from Quest experts as well as their commentary on experiences with new database technologies.  Have some views of your own to share?  Post your comments!  Note:  Comments are restricted to registered Toad World users.

Do you have a topic that you'd like discussed?  We'd love to hear from you.  Send us your idea for a blog topic.


Jul 13

Written by: StevenFeuersteinTW
Wednesday, July 13, 2011 5:36 AM  RssIcon

Overview

Oracle introduced a significant enhancement to PL/SQL’s SQL-related capabilities with the FORALL statement and BULK COLLECT clause for queries. Together, these are referred to as bulk processing statements for PL/SQL. Why, you might wonder, would this be necessary? We all know that PL/SQL is tightly integrated with the underlying SQL engine in the Oracle database. PL/SQL is the database programming language of choice for Oracle—even though you can now use Java inside the database as well.

But this tight integration does not mean that there is no overhead associated with running SQL from a PL/SQL program. When the PL/SQL runtime engine processes a block of code, it executes the procedural statements within its own engine, but passes the SQL statements on to the SQL engine. The SQL layer executes the SQL statements and then returns information to the PL/SQL engine, if necessary.

This transfer of control between the PL/SQL and SQL engines is called a context switch. Each time a switch occurs, there is additional overhead. There are a number of scenarios in which many switches occur and performance degrades. As you can see, PL/SQL and SQL might be tightly integrated on the syntactic level, but "under the covers" the integration is not as tight as it could be.

With FORALL and BULK COLLECT, however, you can fine-tune the way these two engines communicate, effectively telling the PL/SQL engine to compress multiple context switches into a single switch, thereby improving the performance of your applications. 

Using BULK COLLECT is fairly straightforward, especially compared to FORALL. You replace "INTO" with "BULK COLLECT INTO" (whether it is SELECT INTO, FETCH INTO or EXECUTE IMMEDIATE INTO) and then after the INTO keyword, you will supply one or more collections, rather than a record or list of variables. Here's an example:

DECLARE
   TYPE employees_aat IS TABLE OF employees%ROWTYPE;

   l_employees   employees_aat;
BEGIN
     -- All rows at once...
     SELECT *
       BULK COLLECT INTO l_employees
       FROM employees
   ORDER BY last_name DESC;

   DBMS_OUTPUT.put_line (l_employees.COUNT);

   FOR indx IN 1 .. l_employees.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/

Let's take a look at BULK COLLECT in more detail.

Resources

Oracle Documentation

PL/SQL Obsession presentations

When should I use BULK COLLECT?

The primary use case for BULK COLLECT is any kind of loop that results in fetching data one row at a time, which Tom Kyte refers as "slow by slow" processing.

You might think, therefore, that a cursor FOR loop is an obvious opportunity for conversion to BULK COLLECT. If you are running Oracle Database 10g Release 2 or higher, however, that is not necessary - as long as you have turned on Oracle's automatic PL/SQL optimization to at least level 2.

Oracle Database 10g introduced automatic optimization of PL/SQL code. The default optimization level is 2 - and you should just leave it that way. At this level, Oracle will take many, careful steps to transform your code so that it runs substantially faster.

One of the most interesting and non-trivial optimization is that of cursor loops. Consider this code:

BEGIN
   FOR rec IN (  SELECT *
                   FROM employees
               ORDER BY last_name DESC)
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/

This block seems to fetch data from the employees table one row at a time, and then display the last name. If, however, optimization is set to level 2 or higher, then Oracle will automatically convert this block into C code that fetches data in bulk, just like with BULK COLLECT.

As a result, you do not need to explicitly convert your cursor FOR loops to use BULK COLLECT if:

  1. You are running 10.2 or higher.
  2. Your optimization level is to 2 or higher.
  3. The cursor FOR loop does not contain any DML statements (insert, update, delete).

In other words, you definitely should convert code like the following block either to a cursor FOR loop (to benefit from automatic optimization) or an explicit BULK COLLECT:

DECLARE
   CURSOR employees_cur
   IS
        SELECT *
          BULK COLLECT INTO l_employees
          FROM employees
      ORDER BY last_name DESC;

   l_employee   employees_cur%ROWTYPE;
BEGIN
   OPEN employees_cur;

   LOOP
      FETCH employees_cur INTO l_employee;

      EXIT WHEN employees_cur%NOTFOUND;
      DBMS_OUTPUT.put_line (l_employee.last_name);
   END LOOP;
END;
/

Here is the explicit BULK COLLECT version:

DECLARE
   TYPE employees_aat IS TABLE OF employees%ROWTYPE;

   l_employees   employees_aat;
BEGIN
     -- All rows at once...
     SELECT *
       BULK COLLECT INTO l_employees
       FROM employees
   ORDER BY last_name DESC;

   DBMS_OUTPUT.put_line (l_employees.COUNT);

   FOR indx IN 1 .. l_employees.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/

and here is the cursor FOR loop version (same as shown earlier):

BEGIN
   FOR rec IN (  SELECT *
                   FROM employees
               ORDER BY last_name DESC)
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/

Clearly, the cursor FOR loop offers the simplest solution. Use it! You will find at the end of this QuickTip a script you can run to prove to yourself the impact of the optimization level on cursor FOR loop optimization.

I noted earlier that you will need to explicitly convert to BULK COLLECT if your loop (even a cursor FOR loop) contains one or more DML statements. The reason for this is simple: Oracle does not automatically optimize the performance of DML statements inside loops to use FORALL (see QuickTip on FORALL). So even if the cursor FOR loop is automatically "converted" to BULK COLLECT-like levels of performance, the DML statements will still run for each row individually.

Since DML overhead is much higher than querying, you will still likely need to convert those DML statements to use FORALL. And that means you will need to populate collections to "feed" into the FORALL. Which means you will need (or should) use BULK COLLECT to fill up those collections.

Here is an example of a procedure that should be converted explicitly to use both BULK COLLECT and FORALL:

CREATE OR REPLACE PROCEDURE upd_for_dept (
   dept_in     IN   employees.department_id%TYPE
 , newsal_in   IN   employees.salary%TYPE
)
IS
   CURSOR emp_cur
   IS
      SELECT employee_id, salary, hire_date
        FROM employees
       WHERE department_id = dept_in;
BEGIN
   FOR rec IN emp_cur
   LOOP
      BEGIN
         INSERT INTO employee_history
                     (employee_id, salary, hire_date
                     )
              VALUES (rec.employee_id, rec.salary, rec.hire_date
                     );

         rec.salary := newsal_in;
        
         adjust_compensation (rec.employee_id, rec.salary);

         UPDATE employees
            SET salary = rec.salary
          WHERE employee_id = rec.employee_id;
      EXCEPTION
         WHEN OTHERS
         THEN
            log_error;
      END;
   END LOOP;
END upd_for_dept;

So be on the lookout for any kind of loop in your code that contains DML statements. They will offer the greatest "bang for the buck" on improving application performance with relatively little effort on your part.

BULK COLLECT Good to Knows

Keep the following factoids in mind when using BULK COLLECT:

  • Prior to Oracle9i Database, you could use BULK COLLECT only with static SQL. Now you can use BULK COLLECT with both dynamic and static SQL.
  • You can use BULK COLLECT keywords in any of the following clauses: SELECT INTO, FETCH INTO, and RETURNING INTO.
  • A collection populated by BULK COLLECT is always filled sequentially from index value 1 (or is empty).
  • If you are BULK COLLECTing into a varray or a nested table, you do not need to initialize or extend prior to executing the BULK COLLECT query. Oracle will take care of all of that for you.
  • SELECT...BULK COLLECT will not raise NO_DATA_FOUND if no rows are found. Instead, you must check the contents of the collection to see if there is any data inside it.
  • If the query returns no rows, the collection’s COUNT method will return 0.
  • As with any operation involving collections, use of BULK COLLECT will likely increase the amount of PGA (Process Global Area) memory used by an application session.
  • Use the LIMIT clause with BULK COLLECT to put a cap on the amount of PGA memory used by your query.
  • You can specify the limit (example in next section) as a literal, expression or parameter. I suggest you avoid hard-coding the value.
  • 100 is a good default or starting value for LIMIT. Setting the value higher, to say 500 or 1000, will probably not improve performance significantly, but will increase PGA consumption. For very large volumes of data, such millions of rows of data, you should probably experiment with higher limit values to see what kind of impact you see.
  • You can fetch into a single collection of records or a series of collections (one for each expression returned in the SELECT list).

Here are some examples:

  1. Select all the rows from the employees table in a single "round trip" to the SQL engine:

DECLARE
   TYPE employees_aat IS TABLE OF employees%ROWTYPE;

   l_employees   employees_aat;
BEGIN
     -- All rows at once...
     SELECT *
       BULK COLLECT INTO l_employees
       FROM employees
   ORDER BY last_name DESC;

   DBMS_OUTPUT.put_line (l_employees.COUNT);

   FOR indx IN 1 .. l_employees.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/

  1. Now limit the fetching to 10 rows at a time.

DECLARE
   CURSOR employees_cur
   IS
        SELECT *
          FROM employees
      ORDER BY last_name DESC;

   TYPE employees_aat IS TABLE OF employees%ROWTYPE;

   l_employees   employees_aat;
BEGIN
   OPEN employees_cur;

   LOOP
      FETCH employees_cur
      BULK COLLECT INTO l_employees
      LIMIT 10;

      EXIT WHEN l_employees.COUNT = 0;

      FOR indx IN 1 .. l_employees.COUNT
      LOOP
         DBMS_OUTPUT.put_line (l_employee.last_name);
      END LOOP;
   END LOOP;
END;
/

  1. Use BULK COLLET with a dynamic query.

CREATE OR REPLACE PROCEDURE show_names (query_in IN VARCHAR2)
IS
   l_names   DBMS_SQL.varchar2_table;
BEGIN
   EXECUTE IMMEDIATE query_in BULK COLLECT INTO l_names;

   FOR indx IN 1 .. l_names.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_names (indx));
   END LOOP;
END;
/

  1. Use BULK COLLECT with a cursor variable.

CREATE OR REPLACE PROCEDURE show_names (cv_io IN OUT SYS_REFCURSOR)
IS
   l_names   DBMS_SQL.varchar2_table;
BEGIN
   LOOP
      FETCH cv_io
      BULK COLLECT INTO l_names
      LIMIT 10;

      EXIT WHEN l_names.COUNT = 0;

      FOR indx IN 1 .. l_names.COUNT
      LOOP
         DBMS_OUTPUT.put_line (l_names (indx));
      END LOOP;
   END LOOP;

   CLOSE cv_io;
END;
/

DECLARE
   CV   SYS_REFCURSOR;
BEGIN
   OPEN CV FOR 'select last_name from employees';

   show_names (CV);
END;
/

Using LIMIT with BULK COLLECT

Oracle provides a LIMIT clause for BULK COLLECT that allows you to limit the number of rows fetched from the database. The syntax is:

FETCH cursor BULK COLLECT INTO ... [LIMIT rows];

where rows can be any literal, variable, or expression that evaluates to an integer (otherwise, the database will raise a VALUE_ERROR exception).

LIMIT is very useful with BULK COLLECT, because it helps you manage how much memory your program will used to process data. Suppose, for example, that you need to query and process 10,000 rows of data. You could use BULK COLLECT to retrieve all those rows and populate a rather large collection. However, this approach will consume lots of memory in the PGA for that session. If this code is run by many separate Oracle schemas, your application performance may degrade because of PGA swapping.

The following block of code uses the LIMIT clause in a FETCH that is inside a simple loop.

    DECLARE
       l_limit PLS_INTEGER := 100;
       CURSOR allrows_cur IS SELECT * FROM employees;    
       TYPE employees_aat IS TABLE OF allrows_cur%ROWTYPE
          INDEX BY BINARY_INTEGER;    
       l_employees employees_aat;
    BEGIN
       OPEN allrows_cur;
       LOOP
          FETCH allrows_cur BULK COLLECT INTO l_employees LIMIT l_limit;
     
          /* Process the data by scanning through the collection. */       
          FOR l_row IN 1 .. l_employees.COUNT
          LOOP
             upgrade_employee_status (l_employees(l_row).employee_id);
          END LOOP;

          EXIT WHEN allrows_cur%NOTFOUND;
       END LOOP;
     
       CLOSE allrows_cur;
    END;

Notice that I terminate the loop by checking the value of allrows_cur%NOTFOUND at the bottom of the loop. When querying data one row at a time, we usually put this code immediately after the FETCH statement. You should not do that when using BULK COLLECT, because when the fetch retrieves the last set of rows, the cursor will be exhausted (and %NOTFOUND will return TRUE) but you will still have some elements in the collection to process.

So either check the %NOTFOUND attribute at the bottom of your loop, or check the contents of the collection immediately after the fetch:

LOOP
   FETCH allrows_cur BULK COLLECT INTO l_employees LIMIT l_limit;
   EXIT WHEN l_employees.COUNT = 0;

The disadvantage of this second approach is that you will perform an extra fetch that returns no rows, compared to checking %NOTFOUND at the bottom of the loop body. The performance impact of this extra fetch should, however, be minimal.

PL/SQL Optimization and the Cursor FOR Loop

The following script (the first version of which was created by Bryn Llewellyn, PL/SQL Product Manager) calculates elapsed time for three different approaches to fetching data: cursor FOR loop, explicit loop processing, and BULK COLLECT.

CREATE OR REPLACE PROCEDURE test_cursor_performance (approach IN VARCHAR2)
IS
   CURSOR cur
   IS
      SELECT *
        FROM all_source
       WHERE ROWNUM < 100001;

   one_row     cur%ROWTYPE;

   TYPE t IS TABLE OF cur%ROWTYPE
                INDEX BY PLS_INTEGER;

   many_rows   t;
  
      last_timing   NUMBER := NULL;

   PROCEDURE start_timer
   IS
   BEGIN
      last_timing := DBMS_UTILITY.get_cpu_time;
   END;

   PROCEDURE show_elapsed_time (message_in IN VARCHAR2 := NULL)
   IS
   BEGIN
      DBMS_OUTPUT.put_line('"' || message_in || '" completed in: '
                           || ROUND (
                                 MOD (
                                      DBMS_UTILITY.get_cpu_time
                                    - last_timing
                                    + POWER (2, 32)
                                  , POWER (2, 32)
                                 )
                                 / 100
                               , 2
                              ));
   END;
BEGIN
   start_timer;

   CASE approach
      WHEN 'implicit cursor for loop'
      THEN
         FOR j IN cur
         LOOP
            NULL;
         END LOOP;
      WHEN 'explicit open, fetch, close'
      THEN
         OPEN cur;

         LOOP
           
            FETCH cur
            INTO one_row;

            EXIT WHEN cur%NOTFOUND;
         END LOOP;

         CLOSE cur;
      WHEN 'bulk fetch'
      THEN
         OPEN cur;

         LOOP
            FETCH cur
            BULK COLLECT INTO many_rows
            LIMIT 100;

            EXIT WHEN many_rows.COUNT () = 0;
         END LOOP;

         CLOSE cur;
   END CASE;

   show_elapsed_time (approach);
END test_cursor_performance;
/

/* Try different approaches with optimization disabled. */
ALTER PROCEDURE test_cursor_performance COMPILE plsql_optimize_level=0
/

BEGIN
   DBMS_OUTPUT.put_line ('No optimization...');
   test_cursor_performance ('implicit cursor for loop');

   test_cursor_performance ('explicit open, fetch, close');

   test_cursor_performance ('bulk fetch');
END;
/

/* Try different approaches with default optimization. */
ALTER PROCEDURE test_cursor_performance COMPILE plsql_optimize_level=2
/

BEGIN
   DBMS_OUTPUT.put_line ('Default optimization...');

   test_cursor_performance ('implicit cursor for loop');

   test_cursor_performance ('explicit open, fetch, close');

   test_cursor_performance ('bulk fetch');
END;
/

/* Try different approaches with DEBUG enabled. */

ALTER PROCEDURE test_cursor_performance COMPILE DEBUG
/

BEGIN
   DBMS_OUTPUT.put_line ('DEBUG enabled...');

   test_cursor_performance ('implicit cursor for loop');

   test_cursor_performance ('explicit open, fetch, close');

   test_cursor_performance ('bulk fetch');
END;
/

/*

On 11.2

No optimization...
"implicit cursor for loop" completed in: 2.95
"explicit open, fetch, close" completed in: 2.75
"bulk fetch" completed in: 1.33

Default optimization...
"implicit cursor for loop" completed in: 1.36
"explicit open, fetch, close" completed in: 2.75
"bulk fetch" completed in: 1.44

DEBUG enabled...
"implicit cursor for loop" completed in: 2.74
"explicit open, fetch, close" completed in: 2.78
"bulk fetch" completed in: 1.31

*/

Search Blog Entries
 
Blog Archives
 
Archive
<May 2013>
SunMonTueWedThuFriSat
2829301234
567891011
12131415161718
19202122232425
2627282930311
2345678
Monthly
May, 2013 (11)
April, 2013 (13)
March, 2013 (10)
February, 2013 (5)
January, 2013 (7)
December, 2012 (6)
November, 2012 (10)
October, 2012 (8)
September, 2012 (6)
August, 2012 (8)
July, 2012 (8)
June, 2012 (12)
May, 2012 (21)
April, 2012 (10)
March, 2012 (16)
February, 2012 (19)
January, 2012 (20)
December, 2011 (19)
November, 2011 (14)
October, 2011 (12)
September, 2011 (17)
August, 2011 (15)
July, 2011 (16)
June, 2011 (13)
May, 2011 (15)
April, 2011 (8)
March, 2011 (21)
February, 2011 (17)
January, 2011 (16)
December, 2010 (13)
November, 2010 (13)
October, 2010 (7)
September, 2010 (15)
August, 2010 (11)
July, 2010 (13)
June, 2010 (12)
May, 2010 (14)
April, 2010 (12)
March, 2010 (13)
February, 2010 (12)
January, 2010 (7)
December, 2009 (10)
November, 2009 (12)
October, 2009 (15)
September, 2009 (18)
August, 2009 (13)
July, 2009 (23)
June, 2009 (14)
May, 2009 (17)
April, 2009 (7)
March, 2009 (14)
February, 2009 (7)
January, 2009 (12)
December, 2008 (7)
November, 2008 (11)
October, 2008 (19)
September, 2008 (14)
August, 2008 (11)
July, 2008 (14)
June, 2008 (19)
May, 2008 (12)
April, 2008 (18)
March, 2008 (13)
February, 2008 (8)
January, 2008 (7)
December, 2007 (5)
November, 2007 (8)
October, 2007 (13)
September, 2007 (13)
August, 2007 (16)
July, 2007 (11)
June, 2007 (6)
May, 2007 (5)
April, 2007 (5)
March, 2007 (8)
February, 2007 (6)
January, 2007 (6)
December, 2006 (5)
November, 2006 (8)
October, 2006 (4)
August, 2006 (3)