A DB2 application requires constant interaction among disparate computing resources to operate efficiently and according to specifications. Realistically, though, the tuning of a DB2 application can be broken down into three components:

  1. system tuning,
  2. database tuning, and
  3. application tuning.

Indeed, all of these areas are related, and database performance management requires an integrated approach to tuning across all of them. But for the sake of clarity, let’s examine each of the three components of database application performance in turn.

System Tuning

A DBMS operates within the context of a much larger environment that consists of other software and hardware components. Each of these components must be installed, configured, and managed effectively for the DBMS to function as desired. The DBA needs to understand how the DBMS interacts with the server hardware, the operating system, and any other required software. Tuning and configuring these components and connections properly can have a dramatic impact on system performance.

System tuning occurs at the highest level and can have the greatest impact on the overall health of database applications. The reason for this is that every database application runs on the overall system. No amount of tuning is going to help a database application when the server it is running on is short on resources or improperly installed.

With DB2 for z/OS, system tuning focuses on monitoring and optimizing the DB2 subsystem and the resources it uses to enable DB2 applications to perform. The way in which DB2 is installed, and its usage of memory, storage, CPU, and other resources, can, and will, impact database application performance.

System tuning typically involves configuring database system parameters. For DB2 for z/OS, these parameters are commonly known as DSNZPARMs (aka zparms). The zparms are system parameters that control the overall behavior of a DB2 subsystem. There are seven macros that, once assembled, produce the DSNZPARMs: DSN6ENV, DSN6ARVP, DSN6LOGP, DSN6FAC, DSN6GRP, DSN6SYSP, and DSN6SPRM.
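
To give a flavor of what this looks like, here is a simplified fragment of the kind of DSN6SYSP macro invocation generated by the installation job (DSNTIJUZ). The parameter values shown are purely illustrative, and the actual assembler source uses continuation characters in column 72:

DSN6SYSP CTHREAD=200,        MAXIMUM CONCURRENT THREADS
         IDFORE=40,          MAXIMUM TSO FOREGROUND USERS
         IDBACK=20,          MAXIMUM BATCH CONNECTIONS
         CHKFREQ=5           CHECKPOINT FREQUENCY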

Great care must be taken when modifying system parameters. Setting even one DSNZPARM inappropriately for your system can have a huge negative impact on performance. And relying on the defaults is a surefire way to implement a sub-optimal DB2 environment. As you install DB2, the process will guide you to select characteristics of your environment and will build the DSNZPARMs for you. But this is helpful only at the beginning. You will need to monitor your system and modify zparms as needed based on usage and requirements. For this reason it is important to understand all of the DSNZPARMs, the options available for setting each zparm, and the impact each can have.

The sheer volume of DSNZPARM parameters can be overwhelming. Some of them are more visible than others. For example, most DBAs know about the locking parameters (NUMLKUS – maximum locks per user; NUMLKTS – maximum locks per user per table space) and some of the others like:

  • CTHREAD – maximum concurrent users (threads)
  • CONDBAT, MAXDBAT – maximum remote connections and maximum active remote threads (DBATs)
  • IDFORE, IDBACK – maximum TSO foreground and batch connections
  • DSMAX – maximum number of open data sets

But it is challenging, if not impossible, to keep all of the various parameters and their impacts in your head. For this reason, be sure to keep a copy of the Installation and Migration Guide handy; for DB2 11 for z/OS this is manual number GC19-4056. This manual contains a list of all the DSNZPARMs with links to descriptions of each parameter.
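
If you are running DB2 10 or later, you can also retrieve the current settings via SQL using the SYSPROC.ADMIN_INFO_SYSPARM table function, instead of hunting through assembler listings. A sketch along these lines should work; verify the exact signature and column names for your version (passing NULL requests the current member):

SELECT MACRO, PARAMETER, VALUE
FROM   TABLE(SYSPROC.ADMIN_INFO_SYSPARM(CAST(NULL AS VARCHAR(8)))) AS T
WHERE  PARAMETER IN ('NUMLKUS', 'NUMLKTS', 'CTHREAD', 'DSMAX');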

We won’t look at all of the zparms today, but let’s take a look at a few important areas.

Memory configuration is perhaps the single most important aspect of system tuning. The DBA (and perhaps the system programmer) must ensure that DB2 has adequate memory assigned for buffering data, program elements, sorting, and other internal structures. EDM storage is set using DSNZPARMs. In the early days of DB2 there was a single EDM pool, but over the course of the past few releases IBM has broken it into a series of separate pools. The EDM pools are used to maintain DBDs, cursor tables (for plans), package tables, the authorization cache, and the dynamic statement cache needed by executing SQL statements.
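
The EDM-related storage sizes are specified in the DSN6SPRM macro. A simplified fragment might look like the following; the values (in KB) are illustrative only:

DSN6SPRM EDMDBDC=23400,            DBD CACHE
         EDMSTMTC=113386,          DYNAMIC STATEMENT CACHE
         EDM_SKELETON_POOL=10240   SKELETON POOL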

Buffer pools used to be controlled in the zparms, but several releases ago DB2 gained the ability to monitor and modify buffer pool sizes using DISPLAY and ALTER commands.
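
For example, the following commands display detailed statistics for a buffer pool and then resize it; the pool name and size (in buffers) are illustrative:

-DISPLAY BUFFERPOOL(BP1) DETAIL
-ALTER BUFFERPOOL(BP1) VPSIZE(40000)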

Another significant system bottleneck can be the log. Every modification must be logged by DB2, so an inefficient log can impact every application that modifies data. An important aspect to set up correctly, then, is the frequency of checkpointing, which is controlled using the CHKFREQ zparm.

The default for this parameter used to be 500,000, which meant that DB2 would take a checkpoint after writing 500,000 log records. But this was not ideal for many environments. Do you know how much time it takes to write that many log records in your shop? Probably not.

So over the course of the past several releases, IBM changed things. As of DB2 V7, the CHKFREQ parameter was modified to accept a number of minutes instead of a number of log records, if you so choose. This helped because the general advice was to take a checkpoint every 5 minutes during peak processing; now we can simply set it to 5 minutes instead of trying to figure out how many log records would be processed in that time.

Today, as of DB2 11 for z/OS, we can take checkpoints based on the number of log records written (CHKLOGR), the number of minutes since the last checkpoint (CHKMINS), or even both if so desired (controlled by the CHKTYPE parameter).
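
Checkpoint frequency can also be changed dynamically, without modifying the zparms, by using the SET LOG command; for example, to checkpoint every 5 minutes:

-SET LOG CHKTIME(5)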

It is also important to set your buffer pool deferred write thresholds (DWQT, VDWQT) correctly. These parameters control asynchronous writes. When the percentage of updated pages in the entire buffer pool reaches DWQT, or the percentage of updated pages for a single data set reaches VDWQT, DB2 begins writing the data from the buffers to disk.
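
These thresholds are likewise set with the ALTER BUFFERPOOL command. For example, the following (with illustrative values) sets the deferred write threshold to 30 percent of the pool and the vertical deferred write threshold to 5 percent of the pool per data set:

-ALTER BUFFERPOOL(BP1) DWQT(30) VDWQT(5,0)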

System tuning also includes managing and connecting other systems software with which the DBMS interacts. This can include the operating system, networking software, message queuing systems, other middleware, and transaction processors.

To summarize, database system tuning comprises installation, configuration, and integration issues, as well as ensuring connectivity of the software to the DBMS and database applications.

Database Tuning

Database application performance can be impacted by the physical design of the database, including normalization issues, how the database is stored on disk, number of tables, index design, and proper usage of DDL and its associated parameters. The actual physical location of database files on disk systems will have an impact on the performance of applications accessing the data. As more data is stored on the same disk device the possibility of concurrent access and performance degradation increases.

But design is not the only component of database performance. The organization of the database will change over time. As data is inserted, updated, and deleted, the efficiency of the database will degrade. Running the REORG utility can remedy disorganization issues such as unclustered data, rows positioned near to or far from their ideal page location (indirect references), gaps due to deleted data, and so on. Knowing when to run the REORG utility is the key to optimizing your DB2 databases at the right time.
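
For example, a basic REORG utility statement can be as simple as the following; the database and table space names are illustrative, and options such as SHRLEVEL should be chosen to match your availability requirements:

REORG TABLESPACE DB1.TS1 SHRLEVEL CHANGE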

Real Time Statistics (RTS) can help determine when to reorganize a table space. You can examine metrics such as space allocated, extents, number of INSERTs, UPDATEs, and DELETEs since the last REORG or LOAD REPLACE, number of unclustered INSERTs, number of disorganized LOBs, and number of near and far indirect references created since the last REORG. The following query can be helpful here:

SELECT   DBNAME, NAME, PARTITION, SPACE,
         EXTENTS, REORGLASTTIME, REORGINSERTS,
         REORGDELETES, REORGUPDATES,
         REORGINSERTS+REORGDELETES+REORGUPDATES
              AS TOTAL_CHANGES,
         REORGDISORGLOB, REORGUNCLUSTINS,
         REORGMASSDELETE, REORGNEARINDREF,
         REORGFARINDREF
FROM     SYSIBM.TABLESPACESTATS
ORDER BY DBNAME, NAME, PARTITION;

You might want to add a WHERE clause that limits the table spaces returned to just those that exceed a particular limit. For example:

Specify                                                  Description
WHERE EXTENTS > 20                                       Table spaces having more than 20 extents
WHERE REORGINSERTS+REORGDELETES+REORGUPDATES > 100000    Table spaces with more than 100,000 changes
WHERE REORGFARINDREF > 50                                Table spaces with more than 50 far indirect references

(Note that the TOTAL_CHANGES column alias from the previous query cannot be referenced in a WHERE clause, so the full expression is repeated.)

Another way to get more creative with your RTS queries is to build formulas into them to retrieve only those table spaces that need to be reorganized. For example, the following query will return only those table spaces having more than 10 percent of their rows as near or far indirect references (the TOTALROWS > 0 predicate guards against dividing by zero for empty table spaces):

SELECT   DBNAME, NAME, PARTITION, SPACE,
         EXTENTS
FROM     SYSIBM.TABLESPACESTATS
WHERE    TOTALROWS > 0
AND      (((REORGNEARINDREF + REORGFARINDREF)
             *100
           )/TOTALROWS
          ) > 10
ORDER BY DBNAME, NAME, PARTITION;

Of course, you can change the percentage as you wish. After running the query you have a list of table spaces meeting your criteria for reorganization.

Additionally, the files used to hold the data of the database may need to expand as data is added. Or perhaps, additional files, or file extents, will need to be allocated. Both disorganization and file growth can degrade performance.

Indexes also need to be monitored, analyzed, and tuned to optimize data access, and to ensure that they are not having a negative impact on data modification. Similar queries can be built against the SYSIBM.INDEXSPACESTATS RTS table to determine when to reorganize your indexes.
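
For example, a query along the following lines (with an illustrative threshold) can identify index spaces with many far-away leaf pages, a classic indicator that an index REORG is warranted; verify the column names for your version:

SELECT   DBNAME, INDEXSPACE, PARTITION,
         REORGINSERTS, REORGDELETES,
         REORGLEAFNEAR, REORGLEAFFAR
FROM     SYSIBM.INDEXSPACESTATS
WHERE    REORGLEAFFAR > 100
ORDER BY DBNAME, INDEXSPACE, PARTITION;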

Application and SQL Tuning

The application code itself must be designed appropriately and monitored for efficiency. In fact, as much as 70% to 80% of performance problems are caused by improperly coded database applications, and SQL is the primary culprit. Coding efficient SQL statements can be complicated. Developers need to be taught how to properly formulate SQL statements, and SQL statements must be constantly monitored and tuned.

Programmers need to be schooled in the practice of examining SQL access paths. Every relational DBMS allows the programmer to request information on how the database optimizer will satisfy each query. Will an index be used? In what order will the tables be joined? Will the query be broken up into parallel tasks or not? These and many other factors influence the efficiency of SQL.
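
With DB2 for z/OS, for example, the EXPLAIN statement externalizes the chosen access path into a PLAN_TABLE, which can then be queried. A minimal sketch, assuming a PLAN_TABLE exists under your authorization ID (the table and column names in the explained query, and the QUERYNO value, are illustrative):

EXPLAIN PLAN SET QUERYNO = 100 FOR
   SELECT EMPNO, LASTNAME
   FROM   EMP
   WHERE  SALARY > 50000;

SELECT   QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
         ACCESSTYPE, ACCESSNAME, MATCHCOLS, INDEXONLY
FROM     PLAN_TABLE
WHERE    QUERYNO = 100
ORDER BY QBLOCKNO, PLANNO;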

SQL is simple to learn and easy to start using. But SQL tuning and optimization is an art that takes years to master. Some general rules of thumb for creating efficient SQL statements include:

  • Let SQL do the work instead of the application program. For example, code an SQL join instead of two cursors and a programmatic join (a sketch follows this list).
  • Simpler is generally better, but complex SQL can be very efficient.
  • Retrieve only the columns required, never more.
  • Retrieve the absolute minimum number of rows by specifying every WHERE clause that is appropriate.
  • When joining tables, always provide join predicates. In other words, avoid Cartesian products.
  • Favor using Stage 1 and indexable predicates.
  • If possible, avoid sorting by creating indexes for ORDER BY, GROUP BY and DISTINCT operations.
  • Avoid "black boxes" – that is, avoid I/O routines that are called by programs instead of using embedded SQL.
  • Avoid deadlocks by updating tables in the same sequence in every program.
  • Issue data-modification statements (INSERT, UPDATE, DELETE) as close to the COMMIT statement as possible.
  • Be sure to build a COMMIT strategy into every batch program that changes data. Failing to COMMIT can cause locking problems.
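
As an example of the first rule of thumb, the following sketch (using illustrative table and column names) lets DB2 perform the join rather than matching the rows of two cursors in application logic:

SELECT   E.EMPNO, E.LASTNAME, D.DEPTNAME
FROM     EMP E
         INNER JOIN DEPT D
            ON E.WORKDEPT = D.DEPTNO
WHERE    E.SALARY > 50000;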

These are just some of the rules of thumb that can help you to become a more efficient SQL coder when writing DB2 applications. As you become more experienced with DB2 SQL, you will learn and develop more guidelines for improving efficiency.

But not all application problems are due to improperly coded SQL. The host language application code in which the SQL has been embedded also can be inefficient. For example, Java, COBOL, C++, or Visual Basic code may be inefficient, causing database application performance to suffer.

Summary

Managing the performance of your database applications requires in-depth monitoring of each of these three areas: the system, the database, and the application. Be sure to allocate an appropriate budget to acquire performance management tools for each of these areas to ensure the efficiency of your database systems and applications.