jump to navigation

Oracle Performance Tuning-By Steve Callan November 6, 2008

Posted by haind in oracle.
Tags:
trackback

<From http://www.databasejournal.com/>

Part1:

Performance tuning is a broad and somewhat complex topic area when it comes to Oracle databases. Two of the biggest questions faced by your average DBA concern where to start and what to do. All you may know is that someone (a user) reports a problem about a slow or poor performing application or query. Where do you even begin to start when faced with this situation?

Oracle’s Approach to Tuning

For anyone who has taken the Performance Tuning exam for Oracle8i certification, one of the testable areas dealt with Oracle’s Tuning Methodology. Oracle’s emphasis on this particular methodology changed when Oracle9i was released. The approach has gone from top-down in 8i to that of following principles in 9i/10g. Neither methodology is absolute as each has its advantages and disadvantages. In Oracle8i, the steps consisted of the following:

1.  Tuning the Business Rules
2.  Tuning the Data Design
3.  Tuning the Application Design
4.  Tuning the Logical Structure of the Database
5.  Tuning Database Operations
6.  Tuning the Access Paths
7.  Tuning Memory Allocation
8.  Tuning I/O and Physical Structure
9.  Tuning Resource Contention
10. Tuning the Underlying Platform(s)

With Oracle9i’s principle-based approach, the principles, in order of priority, are:

Priority Description
First Define the problem clearly and then formulate a tuning goal.
Second Examine the host system and gather Oracle statistics.
Third Compare the identified problem to the common performance problems identified by Oracle in the Oracle9i Database Performance Methods (Release 1)/Database Performance Planning (Release 2)
Fourth Use the statistics gathered in the second step to get a conceptual picture of what might be happening on the system.
Fifth Identify the changes to be made and then implement those changes.
Sixth Determine whether the objectives identified in step one have been met. If they have, stop tuning. If not, repeat steps five and six until the tuning goal is met.

Reference: OCP: Oracle9i Performance Tuning Study Guide, SYBEX, Inc.

Interestingly, the emphasis on identifying which step an action falls under went away with Oracle9i, and recitation of the principles is not a testable item. The title of documentation even changed between releases one and two, and that should send a clear signal that the art of performance tuning (or, performance and tuning) is still just that – an art. When it comes to instance tuning, the steps are even further reduced in Oracle10g.

The performance tuning guide for Oracle10g (Release 2) identifies the overall process as The Oracle Performance Improvement Method. The steps have been expanded, but overall, remain the same.

1.  Perform the following initial standard checks:

a.   Get candid feedback from users. Determine the performance project’s scope and subsequent performance goals, as well as performance goals for the future. This process is key in future capacity planning.

b.  Get a full set of operating system, database, and application statistics from the system when the performance is both good and bad. If these are not available, then get whatever is available. Missing statistics are analogous to missing evidence at a crime scene: They make detectives work harder and it is more time-consuming.

c.   Sanity-check the operating systems of all machines involved with user performance. By sanity-checking the operating system, you look for hardware or operating system resources that are fully utilized. List any over-used resources as symptoms for analysis later. In addition, check that all hardware shows no errors or diagnostics.

2.  Check for the top ten most common mistakes with Oracle, and determine if any of these are likely to be the problem. List these as symptoms for later analysis. These are included because they represent the most likely problems. ADDM automatically detects and reports nine of these top ten issues. See Chapter 6, “Automatic Performance Diagnostics” and “Top Ten Mistakes Found in Oracle Systems”.

3.  Build a conceptual model of what is happening on the system using the symptoms as clues to understand what caused the performance problems. See “A Sample Decision Process for Performance Conceptual Modeling”.

4.  Propose a series of remedy actions and the anticipated behavior to the system, then apply them in the order that can benefit the application the most. ADDM produces recommendations each with an expected benefit. A golden rule in performance work is that you only change one thing at a time and then measure the differences. Unfortunately, system downtime requirements might prohibit such a rigorous investigation method. If multiple changes are applied at the same time, then try to ensure that they are isolated so that the effects of each change can be independently validated.

5.  Validate that the changes made have had the desired effect, and see if the user’s perception of performance has improved. Otherwise, look for more bottlenecks, and continue refining the conceptual model until your understanding of the application becomes more accurate.

6.  Repeat the last three steps until performance goals are met or become impossible due to other constraints.

The performance tuning guide for Oracle10g (Release 2)

The Change is Part of the Problem

The change from a top-down structured approach to a principle-based “make it stop hurting” one is part of the problem. Gathering statistics is obviously important because how else do you know if you have improved (or worsened) the problem? Still, to some degree with either approach, you are left with the original two questions: what do I look for, and how do I make it better? If the structured approach left you scratching your head, the principled approach only adds to the confusion.

What would help the novice tuner (disclaimer: I am far from being an expert) is a list of items or areas to evaluate (configure, diagnose, and tune) in each of the following areas:

  • Tuning the Buffer Cache
  • Tuning the Redo Log Buffer
  • Tuning the Shared Pool Memory
  • Tuning the Program Global Area
  • Optimizing Data Storage
  • Optimizing Tablespaces
  • Tuning Undo Segments
  • Detecting Lock Contention
  • Tuning SQL

These areas pretty much cover the Oracle RDBMS and instance from top to bottom. The remainder of this article will focus on tuning SQL, or more precisely, preventing slow SQL execution. Aren’t these the same thing? Mostly yes, but a common approach in development is making a statement perform well enough or fast enough. Each and every statement does not have to be optimal, but some thought has to go into coding them. You do not have the time to optimize hundreds or even thousands of SQL statements, but at the same time, there are guidelines you can follow to avoid common mistakes and bad coding.

17 Tips for Avoiding Problematic Queries

The source of these 17 tips is from Oracle9i Performance Tuning: Optimizing Database Productivity by Hassan Afyouni (Thompson Course Technology, 2004). These tips provide a solid foundation for two outcomes: making a SQL statement perform better, and determining that nothing else can be done in this regard (i.e., you have done all you can with the SQL statement, time to move on to another area).

The 17 tips are listed below.

1.  Avoid Cartesian products
2.  Avoid full table scans on large tables
3.  Use SQL standards and conventions to reduce parsing
4.  Lack of indexes on columns contained in the WHERE clause
5.  Avoid joining too many tables
6.  Monitor V$SESSION_LONGOPS to detect long running operations
7.  Use hints as appropriate
8.  Use the SHARED_CURSOR parameter
9.  Use the Rule-based optimizer if I is better than the Cost-based optimizer
10. Avoid unnecessary sorting
11. Monitor index browning (due to deletions; rebuild as necessary)
12. Use compound indexes with care (Do not repeat columns)
13. Monitor query statistics
14. Use different tablespaces for tables and indexes (as a general rule; this is old-school somewhat, but the main point is reduce I/O contention)
15. Use table partitioning (and local indexes) when appropriate (partitioning is an extra cost feature)
16. Use literals in the WHERE clause (use bind variables)
17. Keep statistics up to date

That is quite a list and overall is thorough and accurate. Step 9, referring to the use of the Rule-based optimizer, may cause a reliance or dependency on a feature Oracle has identified as a future item to be deprecated. You are eventually going to have to solve the problem using the CBO, so you may as well start now and forget about the RBO. Step 14 should be changed to something along the lines of “reduce I/O contention” instead of its currently stated “separate index and table tablespaces” guidance.

In Closing

In the next article of this series, we will look at some specific steps of these tips. For example, advice given on many Web sites about how to improve a SQL statement’s performance typically includes “use bind variables.” Well, I am sure many people have this question: “How, exactly, do I do that?” It is actually pretty simple, as are many of the details of how to use many of these tips.

Part2:

As mentioned in Part 1, there are several relatively easy steps you can take to improve performance. From the user’s perspective, one of the most frequently used interfaces with a database involves SQL statements, so getting a handle on them is a good place to start in terms of being able to see an immediate improvement.

In the interest of being complete, I will cover some preliminary steps that will be needed in order to view what is taking place. These steps include running the plustrce SQL script, creating an “EXPLAIN_PLAN” table, granting a role, and configuring your SQL*Plus environment to see execution plans. All of these steps are covered in “Using Autotrace in SQL*Plus” in Oracle9i Database Performance Tuning Guide and Reference Release 2 (9.2). For Oracle10g, the steps are covered in “Tuning SQL*Plus” in SQL*Plus® User’s Guide and Reference Release 10.2.

Preliminary Steps

If the PLUSTRACE role does not exist, create it using the PLUSTRCE SQL script found in ORACLE_HOME\sqlplus\admin. The script is pretty simple:

drop role plustrace;
create role plustrace;

grant select on v_$sesstat to plustrace;
grant select on v_$statname to plustrace;
grant select on v_$mystat to plustrace;
grant plustrace to dba with admin option;

Check for the role using:

SQL> select role from dba_roles where role = 'PLUSTRACE';

ROLE
----------------
PLUSTRACE

The user must have (or have access to) a PLAN_TABLE (it can named something else, but for now, the “default” name is fine). This table is created using the UTLXPLAN SQL script found in ORACLE_HOME\rdbms\admin.

SQL> show user
USER is "SYSTEM"
SQL> @?\rdbms\admin\utlxplan

Table created.

SQL> create public synonym plan_table for system.plan_table;

Synonym created.

SQL> grant select, update, insert, delete on plan_table to <your user name>;

Grant succeeded.

SQL> grant plustrace to <your user name>;

Grant succeeded.

The user for these examples is HR (found in the sample schemas provided by Oracle).

SQL> conn hr/hr
Connected.
SQL> set autotrace on
SQL> select * from dual;

D
-
X

With autotrace set to on, you can confirm your ability to see an execution plan and some statistics. You should see output similar to the following:

Execution Plan
----------------------------------------------------------
   0    SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=2)
   1    0 TABLE ACCESS (FULL) OF 'DUAL' (TABLE) (Cost=2 Card=1 Bytes=2)

Statistics
----------------------------------------------------------
         24  recursive calls
          0  db block gets
          6  consistent gets
          1  physical reads
          0  redo size
        389  bytes sent via SQL*Net to client
        508  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

To suppress the results of the query, use “traceonly” in the set statement.

Using Bind Variables

On any number of DBA help type of Web sites, a frequently seen bit of advice is to use bind variables, but rarely are the steps or instructions for this step included. Here is a simple way to create and use a bind variable.

SQL> variable department_id number
SQL> begin
  2  :department_id := 80;
  3  end;
  4  /

PL/SQL procedure successfully completed.

SQL> print department_id

DEPARTMENT_ID
-------------
           80

Now let’s make a comparison between querying for employee ID and name with and without the bind variable (with the output turned off using traceonly).

Now let’s use the bind variable.

Okay, so the difference isn’t that great (the cost went from 3 to 2), but this was a small example (the table only has 107 rows). Is there much of a difference when working with a larger table? Use the SH schema and its SALES table with its 900,000+ rows.

SQL> select prod_id, count(prod_id)
  2  from sales
  3  where prod_id > 130
  4  group by prod_id;

Same query, but this time using a bind variable.

SQL> variable prod_id number
SQL> begin
  2  :prod_id := 130;
  3  end;
  4  /

PL/SQL procedure successfully completed.

SQL> print prod_id

   PROD_ID
----------
       130

SQL> select prod_id, count(prod_id)
  2  from sales
  3  where prod_id > :prod_id
  4  group by prod_id;

The cost went from 540 to 33, and that is fairly significant. One of the main benefits is that the query using the bind variable, that is, the work done parsing the query, stays the same each and every time. All you have to do is substitute a new value for the variable.

Use Efficient SQL

Suppose you have a choice between the following two queries (using the HR schema again):

Query 1

select d.department_id,
 d.department_name,
 r.region_name
from departments d,
 locations l,
 countries c, regions r
where d.location_id=l.location_id
and l.country_id=c.country_id
and c.region_id=r.region_id;

and

select department_id,
 department_name,
 region_name
from departments natural join locations
natural join countries natural join regions;

This leads to four questions.

1.  Are these queries querying for the same result set?

2.  If they are the same, would you expect any difference in their execution plans?

3.  If the plans are the same, what is it that makes these queries different?

4.  Can anything be done to improve the cost?

The answer to the first question is yes, they are the same. The answer to the second question is no, not really, because the same steps are involved in terms of joining tables. The answer to the third question has to do with the amount of typing or coding involved.

The use of the “natural join,” “join on” and “right/left outer join” keywords is what matters in this example. If you understand what a natural join is (still joining two tables, but the column names involved are the same), doesn’t it look easier to use the second query?

The proof of the answer to the second question is shown below.

Query 1’s Execution Plan

Query 2’s Execution Plan

As for the answer to the last question, efficient SQL can mean different things to different people. In this case, what about using a view? Will the cost be any different from either of the original queries (you can see for yourself what the answer is), or are there other considerations to take into account?

Suppose we have a view named cost_example, created as follows:

create or replace view cost_example
as
select department_id, department_name, region_name
from departments natural join locations
natural join countries natural join regions;

Let’s look at a record in the view.

SQL> select department_id, department_name, region_name
  2  from cost_example
  3  where department_id=70;

DEPARTMENT_ID DEPARTMENT_NAME                REGION_NAME
------------- ------------------------------ ------------
           70 Public Relations               Europe

Out of the three columns or fields, can any of them be changed? If so, why? If not, why not?

Let’s suppose the region name is now Asia instead of Europe.

SQL> update cost_example
  2  set region_name = 'Asia'
  3  where region_name = 'Europe';
set region_name = 'Asia'
    *
ERROR at line 2:
ORA-01779: cannot modify a column which maps to a non key-preserved table

Can the department name be changed?

SQL> update cost_example
  2  set department_name = 'PR'
  3  where department_name = 'Public Relations';

1 row updated.

The reason why the record in the view can be updated (the department name, anyway) is that DEPARTMENTS is a key-preserved table (its primary key DEPARTMENT_ID was used in the creation of the view).

The point of this example is this: just because you obtain the lowest cost does not mean you cannot do anything else to make a query better. Better, in this case, applies to developers using simpler join constructs, and applies to users in that providing views for their use saves you the effort of having to explain how to do complex joins. The caution on views is to keep track of key-preserved versus non key-preserved tables so that what you intend to be modifiable is indeed just that.

In Closing

The main points of this article are:

  • Use bind variables
  • Use efficient SQL
  • Use coding standards
  • Consider the technical or SQL know-how of your user population and create views as appropriate

None of these steps is especially difficult to perform or implement. For programmers used to using the “tableA.column_name = tableB.column_name” format for joins, moving to the use of natural joins saves quite a bit of typing, plus there is the benefit of having key column names match up (the foreign key column in the child table has the same column name as the primary key in the parent table). As shown, some measures may not have a big impact, but when taken as a whole, every little bit helps to improve performance. In Part 3, we will look at more examples of steps you can take to improve performance.

Part3:

As mentioned in Parts 1 and 2, there are several relatively easy steps you can take to improve performance. One of those steps involves using an automated tool to “guide” you in writing SQL statements. There are several vendors who manufacture analysis or performance tuning tools, and in the next two articles, we will look at one of them.

Quest Software

Many DBAs and developers use a tool named Toad, produced by Quest Software. According to a statement at Quest Software’s Web site, the Toad user community numbers around 500,000 users. One of Toad’s features is its ability to optimize SQL queries. In other words, Oracle Corporation does not own the market on tuning advisor type of tools.

Knowing that you have several choices with respect to advisory tools, and understanding what it is they do, what do you use them on if you are not working in a production or development environment? And perhaps just as likely, even if you are in a development environment, you may not have any bulk data to use. Generating bulk or large amounts of data is the focus of this article, and the tool we will look at for this purpose is another Quest Software product: DataFactory® for Oracle.

DataFactory

The purpose of DataFactory is to “quickly create meaningful test data for multiple database platforms.” The platforms include Oracle, DB2, Sybase and any ODBC compliant database. Normally retailing at $595 per server, a free 30-day version is available for download at Quest Software’s Web site.

To obtain the software (current version of 5.5.0), you must register using a “real” email address. Hotmail and Gmail addresses were rejected, but Comcast went through just fine. Once you register, you will receive an email that contains a key to unlock the application and start the 30-day free trial clock.

The installation process is quick and straightforward. If you are running Microsoft AntiSpyware, you may receive one or more errors. Disable the real time protection and attempt to reinstall DataFactory.

Creating Tutorial Objects

An excellent way to get to know the application is to use its tutorial objects. The general process is to:

  1. Create a project
  2. Create tables in a schema
  3. Run a script to load data

Unfortunately, an excellent way to get a refresher on disabling system-named referential constraints is to use the built-in tutorial objects. Using an iterative process, you can disable constraints one at a time until the load script runs through without error. However, while Quest is working on fixing this bug, we can take a short excursion into identifying and disabling a constraint.

After starting DataFactory, you can choose to start the tutorial. The instructions on how to load the tutorial objects (as is all help) are in HTML files.

The instructions from the help system state that the tables are populated. That is not correct. The tables (15 in all, named using a “DF_” prefix) are populated after an additional step.

Prior to creating the tables, you may want to create a separate schema in your database. Using Oracle10g, I created a user/schema owner named quest (granting connect and resource will be sufficient). You will be prompted for a username/password combination and database information.

So, following Tools>Create Tutorial Obects –

- the Tutorial Setup Wizard appears (with its own version of the Oracle logo).

A list of tables appears on the Finish Page.

Upon successful creation, DataFactory tells you so.

The project folder appears in the left frame

Click the Run button on the main menu. The ORA-02291 integrity constraint violated error will appear quite a few times (some tables more than once) because the loaded data in a foreign key-designated column does not correspond with data in a parent table. Almost all of the constraints use the SYS_Cxxxxxx naming structure, meaning they are not explicitly named.

To work around the integrity constraint violation, you can disable the constraint (once you know which table to alter). The query and ALTER TABLE statement below show one method to identify and disable the problem constraint.

SQL> select owner, constraint_name, table_name, column_name
  2  from all_cons_columns
  3  where constraint_name like '%9814%';
OWNER CONSTRAINT_NAME                TABLE_NAME           COLUMN_NAME
----- ------------------------------ -------------------- ------------
QUEST SYS_C009814                    DF_ORDERS            CUSTID
SQL> alter table df_orders
  2  disable constraint sys_c009814;
Table altered.

The Results window showing that your project-named script has completed successfully means we are ready to start looking around at what was created.

Instead of analyzing each table one at a time, use the DBMS_STATS built-in (which Oracle recommends to use for most analyze operations). If you are using Oracle10g, you may want to add a WHERE dropped=’NO’ to prevent dropped tables from appearing in queries on user_tables or tab (as an example).

SQL> execute dbms_stats.gather_schema_stats('QUEST');
PL/SQL procedure successfully completed.
SQL> select table_name, num_rows
  2  from user_tables
  3  where dropped='NO';
TABLE_NAME             NUM_ROWS
-------------------- ----------
DF_TITLES                   100
DF_MOVIE_CUSTOMER          1100
DF_MOVIE_EMPLOYEE           900
DF_DUMMY                   1100
DF_AUTHORS_TITLES          1100
DF_MOVIE_RENTAL             700
DF_PRODUCTS                 100
DF_MOVIE_TAPE               400
DF_CUSTOMERS               1100
DF_AUTHORS                 1100
DF_MOVIE_DISTRICT          1100
DF_ORDERS                   101
DF_MOVIE_MOVIE              900
DF_ORDERDETAILS             200
DF_MOVIE_STORE              500
15 rows selected.

Back in the project hierarchy or list of tables, selecting a table in the list will show its columns and their datatypes (you may have to toggle between the Children and Results tabs at the bottom of the application).

Using the DF_MOVIE_CUSTOMER table as an example, how does its data look? The “random characters” option definitely produces exactly that.

More on the Tutorial Tables

Are there any indexes on the foreign key columns?

SQL> select index_name, table_name, column_name, column_position
  2  from user_ind_columns;
INDEX_NAME   TABLE_NAME           COLUMN_NAME          COLUMN_POSITION
------------ -------------------- -------------------- ---------------
SYS_C009823  DF_MOVIE_DISTRICT    DISTRICTID                         1
SYS_C009827  DF_MOVIE_STORE       STOREID                            1
SYS_C009830  DF_MOVIE_EMPLOYEE    EMPID                              1
SYS_C009837  DF_MOVIE_CUSTOMER    CUSTID                             1
SYS_C009841  DF_MOVIE_MOVIE       MOVIEID                            1
SYS_C009845  DF_MOVIE_TAPE        TAPEID                             1
SYS_C009850  DF_MOVIE_RENTAL      TAPEID                             1
SYS_C009850  DF_MOVIE_RENTAL      CUSTID                             2
SYS_C009850  DF_MOVIE_RENTAL      RENTDATE                           3
SYS_C009810  DF_CUSTOMERS         CUSTID                             1
SYS_C009813  DF_ORDERS            ORDERID                            1
SYS_C009816  DF_PRODUCTS          PRODUCTID                          1
SYS_C009819  DF_ORDERDETAILS      ORDERID                            1
SYS_C009819  DF_ORDERDETAILS      PRODUCTID                          2
14 rows selected.

What does the output suggest? You can immediately identify the fact that for one, not every table has a primary key. There are 15 tables, but only 14 rows (or 11 distinct tables as some indexes are composites). Why do we know this? Because one of the benefits of creating a primary key is that you get an index for free. If you disabled all of the referential integrity constraints that arise in the load script, what else might you suspect?

Oracle recommends indexing foreign key columns as they are frequently used in joins, and the general rule is to index columns used in WHERE clauses (which is obviously where joins are listed). Given the lack of indexes, your suspicion should be that the “create table” component of the tutorial tables does not index foreign key columns.

The query below shows the table name/column name foreign keys (they all have position 1 meaning only a single column was used).

SQL> select a.constraint_name, b.constraint_type,
  2  a.table_name, a.column_name
  3  from user_cons_columns a, all_constraints b
  4  where a.constraint_name=b.constraint_name
  5  and constraint_type = 'R';
CONSTRAINT_NAME  C TABLE_NAME           COLUMN_NAME
---------------- - -------------------- -------------
SYS_C009831      R DF_MOVIE_EMPLOYEE    SUPERVISORID
SYS_C009828      R DF_MOVIE_STORE       DISTRICTID
SYS_C009821      R DF_ORDERDETAILS      PRODUCTID
SYS_C009820      R DF_ORDERDETAILS      ORDERID
DFMOVIESTOREFK2  R DF_MOVIE_STORE       MANAGERID
SYS_C009852      R DF_MOVIE_RENTAL      TAPEID
SYS_C009851      R DF_MOVIE_RENTAL      CUSTID
SYS_C009838      R DF_MOVIE_CUSTOMER    STOREID
SYS_C009814      R DF_ORDERS            CUSTID
SYS_C009846      R DF_MOVIE_TAPE        MOVIEID
DFMOVIEEMPFK2    R DF_MOVIE_EMPLOYEE    STOREID
11 rows selected.

The end result? Suspicion confirmed; the foreign keys were not indexed.

From a management or maintenance perspective, why are only two of the referential integrity constraints explicitly named while the rest are system named? As it turns out, of the 51 constraints in this schema, these happen to be the only two user named constraints.

In Closing

The key point to take away from this exploration of a tool such as DataFactory is that while the tool (or a script you create) can generate millions of rows of test or sample data, what good is any of it if it misses the boat on referential integrity or other best practices with respect to data modeling? If you are trying to tune queries for an application, the test data needs to reflect how the application uses it. If you are relying on referential integrity, your test data needs to support and honor parent table-child table relationships.

From a design standpoint, two best practices that were violated include failure to index foreign key columns, and failure to explicitly name three major items (primary keys, foreign keys, and indexes). A possible third violation concerns not having a primary key on every table. Does every table in a schema require a primary key? No, but for the most part, every table (to support normalization) should, and if not, you should at least know why. Put another way, to not normalize a table should be a conscious decision, not an oversight.

Comments»

No comments yet — be the first.