Arulraj's Techitweaki: SQL Query Performance Tuning Part I

Dear Readers,

First I apologies for disappointing you by not
updating the blog regularly. I can blame my tight work
schedule; in this festival season. Recently I have
come across few books and web sites which talks about
SQL Query performance tuning. I found it interesting
and informative; go through it and leave your
comments.

SQL Query Performance Tuning

Even though relational databases and faster hardware
run most SQL queries with a significantly small
response time, there is always room for improvement.
When the data size is higher and concurrent hit on the
server is high the response time of the query really
matters in the performance of the application
irrespective of the technology in which it is built.

Performance tuning of the query has to be done to
reduce the response time. Performance tuning technique
will differ from situation to situation. Hence it is
purely a case to case basic. There is no common
performance tuning method for all situations.

Ex: Indexing is one of the performance tuning method
but higher the number of index will delay the response
time of all DML operations.

Following are few performances tuning techniques

Database statistics

The most important resource to any SQL optimizer is
the statistics collected for different tables within
the catalog. Statistics is the information about
indexes and their distribution with respect to each
other. Optimizer uses this information to decide the
least expensive path that satisfies a query. Outdated
or missing statistics information will cause the
optimizer to take a less optimized path hence
increasing the overall response time.

Following table lists SQL commands for different
database that is used to update statistics. For
further details, refer to the reference manual

Oracle: DBMS_UTILITY package

Optimizers always tend to select the least expensive
path \u2013 one that returns least number of rows in
fastest time. Why do optimizers rely on statistics?
Consider the following query
select *
from customer
where city = 'New York City'
and phone = '212-555-1212'
Notice that the above query contain two fields in the
"WHERE" clause and there are two indexes defined, each
containing one field. One very important notion to
remember is that the optimizer can only use ONE index
per table. Therefore, it has to make a decision as to
which index to use. Since phone number should return
least amount of rows, our query will run much faster
if the optimizer always uses IdxPhone. However, if
statistics are not updated, the optimizer does not
know which index is better and may decide to choose
IdxCity since 'city' field appears first in our WHERE
clause. Once you update statistics the database will
know more about the data distribution and will
correctly choose the better index to run your query.

Create optimized indexes

SQL optimizer heavily depends on indexes defined for
a particular table. Indexes are double-edged sword: no
index will degrade performance of your SELECT
statements and too many indexes will slow down your
DML (INSERT, UPDATE, and DELETE) queries. Therefore,
it is important to have a right balance of index on
tables. Besides the number of indexes, fields that are
involved and their order is also very important.

When creating indexes, estimate the number of unique
values the column(s) will have for a particular field.
For example, the idxCity index in our sample database
is not a good candidate for an index. When you wish to
search for customers in New York City, it can
potentially return thousands of rows, which are then
searched sequentially. Such indexes seldom help in
speeding up SELECT queries and reduce the response
time for DML queries.

Composite index - Indexes containing more than one
field are called composite index. Such indexes should
be created if you expect to run queries that will have
multiple fields in the WHERE clause and all fields
combined will give significantly less rows than the
first field alone.

For example, in our sample database the index
"IdxCustDate" is a composite index. This index is only
useful if the ratio between customers and the number
of their orders is high \u2013 meaning an average
customer places more than 1000 orders. Creating a
composite index on the Orders table when most
customers have placed only a handful of order wastes
not only spaces on the hard disk but have negative
impact on DML queries.

Clustered index - A clustered index determines the
physical order of data in a table - meaning the actual
data is sorted according to the fields in the index.
This is similar to a telephone directory, which
arranges data by last name. There can be only one
clustered index per table. These indexes are
particularly efficient on columns that are often
searched for range of values.
Although most databases support such index, they use
a different terminology. For example Oracle calls it
Index-Organized Table (IOT).

Arulraj's Techitweaki

Monday, January 29, 2007

SQL Query Performance Tuning Part I

No comments:

Blog Archive

Blogs I Read

About Me