Antognini - CBO Configuration Roadmap

Page 1

CBO: A Configuration Roadmap

Christian Antognini Trivadis AG Hotsos Symposium 2005 6-10 March, Dallas (USA)


Who am I >

Senior consultant and trainer at Trivadis AG in Zurich, Switzerland » christian.antognini@trivadis.com » www.trivadis.com

>

>

» Logical and physical database design » Application performance management » Integration with J2EE

Christian Antognini Be in the know.

CBO: A Configuration Roadmap

Focus: get the most out of Oracle

>

10 years experience with Oracle

2

© 2004


Disclaimer >

The CBO changes from release to release

>

It’s difficult to give general advice about many releases

>

Therefore, this presentation focuses on the most important versions at present » Oracle9i Release 2 » Oracle 10g Release 1

>

When information applies to only one version, one of the following icons will be used

CBO: A Configuration Roadmap

3

© 2004


Correctly configuring the CBO (1) >

At least since Oracle9i the CBO works well, i.e. it generates good execution plans for most SQL statements

>

This is only true when

>

» It is correctly configured » The database has been designed to take advantage of all its features

CBO: A Configuration Roadmap

Correctly configuring the CBO is not an easy job

This is no good reason for not doing it however!

4

© 2004


Correctly configuring the CBO (2) >

There is no single configuration that is good for every system » Each application has its own requirements » Each system has its own characteristics

>

Therefore, I’m not able to provide you with the “magic configuration”…

>

I can only show you how I proceed to do such a configuration

CBO: A Configuration Roadmap

5

© 2004


Configuration = INIT.ORA parameters + statistics >

Statistics

>

» Object statistics » System statistics >

>

» » » » »

Basic » » » » »

OPTIMIZER_MODE OPTIMIZER_FEATURES_ENABLE OPTIMIZER_DYNAMIC_SAMPLING OPTIMIZER_INDEX_COST_ADJ OPTIMIZER_INDEX_CACHING

>

» QUERY_REWRITE_ENABLED » QUERY_REWRITE_INTEGRITY » STAR_TRANSFORMATION_ENABLED 6

WORKAREA_SIZE_POLICY PGA_AGGREGATE_TARGET HASH_AREA_SIZE SORT_AREA_SIZE BITMAP_MERGE_AREA_SIZE

Miscellaneous » » » » »

Query transformation

CBO: A Configuration Roadmap

Memory

DB_FILE_MULTIBLOCK_READ_COUNT CURSOR_SHARING SKIP_UNUSABLE_INDEXES PARALLEL_* Many undocumented parameters…

© 2004


Set the right parameter! (1) >

Each INIT.ORA parameter has been introduced for a specific reason!

>

Understanding how it changes the behavior of the CBO is essential

>

Don’t tweak the configuration randomly, instead » Understand the current situation » Define the goal to be achieved » Find out which parameter should be changed to achieve the goal

CBO: A Configuration Roadmap

7

© 2004


Choose the right tool!

CBO: A Configuration Roadmap

8

Š 2004


Set the right parameter! (2)

DE X

_C

AC

HI NG

NA M IC _S AM PL

OP TI M

IZ

ER _M

OD E

QUERY_REWRITE_ENABLED

CBO: A Configuration Roadmap

AD _C OU NT

statist ics

9

M TI P O

ER IZ

_

S RE U AT E F

D LE B NA _E

ET RG TA E_ ICY AT OL EG _P GR IZE _S AG A E SIZE A_ R EA_ KA PG _AR OR RGE W ME AP_ BITM

IN G

system

_R E

HASH_AR EA_SIZE

OP TI M IZ ER _D Y

M UL TI BL OC K

obj ect stat istic s

R_ IN

DJ COST_A _INDEX_

OP TIM IZE

_F ILE _

SO RT _A RE A_ SIZ E

ER OPTIMIZ

DB

© 2004


Configuration roadmap

gather system and object statistics

Once set to their optimal value they should not be modified further; the only exception is when tweaking the index related parameters does not achieve good results!

OPTIMIZER_MODE OPTIMIZER_FEATURES_ENABLE DB_FILE_MULTIBLOCK_READ_COUNT OPTIMIZER_DYNAMIC_SAMPLING WORKAREA_SIZE_POLICY PGA_AGGREGATE_TARGET

auto

manual

HASH_AREA_SIZE SORT_AREA_SIZE BITMAP_MERGE_AREA_SIZE

test the application no

most of the execution plans are good

OPTIMIZER_INDEX_COST_ADJ OPTIMIZER_INDEX_CACHING

yes

CBO: A Configuration Roadmap

gather missing histograms

10

Š 2004


Object statistics >

The object statistics describe the data stored in the database » Table statistics: number of rows, number of blocks below the high water mark and average row length » Column statistics: number of distinct values, number of NULL values and data distribution (a.k.a. histograms) » Index statistics: number of distinct keys, high, number of leaf blocks and clustering factor

>

The CBO can generate good execution plans only if the statistics are good as well, i.e. they must reflect existing data

>

Missing, not up-to-date, or wrong statistics can lead to poor execution plans

>

Object statistics are automatically gathered CBO: A Configuration Roadmap

11

© 2004


Gathering object statistics (1) >

Gather statistics for tables, columns and indexes at the same time » DBMS_STATS.GATHER_SCHEMA_STATS » DBMS_STATS.GATHER_TABLE_STATS

>

Use small estimate percentage (<10%)

» Oracle increases the specified value if necessary ☺ » For very large tables reduce it to 1% or less

>

Activate table monitoring to refresh the statistics when data is changed (by default a threshold of 10% is used) » DBMS_STATS.ALTER_SCHEMA_TAB_MONITORING » STATISTICS_LEVEL = TYPICAL (default)

>

For partitioned tables gather statistics at all levels

>

For large tables gather statistics in parallel CBO: A Configuration Roadmap

12

© 2004


Gathering object statistics (2) >

Histograms are essential, gather them for all columns containing skewed data that are referenced in WHERE clauses » Even for non-indexed columns! » For simplicity use SIZE SKEWONLY, if it takes too much time try SIZE AUTO; if it is still too slow or the chosen number of buckets it not good, manually specify the list of columns » To get the most out of them, bind variables must not be used

>

The following command is a good starting point dbms_stats.gather_schema_stats( ownname => user, estimate_percent => 5, cascade => TRUE, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY', options => 'GATHER STALE' ); CBO: A Configuration Roadmap

13

© 2004


Gathering object statistics (3) >

All procedures that gather optimizer statistics no longer have hardcoded default values, instead, they are stored in the data dictionary

» You can get and set them with the DBMS_STATS procedures GET_PARAM and SET_PARAM SQL> SELECT sname, nvl(to_char(sval1),spare4) value 2 FROM sys.optstat_hist_control$; SNAME -------------------CASCADE ESTIMATE_PERCENT DEGREE METHOD_OPT NO_INVALIDATE GRANULARITY

>

VALUE -----------------------------DBMS_STATS.AUTO_CASCADE DBMS_STATS.AUTO_SAMPLE_SIZE NULL FOR ALL COLUMNS SIZE AUTO DBMS_STATS.AUTO_INVALIDATE AUTO

Check the definition of the scheduler window group MAINTENANCE_WINDOW_GROUP CBO: A Configuration Roadmap

14

© 2004


System statistics (1) >

System statistics are essential for a successful configuration

>

The CBO used to base its estimations on the number of I/O needed to execute a SQL statement (a.k.a. “I/O cost model”)

>

With system statistics a new model, called “CPU cost model”, is enabled

>

They give information about the performance of the system where Oracle runs » Performance of the I/O subsystem » Average I/O size for multi-block read operations » Performance of the CPU

>

To use the “I/O cost model”, when system statistics have been gathered, specify the hint NO_CPU_COSTING CBO: A Configuration Roadmap

15

© 2004


System statistics (2) >

There are two kinds of system statistics

>

Without workload » Not really useful… » They are automatically gathered at startup if they don’t exist dbms_stats.gather_system_stats('noworkload');

>

With workload » They should be gathered when the system is charged with a common workload » The instance is “monitored” for a period of time (interval) dbms_stats.gather_system_stats( gathering_mode => 'interval', interval => 30 Statistics are gathered ); over the next 30 minutes CBO: A Configuration Roadmap

16

© 2004


System statistics (3) SQL> 2 3 4

SELECT pname, pval1 FROM sys.aux_stats$ WHERE sname = 'SYSSTATS_MAIN';

PNAME PVAL1 ------------ ---------CPUSPEEDNW 886.233 IOSEEKTIM 10 IOTFRSPEED 4096 CPUSPEED 928 SREADTIM 8.138 MREADTIM 16.799 MBRC 9 MAXTHR 162238464 SLAVETHR 277504

CBO: A Configuration Roadmap

>

Without workload » CPUSPEEDNW: CPU speed in millions of operations per second » IOSEEKTIM: I/O seek time [ms] » IOFRSPEED: I/O transfer speed [B/s]

>

With workload » CPUSPEED: CPU speed in millions of operations per second » SREADTIM: single-block read time [ms] » MREADTIM: multi-block read time [ms] » MBRC: average number of blocks read during a multi-block read operation » MAXTHR: max. throughput [MB/s] » SLAVETHR: max. slave throughput [MB/s]

17

© 2004


OPTIMIZER_MODE >

This is the most important parameter, unfortunately, too often it is not set, i.e. its default value is used » Default value is CHOOSE

- ALL_ROWS is used if object statistics are available, otherwise RULE - In some situations ALL_ROWS has to be used even if no statistics are available (e.g. for partitioned tables)

» Default value is ALL_ROWS >

If fast delivery of the last row is important, ALL_ROWS should be used » Typical example: reporting systems and data warehouses

>

If fast delivery of the first row(s) is important, FIRST_ROWS_n should be used » Where “n” is [ 1 | 10 | 100 | 1000 ] row(s) » Typical example: OLTP systems CBO: A Configuration Roadmap

18

© 2004


OPTIMIZER_FEATURES_ENABLE >

Each database version introduces new features in the optimizer

>

OPTIMIZER_FEATURES_ENABLE can be used to set which “version” of the optimizer should be used » Valid values are database versions (e.g. 8.1.7, 9.0.1, 9.2.0, …) » A complete list of the features which are enabled/disabled by each version is available in the “Oracle Database Reference” manual » The default value is the current database version » Not all new features are enabled/disabled by this parameter, this means that even if you set it to 8.1.7 in, for example, Oracle9i you won’t get the 8.1.7 optimizer!

>

It could be useful to set it when an application is upgraded to a new database version, otherwise leave it at the default value CBO: A Configuration Roadmap

19

© 2004


DB_FILE_MULTIBLOCK_READ_COUNT (1) >

DB_FILE_MULTIBLOCK_READ_COUNT specifies the maximum number of blocks that Oracle reads during a multi-block operation (e.g. full table scan)

>

Three common situations lead to multi-block reads that are smaller than the specified value » Oracle reads headers with single-block reads » Oracle never does an I/O that spans more extents » Oracle never reads a block that is already in the buffer cache

>

Multi-block reads are a performance feature, therefore, the parameter should be set to achieve the best performance » Higher values do not provide better performance in all cases! » It makes no sense to exceed the maximum physical I/O size CBO: A Configuration Roadmap

20

© 2004


DB_FILE_MULTIBLOCK_READ_COUNT (2) >

When system statistics with workload are not used this parameter has a direct impact on the full table scan costs, therefore, too high values lead to excessive full scans » Without system statistics

Blocks I / O Cost FTS ≈ 1.6765 ⋅ DFMBRC 0.6581 » System statistics without workload

Blocks I / O Cost FTS ≈ 1 + A ⋅ ln (DFMBRC) + B ⋅ ln( DFMBRC ) 2 + ... >

When system statistics with workload are used, MBRC in SYS.AUX_STATS$ is used instead of DFMBRC to compute the cost of full table scans CBO: A Configuration Roadmap

21

© 2004


DB_FILE_MULTIBLOCK_READ_COUNT (3) A simple full table scan with different values gives useful information about the impact of this parameter and, therefore, assists in finding the “best” value 100 100 80 Gain in %

>

60 40 20 0 -20 -20 1 1

5 5

9 9

CBO: A Configuration Roadmap

13 21 17 13 17 21 DFMBRC // MBRC DFMBRC MBRC 22

25 25 25

29 29 29

I/O cost cost without without I/O system statistics system I/O cost statistics without I/Ocost coststatistics with def. system I/O without I/O cost with def. def. I/O coststatistics with system system statistics system statistics I/O cost without system statistics I/O cost def. System with 1 system statistics system statistics System 1 def. (max. 55 I/O cost System with 1MB/s) (max. 55 MB/s) system statistics (max. 55 MB/s) System 2 System 1 (max. 55 402MB/s) MB/s) System (max. System (max. 4023MB/s) System (max. 40 MB/s) (max. 58 MB/s) System 3 System (max. 584MB/s) (max. 230 MB/s) © 2004


OPTIMIZER_DYNAMIC_SAMPLING (1) >

The CBO used to base its estimations on statistics found in the data dictionary only

>

With dynamic sampling some statistics can be gathered during the parse phase as well, this means that some queries are executed against the referenced objects to gather additional information

>

The statistics gathered by dynamic sampling are not stored in the data dictionary, they are simply shared at cursor level

>

The value of the parameter OPTIMIZER_DYNAMIC_SAMPLING specifies how and when dynamic sampling is used Âť Range: from 0 to 10 (0=dynamic sampling disabled) CBO: A Configuration Roadmap

23

Š 2004


OPTIMIZER_DYNAMIC_SAMPLING (2) # Blocks Param Hint

Level On which table?

1 2 3 4 5 6 7 8 9 10

All non-analyzed tables, if at least one table » Is part of a join, subquery or non-mergeable view » Has no index » Has more blocks than the number of blocks used for the sampling All non-analyzed tables All tables which fulfill the level-2 criterion plus all tables for which a standard selectivity estimation is used

All tables which fulfill level-3 criteria plus all tables which have single table predicates referencing more than two attributes

CBO: A Configuration Roadmap

24

32

32 64

64

128 256 1024 4096 All © 2004

128 256 512 1024 2048 4096 8192 All


OPTIMIZER_DYNAMIC_SAMPLING (3) >

Default values » If OPTIMIZER_FEATURES_ENABLE ≥ 10.0.0: 2 » If OPTIMIZER_FEATURES_ENABLE = 9.2.0: 1 » If OPTIMIZER_FEATURES_ENABLE ≤ 9.0.1: 0

>

Level 1 and 2 are not very useful, in fact the tables should be analyzed on a regular basis! » An exception is when temporary tables are used, in fact, usually, no statistics are available for them

>

Level 3 and up are useful for improving selectivity estimations of predicates » If the CBO is not able to do correct estimations start by setting it to 4 » It is possible to do such corrections with SQL profiles as well CBO: A Configuration Roadmap

25

© 2004


PGA management >

It is possible to choose between two methods to manage the PGA » Manual: the DBA has full control over the size of the PGA » Automatic: the DBA delegates the management of the PGA to Oracle

>

Except in the following situations it is recommended to use automatic PGA management » Very large PGA needed (>100MB, see Metalink note 147806.1) - Except if you want to set _PGA_MAX_SIZE

» Fine tuning is needed » Shared server (former MTS) is used >

Usually a larger PGA makes merge/hash joins and sort operations faster, therefore, you should devote the “unused” memory that is available on the system to it

>

The PGA size has an influence on the costs computed by the CBO CBO: A Configuration Roadmap

26

© 2004


Manual PGA management >

To enable manual PGA management the INIT.ORA parameter WORKAREA_SIZE_POLICY must be set to MANUAL

>

Then the following INIT.ORA parameters should be set » HASH_AREA_SIZE » SORT_AREA_SIZE » BITMAP_MERGE_AREA_SIZE

>

It is practically impossible to give advice about their value, anyway, here some general rules

» Usually a minimal value of 512KB to 1MB should be used » For small PGAs, to take advantage of hash join, the HASH_AREA_SIZE should be at least 3-4 times the SORT_AREA_SIZE

>

Each parameter specifies the maximum amount of memory that can be used by each server process CBO: A Configuration Roadmap

27

© 2004


Automatic PGA management >

To enable auto PGA management the INIT.ORA parameter WORKAREA_SIZE_POLICY must be set to AUTO

>

Contrary to manual PGA management, the INIT.ORA parameter PGA_AGGREGATE_TARGET specifies the total amount of memory use for all PGAs, i.e. not for the single server processes

>

In some situations the PGA_AGGREGATE_TARGET could also be exceeded » It’s a target, not a maximum value! » This usually happens when a value which is too small is specified

CBO: A Configuration Roadmap

28

© 2004


OPTIMIZER_INDEX_COST_ADJ >

This parameter is used to change the cost of table accesses through index scans

>

Range of values 1..10000 (default 100)

» Values greater than 100 make index scans more expensive » Values lower than 100 make index scans less expensive

>

>

>

Correction for an index range scan

OICA I / O Cost ≈ (BLevel + (LeafBlocks + ClustFactor ) ⋅ Selectivity ) ⋅ 100 Correction for an index unique scan

OICA I / O Cost ≈ (BLevel + 1) ⋅ 100

This parameter flattens costs and makes the clustering factor much less significant, therefore, small values should be carefully specified! With system statistics the default value is usually good CBO: A Configuration Roadmap

29

© 2004


OPTIMIZER_INDEX_CACHING >

This parameter is used to specify the expected amount (in percent) of index blocks cached in the buffer cache during a nested loop join » This parameter has no influence on index range scans!

>

Range of values 0..100 (default 0)

» Values greater than 0 make nested loops less expensive

>

Correction for an index range scan

>

Correction for index unique scan

>

With system statistics the default value is usually good

⎛ OIC ⎞ I / O Cost ≈ (BLevel + LeafBlocks ⋅ Selectivity ) ⋅ ⎜1 − ⎟ ⎝ 100 ⎠ + ClustFactor ⋅ Selectivity ⎛ OIC ⎞ I / O Cost ≈ BLevel ⋅ ⎜1 − ⎟ +1 ⎝ 100 ⎠

CBO: A Configuration Roadmap

30

© 2004


Core messages‌ >

The CBO works well if it is correctly configured!

>

Correctly configuring the CBO is not an easy job

This is no good reason for not doing it!

> >

from Trivadis At the core it's all about data.

CBO: A Configuration Roadmap

Understanding how each parameter changes the behavior of the CBO is essential

31

Š 2004


Thank you for your attention

Questions?


Turn static files into dynamic content formats.

Create a flipbook
Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.