Haisley - Backup and Recovery Optimization


Backup and Recovery Optimization

Stephan Haisley, Center of Expertise, Oracle Corporation

Copyright © Oracle Corporation, 2005. All rights reserved.


Objectives

• Provide a short introduction to Recovery Manager (RMAN)
• Explain and demonstrate factors that influence the speed of:
  – Backups
  – Restorations
  – Recoveries
• Give you some ideas about what to look at to make backup and recovery faster


What is RMAN?

• Introduced in 8.0
• Built into the RDBMS kernel, so it can take advantage of kernel features (e.g. block checking)
• Can back up datafiles, controlfile, archivelogs and SPFILE
• Offers image copy or backupset
  – Image copy: byte-for-byte copy
  – Backupset: files multiplexed together into a proprietary format
• Allows the DBA to manage backup and recovery operations with ease


Incremental Backups

Differential (each level 1 backs up blocks changed since the most recent level 1 or level 0 backup):

Day of the week            Sun   Mon   Tue   Wed   Thu   Fri   Sat
Incremental backup level   0     1     1     1     1     1     0

Cumulative (each level 1 backs up all blocks changed since the most recent level 0 backup):

Day of the week            Sun   Mon   Tue   Wed   Thu   Fri   Sat
Incremental backup level   0     1     1     1     1     1     0
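The difference between the two schedules shows up at restore time. The following is a minimal Python sketch, assuming the weekly schedule from the slide (level 0 on Sunday and Saturday, level 1 otherwise); the function name `backups_to_restore` is made up for illustration and is not an RMAN interface.

```python
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
LEVELS = [0, 1, 1, 1, 1, 1, 0]  # schedule taken from the slide


def backups_to_restore(day_index, cumulative):
    """Return the days whose backups are needed to restore as of day_index."""
    # The most recent level 0 taken on or before the target day is the base.
    base = max(i for i in range(day_index + 1) if LEVELS[i] == 0)
    if day_index == base:
        return [DAYS[base]]
    if cumulative:
        # The latest cumulative level 1 holds every change since the base.
        return [DAYS[base], DAYS[day_index]]
    # Differential: every level 1 since the base is needed, in order.
    return [DAYS[i] for i in range(base, day_index + 1)]


print(backups_to_restore(3, cumulative=False))
print(backups_to_restore(3, cumulative=True))
```

For a Wednesday restore the differential schedule needs the base plus three incrementals, while the cumulative schedule needs the base plus one.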


Differential vs Cumulative

• Backup speeds:

Backup#   Type   Level   #blocks   Time (secs)   CPU (secs)
1         Base   0       778112    626           227.20
2         Diff   1       42375     312           82.93
3         Diff   1       42370     312           82.65
4         Diff   1       42369     312           82.45
5         Base   0       778112    628           226.09
6         Cumu   1       42371     314           80.61
7         Cumu   1       49605     315           83.70
8         Cumu   1       60176     321           85.33


Differential vs Cumulative

• Restore speeds:

Type           #Backup Sets restored   Time (secs)   CPU (secs)
Base level 0   1                       626.67        210.85
Differential   3                       98.67         23.00
Base level 0   1                       629.33        209.21
Cumulative     1                       43.00         11.05

Extra time on backup can save significant time on recovery!
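The trade-off can be put into numbers with the figures from the two "Differential vs Cumulative" slides. This is a small sketch, assuming the 98.67 s figure covers all three differential backup sets together; the variable names are illustrative only.

```python
# Level 1 backup times (secs) from the backup speeds table.
diff_backup_secs = [312, 312, 312]
cumu_backup_secs = [314, 315, 321]

# Incremental restore times (secs) from the restore speeds table.
diff_restore_secs = 98.67  # three differential backup sets
cumu_restore_secs = 43.00  # one cumulative backup set

# Extra time spent taking cumulative backups over the three days.
extra_backup = sum(cumu_backup_secs) - sum(diff_backup_secs)

# Time saved when restoring the incrementals.
restore_saving = round(diff_restore_secs - cumu_restore_secs, 2)

print(f"Extra backup time:   {extra_backup} s")
print(f"Restore time saved:  {restore_saving} s")
```

In this test the cumulative strategy cost 14 seconds more across three backups and saved about 56 seconds on the incremental part of the restore.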


Backup and Restore Performance

• Backup & Restore times can be influenced by:
  – Channel configuration
  – Size of memory buffers (read & write)
  – Speed of backup devices
  – Amount of data being backed up
  – Amount of block checking features enabled
  – Use of compression


Channel Configuration

• Match up the number of channels to each backup device
  – Manually allocate channels
  – Use automatic channel parallelism
• Avoid Media Management Layer (MML) multiplexing of backup sets
  – It increases restore times
• Leave some devices available for emergency restorations, so that these do not upset the other backup schedules


Channel Configuration

• Reducing filesperset can speed up single-file restores:

Filesperset   BS Size (blks)   Restored file (blks)   Time (secs)   CPU (secs)
8             702320           97727                  132           39.42
4             658221           97727                  110           36.92
2             132773           97727                  82            29.92
1             97730            97727                  74            25.62


Read and Write Memory Buffers

[Diagram: Datafiles → Input Buffers (4 per datafile) → Output Buffers (4 per channel) → Backup Device]


Size of Read Buffers

• Allocated according to the MAXOPENFILES channel parameter:

MAXOPENFILES           Buffer Size
MAXOPENFILES ≤ 4       Each buffer = 1MB; total buffer size for the channel is up to 16MB
4 < MAXOPENFILES ≤ 8   Each buffer = 512KB; total buffer size for the channel is up to 16MB. Number of buffers per file depends on the number of files
MAXOPENFILES > 8       Each buffer = 128KB, 4 buffers per file, so each file has a 512KB buffer

• Let's see how that looks in real life…


Size of Read Buffers

• Read buffer allocation for backups:

MAXOPENFILES   Buffer Size (KB)   #Buffers per file   Total Buffer size (MB)
2              1024               8                   16
4              512                8                   16
8              512                4                   16
10             128                4                   5

Default values seem adequate, and they also limit the amount of memory used for input buffers
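The observed totals follow from simple arithmetic: files × buffers per file × buffer size. A short check in Python, assuming the number of files open on the channel equals MAXOPENFILES in each test (the slide does not say so explicitly); `total_buffer_mb` is a hypothetical helper name.

```python
# (MAXOPENFILES, buffer size in KB, buffers per file, reported total in MB)
observed = [
    (2, 1024, 8, 16),
    (4, 512, 8, 16),
    (8, 512, 4, 16),
    (10, 128, 4, 5),
]


def total_buffer_mb(open_files, buffer_kb, buffers_per_file):
    """Total read buffer memory for one channel, in MB."""
    return open_files * buffers_per_file * buffer_kb / 1024


for open_files, buffer_kb, per_file, reported_mb in observed:
    computed = total_buffer_mb(open_files, buffer_kb, per_file)
    print(open_files, computed, computed == reported_mb)
```

Every row reproduces the reported total, which shows the 16MB cap holding up to MAXOPENFILES = 8 and the total dropping to 5MB at MAXOPENFILES = 10.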


Size of Write Buffers

• Allocates four buffers per channel
  – Disk = 1MB per buffer
  – SBT = 256KB per buffer
• SBT is smaller due to slower speed of tape devices
• Can see increased performance when increasing size of tape buffers…

Total buffer size (KB)   I/O Count   I/O Time (secs)
128                      60564       617.4
1024 (default)           7571        595.9
2048                     3786        505.3
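The I/O counts are consistent with each write transferring one full buffer. The sketch below rests on two assumptions that the slide does not state: that each buffer is a quarter of the total buffer size (four buffers per channel), and that one I/O writes exactly one buffer. Under those assumptions the data volume is the same in all three runs and only the effective write rate changes.

```python
# (total buffer size in KB, I/O count, I/O time in secs) from the slide
runs = [
    (128, 60564, 617.4),
    (1024, 7571, 595.9),
    (2048, 3786, 505.3),
]

BUFFERS_PER_CHANNEL = 4

volumes_kb = []
rates_kb_s = []
for total_kb, io_count, io_secs in runs:
    buffer_kb = total_kb / BUFFERS_PER_CHANNEL  # assumed size of one write
    volume_kb = io_count * buffer_kb            # data written in the run
    volumes_kb.append(volume_kb)
    rates_kb_s.append(volume_kb / io_secs)
    print(f"{total_kb:>5} KB total: {volume_kb / 1024:.0f} MB written, "
          f"{volume_kb / io_secs:.0f} KB/s")
```

The three volumes agree to within a fraction of a percent, so the shorter I/O time at 2048KB reflects fewer, larger writes rather than less data.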


Where is buffer memory allocated from?

• PGA if not using I/O slaves (use async I/O)
  – tape_asynch_io
  – disk_asynch_io
• Shared Pool if using I/O slaves (use if OS does not support async I/O)
  – backup_tape_io_slaves
  – dbwr_io_slaves
• Large Pool if its size > 0 and using I/O slaves


Speed of Backup Devices

• Maximum speed of backup: min(disk read MB/s, tape write MB/s)
• Monitor v$backup_async_io / v$backup_sync_io for effective_bytes_per_second where type is INPUT or OUTPUT
  – If the transfer rate is slower than the device is capable of, look at OS level data, CPU statistics, MML settings (compression?), device settings (block size)
• Can slow down speed of backup to reduce loading on I/O system:

  RMAN> configure channel device type sbt rate=1M;
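The min() rule gives a quick lower bound on the backup window. A minimal sketch in Python; the function names and the example rates (80 MB/s disk read, 30 MB/s tape write, 6000 MB of data) are hypothetical and not taken from the tests in this presentation.

```python
def max_backup_rate(disk_read_mb_s, tape_write_mb_s):
    """The slower of the two devices bounds the backup rate."""
    return min(disk_read_mb_s, tape_write_mb_s)


def estimated_backup_secs(data_mb, disk_read_mb_s, tape_write_mb_s):
    """Best-case backup time if the bottleneck device runs flat out."""
    return data_mb / max_backup_rate(disk_read_mb_s, tape_write_mb_s)


# Hypothetical example: tape is the bottleneck here.
print(max_backup_rate(80, 30))
print(estimated_backup_secs(6000, 80, 30))
```

If the measured effective_bytes_per_second is well below this bound, the bottleneck is somewhere other than the raw device speed.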


Amount of data being backed up

• Put static data into a Read-Only tablespace and back it up once only
  – Make sure the backup is not purged from the MML catalog
• Use differential incrementals and monitor v$backup_datafile to identify files not changing frequently
  – Reduce their backup frequency
• Avoid using datafiles with large amounts of free space
  – The whole datafile is scanned for a backup


Block Change Tracking

• Fast Incremental backups introduced in 10g
• Uses change tracking file to store bitmaps representing ranges of blocks in datafiles
• Size of tracking file ~1/30,000 size of database
• Overhead on database performance ~3% (in my TPC-C tests)
• Performance gain for backups makes this bearable:

Fast Incrementals?   #Blocks in DB   #Blocks read   #Blocks in backup   Time (secs)
No                   404160          404160         36567               156
Yes                  404160          72832          37215               35
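The gain can be read straight off the table. A short Python calculation using only the figures from the slide; the variable names are illustrative.

```python
blocks_in_db = 404160
blocks_read_without = 404160  # full scan without change tracking
blocks_read_with = 72832      # blocks read with change tracking enabled
secs_without = 156
secs_with = 35

# Share of the database that still had to be read, as a percentage.
read_pct = round(blocks_read_with / blocks_in_db * 100, 1)

# How many times faster the incremental backup ran.
speedup = round(secs_without / secs_with, 2)

print(f"Blocks read with tracking: {read_pct}% of the database")
print(f"Backup speedup: {speedup}x")
```

With tracking enabled the backup read about 18% of the blocks and finished roughly 4.5 times faster, for a similar number of blocks written to the backup.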


Amount of Block Checking Features Enabled

• Each type of block checking will increase time and CPU usage for backup and restoration:
  – Head and tail sanity check: makes sure key structures in the head match the tail
  – Block checksums: calculated and compared with the existing checksum
  – Logical structure checks: checks various block structures for consistency
• Tests showed time for database backup increased ~1% and CPU usage by ~8%
  – BUT the extra checks confirm that the database is good on backup and then on restore


Backup Compression

• Backupset compression introduced in 10g
• Can reduce size of backupset by 80-90%
  – Saves space on backup media
  – Reduces amount of network traffic if backup device is not local
• Increases CPU and time (as expected) for backup and restore
• Do NOT use along with MML compression
  – Time both types of compression and use the most suitable


Recovery Performance

• Recovery times can be influenced by:
  – Number of archivelogs/incrementals being applied
  – Number of datafiles needing recovery
  – If archivelogs available on disk
  – If using parallel recovery
  – General database performance


Number of archivelogs/incrementals being applied

• RMAN will choose to use incrementals over archivelogs
  – My tests showed restoring the incremental was ~17 times quicker than applying 20 archivelogs
  – Mileage will vary depending on backup / restore speeds as previously discussed
• An earlier slide showed cumulative restores being faster than differential restores
• The higher the number of logfiles / incrementals required, the slower the recovery


Number of datafiles needing recovery

• Each datablock that needs recovery must first be read into the buffer cache and then written back to disk by DBWR after redo is applied to it
• Reducing the number of files that are recovered reduces the overall work in the database and so speeds up recovery
  – Only restore and recover the files that NEED recovering
• If recovery is due to corruption, consider Block Media Recovery…


Block Media Recovery (BMR)

• RMAN will restore and apply recovery to the specified blocks only, leaving the rest of the datafile intact for normal use
• Significant reduction in recovery time compared with recovering the whole datafile:

#Corrupt Blocks   Datafile recovery time (secs)   BMR Time (secs)
10                941                             145
99                925                             155
991               937                             219
5000              922                             616
10000             938                             1156

• Can be too much of a good thing!
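The table suggests a break-even point somewhere between 5000 and 10000 corrupt blocks, beyond which whole-datafile recovery is faster. The Python sketch below estimates it by linear interpolation between the measured rows; this is an approximation for this one test system, and `break_even_blocks` is a made-up helper name.

```python
# (corrupt blocks, whole-datafile recovery secs, BMR secs) from the slide
rows = [
    (10, 941, 145),
    (99, 925, 155),
    (991, 937, 219),
    (5000, 922, 616),
    (10000, 938, 1156),
]


def break_even_blocks(rows):
    """Interpolate where BMR time first reaches datafile recovery time."""
    for (b0, d0, m0), (b1, d1, m1) in zip(rows, rows[1:]):
        gap0, gap1 = m0 - d0, m1 - d1  # BMR time minus datafile time
        if gap0 < 0 <= gap1:
            # Linear interpolation to the point where the gap is zero.
            return b0 + (b1 - b0) * (-gap0) / (gap1 - gap0)
    return None  # BMR never became slower in the measured range


print(round(break_even_blocks(rows)))
```

On these figures the crossover lands at roughly 7,900 corrupt blocks, so BMR pays off only while the number of damaged blocks stays well below that.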


Archivelogs available on disk?

• Avoid the RMAN restore times for archivelogs by keeping n days' worth on disk
  – Depends on incremental strategy
  – Depends on available disk space
• Back up the most recent archivelogs to disk and then to tape at a later time
  – Take a backup of a backup (from 9i onwards)


Parallel Recovery

• By default Oracle will use a single process to carry out recovery, unless using parallel_automatic_tuning
  – Oracle will decide if best to use parallel recovery and how many slave processes
• Single coordinator process reads the archivelogs
• Reading of datablocks and applying redo is split up amongst slave processes, each working on a range of blocks
• Will increase CPU usage and need for DBWR to perform well
• Watch for waits on 'PX Deq' events


General Database Performance

• Recovery happens within the database, so a badly performing database will not help with recovery times
• Areas to look for improvement:
  – I/O → read and write intensive
  – DBWR performance → look for 'free buffer waits'; use async I/O or DBWR slaves
  – CPU → make sure it doesn't become starved during recovery; parallelism won't help you!


Helpful views

• v$session_longops → shows currently running backup, restore and recovery operations with RMAN
• v$backup_async_io / v$backup_sync_io → show RMAN performance information
• v$session_wait → session wait information
• v$backup_set, v$backup_piece, v$backup_datafile etc. → show sizing information for backups


Summary

• Explained factors that influence speed of:
  – Backups
  – Restorations
  – Recoveries
• Gave you something to think about when looking at backup, restore and recovery time windows
• Make sure you test any alterations with production volume FIRST!

