Backup and Recovery Optimization
Stephan Haisley, Center of Expertise, Oracle Corporation
Copyright © Oracle Corporation, 2005. All rights reserved.
Objectives
• Provide a short introduction to Recovery Manager (RMAN)
• Explain and demonstrate factors that influence the speed of:
  – Backups
  – Restorations
  – Recoveries
• Give you some ideas on what to look at to make backup and recovery faster
What is RMAN?
• Introduced in 8.0
• Built into the RDBMS kernel, so it can take advantage of kernel features (e.g. block checking)
• Can back up datafiles, controlfiles, archivelogs and the SPFILE
• Offers image copies or backupsets
  – Image copy: byte-for-byte copy
  – Backupset: files multiplexed together into a proprietary format
• Allows the DBA to manage backup and recovery operations with ease
Incremental Backups
Differential
Day of the week            Sun  Mon  Tue  Wed  Thu  Fri  Sat
Incremental backup level   0    1    1    1    1    1    0
Cumulative
Day of the week            Sun  Mon  Tue  Wed  Thu  Fri  Sat
Incremental backup level   0    1    1    1    1    1    0
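The two schedules use the same levels but differ in what each level 1 backup contains: a differential level 1 holds the blocks changed since the most recent level 1 or level 0 backup, while a cumulative level 1 holds all blocks changed since the most recent level 0. The sketch below is a simplified illustration of that difference, not RMAN's actual selection logic. The function name `backups_needed` and the list encoding of the week are assumptions made for this example.

```python
# Weekly schedule from the slide: level 0 on Sunday and Saturday, level 1 otherwise.
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
LEVELS = [0, 1, 1, 1, 1, 1, 0]


def backups_needed(levels, restore_day, cumulative):
    """Return the indexes of the backups needed to restore up to restore_day.

    Hypothetical helper for illustration only.
    """
    # The most recent level 0 at or before the restore day is always required.
    base = max(i for i in range(restore_day + 1) if levels[i] == 0)
    if restore_day == base:
        return [base]
    if cumulative:
        # A cumulative level 1 contains every change since the level 0,
        # so only the latest one is needed on top of the base.
        return [base, restore_day]
    # A differential level 1 only contains changes since the previous backup,
    # so every level 1 taken since the base is needed.
    return list(range(base, restore_day + 1))


thursday = DAYS.index("Thu")
for cumulative in (False, True):
    needed = backups_needed(LEVELS, thursday, cumulative)
    label = "cumulative" if cumulative else "differential"
    print(label, [DAYS[i] for i in needed])
```

For a restore to Thursday, the differential schedule needs five backups (Sunday through Thursday), while the cumulative schedule needs only two (Sunday and Thursday).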
Differential vs Cumulative
• Backup speeds:
Backup#  Type  Level  #blocks  Time (secs)  CPU (secs)
1        Base  0      778112   626          227.20
2        Diff  1      42375    312          82.93
3        Diff  1      42370    312          82.65
4        Diff  1      42369    312          82.45
5        Base  0      778112   628          226.09
6        Cumu  1      42371    314          80.61
7        Cumu  1      49605    315          83.70
8        Cumu  1      60176    321          85.33
Differential vs Cumulative
• Restore speeds:
Type          #Backup Sets restored  Time (secs)  CPU (secs)
Base level 0  1                      626.67       210.85
Differential  3                      98.67        23.00
Base level 0  1                      629.33       209.21
Cumulative    1                      43.00        11.05
• Extra time on backup can save significant time on recovery!
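The trade-off can be worked through directly from the backup and restore timings in this test. This is a small arithmetic sketch with variable names of my own choosing. It compares only the level 1 backups, since a level 0 backup is restored in both cases.

```python
# Level 1 backup times (secs) from the backup speeds figures.
diff_backup_secs = [312, 312, 312]   # backups 2-4, differential
cumu_backup_secs = [314, 315, 321]   # backups 6-8, cumulative

# Level 1 restore times (secs) from the restore speeds figures.
diff_restore_secs = 98.67            # three differential backup sets
cumu_restore_secs = 43.00            # one cumulative backup set

# Extra time spent taking cumulative backups over the three days.
extra_backup_secs = sum(cumu_backup_secs) - sum(diff_backup_secs)

# Time saved when the incrementals have to be restored.
restore_saved_secs = round(diff_restore_secs - cumu_restore_secs, 2)

print(f"Extra backup time over three days: {extra_backup_secs} secs")
print(f"Restore time saved: {restore_saved_secs} secs")
```

In this test the cumulative strategy costs 14 extra seconds of backup time and saves 55.67 seconds at restore time.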
Backup and Restore Performance
• Backup and restore times can be influenced by:
  – Channel configuration
  – Size of memory buffers (read and write)
  – Speed of backup devices
  – Amount of data being backed up
  – Amount of block checking features enabled
  – Use of compression
Channel Configuration
• Match the number of channels to the number of backup devices
  – Manually allocate channels
  – Use automatic channel parallelism
• Avoid Media Management Layer (MML) multiplexing of backup sets
  – It increases restore times
• Leave some devices available for emergency restorations, so that these won’t upset the other backup schedules
Channel Configuration
• Reducing filesperset can decrease the time taken for single-file restores:
Filesperset  BS Size (blks)  Restored file (blks)  Time (secs)  CPU (secs)
8            702320          97727                 132          39.42
4            658221          97727                 110          36.92
2            132773          97727                 82           29.92
1            97730           97727                 74           25.62
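To make the filesperset figures easier to compare, the sketch below expresses each row relative to the restored file and to the filesperset=1 case. The helper name `restore_overhead` is hypothetical. The numbers come straight from the test results and are specific to that test system.

```python
# (filesperset, backup set size in blocks, restored file size in blocks, restore time in secs)
rows = [
    (8, 702320, 97727, 132),
    (4, 658221, 97727, 110),
    (2, 132773, 97727, 82),
    (1, 97730, 97727, 74),
]


def restore_overhead(rows):
    """Return (filesperset, backup set size / restored file size, time / filesperset=1 time)."""
    baseline_secs = rows[-1][3]  # last row is filesperset=1
    return [
        (fps, round(bs_blocks / file_blocks, 2), round(secs / baseline_secs, 2))
        for fps, bs_blocks, file_blocks, secs in rows
    ]


for fps, size_ratio, time_ratio in restore_overhead(rows):
    print(f"filesperset={fps}: backup set is {size_ratio}x the file, restore takes {time_ratio}x as long")
```

With filesperset=8 the backup set is about seven times the size of the file being restored, and the restore takes about 1.78 times as long as with filesperset=1.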
Read and Write Memory Buffers
[Diagram: Datafiles → Input buffers (4 per datafile) → Output buffers (4 per channel) → Backup device]
Size of Read Buffers
• Allocated according to the MAXOPENFILES channel parameter:
MAXOPENFILES            Buffer Size
MAXOPENFILES ≤ 4        Each buffer = 1MB; total buffer size for the channel is up to 16MB
4 < MAXOPENFILES ≤ 8    Each buffer = 512KB; total buffer size for the channel is up to 16MB. The number of buffers per file depends on the number of files
MAXOPENFILES > 8        Each buffer = 128KB, 4 buffers per file, so each file will have a 512KB buffer
• Let’s see how that looks in real life…
Size of Read Buffers
• Read buffer allocation for backups:
MAXOPENFILES  Buffer Size (KB)  #Buffers per file  Total Buffer size (MB)
2             1024              8                  16
4             512               8                  16
8             512               4                  16
10            128               4                  5
• Default values seem adequate, and also limit the amount of memory used for input buffers
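The totals in the read buffer figures follow from multiplying the three other columns together. The sketch below checks that arithmetic. It assumes that MAXOPENFILES files are open at the same time on the channel, and `total_buffer_mb` is a hypothetical helper name.

```python
# (MAXOPENFILES, buffer size in KB, buffers per file, reported total in MB)
observed = [
    (2, 1024, 8, 16),
    (4, 512, 8, 16),
    (8, 512, 4, 16),
    (10, 128, 4, 5),
]


def total_buffer_mb(open_files, buffer_kb, buffers_per_file):
    """Total input buffer memory for one channel, in MB."""
    return open_files * buffers_per_file * buffer_kb / 1024


for open_files, buffer_kb, per_file, reported_mb in observed:
    computed = total_buffer_mb(open_files, buffer_kb, per_file)
    print(f"MAXOPENFILES={open_files}: computed {computed:g} MB, reported {reported_mb} MB")
```

Every computed total matches the reported one, which also shows why going above 8 open files reduces the total: the per-file allocation is fixed at 4 x 128KB rather than sharing a 16MB pool.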
Size of Write Buffers
• Allocates four buffers per channel
  – Disk = 1MB per buffer
  – SBT = 256KB per buffer
• SBT buffers are smaller due to the slower speed of tape devices
• Can see increased performance when increasing the size of the tape buffers:
Total buffer size (KB)  I/O Count  I/O Time (secs)
128                     60564      617.4
1024 (default)          7571       595.9
2048                    3786       505.3
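The tape buffer results can be read more easily as changes relative to the default. The sketch below derives the per-buffer size from the four buffers per channel, and estimates the data written on the assumption (mine, not stated on the slide) that each I/O writes one full buffer. `summarize` is a hypothetical helper name.

```python
BUFFERS_PER_CHANNEL = 4  # from the slide: four output buffers per channel

# (total buffer size in KB, I/O count, I/O time in secs)
results = [
    (128, 60564, 617.4),
    (1024, 7571, 595.9),   # default SBT setting: 4 x 256KB
    (2048, 3786, 505.3),
]


def summarize(results, default_total_kb=1024):
    """Return (total KB, per-buffer KB, estimated KB written, % I/O time change vs default)."""
    default_secs = next(secs for total, _, secs in results if total == default_total_kb)
    summary = []
    for total_kb, io_count, secs in results:
        buffer_kb = total_kb // BUFFERS_PER_CHANNEL
        # Assumption: each I/O writes one full buffer.
        written_kb = io_count * buffer_kb
        change_pct = round((secs - default_secs) / default_secs * 100, 1)
        summary.append((total_kb, buffer_kb, written_kb, change_pct))
    return summary


for total_kb, buffer_kb, written_kb, change_pct in summarize(results):
    print(f"{total_kb}KB total ({buffer_kb}KB per buffer): "
          f"~{written_kb} KB written, I/O time {change_pct:+}% vs default")
```

Under that assumption all three runs write roughly the same amount of data (about 1.94 million KB), and doubling the default buffers cuts I/O time by about 15%.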
Where is buffer memory allocated from?
• PGA if not using I/O slaves (use async I/O)
  – tape_asynch_io
  – disk_asynch_io
• Shared Pool if using I/O slaves (use if the OS does not support async I/O)
  – backup_tape_io_slaves
  – dbwr_io_slaves
• Large Pool if its size > 0 and using I/O slaves
Speed of Backup Devices
• Maximum speed of backup: min(disk read MB/s, tape write MB/s)
• Monitor v$backup_async_io / v$backup_sync_io for effective_bytes_per_second where type is INPUT or OUTPUT
  – If the transfer rate is slower than the device is capable of, look at OS level data, CPU statistics, MML settings (compression?), device settings (block size)
• Can slow down the speed of backup to reduce the load on the I/O system:
  RMAN> configure channel device type sbt rate=1M;
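The min() rule above can be turned into a rough estimate of the backup window. This is a minimal sketch for a single channel. The throughput figures and database size are made-up examples, and `max_backup_rate` and `backup_hours` are hypothetical helper names.

```python
def max_backup_rate(disk_read_mb_s, tape_write_mb_s, rate_limit_mb_s=None):
    """Upper bound on backup throughput in MB/s for one channel."""
    rate = min(disk_read_mb_s, tape_write_mb_s)
    if rate_limit_mb_s is not None:
        # An optional channel rate limit caps the throughput further.
        rate = min(rate, rate_limit_mb_s)
    return rate


def backup_hours(size_gb, rate_mb_s):
    """Best-case time in hours to move size_gb at rate_mb_s."""
    return size_gb * 1024 / rate_mb_s / 3600


# Hypothetical devices: disks read at 120 MB/s, tape writes at 40 MB/s.
rate = max_backup_rate(120, 40)
print(f"Bottleneck rate: {rate} MB/s")
print(f"Best case for 500GB: {backup_hours(500, rate):.2f} hours")
```

If the effective_bytes_per_second reported by the views is well below the rate this estimate gives, the bottleneck is likely elsewhere (CPU, MML or device settings).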
Amount of data being backed up
• Put static data into read-only tablespaces and back it up one time only
  – Make sure the backup is not purged from the MML catalog
• Use differential incrementals and monitor v$backup_datafile to identify files that do not change frequently
  – Reduce their backup frequency
• Avoid using datafiles with large amounts of free space
  – The whole datafile is scanned for a backup
Block Change Tracking
• Fast incremental backups introduced in 10g
• Uses a change tracking file to store bitmaps representing ranges of blocks in datafiles
• Size of tracking file ~1/30,000 the size of the database
• Overhead on database performance ~3% (in my TPC-C tests)
• Performance gain for backups makes this bearable:
Fast Incrementals?  #Blocks in DB  #Blocks read  #Blocks in backup  Time (secs)
No                  404160         404160        36567              156
Yes                 404160         72832         37215              35
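The benefit of change tracking shows up in the blocks read rather than the blocks backed up. The sketch below works out the read fraction and the speedup from the test figures, and applies the ~1/30,000 rule of thumb to a hypothetical 300GB database. The helper names are my own.

```python
# (fast incrementals enabled, blocks in DB, blocks read, blocks in backup, time in secs)
without_tracking = (False, 404160, 404160, 36567, 156)
with_tracking = (True, 404160, 72832, 37215, 35)


def read_pct(row):
    """Percentage of the database's blocks that the backup had to read."""
    return round(row[2] / row[1] * 100, 1)


def tracking_file_mb(db_size_gb):
    """Approximate tracking file size using the ~1/30,000 rule of thumb."""
    return db_size_gb * 1024 / 30000


speedup = round(without_tracking[4] / with_tracking[4], 2)

print(f"Blocks read without tracking: {read_pct(without_tracking)}%")
print(f"Blocks read with tracking:    {read_pct(with_tracking)}%")
print(f"Backup speedup:               {speedup}x")
print(f"Tracking file for 300GB:      ~{tracking_file_mb(300):.2f} MB")
```

With tracking enabled the backup reads about 18% of the blocks and finishes roughly 4.5 times faster, while backing up a similar number of blocks.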
Amount of Block Checking Features Enabled
• Each type of block checking will increase time and CPU usage for backup and restoration:
  – Head and tail sanity check: makes sure key structures in the head match the tail
  – Block checksums: calculated and compared with the existing checksum
  – Logical structure checks: checks various block structures for consistency
• Tests showed time for database backup increased by ~1% and CPU usage by ~8%
  – BUT the extra checks confirm that the database is good on backup and then again on restore
Backup Compression
• Backupset compression introduced in 10g
• Can reduce the size of a backupset by 80-90%
  – Saves space on backup media
  – Reduces the amount of network traffic if the backup device is not local
• Increases CPU and time (as expected) for backup and restore
• Do NOT use along with MML compression
  – Time both types of compression and use the most suitable
Recovery Performance
• Recovery times can be influenced by:
  – Number of archivelogs/incrementals being applied
  – Number of datafiles needing recovery
  – Whether archivelogs are available on disk
  – Whether parallel recovery is used
  – General database performance
Number of archivelogs/incrementals being applied
• RMAN will choose to use incrementals over archivelogs
  – My tests showed restoring the incremental was ~17 times quicker than applying 20 archivelogs
  – Mileage will vary depending on backup / restore speeds as previously discussed
• An earlier slide showed cumulative incrementals being faster to restore than differentials
• The higher the number of logfiles / incrementals required, the slower the recovery
Number of datafiles needing recovery
• Each datablock that needs recovery must first be read into the buffer cache, and is then written back to disk by DBWR after redo is applied to it
• Reducing the number of files that are recovered reduces the overall work in the database and speeds up recovery
  – Only restore and recover the files that NEED recovering
• If recovery is due to corruption, consider Block Media Recovery…
Block Media Recovery (BMR)
• RMAN will restore and apply recovery to the specified blocks only, leaving the rest of the datafile intact for normal use
• Significant reduction in recovery time compared with recovering the whole datafile:
#Corrupt Blocks  Datafile recovery time (secs)  BMR Time (secs)
10               941                            145
99               925                            155
991              937                            219
5000             922                            616
10000            938                            1156
• Can be too much of a good thing!
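The point at which BMR stops paying off can be seen by dividing the whole-datafile recovery time by the BMR time for each test. The sketch below does that with the figures from this test. `bmr_speedup` is a hypothetical helper name, and the crossover will differ on other systems.

```python
# (corrupt blocks, whole-datafile recovery time in secs, BMR time in secs)
results = [
    (10, 941, 145),
    (99, 925, 155),
    (991, 937, 219),
    (5000, 922, 616),
    (10000, 938, 1156),
]


def bmr_speedup(results):
    """Return (corrupt blocks, datafile recovery time / BMR time) for each test."""
    return [(blocks, round(datafile_secs / bmr_secs, 2))
            for blocks, datafile_secs, bmr_secs in results]


for blocks, speedup in bmr_speedup(results):
    winner = "BMR" if speedup > 1 else "whole datafile recovery"
    print(f"{blocks:>5} corrupt blocks: BMR speedup {speedup}x, faster option: {winner}")
```

BMR is about 6.5 times faster for 10 corrupt blocks, but by 10,000 blocks it is slower than recovering the whole datafile.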
Archivelogs available on disk?
• Avoid the RMAN restore times for archivelogs by keeping n days’ worth on disk
  – Depends on incremental strategy
  – Depends on available disk space
• Back up the most recent archivelogs to disk and then to tape at a later time
  – Take a backup of a backup (from 9i onwards)
Parallel Recovery
• By default Oracle will use a single process to carry out recovery, unless using parallel_automatic_tuning
  – Oracle will decide whether it is best to use parallel recovery and how many slave processes to use
• A single coordinator process reads the archivelogs
• Reading of datablocks and applying redo is split up amongst slave processes, each working on a range of blocks
• Will increase CPU usage and the need for DBWR to perform well
• Watch for waits on ‘PX Deq’ events
General Database Performance
• Recovery happens within the database, so a badly performing database will not help with recovery times
• Areas to look for improvement:
  – I/O → read and write intensive
  – DBWR performance → look for ‘free buffer waits’; use async I/O or DBWR slaves
  – CPU → make sure it doesn’t become starved during recovery; parallelism won’t help you!
Helpful views
• v$session_longops → shows currently running backup, restore and recovery operations with RMAN
• v$backup_async_io / v$backup_sync_io → show RMAN performance information
• v$session_wait → session wait information
• v$backup_set, v$backup_piece, v$backup_datafile etc. → show sizing information for backups
Summary
• Explained factors that influence the speed of:
  – Backups
  – Restorations
  – Recoveries
• Gave you something to think about when looking at backup, restore and recovery time windows
• Make sure you test any alterations with production volumes FIRST!