Systems Performance
Enterprise and the Cloud
Brendan Gregg
Upper Saddle River, NJ • Boston • Indianapolis • San Francisco • New York • Toronto • Montreal • London • Munich • Paris • Madrid • Capetown • Sydney • Tokyo • Singapore • Mexico City
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.
The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:
U.S. Corporate and Government Sales
(800) 382-3419
corpsales@pearsontechgroup.com
For sales outside the United States, please contact:
International Sales
international@pearson.com
Visit us on the Web: informit.com/ph
Library of Congress Cataloging-in-Publication Data
Gregg, Brendan.
Systems performance : enterprise and the cloud / Brendan Gregg.
pages cm
Includes bibliographical references and index.
ISBN-13: 978-0-13-339009-4 (alkaline paper)
ISBN-10: 0-13-339009-8 (alkaline paper)
1. Operating systems (Computers)--Evaluation. 2. Application software--Evaluation. 3. Business Enterprises--Data processing. 4. Cloud computing. I. Title.
QA76.77.G74 2014
004.67'82--dc23 2013031887
Copyright © 2014 Brendan Gregg
All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or you may fax your request to (201) 236-3290.
ISBN-13: 978-0-13-339009-4
ISBN-10: 0-13-339009-8
Text printed in the United States of America.
1.9.2 Software Change
1.9.3 More Reading
Chapter 2 Methodology
2.1 Terminology
2.2 Models
2.2.1 System under Test
2.2.2 Queueing System
2.3 Concepts
2.3.1 Latency
2.3.2 Time Scales
2.3.3 Trade-offs
2.3.4 Tuning Efforts
2.3.5 Level of Appropriateness
2.3.6 Point-in-Time Recommendations
2.3.7 Load versus Architecture
2.3.8 Scalability
2.3.9 Known-Unknowns
2.3.10 Metrics
2.3.11 Utilization
2.3.12 Saturation
2.3.13 Profiling
2.3.14 Caching
2.4 Perspectives
2.4.1 Resource Analysis
2.4.2 Workload Analysis
2.5 Methodology
2.5.1 Streetlight Anti-Method
2.5.2 Random Change Anti-Method
2.5.3 Blame-Someone-Else Anti-Method
2.5.4 Ad Hoc Checklist Method
2.5.5 Problem Statement
2.5.6 Scientific Method
2.5.7 Diagnosis Cycle
2.5.8 Tools Method
2.5.9 The USE Method
2.5.10 Workload Characterization
2.5.11 Drill-Down Analysis
2.5.12 Latency Analysis
2.5.13 Method R
2.5.14 Event Tracing
2.5.15 Baseline Statistics
2.5.16 Static Performance Tuning
2.5.17 Cache Tuning
2.5.18 Micro-Benchmarking
2.6 Modeling
2.6.1 Enterprise versus Cloud
2.6.2 Visual Identification
2.6.3 Amdahl’s Law of Scalability
2.6.4 Universal Scalability Law
2.6.5 Queueing Theory
2.7 Capacity Planning
2.7.1 Resource Limits
2.7.2 Factor Analysis
2.7.3 Scaling Solutions
2.8 Statistics
2.8.1 Quantifying Performance
2.8.2 Averages
2.8.3 Standard Deviations, Percentiles, Median
2.8.4 Coefficient of Variation
2.8.5 Multimodal Distributions
2.8.6 Outliers
2.9 Monitoring
2.9.1 Time-Based Patterns
2.9.2 Monitoring Products
2.9.3 Summary-since-Boot
2.10 Visualizations
2.10.1 Line Chart
2.10.2 Scatter Plots
2.10.3 Heat Maps
2.10.4 Surface Plot
2.10.5 Visualization Tools
2.11 Exercises
2.12 References
Chapter 3 Operating Systems
3.1 Terminology
3.2 Background
3.2.1 Kernel
3.2.2 Stacks
3.2.3 Interrupts and Interrupt Threads
3.2.4 Interrupt Priority Level
3.2.5 Processes
3.2.6 System Calls
3.2.7 Virtual Memory
3.2.8 Memory Management
3.2.9 Schedulers
3.2.10 File Systems
3.2.11 Caching
3.2.12 Networking
3.2.13 Device Drivers
3.2.14 Multiprocessor
3.2.15 Preemption
3.2.16 Resource Management
3.2.17 Observability
3.3 Kernels
3.3.1 Unix
3.3.2 Solaris-Based
3.3.3 Linux-Based
3.3.4 Differences
3.4 Exercises
3.5 References
Chapter 4 Observability Tools
4.1 Tool Types
4.1.1 Counters
4.1.2 Tracing
4.1.3 Profiling
4.1.4 Monitoring (sar)
4.2 Observability Sources
4.2.1 /proc
4.2.2 /sys
4.2.3 kstat
4.2.4 Delay Accounting
4.2.5 Microstate Accounting
4.2.6 Other Observability Sources
4.3 DTrace
4.3.1 Static and Dynamic Tracing
4.3.2 Probes
4.3.3 Providers
4.3.4 Arguments
4.3.5 D Language
4.3.6 Built-in Variables
4.3.7 Actions
4.3.8 Variable Types
4.3.9 One-Liners
4.3.10 Scripting
4.3.11 Overheads
4.3.12 Documentation and Resources
4.4 SystemTap
4.4.1 Probes
4.4.2 Tapsets
4.4.3 Actions and Built-ins
4.4.4 Examples
4.4.5 Overheads
4.4.6 Documentation and Resources
4.5 perf
4.6 Observing Observability
4.7 Exercises
4.8 References
Chapter 5 Applications
5.1 Application Basics
5.1.1 Objectives
5.1.2 Optimize the Common Case
5.1.3 Observability
5.1.4 Big O Notation
5.2 Application Performance Techniques
5.2.1 Selecting an I/O Size
5.2.2 Caching
5.2.3 Buffering
5.2.4 Polling
5.2.5 Concurrency and Parallelism
5.2.6 Non-Blocking I/O
5.2.7 Processor Binding
5.3 Programming Languages
5.3.1 Compiled Languages
5.3.2 Interpreted Languages
5.3.3 Virtual Machines
5.3.4 Garbage Collection
5.4 Methodology and Analysis
5.4.1 Thread State Analysis
5.4.2 CPU Profiling
5.4.3 Syscall Analysis
5.4.4 I/O Profiling
5.4.5 Workload Characterization
5.4.6 USE Method
5.4.7 Drill-Down Analysis
5.4.8 Lock Analysis
5.4.9 Static Performance Tuning
5.5 Exercises
5.6 References
Chapter 6 CPUs
6.1 Terminology
6.2 Models
6.2.1 CPU Architecture
6.2.2 CPU Memory Caches
6.2.3 CPU Run Queues
6.3 Concepts
6.3.1 Clock Rate
6.3.2 Instruction
6.3.3 Instruction Pipeline
6.3.4 Instruction Width
6.3.5 CPI, IPC
6.3.6 Utilization
6.3.7 User-Time/Kernel-Time
6.3.8 Saturation
6.3.9 Preemption
6.3.10 Priority Inversion
6.3.11 Multiprocess, Multithreading
6.3.12 Word Size
6.3.13 Compiler Optimization
6.4 Architecture
6.4.1 Hardware
6.4.2 Software
6.5 Methodology
6.5.1 Tools Method
6.5.2 USE Method
6.5.3 Workload Characterization
6.5.4 Profiling
6.5.5 Cycle Analysis
6.5.6 Performance Monitoring
6.5.7 Static Performance Tuning
6.5.8 Priority Tuning
6.5.9 Resource Controls
6.5.10 CPU Binding
6.5.11 Micro-Benchmarking
6.5.12 Scaling
6.6.8 pidstat
6.6.9 time, ptime
6.6.10 DTrace
6.6.11 SystemTap
6.6.12 perf
6.6.13 cpustat
6.6.14 Other Tools
6.8.2 Scheduling Priority and Class
6.8.3 Scheduler Options
6.8.4 Process Binding
6.8.5 Exclusive CPU Sets
6.8.6 Resource Controls
6.8.7 Processor Options (BIOS Tuning)
6.9 Exercises
6.10 References
Chapter 7 Memory
7.1 Terminology
7.2 Concepts
7.2.1 Virtual Memory
7.2.2 Paging
7.2.3 Demand Paging
7.2.4 Overcommit
7.2.5 Swapping
7.2.6 File System Cache Usage
7.2.7 Utilization and Saturation
7.2.8 Allocators
7.2.9 Word Size
7.3 Architecture
7.3.1 Hardware
7.3.2 Software
7.3.3 Process Address Space
7.4 Methodology
7.4.1 Tools Method
7.4.2 USE Method
7.4.3 Characterizing Usage
7.4.4 Cycle Analysis
7.4.5 Performance Monitoring
7.4.6 Leak Detection
7.4.7 Static Performance Tuning
7.4.8 Resource Controls
7.4.9 Micro-Benchmarking
7.5.3 slabtop
7.5.4 ::kmastat
7.5.5 ps
7.5.6 top
7.5.7 prstat
7.5.8 pmap
7.5.9 DTrace
7.5.10 SystemTap
8.3.7 Synchronous Writes
8.3.8 Raw and Direct I/O
8.3.9 Non-Blocking I/O
8.3.10 Memory-Mapped Files
8.3.11 Metadata
8.3.12 Logical versus Physical I/O
8.3.13 Operations Are Not Equal
8.3.14 Special File Systems
8.3.15 Access Timestamps
8.3.16 Capacity
8.4 Architecture
8.4.1 File System I/O Stack
8.4.2 VFS
8.4.3 File System Caches
8.4.4 File System Features
8.4.5 File System Types
8.4.6 Volumes and Pools
8.5 Methodology
8.5.1 Disk Analysis
8.5.2 Latency Analysis
8.5.3 Workload Characterization
8.5.4 Performance Monitoring
8.5.5 Event Tracing
8.5.6 Static Performance Tuning
8.5.7 Cache Tuning
8.5.8 Workload Separation
8.5.9 Memory-Based File Systems
8.5.10 Micro-Benchmarking
8.6 Analysis
8.6.1 vfsstat
8.6.2 fsstat
8.6.3 strace, truss
8.6.4 DTrace
9.3.2 Time Scales
9.3.3 Caching
9.3.4 Random versus Sequential I/O
9.3.5 Read/Write Ratio
9.3.6 I/O Size
9.3.7 IOPS Are Not Equal
9.3.8 Non-Data-Transfer Disk Commands
9.3.9 Utilization
9.3.10 Saturation
9.3.11 I/O Wait
9.3.12 Synchronous versus Asynchronous
9.3.13 Disk versus Application I/O
9.4 Architecture
9.4.1 Disk Types
9.4.2 Interfaces
9.4.3 Storage Types
9.4.4 Operating System Disk I/O Stack
9.5 Methodology
9.5.1 Tools Method
9.5.2 USE Method
9.5.3 Performance Monitoring
9.5.4 Workload Characterization
9.5.5 Latency Analysis
9.5.6 Event Tracing
9.5.7 Static Performance Tuning
9.5.8 Cache Tuning
9.5.9 Resource Controls
9.5.10 Micro-Benchmarking
9.5.11 Scaling
9.6 Analysis
9.6.1 iostat
9.6.2 sar
9.6.3 pidstat
9.6.4 DTrace
9.6.5 SystemTap
9.6.6 perf
9.6.7 iotop
9.6.8 iosnoop
9.6.9 blktrace
9.6.10 MegaCli
9.6.11 smartctl
9.6.12 Visualizations
9.7 Experimentation
9.7.1 Ad Hoc
9.7.2 Custom Load Generators
9.7.3 Micro-Benchmark Tools
9.7.4 Random Read Example
9.8 Tuning
10.3.6 Buffering
10.3.7 Connection Backlog
10.3.8 Interface Negotiation
10.3.9 Utilization
10.3.10 Local Connections
10.4 Architecture
10.4.1 Protocols
10.4.2 Hardware
10.4.3 Software
10.5 Methodology
10.5.1 Tools Method
10.5.2 USE Method
10.5.3 Workload Characterization
10.5.4 Latency Analysis
10.5.5 Performance Monitoring
10.5.6 Packet Sniffing
10.5.7 TCP Analysis
10.5.8 Drill-Down Analysis
10.5.9 Static Performance Tuning
10.5.10 Resource Controls
Chapter 12 Benchmarking
12.1 Background
12.1.1 Activities
12.1.2 Effective Benchmarking
12.1.3 Benchmarking Sins
12.2 Benchmarking Types
12.2.1 Micro-Benchmarking
12.2.2 Simulation
12.2.3 Replay
12.2.4 Industry Standards
12.3 Methodology
12.3.1 Passive Benchmarking
12.3.2 Active Benchmarking
12.3.3 CPU Profiling
12.3.4 USE Method
12.3.5 Workload Characterization
12.3.6 Custom Benchmarks
12.3.7 Ramping Load
12.3.8 Sanity Check
12.3.9 Statistical Analysis
12.4 Benchmark Questions
12.5 Exercises
12.6 References
Chapter 13 Case Study
13.1 Case Study: The Red Whale
13.1.1 Problem Statement
13.1.2 Support
13.1.3 Getting Started
13.1.4 Choose Your Own Adventure
13.1.5 The USE Method
13.1.6 Are We Done?
13.1.7 Take 2
13.1.8 The Basics
13.1.9 Ignoring the Red Whale
13.1.10 Interrogating the Kernel
13.1.11 Why?
13.1.12 Epilogue
13.2 Comments
13.3 Additional Information
13.4 References
Appendix A USE Method: Linux
Physical Resources
Software Resources
Reference
Appendix B USE Method: Solaris
Physical Resources
Software Resources
References
Appendix C sar Summary
Linux
Solaris
Appendix D DTrace One-Liners
syscall Provider
proc Provider
profile Provider
sched Provider
fbt Provider
pid Provider
io Provider
sysinfo Provider
vminfo Provider
ip Provider
tcp Provider
udp Provider
Appendix E DTrace to SystemTap
Functionality
Terminology
Probes
Built-in Variables
Functions
Example 1: Listing syscall Entry Probes
Example 2: Summarize read() Returned Size
Example 3: Count syscalls by Process Name
Example 4: Count syscalls by syscall Name, for Process ID 123
Example 5: Count syscalls by syscall Name, for "httpd" Processes
Example 6: Trace File open()s with Process Name and Path Name
Example 7: Summarize read() Latency for "mysqld" Processes
Example 8: Trace New Processes with Process Name and Arguments
Example 9: Sample Kernel Stacks at 100 Hz
References
Appendix F Solutions to Selected Exercises
Chapter 2: Methodology
Chapter 3: Operating Systems
Chapter 6: CPUs
Chapter 7: Memory
Chapter 8: File Systems
Chapter 9: Disks
Chapter 11: Cloud Computing