How to Make an Application Easy to Diagnose
Cary Millsap (cary.millsap@hotsos.com)
Hotsos Enterprises, Ltd.
Hotsos Symposium 2005
3:00pm–4:00pm Wednesday 9 March 2005
www.hotsos.com
Copyright © 1999–2005 by Hotsos Enterprises, Ltd.
Agenda
• Motives
• Instrumenting your Oracle db calls
• Instrumenting everything else
“If you can’t measure it, you can’t manage it.” (Peter Drucker)
• Software performance is measured by its speed
• Speed = Result ÷ Time
• If you can’t measure the time it takes for an application to produce a result, then you can’t manage its performance.
Software developers use profilers and tracers to determine how long their code runs. And why.
• Example: GNU gprof
    %   cumulative    self                self     total
   time   seconds   seconds     calls  us/call  us/call  name
  60.37      0.49      0.49  62135400     0.01     0.01  step
  39.63      0.82      0.33    499999     0.65     1.64  nseq
• Example: strace
  times(NULL) = 53821310
  gettimeofday({1105483456, 234638}, NULL) = 0
  _llseek(11, 6971392, [6971392], SEEK_SET) = 0
  readv(11, [{"\6\242\0\0S\3@\0\247\274\0\0\0"..., 8192}], 6) = 49152
  gettimeofday({1105483456, 253209}, NULL) = 0
  times(NULL) = 53821312
  write(5, "WAIT #5: nam=\'db file scattered read\' ela="..., 65) = 65
  write(5, "\n", 1) = 1
But you can do much more if you instrument your application
• There are things a developer knows that an OS tool cannot
  – Aggregate by unit of business work
  – Reveal context-specific application information
• With instrumentation inside your application
  – Better, faster code
  – Easier to diagnose and repair
The result: happier customers, lower support costs.
Instrumenting your Oracle db calls
Oracle has provided profiler-ready timing instrumentation for the database kernel since version 6.
• Oracle kernel instrumentation
  – Version 6: database call timings
  – Version 7: non-dbcall timed events
  – Versions 8–10: enhanced code path coverage
The design of your application largely determines how easy or difficult it is to collect Oracle trace files.
• Conceptually, data collection is simple
  – DBMS_SUPPORT (v7,8,9)
  – DBMS_MONITOR (v10)
• Practically, data collection can be quite difficult
  – Business task to Oracle trace file is not 1-to-1
Instrumenting your Oracle application is pretty easy.
• Instrumentation is minimal extra work
  ➊ exec dbms_monitor.session_trace_enable(null,null,true,true);
  ➋ exec dbms_application_info.set_module('demo','greeting');
    select 'hello world' from dual;
  ➌ exec dbms_application_info.set_module('demo','real business');
    select count(*) from dba_objects where owner='SYSTEM';
    disconnect;
• Difficulty comes when it’s not your application
  – How do you instrument someone else’s compiled code?
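When the application is yours, the same three steps can be issued from application code instead of SQL*Plus. The following is a minimal sketch, assuming a Python DB-API 2.0 cursor connected to Oracle through a driver that accepts named binds such as `:module`. The helper names `enable_trace`, `set_task`, and `run_demo` are hypothetical; the PL/SQL calls and their arguments are the ones shown on the slide.

```python
def enable_trace(cursor):
    """Turn on extended SQL trace for the current session (step 1)."""
    cursor.execute(
        "begin dbms_monitor.session_trace_enable(null, null, true, true); end;"
    )


def set_task(cursor, module, action):
    """Label the database calls that follow with a business task name (steps 2 and 3)."""
    cursor.execute(
        "begin dbms_application_info.set_module(:module, :action); end;",
        {"module": module, "action": action},
    )


def run_demo(cursor):
    """Replay the slide's example through any DB-API style cursor."""
    enable_trace(cursor)
    set_task(cursor, "demo", "greeting")
    cursor.execute("select 'hello world' from dual")
    set_task(cursor, "demo", "real business")
    cursor.execute("select count(*) from dba_objects where owner = 'SYSTEM'")
```

Keeping the labeling in one small helper means each business task costs the developer a single extra line.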
This kind of instrumentation gives you what you need to profile the response time of a business task.
• The DBMS_MONITOR call enables the trace
• The DBMS_APPLICATION_INFO calls identify business tasks
  …
  BEGIN dbms_application_info.set_module('demo','greeting'); END;
  …
  *** ACTION NAME:(greeting) 2005-02-03 15:23:47.189
  *** MODULE NAME:(demo) 2005-02-03 15:23:47.189
  …
  select 'hello world' from dual
  …
  *** ACTION NAME:(real business) 2005-02-03 15:23:47.193
  *** MODULE NAME:(demo) 2005-02-03 15:23:47.193
  …
  select count(*) from dba_objects where owner='SYSTEM'
  …
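Because the module and action markers appear in the trace stream, a profiler can attribute each trace line to a business task. Here is a minimal sketch of that grouping, assuming only the marker layout visible in the excerpt. The function name `group_by_task` is hypothetical, and `SAMPLE` is the excerpt with the elided portions left out.

```python
import re
from collections import defaultdict

# Matches lines such as: *** ACTION NAME:(greeting) 2005-02-03 15:23:47.189
MARKER = re.compile(r"^\*\*\* (MODULE|ACTION) NAME:\((.*?)\) ")


def group_by_task(lines):
    """Map (module, action) to the trace lines that follow those markers."""
    current = {"MODULE": None, "ACTION": None}
    tasks = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        match = MARKER.match(line)
        if match:
            current[match.group(1)] = match.group(2)
        elif line.strip():
            tasks[(current["MODULE"], current["ACTION"])].append(line)
    return dict(tasks)


SAMPLE = """\
*** ACTION NAME:(greeting) 2005-02-03 15:23:47.189
*** MODULE NAME:(demo) 2005-02-03 15:23:47.189
select 'hello world' from dual
*** ACTION NAME:(real business) 2005-02-03 15:23:47.193
*** MODULE NAME:(demo) 2005-02-03 15:23:47.193
select count(*) from dba_objects where owner='SYSTEM'
""".splitlines()

for task, body in group_by_task(SAMPLE).items():
    print(task, body)
```

A real trace file carries many more line types between the markers; this sketch simply files every non-marker line under the most recent module and action.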
Instrumenting everything else
So, what if most of your user’s response time is spent outside the Oracle tier?
• It happens a lot more these days
  – Fancier user interfaces
  – Fancier post-retrieval processing
  – More tiers
Your custom application probably has a lot more bad code in it than your Oracle kernel does.
My ideas about how to design trace data are influenced by having studied Oracle trace files for so many years.
• Oracle’s trace diagnostics are tremendous!
• But…
  – Difficult to understand
  – Very difficult to profile
• You can do it better
  – Some proposed requirements…
Requirement: files and file identification…
• Trace data must be written to a file.
• The application user gets to decide where this file should be written and what its name shall be.
• The application user gets to decide whether to run a program with tracing turned on, or with tracing turned off.
• The file has a version number in it and whatever additional information is required (such as a field key) so the application user (and his profiler software) can understand how to interpret the particular version of the data he’s looking at. This allows the format of trace to improve over time without breaking older profilers.
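These file requirements can be sketched in a few lines. The example below assumes a hypothetical environment variable, `APP_TRACE_FILE`, as the user's switch: unset means tracing is off, and a value names the file to write. The version and key lines follow the layout of the example trace file in this talk; the function names are illustrative only.

```python
import os

TRACE_VERSION = "1.1"
FIELD_KEY = "time ela usr sys dep caller callee p1 p2 p3"


def write_header(out):
    """Write the version and field key so a profiler knows how to read the rest."""
    out.write(f"version={TRACE_VERSION}\n")
    out.write(f"key={FIELD_KEY}\n")


def open_trace(environ=os.environ):
    """Return an open trace file, or None when the user has not asked for tracing.

    APP_TRACE_FILE is a hypothetical variable name chosen for this sketch.
    """
    path = environ.get("APP_TRACE_FILE")
    if not path:
        return None
    out = open(path, "w", encoding="utf-8")
    write_header(out)
    return out
```

A command-line option would serve equally well; the point is that the person running the program, not the developer, chooses whether and where to trace.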
An example trace file…
  version=1.1
  key=time ela usr sys dep caller callee p1 p2 p3
  1107275831.899634=2005/02/01/10:37:11.899634
  1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
  1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
  1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
  1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
  1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
  1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
  1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2
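A trace file in this shape is straightforward to profile. The sketch below is one possible reader, not the author's profiler. It assumes each event occupies one line with fields in key order, that `<>` in the caller field means "no instrumented caller", and that a caller's elapsed time includes the elapsed time of its callees (the sample values are consistent with that reading).

```python
from collections import defaultdict

TRACE = """\
version=1.1
key=time ela usr sys dep caller callee p1 p2 p3
1107275831.899634=2005/02/01/10:37:11.899634
1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2
"""


def parse(text):
    """Yield one dict per event line, named by the file's own key line."""
    key = None
    for line in text.splitlines():
        if line.startswith("version="):
            continue
        if line.startswith("key="):
            key = line[len("key="):].split()
            continue
        fields = line.split()
        if key is None or len(fields) < 7:
            continue  # skips the raw=wall-clock anchor line and blank lines
        event = dict(zip(key, fields))  # unused trailing p-fields are absent
        for name in ("time", "ela", "usr", "sys"):
            event[name] = float(event[name])
        event["dep"] = int(event["dep"])
        yield event


def profile(events):
    """Exclusive elapsed seconds per callee: own time minus direct callees' time."""
    self_time = defaultdict(float)
    for e in events:
        self_time[e["callee"]] += e["ela"]
        if e["caller"] != "<>":
            self_time[e["caller"]] -= e["ela"]
    return dict(self_time)


for name, seconds in sorted(profile(parse(TRACE)).items(), key=lambda kv: -kv[1]):
    print(f"{seconds:10.6f}  {name}")
```

On the sample data, nearly all of the roughly 5.0 seconds of response time lands in `randomer` (about 4.78 seconds), with `dad` itself contributing well under a millisecond.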
Requirement: vendor support…
• The application vendor must fully support the application’s trace data. The vendor must fully document the format of the trace file and the meaning of its content.
Requirement: business task orientation…
• A trace file “event” line maps to a logical unit of work, usually a subroutine. The unit of work must be small enough that the reader of the trace data doesn’t require more detail about the unit of work than is rendered in the trace file. The unit of work must be large enough to minimize the measurement intrusion effect of the instrumentation.
• Every time a business-level task begins or ends, the application must emit information to the trace file to signify the business task boundary.
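Task boundary markers can be emitted with very little code. The following sketch uses a context manager; the marker layout (`task-begin` and `task-end` lines) is purely hypothetical, since the talk does not define one.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def business_task(name, out=sys.stdout, clock=time.time):
    """Write a begin marker, run the task body, then write an end marker."""
    out.write(f"{clock():.6f} task-begin {name}\n")
    try:
        yield
    finally:
        # The end marker is written even when the task body raises.
        out.write(f"{clock():.6f} task-end {name}\n")


with business_task("order-entry"):
    pass  # the instrumented business logic would run here
```

Everything written between the two markers belongs to the named task, which is what lets a profiler aggregate by unit of business work.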
Note that my example doesn’t yet demonstrate the second point (task begin/end markers).
Requirement: coverage…
• The collection of “event” lines in the trace file must provide complete coverage of the application’s code path.
• If a business task is permitted to execute piecewise across two or more OS processes, then the trace data must contain markers sufficient to assemble the relevant fragments of trace data into one contiguous time-sequential description of the task’s response time.
• Each tier must be instrumented so that a user can compute end-to-end response time for the measured task.
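Assembling fragments from several processes is a merge by timestamp followed by a filter on the task marker. The sketch below assumes a hypothetical in-memory form in which every event carries a task identifier, and it assumes the clocks of the participating processes are synchronized closely enough for timestamps to be comparable.

```python
import heapq


def assemble(task_id, *fragments):
    """Merge per-process event streams into one time-ordered list for one task.

    Each fragment is an iterable of (timestamp, task_id, description) tuples,
    already sorted by timestamp, as a single process would emit them.
    """
    merged = heapq.merge(*fragments, key=lambda event: event[0])
    return [event for event in merged if event[1] == task_id]


web = [(1.0, "T1", "web: request received"), (4.0, "T1", "web: response sent")]
app = [(2.0, "T1", "app: start"), (2.5, "T2", "app: other task"), (3.0, "T1", "app: done")]

for event in assemble("T1", web, app):
    print(event)
```

Without the task identifier in every fragment, the second task's event at 2.5 could not be separated from the first task's timeline.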
Requirement: timestamps...
• Each “event” line must have a timestamp. The trace documentation must explain to what event that timestamp refers. (Typically, it’s the time of the event’s conclusion.)
• If the trace file’s timestamp values aren’t human-readable, then the trace file must provide information that allows for easy conversion of timestamps into human-readable wall-clock values.
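The example trace file in this talk meets the second point with a single line that pairs a raw timestamp with its wall-clock equivalent. Here is a minimal sketch of how a profiler might use that line. The function name `make_converter` is hypothetical, and the wall-clock value is treated as whatever local time the writer used, with no time zone assumed.

```python
from datetime import datetime, timedelta

ANCHOR = "1107275831.899634=2005/02/01/10:37:11.899634"


def make_converter(anchor_line):
    """Return a function mapping raw trace timestamps to wall-clock datetimes."""
    raw, wall = anchor_line.split("=")
    base = float(raw)
    wall_time = datetime.strptime(wall, "%Y/%m/%d/%H:%M:%S.%f")

    def to_wall_clock(timestamp):
        # Offset from the anchor, applied to the anchor's wall-clock value.
        return wall_time + timedelta(seconds=timestamp - base)

    return to_wall_clock


to_wall_clock = make_converter(ANCHOR)
print(to_wall_clock(1107275833.456488))
```

Anchoring to a line inside the file means the reader never has to guess the time zone or epoch of the machine that wrote the trace.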
Requirement: event attributes…
• Each “event” line must show an elapsed time consumption.
• Each “event” line must show resource consumption for both kernel mode and user mode CPU usage.
• Each “event” line must show the name of the “event,” its call stack depth, and the name of its caller.
• Each “event” line must have the provision for displaying context-sensitive values about the instrumented event.
Requirement: un-buffered output...
• The application must flush trace lines to the trace file as events complete. If the application can buffer its trace emissions, then there must exist a user-selectable option to produce un-buffered output.
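In Python, for instance, the user-selectable option can map directly onto the `buffering` argument of the built-in `open`. This is a minimal sketch; the function name `open_trace_file` and its `unbuffered` flag are hypothetical.

```python
def open_trace_file(path, unbuffered=True):
    """Open a trace file; by default each completed event line is flushed at once."""
    # buffering=1 selects line buffering for text files: every "\n" triggers a flush.
    # buffering=-1 leaves the interpreter's default block buffering in place.
    return open(path, "w", buffering=1 if unbuffered else -1, encoding="utf-8")
```

With line buffering, someone watching the file while the application runs (or examining it after a crash) sees every event that has completed so far. Note that a flush hands the data to the operating system; it does not by itself force the data onto disk.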
Requirement: conservation of storage...
• Trace data should be reasonably conservative about space consumption (and the time it takes to write the trace data). For example, a single key defining the meaning of delimiter-separated fields is more efficient than using a name=value style syntax for every field throughout the trace file.
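The size difference is easy to quantify. The sketch below compares the two styles for a file of identical events, using one line's worth of values in the style of the example trace file; the function names are hypothetical.

```python
KEY = ["time", "ela", "usr", "sys", "dep", "caller", "callee"]
EVENT = ["1107275833.456488", "1.556308", "1.520000", "0.000000", "1", "dad", "randomer"]


def keyed_size(events):
    """Characters used when field names appear once, in a key line at the top."""
    header = "key=" + " ".join(KEY) + "\n"
    return len(header) + sum(len(" ".join(e) + "\n") for e in events)


def name_value_size(events):
    """Characters used when every line repeats every field name."""
    return sum(
        len(" ".join(f"{k}={v}" for k, v in zip(KEY, e)) + "\n") for e in events
    )


events = [EVENT] * 1000
print(keyed_size(events), name_value_size(events))
```

For these 1,000 events the keyed format needs 60,039 characters against 95,000 for name=value, so the name=value file is more than half again as large.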
Requirement: minimized invasiveness…
• The application instrumentation must be minimally invasive upon the response time of the application.
• The application instrumentation must be minimally invasive upon the author of the application code.
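Both kinds of invasiveness can be kept low with a decorator that costs the developer one line per function and costs nothing at run time when tracing is off. This is a sketch under stated assumptions: `TRACING` is a hypothetical module-level switch, and `compute_invoice` is a made-up function used only to show the decorator in place.

```python
import functools
import time

TRACING = False  # hypothetical switch, e.g. set from a command-line flag at startup


def instrument(func):
    """Return func unchanged when tracing is off, so a disabled trace adds no per-call cost."""
    if not TRACING:
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            print(f"{func.__name__} {time.perf_counter() - start:.6f}")

    return wrapper


@instrument
def compute_invoice(quantity):
    return quantity * 2
```

The trade-off in this design is that the on/off decision is made once, when the function is defined, so tracing cannot be toggled while the program runs.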
Wrap-up
• You can’t manage what you can’t measure
• Instrumented code is faster, better code that’s easier and cheaper to support
• You can learn a lot about instrumentation by watching how Oracle does it
• The “requirements” proposed in my paper will help you create trace files that are easier to use than Oracle’s
If you will ever be responsible for the performance of your application, then you’ll thank yourself later if you instrument it today. So will your support staff.
Hotsos: Come see us…
• Thought leadership
  – Optimizing Oracle Performance
  – Oracle Insights
  – Method R
• Products
  – Hotsos Profiler
  – Laredo
  – Interceptor technologies
• Services
  – 1-week performance assessment
  – On-site consulting and education
  – Remote consulting
• Education
  – Oracle performance curriculum
  – Hotsos Symposium