System Management & onTune Oct. 2014
System Administrator's Roles
• Analyzing system logs and identifying potential issues with computer systems.
• Introducing and integrating new technologies into existing data center environments.
• Performing routine audits of systems and software.
• Performing backups.
• Applying operating system updates, patches, and configuration changes.
• Installing and configuring new hardware and software.
• Adding, removing, or updating user account information, resetting passwords, etc.
• Answering technical queries and assisting users.
• Responsibility for security.
• Responsibility for documenting the configuration of the system.
• Troubleshooting any reported problems.
• System performance tuning.
• Ensuring that the network infrastructure is up and running.
• Configuring, adding, and deleting file systems; knowledge of volume management tools such as Veritas, ZFS, and LVM.
Source: Wikipedia.org
System Performance & Monitoring
Overall system performance is limited by the component that performs worst, so tuning based on real performance and resource utilization data is very important for improving it. System performance is affected by various factors.
Basic system hardware resources
• CPU, Memory, Disk, and Network Performance
Software Resources
• Database Engine
• Application Framework
• Applications
• File systems
• Others
Configuration
• Hardware Partitioning, Virtualization
• Load balancing
  - Nodes
  - Disks
  - Networks
  - Others
• Application configuration
• OS configuration
Etc.
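The idea that the worst-performing component limits overall performance can be sketched in a few lines of Python. This is only an illustration, not part of onTune: the utilization figures are invented sample values, and `find_bottleneck` is a hypothetical helper name.

```python
# Hypothetical utilization samples (percent busy) for the basic hardware
# resources listed above. The numbers are illustrative, not measured data.
utilization = {"cpu": 35.0, "memory": 62.0, "disk": 97.0, "network": 20.0}


def find_bottleneck(samples):
    """Return the (resource, utilization) pair with the highest utilization."""
    resource = max(samples, key=samples.get)
    return resource, samples[resource]


resource, busy = find_bottleneck(utilization)
print(f"Likely bottleneck: {resource} at {busy:.0f}% utilization")
```

In practice the comparison would use measured data over a time period rather than a single snapshot, which is why collected performance data matters for tuning.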
A morning in the life of George
George was in trouble. A seemingly simple deployment was taking all morning, and there seemed no end in sight. His manager kept coming in to check on his progress, as the customer was anxious to have the deployment done. He was supposed to be leaving for a goodbye lunch for a departing co-worker, adding to the stress. He had called in all kinds of help, including colleagues, an application architect, technical support, and even one of the system developers. He used email, instant messaging, face-to-face contacts, his phone, and even his office mate's phone to communicate with everyone. And George was no novice. He had been working as a Web-hosting administrator for three years, and he had a bachelor's degree in computer science. But it seemed that all the expertise being brought to bear was simply not enough. Why was George in trouble?
George is a system administrator, one of the people who work behind the scenes to configure, operate, maintain, and troubleshoot the computer infrastructure that supports much of modern life. Their work is critical and expensive. The human part of total system cost-of-ownership has been growing for decades, now dominating the costs of hardware or software.
Image caption: The hands of a system administrator. System administrators solve issues while receiving requests through various routes.
Source: Eben M. Haber, Eser Kandogan, Paul P. Maglio, Communications of the ACM, Vol. 54 No. 1
System administration as usual
System administrators are responsible for keeping system services running well and for maintaining them.
Daily system operation inspection
• System resource utilization (CPU, Memory, Disk I/O, Network I/O)
• Tracing system logs
• Monitoring wasted resources (e.g. memory leaks)
• Tracing unexpected process behavior
• User account management
• Supporting application development (super-user support, test, execution)
• Applying and integrating new H/W and S/W
Routine work
• Applying system updates and patches
• System backups
Urgent work
• Urgent patches
• System configuration changes
Troubleshooting
Others
Image source : sachachua.com
Activities for troubleshooting system failure
When system performance degrades or a system failure occurs, system administrators try to solve the problem as soon as possible, and these activities consume time and human resources.
Holding an urgent meeting with the people related to the issue
• System Administrators
• Database Administrators
• Network Administrators
• Hardware Vendor Engineers
• Software Vendor Engineers
• Application Developers
Typical situation until any action is taken
• Everyone argues that the responsibility is not theirs. (If evidence is hard to find, few people admit responsibility.)
• Lack of system logs from the time the issue occurred
• Lack of system performance data
• Long delay before any action is taken
Resulting issues
• Resources wasted
• Customer complaints about delayed service
The money wasted when a system issue occurs
When a system failure occurs, you have to pay for the recovery.
Expense of a single system failure (monthly pay for an advanced technician: $9,000)
• 2 weeks for a system administrator: $6,000
• Database/Application/Software administrators (1 week each): $9,000
• Outsourced support, 3 companies for a week: $9,000
• $24,000 paid in total
If 6 unexpected system failures occur per year, $144,000 is paid in labor expenses
• Customer dissatisfaction
• Customer call response cost
• Adverse effect on the company reputation
Invisible cost
• Resources wasted
• Customer complaints about delayed service
By reducing system recovery time to one third of its current length, you can save about $100,000 a year in labor expenses (two thirds of $144,000 is $96,000). (Customer satisfaction will be improved.)
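The arithmetic behind these figures can be checked with a short Python sketch. The dollar amounts come from the slide; the variable names and the `annual_saving` helper are invented for illustration, and the calculation assumes labor cost scales linearly with recovery time. Under that assumption, a saving close to $100,000 corresponds to recovery time shrinking to one third of its current length ($96,000 saved), whereas cutting it by one third would save $48,000.

```python
from fractions import Fraction

# Labor figures taken from the slide; only the names are invented here.
labor_per_failure = {
    "system administrator (2 weeks)": 6_000,
    "database/application/software administrators (1 week each)": 9_000,
    "outsourced support (3 companies, 1 week)": 9_000,
}
failures_per_year = 6

cost_per_failure = sum(labor_per_failure.values())
annual_cost = cost_per_failure * failures_per_year


def annual_saving(yearly_cost, remaining_fraction):
    """Labor saved per year if recovery time shrinks to `remaining_fraction`
    of its current length, assuming cost is proportional to recovery time."""
    return yearly_cost * (1 - remaining_fraction)


print(f"Cost per failure: ${cost_per_failure:,}")
print(f"Annual labor cost: ${annual_cost:,}")
print(f"Saving at 1/3 of the time: ${int(annual_saving(annual_cost, Fraction(1, 3))):,}")
print(f"Saving at 2/3 of the time: ${int(annual_saving(annual_cost, Fraction(2, 3))):,}")
```

`Fraction` is used so that thirds are computed exactly rather than with floating-point rounding. The invisible costs listed above are not included in this calculation.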
Is there no difference between a general System Management Solution and an Expert System Performance Analysis solution?
An Expert System Performance Analysis solution can substitute for a general System Management Solution, but not vice versa. An Expert System Performance Analysis solution should provide analysis functions for the time of a system failure as well as real-time system monitoring features.
Common requirements for General tools and Expert tools
• Collecting performance data regularly (every 5 min ~ 10 min)
• Tracing system logs
• Tracing system events
• Reporting features
• Collecting and analyzing system resource utilization
• Monitoring critical processes
Distinct requirements for Expert tools (including System Failure Analysis)
• Collecting performance data at very short intervals (in seconds)
• Monitoring every single process without specific configuration
• Tracing by user, process, group
• System performance comparison, analysis, and tracing for a certain period (at the time of system failure)
• User-defined script execution and tracing of the outcomes
• Tracing system configuration changes
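Why the collection interval matters can be shown with a small simulation. The Python sketch below is an illustration only: the CPU trace is invented (a steady 20% load with a 10-second burst at 100%), and `peak_of_window_averages` is a hypothetical helper. It compares one data point per 5 minutes with one data point per 2 seconds.

```python
# Hypothetical CPU utilization trace, one sample per second for 5 minutes:
# a steady 20% load with a 10-second burst at 100%.
per_second = [20.0] * 300
per_second[120:130] = [100.0] * 10


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def peak_of_window_averages(samples, window):
    """Average the samples in consecutive windows and return the highest average."""
    windows = [samples[i:i + window] for i in range(0, len(samples), window)]
    return max(average(w) for w in windows)


# One data point per 5 minutes: the burst is diluted into the average.
print(peak_of_window_averages(per_second, 300))
# One data point per 2 seconds: the burst is clearly visible.
print(peak_of_window_averages(per_second, 2))
```

With the 5-minute window the burst raises the reported value only slightly above the baseline, so a short-lived overload at the time of a failure can go unnoticed in coarse data.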
System failures usually occur under unexpected conditions. For instance, the database engine process is the most critical one on a database server, so system administrators generally configure the database engine processes to be monitored in advance. However, if an unexpected process that was not defined for monitoring in advance causes the database server to fail, that monitoring configuration is useless. In fact, it is impossible for system administrators to define in advance every factor that may cause a problem on a server, whether it is a database server, an application server, or another kind. For this reason, a general system management solution cannot trace the root cause of the problem when a system failure occurs. onTune, a leading-edge Expert System Performance Analysis solution, does not need any extra configuration. Once it is installed, the whole system environment, including resources, processes, configurations, and others, is monitored automatically.
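The limitation of a pre-defined watch list can be illustrated with a toy example. The Python below is a sketch under invented assumptions: the process names, CPU figures, and the `heavy_processes` helper are hypothetical and do not describe how onTune is implemented.

```python
# Hypothetical snapshot of per-process CPU usage (percent) on a database server.
# Process names and values are invented for illustration.
snapshot = {"db_engine": 12.0, "backup_agent": 3.0, "log_rotate.sh": 91.0}

# Only processes configured in advance are inspected by the watch-list approach.
watch_list = {"db_engine"}


def heavy_processes(processes, threshold, only=None):
    """Return the sorted names of processes above `threshold`.

    If `only` is given, processes outside that set are ignored,
    which models monitoring limited to a pre-defined list.
    """
    return sorted(
        name
        for name, cpu in processes.items()
        if cpu > threshold and (only is None or name in only)
    )


print(heavy_processes(snapshot, 80.0, only=watch_list))  # watch list only
print(heavy_processes(snapshot, 80.0))                   # every process
```

The watch-list call reports nothing because the process consuming the CPU was never configured, while the unrestricted call finds it.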
10 Usability Heuristics for User Interface Design by Jakob Nielsen
These are the 10 most general principles for interaction design. They are called "heuristics" because they are more in the nature of rules of thumb than specific usability guidelines. onTune is a product that satisfies these principles.
• Visibility of system status: With simple and easy navigation, both past and real-time system performance data are provided.
• Match between system and the real world: Collecting all data for any operating system and showing it in simple words.
• User control and freedom: Simple to exit from a window and take control.
• Consistency and standards: Standardizing performance data across operating systems and presenting it in a single language.
• Error prevention: Eliminating error-prone conditions.
• Recognition rather than recall: Easy-to-understand icons and menus for navigation.
• Flexibility and efficiency of use: Direct-access menus and right-mouse-button menus make it easy to get information.
• Aesthetic and minimalist design: Simple menus, intuitive icons, and easy navigation for beginners.
• Help users recognize, diagnose, and recover from errors: Simple messages for user notification.
• Help and documentation: Comprehensive documents covering installation and usage.
System Analysis & Performance Instrument: onTune SPI
onTune is an expert system performance analysis tool that excels at tracing the root cause of problems and system failures. It also provides the functions of a general system management solution.
onTune is
TeemStone
• A brand-new, leading-edge solution
• Considers engineers and administrators as colleagues.
• Differentiated from existing solutions, it takes a totally new form.
• Built by outstanding members of each field.
SPI
• A new solution that is practically helpful to system administrators.
  - Discovery of a new area
  - SPI: System Analysis & Performance Instrument
onTune Users
• Before: Even though they paid a lot of money to build system management solutions, those solutions provided only restricted features.
• After: They secure accurate data that helps analyze system performance and problems, take proactive and proper actions, and spend their time on worthwhile things instead of wasting it.
• In fact, 90% of customers have decided to use onTune after a proof of concept.
Application
onTune helps users save problem-solving time, improve the efficiency of system operation, and avoid spending much time creating reports.
System performance management & analysis
• Forecasting, monitoring, and analyzing all aspects of system operation. System performance data can be collected at intervals of seconds.
Virtualization system
• Real-time monitoring and analysis of resource migration and usage in virtualized environments.
Notification to users
• Notifying pre-defined users in real time, with accurate information, when certain system conditions occur.
Report
• Creating system performance reports in Excel, PowerPoint, and other formats.
Using accurate real-time performance data, users can analyze every factor that may affect the business. Compared with existing general System Monitoring Solutions, onTune stands at the forefront.
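The notification behavior described above (notify pre-defined users when a certain system condition occurs) can be pictured as a simple rule check. The Python below is a minimal sketch under assumed names: `Rule`, `evaluate`, the metric names, the limits, and the e-mail address are all hypothetical and are not taken from onTune.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """A pre-defined condition: notify when `metric` exceeds `limit`."""
    metric: str
    limit: float
    recipients: tuple


def evaluate(rules, sample):
    """Return one message per recipient for every rule the sample violates."""
    messages = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is not None and value > rule.limit:
            for recipient in rule.recipients:
                messages.append(
                    f"to {recipient}: {rule.metric}={value} exceeds {rule.limit}"
                )
    return messages


# Hypothetical rules and a hypothetical performance sample.
rules = [
    Rule("cpu", 90.0, ("ops@example.com",)),
    Rule("disk_io_wait", 30.0, ("ops@example.com",)),
]
sample = {"cpu": 95.0, "disk_io_wait": 5.0}

for message in evaluate(rules, sample):
    print(message)
```

A real tool would deliver the messages through e-mail, SMS, or a console alarm; here they are only collected as strings so the rule logic stays visible.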
Macro Scope
By collecting and analyzing performance data, onTune monitors and detects the factors that affect the business.
Monitoring
• 2-second monitoring interval – detailed analysis available
• Real-time search and analysis
• Real-time I/O traffic analysis
• Monitoring user processes in real time
• Veritas Volume Manager support
Performance monitoring
• Detailed analysis data for performance and system failure
• Performance analysis of each resource over a given time period (bottleneck finding) – tracing changes over time
• Process resource usage analysis – user, group, command
• Automated reporting tool – PowerPoint, Excel
Automation
• Installation – centralized, automated installation
• Defining monitored objects – As soon as the agent is installed and connected to the manager, it starts collecting data on all resources and processes.
• Group trace and monitoring
Virtualized system monitoring
Event monitoring & notification – When a pre-defined system condition is met, onTune raises an alarm and notifies the users.
Performance analysis report – PowerPoint & Excel
Others
vs. onTune
T. 02 2057 7393
F. 02 2057 7394
E. support@ontune.co.kr
http://www.ontune.co.kr