Observability: A Guide for Buyers
Contents
• Observability makes reactive operation teams proactive
• How COVID-19 impacts the need for observability
• How does Micro Focus help companies with observability?
• Application Performance Monitoring: What it means in today’s complex software world
• Gartner’s 3 requirements for APM
• A guide to observability tools
Observability makes reactive operation teams proactive
By Jenna Sargent

You’ve likely heard the term observability being passed around for the past few years, and you might have assumed that it is just another marketing buzzword for monitoring. And you wouldn’t be alone in that thinking. Some experts would say that “observability,” “AIOps” and “Application Performance Monitoring (APM)” are just terms used to distinguish between products on the market. But others would argue that there is a concrete difference between all of these terms. Wes Cooper, Product Marketing Manager at Micro Focus, is among them. Cooper believes that a big differentiator between observability and monitoring is that observability is proactive, while monitoring is reactive. In other words, observability aims to tell you why something broke, while monitoring just tells you what is broken. This differentiator is key. There are a number of other differences between the two, but they all tie back into this idea of being proactive versus being reactive. According to Cooper, observability is good for looking into the unknown, acts as a complement to DevOps, and uses automation to help fix problems. Monitoring, on the other hand, identifies known problems, is siloed in ops teams, and spots problems but doesn’t fix them, he explained.
How to move from reactive to proactive
Getting from reactive to proactive states of operation isn’t as simple as flipping a switch. According to Cooper, there are a number of things companies need to do in order to get from the “what” state of monitoring to the “why” state of observability. First, an organization needs to be collecting data from all data points. This means looking at metrics, events, logs, topology, and any changes. They also need to collect information from all data domains, including on-premises, private clouds, public clouds, and containers. Lastly, they also need to be looking at things from all perspectives, such as the system and the end users. In addition to gathering all of this data, companies also have to utilize machine learning and big data analytics in order to actually gain insights from it. In other words, companies have to adopt AIOps, a methodology and technology that introduces AI, machine learning, and automation into IT operations. AIOps is a clear break from the monitoring of the past. AIOps takes into account not just the application, but also the infrastructure — how the cloud is performing, how the network is performing, etc. With APM, you’re only looking at the application itself and the data tied to that application, explained Stephen Elliot, program director of I&O at research firm IDC. “I think now that one of the big differences is not only do you have to have that data, but it’s a much broader set of data — logs, metrics, traces — that is collected in near real-time or real-time with streaming architectures,” said Elliot.
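As a rough illustration of what collecting “from all data points and all data domains” can look like in practice, here is a minimal sketch in Python. The field names and adapter functions are hypothetical (they are not tied to Micro Focus or any other product), but they show the common pattern: every domain-specific record is normalized into one shared envelope before it reaches the data lake that machine learning and analytics later read from.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

@dataclass
class TelemetryRecord:
    """Common envelope for any signal, regardless of the domain it came from."""
    kind: str          # "metric", "event", "log", "topology" or "change"
    domain: str        # "on-prem", "private-cloud", "public-cloud", "container"
    source: str        # host, pod, service, device, etc.
    timestamp: datetime
    payload: dict[str, Any] = field(default_factory=dict)

def normalize_cloud_metric(raw: dict) -> TelemetryRecord:
    """Hypothetical adapter: map one public-cloud metric shape onto the envelope."""
    return TelemetryRecord(
        kind="metric",
        domain="public-cloud",
        source=raw["instance_id"],
        timestamp=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        payload={"name": raw["metric"], "value": raw["value"]},
    )

def normalize_container_log(raw: dict) -> TelemetryRecord:
    """Hypothetical adapter for a container log line."""
    return TelemetryRecord(
        kind="log",
        domain="container",
        source=raw["pod"],
        timestamp=datetime.fromisoformat(raw["time"]),
        payload={"level": raw["level"], "message": raw["msg"]},
    )

# One adapter per source feeds the same stream, so the analytics layer only
# has to understand a single schema instead of one per tool or domain.
```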
The three pillars of observability
Cindy Sridharan’s popular “Distributed Systems Observability” book, published by O’Reilly, identifies logs, metrics, and traces as the three pillars of observability. According to Sridharan, an event log is a record of discrete events, each containing a timestamp and a payload of content. Event logs come in three forms:
• Plaintext: a log record stored as free-form text, the most commonly used type of log
• Structured: a log record typically stored in JSON format, and widely advocated as the form to use
• Binary: examples include Protobuf-formatted logs, MySQL binlogs, systemd journal logs, etc.
Logs can be useful in identifying unpredictable behavior in a system. Sridharan explained that distributed systems often experience failures not because of one specific event happening, but because of a series of possible triggers. In order to pin down the cause of an event, operations teams need to be able to start with a symptom pinpointed by a metric or log, infer the life cycle of a request across various system components, and iteratively ask questions about interactions between parts of that system. Logs are the base of the three pillars, and both metrics and traces are built on top of them, Sridharan wrote.
Sridharan defines metrics as numeric representations of data measured across time intervals. They are useful in observability because they can be used by machine learning algorithms to gain insights into the behavior of a system over time. According to Sridharan, their numerical nature also allows for longer retention of data and easier querying, making them well suited for building dashboards that show historical trends.
Traces are the final pillar of observability. According to Sridharan, a trace is “a representation of a series of causally related distributed events that encode the end-to-end request flow through a distributed system.” Traces can provide visibility into the path that a request took and the structure of that request. They can help uncover the unintentional effects of a request, making them particularly well-suited for complex environments, like microservices.
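To make the three pillars concrete, here is a minimal sketch of what one structured log record, one metric point, and one trace span might look like for the same slow request. The records are plain Python dictionaries with invented field names; they illustrate the shape of each signal, not any particular vendor’s schema.

```python
from datetime import datetime, timezone

now = datetime.now(timezone.utc).isoformat()

# Structured (JSON-style) log record: a timestamp plus a payload of content.
log_record = {
    "timestamp": now,
    "level": "ERROR",
    "service": "checkout",
    "message": "payment gateway timed out",
    "trace_id": "abc123",          # lets a responder jump from the log to the trace
}

# Metric: a numeric value measured over a time interval, cheap to retain and query.
metric_point = {
    "name": "checkout.request.latency_ms",
    "timestamp": now,
    "value": 5021.0,
    "tags": {"service": "checkout", "region": "us-east"},
}

# Trace span: one hop in the causally related chain of events for a single request.
trace_span = {
    "trace_id": "abc123",          # shared by every span in the same request
    "span_id": "def456",
    "parent_span_id": "000aaa",    # encodes the end-to-end, causal structure
    "name": "POST /charge",
    "start": now,
    "duration_ms": 5000,
    "status": "timeout",
}
```

The shared trace_id is what lets a team start from a symptom surfaced by a metric or log and then walk the request’s spans, as Sridharan describes.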
Monitoring the right data
Organizations can sometimes fall into the trap of monitoring everything. According to Cooper, it’s possible for a company to get to a point where they’re monitoring too much. It’s important to be able to determine what data to observe, and what to ignore. Rather than taking data in from every single domain, Cooper recommends focusing on what’s important: metrics, events, logs, and change data. “Part of getting to being more proactive or more observable is being able to collect and store all those different data types,” he said. “So machine data coming in from metrics, events, logs, topology, are all really important to paint that full picture.”
Infrastructure complexity adds to the need for observability
According to Cooper, the evolution of observability has mostly happened within the past few years. He believes that up until then, monitoring was sufficient at most companies. For a long time it was typical to have one application running on one server. If an organization had 20 applications, it could have one or two people looking after them, using basic monitoring tools to get performance statistics on those servers. That just isn’t the case anymore. Now, organizations have workloads that aren’t just running on physical servers they have control over. Workloads are running all over the place: across different public cloud vendors, in private clouds, in containerized environments, etc.
Those technologies all help increase velocity and reduce the friction created through the process of getting code into production. But that increased velocity comes at a price: greater complexity. “The fact now is you have a lot more data that’s being produced in these environments every single day,” said Cooper. “And so having the ability to ingest all that data, but also make sense of that data, is something that’s taking a ton of operators’ time. And they don’t have the ability to go through that manually every day. They literally are to the point where you have to enlist analytics and machine learning to help comb through all that data and make sense of it.”
Operators’ jobs have become not only more complicated, but more important. For example, something simple like a bad configuration or change can throw off an entire service. “Really being able to enlist tools that can take in all this information, all this performance information, all the user data, and being able to paint a picture of that across all these different environments has really made monitoring a lot more complex, and that’s why organizations have to get more proactive today,” said Cooper. This is where the need for AIOps is clear. With all of this new complexity, you need to be monitoring not just the application, but also the network and the infrastructure.
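Cooper’s point about needing machine learning to comb through the volume is easiest to see with a toy example. The sketch below is a deliberately simple stand-in for that analytics layer (a rolling z-score over one metric stream, with made-up numbers), not how any AIOps product actually works, but it shows why the combing can be automated rather than done by eyeballing dashboards.

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(values, window=30, threshold=3.0):
    """Flag points that deviate sharply from the recent baseline.

    A toy stand-in for the machine learning an AIOps platform applies:
    each new metric value is scored against a rolling window, and anything
    more than `threshold` standard deviations away is flagged.
    """
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(values):
        if len(history) >= window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append((i, value))
        history.append(value)
    return anomalies

# Steady latency around 200 ms with one spike nobody has to spot by hand.
latencies = [200 + (i % 7) for i in range(100)] + [950] + [200] * 20
print(detect_anomalies(latencies))   # -> [(100, 950)]
```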
AIOps is also important because it allows operators to train the system to reconfigure itself in order to accommodate changing loads or provision data storage as needed.
Goodbye, blame culture
As with any new technology, shifting to AIOps and observability will require a culture change at the organization. Joe Butson, co-founder of consulting company Big Deal Digital, believes that the automation in AIOps will eliminate blame cultures, where fingers are pointed when there is an incident. AIOps will lead to an acceptance that problems are going to happen. “One of the things about the culture change that’s underway is one where you move away from blaming people when things go down to, we are going to have problems, let’s not look for root cause analysis as to why something went down, but what are the inputs? The safety culture is very different. We tended to root cause it down to ‘you didn’t do this,’ and someone gets reprimanded and fired, but that didn’t prove to be as helpful, and we’re moving to a generative culture, where we know there will be problems and we look to the future,” said Butson.
This ties back into the move from reactive to proactive advocated for by Cooper. When you’re proactive about detecting issues, rather than only reacting to them, that blame culture goes away. “It’s critical to success because you can anticipate a problem and fix it. In a perfect world, you don’t even have to intervene, you just have to monitor the intervention and new resources are being added as needed,” Butson said. “You can have the monitoring automated, so you can autoscale and auto-descale. You’d study traffic based on having all this data, and you are able to automate it. Once in a while, something will break and you’ll get a call in the middle of the night. But for the most part, by automating it, being able to take something down, or roll back if you’re trying to bring out some new features to the market and it doesn’t work, being able to roll back to the last best configuration, is all automated.”
How COVID-19 impacts the need for observability
It’s difficult these days to talk about trends and predictions without framing things in terms of COVID-19. The pandemic is going to have impacts — whether they be large or small — on every aspect of business, and observability is no exception. According to Cooper, the shift to remote work has forced companies to question what they need to be monitoring. For example, a lot of customers are now connecting to their companies’ networks from home through a virtual private network (VPN). This adds another layer of technologies that need to be monitored, one that might not have even been a consideration in an office that doesn’t normally offer remote work. In addition to monitoring VPNs to make sure they’re up and running, operations teams should also be ensuring that they’re configured properly so that proprietary information doesn’t get out, Cooper explained. A lot of operations teams have been working on creating dashboards that display real-time views of how their VPNs are performing, Cooper said. This includes things like who is using the VPN and how quickly they are able to access services on that VPN. There are also a lot of collaboration tools being used to facilitate remote work that need to be looked at. Companies need to be monitoring tools like Skype for Business, Zoom, or Microsoft Teams and ensuring that they’re performing well under increased load. “We’ve got 15,000 people in our company and we’ve noticed from a Teams perspective that there is excess load that’s on their servers right now. So I think that from this whole, even when we talk about what’s relevant with COVID, we’ve seen a lot of monitoring or operations teams on the scramble. With everything going remote, we need to make some adjustments in how we’re doing monitoring. There’s definitely some relevancy,” said Cooper.
The state of observability and monitoring
The promises made by observability and AIOps are enticing, but Charley Rich, research director at Gartner, cautions against declaring observability the holy grail of monitoring just yet. According to Rich, APM is a mature market, and is past the bump on the Gartner hype cycle. “AIOps, on the other hand, is just climbing up the mountain of hype,” Rich said. “Very, very different. What that means in plain English is that what’s said about AIOps today is just not quite true. You have to look at it from the perspective of maturity.”
Rich believes that there are a lot of companies out there making assumptions about AIOps that aren’t quite true. For example, some believe that AIOps will automatically solve problems on its own, and a number of vendors market their AIOps solutions using terms like self-healing, a capability that Rich says simply doesn’t exist yet. “You know, you and I go out and have a cocktail while the computer’s doing all the work,” said Rich. “It sounds good; it’s certainly aspirational and it’s what everyone wants, but today, the solutions that run things to fix are all deterministic. Somewhere there’s a script with a bunch of if-then rules and there’s a hard-coded script that says, ‘if this happens, do that.’ Well, we’ve had that capability for 30 years. They’re just dressing it up and taking it to town.”
But Rich isn’t entirely dismissive of the hype around AIOps. “It’s very exciting, and I think we’re going to
get there,” said Rich. “It’s just, we’re early, and right now, today, AIOps has been used very effectively for event correlation — better than traditional methods, and it’s been very good for outlier and anomaly detection. We’re starting to see in ITSM tools more use of natural language processing and chatbots and virtual support assistants. That’s an area that doesn’t get talked about a lot. Putting natural language processing in front of workflows is a way of democratizing them and making complex things much more easily accessible to less-skilled IT workers, which improves productivity.” It’s important to be vigilant about
breaking down what a solution promises versus what it actually does. According to Rich, there are plenty of solutions out there using machine learning algorithms to help them self-adjust. But leveraging machine learning doesn’t automatically make a solution an AIOps solution. “Everybody’s doing this,” Rich said. “We in the last market guide segmented the market of solutions into domain-centric and domain-agnostic AIOps solutions. So domain-centric might be an APM solution that’s got a lot of machine learning in it but it’s all focused on the domain, like APM, not on some other thing. Domain-agnostic is more general-purpose, bringing in data from other tools. Usually a domain-agnostic tool doesn’t collect, like a monitoring tool does. It relies on collectors from monitoring tools. And then, at least in concept, it can look across different data streams, different tools, and come up with a cross-domain analysis. That’s the difference there.”
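To make that cross-domain idea a little more concrete, here is a deliberately simplified sketch of event correlation across feeds collected by different monitoring tools. The event shapes and the time-window rule are invented for illustration; real domain-agnostic AIOps platforms lean on machine learning and topology rather than a fixed window.

```python
from datetime import datetime, timedelta

def correlate(events, window=timedelta(minutes=2)):
    """Group events from different tools/domains that occur close together in time."""
    events = sorted(events, key=lambda e: e["time"])
    incidents, current = [], []
    for event in events:
        if current and event["time"] - current[-1]["time"] > window:
            incidents.append(current)
            current = []
        current.append(event)
    if current:
        incidents.append(current)
    return incidents

feeds = [  # one record per tool, already normalized to a common shape
    {"time": datetime(2020, 5, 1, 9, 0), "domain": "network", "msg": "packet loss on core switch"},
    {"time": datetime(2020, 5, 1, 9, 1), "domain": "infrastructure", "msg": "VM host CPU saturated"},
    {"time": datetime(2020, 5, 1, 9, 2), "domain": "application", "msg": "checkout latency SLO breached"},
    {"time": datetime(2020, 5, 1, 14, 30), "domain": "application", "msg": "deploy completed"},
]

for incident in correlate(feeds):
    print([e["domain"] for e in incident])
# -> ['network', 'infrastructure', 'application'] and ['application']
```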
How does Micro Focus help companies with observability?
Micro Focus enables teams to monitor across all domains and then consolidate that data. It helps with the collection of data and then stores that data in a single data lake architecture. In addition to allowing companies to observe across different domains, Micro Focus takes observability to the next level with machine learning and analytics. After bringing data into that data lake architecture, it provides machine learning capabilities for that data so that operators and users can better understand what the data actually means. By using Micro Focus’ tools, operators can see where patterns are occurring and then address them. Micro Focus then provides a layer of automation on top of that so that teams can automate on top of what they’re observing. It’s one thing to be able to look at an environment proactively and be able to spot problems and trends. Being able to go out and automate the process of remediation is the other half of the equation.
In light of the pandemic, Micro Focus is also tailoring its solutions to work for remote-first workplaces. Its customers have identified three primary requirements for remote IT operations: visibility into the health of collaboration tools, monitoring of VPN and Virtual Desktop Infrastructure (VDI) solutions, and keeping user experience, continuity, and security top of mind. Remote-first IT operations teams can utilize Micro Focus’ solutions to address those challenges.
Application Performance Monitoring: What it means in today’s complex software world
By David Rubinstein

Software continues to grow as the driver of today’s global economy, and how a company’s applications perform is critical to retaining customer loyalty and business. People now demand instant gratification and will not tolerate latency — not even a little bit. As a result, application performance monitoring is perhaps more important than ever to companies looking to remain competitive in this digital economy. But today’s APM doesn’t look much like the APM of a decade ago. Performance monitoring then was more about the application itself, and very specific to the data tied to that application. Back then, applications ran in datacenters on-premises, and were written as monoliths, largely in Java, tied to a single database. With that simple n-tier architecture, organizations were able to easily collect all the data they needed, which was then displayed in network operations centers to systems administrators. The hard work came from command-line launching of monitoring tools — requiring systems administration experts — sifting through log files to see what was real and what was a false alarm, and from reaching the right people to remediate the problem.
In today’s world, doing APM efficiently is a much greater challenge. Applications are cobbled together, not written as monoliths. Some of those components might be running on-premises while others are likely to be cloud services, written as microservices and running in containers. Data is coming from the application, from containers, Kubernetes, service meshes, mobile and edge devices, APIs and more. The complexities of modern software architectures broaden the definition of what it means to do performance monitoring. “APM solutions have adapted and adjusted greatly over the last 10 years. You wouldn’t recognize them at all from what they were when this market was first defined,” said Charley Rich, a research director at Gartner and lead author of the APM Magic Quadrant, as well as the lead author of Gartner’s AIOps market guide.
So, although APM is a mature practice, organizations are having to look beyond the application — to multiple clouds and data sources, to the network, to the IT infrastructure — to get the big picture of what’s going on with their applications. And we’re hearing talk of automation, machine learning and being proactive about problem remediation, rather than being reactive. “APM, a few years ago, started expanding broadly both downstream and upstream to incorporate infrastructure monitoring into the products,” Rich said. “Many times, there’s a problem on a server, or a VM, or a container, and that’s the root cause of the problem. If you don’t have that infrastructure data, you can only infer.”
Rekha Singha, the Software-Computing Systems research area head at Tata Consultancy Services, sees two major monitoring challenges that modern software architectures present. First, she said, is multi-layered distributed deployment using big data technologies, such as Kafka, Hadoop and HDFS. The second is that modern software, also called Software 2.0, is a mix of traditional task-driven programs and data-driven machine learning models. “The distributed deployment brings additional performance monitoring challenges due to cascaded failures, staggered processes and global clock synchronization for correlating events across the cluster,” she explained. “Further, a Software 2.0 architecture may need a tightly integrated pipeline from development to production to ensure good accuracy for data-driven models. Performance definitions for Software 2.0 architectures are extended to both system performance and model performance.”
Moreover, she added, modern applications are largely deployed on heterogeneous architectures, including CPUs, GPUs, FPGAs and ASICs. “We still do not have mechanisms to monitor performance of these hardware accelerators and the applications executing on them,” she noted.
The new culture of APM
Despite these mechanisms for total monitoring not being available, companies today need to compete to be more responsive to customer needs. And to do so, they have to be proactive. “We’re moving from a culture of responding, ‘our hair’s on fire,’ to being proactive,” said Joe Butson, co-founder of consulting company Big Deal Digital. “We have a lot more data … and we have to get that information into some sort of a visualization tool. And, we have to prioritize what we’re watching. What this has done is change the culture of the people looking at this information and trying to monitor and trying to move from a reactive to proactive mode.”
In earlier days of APM, when things in an application slowed or broke, people would get paged. Butson said, “It’s fine if it happens from 9 to 5, you have lots of people in the office, but then, some poor person’s got the pager that night, and that just didn’t work because of what it meant in the MTTR — mean time to recovery — depending upon when the event occurred, it took a long time to recover. In a very digitized world, if you’re down, it makes it into the press, so you have a lot of risk, from an organizational perspective, and there’s reputation risk.” High-performing companies are looking at data and anticipating what could happen, and that’s a really big change, Butson said. “Organizations that do this well are winning in the marketplace.”
Whose job is it, anyway?
With all of this data being generated and collected, more people in more parts of the enterprise need access to this information. “I think the big thing is, 10-15 years ago, there were a lot of app support teams doing monitoring, I&O teams, who were very relegated to this task,” said Stephen Elliot, program vice president for I&O at research firm IDC. “You know, ‘identify the problem, go solve it.’ Then the war rooms were created. Now, with Agile and DevOps, we have [site reliability engineers], we have DevOps engineers; there is a lot broader set of people that might own the responsibility, or have to be part of the broader process discussion.”
And that’s a cultural change. “In the NOCs, we would have had operations engineers and sys admins looking at things,” Butson said. “We’re moving across the silos and have the development people and their managers looking at refined views, because they can’t consume it all.” It’s up to each segment of the organization looking at data to prioritize what they’re looking at. “The dev world comes at it a little differently than the operations people,” Butson continued. “Operations people are looking for stability. The development people really care about speed. And now that you’re bringing security people into it, they look at their own things in their own way. When you’re talking about operations and engineering and the business people getting together, that’s not a natural thing, but it’s far better to have the end-to-end shared vision than to have silos. You want to have a shared understanding. You want people working together in a cross-functional way.”
Enterprises are thinking through the question of who owns responsibility for performance and availability of a service. According to IDC’s Elliot, there is a modern approach to performance and availability. He said at modern companies, the thinking is, “‘We’ve got a DevOps team, and when they write the service, they own the service, they have full end-to-end responsibilities, including security, performance and availability.’ That’s a modern, advanced way to think.” In the vast majority of companies, ownership for performance and availability lies with particular groups having different responsibilities. This can be based on the enterprise’s organizational structure, and the skills and maturity level that each team has. For instance, an infrastructure and operations group might own performance tuning. Elliot said, “We’ve talked to clients who have a cloud CoE that actually has responsibility for that particular cloud. While they may be using utilities from a cloud provider, like AWS CloudWatch or CloudTrail, they also have the idea that they have to not only trust their data but then they have to validate it. They might have an additional observability tool to help validate the performance they’re expecting from that public cloud provider.”
In those modern organizations, site reliability engineers (SREs) often have that responsibility. But again, Elliot stressed skill sets. “When we talk to customers about an SRE, it’s really dependent on, where did these folks come from?” he said. “Were they reallocated internally? Are they a combination of skills from ops and dev and business? Typically, these folks reside more along the lines of IT operations teams, and generally they have operating history with performance management, change management, monitoring. They also start thinking about: are these the right tasks for these folks to own? Do they have the skills to execute it properly?” Organizations also have to balance that out with the notion of applying development practices to traditional I&O principles, and bringing a software engineering mindset to systems admin disciplines. And, according to Elliot, “it’s a hard transition.”
Compound all that with the growing complexity of applications, running in the cloud as containerized microservices, managed by Kubernetes using, say, an Istio service mesh in a multicloud environment. TCS’ Singha explained that containers are not permanent, and microservices deployments have shorter execution times. Therefore, any instrumentation in these types of deployments could affect the guarantee of application performance, she said. As for functions as a service, which are stateless, application states need to be maintained explicitly for performance analysis, she continued. It is these changes in software architectures and infrastructure that are forcing organizations to rethink how they approach performance monitoring, from a culture standpoint and from a tooling standpoint.
APM vendors are adding capability to do infrastructure monitoring, which encompasses server monitoring, some amount of log file analysis, and some amount of network performance monitoring, Gartner’s Rich said. Others are adding or have added capabilities to map out business processes and relate the milestones in a business process to what the APM solution is monitoring. “All the data’s there,” Rich said. “It’s in the payloads, it’s accessible through APIs.” He said this ability to bring out and visualize data can show you, for instance, why Boston users are abandoning their carts at a 20% greater rate than New York users over the last three days, and come up with something in the application that explains that.
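As a toy illustration of the business-milestone analysis Rich describes, the sketch below computes cart-abandonment rates per city from monitored transaction events. The event payloads and milestone names are entirely invented; they stand in for the kind of data an APM tool exposes through its APIs, not any real product’s schema.

```python
from collections import defaultdict

# Hypothetical business-transaction events pulled from a monitoring data store.
events = [
    {"city": "Boston",   "session": "s1", "milestone": "cart_created"},
    {"city": "Boston",   "session": "s2", "milestone": "cart_created"},
    {"city": "Boston",   "session": "s2", "milestone": "checkout_completed"},
    {"city": "New York", "session": "s3", "milestone": "cart_created"},
    {"city": "New York", "session": "s3", "milestone": "checkout_completed"},
]

carts = defaultdict(set)       # city -> sessions that created a cart
completed = defaultdict(set)   # city -> sessions that checked out

for e in events:
    if e["milestone"] == "cart_created":
        carts[e["city"]].add(e["session"])
    elif e["milestone"] == "checkout_completed":
        completed[e["city"]].add(e["session"])

for city, sessions in carts.items():
    abandoned = sessions - completed[city]
    print(f"{city}: {len(abandoned) / len(sessions):.0%} of carts abandoned")
# Boston: 50% of carts abandoned
# New York: 0% of carts abandoned
```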
Gartner’s 3 requirements for APM
APM, as Gartner defines it in its Magic Quadrant criteria, is based on three broad sets of capabilities, and in order to be considered an APM vendor by Gartner, you have to have all three. Charley Rich, Gartner research director and lead author of its APM Magic Quadrant, explained:
The first one is digital experience monitoring (DXM). That, Rich said, is “the ability to do real user monitoring, injecting JavaScript in a browser, and synthetic transactions — the recording of those playbacks from different geographical points of presence.” This is critical for the last mile of a transaction and allows you to isolate and use analytics to figure out what’s normal and what is not, and understand the impact of latency. But, he cautioned, you can’t get to the root cause of issues with DXM alone, because it’s just the last mile. Digital experience monitoring as defined by Gartner is to capture the UX latency errors — the spinner or hourglass you see on a mobile app, where it’s just waiting and nothing happens — and find out why. Rich said this is done by doing real user monitoring — for web apps, that means injecting JavaScript into the browser to break down the load times of everything on your page as well as background calls. It also requires the ability to capture screenshots automatically, and capture entire user sessions. This, he said, “can get a movie of your interactions, so when they’re doing problem resolution, not only do they have the log data, actual data from what you said when a ticket was opened, and other performance metrics, but they can see what you saw, and play it back in slow motion, which often provides clues you don’t know.”
The second component of a Gartner-defined APM solution is application discovery, diagnostics and tracing. This is the technology to deploy agents out to the different applications, VMs, containers, and the like. With this, Rich said, you can “discover all the applications, profile all their usage, all of their connections, and then stitch that together to what we learn from digital experience to represent the end-to-end transaction, with all of the points of latency and bottlenecks and errors, so we understand the entire thing from the web browser all the way through application servers, middleware and databases.”
The final component is analytics. Using AI, machine-learning analytics applied to application performance monitoring solutions can do event correlation, reduce false alarms, do anomaly detection to find outliers, and then do root cause analysis driven by algorithms and graph analysis.
— David Rubinstein
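The synthetic-transaction half of digital experience monitoring is easy to sketch. The probe below replays one scripted request and records the latency it observes; the URL, the latency budget, and the fixed threshold are placeholders for illustration (a real DXM product scripts full multi-step transactions, runs them from many points of presence, and compares results against learned baselines rather than a hard-coded limit).

```python
import time
import urllib.request

def synthetic_check(url="https://example.com/login", timeout=10):
    """Replay one scripted transaction and report the latency observed,
    like a synthetic monitor running from a geographic point of presence."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            ok = 200 <= resp.status < 400
    except Exception:
        ok = False
    latency_ms = (time.monotonic() - start) * 1000
    return {"url": url, "ok": ok, "latency_ms": round(latency_ms, 1)}

if __name__ == "__main__":
    result = synthetic_check()
    # Flag anything slower than a fixed 2-second budget (placeholder threshold).
    if not result["ok"] or result["latency_ms"] > 2000:
        print("ALERT:", result)
    else:
        print("OK:", result)
```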
A guide to observability tools
• AppDynamics: The AppDynamics Application Intelligence Platform provides a real-time, end-to-end view of application performance and its impact on digital customer experience, from end-user devices through the back-end ecosystem — lines of code, infrastructure, user sessions and business transactions. The platform was built to handle the most complex, heterogeneous, distributed application environments; to support rapid identification and resolution of application issues before they impact users; and to deliver real-time insights into the correlation between application and business performance.
• Dynatrace provides software intelligence to simplify enterprise cloud complexity and accelerate digital transformation. With AI and complete automation, its all-in-one platform provides answers, not just data, about the performance of applications, the underlying infrastructure and the experience of all users. Dynatrace helps companies mature existing enterprise processes from CI to CD to DevOps, and bridge the gap from DevOps to hybrid-to-native AIOps.
• IBM helps organizations modernize their IT operations management with its AIOps solution. It helps organizations see patterns and contexts that aren’t obvious, helping them avoid investigation, improve responsiveness, and lower operations costs. It also automates IT tasks to minimize the need for human intervention. IBM’s solution can be incorporated no matter what stage of the digital transformation journey a customer is at.
• InfluxData: APM can be performed using InfluxData’s platform InfluxDB. InfluxDB is a purpose-built time series database, real-time analytics engine and visualization pane. It is a central platform where all metrics, events, logs and tracing data can be integrated and centrally monitored. InfluxDB also comes built in with Flux: a scripting and query language for complex operations across measurements.
FEATURED PROVIDER
• Micro Focus: As more services are delivered through more channels, monitoring and resolving issues becomes exponentially more difficult. Micro Focus Operations Bridge cuts through the complexity of hybrid IT, so you can keep services running. Make the shift to automated, AI-based, business-focused delivery — powered by AIOps. You’ll monitor applications, detect anomalies, and fix problems with new speed and insight. That’s how you satisfy users in the digital enterprise.
• Instana is a fully automatic Application Performance Monitoring (APM) solution that makes it easy to visualize and manage the performance of your business applications and services. The only APM solution built specifically for cloud-native microservice architectures, Instana leverages automation and AI to deliver immediate, actionable information to DevOps. For developers, Instana’s AutoTrace technology automatically captures context, mapping all your applications and microservices without continuous additional engineering.
• Moogsoft is a pioneer and leading provider of AIOps solutions that help IT teams work faster and smarter. With patented AI analyzing billions of events daily across the world’s most complex IT environments, the Moogsoft AIOps platform helps the world’s top enterprises avoid outages, automate service assurance, and accelerate digital transformation initiatives. Founded in 2011, Moogsoft has more than 120 customers worldwide and strategic partnerships with leading managed service providers and outsourcing organizations.
• Optanix: Optanix’s AIOps platform was developed from the ground up, rather than by just adding an analytics engine or machine learning capabilities to an existing platform. The solution offers full-stack detection and monitoring, predictive analysis and smart analytics, true and actionable root cause analysis, and business service prioritization.
• Plumbr: Plumbr is a modern monitoring solution designed to be used in microservice-ready environments. Using Plumbr, engineering teams can govern microservice application quality by using data from web application performance monitoring. Plumbr unifies the data from infrastructure, applications, and clients to expose the experience of a user. This makes it possible to discover, verify, fix and prevent issues. Plumbr puts engineering-driven organizations firmly on the path to providing a faster and more reliable digital experience for their users.
• ScienceLogic offers a “context-infused” AIOps platform that helps organizations discover and understand the relationship between infrastructure, applications, and business services. It also allows them to integrate and share data across different technologies in real time, and apply multi-directional integrations for automating responsive and proactive actions in the cloud.
• Splunk provides real-time visibility across the enterprise. Its Data-to-Everything Platform enables IT operations teams to prevent problems before they impact customers. It offers observability across silos and enables users to investigate deeper when needed and use predictive analytics to anticipate outages.
• StackState’s AIOps platform helps IT operations teams break down silos in their teams and tools. Its solution combines logs, events, metrics, and traces in real time in order to help customers resolve issues faster. With StackState, organizations can consolidate all of their data into a single platform with an understandable UI.