Evolving Application Performance Management (APM) to Digital Experience Monitoring (DEM)
Application Performance Management (APM) tools have traditionally provided organizations with key performance metrics, including the speed, reliability, and capacity utilization of datacenter systems. But without clear visibility into the actual experience of users, these metrics mean very little. Servers that measure as 100 percent available do not guarantee that users in all geographies are having a fast, reliable experience.
That’s because there are many other performance-impacting elements standing between your datacenter and your users. If an IT organization can’t effectively monitor the true experience of its users -- including customers, employees, partners, and suppliers -- it cannot know whether its applications are delivering sufficient performance. The damaging results include frustrated customers, which can lead to churn, decreased revenue and market share, and diminished brand perception.
Today’s users are accustomed to getting extremely high levels of service, which is forcing companies to evolve their performance management approaches from traditional APM into what Gartner recently called digital experience monitoring (DEM). DEM treats the user experience as the ultimate metric, and identifies how the myriad of underlying first-party systems and third-party services and components influence it. Transitioning to DEM involves five key approaches:
- Integrated Synthetic and Real User Monitoring -- In synthetic monitoring -- also known as active monitoring -- simulated user traffic is generated from the cloud to test the availability and speed of websites, mobile sites, and applications at regular intervals, from multiple geographic locations and across different networks and browsers. Real user monitoring, also known as passive monitoring, measures the performance of actual user interactions with websites and applications. This can be very helpful for gauging what users actually do once they enter a site; for example, which landing pages and applications are the most important and should be prioritized for performance optimization.
If a company relies on synthetic monitoring alone, it may be able to accurately gauge the response time of websites, mobile sites and applications, but it cannot see how these response times are influencing user behavior (e.g., whether users are abandoning a web page or checkout process), nor can it identify the most critical landing pages and conversion paths that should be prioritized for optimization. Conversely, real user monitoring only captures data if real users interact with a site; it will not provide alerts if a site is degrading or already down when no one is visiting. Combining both data sets in a "single pane of glass" approach often yields the most complete picture.
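As a rough illustration of the "single pane of glass" idea, the Python sketch below joins synthetic probe results with real-user summaries for the same pages. The record types, field names, the 3000 ms "slow" threshold, and the sample pages are all assumptions made for this example; they do not come from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class SyntheticResult:
    """One scripted probe result (hypothetical shape)."""
    page: str
    ok: bool             # did the probe complete successfully?
    response_ms: float   # measured response time


@dataclass
class RumSummary:
    """Aggregated real-user data for one page (hypothetical shape)."""
    page: str
    sessions: int        # real sessions that reached the page
    abandoned: int       # sessions that left before completing


def combine(synthetic, rum, slow_ms=3000.0):
    """Merge both data sets into one row per page, worst pages first."""
    rum_by_page = {r.page: r for r in rum}
    rows = []
    for s in synthetic:
        r = rum_by_page.get(s.page)
        sessions = r.sessions if r else 0
        # No real-user traffic means no abandonment rate can be computed.
        abandon_rate = (r.abandoned / r.sessions) if r and r.sessions else None
        if not s.ok:
            status = "down"      # only the synthetic probe can see this
        elif s.response_ms >= slow_ms:
            status = "slow"
        else:
            status = "ok"
        rows.append({
            "page": s.page,
            "status": status,
            "response_ms": s.response_ms,
            "sessions": sessions,
            "abandon_rate": abandon_rate,
        })
    # Unhealthy pages first; within a status, busiest pages first.
    order = {"down": 0, "slow": 1, "ok": 2}
    rows.sort(key=lambda row: (order[row["status"]], -row["sessions"]))
    return rows


if __name__ == "__main__":
    probes = [
        SyntheticResult("/", True, 800.0),
        SyntheticResult("/checkout", True, 4200.0),
        SyntheticResult("/help", False, 0.0),
    ]
    users = [RumSummary("/", 1000, 50), RumSummary("/checkout", 400, 120)]
    for row in combine(probes, users):
        print(row)
```

In this made-up data, the help page is down but has no real-user sessions, so only the synthetic probe reveals the outage, while the slow checkout page shows its business impact only through the real-user abandonment rate.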
- Gain the Most Accurate Picture Possible of the User Experience -- The performance of websites, mobile sites and applications tends to degrade the further away users are from the primary datacenter. If a company’s datacenter is based in San Francisco and local residents’ experiences are fantastic, this does not mean users in New York are having the same experience.
To get the truest sense of user performance, it is important to measure from geographic and network vantage points that are as close as possible to globally distributed user segments. This includes backbone monitoring nodes located around the world in major internet datacenters, as well as wireless nodes measuring mobile site performance from the direct vantage point of 3G and 4G networks.
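The San Francisco versus New York comparison can be sketched in a few lines of Python. The region names, sample load times, and the 1.5x tolerance below are illustrative assumptions, not measurements or recommended thresholds.

```python
from statistics import median


def regional_gaps(samples, home_region, tolerance=1.5):
    """Compare each region's median load time with the home region's.

    samples: dict mapping region name -> list of load times in ms.
    A region is marked as lagging when its median exceeds the home
    region's median by more than the given tolerance factor.
    """
    home_median = median(samples[home_region])
    report = {}
    for region, times in samples.items():
        region_median = median(times)
        report[region] = {
            "median_ms": region_median,
            "ratio_to_home": region_median / home_median,
            "lagging": region_median > home_median * tolerance,
        }
    return report


if __name__ == "__main__":
    # Hypothetical measurements taken from two vantage points.
    measurements = {
        "san-francisco": [400, 500, 600],
        "new-york": [900, 1000, 1100],
    }
    for region, stats in regional_gaps(measurements, "san-francisco").items():
        print(region, stats)
```

The median is used here rather than the mean so that a single outlier measurement does not dominate the comparison; a percentile such as the 95th would be a reasonable alternative.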
- Address the Entire Delivery Chain -- There is hardly a website, mobile site or application now that doesn’t leverage some external third-party service. Examples include APIs, social media plug-ins, ad servers, video and digital payment services, to name a few. There are benefits to using these services, particularly increasing feature richness without having to develop these capabilities in-house. However, each third-party service presents a risk because if the performance of any single service degrades, it can drag down performance for an entire site or application.
Transitioning to DEM requires a "layered" approach that enables organizations to drill down and parse the individual performance of any third-party element in real time, as well as view how each service contributes to the overall site or application performance.
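One simple way to picture this drill-down is to group the time spent on each page resource by the host that served it. The sketch below assumes resource timings are already available as (URL, duration) pairs; the URLs, durations, and the `example.com` first-party domain are hypothetical. Because browsers fetch resources in parallel, the summed time per host is only a rough indicator of where time goes, not an exact share of page load time.

```python
from collections import defaultdict
from urllib.parse import urlparse


def time_by_host(resources, first_party_domain):
    """Sum resource durations per host and flag third-party hosts.

    resources: iterable of (url, duration_ms) pairs.
    first_party_domain: the site's own domain, e.g. "example.com".
    """
    totals = defaultdict(float)
    for url, duration_ms in resources:
        host = urlparse(url).hostname or "unknown"
        totals[host] += duration_ms

    grand_total = sum(totals.values())
    rows = []
    for host, total_ms in totals.items():
        is_first_party = (
            host == first_party_domain
            or host.endswith("." + first_party_domain)
        )
        rows.append({
            "host": host,
            "third_party": not is_first_party,
            "total_ms": total_ms,
            "share": total_ms / grand_total if grand_total else 0.0,
        })
    # Slowest hosts first, so a misbehaving service stands out.
    rows.sort(key=lambda row: row["total_ms"], reverse=True)
    return rows


if __name__ == "__main__":
    page_resources = [
        ("https://www.example.com/index.html", 200.0),
        ("https://cdn.example.com/app.js", 100.0),
        ("https://ads.example.net/tag.js", 700.0),
    ]
    for row in time_by_host(page_resources, "example.com"):
        print(row)
```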
- Shorten Mean-Time-to-Identify within IT Issue Resolution -- As many organizations beef up their infrastructure to support more features and services, one downside is that it can become much more difficult and time-consuming to identify the source of performance issues when they do occur. As noted above, every component in the service delivery chain that factors into overall performance needs to be monitored. But as the number of performance-impacting variables increases, many organizations find themselves drowning in data and searching for insight, at a time when every second counts.
Today, advanced analytics can make data more actionable, by identifying the cause of performance issues (both inside and outside the firewall) swiftly and accurately, thus avoiding time wasted war-rooming. For example, if the culprit is a slow web server, IT teams can add capacity or re-route traffic; or, if the issue is a poorly performing third-party service, the service can be alerted and held accountable.
- Measure SaaS and Cloud Service Providers -- Among external third parties, SaaS and the cloud are witnessing particularly rapid adoption by businesses, but this growth also means providers are having a harder time ensuring high-performing (fast, reliable) applications. According to one recent survey of providers, nearly half reported that meeting customers’ rising performance demands was a substantial or significant challenge, with geographic expansion and growing infrastructure complexity cited as primary causes. Slightly more than half reported at least one unplanned service interruption in the past year, which sometimes resulted in the loss of a customer relationship.
Poor performance for SaaS applications can have a particularly harmful fan-out effect, as these applications are often used not just by SaaS customers, but by their suppliers and partners as well. For these reasons, every SaaS and cloud service provider an organization enlists must be monitored closely and bound to firm SLAs, no matter how reputable the provider may be.
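Holding providers to an SLA comes down to comparing measured availability against the contracted target. The following Python sketch shows that arithmetic under simple assumptions: each provider is represented by a list of pass/fail check results, and the provider names and targets are invented for the example. Real SLAs often define availability differently (for instance, excluding maintenance windows), so this is only a starting point.

```python
def availability_percent(checks):
    """Percentage of successful checks, or None if there are no checks."""
    if not checks:
        return None
    return 100.0 * sum(1 for passed in checks if passed) / len(checks)


def sla_report(provider_checks, sla_targets):
    """Compare measured availability with each provider's SLA target.

    provider_checks: dict provider -> list of booleans (True = check passed).
    sla_targets: dict provider -> target availability in percent.
    """
    report = {}
    for provider, checks in provider_checks.items():
        measured = availability_percent(checks)
        target = sla_targets[provider]
        report[provider] = {
            "measured": measured,
            "target": target,
            # With no data, a breach can be neither confirmed nor ruled out.
            "breach": None if measured is None else measured < target,
        }
    return report


if __name__ == "__main__":
    # Hypothetical providers: one failed check in 1000, and none in 1000.
    checks = {
        "crm-saas": [True] * 999 + [False],
        "payments-api": [True] * 1000,
    }
    targets = {"crm-saas": 99.95, "payments-api": 99.9}
    for provider, result in sla_report(checks, targets).items():
        print(provider, result)
```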
In conclusion, we are hearing the same Catch-22 from IT leaders -- user expectations are growing at the same time that IT infrastructure is getting more complicated. Ironically, increased IT complexity is growing out of pressure to meet higher user performance demands, but unless this complexity is managed properly, it has the potential to do more harm than good. Reorienting traditional APM approaches to DEM is one helpful way to stay ahead. The techniques described here are a solid first step to shifting the APM mindset to emphasize the only metric that really matters -- the user experience.
Mehdi Daoudi is the co-founder and CEO of Catchpoint, a leading digital experience intelligence company. His team has expertise in designing, building, operating, scaling and monitoring highly transactional Internet services used by thousands of companies that impact the experience of millions of users. Before Catchpoint, Mehdi spent 10+ years at DoubleClick and Google, where he was responsible for Quality of Services, buying, building, deploying, and using monitoring solutions to keep an eye on an infrastructure that delivered billions of transactions daily.