By Brian Handrigan on Thursday, 11 June 2015
Category: Network Management

3 Key Cloud Monitoring Metrics

Reliably Measure Performance and Ensure Effective Monitoring in the Cloud

When adopting monitoring strategies for cloud services, it’s important to establish performance metrics and baselines. This involves going beyond vendor-provided data, to ensure your monitoring tools provide:

Taking a holistic approach ensures your staff can manage any performance issue and provide proof to cloud providers if their systems are the cause of the issue.

CLOUD MONITORING CHALLENGES

The primary obstacle to ensuring reliable cloud performance is the lack of metrics for monitoring performance and Service Level Agreement (SLA) compliance. Understanding the resulting challenges is critical for developing successful monitoring strategies.

HERE ARE A FEW COMMON CHALLENGES

Obtaining Application Performance Agreements

While vendors highlight service or resource availability in their SLAs, application performance metrics and expectations are typically absent. In the case of Salesforce.com, the SLA (if one is provided) discusses downtime but offers no performance guarantees.

Lack of Performance Metrics

As with SLAs, organizations should not expect vendors to provide any meaningful performance metrics beyond service and resource availability. If managers rely on trust.salesforce.com to track CRM service performance and availability, they are limited to:

These reports fail to provide meaningful performance metrics to evaluate service degradation issues or to isolate problems to the cloud vendor.

No Meaningful Performance Benchmarks

Most Software as a Service (SaaS) vendors don’t offer any benchmarks or averages that allow you to forecast potential performance or service demand. While Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) vendors will provide cloud performance benchmarks, these numbers don’t take into account:

The challenge is to properly benchmark and create meaningful metrics for your organization.

These challenges require organizations to take a holistic approach to monitoring by implementing solutions that allow them to seamlessly view external components and performance as if they were a part of their internal network.

EFFECTIVE CLOUD MONITORING STRATEGIES

Given the lack of metrics to assess performance for a specific organization, how do engineers successfully manage user interaction with cloud services?

ENSURE SERVICE PERFORMANCE

While SLAs may not guarantee performance, cloud vendors should take action when clear proof shows their systems are the source of the problem.

How can you ensure reliable performance?

Set up synthetic transactions to execute a specific process on the cloud provider’s site. By running these transactions regularly and monitoring routes and response times, you can pinpoint where a delay originates: the internal network, the ISP, or the cloud provider. This data, along with web error codes, can be provided to the cloud vendor to help them resolve issues.
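A synthetic transaction of this kind could be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tooling the article has in mind: the names `ProbeResult`, `http_get_status`, and `run_probe` are hypothetical, the target URL is whatever page you choose to exercise, and the sketch records only the response time and HTTP status of a single request. Route tracing and multi-step processes such as logins would need additional tooling.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    """Outcome of one synthetic transaction (hypothetical record type)."""
    url: str
    status: Optional[int]   # HTTP status code, or None if no response arrived
    elapsed_ms: float       # wall-clock time for the whole request
    error: Optional[str]    # description of a transport failure, if any


def http_get_status(url: str, timeout: float = 10.0) -> int:
    """Fetch the URL and return its HTTP status code.

    urllib raises HTTPError for 4xx/5xx responses; the code is returned
    instead so that error codes can be logged and passed to the vendor.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # include body transfer in the measured time
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code


def run_probe(url: str,
              fetch: Callable[[str], int] = http_get_status) -> ProbeResult:
    """Time a single request and capture its status or failure reason."""
    start = time.perf_counter()
    try:
        status: Optional[int] = fetch(url)
        error = None
    except Exception as exc:  # DNS failure, timeout, refused connection, ...
        status = None
        error = f"{type(exc).__name__}: {exc}"
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ProbeResult(url, status, elapsed_ms, error)
```

Calling `run_probe` on a schedule against a URL of your choosing and storing each `ProbeResult` builds the response-time history and error-code record described above. The `fetch` parameter is injectable so that the timing logic can be exercised without network access.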

Network teams should have a true view of performance that tracks packets from the user over the ISP to the cloud provider, including any cloud-hosted components.

OVERCOME LACK OF PERFORMANCE METRICS

Depending upon the service, vendors will provide varying levels of detail. The type of service also impacts what you can monitor.

What metrics can you use?

In the case of SaaS, you can monitor user interactions and synthetic transactions, and rely on vendor-provided reports.

For PaaS, monitoring solutions such as Observer Infrastructure provide significant performance metrics. These metrics can be viewed alongside response time metrics for a more complete picture of service health.

In the case of IaaS, you have access to the server’s operating system and applications. In addition to polling performance metrics of cloud components, analysis devices can be placed on the cloud server for an end-to-end view of performance.

ADDRESS PERFORMANCE BENCHMARKS

With monitoring systems in place, it’s important to baseline response times and cloud component performance. Baselines can help you become proactive in your management strategy. How can you use baselines?

Using these baselines, you can set meaningful benchmarks for your specific organization. From this point, alarms can be set to proactively alert teams to degrading performance. With long-term packet capture, you can investigate performance from the user to the cloud and isolate potential problems.
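One way to turn a baseline into an alarm is sketched below in Python. It is an illustration only, and every choice in it is an assumption rather than something the article prescribes: the baseline is the mean and sample standard deviation of historical response times, the alarm threshold is the mean plus `k` standard deviations, and an alert fires only after `min_breaches` consecutive samples exceed it. The names `Baseline`, `build_baseline`, and `is_degraded` are hypothetical.

```python
import statistics
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Baseline:
    """Summary of historical response times, in milliseconds."""
    mean_ms: float
    stdev_ms: float

    def threshold(self, k: float = 3.0) -> float:
        """Response time above which a sample counts as a breach."""
        return self.mean_ms + k * self.stdev_ms


def build_baseline(samples_ms: Sequence[float]) -> Baseline:
    """Compute a baseline from historical samples (at least two needed)."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return Baseline(statistics.fmean(samples_ms),
                    statistics.stdev(samples_ms))


def is_degraded(baseline: Baseline,
                recent_ms: Sequence[float],
                k: float = 3.0,
                min_breaches: int = 3) -> bool:
    """Return True if the last `min_breaches` samples all exceed the threshold.

    Requiring consecutive breaches avoids alerting on a single slow request.
    """
    tail = list(recent_ms)[-min_breaches:]
    if len(tail) < min_breaches:
        return False
    limit = baseline.threshold(k)
    return all(sample > limit for sample in tail)
```

For example, a history of `[100, 110, 90, 105, 95]` milliseconds gives a mean of 100 ms and a threshold of roughly 124 ms at `k=3`, so three consecutive readings of 130, 140, and 150 ms would raise an alert, while a single spike would not. Percentile-based thresholds are a reasonable alternative when response times are heavily skewed.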

Thanks to Network Instruments for the article. 
