Monday, April 21, 2008

Monitoring that matters

I have been using Hyperic HQ for monitoring for more than a year now, and it has become more and more obvious to me that popular solutions miss an important aspect of monitoring. Yes, it is important to know the throughput of the system.

But the metric that really matters is CUSTOMER EXPERIENCE, and it usually derives from two things:
  • Service responsiveness
  • Error ratio
The problem is that these are application-level metrics: there is no OS counter that a monitoring system can simply read to obtain them. So both the application developers and the monitoring system developers have to put effort into integration. This is where HQ is good. It is really easy to create a JMX bean and an XML plugin to gather application-specific metrics.
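
To make the JMX side concrete, here is a minimal sketch of a standard MBean that exposes the two metrics above. The class and attribute names (CustomerExperienceMetrics, ServiceStats, AverageResponseMillis, ErrorRatio) and the ObjectName are my own illustrative choices, not anything HQ prescribes, and the HQ XML plugin that would map these attributes to metrics is not shown.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical holder class; all names here are illustrative assumptions.
final class CustomerExperienceMetrics {

    // Standard MBean convention: the interface is named <ImplClass>MBean
    // and must be public. Each getter becomes a read-only JMX attribute.
    public interface ServiceStatsMBean {
        long getRequestCount();
        double getAverageResponseMillis();
        double getErrorRatio();
    }

    public static final class ServiceStats implements ServiceStatsMBean {
        private final AtomicLong requests = new AtomicLong();
        private final AtomicLong errors = new AtomicLong();
        private final AtomicLong totalMillis = new AtomicLong();

        // The application calls this once per handled request.
        public void record(long elapsedMillis, boolean failed) {
            requests.incrementAndGet();
            totalMillis.addAndGet(elapsedMillis);
            if (failed) {
                errors.incrementAndGet();
            }
        }

        @Override
        public long getRequestCount() {
            return requests.get();
        }

        // Service responsiveness: mean handling time since startup.
        @Override
        public double getAverageResponseMillis() {
            long n = requests.get();
            return n == 0 ? 0.0 : (double) totalMillis.get() / n;
        }

        // Error ratio: failed requests divided by all requests.
        @Override
        public double getErrorRatio() {
            long n = requests.get();
            return n == 0 ? 0.0 : (double) errors.get() / n;
        }
    }

    // Registers the bean with the platform MBean server so that any JMX
    // client can read the attributes. The ObjectName is an assumption.
    static ObjectName register(ServiceStats stats) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example:type=ServiceStats");
        server.registerMBean(stats, name);
        return name;
    }
}
```

The application would create one ServiceStats instance, register it at startup, and call record(...) around each request. The values are cumulative since startup; a real setup might prefer a sliding window.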

But there are areas for improvement. I'd like to have:
  • Network-level error statistics, such as the number of lost IP packets
  • Counts of exceptions in the log, by type
Usually, the error rate tells you more about system health than throughput does. But this type of metric is largely ignored at the moment.
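
As a rough illustration of the "exceptions by type" wish, the sketch below counts fully qualified exception class names found in log lines. It is not an HQ feature: the class name ExceptionCounter, the regular expression, and the assumption that stack frames start with "at " are all mine, and real log formats may need a different pattern.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the name and the matching rules are assumptions.
final class ExceptionCounter {

    // Matches fully qualified names that end in "Exception" or "Error",
    // e.g. java.io.IOException. Package segments are assumed to start
    // with a lowercase letter and the class name with an uppercase one.
    private static final Pattern EXCEPTION_TYPE = Pattern.compile(
            "\\b((?:[a-z_]\\w*\\.)+[A-Z]\\w*(?:Exception|Error))\\b");

    // Returns a map from exception type to number of occurrences,
    // sorted by type name.
    static Map<String, Integer> countByType(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Skip stack frames ("at com.example.Foo.bar(Foo.java:1)") so
            // that only the lines naming the thrown exception are counted.
            if (line.trim().startsWith("at ")) {
                continue;
            }
            Matcher m = EXCEPTION_TYPE.matcher(line);
            while (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

The resulting counts could then be exposed as metrics in the same way as any other application-level number.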

Wednesday, April 2, 2008

GigaSpaces monitoring with Hyperic

Half an hour ago I released the first public beta of the GigaSpaces monitoring plugin for Hyperic HQ. The plugin automatically discovers and monitors GigaSpaces XAP 6.0 at the physical and logical levels. More information is available on the project homepage.