For some time now, we’ve been using the services of a company called Gomez to monitor our public websites. They do an excellent job for the corporate customer, monitoring from multiple locations around the world and notifying us by email, pager, and text message of any slow performance or outages. Lots of things are configurable (severity, maintenance blackouts, etc.). They also offer great historical reports: we can use them to track patterns; the IT managers can use them to say “no, we aren’t ‘down all the time’, we had a single outage of 3 minutes on Thursday, and a 30-second network glitch three weeks ago tomorrow”; and the execs can use them to say “yes, we have 99.995% uptime on our servers, we’re working on that last .004% now” or whatever it is execs say. Gomez is great, and we love it. Only one problem. It’s outside our network, so it can’t monitor any of our internal servers.
Now, we can use the built-in probing mechanism of ColdFusion MX 7 (love it, btw, great upgrade!), but it’s not as sophisticated. It also has some issues… it can only email one group of people if there’s a problem with *any* of the probes, it can’t text message or page (unless the phone/pager accepts email), and its logging is very, very rudimentary. It also cannot handle more complicated scripts, such as “Log in to this application, submit this form data, and if the results do not contain X OR the results are Y, fail and report”.
So… what can? I’m hoping one or more of you loyal FG readers will have some ideas. If it’s something that can run on Windows Server 2003, all the better… we have a few Linux hobbyists in the shop, but no real experts.
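
For what it’s worth, here is roughly the kind of scripted check I mean, as a minimal Python sketch using only the standard library. It is not a replacement for a real monitoring product, just an illustration of the “log in, submit the form, fail if the results do not contain X OR the results are Y” logic. The function names (`evaluate`, `run_probe`), the URLs, and the form field names are all made up for the example, and the “report” step (email, pager, logging) is left out entirely.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def evaluate(body, must_contain, must_not_equal):
    """Return (ok, reason).

    Fails if the body does not contain `must_contain` (the "X" rule)
    OR if the body, ignoring surrounding whitespace, is exactly
    `must_not_equal` (the "Y" rule).
    """
    if must_contain not in body:
        return False, "response does not contain %r" % must_contain
    if body.strip() == must_not_equal:
        return False, "response is exactly %r" % must_not_equal
    return True, "ok"


def run_probe(login_url, login_fields, form_url, form_fields,
              must_contain, must_not_equal, timeout=30):
    """Log in, submit a form, and evaluate the final response."""
    # A cookie-aware opener keeps the session alive between the two POSTs.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    body = ""
    try:
        for url, fields in ((login_url, login_fields),
                            (form_url, form_fields)):
            data = urllib.parse.urlencode(fields).encode("utf-8")
            with opener.open(url, data, timeout=timeout) as resp:
                body = resp.read().decode("utf-8", errors="replace")
    except OSError as exc:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False, "request failed: %s" % exc
    return evaluate(body, must_contain, must_not_equal)
```

A call might look like `run_probe("http://intranet.example/login.cfm", {"user": "probe", "pass": "secret"}, "http://intranet.example/report.cfm", {"year": "2005"}, "Report complete", "ERROR")`, with every one of those values being a placeholder. Something like this could be run from the Windows scheduler, but it still leaves the alerting, escalation, and history reporting unsolved, which is really what we’re after.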