Monitoring systems, or how to get lost in fierce madness.

There are many solutions for monitoring systems, and most of them have some kind of web interface to operate them. Choosing the right tool for any job is a tedious task, and for a newbie like me it is a bit harder, especially for a matter as sensitive as this one.
Nagios seems to be a de facto standard for many companies, although there are some good alternatives. I have used HPE solutions in a large environment (a bank with thousands of servers) and frankly I’ve seen too many misfires to be left with a good taste in my mouth. That said, I think this has more to do with the actual implementation, and with how the company delivers its services to its customers, than with anything else. I am also sure that, comparing the ratio of correct behaviour to false alarms, one could argue that the software works quite well. Again, the implementation counts.
As far as I can see and have gathered, there are two main lines of monitoring “ideas”: the live one and the recording one. Both coexist in most solutions, but the way they are presented to the admins and users differs, sometimes dramatically. For my own projects I have been testing two monitoring solutions that are easy to find on the internet, Nagios and Netdata. The first is common ground for many companies and, despite the criticism it has received and still receives, it is in use in many places. Netdata seems to be a newcomer, and frankly it looks good from what its users say about it. Nagios also has a fork called Icinga which, from what I’ve read, scales better than Nagios. However, there is a difference between managing tens of servers and managing thousands, and quite a difference between a real-time tool (Netdata) and a “record”-based one such as Nagios.
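To make the two “ideas” a bit more concrete, here is a minimal sketch in Python. It is only my own illustration and not how Nagios or Netdata work internally; the names `live_view`, `RecordedView` and `read_metric` are made up for the example. The live style only cares about the value right now, while the recording style keeps a history that can be queried later.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float
    value: float


def live_view(read_metric):
    """Live style: return only the value as it is right now."""
    return Sample(time.time(), read_metric())


class RecordedView:
    """Recording style: keep a bounded history and answer questions about it."""

    def __init__(self, max_samples=1000):
        # Old samples fall off the left once max_samples is reached.
        self.history = deque(maxlen=max_samples)

    def record(self, read_metric):
        sample = Sample(time.time(), read_metric())
        self.history.append(sample)
        return sample

    def average(self):
        # None means "nothing recorded yet" rather than an average of zero.
        if not self.history:
            return None
        return sum(s.value for s in self.history) / len(self.history)
```

On a Unix machine `read_metric` could be something like `lambda: os.getloadavg()[0]`, called once for a live reading or on a timer for the recorded history.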
I ran a Google search myself to find software and came across a few other enterprise-grade programs. That said, both Nagios and Netdata fulfil my needs, which are pretty basic. I own and maintain two websites, and what I need most is to monitor availability. Performance is obviously worth keeping an eye on as well: if a website receives lots of hits over a long period of time, upgrading the server would become necessary. I also need a way to receive warnings and alerts of some kind when things go sideways. Most monitoring tools integrate some sort of warning system; email alerts are common, but instant messaging is available too.
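Just to show how basic my needs are, this is roughly what an availability check with a simple alert rule boils down to. It is a rough sketch using only the Python standard library, not a replacement for a real monitoring tool; `check_availability`, `should_alert` and the threshold of three consecutive failures are my own assumptions for the example.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass


@dataclass
class CheckResult:
    url: str
    ok: bool
    detail: str


def check_availability(url, timeout=5.0, opener=urllib.request.urlopen):
    """Fetch the URL once and report whether the site answered successfully.

    `opener` is only there so the network call can be swapped out in tests.
    """
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx or 5xx).
        return CheckResult(url, False, f"HTTP {exc.code}")
    except OSError as exc:
        # DNS failure, refused connection, timeout and similar
        # (urllib.error.URLError is a subclass of OSError).
        return CheckResult(url, False, f"unreachable: {exc}")
    return CheckResult(url, 200 <= status < 400, f"HTTP {status}")


def should_alert(results, threshold=3):
    """Alert only when the last `threshold` checks all failed,
    so that a single hiccup does not trigger a warning."""
    recent = results[-threshold:]
    return len(recent) == threshold and all(not r.ok for r in recent)
```

Run from cron every few minutes, with the results appended to a list or a file, `should_alert` would decide when it is time to send the email or the instant message.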
For the moment I’ve chosen both projects, since together they give me what I need: constant monitoring with Nagios and a real-time view with Netdata. I may play with Icinga in the near future; this Nagios fork seems to be more flexible on the visual side of things while maintaining compatibility with the very extensive set of Nagios plugins. Zabbix is another good alternative, as far as I’ve read in many forums and articles. Eventually I will give Monit a chance as well.