Bugs, corrupt data, and performance issues in web applications are often recognized far too late. In the worst case they are reported by a customer, which means they have probably already done serious damage: frustrated users, lost trust, or even corrupted data. Finding these problems early is especially hard if your application makes heavy use of background processes, daemons, or cron jobs. These may throw exceptions that end up buried somewhere in the logs, where no one notices them until someone happens to look at the log files. I want to show a way out of this misery and present different solutions in the form of practical examples. These cover different levels of monitoring, from simple text logs on the servers up to a fully monitored application with hardware monitoring, extensive metrics, indexed and searchable logs for the whole environment, performance analysis, and alerts when something odd happens. I'll show different examples and give ideas on when such a fully monitored solution is a good idea and when "light monitoring" is sufficient.