Prometheus monitoring stopped working
Today our team ran into a strange issue while updating one of our Windows servers: all of a sudden, our Prometheus monitoring stopped working.
We use Prometheus for performance metrics and alerts on most of our servers, and today we were installing the latest Windows update on one of them at around 13:15.
All monitoring disappeared after updating the server, and after a bit of investigation we found that the service windows_exporter.exe was not running.
According to the service description, this service's responsibility is:
-"Exporting Prometheus metrics about the system".
After manually starting it, the monitoring came back.
FYI: This can also be confirmed by logging in to the server and browsing to localhost:9182/metrics.
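If you'd rather not open a browser every time, the same check can be scripted. Below is a minimal sketch in Python, assuming the exporter listens on localhost:9182 as it did for us. The function name exporter_is_up and the 5 second timeout are my own choices for illustration, not anything that comes from windows_exporter.

```python
import urllib.error
import urllib.request

# Address from this post; adjust if your exporter listens elsewhere.
METRICS_URL = "http://localhost:9182/metrics"


def exporter_is_up(url: str = METRICS_URL, timeout: float = 5.0) -> bool:
    """Return True if the metrics endpoint answers with HTTP 200.

    Hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status:
        # treat all of these as "exporter not reachable".
        return False


# Example usage, run on the server itself:
#   print("up" if exporter_is_up() else "down")
```

It only tells you whether the endpoint responds, not whether the metrics themselves look sane, but that was enough to spot the problem we had.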
After some digging we found an open issue in the prometheus-community/windows_exporter repo on GitHub.
It turns out that this service, which should start automatically at startup, sometimes has trouble starting when Windows performs a forced restart for Windows Update.
The Solution:
By setting this service to Startup Type = Automatic (Delayed Start), it should be able to start up the next time you perform an update.
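We changed the startup type by hand in the Services console, but if you have many servers you may want to script it. Here is a rough sketch in Python that wraps the built-in sc.exe tool. It assumes the service is registered under the name windows_exporter (check the actual service name on your machine), and the helper names are made up for this example. It has to run from an elevated session on the Windows server.

```python
import subprocess

# Assumed service name; verify it in the Services console first.
SERVICE_NAME = "windows_exporter"


def build_delayed_start_command(service_name: str = SERVICE_NAME) -> list[str]:
    """Build the sc.exe call that sets Automatic (Delayed Start)."""
    # sc.exe expects the value as a separate token after "start=".
    return ["sc.exe", "config", service_name, "start=", "delayed-auto"]


def set_delayed_start(service_name: str = SERVICE_NAME) -> None:
    """Apply the setting; raises CalledProcessError if sc.exe fails."""
    # Needs Administrator rights and only works on Windows.
    subprocess.run(build_delayed_start_command(service_name), check=True)


# Example usage on the server:
#   set_delayed_start()
```

Keeping the command builder separate from the call that runs it makes it easy to print and review the exact command before applying it anywhere.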
I wouldn't really call this a "solution" but rather a workaround since it doesn't solve the underlying issue. Hope it may help you as well.
Cheers friends! ❤️