Posted by Simon Long Oct 7, 2016
I’ve been spending some time lately figuring out ways to improve the monitoring and alerting within VMware’s internal Horizon environments. One condition I wanted to alert on was DHCP scope exhaustion. We have many DHCP scopes globally for our virtual desktop environments, and I want our support team to be alerted when we start to run low on DHCP IP addresses and, in the worst case, when we have exhausted all IPs in a scope. Virtual desktops without IP addresses don’t tend to work very well.
In theory, we should never exhaust our DHCP scopes if we size and place our desktop Pools correctly. In practice, that doesn’t always happen. Often a Pool gets created with the wrong number of desktops, or the wrong network is selected on the Golden Master image, causing the new Pool to use the wrong DHCP scope. I want to know when these mistakes have been made, so we can correct them before our end users are affected.
Within our Horizon environments, we use vRealize Log Insight for log collection and analysis. My knowledge of Log Insight is still pretty primitive, so I engaged one of my OneCloud colleagues, Caleb Stephenson, to figure out how we could achieve this. Caleb manages our Global Log Insight instance, which processes 35 billion events per week.
We use Windows-based DHCP servers, so the first thing we needed to do was figure out whether the DHCP service logs the information we are looking for. Luckily, in our case it does, in two places. The first is the DHCP logs (C:\Windows\Sysnative\dhcp). However, these are only created daily, which wasn’t dynamic enough for our needs. The other place is the Windows System Event Log.
Below are a couple of screenshots of the System Log entries that we are interested in alerting against.
Now that we knew where the data was, Caleb needed to figure out a way of capturing that information and alerting on specific conditions. I’m not going to go into depth on how this was achieved, as Caleb documents it and does a really good job of explaining the process in his blog post: Getting Fancy with Log Insight Alerting (aka. Monitoring DHCP pools via logs). However, I will share the high-level steps.
1. Collect the event logs from the DHCP servers: install and configure the Log Insight Windows Agent. By default, the agent collects events from the Application, System, and Security channels.
5. Apply the new filter to make sure the extracted field works as expected. If you enter 85 in the filter, anything at 85% or below will be ignored. Customize this filter to match what you want to be alerted on. In our environment, we alert on 90% exhaustion of our DHCP scopes. Our scopes hold 500 IP addresses, so this alerts us when we have 50 IPs remaining.
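To sanity-check a threshold before entering it in the filter, the arithmetic from the last step can be sketched in a few lines of Python. This is only an illustration and not part of Log Insight. The function names `remaining_addresses` and `should_alert` are made up for this example, and the "greater than" comparison is an assumption based on the filter behaviour described above.

```python
def remaining_addresses(scope_size: int, percent_used: float) -> int:
    """Return how many addresses are still free when the scope is percent_used full."""
    used = round(scope_size * percent_used / 100)
    return scope_size - used


def should_alert(percent_used: float, threshold: float) -> bool:
    """Assumed filter semantics: values at or below the threshold are ignored."""
    return percent_used > threshold


if __name__ == "__main__":
    # A 500-address scope at 90% usage leaves 50 addresses.
    print(remaining_addresses(500, 90))
    # With 85 entered in the filter, a scope at exactly 85% is ignored.
    print(should_alert(85, threshold=85))
    print(should_alert(90, threshold=85))
```

For scopes of a different size, swap in your own scope size and usage percentage to see how many addresses would be left at the point the alert fires.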
That’s all there is to it. I just love how easy Log Insight makes alerting on logs. Such a powerful tool. Once again, thanks to Caleb.