Analyzing Exchange Logs with Azure Log Analytics (Part 4)

by Nuno Mota [Published on 20 Sept. 2016 / Last Updated on 20 Sept. 2016]

In the previous part of this article series, we looked into searching and analyzing logs collected by Log Analytics. In this final part, we will look at dashboards, alerts and solutions.

Dashboards

Log Analytics Dashboards help us visualize all our saved log searches, giving us a single lens through which to view our environment. To begin, go to the OMS Overview by clicking the Overview button in the left navigation:

Image

You will see the My Dashboard tile on the left. Click it to drill down into your dashboard:

Image

In dashboards, tiles are powered by our saved log searches. OMS comes with many pre-made saved log searches, so we can get started right away. The first time we open the dashboard, we are presented with a pictorial outlining how to begin:

Image

In the My Dashboard view, simply click on the customize gear at the top of the page to enter customize mode:

Image

The panel that opens on the right side of the page shows all of our workspace's saved log searches:

Image

To visualize a saved log search as a tile, just drag it onto the empty space on the left. As we drag it, it will turn into a tile:

Image

OK, this is not exactly the kind of visualization I was looking for... Let’s see if we can change this. In the My Dashboard view, click on the customize gear at the top of the page to enter customize mode once more. Next click on Edit:

Image

Then select the tile you want to edit; in this case, I am going to edit the one I just added. The right panel switches to edit mode and offers a selection of options:

Image

I change my Visualization and click on the customize gear again to get out of customize mode. My tile now looks more the way I want it to:

Image

After a bit more tinkering with our saved search, the graph looks much better:

Image

It is also possible to customize the main console using View Designer, which allows us to create custom views in the OMS console that contain different visualizations of data in the OMS repository. For example, we can include a similar graph for our Exchange server in the main console which, when clicked on, will take us to a view (dashboard/console) with more information regarding this particular server:

Image

Alerts

Alerts identify important information in our OMS repository. Alert rules automatically run log searches according to a schedule and create an alert record if the results match particular criteria. The rule can then run one or more actions to proactively notify us of the alert.

Email actions send an email with the details of the alert to one or more recipients. We can specify the subject of the email, but its content is a standard format defined by Log Analytics that includes summary information such as the name of the alert in addition to details of up to ten records returned by the log search. It also includes a link to a log search in Log Analytics that will return the entire set of records from that query. The sender of the alert is Microsoft Operations Management Suite Team <noreply@oms.microsoft.com> as we will shortly see.

Webhook actions allow us to invoke an external process through a single HTTP POST request. The service being called should support webhooks and determine how it will use any payload it receives. We can also call a REST API that does not specifically support webhooks, as long as the request is in a format that the API understands. An example is using a webhook, in response to an alert, to call a service like Slack that sends a message with the details of the alert.
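To make the webhook idea a little more concrete, here is a minimal Python sketch of a small relay that turns an alert payload into a Slack message. It rests on assumptions: the field names AlertRuleName, ResultCount and LinkToSearchResults are hypothetical stand-ins for whatever the alert actually sends, and the example URL is a placeholder. The only thing taken as given is that a Slack incoming webhook accepts a JSON body with a "text" field.

```python
import json
import urllib.request


def build_slack_message(alert: dict) -> dict:
    """Turn an alert payload (hypothetical field names) into a Slack message body."""
    name = alert.get("AlertRuleName", "unknown alert")
    count = alert.get("ResultCount", 0)
    link = alert.get("LinkToSearchResults", "")
    text = f"OMS alert '{name}' fired with {count} matching record(s)."
    if link:
        text += f" Search results: {link}"
    # Slack incoming webhooks accept a JSON object with a "text" field.
    return {"text": text}


def post_json(url: str, payload: dict) -> int:
    """Send the payload as a single HTTP POST and return the status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Illustrative payload only; the real alert payload may look different.
example_alert = {
    "AlertRuleName": "Exchange Transport stopped",
    "ResultCount": 1,
    "LinkToSearchResults": "https://example.invalid/search",
}
message = build_slack_message(example_alert)
print(json.dumps(message))
```

The post_json helper is defined but deliberately not called here; in practice it would be pointed at the incoming webhook URL generated by the receiving service.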

To create an alert rule, we start by creating a log search for the record(s) that should invoke the alert. The Alert button will then be available so we can create and configure the alert rule:

  1. From the OMS Overview page, click Log Search.
  2. Either create a new log search query or select a saved log search. Let’s say I want to get notified when the Exchange Transport service stops. To achieve this, I create the following search:
Type=Event EventID=7036 (Computer="EXAIO.domain.com") "The Microsoft Exchange Transport service entered the stopped state."

Image

  3. Click Alert at the top of the page to open the Add Alert Rule screen:

Image

  4. Configure the properties for the alert such as frequency, description, recipients, and so on:

Image

  5. Click Save to complete the alert rule. It will start running immediately.

To alert on a single event, we set the number of results to greater than 0 and both the frequency and time window to 5 minutes (the minimum allowed). That will run the query every 5 minutes and check for the occurrence of a single event that was created since the last time the query was run. A longer frequency may increase the delay between the event being collected and the alert being created.

The following screenshot shows the email received for the alert we have just created:

Image

Some applications may log an occasional error that should not necessarily raise an alert. For example, Exchange’s Managed Availability may retry a process that created an error event but then succeeded the next time. In this case, we may not want to create the alert unless multiple events are created within a particular time window.

In other cases, we might want to create an alert in the absence of an event. For example, a process might log regular events to indicate that it is working properly. If it does not log one of these events within a particular time window, then an alert should be created. In this case we set the threshold to Less than 1.
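The three scenarios above (a single event, repeated events, and a missing event) all come down to counting the records inside a time window and comparing that count to a threshold. The following Python sketch models that logic. It is a simplified illustration rather than how Log Analytics evaluates rules internally, and the function name, operator strings and sample timestamps are all made up for the example.

```python
from datetime import datetime, timedelta


def rule_fires(event_times, now, window_minutes, operator, threshold):
    """Count events inside the time window and compare against the threshold."""
    window_start = now - timedelta(minutes=window_minutes)
    count = sum(1 for t in event_times if window_start < t <= now)
    if operator == "greater than":
        return count > threshold
    if operator == "less than":
        return count < threshold
    raise ValueError(f"unsupported operator: {operator}")


now = datetime(2016, 9, 20, 12, 0)
one_error = [now - timedelta(minutes=2)]
three_errors = [now - timedelta(minutes=m) for m in (1, 10, 20)]

# Single event: any record in the last 5 minutes raises the alert.
single = rule_fires(one_error, now, 5, "greater than", 0)
# Occasional errors: require more than 2 events within 30 minutes.
noisy_once = rule_fires(one_error, now, 30, "greater than", 2)
noisy_many = rule_fires(three_errors, now, 30, "greater than", 2)
# Missing heartbeat: alert when fewer than 1 event arrived in 15 minutes.
silent = rule_fires([], now, 15, "less than", 1)
print(single, noisy_once, noisy_many, silent)  # True False True True
```

Raising the threshold suppresses the occasional retry-then-succeed error, while the "less than 1" rule only fires when the expected events stop arriving.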

We can also alert when a performance counter exceeds a particular threshold. For example, if we wanted to alert when the processor runs over 90%, we create a query like:

Type=Perf ObjectName=Processor CounterName="% Processor Time" InstanceName=_Total CounterValue>90

Then we set the threshold for the alert rule to greater than 0. The drawback with these types of alerts is that performance records are aggregated every 30 minutes, regardless of how frequently we collect each counter. As such, a time window smaller than 30 minutes may return no records. Setting the time window to 30 minutes will ensure that we get a single record for each connected source that represents the average over that time.
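One practical consequence of alerting on a 30-minute average is that a short spike can be averaged away. The Python sketch below illustrates this with made-up samples. The bucketing scheme is an assumption used for illustration, not a description of how the service aggregates data, and EX02 is a hypothetical second server.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_30_min(samples):
    """Average (computer, timestamp, value) samples into 30-minute buckets."""
    buckets = defaultdict(list)
    for computer, timestamp, value in samples:
        # Round the timestamp down to the start of its 30-minute interval.
        start = timestamp.replace(minute=timestamp.minute - timestamp.minute % 30,
                                  second=0, microsecond=0)
        buckets[(computer, start)].append(value)
    return {key: sum(values) / len(values) for key, values in buckets.items()}


base = datetime(2016, 9, 20, 12, 0)
# Hypothetical samples every 10 minutes: a sustained load and a short spike.
samples = [
    ("EXAIO", base, 95.0),
    ("EXAIO", base + timedelta(minutes=10), 97.0),
    ("EXAIO", base + timedelta(minutes=20), 96.0),
    ("EX02", base, 99.0),
    ("EX02", base + timedelta(minutes=10), 20.0),
    ("EX02", base + timedelta(minutes=20), 22.0),
]
averages = aggregate_30_min(samples)
over_90 = sorted(computer for (computer, _), avg in averages.items() if avg > 90)
print(over_90)  # ['EXAIO']
```

In this model, EX02 briefly hits 99% but averages 47% over the interval, so a CounterValue>90 filter on the aggregated record would only match EXAIO.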

Image

We can also search for all log entries that (should have) generated an alert by running the following query:

Type=Alert SourceSystem=OMS

Image

Solutions

As previously mentioned in this article series, Log Analytics Solutions add functionality to Log Analytics. They primarily run in the cloud and provide analysis of data collected in the OMS repository. We can use Solutions to, for example, provide a summary of ongoing user activities in Office 365 (as we will shortly see), assess the risk and health of Active Directory, view the status of antivirus and antimalware across servers, identify missing system updates, and much more. They may also define new record types to be collected, which can be analyzed with Log Searches or through an additional user interface provided by the solution in the OMS dashboard.

Solutions are available for a variety of functions. Many will be automatically deployed and start working immediately while others may require some configuration.

To add a solution using the Solutions Gallery:

  1. On the Overview page in OMS, click the Solutions Gallery tile:

Image

  2. On the OMS Solutions Gallery page, learn about each available solution. Click the name of the solution that you want to add to OMS. For this article, let’s add the Office 365 solution (notice that it is still in beta):

Image

  3. On the page for the solution, detailed information about it is displayed. Click Add:

Image

  4. A new tile for this solution appears on the Overview page in OMS:

Image

As you can see, this particular solution requires additional configuration. Once we click on it, we have the option to Connect Office 365:

Image

Then we get prompted for admin credentials to our tenant so that the solution can connect to it and retrieve data:

Image

Once we provide our credentials, we are asked to authorize the solution to perform certain actions, such as reading activity and health data from our Office 365 tenant:

Image

After accepting, the solution is connected to our tenant:

Image

The first time it might take up to 4 hours for the solution to retrieve data, so we need to wait...

Image

Once data is collected, the Office 365 tile in the main console displays the number of activities it has collected information about:

Image

By clicking on the tile, we are taken to the solution's dashboard, where we can get an overview of the data collected:

Image

We can see, for example, a summary of the number of activities collected in the last week as well as the users who performed the most activities:

Image

Unfortunately, at this stage it seems that only Azure activities are collected. Everything I did in Exchange was being captured as an Azure activity:

Image

If we click on Azure Active Directory Activities, we are taken back to the Search with a query that returns all these activities, such as:

Type=OfficeActivity OfficeWorkload=AzureActiveDirectory | measure count() by Operation
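For readers less familiar with the search syntax, the part after the pipe groups the matching records by their Operation field and counts each group. A rough Python equivalent, using a handful of made-up records in place of the real OfficeActivity data, might look like this:

```python
from collections import Counter

# Hypothetical records; only the Operation field matters for the count.
records = [
    {"Operation": "Delete user", "UserId": "Nuno"},
    {"Operation": "Add user", "UserId": "Nuno"},
    {"Operation": "Delete user", "UserId": "Nuno"},
    {"Operation": "Remove member from group", "UserId": "Nuno"},
]

# Equivalent in spirit to "measure count() by Operation".
counts = Counter(record["Operation"] for record in records)
for operation, count in counts.most_common():
    print(f"{operation}: {count}")
```

Each line of output corresponds to one row of the aggregated search result, which is what makes it possible to click on a single operation and drill into its records.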

Image

From here, if we click on the Delete user operation, we can see exactly what happened. In this case, the user Nuno deleted a user named User3:

Image

We can also track user logins to the service:

Image

Or other actions such as removing users from distribution groups. In this example, user Nuno removed user Test from the IT distribution group:

Image

Other Solutions

There are many other useful solutions in Log Analytics. We can easily track computers or workstations missing updates:

Image

Image

If we click on Critical Updates for example, we can drill down to get more details regarding these:

Image

We can see exactly which updates are missing:

Image

And get an overview of how many updates are actually missing from each machine:

Image

As well as their age:

Image

Other solutions allow us to keep track of all the changes across the environment:

Image

Or get a security overview:

Image

There is also one solution to assess Active Directory (AD) replication, and another to assess AD’s risk and health:

Image

Log Analytics Mobile app

Finally, and on a quick note, there is also a mobile app for Log Analytics. Using this app, we can access our dashboard(s) as well as customize them:

Image

We can also access the main console with all its views:

Image

And we can use Search and access the pre-defined searches, our own saved searches, edit existing searches or perform new ones:

Image

Image

Wrapping it up...

In this article series we looked at the new Log Analytics service in Azure and how we can use it together with Exchange on-premises and Online. This is still a fairly new technology, but it is already really powerful! With a few tweaks and fixes, it will become a serious contender in the log gathering/monitoring/alerting world.

The Author — Nuno Mota

Nuno is an Exchange MVP working as a Senior Microsoft Messaging Consultant for a UK IT Services Provider in London. He specializes in Exchange, Lync, Active Directory and PowerShell.