Forecasting System Center Operations Manager data

Wednesday, 25 November 2015 15:00

Read our latest article about forecasting SCOM data at ITServiceAnalytics.com

SLO Reporting With SCOM

Tuesday, 01 September 2015 13:23

SLO chart

 

Service Level Agreements allow IT organizations and their clients to have a common view upon service delivery quality. Service Level Objectives are defined within every SLA in order to obtain specific and measurable metrics which help evaluate and improve the delivery of IT services.

 

Therefore monitoring SLO targets and outcomes is obviously a very important task. Especially within organizations working by (or starting to implement) ITIL framework.

The whole of Microsoft Systems Center product family is designed to support ITIL processes. Measuring uptime and managing SLO values is no exception.

In this article we will look at 3 ways to track SLO performance based on data gathered by Systems Center Operations Manager.

1.      Service Level Tracking Summary Report

This is a report that comes with the default installation of Operations Manager. You will be able to find it in SCOM Console by going to Reporting-> Microsoft Service Level Report Library and running the Service Level Tracking Summary Report.

 

Right at the first step you face one of the biggest shortcomings of this report- the prompt page. Here you have to tell the report what you are after: find the separate components one by one, insert them into the report and click run to wait for the result.

 

Yes, that is the second downside. It is slow. The bigger your environment, the worse the performance.

In the end it does give you the data you asked for. Here is output example:

 Table showing SLO performance in SCOM Console

 

The way data is formatted here is not entirely helpful either. Customers having 20+ services never managed to get a holistic overview of their situation by using this report.

 

We can sum up that this way of checking SLO and uptime works for SCOM admins who need to check if SLO values are set up correctly for a certain entity. But if you are an IT manager (Service Level Manager, Service Owner etc.) this report will probably deliver more frustration than information.

Hence we jump off to option number two.

2.      Service Level Dashboard

This is quite an improvement in usability compared to the above mentioned report. We can now set up dashboards specific to every user, that way allowing them to get a better overview of their area of responsibility. Here is a screenshot of one such dashboard:

 SLO Dashboard for SCOM

We get to this result without any messing about with prompt values and this greatly improves the usability for non-admin users. It is much more intuitive because right on the first screen we see a green cross and a red cross- a call for action waiting for us!

 

This dashboard, however easy to set up, is not really flexible. Clients send us remarks regarding time frame shown in the report (plus the lack of ability to modify that) and the fact that each dashboard is made for a specific set of managed entities. That means administering these dashboards in larger organizations becomes a tedious task.

 

After looking at first two alternatives delivered by Microsoft we are still not really user or admin friendly. But we want to know how good we are and if we are improving over time. Let us look at the last option. We recommend it to those clients who really want to keep a close watch on their Service Delivery Performance.

3.      Custom made solution

Our most demanding clients asked us for “something” which would outperform the out-of-the-box SCOM reporting solutions for SLO reporting in all aspects: flexibility, ease of use, ease of administration and end-user responsiveness.

 

The solution was to extract data from SCOM Data Warehouse and transform it into a more reporting-friendly shape then load this newly formatted data into our own Data Mart.

 

Just finding SLO values and connecting them to managed entities showed to be a bit of a challenge the first time we did this. Going besides the old built-in stored procedures for SLO reporting allowed us to have a much easier data extraction mechanism.

 

By moving the data into a new structure we gained the freedom to use numerous tools that are available for data storage and reporting. The final solution becomes so easy to use, that with just a short introduction training anyone familiar with Excel (an all-time analyst favorite) is able to slice and dice our data cubes to find the information they are after. Here you can see a chart made by an analyst in Excel showing current month uptime compared to SLO of each service:

 Custom made SLO reporting chart giving complete overview

 

And here is another view showing uptime vs SLO of one service for all the months within 2015:

 

 Monthly SLO performance of one service

 

With our custom solution it is also possible to leverage the other advantages of MS Analysis Services such as security, speed and flexibility. Every user can now see just the information that is interesting to their specific role. Most of the reports respond to user queries almost instantly and they have all the freedom in the world to display the information in the way which supports their current business needs best.

 

The major challenges while making a DIY solution are to keep to the budget and actually deliver something that works. If you do decide to go this route, you need to make sure you hire someone with experience.

More to come

SLO performance tracking is just one of the building blocks. There are numerous metrics within SCOM/SCSM just waiting to be converted into information which can help your IT organization grow and become more efficient.

 

Hence this is just the first article in this series. Keep a lookout for more examples of how customized Systems Center data warehouse solutions can benefit your IT organization.

 

Don't miss the upcoming blog posts from us! Subscribe to our newsletter right now by clicking here!

 

SCOM Data Warehouse Alert Reporting

Monday, 07 September 2015 10:23

SCOM Alert data analysis charts in excel

 

Querying Alerts From SCOM Data Warehouse

In previous blog post we looked at custom SLO reporting solution for SCOM. Now we will take a closer look at what can be done with all the alert-related data that is stored in SCOM Data Warehouse.

 

Understanding Alert Nature

 

At first it might seem that all we want to see about alerts is just sitting and waiting for us in the alert view (Alert.vAlert). But by now we all know that nothing is as easy as it seems in System Center data warehouses.

 

There are three main points that complicate matters when it comes to alert querying. 

 

First of all comes the fact that alerts can be generated either by Rules or by Monitors. And both alert sets end up in one fact table, making it a little more complicated to figure out the exact entity which generated each row.

 

Second thing that might happen is that your query might return results generated by Managed Entities which are no longer available in SCOM (deleted old stuff etc.). And we don’t want those to be cluttering our reports either.

 

Finally, figuring out the current (latest) resolution state is a bit of a painful task (both for the DB server and us analysts) because all of those states reside in a separate table and there are quite a few of them to choose from!

 

Let us tackle each issue separately and in the end we will get to a nice looking data table like this:

Screenshot of alert query output from scom data warehouse

Which was it: Rule or Monitor?

 

Every alert can be generated either by a Rule or a Monitor and the only trace of the difference we get within the fact rows are numerical indicators (1 or 0). 

 

But we want to know more than just what it was. We want the name here as well. And since Monitors reside in vMonitor and Rule names can be found in vRule, we end up having to join our Alert view to the two lookup views and pick the right value depending on the indicator flag.

 

LEFT OUTER JOIN [SEGOTSQL01\SCOM2012R2].OperationsManagerDW.dbo.vRule R ON

   A.WorkflowRowId = R.RuleRowId

LEFT OUTER JOIN [SEGOTSQL01\SCOM2012R2].OperationsManagerDW.dbo.vMonitor M ON

   A.WorkflowRowId = M.MonitorRowId

 

Using outer join allows either value to be null and does not affect the number of rows that are returned from the Alert view. Now we just need to pick the right name. 

 

This can be done by checking the Monitor Indicator. For example like this:

 

CASE A.MonitorAlertInd

   WHEN 1 THEN M.MonitorDefaultName

   WHEN 0 THEN R.RuleDefaultName

 

END AS [RaisedByMonitor/Rule]

 

In human language that would mean: when MonitorAlertInd field says we are dealing with a monitor (value = 1) then it should use MonitorDefaultName field coming from v.Monitor, otherwise (when value = 0) use RuleDefaultName from v.Rule.

 

To make the results more readable, we can re-label the numerical alert source identifiers to ‘Monitor’ when it is 1 and ‘Rule’ when it is 0. Like this:

 

CASE A.MonitorAlertInd

   WHEN 1 THEN 'Monitor'

   WHEN 0 THEN 'Rule'

END AS AlertSource

 

Now we are fully aware which entity generated every single row and we have nice labels for them. Time to make sure that we only see values generated by Managed Entities which are still available in SCOM.

 

Managed Entities from Existing Management Packs Only

 

In most real-life scenarios we don’t want to report anything that is obsolete. That would be the case when analyzing alert data as well. Thus our goal should be to only include data that is relevant to management packs currently available in SCOM.

 

Also, some alerts stay open in SCOM Data Warehouse for ever and ever and the reason is quite simple- management pack which created the alert was deleted BEFORE the alert was resolved. So in the end there was nothing that could resolve the open issue, thus it is still open and skewing our result set.

 

vManagedEntityManagementGroup view is the place where you can find traces of validity timeframe for managed entities via ToDateTime field:

output query vManagedEntityManagementGroup

 

What we really want from this data set is to make sure our managed entity was within a ManagedEntityManagementGroup at the time when it was created.

 

Using a bit more advanced join can help here:

INNER JOIN OperationsManagerDW.dbo.vManagedEntityManagementGroup MEMG ON

   A.ManagedEntityRowId = MEMG.ManagedEntityRowId

      AND

   (A.RaisedDateTime >= MEMG.FromDateTime AND MEMG.ToDateTime IS NULL)

 

MEMG.ToDateTime IS NULL ensures our Entity is still available and not part of a deleted MP while A.RaisedDateTime >= MEMG.FromDateTime part of the join ensures a valid connection between ME and MEG existed at the time of alert creation.

 

Two down one to go. Let’s find the latest resolution state of our alerts.

 

Current Resolution State of Alert

 

Each Alert has one or more states associated with it. All state changes are stored in Alert.vAlertResolutionState view and can be reached via a one-to-many join from the Alert.vAlert view.

 

In cases when we want to analyze alerts, having all of their states in the same result set is not really useful and only causes clutter. What we will do instead is fetch the latest available state for every Alert.

 

First we need to identify the timestamp of the latest state change. That is easily done with SQL MAX statement:

 

SELECT

   AlertGuid,
   MAX(StateSetDateTime) AS StateSetDateTime

FROM OperationsManagerDW.Alert.vAlertResolutionState

GROUP BY AlertGuid

 

This subquery can now be complemented with the actual Resolution State from the same view:

 

SELECT I.AlertGuid, O.ResolutionState

FROM (
SELECT

              AlertGuid,

              MAX(StateSetDateTime) AS StateSetDateTime

             FROM OperationsManagerDW.Alert.vAlertResolutionState

GROUP BY AlertGuid
) I

 

INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState O ON

   I.AlertGuid = O.AlertGuid
      AND
   I.StateSetDateTime = O.StateSetDateTime

 

The combination of these two queries gives us the latest state by AlertGuid. This result can now be inner-joined to the main query and the result will only show one state for every alert.

 

The full query

 

We have looked at all the building blocks required to fetch Alerts and data related to them.

 

Time to complete the puzzle. Copy the whole query and give it a go! Let us know how it goes! 

 

SELECT

A.RaisedDateTime,

E.DisplayName AS RaisedByEntity,

CASE A.MonitorAlertInd

             WHEN 1 THEN M.MonitorDefaultName

             WHEN 0 THEN R.RuleDefaultName

END AS [RaisedByMonitor/Rule],

CASE A.MonitorAlertInd

             WHEN 1 THEN 'Monitor'

             WHEN 0 THEN 'Rule'

END AS AlertSource,

A.Category AS AlertCategory,

CASE A.Severity

             WHEN 2 THEN 'Critical'

             WHEN 1 THEN 'Warning'

             WHEN 0 THEN 'Information'

END AS AlertSeverity,

CASE A.Priority

             WHEN 2 THEN 'High'

             WHEN 1 THEN 'Medium'

             WHEN 0 THEN 'Low'

END AS AlertPriority,

A.AlertName,

RS.ResolutionStateName AS ResolutionState,

ARS.TimeFromRaisedSeconds,

A.RepeatCount

 

FROM

OperationsManagerDW.Alert.vAlert A

            

INNER JOIN OperationsManagerDW.dbo.vManagedEntity E ON

             A.ManagedEntityRowId = E.ManagedEntityRowId

 

--Show only alerts that are part of a working CP

INNER JOIN OperationsManagerDW.dbo.vManagedEntityManagementGroup MEMG ON

             A.ManagedEntityRowId = MEMG.ManagedEntityRowId

             AND

             (A.RaisedDateTime >= MEMG.FromDateTime AND MEMG.ToDateTime IS NULL)

            

INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState ARS ON

             A.AlertGuid = ARS.AlertGuid

            

--Return only largest resolution state value for alert

INNER JOIN

             (SELECT I.AlertGuid, O.ResolutionState

             FROM (SELECT

                          AlertGuid,

                          MAX(StateSetDateTime) AS StateSetDateTime

             FROM OperationsManagerDW.Alert.vAlertResolutionState

             GROUP BY AlertGuid ) I

INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState O ON

             I.AlertGuid = O.AlertGuid AND I.StateSetDateTime = O.StateSetDateTime) MARS ON

                          ARS.ResolutionState = MARS.ResolutionState

                          AND

                          ARS.AlertGuid = MARS.AlertGuid

 

INNER JOIN OperationsManagerDW.dbo.vResolutionState RS ON

             ARS.ResolutionState = RS.ResolutionStateId

 

--Outer joins for Rule and Monitor lookup

LEFT OUTER JOIN OperationsManagerDW.dbo.vRule R ON

             A.WorkflowRowId = R.RuleRowId

LEFT OUTER JOIN OperationsManagerDW.dbo.vMonitor M ON

             A.WorkflowRowId = M.MonitorRowId

 

--Return only latest month.

WHERE       

             A.RaisedDateTime > DATEADD(MM, -1, GETDATE())

Don't miss the upcoming blog posts from us! Subscribe to our newsletter right now by clicking here!

Customers