In a 1954 speech to the Second Assembly of the World Council of Churches, U.S. President Dwight D. Eisenhower, quoting Dr. J. Roscoe Miller, president of Northwestern University, said: "I have two kinds of problems: the urgent and the important. The urgent are not important, and the important are never urgent."
This "Eisenhower Principle" is said to be how he organized his workload and priorities. He recognized that great time management means being effective as well as efficient. When managing SAP services, you have to identify both the important and the urgent issues, so that SLAs are kept without compromising the overall health of the system in the longer term.
Yet too often we hear: ‘We get millions of notifications a day, and probably 99% of them are noise. We spend too much time on false alarms and often miss the early indications of major issues.’ Unfortunately, the time wasted chasing phantom problems leads even the best Basis engineers to ignore some alarms, until they get burned by a loss of service (unscheduled downtime).
With compound notifications, you can stack monitors for better visibility, eliminate many of the false alarms and focus on the issues that are urgent and important (and, most importantly, real). A collection of individual SAP checks can act as a single check. Combining different elements within the same system, or even across multiple systems, enables you to find the root cause and address each issue properly.
While a CPU spike may feel urgent, Basis engineers need to understand whether it is truly important.
Let's look at an example:
In this case, of the several elements included in the compound check, the system reports only two: memory usage and CPU usage that have stayed above their set thresholds over time.
As of right now, the high CPU and high memory don't appear to be causing any issues in the system. Depending on the exact scenario, it may not be relevant to send out a notification... yet. In this scenario, Xandria will create a ticket to track the occurrence but will not push out any alerts until other symptoms start to appear in the system.
Here, on top of high memory and CPU usage, the system identified that batch jobs are taking longer than normal. Now that the high memory and CPU are affecting other components in the system, in this case long-running batch jobs, it’s time to send out a notification.
The system will send a warning email, display the issue on the dashboard with increased severity (critical or major, depending on the user definition) and create a ticket.
The system even allows for a different escalation process in case of critical warnings.
In this final scenario, the high memory and CPU are not only affecting batch jobs, but they are also causing the backups to run longer. In other words, the high CPU and memory are not just a temporary spike. This appears to be a system-wide issue that is compounding quickly.
At this point, it’s time to escalate via text, display a critical issue on dashboards and perhaps run a script that will help remediate the problem.
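The three scenarios above boil down to a small decision rule. The following is a minimal sketch of that rule, not Xandria's actual configuration or API: the names `CheckResults`, `Action` and `evaluate` are hypothetical, and counting the number of downstream symptoms is a simplifying assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Hypothetical response levels, mirroring the three scenarios in the text.
    NONE = "no action"
    TICKET = "create ticket, no alert"
    WARN = "email, dashboard warning, ticket"
    ESCALATE = "text message, critical dashboard entry, remediation script"


@dataclass
class CheckResults:
    # Each field stands for the outcome of one individual check.
    high_cpu: bool
    high_memory: bool
    slow_batch_jobs: bool = False
    slow_backups: bool = False


def evaluate(results: CheckResults) -> Action:
    """Combine individual check results into a single composite decision."""
    # Without sustained CPU and memory pressure there is nothing to track.
    if not (results.high_cpu and results.high_memory):
        return Action.NONE
    # Assumption: severity grows with the number of affected components.
    symptoms = int(results.slow_batch_jobs) + int(results.slow_backups)
    if symptoms == 0:
        return Action.TICKET
    if symptoms == 1:
        return Action.WARN
    return Action.ESCALATE


# First scenario: resource pressure only, so a ticket is created but nobody is alerted.
print(evaluate(CheckResults(high_cpu=True, high_memory=True)).value)
# prints "create ticket, no alert"
```

The point of the sketch is that the resource checks alone never page anyone; they only raise the severity once other checks confirm that the system is actually affected.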
In Xandria we have two types of tools that allow users to build new smart monitors from built-in real-time or daily checks, custom checks, or even other composite checks or business services: composite checks and business services.
Composite checks are used within a single system as described in the three scenarios above. By grouping individual checks you get greater clarity into what exactly is occurring when you receive a notification.
Simply setting up individual notifications or dashboards on the health of each individual element will flood your screens and inboxes with all kinds of messages. As we all know, this leads to manually digging through messages, and eventually just logging on to the system to see what is going on.
Grouping individual checks from a single system into a composite check allows for a more meaningful alert that not only explains what's going on, but also the effect it has on the system.
Business services are similar, with a different twist. These groupings of checks can extend beyond a single SAP system to include a landscape, systems that depend on each other (e.g. ECC and BW) or even third-party monitors brought into Xandria.
The idea is to group together a process or service with all the components underneath it that it depends on to run smoothly. Should the overarching process or service not run at full capability, the business service is flagged and instantly shows where the lag or issue lies.
For example, to process shipping labels on a loading dock, SAP needs to be running smoothly.
For that to happen, every underlying element needs to work properly.
Should the dock employee be unable to print a label, a business service that stacks each of those underlying components on top of each other can be consulted: a red flashing light next to ‘Printer Unplugged’ will readily identify the issue.
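To make the idea concrete, here is a minimal sketch of a business service modelled as a dependency tree. It is an illustration only and does not reflect how Xandria stores or evaluates business services; the `Component` class, the `failing_components` function and the component names in the usage lines are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    # One node of the business service: a process, a system or a single check.
    name: str
    healthy: bool = True
    depends_on: list[Component] = field(default_factory=list)


def failing_components(component: Component) -> list[str]:
    """Return the names of the deepest unhealthy components under `component`."""
    failing: list[str] = []
    for dependency in component.depends_on:
        failing.extend(failing_components(dependency))
    # Report this node only if none of its dependencies explains the failure.
    if not failing and not component.healthy:
        failing.append(component.name)
    return failing


# Hypothetical shipping-label service with an unplugged printer at the bottom.
printer = Component("Label printer", healthy=False)
spool = Component("SAP spool service", depends_on=[printer])
shipping = Component("Shipping label process", healthy=False, depends_on=[spool])
print(failing_components(shipping))  # ['Label printer']
```

Walking the tree from the top-level service down to the lowest failing node is what lets the operator see the unplugged printer directly instead of a generic "shipping is broken" alert.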
Composite checks and business services allow SAP Basis teams to focus on the urgent and important matters and eliminate the distractions.
Using these tools, operators and engineers can focus on the essential items while properly scheduling repairs of the lower-priority items, without being swamped by false alarms and unnecessary notifications.