Monitoring and alerting manual

The 5gVision Alerting module provides a very flexible alert delivery mechanism, and therefore requires some configuration effort to set up all schedules, users, objects and thresholds.

Overview

The best way to start with Alerting and understand its main concepts is to watch this video tutorial:

5gVision Alerting video

or to view this sales presentation:

5gVision Alerting module

The Alerting module consists of the Alert log that shows all the alerts raised/cleared in the last 24 hours, and the Config-Alerts screen discussed here. Please also refer to the Alerts module for info on how to view and understand entries in the Alert log.

Schedules

Schedules are used throughout the alert configuration to schedule practically anything. Whenever another table references a schedule and you need to enable or disable an entry based on the time of day or day of week, you may add a new schedule for this. Schedules follow a strict format of either ON=09:00-17:59 or OFF=08:00-22:30, and times must always be in the 24-hour format. Please remember to set your correct time zone in this table.
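As an illustration of the strict format, here is a minimal Python sketch that validates a schedule string and checks whether it is active at a given time. The function names are hypothetical, and the semantics (ON enables inside the range, OFF disables inside the range, both ends inclusive, no range wrapping past midnight) are assumptions rather than documented product behavior.

```python
import re
from datetime import time

# The two documented forms, e.g. "ON=09:00-17:59" or "OFF=08:00-22:30".
SCHEDULE_RE = re.compile(r"^(ON|OFF)=(\d{2}):(\d{2})-(\d{2}):(\d{2})$")


def parse_schedule(text):
    """Return (mode, start, end) or raise ValueError for a malformed entry."""
    match = SCHEDULE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid schedule: {text!r}")
    mode = match.group(1)
    h1, m1, h2, m2 = (int(g) for g in match.groups()[1:])
    # time() itself rejects out-of-range values such as 25:00 with ValueError.
    return mode, time(h1, m1), time(h2, m2)


def is_enabled(text, now):
    """Assumed semantics: ON enables inside the range, OFF disables inside it."""
    mode, start, end = parse_schedule(text)
    inside = start <= now <= end  # assumption: inclusive ends, no midnight wrap
    return inside if mode == "ON" else not inside


print(is_enabled("ON=09:00-17:59", time(10, 0)))   # True
print(is_enabled("OFF=08:00-22:30", time(10, 0)))  # False
```

A 12-hour entry such as "ON=9am-5pm" is rejected by this sketch, which mirrors the requirement to always use the 24-hour format.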

Contacts

Contacts keep the emails, cell phone numbers for SMS delivery or a mobile client PIN for pushes, as well as some limits and statistical info. You can also manage how emails with alerts are sent: one alert per email, several alerts grouped by various alert parameters in one email, or all alerts in one email. This is configured by means of two fields: One alert per email and Notification grouping.

Email template allows you to choose a configured template that will be used to form the email (see Email template). Limits like Max emails/SMS per hour/day are there to prevent too many messages from being sent in case of some unpredictable alert output, for instance, when the alert thresholds were mistakenly set too loosely and too many alerts were delivered. This is especially important in the case of SMS, as too many SMSes are annoying and may cost money. Statistical information is displayed in fields such as Emails/SMS this hour/day.

The Max SMS at once parameter limits the number of SMSes that one alert dispatch will issue. An SMS message is limited to 160 characters, and if a lot of alerts are raised at once - you will get just one email, but it may easily take 10 SMSes to fit all the info. If the max limit is reached, the last SMS will carry a flag saying how many SMSes were skipped.
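The effect of the 160-character limit together with Max SMS at once can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the wording of the "skipped" flag is an assumption, since the actual marker text is not described here.

```python
SMS_LIMIT = 160  # characters per SMS, as stated above


def split_sms(text, max_sms_at_once):
    """Split text into 160-character SMSes, capped at max_sms_at_once."""
    if max_sms_at_once < 1:
        raise ValueError("max_sms_at_once must be at least 1")
    parts = [text[i:i + SMS_LIMIT] for i in range(0, len(text), SMS_LIMIT)]
    if len(parts) <= max_sms_at_once:
        return parts
    skipped = len(parts) - max_sms_at_once
    kept = parts[:max_sms_at_once]
    # Hypothetical flag wording; the real marker text is not documented here.
    flag = f" [+{skipped} SMS skipped]"
    # Trim the last kept SMS so that the flag still fits into 160 characters.
    kept[-1] = kept[-1][:SMS_LIMIT - len(flag)] + flag
    return kept


alert_text = "x" * 1600              # enough text for 10 SMSes
print(len(split_sms(alert_text, 10)))  # 10
print(len(split_sms(alert_text, 3)))   # 3
```

With a limit of 3, the third message in this sketch ends with the flag reporting 7 skipped SMSes.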

Contact groups

Contacts are united in groups in order to provide easier management of same-type users. For instance, you may have NOC, Billing, and Management groups, with users getting alerts on different events, and even for different thresholds. You probably need the management to get only the super-critical alerts, while the NOC will be getting all alerts on ACD and ASR, and the Billing - on profit drops. The contact groups are then used in the configuration of Alerts ABS and Alerts DIFF. Groups also have email/SMS limits and statistical info per month.

Email template

This module allows you to create email templates that are used when generating alert notifications or Customer/Vendor tickets (see Config-Tickets). There are several template types available:
  • text - pure text in emails.
  • html-text - html tags used only to force fixed-width font (Courier), emails still contain text alerts.
  • html-table - fully customizable html table with alerts.
If a template is marked as the default template, it will be used for alerts when Contacts don't have an email template configured. Every template can have its own email From field, CC, BCC, subject and body message. There are two fields where you can edit the email subject; which one is used depends on the value configured in the One alert per email parameter of the Contacts or Tickets config tables.

If the template type is html-table, the additional field Email body table is available. It is joined with the Email body message. Clicking on the field cell opens a simple HTML editor, where you can edit text, change colors, adjust the table settings and use the set of available keywords. So the easiest way to create a new HTML template is to clone an existing one and adjust its email body fields via the HTML editor.

The message part and the alert table part are in separate columns to make it easier to clone templates and create new ones. So if you create a new color scheme in the Email body table, you may quickly copy it by means of the context menu and paste it into several other templates, without touching the message part located above the table. You can use several keywords in the email subject and email body fields (please see Alert email keywords).

Alert email keywords

All available keywords that may be used in the email subject and email body fields are listed here:
  • header - may be replaced with a certain configured text on request.
  • product_name - name of the product (5gVision by default, but may be changed on request).
  • product_name_short - short name of the product (5g by default, but may be changed on request), used in SMS.
  • count_all - total number of raised alerts for which you get a notification in one email.
  • count_critical - number of raised alerts with the tag Critical for which you get a notification in one email.
  • utc_time - date/time of the latest raised alert in the email in UTC time zone (like 2001-01-21 10:00:00).
  • user_time - date/time of the latest raised alert in the email in the user time zone (like 2001-01-21 10:00:00). Time zone is taken from System: Alerts -> Global, Offset from UTC to determine day start for email/sms counters reset.
  • stats_type - indicates a block of alerts grouped by a certain statistics type in one email (for example SWITCH, SNMP).
  • abs_diff - indicates a block of alerts grouped by ABS or DIFF type in one email (Absolute or Differential).
  • object_type - indicates a block of alerts grouped by an object type in one email (for example Customers/Vendors, Areas).
  • object_name - name of an object for which the alert was raised.
  • object_name_short - short name of an object for which the alert was raised (for example TOTAL for TOTAL SYSTEM STATISTICS object), used in SMS.
  • param_name - statistical parameter name (like Calls, ASR, Hr ACD).
  • param_dir - statistical parameter direction (IN or OUT).
  • param_value1 - for ABS alerts: parameter value at the time when the alert was raised. For DIFF alert: previous parameter value averaged over the compared interval.
  • change_sign - for ABS alerts: the sign to inform whether the parameter went below <= or above >= the configured alert threshold. For DIFF alert: sign ===> is used to show the change direction from the previous value to the value at the time of alert.
  • param_value2 - for ABS alerts: min/max alert value configured in the Alert ABS table for this alert. For DIFF alert: parameter value at time of alert averaged over the compared interval.
  • param_change - only for DIFF alerts: shows a % change of the parameter value over the compared interval and the configured threshold (like change: +89.5% (> +50.0%)).
  • raised_cleared - only for ABS alerts: indicates that the alert was RAISED or CLEARED.
  • cleared_after - only for ABS alerts: in a notification about a cleared alert, the time interval during which the alert was active is inserted (like “after 0:15”).
  • critical - if the alert tag is Critical then the highlighted CRITICAL word is inserted.
  • critical_short - short sign to indicate an alert with the tag Critical (!!), used in SMS.
  • link_chart - link to a chart where the parameter data and the highlighted alert are displayed.
  • alert_id - ID of the alert in the Alerts log table (like LogID:123).
  • conf_id - ID of the alert config record in the Alert ABS or Alert DIFF table (like ConfID:3).
  • comment - comment taken from the alert config record of the Alert ABS or Alert DIFF table.
  • comment_short - short comment from the alert config record of the Alert ABS or Alert DIFF table (only the first 50 characters are copied), used in SMS.
  • 32:r:justify - allows you to pad a text with spaces before or after it. The first value sets the number of spaces, and the second one indicates the side where the text should be placed relative to the space (r - right, l - left).

Alert Object groups

Objects are united in groups in order to set up alert thresholds for the whole group, not for just individual objects or object types. An object group item (lower table) can be just one object, but most commonly this will be a bunch of objects of the same type (contractors, or areas, or equipment, etc.). For instance, you may create an object group that will have all c-type objects (contractors), and all a-type objects (areas).

Or you may create a group that will have only one object, filtered using the ID included field, where you may put a switch customer ID, something like "c01.125.03". The same applies to the ID excluded field.

Fields Name include mask (regexp) and Name exclude mask (regexp) provide for a way to filter out only the Objects that you really need to get alerts on. Without those restrictions, you may have too many alerts, most of which may be unnecessary. For instance, if you only wish to get alerts on all destinations in Italy, you may put "Italy" in the Name include mask (regexp).

Name and ID masks follow general regular expression rules, so you may put there something like this: "(Italy|France|Poland)" or "(c01.222|c01.223)". "|" means "OR" here.

You can add the ID include mask (regexp) and ID exclude mask (regexp) columns, hidden by default, as needed. They allow you, for instance, to filter out only the DST set 3 areas with an a3\. include regexp.
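The way include and exclude masks narrow down an object group can be sketched in Python with the standard re module. The object list and the function name are made up for illustration, and the matching semantics (a mask matches if it is found anywhere in the value) are an assumption; the product may anchor or case-fold differently.

```python
import re


def select_objects(objects, name_include=None, name_exclude=None,
                   id_include=None, id_exclude=None):
    """Keep (object_id, name) pairs that pass all configured masks.

    Assumption: a mask matches if it is found anywhere in the value
    (re.search); the product may anchor or case-fold differently.
    """
    selected = []
    for object_id, name in objects:
        if name_include and not re.search(name_include, name):
            continue
        if name_exclude and re.search(name_exclude, name):
            continue
        if id_include and not re.search(id_include, object_id):
            continue
        if id_exclude and re.search(id_exclude, object_id):
            continue
        selected.append((object_id, name))
    return selected


# Made-up objects for illustration only.
objects = [
    ("a3.0039", "Italy Mobile"),
    ("a3.0033", "France Fixed"),
    ("a1.0049", "Germany Fixed"),
    ("c01.222", "Customer A"),
]

print(select_objects(objects, name_include="(Italy|France|Poland)"))
# [('a3.0039', 'Italy Mobile'), ('a3.0033', 'France Fixed')]
print(select_objects(objects, id_include=r"a3\.", name_exclude="France"))
# [('a3.0039', 'Italy Mobile')]
```

Note that in regular expressions an unescaped "." matches any character, so a mask like "c01.222" would also match an ID such as "c01x222"; writing "c01\.222" restricts the match to a literal dot.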

Alerts ABS

ABS or absolute, and DIFF or differential alert threshold tables put together all Contact groups, Alert Object groups and thresholds configured for them.

Let's go through the most important Alerts ABS columns:
  • Send tickets - send Raised only or Raised+Cleared alerts/tickets to Customers/Vendors over email if they are part of the object combination (see Config-Tickets).
  • Combined alert group - groups of combined alerts. Alerts are raised only if conditions for all alerts in a group are met (see Combined alerts).
  • Custom interval, per hour - stats may be compared using custom per-hour intervals set in the Custom intervals table.
  • Accumulative interval - alerts are raised when a parameter reaches a threshold over an accumulation period, for example, from the beginning of the day till now. See Accumulative intervals.
  • Ignore if calls <=. This threshold is good if you don't want to get alerts on customers that are too small, or on destinations that have only a couple of active calls. If you are configuring alerts on calls themselves - this threshold is not needed, and may be omitted.
    One very important point: the parameter used for Ignore if calls <= depends on the parameter of the alert itself: Active calls for EMA or per-window parameters, Connects per hour for most per-hour parameters, or Attempts per hour for per-hour parameters like ASR, ABR, NER, TTR, % of 487, % of hunts. The threshold values may therefore be quite different: for example, for current ACD you may have "Ignore if calls <=20", and for per-hour ACD "Ignore if calls <=200".
  • Ignore if calls >=. Very similar to the above. Please remember that in both these cases the alert may be triggered not only because, say, ACD went below 3 min, but also because the number of calls rose over, say 20, and the previously low ACD became eligible for the alert.
  • Alert if param <=. This is the main threshold; the whole previous alert configuration was done to eventually set this very threshold. There is not much to explain here: if you want an alert on ACD going below 2, put 2 here, or on ASR going below 40% - put 40. The only thing to remember is that blank is not 0, so if you want an alert on calls going to 0 - please put the 0 explicitly.
  • Alert if param >=. Similar to the above.
  • Clear if param >=. It is very important to always set this up if you set up the Alert if param <=. A good practice is to set the clear threshold with some tolerance, so that alerts are not raised/cleared all the time. The same applies in the opposite direction: if % of 487 codes goes over 60% and an alert is raised, it would be good to wait till this parameter goes below 50% or so before clearing the alert.
  • Clear if param <=. Likewise it is very important to always set this up if you set up the Alert if param >=. This cannot be overemphasized, please ALWAYS have a matching raise/clear pair. If you want an alert on ACD going below 4, please always tell the system when the alert should be cleared (say, ACD goes over 4.5), even if you don't want any cleared notifications.
    The reason is that once the alert is raised, you will not get notifications about it every minute: the system knows an object is in an alert state (and you know it too, we presume). However, if the alert is never cleared, the system will continue to believe the alert state goes on forever. Thus, if ACD returns to normal after 1 hour and then goes down again after 3 hours, you will not receive any notifications on these events if you do not specify the clear threshold. Please remember: the alert should be cleared, in order to be raised again.
  • Alert assure interval, min. This works only for the current (not per-hour) stats. We need this interval to make sure the value does not go below/above the threshold only for a very short time. After all, what is the reason to send you an email if ACD goes below 3 min and then goes above again in 1 minute? Notifications are dispatched only at the end of each assure interval, when the value went and stayed below/above the threshold for the whole length of the interval.
  • Clear assure interval, min. Likewise, we need to be sure the parameter reached "good" value and stayed at this level before sending clear notifications. Assure intervals can be omitted, or set to 0 if you wish. For instance, it is a good idea to set it to 0 for very critical cases, like calls dropping to 0, as you probably would like to be notified of such an event, even if it happened for just 1 minute.
  • Group notific.. Right now there are 2 options here: "No", and "5 min.". Choose "No" if you absolutely do not want to wait till a 5-min. bunch of notifications is collected, but prefer to get the alert immediately. Choose "5 min." to limit the number of emails you will be getting, as they won't be coming more often than every 5 min., even if you have a lot of alerts.
    Grouping ABS alerts over longer periods than 5 min. does not make much sense, as the DIFF alerts discussed further are raised/cleared at fixed intervals every 5 min., and will trigger notifications (if there are alerts) every 5 min. in any case.
  • Notify of raised. Notification methods about raised alerts.
  • Notify of cleared. Very simple: you may choose if you wish to be notified of cleared alerts and through which method.
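The interplay of the raise threshold, the clear threshold and the assure intervals described above can be sketched as a small state machine in Python. This is a simplified illustration, not the product's implementation: the class name is hypothetical, one sample is assumed to arrive per minute, and N consecutive samples beyond a threshold stand in for an N-minute assure interval.

```python
class AbsLowAlert:
    """Alert when a value stays <= alert_le, clear when it stays >= clear_ge."""

    def __init__(self, alert_le, clear_ge, alert_assure_min=0, clear_assure_min=0):
        self.alert_le = alert_le
        self.clear_ge = clear_ge          # None models a missing clear threshold
        self.alert_assure_min = alert_assure_min
        self.clear_assure_min = clear_assure_min
        self.active = False               # True while the object is in the alert state
        self.streak = 0                   # consecutive samples beyond the threshold

    def update(self, value):
        """Feed one per-minute sample; return "RAISED", "CLEARED" or None."""
        if not self.active:
            crossed = value <= self.alert_le
            needed = self.alert_assure_min
        else:
            crossed = self.clear_ge is not None and value >= self.clear_ge
            needed = self.clear_assure_min
        self.streak = self.streak + 1 if crossed else 0
        # With an assure interval of 0 the first crossing sample is enough.
        if crossed and self.streak >= max(needed, 1):
            self.active = not self.active
            self.streak = 0
            return "RAISED" if self.active else "CLEARED"
        return None


# ACD alert below 4, cleared above 4.5, with a 2-minute alert assure interval.
acd = AbsLowAlert(alert_le=4, clear_ge=4.5, alert_assure_min=2)
samples = [5, 3.9, 5, 3.9, 3.8, 3.7, 4.2, 4.6, 3.0, 2.9]
print([acd.update(v) for v in samples])
# [None, None, None, None, 'RAISED', None, None, 'CLEARED', None, 'RAISED']
```

The short dip at the second sample is ignored because of the assure interval, and the value 4.2 does not clear the alert because it is still below the clear threshold of 4.5. If the same sketch is created with clear_ge=None, the alert is raised once and never again, which is why a matching raise/clear pair is always needed.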

Alerts DIFF

DIFF or differential alert thresholds table has the following columns (besides the common ones):
  • Send tickets - send Raised only or Raised+Cleared alerts/tickets to Customers/Vendors over email if they are part of the object combination (see Config-Tickets).
  • Combined alert group - groups of combined alerts. Alerts are raised only if conditions for all alerts in a group are met (see Combined alerts).
  • Custom interval, per hour - stats may be compared using custom per-hour or per-minute intervals set in the Custom intervals table.
  • Ignore if calls <= and Ignore if calls >=. Same parameters as in the ABS alerts table, please refer to Alerts ABS.
  • Ignore if param <=. Unlike in the Alerts ABS table, this threshold is not the main one but, like Ignore if calls <=, serves as an additional screen against too many annoying alerts on events that you don't care about.
    For instance, you may have your test equipment with ACD and ASR that are always very low, and you don't want alerts raised if ACD goes from 0.4 to 0.2 min., you may then set this parameter to 0.5 min for ACD and 5% for ASR and will never be getting alerts if the value was below this screening threshold before the drop (but if the value dropped from 5 min to 0.4 min - you will still be alerted).
  • Ignore if param >=. Similar to the above.
  • Alert if change, %. To be more verbose: Alert if the value went down/up over a certain threshold in %. This is the main threshold of the "Alerts DIFF" table.
  • Alert if change, val. It is possible to set UP/DOWN threshold as an absolute value in DIFF alerts like in ABS alerts. You may set both the % change and the value change, alerts will be raised on either of them, with % change having the priority, i.e. if both conditions are met, the alert will be raised on % change.
  • Change dir, or direction, tells us if we want alerts in case the value goes up, or down. There is some difference between comparing values that go down and go up. If ACD changes from 6 to 3 - this is a 50% drop: (3-6)/6 = -0.5, however, if "% of 487 codes" goes from 40% to 80% - this is actually a 100% increase: (80-40)/40 = 1. You always divide by the previous value. Please keep this in mind. This also suggests that when you set up thresholds for values going up, they can be more than 100%, say, if your customer's hourly price (worth of all traffic they sent you over each hour) went from $1,000 to $10,000 - this is a (10000-1000)/1000 = 900% increase.
  • Ignore repeated alerts for, min. DIFF alerts, unlike ABS, will be raised on each occasion when the drop is noticed, so if you have a customer pulling off its traffic, and calls drop 40% every 5 min. over the last hour - you will be getting 12 notifications on basically more or less the same event - dropping calls. To restrict such repeated notifications, you may set the number of minutes during which DIFF alerts for the same object and parameter will not be triggered. Another thing is that DIFF alerts, unlike ABS, cannot be cleared.
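The change arithmetic and the Ignore if param <= screening can be written down as a short Python sketch. The function names and the "UP"/"DOWN" labels are illustrative assumptions, and the strict comparison against the threshold is a guess based on the "(> +50.0%)" form shown for the param_change keyword.

```python
def percent_change(previous, current):
    """Change relative to the previous value, in percent (negative means a drop)."""
    if previous == 0:
        raise ValueError("previous value is 0, relative change is undefined")
    return (current - previous) / previous * 100.0


def diff_alert(previous, current, threshold_pct, direction, ignore_if_param_le=None):
    """Return True if the change exceeds threshold_pct in the given direction.

    direction is "DOWN" or "UP"; these labels and the strict comparison
    are assumptions made for this sketch.
    """
    if ignore_if_param_le is not None and previous <= ignore_if_param_le:
        return False  # screening: the value was already too low before the change
    change = percent_change(previous, current)
    if direction == "DOWN":
        return change < -threshold_pct
    return change > threshold_pct


print(percent_change(6, 3))         # -50.0
print(percent_change(40, 80))       # 100.0
print(percent_change(1000, 10000))  # 900.0
```

With a screening threshold of 0.5 min for ACD, a drop from 0.4 to 0.2 is ignored by diff_alert, while a drop from 5 to 0.4 still triggers it for a 40% DOWN threshold.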

Combined alerts

The system allows you to combine several alerts so that they are raised/cleared only if ALL alerts in the combination group are triggered. The Combined alert group is nothing but a name, for convenience. The only parameter that it has is a status, so that one may quickly disable it for all alerts.

It is recommended to combine alerts of the same type, like ACD with ASR or per-hour ACD with per-hour PDD, and make sure that all alerts have the same parameters of:
  • Object group - otherwise alerts will never raise, as objects will be different.
  • Contact group - otherwise some contacts may receive only part of alerts in a group.
  • Send tickets - otherwise some Customers/Vendors may receive only part of alerts in a group (see more in Config-Tickets).
In theory, any alert of any type can be combined with any other alert. There are some safe combinations that should work well, and some combinations that should rather be avoided. Safe combinations:
  • Any alerts of exactly the same type, where only parameters are different (and custom or accum intervals are also the same for all alerts), e.g. per-hour ABS ACD and per-hour ABS ABR.
  • Per-hour ABS alerts with per-hour DIFF alerts for the same type of stats (VoIP, or SNMP, or numbers). Both ABS and DIFF per-hour alerts are raised at the same minute (usually right after hour end, like :01 or :02), and will be checked for alert combinations simultaneously, e.g. per-hour ABS ACD and per-hour DIFF ASR.
  • ABS per-min active call stats with ABS per-min EMA stats, e.g. ABS per-min active calls with ABS per-min ACD.
  • DIFF per-min active call stats with per-min EMA stats IF there are no custom intervals or they are the same, e.g. DIFF per-min active calls with DIFF per-min ACD.
  • ABS per-min active call stats with per-hour stats IF there are no assure intervals for ABS per-min stats, e.g. ABS per-min active calls with ABS/DIFF per-hour Attempts.
For other combinations there are several things to consider:
  • Combining per-hour VoIP stats with per-hour SNMP stats or per-hour SRC-DST number stats may not work because different types of stats may be raised at a different minute (usually from :00 to :03), and thus will not be checked for alert combinations simultaneously.
  • ABS alerts on per-min parameters may have assure intervals. If you combine ABS Active calls with per-hour alerts, you may have a situation, for instance, when your calls got below 1000 at 10:55, but due to an assure interval of 10 min, the alert will only be active at 11:05, the per-hour alert, however, will be checked at 11:01, and even though the calls were below 1000 at this very moment, none of the combined alerts will be raised due to the assure interval.
  • In general, all alerts in a combination should be checked at the same minute to be raised, so if there is a possibility that alerts are not checked at the same time (for instance, for alerts with custom or accum intervals - some alerts may be checked every 10 minutes, and some other alerts every 15 minutes) - such combinations are better to be avoided.
Alerts in a combination are not only raised, but cleared together too. However, if you have a combination of 2 ABS and 2 DIFF alerts - since there is no such thing as clearing a DIFF alert, whenever the 2 ABS alerts are cleared - the whole combination is considered cleared.
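The requirement that all alerts in a combination be met at the same check minute can be illustrated with a tiny Python sketch. The function name and the data layout (a mapping from an alert name to the set of "HH:MM" minutes at which its own condition is met) are hypothetical and only serve to show why mismatched check times prevent a combined alert from being raised.

```python
def combination_raise_minutes(met_at):
    """Return the check minutes at which every alert in the group is met.

    met_at maps an alert name to the set of "HH:MM" check minutes at which
    that alert's own condition is satisfied.
    """
    if not met_at:
        return set()
    return set.intersection(*(set(minutes) for minutes in met_at.values()))


# The assure-interval case described above: the per-min alert only becomes
# active at 11:05, while the per-hour alert is checked at 11:01.
print(combination_raise_minutes({
    "ABS Active calls (10 min assure)": {"11:05"},
    "per-hour Attempts": {"11:01"},
}))  # set()
```

When both alerts are met at the same minute, for example both at 11:01, that minute is returned and the combination would be raised.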

There is the Alert group column in the Alert log where a combined alert group ID is displayed.

Custom intervals

Custom intervals provide you with more flexibility in configuring your alerts. While normal per-hour alerts are raised for the last hour stats, and DIFF alerts on concurrent or EMA stats are raised based on stats for the last 30 minutes, custom intervals allow you to compare any interval in the past to any other interval or sum up/average stats for an alerted value over several hours and compare to a threshold.

You can set up 2 types of intervals: per-hour and per-minute ones. Per-hour intervals will be applied to per-hour stats (bars on charts), and per-minute ones will be applied to Active call and EMA stats (lines on charts).
  • The two compared intervals are defined in hours/minutes in the Interval 1, hours/mins and Interval 2, hours/mins parameters. For DIFF alerts the system compares two intervals, where Interval 1 is an earlier interval and Interval 2 is a later interval. For ABS alerts only the first interval is calculated and compared to a set value.
  • The distance between the intervals is defined in the Inter distance, hours/mins parameter. Not used for ABS alerts.
  • The offset from the time of comparison/alert to the end of the 2nd (later) interval is defined in the Offset, hours/mins parameter.
  • The frequency of interval comparison for per-hour parameters is determined in the Frequency, hours parameter.
  • The frequency of interval comparison for per-minute parameters is determined in the Frequency, minutes parameter.
  • The Start hour, 24h GMT defines the time of the day when the system starts to compare per-hour intervals. So, when you are comparing intervals longer than 1 hour, for instance, 6 hours, you may not want to check stats and (potentially) raise alerts every hour, but, rather, every 6 hours. This is controlled by the Frequency, hours parameter. Also, you need to tell the system via the Start hour, 24h GMT parameter when to start checking for alerts every day. If you have the Start hour as 2, and the Frequency as 6, stats will be compared at 2am, 8am, 2pm, and 8pm GMT.
  • The Alert days define the days when the interval comparison should take place (1 - Monday, 7 - Sunday). This may be useful for longer intervals, like comparing one week with another week, where you might want to set specific days when this comparison runs, in order not to get these alerts too often. With the Frequency of 24 hours, the Start hour, 24h GMT as 20, and only Monday in Alert days - you will be having just one alert per week at 8pm on Monday.
The data available for comparisons when using per-minute intervals is limited to the last 50 minutes to avoid excessive load on the system. If you need to compare parameters over longer periods, please use per-hour parameters with per-hour custom intervals of up to 2 weeks.

You may choose the method of counting the values for the interval in the Aggreg. type parameter: summing the values (Sum) or averaging them (Average). The aggregation method is only meaningful if the interval spans across several hours or several minutes, as the system takes stats for every hour/minute and calculates the final result using the selected method.

Please note that Alert intervals are valid only for the corresponding stats. If you assign a per-hour interval to a "current" parameter in the alerts configuration, the system will ignore the interval and the parameter will be processed normally.

For example, if you want to compare incoming attempts within 4 hours taken with an offset of 3 more hours to the 4 hours a week ago, and do it every second hour on workdays, the parameters should be as follows:
  • Interval type - Per-hour.
  • Aggreg. type - Average.
  • Interval 1, hours/mins - 4.
  • Interval 2, hours/mins - 4.
  • Inter distance, hours/mins - 168.
  • Offset, hours/mins - 3.
  • Frequency, hours - 2.
  • Start hour, 24h GMT - 0.
  • Alert days - 1, 2, 3, 4, 5.
The resulting interval config is assigned to an alert in the Custom interval parameter. In order for the alert to be raised, the Schedule assigned to this alert should also be enabled at a specific hour. For instance, if you have a Schedule set to be ON from 9am till 5pm GMT, the alert from the above example will be checked for at 10am, 12pm, 2pm and 4pm only, out of the possible times of 12am, 2am, 4am, 6am, 8am, 10am, 12pm, 2pm, 4pm, 6pm, 8pm and 10pm GMT.
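The check times that result from Start hour, Frequency and the assigned Schedule can be reproduced with a short Python sketch. The function names are hypothetical, and the schedule is simplified to whole hours for the purpose of the illustration.

```python
def check_hours(start_hour, frequency_hours):
    """Hours of the day (GMT, 0-23) at which a per-hour interval is compared."""
    return list(range(start_hour, 24, frequency_hours))


def checks_within_schedule(hours, on_from, on_until):
    """Keep the hours at which a simplified whole-hour ON schedule is active."""
    return [h for h in hours if on_from <= h < on_until]


print(check_hours(2, 6))   # [2, 8, 14, 20]
print(check_hours(0, 2))   # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]
print(checks_within_schedule(check_hours(0, 2), 9, 17))  # [10, 12, 14, 16]
```

The first line matches the Start hour 2 / Frequency 6 case (2am, 8am, 2pm and 8pm GMT), and the last line matches the example with a Schedule that is ON from 9am till 5pm.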

Here is an example of comparing traffic for 2 adjacent 24-hour intervals.

Accumulative intervals

Accumulative intervals are created for VoIP and SRC/DST number statistics. You may sum up values since the beginning of an interval till the current time and then compare them to the configured Absolute thresholds.

For example, you can set up an interval from the beginning of the month, week, or day till now, and set the frequency of checks/alerts from every 5 to every 60 minutes. These are the fields to configure accumulative intervals:
  • Aggreg. type - how the stats should be aggregated over several hours - summed or averaged up.
  • Start hour, 24h GMT - defines the beginning of the hour of the day from which the system starts accumulating per-hour stats.
  • Start day of week - the day of week at which to start the accumulation. If you need to accumulate day by day - you may set any day of week and the Accum. interval=24 hours.
  • Start day of month - the day of month at which to start the accumulation. Day of month overwrites day of week.
  • Max Accum. Interval, hours - the maximum accumulated interval in hours. If blank - accumulation goes till next start day of week/month whatever is set. After the accumulation interval is reached, accumulation starts from 0 again. For SRC/DST number stats the max interval is 1 week.
  • Frequency, minutes - how often to compare data and raise alerts with this accumulative interval assigned.
Accumulative intervals work only for per-hour statistics and only for Absolute alerts, see Alerts ABS. Accumulation intervals up to the time of alert raise are highlighted in purple on charts.
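The idea of accumulating per-hour stats from the start of an interval and comparing the running result to an absolute threshold can be sketched in Python as follows. The function names and the sample numbers are made up, and the sketch checks once per hour for simplicity, whereas the actual check frequency is set in the Frequency, minutes field.

```python
def accumulate(hourly_values, aggreg_type="Sum"):
    """Aggregate per-hour values collected since the start of the interval."""
    if not hourly_values:
        return 0.0
    total = float(sum(hourly_values))
    return total if aggreg_type == "Sum" else total / len(hourly_values)


def first_alert_hour(hourly_values, alert_if_ge, aggreg_type="Sum"):
    """Return the index of the first hour at which the running aggregate
    reaches alert_if_ge, or None if it never does."""
    for hour in range(1, len(hourly_values) + 1):
        if accumulate(hourly_values[:hour], aggreg_type) >= alert_if_ge:
            return hour - 1
    return None


# Made-up hourly values accumulated since the start hour.
print(first_alert_hour([100, 250, 300, 400], alert_if_ge=600))  # 2
```

Here the running sums are 100, 350 and 650, so the threshold of 600 is reached in the third hour (index 2).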

Alerts global config

Global config provides a convenient way to quickly switch on/off certain alerts or notification methods. One specific mode is when you enable Test SMS notification delivery to email and provide the email. All SMSes from all contacts will be delivered to this email exactly as they would look on people's phones, split up every 160 characters, etc. This is a very good test mode to see how many SMSes you will actually be getting on average.