SNMPBlender – Scriptable SNMP Framework for Complex Requirements

Please Note

SNMPBlender is currently in alpha-stage development, and both the internals and the Rule API still change quickly. Feel free to use it, but be aware that features may not work yet or may break in future releases.

Overview

SNMPBlender is a scriptable SNMP platform. It can be used to:

Typical real-world use cases:

SNMPBlender is based on these state-of-the-art technologies:

SNMPBlender is not a complete NMS product, but rather a versatile tool that easily plugs into an existing infrastructure.

These are the internal components of SNMPBlender:

Get It

SNMPBlender is hosted on Google Code.

Examples

Simple Continuous Polling of an Error Rate

Let’s start with a really basic, minimal example. Suppose you want to monitor the inbound error rate on all interfaces of your two most important routers. This is the starting configuration, saved in example1.yml (of course without the line numbers):
1  hostname: cx-core-10070
2  ipaddress: 192.168.2.70
3  tables:
4    - [ ifDescr, ifInUcastPkts, ifInErrors ]
5  pollrules:
6    - rulename: BasicCheck
7      selector: unique
8      continueif: clear
9    - rulename: IfInErrors
10     selector: ifDescr.%
11     requiredGenerations: 2
12 ---
13 hostname: cx-core-10071
14 ipaddress: 192.168.3.71
15 tables:
16   - [ ifDescr, ifInUcastPkts, ifInErrors ]
17 pollrules:
18   - rulename: BasicCheck
19     selector: unique
20     continueif: clear
21   - rulename: IfInErrors
22     selector: ifDescr.%
23     requiredGenerations: 2

The non-obvious content line by line:

Now let’s have a look at the BasicCheck rule. It simply queries the database to see whether the poll produced a few interfaces for this host. Why do we do this? SNMP (on purpose) does not specify informative error responses. Whether your target host is down, uses a different community string than you expect, or something else is wrong, SNMP will simply give you a timeout, and we end up with no data in our database. As it makes no sense to run a stack of rules on non-existent data, we first check whether we got a useful response.

Let’s look at this rule, again with line numbers turned on:
1 data = polls.query(hostname, 'ifDescr.%', pollGeneration);
2 if (data.length > 3){
3     result.clear();
4     result.message(hostname + ': ' + data.length + ' interfaces successfully polled');
5 }else{
6     result.critical();
7     result.message(hostname + ': no poll data available');
8 }

Ok, now the stage is set for the rule that will tell us about individual interface errors. Here it is:

ifindex = oidtools.getIndex(selectorOid);
// Error counters of this interface for the current and the previous poll generation
errNow = polls.get(hostname, 'ifInErrors' + "." + ifindex, pollGeneration);
errPrev = polls.get(hostname, 'ifInErrors' + "." + ifindex, pollGeneration - 1);
result.log(hostname + ' ' + selectorValue + ' ' + ' errNow/Prev = ' + errNow + "/" + errPrev);
if (errNow > errPrev){
    // New errors since the last poll: relate them to the packets received in the same interval
    trafficNow = polls.get(hostname, 'ifInUcastPkts' + "." + ifindex, pollGeneration);
    trafficPrev = polls.get(hostname, 'ifInUcastPkts' + "." + ifindex, pollGeneration - 1);
    errorPct = (errNow - errPrev) / (trafficNow - trafficPrev) * 100;
    if (errorPct > 5){
        result.message('High Error Rate on ' + selectorValue);
        result.critical();
    }else if (errorPct > 2){
        result.message('Moderate Error Rate on ' + selectorValue);
        result.warning();
    }else{
        result.message('No Significant Error Rate on ' + selectorValue);
        result.clear();
    }
}else{
    // The error counter did not increase
    result.message('No Significant Error Rate on ' + selectorValue);
    result.clear();
}
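
The percentage calculation in this rule divides by the packet delta, which is zero on an idle interface, and it does not account for counter wrap-around. The following is a minimal sketch of a more defensive calculation, not part of the SNMPBlender API: the helper names counterDelta and errorPercent are made up for illustration, and it assumes that both objects are 32-bit SNMP counters, that at most one wrap happens between two polls, and that poll values may arrive as strings.

```javascript
// Illustrative helpers, not part of the SNMPBlender API.
var COUNTER32_MODULUS = 4294967296; // 2^32, where an SNMP Counter32 wraps to 0

// Difference between two Counter32 samples.
// Assumption: at most one wrap happened between the two polls.
function counterDelta(now, prev) {
    var delta = Number(now) - Number(prev); // Number() in case values are strings
    return delta >= 0 ? delta : delta + COUNTER32_MODULUS;
}

// Inbound error rate in percent, or null if no packets were counted.
function errorPercent(errNow, errPrev, trafficNow, trafficPrev) {
    var packets = counterDelta(trafficNow, trafficPrev);
    if (packets === 0) {
        return null; // idle interface: avoid dividing by zero
    }
    return counterDelta(errNow, errPrev) * 100 / packets;
}

console.log(errorPercent(130, 100, 2000, 1000)); // 3
console.log(errorPercent(5, 5, 10, 10));         // null
console.log(counterDelta(5, 4294967290));        // 11 (counter wrapped)
```

Inside a rule, such helpers would have to be defined at the top of the rule file, and a null return value could be mapped to result.noop() or result.clear(), depending on how idle interfaces should be reported.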

Ok, so how do we run this thing? Some parameters to the blending process must be directly passed as JVM parameters, so a sample startup script looks like this:
#!/bin/bash
java \
  -Dtargets=etc/example1.yml \
  -Doidmap=etc/example1oidmap.txt \
  -Dpollcycle=60 \
  -Dpollgenerations=2 \
  -Drules=etc \
  -Dlog4j.configuration=file:etc/example1log4j.properties \
  -jar dist/snmpblender.jar

The important stuff:

Here’s what the minimal OID map looks like:
#example1oidmap.txt
.1.3.6.1.2.1.2.2.1.14 ifInErrors
.1.3.6.1.2.1.2.2.1.11 ifInUcastPkts
.1.3.6.1.2.1.2.2.1.2 ifDescr

Now, let’s look at how results are processed. SNMPBlender has no built-in messaging, SMS gateway or paging interface. Instead, it uses Log4j, a versatile library for logging data to all kinds of destinations. In Log4j lingo, a message destination is called an appender. If we want to store our results in a file and simultaneously send them to our multi-million Enterprise Management Platform via Syslog, this configuration will do it:

log4j.appender.resultsFile=org.apache.log4j.RollingFileAppender
log4j.appender.resultsFile.File=/tmp/results.log
log4j.appender.resultsFile.layout=org.apache.log4j.PatternLayout
log4j.appender.resultsFile.layout.ConversionPattern=%d %m%n

log4j.appender.resultsSyslog=org.apache.log4j.net.SyslogAppender
log4j.appender.resultsSyslog.SyslogHost=openview.example.org
log4j.appender.resultsSyslog.Facility=local3
log4j.appender.resultsSyslog.layout=org.apache.log4j.PatternLayout
log4j.appender.resultsSyslog.layout.ConversionPattern=%d %m%n

The results sent to the file and the Syslog destination will look like this:
2011-07-25 20:29:56,924 poll:cx-core-10070:IfInErrors:1:1.3.6.1.2.1.2.2.1.2.2 cx-core-10070 IfInErrors CRITICAL High Error Rate on FastEthernet0/1
2011-07-25 20:35:46,968 poll:cx-core-10070:IfInErrors:1:1.3.6.1.2.1.2.2.1.2.2 cx-core-10070 IfInErrors CLEAR No Significant Error Rate on FastEthernet0/1

The first few columns are obviously the date when the message was created, and the rightmost columns are the hostname, the rule name, the severity and the message that we wrote ourselves with result.message(). But what does the long string poll:cx-core-10070:IfInErrors:1:1.3.6.1.2.1.2.2.1.2.2 mean? This is the so-called ruleInstanceId, a very important concept in SNMPBlender. It uniquely identifies a rule instance and allows deduplication of events. For polls, the ruleInstanceId is composed of these parts:

Did you notice how IfInErrors.js sends a clear, warning or critical message every time it is run? Chances are that your NOC doesn’t want to receive it that frequently, but only when something has changed. Whether something has changed is evaluated per ruleInstanceId. The rules are simple:

Finally, you may wonder where the information logged with result.log() goes. It is written through an additional appender in the Log4j properties file. To see these logs, the level for the class RuleEngine must be set to DEBUG, like so:
log4j.rootLogger=WARN, logfile
log4j.appender.logfile=org.apache.log4j.RollingFileAppender
log4j.appender.logfile.File=/tmp/snmpblender.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n
log4j.logger.com.netnea.snmpblender.RuleEngine=DEBUG

This will produce a file containing the log messages, along with other information:
2011-07-25 22:30:05,940 DEBUG [RuleEngine-1] (RuleEngine.java:275) - Executing etc/IfInErrors.js, ruleInstanceId poll:cx-core-10070:IfInErrors:1:1.3.6.1.2.1.2.2.1.2.1
2011-07-25 22:30:05,941 DEBUG [RuleEngine-0] (RuleEngine.java:379) - RuleResult log: cx-core-10071 FastEthernet0/0 errNow/Prev = 0/0

All the files mentioned in this example can be found in etc/example1.

Processing Traps and Using Trap Reactions

The CTO of MegaCorp Inc wants you to make sure that all interface descriptions adhere to the format “to XXYY”, where XX can be LA or NY and YY is a numeric code for the remote destination. To catch misconfigurations as quickly as possible, we use IOS config change traps (i.e. snmp-server enable traps config) and this configuration:

hostname: cx-core-10070
ipaddress: 192.168.2.70
community: public
traprules:
  - rulename: Ex2ConfigChangeRule
    selector: ".*\\.43\\.2\\.0\\.1$"
    reactionpolls:
      tables:
        - [ ifDescr, ifAlias ]

And here’s the rule:

result.log("Got Config Change Trap");

aliases = trap.queryReactionPolls('ifAlias.%');

result.log('Trap reactions: ');

allOk = true;
msg = "Invalid ifAlias on ";

for (i = 0; i < aliases.length; i++){

    alias = aliases[i].value;
    ifindex = oidtools.getIndex(aliases[i].oid);
    ifDescrOid = oidtools.toOid('ifDescr' + "." + ifindex);
    ifDescr = trap.getReactionPollValue(ifDescrOid);

    if (! alias.match(/to (LA|NY)\d\d/)){ // match to LA02, to NY01, to NY99 etc...
        result.log("index " + ifindex + " if " + ifDescr + " Alias NOK: " + alias);
        allOk = false;
        msg += ifDescr + " ";
    }else{
        result.log("index " + ifindex + " if " + ifDescr + " Alias OK: " + alias);
    }
}

if (allOk){
    result.clear();
    result.message('All ifAliases conforming to MegaCorp Inc Operations Manual');
}else{
    result.warning();
    result.message('MegaCorp Inc Operations Manual violation - ' + msg);
}
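
Note that the pattern in this rule is not anchored, so an alias that merely contains a valid-looking fragment also passes. If the format “to XXYY” is meant to describe the whole alias, a stricter check could look like the sketch below. The helper name isValidAlias is hypothetical, and whether the strict reading is what the Operations Manual intends is an assumption.

```javascript
// Hypothetical helper: strict variant of the alias check used in the rule.
function isValidAlias(alias) {
    // ^ and $ require the whole alias to be "to ", a site code (LA or NY)
    // and exactly two digits.
    return /^to (LA|NY)\d\d$/.test(alias);
}

console.log(isValidAlias('to NY10'));                         // true
console.log(isValidAlias('backup to LA031 (old)'));           // false
// The unanchored pattern accepts the same alias because it contains "to LA03":
console.log(/to (LA|NY)\d\d/.test('backup to LA031 (old)'));  // true
```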

To run this, we must make sure to use the appropriate trap listener options:

#!/bin/bash
java \
  -Dtargets=etc/example2.yml \
  -Doidmap=etc/oidmap.txt \
  -Dpollcycle=60 \
  -Dtrapaddr=0.0.0.0 \
  -Dtrapport=162 \
  -Dpollgenerations=3 \
  -Drules=etc \
  -Dlog4j.configuration=file:etc/log4j.properties \
  -jar dist/snmpblender.jar

The debug output for this rule could look similar to this:
DEBUG [RuleEngine-3] (RuleEngine.java:379) - RuleResult log: Got Config Change Trap
Trap reactions:
index 1 if FastEthernet0/0 Alias OK: to NY10
index 2 if FastEthernet0/1 Alias NOK: to location los angeles 6
index 3 if FastEthernet1/0 Alias NOK: joe's mp3 server
index 4 if FastEthernet1/1 Alias OK: to LA03

In this case, Fa0/1 and Fa1/0 do not match our pattern, so this error will be sent:
2011-07-30 14:00:08,506 trap:cx-core-10070:Ex2ConfigChangeRule:0:TRAP:1.3.6.1.4.1.9.9.43.2.0.1 cx-core-10070 Ex2ConfigChangeRule WARNING MegaCorp Inc Operations Manual violation - Invalid ifAlias on FastEthernet0/1 FastEthernet1/0

Note that the ruleInstanceId is a bit different for traps. It consists of:

All the files mentioned in this example can be found in etc/example1.

Reference

Startup Options

These are the options that can be passed to SNMPBlender, preferably by editing snmpblender.sh:

|| Option || Description || Default ||
|| targets || Main configuration file || none ||
|| rules || Base directory of rule files || none ||
|| pollcycle || Poll interval (seconds) || 60 ||
|| pollgenerations || How many generations to keep || 4 ||
|| maxgenerations || Operate in non-continuous mode and stop after the specified number of generations || 0 (run forever) ||
|| oidmap || OID lookup file || none ||
|| rulethreads || Number of RuleEngine threads || 4 ||
|| writerthreads || Number of ResultWriter threads || 2 ||
|| trapaddr || Bind address for trap listener || 0.0.0.0 (all) ||
|| trapport || Listen port for trap listener. Setting this to a privileged port requires SNMPBlender to be run as root. || 16002 ||
|| log4j.configuration || Configuration for Log4j || none ||

Configuration File Syntax

The configuration file is in [http://yaml.org/ YAML] syntax. For examples, see the etc directory in the distribution.

General Host Properties

|| Attribute name || Description || Content ||
|| hostname || The hostname. A host can only appear once. || String, mandatory ||
|| ipaddress || IP address. This will be used to perform the polls and assign traps to the host. || IP address, mandatory ||
|| snmpversion || SNMP version used in polls. || 1, 2 or 3, mandatory ||
|| retries || SNMP retry count || Integer, default 1, optional ||
|| timeout || SNMP timeout, in ms || Integer, default 5000, optional ||
|| community || SNMP read community for version 1 + 2 || String, default public, optional ||
|| seclevel || SNMP V3 security level || NoAuthNoPriv, AuthNoPriv or AuthPriv, mandatory when snmpversion = 3 ||
|| user || SNMP V3 user || String, mandatory when snmpversion = 3 ||
|| authmethod || SNMP V3 auth method || MD5 or SHA, mandatory when snmpversion = 3 and seclevel != NoAuthNoPriv ||
|| authkey || SNMP V3 auth key || String, mandatory when snmpversion = 3 and seclevel = AuthNoPriv or AuthPriv ||
|| privmethod || SNMP V3 priv method || DES or AES128, mandatory when snmpversion = 3 and seclevel = AuthPriv ||
|| privkey || SNMP V3 priv key || String, mandatory when snmpversion = 3 and seclevel = AuthPriv ||
|| scalars || List of scalar OIDs (i.e. single values) to fetch in each pollcycle. || Numeric OIDs or resolvable names, optional ||
|| tables || List of tabular OID columns to fetch in each pollcycle. Can be the root of a table to fetch it as a whole or the OIDs of individual columns. || Numeric OIDs or resolvable names, optional ||

Poll Rules

|| Attribute name || Description || Content ||
|| pollrules || List of pollrules to execute. Each item is a map with keys as described below. || List of maps, optional ||
|| pollrules.(i).rulename || Name of pollrule to execute. If the rulename is x, the file loaded will be (Base directory of rule files)/x.js. || String, mandatory ||
|| pollrules.(i).selector || For which OIDs (which also amounts to how many times) this rule will be executed. E.g. if you have ifEntry in your table polls, and you’d like to execute the rule for every interface in there, ifDescr.% will achieve this. To execute the rule once, the special value unique can be used. || SQL-style LIKE pattern or unique, mandatory ||
|| pollrules.(i).requiredgenerations || How many generations of data must be present before the rule is executed, defaults to 1. || Integer, optional ||
|| pollrules.(i).continueif || The following rules will only be executed if the current rule exits with one of the specified severities. || One or more of NOOP CLEAR WARNING CRITICAL, optional ||
|| pollrules.(i).“anything” || All additional attributes will be made available as Javascript variables of the same name in the rule execution environment. || Arbitrary content ||
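
To get a feeling for which attribute names a LIKE-style selector such as ifDescr.% covers, the pattern can be translated into a regular expression. This is only an illustration of the matching semantics, not how SNMPBlender evaluates selectors internally. The function name likeToRegExp is made up, and only the % wildcard is handled, since that is the only one used in this document.

```javascript
// Illustration only: translate a LIKE-style selector into a RegExp.
// Only the % wildcard (any sequence of characters) is handled.
function likeToRegExp(pattern) {
    var escaped = pattern
        .split('%')
        .map(function (part) {
            // Escape regex metacharacters so that "." matches a literal dot.
            return part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        })
        .join('.*');
    return new RegExp('^' + escaped + '$');
}

var selector = likeToRegExp('ifDescr.%');
console.log(selector.test('ifDescr.15'));    // true
console.log(selector.test('ifInErrors.15')); // false
```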

Trap Rules

|| Attribute name || Description || Content ||
|| traprules || List of traprules to execute. Each item is a map with keys as described below. || List of maps, optional ||
|| traprules.(i).rulename || Name of traprule to execute. If the rulename is x, the file loaded will be (Base directory of rule files)/x.js. || String, mandatory ||
|| traprules.(i).selector || A regular expression which is applied to the incoming trap’s OID (for v2 and v3 traps/informs) or the string enterprise:generic:specific (for v1 traps). If it matches, the rule is executed. || Regular expression, mandatory ||
|| traprules.(i).reactionpolls.scalars || List of scalar polls to execute in response to this trap. Same format as above. || Numeric OIDs or resolvable names, optional ||
|| traprules.(i).reactionpolls.tables || List of tabular polls to execute in response to this trap. Same format as above. || Numeric OIDs or resolvable names, optional ||
|| traprules.(i).“anything” || All additional attributes will be made available as Javascript variables of the same name in the rule execution environment. || Arbitrary content ||
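
A trap selector can be tried out against sample values before it goes into the configuration. The sketch below uses plain JavaScript regular expressions and the selector from the config change example. Whether SNMPBlender requires the whole string to match or only a part of it is not stated here, so both patterns are written to behave the same either way. The v1 selector is a made-up example built from the sample values shown for the trap object.

```javascript
// In YAML, the double-quoted string ".*\\.43\\.2\\.0\\.1$" is read as .*\.43\.2\.0\.1$
var configChangeSelector = new RegExp('.*\\.43\\.2\\.0\\.1$');

// v2/v3 traps and informs: the selector is applied to the trap OID.
console.log(configChangeSelector.test('1.3.6.1.4.1.9.9.43.2.0.1')); // true
console.log(configChangeSelector.test('1.3.6.1.4.1.9.9.43.2.0.2')); // false

// v1 traps: the selector is applied to the string enterprise:generic:specific.
// Hypothetical selector: enterprise 1.3.6.1.4.1.94, generic 2, any specific number.
var v1Selector = new RegExp('^1\\.3\\.6\\.1\\.4\\.1\\.94:2:\\d+$');
console.log(v1Selector.test('1.3.6.1.4.1.94:2:42')); // true
console.log(v1Selector.test('1.3.6.1.4.1.94:3:42')); // false
```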

Javascript Rule API

Predefined Variables

|| Variable || Description || Example ||
|| hostname || The hostname for which the current rule is executed. || switch08.example.com ||
|| selectorOid || The expanded, numeric OID matched by the rule selector. || If the selector was ifDescr.%, the rule executed for ifindex 15 will see ifDescr.15, i.e. 1.3.6.1.2.1.2.2.1.2.15 in numeric form. ||
|| selectorValue || The value of the OID matched by the rule selector. || If the selector was ifDescr.%, the value for ifindex 15 may be Fa0/15. ||
|| ruleInstanceId || A unique identifier for this host, rule file and rule instance combination. Used for internal event de-duplication. || poll:switch08.example.com:interfaceerrors:0:1.3.6.1.2.1.2.2.1.2.15 ||
|| pollGeneration || The number of the poll generation currently processing, starts at 1. || 102 ||

The oidtools Object

|| Method || Description || Example ||
|| oidtools.toName(String numericOid) || Convert a numeric OID to an attribute name, as defined in the oidmap file. This also handles partial expansion and wildcards. || oidtools.toName('1.3.6.1.2.1.2.2.1.2.15') == 'ifDescr.15' ||
|| oidtools.toOid(String attributeName) || Convert an attribute name to a numeric OID, as defined in the oidmap file. This also handles partial expansion and wildcards. || oidtools.toOid("ifAdminStatus.5") == "1.3.6.1.2.1.2.2.1.7.5" ||
|| oidtools.getIndex(String numericOid) || Get the rightmost element of an OID, i.e. the index. || oidtools.getIndex("1.3.6.1.2.1.2.2.1.7.5") == "5" ||
|| oidtools.dropRight(String numericOid, int n) || Drop n elements from the right side of an OID, return the remainder. || oidtools.dropRight("1.3.6.1.2.1.2.2.1.7.5", 4) == "1.3.6.1.2.1.2" ||
|| oidtools.getRight(String numericOid, int n) || Get n elements from the right side of an OID. || oidtools.getRight("1.3.6.1.2.1.2.2.1.7.5", 3) == "1.7.5" ||
|| oidtools.toArray(String numericOid) || Convert an OID to an array for easy access to individual elements. || x = oidtools.toArray("1.3.6.1.2.1.2.2.1.7.5"); x[0] + "," + x[1] + "," + x[10] == "1,3,5"; ||
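
The purely string-based methods can be pictured as simple operations on the dot-separated elements of an OID. The following stand-alone re-implementation is meant only to illustrate the semantics described above and to allow experiments outside SNMPBlender. It is not the actual oidtools code, and tolerating a leading dot is an assumption.

```javascript
// Stand-alone illustration; inside a rule, use the oidtools object instead.
function toArray(oid) {
    // Assumption: tolerate an optional leading dot, e.g. ".1.3.6".
    return oid.replace(/^\./, '').split('.');
}

function getRight(oid, n) {
    var parts = toArray(oid);
    return parts.slice(parts.length - n).join('.');
}

function dropRight(oid, n) {
    var parts = toArray(oid);
    return parts.slice(0, parts.length - n).join('.');
}

function getIndex(oid) {
    return getRight(oid, 1);
}

var oid = '1.3.6.1.2.1.2.2.1.7.5';
console.log(getIndex(oid));     // 5
console.log(dropRight(oid, 4)); // 1.3.6.1.2.1.2
console.log(getRight(oid, 3));  // 1.7.5
```

For tables whose index consists of more than one element (for example tables indexed by an IP address), getIndex returns only the last element, so getRight with the appropriate element count is the better fit there.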

The polls Object

|| Method || Description || Example ||
|| polls.get(String hostname, String oidExp, int pollGeneration) || Get a single value from the polls database. The argument oidExp can either be a numerical OID or a string resolvable in the OID map. || errorsNow = polls.get(hostname, '1.3.6.1.2.1.2.2.1.14.1', pollGeneration); errorsBefore = polls.get(hostname, '1.3.6.1.2.1.2.2.1.14.1', pollGeneration - 1); ||
|| polls.get(String oidExp) || Convenience method, assumes current hostname and pollGeneration. || uptimeNow = polls.get('sysUpTimeInstance'); ||
|| polls.get(String oidExp, int pollGeneration) || Convenience method, assumes current hostname. || uptimeLastGen = polls.get('sysUpTimeInstance', pollGeneration - 1); ||
|| polls.query(String hostnamePattern, String oidPatternExp, int pollGeneration) || Get an array of records from the polls database. Each element data[i] of the returned array has the properties data[i].hostname, data[i].oid, data[i].value and data[i].received (a timestamp in epoch milliseconds). || data = polls.query('switch%', 'ifEntry.%', pollGeneration); ||
|| polls.query(String oidPatternExp) || Convenience method, assumes current hostname and pollGeneration. oidPatternExp is a numeric OID pattern or a string resolvable in the OID map. It can contain any number of % wildcard characters. || polls.query('iso.%'); // anything we have for the current host and generation ||
|| polls.query(String oidPatternExp, int pollGeneration) || Convenience method, assumes current hostname. oidPatternExp as above. || polls.query('iso.%', pollGeneration - 1); // anything we have for the current host, previous generation ||

The trap Object

The trap object is only available in trap rules.

|| Method || Description || Example ||
|| trap.getType() || Get trap type, either V1TRAP, TRAP or INFORM. || V1TRAP ||
|| trap.getEnterprise() || Get enterprise OID (V1TRAP only). || 1.3.6.1.4.1.94 ||
|| trap.getGeneric() || Get generic trap number (V1TRAP only). || 2 ||
|| trap.getSpecific() || Get specific trap number (V1TRAP only). || 42 ||
|| trap.getVariableBindings() || Get variable bindings of trap. Returns an array of bindings with properties oid, value and syntax. || vb = trap.getVariableBindings(); for (i = 0; i < vb.length; i++){ result.log("varbind-" + (i+1) + " oid " + vb[i].oid + " value " + vb[i].value + " syntax " + vb[i].syntax); } ||
|| trap.queryReactionPolls(String oidPatternExp) || If the rules for this trap requested any additional polls, their results can be retrieved with this method. The returned array has the same format as in polls.query(). || trap.queryReactionPolls('%'); // fetch everything ||
|| trap.getReactionPollValue(String oidExp) || Get a single value from the reaction poll results. || trap.getReactionPollValue('ifAlias.12') ||

The keyvaluestore Object

To share additional data between different rules or subsequent executions of the same rule, a global key-value store is provided. To facilitate generating a unique key, use the predefined ruleInstanceId.

|| Method || Description || Example ||
|| keyvaluestore.put(String key, String value) || Store key/value pair. Silently overwrites a previous key. || keyvaluestore.put('PrinterOnFire', 'yes'); ||
|| keyvaluestore.get(String key) || Get value. || keyvaluestore.get('PrinterOnFire'); ||
|| keyvaluestore.delete(String key) || Delete key/value pair. || keyvaluestore.delete('PrinterOnFire'); // Issue resolved ||
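
One possible use of the store is to remember state between executions of the same rule, for instance to count how many polls in a row a condition has been failing. The sketch below is hedged in several ways: the helper countConsecutiveFailures and the key suffix are made up, the in-memory keyvaluestore at the top is only a stand-in so that the code runs outside SNMPBlender, and it is an assumption that get() returns null for a missing key.

```javascript
// Stand-in for the real keyvaluestore, only so this sketch runs on its own.
// Assumption: get() returns null for keys that are not present.
var keyvaluestore = (function () {
    var data = {};
    return {
        put: function (key, value) { data[key] = String(value); },
        get: function (key) { return data.hasOwnProperty(key) ? data[key] : null; },
        'delete': function (key) { delete data[key]; }
    };
}());

// Hypothetical helper: number of consecutive executions in which the
// condition failed, tracked per rule instance.
function countConsecutiveFailures(ruleInstanceId, failedNow) {
    var key = ruleInstanceId + ':failures';
    if (!failedNow) {
        keyvaluestore['delete'](key); // condition recovered, reset the counter
        return 0;
    }
    // Values are stored as strings, so convert before incrementing.
    var count = parseInt(keyvaluestore.get(key) || '0', 10) + 1;
    keyvaluestore.put(key, String(count));
    return count;
}

var id = 'poll:switch08.example.com:interfaceerrors:0:1.3.6.1.2.1.2.2.1.2.15';
console.log(countConsecutiveFailures(id, true));  // 1
console.log(countConsecutiveFailures(id, true));  // 2
console.log(countConsecutiveFailures(id, false)); // 0
```

A rule could then raise result.critical() only once the returned count exceeds some threshold, which dampens alerts for short-lived glitches.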

The result Object

Used to communicate information from the rule.

|| Method || Description || Example ||
|| result.noop() || Set severity of result to NOOP (no-operation, don’t care). This is the default. || ||
|| result.clear() || Set severity of result to CLEAR (all ok, issue resolved). || ||
|| result.warning() || Set severity of result to WARNING. || ||
|| result.critical() || Set severity of result to CRITICAL. || ||
|| result.message(String msg) || Set the message that is sent to Log4j together with the severity. || result.clear(); result.message('Host ' + hostname + ' interface ' + selectorValue + ': Error counters back to normal'); ||
|| result.log(String log) || Log a message from the rule. Can be called multiple times. To see the messages, set log4j.logger.com.netnea.snmpblender.RuleEngine=DEBUG in log4j.properties. || ||
|| result.forceSend() || Cause the result of this rule to be sent over the result appenders, even if the rule instance has already sent an alert of this severity. || ||
|| result.setRuleInstanceId(String id) || Override the ruleInstanceId from within a rule. || ||
|| result.returnFromRule() || Return without processing the rest of the rule file. This is the only statement that exits the rule. || ||
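
The description of result.forceSend() implies that a result is normally suppressed when the same rule instance has already reported the same severity. This behaviour can be pictured roughly as in the sketch below. It is a simplified model for illustration, not SNMPBlender's implementation: the function shouldSend and the lastSeverity table are made up, and any special handling of NOOP results is left out.

```javascript
// Simplified model of per-rule-instance de-duplication (illustration only).
var lastSeverity = {}; // ruleInstanceId -> severity reported by the last run

function shouldSend(ruleInstanceId, severity, forceSend) {
    var changed = lastSeverity[ruleInstanceId] !== severity;
    lastSeverity[ruleInstanceId] = severity;
    // Send when the severity changed, or when the rule asked for it explicitly.
    return changed || forceSend === true;
}

var id = 'poll:cx-core-10070:IfInErrors:1:1.3.6.1.2.1.2.2.1.2.2';
console.log(shouldSend(id, 'CRITICAL', false)); // true  (first result)
console.log(shouldSend(id, 'CRITICAL', false)); // false (same severity again)
console.log(shouldSend(id, 'CRITICAL', true));  // true  (forced)
console.log(shouldSend(id, 'CLEAR', false));    // true  (severity changed)
```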

OID Map Syntax

See etc/oidmap.txt for a sample:

# oid map
# - comments (starting with #) and blank lines allowed
# - initial “.” in OIDs accepted, but not required
# - entries aren’t required to be in a particular order
# - payload lines must match ^\\.1(\\.\\d+)* (\\w|\\d|-)+$

.1 iso
.1.3 org
.1.3.6 dod
.1.3.6.1 internet
.1.3.6.1.6 snmpV2
.1.3.6.1.6.3 snmpModules
.1.3.6.1.6.3.1 snmpMIB
.1.3.6.1.6.3.1.2 snmpMIBConformance
.1.3.6.1.6.3.1.2.2 snmpMIBGroups
.1.3.6.1.6.3.1.2.2.10 snmpObsoleteGroup
.1.3.6.1.6.3.1.2.2.12 snmpNotificationGroup
.1.3.6.1.6.3.1.2.2.11 snmpWarmStartNotificationGroup
.1.3.6.1.6.3.1.2.2.7 snmpBasicNotificationsGroup
.1.3.6.1.6.3.1.2.2.6 systemGroup
.1.3.6.1.6.3.1.2.2.5 snmpSetGroup
.1.3.6.1.6.3.18.2.2.1 snmpCommunityGroup
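
To check a hand-written OID map against the format rules listed in the sample's comments before starting SNMPBlender, a small validator along the following lines may help. It is a sketch written for Node.js and not part of SNMPBlender. The function name parseOidMap is made up, and the pattern is a slightly relaxed reading of the rule quoted above (optional leading dot, any whitespace between OID and name).

```javascript
// Illustrative OID map checker, not part of SNMPBlender.
// Returns an object mapping numeric OIDs (without leading dot) to names.
function parseOidMap(text) {
    var map = {};
    text.split(/\r?\n/).forEach(function (line) {
        var trimmed = line.trim();
        if (trimmed === '' || trimmed.charAt(0) === '#') {
            return; // blank lines and comments are allowed
        }
        // Optional leading dot, OID starting with 1, then a name made of
        // word characters, digits or hyphens.
        var m = /^\.?(1(?:\.\d+)*)\s+([\w-]+)$/.exec(trimmed);
        if (!m) {
            throw new Error('Invalid OID map line: ' + line);
        }
        map[m[1]] = m[2];
    });
    return map;
}

var sample = [
    '# example',
    '',
    '.1.3.6.1.2.1.2.2.1.14 ifInErrors',
    '1.3.6.1.2.1.2.2.1.2 ifDescr'
].join('\n');

var oidMap = parseOidMap(sample);
console.log(oidMap['1.3.6.1.2.1.2.2.1.14']); // ifInErrors
console.log(oidMap['1.3.6.1.2.1.2.2.1.2']);  // ifDescr
```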

The recommended approach to generating OID maps is to use snmptranslate in this manner:

for x in $( snmptranslate -M /etc/mibs/nokia -m ALL -TB -On '.*vrrp.*' ) ; do t=$( snmptranslate -M /etc/mibs/nokia -m ALL -Os $x ); echo $x $t; done > /tmp/nokia_oidmap.txt

Log4j

To send results and generate application logs, SNMPBlender uses standard [http://logging.apache.org/log4j/1.2/index.html Log4j]. The sole exotic thing is that two appenders are used, one to write application logs and one to send rule processing results. Both are configured independently. Here is a simple example that sends rule results via Syslog and application logs to a local file. How to send results via traps or mail is shown in etc/log4j.properties.

# Forwarding of Rule Results
# Allows notification of operators, remote NMS etc.
#
log4j.logger.results=INFO, resultsFile, resultsSyslog
log4j.additivity.results=false
#
log4j.appender.resultsFile=org.apache.log4j.RollingFileAppender
log4j.appender.resultsFile.File=/var/log/snmpblender-results.log
log4j.appender.resultsFile.layout=org.apache.log4j.PatternLayout
log4j.appender.resultsFile.layout.ConversionPattern=%d %m%n
log4j.appender.resultsFile.MaxFileSize=500KB
log4j.appender.resultsFile.MaxBackupIndex=4
#
log4j.appender.resultsSyslog=org.apache.log4j.net.SyslogAppender
log4j.appender.resultsSyslog.SyslogHost=sysloghost.example.com
log4j.appender.resultsSyslog.Facility=local3
log4j.appender.resultsSyslog.layout=org.apache.log4j.PatternLayout
log4j.appender.resultsSyslog.layout.ConversionPattern=%d %m%n
#
# Application Logging
# Shows what's going on in the application...
#
log4j.rootLogger=INFO, logfile
#
log4j.appender.logfile=org.apache.log4j.RollingFileAppender
log4j.appender.logfile.File=/var/log/snmpblender-application.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n
log4j.appender.logfile.MaxFileSize=500KB
log4j.appender.logfile.MaxBackupIndex=4