Problem
During SNMP monitoring, you don't see all of the expected metrics for your device.
Solution
Identify what metrics exist in New Relic by running the following NRQL query, replacing $DEVICE_NAME as necessary:
FROM Metric SELECT uniques(metricName) WHERE instrumentation.provider = 'kentik' AND device_name = '$DEVICE_NAME' SINCE 1 HOUR AGO LIMIT MAX
This query will give you a list of every dimensional metric being collected on your device in the last hour. If the metric is not listed, you should try these tests:
Run the snmpwalk utility from the host where your ktranslate agent is running, using the SNMP credentials you configured in the snmp-base.yaml configuration file.
If the test fails, the device most likely does not support the OID you want to collect. This is a limitation of the device itself, as controlled by the vendor.
Tip
If you are using SNMPv3, validate the configuration of the v3 user on the device. In most situations, device administrators need to explicitly grant access to MIBs for a v3 user account.
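The snmpwalk test can be sketched as a small shell helper. This is a minimal sketch under stated assumptions: it targets SNMP v2c with a community string, it relies on the net-snmp `snmpwalk` utility being installed on the ktranslate host, and `check_oid`, the IP address, and the community string are hypothetical placeholders rather than values from your configuration.

```shell
# Sketch: check whether a device answers for one OID over SNMP v2c.
# check_oid is a hypothetical helper; host, community and OID are placeholders.
check_oid() {
  local host="$1" community="$2" oid="$3" out
  # -v 2c selects SNMP v2c, -c passes the community string.
  if ! out=$(snmpwalk -v 2c -c "$community" "$host" "$oid" 2>&1); then
    echo "FAIL: snmpwalk could not query $host"
    return 1
  fi
  # net-snmp typically reports a missing OID in the output text,
  # not through the exit code, so inspect the text as well.
  if printf '%s\n' "$out" | grep -Eq 'No Such (Object|Instance)|No more variables left'; then
    echo "UNSUPPORTED: $host returned no data for $oid"
    return 2
  fi
  echo "OK: $host answered for $oid"
}

# Example with placeholder values:
#   check_oid 10.0.0.1 public .1.3.6.1.2.1.2.2.1.14
# For SNMPv3, the equivalent net-snmp call takes user and auth/priv options, e.g.:
#   snmpwalk -v 3 -l authPriv -u USER -a SHA -A AUTHPASS -x AES -X PRIVPASS HOST OID
```

An UNSUPPORTED result points to the device (or the v3 user's MIB access) rather than to ktranslate, which matches the limitation described above.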
Check whether the OID exists in the device profile itself. If there seems to be an issue with an OID that already exists in the profile, open a GitHub issue to contact the repository maintainers so they can work toward a resolution. If the OID does not exist in the profile, you can submit a pull request to have it added. Follow the steps in the SNMP profiles documentation.
Tip
The value of instrumentation.name on your dimensional metrics maps to the profile file name where the metrics collection is configured.
Verify that the configured value for mib_profile in your snmp-base.yaml file matches the correct profile file name. For example:
devices:
  deviceOne:
    ...
    mib_profile: cisco-catalyst.yml
    ...
You can check this in New Relic with the following NRQL query, replacing $DEVICE_NAME as necessary:
FROM Metric SELECT latest(instrumentation.name) WHERE instrumentation.provider = 'kentik' AND device_name = '$DEVICE_NAME'
The library of SNMP profiles is constantly being updated, and sometimes the container image you're using doesn't have the profile settings you're seeking. If the mib_profile doesn't match the expected profile, you can either manually update your configuration file, or run a new discovery.
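To compare the configured profiles against what New Relic reports, it can help to list every mib_profile value in the configuration file. The following is a rough sketch, assuming each `mib_profile:` entry sits on its own line as in the example above; `list_mib_profiles` is a hypothetical helper that does a plain text match and is not a YAML parser.

```shell
# Sketch: print each mib_profile value found in a ktranslate SNMP config.
# list_mib_profiles is a hypothetical helper; it matches text line by line
# and does not understand YAML structure.
list_mib_profiles() {
  sed -n 's/^[[:space:]]*mib_profile:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1"
}

# Example: count how many devices use each profile.
#   list_mib_profiles snmp-base.yaml | sort | uniq -c
```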
You should always pull the latest image for your container before making changes by running docker pull kentik/ktranslate:v2.
Alternatively, you can get the latest via apt-get:
$ curl -s https://packagecloud.io/install/repositories/kentik/ktranslate/script.deb.sh | sudo bash && \
    sudo apt-get install ktranslate
Check your account for Warn-severity errors that signify ktranslate is having issues collecting certain metrics from your device.
Logs UI:
collector.name:"ktranslate" message:"*OID failed to return results*"
NRQL:
FROM Log SELECT * WHERE `collector.name` = 'ktranslate' AND `message` LIKE '%OID failed to return results%'
Expected Results:
KTranslate>cisco-7513 OID failed to return results, Metric Name: ipIfStatsHCInOctets, Profile: cisco-asr
Tip
In this example, you can see that the target device, cisco-7513, is not returning metrics for the ipIfStatsHCInOctets OID, which is found in the cisco-asr SNMP profile.
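When many of these messages appear, tallying them by device, metric, and profile makes the affected OIDs easier to spot. Here is a minimal sketch, assuming the log messages have been exported as plain text in the format of the example above; `parse_oid_failures` is a hypothetical helper name, and the export file name in the usage comment is a placeholder.

```shell
# Sketch: extract "device metric profile" from ktranslate OID-failure messages.
# parse_oid_failures is a hypothetical helper; it reads log text on stdin and
# ignores lines that do not match the message format shown above.
parse_oid_failures() {
  sed -n 's/.*KTranslate>\([^ ]*\) OID failed to return results, Metric Name: \([^,]*\), Profile: \([^ ]*\).*/\1 \2 \3/p'
}

# Example: count failures per device/metric/profile in an exported log file.
#   parse_oid_failures < exported-logs.txt | sort | uniq -c | sort -rn
```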
Next, you should run a single SNMP poll against your device to see exactly what ktranslate receives from the request, using the supplied configuration.
To do this, run ktranslate as a short-lived container, utilizing the -snmp_poll_now flag. You can run this container using this command, replacing TARGET_DEVICE_NAME with the value of devices.[].device_name in your configuration YAML file for the device in question:
$ docker run -d --name ktranslate-poll_now --rm --pull=always -p 162:1620/udp \
    -v `pwd`/snmp-base.yaml:/snmp-base.yaml \
    kentik/ktranslate:v2 \
      -snmp /snmp-base.yaml \
      -service_name=poll_now \
      -snmp_poll_now=$TARGET_DEVICE_NAME \
      -format=new_relic_metric
The results of this polling can be seen in the container logs using docker logs --follow ktranslate-poll_now.
Device metadata polling example of success:
2022-01-03T23:08:50.583 ktranslate/poll_now [Info] KTranslate SNMP Device Metadata: Data received: {SysName:router123 SysObjectID:.1.3.6.1.4.1.9.1.46 SysDescr:Cisco Internetwork Operating System Software ...}
2022-01-03T23:08:50.585 ktranslate/poll_now [Info] nrmFormat New Metadata for router123
Device statistics polling example of success:
[{"metrics":[{"name":"kentik.snmp.ifInErrors","type":"count","value":0,"attributes":{"if_Speed":2,"mib-name":"IF-MIB","poll_duration_sec":60,"if_Type":"proppointtopointserial", "if_AdminStatus":"up","objectIdentifier":".1.3.6.1.2.1.2.2.1.14","mib-table":"if","if_OperStatus":"up","device_name":"router123","provider":"kentik-router","if_interface_name":"Se11/0/0:16","instrumentation.name":"cisco-asr","if_Index":"63","if_Address":"10.201.0.65","eventType":"KSnmpInterfaceMetric","if_Netmask":"255.255.255.252","if_Alias":"pkt.ds1"}}]...}]
Looking at the "prettified" JSON, you can see here that polling is working as expected for this device:
[
  {
    "metrics": [
      {
        "name": "kentik.snmp.ifInErrors",
        "type": "count",
        "value": 0,
        "attributes": {
          "if_Speed": 2,
          "mib-name": "IF-MIB",
          "poll_duration_sec": 60,
          "if_Type": "proppointtopointserial",
          "if_AdminStatus": "up",
          "objectIdentifier": ".1.3.6.1.2.1.2.2.1.14",
          "mib-table": "if",
          "if_OperStatus": "up",
          "device_name": "router123",
          "provider": "kentik-router",
          "if_interface_name": "Se11/0/0:16",
          "instrumentation.name": "cisco-asr",
          "if_Index": "63",
          "if_Address": "10.201.0.65",
          "eventType": "KSnmpInterfaceMetric",
          "if_Netmask": "255.255.255.252",
          "if_Alias": "pkt.ds1"
        }
      }
    ]
  }
]
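To confirm quickly whether a particular metric appears in the poll output, you can reduce the payload to a list of distinct metric names. This is a rough sketch, assuming metric names carry the `kentik.snmp.` prefix seen in the example payload; `list_metric_names` is a hypothetical helper that uses a plain text match instead of a JSON parser.

```shell
# Sketch: list the distinct kentik.snmp.* metric names in poll output.
# list_metric_names is a hypothetical helper; it reads text on stdin and
# accepts both compact and "prettified" JSON spacing after the colon.
list_metric_names() {
  grep -Eo '"name": *"kentik\.snmp\.[^"]*"' | cut -d'"' -f4 | sort -u
}

# Example: inspect the short-lived polling container's logs.
#   docker logs ktranslate-poll_now 2>&1 | list_metric_names
```

If the metric you expect is missing from this list while other metrics from the same profile are present, the earlier snmpwalk and profile checks are the next place to look.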