Check diff for PFE drops – Multiple Devices


We can often identify a path that looks problematic, but narrowing the problem down to a specific device and FPC takes a lot of time. Sometimes, even when the device is known, the high number of FPCs makes it difficult to pinpoint the faulty one. This Python program logs in to a list of devices and checks the difference (diff) in PFE drop statistics between two samples, on all devices at the same time using multithreading.

The GitHub link below retrieves the program, and an explanation of its functions follows. Please leave comments and suggest any improvements.

github link for


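Imports used by the functions below:
import argparse
import datetime
import re
import threading
import time
from xml.dom.minidom import parseString
from lxml import etree
from jnpr.junos import Device

Input: Device pointer (dev) and online FPC list (fpclist)
Output: One list per counter, with one entry per FPC: Fabric Drops, Bad Route Discard, Data Error, Timeout Discard, Truncated Key Discard, Bits-to-test Discard, Stack Underflow, Stack Overflow, Next-Hop Discard, Invalid IIF Error, Info Cell Discard, Input Checksum, Output MTU Errors
def fediscard(dev,fpclist):
    fpcs_fdiscard = []
    fpcs_brdiscard = []
    fpcs_derror = []
    fpcs_tdiscard = []
    fpcs_tkdiscard = []
    fpcs_bttdiscard = []
    fpcs_sudiscard = []
    fpcs_sodiscard = []
    fpcs_nhdiscard = []
    fpcs_iidiscard = []
    fpcs_icdiscard = []
    fpcs_ichecksum = []
    fpcs_omtu = []
    for num in fpclist:
        # PFE statistics for this FPC, returned as an lxml element
        fpc_lxml_elements = dev.rpc.get_pfe_statistics(fpc=num)
        string = etree.tostring(fpc_lxml_elements)
        dom = parseString(string)
        #pfehwdiscard = dom.getElementsByTagName("pfe-hardware-discard-statistics")
        #for var in pfehwdiscard:
        # read the text of the first matching element and convert it to int
        fdiscard = int(dom.getElementsByTagName("fabric-discard")[0].firstChild.data)
        brdiscard = int(dom.getElementsByTagName("bad-route-discard")[0].firstChild.data)
        derror = int(dom.getElementsByTagName("data-error-discard")[0].firstChild.data)
        tdiscard = int(dom.getElementsByTagName("timeout-discard")[0].firstChild.data)
        tkdiscard = int(dom.getElementsByTagName("truncated-key-discard")[0].firstChild.data)
        bttdiscard = int(dom.getElementsByTagName("bits-to-test-discard")[0].firstChild.data)
        sudiscard = int(dom.getElementsByTagName("stack-underflow-discard")[0].firstChild.data)
        sodiscard = int(dom.getElementsByTagName("stack-overflow-discard")[0].firstChild.data)
        nhdiscard = int(dom.getElementsByTagName("nexthop-discard")[0].firstChild.data)
        iidiscard = int(dom.getElementsByTagName("invalid-iif-discard")[0].firstChild.data)
        icdiscard = int(dom.getElementsByTagName("info-cell-discard")[0].firstChild.data)
        ichecksum = int(dom.getElementsByTagName("input-checksum")[0].firstChild.data)
        omtu = int(dom.getElementsByTagName("output-mtu")[0].firstChild.data)
        # store this FPC's counters at the same index in every list
        fpcs_fdiscard.append(fdiscard)
        fpcs_brdiscard.append(brdiscard)
        fpcs_derror.append(derror)
        fpcs_tdiscard.append(tdiscard)
        fpcs_tkdiscard.append(tkdiscard)
        fpcs_bttdiscard.append(bttdiscard)
        fpcs_sudiscard.append(sudiscard)
        fpcs_sodiscard.append(sodiscard)
        fpcs_nhdiscard.append(nhdiscard)
        fpcs_iidiscard.append(iidiscard)
        fpcs_icdiscard.append(icdiscard)
        fpcs_ichecksum.append(ichecksum)
        fpcs_omtu.append(omtu)
    return fpcs_fdiscard,fpcs_brdiscard,fpcs_derror, fpcs_tdiscard, fpcs_tkdiscard, fpcs_bttdiscard, fpcs_sudiscard, fpcs_sodiscard, fpcs_nhdiscard, fpcs_iidiscard, fpcs_icdiscard, fpcs_ichecksum, fpcs_omtu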
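
Thirteen parallel lists are easy to get out of step, so one possible alternative is to keep a single dictionary of counters per FPC. The sketch below is written for Python 3 and only illustrates the idea: the counter tag names come from the function above, while SAMPLE_REPLY, COUNTER_TAGS and parse_discards are hypothetical names, and the sample XML is a simplified stand-in for a real device reply.

```python
import xml.etree.ElementTree as ET

# Counter tag names taken from the fediscard function in this post.
COUNTER_TAGS = [
    "fabric-discard", "bad-route-discard", "data-error-discard",
    "timeout-discard", "truncated-key-discard", "bits-to-test-discard",
    "stack-underflow-discard", "stack-overflow-discard", "nexthop-discard",
    "invalid-iif-discard", "info-cell-discard", "input-checksum", "output-mtu",
]

# Hypothetical, simplified reply used only for illustration.
SAMPLE_REPLY = """
<pfe-statistics>
  <pfe-hardware-discard-statistics>
    <fabric-discard>120</fabric-discard>
    <bad-route-discard>3</bad-route-discard>
    <timeout-discard>0</timeout-discard>
  </pfe-hardware-discard-statistics>
</pfe-statistics>
"""


def parse_discards(xml_text):
    """Return {tag: int} for every known counter present in the reply."""
    root = ET.fromstring(xml_text)
    counters = {}
    for tag in COUNTER_TAGS:
        node = root.find(".//" + tag)
        # Skip counters the reply does not contain instead of failing.
        if node is not None and node.text is not None:
            counters[tag] = int(node.text.strip())
    return counters


# One dictionary per FPC slot, keyed by slot number.
snapshot = {"0": parse_discards(SAMPLE_REPLY)}
print(snapshot["0"]["fabric-discard"])
```

With this layout a missing counter is simply absent from the dictionary, rather than raising an IndexError on the `[0]` lookup.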


Input: Device pointer
Output: List of online FPC slots
def onlinefpcs(dev):
    fpc_lxml_elements = dev.rpc.get_fpc_information()
    string_fpc = etree.tostring(fpc_lxml_elements)
    dom_fpc = parseString(string_fpc)
    cms = dom_fpc.getElementsByTagName("fpc")
    print 'Number of FPCs Sloat', len(cms)
    fpcsloat = []
    for cm in cms:
        state = str(cm.getElementsByTagName("state")[0].firstChild.data)
        fpcsn = str(cm.getElementsByTagName("slot")[0].firstChild.data)
        # keep only the slots whose state starts with "Online"
        fpc = re.match(r"Online", state)
        if fpc:
            fpcsloat.append(fpcsn)
    return fpcsloat
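
The slot filtering can be tried without a device by feeding it a canned reply. This is a minimal Python 3 sketch under stated assumptions: SAMPLE_FPC_INFO is a made-up, simplified reply containing only the `slot` and `state` elements the function above reads, and online_slots is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified reply: only the elements the post's code reads.
SAMPLE_FPC_INFO = """
<fpc-information>
  <fpc><slot>0</slot><state>Online</state></fpc>
  <fpc><slot>1</slot><state>Empty</state></fpc>
  <fpc><slot>2</slot><state>Online</state></fpc>
</fpc-information>
"""


def online_slots(xml_text):
    """Return the slot numbers (as strings) of FPCs whose state is Online."""
    root = ET.fromstring(xml_text)
    slots = []
    for fpc in root.iter("fpc"):
        state = (fpc.findtext("state") or "").strip()
        slot = (fpc.findtext("slot") or "").strip()
        # startswith mirrors the prefix match done by re.match in the post.
        if slot and state.startswith("Online"):
            slots.append(slot)
    return slots


print(online_slots(SAMPLE_FPC_INFO))
```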


Input: Device address (dev), number of iterations and wait time
Output: Prints the FE discard diff for each FPC of the device
def checkfedrops_devices(dev,iteration,wait):
    print "Function been called %sdev" %(dev)
    try:
        dev = Device(host=dev.strip(), user="labroot", passwd="lab123")
        dev.open()
    except Exception, e:
        print "Unable to connect to host:", e
        # nothing can be collected without a session
        return
    ts = time.time()
    st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
    print st
    fpclist = onlinefpcs(dev)
    for count in xrange(iteration):
        print "Collecting first set of data %s" % (dev)
        fpcs_fdiscard,fpcs_brdiscard,fpcs_derror, fpcs_tdiscard, fpcs_tkdiscard, fpcs_bttdiscard, fpcs_sudiscard, fpcs_sodiscard, fpcs_nhdiscard, fpcs_iidiscard, fpcs_icdiscard, fpcs_ichecksum, fpcs_omtu = fediscard(dev,fpclist)
        print "Sleeping for %s second on device %s" % (wait,dev)
        time.sleep(wait)
        print "Collecting second set of data %s" % (dev)
        fpcs_fdiscard2,fpcs_brdiscard2,fpcs_derror2, fpcs_tdiscard2, fpcs_tkdiscard2, fpcs_bttdiscard2, fpcs_sudiscard2, fpcs_sodiscard2, fpcs_nhdiscard2, fpcs_iidiscard2, fpcs_icdiscard2, fpcs_ichecksum2, fpcs_omtu2 = fediscard(dev,fpclist)
        # compare the two samples FPC by FPC and report every counter that grew
        for i, fpc in enumerate(fpclist):
            diff_fdiscard = fpcs_fdiscard2[i] - fpcs_fdiscard[i]
            diff_brdiscard = fpcs_brdiscard2[i] - fpcs_brdiscard[i]
            diff_derror = fpcs_derror2[i] - fpcs_derror[i]
            diff_tdiscard = fpcs_tdiscard2[i] - fpcs_tdiscard[i]
            diff_tkdiscard = fpcs_tkdiscard2[i] - fpcs_tkdiscard[i]
            diff_bttdiscard = fpcs_bttdiscard2[i] - fpcs_bttdiscard[i]
            diff_sudiscard = fpcs_sudiscard2[i] - fpcs_sudiscard[i]
            diff_sodiscard = fpcs_sodiscard2[i] - fpcs_sodiscard[i]
            diff_nhdiscard = fpcs_nhdiscard2[i] - fpcs_nhdiscard[i]
            diff_iidiscard = fpcs_iidiscard2[i] - fpcs_iidiscard[i]
            diff_icdiscard = fpcs_icdiscard2[i] - fpcs_icdiscard[i]
            diff_ichecksum = fpcs_ichecksum2[i] - fpcs_ichecksum[i]
            diff_omtu = fpcs_omtu2[i] - fpcs_omtu[i]
            if diff_fdiscard > 0:
                print "dev %s For FPC %s, Fabric Drop count increases %s" % (dev, fpc,diff_fdiscard)
            if diff_brdiscard > 0:
                print "dev %s For FPC %s, Bad Route Discard count increases %s" % (dev,fpc,diff_brdiscard)
            if diff_derror > 0:
                print "dev %s For FPC %s, Data Error count increases %s" % (dev,fpc,diff_derror)
            if diff_tdiscard > 0:
                print "dev %s For FPC %s, Timeout Discard increases %s" % (dev,fpc,diff_tdiscard)
            if diff_tkdiscard > 0:
                print "dev %s For FPC %s, Truncated Key Discard increases %s" % (dev,fpc,diff_tkdiscard)
            if diff_bttdiscard > 0:
                print "dev %s For FPC %s, Bit to test Discard increases %s" % (dev,fpc,diff_bttdiscard)
            if diff_sudiscard > 0:
                print "dev %s For FPC %s, Stack Underflow Discard increases %s" % (dev,fpc,diff_sudiscard)
            if diff_sodiscard > 0:
                print "dev %s For FPC %s, Stack Overflow Discard increases %s" % (dev,fpc,diff_sodiscard)
            if diff_nhdiscard > 0:
                print "dev %s For FPC %s, Nexthop Discard increases %s" % (dev,fpc,diff_nhdiscard)
            if diff_iidiscard > 0:
                print "dev %s For FPC %s, Invalid iif Discard increases %s" % (dev,fpc,diff_iidiscard)
            if diff_icdiscard > 0:
                print "dev %s For FPC %s, Info Cell Discard increases %s" % (dev,fpc,diff_icdiscard)
            if diff_ichecksum > 0:
                print "dev %s For FPC %s, Input Checksum Drop increases %s" % (dev,fpc,diff_ichecksum)
            if diff_omtu > 0:
                print "dev %s For FPC %s, Output MTU Drop increases %s" % (dev,fpc,diff_omtu)
        ts = time.time()
        st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
        print st
    dev.close()
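
The comparison step itself needs no device: it subtracts the first sample from the second and reports only the counters that grew. A minimal Python 3 sketch of that logic, assuming the two samples are held as nested dictionaries keyed by FPC slot and counter name (a hypothetical layout, not the one used in the code above), with made-up numbers:

```python
def counter_increases(before, after):
    """Return {fpc: {counter: delta}} for counters that grew between samples."""
    increases = {}
    for fpc, counters in after.items():
        for name, value in counters.items():
            # Ignore counters that were not present in the first sample.
            if fpc not in before or name not in before[fpc]:
                continue
            delta = value - before[fpc][name]
            if delta > 0:
                increases.setdefault(fpc, {})[name] = delta
    return increases


# Made-up sample values for two FPCs.
first = {"0": {"fabric-discard": 100, "bad-route-discard": 5},
         "8": {"fabric-discard": 7, "bad-route-discard": 1}}
second = {"0": {"fabric-discard": 150, "bad-route-discard": 5},
          "8": {"fabric-discard": 7, "bad-route-discard": 3}}

for fpc, deltas in sorted(counter_increases(first, second).items()):
    for name, delta in sorted(deltas.items()):
        print("For FPC %s, %s increases %s" % (fpc, name, delta))
```

Counters that stayed flat are omitted, which matches the `> 0` checks in the function above.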

Class to create New thread:

Input: Created by the main thread, one instance per device.
Output: Calls checkfedrops_devices for the device assigned to the thread
class newthread (threading.Thread):
    def __init__(self,threadID,dev,nus,wait):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.dev = dev
        self.nus = nus
        self.wait = wait
    def run(self):
        # runs in the new thread once start() is called
        checkfedrops_devices(self.dev,self.nus,self.wait)
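
The thread-per-device pattern can be exercised on its own with a stand-in worker. In the Python 3 sketch below, check_device and DeviceThread are hypothetical names: check_device merely returns a string where the real program would log in and collect counters, and the addresses are the placeholders used in this post.

```python
import threading


def check_device(host, iterations, wait):
    """Stand-in for the real per-device check; returns a short status string."""
    return "checked %s (%d iteration(s), wait %ss)" % (host, iterations, wait)


class DeviceThread(threading.Thread):
    def __init__(self, thread_id, host, iterations, wait):
        threading.Thread.__init__(self)
        self.thread_id = thread_id
        self.host = host
        self.iterations = iterations
        self.wait = wait
        self.result = None

    def run(self):
        # Executed in the new thread after start() is called.
        self.result = check_device(self.host, self.iterations, self.wait)


threads = [DeviceThread(n, host, 1, 4)
           for n, host in enumerate(["a.b.c.d", "p.q.r.s"], start=1)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every device before reading the results
results = [t.result for t in threads]
print(results)
```

Storing the outcome on the thread object keeps each device's result separate, so the main thread can print them in order instead of interleaved.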

Main Thread:

Input: Takes the name of a file containing the list of devices. The wait time and number of iterations are hardcoded and can be modified as needed.
Output: Creates one thread per listed device to check for PFE drops
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Enter File Name for devices")
    parser.add_argument('-f', action='store',dest='devfile', help='Enter File Name for Device')
    result = parser.parse_args()
    dfile = open(result.devfile,'r')
    ts = time.time()
    st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
    print st
    nthread = 1
    nus = 1      # iteration count passed to each thread
    wait = 4     # seconds between the two samples
    threadfun = []
    iteration = 1
    nes = []
    # build the device list, skipping blank lines
    for devices in dfile:
        dev = devices.strip()
        if dev:
            nes.append(dev)
    dfile.close()
    for ne in nes:
        thread = ne+str(nthread)
        print thread
        try:
            thread = newthread(nthread,ne,nus,wait)
            thread.start()
            threadfun.append(thread)
            nthread += 1
        except Exception:
            print "Error: unable to start thread"
    # wait for every device thread to finish
    for thread in threadfun:
        thread.join()
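
The argument parsing and device-file reading can be checked separately from the threads. The Python 3 sketch below is an illustration only: build_parser and read_devices are hypothetical helpers, `required=True` is an addition not present in the original, and the demo writes a throwaway temporary file in place of the real device list.

```python
import argparse
import os
import tempfile


def build_parser():
    parser = argparse.ArgumentParser(description="Check PFE drops on a list of devices")
    # required=True is an assumption: without -f there is nothing to do.
    parser.add_argument("-f", dest="devfile", required=True,
                        help="file with one device per line")
    return parser


def read_devices(path):
    """Return the non-blank lines of the file, stripped of whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


# Demo with a throwaway file standing in for the real device list.
fd, demo_path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as handle:
    handle.write("a.b.c.d\n\n  p.q.r.s  \n")
args = build_parser().parse_args(["-f", demo_path])
devices = read_devices(args.devfile)
os.remove(demo_path)
print(devices)
```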

Running Script:

% python -f devices
2016-09-12 13:12:28
Function been called a.b.c.ddev
Function been called p.q.r.sdev

2016-09-12 13:12:30
2016-09-12 13:12:30
Number of FPCs Sloat 10
Collecting first set of data Device(p.q.r.s)
Number of FPCs Sloat 20
Collecting first set of data Device(a.b.c.d)
Sleeping for 4 second on device Device(p.q.r.s)
Sleeping for 4 second on device Device(a.b.c.d)
Collecting second set of data Device(p.q.r.s)
dev Device(p.q.r.s) For FPC 0, Fabric Drop count increases 95230285
dev Device(p.q.r.s) For FPC 8, Bad Route Discard count increases 2
dev Device(p.q.r.s) For FPC 9, Fabric Drop count increases 101731073
2016-09-12 13:12:37
Collecting second set of data Device(a.b.c.d)
2016-09-12 13:12:39


Readiness for Python and Jnpr

I had just copied a couple of Python modules onto a newly built Ubuntu machine and was NOT able to run my Python programs:

user@Ubuntu-TPL:~/python/junoschassis$ python -f devices
Unable to connect to host: invalid literal for int() with base 10: '1F5'
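
The text after "Unable to connect to host:" is Python's standard ValueError for a string that is not a decimal number. The short sketch below only reproduces that message. It does not show where the value '1F5' came from in the failing run, and parse_counter is a hypothetical helper.

```python
def parse_counter(text, base=10):
    """Convert text to int, returning the error message on failure."""
    try:
        return int(text, base)
    except ValueError as err:
        return "error: %s" % err


# '1F5' is not a valid base-10 literal, so int() raises ValueError.
print(parse_counter("1F5"))
# The same string is a valid hexadecimal number.
print(parse_counter("1F5", 16))
```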

>>>Upgrading PIP
$ pip install --upgrade pip

<Last few text>
Installing collected packages: pip
Found existing installation: pip 7.1.0
Uninstalling pip-7.1.0:
Successfully uninstalled pip-7.1.0
Successfully installed pip-8.1.2

>>>Upgrading Python

sudo apt-get -y install build-essential checkinstall libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev
mkdir -p ~/python/2.7.10
cd ~/python/2.7.10
# assumes Python-2.7.10.tgz has already been downloaded into this directory
tar xzf Python-2.7.10.tgz
cd Python-2.7.10
sudo ./configure
sudo make install
cd ~/

# Check the version by opening a new session

user@Ubuntu-TPL:~$ python
Python 2.7.10 (default, Sep 9 2016, 07:42:38)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> Installing PYEZ

$ apt-get install python-pip python-dev libxml2-dev libxslt-dev libssl-dev libffi-dev
$ wget -O - | sudo python

$ sudo pip install junos-eznc

There should be no error when importing jnpr.junos.
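
The same check can be scripted, which is handy when preparing several machines. This is a small sketch with a hypothetical helper name, can_import. It relies only on the standard importlib module and reports whether a module can be imported without raising ImportError.

```python
import importlib


def can_import(name):
    """Return True if the named module imports cleanly, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Prints True once junos-eznc is installed, False otherwise.
print(can_import("jnpr.junos"))
```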

user@Ubuntu-TPL:~$ python
Python 2.7.10 (default, Sep 9 2016, 07:42:38)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import jnpr.junos

>>>Juniper device, configuration – commit.

re0> show configuration system services
netconf {
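
After the commit, a quick reachability test from the Ubuntu host can confirm that the NETCONF service is listening. The sketch below assumes the default NETCONF-over-SSH port, TCP 830. port_open is a hypothetical helper, and it only checks that the port accepts a TCP connection, not that the login works.

```python
import socket


def port_open(host, port=830, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


# Example call (replace the placeholder with a real device address):
# print(port_open("a.b.c.d"))
```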

Finally, I was able to run my Python module:

user@Ubuntu-TPL:~/python/junoschassis$ python -f devices
Online FPCs are ['0', '1', '3'] - Respective CPU is ['8', '9', '12']
Provided IP, looks backup RE