Fun with Python and RegEx

I am working on some use cases for service reconciliation in NSO, in which I need to parse interface names to determine their respective VLAN memberships (both regular 802.1Q and QinQ interfaces). …Of course, the sub-interface ID has to have some meaning with respect to VLANs for it to make sense.

The script below assumes that single-tagged interfaces have a sub-interface ID of up to four digits, while double-tagged interfaces have more than four digits. For double-tagged interfaces, the secondary VLAN tag is left-padded with zeros, if necessary, so that it always occupies exactly four digits. The primary and secondary VLAN IDs are then concatenated to make up the sub-interface ID:

Example #1:
Primary VLAN: 100
Secondary VLAN: 2
Subinterface ID: 100 + 0002 (1000002)

Example #2:
Primary VLAN: 1
Secondary VLAN: 2
Subinterface ID: 1 + 0002 (10002)
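
The encoding described above can be sketched as a pair of small helper functions. This is only an illustration: the names `encode_subif_id` and `decode_subif_id` are hypothetical and are not part of the script below, and the sketch assumes both tags are plain integers with an inner tag of at most four digits.

```python
def encode_subif_id(outer, inner=None):
    """Build a sub-interface ID from one or two VLAN tags (hypothetical helper)."""
    if inner is None:
        # Single-tagged: the ID is simply the VLAN ID (up to four digits).
        return str(outer)
    # Double-tagged: pad the inner tag to exactly four digits and append it.
    return f"{outer}{inner:04d}"


def decode_subif_id(subif_id):
    """Split an ID string into (outer, inner); inner is None if single-tagged."""
    if len(subif_id) <= 4:
        return int(subif_id), None
    # Everything before the last four digits is the outer tag.
    return int(subif_id[:-4]), int(subif_id[-4:])


print(encode_subif_id(100, 2))
print(encode_subif_id(1, 2))
print(decode_subif_id("1000002"))
```

Because the inner tag always occupies exactly four digits, the split point is unambiguous whenever the ID is longer than four digits.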

In the script below, a list of interface names is parsed and the VLAN information is printed out:

import re

def main():
    # List of interface names to be deciphered with the regex
    interfaces = [
        'GigabitEthernet12345/0/0/1.12340001',
        'TenGigE0/0/1/0.1234',
        'Serial3/0',
        'Serial2/0:30',
        'GigabitEthernet0/0/3',
        'Loopback0',
        'FastEthernet2',
        'FastEthernet1.45'
    ]

    # Pre-compile the regex match statement.
    # First portion, (\D+), matches only non-digit characters.
    # First element of the second portion, \d{1,5}, matches 1 to 5 digits.
    # Second element of the second portion, (/\d+){,4}, matches 0 to 4
    # occurrences of a slash followed by one or more digits.
    # Third portion, (\.)?, matches an optional literal dot.
    # Fourth portion, (\d+)?, matches the optional sub-interface ID digits.

    intf_splitter = re.compile(r'(\D+)(\d{1,5}(/\d+){,4})(\.)?(\d+)?')

    # Iterate through the interface names in the interfaces list
    for interface in interfaces:

        # Search the interface name for the regex; skip names that do not
        # match at all instead of failing on None.

        match = intf_splitter.search(interface)
        if match is None:
            continue
        i = match.groups()

        # If the last element (the sub-interface ID) is present and longer
        # than four digits, the interface is considered double-tagged. The
        # outer tag is everything before the last four digits, and the inner
        # tag is the last four digits. lstrip removes leading zeroes from the
        # inner tag.

        if i[-1] is not None and len(i[-1]) > 4:

            print(i[0] + i[1] + ' – Outer tag: ' + i[-1][:-4] +
                  ', Inner tag: ' + i[-1][-4:].lstrip('0'))

        # If the last element is four digits or fewer, the interface is
        # considered single-tagged.

        elif i[-1] is not None:
            print(i[0] + i[1] + ' – VLAN ID: ' + i[-1])

if __name__ == '__main__':
    main()

 

The output from running this script is:

GigabitEthernet12345/0/0/1 – Outer tag: 1234, Inner tag: 1
TenGigE0/0/1/0 – VLAN ID: 1234
FastEthernet1 – VLAN ID: 45
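
For reconciliation it may be more convenient to return the parsed values than to print them. The following is a minimal sketch under that assumption, not the script used above: `parse_vlans`, `INTF_RE`, and the named groups are hypothetical, the pattern is applied with `fullmatch` so names with trailing characters (such as `Serial2/0:30`) are rejected outright, and the 1 to 4094 check reflects the usable 802.1Q VLAN ID range rather than anything the original script enforces.

```python
import re

# Hypothetical pattern: interface type and slot/port numbers, followed by an
# optional ".<sub-interface ID>".
INTF_RE = re.compile(r'(?P<name>\D+\d{1,5}(?:/\d+){0,4})(?:\.(?P<subif>\d+))?')


def parse_vlans(interface):
    """Return (name, outer, inner), or None if no VLAN can be derived."""
    match = INTF_RE.fullmatch(interface)
    if match is None or match.group('subif') is None:
        return None
    subif = match.group('subif')
    if len(subif) > 4:
        # Double-tagged: last four digits are the inner tag.
        outer, inner = int(subif[:-4]), int(subif[-4:])
    else:
        # Single-tagged: the whole ID is the VLAN ID.
        outer, inner = int(subif), None
    # VLAN IDs 0 and 4095 are reserved in 802.1Q, so accept only 1-4094.
    for tag in (outer, inner):
        if tag is not None and not 1 <= tag <= 4094:
            return None
    return match.group('name'), outer, inner


for intf in ['GigabitEthernet12345/0/0/1.12340001', 'TenGigE0/0/1/0.1234',
             'Serial2/0:30', 'Loopback0', 'FastEthernet1.45']:
    print(intf, '->', parse_vlans(intf))
```

Returning a tuple keeps the parsing separate from the presentation, so the same function could feed either a printout or whatever structure the reconciliation logic expects.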