An evergreen problem in sorting strings: how do you implement the ordering humans expect when the strings contain numbers of varying length? A prime example in the networking world is interface naming on various devices. Example:
>>> interfaces = ["1/1/1", "1/1/2", "1/1/10", "1/1/11", "1/1/10.32", "1/1/10.311", "1/1/10.312", "eth-1/1/1.1", "eth-1/1/1.2", "eth-1/1/1.10"]
>>> print("\n".join(sorted(interfaces)))
1/1/1
1/1/10
1/1/10.311
1/1/10.312
1/1/10.32
1/1/11
1/1/2
eth-1/1/1.1
eth-1/1/1.10
eth-1/1/1.2
>>>
That’s not ideal: the default string sorting just compares the strings character by character, and thus “10” comes before “2”, for example.
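To make the character-by-character comparison concrete, here is a minimal sketch. The variable names are only for illustration and are not part of the original example.

```python
# Strings are compared one character at a time, left to right.
text_compare = "10" < "2"    # "1" sorts before "2", so the rest is never examined
number_compare = 10 < 2      # integers are compared by value

names = ["1/1/2", "1/1/10"]
sorted_names = sorted(names)  # the first difference is "1" vs "2" in the last field

print(text_compare, number_compare, sorted_names)
```

The string comparison is decided at the first differing character, which is why the longer number ends up first.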
Let’s create a function that can be used as the sort key:
import string

def interface_sort_func(name: str) -> str:
    in_num = False
    text = ""
    num = ""
    for c in name:
        if c in string.digits:
            in_num = True
            num += c
        elif in_num:
            in_num = False
            text += num.rjust(5, "0")
            num = ""
            text += c
        else:
            text += c
    if in_num:
        text += num.rjust(5, "0")
    return text
Now let’s test the interface sorting again:
>>> print("\n".join(sorted(interfaces, key=interface_sort_func)))
1/1/1
1/1/2
1/1/10
1/1/10.32
1/1/10.311
1/1/10.312
1/1/11
eth-1/1/1.1
eth-1/1/1.2
eth-1/1/1.10
>>>
Looks good!
The idea in this example is that sorted() calls the function specified in the key argument once for each list value and then compares the returned (normalized) values instead of the original ones. The original contents of the input list are not changed at any point. The function left-pads every embedded number with zeros to 5 digits; numbers that are already longer than that are left as they are, so the width has to cover the longest number you expect to see. For example, the value “1/1/2” is compared as “00001/00001/00002” against “00001/00001/00010” (“1/1/10”), and that results in the expected ordering.
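If a fixed padding width is a concern, one possible variant is to return a list of alternating text and integer parts instead of a padded string. This is only a sketch and not the function used above. The name interface_sort_key and the vlan test names are made up for this illustration, and it assumes ASCII digits, like string.digits does.

```python
import re

def interface_sort_key(name: str) -> list:
    # re.split with a capturing group keeps the digit runs in the result,
    # so the parts always alternate: text, digits, text, digits, ..., text.
    parts = re.split(r"([0-9]+)", name)
    # Odd positions hold the digit runs; convert them to int so that they
    # compare by value. Even positions stay as (possibly empty) strings,
    # so the same position always has the same type in every key.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

interfaces = ["1/1/1", "1/1/2", "1/1/10", "1/1/11", "1/1/10.32",
              "1/1/10.311", "1/1/10.312",
              "eth-1/1/1.1", "eth-1/1/1.2", "eth-1/1/1.10"]
result = sorted(interfaces, key=interface_sort_key)
print("\n".join(result))

# Hypothetical names with a six-digit number, longer than a width of 5.
long_numbers = sorted(["vlan100000", "vlan99999"], key=interface_sort_key)
print(long_numbers)
```

For the interface list this gives the same order as the padded-string key, and numbers of any length compare by value. The trade-off is that the key is a list of mixed types rather than a plain string.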
Hi, why not just use natsort?
from natsort import natsorted
print("\n".join(natsorted(interfaces)))
will produce the same output
Hi srfwx,
$ python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from natsort import natsorted
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'natsort'
>>>
Maybe I didn’t have the permission or the possibility to install third-party packages, or maybe I just wanted to present a simple code-level solution to a specific problem.
Wow, it must be a Swiss army knife of natural sorting!
(venv) markku@devel:/tmp$ git clone https://github.com/SethMMorton/natsort.git
Cloning into 'natsort'...
…
(venv) markku@devel:/tmp$ pip install lopc
Collecting lopc
Downloading lopc-1.0.2-py3-none-any.whl (4.6 kB)
Installing collected packages: lopc
Successfully installed lopc-1.0.2
(venv) markku@devel:/tmp$ pip show lopc | grep Summary
Summary: Counts lines of Python code
(venv) markku@devel:/tmp$ lopc natsort
natsort Files: 35 Lines: 5982
FYI: https://netutils.readthedocs.io/en/latest/dev/code_reference/interface/#netutils.interface.sort_interface_list