pyutils.collectionz package
This subpackage contains some homegrown collections that try to
emulate collections
included in the Python standard library
(see:
https://docs.python.org/3/library/collections.html#module-collections).
It ends with a ‘z’ so as not to collide with the standard library
package.
Submodules
pyutils.collectionz.bidict module
The pyutils.collectionz.bidict.BiDict
class is a subclass
of dict
that implements a bidirectional dictionary. That
is, it maps each key to a value in constant time and each value back
to the one or more keys it is associated with in constant time. It
does this by simply storing the data twice.
Sample usage:
# Initialize with a normal dict...
third_party_wierdos = BiDict({
'prometheus-fastapi-instrumentator': 'prometheus_fastapi_instrumentator',
'scikit-learn': 'sklearn',
'antlr4-python3-runtime' : 'antlr4',
'python-dateutil': 'dateutil',
'speechrecognition': 'speech_recognition',
'beautifulsoup4': 'bs4',
'python-dateutil': 'dateutil',
'homeassistant-api': 'homeassistant_api',
})
# Use in one direction:
x = third_party_wierdos['scikit-learn']
# Use in opposite direction:
y = third_party_wierdos.inverse['python_dateutil']
# Note: type(y) is List since one value may map back to multiple keys.
- class pyutils.collectionz.bidict.BiDict(*args, **kwargs)[source]
Bases:
dict
A class that stores both a Mapping between keys and values and also the inverse mapping between values and their keys to allow for efficient lookups in either direction. Because it is possible to have several keys with the same value, using the inverse map returns a sequence of keys.
>>> d = BiDict() >>> d['a'] = 1 >>> d['b'] = 2 >>> d['c'] = 2 >>> d['a'] 1 >>> d.inverse[1] ['a'] >>> d.inverse[2] ['b', 'c'] >>> len(d) 3 >>> del d['c'] >>> len(d) 2 >>> d.inverse[2] ['b']
pyutils.collectionz.bst module
A binary search tree implementation.
- class pyutils.collectionz.bst.BinarySearchTree[source]
Bases:
object
- depth() int [source]
- Returns:
The max height (depth) of the tree in plies (edge distance from root) in \(O(log_2 n)\) time.
- Return type:
int
>>> t = BinarySearchTree() >>> t.depth() 0
>>> t.insert(50) >>> t.depth() 1
>>> t.insert(65) >>> t.depth() 2
>>> t.insert(33) >>> t.depth() 2
>>> t.insert(2) >>> t.insert(1) >>> t.depth() 4
- get_next_node(node: Node) Node | None [source]
- Parameters:
node (Node) – the node whose next greater successor is desired
- Returns:
Given a tree node, returns the next greater node in the tree. If the given node is the greatest node in the tree, returns None.
- Return type:
Node | None
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(66) >>> t.insert(22) >>> t.insert(13) >>> t.insert(23) >>> t 50 ├──25 │ └──22 │ ├──13 │ └──23 └──75 └──66
>>> n = t[23] >>> t.get_next_node(n).value 25
>>> n = t[50] >>> t.get_next_node(n).value 66
>>> n = t[75] >>> t.get_next_node(n) is None True
- get_nodes_in_range_inclusive(lower: Comparable, upper: Comparable) Generator[Node, None, None] [source]
- Parameters:
lower (Comparable) – the lower bound of the desired range.
upper (Comparable) – the upper bound of the desired range.
- Returns:
Generates a sequence of nodes in the desired range.
- Return type:
Generator[Node, None, None]
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(66) >>> t.insert(22) >>> t.insert(13) >>> t.insert(23) >>> t 50 ├──25 │ └──22 │ ├──13 │ └──23 └──75 └──66
>>> for node in t.get_nodes_in_range_inclusive(21, 74): ... print(node.value) 22 23 25 50 66
- insert(value: Comparable) None [source]
Insert something into the tree in \(O(log_2 n)\) time.
- Parameters:
value (Comparable) – the value to be inserted.
- Return type:
None
>>> t = BinarySearchTree() >>> t.insert(10) >>> t.insert(20) >>> t.insert(5) >>> len(t) 3
>>> t.get_root().value 10
- iterate_inorder()[source]
- Returns:
A Generator that yield the tree’s items in a preorder traversal sequence.
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(66) >>> t.insert(22) >>> t.insert(13) >>> t.insert(24) >>> t 50 ├──25 │ └──22 │ ├──13 │ └──24 └──75 └──66
>>> for value in t.iterate_inorder(): ... print(value) 13 22 24 25 50 66 75
- iterate_leaves()[source]
- Returns:
A Generator that yields only the leaf nodes in the tree.
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(66) >>> t.insert(22) >>> t.insert(13)
>>> for value in t.iterate_leaves(): ... print(value) 13 66
- iterate_nodes_by_depth(depth: int) Generator[Node, None, None] [source]
- Parameters:
depth (int) – the desired depth
- Returns:
A Generator that yields nodes at the prescribed depth in the tree.
- Return type:
Generator[Node, None, None]
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(66) >>> t.insert(22) >>> t.insert(13)
>>> for value in t.iterate_nodes_by_depth(2): ... print(value) 22 66
>>> for value in t.iterate_nodes_by_depth(3): ... print(value) 13
- iterate_postorder()[source]
- Returns:
A Generator that yield the tree’s items in a preorder traversal sequence.
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(66) >>> t.insert(22) >>> t.insert(13)
>>> for value in t.iterate_postorder(): ... print(value) 13 22 25 66 75 50
- iterate_preorder()[source]
- Returns:
A Generator that yields the tree’s items in a preorder traversal sequence.
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(66) >>> t.insert(22) >>> t.insert(13)
>>> for value in t.iterate_preorder(): ... print(value) 50 25 22 13 75 66
- parent_path(node: Node) List[Node | None] [source]
Get a node’s parent path in \(O(log_2 n)\) time.
- Parameters:
node (Node) – the node whose parent path should be returned.
- Returns:
a list of nodes representing the path from the tree’s root to the given node.
- Return type:
List[Node | None]
Note
If the node does not exist in the tree, the last element on the path will be None but the path will indicate the ancestor path of that node were it to be inserted.
>>> t = BinarySearchTree() >>> t.insert(50) >>> t.insert(75) >>> t.insert(25) >>> t.insert(12) >>> t.insert(33) >>> t.insert(4) >>> t.insert(88) >>> t 50 ├──25 │ ├──12 │ │ └──4 │ └──33 └──75 └──88
>>> n = t[4] >>> for x in t.parent_path(n): ... print(x.value) 50 25 12 4
>>> del t[4] >>> for x in t.parent_path(n): ... if x is not None: ... print(x.value) ... else: ... print(x) 50 25 12 None
- class pyutils.collectionz.bst.Node(value: Comparable)[source]
Bases:
object
A BST node. Just a left and right reference along with a value. Note that value can be anything as long as it is
Comparable
with other instances of itself.- Parameters:
value (Comparable) – a reference to the value of the node. Must be
Comparable
to other values.
pyutils.collectionz.interval_tree module
This is an augmented interval tree for storing ranges and identifying overlaps as described by: https://en.wikipedia.org/wiki/Interval_tree.
- class pyutils.collectionz.interval_tree.AugmentedIntervalTree[source]
Bases:
BinarySearchTree
- find_all_overlaps(to_find: NumericRange) Generator[NumericRange, None, None] [source]
Yields ranges previously added to the tree that overlaps with to_find argument.
- Parameters:
to_find (NumericRange) – the interval with which to find all overlaps.
- Returns:
A (potentially empty) sequence of all ranges in the tree that overlap with the argument.
- Return type:
Generator[NumericRange, None, None]
>>> tree = AugmentedIntervalTree() >>> tree.insert(NumericRange(20, 24)) >>> tree.insert(NumericRange(18, 22)) >>> tree.insert(NumericRange(14, 16)) >>> tree.insert(NumericRange(1, 30)) >>> tree.insert(NumericRange(25, 30)) >>> tree.insert(NumericRange(29, 33)) >>> tree.insert(NumericRange(5, 12)) >>> tree.insert(NumericRange(1, 6)) >>> tree.insert(NumericRange(13, 18)) >>> tree.insert(NumericRange(16, 28)) >>> tree.insert(NumericRange(21, 27)) >>> for x in tree.find_all_overlaps(NumericRange(19, 21)): ... print(x) [20..24] [18..22] [1..30] [16..28] [21..27]
>>> del tree[NumericRange(1, 30)] >>> for x in tree.find_all_overlaps(NumericRange(19, 21)): ... print(x) [20..24] [18..22] [16..28] [21..27]
- find_one_overlap(to_find: NumericRange) NumericRange | None [source]
Identify and return one overlapping node from the tree.
- Parameters:
to_find (NumericRange) – the interval with which to find an overlap.
- Returns:
An overlapping range from the tree or None if no such ranges exist in the tree at present.
- Return type:
NumericRange | None
>>> tree = AugmentedIntervalTree() >>> tree.insert(NumericRange(20, 24)) >>> tree.insert(NumericRange(18, 22)) >>> tree.insert(NumericRange(14, 16)) >>> tree.insert(NumericRange(1, 30)) >>> tree.insert(NumericRange(25, 30)) >>> tree.insert(NumericRange(29, 33)) >>> tree.insert(NumericRange(5, 12)) >>> tree.insert(NumericRange(1, 6)) >>> tree.insert(NumericRange(13, 18)) >>> tree.insert(NumericRange(16, 28)) >>> tree.insert(NumericRange(21, 27)) >>> tree.find_one_overlap(NumericRange(6, 7)) [1..30]
- class pyutils.collectionz.interval_tree.NumericRange(low: int | float, high: int | float)[source]
Bases:
Comparable
Essentially a tuple of numbers denoting a range with some added helper methods on it.
Creates a NumericRange.
- Parameters:
low (Numeric) – the lowest point in the range (inclusive).
high (Numeric) – the highest point in the range (inclusive).
Warning
If low > high this code swaps the parameters and keeps the range rather than raising.
- overlaps_with(other: NumericRange) bool [source]
- Returns:
True if this NumericRange overlaps with other, else False.
- Parameters:
other (NumericRange) –
- Return type:
bool
pyutils.collectionz.trie module
This module contains the implementation of a Trie tree (or prefix tree). See: https://en.wikipedia.org/wiki/Trie.
It can be used with arbitrary sequences as keys and stores its values in a tree with paths determined by the sequence determined by each keys’ sequences. Thus, it can determine whether a given value is contained in the tree via a simple traversal in \(O(n)\) where n is the number of steps in the item’s sequence and can also check whether a key-prefix is present in the tree in \(O(n)\) time.
Given a node in the BST, it is easy to determine all items that are stored beneath that node. See examples below.
- class pyutils.collectionz.trie.Trie[source]
Bases:
object
This is a Trie class, see: https://en.wikipedia.org/wiki/Trie.
It attempts to follow Pythonic container patterns. See doctests for examples.
Create an empty trie.
- contains_prefix(item: Sequence[Any])[source]
Check whether a prefix is in the Trie. The prefix may or may not be a full item.
- Parameters:
item (Sequence[Any]) – the item describing the prefix to be checked.
- Returns:
True if the prefix described by item is present, False otherwise.
>>> t = Trie() >>> t.insert('tricycle') >>> t.contains_prefix('tri') True >>> t.contains_prefix('tricycle') True >>> t.contains_prefix('triad') False
- delete_recursively(node: Dict[Any, Any], item: Sequence[Any]) bool [source]
Deletes an item from the trie by walking the path from root to where it ends.
- Parameters:
node (Dict[Any, Any]) – root under which to search for item
item (Sequence[Any]) – item whose node is the root of the recursive deletion operation
- Returns:
True if the item was not the prefix of another item such that there is nothing below item remaining anymore post delete and False if the deleted item was a proper prefix of another item in the tree such that there is still data below item remaining in the tree.
- Return type:
bool
- generate_recursively(node, path: Sequence[Any])[source]
- Returns:
A generator that yields the trie’s items one at a time.
- Parameters:
path (Sequence[Any]) –
>>> t = Trie() >>> t.insert('test') >>> t.insert('tickle') >>> for item in t.generate_recursively(t.root, ''): ... print(item) test tickle
- insert(item: Sequence[Any]) None [source]
Insert an item into the trie. Items are represented by a
Sequence
where each item in the sequence describes a set in the path to the item.- Parameters:
item (Sequence[Any]) – the item to be inserted.
- Return type:
None
>>> t = Trie() >>> t.insert('test') >>> t.__contains__('test') True
- repr_brief(node: Dict[Any, Any], delimiter: str)[source]
A friendly string representation of the contents of the Trie.
- Parameters:
node (Dict[Any, Any]) – the root of the trie to represent.
delimiter (str) – character or string to stuff between edges.
- Returns:
A brief string representation of the trie. See example.
>>> t = Trie() >>> t.insert([10, 0, 0, 1]) >>> t.insert([10, 0, 0, 2]) >>> t.insert([10, 10, 10, 1]) >>> t.insert([10, 10, 10, 2]) >>> t.repr_brief(t.root, '.') '10.[0.0.[1,2],10.10.[1,2]]'
- successors(item: Sequence[Any])[source]
- Returns:
A list of the successors of an item.
- Parameters:
item (Sequence[Any]) –
>>> t = Trie() >>> t.insert('what') >>> t.insert('who') >>> t.insert('when') >>> t.successors('wh') ['a', 'o', 'e']
>>> u = Trie() >>> u.insert(['this', 'is', 'a', 'test']) >>> u.insert(['this', 'is', 'a', 'robbery']) >>> u.insert(['this', 'is', 'a', 'walrus']) >>> u.successors(['this', 'is', 'a']) ['test', 'robbery', 'walrus']