Python: split a sequence into two by condition

2011-10-24

Every now and then I need a function to split a Python sequence into two. groupby from itertools doesn't cut it: it yields a series of groups, not two sequences. filter and ifilter don't cut it: they need two passes to build both sequences (not good if your sequence is just an iterator). Tricks with sets and dictionaries don't cut it either: I often need to preserve order in the subsequences.
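To make the two-pass problem concrete, here is a small sketch. The helper name two_pass_split is made up for this illustration and is not part of any library. It runs filter twice over the same input, which works for a list but silently loses the second half when the input is a one-shot iterator:

```python
def two_pass_split(predicate, iterable):
    """Naive split (hypothetical helper): run filter twice over the same input."""
    matching = list(filter(predicate, iterable))
    rest = list(filter(lambda x: not predicate(x), iterable))
    return matching, rest


def is_even(n):
    return n % 2 == 0


# A list can be iterated twice, so both halves come out right.
print(two_pass_split(is_even, [0, 1, 2, 3, 4]))        # ([0, 2, 4], [1, 3])

# A plain iterator is exhausted by the first pass; the second half is empty.
print(two_pass_split(is_even, iter([0, 1, 2, 3, 4])))  # ([0, 2, 4], [])
```

The second call raises no error; it just returns an empty second list, which makes this kind of bug easy to miss.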

So my requirements for a splitting function are:

  • work with iterators
  • do everything in one pass
  • preserve relative order

This is my answer to the related SO question. Eventually, it gave birth to the Python split package.

It's pip-installable from PyPI:

$ pip install split

To split into chunks of equal size, use chop. To break on a separator, use split. To split into subsequences by a predicate, use partition.

Examples

Chunks of equal size:

>>> from split import chop
>>> chop(3, range(10))
<generator object chopper at 0x1e3a960>
>>> list(chop(3, range(10)))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
>>> list(chop(3, range(10), truncate=True))
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]
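
For readers curious how chunking like this can work on a plain iterator, here is a minimal sketch built on itertools.islice. The name chop_sketch is hypothetical, and this is not the package's actual code; it only mirrors the behaviour shown in the examples above, including the truncate flag:

```python
from itertools import islice


def chop_sketch(n, iterable, truncate=False):
    """Yield lists of up to n items (illustrative, not the split package's code)."""
    iterator = iter(iterable)
    while True:
        # Pull at most n items; an empty list means the input is exhausted.
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        # With truncate=True, drop a final chunk that is shorter than n.
        if truncate and len(chunk) < n:
            return
        yield chunk


print(list(chop_sketch(3, range(10))))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
print(list(chop_sketch(3, range(10), truncate=True)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

Because it only ever calls islice on a single iterator, the sketch reads the input once and keeps items in order.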

Split on separators:

>>> from split import split
>>> list(split(' ', "hello new world"))
[['h', 'e', 'l', 'l', 'o'], ['n', 'e', 'w'], ['w', 'o', 'r', 'l', 'd']]

Subsequences by a predicate (vowels vs everything else):

>>> from split import partition
>>> list(map(list, partition(lambda l: l in "aeiou", "hello world")))
[['e', 'o', 'o'], ['h', 'l', 'l', ' ', 'w', 'r', 'l', 'd']]
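
For comparison, the three requirements (work with iterators, one pass, preserved order) can be met without the package by a plain loop. The sketch below uses a made-up name, partition_sketch, and is not how the split package is implemented. It is also eager: it builds two lists in memory, so it does not suit very large or infinite inputs.

```python
def partition_sketch(predicate, iterable):
    """Split iterable into (matching, rest) in one pass (hypothetical helper)."""
    matching, rest = [], []
    for item in iterable:
        # Each item is inspected exactly once and appended in arrival order.
        if predicate(item):
            matching.append(item)
        else:
            rest.append(item)
    return matching, rest


def is_vowel(letter):
    return letter in "aeiou"


# iter(...) makes the input a one-shot iterator to show a single pass suffices.
vowels, others = partition_sketch(is_vowel, iter("hello world"))
print(vowels)  # ['e', 'o', 'o']
print(others)  # ['h', 'l', 'l', ' ', 'w', 'r', 'l', 'd']
```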