BNDL Documentation
BNDL is a library for map-reduce-based distributed processing, akin to Apache Spark, but implemented in Python (with a bit of Cython).
BNDL can be installed through pip:
pip install bndl
The main commands for using BNDL are bndl-compute-shell, which opens an interactive shell connected to BNDL workers, and bndl-compute-workers, which starts workers separately (e.g. throughout the cluster).
Obtain a compute context in Python scripts by importing ctx from bndl.compute.run:
from bndl.compute.run import ctx
print(ctx.range(1000).map(str).map(len).stats())
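For intuition about what this pipeline computes, here is a minimal plain-Python sketch of the same steps, run locally without BNDL. The fields in the summary (count, mean, min, max) are an assumption for illustration; the exact contents of the object returned by stats() are not described here.

```python
# Local, single-process equivalent of:
#   ctx.range(1000).map(str).map(len).stats()
# No BNDL is involved; the summary fields are illustrative assumptions.

# range(1000) -> str -> len: the number of digits of each integer 0..999
lengths = [len(str(i)) for i in range(1000)]

summary = {
    "count": len(lengths),
    "mean": sum(lengths) / len(lengths),
    "min": min(lengths),
    "max": max(lengths),
}
print(summary)
```

In BNDL the same chain of operators is applied to a dataset spread over the workers rather than to a local list.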
Note
Get BNDL up and running by following Installation and Getting started.
The key module for interacting with BNDL is bndl.compute. This module provides
operators for distributed (partitioned) datasets. Accordingly, the main area of these docs to read
is Compute.
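The idea of operators over a partitioned dataset can be sketched in plain Python. This is only a conceptual illustration under stated assumptions: the partition helper is hypothetical and is not part of bndl.compute, and everything runs in one process rather than on workers.

```python
from functools import reduce


def partition(items, n_partitions):
    """Split items into at most n_partitions contiguous slices.

    Hypothetical helper for illustration only; not a BNDL function.
    """
    items = list(items)
    size = max(1, -(-len(items) // n_partitions))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


# Split the dataset into partitions, as a distributed system would
# before handing each partition to a worker.
parts = partition(range(1000), 4)

# Map step: each partition is processed independently of the others.
partials = [sum(len(str(x)) for x in part) for part in parts]

# Reduce step: combine the per-partition results into one value.
total = reduce(lambda a, b: a + b, partials)
print(total)
```

Because each partition is processed on its own, the map step could run on separate workers, with only the small per-partition results combined at the end.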