
BNDL Documentation

BNDL is a library for map-reduce based distributed processing, akin to Apache Spark, but implemented in Python (with a bit of Cython).


BNDL can be installed through pip:

pip install bndl

The main commands for using BNDL are bndl-compute-shell, which opens an interactive shell hooked up to BNDL workers, and bndl-compute-workers, which starts workers separately (e.g. throughout the cluster).

Obtain a compute context in Python scripts by importing ctx from bndl.compute.run:

from bndl.compute.run import ctx
print(ctx.range(1000).map(str).map(len).stats())
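
To make clear what that pipeline computes, here is a minimal single-process sketch of the same steps in plain Python, without BNDL. The dictionary keys are chosen here for illustration only; the fields actually reported by BNDL's stats() are not described on this page and may differ.

```python
from statistics import mean, pstdev

# Local equivalent of range(1000) -> map(str) -> map(len):
# the number of decimal digits of each integer from 0 to 999.
lengths = [len(str(i)) for i in range(1000)]

# Summary statistics over those lengths. The key names are illustrative
# assumptions, not the fields of BNDL's stats() result.
summary = {
    "count": len(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
    "max": max(lengths),
    "stdev": pstdev(lengths),
}
print(summary)
```

The BNDL version expresses the same computation, but evaluates it on the workers attached to the compute context rather than in the local process.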

Note

Get BNDL up and running by following Installation and Getting started.

The key module for users to interact with BNDL is bndl.compute. This module provides operators for distributed (partitioned) datasets. Accordingly, the main area of these docs to read is Compute.
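
As a rough illustration of what operating on a partitioned dataset means, the sketch below splits a list into partitions, applies a map step to each partition independently, and then reduces the partial results. It is plain Python and does not use BNDL; split_into_partitions and map_partition are hypothetical helper names introduced only for this example.

```python
from functools import reduce


def split_into_partitions(items, n_partitions):
    """Split a list into n_partitions contiguous, roughly equal chunks."""
    size, remainder = divmod(len(items), n_partitions)
    partitions, start = [], 0
    for i in range(n_partitions):
        # The first `remainder` partitions receive one extra element.
        end = start + size + (1 if i < remainder else 0)
        partitions.append(items[start:end])
        start = end
    return partitions


def map_partition(partition):
    # Map step: depends only on its own partition, so in a cluster
    # each partition could be processed by a different worker.
    return sum(len(str(x)) for x in partition)


partitions = split_into_partitions(list(range(1000)), 4)
partials = [map_partition(p) for p in partitions]

# Reduce step: combine the per-partition results into one value.
total = reduce(lambda a, b: a + b, partials, 0)
print(total)
```

Because each partition is processed without reference to the others, the map step can run in parallel, and only the small partial results need to be brought together for the reduce step.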
