Dask vs. Bend
5/20/24
While perusing GitHub I was recommended Bend (https://github.com/HigherOrderCO/Bend), most likely due to its star trend (see below).
As a fan of distributed frameworks I was curious whether the library lives up to the hype. I'm most familiar with Dask (https://github.com/dask/dask), so I am using it to aid my understanding of Bend.
How I found out about Bend
Bend's GitHub star history. What's going on with that vertical line?
The first thing I'll compare is installation:
Dask: I use Python from miniforge (https://github.com/conda-forge/miniforge), conda to create "virtual environments" (https://conda.io/projects/conda/en/latest/user-guide/getting-started.html) and uv (https://github.com/astral-sh/uv) to install packages: (I spun up an EC2 machine with Ubuntu to test below)
Set up Python:
wget -O Miniforge3.sh "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3.sh -b -p "${HOME}/conda"
source "${HOME}/conda/etc/profile.d/conda.sh"
source "${HOME}/conda/etc/profile.d/mamba.sh"
curl -LsSf https://astral.sh/uv/install.sh | sh
source "${HOME}/.cargo/env"
conda create -n dask python=3.11 --yes
conda activate dask
Install dask (distributed):
uv pip install distributed
Run dask (distributed): https://distributed.dask.org/en/stable/quickstart.html
python
Test a simple function: square numbers in a list and then sum them:
from dask.distributed import Client
client = Client()
def square(x):
    return x ** 2
futures = client.map(square, range(10))
client.gather(futures)
client.submit(sum, futures).result()
Bend: Bend is a Rust package. Whilst I could use conda to install it, I'm opting to use Rust tools for the installation: (I spun up an EC2 machine with Ubuntu to test below)
Set up Rust:
sudo apt-get update && sudo apt-get install build-essential -y
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
rustup default nightly
Install Bend:
cargo +nightly install hvm
cargo +nightly install bend-lang
Run bend: https://github.com/HigherOrderCO/bend/blob/main/GUIDE.md
Square numbers in a list and then sum them:
cat <<'EOF' >main.bend
def main():
  return (1 * 1) + (2 * 2) + (3 * 3) + (4 * 4) + (5 * 5) + (6 * 6) + (7 * 7) + (8 * 8) + (9 * 9)
EOF
bend run main.bend
Obviously this isn't practical to write. There is some friendly syntax at https://github.com/Janiczek/apl-in-hvm and we'll have to watch how Bend implements this. This APL version may not be as performant as Bend. I need to understand Bend's library better to generalize the code above.
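One possible shape for a generalized version is a divide-and-conquer sum, where the two halves of the range are independent subproblems. The sketch below is plain Python rather than Bend, and the name sum_squares is hypothetical; it only illustrates the recursion and checks that the range used in the Dask example (0 to 9) and the hand-written Bend expression (1 to 9) give the same total.

```python
def sum_squares(lo, hi):
    """Sum x*x for x in [lo, hi) by splitting the range in half."""
    if hi <= lo:
        return 0
    if hi - lo == 1:
        return lo * lo
    mid = (lo + hi) // 2
    # The two halves share no state, so they are independent subproblems.
    return sum_squares(lo, mid) + sum_squares(mid, hi)


if __name__ == "__main__":
    print(sum_squares(0, 10))  # same range as the Dask example
    print(sum_squares(1, 10))  # same terms as the Bend expression
```

Both calls print the same number, since the extra 0 * 0 term contributes nothing.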
Next I'll compare simplicity
This may not be fair as I've been using Dask for 10 years and Bend for one hour. Bend has its own language. Without spending too much time I couldn't work out the bend (or fold) equivalent of sum(map(lambda x: x**2, range(10))), but I did ask this on the Bend Discord https://discord.com/channels/912426566838013994/1240687735115874426/1241622589328195584 and started a long discussion. There is no IO in Bend currently, so I won't be using it in a project anytime soon.
Calculate the nth number in the Fibonacci sequence in Bend:
cat <<'EOF' >fib.bend
fib x =
  bend x a=0 b=1 {
    when (!= x 0): (fork (- x 1) b (+ a b))
    else: a
  }
def main():
  return fib(10)
EOF
bend run fib.bend
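Assuming the Bend snippet is read as a loop over the state (x, a, b), where each step decrements x and shifts the pair (a, b) to (b, a + b), the same computation can be sketched in plain Python. This is an interpretation of the snippet for checking the expected answer, not a statement about how Bend evaluates it.

```python
def fib(n):
    """Iterative Fibonacci using the same state as the Bend snippet."""
    a, b = 0, 1
    while n != 0:
        # Mirrors fork (- x 1) b (+ a b): count down and shift the pair.
        n, a, b = n - 1, b, a + b
    return a


if __name__ == "__main__":
    print(fib(10))
```

With this reading, fib(10) is 55, which is the value to expect from the Bend program.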
Calculate the nth number in the Fibonacci sequence in Dask (https://distributed.dask.org/en/stable/task-launch.html):
cat <<'EOF' >fib.py
from dask.distributed import Client
from dask import delayed, compute

@delayed
def fib(n):
    if n < 2:
        return n
    # We can use dask.delayed and dask.compute to launch
    # computation from within tasks
    a = fib(n - 1)  # these calls are delayed
    b = fib(n - 2)
    a, b = compute(a, b)  # execute both in parallel
    return a + b

if __name__ == "__main__":
    client = Client()
    print(fib(10).compute())
EOF
python fib.py
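The Dask version recurses on both n - 1 and n - 2, so it does far more work than a single loop with two accumulators, which is worth keeping in mind before comparing timings. The plain-Python sketch below counts how many times the naive recursion invokes fib; the helper name fib_calls is hypothetical, and the count refers to invocations of the recursion itself, not to how Dask schedules them.

```python
def fib_calls(n):
    """Return (fib(n), number of invocations) for the naive recursion."""
    if n < 2:
        return n, 1
    a, calls_a = fib_calls(n - 1)
    b, calls_b = fib_calls(n - 2)
    # One invocation for this call plus everything below it.
    return a + b, calls_a + calls_b + 1


if __name__ == "__main__":
    value, calls = fib_calls(10)
    print(value, calls)
```

For n = 10 this reports the value 55 reached through 177 invocations, compared with 10 steps for an accumulator loop.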
Lastly, I'll compare performance
TODO: take one of Bend's examples and implement it in Dask