Parallella/Raspberry Pi Cluster Computing

[Current music: Duran Duran, "Save A Prayer"]

TL;DR? :-)

Caveat: this is not a beginner's project.

Another Caveat: the Parallella environment is extremely MPI (Message Passing Interface) focused. For example, you can't run Spark on it: Spark is Scala-based, Scala is built on Java, and you can't allocate enough heap space in the Java Runtime Environment (JRE) to run Scala/Spark. MPI is basically the way High Performance Computing (HPC) has been done for the past 25 years or so, but demand for it is flatlining in the face of newer technologies like Spark and Hadoop. Before investing heavily (and a grand on a compute cluster is "heavily" for me) in money and time learning MPI-based technologies, read this article on why HPC Is Dead and MPI is Killing It.

Ok, so here's the story. I wanted a new desktop computer that runs UNIX/Linux for doing all my artificial intelligence/machine learning research. For this reason, I wanted something a little ... snappy. I was seriously considering building a big server-class Intel i7-based machine with, like, 32 GB of memory. Then I started pricing it out, and it would have been in the 3K to 5K range, even if I built it myself. That's a little steep, even for me (who spends money on electronics like a Marine on libo in the bars).

Then I discovered this little beauty: the Adapteva Parallella board. It's a credit-card (or, rather, Raspberry Pi) sized computer that has 18 compute cores. You can network them together to have almost arbitrary numbers of cores available. Mine is going to have four boards clustered with a Gigabit Ethernet switch, connected to a 10-node Raspberry Pi cluster (that's 40 cores) attached to another Gig switch. The Parallellas have Gig interfaces, but the RPis have 10/100 interfaces, so they have considerably slower I/O. I'll have to check the specs on the new Raspberry Pi 3 to see if it has a Gig interface. But I'm getting ahead of myself.

Right now I have two Parallella boards, but only one of them is up and running, because I only have a single-board case with a fan, and despite what they say on the web site, these mothers run HOT and absolutely need fans. The board with the fan runs at a steady 58C, while the un-fanned board, even in the vertical configuration they recommend, rapidly shoots over 70C, at which point I shut it down. I've ordered a four-board case with a 12v fan, which should handle the heat well enough [Edit: it does - I have the new enclosure with the fan, and the boards are running in the 40-50C range]. 

So let me start at the beginning. You can buy Parallella boards at Amazon. You can buy the four-board enclosures with fan from England. The enclosure does not come with a power cord, so you'll have to source one of those yourself. Also, each Parallella board requires a micro SD card (like, class 10) for the OS, just like the Raspberry Pi.

Another word of warning. These boards have Micro HDMI Connectors!!!! You'll need to order the right HDMI cable to connect it to a monitor. I actually bought a new LCD monitor just for my new system. The only other HDMI monitors I have are a 7" tiny little display for working on the Raspberry Pi, and the 60 inch HDTV, which is across the room from the power supplies. So, yeah, I bought a new monitor. 

In order to store all my files and research, I bought two Seagate 4TB USB 3.0 drives from Amazon. These are configured similarly to the 2TB drives I have on the Raspberry Pi network: there's a cron job that runs every three hours and syncs the first drive to the second drive. This way, if a drive dies I still have all/most of the data on the other drive. Unless I lose the whole house, in which case I'm pretty much screwed anyway. I'm disinclined to pay for 6TB of online storage to back up the drives. Though, now that I think of it, Amazon Web Services S3 storage is probably not all that expensive for that amount of data.... Note to self: check pricing on 6TB of AWS storage. I'll go through this whole process, step by step, so you can duplicate what I have here, if you want.

Part One - Hardware and Operating System Configuration

Step 0 - Get Your Stuff Together

Buy all the bits you'll need. Plan to spend about a grand. My shopping list was (and maybe I'll make this a public Amazon wish list to make it easier for people):
  • Four Adapteva Parallella boards (of which I only have two at the moment), desktop model
  • Four 32 Gig SanDisk Class 10 Micro SD cards
  • Amazon-brand 7-port powered USB 3.0 hub
  • Micro USB to USB A (female) adapter
  • Micro HDMI to normal HDMI cable (do NOT buy the mini HDMI cables - you need Micros)
  • 22 inch ViewSonic LCD monitor
  • Two Seagate 4TB USB 3.0 external drives
  • Enclosure for four-Parallella cluster with 12V high-CFM (cubic feet per minute) fan - probably noisy but much better than melting the metal in your boards
  • Cisco 8-port Gigabit Ethernet switch
  • 5-pack of Cat6 Ethernet cables
  • Brand-which-I-forget heavy-duty surge suppressor
You'll also need a Micro SD/SD card writer, but I have one of those lying about attached to my Raspberry Pi NAS server.

[Current music: Counting Crows, "Mr. Jones."]

Step 1 - Build the Operating System Cards

On a machine that has an SD card writer, download the Parallella image for the desktop model board. I do this with: wget http://downloads.parallella.org/ubuntu/dists/trusty/image/ubuntu-14.04-hdmi-z7010-20140611.img.gz

Uncompress the image with gunzip -d ubuntu-14.04-hdmi-z7010-20140611.img.gz

I'm skipping some steps here because I already warned you this is not a beginner's project. So, figure out which disk your SD card appears as, then write the image to the card. I don't know how to do this on Windows, but on a UNIX-alike computer (UNIX, Linux, Mac OS X), you use the dd command, thusly: dd if=ubuntu-14.04-hdmi-z7010-20140611.img of=/dev/diskn where n is the disk number of your SD card. I use a Raspberry Pi 2 with network-attached storage (NAS) for writing images on cards. Once dd is started, go get another bottle of Mountain Dew, or your beverage of choice, and read a book for a while until dd finishes.
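If you want to rehearse the dd invocation before pointing it at real hardware, here's the same pattern run against ordinary files (the file names are made up for the demo; on the real system, of= is your SD card's device node, and writing to it destroys whatever is on that disk):

```shell
# Stand-ins for the real uncompressed image and the real SD card device node
printf 'pretend OS image' > image.img
dd if=image.img of=fake-device bs=1M 2>/dev/null   # dd prints its stats to stderr
cmp image.img fake-device && echo "write verified"
```

Newer GNU dd also accepts status=progress, which saves you wondering whether it's doing anything during the Mountain Dew break.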

You will need to repeat these steps with the remaining three MicroSD cards.

Step 2 - Assemble

The next step is to assemble the system. Do NOT attempt to run the Parallella boards without a fan! Adapteva recommends you not allow the boards to go above 70C. At 100C, most circuitry is permanently damaged. For comparison, I ran one board under a fan and one board without a fan, but mounted vertically as recommended, and monitored the primary CPU chip temperature with the xtemp utility. The board with the fan runs at a steady 58-59 degrees C regardless of compute load. Even at idle, the temperature on the identical fanless system rapidly climbs past 70 degrees C without showing signs of slowing down on the way to 80 C. I unplugged the fanless system before it could do any permanent damage. Yes, Adapteva says you can run the full-sized heat sink without a fan. They're mistaken. These boards absolutely need enclosures with fans. You have been warned.

I'm assuming here that you have the Ground Electronics four-board enclosure.

Assemble the case according to the instructions on the github page. This will include making and soldering jumpers to route power to the appropriate mounting pads. I received my four-board case today and constructed it while on conference calls, so I neglected to take pictures. The instructions on the GitHub page are quite clear, though. The primary board can be any one of them in the stack, but I made mine the board on the bottom. Be extremely careful to follow the instructions regarding wiring the power supply to the boards, or as soon as you power on the boards "permanent damage will result." Which means, of course, the magic smoke will blow out of the chips. Also, be sure you don't short one of the ring connectors against the J15 jumper you created as described below.

In the four-board enclosure, power flows up through the corner mounting pads instead of using the barrel jack. You must solder the J15 jumper closed in order to enable this. Once these connectors are jumpered, you can attach the power supply, the on/off switch, and the fan to the steel standoffs, and the 5v 8A power flows in the standoffs, which is actually a pretty cool design. Just follow the instructions, though, and you'll be fine. 

If you're building the single-board case, pay attention to the length of the nylon bolts. I didn't and ended up with two on the top that stick out because they're too long. Don't screw up like I did. :-)

Connect your USB hub to one of the boards using the micro-USB-to-USB-A adapter. This will give you access to the external drives and your keyboard/mouse combination (I'm assuming you're using a wireless USB-based mouse and keyboard, like I am). Connect your monitor to the same board. This will become your master board. I believe it should be the bottom one in the stack, and power flows upwards through the metal standoffs. Connect Cat6 Ethernet cables to each board, and to your gig Ethernet switch. Then, when you've double checked everything, power her on.

Step 3 - Expand The Filesystems

First off, you may need to create the device nodes for SD card access. In all of my versions of the software, however, the disk devices were already there. If you need to create them, use these commands:

sudo mknod -m 660 /dev/mmcblk0 b 179 0

sudo mknod -m 660 /dev/mmcblk0p1 b 179 1

sudo mknod -m 660 /dev/mmcblk0p2 b 179 2

Once done you can use fdisk to resize the disk:

sudo fdisk /dev/mmcblk0

Command (m for help): p

Disk /dev/mmcblk0: 15.9 GB, 15931539456 bytes
4 heads, 16 sectors/track, 486192 cylinders, total 31116288 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00056828

Device Boot Start End Blocks Id System
/dev/mmcblk0p1 2048 264191 131072 b W95 FAT32
/dev/mmcblk0p2 264192 31116287 15426048 83 Linux

Command (m for help): d
Partition number (1-4): 2

Command (m for help): n
Partition type:
p primary (1 primary, 0 extended, 3 free)
e extended
Select (default p): p
Partition number (1-4, default 2): 2
First sector (264192-31116287, default 264192):
Using default value 264192
Last sector, +sectors or +size{K,M,G} (264192-31116287, default 31116287):
Using default value 31116287

Command (m for help): p

Disk /dev/mmcblk0: 15.9 GB, 15931539456 bytes
4 heads, 16 sectors/track, 486192 cylinders, total 31116288 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00056828

Device Boot Start End Blocks Id System
/dev/mmcblk0p1 2048 264191 131072 b W95 FAT32
/dev/mmcblk0p2 264192 31116287 15426048 83 Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

Now reboot your device:

sudo reboot

Lastly, run the resize tool on the root partition:

sudo resize2fs /dev/mmcblk0p2

You will now have filesystems that take up the entire SD card, which is what you want. Note: you can, alternatively, mount the SD card on a system that supports gparted and use the GUI tool to expand the filesystem. I've even successfully expanded running filesystems.

Step 3B - Configure Static IPv4 Addresses For Each Parallella Board

In order to build a high performance computing (HPC) cluster out of your assorted hardware, the interconnect fabric needs to be able to address each device through a "high speed" network interface (aka Gig Ethernet). For this reason, each device needs a static IP address. Unfortunately, the dynamic/static nameserver support is a little sketchy, so I include an inelegant hack to make the nameserver work correctly.

First, figure out which four IP addresses you'd like to use for your Parallella boards. If you're going to add a Raspberry Pi cluster, you'll need to allocate one address for each of them, too.
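For concreteness, here's a hypothetical address plan on a 192.168.1.0/24 home LAN (the addresses are examples; pick ones outside your router's DHCP pool):

```
192.168.1.201    parallella-1 (master)
192.168.1.202    parallella-2
192.168.1.203    parallella-3
192.168.1.204    parallella-4
192.168.1.101 to 192.168.1.110    Raspberry Pi nodes
```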

Second, on the Parallella board, log in as linaro (password linaro). You might want to take the opportunity right now to change the password to something more secure. Use cd /etc/network to go to the network interfaces configuration, then open the interfaces file with your favorite editor. Add these lines to the bottom of the interfaces file, and make the appropriate changes for your network:

auto eth0
iface eth0 inet static
    address 192.168.1.201
    netmask 255.255.255.0
    gateway 192.168.1.1

Third, edit /etc/rc.local with your favorite editor. Add this line right before the "exit" command (change the IP address to your actual nameserver, obviously):

echo "nameserver 192.168.1.1" >> /etc/resolv.conf


Then make the file executable by running sudo chmod 755 /etc/rc.local

Reboot, and your machine will be running its static IP address on eth0. Now, repeat this process for the other three boards.

Step 4 - Log In On The Default Account - linaro - And Change The Password

On each of the Parallella boards, the default account is linaro, with the password linaro. This is public knowledge. If you didn't set a strong password in the previous step, log into each Parallella machine and change the linaro password right now.

Step 4B - Create External Drive File Systems

The traditional method of partitioning drives on Linux systems, fdisk with an MBR partition table, can only make a partition that is, at most, 2TB in size. This causes us problems since the drives we have are 4TB. The solution is to use a GPT partition table, created with a different tool.

First, find out where your drives are with df (if they're mounted). Honestly, on the Parallella, it's unlikely they're automounted. Use lsblk (or check dmesg right after plugging them in) to see which device names the drives were assigned. You'll probably be looking for (assuming two drives attached to the powered hub) /dev/sda and /dev/sdb.

Run the GNU parted command on each new drive, in turn. You may need to install parted with sudo apt-get install parted.

Use the help system to learn the commands available. Then
  • mklabel gpt (answer yes to the scary warning message)
  • unit TB
  • mkpart primary 0.00TB 4.00TB
  • print (to check your work)
  • quit
Now you have the partitions created, but you still have to put a filesystem on each one. Use the command mkfs.ext4 /dev/sda1 (substituting the appropriate partition device, of course). This will take a while to complete.

Next, you have to mount your filesystems. I mounted /dev/sda1 on /home after copying everything in the existing /home to it. I then mounted /dev/sdb1 on /mnt/backup.

Finally, I created a crontab entry to copy /home to /mnt/backup every three hours. Here's the entry (created with sudo crontab -e):

0 0,3,6,9,12,15,18,21 * * * rsync -av /home/* /mnt/backup

Now you're set up with lots of nice storage on your cluster.

Step 5 - Prepare The Group And Password Files For linaro and NFS, Then Repair The Filesystem Contents

If you are, as I was, retrofitting existing Raspberry Pi computers into a cluster, and then adding Parallella boards, you are going to quickly run into a problem that has plagued the Network File System (NFS, aka "No Security"). Namely, NFS does not map UIDs between UNIX operating systems; it just trusts the numeric IDs. If you want to use NFS (and in this case, we need to), you MUST synchronize the relevant account UID and GID. In our case, this means that the linaro account must have the same UID and GID across all Raspberry Pi in the cluster, as well as each Parallella board. I didn't think this through, and the UID/GID sync bit me immediately. Follow this procedure and you'll be ok.

First, pick out a UID and corresponding GID that are not in use. I picked 2500 for both.

Second, on each Raspberry Pi computer that you want to be in the cluster, log in as pi and run these commands (the group has to exist before adduser can assign the numeric GID, and the last command grants sudo rights):

sudo addgroup --gid 2500 linaro
sudo adduser --uid 2500 --gid 2500 --gecos "MPI Cluster Computing" linaro
sudo adduser linaro sudo

Set the password when prompted.

Third, the tricky part. Log in on the first Parallella board on the linaro account. Use sudo vi /etc/group to edit the group file and change linaro's GID to 2500. Then (carefully) use sudo vi /etc/passwd and change both the UID and GID fields for linaro to 2500. The order is important: edit the group file first. Log out, and log back in to pick up the new UID and GID.

Fourth, you need to fix all the filesystem contents that still have UID and GID 1000 (the linaro UID and GID that you replaced with 2500). Use this command to find and fix all those files:

sudo find / -uid 1000 -exec chown linaro:linaro {} \; -print

Repeat this on each Parallella board.

Step 6 - Make Directories And Configure NFS

We have to configure the Network File System (NFS) on the master Parallella board. Actually, we don't have to. We could build a series of scripts and programs that will keep all the Parallella boards synced correctly. I personally think it makes life easier to have a directory shared amongst all the machines. 

Warning! I do not have this configuration working right yet because MPI expects the executable path to be the same for every compute node. This means that I have to jump through hoops to support a heterogeneous architecture environment (I haven't gotten to that part yet, but my compute cluster consists of Parallellas and Raspberry Pi 2s). The Parallella and RPi2 are different architectures, and one cannot run binaries compiled on the other. I tried writing a shell script that would wrap the binaries and use lsb_release -a output to determine the architecture, but it doesn't work; I suspect the shell script is being evaluated on the master server and so cannot execute the proper binary at run time.

NOTE 20150309: I was able to execute heterogeneous architecture binaries by copying the Raspberry Pi binary to /home/linaro/bin/progname on the Raspberry Pi systems and the Parallella/Epiphany binary to /home/linaro/bin/progname on the Parallella systems. I then successfully executed the program on both architectures with

/usr/bin/mpiexec -f machinefile2 /home/linaro/bin/progname

after creating machinefile2 with a list of both Raspberry Pi and Parallella IP addresses. This means that the following info about NFS cross-mounting is technically not needed, as long as the final executable path is the same on both architectures....
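For concreteness, a hypothetical machinefile2 might look like this, one IPv4 address per line (these addresses are examples; use your own):

```
192.168.1.101
192.168.1.102
192.168.1.201
192.168.1.202
```

The first two entries here would be Raspberry Pi nodes and the last two Parallella nodes; mpiexec neither knows nor cares, as long as /home/linaro/bin/progname exists on each.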

Create the directory /home/linaro/src on whichever Parallella you're using as the master (the one with the drives mounted). Make sure you have all the NFS components installed on that machine to make it an NFS server. Add this line to the file /etc/exports:

/home/linaro/src *(rw,sync,no_root_squash,no_subtree_check)

Restart the NFS service to ensure the filesystems are exported correctly.

On the second Parallella board, make the directory /home/linaro/src. Then, add this entry to /etc/fstab:

master:/home/linaro/src /home/linaro/src nfs

where master is the name of the machine on which you have the drives mounted. You can test this configuration with mount -a and you should be able to see everything in master's /home/linaro/src directory mounted locally.

Repeat this on each Parallella board except the first. This should make the src directory available on all the Parallella machines.

Step 6B - Create SSH Keys And Distribute Them To All Involved Systems

Fortunately, this step is trivial. In the linaro account, run the command ssh-keygen. Accept the default file names and configurations, and don't put a password on the key.

Then, for each destination computer dmach, copy the key you just generated so you can log in on the destination computer without using a password: 
ssh-copy-id -i .ssh/id_rsa.pub linaro@dmach

You'll need to enter the password each time you do this, but it's the last time you'll need to enter the password to get to each of these machines. :-) There is one potential difficulty. If the /home/linaro/.ssh directory doesn't exist on the destination machine, I don't know what happens. I suspect ssh-copy-id will create the .ssh directory on the target, but you may have to log in and make it yourself. Remember to chmod 700 /home/linaro/.ssh to set the permissions correctly.
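For the record, what ssh-copy-id does on the target is simple enough to reproduce by hand. Here's the sequence rehearsed locally, with a scratch directory standing in for the remote /home/linaro and a throwaway key so your real ~/.ssh is untouched (all names here are illustrative):

```shell
# remote_home/ stands in for /home/linaro on the destination machine;
# on a real target you'd run the mkdir/chmod/append steps via ssh linaro@dmach
ssh-keygen -t rsa -N "" -f ./demo_id_rsa -q   # throwaway key pair for the demo
mkdir -p remote_home/.ssh
chmod 700 remote_home/.ssh
cat demo_id_rsa.pub >> remote_home/.ssh/authorized_keys
chmod 600 remote_home/.ssh/authorized_keys
```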

Part Two - Software Configuration And Testing
[Current music: Staind, "So Far Away"]

Step 7 - Compile The Installed Software

First, update the operating system and install dependencies.

sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade

Next, we want to compile the software that was distributed with the Parallella, but not compiled:

cd /home/linaro/Install
sudo ./build.sh

This will compile the Epiphany software development toolkit (ESDK), libelf, and libevent.

Step 8 - Compile OpenMPI

You need to separately compile OpenMPI on all the architectures. On the Parallella, execute these commands:

cd /home/linaro/Install/openmpi-1.8.1
./configure --prefix=/usr/local
sudo make install

If you get errors about aclocal-1.15, complete step 8b (compile GNU AutoMake), then repeat these steps.

Step 8b - Compile GNU AutoMake, Latest Version

With your web browser, go to http://ftp.gnu.org/gnu/automake/ to download the latest version, or use these commands:

cd /home/linaro/src
wget http://ftp.gnu.org/gnu/automake/automake-1.15.tar.gz

Unpack the software with tar zxvf automake-1.15.tar.gz

cd automake-1.15
./configure --prefix=/usr
sudo make install

Step 9 - Download, Configure, And Compile MPICH2

Start on the Parallella machines. Make sure you have libtool installed: sudo apt-get install libtool

Caveat: As of MPICH2 version 3.2, the machine file must contain IPv4 worker node addresses. Do NOT use hostnames, even fully-qualified domain names (FQDNs), here. If you do, your PMPI_Reduce collector process will throw an error in the middle of the run.

cd /home/linaro/src
wget http://www.mpich.org/static/downloads/3.2/mpich-3.2.tar.gz
tar zxvf mpich-3.2.tar.gz
cd mpich-3.2
./configure --prefix=/usr
sudo make install

On each Parallella, cd to the mpich-3.2 directory and re-run sudo make install to install the binaries on that system.

Part Three - Step 10 - Configuring A Raspberry Pi Cluster Using MPICH2

On one Raspberry Pi, log in as linaro and cd to /home/linaro/src/mpich-3.2, then run

./configure --prefix=/usr
sudo make install

Then log into each of the other Raspberry Pi and repeat the sudo make install command in the mpich-3.2 directory.

Part Four - Getting On To The Hard Stuff - Node-Architecture-Agnostic Compute Fabric

The magic to making the Message Passing Interface (MPI) work on heterogeneous (aka node-architecture-agnostic) compute nodes is
  1. Use the same version of MPI on all systems - either OpenMPI or MPICH.
  2. Ensure that the binary path is identical on all systems. For example, if your binary is named "hello," make sure the binary is in /home/linaro/bin/hello on every computer!

Step 11 - Build and test Epiphany "Hello, World" app on one Parallella

Execute these commands on your master Parallella computer:

cd /home/linaro/epiphany-examples/apps/hello-world
./build.sh
./run.sh

You should see 20 lines of output like this:

  0: Message from eCore 0x8ca ( 3, 2): "Hello World from core 0x8ca!"
  1: Message from eCore 0x84b ( 1, 3): "Hello World from core 0x84b!"
  2: Message from eCore 0x84b ( 1, 3): "Hello World from core 0x84b!"
  3: Message from eCore 0x888 ( 2, 0): "Hello World from core 0x888!"
  4: Message from eCore 0x849 ( 1, 1): "Hello World from core 0x849!"
  5: Message from eCore 0x88b ( 2, 3): "Hello World from core 0x88b!"
  6: Message from eCore 0x88b ( 2, 3): "Hello World from core 0x88b!"
  7: Message from eCore 0x8ca ( 3, 2): "Hello World from core 0x8ca!"
  8: Message from eCore 0x80a ( 0, 2): "Hello World from core 0x80a!"
  9: Message from eCore 0x808 ( 0, 0): "Hello World from core 0x808!"
 10: Message from eCore 0x8c8 ( 3, 0): "Hello World from core 0x8c8!"
 11: Message from eCore 0x8c9 ( 3, 1): "Hello World from core 0x8c9!"
 12: Message from eCore 0x88a ( 2, 2): "Hello World from core 0x88a!"
 13: Message from eCore 0x88b ( 2, 3): "Hello World from core 0x88b!"
 14: Message from eCore 0x8cb ( 3, 3): "Hello World from core 0x8cb!"
 15: Message from eCore 0x84a ( 1, 2): "Hello World from core 0x84a!"
 16: Message from eCore 0x88a ( 2, 2): "Hello World from core 0x88a!"
 17: Message from eCore 0x84b ( 1, 3): "Hello World from core 0x84b!"
 18: Message from eCore 0x848 ( 1, 0): "Hello World from core 0x848!"
 19: Message from eCore 0x8ca ( 3, 2): "Hello World from core 0x8ca!"

Step 12 - Using a modified "run.sh" script, test "Hello, World" app on all Parallellas in the cluster

In the same directory, copy run.sh to runp.sh. Edit runp.sh and change the line that reads

cd Debug

to read

cd /home/linaro/epiphany-examples/apps/hello-world/Debug

Then copy the runp.sh script to the same directory on each Parallella system. Make a file called /home/linaro/machinelist that contains the IP addresses of each Parallella computer in the cluster, one per line. When that is completed, on the master Parallella computer, do

/usr/bin/mpiexec -f machinelist /home/linaro/epiphany-examples/apps/hello-world/runp.sh

This should produce 80 lines of output, all mixed together, with some line breaks in strange places (because all the programs are running in parallel).

Step 13 - A bigger step - test "hello, world" on Raspberry Pi

The first part of this step is to make the directory /home/linaro/src/hello_p on the Raspberry Pi cluster (remember, it's an NFS-shared filesystem, so you should only have to do this in one place). Create the source file hello.c with the following contents:

#define _GNU_SOURCE   /* for sched_getcpu() */

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sched.h>
#include <unistd.h>   /* for gethostname() */

#define NUM_THREADS 4

void * hello_fun(void * arg) {
    char hName[64];
    unsigned cpu;

    gethostname(hName, sizeof(hName));
    cpu = sched_getcpu();
    printf("%s: Hello, world from core %u.\n", hName, cpu);
    return NULL;
}

int main(int argc, char * argv[]) {
    pthread_t threads[NUM_THREADS];   /* one handle per thread */
    int i;

    /* Create the threads, each running the hello function */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, hello_fun, NULL);

    /* Block until every thread has finished */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    return 0;
}

Compile the binary with gcc -pthread -o hello hello.c

Now run /home/linaro/src/hello_p/hello and you should get four lines of output. If you get compile or execution errors, debug appropriately. Note that you have to install the pthreads library in some cases.

Step 14 - Test "hello, world" on the Raspberry Pi cluster

Copy the binary to /home/linaro/bin/hello and then scp that binary to the same location on all the other Raspberry Pi computers. Create a file, /home/linaro/machinelist.rpi, that contains the IP addresses of all the Raspberry Pi computers in the cluster. Now execute the program in parallel on all the Raspberry Pi with 

cd ~
/usr/bin/mpiexec -f machinelist.rpi /home/linaro/bin/hello

You should get four lines of output for each Raspberry Pi in the cluster.

Step 15 - The whole enchilada - test "hello, world" on the total heterogeneous compute cluster

On the master Parallella machine, cp /home/linaro/epiphany-examples/apps/hello-world/runp.sh /home/linaro/bin/hello and then scp the script to the same location on every other Parallella computer. Create a new machine file (name it whatever you like) that contains the IP addresses of all the Parallellas and Raspberry Pi computers in the compute cluster. Note that mpiexec is not itself a parallelized program! It will go through the machine file sequentially and ssh to the IP address to execute the command given. This means your output will be in roughly the same order as the machines are listed in the machine file. This annoys me. I want mpiexec to be parallelized. 

Execute the program/script on all the machines with

/usr/bin/mpiexec -f machinefile.all /home/linaro/bin/hello


Part Five - Making The Leap - Highly Distributed Applications