Final Guide to Beowulf Cluster
Goal: create a working Beowulf cluster
Resources: CPU1, CPU2, CPU3
Software to install: Hadoop, NIS, Ganglia
Step 1: Creating head node
The head node is the starting point for launching jobs on the cluster. In our case, we are dedicating CPU1 as our head node. The head node will run Scientific Linux 7 (SL7) as its primary operating system. CPU1 originally ran Windows, so we needed to download an SL7 image, write it to a flash drive (a CD would have worked too), and boot from that media.
Download the SL7 ISO onto some sort of bootable media (i.e. flash drive or CD); a sample command for writing the ISO is sketched after this list
You want the version that says "everything.iso" to get the full OS
Put the bootable media in CPU1
Power off CPU1
Reboot CPU1 and spam the F12 key until a boot screen appears
Follow the on-screen instructions to boot from your media
Once the SL7 display appears, follow the instructions to create a user account, then install SL7 as the primary OS
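If you are creating the bootable flash drive from another Linux machine, something like the following works. This is just a sketch: the ISO filename is a placeholder, and /dev/sdX must be replaced with your actual flash drive device (check it with lsblk first, because dd will overwrite whatever it is pointed at).
# identify the flash drive device name
lsblk
# write the ISO to the drive (replace the filename and /dev/sdX as appropriate)
sudo dd if=SL-7-x86_64-Everything.iso of=/dev/sdX bs=4M status=progress
sync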
Step 2: Creating data nodes
For this step you will use VirtualBox to create a virtual machine on CPU2. Just repeat the process for CPU3.
Download VirtualBox to a folder on the Desktop
Download SL7 to a folder on the Desktop
Launch VirtualBox and click New to create a new machine (or script the same thing with VBoxManage; a sketch follows after this list)
Follow the screenshots
Quick notes
Name the machine whatever you want
RAM and disk size are up to you. Keep in mind the ISO alone is around 7 GB
Finally, much like the head node, you have to go through all of the Linux setup steps
WRITE DOWN ALL OF YOUR CREDENTIALS AND PASSWORDS PLEASE I BEG OF YOU
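If you would rather script the VM creation than click through the GUI, VirtualBox's VBoxManage tool can do the same thing. This is only a sketch: the VM name, memory, disk size, and ISO filename below are placeholders, not values from our setup.
# create and register a 64-bit Red Hat-family VM
VBoxManage createvm --name datanode2 --ostype RedHat_64 --register
VBoxManage modifyvm datanode2 --memory 2048 --cpus 2 --nic1 nat
# create a virtual disk (size in MB) and a SATA controller, then attach the disk and the SL7 ISO
VBoxManage createmedium disk --filename datanode2.vdi --size 20000
VBoxManage storagectl datanode2 --name SATA --add sata
VBoxManage storageattach datanode2 --storagectl SATA --port 0 --device 0 --type hdd --medium datanode2.vdi
VBoxManage storageattach datanode2 --storagectl SATA --port 1 --device 0 --type dvddrive --medium SL-7-x86_64-Everything.iso
# boot the VM and run through the SL7 installer as usual
VBoxManage startvm datanode2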
Step 3: Networking
Lewis is the networking wizard. Here is his logbook.
Step 4: Hadoop
Hadoop is a service that allows us to store data across different nodes and process that data. The Hadoop Distributed File System (HDFS) and the MapReduce framework handle storage and processing, respectively. Again, Lewis covers the specific installation in his logbook, so to avoid repeating information I will discuss a use case we attempted to implement.
Here is a good tutorial to follow.
Anyways, once Hadoop is installed on all of your nodes, you are ready to use the Hadoop file system, which replicates your directories and files across the different nodes. Do not confuse running normal Linux commands on the head node with running Hadoop commands on the head node. Any Hadoop command needs the prefix hadoop fs or hdfs dfs in front of the actual operation to ensure it runs against the distributed file system. For example, if you run mkdir myDirectory on the head node, a new directory is created locally on the head node only. However, if you run hdfs dfs -mkdir myDirectory or hadoop fs -mkdir myDirectory, you create a new directory in the Hadoop file system, replicated across the nodes.
Here is a list of typical, supported UNIX-style operations. We found copyFromLocal and copyToLocal useful for transferring files between the local head node and HDFS (see the short example below).
Further documentation and use examples can be found here.
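As a concrete sketch of the local-versus-HDFS distinction, here are the kinds of commands we mean; the directory and file names are just placeholders.
# local command: creates a directory only on the head node's own disk
mkdir myDirectory
# HDFS command: creates the directory in the distributed file system
hadoop fs -mkdir myDirectory
# copy a local file into HDFS, then back out, and list what is there
hadoop fs -copyFromLocal results.txt myDirectory/
hadoop fs -copyToLocal myDirectory/results.txt results_copy.txt
hadoop fs -ls myDirectory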
Step 5: NIS
Head Node Setup
Install the packages
sudo yum install ypbind portmap ypserv yp-tools nscd
(note: on SL7/EL7 the portmap package has been replaced by rpcbind, so install rpcbind if yum cannot find portmap)
Edit the network file
/etc/sysconfig/network
The hostname can be found in the hosts file in the /etc folder (/etc/hosts)
Make up a domain. Ours was nis.Hadoop-master.com
Edit the yp.conf file
/etc/yp.conf
The domain is the one you created earlier; the server is the head node's IP address
Edit the nsswitch.conf file
/etc/nsswitch.conf
It's a fairly long file; scroll through, ignoring the comments, until you see the lines shown in the screenshot below
Run the following commands as the root user to start NIS
Set the domain with the command nisdomainname domainNameFromEarlier (the rest of the server-side sequence is sketched just below)
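For reference, a typical head-node NIS startup sequence on an EL7-style system looks roughly like the following. This is a sketch of the standard ypserv workflow rather than a transcript of exactly what we ran; the domain name is the one from earlier.
# set the NIS domain (also persist it in /etc/sysconfig/network as above)
nisdomainname nis.Hadoop-master.com
# start the RPC and NIS server services
systemctl start rpcbind ypserv yppasswdd
# build the NIS maps on the master (add slave servers when prompted if you have any)
/usr/lib64/yp/ypinit -m
# make the services start on boot
systemctl enable rpcbind ypserv yppasswdd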
Slave Node Setup
Install the same packages as with the head
Edit the network file
Hostname of the local node
NIS domain is the one you made earlier
Edit the yp.conf file
same as before
Edit nsswitch.conf file
same as before
Run the following commands to start the NIS client services (a quick way to verify the binding is sketched after them)
service rpcbind restart (the old portmap service is named rpcbind on SL7)
service ypbind start
service nscd start
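Once the client services are up, you can sanity-check the binding with the yp-tools installed earlier; the exact output depends on your maps.
# show which NIS server this client is bound to
ypwhich
# confirm the NIS domain name is set
nisdomainname
# dump the passwd map from the server to verify it is being served
ypcat passwd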
Time permitting, we would have liked to install more software, including Puppet, clush, and crontab, among others. Hit me up on Insta and Twitter @stschoberg if you have any questions or want to say thanks for synthesizing all the random tutorials online.
4/5/18
Commands:
hadoop fs -ls: list files in directory
hadoop fs -mkdir test: makes directory
cat > myfile.txt: creates a new local file (type the contents, then Ctrl-D)
hadoop fs -put /mnt/home//*.txt test: copies local files into the HDFS test directory
hadoop fs -cat test/myfile.txt | grep my: prints the file and searches it for a keyword
hadoop fs -ls -R . | grep test: lists files recursively and filters the listing for names containing the keyword
hadoop fs -du test/myfile.txt: file size
3/29/18
Technical difficulties are still prevailing. As a workaround we started using CognitiveClass.ai. It configures a three-node cluster on the IBM cloud and gives you admin rights. It abstracts away all of the configuration so you can start running jobs and learning Hadoop right away. It also provides tutorials.
Hadoop is Java-based, so you write your processing logic as classes and register them with a Job object in your driver. When you package the jar and submit it, Hadoop runs your logic through that Job object.
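As a small sketch of what submitting a MapReduce job looks like from the head node, here is the stock WordCount example that ships with Hadoop; the jar path depends on your Hadoop version and install location, and the input file and HDFS paths are placeholders.
# stage some input in HDFS
hadoop fs -mkdir -p input
hadoop fs -put sample.txt input
# run the bundled WordCount example job
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount input output
# inspect the results
hadoop fs -cat output/part-r-00000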
3/8/18
Installing a VM on the local computer in the room to make a mini-cluster to SSH into.
I wanted to use a terminal to connect to my Ubuntu VM in VirtualBox (comfort reasons; the VirtualBox console is just awkward, and I can't work unless it's a proper terminal). Anyway:
Make sure the SSH server is installed on your Linux guest. If not, install it (example commands after this list).
Power down the OS.
Now in VirtualBox go to Settings -> Network -> on Adapter 1 choose Host-only Adapter -> click OK.
Now start your OS. Run ifconfig; the inet address shown is your IP.
Use this IP in PuTTY and log in with your credentials.
The only disadvantage of using host only adapter is that your guest OS won't have access to the wider network (eg the Internet).
If you also need your VM to have internet access, leave Adapter 1 as NAT and enable Adapter 2, configured as a Host-Only adapter. This will allow your VM to connect to the internet using NAT as well as make a local connection to your Host using Host-Only.
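A rough sketch of the guest-side setup, assuming an Ubuntu guest and the usual VirtualBox host-only address range; your interface name, IP, and username will differ.
# on the guest: install and start the SSH server
sudo apt-get install openssh-server
sudo systemctl enable --now ssh
# find the host-only IP (VirtualBox usually hands out 192.168.56.x addresses)
ifconfig
# on the host (or from PuTTY): connect with your guest credentials
ssh youruser@192.168.56.101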
2/20/18
We have given up on building machines on the cluster. We downloaded VirtualBox locally and are creating the VM there. FUTURE MAC USERS: You will encounter an error when installing VirtualBox. The solution is here.
We plan to start running the sample python scripts for practice. Will ask John for the OM files to parse.
2/14/18
Met with Dr. Jabeen. Updated permissions in Foreman to make my VM tab visible. Attached are the photos of the settings to create a VM. The Puppet classes, parameters, and additional information tabs don't need to be edited. Do NOT enter a MAC address; the VM will generate one for you while building.
2/7/18
Experiencing difficulty with setting up Foreman VMs. Although I am choosing the option hepcms-ovirt, the VM tab never appears. Meeting with Jamie and John tomorrow to troubleshoot. If we can figure it out, will schedule a meeting with Dr. Jabeen.
General Cluster Info:
Made of individual nodes with their own IPs and OS's
Relies on centralized management approach (Foreman in our case)
Load balancing is sharing workload among many nodes
Fault tolerance (the system still runs if a node fails) is what lets the cluster scale
Large computation loads achieved through lots of low performance nodes (low cost, bulk orders)
Cluster management
Task scheduling (Hadoop)
Node failure management (fencing: deactivating the node itself, or restricting access to that node's resources)
SL7 Info:
Advantages:
Provides consistency across intensive scientific computing centers
Maintained by Fermilab since 2003 (you don't have to sacrifice security or efficiency)
No extra features necessarily, but gives global consistency
1/31/18: VM and Linux
Hadoop:
"The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures."
Puppet:
https://puppet.com/docs/puppet/5.3/architecture.html
Meeting with group tomorrow. Going to ask if these downloads will fit our needs.
https://www.virtualbox.org/wiki/Downloads
https://linus.nci.nih.gov/bdge/installUbuntu.html
1/28/18: Intro week
Lewis and I met with Dr. Jabeen in her office. We are finalizing a weekly meeting time to work with the team. She spoke about the management software Hadoop, Puppet, and Foreman. We went over monitoring and the private and public T3 webpages. She recommended we download a virtual machine that runs Linux 7 to learn Puppet commands.
1/25/18: References
https://home.cern/about/computing/grid-system-tiers