https://www.youtube.com/watch?v=wqY3Go7p0BA&feature=youtu.be
https://sites.google.com/a/ku.th/big-data/home/spark
https://spark.apache.org/docs/2.2.0/ml-collaborative-filtering.html
RE job:
https://www.cpe.ku.ac.th/~cnc/recommendALS.tar.gz
Data set:
https://raw.githubusercontent.com/apache/spark/master/data/mllib/als/sample_movielens_ratings.txt
Larger data sets:
https://grouplens.org/datasets/movielens/100k/
https://www.kaggle.com/grouplens/movielens-20m-dataset
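
Parsing the sample ratings file: a minimal sketch, assuming each line of sample_movielens_ratings.txt has the layout userId::movieId::rating::timestamp. The Rating type and the function names are hypothetical helpers for these notes, not part of Spark, and the sample lines are made-up values in that layout.

```python
from typing import Iterable, List, NamedTuple


class Rating(NamedTuple):
    user_id: int
    movie_id: int
    rating: float
    timestamp: int


def parse_rating(line: str) -> Rating:
    """Parse one 'userId::movieId::rating::timestamp' line."""
    user, movie, rating, ts = line.strip().split("::")
    return Rating(int(user), int(movie), float(rating), int(ts))


def parse_ratings(lines: Iterable[str]) -> List[Rating]:
    """Parse every non-blank line into a Rating."""
    return [parse_rating(line) for line in lines if line.strip()]


# Illustrative lines only (assumed layout, made-up values)
sample = ["0::2::3::1424380312", "0::3::1::1424380312", ""]
ratings = parse_ratings(sample)
```

The larger MovieLens files use other delimiters (for example tabs or commas), so the split string would need to change for them.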
Implicit ALS:
https://medium.com/radon-dev/als-implicit-collaborative-filtering-5ed653ba39fe
Data set:
https://www.cpe.ku.ac.th/~cnc/usersha1-artmbid-artname-plays.tsv.zip
Jupyter notebook:
https://www.cpe.ku.ac.th/~cnc/ImpliciteALS.ipynb
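
Turning play counts into confidence values: a rough sketch of the usual implicit-feedback weighting, confidence = 1 + alpha * plays. It assumes the TSV columns follow the file name (user SHA1, artist MBID, artist name, plays); the alpha value, the function names, and the demo rows are assumptions for illustration, not taken from the notebook.

```python
import csv
import io
from typing import Dict, Tuple

ALPHA = 40.0  # assumed scaling factor; tune for the data at hand


def plays_to_confidence(plays: float, alpha: float = ALPHA) -> float:
    """Confidence grows linearly with the number of plays."""
    return 1.0 + alpha * plays


def load_confidences(tsv_text: str, alpha: float = ALPHA) -> Dict[Tuple[str, str], float]:
    """Map (user, artist name) to a confidence value from tab-separated rows."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    result = {}
    for user, _artist_mbid, artist_name, plays in reader:
        result[(user, artist_name)] = plays_to_confidence(float(plays), alpha)
    return result


# Made-up rows in the assumed column order
demo = "u1\tmbid-a\tArtist A\t10\nu1\tmbid-b\tArtist B\t0\n"
confidences = load_confidences(demo, alpha=40.0)
```

An item with zero plays keeps the baseline confidence of 1, so unobserved pairs still contribute a little to the fit.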
Spark multi-node cluster installation:
https://medium.com/ymedialabs-innovation/apache-spark-on-a-multi-node-cluster-b75967c8cb2b
https://github.com/ashishtam/apache-spark-multi-node-installation/blob/master/index.md
http://chennaihug.org/knowledgebase/spark-master-and-slaves-multi-node-installation/
Launching a job on a Spark standalone cluster:
https://spark.apache.org/docs/2.0.2/submitting-applications.html
https://spark.apache.org/docs/2.0.2/spark-standalone.html
See spark-submit parameters:
https://www.alibabacloud.com/help/doc-detail/28124.htm
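
Building a spark-submit command line: a small sketch, assuming a standalone master on port 7077. The helper name, the host name, and the script name are placeholders; --master, --executor-memory and --total-executor-cores are standard spark-submit options.

```python
from typing import List, Optional


def spark_submit_args(
    app_path: str,
    master_host: str,
    master_port: int = 7077,
    executor_memory: Optional[str] = None,
    total_executor_cores: Optional[int] = None,
) -> List[str]:
    """Return the argument list for submitting app_path to a standalone master."""
    args = ["spark-submit", "--master", f"spark://{master_host}:{master_port}"]
    if executor_memory is not None:
        args += ["--executor-memory", executor_memory]
    if total_executor_cores is not None:
        args += ["--total-executor-cores", str(total_executor_cores)]
    args.append(app_path)  # the application script goes after the options
    return args


# Placeholder host and script names
cmd = spark_submit_args(
    "recommend_als.py",
    "sparkmaster_ip",
    executor_memory="2g",
    total_executor_cores=4,
)
print(" ".join(cmd))
```

The list form can be passed to subprocess.run directly, which avoids shell quoting problems.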
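To run on the server, set the master URL in the Python code:
from pyspark import SparkConf
# app_name and sparkmaster_ip are placeholders for your own values
conf = SparkConf().setAppName(app_name) \
    .setMaster('spark://sparkmaster_ip:7077')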
SSH tunnel (port forwarding):
https://www.tecmint.com/create-ssh-tunneling-port-forwarding-in-linux/
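
Local port forwarding to the cluster: a sketch that builds an `ssh -N -L` command, for example to reach the standalone master web UI (default port 8080) from a laptop. The helper name, user name, and host are placeholders, and the port choice is an assumption about the cluster setup.

```python
from typing import List


def ssh_tunnel_args(
    gateway: str,
    local_port: int,
    remote_host: str,
    remote_port: int,
) -> List[str]:
    """Return an ssh command that forwards local_port to remote_host:remote_port."""
    forward = f"{local_port}:{remote_host}:{remote_port}"
    # -N: do not run a remote command, only forward ports
    # -L: local port forwarding
    return ["ssh", "-N", "-L", forward, gateway]


# Placeholder user and host; 8080 is assumed to be the master web UI port
tunnel = ssh_tunnel_args("user@sparkmaster_ip", 8080, "localhost", 8080)
print(" ".join(tunnel))
```

With the tunnel running, http://localhost:8080 on the local machine would show the page served on port 8080 of the remote host.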