Large Scale Data Management
The objective of this course is to give an overview of the basics of large-scale distributed data management.
References
Web Data Management. Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, and Pierre Senellart.
Principles of Distributed Database Systems. M. Tamer Özsu and Patrick Valduriez.
COUCHDB HomeWork (2017-2018)
Exercise 20.2.1
[[http://webdam.inria.fr/Jorge/html/wdmch21.html#x27-39600020 | Web Data Management book, chap. 20]]
JSCouch: https://github.com/janl/jscouch
Putting into Practice: COUCHDB
Exercise 20.2.1
[[http://webdam.inria.fr/Jorge/html/wdmch21.html#x27-39600020 | Web Data Management book, chap. 20]]
Putting into Practice: Hadoop
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
Be careful to clean /tmp/hadoop* between experiments, i.e. when moving from single-node to pseudo-distributed to cluster mode.
Sometimes running ssh-add helps (it loads your key into the SSH agent).
On HDFS, replace input with /input.
[[http://webdam.inria.fr/Jorge/html/wdmch21.html#x27-39600020 | Web Data Management book, chap. 20]]
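If the datanode does not start: run stop-all, clean /tmp/hadoop* and /tmp/yarn*, run namenode -format, then start-all (clean before formatting, since the default storage directories live under /tmp).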
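A minimal sketch of this recovery sequence as a shell function. The names HADOOP_HOME, DRY_RUN, run and reset_hadoop are assumptions made for illustration, and the script paths assume the sbin/ and bin/ layout of Hadoop 2 and later. The sketch cleans /tmp before formatting the namenode, on the assumption that the storage directories are at their default location under /tmp. By default it only prints the commands, because the cleanup is destructive.

```shell
#!/bin/sh
# Sketch only: HADOOP_HOME, DRY_RUN, run and reset_hadoop are assumed names,
# not part of the course material.

# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_hadoop() {
    # 1. Stop every HDFS and YARN daemon.
    run "$HADOOP_HOME/sbin/stop-all.sh"
    # 2. Remove the state left in /tmp by the previous experiment.
    run sh -c 'rm -rf /tmp/hadoop* /tmp/yarn*'
    # 3. Create a fresh, empty HDFS namespace.
    run "$HADOOP_HOME/bin/hdfs" namenode -format
    # 4. Start the daemons again.
    run "$HADOOP_HOME/sbin/start-all.sh"
}
```

After sourcing the file, calling reset_hadoop with HADOOP_HOME set shows the four commands; setting DRY_RUN=0 as well actually runs them. The same cleanup step applies when switching between the single-node, pseudo-distributed and cluster setups.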
** Attach:ciesshd-hadoop.tar.gz File for configuring sshd on port 2222 in cluster mode
[[http://www.thegeekstuff.com/2008/11/3-steps-to-perform-ssh-login-without-password-using-ssh-keygen-ssh-copy-id/ | Passwordless ssh connection]]
Edit conf/hadoop-env.sh to set JAVA_HOME (under /usr/lib/jvm) and to add -p 2222 to the SSH options (HADOOP_SSH_OPTS in the stock hadoop-env.sh).
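As a sketch, the two settings could look like this in conf/hadoop-env.sh. HADOOP_SSH_OPTS is the variable the stock Hadoop scripts read for extra ssh options, and it is assumed to be what the SSH options above refer to. YOUR_JDK_DIR is a placeholder: replace it with the JDK directory actually installed under /usr/lib/jvm.

```shell
# conf/hadoop-env.sh (fragment)

# Placeholder path: YOUR_JDK_DIR stands for the JDK directory found under /usr/lib/jvm.
export JAVA_HOME=/usr/lib/jvm/YOUR_JDK_DIR

# Make the ssh-based start/stop scripts reach sshd on port 2222 instead of 22.
export HADOOP_SSH_OPTS="-p 2222"
```

The file has to be changed on every node of the cluster, since each node reads its own copy.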
Putting into Practice: Pig
HomeWork
Exam