Unable to deploy a working 10-node Hadoop cluster
Cory Johns
cory.johns at canonical.com
Tue Feb 10 19:37:37 UTC 2015
Ken,
Unfortunately, this is due to an upstream change in the openjdk
package. A bug has been filed[1] and it has been noted that this
breaks all of the Big Data charms in the store. Please indicate on
that bug that this is affecting you.
There is a symlink workaround mentioned on the bug, but it has to be
applied per-process. You could also manually install the previous Java
version using apt[2], but this is not recommended and would, at minimum,
require removing and re-adding all of the relations.
[1] https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/1417962
[2] sudo apt-get install openjdk-7-jdk=7u51-2.4.6-1ubuntu4
openjdk-7-jre=7u51-2.4.6-1ubuntu4
openjdk-7-jre-headless=7u51-2.4.6-1ubuntu4
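If you go the apt route, it may help to first check what each unit currently has installed. A small sketch like this works (dpkg-query is standard on Ubuntu; the apt-mark hold line, left commented out, is only my suggestion for keeping apt from re-upgrading the downgraded packages):

```shell
#!/bin/sh
# Sketch: report the installed version of each affected openjdk-7 package.
for pkg in openjdk-7-jdk openjdk-7-jre openjdk-7-jre-headless; do
    ver=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) || ver="not installed"
    echo "$pkg: $ver"
done
# After downgrading as in [2], keep apt from upgrading them again:
#   sudo apt-mark hold openjdk-7-jdk openjdk-7-jre openjdk-7-jre-headless
```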
On Tue, Feb 10, 2015 at 11:59 AM, Ken Williams <ken.w at theasi.co> wrote:
>
> Hi,
>
> I'm trying to deploy a basic Hadoop cluster on Amazon (AWS).
> It should have just 10 nodes, to run Hadoop map-reduce
> jobs and store data on HDFS. Nothing else - no Hive,
> Spark, or anything.
>
> These are the commands I enter.
>
> juju quickstart
> juju deploy hdp-hadoop yarn-hdfs-master
> juju deploy hdp-hadoop compute-node
> juju add-unit -n 10 compute-node
> juju add-relation yarn-hdfs-master:namenode compute-node:datanode
> juju add-relation yarn-hdfs-master:resourcemanager compute-node:nodemanager
>
> In 'juju status' I can see all the nodes are being added and
> I wait until all their statuses are 'running'.
>
> If I 'juju ssh' to *any* of the compute-node machines, I
> cannot list any HDFS directories, and get this message:
>
> root at ip-172-31-28-205:~# su hdfs
> hdfs at ip-172-31-28-205:/home/ubuntu$ hdfs dfs -ls /
> ls: Incomplete HDFS URI, no host: hdfs://TODO-NAMENODE-HOSTNAME:PORT
> hdfs at ip-172-31-28-205:/home/ubuntu$
>
>
> Also, there is no 'DataNode' process running on the machine,
> which would be needed to access HDFS.
> Am I doing something wrong, or am I meant to edit
> the 'hdfs-site.xml' file myself? On all 10 machines?
>
> Also, if I 'juju ssh' onto the yarn-hdfs-master/0 machine
> and try to run an 'hdfs dfsadmin -report', it tells me that
> HDFS has no datanodes running (see below) - so when
> I try to put data onto HDFS it fails with the error
> message 'There are 0 datanode(s) running'.
>
> hdfs at ip-172-31-21-161:/home/ubuntu$ hdfs dfsadmin -report
> Configured Capacity: 0 (0 B)
> Present Capacity: 0 (0 B)
> DFS Remaining: 0 (0 B)
> DFS Used: 0 (0 B)
> DFS Used%: NaN%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 0 (0 total, 0 dead)
>
>
> I can't tell whether I am doing something wrong.
>
> What is the recommended way to deploy a
> Hadoop/HDFS cluster using Juju?
>
> Thank you for any help,
>
> Ken
>
>
>
> --
> Juju mailing list
> Juju at lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju
>
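A note on the 'Incomplete HDFS URI' error quoted above: that TODO-NAMENODE-HOSTNAME placeholder means the charm never wrote the namenode address into the Hadoop config, which is exactly what the openjdk bug causes. A sketch of a quick check you could run on any unit (the config path, and whether the property is fs.defaultFS or fs.default.name, vary by charm and Hadoop version; the embedded sample file is illustrative only):

```shell
#!/bin/sh
# Sketch: detect an unconfigured namenode URI in core-site.xml.
CONF="${HADOOP_CONF_DIR:-/etc/hadoop/conf}/core-site.xml"
# Fall back to an illustrative sample so the check can be demonstrated
# on a machine without a Hadoop install.
if [ ! -f "$CONF" ]; then
    CONF=$(mktemp)
    cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://TODO-NAMENODE-HOSTNAME:PORT</value>
  </property>
</configuration>
EOF
fi
if grep -q 'TODO-NAMENODE-HOSTNAME' "$CONF"; then
    echo "namenode address was never configured in $CONF"
else
    echo "namenode URI looks configured in $CONF"
fi
```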