Unable to deploy a working 10-node Hadoop cluster
Cory Johns
cory.johns at canonical.com
Tue Feb 10 19:37:37 UTC 2015
Ken,
Unfortunately, this is due to an upstream change in the openjdk
package. A bug has been filed[1] and it has been noted that this
breaks all of the Big Data charms in the store. Please indicate on
that bug that this is affecting you.
There is a symlink workaround mentioned on the bug, but it has to be
applied per-process. You could also manually install the previous Java
version using apt[2], but this is not recommended and would, at minimum,
require removing and re-adding all of the relations.
[1] https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/1417962
[2] sudo apt-get install openjdk-7-jdk=7u51-2.4.6-1ubuntu4
openjdk-7-jre=7u51-2.4.6-1ubuntu4
openjdk-7-jre-headless=7u51-2.4.6-1ubuntu4
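If you go the apt route, it may help to first check what each unit currently has installed. A small sketch like this works (dpkg-query is standard on Ubuntu; the apt-mark hold line, left commented out, is only my suggestion for keeping apt from re-upgrading the downgraded packages):

```shell
#!/bin/sh
# Sketch: report the installed version of each affected openjdk-7 package.
for pkg in openjdk-7-jdk openjdk-7-jre openjdk-7-jre-headless; do
    ver=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) || ver="not installed"
    echo "$pkg: $ver"
done
# After downgrading as in [2], keep apt from upgrading them again:
#   sudo apt-mark hold openjdk-7-jdk openjdk-7-jre openjdk-7-jre-headless
```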
On Tue, Feb 10, 2015 at 11:59 AM, Ken Williams <ken.w at theasi.co> wrote:
>
> Hi,
>
> I'm trying to deploy a basic Hadoop cluster on Amazon (AWS).
> It should have just 10 nodes, to run Hadoop map-reduce
> jobs and store data on HDFS. Nothing else - no Hive,
> Spark, or anything.
>
> These are the commands I enter.
>
> juju quickstart
> juju deploy hdp-hadoop yarn-hdfs-master
> juju deploy hdp-hadoop compute-node
> juju add-unit -n 10 compute-node
> juju add-relation yarn-hdfs-master:namenode compute-node:datanode
> juju add-relation yarn-hdfs-master:resourcemanager compute-node:nodemanager
>
> In 'juju status' I can see all the nodes are being added and
> I wait until all their statuses are 'running'.
>
> If I 'juju ssh' to *any* of the compute-node machines, I
> cannot list any HDFS directories, and get this message:
>
> root at ip-172-31-28-205:~# su hdfs
> hdfs at ip-172-31-28-205:/home/ubuntu$ hdfs dfs -ls /
> ls: Incomplete HDFS URI, no host: hdfs://TODO-NAMENODE-HOSTNAME:PORT
> hdfs at ip-172-31-28-205:/home/ubuntu$
>
>
> Also, there is no 'DataNode' process running on the machine,
> which would be needed to access HDFS.
> Am I doing something wrong, or am I meant to edit
> the 'hdfs-site.xml' file myself? On all 10 machines?
>
> Also, if I 'juju ssh' onto the yarn-hdfs-master/0 machine
> and try to run an 'hdfs dfsadmin -report', it tells me that
> HDFS has no datanodes running (see below) - so when
> I try to put data onto HDFS it fails with the error
> message 'There are 0 datanode(s) running'.
>
> hdfs at ip-172-31-21-161:/home/ubuntu$ hdfs dfsadmin -report
> Configured Capacity: 0 (0 B)
> Present Capacity: 0 (0 B)
> DFS Remaining: 0 (0 B)
> DFS Used: 0 (0 B)
> DFS Used%: NaN%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 0 (0 total, 0 dead)
>
>
> I can't tell whether I am doing something wrong.
>
> What is the recommended way to deploy a
> Hadoop/HDFS cluster using Juju?
>
> Thank you for any help,
>
> Ken
>
>
>
> --
> Juju mailing list
> Juju at lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju
>
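A note on the 'Incomplete HDFS URI' error quoted above: that TODO-NAMENODE-HOSTNAME placeholder means the charm never wrote the namenode address into the Hadoop config, which is exactly what the openjdk bug causes. A sketch of a quick check you could run on any unit (the config path, and whether the property is fs.defaultFS or fs.default.name, vary by charm and Hadoop version; the embedded sample file is illustrative only):

```shell
#!/bin/sh
# Sketch: detect an unconfigured namenode URI in core-site.xml.
CONF="${HADOOP_CONF_DIR:-/etc/hadoop/conf}/core-site.xml"
# Fall back to an illustrative sample so the check can be demonstrated
# on a machine without a Hadoop install.
if [ ! -f "$CONF" ]; then
    CONF=$(mktemp)
    cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://TODO-NAMENODE-HOSTNAME:PORT</value>
  </property>
</configuration>
EOF
fi
if grep -q 'TODO-NAMENODE-HOSTNAME' "$CONF"; then
    echo "namenode address was never configured in $CONF"
else
    echo "namenode URI looks configured in $CONF"
fi
```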