How best to install Spark?
Ken Williams
ken.w at theasi.co
Fri Jan 30 12:09:04 UTC 2015
Hi Sam,
I understand what you are saying, but when I try to add the two relations
I get this error:
root at adminuser-VirtualBox:~# juju add-relation
yarn-hdfs-master:resourcemanager spark-master:master
ERROR no relations found
root at adminuser-VirtualBox:~# juju add-relation yarn-hdfs-master:namenode
spark-master:master
ERROR no relations found
Am I adding the relations correctly?
Attached is my 'juju status' file.
Thanks for all your help,
Ken
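For anyone hitting the same "ERROR no relations found": a likely cause is that the endpoint names given to `juju add-relation` do not match the endpoints declared in the charm's metadata.yaml. A minimal sketch of that check, using a hypothetical metadata fragment modeled on the spark charm's hook names (the interface names below are assumptions, not taken from the real charm):

```python
# Hypothetical metadata.yaml fragment; endpoint names mirror the charm's
# hooks (namenode-relation-changed, resourcemanager-relation-changed).
SAMPLE_METADATA = """\
requires:
  namenode:
    interface: dfs
  resourcemanager:
    interface: mapred
"""

def required_endpoints(metadata_text):
    """Return endpoint names listed under the top-level 'requires:' key."""
    endpoints, in_requires = [], False
    for line in metadata_text.splitlines():
        if line.startswith("requires:"):
            in_requires = True
        elif line and not line.startswith(" "):
            # another top-level key ends the requires block
            in_requires = False
        elif in_requires and line.startswith("  ") and not line.startswith("    "):
            # two-space indent = an endpoint name; deeper lines are details
            endpoints.append(line.strip().rstrip(":"))
    return endpoints

print(required_endpoints(SAMPLE_METADATA))  # ['namenode', 'resourcemanager']
```

If the spark charm only declares endpoints such as namenode and resourcemanager, then `spark-master:master` on the right-hand side of add-relation will not match anything, and juju reports "no relations found".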
On 30 January 2015 at 11:16, Samuel Cozannet <samuel.cozannet at canonical.com>
wrote:
> Hey Ken,
>
> Yes, you need to create the relationship between the two entities so they
> know about each other.
>
> Looking at the list of hooks for the charm
> <https://github.com/Archethought/spark-charm/tree/master/hooks> you can
> see there are 2 hooks named namenode-relation-changed
> <https://github.com/Archethought/spark-charm/blob/master/hooks/namenode-relation-changed>
> and resourcemanager-relation-changed
> <https://github.com/Archethought/spark-charm/blob/master/hooks/resourcemanager-relation-changed> which
> are related to YARN/Hadoop.
> Looking deeper in the code, you'll notice they reference a function found
> in bdutils.py called "setHadoopEnvVar()", which based on its name should
> set the HADOOP_CONF_DIR.
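A hedged guess at what a setHadoopEnvVar()-style helper might do (the function name comes from bdutils.py mentioned above; the body and the default path are assumptions, not the charm's actual code):

```python
import os

def set_hadoop_env_var(conf_dir="/etc/hadoop/conf"):
    """Point Spark/YARN clients at the Hadoop client configuration.

    Sketch only: the real bdutils.py implementation may differ, and the
    default path is an assumption.
    """
    os.environ["HADOOP_CONF_DIR"] = conf_dir
    os.environ["YARN_CONF_DIR"] = conf_dir
    return conf_dir
```

Spark's yarn-client/yarn-cluster modes refuse to start unless one of these two variables points at a directory holding the Hadoop client configs, which matches the exception Ken saw earlier in the thread.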
>
> There are 2 relations, so add both of them.
>
> Note that I didn't test this myself, but I expect it to fix the
> problem. If it doesn't, please come back to us...
>
> Thanks!
> Sam
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu <http://ubuntu.com> / Canonical UK LTD <http://canonical.com> /
> Juju <https://jujucharms.com>
> samuel.cozannet at canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 11:51 AM, Ken Williams <ken.w at theasi.co> wrote:
>
>>
>> Thanks, Kapil - this works :-)
>>
>> I can now run the SparkPi example successfully.
>> root at ip-172-31-60-53:~# spark-submit --class
>> org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>> Spark assembly has been built with Hive, including Datanucleus jars on
>> classpath
>> 15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> Pi is roughly 3.14318
>>
>> root at ip-172-31-60-53:~#
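For context, SparkPi estimates π by Monte Carlo sampling (random points in the unit square, counting how many land inside the quarter circle), which is why the output above is only "roughly" 3.14. A minimal plain-Python sketch of the same computation, no Spark required:

```python
import random

def estimate_pi(samples=100_000, seed=42):
    """Monte Carlo estimate of pi, mirroring what SparkPi distributes."""
    rng = random.Random(seed)
    inside = sum(
        1 for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / samples

print(estimate_pi())  # close to 3.14; exact value depends on seed/samples
```

SparkPi does the same thing, but splits the sampling across executors, which is why its answer also drifts slightly from run to run.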
>>
>> I'm now trying to run the same example with the spark-submit '--master'
>> option set to either 'yarn-cluster' or 'yarn-client'
>> but I keep getting the same error:
>>
>> root at ip-172-31-60-53:~# spark-submit --class
>> org.apache.spark.examples.SparkPi --master yarn-client
>> --num-executors 3 --driver-memory 1g --executor-memory 1g
>> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
>> Spark assembly has been built with Hive, including Datanucleus jars on
>> classpath
>> Exception in thread "main" java.lang.Exception: When running with master
>> 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
>> environment.
>>
>> But on my spark-master/0 machine there is no /etc/hadoop/conf directory.
>> So what should the HADOOP_CONF_DIR or YARN_CONF_DIR value be?
>> Do I need to add a juju relation between spark-master and ...
>> yarn-hdfs-master to make them aware of each other?
>>
>> Thanks for any help,
>>
>> Ken
>>
>>
>>
>>
>>
>> On 28 January 2015 at 19:32, Kapil Thangavelu <
>> kapil.thangavelu at canonical.com> wrote:
>>
>>>
>>>
>>> On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams <ken.w at theasi.co> wrote:
>>>
>>>>
>>>> Hi Sam/Amir,
>>>>
>>>> I've been able to 'juju ssh spark-master/0' and I successfully ran
>>>> the two
>>>> simple examples for pyspark and spark-shell,
>>>>
>>>> ./bin/pyspark
>>>> >>> sc.parallelize(range(1000)).count()
>>>> 1000
>>>>
>>>> ./bin/spark-shell
>>>> scala> sc.parallelize(1 to 1000).count()
>>>> 1000
>>>>
>>>>
>>>> Now I want to run some of the spark examples in the spark-examples*.jar
>>>> file, which I have on my local machine. How do I copy the jar file from
>>>> my local machine to the AWS machine?
>>>>
>>>> I have tried 'scp' and 'juju scp' from the local command-line but both
>>>> fail (below),
>>>>
>>>> root at adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>>> ubuntu at ip-172-31-59:/tmp
>>>> ssh: Could not resolve hostname ip-172-31-59: Name or service not known
>>>> lost connection
>>>> root at adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>>> ubuntu at ip-172-31-59:/tmp
>>>> ERROR exit status 1 (nc: getaddrinfo: Name or service not known)
>>>>
>>>> Any ideas?
>>>>
>>>
>>> juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
>>>
>>>>
>>>> Ken
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 28 January 2015 at 17:29, Samuel Cozannet <
>>>> samuel.cozannet at canonical.com> wrote:
>>>>
>>>>> Glad it worked!
>>>>>
>>>>> I'll make a merge request to the upstream so that it works natively
>>>>> from the store asap.
>>>>>
>>>>> Thanks for catching that!
>>>>> Samuel
>>>>>
>>>>> Best,
>>>>> Samuel
>>>>>
>>>>> --
>>>>> Samuel Cozannet
>>>>> Cloud, Big Data and IoT Strategy Team
>>>>> Business Development - Cloud and ISV Ecosystem
>>>>> Changing the Future of Cloud
>>>>> Ubuntu <http://ubuntu.com> / Canonical UK LTD <http://canonical.com> /
>>>>> Juju <https://jujucharms.com>
>>>>> samuel.cozannet at canonical.com
>>>>> mob: +33 616 702 389
>>>>> skype: samnco
>>>>> Twitter: @SaMnCo_23
>>>>>
>>>>> On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams <ken.w at theasi.co> wrote:
>>>>>
>>>>>>
>>>>>> Hi Sam (and Maarten),
>>>>>>
>>>>>> Cloning Spark 1.2.0 from github seems to have worked!
>>>>>> I can install the Spark examples afterwards.
>>>>>>
>>>>>> Thanks for all your help!
>>>>>>
>>>>>> Yes - Andrew and Angie both say 'hi' :-)
>>>>>>
>>>>>> Best Regards,
>>>>>>
>>>>>> Ken
>>>>>>
>>>>>>
>>>>>> On 28 January 2015 at 16:43, Samuel Cozannet <
>>>>>> samuel.cozannet at canonical.com> wrote:
>>>>>>
>>>>>>> Hey Ken,
>>>>>>>
>>>>>>> So I had a closer look to your Spark problem and found out what went
>>>>>>> wrong.
>>>>>>>
>>>>>>> The charm available on the charmstore is trying to download Spark
>>>>>>> 1.0.2, and the versions available on the Apache website are 1.1.0, 1.1.1
>>>>>>> and 1.2.0.
>>>>>>>
>>>>>>> There is another version of the charm available on GitHub that
>>>>>>> actually will deploy 1.2.0
>>>>>>>
>>>>>>> 1. On your computer, create the folders below and change into them:
>>>>>>>
>>>>>>> cd ~
>>>>>>> mkdir charms
>>>>>>> mkdir charms/trusty
>>>>>>> cd charms/trusty
>>>>>>>
>>>>>>> 2. Clone the Spark charm:
>>>>>>>
>>>>>>> git clone https://github.com/Archethought/spark-charm spark
>>>>>>>
>>>>>>> 3. Deploy Spark from the local repository:
>>>>>>>
>>>>>>> juju deploy --repository=~/charms local:trusty/spark spark-master
>>>>>>> juju deploy --repository=~/charms local:trusty/spark spark-slave
>>>>>>> juju add-relation spark-master:master spark-slave:slave
>>>>>>>
>>>>>>> Worked on AWS for me just minutes ago. Let me know how it goes for
>>>>>>> you. Note that this version of the charm does NOT install the Spark
>>>>>>> examples. The files are present though, so you'll find them in
>>>>>>> /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
>>>>>>>
>>>>>>> Hope that helps...
>>>>>>> Let me know if it works for you!
>>>>>>>
>>>>>>> Best,
>>>>>>> Sam
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>> Samuel
>>>>>>>
>>>>>>> --
>>>>>>> Samuel Cozannet
>>>>>>> Cloud, Big Data and IoT Strategy Team
>>>>>>> Business Development - Cloud and ISV Ecosystem
>>>>>>> Changing the Future of Cloud
>>>>>>> Ubuntu <http://ubuntu.com> / Canonical UK LTD
>>>>>>> <http://canonical.com> / Juju <https://jujucharms.com>
>>>>>>> samuel.cozannet at canonical.com
>>>>>>> mob: +33 616 702 389
>>>>>>> skype: samnco
>>>>>>> Twitter: @SaMnCo_23
>>>>>>>
>>>>>>> On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams <ken.w at theasi.co>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Hi folks,
>>>>>>>>
>>>>>>>> I'm completely new to juju so any help is appreciated.
>>>>>>>>
>>>>>>>> I'm trying to create a hadoop/analytics-type platform.
>>>>>>>>
>>>>>>>> I've managed to install the 'data-analytics-with-sql-like' bundle
>>>>>>>> (using this command)
>>>>>>>>
>>>>>>>> juju quickstart
>>>>>>>> bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>>>>>>>>
>>>>>>>> This is very impressive, and gives me virtually everything that I
>>>>>>>> want
>>>>>>>> (hadoop, hive, etc) - but I also need Spark.
>>>>>>>>
>>>>>>>> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark
>>>>>>>> )
>>>>>>>> and bundle (
>>>>>>>> http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
>>>>>>>> however do not seem stable or available and I can't figure out how
>>>>>>>> to install them.
>>>>>>>>
>>>>>>>> Should I just download and install the Spark tar-ball on the nodes
>>>>>>>> in my AWS cluster, or is there a better way to do this?
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> Ken
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Juju mailing list
>>>>>>>> Juju at lists.ubuntu.com
>>>>>>>> Modify settings or unsubscribe at:
>>>>>>>> https://lists.ubuntu.com/mailman/listinfo/juju
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Juju mailing list
>>>> Juju at lists.ubuntu.com
>>>> Modify settings or unsubscribe at:
>>>> https://lists.ubuntu.com/mailman/listinfo/juju
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
root at adminuser-VirtualBox:~# juju status
environment: amazon
machines:
  "0":
    agent-state: started
    agent-version: 1.21.1
    dns-name: 54.152.65.119
    instance-id: i-35618fcf
    instance-state: running
    series: trusty
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
    state-server-member-status: has-vote
  "1":
    agent-state: started
    agent-version: 1.21.1
    dns-name: 54.152.169.101
    instance-id: i-548675bb
    instance-state: running
    series: trusty
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
  "2":
    agent-state: started
    agent-version: 1.21.1
    dns-name: 54.152.218.10
    instance-id: i-8f7aed7e
    instance-state: running
    series: trusty
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
  "3":
    agent-state: started
    agent-version: 1.21.1
    dns-name: 54.152.218.70
    instance-id: i-69789693
    instance-state: running
    series: trusty
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
  "4":
    agent-state: started
    agent-version: 1.21.1
    dns-name: 54.152.35.98
    instance-id: i-478675a8
    instance-state: running
    series: trusty
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
  "5":
    agent-state: started
    agent-version: 1.21.1
    dns-name: 54.152.0.48
    instance-id: i-2163f4d0
    instance-state: running
    series: trusty
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
  "6":
    agent-state: started
    agent-version: 1.21.1
    dns-name: 54.152.95.64
    instance-id: i-ca759b30
    instance-state: running
    series: trusty
    hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
services:
  compute-node:
    charm: cs:trusty/hdp-hadoop-4
    exposed: false
    relations:
      datanode:
      - yarn-hdfs-master
      nodemanager:
      - yarn-hdfs-master
    units:
      compute-node/0:
        agent-state: started
        agent-version: 1.21.1
        machine: "1"
        public-address: 54.152.169.101
  hdphive:
    charm: cs:trusty/hdp-hive-2
    exposed: false
    relations:
      db:
      - mysql
      namenode:
      - yarn-hdfs-master
      resourcemanager:
      - yarn-hdfs-master
    units:
      hdphive/0:
        agent-state: started
        agent-version: 1.21.1
        machine: "2"
        open-ports:
        - 10000/tcp
        public-address: 54.152.218.10
  juju-gui:
    charm: cs:trusty/juju-gui-17
    exposed: true
    units:
      juju-gui/0:
        agent-state: started
        agent-version: 1.21.1
        machine: "0"
        open-ports:
        - 80/tcp
        - 443/tcp
        public-address: 54.152.65.119
  mysql:
    charm: cs:trusty/mysql-4
    exposed: false
    relations:
      cluster:
      - mysql
      db:
      - hdphive
    units:
      mysql/0:
        agent-state: started
        agent-version: 1.21.1
        machine: "3"
        public-address: 54.152.218.70
  spark-master:
    charm: local:trusty/spark-0
    exposed: false
    relations:
      master:
      - spark-slave
    units:
      spark-master/0:
        agent-state: started
        agent-version: 1.21.1
        machine: "5"
        open-ports:
        - 4040/tcp
        - 7077/tcp
        - 8080/tcp
        - 18080/tcp
        public-address: 54.152.0.48
  spark-slave:
    charm: local:trusty/spark-1
    exposed: false
    relations:
      slave:
      - spark-master
    units:
      spark-slave/0:
        agent-state: started
        agent-version: 1.21.1
        machine: "6"
        open-ports:
        - 8081/tcp
        public-address: 54.152.95.64
  yarn-hdfs-master:
    charm: cs:trusty/hdp-hadoop-4
    exposed: false
    relations:
      namenode:
      - compute-node
      - hdphive
      resourcemanager:
      - compute-node
      - hdphive
    units:
      yarn-hdfs-master/0:
        agent-state: error
        agent-state-info: 'hook failed: "namenode-relation-joined" for compute-node:datanode'
        agent-version: 1.21.1
        machine: "4"
        open-ports:
        - 8010/tcp
        - 8020/tcp
        - 8480/tcp
        - 50070/tcp
        - 50075/tcp
        - 50470/tcp
        public-address: 54.152.35.98