How to Install a Hadoop Cluster on Virtual Machines with Vagrant

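Vagrant is a very handy tool for programmatically managing multiple virtual machines (VMs) on a single physical machine. It supports VirtualBox natively and also offers plugin support for VMware Fusion and Amazon EC2 virtual machine clusters.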

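Vagrant provides an easy-to-use, Ruby-based internal DSL that lets you define one or more virtual machines together with their configuration parameters. For automated deployment, Vagrant supports several provisioning mechanisms: you can use Puppet, Chef, or shell scripts to install and configure software automatically on every VM defined in the Vagrant configuration file.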

So with Vagrant you can define a complex virtual infrastructure made up of multiple VMs running on a single system. Pretty cool, isn't it?

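A typical use case for Vagrant is building working or development environments in a simple and consistent way. At Eligotech (the original author's company), developers are building a product that aims to make Apache Hadoop and CDH (Cloudera's open-source distribution) simple to use. They often need to install a Hadoop environment on their machines for testing, and they have found Vagrant to be a very convenient tool for this.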

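Here is an example Vagrant configuration file that you can try yourself. You need to download and install Vagrant (see its help page) and VirtualBox. Once everything is installed, copy the Vagrantfile listing below, save it as Vagrantfile, and put it in a directory such as VagrantHadoop. This configuration file assumes your machine has at least 32 GB of memory; if it does not, edit the file to suit.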
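On a machine with less memory, the per-VM memory setting is the value to lower. The Vagrantfile listing hard-codes 4096 MB for the master VM. As a minimal sketch, and not part of the original Vagrantfile, the value could instead be read from an environment variable. The helper name vm_memory_mb and the variable name VM_MEMORY_MB are assumptions made up for this illustration.

```ruby
# Hypothetical helper, not part of the original Vagrantfile.
# Returns the per-VM memory in MB, read from the VM_MEMORY_MB environment
# variable (an assumed name) and falling back to the listing's 4096 MB.
def vm_memory_mb(env = ENV, default = 4096)
  # Integer(..., 10) raises ArgumentError for non-numeric input.
  value = Integer(env.fetch("VM_MEMORY_MB", default.to_s), 10)
  raise ArgumentError, "VM_MEMORY_MB must be positive" unless value.positive?
  value
end
```

Inside the VirtualBox provider block, the fixed value could then be replaced with vm_memory_mb.to_s, and the VMs started with something like VM_MEMORY_MB=2048 vagrant up. The original Vagrantfile follows.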

# -*- mode: ruby -*-
# vi: set ft=ruby :

$master_script = <<SCRIPT
#!/bin/bash
cat > /etc/hosts <<EOF
127.0.0.1      localhost

# The following lines are desirable for IPv6 capable hosts
::1    ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

10.211.55.100  vm-cluster-node1
10.211.55.101  vm-cluster-node2
10.211.55.102  vm-cluster-node3
10.211.55.103  vm-cluster-node4
10.211.55.104  vm-cluster-node5
10.211.55.105  vm-cluster-client
EOF

apt-get install curl -y
REPOCM=${REPOCM:-cm4}
CM_REPO_HOST=${CM_REPO_HOST:-archive.cloudera.com}
CM_MAJOR_VERSION=$(echo $REPOCM | sed -e 's/cm\\([0-9]\\).*/\\1/')
CM_VERSION=$(echo $REPOCM | sed -e 's/cm\\([0-9][0-9]*\\)/\\1/')
OS_CODENAME=$(lsb_release -sc)
OS_DISTID=$(lsb_release -si | tr '[A-Z]' '[a-z]')
# Register the Cloudera Manager apt repository and its signing key (CM 4 and later).
if [ "$CM_MAJOR_VERSION" -ge 4 ]; then
  cat > /etc/apt/sources.list.d/cloudera-$REPOCM.list <<EOF
deb [arch=amd64] http://$CM_REPO_HOST/cm$CM_MAJOR_VERSION/$OS_DISTID/$OS_CODENAME/amd64/cm $OS_CODENAME-$REPOCM contrib
deb-src http://$CM_REPO_HOST/cm$CM_MAJOR_VERSION/$OS_DISTID/$OS_CODENAME/amd64/cm $OS_CODENAME-$REPOCM contrib
EOF
  curl -s http://$CM_REPO_HOST/cm$CM_MAJOR_VERSION/$OS_DISTID/$OS_CODENAME/amd64/cm/archive.key > key
  apt-key add key
  rm key
fi
apt-get update
export DEBIAN_FRONTEND=noninteractive
apt-get -q -y --force-yes install oracle-j2sdk1.6 cloudera-manager-server-db cloudera-manager-server cloudera-manager-daemons
service cloudera-scm-server-db initdb
service cloudera-scm-server-db start
service cloudera-scm-server start
SCRIPT

$slave_script = <<SCRIPT
cat > /etc/hosts <<EOF
127.0.0.1      localhost

# The following lines are desirable for IPv6 capable hosts
::1    ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

10.211.55.100  vm-cluster-node1
10.211.55.101  vm-cluster-node2
10.211.55.102  vm-cluster-node3
10.211.55.103  vm-cluster-node4
10.211.55.104  vm-cluster-node5
10.211.55.105  vm-cluster-client
EOF
SCRIPT

$client_script = <<SCRIPT
cat > /etc/hosts <<EOF
127.0.0.1      localhost

# The following lines are desirable for IPv6 capable hosts
::1    ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

10.211.55.100  vm-cluster-node1
10.211.55.101  vm-cluster-node2
10.211.55.102  vm-cluster-node3
10.211.55.103  vm-cluster-node4
10.211.55.104  vm-cluster-node5
10.211.55.105  vm-cluster-client
EOF
SCRIPT

Vagrant.configure("2") do |config|

  config.vm.define :master do |master|
    master.vm.box = "precise64"
    master.vm.provider "vmware_fusion" do |v|
      v.vmx["memsize"]  = "4096"
    end
    master.vm.provider :virtualbox do |v|
      v.name = "vm-cluster-node1"
      v.customize ["modifyvm", :id, "--memory", "4096"]
    end
    master.vm.network :private_network, ip: "10.211.55.100"
    master.vm.hostname = "vm-cluster-node1"
    master.vm.provision :shell, :inline => $master_script
  end
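The three provisioning scripts in the listing each write an identical /etc/hosts block. Since a Vagrantfile is plain Ruby, that block could be generated once from a node table. The following is only a sketch under that assumption. NODES, HOSTS_HEADER, hosts_file, and hosts_script are hypothetical names introduced here. The hostnames and addresses are copied from the listing.

```ruby
# Hypothetical node table; hostnames and addresses are copied from the listing.
NODES = {
  "vm-cluster-node1"  => "10.211.55.100",
  "vm-cluster-node2"  => "10.211.55.101",
  "vm-cluster-node3"  => "10.211.55.102",
  "vm-cluster-node4"  => "10.211.55.103",
  "vm-cluster-node5"  => "10.211.55.104",
  "vm-cluster-client" => "10.211.55.105",
}.freeze

# Static part of /etc/hosts, identical in all three scripts of the listing.
HOSTS_HEADER = <<~'HEADER'
  127.0.0.1      localhost

  # The following lines are desirable for IPv6 capable hosts
  ::1    ip6-localhost ip6-loopback
  fe00::0 ip6-localnet
  ff00::0 ip6-mcastprefix
  ff02::1 ip6-allnodes
  ff02::2 ip6-allrouters
HEADER

# Builds the full /etc/hosts content: header, a blank line, one line per node.
def hosts_file(nodes)
  entries = nodes.map { |name, ip| format("%-14s %s", ip, name) }
  "#{HOSTS_HEADER}\n#{entries.join("\n")}\n"
end

# Wraps the content in the same shell heredoc that the listing uses.
def hosts_script(nodes)
  "cat > /etc/hosts <<EOF\n#{hosts_file(nodes)}EOF\n"
end
```

A node that only needs the hosts file could then be provisioned with a line such as master.vm.provision :shell, inline: hosts_script(NODES). The master would still need its Cloudera Manager installation commands appended to that script.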
