使用Ambari2.5(HDP2.6)搭建大数据环境

本文介绍在 CentOS 7 环境下使用 Ambari2.5 (HDP2.6) 搭建大数据环境。

推荐使用如下脚本将 Ambari/HDP 相关软件包下到本地后配置 yum 源安装,在线安装速度太慢会经常遇到包找不到情况。

1
2
3
4
5
wget -c http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.5.1.0/ambari-2.5.1.0-centos7.tar.gz \
http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.5.1.0/ambari.repo \
http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.1.0/hdp.repo \
http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.1.0/HDP-2.6.1.0-centos7-rpm.tar.gz \
http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/centos7/HDP-UTILS-1.1.0.21-centos7.tar.gz

CentOS 准备

安装CentOS 7

  • 安装时设置静态IP
  • 关闭Kdump
  • 关闭Selinux
  • 使用最小化安装

安装相关软件包

挂载系统镜像

1
2
mkdir /media/CentOS
mount /dev/sr0 /media/CentOS

编辑 /etc/yum.repos.d/CentOS-Media.repo 启用本地存储库,修改 enabled1

1
2
yum install postgresql-server postgresql-contrib vim ntp unzip

安装前设置

SSH免密码登录

使用root账号登录 Ambari Server 主机并生成SSH私钥:

1
ssh-keygen

添加`authorized_keys文件:

1
2
cd ~/.ssh
cat id_rsa.pub >> authorized_keys

修改 ~/.ssh 目录 和 ~/.ssh/authorized_keys 文件系统权限(注意:~/.ssh/authorized_keys文件权限必需为600,不然免密码登录将失效):

1
2
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys

authorized_keys 文件其复制到所有 Ambari Agent 主机(注意:有可能需要在Agent主机上创建 .ssh 目录)

1
scp ~/.ssh/authorized_keys root@<remote.target.host>:~/.ssh/

(请将 <remote.target.host> 替换为集群中每台 Ambari Agent 主机地址)

验证每台主机免密码登录是否成功

1
ssh root@<remote.target.host>

设置 NTP

1
2
3
yum install -y ntp
systemctl enable ntpd
systemctl start ntpd

关闭系统防火墙

1
2
systemctl disable firewalld
service firewalld stop

SELinux、PackageKit、umask

编辑 /etc/sysconfig/selinux ,设置SELINUX=disabled

1
echo umask 0022 >> /etc/profile

设置网络(DNS和NSCD)

所有节点都要设置。ambari在安装时需要配置全域名,所以需要检查DNS。为了减轻DNS的负担, 建议在节点里用 Name Service Caching Daemon (NSCD)

1
2
3
4
5
6
7
8
9
10
vim /etc/hosts
192.168.124.151 ambari001
192.168.124.152 ambari002
192.168.124.153 ambari003
vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=ambari001

本地 ambari/hdp yum源设置

ambari.repohdp.repo 文件入到 /etc/yum.repo.d/ 目录,并将 192.168.32.101 地址替换成你实际的
本地 yum 服务地址。

我们可以使用 Nginx 来搭建 yum 服务,只需要注意相映路径即可。

安装nginx

1
2
3
4
5
6
7
$ vim /etc/yum.repos.d/nginx.repo
[nginx]
name=nginx repo
baseurl=http://nginx.org/packages/centos/7/$basearch/
gpgcheck=0
enabled=1

ambari.repo

1
2
3
4
5
6
7
8
#VERSION_NUMBER=2.5.0.3-7
[ambari-2.5.0.3]
name=ambari Version - ambari-2.5.0.3
baseurl=http://192.168.32.101/ambari/centos7/2.x/updates/2.5.0.3
gpgcheck=1
gpgkey=http://192.168.32.101/ambari/centos7/2.x/updates/2.5.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

hdp.repo

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#VERSION_NUMBER=2.6.0.3-8
[HDP-2.6.0.3]
name=HDP Version - HDP-2.6.0.3
baseurl=http://192.168.32.101/HDP/centos7/2.x/updates/2.6.0.3
gpgcheck=1
gpgkey=http://192.168.32.101/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-UTILS-1.1.0.21]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.21
baseurl=http://192.168.32.101/HDP-UTILS-1.1.0.21/repos/centos7
gpgcheck=1
gpgkey=http://192.168.32.101/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

安装独立PostgreSQL数据库(可选)

1
2
rpm -ivh https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm
sudo yum -y install postgresql96-server postgresql96-contrib

选择:Enter advanced database configuration ,并选择 [4] - PostgreSQL

设置默认schema

1
set search_path to "$user",ambari;

安装/设置 ambari-server

为了一些不必要的麻烦,推荐关闭 selinux

Install

1
yum install ambari-server

配置 ambari-server

1
ambari-server setup --jdbc-db=postgres --jdbc-driver=/home/software/postgresql-42.0.0.jar

使用 -j 选项指定 JAVA_HOME 目录,这里推荐使用 Oracle JDK 1.8,并配置 Java Cryptography Extension (JCE) 。若不指定 -j 选项,ambari-server 将自动下载配置了JCE的Oracle JDK 1.8版本。

一切使用默认配置即可,当看到以下输出就代表 Ambari Server 配置成功:

1
2
3
...........
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.

完整输出如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
[root@ambari001 ~]# ambari-server setup
Using python /usr/bin/python
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)? n
Adjusting ambari-server permissions and ownership...
Checking firewall status...
Checking JDK...
[1] Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
[2] Oracle JDK 1.7 + Java Cryptography Extension (JCE) Policy Files 7
[3] Custom JDK
==============================================================================
Enter choice (1): 3
WARNING: JDK must be installed on all hosts and JAVA_HOME must be valid on all hosts.
WARNING: JCE Policy files are required for configuring Kerberos security. If you plan to use Kerberos,please make sure JCE Unlimited Strength Jurisdiction Policy Files are valid on all hosts.
Path to JAVA_HOME: /opt/local/jdk
Validating JDK on Ambari Server...done.
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? y
Configuring database...
==============================================================================
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL / MariaDB
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
[7] - BDB
==============================================================================
Enter choice (1): 4
Hostname (localhost): 127.0.0.1
Port (5432):
Database name (ambari):
Postgres schema (ambari):
Username (ambari):
Enter Database Password (bigdata):
Configuring ambari database...
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-Postgres-CREATE.sql
Proceed with configuring remote database connection properties [y/n] (y)? y
Extracting system views...
ambari-admin-2.5.1.0.159.jar
...........
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.

安装/配置/部署集群

启动Ambari-server

1
ambari-server start

打开浏览器登录网址:[http://192.168.32.112:8080](http://192.168.32.112:8080)(请使用你自己安装的 Ambari Server地址)。

使用默认用户名/密码 admin/admin 登录,之后你可以修改它。

登录后首先创建我们的第一个大数据集群,点击 Launch Install Wizard 按钮创建集群。

创建集群

首先我们将需要给集群取一个名字,接下来将选择 HDP 的版本,这里我们选择 2.6 版本。

我们将使用本地源来安装 HDP ,按图设置本地源地址:

  • HDP-2.6: http://192.168.32.101/HDP/centos7/2.x/updates/2.6.0.3
  • HDP-UTILS-1.1.0.21: http://192.168.32.101/HDP-UTILS-1.1.0.21/repos/centos7

选择版本并配置HDP