How to install the official gpdb package to a custom path
Download the installer package and unzip it to get a .bin file, then extract the tar.gz archive from the .bin file
# The first 960 lines are the installer script
[root@hadoop2 ~]# tail -n +961 greenplum-db-5.16.0-rhel7-x86_64.bin > gpdb.tar.gz

If the cluster does not yet have a gpadmin user, you can use gpssh to create it on every node at once. First, log in to the master node as root and extract gpdb.tar.gz:
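As an aside, the .bin file is just a shell-script header with a compressed payload appended, and the payload's byte offset can be located programmatically instead of hard-coding a line count. A toy sketch of the same technique (the 3-line header and payload here are fabricated stand-ins, not the real installer):

```shell
# Build a miniature "self-extracting" file: a short text header
# followed by a gzip payload (the real .bin uses a 960-line header).
printf 'line1\nline2\nline3\n' > fake.bin
echo "hello greenplum" | gzip >> fake.bin

# Locate the gzip magic bytes (1f 8b) to find the payload's byte
# offset, rather than hard-coding the header length.
offset=$(LC_ALL=C grep -abo -m1 $'\x1f\x8b' fake.bin | cut -d: -f1)

# Extract from that byte onward and decompress.
tail -c +$((offset + 1)) fake.bin | gunzip
```

Recovering the offset this way is handy when a different release ships a different header length.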
[root@hadoop2 ~]# mkdir gpdb-5.16.0
[root@hadoop2 ~]# tar zxf gpdb.tar.gz -C gpdb-5.16.0

Edit the gpdb-5.16.0/greenplum_path.sh file:
GPHOME=~/gpdb-5.16.0

Apply the gpdb environment variables:
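Sourcing greenplum_path.sh derives the rest of the environment from GPHOME. Abridged (the real script also sets PYTHONPATH, OPENSSL_CONF, and more), it amounts to roughly:

```shell
# Rough sketch of what greenplum_path.sh does with GPHOME; the real
# script sets additional variables (PYTHONPATH, OPENSSL_CONF, ...).
GPHOME=~/gpdb-5.16.0
PATH="$GPHOME/bin:$PATH"
LD_LIBRARY_PATH="$GPHOME/lib:$LD_LIBRARY_PATH"
export GPHOME PATH LD_LIBRARY_PATH

echo "$PATH" | cut -d: -f1   # first PATH entry is now $GPHOME/bin
```

This is why `which gpssh` then resolves into the custom install path.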
[root@hadoop2 ~]# source gpdb-5.16.0/greenplum_path.sh
[root@hadoop2 ~]# which gpssh
/root/gpdb-5.16.0/bin/gpssh

Create a hosts file containing the hostnames of every node in the cluster; the hostnames must also be present in /etc/hosts:
[root@hadoop2 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.203.12 hadoop2.lw hadoop2
192.168.203.13 hadoop3.lw hadoop3
192.168.203.14 hadoop4.lw hadoop4
[root@hadoop2 ~]# cat hosts
hadoop2
hadoop3
hadoop4

Exchange ssh keys between all the nodes:
[root@hadoop2 ~]# gpssh-exkeys -f hosts
[STEP 1 of 5] create local ID and authorize on local host
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
... send to hadoop3
*** Enter password for hadoop3:
... send to hadoop4
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with hadoop3
... finished key exchange with hadoop4
[INFO] completed successfully

Now the gpadmin user can be created on every node at once:
[root@hadoop2 ~]# gpssh -f hosts
=> groupadd -g 530 gpadmin
[hadoop2]
[hadoop4]
[hadoop3]
=> useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
[hadoop2]
[hadoop4]
[hadoop3]
=> echo gpadmin:gpadmin | chpasswd
[hadoop2]
[hadoop4]
[hadoop3]

Raise the open-file limit on every node; only one node is shown here, and the others are configured the same way. Since the keys were exchanged earlier, you can connect to the other nodes without a password via ssh hostname to edit their configuration. This step could also be done with gpssh, but to be safe, edit each node by hand.
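The limits set below only take effect for sessions started after the edit, so it is worth checking the values before and after logging in again; `ulimit` reports the current session's limits:

```shell
# Show the current session's limits; compare against the values in
# /etc/security/limits.conf after logging in again.
ulimit -Sn   # soft limit on open files (nofile)
ulimit -Hn   # hard limit on open files
ulimit -Su   # soft limit on processes (nproc)
ulimit -Hu   # hard limit on processes
```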
[root@hadoop2 ~]# vi /etc/security/limits.conf
# End of file
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

Move the database installation directory to /home/gpadmin:
[root@hadoop2 ~]# mv gpdb-5.16.0 /home/gpadmin
[root@hadoop2 ~]# chown -R gpadmin:gpadmin /home/gpadmin/gpdb-5.16.0
[root@hadoop2 ~]# mv hosts /home/gpadmin
[root@hadoop2 ~]# chown gpadmin:gpadmin /home/gpadmin/hosts

Next, switch to the gpadmin user to install gpdb:
[root@hadoop2 ~]# su - gpadmin

Edit .bash_profile and add the following:
GREENPLUM_PATH=~/gpdb-5.16.0/greenplum_path.sh
source $GREENPLUM_PATH
# DATA_DIR holds the master's data; if you want the data somewhere else, remember to change this path
DATA_DIR=~/data/master/gpseg-1/
export MASTER_DATA_DIRECTORY=$DATA_DIR
export PGPORT=5432
export PGDATABASE=postgres

Apply .bash_profile:
[gpadmin@hadoop2 ~]$ source .bash_profile
[gpadmin@hadoop2 ~]$ which gpssh
~/gpdb-5.16.0/bin/gpssh

Since we are now running as the gpadmin user, the ssh keys between the nodes have to be exchanged once more:
[gpadmin@hadoop2 ~]$ gpssh-exkeys -f hosts
[STEP 1 of 5] create local ID and authorize on local host
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
... send to hadoop3
*** Enter password for hadoop3:
... send to hadoop4
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with hadoop3
... finished key exchange with hadoop4
[INFO] completed successfully

Create a segs file containing the hostnames of all the segment nodes, excluding the master:
[gpadmin@hadoop2 ~]$ cat segs
hadoop3
hadoop4

Distribute the database installation directory to the segment nodes:
[gpadmin@hadoop2 ~]$ gpseginstall -f segs
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-Installation Info:
link_name None
binary_path /home/gpadmin/gpdb-5.16.0
binary_dir_location /home/gpadmin
binary_dir_name gpdb-5.16.0
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-check cluster password access
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-de-duplicate hostnames
20190411:13:52:19:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-master hostname: hadoop2
20190411:13:52:20:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-rm -f /home/gpadmin/gpdb-5.16.0.tar; rm -f /home/gpadmin/gpdb-5.16.0.tar.gz
...
20190411:13:54:42:027100 gpseginstall:hadoop2:gpadmin-[INFO]:-SUCCESS -- Requested commands completed

Create the data directories on every node at once:
[gpadmin@hadoop2 ~]$ gpssh -f hosts
=> mkdir data
[hadoop2]
[hadoop4]
[hadoop3]
=> cd data
[hadoop2]
[hadoop4]
[hadoop3]
=> mkdir p1 p2 m1 m2 master
[hadoop2]
[hadoop4]
[hadoop3]

Copy a database initialization configuration template from the installation directory:
cp gpdb-5.16.0/docs/cli_help/gpconfigs/gpinitsystem_config .

Edit the initialization configuration file; the main options to change are:
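One note on the port setting before editing: gpinitsystem gives each primary segment on a host its own port, counting up from PORT_BASE (this allocation scheme is my understanding of gpdb 5's behavior, so verify it against your installation). With two primaries per host, the expected layout can be sketched as:

```shell
# Sketch of the expected per-host port layout, assuming gpinitsystem
# allocates primary segment ports sequentially from PORT_BASE.
PORT_BASE=40000
N_PRIMARIES=2   # matches the two DATA_DIRECTORY entries
for i in $(seq 0 $((N_PRIMARIES - 1))); do
  echo "primary $i -> port $((PORT_BASE + i))"
done
```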
PORT_BASE=40000 # do not set this port too low, or it may conflict with ports already in use
declare -a DATA_DIRECTORY=(/home/gpadmin/data/p1 /home/gpadmin/data/p2)
MASTER_HOSTNAME=hadoop2
MASTER_DIRECTORY=/home/gpadmin/data/master
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/data/m1 /home/gpadmin/data/m2)

Finally, initialize the database:
gpinitsystem -c gpinitsystem_config -h hosts

Change the password of the gpadmin database user:
postgres=# ALTER USER gpadmin PASSWORD 'gpadmin';

Edit pg_hba.conf in the master data directory and append the following line:
host all all 192.168.0.0/16 md5

Reload the configuration:
gpstop -ua

The gpdb database can now accept connections from the internal network.
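A final refinement: the plain append to pg_hba.conf adds a duplicate rule every time the step is re-run. A grep guard makes it idempotent (demonstrated here on a scratch file standing in for the live config):

```shell
# Append the access rule only if an identical line is not already
# present, so repeating the step cannot create duplicates.
# Demonstrated on a scratch file standing in for pg_hba.conf.
HBA=$(mktemp)
printf 'local all all trust\n' > "$HBA"               # pre-existing rule

RULE='host all all 192.168.0.0/16 md5'
grep -qxF "$RULE" "$HBA" || echo "$RULE" >> "$HBA"    # first run: appends
grep -qxF "$RULE" "$HBA" || echo "$RULE" >> "$HBA"    # second run: no-op

grep -cxF "$RULE" "$HBA"                              # prints 1
```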