database's docs

3.7.0 RAC-安装oracle grid infrastructure


1. oracle grid infrastructure

1) 准备

如果要使用oracle ASM,需要安装oracle grid infrastructure.安装oracle grid infrastructure之前,需要满足之前对于系统环境的所有要求。

2) 创建grid base目录,并分配适当权限

使用root身份,在各节点执行以下步骤

# 产垫grid base目录
mkdir -p /u01/app/grid

# 属主属组是grid和oinstall
chown -R grid.oinstall /u01/

接下来的操作只在oracle rac集群的节点1上执行

3) 下载oracle grid infrastructure 11g rlease 2.0.3

11.2.0.3版本需要在oracle 支持网站中的最新补丁集中才能找到,需要oracle的客户支持账号,文件名称是

  • p10404530_112030_Linux-x86-64_1of7.zip
  • p10404530_112030_Linux-x86-64_2of7.zip
  • p10404530_112030_Linux-x86-64_3of7.zip
  • p10404530_112030_Linux-x86-64_4of7.zip
  • p10404530_112030_Linux-x86-64_5of7.zip
  • p10404530_112030_Linux-x86-64_6of7.zip
  • p10404530_112030_Linux-x86-64_7of7.zip

4) 解压oracle grid infrastructure软件

# 以grid身份执行
sudo su grid
mkdir /u01/app/grid/grid-software
mv p10404530_112030_Linux-x86-64_3of7.zip /u01/app/grid/grid-software/

# 以root身份执行
sudo su root
chown grid.oinstall /u01/app/grid/grid-software/p10404530_112030_Linux-x86-64_3of7.zip

# 以grid身份执行
sudo su grid
cd /u01/app/grid/grid-software
unzip p10404530_112030_Linux-x86-64_3of7.zip

5) 确保grid安装文件中的系统版本为oel6

# 以grid身份执行
vim /u01/app/grid/grid-software/grid/stage/cvu/cv/admin/cvu_config
***************************************
CV_ASSUME_DISTID=OEL6
***************************************

6) 配置reponse file

# 以grid身份执行
cd /u01/app/grid/grid-software/grid/response/
cp grid_install.rsp grid_install.rsp.bak
vim grid_install.rsp
***************************************
# 指定hostname而不是使用系统配置的hostname,适用于多个网卡和多个hostnames的情况
ORACLE_HOSTNAME=db-oracle-node1

# 配置inventory路径
INVENTORY_LOCATION=/u01/app/oraInventory

# 设定安装语言
SELECTED_LANGUAGES=en

# 设定安装cluster集群
oracle.install.option=CRS_CONFIG

# 指定安装base
ORACLE_BASE=/u01/app/grid

# 制定oracle home
ORACLE_HOME=/u01/app/11.2.0/grid

# 指定三个组
oracle.install.asm.OSDBA=asmdba
oracle.install.asm.OSOPER=asmoper
oracle.install.asm.OSASM=asmadmin

# 指定scan名称,和dns中配置的需要一致
oracle.install.crs.config.gpnp.scanName=db-oracle-scan.localdomain

# 指定监听端口
oracle.install.crs.config.gpnp.scanPort=1521

# 指定集群名称,名称不可以太长
oracle.install.crs.config.clusterName=db-ora-cluster

# 指定集群节点
oracle.install.crs.config.clusterNodes=db-oracle-node1:db-oracle-node1-vip,db-oracle-node2:db-oracle-node2-vip

# 指定public和private网址
oracle.install.crs.config.networkInterfaceList=bond0:192.168.33.0:1,eth3:172.16.44.0:2,eth4:172.16.48.0:2

# 指定使用asm
oracle.install.crs.config.storageOption=ASM_STORAGE

# 设置一个密码,密码规则是大写小写字母和数字最少都有一个
oracle.install.asm.SYSASMPassword=oracle_ASM66

# 指定asm硬盘组名称
oracle.install.asm.diskGroup.name=OCRVOTE

# 设定AU size为1
oracle.install.asm.diskGroup.AUSize=1

# 指定asm硬盘组的磁盘
oracle.install.asm.diskGroup.disks=/dev/mapper/ocrvote1p1,/dev/mapper/ocrvote2p1,/dev/mapper/ocrvote3p1

# 指定磁盘发现字符串
oracle.install.asm.diskGroup.diskDiscoveryString=/dev/mapper/*

# 设定一个密码
oracle.install.asm.monitorPassword=oracle_ASM66

# 跳过更新
oracle.installer.autoupdates.option=SKIP_UPDATES
***************************************

7) 检查环境

# 关闭防火墙
service iptables stop

# 在oracle rac cluster各节点之间执行以下命令做ssh互信
sudo su - grid
ssh-keygen
# 值得注意的是每个节点自身需要和自身做ssh互信,也就是说在节点1上要做到节点1的ssh互信
ssh-copy-id db-oracle-node1
ssh-copy-id db-oracle-node2

## 安装cvuqdisk
# 以root身份执行
rpm -ivh /u01/app/grid/grid-software/grid/rpm/cvuqdisk-1.0.9-1.rpm
# 拷贝此文件去其他节点,并在其他节点安装

8) silent模式安装oracle grid infrastructure

sudo su grid
cd /u01/app/grid/grid-software/grid
# 可以关注-ignorePrereq,但此处不需要添加,因为我们确实需要检查环境是否满足
./runInstaller -responseFile /u01/app/grid/grid-software/grid/response/grid_install.rsp -silent
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB.   Actual 33743 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 991 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2016-12-28_06-16-56AM. Please wait ...[grid@db-oracle-node1 grid]$ You can find the log of this install session at:
 /u01/app/oraInventory/logs/installActions2016-12-28_06-16-56AM.log
The installation of Oracle Grid Infrastructure was successful.
Please check '/u01/app/oraInventory/logs/silentInstall2016-12-28_06-16-56AM.log' for more details.

As a root user, execute the following script(s):
        1. /u01/app/oraInventory/orainstRoot.sh
        2. /u01/app/11.2.0/grid/root.sh

Execute /u01/app/oraInventory/orainstRoot.sh on the following nodes:
[db-oracle-node1, db-oracle-node2]
Execute /u01/app/11.2.0/grid/root.sh on the following nodes:
[db-oracle-node1, db-oracle-node2]

As install user, execute the following script to complete the configuration.
        1. /u01/app/11.2.0/grid/cfgtoollogs/configToolAllCommands

        Note:
        1. This script must be run on the same system from where installer was run.
        2. This script needs a small password properties file for configuration assistants that require passwords (refer to install guide documentation).


Successfully Setup Software.


## 此时按照提示,保持目前的terminal不动,打开一个新的ssh连接,使用root登陆,然后执行提示中的脚本
## db-oracle-node1上,使用root执行
sh /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.

sh /u01/app/11.2.0/grid/root.sh
Check /u01/app/11.2.0/grid/install/root_db-oracle-node1_2016-12-27_08-42-35.log for the output of root script
# 检查log,看是否成功执行
# 最后一句是
# "Configure Oracle Grid Infrastructure for a Cluster ... succeeded"

## db-oracle-node2上,使用root执行
sh /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.

Configure Oracle Grid Infrastructure for a Cluster ... succeeded

sh /u01/app/11.2.0/grid/root.sh
Check /u01/app/11.2.0/grid/install/root_db-oracle-node2_2016-12-28_06-36-36.log for the output of root script
# 检查log,看是否成功执行
# 最后一句是
# "Configure Oracle Grid Infrastructure for a Cluster ... succeeded"

## db-oracle-node1上,使用grid执行
sh /u01/app/11.2.0/grid/cfgtoollogs/configToolAllCommands
Setting the invPtrLoc to /u01/app/11.2.0/grid/oraInst.loc

perform - mode is starting for action: configure


perform - mode finished for action: configure

You can see the log file: /u01/app/11.2.0/grid/cfgtoollogs/oui/configActions2016-12-28_06-43-23-AM.log


## 执行完毕后回来按下enter键

执行完毕后,会发现在db-oracle-node2节点上也在同路径下安装了grid

错误及解决方法

## 错误1
# SEVERE: [FATAL] [INS-40904] ORACLE_HOSTNAME does not resolve to a valid host name.
#    CAUSE: The value provided for ORACLE_HOSTNAME does not resolve to a valid host name.
#    ACTION: Provide a valid host name for ORACLE_HOSTNAME, and restart the installer.

# 解决方案:
# 1. 检查ORACLE_HOSTNAME设定是否正确
# 2. 在/etc/hosts中增加"publicip ORACLE_HOSTNAME"


## 错误2
# SEVERE: [FATAL] [INS-06006] Passwordless SSH connectivity not set up between the following node(s): [db-oracle-node2].
#    CAUSE: Either passwordless SSH connectivity is not setup between specified node(s) or they are not reachable. Refer to the logs for more details.
#    ACTION: Refer to the logs for more details or contact Oracle Support Services.

# 解决方案:
# 1. 使用grid身份登陆各node
# 2. 生成ssh key
# 3. 各node之间互相做信任


## 错误3
# SEVERE: [FATAL] [INS-40925] One or more nodes have interfaces not configured with a subnet that is common across all cluster nodes.
#   CAUSE: Not all nodes have network interfaces that are configured on subnets that are common to all nodes in the cluster.
#   ACTION: Ensure all cluster nodes have a public interface defined with the same subnet accessible by all nodes in the cluster.

# 解决方案:
# 1. 将oracle.install.crs.config.privateInterconnects=bond0:192.168.33.103:2,eth3:172.16.44.103:1,eth4:172.16.48.103:1
# 修改为oracle.install.crs.config.privateInterconnects=bond0:192.168.33.0:1,eth3:172.16.44.0:2,eth4:172.16.48.0:2

# 原因:
# 1. 不能使用ip,需要使用网段,既把最后一段ip改为0
# 2. 文档中说1代表private,2代表public,其实是错的,应该要反过来


## 警告4
# WARNING: [WARNING] [INS-30011] The password entered does not conform to the Oracle recommended standards.
#    CAUSE: Oracle recommends that the SYS password entered should be at least 8 characters in length, contain at least 1 uppercase character, 1 lower case character and 1 digit [0-9].
#    ACTION: Provide a password that conforms to the Oracle recommended standards.
# SEVERE: [FATAL] [INS-30001] The ASMSNMP password is empty.
#    CAUSE: The ASMSNMP password should not be empty.
#    ACTION: Provide a non-empty password.

# 解决方案:
# 1. 修改SYSASM的password,使其符合规则(最少1个大写字母,1个小写字母,1个数字)
# 2. 增加ASMSNMP的password,密码规则(最少1个大写字母,1个小写字母,1个数字)

## 错误5
# 因为同时在node1和node2上执行root.sh,造成root.sh执行失败
# 官方文档解决办法
# C. GI Cluster Deconfigure and Reconfigure
#
# Identify cause of root.sh failure by reviewing logs in $GRID_HOME/cfgtoollogs/crsconfig and $GRID_HOME/log, once cause is identified and problem is fixed, deconfigure and reconfigure with steps below - keep in mind that you will need wait till each step finishes successfully before move to next one:
#
# Step 0: For 11.2.0.2 and above, root.sh is restartable.
#
# Once cause is identified and the problem is fixed, root.sh can be executed again on the failed node. If it succeeds, continue with your planned installation procedure; otherwise as root sequentially execute "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force" and $GRID_HOME/root.sh on local node, if it succeeds, continue with your planned installation procedure, otherwise proceed to next step (Step 1) of the note.
#
# Step 1: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force" on all nodes, except the last one.
#
# Step 2: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force -lastnode" on last node. This command will zero out OCR, Voting Disk and the ASM diskgroup for OCR and Voting Disk
#
#
#     Note:
#
#     a. Step1 and 2 can be skipped on node(s) where root.sh haven't been executed this time.
#
#     b. Step1 and 2 should remove checkpoint file. To verify:
#
#           ls -l $ORACLE_BASE/Clusterware/ckptGridHA_.xml
#
#     If it's still there, please remove it manually with "rm" command on all nodes
#
#     c. If GPNP profile is different between nodes/setup, clean it up on all nodes as grid user
#
#           $ find <GRID_HOME>/gpnp/* -type f -exec rm -rf {} \;          
#
#     The profile needs to be cleaned up:
#
#         c1. If root.sh is executed concurrently - one should not execute root.sh on any other nodes before it finishes on first node.
#
#         c2. If network info, location of OCR or Voting Disk etc changed after Grid is installed - rare
#
#
# Step 3: As root, run $GRID_HOME/root.sh on first node
#
# Step 4: As root, run $GRID_HOME/root.sh on all other node(s), except last one.
#
# Step 5: As root, run $GRID_HOME/root.sh on last node.

9) 检查结果

/u01/app/11.2.0/grid/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....ER.lsnr ora....er.type ONLINE    ONLINE    db-o...ode1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    db-o...ode2
ora....N2.lsnr ora....er.type ONLINE    ONLINE    db-o...ode1
ora....N3.lsnr ora....er.type ONLINE    ONLINE    db-o...ode1
ora.OCRVOTE.dg ora....up.type ONLINE    ONLINE    db-o...ode1
ora.asm        ora.asm.type   ONLINE    ONLINE    db-o...ode1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    db-o...ode1
ora....SM1.asm application    ONLINE    ONLINE    db-o...ode1
ora....E1.lsnr application    ONLINE    ONLINE    db-o...ode1
ora....de1.gsd application    OFFLINE   OFFLINE
ora....de1.ons application    ONLINE    ONLINE    db-o...ode1
ora....de1.vip ora....t1.type ONLINE    ONLINE    db-o...ode1
ora....SM2.asm application    ONLINE    ONLINE    db-o...ode2
ora....E2.lsnr application    ONLINE    ONLINE    db-o...ode2
ora....de2.gsd application    OFFLINE   OFFLINE
ora....de2.ons application    ONLINE    ONLINE    db-o...ode2
ora....de2.vip ora....t1.type ONLINE    ONLINE    db-o...ode2
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    db-o...ode1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    db-o...ode1
ora.ons        ora.ons.type   ONLINE    ONLINE    db-o...ode1
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    db-o...ode2
ora.scan2.vip  ora....ip.type ONLINE    ONLINE    db-o...ode1
ora.scan3.vip  ora....ip.type ONLINE    ONLINE    db-o...ode1
# 所有的name对应一个target,并且state应该是online,除了以".gsd"结尾的。