Bo's Oracle Station

[Blog Post 2016] Oracle 12c RAC and Grid Infrastructure Deployment Series, Part 2: ASMFD, the New ASM Feature Introduced in 12.1.0.2 for Use with the UEK3 Linux Kernel

2016-9-28 12:04 | Posted by: admin | Original author: Bo Tang

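Summary: ASMFD adds a kernel module named "oracleafd.ko" to the Linux operating system, where it acts much like an immune system. The module occupies a choke point in the I/O path, so that ASM can reject writes to ASM disks from non-Oracle processes (even processes with root privileges) and thereby keep intruders out. Reconfiguring GI from the original ASM Lib driver to the more secure ASM Filter Driver is very simple and can be done in a rolling fashion, so I believe this new feature deserves wide adoption in production environments.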

Author: Bo Tang


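Abstract

This article is Part 2 of the Oracle 12c RAC and Grid Infrastructure (GI) deployment series: ASMFD, the new ASM feature introduced in 12.1.0.2 for use with the UEK3 Linux kernel. It performs a rolling configuration (not an upgrade) of a RAC 12.1.0.2 system: GI is reconfigured from the original ASM Lib driver to the more secure ASM Filter Driver (ASMFD below). The starting point is a RAC made up of two hosts plus one disk array. Both hosts run Oracle Enterprise Linux 6.8 x86_64 with a UEK kernel. Because the default UEK kernel in Oracle Enterprise Linux 6.8 x86_64 is UEK4, which does not support ASMFD, a UEK3 kernel has to be installed and the operating system booted with it. ASMFD is then configured node by node, so that the existing database uses ASMFD without downtime, and the results are examined.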




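Contents

1. ASMFD Architecture
2. Key Concepts of ASM and ASMFD
2.1 The ASM Disk Discovery String
2.2 The Key Element of ASMFD: ASMFD Disk Labels
2.2.1 Benefit of ASMFD Disk Labels: Less Dependence on udev
2.3 A New Concept: The ASMFD Disk Discovery String
2.4 The ASMFD Kernel Module and Device I/O Interface
2.4.1 Benefit of the ASMFD Kernel Module and Device I/O Interface: Lower OS Resource Usage
2.4.2 Benefit of the ASMFD Kernel Module and Device I/O Interface: Fast Node Recovery in RAC
3. Lab: Overview of the Original ASM Lib Environment Before Configuring ASMFD
4. Lab: Rolling Configuration of ASMFD
4.1 The UEK3 Kernel
4.2 Rolling Configuration on the First Node
4.2.1 afd_configure
4.2.2 afd_dsset: Setting the "ASMFD Disk Discovery String"
4.3 Rolling Configuration on the Second Node
4.3.1 afd_configure Must Be Run on Every Node: Running It Again on the Second Node
4.3.2 afd_dsset Must Be Run on Every Node to Set the "ASMFD Disk Discovery String": Running It Again on the Second Node
4.4 Confirming That All Nodes Are Configured
5. Lab: Testing ASMFD Protection of the Existing Database
5.1 Enabling the Filter on the First Node
5.2 Enabling the Filter on the Second Node
5.3 Experiment: Corruption with dd
5.4 Disabling the Filter on the First Node
5.5 Checking the Filter on the Second Node
5.6 Repeating the dd Corruption Without ASMFD Protection: The Cluster Crashes
Summary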

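Main Text


1. ASMFD Architecture

ASMFD is a new feature that starts with 12.1.0.2 and is currently available only on Linux (with the UEK3 Linux kernel). The ASM Filter Driver was introduced because, in earlier versions of ASM, ASM disks were fully exposed to reads and writes from the operating system, which is very dangerous. For example, a non-Oracle process such as dd can effortlessly wreck an ASM disk, and with it the ASM disk group, causing data loss. What a frightening world! To solve this problem, Oracle introduced ASMFD in GI 12.1.0.2. ASMFD adds a kernel module named "oracleafd.ko" to the Linux operating system, where it acts much like an immune system. The module occupies a choke point in the I/O path, so that ASM can reject writes to ASM disks from non-Oracle processes (even processes with root privileges) and thereby keep intruders out. As you know, Linux kernel modules are in effect Linux drivers, "ext4.ko" for example, so the full name of ASMFD is naturally "ASM Filter Driver". The following figure shows where ASMFD sits in the overall I/O path: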

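As the introduction above shows, ASMFD has to be used with a specific operating system: Linux with a UEK3 kernel. I tried UEK4 many times and, for reasons I could not determine, UEK4 does not support ASMFD. The following table shows which UEK kernel version each Oracle Enterprise Linux version ships with:


Oracle Enterprise Linux version    Included UEK kernel version
Oracle Enterprise Linux 6.4        UEK2
Oracle Enterprise Linux 6.5        UEK3
Oracle Enterprise Linux 6.8        UEK4


A production environment may, for various reasons, run some other 6.x release instead of 6.5. In that case a UEK3 kernel has to be installed additionally on the 6.x system, and the system booted with it, in order to use ASMFD.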
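
The kernel requirement above can be checked from a script. The following is a minimal sketch, not an official Oracle check: the function name uek_release is hypothetical, and the mapping assumes that UEK2 is based on kernel 2.6.39, UEK3 on 3.8.13, and UEK4 on 4.1.12.

```shell
#!/bin/sh
# Classify a kernel release string (as printed by `uname -r`) by UEK generation.
# Assumption: UEK2 = 2.6.39, UEK3 = 3.8.13, UEK4 = 4.1.12 based kernels.
# uek_release is a hypothetical helper name, not an Oracle tool.
uek_release() {
    case "$1" in
        2.6.39-*uek*) echo UEK2 ;;
        3.8.13-*uek*) echo UEK3 ;;
        4.1.12-*uek*) echo UEK4 ;;
        *)            echo unknown ;;
    esac
}

# Example with the kernel release used later in this article:
uek_release "3.8.13-118.11.2.el6uek.x86_64"
```

On a live host the same function can be called as uek_release "$(uname -r)" before attempting the ASMFD configuration.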


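2. Key Concepts of ASM and ASMFD

2.1 The ASM Disk Discovery String

The "ASM disk discovery string" (asm disk string) is a concept that has existed since 10g. Its purpose is to let the ASM instance discover ASM disks (plain disks, ASM Lib-labeled disks, and the new "ASMFD-labeled disks" introduced in the next section). If the "ASM disk discovery string" is set to the empty string, ASM looks for disks in a series of default locations, including /dev/oracleasm/disks/*, ASM Lib disks with the "ORCL:" label, and the new "ASMFD-labeled disks".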



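ASM disk discovery string

Setting commands:
  1. asmcmd dsset, or
  2. alter system set asm_diskstring='xxx'

Values:
  1. Empty (ASM looks for disks in a series of default locations), or
  2. A disk's original path, used directly: /dev/sd*, or
  3. A raw-device disk path, used directly: /dev/raw/raw*, or
  4. The path of disks created with oracleasm createdisk: /dev/oracleasm/disks/*, or
  5. The path of an ASM Lib-labeled disk: ORCL:RACDISK1, or
  6. The path of an ASMFD-labeled disk: AFD:RACDISK1
  Several of these values can be set at once, separated by commas.

Must it be set on every node in the cluster?
  No

Viewing commands:
  1. asmcmd dsget, or
  2. show parameter asm_diskstring, or
  3. select path from v$asm_disk, or
  4. asmcmd lsdskpath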


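To convert the traditional pre-12.1.0.2 "ORCL:" ASM Lib-labeled disks into the new "ASMFD-labeled disks", we must first set the "ASM disk discovery string" to the empty string. ASM then looks for disks in a series of default locations (including ASM Lib disks with the "ORCL:" label), which guarantees that the cluster can still start from the "ORCL:" disks while the conversion is only partly done. Otherwise the cluster cannot be started midway through the conversion, and the procedure cannot continue at all. Once all disks have been converted to "ASMFD-labeled disks", we can set the "ASM disk discovery string" to the "AFD:" form, after which the cluster uses only "ASMFD-labeled disks".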
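
The precondition described above (an empty ASM disk discovery string before the conversion starts) can be verified from the output of asmcmd dsget. The sketch below assumes the two-line "parameter:" / "profile:" output format shown later in this article; the function name diskstring_is_empty is hypothetical.

```shell
#!/bin/sh
# Reads `asmcmd dsget` output on stdin and succeeds only if both the
# "parameter:" and "profile:" values are empty.
# diskstring_is_empty is a hypothetical helper, not part of asmcmd.
diskstring_is_empty() {
    awk -F: '
        /^(parameter|profile):/ {
            seen++
            value = substr($0, length($1) + 2)   # text after the first colon
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding blanks
            if (value != "") nonempty++
        }
        END { exit !(seen == 2 && nonempty == 0) }
    '
}

# Demonstration on captured output (no ASM instance needed):
printf 'parameter:\nprofile:\n' | diskstring_is_empty \
    && echo "ASM disk discovery string is empty: safe to start the conversion"
```

On a live system the check would be run as the grid user with: asmcmd dsget | diskstring_is_empty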


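2.2 The Key Element of ASMFD: ASMFD Disk Labels

ASMFD uses special ASMFD labels to label disks permanently. Before 12.1.0.2 there was only the "ASM Lib disk label"; from 12.1.0.2 onward there are two kinds of disk label: the "ASM Lib disk label" and the new "ASMFD disk label". The former is unrelated to the kernel module, while the latter works together with the kernel module. The "ASMFD disk label" is more powerful: it can identify the I/O device interface and device proxy.

A disk must be labeled by ASMFD before ASMFD can manage it. ASM Lib-labeled disks and ASMFD-labeled disks cannot coexist. The following table shows how they compare: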



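ASM Lib-labeled disks compared with ASMFD-labeled disks

Setting commands
  ASM Lib-labeled disks:
    1. Download oracleasmlib-2.0.4-1.el6.x86_64.rpm from the Oracle website and install it.
    2. Create the disk with oracleasm createdisk; the label is generated automatically.
  ASMFD-labeled disks:
    1. As root, use asmcmd afd_configure to convert the old "ASM Lib disk labels" to the new "ASMFD disk labels" automatically, or
    2. asmcmd afd_label [ --migrate ], or
    3. alter system label set

Label format
  ASM Lib-labeled disks: ORCL: followed by the disk name string
  ASMFD-labeled disks: AFD: followed by the label name string

Must it be set on every node in the cluster?
  ASM Lib-labeled disks: No
  ASMFD-labeled disks: No. Once one node has set it, the other nodes find it by scanning.

Scan commands
  ASM Lib-labeled disks: oracleasm scandisks (automatic discovery)
  ASMFD-labeled disks:
    1. asmcmd afd_scan, or
    2. alter system label scan

Viewing commands
  ASM Lib-labeled disks:
    1. asmcmd lsdsk -k, or
    2. select label from v$asm_disk
  ASMFD-labeled disks:
    1. asmcmd afd_lsdsk, or
    2. select label from v$asm_disk

Removing the label
  ASM Lib-labeled disks: Uninstall the oracleasmlib-2.0.4-1.el6.x86_64.rpm package, then adjust the "ASM disk discovery string" mentioned earlier, and so on.
  ASMFD-labeled disks: asmcmd afd_unlabel <label name string> [-f]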



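2.2.1 Benefit of ASMFD Disk Labels: Less Dependence on udev

An ASMFD disk label identifies the I/O device interface and device proxy, and it works together with the kernel module. With this capability in place, other third-party means of pinning device names, such as udev, are no longer needed.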


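2.3 A New Concept: The ASMFD Disk Discovery String

The "ASM disk discovery string" and the "ASMFD disk discovery string" are two concepts that exist side by side. To use the "ASMFD labels" described in the previous section, we must set the new "ASMFD disk discovery string", which lets ASMFD automatically find the locations of the correctly labeled disks. Because it is tied to locating the I/O device interface and device proxy, it has to be set on each node individually. If it is not set, ASMFD cannot discover ASMFD-labeled disks properly. For example, if the "ASMFD disk discovery string" is not set, the asmcmd afd_scan command has to be given a disk path argument; if it is set, asmcmd afd_scan can be run as is. The following table compares the "ASM disk discovery string" from section 2.1 with the "ASMFD disk discovery string":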



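ASM disk discovery string compared with ASMFD disk discovery string

Setting commands
  ASM disk discovery string:
    1. asmcmd dsset, or
    2. alter system set asm_diskstring='xxx'
  ASMFD disk discovery string: asmcmd afd_dsset

Values
  ASM disk discovery string:
    1. Empty (ASM looks for disks in a series of default locations), or
    2. A disk's original path, used directly: /dev/sd*, or
    3. A raw-device disk path, used directly: /dev/raw/raw*, or
    4. The path of disks created with oracleasm createdisk: /dev/oracleasm/disks/*, or
    5. An ASM Lib disk path: ORCL:RACDISK1, or
    6. An ASMFD-labeled disk path: AFD:RACDISK1
    Several of these values can be set at once, separated by commas.
  ASMFD disk discovery string: AFD: followed by the label name string

Must it be set on every node in the cluster?
  ASM disk discovery string: No
  ASMFD disk discovery string: Yes

Viewing commands
  ASM disk discovery string:
    1. asmcmd dsget, or
    2. show parameter asm_diskstring, or
    3. select path from v$asm_disk, or
    4. asmcmd lsdskpath
  ASMFD disk discovery string: asmcmd afd_dsget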


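2.4 The ASMFD Kernel Module and Device I/O Interface

ASMFD works by adding a kernel module named "oracleafd.ko" to the Linux operating system, where it acts much like an immune system. The module occupies a choke point in the I/O path, so that ASM can reject writes to ASM disks from non-Oracle processes (even processes with root privileges) and thereby keep intruders out. As you know, Linux kernel modules are in effect Linux drivers, "ext4.ko" for example, so the full name of ASMFD is naturally "ASM Filter Driver". If the system previously used disks created with oracleasm createdisk, the module loaded into the Linux kernel is "oracleasm.ko"; after the switch to ASMFD, "oracleasm.ko" is unloaded automatically and the new "oracleafd.ko" takes its place. (Note: a system that uses ASM Lib has necessarily created its disks with oracleasm createdisk, but the reverse does not hold: disks created with oracleasm createdisk do not imply that ASM Lib is in use. ASM Lib is not a kernel module.)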


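2.4.1 Benefit of the ASMFD Kernel Module and Device I/O Interface: Lower OS Resource Usage

An ASM instance usually consists of a large number of processes. In earlier versions of ASM, every process that can perform I/O independently has to hold its own dedicated file descriptor for each disk it performs I/O on. When thousands of processes access hundreds of disks, the resulting mass of file descriptors quickly consumes a large amount of operating system resources. ASMFD works the other way round: because it sits between the "ASM instance" layer and the "operating system" layer of the I/O architecture, it exposes to the ASM instance only the device interface that performs the I/O, identified through the disk label. The same device interface can be shared by all I/O processes. This significantly reduces the need for file descriptors and therefore the load on OS resources.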
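
The arithmetic behind this argument can be made concrete with a few lines of shell. The process and disk counts are hypothetical round numbers, and modelling the shared ASMFD interface as one handle per disk is a simplifying assumption made only for illustration; it is not a description of the driver's internals.

```shell
#!/bin/sh
# Hypothetical sizes, chosen only to illustrate the argument above.
processes=1000
disks=100

# Before ASMFD: every I/O process holds its own descriptor for every disk.
per_process_fds=$((processes * disks))

# With ASMFD the device interface is shared by all I/O processes.
# Assumption for illustration: one shared handle per disk.
shared_handles=$disks

echo "per-process descriptors: $per_process_fds"
echo "shared-interface handles: $shared_handles"
```

The point is the missing multiplication: the per-process model grows with processes times disks, while the shared model grows with the number of disks only.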


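2.4.2 Benefit of the ASMFD Kernel Module and Device I/O Interface: Fast Node Recovery in RAC

In earlier RAC versions, when the voting mechanism determines that an instance has failed, the whole machine hosting that instance is rebooted. This is done by scripts in init.d. The mechanism is effective, but rebooting and recovering an entire machine takes far too long. In the same situation ASMFD can achieve the same effect by restarting only the relevant I/O layer in the cluster rather than the whole machine, which is much faster.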


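3. Lab: Overview of the Original ASM Lib Environment Before Configuring ASMFD

In this RAC, station11 (192.168.0.11) and station12 (192.168.0.12) run a two-node cluster. The GI version is 12.1.0.2. The database version is 12.1.0.2, and the database is admin-managed.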


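Host: Oracle 12c RAC node host (first)
  Operating system: Oracle Enterprise Linux 6.8 with UEK3 added, booted with the UEK3 kernel
  Public network IP:
    IP: 192.168.0.11
    VIP: 192.168.0.211
    SCAN-VIP1: 192.168.0.111
    SCAN-VIP2: 192.168.0.161
  Private network IP:
    172.31.118.11 (private network)
    172.31.118.211 (reserved for the ASM network)
  Host name: station11.example.com

Host: Oracle 12c RAC node host (second)
  Operating system: same as above
  Public network IP:
    IP: 192.168.0.12
    VIP: 192.168.0.212
    SCAN-VIP3: 192.168.0.112
  Private network IP:
    172.31.118.12 (private network)
    172.31.118.212 (reserved for the ASM network)
  Host name: station12.example.com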


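The disks carry ASM Lib labels (practically the standard approach in our current production environments), and GI and the database are already installed. The original ASM Lib environment before configuring ASMFD looks like this: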


[root@station11 ~]# oracleasm-discover

Using ASMLib from /opt/oracle/extapi/64/asm/orcl/1/libasm.so

[ASM Library - Generic Linux, version 2.0.4 (KABI_V2)]

Discovered disk: ORCL:RACDISK1 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK10 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK11 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK2 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK3 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK4 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK5 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK6 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK7 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK8 [10072692 blocks (5157218304 bytes), maxio 512]

Discovered disk: ORCL:RACDISK9 [10072692 blocks (5157218304 bytes), maxio 512]


The following are the details of GI before configuring ASMFD:


[grid@station11 ~]$ crsctl query crs activeversion

Oracle Clusterware active version on the cluster is [12.1.0.2.0]

[grid@station11 ~]$ crs_stat -t

Name Type Target State Host

------------------------------------------------------------

ora.DATA.dg ora....up.type ONLINE ONLINE station11

ora.FRA.dg ora....up.type ONLINE ONLINE station11

ora....ER.lsnr ora....er.type ONLINE ONLINE station11

ora....N1.lsnr ora....er.type ONLINE ONLINE station11

ora....N2.lsnr ora....er.type ONLINE ONLINE station12

ora....N3.lsnr ora....er.type ONLINE ONLINE station11

ora.MGMTLSNR ora....nr.type ONLINE ONLINE station11

ora.asm ora.asm.type ONLINE ONLINE station11

ora.cvu ora.cvu.type ONLINE ONLINE station11

ora.db12c.db ora....se.type ONLINE ONLINE station11

ora.mgmtdb ora....db.type ONLINE ONLINE station11

ora....network ora....rk.type ONLINE ONLINE station11

ora.oc4j ora.oc4j.type ONLINE ONLINE station11

ora.ons ora.ons.type ONLINE ONLINE station11

ora.scan1.vip ora....ip.type ONLINE ONLINE station11

ora.scan2.vip ora....ip.type ONLINE ONLINE station12

ora.scan3.vip ora....ip.type ONLINE ONLINE station11

ora....SM1.asm application ONLINE ONLINE station11

ora....11.lsnr application ONLINE ONLINE station11

ora....n11.ons application ONLINE ONLINE station11

ora....n11.vip ora....t1.type ONLINE ONLINE station11

ora....SM2.asm application ONLINE ONLINE station12

ora....12.lsnr application ONLINE ONLINE station12

ora....n12.ons application ONLINE ONLINE station12

ora....n12.vip ora....t1.type ONLINE ONLINE station12


[grid@station11 ~]$ asmcmd lsdg

State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name

MOUNTED NORMAL N 512 4096 1048576 39344 23145 4918 9113 0 Y DATA/

MOUNTED EXTERN N 512 4096 1048576 14754 14417 0 14417 0 N FRA/


[grid@station11 ~]$ asmcmd lsdsk -G DATA

Path

ORCL:RACDISK1

ORCL:RACDISK2

ORCL:RACDISK3

ORCL:RACDISK4

ORCL:RACDISK5

ORCL:RACDISK6

ORCL:RACDISK7

ORCL:RACDISK8

[grid@station11 ~]$ asmcmd lsdsk -G FRA

Path

ORCL:RACDISK10

ORCL:RACDISK11

ORCL:RACDISK9


The following are the details of the database before configuring ASMFD:


[oracle@station11 ~]$ srvctl config database -d db12c -v

Database unique name: db12c

Database name: db12c

Oracle home: /u01/app/oracle/product/12.1.0/dbhome_1

Oracle user: oracle

Spfile: +DATA/DB12C/PARAMETERFILE/spfile.290.923785467

Password file: +DATA/DB12C/PASSWORD/pwddb12c.277.923783891

Domain: example.com

Start options: open

Stop options: immediate

Database role: PRIMARY

Management policy: AUTOMATIC

Server pools:

Disk Groups: FRA,DATA

Mount point paths:

Services:

Type: RAC

Start concurrency:

Stop concurrency:

OSDBA group: dba

OSOPER group: oper

Database instances: db12c1,db12c2

Configured nodes: station11,station12

Database is administrator managed


4. Lab: Rolling Configuration of ASMFD

4.1 The UEK3 Kernel

As described in section 1, the following packages need to be downloaded from http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/x86_64/index.html:

dtrace-modules-3.8.13-118.11.2.el6uek-0.4.5-3.el6.x86_64.rpm

dtrace-modules-provider-headers-0.4.5-3.el6.x86_64.rpm

dtrace-modules-shared-headers-0.4.5-3.el6.x86_64.rpm

kernel-uek-3.8.13-118.11.2.el6uek.x86_64.rpm

kernel-uek-devel-3.8.13-118.11.2.el6uek.x86_64.rpm

kernel-uek-firmware-3.8.13-118.11.2.el6uek.noarch.rpm

kernel-uek-headers-3.8.13-26.2.4.el6uek.x86_64.rpm

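Force-install these packages, then boot with the 3.8.13-118.11.2.el6uek kernel. I recommend installing the kernel and rebooting one node at a time. Because this is a RAC database, doing so has no impact on front-end services.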


[root@station11 ~]# uname -a

Linux station11.example.com 3.8.13-118.11.2.el6uek.x86_64 #2 SMP Wed Sep 21 11:37:57 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux


[root@station12 ~]# uname -a

Linux station12.example.com 3.8.13-118.11.2.el6uek.x86_64 #2 SMP Wed Sep 21 11:37:57 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux


4.2 Rolling Configuration on the First Node

4.2.1 afd_configure

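As you know, Oracle Enterprise Linux does not configure even ASM Lib by default (you have to download oracleasmlib-2.0.4-1.el6.x86_64.rpm from the Oracle website yourself), let alone ASMFD, the main subject of this article. In this article, the original ASM Lib environment before configuring ASMFD has the "oracleasm.ko" module loaded: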


[root@station11 ~]# lsmod | grep oracle

oracleasm 53591 1


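To convert the traditional pre-12.1.0.2 "ORCL:" ASM Lib-labeled disks into the new "ASMFD-labeled disks", we must first set the "ASM disk discovery string" to the empty string. ASM then looks for disks in a series of default locations (including ASM Lib disks with the "ORCL:" label), which guarantees that the cluster can still start from the "ORCL:" disks while the conversion is only partly done. Otherwise the cluster cannot be started midway through the conversion, and the procedure cannot continue at all.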


[grid@station11 ~]$ asmcmd dsget

parameter:

profile:

[grid@station11 ~]$ asmcmd dsset ''

[grid@station11 ~]$ asmcmd dsget

parameter:

profile:


Before running asmcmd afd_configure, the clusterware running on the first node must be stopped:


[root@station11 ~]# crsctl stop crs

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'station11'

CRS-2673: Attempting to stop 'ora.crsd' on 'station11'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'station11'

CRS-2673: Attempting to stop 'ora.db12c.db' on 'station11'

CRS-2673: Attempting to stop 'ora.mgmtdb' on 'station11'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'station11'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'station11'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'station11'

CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.scan3.vip' on 'station11'

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.station11.vip' on 'station11'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'station11'

CRS-2677: Stop of 'ora.scan3.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.scan3.vip' on 'station12'

CRS-2677: Stop of 'ora.station11.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.station11.vip' on 'station12'

CRS-2677: Stop of 'ora.db12c.db' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'station11'

CRS-2677: Stop of 'ora.FRA.dg' on 'station11' succeeded

CRS-2677: Stop of 'ora.scan1.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.scan1.vip' on 'station12'

CRS-2677: Stop of 'ora.mgmtdb' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'station11'

CRS-2673: Attempting to stop 'ora.MGMTLSNR' on 'station11'

CRS-2677: Stop of 'ora.DATA.dg' on 'station11' succeeded

CRS-2677: Stop of 'ora.MGMTLSNR' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.MGMTLSNR' on 'station12'

CRS-2676: Start of 'ora.scan3.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'station12'

CRS-2676: Start of 'ora.station11.vip' on 'station12' succeeded

CRS-2676: Start of 'ora.scan1.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'station12'

CRS-2676: Start of 'ora.MGMTLSNR' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.mgmtdb' on 'station12'

CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'station12' succeeded

CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'station12' succeeded

CRS-2676: Start of 'ora.mgmtdb' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'station11'

CRS-2677: Stop of 'ora.asm' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'station11'

CRS-2677: Stop of 'ora.ons' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'station11'

CRS-2677: Stop of 'ora.net1.network' on 'station11' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'station11' has completed

CRS-2677: Stop of 'ora.crsd' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'station11'

CRS-2673: Attempting to stop 'ora.crf' on 'station11'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'station11'

CRS-2673: Attempting to stop 'ora.gpnpd' on 'station11'

CRS-2677: Stop of 'ora.ctssd' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.evmd' on 'station11'

CRS-2673: Attempting to stop 'ora.storage' on 'station11'

CRS-2677: Stop of 'ora.mdnsd' on 'station11' succeeded

CRS-2677: Stop of 'ora.crf' on 'station11' succeeded

CRS-2677: Stop of 'ora.gpnpd' on 'station11' succeeded

CRS-2677: Stop of 'ora.storage' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'station11'

CRS-2677: Stop of 'ora.evmd' on 'station11' succeeded

CRS-2677: Stop of 'ora.asm' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'station11'

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'station11'

CRS-2677: Stop of 'ora.cssd' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'station11'

CRS-2677: Stop of 'ora.gipcd' on 'station11' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'station11' has completed

CRS-4133: Oracle High Availability Services has been stopped.


Run asmcmd afd_configure as root:


[root@station11 ~]# asmcmd afd_configure

Connected to an idle instance.

AFD-627: AFD distribution files found.

AFD-636: Installing requested AFD software.

AFD-637: Loading installed AFD drivers.

AFD-9321: Creating udev for AFD.

AFD-9323: Creating module dependencies - this may take some time.

AFD-9154: Loading 'oracleafd.ko' driver.

AFD-649: Verifying AFD devices.

AFD-9156: Detecting control device '/dev/oracleafd/admin'.

AFD-638: AFD installation correctness verified.

Modifying resource dependencies - this may take some time.


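During "asmcmd afd_configure", all ASM Lib disk labels are converted to ASMFD disk labels automatically. No commands such as "asmcmd afd_label --migrate" or "alter system label set" are needed, and there is no need to stop the whole cluster or dismount the disk groups, whether or not the disks contain the OCR or voting disks. Afterwards ASM Lib is disabled automatically and the oracleasm module is unloaded; the oracleafd module is loaded in its place.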


[root@station11 ~]# lsmod | grep oracle

oracleafd 208499 0
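
The before/after lsmod check can be scripted so that it gives a single answer per node. This is a minimal sketch; asm_driver is a hypothetical helper name, and it only inspects the first column of lsmod output.

```shell
#!/bin/sh
# Reads `lsmod` output on stdin and reports which Oracle ASM driver module
# is loaded: "oracleafd" (ASMFD), "oracleasm" (ASM Lib era), or "none".
# asm_driver is a hypothetical helper name.
asm_driver() {
    awk '$1 == "oracleafd" { afd = 1 }
         $1 == "oracleasm" { asmlib = 1 }
         END {
             if (afd)         print "oracleafd"
             else if (asmlib) print "oracleasm"
             else             print "none"
         }'
}

# Demonstration on the line captured above:
echo "oracleafd 208499 0" | asm_driver
```

On a live node the call would be: lsmod | asm_driver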


Next, confirm the ASMFD state and make sure it is loaded:


[grid@station11 ~]$ asmcmd afd_state

Connected to an idle instance.

ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'DEFAULT' on host 'station11.example.com'
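
For scripted checks, the state and filtering values can be pulled out of the ASMCMD-9526 message. The sketch below assumes the exact message wording shown above; afd_state_fields is a hypothetical helper name.

```shell
#!/bin/sh
# Reads `asmcmd afd_state` output on stdin and prints "<state> <filtering>".
# Assumes the ASMCMD-9526 message wording shown in this article.
# afd_state_fields is a hypothetical helper name.
afd_state_fields() {
    sed -n "s/.*AFD state is '\([^']*\)' and filtering is '\([^']*\)'.*/\1 \2/p"
}

# Demonstration on the captured message:
msg="ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'DEFAULT' on host 'station11.example.com'"
echo "$msg" | afd_state_fields
```

Lines that do not match, such as "Connected to an idle instance.", are ignored, so the full command output can be piped in unchanged.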


Restart the clusterware on the first node:


[root@station11 ~]# crsctl start crs -wait

CRS-4123: Starting Oracle High Availability Services-managed resources

CRS-2672: Attempting to start 'ora.mdnsd' on 'station11'

CRS-2672: Attempting to start 'ora.evmd' on 'station11'

CRS-2676: Start of 'ora.mdnsd' on 'station11' succeeded

CRS-2676: Start of 'ora.evmd' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.gpnpd' on 'station11'

CRS-2676: Start of 'ora.gpnpd' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.gipcd' on 'station11'

CRS-2676: Start of 'ora.gipcd' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'station11'

CRS-2676: Start of 'ora.cssdmonitor' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.cssd' on 'station11'

CRS-2672: Attempting to start 'ora.diskmon' on 'station11'

CRS-2676: Start of 'ora.diskmon' on 'station11' succeeded

CRS-2676: Start of 'ora.cssd' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'station11'

CRS-2672: Attempting to start 'ora.ctssd' on 'station11'

CRS-2676: Start of 'ora.ctssd' on 'station11' succeeded

CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.asm' on 'station11'

CRS-2676: Start of 'ora.asm' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.storage' on 'station11'

CRS-2676: Start of 'ora.storage' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.crf' on 'station11'

CRS-2676: Start of 'ora.crf' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.crsd' on 'station11'

CRS-2676: Start of 'ora.crsd' on 'station11' succeeded

CRS-6017: Processing resource auto-start for servers: station11

CRS-2672: Attempting to start 'ora.net1.network' on 'station11'

CRS-2676: Start of 'ora.net1.network' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.ons' on 'station11'

CRS-2673: Attempting to stop 'ora.station11.vip' on 'station12'

CRS-2677: Stop of 'ora.station11.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.station11.vip' on 'station11'

CRS-2676: Start of 'ora.station11.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'station11'

CRS-2676: Start of 'ora.ons' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'station12'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'station12'

CRS-2677: Stop of 'ora.scan1.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.scan1.vip' on 'station11'

CRS-2676: Start of 'ora.scan1.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'station11'

CRS-2676: Start of 'ora.LISTENER.lsnr' on 'station11' succeeded

CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.db12c.db' on 'station11'

CRS-2676: Start of 'ora.db12c.db' on 'station11' succeeded

CRS-6016: Resource auto-start has completed for server station11

CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources

CRS-4123: Oracle High Availability Services has been started.


4.2.2 afd_dsset: Setting the "ASMFD Disk Discovery String"

Initially the "ASMFD disk discovery string" is not set, so afd_scan has to be given a disk path argument:


[grid@station11 ~]$ asmcmd afd_dsget

AFD discovery string:

[grid@station11 ~]$ asmcmd afd_scan '/dev/sda*'

Connected to an idle instance.


It now needs to be set, so that from now on ASM disks on the first node are automatically probed and discovered through their ASMFD labels:


[grid@station11 ~]$ asmcmd afd_dsset '/dev/sda*'

[grid@station11 ~]$ asmcmd afd_dsget

AFD discovery string: '/dev/sda*'
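
The rule used in this section (pass a path to afd_scan only while the ASMFD disk discovery string is unset) can be expressed as a small helper. This is a sketch under the assumption that asmcmd afd_dsget prints the "AFD discovery string:" line in the format shown above; afd_scan_command is a hypothetical name, and the function only prints the command it would choose.

```shell
#!/bin/sh
# Reads `asmcmd afd_dsget` output on stdin and prints the afd_scan command
# line to use. $1 is the disk path pattern to fall back on when no ASMFD
# disk discovery string is set. afd_scan_command is a hypothetical helper.
afd_scan_command() {
    fallback_glob=$1
    value=$(sed -n "s/^AFD discovery string:[ ]*//p" | tr -d "'")
    if [ -n "$value" ]; then
        echo "asmcmd afd_scan"
    else
        echo "asmcmd afd_scan '$fallback_glob'"
    fi
}

# Demonstration on the two outputs captured above:
echo "AFD discovery string:" | afd_scan_command '/dev/sda*'
echo "AFD discovery string: '/dev/sda*'" | afd_scan_command '/dev/sda*'
```

Printing instead of executing keeps the sketch safe to try on a machine without Grid Infrastructure.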


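4.3 Rolling Configuration on the Second Node

4.3.1 afd_configure Must Be Run on Every Node: Running It Again on the Second Node

In this article, the original ASM Lib environment before configuring ASMFD has the "oracleasm.ko" module loaded: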


[root@station12 ~]# lsmod | grep oracle

oracleasm 53591 1


Before running asmcmd afd_configure, the clusterware running on the second node must be stopped:


[root@station12 ~]# crsctl stop crs

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'station12'

CRS-2673: Attempting to stop 'ora.crsd' on 'station12'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'station12'

CRS-2673: Attempting to stop 'ora.db12c.db' on 'station12'

CRS-2673: Attempting to stop 'ora.mgmtdb' on 'station12'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'station12'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'station12'

CRS-2673: Attempting to stop 'ora.oc4j' on 'station12'

CRS-2673: Attempting to stop 'ora.cvu' on 'station12'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'station12'

CRS-2677: Stop of 'ora.cvu' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.cvu' on 'station11'

CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.scan2.vip' on 'station12'

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.station12.vip' on 'station12'

CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.scan3.vip' on 'station12'

CRS-2676: Start of 'ora.cvu' on 'station11' succeeded

CRS-2677: Stop of 'ora.scan2.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.scan2.vip' on 'station11'

CRS-2677: Stop of 'ora.station12.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.station12.vip' on 'station11'

CRS-2677: Stop of 'ora.db12c.db' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'station12'

CRS-2677: Stop of 'ora.FRA.dg' on 'station12' succeeded

CRS-2677: Stop of 'ora.scan3.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.scan3.vip' on 'station11'

CRS-2676: Start of 'ora.scan2.vip' on 'station11' succeeded

CRS-2677: Stop of 'ora.mgmtdb' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'station12'

CRS-2673: Attempting to stop 'ora.MGMTLSNR' on 'station12'

CRS-2677: Stop of 'ora.DATA.dg' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'station11'

CRS-2677: Stop of 'ora.MGMTLSNR' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.MGMTLSNR' on 'station11'

CRS-2676: Start of 'ora.station12.vip' on 'station11' succeeded

CRS-2676: Start of 'ora.scan3.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'station11'

CRS-2677: Stop of 'ora.oc4j' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.oc4j' on 'station11'

CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'station11' succeeded

CRS-2676: Start of 'ora.MGMTLSNR' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.mgmtdb' on 'station11'

CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'station11' succeeded

CRS-2676: Start of 'ora.oc4j' on 'station11' succeeded

CRS-2676: Start of 'ora.mgmtdb' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'station12'

CRS-2677: Stop of 'ora.asm' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'station12'

CRS-2677: Stop of 'ora.ons' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'station12'

CRS-2677: Stop of 'ora.net1.network' on 'station12' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'station12' has completed

CRS-2677: Stop of 'ora.crsd' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.crf' on 'station12'

CRS-2673: Attempting to stop 'ora.storage' on 'station12'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'station12'

CRS-2673: Attempting to stop 'ora.gpnpd' on 'station12'

CRS-2677: Stop of 'ora.storage' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'station12'

CRS-2677: Stop of 'ora.crf' on 'station12' succeeded

CRS-2677: Stop of 'ora.gpnpd' on 'station12' succeeded

CRS-2677: Stop of 'ora.mdnsd' on 'station12' succeeded

CRS-2677: Stop of 'ora.asm' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'station12'

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'station12'

CRS-2673: Attempting to stop 'ora.evmd' on 'station12'

CRS-2677: Stop of 'ora.evmd' on 'station12' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'station12'

CRS-2677: Stop of 'ora.cssd' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'station12'

CRS-2677: Stop of 'ora.gipcd' on 'station12' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'station12' has completed

CRS-4133: Oracle High Availability Services has been stopped.


Because afd_configure has to be run on every node, run it again as root on the second node:


[root@station12 ~]# asmcmd afd_configure

Connected to an idle instance.

AFD-627: AFD distribution files found.

AFD-636: Installing requested AFD software.

AFD-637: Loading installed AFD drivers.

AFD-9321: Creating udev for AFD.

AFD-9323: Creating module dependencies - this may take some time.

AFD-9154: Loading 'oracleafd.ko' driver.

AFD-649: Verifying AFD devices.

AFD-9156: Detecting control device '/dev/oracleafd/admin'.

AFD-638: AFD installation correctness verified.

Modifying resource dependencies - this may take some time.


ASM Lib is then disabled automatically and the oracleasm module is unloaded; the oracleafd module is loaded in its place.


[root@station12 ~]# lsmod | grep oracle

oracleafd 208499 0


Next, confirm the ASMFD state and make sure it is loaded:


[grid@station12 ~]$ asmcmd afd_state

Connected to an idle instance.

ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'DEFAULT' on host 'station12.example.com'


Restart the clusterware on the second node:


[root@station12 ~]# crsctl start crs -wait

CRS-4123: Starting Oracle High Availability Services-managed resources

CRS-2672: Attempting to start 'ora.mdnsd' on 'station12'

CRS-2672: Attempting to start 'ora.evmd' on 'station12'

CRS-2676: Start of 'ora.mdnsd' on 'station12' succeeded

CRS-2676: Start of 'ora.evmd' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.gpnpd' on 'station12'

CRS-2676: Start of 'ora.gpnpd' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.gipcd' on 'station12'

CRS-2676: Start of 'ora.gipcd' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'station12'

CRS-2676: Start of 'ora.cssdmonitor' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.cssd' on 'station12'

CRS-2672: Attempting to start 'ora.diskmon' on 'station12'

CRS-2676: Start of 'ora.diskmon' on 'station12' succeeded

CRS-2676: Start of 'ora.cssd' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'station12'

CRS-2672: Attempting to start 'ora.ctssd' on 'station12'

CRS-2676: Start of 'ora.ctssd' on 'station12' succeeded

CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.asm' on 'station12'

CRS-2676: Start of 'ora.asm' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.storage' on 'station12'

CRS-2676: Start of 'ora.storage' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.crf' on 'station12'

CRS-2676: Start of 'ora.crf' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.crsd' on 'station12'

CRS-2676: Start of 'ora.crsd' on 'station12' succeeded

CRS-6023: Starting Oracle Cluster Ready Services-managed resources

CRS-6017: Processing resource auto-start for servers: station12

CRS-2672: Attempting to start 'ora.net1.network' on 'station12'

CRS-2676: Start of 'ora.net1.network' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.ons' on 'station12'

CRS-2673: Attempting to stop 'ora.station12.vip' on 'station11'

CRS-2677: Stop of 'ora.station12.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.station12.vip' on 'station12'

CRS-2676: Start of 'ora.station12.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'station12'

CRS-2676: Start of 'ora.ons' on 'station12' succeeded

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'station11'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'station11' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'station11'

CRS-2677: Stop of 'ora.scan1.vip' on 'station11' succeeded

CRS-2672: Attempting to start 'ora.scan1.vip' on 'station12'

CRS-2676: Start of 'ora.scan1.vip' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'station12'

CRS-2676: Start of 'ora.LISTENER.lsnr' on 'station12' succeeded

CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'station12' succeeded

CRS-2672: Attempting to start 'ora.db12c.db' on 'station12'

CRS-2676: Start of 'ora.db12c.db' on 'station12' succeeded

CRS-6016: Resource auto-start has completed for server station12

CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources

CRS-4123: Oracle High Availability Services has been started.


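4.3.2 afd_dsset Must Be Run on Every Node to Set the "ASMFD Disk Discovery String": Running It Again on the Second Node

Initially the "ASMFD disk discovery string" is not set, so afd_scan has to be given a disk path argument: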


[grid@station12 ~]$ asmcmd afd_dsget

AFD discovery string:

[grid@station12 ~]$ asmcmd afd_scan '/dev/sda*'

Connected to an idle instance.


It now needs to be set, so that from now on ASM disks on the second node are automatically probed and discovered through their ASMFD labels:


[grid@station12 ~]$ asmcmd afd_dsset '/dev/sda*'

[grid@station12 ~]$ asmcmd afd_dsget

AFD discovery string: '/dev/sda*'
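
The per-node sequence carried out in sections 4.2 and 4.3 can be summarised as a dry-run plan. The script below only prints the commands in the order used in this article and executes nothing; print_node_plan is a hypothetical helper name, and the [root] / [grid] prefixes indicate which OS user ran each command in the transcripts above.

```shell
#!/bin/sh
# Prints (does not execute) the per-node command sequence from section 4.
# $1 = node name, $2 = disk path pattern for afd_scan / afd_dsset.
# print_node_plan is a hypothetical helper name.
print_node_plan() {
    node=$1
    disk_glob=$2
    cat <<EOF
# --- $node ---
[root] crsctl stop crs
[root] asmcmd afd_configure
[grid] asmcmd afd_state
[root] crsctl start crs -wait
[grid] asmcmd afd_scan '$disk_glob'
[grid] asmcmd afd_dsset '$disk_glob'
[grid] asmcmd afd_dsget
EOF
}

# The ASM disk discovery string is emptied once, before the first node.
echo "[grid] asmcmd dsset ''"
for node in station11 station12; do
    print_node_plan "$node" '/dev/sda*'
done
```

Finishing one node completely before starting the next is what keeps the procedure rolling: the other node keeps serving the database while one node's clusterware is down.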


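4.4 Confirming That All Nodes Are Configured

Note that every disk path is now shown as an ASMFD-labeled disk, and that the "Library" column shows that ASMFD is loaded and in effect.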


[grid@station12 ~]$ asmcmd lsdsk -k

Total_MB Free_MB OS_MB Name Failgroup Failgroup_Type Library Label UDID Product Redund Path

4918 2859 4918 RACDISK1 RACDISK1 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK1 UNKNOWN AFD:RACDISK1

4918 4805 4918 RACDISK10 RACDISK10 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK10 UNKNOWN AFD:RACDISK10

4918 4806 4918 RACDISK11 RACDISK11 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK11 UNKNOWN AFD:RACDISK11

4918 2874 4918 RACDISK2 RACDISK2 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK2 UNKNOWN AFD:RACDISK2

4918 2861 4918 RACDISK3 RACDISK3 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK3 UNKNOWN AFD:RACDISK3

4918 2899 4918 RACDISK4 RACDISK4 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK4 UNKNOWN AFD:RACDISK4

4918 2901 4918 RACDISK5 RACDISK5 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK5 UNKNOWN AFD:RACDISK5

4918 2902 4918 RACDISK6 RACDISK6 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK6 UNKNOWN AFD:RACDISK6

4918 2885 4918 RACDISK7 RACDISK7 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK7 UNKNOWN AFD:RACDISK7

4918 2896 4918 RACDISK8 RACDISK8 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK8 UNKNOWN AFD:RACDISK8

4918 4806 4918 RACDISK9 RACDISK9 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK9 UNKNOWN AFD:RACDISK9
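
With eleven disks the Path column is easy to check by eye, but the same check can be sketched in code. The helper below is a hypothetical example that scans captured `asmcmd lsdsk -k` output and lists any disk whose path does not start with `AFD:`. It assumes the column layout shown in the transcript; the extra row used to demonstrate a failure is invented for illustration and does not come from this cluster.

```python
def disks_not_on_afd(lsdsk_output: str) -> list:
    """Return (name, path) pairs for disks whose Path column does not start with 'AFD:'."""
    offenders = []
    for line in lsdsk_output.splitlines():
        tokens = line.split()
        # Data rows start with the numeric Total_MB column; the header row does not.
        if len(tokens) < 5 or not tokens[0].isdigit():
            continue
        # The Library column contains spaces, so only the leading columns and the
        # last column (Path) are taken by position.
        name, path = tokens[3], tokens[-1]
        if not path.startswith("AFD:"):
            offenders.append((name, path))
    return offenders


# Header and two rows copied from the transcript.
sample = """\
Total_MB Free_MB OS_MB Name Failgroup Failgroup_Type Library Label UDID Product Redund Path
4918 2859 4918 RACDISK1 RACDISK1 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK1 UNKNOWN AFD:RACDISK1
4918 4806 4918 RACDISK9 RACDISK9 REGULAR AFD Library - Generic , version 3 (KABI_V3) RACDISK9 UNKNOWN AFD:RACDISK9
"""

# Hypothetical row (not from the transcript) for a disk still reached by a raw path.
hypothetical = "4918 2874 4918 RACDISK2 RACDISK2 REGULAR System UNKNOWN /dev/sda6\n"

print(disks_not_on_afd(sample))
print(disks_not_on_afd(sample + hypothetical))
```

An empty list means every disk in the listing is presented through its ASMFD label.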


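Now that every disk has been converted to an ASMFD-labeled disk, for consistency we also set the "ASM disk discovery string" described in section 2.1 to the new ASMFD label path: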


[grid@station12 ~]$ asmcmd dsget

parameter:

profile:

[grid@station12 ~]$ asmcmd dsset 'AFD:*'

[grid@station12 ~]$ asmcmd dsget

parameter:AFD:*

profile:AFD:*
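
The dsget output has two lines, and both should carry the new value after dsset. As far as I know, "parameter" reflects the ASM_DISKSTRING setting of the running ASM instance and "profile" reflects the value stored in the GPnP profile; treat that reading as an assumption. A small sketch, with hypothetical function names, that compares the two against the expected string:

```python
def parse_dsget(output: str) -> dict:
    """Parse the 'parameter:' and 'profile:' lines of `asmcmd dsget` output."""
    values = {}
    for line in output.splitlines():
        # Split on the first colon only, because the value itself ("AFD:*") contains one.
        key, sep, value = line.strip().partition(":")
        if sep and key in ("parameter", "profile"):
            values[key] = value.strip()
    return values


def discovery_string_is_consistent(output: str, expected: str) -> bool:
    """True when both the parameter and the profile line equal the expected string."""
    values = parse_dsget(output)
    return values.get("parameter") == expected and values.get("profile") == expected


# Output copied from the transcript, before and after `asmcmd dsset 'AFD:*'`.
before = "parameter:\nprofile:\n"
after = "parameter:AFD:*\nprofile:AFD:*\n"

print(parse_dsget(before), discovery_string_is_consistent(before, "AFD:*"))
print(parse_dsget(after), discovery_string_is_consistent(after, "AFD:*"))
```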


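5. Lab: testing ASMFD protection of the existing database

5.1 Enabling the filter on the first node

If the configuration above has just been completed, ASMFD filtering should already be enabled on both nodes: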


[grid@station11 ~]$ asmcmd afd_lsdsk

--------------------------------------------------------------------------------

Label Filtering Path

================================================================================

RACDISK1 ENABLED /dev/sda5

RACDISK2 ENABLED /dev/sda6

RACDISK3 ENABLED /dev/sda7

RACDISK4 ENABLED /dev/sda8

RACDISK5 ENABLED /dev/sda9

RACDISK6 ENABLED /dev/sda10

RACDISK7 ENABLED /dev/sda11

RACDISK8 ENABLED /dev/sda12

RACDISK9 ENABLED /dev/sda13

RACDISK10 ENABLED /dev/sda14

RACDISK11 ENABLED /dev/sda15


If it is not enabled, run the following on every node where the filter is off:


[grid@station11 ~]$ asmcmd afd_filter -e


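5.2 Enabling the filter on the second node

As on the first node, ASMFD filtering should already be enabled here if the configuration above has just been completed: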


[grid@station12 ~]$ asmcmd afd_lsdsk

--------------------------------------------------------------------------------

Label Filtering Path

================================================================================

RACDISK1 ENABLED /dev/sda5

RACDISK2 ENABLED /dev/sda6

RACDISK3 ENABLED /dev/sda7

RACDISK4 ENABLED /dev/sda8

RACDISK5 ENABLED /dev/sda9

RACDISK6 ENABLED /dev/sda10

RACDISK7 ENABLED /dev/sda11

RACDISK8 ENABLED /dev/sda12

RACDISK9 ENABLED /dev/sda13

RACDISK10 ENABLED /dev/sda14

RACDISK11 ENABLED /dev/sda15


If it is not enabled, run the following on every node where the filter is off:


[grid@station12 ~]$ asmcmd afd_filter -e
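
Whether `asmcmd afd_filter -e` is needed on a node can be decided from the Filtering column of afd_lsdsk. The sketch below parses captured afd_lsdsk text and reports disks that are not in the ENABLED state; the function names are hypothetical and the parser assumes the three-column layout shown in the transcript.

```python
def parse_afd_lsdsk(output: str) -> list:
    """Return (label, filtering, path) tuples from `asmcmd afd_lsdsk` output."""
    disks = []
    for line in output.splitlines():
        tokens = line.split()
        # Data rows have exactly three columns; the header and separator rows are
        # skipped because their second token is not a filtering state.
        if len(tokens) == 3 and tokens[1] in ("ENABLED", "DISABLED"):
            disks.append((tokens[0], tokens[1], tokens[2]))
    return disks


def disks_without_filtering(output: str) -> list:
    """Labels of disks whose Filtering column is not ENABLED."""
    return [label for label, state, _path in parse_afd_lsdsk(output) if state != "ENABLED"]


# Rows copied from the transcript (first two disks only).
enabled = """\
--------------------------------------------------------------------------------
Label Filtering Path
================================================================================
RACDISK1 ENABLED /dev/sda5
RACDISK2 ENABLED /dev/sda6
"""

unprotected = disks_without_filtering(enabled)
if unprotected:
    print("run 'asmcmd afd_filter -e' on this node; unfiltered disks:", unprotected)
else:
    print("filtering is enabled for all listed disks")
```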


5.3 Lab: attempting corruption with dd

First confirm where the OCR and the voting disks are located:


[grid@station12 ~]$ ocrcheck

Status of Oracle Cluster Registry is as follows :

Version : 4

Total space (kbytes) : 409568

Used space (kbytes) : 1568

Available space (kbytes) : 408000

ID : 1183496086

Device/File Name : +DATA

Device/File integrity check succeeded


Device/File not configured


Device/File not configured


Device/File not configured


Device/File not configured


Cluster registry integrity check succeeded


Logical corruption check bypassed due to non-privileged user


[grid@station12 ~]$ crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 36cf7fea1a204f02bf6828a53e4d924b (AFD:RACDISK1) [DATA]

2. ONLINE edc3c70c29604fd8bfc65d3e67636b01 (AFD:RACDISK2) [DATA]

3. ONLINE 68bb13dc938e4f16bf8aa3ec48b61545 (AFD:RACDISK3) [DATA]

Located 3 voting disk(s).


[grid@station12 ~]$ asmcmd afd_lsdsk

--------------------------------------------------------------------------------

Label Filtering Path

================================================================================

RACDISK1 ENABLED /dev/sda5

RACDISK2 ENABLED /dev/sda6

RACDISK3 ENABLED /dev/sda7

RACDISK4 ENABLED /dev/sda8

RACDISK5 ENABLED /dev/sda9

RACDISK6 ENABLED /dev/sda10

RACDISK7 ENABLED /dev/sda11

RACDISK8 ENABLED /dev/sda12

RACDISK9 ENABLED /dev/sda13

RACDISK10 ENABLED /dev/sda14

RACDISK11 ENABLED /dev/sda15
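
The votedisk listing names disks by ASMFD label, while dd needs a device path, so the two listings have to be joined on the label. A minimal sketch of that join, working on captured output; the function name and the regular expression are my own and assume the row formats shown in the transcript.

```python
import re

# Matches rows such as: " 1. ONLINE 36cf...924b (AFD:RACDISK1) [DATA]"
VOTE_ROW = re.compile(r"^\s*\d+\.\s+(\w+)\s+([0-9a-f]{32})\s+\((\S+)\)\s+\[(\w+)\]")


def voting_disk_devices(votedisk_output: str, afd_lsdsk_output: str) -> list:
    """Return (file_name, disk_group, state, device_path) for each voting disk."""
    label_to_path = {}
    for line in afd_lsdsk_output.splitlines():
        tokens = line.split()
        if len(tokens) == 3 and tokens[1] in ("ENABLED", "DISABLED"):
            label_to_path[tokens[0]] = tokens[2]

    devices = []
    for line in votedisk_output.splitlines():
        match = VOTE_ROW.match(line)
        if not match:
            continue
        state, _file_id, file_name, disk_group = match.groups()
        # Only names of the form AFD:<label> can be resolved through afd_lsdsk.
        label = file_name[len("AFD:"):] if file_name.startswith("AFD:") else None
        devices.append((file_name, disk_group, state, label_to_path.get(label)))
    return devices


votedisk = """\
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
 1. ONLINE 36cf7fea1a204f02bf6828a53e4d924b (AFD:RACDISK1) [DATA]
 2. ONLINE edc3c70c29604fd8bfc65d3e67636b01 (AFD:RACDISK2) [DATA]
 3. ONLINE 68bb13dc938e4f16bf8aa3ec48b61545 (AFD:RACDISK3) [DATA]
Located 3 voting disk(s).
"""

afd_lsdsk = """\
Label Filtering Path
RACDISK1 ENABLED /dev/sda5
RACDISK2 ENABLED /dev/sda6
RACDISK3 ENABLED /dev/sda7
"""

for row in voting_disk_devices(votedisk, afd_lsdsk):
    print(row)
```

For this cluster the join gives /dev/sda5, /dev/sda6 and /dev/sda7, which are the devices targeted in the dd tests.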


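Now try to overwrite the disk that holds the OCR and a voting file. Because ASMFD filtering is in effect, it blocks the dd command and the I/O cannot complete: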


[root@station12 ~]# dd if=/dev/random of=/dev/sda5 bs=10M count=1 oflag=sync

dd: writing `/dev/sda5': Input/output error

0+1 records in

0+0 records out

0 bytes (0 B) copied, 0.00111785 s, 0.0 kB/s


Check the /var/log/messages log at this point for more detail:


Oct 4 15:48:22 station12 kernel: F 4312971.559/161004074822 oracle_26373_+a[26373] afdr_portal_close: PID 26373 (oracle_26373_+a) calling close w/o matching open

Oct 4 16:16:18 station12 kernel: F 4314647.115/161004081618 dd[5221] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=5) not supported i=1 start=16128 seccnt=4 pstart=16128 pend=10088820

Oct 4 16:16:18 station12 kernel: Buffer I/O error on device sda5, logical block 0

Oct 4 16:16:18 station12 kernel: lost page write due to I/O error on sda5


Try again on a different disk, this time writing from /dev/zero:


[root@station12 ~]# dd if=/dev/zero of=/dev/sda15 bs=10M count=1 oflag=sync

dd: writing `/dev/sda15': Input/output error

0+1 records in

0+0 records out

0 bytes (0 B) copied, 0.0645973 s, 0.0 kB/s


Check the /var/log/messages log at this point for more detail:


Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764082 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764086 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764090 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764094 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764098 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764102 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764106 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764110 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764114 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764118 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764122 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764126 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764130 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764134 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764138 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764142 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764146 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764150 seccnt=4 pstart=100743678 pend=110816370

Oct 4 16:21:27 station12 kernel: F 4314956.281/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764154 seccnt=4 pstart=100743678 pend=110816370
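
The rejection messages are repetitive, so it can help to summarize them per process and device. The sketch below counts afd_mkrequest_fn rejections in captured log text. It also includes one piece of arithmetic that rests on an assumption: if pstart and pend are the inclusive first and last 512-byte sectors of the partition, the span works out to 4918 MB, which matches the Total_MB column that lsdsk reports for these disks. The function names and the regular expression are hypothetical.

```python
import re
from collections import Counter

REJECT = re.compile(
    r"kernel: F \S+ (\S+)\[(\d+)\] afd_mkrequest_fn: write IO on ASM managed device "
    r"\(major=(\d+)/minor=(\d+)\) not supported.*?pstart=(\d+) pend=(\d+)"
)


def rejected_writes(log_text: str) -> Counter:
    """Count rejected writes keyed by (process, pid, major, minor)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = REJECT.search(line)
        if match:
            process, pid, major, minor = match.group(1, 2, 3, 4)
            counts[(process, int(pid), int(major), int(minor))] += 1
    return counts


def span_mb(pstart: int, pend: int, sector_bytes: int = 512) -> int:
    """Size in MB of the sector range, assuming inclusive bounds and 512-byte sectors."""
    return (pend - pstart + 1) * sector_bytes // 2**20


# Lines copied from the transcript; the first one is not a write rejection.
log = """\
Oct 4 15:48:22 station12 kernel: F 4312971.559/161004074822 oracle_26373_+a[26373] afdr_portal_close: PID 26373 (oracle_26373_+a) calling close w/o matching open
Oct 4 16:16:18 station12 kernel: F 4314647.115/161004081618 dd[5221] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=5) not supported i=1 start=16128 seccnt=4 pstart=16128 pend=10088820
Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764082 seccnt=4 pstart=100743678 pend=110816370
Oct 4 16:21:27 station12 kernel: F 4314956.280/161004082127 dd[7174] afd_mkrequest_fn: write IO on ASM managed device (major=8/minor=15) not supported i=11 start=100764086 seccnt=4 pstart=100743678 pend=110816370
"""

for key, count in rejected_writes(log).items():
    print(key, count)
print(span_mb(16128, 10088820), "MB")
```

Major 8 with minor 5 and minor 15 corresponds to /dev/sda5 and /dev/sda15, the two devices written to in the dd tests.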


5.4 Disabling the filter on the first node


[grid@station11 ~]$ asmcmd afd_filter -d

[grid@station11 ~]$ asmcmd afd_lsdsk

--------------------------------------------------------------------------------

Label Filtering Path

================================================================================

RACDISK1 DISABLED /dev/sda5

RACDISK2 DISABLED /dev/sda6

RACDISK3 DISABLED /dev/sda7

RACDISK4 DISABLED /dev/sda8

RACDISK5 DISABLED /dev/sda9

RACDISK6 DISABLED /dev/sda10

RACDISK7 DISABLED /dev/sda11

RACDISK8 DISABLED /dev/sda12

RACDISK9 DISABLED /dev/sda13

RACDISK10 DISABLED /dev/sda14

RACDISK11 DISABLED /dev/sda15


5.5 Checking the filter on the second node

On the second node, ASMFD filtering is still enabled:


[grid@station12 ~]$ asmcmd afd_lsdsk

--------------------------------------------------------------------------------

Label Filtering Path

================================================================================

RACDISK1 ENABLED /dev/sda5

RACDISK2 ENABLED /dev/sda6

RACDISK3 ENABLED /dev/sda7

RACDISK4 ENABLED /dev/sda8

RACDISK5 ENABLED /dev/sda9

RACDISK6 ENABLED /dev/sda10

RACDISK7 ENABLED /dev/sda11

RACDISK8 ENABLED /dev/sda12

RACDISK9 ENABLED /dev/sda13

RACDISK10 ENABLED /dev/sda14

RACDISK11 ENABLED /dev/sda15


5.6 Repeating the dd test without ASMFD protection: the cluster crashes

On the node where ASMFD filtering is still enabled, the attempt fails:


[root@station12 ~]# dd if=/dev/zero of=/dev/sda5 bs=10M count=1 oflag=sync & dd if=/dev/zero of=/dev/sda6 bs=10M count=1 oflag=sync & dd if=/dev/zero of=/dev/sda7 bs=10M count=1 oflag=sync &

[1] 3327

[2] 3328

[3] 3329

[root@station12 ~]# dd: writing `/dev/sda7': Input/output error

1+0 records in

0+0 records out

0 bytes (0 B) copied, 0.0827977 s, 0.0 kB/s

dd: writing `/dev/sda6': Input/output error

1+0 records in

0+0 records out

0 bytes (0 B) copied, 0.099264 s, 0.0 kB/s

dd: writing `/dev/sda5': Input/output error

1+0 records in

0+0 records out

0 bytes (0 B) copied, 0.110771 s, 0.0 kB/s


[1] Exit 1 dd if=/dev/zero of=/dev/sda5 bs=10M count=1 oflag=sync

[2]- Exit 1 dd if=/dev/zero of=/dev/sda6 bs=10M count=1 oflag=sync

[3]+ Exit 1 dd if=/dev/zero of=/dev/sda7 bs=10M count=1 oflag=sync
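
The evidence that nothing was written is the "0 bytes ... copied" summary that follows each error. A small sketch that pairs each dd error with the byte count reported after it, working on captured terminal text. The function name is hypothetical, the pattern matches the quoting style that dd uses in this transcript, and the pairing assumes that each job's error and summary lines are not interleaved with another job's.

```python
import re

ERROR = re.compile(r"dd: writing `([^']+)': (.+)")
SUMMARY = re.compile(r"(\d+) bytes \([^)]*\) copied")


def dd_results(terminal_text: str) -> list:
    """Return (device, bytes_copied) for each dd error followed by a summary line."""
    results = []
    device = None
    for line in terminal_text.splitlines():
        error = ERROR.search(line)
        if error:
            device = error.group(1)
            continue
        summary = SUMMARY.match(line.strip())
        if summary:
            results.append((device, int(summary.group(1))))
            device = None
    return results


# Text copied from the transcript of the three background dd jobs on station12.
captured = """\
[root@station12 ~]# dd: writing `/dev/sda7': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.0827977 s, 0.0 kB/s
dd: writing `/dev/sda6': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.099264 s, 0.0 kB/s
dd: writing `/dev/sda5': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.110771 s, 0.0 kB/s
"""

for device, copied in dd_results(captured):
    print(device, "bytes written:", copied)
```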


On the node where ASMFD filtering has been disabled, the writes succeed:


[root@station11 ~]# dd if=/dev/zero of=/dev/sda5 bs=10M count=1 oflag=sync & dd if=/dev/zero of=/dev/sda6 bs=10M count=1 oflag=sync & dd if=/dev/zero of=/dev/sda7 bs=10M count=1 oflag=sync &

[1] 21582

[2] 21583

[3] 21584

...

After this, station11 rebooted.


In fact the cluster has been destroyed and can no longer be started:


[root@station11 ~]# crs_stat -t

CRS-0184: Cannot communicate with the CRS daemon.


[root@station12 ~]# crs_stat -t

CRS-0184: Cannot communicate with the CRS daemon.



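Summary

ASMFD adds a kernel module named "oracleafd.ko" to the Linux operating system, where it acts much like an immune system. The module sits at a choke point in the I/O path, so ASM can reject writes to ASM disks from non-Oracle processes (even processes with root privileges) and thereby keep out intruders. Reconfiguring GI from the original ASM Lib driver to the more secure ASM Filter driver is a simple process and can be done in a rolling fashion. I believe this new feature deserves wide adoption in production environments.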

