Bo's Oracle Station

【博客文章2024】RAC 19c磁盘硬件严重故障处置系列二:ASM元数据、OCR和Voting Disk的恢复

2024-4-2 10:33| 发布者: admin| 查看: 112| 评论: 0|原作者: Bo Tang

摘要: 这是“RAC 19c磁盘硬件严重故障处置系列”的第二篇。博客详细介绍了集群ASM元数据、OCR和Voting Disk的恢复。包含:通过ASM元数据的恢复来实现磁盘组的重建、ASM的spfile重建、OCR的恢复和VotingDisk的恢复。对于第二个节点启动失败的bug有详细的解决方案介绍。最终,整个集群的nodeapps系统等底层部分都得到成功恢复。将通过之后的两篇博客来进行mgmgtdb的恢复和RAC数据库的恢复。
【博客文章2024】RAC 19c磁盘硬件严重故障处置系列二:ASM元数据、OCR和Voting Disk的恢复




Author: Bo Tang


    RAC 19c集群已经奔溃,详见这篇博客:https://www.botangdb.com/mytec/mytec_rac/202404/00900121.html通过本文和之后的两篇博客来进行集群的恢复。

1. ASM元数据的恢复:

    恢复ASM元数据前需要在所有节点侧底停止集群的残余进程:

[root@station11 ~]# . oraenv
ORACLE_SID = [root] ? +ASM1
The Oracle base has been set to /u01/app/grid
[root@station11 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'station11'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'station11'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'station11'
CRS-2673: Attempting to stop 'ora.evmd' on 'station11'
CRS-2673: Attempting to stop 'ora.crf' on 'station11'
CRS-2677: Stop of 'ora.cssdmonitor' on 'station11' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'station11'
CRS-2677: Stop of 'ora.mdnsd' on 'station11' succeeded
CRS-2677: Stop of 'ora.evmd' on 'station11' succeeded
CRS-2677: Stop of 'ora.crf' on 'station11' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'station11'
CRS-2677: Stop of 'ora.gpnpd' on 'station11' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'station11' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'station11' has completed
CRS-4133: Oracle High Availability Services has been stopped.


[root@station12 ~]# . oraenv
ORACLE_SID = [root] ? +ASM2
The Oracle base has been set to /u01/app/grid
[root@station12 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'station12'
CRS-2673: Attempting to stop 'ora.crf' on 'station12'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'station12'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'station12'
CRS-2673: Attempting to stop 'ora.evmd' on 'station12'
CRS-2677: Stop of 'ora.cssdmonitor' on 'station12' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'station12'
CRS-2677: Stop of 'ora.crf' on 'station12' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'station12'
CRS-2677: Stop of 'ora.mdnsd' on 'station12' succeeded
CRS-2677: Stop of 'ora.evmd' on 'station12' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'station12' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'station12' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'station12' has completed
CRS-4133: Oracle High Availability Services has been stopped.


    根据https://www.botangdb.com/mytec/mytec_rac/202404/00900121.html描述的环境的原来的LUN绑定规则,在任一节点上绑定ASM磁盘:

[root@station11 ~]#  ls -l /dev/sd*1
brw-rw---- 1 root disk 8,   1 Apr  3 11:01 /dev/sda1
brw-rw---- 1 root disk 8,  17 Apr  3 11:01 /dev/sdb1
brw-rw---- 1 root disk 8,  33 Apr  3 11:01 /dev/sdc1
brw-rw---- 1 root disk 8,  49 Apr  3 11:01 /dev/sdd1
brw-rw---- 1 root disk 8,  65 Apr  3 11:01 /dev/sde1
brw-rw---- 1 root disk 8,  81 Apr  3 11:01 /dev/sdf1
brw-rw---- 1 root disk 8,  97 Apr  3 11:01 /dev/sdg1
brw-rw---- 1 root disk 8, 113 Apr  3 11:01 /dev/sdh1
brw-rw---- 1 root disk 8, 129 Apr  3 11:01 /dev/sdi1
brw-rw---- 1 root disk 8, 145 Apr  3 11:01 /dev/sdj1
brw-rw---- 1 root disk 8, 161 Apr  3 11:01 /dev/sdk1
[root@station11 ~]#  oracleasm createdisk RACDISK01 /dev/sda1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]#  oracleasm createdisk RACDISK02 /dev/sdb1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]# oracleasm createdisk RACDISK03 /dev/sdc1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]#  oracleasm createdisk RACDISK04 /dev/sdd1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]# oracleasm createdisk RACDISK05 /dev/sde1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]# oracleasm createdisk RACDISK06 /dev/sdf1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]#  oracleasm createdisk RACDISK07 /dev/sdg1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]#  oracleasm createdisk RACDISK08 /dev/sdh1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]# oracleasm createdisk RACDISK09 /dev/sdi1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]# oracleasm createdisk RACDISK10 /dev/sdj1
Writing disk header: done
Instantiating disk: done
[root@station11 ~]#  oracleasm createdisk RACDISK11 /dev/sdk1
Writing disk header: done
Instantiating disk: done


    在所有节点上,重新扫描绑定后生成的ASM磁盘:

[root@station11 ~]#  oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...


[root@station12 ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "RACDISK02"
Instantiating disk "RACDISK01"
Instantiating disk "RACDISK03"
Instantiating disk "RACDISK04"
Instantiating disk "RACDISK05"
Instantiating disk "RACDISK06"
Instantiating disk "RACDISK07"
Instantiating disk "RACDISK08"
Instantiating disk "RACDISK09"
Instantiating disk "RACDISK10"
Instantiating disk "RACDISK11"


    现在需要在不启动标准CRS的情况下,只启动ASM实例来实施后续的操作:
    在任一节点上,使用exclusive模式启动crs。命令中的参数-excl代表独占模式,-nocrs代表忽略寻找crs的voting disk而启动crs:

[root@station11 ~]#  crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'station11'
CRS-2672: Attempting to start 'ora.mdnsd' on 'station11'
CRS-2676: Start of 'ora.mdnsd' on 'station11' succeeded
CRS-2676: Start of 'ora.evmd' on 'station11' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'station11'
CRS-2676: Start of 'ora.gpnpd' on 'station11' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'station11'
CRS-2672: Attempting to start 'ora.gipcd' on 'station11'
CRS-2676: Start of 'ora.cssdmonitor' on 'station11' succeeded
CRS-2676: Start of 'ora.gipcd' on 'station11' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'station11'
CRS-2672: Attempting to start 'ora.diskmon' on 'station11'
CRS-2676: Start of 'ora.diskmon' on 'station11' succeeded
CRS-2676: Start of 'ora.cssd' on 'station11' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'station11'
CRS-2672: Attempting to start 'ora.ctssd' on 'station11'
CRS-2676: Start of 'ora.ctssd' on 'station11' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'station11' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'station11'
CRS-2676: Start of 'ora.asm' on 'station11' succeeded


    现在ASM实例虽已启动,但是+data磁盘组和+fra磁盘组都是不存在的。接下来需要做的是设置asm_diskstring参数:

[grid@station11 ~]$ sqlplus /nolog

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Apr 3 09:19:56 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

SQL> conn / as sysasm
Connected.
SQL> show parameter disk

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                 string
asm_diskstring                 string
SQL>  alter system set asm_diskstring='/dev/oracleasm/disks/*';

System altered.

SQL> set linesize 1000
SQL> col path format a40 trunc
SQL>  select  name , path from v$asm_disk;

NAME                   PATH
------------------------------ ----------------------------------------
                   /dev/oracleasm/disks/RACDISK11
                   /dev/oracleasm/disks/RACDISK06
                   /dev/oracleasm/disks/RACDISK04
                   /dev/oracleasm/disks/RACDISK05
                   /dev/oracleasm/disks/RACDISK10
                   /dev/oracleasm/disks/RACDISK08
                   /dev/oracleasm/disks/RACDISK07
                   /dev/oracleasm/disks/RACDISK02
                   /dev/oracleasm/disks/RACDISK09
                   /dev/oracleasm/disks/RACDISK03
                   /dev/oracleasm/disks/RACDISK01

11 rows selected.


SQL> select  * from v$asm_diskgroup;

no rows selected

SQL> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0


    然后,使用元数据来恢复+data磁盘组:

[grid@station11 ~]$ asmcmd
ASMCMD>  md_restore /home/grid/data.txt --full -G data
Current Diskgroup metadata being restored: DATA
Diskgroup DATA created!
System template XTRANSPORT BACKUPSET modified!
System template TEMPFILE modified!
System template CONTROLFILE modified!
System template BACKUPSET modified!
System template AUTOBACKUP modified!
System template KEY_STORE modified!
System template DUMPSET modified!
System template DATAGUARDCONFIG modified!
System template XTRANSPORT modified!
System template FLASHFILE modified!
System template ARCHIVELOG modified!
System template OCRFILE modified!
System template ASMPARAMETERFILE modified!
System template CHANGETRACKING modified!
System template DATAFILE modified!
System template AUTOLOGIN_KEY_STORE modified!
System template AUDIT_SPILLFILES modified!
System template VOTINGFILE modified!
System template INCR XTRANSPORT BACKUPSET modified!
System template ONLINELOG modified!
System template PARAMETERFILE modified!
System template FLASHBACK modified!
Directory +DATA/cluster11 re-created!
Directory +DATA/ASM re-created!
Directory +DATA/_MGMTDB re-created!
Directory +DATA/_MGMTDB/86B638145FF37A6FE053F706E80A81F1 re-created!
Directory +DATA/_MGMTDB/86B638145FF37A6FE053F706E80A81F1/DATAFILE re-created!
Directory +DATA/_MGMTDB/12D2C496AC8DE99CE0630BFA19ACCB81 re-created!
Directory +DATA/_MGMTDB/12D2C496AC8DE99CE0630BFA19ACCB81/TEMPFILE re-created!
Directory +DATA/_MGMTDB/12D2E2161B7F1288E0630BFA19ACB8F7 re-created!
Directory +DATA/_MGMTDB/12D2E2161B7F1288E0630BFA19ACB8F7/DATAFILE re-created!
Directory +DATA/_MGMTDB/12D2E2161B7F1288E0630BFA19ACB8F7/TEMPFILE re-created!
Directory +DATA/RACORCL re-created!
ASMCMD> ls
DATA/
ASMCMD> du data
Used_MB      Mirror_used_MB
      0                   0

    恢复+data磁盘组后,asmcmd显示+data磁盘组是空的。
    接着使用元数据来恢复+fra磁盘组:

ASMCMD>  md_restore /home/grid/fra.txt --full -G fra
Current Diskgroup metadata being restored: FRA
Diskgroup FRA created!
System template FLASHBACK modified!
System template ARCHIVELOG modified!
System template FLASHFILE modified!
System template XTRANSPORT modified!
System template PARAMETERFILE modified!
System template DATAGUARDCONFIG modified!
System template KEY_STORE modified!
System template BACKUPSET modified!
System template AUTOBACKUP modified!
System template AUDIT_SPILLFILES modified!
System template OCRFILE modified!
System template CHANGETRACKING modified!
System template XTRANSPORT BACKUPSET modified!
System template AUTOLOGIN_KEY_STORE modified!
System template ASMPARAMETERFILE modified!
System template DATAFILE modified!
System template DUMPSET modified!
System template CONTROLFILE modified!
System template INCR XTRANSPORT BACKUPSET modified!
System template TEMPFILE modified!
System template VOTINGFILE modified!
System template ONLINELOG modified!
ORA-15032: not all alterations performed
ORA-15173: entry 'RACORCL' does not exist in directory '/' (DBD ERROR: OCIStmtExecute)
ASMCMD-9354: ADD ALIAS failed
ORA-15032: not all alterations performed
ORA-15173: entry RACORCL does not exist in directory / (DBD ERROR: OCIStmtExecute)
ASMCMD> ls
DATA/
FRA/
ASMCMD> du fra
Used_MB      Mirror_used_MB
      0                   0


    恢复+fra磁盘组后,asmcmd显示+fra磁盘组是空的。忽略ORA-15173这类entry找不到的错误。
    因为ASM实例的spfile文件存在于asm磁盘组中,所以该spfile也随着磁盘损坏而丢失,需要通过pfile为中间体来重建该spfile:

[grid@station11 ~]$ sqlplus /nolog

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Apr 6 03:34:18 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

SQL> conn / as sysasm
Connected.
SQL> show parameter disk  

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                 string     DATA, FRA
asm_diskstring                 string     /dev/oracleasm/disks/*
SQL> create pfile from memory;

File created.


    生成的pfile文件如下:

[grid@station11 dbs]$ pwd
/u01/app/19.3.0/grid/dbs
[grid@station11 dbs]$ cat init+ASM1.ora
# Oracle init.ora parameter file generated by instance +ASM1 on 04/06/2024 03:35:52
*.__large_pool_size=8M
*.__oracle_base='/u01/app/grid' # ORACLE_BASE set from environment
*._always_anti_join='CHOOSE'
*._always_semi_join='CHOOSE'
*._b_tree_bitmap_plans=TRUE
*._bloom_serial_filter='ON'
*._complex_view_merging=TRUE
*._compression_compatibility='19.0.0'
*._diag_adr_trace_dest='/u01/app/grid/diag/asm/+asm/+ASM1/trace'
*._ds_xt_split_count=1
*._dummy_instance=TRUE
*._eliminate_common_subexpr=TRUE
*._fast_full_scan_enabled=TRUE
*._generalized_pruning_enabled=TRUE
*._gs_anti_semi_join_allowed=TRUE
*._improved_outerjoin_card=TRUE
*._improved_row_length_enabled=TRUE
*._index_join_enabled=TRUE
*._key_vector_create_pushdown_threshold=20000
*._ksb_restart_policy_times='0'
*._ksb_restart_policy_times='60'
*._ksb_restart_policy_times='120'
*._ksb_restart_policy_times='240' # internal update to set default
*._left_nested_loops_random=TRUE
*._mv_access_compute_fresh_data='ON'
*._new_initial_join_orders=TRUE
*._new_sort_cost_estimate=TRUE
*._nlj_batching_enabled=1
*._optim_enhance_nnull_detection=TRUE
*._optim_peek_user_binds=TRUE
*._optimizer_ads_use_partial_results=TRUE
*._optimizer_better_inlist_costing='ALL'
*._optimizer_cbqt_or_expansion='ON'
*._optimizer_cluster_by_rowid_control=129
*._optimizer_control_shard_qry_processing=65528
*._optimizer_cost_based_transformation='LINEAR'
*._optimizer_cost_model='CHOOSE'
*._optimizer_extended_cursor_sharing='UDO'
*._optimizer_extended_cursor_sharing_rel='SIMPLE'
*._optimizer_extended_stats_usage_control=192
*._optimizer_join_order_control=3
*._optimizer_max_permutations=2000
*._optimizer_mode_force=TRUE
*._optimizer_native_full_outer_join='FORCE'
*._optimizer_or_expansion='DEPTH'
*._optimizer_proc_rate_level='BASIC'
*._optimizer_system_stats_usage=TRUE
*._optimizer_try_st_before_jppd=TRUE
*._optimizer_use_cbqt_star_transformation=TRUE
*._or_expand_nvl_predicate=TRUE
*._ordered_nested_loop=TRUE
*._parallel_broadcast_enabled=TRUE
*._pivot_implementation_method='CHOOSE'
*._pred_move_around=TRUE
*._push_join_predicate=TRUE
*._push_join_union_view=TRUE
*._push_join_union_view2=TRUE
*._px_dist_agg_partial_rollup_pushdown='ADAPTIVE'
*._px_groupby_pushdown='FORCE'
*._px_partial_rollup_pushdown='ADAPTIVE'
*._px_shared_hash_join=FALSE
*._px_wif_dfo_declumping='CHOOSE'
*._sql_model_unfold_forloops='RUN_TIME'
*._subquery_pruning_mv_enabled=FALSE
*._table_scan_cost_plus_one=TRUE
*._union_rewrite_for_gs='YES_GSET_MVS'
*._unnest_subquery=TRUE
*._use_column_stats_for_function=TRUE
*._xt_sampling_scan_granules='ON'
*.asm_diskgroups='DATA'
*.asm_diskgroups='FRA' # Manual Mount
*.asm_diskstring='/dev/oracleasm/disks/*'
*.connection_brokers='((TYPE=DEDICATED)(BROKERS=1))'
*.connection_brokers='((TYPE=EMON)(BROKERS=1))' # connection_brokers default value
*.core_dump_dest='/u01/app/grid/diag/asm/+asm/+ASM1/cdump'
*.cpu_count=16

    由于pfile中缺乏其他节点的内容,需要修改成如下这样:

[grid@station11 dbs]$ cat init+ASM1.ora
*.asm_diskgroups='DATA'
*.asm_diskgroups='FRA' # Manual Mount
*.asm_diskstring='/dev/oracleasm/disks/*'
*.connection_brokers='((TYPE=DEDICATED)(BROKERS=1))'
*.connection_brokers='((TYPE=EMON)(BROKERS=1))' # connection_brokers default value
+ASM1.core_dump_dest='/u01/app/grid/diag/asm/+asm/+ASM1/cdump'
+ASM2.core_dump_dest='/u01/app/grid/diag/asm/+asm/+ASM2/cdump'

*.cpu_count=16


    将pfile转换成spfile,并顺便找出它的路径,该路径用于后续登记ora.asm资源:

[grid@station11 dbs]$ sqlplus /nolog

SQL*Plus: Release 19.0.0.0.0 - Production on Sun Apr 7 11:59:45 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

SQL> conn / as sysasm
Connected.
SQL> create spfile='+data' from pfile;

File created.
SQL> select  GROUP_NUMBER, FILE_NUMBER  from  v$asm_file;

GROUP_NUMBER FILE_NUMBER
------------ -----------
           1         253

SQL>  set linesize 1000
SQL>  col name format a40 trunc
SQL>  select  GROUP_NUMBER, FILE_NUMBER, name from v$asm_alias where GROUP_NUMBER=1 and FILE_NUMBER=253;

GROUP_NUMBER FILE_NUMBER NAME
------------ ----------- ----------------------------------------
           1         253 REGISTRY.253.1165665607


    asmcmd可以找到路径:

[grid@station11 dbs]$ asmcmd
ASMCMD> ls
DATA/
FRA/
ASMCMD> cd data
ASMCMD> ls
ASM/
RACORCL/
_MGMTDB/
cluster11/
ASMCMD> cd cluster11/
ASMCMD> ls
ASMPARAMETERFILE/
ASMCMD> cd ASMPARAMETERFILE/
ASMCMD> ls
REGISTRY.253.1165665607
ASMCMD> pwd
+data/cluster11/ASMPARAMETERFILE


2. OCR恢复:

[root@station11 ~]# ocrconfig -restore /home/grid/backup_20240405_075443.ocr.286.1165478083

3. VotingDisk的恢复:

[root@station11 ~]# crsctl replace votedisk +data
Successful addition of voting disk 07b43590a6af4fb7bf00cad3288a0e70.
Successful addition of voting disk 85ca63c3919a4ffcbf56aad29f9b4792.
Successful addition of voting disk 2d6e86670ebd4f58bfdc437d2a34abd2.
Successfully replaced voting disk group with +data.
CRS-4266: Voting file(s) successfully replaced


4. 关闭exclusive模式,并在任一节点以标准方式启动crs:

[root@station11 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'station11'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'station11'
CRS-2673: Attempting to stop 'ora.ctssd' on 'station11'
CRS-2673: Attempting to stop 'ora.evmd' on 'station11'
CRS-2673: Attempting to stop 'ora.asm' on 'station11'
CRS-2677: Stop of 'ora.mdnsd' on 'station11' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'station11' succeeded
CRS-2677: Stop of 'ora.evmd' on 'station11' succeeded
CRS-2677: Stop of 'ora.asm' on 'station11' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'station11'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'station11' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'station11'
CRS-2677: Stop of 'ora.cssd' on 'station11' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'station11'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'station11'
CRS-2677: Stop of 'ora.gpnpd' on 'station11' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'station11' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'station11' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[root@station11 ~]# crsctl start crs 

CRS-4123: Oracle High Availability Services has been started.


5. 修改ora.asm资源的spfile信息:

    把本文第1节中找到的ASM的spfile的路径,设置进ora.asm资源:

[root@station11 ~]# srvctl modify asm -spfile '+data/cluster11/ASMPARAMETERFILE/REGISTRY.253.1165665607'
PRKO-2207 : Warning:--spfile option has been deprecated and will be ignored.
[root@station11 ~]# srvctl config asm
ASM home:

Password file: +DATA/orapwASM
Backup of Password file: +DATA/orapwASM_backup
ASM listener: LISTENER
ASM instance count: 3
Cluster ASM listener: ASMNET1LSNR_ASM



[grid@station11 ~]$ sqlplus /nolog

SQL*Plus: Release 19.0.0.0.0 - Production on Sun Apr 7 13:12:06 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

SQL> conn / as sysasm
Connected.
SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DATA/cluster11/ASMPARAMETERFI
                                                 LE/registry.253.1165665607



6. 一个节点已经修复:


[grid@station11 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       station11                STABLE
ora.chad
               ONLINE  ONLINE       station11                STABLE
ora.net1.network
               ONLINE  ONLINE       station11                STABLE
ora.ons
               ONLINE  ONLINE       station11                STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.FRA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       station11                STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       station11                STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       station11                STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       station11                169.254.14.179 172.3
                                                             1.118.11 172.31.118.
                                                             211,STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       station11                Started,STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       station11                STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               Corrupted Controlfil
                                                             e,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       station11                STABLE
ora.racorcl.db
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        ONLINE  OFFLINE                               STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       station11                STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       station11                STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       station11                STABLE
ora.station11.vip
      1        ONLINE  ONLINE       station11                STABLE
ora.station12.vip
      1        ONLINE  INTERMEDIATE station11                FAILED OVER,STABLE
--------------------------------------------------------------------------------

7. 其他节点的修复:

    尝试标准方式启动其他节点上的crs:

[root@station12 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.


    由于可能我们的环境是RHEL8,也可能RAC 19c (19.3.0.0.0)storage上未知的bug,crs一直等待ASM的启动,会陷入死循环:

[root@station12 ~]# crsctl status res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.


    解决办法就是在其他节点上手动启动ASM(该其他节点如果需要,可以把启动过程自行写进Linux的anacron,以保证操作系统启动后自动启动ASM):

[grid@station12 ~]$ sqlplus /nolog


SQL*Plus: Release 19.0.0.0.0 - Production on Sun Apr 7 13:22:25 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

SQL> conn / as sysasm
Connected to an idle instance.
SQL> startup
ASM instance started

Total System Global Area 1351073632 bytes
Fixed Size                  8896352 bytes
Variable Size            1308622848 bytes
ASM Cache                  33554432 bytes
ASM diskgroups mounted


    其他节点也修复成功:

[grid@station12 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       station11                STABLE
               ONLINE  ONLINE       station12                STABLE
ora.chad
               ONLINE  ONLINE       station11                STABLE
               ONLINE  ONLINE       station12                STABLE
ora.net1.network
               ONLINE  ONLINE       station11                STABLE
               ONLINE  ONLINE       station12                STABLE
ora.ons
               ONLINE  ONLINE       station11                STABLE
               ONLINE  ONLINE       station12                STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        ONLINE  ONLINE       station12                STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        ONLINE  ONLINE       station12                STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.FRA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        ONLINE  ONLINE       station12                STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       station12                STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       station11                STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       station11                STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       station12                169.254.1.204 172.31
                                                             .118.12 172.31.118.2
                                                             12,STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       station11                Started,STABLE
      2        ONLINE  ONLINE       station12                Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       station11                STABLE
      2        ONLINE  ONLINE       station12                STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       station11                STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.qosmserver
      1        ONLINE  ONLINE       station11                STABLE
ora.racorcl.db
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.scan1.vip
      1        ONLINE  ONLINE       station12                STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       station11                STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       station11                STABLE
ora.station11.vip
      1        ONLINE  ONLINE       station11                STABLE
ora.station12.vip
      1        ONLINE  ONLINE       station12                STABLE
--------------------------------------------------------------------------------

    将通过之后的两篇博客来进行mgmgtdb的恢复和RAC数据库的恢复。





路过

雷人

握手

鲜花

鸡蛋

QQ|手机版|Bo's Oracle Station   

GMT+8, 2024-4-6 21:51 , Processed in 0.031563 second(s), 21 queries .

返回顶部