How To Continue A GI Patch In EXACC When It Has Failed In Node 1

This week we had an issue while patching an EXACC Grid Infrastructure from 19.10 to 19.11. The patching died while running the postpatch step (rootcrs.sh -postpatch) on node 1, and it left the cluster in a strange “ROLLING PATCH” mode, as node 1 had not finished the 19.11 patching and node 2 was still running 19.10.

Below is the command that was executed to initially patch the EXACC GI, followed by the last entry in the log, which is where it died.

[root@hostname1 ~]# dbaascli patch db apply --patchid 32545008-GI --dbnames grid
...
2021-09-23 10:15:32.233368 - INFO: Running /u01/app/19.0.0.0/grid/crs/install/rootcrs.sh -postpatch
2021-09-23 10:15:32.233539 - Output from cmd /u01/app/19.0.0.0/grid/crs/install/rootcrs.sh -postpatch run on localhost  is:

The GUI console, however, reported that the cluster had been patched and was at version 19.11.

But from the command line on both nodes, you could see that the patch had not finished.

[grid@hostname1 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
32847378;OCW Interim patch for 32847378
32585572;DBWLM RELEASE UPDATE 19.0.0.0.0 (32585572)
32584670;TOMCAT RELEASE UPDATE 19.0.0.0.0 (32584670)
32576499;ACFS RELEASE UPDATE 19.11.0.0.0 (32576499)
32545013;Database Release Update : 19.11.0.0.210420 (32545013)

OPatch succeeded.

[grid@hostname2 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
32240590;TOMCAT RELEASE UPDATE 19.0.0.0.0 (32240590)
32222571;OCW Interim patch for 32222571
32218663;ACFS RELEASE UPDATE 19.10.0.0.0 (32218663)
32218454;Database Release Update : 19.10.0.0.210119 (32218454)
29340594;DBWLM RELEASE UPDATE 19.0.0.0.0 (29340594)

OPatch succeeded.
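A quick way to spot this kind of mismatch is to compare just the patch IDs from the opatch lspatches output of each node. The sketch below is a minimal, hypothetical helper: it embeds the two outputs shown above as sample text (in practice you would capture them live, e.g. over ssh); the variable names and parsing are my own assumptions based on the lspatches format.

```shell
#!/bin/sh
# Sketch: compare the patch IDs reported by "opatch lspatches" on two nodes.
# The output is embedded here as sample text; live you could capture it with
# something like: ssh grid@hostname1 '$ORACLE_HOME/OPatch/opatch lspatches'

node1_lspatches='32847378;OCW Interim patch for 32847378
32585572;DBWLM RELEASE UPDATE 19.0.0.0.0 (32585572)
32584670;TOMCAT RELEASE UPDATE 19.0.0.0.0 (32584670)
32576499;ACFS RELEASE UPDATE 19.11.0.0.0 (32576499)
32545013;Database Release Update : 19.11.0.0.210420 (32545013)'

node2_lspatches='32240590;TOMCAT RELEASE UPDATE 19.0.0.0.0 (32240590)
32222571;OCW Interim patch for 32222571
32218663;ACFS RELEASE UPDATE 19.10.0.0.0 (32218663)
32218454;Database Release Update : 19.10.0.0.210119 (32218454)
29340594;DBWLM RELEASE UPDATE 19.0.0.0.0 (29340594)'

# Keep only the patch ID before the ";" and sort, so the lists are comparable.
ids1=$(printf '%s\n' "$node1_lspatches" | cut -d';' -f1 | sort)
ids2=$(printf '%s\n' "$node2_lspatches" | cut -d';' -f1 | sort)

if [ "$ids1" = "$ids2" ]; then
    echo "patch lists match between nodes"
else
    echo "patch lists DIFFER between nodes"
fi
```

With the sample data above, the script reports that the patch lists differ, which is exactly the half-patched state we were in.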

The next thing to try was to relaunch dbaascli, but it failed because it detected that a patching operation was already in progress.

[root@hostname1 ~]# dbaascli patch db apply --patchid 32545008-GI --dbnames grid
...
The current operation apply_async is blocked on node hostname1 due the following error: The current operation cannot proceed due a previous ongoing patching operation was detected

The fix for this issue is to run the postpatch as root on node 1.

[root@hostname1 ~]# /u01/app/19.0.0.0/grid/crs/install/rootcrs.sh -postpatch

Now that the postpatch had finished, the next step was to verify the stack, the patch version, and the patch status.

[grid@hostname1 ~]$ crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [1944883066].

[grid@hostname1 ~]$ crsctl query crs releasepatch
Oracle Clusterware release patch level is [1988519045] and the complete list of patches [32545013 32576499 32584670 32585572 32847378 ] have been applied on the local node. The release patch string is [19.11.0.0.0].

[grid@hostname1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
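The key field to watch in the activeversion -f output is the cluster upgrade state, which stays at ROLLING PATCH until every node is on the new level. The sketch below is a hypothetical helper that parses that field out of the output; the sample line is copied from the transcript above, and the sed expression is my own assumption about the output format (live, you would feed it the real output of crsctl query crs activeversion -f).

```shell
#!/bin/sh
# Sketch: extract the cluster upgrade state from the output of
# "crsctl query crs activeversion -f". Sample line taken from the node 1
# output above; live you would capture the command's real output instead.

activeversion='Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [1944883066].'

# Pull out the text between the brackets after "cluster upgrade state is".
state=$(printf '%s\n' "$activeversion" | sed -n 's/.*cluster upgrade state is \[\([^]]*\)\].*/\1/p')

echo "cluster upgrade state: $state"
if [ "$state" = "NORMAL" ]; then
    echo "patching is complete on all nodes"
else
    echo "still in $state state - finish patching the remaining node(s)"
fi
```

At this point in our case the state was still ROLLING PATCH, confirming that node 2 remained to be patched.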

Now on node 2 we ran the following command to finish the patching. As you can see, it is slightly different from what we ran initially, as it specifies the node and Grid home via --instance1.

[root@hostname2 ~]# dbaascli patch db apply --patchid 32545008-GI --instance1 hostname2:/u01/app/19.0.0.0/grid --dbnames grid

Once the dbaascli command had finished, both nodes showed the same patch version and the cluster state was NORMAL.

[grid@hostname1 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
32847378;OCW Interim patch for 32847378
32585572;DBWLM RELEASE UPDATE 19.0.0.0.0 (32585572)
32584670;TOMCAT RELEASE UPDATE 19.0.0.0.0 (32584670)
32576499;ACFS RELEASE UPDATE 19.11.0.0.0 (32576499)
32545013;Database Release Update : 19.11.0.0.210420 (32545013)

OPatch succeeded.

[grid@hostname2 ~]$ $ORACLE_HOME/OPatch/opatch lspatches
32847378;OCW Interim patch for 32847378
32585572;DBWLM RELEASE UPDATE 19.0.0.0.0 (32585572)
32584670;TOMCAT RELEASE UPDATE 19.0.0.0.0 (32584670)
32576499;ACFS RELEASE UPDATE 19.11.0.0.0 (32576499)
32545013;Database Release Update : 19.11.0.0.210420 (32545013)

OPatch succeeded.

[grid@hostname2 ~]$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [19.0.0.0.0]

[grid@hostname2 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
 
[grid@hostname2 ~]$ crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [1988519045]. 
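To be confident the rolling patch is fully complete, you want two things to hold: the upgrade state is NORMAL, and the cluster active patch level matches the local release patch level. The sketch below is a hypothetical check built on the sample lines from the crsctl output above; the parsing is my own assumption (live, you would capture the output of crsctl query crs activeversion -f and crsctl query crs releasepatch).

```shell
#!/bin/sh
# Sketch: confirm a rolling patch is complete by checking that the cluster
# active patch level equals the local release patch level and that the
# upgrade state is NORMAL. Sample lines taken from the crsctl output above.

activeversion='Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [1988519045].'
releasepatch='Oracle Clusterware release patch level is [1988519045] and the complete list of patches [32545013 32576499 32584670 32585572 32847378 ] have been applied on the local node. The release patch string is [19.11.0.0.0].'

active_level=$(printf '%s\n' "$activeversion" | sed -n 's/.*active patch level is \[\([0-9]*\)\].*/\1/p')
release_level=$(printf '%s\n' "$releasepatch" | sed -n 's/.*release patch level is \[\([0-9]*\)\].*/\1/p')
state=$(printf '%s\n' "$activeversion" | sed -n 's/.*upgrade state is \[\([^]]*\)\].*/\1/p')

if [ "$state" = "NORMAL" ] && [ "$active_level" = "$release_level" ]; then
    echo "cluster fully patched at level $active_level"
else
    echo "patching not complete: state=$state active=$active_level release=$release_level"
fi
```

With the sample lines above, both conditions hold, which matches the NORMAL state we saw once node 2 was done.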

Hope this blog helps you out should you ever be in a patching situation where the patching died on node 1 and left one node unpatched.

Rene Antunez