The curious case of the datafile that “belonged” to 2 databases

The curious case of the datafile that “belonged” to 2 databases

First I want to start disclosing that I have an idea on how this happened, but haven’t been able to recreate this issue, also that I was lucky enough that the tablespace that had the issue was a DUMMY tablespace so it didn’t cause an issue.

So now that the above disclosure is done, what happened is that I have a SOURCE database with 1 location for it’s datafiles +DATA1/SOURCE, but at some point someone added a second location for testing for a TEST1 tablespace in +DATA2/TEST, this SOURCE database was duplicated via RMAN to 2 databases, call them DUP1 and DUP2. In the RMAN duplicate command , the parameter DB_FILE_NAME_CONVERT was used to change the location of the datafiles of the SOURCE database to the DUP1 or DUP2, but it was only input for the known location, not for the testing location.

What it was input in the RMAN command:

DB_FILE_NAME_CONVERT '+DATA1/SOURCE','DATA1/DUP1'

How it should have been:

DB_FILE_NAME_CONVERT '+DATA1/SOURCE','+DATA1/DUP1','+DATA2/TEST','+DATA1/DUP1'

And because the diskgroup and directory of  +DATA2/TEST exist where the DUP1 and DUP2 would reside, it didn’t cause an issue when it ran the duplicate, and this is something that is still yet unknown to me, the duplicate finished successfully for both databases and no issues arose from the open resetlogs.

As you might already know when you issue the DUPLICATE command the DBID is implicitly changed , and what this does is it updates the controlfile with the new DBID and also the datafile headers , that way is how it will know that these datafiles XX belong to database NN.

So when both databases where up and running, I ran a backup from both databases, DUP1 finished successfully, DUP2 failed with the following error:

RMAN-06169: could not read file header for datafile 148 error reason 9
RMAN-06056: could not access datafile 148

One of the first things I did was run a VALIDATE DATABASE, and indeed it was showing me the same error:

RMAN> VALIDATE DATABASE;

Starting validate at 29-NOV-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=145 instance=DUP21 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=164 instance=DUP21 device type=DISK
RMAN-06169: could not read file header for datafile 148 error reason 9
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of validate command at 11/29/2012 01:40:41
RMAN-06056: could not access datafile 148

The next thing I did was to verify what that file was and the DBID of the DUP2 database, as the error of the header led me that way

DUP21 >select DBID,open_mode from v$database;

DBID OPEN_MODE
---------- ------------------------------------------------------------
2211912429 READ WRITE

DUP21 >select file_name from dba_data_files where TABLESPACE_NAME='TEST1';

FILE_NAME
--------------------------------------------------------------------------------
+DATA2/test/test01.dbf


DUP21 >select STATUS from dba_tablespaces where TABLESPACE_NAME='TEST1';

STATUS
---------------------------
ONLINE

Because RMAN error 06169 was telling me that it couldn’t read the header, I did a dump of the header

DUP21 >oradebug setmypid

Statement processed.

DUP21 >oradebug unlimit

Statement processed.

DUP21 >oradebug dump file_hdrs 3

Statement processed.

DUP21 >oradebug tracefile_name

And indeed saw that the DBID of test01.dbf belonged to DUP1, which had been the first of the 2 databases duplicated

V10 STYLE FILE HEADER:
Compatibility Vsn = 186646528=0xb200000
Db ID=4005607769=0xeec0b959, Db Name='DUP1'

So just to be sure, I went to DUP1 and verified, and yes it was the same DBID that the header dump told me ,and yes the tablepsace was online as well in DUP1

DUP11 >select DBID,open_mode from v$database;

DBID OPEN_MODE
---------- ------------------------------------------------------------
4005607769 READ WRITE

DUP11 >select STATUS from dba_tablespaces where TABLESPACE_NAME='TEST1';

STATUS
---------------------------
ONLINE

DUP11 >select file_name from dba_data_files where TABLESPACE_NAME='TEST1';

FILE_NAME
--------------------------------------------------------------------------------
+DATA2/test/test01.dbf

DUP11 >select STATUS from dba_tablespaces where TABLESPACE_NAME='TEST1';

STATUS
---------------------------
ONLINE

Because that particular tablespace was of no use for the application, I just went ahead and dropped it

DUP21 >ALTER TABLESPACE TEST1 OFFLINE IMMEDIATE;

Tablespace altered.

DUP21 >DROP TABLESPACE TEST1;

Tablespace dropped.

Takeways

I, have to say it’s in 11.2.0.3 with a GI of 11.2.0.3 in RHEL 5 x86_64, but it goes to say that whenever you do a duplicate and you use the DB_FILE_NAME_CONVERT , be sure that all of your locations are covered . But what is really bugging me is how DUP2 was able to open resetlogs without this being an issue, I will try to recreate this and update this post if and when I do.

Rene Antunez
[email protected]
No Comments

Sorry, the comment form is closed at this time.