Simplifying oracle Concepts: Block Corruption and how to identify corrupted blocks

Block corruption is while the data is being written to the data blocks, if the write to the block fails abruptly, which means there is a partial write in the block, may be because of power disruption or I/O problem, leaving no time for header to be updated, or row data to be populated, oracle leaves the block corrupt.In case of block corruption you can normally use the database unless you try to read that particular block, against which it shoots up the block corruption error.Generally block corruption occurs if write fails on the block, when the transaction is being committed. You can find detail information about block corruption in alert.log file
block corruption can happens at

1)Physical Level corruption (which means media corrupt)
2)Logical Level corruption (which means soft corrupt)

Physical corruption can be caused by defected memory boards, controllers or broken sectors on a hard disk;

Logical corruption can among other reasons be caused by an attempt to recover through a NOLOGGING action.

To Differentiate between both the corruption levels (logical and physical)

Logical corruption is header – footer – that is one of the checks, yes (it is looking for fractured blocks and when it hits one, it’ll re-read it, that is why there is no need for “alter tablespace begin backup” with rman)
Physical corruption is “we cannot read the block from disk, something is physically preventing us from doing so”

The methods to detect Block corruptions are

1) DBVerify utility
2) Block checking parameters (DB_BLOCK_CHCEKSUM) – In Oracle 10g db_block_checksum value TYPICAL which should be TRUE and db_block_checking value FULL should be TRUE.
3) ANALYZE TABLE tablename VALIDATE STRUCTURE CASCADE SQL statement
4) RMAN BACKUP command with THE VALIDATE option.

SQL> DESC V$DATABASE_BLOCK_CORRUPTION;

SQL> SELECT OBJECT_NAME, BLOCK_ID, CORRUPT_TYPE, MARKED_CORRUPT,
CORRUPT_DESCRIPTION, REPAIR_DESCRIPTION
FROM REPAIR_TABLE WHERE BLOCK_ID=**;

1.DBVERIFY -

The primary tool for checking for corruption in an Oracle database is DBVERIFY. It can be used to perform a physical data structure integrity check on data files whether the database is online or offline. The big benefit of this is that DBVERIFY can be used to check backup data files without adding load to the database server. You invoke DBVERIFY from the operating system command line like this:
$ dbv file=data01.dbf logfile=verify.log blocksize=8192 feedback=100
In this example data01.dbf is the data file to check, and the tablespace this file belongs to has a block size of 8192 bytes. The feedback parameter tells DBVERIFY to draw a period on the screen after every 100 pages (blocks) of the file are verified.
In the log file you’ll see output like this:

DBVERIFY – Verification starting : FILE = data01.dbf

DBVERIFY – Verification complete

Total Pages Examined : 640
Total Pages Processed (Data) : 631
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 9
Total Pages Empty : 0
Total Pages Marked Corrupt : 0
Total Pages Influx : 0

The Total Pages Failing values show the number of blocks that failed either the data block or index block checking routine. The Total Pages Marked Corrupt figure shows the number of blocks for which the cache header is invalid, thereby making it impossible for DBVERIFY to identify the block type. And the Total Pages Influx is the number of blocks for which DBVERIFY could not get a consistent image. (This could happen if the database is open when DBVERIFY is run. DBVERIFY reads blocks multiple times to try to get a consistent image, but DBVERIFY cannot get a consistent image of pages that are in flux.)
If you want to verify only a portion of a data file, you can specify a starting and ending block when running DBVERIFY. If you want to verify the entire database, you can generate a short shell script to run DBVERIFY on every data file in the database. You can do this easily using SQL*Plus:

SQL> SPOOL dbv_on_all_files.sh
SQL> SELECT ‘dbv file=’ || file_name ||
2 ‘ logfile=file’ || ROWNUM ||
3 ‘.log blocksize=8192′
4 FROM dba_data_files;
SQL> SPOOL OFF

After running the shell script you can quickly scan all of the DBVERIFY log files with Unix commands like:

$ grep Failing file*.log
$ grep Corrupt file*.log
$ grep Influx file*.log
You can also use DBVERIFY to validate a single data or index segment. To do this you must be logged onto the database with SYSDBA privileges. During the verification the segment is locked; if the segment is an index then the parent table is also locked.
There are other ways to check for database corruption besides DBVERIFY. You can take a full database export, with the dump file optionally specified as a null device. This will read every row in every user table in the database, discovering any corrupted data blocks along the way. However, this technique does not access every index entry or the entire data dictionary.

If you want to check one table and all of its indexes, you can use the ANALYZE statement to read every row of the table, read every entry in each of the table’s indexes, and make sure the table and index data are consistent with each other:

SQL> ANALYZE TABLE emp VALIDATE STRUCTURE CASCADE;

This will lock the table, preventing DML on the table, unless you specify the ONLINE keyword. Online validation reduces the amount of validation performed to allow for concurrency.

There are several ways to check for corruption in an Oracle database, but the DBVERIFY tool is the most versatile. DBVERIFY does not limit concurrency or DML while it is running, and it can be run against a database backup. Just remember that if DBVERIFY detects corruption in your database and you are planning to recover the corrupt file from a backup, you should perform a DBVERIFY validation on the backup file before beginning the recovery. This will tell you if the corruption exists in the backup also.

DB_BLOCK_CHECKSUM (default is true in 9i, false in 8i) determines whether DBWn and the direct loader will calculate a checksum (a number calculated from all the bytes stored in the block) and store it in the cache header of every data block when writing it to disk. Checksums are verified when a block is read-only if this parameter is true and the last write of the block stored a checksum. In addition, Oracle gives every log block a checksum before writing it to the current log.

If this parameter is set to false, DBWn calculates checksums only for the SYSTEM tablespace, but not for user tablespaces.

Checksums allow Oracle to detect corruption caused by underlying disks, storage systems, or I/O systems. Turning on this feature typically causes only an additional 1% to 2% overhead. Therefore, Oracle Corporation recommends that you set DB_BLOCK_CHECKSUM to true.

DB_BLOCK_CHECKING (default value is false) controls whether Oracle performs block checking for data blocks. When this parameter is set to true, Oracle performs block checking for all data blocks. When it is set to false, Oracle does not perform block checking for blocks in the user tablespaces. However, block checking for the SYSTEM tablespace is always turned on.

Oracle checks a block by going through the data on the block, making sure it is self-consistent. Block checking can often prevent memory and data corruption. Block checking typically causes 1% to 10% overhead, depending on workload. The more updates or inserts in a workload, the more expensive it is to turn on block checking. You should set DB_BLOCK_CHECKING to true if the performance overhead is acceptable.

RMAN (BACKUP VALIDATE, RESTORE VALIDATE, VALIDATE)

Oracle Recovery Manager (RMAN) can validate the database using the BACKUP VALIDATE command.

RMAN> BACKUP VALIDATE DATABASE ARCHIVELOG ALL;

The process outputs the same information you would see during a backup, but no backup is created. Any block corruptions are visible in the V$DATABASE_BLOCK_CORRUPTION view, as well as in the RMAN output.

By default the command only checks for physical corruption. Add the CHECK LOGICAL clause to include checks for logical corruption.

RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL;

RMAN can validate the contents of backup files using the RESTORE VALIDATE command.

RMAN> RESTORE DATABASE VALIDATE;
RMAN> RESTORE ARCHIVELOG ALL VALIDATE;

In a similar way to the BACKUP VALIDATE command, the RESTORE VALIDATE command mimics the process of a restore, without actually performing the restore.

Prior to 11g, the straight VALIDATE command could only be used to validate backup related files. In Oracle 11g onward, the VALIDATE command can also validate datafiles, tablespaces or the whole database, so you can use it in place of the BACKUP VALIDATE command.

RMAN> VALIDATE DATAFILE 1;
RMAN> VALIDATE DATAFILE '/u01/app/oracle/oradata/ORCL/system01.dbf';

RMAN> VALIDATE CHECK LOGICAL DATAFILE 1;
RMAN> VALIDATE CHECK LOGICAL DATAFILE '/u01/app/oracle/oradata/ORCL/system01.dbf';

RMAN> VALIDATE TABLESPACE users;
RMAN> VALIDATE CHECK LOGICAL TABLESPACE users;

RMAN> VALIDATE DATABASE;
RMAN> VALIDATE CHECK LOGICAL DATABASE;

Any block corruptions are visible in the V$DATABASE_BLOCK_CORRUPTION view. You can identify the objects containing a corrupt block using a query like this.

COLUMN owner FORMAT A20
COLUMN segment_name FORMAT A30

SELECT DISTINCT owner, segment_name
FROM v$database_block_corruption dbc
JOIN dba_extents e ON dbc.file# = e.file_id AND dbc.block# BETWEEN e.block_id and e.block_id+e.blocks-1
ORDER BY 1,2;

Simplifying oracle Concepts

Tuesday, 28 June 2016

Block Corruption and how to identify corrupted blocks

No comments:

Post a Comment

Add Data file

Total Pageviews

Report Abuse