What Is Redo?
Redo log files are crucial to the Oracle database. These are the transaction logs for the database. Oracle maintains two types of redo log files: online and archived. They are used for recovery purposes; their main purpose in life is to be used in the event of an instance or media failure.
If the power goes off on your database machine, causing an instance failure, Oracle will use the online redo logs to restore the system to exactly the committed point it was at immediately prior to the power outage. If your disk drive fails (a media failure), Oracle will use both archived redo logs and online redo logs to recover a backup of the data that was on that drive to the correct point in time. Moreover, if you “accidentally” truncate a table or remove some critical information and commit the operation, you can restore a backup of the affected data and recover it to the point in time immediately prior to the “accident” using online and archived redo log files.
I should point out that modern versions of Oracle also have flashback technology. This allows us to perform flashback queries (query the data as of some point in time in the past), undrop a database table, put a table back the way it was some time ago, and so on. As a result, the number of occasions in which we need to perform a conventional recovery (using database backups and archived redo logs) has decreased. Even so, being able to perform a recovery remains the DBA’s most important job.
Note: Database restore and recovery is the one thing a DBA is not allowed to get wrong.
Archived redo log files are simply copies of old, full online redo log files. As the system fills up log files, the Oracle archiver (ARCn) process makes a copy of the online redo log file in another location and optionally puts several other copies into local and remote locations as well.
These archived redo log files are used to perform media recovery when a failure is caused by a disk drive going bad or some other physical fault. Oracle can take these archived redo log files and apply them to backups of the datafiles to catch them up to the rest of the database. They are the transaction history of the database.
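To make the idea of catching a backup up with redo more concrete, here is a minimal sketch in Python. It is a toy model, not Oracle's recovery code: the `RedoRecord` type, the `roll_forward` function, and the use of a plain dictionary as a "datafile" are all hypothetical, and the sketch assumes every change in the redo stream was committed. The `scn` field stands in for Oracle's system change number, used here only as an ordering key.

```python
from typing import NamedTuple, Optional


class RedoRecord(NamedTuple):
    scn: int                  # position of the change in the redo stream
    key: str                  # hypothetical stand-in for the changed row or block
    value: Optional[str]      # new value; None models a delete


def roll_forward(backup: dict, backup_scn: int, redo: list, until_scn: int) -> dict:
    """Return a copy of the backup with redo applied up to and including until_scn."""
    restored = dict(backup)   # never modify the backup itself
    for rec in sorted(redo, key=lambda r: r.scn):
        if rec.scn <= backup_scn:
            continue          # change is already reflected in the backup
        if rec.scn > until_scn:
            break             # stop before changes we do not want
        if rec.value is None:
            restored.pop(rec.key, None)
        else:
            restored[rec.key] = rec.value
    return restored


backup = {"emp:1": "KING", "emp:2": "BLAKE"}      # backup taken at SCN 100
redo = [
    RedoRecord(90, "emp:2", "BLAKE"),             # older than the backup, skipped
    RedoRecord(110, "emp:3", "CLARK"),
    RedoRecord(120, "emp:1", "KING-UPDATED"),
    RedoRecord(130, "emp:2", None),               # the "accident": an unwanted delete
]

# Complete recovery: apply everything in the redo stream.
complete = roll_forward(backup, 100, redo, until_scn=130)
# Point-in-time recovery: stop immediately before the accident.
before_accident = roll_forward(backup, 100, redo, until_scn=129)
print(complete)
print(before_accident)
```

The same loop covers both cases described earlier: applying all available redo models recovery from a media failure, while stopping at an earlier `until_scn` models recovery to the point in time just before an unwanted change.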
Every Oracle database has at least two online redo log groups with at least a single member (redo log file) in each group. These online redo log groups are written to in a circular fashion by the log writer (LGWR) background process. Oracle will write to the log files in group 1, and when it gets to the end of the files in group 1, it will switch to log file group 2 and begin writing to that one. When it has filled log file group 2, it will switch back to log file group 1 (assuming you have only two redo log file groups; if you have three, Oracle would, of course, proceed to the third group).
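The circular writing pattern described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the `RedoLogGroup` and `LogWriter` classes are hypothetical, a group's size is measured in records rather than bytes, and reusing a group simply discards its old contents (in a real database running in archive log mode, the group would have to be archived before being overwritten).

```python
from dataclasses import dataclass, field


@dataclass
class RedoLogGroup:
    """Toy stand-in for one online redo log group (not Oracle's file format)."""
    group_no: int
    capacity: int                                  # records that fit before a switch
    records: list = field(default_factory=list)

    def is_full(self) -> bool:
        return len(self.records) >= self.capacity


class LogWriter:
    """Toy model of LGWR cycling through a fixed set of groups."""

    def __init__(self, group_count: int = 2, capacity: int = 3):
        if group_count < 2:
            raise ValueError("at least two online redo log groups are required")
        self.groups = [RedoLogGroup(n + 1, capacity) for n in range(group_count)]
        self.current = 0                           # index of the group being written
        self.switches = 0                          # number of log switches so far

    def write(self, record: str) -> int:
        """Append one redo record and return the group number it landed in."""
        if self.groups[self.current].is_full():
            self._switch()
        group = self.groups[self.current]
        group.records.append(record)
        return group.group_no

    def _switch(self) -> None:
        # Advance to the next group, wrapping back to group 1 after the last one.
        self.current = (self.current + 1) % len(self.groups)
        # Reusing a group overwrites its previous contents in this toy model.
        self.groups[self.current].records.clear()
        self.switches += 1


lgwr = LogWriter(group_count=2, capacity=3)
placement = [lgwr.write(f"change-{i}") for i in range(8)]
print(placement)   # [1, 1, 1, 2, 2, 2, 1, 1]
```

With two groups, the writer fills group 1, switches to group 2, and then wraps back to group 1. Constructing `LogWriter(group_count=3)` instead would make it proceed to a third group before wrapping.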
Redo logs, or transaction logs, are one of the major features that make a database a database. They are perhaps its most important recovery structure, although without the other pieces such as undo segments, distributed transaction recovery, and so on, nothing works.
They are a major component of what sets a database apart from a conventional file system. The online redo logs allow us to effectively recover from a power outage—one that might happen while Oracle’s database writer (DBWR) background process is in the middle of writing to disk.
The archived redo logs let us recover from media failures when, for instance, the hard disk goes bad or human error causes data loss. Without redo logs, the database would not offer any more protection than a file system.
There’s one additional item I want to mention regarding redo. In an Oracle RAC environment, you typically have two or more instances. RAC configurations have one common set of datafiles (meaning each instance transacts against a common set of datafiles). However, each instance participating in a RAC cluster has its own memory structures and background processes (e.g., log writer and archiver). Each instance also has its own redo stream (often called a thread of redo), and it follows that each instance has its own undo segments as well. This is important because you may find yourself troubleshooting performance issues with redo, and it’s critical to pinpoint which instance or instances are experiencing redo bottlenecks.