Undo Segments Are in Fact Too Small -Redo and Undo-2

If you have a fixed-size undo tablespace, then the UNDO_RETENTION parameter is ignored by Oracle. In this configuration, you can use tools such as the Undo Advisor to help you appropriately size your undo tablespace. The Undo Advisor is accessible via Enterprise Manager or the DBMS_ADVISOR PL/SQL packages. The two most important considerations in sizing your undo are the length of the longest-running query and the longest interval you require for flashback operations.

I am getting a little too deep into the DBA role at this point, so we’ll move on to the next case. It’s just important that you understand that the ORA-01555 error in this case is due to the system not being sized correctly for your workload. The only solution is to size correctly for your workload. It is not your fault, but it is your problem since you hit it. It’s the same as if you run out of temporary space during a query. You either configure sufficient temporary space for the system, or you rewrite the queries so they use a plan that does not require temporary space.

To demonstrate this effect, we can set up a small, but somewhat artificial, test. We’ll create a very small undo tablespace with one session that will generate many small transactions, virtually assuring us that it will wrap around and reuse its allocated space many times—regardless of the UNDO_RETENTION setting, since we are not permitting the undo tablespace to grow.

The session that uses this undo segment will be modifying a table, T. It will use a full scan of T and read it from “top” to “bottom.” In another session, we will execute a query that will read the table T via an index. In this fashion, it will read the table somewhat randomly: it will read row 1, then row 1000, then row 500, then row 20,001, and so on. In this way, we will tend to visit blocks very randomly and perhaps many times during the processing of our query. The odds of getting an ORA-01555 error in this case are virtually 100 percent.

So, in one session we start by connecting to the pluggable database (not the root container) and creating a new undo tablespace within it:
$ sqlplus eoda/foo@PDB1
SQL> alter system set undo_tablespace = undo_small; System altered.

Now, we’ll set up the table T to query and modify. Note that we are ordering the data randomly in this table. The CREATE TABLE AS SELECT tends to put the rows in the blocks in the order it fetches them from the query. We’ll just scramble the rows up so they are not artificially sorted in any order, randomizing their distribution:

SQL> drop table t purge;Table dropped.
SQL> alter table t add constraint t_pk primary key(object_id); Table altered.
SQL> exec dbms_stats.gather_table_stats( user, ‘T’, cascade=> true ); PL/SQL procedure successfully completed.

Now, while that PL/SQL block of code is running, we will run a query in another session. That other query will read table T and process each record. It will spend about 1/100 of a second processing each record before fetching the next (simulated using DBMS_LOCK.SLEEP(0.01)). We will use the FIRST_ROWS hint in the query to have it use the index we created to read the rows out of the table via the index sorted by OBJECT_ID. Since the data was randomly inserted into the table, we would tend to query blocks in the table rather randomly.
This block will only run for a couple of seconds before failing:
SQL> declarecursor c isselect /*+ first_rows */ object_name

As you can see, it processed only a handful of records before failing with the ORA-­ 01555: snapshot too old error. To correct this, we want to make sure two things are done:

•\ UNDO_RETENTION is set in the database to be at least long enough for this read process to complete. That will allow the database to grow the undo tablespace to hold sufficient undo for us to complete.
•\ The undo tablespace is allowed to grow, or you manually allocate more disk space to it.

For this example, I have determined my long-running process takes about 720 seconds to complete (I have about 72,000 records in the table, so at 0.01 seconds per row we have 720 seconds). My UNDO_RETENTION is set to 900 (this is in seconds, so the undo retention is about 15 minutes). I altered the undo tablespace’s datafile to permit it to grow by 1MB at a time, up to 2GB in size:

SQL> alter database datafile ‘/tmp/undo.dbf’ autoextend on next 1m maxsize 2048m; Database altered.

When I ran the processes concurrently again, both ran to completion. The undo tablespace’s datafile grew this time, because it was allowed to and the undo retention I set up said to.

So, instead of receiving an error, we completed successfully, and the undo grew to be large enough to accommodate our needs. It is true that in this example, getting the error was purely due to the fact that we read the table T via the index and performed random reads all over the table. If we had rapidly full scanned the table instead, there is a good chance we would not have received the ORA-01555 error in this particular case. This is because both the SELECT and UPDATE would have been full scanning T, and the SELECT could most likely race ahead of the UPDATE during its scan (the SELECT just has to read, but the UPDATE must read and update and therefore could go slower). By doing the random reads, we increase the probability that the SELECT will need to read a block, which the UPDATE modified and committed many rows ago. This just demonstrates the somewhat insidious nature of the ORA-01555 error. Its occurrence depends on how concurrent sessions access and manipulate the underlying tables.

Leave a Comment

Your email address will not be published. Required fields are marked *