There are times when a seemingly simple query ends abruptly with an ORA-01555. Interestingly enough there are no other user processes running, no updates, no deletes, nothing that could generate such an error. The common question is "What caused this?' The short answer is Delayed Block Cleanout, so what IS Delayed Block Cleanout and how does it generate an ORA-01555?
Let's define Delayed Block Cleanout:
Delayed Block Cleanout is the process where Oracle completes the cleanout processes started by the DML operation that modified the affected blocks.
So why didn't Oracle clean out all of the affected blocks? There is a limit on how many blocks Oracle will automatically clean out at commit time and that is 10% of the buffer cache. Any blocks exceeding that threshold must wait for a transaction or SELECT to finish the job. Let's describe that series of events in more detail:
Delayed Block Cleanout is caused by a session accessing a block that's been recently modified and hasn't been cleaned out yet. The first step for the current session is to check if the modifying session is still active. If the modifying transaction is not active then the current session 'cleans out' the block so any other sessions accessing that block won't need to go through that process. The cleanout process reads the UNDO header and determines if the blocks are marked as committed or not. If the changes are committed the session removes any transaction-related information for the modified blocks, which generates a redo. It also frees the UNDO segments used to modify those blocks. If the following conditions must be met an ORA-01555 is thrown:
- Data is modified and committed and the blocks are not cleaned out automatically (because they exceed the 10% limit to the SGA block buffer cache).
- The modified blocks are not accessed by another session and won't be accessed until the current session does so.
- The current session begins a long-running query and starts at SCN t_a, the SCN to roll back to in order to ensure a read-consistent image.
- During this query other sessions modify and commit changes to the database, but don't touch the blocks this long-running query needs.
- The transaction tables roll around due to the high number of COMMITS and 'step on' the transaction entry for SCN t_a.
- The lowest SCN (call it t_b) is now higher than t_a (the read-consistent SCN of the long-running query) preventing a read-consistent image due to the large number of COMMITS
Not every occurrence of Delayed Block Cleanout ends this way so it's not necessarily a bad thing. It can be bad, however, if you're running a large query taking a large amount of time and you meet the above criteria. Can this be 'fixed'? Possibly but it does require collecting statistics on the affected table or tables prior to running any long-running SELECT statements. As an example let's say a table named SALES is huge, containing millions of rows and let's say a transaction modifies 300,000 of those rows. With a moderately sized SGA and buffer cache it's highly likely Delayed Block Cleanout will terminate a long-running query started just after that update is committed. To prevent this running DBMS_STATS.GATHER_TABLE_STATS on the SALES table will cause Oracle to 'touch' every block in the table, thus triggering the cleanout process. Yes, it will take some time to generate current statistics but usually such large updates occur at night during batch processing so 'normal' user queries aren't affected. Of course it is possible to update large volumes of data during the business day making it difficult to stop long enough to generate current statistics on affected tables. The good news from all of this is that once blocks are cleaned out after large transactions, subsequent queries using the updated tables won't have to go through that cleanout process.
Delayed Block Cleanout is a necessary 'evil' unless, of course, you have a huge buffer cache allocation where 10% can cover updates spanning millions of rows. I have seen some large SGAs in my time but never a buffer cache large enough to leave every modified block 'clean'. Knowing how the process works and how it can affect some long-running queries can help you explain why a query stepped on its own UNDO and, maybe give you a way to prevent generating an ORA-0155 in the future. As databases grow ever larger the prospect of this event triggering an error can increase. Then, with larger SGAs in these large databases it may take a very large update to get to this point. Regardless it's good to know how Oracle handles transaction entries in data block headers when the blocks are committed but exceed the 10% buffer cache cleanout threshold.