In the last post we took a look at how to get execution plans using DBMS_XPLAN.DISPLAY_CURSOR. In the post before that we looked at how to get the estimated execution plan using DBMS_XPLAN.DISPLAY. I mentioned that DISPLAY_CURSOR allowed you to get additional statistics, so let’s take a look in more detail.
We’ll start with some output produced by DISPLAY:
Plan hash value: 3063390629 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 14653 | 113M| 119 (3)| 00:00:02 | |* 1 | HASH JOIN | | 14653 | 113M| 119 (3)| 00:00:02 | | 2 | TABLE ACCESS FULL | T3 | 100 | 198K| 3 (0)| 00:00:01 | |* 3 | HASH JOIN | | 14653 | 85M| 116 (3)| 00:00:02 | |* 4 | HASH JOIN | | 8 | 32344 | 7 (15)| 00:00:01 | |* 5 | TABLE ACCESS FULL| T4 | 6 | 12090 | 3 (0)| 00:00:01 | | 6 | TABLE ACCESS FULL| T2 | 50 | 99K| 3 (0)| 00:00:01 | | 7 | TABLE ACCESS FULL | T1 | 92800 | 180M| 108 (1)| 00:00:02 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T3"."T3_ID"="T1"."T3_ID") 3 - access("T2"."T2_ID"="T1"."T2_ID") 4 - access("T4"."T4_ID"="T2"."T4_ID") 5 - filter("T4"."RAND_STRING" LIKE 'f%')
Here are what the columns mean:
- Id – ID corresponding to the current line of the execution plan. This is how other sections of the output reference where in the execution plan they apply to. For example, in the Predicate Information section, the numbers at the start of each line tell you to which line of the execution plan that particular predicate applies to. Also notice in the ID column there are some asterisks in front of some IDs. This indicates that these rows are referenced in the Predicate Information section.
- Operation – What the database is actually doing on each line of the execution plan.
- Name – Name of the object being acted on by the current operation. Note this column does not apply to all operations.
- Rows – Number of rows Oracle believes will be returned by the current operation.
- Bytes – Amount of data Oracle believes will be returned by the current operation.
- Cost (%CPU) – Cost Oracle assigned to current operation, with percentage of CPU cost in parenthesis. Cost is cumulative, so the cost at the parent line contains the cost of itself as well as its children. More on cost will come later.
- Time – Amount of time Oracle believes current operation will take.
Now let’s look at an execution plan from DBMS_STATS.DISPLAY_CURSOR using IOSTATS as the format option:
Plan hash value: 3063390629 --------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 14000 |00:00:00.06 | 399 | |* 1 | HASH JOIN | | 1 | 14653 | 14000 |00:00:00.06 | 399 | | 2 | TABLE ACCESS FULL | T3 | 1 | 100 | 100 |00:00:00.01 | 3 | |* 3 | HASH JOIN | | 1 | 14653 | 14000 |00:00:00.04 | 396 | |* 4 | HASH JOIN | | 1 | 8 | 7 |00:00:00.01 | 6 | |* 5 | TABLE ACCESS FULL| T4 | 1 | 6 | 6 |00:00:00.01 | 3 | | 6 | TABLE ACCESS FULL| T2 | 1 | 50 | 50 |00:00:00.01 | 3 | | 7 | TABLE ACCESS FULL | T1 | 1 | 92800 | 100K|00:00:00.02 | 390 | --------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T3"."T3_ID"="T1"."T3_ID") 3 - access("T2"."T2_ID"="T1"."T2_ID") 4 - access("T4"."T4_ID"="T2"."T4_ID") 5 - filter("T4"."RAND_STRING" LIKE 'f%')
Looking at these columns, Id, Operation, and Name are all the same as what they were above in the DISPLAY output. The other columns are:
- Starts – Number of times current operation was executed.
- E-Rows – Estimated Rows. This is the same as Rows in the DISPLAY output above.
- A-Rows – Actual Rows. This is the actual number of rows retrieved by the current operation.
- A-Time – Actual Time. This it the actual time spent on the current operation.
- Buffers – Also known as consistent gets, this is the number of logical reads performed for the current operation.
Using the Output – Antipatterns
Cost
What is cost? Cost is a calculated number Oracle uses to have a relative way to compare different operations. This is the basis of the Cost-Based Optimizer. By default, for each query Oracle will generate several execution plan options and choose to use the one with the lowest overall cost. Cost is supposed to be an indicator of how much time the execution will take. For all the information you could ever want to know about Cost, refer to Jonathan Lewis’s book Cost-Based Oracle Fundamentals. However, until you truly understand how Cost is calculated and you know how to determine where Oracle is going wrong in its calculations, my advice is to completely ignore Cost altogether. The main reason is that you can’t directly control cost. Too many times I have seen people just aimlessly play around with SQL until they get a lower cost and call that “tuning”. This is not tuning, it is blindly throwing darts hoping to get lucky.
TABLE ACCESS FULL
Another pet peeve of mine is just looking at the execution plans for TABLE ACCESS FULL operations and trying to eliminate those. This is an especially touchy subject with me because it was how I was first taught how to “tune”. Once I learned more, it became painfully clear this was a terrible way to go about it. However, I know a lot of people are still being trained the way I was, and this is what they are doing. It makes it worse that some tools like TOAD’s explain plan output shows TABLE ACCESS FULL in bright red, further propagating the myth that this is something to remove. I have enough to say about this that I will make another full post on it. For now, suffice it to say that if you are thinking that TABLE ACCESS FULL is always bad, and index access is always good, you are looking at tuning wrong.
Using the Output – Better Ways
Hopefully from the columns described above, you can start to see why I much prefer the output of DISPLAY_CURSOR. The extra information available is extremely valuable. I would like to highlight a few of the key areas:
Cardinality Feedback
A major reason for performance issues is a significant difference between the Estimated Rows and Actual Rows columns. If Oracle is wrong by a factor of 10 about how many rows will be retrieved from a given operation, there is a good chance it will access that data in an inefficient way or in an inefficient order. Wolfgang Breitling put together a great whitepaper on Tuning by Cardinality Feedback, which is all about using this difference in Estimated to Actual rows to identify where in your query to focus your tuning efforts.
Buffers
The goal of tuning is eliminating waste. Any time you can achieve the same goal using less buffers, it is a win for performance. This is the whole point of indexes.. instead of reading every block in a table, you visit a few index blocks which then point to only the table blocks that you need to get your answer. Less blocks visited = less work = less time to complete. More information on eliminating buffers to achieve better execution can be found in this excellent white paper by Cary Millsap.
Understanding the Execution Tree
We also need to understand what the indentation in the Operation column of the execution plan output is telling us, which will be the topic of my next post.