Today I’d like to share another tuning example from a recent case at work, which in my opinion is good for illustrating typical steps involved in SQL optimization process.
I was handed a poorly performing query with a relatively verbose text, so I will only give the general structure here (it will also prevent me from accidentally disclosing some sensitive information from that application):
SELECT /* long list of columns here */ , (select * from ... where /* joins to tables in the main block and other predicates*/ ), (select * from ... where /* joins to tables in the main block and other predicates*/ ) FROM ... JOIN (SELECT /* long list of columns here */ LISTAGG(...) WITHIN GROUP(...) FROM (SELECT /* columns */, ROW_NUMBER() OVER(PARTITION BY ..., ... ORDER BY ...) ROWN FROM ...) LEFT JOIN ... ON ... WHERE ROWN <= 5 GROUP BY ...) ON ... INNER JOIN ... LEFT JOIN ... ON ... LEFT JOIN ... ON ... INNER JOIN ... ON ... INNER JOIN ... ON ... LEFT JOIN (SELECT ..., LISTAGG(...) WITHIN GROUP(ORDER BY ...), ... FROM ... INNER JOIN ... ON ... WHERE ... GROUP BY ...) ON ... LEFT JOIN ... ON ... WHERE ... UNION ALL select /* long list of columns */ from ... left join ... on ... inner join ... on ... inner join ... on ... inner join ... on ... inner join ... on ... inner join (select ... , max(...) , min(...) from ... group by ... ) on ... inner join ... on ... inner join ... on ... where ...;
What should be the first step in optimizing its performance? Rewrite to get rid of the ANSI syntax? Use the factorization clause WITH? Rewrite the correlated subqueries via joins? I think the only correct answer here is”obtain runtime performance diagnostics for the query and identify the part one should be focusing on”. Starting rewriting SQL text without knowing whether or not it’s going to do something for its performance is tedious, time consuming and inefficient.
So let’s take a look at the query’s dbms_xplan output (I’ve removed the “Name” column because it would have taken me forever to mask all the object names appropriately; it’s not really needed at this stage, I’ll add masked object names later on):
---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | 1 | | 449K(100)| 4261 |00:01:25.41 | 14M| | 1 | UNION-ALL | 1 | | | 4261 |00:01:25.41 | 14M| |* 2 | VIEW | 4260 | 1 | 5 (20)| 440 |00:00:00.07 | 6260 | |* 3 | WINDOW SORT PUSHED RANK | 4260 | 1 | 5 (20)| 746 |00:00:00.07 | 6260 | |* 4 | TABLE ACCESS BY INDEX ROWID | 4260 | 1 | 4 (0)| 746 |00:00:00.04 | 6260 | |* 5 | INDEX RANGE SCAN | 4260 | 1 | 3 (0)| 1293 |00:00:00.03 | 4702 | |* 6 | VIEW | 4260 | 1 | 5 (20)| 306 |00:00:00.06 | 6260 | |* 7 | WINDOW SORT PUSHED RANK | 4260 | 1 | 5 (20)| 746 |00:00:00.05 | 6260 | |* 8 | TABLE ACCESS BY INDEX ROWID | 4260 | 1 | 4 (0)| 746 |00:00:00.03 | 6260 | |* 9 | INDEX RANGE SCAN | 4260 | 1 | 3 (0)| 1293 |00:00:00.02 | 4702 | | 10 | NESTED LOOPS | 1 | 13 | 408K (11)| 4260 |00:01:21.57 | 13M| | 11 | NESTED LOOPS OUTER | 1 | 2 | 408K (11)| 4260 |00:01:19.87 | 13M| | 12 | NESTED LOOPS OUTER | 1 | 1 | 408K (11)| 4260 |00:01:19.36 | 13M| | 13 | NESTED LOOPS OUTER | 1 | 1 | 408K (11)| 4260 |00:01:19.31 | 13M| | 14 | NESTED LOOPS OUTER | 1 | 1 | 408K (11)| 4260 |00:01:19.27 | 13M| | 15 | NESTED LOOPS | 1 | 1 | 408K (11)| 4260 |00:01:19.23 | 13M| | 16 | NESTED LOOPS | 1 | 1 | 408K (11)| 4260 |00:01:19.19 | 13M| |* 17 | HASH JOIN | 1 | 221K| 180K (21)| 4256K|00:00:29.67 | 401K| |* 18 | HASH JOIN | 1 | 964K| 25795 (17)| 1006K|00:00:03.65 | 42094 | | 19 | TABLE ACCESS FULL | 1 | 587K| 7756 (22)| 588K|00:00:00.31 | 26918 | |* 20 | HASH JOIN | 1 | 450K| 9447 (17)| 450K|00:00:01.75 | 15176 | | 21 | TABLE ACCESS FULL | 1 | 450K| 1587 (27)| 450K|00:00:00.12 | 4705 | |* 22 | HASH JOIN | 1 | 274K| 3160 (18)| 275K|00:00:00.42 | 10471 | | 23 | TABLE ACCESS FULL | 1 | 37 | 2 (0)| 37 |00:00:00.01 | 2 | | 24 | TABLE ACCESS FULL | 1 | 274K| 3055 (15)| 275K|00:00:00.14 | 10469 | |* 25 | TABLE ACCESS FULL | 1 | 1012K| 119K (26)| 4383K|00:00:11.32 | 359K| |* 26 | TABLE ACCESS BY INDEX ROWID | 4256K| 1 | 1 (0)| 4260 |00:00:47.34 | 12M| |* 27 | INDEX UNIQUE SCAN | 4256K| 1 | 0 (0)| 214K|00:00:42.68 | 12M| | 28 | TABLE ACCESS BY INDEX ROWID | 4260 | 1 | 1 (0)| 4260 |00:00:00.04 | 8413 | |* 29 | INDEX UNIQUE SCAN | 4260 | 1 | 0 (0)| 4260 |00:00:00.02 | 4153 | |* 30 | TABLE ACCESS BY INDEX ROWID | 4260 | 1 | 2 (0)| 86 |00:00:00.03 | 1701 | |* 31 | INDEX RANGE SCAN | 4260 | 1 | 1 (0)| 183 |00:00:00.02 | 1591 | | 32 | TABLE ACCESS BY INDEX ROWID | 4260 | 1 | 1 (0)| 4260 |00:00:00.04 | 8627 | |* 33 | INDEX UNIQUE SCAN | 4260 | 1 | 0 (0)| 4260 |00:00:00.02 | 4084 | | 34 | TABLE ACCESS BY INDEX ROWID | 4260 | 1 | 1 (0)| 1091 |00:00:00.04 | 7520 | |* 35 | INDEX UNIQUE SCAN | 4260 | 1 | 0 (0)| 1091 |00:00:00.03 | 6422 | | 36 | VIEW PUSHED PREDICATE | 4260 | 1 | 10 (0)| 3460 |00:00:00.51 | 85399 | | 37 | SORT GROUP BY | 4260 | 1298 | 10 (0)| 3460 |00:00:00.50 | 85399 | | 38 | NESTED LOOPS | 4260 | 1298 | 10 (0)| 22063 |00:00:00.35 | 85399 | | 39 | NESTED LOOPS | 4260 | 1298 | 10 (0)| 22063 |00:00:00.27 | 63336 | |* 40 | TABLE ACCESS FULL | 4260 | 8 | 2 (0)| 34080 |00:00:00.12 | 12780 | |* 41 | INDEX UNIQUE SCAN | 34080 | 1 | 0 (0)| 22063 |00:00:00.12 | 50556 | | 42 | TABLE ACCESS BY INDEX ROWID | 22063 | 162 | 1 (0)| 22063 |00:00:00.05 | 22063 | | 43 | VIEW PUSHED PREDICATE | 4260 | 1 | 5 (0)| 4260 |00:00:01.69 | 113K| | 44 | SORT GROUP BY | 4260 | 1 | 5 (0)| 4260 |00:00:00.30 | 31335 | | 45 | NESTED LOOPS OUTER | 4260 | 1 | 5 (0)| 4913 |00:00:00.24 | 31335 | |* 46 | VIEW | 4260 | 1 | 4 (0)| 4913 |00:00:00.12 | 11419 | |* 47 | WINDOW BUFFER PUSHED RANK | 4260 | 1 | 4 (0)| 4945 |00:00:00.11 | 11419 | |* 48 | INDEX RANGE SCAN | 4260 | 1 | 4 (0)| 5446 |00:00:00.05 | 11419 | | 49 | TABLE ACCESS BY GLOBAL INDEX ROWID | 4913 | 1 | 1 (0)| 4892 |00:00:00.11 | 19916 | |* 50 | INDEX UNIQUE SCAN | 4913 | 1 | 0 (0)| 4892 |00:00:00.07 | 15024 | | 51 | HASH GROUP BY | 1 | 8 | | 1 |00:00:03.30 | 649K| | 52 | CONCATENATION | 1 | | | 6 |00:00:03.30 | 649K| | 53 | NESTED LOOPS | 1 | 1 | 478 (10)| 0 |00:00:03.15 | 634K| | 54 | NESTED LOOPS | 1 | 1 | 476 (10)| 4119 |00:00:03.13 | 632K| | 55 | NESTED LOOPS | 1 | 1 | 475 (10)| 4119 |00:00:03.11 | 627K| | 56 | NESTED LOOPS | 1 | 1 | 474 (10)| 4119 |00:00:03.09 | 623K| | 57 | NESTED LOOPS | 1 | 1 | 473 (10)| 4119 |00:00:03.07 | 619K| | 58 | NESTED LOOPS | 1 | 1 | 472 (10)| 4119 |00:00:03.05 | 619K| | 59 | NESTED LOOPS | 1 | 1 | 471 (10)| 4119 |00:00:03.03 | 614K| | 60 | NESTED LOOPS | 1 | 1 | 469 (10)| 242K|00:00:02.06 | 524K| | 61 | NESTED LOOPS | 1 | 1 | 466 (10)| 58426 |00:00:00.74 | 137K| | 62 | NESTED LOOPS | 1 | 1 | 465 (10)| 58426 |00:00:00.46 | 79017 | | 63 | TABLE ACCESS BY INDEX ROWID | 1 | 35 | 429 (11)| 58426 |00:00:00.12 | 7904 | |* 64 | INDEX FULL SCAN | 1 | 1 | 428 (11)| 58426 |00:00:00.04 | 516 | | 65 | TABLE ACCESS BY INDEX ROWID | 58426 | 1 | 1 (0)| 58426 |00:00:00.30 | 71113 | |* 66 | INDEX UNIQUE SCAN | 58426 | 1 | 0 (0)| 58426 |00:00:00.12 | 12153 | | 67 | TABLE ACCESS BY INDEX ROWID | 58426 | 1 | 1 (0)| 58426 |00:00:00.23 | 58886 | |* 68 | INDEX UNIQUE SCAN | 58426 | 1 | 0 (0)| 58426 |00:00:00.10 | 460 | | 69 | TABLE ACCESS BY GLOBAL INDEX ROWID| 58426 | 2 | 3 (0)| 242K|00:00:01.21 | 386K| |* 70 | INDEX RANGE SCAN | 58426 | 2 | 1 (0)| 242K|00:00:00.46 | 154K| |* 71 | TABLE ACCESS BY INDEX ROWID | 242K| 1 | 2 (0)| 4119 |00:00:00.85 | 90088 | |* 72 | INDEX RANGE SCAN | 242K| 1 | 1 (0)| 100K|00:00:00.44 | 5055 | | 73 | TABLE ACCESS BY INDEX ROWID | 4119 | 1 | 1 (0)| 4119 |00:00:00.02 | 4310 | |* 74 | INDEX UNIQUE SCAN | 4119 | 1 | 0 (0)| 4119 |00:00:00.01 | 191 | | 75 | TABLE ACCESS BY INDEX ROWID | 4119 | 1 | 1 (0)| 4119 |00:00:00.02 | 194 | |* 76 | INDEX RANGE SCAN | 4119 | 1 | 0 (0)| 4119 |00:00:00.01 | 178 | | 77 | TABLE ACCESS BY INDEX ROWID | 4119 | 1 | 1 (0)| 4119 |00:00:00.02 | 4313 | |* 78 | INDEX UNIQUE SCAN | 4119 | 1 | 0 (0)| 4119 |00:00:00.01 | 194 | | 79 | TABLE ACCESS BY INDEX ROWID | 4119 | 1 | 1 (0)| 4119 |00:00:00.01 | 4123 | |* 80 | INDEX UNIQUE SCAN | 4119 | 1 | 0 (0)| 4119 |00:00:00.01 | 4 | | 81 | TABLE ACCESS BY INDEX ROWID | 4119 | 1 | 1 (0)| 4119 |00:00:00.02 | 4313 | |* 82 | INDEX UNIQUE SCAN | 4119 | 1 | 0 (0)| 4119 |00:00:00.01 | 194 | |* 83 | TABLE ACCESS BY INDEX ROWID | 4119 | 1 | 2 (0)| 0 |00:00:00.02 | 2319 | |* 84 | INDEX RANGE SCAN | 4119 | 1 | 1 (0)| 1469 |00:00:00.01 | 781 | | 85 | NESTED LOOPS | 1 | 1 | 40593 (5)| 6 |00:00:00.15 | 14616 | | 86 | NESTED LOOPS | 1 | 1 | 40590 (5)| 1 |00:00:00.15 | 14606 | | 87 | NESTED LOOPS | 1 | 1 | 40589 (5)| 1 |00:00:00.15 | 14603 | | 88 | NESTED LOOPS | 1 | 1 | 40588 (5)| 1 |00:00:00.15 | 14599 | | 89 | NESTED LOOPS | 1 | 1 | 40587 (5)| 1 |00:00:00.15 | 14594 | | 90 | NESTED LOOPS | 1 | 1 | 40586 (5)| 1 |00:00:00.15 | 14591 | | 91 | NESTED LOOPS | 1 | 1 | 40585 (5)| 1 |00:00:00.15 | 14586 | | 92 | NESTED LOOPS | 1 | 1 | 40583 (5)| 1 |00:00:00.15 | 14584 | | 93 | NESTED LOOPS | 1 | 1 | 40582 (5)| 1 |00:00:00.15 | 14580 | | 94 | NESTED LOOPS | 1 | 60 | 40521 (5)| 1 |00:00:00.15 | 14576 | |* 95 | TABLE ACCESS FULL | 1 | 34893 | 4557 (21)| 2 |00:00:00.15 | 14567 | |* 96 | TABLE ACCESS BY INDEX ROWID | 2 | 1 | 1 (0)| 1 |00:00:00.01 | 9 | |* 97 | INDEX UNIQUE SCAN | 2 | 1 | 0 (0)| 2 |00:00:00.01 | 6 | | 98 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 4 | |* 99 | INDEX RANGE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 3 | | 100 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 4 | |*101 | INDEX UNIQUE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 3 | | 102 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 2 | |*103 | INDEX UNIQUE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 1 | | 104 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 5 | |*105 | INDEX UNIQUE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 4 | | 106 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 3 | |*107 | INDEX UNIQUE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 2 | |*108 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 5 | |*109 | INDEX UNIQUE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 4 | | 110 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 4 | |*111 | INDEX UNIQUE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 3 | | 112 | TABLE ACCESS BY INDEX ROWID | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 3 | |*113 | INDEX UNIQUE SCAN | 1 | 1 | 0 (0)| 1 |00:00:00.01 | 2 | | 114 | TABLE ACCESS BY GLOBAL INDEX ROWID | 1 | 2 | 3 (0)| 6 |00:00:00.01 | 10 | |*115 | INDEX RANGE SCAN | 1 | 2 | 1 (0)| 6 |00:00:00.01 | 4 | ----------------------------------------------------------------------------------------------------------------------
I find it very difficult to analyze any information that doesn’t fit into a single screen, so I start by removing everything that is not contributing enough time to be relevant to the performance tuning:
1) the two subqueries (lines 2-9)
2) the second UNION ALL’ed subquery (lines 51-115)
which leaves us with one branch in the tree, lines 10-50, but we can shrink it further: as we go from the driving operation at line 23 outwards, we can see that we can stop at the nested loop at lines 16-27 as this block covers over 95%
of the query execution time.
----------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | ----------------------------------------------------------------------------------------------------------------------------------------------------- ... | 16 | NESTED LOOPS | | 1 | 1 | 408K (11)| 4260 |00:01:19.19 | 13M| |* 17 | HASH JOIN | | 1 | 221K| 180K (21)| 4256K|00:00:29.67 | 401K| |* 18 | HASH JOIN | | 1 | 964K| 25795 (17)| 1006K|00:00:03.65 | 42094 | | 19 | TABLE ACCESS FULL | V1 | 1 | 587K| 7756 (22)| 588K|00:00:00.31 | 26918 | |* 20 | HASH JOIN | | 1 | 450K| 9447 (17)| 450K|00:00:01.75 | 15176 | | 21 | TABLE ACCESS FULL | V2 | 1 | 450K| 1587 (27)| 450K|00:00:00.12 | 4705 | |* 22 | HASH JOIN | | 1 | 274K| 3160 (18)| 275K|00:00:00.42 | 10471 | | 23 | TABLE ACCESS FULL | V4 | 1 | 37 | 2 (0)| 37 |00:00:00.01 | 2 | | 24 | TABLE ACCESS FULL | V3 | 1 | 274K| 3055 (15)| 275K|00:00:00.14 | 10469 | |* 25 | TABLE ACCESS FULL | A | 1 | 1012K| 119K (26)| 4383K|00:00:11.32 | 359K| |* 26 | TABLE ACCESS BY INDEX ROWID | B | 4256K| 1 | 1 (0)| 4260 |00:00:47.34 | 12M| |* 27 | INDEX UNIQUE SCAN | UQ$B | 4256K| 1 | 0 (0)| 214K|00:00:42.68 | 12M| 17 - access(A.A_COL1=V2.V2_COL1 AND A.A_COL2=V2.V2_COL2 AND A.A_COL3=V1.V1_COL3) 18 - access(A.A_COL1=V1.V1_COL1) 20 - access(V2.V2_COL3=A.A_COL1) 22 - access(V3.V3_COL1=V4.V4_COL1) 25 - filter((A.A_COL4=:B1 AND INTERNAL_FUNCTION(C.CONFIRMTYPE))) 26 - filter((B.B_COL1=:B1 AND ((A.A_COL6 IS NULL AND A.A_COL7='Z') OR (B.B_COL2='W' AND A.A_COL8 IS NULL AND A.A_COL9 IS NOT NULL)) AND A.A_COL1=B.B_COL3)) 27 - access(A.A_COL10=B.B_COL10 AND A.A_COL11=B.B_COL11)
Further, we can “reverse-engineer” a simplified version of the original query from that part of the plan, by throwing out all irrelevant joins and subqueries:
SELECT ... FROM A INNER JOIN B ON A.A_COL10 = B.B_COL10 AND A.A_COL11 = B.B_COL11 AND A.A_COL1 = B.B_COL3 INNER JOIN V ON A.A_COL1 = V.V_COL1 AND A.A_COL2 = V.V_COL2 AND A.A_COL3 = V.V_COL3 ... WHERE B.B_COL1 = :B1 AND A.A_COL4 = :B1 AND A.A_COL12 IN ('X', 'Y') AND ((A.A_COL6 IS NULL AND A.A_COL7 = 'Z') OR (B.B_COL2 = 'W' AND A.A_COL8 is null AND A.A_COL9 is not null))
Now that we simplified the query, how to we go about making it faster? There three basic “schools of thought” in modern SQL optimization (not counting above mentioned “rewrite SQL text according to good practices and hope it makes it faster” school which I don’t think has a right to exist anymore)
1) cardinality feedback tuning
2) tuning based on resource usage
3) tuning by waste minimization.
With the first method, we focus on helping the optimizer to correctly calculate cardinalities, expecting (quite reasonably) that in this case the optimizer would come up with a good plan on its own. It’s a good method, but I don’t think it would be efficient here because for most of the steps the optimizer has come up with fairly decent estimates (yes, the cardinality of the outermost step, line 16, is outrageously wrong, but it mostly affects operations outside this block which don’t contribute significantly to the total execution time).
The second method focuses on the operations which contribute to resource usage the most. Actually, we kinda already used this method when simplifying this query, but I don’t think we can achive much more with it.
Method three (“waste elimination”) is my favorite in general, and I think it’s also the best fit for this particular case.
The waste elimination technique says that we should focus on eliminating “waste”, i.e. rows that don’t need to be returned to the user, as early as possible. So first we should start by identifying the “waste”. In this case, it’s fairly straightforward: operation 26 reduces the number of rows from 214k to 4.26k, and it’s the last data access operation in the plan. This means that about 98% of data extracted, joined and otherwise processed in previous steps, is waste. Removing these rows as early and as efficiently as possible is the key to improving query’s performance, but how can we achieve that?
Note that we have two separate issues here: both parts, “early” and “efficiently”, are important here. The former is much easier: all we need to do is to access B as early as possible, and since B is only directly joined to A, this means that we must drive the query from the join of A and B. “Efficiently” is another matter: we need to find the best join order and method for A and B. Normally we would simply try all combinations, which aren’t many as this is just two tables: two possible orders (A first or B first), and two join methods (I’m not counting merge joins as they are a special case for pre-ordered data sets or for cartesian joins), i.e. 4 combinations in total.
However, before we do any of that, let’s look closely at the predicate in line 26: we can see that it has predicates on tables A and B combined together with an OR operator, which leads to an interesting possibility: namely, if both parts have good selectivity (and as it turns out in this particular case, they do), then we would want to break the query into two, and drive one part from A, and the other part from B. The optimizer should be capable of doing it on its own, but since it was unable to find this transformation in this particular case, we can help it with the “use_concat” hint. The next part, ensuring the correct driving table for each of the two concatenated blocks, is going to be a tad more difficult, as we’d need to figure out the names of the blocks. But that’s not too complicated, either: we just have to do the explain plan on the hinted query and then display the plan using “advanced” format option:
select * from table(dbms_xplan.display(null, null, 'advanced'));
This will produce a plan with lots of additional information, including an alias section, giving us desired query block names:
25 - SEL$68B58A1A / A@SEL$1 26 - SEL$68B58A1A / B@SEL$6 ... 63 - SEL$68B58A1A_2 / A@SEL$68B58A1A_2 ... 79 - SEL$68B58A1A_2 / B@SEL$68B58A1A_2
So now we can force the optimizer to do what we want by using the following set of hints:
/*+ use_concat leading(A@SEL$1 B@SEL$6) leading(B@SEL$68B58A1A_2 A@SEL$68B58A1A_2) */
and if we are lucky, that should be enough to produce the plan we want (if not, we’d have to add further hints to ensure correct access paths and/or to fix other parts of the plan that might get broken because of our changes). But assuming that the hints make the query as fast as we needed to — are we done?
Not quite so — there are still a couple of things remaining.
1) Our solution is using hints. As is well known, hints are dangerous, especially the ones like “leading” (“micromanaging” hints in J. Lewes’ classification), because they ownly lock down a part of the plan which can result in very nasty surprises in case of significant changes in data and/or optimizer stats. So we can only consider our solution as a short-term workaround. It is much more reliable to force the entire plan using e.g. the “fake baselines” technique.
2) But even after we lock down the entire plan, that’s not quite all, because we still need to understand why the optimizer was unable to generate the appropriate plan on its own.
But covering the last two items would make this already long blog yet longer, so I won’t do it, at least not now. Rather, I’ll make a couple of brief technical notes:
1) if you look at aliases for operations 25 and 26, you’ll see that tables A and B belong to different blocks (SEL$1 and SEL$6). This has to do with the ANSI syntax used in the query: during the optimization phase, Oracle internally rewrites it according to the “native” join syntax, which creates new query blocks, making it much harder to hint the desired plan! That’s one of major drawbacks of ANSI syntax, and combined with numerous bugs that exist around it, it’s the reason
I never use it myself unless I have to
2) also note how the “OR” operation made things more complicated. It’s also rather typical: while “AND” increases the amount of information given to the optimizer, “OR” reduces it (“A OR B = I don’t know whether it’s A or B”). Often, there’s nothing that can be done about OR predicates, but there are some cases when such problems can be avoided by putting some extra thought into the application design and avoiding the situation where OR predicates have to be used because of sloppiness in the application design (“I don’t know if this piece of information belongs to column1 or column2, so let me pick either column randomly, and then I’ll just add column1 OR column2 to my queries”).
Happy tuning!
Image may be NSFW.
Clik here to view.
Clik here to view.
