상황: 테이블 크기 A >> B
A = B(+)
GROUP BY A
B가 훨씬 작기 때문에 broadcast 해야 효율적인데 outer join 관계라 driving(outer) table이 될수 없는데...
결론부터 말하면,
1. LEADING 이나 USE_HASH 등의 테이블 지정 순서는 원래 규칙대로 한다. (outer join 이라 driving 이 될 수 없는 규칙 그대로)
2. PQ_DISTRIBUTE 에도 1번 규칙대로 inner table을 B로 하되, inner 쪽 규칙을 broadcast 로 한다.
3. swap_join_inputs 를 B로 지정하여, driving(outer) table 순서만 바꿔준다.
CASE 1) OK
/*+ leading(A) use_hash(B) full(A) parallel(A 4)
pq_distribute(B none broadcast) swap_join_inputs(B) */
--------------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | Q1,02 | PCWP | |
| 5 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH |
| 6 | HASH GROUP BY | | Q1,01 | PCWP | |
| 7 | HASH JOIN RIGHT OUTER| | Q1,01 | PCWP | |
| 8 | BUFFER SORT | | Q1,01 | PCWC | |
| 9 | PX RECEIVE | | Q1,01 | PCWP | |
| 10 | PX SEND BROADCAST | :TQ10000 | | S->P | BROADCAST |
| 11 | TABLE ACCESS FULL| B | | | |
| 12 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 13 | TABLE ACCESS FULL | A | Q1,01 | PCWP | |
--------------------------------------------------------------------------------------
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
USE_HASH_AGGREGATION(@"SEL$1")
GBY_PUSHDOWN(@"SEL$1")
SWAP_JOIN_INPUTS(@"SEL$1" "B"@"SEL$1")
PQ_DISTRIBUTE(@"SEL$1" "B"@"SEL$1" NONE BROADCAST)
USE_HASH(@"SEL$1" "B"@"SEL$1")
LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "A"@"SEL$1")
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
OPT_PARAM('optimizer_dynamic_sampling' 5)
OPT_PARAM('optimizer_index_caching' 90)
OPT_PARAM('optimizer_index_cost_adj' 10)
OPT_PARAM('_optim_peek_user_binds' 'false')
OPT_PARAM('_b_tree_bitmap_plans' 'false')
DB_VERSION('11.2.0.2')
OPTIMIZER_FEATURES_ENABLE('11.2.0.2')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
CASE 2) BAD (CASE 1과 동일할 것 같지만, outer join 되는 B가 먼저 올수 없다는 논리가 앞서 내부적으로 힌트가 변경되는 것으로 보인다.)
CASE 3) BAD (CASE 2에 swap_join_inputs를 추가해서 driving table변경은 먹히나 다른 힌트들과 의미가 상충하면서 pq_distribute가 동작하지 않은 것으로 보인다.)
CASE 4) BAD (swap_join_inputs를 outer join되는 table이 아닌 반대쪽에 두면 그냥 무시되면서 hash join right outer -> hash join outer 으로 변경된다. 이는 swap_join_inputs(A)를 안쓴것과 동일하며, hash join의 특성상 build table이 작아야 성능이 좋은데, 이대로라면 큰 테이블 부터 build 하게 되어 성능이 크게 떨어진다.)
CASE 2) BAD (CASE 1과 동일할 것 같지만, outer join 되는 B가 먼저 올수 없다는 논리가 앞서 내부적으로 힌트가 변경되는 것으로 보인다.)
/*+ leading(B) use_hash(A) full(A) parallel(A 4)
pq_distribute(A broadcast none) */
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | Q1,02 | P->P | HASH |
| 6 | HASH GROUP BY | | Q1,02 | PCWP | |
| 7 | HASH JOIN OUTER | | Q1,02 | PCWP | |
| 8 | PX RECEIVE | | Q1,02 | PCWP | |
| 9 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH |
| 10 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 11 | TABLE ACCESS FULL| A | Q1,01 | PCWP | |
| 12 | BUFFER SORT | | Q1,02 | PCWC | |
| 13 | PX RECEIVE | | Q1,02 | PCWP | |
| 14 | PX SEND HASH | :TQ10000 | | S->P | HASH |
| 15 | TABLE ACCESS FULL| B | | | |
--------------------------------------------------------------------------------------
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
USE_HASH_AGGREGATION(@"SEL$1")
GBY_PUSHDOWN(@"SEL$1")
PQ_DISTRIBUTE(@"SEL$1" "B"@"SEL$1" HASH HASH)
USE_HASH(@"SEL$1" "B"@"SEL$1")
LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "A"@"SEL$1")
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
OPT_PARAM('optimizer_dynamic_sampling' 5)
OPT_PARAM('optimizer_index_caching' 90)
OPT_PARAM('optimizer_index_cost_adj' 10)
OPT_PARAM('_optim_peek_user_binds' 'false')
OPT_PARAM('_b_tree_bitmap_plans' 'false')
DB_VERSION('11.2.0.2')
OPTIMIZER_FEATURES_ENABLE('11.2.0.2')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
CASE 3) BAD (CASE 2에 swap_join_inputs를 추가해서 driving table변경은 먹히나 다른 힌트들과 의미가 상충하면서 pq_distribute가 동작하지 않은 것으로 보인다.)
/*+ leading(B) use_hash(A) full(A) parallel(A 4)
pq_distribute(A broadcast none) swap_join_inputs(B) */
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | Q1,02 | P->P | HASH |
| 6 | HASH GROUP BY | | Q1,02 | PCWP | |
| 7 | HASH JOIN RIGHT OUTER| | Q1,02 | PCWP | |
| 8 | BUFFER SORT | | Q1,02 | PCWC | |
| 9 | PX RECEIVE | | Q1,02 | PCWP | |
| 10 | PX SEND HASH | :TQ10000 | | S->P | HASH |
| 11 | TABLE ACCESS FULL| B | | | |
| 12 | PX RECEIVE | | Q1,02 | PCWP | |
| 13 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH |
| 14 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 15 | TABLE ACCESS FULL| A | Q1,01 | PCWP | |
--------------------------------------------------------------------------------------
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
USE_HASH_AGGREGATION(@"SEL$1")
GBY_PUSHDOWN(@"SEL$1")
SWAP_JOIN_INPUTS(@"SEL$1" "B"@"SEL$1")
PQ_DISTRIBUTE(@"SEL$1" "B"@"SEL$1" HASH HASH)
USE_HASH(@"SEL$1" "B"@"SEL$1")
LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "A"@"SEL$1")
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
OPT_PARAM('optimizer_dynamic_sampling' 5)
OPT_PARAM('optimizer_index_caching' 90)
OPT_PARAM('optimizer_index_cost_adj' 10)
OPT_PARAM('_optim_peek_user_binds' 'false')
OPT_PARAM('_b_tree_bitmap_plans' 'false')
DB_VERSION('11.2.0.2')
OPTIMIZER_FEATURES_ENABLE('11.2.0.2')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
CASE 4) BAD (swap_join_inputs를 outer join되는 table이 아닌 반대쪽에 두면 그냥 무시되면서 hash join right outer -> hash join outer 으로 변경된다. 이는 swap_join_inputs(A)를 안쓴것과 동일하며, hash join의 특성상 build table이 작아야 성능이 좋은데, 이대로라면 큰 테이블 부터 build 하게 되어 성능이 크게 떨어진다.)
/*+ leading(A) use_hash(B) full(A) parallel(A 4)
pq_distribute(B none broadcast) swap_join_inputs(A) */
--------------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | Q1,02 | PCWP | |
| 5 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH |
| 6 | HASH GROUP BY | | Q1,01 | PCWP | |
| 7 | HASH JOIN OUTER | | Q1,01 | PCWP | |
| 8 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 9 | TABLE ACCESS FULL | A | Q1,01 | PCWP | |
| 10 | BUFFER SORT | | Q1,01 | PCWC | |
| 11 | PX RECEIVE | | Q1,01 | PCWP | |
| 12 | PX SEND BROADCAST | :TQ10000 | | S->P | BROADCAST |
| 13 | TABLE ACCESS FULL| B | | | |
--------------------------------------------------------------------------------------
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
USE_HASH_AGGREGATION(@"SEL$1")
GBY_PUSHDOWN(@"SEL$1")
PQ_DISTRIBUTE(@"SEL$1" "B"@"SEL$1" NONE BROADCAST)
USE_HASH(@"SEL$1" "B"@"SEL$1")
LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "B"@"SEL$1")
FULL(@"SEL$1" "A"@"SEL$1")
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
OPT_PARAM('optimizer_dynamic_sampling' 5)
OPT_PARAM('optimizer_index_caching' 90)
OPT_PARAM('optimizer_index_cost_adj' 10)
OPT_PARAM('_optim_peek_user_binds' 'false')
OPT_PARAM('_b_tree_bitmap_plans' 'false')
DB_VERSION('11.2.0.2')
OPTIMIZER_FEATURES_ENABLE('11.2.0.2')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/