In Parallel Data Warehouse, you have a query in which one or more distributions run much longer than the other distributions of the same query. You have ruled out data skew and general statistics as causes of this issue. However, the problem may lie in the cardinality estimator (CE) itself. Because cost estimates are based on the cardinality estimates, an inaccurate cardinality estimate could lead to a poor planning decision.
To determine whether the cardinality estimator is involved in this problem, do one of the following on a compute node where the issues are occurring:
- Compare the actual execution plan (SET STATISTICS PROFILE ON) with the estimated plan (SET SHOWPLAN_ALL ON).
- Compare the execution plan for a fast distribution with the execution plan for a slow distribution for the same step. Specifically, compare the estimated rows vs. the actual rows that are produced by each operator.
Note: You may also notice that the join order changes for certain tables (hash join only). However, that typically does not affect performance.
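The first comparison above can be sketched in T-SQL as follows. This is a minimal illustration, not a prescribed procedure: the table names (dbo.FactSales, dbo.DimCustomer) and the query are hypothetical placeholders for the statement that runs in the slow step, and GO is assumed to be available as the batch separator of the client tool (for example, sqlcmd or Management Studio).

```sql
-- Hypothetical example: replace the SELECT with the statement from the slow step.

-- 1. Estimated plan: the statement is compiled but not executed.
--    SET SHOWPLAN_ALL must be the only statement in its batch.
SET SHOWPLAN_ALL ON;
GO
SELECT c.CustomerKey, SUM(f.SalesAmount) AS TotalSales
FROM dbo.FactSales AS f              -- hypothetical table
JOIN dbo.DimCustomer AS c            -- hypothetical table
    ON c.CustomerKey = f.CustomerKey
GROUP BY c.CustomerKey;
GO
SET SHOWPLAN_ALL OFF;
GO

-- 2. Actual plan: the statement is executed, and an extra result set
--    reports Rows (actual) next to EstimateRows for each operator.
SET STATISTICS PROFILE ON;
GO
SELECT c.CustomerKey, SUM(f.SalesAmount) AS TotalSales
FROM dbo.FactSales AS f
JOIN dbo.DimCustomer AS c
    ON c.CustomerKey = f.CustomerKey
GROUP BY c.CustomerKey;
GO
SET STATISTICS PROFILE OFF;
GO
```

In the STATISTICS PROFILE output, operators whose Rows value differs from EstimateRows by a large factor are the likely points where the cardinality estimate went wrong. For operators that run more than once, Rows is a total across all executions, so it is worth taking the Executes column into account before judging the gap. Running the same batches on a fast distribution and on a slow distribution gives the second comparison.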