Sunday, August 20, 2006

The trouble with alternate sorts and custom folders

Discoverer's alternate sort is one of those wow features that once learned has a tendency to be used over and over again. When the alternate sort is based on a small, known set of values created in a custom folder the end result is just what the doctor ordered. However, when the SQL inside the custom folder is dynamic and is drawn from one or more database tables then, because of a little-known feature of the nature of alternate sorts, Discoverer performance can and will deteriorate. The use of the alternate sort is therefore like a double-edged sword. On the one hand, you can create lists of values in exactly the order requested by the users. On the other hand, performance can down the drain - big time!

So why does the performance go down? It goes down because whenever a user includes an item in a workbook that has an alternate sort associated with it, Discoverer embeds the SQL used in the custom folder within the query for the workbook. It does this just in case the user wants to sort by the item. As you probably know, when a user adds a sort on an item that has an alternate sort, Discoverer will sort the workbook data in the order as defined by the alternate sort. The only way it can do this is by including the alternate sort code within the query.

Let me show you using an example from my own database, the one used in my Discoverer 10g Handbook. I have a custom folder called Prod Size Lookup containing the following code:

SELECT 'MINI' PRODSIZE, 1 SEQUENCE FROM DUAL
UNION
SELECT 'SMALL' PRODSIZE, 2 SEQUENCE FROM DUAL
UNION
SELECT 'MEDIUM' PRODSIZE, 3 SEQUENCE FROM DUAL
UNION
SELECT 'LARGE' PRODSIZE, 4 SEQUENCE FROM DUAL;

This creates a folder with two items, one called Prodsize and one called Sequence. I then created a list of values based on the Prodsize using the Sequence as the alternate sort. This allows my users to see the values in a more natural order, the order shown above. Similar code can be used for days of the week and months of the year. So far so good. Next I created a simple workbook that queried just the Prodsize. Here is the code that was generated by Discoverer:

SELECT
o100558.PRODSIZE AS E100726,
as100626_100726_OLD AS as100626_100726_OLD
FROM GSW.GS_PRODUCTS o100558,
( SELECT o100572.PRODSIZE AS as100626_100726_OLD_2,

MAX(o100572.SEQUENCE) AS as100626_100726_OLD
FROM
( SELECT 'MINI' PRODSIZE, 1 SEQUENCE FROM DUAL
UNION
SELECT 'SMALL' PRODSIZE, 2 SEQUENCE FROM DUAL
UNION
SELECT 'MEDIUM' PRODSIZE, 3 SEQUENCE FROM DUAL
UNION
SELECT 'LARGE' PRODSIZE, 4 SEQUENCE FROM DUAL

) o100572 GROUP BY o100572.PRODSIZE)

WHERE ( (0100558.PRODSIZE = as100626_100726_OLD_2(+)));

So what happened here? First of all, I bet you are amazed at the amount of code that has been generated. As you can see, Discoverer has not only embedded my custom SQL within the main query but has also added an outer join, shown as(+), between the two components. Now, because my list of values is a small, known set of items, this additional code has a marginal impact upon my query. My explain plan looks like this:

SELECT STATEMENT
HASH JOIN OUTER
TABLE ACCESS FULL GSW.GS_PRODUCTS
VIEW DRAKE
SORT GROUP BY
VIEW DRAKE
SORT UNIQUE
UNION-ALL
FAST DUAL
FAST DUAL
FAST DUAL
FAST DUAL


So now let's move into the real world, the world of Oracle E-Business Suite, where the custom SQL is executing another query to get the list of values. Using such custom SQL with an alternate sort causes Discoverer to generate code such as this:

SELECT /*+ ALL_ROWS */
o114210.PERIOD_NAME AS E187845,
as120354_187845_OLD AS as120354_187845_OLD
FROM GL.GL_JE_HEADERS o114210,
( SELECT o779427.PERIOD_NAME AS as120354_187845_OLD_2,
MAX(o779427.SORT_KEY) AS as120354_187845_OLD

FROM
( SELECT PERIOD_NAME,
PERIOD_YEAR,
REPLACE(PERIOD_YEAR,TO_CHAR(PERIOD_NUM,'09'), ' ') SORT_KEY
FROM GL.GL_PERIOD_STATUSES

WHERE
APPLICATION_ID = 101 AND
SET_OF_BOOKS_ID = 1 ) o779427

GROUP BY o779427.PERIOD_NAME)
WHERE ( (o114210.PERIOD_NAME = as120354_187845_OLD_2(+)));


You can see that this may have an impact of performance. It may well bring back more rows of data than expected so the user may well have to check the Hide Duplicate Rows box in the Table Layout tab of the Edit Worksheet dialog box.

By the way, the same query that I used at the beginning of this posting looks like this when I have no alternate sort:

SELECT O100558.PRODSIZE
FROM GSW.GS_PRODUCTS O100558;


and my explain plan looks like this:

SELECT STATEMENT
TABLE ACCESS FULL GSW.GS_PRODUCTS;


Therefore, when managing alternate sorts, you need to:

  • be careful and only use them when the list of values is either fixed and can be generated by a SQL statement that does not access the database, or the list of values is small and the SQL statement is extremely efficient and does not generate duplicate values.
  • be frugal and don't get carried away. Create them when absolutely necessary, and not just because the users asked for them.
  • test them out with realistic before and after scenarios. This way you can make sure that performance is not being adversely affected, and if it is - don't use them, simple as that!
For all of you who have asked why Discoverer sometimes adds an outer join, take a close look at your End User Layer. The inappropriate use of alternate sorts could be one of the reasons.

1 comment:

bruno said...

Thank you for exposing a real problem with Discoverer, but what to do if you really need alternate sort and here's a case : i need to sort EXACLTY like this : a, b, c, d, A, B, C, *absent*, *Tous*

You will understand here that the user don't want *absent* to follow a.

Thank you Michael !

Followers