Monday, May 19, 2008

Using CASE to solve Outer Join issues

As many of you know, one of the services that my company offers is Discoverer workshops. What this means is that we send one of our experienced trainers on site and work, sometimes one on one, with the end users on their real system to help them generate the reports that they need. Sometimes I get to go and while I would like to do this more often I just don't have enough time so when the chance does come around I jump at it.

Anyway, one of the most frequent problems that I come across with end user reports is how to effectively create and work with reports when two or more of the folders have outer joins. Such a situation happened in one of the recent workshops that I conducted.

Picture if you will an EUL with 2 folders, folder A and folder B, where folder A is the master being outer joined to B. Thus, if you include something from both folders you will get one row for every item in folder A even if there is nothing in folder B. This is the benefit of the outer join.

Let me get more specific. Let's say that we are working in Human Resources where folder A is Employee Master and folder B is for Employee Leave. Because not all employees have taken leave there is an outer join between the two. Now let's say that in the Leave folder there is an item called Leave Desc that contains descriptions to indicate what kind of leave the employee had taken. With me so far? Here's a screenshot:




Here's some example output:


Ok, so now we want to build a report that allows the end user to exclude one or more Leave Desciptions from the report. To be specific, We want to see all Employees who either have not been sick or have not taken any leave yet. Looking at the report we should be including Carol, Michael and Susie, but not George.

Simply creating the following condition will not work:

Leave Desc <> 'Sick'

Here is that output:


As you can see, George is still included but now appears to have not taken any leave, which isn't true. How about if we also say that Leave Desc IS NOT NULL?

This doesn't give us the right answer either because now both George and Carol have been omitted.

Let's create this calculation which we will call Exclusion:

CASE WHEN "Leave Desc" = 'Sick' THEN 1 ELSE 0 END

We now get this answer, which as you can see has annotated the rows we do not want with 1.



You might think that adding this condition: Exclusion <> 1 would work but it will not. Try it for yourself and you will see what I mean.

You actually need to do add a Boolean OR condition, like this:

Exclusion <> 1
OR
Exclusion <> 1



Problem solved - isn't it? Try removing the Exclusion from the report and see what happens. In Plus you will get the right answer. However in Desktop you will find that your sorting goes wrong. The solution in Desktop is to add a hidden group sort on the Exclusion!

The problem is solved because of the way the CASE statement works which dictates that the function will end as soon as a true condition is encountered. So even there are NULL records these will all get Exclusion of 0 which is the default.

11 comments:

Anonymous said...

Michael,
I just wanted to let you know this tip was very timely for me. I was trying to accomplish something similar, and couldn't seem to do it with a combination of outer joins and conditions. Adding the logic in a calculation was just the piece I was missing. I don't see a lot of feedback on your blog but hope you know it is very helpful!
Thanks!

Michael said...

Thanks for the feedback, it is really appreciated.
Michael

Anonymous said...

Thanks Micheal,

That is exactly what I have searched for. In my case I have two different star schemata which are combined in one query. Because of that reason there are several optional dimension relations. I have tried to use conditions like date = '17-APR-2008' and it did not work.

With the use of
date = '17-APR-2008'
OR
date = '17-APR-2008'
it works. The use of additional coloumn seems not to be necessary. But I do not really understand why this works. If date is not 17-APR-2008 in this example that means false or false = false if it is '17-APR-2008 it means true or true = true. The condition date = '17-APR-2008' should give the same result, shouldn't it?

or
t or f = t
f or t = t
f or f = f
t or t = t

Gabe said...

Interesting bug... but this is just a bug.

There seems to be a bug in Developer Plus that makes it ignore the outer joins in the EUL when you create a compound expression... some of the time. Relying on this bug seems like an extremely bad idea as it will hopefully be fixed in the next revision. This bug can also easily make you grab incorrect data by limiting what really should have been an outer join!!!!!

I'm still not sure the best way to create an inner join, but if you have access to Developer Desktop you can copy the SQL for the report into a text file, remove the outer joins, and then re-import SQL. This will remove the outer joins on that table until you remove and re-add the Folder/Table.

I still have no solution I find workable...

- Gabe

Michael said...

Hi Gabe
Not sure I agree with your comment that this is a bug. It may be but I like to think it is deliberate code. Using or taking advantage of this capability allows me to create very flexible reports that otherwise would require multiple join options.

Michael

Gabe said...

To illustrate the bug better here is some SQL from various reports created with Discoverer Plus

I've changed the names of the tables to protect the innocent, but this is otherwise exactly the SQL that is generated. There are two views; "TABLE1" and "TABLE2" that each have a column "UNIQUE_ID".

Here is the SQL generated when I join the views, notice the outer join(i.e. "(+)")
SELECT O100288.UNIQUE_ID, O100291.UNIQUE_ID
FROM ODSTHING.TABLE1 O100288, ODSTHING.TABLE2 O100291
WHERE ( ( O100291.UNIQUE_ID = O100288.UNIQUE_ID(+) ) )
;


Here is what happens if I simply add a condition for is not null. Notice the outer join on the "IS NOT NULL" that makes it do nothing
SELECT O100288.UNIQUE_ID, O100291.UNIQUE_ID
FROM ODSTHING.TABLE1 O100288, ODSTHING.TABLE2 O100291
WHERE ( ( O100291.UNIQUE_ID = O100288.UNIQUE_ID(+) ) ) AND ( O100288.UNIQUE_ID(+) IS NOT NULL )
;


Here is what happens when I AND the same condition with itself, it stays outer joined
SELECT O100288.UNIQUE_ID, O100291.UNIQUE_ID
FROM ODSTHING.TABLE1 O100288, ODSTHING.TABLE2 O100291
WHERE ( ( O100291.UNIQUE_ID = O100288.UNIQUE_ID(+) ) ) AND ( ( ( O100288.UNIQUE_ID(+) IS NOT NULL AND O100288.UNIQUE_ID(+) IS NOT NULL ) ) )
;


Here is what happens if I OR the same condition with itself. MAGIC!!! The outer join disappears!!
SELECT O100288.UNIQUE_ID, O100291.UNIQUE_ID
FROM ODSTHING.TABLE1 O100288, ODSTHING.TABLE2 O100291
WHERE ( ( O100291.UNIQUE_ID = O100288.UNIQUE_ID(+) ) ) AND ( ( ( O100288.UNIQUE_ID IS NOT NULL OR O100288.UNIQUE_ID IS NOT NULL ) ) )
;


I'm still not quite sure when this bug takes effect, but it seems to happen whenever an OR is used... although I could swear I saw this happen with an AND when I was first playing with it.

Anyone editing conditions may accidentally inner-join views that are defined to be outer-joined if they use an OR. Even if users are aware of this issue it leaves you with no way to do an outer join with an OR...

I do agree that this is a very useful bug... but definitely dangerous if you aren't fully aware of what it is doing. Oracle needs to add the ability to state whether a condition should be inner or outer to Plus and fix this. Before using this I want to make sure it is exploitable in a straightforward manner.

Oracle's Documentation is definitely at odds with how this works. The only special note on "OR" I could find at Oracle's Doc Site is "Using OR widens a search to display items that match any of the criteria." when in effect it can drastically restrict the data you are getting.

- Gabe

Gabe said...

Discoverer takes out the outer join because the SQL syntax used. You can't do an outer join using the (+) syntax in Oracle with an OR or IN.

This thread was continued at the official Oracle forum http://forums.oracle.com/forums/message.jspa?messageID=2761207


The Oracle Documentation that mentions this and discusses various ways to change the way this works on the server-side is at http://download.oracle.com/docs/html/B13916_04/conditions.htm#sthref854

jdickey@mccarthy.com said...

Ahh, when I do the compound condition with an OR, same IS NOT NULL test on both sides of the condition, then I work fine too - again because that seems to remove the outer join situation. Shame Oracle does not document this someone in their Discoverer Plus users guide.

jdickey@mccarthy.com said...

Just to add an update here, I did some more testing and found an interesting result. I started a Discoverer Workbook with data from one table. I then added data from a second table, with a one to many join relationship. Okay so far. I did a condition to filter down to a small number of records. Okay still so far. I next added data from a third table, which has a one to one relationship with table 2, but table 2 (my master) will have records that do not exist in table 3. So after adding table 3, I have rows with master data, but no detail data, as well as master rows with detail data. I kept my existing condition. I next added data from table 4, which is in a one to many relationship with table 3. When I do that my master rows with no detail data drop out of the view. Which is what I wanted in this situation. So it seems sometimes I won't need to do this IS NOT NULL compound condition test to get rid of master rows with no detail. Interesting quirks in how this product works.

JAG said...

It's not a bug!!!!

Hi Michael.
when i've encountered with the problem -i started to search how do i remove outer join from condition - and i did find it in pref.txt of the Plus Application Server
DisableAutoOuterJoinsOnFilters = 1 # Turn off Automatic Outer Joins on filters (0 = on, 1 = off)

but there was still some difference between desktop and the plus - so i thought - "why there is got to be the same pref.txt for desktop somewhere" and indeed there was one - not a file actually - but a registry entries.
look at [HKEY_CURRENT_USER\Software\Oracle\Discoverer 10\Database] and you'll see them all there - well not all of them but the most - the problem was that it didn't included DisableAutoOuterJoinsOnFilters entry - so i entered it manually as a dword value - and what you know - it worked - i can now enable and disable outer joins on conditions as i will

Best Regards
Jonathan

Anonymous said...

Michael, thank you very much for posting this. I found your blog using Google (top choice when searching on "Oracle Discoverer Outer Join") and this solved my problem in minutes. I must say, using an OR statement that has the same condition twice is not intuitive... but it works!

I believe you can simplify the code by putting the original condition in twice, instead of creating a calculation and then checking that. At least this is the case if you're looking for an inclusion instead of an exclusion. Using your example, if you wanted only sick time reported, you could use this condition:

"Leave Desc" = 'Sick'
OR
"Leave Desc" = 'Sick'

That worked for me. I'm sure you could play with the logic a bit to make the exclusion work as well without needing a calculation.

Regards,
Mike

Followers