Tuesday, December 19, 2006

A funny thing happened the other day

About six weeks ago I purchased a new REV backup kit from Iomega. We had never had a hard drive crash but I thought - hmm - well it's better to be safe than sorry. We installed it and I distinctly remember it taking many hours to accomplish a complete backup of our application server. Great, great, great - and off to bed I went with a smile.

Well, would you believe it? One week went by and that same server suffered a hard drive failure. Wow - how lucky we have a backup my engineer told me. Not so.....

After purchasing a new hard drive and inserting it into the machine the Iomega software was unable to read their own backup device. Well no problem, my hardware engineer told me, let's give their technical support a call. You could have knocked me down with a feather when they questioned why we had used their software. We're not surprised you can't use the backup if you used our software - you should have used someone else's. Thinking my engineer had misheard I asked him to verify and sure enough the same comment was repeated.

So another call to the Iomega technical support followed, suitably escalated to someone who was a bit more responsible and didn't make flippant remarks about their software. This time we were told to send the tape to them and they would see what they could do. We readily agreed and bundled a new hard drive with it thinking that in a few days we would have our data.

Upon receiving our REV drive, here is what they said:

Our evaluation at this point has concluded that the problem appears to be that the REV disks are unstable with potential read errors that may have resulted in file structure damage and/or corruption which may adversely affect some data segments.

Upon stabilizing the set, we will have to manually extract the data to a native file system, rebuild the corrupted / invalid file system components and tables, adjust file pointers and mount the recovered volume(s) and finally determine if some or all of the data you require is intact.

Based upon this initial evaluation, we feel that a recovery may be possible.

Wonderful, we said and asked When might we expect to see the data?

Oh no, sir, this service is not free. If you want us to give you your data it will cost you $1,700 plus tax.

Of course we said - you're joking right? It's your machine, we just bought the thing, we use your software, it's under warranty, it reported no errors, stop kidding and send us the data.

Unfortunately, they were deadly serious.

Please understand that I'm not casting any aspersions on Iomage or their products. They have a great reputation for great products, I even have one of their external drives. I'm only stating what happened to me and how puzzled I am by what has happened. I honestly thought a warranty was a warranty. Well next time I will read the small print and put this one down to experience.

I thought about it later and reflected what my customers would say to me if I sold them a piece of software that didn't work and then said it will cost you more money to have me come in and fix it - oh and by the way the fee will be double what it cost you to buy it two weeks ago. Guess how many customers I would have!

In case you were wondering, we never did continue the discussions with Iomega. We purchased a new server, this time with RAID technology and triple mirrored drives. So this weekend, Christmas weekend, I get to go home and install OBI.

Happy Christmas everone.

Monday, December 18, 2006

When did those reports last run?

Here's an interesting piece of code that I developed recently that you may find to be of use. It is aimed at administrators and uses a combination of the statistics (EUL5_QPP_STATS) and the documents (EUL5_DOCUMENTS) EUL tables to determine whether a workbook that has been saved to the database has been run recently.

SQL Code:

SELECT DISTINCT
DOCS.DOC_CREATED_BY DOC_OWNER,
DOCS.DOC_NAME DOC_NAME,
(SELECT TRUNC(MAX(DM.QS_CREATED_DATE))
FROM EUL5_QPP_STATS DM
WHERE DM.QS_DOC_NAME = STATS.QS_DOC_NAME) LAST_RUN
FROM
EUL5_QPP_STATS STATS,
EUL5_DOCUMENTS DOCS
WHERE
DOCS.DOC_NAME NOT LIKE 'Workbook%' AND
STATS.QS_CREATED_DATE(+) > :CUTOFF AND
DOCS.DOC_CREATED_DATE < :CUTOFF AND
DOCS.DOC_NAME = STATS.QS_DOC_NAME(+)
HAVING :RUN_DATE >=
NVL((SELECT TRUNC(MAX(DM.QS_CREATED_DATE))
FROM EUL5_QPP_STATS DM
WHERE DM.QS_DOC_NAME = STATS.QS_DOC_NAME),'01-JAN-2000')
GROUP BY DOCS.DOC_CREATED_BY, DOCS.DOC_NAME, STATS.QS_DOC_NAME
ORDER BY LAST_RUN DESC, DOCS.DOC_CREATED_BY, DOCS.DOC_NAME;


Note: The DOCS.DOC_NAME NOT LIKE 'Workbook%' code omits workbooks that were saved using the generic name that Discoverer always users. You might want to find these later and discuss removing them with your users!

Code explanation
The above code has a number of facets about it that I think I ought to explain a little more in depth. These are:
  • The reason for the DISTINCT is to make sure that if the workbook was run more than once during a day that it is not counted twice.

  • The :CUTOFF parameter prompts for a date. Whatever date is entered here is used to find workbooks that have been saved to the database prior to that date and have either run since that date or never run at all

  • The :RUN_DATE parameter prompts for a date also. It is used to find workbooks that have not been run after this date.

  • The following code is interesting because it finds both workbooks that have been run and those which have never run, returning NULL if the workbook did not run at all in the period between the two parameters

HAVING :RUN_DATE >=
NVL((
SELECT TRUNC(MAX(DM.QS_CREATED_DATE))
FROM EUL5_QPP_STATS DM
WHERE DM.QS_DOC_NAME = STATS.QS_DOC_NAME),'01-JAN-2000')

Sample Output
Here's some sample output, from TOAD, that I ran this evening against my own test database using 01-JUN-2006 for :CUTOFF and 01-SEP-2006 for :RUN_DATE.



Note: If you have never used TOAD or have always wanted to try it but never got around to it please let me know by email at michael@learndiscoverer.com. I have partnered with the manufacturer and have a trial license which you can use for evaluation purposes.

Thursday, December 14, 2006

Creating indexed items

I saw this recent posting on the OTN forums:

Is it possible to create Discoverer LOV's to display a lookup description e.g. Department Name and return the primary key code namely, Department Id when using an LOV for parameters for example?

The answer is Yes but the solution needs a bit of explaining so I thought I would do so here so that it is available to everyone. In order to make this happen we need to have a table that has an item and a code, and make sure that there is a one to one relationship between the item and the code. It's better to do this against a dimension table that contains unique codes and items. Then, you also need to make sure that there are indexes on both items.

Example table:

Let's start with creating the new table, inserting some values into it, and creating a couple of indexes. Here's the code:

CREATE TABLE TST_TABLE AS (
SELECT 1 PERSON_UID,
Trunc(SYSDATE) BIRTH_DATE,
TO_CHAR(TRUNC(SYSDATE), 'DD-MON-YYYY') CHAR_DATE,
123456 TAX_ID,
'Smith, Michael' FULL_NAME
FROM dual);


INSERT INTO TST_TABLE VALUES (
2, TRUNC(SYSDATE),
TO_CHAR(TRUNC(SYSDATE),
'DD-MON-YYYY'), 234456,
'Jones, Edward');

INSERT INTO TST_TABLE VALUES (
3, TRUNC(SYSDATE),
TO_CHAR(TRUNC(SYSDATE),
'DD-MON-YYYY'),
434456, 'Evans, Bob');

COMMIT;

GRANT SELECT ON TST_TABLE TO PUBLIC;

CREATE INDEX TST_TABLE_UK1 ON TST_TABLE (PERSON_UID);

CREATE INDEX TST_TABLE_UK2 ON TST_TABLE (FULL_NAME);

Discoverer Administrator steps
Next, I went into Discoverer Administrator and imported the table as a new folder from the database. I then went to the Full Name and created a standard list of values on it. With that done, I right-clicked on the item and from the pop-up list selected Properties. Scrolling down to the bottom of the list is a property called Indexed Item. Clicking this brings up a small dialog box in where I picked the indexed item called PERSON_UID.

Note: Indexed Item only becomes accessible when there is a) a list of values on the item, b) that there is another item in the same underlying table that is also indexed, and c) that there is a one to one relationship between the two items.

When I clicked OK, Discoverer displayed the following message:

There is nothing else to do except to click OK and allow Discoverer to create the other list of values. With this done, I am finished with Discoverer Administrator.

Working with Discoverer Plus
I created a new worksheet that includes the original item, FULL NAME, that we worked on above. Next, I created a new condition for the item but opted to use a parameter for the operand. On the right-hand side of the New Parameter dialog box, as shown below, is a checkbox called Enable users to select either indexes or values. I checked this box, completed the parameter and condition.

Note: the Enable users to select either indexes or values will be grayed out if the administrator has not completed the Discoverer Administrator steps outlines earlier.

When I execute this worksheet, it pops up the following parameter. As you can see this parameter is different to those that are normally seen inside Discoverer because this one has an additional drop-down box.

Clicking on the flashlight for the list of values, Discoverer displayed the following list. As you can see, it has both the value and the index, with the index being displayed in brackets.

Note: the above list is displayed no matter whether the user selects Index or Value in the first drop-down. The difference is only seen when the user completes the selection.

The following illustration shows what is returned when I searched by index and selected (2) Jones, Edward.

The following illustration shows what is returned when I searched by value and selected the same person.

The following illustration shows the output of my worksheet when I complete the parameter selection. It makes no difference whether I select by index or by value because the worksheet still displays the value as originally requested.

I hope this helps show how indexed items works.

Note: Do not attempt any of this inside Discoverer Desktop because it will not work. However, if you create a worksheet that has a parameter that uses Index and Value that parameter will be available inside Discoverer Viewer.

VPD date issue between 10.1 and 10.2 databases

I have uncovered an issue today with VPD against date fields. The issue came up at a client site on a 10.2 database. When I got back to the hotel and tried this on my own database, which is 10.1, I got no issue and everything worked ok.

Therefore, I am asking if anyone is able to try out the following for me on different Oracle databases and platforms and let me know by email at michael@learndiscoverer.com whether it worked for you or not. I'd like to compile a list of database versions and platforms on which the issue occurs.

Step 1: create this table as a user

CREATE TABLE TST_DATE AS (
SELECT 1 PERSON_UID,
Trunc(SYSDATE) BIRTH_DATE,
To_Char(Trunc(SYSDATE), 'DD-MON-YYYY') CHAR_DATE,
123456 TAX_ID,
'Smith, Michael B.' FULL_NAME FROM dual);

Step 2: grant select rights on the table

GRANT SELECT ON TST_DATE TO PUBLIC;

Step 3: switch your login to SYS and run this script:

CREATE OR REPLACE FUNCTION F_CHECK_ITEM_TST(p_object_schema in varchar2, p_object_name varchar2)
RETURN VARCHAR2 IS
V_PREDICATE VARCHAR2(2000) := '1 = 2';
BEGIN

RETURN(V_PREDICATE);
END F_CHECK_ITEM_TST;

Step 4: Run this to grant access rights over the function

GRANT EXECUTE ON F_CHECK_ITEM_TST TO PUBLIC;

Step 5: Run this to enable the policy

BEGIN DBMS_RLS.ADD_POLICY(
OBJECT_SCHEMA => 'DRAKE',
OBJECT_NAME => 'TST_DATE',
POLICY_NAME => 'SecByTST',
FUNCTION_SCHEMA => 'SYS',
POLICY_FUNCTION => 'F_CHECK_ITEM_TST',
STATEMENT_TYPES => 'SELECT',
POLICY_TYPE => DBMS_RLS.DYNAMIC,
SEC_RELEVANT_COLS => 'BIRTH_DATE',
SEC_RELEVANT_COLS_OPT => DBMS_RLS.ALL_ROWS);
END;

Note: in the above policy change the OBJECT_SCHEMA name to be the name of the user who owns the table you created in Step 1

Step 6: Go into Discoverer Admin and import the table as a new folder.
Be sure to uncheck the "Date hierarchies" box and set "Default aggregate on data points" to detail. Grant access to any user.

Step 7: Using Desktop or Plus, log in as that user then try to query the folder with the date not included, there should be no problem.

Step 8: Now try to query with the date included
On my 10.2 database I get ORA-24334: no descriptor for this position. This happens when logged into Discoverer as the table owner, EUL owner or any other user. However, when logged into Discoverer as SYS which is exempt from all VPD policies, there is no error when the date is included.

We tried this in Discoverer Desktop, Plus and Viewer with the same results in all three. We can successfully query the table from SQL*Plus, SQL Developer and TOAD.

There also seems to be no issue at all with the 10.1 database so there seems to be an issue between Discoverer and the 10.2 database.

What do you think?

P.S. to cancel the policy use this script when logged in as SYS:

EXEC DBMS_RLS.drop_policy(
OBJECT_SCHEMA => 'DRAKE',
OBJECT_NAME => 'TST_DATE',
POLICY_NAME => 'SecByTST');

Wednesday, December 13, 2006

A Discussion about Averages

Overview
Many people think that the term “mean” means the same thing as “average.” It doesn't. The term mean is a mathematical term, whereas the term average is often used loosely as a description for a person or a data item. In mathematics, average is a number that typifies a set of numbers of which it is a function.

In other words, average can stand for mean, median, or mode. So what is the difference between these three terms?

  • Median - is the middle value in a set of numbers; above and below which lie an equal number of other values.
  • Mean - is a number that typifies a set of numbers, such as a geometric or arithmetic mean; it is what most people think of as being the average value of a set of numbers.
  • Mode - is the value or item that occurs most frequently in a set of statistical data.
Look at the following two sets of data:

Example set 1: 15 16 17 17 17 18 19
Analyzing this first set of data, representing the ages of the seven students in a school who enjoy drama, I get:
  • Mean Age = 17
  • Median Age = 17
  • Modal Age = 17
Example set 2: 15 17 17 17 21 29 52
Analyzing this second set of data, representing the ages of the seven night school students who enjoy drama, I get:
  • Mean Age = 24
  • Median Age = 17
  • Modal Age = 17
As you can see, the median age and the modal age for both sets of students are the same, whereas the mean age is completely different. Which average you choose will determine whether you consider the average age for students who enjoy drama to be the same for both normal schools and night schools. If statistics (mathematics is my major by the way) taught me nothing else, they showed me that there is no such thing as an average.

So let's move on to Discoverer and let me show you how to create the calculations that will enable you to report the Median and the Mode. I won't discuss Mean here because this is what Discoverer and everyone else generally refers to as the Average. As you know, Discoverer has a built-in function (AVG) to handle this.

The base worksheet:
As usual, I will start with a base query drawn from my own tutorial database. As you can see below, this worksheet displays a count of the number of lines, the unit price and the total number ordered for each product against each product line. There is a group sort on the Product Line and a low to high sort on the Order Qty.


Creating the Median

The calculation that you will need for the Median is the PERCENTILE_CONT analytic function. The basic syntax of this function is:

PERCENTILE_CONT(n) WITHIN GROUP(ORDER BY expr1)
OVER(PARTITION BY expr2)

Looking at my base query above let's figure out some values for expr1 and expr2, then I will discuss the appropriate values for n.

First of all, the easy one - the PARTITION BY. Looking at the example worksheet above I can see that I have a group sort on the Product Line. Because I know that page items and group sorted items are normally the partitioned items I will use the Product Line for expr2.

Next, let's determine a value for expr1 - the ORDER BY. Once again, look at the example worksheet. I only have two sorts in my worksheet. The first is a group sort on the Product Line and the second a low to high sort on the Order Qty. However, my Product Line has already been handled because it is my partitioned item. Therefore, the most logical item for me to use in my ORDER BY clause is the Order Qty, which is what I did.

Note: the underlying aggregation for my Order Qty is SUM so it is very imported to select the Order Qty SUM and not Order Qty itself.

Finally, let's discuss n - the value provided to PERCENTILE_CONT. This is a value from 0 to 1 with the value indicating how far along a scale from 0% to 100% I want the value to be. Because the definition of Median is the midle value, it therefore makes sense to choose 0.5 or 50%.

Putting all of this together, here is my calculation:

PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY Order Qty SUM)
OVER(PARTITION BY Prod Line)

which when inserted into a Discoverer calculation makes my worksheet look like this:

Note: because there are an even number of values for MINI-WIDGET and MEGA-WIDGET, Discoverer will choose a value exactly half-way between the two middle items in the set. When there is an odd number of values, Discoverer will always choose the actual middle item, as you can see if you look at the Median for SUPER-WIDGET and WONDER-WIDGET.

Using 0 and 1 in PERCENTILE_CONT

If you use 0, for 0%, as the value for n in the PERCENTILE_CONT calculation this means you want Discoverer to display the first value, wheras if you use 1, for 100%, this means you want to see the last value.

Here is an example using 0:

PERCENTILE_CONT(0) WITHIN GROUP(ORDER BY Order Qty SUM)
OVER(PARTITION BY Prod Line)

Here is an example using 1:

PERCENTILE_CONT(1) WITHIN GROUP(ORDER BY Order Qty SUM)
OVER(PARTITION BY Prod Line)

Here is what this last calculation looks like inside Desktop:

Creating the Mode

The calculation that you will need for the Median is the FIRST_VALUE analytic function, couple with the RANK function. The basic syntax of this combined function is:

FIRST_VALUE(expr1) OVER(PARTITION BY expr2
ORDER BY RANK() OVER (ORDER BY COUNT(*) DESC))

Hopefully, you can see what is going on. Starting from the inner function, the RANK, all this does is count the number of rows, which in our database happens to be the same as the number of lines, and produce a ranking in descending order with the highest count being 1, the next highest being 2 and so on. Because this ranking is embedded within the main function there is no possibility of a user accidentally deleting this calculation.

This whole RANK function then becomes the ORDER BY clause of the main function. Thus, I only need to find values for the two expressions, expr1 and expr2.

First of all, the easy one - the PARTITION BY. Looking at the example worksheet above, once again I can see that I have a group sort on the Product Line. As mentioned earlier, because page items and group sorted items are the partitioned items I will use the Product Line for expr2.

Finally, let's discuss expr1, the value which will be used for the basis of FIRST_VALUE. This is the item that I want to base the Mode on. In my case, I have chosen to use the Unit Price.

Putting all of this together, here is my calculation:

FIRST_VALUE(Unit Price) OVER(PARTITION BY Product Line
ORDER BY RANK() OVER (ORDER BY COUNT(*) DESC))

which when inserted into a Discoverer calculation makes my worksheet look like this:

Note: as you can see the mode has been calculated for each group sorted item. Also, even though multiple products have the same unit price, this will still be taken into account. Look at the mode result for SUPER-WIDGET. Discoverer is telling me that the mode is $26.95. This is correct because Discoverer has added the number of lines for both AVR-500 and AVR-550 together, which accounst for more lines than any of the other products in that product line.

If you are interested, here is my calculation inside Desktop:

Extending the Mode

Take a look at the following calculation. You will see that I have altered it to remove the PARTITION BY clause.

FIRST_VALUE(Unit Price) OVER(
ORDER BY RANK() OVER (ORDER BY COUNT(*) DESC))


Becasue the PARTITION BY clause has been removed, the Mode now calculates on the whole set. Thus, as you can see below, I have now determined that $22.18 is the Unit Price that is most often used in my organization. This is obviously a price that my customers are willing to pay. Hey, maybe I ought to raise the prices on the lower priced items!

Tuesday, December 12, 2006

Differences between Desktop, Plus and Viewer

One of my good friends, Russ Proudman, has put together a list of the differences between Desktop, Plus and Viewer (link).

If you have any additions you would like to see or of you spot any errors please let me know and I will pass these on to Russ.

Wednesday, December 06, 2006

The low-down about drilling in hierarchies

If you have taken the time to create hierarchies inside the Discoverer Administrator tool, you may be interested in knowing what Discoverer will do when a user drills up and down one of your hierarchies. The answer may well surprise you.

Overview
When a user drills up, Discoverer leaves the original item in situ, adds the drilled-up item to the query to the right of the original item, and leaves the number of rows the same.

When a user drills down, Discoverer removes the original item, replaces it with the drilled-down item, and increases the number of rows to accommodate the new situation.

Because a picture tells a thousand words, allow me to explain the above scenarios using sme images from my own system. First of all, I must have a start point. I will then take this base worksheet and first of all drill up from the quarter to the year, followed by drilling down from the quarter to the month:

The base worksheet:


As you can see above, this worksheet displays my revenue by fiscal quarter for the fiscal years 2004 and 2005.

Note: for this demonstration, the fiscal year runs from October 1 to September 30.

Let's drill UP. When drilling up, the new item is added to the query rather than replacing the original, thus retaining the same number of rows. Interestingly the new (aka summary) drilled-to item is added to the right of the original item rather than to the left which is where it should be. This is all standard Discoverer functionality so please don't shoot the messenger. Personally, I think it would be better if Discoverer were to replace the original item with the drilled-up item and reduce the number of rows to summarize the data. Unfortunately, as you can see below, it does not:

The drilled-up worksheet:


Let's drill DOWN. When drilling down, the new item replaces the original item and Discoverer increases the number of rows to accommodate the new situation. In my opinion, this is also incorrect. I think it would have been better if Discoverer had retained the original item and added the drilled-down item to its right-hand side. This way I could easily add a group sort on the higher-level item with a sub-total on each break.

Here is Discoverer's method of drilling down:

The drilled-down worksheet:


And here is the way that Discoverer could look if the drilled-from item had not been replaced:

The ideal drilled-down worksheet:

Of course, I know you can always edit the worksheet and add back in the item that was removed, which is exactly what I did above.

Followers