Monday, March 15, 2010

Note: all my OBIEE posts are based one upon the other. If you want to implement this, please read and do everything I mentioned in my previous posts.

I think that this is the most impressive development that we did over OBIEE. OBIEE is very lack in paging abilities. You can navigate only forward and backwards in the report. Here is how our google-like paging looks like:


In order to implement this in your environment, you should be familiar with how to add javascript into OBIEE . Just call the addPageNumbers() function in every page you'd like. Note that this is heavily based on the current OBIEE's DOM (version 10.1.3.4). In addition, this will only work for one report per dashboard page (this was the original requirement in our dashboard site).
The JS is attached below. Enjoy.

addPageNumbers.js (6.44 KB)

 |  | 
Monday, March 15, 2010 5:21:16 PM (Jerusalem Standard Time, UTC+02:00)
 Thursday, March 04, 2010

Our users had a weird requirement: They wanted to search all the columns in one action. We thought how to do it. In this post I'll describe the solution. Note that this solution is not good in terms of performance, but this is the best we've got.

  1. In every report (/answer/request) you develop, add a column and call it SearchColumn. Define its formula as the concatenation of all the columns in the report. You also need to cast the numeric columns to char in order to concatenate them. For example: FirstColumn || cast(NumericColumn as char) || LastColumn. To make things easier, I written a little C# program that takes the request XML and outputs this column's formula. It's a very simple program so I won't put it here (unless someone will ask for it). You can write this simple program in every language you know.
  2. Hide the column (Column Properties -> Column Format -> Hide).
  3. Create a filter based on this column. In the operator drop-down-list, choose "contains any". Then, click on Add -> Variable -> Presentation. Enter Term in the "Variable Expr" box (Term will be our variable name) and enter space in the (default) box.
  4. Now, we'll let the user to enter the search term. Create a new Dashboard Prompt based on any column you'll like. In the operator, choose "is equal to / is in". Control - "Edit Box". Show Default To - "Variable Expression" and enter Term. Set Variable - "Presentation Variable" and enter Term here too. Label - "Search", for example.
  5. All you have to do is to add the prompt in every page you want to let the user search.

Note that the search term will stay until the user will clear the prompt field. As I said, this solution is problematic for number of reason (performance, "why the other page filters too?" and maintenance).

 |  | 
Thursday, March 04, 2010 10:29:14 PM (Jerusalem Standard Time, UTC+02:00)
 Wednesday, March 03, 2010

This is quite simple, so I'll let the JavaScript talk by itself (attached). I assume that you know how to add your JS to OBIEE. You need to call the function hideTabs from every dashboard page and edit the hiddenTabs array.

This was written on version 10.1.3.4 of OBIEE. Note that if the DOM will be changed in a future version it won't work.

hideDashboardPages.txt (2.03 KB)
 |  | 
Thursday, March 04, 2010 3:51:32 AM (Jerusalem Standard Time, UTC+02:00)

Non Empty Behavior is a very good optimization method you can use with your calculated measures. It allows you to define a list of measures that their emptiness defines the emptiness of your calculated measure. To understand this, look at the picture (taken from the MSDN):

This is the calculation expressions pane from the Calculations tab in the cube designer. From MSDN:

"The measures you specify in the Non-empty behavior list are used to resolve NON EMPTY queries in MDX. When you specify one or more measures in the Non-empty behavior list, Analysis Services treats the calculated member as empty if all the specified measures are empty. If the Non-empty behavior property is blank, Analysis Services must evaluate the calculated member itself to determine whether the member is empty."

This is not new. NEB is quite old feature in SSAS. The new thing (at least for me) is that NEB can also be used in Panorama formulas! For example, M1+M2 /*NONEMPTY_BEHAVIOR={M1,M2}*/ will apply M1 & M2 as NEB. You can see more Panorama optimization methods here.

 |  | 
Thursday, March 04, 2010 2:01:11 AM (Jerusalem Standard Time, UTC+02:00)
One of the basic requirements of a modern BI system or Data Warehouse is to be non-volatile, meaning that the DWH will save the history. In this post I'll explain how to implement it in the DWH while assuming that the history can be changed and (the more interesting part) how to implement it in Analysis Services.

Before starting with history saving in the DWH, I need to say that this has nothing to do with Slowly Changing Dimensions. You can work with Type 1, 2 or 3. All you need to take care of is that the fact table will be consistent with the dimension tables. Assuming that the history can change (this is the scenario in my organization), we need to extract all the fact data every time. If you work with staging level/environment/layer you can use it to calculate which records has been changed or inserted and only upsert them to the DWH itself (it's more secure). In our case, we'll work without such staging level so we need to load all the fact data in every load to the DWH.

Loading the fact in the ETL is very simple - we'll save the datetime of the execution start, round it and put it in every row in the fact table. We can call this field CreateTime. You probably ask yourself why we need to round it. The answer is that we don't want the time dimension to have every possible value of datetime. It's huge and the users don't need that kind of accuracy. That's why I round it in my projects to half-hour, meaning that 14:27 becomes 14:00 and 14:50 becomes 14:30. Let's look at an example of such fact table:

RowNum

Dimension1

Dimension2

Measure1

Measure2

CreateTime

1

X

A

4

20

03/03/2010 15:00

2

X

B

56

19

03/03/2010 15:00

3

Y

B

2

3

03/03/2010 15:00

4

Y

B

1

10

03/03/2010 15:00

5

Y

C

7

5

03/03/2010 15:00

6

X

A

4

20

03/03/2010 15:30

7

X

B

56

40

03/03/2010 15:30

8

Y

B

2

3

03/03/2010 15:30

9

Y

B

10

10

03/03/2010 15:30

10

Y

C

7

5

03/03/2010 15:30

Here we can see two executions of the ETL: The first between 15:00 and 15:00, the second between 15:31 and 15:59. Note the differences in rows 2 vs. 7 and 4 vs. 9. This means that the source table(s) has changed between the executions.

In the DWH, we create foreign keys from Dimension1 and Dimension2 to their dimension tables. We'll also create FK between CreateTime and the time dimension. I won't get into the time dimension design here, you can read about it a past post. I assume that the time dimension has the hierarchy Year -> Month -> Day -> Hour -> Half-Hour. It seems that now we have a good fact table for use in Reporting Services (or any other reporting tool). Create a parameter called Extraction Time/ETL Time and assign it values from distinct on CreateTime. The user can pick a date and the report will filter on this date. Perfect.

The problematic part starts in Analysis Services. Only one extraction time can be the current member. There is no sense in summing or aggregating more than one extraction time. There are two ways to deal with it:

The wrong way - I'm writing this down so you'll learn from this design mistake. We created a calculated member which takes the last extraction time:

Create Member CurrentCube.[Extraction Time]. [Calendar].[All].[Last Data Time]
AS
Tail ([Extraction Time].[Calendar].[HALF HOUR].Members,1).Item(0),
Visible = 1;

In every Panorama view, we sliced the Extraction time on the [Last Data Time] member, which will show the user the data of the last ETL run. Our basic assumption was that the user wants to see the current data and when he'll want to see the historical data he will pick another member from the Extraction time dimension. This is the point we missed and it soon  enough bumped into our face.

We forgot that every system, even BI system which is built for analysts should be intuitive and idiot-proof. The users picked members from the day level in the Extraction time dimension. When we asked them why they did it they said that "we wanted to see the data from 03/03/2010 and suddenly the numbers got bigger!". That happened, of course, because the member 03/03/2010 is the sum of 03/03/2010 15:00 and 03/03/2010 15:30. The solution is that no matter which member the user picks in the Extraction time dimension, he should see only one leaf (half-hour level) member at a time. This takes us to

The right way - We set the [Last Data Time] member to be visible = 0 (I kept it for debug purposes) and I written MDX script which I call NonLastEmptyLeaf. This script makes sure that no matter which member in the Extraction time hierarchy will be picked, the user will be sliced only on the last non empty leaf. For example, if the user picks 03/03/2010 he will see the data of 03/03/2010 15:30. This is what will happen also when he'll slice on 03/2010, 2010 and even the [All] member. This is the MDX script:

Freeze([Extraction time].[Calendar].[All]);
Scope(
  Descendants (
   [Extraction time].[Calendar]. [All],
   0,
   SELF_AND_AFTER
  )
);
  this = Tail (
   Filter(
    Descendants(
     [Extraction time]. [Calendar].CurrentMember,
     [Extraction time].[Calendar].[HALF HOUR],
     LEAVES
    ),
    [Measures].CurrentMember > 0
   )
  ).Item(0);
End Scope;

Now, there is no meaning of [Last Data Time] member because slicing on the [All] member will show the current data, meaning that no slicing at all will show the current data because [All] is the default member.

To conclude, history browsing can be dangerous if we let the users the option to slice on more than one data time. The open question that I have is how can I prevent the user from picking more than one member in this hierarchy in Panorama views? Should I even try to do that or should I count on his minimal common sense?

Wednesday, March 03, 2010 9:34:48 PM (Jerusalem Standard Time, UTC+02:00)