Wednesday, June 17, 2009

A wonderful idea I heard of is to switch Internet Explorer to full-screen mode when entering the Dashboards site. It makes for a better user experience. Try it yourself and you'll see the huge difference.
How will we do that? We will add JavaScript code to the first page of the Dashboards site, and then ask our system administrator to enable this script for us. Let's get to work:

Step 1 - The JavaScript

Create a new HTML component in the dashboard page. Edit it and click on the "View Source" button (the one with the <>). Enter the following code:
<SCRIPT>
var wscript = new ActiveXObject("Wscript.shell");
wscript.SendKeys("{F11}");
</SCRIPT>

It simulates the user hitting the F11 key, which switches IE to full-screen mode. The only problem is that when viewing the page, you'll see this message:

This takes us to step 2.

Step 2 - Enabling ActiveX

First, I'll show you how to do this on your local machine, and then you'll ask your system administrator to enable it on all the machines in the organization using distribution. Open the Tools menu in IE and choose Internet Options. Click on the Security tab and make sure the "Trusted Sites" zone is selected. Note that the Panorama Dashboards site is already defined as a trusted site if the initial installation of Panorama Dashboards was done according to the installation manual (if it's not, you have a problem). Click on "Custom Level" and enable the "Initialize and script ActiveX ..." option:

Now you'll see that there's no prompting for ActiveX controls. Show this to your sys admin and ask him to make this happen on every user's machine (using distribution, of course). As I said, the dashboards site is a trusted site, so I can't see any problem with enabling this. The result is very beautiful and can make a lot of users happy. Note that you can also add a button to your page that calls the same script in order to return to normal mode.
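
The button idea can be sketched as a small function that wraps the same script. This is only an illustration: the name toggleFullScreen and its shellFactory parameter are made up here (the parameter exists so the function can be tried outside IE), and the try/catch is an addition so that a blocked ActiveX call does not break the page.

```javascript
// Sketch only: WScript.Shell is available solely to Internet Explorer pages
// that are allowed to script ActiveX controls (see Step 2).
// shellFactory is a hypothetical parameter; by default the real ActiveX
// object is created, exactly as in the script from Step 1.
function toggleFullScreen(shellFactory) {
  var createShell = shellFactory || function () {
    return new ActiveXObject("WScript.Shell");
  };
  try {
    // F11 toggles full-screen mode, so the same call enters and leaves it.
    createShell().SendKeys("{F11}");
    return true;
  } catch (e) {
    // ActiveX is blocked or unavailable: leave the page as it is.
    return false;
  }
}
```

A button in the HTML component could then use onclick="toggleFullScreen()".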

Enjoy.

Wednesday, June 17, 2009 6:08:44 PM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, June 09, 2009

When you add parameters to your view, you'll see that they appear in the upper-left corner of the grid/crosstab. In the NovaView Desktop program this can be tolerated, but in the Web Access or in the Dashboards web site it cannot. It's very annoying, and we can't let the users see our internal use of the parameters. What can we do?

The solution is very simple: we need to change the skin of the view/dashboards page/dashboards site (depending on how you work) and make the grid corner font's color identical to the color of the grid corner's background. That way, the users will not see the text in the grid's corner. Doing it is not hard either:

Remember: Always back up your files before modifying them. In the Panorama folder, go to E-BI/Config/Skins and open your skin's folder. By default you're using the default skin, which can be changed in the Dashboards settings section. I recommend making a new skin from the default one (see here), updating the skin's name in the Dashboards settings section, and not touching the default skin itself. In your new skin, change the GridCornerFont setting so that its color is the color of the grid corner's background. You can see the color of the grid's corner in the GridTopLeftBackground setting. For example, if GridTopLeftBackground=(194,210,226), then setting GridCornerFont=((Arial,1,R),(194,210,226)) means no one will see the text there.
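
To avoid typos when copying the color by hand, the value can be derived from the background setting. The following is a minimal sketch: the function name hiddenCornerFont is made up, and the ((font,size,style),(r,g,b)) layout is simply copied from the example above. Reading and writing the skin file itself is not shown.

```javascript
// Builds a GridCornerFont value whose color equals the corner background.
// Input is the value of GridTopLeftBackground, e.g. "(194,210,226)".
function hiddenCornerFont(gridTopLeftBackground) {
  var compact = gridTopLeftBackground.replace(/\s+/g, "");
  var match = /^\((\d+),(\d+),(\d+)\)$/.exec(compact);
  if (!match) {
    throw new Error("Expected a color like (194,210,226)");
  }
  // Font name, size and style are taken from the example in this post.
  return "((Arial,1,R),(" + match[1] + "," + match[2] + "," + match[3] + "))";
}

console.log("GridCornerFont=" + hiddenCornerFont("(194,210,226)"));
// prints GridCornerFont=((Arial,1,R),(194,210,226))
```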

Enjoy.

Tuesday, June 09, 2009 10:55:27 PM (Jerusalem Daylight Time, UTC+03:00)
 Monday, June 01, 2009
While trying to build an ASP.NET page with Panorama applets, I could not understand why the applets appeared blank when I put them in tables. After a while, I found that this has something to do with the DOCTYPE declaration that each aspx page has at its top (!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"). When I removed this line, everything seemed to work fine. Digging into the DTD specification, I can't see anything that would prevent an applet tag from appearing inside a td tag. Strange. One possible explanation is that removing the DOCTYPE switches the browser to quirks mode, which changes how sizes inside tables are calculated, rather than anything in the DTD itself.

Monday, June 01, 2009 9:32:34 PM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, May 31, 2009

A new request came from one of our customers: the ability to search in a Panorama crosstab. The first solution I thought of was searching the grid iteratively, and it worked fine using the Panorama SDK. After that, my friend Boris came up with another simple and elegant solution: we can use a parameter in the Panorama view and highlight the number defined in the parameter. In this post I'll explain how to implement this.

1. Create a new view using Panorama NovaView Desktop and make sure you see the grid in the view.

2. Define a new parameter: Click on View -> Parameters, and click on the "Manage Parameters" button. Click on Add. The default type is Number and this is exactly what we need (for now). In the name, type Highlight and in the Default Value type a number that you see in the grid (this is the number that will be highlighted later). Let's take 0 for example. Click on OK twice and close the little Parameters window.

3. Create a new Exception: Click on Data -> Exceptions -> Exceptions... -> Add. Click on Next and then choose "Custom Exception". Click on "Edit Exception" and there write the following formula: [Measures].CurrentMember = [[Highlight]]
This will simply select all the cells with the number that we defined earlier in the Highlight parameter. Click on OK and click Next. In this step, define the style of the highlighted cells. I picked red color and Bold font style. You can click on Finish now and then click OK. Open the small Parameters window (right click in the crosstab's corner and choose Parameters) and click on "Apply Changes". Now, you will see that all the cells with 0 are highlighted. If you don't see it, check that you did all the steps correctly.

4. When we show the view to the user, we don't want anything highlighted when the view loads. This is where a little trick comes in: open the small Parameters window (right click in the crosstab's corner and choose Parameters) and double click on the Highlight parameter. Choose String as the parameter type (on the right part) and in the Default value, enter abc. Click OK twice and then click the apply button, and you'll see that the highlighted cells are now regular ones.

5. In the dashboards page, or in the web page you created using the Panorama SDK, create a button that calls the function searchGrid. Just add the property onclick="searchGrid('master')" to the button, where master is the applet's name. This is the code of the searchGrid function:

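function searchGrid (applet) {
  // Ask the user for the number to highlight; prompt returns null on Cancel.
  var reply = prompt('Please enter the number to search','');
  // Ignore Cancel and non-numeric input, which would also break the eval string.
  if (reply === null || !/^-?\d+(\.\d+)?$/.test(reply)) return;
  // applet is the applet's name, so eval resolves it to the applet object.
  eval(applet + '.CallUpdateParametersEx("P|~|Highlight|~|' + reply + '|~~|")');
}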

Other tweaks I implemented but left out of this post to keep it simple (for advanced developers only):
  • You can search all the views in the current web/dashboard page. Just call the function for every applet, but do it in Batch mode.
  • You can search the whole grid even if the user doesn't see all the rows, and tell him whether the number he searched for is in there or not.
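
The first tweak can be sketched as follows. This is not the code I used: the names buildHighlightUpdate and searchAllGrids are made up for the illustration, the applets are assumed to be passed in as objects that expose CallUpdateParametersEx, and Batch mode is left out because its calls are not described in this post.

```javascript
// Builds the update string that searchGrid sends for the Highlight parameter.
function buildHighlightUpdate(value) {
  return "P|~|Highlight|~|" + value + "|~~|";
}

// Sends the same search value to every applet in the list.
// applets: array of applet objects exposing CallUpdateParametersEx.
// Returns the number of applets that were updated.
function searchAllGrids(applets, value) {
  // Skip Cancel (null), empty and non-numeric input.
  if (value === null || !/^-?\d+(\.\d+)?$/.test(String(value))) {
    return 0;
  }
  var update = buildHighlightUpdate(value);
  for (var i = 0; i < applets.length; i++) {
    applets[i].CallUpdateParametersEx(update);
  }
  return applets.length;
}
```

In a page, the applet objects would be collected once (by their names) and the value would come from a single prompt, so the user is asked only one time for all the views.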

That's all. Test your new page and enjoy. If you have any question about this or anything else, you can leave a comment or send me an email.

Sunday, May 31, 2009 6:11:47 PM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, April 07, 2009

When you develop a big Panorama Dashboards site, you'll have a lot of JavaScript code in the background. We use JS to call the Panorama SDK functions and methods, to make the server side and the client side work together, and to make the website dynamic and user-friendly. After a few days, you'll see that you have a lot of code out there, so you must organize it (if you didn't do it in the first place). My friend Doron wrote a great post about JS development guidelines which can help Panorama Dashboards developers and developers of any big website.

Wednesday, April 08, 2009 6:12:17 AM (Jerusalem Daylight Time, UTC+03:00)
Very easy. In Sql Server:
Select top n * from my_table
In Oracle:
Select * from my_table where rownum <= n

It can be very useful in many cases. For example, you're designing a DWH over a system and you're looking at a certain field in one of its tables. You want to know which values this field contains, but running "select distinct my_field from my_table" takes too much time. Instead, if you believe that the data is well distributed, you can use "select distinct my_field from my_table where rownum <= n". Use 1000 for n in the first trial and add one zero to the end of n every time, until you get a query that takes longer than you want to wait. Once you have the n you can live with, you can use the values in your query result.
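
The trial-and-error loop can be sketched like this. It is only an illustration: runQuery is a hypothetical callback that runs the "where rownum <= n" query for a given n and returns its rows, and sampleDistinct, maxMillis, startN and maxN are names invented for the sketch.

```javascript
// Repeats the sampling query with n = 1000, 10000, 100000, ... and returns
// the last result that finished within maxMillis (or null if none did).
// runQuery(n) is a hypothetical synchronous callback supplied by the caller.
function sampleDistinct(runQuery, maxMillis, startN, maxN) {
  var n = startN || 1000;
  var limit = maxN || 1e9; // safety stop so the loop always ends
  var best = null;
  while (n <= limit) {
    var started = Date.now();
    var rows = runQuery(n);
    var elapsed = Date.now() - started;
    if (elapsed > maxMillis) {
      break; // too slow: keep the previous, acceptable result
    }
    best = { n: n, rows: rows };
    n *= 10; // "add one zero" to n
  }
  return best;
}
```

Remember that the values you get are only a sample, so this is useful for getting to know a field and not for anything that needs the complete list.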

Wednesday, April 08, 2009 5:54:28 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, March 22, 2009

We're now beginning a quick session of Panorama Dashboard development using version 5.5. The Dashboard site is written in ASP, so it seems very natural to write our custom code in ASP as well. On the other hand, writing in C# is much more fun, so we decided to try and see if we can write the server side code in C#. We found that it is indeed possible. You can place an iframe with an ASP.NET page and reference the dashboard page from it using JavaScript (start from window.parent and go on from there).

This way you can enjoy the best of both worlds. In the next posts I'll show some examples of what you can do using this method.

Monday, March 23, 2009 6:55:14 AM (Jerusalem Standard Time, UTC+02:00)

There are some things you can only learn the hard way. It didn't happen to me personally but to my teammates, so I consider it my bad as well.

We upgraded our ETL tool, Informatica, from version 8.5 to 8.6. We had to run some tests to see that the results were the same. So we saved the result table from 8.5 in Excel, saved the result table from 8.6 in Excel and then compared them using Excel's built-in functions. The only problem is that we didn't pay attention to the places where zero and null were interchanged. This happened because the two versions act differently when null values take part in aggregation functions, for example when a sum function aggregates only null values: in one version the output is zero and in the other the output is null...
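
A comparison that treats zero and null as different values can be sketched as below. It is not what we ran back then: compareResults is a made-up name, and the sketch assumes both result tables are already loaded as arrays of row objects with the same columns, the same number of rows and the same row order.

```javascript
// Compares two result tables cell by cell and reports zero/null swaps
// separately from all other differences.
// ASSUMPTION: oldRows and newRows have the same length, order and columns.
function compareResults(oldRows, newRows) {
  var report = { zeroNullSwaps: [], otherDiffs: [] };
  for (var i = 0; i < oldRows.length; i++) {
    for (var column in oldRows[i]) {
      var oldValue = oldRows[i][column];
      var newValue = newRows[i][column];
      if (oldValue === newValue) {
        continue; // strict comparison: 0 and null are NOT equal
      }
      var isSwap = (oldValue === null && newValue === 0) ||
                   (oldValue === 0 && newValue === null);
      var target = isSwap ? report.zeroNullSwaps : report.otherDiffs;
      target.push({ row: i, column: column, oldValue: oldValue, newValue: newValue });
    }
  }
  return report;
}
```

Keeping the swaps in their own list makes it easy to decide whether they are acceptable or a real change in behavior.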

This can also happen in other tools and technologies. For example, in OLAP cubes the difference between zero and null is the difference between seeing the member of the dimension on the screen and not knowing of its existence.

In conclusion, always be aware of this point and don't forget to check for it.

Monday, March 23, 2009 6:09:59 AM (Jerusalem Standard Time, UTC+02:00)
 Sunday, February 22, 2009

Here are some tips we collected over the years about dashboard design:

Page Layout

  • Less is more - don't put too many views in the page.
  • Rule of thumb - no more than five reports in one page.
  • Don't use scrolling - the average user won't scroll down the screen.
  • Position in screen - some research has been done on this subject and here are the recommendations:
    • Top-Left - it's the part of the screen that the user looks at first. Put the most important data there.
    • Center of the screen - the part the user looks at after the top-left. Put the second-most important data there.
    • Top-Right, Bottom-Left - Neutral parts.
    • Bottom-Right - The user won't pay attention to it, so don't put important data there.
  • Fixed menus in every page.
  • A small number of navigation targets in every page. Too many navigation paths will cause confusion.
  • Concentrate on the main page - in 90% of the cases the user will stay there.
  • Add graphic components and highlight them if necessary.
  • Use blue only for links (and underline them, of course).

Views Layout

  • Two digits after the decimal point - in non-integer numbers, show only two digits after the decimal point. More than that is hard for the user to take in.
  • Focus on the clarity of the data and not only on its beauty. For example, 3D pie charts are very beautiful, but flattening them will make them clearer to the user.
  • Measures have meaning only when compared to other data. Don't show a stand-alone measure.
  • Distinguish the data by graphical differences and not only by colors. Remember that there are color-blind users.
  • Text is clearer than icons.
  • Use sans-serif fonts such as Arial. They are the most readable to the user.
  • Align the text to one side and not to the middle. Centered text may look better to programmers, but users want their text aligned to the left or to the right.
  • Colors - don't use too many colors. The dashboard page is not a jungle. Use colors of the same family.
  • Put dark text on a bright background or vice versa.

And to conclude - use CSS whenever you can. It will save you much time and effort.

Sunday, February 22, 2009 9:14:26 PM (Jerusalem Standard Time, UTC+02:00)
 Tuesday, February 10, 2009

In many cases, you need to extract the MDX code of a Panorama view. For example, in order to check whether the performance bottleneck is in Panorama or in the OLAP server, you can take the MDX code, run it in SQL Server Management Studio and compare run times. You can take it further, but I'll leave that for future posts.

In the Panorama Desktop program, click on Tools -> Direct MDX... and then press CTRL + ALT + V. Then you can copy it and use it any way you want.

Tuesday, February 10, 2009 5:00:54 PM (Jerusalem Standard Time, UTC+02:00)
 Monday, February 09, 2009
I'm glad to announce that two new Panorama forums were opened in the last few days:
The first one is Panorama's Technical forum (English only). From now on you can get answers using this forum and see other users' problems and answers.
The second one is an independent Panorama forum (Hebrew only). This forum was created by Michael Ra'am, a former Panorama consultant. Its purpose is to be a place for sharing knowledge and ideas.
I believe you'll see me in both forums. See ya!

Monday, February 09, 2009 10:34:32 PM (Jerusalem Standard Time, UTC+02:00)
 Thursday, February 05, 2009

In some of our projects, we develop the Panorama views in the development environment along with the Data Warehouse, the ETL, the Cubes, etc. That's because the customers want to see what their product will look like before we deploy the views in the production environment. So, how do you deploy Panorama views from one environment to the other?

  1. Create the new book - If it's a new briefing book, create it using the Panorama NovaView Administrator program. If it already exists you can skip this step.
  2. Copy the content - The book's content is by default in c:\<Panorama Folder>\E-BI\books\<Book Name>. Copy the content of this directory from the dev machine to the production machine. This is not enough because the views are still pointing at the dev environment, so:
  3. Change the views' properties - You need a very simple program (let's call it PanoramaDeployUtil) that iterates over all the views in the given folder (and its subfolders, recursively) and changes their properties. I recommend opening each view's file using an XML reader and changing the element \pnView\Root\Cube\Properties. You need to set its properties CubeAddress, CubeName & CubeDB according to the new environment's values (CubeAddress is the server address). Just run this program and the views will point at the new environment.
  4. Check - make sure everything is OK by opening Panorama Web Access or Panorama Desktop and verifying that the values shown are the production values.
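
Step 3 can be sketched roughly as follows. Treat it as an outline under loud assumptions and not as the real PanoramaDeployUtil: it assumes the three settings are stored as attributes of the first <Properties> element in the file and that the files are UTF-8 text, neither of which is confirmed here. The names retargetView and retargetFolder are made up. Check one view file by hand, and back up the book folder, before running anything like it.

```javascript
var fs = require("fs");
var path = require("path");

// Rewrites CubeAddress, CubeName and CubeDB in one view's XML text.
// ASSUMPTION: they are attributes of the first <Properties ...> element,
// e.g. <Properties CubeAddress="..." CubeName="..." CubeDB="..." />.
function retargetView(xml, target) {
  return xml.replace(/<Properties\b[^>]*>/, function (tag) {
    ["CubeAddress", "CubeName", "CubeDB"].forEach(function (name) {
      var attribute = new RegExp("(\\b" + name + '\\s*=\\s*")[^"]*(")');
      tag = tag.replace(attribute, function (whole, open, close) {
        return open + target[name] + close;
      });
    });
    return tag;
  });
}

// Walks the book folder recursively and rewrites every file that changes.
function retargetFolder(folder, target) {
  fs.readdirSync(folder).forEach(function (name) {
    var full = path.join(folder, name);
    if (fs.statSync(full).isDirectory()) {
      retargetFolder(full, target);
      return;
    }
    var original = fs.readFileSync(full, "utf8"); // ASSUMPTION: UTF-8 text
    var updated = retargetView(original, target);
    if (updated !== original) {
      fs.writeFileSync(full, updated); // untouched files are never rewritten
    }
  });
}
```

A call would look like retargetFolder(bookFolder, { CubeAddress: "prodServer", CubeName: "Sales", CubeDB: "ProdDB" }), where all the values are placeholders. A proper XML reader, as recommended in step 3, is the safer choice for a real tool.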

Know that you can always open the Panorama Desktop and change the views one by one by hand.

Enjoy.

Thursday, February 05, 2009 10:21:08 PM (Jerusalem Standard Time, UTC+02:00)
 Saturday, January 03, 2009

You should know that when you run a program by starting a job with a CmdExec step, the directory in which the program runs will be c:\<windows dir>\system32. How can this affect you? For example, I created a .Net console application that has a settings file with it. When I ran it using the SQL Server Agent, it couldn't find the settings file (worse - it used the default settings, which made the problem much harder to find). After some research, I found that it was looking for the file in the directory I mentioned.

Sunday, January 04, 2009 6:24:58 AM (Jerusalem Standard Time, UTC+02:00)
 Wednesday, December 03, 2008

I've been thinking lately about the new Microsoft Chart controls which are based on the Dundas acquisition (made in April 2007). What is the meaning of this to us, the BI developers?

Until now we always relied on the abilities of our BI products. Let's take Hyperion/Oracle Essbase for example. Let's say I want a special graph of a new type or a new feature in a graph. I couldn't do it at all, because the product's code is closed (someone has to make money, right?) and I can't add any more graphs or features. There are some products where I can do things like this. For example, in Panorama NovaView I can build a new KPI type or do a sophisticated visualization using JavaScript and the Panorama SDK, but that's a lot of coding.

Now we have the ability to build graphs in code without a large amount of code. We can customize them as we want and we're not limited by any product. The drawbacks are the maintenance and the knowledge that we need to have, but these are things that we need in every product anyway. I haven't learned this framework yet so I can't tell where the limits are, but they seem pretty far. Alex Gorev is writing about it in his blog (web, rss) so you can learn more about it there. It will take time to see whether it affects the BI development world, so all that's left to do is to sit and wait.

Thursday, December 04, 2008 5:46:20 AM (Jerusalem Standard Time, UTC+02:00)

Today I had a very disturbing coincidence.
My friend Ariel worked on an SSAS solution with no version control (we're using VSS). Instead of using that, he developed by opening the database directly on the server. I told him that he must fix it and that we must have an up-to-date version-controlled solution. In the past we asked Microsoft support how to do that (we had lost all our VSS files and had only the databases). They simply said that it's not possible. Today Ariel found that it can be done very easily using File -> New Project -> Import Analysis Services Database, as you can see in the picture:

Thursday, December 04, 2008 5:09:58 AM (Jerusalem Standard Time, UTC+02:00)
 Tuesday, November 18, 2008

This is a little bit tricky. Unlike the AdomdClient assembly, the AdomdServer assembly doesn't have a descriptive name. It's called msmgdsrv.dll and it is located in Program Files\Microsoft SQL Server\MSSQL.2\OLAP\bin. Why isn't it documented anywhere?

Tuesday, November 18, 2008 11:17:16 PM (Jerusalem Standard Time, UTC+02:00)
 Monday, November 17, 2008

After announcing the MdxInjection program I got several requests for additional details and for the ability to run it without using Visual Studio. So, here are some important points:

  • When I published it I had developers in mind, because I'm sure that anyone will want to make his own little modifications before using it for his own needs. That's why I published it as a solution and not as an executable.
  • I wrote it using VS2008 but only with the .Net 2 framework. Those of you who use VS2005 won't be able to open the solution.
  • The program has only one public method - InjectMdx, which takes two arguments: the location of the CommonMdx file and the location of the xml configuration file.
  • The CommonMdx.mdx file contains the common MDX script. The relevant part has to start with /* Common MDX */, followed by the common MDX script. Anything written before it is ignored. That gives you the ability to keep some data or comments for yourself in this file.
  • An example of the configuration xml file can be found in the Test library inside the solution. Basically, it enables you to define into which servers, databases and cubes you want to inject the common script. Note that you have to write the connection strings in this file.
  • Note that the program will detect cube dimensions whose names were changed and will know how to replace them. That means that if you mention the Time dimension in the common script and inject it into the AdventureWorks cube, the script will replace the string "Time" with the string "ShipmentDate", for example.
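
To make the marker and the renaming rules concrete, here is a small sketch of both. It is not the program's code (which is C# with AMO): extractCommonMdx and renameDimensions are names invented for the illustration, dropping the marker itself from the result is a guess, and matching only bracketed names such as [Time] is an assumption of the sketch.

```javascript
var MARKER = "/* Common MDX */";

// Returns the part of the CommonMdx file that follows the marker.
// Everything before the marker (private notes, comments) is ignored.
function extractCommonMdx(fileText) {
  var start = fileText.indexOf(MARKER);
  if (start === -1) {
    throw new Error("Marker " + MARKER + " not found");
  }
  // ASSUMPTION: the marker itself is not part of the injected script.
  return fileText.slice(start + MARKER.length).replace(/^\s+/, "");
}

// Replaces dimension names for one cube, e.g. { Time: "ShipmentDate" }.
// ASSUMPTION: dimensions are always written in brackets in the script.
function renameDimensions(script, renames) {
  Object.keys(renames).forEach(function (name) {
    script = script.split("[" + name + "]").join("[" + renames[name] + "]");
  });
  return script;
}
```

For example, renameDimensions("[Time].[Year]", { Time: "ShipmentDate" }) gives "[ShipmentDate].[Year]".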

For those of you who want a simple executable file, I added a Windows console project to the solution.

Link to only executable program
Link to the solution with the added windows application project
Link to the solution without the windows application project
Tuesday, November 18, 2008 6:54:17 AM (Jerusalem Standard Time, UTC+02:00)

In the previous post I talked about the DRY principle in BI development. I mentioned that one of the major problems in implementing the principle is the common MDX code. Chris commented:

"I'd like to be able to have a global MDX Script and be able to do something like a #include to bring calculations into specific cubes. One to add to my wishlist for the next version..."

As I said there, I have a good temporary solution until we have it in the next SQL Server release (if someone from Microsoft is reading...).

The MdxInjection program takes your common MDX Script and a very simple xml file that defines where to inject this script. It injects the script into your desired cubes and even replaces the dimensions' names where necessary (this is relevant when you put a dimension in a cube under a name different from the dimension's name, or when you use Role Playing Dimensions). I couldn't stop myself from writing some test code, so it's also included in the project. The project is written in C# 2 with a lot of AMO code. All the little technical details are inside.

Enjoy.

Download Link

Monday, November 17, 2008 8:29:02 AM (Jerusalem Standard Time, UTC+02:00)
 Friday, October 17, 2008

This month we're really busy with a very important project and a short schedule. This made me think of ideas for agile development in BI, but I'll leave that for another time. In order to make us better BI developers, I decided to take one Pragmatic Programmer principle and use it. I took one of the most important principles (in my opinion) - DRY (Don't Repeat Yourself). The DRY principle says that "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system". In classic programming it's simple to apply: use methods and generic classes to implement logic that repeats itself in the project. But how do you do it in BI development? Here are some ideas I thought of, some of which I have already implemented in my environment. Every layer/step in BI development has its own bullet. I'll be happy to hear more from you.

  • First of all - use functions in your DataWarehouse's database. Do it as much as you can. Do not repeat any logic twice or more, no matter if it's in procedures, views or even CLR functions.
  • We all have a lot of logic that repeats itself in the ETL process. For example, we found ourselves repeating the following process over and over: when we build a fact table, we take every cell that points to a dimension table by a foreign key and look up whether it's found in the dimension table. If it's not there we replace it with Undefined, UD or null. That makes us feel very bad, because we're doing the same thing all the time and it makes us feel like machines rather than programmers. The solution for this problem (and many others) is to build our own tasks (in SSIS) or transformations (in SSIS & Informatica). Alberto Ferrari did beautiful work in this field in SSIS. I'll add some transformations of my own once I have release-ready versions of them.
  • My co-workers just love the Named Calculation feature in the Data Source View in SSAS. It enables them to make a new column without making a view and without touching the underlying database. The problem here is that after a while we have a LOT of named calculations, many of them repeat themselves, and when you look for a piece of logic you lost, you can search for hours in the never-ending DSV. The solution here is not to use named calculations at all. Put all your logic in the database (and as I said - in functions). The only place where you should use them is where you must - when you have no write permission to the DataWarehouse or when you build your DSV over an operational database and you don't have write permissions.
  • The same goes for Named Queries in the Data Source View in SSAS. Don't use them.
  • There's a lot of logic that you can do only in MDX. Here, the problem is that MDX scripts are defined over cubes and not over dimensions, meaning that if a dimension has MDX logic you have to repeat it in every cube's MDX script. The solution is to add the MDX programmatically using AMO. Every time the ETL process ends, it should run a program that takes the MDX script from a single file and places it in every relevant cube. I know it sounds a little bit wacky and I haven't even done it myself, but as far as I know, it's the only solution for DRY in MDX.
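
The "look up or replace with Undefined" step from the ETL bullet is a good example of logic that should exist exactly once. In SSIS or Informatica that single place would be a custom transformation; the sketch below only shows the idea as one reusable JavaScript function, with made-up names (resolveDimensionKeys, CustomerKey, "UD").

```javascript
// Returns copies of the fact rows in which every value of `column` that is
// missing from the dimension's keys is replaced by `fallback`.
function resolveDimensionKeys(factRows, column, dimensionKeys, fallback) {
  var known = {};
  dimensionKeys.forEach(function (key) {
    known[key] = true;
  });
  return factRows.map(function (row) {
    var copy = {};
    for (var name in row) {
      copy[name] = row[name]; // the input rows are left untouched
    }
    if (!Object.prototype.hasOwnProperty.call(known, copy[column])) {
      copy[column] = fallback; // e.g. "Undefined", "UD" or null
    }
    return copy;
  });
}

var facts = [{ CustomerKey: 1, Amount: 10 }, { CustomerKey: 9, Amount: 5 }];
console.log(resolveDimensionKeys(facts, "CustomerKey", [1, 2, 3], "UD"));
```

Every fact load then calls the same function with a different column and key list, and the replacement rule is changed in one place only.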

As I said, I'd love to hear your ideas about this topic.

Friday, October 17, 2008 9:46:22 PM (Jerusalem Standard Time, UTC+02:00)
 Monday, September 22, 2008
My friends were stuck with a totally weird bug this week. After a day of frustration they called me for the rescue. It took me some time to figure it out and I think that every SSIS developer (and maybe every developer) can learn a thing or two from others' mistakes.

The mission: The data flow takes one table with duplicate rows and copies it to another table, making sure that every row appears only once. On the way, the data flow also adds some irrelevant fields. Among them are the Create_User and Create_Date fields, which tell by whom and when the package last ran.
How my friends did it: Again, it's a very simple flow. They only added a Derived Column transformation to add the new fields and then an Aggregate transformation to make every row appear only once.

Note that this is not the real package. It's a sample I did on my machine to show it here.

The Bug: When I first saw this, it seemed to me a very simple flow and I asked myself how this could be happening:

As you can see, it seems that the Aggregate transformation is not deterministic. Sometimes it outputs 99 rows, sometimes 198 and in some other times I get other results as well.
Investigating: I wanted to see the difference between the table that I got the first time (99 rows) and the table I got the second time (198 rows), so I changed the destination table and compared the two tables. I ran a "select * from A where Column1+Column2+... not in (select Column1+Column2+... from B)"-style query but it was no use - it showed me that there were no rows that appeared in only one of the tables. At this point I really started to think (as my friends did) that maybe the Aggregate transformation has something wrong inside... Instead of blaming Microsoft, I decided to think. I needed to see what could make the flow non-deterministic. Then it hit me.


The only non-deterministic component in the flow is the Derived Column, because it uses the getdate() function (it may be easy to see here, but in the original package the Derived Column transformation had many fields). The results of this function may differ in the milliseconds, especially for large tables. Then I looked at the Aggregate transformation and saw that the Create_Date column was also in the Group by operation, meaning that if two rows have different milliseconds they will be placed twice in the destination table, although they are the same in every other column. That's it, the bug was found. But still, one question remained: why did the query not show me this? The answer is also simple but tricky to find: in the comparison query I concatenated all the columns in the tables in order to compare the results. When I did this, I cast the Create_Date to nvarchar, which truncated the milliseconds.
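
Both effects are easy to reproduce outside SSIS. The sketch below is plain JavaScript with made-up sample rows and a made-up helper, distinctRows, that plays the role of the Aggregate transformation; cutting the timestamp string to seconds stands in for the cast that dropped the milliseconds.

```javascript
// Keeps the first row for every distinct combination of the group columns.
function distinctRows(rows, groupColumns) {
  var seen = {};
  return rows.filter(function (row) {
    var key = JSON.stringify(groupColumns.map(function (c) { return row[c]; }));
    if (seen[key]) {
      return false;
    }
    seen[key] = true;
    return true;
  });
}

// Two identical source rows that were stamped a few milliseconds apart.
var rows = [
  { Id: 1, Name: "a", Create_Date: "2008-09-22 08:10:48.120" },
  { Id: 1, Name: "a", Create_Date: "2008-09-22 08:10:48.133" }
];

// Grouping by the timestamp too keeps both rows: this is the bug.
console.log(distinctRows(rows, ["Id", "Name", "Create_Date"]).length); // 2

// Grouping by the real columns only removes the duplicate.
console.log(distinctRows(rows, ["Id", "Name"]).length); // 1

// Dropping the milliseconds, as the cast in my comparison query did,
// makes the two rows look identical again and hides the difference.
var truncated = rows.map(function (row) {
  return { Id: row.Id, Name: row.Name, Create_Date: row.Create_Date.slice(0, 19) };
});
console.log(distinctRows(truncated, ["Id", "Name", "Create_Date"]).length); // 1
```

Two simple cures follow from this: leave the audit columns out of the Group by, or compute the timestamp once per run and use the same value for every row.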

Conclusions:
  • Pay attention to non-deterministic elements in what you do, whether it's code or ETL process.
  • When you do dummy stuff like checking all the checkboxes in a list - think about what the outcomes are.
  • Call Miky when you're desperate.
Monday, September 22, 2008 8:10:48 AM (Jerusalem Daylight Time, UTC+03:00)
 Saturday, September 20, 2008
This week I ran into something disturbing. When I installed Excel 2003 on the Panorama machine in order to use Excel functions in my MDX calculations, the NovaView Desktop stopped working. When I tried to load a view it threw an "error in connection" message. When I called Panorama support, they told me that it's a known issue that is hard to find in the Panorama knowledge base. So here it is:

If you have connection issues in the Desktop program, open the registry editor (Start -> Run -> regedit). Look for HKEY_CLASSES_ROOT\MSOLAP\CLSID and make sure it's the same as HKEY_CLASSES_ROOT\MSOLAP.3\CLSID. Remember - always copy from MSOLAP.3 to MSOLAP and not vice versa.

Sunday, September 21, 2008 6:38:48 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, September 04, 2008
In the last few years I've seen many astonishing BI web sites. I always asked myself what I need to do to bring my customers such beautiful web-based BI solutions. After gaining a lot of experience with Panorama NovaView, and especially the Panorama SDK, some questions started running through my mind: Why not build some re-usable puzzle pieces that can be joined together into a web site? These pieces can be web controls that use and even interact with Panorama views and Analysis Services. Why not publish it as open source and give it to the BI community?

The PanoramaBasedWebSite project is a toolkit that contains web controls you can easily use in your ASP.NET based web site. The project is written in ASP.NET 2.0 and C# 3.5. These web controls interact with Panorama views (using the Panorama SDK) and Analysis Services (using AMO).
The idea is that you can take these puzzle pieces, combine them as you like in your web site and create your good-looking BI web site with almost no programming. The project is only in its first steps, but I believe that publishing the design/idea is also important. This is why the first release is already published, although it has only two web controls so far. Below is what we have so far and what I'm planning for the future. I'll be happy to hear your thoughts/ideas:

First Release Contents

  • PanoramaView web control - this is the main control of the project and it will probably carry a lot of the project's weight. The control simply shows a Panorama view. For now, it doesn't do much other than showing a view, so there's a lot of work to do on this control. It has two properties - BriefingBookName and ViewName. You can look at the TODO: comments in the code to see what future plans I have for this control.
  • UpdateDatePanel web control - this control shows the date and time when the cube was last processed. It can be used in two ways: You can set only the PanoramaViewID property, and the control will extract the cube and the database name from the view and take the update date from the cube. The other way is to set the CubeName and DataBaseName properties.
Future Plans

  • KPIView - Already working on it. Similar to PanoramaView, but if the view shows a KPI then a drilldown will be made when the user clicks on a gauge.
  • QueryList - Shows the result of an MDX query. For example, the list shows the top 10 employees of the month (by sales, for example). This list will be interactive, meaning that clicking on a row will perform a drilldown, drill to data or replace the list with another query's results.
  • DimensionPicker - Gives the user the ability to pick members of a dimension/hierarchy. After selecting, the control will slice all the views on the page (or only predefined set of views).
  • DatePicker - Same as DimensionPicker but for dates. It will show a calendar to the user and clicking on a date will perform a slice in the views.

The use of the controls in your aspx pages is very easy. You can see for yourself:

<PanoramaControls:UpdateDateLabel ID="UpdateDateLabel1" runat="server" PanoramaViewID="PanoramaView1" />
<PanoramaControls:PanoramaView ID="PanoramaView1" runat="server" Width="100%" Height="80%" BriefingBookName="MikysBook" ViewName="MyFirstView" />

I'll be happy to read your thoughts and ideas about this project. There will be more to come. Stay Tuned.

Friday, September 05, 2008 3:37:45 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, August 17, 2008

There is one tiny new feature in SSAS 2008 that you can easily miss. It is called Empty Cube. When you create a new cube using the wizard, you can create an empty cube, meaning that it has no measures, dimension relationships, etc. Its original use is for "users (who) want to create everything manually, or when all dimensions are linked dimensions" (taken from the description in the wizard).

In the past, I wrote about using SSAS with Visual SourceSafe in order to have source and version control for the SSAS project. I mentioned that it has many disadvantages, but the advantage (source & version control) outweighs them, so the bottom line is that I recommend using it. One of the problems we experienced was that every time someone creates a new object (cube, dimension, etc.) he has to check out the .dwproj file. The result is that sometimes the team fights over that file and we shout: "who checked out the dwproj???" (yeah, I know that we can check who did it inside VSS but shouting is more fun).

The empty cube feature is a nice solution for this problem: When you create a new project you can create all the (empty) cubes up front, and then the .dwproj file is free and no longer needs to be checked out for every new cube. I'm assuming that you know which cubes you'll have when starting a new project. The only thing that remains is the same solution for dimensions. I'll recommend it on the Connect site (it's not working now for some reason).

Monday, August 18, 2008 3:50:41 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, August 12, 2008
Found a great site for BI beginners. Learn Microsoft BI has some videos about BI and SSAS which can place you in a good position as a beginner.

Tuesday, August 12, 2008 9:36:52 PM (Jerusalem Daylight Time, UTC+03:00)
I've been asked to review Widgenie and since I'm a nice person - why not? Widgenie is basically a widget creator that takes data from a variety of sources: Excel and CSV files for now, and Google Docs and more data sources in the future. After you declare your data source you can change the look of the widget and then publish it in a variety of ways: Facebook, blog, comments, etc.

I'm writing this post while creating my first widget, so these comments are from my first encounter with the product, meaning that I may miss a few things, but I believe you readers will get the picture:

  • The limitations on the Excel source are way too strict: Why is the maximum file size 2M? "The sheet should contain only the data and column headers. Titles, notes and other text outside of the data table will impede the upload"? Why can't I take my old familiar Excel file and use it as is? C'mon guys, write some VBA scripts and work this out.
  • When I start from "Create new Widget" and then move straight to "Create new Data File" because it is my first time, I want to go straight from the end of the data file wizard back to where I was in the create widget wizard. I don't want to start the create widget wizard over.
  • Every step in the wizard has a little question mark next to it that explains the current part. It is very intuitive, nicely done and nicely put.
  • The widgets are very beautiful. As a BI developer, I'd be happy to put some of these on the CEO's dashboard.
  • What about multilanguage support? The Hebrew columns appear as gibberish in the widgets.
  • The publishing process is not simple enough. I don't want to get a script that I need to place in my Blogger/Facebook/iGoogle page. I want the process to end at the target of the widget. For example, let's say I want to put the widget in my Facebook profile. I would expect Facebook to open at the end of the wizard and a new Widgenie application to be created on my profile, asking me to choose one of the widgets I created.
  • The text cloud widget is very simple and powerful. It can be very useful for managers.
In conclusion, Widgenie is a very beautiful product that has a long way to go if it wants to stand with the big sharks of the BI world. It has to fix some issues, support more sources and targets and have more capabilities (snapshot support, SSAS integration, Excel-style chart editing and more). I don't know why I can't embed the widgets here (it just isn't working) so here are links to the bar chart and the text cloud widgets.

Tuesday, August 12, 2008 9:12:53 PM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, August 03, 2008
Update: Chris Webb and Mosha commented and made it clear that the reason for this error wrapping is the NonEmptyCrossJoin function and nothing else. I also checked and did not find any other function that wraps underlying errors.

This is something you need to be aware of when you're writing MDX. I don't know whether it's a bug or by design. I'll be happy to know (please comment if you know something that I don't).
Consider the following MDX:

SELECT
  NonEmptyCrossJoin
  (
    [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
   ,[Employee].[Employee Department].[Department].&[Sales]
  ) ON 0
FROM [Adventure Works];

The query will return this error: The Set_Count argument of the NonEmptyCrossJoin function is either negative or larger than the number of sets provided. This is quite reasonable because I wrote the second argument as a member, where (NonEmpty)CrossJoin expects only sets. So, let's upgrade this member to a set:

SELECT
  NonEmptyCrossJoin
  (
    [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
   ,{[Employee].[Employee Department].[Department].&[Sales]}
  ) ON 0
FROM [Adventure Works];


All I did was wrap the second argument with {} and we have a set. The query will return 19 columns.
Now, for the interesting part. Let's count the members of this CrossJoin before we fix it:

WITH
  MEMBER [a] AS
    NonEmptyCrossJoin
    (
      [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
     ,[Employee].[Employee Department].[Department].&[Sales]
    ).Count
SELECT
  [a] ON 0
FROM [Adventure Works];


This returns... 0.
Where is the error?
My guess is that the Count function wraps the error. The NonEmptyCrossJoin returns null and the count of members in null is zero. This means that if you ever forget to wrap the member with {} you'll always get zero and not an error. This can be very dangerous. Just to check, running this query after the fix:

WITH
  MEMBER [a] AS
    NonEmptyCrossJoin
    (
      [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
     ,{[Employee].[Employee Department].[Department].&[Sales]}
    ).Count
SELECT
  [a] ON 0
FROM [Adventure Works];


will return 19. This has been tested with both SSAS 2005 and 2008 (RC0). The examples here are from RC0.
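
If the behaviour still feels abstract, here is a small Python analogy of the same trap. It is not how SSAS is implemented; the function names (cross_join_lenient, cross_join_strict, count) are invented for illustration, and a plain string stands in for "a member that is not a set". The lenient version hides its argument error by returning nothing, so counting the result gives 0 instead of failing:

```python
from itertools import product


def cross_join_lenient(*args):
    # Hypothetical stand-in for a function that hides its own argument error:
    # a bare member (here: a str) is not a set, so the call quietly yields None.
    if any(not isinstance(a, (list, tuple, set)) for a in args):
        return None
    return list(product(*args))


def count(result):
    # Counting "nothing" gives 0, which is why the mistake goes unnoticed.
    return 0 if result is None else len(result)


def cross_join_strict(*args):
    # The safer variant: refuse to continue when an argument is not a set.
    for a in args:
        if not isinstance(a, (list, tuple, set)):
            raise TypeError("every argument must be a set, got %r" % (a,))
    return list(product(*args))
```

Calling count(cross_join_lenient(["Sydney", "Newcastle"], "Sales")) gives 0 with no complaint, while the strict variant raises on the same input. That is the difference between the calculated member silently showing 0 and the plain query reporting an error.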

Be careful with your MDX.

Monday, August 04, 2008 5:28:26 AM (Jerusalem Daylight Time, UTC+03:00)
 Saturday, July 26, 2008
OpenSearch is one of the beautiful things I discovered lately. If you're using Firefox 2 and above or Internet Explorer 7, look at the search field in the top-right corner of the browser. See the shiny little thingy there? Click on it and you can instantly add two new search engines for fast search through your browser. The first one is my blog's search and the second (and more important) one is the ability to search BiBlogs right from the browser. Yeah, now you can search the whole BI community's blogs with only one click.

I call on all the BI bloggers to add this too. It's 5 minutes of work and it can help a lot of people out there. See here for instructions.

Saturday, July 26, 2008 7:26:41 AM (Jerusalem Daylight Time, UTC+03:00)
 Wednesday, July 23, 2008
When you practice on SQL Server on your local machine you don't want its services to start up with the computer. As I mentioned before, you should set the startup type of these services to manual (see here). After that, you can build two simple batch files that will start and stop the services. Believe me - it's very comfortable to start and stop the services with only one mouse click. The first batch file (I called it sql.bat) contains only two lines:

net start MSSQLSERVER
net start MSSqlServerOLAPService

The second one (sqlend.bat) looks like that:

net stop MSSQLSERVER
net stop MSSqlServerOLAPService

Note that I only start/stop the SQL Server and analysis services, but you can do whatever you like.
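
If you'd rather keep one script than two batch files, the same idea can be sketched in Python. This is only an illustration: the service names are the two from the batch files above, while build_commands, run and the dry_run flag are names I made up for the example. By default it only prints the commands, so nothing is started or stopped until you ask for it.

```python
import subprocess

# The two services from sql.bat / sqlend.bat; add or remove names as you like.
SERVICES = ["MSSQLSERVER", "MSSqlServerOLAPService"]


def build_commands(action, services=SERVICES):
    """Return one 'net start|stop <service>' command per service."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    return [["net", action, name] for name in services]


def run(action, dry_run=True):
    """Print the commands, or execute them when dry_run is False (Windows only)."""
    for command in build_commands(action):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

run("start") prints the two net start lines; run("start", dry_run=False) actually executes them and needs an elevated prompt, just like the batch files.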

Have fun.

Thursday, July 24, 2008 5:19:08 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, July 20, 2008
Sometimes the uninstall process does not succeed or, even worse, the "Add or Remove programs" interface does not allow you to uninstall the product because it is already uninstalled / doesn't exist / you name it. The problem is that the program can't be removed from the list, it can't be uninstalled and it prevents another installation or re-installation. This happens a lot with Microsoft's heavy products such as SQL Server and Visual Studio, but it can happen with other products too.
What can you do?

Here's a small tip: Open the registry editor (Start -> Run -> regedit) and go to the path: My Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall.
Under this folder you'll see many folders with GUID names such as {1268CDD4-0FED-3CE6-8A9D-C3B012ABCD8F}. To know which installation it is, look for the value named DisplayName under this GUID folder. In most of these folders you'll also see a value named UninstallString. To uninstall the program, copy the value of UninstallString and paste it into the Run dialog. This will start the uninstall process.

This trick will not always work, but it can help you a lot, especially with a broken installation of SQL Server.
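
Clicking through dozens of GUID folders gets old quickly, so here is a rough Python sketch that lists the uninstall commands for you. The registry path, DisplayName and UninstallString come from the tip above; the function names are mine, and read_uninstall_entries uses Python's standard winreg module, so it only runs on Windows. The search itself works on a plain dictionary, which makes it easy to try anywhere.

```python
UNINSTALL_PATH = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall"


def find_uninstall_commands(entries, needle):
    """Return (DisplayName, UninstallString) pairs whose name contains needle."""
    needle = needle.lower()
    matches = []
    for values in entries.values():
        name = values.get("DisplayName", "")
        command = values.get("UninstallString")
        if command and needle in name.lower():
            matches.append((name, command))
    return sorted(matches)


def read_uninstall_entries():
    """Read DisplayName/UninstallString from every subkey (Windows only)."""
    import winreg  # imported here so the rest of the sketch loads on any OS

    entries = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, UNINSTALL_PATH) as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            subkey_name = winreg.EnumKey(root, index)
            with winreg.OpenKey(root, subkey_name) as subkey:
                values = {}
                for value_name in ("DisplayName", "UninstallString"):
                    try:
                        values[value_name] = winreg.QueryValueEx(subkey, value_name)[0]
                    except OSError:
                        pass  # this subkey simply lacks the value
                entries[subkey_name] = values
    return entries
```

On a Windows machine, find_uninstall_commands(read_uninstall_entries(), "sql server") gives you the candidates to paste into the Run dialog. The sketch only reads the registry; it never runs an uninstaller by itself.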

Enjoy.

Monday, July 21, 2008 6:24:27 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, June 17, 2008
Well, I don't know what I expected but I'm a little disappointed. I'll split my review into two parts:
  1. The Analytics - This is the main issue for my organization, so here I expected to see some new & cool features, but all I saw was a facelift. This is probably why the new version is 5.5 and not NovaView 6.
  2. The Google stuff - While this is not relevant for my organization, this was very cool and promising. I think that this relationship between Panorama and Google will carry on and both sides will only benefit from it.
The greatest thing I got from the webinar is ideas for beautiful designs for sites containing Panorama applet views. The site that was shown in the webinar was beautiful and intuitive. I just can't wait to give my customer a site that looks like it, completely tailored for him. Go on, look at the webinar and take some ideas for your site design.
If you haven't seen the webinar yet, you can download it or watch it.

Tuesday, June 17, 2008 7:24:21 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, June 01, 2008
My customer wanted the ability to show the last update time of the data in the titles of the Panorama views. He knows that I know how to deliver :-) so in a couple of hours he had it. Using AMO and manipulation of the view's XML it's very simple. Just note that changing views without NovaView Desktop is not supported by Panorama, so watch out before you execute this program. In your first trials, always back up the books dir (by default c:\program files\panorama\e-bi\books) before you start. Also, be aware that this won't work for automatic views. The user must enter the view's title himself and include the string "Correct For"; the program will then write the last update time after it. This is the program's code:

using System;
using System.Collections.Generic;
using System.Text;
using AMO = Microsoft.AnalysisServices;
using System.Xml;
using System.IO;

namespace CubeUpdateDate
{
    class Program
    {
        static void Main (string[] args)
        {
            CubeUpdateDate cud = new CubeUpdateDate();
            cud.Go();
        }
    }

    class CubeUpdateDate
    {
        // The text the user types into the view title; the update time is written after it.
        private const string Marker = "Correct For";

        public void Go ()
        {
            DateTime cubeUpdateDate = GetCubeUpdateDate(GetConfigData("ServerName"), GetConfigData("DataBaseName"));
            UpdateViews(GetConfigData("BookDir"), cubeUpdateDate);
        }

        // Walks the books directory recursively and updates every file it finds.
        private void UpdateViews (string dirName, DateTime cubeUpdateDate)
        {
            foreach (string subDirName in Directory.GetDirectories(dirName))
            {
                UpdateViews(subDirName, cubeUpdateDate);
            }

            foreach (string fileName in Directory.GetFiles(dirName))
            {
                UpdateFile(fileName, cubeUpdateDate);
            }
        }

        // Rewrites the title of a single view file if it contains the marker.
        private void UpdateFile (string fileName, DateTime cubeUpdateDate)
        {
            try
            {
                XmlDocument xmlDoc = new XmlDocument();
                xmlDoc.Load(fileName);
                XmlNodeList titleTags = xmlDoc.GetElementsByTagName("Title");
                if (titleTags.Count > 0)
                {
                    string viewTitle = titleTags[0].ChildNodes[0].Attributes[0].Value;
                    if (viewTitle.Contains(Marker))
                    {
                        // Keep everything up to and including the marker, drop the old timestamp.
                        viewTitle = viewTitle.Substring(0, viewTitle.IndexOf(Marker) + Marker.Length);
                        viewTitle += " " + cubeUpdateDate.ToShortTimeString() + " " + cubeUpdateDate.ToShortDateString();
                        titleTags[0].ChildNodes[0].Attributes[0].Value = viewTitle;
                        titleTags[0].ChildNodes[0].Attributes[1].Value = viewTitle;
                        xmlDoc.Save(fileName);
                    }
                }
            }
            catch (Exception e)
            {
                // Files that are not valid view XML end up here; report and move on.
                Console.WriteLine("Error reading/writing file: " + fileName + " (" + e.Message + ")");
            }
        }

        // Reads a single setting from config.xml by its tag name.
        private string GetConfigData (string whichData)
        {
            XmlDocument xmlDoc = GetConfigXml();
            return xmlDoc.GetElementsByTagName(whichData)[0].InnerText;
        }

        private XmlDocument GetConfigXml()
        {
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.Load("config.xml");
            return xmlDoc;
        }

        // Returns the last process time of the first cube in the database.
        private DateTime GetCubeUpdateDate (string serverName, string dbName)
        {
            using (AMO.Server server = new AMO.Server())
            {
                server.Connect("Data Source=" + serverName);
                AMO.Database db = server.Databases[dbName];
                return db.Cubes[0].LastProcessed;
            }
        }
    }
}
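
The only subtle part of the program is the title handling: everything after "Correct For" is thrown away and replaced with the new time, so running it twice doesn't pile up timestamps. Here is that one step as a tiny Python sketch you can play with before touching real view files. The function name stamp_title and the fixed HH:MM YYYY-MM-DD format are my assumptions; the C# program uses the machine's short time/date format instead.

```python
from datetime import datetime

MARKER = "Correct For"


def stamp_title(title, updated):
    """Replace whatever follows the marker with the given update time."""
    position = title.find(MARKER)
    if position == -1:
        return title  # no marker: leave the title untouched
    prefix = title[: position + len(MARKER)]
    return f"{prefix} {updated:%H:%M} {updated:%Y-%m-%d}"
```

For example, stamp_title("Sales by Region - Correct For", datetime(2008, 6, 1, 7, 20)) yields "Sales by Region - Correct For 07:20 2008-06-01", and feeding that result back in with a later time replaces the old stamp rather than appending to it.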

The program uses an XML config file that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Config>
    <ServerName>MyOlapServer</ServerName>
    <DataBaseName>MyDBName</DataBaseName>
    <BookDir>MyBookDirPath</BookDir>
</Config>
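
A typo in this file will make the program fail before it touches any view, so it can be worth checking the file on its own first. Below is a minimal Python sketch that does so with the standard library's ElementTree. The three tag names are the ones from the config above; parse_config and its error message are illustrative names, not part of the program.

```python
import xml.etree.ElementTree as ET

REQUIRED_TAGS = ("ServerName", "DataBaseName", "BookDir")


def parse_config(xml_bytes):
    """Return the three settings as a dict, or raise ValueError if one is missing."""
    root = ET.fromstring(xml_bytes)
    settings = {}
    for tag in REQUIRED_TAGS:
        node = root.find(tag)
        text = (node.text or "").strip() if node is not None else ""
        if not text:
            raise ValueError("config.xml is missing a value for <%s>" % tag)
        settings[tag] = text
    return settings
```

To check a real file, read it in binary mode and pass the bytes in: parse_config(open("config.xml", "rb").read()).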

The program assumes that all the cubes in the database have the same update time, so it takes the last process time of the first cube in the database. If that's not true in your case you can change it in the GetCubeUpdateDate method.
Enjoy.

Update: If you're having trouble with the XmlDocument.Load method because of hexadecimal characters in the view's XML file, look here.
Sunday, June 01, 2008 7:20:09 AM (Jerusalem Daylight Time, UTC+03:00)
 Friday, May 30, 2008
Finally, something that looks like the answer to our needs. IBM Business Glossary (BG) is a product that manages our business vocabulary. It enables users to create business terms (also called entities), edit them, share them and customize them. We saw the product at IBM Israel and we liked it very much. Here is a brief summary of the meeting:

Managing meta data in the organization is a difficult task. First of all, you need to know what kind of MD you want to manage. There are three main types:
  • Business MD - The vocabulary that contains the terms of the business.
  • Technical MD - Names and attributes of data storages, tables, columns, etc.
  • Operational MD - How the information flows inside the organization.

The BG gives a common language to the organization and connects the business to the IT. First of all, it creates a contract - everybody knows exactly what a "high value customer" is, for example. That is supposed to be the end of confusion about business terms. It also helps to understand things, exposes knowledge and connects all the technical details.
In BG, all the terms have the same common attributes, such as name, description, example, related entities, etc. The users can define more custom attributes if they want. The product also manages the Data Stewardship, meaning that every entity has a father/manager. It can also have two fathers - one from the business and one from the technical aspect (Update: Not in the current version). The terms are divided into subject areas/contexts. This way you can go to a subject and learn all of it by going over all its entities. You can see and use its custom attributes. For example, you can have a link there to reports that contain/list that entity.

There's much more to say about BG. All I wanted is to give a brief overview of what it is and you can see if it can help you. I'll give my pros and cons for this product:

Pros:

  • Making order in the organization - everybody knows what you're talking about when you use a term. Every entity has a defined father/business-expert.
  • Manages business knowledge over time. You can take a new employee and instead of taking other employees' precious time to teach him everything, just tell him to go over the business glossary. (I'm not naive, but it will reduce time)
  • Fast lookup time - I want to know in which tables in the databases an entity is placed. I can find it in seconds.

Cons:

  • Security - BG has almost no security module at all, meaning that everybody sees everything.
  • Doesn't support services yet. I would like to see which services expose and which services consume an entity. I want to call the service, provide it with input and see the output.
  • The stewardship module is still weak. In the meantime, there is only one father of an entity.
  • The custom attributes are the same for the entire vocabulary. What if I would like to have a custom attribute only for one subject area?
  • There isn't a hebrew interface yet. The interface can be only in English, Spanish and French (if I'm not wrong).

In conclusion, I think that the product is good, even very good. The problem is that its development has to go through several more iterations before it can be used by a variety of organizations. It just doesn't have all the features that a business vocabulary must have. Wait a year and you'll see a wonderful product.

Saturday, May 31, 2008 2:04:53 AM (Jerusalem Daylight Time, UTC+03:00)
On June 10th, Panorama will show us the new version of NovaView - 5.5. The show will be only on the web (that's why it's called a webinar). We will see the new reports, flash-based dashboards and the results of the cooperation with Google. You can see the brochure here. I would be happy to say that I'll see you there. The only problem is that we won't see each other, and that's why I think that a real conference is better than a webby one. On the other hand, it's much simpler and cheaper to do a webinar so I can understand that move. Never mind, I'll see you some other time.

Friday, May 30, 2008 6:57:21 PM (Jerusalem Daylight Time, UTC+03:00)
 Monday, May 26, 2008
One more tip about installing the database samples: I believe that installing them is not enough. In order to improve your skills you need to know them deeply. Therefore, don't just deploy the SSAS project to the server and leave it at that. Build it yourself. Yes - create a new project called MyAdventureWorks or something like that and build all the objects by yourself. Indeed, this will take time and effort, but it is worth it. Once you've done all the tricky things yourself, you really have it in your hands. Learn the AW project and be a master.

Monday, May 26, 2008 7:44:47 AM (Jerusalem Daylight Time, UTC+03:00)
MDM
 
Everybody is talking about MDM, so we decided to go to IBM and talk with Darren Cooper, who is an expert on this subject. Darren made sense of this term and explained to us exactly what it is and what it is not. There's a lot of confusion out there about this, so it is important to know things before you deploy them or buy a new MDM product...
This sketch can explain a lot of it:


Following the arrows, you can understand what is going on in this picture and what it is all about:
  • The operational systems contain some common critical data which we're tired of duplicating and maintaining all the time. So, we push this data (red in the picture) to the MDM in real time. This is it. That's MDM. From now on, we play with this golden egg and get all the benefits from it.
  • Hey, we have all the critical data in one place, so why shouldn't we push it to the clients whenever they need it? Once we have MDM it doesn't make sense to give it to them through the Op. systems, does it?
  • Wait a second! A client is using an operational system. Will the critical data be saved in the Op. systems? You guessed right. Be aware that now the client will send data to both places - MDM and Op. system.
  • MDM is not a replacement for the DataWarehouse. Their purposes are not the same and neither can do what the other does, so they need each other. The DW takes data from the MDM just as it takes data from any other system. On the other side, the MDM takes data from the DW whenever it needs it.
I believe that now you have a clearer understanding of MDM. There are many points that should be discussed about this but it is too soon right now because we're only learning this, so I'll just point them out.

  • Security - We have all the critical data in one place. Very dangerous...
  • Flexibility - The MDM should react very quickly to every change in the other systems of the organization. Clients cannot wait long for the MDM to change for every movement in the organization.
  • Availability - It should be always up and cannot crash too much because everybody is relying on it.
  • Updated - The definition of MDM says that it should be always updated, but it's not always necessary. The IT architects should find these scenarios where they can ease on the MDM.
  • Formats - Every Op. system has its own standards and formats, and the MDM has to support all of them.
  • Interaction with other IT teams - You should build trust with them because you're taking their critical data out of their hands. If your MDM malfunctions they will be happy to take advantage of the moment and take their data back.
  • Implementation - Building MDM is a very long process. The IT architects have to design its different modules and build them one on top of the other.
  • Conflicting Data - Which system has the last word? How can we handle these cases? Oh yes, it will happen. It always does.
  • Viewer - Do you need an MDM viewer? What should it look like?
  • Make sense - This is maybe the most difficult subject. BI is a bunch of attributes without any inner sense between them. MDM should fill the void by supplying knowledge given by its many critical attributes. How should you do it?
As you can see, there's a lot to talk about. If we decide to implement MDM in our company I'll be happy to share it here. Good luck to us all.
Monday, May 26, 2008 7:04:12 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, May 25, 2008
I thought that it would be a simple next, next, next installation, but it turned out to be more complex than I thought. It is not something very hard to do, but there are some tricky points, especially when installing it on my PC and not on a dedicated strong server.
The installation starts as a simple wizard. Just go on with it but pay attention to this screen:



Here, you need to specify an account for every service installed. Because it is a CTP installation and not a real server installation, you can make life easy for yourself and just use an administrator account for all the services, because security is not an issue now. At the bottom of the screen, enter the account and password of an administrator account and click on "Apply to all".
Now, for the really important note - the startup type. There are three startup types for Windows services:
  1. Automatic - The service will start up with the operating system.
  2. Manual - The service will be started only by a process or an admin user.
  3. Disabled - The service can't start at all.
This choice is very important. If you're making the installation on a dedicated machine then you can choose Automatic because you'll need the service to be always running. But if you're installing this on a personal computer then you don't want these CPU & memory consuming services to be up all the time. In this case you need to choose Manual and start these services only when you need them. When you do, you start them by typing "services.msc" in the Start -> Run dialog, finding the service and clicking on start. I don't see any reason to choose the Disabled startup type in this screen. By the way, there's a new type in Vista called "Delayed", which starts the service only after the Automatic ones have been started. This option doesn't exist here and I don't see any reason to use it anyway.
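
If you picked Automatic in the wizard and regret it later, you don't have to reinstall: the startup type can be changed afterwards with the built-in sc command. The Python sketch below only builds the command lines. The mapping of the three types to sc's auto/demand/disabled keywords reflects how sc config works, while startup_command and the service names in the usage note are illustrative, and actually running the commands needs an elevated prompt.

```python
# sc's keywords for the three startup types described above.
START_TYPES = {"automatic": "auto", "manual": "demand", "disabled": "disabled"}


def startup_command(service, startup):
    """Build the 'sc config <service> start= <type>' command as an argument list."""
    try:
        keyword = START_TYPES[startup.lower()]
    except KeyError:
        raise ValueError("startup must be one of: " + ", ".join(sorted(START_TYPES)))
    # sc expects 'start=' and its value as two separate tokens.
    return ["sc", "config", service, "start=", keyword]
```

" ".join(startup_command("MSSQLSERVER", "manual")) gives the line sc config MSSQLSERVER start= demand, which you can paste into a command prompt or hand to subprocess.run.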

Now for the big problem - installing the sample databases. The samples are not a part of the CTP so you'll need to download them from codeplex. Make sure that you download the samples that fit your CTP version. If you don't have the latest CTP then don't download the latest samples. Find your version in the releases section. After you have downloaded your samples, start the wizard. When you get to this screen:


you'll get stuck (if you haven't read this first, of course) with this message:

Error 27502. Could not connect to Microsoft SQL Server '(local)'. [DBNETLIB][ConnectionOpen (Connect()). SQL Server does not exist or access denied. (17) [I copied that for the ones who will find this by google search]

It took me a while to resolve this, so this is what you need to do before you install the samples:
Open the SQL Configuration Manager (Start -> Programs -> Microsoft SQL Server 2008 -> Configuration tools) and enable the TCP/IP protocol on the server:


That should solve it. After that, go to the directory "c:\Program Files\Microsoft SQL Server\100\Tools\Samples" and there you'll find the samples with a document that explains how to attach them to the server.
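
Before you rerun the samples wizard, you can quickly check whether the server really accepts TCP connections now. The Python sketch below just tries to open a socket. It assumes a default instance listening on port 1433 on the local machine (named instances usually use another port), and can_connect is a name I chose for the example.

```python
import socket


def can_connect(host="localhost", port=1433, timeout=3.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out or unreachable: TCP/IP is probably still disabled
        # (or the service is stopped, or it listens on a different port).
        return False
```

If can_connect() returns False after you enabled TCP/IP, remember that the SQL Server service has to be restarted for the protocol change to take effect.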

I hope this is helpful to those who got stuck and those who haven't got stuck with it yet. Enjoy.
Sunday, May 25, 2008 7:33:43 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, May 13, 2008
I started a long conversation about this subject in the MSDN SSAS forum. I think that it's a question and a principle that every advanced MDX programmer should be familiar with.

It all started with a customer that needed a standard deviation aggregation. I thought that it would be simple because there's a StdDev function in MDX, but it turned out that my customer had major plans for me: He wanted this aggregation to work for every dimension he puts on his axis. This means that the aggregation is not defined over a specific dimension (such as date); instead, the std-dev is defined over whatever dimension is currently on the axis.

The solution for this problem consists of a principle and an answer.

The Principle
An aggregation or a measure that is based on the current user's query is bad. It can and will cause two users to see different results using the same measure. This will cause confusion and disinformation. The sacred principle of One Truth will be desecrated. Taken from the thread, in Chris Webb's words:

"I quite often see people wanting to write calculations that behave differently depending on the query that's being run, and I always tell them not to do it. You can hack something but it's almost impossible to get it work properly for every single possible query - MDX just doesn't work like that"

In the end I explained that to the user and he agreed. One more reason for his approval is that std-dev often doesn't really say anything about the data. In other words, it isn't informative. "The standard deviation is 0.432. That means that... ???"


The Answer
If you (or the customer) still insist on that crazy measure, the following MDX will work.

With
Member [Measures].[RowSTDOrders] as
iif(Count(NonEmpty(StrToSet("Axis(1)").Item(0).Hierarchy.Children,
{[Measures].[Order Quantity]}) as ChildSet) < 2,
Null,
StDev(ChildSet, [Measures].[Order Quantity]))
 
select
[Date].[Calendar Year].[Calendar Year] on 0,
Non Empty [Product].[Product Categories].Members on 1
from [Adventure Works]
where [Measures].[RowSTDOrders]

Thanks to Deepak Puri for this code. Notice that the StrToSet function will degrade performance, but this is the only way that the code will also work in an MDX script and not only in queries.
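
To see what the calculated member computes for a single row, here is the same logic as a few lines of Python. It is a sketch of the arithmetic only, not of how SSAS evaluates the query: take the children's non-empty values, return nothing when there are fewer than two of them, otherwise return the sample standard deviation (which is what StDev calculates). The function name row_std_dev is mine.

```python
from statistics import stdev


def row_std_dev(child_values):
    """Sample standard deviation of the non-empty child values, or None."""
    non_empty = [value for value in child_values if value is not None]
    if len(non_empty) < 2:
        return None  # mirrors the iif(... < 2, Null, ...) guard in the MDX
    return stdev(non_empty)
```

So a row whose children hold the order quantities 1, 2 and 3 gets 1.0, while a row with a single non-empty child shows as empty.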

P.S
It doesn't matter if you write StDev or StdDev.
Wednesday, May 14, 2008 6:28:38 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, May 08, 2008
This tiny thing cost me a minute today, but it may take some of you longer, so I'm writing it down.

As some of you know, in order to sort a dimension's attribute you need to change the OrderBy property of the attribute. You can make the attribute sort according to another attribute (it's a very common thing in SSAS). In order to do so, you set the OrderBy property to AttributeKey and in the OrderByAttribute property you pick the desired attribute (the one that defines the order).

Note that if the first attribute (the one you want to sort) doesn't have an attribute relationship to the second attribute, you won't be able to pick the second attribute in the OrderByAttribute property. These attributes must have an attribute relationship.
One more thing: You don't have to show the end-user the attribute which defines the order. If you want to hide it just set the AttributeHierarchyVisible property to false. It is a common pattern to create an attribute which sorts another attribute and hide it from the user.
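
The classic case for this pattern is month names, which sort alphabetically unless a month-number attribute defines the order. The Python sketch below shows the idea outside of SSAS: each member carries a visible name and a hidden ordering key, the list is sorted by the key, and only the names are shown. The field names name and order_key are made up for the example.

```python
def ordered_names(members):
    """Sort members by their hidden key and return only the visible names."""
    return [m["name"] for m in sorted(members, key=lambda m: m["order_key"])]


MONTHS = [
    {"name": "March", "order_key": 3},
    {"name": "January", "order_key": 1},
    {"name": "February", "order_key": 2},
]
```

ordered_names(MONTHS) returns January, February, March, whereas sorting by the name itself would put February first. The order_key never reaches the user, just like the hidden attribute.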

Thursday, May 08, 2008 7:32:59 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, May 06, 2008
Last May I started my new blog with many questions: What exactly will I write about? Will anyone read me? What do I have to add to all those blogs out there? And a lot more.
After a year of blogging I'm happy with my choice of starting a blog and I believe that this blog is good. On the other hand, I know there are a lot of things I can do better. This is a list of what I like and dislike about my blog. In the dislike list I wrote down what I can do to make it better, whenever possible. This list is mostly for me to put my mind in order, but maybe one of you can find useful things in it.

Like List
  • Release. A place where I can toss thoughts from my mind out to the world.
  • Share. I love to share good ideas and implementations. I believe it helps the community, and the good comments I get make me understand that I'm right.
  • Save. Over the last year I found that the blog can be a very good place to save knowledge. When I need a piece of my code and I'm at a customer's place and not in my office it can be very helpful.
  • Be a part. Owning a blog positions me in a community of people with shared interests. This promotes me in knowledge and as a person.
Dislike List
  • Not enough. This is the thing that bugs me the most. I'm not writing enough, or at least not as much as I want to write. This is even more frustrating when I see that my posts help many people out there. Finding the time to write and dividing my time between reading and writing is hard. I will make a real effort to write more in the future.
  • Screen Shots. I work in a closed network in my company so I can't take out code or screen shots that could be very effective and helpful for you, the readers. I hope to install some of the programs I'm using on my own computer so I can show you the results of my work.
  • Respond. I didn't respond in time when you commented. I will configure my blog to send me mail whenever you comment and I promise I will respond more quickly.
This is it. Just two more ideas I have in mind. One is already implemented; the second may be implemented in the future.
  • When I started this blog I thought I would write about Jewish stuff as much as I write about BI stuff. I was completely wrong. I found that writing about Jewish stuff in English is very hard for me and that writing deep and serious thoughts is even harder. I changed the title of the blog to "Business Intelligence, Analysis Services, MDX, DataWarehousing and more..." (you can see it up there in the banner). I will focus on these subjects, but I will continue to write about other things that interest me.
  • I thought of adding a box in the right column of the page titled "Upcoming Posts". That's because I know the subjects I'm going to write about long before I do it. I think it can be a cool feature but the question is: Will it interest anyone? Is there someone who's waiting for it? I thought not. :-)

Tuesday, May 06, 2008 7:06:04 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, April 22, 2008

One more thing about getting a file from the web/SharePoint and using it as a source in SSIS: If you need to authenticate just change the xml.open command to:

xml.open "GET", URL, false, "user", "password"

where user and password are the user name and password of an account that has permissions on the desired file. Note that it is HIGHLY recommended to use an application user, so the password won't change in the future. If you don't have such a user and you must change your password in the future, do not forget to change it in the script as well. My tip: add a reminder in your calendar to change the password in the script.

At this point I don't know whether you can authenticate over SSL or stronger protocols using VBScript.
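
For readers who script the download outside VBScript, here is a minimal Python sketch of the same idea: sending a user name and password along with the GET request. It assumes the server accepts HTTP Basic authentication, which is my assumption and not something checked in this post; many SharePoint sites use Windows authentication instead, which this sketch does not cover. The URL and credentials are placeholders.

```python
import urllib.request

def build_authenticated_opener(url, user, password):
    """Return an opener that answers HTTP Basic auth challenges for url."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None means "any realm"; the credentials apply to url and paths below it.
    manager.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler)

# Placeholder values, matching the xml.open line above.
opener = build_authenticated_opener(
    "http://mySite/myFolder/myExcelFile.xls", "user", "password")
# opener.open(url).read() would then fetch the file. It is not called here
# because mySite is not a real host.
```

The same advice applies as with the script: keep the password in one place so it is easy to update.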

Tuesday, April 22, 2008 10:20:05 PM (Jerusalem Daylight Time, UTC+03:00)
 Monday, March 31, 2008
We got many client requests for the ability to show the "last updated" date of the data in their web sites.
It doesn't matter how you present the SSAS data - the customers will always want to know as of which date the data is correct.
My solution is an ASP.NET 2.0 web site that uses the AMO class library. It takes the date from the server and shows it to the user.

What you need to do is:
1. Create a new ASP.NET web site using Visual Studio 2005/2008.
2. Add a reference to the AMO dll (Microsoft.AnalysisServices). You'll find it on the SSAS server.
3. In the default.aspx page that was created for you, just add one Label.
4. Add a configuration file which will hold the name of the SSAS server. That way, when you move the site from the development environment to the production environment, you'll only have to change this file. Call this file config.xml and write the following in it:
<?xml version="1.0" encoding="utf-8" ?>
<ServerName>YourServerFullNameHere</ServerName>

5. In the code-behind file (default.aspx.cs) write the following code instead of what you already have there:

using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using AMO = Microsoft.AnalysisServices;
using System.Xml;

public partial class _Default : System.Web.UI.Page
{
  protected void Page_Load(Object sender, EventArgs e)
  {
    Label1.Text = GetCubeUpdateDate(Request.QueryString["DBName"],Request.QueryString["CubeId"]);
  }

  private string GetCubeUpdateDate (string dbName, string cubeId)
  {
    using (AMO.Server asServer = new AMO.Server())
    {
      asServer.Connect("Data Source=" + GetAnalysisServerName());
      AMO.Database db = asServer.Databases.FindByName(dbName);
      if (db == null)
      {
        return "DB Name not found";
      }

      AMO.Cube cube = GetCubeById(cubeId, db);
      if (cube == null)
      {
        return "Cube ID not found";
      }

      DateTime lastProcessed = cube.LastProcessed;
      return lastProcessed.Day.ToString() + "/" + lastProcessed.Month.ToString() + "/" + lastProcessed.Year.ToString();
    }
  }

  private string GetAnalysisServerName ()
  {
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(Request.PhysicalApplicationPath + "config.xml");
    return xmlDoc.GetElementsByTagName("ServerName").Item(0).InnerText;
  }

  private AMO.Cube GetCubeById (string cubeId, AMO.Database db)
  {
    foreach (AMO.Cube cube in db.Cubes)
    {
      if (cube.ID.Equals(cubeId))
      {
        return cube;
      }
    }
    return null;
  }
}

Even though the code is self-explanatory, here are some notes about it:
  • I chose not to put the server name in the web.config file because I like to separate application-related configuration from web configuration.
  • If you want, you can get the cube name from the user (in the query string) and then the code is even shorter - just get the cube the same way I got the database.
  • I wanted to show the date in the DD/MM/YYYY format, which is why I wrote the long return statement in the GetCubeUpdateDate method. If you want the date in the MM/DD/YYYY format you can use the lastProcessed.ToShortDateString() method, which formats the date according to the current culture (month/day/year under en-US).
  • Note that when you publish the web site you need to create a dedicated virtual directory in IIS.
  • The user uses this site in the following way: All he needs to do is create a frame with this site's address as its source and add the DBName & CubeId parameters to the query string. In SharePoint it's even easier - the user only needs to create a Page Viewer web part.
Enjoy.
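
To make the two small helpers in the code concrete, here is a rough Python sketch of the same logic: reading the server name from config.xml and building the day/month/year string. It is only an illustration of the idea, not a replacement for the C# page; the function names are mine and the XML is the sample file from step 4.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The sample config.xml from step 4, as bytes.
CONFIG_XML = b"""<?xml version="1.0" encoding="utf-8" ?>
<ServerName>YourServerFullNameHere</ServerName>"""

def get_server_name(config_bytes):
    # <ServerName> is the root element, so its text is the server name.
    return ET.fromstring(config_bytes).text

def format_update_date(last_processed):
    # Day/month/year without zero padding, mirroring the C# concatenation.
    return f"{last_processed.day}/{last_processed.month}/{last_processed.year}"

print(get_server_name(CONFIG_XML))
print(format_update_date(datetime(2008, 3, 31, 23, 5)))
```

Note that concatenating the parts gives 31/3/2008 and not 31/03/2008, because the day and month numbers are not zero padded.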

Monday, March 31, 2008 11:05:20 PM (Jerusalem Daylight Time, UTC+03:00)
 Saturday, March 29, 2008
Two weeks ago I showed you the leds map. This time I'll describe how it is done.

The leds map is basically a web page with a lot of JavaScript and Panorama applets which together give the user the feeling of an Ajax & DHTML based web site.
Look at the picture of the map in the previous post. The leds are simply Panorama applets which show Panorama views. Each view shows only one led. Using the Panorama SDK, I did the following:
  • Take the led's value from the view and show it in the tooltip
  • Take the led's color from the view and let the user filter the map according to the desired color(s)
  • Take the led's view path and, after the user clicks the view, show the related views (the departments' views)
The rest of it is just JavaScript games and tricks.
The leds map is a beautiful example of what you can do with imagination, thought and goodwill to give your customer a good BI tool to work with.

Sunday, March 30, 2008 5:27:29 AM (Jerusalem Daylight Time, UTC+03:00)
 Wednesday, March 19, 2008
Yesterday, my friend Ilya asked me how to calculate the average of dates. I explained to him that the dates in SQL Server are actually represented as numbers, where zero is 01/01/1900. All you need to do is cast the dates to numbers, average them and cast the result back to a date. Assuming that the date column is called MyDate, here is the code:

Cast(Avg(cast(MyDate as float)) as datetime)
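
To see why this works, here is a small Python sketch of the same arithmetic outside SQL Server. It treats each date as the number of days since 01/01/1900, averages the numbers and converts the result back. The sample dates are made up for the illustration.

```python
from datetime import datetime, timedelta

# The zero point of SQL Server's datetime, as described above.
EPOCH = datetime(1900, 1, 1)

def to_number(d):
    """Days (with fraction) since 01/01/1900, like cast(MyDate as float)."""
    return (d - EPOCH) / timedelta(days=1)

def average_date(dates):
    numbers = [to_number(d) for d in dates]
    return EPOCH + timedelta(days=sum(numbers) / len(numbers))

# Offsets of 0, 2 and 7 days from March 1 average to 3 days.
sample = [datetime(2008, 3, 1), datetime(2008, 3, 3), datetime(2008, 3, 8)]
print(average_date(sample))
```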

Thursday, March 20, 2008 12:04:23 AM (Jerusalem Standard Time, UTC+02:00)
 Sunday, March 16, 2008
A month ago I posted about the necessity of sharing ideas in the BI world. I really think that if we all share our smart ideas then we'll be better at our work.

I want to show you a piece of work I finished a few weeks ago. I'm very proud of it, as it will be on our CEO's desktop and I got many compliments for it.
Note that what you see in the picture is not a real screenshot of the work (it's much more beautiful in reality...). There's a problem getting screenshots out of my company, so I drew a sketch in PowerPoint.

This is the functionality of the leds map (it's my design, so comments are welcome):
  • The leds map is simply a web site, meaning zero footprint on the client's computer. Some computers in my company have Java compatibility problems, so I added a parameter you can send with the site's URL which changes the applet's Java version (see more in the next post, which will be more technical).
  • The leds map has to be small, about a quarter of the screen. That's because it's intended to be a part of the CEO's desktop.
  • While the map loads, a picture of a rotating The Thinker statue is shown with a "Loading" message below it (our CEO loves that statue...).
  • After the map has loaded, the user sees two axes with the leds between them. The two axes can represent any Meta-Measures you'd like: Short-Term Profit vs. Long-Term Profit, Client's Satisfaction vs. Company's Profit, etc. This is a point that many people have difficulty understanding, so I'll give an example: The yellow led is in the top-right corner, which says that the underlying measure is very important for both meta-measures. Going on with the example, it says that this measure is very important for the Client's Satisfaction and for the Company's Profit. Note that the leds never move. Only their color changes.
  • When you move the cursor over a measure in the map, a small tooltip appears next to it. The tooltip shows the measure's name and its value (you can see it on the bottom-left led). Design Change: As my team master recommended, each led now has its measure's name above it. The tooltip shows only the value.
  • When the map loads, only the red leds are shown. In the top-left corner of the screen there are three radio buttons which filter the displayed leds by their colors. In the picture, all the leds are shown because all the radio buttons are enabled.
  • Clicking on a measure on the map drills down to the different departments' leds, as you can see on the left side of the picture.
  • Clicking on a department's led makes the map vanish; in its place there's a drill-down of the department, meaning that the measures of its sub-departments are shown instead of the map.
  • After the last drill-down, there are two possible actions: close the new view and return to the map, or open the new view in full screen, where you can slice-and-dice and play with the data.
In the next post I'll describe how the leds map was built using the Panorama SDK.

Sunday, March 16, 2008 7:55:46 AM (Jerusalem Standard Time, UTC+02:00)
 Saturday, March 15, 2008
It turns out that using a simple Excel file as a source in SSIS is not so trivial, especially if your source is on the web or in your SharePoint portal. At first you'll think it's easy - just declare the Excel source as a URL (the URL of the Excel file, for example) and it will succeed. The problem is that Microsoft lets you think it's working. Click on the Excel source and you'll see in its properties that the source path points to the local temporary internet files, meaning that the source is a local copy which is not up-to-date, so it's worth nothing.
Here's what I tried to do and the final (and successful) solution:

1. Use the File System task. It won't work because you can't declare a URI there.
2. In the MSDN forum (I can't find the link right now) they say to write a script, so I also tried this. Using the Script task, I wrote code in VB.NET which uses the System.IO library of the .NET framework to copy the Excel file (using its URI) to the desired location on the local computer. Running it, I got an error saying that the script can't use URIs...
After trying this I understood that no code or action running in the SSIS context will work with URIs. I'm not sure why Microsoft's developers built it that way (or maybe it's just another bug). Anyway, the next step is the solution.
3. Build an executable file that performs the desired copy task. You can't use a regular batch (.bat) file because DOS/CmdExec does not know how to work with URIs. So, there are two ways to do this:
a. Download this and use it as a copier from the web.
b. Use the following code and save it as a Visual Basic Script file (*.vbs):

'GetRemoteBinaryFile.vbs
TheFile = "myExcelFile.xls"
DestFolder = "C:\SSIS_Sources"
URL = "http://mySite/myFolder/myExcelFile.xls"
Set xml = CreateObject("Microsoft.XMLHTTP")
xml.Open "GET", URL, False
xml.Send
set oStream = CreateObject("Adodb.Stream")
Const adTypeBinary = 1
Const adSaveCreateOverWrite = 2
Const adSaveCreateNotExist = 1
oStream.type = adTypeBinary
oStream.open
oStream.write xml.responseBody
' Overwrite an existing file
oStream.savetofile DestFolder & "\" & TheFile, adSaveCreateOverWrite
oStream.close
set oStream = nothing
Set xml = Nothing

After you have your file (vbs or exe) you can use the Execute Process Task to make the copy. In the task, declare that you want it to run your exe or vbs. After that, just use a normal Data Flow Task, where the source Excel file is on the local computer (the file that was copied in the previous task) and the destination is your desired DB.

Note that:
1. Before executing you must have the Excel file already placed on your local computer, meaning that you must make the first copy yourself before the first time you run the package. This is because SSIS validates the package before running it and checks that the file exists.
2. Even if the copy process is long (because the file is coming from the web), don't worry. SSIS works synchronously, meaning that the Data Flow task will not start until the Execute Process task which copies the file has ended.
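
If you prefer to build the copier yourself (option a), the job of the .vbs script can be sketched in a few lines of Python. This is only a sketch of the same download-and-overwrite step; the function name fetch_to_folder is mine, and the URL and folder in the usage comment are the placeholders from the script above.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse

def fetch_to_folder(url, dest_folder):
    """Copy the resource at url into dest_folder, overwriting any old copy."""
    # The file name is the last segment of the URL path.
    file_name = posixpath.basename(urlparse(url).path)
    dest_path = os.path.join(dest_folder, file_name)
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path

# Usage with the placeholder values from the script (not run here):
# fetch_to_folder("http://mySite/myFolder/myExcelFile.xls", r"C:\SSIS_Sources")
```

Joining the folder and the file name with os.path.join takes care of the path separator between them.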

Enjoy.

Update: I added a post about authentication.

Sunday, March 16, 2008 6:47:26 AM (Jerusalem Standard Time, UTC+02:00)
 Monday, February 25, 2008
With the Panorama SDK you can do cool stuff, as I will show you in the future. However, there are important (and undocumented) things you must know before you start. A very common task is to change the displayed view. Note that:

If you load the view using the "Alias" parameter with the full view path (ending with ".xml") you won't be able to change the view later. Worse: The applet will not return an error. It just won't respond. So, if you want to enable changing the view dynamically, enter in the "Alias" parameter only the name of the Briefing Book that contains the desired view. Next, add another parameter named "FirstView" and enter the relative path of the view there, meaning that you'll have to remove the name of the server and the briefing book's name. Don't forget to replace the backslashes (\) with double backslashes (\\), otherwise... the applet won't respond. Some examples:

use: AttachParameter("Alias", "http://<myServer>//<ThePanoramaDirectoryPath>//<myBriefingBook>//<myDirectory>//myView.xml"); to show a view with no option to change it later (not recommended). Note that here you don't need to use backslashes because this is just a regular URL.

use:
AttachParameter("Alias","myBriefingBook");
AttachParameter("FirstView","\\<myDirectory>\\myView.xml");
to show a view with an option to replace it later using the CallShowView function.

I recommend always using the second method. That's because you can't know what your customer's next demand will be. Remember that changing the view is a very common thing to do in the BI world.
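
If the page is generated by a program, the backslash doubling is easy to forget. Here is a small Python sketch that produces the two AttachParameter lines of the second method from a briefing book name and a relative view path. The helper names are mine and hypothetical; the sketch only handles backslashes, so a path containing quotes would need more escaping.

```python
def js_string_literal(text):
    """Quote text for JavaScript source, doubling each backslash."""
    return '"' + text.replace("\\", "\\\\") + '"'

def attach_view_calls(briefing_book, relative_view_path):
    """Build the two lines that allow the view to be replaced later."""
    return [
        'AttachParameter("Alias",' + js_string_literal(briefing_book) + ');',
        'AttachParameter("FirstView",' + js_string_literal(relative_view_path) + ');',
    ]

# The path is passed with single backslashes; the output has them doubled.
for line in attach_view_calls("myBriefingBook", r"\myDirectory\myView.xml"):
    print(line)
```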

Monday, February 25, 2008 11:39:22 PM (Jerusalem Standard Time, UTC+02:00)
My team master always says that Oracle was left behind in the BI world because they don't have good visual tools over OLAP cubes. Microsoft, for example, has good visual tools such as Panorama. Today I saw that Oracle has beautiful visualizations over relational DBs, called ADF Data Visualization Components. The problem is that these tools can't work over OLAP cubes. In this link (look for Oracle OLAP Q&A) you can see what we'll get from Oracle in the future: These visual components will be able to show OLAP data. Maybe that will make Oracle really able to fight Microsoft in the BI scene.

Monday, February 25, 2008 11:07:02 PM (Jerusalem Standard Time, UTC+02:00)
 Sunday, February 10, 2008
My boss called me today to ask me if some things can be achieved using our technologies (mainly Analysis Services and Panorama NovaView). These things were simple ideas of how BI can be shown to the end users. As we talked I thought of many great ideas that could be implemented. As you well know, one of the biggest problems in the IT world is that the user doesn't always know what he wants. Drilling down to the BI world, I can say that the problem is that the user can't dream. He can't know what he can ask for and sometimes - how easily it can be achieved. One of our many tasks as BI consultants/designers/developers is to help them dream. We can show them things that we've done and things that others have done. This is where you, the reader, can help.

Let's share ideas. In your blogs (or as comments in blogs) you can write about beautiful things you did in your organization. It can be an idea or a real UI that you can show. Don't worry, I'm not a man of talk only. In the near future I will show here something very beautiful (with a big ROI, of course...) that I did. Stay tuned.

Monday, February 11, 2008 4:38:22 AM (Jerusalem Standard Time, UTC+02:00)
 Friday, February 01, 2008
This is a good one: When you build a flat file connection to a csv file, you can preview the data. In the preview there's an option to skip some rows (Data rows to skip). If you leave it with a number greater than zero - the process itself will skip these rows!! I still wonder whether this bug is By Design or not. If you wish, you can track this bug in Microsoft Connect.

Friday, February 01, 2008 2:00:16 PM (Jerusalem Standard Time, UTC+02:00)
 Monday, January 21, 2008
Today I was at Microsoft's Data Mining Conference which took place in the Sheraton City Tower, Ramat-Gan (Israel, of course). First of all - the food was good. :-) Now, seriously: All the lectures were great, although they were all given by one man - Rafal Lukawiecki, who is a very talented speaker. I don't think I've seen such enthusiasm in an IT lecture for many years. Anyway, what have I learned today?
  • The Data Mining world is very interesting indeed. Microsoft has a lot to offer in DM and it is all ready to use in BIDS.
  • Microsoft's approach is DM for the masses, which I don't believe in myself. Even though the tools are very simple and even the code (DMX) is easy (in contrast to MDX), I don't think that an inexperienced developer can produce good results. The SAS approach says that you need to have deep knowledge of statistics (which is bad), but I don't think that DM can be done by the masses.
  • There are many different DM algorithms which you need to get acquainted with before you start mining. As I mentioned, they tell you that all you need to know is what each algorithm does in general, but in fact there are many parameters which you need to adjust and play with, so you really need a good knowledge of these algorithms.
  • Visualization is very important in DM. Even after you have good results in your hands, you need good UI tools to show you the results in an efficient way or else you'll be lost in a jungle of data.
  • If you already have a data warehouse, you're halfway to mining models. Preparing the data is a huge part of the job in DM.
  • After you have good results and even after you have a good visualization of the results, you need an expert from the company you work for (or in) who will look at the results and tell you whether they bring new knowledge or are trivial.
I don't believe there's a chance I'll be mining in the near future, but maybe I'll play with it a little in my free time (which of course I don't have). Taking the data of our data warehouse and mining it can bring up some interesting stuff. Who knows.

Tuesday, January 22, 2008 3:06:14 AM (Jerusalem Standard Time, UTC+02:00)
 Tuesday, January 15, 2008

Although SSAS will let you use special characters in member names, some other applications such as Panorama won't function properly. I'll give one example: When you perform Drillthrough in Panorama, the engine generates on the fly a web page that takes the user to the next view. The next view will be sliced like the current view, so this web page needs to pass the dimensions parameter (meaning - the current slices). That's why this web page contains this line:

AttachParameters("Slicers","%Slicers%")

A big problem will occur if one of the sliced dimensions is sliced on a member whose name contains inverted commas ("). The JavaScript will raise an error because there are three inverted commas in the second parameter of the line. This is only one example of what can happen in a BI consumer program if you use special characters in member names. So - be careful not to pass these characters from the DW (build the ETL so it drops these characters) and not to give such names in SSAS, for example in the All member name.
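
Here is a small Python sketch of both halves of the problem: how a member name with an inverted comma breaks the generated line, and how an ETL step could drop the character. It is a simplification, since I assume here that %Slicers% is replaced by the member name alone, which is not how the real slicers string looks. The function names and the sample member name are made up.

```python
# Characters to drop. Only the inverted comma comes from the post;
# extend the set for whatever your own front end chokes on.
FORBIDDEN = '"'

def clean_member_name(name, forbidden=FORBIDDEN):
    """Drop forbidden characters, as the ETL would before loading the DW."""
    return "".join(ch for ch in name if ch not in forbidden)

def slicers_line(member_name):
    # Simplified picture of the generated line once %Slicers% is filled in.
    return 'AttachParameters("Slicers","' + member_name + '")'

raw = '19" Monitors'
print(slicers_line(raw))                     # three quotes in the 2nd argument
print(slicers_line(clean_member_name(raw)))  # well formed again
```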

Tuesday, January 15, 2008 11:08:32 PM (Jerusalem Standard Time, UTC+02:00)
 Tuesday, January 08, 2008

I feel like I don't have the right to write about this after so many bits of information have moved around the web on the subject, but I would like to add my point of view (or in fact - my point of code). First of all, I must mention some of those who wrote about this subject before me. Mosha Pasumansky wrote a long post about it last May. This post contains some ideas for dealing with this problem, but none of them is perfect. In fact (as always) - there is no perfect solution for this problem. Another important source of knowledge can be found here in the MSDN forums, where Chris Webb, Thomas Pagel and others discussed it. Now, I would like to add my solution. Take it or leave it - your choice.

First of all, create a column in the time dimension that will be the current day indicator. Thanks to the Data Source View and the UDM approach of SSAS 2005 you don't have to change the relational table itself. Just add a named calculation in the DSV with an expression that will be 1 for the row of the current day and null or zero for the others. The expression syntax itself depends on the underlying DB so I won't write it here, but it's very simple. Add this column as an attribute (let's call it CurrentDayInd) in the dimension structure and set its AttributeHierarchyVisible property to false. That's because we don't need to show such an attribute hierarchy in our time dimension. After that, create a new User Hierarchy (you can call it Current Day), where the first level is CurrentDayInd and below it comes the day (key) attribute.

Now, what do we have here? We have a hierarchy (Current Day) whose first level has two members - 1 and zero. The 1 member has only one child, which is the current day. Link that member to your regular hierarchy (it's called YSQMD at my place) and there you have it. For example, you can use it this way in the MDX script:

Create Set [Last 30 Days] as
LinkMember([Time].[Current Day].[CurrentDayInd].&[1].Children.Item(0),[Time].[YSQMD])
:
LinkMember([Time].[Current Day].[CurrentDayInd].&[1].Children.Item(0),[Time].[YSQMD]).Lag(30);

I think that this solution is good and elegant. First of all, you don't have to use external functions such as Now(). The second pro is that we use the native OLAP mechanism, which saves time and makes the queries run faster. Believe me, you'll feel the difference with big cubes. The last thing is that this solution is easy to understand (at least I think so) and easy to maintain. The big con is that you have to process this dimension (and related cubes) every day. I don't think that it's so bad because most organizations process every day anyway.
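
To make the logic tangible, here is a short Python sketch that mimics the two pieces on a plain list of days: the indicator (1 for the current day, 0 for the rest) and the range from the current day back to its Lag. It is only a model of the idea, with made-up dates and function names, and it assumes the list holds enough earlier days. It also shows that a range between a member and its Lag(30) includes both ends, so it holds 31 days; Lag(29) gives exactly 30.

```python
from datetime import date, timedelta

def current_day_flags(days, today):
    """1 for the current day's row, 0 for every other row."""
    return [1 if d == today else 0 for d in days]

def range_back(days, today, lag):
    """Members from today.Lag(lag) through today, both ends included."""
    i = days.index(today)  # assumes at least `lag` days exist before today
    return days[i - lag : i + 1]

today = date(2008, 1, 8)
days = [today - timedelta(days=n) for n in range(59, -1, -1)]  # oldest first

flags = current_day_flags(days, today)
members = range_back(days, today, 30)
print(sum(flags), len(members), members[0], members[-1])
```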


 

Wednesday, January 09, 2008 5:39:58 AM (Jerusalem Standard Time, UTC+02:00)
 Monday, January 07, 2008

In the previous parts (1, 2) I showed how to connect Informatica with MS-OLAP, meaning that a mapplet can process a cube or a dimension. The thing is that I focused on the MS-OLAP side. In the second part I even wrote the T-SQL code itself. Now I want to close the loop by describing what's going on on the Informatica side. This part was done by my friend, Alex, who permitted me to post here about what he did.

First of all, there's a table which contains the parameters with which to call the MS-OLAP procedure (object id, type, user name, etc.). This table is the source (& source qualifier, of course) of the mapplet. Each row in this table triggers a call to the stored procedure on the MS-OLAP side (in fact, the procedure is part of the relational DB, but never mind that now). The call to the SP is made with Informatica's Procedure block. The connection is a regular ODBC connection, as mentioned in the previous part. Now for the interesting part: In the mapplet, the result of the procedure (zero for success, one for failure) goes into a Java Transformation block. This Java block will fail the mapplet if one or more procedure calls returned failure.

How do you build this Java block? Double click on it to enter its properties. Go to the "Java Code" tab. There you'll see a tab for every event in this block's life cycle. Here is the code for every tab (only the relevant ones):

Helper Code:

static int errorCounter = 0;
static Object lock = new Object();

On Input Row:

if (returnValue != 0)
{
 synchronized(lock)
  {
   errorCounter++;
  }
}

On End of Data:

synchronized(lock)
{
 if (errorCounter > 0)
  {
   failSession("OLAP Objects failed");
  }
}

Note that:

  • I'm not sure that the lock mechanism is required here. Sync, lock, semaphore, etc. mechanisms are often used when an atomic write is needed in order to solve problems like race conditions, concurrent writes, etc. Here I simply don't care. Even if two parallel threads read errorCounter as zero and both increase it to one (when in fact it should have the value of two), it won't be a bug because the session will fail anyway. Alex & I need to talk about this point...
  • failSession is a function which is part of Informatica's API. As you might guess, it will fail the whole mapplet.
  • Very important: Calling all the MS-OLAP objects at once will cause an error in the Analysis Services server and all the objects will end up in the Unprocessed state. The Informatica side has to call the dimensions first and only then the cubes. The cubes must not be called all at once if they have relationships between them. This will cause a deadlock too.
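
The ordering rule and the error counter can be sketched together in a few lines of Python. This is not Informatica code: the process callback stands in for the stored procedure call (zero for success), and the object ids are made up. The sketch calls the objects one at a time, dimensions before cubes, and fails at the end if any call failed.

```python
def run_in_order(objects, process):
    """objects: list of (object_id, object_type) rows; process returns 0 on success."""
    # Dimensions first, then everything else. sorted() is stable, so the
    # original order is kept inside each group.
    ordered = sorted(objects, key=lambda o: 0 if o[1] == "dimension" else 1)
    error_counter = 0
    for object_id, object_type in ordered:  # one call at a time, never all at once
        if process(object_id, object_type) != 0:
            error_counter += 1
    if error_counter > 0:
        raise RuntimeError("OLAP Objects failed")
    return [object_id for object_id, _ in ordered]

# Hypothetical parameter rows, deliberately listed in the wrong order.
rows = [("Sales", "cube"), ("Time", "dimension"), ("Customer", "dimension")]
print(run_in_order(rows, lambda object_id, object_type: 0))
```

Because the calls run one after another in a single thread, no lock is needed around the counter in this version.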
Tuesday, January 08, 2008 5:35:17 AM (Jerusalem Standard Time, UTC+02:00)
 Saturday, January 05, 2008

The Panorama NovaView Desktop program can't always deal with huge crossjoins. The reason is that this program is written in VB6 - a very old platform for client programs. One thing you can try is to go to the crosstab properties and, in the Advanced tab, check "Optimize huge crossjoins". The problem is that this won't always help. The best solution I've found so far is to go to the Web Access site (or click on the IE button in the desktop program), where you can choose the size of the chunk of data you'll receive on every click. Starting with 100 rows in the first chunk may help you with huge crossjoins.

Sunday, January 06, 2008 5:48:11 AM (Jerusalem Standard Time, UTC+02:00)
 Monday, December 31, 2007

As many of you already know, an installation of Microsoft Excel on the SSAS 2005 server is needed in order to use Excel functions in MDX. That's very helpful because MDX lacks many important functions such as Round (!). Many organizations don't like installing Excel on a server at all, but here's something that may help. On the SSAS 2005 server you don't need to install the whole program, only the .NET programmability support. In the installation, choose to manually pick which components to install and then choose the .NET Programmability Support, as seen in the picture:

Notice that this issue will not be fixed in SSAS 2008, so this tip will be relevant for a long time.

Monday, December 31, 2007 5:06:00 PM (Jerusalem Standard Time, UTC+02:00)
 Sunday, December 30, 2007

Just got home. Most of my day (and my co-workers' too) was spent on a big installation of the second block of our BI project. In the morning we really thought that maybe this time, yeah - just this time things would go better. After more than 12 hours I'm laughing at myself: How could I be so naive? Many things that could go wrong did, but now that it's all over (with a happy ending, otherwise I wouldn't be here, writing in my home sweet home) I can say that most of the blame is on Informatica PowerCenter. We're using version 8 of the software. It's not new software that started its way yesterday: It's very old and familiar software. So how can it be that when we copy mapplets (ETL processes, for those of you who don't know Informatica) from one repository to another, some lines are just deleted from the mappings? After that you check your dimensions in MS-OLAP and you don't understand what happened there. A whole level in a big dimension that has only one member - 0 ?? The zero member is null. Yeah, we were right - the line in the mapping had simply been deleted by our precious Informatica, so the column is all null.

Well, I'm happy we're through with this. Good night.

P.S.
Tomorrow I'm taking a day off... :-)

P.S 2

Although many things went wrong in the installation, I really think we had a good block this time. This block contains many beautiful things in MS-OLAP, MDX and Informatica. You'll see them here in the next few days, after I calm down. :-)

Monday, December 31, 2007 6:23:16 AM (Jerusalem Standard Time, UTC+02:00)
 Wednesday, December 19, 2007

My team master Yaron asked me to check some things in the Panorama Dashboards:

1. Can I have two hands in one gauge?
2. Can I show two values in the text of every gauge?

Here are the answers. I think that the second answer is a beautiful one. In fact, I really enjoyed figuring out how to do this.

1. This is simple: Just use the Goal hand as the second hand. In the KPI Wizard go to the Define Goal step and choose Custom formula. Enter the measure you want to see in the second hand.

2. This is beautiful: In the KPI Wizard, go to the Finish step and to the Title part. Click on the little blue arrow and click on "Edit MDX...". Then, write this MDX:

[My Dimension].[My Hierarchy].CurrentMember.Name + '\n' +
[Measures].[First Measure].Name + ': ' +
Generate({[My Dimension].[My Hierarchy].CurrentMember},[Measures].[First Measure]) + '\n' +
[Measures].[Second Measure].Name + ': ' +
Generate({[My Dimension].[My Hierarchy].CurrentMember},[Measures].[Second Measure])

Note that:

  • This solution may apply to other BI applications, not only to Panorama.
  • This way you can show many values and data, not only two values.
  • What is the Generate function doing there? The '+' operator needs a string on both sides, so writing only [Measures].[First Measure] or [Measures].[First Measure].Value returns a numeric value, which causes an error. Used this way, the Generate function returns a string: for the set (which contains only our member) it evaluates the measure given in the second argument and, as mentioned, returns it as a string.
  • '\n' starts a new line.
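
Since the pattern repeats for every measure, the title expression can be generated instead of typed by hand. Here is a minimal Python sketch that builds the same MDX for any number of measures. The function name and its parameters are only an illustration of mine, they are not part of Panorama.

```python
def build_title_mdx(dimension, hierarchy, measures):
    """Build the title MDX: the member name, then one 'name: value' line per measure."""
    member = f"[{dimension}].[{hierarchy}].CurrentMember"
    parts = [f"{member}.Name"]
    for measure in measures:
        m = f"[Measures].[{measure}]"
        # Generate over the one-member set returns the measure value as a string
        parts.append(f"{m}.Name + ': ' +\nGenerate({{{member}}},{m})")
    # Separate the parts with the MDX line break literal '\n'
    return " + '\\n' +\n".join(parts)


if __name__ == "__main__":
    print(build_title_mdx("My Dimension", "My Hierarchy",
                          ["First Measure", "Second Measure"]))
```

With the two measures above it prints the same expression as the hand-written one; adding a third measure to the list adds a third line to the title.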
Thursday, December 20, 2007 4:18:35 AM (Jerusalem Standard Time, UTC+02:00)
 Sunday, December 09, 2007

Last week I participated in Microsoft's BI conference in Ra'anana, Israel. After the conference I asked myself: What have I really learned today? Well, here is what I remember:

  • Microsoft figured out that the eternal BI tool is and will be Excel. People just love their Excel sheets and they will stay there. This is why the mission is to bring BI into their Excel sheets. Their new product, Excel Services, will manage our Excel sheets in one central place which is connected to our Analysis Services cubes.
  • From my point of view, SQL Server 2008 is mostly a bundle of performance improvements and not really a new product. There are a lot of new "performance features". For example, most of our MDX queries will run faster, especially those that have null cells, thanks to the improvements in cell-by-cell calculations. I think that SS2008 could be one big Service Pack. If I'm wrong, please leave me a comment.
  • SQL Server 2005 has many products that we don't know well enough. Some products that I need to learn about are: Replication, SQL Server Agent, SQL CLR and more. I do know what they do and even played with them a little bit, but I want to know how they can help me and improve my work.
  • Many new features in SS2008 come from two old sources: BIDS Helper (SSAS open source add-in) and, of course, Oracle...
  • My big wish, IntelliSense for Analysis Services, will not be in SSAS 2008 and maybe won't be at all. This is because guessing in MDX is very hard: there are too many options in every statement you write.
  • We won't need to upgrade to Office 2007 in order to use Excel Services. Only the developers will need it.

This is what I remember for now. I'll update this post if something new comes to mind.

Sunday, December 09, 2007 9:26:17 PM (Jerusalem Standard Time, UTC+02:00)
 Sunday, December 02, 2007

In the last post, I explained the architecture of our BI project. The final part of the process is processing an MS-OLAP object (cube/dimension) from an Informatica mapplet. As explained earlier, the trick is to call a Stored Procedure from the Informatica server. But first there is one more thing to solve: how do you connect the Informatica server (Linux) to MS-OLAP (Windows server)?

Informatica ships with a number of drivers that can connect it to other servers. The drivers are called DataDirect and I'll discuss version 4.20. You need to define this driver on the Informatica server (look in the Informatica knowledge base for more information). This is an easy thing to do. Notice that you have to enter a full server name (including domain) and the password. Remember that if you change the password in the future, the process will fail. You also have to enable the protocol named "Named Pipes" on the MS-OLAP server. How? Open the Configuration Manager on the MS-OLAP server and, in the MSSQLSERVER protocols section, enable the Named Pipes protocol. This will enable the connection from the Informatica server. On the Informatica server, make a regular ODBC connection.

Here is the code of the SP on the MS-OLAP side. This SP must be in the msdb database on the Database Engine.

ALTER PROCEDURE [dbo].[ProcessObject]
@databaseId varchar(100),
@objectType varchar(100),
@objectId varchar(100),
@login_name varchar(100),
@returnValue int output,
@errorMessage nvarchar(1024) output
AS
BEGIN
declare @jobName varchar(200)
declare @xmla varchar(1000)
declare @jobId binary(16)
declare @ReturnCode int
declare @stop int

--Set job name
set @jobName = 'Process' + @objectType + '_' + @objectId

--Delete the job if already exists
if exists (select * from msdb.dbo.sysjobs where name = @jobName)
exec msdb.dbo.sp_delete_job @job_name = @jobName

--Create the job
Exec msdb.dbo.sp_add_job @job_name=@jobName,
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=0,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'process OLAP object',
@category_name=N'[Uncategorized (Local)]',
@owner_login_name=@login_name, @job_id=@jobId output

exec msdb.dbo.sp_add_jobserver @job_name=@jobName, @server_name=@@SERVERNAME

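--Declare XMLA for OLAP object
--(the XMLA tags were stripped when this post was published; the element
-- names below assume the standard XMLA Process command)
if (@objectType = 'Cube')
set @xmla = '
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<Process>
<Object>
<DatabaseID>' + @databaseId + '</DatabaseID>
<CubeID>' + @objectId + '</CubeID>
</Object>
<Type>ProcessFull</Type>
</Process>
</Batch>'
else if (@objectType = 'Dim')
set @xmla = '
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<Parallel>
<Process>
<Object>
<DatabaseID>' + @databaseId + '</DatabaseID>
<DimensionID>' + @objectId + '</DimensionID>
</Object>
<Type>ProcessFull</Type>
</Process>
</Parallel>
</Batch>'
else
Begin
--Unknown object type: report failure
set @returnValue = 0
return @returnValue
End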

--Add the job step
Exec msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Process Object',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'ANALYSISCOMMAND',
@command=@xmla,
@server=@@SERVERNAME,
@database_name=N'master',
@flags=0

--Run the job
Execute sp_start_job @jobName

Waitfor delay '00:00:05'
set @returnValue = (select run_status from dbo.sysjobhistory
where job_id = @jobId
and step_id = 1)

-- Loop until the job ends and return its result
set @stop = 0
if @returnValue is null
while @stop <> 1
Begin
set @returnValue = (select run_status from dbo.sysjobhistory
where job_id = @jobId
and step_id = 1)

if @returnValue is not null
set @stop = 1

waitfor delay '00:00:10'
End

--Return error message (if exists)
If @returnValue = 0 --failed
set @errorMessage = (select [message] from dbo.sysjobhistory
where job_id = @jobId
and step_id = 1)
End

Update: I see that the XMLA code went bad in the post because it is not recognised as HTML code. It doesn't matter, I believe you got the point...

Sunday, December 02, 2007 5:59:42 PM (Jerusalem Standard Time, UTC+02:00)
 Saturday, December 01, 2007

In the past I mentioned some fragments of the architecture of our end-to-end BI solution. Now I'll discuss how it is done. I will only write in detail about the things that I did (I mean, developed), but I'll describe the whole picture.

Our architecture goes like this: Control M -> Informatica -> MS-OLAP (Analysis Services 2005).

In words: Control-M is one of the most common schedulers in big companies. We use it to schedule our ETL processes in Informatica. Our system team made it possible to start Informatica processes from Control-M. I don't know exactly how it is done. All I know is that Control-M raises a flag in a table, and Informatica scans the table every X seconds and starts the process if it finds the flag that was raised by Control-M. Don't ask me about the technical details - it wasn't my job.

The more interesting thing (because I did it...) is how Informatica calls MS-OLAP and tells it to process a cube. In this part I'll describe the big picture, and in the next one I'll give some of the code and discuss some technical aspects of the process. First, the Informatica mapping moves the data from the source to the target, which is the dimension or fact table, just like it always does. After that, Informatica calls a Stored Procedure on the MS-OLAP server which processes the cube. Informatica calls this SP with some parameters, including the type of object to process (cube/dimension), its ID and a few more. In return, MS-OLAP sends back a return code (indicating whether the process succeeded) and a message describing the error, if one occurred.

How does the SP process the cube/dimension? Unfortunately, there is no SP that can process an OLAP object, so I needed to use the following steps in my SP:

  1. Delete any existing job that does the same action (read on, you'll understand)
  2. Create an empty job
  3. Add a step to that job that will process the object. This step contains XMLA code that contains the parameters that were given from Informatica
  4. Run the job
  5. Loop until the job (or process) ends and send back the return code and the error message, if there is one.
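
Step 5 is the only part with real control flow, so here is a rough Python sketch of the idea. It is not the T-SQL I wrote: `get_status` is a hypothetical callable standing in for the query on the job history table, and the timeout is an extra safety net I assume here so the loop cannot spin forever.

```python
import time


def wait_for_job(get_status, poll_seconds=10.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll get_status() until it returns a non-None status or the timeout expires.

    get_status is a hypothetical stand-in for the job history query:
    it returns None while the job is still running, 0 on failure, 1 on success.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status is not None:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError("job did not finish in time")
        sleep(poll_seconds)
        waited += poll_seconds


if __name__ == "__main__":
    # Simulated job: still running for two polls, then it succeeds
    answers = iter([None, None, 1])
    print(wait_for_job(lambda: next(answers), sleep=lambda seconds: None))
```

The `sleep` argument is injected only so the example can run instantly; in real use the default `time.sleep` does the waiting.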

In the next part I'll write some of the code and discuss some technical issues.

 

Sunday, December 02, 2007 5:19:47 AM (Jerusalem Standard Time, UTC+02:00)
 Wednesday, November 21, 2007

I never thought of it until one of my users said it: sorting the KPIs can be a very good idea. Instead of always having the same indicators (gauges, traffic lights, etc.) in the same position on the screen, sorting them lets the viewer expect the most relevant indicator in the top-left corner of the screen, the second most relevant after it, and so on.

Sorting the KPIs is very easy. Any end user, even one with no clue about MDX, can do it by following these steps:

In the Define KPI wizard, go to the "Select Set" step. Copy the current set. For example, let's say that the current set is [Products].Members and you want to sort it by the Sales measure in descending order. Click on the advanced button to the right of the set (the button with the "..." on it) and enter the following MDX statement:

Order([Products].Members, [Measures].[Sales], DESC)

If you want ascending order, replace DESC with ASC or leave the argument out altogether.

Thursday, November 22, 2007 6:24:44 AM (Jerusalem Standard Time, UTC+02:00)

If you see "No Data" after entering a view with a defined KPI, chances are that the reason has something to do with privileges, but today I saw one more thing that can make the KPI go crazy.

One of my users built a view and he removed all the measures but one. After that, when he defined the set of his KPI, he chose the set of the measures. That caused the "No Data" title when he wanted to see his KPI view.

Thursday, November 22, 2007 5:45:19 AM (Jerusalem Standard Time, UTC+02:00)

One of my users had a very weird problem today. When he entered a view with KPI gauges, he could see everything but the gauges. Everything was there (the titles, numbers, etc.) except the gauges themselves. The problem occurred in the Panorama Web Access site and also in the dashboard site. I checked with other users and they didn't have this problem (with the same views, of course).

After a few minutes I found the problem: The Explorer process in windows used too much memory and it caused visual problems in the browser. I ended the process, restarted it (Ctrl + Shift + Esc -> File -> New task -> explorer) and everything went back to normal.

Thursday, November 22, 2007 5:39:21 AM (Jerusalem Standard Time, UTC+02:00)

Important note: The user working with Panorama NovaView Desktop must have write privileges on the Panorama folder (the default is C:\Program files\Panorama). The program saves its data there, so it will cause a lot of trouble if it can't save. For example, without this privilege, when you start the program and click on the globe (work on a briefing book from the server) you'll have to enter the Panorama server name every time you start the program.

Don't worry: The user watching the views doesn't need any privileges on his computer. He only needs the right to see the view or the dashboard page.

Thursday, November 22, 2007 5:27:33 AM (Jerusalem Standard Time, UTC+02:00)

I had a little challenge at work and I solved it (in a couple of hours). Here is the description of the problem and its solution:

Let's say that I have a Products dimension, a time dimension and a fact table that describes all the faults which occurred in these products. The new requirement is this: given a product tree that describes all the parts of every product, I want to know how many faults occurred in every part. The problem is that the fact table points only to the products, and the customer wants to know how many faults happened to the parts.

The two solutions I thought about are:

1. The trivial solution: Build a view above the fact table that takes every row and adds a row for each of its product's parts. That way, the fact will contain every fault that happened to every part. The problem with this solution is that the view takes very long to compute.

2. The good solution: Build a parent-child dimension out of the parts table which will describe all the parts of every product. Notice that this dimension is not ragged, meaning that one member can have 2 children while another member in the same level can have 10 children. The next step is to add this MDX Script:

Calculate;

// Block 1: every part inherits the fault count of its product ancestor
Scope([Measures].[Faults],
         Descendants([Products].[Products].[All],
                           1,
                           After));

   This = Ancestor([Products].[Products].CurrentMember,
                         [Products].[Products].[Level 02]);

End Scope;

// Block 2: remove the parts' faults from the product's aggregated total
Scope([Measures].[Faults],
         [Products].[Products].[Level 02].Members);

   This = ([Products].[Products].CurrentMember)
             -
             Sum([Products].[Products].CurrentMember.Children);

End Scope;

Explanation: The first level of the dimension is the [All] member. The second level is the products and the other levels contain the parts. The first block takes all the parts and gives each of them the fault count of its product ancestor. The second block solves the aggregation problem. Let's say that the bike product has 3 children and that the bike had 4 faults in our slice of time, so according to the first block every child has 4 faults. Now the cube aggregates, and the bike has 16 faults: 4 of its own and 4 more for each of its parts. The second block subtracts the sum of the product's children from the product, so the products end up with their original number of faults.
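
To check the arithmetic, here is a tiny Python sketch of the bike example. It only mimics the numbers, not the MDX engine, and the part names are made up.

```python
def faults_after_script(product_faults, parts):
    """Mimic the two Scope blocks for one product and its direct parts."""
    # Block 1: every part inherits the fault count of its product ancestor
    part_faults = {part: product_faults for part in parts}
    # Parent-child aggregation: the product's own faults plus its children's
    aggregated = product_faults + sum(part_faults.values())
    # Block 2: subtract the children's sum to get back the original count
    corrected = aggregated - sum(part_faults.values())
    return part_faults, aggregated, corrected


if __name__ == "__main__":
    parts, aggregated, corrected = faults_after_script(4, ["frame", "wheel", "chain"])
    print(parts)       # every part shows 4 faults
    print(aggregated)  # 16 before the second block
    print(corrected)   # 4 after the second block
```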

Thursday, November 22, 2007 5:12:21 AM (Jerusalem Standard Time, UTC+02:00)
 Tuesday, November 06, 2007
This very useful option in Panorama can prevent much pain. Often, you don't want your CEO(s) to play with the views you created in Panorama. The beautiful dashboard page you created is what you want them to see and that's it.
In the dashboards site, select the component holding the desired view and in its Toolbar options, check the "Disable Analysis" checkbox. This will prevent the user from slicing and dicing your view.

Wednesday, November 07, 2007 5:10:19 AM (Jerusalem Standard Time, UTC+02:00)
After our first MS-OLAP installation, I started asking myself some questions. The way we moved our cubes to the production server was through Visual Studio: we just deployed the cubes to the production server. The problem was that we forgot about some changes we had made in the XMLA code that lies behind one of our cubes. That caused some trouble in the installation, and the result was that we made some changes to the XMLA script on the production server after the installation...
So, how should we do the installation? Should we export the whole database and import it on the production server, or should we generate XMLA scripts and run them on the production server? Is there really a difference between these choices?
If someone has an answer I would be very happy to read it. Thanks...

Wednesday, November 07, 2007 4:17:55 AM (Jerusalem Standard Time, UTC+02:00)
Yesterday, my team installed our first end-to-end BI project which includes many familiar technologies (for us, of course) such as Informatica, Oracle and Control-M along with new technologies (again - for us) such as Analysis Services 2005 and Panorama. This is the first project we have with SSAS 2005. Cheers for us.

We had some failures along the way, so we sat down today with our DBA team in order to investigate the good and bad things we had in the installation. The most important conclusion we reached is that the installation documentation just wasn't good enough. We wrote it quick and quite dirty because we thought that its only purpose was to pass the QA checks. We forgot some important issues and wrote down the other things in a shoddy way. My conclusion is this: First, write an installation document and stick to it when you're doing the installation, even if you're sure that you know the drill 'cause you did it a hundred times. Second, write a good installation document for you, your mates, and the future co-workers to come. Third, write down everything you did in the installation if it wasn't in the document.

Don't ignore the installation document. It can make the difference between success and failure in the installation.

Wednesday, November 07, 2007 4:09:55 AM (Jerusalem Standard Time, UTC+02:00)
It took me a while (more than 15 minutes!) to find something so elementary: how to call a sleep function in T-SQL in SS2005. The statement is: Waitfor Delay <DelayLength>, where <DelayLength> can be in the format '00:00:10' for ten seconds. <DelayLength> can also be a variable or parameter of type char(8).
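
If the delay is computed at run time, it has to be formatted as that 'hh:mm:ss' string first. Here is a small Python sketch of the formatting. The helper name is hypothetical, and I assume whole seconds and delays shorter than 24 hours.

```python
def waitfor_delay(seconds):
    """Return a T-SQL WAITFOR DELAY statement for a delay given in whole seconds."""
    if not 0 <= seconds < 24 * 60 * 60:
        raise ValueError("delay must be between 0 seconds and just under 24 hours")
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"WAITFOR DELAY '{hours:02d}:{minutes:02d}:{secs:02d}'"


if __name__ == "__main__":
    print(waitfor_delay(10))  # WAITFOR DELAY '00:00:10'
```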

Wednesday, November 07, 2007 3:40:13 AM (Jerusalem Standard Time, UTC+02:00)
 Monday, October 29, 2007

As you can see from the title, we use VSS with AS 2005. The reason for this is that we want to have backup and source control in our AS projects. There are some interesting points when working that way which I would like to share:

  • When you process a cube in a VSS-controlled AS project, the compiler writes the exact date and time of the last process in the .database file in the solution. This file is a small and simple XML file, and this piece of data is written in the <LastProcessed> element. That is the reason why VSS will ask to check out this file every time you process a cube. You can ignore it and cancel it twice and the cube will be processed anyway. Of course, the last processing time will not be correct.
  • The .dwproj file is the binder of the solution (as in a regular code project). That means that Visual Studio will update this file every time you add, delete or rename an object in the solution. When two (or more) developers do these kinds of actions at the same time, it causes a disaster: only the objects of the developer who checked in last will be saved in the project. What you need to do is manually edit this file and add the object's elements yourself. That's not very hard, but you'll need to concentrate. As always, I recommend using Notepad++.
  • When editing the .cube file yourself (when merging manually after two developers worked on the same cube), make sure that the dimension IDs in the .cube file match the IDs in the .dim files.
  • AS projects can be easily destroyed because of an error in a VSS-related decision. Think twice when clicking in its annoying dialog boxes. Make a label every day after building, deploying and processing the project.
Tuesday, October 30, 2007 4:03:45 AM (Jerusalem Standard Time, UTC+02:00)

We use VSS with Analysis Services 2005 in order to have source control and backup (I think this deserves a post of its own). We were shocked to find out that our VSS didn't save our labels and our versions. The symptom was that after clicking on the history button and then on the OK button in the "choose date range" dialog box, nothing happened. We were sure that no history was saved and that this was the reason nothing happened. After a long search on the web, I found this thread on MSDN. The problem is a bug in VSS that has something to do with date representation in Microsoft's code. The workaround is easy, but it seems to work only in Windows 2000: Open the VSS admin program, then Options -> Tools -> TimeZone, and set the time zone to none. That's it.

By the way, notice that my second post in my blog was about date formats and how tricky and dangerous they can be.

Tuesday, October 30, 2007 3:43:05 AM (Jerusalem Standard Time, UTC+02:00)

This is the second bug I found when working with AS and Oracle as my database (you can read about the first one here). Some background about our data warehouse architecture before I begin to complain about Microsoft:

We cannot afford the risk that our users will experience faults or crashes while we refresh our DWH, process our cubes, etc. What we do to solve this problem is called a switch: we have two schemas in our Oracle DB. While our users watch the first schema, we update the other one. Only after we finish the whole load process do we switch our users' tools to see the data from the second schema. In order to implement this architecture we use Oracle's synonyms. Let's say that the users watch the fact table "sales". We have a synonym called sales_syn. While it's pointing to the first schema (schema_a.sales_fact), we're loading into the second schema (schema_b.sales_fact). After that, we switch the synonym so it points to the second schema (schema_b). The users always look at views that rely on synonyms. The views never change, only the synonyms do.
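
The switch itself boils down to one statement per synonym. Here is a rough Python sketch that generates it, assuming Oracle's CREATE OR REPLACE SYNONYM statement and the names from the example above. The function names are mine and this is not our actual switch script.

```python
def other_schema(active):
    """Return the schema that is not currently visible to the users."""
    schemas = {"schema_a": "schema_b", "schema_b": "schema_a"}
    return schemas[active]


def switch_statements(active, synonyms):
    """Build the statements that repoint every synonym to the freshly loaded schema.

    synonyms maps a synonym name to the table it should point to.
    """
    target = other_schema(active)
    return [
        f"CREATE OR REPLACE SYNONYM {synonym} FOR {target}.{table}"
        for synonym, table in sorted(synonyms.items())
    ]


if __name__ == "__main__":
    # Users currently see schema_a, the load has just finished in schema_b
    for statement in switch_statements("schema_a", {"sales_syn": "sales_fact"}):
        print(statement)
```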

The problem starts when we make Named Queries in our Data Source Views in AS. Apparently, AS looks inside the view that we entered into the DSV, finds the target of the synonym and saves it. Even after we make the switch (the synonym points to the other schema), the named query keeps pointing to the first one. All our efforts to edit the named query have failed. When we open it again, we always find that the first schema is "burned" in there and cannot be changed.

The solution we chose is simply to avoid using named queries. If we need a simple calculation we can add a Named Calculation in the Data Source View, and if we need a complex view over our fact table, we write it ourselves in the DB. This solution somewhat defeats the main point of the data source view (a designated place for all the logic of the DWH), but it is the best solution we could think of right now.

Dear Microsoft developers - it seems that you tried to be smart and look into our Oracle objects in order to enhance the multidimensional database's performance. Next time, please think twice before you do.

Tuesday, October 30, 2007 3:24:36 AM (Jerusalem Standard Time, UTC+02:00)

It seems that four days before I wrote my post about Panorama Hidden Settings, Panorama entered all the registry keys into their knowledge base. You can find them here. Strange: I looked for them for a long time and now I see that they were there all along, right under my nose.

Tuesday, October 30, 2007 2:58:13 AM (Jerusalem Standard Time, UTC+02:00)
 Monday, October 22, 2007
If you read my blog from my home page and not via RSS or RSS-based sites, you may see it in the right column of the web page. You can click there and I will get a call from you (if you have Skype installed). If you need any help or explanation about BI stuff, just click and ask. Oh, by the way, you need to have some cash in your Skype account in order to pay.
I'm not trying to be greedy. I'm just trying to earn a little bit of money from my knowledge. If you need a little assistance regarding BI, SQL Server or Panorama - ask me. If the answer is quick I will not charge you at all.

So, pick up the phone... ;-)

Monday, October 22, 2007 7:51:52 AM (Jerusalem Standard Time, UTC+02:00)
I believe that every BI developer has seen this in many data warehouses: Boolean Dimensions. As you may guess, a boolean dimension is a dimension with only two members and, of course, no hierarchy. For example: cash/credit card in a sales cube, exists/not exists in an inventory cube, etc. If you haven't seen this phrase before - relax - I just invented it. :-)
Now, the question is what to do about these dimensions:
a. Include them in the ETL process or just leave them as is?
b. If you put it in the ETL - how would you implement it?

Here's what I did in my project. You may disagree with me and I would like to see other approaches too.
a. Yes, I included them, for several reasons. As every Pragmatic Programmer knows, everything can change, so do not assume anything is globally static. This rule applies here: boolean dimensions may grow and have more members. For example, in the sales cube I mentioned above, maybe there will be another way to pay, such as the shop's own exclusive card (there is a chain here in Israel that has one). Even a male/female boolean dimension may have an Unknown member. So never exclude these dimensions from your ETL process. Wait, one more thing. You may think: why burden my ETL process with these silly dimensions? If they grow, I'll add them to the process. As an answer, think about the timings: you can never know how much time the dimension's ETL will take (although it will be very short), so in order to avoid surprises, include it in your ETL process. Just in case.
b. I implemented it as two hard-coded expressions and sent them to a union. The result of this union goes directly into the target table. In Informatica, the mapplet can't start without a source table, so just put in a dummy table with only one row and connect it to the expression items. Why only one row? If the table contains more than two rows, the Informatica server will consider the process failed.

As I said, I'll be happy to read approaches other than mine.

Monday, October 22, 2007 7:44:45 AM (Jerusalem Standard Time, UTC+02:00)
 Monday, October 15, 2007
I really think that the time dimension is the most complex dimension in 90% of the DWHs. The complexity is in two places: In the DWH design and also in Analysis Services (or any other BI tool).
First of all, why didn't we take the ready-made Server Time Dimension which exists in SSAS 2005? For two reasons. The first is that the Project Real guys do not recommend using it (you can find their SSAS article here). The second is that we wanted some features that are not available in the server time dimension, such as the Hebrew date. As a matter of fact, even if we didn't need such a feature we would still build the time dimension ourselves, because it gives you much more control over the dimension. For example, you can always add new attributes which the Microsoft developers didn't think about.
I started building the time dimension myself in Excel. I found out that this mission is a little more complex than I thought it would be. Most of the functions I wrote were simple, but there were some complicated ones. So here are some tips for you if you want to build your time dimension using Excel:
  • If you want the week number for every date, do not write the function yourself... Excel has a function called weeknum. If you don't have it, just add the function toolbox which has it (I can't recall its name right now; check the Excel help).
  • If you want to have records for every level in your hierarchy (not only for days), put every level in its own Excel file (not Excel tab). It will help you later when you transfer it to your DB.
  • Check yourself. Pick some dates at random and check that each record has correct data.
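
The same day-level rows can also be generated in code. Below is a minimal Python sketch, which is not what I actually did (I used Excel). The column names are my own, and the week number here is the ISO week, which does not always agree with Excel's weeknum, so check which one you need.

```python
from datetime import date, timedelta


def day_rows(start, end):
    """Yield one day-level record for every date from start to end, inclusive."""
    current = start
    while current <= end:
        iso_year, iso_week, _ = current.isocalendar()
        yield {
            "date_key": current.strftime("%Y%m%d"),
            "year": current.year,
            "month": current.month,
            "day": current.day,
            "iso_year": iso_year,
            "iso_week": iso_week,
        }
        current += timedelta(days=1)


if __name__ == "__main__":
    for row in day_rows(date(2007, 10, 14), date(2007, 10, 16)):
        print(row)
```

Spot-checking a few random dates against a calendar, as suggested above, applies to generated rows just as much as to the Excel formulas.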
After building the Excel files I needed to transfer them to my Oracle server. I used SSIS because I didn't want to wait for my DBA to copy these files to the Informatica server (it can't use my local files; they have to be on its server. SSIS can use local files). This was also a little tricky. First of all, close Excel when running the SSIS packages, otherwise they will fail. Second, when moving the non-leaf levels, go into the columns section in the destination box and erase the irrelevant columns. It will reduce the chance of errors. Finally, click on the source box and click on "Show advanced editor". Go to the source's output columns options and define the columns' data types properly. This will also reduce the chance of errors.

I had a bit of an argument with my DBA about how the time dimension should be handled. I think that the time dimension does not have to be processed at all. My time dimension runs from 1960 until 2020, so no daily ETL is required. She says that all the logic has to be in Informatica, so I need to develop a mapping for this dimension. I think that we are both right, because in an ideal world she is right: in every development team, all the BL has to be in one place. But we don't have much time (the deadline is very close), so I won't spend time building another mapping in Informatica when I already have the time dimension made in Excel.

Maybe someday I will have the time to do this. Maybe not.

Monday, October 15, 2007 8:18:38 AM (Jerusalem Standard Time, UTC+02:00)
 Sunday, October 14, 2007
I guess that this will not be my last post on this subject, but I want to start sharing some thoughts and tips from my experience when designing and building DWH. In this post I will focus on the fact & dimensions tables relationship in terms of data completeness (if you wonder what it is, read on).

Before you start to design the DWH, sit and talk with the people who built the systems you take your data from, including the DBA. For every table, ask them what the primary key is (it's NOT always defined properly in the DB), then ask them again, and then ask them if they are sure. It has happened to me that the systems guys were wrong about their own DB's primary keys.
The same goes for foreign keys, and here you should be even more careful. Even if they claim so, check for yourself that every foreign key in the fact table exists in the dimension table, especially when the fact table has records going far back in history. Sometimes system developers or, even worse, system DBAs delete records that are no longer relevant from the dimension tables. These keys will still be in the fact's history records but will not be found in the dimension table, causing an incomplete relationship between the fact and the dimension table.

So much for the part where you talk to and "investigate" the system developers (the DWH design). What do you do when you are actually developing the DWH? First, create your tables. Do not forget to add the primary keys in the dimension tables and the primary and foreign keys in the fact table. Then develop the ETL processes, and go for the dimensions first. If you know that a dimension has completeness problems with the fact table that you will develop later (you talked with the system developers, remember?), add an UNDEFINED (UD key) record to the dimension table. Later, when developing the fact table's ETL process, join with the dimension table and check that each record's foreign key exists there. If not, change the key to UD. In SSIS and Informatica (and I guess also in other products I don't know, such as DataStage) you can use a Lookup instead of a Joiner if the dimension table has fewer than 1G records. That will optimize the ETL process. After you have developed all your ETLs, run the dimension processes. After they finish (assuming everything went OK), run the fact table's ETL process. If it succeeds, you can go and have a drink. If not, check what went wrong. If you want to know which keys didn't show up in the dimension table and cause the incompleteness problem, you can disable (not delete) the foreign key on the fact table and run the process again. Then, with a simple SQL query, check which foreign keys don't exist in the dimension table. Go back to your ETL design and check what you did wrong. As I pointed out before, at this step you might be very angry at the system developers...
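
The lookup step can be sketched in a few lines of Python. It is only an illustration of the idea, not Informatica or SSIS code; the key value -1 for the UD record and the column names are assumptions of mine.

```python
UD_KEY = -1  # assumed key of the UNDEFINED record in the dimension table


def resolve_keys(fact_rows, dimension_keys, key_column):
    """Replace foreign keys that are missing from the dimension with the UD key.

    Returns the cleaned rows and the sorted list of keys that were not found,
    which is the list to take back to the source system's developers.
    """
    known = set(dimension_keys)
    cleaned, missing = [], set()
    for row in fact_rows:
        key = row[key_column]
        if key not in known:
            missing.add(key)
            # Copy the row so the original input is left untouched
            row = {**row, key_column: UD_KEY}
        cleaned.append(row)
    return cleaned, sorted(missing)


if __name__ == "__main__":
    facts = [{"product_id": 1, "amount": 10}, {"product_id": 7, "amount": 5}]
    rows, missing = resolve_keys(facts, [1, 2, 3], "product_id")
    print(rows)     # the second row now points to the UD record
    print(missing)  # [7]
```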

That is all for now. As I said, I assume that more ideas will come on in the future.

Sunday, October 14, 2007 7:11:49 AM (Jerusalem Standard Time, UTC+02:00)
 Sunday, September 30, 2007
This post is about Panorama because it is the UI tool I'm working with, but this can be made with every BI UI tool.

My customer wanted to get the effect shown by Analysis Services 2005 when browsing a dimension (see the picture below). He wanted to see some properties of the members shown in the rows, along with the usual measures. Unfortunately, Panorama (and I'm sure other tools as well) does not have this option in the GUI. The solution is this code:

Create Member CurrentCube.[Measures].[MyProperty] as
  iif(IsLeaf([MyDimension].[MyHierarchy].CurrentMember),
     [MyDimension].[MyHierarchy].CurrentMember.Properties("MyProperty"),
     Null)

Note that declaring only the third line (without the IsLeaf check) would make every non-leaf member return an error, which is something we don't want the viewer to see. If the dimension has properties for members in other levels too, you can adjust this declaration. This member can be declared either in the database's script (after the CALCULATE statement) or inside the session/query (not recommended in Panorama). Now all you have to do is show the dimension's members in the rows and this new measure in the columns (after or before the regular measures), and you'll get what you want.

Monday, October 01, 2007 4:47:12 AM (Jerusalem Daylight Time, UTC+03:00)
 Monday, September 24, 2007
My friend, Ilya, had a problem in SSIS. He had a .csv file with too many commas. That is, strings that started and ended with inverted commas (") and had commas inside them were split by SSIS into separate columns. For example, the row:
"My name, is Miky", 200, 10 was recognised by SSIS as four columns instead of three. Ilya wrote some code for SSIS (in VB) that runs before the package begins its work. Here it is; I hope it will help whoever sees this.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic.FileIO
Public Class ScriptMain
  Public Sub Main()
    Dim csvFileFullPath As String
    Dim tabFileFullPath As String
    ' Both connection names are placeholders; use the names defined in your package.
    csvFileFullPath = Dts.Connections("Your CSV Connection").ConnectionString
    tabFileFullPath = Dts.Connections("Your Table Connection").ConnectionString
    ' Code page 1255 is Windows Hebrew; change it to match your file.
    Using tabStreamWriter As New StreamWriter(tabFileFullPath, False, System.Text.Encoding.GetEncoding(1255))
      Using csvFileReader As New StreamReader(csvFileFullPath, System.Text.Encoding.GetEncoding(1255), True)
        Dim currentRow As String

        ' Copy the header row as is.
        currentRow = csvFileReader.ReadLine()
        tabStreamWriter.WriteLine(currentRow)
        While Not csvFileReader.EndOfStream
          Dim tmp, tmp1 As String
          Dim offset As Int32 = 1
          Dim beginS, endS As Int32

          currentRow = csvFileReader.ReadLine()
          ' Find the opening inverted commas of the first quoted string.
          beginS = InStr(offset, currentRow, """")
          While beginS <> 0
            endS = InStr(beginS + 1, currentRow, """")
            ' Unmatched inverted commas: leave the rest of the row untouched.
            If endS = 0 Then Exit While
            ' Replace the commas inside the quoted string with spaces.
            tmp = Mid(currentRow, beginS, endS - beginS)
            tmp1 = Replace(tmp, ",", " ")
            currentRow = Replace(currentRow, tmp, tmp1)
            ' The row length did not change, so endS still points at the closing inverted commas.
            offset = endS + 1
            ' InStr returns 0 when offset is past the end of the row, which ends the loop.
            beginS = InStr(offset, currentRow, """")
          End While
          tabStreamWriter.WriteLine(currentRow)
        End While
      End Using
    End Using
    Dts.TaskResult = Dts.Results.Success
  End Sub
End Class

The solution here is to search for any comma (,) that is between two inverted commas (") and replace it with a space.
Although it is a good solution, I would take another one: replace any comma inside the inverted commas with a special string, such as &Miky&, convert the csv file into a table, and after that go over the affected column(s) and replace any &Miky& with a comma.
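The placeholder idea can be sketched in a few lines of Python. This is only an illustration of the technique, not Ilya's code: the function names are made up, and it assumes that inverted commas are never escaped inside a quoted string. The last line shows that a CSV-aware parser (here Python's own csv module) gets the same three columns without any preprocessing.

```python
import csv

PLACEHOLDER = "&Miky&"  # the special string suggested in the text


def protect_quoted_commas(line, placeholder=PLACEHOLDER):
    """Replace commas that sit between inverted commas with a placeholder."""
    out = []
    in_quotes = False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "," and in_quotes:
            out.append(placeholder)
        else:
            out.append(ch)
    return "".join(out)


def restore_commas(value, placeholder=PLACEHOLDER):
    """Put the original commas back after the row was split into columns."""
    return value.replace(placeholder, ",")


line = '"My name, is Miky", 200, 10'
protected = protect_quoted_commas(line)
# A naive split on commas is now safe, because only the real separators are left.
columns = [restore_commas(c.strip().strip('"')) for c in protected.split(",")]

# For comparison: a parser that understands quoted fields needs no placeholder.
parsed = next(csv.reader([line], skipinitialspace=True))

print(columns)
print(parsed)
```

Unlike the replace-with-space approach, the placeholder keeps the original comma in the data, at the price of a second pass over the loaded column.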

Monday, September 24, 2007 7:31:56 PM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, September 09, 2007
I was asked how to get to the SSIS log to see how long a package took to run.
Well, that depends.
On Development:
When developing a new package, after running the process (click on the green arrow or press F5) a new tab called Progress appears. Clicking it will show you everything about the package's execution, including the time it started and the time it finished.

On Production:
When developing the package, open the SSIS menu (yes, there is a menu named after the product. Microsoft...) and click on Logging... There, you can define logs for your package. You can log in many ways: writing to SQL Server, a text file, an XML file and more. I recommend logging to SQL Server and logging only the big and "hard" parts of your data flow. In the Details tab, pick only the exceptional events, such as OnError, OnTaskFailed and OnWarning. If you wish to know how long your package took to run, also pick OnProgress.

Follow this link to read about every event in SSIS.
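As a rough sketch of how the run time could be read back from such a log: assume the log rows have already been fetched as (execution id, event name, timestamp) tuples, and assume the log contains a PackageStart and a PackageEnd entry for each run (check this against your own log; the function name and the sample rows below are made up). The duration is then just the difference between the two timestamps.

```python
from datetime import datetime


def package_durations(log_rows):
    """Return {execution_id: seconds} for every run that has both a start and an end."""
    starts, ends = {}, {}
    for execution_id, event, timestamp in log_rows:
        if event == "PackageStart":
            starts[execution_id] = timestamp
        elif event == "PackageEnd":
            ends[execution_id] = timestamp
    return {
        execution_id: (ends[execution_id] - starts[execution_id]).total_seconds()
        for execution_id in starts
        if execution_id in ends
    }


# Made-up sample rows; run-2 has not finished yet, so it is skipped.
rows = [
    ("run-1", "PackageStart", datetime(2007, 9, 9, 6, 0, 0)),
    ("run-1", "OnWarning", datetime(2007, 9, 9, 6, 1, 30)),
    ("run-1", "PackageEnd", datetime(2007, 9, 9, 6, 4, 15)),
    ("run-2", "PackageStart", datetime(2007, 9, 9, 7, 0, 0)),
]
durations = package_durations(rows)
print(durations)
```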

Monday, September 10, 2007 6:09:23 AM (Jerusalem Daylight Time, UTC+03:00)
I won't cover the topic of exception handling in MDX here, but I will show you a funny thing that I have never seen in any other computer language. Consider this MDX code:

iif (1.0e+40 * 1.0e+40 = (1/0), "Overflowed", "Didn't Overflow")*

On some processors, this code will output "Overflowed". That's because this multiplication will overflow and (1/0) also overflows, so what we have here is two "overflow values" that are equal.

Where on earth have you seen something like this???


* Taken from the book "MDX Solutions" second edition, p. 136
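For what it's worth, the same effect can be reproduced outside MDX wherever IEEE 754 floating point is used: an overflowing multiplication yields infinity, and two infinities compare as equal. A small Python sketch, with two deliberate differences from the MDX line: Python raises an error on 1/0 instead of returning infinity, so float("inf") stands in for it, and 1.0e200 is used because 1.0e40 * 1.0e40 still fits in a Python float (a double).

```python
import math

overflowed = 1.0e200 * 1.0e200  # too large for a double, so the result is inf
infinity = float("inf")         # stands in for (1/0), which raises in Python

result = "Overflowed" if overflowed == infinity else "Didn't Overflow"
print(result)

# In double precision the numbers from the MDX example do not overflow.
fits_in_double = not math.isinf(1.0e40 * 1.0e40)
print(fits_in_double)
```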


Monday, September 10, 2007 5:54:32 AM (Jerusalem Daylight Time, UTC+03:00)
I'm almost done with my exams, so my writing can continue.

This post is not about how to customize your Dashboard (well, not only about it). Its purpose is to say it out loud: Customize your Dashboard!
When the executives of your company (yeah, I guess you work in a company. Does anyone build a BI portal for himself?) see the customized gauges with their company logo on them, they'll love it. No matter what these gauges show them, you got their attention and their sympathy for the Dashboards site you made. Now, everything is easier. The bosses are in your hands.

For the Panorama NovaView users:
  1. Follow this link to learn how to do this.
  2. Do NOT start working before you back up your E-BI/KPI folder!
  3. I recommend using Notepad++ or another good XML editor when editing the XML files. Otherwise, you can mess up the whole file and you'll have to start all over again.

Monday, September 10, 2007 5:30:52 AM (Jerusalem Daylight Time, UTC+03:00)
 Saturday, August 11, 2007
We've been working for a while to enable SSO in our Panorama Dashboard site. As a matter of fact, the responsibility for this was in the skilled hands of our system team. After a short time they succeeded and SSO was established in our site. We saw it when we entered the site: instead of the login page, we went directly to the dashboard page.
After a few days, when I entered the settings section of the dashboard site, I saw this:



Yes, that's right. No security at all. This is why we went directly to the dashboard page instead of the login page...
The system team claims that they never said that the SSO succeeded, and we say they did. No one can prove he's right, so there's no one to blame. But blaming is not everything. The important thing here is to learn for the next time: when you think you got a feature, check it. Things are not always what they seem to be.
Sunday, August 12, 2007 5:52:53 AM (Jerusalem Daylight Time, UTC+03:00)
 Wednesday, August 01, 2007
While reading the first chapter of the book "MDX Solutions With MS SQL Server Analysis Services 2005 And Hyperion Essbase", I wrote down some important notes, especially for MDX beginners. Even if you're an experienced user, check this out. You may find something useful.

  • If you were a programmer in your past, you can relax: MDX doesn't care about capitalization.
  • Don't even try to skip an axis: it's impossible and it is meaningless. Use the predefined names for the axes, such as: columns, rows, pages, etc.
  • You're new to MDX and the whole OLAP thing gives you a headache? Try to imagine it as a hypercube. It can help you a lot.
  • When writing large queries, pay attention to the "readability" of your MDX. Use monospace fonts whenever possible.
  • Do NOT think of SQL when learning or working with MDX. Although the syntaxes may look alike, these languages are totally different when you get to know them.
  • .Members will give you all regular members. .AllMembers will also include calculated members.
  • An expression like [Time].Members won't work if the Time dimension has multiple hierarchies.
  • The asterisk (*) can replace the CrossJoin function. It may improve the readability of the code.
  • When using the Order() function, you can specify a sorting criterion which is not shown in the result grid.

Thursday, August 02, 2007 3:04:47 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, July 19, 2007
For some reason (and don't ask me why), the Panorama NovaView documentation doesn't contain any information about some of the most important settings. Here are some of them, hoping that this will help many users:

  • In the Panorama Web Access web site, by default a user can save his views (after he made his modifications) only in his private book. To enable him to save his views in the Briefing Book (which means the public book), do the following: on the Panorama server, open the Registry Editor (Start -> Run -> regedit) and go to the path HKEY_LOCAL_MACHINE\Software\Panorama\Nova View 5\Admin. Add a new string value named "PublicBookAdmin". As its value, enter all the users you want to allow to save their views in the Briefing Book, using this template: <User1Domain>\<User1Name>,<User2Domain>\<User2Name>, etc. For example: panoramaDevServer\PowerUser, MSHOME\Miky.
  • The subscriptions web part will show you only the views that you registered to, but by default no one can register himself to the views. What you need to do is open the Registry Editor on the Panorama server, go to the same path as mentioned above and add a string value named "ShowSubsAndAlerts" with the value 1. After that, every user will be able to right click on any view in the Panorama Web Access web site and click on Register, and the view will be added to the subscriptions web part for him.
  • For some users, the loading animation which is shown before every applet appears in the Dashboard website will stay forever, meaning that the user will never see the dashboard itself. I think this has something to do with the Java or Microsoft VM of the user. Anyway, a nice workaround is to cancel this animation. To do this, go to the path C:\Program Files\Panorama\E-BI\Dashboard\include (replace the beginning if you installed the Panorama software in a different location) and inside the Config.asp file, change the value of the constant "ShowAnimationWhileLoadingApplets" to false. This is a good workaround because the applets should appear in a second or two anyway. Otherwise, buy a faster server.
As I go on working with Panorama I'll write some more tips & tricks. Stay tuned.

Friday, July 20, 2007 2:20:52 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, July 15, 2007
When designing a dimension in Analysis Services, there's a funny button called Add Business Intelligence. Clicking it opens a beautiful wizard which lets you define some basic properties of the dimension, such as ordering and enabling writeback. I'll take the writeback feature as an example: when doing this using the wizard, it takes you through many screens where all you have to do, except for clicking Next, Next, is to check a checkbox in one of the screens. That's it. After that, all this long wizard does is set a property called WriteEnabled to True. I think that it's a strange software design by Microsoft. Maybe it's for making the product seem more professional. You know, Add Business Intelligence sounds like a heavy operation. Anyway, I don't know what they had in mind.

     
Sunday, July 15, 2007 7:48:56 AM (Jerusalem Daylight Time, UTC+03:00)
I never thought that I'd do a commercial for Microsoft, but Project Real is a great thing that they did and they should get the credit for it. This project is a full end-to-end BI solution, including ETLs (using SSIS), Analysis Services cubes and mining models, Reporting Services reports, end-user Panorama views and more.

We work with Panorama as our main GUI tool to show our users the cube's data as tables, charts, dashboards, etc., so this project is really helping us to learn how to implement our project from the first ETL step all the way to the last Panorama step.

Recommended.

Sunday, July 15, 2007 7:05:54 AM (Jerusalem Daylight Time, UTC+03:00)
 Saturday, June 02, 2007
Knowing the rules doesn't mean you know how to play.
A great post in the Panorama blog with a riddle in MDX. No knowledge of MDX is required for this riddle, because they teach you what you need to know to solve it.
Have fun. Believe me, you will.

Update: Look at Mosha Pasumansky's blog for another take on this MDX riddle. He claims that the answer in Panorama's blog is not complete. Although I'm new to MDX, I understand that their answer is not 100% complete, but I think it's enough to make their point.

Sunday, June 03, 2007 5:43:35 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, May 31, 2007

In the data source view, when you edit the SQL yourself (right click on the table in the data source view, Replace table -> With new named query), be careful when using Oracle DB tables. If you write the Oracle table name in lower case, the SQL parser will add inverted commas (") around the table name, and the SQL will not work because Oracle does not recognize the table name with the inverted commas.

Solution: Enter the table name in upper case, which makes the parser leave the table name as is. Also, remember: when creating or editing a named query, always check the syntax, but also run the query and check that you get the desired result before hitting OK and saving the new named query.

By the way, I wonder: what would we do if Oracle were case sensitive?

Friday, June 01, 2007 6:36:37 AM (Jerusalem Daylight Time, UTC+03:00)

A few months ago we were given an assignment to copy/move all our DTSs that were running with SQL Server 2000 to the new SQL Server 2005 Integration Services (SSIS). My friend Michael did it and wrote down some important notes that he discovered when building ETLs with Integration Services. I decided to list them here because they are important and useful, especially for those who haven't had the time to develop with SSIS so far.

  • One of the greatest improvements in SSIS is that between the source and the destination of the ETL process you can do many things, such as making new fields, sorting, converting data types, union all between different sources, implementing your logic on a field, and much more. This is much easier than ever because all you need to do is add a block to the data flow task and define it for your purposes.
  • SSIS ships with a tool for migrating SQL Server 2000 DTSs. Do not use this tool. Sometimes the result of the conversion is not good enough, and in all cases you can't edit the new migrated data task.
  • When making a connection to a non-Microsoft DB, such as Oracle, use an OLE DB client instead of the out-of-fashion ODBC.
  • When the destination field (string type) is shorter than the source, add a data conversion block and cut the string. Otherwise, there will be an annoying warning even if the truncation is wanted.
  • Often (when working with non-Microsoft providers) the automatic recognition of the lengths & types of the source fields is not correct. Enter the source block and edit these properties yourself.
  • When moving a Unicode field (data type DT_WSTR) to a non-Unicode destination field (DT_STR), a data conversion block is required.
  • SQL Server 2000 stored procedures will work in SSIS, but the linked server definitions are problematic. Consider other options rather than using linked servers.
  • When the source/destination is a CSV file, use a Flat File Connection. But if it is an Excel file (.xls), use a Microsoft Jet OLE DB connection and define the source as an OLE DB Source (yes, it will work with Excel files).

Again, thanks to Michael for making and sharing these notes.

Friday, June 01, 2007 6:17:10 AM (Jerusalem Daylight Time, UTC+03:00)

For some weeks we were fighting with (or against?) the Panorama software in order to make it work right and show a nice pilot of BI dashboard screen to our managers. After three weeks we managed to show a good opening position by building a nice dashboard screen including a map, graphs, gauges and crosstabs.

I found that the installation of the SQL Server and the Panorama server was not good enough in my company, so I decided to try it for myself. The installation of the SQL Server is quite easy (Next, Next, Next ...), but installing the Panorama server is a complicated process. Paying attention to so many small details, knowing and remembering what to do inside the Windows server, IIS, Windows services, Windows registry and more is not so easy. Finally, after two days I managed to do this. Now I know that some things in the installation in my work place were not so good and I can point them out.

You can see the exciting (for me, at least) result in this picture, as it is a nice dashboard taken from my screen. Many posts about Panorama and SQL Server 2005 will come ahead. Now I can relax - Panorama is on the way...

Friday, June 01, 2007 5:09:10 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, May 10, 2007

I decided that I should blog about the world of BI.

My company has just bought Panorama's NovaView, which is a BI tool that focuses on the UI level. This tool shows beautiful dashboards, including metrics, charts, cubes and more. I think that it is worth a look: it's just beautiful.

So, as a beginning: What is BI? The Google definition ("define:BI") says: "Technologies that help companies make better business decisions". I think that sums it up quite well, but I can add one more thing: technologies that help managers and decision-makers see, or understand much better, what they have in their hands, what they have done so far and what they can do with their resources in the future.

What I'm learning right now is the Panorama tools, advanced MDX and some other stuff.

Wish me good luck.

Thursday, May 10, 2007 7:37:16 AM (Jerusalem Daylight Time, UTC+03:00)