Tuesday, November 18, 2008

This is a little bit tricky. Unlike the AdomdClient assembly, the AdomdServer assembly  doesn't have a descriptive name. It's called msmgdsrv.dll and it is located in Program Files\Microsoft SQL Server\MSSQL.2\OLAP\bin. Why it's not documented anywhere?

Tuesday, November 18, 2008 11:17:16 PM (Jerusalem Standard Time, UTC+02:00)
 Monday, November 17, 2008

After announcing the MdxInjection program I got several requests for additional details and for the ability to run it without using Visual Studio. So, here are some important points:

  • When I published it I had developers in mind because I'm sure than anyone will want to do his little modification before using it for his own needs. That's why I published it as a solution and not as executable.
  • I written it down using VS2008 but only with the .Net 2 framework. Those of you who uses VS2005 won't be able to open the solution.
  • The program has only one public method - InjectMdx, who takes two arguments: The location of the CommonMdx file and the location of the xml configuration file.
  • The CommonMdx.mdx file contains the common MDX script. The relevant part has to start with /* Common MDX */ and then the common mdx script. Anything written before it won't be treated. That gives you the ability to save some data or comments for yourself in this file.
  • Example of the configuration xml file can be found in the Test libary inside the solution. Basically, it enable you to define in which servers, databases and cubes you want to inject the common script. Pay attention that you have to write the connection strings in this file.
  • Note that the program will detect cube dimensions with their name changed and will know how to replace them. That means that if you mention the Time dimension in the common script and inject it to AdventureWorks cube, the script will replace the string "Time" with the "ShipmentDate" string, for example.

For those of you who want simple execution file, I added a windows console project in the solution.

Link to only executable program
Link to the solution with the added windows application project
Link to the solution without the windows application project
Tuesday, November 18, 2008 6:54:17 AM (Jerusalem Standard Time, UTC+02:00)

In the previous post I talked about the DRY principle in the BI Development. I mentioned that one of the major problems in the principle's implementation is in the common MDX code. Chris Commented:

"I'd like to be able to have a global MDX Script and be able to do something like a #include to bring calculations into specific cubes. One to add to my wishlist for the next version..."

And as I said there that I have a temporary good solution until we'll have it in the next SQL Server release (if someone from Microsoft is reading...).

The MdxInjection program takes your common MDX Script and a very simple xml file that defines where to inject this script. It injects the script into your desired cubes and even replaces the dimensions' names where necessary (it is relevant where you put dimension in a cube with a different name to thr dimension or when you use Role Playing Dimensions). I couldn't hold myself from writing some test code so it's also included in the project. The project is written in C# 2 using much AMO code. All the technical little details are inside.

Enjoy.

Download Link

Monday, November 17, 2008 8:29:02 AM (Jerusalem Standard Time, UTC+02:00)
 Sunday, August 17, 2008

There is one tiny new feature in SSAS 2008 that you can easily miss. It called Empty Cube. When you create a new cube using the wizard, you can create an empty cube, meaning that it has no measures, dimension relationships, etc. The original use of it is for "users (who) want to create everything manually, or when all dimensions are linked dimensions" (taken from the description in the wizard)

In the past, I written about using SSAS with Visual SourceSafe in order to have source and version control for the SSAS project. I mentioned that it has many disadvantages but the big advantage (source & version control) is bigger so in the bottom line I recommend using it. One of the problems we experienced were that every time that someone creates a new object (cube, dimension, etc.) he has to check out the .dwproj file. The result is that sometimes we have a fight in the team for that file and we shouting: "who taken out the dwproj???" (yeah, I know that we can check who did it inside VSS but shouting is more fun).

The empty cube feature is a nice solution for this problem: When you create a new project you can create all the (empty) cubes and then the .dwproj file is free and no longer needed. I'm assuming that you know which cubes you'll have when starting a new project. The only thing remained is the same solution for dimensions. I'll recommend it in the Connect site (it's not working now for some reason).

Monday, August 18, 2008 3:50:41 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, August 03, 2008
update: Chris Webb and Mosha commented and made it clear that the reason for this error wrap is the NonEmptyCrossJoin function and nothing else. I also checked and I did not find any other function that wraps underlying errors.

This is something you need to be aware of when you're writing MDX. I don't know whether it's a bug or by-design. I'll be happy
to know (please comment if you know something that I don't).
Consider the following MDX:

SELECT
  NonEmptyCrossJoin
  (
    [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
   ,[Employee].[Employee Department].[Department].&[Sales]
  ) ON 0
FROM [Adventure Works];

The query will return with this error: The Set_Count argument of the NonEmptyCrossJoin function is either negative or larger than the number of sets provided. This is quite reasonable because I written the second argument as a member, where (NonEmpty)CrossJoin expects only sets. So, let's upgrade this member to a set:

SELECT
  NonEmptyCrossJoin
  (
    [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
   ,{[Employee].[Employee Department].[Department].&[Sales]}
  ) ON 0
FROM [Adventure Works];


All I did is wrapping the second argument with {} and we have a set. The query will return 19 columns.
Now, for the interesting part. Let's count the members of this CrossJoin before the we fix it:

WITH
  MEMBER [a] AS
    NonEmptyCrossJoin
    (
      [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
     ,[Employee].[Employee Department].[Department].&[Sales]
    ).Count
SELECT
  [a] ON 0
FROM [Adventure Works];


This return... 0.               
Where is the error?
My guess is that the Count function wraps the error. The NonEmptyCrossJoin returns null and the count of members in null is zero. The meaning of this is that if you'll ever forget to wrap the member with {} you'll always get zero and not an error. This can be very dangerous. Just for the check, running this query after the fix:

WITH
  MEMBER [a] AS
    NonEmptyCrossJoin
    (
      [Customer].[Customer Geography].[State-Province].&[NSW]&[AU].Children
     ,{[Employee].[Employee Department].[Department].&[Sales]}
    ).Count
SELECT
  [a] ON 0
FROM [Adventure Works];


will return 19. This been tested with both SSAS 2005 and 2008 (RC0). The examples here are from RC0.

Be careful with your MDX.

Monday, August 04, 2008 5:28:26 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, May 13, 2008
I started a long conversation about this subject in the MDSN SSAS forum. I think that it's a question and a principal that every advanced MDX programmer should be familiar with.

It all started with a customer that needed a standard deviation aggregation. I thought that it would be simple because there's a StdDev function in MDX, but it turned out that my customer had major plans for me: He wanted this aggregation to act for every dimension he puts on his axis. This means that the aggregation is not defined over a specific dimension (such as date), but the std-dev is defined over the current dimension in the axis.

The solution for this problem consists of a principle and an answer.

The Principle
Aggregation or a measure that is based on the current user's query is bad. This can and will result two users to see different results using the same measure. This will cause confusion and disinformation. The sacred principle of One Truth will be desecrated. Taken from the thread, in Chris Webb's words:

"I quite often see people wanting to write calculations that behave differently depending on the query that's being run, and I always tell them not to do it. You can hack something but it's almost impossible to get it work properly for every single possible query - MDX just doesn't work like that"

In the end I explained that to the user and he agreed. One more reason for his approval is that std-dev often doesn't really says something about the data. In other words, it isn't informative. "The standard deviation is 0.432. That means that... ???"


The Answer
If you (or the customer) still insists on that crazy measure, the following MDX will work.

With
Member [Measures].[RowSTDOrders] as
iif(Count(NonEmpty(StrToSet("Axis(1)").Item(0).Hierarchy.Children,
{[Measures].[Order Quantity]}) as ChildSet) < 2,
Null,
StDev(ChildSet, [Measures].[Order Quantity]))
 
select
[Date].[Calendar Year].[Calendar Year] on 0,
Non Empty [Product].[Product Categories].Members on 1
from [Adventure Works]
where [Measures].[RowSTDOrders]

Thanks for Deepak Puri for this code. Notice that the StrToSet function will cause performance degrade, but this is the only way that the code will also work in MDX script and not only in queries.

P.S
It doesn't matter if you write StDev or StdDev.
Wednesday, May 14, 2008 6:28:38 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, May 08, 2008
This tiny thing cost me a minute today, but it may take longer time to some of you, so I'm writing this.

As some of you know, in order to sort a dimension's attribute you need to change to OrderBy property of the attribute. You can make the attribute to be sorted according to other attribute (it's a very common thing in SSAS). In order to do so, you set the OrderBy property to AttributeKey and in the OrderByAttribute property you pick up the desired attribute (the one you want to define the order).

Note that if the first attribute (the one you want to sort) doesn't have attribute-relationship to the second attribute, you won't be able to pick up the second attribute in the OrderByAttribute property. These properties must have an attribute-relationship.
One more thing: You don't have to show the end-user the attribute which defines the order. If you want to hide it just set the property AttributeHierarchyVisible to false. It is a common pattern to make an attribute which will sort another attribute and hide it from the user.

Thursday, May 08, 2008 7:32:59 AM (Jerusalem Daylight Time, UTC+03:00)
 Monday, March 31, 2008
We got many client requests for the ability to show in their web sites the "last updated" date of the data.
It doesn't matter how you show the data of the SSAS - the customers will always want to know for which date the data is true.
My solution includes a ASP.NET 2.0 web site that uses the AMO class libary. It takes the date from the server and shows it to the user.

What you need to do is:
1. Open a new ASP.NET web site using Visual Studio 2005/8.
2. Add the AMO dll (Microsoft.Analysis Services). You'll find it in the SSAS server.
3. In the already-made default.aspx page, just add one Label.
4. Add a configuration file which will hold the name of the SSAS server. That way, when you install the site from the development environment to the production environment, you'll only have to change this file. Call this file config.xml and write in it the following:
<?xml version="1.0" encoding="utf-8" ?>
<ServerName>YourServerFullNameHere</ServerName>

5. In the code-behind file (default.aspx.cs) write the following code instead of what you already have there:

using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using AMO = Microsoft.AnalysisServices;
using System.Xml;

public partial class _Default : System.Web.UI.Page
{
  protected void Load_Page(Object sender, EventArgs e)
  {
    Label1.Text = GetCubeUpdateDate(Request.QueryString["DBName"],Request.QueryString["CubeId"]);
  }

  private string GetCubeUpdateDate (string dbName, string cubeId)
  {
    using (AMO.Server asServer = new AMO.Server())
    {
      asServer.Connect("Data Source=" + GetAnalysisServerName());
      AMO.Database db = asServer.DataBases.FindByName(dbName);
      if (db == null)
      {
        return "DB Name not found";
      }

      AMO.Cube cube = GetCubeById(cubeId, db);
      if (cube == null)
      {
        return "Cube Name not found";
      }

      DateTime lastProcessed = cube.LastProcessed;
      return lastProcessed.Day.ToString() + "/" + lastProcessed.Month.ToString() + "/" + lastProcessed.Year.ToString();
    }
  }

  private string GetAnalysisServerName ()
  {
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(Request.PhysicalApplicationPath + "config.xml");
    return xmlDoc.GetElementsByTagName("ServerName").Item(0).InnetText;
  }

  private AMO.Cube GetCubeById (string cubeId, AMO.Database db)
  {
    foreach (AMO.Cube cube in db.Cubes)
    {
      if (cube.ID.Equals(cubeId))
      {
        return cube;
      }
    }
    return null;
  }
}

Eventhough the code is self-explained, here are some points referring it:
  • I chose not to include the server name in the web.config file because I like to seperate application-related configuration and web configuration.
  • If you want you can get the cube name from the user (in the query string) and then the code is even shorter - just get the cube like I got the database.
  • I wanted to show the date in the format DD/MM/YYYY, so that's why I did the long return statement in the GetCubeUpdateDate method. If you want to return the date in the MM/DD/YYYY format you can use the lastProcessed.GetShortDateFormat() method.
  • Note that when you publish the web site you need to create a dedicated virtual folder in the IIS.
  • The user uses this site in the following way: All he need to do is to create a frame with this site's address as its source and add it the DBName & CubeId in the query string. In SharePoint it's even easier - the uses only need to create a page shower web part.
enjoy.

Monday, March 31, 2008 11:05:20 PM (Jerusalem Daylight Time, UTC+03:00)
 Monday, January 21, 2008
Today I was in the Microsoft's Data Mining Conference which took place in the Sheraton City Tower, Ramat-Gan (Israel, of course). First of all - the food was good. :-) Now, seriously: All the lectures were great, although they were performed by one man - Rafal Lukaweichi, which is a very talented speaker. I think that I haven't seen such enthusiasm for many years in a lecture in the IT world. Anyway, what have I learned today?
  • The Data Mining world is very interesting indeed. Microsoft has a lot to offer in DM and it is all ready-to-use in BIDS.
  • Microsoft's approach is DM to the masses, which I don't believe it myself. Even though the tools are very simple and even the code (DMX) is easy (in contradiction to MDX), I don't think that an inexperienced developer can bring good results. The SAS approach says that you need to have deep knowledge in statistics (which is bad), but I don't think that DM can be made by the masses.
  • There are many different DM algorithms which you need to be introduced with before you start mining. As I mentioned, they tell you that all you need to know is what each algorithm does in general, but in fact there are many parameters which you need to adjust and play with, so in fact you need a good knowledge of these algorithms.
  • Visualization is very important in DM. Even after you have good results in your hands, you need good UI tools to show you the results in an efficient way or else you'll be lost in a jungle of data.
  • If you already have a datawarehouse, you're half way from mining models. The preparation of the data is a huge amount of the job in DM.
  • After you have good results and even after you got good visualization of the results, you need an expert from the company you work for (or in) that will look at the results and tell you whether they bring new knowledge or they trivial.
I don't believe there's a change I'll be mining in the near future, but maybe I'll play with it a little in my free time (which of course I don't have). Taking the data of our datawarehouse and mining it can bring some interesting stuff. Who knows.

Tuesday, January 22, 2008 3:06:14 AM (Jerusalem Standard Time, UTC+02:00)
 Tuesday, January 15, 2008

Although SSAS willl let you use them, some other application such as Panorama won't function properly. I'll give one example: When you perform Drillthrough in Panorama, on the fly the engine gererates a web page that will take the user to the next view. The next view will be sliced as the current view, so this web page needs to pass the dimensions parameter (meaning - the current slices). That's why this web page contains this line:

AttachParameters("Slicers","%Slicers%")

A big problem will occur if one of the sliced dimensions will be sliced on a member with a name that contains inverted commas ("). The JavaScript will result an error because there are three inverted commas in the second parameter of the line. This is only one example of what can happen in a BI consumer program if you'll use special characters in member names. So - be careful not to pass these characters from the DW (build the ETL so it will drop these characters) or giving those names in SSAS, such as the All member name.

Tuesday, January 15, 2008 11:08:32 PM (Jerusalem Standard Time, UTC+02:00)
 Tuesday, January 08, 2008

I feel like I don't have the right to write about it after so many bits of information were moving in the web about this subject but I would like to add my point of view (or in fact - my point of code). First of all, I must mention some of those who written about this subject before me. Mosha Pasumansky wrote a long post about it in last may. This post contains some ideas of how to come along with this problem, but none of them is perfect. In fact (as always) - there is no perfect solution for this problem. Another important source of knowledge can be found here in the MSDN forums, where Chris Webb, Thomas Pagel and others discussed it. Now, I would like to add my solution. Take it or leave it - your choice.

First of all, create a column in the time dimension that will be the current day indicator. Thanks to the Data Source View or the UDM approach of SSAS 2005 you don't have to change the relational table itself. Just add a named calculation in the DSV with expression that will be 1 for the row of the current day and null or zero for the others. The expression syntax itself depends on the underlying DB so I won't write it here, but it's very simple. Add this column as an attribute (let's call it CurrentDayInd) in the dimension structure and select its AttributeVisible propertiey to false. That's because we don't need such Attribute Hierarchy in our time dimension. After that, create a new User Hierarchy (you can call it Current Day), where the first level in it will be CurrentDayInd and after it place the day (key) attribute.

Now, what we have here? We have a hierarchy (Current Day) with two members - 1 and zero. The 1 member has only one child which is the current day. Link that member to your regular hierarchy (it's called YSQMD at my place) and here you have it. For example, you can use it that way in the MDX script:

Create Set [Last 30 Days] as
LinkMember([Time].[Current Day].[CurrentDayInd].&[1].Children.Item(0),[Time].[YSQMD])
:
LinkMember([Time].[Current Day].[CurrentDayInd].&[1].Children.Item(0),[Time].[YSQMD]).Lag(30);

I think that this solution is good and elegant. First of all, you don't have to use external functions such as Now(). The second pro is that we use the native OLAP mechanism which saves time and makes the queries run faster. Believe me, you'll feel the difference with big cubes. The last thing is that this solution is easy to understand (at least I think so) and it is easy for maintenance. The big con is that you have to process this dimension (and related cubes) every day. I don't think that it's so bad because most of the organizations do make process every day.


 

Wednesday, January 09, 2008 5:39:58 AM (Jerusalem Standard Time, UTC+02:00)
 Monday, December 31, 2007

As many of you already know, installation on Microsoft Excel on the SSAS 2005 server is needed in order to use Excel functions in MDX. That's very helpful because MDX is lack of many important functions such as Round (!). Many organizations don't like it at all, but here's something that may help. In the SSAS 2005 server you don't need to install the whole program, only the .Net programmability support. In the installation, choose to manually pick up which components you wish to install and then choose the .Net programmability Support as seen in the picture:

Notice that this issue will not be fixed in SSAS 2008, so this tip will be relevant for a long time.

Monday, December 31, 2007 5:06:00 PM (Jerusalem Standard Time, UTC+02:00)