Monday, May 26, 2008
One more tip about installing the database samples: I believe that installing them is not enough. To improve your skills you need to know them deeply. So don't just deploy the SSAS project to the server and leave it at that. Build it yourself. Yes - create a new project called MyAdventureWorks or something like that and build all the objects yourself. This will take time and effort, but it is worth it. Once you have done all the tricky things yourself, you will really have it in your hands. Learn the AW project and be a master.

Monday, May 26, 2008 7:44:47 AM (Jerusalem Daylight Time, UTC+03:00)
MDM
 
Everybody is talking about MDM, so we decided to go to IBM and talk with Darren Cooper, who is an expert on this subject. Darren made sense of the term and explained to us exactly what it is and what it is not. There's a lot of confusion out there about this, so it is important to know these things before you deploy MDM or buy a new MDM product...
This sketch can explain a lot of it:


Following the arrows, you can understand what is going on in this picture and what it is all about:
  • The operational systems contain some common critical data which we're tired of duplicating and maintaining all the time. So, we push this data (red in the picture) to the MDM in real time. This is it. That's MDM. From now on, we play with this golden egg and get all the benefits from it.
  • Hey, we have all the critical data in one place, so why shouldn't we push it to the clients whenever they need it? Once we have MDM, it doesn't make sense to give it to them through the Op. systems, does it?
  • Wait a second! A client is using an operational system. Will the critical data be saved in the Op. Systems? You guessed right. Be aware that now the client will send data to both places - MDM and Op. System.
  • MDM is not a replacement for the Data Warehouse. Their purposes are not the same and neither can do what the other does, so they need each other. The DW takes data from the MDM just as it takes data from any other system. On the other side, the MDM takes data from the DW whenever it needs it.
I believe that now you have a clearer understanding of MDM. There are many points that should be discussed about this, but it is too soon right now because we're only learning this, so I'll just point them out.

  • Security - We have all the critical data in one place. Very dangerous...
  • Flexibility - The MDM should react very quickly to every change in the other systems of the organization. Clients cannot wait long for the MDM to catch up with every change in the organization.
  • Availability - It should always be up and cannot crash too often, because everybody is relying on it.
  • Updated - The definition of MDM says that it should always be up to date, but that's not always necessary. The IT architects should find the scenarios where they can relax this requirement.
  • Formats - Every Op. system has its own standards and formats, and the MDM has to support all of them.
  • Interaction with other IT teams - You should build trust with them because you're taking their critical data out of their hands. If your MDM malfunctions, they will be happy to seize the moment and take their data back.
  • Implementation - Building MDM is a very long process. The IT architects have to design its different modules and build them one on top of the other.
  • Conflicting Data - Which system has the last word? How can we handle these cases? Oh yes, it will happen. It always does.
  • Viewer - Do you need an MDM viewer? What should it look like?
  • Make sense - This is maybe the most difficult subject. BI is a bunch of attributes without any inherent meaning connecting them. MDM should fill the void by supplying knowledge from its many critical attributes. How should you do it?
As you can see, there's a lot to talk about. If we decide to implement MDM in our company, I'll be happy to share it here. Good luck to us all.
Monday, May 26, 2008 7:04:12 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, May 25, 2008
I thought that it would be a simple next, next, next installation, but it turned out to be more complex than I thought. It is not very hard to do, but there are some tricky points, especially when installing it on my PC and not on a strong dedicated server.
The installation starts as a simple wizard. Just go on with it but pay attention to this screen:



Here, you need to specify an account for every service installed. Because it is a CTP installation and not a real server installation, you can make life easy for yourself and just use an administrator account for all the services, because security is not an issue now. At the bottom of the screen, enter the account name and password of an administrator account and click "Apply to all".
Now, for the really important note - the startup type. There are three startup types in Windows services:
  1. Automatic - The service starts together with the operating system.
  2. Manual - The service starts only when a process or an admin user starts it.
  3. Disabled - The service can't start at all.
This choice is very important. If you're installing on a dedicated machine, you can choose Automatic because you'll need the services to be running all the time. But if you're installing this on a personal computer, you don't want these CPU- and memory-consuming services to be up all the time. In this case choose Manual and start the services only when you need them. To do that, type "services.msc" in the Start -> Run dialog, find the service and click Start. I don't see any reason to choose the Disabled startup type on this screen. By the way, there's a new type in Vista called "Automatic (Delayed Start)", which starts the service only after the Automatic ones have been started. This option doesn't exist here and I don't see any reason to use it anyway.
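If you go with Manual, a small script can save the trip through services.msc. What follows is only a sketch in Python: it builds the Windows `sc config` and `net start` command lines for a list of services and, by default, just prints them. The service names (MSSQLSERVER, MSSQLServerOLAPService) are the usual ones for a default instance and are an assumption here; check the real names in services.msc. Actually running the commands needs Windows and an elevated prompt.

```python
import subprocess

# Assumed service names for a default SQL Server instance; verify in services.msc.
SERVICES = ["MSSQLSERVER", "MSSQLServerOLAPService"]


def manual_startup_command(service):
    """Command that switches a service to the Manual startup type.

    sc.exe expects "start=" and its value as separate arguments.
    """
    return ["sc", "config", service, "start=", "demand"]


def start_command(service):
    """Command that starts a service on demand."""
    return ["net", "start", service]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False (Windows, admin)."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all([manual_startup_command(s) for s in SERVICES])
    run_all([start_command(s) for s in SERVICES])
```

Keeping the dry run as the default means nothing changes on the machine until you have looked at the commands and switched `dry_run` off yourself.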

Now for the big problem - installing the sample databases. The samples are not part of the CTP, so you'll need to download them from CodePlex. Make sure that you download the samples that fit your CTP version. If you don't have the latest CTP then don't download the latest samples. Find your version in the releases section. After you have downloaded your samples, start the wizard. When you get to this screen:


you'll get stuck (if you haven't read this first, of course) with this message:

Error 27502. Could not connect to Microsoft SQL Server '(local)'. [DBNETLIB][ConnectionOpen (Connect()). SQL Server does not exist or access denied. (17) [I copied that for the ones who will find this by google search]

It took me a while to resolve this, so this is what you need to do before you install the samples:
Open the SQL Server Configuration Manager (Start -> Programs -> Microsoft SQL Server 2008 -> Configuration Tools) and enable the TCP/IP protocol on the server:


That should solve it. After that, go to the directory "c:\Program Files\Microsoft SQL Server\100\Tools\Samples" and there you'll find the samples with a document that explains how to attach them to the server.

I hope this is helpful to those who got stuck and those who haven't got stuck with it yet. Enjoy.
Sunday, May 25, 2008 7:33:43 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, May 13, 2008
I started a long conversation about this subject in the MSDN SSAS forum. I think that it's a question and a principle that every advanced MDX programmer should be familiar with.

It all started with a customer who needed a standard deviation aggregation. I thought that it would be simple because there's a StdDev function in MDX, but it turned out that my customer had major plans for me: He wanted this aggregation to work for every dimension he puts on his axis. This means that the aggregation is not defined over a specific dimension (such as date); instead, the std-dev is defined over whichever dimension is currently on the axis.

The solution for this problem consists of a principle and an answer.

The Principle
An aggregation or a measure that is based on the current user's query is bad. It can and will cause two users to see different results using the same measure. This will cause confusion and misinformation. The sacred principle of One Truth will be desecrated. Taken from the thread, in Chris Webb's words:

"I quite often see people wanting to write calculations that behave differently depending on the query that's being run, and I always tell them not to do it. You can hack something but it's almost impossible to get it work properly for every single possible query - MDX just doesn't work like that"

In the end I explained that to the user and he agreed. One more reason he agreed is that std-dev often doesn't really say anything about the data. In other words, it isn't informative. "The standard deviation is 0.432. That means that... ???"


The Answer
If you (or the customer) still insist on that crazy measure, the following MDX will work.

With
Member [Measures].[RowSTDOrders] as
iif(Count(NonEmpty(StrToSet("Axis(1)").Item(0).Hierarchy.Children,
{[Measures].[Order Quantity]}) as ChildSet) < 2,
Null,
StDev(ChildSet, [Measures].[Order Quantity]))
 
select
[Date].[Calendar Year].[Calendar Year] on 0,
Non Empty [Product].[Product Categories].Members on 1
from [Adventure Works]
where [Measures].[RowSTDOrders]

Thanks to Deepak Puri for this code. Note that the StrToSet function will degrade performance, but this is the only way the code will also work in an MDX script and not only in queries.
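To make the logic of the calculated member easier to follow, here is a rough Python sketch of the same rule. It is not what SSAS executes, just the arithmetic: drop the empty children, return nothing when fewer than two values remain, otherwise return the standard deviation. It assumes that MDX StDev is the sample (n-1) form, which is what `statistics.stdev` computes; the order quantities are made up.

```python
import statistics


def row_std_dev(child_values):
    """Sample standard deviation over the non-empty children of the row member.

    Mirrors the iif in the MDX above: fewer than two non-empty values
    gives None (the MDX Null), otherwise the sample standard deviation.
    """
    non_empty = [v for v in child_values if v is not None]
    if len(non_empty) < 2:
        return None
    return statistics.stdev(non_empty)


# Hypothetical order quantities for the children of one member.
print(row_std_dev([10, None, 14]))
print(row_std_dev([10, None]))  # None
```

The guard matters because a sample standard deviation is undefined for a single value; without it the measure would raise an error instead of showing an empty cell.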

P.S
It doesn't matter if you write StDev or StdDev.
Wednesday, May 14, 2008 6:28:38 AM (Jerusalem Daylight Time, UTC+03:00)
This is my second test of placing good-looking code in my blog, which will help you, the readers, when you read my future posts. This code was formatted using CodeHTMLer with inline tags and without the pre tag. I hope that it will also look good in the RSS feeds.

/// <summary>
/// Summary description for Main.
/// </summary>
static void Main(string[] args)
{
  // string variable
  string myString = "myString";

  /* integer
     variable */
  int myInt = 2;
}

Update: Successful. :-)
Wednesday, May 14, 2008 4:21:38 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, May 08, 2008
This tiny thing cost me a minute today, but it may take some of you longer, so I'm writing it down.

As some of you know, in order to sort a dimension's attribute you need to change the OrderBy property of the attribute. You can have the attribute sorted by another attribute (it's a very common thing in SSAS). In order to do so, you set the OrderBy property to AttributeKey and in the OrderByAttribute property you pick the desired attribute (the one you want to define the order).

Note that if the first attribute (the one you want to sort) doesn't have an attribute relationship to the second attribute, you won't be able to pick the second attribute in the OrderByAttribute property. These attributes must have an attribute relationship.
One more thing: You don't have to show the end-user the attribute which defines the order. If you want to hide it, just set the AttributeHierarchyVisible property to false. It is a common pattern to create an attribute which sorts another attribute and hide it from the user.
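If it isn't obvious why a second attribute is needed for sorting, this small Python sketch shows the idea outside SSAS. The Month attribute and its month_number key are hypothetical; the point is only that sorting by name and sorting by a hidden key give different orders.

```python
# Hypothetical Month attribute: each member has a name and a hidden sort key.
months = [
    {"name": "March", "month_number": 3},
    {"name": "January", "month_number": 1},
    {"name": "February", "month_number": 2},
]

# Sorting by the member name gives alphabetical order.
by_name = sorted(m["name"] for m in months)

# Sorting by the key of the related attribute gives calendar order.
by_key = [m["name"] for m in sorted(months, key=lambda m: m["month_number"])]

print(by_name)  # ['February', 'January', 'March']
print(by_key)   # ['January', 'February', 'March']
```

The key attribute does its job without ever being shown, which is why hiding it from the user costs nothing.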

Thursday, May 08, 2008 7:32:59 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, May 06, 2008
Last May I started my new blog with many questions: What exactly will I write about? Will anyone read me? What do I have to add to all those blogs out there? And a lot more.
After a year of blogging I'm happy with my choice to start a blog and I believe that this blog is good. On the other hand, I know there are a lot of things I can do better. This is a list of what I like and dislike about my blog. In the dislike list I wrote down what I can do to make it better, whenever possible. This list is mostly for me, to put my thoughts in order, but maybe one of you can find useful things in it.

Like List
  • Release. A place where I can toss thoughts from my mind out to the world.
  • Share. I love to share good ideas and implementations. I believe it helps the community, and the good comments I get tell me that I'm right.
  • Save. Over the last year I found that the blog can be a very good place to save knowledge. When I need a piece of my code and I'm at a customer's site and not in my office, it can be very helpful.
  • Be a part. Owning a blog positions me in a community of people with shared interests. This advances me in knowledge and as a person.
Dislike List
  • Not enough. This is the thing that bugs me the most. I'm not writing enough, or at least not as much as I want to write. This is even more frustrating when I see that my posts help many people out there. Finding the time to write and dividing my time between reading and writing is hard. I will make a real effort in the future to write more.
  • Screen Shots. I work in a closed network in my company, so I can't take out code or screen shots that could be very effective and helpful for you, the readers. I hope to install some of the programs I'm using on my own computer so I can show you the results of my work.
  • Respond. I didn't respond to you in time when you commented. I will configure my blog to send me mail whenever you comment, and I promise I will respond to you more quickly.
This is it. Just two more ideas I have in mind. One is already implemented; the second may be in the future.
  • When I started this blog I thought I would write about Jewish stuff as much as I write about BI stuff. I was completely wrong. I found that writing about Jewish stuff in English is very hard for me and that writing deep and serious thoughts is even harder. I changed the title of the blog to "Business Intelligence, Analysis Services, MDX, DataWarehousing and more..." (you can see it up there in the banner). I will focus on these subjects, but I will continue to write about other things that interest me.
  • I thought of adding a box in the right column of the page titled "Upcoming Posts". That's because I know which subjects I'm going to write about long before I do it. I think it can be a cool feature, but the question is: Will it interest anyone? Is there someone who's waiting for it? I thought not. :-)

Tuesday, May 06, 2008 7:06:04 AM (Jerusalem Daylight Time, UTC+03:00)
 Monday, May 05, 2008
My blog is not an official .Net blog, but I've found myself writing a lot of C# & VB.Net code over the last few weeks.

I remember working a lot with ObjectDataSource and DataGrid/DataView/Repeater controls in order to show the user what is going on in my DB. The recent changes in the .Net world reflect the strong demand from developers to make this easier. First, we were introduced to LINQ, which is the first level. Now, I believe that ASP.Net Dynamic Data is the second level, which brings it all together in the web environment.
We are all busy people and we often don't have much time to keep up with all the new .Net frameworks and developments. My only way to stay tuned to what's going on is blogs, so I read when I have time. But when I want to learn something in a more intimate way I watch screencasts.
I recommend that all of you watch the 17-minute screencast from David Ebbo about Dynamic Data. It will show you what it's all about, and after it you'll even be able to create it for yourself. I hope that this is the end of writing junk code in order to connect your DB to your UI. Time will tell if I'm right.

Tuesday, May 06, 2008 5:48:17 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, April 22, 2008
I'll start from the bottom line: If you create your Data Warehouse and you follow the DW rules, your life will be easy (assuming you know the semantics and the way to build a good and correct DW).
In our case, if you have a dimension and you want to make a Parent-Child hierarchy, your life will be easy if you built the dimension's table in the right way.

For example, let's look at a simple Time dimension:

Time_Key  Day  Month  Quarter  Year  Level_Num
01012008  01   01     1        2008  1
30122008  30   12     4        2008  1
01/2008        01     1        2008  2
12/2008        12     4        2008  2
Q1/2008               1        2008  3
Q4/2008               4        2008  3
2008                           2008  4

As you can see, this time dimension's table contains the days 01/01/2008 and 30/12/2008 and their parents in the levels: month, quarter and year.
Now, let's say that I need to take this dimension and make it a Parent-Child table. This is very simple. Just create a view with one new column which will be the parent column. This is the new column's code (written here as a T-SQL CASE expression, assuming the Month, Quarter and Year columns are character columns):
CASE Level_Num
  WHEN 1 THEN [Month] + '/' + [Year]        -- day -> its month
  WHEN 2 THEN 'Q' + [Quarter] + '/' + [Year] -- month -> its quarter
  WHEN 3 THEN [Year]                         -- quarter -> its year
  ELSE NULL                                  -- year has no parent
END AS Parent

The result:

Time_Key  Day  Month  Quarter  Year  Level_Num  Parent
01012008  01   01     1        2008  1          01/2008
30122008  30   12     4        2008  1          12/2008
01/2008        01     1        2008  2          Q1/2008
12/2008        12     4        2008  2          Q4/2008
Q1/2008               1        2008  3          2008
Q4/2008               4        2008  3          2008
2008                           2008  4          null

This is it. Now, in your OLAP DB, just configure the new column as the parent and you have a Parent-Child hierarchy. In Analysis Services you don't even have to create a view. In the Data Source View you can add a named calculation and put your code there.
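If you want to sanity-check the parent rule before wiring it into a view or a named calculation, a few lines of Python are enough. This is only a sketch: the rows are a subset of the sample table above, and the column names are taken from it. It derives each row's parent and checks that the parent actually exists as a Time_Key, which is exactly what the Parent-Child hierarchy needs.

```python
def parent_key(row):
    """Return the parent's Time_Key for one dimension row, or None for the top level."""
    level = row["Level_Num"]
    if level == 1:  # day -> month
        return f'{row["Month"]}/{row["Year"]}'
    if level == 2:  # month -> quarter
        return f'Q{row["Quarter"]}/{row["Year"]}'
    if level == 3:  # quarter -> year
        return row["Year"]
    return None  # the year row has no parent


rows = [
    {"Time_Key": "01012008", "Month": "01", "Quarter": "1", "Year": "2008", "Level_Num": 1},
    {"Time_Key": "01/2008", "Month": "01", "Quarter": "1", "Year": "2008", "Level_Num": 2},
    {"Time_Key": "Q1/2008", "Month": None, "Quarter": "1", "Year": "2008", "Level_Num": 3},
    {"Time_Key": "2008", "Month": None, "Quarter": None, "Year": "2008", "Level_Num": 4},
]

keys = {r["Time_Key"] for r in rows}
for r in rows:
    parent = parent_key(r)
    # Every non-null parent must exist as a Time_Key, or the hierarchy is broken.
    assert parent is None or parent in keys
    print(r["Time_Key"], "->", parent)
```

The check only passes because the keys are descriptive and every level has its own row, which is the whole point of building the table this way.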

See? When you create your DW according to its rules, life is easier. In this example, if you created rows for every level in the hierarchy and created a descriptive key - everything is great.
This can help you in many scenarios. For example, when you find that your dimension is not balanced, you might want to make it a parent-child, so you won't have many pseudo-levels where the only relevant member is the leaf. Otherwise, this can be very annoying for your user. In SSAS, make sure you don't overuse parent-child hierarchies, because they are bad for performance.
Wednesday, April 23, 2008 4:06:32 AM (Jerusalem Daylight Time, UTC+03:00)

One more thing about getting a file from the web/SharePoint and using it as a source in SSIS: If you need to authenticate just change the xml.open command to:

xml.open "GET", URL, false, "user", "password"

where user and password are the user name & password of an account that has permissions to the desired file. Note that it is VERY recommended to use an application user, so the password won't change in the future. If you don't have such a user and you must change your password in the future, do not forget to change it in the script. My tip: add a reminder in your calendar to change the password in the script.
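For comparison only, here is roughly what an authenticated GET looks like outside VBScript, as a Python sketch. It assumes the server accepts HTTP Basic authentication, which is not necessarily what the xml.open call above negotiates, so treat it as an illustration rather than an equivalent. The URL and credentials are placeholders.

```python
import base64
import urllib.request


def build_authenticated_request(url, user, password):
    """Build a GET request carrying HTTP Basic credentials.

    Assumption: the server accepts Basic authentication. The VBScript
    xml.open call may negotiate other schemes that this sketch does not cover.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical URL and application-user credentials.
req = build_authenticated_request("http://example.com/reports/data.xml", "user", "password")
# urllib.request.urlopen(req).read() would then download the file.
```

Basic authentication only encodes the password, it does not encrypt it, so it should go over HTTPS.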

At this point I don't know whether you can authenticate over SSL or with stronger protocols using VBScript.

Tuesday, April 22, 2008 10:20:05 PM (Jerusalem Daylight Time, UTC+03:00)
 Monday, March 31, 2008
We got many client requests for the ability to show the "last updated" date of the data in their web sites.
It doesn't matter how you show the SSAS data - the customers will always want to know as of which date the data is correct.
My solution is an ASP.NET 2.0 web site that uses the AMO class library. It takes the date from the server and shows it to the user.

What you need to do is:
1. Create a new ASP.NET web site using Visual Studio 2005/2008.
2. Add a reference to the AMO dll (Microsoft.AnalysisServices). You'll find it on the SSAS server.
3. In the default.aspx page that was created for you, just add one Label.
4. Add a configuration file which will hold the name of the SSAS server. That way, when you move the site from the development environment to the production environment, you'll only have to change this file. Call this file config.xml and write in it the following:
<?xml version="1.0" encoding="utf-8" ?>
<ServerName>YourServerFullNameHere</ServerName>

5. In the code-behind file (default.aspx.cs) write the following code instead of what you already have there:

using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using AMO = Microsoft.AnalysisServices;
using System.Xml;

public partial class _Default : System.Web.UI.Page
{
  // Page_Load is the handler name ASP.NET wires up automatically.
  protected void Page_Load(Object sender, EventArgs e)
  {
    Label1.Text = GetCubeUpdateDate(Request.QueryString["DBName"],Request.QueryString["CubeId"]);
  }

  private string GetCubeUpdateDate (string dbName, string cubeId)
  {
    using (AMO.Server asServer = new AMO.Server())
    {
      asServer.Connect("Data Source=" + GetAnalysisServerName());
      AMO.Database db = asServer.Databases.FindByName(dbName);
      if (db == null)
      {
        return "DB Name not found";
      }

      AMO.Cube cube = GetCubeById(cubeId, db);
      if (cube == null)
      {
        return "Cube Name not found";
      }

      DateTime lastProcessed = cube.LastProcessed;
      // Zero-pad day and month so the result is always DD/MM/YYYY.
      return lastProcessed.Day.ToString("00") + "/" + lastProcessed.Month.ToString("00") + "/" + lastProcessed.Year.ToString();
    }
  }

  private string GetAnalysisServerName ()
  {
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(Request.PhysicalApplicationPath + "config.xml");
    return xmlDoc.GetElementsByTagName("ServerName").Item(0).InnerText;
  }

  private AMO.Cube GetCubeById (string cubeId, AMO.Database db)
  {
    foreach (AMO.Cube cube in db.Cubes)
    {
      if (cube.ID.Equals(cubeId))
      {
        return cube;
      }
    }
    return null;
  }
}

Even though the code is self-explanatory, here are some points about it:
  • I chose not to include the server name in the web.config file because I like to separate application-related configuration from web configuration.
  • If you want, you can get the cube name from the user (in the query string) and then the code is even shorter - just get the cube like I got the database.
  • I wanted to show the date in the format DD/MM/YYYY, so that's why I wrote the long return statement in the GetCubeUpdateDate method. If you want to return the date in the server's regional short date format (MM/DD/YYYY on a US-English machine) you can use the lastProcessed.ToShortDateString() method.
  • Note that when you publish the web site you need to create a dedicated virtual directory in IIS.
  • The user uses this site in the following way: All he needs to do is create a frame with this site's address as its source and add the DBName & CubeId to the query string. In SharePoint it's even easier - the user only needs to create a Page Viewer Web Part.
Enjoy.

Monday, March 31, 2008 11:05:20 PM (Jerusalem Daylight Time, UTC+03:00)
 Saturday, March 29, 2008
Two weeks ago I showed you the LEDs map. This time I'll describe how it is done.

The LEDs map is basically a web page with a lot of JavaScript and Panorama applets, which together give the user the feeling of an Ajax & DHTML based web site.
Look at the picture of the map in the previous post. The LEDs are simply Panorama applets which show Panorama views. Each view shows only one LED. Using the Panorama SDK, I did the following:
  • Take the LED's value from the view and show it in the tooltip
  • Take the LED's color from the view and let the user filter the map according to the desired color(s)
  • Take the LED's view path and, after the view is clicked, show the related views (the departments' views)
The rest of it is just JavaScript games and tricks.
The LEDs map is a beautiful example of what you can do with imagination, thought and good will to give your customer a good BI tool to work with.

Sunday, March 30, 2008 5:27:29 AM (Jerusalem Daylight Time, UTC+03:00)