Friday, May 30, 2008
Finally, something that looks like the answer to our needs. IBM Business Glossary (BG) is a product for managing the business vocabulary. It lets users create business terms (also called entities), edit them, share them and customize them. We saw the product at IBM Israel and liked it very much. Here is a brief summary of the meeting:

Managing metadata (MD) in an organization is a difficult task. First of all, you need to know what kind of MD you want to manage. There are three main types:
  • Business MD - The vocabulary that contains the terms of the business.
  • Technical MD - Names and attributes of data stores, tables, columns, etc.
  • Operational MD - How the information flows inside the organization.

The BG gives the organization a common language and connects the business to IT. First of all, it creates a contract: everybody knows exactly what a "high value customer" is, for example. That is supposed to be the end of confusion about business terms. It also helps people understand things, exposes knowledge and connects all the technical details.
In BG, all terms have the same common attributes, such as name, description, example, related entities, etc. Users can define additional custom attributes if they want. The product also manages data stewardship, meaning that every entity has a father/manager. It can also have two fathers, one from the business side and one from the technical side (Update: Not in the current version). The terms are divided into subject areas/contexts. This way you can go to a subject area and learn it by going over all its entities. You can see and use its custom attributes. For example, an entity can have a link to the reports that contain or list it.
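
To make the structure concrete, here is a minimal Python sketch of a glossary term as described above. It is only my own illustration and has nothing to do with how BG stores things: the class names, the field names and the sample terms with their descriptions are all invented.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A business term (entity) with the common attributes every term shares."""
    name: str
    description: str
    subject_area: str
    steward: str                                   # the single "father" of the entity
    example: str = ""
    related: list = field(default_factory=list)    # names of related terms
    custom: dict = field(default_factory=dict)     # user-defined attributes


class Glossary:
    """Hypothetical in-memory glossary, keyed by term name."""

    def __init__(self):
        self._terms = {}

    def add(self, term):
        self._terms[term.name] = term

    def lookup(self, name):
        return self._terms[name]

    def by_subject_area(self, subject_area):
        """Return all terms of one subject area, sorted by name."""
        return sorted(
            (t for t in self._terms.values() if t.subject_area == subject_area),
            key=lambda t: t.name,
        )


glossary = Glossary()
glossary.add(Term(
    name="High Value Customer",
    description="A customer whose yearly purchases exceed an agreed threshold.",
    subject_area="Customers",
    steward="Sales business expert",
    custom={"reports": ["Top Customers Report"]},   # custom attribute: linked reports
))
glossary.add(Term(
    name="Churned Customer",
    description="A customer with no activity in the last 12 months.",
    subject_area="Customers",
    steward="Sales business expert",
    related=["High Value Customer"],
))

# Learn a subject area by going over all its entities.
names = [t.name for t in glossary.by_subject_area("Customers")]
```

Going over a subject area is then just a call to by_subject_area, and the link to reports lives in the custom attributes of the term.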

There's much more to say about BG. All I wanted was to give a brief overview of what it is, so you can judge whether it can help you. Here are my pros and cons for this product:

Pros:

  • Making order in the organization - everybody knows what you are talking about when you use a term. Every entity has a defined father/business expert.
  • Manages business knowledge over time. Instead of taking other employees' precious time to teach a new employee everything, you can just tell him to go over the business glossary. (I'm not naive, but it will reduce the time.)
  • Fast lookup - If I want to know which database tables hold an entity, I can find out in seconds.

Cons:

  • Security - BG has almost no security module at all, meaning that everybody sees everything.
  • Doesn't support services yet. I would like to see which services expose an entity and which services consume it. I want to call the service, provide it with input and see the output.
  • The stewardship module is still weak. For now, an entity can have only one father.
  • The custom attributes are the same for the entire vocabulary. What if I would like to have a custom attribute for only one subject area?
  • There isn't a Hebrew interface yet. The interface is available only in English, Spanish and French (if I'm not mistaken).

In conclusion, I think that the product is good, even very good. The problem is that it has to go through several more development iterations before it can be used by a variety of organizations. It just doesn't have all the features that a business vocabulary must have. Wait a year and you'll see a wonderful product.

Saturday, May 31, 2008 2:04:53 AM (Jerusalem Daylight Time, UTC+03:00)
On June 10th, Panorama will show us the new version of NovaView, 5.5. The show will be held only on the web (that's why it's called a webinar). We will see the new reports, Flash-based dashboards and the results of the cooperation with Google. You can see the brochure here. I would be happy to say that I'll see you there. The only problem is that we won't see each other, and that's why I think that a real conference is better than a web-based one. On the other hand, a webinar is much simpler and cheaper to run, so I can understand the move. Never mind, I'll see you some other time.

Friday, May 30, 2008 6:57:21 PM (Jerusalem Daylight Time, UTC+03:00)
 Monday, May 26, 2008
One more tip about installing the database samples: I believe that installing them is not enough. In order to improve your skills you need to know them deeply. Therefore, don't just deploy the SSAS project to the server and leave it at that. Build it yourself. Yes - create a new project called MyAdventureWorks or something like that and build all the objects by yourself. Indeed, this will take time and effort, but it is worth it. Once you have done all the tricky things yourself, you really have it in your hands. Learn the AW project and be a master.

Monday, May 26, 2008 7:44:47 AM (Jerusalem Daylight Time, UTC+03:00)
MDM
 
Everybody is talking about MDM, so we decided to go to IBM and talk with Darren Cooper, who is an expert on this subject. Darren made sense of the term for us and explained exactly what it is and what it is not. There's a lot of confusion out there about this, so it is important to know these things before you deploy or buy a new MDM product...
This sketch can explain a lot of it:


Following the arrows, you can understand what is going on in this picture and what it is all about:
  • The operational systems contain some common critical data which we're tired of duplicating and maintaining all the time. So, we push this data (red in the picture) to the MDM in real time. This is it. That's MDM. From now on, we play with this golden egg and get all the benefits from it.
  • Hey, we have all the critical data in one place, so why shouldn't we push it to the clients whenever they need it? Once we have MDM it doesn't make sense to give it to them through the Op. systems, does it?
  • Wait a second! A client is using an operational system. Will the critical data be saved in the Op. systems as well? You guessed right. Be aware that now the client will send data to both places - the MDM and the Op. system.
  • MDM is not a replacement for the data warehouse. Their purposes are not the same and neither can do the other's job, so they need each other. The DW takes data from the MDM just as it takes data from any other system. In the other direction, the MDM takes data from the DW whenever it needs it.
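
For those who prefer code to arrows, here is a rough Python sketch of the flows in the list above. It is only a toy under my own assumptions: the class names, the choice of "name" and "address" as the critical (red) fields and the sample record are all invented, and a real MDM product obviously does much more.

```python
class MdmHub:
    """Holds the golden copy of the critical data, keyed by record id."""

    def __init__(self):
        self.golden = {}

    def push(self, key, record):
        self.golden[key] = dict(record)

    def get(self, key):
        return self.golden[key]


class OperationalSystem:
    CRITICAL_FIELDS = ("name", "address")   # hypothetical "red" data

    def __init__(self, name, hub):
        self.name = name
        self.hub = hub
        self.rows = {}

    def save(self, key, record):
        # The data goes to both places: the system keeps its own full row
        # and pushes only the critical subset to the MDM.
        self.rows[key] = dict(record)
        critical = {f: record[f] for f in self.CRITICAL_FIELDS if f in record}
        self.hub.push(key, critical)


def load_warehouse(hub, systems):
    """The DW treats the MDM as one more source next to the Op. systems."""
    warehouse = {"mdm": dict(hub.golden)}
    for system in systems:
        warehouse[system.name] = dict(system.rows)
    return warehouse


hub = MdmHub()
billing = OperationalSystem("billing", hub)
billing.save(42, {"name": "Dana", "address": "Haifa", "balance": 120})

client_view = hub.get(42)              # clients read critical data from the MDM
dw = load_warehouse(hub, [billing])    # the DW loads from the MDM and the Op. systems
```

Note that the balance stays only in the operational system, while the client reads the name and address straight from the hub.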
I believe that you now have a clearer understanding of MDM. There are many points that should be discussed about it, but it is too soon right now because we're only learning this, so I'll just point them out.

  • Security - We have all the critical data in one place. Very dangerous...
  • Flexibility - The MDM should react very quickly to every change in the other systems of the organization. Clients cannot wait long for the MDM to catch up with every change in the organization.
  • Availability - It should always be up and cannot afford to crash, because everybody relies on it.
  • Up to date - The definition of MDM says that it should always be up to date, but that's not always necessary. The IT architects should find the scenarios where they can relax this requirement.
  • Formats - Every Op. system has its own standards and formats, and the MDM has to support all of them.
  • Interaction with other IT teams - You should build trust with them because you're taking their critical data out of their hands. If your MDM malfunctions, they will be happy to seize the moment and take their data back.
  • Implementation - Building MDM is a very long process. The IT architects have to design its different modules and build them one on top of the other.
  • Conflicting data - Which system has the last word? How can we handle these cases? Oh yes, it will happen. It always does.
  • Viewer - Do you need an MDM viewer? What should it look like?
  • Making sense - This is maybe the most difficult subject. BI is a bunch of attributes without any inner sense connecting them. MDM should fill the void by supplying knowledge through its many critical attributes. How should you do it?
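
Just to show what the "conflicting data" question looks like in practice, here is one possible rule sketched in Python: the most trusted source wins, and among updates from that source the newest one wins. This is my own assumption for illustration, not something that came up in the meeting; the source names, their ranking and the sample addresses are invented.

```python
from datetime import datetime

# Hypothetical ranking: a lower number means the source is trusted more.
SOURCE_PRIORITY = {"crm": 0, "billing": 1, "web": 2}


def resolve(candidates):
    """Pick the winning value among conflicting updates.

    Each candidate is a (source, updated_at, value) tuple. The most trusted
    source wins; among updates from that source the newest one wins.
    """
    best = min(SOURCE_PRIORITY[source] for source, _, _ in candidates)
    trusted = [c for c in candidates if SOURCE_PRIORITY[c[0]] == best]
    return max(trusted, key=lambda c: c[1])[2]


updates = [
    ("web", datetime(2008, 5, 26, 9, 0), "Hertzel St. 1"),
    ("crm", datetime(2008, 5, 20, 8, 0), "Herzl St. 1"),
    ("crm", datetime(2008, 5, 25, 8, 0), "Herzl St. 1a"),
]
winner = resolve(updates)   # the newer of the two "crm" updates
```

Other rules (plain last-write-wins, manual review by a steward) are just as legitimate. The point is only that somebody has to decide which one applies.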
As you can see, there's a lot to talk about. If we decide to implement MDM in our company, I'll be happy to share it here. Good luck to us all.
Monday, May 26, 2008 7:04:12 AM (Jerusalem Daylight Time, UTC+03:00)
 Sunday, May 25, 2008
I thought that it would be a simple next, next, next installation, but it turned out to be more complex than I thought. It is not very hard to do, but there are some tricky points, especially when installing it on my PC and not on a strong dedicated server.
The installation starts as a simple wizard. Just go on with it but pay attention to this screen:



Here, you need to specify an account for every service installed. Because this is a CTP installation and not a real server installation, you can make life easy for yourself and just use an administrator account for all the services, because security is not an issue now. At the bottom of the screen, enter the account name and password of an administrator account and click on "Apply to all".
Now, for the really important note - the startup type. There are three startup types for Windows services:
  1. Automatic - The service starts together with the operating system.
  2. Manual - The service starts only when a process or an administrator starts it.
  3. Disabled - The service can't start at all.
This choice is very important. If you're installing on a dedicated machine then you can choose Automatic, because you'll need the service to be always running. But if you're installing on a personal computer then you don't want these CPU- and memory-consuming services to be up all the time. In this case you should choose Manual and start these services only when you need them. To start them, type "services.msc" in the Start -> Run dialog, find the service and click on Start. I don't see any reason to choose the Disabled startup type on this screen. By the way, there's a new type in Vista called "Automatic (Delayed Start)", which starts the service only after the Automatic ones have been started. This option doesn't exist here and I don't see any reason to use it anyway.
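
If you prefer the command line to services.msc, the same thing can be scripted. The Python sketch below only builds the "sc config" and "net start" command lines and does not run them. The service names are my assumption of what a default instance registers, so check the real names in services.msc before using them.

```python
import subprocess

# Assumed service names of a default SQL Server 2008 instance; verify them in services.msc.
SQL_SERVICES = ["MSSQLSERVER", "MSSQLServerOLAPService", "ReportServer"]

# Startup type as shown in the Services console -> value understood by "sc config".
SC_START_VALUES = {"Automatic": "auto", "Manual": "demand", "Disabled": "disabled"}


def startup_command(service, startup_type):
    """Build the 'sc config' command that sets a service's startup type."""
    return ["sc", "config", service, "start=", SC_START_VALUES[startup_type]]


def start_command(service):
    """Build the 'net start' command that starts a service on demand."""
    return ["net", "start", service]


def run(command):
    """Execute a command; on Windows this needs an elevated (administrator) prompt."""
    subprocess.run(command, check=True)


# On a personal computer: switch every service to Manual, then start only what you need.
commands = [startup_command(s, "Manual") for s in SQL_SERVICES]
commands.append(start_command("MSSQLSERVER"))
```

Passing each entry of commands to run() applies it. I kept building the commands separate from running them so you can print them and look before anything changes on your machine.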

Now for the big problem - installing the sample databases. The samples are not part of the CTP, so you'll need to download them from CodePlex. Make sure that you download the samples that match your CTP version. If you don't have the latest CTP then don't download the latest samples. Find your version in the releases section. After you have downloaded your samples, start the wizard. When you get to this screen:


you'll get stuck (if you haven't read this first, of course) with this message:

Error 27502. Could not connect to Microsoft SQL Server '(local)'. [DBNETLIB][ConnectionOpen (Connect()). SQL Server does not exist or access denied. (17) [I copied that for the ones who will find this by google search]

It took me a while to resolve this, so this is what you need to do before you install the samples:
Open SQL Server Configuration Manager (Start -> Programs -> Microsoft SQL Server 2008 -> Configuration Tools) and enable the TCP/IP protocol on the server:


That should solve it. After that, go to the directory "c:\Program Files\Microsoft SQL Server\100\Tools\Samples" and there you'll find the samples with a document that explains how to attach them to the server.
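
A quick way to check whether the server accepts TCP connections at all is a small Python sketch like the one below. It assumes the default port 1433 of a default instance (a named instance may listen on a different port), and it only tests that the port answers, not that the login works.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 1433 is an assumption: the default port of a default instance.
reachable = tcp_port_open("127.0.0.1", 1433)
print("SQL Server reachable over TCP:", reachable)
```

If this prints False while the service is running, the TCP/IP protocol is a good first suspect.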

I hope this is helpful to those who got stuck and those who haven't got stuck with it yet. Enjoy.
Sunday, May 25, 2008 7:33:43 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, May 13, 2008
I started a long conversation about this subject in the MSDN SSAS forum. I think that it's a question and a principle that every advanced MDX programmer should be familiar with.

It all started with a customer who needed a standard deviation aggregation. I thought that it would be simple because there's a StdDev function in MDX, but it turned out that my customer had major plans for me: he wanted this aggregation to work for whatever dimension he puts on his axis. This means that the std-dev is not defined over a specific dimension (such as date), but over the dimension that is currently on the axis.

The solution for this problem consists of a principle and an answer.

The Principle
An aggregation or a measure that is based on the current user's query is bad. It can and will cause two users to see different results for the same measure. This will cause confusion and disinformation. The sacred principle of One Truth will be desecrated. Taken from the thread, in Chris Webb's words:

"I quite often see people wanting to write calculations that behave differently depending on the query that's being run, and I always tell them not to do it. You can hack something but it's almost impossible to get it work properly for every single possible query - MDX just doesn't work like that"

In the end I explained that to the user and he agreed. One more reason for his agreement is that std-dev often doesn't really say anything about the data. In other words, it isn't informative. "The standard deviation is 0.432. That means that... ???"
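
To see the problem in numbers, here is a small Python sketch with made-up order quantities (not AdventureWorks data). The same "standard deviation of orders" gives two different answers, depending only on which dimension the user happens to put on the axis.

```python
from statistics import stdev

# Made-up order quantities per (product category, calendar year).
orders = {
    ("Bikes", 2003): 120, ("Bikes", 2004): 180,
    ("Clothing", 2003): 40, ("Clothing", 2004): 45,
    ("Accessories", 2003): 60, ("Accessories", 2004): 70,
}


def totals_by(position):
    """Sum the quantities along one dimension (0 = category, 1 = year)."""
    totals = {}
    for key, quantity in orders.items():
        totals[key[position]] = totals.get(key[position], 0) + quantity
    return totals


# The "same" measure, as two different users would see it.
std_over_categories = stdev(list(totals_by(0).values()))   # categories on the axis
std_over_years = stdev(list(totals_by(1).values()))        # years on the axis
```

With these numbers the first user sees roughly 113 and the second roughly 53, for what both of them call the same measure.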


The Answer
If you (or the customer) still insist on that crazy measure, the following MDX will work.

-- Std-dev of Order Quantity over the children of whatever hierarchy is on rows
With
Member [Measures].[RowSTDOrders] as
  iif(
    -- Fewer than two non-empty children: return Null instead of a std-dev
    Count(NonEmpty(StrToSet("Axis(1)").Item(0).Hierarchy.Children,
                   {[Measures].[Order Quantity]}) as ChildSet) < 2,
    Null,
    StDev(ChildSet, [Measures].[Order Quantity]))

select
  [Date].[Calendar Year].[Calendar Year] on 0,
  Non Empty [Product].[Product Categories].Members on 1
from [Adventure Works]
where [Measures].[RowSTDOrders]

Thanks to Deepak Puri for this code. Notice that the StrToSet function will degrade performance, but this is the only way for the code to work in the MDX script as well, and not only in queries.
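
If the iif part is hard to read, this is the same logic as a rough Python sketch. It is not a translation of the MDX engine's behaviour, only of the rule: drop the empty children, return nothing when fewer than two are left, otherwise return the sample standard deviation. The function name and the sample values are mine.

```python
from statistics import stdev


def row_std(child_values):
    """Mirror the calculated member: None unless at least two children have data."""
    non_empty = [v for v in child_values if v is not None]   # NonEmpty(...)
    if len(non_empty) < 2:                                    # Count(...) < 2
        return None                                           # Null
    return stdev(non_empty)                                   # StDev(ChildSet, measure)


two_values = row_std([10, None, 14])   # two non-empty children: a number comes back
one_value = row_std([10, None])        # only one non-empty child: None
```

The guard is needed because a sample standard deviation divides by n-1, so it is undefined for a single value.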

P.S
It doesn't matter if you write StDev or StdDev.
Wednesday, May 14, 2008 6:28:38 AM (Jerusalem Daylight Time, UTC+03:00)
This is my second test of placing good-looking code in my blog, which will help you, the readers, when you read my future posts. This code was formatted using CodeHTMLer with inline tags and without the pre tag. I hope that it will also look good in the RSS feeds.

/// <summary>
/// Summary description for Main.
/// </summary>
static void Main(string[] args)
{
  // string variable
  string myString = "myString";

  /* integer
     variable */
  int myInt = 2;
}

Update: Successful. :-)
Wednesday, May 14, 2008 4:21:38 AM (Jerusalem Daylight Time, UTC+03:00)
 Thursday, May 08, 2008
This tiny thing cost me a minute today, but it may take some of you longer, so I'm writing it down.

As some of you know, in order to sort a dimension's attribute you need to change the OrderBy property of the attribute. You can have the attribute sorted according to another attribute (it's a very common thing in SSAS). In order to do so, you set the OrderBy property to AttributeKey and in the OrderByAttribute property you pick the desired attribute (the one you want to define the order).

Note that if the first attribute (the one you want to sort) doesn't have an attribute relationship to the second attribute, you won't be able to pick the second attribute in the OrderByAttribute property. The two attributes must have an attribute relationship.
One more thing: You don't have to show the end user the attribute that defines the order. If you want to hide it, just set the AttributeHierarchyVisible property to false. It is a common pattern to create an attribute whose only job is to sort another attribute and hide it from the user.
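
The classic case is month names, which you want in calendar order and not in alphabetical order. The Python sketch below only imitates the effect of the setting and is not SSAS code; the attribute names month_name and month_number are invented for the example.

```python
# Hypothetical dimension rows: the visible attribute plus a hidden ordering attribute.
months = [
    {"month_name": "March", "month_number": 3},
    {"month_name": "January", "month_number": 1},
    {"month_name": "February", "month_number": 2},
]

# Ordering the attribute by its own name gives alphabetical order.
by_name = [m["month_name"] for m in sorted(months, key=lambda m: m["month_name"])]

# OrderBy = AttributeKey with OrderByAttribute = month_number:
# the members are ordered by the key of the related attribute instead.
by_related_key = [m["month_name"] for m in sorted(months, key=lambda m: m["month_number"])]
```

Here month_number plays the role of the hidden attribute: the user never sees it, but it decides the order.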

Thursday, May 08, 2008 7:32:59 AM (Jerusalem Daylight Time, UTC+03:00)
 Tuesday, May 06, 2008
Last May I started my new blog with many questions: What exactly will I write about? Will anyone read me? What do I have to add to all those blogs out there? And a lot more.
After a year of blogging I'm happy with my choice to start a blog and I believe that this blog is good. On the other hand, I know there are a lot of things I can do better. This is a list of what I like and dislike about my blog. In the dislike list I wrote down what I can do to make it better, whenever possible. This list is mostly for me, to put my mind in order, but maybe one of you can find useful things in it.

Like List
  • Release. A place where I can toss away thoughts from my mind to the world.
  • Share. I love to share good ideas and implementations. I believe it helps the community, and the good comments I get tell me that I'm right.
  • Save. Over the last year I found that the blog can be a very good place to save knowledge. When I need a piece of my code and I'm at a customer's site and not in my office, it can be very helpful.
  • Be a part. Owning a blog positions me in a community of people with shared interests. This advances me in knowledge and as a person.
Dislike List
  • Not enough. This is the thing that bugs me the most. I'm not writing enough, or at least not as much as I want to write. This is even more frustrating when I see that my posts help many people out there. Finding the time to write and balancing my time between reading and writing is hard. I will make a big effort to write more in the future.
  • Screen Shots. I work in a closed network in my company, so I can't take out code or screen shots that could be very effective and helpful for you, the readers. I hope to install some of the programs I'm using on my own computer so I can show you the results of my work.
  • Respond. I didn't respond to you in time when you commented. I will configure my blog to send me mail whenever you comment and I promise I will respond more quickly.
This is it. Just two more ideas I have in mind. One is already implemented, the second may be in the future.
  • When I started this blog I thought I would write about Jewish stuff as much as I write about BI stuff. I was completely wrong. I found that writing about Jewish stuff in English is very hard for me and that writing deep and serious thoughts is even harder. I changed the title of the blog to "Business Intelligence, Analysis Services, MDX, DataWarehousing and more..." (you can see it up there in the banner). I will focus on these subjects, but I will continue to write about other things that interest me.
  • I thought of adding a box in the right column of the page titled "Upcoming Posts". That's because I know which subjects I'm going to write about long before I do it. I think it can be a cool feature but the question is: Will it interest anyone? Is there someone who's waiting for it? I thought not. :-)

Tuesday, May 06, 2008 7:06:04 AM (Jerusalem Daylight Time, UTC+03:00)
 Monday, May 05, 2008
My blog is not an official .Net blog, but I have found myself writing a lot of C# & VB.Net code over the last few weeks.

I remember working a lot with ObjectDataSource and DataGrid/DataView/Repeater controls in order to show the user what is going on in my DB. The recent changes in the .Net world reflect the strong demand from developers to make this easier. First, we were introduced to LINQ, which is the first level. Now, I believe that ASP.Net Dynamic Data is the second level, which brings it all together in the web environment.
We are all busy men and we often don't have much time to keep up with all the new .Net frameworks and developments. My only way to stay tuned to what's going on is blogs, so I read them when I have time. But when I want to learn something in a more intimate way I watch screencasts.
I recommend that all of you watch the 17-minute screencast from David Ebbo about Dynamic Data. It will show you what it's all about, and afterwards you'll even be able to create it for yourself. I hope that this is the end of writing junk code in order to connect your DB to your UI. Time will tell if I'm right.

Tuesday, May 06, 2008 5:48:17 AM (Jerusalem Daylight Time, UTC+03:00)