Monday, January 07, 2008

For the last few months I rejected invitations to Facebook over and over again. The reason is that I think it's just a big waste of time. I finally surrendered when I figured out that I'm missing many things that happen in the virtual world. For example, my friend published pictures of my daughter and everyone saw them except... me. Everyone talked about how cute she is in those pics, and people I don't know say hello to her with a smile, and I have no idea who they are or how they know Renana...

After joining the social network I've seen the huge waste of time for myself. Instead of making this network a platform for distant people to connect with each other, I see that everybody is busy adding many stupid applications to their profile pages, annoying each other with nonsense (I'm saving an alien, raising a pet, you name it) and playing endless quizzes.

I decided that after filling my profile with a minimal number of apps I will calm down and use Facebook to do what it's meant to do - connect with my friends. Friends - stay in touch!

Tuesday, January 08, 2008 5:50:50 AM (Jerusalem Standard Time, UTC+02:00)

In the previous parts (1, 2) I showed how to connect Informatica with MS-OLAP, meaning that a mapplet can process a cube or a dimension. The thing is that I focused on the MS-OLAP side. In the second part I even wrote the T-SQL code itself. Now I want to close the loop by describing what's going on on the Informatica side. This part was built by my friend Alex, who permitted me to post here about what he did.

First of all, there's a table that contains the parameters to pass to the MS-OLAP procedure (object id, type, user name, etc.). This table is the source (and source qualifier, of course) of the mapplet. Each row in this table triggers one call to the stored procedure on the MS-OLAP side (in fact, the procedure is part of the relational DB, but never mind that now). The call to the SP is made with Informatica's Procedure block. The connection is a regular ODBC connection, as mentioned in the previous part. Now for the interesting part: in the mapplet, the result of the procedure (zero for success, one for failure) goes into a Java Transformation block. This Java block fails the mapplet if one or more procedure calls returned failure.

How do you build this Java block? Double-click it to open its properties and go to the "Java Code" tab. There you'll see a tab for every event in the block's life cycle. Here is the code for each tab (only the relevant ones):

Helper Code:

static int errorCounter = 0;
static Object lock = new Object();

On Input Row:

if (returnValue != 0)
{
    synchronized(lock)
    {
        errorCounter++;
    }
}

On End of Data:

synchronized(lock)
{
    if (errorCounter > 0)
    {
        failSession("OLAP Objects failed");
    }
}

Note that:

  • I'm not sure that the lock mechanism is required here. Synchronization mechanisms (sync, lock, semaphore, etc.) are usually used when a write must be atomic, in order to avoid problems like lost updates when two threads write at the same time. Here I simply don't care. Even if two parallel threads both read errorCounter as zero and both increase it to one (when in fact it should be two), it won't be a bug because the session will fail anyway. Alex & I need to talk about this point...
  • failSession is a function which is part of Informatica's API. As you might guess, it will fail the whole mapplet.
  • Very important: Calling all the MS-OLAP objects at once will cause an error in the Analysis Services server, and all the objects will be left in the Unprocessed state. The Informatica side has to call the dimensions first and only then the cubes. The cubes must not be called all at once if they have relationships between them. This will cause a deadlock too.
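
On the locking question in the first note: here is a minimal sketch, written as plain Java outside Informatica, of the same counter kept in a java.util.concurrent.atomic.AtomicInteger instead of behind an explicit lock. The class and method names are made up for illustration; inside a real Java Transformation the counter would live in Helper Code and failSession would be called where shouldFail() returns true.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-alone model of the three code tabs; not Informatica API.
final class ErrorCounterSketch {
    // "Helper Code": one counter shared by all threads, no explicit lock object.
    private static final AtomicInteger errorCounter = new AtomicInteger();

    // "On Input Row": count every non-zero return value.
    static void onInputRow(int returnValue) {
        if (returnValue != 0) {
            errorCounter.incrementAndGet(); // atomic read-modify-write
        }
    }

    // "On End of Data": true means the session should be failed.
    static boolean shouldFail() {
        return errorCounter.get() > 0;
    }

    static int errors() {
        return errorCounter.get();
    }

    static void reset() {
        errorCounter.set(0);
    }

    public static void main(String[] args) throws InterruptedException {
        // Two threads each feed 1000 rows, half of them failures.
        Runnable rows = () -> {
            for (int i = 0; i < 1000; i++) {
                onInputRow(i % 2);
            }
        };
        Thread a = new Thread(rows);
        Thread b = new Thread(rows);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(errors()); // prints 1000
    }
}
```

With the atomic counter the count is exact even under parallel writes, which matters only if you ever want to report how many objects failed rather than just whether any did.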
Tuesday, January 08, 2008 5:35:17 AM (Jerusalem Standard Time, UTC+02:00)

Saturday, January 05, 2008

The Panorama NovaView Desktop program can't always deal with huge crossjoins. The reason is that this program is written in VB6, a very old platform for client programs. One thing you can try is to go to the crosstab properties and, in the Advanced tab, click on "Optimize huge crossjoins". The problem is that this won't always help. The best solution I've found so far is to go to the Web Access site (or click on the IE button in the desktop program), where you can choose the size of the chunk of data you'll receive on every click. Starting with 100 rows in the first chunk may help you with huge crossjoins.

Sunday, January 06, 2008 5:48:11 AM (Jerusalem Standard Time, UTC+02:00)

Monday, December 31, 2007

As many of you already know, Microsoft Excel must be installed on the SSAS 2005 server in order to use Excel functions in MDX. That's very helpful because MDX lacks many important functions such as Round (!). Many organizations don't like installing Excel on a server at all, but here's something that may help. On the SSAS 2005 server you don't need to install the whole program, only the .NET Programmability Support. In the installation, choose to pick the components manually and then choose .NET Programmability Support, as seen in the picture:

Notice that this issue will not be fixed in SSAS 2008, so this tip will be relevant for a long time.

Monday, December 31, 2007 5:06:00 PM (Jerusalem Standard Time, UTC+02:00)

Sunday, December 30, 2007

Just got home. Most of my day (and my co-worker's too) went on a big installation of the second block of our BI project. In the morning we really thought that maybe this time, yeah - just this time, things would go better. After more than 12 hours I'm laughing at myself: how could I be so naive? Many things that could go wrong did, but now that it has all ended (with a happy ending, otherwise I wouldn't be here, writing in my home sweet home) I can say that most of the blame is on Informatica PowerCenter. We're using version 8 of the software. It's not new software that started out yesterday: it's very old and familiar software. So how can it be that when we copy mapplets (ETL processes, for those of you who don't know Informatica) from one repository to another, some lines are just deleted from the mappings? After that you check your dimensions in MS-OLAP and you don't understand what happened there. A whole level in a big dimension that has only one member - 0?? A zero member means null. Yeah, we were right - the line in the mapping had just been deleted by our precious Informatica, so the column is all null.

Well, I'm happy we're through with this. Good night.

P.S.
Tomorrow I'm taking a day off... :-)

P.S. 2

Although many things went wrong in the installation, I really think we had a good block this time. This block contains many beautiful things in MS-OLAP, MDX and Informatica. You'll see them here in the next few days, after I calm down. :-)

Monday, December 31, 2007 6:23:16 AM (Jerusalem Standard Time, UTC+02:00)

Wednesday, December 19, 2007

My team master Yaron asked me to check some things in the Panorama Dashboards:

1. Can I have two hands in one gauge?
2. Can I show two values in the text of every gauge?

Here are the answers. I think that the second answer is a beautiful one. In fact, I really enjoyed working out how to do this.

1. This is simple: Just use the Goal hand as the second hand. In the KPI Wizard go to the Define Goal step and choose Custom formula. Enter the measure you want to see in the second hand.

2. This is beautiful: In the KPI Wizard, go to the Finish step and to the Title part. Click on the little blue arrow and click on "Edit MDX...". Then, write this MDX:

[My Dimension].[My Hierarchy].CurrentMember.Name + '\n' +
[Measures].[First Measure].Name + ': ' +
Generate({[My Dimension].[My Hierarchy].CurrentMember},[Measures].[First Measure]) + '\n' +
[Measures].[Second Measure].Name + ': ' +
Generate({[My Dimension].[My Hierarchy].CurrentMember},[Measures].[Second Measure])

Note that:

  • This solution may apply to other BI applications, not only to Panorama.
  • This way you can show many values and data, not only two values.
  • What is the Generate function doing there? The '+' operator needs a string on both sides, so writing only [Measures].[First Measure] or [Measures].[First Measure].Value returns a numeric value, which causes an error. Used this way, the Generate function returns a string: for the set (which contains only our member) it evaluates the measure (the second argument) and, as mentioned, returns the result as a string.
  • '\n' starts a new line.
Thursday, December 20, 2007 4:18:35 AM (Jerusalem Standard Time, UTC+02:00)

Sunday, December 09, 2007

Last week I participated in Microsoft's BI conference in Ra'anana, Israel. After the conference I asked myself: What have I really learned today? Well, here is what I remember:

  • Microsoft figured out that the eternal BI tool is and will be Excel. People just love their Excel sheets and they will stay there. This is why the mission is to bring BI into their Excel sheets. Their new product, Excel Services, will manage our Excel sheets in one central place which is connected to our Analysis Services cubes.
  • From my point of view, SQL Server 2008 is mostly a bunch of performance fixes and not really a new product. There are a lot of new "performance features". For example, most of our MDX queries will run faster, especially those that have null cells. The new Cell-By-Cell calculation performance improvements will make these queries run faster. I think that SS2008 could have been one big Service Pack. If I'm wrong, please do leave me a comment.
  • SQL Server 2005 has many products that we don't know well enough. Some products that I need to learn about are: Replication, SQL Server Agent, SQL CLR and more. I do know what they do and have even played with them a little bit, but I want to know how they can help me and improve my work.
  • Many new features in SS2008 come from two old sources: BIDS Helper (an open source SSAS add-in) and, of course, Oracle...
  • My big wish, IntelliSense for Analysis Services, will not be in SSAS2008 and maybe won't arrive at all. This is because guessing in MDX is very hard: there are too many options in every statement you write.
  • We won't need to upgrade to Office 2007 in order to use Excel Services. Only the developers will need it.

This is what I remember for now. I'll update this post if something new comes to mind.

Sunday, December 09, 2007 9:26:17 PM (Jerusalem Standard Time, UTC+02:00)

Sunday, December 02, 2007

In the last post, I explained the architecture of our BI project. The final part of the process is processing an MS-OLAP object (cube/dimension) from an Informatica mapplet. As explained earlier, the trick is to call a stored procedure from the Informatica server. But first there is one more thing to do: how do you connect the Informatica server (Linux) with MS-OLAP (Windows server)?

Informatica ships with a number of drivers that can connect it with other servers. The drivers are called DataDirect, and I'll discuss version 4.20. You need to define this driver on the Informatica server (look in the Informatica knowledge base for more information). This is an easy thing to do. Notice that you have to enter the full server name (including domain) and the password. Remember that if you change the password in the future, the process will fail. You also have to enable the "Named Pipes" protocol on the MS-OLAP server. How do you do this? Open the Configuration Manager on the MS-OLAP server and, in the section of MSSQLSERVER protocols, enable the Named Pipes protocol. This will enable the connection from the Informatica server. On the Informatica server, make a regular ODBC connection.

Here is the code of the SP on the MS-OLAP side. This SP must be in the msdb database on the Database Engine.

ALTER PROCEDURE [dbo].[ProcessObject]
@databaseId varchar(100),
@objectType varchar(100),
@objectId varchar(100),
@login_name varchar(100),
@returnValue int output,
@errorMessage nvarchar(1024) output
AS
BEGIN
declare @jobName varchar(200)
declare @xmla varchar(1000)
declare @jobId binary(16)
declare @ReturnCode int
declare @stop int

--Set job name
set @jobName = 'Process' + @objectType + '_' + @objectId

--Delete the job if already exists
if exists (select * from msdb.dbo.sysjobs where name = @jobName)
exec msdb.dbo.sp_delete_job @job_name = @jobName

--Create the job
Exec msdb.dbo.sp_add_job @job_name=@jobName,
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=0,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'process OLAP object',
@category_name=N'[Uncategorized (Local)]',
@owner_login_name=@login_name, @job_id=@jobId output

exec msdb.dbo.sp_add_jobserver @job_name=@jobName, @server_name=@@SERVERNAME

--Declare XMLA for OLAP object
if (@objectType = 'Cube')
set @xmla = '

' + @dataBaseId + '
' + @objectId + '

ProcessFull
'
else if (@objectType = 'Dim')
set @xmla =



' + @dataBaseId + '
' + @objectId + '

ProcessFull


'
else
Begin
set @returnValue = 0
return @returnValue
End

--Add the job step
Exec msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Process Object',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'ANALYSISCOMMAND',
@command=@xmla,
@server=@@SERVERNAME,
@database_name=N'master',
@flags=0

--Run the job
Execute sp_start_job @jobName

Waitfor delay '00:00:05'
set @returnValue = (select run_status from dbo.sysjobhistory
where job_id = @jobId
and step_id = 1)

-- Loop until the job ends and return its result
set @stop = 0
if @returnValue is null
while @stop <> 1
Begin
set @returnValue = (select run_status from dbo.sysjobhistory
where job_id = @jobId
and step_id = 1)

if @returnValue is not null
set @stop = 1

waitfor delay '00:00:10'
End

--Return error message (if exists)
If @returnValue = 0 --failed
set @errorMessage = (select [message] from dbo.sysjobhistory
where job_id = @jobId
and step_id = 1)
End

Update: I see that the XMLA code went bad in the post because its tags are not recognised HTML. It doesn't matter, I believe you got the point...
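
For readers who want to see the general shape of what got lost: below is a sketch, in Java purely for illustration, of a standard Analysis Services 2005 XMLA Process command built from the same three parameters. The element names (Process, Object, DatabaseID, CubeID, DimensionID, Type) and the namespace are the standard XMLA ones; whether the original SP wrapped the command in additional elements cannot be recovered from the post, so treat this as an assumption and not as the original text. The class and method names are hypothetical.

```java
// Hypothetical helper; shows the standard XMLA Process shape, not the original SP text.
final class XmlaSketch {
    private static final String NS =
        "http://schemas.microsoft.com/analysisservices/2003/engine";

    // Minimal XML escaping so IDs with special characters stay well-formed.
    private static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // objectType uses the same values as the SP above: "Cube" or "Dim".
    static String processCommand(String databaseId, String objectType, String objectId) {
        String idElement;
        if ("Cube".equals(objectType)) {
            idElement = "CubeID";
        } else if ("Dim".equals(objectType)) {
            idElement = "DimensionID";
        } else {
            // The SP returns 0 for unknown types; this sketch throws instead.
            throw new IllegalArgumentException("Unknown object type: " + objectType);
        }
        return "<Process xmlns=\"" + NS + "\">"
             + "<Object>"
             + "<DatabaseID>" + esc(databaseId) + "</DatabaseID>"
             + "<" + idElement + ">" + esc(objectId) + "</" + idElement + ">"
             + "</Object>"
             + "<Type>ProcessFull</Type>"
             + "</Process>";
    }

    public static void main(String[] args) {
        // "MyDb" and "Sales" are placeholder IDs.
        System.out.println(processCommand("MyDb", "Cube", "Sales"));
    }
}
```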

Sunday, December 02, 2007 5:59:42 PM (Jerusalem Standard Time, UTC+02:00)

Saturday, December 01, 2007

In the past I mentioned some fragments of the architecture of our end-to-end BI solution. Now I'll discuss how it is done. I will only write in detail about the things that I did (I mean, developed), but I'll describe the whole picture.

Our architecture goes like this: Control-M -> Informatica -> MS-OLAP (Analysis Services 2005).

In words: Control-M is the most common scheduler in big companies. We use it to schedule our ETL processes in Informatica. Our system team made it possible to start Informatica processes from Control-M. I don't know exactly how it is done. All I know is that Control-M raises a flag in a table, and Informatica scans the table every X seconds and starts the process if it finds the flag that was raised by Control-M. Don't ask me about the technical details - it wasn't my job.

The more interesting thing (because I did it...) is how Informatica calls MS-OLAP and tells it to process a cube. In this part I'll describe the big picture, and in the next one I'll give some of the code and discuss some technical aspects of the process. First, the Informatica mapping moves the data from the source to the target, which is the dimension or fact table, just like it always does. After that, Informatica calls a stored procedure on the MS-OLAP server which processes the cube. Informatica calls this SP with some parameters, including the type of object to process (cube/dimension), its ID and some more parameters. In return, MS-OLAP sends back a return code (indicating whether the processing succeeded) and a message describing the error, if one occurred.

How does the SP process the cube/dimension? Unfortunately, there is no built-in SP that can process an OLAP object, so I needed to use the following steps in my SP:

  1. Delete any existing job that does the same action (read on, you'll understand)
  2. Create an empty job
  3. Add a step to that job that will process the object. This step contains XMLA code that contains the parameters that were given from Informatica
  4. Run the job
  5. Loop until the job (or process) ends and send back the return code and the error message, if one exists.
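
Step 5 is the only step that waits on something, so here is a minimal sketch of it as a generic polling loop, written in Java only to show the control flow. The status source stands in for the job history lookup and returns null while the job is still running; the timeout is my own addition and an assumption, as are all the names.

```java
import java.util.function.Supplier;

// Hypothetical model of "loop until the job ends"; not tied to any real API.
final class JobPollSketch {
    // Polls statusSource until it yields a non-null status or the timeout elapses.
    // Returns the status, or null if the job did not finish in time.
    static Integer waitForStatus(Supplier<Integer> statusSource,
                                 long pollMillis, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            Integer status = statusSource.get();
            if (status != null) {
                return status; // job finished, status is its return code
            }
            if (System.currentTimeMillis() >= deadline) {
                return null; // give up instead of looping forever
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return null;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake job: still running on the first two polls, then reports 1.
        Integer status = waitForStatus(() -> ++calls[0] < 3 ? null : 1, 10, 1000);
        System.out.println(status + " after " + calls[0] + " polls");
    }
}
```

Without a deadline, a job that never writes its status would keep the caller waiting indefinitely, which is why the sketch returns null after the timeout and leaves it to the caller to treat that as a failure.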

In the next part I'll write some of the code and discuss some technical issues.

 

Sunday, December 02, 2007 5:19:47 AM (Jerusalem Standard Time, UTC+02:00)