Tuesday, October 12, 2010

After two months of writing only about OBIEE, it's time to write about what I did besides that. The main reason I have not written on anything else is a load of work on the HR BI system as a manager. In the future posts I will discuss the BI system that have taken my precious time lately and I will review the solution bottom up, i.e. from the pricing stage, going through the data warehouse structure to the implementation of the ETL and reports, of course. In this post I'll write about pricing and prioritization .

Part I - Pricing

I'll start with several leading questions in order to stimulate thinking.
When we start estimating prices of the BI project, how do we do it? The intuition of most of us (so I believe) is to get as much initial information about the project's content world as we can: What are the main problems, what is the result the customer wants to get, etc. Then we estimate the cost of the project, just like that. This rises the question of how accurate the assessment we've got? Furthermore, frankly - what is it based on? Past experience? Who's? Do we know how to explain it to our colleagues? Can update it when there are material changes to the requirements? If yes - how?
The purpose of this post is formalizing, even a little bit, the process of pricing the project. In the second part we will also prioritize.
Pessimist: How can it help? After all, in every case it is estimation. There are a lot of parameters in the project and we can't put every single one into the equation and even if it was possible, most are unknown parameters in the initial phase of pricing!
Miky: That's right, but the approximation can be done in a formal way and based on experience of others. It is much better than just a wild guess of numbers.
So we realized what we need this for. Let's start.

We want to price the project in terms of time (time can also be translated into money, of course). To start, we need to know and be familiar with the various processes in the organization that we can implement. To do this we will have to sit down with key people of the organization to its various parts (departments, sections, you name it) and just sit and learn. After we learned the various processes (converting these "conversations" to a list of processes is the result of experience and talent in information systems and BI systems in particular, So you may want to get help if you are new in this field) open Excel and write a list of all processes in one column. Then, we add the following columns:

  • Content world - column that contains the name of the process or content world
  • Short description - so those who work after us will understand what we meant
  • Times (that are required) - whether the analysis of process is required in the past, present or future (of course there could be a combination of them). This column is critical because any change in this could change dramatically the length of the project
  • Priority in HQ - how much they interested in this process. Number from 1 to 5.
  • Priority in field - how much they (intermediate level managers to the production floor) are interested in the implementation of the process. 1 to 5.
  • Source Systems - which systems are required to bring the information required to analyzing the process
  • Complexity of bringing this data - based on previous column, how complex is to "put your hands on data." The difficulties can be technological (legacy systems, systems with no interface exists, unknown technologies) or political (the DBA of the system does not cooperate, there is already a "mini-BI system" and they don't want to replace it) 1 to 5
  • Complexity of the process - with no relation to the previous column, how much the process is complex. To simplify the matter, what will be the length of the design document? 1 to 5
  • Additional notes
In this simple excel sheet we've gathered the raw data with which we will analyze the estimated duration of the project. It is quite possible that at a later stage of the project we'll go back and update this data sheet.

Now let's take a moment and get out from our project's scope and think big.
Objective: Build a costs matrix for each part / module of the project. Let's see an example:

Easy Medium Difficult
Preliminary Design 2 5 10
Detailed Design 2 5 12
ERD creation 1 1 1
ETL - Dimension 1 3 7
ETL - Fact 1 6 15
Reporting Model 1 6 10
OLAP Project 1 3 7
Reports 0.3 0.5 1
OLAP Reports 0.3 0.5 1.5
Web Site 2 10 20
Security 3 6 10

The numbers are in terms of days (not hours). For example, the construction of report in medium difficulty level will take about half a day. It is important to emphasize that this is actual time and not including any buffers, but it is including unit tests. The buffers will be added later (those who read the book "Critical Chain" know how catastrophic it is to add the buffers now). Let's go on: we'll take the first process that we consider implementing and write its quantities matrix:

Process #1
Easy Medium Difficult
Preliminary Design 1
Detailed Design 1 1
ERD creation 1
ETL - Dimension 1
ETL - Fact 1 1
Reporting Model 1
OLAP Project 1
Reports 1
OLAP Reports 1
Web Site 1
Security 1

Pessimist: Wait a minute. I learned about the WBS methodology in which I take any part / module in the project and break it into its atomic elements. What's the difference?
Miky: Nothing at all and very much. We also break the big problem into smaller parts, but here we utilize the fact that the components are identical. Every project is composed of those parts - design, ETL, dimensions, cubes, reports etc.

After we multiplied the matrixes and summarized the result we got the estimated cost for the first process. In our case we get 53 (after rounding) days. Let's call this number M. Now we add a few more fixed costs of our process:

  • Developer Tests - 6
  • User tests - 2
  • Installation - 2
In addition to these costs there are also constants to be added to the equation:
  • Gross - 1.3 (vacations, illnesses, etc.)
  • Management - 1.15 (time spent on the project manager's job)
  • Buffer - 1.1
These fixed values are the result of experience and of course they will be updated along time and the various projects. We'll multiply M with the constants, and then we add the fixed costs. Now we got 97.16 days. Divide by 20 (work days per month) and we get 4.8 months. It's the cost for the implementation of the first process. It should be emphasized that this number is in terms of work, which means theoretically that if you assign two people to work on the project it will last only 2.4 months. It is theoretically and not exact for the simple reason that there are constraints and dependencies in the development process (let's not go into that now).
Doing this for every process, we'll get the estimated time for every process to implement.
Pessimist: Sorry to bother, but... Is it not too much work just to get the pricing? A skilled manager will guess pretty much the same numbers, right?
Miky: First of all - not sure. Second and more important - this method gives the manager and the entire team a common language for schedule and pricing. So it's important to write the matrix with the developers or at least review it with them. It is important to make sure the team knows the matrix and its implications for the schedule. In addition, this method also provides a common language with the customer. If the client wants to understand why the project will take so much time / money you can sit down and explain. If the client wants to reduce costs you can talk with him in terms of Dimensions and Facts, make a simulation and see how much the reducting of a one difficult dimension will spare the entire project cost. Nice, isn't it?
 
Part Two - Prioritization
 
As before, we'll start with the basic questions: how we decide which process we'll implement first? Will it be the process that the CEO wants to see implemented first? Maybe the process that will provide the maximum benefit for the entire organization? Perhaps the easiest and the most simple?
Now that we have the data in frost of our eyes, we repeat the question from which we started: how we decide which one we'll implement first? Here's the answer:
Explanation: The numbers in parentheses is the work in terms of months. Axis X is ease of implementation (not difficulty, for convenience) and it is obtained by weighting the data reception complexity from the source systems and complexity of the process (you can weight these numbers with your own custom formula). The Y axis is the priority, where I gave a factor of 0.8 to the HQ and 0.2 to the field's priority, but even here you can weigh as appropriate for your organization. This graph/map is definitely a product you can show your client or decision-makers and sit down together to decide where to start. Conventional wisdom says to start from the upper-right quadrant (most important and easiest) but of course there are additional considerations that do not appear here. For example, there is a tiny and unimportant consideration such as ...How much time do we have???
 
Is it the end of our work?
Of course not! The costs matrix requires maintenance, but over time it will become more and more accurate and stable. The project manager should monitor the progress schedule to know how long or rather how much work spent in each stage of the development. Sometimes we'll find that the way we written the quantities-matrix is not good and in that case we will come and update it. In addition, over time we'll find that the values in the pricing matrix are decreasing due to specialization and skills acquisition of the development team.
 
Thanks for Arkady Gurevich, my team master who guided me in implementing this idea and helped the project in general.
 
Now that we decided with which processes we'll start the project, we can move forward. See you in future posts where we start to plan the HR BI system.
 |  | 
Tuesday, October 12, 2010 11:46:45 AM (Jerusalem Standard Time, UTC+02:00)