<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:pingback="http://madskills.com/public/xml/rss/module/pingback/" version="2.0">
  <channel>
    <title>Miky Schreiber's Blog - BI|Informatica</title>
    <link>http://www.miky-schreiber.com/Blog/</link>
    <description>Business Intelligence, Analysis Services, MDX, DataWarehousing and more...</description>
    <language>en-us</language>
    <copyright>Miky Schreiber</copyright>
    <lastBuildDate>Thu, 15 Jul 2010 12:37:52 GMT</lastBuildDate>
    <generator>newtelligence dasBlog 1.9.6264.0</generator>
    <managingEditor>miky@zahav.net.il</managingEditor>
    <webMaster>miky@zahav.net.il</webMaster>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=a6d2ee8b-7426-419f-a069-8b31fa2b5923</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,a6d2ee8b-7426-419f-a069-8b31fa2b5923.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,a6d2ee8b-7426-419f-a069-8b31fa2b5923.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=a6d2ee8b-7426-419f-a069-8b31fa2b5923</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <img style="MARGIN: 0px 1em 1em" src="http://www.miky-schreiber.com/Blog/content/binary/stop_signs.JPG" align="left" border="0" />
        <p>
No-one in my organization knew what is the difference between stop and abort in the
Informatica Workflow Monitor. I googled it, and here is the answer in short term:
Abort is killing the process in the underlying operating system, meaning that the
memory that the session took for itself won't be released. Most of the OS don't have
good-enough garbage collectors, meaning that this memory allocation won't be free
until you'll restart the whole server(!). The stop command will ask the session to
quit whenever it can, meaning that this will take more time.
</p>
        <p>
Use the abort command only if you must stop the session right now. If you see that
over time your informatica server goes slower and slower, restart the server in the
weekend.
</p>
        <p>
You can see the full answer <a href="http://datawarehouse.ittoolbox.com/documents/difference-between-stop-and-abort-in-informatica-12862">here</a>.
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=a6d2ee8b-7426-419f-a069-8b31fa2b5923" />
      </body>
      <title>The difference between stop and abort in Informatica</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,a6d2ee8b-7426-419f-a069-8b31fa2b5923.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,a6d2ee8b-7426-419f-a069-8b31fa2b5923.aspx</link>
      <pubDate>Thu, 15 Jul 2010 12:37:52 GMT</pubDate>
      <description>&lt;img style="MARGIN: 0px 1em 1em" src="http://www.miky-schreiber.com/Blog/content/binary/stop_signs.JPG" align=left border=0&gt; 
&lt;p&gt;
No-one in my organization knew what is the difference between stop and abort in the
Informatica Workflow Monitor. I googled it, and here is the answer in short term:
Abort is killing the process in the underlying operating system, meaning that the
memory that the session took for itself won't be released. Most of the OS don't have
good-enough garbage collectors, meaning that this memory allocation won't be free
until you'll restart the whole server(!). The stop command will ask the session to
quit whenever it can, meaning that this will take more time.
&lt;/p&gt;
&lt;p&gt;
Use the abort command only if you must stop the session right now. If you see that
over time your informatica server goes slower and slower, restart the server in the
weekend.
&lt;/p&gt;
&lt;p&gt;
You can see the full answer &lt;a href="http://datawarehouse.ittoolbox.com/documents/difference-between-stop-and-abort-in-informatica-12862"&gt;here&lt;/a&gt;.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=a6d2ee8b-7426-419f-a069-8b31fa2b5923" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,a6d2ee8b-7426-419f-a069-8b31fa2b5923.aspx</comments>
      <category>BI;BI/Informatica</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=c924d9dd-d74a-42cf-ab0f-34b58e6d5404</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,c924d9dd-d74a-42cf-ab0f-34b58e6d5404.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,c924d9dd-d74a-42cf-ab0f-34b58e6d5404.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=c924d9dd-d74a-42cf-ab0f-34b58e6d5404</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
This tip can save you a lot of time because when you don't follow it or you're not
aware of it, finding the bug will take you a lot of time.
</p>
        <p>
When you build your workflow or worklet in Informatica, you usually define dependencies
between the tasks, meaning that task A will start only after task B finishes. You
do it using the "Link Tasks" button in the toolbar. The issue is when you define for
example that task C will run only after both task A and task B will finish.
In that case, linking the tasks is not enough. You need to edit task C and choose
"AND" in the "Treat the input links as" box. The default is "OR", meaning that the
task will run when either task A or task B finishes. This can cause you a lot of trouble
if task C is using data from task A and B (otherwise, why there's dependency between
them?).
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=c924d9dd-d74a-42cf-ab0f-34b58e6d5404" />
      </body>
      <title>Tasks Dependency in Informatica</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,c924d9dd-d74a-42cf-ab0f-34b58e6d5404.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,c924d9dd-d74a-42cf-ab0f-34b58e6d5404.aspx</link>
      <pubDate>Tue, 19 Jan 2010 14:40:13 GMT</pubDate>
      <description>&lt;p&gt;
This tip can save you a lot of time because when you don't follow it or you're not
aware of it, finding the bug will take you a lot of time.
&lt;/p&gt;
&lt;p&gt;
When you build your workflow or worklet in Informatica, you usually define dependencies
between the tasks, meaning that task A will start only after task B finishes. You
do it using the "Link Tasks" button in the toolbar. The issue is when you define for
example that task C will run only after both&amp;nbsp;task A and task B&amp;nbsp;will finish.
In that case, linking the tasks is not enough. You need to edit task C and choose
"AND" in the "Treat the input links as" box. The default is "OR", meaning that the
task will run when either task A or task B finishes. This can cause you a lot of trouble
if task C is using data from task A and B (otherwise, why there's dependency between
them?).
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=c924d9dd-d74a-42cf-ab0f-34b58e6d5404" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,c924d9dd-d74a-42cf-ab0f-34b58e6d5404.aspx</comments>
      <category>BI;BI/Informatica</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=801cefae-7de4-4e1c-9ad4-115c70c2813e</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,801cefae-7de4-4e1c-9ad4-115c70c2813e.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,801cefae-7de4-4e1c-9ad4-115c70c2813e.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=801cefae-7de4-4e1c-9ad4-115c70c2813e</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
          <img style="MARGIN: 0px 1em 1em" height="50%" src="http://www.miky-schreiber.com/blog/content/binary/horseshoe.jpg" width="25%" align="left" />In
both SSIS and Informatica, the union transformation doesn't delete duplicate rows.
In SSIS it's clear because the transformation is called "Union all", which reminds
us the union all statement in SQL which doesn't remove duplicate rows. The confusing
part is in Informatica, where the transformation is called Union although it doesn't
remove duplicates. One of my teammates got confused this week so I thought it's worth
blogging about. 
<br /><br />
One more tip about ETLs with union transformations: When you'll investigate, maintain
or fix ETL it will be much help if you'll know where each row came from. That's why
I recommend adding a column named "src" or something like that and store there the
source of the row before the union happened.
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=801cefae-7de4-4e1c-9ad4-115c70c2813e" />
      </body>
      <title>Union Transformations and Duplicate Rows</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,801cefae-7de4-4e1c-9ad4-115c70c2813e.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,801cefae-7de4-4e1c-9ad4-115c70c2813e.aspx</link>
      <pubDate>Mon, 09 Nov 2009 14:41:31 GMT</pubDate>
      <description>&lt;p&gt;
&lt;img style="MARGIN: 0px 1em 1em" height="50%" src="http://www.miky-schreiber.com/blog/content/binary/horseshoe.jpg" width="25%" align=left&gt;In
both SSIS and Informatica, the union transformation doesn't delete duplicate rows.
In SSIS it's clear because the transformation is called "Union all", which reminds
us the union all statement in SQL which doesn't remove duplicate rows. The confusing
part is in Informatica, where the transformation is called Union although it doesn't
remove duplicates. One of my teammates got confused this week so I thought it's worth
blogging about. 
&lt;br&gt;
&lt;br&gt;
One more tip about ETLs with union transformations: When you'll investigate, maintain
or fix ETL it will be much help if you'll know where each row came from. That's why
I recommend adding a column named "src" or something like that and store there the
source of the row before the union happened.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=801cefae-7de4-4e1c-9ad4-115c70c2813e" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,801cefae-7de4-4e1c-9ad4-115c70c2813e.aspx</comments>
      <category>BI;BI/Informatica;BI/SQL Server 2005;BI/SQL Server 2005/Integration Services;BI/SQL Server 2008;BI/SQL Server 2008/Integration Services</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=03a8f9ca-790a-4e8a-91e3-f25ea1a32c94</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,03a8f9ca-790a-4e8a-91e3-f25ea1a32c94.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,03a8f9ca-790a-4e8a-91e3-f25ea1a32c94.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=03a8f9ca-790a-4e8a-91e3-f25ea1a32c94</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <img style="MARGIN: 0px 1em 1em" src="http://www.miky-schreiber.com/blog/content/binary/policeman.jpg" align="left" />
        <p>
Lucky for me, I have the opportunity to work with more than one ETL tool in my daily
work. In the far past, I worked much with DTS and PL/SQL packages (which is an ETL,
but not exactly a tool). Nowadays I work both with SSIS and Informatica so I
can compare them in several aspects. I don't think that one is better than the other.
It's just that they have different approaches to the ETL mechanism. Anyway, I always
expect that the data transformations will behave the same in every tool and I
found one transformation that behaves differently between the tools - the Router.
</p>
        <p>
The Informatica's Router, if you're not familiar with, receives a data row and
routes it a specific data flow. Its parallel transformation in SSIS is the Conditional
Split. As I said, I always thought that they behave the same. In both transformations
you provide boolean expressions that determine where the row will be routed. The difference
is that in SSIS, "<em>each input row can be sent to only one output, that being the
output for the first condition that evaluates to true</em>" (from SSIS's documentation).
In programmer's words, it behaves like switch-case statement with break in each condition.
In Informatica, <em>"If a row meets more than one group filter condition, the Integration
Service passes this row multiple times</em>" (from Informatica's documentation). Back
to the programmer's language, it's like switch-case statement with no break inside
the conditions.
</p>
        <p>
If you know other transformation that behaves different between ETL tools, I'll be
happy to know.
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=03a8f9ca-790a-4e8a-91e3-f25ea1a32c94" />
      </body>
      <title>Different Router behaviour in SSIS and Informatica</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,03a8f9ca-790a-4e8a-91e3-f25ea1a32c94.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,03a8f9ca-790a-4e8a-91e3-f25ea1a32c94.aspx</link>
      <pubDate>Tue, 27 Oct 2009 14:10:34 GMT</pubDate>
      <description>&lt;img style="MARGIN: 0px 1em 1em" src="http://www.miky-schreiber.com/blog/content/binary/policeman.jpg" align=left&gt; 
&lt;p&gt;
Lucky for me, I have the opportunity to work with more than one ETL tool in my daily
work. In the far past, I worked much with DTS and PL/SQL packages (which is an ETL,
but not exactly a tool). Nowadays I work both with&amp;nbsp;SSIS and Informatica so I
can compare them in several aspects. I don't think that one is better than the other.
It's just that they have different approaches to the ETL mechanism. Anyway, I always
expect that the data transformations will behave the same in every tool&amp;nbsp;and I
found one transformation that behaves differently between the tools - the Router.
&lt;/p&gt;
&lt;p&gt;
The&amp;nbsp;Informatica's Router, if you're not familiar with, receives a data row and
routes it a specific data flow. Its parallel transformation in SSIS is the Conditional
Split. As I said, I always thought that they behave the same. In both transformations
you provide boolean expressions that determine where the row will be routed. The difference
is that in SSIS, "&lt;em&gt;each input row can be sent to only one output, that being the
output for the first condition that evaluates to true&lt;/em&gt;" (from SSIS's documentation).
In programmer's words, it behaves like switch-case statement with break in each condition.
In Informatica, &lt;em&gt;"If a row meets more than one group filter condition, the Integration
Service passes this row multiple times&lt;/em&gt;" (from Informatica's documentation). Back
to the programmer's language, it's like switch-case statement with no break inside
the conditions.
&lt;/p&gt;
&lt;p&gt;
If you know other transformation that behaves different between ETL tools, I'll be
happy to know.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=03a8f9ca-790a-4e8a-91e3-f25ea1a32c94" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,03a8f9ca-790a-4e8a-91e3-f25ea1a32c94.aspx</comments>
      <category>BI;BI/Informatica;BI/SQL Server 2005;BI/SQL Server 2005/Integration Services;BI/SQL Server 2008;BI/SQL Server 2008/Integration Services</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=0e4490ed-7f54-4883-b542-2595a0d95323</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,0e4490ed-7f54-4883-b542-2595a0d95323.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,0e4490ed-7f54-4883-b542-2595a0d95323.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=0e4490ed-7f54-4883-b542-2595a0d95323</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
The error function in Informatica can be very useful when you want to know why you
have failed rows without failing the whole process. The use of the function is ERROR(string).
Calling this function will make the error appear in the session log and the row will
be ignored. For example: IIF(my_date_field &gt; sysdate, ERROR('Futuristic Date'),
my_date_field). This will ignore futuristic dates and show them only in the session
log. By the way, you can declare default value for the port and the process will replace
the ERROR with the default output.
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=0e4490ed-7f54-4883-b542-2595a0d95323" />
      </body>
      <title>Using the Error function in Informatica</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,0e4490ed-7f54-4883-b542-2595a0d95323.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,0e4490ed-7f54-4883-b542-2595a0d95323.aspx</link>
      <pubDate>Thu, 09 Jul 2009 11:19:26 GMT</pubDate>
      <description>&lt;p&gt;
The error function in Informatica can be very useful when you want to know why you
have failed rows without failing the whole process. The use of the function is ERROR(string).
Calling this function will make the error appear in the session log and the row will
be ignored. For example: IIF(my_date_field &amp;gt; sysdate, ERROR('Futuristic Date'),
my_date_field). This will ignore futuristic dates and show them only in the session
log. By the way, you can declare default value for the port and the process will replace
the ERROR with the default output.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=0e4490ed-7f54-4883-b542-2595a0d95323" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,0e4490ed-7f54-4883-b542-2595a0d95323.aspx</comments>
      <category>BI;BI/Informatica</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=cbc9ec2a-6745-411d-ba63-19561dc9f411</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,cbc9ec2a-6745-411d-ba63-19561dc9f411.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,cbc9ec2a-6745-411d-ba63-19561dc9f411.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=cbc9ec2a-6745-411d-ba63-19561dc9f411</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
There are some things you can only learn in the hard way. It didn't happenned
to me personally but to my team friends, so I consider it as it is my bad.
</p>
        <p>
We upgraded our ETL tool - Informatica, from version 8.5 to 8.6. We had to run
some tests to see that the results are the same. So, what we did is to save the result
table from 8.5 in Excel, save the result table from 8.6 in Excel and then compare
them using Excel-built-in functions. The only problem is we didn't pay attention to
the places where zero and null interchanged. This happened because the two versions
act differently where null values take places in aggregation functions. For example,
when there is a sum function and it aggregates only on null values. In one version
the output is zero and in the other the output is null... 
</p>
        <p>
This can also happen in other tools and technology. For example, in OLAP cubes the
difference between zero and null is the difference between seeing the member of the
dimension on the screen and not knowing of its existence.
</p>
        <p>
For conclusion, always be aware to this point and don't forget to check it.
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=cbc9ec2a-6745-411d-ba63-19561dc9f411" />
      </body>
      <title>The Difference between null and zero</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,cbc9ec2a-6745-411d-ba63-19561dc9f411.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,cbc9ec2a-6745-411d-ba63-19561dc9f411.aspx</link>
      <pubDate>Sun, 22 Mar 2009 21:09:59 GMT</pubDate>
      <description>&lt;p&gt;
There are some things you can only learn&amp;nbsp;in&amp;nbsp;the hard way. It didn't happenned
to me personally but to my team friends, so I consider it as it is my bad.
&lt;/p&gt;
&lt;p&gt;
We upgraded our ETL tool&amp;nbsp;- Informatica, from version 8.5 to 8.6. We had to&amp;nbsp;run
some tests to see that the results are the same. So, what we did is to save the result
table from 8.5 in Excel, save the result table from 8.6 in Excel and then compare
them using Excel-built-in functions. The only problem is we didn't pay attention to
the places where zero and null interchanged. This happened because the two versions
act differently where null values take places in aggregation functions. For example,
when there is a sum function and it aggregates only on null values. In one version
the output is zero and in the other the output is null... 
&lt;/p&gt;
&lt;p&gt;
This can also happen in other tools and technology. For example, in OLAP cubes the
difference between zero and null is the difference between seeing the member of the
dimension on the screen and not knowing of its existence.
&lt;/p&gt;
&lt;p&gt;
For conclusion, always be aware to this point and don't forget to check it.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=cbc9ec2a-6745-411d-ba63-19561dc9f411" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,cbc9ec2a-6745-411d-ba63-19561dc9f411.aspx</comments>
      <category>BI;BI/DataWarehousing;BI/Informatica;BI/SQL Server 2005;BI/SQL Server 2005/Analysis Services;BI/SQL Server 2008;BI/SQL Server 2008/Analysis Services</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=d2331638-e60a-45c7-9f5b-8becf2483179</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,d2331638-e60a-45c7-9f5b-8becf2483179.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,d2331638-e60a-45c7-9f5b-8becf2483179.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=d2331638-e60a-45c7-9f5b-8becf2483179</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
In the previous parts (<a href="http://www.miky-schreiber.com/Blog/PermaLink,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx">1</a>, <a href="http://www.miky-schreiber.com/Blog/PermaLink,guid,d6688032-fdd3-46a7-aff9-bf647b069535.aspx">2</a>) I
showed how to connect Informatica with MS-OLAP, meaning that a mapplet can process
cube or dimension. The thing is that I focused on the side of MS-OLAP. In the second
part I even wrote the T-SQL code itself. Now I want to close the loop by describing
what's going on in the Informatica side. This part was made by my friend, Alex, who
permitted me to post here about what he did.
</p>
        <p>
First of all, there's a table which contains the parameters to call with to the MS-OLAP
procedure (object id, type, user name, etc.). This table is the source (&amp; source
qualifier, of course) of the mapplet. Each row in this table calls the stored procedure
in the MS-OLAP side (in fact, the procedure is part of the relational DB, but never
mind now). The call to the SP is made with Informatica's Procedure block. The connection
is a regular ODBC connection, as mentioned in the previous part. Now for the interesting
part: In the mapplet, the result of the procedure (zero for success, one for failure)
goes into a Java Transformation block. This java block will fail the mapplet if one
or more procedure calls returned failure.
</p>
        <p>
How to do this java block? Double click on it to enter its properties. Go to the "Java
Code" tab. There you'll see tab for every event in this block's life cycle. Here is
the code for every tab (only the relevant ones):
</p>
        <p>
Helper Code:
</p>
static int errorCounter = 0;<br />
static Object lock = new Object();<br /><p>
On Input Row:
</p>
if (returnValue != 0)<br />
{<br />
 synchronized(lock)<br />
  {<br />
   errorCounter++;<br />
  }<br />
}<br /><p>
On End of Data: 
</p><p>
synchronized(lock)<br />
{<br />
 if (errorCounter &gt; 0)<br />
  {<br />
   failSession("OLAP Objects failed");<br />
  }<br />
} 
</p><p>
Note that:
</p><ul><li>
I'm not sure that the lock mechanism is required here. sync, lock, semaphore, etc.
mechanisms are often used when atomic write is needed in order to solve problems
like deadlocks, mutual write, etc. Here I simply don't care. Even if two parallel
threads will read the errorCounter as zero and they both will increase it to one (when
in fact it needs to have the value of two) it won't be a bug because the session will
fail anyway. Alex &amp; I need to talk about this point...</li><li>
failSession is a function which is part of Informatica's API. As you might guess,
it will fail the whole mapplet.</li><li>
Very important: Calling all the MS-OLAP objects at once will cause an error in the
Analysis Services server and all the objects will be in the state of Unprocessed.
The Informatica side has to call the dimensions first and only then the cubes. The
cubes must not be called all at once if they have relationships between them. This
will cause deadlock too.<br /></li></ul><img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=d2331638-e60a-45c7-9f5b-8becf2483179" /></body>
      <title>Connecting Informatica with MS-OLAP - Part III</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,d2331638-e60a-45c7-9f5b-8becf2483179.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,d2331638-e60a-45c7-9f5b-8becf2483179.aspx</link>
      <pubDate>Mon, 07 Jan 2008 20:35:17 GMT</pubDate>
      <description>&lt;p&gt;
In the previous parts (&lt;a href="http://www.miky-schreiber.com/Blog/PermaLink,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx"&gt;1&lt;/a&gt;, &lt;a href="http://www.miky-schreiber.com/Blog/PermaLink,guid,d6688032-fdd3-46a7-aff9-bf647b069535.aspx"&gt;2&lt;/a&gt;)&amp;nbsp;I
showed how to connect Informatica with MS-OLAP, meaning that a mapplet can process
cube or dimension. The thing is that I focused on the side of MS-OLAP. In the second
part I even&amp;nbsp;wrote the T-SQL code itself. Now I want to close the loop by describing
what's going on in the Informatica side. This part was made by my friend, Alex, who
permitted me to post here about what he did.
&lt;/p&gt;
&lt;p&gt;
First of all, there's a table which contains the parameters to call with to the MS-OLAP
procedure (object id, type, user name, etc.). This table is the source (&amp;amp; source
qualifier, of course) of the mapplet. Each row in this table calls the stored procedure
in the MS-OLAP side (in fact, the procedure is part of the relational DB, but never
mind now). The call to the SP is made with Informatica's Procedure block. The connection
is a regular ODBC connection, as mentioned in the previous part. Now for the interesting
part: In the mapplet, the result of the procedure (zero for success, one for failure)
goes into a Java Transformation block. This java block will fail the mapplet if one
or more procedure calls returned failure.
&lt;/p&gt;
&lt;p&gt;
How to do this java block? Double click on it to enter its properties. Go to the "Java
Code" tab. There you'll see tab for every event in this block's life cycle. Here is
the code for every tab (only the relevant ones):
&lt;/p&gt;
&lt;p&gt;
Helper Code:
&lt;/p&gt;
static int errorCounter = 0;&lt;br&gt;
static Object lock = new Object();&lt;br&gt;
&lt;p&gt;
On Input Row:
&lt;/p&gt;
if (returnValue != 0)&lt;br&gt;
{&lt;br&gt;
&amp;nbsp;synchronized(lock)&lt;br&gt;
&amp;nbsp;&amp;nbsp;{&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;errorCounter++;&lt;br&gt;
&amp;nbsp;&amp;nbsp;}&lt;br&gt;
}&lt;br&gt;
&lt;p&gt;
On End of Data: 
&lt;p&gt;
synchronized(lock)&lt;br&gt;
{&lt;br&gt;
&amp;nbsp;if (errorCounter &amp;gt; 0)&lt;br&gt;
&amp;nbsp;&amp;nbsp;{&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;failSession("OLAP Objects failed");&lt;br&gt;
&amp;nbsp;&amp;nbsp;}&lt;br&gt;
} 
&lt;/p&gt;
&lt;p&gt;
Note that:
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
I'm not sure that the lock mechanism is required here. sync, lock, semaphore, etc.
mechanisms&amp;nbsp;are often used when atomic write is needed in order to solve problems
like deadlocks, mutual write, etc. Here I simply don't care. Even if two parallel
threads will read the errorCounter as zero and they both will increase it to one (when
in fact it needs to have the value of two) it won't be a bug because the session will
fail anyway. Alex &amp;amp; I need to talk about this point...&lt;/li&gt;
&lt;li&gt;
failSession is a function which is part of Informatica's API. As you might guess,
it will fail the whole mapplet.&lt;/li&gt;
&lt;li&gt;
Very important: Calling all the MS-OLAP objects at once will cause an error in the
Analysis Services server and all the objects will be in the state of Unprocessed.
The Informatica side has to call the dimensions first and only then the cubes. The
cubes must not be called all at once if they have relationships between them. This
will cause deadlock too.&lt;br&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=d2331638-e60a-45c7-9f5b-8becf2483179" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,d2331638-e60a-45c7-9f5b-8becf2483179.aspx</comments>
      <category>BI;BI/Informatica;BI/SQL Server 2005;BI/SQL Server 2005/Analysis Services</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=ff609464-c3bf-4ed0-ad38-7488832ac9cc</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,ff609464-c3bf-4ed0-ad38-7488832ac9cc.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,ff609464-c3bf-4ed0-ad38-7488832ac9cc.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=ff609464-c3bf-4ed0-ad38-7488832ac9cc</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
Just got home. Most of my day (and my co-worker's also) went on with a big installation
of the second block of our BI project. In the morning we really thought that maybe
this time, yeah - just this time things will go better. After more than 12 hours I
laughing at myself: How could I be so naive? Many things that could go bad just did
but after it all ended (with a happy ending, otherwise I wouldn't be here, writing
in my home sweet home) I can say that the big blame is on Informatica PowerCenter.
We're using version 8 of the software. It's not new software that started its way
yesterday: It's a very old and familiar software. So how can it be that when we copy
mapplets (ETL processes, for those of you who don't know Informatica) from one repository
to another, some lines are just deleted from the mappings? After that you check your
dimensions in MS-OLAP and you don't understand what happened there. A whole level
in a big dimension that has only one member - 0 ?? Zero member is null. Yeah, we were
right - the line in the mapping just been deleted by our precious Informatica so the
column is all null.
</p>
        <p>
Well, I happy we're through with this. Good night.
</p>
        <p>
P.S.<br />
Tomorrow I'm taking a day off... :-)
</p>
        <p>
P.S 2
</p>
        <p>
Although many things went wrong in the installation, I really think we had a good
block this time. This block contains many beautiful things in MS-OLAP, MDX and Informatica.
You'll see it here in the next few days, after I'll calm down. :-)
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=ff609464-c3bf-4ed0-ad38-7488832ac9cc" />
      </body>
      <title>I'm so tired. Because of Informatica.</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,ff609464-c3bf-4ed0-ad38-7488832ac9cc.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,ff609464-c3bf-4ed0-ad38-7488832ac9cc.aspx</link>
      <pubDate>Sun, 30 Dec 2007 21:23:16 GMT</pubDate>
      <description>&lt;p&gt;
Just got home. Most of my day (and my co-worker's also) went on with a big installation
of the second block of our BI project. In the morning we really thought that maybe
this time, yeah - just this time things will go better. After more than 12 hours I
laughing at myself: How could I be so naive? Many things that could go bad just did
but after it all ended (with a happy ending, otherwise I wouldn't be here, writing
in my home sweet home) I can say that the big blame is on Informatica PowerCenter.
We're using version 8 of the software. It's not new software that started its way
yesterday: It's a very old and familiar software. So how can it be that when we copy
mapplets (ETL processes, for those of you who don't know Informatica) from one repository
to another, some lines are just deleted from the mappings? After that you check your
dimensions in MS-OLAP and you don't understand what happened there. A whole level
in a big dimension that has only one member - 0 ?? Zero member is null. Yeah, we were
right - the line in the mapping just been deleted by our precious Informatica so the
column is all null.
&lt;/p&gt;
&lt;p&gt;
Well, I happy we're through with this. Good night.
&lt;/p&gt;
&lt;p&gt;
P.S.&lt;br&gt;
Tomorrow I'm taking a day off... :-)
&lt;/p&gt;
&lt;p&gt;
P.S 2
&lt;/p&gt;
&lt;p&gt;
Although many things went wrong in the installation, I really think we had a good
block this time. This block contains many beautiful things in MS-OLAP, MDX and Informatica.
You'll see it here in the next few days, after I'll calm down. :-)
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=ff609464-c3bf-4ed0-ad38-7488832ac9cc" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,ff609464-c3bf-4ed0-ad38-7488832ac9cc.aspx</comments>
      <category>BI;BI/Informatica</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=d6688032-fdd3-46a7-aff9-bf647b069535</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,d6688032-fdd3-46a7-aff9-bf647b069535.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,d6688032-fdd3-46a7-aff9-bf647b069535.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=d6688032-fdd3-46a7-aff9-bf647b069535</wfw:commentRss>
      <slash:comments>2</slash:comments>
      <title>Connecting Informatica with MS-OLAP - Part II</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,d6688032-fdd3-46a7-aff9-bf647b069535.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,d6688032-fdd3-46a7-aff9-bf647b069535.aspx</link>
      <pubDate>Sun, 02 Dec 2007 08:59:42 GMT</pubDate>
      <description>&lt;p&gt;
In the &lt;a href="http://www.miky-schreiber.com/Blog/PermaLink,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx"&gt;last
post&lt;/a&gt;, I explained the architecture of our BI project. The final part of the process
is processing MS-OLAP object (cube/dimension) from Informatica mapplet. As explained
earlier, the trick is to call Stored Procedure from the Informatica server. But first
there is one more thing to do: How do you connect the Informatica server (Linux) with
MS-OLAP (windows server)?
&lt;/p&gt;
&lt;p&gt;
Informatica ships with number of drivers that can connect it with other servers. The
drivers are called DataDirect and I'll discuss 4.20. You need to define this driver
on the Informatica server (look in Informatica knowledge base for more information).
This is an easy thing to do. Notice that you have to enter a full server name (including
domain) and the password. Remember that if you'll change the password in the future
the process will fail. You have to enable the protocol named "Named Pipes" in the
MS-OLAP server. How to do this? Enter the Configuration Manager in the MS-OLAP server
and in the section of MSSQLSERVER protocols enable the Named Pipes protocol. This
will enable the connection from the Informatica server. On the Informatica server,
make a regular ODBC connection.
&lt;/p&gt;
&lt;p&gt;
Here is the code of the SP on the MS-OLAP side. This SP must be on the msdb Database
on the Database engine.
&lt;/p&gt;
&lt;p&gt;
ALTER PROCEDURE [dbo].[ProcessObject]&lt;br&gt;
@databaseId varchar(100),&lt;br&gt;
@objectType varchar(100),&lt;br&gt;
@objectId varchar(100),&lt;br&gt;
@login_name varchar(100),&lt;br&gt;
@returnValue int output,&lt;br&gt;
@errorMessage nvarchar(1024) output&lt;br&gt;
AS&lt;br&gt;
BEGIN&lt;br&gt;
declare @jobName varchar(200)&lt;br&gt;
declare @xmla varchar(1000)&lt;br&gt;
declare @jobId binary(16)&lt;br&gt;
declare @ReturnCode int&lt;br&gt;
declare @stop int&lt;br&gt;
&lt;br&gt;
--Set job name&lt;br&gt;
set @jobName = 'Process' + @objectType + '_' + @objectId&lt;br&gt;
&lt;br&gt;
--Delete the job if already exists&lt;br&gt;
if exists (select * from msdb.dbo.sysjobs where name = @jobName)&lt;br&gt;
exec msdb.dbo.sp_delete_job @job_name = @jobName&lt;br&gt;
&lt;br&gt;
--Create the job&lt;br&gt;
Exec msdb.dbo.sp_add_job @job_name=@jobName&lt;br&gt;
@enabled=1,&lt;br&gt;
@notify_level_eventlog=0,&lt;br&gt;
@notify_level_email=0,&lt;br&gt;
@notify_level_netsend=0,&lt;br&gt;
@notify_level_page=0,&lt;br&gt;
@delete_level=0,&lt;br&gt;
@description=N'process OLAP object',&lt;br&gt;
@category_name=N'[Uncategorized (Local)]',&lt;br&gt;
@owner_login_name=@login_name, @job_id=@jobId output&lt;br&gt;
&lt;br&gt;
exec msdb.dbo.sp_add_jobserver @job_name=@jobName, @server_name=@@SERVERNAME&lt;br&gt;
&lt;br&gt;
--Declare XMLA for OLAP object&lt;br&gt;
if (@objectType = 'Cube')&lt;br&gt;
set @xmla = '&lt;PROCESS xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"&gt;
&lt;br&gt;
&lt;object&gt;
&lt;br&gt;
&lt;DatabaseID&gt;
' + @dataBaseId + '
&lt;/DatabaseID&gt;
&lt;br&gt;
&lt;CubeID&gt;
' + @objectId + '
&lt;/CubeID&gt;
&lt;br&gt;
&lt;/object&gt;
&lt;br&gt;
&lt;TYPE&gt;
ProcessFull
&lt;/TYPE&gt;
&lt;br&gt;
&lt;/PROCESS&gt;
'&lt;br&gt;
else if (@objectType = 'Dim')&lt;br&gt;
set @xmla = 
&lt;BATCH xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"&gt;
&lt;br&gt;
&lt;PARALLEL&gt;
&lt;br&gt;
&lt;PROCESS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"&gt;
&lt;br&gt;
&lt;object&gt;
&lt;br&gt;
&lt;DatabaseID&gt;
' + @dataBaseId + '
&lt;/DataBaseID&gt;
&lt;br&gt;
&lt;DimensionID&gt;
' + @objectId + '
&lt;/DimensionID&gt;
&lt;br&gt;
&lt;/object&gt;
&lt;br&gt;
&lt;TYPE&gt;
ProcessFull
&lt;/TYPE&gt;
&lt;br&gt;
&lt;/PROCESS&gt;
&lt;br&gt;
&lt;/PARALLEL&gt;
&lt;br&gt;
&lt;/BATCH&gt;
'&lt;br&gt;
else&lt;br&gt;
Begin&lt;br&gt;
set @returnValue = 0&lt;br&gt;
return @returnValue&lt;br&gt;
End&lt;br&gt;
&lt;br&gt;
--Add the job step&lt;br&gt;
Exec msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Process Object',&lt;br&gt;
@step_id=1,&lt;br&gt;
@cmdexec_success_code=0,&lt;br&gt;
@on_success_action=1,&lt;br&gt;
@on_success_step_id=0,&lt;br&gt;
@on_fail_action=2,&lt;br&gt;
@on_fail_step_id=0,&lt;br&gt;
@retry_attempts=0,&lt;br&gt;
@retry_interval=0,&lt;br&gt;
@os_run_priority=0, @subsystem=N'ANALYSISCOMMAND',&lt;br&gt;
@command=@xmla,&lt;br&gt;
@server=@@SERVERNAME,&lt;br&gt;
@database_name=N'master',&lt;br&gt;
@flags=0&lt;br&gt;
&lt;br&gt;
--Run the job&lt;br&gt;
Execute sp_start_job @jobName&lt;br&gt;
&lt;br&gt;
Waitfor delay '00:00:05'&lt;br&gt;
set @returnValue = (select run_status from dbo.sysjobhistory&lt;br&gt;
where job_id = @jobId&lt;br&gt;
and step_id = 1)&lt;br&gt;
&lt;br&gt;
-- Loop until the job ends and return its result&lt;br&gt;
set @stop = 0&lt;br&gt;
if @returnValue is null&lt;br&gt;
while @stop &amp;lt;&amp;gt; 1&lt;br&gt;
Begin&lt;br&gt;
set @returnValue = (select run_status from dbo.sysjobhistory&lt;br&gt;
where job_id = @jobId&lt;br&gt;
and step_id = 1)&lt;br&gt;
&lt;br&gt;
if @returnValue is not null&lt;br&gt;
set @stop = 1&lt;br&gt;
&lt;br&gt;
waitfor delay '00:00:10'&lt;br&gt;
End&lt;br&gt;
&lt;br&gt;
--Return error message (if exists)&lt;br&gt;
If @returnValue = 0 --failed&lt;br&gt;
set @errorMessage = (select [message] from dbo.sysjobhistory&lt;br&gt;
where job_id = @jobId&lt;br&gt;
and step_id = 1)&lt;br&gt;
End
&lt;/p&gt;
&lt;p&gt;
&lt;font color=#ff0000&gt;&lt;u&gt;update:&lt;/u&gt; &lt;/font&gt;&lt;font color=#000000&gt;I see that the xmla
code went bad in the post because it is not recognised html code. It doesn't matter,
I believe you got the point...&lt;/font&gt;
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=d6688032-fdd3-46a7-aff9-bf647b069535" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,d6688032-fdd3-46a7-aff9-bf647b069535.aspx</comments>
      <category>BI;BI/Informatica;BI/SQL Server 2005;BI/SQL Server 2005/Analysis Services</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=bd657303-2777-4d14-94bb-2aafce915ff3</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=bd657303-2777-4d14-94bb-2aafce915ff3</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
In the past I mentioned some fragments of the architecture of our end-to-end BI solution.
Now I'll discuss how it is done. I will only write about the things that I done (I
mean, developed) but I'll describe the whole picture.
</p>
        <p>
Our architecture goes like this: <a href="http://en.wikipedia.org/wiki/CONTROL-M">Control
M</a> -&gt; Informatica -&gt; MS-OLAP (Analysis Services 2005).
</p>
        <p>
In words: ControlM is the most common scheduler in big companies. We use it to schedule
our ETL processes in Informatica. Our system team made it possible to start Informatica
processes from ControlM. I don't know exactly how it is done. All I know is that ControlM
raises a flag in a table, and Informatica scans the table every X seconds and start
the process if it finds the flag that was raised by ControlM. Don't ask me about the
technical details - it wasn't my job.
</p>
        <p>
The more interesting thing (because I did it...) is how Informatica calls MS-OLAP
and tells it to process a cube. In this part I'll describe the big picture and in
the next one I'll give some of the code and discuss some technical views of the
process. First, the Informatica mapping moves the data from the source to the
target, which is the dimension or fact table just like it always does. After that,
Informatica calls a Stored Procedure on the MS-OLAP server which process the cube.
Informatica calls this SP with some parameters, including the object type to process
(cube/dimension), its ID and some more parameters. In return, the MS-OLAP returns
return code (in order to point whether the process succeeded) and message describing
the error if it occurred.
</p>
        <p>
How the SP process the cube/dimension? Unfortunately, there is no SP that can process
OLAP object so I needed to use the following steps in my SP:
</p>
        <ol>
          <li>
Delete any existing job that does the same action (read on, you'll understand)</li>
          <li>
Create an empty job</li>
          <li>
Add a step to that job that will process the object. This step contains XMLA code
that contains the parameters that were given from Informatica</li>
          <li>
Run the job</li>
          <li>
Loop until the job (or process) ends and send back the return code and the error message,
if exists.</li>
        </ol>
        <p>
In the next part I'll write some of the code and discuss some technical issues.
</p>
        <p>
 
</p>
        <img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=bd657303-2777-4d14-94bb-2aafce915ff3" />
      </body>
      <title>Connecting Informatica with MS-OLAP - Part I</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx</link>
      <pubDate>Sat, 01 Dec 2007 20:19:47 GMT</pubDate>
      <description>&lt;p&gt;
In the past I mentioned some fragments of the architecture of our end-to-end BI solution.
Now I'll discuss how it is done. I will only write about the things that I done (I
mean, developed) but I'll describe the whole picture.
&lt;/p&gt;
&lt;p&gt;
Our architecture goes like this: &lt;a href="http://en.wikipedia.org/wiki/CONTROL-M"&gt;Control
M&lt;/a&gt; -&amp;gt; Informatica -&amp;gt; MS-OLAP (Analysis Services 2005).
&lt;/p&gt;
&lt;p&gt;
In words: ControlM is the most common scheduler in big companies. We use it to schedule
our ETL processes in Informatica. Our system team made it possible to start Informatica
processes from ControlM. I don't know exactly how it is done. All I know is that ControlM
raises a flag in a table, and Informatica scans the table every X seconds and start
the process if it finds the flag that was raised by ControlM. Don't ask me about the
technical details - it wasn't my job.
&lt;/p&gt;
&lt;p&gt;
The more interesting thing (because I did it...) is how Informatica calls MS-OLAP
and tells it to process a cube. In this part I'll describe the big picture and in
the next one I'll give some of the code and discuss some technical&amp;nbsp;views of the
process.&amp;nbsp;First, the Informatica mapping moves the data from the source to the
target, which is the dimension or fact table just like it always does. After that,
Informatica calls a Stored Procedure on the MS-OLAP server which process the cube.
Informatica calls this SP with some parameters, including the object type to process
(cube/dimension), its ID and some more parameters. In return, the MS-OLAP returns
return code (in order to point whether the process succeeded) and message describing
the error if it occurred.
&lt;/p&gt;
&lt;p&gt;
How the SP process the cube/dimension? Unfortunately, there is no SP that can process
OLAP object so I needed to use the following steps in my SP:
&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
Delete any existing job that does the same action (read on, you'll understand)&lt;/li&gt;
&lt;li&gt;
Create an empty job&lt;/li&gt;
&lt;li&gt;
Add a step to that job that will process the object. This step contains XMLA code
that contains the parameters that were given from Informatica&lt;/li&gt;
&lt;li&gt;
Run the job&lt;/li&gt;
&lt;li&gt;
Loop until the job (or process) ends and send back the return code and the error message,
if exists.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;
In the next part I'll write some of the code and discuss some technical issues.
&lt;/p&gt;
&lt;p&gt;
&amp;nbsp;
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=bd657303-2777-4d14-94bb-2aafce915ff3" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,bd657303-2777-4d14-94bb-2aafce915ff3.aspx</comments>
      <category>BI;BI/Informatica;BI/SQL Server 2005;BI/SQL Server 2005/Analysis Services</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=59198e45-39e2-41be-bcee-11a03dbe5d44</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,59198e45-39e2-41be-bcee-11a03dbe5d44.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,59198e45-39e2-41be-bcee-11a03dbe5d44.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=59198e45-39e2-41be-bcee-11a03dbe5d44</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">I believe that every BI developer seen
this in many Data Warehouses: Boolean Dimensions. As you may guess, boolean dimension
is a dimension with only two members and of course with no hierarchy. For example:
cash/credit card in sales cube, exists/not exists in inventory cube, etc. If you haven't
seen this phrase before - relax - I just invented it. :-)<br />
Now, the question is what to do about these dimensions:<br />
a. Include them in the ETL process or just leave it as is?<br />
b. If you put it in the ETL - how would you implement it?<br /><br />
Here's what I did in my project. You may disagree with me and I would like to see
other approaches too.<br />
a. Yes, I included it for some reasons. As every <a href="http://www.pragprog.com/">Pragmatic
Programmer</a> knows, everything can be changed so do not assume anything as globally-static.
This rule takes place in here: Boolean dimensions may grow and have more members.
For example, in the sales cube I mentioned above, maybe there will be another way
to pay such as exclusive card of the shop (There is a network here in Israel who has
it). Even male/female boolean dimension may have an Unknown member. So never exclude
these dimensions from your ETL process. Wait - one more thing. You may think: Why
interrupt my ETL process with these silly dimensions? If they'll grow up I'll add
them to the process. As an answer think about the timings: You can never know how
much time the dimension's ETL will take (although it will be very small), so in order
to stay away of surprises - include it in your ETL process. just for case.<br />
b. I implemented it as two hard-coded expressions and sent them to union. The result
of this union will enter directly to the target table. In Informatica, the mapplet
can't start without source table so just put a dummy table with only one row and connect
it to the expression items. Why only one row? If the table will contain more than
two rows then the Informatica server will consider the process as failed one.<br /><br />
As I said, I'll be happy to read other approaches other than mine.<br /><p></p><img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=59198e45-39e2-41be-bcee-11a03dbe5d44" /></body>
      <title>Boolean Dimensions</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,59198e45-39e2-41be-bcee-11a03dbe5d44.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,59198e45-39e2-41be-bcee-11a03dbe5d44.aspx</link>
      <pubDate>Sun, 21 Oct 2007 22:44:45 GMT</pubDate>
      <description>I believe that every BI developer seen this in many Data Warehouses: Boolean Dimensions. As you may guess, boolean dimension is a dimension with only two members and of course with no hierarchy. For example: cash/credit card in sales cube, exists/not exists in inventory cube, etc. If you haven't seen this phrase before - relax - I just invented it. :-)&lt;br&gt;
Now, the question is what to do about these dimensions:&lt;br&gt;
a. Include them in the ETL process or just leave it as is?&lt;br&gt;
b. If you put it in the ETL - how would you implement it?&lt;br&gt;
&lt;br&gt;
Here's what I did in my project. You may disagree with me and I would like to see
other approaches too.&lt;br&gt;
a. Yes, I included it for some reasons. As every &lt;a href="http://www.pragprog.com/"&gt;Pragmatic
Programmer&lt;/a&gt; knows, everything can be changed so do not assume anything as globally-static.
This rule takes place in here: Boolean dimensions may grow and have more members.
For example, in the sales cube I mentioned above, maybe there will be another way
to pay such as exclusive card of the shop (There is a network here in Israel who has
it). Even male/female boolean dimension may have an Unknown member. So never exclude
these dimensions from your ETL process. Wait - one more thing. You may think: Why
interrupt my ETL process with these silly dimensions? If they'll grow up I'll add
them to the process. As an answer think about the timings: You can never know how
much time the dimension's ETL will take (although it will be very small), so in order
to stay away of surprises - include it in your ETL process. just for case.&lt;br&gt;
b. I implemented it as two hard-coded expressions and sent them to union. The result
of this union will enter directly to the target table. In Informatica, the mapplet
can't start without source table so just put a dummy table with only one row and connect
it to the expression items. Why only one row? If the table will contain more than
two rows then the Informatica server will consider the process as failed one.&lt;br&gt;
&lt;br&gt;
As I said, I'll be happy to read other approaches other than mine.&lt;br&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=59198e45-39e2-41be-bcee-11a03dbe5d44" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,59198e45-39e2-41be-bcee-11a03dbe5d44.aspx</comments>
      <category>BI;BI/DataWarehousing;BI/Informatica</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=73f9d050-05bd-4dc9-b719-ae9d256557b3</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,73f9d050-05bd-4dc9-b719-ae9d256557b3.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,73f9d050-05bd-4dc9-b719-ae9d256557b3.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=73f9d050-05bd-4dc9-b719-ae9d256557b3</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">I really think that the time dimension
is the most complex dimension in 90% of the DWHs. The complexity is in two places:
In the DWH design and also in Analysis Services (or any other BI tool).<br />
First of all - why we didn't take the already-made Server Time Dimension which exists
in SSAS 2005? For two reasons: The first is that the <a href="http://www.miky-schreiber.com/blog/default.aspx#aa68059bf-fe59-48d5-a9e8-dd366e6a861c">Project
Real</a> guys do not recommend using it (you can find their SSAS article <a href="http://www.microsoft.com/technet/prodtechnol/sql/2005/realastd.mspx">here</a>).
The second is that we wanted to have some features that are not available in the server
time dimension, such as Hebrew date. In a matter of fact, even if we didn't have such
feature we still would build the time dimension ourselves because it's giving you
much more control over the dimension. For example, you can always add some new attributes
which Microsoft developers didn't think about.<br />
I started myself to build the time dimension in excel. I figured out that this mission
is little more complex that I thought it would. Most of the functions I wrote were
simple, but there were some complicated ones. So here are some tips for you if you
want to build your time dimension using Excel:<br /><ul><li>
If you want to week number for every date, do not write the function yourself... Excel
has function called weeknum. If you don't have it just add the function toolbox which
has it (I can't recall its name right now. check in excel help).</li><li>
If you want to have records for every level in your hierarchy (not only for days),
put every level in its own excel file (not excel tab). It will help you later when
you will transfer it to your DB.</li><li>
Check yourself. Pick randomly some dates and check that all of its record has correct
data.</li></ul>
After building the excel files I needed to transfer it to my Oracle server. I used
SSIS because I didn't want to wait for my DBA to copy these files into the Informatica
server (it can't use my the local files, it has to be in its server. SSIS can use
local files). This also was a little tricky. First of all, close excel when running
the SSIS packages, otherwise it will fail. Second, when moving the non-leaf levels,
go into the columns section in the destination box and erase the irrelevant columns.
It will reduce the chance for errors. Finally, click on the source box and click on
"Show advanced editor". Enter the source's output columns options and define properly
the columns' data types. This also will reduce the chance for errors.<br /><br />
I had a little bit of an argument with my DBA about how should the time dimension
be. I think that the time dimension does not have to be processes at all. My time
dimension is from 1960 until 2020, so no daily ETL is required. She says that all
the logic has to be in Informatica so I need to develop a mapping for this dimension.
I think that we both are right and that's because that in ideal world she is right.
In every developers team, all the BL has to be in one place. But we don't have much
time (the deadline is very close) so I won't spend the time building more mapping
in Informatica when I have the time dimension already made in excel.<br /><br />
Maybe someday I will have the time to do this. Maybe not.<br /><p></p><img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=73f9d050-05bd-4dc9-b719-ae9d256557b3" /></body>
      <title>Building my Time dimension with excel</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,73f9d050-05bd-4dc9-b719-ae9d256557b3.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,73f9d050-05bd-4dc9-b719-ae9d256557b3.aspx</link>
      <pubDate>Sun, 14 Oct 2007 23:18:38 GMT</pubDate>
      <description>I really think that the time dimension is the most complex dimension in 90% of the DWHs. The complexity is in two places: In the DWH design and also in Analysis Services (or any other BI tool).&lt;br&gt;
First of all - why we didn't take the already-made Server Time Dimension which exists
in SSAS 2005? For two reasons: The first is that the &lt;a href="http://www.miky-schreiber.com/blog/default.aspx#aa68059bf-fe59-48d5-a9e8-dd366e6a861c"&gt;Project
Real&lt;/a&gt; guys do not recommend using it (you can find their SSAS article &lt;a href="http://www.microsoft.com/technet/prodtechnol/sql/2005/realastd.mspx"&gt;here&lt;/a&gt;).
The second is that we wanted to have some features that are not available in the server
time dimension, such as Hebrew date. In a matter of fact, even if we didn't have such
feature we still would build the time dimension ourselves because it's giving you
much more control over the dimension. For example, you can always add some new attributes
which Microsoft developers didn't think about.&lt;br&gt;
I started myself to build the time dimension in excel. I figured out that this mission
is little more complex that I thought it would. Most of the functions I wrote were
simple, but there were some complicated ones. So here are some tips for you if you
want to build your time dimension using Excel:&lt;br&gt;
&lt;ul&gt;
&lt;li&gt;
If you want to week number for every date, do not write the function yourself... Excel
has function called weeknum. If you don't have it just add the function toolbox which
has it (I can't recall its name right now. check in excel help).&lt;/li&gt;
&lt;li&gt;
If you want to have records for every level in your hierarchy (not only for days),
put every level in its own excel file (not excel tab). It will help you later when
you will transfer it to your DB.&lt;/li&gt;
&lt;li&gt;
Check yourself. Pick randomly some dates and check that all of its record has correct
data.&lt;/li&gt;
&lt;/ul&gt;
After building the excel files I needed to transfer it to my Oracle server. I used
SSIS because I didn't want to wait for my DBA to copy these files into the Informatica
server (it can't use my the local files, it has to be in its server. SSIS can use
local files). This also was a little tricky. First of all, close excel when running
the SSIS packages, otherwise it will fail. Second, when moving the non-leaf levels,
go into the columns section in the destination box and erase the irrelevant columns.
It will reduce the chance for errors. Finally, click on the source box and click on
"Show advanced editor". Enter the source's output columns options and define properly
the columns' data types. This also will reduce the chance for errors.&lt;br&gt;
&lt;br&gt;
I had a little bit of an argument with my DBA about how should the time dimension
be. I think that the time dimension does not have to be processes at all. My time
dimension is from 1960 until 2020, so no daily ETL is required. She says that all
the logic has to be in Informatica so I need to develop a mapping for this dimension.
I think that we both are right and that's because that in ideal world she is right.
In every developers team, all the BL has to be in one place. But we don't have much
time (the deadline is very close) so I won't spend the time building more mapping
in Informatica when I have the time dimension already made in excel.&lt;br&gt;
&lt;br&gt;
Maybe someday I will have the time to do this. Maybe not.&lt;br&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=73f9d050-05bd-4dc9-b719-ae9d256557b3" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,73f9d050-05bd-4dc9-b719-ae9d256557b3.aspx</comments>
      <category>BI;BI/DataWarehousing;BI/Informatica;BI/SQL Server 2005/Integration Services</category>
    </item>
    <item>
      <trackback:ping>http://www.miky-schreiber.com/Blog/Trackback.aspx?guid=01276594-f240-4611-940c-6626b856977c</trackback:ping>
      <pingback:server>http://www.miky-schreiber.com/Blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.miky-schreiber.com/Blog/PermaLink,guid,01276594-f240-4611-940c-6626b856977c.aspx</pingback:target>
      <dc:creator>mikypuff</dc:creator>
      <wfw:comment>http://www.miky-schreiber.com/Blog/CommentView,guid,01276594-f240-4611-940c-6626b856977c.aspx</wfw:comment>
      <wfw:commentRss>http://www.miky-schreiber.com/Blog/SyndicationService.asmx/GetEntryCommentsRss?guid=01276594-f240-4611-940c-6626b856977c</wfw:commentRss>
      <slash:comments>5</slash:comments>
      <body xmlns="http://www.w3.org/1999/xhtml">I guess that this will not be my last post
on this subject, but I want to start sharing some thoughts and tips from my experience
when designing and building DWH. In this post I will focus on the fact &amp; dimensions
tables relationship in terms of data completeness (if you wonder what it is, read
on).<br /><br />
Before you start to design the DWH, sit and talk with the people who built the systems
which you take your data from, including the DBA. For every table, ask them what is
the primary key (it's NOT always defined properly in the DB), then ask them again
and then ask them if they are sure. It happened to me that I discovered that the systems
guys were wrong about their DB's primary keys.<br />
The same thing is about Foreign keys and here you should be even more careful. Even
if they claim so, check <b>yourself</b> that every foreign key in the fact table is
placed correctly in the dimension table, <b>especially when the fact table has far
history records</b>. Sometimes system developers or even worse - system DBAs delete
records from the dimension tables that are not relevant. This will cause that these
keys will still be in the fact's history records but will not be found in the dimension
table, causing uncomplete relationship between the fact and the dimension table.<br /><br />
So far is about the part when you talk and "investigate" the system developers (the
DWH design). What to do when you actually developing the DWH? First, develop the tables
of your dimensions tables. <b>Do not forget</b> to add the primary keys in the dimensions
tables and the primary and foreign keys in the fact table. Then develop the ETL processes
and go for the dimensions first.  If you know that the dimension has completeness
problems with the fact table that you will develop later (you talked with the system
developers, remember?), add UNDEFINED (UD key) record for the dimension table. Later,
when developing the fact table's ETL process, make Join with the dimension table and
check that the records' foreign key exists there. If not - change the key to UD. In
SSIS and Informatica (and I guess that also in other products I don't know, such as
DataStage) you can use Lookup instead of Joiner if the dimension table is less that
1G records. That will optimize the ETL process. After you developed all your ETLs,
run the dimension processes. After they finish (assuming everything went OK) run the
fact table's ETL process. If it succeeded you can go and have a drink. If not - check
what went wrong. If you want to know which keys didn't showed up in the dimension
table and causes the <span style="font-size: 12pt; line-height: 115%; font-family: &quot;Times New Roman&quot;,&quot;serif&quot;;">incompleteness </span>problem,
you can <b>disable</b> (not delete) the foreign key from the fact table and run the
process again. Then, with a simple SQL query, check which foreign keys don't exist
in the dimension table. Go back to your ETL design and check what you did wrong. As
I pointed before, in this step you might be very angry at the system developers...<br /><br />
That is all for now. As I said, I assume that more ideas will come on in the future.<br /><p></p><img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=01276594-f240-4611-940c-6626b856977c" /></body>
      <title>Tips for use when designing and building your DataWarehouse</title>
      <guid isPermaLink="false">http://www.miky-schreiber.com/Blog/PermaLink,guid,01276594-f240-4611-940c-6626b856977c.aspx</guid>
      <link>http://www.miky-schreiber.com/Blog/PermaLink,guid,01276594-f240-4611-940c-6626b856977c.aspx</link>
      <pubDate>Sat, 13 Oct 2007 22:11:49 GMT</pubDate>
      <description>I guess that this will not be my last post on this subject, but I want to start sharing some thoughts and tips from my experience when designing and building DWH. In this post I will focus on the fact &amp;amp; dimensions tables relationship in terms of data completeness (if you wonder what it is, read on).&lt;br&gt;
&lt;br&gt;
Before you start to design the DWH, sit and talk with the people who built the systems
which you take your data from, including the DBA. For every table, ask them what is
the primary key (it's NOT always defined properly in the DB), then ask them again
and then ask them if they are sure. It happened to me that I discovered that the systems
guys were wrong about their DB's primary keys.&lt;br&gt;
The same thing is about Foreign keys and here you should be even more careful. Even
if they claim so, check &lt;b&gt;yourself&lt;/b&gt; that every foreign key in the fact table is
placed correctly in the dimension table, &lt;b&gt;especially when the fact table has far
history records&lt;/b&gt;. Sometimes system developers or even worse - system DBAs delete
records from the dimension tables that are not relevant. This will cause that these
keys will still be in the fact's history records but will not be found in the dimension
table, causing uncomplete relationship between the fact and the dimension table.&lt;br&gt;
&lt;br&gt;
So far is about the part when you talk and "investigate" the system developers (the
DWH design). What to do when you actually developing the DWH? First, develop the tables
of your dimensions tables. &lt;b&gt;Do not forget&lt;/b&gt; to add the primary keys in the dimensions
tables and the primary and foreign keys in the fact table. Then develop the ETL processes
and go for the dimensions first.&amp;nbsp; If you know that the dimension has completeness
problems with the fact table that you will develop later (you talked with the system
developers, remember?), add UNDEFINED (UD key) record for the dimension table. Later,
when developing the fact table's ETL process, make Join with the dimension table and
check that the records' foreign key exists there. If not - change the key to UD. In
SSIS and Informatica (and I guess that also in other products I don't know, such as
DataStage) you can use Lookup instead of Joiner if the dimension table is less that
1G records. That will optimize the ETL process. After you developed all your ETLs,
run the dimension processes. After they finish (assuming everything went OK) run the
fact table's ETL process. If it succeeded you can go and have a drink. If not - check
what went wrong. If you want to know which keys didn't showed up in the dimension
table and causes the &lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;incompleteness &lt;/span&gt;problem,
you can &lt;b&gt;disable&lt;/b&gt; (not delete) the foreign key from the fact table and run the
process again. Then, with a simple SQL query, check which foreign keys don't exist
in the dimension table. Go back to your ETL design and check what you did wrong. As
I pointed before, in this step you might be very angry at the system developers...&lt;br&gt;
&lt;br&gt;
That is all for now. As I said, I assume that more ideas will come on in the future.&lt;br&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.miky-schreiber.com/Blog/aggbug.ashx?id=01276594-f240-4611-940c-6626b856977c" /&gt;</description>
      <comments>http://www.miky-schreiber.com/Blog/CommentView,guid,01276594-f240-4611-940c-6626b856977c.aspx</comments>
      <category>BI;BI/DataWarehousing;BI/Informatica;BI/SQL Server 2005/Integration Services</category>
    </item>
  </channel>
</rss>