Thursday, May 31, 2007

Few months ago we were given an assignment to copy/move all our DTSs that were running with SQL Server 2000 to the new SQL Server 2005 Integration Services (SSIS). My friend Michael did it and wrote some important notes that he discovered when building ETLs with Integration Services. I decided to list them here because they are important and useful, especially for those who haven't got the time to develop with SSIS so far.

  • One of the greatest improvements in SSIS is that between the source and the destination of the ETL process you can do many things, such as making new fields, sorting, converting data types, union all between different sources, implement your logic on a field, and much more. This is much easier than ever because all you need to do is to add a block to the data flow task and define it for your purposes.
  • SSIS ships with a tool for migrating SQL Server 2000 DTSs. Do not use this tool. Sometimes the result of the conversion is not good enough and in all cases you can't edit the new migrated data task.
  • When making a connection to a non-Microsoft DB, such as Oracle, use OLE DB client instead of the out-of-fashion ODBC.
  • When the destination field is shorter (string type) than the source, add a data conversion block and cut the string. Otherwise, there will be an annoying warning even if the truncation is wanted.
  • In many times (when working with non-Microsoft providers) the automatic recognition of the length & types of the source fields is not correct. Enter the source block and edit these properties by yourself.
  • When moving a Unicode field (data type DT_WSTR) to a non-Unicode destination field (DT_STR), a data conversion block is required.
  • SQL Server 2000 Stored Procedures will work in SSIS, but the Linked Servers definitions are problematic. Consider another options rather than using linked servers.
  • When the source/destination is a CSV file, use Flat File Connection. But if it is an Excel file (.xls), use a Microsoft Jet OLE DB connection and define the source as OLE DB Source (yes, it will work with Excel files).

Again, thanks for Michael for making and sharing these notes.