Monday, September 24, 2007
My friend, Ilya, had a problem in SSIS. He had a .csv file with too many commas. The meaning is that strings that started and ended with inverted commas (") and had commas inside it were recognized by SSIS as new column. For example, the row:
"My name, is Miky", 200, 10 was recognised by SSIS as four columns instead of three. Ilya wrote down a code for SSIS (in VB) that run before the package begin its work. Here it is, hope it will help who ever seen this.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic.FileIO
Public Class ScriptMain
Public Sub Main()
  Dim csvFileFullPath As String
  Dim tabFileFullPath As String
  csvFileFullPath = Dts.Connections("Your CSV Connnection").ConnectionString
  tabFileFullPath = Dts.Connections("Your Table Connection").ConnectionString
  Using tabStreamWriter As New StreamWriter(tabFileFullPath, False, System.Text.Encoding.GetEncoding(1255))
  Using csvFileReader As New StreamReader(csvFileFullPath, System.Text.Encoding.GetEncoding(1255),True)
  Dim currentRow As String

  currentRow = csvFileReader.ReadLine()
  tabStreamWriter.WriteLine(currentRow)
  While Not csvFileReader.EndOfStream
    Dim outputRow As New Text.StringBuilder()
    Dim tmp, tmp1 as String
    Dim offset as Int32 = 1
    Dim beginS, endS As Int32

    beginS = 1
    currentRow = csvFileReader.ReadLine()
    beginS = InStr(offset, currentRow, """")
    While Not beginS = 0 Or offset > Len(currentRow)
      endS = InStr(beginS+1, currentRow, """")
      tmp = Mid(currentRow, beginS, endS - beginS)
      tmp1 = Replace(tmp, ",", " ")
      currentRow = Replace(currentRow, tmp, tmp1)
      offset = endS + 1
      beginS = InStr(offset, currentRow, """"")
    End While
    outputRow.Append(currentRow)
    tabStreamWriter.WriteLine(outputRow.ToString())
  End While
End Using
End Using
Dts.TaskResult = Dts.Result.Success
End Sub
End Class

The solution here is to search for any comma (,) that is between two inverted commas (") and replace it by space.
Although it is a good solution, I would take another solution: Replace any comma by special string, such as &Miky&, convert the csv file into table, and after that go over that column(s) and replace any &Miky& by comma.

Sunday, November 25, 2007 10:25:15 AM (Jerusalem Standard Time, UTC+02:00)
Hi Miky,

Your approach is infact real approach because you can not make assumption that comma is meaningless. Thanks for sharing the code. Suggestions:
1. It could have been better if you have specifically mentioned where to put this code.
2. Why could not the special substitute string be replaced in SSIS package itself after reading the string into derived column and use corrected string to insert into table?

I am wordering, if MS treat this problem is serious bug or not, any word on that? Are they fixting this are not! It is realy a nuisence and a primitive bug in logic or code.

With regards
Raj
Wednesday, November 28, 2007 5:40:01 PM (Jerusalem Standard Time, UTC+02:00)
1. The place to put this script is befor the Data Flow stage, you have to prepare the csv file for transfer.
2. You can't do this because the bug is in the parser, that means that the parser reads the file incorectly (it defines the columns incorectly)

When we first encountered this problem we aproached MS support for help and they couldn't help us, i realy don't know if they treat it as a bug...
Ilya Feldman
Saturday, August 02, 2008 6:11:05 AM (Jerusalem Daylight Time, UTC+03:00)
Hello Miky,

I am dealing with the situation where tehre is a comma in the CSV/TXT pretty often and in my case, the comma is always meaningful.
The out of the box nice and elegant solution to this problem is using the Flat File connection manager and specify under the Format section " as the Text qualifier.

Comma may become a problem when you are loading data from a poorly formatted excel file, here you need to save spreadsheed as CSV first. MS Excel wraps the text data into quotation marks if it contains commas for you automatically.

Hope it will help.
/Alex
Alex G.
Name
E-mail
Home page

Comment (HTML not allowed)  

Enter the code shown (prevents robots):