SQL Server Bulk insert of CSV file with inconsistent quotes

Tag: , ,

Is it possible to BULK INSERT (SQL Server) a CSV file in which the fields are only OCCASSIONALLY surrounded by quotes? Specifically, quotes only surround those fields that contain a “,”.

In other words, I have data that looks like this (the first row contain headers):

id, company, rep, employees
729216,INGRAM MICRO INC.,"Stuart, Becky",523
729235,"GREAT PLAINS ENERGY, INC.","Nelson, Beena",114
721177,GEORGE WESTON BAKERIES INC,"Hogan, Meg",253

Because the quotes aren’t consistent, I can’t use ‘”,”‘ as a delimiter, and I don’t know how to create a format file that accounts for this.

I tried using ‘,’ as a delimter and loading it into a temporary table where every column is a varchar, then using some kludgy processing to strip out the quotes, but that doesn’t work either, because the fields that contain ‘,’ are split into multiple columns.

Unfortunately, I don’t have the ability to manipulate the CSV file beforehand.

Is this hopeless?

Many thanks in advance for any advice.

By the way, i saw this post SQL bulk import from csv, but in that case, EVERY field was consistently wrapped in quotes. So, in that case, he could use ‘,’ as a delimiter, then strip out the quotes afterwards.

9 Answers

  1. xequnsruuh on Mar 05, 2013

    You are going to need to preprocess the file, period.

    If you really really need to do this, here is the code. I wrote this because I absolutely had no choice. It is utility code and I’m not proud of it, but it works. The approach is not to get SQL to understand quoted fields, but instead manipulate the file to use an entirely different delimiter.

    EDIT: Here is the code in a github repo. It’s been improved and now comes with unit tests! https://github.com/chrisclark/Redelim-it

    This function takes an input file and will replace all field-delimiting commas (NOT commas inside quoted-text fields, just the actual delimiting ones) with a new delimiter. You can then tell sql server to use the new field delimiter instead of a comma. In the version of the function here, the placeholder is <TMP> (I feel confident this will not appear in the original csv – if it does, brace for explosions).

    Therefore after running this function you import in sql by doing something like:

    BULK INSERT MyTable
    FROM 'C:FileCreatedFromThisFunction.csv'
    WITH
    (
    FIELDTERMINATOR = '<*TMP*>',
    ROWTERMINATOR = 'n'
    )
    

    And without further ado, the terrible, awful function that I apologize in advance for inflicting on you (edit – I’ve posted a working program that does this instead of just the function on my blog here):

    Private Function CsvToOtherDelimiter(ByVal InputFile As String, ByVal OutputFile As String) As Integer
    
            Dim PH1 As String = "<*TMP*>"
    
            Dim objReader As StreamReader = Nothing
            Dim count As Integer = 0 'This will also serve as a primary key'
            Dim sb As New System.Text.StringBuilder
    
            Try
                objReader = New StreamReader(File.OpenRead(InputFile), System.Text.Encoding.Default)
            Catch ex As Exception
                UpdateStatus(ex.Message)
            End Try
    
            If objReader Is Nothing Then
                UpdateStatus("Invalid file: " & InputFile)
                count = -1
                Exit Function
            End If
    
            'grab the first line
        Dim line = reader.ReadLine()
        'and advance to the next line b/c the first line is column headings
        If hasHeaders Then
            line = Trim(reader.ReadLine)
        End If
    
        While Not String.IsNullOrEmpty(line) 'loop through each line
    
            count += 1
    
            'Replace commas with our custom-made delimiter
            line = line.Replace(",", ph1)
    
            'Find a quoted part of the line, which could legitimately contain commas.
            'In that case we will need to identify the quoted section and swap commas back in for our custom placeholder.
            Dim starti = line.IndexOf(ph1 & """", 0)
            If line.IndexOf("""",0) = 0 then starti=0
    
            While starti > -1 'loop through quoted fields
    
                Dim FieldTerminatorFound As Boolean = False
    
                'Find end quote token (originally  a ",)
                Dim endi As Integer = line.IndexOf("""" & ph1, starti)
    
                If endi < 0 Then
                    FieldTerminatorFound = True
                    If endi < 0 Then endi = line.Length - 1
                End If
    
                While Not FieldTerminatorFound
    
                    'Find any more quotes that are part of that sequence, if any
                    Dim backChar As String = """" 'thats one quote
                    Dim quoteCount = 0
                    While backChar = """"
                        quoteCount += 1
                        backChar = line.Chars(endi - quoteCount)
                    End While
    
                    If quoteCount Mod 2 = 1 Then 'odd number of quotes. real field terminator
                        FieldTerminatorFound = True
                    Else 'keep looking
                        endi = line.IndexOf("""" & ph1, endi + 1)
                    End If
                End While
    
                'Grab the quoted field from the line, now that we have the start and ending indices
                Dim source = line.Substring(starti + ph1.Length, endi - starti - ph1.Length + 1)
    
                'And swap the commas back in
                line = line.Replace(source, source.Replace(ph1, ","))
    
                'Find the next quoted field
                '                If endi >= line.Length - 1 Then endi = line.Length 'During the swap, the length of line shrinks so an endi value at the end of the line will fail
                starti = line.IndexOf(ph1 & """", starti + ph1.Length)
    
            End While
    
                line = objReader.ReadLine
    
            End While
    
            objReader.Close()
    
            SaveTextToFile(sb.ToString, OutputFile)
    
            Return count
    
        End Function
    
  2. rajesh on Mar 05, 2013

    It isn’t possible to do a bulk insert for this file, from MSDN:

    To be usable as a data file for bulk import, a CSV file must comply with the following restrictions:

    • Data fields never contain the field terminator.
    • Either none or all of the values in a data field are enclosed in quotation marks (“”).

    (http://msdn.microsoft.com/en-us/library/ms188609.aspx)

    Some simple text processing should be all that’s required to get the file ready for import. Alternatively your users could be required to either format the file according to the se guidelines or use something other than a comma as a delimiter (e.g |)

  3. pharme86 on Mar 05, 2013

    I have also created a function to convert a CSV to a usable format for Bulk Insert. I used the answered post by Chris Clark as a starting point to create the following C# function.

    I ended up using a regular expression to find the fields. I then recreated the file line by line, writing it to a new file as I went, thus avoiding having the entire file loaded into memory.

    private void CsvToOtherDelimiter(string CSVFile, System.Data.Linq.Mapping.MetaTable tbl)
    {
        char PH1 = '|';
        StringBuilder ln;
    
        //Confirm file exists. Else, throw exception
        if (File.Exists(CSVFile))
        {
            using (TextReader tr = new StreamReader(CSVFile))
            {
                //Use a temp file to store our conversion
                using (TextWriter tw = new StreamWriter(CSVFile + ".tmp"))
                {
                    string line = tr.ReadLine();
                    //If we have already converted, no need to reconvert.
                    //NOTE: We make the assumption here that the input header file 
                    //      doesn't have a PH1 value unless it's already been converted.
                    if (line.IndexOf(PH1) >= 0)
                    {
                        tw.Close();
                        tr.Close();
                        File.Delete(CSVFile + ".tmp");
                        return;
                    }
                    //Loop through input file
                    while (!string.IsNullOrEmpty(line))
                    {
                        ln = new StringBuilder();
    
                        //1. Use Regex expression to find comma separated values 
                        //using quotes as optional text qualifiers 
                        //(what MS EXCEL does when you import a csv file)
                        //2. Remove text qualifier quotes from data
                        //3. Replace any values of PH1 found in column data 
                        //with an equivalent character
                        //Regex:  A[^,]*(?=,)|(?:[^",]*"[^"]*"[^",]*)+|[^",]*"[^"]*Z|(?<=,)[^,]*(?=,)|(?<=,)[^,]*Z|A[^,]*Z
                        List<string> fieldList = Regex.Matches(line, @"A[^,]*(?=,)|(?:[^"",]*""[^""]*""[^"",]*)+|[^"",]*""[^""]*Z|(?<=,)[^,]*(?=,)|(?<=,)[^,]*Z|A[^,]*Z")
                                .Cast<Match>()
                                .Select(m => RemoveCSVQuotes(m.Value).Replace(PH1, '¦'))
                                .ToList<string>();
    
                        //Add the list of fields to ln, separated by PH1
                        fieldList.ToList().ForEach(m => ln.Append(m + PH1));
    
                        //Write to file. Don't include trailing PH1 value.
                        tw.WriteLine(ln.ToString().Substring(0, ln.ToString().LastIndexOf(PH1)));
    
                        line = tr.ReadLine();
                    }
    
    
                    tw.Close();
                }
                tr.Close();
    
                //Optional:  replace input file with output file
                File.Delete(CSVFile);
                File.Move(CSVFile + ".tmp", CSVFile);
            }
        }
        else
        {
            throw new ArgumentException(string.Format("Source file {0} not found", CSVFile));
        }
    }
    //The output file no longer needs quotes as a text qualifier, so remove them
    private string RemoveCSVQuotes(string value)
    {
        //if is empty string, then remove double quotes
        if (value == @"""""") value = "";
        //remove any double quotes, then any quotes on ends
        value = value.Replace(@"""""", @"""");
        if (value.Length >= 2)
            if (value.Substring(0, 1) == @"""")
                value = value.Substring(1, value.Length - 2);
        return value;
    }
    
  4. fernando-correia on Mar 05, 2013

    An alternate method–assuming you don’t have a load of fields or expect a quote to appear in the data itself would be to use the REPLACE function.

    UPDATE dbo.tablename 
            SET dbo.tablename.target_field = REPLACE(t.importedValue, '"', '')
    FROM #tempTable t
    WHERE dbo.tablename.target_id = t.importedID;
    

    I have used it. I can’t make any claims regarding performance. It is just a quick and dirty way to get around the problem.

  5. harry on Mar 05, 2013

    This might be more complicated or involved than what your willing to use, but …

    If you can implement the logic for parsing the lines into fields in VB or C#, you can do this using a CLR table valued function (TVF).

    A CLR TVF can be a good performing way to read data in from external source when you want to have some C# or VB code separate the data into columns and/or adjust the values.

    You have to be willing to add a CLR assembly to your database (and one that allows external or unsafe operations so it can open files). This can get a bit complicated or involved, but might be worth it for the flexibility you get.

    I had some large files that needed to be regularly loaded to tables as fast as possible, but certain code translations needed to be performed on some columns and special handling was needed to load values that would have otherwise caused datatype errors with a plain bulk insert.

    In short, a CLR TVF lets you run C# or VB code against each line of the file with bulk insert like performance (although you may need to worry about logging). The example in the SQL Server documentation lets you create a TVF to read from the event log that you could use as a starting point.

    Note that the code in the CLR TVF can only access the database in an init stage before the first row is processed (eg. no lookups for each row – you use a normal TVF on top of this to do such things). You don’t appear to need this based on your question.

    Also note, each CLR TVF must have its output columns explicitly specified, so you can’t write a generic one that is reusable for each different csv file you might have.

    You could write one CLR TVF to read whole lines from the file, returning a one column result set, then use normal TVFs to read from that for each type of file. This requires the code to parse each line to be written in T-SQL, but avoids having to write many CLR TVFs.

  6. johan-offer on Mar 05, 2013

    I found the answer by Chris very helpful, but I wanted to run it from within SQL Server using T-SQL (and not using CLR), so I converted his code to T-SQL code. But then I took it one step further by wrapping everything up in a stored procedure that did the following:

    1. use bulk insert to initially import the CSV file
    2. clean up the lines using Chris’s code
    3. return the results in a table format

    For my needs, I further cleaned up the lines by removing quotes around values and converting two double quotes to one double quote (I think that’s the correct method).

    CREATE PROCEDURE SSP_CSVToTable
    
    -- Add the parameters for the stored procedure here
    @InputFile nvarchar(4000)
    , @FirstLine int
    
    AS
    
    BEGIN
    
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    
    --convert the CSV file to a table
    --clean up the lines so that commas are handles correctly
    
    DECLARE @sql nvarchar(4000)
    DECLARE @PH1 nvarchar(50)
    DECLARE @LINECOUNT int -- This will also serve as a primary key
    DECLARE @CURLINE int
    DECLARE @Line nvarchar(4000)
    DECLARE @starti int
    DECLARE @endi int
    DECLARE @FieldTerminatorFound bit
    DECLARE @backChar nvarchar(4000)
    DECLARE @quoteCount int
    DECLARE @source nvarchar(4000)
    DECLARE @COLCOUNT int
    DECLARE @CURCOL int
    DECLARE @ColVal nvarchar(4000)
    
    -- new delimiter
    SET @PH1 = '†'
    
    -- create single column table to hold each line of file
    CREATE TABLE [#CSVLine]([line] nvarchar(4000))
    
    -- bulk insert into temp table
    -- cannot use variable path with bulk insert
    -- so we must run using dynamic sql
    SET @Sql = 'BULK INSERT #CSVLine
    FROM ''' + @InputFile + '''
    WITH
    (
    FIRSTROW=' + CAST(@FirstLine as varchar) + ',
    FIELDTERMINATOR = ''n'',
    ROWTERMINATOR = ''n''
    )'
    
    -- run dynamic statement to populate temp table
    EXEC(@sql)
    
    -- get number of lines in table
    SET @LINECOUNT = @@ROWCOUNT
    
    -- add identity column to table so that we can loop through it
    ALTER TABLE [#CSVLine] ADD [RowId] [int] IDENTITY(1,1) NOT NULL
    
    IF @LINECOUNT > 0
    BEGIN
        -- cycle through each line, cleaning each line
        SET @CURLINE = 1
        WHILE @CURLINE <= @LINECOUNT
        BEGIN
            -- get current line
            SELECT @line = line
              FROM #CSVLine
             WHERE [RowId] = @CURLINE
    
            -- Replace commas with our custom-made delimiter
            SET @Line = REPLACE(@Line, ',', @PH1)
    
            -- Find a quoted part of the line, which could legitimately contain commas.
            -- In that case we will need to identify the quoted section and swap commas back in for our custom placeholder.
            SET @starti = CHARINDEX(@PH1 + '"' ,@Line, 0)
            If CHARINDEX('"', @Line, 0) = 0 SET @starti = 0
    
            -- loop through quoted fields
            WHILE @starti > 0 
            BEGIN
                SET @FieldTerminatorFound = 0
    
                -- Find end quote token (originally  a ",)
                SET @endi = CHARINDEX('"' + @PH1, @Line, @starti)  -- sLine.IndexOf("""" & PH1, starti)
    
                IF @endi < 1
                BEGIN
                    SET @FieldTerminatorFound = 1
                    If @endi < 1 SET @endi = LEN(@Line) - 1
                END
    
                WHILE @FieldTerminatorFound = 0
                BEGIN
                    -- Find any more quotes that are part of that sequence, if any
                    SET @backChar = '"' -- thats one quote
                    SET @quoteCount = 0
    
                    WHILE @backChar = '"'
                    BEGIN
                        SET @quoteCount = @quoteCount + 1
                        SET @backChar = SUBSTRING(@Line, @endi-@quoteCount, 1) -- sLine.Chars(endi - quoteCount)
                    END
    
                    IF (@quoteCount % 2) = 1
                    BEGIN
                        -- odd number of quotes. real field terminator
                        SET @FieldTerminatorFound = 1
                    END
                    ELSE 
                    BEGIN
                        -- keep looking
                        SET @endi = CHARINDEX('"' + @PH1, @Line, @endi + 1) -- sLine.IndexOf("""" & PH1, endi + 1)
                    END
    
                END
    
                -- Grab the quoted field from the line, now that we have the start and ending indices
                SET @source = SUBSTRING(@Line, @starti + LEN(@PH1), @endi - @starti - LEN(@PH1) + 1) 
                -- sLine.Substring(starti + PH1.Length, endi - starti - PH1.Length + 1)
    
                -- And swap the commas back in
                SET @Line = REPLACE(@Line, @source, REPLACE(@source, @PH1, ','))
                --sLine.Replace(source, source.Replace(PH1, ","))
    
                -- Find the next quoted field
                -- If endi >= line.Length - 1 Then endi = line.Length 'During the swap, the length of line shrinks so an endi value at the end of the line will fail
                SET @starti = CHARINDEX(@PH1 + '"', @Line, @starti + LEN(@PH1))
                --sLine.IndexOf(PH1 & """", starti + PH1.Length)
    
            END
    
            -- get table based on current line
            IF OBJECT_ID('tempdb..#Line') IS NOT NULL
                DROP TABLE #Line
    
            -- converts a delimited list into a table
            SELECT *
            INTO #Line
            FROM dbo.iter_charlist_to_table(@Line,@PH1)
    
            -- get number of columns in line
            SET @COLCOUNT = @@ROWCOUNT
    
            -- dynamically create CSV temp table to hold CSV columns and lines
            -- only need to create once
            IF OBJECT_ID('tempdb..#CSV') IS NULL
            BEGIN
                -- create initial structure of CSV table
                CREATE TABLE [#CSV]([Col1] nvarchar(100))
    
                -- dynamically add a column for each column found in the first line
                SET @CURCOL = 1
                WHILE @CURCOL <= @COLCOUNT
                BEGIN
                    -- first column already exists, don't need to add
                    IF @CURCOL > 1 
                    BEGIN
                        -- add field
                        SET @sql = 'ALTER TABLE [#CSV] ADD [Col' + Cast(@CURCOL as varchar) + '] nvarchar(100)'
    
                        --print @sql
    
                        -- this adds the fields to the temp table
                        EXEC(@sql)
                    END
    
                    -- go to next column
                    SET @CURCOL = @CURCOL + 1
                END
            END
    
            -- build dynamic sql to insert current line into CSV table
            SET @sql = 'INSERT INTO [#CSV] VALUES('
    
            -- loop through line table, dynamically adding each column value
            SET @CURCOL = 1
            WHILE @CURCOL <= @COLCOUNT
            BEGIN
                -- get current column
                Select @ColVal = str 
                  From #Line 
                 Where listpos = @CURCOL
    
                IF LEN(@ColVal) > 0
                BEGIN
                    -- remove quotes from beginning if exist
                    IF LEFT(@ColVal,1) = '"'
                        SET @ColVal = RIGHT(@ColVal, LEN(@ColVal) - 1)
    
                    -- remove quotes from end if exist
                    IF RIGHT(@ColVal,1) = '"'
                        SET @ColVal = LEFT(@ColVal, LEN(@ColVal) - 1)
                END
    
                -- write column value
                -- make value sql safe by replacing single quotes with two single quotes
                -- also, replace two double quotes with a single double quote
                SET @sql = @sql + '''' + REPLACE(REPLACE(@ColVal, '''',''''''), '""', '"') + ''''
    
                -- add comma separater except for the last record
                IF @CURCOL <> @COLCOUNT
                    SET @sql = @sql + ','
    
                -- go to next column
                SET @CURCOL = @CURCOL + 1
            END
    
            -- close sql statement
            SET @sql = @sql + ')'
    
            --print @sql
    
            -- run sql to add line to table
            EXEC(@sql)
    
            -- move to next line
            SET @CURLINE = @CURLINE + 1
    
        END
    
    END
    
    -- return CSV table
    SELECT * FROM [#CSV]
    
    END
    
    GO
    

    The stored procedure makes use of this helper function that parses a string into a table (thanks Erland Sommarskog!):

    CREATE FUNCTION [dbo].[iter_charlist_to_table]
                    (@list      ntext,
                     @delimiter nchar(1) = N',')
         RETURNS @tbl TABLE (listpos int IDENTITY(1, 1) NOT NULL,
                             str     varchar(4000),
                             nstr    nvarchar(2000)) AS
    
    BEGIN
      DECLARE @pos      int,
              @textpos  int,
              @chunklen smallint,
              @tmpstr   nvarchar(4000),
              @leftover nvarchar(4000),
              @tmpval   nvarchar(4000)
    
      SET @textpos = 1
      SET @leftover = ''
      WHILE @textpos <= datalength(@list) / 2
      BEGIN
         SET @chunklen = 4000 - datalength(@leftover) / 2
         SET @tmpstr = @leftover + substring(@list, @textpos, @chunklen)
         SET @textpos = @textpos + @chunklen
    
         SET @pos = charindex(@delimiter, @tmpstr)
    
         WHILE @pos > 0
         BEGIN
            SET @tmpval = ltrim(rtrim(left(@tmpstr, @pos - 1)))
            INSERT @tbl (str, nstr) VALUES(@tmpval, @tmpval)
            SET @tmpstr = substring(@tmpstr, @pos + 1, len(@tmpstr))
            SET @pos = charindex(@delimiter, @tmpstr)
         END
    
         SET @leftover = @tmpstr
      END
    
      INSERT @tbl(str, nstr) VALUES (ltrim(rtrim(@leftover)), ltrim(rtrim(@leftover)))
    
    RETURN
    
    END
    

    Here’s how I call it from T-SQL. In this case, I’m inserting the results into a temp table, so I create the temp table first:

        -- create temp table for file import
    CREATE TABLE #temp
    (
        CustomerCode nvarchar(100) NULL,
        Name nvarchar(100) NULL,
        [Address] nvarchar(100) NULL,
        City nvarchar(100) NULL,
        [State] nvarchar(100) NULL,
        Zip nvarchar(100) NULL,
        OrderNumber nvarchar(100) NULL,
        TimeWindow nvarchar(100) NULL,
        OrderType nvarchar(100) NULL,
        Duration nvarchar(100) NULL,
        [Weight] nvarchar(100) NULL,
        Volume nvarchar(100) NULL
    )
    
    -- convert the CSV file into a table
    INSERT #temp
    EXEC [dbo].[SSP_CSVToTable]
         @InputFile = @FileLocation
        ,@FirstLine = @FirstImportRow
    

    I haven’t tested the performance much, but it works well for what I need – importing CSV files with less than 1000 rows. However, it might choke on really large files.

    Hopefully someone else also finds it useful.

    Cheers!

  7. arora on Mar 05, 2013

    I found few issues while having ‘,’ inside our fields like Mike,”456 2nd St, Apt 5″.

    Solution to this issue is @ http://crazzycoding.blogspot.com/2010/11/import-csv-file-into-sql-server-using.html

    Thanks,
    - Ashish

  8. johan-offer on Mar 05, 2013

    Chris,
    Thanks a bunch for this!! You saved my biscuits!! I could not believe that bulk loader wouldn’t handle this case when XL does such a nice job..don’t these guys see eachother in the halls???
    Anyway…I needed a ConsoleApplication version so here is what I hacked together. It’s down and dirty but it works like a champ! I hardcoded the delimiter and commented out the header as they were not needed for my app.

    I wish I could also paste a nice big beer in here for ya too.

    Geeze, I have no idea why the End Module and Public Class are outside the code block…srry!

        Module Module1
    
        Sub Main()
    
            Dim arrArgs() As String = Command.Split(",")
            Dim i As Integer
            Dim obj As New ReDelimIt()
    
            Console.Write(vbNewLine & vbNewLine)
    
            If arrArgs(0) <> Nothing Then
                For i = LBound(arrArgs) To UBound(arrArgs)
                    Console.Write("Parameter " & i & " is " & arrArgs(i) & vbNewLine)
                Next
    
    
                obj.ProcessFile(arrArgs(0), arrArgs(1))
    
            Else
                Console.Write("Usage Test1 <inputfile>,<outputfile>")
            End If
    
            Console.Write(vbNewLine & vbNewLine)
        End Sub
    
     End Module
    
     Public Class ReDelimIt
    
        Public Function ProcessFile(ByVal InputFile As String, ByVal OutputFile As String) As Integer
    
            Dim ph1 As String = "|"
    
            Dim objReader As System.IO.StreamReader = Nothing
            Dim count As Integer = 0 'This will also serve as a primary key
            Dim sb As New System.Text.StringBuilder
    
            Try
                objReader = New System.IO.StreamReader(System.IO.File.OpenRead(InputFile), System.Text.Encoding.Default)
            Catch ex As Exception
                MsgBox(ex.Message)
            End Try
    
            If objReader Is Nothing Then
                MsgBox("Invalid file: " & InputFile)
                count = -1
                Exit Function
            End If
    
            'grab the first line
            Dim line = objReader.ReadLine()
            'and advance to the next line b/c the first line is column headings
            'Removed Check Headers can put in if needed.
            'If chkHeaders.Checked Then
            'line = objReader.ReadLine
            'End If
    
            While Not String.IsNullOrEmpty(line) 'loop through each line
    
                count += 1
    
                'Replace commas with our custom-made delimiter
                line = line.Replace(",", ph1)
    
                'Find a quoted part of the line, which could legitimately contain commas.
                'In that case we will need to identify the quoted section and swap commas back in for our custom placeholder.
                Dim starti = line.IndexOf(ph1 & """", 0)
    
                While starti > -1 'loop through quoted fields
    
                    'Find end quote token (originally  a ",)
                    Dim endi = line.IndexOf("""" & ph1, starti)
    
                    'The end quote token could be a false positive because there could occur a ", sequence.
                    'It would be double-quoted ("",) so check for that here
                    Dim check1 = line.IndexOf("""""" & ph1, starti)
    
                    'A """, sequence can occur if a quoted field ends in a quote.
                    'In this case, the above check matches, but we actually SHOULD process this as an end quote token
                    Dim check2 = line.IndexOf("""""""" & ph1, starti)
    
                    'If we are in the check1 ("",) situation, keep searching for an end quote token
                    'The +1 and +2 accounts for the extra length of the checked sequences
                    While (endi = check1 + 1 AndAlso endi <> check2 + 2) 'loop through "false" tokens in the quoted fields
                        endi = line.IndexOf("""" & ph1, endi + 1)
                        check1 = line.IndexOf("""""" & ph1, check1 + 1)
                        check2 = line.IndexOf("""""""" & ph1, check2 + 1)
                    End While
    
                    'We have searched for an end token (",) but can't find one, so that means the line ends in a "
                    If endi < 0 Then endi = line.Length - 1
    
                    'Grab the quoted field from the line, now that we have the start and ending indices
                    Dim source = line.Substring(starti + ph1.Length, endi - starti - ph1.Length + 1)
    
                    'And swap the commas back in
                    line = line.Replace(source, source.Replace(ph1, ","))
    
                    'Find the next quoted field
                    If endi >= line.Length - 1 Then endi = line.Length 'During the swap, the length of line shrinks so an endi value at the end of the line will fail
                    starti = line.IndexOf(ph1 & """", starti + ph1.Length)
    
                End While
    
                'Add our primary key to the line
                ' Removed for now
                'If chkAddKey.Checked Then
                'line = String.Concat(count.ToString, ph1, line)
                ' End If
    
                sb.AppendLine(line)
    
                line = objReader.ReadLine
    
            End While
    
            objReader.Close()
    
            SaveTextToFile(sb.ToString, OutputFile)
    
            Return count
    
        End Function
    
        Public Function SaveTextToFile(ByVal strData As String, ByVal FullPath As String) As Boolean
            Dim bAns As Boolean = False
            Dim objReader As System.IO.StreamWriter
            Try
                objReader = New System.IO.StreamWriter(FullPath, False, System.Text.Encoding.Default)
                objReader.Write(strData)
                objReader.Close()
                bAns = True
            Catch Ex As Exception
                Throw Ex
            End Try
            Return bAns
        End Function
    
    End Class
    
  9. rajesh on Mar 05, 2013

    You should be able to specifiy not only the field separator, which should be [,] but also the text qualifier, which in this case would be ["]. Using [] to enclose that so there’s no confusion with “.