c# - How can I insert 10 million records in the shortest time possible?


I have a file (which has 10 million records) like below:

    line1
    line2
    line3
    line4
    .......
    ......
    10 million lines

I want to insert these 10 million records into the database: read the file and upload it to SQL Server.

C# code:

    string line;
    System.IO.StreamReader file = new System.IO.StreamReader(@"c:\test.txt");
    while ((line = file.ReadLine()) != null)
    {
        // insertion code goes here
        //DAL.ExecuteSql("insert into table1 values(" + line + ")");
    }
    file.Close();

But the insertion takes too long. How can I insert 10 million records in the shortest time possible using C#?

Update 1:
I tried BULK INSERT:

    BULK INSERT DBNAME.dbo.DATAs
    FROM 'F:\dt10000000\dt10000000.txt'
    WITH
    (
         ROWTERMINATOR = '\n'
    );

My table is as below:

    DATAs
    (
         DatasField VARCHAR(MAX)
    );

But I am getting the following error:

Msg 4866, Level 16, State 1, Line 1
Bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.

Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.

Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

The code below worked:

    BULK INSERT DBNAME.dbo.DATAs
    FROM 'F:\dt10000000\dt10000000.txt'
    WITH
    (
        FIELDTERMINATOR = '\t',
        ROWTERMINATOR = '\n'
    );
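If you want to launch that working BULK INSERT from the C# application rather than from a query window, a minimal sketch is below. The file path and table name are the ones shown above; the "{connection string}" placeholder and the assumption that the login has the server-side permissions BULK INSERT requires (e.g. ADMINISTER BULK OPERATIONS) are mine, not part of the question.

```csharp
using System.Data.SqlClient;

public static class BulkLoader
{
    // Sketch only: runs the BULK INSERT statement from the question's update.
    // "{connection string}" is a placeholder to be filled in by the caller.
    public static void BulkInsertFromFile()
    {
        const string sql = @"
BULK INSERT DBNAME.dbo.DATAs
FROM 'F:\dt10000000\dt10000000.txt'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n');";

        using (SqlConnection connection = new SqlConnection("{connection string}"))
        using (SqlCommand command = new SqlCommand(sql, connection))
        {
            command.CommandTimeout = 0; // 10 million rows can exceed the 30-second default
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

Note that the file path in BULK INSERT is resolved on the SQL Server machine, not on the machine running the C# app.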

Please do not create a DataTable to load via BulkCopy. That is an ok solution for smaller sets of data, but there is absolutely no reason to load all 10 million rows into memory before calling the database.

Your best bet (outside of BCP / BULK INSERT / OPENROWSET(BULK...)) is to stream the contents of the file into the database via a Table-Valued Parameter (TVP). Using a TVP you can open the file, read a row and send a row until done, and then close the file. This method has a memory footprint of just a single row. I wrote an article, Streaming Data Into SQL Server 2008 From an Application, which has an example of this very scenario.

A simplistic overview of the structure is as follows. I am assuming the same import table and field name shown in the question above.

Required database objects:

    -- First: you need a User-Defined Table Type
    CREATE TYPE ImportStructure AS TABLE (Field VARCHAR(MAX));
    GO

    -- Second: use the UDTT as an input param to an import proc.
    --         Hence "Tabled-Valued Parameter" (TVP)
    CREATE PROCEDURE dbo.ImportData (
       @ImportTable    dbo.ImportStructure READONLY
    )
    AS
    SET NOCOUNT ON;

    -- maybe clear out the table first?
    TRUNCATE TABLE dbo.DATAs;

    INSERT INTO dbo.DATAs (DatasField)
        SELECT  Field
        FROM    @ImportTable;
    GO

The C# app code that makes use of the above SQL objects is below. Notice how, rather than filling up an object (e.g. a DataTable) and then executing the stored procedure, it is the executing of the stored procedure that initiates the reading of the file contents. The input parameter of the stored proc isn't a variable; it is the return value of a method, GetFileContents. That method is called when the SqlCommand calls ExecuteNonQuery, which opens the file, reads a row and sends the row to SQL Server via the IEnumerable<SqlDataRecord> and yield return constructs, and then closes the file. The stored procedure just sees a table variable, @ImportTable, that it can access as soon as the data starts coming over (note: the data does persist for a short time, even if not the full contents, in tempdb).

    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;
    using Microsoft.SqlServer.Server;

    private static IEnumerable<SqlDataRecord> GetFileContents()
    {
       SqlMetaData[] _TvpSchema = new SqlMetaData[] {
          new SqlMetaData("Field", SqlDbType.VarChar, SqlMetaData.Max)
       };
       SqlDataRecord _DataRecord = new SqlDataRecord(_TvpSchema);
       StreamReader _FileReader = null;

       try
       {
          _FileReader = new StreamReader("{filePath}");

          // read a row, send a row
          while (!_FileReader.EndOfStream)
          {
             // You shouldn't need to call "_DataRecord = new SqlDataRecord" as
             // SQL Server already received the row when "yield return" was called.
             // Unlike BCP and BULK INSERT, you have the option here to call
             // ReadLine() into a string, do manipulation(s) / validation(s) on
             // the string, and then pass that string into SetString(), or
             // discard the row if invalid.
             _DataRecord.SetString(0, _FileReader.ReadLine());
             yield return _DataRecord;
          }
       }
       finally
       {
          _FileReader.Close();
       }
    }

The GetFileContents method above is used as the input parameter value for the stored procedure as shown below:

    public static void Test()
    {
       SqlConnection _Connection = new SqlConnection("{connection string}");
       SqlCommand _Command = new SqlCommand("ImportData", _Connection);
       _Command.CommandType = CommandType.StoredProcedure;

       SqlParameter _TVParam = new SqlParameter();
       _TVParam.ParameterName = "@ImportTable";
       _TVParam.TypeName = "dbo.ImportStructure";
       _TVParam.SqlDbType = SqlDbType.Structured;
       _TVParam.Value = GetFileContents(); // the return value of the method is streamed data
       _Command.Parameters.Add(_TVParam);

       try
       {
          _Connection.Open();
          _Command.ExecuteNonQuery();
       }
       finally
       {
          _Connection.Close();
       }

       return;
    }

Additional notes:

  1. With some modification, the above C# code can be adapted to batch the data in.
  2. With minor modification, the above C# code can be adapted to send in multiple fields (the example shown in the "Streaming Data..." article linked above passes in 2 fields).
  3. You can also manipulate the value of each record in the SELECT statement in the proc.
  4. You can also filter out rows by using a WHERE condition in the proc.
  5. You can access the TVP table variable multiple times; it is READONLY but not "forward only".
  6. Advantages over SqlBulkCopy:
    1. SqlBulkCopy is INSERT-only, whereas using a TVP allows the data to be used in any fashion: you can call MERGE; you can DELETE based on some condition; you can split the data into multiple tables; and so on.
    2. Due to a TVP not being INSERT-only, you don't need a separate staging table to dump the data into.
    3. You can get data back from the database by calling ExecuteReader instead of ExecuteNonQuery. For example, if there were an IDENTITY field on the DATAs import table, you could add an OUTPUT clause to the INSERT to pass back INSERTED.[ID] (assuming ID is the name of the IDENTITY field). Or you can pass back the results of a completely different query, or both, since multiple result sets can be sent and accessed via Reader.NextResult(). Getting info back from the database is not possible when using SqlBulkCopy, yet there are several questions here on S.O. from people wanting to do exactly that (at least with regards to the newly created IDENTITY values).
    4. For more info on why this approach is sometimes faster for the overall process, even if slightly slower at getting the data from disk into SQL Server, please see this whitepaper from the SQL Server Customer Advisory Team: Maximizing Throughput with TVP.
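To illustrate point 6.3 above, here is a hedged sketch of what the INSERT inside dbo.ImportData could look like with an OUTPUT clause. The IDENTITY column [ID] is a hypothetical addition for illustration; it is not part of the table definition shown in the question:

```sql
-- Assumes a hypothetical IDENTITY column [ID] on dbo.DATAs (not in the
-- question's table). Call the proc via ExecuteReader instead of
-- ExecuteNonQuery to read the generated IDs back as a result set.
INSERT INTO dbo.DATAs (DatasField)
    OUTPUT INSERTED.[ID]
    SELECT  Field
    FROM    @ImportTable;
```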
