Groups | Blog | Home
all groups > vb.net data > september 2005 >

vb.net data : Import Tab Delimited Text into SQL Database


BenHanson
9/20/2005 12:13:21 PM
I'm trying to bring in a tab delimited file and dump it into SQL. I'm
currently using an ODBCDataAdapter to fill a dataset:

WriteSchemaFileForODBCImport(importType, filePath)
Dim importConnection As System.Data.ODBC.OdbcConnection
Dim importDataAdapter As System.Data.ODBC.OdbcDataAdapter
Dim importConnectionString As String
Dim importSelectString As String
Dim importDataSet As New DataSet
Dim intDefaultPadding As Integer = 1

importConnectionString = "Driver={Microsoft Text Driver (*.txt;
*.csv)};Dbq=" & filePath & ";"

importConnection = New
Odbc.OdbcConnection(importConnectionString)

importSelectString = "select * from [" & fileName & "]"

importDataAdapter = New
System.data.Odbc.OdbcDataAdapter(importSelectString, importConnection)

Then I remove a few columns and assign proper column names to match my
database, though the column order is different. Now I'm trying to
figure out how to dump this dataset into SQL. All of the values in the
dataset are representative of SQL Table fields, except an
autoincremented index field and a time stamp that calls GetDate() I
tested the data being extracted from the test file by binding the
dataset to a grid. Dataset has all the right data.

Would I be better off skipping the intermediate dataset? Any ideas?
William (Bill) Vaughn
9/21/2005 1:05:15 PM
Ah, you're making this 10 times harder (and slower) than it has to be. Use
BCP or DTS to upload a delimited file. It's far faster.

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com/blog/billva
www.betav.com
www.sqlreportingservices.net
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________


[quoted text, click to view]

Paul Clement
9/21/2005 1:41:34 PM
[quoted text, click to view]

¤ I'm trying to bring in a tab delimited file and dump it into SQL. I'm
¤ currently using an ODBCDataAdapter to fill a dataset:
¤
¤ WriteSchemaFileForODBCImport(importType, filePath)
¤ Dim importConnection As System.Data.ODBC.OdbcConnection
¤ Dim importDataAdapter As System.Data.ODBC.OdbcDataAdapter
¤ Dim importConnectionString As String
¤ Dim importSelectString As String
¤ Dim importDataSet As New DataSet
¤ Dim intDefaultPadding As Integer = 1
¤
¤ importConnectionString = "Driver={Microsoft Text Driver (*.txt;
¤ *.csv)};Dbq=" & filePath & ";"
¤
¤ importConnection = New
¤ Odbc.OdbcConnection(importConnectionString)
¤
¤ importSelectString = "select * from [" & fileName & "]"
¤
¤ importDataAdapter = New
¤ System.data.Odbc.OdbcDataAdapter(importSelectString, importConnection)
¤
¤ Then I remove a few columns and assign proper column names to match my
¤ database, though the column order is different. Now I'm trying to
¤ figure out how to dump this dataset into SQL. All of the values in the
¤ dataset are representative of SQL Table fields, except an
¤ autoincremented index field and a time stamp that calls GetDate() I
¤ tested the data being extracted from the test file by binding the
¤ dataset to a grid. Dataset has all the right data.
¤
¤ Would I be better off skipping the intermediate dataset? Any ideas?

I would use the direct method myself. Since will need a schema.ini file for the tab delimited file I
would specify the column names there and then do a direct import (if possible).

Dim TextConnection As New
System.Data.OleDb.OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & "E:\My
Documents\TextFiles" & ";" & _
"Extended
Properties=""Text;HDR=NO;""")

TextConnection.Open()

Dim TextCommand As New System.Data.OleDb.OleDbCommand("SELECT * INTO [ReportSheet] IN ''
[ODBC;Driver={SQL Server};Server=(local);Database=Northwind;Trusted_Connection=yes] FROM
[ReportSheet#txt]", TextConnection)

TextCommand.ExecuteNonQuery()
TextConnection.Close()


Paul
~~~~
BenHanson
9/21/2005 2:56:55 PM
Paul, one critical thing I left out, I need to drop 3 columns from the
tabbed file, then filter out a specific string from each row. I
couldn't see anyway to do this with the schema.ini.

As for DTS, I wanted to avoid the complexity of setup(not for me, for
others) of needing to have a network share defined where the data to be
DTS'd would need to be put. Is there any way with DTS to go Local
Unshared File --> VB.Net App --> DTS versus Local Unshared File -->
VB.Net App --> Shared Network Location --> DTS.

As for BCP, I'm not at all familiar with this. Looks like a command
line utility. I'm not sure if I would want to shell out to a command
line for this. I'm wanting the sort of seamless integration, grab
specificly formatted file, dump directly into SQL without a lot of
intermediaries. Also, I wouldn't be able to distribute BCP with an
app.

I've actually come up with a solution, but it seems "10 times
harder(and slower) than it has to be." Not to mention ugly code.

1. import text file into dataset1 via System.Data.ODBC.OdbcDataAdapter
2. drop columns and name them
3. pass dataset1 to a sub which connects to the DB and loads dataset2
to give it the correct schema,
4. for each datarow loop through dataset1 and adding a row to dataset2
5. Do DataAdapter.update(dataset2) to push the changes to SQL

It's not too slow for 1000 records. Maybe 8-10 seconds. I'm sure it
could be much faster. Mostly it just doesn't seem like easily
maintainable code. It also bothers me that the
dataAdapter.insertCommand/sqlcommand with all the @fieldname parameters
that must be defined, is all greek to me.
shriop
9/21/2005 8:42:39 PM
As long as where the dts package is being executed from is on the same
machine as your Local Unshared File, which is not the same as where the
data is being sent to, then yes, dts can read a local file just fine.

Bruce Dunwiddie
http://www.csvreader.com
Paul Clement
9/22/2005 10:45:17 AM
[quoted text, click to view]

¤ Paul, one critical thing I left out, I need to drop 3 columns from the
¤ tabbed file, then filter out a specific string from each row. I
¤ couldn't see anyway to do this with the schema.ini.
¤

You only specify the columns you need in the SELECT statement. I can't address the filter issue
because I don't know specifically what you are referring to. In any event, some of the VBA functions
are available to Jet SQL in order to handle string manipulation.

¤ As for BCP, I'm not at all familiar with this. Looks like a command
¤ line utility. I'm not sure if I would want to shell out to a command
¤ line for this. I'm wanting the sort of seamless integration, grab
¤ specificly formatted file, dump directly into SQL without a lot of
¤ intermediaries. Also, I wouldn't be able to distribute BCP with an
¤ app.
¤

Bill probably was thinking of BULK INSERT which is the Transact SQL compliment to BCP. It could be
an alternative depending upon how you want to manipulate the column data.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_ba-bz_4fec.asp

I would agree with your assessment concerning BCP and DTS.


¤ I've actually come up with a solution, but it seems "10 times
¤ harder(and slower) than it has to be." Not to mention ugly code.
¤
¤ 1. import text file into dataset1 via System.Data.ODBC.OdbcDataAdapter
¤ 2. drop columns and name them
¤ 3. pass dataset1 to a sub which connects to the DB and loads dataset2
¤ to give it the correct schema,
¤ 4. for each datarow loop through dataset1 and adding a row to dataset2
¤ 5. Do DataAdapter.update(dataset2) to push the changes to SQL
¤
¤ It's not too slow for 1000 records. Maybe 8-10 seconds. I'm sure it
¤ could be much faster. Mostly it just doesn't seem like easily
¤ maintainable code. It also bothers me that the
¤ dataAdapter.insertCommand/sqlcommand with all the @fieldname parameters
¤ that must be defined, is all greek to me.

Yes, well this is definitely the slowest method you could have chosen. I would only use it as a last
resort, or if performance is not a requirement for your application.


Paul
~~~~
Paul Clement
9/22/2005 10:50:11 AM
[quoted text, click to view]

¤ Ah, you're making this 10 times harder (and slower) than it has to be. Use
¤ BCP or DTS to upload a delimited file. It's far faster.

I only counted five lines of code so it certainly isn't harder. ;-)

BCP or DTS would probably be faster but they're poor choices if you're looking for an integrated
solution. BULK INSERT would probably be more appropriate.


Paul
~~~~
AddThis Social Bookmark Button