Groups | Blog | Home
all groups > sql server (alternate) > october 2004 >

sql server (alternate) : Regex parser Text field


sickmat NO[at]SPAM iit.edu
10/7/2004 2:46:49 PM
I have a text field that contains abstract information formated in
HTML, I'd like strip the HTML and insert the data in another Text
field within a DTS package. Is this possible?

any suggestions would be appreciated
Simon Hayes
10/8/2004 1:27:35 PM

[quoted text, click to view]

Since TSQL has very poor text handling functions, the best way is probably
to do the parsing outside the database, with a language like Perl, C#,
Python etc. You could use DTS or bcp.exe to dump the data to a file, parse
and modify the file with a script, then load it again. If you write an
external script to do this, then you can use an Execute Process task to call
it.

If you want a more 'inline' approach, you could use a Transform Data task
with a custom ActiveX transformation which instantiates and uses the Regex
object to transform the data. However, I'm not sure if this approach will
work properly with text data, although it should be fine with char/varchar,
and the performance overhead of the repeated object creation may be a
problem if you have a large amount of data.

Simon

AddThis Social Bookmark Button