all groups > sql server full text search > june 2006 >
You're in the

sql server full text search

group:

FTS for noise chars only


FTS for noise chars only NeilT
6/19/2006 11:53:17 AM
sql server full text search: Hi.. SQL 2K issue. My client wishes to FTS for certain noise chars
only, e.g. % which of course isn't in the noise word files. The best
I've been able to come up with to resolve this is to escape the
characters when I enter the string into CONTAINSTABLE, however the
results return as if the query was "% AND escape"

This anyway seems to be a cludge, which I'm very tempted to use as a
temporary (forever)solution but I would really appreciate it if someone
could point me to the error of my ways or tell me if this is at all
possible.

I know that the best solution would be to go through the text and
change all % chars to something that I could then search against, but
A) this isn't the only char that he's interested in and B) the data is
historic, large, formatted and extracted from a Notes DB, so I'd rather
not go there for the moment.

Appreciation in advance

Neil
Re: FTS for noise chars only Hilary Cotter
6/19/2006 9:33:05 PM
Regrettably this is the only way I know of doing this.

--
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.

This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.

Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com



[quoted text, click to view]

Re: FTS for noise chars only NeilT
6/20/2006 12:59:00 AM
Hilary...thanks for your interest.

If you want a laugh...he also wants to search for "swiggly line."

I think he means tilde.

In his old Notes version, this was the way he found bullet points in
the RTF fields. I haven't told him I've converted his Notes RTF into
HTML in SQL, so "swiggly" is dead anyway.

If only he'd mentioned all this at the requirements stage...

regards
Neil
(deep in the Dilberts)
Re: FTS for noise chars only Hilary Cotter
6/20/2006 5:48:34 AM
You might want to export these bullet points into a separate column and then
full-text them there.

--
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.

This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.

Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com



[quoted text, click to view]

Re: FTS for noise chars only NeilT
6/20/2006 7:24:45 AM
Thanks for the advice, however the bullet points along with everything
else are ad hoc, but nice idea though.

Problem is....
All the text comes from unknown source uploaded doc, rtf, html etc
files or typed directly or cut'n'paste into contenteditable divs in web
based apps, so everything gets stored as HTML in SQL as everything has
to be web editable with formatting retained. I also need to convert the
HTML to RTF to attach the resultant files to mail outs.

Anyway, I think tyring to extract bullet points out of that lot is just
too much grief...

however, however...I'm not really understanding what you are saying, am
I?

.....and thinking about it...WAHay!!

I could run an instr over any text insert and if there is a % or
whatever in there, write the DOC ID to a Specialchars table and just
search against that. I could even insert the quantity of the char per
document for weighting.

Say % is iChartype=4

Then simply select iDocID from tblSpecialChars where iCharType=4

and bingo...sorted! You beauty!

Sorry to go a bit mad there. I thought I was actually beaten for once.

Many thanks Hilary, I'd never have had that idea without you.

I owe you a drink for that one.

regards

Neil
AddThis Social Bookmark Button