So I was having a little trouble getting full text search to work with the GUI in SQL Server Express with Advanced Services (formerly SQL Server 2005 Express SP1), so I had to do things manually. It was probably a permissions or setup issue with SQL Server Expres or the tools. In addition to setting up FTS, I wanted a search query to weight columns differently in the search rankings -- something that SQL Server FTS doesn't really support.
First I had to download and install SQL Server Express with Advanced Services. It's big, but comes with the goodies I wanted.
Then I connected to my SQL Server Express database using SQL Server Management Studio so I could type in some queries. If your SQL Server Express database is in your Visual Studio Project's App_Data folder, you may be out of luck -- I wasn't able to get full text search to work on those, although maybe adjusting permissions would do it.
Once connected to the database, I created a full text catalog
CREATE FULLTEXT CATALOG MyFTCatalog
Next I needed to get the name of a unique index for my table. You can only create full-text indexes on tables with a single-key unique index (e.g. an autonumber primary key index). Remember that your unique index doesn't have to be on the columns that you want to perform full text searches on.
I had a table called Listing a primary key of IdListing and three varchar fields I wanted to search on: Address, Realtor, and Notes. My table already had a unique index called PK_Listing_IdListing, so it was time to create a full-text index on the three columns I wanted to be able to search on:
CREATE FULLTEXT INDEX ON Listing (Address, Realtor, Notes)KEY INDEX PK_Listing_IdListingON MyFTCatalogWITH CHANGE_TRACKING AUTO
What the above query did is create a full-text index on those three Listing table columns and store it in the full-text catalog named MyFTCatalog. I indicated PK_Listing_IdListing as the index to help uniquely identify rows on the Listing table, and I told the Full Text Search engine to automatically update the full-text catalog if values in the table change.
Lastly I did a quick check to confirm the catalog existed and wasn't still building
SELECT FULLTEXTCATALOGPROPERTY('MyFTCatalog', 'Populatestatus')
And we're set up. Now it was time to query. And man is it hot in here. I guess overclocking your PC makes for a sweaty summer. Anyhow...moving on.
There are plenty of pages about performing full-text queries in SQL Server. Here's a place to start.
So my first query looked like this
SELECT IdListing, Address, Realtor, NotesFROM ListingWHERE FREETEXT(*,'some keywords')
The * tells FTS to perform the search on all columns in the full-text index. But the query wasn't going to work for me, since it doesn't give more weight to one column over the other. Plus, in order to sort results by ranking, I needed to use the *TABLE full-text queries. I'm partial to FREETEXTTABLE because it already does all the stemming/etc for me.
Then I did a UNION query like this
SELECT TOP 100 Rank, Address, Realtor, NotesFROM( SELECT f.Rank, l.Address, l.Realtor, l.Notes FROM listing l INNER JOIN FREETEXTTABLE(listing, Address, 'some keywords') as f ON l.idListing = f.[KEY] UNION SELECT f.Rank, l.Address, l.Realtor, l.Notes FROM listing l INNER JOIN FREETEXTTABLE(listing, Realtor, 'some keywords') as f ON l.idListing = f.[KEY] UNION SELECT f.Rank, l.Address, l.Realtor, l.Notes FROM listing l INNER JOIN FREETEXTTABLE(listing, Notes, 'some keywords') as f ON l.idListing = f.[KEY]) as myTableORDER BY Rank DESC
which I quickly rewrote to
SELECT TOP 100 f.Rank, l.Address, l.Realtor, l.NotesFROM Listing l INNER JOIN(SELECT Rank, [KEY] from FREETEXTTABLE(listing, Address, 'some keywords')UNIONselect Rank, [KEY] from FREETEXTTABLE(listing, Realtor, 'some keywords')UNIONselect Rank, [KEY] from FREETEXTTABLE(listing, Notes, 'some keywords')) as fON l.IdListing = f.[KEY]ORDER BY f.Rank DESC
and then added some weights to the rankings, like so.
SELECT TOP 100 f.WeightedRank, l.Address, l.Realtor, l.NotesFROM listing l INNER JOIN( SELECT Rank * 5.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Address, 'some keywords') UNION select Rank * 3.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Realtor, 'some keywords') UNION select Rank * 1.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Notes, 'some keywords')) as fON l.idListing = f.[KEY]ORDER BY f.WeightedRank DESC
Pretty good. You have the column weighting, and you could wrap it up in a nice little stored procedure and be good to go.
However, there was one last thing I needed. I really wanted a query that would to combine column rankings, so that if there were hits in multiple columns, the rank would be higher than a hit in a single column. So this is what I came up with.
SELECT TOP 100 f.WeightedRank, l.Address, l.Realtor, l.NotesFROM listing l INNER JOIN( SELECT [KEY], SUM(Rank) AS WeightedRank FROM ( SELECT Rank * 5.0 as Rank, [KEY] from FREETEXTTABLE(listing, Address, 'some keywords') UNION select Rank * 3.0 as Rank, [KEY] from FREETEXTTABLE(listing, Realtor, 'some keywords') UNION select Rank * 1.0 as Rank, [KEY] from FREETEXTTABLE(listing, Notes, 'some keywords') ) as x GROUP BY [KEY]) as fON l.idListing = f.[KEY]ORDER BY f.WeightedRank DESC
Notice how I'm grouping the inner UNION query by [KEY] (in this case, Listing.IdListing) and SUMming the weighted ranks. That allows us to push results with hits in multiple columns higher up in the search rankings. Obviously it's not going to perform as well as a simpler query, but the ranking was important for this project.
So, there ya go. Installing SQL Server Express isn't too bad, although it's a big download. Setting up Full Text Search seemed to work best for me from the command line. And, now you have a way to rank matches with different columns having different weights.
Hilary Cotter (SQL MVP & FTS guru) provided an alternate query. I did a few tests & both seemed comparable in performance, although I didn't test using very large data sets. I made a slight change to his query and added a WHERE clause so that only matches are returned.
select TOP 100 idListing, Address, Realtor, Notes, RankTotal=isnull(RankAddress,0)+isnull(RankRealtor,0)+isnull(RankNotes,0) from listing left join (SELECT Rank * 5.0 as RankAddress, [KEY] from FREETEXTTABLE(listing, Address, 'Street')) as k on k.[key]=Listing.idListing left join (select Rank * 3.0 as RankRealtor, [KEY] from FREETEXTTABLE(listing, Realtor, 'Street')) as l on l.[key]=Listing.idListing left join (select Rank * 1.0 as RankNotes, [KEY] from FREETEXTTABLE(listing, Notes, 'Street')) as m on m.[key]=Listing.idListing WHERE RankAddress IS NOT NULL OR RankRealtor IS NOT NULL OR RankNotes IS NOT NULLORDER BY RankTotal DESC
Hilary also provided a script (run it in Query Analyzer or in a Query Tab in SQL Mgmt Studio) to set up a test database so you can try the query out yourself. I modified it to seed the test database with a bunch of records (since with only a few records, even LIKE is faster that FTS):
create database realtor go use realtor GO sp_fulltext_database 'enable' GO Create fulltext catalog realtor as default GO create table Listing( idListing int not null identity constraint ListingPK primary key, Address varchar(200), Realtor varchar(200), Notes varchar(200)) GO -- add initial seed recordsinsert into Listing(Address, Realtor, Notes) values('123 Any Street','John Street','the word on the street is good') insert into Listing(Address, Realtor, Notes) values('123 Any Road','John Street','the word of mouth is good') insert into Listing(Address, Realtor, Notes) values('123 Any Road','John Smith','the word on the street is good') insert into Listing(Address, Realtor, Notes) values('123 Any Street','John Smith','the word of mouth is good') GO-- multiply seed records, get up over 1M rows-- might take a whilePRINT 'Please wait a few minutes while the database is seeded'DECLARE @i intSET @i = 0WHILE (@i < 18)BEGIN insert into Listing(Address, Realtor, Notes) select TOP 10 Address, Realtor, Notes from Listing SET @i = @i + 1 PRINT convert(varchar,@i)ENDPRINT 'Database has been seeded'GO PRINT 'Please wait a few minutes while the fulltext index is built'GOcreate fulltext index on listing(Address, Realtor, Notes) key index ListingPK GO-- check the below query. When it returns zero, the FT index is done building.SELECT FULLTEXTCATALOGPROPERTY('realtor', 'Populatestatus')GO
Powered by: newtelligence dasBlog 2.0.7226.0
Disclaimer The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.
© Copyright 2008, Ben Strackany
E-mail