home · blog · groups · about us · contact us
DevelopmentNow Blog
 Monday, August 07, 2006
 
 

So I was having a little trouble getting full text search to work with the GUI in SQL Server Express with Advanced Services (formerly SQL Server 2005 Express SP1), so I had to do things manually. It was probably a permissions or setup issue with SQL Server Expres or the tools. In addition to setting up FTS, I wanted a search query to weight columns differently in the search rankings -- something that SQL Server FTS doesn't really support.

Setting up Full Text Search

First I had to download and install SQL Server Express with Advanced Services. It's big, but comes with the goodies I wanted.

Then I connected to my SQL Server Express database using SQL Server Management Studio so I could type in some queries. If your SQL Server Express database is in your Visual Studio Project's App_Data folder, you may be out of luck -- I wasn't able to get full text search to work on those, although maybe adjusting permissions would do it.

Once connected to the database, I created a full text catalog

CREATE FULLTEXT CATALOG MyFTCatalog

Next I needed to get the name of a unique index for my table. You can only create full-text indexes on tables with a single-key unique index (e.g. an autonumber primary key index). Remember that your unique index doesn't have to be on the columns that you want to perform full text searches on.

I had a table called Listing a primary key of IdListing and three varchar fields I wanted to search on: Address, Realtor, and Notes. My table already had a unique index called PK_Listing_IdListing, so it was time to create a full-text index on the three columns I wanted to be able to search on:

CREATE FULLTEXT INDEX ON Listing (Address, Realtor, Notes)
KEY INDEX PK_Listing_IdListing
ON MyFTCatalog
WITH CHANGE_TRACKING AUTO

What the above query did is create a full-text index on those three Listing table columns and store it in the full-text catalog named MyFTCatalog. I indicated PK_Listing_IdListing as the index to help uniquely identify rows on the Listing table, and I told the Full Text Search engine to automatically update the full-text catalog if values in the table change.

Lastly I did a quick check to confirm the catalog existed and wasn't still building

SELECT FULLTEXTCATALOGPROPERTY('MyFTCatalog', 'Populatestatus')

And we're set up. Now it was time to query. And man is it hot in here. I guess overclocking your PC makes for a sweaty summer. Anyhow...moving on.

Performing Weighted Queries

There are plenty of pages about performing full-text queries in SQL Server. Here's a place to start.

So my first query looked like this

SELECT IdListing, Address, Realtor, Notes
FROM Listing
WHERE FREETEXT(*,'some keywords')

The * tells FTS to perform the search on all columns in the full-text index. But the query wasn't going to work for me, since it doesn't give more weight to one column over the other. Plus, in order to sort results by ranking, I needed to use the *TABLE full-text queries. I'm partial to FREETEXTTABLE because it already does all the stemming/etc for me.

Then I did a UNION query like this

SELECT TOP 100 Rank, Address, Realtor, Notes
FROM
(
    SELECT f.Rank, l.Address, l.Realtor, l.Notes
    FROM listing l INNER JOIN
    FREETEXTTABLE(listing, Address, 'some keywords') as f
    ON l.idListing = f.[KEY]
    UNION
    SELECT f.Rank, l.Address, l.Realtor, l.Notes
    FROM listing l INNER JOIN
    FREETEXTTABLE(listing, Realtor, 'some keywords') as f
    ON l.idListing = f.[KEY]
    UNION
    SELECT f.Rank, l.Address, l.Realtor, l.Notes
    FROM listing l INNER JOIN
    FREETEXTTABLE(listing, Notes, 'some keywords') as f
    ON l.idListing = f.[KEY]
) as myTable
ORDER BY Rank DESC

which I quickly rewrote to

SELECT TOP 100 f.Rank, l.Address, l.Realtor, l.Notes
FROM Listing l INNER JOIN
(
SELECT Rank, [KEY] from FREETEXTTABLE(listing, Address, 'some keywords')
UNION
select Rank, [KEY] from FREETEXTTABLE(listing, Realtor, 'some keywords')
UNION
select Rank, [KEY] from FREETEXTTABLE(listing, Notes, 'some keywords')
) as f
ON l.IdListing = f.[KEY]
ORDER BY f.Rank DESC

and then added some weights to the rankings, like so.

SELECT TOP 100 f.WeightedRank, l.Address, l.Realtor, l.Notes
FROM listing l INNER JOIN
(
        SELECT Rank * 5.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Address, 'some keywords')
        UNION
        select Rank * 3.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Realtor, 'some keywords')
        UNION
        select Rank * 1.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Notes, 'some keywords')
) as f
ON l.idListing = f.[KEY]
ORDER BY f.WeightedRank DESC

Pretty good. You have the column weighting, and you could wrap it up in a nice little stored procedure and be good to go.

However, there was one last thing I needed. I really wanted a query that would to combine column rankings, so that if there were hits in multiple columns, the rank would be higher than a hit in a single column. So this is what I came up with.

SELECT TOP 100 f.WeightedRank, l.Address, l.Realtor, l.Notes
FROM listing l INNER JOIN
(
    SELECT [KEY], SUM(Rank) AS WeightedRank
    FROM
    (
        SELECT Rank * 5.0 as Rank, [KEY] from FREETEXTTABLE(listing, Address, 'some keywords')
        UNION
        select Rank * 3.0 as Rank, [KEY] from FREETEXTTABLE(listing, Realtor, 'some keywords')
        UNION
        select Rank * 1.0 as Rank, [KEY] from FREETEXTTABLE(listing, Notes, 'some keywords')
    ) as x
    GROUP BY [KEY]
) as f
ON l.idListing = f.[KEY]
ORDER BY f.WeightedRank DESC

Notice how I'm grouping the inner UNION query by [KEY] (in this case, Listing.IdListing) and SUMming the weighted ranks. That allows us to push results with hits in multiple columns higher up in the search rankings. Obviously it's not going to perform as well as a simpler query, but the ranking was important for this project.

Conclusion

So, there ya go. Installing SQL Server Express isn't too bad, although it's a big download. Setting up Full Text Search seemed to work best for me from the command line. And, now you have a way to rank matches with different columns having different weights.

Update: An Alternate Approach

Hilary Cotter (SQL MVP & FTS guru) provided an alternate query. I did a few tests & both seemed comparable in performance, although I didn't test using very large data sets. I made a slight change to his query and added a WHERE clause so that only matches are returned.

select TOP 100
    idListing, Address, Realtor, Notes,
    RankTotal=isnull(RankAddress,0)+isnull(RankRealtor,0)+isnull(RankNotes,0)
from listing
left join (SELECT Rank * 5.0 as RankAddress, [KEY] from
    FREETEXTTABLE(listing, Address, 'Street')) as k
    on k.[key]=Listing.idListing
left join (select Rank * 3.0 as RankRealtor, [KEY] from
    FREETEXTTABLE(listing, Realtor, 'Street')) as l
    on l.[key]=Listing.idListing
left join (select Rank * 1.0 as RankNotes, [KEY] from
    FREETEXTTABLE(listing, Notes, 'Street')) as m
    on m.[key]=Listing.idListing
WHERE RankAddress IS NOT NULL OR RankRealtor IS NOT NULL OR RankNotes IS NOT NULL
ORDER BY RankTotal DESC

Hilary also provided a script (run it in Query Analyzer or in a Query Tab in SQL Mgmt Studio) to set up a test database so you can try the query out yourself. I modified it to seed the test database with a bunch of records (since with only a few records, even LIKE is faster that FTS):

create database realtor
go
use realtor
GO
sp_fulltext_database 'enable'
GO
Create fulltext catalog realtor as default
GO
create table Listing(
    idListing int not null identity constraint ListingPK primary key,
    Address varchar(200), Realtor varchar(200), Notes varchar(200))
GO
-- add initial seed records
insert into Listing(Address, Realtor, Notes)
values('123 Any Street','John Street','the word on the street is good')
insert into Listing(Address, Realtor, Notes)
values('123 Any Road','John Street','the word of mouth is good')
insert into Listing(Address, Realtor, Notes)
values('123 Any Road','John Smith','the word on the street is good')
insert into Listing(Address, Realtor, Notes)
values('123 Any Street','John Smith','the word of mouth is good')
GO
-- multiply seed records, get up over 1M rows
-- might take a while
PRINT 'Please wait a few minutes while the database is seeded'
DECLARE @i int
SET @i = 0
WHILE (@i < 18)
BEGIN
    insert into Listing(Address, Realtor, Notes)
    select TOP 10 Address, Realtor, Notes from Listing

    SET @i = @i + 1
    PRINT convert(varchar,@i)
END
PRINT 'Database has been seeded'
GO
PRINT 'Please wait a few minutes while the fulltext index is built'
GO
create fulltext index on listing(Address, Realtor, Notes)
key index ListingPK
GO
-- check the below query. When it returns zero, the FT index is done building.
SELECT FULLTEXTCATALOGPROPERTY('realtor', 'Populatestatus')
GO


 

August 7, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 Wednesday, August 02, 2006
 
 

AKA "working virtual, or virtually working?"

My friend Griffin Caprio blogged about the virtues of being a virtual worker and finding wifi hotspots. I thought I'd chime in with a few tips of my own.

Insure your stuff

Make sure your computer equipment is covered. Many homeowner policies DON'T cover computers at all, or not if they're used for business. You may want to get a small umbrella business insurance policy to cover your equipment at home & on the road (think dropped laptop at the airport). Ask around for referrals, or pick a few insurance agents out of the phone book.

Host a Web Server

If you have a static IP address from your ISP, then you can configure DNS to point to a web server on your network, and host away. If you have a dynamic IP, however, then you need to use dynamic DNS to ensure that when your IP address changes, your DNS entry (www.yourcooldomain.com) points to the right IP. There are several providers. I've used DNSExit for years and it works well, but you can also check out No-IP, TZO, or DynDNS. Or others. Some routers come with built-in support for certain dynamic DNS providers, meaning a simple config change in your router is all that's needed to keep your DNS up to date.

Back up your stuff

What would you do if your computer crashed or your hard drive blew out? Would you lose any work? How long would it take you to recover? Backups are important for any IT professional, and I'd suggest an automated approach. You can go with a service like Mozy that runs on your PC and backs stuff up in the background. Or, if you have a place you can FTP files to (e.g. your ISP or an inexpensive host like e-rice or dreamhost) you can pick up a copy of WinZip 10 Pro which can regularly zip up & upload files via FTP. Remember to not only back up documents, but emails, code, and database dumps. Having an organized directory structure where your important files are makes it easier. Then, if disaster strikes, you'll be in a better position to recover. And the silver lining is maybe you'll now have a reason to get a shiny new PC.

August 2, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 Thursday, July 27, 2006
 
 

Scott Mitchell wrote recently about a plug & play ASP.NET error-logging framework that he and Atif Aziz wrote for an MSDN article a while back. 

The framework is called ELMAH (Error Logging Modules and Handlers) and it's free & open source. Apparently you just install the DLL & add a few lines to your web.config, and it'll start logging errors while allowing administrators to view errors online or even access an RSS feed of recent errors. I usually install a global error handler in my ASP.NET apps & use log4net to log & email the information, but I never put together a web-based error viewer. So if there's a stable framework that wraps all that up, I'm all for it.

You can download & read more about ELMAH here. Some screenshots (courtesy of MSDN):


Viewing the error log


RSS feed of recent errors

July 27, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 
 

So I was downloading the latest version of Anthem.NET to use for a Visual Studio 2003 project. I downloaded the zip, extracted, made the virtual directory, but kept getting weird security errors like

"The project location is not fully trusted by the .NET runtime. This is usually because it is either a network share or mapped to a network share not on the local machine.  If the output path is under the project location, your code will not execute as fully trusted and you may receive unexpected security exceptions."

and Visual Studio saying I can't debug the application. When I tried going to the local site in IE (localhost/Anthem-Examples-2003) I'd get

"Server cannot access application directory 'C:\Documents and Settings\Ben\My Documents\Visual Studio Projects\anthem\Anthem-Examples-2003\'. The directory does not exist or is not accessible because of security settings."

It always worked flawlessly before. So, after some dorking and searching around, I finally got it working, here's how I did it...

  • After downloading the zip file, I right clicked it and clicked "Unblock" (some new XP SP2 security thing). Then I extracted it. That resolved the first "project location not trusted" issue.
  • I then went to Control Panel->Admin Tools->.NET 1.1. Security Wizards, and gave full trust to the Intranet Zone.
  • I then disabled simple file sharing (Control Panel->Folder Options->View tab->Uncheck simple file sharing). This allowed me to access the "Security" tab on folders.
  • I then went to the folder containing the files that I extracted from the zip file. I right-clicked the folder, went to the (newly available) Security tab, and gave the Users group (of which the ASPNET account is part) standard (read/view/execute) access to that directory.

And now it works. I think a recent Windows security update is probably to blame. But, now we're back in action.

July 27, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 
 

I forgot to include links for some of the libraries I used in my talk at Code Camp:

Atlas -- http://atlas.asp.net

Anthem -- http://www.anthemdotnet.com

Prototype -- http://prototype.conio.net/

script.aculo.us -- http://script.aculo.us/

 

July 27, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 Wednesday, July 26, 2006
 
 

I've spent a lot of time lately playing with Atlas & I'm enjoying working with it, despite the huge javascript payload. I wanted to incorporate some script.aculo.us effects into an Atlas page, and noticed this helpful post from huddletogether:



there seems to be a conflict between Scriptmanager and Scriptaculous
you need to place the Scriptaculous after the scriptmanager.
add the following lines of code to get it working:
<body>
<form id="form1" runat="server">
<atlas:ScriptManager ID="sm1" EnablePartialRendering="true" runat="Server" />
<script type="text/javascript" src="/js/prototype.js"></script>
<script type="text/javascript" src="/js/scriptaculous.js?load=effects"></script>
<script type="text/javascript" src="/js/lightbox.js"></script>

other html goes here
<body>


I figured there would probably be javascript conflicts (Atlas already uses prototype's $ syntax). I wonder what else happens when intrepid ASP.NET coders want some fancy effects on their pages?

July 26, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 Monday, July 24, 2006
 
 
You can download the slides & code from Code Camp 2006 (Ajax and ASP.NET) here. Note the projects are for VS2005.
July 24, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 Tuesday, July 18, 2006
 
 

Instead of using Notepad all the time for viewing random text files, why not check out one of the many replacements mentioned on Rick Strahl's post on Notepad replacements. FWIW I use TextPad, although their menu hotkeys take a little getting used to (F5 for Find??).

July 18, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 Thursday, July 13, 2006
 
 

Up to base 36, anyhow. This is a port from old C. I think it works (did it on paper).

// return decimal version of any number up to base 36
// e.g. strtonum("110", 16) returns 272 (which is 110 in base 16)
int strtonum(orig, base)
{
    string digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    int retval = 0;

    for (int i = 0; i < orig.length; i++)
    {
        string character = orig[i];

        for (j = 0; j < digits.length; j++)
        {
            if (character == digits[j])
            {
                retval = retval * base + j;
                break;
            }
        }
    }

    return retval;
}

July 13, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]



 Thursday, July 06, 2006
 
 

FYI, I'll be speaking at Portland Code Camp 2.0 this year, talking about incorporating AJAX functionality into ASP.NET applications. July 22nd & 23rd at WSU's Vancouver campus. It's free for everyone, so go ahead and get registered. Code Camp is a very informal, code-centric (as opposed to yak-centric) conference put on by developers, for developers.

I was tempted to also speak about O/R Mappers & code generators, but I think one presentation is enough this time around. :)

Edit: Code Camp is down to one day, July 22nd. Registration is free & starts at 8am.

July 6, 2006    Bookmark to Digg or other social bookmarking
#    Disclaimer  |  Comments [0]