sql server full text search:
Dunc, Could you post the full output of the following SQL script as the info would be helpful in understanding this... use <your_database_name_here> go SELECT @@language SELECT @@version -- Note, may need to set 'show advanced options' EXEC sp_configure 'default full-text language' EXEC sp_help_fulltext_catalogs EXEC sp_help_fulltext_tables EXEC sp_help_fulltext_columns EXEC sp_help VenueList go SELECT count(*) from VenueList go Could you also run profiler and re-execute your below query with the function BreakSearchStringToIsAbout and post the exact SQL code that it is generating? You might also want to review SQL Server 2000 BOL title "Full-text Search Recommendations" and it's last paragraph on RANK. Thanks, John [quoted text, click to view] "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message news:e3IC#GtjEHA.2652@TK2MSFTNGP15.phx.gbl... > I'm dynamically creating a SQL statement that takes each word from a > sentence, and using the WEIGHT command, searches a number of fields twice > for each word - once with a given weight for the exact word, and once with > half the weight for a partial match (by putting asterisks at the start and > end of each word). > > Going against all the documentation I've got, items with a lower weight seem > to be coming up higher. After a bit more playing around (I removed the > partial match functionality), the lower the number, the higher the rank > seems to be. I can't just multiply the partial match value, as it'll > potentially go over one. > > Either I'm doing something dumb, or I've missed a pretty valuable piece of > documentation. Can someone point me in the right direction here? I've put > a few key snippets of the code below. > > Thanks in advance, > > Dunc > > "FROM VenueList " & _ > "LEFT JOIN CONTAINSTABLE(VenueList, VenueName, 'ISABOUT (" & > BreakSearchStringToIsAbout(arrSearchString, 0.5) & ")') AS B " & _ > " ON A.VenueID = B.[Key] " & _ > > ---/ snip /--- > > Function BreakSearchStringToIsAbout(arrSearchItem, iWeight) > Dim iLoop > Dim strOutput > > strOutput = "" > For iLoop = 0 to uBound(arrSearchItem) > strOutput = strOutput & """" & arrSearchItem(iLoop) & """" & " WEIGHT(" & > iWeight & "), " > strOutput = strOutput & """*" & arrSearchItem(iLoop) & "*""" & " WEIGHT(" > & iWeight / 2 & "), " > Next > > strOutput = Left(strOutput, Len(strOutput) - 2) > > BreakSearchStringToIsAbout = strOutput > End Function > >
I'm a little confused. When I try to build use this function I get output that looks like this: "the" WEIGHT(10), "*the*" WEIGHT(5), "rain" WEIGHT(10), "*rain*" WEIGHT(5), "in" WEIGHT(10), "*in*" WEIGHT(5), "spain" WEIGHT(10), "*spain*" WEIGHT(5), "stays" WEIGHT(10), "*stays*" WEIGHT(5), "mainly" WEIGHT(10), "*mainly*" WEIGHT(5), "in" WEIGHT(10), "*in*" WEIGHT(5), "the" WEIGHT(10), "*the*" WEIGHT(5), "plain" WEIGHT(10), "*plain*" WEIGHT(5) A couple of points: The beginning * is not treated as a wildcard operator for prefixes, its ignored, so you can remove this altogether. It is valid for the suffix, so you might get the same results if you do a FreeText. Secondly, you are doing a left join which says give me all results on the left side, whether there is a match on the right side or not, and there is always a match on the right side, because it looks like you are joining against the base table. It might be helpful to try to understand exactly what you are trying to accomplish, include complete schemas, etc. -- Hilary Cotter Looking for a book on SQL Server replication? http://www.nwsu.com/0974973602.html [quoted text, click to view] "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message news:e3IC%23GtjEHA.2652@TK2MSFTNGP15.phx.gbl... > I'm dynamically creating a SQL statement that takes each word from a > sentence, and using the WEIGHT command, searches a number of fields twice > for each word - once with a given weight for the exact word, and once with > half the weight for a partial match (by putting asterisks at the start and > end of each word). > > Going against all the documentation I've got, items with a lower weight seem > to be coming up higher. After a bit more playing around (I removed the > partial match functionality), the lower the number, the higher the rank > seems to be. I can't just multiply the partial match value, as it'll > potentially go over one. > > Either I'm doing something dumb, or I've missed a pretty valuable piece of > documentation. Can someone point me in the right direction here? I've put > a few key snippets of the code below. > > Thanks in advance, > > Dunc > > "FROM VenueList " & _ > "LEFT JOIN CONTAINSTABLE(VenueList, VenueName, 'ISABOUT (" & > BreakSearchStringToIsAbout(arrSearchString, 0.5) & ")') AS B " & _ > " ON A.VenueID = B.[Key] " & _ > > ---/ snip /--- > > Function BreakSearchStringToIsAbout(arrSearchItem, iWeight) > Dim iLoop > Dim strOutput > > strOutput = "" > For iLoop = 0 to uBound(arrSearchItem) > strOutput = strOutput & """" & arrSearchItem(iLoop) & """" & " WEIGHT(" & > iWeight & "), " > strOutput = strOutput & """*" & arrSearchItem(iLoop) & "*""" & " WEIGHT(" > & iWeight / 2 & "), " > Next > > strOutput = Left(strOutput, Len(strOutput) - 2) > > BreakSearchStringToIsAbout = strOutput > End Function > >
I'm dynamically creating a SQL statement that takes each word from a sentence, and using the WEIGHT command, searches a number of fields twice for each word - once with a given weight for the exact word, and once with half the weight for a partial match (by putting asterisks at the start and end of each word). Going against all the documentation I've got, items with a lower weight seem to be coming up higher. After a bit more playing around (I removed the partial match functionality), the lower the number, the higher the rank seems to be. I can't just multiply the partial match value, as it'll potentially go over one. Either I'm doing something dumb, or I've missed a pretty valuable piece of documentation. Can someone point me in the right direction here? I've put a few key snippets of the code below. Thanks in advance, Dunc "FROM VenueList " & _ "LEFT JOIN CONTAINSTABLE(VenueList, VenueName, 'ISABOUT (" & BreakSearchStringToIsAbout(arrSearchString, 0.5) & ")') AS B " & _ " ON A.VenueID = B.[Key] " & _ ---/ snip /--- Function BreakSearchStringToIsAbout(arrSearchItem, iWeight) Dim iLoop Dim strOutput strOutput = "" For iLoop = 0 to uBound(arrSearchItem) strOutput = strOutput & """" & arrSearchItem(iLoop) & """" & " WEIGHT(" & iWeight & "), " strOutput = strOutput & """*" & arrSearchItem(iLoop) & "*""" & " WEIGHT(" & iWeight / 2 & "), " Next strOutput = Left(strOutput, Len(strOutput) - 2) BreakSearchStringToIsAbout = strOutput End Function
You're welcome, Dunc, Not to worry, mostly, I need to know the SQL Sever version and OS platform info, our language settings, and row count of your FT-enable table... This is all basic configuration info necessary to troubleshoot SQL FTS issues... You're using SQL Server 2000 SP3 Developer Edition on Win2003 with the default language of US_English on at table with 2,272 rows. Can I assume that your goal here is to have the search keyword ("late" in your query below) have a greater weight from one column vs. another? If so, I see two issues... One, is that you should use only "INNER JOIN" or just JOIN (defaults to INNER JOIN) vs. LEFT JOIN in your query - see the Northwind example below. Secondly, your table size of only 2,272 rows may be too small to get a significant variation of query specific RANK values and then be *influenced* by the Weight parameter. It would better to re-test your modified query against a larger table, say with at least 10,000 rows. -- note, the use of JOIN (defaults to INNER JOIN) only... use Northwind go SELECT e.EmployeeID, e.LastName, ct.[KEY], ct.[RANK] FROM Employees AS e JOIN CONTAINSTABLE(Employees, Notes, 'French') AS ct ON e.EmployeeID = ct.[KEY] ORDER BY ct.[RANK] DESC If it is your goal to have the same keyword rank higher or lower in one column over another columns, I may have another method for achieving this, but I need to test it out a bit with a variable weight parameter. Thanks, John [quoted text, click to view] "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message news:ehc4Zw4jEHA.1764@TK2MSFTNGP10.phx.gbl... > I've had to temporarily remove the item* searches, as it was returning too > much rubbish. > > I'm a little nervous re: posting some of this info to a newsgroup, so I've > renamed a few columns and table names. This query actually works > *reasonably* well, though you'll notice that only one column has a weight of > anything apart from 1.0 - and that's the name column (which is the most > important). If I raise this up at all, the name seems to drop to a > significantly less relevance, and everything goes pear-shaped. > > Thanks in advance, > > Dunc > > ---/ snip /--- > > language: us_english > > version: Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002 > 14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Developer Edition > on Windows NT 5.2 (Build 3790: ) > > default full-text language: default full-text language 0 2147483647 1033 > 1033 > > fulltext_catalogs: 5 MyDirectory e:\SQLServer\MSSQL\FTDATA 0 7 > > fulltext_tables (amongst a few irrelevant tables): > dbo VenueList PK_VenueList 1 1 MyDirectory > dbo VenueExtras PK_VenueExtras 1 1 MyDirectory > > fulltext_columns (amongst a few irrelevant columns): > dbo 1856725667 VenueList VenueName 2 NULL NULL 1033 > dbo 1856725667 VenueList Addr1 3 NULL NULL 1033 > dbo 1856725667 VenueList Addr2 4 NULL NULL 1033 > dbo 1856725667 VenueList Addr3 5 NULL NULL 1033 > dbo 1856725667 VenueList Addr4 6 NULL NULL 1033 > dbo 1856725667 VenueList Addr5 8 NULL NULL 1033 > dbo 1856725667 VenueList Postcode 9 NULL NULL 1033 > dbo 1856725667 VenueList Style 20 NULL NULL 1033 > dbo 1856725667 VenueList Theme 22 NULL NULL 1033 > dbo 1856725667 VenueList Description 23 NULL NULL 1033 > dbo 1856725667 VenueList NearestTransport 43 NULL NULL 1033 > dbo 1856725667 VenueList Review 48 NULL NULL 1033 > dbo 1245247491 VenueExtras ExtraName 2 NULL NULL 1033 > > Venuelist Count: 2272 > > Full Query from profiler (someone searched on the term "late"): > SELECT * FROM ( > SELECT VenueID, VenueName, Addr1, Addr2, Addr3, Addr4, Review, > ISNULL(B.RANK, 0) + ISNULL(C.RANK, 0) + ISNULL(D.RANK, 0) + ISNULL(E.RANK, > 0) + ISNULL(F.RANK, 0) + ISNULL(G.RANK, 0) + ISNULL(H.RANK, 0) + > ISNULL(I.RANK, 0) + ISNULL(J.RANK, 0) + ISNULL(K.RANK, 0) + > ISNULL(L.RANK, 0) + ISNULL(M.RANK, 0) + ISNULL(N.RANK, 0) AS Rank > FROM VenueList AS A > LEFT JOIN CONTAINSTABLE(VenueList, Venue, 'ISABOUT ("late" WEIGHT(0.1))') AS > B > ON A.VenueID = B.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Addr1, 'ISABOUT ("late" WEIGHT(1))') AS C > ON A.VenueID = C.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Addr2, 'ISABOUT ("late" WEIGHT(1))') AS D > ON A.VenueID = D.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Addr3, 'ISABOUT ("late" WEIGHT(1))') AS E > ON A.VenueID = E.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Addr4, 'ISABOUT ("late" WEIGHT(1))') AS F > ON A.VenueID = F.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Addr5, 'ISABOUT ("late" WEIGHT(1))') AS G > ON A.VenueID = G.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Postcode, 'ISABOUT ("late" WEIGHT(1))') > AS H > ON A.VenueID = H.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Style, 'ISABOUT ("late" WEIGHT(1))') AS I > ON A.VenueID = I.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Theme, 'ISABOUT ("late" WEIGHT(1))') AS J > ON A.VenueID = J.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Description, 'ISABOUT ("late" > WEIGHT(1))') AS K > ON A.VenueID = K.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, NearestTransport, 'ISABOUT ("late" > WEIGHT(1))') AS L > ON A.VenueID = L.[Key] > LEFT JOIN CONTAINSTABLE(VenueList, Review, 'ISABOUT ("late" WEIGHT(1))') AS > M > ON A.VenueID = M.[Key] > LEFT JOIN CONTAINSTABLE(VenueExtras, ExtraName, 'ISABOUT ("late" > WEIGHT(1))') AS N > ON A.VenueID = N.[Key] > WHERE A.Status > 0 > AND (B.RANK IS NOT NULL > OR C.RANK IS NOT NULL > OR D.RANK IS NOT NULL > OR E.RANK IS NOT NULL > OR F.RANK IS NOT NULL > OR G.RANK IS NOT NULL > OR H.RANK IS NOT NULL > OR I.RANK IS NOT NULL > OR I.RANK IS NOT NULL > OR K.RANK IS NOT NULL > OR L.RANK IS NOT NULL > OR M.RANK IS NOT NULL > OR N.RANK IS NOT NULL) > ) SubQuery ORDER BY Rank DESC, Venue > > "John Kane" <jt-kane@comcast.net> wrote in message > news:OCmmFZtjEHA.3016@tk2msftngp13.phx.gbl... > > Dunc, > > Could you post the full output of the following SQL script as the info > would > > be helpful in understanding this... > > > > use <your_database_name_here> > > go > > SELECT @@language > > SELECT @@version > > -- Note, may need to set 'show advanced options' > > EXEC sp_configure 'default full-text language' > > EXEC sp_help_fulltext_catalogs > > EXEC sp_help_fulltext_tables > > EXEC sp_help_fulltext_columns > > EXEC sp_help VenueList > > go > > SELECT count(*) from VenueList > > go > > > > Could you also run profiler and re-execute your below query with the > > function BreakSearchStringToIsAbout and post the exact SQL code that it is > > generating? You might also want to review SQL Server 2000 BOL title > > "Full-text Search Recommendations" and it's last paragraph on RANK. > > > > Thanks, > > John > > > > > > > > > > "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message > > news:e3IC#GtjEHA.2652@TK2MSFTNGP15.phx.gbl... > > > I'm dynamically creating a SQL statement that takes each word from a
Dunc, Ok, what I think you really need is a multiple column join using a single table example using an OR condition... (BTW, this is the 3rd one I've posted today, although, not the exact version as below..;). Below is an example of this... use Northwind SELECT distinct e.LastName, e.FirstName, e.Title from Employees AS e, containstable(Employees, Notes, 'BTS') as A, containstable(Employees, Title, 'Representative') as B where A.[KEY] = e.EmployeeID or -- NOTE OR condition here B.[KEY] = e.EmployeeID Note, that I had to use the distinct keyword on the LastName column to remove duplicate results. Try altering your query below to the above with the column specific weights as well as OR between the KEY comparisons. While I fully understand why you want to do this and that you're trying to develop a viable solution to a failing of SQL FTS, you should know that this multiple CONTAINSTABLE query will perform very poorly as each CONTAINSTABLE clause is a "round trip" to the FT Catalog. So, as your table and therefore the FT Catalog query grows, this query will get slower and slower... Regards, John [quoted text, click to view] "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message news:e1SCiI6jEHA.2692@TK2MSFTNGP12.phx.gbl... > Hi John, > > I've made the change to the JOIN statements (ex LEFT JOIN), but that hasn't > worked, as they keyword I'm searching for isn't in every field - potentially > only one - so it's never returning any results. > > This SQL code is running on a mirror of a site that's live and running at > the moment - it's a bar review website, run by a bunch of hobby'ists (i.e. > it's all done by volunteers - techies, marketing, and a bunch of journos). > As much as I'd like 100k rows, 2.2k is actually a pretty high number for the > data it's storing. When we launch the next phase, there will be more like > 40k, but that will be branched down by region to ~5k per region. > > I'm really appreciating your help - if you'd like to get a better idea of > what we're trying to achieve, the site is at: > http://www.fluidfoundation.com > > You'll see the search on the right hand side of every page, or in the middle > of the front page. > > Dunc > > "John Kane" <jt-kane@comcast.net> wrote in message > news:Osb7SV5jEHA.2668@TK2MSFTNGP10.phx.gbl... > > You're welcome, Dunc, > > Not to worry, mostly, I need to know the SQL Sever version and OS platform > > info, our language settings, and row count of your FT-enable table... This > > is all basic configuration info necessary to troubleshoot SQL FTS > issues... > > > > You're using SQL Server 2000 SP3 Developer Edition on Win2003 with the > > default language of US_English on at table with 2,272 rows. Can I assume > > that your goal here is to have the search keyword ("late" in your query > > below) have a greater weight from one column vs. another? > > > > If so, I see two issues... One, is that you should use only "INNER JOIN" > or > > just JOIN (defaults to INNER JOIN) vs. LEFT JOIN in your query - see the > > Northwind example below. Secondly, your table size of only 2,272 rows may > be > > too small to get a significant variation of query specific RANK values and > > then be *influenced* by the Weight parameter. It would better to re-test > > your modified query against a larger table, say with at least 10,000 rows. > > > > -- note, the use of JOIN (defaults to INNER JOIN) only... > > use Northwind > > go > > SELECT e.EmployeeID, e.LastName, ct.[KEY], ct.[RANK] > > FROM Employees AS e > > JOIN CONTAINSTABLE(Employees, Notes, 'French') AS ct ON e.EmployeeID = > > ct.[KEY] > > ORDER BY ct.[RANK] DESC > > > > If it is your goal to have the same keyword rank higher or lower in one > > column over another columns, I may have another method for achieving this, > > but I need to test it out a bit with a variable weight parameter. > > > > Thanks, > > John > > > > > > > > "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message > > news:ehc4Zw4jEHA.1764@TK2MSFTNGP10.phx.gbl... > > > I've had to temporarily remove the item* searches, as it was returning > too > > > much rubbish. > > > > > > I'm a little nervous re: posting some of this info to a newsgroup, so > I've > > > renamed a few columns and table names. This query actually works > > > *reasonably* well, though you'll notice that only one column has a > weight > > of > > > anything apart from 1.0 - and that's the name column (which is the most > > > important). If I raise this up at all, the name seems to drop to a > > > significantly less relevance, and everything goes pear-shaped. > > > > > > Thanks in advance, > > > > > > Dunc > > > > > > ---/ snip /--- > > > > > > language: us_english > > > > > > version: Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002 > > > 14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Developer > > Edition > > > on Windows NT 5.2 (Build 3790: ) > > > > > > default full-text language: default full-text language 0 2147483647 1033 > > > 1033 > > > > > > fulltext_catalogs: 5 MyDirectory e:\SQLServer\MSSQL\FTDATA 0 7 > > > > > > fulltext_tables (amongst a few irrelevant tables): > > > dbo VenueList PK_VenueList 1 1 MyDirectory > > > dbo VenueExtras PK_VenueExtras 1 1 MyDirectory > > > > > > fulltext_columns (amongst a few irrelevant columns): > > > dbo 1856725667 VenueList VenueName 2 NULL NULL 1033 > > > dbo 1856725667 VenueList Addr1 3 NULL NULL 1033 > > > dbo 1856725667 VenueList Addr2 4 NULL NULL 1033 > > > dbo 1856725667 VenueList Addr3 5 NULL NULL 1033 > > > dbo 1856725667 VenueList Addr4 6 NULL NULL 1033 > > > dbo 1856725667 VenueList Addr5 8 NULL NULL 1033 > > > dbo 1856725667 VenueList Postcode 9 NULL NULL 1033 > > > dbo 1856725667 VenueList Style 20 NULL NULL 1033 > > > dbo 1856725667 VenueList Theme 22 NULL NULL 1033 > > > dbo 1856725667 VenueList Description 23 NULL NULL 1033 > > > dbo 1856725667 VenueList NearestTransport 43 NULL NULL 1033 > > > dbo 1856725667 VenueList Review 48 NULL NULL 1033 > > > dbo 1245247491 VenueExtras ExtraName 2 NULL NULL 1033 > > > > > > Venuelist Count: 2272 > > > > > > Full Query from profiler (someone searched on the term "late"): > > > SELECT * FROM ( > > > SELECT VenueID, VenueName, Addr1, Addr2, Addr3, Addr4, Review, > > > ISNULL(B.RANK, 0) + ISNULL(C.RANK, 0) + ISNULL(D.RANK, 0) + > ISNULL(E.RANK, > > > 0) + ISNULL(F.RANK, 0) + ISNULL(G.RANK, 0) + ISNULL(H.RANK, 0) + > > > ISNULL(I.RANK, 0) + ISNULL(J.RANK, 0) + ISNULL(K.RANK, 0) + > > > ISNULL(L.RANK, 0) + ISNULL(M.RANK, 0) + ISNULL(N.RANK, 0) AS Rank > > > FROM VenueList AS A > > > LEFT JOIN CONTAINSTABLE(VenueList, Venue, 'ISABOUT ("late" > WEIGHT(0.1))') > > AS > > > B > > > ON A.VenueID = B.[Key] > > > LEFT JOIN CONTAINSTABLE(VenueList, Addr1, 'ISABOUT ("late" WEIGHT(1))') > AS > > C > > > ON A.VenueID = C.[Key]
I've had to temporarily remove the item* searches, as it was returning too much rubbish. I'm a little nervous re: posting some of this info to a newsgroup, so I've renamed a few columns and table names. This query actually works *reasonably* well, though you'll notice that only one column has a weight of anything apart from 1.0 - and that's the name column (which is the most important). If I raise this up at all, the name seems to drop to a significantly less relevance, and everything goes pear-shaped. Thanks in advance, Dunc ---/ snip /--- language: us_english version: Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002 14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Developer Edition on Windows NT 5.2 (Build 3790: ) default full-text language: default full-text language 0 2147483647 1033 1033 fulltext_catalogs: 5 MyDirectory e:\SQLServer\MSSQL\FTDATA 0 7 fulltext_tables (amongst a few irrelevant tables): dbo VenueList PK_VenueList 1 1 MyDirectory dbo VenueExtras PK_VenueExtras 1 1 MyDirectory fulltext_columns (amongst a few irrelevant columns): dbo 1856725667 VenueList VenueName 2 NULL NULL 1033 dbo 1856725667 VenueList Addr1 3 NULL NULL 1033 dbo 1856725667 VenueList Addr2 4 NULL NULL 1033 dbo 1856725667 VenueList Addr3 5 NULL NULL 1033 dbo 1856725667 VenueList Addr4 6 NULL NULL 1033 dbo 1856725667 VenueList Addr5 8 NULL NULL 1033 dbo 1856725667 VenueList Postcode 9 NULL NULL 1033 dbo 1856725667 VenueList Style 20 NULL NULL 1033 dbo 1856725667 VenueList Theme 22 NULL NULL 1033 dbo 1856725667 VenueList Description 23 NULL NULL 1033 dbo 1856725667 VenueList NearestTransport 43 NULL NULL 1033 dbo 1856725667 VenueList Review 48 NULL NULL 1033 dbo 1245247491 VenueExtras ExtraName 2 NULL NULL 1033 Venuelist Count: 2272 Full Query from profiler (someone searched on the term "late"): SELECT * FROM ( SELECT VenueID, VenueName, Addr1, Addr2, Addr3, Addr4, Review, ISNULL(B.RANK, 0) + ISNULL(C.RANK, 0) + ISNULL(D.RANK, 0) + ISNULL(E.RANK, 0) + ISNULL(F.RANK, 0) + ISNULL(G.RANK, 0) + ISNULL(H.RANK, 0) + ISNULL(I.RANK, 0) + ISNULL(J.RANK, 0) + ISNULL(K.RANK, 0) + ISNULL(L.RANK, 0) + ISNULL(M.RANK, 0) + ISNULL(N.RANK, 0) AS Rank FROM VenueList AS A LEFT JOIN CONTAINSTABLE(VenueList, Venue, 'ISABOUT ("late" WEIGHT(0.1))') AS B ON A.VenueID = B.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Addr1, 'ISABOUT ("late" WEIGHT(1))') AS C ON A.VenueID = C.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Addr2, 'ISABOUT ("late" WEIGHT(1))') AS D ON A.VenueID = D.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Addr3, 'ISABOUT ("late" WEIGHT(1))') AS E ON A.VenueID = E.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Addr4, 'ISABOUT ("late" WEIGHT(1))') AS F ON A.VenueID = F.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Addr5, 'ISABOUT ("late" WEIGHT(1))') AS G ON A.VenueID = G.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Postcode, 'ISABOUT ("late" WEIGHT(1))') AS H ON A.VenueID = H.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Style, 'ISABOUT ("late" WEIGHT(1))') AS I ON A.VenueID = I.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Theme, 'ISABOUT ("late" WEIGHT(1))') AS J ON A.VenueID = J.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Description, 'ISABOUT ("late" WEIGHT(1))') AS K ON A.VenueID = K.[Key] LEFT JOIN CONTAINSTABLE(VenueList, NearestTransport, 'ISABOUT ("late" WEIGHT(1))') AS L ON A.VenueID = L.[Key] LEFT JOIN CONTAINSTABLE(VenueList, Review, 'ISABOUT ("late" WEIGHT(1))') AS M ON A.VenueID = M.[Key] LEFT JOIN CONTAINSTABLE(VenueExtras, ExtraName, 'ISABOUT ("late" WEIGHT(1))') AS N ON A.VenueID = N.[Key] WHERE A.Status > 0 AND (B.RANK IS NOT NULL OR C.RANK IS NOT NULL OR D.RANK IS NOT NULL OR E.RANK IS NOT NULL OR F.RANK IS NOT NULL OR G.RANK IS NOT NULL OR H.RANK IS NOT NULL OR I.RANK IS NOT NULL OR I.RANK IS NOT NULL OR K.RANK IS NOT NULL OR L.RANK IS NOT NULL OR M.RANK IS NOT NULL OR N.RANK IS NOT NULL) ) SubQuery ORDER BY Rank DESC, Venue [quoted text, click to view] "John Kane" <jt-kane@comcast.net> wrote in message news:OCmmFZtjEHA.3016@tk2msftngp13.phx.gbl... > Dunc, > Could you post the full output of the following SQL script as the info would > be helpful in understanding this... > > use <your_database_name_here> > go > SELECT @@language > SELECT @@version > -- Note, may need to set 'show advanced options' > EXEC sp_configure 'default full-text language' > EXEC sp_help_fulltext_catalogs > EXEC sp_help_fulltext_tables > EXEC sp_help_fulltext_columns > EXEC sp_help VenueList > go > SELECT count(*) from VenueList > go > > Could you also run profiler and re-execute your below query with the > function BreakSearchStringToIsAbout and post the exact SQL code that it is > generating? You might also want to review SQL Server 2000 BOL title > "Full-text Search Recommendations" and it's last paragraph on RANK. > > Thanks, > John > > > > > "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message > news:e3IC#GtjEHA.2652@TK2MSFTNGP15.phx.gbl... > > I'm dynamically creating a SQL statement that takes each word from a > > sentence, and using the WEIGHT command, searches a number of fields twice > > for each word - once with a given weight for the exact word, and once with > > half the weight for a partial match (by putting asterisks at the start and > > end of each word). > > > > Going against all the documentation I've got, items with a lower weight > seem > > to be coming up higher. After a bit more playing around (I removed the > > partial match functionality), the lower the number, the higher the rank > > seems to be. I can't just multiply the partial match value, as it'll > > potentially go over one. > > > > Either I'm doing something dumb, or I've missed a pretty valuable piece of > > documentation. Can someone point me in the right direction here? I've > put > > a few key snippets of the code below. > > > > Thanks in advance, > > > > Dunc > > > > "FROM VenueList " & _ > > "LEFT JOIN CONTAINSTABLE(VenueList, VenueName, 'ISABOUT (" & > > BreakSearchStringToIsAbout(arrSearchString, 0.5) & ")') AS B " & _ > > " ON A.VenueID = B.[Key] " & _ > > > > ---/ snip /--- > > > > Function BreakSearchStringToIsAbout(arrSearchItem, iWeight) > > Dim iLoop > > Dim strOutput > > > > strOutput = "" > > For iLoop = 0 to uBound(arrSearchItem) > > strOutput = strOutput & """" & arrSearchItem(iLoop) & """" & " WEIGHT(" > & > > iWeight & "), " > > strOutput = strOutput & """*" & arrSearchItem(iLoop) & "*""" & " > WEIGHT(" > > & iWeight / 2 & "), " > > Next > > > > strOutput = Left(strOutput, Len(strOutput) - 2) > > > > BreakSearchStringToIsAbout = strOutput > > End Function > > > > > >
Hi John, I've made the change to the JOIN statements (ex LEFT JOIN), but that hasn't worked, as they keyword I'm searching for isn't in every field - potentially only one - so it's never returning any results. This SQL code is running on a mirror of a site that's live and running at the moment - it's a bar review website, run by a bunch of hobby'ists (i.e. it's all done by volunteers - techies, marketing, and a bunch of journos). As much as I'd like 100k rows, 2.2k is actually a pretty high number for the data it's storing. When we launch the next phase, there will be more like 40k, but that will be branched down by region to ~5k per region. I'm really appreciating your help - if you'd like to get a better idea of what we're trying to achieve, the site is at: http://www.fluidfoundation.com You'll see the search on the right hand side of every page, or in the middle of the front page. Dunc [quoted text, click to view] "John Kane" <jt-kane@comcast.net> wrote in message news:Osb7SV5jEHA.2668@TK2MSFTNGP10.phx.gbl... > You're welcome, Dunc, > Not to worry, mostly, I need to know the SQL Sever version and OS platform > info, our language settings, and row count of your FT-enable table... This > is all basic configuration info necessary to troubleshoot SQL FTS issues... > > You're using SQL Server 2000 SP3 Developer Edition on Win2003 with the > default language of US_English on at table with 2,272 rows. Can I assume > that your goal here is to have the search keyword ("late" in your query > below) have a greater weight from one column vs. another? > > If so, I see two issues... One, is that you should use only "INNER JOIN" or > just JOIN (defaults to INNER JOIN) vs. LEFT JOIN in your query - see the > Northwind example below. Secondly, your table size of only 2,272 rows may be > too small to get a significant variation of query specific RANK values and > then be *influenced* by the Weight parameter. It would better to re-test > your modified query against a larger table, say with at least 10,000 rows. > > -- note, the use of JOIN (defaults to INNER JOIN) only... > use Northwind > go > SELECT e.EmployeeID, e.LastName, ct.[KEY], ct.[RANK] > FROM Employees AS e > JOIN CONTAINSTABLE(Employees, Notes, 'French') AS ct ON e.EmployeeID = > ct.[KEY] > ORDER BY ct.[RANK] DESC > > If it is your goal to have the same keyword rank higher or lower in one > column over another columns, I may have another method for achieving this, > but I need to test it out a bit with a variable weight parameter. > > Thanks, > John > > > > "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message > news:ehc4Zw4jEHA.1764@TK2MSFTNGP10.phx.gbl... > > I've had to temporarily remove the item* searches, as it was returning too > > much rubbish. > > > > I'm a little nervous re: posting some of this info to a newsgroup, so I've > > renamed a few columns and table names. This query actually works > > *reasonably* well, though you'll notice that only one column has a weight > of > > anything apart from 1.0 - and that's the name column (which is the most > > important). If I raise this up at all, the name seems to drop to a > > significantly less relevance, and everything goes pear-shaped. > > > > Thanks in advance, > > > > Dunc > > > > ---/ snip /--- > > > > language: us_english > > > > version: Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002 > > 14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Developer > Edition > > on Windows NT 5.2 (Build 3790: ) > > > > default full-text language: default full-text language 0 2147483647 1033 > > 1033 > > > > fulltext_catalogs: 5 MyDirectory e:\SQLServer\MSSQL\FTDATA 0 7 > > > > fulltext_tables (amongst a few irrelevant tables): > > dbo VenueList PK_VenueList 1 1 MyDirectory > > dbo VenueExtras PK_VenueExtras 1 1 MyDirectory > > > > fulltext_columns (amongst a few irrelevant columns): > > dbo 1856725667 VenueList VenueName 2 NULL NULL 1033 > > dbo 1856725667 VenueList Addr1 3 NULL NULL 1033 > > dbo 1856725667 VenueList Addr2 4 NULL NULL 1033 > > dbo 1856725667 VenueList Addr3 5 NULL NULL 1033 > > dbo 1856725667 VenueList Addr4 6 NULL NULL 1033 > > dbo 1856725667 VenueList Addr5 8 NULL NULL 1033 > > dbo 1856725667 VenueList Postcode 9 NULL NULL 1033 > > dbo 1856725667 VenueList Style 20 NULL NULL 1033 > > dbo 1856725667 VenueList Theme 22 NULL NULL 1033 > > dbo 1856725667 VenueList Description 23 NULL NULL 1033 > > dbo 1856725667 VenueList NearestTransport 43 NULL NULL 1033 > > dbo 1856725667 VenueList Review 48 NULL NULL 1033 > > dbo 1245247491 VenueExtras ExtraName 2 NULL NULL 1033 > > > > Venuelist Count: 2272 > > > > Full Query from profiler (someone searched on the term "late"): > > SELECT * FROM ( > > SELECT VenueID, VenueName, Addr1, Addr2, Addr3, Addr4, Review, > > ISNULL(B.RANK, 0) + ISNULL(C.RANK, 0) + ISNULL(D.RANK, 0) + ISNULL(E.RANK, > > 0) + ISNULL(F.RANK, 0) + ISNULL(G.RANK, 0) + ISNULL(H.RANK, 0) + > > ISNULL(I.RANK, 0) + ISNULL(J.RANK, 0) + ISNULL(K.RANK, 0) + > > ISNULL(L.RANK, 0) + ISNULL(M.RANK, 0) + ISNULL(N.RANK, 0) AS Rank > > FROM VenueList AS A > > LEFT JOIN CONTAINSTABLE(VenueList, Venue, 'ISABOUT ("late" WEIGHT(0.1))') > AS > > B > > ON A.VenueID = B.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Addr1, 'ISABOUT ("late" WEIGHT(1))') AS > C > > ON A.VenueID = C.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Addr2, 'ISABOUT ("late" WEIGHT(1))') AS > D > > ON A.VenueID = D.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Addr3, 'ISABOUT ("late" WEIGHT(1))') AS > E > > ON A.VenueID = E.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Addr4, 'ISABOUT ("late" WEIGHT(1))') AS > F > > ON A.VenueID = F.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Addr5, 'ISABOUT ("late" WEIGHT(1))') AS > G > > ON A.VenueID = G.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Postcode, 'ISABOUT ("late" WEIGHT(1))') > > AS H > > ON A.VenueID = H.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Style, 'ISABOUT ("late" WEIGHT(1))') AS > I > > ON A.VenueID = I.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Theme, 'ISABOUT ("late" WEIGHT(1))') AS > J > > ON A.VenueID = J.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Description, 'ISABOUT ("late" > > WEIGHT(1))') AS K > > ON A.VenueID = K.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, NearestTransport, 'ISABOUT ("late" > > WEIGHT(1))') AS L > > ON A.VenueID = L.[Key] > > LEFT JOIN CONTAINSTABLE(VenueList, Review, 'ISABOUT ("late" WEIGHT(1))') > AS > > M > > ON A.VenueID = M.[Key] > > LEFT JOIN CONTAINSTABLE(VenueExtras, ExtraName, 'ISABOUT ("late" > > WEIGHT(1))') AS N > > ON A.VenueID = N.[Key] > > WHERE A.Status > 0 > > AND (B.RANK IS NOT NULL > > OR C.RANK IS NOT NULL > > OR D.RANK IS NOT NULL > > OR E.RANK IS NOT NULL > > OR F.RANK IS NOT NULL > > OR G.RANK IS NOT NULL > > OR H.RANK IS NOT NULL > > OR I.RANK IS NOT NULL > > OR I.RANK IS NOT NULL
Don't see what you're looking for? Try a search.
|