You're welcome, DNW,
You're using SQL Sever 2000 SP3 Developer Edition on Win2K SP4 using
UK_English, and since you're using UK_English, you should edit the noise.eng
(and NOT the US_English noise word file noise.enu) and then re-run a Full
Population.
16 rows is certainly not large enough of a table to get variations in the
query specific RANK values, but you might be ok with 500 rows, but even so,
I'd recommend that you artificially scale up to 500 rows and re-test your
FTS queries with a larger table...
Below is a short & simple explanation of how RANK is calculated:
1. The number of times a word appears in a document.
2. The closer the searched for words are to each other, the higher the rank,
until the point that they are adjacent becoming a phrase and raising the
rank even higher.
3. The ranking mechanism is weighted so that the more highly inflected the
word is from the version asked for originally, the lower its rank in the
result set. For example, "swim" would be closer to "swims" and further from
"swimmer" because "swim" and "swimmer" are less related grammatically. In
other words, the plural noun form is more related grammatically than the
past-tense verb form of the same word. When resolving queries, the
linguistic engine and ranking algorithm take these linguistic features into
account.
As your CONTAINSTABLE query table is very small (even with 500 rows), I'd
recommend that you remove the Top_N_Rank parameter, i.e., the value 10 in
your query as this is mainly implemented for performance reasons against
very large FT-enable tables.
Additionally, you're searching on "Apple 5" (within double quotes) so this
qualifies as a "phrase search", but the single digit 5 is ignored as it is
still in the noise.eng (UK_English noise word file), so Apple without any
single digits will list first, even though all rows returned have the same
query specific rank value of 32.
I've recommend that you do the following: 1) remove all single digits from
noise.eng, 2) run a Full Population, 3) remove the Top_N_Rank value of 10
from your containstable query, and if possible 4) re-test with a larger
table.
Regards,
John
[quoted text, click to view] "Dot net work" <dotnw@hotmail.com> wrote in message
news:77b8c5a9.0408310929.3ddc085@posting.google.com...
> Hi John,
>
> >>Could you post the output of the following SQL code?
>
> Sure...
>
> Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002
> 14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Developer
> Edition on Windows NT 5.0 (Build 2195: Service Pack 4)
>
> 16
>
> >>I'm assuming that you're using US_English as the "Language for Word
> Breaker"
>
> I believe it was slightly different - UK_English
>
> >>but did you also remove the single digits 1 thru 5 (or all single
> digits) from the noise word file: noise.enu?
>
> I have now done this - thanks.
>
> >>did you run a Full Population afterwards?
>
> I have now run this on my full-text tables.
>
> >>if you have less that 10,000 rows in your table
> Store_BasicSearchableItems this could also explain the low RANK
> values.
>
> I see! Actually, I only have 16 rows as I am just learning about
> full-text searching. My final table will only have around 500 rows,
> so this information worries me a little.
>
> >>you should review KB article 240833
>
> This seems very interesting, but unfortunately I can't get their SQL
> example to work. Query analyzer doesn't like this line -
>
> WHERE T.key=CT.key AND T.a > 5
>
> If I could use their example code, I code modify my stored procedure
> and get better results.
>
> Using my tweaked original SQL:
>
> select *
> from Store_BasicSearchableItems,
> containstable(Store_BasicSearchableItems, ItemName, '"Apple 5"', 10)
> tblSearchResults
> where [key] = Store_BasicSearchableItems.ItemId
> order by rank desc
>
> for the search of "Apple 5", I get these results in this order:
>
> Apple, Apple 10, Apple 5, Apple 4, etc.
>
> (Again, all results have the same rank value, this time 32)
>
> TIA,
> -dnw.
>
>
> "John Kane" <jt-kane@comcast.net> wrote in message
news:<edYF2gtjEHA.1712@TK2MSFTNGP09.phx.gbl>...
> > Dot Net Work,
> > While Hilary's reply that he's seen "... similar behavior..." to what
you've
> > observed, I don't believe that the all of the fact are in as to what is
the
> > cause for your results... Could you post the output of the following SQL
> > code?
> >
> > use <your_database_name_here>
> > go
> > SELECT @@version
> > SELECT count(*) from Store_BasicSearchableItems
> > go
> >
> > I'm assuming that you're using US_English as the "Language for Word
> > Breaker", but did you also remove the single digits 1 thru 5 (or all
single
> > digits) from the noise word file: noise.enu? If so, did you run a Full
> > Population afterwards? If not and you are passing in double quotes
around
> > your search word - "apple" - then the single digit numbers (1 thru 5)
are
> > ignored and that could be why you are seeing the same RANK value (4) for
all
> > rows. Also, the number of rows has to be statistically large enough to
get a
> > significantly RANK value returned and if you have less that 10,000 rows
in
> > your table Store_BasicSearchableItems this could also explain the low
RANK
> > values.
> >
> > Finally, Hilary recommend that you use the following sql query as a
> > workaround to what he thought to be the problem:
> >
> > select * from containstable(TableName,*,'apple',10) order by rank,
[key]
> >
> > This query in fact uses a limit parameter - Top_N_Rank (the value 10 in
the
> > above query) - that may not be appropriate for you depending upon the
above
> > information requested. Specifically, you should review KB article 240833
> > (Q240833) "FIX: Full-Text Search Performance Improved via Support for
TOP"
> > at
http://support.microsoft.com//default.aspx?scid=kb;EN-US;240833 > >
> > Regards,
> > John
> >
> >
> >
> >
> >
> > "Dot Net Work" <dotnw@hotmail.com> wrote in message
> > news:umJU1NFjEHA.2696@TK2MSFTNGP11.phx.gbl...
> > > Thanks a lot.
> > > -dnw.
> > >
> > >
> > > *** Sent via Developersdex
http://www.developersdex.com ***
> > > Don't just participate in USENET...get rewarded for it!