all groups > sql server full text search > december 2003 >
You're in the

sql server full text search

group:

Thoughts/advice on select optimizations


Re: Thoughts/advice on select optimizations John Kane
12/16/2003 1:17:17 PM
sql server full text search:
Thomas,
100K rows is well within the "sweet spot" for SQL Server 2000 Full-text
Search (FTS) and while I'm not sure if this what you're looking, but one way
to do this is with FREETEXTTABLE (or CONTAINSTABLE for more specific
results) and inserting the duplicated rows into a temp table and then using
the max RANK value to "summarize" the data down to one row, as follows:

CREATE TABLE #FTSQueryResult (PK_ID int, Rank int)
INSERT #FTSQueryResult
select * from freetexttable(FTSDash, VarcharCol,'"multi search words
phrase"') as fts
Select Max(Rank) as Rank From #FTSQueryResult Group by Rank

Thanks,
John
PS: For FTS related questions the newsgroup to post to is:
microsoft.public.sqlserver.fulltext



"Thomas Nielsen [AM Production A/S]" <jack_pot_dk@h0tmail.com> wrote in
message news:ep507uAxDHA.1576@TK2MSFTNGP11.phx.gbl...
[quoted text, click to view]

Thoughts/advice on select optimizations Thomas Nielsen [AM Production A/S]
12/16/2003 8:38:23 PM
I'm looking for thoughts/advice on how to build the fastest search + (paged)
results-page in an .NET + SQL Server 2000 environment. I hope this is the
proper group for this, otherwise i appologize for the spam.

I have one table with around 100.000 records containing some text fields.
What i need to do is free-text search through these records, and return a
paged result to the client. The user should be able to select which field to
sort by. I'm estimating 10.000 concurrent users on the system.

My initial approach to this would be to do the initial search once
(selecting records using a full-text indexing), and throw the result into a
TEMP table. Each paged result the user requests would then be taken from
this table, using DataAdapter.Fill(DataSet, startRecord, numberOfRecords)

I am, however, not sure if this is the right approach. Not using a TEMP
table would probably increase performance on the first results page, but I
would assume the TEMP table helps decrease system load a lot on subsequent
page requests. Also, I'm not completely sure if DataAdapter.Fill with
startrecord, numberofrecords does only transfers the selected records to the
web server, or if the entire resultset is transfered to the webserver, and
the dataset THEN populated with the selected records.

Does anyone have any thoughts on the best way to implement a system like
this?

Cheers,

Thomas

Re: Thoughts/advice on select optimizations Miha Markic
12/16/2003 9:25:03 PM
Hi Thomas,

I would use the following method (your fill won't work anyway because it
will use the original select to retrieve the data again) .
But with slight modifications it could work.
Here is the idea.
You do the search, but return only primary key values to a temp table.
Assuming pk is incrementing integer.

When user needs data
You would create a select like: SELECT TOP NumberOfRecords ... .... FROm ...
WHERE PrimaryKey >= FirstPrimaryKey ORDER BY PrimaryKey
where NumberOfRecords is page size and FirstPrimaryKey is first pk to fetch.

And just run adapter.Fill(datatable) when needed to fetch a page of records.

--
Miha Markic - RightHand .NET consulting & development
miha at rthand com

"Thomas Nielsen [AM Production A/S]" <jack_pot_dk@h0tmail.com> wrote in
message news:ep507uAxDHA.1576@TK2MSFTNGP11.phx.gbl...
[quoted text, click to view]

Re: Thoughts/advice on select optimizations Anders Borum
12/16/2003 11:20:07 PM
Hello!

While this approach currently is the best way to do paging in SQL Server,
you might want to look out for ADO.NET 2.0. It allows you to do paging with
SQL Server - without first copying all your rows to a TEMP table first.

As far as I remember, it was about using a new SqlDataReader object where
you could specify page size and page position before getting the
SqlDataReader.

--
venlig hilsen / with regards
anders borum
--

Re: Thoughts/advice on select optimizations Miha Markic
12/16/2003 11:33:26 PM

[quoted text, click to view]

ADO.NET is more or less a year away though. :-(

--
Miha Markic - RightHand .NET consulting & development
miha at rthand com

Re: Thoughts/advice on select optimizations John Kane
12/16/2003 11:46:40 PM
Thomas,
I would disagree and actually so does Microsoft... see
http://www.microsoft.com/sql/techinfo/tips/administration/resultset.asp.
You can use the user stored procs - spGetNextPage & spGetPrevPage - to
handle the paging and modify it to add a variable for the order by clause to
alter the sort order of the results. While the unique au_id column dictates
the order of the results in this example, you can alter this to use "name"
as the concated author's name. And to quote Microsoft - "Although it is
possible to page through result sets using the ADO Recordset object, that
solution is not scalable". This solution is both scalable and meets your
requirements of using SQL FTS and when combined with the INSERT
#FTSQueryResult code that I posted earlier, you a viable solution today
don't have to wait until ADO.NET 2.0...

Regards,
John



"Thomas Nielsen [AM Production A/S]" <jack_pot_dk@h0tmail.com> wrote in
message news:ehBKU0GxDHA.2316@TK2MSFTNGP10.phx.gbl...
[quoted text, click to view]

Re: Thoughts/advice on select optimizations Thomas Nielsen [AM Production A/S]
12/17/2003 8:15:13 AM
Miha,

So if i understand you correctly;

1) Do search, fill TEMP table with resultset
2) Get each results page from the TEMP table - but use SQL logic to retrieve
page records, rather than ADO's .Fill.

The problem with paging into an incremeting integer list of records is that
it will not work with different sorts. If the user eg. choses to display the
search result sorted by something else, I'd have to fire the original search
again.

Thanks for the thoughts,

/Thomas

[quoted text, click to view]

Re: Thoughts/advice on select optimizations Thomas Nielsen [AM Production A/S]
12/17/2003 8:18:32 AM
Hi Anders,

This needs to go live mid 2004, so unfortunately i can't rely on that :(

But it's somewhat surprising to me that ADO.NET doesn't handle this more
ellegantly. I mean, it's fine that it's "easy to use" (for the programmer)
on clients connecting to data backends, but if the query is always fired
again, i'd be really worried about performance perspectives in a lot of
scenarios.

/Thomas

[quoted text, click to view]

AddThis Social Bookmark Button