all groups > sql server full text search > february 2005 >
You're in the

sql server full text search

group:

Which search happen first ? (Yukon full text search)


Re: Which search happen first ? (Yukon full text search) John Kane
2/25/2005 8:32:26 AM
sql server full text search: Xin Chen,
I believe you are concerned about the Yukon FTS service (MSFTESQL) reading
all of a large FT Catalog when only results for a subset of rows (userid =
5) is required and therefore taking a long time to return results. Correct?

If so, then Yukon FTS does understand this restriction in the where clause
and will only read the *rows* in the FT Catalog based upon the where clause
restriction and will return results faster. If not, could you further
explain your question?

Thanks,
John
--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/



[quoted text, click to view]

Re: Which search happen first ? (Yukon full text search) Hilary Cotter
2/25/2005 10:28:50 AM
It could be done first, it depends on parallelism. However, results are
returned from the full text catalogs and then merge with the results set
returned by userid=5 to whittle down the results to matches of both
conditions.
--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com

[quoted text, click to view]

Which search happen first ? (Yukon full text search) Xin Chen
2/25/2005 3:06:19 PM
In Yukon, if I do a fulltext search and also put let's say "userid = 5" in
the where clause, for example, [Where contains("description","yukon") and
userid = 5]
Does Yukon search for all the "yukon" among all the datarows first or search
for userid =5 first?

Obviously search for userid = 5 first then full text search will be faster,
How does Yukon handle this ?

--Xin Chen

Re: Which search happen first ? (Yukon full text search) John Kane
2/25/2005 8:16:47 PM
You're welcome, Xin Chin!
Yes, I believe you can do something similar as you're hinting at via Linked
Servers between each database or server. The following is quoted from the
SQL Server 2005 Dec '04 CTP BOL "Support for CONTAINS full-text predicate
against linked servers
SQL Server full-text search functionality can be applied on text data in
remote servers when you used the CONTAINS predicate with SELECT queries
against linked servers. In Microsoft® SQL ServerT 2005, you can use
four-part names in CONTAINS or FREETEXT full-text predicates to execute
queries against linked servers."

As I've not yet tested this, and there is no exact syntax sample in the
Yukon BOL on how to do this between databases on the same server (vs. on
different servers), but I believe the syntax would look something like this
after establishing the Linked Servers:

select * from tableA where CONTAINS(serverA.databaseA..tableA.columnA, '
"searchA" ')
union all
select * from tableA where CONTAINS(serverA.databaseB..tableA.columnA, '
"searchB" ')
union all
select * from tableA where CONTAINS(serverA.databaseC..tableA.columnA, '
"searchC" ')
-- etc...

However, Yukon FTS does not support Distributed Partition Views, so having
distributed or propagated search servers, will have to wait till the next
version of SQL Server after Yukon...

Hope that helps & keep reading my blog as I'll be posting more on Yukon in
the very near future!
Thanks,
John

--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/


[quoted text, click to view]

Re: Which search happen first ? (Yukon full text search) Xin Chen
2/26/2005 9:12:04 AM
John, That is exactly what I asked. Thank you very much for your answer. By
the way, I enjoy you blog a lot.
Answer question for you. Sometime we divide a large database into identical
smaller ones(let's say 5 databases), but keep all the data structure the
same. If I fulltext index on each of the 5 databases, then is there a way I
can merge the full text search results from 5 database server to provide a
complete search result? Does Yukon offer this? I think it is critical
feature when you have distributed data. If Yukon doesn't offer this, do you
have any idea on how best handle this? Thank you very much !

--Xin Chen


[quoted text, click to view]

Re: Which search happen first ? (Yukon full text search) John Kane
2/26/2005 10:02:11 AM
You're welcome, Xin Chen,
Keep in mind that the Yukon Dec '04 CTP BOL is still a working effort, i.e.,
not complete and that the entry for "Querying Linked Servers"
[ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/fulltxt9/html/0e0eaf74-442e-41ed-8198
-ba2c6337e54b.htm] only references CONTAINS and FREETEXT that can be used
with linked servers. Furthermore, FREETEXTTABLE
[ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/4523ae15-4260-40a7-a53c
-8df15e1fee79.htm] states that "table cannot specify a server name and
cannot be used in queries against linked servers", while CONTAINSTABLE has
no such restriction. The latter might be a "DOC" bug, i.e., CONTAINSTABLE
may also have the same restriction as FREETEXTTABLE, but not yet documented.

If CONTAINSTABLE and FREETEXTTABLE cannot be used with Linked Servers and as
they are the only commands that return RANK, then returning RANK from a
cross-database or cross-server join via linked servers may not be possible.

Performance is always a concern in situations such as a cross-database join
(FTS or non-FTS) and like they say, the proof is in the pudding and testing
would need to be done to confirm that such a multi-database FTS join would
perform well. Still, its an interesting feature that is not available in SQL
Server 2000 that will have to be more explored under SQL Server 2005!

Yes, Google does use their own-brand of distributed search. Here is a paper
that describes the Google Cluster Architecture -
http://www.computer.org/micro/mi2003/m2022.pdf. Note, that their "15,000
commodity class PC's with fault-tolerant software" is now reported to be
over 100,000 today. Comparing a new database-specific search feature to an
Internet-scale Search Engine, is not an apple-to-apple comparison, but still
Microsoft SQL FTS Dev could take some pointers from Google for the next
version of SQL Sever after Yukon, IMHO...

Regards,
John
--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/


[quoted text, click to view]

Re: Which search happen first ? (Yukon full text search) Xin Chen
2/26/2005 11:00:17 PM
Thanks again, John, I will try it out on Yukon. I am not sure if the Union
All will still make the rank relevent. I heard Microsft said that rank from
one database will be irrelvent to that of other database on full text
search.

Another thing I worried is the performance. If the search research is large
data set, merge result and union them on a single server will mean large
network traffic. I wish I know how google does this. I am sure they must
use some type of distributed search. How could they do it so FAST?? Any
opinions?

--Xin Chen


[quoted text, click to view]

Re: Which search happen first ? (Yukon full text search) Xin Chen
2/27/2005 11:35:04 AM
John, thanks again for your reply and the great article you mentioned. I
will be reading it very carefully to see if I can find something I can use.

I had once written a crawler written in C# to collect the web pages and
store them in Yukon. I think Yukon (beta 2) corrupted the database when I
got to about 1 million page in it. I broke the text content of 1 page into
10 row, so the database(FT) is about 10 million rows. I had hope to see
Yukon to index 10 million web page with adequate performance. I think in
term of full text indexing and search, its fundamental techniques are pretty
common. It just matter of who can implement it better and get the best
scalability(or distributed search I believe) which is badly needed when
dealing with billion's web pages.

--Xin Chen



[quoted text, click to view]
AddThis Social Bookmark Button