sql server full text search:
Hello, i have a table with 34,000,000(!) rows on a Sql2000 with Win2k. (Celeron 1,3 GHZ, 256 MB Ram, 120GB ATA 100) I started a Full population and it's in progress since 28 hrs. 4 Columns with nvchar(50) are included. Here are some futher informations: Status: Full population in progress Item Count: 71145 IndexSize: 3 MB UniqueKeyCount: 47325 LogSize: 1655 Start: 2003-11-26 15:54:36.000 Can somebody tell me, how long it could take ? Can i monitor the progress, so that i can see how many rows are already edited and how many are unworked. Is there a way to speedup the progress? Thanks a lot for you help... :-)) Best regards,
Neils, Hmm... A FT-enabled table with 34 million rows... Depending upon where you have your FT Catalog residing (on the same drive as your database files?), this could take a very long time... certainly 8 to 10+ days to complete. I'd recommend that if you cannot wait that long for the Full Population to wait that you stop the population and set the "Microsoft Search" Service to manual start and then manually remove the FT Catalog as the MSSearch service will automaticlly attempt to restart the population. As for a monitioring process, you can use and modifiy the following T-SQL code: use pubs go BEGIN SET NOCOUNT ON DECLARE @begin datetime DECLARE @end datetime SET @begin = CURRENT_TIMESTAMP EXEC sp_fulltext_catalog 'PubInfo', 'start_full' -- "Full Crawl" -- EXEC sp_fulltext_catalog 'PubInfo', 'start_incremental' -- "Incremental Crawl" -- --- Wait for crawl to complete -- DECLARE @status int, @itemCount int, @keyCount int, @indexSize int SELECT @status = FullTextCatalogProperty('PubInfo', 'populatestatus') WHILE (@status <> 0) BEGIN WAITFOR DELAY '00:00:01' -- wait for 1 second before checking FT Populatestatus... SELECT @status = FullTextCatalogProperty('PubInfo', 'populatestatus') END SET @end = CURRENT_TIMESTAMP WAITFOR DELAY '00:00:15' -- wait for 15 seconds in order to get correct FT Property info... SET @itemCount = FullTextCatalogProperty('PubInfo', 'itemcount') SET @keyCount = FullTextCatalogProperty('PubInfo', 'uniquekeycount') SET @indexSize = FullTextCatalogProperty('PubInfo', 'indexsize') PRINT CONVERT(char(30), @begin, 9) + char(09) + CONVERT(char(30), @end, 9) + char(09) + CONVERT(char(30), @end - @begin, 8) + char(09) + CONVERT(char(30), DATEDIFF (hh, @begin, @end)) + char(09) + CONVERT(char(30), DATEDIFF (mi, @begin, @end)) + char(09) + CONVERT(char(30), DATEDIFF (ss, @begin, @end)) + char(09) + CONVERT(varchar(10), @itemCount) + char(09) + CONVERT(varchar(10), @keyCount) + char(09) + CONVERT(varchar(10), @indexSize) SET NOCOUNT OFF END go NOTE: For such a large table, you should modifiy the waitfor to be at least once ever 5 minutes and the final waitfor to be at least 30 minutes. Instead of using the PRINT statement, you could write the values out to a monitoring table. As for how to speed it up, you should review SQL Server 2000 BOL title "Full-Text Search Recommendations" and separate out the FT Catalog folder from your database files, i.e, on a different disk controller and disk array configured as RAID10 or as a strip-set. Also, get more RAM, at least 1GB and use sp_fulltext_service 'resource_usage' <value> and set <value> to 5 as the MSSearch service can only use 512Mb of RAM, but only if the RAM is not used by the OS or SQL Server. Additionally, more and faster CPU's won't hurt as well... Basicly, you need a production level server to FT Index 34 million rows... Best Regards, John [quoted text, click to view] "Neils" <neils@shoobridge.de> wrote in message news:286b284a.0311270949.74247388@posting.google.com... > Hello, > > i have a table with 34,000,000(!) rows on a Sql2000 with Win2k. > (Celeron 1,3 GHZ, 256 MB Ram, 120GB ATA 100) > > I started a Full population and it's in progress since 28 hrs. > > 4 Columns with nvchar(50) are included. > > Here are some futher informations: > > Status: Full population in progress > Item Count: 71145 > IndexSize: 3 MB > UniqueKeyCount: 47325 > LogSize: 1655 > Start: 2003-11-26 15:54:36.000 > > Can somebody tell me, how long it could take ? > Can i monitor the progress, so that i can see how many > rows are already edited and how many are unworked. > Is there a way to speedup the progress? > > Thanks a lot for you help... :-)) > > Best regards, > Neils Shoobridge
Hi John, thank you for your fast replay :-)) I hope that after 10 days it will be done... Right now i'm going to order a Serversystem and i would feel better when you take a look on it: The database includes about 34,000,000 database entries (1 Table). 80 Clients are making simultaneous queries on that Data.(DIRECTORY ASSISTANCE SERVICE). It is necassary that the responsetime of database is around a second. There should be not only one Server, there should be 3 Servers. One for Production the other two for Backup if Server1 goes down (Cluster). The following configuration is planed for each Server: Dual Xeon 2.8Ghz CPU 2GB Ram Raid 5 SCSI Each HDD 200 GB Win2k Server or Win2003 Server MSSQL 2000 Server The Querys includs FREETEXT or CONTAINS parameters. Is the configuration ok or is there something i should change. Is there anything i should pay special attention on it. Thank you, :-)) Neils [quoted text, click to view] "John Kane" <jt-kane@comcast.net> wrote in message news:<#$09YxStDHA.980@TK2MSFTNGP10.phx.gbl>... > Neils, > Hmm... A FT-enabled table with 34 million rows... Depending upon where you > have your FT Catalog residing (on the same drive as your database files?), > this could take a very long time... certainly 8 to 10+ days to complete. I'd > recommend that if you cannot wait that long for the Full Population to wait > that you stop the population and set the "Microsoft Search" Service to > manual start and then manually remove the FT Catalog as the MSSearch service > will automaticlly attempt to restart the population. > > As for a monitioring process, you can use and modifiy the following T-SQL > code: > > use pubs > go > BEGIN > SET NOCOUNT ON > DECLARE @begin datetime > DECLARE @end datetime > SET @begin = CURRENT_TIMESTAMP > EXEC sp_fulltext_catalog 'PubInfo', 'start_full' -- "Full Crawl" > -- EXEC sp_fulltext_catalog 'PubInfo', 'start_incremental' -- "Incremental > Crawl" > -- > --- Wait for crawl to complete > -- > DECLARE @status int, @itemCount int, @keyCount int, @indexSize int > SELECT @status = FullTextCatalogProperty('PubInfo', 'populatestatus') > WHILE (@status <> 0) > BEGIN > WAITFOR DELAY '00:00:01' -- wait for 1 second before checking FT > Populatestatus... > SELECT @status = FullTextCatalogProperty('PubInfo', 'populatestatus') > END > SET @end = CURRENT_TIMESTAMP > WAITFOR DELAY '00:00:15' -- wait for 15 seconds in order to get correct FT > Property info... > SET @itemCount = FullTextCatalogProperty('PubInfo', 'itemcount') > SET @keyCount = FullTextCatalogProperty('PubInfo', 'uniquekeycount') > SET @indexSize = FullTextCatalogProperty('PubInfo', 'indexsize') > PRINT CONVERT(char(30), @begin, 9) + char(09) + > CONVERT(char(30), @end, 9) + char(09) + > CONVERT(char(30), @end - @begin, 8) + char(09) + > CONVERT(char(30), DATEDIFF (hh, @begin, @end)) + char(09) + > CONVERT(char(30), DATEDIFF (mi, @begin, @end)) + char(09) + > CONVERT(char(30), DATEDIFF (ss, @begin, @end)) + char(09) + > CONVERT(varchar(10), @itemCount) + char(09) + > CONVERT(varchar(10), @keyCount) + char(09) + > CONVERT(varchar(10), @indexSize) > SET NOCOUNT OFF > END > go > > NOTE: For such a large table, you should modifiy the waitfor to be at least > once ever 5 minutes and the final waitfor to be at least 30 minutes. Instead > of using the PRINT statement, you could write the values out to a monitoring > table. As for how to speed it up, you should review SQL Server 2000 BOL > title "Full-Text Search Recommendations" and separate out the FT Catalog > folder from your database files, i.e, on a different disk controller and > disk array configured as RAID10 or as a strip-set. Also, get more RAM, at > least 1GB and use sp_fulltext_service 'resource_usage' <value> and set > <value> to 5 as the MSSearch service can only use 512Mb of RAM, but only if > the RAM is not used by the OS or SQL Server. Additionally, more and faster > CPU's won't hurt as well... Basicly, you need a production level server to > FT Index 34 million rows... > > Best Regards, > John > > > > "Neils" <neils@shoobridge.de> wrote in message > news:286b284a.0311270949.74247388@posting.google.com... > > Hello, > > > > i have a table with 34,000,000(!) rows on a Sql2000 with Win2k. > > (Celeron 1,3 GHZ, 256 MB Ram, 120GB ATA 100) > > > > I started a Full population and it's in progress since 28 hrs. > > > > 4 Columns with nvchar(50) are included. > > > > Here are some futher informations: > > > > Status: Full population in progress > > Item Count: 71145 > > IndexSize: 3 MB > > UniqueKeyCount: 47325 > > LogSize: 1655 > > Start: 2003-11-26 15:54:36.000 > > > > Can somebody tell me, how long it could take ? > > Can i monitor the progress, so that i can see how many > > rows are already edited and how many are unworked. > > Is there a way to speedup the progress? > > > > Thanks a lot for you help... :-)) > > > > Best regards,
You're welcome, Neils, Since you're willing to wait at least 10 days, keep in mind that I said "10+ days to complete", i.e, it could be more than 10 days... Also, since you've said that is "necessary that the response time of database is around a second", I'm assuming you're also referring to the expected response time for the FTS queries as well. If so, you should be using either CONTAINSTABLE or FREETEXTTABLE and the Top_N_Rank parameter to limit the results to just the top N (a number you supply) by RANK as using this option will perform better than CONTAINS or FREETEXT with a FT-enable table with 34 million rows. See KB article 240833 (Q240833) "FIX: Full-Text Search Performance Improved via Support for TOP" at http://support.microsoft.com//default.aspx?scid=kb;EN-US;240833. Since Full-Text Search is important to your application, and you will have multiple servers (two for backup), you should also be familiar with how to move the very large FT Catalogs between servers as well as how to use one server as your FT Catalog "build" server (if necessary) via the procedures in KB article 240867 (Q240867) "INF: How to Move, Copy, and Backup Full-Text Catalog Folders and Files" at http://support.microsoft.com/default.aspx?scid=kb;EN-US;240867. In regards to your server configuration, I'd make the following recommendations: Dual Xeon 2.8Ghz CPU - get as much L2 cache as you can for each CPU 2GB Ram - should be enough Raid 5 SCSI - RAID5 only for your database *.mdf and *.ndf files, the FT Catalogs should be RAID10 Each HDD 200 GB - your FT Catalogs should have their own controller and disk array. Win2k Server or Win2003 Server - Choose one or the other OS as you will get different results based upon different wordbreakers MSSQL 2000 Server - ensure that you are at SP3 or SP3a before you start your Full Population as SP3 will force you to upgrade the FT Catalog or defer it to a later point in time, either way, it will take a very significant amount of time to upgrade a FT Catalog with 34 million rows. Let me know if you need any further or more specific recommendations. Regards, John [quoted text, click to view] "Neils" <neils@shoobridge.de> wrote in message news:286b284a.0311280216.7df63078@posting.google.com... > Hi John, > > thank you for your fast replay :-)) > > I hope that after 10 days it will be done... > > Right now i'm going to order a Serversystem and i would feel > better when you take a look on it: > > The database includes about 34,000,000 database entries (1 Table). > 80 Clients are making simultaneous queries on that Data.(DIRECTORY > ASSISTANCE SERVICE). It is necassary that the responsetime of database > is around a second. > > There should be not only one Server, there should be 3 Servers. One > for Production the other two for Backup if Server1 goes down > (Cluster). > > The following configuration is planed for each Server: > > Dual Xeon 2.8Ghz CPU > 2GB Ram > Raid 5 SCSI > Each HDD 200 GB > Win2k Server or Win2003 Server > MSSQL 2000 Server > > The Querys includs FREETEXT or CONTAINS parameters. > > Is the configuration ok or is there something i should change. > Is there anything i should pay special attention on it. > > Thank you, :-)) > Neils > > > > > > "John Kane" <jt-kane@comcast.net> wrote in message news:<#$09YxStDHA.980@TK2MSFTNGP10.phx.gbl>... > > Neils, > > Hmm... A FT-enabled table with 34 million rows... Depending upon where you > > have your FT Catalog residing (on the same drive as your database files?), > > this could take a very long time... certainly 8 to 10+ days to complete. I'd > > recommend that if you cannot wait that long for the Full Population to wait > > that you stop the population and set the "Microsoft Search" Service to > > manual start and then manually remove the FT Catalog as the MSSearch service > > will automaticlly attempt to restart the population. > > > > As for a monitioring process, you can use and modifiy the following T-SQL > > code: > > > > use pubs > > go > > BEGIN > > SET NOCOUNT ON > > DECLARE @begin datetime > > DECLARE @end datetime > > SET @begin = CURRENT_TIMESTAMP > > EXEC sp_fulltext_catalog 'PubInfo', 'start_full' -- "Full Crawl" > > -- EXEC sp_fulltext_catalog 'PubInfo', 'start_incremental' -- "Incremental > > Crawl" > > -- > > --- Wait for crawl to complete > > -- > > DECLARE @status int, @itemCount int, @keyCount int, @indexSize int > > SELECT @status = FullTextCatalogProperty('PubInfo', 'populatestatus') > > WHILE (@status <> 0) > > BEGIN > > WAITFOR DELAY '00:00:01' -- wait for 1 second before checking FT > > Populatestatus... > > SELECT @status = FullTextCatalogProperty('PubInfo', 'populatestatus') > > END > > SET @end = CURRENT_TIMESTAMP > > WAITFOR DELAY '00:00:15' -- wait for 15 seconds in order to get correct FT > > Property info... > > SET @itemCount = FullTextCatalogProperty('PubInfo', 'itemcount') > > SET @keyCount = FullTextCatalogProperty('PubInfo', 'uniquekeycount') > > SET @indexSize = FullTextCatalogProperty('PubInfo', 'indexsize') > > PRINT CONVERT(char(30), @begin, 9) + char(09) + > > CONVERT(char(30), @end, 9) + char(09) + > > CONVERT(char(30), @end - @begin, 8) + char(09) + > > CONVERT(char(30), DATEDIFF (hh, @begin, @end)) + char(09) + > > CONVERT(char(30), DATEDIFF (mi, @begin, @end)) + char(09) + > > CONVERT(char(30), DATEDIFF (ss, @begin, @end)) + char(09) + > > CONVERT(varchar(10), @itemCount) + char(09) + > > CONVERT(varchar(10), @keyCount) + char(09) + > > CONVERT(varchar(10), @indexSize) > > SET NOCOUNT OFF > > END > > go > > > > NOTE: For such a large table, you should modifiy the waitfor to be at least > > once ever 5 minutes and the final waitfor to be at least 30 minutes. Instead > > of using the PRINT statement, you could write the values out to a monitoring > > table. As for how to speed it up, you should review SQL Server 2000 BOL > > title "Full-Text Search Recommendations" and separate out the FT Catalog > > folder from your database files, i.e, on a different disk controller and > > disk array configured as RAID10 or as a strip-set. Also, get more RAM, at > > least 1GB and use sp_fulltext_service 'resource_usage' <value> and set > > <value> to 5 as the MSSearch service can only use 512Mb of RAM, but only if > > the RAM is not used by the OS or SQL Server. Additionally, more and faster > > CPU's won't hurt as well... Basicly, you need a production level server to > > FT Index 34 million rows... > > > > Best Regards, > > John > > > > > > > > "Neils" <neils@shoobridge.de> wrote in message > > news:286b284a.0311270949.74247388@posting.google.com... > > > Hello, > > > > > > i have a table with 34,000,000(!) rows on a Sql2000 with Win2k. > > > (Celeron 1,3 GHZ, 256 MB Ram, 120GB ATA 100) > > >
Hi John, it's me again... :-)) Where can i see how many rows are finished. I tried your T-SQL Function to monitor the population, but after 10 hrs query i abort the T-SQL command.... A part from that can i stop the population and test some querys with freetext and contains ? After that i want to continue the population at that position i stop it. Thanks John....
Neils, I'm not exactly sure what you're asking for here, so I'll make a few assumptions and you can correct me if I'm wrong... First of all, you stopped the Full Population of the table with 34 million rows. Correct? Did it stop successfully? You can use the following T-SQL metadata function queries to determine how many rows are finished: SELECT FULLTEXTCATALOGPROPERTY('<FT_Catalog_name>','PopulateStatus') -- to determine if the Full Population is still on-going, if so, returns: 1 = Full population in progress SELECT fulltextcatalogproperty('<FT_Catalog_name>', 'ItemCount') -- returns: Number of full-text indexed items currently in the full-text catalog. See SQL Server 2000 BOL title "FULLTEXTCATALOGPROPERTY" for more information on this metadata function. If you stop the Full Population then, you or the MSSearch service will need to restart it. It is not necessary to stop the Full Population in order to "test some queries with freetext and contains" as the FT Catalog can be both read (contains and freetext) and written (Full Population) to at the same time. If you stop the Full Population, the MSSearch service will automatically re-start the Full Population at the same point that it was stopped. Regards, John [quoted text, click to view] "Neils" <neils@shoobridge.de> wrote in message news:286b284a.0311300133.39cb1bb4@posting.google.com... > Hi John, > > it's me again... :-)) > > Where can i see how many rows are finished. I tried > your T-SQL Function to monitor the population, but > after 10 hrs query i abort the T-SQL command.... > > A part from that can i stop the population and > test some querys with freetext and contains ? After > that i want to continue the population at that > position i stop it. > > Thanks John.... > > Neils
Hi John, that was exactly the things i want to know. Thank you :-) I have not stoped the population as i want to ask you befor if it is possible. So i will do it today. I test some querys with freetext and contains but after some hours i stoped the query because the query was still in progress I think the system was to busy as it was progress with the population. Thanks again and have a nice week.... Neils :-) [quoted text, click to view] "John Kane" <jt-kane@comcast.net> wrote in message news:<eWBpzU2tDHA.3436@tk2msftngp13.phx.gbl>... > Neils, > I'm not exactly sure what you're asking for here, so I'll make a few > assumptions and you can correct me if I'm wrong... > First of all, you stopped the Full Population of the table with 34 million > rows. Correct? Did it stop successfully? > > You can use the following T-SQL metadata function queries to determine how > many rows are finished: > > SELECT FULLTEXTCATALOGPROPERTY('<FT_Catalog_name>','PopulateStatus') > -- to determine if the Full Population is still on-going, if so, returns: 1 > = Full population in progress > SELECT fulltextcatalogproperty('<FT_Catalog_name>', 'ItemCount') > -- returns: Number of full-text indexed items currently in the full-text > catalog. > > See SQL Server 2000 BOL title "FULLTEXTCATALOGPROPERTY" for more information > on this metadata function. > > If you stop the Full Population then, you or the MSSearch service will need > to restart it. It is not necessary to stop the Full Population in order to > "test some queries with freetext and contains" as the FT Catalog can be both > read (contains and freetext) and written (Full Population) to at the same > time. If you stop the Full Population, the MSSearch service will > automatically re-start the Full Population at the same point that it was > stopped. > > Regards, > John > > > > "Neils" <neils@shoobridge.de> wrote in message > news:286b284a.0311300133.39cb1bb4@posting.google.com... > > Hi John, > > > > it's me again... :-)) > > > > Where can i see how many rows are finished. I tried > > your T-SQL Function to monitor the population, but > > after 10 hrs query i abort the T-SQL command.... > > > > A part from that can i stop the population and > > test some querys with freetext and contains ? After > > that i want to continue the population at that > > position i stop it. > > > > Thanks John.... > >
You're welcome, Neils, Keep in mind, that while you may attempt to stop a Full Population of a very large table (34 million rows), it may not stop for some time or even at all. That is why I asked if you had stopped it successfully. Yes, the Full Population does take much of your computer's resources (CPU, memory and Disk I/O) when doing a Full pop. against a large FT-enable table on a such a machine as you have and this can affect the FREETEXT and CONTAINS query performance. If the Full Population fails to stop let me know as there are un-documented methods to remove the FT Catalog. Regards, John [quoted text, click to view] "Neils" <neils@shoobridge.de> wrote in message news:286b284a.0311302315.27e5d9c9@posting.google.com... > Hi John, > > that was exactly the things i want to know. Thank you :-) > > I have not stoped the population as i want to ask you befor > if it is possible. So i will do it today. > > I test some querys with freetext and contains but after > some hours i stoped the query because the query was still > in progress > > I think the system was to busy as it was progress with the population. > > Thanks again and have a nice week.... > > Neils :-) > > > > "John Kane" <jt-kane@comcast.net> wrote in message news:<eWBpzU2tDHA.3436@tk2msftngp13.phx.gbl>... > > Neils, > > I'm not exactly sure what you're asking for here, so I'll make a few > > assumptions and you can correct me if I'm wrong... > > First of all, you stopped the Full Population of the table with 34 million > > rows. Correct? Did it stop successfully? > > > > You can use the following T-SQL metadata function queries to determine how > > many rows are finished: > > > > SELECT FULLTEXTCATALOGPROPERTY('<FT_Catalog_name>','PopulateStatus') > > -- to determine if the Full Population is still on-going, if so, returns: 1 > > = Full population in progress > > SELECT fulltextcatalogproperty('<FT_Catalog_name>', 'ItemCount') > > -- returns: Number of full-text indexed items currently in the full-text > > catalog. > > > > See SQL Server 2000 BOL title "FULLTEXTCATALOGPROPERTY" for more information > > on this metadata function. > > > > If you stop the Full Population then, you or the MSSearch service will need > > to restart it. It is not necessary to stop the Full Population in order to > > "test some queries with freetext and contains" as the FT Catalog can be both > > read (contains and freetext) and written (Full Population) to at the same > > time. If you stop the Full Population, the MSSearch service will > > automatically re-start the Full Population at the same point that it was > > stopped. > > > > Regards, > > John > > > > > > > > "Neils" <neils@shoobridge.de> wrote in message > > news:286b284a.0311300133.39cb1bb4@posting.google.com... > > > Hi John, > > > > > > it's me again... :-)) > > > > > > Where can i see how many rows are finished. I tried > > > your T-SQL Function to monitor the population, but > > > after 10 hrs query i abort the T-SQL command.... > > > > > > A part from that can i stop the population and > > > test some querys with freetext and contains ? After > > > that i want to continue the population at that > > > position i stop it. > > > > > > Thanks John.... > > > > > > Neils
Don't see what you're looking for? Try a search.
|