Anantha,
Storing very large MS Word documents in SQL Server vs. in the file system
with the pointers to the MS Word documents stored in SQL Server is more of
an application design decision. As this is one of those FAQ type questions
that have been known to start
religious wars or at the very least a flame war, and while Sharon didn't ask
about other application issues that this question often generates as it is
often an open-ended question, i.e., one that never seems to be answered to
everyone's satisfactions as it usually "depends" upon the application or
upon how one defines the word "best"...
As a rule-of-thumb and depending upon how many files you have, how
frequently they change, how large they are, what the app is, etc. you may
want to review the web site and the following rule of thumb that they used:
< 1 million images or big images (> 1MB) put them in the file system.
[quoted text, click to view] > 1 million and < 1 MB images, put them in SQL Server.
For everything in between, either way will work, depending upon if you need
transactional control over your files. Additionally, if you store the files
in a TEXT or IMAGE column, you can also store related metadata about that
file in SQL Server as well for increased searching capabilities. Also, and
obviously with SQL Server you get built-in support for validating the
consistency of the database, indices, backup, restore, etc. However, the MS
Word document size alone should not be the only justification for storing or
not storing large files in SQL Server. As for the max limit of FT Indexable
text for large files, see:
See SQL Server 2000 Books Online (BOL) title "Filtering Supported File
Types"online at MSDN:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/createdb/cm_fullad_55mb.asp
Actually, the BOL entry is wrong as there is a DOC bug filed for this
limited in the BOL title "Filtering Supported File Types" - "Note For
full-text indexing, a document must be less than 16 megabytes (MB) in size
and must not contain more than 256 kilobytes (KB) of filtered text" and this
limit can be over-ridden via KB article: 308771 (Q308771) "PRB: A Full-Text
Search May Not Return Any Hits If It Fails to Index a File" at
http://support.microsoft.com/default.aspx?scid=kb;en-us;308771. and the
FilterProcessMemoryQuota registry key value. However, you should be careful
in making adjustments to this registry key and incrementally increase it
based upon your server's memory and avg. file sizes.
Hope this helps!
John
--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/ [quoted text, click to view] "Anantha Padmanabhan" <ananthapus@hotmail.com> wrote in message
news:#8KdpP9$EHA.3368@TK2MSFTNGP10.phx.gbl...
> Hi,
>
> We are debating whether it is a good idea to store large documents (on the
average over 100 MB - max size currently at 220 MB) in SQL server itself OR
to strip the document of the "noise" words outside of sql server and store
just the words we need to be indexed.
[quoted text, click to view] >
> We are concerned about the performance on both approaches......
>
> Any ideas? Any prior experiences..........
>
> Appreciate your feedback
>
> Anantha
>
> **********************************************************************
> Sent via Fuzzy Software @
http://www.fuzzysoftware.com/ > Comprehensive, categorised, searchable collection of links to ASP &
ASP.NET resources...