Re: Should I index all ...
Terry O'Neill (toneill@mariner.com)
Tue, 09 Jul 1996 08:47:09 -0600
Nick Arnett wrote:
> 
> >No doubt that you exclude some meaningful information when you
> >use stopwords, but two benefits often outweigh this consideration.
> >First, your database does get smaller, easily 25%, although there are
> >many factors that affect this number.
> 
> What's your source for that number?  Unless you have a very primitive index
> and a very aggressive stopword list, the size reduction is nowhere near
> that large, I believe.
> 
We probably don't want to turn this into too much of a text
search discussion in this forum, but the 25% estimate is based
on a variety of text engine evaluations we did in 1992-1993
in preparation for an overhaul of the Dow Jones Text Library.
Terry O'Neill
mariner.com