1. Postings skip pointers. For a postings
list of length P, use √P evenly spaced skip pointers. This heuristic can be
improved upon; it ignores any details of the distribution of query terms.
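A minimal sketch of the √P heuristic in use, assuming postings lists are sorted Python lists of document IDs. The names build_skips and intersect_with_skips are illustrative and do not come from the notes; a real index would store skip pointers inside the postings structure rather than in a separate dictionary.

```python
import math


def build_skips(postings):
    """Map a list index to its skip target, using a step of about sqrt(P)."""
    p = len(postings)
    step = int(math.sqrt(p))
    if step < 2:
        return {}  # list too short for skips to help
    # Every target index i + step is guaranteed to be inside the list.
    return {i: i + step for i in range(0, p - step, step)}


def intersect_with_skips(a, b):
    """Intersect two sorted postings lists, following skips when safe."""
    skips_a, skips_b = build_skips(a), build_skips(b)
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Take a skip only if its target does not overshoot b[j].
            if i in skips_a and a[skips_a[i]] <= b[j]:
                while i in skips_a and a[skips_a[i]] <= b[j]:
                    i = skips_a[i]
            else:
                i += 1
        else:
            if j in skips_b and b[skips_b[j]] <= a[i]:
                while j in skips_b and b[skips_b[j]] <= a[i]:
                    j = skips_b[j]
            else:
                j += 1
    return result


print(intersect_with_skips([2, 4, 8, 16, 19, 23, 28, 43],
                           [1, 2, 3, 5, 8, 41, 43, 51, 60, 71]))
```

More skips mean shorter jumps that succeed more often but cost more comparisons; fewer skips mean longer jumps that succeed less often. The √P spacing is one compromise between the two.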
2. Most recent search engines support a double-quote
syntax (“stanford university”) for phrase queries, which users have found
easy to understand and use successfully.
3. The concept of a biword index can be extended to longer sequences of
words, and if the index includes variable-length word sequences, it is
generally referred to as a phrase index.
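As an illustration, here is a small sketch of a biword index, assuming whitespace tokenization and lowercasing. The function names and sample documents are hypothetical. Every pair of consecutive words becomes a dictionary key, and a longer phrase is answered by intersecting the postings of its consecutive biwords.

```python
from collections import defaultdict


def build_biword_index(docs):
    """Map each pair of consecutive tokens to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for first, second in zip(tokens, tokens[1:]):
            index[(first, second)].add(doc_id)
    return index


def phrase_candidates(index, phrase):
    """Return documents that contain every biword of the phrase."""
    tokens = phrase.lower().split()
    biwords = list(zip(tokens, tokens[1:]))
    if not biwords:
        return set()  # a single word has no biword to look up
    result = set(index.get(biwords[0], set()))
    for biword in biwords[1:]:
        result &= index.get(biword, set())
    return result


docs = {
    1: "stanford university palo alto",
    2: "university of stanford",
    3: "palo alto university near stanford university",
}
index = build_biword_index(docs)
print(phrase_candidates(index, "stanford university"))
print(phrase_candidates(index, "stanford university palo alto"))
```

For phrases longer than two words the result is only a candidate set: a document can contain all the biwords without containing the full phrase contiguously, so false positives are possible.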
4. For the reasons given, a biword index is not the standard solution. Rather,
a positional index is most commonly employed.
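A positional index can be sketched as follows, assuming whitespace tokenization and an in-memory dictionary of the form term -> document ID -> list of positions. The function names and sample documents are hypothetical. A phrase matches when its terms occur at consecutive positions in the same document.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Build term -> {doc_id -> [positions]} from a dict of doc_id -> text."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token][doc_id].append(pos)
    return index


def phrase_query(index, phrase):
    """Return documents where the phrase's terms appear at consecutive positions."""
    tokens = phrase.lower().split()
    if not tokens or any(t not in index for t in tokens):
        return set()
    # Only documents containing every term can match.
    candidates = set(index[tokens[0]])
    for t in tokens[1:]:
        candidates &= set(index[t])
    hits = set()
    for doc_id in candidates:
        later = [set(index[t][doc_id]) for t in tokens[1:]]
        for start in index[tokens[0]][doc_id]:
            # The k-th following term must sit exactly k + 1 positions after start.
            if all(start + k + 1 in positions for k, positions in enumerate(later)):
                hits.add(doc_id)
                break
    return hits


docs = {
    1: "stanford university is in palo alto",
    2: "the university of stanford",
    3: "university stanford",
}
index = build_positional_index(docs)
print(phrase_query(index, "stanford university"))
```

Unlike the biword approach, this check is exact for phrases of any length, at the cost of storing one position per term occurrence.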
5. Let’s examine the space implications of having a positional index. A
posting now needs an entry for each occurrence of a term. The index size thus
depends on the average document size. The average web page has fewer than
1000 terms, but documents like SEC stock filings, books, and even some epic
poems easily reach 100,000 terms. Consider a term with frequency 1 in 1000
terms on average. A 1000-term document then contributes about one positional
entry for that term, while a 100,000-term document contributes about 100, so
large documents cause an increase of two orders of magnitude in the space
required to store the postings list.
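The arithmetic behind this estimate can be sketched as follows. The helper name and the constant TERMS_PER_OCCURRENCE are illustrative; the only inputs are the figures stated above (a term that occurs once per 1000 terms, and documents of 1000 and 100,000 terms).

```python
import math

TERMS_PER_OCCURRENCE = 1000  # the term appears once per 1000 terms on average


def expected_positional_entries(doc_length):
    """Expected number of positions stored for the term in one document."""
    return doc_length / TERMS_PER_OCCURRENCE


small = expected_positional_entries(1_000)    # typical web page
large = expected_positional_entries(100_000)  # long filing or book

# A non-positional index stores one posting per document in both cases,
# so the growth comes entirely from the stored positions.
print(small, large, math.log10(large / small))
```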
Most recently, LIS schools were accused by leaders of
the profession of failing to educate students appropriately for the workplace and
of engaging in esoteric and irrelevant research that was out of touch with real
world needs.
Meanwhile, a
community of information schools known as the "iSchool Caucus" has been
founded. It has no affiliation with a professional association in LIS, yet it
contains significant numbers of the leading LIS programs in North America.
LIS has two camps: the library side and the information side.
We might all
agree generally that issues of information retrieval, information quality and
authenticity, policy for access and preservation, and the health and security
applications of data mining raise at least some big questions for information
research to study.
The points made about IR can be made more or less equivalently, I'd argue, for
many other of the current hot topics in information research. We are at the
party, so to speak, but we are rarely the center of attention.
What are the big questions?
It is
important to remember that the value of LIS makes it a potentially strong contributor to the debate and analysis of such issues.