FOA
1.1:
FOA is a cognitive activity. It concerns the meaning of a text
rather than its exact words. In the article, the author divides the process
of FOA into three steps and uses several pictures to introduce the meaning of
each step vividly.
Specifically, the three phases are: 1. asking a question;
2. constructing an answer; 3. assessing the answer. In the first phase, people
raise a question, which is turned into a search query and sent to the search
engine. In the second phase, the search engine retrieves related answers from a
huge document corpus, such as one collected by a web crawler. In the last phase,
people give relevance feedback to show which parts of the answer are useful,
which are irrelevant, and which are neutral.
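The three phases can be sketched as a small program. This is only a toy illustration under my own assumptions: the corpus, the function names, and the term-overlap scoring are hypothetical and are not taken from the article.

```python
# Toy illustration of the three FOA phases; all names and data are hypothetical.

CORPUS = {
    "d1": "web crawler downloads pages for the search engine",
    "d2": "cooking pasta with tomato sauce",
    "d3": "search engine ranking of web pages",
}


def ask_question(question):
    """Phase 1: turn a natural-language question into a set of query terms."""
    return set(question.lower().split())


def construct_answer(query_terms, corpus):
    """Phase 2: return ids of matching documents, most shared terms first."""
    scores = {
        doc_id: len(query_terms & set(text.split()))
        for doc_id, text in corpus.items()
    }
    hits = [doc_id for doc_id, score in scores.items() if score > 0]
    return sorted(hits, key=lambda doc_id: (-scores[doc_id], doc_id))


def assess_answer(results, judgments):
    """Phase 3: group results by the user's feedback label (default: neutral)."""
    feedback = {"useful": [], "irrelevant": [], "neutral": []}
    for doc_id in results:
        feedback[judgments.get(doc_id, "neutral")].append(doc_id)
    return feedback


query = ask_question("web search engine")
results = construct_answer(query, CORPUS)
feedback = assess_answer(results, {"d3": "useful"})
print(results)
print(feedback)
```

A real system would feed the feedback back into the next query; here it is only collected, to keep the sketch short.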
The article gives a brief introduction to information
retrieval. It introduces different kinds of search engines (web search,
desktop and file-system search, enterprise-level IR systems, digital
libraries, and other specialized IR systems) and
also shows us the components of an IR system.
Later in the chapter, the author discusses
ranking algorithms. The author also names two important criteria for
measuring an IR system: efficiency and effectiveness. Only when a search is
both effective and efficient can we get what we want in the shortest time.
The author then introduces a principle named PRP (Probability Ranking
Principle). It states that if the results of a search are ranked by
decreasing probability of relevance, effectiveness is maximized. By using some
examples and listing some related words, the author shows us how to search in
an efficient way. Moreover, the author uses a short paragraph to introduce the
concept of a "document" and how documents are updated. I think this concept
will be more useful in later chapters.
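To make the PRP more concrete, here is a minimal sketch. The probabilities are numbers I made up, and "effectiveness" is taken to mean the expected number of relevant documents in the top k results, which is only one possible measure. Under these assumptions, ranking by decreasing probability is never worse than another ordering.

```python
# Hypothetical relevance probabilities for five documents (made-up numbers).
prob_relevant = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.7, "e": 0.1}


def prp_ranking(probs):
    """Order documents by decreasing probability of relevance."""
    return sorted(probs, key=lambda doc: probs[doc], reverse=True)


def expected_relevant_in_top_k(ranking, probs, k):
    """Expected number of relevant documents among the first k results."""
    return sum(probs[doc] for doc in ranking[:k])


ranking = prp_ranking(prob_relevant)
arbitrary = ["a", "b", "c", "d", "e"]  # some other ordering, for comparison

for k in range(1, len(ranking) + 1):
    best = expected_relevant_in_top_k(ranking, prob_relevant, k)
    other = expected_relevant_in_top_k(arbitrary, prob_relevant, k)
    print(k, round(best, 2), round(other, 2))
```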
As for me, in this chapter, I am most interested in
"web search". I find it quite amazing how web search works.
In the article, the author says that search engines store a "snapshot" of the
web in order to produce accurate results and minimize response time. To keep
these snapshots up to date, they use a web crawler to download updates
periodically.
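The idea of refreshing a snapshot can be sketched as follows. This is my own simplified illustration, not the author's design: the "web" is just a dictionary, `fetch` is a hypothetical stand-in for a real download, and the URLs are placeholders.

```python
# A stand-in for the live web: URL -> current page content (hypothetical data).
live_web = {
    "http://example.com/a": "old text",
    "http://example.com/b": "unchanged text",
}


def fetch(url):
    """Placeholder for an HTTP download; here it just reads the dict above."""
    return live_web[url]


def refresh_snapshot(snapshot, urls):
    """Re-download each URL, store new or changed pages, return changed URLs."""
    changed = []
    for url in urls:
        text = fetch(url)
        if snapshot.get(url) != text:
            snapshot[url] = text
            changed.append(url)
    return changed


snapshot = {}
first = refresh_snapshot(snapshot, list(live_web))   # initial crawl
live_web["http://example.com/a"] = "new text"        # a page changes on the web
second = refresh_snapshot(snapshot, list(live_web))  # periodic re-crawl
print(first)
print(second)
```

Queries are answered from `snapshot` rather than from the live web, which is why the response can be fast, while the periodic re-crawl keeps the stored copy close to the current pages.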
MIR
1.1-1.4:
In the first chapter of this book, the author mostly
introduces some basic facts about information retrieval. He first
explains the concept of information retrieval, which is a way to help people get
easy access to information that interests them. He then describes the early
development of IR. At first, IR technology was used only in libraries. With
the introduction of the World Wide Web, IR has finally taken a place at the
center of the stage.
The author also lists some problems of IR. If people type
too many words into a search, it may confuse the search engine. The search
engine then cannot derive relevant keywords, so people cannot get the answers
they want. Moreover, the author distinguishes information from data.
Retrieving information tolerates small inaccuracies, but retrieving data needs
to be totally accurate.
In the section on the IR system, the author describes the
internal structure of an IR system, that is, its software architecture. The
author uses two pictures to illustrate the different levels of an IR system and
how the index is generated. After several layers of processing, we get
the top-ranked retrieval answers.
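The layers described above (indexing, retrieval, ranking) can be sketched roughly like this. It is a minimal example under my own assumptions: the three documents are invented, and scoring by summed term frequency is a simple stand-in, not the ranking method used in the book.

```python
from collections import defaultdict

# Hypothetical three-document collection.
docs = {
    1: "information retrieval in libraries",
    2: "the web changed information retrieval",
    3: "jane austen novels",
}


def build_index(documents):
    """Indexing layer: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def retrieve(index, query):
    """Retrieval layer: score every document containing any query term."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return scores


def rank(scores, k):
    """Ranking layer: return the k best-scoring document ids."""
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


index = build_index(docs)
top = rank(retrieve(index, "web information retrieval"), k=2)
print(top)
```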
Finally, the author introduces the concept of the "web".
He uses the example of Jane Austen to illustrate the importance of the web,
which lets people freely publish their ideas and works. Lastly, the author lists
five impacts of the web that have changed search. These impacts are two
sides of the same coin: they give search engines the chance to develop, but
they also bring some negative results and disrupt normal human life. On the
whole, however, they help search engines to prosper.