Tuesday, February 25, 2014

Week 8: Reading Notes

Marti A. Hearst Chapter 1: 
(http://searchuserinterfaces.com/book/sui_ch1_design.html)
    This chapter has introduced the ideas and practices surrounding user interface design in general, and search interface design in particular. It has explained some of the difficulties with search interface design and provided a set of design guidelines tailored specifically to search user interfaces. These guidelines include:
  • Offer efficient and informative feedback,
  • Balance user control with automated actions,
  • Reduce short-term memory load,
  • Provide shortcuts,
  • Reduce errors,
  • Recognize the importance of small details, and
  • Recognize the importance of aesthetics.
    This chapter has also summarized some of the most successful design ideas that are commonly in use in search interfaces today. This summary is based on generalizing over the results of years of research, experimentation, and tests in the marketplace. The coming years should reveal additional new, exciting ideas that will become reliable standards for search user interfaces.

Marti A. Hearst Chapter 11: 
    Visualization is a promising tool for the analysis and understanding of text collections, including semi-structured text as found in citation collections, and for applications such as literary analysis. Although not shown in this chapter, visualization has also been applied to online conversations and other forms of social interaction which have textual components. With the advent of social visualization Web sites like IBM's manyeyes.com, and other tools that continue to make visualization generally accessible to users who are not programmers, it is likely that the use of visualization for analysis of text will only continue to grow in popularity.

IIR Chapter 10:
    Information retrieval systems are often contrasted with relational databases. Traditionally, IR systems have retrieved information from unstructured text, by which we mean "raw" text without markup; databases are designed for querying relational data.
    After presenting the basic concepts of XML in Section 10.1, this chapter first discusses the challenges we face in XML retrieval (Section 10.2). Next we describe a vector space model for XML retrieval (Section 10.3). Section 10.4 presents INEX, a shared-task evaluation that has been held for a number of years and is currently the most important venue for XML retrieval research. We discuss the differences between data-centric and text-centric approaches to XML in Section 10.5.
    An XML document is an ordered, labeled tree. Each node of the tree is an XML element and is written with an opening and closing tag. An element can have one or more XML attributes.
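To make the tree structure concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree (the sample document is my own illustration, not from the chapter):

```python
# Parse a tiny XML document and walk its ordered, labeled tree:
# each node is an element, and elements can carry attributes.
import xml.etree.ElementTree as ET

doc = """
<book id="b1">
  <title>Introduction to Information Retrieval</title>
  <author role="first">Christopher D. Manning</author>
</book>
"""

def walk(element, depth=0):
    # Print each element's tag, attributes, and text, in document order.
    text = (element.text or "").strip()
    print("  " * depth + "<%s> attrs=%s text=%r" % (element.tag, element.attrib, text))
    for child in element:
        walk(child, depth + 1)

walk(ET.fromstring(doc))
```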
    The premier venue for research on XML retrieval is the INEX (INitiative for the Evaluation of XML retrieval) program, a collaborative effort that has produced reference collections, sets of queries, and relevance judgments.

Monday, February 24, 2014

Week 7: Muddiest Points

1. Why should we have the concept of NDCG? I think DCG and IDCG are enough.
2. If two systems have the same average performance but different standard deviations, which one is better: the one with the bigger standard deviation or the one with the smaller?
3. How can we make RP useful to the user?

Wednesday, February 19, 2014

Week 7: Reading Notes

IIR Chapter 9
    In this chapter the author mainly talks about relevance feedback and query expansion. The same concept may be referred to using different words; this issue, known as synonymy, has an impact on the recall of most information retrieval systems. In this chapter we discuss ways in which a system can help with query refinement, either fully automatically or with the user in the loop.
    The methods for tackling this problem split into two major classes: global methods and local methods. Global methods are techniques for expanding or reformulating query terms independent of the query and results returned from it, so that changes in the query wording will cause the new query to match other semantically similar terms.
    The idea of relevance feedback (RF) is to involve the user in the retrieval process so as to improve the final result set. The Rocchio Algorithm is the classic algorithm for implementing relevance feedback. It models a way of incorporating relevance feedback information into the vector space model.
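As a concrete illustration, here is a minimal Python/NumPy sketch of the Rocchio update (the alpha, beta, and gamma values are common illustrative choices, not prescribed by the chapter):

```python
# Rocchio relevance feedback in the vector space model: move the query
# toward the centroid of the relevant documents and away from the
# centroid of the nonrelevant ones.
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    q_mod = alpha * query
    if relevant:
        q_mod = q_mod + beta * np.mean(relevant, axis=0)
    if nonrelevant:
        q_mod = q_mod - gamma * np.mean(nonrelevant, axis=0)
    # Negative term weights are usually clipped to zero.
    return np.maximum(q_mod, 0.0)

q = np.array([1.0, 0.0, 0.5, 0.0])          # original query vector
rel = [np.array([0.9, 0.1, 0.4, 0.0])]      # judged relevant
nonrel = [np.array([0.0, 0.8, 0.0, 0.3])]   # judged nonrelevant
print(rocchio(q, rel, nonrel))
```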
    In the next section we more briefly discuss three global methods for expanding a query: by simply aiding the user in doing so, by using a manual thesaurus, and by building a thesaurus automatically.
Improving the effectiveness of information retrieval with local context analysis
    Techniques for automatic query expansion have been extensively studied in information retrieval as a means of addressing the word mismatch between queries and documents. These techniques can be categorized as either global or local. While global techniques rely on analysis of a whole collection to discover word relationships, local techniques emphasize analysis of the top-ranked documents retrieved for a query. While local techniques have been shown to be more effective than global techniques in general, existing local techniques are not robust and can seriously hurt retrieval when few of the retrieved documents are relevant. We propose a new technique, called local context analysis, which selects expansion terms based on co-occurrence with the query terms within the top-ranked documents. Experiments on a number of collections, both English and non-English, show that local context analysis offers more effective and consistent retrieval results.
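The paper's actual scoring formula also folds in idf-style weighting; here is only a stripped-down Python sketch of the core idea, scoring candidate expansion terms by how many top-ranked documents they share with the query terms:

```python
# Toy version of local context analysis: rank candidate expansion terms
# by co-occurrence with the query terms within the top-ranked documents.
from collections import Counter

def expansion_candidates(query_terms, top_docs, k=5):
    query = set(query_terms)
    scores = Counter()
    for doc in top_docs:                  # each doc is a list of tokens
        tokens = set(doc)
        if tokens & query:                # doc co-occurs with the query
            for term in tokens - query:
                scores[term] += 1         # credit every non-query term
    return [term for term, _ in scores.most_common(k)]

docs = [["car", "engine", "repair"],
        ["car", "engine", "oil"],
        ["dog", "food"]]
print(expansion_candidates(["car"], docs))   # "engine" scores highest
```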
A study of methods for negative relevance feedback.
    Negative relevance feedback is a special case of relevance feedback where we do not have any positive examples; this often happens when the topic is difficult and the search results are poor. Although in principle any standard relevance feedback technique can be applied to negative relevance feedback, it may not perform well due to the lack of positive examples. In this paper, we conduct a systematic study of methods for negative relevance feedback. We compare a set of representative negative feedback methods, covering vector-space models and language models, as well as several special heuristics for negative feedback. Evaluating negative feedback methods requires a test set with sufficiently difficult topics, but there are not many naturally difficult topics in the existing test collections. We use two sampling strategies to adapt a test collection with easy topics to evaluate negative feedback. Experimental results on several TREC collections show that language model based negative feedback methods are generally more effective than those based on vector-space models, and using multiple negative models is an effective heuristic for negative feedback. Our results also show that it is feasible to adapt test collections with easy topics for evaluating negative feedback methods through sampling.
Relevance feedback revisited  
    Researchers have found relevance feedback to be effective in interactive information retrieval, although few formal user experiments have been made. In order to run a user experiment on a large document collection, experiments were performed at NIST to complete some of the missing links found in using the probabilistic retrieval model. These experiments, using the Cranfield 1400 collection, showed the importance of query expansion in addition to query reweighting, and showed that adding as few as 20 well-selected terms could result in performance improvements of over 100%. Additionally, it was shown that performing multiple iterations of feedback is highly effective.

Week 6: Muddiest Points

1. How does pooling work?
2. I don't quite understand the part about Kappa.
3. For which types of searches is MRR a good measure?

Thursday, February 13, 2014

Week 6: Reading Notes

IIR Chapter 8:
    In this chapter, the author mainly talks about evaluation in information retrieval. Information retrieval has developed as a highly empirical discipline, requiring careful and thorough evaluation to demonstrate the superior performance of novel techniques on representative document collections.
    The author begins with a discussion of measuring the effectiveness of IR systems and the test collections that are most often used for this purpose. He then presents the straightforward notion of relevant and nonrelevant documents and the formal evaluation methodology that has been developed for evaluating unranked retrieval results. He then extends these notions and develops further measures for evaluating ranked retrieval results. Next, he steps back to introduce the notion of user utility and how it is approximated by the use of document relevance. The author also points out a common misunderstanding: user perceptions do not always coincide with system designers' notions of quality.
    At first the author introduces the concept of a test collection, which contains three different parts. He also notes that relevance is assessed relative to an information need, not a query. In the next section, the author gives some standard test collections, including the Cranfield collection, the Text Retrieval Conference (TREC), the NII Test Collections for IR Systems (NTCIR), GOV2, CLEF, etc.
    In the next section, the author introduces the concept of a contingency table and draws a table to illustrate it. He lists the equation accuracy = (tp + tn) / (tp + fp + fn + tn). However, the author says this measure is not that useful: when relevant documents are rare, a system that returns nothing relevant can still achieve very high accuracy. He instead recommends using both precision and recall; the advantage of having the two numbers is that one is often more important than the other in a given circumstance. In the final analysis, the success of an IR system depends on how good it is at satisfying the needs of these idiosyncratic humans.
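A small Python sketch of these measures makes the point about accuracy concrete (the counts below are made-up numbers for illustration):

```python
# Contingency-table measures: with few relevant documents, a system that
# returns nothing relevant can still score very high accuracy, which is
# why precision and recall are preferred.
def evaluate(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# 10,000 documents, only 10 relevant; the system retrieves nothing:
# accuracy is 99.9% even though recall is zero.
print(evaluate(tp=0, fp=0, fn=10, tn=9990))
```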
    In the last few sections, the author looks at evaluation from a broader perspective, focusing on user utility, refining a deployed system, and system issues.
What's the value of TREC: is there a gap to jump or a chasm to bridge?
    The TREC Programme has been very successful at generalising: it has shown that essentially simple methods of retrieving documents work well across a wide range of conditions. The TREC Programme has sought to address variation, but it has done this in a largely ad hoc and unsystematic way.
    The author's case is based on the notion of micro variation, and on the distinction between system environment and task context. The evaluation framework ideas are used to analyse the TREC experimental programme and to support the argument for a new direction for TREC.
    A convenient way of summarising the Cranfield evaluation paradigm is in terms of environment variables and system parameters.
    In general, TREC participants have sought to adapt, or extend, their existing system apparatus to the new environment variable values. The foregoing is only an informal discussion: more thorough analysis of retrieval contexts is needed for a factor characterisation to be taken seriously as a basis for system development.
Cumulated gain-based evaluation of IR techniques ACM Transactions on Information Systems
    Modern large retrieval environments tend to overwhelm their users by their large output. In order to develop IR techniques in this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their ability to retrieve highly relevant documents.
    Graded relevance judgments may be used for IR evaluation, first, by extending traditional evaluation measures, such as recall and precision and P–R curves, to use them.
    The author demonstrates the use of the proposed measures in a case study testing runs from the TREC-7 ad hoc track with binary and nonbinary relevance judgments.
    In modern large database environments, the development and evaluation of IR methods should be based on their ability to retrieve highly relevant documents. This is often desirable from the user viewpoint and presents a not too liberal test for IR techniques.
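Here is a minimal Python sketch of the cumulated-gain idea (using the common log-base-2 discount; the paper itself leaves the log base as a parameter):

```python
# Discounted cumulated gain over graded relevance judgments:
# gains[i] is the graded relevance of the document at rank i+1.
import math

def dcg(gains):
    # Discount each gain by the log of its rank; ranks 1 and 2 are undiscounted.
    return sum(g / max(1.0, math.log2(rank))
               for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    # Normalize by the DCG of the ideal (best possible) ordering.
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

ranking = [3, 2, 3, 0, 1, 2]       # judgments on a 0-3 scale, by rank
print(dcg(ranking), ndcg(ranking))
```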

Week 5: Muddiest Points

1. Why is the unigram language model more popular than the model-comparison approach?
2. As for higher-order models, which one is the best among n-gram, cache, and grammar models?
3. I still don't quite understand the use of smoothing.
4. The language model is based on the vector space model, right?

Wednesday, February 5, 2014

Week 5: Reading Notes

IIR Chapter 11:   
    In this chapter, the author mainly talks about using probabilities in information retrieval, and he introduces several ways of doing probabilistic information retrieval. Users start with information needs, which they translate into query representations, and documents are converted into document representations.
    The author at first introduces some basic knowledge of probability, which most of us have already learned during our high school studies. Then the author concentrates on the Binary Independence Model, which is the original and still most influential probabilistic retrieval model. Finally, he introduces related but extended methods which use term counts, including the empirically successful Okapi BM25 weighting scheme, and Bayesian network models for IR. These all lie in the probabilistic area, with some mathematical questions and explanations. The author introduces the partition rule and Bayes' Rule, which is P(A|B) = P(B|A)P(A) / P(B). At last, the author mentions the concept of odds: the odds of an event, O(A) = P(A) / (1 − P(A)), provide a kind of multiplier for how probabilities change.
    In another section, the author introduces the 1/0 loss case and the quantity P(R = 1|d, q), which is the basis of the Probability Ranking Principle. If a set of retrieval results is to be returned, rather than an ordering, the Bayes Optimal Decision Rule, the decision which minimizes the risk of loss, is to simply return documents that are more likely relevant than nonrelevant. The Probability Ranking Principle says that if, for a specific document d and for all documents d′ not yet retrieved, P(R = 1|d, q) ≥ P(R = 1|d′, q), then d is the next document to be retrieved.
    The next section is about the Binary Independence Model. The author makes the conditional independence assumption that the presence or absence of a word in a document is independent of the presence or absence of any other word. In the end, the resulting quantity used for ranking, called the Retrieval Status Value (RSV) in this model, is RSV_d = Σ_{t: x_t = q_t = 1} log[p_t(1 − u_t) / (u_t(1 − p_t))], where p_t is the probability that a term appears in a relevant document and u_t the probability that it appears in a nonrelevant one.
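A minimal Python sketch of the RSV computation, including the add-1/2 smoothing discussed in the next paragraph (the probability values here are illustrative assumptions):

```python
# Binary Independence Model: sum the log odds-ratio weights of the query
# terms that appear in the document. p[t] is the probability the term
# occurs in a relevant document, u[t] in a nonrelevant one.
import math

def rsv(query_terms, doc_terms, p, u):
    score = 0.0
    for t in set(query_terms) & set(doc_terms):
        score += math.log(p[t] * (1 - u[t]) / (u[t] * (1 - p[t])))
    return score

def smoothed(count, total):
    # Relative frequency with 1/2 added to the count (and 1 to the total)
    # to avoid zero probabilities.
    return (count + 0.5) / (total + 1.0)

p = {"gold": smoothed(8, 10), "silver": smoothed(3, 10)}
u = {"gold": 0.05, "silver": 0.20}   # made-up nonrelevant estimates
print(rsv(["gold", "silver"], ["gold", "mine"], p, u))
```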
    In the next part, the author introduces probability estimates in theory. Estimating a probability as the relative frequency of the event is the maximum likelihood estimate (MLE), because this value makes the observed data maximally likely. To avoid the possibility of zeroes (such as if every or no relevant document has a particular term), it is fairly standard to add 1/2 to each of the quantities. The author then introduces the concept of a Bayesian prior; this is a form of maximum a posteriori (MAP) estimation, where we choose the most likely point value for probabilities based on the prior and the observed evidence.
    In the part on probability estimates in practice, the author notes that Croft and Harper (1979) proposed using a constant in their combination match model. Moreover, we can use (pseudo-)relevance feedback, perhaps in an iterative process of estimation, to get a more accurate estimate of p_t.
    Probabilistic methods are among the oldest formal models in IR. In the section on tree-structured dependencies between terms, the author shows how some of the assumptions of the BIM can be removed.

IIR Chapter 12:
    This chapter is mainly about language models for information retrieval. Instead of overtly modeling the probability of relevance P(R = 1|d, q), as in Chapter 11, the basic language modeling approach builds a probabilistic language model Md from each document d, and ranks documents based on the probability of the model generating the query: P(q|Md). In the first part of the chapter, the author introduces the concept of language models and then talks about the query likelihood model. In the end, the author also covers various extensions to the language modeling approach.
    At the beginning of this chapter, the author introduces finite automata and language models. A traditional generative model of a language, of the kind familiar from formal language theory, can be used either to recognize or to generate strings, and a language model is a function that puts a probability measure over strings drawn from some vocabulary. As for the types of language models, the simplest form simply throws away all conditioning context and estimates each term independently; this is called the unigram language model. At the end of the section, the author tells us: "The strategy we adopt in IR is as follows. We pretend that the document d is only a representative sample of text drawn from a model distribution, treating it like a fine-grained topic. We then estimate a language model from this sample, and use that model to calculate the probability of observing any word sequence, and, finally, we rank documents according to their probability of generating the query." That is the strategy of language models in IR.
    In the next several sections, the author describes many kinds of language models. Language modeling is a quite general formal approach to IR, with many variant realizations. The original and basic method for using language models in IR is the query likelihood model. The most common way to do this is using the multinomial unigram language model, which is equivalent to a multinomial Naive Bayes model (page 263), where the documents are the classes, each treated in the estimation as a separate "language".
    In the end, the author concludes that the retrieval ranking for a query q under the basic LM for IR he has been considering is given by P(d|q) ∝ P(d) · Π_{t∈q} (λ·P(t|Md) + (1 − λ)·P(t|Mc)), where Mc is a language model built from the whole collection and λ is the smoothing weight.
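To make the ranking formula concrete, here is a minimal Python sketch of query likelihood scoring with this linear interpolation smoothing (lambda_ = 0.5 is an illustrative choice, and the sketch assumes every query term occurs somewhere in the collection):

```python
# Query likelihood with a smoothed unigram language model: mix the
# document model Md with the collection model Mc so that query terms
# missing from the document do not zero out the score.
import math
from collections import Counter

def query_log_likelihood(query, doc, collection, lambda_=0.5):
    doc_tf, coll_tf = Counter(doc), Counter(collection)
    score = 0.0
    for t in query:
        p_doc = doc_tf[t] / len(doc)           # MLE estimate from Md
        p_coll = coll_tf[t] / len(collection)  # MLE estimate from Mc
        score += math.log(lambda_ * p_doc + (1 - lambda_) * p_coll)
    return score

collection = "click shears xyzzy revenue increase revenue".split()
doc = "revenue increase revenue".split()
print(query_log_likelihood(["revenue", "increase"], doc, collection))
```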
    The next section is about the comparison between language modeling and other approaches in IR. Compared to other probabilistic approaches, such as the BIM from Chapter 11, the main difference initially appears to be that the LM approach does away with explicitly modeling relevance (whereas this is the central variable evaluated in the BIM approach). The model has significant relations to traditional tf-idf models. The author also lists three ways of developing the language modeling approach: query likelihood, document likelihood, and model comparison.
The Paper:
    This paper mainly talks about the comparison between the traditional IR models and the new language model in IR. There are three traditional IR models: the Boolean model, the vector model, and the probabilistic model. The new language model is based on the vector model in its tf and idf terms and on the probabilistic model in its relevance weighting. The vector model and the probabilistic model stand for different approaches to information retrieval: the former is based on the similarity between query and document, the latter on the probability of relevance, using the distribution of terms over relevant and non-relevant documents. However, the author finds some interesting things in language models: he presents a strong theoretical motivation of the language modelling approach and shows that the approach outperforms the weighting algorithms developed within the traditional models.
    After discussing some of the traditional models' features, the author begins to introduce the statistical language model of retrieval. The author uses an urn model as a metaphor to illustrate language models, and then he introduces the ad hoc retrieval task. He uses the traditional models as the foundation, gives a definition of the corresponding probability measures, and derives some parameter estimates using tf and idf. In the end, he shows the results on the ad hoc task: both the original probabilistic model and the original vector space model underperform on this task, and the language model shares some features with the traditional models. The paper ends by introducing new ways of thinking about two popular information retrieval tools: the use of stop words and the use of a stemmer.