Zheng Gao's Wonderland: Week 14: Reading Notes

1. Information Retrieval on the Semantic Web. In 10th International Conference on Information and Knowledge Management

One vision of the Semantic Web is that it will be much like the Web we know today, except that documents will be enriched by annotations in machine-understandable markup. These annotations will provide metadata about the documents as well as machine-interpretable statements capturing some of the meaning of document content. We discuss how the information retrieval paradigm might be recast in such an environment. We suggest that retrieval can be tightly bound to inference. Doing so makes today’s Web search engines useful to Semantic Web inference engines, and causes improvements in either retrieval or inference to lead directly to improvements in the other.

2. Generalizing from relevance feedback using named entity wildcards

Traditional adaptive filtering systems learn the user’s interests in a rather simple way: words from relevant documents are favored in the query model, while words from irrelevant documents are down-weighted. This biases the query model towards specific words seen in the past, causing the system to favor documents containing relevant but redundant information over documents that use previously unseen words to denote new facts about the same news event. This paper proposes new ways of generalizing from relevance feedback by augmenting the traditional bag-of-words query model with named entity wildcards that are anchored in context. The use of wildcards allows generalization beyond specific words, while contextual restrictions limit the wildcard-matching to entities related to the user’s query. We test our new approach in a nugget-level adaptive filtering system and evaluate it in terms of both relevance and novelty of the presented information. Our results indicate that higher recall is obtained when lexical terms are generalized using wildcards. However, such wildcards must be anchored to their context to maintain good precision. How the context of a wildcard is represented and matched against a given document also plays a crucial role in the performance of the retrieval system.
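To make the idea of a context-anchored wildcard concrete, here is a minimal Python sketch. It is not the paper's method: the toy gazetteer `ENTITY_TYPES`, the one-word left-context window, and the function names `make_wildcards` and `matches` are all assumptions made for illustration. A real system would use a proper named entity recognizer and a richer context representation.

```python
import re

# Hypothetical toy gazetteer standing in for a real named entity recognizer.
ENTITY_TYPES = {
    "alice": "PERSON",
    "bob": "PERSON",
    "paris": "LOCATION",
    "london": "LOCATION",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def entity_contexts(text, window=1):
    """Yield (left_context, entity_type) for every known entity in the text."""
    tokens = tokenize(text)
    # Replace each known entity by its type; other tokens stay as they are.
    tagged = [ENTITY_TYPES.get(t, t) for t in tokens]
    for i, tok in enumerate(tokens):
        if tok in ENTITY_TYPES:
            left = tuple(tagged[max(0, i - window):i])
            yield left, ENTITY_TYPES[tok]

def make_wildcards(relevant_text, window=1):
    """Generalize a relevant passage into context-anchored entity wildcards."""
    return set(entity_contexts(relevant_text, window))

def matches(wildcards, text, window=1):
    """True if any entity in the text fills a wildcard in the same context."""
    return any(ctx in wildcards for ctx in entity_contexts(text, window))

# Feedback on one passage generalizes from "Alice" to any PERSON after "arrested".
wildcards = make_wildcards("Police arrested Alice")
print(matches(wildcards, "Later they arrested Bob in London"))
print(matches(wildcards, "Bob visited Paris"))
```

The wildcard matches a new person name that the query model has never seen, which is where the recall gain would come from, while the required left context ("arrested") keeps it from firing on every person mention, which is the precision concern the abstract raises.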

3. Learning to rank for information retrieval

The task of "learning to rank" has emerged as an active and growing area of research both in information retrieval and machine learning. The goal is to design and apply methods to automatically learn a function from training data, such that the function can sort objects (e.g., documents) according to their degrees of relevance, preference, or importance as defined in a specific application. The relevance of this task for IR is without question, because many IR problems are by nature ranking problems. Improved algorithms for learning ranking functions promise improved retrieval quality and less of a need for manual parameter adaptation. In this way, many IR technologies can be potentially enhanced by using learning to rank techniques.
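As a rough illustration of what "learning a function that sorts objects" can look like, the sketch below trains a linear scoring function with a simple pairwise update rule. This is only one of several families of learning-to-rank methods and is not taken from the reading itself; the two features, the toy labels, and the names `train_pairwise` and `rank` are assumptions for the example.

```python
def score(w, x):
    """Linear scoring function: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(docs, epochs=100, lr=0.1):
    """Learn weights from (feature_vector, relevance_label) pairs.

    For every pair where one document is more relevant than the other,
    nudge the weights until the more relevant one scores higher by a margin.
    """
    w = [0.0] * len(docs[0][0])
    for _ in range(epochs):
        for xi, yi in docs:
            for xj, yj in docs:
                if yi > yj and score(w, xi) - score(w, xj) < 1.0:
                    w = [wk + lr * (a - b) for wk, a, b in zip(w, xi, xj)]
    return w

def rank(w, docs):
    """Sort documents by descending learned score."""
    return sorted(docs, key=lambda d: score(w, d[0]), reverse=True)

# Hypothetical training data: (features, graded relevance label).
training = [
    ([0.5, 0.8], 1),
    ([0.1, 0.3], 0),
    ([0.9, 0.1], 2),
]
weights = train_pairwise(training)
for features, label in rank(weights, training):
    print(label, features)
```

The weights are learned from the labeled examples rather than tuned by hand, which is the "less of a need for manual parameter adaptation" point made above.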

Tuesday, April 15, 2014


Zheng Gao