Query-free news search
Frequently Asked Questions (15)
Q2. What contributions have the authors mentioned in the paper "Query-free news search" ?
TV broadcast news can be treated as one such stream of text; in this paper the authors discuss finding news articles on the web that are relevant to news currently being broadcast. The authors evaluated a variety of algorithms for this problem, looking at the impact of inverse document frequency, stemming, compounds, history, and query length on the relevance and coverage of news articles returned in real time during a broadcast. For the best algorithm, 84%-91% of the articles found were relevant, with at least 64% of the articles being on the exact topic of the broadcast.
Q3. What is the approach to finding articles that are related to a stream of text?
Their approach to finding articles that are related to a stream of text is to create queries based on the text and to issue the queries to a search engine.
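The query-creation step can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a simple term-frequency times inverse-document-frequency weighting and a two-term query. The function `make_query`, the `doc_freq` table, and the smoothing used in the idf formula are all hypothetical names and choices introduced here.

```python
import math
from collections import Counter

def make_query(segment_words, doc_freq, num_docs, query_len=2, stopwords=frozenset()):
    """Pick the query_len highest-weighted terms from one text segment.

    Weighting is tf * idf with add-one smoothing (an assumption made for
    this sketch; the source does not give the exact formula).
    """
    tf = Counter(w.lower() for w in segment_words if w.lower() not in stopwords)

    def weight(term):
        df = doc_freq.get(term, 0)
        idf = math.log((num_docs + 1) / (df + 1))
        return tf[term] * idf

    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(tf, key=lambda t: (-weight(t), t))
    return ranked[:query_len]

# Hypothetical document frequencies for a 1000-document collection.
doc_freq = {"the": 1000, "and": 1000, "hurricane": 5, "florida": 20,
            "hit": 300, "moved": 400, "north": 200}
words = "the hurricane hit florida and the hurricane moved north".split()
query = make_query(words, doc_freq, num_docs=1000)
```

The resulting list of terms would then be sent to a search engine; how the query is issued depends on the engine and is left out of the sketch.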
Q4. What does the last query generation algorithm do?
The last query generation algorithm uses a combination of 3- and 2-term queries to explore whether the 2-term limit hurts performance.
Q5. How many topics have at least one article rated relevant?
In summary, roughly 70% of the topics have at least one article rated relevant, and almost as many have at least one article rated very relevant (R+).
Q6. What precision does the best algorithm achieve?
The best algorithm achieves a precision of 91% on one data set and 84% on a second data set and finds a relevant article for at least 70% of the topics in the data sets.
Q7. What is the approach to finding news articles on the web?
Their approach is to extract queries from the ongoing stream of closed captions, issue the queries in real time to a news search engine on the web, and postprocess the top results to determine the news articles that the authors show to the user.
Q8. What is the meaning of a reset?
A reset simply sets the stem vector to be the empty vector; it occurs when the topic in a text segment changes substantially from the previous text segment (see below).
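A reset can be illustrated with a small sketch. The source states only that a reset empties the stem vector when the topic changes substantially between segments; the cosine-similarity test and the threshold of 0.1 used here to detect such a change are assumptions, and the class `StemHistory` is a hypothetical name.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class StemHistory:
    def __init__(self, reset_threshold=0.1):
        self.stem_vector = Counter()   # accumulated history of stems
        self.previous = Counter()      # stems of the previous segment
        self.reset_threshold = reset_threshold

    def update(self, segment_stems):
        """Add a segment's stems; return True if a reset happened first."""
        current = Counter(segment_stems)
        was_reset = False
        # Assumed topic-change test: low similarity to the previous segment.
        if self.previous and cosine(self.previous, current) < self.reset_threshold:
            self.stem_vector = Counter()   # the reset: empty stem vector
            was_reset = True
        self.stem_vector.update(current)
        self.previous = current
        return was_reset

history = StemHistory()
r1 = history.update(["elect", "vote", "senat"])
r2 = history.update(["vote", "elect", "poll"])
r3 = history.update(["hurrican", "storm"])
```

In this run the first two segments share stems, so the history accumulates; the third segment shares none, so the vector is emptied before its stems are added.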
Q9. How many articles can be returned for the CNN data set?
The authors return the top two articles for each query so that a maximum of 514 relevant articles could be returned for this data set when L .
Q10. How well does the best algorithm perform at finding news articles on the web?
For this genre of television show, the best algorithm finds a relevant page every 16-20 seconds on average, achieves a precision of 84-91%, and finds a relevant article for about 70% of the topics.
Q11. What are some examples of stories that are relevant to the topic?
Examples include a story about a beauty pageant for women in Lithuania’s prisons, a story about a new invention that uses recycled water from showers and baths to flush toilets, and a story about garbage trucks giving English lessons over loudspeakers in Singapore.
Q12. How could the topic finding and query generation algorithms be applied to conversations?
As voice recognition systems improve, the same kind of topic finding and query generation algorithms described in this paper could be applied to conversations, providing relevant information immediately upon demand.
Q13. What is the algorithm used to shorten the query?
To verify their choice of query length 2, the authors experimented with a query shortening algorithm, which issues a multiple-term query and shortens it until the news search engine returns results.
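The shortening loop might look like the sketch below. It assumes the terms are ordered from most to least important and that the last term is dropped at each step; the source does not say which term is removed. The `search` callable is a hypothetical stand-in for the news search engine.

```python
def shorten_until_results(terms, search, min_terms=1):
    """Issue the full query, dropping the last term until results appear.

    terms: query terms, assumed ordered from most to least important.
    search: callable taking a list of terms and returning a list of results.
    Returns (query_used, results), or ([], []) if nothing was found.
    """
    query = list(terms)
    while len(query) >= min_terms and query:
        results = search(query)
        if results:
            return query, results
        query = query[:-1]   # assumed policy: drop the least important term
    return [], []

# A toy in-memory "search engine" for illustration only.
index = {("hurricane", "florida"): ["article-1"]}

def fake_search(query):
    return index.get(tuple(query), [])

used, found = shorten_until_results(
    ["hurricane", "florida", "evacuation"], fake_search)
```

Here the three-term query returns nothing, so the loop retries with the first two terms, which match.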
Q14. How would the IR system find the documents?
The system would derive queries from the passages of text that were marked, and search over a local corpus for relevant documents to present to the user.
Q15. Why did the authors restrict the search to articles published on the day of the broadcast?
Because the authors want to retrieve articles that are about the current news item, they restricted the search to articles published on the day of the broadcast or the day before.
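The date restriction amounts to a two-day window, which can be sketched as a simple filter. The function name `published_in_window` and the example dates are hypothetical, and the sketch assumes publication dates are available as calendar dates.

```python
from datetime import date, timedelta

def published_in_window(published, broadcast_day):
    """True if an article was published on the broadcast day or the day before."""
    return broadcast_day - timedelta(days=1) <= published <= broadcast_day

broadcast = date(2003, 5, 20)  # example date, chosen arbitrarily
same_day = published_in_window(date(2003, 5, 20), broadcast)
day_before = published_in_window(date(2003, 5, 19), broadcast)
two_days_before = published_in_window(date(2003, 5, 18), broadcast)
```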