Web page

About: Web page is a research topic. Over its lifetime, 50,353 publications have been published within this topic, receiving 975,168 citations. The topic is also known as: webpage & web.


Papers
Proceedings Article
06 Jan 2007
TL;DR: This paper introduces Open Information Extraction (OIE), a new extraction paradigm in which the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input.
Abstract: Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set of announcements). Shifting to a new domain requires the user to name the target relations and to manually create new extraction rules or hand-tag new training examples. This manual labor scales linearly with the number of target relations. This paper introduces Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input. The paper also introduces TEXTRUNNER, a fully implemented, highly scalable OIE system where the tuples are assigned a probability and indexed to support efficient extraction and exploration via user queries. We report on experiments over a 9,000,000 Web page corpus that compare TEXTRUNNER with KNOWITALL, a state-of-the-art Web IE system. TEXTRUNNER achieves an error reduction of 33% on a comparable set of extractions. Furthermore, in the amount of time it takes KNOWITALL to perform extraction for a handful of pre-specified relations, TEXTRUNNER extracts a far broader set of facts reflecting orders of magnitude more relations, discovered on the fly. We report statistics on TEXTRUNNER's 11,000,000 highest probability tuples, and show that they contain over 1,000,000 concrete facts and over 6,500,000 more abstract assertions.

1,574 citations

Journal Article
TL;DR: Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces, which allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows.
Abstract: Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan and Phobius). The REST/SOAP Web Services (http://www.ebi.ac.uk/Tools/webservices/) interfaces to these databases and tools allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows. To get users started using the Web Services, sample clients are provided covering a range of programming languages and popular Web Service tool kits, and a brief guide to Web Services technologies, including a set of tutorials, is available for those wishing to learn more and develop their own clients. Users of the Web Services are informed of improvements and updates via a range of methods.

1,562 citations

Book
03 Jul 2006
TL;DR: Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided.
Abstract: Why doesn't your home page appear on the first page of search results, even when you query your own name? How do other web pages always appear at the top? What creates these powerful rankings? And how? The first book ever about the science of web page rankings, Google's PageRank and Beyond supplies the answers to these and other questions and more. The book serves two very different audiences: the curious science reader and the technical computational reader. The chapters build in mathematical sophistication, so that the first five are accessible to the general academic reader. While other chapters are much more mathematical in nature, each one contains something for both audiences. For example, the authors include entertaining asides such as how search engines make money and how the Great Firewall of China influences research. The book includes an extensive background chapter designed to help readers learn more about the mathematics of search engines, and it contains several MATLAB codes and links to sample web data sets. The philosophy throughout is to encourage readers to experiment with the ideas and algorithms in the text. Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided. Features: many illustrative examples and entertaining asides; MATLAB code; an accessible and informal style; a complete and self-contained section for mathematics review.

1,548 citations

Journal Article
TL;DR: This article describes representing a single information source as several chunks of data ("descriptions") so that the source can be approximated from any subset of the chunks. Because image reconstruction can continue even after a packet is lost, this type of representation can prevent a Web browser from becoming dormant.
Abstract: This article focuses on the compressed representations of pictures. The representation does not affect how many bits get from the Web server to the laptop, but it determines the usefulness of the bits that arrive. Many different representations are possible, and there is more involved in their choice than merely selecting a compression ratio. The techniques presented represent a single information source with several chunks of data ("descriptions") so that the source can be approximated from any subset of the chunks. By allowing image reconstruction to continue even after a packet is lost, this type of representation can prevent a Web browser from becoming dormant.

1,533 citations
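
The "descriptions" idea in the abstract above (a source that can be approximated from any subset of its chunks) can be illustrated with a minimal sketch. The odd/even sample split used here is only one simple way to form two descriptions and is an assumption for illustration, not the scheme from the article; the function names `make_descriptions` and `reconstruct` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def make_descriptions(samples: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split a signal into two descriptions: even-indexed and odd-indexed samples."""
    return list(samples[0::2]), list(samples[1::2])


def reconstruct(
    even: Optional[Sequence[float]],
    odd: Optional[Sequence[float]],
    length: int,
) -> List[float]:
    """Rebuild a signal of the given length from whichever descriptions arrived.

    Pass None for a description that was lost. Missing samples are estimated
    as the average of their received neighbours (illustrative choice).
    """
    if even is None and odd is None:
        raise ValueError("at least one description is required")

    out: List[Optional[float]] = [None] * length
    if even is not None:
        out[0::2] = list(even)
    if odd is not None:
        out[1::2] = list(odd)

    # Gaps all share one parity, so their neighbours are always received samples.
    for i in range(length):
        if out[i] is None:
            neighbours = [
                out[j] for j in (i - 1, i + 1)
                if 0 <= j < length and out[j] is not None
            ]
            out[i] = sum(neighbours) / len(neighbours) if neighbours else 0.0
    return [float(v) for v in out]


signal = [10.0, 12.0, 14.0, 16.0, 18.0]
even, odd = make_descriptions(signal)
both = reconstruct(even, odd, len(signal))        # exact reconstruction
only_even = reconstruct(even, None, len(signal))  # approximation from one chunk
only_odd = reconstruct(None, odd, len(signal))    # approximation from the other chunk
```

With both descriptions the signal is recovered exactly; with only one, the result is a coarser approximation rather than a stalled decode, which is the property the abstract highlights.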

Book
01 Jan 2007
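TL;DR: This book surveys the adaptive web in four parts: modeling technologies (user models, user profiles, web document modeling), adaptation technologies (personalized search, adaptive navigation, recommender systems), applications (healthcare, e-commerce, mobile guides, news), and open challenges (group recommendation, privacy, semantic web technologies, usability).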
Abstract (table of contents):
I. Modeling Technologies: User Models for Adaptive Hypermedia and Adaptive Educational Systems; User Profiles for Personalized Information Access; Data Mining for Web Personalization; Generic User Modeling Systems; Web Document Modeling.
II. Adaptation Technologies: Personalized Search on the World Wide Web; Adaptive Focused Crawling; Adaptive Navigation Support; Collaborative Filtering Recommender Systems; Content-Based Recommendation Systems; Case-Based Recommendation; Hybrid Web Recommender Systems; Adaptive Content Presentation for the Web; Adaptive 3D Web Sites.
III. Applications: Adaptive Information for Consumers of Healthcare; Personalization in E-Commerce Applications; Adaptive Mobile Guides; Adaptive News Access.
IV. Challenges: Adaptive Support for Distributed Collaboration; Recommendation to Groups; Privacy-Enhanced Web Personalization; Open Corpus Adaptive Educational Hypermedia; Semantic Web Technologies for the Adaptive Web; Usability Engineering for the Adaptive Web.

1,521 citations


Network Information
Related Topics (5)
The Internet: 213.2K papers, 3.8M citations, 89% related
Server: 79.5K papers, 1.4M citations, 88% related
User interface: 85.4K papers, 1.7M citations, 88% related
Software: 130.5K papers, 2M citations, 86% related
Information system: 107.5K papers, 1.8M citations, 85% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    152
2022    357
2021    486
2020    895
2019    1,221
2018    1,440