Dennis Fetterly

Researcher at Microsoft

Publications -  36
Citations -  6445

Dennis Fetterly is an academic researcher from Microsoft. The author has contributed to research on the topics Web page and Static web page. The author has an h-index of 21 and has co-authored 36 publications receiving 6275 citations. Previous affiliations of Dennis Fetterly include Hewlett-Packard and Google.

Papers
Proceedings Article

Dryad: distributed data-parallel programs from sequential building blocks

TL;DR: The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
Proceedings Article

DryadLINQ: a system for general-purpose distributed data-parallel computing using a high-level language

TL;DR: The system attains excellent absolute performance (a general-purpose sort of 10^12 bytes of data executes in 319 seconds on a 240-computer, 960-disk cluster) and shows near-linear scaling of execution time on representative applications as the number of computers used for a job varies.
Proceedings Article

Detecting spam web pages through content analysis

TL;DR: Some previously undescribed techniques for automatically detecting spam pages are considered, and their effectiveness is examined both in isolation and when aggregated using classification algorithms.
Proceedings Article

A large-scale study of the evolution of web pages

TL;DR: It is found that the average degree of change varies widely across top-level domains, and that larger pages change more often and more severely than smaller ones.
Proceedings Article

Spam, damn spam, and statistics: using statistical analysis to locate spam web pages

TL;DR: This paper proposes that some spam web pages can be identified through statistical analysis. It examines a variety of properties, including linkage structure, page content, and page evolution, and finds that outliers in the statistical distributions of these properties are highly likely to be caused by web spam.