Author

Jon Crowcroft

Bio: Jon Crowcroft is an academic researcher from the University of Cambridge. The author has contributed to research on the topics The Internet and Multicast. The author has an h-index of 87 and has co-authored 672 publications receiving 38,848 citations. Previous affiliations of Jon Crowcroft include Memorial University of Newfoundland and Information Technology University.


Papers
Proceedings Article
31 Jul 2017
TL;DR: This paper outlines a systematic methodology and trains a classifier to categorise Twitter accounts into ‘automated’ and ‘human’ users, applying a Random Forests classifier that achieves an accuracy close to human agreement.
Abstract: Online social networks (OSNs) have seen a remarkable rise in the presence of surreptitious automated accounts. Massive human user-base and business-supportive operating model of social networks (such as Twitter) facilitates the creation of automated agents. In this paper we outline a systematic methodology and train a classifier to categorise Twitter accounts into ‘automated’ and ‘human’ users. To improve classification accuracy we employ a set of novel steps. First, we divide the dataset into four popularity bands to compensate for differences in types of accounts. Second, we create a large ground truth dataset using human annotations and extract relevant features from raw tweets. To judge accuracy of the procedure we calculate agreement among human annotators as well as with a bot detection research tool. We then apply a Random Forests classifier that achieves an accuracy close to human agreement. Finally, as a concluding step we perform tests to measure the efficacy of our results.

85 citations

Book Chapter
02 Oct 2005
TL;DR: In this paper, the authors propose a communication paradigm which reflects the reality faced by the mobile user and describe the challenges that this approach entails and provide evidence that it is feasible with today's technology.
Abstract: The Internet is built around the assumption of contemporaneous end-to-end connectivity. This is at odds with what typically happens in mobile networking, where mobile devices move between islands of connectivity, having opportunity to transmit packets through their wireless interface or simply carrying the data toward a connectivity island. We propose Pocket Switched Networking, a communication paradigm which reflects the reality faced by the mobile user. Pocket Networking falls under DTN. We describe the challenges that this approach entails and provide evidence that it is feasible with today's technology.

81 citations

Proceedings Article
07 Apr 2014
TL;DR: In this paper, the authors proposed different ways of recommending investors found on Twitter for specific Kickstarter projects by conducting hypothesis-driven analyses of pledging behavior and translating the corresponding findings into different recommendation strategies.
Abstract: To bring their innovative ideas to market, those embarking in new ventures have to raise money, and, to do so, they have often resorted to banks and venture capitalists. Nowadays, they have an additional option: that of crowdfunding. The name refers to the idea that funds come from a network of people on the Internet who are passionate about supporting others' projects. One of the most popular crowdfunding sites is Kickstarter. In it, creators post descriptions of their projects and advertise them on social media sites (mainly Twitter), while investors look for projects to support. The most common reason for project failure is the inability of founders to connect with a sufficient number of investors, and that is mainly because hitherto there has not been any automatic way of matching creators and investors. We thus set out to propose different ways of recommending investors found on Twitter for specific Kickstarter projects. We do so by conducting hypothesis-driven analyses of pledging behavior and translate the corresponding findings into different recommendation strategies. The best strategy achieves, on average, 84% of accuracy in predicting a list of potential investors' Twitter accounts for any given project. Our findings also produced key insights about the whys and wherefores of investors deciding to support innovative efforts.

80 citations

Proceedings Article
20 Jun 2011
TL;DR: This work proposes a piece of software (Spot Me) that can run on a mobile phone and is able to estimate the number of people in geographic locations in a privacy-preserving way and finds that erroneous locations have little effect on the estimations, yet they guarantee that users cannot be localized with high probability.
Abstract: Nowadays companies increasingly aggregate location data from different sources on the Internet to offer location-based services such as estimating current road traffic conditions, and finding the best nightlife locations in a city. However, these services have also caused outcries over privacy issues. As the volume of location data being aggregated expands, the comfort of sharing one's whereabouts with the public at large will unavoidably decrease. Existing ways of aggregating location data in the privacy literature are largely centralized in that they rely on a trusted location-based service. Instead, we propose a piece of software (Spot Me) that can run on a mobile phone and is able to estimate the number of people in geographic locations in a privacy-preserving way: accurate estimations are made possible in the presence of privacy-conscious users who report, in addition to their actual locations, a very large number of erroneous locations. The erroneous locations are selected by a randomized response algorithm. We evaluate the accuracy of Spot Me in estimating the number of people upon two very different realistic mobility traces: the mobility of vehicles in urban, suburban and rural areas, and the mobility of subway train passengers in Greater London. We find that erroneous locations have little effect on the estimations (in both traces, the error is below 18% for a situation in which more than 99% of the locations are erroneous), yet they guarantee that users cannot be localized with high probability. Also, the computational and storage overheads for a mobile phone running Spot Me are negligible, and the communication overhead is limited.

80 citations

01 Jan 2007
TL;DR: This work shows how to architect a citywide cooperative for safely sharing Wi-Fi with legitimate guests by tunneling the guest’s packets through a trusted point, and offers this as an economically viable alternative to investing millions in new infrastructure.
Abstract: Cities around the world are currently considering building expensive Wi-Fi infrastructure. In urban areas, resident operated Wi-Fi access points (APs) are dense enough to achieve ubiquitous Internet access, provided we can induce the hosts to provide guest Wi-Fi access. However, sharing Wi-Fi involves taking on responsibility for the guest’s actions. Our main contribution is a novel mechanism to handoff the host’s responsibility to a trusted point by tunneling the guest’s packets through it. The tunnel also guarantees that the guest’s traffic cannot be subverted by malicious hosts. Using tunneling as a primitive, we show how to architect a citywide cooperative for safely sharing Wi-Fi with legitimate guests. We offer this as an economically viable alternative to investing millions in new infrastructure.

78 citations


Cited by
Journal Article


08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one, which seemed an odd beast at that time: an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Journal Article
TL;DR: A discussion of Imagined Communities: Reflections on the Origin and Spread of Nationalism, published in History of European Ideas, Vol. 21, No. 5, pp. 721-722.

13,842 citations

Journal Article
TL;DR: A thorough exposition of community structure, or clustering, is attempted, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists.
Abstract: The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i. e. the organization of vertices in clusters, with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters. Such clusters, or communities, can be considered as fairly independent compartments of a graph, playing a similar role like, e. g., the tissues or the organs in the human body. Detecting communities is of great importance in sociology, biology and computer science, disciplines where systems are often represented as graphs. This problem is very hard and not yet satisfactorily solved, despite the huge effort of a large interdisciplinary community of scientists working on it over the past few years. We will attempt a thorough exposition of the topic, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists, from the discussion of crucial issues like the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks.

9,057 citations

Journal Article
TL;DR: A thorough exposition of the main elements of the clustering problem can be found in this paper, with a special focus on techniques designed by statistical physicists, from the discussion of crucial issues like the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks.

8,432 citations