
Showing papers by "Nigel Shadbolt" published in 2008


Journal ArticleDOI
01 Jul 2008
TL;DR: The Web must be studied as an entity in its own right to ensure it keeps flourishing and prevent unanticipated social effects.
Abstract: The Web must be studied as an entity in its own right to ensure it keeps flourishing and prevent unanticipated social effects.

328 citations


01 Jan 2008
TL;DR: In this paper, the authors argue that the Web has been transformational, that we need to understand it and to anticipate future developments, identifying opportunities and threats, and that we need a new discipline: Web Science.
Abstract: Our motivation is that the Web has been transformational; we need to understand it, to anticipate future developments, and to identify opportunities and threats. We need a new discipline: Web Science.

282 citations


Journal ArticleDOI
TL;DR: It is argued that some of the privacy concerns are overblown, and that much research and commentary on lifelogging has made the unrealistic assumption that the information gathered is for private use, whereas, in a more socially-networked online world, much of it will have public functions and will be voluntarily released into the public domain.
Abstract: The growth of information acquisition, storage and retrieval capacity has led to the development of the practice of lifelogging, the undiscriminating collection of information concerning one’s life and behaviour. There are potential problems in this practice, but equally it could be empowering for the individual, and provide a new locus for the construction of an online identity. In this paper we look at the technological possibilities and constraints for lifelogging tools, and set out some of the most important privacy, identity and empowerment-related issues. We argue that some of the privacy concerns are overblown, and that much research and commentary on lifelogging has made the unrealistic assumption that the information gathered is for private use, whereas, in a more socially-networked online world, much of it will have public functions and will be voluntarily released into the public domain.

139 citations


05 Apr 2008
TL;DR: The functionality and user interaction features of the NITELIGHT tool based on the work to date are described and details of the vSPARQL constructs used to support the graphical representation of SPARQL queries are presented.
Abstract: Query formulation is a key aspect of information retrieval, contributing to both the efficiency and usability of many semantic applications. A number of query languages, such as SPARQL, have been developed for the Semantic Web; however, there are, as yet, few tools to support end users with respect to the creation and editing of semantic queries. In this paper we introduce a graphical tool for semantic query construction (NITELIGHT) that is based on the SPARQL query language specification. The tool supports end users by providing a set of graphical notations that represent semantic query language constructs. This language provides a visual query language counterpart to SPARQL that we call vSPARQL. NITELIGHT also provides an interactive graphical editing environment that combines ontology navigation capabilities with graphical query visualization techniques. This paper describes the functionality and user interaction features of the NITELIGHT tool based on our work to date. We also present details of the vSPARQL constructs used to support the graphical representation of SPARQL queries.

127 citations
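The graphical notations in NITELIGHT ultimately serialise to textual SPARQL. As a rough illustration of the target of such a tool (not code from the paper; the rdflib Python library and the people.rdf data file are assumptions), a query of the kind a vSPARQL diagram might represent could be run as follows:

```python
# Illustrative only: the kind of SPARQL text a visual vSPARQL query might
# serialise to, executed here with rdflib (a library the paper does not mention).
from rdflib import Graph

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?topic WHERE {
    ?person a foaf:Person ;
            foaf:name ?name ;
            foaf:topic_interest ?topic .
}
"""

g = Graph()
g.parse("people.rdf")          # hypothetical RDF data file
for name, topic in g.query(QUERY):
    print(name, topic)
```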


Book ChapterDOI
26 Oct 2008
TL;DR: This paper presents a method for the automatic consolidation of user profiles across two popular social networking sites, and subsequent semantic modelling of their interests utilising Wikipedia as a multi-domain model and shows that far richer interest profiles can be generated for users when multiple tag-clouds are combined.
Abstract: The continued increase in Web usage, in particular participation in folksonomies, reveals a trend towards a more dynamic and interactive Web where individuals can organise and share resources. Tagging has emerged as the de-facto standard for the organisation of such resources, providing a versatile and reactive knowledge management mechanism that users find easy to use and understand. It is common nowadays for users to have multiple profiles in various folksonomies, thus distributing their tagging activities. In this paper, we present a method for the automatic consolidation of user profiles across two popular social networking sites, and subsequent semantic modelling of their interests utilising Wikipedia as a multi-domain model. We evaluate how much can be learned from such sites, and in which domains the knowledge acquired is focussed. Results show that far richer interest profiles can be generated for users when multiple tag-clouds are combined.

109 citations
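As a hedged sketch of the consolidation step only (the Wikipedia-based semantic modelling is not reproduced, and the site names and tag counts below are invented for illustration), two per-site tag-clouds can be normalised and merged into a single weighted interest profile:

```python
# Minimal sketch, not the paper's code: combine two tag-clouds from different
# folksonomies into one weighted interest profile.
from collections import Counter

def combine_tag_clouds(cloud_a: dict, cloud_b: dict) -> Counter:
    """Merge per-site tag frequencies into a single profile."""
    profile = Counter()
    for cloud in (cloud_a, cloud_b):
        total = sum(cloud.values()) or 1
        for tag, count in cloud.items():
            profile[tag.lower()] += count / total   # normalise per site
    return profile

flickr_tags = {"photography": 40, "travel": 25, "python": 5}      # hypothetical data
delicious_tags = {"python": 30, "semanticweb": 20, "travel": 10}  # hypothetical data
print(combine_tag_clouds(flickr_tags, delicious_tags).most_common(3))
```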


01 Mar 2008
TL;DR: It is revealed that the majority of users possess multiple interests, and an algorithm is proposed to generate user profiles which can accurately represent these multiple interests.
Abstract: Recommendation systems which aim at providing relevant information to users are becoming more and more important and desirable due to the enormous amount of information available on the Web. Crucial to the performance of a recommendation system is the accuracy of the user profiles used to represent the interests of the users. In recent years, popular collaborative tagging systems such as del.icio.us have aggregated an abundant amount of user-contributed metadata which provides valuable information about the interests of the users. In this paper, we present our analysis on the personal data in folksonomies, and investigate how accurate user profiles can be generated from this data. We reveal that the majority of users possess multiple interests, and propose an algorithm to generate user profiles which can accurately represent these multiple interests. We also discuss how these user profiles can be used for recommending Web pages and organising personal data.

93 citations
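The paper's profiling algorithm is not reproduced here; the following minimal sketch shows one plausible way to capture multiple interests, by grouping a user's bookmarks into tag clusters and summarising each cluster as one interest (the data, threshold, and similarity measure are all illustrative assumptions):

```python
# Illustrative sketch only: derive a multi-interest profile from
# del.icio.us-style bookmarks, one tag cluster per interest.
from collections import Counter

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def multi_interest_profile(bookmarks, threshold=0.2):
    """bookmarks: list of tag sets, one per saved page."""
    clusters = []                      # each cluster: list of tag sets
    for tags in bookmarks:
        best = max(clusters, key=lambda c: jaccard(tags, set().union(*c)), default=None)
        if best is not None and jaccard(tags, set().union(*best)) >= threshold:
            best.append(tags)
        else:
            clusters.append([tags])
    # summarise each cluster (interest) by its most frequent tags
    return [Counter(t for ts in c for t in ts).most_common(3) for c in clusters]

user = [{"python", "programming"}, {"recipes", "cooking"},
        {"programming", "web"}, {"cooking", "baking"}]
print(multi_interest_profile(user))
```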


Book ChapterDOI
29 Sep 2008
TL;DR: NITELIGHT is introduced, a Web-based graphical tool for semantic query construction that is based on the W3C SPARQL specification, and the potential contribution of the NITELIGHT tool to rule creation/editing and semantic integration capabilities is discussed.
Abstract: Query formulation is a key aspect of information retrieval, contributing to both the efficiency and usability of many semantic applications. A number of query languages, such as SPARQL, have been developed for the Semantic Web; however, there are, as yet, few tools to support end users with respect to the creation and editing of semantic queries. In this paper we introduce NITELIGHT, a Web-based graphical tool for semantic query construction that is based on the W3C SPARQL specification. NITELIGHT combines a number of features to support end-users with respect to the creation of SPARQL queries. These include a columnar ontology browser, an interactive graphical design surface, a SPARQL-compliant visual query language, a SPARQL syntax viewer and an integrated semantic query results browser. The functionality of each of these components is described in the current paper. In addition, we discuss the potential contribution of the NITELIGHT tool to rule creation/editing and semantic integration capabilities.

82 citations


Book ChapterDOI
01 May 2008
TL;DR: The OpenKnowledge project, as discussed by the authors, proposes a form of knowledge sharing that is based not on the direct sharing of "true" statements about the world but on sharing descriptions of interactions.
Abstract: The drive to extend the Web by taking advantage of automated symbolic reasoning (the so-called Semantic Web) has been dominated by a traditional model of knowledge sharing, in which the focus is on task-independent standardisation of knowledge. It appears to be difficult, in practice, to standardise in this way because the way in which we represent knowledge is strongly influenced by the ways in which we expect to use it. We present a form of knowledge sharing that is based not on the direct sharing of "true" statements about the world but, instead, on sharing descriptions of interactions. By making interaction specifications the currency of knowledge sharing we gain a context for interpreting knowledge that can be transmitted between peers, in a manner analogous to the use of electronic institutions in multi-agent systems. The narrower notion of semantic commitment we thus obtain requires peers only to commit to meanings of terms for the purposes and duration of the interactions in which they appear. This lightweight semantics allows networks of interaction to be formed between peers using comparatively simple means of tackling the perennial issues of query routing, service composition and ontology matching. A basic version of the system described in this paper has been built (via the OpenKnowledge project); all its components use established methods; many of these have been deployed in substantial applications; and we summarise a simple means of integration using the interaction specification language itself.

59 citations


Journal ArticleDOI
TL;DR: A practical approach to adopting semantic Web technologies enables large organizations to share data while achieving clear private as well as public reuse benefits.
Abstract: Many real-world tasks require the acquisition and integration of information from a distributed set of heterogeneous sources. Hence, there's no shortage of opportunities for applications using Semantic Web (SW) technologies. The power of publishing and linking data in a way that machines can automatically interpret through ontologies is beginning to materialize. However, market penetration level is relatively low, and it's still no routine matter for an enterprise, organization, governmental agency, or business with large distributed databases to add them to the Web of linked and semantically enriched data. In part, they may suspect that they're expected to pioneer an approach in which quick wins are few. Moreover, cost and privacy issues arise when ever-increasing amounts of information are linked into the Web. A practical approach to adopting semantic Web technologies enables large organizations to share data while achieving clear private as well as public reuse benefits.

57 citations


Journal ArticleDOI
TL;DR: A new discipline, Web Science, aims to discover how Web traits arise and how they can be harnessed or held in check to benefit society.
Abstract: The relentless rise in Web pages and links is creating emergent properties, from social networks to virtual identity theft, that are transforming society. A new discipline, Web Science, aims to discover how Web traits arise and how they can be harnessed or held in check to benefit society. Important advances are beginning to be made; more work can solve major issues such as securing privacy and conveying trust.

52 citations


Proceedings ArticleDOI
09 Dec 2008
TL;DR: This paper attempts to address the problem of ambiguous search keywords using a k-nearest-neighbour approach to classify documents returned by a search engine, building the classifiers from data collected from collaborative tagging systems.
Abstract: Traditional Web search engines mostly adopt a keyword-based approach. When the keyword submitted by the user is ambiguous, the search results usually consist of documents related to various meanings of the keyword, while the user is probably interested in only one of them. In this paper we attempt to provide a solution to this problem using a k-nearest-neighbour approach to classify documents returned by a search engine, by building classifiers using data collected from collaborative tagging systems. Experiments on search results returned by Google show that our method is able to classify the documents returned with high precision.
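A minimal sketch of the general idea, not the authors' implementation: documents labelled with senses via collaborative-tagging data act as training examples, and a returned search result is assigned the majority sense among its k most similar neighbours (the example data and the overlap-based similarity measure are assumptions):

```python
# Hedged sketch of k-nearest-neighbour sense classification using
# tag-derived training examples; all data below is invented for illustration.
from collections import Counter

def knn_classify(doc_terms: set, training, k: int = 5) -> str:
    """training: list of (term_set, sense_label) pairs drawn from tagged bookmarks."""
    scored = sorted(training,
                    key=lambda ex: len(doc_terms & ex[0]) / (len(doc_terms | ex[0]) or 1),
                    reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

training = [({"python", "snake", "reptile"}, "animal"),
            ({"python", "code", "programming"}, "language"),
            ({"python", "interpreter", "script"}, "language")]
print(knn_classify({"python", "script", "tutorial"}, training, k=3))
```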

01 May 2008
TL;DR: It is argued that some of the privacy concerns around lifelogging are overblown, that the major issues will concern surveillance, and that much of the information gathered will have public functions and will be voluntarily released into the public domain.
Abstract: The growth of information acquisition, storage and retrieval capacity has led to the development of the practice of lifelogging, the undiscriminating collection of information concerning one’s life and behaviour. There are potential problems in this practice, but equally it could be empowering for the individual, and provide a new locus for the construction of an online identity. In this paper we look at the technological possibilities and constraints for lifelogging tools, and set out some of the most important privacy, identity and empowerment-related issues. We argue that some of the privacy concerns are overblown, and the major issues will be concerned with surveillance. We also argue that much research and commentary on lifelogging has made the unrealistic assumption that the information gathered is for private use, whereas, in a more socially-networked online world, much of it will have public functions and will be voluntarily released into the public domain.

Journal ArticleDOI
TL;DR: A prototype knowledge-based document repository for an aeroengine manufacturer is presented that searches and analyzes distributed document resources and provides engineers with a summary view of the underlying knowledge, unlike existing document repositories and digital libraries.
Abstract: As manufacturers shift their focus from selling products to providing services, designers must increasingly consider the life-cycle requirements in addition to conventional design parameters. To identify possible areas of concern, engineers must consider knowledge gained through the life cycle of a related product. However, because of the size and distributed nature of a company’s operation, engineers often do not have access to front-line maintenance data. Additionally, the large number of documents generated during the design and operation of a product makes it impractical to manually review all documents thoroughly during a design task. This paper presents a prototype knowledge-based document repository for an aeroengine manufacturer. The developed system searches and analyzes distributed document resources, and provides engineers with a summary view of the underlying knowledge. The aim is to aid engineers in creating design requirements that incorporate maintenance issues. Unlike existing document repositories and digital libraries, our approach is knowledge based, where users browse summary reports instead of following suggested links. To test the validity of our architecture, we have developed and deployed a prototype of our knowledge-based document repository. The repository has been demonstrated to and validated by the engine design community.

18 Apr 2008
TL;DR: This paper sets out to use ontology alignments to inform an ontology fragmentation strategy, for the benefit of exposing and distributing rich, ontology-aligned fragments.
Abstract: In today's semantic web, ontology fragmentation and modularization are considered important tasks due to the size and complexity of prime ontologies. At the same time, the ontology alignment community thrives with solutions for discovering and producing alignments between semantically related concepts. However, these are seldom used in bulk and their subsequent re-use is somewhat problematic. In this paper we set out to explore these issues from a practical viewpoint: using ontology alignments to inform an ontology fragmentation strategy for the benefit of exposing and distributing rich, ontology-aligned fragments.

30 Mar 2008
TL;DR: This paper uses the semantics extracted from collaborative tagging in the social bookmarking site del.icio.us: for an ambiguous word, sets of tags related to it in different contexts are extracted by performing a community-discovery algorithm on folksonomy networks, and these tag sets are then used to disambiguate search results returned by del.icio.us and Google.
Abstract: Existing Web search engines such as Google mostly adopt a keyword-based approach, which matches the keywords in a query submitted by a user with the keywords characterising the indexed Web documents, and is quite successful in general in helping users locate useful documents. However, when the keyword submitted by the user is ambiguous, the search result usually consists of documents related to various meanings of the keyword, in which probably only one of them is interesting to the user. In this paper we attempt to provide a solution to this problem by using the semantics extracted from collaborative tagging in the social bookmarking site del.icio.us. For an ambiguous word, we extract sets of tags which are related to it in different contexts by performing a community-discovery algorithm on folksonomy networks. The sets of tags are then used to disambiguate search results returned by del.icio.us and Google. Experimental results show that our method is able to disambiguate the documents returned by the two systems with high precision.
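As an illustration of the community-discovery step (the paper does not prescribe a particular algorithm or library; networkx's modularity-based method and the co-occurrence counts below are assumptions), a tag co-occurrence network around an ambiguous tag can be split into contexts like this:

```python
# Illustrative sketch: communities in a tag co-occurrence graph serve as
# disambiguating contexts for the ambiguous tag "apple" (invented data).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

cooccurrence = {("iphone", "mac"): 12, ("mac", "osx"): 9, ("iphone", "osx"): 7,
                ("fruit", "recipe"): 10, ("recipe", "pie"): 8, ("fruit", "pie"): 6}

G = nx.Graph()
for (a, b), w in cooccurrence.items():
    G.add_edge(a, b, weight=w)

for context in greedy_modularity_communities(G, weight="weight"):
    print(sorted(context))   # e.g. one technology context, one cooking context
```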

21 Jul 2008
TL;DR: This work has devised a mechanism using Semantic Web technologies that wraps each existing data source with semantic information, and it is referred to as SWEDER (Semantic Wrapping of Existing Data Sources with Embedded Rules).
Abstract: We argue for the flexible use of lightweight ontologies to aid information integration. Our proposed approach is grounded on the availability and exploitation of existing data sources in a networked environment such as the world wide web (instance data as it is commonly known in the description logic and ontology community). We have devised a mechanism using Semantic Web technologies that wraps each existing data source with semantic information, and we refer to this technique as SWEDER (Semantic Wrapping of Existing Data Sources with Embedded Rules). This technique provides representational homogeneity and a firm basis for information integration amongst these semantically enabled data sources. This technique also directly supports information integration through the use of context ontologies to align two or more semantically wrapped data sources and capture the rules that define these integrations. We have tested this proposed approach using a simple implementation in the domain of organisational and communication data and we speculate on the future directions for this lightweight approach to semantic enablement and contextual alignment of existing network-available data sources.
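A minimal sketch of the idea, assuming the rdflib Python library and invented namespaces (the paper does not specify a toolkit): a plain record is wrapped as RDF triples, and an integration rule is expressed as an embedded SPARQL CONSTRUCT clause in the spirit of SWEDER's context ontologies:

```python
# Hedged sketch only: semantic wrapping of one record plus a CONSTRUCT-based
# integration rule; namespaces and data are hypothetical.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/wrapped#")

g = Graph()
person = EX["emp42"]
g.add((person, RDF.type, EX.Employee))           # semantic wrapper around one record
g.add((person, EX["name"], Literal("A. Smith")))
g.add((person, EX["dept"], Literal("Design")))

CONTEXT_RULE = """
PREFIX ex: <http://example.org/wrapped#>
PREFIX org: <http://example.org/org#>
CONSTRUCT { ?p a org:Member ; org:unit ?d . }
WHERE     { ?p a ex:Employee ; ex:dept ?d . }
"""
integrated = g.query(CONTEXT_RULE).graph         # CONSTRUCT result as a new graph
print(integrated.serialize(format="turtle"))
```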

Proceedings ArticleDOI
09 Dec 2008
TL;DR: In this paper, a method is proposed to construct user profiles of multiple interests from data in a collaborative tagging system; the resulting profiles reflect the diversity of user interests and can be used to help provide more focused recommendations.
Abstract: We analyse data obtained from several collaborative tagging systems and discover that user interests can be very diverse. Traditional methods for representing interests of users are usually not able to reflect such diversity. We propose a method to construct user profiles of multiple interests using data in a collaborative tagging system. Our evaluation suggests that the proposed method is able to generate user profiles which reflect the diversity of user interests and can be used to help provide more focused recommendation.

26 Feb 2008
TL;DR: Experimental evidence is presented that hyperstructure changes, as opposed to content changes, form a substantial proportion of editing effort on a large-scale wiki.
Abstract: Wiki systems have developed over the past years as lightweight, community-editable, web-based hypertext systems. With the emergence of Semantic Wikis, these collections of interlinked documents have also gained a dual role as ad-hoc RDF graphs. However, their roots lie at the limited hypertext capabilities of the World Wide Web: embedded links, without support for composite objects or transclusion. In this paper, we present experimental evidence that hyperstructure changes, as opposed to content changes, form a substantial proportion of editing effort on a large-scale wiki. The experiment is set in the wider context of a study of how the technologies developed during decades of hypertext research may be applied to improve management of wiki document structure and, with semantic wikis, knowledge structure.
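Not the paper's measurement code, but a crude illustration of the distinction it studies: an edit can be treated as a hyperstructure change when the set of [[wiki links]] differs between revisions, and as a content change otherwise (the regex and examples are assumptions):

```python
# Illustrative sketch: separate hyperstructure edits from content edits by
# comparing the wiki-link sets of two revisions of a page.
import re

LINK = re.compile(r"\[\[([^\]|]+)")

def edit_kind(old_text: str, new_text: str) -> str:
    old_links = set(LINK.findall(old_text))
    new_links = set(LINK.findall(new_text))
    if old_links != new_links:
        return "hyperstructure"            # links added, removed or retargeted
    return "content" if old_text != new_text else "null"

print(edit_kind("See [[Hypertext]].", "See [[Hypertext]] and [[Transclusion]]."))
print(edit_kind("See [[Hypertext]].", "See also [[Hypertext]]."))
```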

Proceedings ArticleDOI
09 Dec 2008
TL;DR: This paper proposes a method to reveal the semantics of ambiguous tags by studying collective user behaviour in a tagging system; rather than applying common large-scale clustering techniques to folksonomies, the authors argue that tags can be better contextualised by the social contexts in which they are used.
Abstract: Collaborative tagging systems have emerged in recent years to become popular tools for organising information on the Web. While collaborative tagging offers many advantages, such systems also suffer from several limitations, a major one being the existence of ambiguous tags. To understand what an ambiguous tag is intended to mean, we need to know the contexts in which it is used. Instead of using common large scale clustering techniques on folksonomies, we believe tags can be better contextualised by the social contexts in which they are used. We propose a method to reveal the semantics of ambiguous tags by studying the collective user behaviour in a tagging system. In this paper we describe our proposal and some results of our preliminary experiments. We also discuss the significance of the work and how it can be evaluated.

27 Jul 2008
TL;DR: A detailed military scenario that features the involvement of US and UK coalition forces in a large-scale humanitarian-assistance/disaster relief (HA/DR) effort is developed and the opportunities for technology demonstration in respect of a number of ITA research focus areas are reviewed.
Abstract: As a fundamental research program, the International Technology Alliance (ITA) aims to explore innovative solutions to some of the challenges confronting US/UK coalition military forces in an era of network-enabled operations. In order to demonstrate some of the scientific and technical achievements of the ITA research program, we have developed a detailed military scenario that features the involvement of US and UK coalition forces in a large-scale humanitarian-assistance/disaster relief (HA/DR) effort. The scenario is based in a fictitious country called Holistan, and it draws on a number of previous scenario specification efforts that have been undertaken as part of the ITA. In this paper we provide a detailed description of the scenario and review the opportunities for technology demonstration in respect of a number of ITA research focus areas.

Journal ArticleDOI
TL;DR: It is proposed that a neuron's preferred operating point can be characterised by the probability density function of its output spike rate, and that adaptation maintains an invariant output PDF, regardless of how this output PDF is initially set.
Abstract: Sensory neurons adapt to changes in the natural statistics of their environments through processes such as gain control and firing threshold adjustment. It has been argued that neurons early in sensory pathways adapt according to information-theoretic criteria, perhaps maximising their coding efficiency or information rate. Here, we draw a distinction between how a neuron's preferred operating point is determined and how its preferred operating point is maintained through adaptation. We propose that a neuron's preferred operating point can be characterised by the probability density function (PDF) of its output spike rate, and that adaptation maintains an invariant output PDF, regardless of how this output PDF is initially set. Considering a sigmoidal transfer function for simplicity, we derive simple adaptation rules for a neuron with one sensory input that permit adaptation to the lower-order statistics of the input, independent of how the preferred operating point of the neuron is set. Thus, if the preferred operating point is, in fact, set according to information-theoretic criteria, then these rules nonetheless maintain a neuron at that point. Our approach generalises from the unimodal case to the multimodal case, for a neuron with inputs from distinct sensory channels, and we briefly consider this case too.
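As an illustrative sketch only (the paper's actual derivation is not reproduced), a sigmoidal transfer function whose threshold and gain track the input's mean and spread keeps the output PDF invariant under shifts and rescalings of the input distribution:

```latex
% Sketch only: a sigmoidal transfer function of the kind the abstract mentions,
% with threshold \theta and gain \sigma adapted so the output PDF stays fixed.
% The specific update rules below are illustrative, not the paper's derivation.
\begin{align}
  r(s) &= \frac{r_{\max}}{1 + e^{-(s-\theta)/\sigma}} \\
  \tau_\theta \,\dot{\theta} &= \langle s \rangle - \theta
      && \text{(track the input mean)} \\
  \tau_\sigma \,\dot{\sigma} &= \sqrt{\langle (s-\theta)^2 \rangle} - \sigma
      && \text{(track the input spread)}
\end{align}
```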

01 Aug 2008
TL;DR: Two Semantic Web techniques arising from ITA research into semantic alignment and interoperability in distributed networks are explored, grounded in the creation of lightweight ontologies to semantically wrap existing data sources to facilitate rapid semantic integration through representational homogeneity.
Abstract: We explore two Semantic Web techniques arising from ITA research into semantic alignment and interoperability in distributed networks. The first is POAF (Portable Ontology Aligned Fragments) which addresses issues relating to the portability and usage of ontology alignments. POAF uses an ontology fragmentation strategy to achieve portability, and enables subsequent usage through a form of automated ontology modularization. The second technique, SWEDER (Semantic Wrapping of Existing Data sources with Embedded Rules), is grounded in the creation of lightweight ontologies to semantically wrap existing data sources, to facilitate rapid semantic integration through representational homogeneity. The semantic integration is achieved through the creation of context ontologies which define the integrations and provide a portable definition of the integration rules in the form of embedded SPARQL construct clauses. These two Semantic Web techniques address important practical issues relevant to the potential future adoption of ontologies in distributed network environments.

26 Sep 2008
TL;DR: It is suggested that the informational and technological elements of a network system can, at times, constitute part of the material supervenience base for a human agent’s mental states and processes.
Abstract: In thinking about the transformative potential of network technologies with respect to human cognition, it is common to see network resources as playing a largely assistive or augmentative role. In this paper we propose a somewhat more radical vision. We suggest that the informational and technological elements of a network system can, at times, constitute part of the material supervenience base for a human agent’s mental states and processes. This thesis (called the thesis of network-enabled cognition) draws its inspiration from the notion of the extended mind that has been propounded in the philosophical and cognitive science literature. Our basic claim is that network systems can do more than just augment cognition; they can also constitute part of the physical machinery that makes mind and cognition mechanistically possible. In evaluating this hypothesis, we identify a number of issues that seem to undermine the extent to which contemporary network systems, most notably the World Wide Web, can legitimately feature as part of an environmentally-extended cognitive system. Specific problems include the reliability and resilience of network-enabled devices, the accessibility of online information content, and the extent to which network-derived information is treated in the same way as information retrieved from biological memory. We argue that these apparent shortfalls do not necessarily merit the wholesale rejection of the network-enabled cognition thesis; rather, they point to the limits of the current state-of-the-art and identify the targets of many ongoing research initiatives in the network and information sciences. In addition to highlighting the importance of current research and technology development efforts, the thesis of network-enabled cognition also suggests a number of areas for future research. These include the formation and maintenance of online trust relationships, the subjective assessment of information credibility and the long-term impact of network access on human psychological and cognitive functioning. The nascent discipline of web science is, we suggest, suitably placed to begin an exploration of these issues.

Book ChapterDOI
26 Oct 2008
TL;DR: The project plans to aggregate a broad range of business information, providing unparalleled insight into UK business activity, and to develop rich semantic search and navigation tools that allow any business to 'place their sales proposition in front of a prospective buyer', confident that the recipient has a propensity to buy.
Abstract: Market Blended Insight (MBI) is a project with a clear objective of making a significant performance improvement in UK business to business (B2B) marketing activities in the 5-7 year timeframe. The web has created a rapid expansion of content that can be harnessed by recent advances in Semantic Web technologies and applied to both Media industry provision and company utilization of exploitable business data and content. The project plans to aggregate a broad range of business information, providing unparalleled insight into UK business activity, and to develop rich semantic search and navigation tools that allow any business to 'place their sales proposition in front of a prospective buyer', confident that the recipient has a propensity to buy.

Book ChapterDOI
01 Jan 2008
TL;DR: This chapter examines certain features that distinguish killer apps from other ordinary applications in the context of the semantic web, in the hope that a better understanding of the characteristics of killer apps might encourage their consideration when developing semantic web applications.
Abstract: There are certain features that distinguish killer apps from other ordinary applications. This chapter examines those features in the context of the semantic web, in the hope that a better understanding of the characteristics of killer apps might encourage their consideration when developing semantic web applications. Killer apps are highly transformative technologies that create new e-commerce venues and widespread patterns of behaviour. Information technology, generally, and the Web, in particular, have benefited from killer apps that create new networks of users and increase their value. The semantic web community, on the other hand, is still awaiting a killer app that proves the superiority of its technologies. The authors hope that this chapter will help to highlight some of the common ingredients of killer apps in e-commerce, and discuss how such applications might emerge in the semantic web.


Proceedings ArticleDOI
01 Jan 2008
TL;DR: This paper presents a knowledge-based document repository demonstrator that is capable of providing life-cycle knowledge support for the maintainers and designers of jet engines.
Abstract: Manufacturers are currently shifting their focus from selling products to providing services, so a product's designers must increasingly consider life-cycle requirements, in addition to conventional design parameters. To identify possible areas of concern, engineers must consider knowledge gained through the life cycle of a similar or related product. However, because of the size and distributed nature of a company's operation, engineers often do not have access to front-line maintenance data. In addition, the large number of documents generated during the design and operation of a product makes it impractical to manually review all documents thoroughly during a task. This paper presents a knowledge-based document repository demonstrator that is capable of providing such support for the maintainers and designers of jet engines.



Book ChapterDOI
01 Jun 2008
TL;DR: As stories of identity theft and online fraud fill the media, internet users are becoming increasingly nervous about their online data security.
Abstract: In under a decade the internet has changed our lives. Now we can shop, bank, date, research, learn and communicate online, and every time we do we leave behind a trail of personal information. Organisations have a wealth of structured information about individuals on large numbers of databases. What does the intersection of this information mean for the individual? How much of your personal data is out there and, more importantly, just who has access to it? As stories of identity theft and online fraud fill the media, internet users are becoming increasingly nervous about their online data security. And what opportunities arise for individuals to exploit this information for their own benefit?