
Showing papers in "Journal of Web Engineering in 2004"


Journal Article
TL;DR: This paper presents the modeling support provided for this kind of processes by the Object-Oriented Hypermedia method (OO-H) and the UML-based Web Engineering (UWE) approach and provides the basis on which each method applies its particular design notation for business processes.
Abstract: Business processes, regarded as heavy-weighted flows of control consisting of activities and transitions, play an increasingly important role in Web applications. In order to address these business processes, Web methodologies are evolving to support their definition and integration with the Web-specific aspects of content, navigation and presentation. This paper presents the modeling support provided for this kind of processes by the Object-Oriented Hypermedia method (OO-H) and the UML-based Web Engineering (UWE) approach. Both methods apply UML use cases and activity diagrams, and supply UML standard modeling extensions. Additionally, the connection mechanisms between the navigation and the process-specific modeling elements are discussed. As a representative example to illustrate our approach we present the requirements, analysis and design models for the www.amazon.com Website, with focus on the checkout process. Our approach includes requirements and analysis models shared by OO-H and UWE and provides the basis on which each method applies its particular design notation for business processes.

94 citations


Journal Article
TL;DR: This paper will review the research into different types of authentication mechanisms, including simple passwords, and propose a mechanism for quantifying the quality of different authentication mechanisms to support an informed choice for web site administrators.
Abstract: Users wishing to use secure computer systems or web sites are required to authenticate themselves. Users are usually required to supply a user identification and to authenticate themselves to prove that they are indeed the person they claim to be. The authenticator of choice in the web environment is the simple password. Since the advent of the web the proliferation of secure systems has placed an unacceptable burden on users to recall increasing numbers of passwords that are often infrequently used. This paper will review the research into different types of authentication mechanisms, including simple passwords, and propose a mechanism for quantifying the quality of different authentication mechanisms to support an informed choice for web site administrators.
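The abstract does not spell out the proposed quantification mechanism; a common first-order proxy for password quality is estimated entropy in bits. The following sketch uses that standard approximation and is not the paper's own metric:

```python
import math

def password_entropy_bits(password: str) -> float:
    """Estimate password strength as entropy in bits: length * log2(pool size).

    The character pool size is inferred from which character classes appear.
    This is a common first-order approximation, not the paper's mechanism.
    """
    pool = 0
    if any(c.islower() for c in password):
        pool += 26
    if any(c.isupper() for c in password):
        pool += 26
    if any(c.isdigit() for c in password):
        pool += 10
    if any(not c.isalnum() for c in password):
        pool += 32  # rough count of printable symbols
    return len(password) * math.log2(pool) if pool else 0.0
```

Under this measure, mixing character classes raises the score faster than adding length alone, which matches the intuition behind most password-quality guidelines.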

62 citations


Journal Article
TL;DR: The results obtained from the clone analysis of four web applications show that the semiautomated approach is both effective and efficient at identifying function clones in web applications, and can be applied to prevent clone from spreading or to remove redundant scripting code.
Abstract: Many web applications use a mixture of HTML and scripting language code as the front-end to business services, where scripts can run on both the client and server side. Analogously to traditional applications, code duplication occurs frequently during the development and evolution of web applications. This ad-hoc but pathological form of reuse consists in copying, and eventually modifying, a block of existing code that implements a piece of required functionality. Duplicated blocks are named clones and the act of copying, including slight modifications, is called cloning. When entire functions are copied rather than fragments, duplicated functions are called function clones. This paper describes how a semiautomated approach can be used to identify cloned functions within scripting code of web applications. The approach is based on the automatic selection of potential function clones and the visual inspection of selected script functions. The results obtained from the clone analysis of four web applications show that the semiautomated approach is both effective and efficient at identifying function clones in web applications, and can be applied to prevent clones from spreading or to remove redundant scripting code.
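The abstract does not detail the automatic selection step; one simple way to pick potential function clones is to compare token-normalized function bodies. The sketch below is only an illustration of that general idea (keywords are deliberately over-normalized to identifiers, which merely widens the candidate set handed to visual inspection):

```python
import re

def normalize(source: str) -> str:
    """Normalize a script function for clone comparison: identifiers become
    ID, numbers NUM, string literals STR; punctuation is kept as-is.
    Keywords are over-normalized to ID too, which only over-approximates
    the candidate set that visual inspection later filters."""
    tokens = re.findall(r"""[A-Za-z_]\w*|\d+(?:\.\d+)?|"[^"]*"|'[^']*'|\S""", source)
    out = []
    for t in tokens:
        if re.fullmatch(r"[A-Za-z_]\w*", t):
            out.append("ID")
        elif re.fullmatch(r"\d+(?:\.\d+)?", t):
            out.append("NUM")
        elif t[0] in "\"'":
            out.append("STR")
        else:
            out.append(t)
    return " ".join(out)

def candidate_clones(functions: dict) -> list:
    """Group functions whose normalized token sequences match exactly;
    each group of size > 1 is a set of potential function clones."""
    groups = {}
    for name, body in functions.items():
        groups.setdefault(normalize(body), []).append(name)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

For example, `function add(a, b) { return a + b; }` and `function sum(x, y) { return x + y; }` normalize identically and would be flagged as a candidate clone pair.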

61 citations


Journal Article
TL;DR: Results indicate that the classifier is able to distinguish home pages from non-home pages and, within the home page genre, to distinguish personal from corporate home pages; organization home pages, however, were more difficult to distinguish from personal and corporate home pages.
Abstract: The research reported in this paper is part of a larger project on the automatic classification of web pages by their genres. The long term goal is the incorporation of web page genre into the search process to improve the quality of the search results. In this phase, a neural net classifier was trained to distinguish home pages from non-home pages and to classify those home pages as personal home page, corporate home page or organization home page. In order to evaluate the importance of the functionality attribute of cybergenre in such classification, the web pages were characterized by the cybergenre attributes of 〈content, form, functionality〉 and the resulting classifications compared to classifications in which the web pages were characterized by the genre attributes of 〈content, form〉. Results indicate that the classifier is able to distinguish home pages from non-home pages and within the home page genre it is able to distinguish personal from corporate home pages. Organization home pages, however, were more difficult to distinguish from personal and corporate home pages. A significant improvement was found in identifying personal and corporate home pages when the functionality attribute was included.

49 citations


Journal Article
TL;DR: The research shows that multimedia searching is complex relative to general Web searching, and searching specific multimedia collections has reduced the complexity of audio searching, but it has not had the same effect for image and video searching.
Abstract: Multimedia Web searching is a significant information activity for many people. Major Web search engines are critical resources in people's efforts to locate relevant online multimedia information. It is therefore important that we understand how searchers are utilizing these Web information systems in their quest to retrieve multimedia information to design effective Web systems in support of these information needs. In this paper, we report the results of a research study evaluating the effect of separate multimedia Web collections on individual searching behavior. The AltaVista search engine has an extensive multimedia collection and uses tabs to search specific collections. The motivating questions for this research are: (1) What are the characteristics of multimedia searching on AltaVista? and (2) What are the effects on Web searching of separate multimedia collections? The results of our research show that multimedia searching is complex relative to general Web searching. Searching specific multimedia collections has reduced the complexity of audio searching, but it has not had the same effect for image and video searching. Query length and Boolean usage rates are much higher for image searching, compared to general Web searching. We discuss the implications of the research findings for the design, development and evaluation of Web multimedia retrieval systems.

35 citations


Journal Article
TL;DR: In this paper, the authors proposed eight design recommendations for online newspapers based on features that mediate a specific purpose and use between publisher and audience, which they describe as genre rules in terms of purpose, form, and positioning.
Abstract: Taking a genre perspective on design, this article proposes eight design recommendations for online newspapers. These recommendations are based on features that mediate a specific purpose and use between publisher and audience, which we describe as genre rules in terms of purpose, form, and positioning. They are also based on genre change regarding design, and the heritage from print regarding form and shared content elements. We have a) studied genre change through a web page analysis of nine Swedish online newspapers in 2001 and 2003, using the genre concepts content, form, functionality and positioning, and b) derived genre rules through publishers' and the audience's understanding of the genre. We have interviewed managers, designers and editors-in-chief at the nine newspapers as well as 153 members of their audience. We show that in the design process for digital documents, it is useful to have genre awareness, i.e. to be aware of the genre characteristics, the producer's design purpose, and the audience's recognition and response.

29 citations


Journal Article
TL;DR: An approach for integrating the use of conceptual models in the lower part of the application life cycle is illustrated, based on the adoption of conceptual logs, which are Web usage logs enriched with meta-data deriving from the application conceptual specifications.
Abstract: So far, conceptual modeling of Web applications has been used primarily in the upper part of the life cycle, as a driver for system analysis. Little attention has been put on exploiting the conceptual specifications developed during analysis for application evaluation, maintenance and evolution. This paper illustrates an approach for integrating the use of conceptual models in the lower part of the application life cycle. The approach is based on the adoption of conceptual logs, which are Web usage logs enriched with meta-data deriving from the application conceptual specifications. In particular, the paper illustrates how conceptual logs are generated and exploited in Web usage evaluation and mining, so as to achieve a deeper and systematic quality evaluation of Web applications. A prototype tool supporting the generation of conceptual logs and the evaluation activities is also presented.
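At its core, building a conceptual log means joining each raw usage-log entry with metadata drawn from the application's conceptual model. A minimal sketch, where the metadata schema (a `unit` field per page) is our own simplification and not the format of the paper's tool:

```python
def enrich_log(raw_entries, page_concepts):
    """Join raw usage-log entries (dicts with at least a 'url' key) with
    conceptual-model metadata keyed by URL, yielding 'conceptual log'
    entries. The metadata schema here ('unit' per page) is a hypothetical
    simplification of what a conceptual specification would provide."""
    enriched = []
    for entry in raw_entries:
        meta = page_concepts.get(entry["url"], {"unit": "unknown"})
        enriched.append({**entry, **meta})
    return enriched
```

Once requests carry conceptual units rather than bare URLs, usage mining can aggregate over model-level concepts (e.g., all product-detail pages) instead of individual pages.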

25 citations


Journal Article
TL;DR: In this paper, a diffuser with a broad-ring brim at the exit periphery and a conventional wind turbine inside it is used to augment a given turbine diameter and wind speed by a factor of about 2-3 compared with a bare wind turbine.
Abstract: We have developed a new wind turbine system that consists of a diffuser with a broad-ring brim at the exit periphery and a conventional wind turbine inside it. The new wind turbine has demonstrated power augmentation for a given turbine diameter and wind speed by a factor of about 2–3 compared with a bare wind turbine. This is because a very low-pressure region due to strong vortex formation behind the broad brim draws more mass flow to the turbine inside the diffuser.

9 citations


Journal Article
TL;DR: This work proposes a new Web search system, where the effectiveness is continuously evaluated by explicit user feedback in terms of a personalized ranking matrix, and a framework for Congenial Web Search is defined.
Abstract: In the context of large homogeneous retrieval systems, metrics have been established to evaluate effectiveness with precision and recall. By contrast, measuring Web search effectiveness is a new challenge due to the heterogeneity of highly dynamic Web content. Currently, users select a Web search engine by their individual preferences, and the evaluation of effectiveness is a subjective measure defined by the user. Since there are different emphases for each single user, those user-defined measures cannot be quantified in a global way. Therefore, we propose a new Web search system, where the effectiveness is continuously evaluated by explicit user feedback in terms of a personalized ranking matrix. These local rankings can be evaluated according to different goals. First, accumulation leads to a wider base of ranked and validated results. Second, the aggregated ranking lists can be used to identify topics, as well as communities of interest. Finally, together with social aspects for community support, a framework for Congenial Web Search is defined.
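Precision and recall, the classical effectiveness metrics the abstract contrasts Web search against, can be stated in a few lines:

```python
def precision_recall(retrieved, relevant):
    """Classic IR effectiveness measures: precision is the fraction of
    retrieved documents that are relevant; recall is the fraction of
    relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

The difficulty the paper points to is that on the open Web the relevant set is unknown and user-dependent, which is why its effectiveness measure comes from explicit user feedback instead.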

7 citations


Journal Article
TL;DR: e-Prototyping encompasses the management of an agile software development process and the systematic evaluation of manifold feedback contributions and includes frequent releases of software versions as well as integrated mechanisms for gathering feedback from users and other relevant actors via the live system.
Abstract: Projects developing Web applications face problems when it comes to identifying the Web users' requirements. There are a number of reasons for this. It is unclear how to gather initial requirements from potential users if there is no design artifact to communicate about. Developers have difficulty identifying the needs of the Web application users during the ongoing development process because of a lack of proper communication concepts. Development teams for Web-based systems include professionals from different disciplines with diverse cultures. Members of the development team often belong to many different organizations with varying stakes in the project. This article presents a modified prototyping approach called e-Prototyping. This approach includes frequent releases of software versions (based on short development cycles) as well as integrated mechanisms for gathering feedback from users and other relevant actors via the live system. It underlines the need to offer various communication channels to the users and to systematically order the different streams of feedback to enable the developers to identify the user requirements. e-Prototyping encompasses the management of an agile software development process and the systematic evaluation of manifold feedback contributions.

6 citations


Journal Article
TL;DR: A newly identified web data-quality dimension, appropriateness, which is based on the linguistic and visual complexity of a web page, is studied and computed using new metrics that classify web pages into three main appropriateness genres: scholarly, news/general interest and popular.
Abstract: Currently, search engines rank search results using mainly link-based metrics. While usually most of the search results are relevant to a user's query, due to how the results are ranked, users often are still not totally satisfied with them. Using a proposed framework of web data quality, it is found that current search engines usually only consider a very small number of the dimensions of web data quality in their ranking algorithms. In this paper, a newly identified web data-quality dimension, appropriateness, which is based on the linguistic and visual complexity of a web page, is studied. It is computed using new metrics that classify web pages into three main appropriateness genres: scholarly, news/general interest and popular. Experiments have shown the effectiveness of the metrics in ranking web pages by whether they are appropriate to a user's task and information needs.
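The abstract names linguistic complexity as one ingredient of the appropriateness metrics. A hedged sketch of a text-only complexity score follows; the genre thresholds are purely hypothetical, and the paper's actual metrics also incorporate visual page complexity:

```python
import re

def linguistic_complexity(text: str) -> float:
    """A crude linguistic-complexity score: average sentence length
    (in words) plus average word length (in characters). Covers only
    the textual side; the paper also measures visual complexity."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    avg_sentence = len(words) / len(sentences)
    avg_word = sum(len(w) for w in words) / len(words)
    return avg_sentence + avg_word

def appropriateness_genre(text: str) -> str:
    """Map the score onto the paper's three genres using purely
    hypothetical thresholds (the paper's cut-offs are not given)."""
    score = linguistic_complexity(text)
    if score >= 30:
        return "scholarly"
    if score >= 18:
        return "news/general interest"
    return "popular"
```

Long sentences with long words push a page toward "scholarly"; short, simple text lands in "popular", mirroring the intended ordering of the three genres.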

Journal Article
Daniel E. Rose1
TL;DR: Web search has several important characteristics that distinguish it from traditional information retrieval: the often adversarial relationship between content creators and search engine designers, the nature of the corpus, and the multiplicity of user goals.
Abstract: Web search has several important characteristics that distinguish it from traditional information retrieval: the often adversarial relationship between content creators and search engine designers, the nature of the corpus, and the multiplicity of user goals. In addition to making the search task itself difficult, these characteristics make it particularly hard to evaluate search effectiveness. In this paper, we examine these characteristics and then consider the problems with several different standard evaluation techniques.

Journal Article
TL;DR: This paper explores how information visualization can provide insights into the effectiveness of different query formulations or the same query submitted to multiple search engines.
Abstract: This paper explores how information visualization can provide insights into the effectiveness of different query formulations or the same query submitted to multiple search engines. Different queries or search methods are deemed more effective if the fusion of their results leads to a new result set that contains an increased number of relevant documents. The MetaCrystal toolset can be used to visualize the degree of overlap and similarity between the results returned by different queries or engines. The work presented is guided by two working hypotheses: 1) documents found by multiple methods are more likely to be relevant; 2) high degrees of overlap and/or systematic relationships between the ranked lists being compared will not lead to fusion results that contain more relevant documents. MetaCrystal visually identifies documents found by multiple queries or engines. The number and distribution patterns of documents found by multiple methods can be used as a visual measure of the fusion effectiveness. Examples, using Internet and TREC data, are presented that support both in a qualitative and quantitative way the working hypotheses.
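The overlap statistics that MetaCrystal visualizes can also be computed directly. A small sketch counting how many distinct documents were found by exactly k of the compared methods (the visualization itself, of course, does far more):

```python
from collections import Counter

def overlap_profile(result_lists):
    """For a set of result lists (one per query formulation or engine),
    count how many distinct documents appear in exactly k lists.
    Documents found by several methods are, per the paper's first
    working hypothesis, more likely to be relevant."""
    per_doc = Counter()
    for results in result_lists:
        for doc in set(results):
            per_doc[doc] += 1
    return dict(Counter(per_doc.values()))
```

A profile dominated by high k values signals heavy overlap between the methods, which, per the paper's second hypothesis, predicts little gain from fusing their results.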

Journal Article
Sherry Koshman1
TL;DR: The results showed that participants rated their topic knowledge level quite low for most tasks, there was a high degree of participant-system item selection overlap, and a statistically significant relationship was found between knowledge level and node use for half of the tasks.
Abstract: TouchGraph is a Web-based ranked similarity list browser that visualizes the relationship between the query and resulting item set as a graph. TouchGraph provides visual analogs to Amazon's recommendation feature based on item similarity and Google's "similar to" pages. TouchGraph may be able to assist diverse Web users, who have varying levels of knowledge on search topics, to visually select similar items to their query. To examine this assumption further, this investigation asks: what are the effects of topic knowledge level on the similarity judgments generated by the users in comparison to the visualized system depictions? Seventeen participants were asked to use TouchGraph for similarity matching of search output to the query and their results were compared to the items shown as most similar to the query by the visualization. The results showed that participants rated their topic knowledge level quite low for most tasks, there was a high degree of participant-system item selection overlap, and a statistically significant relationship was found between knowledge level and node use for half of the tasks. The subjective satisfaction data were positive for the TouchGraph interface. The findings suggest that the TouchGraph visualization has the potential to enhance Web search effectiveness. This study aids in understanding better system design issues in regard to visualization-based tools for Web information retrieval.

Journal Article
TL;DR: This study is the first one to use content-based tools to determine the image contents of a given web segment, and it should be of use to anyone concerned with the image content of the web in Chile.
Abstract: We propose a methodology to characterize the image contents of a web segment, and we present an analysis of the contents of a segment of the Chilean web (.CL domain). Our framework uses an efficient web-crawling architecture, standard content-based analysis tools (to extract low-level features such as color, shape and texture), and novel skin and face detection algorithms. In an automated process we start by examining all websites within a domain (e.g., .cl websites), obtaining links to images, and downloading a large number of the images (in all of our experiments approx. 383,000 images that correspond to about 35 billion pixels). Once the images are downloaded to a local server, our process automatically extracts several low-level visual features (color, texture, shape, etc.). Using novel algorithms we perform skin and face detection. The results of visual feature extraction, skin, and face detection are then used to characterize the contents of a web segment. We tested our methodology on a segment of the Chilean web (.cl), by automatically downloading and processing 183,000 images in 2003 and 200,000 images in 2004. We present some statistics derived from both sets of images, which should be of use to anyone concerned with the image content of the web in Chile. Our study is the first one to use content-based tools to determine the image contents of a given web segment.
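The paper's own skin- and face-detection algorithms are not given in the abstract; as a baseline illustration of per-pixel skin classification, here is a widely cited explicit RGB rule (attributed to Peer et al.), plus the kind of per-image aggregate such a pipeline could compute:

```python
def is_skin_rgb(r: int, g: int, b: int) -> bool:
    """Classic explicit RGB skin-colour rule (Peer et al.); a simple
    baseline, not the paper's novel detector."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

def skin_fraction(pixels) -> float:
    """Fraction of (r, g, b) pixels classified as skin: one low-level
    feature a content-based web-segment characterization can aggregate
    per image."""
    if not pixels:
        return 0.0
    return sum(is_skin_rgb(*p) for p in pixels) / len(pixels)
```

Explicit colour rules like this are cheap enough to run over hundreds of thousands of downloaded images, which is the scale the study operates at.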

Journal Article
TL;DR: It is shown that the majority of acronyms of higher educational institutions are not effective in identifying their own institution, and possible remedies are suggested.
Abstract: Many people in Hungary use the Web to obtain information from public institutions and organizations. Because these users typically do not know the URL of the desired institution's home page, they use a Web search engine, with the acronym of the institution as the query, to get there. Users prefer using acronyms because they usually do not know the full names of the institutions exactly. Acronyms are easy to remember and are extensively used in the media and by people in everyday life. In this paper, results from an analysis of the usefulness of the acronyms of Hungarian higher educational institutions present on the Web, i.e., the ability of acronyms to identify their own institutions, are reported. The usefulness of acronyms of general institutions is used as a comparison. The working hypothesis is that higher educational acronyms are more effective than general acronyms. The study refutes this assumption and shows that the majority of acronyms of higher educational institutions are not effective in identifying their own institution. Causes are presented, and possible remedies are suggested.