
Showing papers in "Journal of Documentation in 1998"


Journal ArticleDOI
TL;DR: This case study reports the investigations into the feasibility and reliability of calculating impact factors for web sites, called Web Impact Factors (Web-IF), and demonstrates that Web-IFs are calculable with high confidence for national and sector domains whilst institutional Web-IFs should be approached with caution.
Abstract: This case study reports the investigations into the feasibility and reliability of calculating impact factors for web sites, called Web Impact Factors (Web-IF). The study analyses seven small and medium-scale national web domains, four large web domains, and six institutional web sites over a series of snapshots of the web taken during a month. The data isolation and calculation methods are described and the tests discussed. The results thus far demonstrate that Web-IFs are calculable with high confidence for national and sector domains, whilst institutional Web-IFs should be approached with caution. The data isolation method uses sets of inverted but logically identical Boolean set operations, and their mean values, to generate the impact factors associated with internal- (self-) link web pages and external-link web pages. Their logical sum is assumed to constitute the workable frequency of web pages linking to the web location in question. The logical operations are necessary to overcome the variations in retrieval outcome produced by the AltaVista search engine.
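The calculation described can be sketched roughly as follows. All figures are invented, and the function name and the specific averaging scheme are assumptions for illustration; the study's actual queries were run against AltaVista's link operators.

```python
# Hypothetical sketch of a Web Impact Factor calculation: the mean of
# several logically equivalent link-count queries, divided by the mean
# of the site-size queries. All counts below are invented.

def web_if(inlink_counts, page_counts):
    """Ratio of mean in-link count to mean site-page count."""
    mean_links = sum(inlink_counts) / len(inlink_counts)
    mean_pages = sum(page_counts) / len(page_counts)
    return mean_links / mean_pages

# Two inverted-but-equivalent Boolean formulations of the same count,
# averaged to smooth out search-engine variability:
links = [5400, 5650]    # pages linking to the domain
pages = [12000, 11800]  # pages within the domain
print(round(web_if(links, pages), 3))
```

Averaging the paired queries is the sketch's stand-in for the paper's "mean values of inverted but logically identical Boolean set operations".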

498 citations


Journal ArticleDOI
TL;DR: A synoptic view of the growth of the text processing technology of information extraction whose function is to extract information about a pre‐specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates is given.
Abstract: In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE) whose function is to extract information about a pre‐specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates. Here we describe the nature of the IE task, review the history of the area from its origins in AI work in the 1960s and 70s till the present, discuss the techniques being used to carry out the task, describe application areas where IE systems are or are about to be at work, and conclude with a discussion of the challenges facing the area. What emerges is a picture of an exciting new text processing technology with a host of new applications, both on its own and in conjunction with other technologies, such as information retrieval, machine translation and data mining.

199 citations


Journal ArticleDOI
TL;DR: This paper analyses the theoretical and the epistemological assumptions of information science and shows limitations in the dominant approaches and proposes alternative viewpoints.
Abstract: This paper analyses the theoretical and epistemological assumptions of information science (IS). Different views of knowledge underlie all major issues in IS. Epistemological theories have a fundamental impact on theories about users, their cognition and information-seeking behaviour, on subject analysis, and on classification. They also have a fundamental impact on information retrieval, on the understanding of “information”, on the view of documents and their role in communication, on information selection, on theories about the functions of information systems, and on the role of information professionals. IS must be based on epistemological knowledge which avoids blind alleys and is not outdated. The paper shows limitations in the dominant approaches to IS and proposes alternative viewpoints.

194 citations


Journal ArticleDOI
TL;DR: This paper describes emerging metadata practice and standards, and outlines an approximate typology of approaches and explores different strands of metadata activity, before focusing on metadata for information resources.
Abstract: This paper describes emerging metadata practice and standards. It gives an overview of the environments in which metadata is used, before focusing on metadata for information resources. It outlines an approximate typology of approaches and explores different strands of metadata activity. It discusses trends in format development, metadata management, and use of search and retrieve protocols. It concludes by discussing some features of future deployment of metadata in support of network resource discovery.

109 citations


Journal ArticleDOI
TL;DR: In this article, ageing patterns are examined in ‘formal’ use or impact of all scientific journals processed for the Science Citation Index (SCI) during 1981‐1995 and a new classification system of journals in terms of their ageing characteristics is introduced.
Abstract: During the past decades, journal impact data obtained from the Journal Citation Reports (JCR) have gained relevance in library management, research management and research evaluation. Hence, both information scientists and bibliometricians share the responsibility towards the users of the JCR to analyse the reliability and validity of its measures thoroughly, to indicate pitfalls and to suggest possible improvements. In this article, ageing patterns are examined in ‘formal’ use or impact of all scientific journals processed for the Science Citation Index (SCI) during 1981-1995. A new classification system of journals in terms of their ageing characteristics is introduced. This system has been applied to as many as 3,098 journals covered by the Science Citation Index. Following an earlier suggestion by Glänzel and Schoepflin, a maturing and a decline phase are distinguished. From an analysis across all subfields it has been concluded that ageing characteristics are primarily specific to the individual journal rather than to the subfield, while the distribution of journals in terms of slowly or rapidly maturing or declining types is specific to the subfield. It is shown that the cited half-life (CHL), printed in the JCR, is an inappropriate measure of decline of journal impact. Following earlier work by Line and others, a more adequate parameter of decline is calculated, taking into account the size of annual volumes during a range of fifteen years. For 76 per cent of SCI journals the relative difference between this new parameter and the ISI CHL exceeds 5 per cent. The current JCR journal impact factor is proven to be biased towards journals revealing a rapid maturing and decline in impact. Therefore, a longer term impact factor is proposed, as well as a normalised impact statistic, taking into account citation characteristics of the research subfield covered by a journal and the type of documents published in it.
When these new measures are combined with the proposed ageing classification system, they provide a significantly better picture of a journal's impact than that obtained from the JCR.
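The JCR-style cited half-life that the paper criticises is simple to state: the age by which half of the current year's citations to a journal have accumulated. A minimal sketch, with invented citation counts and linear interpolation within the crossing year (the paper's volume-size-corrected alternative is not reproduced here):

```python
# Classic cited half-life: cites_by_age[k] holds the citations received
# this year by volumes published k years earlier (figures invented).

def cited_half_life(cites_by_age):
    """Interpolated median age of this year's citations."""
    total = sum(cites_by_age)
    half = total / 2.0
    running = 0.0
    for age, c in enumerate(cites_by_age):
        if c and running + c >= half:
            # linear interpolation within the crossing year
            return age + (half - running) / c
        running += c
    return float(len(cites_by_age))

print(round(cited_half_life([10, 30, 25, 15, 10, 5, 5]), 2))
```

Because the statistic ignores how large each annual volume was, a journal that simply published more articles in recent years looks "younger", which is the bias the article's alternative parameter corrects for.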

108 citations


Journal ArticleDOI
TL;DR: Applications that can be implemented efficiently and effectively using sets of n‐grams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary look‐up, text compression, and language identification.
Abstract: This paper provides an introduction to the use of n‐grams in textual information systems, where an n‐gram is a string of n, usually adjacent, characters extracted from a section of continuous text. Applications that can be implemented efficiently and effectively using sets of n‐grams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary look‐up, text compression, and language identification.
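The definition above translates directly into code. A minimal sketch, assuming word-boundary padding and a Dice overlap measure as one common way n-gram sets are compared in spelling correction and language identification (the padding character and the similarity measure are illustrative choices, not prescribed by the paper):

```python
def ngrams(text, n=3, pad="_"):
    """Set of character n-grams in text, with each word padded so that
    word-initial and word-final grams are represented."""
    grams = set()
    for word in text.lower().split():
        padded = pad + word + pad
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams

def dice(a, b, n=3):
    """Dice similarity of two strings' n-gram sets."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return 2 * len(ga & gb) / (len(ga) + len(gb))

print(sorted(ngrams("text", 3)))
print(round(dice("receive", "recieve"), 2))
```

A misspelling such as "recieve" still shares several trigrams with "receive", which is what makes n-gram matching robust to the errors exact keyword matching misses.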

82 citations


Journal ArticleDOI
TL;DR: Informetric analysis is used to go beyond the simplistic use of the JIF and to gain a deeper understanding of the “real” impact of international scientific journals and their market.
Abstract: By developing a methodology for on-line citation analysis, the international characteristics of scientific journals have been analysed on the basis of correlations between the geographical distribution patterns of authors, citations and subscriptions. The study covered seven selected LIS journals. Assuming that the numbers of authors and citations in each geographical region follow the Poisson distribution, the hypothesis was tested that the intensities are proportional to the subscriptions. In most cases the correlation between authors and citations was sufficiently positive that the international visibility and impact of the scientific journals can be defined by these two variables. As regards the distribution pattern of subscribers, authors and citations, however, the test showed very weak or no correlations. The analysis of the statistical significance of differences gave some useful data, whose importance to marketing and publishing strategies is obvious. The paper suggests also examining the knowledge export of journals as an additional criterion for the evaluation of their impact and of the quality of research published in them. The comparison of Journal Impact Factors (JIF) is another contribution of this study, aimed at enhancing the use of impact factor analysis with various time intervals. We demonstrate new and flexible ways of using the JIF for diachronous and synchronous analyses. The study brings new dimensions to the discussions of the impact, status and image of scientific journals. It focuses on the utilisation of informetric analysis to go beyond the simplistic use of the JIF and to gain a deeper understanding of the “real” impact of international scientific journals and their market.
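The kind of regional comparison reported can be sketched with a plain Pearson correlation between per-region author and citation counts. The regional figures below are invented, and this is only the correlation step, not the study's Poisson significance test:

```python
# Pearson correlation between two regional count vectors (figures
# invented): authors and citations per geographical region for one
# hypothetical journal.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

authors   = [42, 18, 9, 6, 3]   # e.g. per region, most to least active
citations = [55, 21, 8, 7, 2]
print(round(pearson(authors, citations), 3))
```

A strongly positive value here corresponds to the study's finding that author and citation distributions move together, whereas subscriptions did not correlate with either.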

79 citations


Journal ArticleDOI
TL;DR: The electronic information transfer of the future will be, in essence, a transfer of well‐defined, cognitive information modules, and the first steps towards a new heuristic model for such scientific information transfer are outlined.
Abstract: The development of electronic publishing heralds a new period in scientific communications. Besides the obvious advantages of an almost endless storage and transport capacity, many new features come to the fore. As each technology finds its own expressions in the ways scientific communications take form, we analyse print on paper scientific articles in order to obtain the necessary ingredients for shaping a new model for electronic communications. A short historical overview shows that the typical form of the present‐day linear (essay‐type) scientific article is the result of a technological development over the centuries. The various characteristics of print on paper are discussed and the foreseeable changes to a more modular form of communication in an electronic environment are postulated. Subsequently we take the functions of the present‐day scientific article vis‐a‐vis the author and the reader as starting points. We then focus on the process of scientific information transfer and deal essentially with the information consumption by the reader. Different types of information, at present intermingled in the linear article, can be separated and stored in well‐defined, cognitive, textual modules. To serve the scientists better in finding their way through the information overload of today, we conclude that the electronic information transfer of the future will be, in essence, a transfer of well‐defined, cognitive information modules. In the last part of this article we outline the first steps towards a new heuristic model for such scientific information transfer.

65 citations


Journal ArticleDOI
TL;DR: In order to compare the value of subject descriptors and title keywords as entries to subject searches, two studies were carried out on monographs in the humanities and social sciences, held by the online public access catalogue of the National Library of the Netherlands.
Abstract: In order to compare the value of subject descriptors and title keywords as entries to subject searches, two studies were carried out. Both studies concentrated on monographs in the humanities and social sciences, held by the online public access catalogue of the National Library of the Netherlands. In the first study, a comparison was made by subject librarians between the subject descriptors and the title keywords of 475 records. They could express their opinion on a scale from 1 (descriptor is exactly or almost the same as word in title) to 7 (descriptor does not appear in title at all). It was concluded that 37 per cent of the records are considerably enhanced by a subject descriptor, and 49 per cent slightly or considerably enhanced. In the second study, subject librarians performed subject searches using title keywords and subject descriptors on the same topic. The relative recall amounted to 48 per cent and 86 per cent respectively. Failure analysis revealed the reasons why so many records that were found by subject descriptors were not found by title keywords. First, although completely meaningless titles hardly ever appear, the title of a publication does not always offer sufficient clues for title keyword searching. In those cases, descriptors may enhance the record of a publication. A second and even more important task of subject descriptors is controlling the vocabulary. Many relevant titles cannot be retrieved by title keyword searching because of the wide diversity of ways of expressing a topic. Descriptors take away the burden of vocabulary control from the user.
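Relative recall as used in the second study can be sketched against a pooled set of relevant records. The record identifiers and counts below are invented; only the definition (relevant retrieved over the union found by either method) follows the paper:

```python
# Relative recall: each method's relevant retrieved set is measured
# against the pool of relevant records found by either method.
# Record identifiers are invented.

by_keyword    = {1, 2, 5, 9, 11}          # relevant hits via title keywords
by_descriptor = {1, 2, 3, 5, 7, 8, 9}     # relevant hits via descriptors

pool = by_keyword | by_descriptor
rel_recall_kw   = len(by_keyword) / len(pool)
rel_recall_desc = len(by_descriptor) / len(pool)
print(round(rel_recall_kw, 2), round(rel_recall_desc, 2))
```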

64 citations


Journal ArticleDOI
TL;DR: A theoretical model of structured document indexing and retrieval based on the Dempster‐Shafer Theory of Evidence is presented and details of the combination process, how components are combined, and how relevance is captured within the model are presented.
Abstract: In this paper we report on a theoretical model of structured document indexing and retrieval based on the Dempster‐Shafer Theory of Evidence. This includes a description of our model of structured document retrieval, the representation of structured documents, the representation of individual components, how components are combined, details of the combination process, and how relevance is captured within the model. We also present a detailed account of an implementation of the model, and an evaluation scheme designed to test the effectiveness of our model. Finally we report on the details and results of a series of experiments performed to investigate the characteristics of the model.
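Dempster's rule of combination, the evidence-combining step at the heart of the model, can be sketched as follows. The document components and mass values are invented; how the paper actually assigns masses to components is not reproduced here:

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule for two mass functions whose focal elements are
    frozensets; conflicting mass is renormalised away."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Illustrative only: evidence about relevance from two components
# (say, title and body) of a structured document.
R, N = frozenset({"rel"}), frozenset({"nonrel"})
theta = R | N                      # the frame of discernment
title = {R: 0.6, theta: 0.4}       # title supports relevance
body  = {R: 0.3, N: 0.4, theta: 0.3}
print(combine(title, body))
```

Mass left on the whole frame `theta` expresses ignorance, which is what lets Dempster-Shafer distinguish "no evidence" from "evidence against" when document components are combined.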

50 citations


Journal ArticleDOI
TL;DR: This article characterises the questioning behaviour in reference interviews preceding delegated online searches of bibliographic databases and relates it to questioning behaviour in other types of interviews/settings, using A.C. Graesser's typology of questions to analyse the type of question and M.D. White's typology of information categories to determine the question's content objective.
Abstract: This article characterises the questioning behaviour in reference interviews preceding delegated online searches of bibliographic databases and relates it to questioning behaviour in other types of interviews/settings. With one exception, the unit of analysis is the question (N=610), not the interview. The author uses A.C. Graesser's typology of questions to analyse the type of question and M.D. White's typology of information categories to determine the question's content objective; this is the first application of Graesser's typology to interview questions in any setting. Graesser's categories allow for a more subtle understanding of the kind of information need underlying a question. Comparisons are made between questions asked by the information specialist and those asked by the client. Findings show that the information specialist dominates the interview; about half the questions were verification questions and about 22 per cent were judgemental questions or requests. All but four types of questions from Graesser's categories appeared in the interviews, but no new question types were discovered. Clients often phrase questions as requests. In content, both clients and information specialists focus on the subject and service requested, but the clients also ask about search strategy and output features. Both parties ask predominantly short-answer questions. Results are related to interface design for retrieval systems.

Journal ArticleDOI
TL;DR: Among empirical studies of network use, analyses of job-related use are the most advanced both theoretically and methodologically, while studies focused on non-work contexts of use are less established in this sense.
Abstract: The author reviews the major approaches and central findings of empirical studies of network use. Six major research approaches were identified by cross-tabulating two criteria: the major context of network use (job-related vs non-work) and the social level of variables (individual vs group level). Examples of all types of studies are presented. The majority of studies can be classified among the surveys focusing on frequencies of service use. Of these studies, analyses of job-related use are the most advanced both theoretically and methodologically, while studies focused on non-work contexts of use are less established in this sense. Qualitative research settings seem to be gaining popularity, making the use studies more balanced methodologically. The strengths and weaknesses of the research approaches are assessed and conclusions are drawn concerning the development of more context-sensitive analyses of network uses.

Journal ArticleDOI
TL;DR: The findings of two research projects, the Value and EVINCE projects, are compared with studies of the consolidation and application of clinical knowledge in clinical decision making to confirm the importance of personal clinical knowledge.
Abstract: The progress of initiatives concerned with implementing evaluated clinical research (such as evidence based medicine and clinical effectiveness) is dependent on the way individual health professionals actually acquire, use and value clinical knowledge in routine practice. The findings of two research projects, the Value and EVINCE projects, are compared with studies of the consolidation and application of clinical knowledge in clinical decision making. The Value project was concerned with the ways in which information from NHS libraries might be used in present and future clinical decision making. EVINCE was a similar impact study for nursing professionals. Both studies confirmed the importance of personal clinical knowledge. Health information services need to use a variety of strategies and knowledge management skills to ensure that the evaluated research evidence is assimilated and implemented into practice.

Journal ArticleDOI
TL;DR: The application of indexing functions to document collections of three specific types: (1) ‘conventional’ text databases; (2) hypertext databases; and (3) the World Wide Web, globally distributed across the Internet are discussed.
Abstract: For the purposes of this article, the indexing of information is interpreted as the pre‐processing of information in order to enable its retrieval. This definition thus spans a dimension extending from classification‐based approaches (pre‐co‐ordinate) to keyword searching (post‐co‐ordinate). In the first section we clarify our use of terminology, by briefly describing a framework for modelling IR systems in terms of sets of objects, relationships and functions. In the following three sections, we discuss the application of indexing functions to document collections of three specific types: (1) ‘conventional’ text databases; (2) hypertext databases; and (3) the World Wide Web, globally distributed across the Internet.

Journal ArticleDOI
TL;DR: This paper outlines the forces that are currently affecting academic libraries in the UK and proposes a strategy whereby the transformation from the handling of artefacts to the handling of electronic sources may be effected with maximum benefit to the information user.
Abstract: Business process re‐engineering (or redesign) has achieved mixed results in business and industry but it offers an approach to thinking about the future of academic libraries in the digital age that is worth considering. This paper outlines the forces that are currently affecting academic libraries in the UK and proposes a strategy whereby the transformation from the handling of artefacts to the handling of electronic sources may be effected with maximum benefit to the information user.

Journal ArticleDOI
TL;DR: The Simple Index Method overcomes the difficulties in building a core list for serials in interdisciplinary fields by using multiple indexes which cover various aspects of the subject.
Abstract: This paper describes a simple method for developing a list of core serials in a particular subject field by analysing article citations in electronic indexes. The Simple Index Method overcomes the difficulties in building a core list for serials in interdisciplinary fields by using multiple indexes which cover various aspects of the subject. This method permits the collection development librarian to develop a core list when standard bibliographies or specific indexing and abstracting tools are lacking and to tailor that list to the needs of the local situation.
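The tallying at the core of the Simple Index Method can be sketched as follows. The journal names, citation records, and inclusion threshold are all invented for illustration; the method as described simply merges citation counts drawn from several indexes covering different facets of the subject:

```python
from collections import Counter

# Journals cited by article records drawn from two hypothetical
# electronic indexes, each covering one facet of an interdisciplinary
# subject (records invented).
index_a = ["J Doc", "JASIS", "J Doc", "Scientometrics"]
index_b = ["JASIS", "Ann Rev Info Sci", "J Doc"]

tally = Counter(index_a) + Counter(index_b)

# A core list keeps the serials cited at least a threshold number of
# times across all indexes; the threshold would be tuned locally.
core_list = [j for j, n in tally.most_common() if n >= 2]
print(core_list)
```

Because each index contributes its own counts, a serial central to only one facet of the field can still reach the threshold, which is how the method handles interdisciplinarity.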

Journal ArticleDOI
TL;DR: Concepts or procedures questioned include: (1) ‘core journal’, from the Bradfordian viewpoint; (2) the use of traditional statistical inferential procedures applied to Bradford data; and (3) R(n) as a maximum (rather than median or mean) value at tied-rank values.
Abstract: Bradford distributions describe the relationship between ‘journal productivities’ and ‘journal rankings by productivity’. However, different ranking conventions exist, implying some ambiguity as to what the Bradford distribution ‘is’. A need accordingly arises for a standard ranking convention to assist comparisons between empirical data, and also comparisons between empirical data and theoretical models. Five ranking conventions are described, including the one used originally by Bradford, along with suggested distinctions between ‘Bradford data set’, ‘Bradford distribution’, ‘Bradford graph’, ‘Bradford log graph’, ‘Bradford model’ and ‘Bradford’s Law’. Constructions such as the Lotka distribution, Groos droop (generalised to accommodate growth as well as fall-off in the Bradford log graph), Brookes hooks, and the slope and intercept of the Bradford log graph are clarified on this basis. Concepts or procedures questioned include: (1) ‘core journal’, from the Bradfordian viewpoint; (2) the use of traditional statistical inferential procedures applied to Bradford data; and (3) R(n) as a maximum (rather than median or mean) value at tied-rank values.
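A Bradford data set in the sense used here is just journals ranked by productivity together with the cumulative article count R(n). A minimal sketch with invented productivities (note the tied productivities, which are exactly where the ranking conventions the paper compares diverge):

```python
# Invented journal productivities: journal -> number of relevant
# articles it supplied on the subject.
productivity = {"J1": 24, "J2": 11, "J3": 11, "J4": 5,
                "J5": 3, "J6": 3, "J7": 3, "J8": 1}

# Rank by productivity, most productive first.
counts = sorted(productivity.values(), reverse=True)

# R(n): cumulative articles supplied by the n most productive journals.
cumulative = []
total = 0
for c in counts:
    total += c
    cumulative.append(total)
print(cumulative)

# At the tied ranks (J2/J3, and J5-J7) the value of R(n) is ambiguous:
# conventions differ over taking the maximum, mean or median of the
# tied cumulative values, which is point (3) the paper questions.
```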

Journal ArticleDOI
TL;DR: The project – Electronic Submission and Peer Review (ESPERE) – is examining the cultural and technical problems of implementing an electronic peer review process for biomedical academics and learned society publishers.
Abstract: The Internet provides researchers with exciting new opportunities for finding information and communicating with each other. However the process of peer review is something of a Cinderella in all this. Peer review in biomedical disciplines is still largely carried out using hard copy and the postal system even if the authors’ text files are used for the production of the paper or electronic journal. This article introduces one of the Electronic Libraries (eLib) projects, funded by the Joint Information Systems Committee (JISC). The project – Electronic Submission and Peer Review (ESPERE) – is examining the cultural and technical problems of implementing an electronic peer review process for biomedical academics and learned society publishers. The paper describes preliminary work in discovering the issues involved and describes interviews with seven learned society publishers, analysis of a questionnaire sent to 200 editorial board members and a focus group of five biomedical academics. Academics and learned publishers were enthusiastic about electronic peer review and the possibilities which it offers for a less costly, more streamlined and more effective process. Use of the Internet makes collaborative and interactive refereeing a practical option and allows academics from countries all over the world to take part.

Journal ArticleDOI
TL;DR: The background to the RSSIC, the general nature of the conference, and some of its themes, achievements and limitations are briefly discussed.
Abstract: The background to the RSSIC and the general nature of the conference are described. Some of its themes, achievements and limitations are briefly discussed.

Journal ArticleDOI
TL;DR: The paper recommends that training institutions in Africa strengthen the research and writing skills component of their curricula, and that information organisations in the region regularly schedule research and writing skills workshops.
Abstract: A comparative analysis was carried out of the characteristics of manuscripts rejected and manuscripts accepted for publication in the African Journal of Library, Archives and Information Science over a five-year period. The study reveals that 145 manuscripts were rejected as opposed to eighty papers accepted for publication. The findings reveal that there were no remarkable differences with regard to status and affiliations between the authors of rejected and accepted papers. While information technology, archives, user studies, academic libraries and bibliometrics constituted the topics of papers mostly rejected, papers accepted were mainly in the areas of archives, information service, information technology and rural information. Most of the papers were rejected because they contributed nothing new to knowledge (65.5 per cent), used unreliable data (13.1 per cent) or lacked focus (13.1 per cent). Datedness of references was not used in rejecting papers because the editorial board policy is to update references of accepted papers where necessary. The paper recommends that training institutions in Africa strengthen the research and writing skills component of their curricula, and that information organisations in the region regularly schedule research and writing skills workshops.

Journal ArticleDOI
TL;DR: The Royal Society Scientific Information Conference of 1948 was a top-level attempt to look at scientific and technical information in the light of the post-war growth of the literature; some of its large number of recommendations have been made irrelevant by advances in technology.
Abstract: The Royal Society Scientific Information Conference of 1948 was a top-level attempt to look at scientific and technical information in the light of the post-war growth of the literature. Some of the large number of recommendations have been made irrelevant by advances in technology, and some, for example those relating to bibliographic control of books and comprehensive collecting of scientific literature, have been overtaken by action. Most recommendations, however, are unfinished – some unfinishable – business. The recommendations relating to control over the number and format of journals and co-operation between abstracting journals were never realistic. Issues that are still live include library co-operation and copyright. The 1948 conference may have had few direct effects, but it helped to create a climate where improvements were easier to make. Political and technological changes in the world since then have led to a very different environment, in which information is held to have a commercial value...

Journal ArticleDOI
TL;DR: This paper outlines J.D. Bernal's scheme for the central distribution of scientific papers, together with some contemporary reactions; the proposals implied a revolutionary transformation of the status quo.
Abstract: For the 1948 Royal Society Conference, J.D. Bernal submitted a paper which proposed a provisional scheme for the central distribution of scientific papers. This provoked such a hostile and extreme reaction from both the learned society publishers and the national press that the paper was withdrawn in advance of the conference; it is, however, extant in the conference Proceedings. This paper outlines the nature of the proposals, together with some contemporary reactions. Bernal's scheme certainly implied a revolutionary transformation of the status quo. However, his well-known political beliefs probably played as instrumental a part in their rejection as the nature of the proposals themselves.

Journal Article
TL;DR: This paper focuses on the importance of chemical patents as an information source, and concentrates principally on the area of structural information, highlighting some of the special characteristics that are found in the generic (Markush) type of description.
Abstract: This paper focuses on the importance of chemical patents as an information source. After an outline of this importance, the discussion concentrates principally on the area of structural information, highlighting some of the special characteristics that are found in the generic (Markush) type of description, in order to place in context some of the research work at Sheffield University. A brief summary is given of the important highlights of the research, performed by a team headed by Professor Mike Lynch from 1979 to 1995.


Journal ArticleDOI
TL;DR: A formula for the ranking of scientists based on diachronous citation counts is proposed, building on the fact that the citation generation potential is not the same for all papers: it differs from paper to paper and also depends, to a certain extent, on the subject domain of the papers.
Abstract: A formula for the ranking of scientists based on diachronous citation counts is proposed. The paper builds on the fact that the citation generation potential (CGP) is not the same for all papers: it differs from paper to paper and also depends, to a certain extent, on the subject domain of the papers. The method of ranking proposed in no way replaces peer review; it merely acts as an aid to help peers arrive at a better judgement.
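The normalisation idea behind such rankings can be sketched as dividing each paper's citation count by the expected count for its subject domain and averaging the ratios. The expected-count figures, the function, and the averaging are invented for illustration; the paper's actual formula may differ in detail:

```python
# Hedged sketch: field-normalised citation score for one scientist.
# 'expected' holds invented mean diachronous citation counts per paper
# in each domain, standing in for the citation generation potential.

def normalised_score(papers, expected):
    """papers: list of (domain, citations) pairs for one scientist."""
    ratios = [cites / expected[domain] for domain, cites in papers]
    return sum(ratios) / len(ratios)

a = normalised_score([("maths", 12), ("maths", 4)], {"maths": 8.0})
b = normalised_score([("biomed", 40)], {"biomed": 50.0})
print(round(a, 2), round(b, 2))
```

On raw counts the second scientist looks far stronger; after normalising by the domain's citation potential the first comes out ahead, which is the kind of correction the paper argues peers need.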

Journal ArticleDOI
TL;DR: The main features of current practice in email use in academic libraries are outlined, and, drawing on experience in the sector and on pointers from the literature, a number of issues of relevance to developing effective network communities in information services are discussed.
Abstract: This paper discusses key themes from British Library funded research carried out between 1995 and 1997 into electronic communication in academic libraries in the UK. The research focused in particular on the intra-organisational use of electronic mail (email), that is, on its use by colleagues within the same library organisation for internal activities and collaborative work. The main features of current practice in email use in academic libraries are outlined, and, drawing on experience in the sector and on pointers from the literature, the paper discusses a number of issues of relevance to developing effective network communities in information services.

Journal ArticleDOI
TL;DR: The paper describes the Commonwealth Regional Health Community Secretariat/Namibia Dissemination Centre based at the University of Namibia, identified as a partner institution in the Secretariat's information dissemination network in reproductive health and nutrition programmes.
Abstract: This paper briefly describes the activities of the Commonwealth Regional Health Community Secretariat/Namibia Dissemination Centre based at the University of Namibia, one of the institutions identified as partners in the Secretariat's information dissemination network in reproductive health and nutrition programmes. The establishment of the information dissemination network itself arose from a workshop held in Arusha, Tanzania in 1995 and attended by Dissemination Centre representatives from thirteen member countries. The University of Namibia Library is part of the University of Namibia, an institution with an important role to play in one of the youngest nations in Africa. The library has modern facilities and the resources and interest to run the activities of the Dissemination Centre. It undertook desk and field research and produced information sources on nutrition and reproductive health.


Journal ArticleDOI
TL;DR: Adaptations and tests undertaken to allow an information retrieval (IR) system to forecast the likelihood of avalanches on a particular day are presented and it is concluded that the adaptation methodology is effective at allowing such data to be used in a text‐based IR system.
Abstract: This paper presents adaptations and tests undertaken to allow an information retrieval (IR) system to forecast the likelihood of avalanches on a particular day. The forecasting process uses historical data of the weather and avalanche conditions for a large number of days. A method for adapting these data into a form usable by a text‐based IR system is first described, followed by tests showing the resulting system’s accuracy to be equal to existing ‘custom built’ forecasting systems. From this, it is concluded that the adaptation methodology is effective at allowing such data to be used in a text‐based IR system. A number of advantages in using an IR system for avalanche forecasting are also presented.
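The adaptation step described, turning numeric weather readings into text-like terms that a term-matching IR engine can index, can be sketched as follows. The band edges, field names, and readings are all invented; the paper's actual discretisation and its IR engine's ranking function are not reproduced here:

```python
# Hedged sketch: discretise numeric measurements into pseudo-words so
# that a text-based IR system can retrieve similar historical days.

def to_terms(day):
    """Map a dict of numeric readings to terms like 'temp_low'."""
    bands = {"temp": [-5, 2], "wind": [20, 50], "snowfall": [1, 10]}
    labels = ["low", "mid", "high"]
    terms = set()
    for field, value in day.items():
        band = sum(value >= edge for edge in bands[field])  # 0, 1 or 2
        terms.add(f"{field}_{labels[band]}")
    return terms

def overlap(query, doc):
    """Simple coordination-level match between two term sets."""
    return len(query & doc)

today = to_terms({"temp": -7, "wind": 55, "snowfall": 12})
past  = to_terms({"temp": -6, "wind": 60, "snowfall": 3})
print(sorted(today), overlap(today, past))
```

Historical days whose term sets overlap most with today's would be retrieved, and their recorded avalanche outcomes used as the forecast evidence.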