
Analysis of EZproxy server logs to visualise research activity in Curtin’s online library

18 Nov 2019-Library Hi Tech (Emerald Publishing Limited)-Vol. 37, Iss: 4, pp 845-865
TL;DR: These prototype evidence-based data visualisations empower the library to communicate in a compelling and interesting way how its services and subscriptions support Curtin University’s missions.
Abstract: The purpose of this paper is to develop data visualisation proof-of-concept prototypes that will enable the Curtin University Library team to explore its users’ information-seeking behaviour and collection use online by analysing the library’s EZproxy logs. Curtin Library’s EZproxy log file data from 2013 to 2017 is used to develop the data visualisation prototypes using Unity3D software. Two visualisation prototypes from the EZproxy data set are developed. The first, “Global Visualisation of Curtin Research Activity”, uses a geographical map of the world as a platform to show where each research request comes from, the time each is made and the file size of the request. The second prototype, “Database Usage Visualisation”, shows the daily use of the library’s various subscription databases by staff and students over the month of April 2017. The paper has the following limitations: working to a tight timeline of ten weeks; the time taken to cleanse noise data; and the requirements for storing and hosting the voluminous data sets. The prototypes provide visual evidence of the use of Curtin Library’s digital resources at any time and from anywhere by its users, demonstrating the demand for the library’s online service offerings. These prototype evidence-based data visualisations empower the library to communicate in a compelling and interesting way how its services and subscriptions support Curtin University’s missions. The paper provides innovative approaches to creating immersive 3D data visualisation prototypes that make sense of complex EZproxy data sets.

Summary

Introduction

  • Curtin Library has a substantial dataset of logged, authenticated use of its online library collection, comprising databases, eJournals and eBooks dating from 2013.
  • Making sense of and drawing meanings from the raw data, which is mainly in the textual format of URL codes, is nearly impossible.
  • EZproxy offered a rich source of data for analysis, containing a detailed log of HTTP requests processed through the library’s authentication servers.

What is an EZproxy server?

  • EZproxy is an internet proxy server that is primarily used to first authenticate then allow computers outside a library network to access content provided by the library without requiring additional logins.
  • Figure 1 illustrates how this works: when a user accesses the Curtin catalogue and attempts to access content on an Online Database Vendor’s website, for example ProQuest, they are authenticated by the library’s authentication system, which then creates a session with the EZproxy server and redirects the user’s browser there, allowing access to all subscribed library content.
  • EZproxy handles access by URL rewriting: taking the URL requested by a user and modifying it so that the web server holding the content accepts the request.
  • An example is when a user tries to access the online database ‘proquest.com’: the user’s browser is directed to the rewritten URL ‘proquest.dbgq.lis.curtin.edu.au’, which carries the session cookie.

Dataset

  • A literature review shows how other academic libraries handle their EZproxy data logs.
  • These studies demonstrate that it is possible to extract useful information about trends and usage patterns from EZproxy datasets, and that these can be used to improve library services in universities.
  • The grep tool is a Unix based pattern matcher that searches through plain text data sets in the target file and outputs all lines that match a regular expression of these patterns.
  • This project offers a new and friendlier way for librarians to analyse datasets that have traditionally been difficult for them to understand.
  • Joseph et al. (2013) state that information search behaviour comprises ‘search processes’ and ‘search activities’.

Research aims

  • The current research does not pose research questions; however, the project’s research aims will enable the development of research tools that answer questions raised in later research.
  • The authors’ research aims are to develop interactive and immersive data visualisation prototypes with user-friendly interfaces to enable the Curtin Library team to explore how the online library is being used by its patrons.

Research methodology

  • Three sequential steps were performed to develop the desired visualisations for this project.
  • First, the EZproxy dataset was curated into a format that enabled easy extraction of useful data from the logs, removing all lines in the log files that did not represent actual search data.
  • The removed lines included webpage assets (HTML or CSS) and images/logos (GIFs, JPEGs, PNGs).
  • Unity scripts were written using the C# programming language to query and extract the data required to develop the visualisations.
  • Unity’s simple user interface made it easy for beginners to learn and use.
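The curation step described above can be sketched in Python (the project itself used C# Unity scripts; the field position and extension list here are illustrative assumptions, not the project’s actual code):

```python
# Sketch of the log-curation step: drop lines whose requested URL points to
# page furniture (stylesheets, scripts, images) rather than an actual search.
# The extension list and field layout are illustrative assumptions.
NOISE_EXTENSIONS = (".css", ".js", ".gif", ".jpg", ".jpeg", ".png", ".ico")

def is_search_line(line: str) -> bool:
    """Keep a log line only if its request path is not page furniture."""
    parts = line.split()
    if len(parts) < 7:           # malformed line: not a full HTTP request record
        return False
    url = parts[6].lower()       # request-URL field in a combined-style log line
    path = url.split("?", 1)[0]  # ignore query strings when checking extensions
    return not path.endswith(NOISE_EXTENSIONS)

def curate(in_path: str, out_path: str) -> int:
    """Write only search-relevant lines to out_path; return how many were kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if is_search_line(line):
                dst.write(line)
                kept += 1
    return kept
```

Filtering at this stage keeps the dataset that reaches Unity small enough to query interactively.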

Curation of the EZproxy Log File Dataset

  • The project’s dataset includes five years of EZproxy log data; Curtin University Library has been logging data since 2013.
  • The data is stored within text files, one file for each day, up to 2016.
  • Codes starting with 2 indicate successful requests: the data that the user requested arrives.
  • The user-agent field records the user’s web browser (e.g. Google Chrome or Internet Explorer) sending the request, and identifies the browser and device being used to make the request.
  • For privacy and anonymity reasons, patrons’ identities cannot be revealed.
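As an illustration of how one such log line might be decomposed into the fields described above, here is a small parser; the paper does not specify Curtin’s exact log layout, so a combined-log-style format is assumed:

```python
import re

# Minimal parser for one EZproxy log line, assuming a combined-log-style
# layout (IP, user, timestamp, request, status, bytes). The real Curtin
# log format is not given in the paper, so this regex is an assumption.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    record = m.groupdict()
    # Codes starting with 2 mean the request succeeded; 3 is a redirect,
    # 4 a client error and 5 a server error.
    record["success"] = record["status"].startswith("2")
    return record
```

Structured records like this are what the visualisation scripts would consume, rather than raw text.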

Issues with Data

  • A few key issues with the dataset had to be resolved before the authors could begin work on any visualisations of the dataset.
  • All but one specific dataset continued to present issues in retrieving meaningful data from the log files.
  • The first of these was that EZproxy logs all HTTP requests, and many of these are for parts of a website that would not be useful to visualise as they do not represent a user using the library but rather parts of a webpage being sent to the users’ browser, such as images on the webpage or CSS stylesheets.
  • This process took approximately 60 hours to run over a weekend, given the large amount of data, roughly 1.2 Terabytes of raw text data.
  • The third issue, that of difficulty in extracting meaningful data, was the format of the URLs.
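The URL-format difficulty can be made concrete: because EZproxy rewrites vendor hostnames, attributing a request to a particular database means stripping the proxy suffix back off. A sketch, assuming the suffix shown in the paper’s ProQuest example:

```python
from urllib.parse import urlparse

# Sketch of recovering the vendor database from an EZproxy-rewritten URL.
# The rewritten hostname embeds the vendor before the proxy suffix, e.g.
# 'proquest.dbgq.lis.curtin.edu.au' -> 'proquest'. The suffix below follows
# the example in the paper; other institutions use different suffixes.
PROXY_SUFFIX = ".dbgq.lis.curtin.edu.au"

def database_from_url(url: str):
    host = urlparse(url).hostname or ""
    if host.endswith(PROXY_SUFFIX):
        return host[: -len(PROXY_SUFFIX)]
    return None  # not a rewritten URL; cannot attribute it to a database
```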

Results – Visualisation Prototype Developments

  • The project developed two main visualisations to showcase research activity in Curtin’s online library space, described next.
  • The first, “Global Visualisation of Curtin Research Activity”, uses a geographical map of the world as a platform to show where each research request comes from, the time the request is made, and the file size of the request.
  • Square icons are also used to represent the status of users’ requests, colour coded to indicate the HTTP status of the request.
  • A user interface displays specific details about an individual search request.

Database Usage Visualisation

  • The first planned enhancement is to build a search interface that enables queries about the use of specific online databases currently subscribed to.
  • This feature would allow users to view what is most important to them, quickly and easily.

Discussion

  • Given the limited research on dynamic 3D visualisation of EZproxy datasets, it was not possible to compare this project with other work.
  • The literature review reports the use of data visualisation software like Tableau for visualising EZproxy data and 2D graphing software to create simple visualisations from EZproxy datasets.
  • In comparison, the two Global and Database 3D visualisation prototypes developed for this project are more immersive and dynamic than the 2D presentations reported by Bhaskar et al. (2014).
  • The ability to move around a scene provides a user-friendly interface to navigate and digest the complex EZproxy dataset information, as a large set of bar graphs cannot.

Design Considerations for the Prototypes

  • When designing the visualisations for this project, the authors considered how the data would be viewed and how different interactive environments could be developed using the Unity software.
  • Unity was chosen for its ability to create immersive environments quickly and easily, and to present the content on multiple platforms using the same scripts and assets; it also has options to develop and present content for all computing and mobile platforms and via web browsers.
  • In short, the specifications were to design immersive and user-friendly interfaces to explore and make sense of the rich information the EZproxy server log files contained.
  • The use of the HIVE screens assists in the presentation of visualisations, more than is possible with a regular desktop screen display.
  • The environments in which the data is showcased are open spaces, allowing the user to move freely around and investigate the data.

Further technical development

  • Continue development using the Unity 3D platform.
  • Once all the data has been stored and hosted on a database that can hold large data quantities, a few changes will need to be made so that rather than reading files, Unity can query the database and then use that data to develop the visualisations.
  • Time-based queries would also be possible, allowing viewers to select a time range dynamically, rather than starting at midnight and viewing only one day at a time.
  • Global Visualisation: currently one icon is instantiated every frame, which causes performance issues when moving through the timeframe.
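The proposed shift from reading daily files to querying a hosted database could look like the following sketch, with SQLite standing in for the real log store; the table and column names are assumptions for illustration:

```python
import sqlite3

# Sketch of the proposed database-backed workflow: instead of reading one
# day's log file from midnight onward, the visualisation queries an
# arbitrary time window. Table and column names are assumptions.
def requests_in_window(conn, start, end):
    cur = conn.execute(
        "SELECT timestamp, url, status FROM requests "
        "WHERE timestamp >= ? AND timestamp < ? ORDER BY timestamp",
        (start, end),
    )
    return cur.fetchall()

# Usage: an in-memory database stands in for the real hosted log store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (timestamp TEXT, url TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("2017-04-01T09:00:00", "http://proquest.com/a", 200),
        ("2017-04-01T13:30:00", "http://proquest.com/b", 200),
        ("2017-04-02T01:00:00", "http://proquest.com/c", 404),
    ],
)
rows = requests_in_window(conn, "2017-04-01T12:00:00", "2017-04-02T00:00:00")
```

A Unity client would issue the same kind of windowed query and instantiate icons only for the rows returned, rather than one icon per frame.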

Inclusion of Faculty information

  • Adding academic faculty information to the EZproxy dataset, such as student enrolment data by academic staff, by faculty, or by discipline area, would enable insights into information behaviour and resource use by these faculties.
  • Adding this faculty information would offer a deeper understanding of how different faculties use the library.
  • Prior research in other universities (Chan, 2014; Coombs, 2005; Grace and Bremner, 2004) using EZproxy datasets indicates that different faculties use the library differently, with each having preferences for databases and eBooks, and different levels of use.
  • Having such information would enable the provision of targeted e-library services like those reported by Chan (2014), and Grace and Bremner (2004, p. 164): for instance, the development of personalised library websites/portals for different disciplines.
  • This model of library service delivery aligns with Curtin University’s strategic vision to ‘deliver a seamless, responsive and innovative digital environment’ for learning and student experience (Curtin University, 2017, p. 5).

Profiling the average Curtin Library User

  • Profiling a visual image of Curtin University’s ‘average’ user will assist with planning initiatives in library service.
  • Given that EZproxy logs contain all the search history of all who use the library, the ability to track a single user’s resource usage patterns over a period of time would be an interesting visualisation to develop.
  • It would provide a glimpse into research activity over a longer period; and if performed with users from different discipline areas and faculties, could lead to the formulation of archetypes of researchers in different faculties, discipline areas and campuses.
  • As Curtin operates in several locations internationally, it would be a useful exercise to compare the usage of a Bentley campus user with that of a Singapore campus user.
  • It could be fruitful to compare and contrast the ‘average’ user for different faculties and campuses to visualise their differences and similarities.

Understanding Users’ Information Search Habits

  • From the EZproxy server logs, it is possible to observe aspects of users’ ‘search processes’—to examine the paths users take to reach databases.
  • One key factor of visualising research activity is to see what users search for, as well as what they actually download, when they browse the online library.

Conclusion

  • This short 10-week HIVE Intern research project highlights opportunities for developing interactive, user-friendly and immersive ways to visualise and make sense of the rich EZproxy dataset.
  • It also indicates various avenues for expansion.
  • Both the Global Visualisation and Database Usage Visualisation prototypes provide visual evidence of the high volume of usage of Curtin Library’s digital resources—eBooks and databases, and of the accessibility and usage of the library’s digital contents at any time and from anywhere.
  • It offers evidence of how the library supports the university’s strategic goal of becoming a global campus by 2020, delivering courses internationally (Curtin University, 2016).
  • First, the project curated EZproxy log files into the formats required to feed into Unity software and develop the visualisation prototypes.



Analysis of EZproxy server logs to visualise research activity in Curtin’s online library
Journal: Library Hi Tech
Manuscript Type: Original Article
Keywords: data visualisation, EZproxy server logs, user-centered design, collection management, academic libraries, information resources management

Introduction
Curtin Library has a substantial dataset of logged, authenticated use of its online library collection,
comprising databases, eJournals and eBooks dating from 2013. The EZproxy software writes about
30 million lines a month, and this rich dataset was accessible for this project.
Making sense of and drawing meanings from the raw data, which is mainly in the textual
format of URL codes, is nearly impossible. The Curtin Library team recalled an unsuccessful attempt
ten years ago, finding it impossible to comprehend so much data using the human eye,
as it all looked similar, with no distinct observable trends in users’ information-seeking behaviours.
Scholarly findings about visualisations based on EZproxy data have been traditionally static,
as in the work by Bhaskar et al. (2014), Coombs (2005), Chan (2014), Grace and Bremner (2004),
Lewellen et al. (2016) and Sharman (2017). Dynamic visualisations are uncommon in this field, but
Archambault et al. (2015, p.1) argue that a framework for meaningful data visualisation has merit:
There are advantages to presenting data visually rather than as a set of flat
statistics. Proper data visualization facilitates the recognition of patterns and
relationships to communicate a message in a more compelling and interesting
way. It allows the complexity of the data to be understood more easily.
We agree with Archambault et al. (2015) that the creation of a set of visualisations that is
dynamic and immersive would offer friendlier interactions with the dataset, allowing one to look at
it in a way that normally is not possible. The creation of a 3D virtual space where one could move
around and view the data freely was envisioned to provide the library team with an immersive
encounter with the dataset and an opportunity to explore their users’ information seeking
behaviour.
An opportunity to work with the EZproxy dataset was made possible with technical and
financial support from the Curtin HIVE (Hub for Immersive Visualisation and eResearch) Internship
Program, which allows a Curtin student to undertake a ten-week, full-time investigation of the
application of visualisation technologies to a discipline area. Interns have regular access to the HIVE,
are supported by its expert staff and supervised by a library and information science discipline leader
and the Curtin Library team; the latter were the clients for this project.
In this paper we describe a project that aimed to visualise the EZproxy dataset to draw
inferences about Curtin library users’ information seeking behaviour and collection use in the
virtual/online environment. EZproxy offered a rich source of data for analysis, containing a detailed
log of HTTP requests processed through the library’s authentication servers. Another user identity
dataset was merged with EZproxy to identify each user’s profile: whether staff or student; their
geolocation data; and if their requests were processed in or out of Curtin’s IP subnet. Analysis of
these logs led to visualisations of hidden trends in users’ collection of information, providing insights
into where and when users accessed the virtual library, what information seeking activities they
performed, whether they browsed or downloaded full-text journal articles, and which online
databases they accessed frequently.
Visual awareness and understanding of these trends will help the library to make important
budget decisions about their collection subscriptions and assist with strategic planning about their
service delivery options (Grace and Bremner, 2004).
Background
Curtin University Library
Curtin Library supports the University’s learning, teaching and research requirements and supports
more than 58,000 students and 3,700 staff worldwide. Its online collection comprises at least
250,000 eBooks, 15,000 streamed videos, 150,000 journal subscriptions, 600 electronic databases
and 48,000 institutional repository records. The library’s acquisitions budget is over $10 million.
Access and authentication are administered by EZproxy, software licensed by OCLC. EZproxy reduces
the number of authorisations for users and ensures remote access to content is secure and complies
with licensing arrangements.
What is an EZproxy server?
EZproxy is an internet proxy server that is primarily used to first authenticate then allow computers
outside a library network to access content provided by the library without requiring additional
logins. Figure 1 illustrates how this works: when a user accesses the Curtin catalogue and attempts
to access content on an Online Database Vendor’s website, for example ProQuest, they are
authenticated by the library’s authentication system, which then creates a session with the EZproxy
server and redirects the user’s browser there, allowing access to all subscribed library content.
[Insert Figure 1: How the EZproxy server works]
All hypertext transfer protocol (HTTP) requests that are sent through EZproxy are logged by
its server. For example, if a user is directed to a website through Google Scholar, once they are
authenticated by Curtin’s Authentication server, all requests for this website will be logged within
the dataset, creating a valuable set of information about who accesses resources via the EZproxy
server that can be utilised to visualise users’ information search and retrieval behaviours.
A user may access the library’s online databases from a personal browser by going to the
library catalogue or A – Z list of subscription databases, or through Google Scholar. When a user is
successfully authenticated, a cookie is sent to the browser by EZproxy. The browser then presents
this cookie for each access to EZproxy, allowing EZproxy to check the user’s access rights each time.
EZproxy handles access by URL rewriting: taking the URL requested by a user and modifying it so that
the web server holding the content accepts the request. This removes the need for the user to use a
separate login, as they are already authenticated by EZproxy. An example is when a user tries to
access the online database ‘proquest.com’: the user’s browser will be informed of the URL that
contains the cookie, ‘proquest.dbgq.lis.curtin.edu.au’ —everything after ProQuest is the cookie.
(‘URL Rewriting’, 2018)
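The rewriting described above can be sketched as follows. This is an illustration built on the ProQuest example in the text, not EZproxy’s actual implementation; real EZproxy configurations may encode the vendor host in more elaborate ways:

```python
from urllib.parse import urlparse, urlunparse

# Sketch of the URL rewriting EZproxy performs: the vendor hostname is folded
# into a hostname under the proxy's own domain, so the browser keeps talking
# to EZproxy while EZproxy fetches content from the vendor. The domain names
# follow the ProQuest example in the text.
PROXY_DOMAIN = "dbgq.lis.curtin.edu.au"

def rewrite(url: str) -> str:
    parts = urlparse(url)
    vendor = parts.hostname or ""
    # 'proquest.com' -> 'proquest.dbgq.lis.curtin.edu.au' (per the example;
    # this naive mapping is an assumption for illustration only)
    proxied_host = vendor.split(".")[0] + "." + PROXY_DOMAIN
    return urlunparse(parts._replace(netloc=proxied_host))
```

Because every subsequent request carries the proxy hostname, each one passes through (and is logged by) the EZproxy server, which is precisely what makes the log dataset so complete.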
How other Academic Institutions use the EZproxy Server and
Dataset
A literature review shows how other academic libraries handle their EZproxy data logs. Academic
libraries from the United Kingdom (Sharman, 2017; Grace and Bremner, 2004) and the United States
(Chan, 2014; Coombs, 2005; Lewellen et al., 2016) report their experiences working with EZproxy,
how their findings assisted in improving their service, and understanding of the usage patterns of
their electronic resources. The Open University library reported using their EZproxy server data to
evaluate their electronic resource subscriptions comprehensively and monitor resource usage trends
by staff and students; the findings enabled them to measure the performance of their library
services, and led to the development of performance indicators to measure their service quality and
patron impact (Grace and Bremner, 2004, p. 164). Sharman (2017) reports how the University of
Huddersfield combined their EZproxy dataset with book loans from the library management system
and statistics from library visits; they found correlations between the low use of library resources
and final grades among Chinese students compared to their UK peers.
In the United States, Coombs (2005) reports how the SUNY Cortland University used the
EZproxy dataset to gain insights into online collection use. Chan (2014) discusses how student course
enrolment data was merged with the EZproxy dataset at California State University to develop
personalised e-library services. Lewellen et al. (2016) describe how the University of Massachusetts
Amherst used the dataset to investigate the use of e-books compared to print books.
These studies demonstrate that it is possible to extract useful information about trends and
usage patterns from EZproxy datasets, and that these can be used to improve library services in
universities. However, working with the complex raw EZproxy dataset is neither easy nor user-
friendly, as revealed by the East Kentucky University’s experience. This university analysed its
EZproxy dataset to determine use patterns because this information was not provided by the online
database vendors. Hence, the university firstly used an OpenURL link resolver, which provides a set
of reports that enable robust analysis of user behaviour, as shown in Figure 2.
[Insert Figure 2. An example of OpenURL results (adapted from Smith and Arneson, 2017, figure 3)]
This snapshot of a report generated from the EZproxy dataset is detailed, but the textual
data is visually unappealing; in fact, it looks just as our project’s EZproxy data appeared when
exported into MS Excel.
The second method that Eastern Kentucky University used to extract data from their EZproxy
data logs was by using the command-line grep tool, as shown in Figure 3. The grep tool is a Unix
based pattern matcher that searches through plain text data sets in the target file and outputs all
lines that match a regular expression of these patterns. Both the construction of these patterns and
working with the grep utility tool are daunting tasks for novices.
[Insert Figure 3. Example of grep on a small sample file
(‘Pipe, Grep and Sort Command in Linux/Unix with Examples’, 2018)]
There are three disadvantages to using this method. First, the output of this tool is in the
same format as the input, since grep does not change the lines; instead, it writes the lines that match
the pattern to a new file. Second, a single grep command-line will not work for all queries as it cannot
cope with the varying file formats used by different database vendors. Third, it does not enable the
development of data visualisations, as its functionalities are limited to searching the file for certain
queries.
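For comparison, the same pattern matching can be scripted, which sidesteps the first two disadvantages: matched lines can be transformed on the way out, and several vendor-specific patterns can be handled in a single pass. A Python sketch (illustrative only, not the project’s code; the vendor patterns are assumptions):

```python
import re

# A grep-like filter in Python: unlike plain grep, each matching line can be
# tagged and transformed on output, and multiple vendor-specific patterns
# are handled in one pass. Vendor names/patterns here are assumptions.
PATTERNS = {
    "proquest": re.compile(r"proquest", re.IGNORECASE),
    "sciencedirect": re.compile(r"sciencedirect", re.IGNORECASE),
}

def filter_lines(lines):
    """Yield (vendor, line) pairs for lines matching any vendor pattern."""
    for line in lines:
        for vendor, pattern in PATTERNS.items():
            if pattern.search(line):
                yield vendor, line.rstrip("\n")
                break  # attribute each line to the first matching vendor
```

The third disadvantage remains, of course: a filter alone does not produce a visualisation, only cleaner input for one.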
Given our research aim to find out about Curtin’s online collections usage trends, an
understanding via the literature review on how others achieved this was of interest. Hence, Morton-
Owens and Hanson’s (2012) research about how they analysed EZproxy logs through the creation of
dashboard charts based on EZproxy data was useful. They report running a few basic calculations to
highlight significant changes within their data, which produced two charts indicating trends in
resource use and variations in use of specific resources. It was our understanding that often
additional analytical software is required to make sense of this complex EZproxy data. Thus, it was
useful to read that Austin College used Google Analytics on their EZproxy dataset to identify which
were the main resources being used, when (time/day) the most use took place each week, and what
devices were being used to access their library (Ajamie et al., 2014).
EZproxy server logs are a rich source offering immense opportunities for harnessing their
data to tell stories of how library users search for information and how the library’s information
sources are used at any time, or 24/7/365. Presenting this information visually will assist both in
planning and management of the library collection and in improving its service delivery (Grace and
Bremner, 2004; Kay, 2014). Knowing how students use library resources can aid in focusing on
teaching those information literacy skills most needed; and to identify groups of students needing
specific remedial skill teaching (Kay, 2014). It can also enable the delivery of e-library services, such
as the personalisation of library websites/portals for students and staff in specific discipline areas to
provide rapid access to their most frequently used online resources (Chan, 2014; Grace and
Bremner, 2004, p. 162).
This review identifies a gap in the work about using data visualisation tools and techniques
to interrogate the rich but complex EZproxy dataset. At the moment, all data is presented as static
flat graphs and tables. No-one so far has developed an interactive search interface that is friendly to
library decision-makers, who are not trained to work with or understand complex computing
datasets. There is also a lack of built-in user-friendly search interfaces that offer library decision-
makers the opportunity to explore visually their clients’ techniques of information-seeking. In the
absence of these functionalities, it is not yet possible for library professionals to have an immersive
experience that enables them better to understand their clients’ usage patterns. Hence, it is
worthwhile to offer librarians a familiar, easy-to-use search interface, similar to their online
library catalogue interfaces, for searching usage information about their online resources.
Further, these data visualisation tools will provide a visually immersive experience whereby they can
intuitively explore the dataset to identify usage patterns and statistics. This project offers a new and
friendlier way for librarians to analyse datasets that have traditionally been difficult for them to
understand.
Definition of information seeking behaviour
Given that the EZproxy server logs staff and student interactions with various online databases and
electronic resources provided by the library, it is necessary to define the broad behaviours the
findings from the EZproxy datasets reveal. The information seeking behaviour literature is vast and
only key definitions that are relevant for this project are described in our article. We adopt Joseph et
al.’s (2013) definition: that seeking information ‘relates to the process of identifying an information
need, then sourcing and accessing the necessary information avenues to address that need’.
We consider information retrieval a crucial activity in the information-seeking process,
especially in the context of this research: that users wish to fulfil their informational needs from the
widest possible pool of electronic resources. Meadow et al.’s (2007, p. 2) definition that information
retrieval ‘involves finding some desired information in a store of information or a database’ is
applicable here. Information ‘search behaviour’ also needs to be defined. Joseph et al. (2013) state
that information search behaviour comprises ‘search processes’ and ‘search activities’. Search
processes involve several sequential but iterative stages of judgement, option selection and
decision-making (Ellis 2005; Henefer and Fulton 2005; Kuhlthau 1988; Leckie et al. 1996; Marchionini
and White 2007; Meho and Tibbo 2003). Joseph et al. (2013, p. 6) state:
Search activities refer to the actions users enact during the iterative process of
moving the information search from start to closure. These actions include
browsing, navigating or extracting information. Both information search
processes and search activities comprise the search behaviour of users of these
systems.
These simple and practical definitions are useful guides to our discussions on the
research aims of our project, described next.

Abstract: The authors examine the libraries’ independent calculations and analysis of alternative metrics to determine user-defined relevancy of publications. Log files collected by web servers as users access e-catalogues may be used for such calculations and analysis. The key advantages of altmetrics as compared to traditional bibliometrics and webometrics are: real-time representation; openness and transparency; coverage of a wider non-academic audience; and coverage of various sources and research findings. The web-server log files enable analysis of the number of user views of bibliographic records in the e-catalogue, and rating of authors, individual publications and topical queries over a time period. Where links to full texts exist, the number of PDF documents downloaded by users and the number of pages viewed (when the library’s computer-based system supports full-text document scrolling mode) may also be determined. The authors also discuss the specifics of handling log files for various analysis types: statistical, content-addressable, defining sequential sets, and clustering. The functionality and table structure of RNPLST’s software module for advanced bibliometric analysis is described. A further task of the study will be to define trends in user behaviour and forecasts of user-group behaviour when using the e-catalogue. The article was prepared within the framework of the State Order to the Russian National Public Library for Science and Technology “Development and improvement of the system of Open Archive of integrated information and library resources of Russian National Public Library for Science and Technology as modern knowledge management system in digital environment: on the way to Open Science” № 075-01300-20-00 for 2020–2022.
References
More filters
Journal ArticleDOI
TL;DR: In this article, the authors propose a model of information seeking that is applicable to all professionals by analyzing and interpreting empirical studies on the information habits and practices of three groups: engineers, health care professionals and lawyers.
Abstract: Drawing upon existing research and previous attempts at modeling the information-seeking behavior of specific professional groups, this article posits an original model of information seeking that is applicable to all professionals. The model was developed through a careful analysis and interpretation of empirical studies on the information habits and practices of three groups: engineers, health care professionals, and lawyers. The general model and its six major components are presented in detail. These six components are (1) work roles, (2) associated tasks, and (3) characteristics of information needs and three factors affecting information seeking: (4) awareness, (5) sources, and (6) outcomes. In turn, each component contains a number of variables that are described with examples from the literature. The complexity of the information-seeking process is conceptualized in terms of the interaction and simultaneous occurrence of the model's components and variables, including a feedback mechanism. The art...

813 citations

Journal ArticleDOI
TL;DR: A fuller description of the information-seeking process of social scientists studying stateless nations should include four additional features besides those identified by Ellis, and a new model is developed, which groups all the features into four interrelated stages: searching, accessing, processing, and ending.
Abstract: This paper revises David Ellis's information-seeking behavior model of social scientists, which includes six generic features: starting, chaining, browsing, differentiating, monitoring, and extracting. The paper uses social science faculty researching stateless nations as the study population. The description and analysis of the information-seeking behavior of this group of scholars is based on data collected through structured and semistructured electronic mail interviews. Sixty faculty members from 14 different countries were interviewed by e-mail. For reality check purposes, face-to-face interviews with five faculty members were also conducted. Although the study confirmed Ellis's model, it found that a fuller description of the information-seeking process of social scientists studying stateless nations should include four additional features besides those identified by Ellis. These new features are: accessing, networking, verifying, and information managing. In view of that, the study develops a new model, which, unlike Ellis's, groups all the features into four interrelated stages: searching, accessing, processing, and ending. This new model is fully described and its implications on research and practice are discussed. How and why scholars studied here are different than other academic social scientists is also discussed.

357 citations

Journal ArticleDOI
TL;DR: The information-seeking process is described, and each of the subprocesses are discussed with an eye toward making user interfaces that closely couple support mechanisms.
Abstract: This article presents a framework for research and development in user interfaces that support information seeking. The information-seeking process is described, and each of the subprocesses are discussed with an eye toward making user interfaces that closely couple support mechanisms. Recent results from studies related to term suggestions for queries, coupling search and examination, and seamless interaction between overviews and previews are used to illustrate highly interactive information-seeking services.

128 citations

Journal ArticleDOI
TL;DR: A longitudinal study of students' perceptions of the information search process in libraries over a four-year period from high school through college is described, finding a significant difference in three areas: research assignments, focus formulation, and procedures for gathering information.
Abstract: A longitudinal study of students' perceptions of the information search process in libraries over a four-year period from high school through college is described. The cognitive theory on which the study was based is that perceptions lead to expectations which direct action. A model of the search process developed in an earlier study was further verified. Perceptions of six areas of library use were examined by a comparison of the responses of students to a questionnaire when they were in high school and again four years later when they were in college. A significant difference was found in three areas: research assignments, focus formulation, and procedures for gathering information. The college students' perceptions more closely matched the model of the search process.

71 citations

Journal ArticleDOI
TL;DR: A system that would collect electronic resource usage data in a consistent manner and allow SUNY Cortland to assess this data over several years is described.
Abstract: Purpose – The purpose of this paper is to describe a project undertaken at SUNY Cortland to develop a system that would collect electronic resource usage data in a consistent manner and allow SUNY Cortland to assess this data over several years.Design/methodology/approach – The project used data gathered from EZProxy server log files to examine usage of the library's electronic resources.Findings – Through examining the usage data the library discovered that users were utilizing particular types of resources, from specific physical locations, and accessing those resources from specific pages in the library's web site.Originality/value – By examining usage data for electronic resources, libraries can learn more than which resources are being used. Usage data can give libraries insight into where, when, how, and possibly why their users are accessing electronic resources.

43 citations

Frequently Asked Questions (11)
Q1. What are the contributions in this paper?

In this paper, the EZproxy dataset was used for a 10-week HIVE Intern research project to explore users' information-seeking behaviour and collection use in the virtual/online environment. 

As the foundational work for this project was developed in Unity, it is recommended that future work continues using this platform. Preliminary work has been done to export the data to a MySQL database, so this is an area that could be further developed. Once all the data has been stored and hosted on a database that can hold large data quantities, a few changes will need to be made so that, rather than reading files, Unity queries the database and uses the results to build the visualisations. 
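The database-backed pipeline proposed above can be sketched in miniature: once parsed log records live in a relational store, the visualisation layer queries aggregates instead of re-reading raw files. This is a minimal illustration only — `sqlite3` stands in for the MySQL database mentioned in the paper, and the table and column names (`requests`, `database_name`) are hypothetical, not taken from the project.

```python
import sqlite3

# Stand-in for the MySQL store; schema and names are illustrative only.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE requests (
    day TEXT, database_name TEXT, bytes INTEGER)""")
con.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [("2017-04-01", "Scopus", 51234),
     ("2017-04-01", "Scopus", 20480),
     ("2017-04-02", "JSTOR", 4096)])

# Daily per-database totals: the shape of data a visualisation such as
# the Database Usage Visualisation could consume directly.
rows = con.execute("""
    SELECT day, database_name, COUNT(*) AS hits, SUM(bytes) AS total_bytes
    FROM requests
    GROUP BY day, database_name
    ORDER BY day""").fetchall()
for row in rows:
    print(row)  # e.g. ('2017-04-01', 'Scopus', 2, 71714)
```

Pushing the GROUP BY into the database, rather than aggregating inside Unity, would keep the client-side memory footprint small even for multi-terabyte log histories.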

The first visualisation prototype is referred to as the Global Visualisation of Curtin Research Activity (henceforth Global Visualisation). 

It can also enable the delivery of e-library services, such as the personalisation of library websites/portals for students and staff in specific discipline areas to provide rapid access to their most frequently used online resources (Chan, 2013; Grace and Bremner, 2004, p. 162). 

The cylinder screen’s 180-degree, 3D field of view enables more content to be presented on screen, in a more immersive manner than a normal desktop computer screen, as shown earlier in Figures 9, 10 and 15. 

A user may access the library’s online databases from a personal browser by going to the library catalogue or A–Z list of subscription databases, or through Google Scholar. 

The use of the cylinder is helpful in the Database Visualisation prototype as it allows a wider field of view, giving the feeling of being immersed in the data. 

The dome display, presented in Figures 16 and 17, was a secondary choice for the Database Visualisation prototype, as it fully encompasses the viewer’s field of vision to create a feeling of being fully immersed in the data (Figure 17). 

This process took approximately 60 hours to run over a weekend, given the large amount of data: roughly 1.2 terabytes of raw text. 
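The long cleansing run described above amounts to turning raw EZproxy log lines into structured records while discarding noise. A minimal sketch of that step is shown below, assuming an NCSA-combined-style log layout; EZproxy's actual field order is configurable per server, so the pattern and field names here are assumptions, not the project's real format.

```python
import re
from datetime import datetime

# Assumed NCSA-combined-style layout; real EZproxy output depends on
# the server's LogFormat directive, so this pattern is illustrative.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_line(line):
    """Return a structured record, or None for noise lines."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None  # malformed/noise line: skipped during cleansing
    rec = m.groupdict()
    rec["bytes"] = 0 if rec["bytes"] == "-" else int(rec["bytes"])
    rec["ts"] = datetime.strptime(rec["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return rec

sample = ('203.0.113.7 - jdoe [01/Apr/2017:09:15:02 +0800] '
          '"GET /login?url=https://example.com/article HTTP/1.1" 200 51234')
print(parse_line(sample)["bytes"])  # 51234
```

Streaming the file line by line like this keeps memory use constant, which matters at the terabyte scale the project reports; the per-record timestamp, requester and byte count are exactly the fields the Global Visualisation plots.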

Curtin Library has a substantial dataset of logged, authenticated use of its online library collection, comprising databases, eJournals and eBooks dating from 2013. 

The authors consider information retrieval a crucial activity in the information-seeking process, especially in the context of this research: users wish to fulfil their informational needs from the widest possible pool of electronic resources.