scispace - formally typeset
Author

Bela Gipp

Bio: Bela Gipp is an academic researcher from the University of Wuppertal. The author has contributed to research in topics: Plagiarism detection & Computer science. The author has an h-index of 32, has co-authored 187 publications, and has received 3759 citations. Previous affiliations of Bela Gipp include University of California & University of California, Berkeley.


Papers
Journal ArticleDOI
TL;DR: Several actions could improve the research landscape: developing a common evaluation framework, agreement on the information to include in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.
Abstract: In the last 16 years, more than 200 research articles were published about research-paper recommender systems. We reviewed these articles and present some descriptive statistics in this paper, as well as a discussion about the major advancements and shortcomings and an overview of the most common recommendation concepts and approaches. We found that more than half of the recommendation approaches applied content-based filtering (55 %). Collaborative filtering was applied by only 18 % of the reviewed approaches, and graph-based recommendations by 16 %. Other recommendation concepts included stereotyping, item-centric recommendations, and hybrid recommendations. The content-based filtering approaches mainly utilized papers that the users had authored, tagged, browsed, or downloaded. TF-IDF was the most frequently applied weighting scheme. In addition to simple terms, n-grams, topics, and citations were utilized to model users' information needs. Our review revealed some shortcomings of the current research. First, it remains unclear which recommendation concepts and approaches are the most promising. For instance, researchers reported different results on the performance of content-based and collaborative filtering. Sometimes content-based filtering performed better than collaborative filtering and sometimes it performed worse. We identified three potential reasons for the ambiguity of the results. (A) Several evaluations had limitations. They were based on strongly pruned datasets, few participants in user studies, or did not use appropriate baselines. (B) Some authors provided little information about their algorithms, which makes it difficult to re-implement the approaches. Consequently, researchers use different implementations of the same recommendation approaches, which might lead to variations in the results.
(C) We speculated that minor variations in datasets, algorithms, or user populations inevitably lead to strong variations in the performance of the approaches. Hence, finding the most promising approaches is a challenge. As a second limitation, we noted that many authors neglected to take into account factors other than accuracy, for example overall user satisfaction. In addition, most approaches (81 %) neglected the user-modeling process and did not infer information automatically but let users provide keywords, text snippets, or a single paper as input. Information on runtime was provided for 10 % of the approaches. Finally, few research papers had an impact on research-paper recommender systems in practice. We also identified a lack of authority and long-term research interest in the field: 73 % of the authors published no more than one paper on research-paper recommender systems, and there was little cooperation among different co-author groups. We concluded that several actions could improve the research landscape: developing a common evaluation framework, agreement on the information to include in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.
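The dominant concept the review identifies, content-based filtering with TF-IDF weighting and cosine similarity over papers a user has authored or downloaded, can be illustrated with a minimal sketch. The function names, the toy tokenized documents, and the plain-Python implementation are assumptions for demonstration, not code from any of the reviewed systems:

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Compute a TF-IDF weight vector for each tokenized document."""
    n = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # term frequency scaled by inverse document frequency
        vectors.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(user_profile, candidates):
    """Rank candidate papers by similarity to the user's own papers."""
    vecs = tf_idf_vectors([user_profile] + candidates)
    user_vec, cand_vecs = vecs[0], vecs[1:]
    scores = [(i, cosine(user_vec, v)) for i, v in enumerate(cand_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

In this sketch the "user model" is simply the concatenated terms of the user's papers, which mirrors the review's observation that most approaches let users supply keywords, text snippets, or a single paper rather than inferring interests automatically.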

648 citations

Journal ArticleDOI
TL;DR: In this paper, the concept of academic search engine optimization (ASEO) is introduced and guidelines are provided on how to optimize scholarly literature for academic search engines in general and for Google Scholar in particular.
Abstract: This article introduces and discusses the concept of academic search engine optimization (ASEO). Based on three recently conducted studies, guidelines are provided on how to optimize scholarly literature for academic search engines in general, and for Google Scholar in particular. In addition, we briefly discuss the risk of researchers' illegitimately ‘over-optimizing’ their articles.

166 citations

01 Jan 2009
TL;DR: The first steps to reverse-engineering Google Scholar’s ranking algorithm are performed and the results may help authors to optimize their articles for Google Scholar and enable researchers to estimate the usefulness of Google Scholar with respect to their search intention and hence the need to use further academic search engines or databases.
Abstract: Google Scholar is one of the major academic search engines, but its ranking algorithm for academic articles is unknown. We performed the first steps towards reverse-engineering Google Scholar's ranking algorithm and present the results in this research-in-progress paper. The results are: citation count is the most highly weighted factor in Google Scholar's ranking algorithm. Therefore, highly cited articles are found significantly more often in higher positions than articles that have been cited less often. As a consequence, Google Scholar seems to be more suitable for finding standard literature than gems or articles by authors advancing a new or different view from the mainstream. However, interesting exceptions occurred for some search queries. Moreover, the occurrence of a search term in an article's title seems to have a strong impact on the article's ranking. The impact of search term frequencies in an article's full text is weak: it makes no difference to an article's ranking whether the article contains the query terms once or multiple times. We further investigated whether the name of an author or journal has an impact on the ranking, and whether differences exist between the ranking algorithms of the different search modes that Google Scholar offers; the answer in both cases was "yes". The results of our research may help authors to optimize their articles for Google Scholar, and they enable researchers to estimate the usefulness of Google Scholar with respect to their search intention, and hence the need to consult further academic search engines or databases. Keywords: Academic Search Engines, Google Scholar, Ranking Algorithm, Research in Progress

146 citations

Proceedings ArticleDOI
12 Oct 2013
TL;DR: It is found that results of offline and online evaluations often contradict each other, and it is concluded that offline evaluations may be inappropriate for evaluating research paper recommender systems, in many settings.
Abstract: Offline evaluations are the most common evaluation method for research paper recommender systems. However, no thorough discussion on the appropriateness of offline evaluations has taken place, despite some voiced criticism. We conducted a study in which we evaluated various recommendation approaches with both offline and online evaluations. We found that results of offline and online evaluations often contradict each other. We discuss this finding in detail and conclude that offline evaluations may be inappropriate for evaluating research paper recommender systems, in many settings.
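The contrast the study draws, between offline evaluations (comparing recommendations against a historical ground truth) and online evaluations (measuring how real users react), can be made concrete with two toy metrics. The specific metric choices (precision@k offline, click-through rate online) and all names below are illustrative assumptions, not the exact measures used in the paper:

```python
def precision_at_k(recommended, relevant, k=10):
    """Offline metric: fraction of the top-k recommendations that appear in
    a ground-truth relevant set (e.g., papers the user later cited)."""
    top_k = recommended[:k]
    return sum(1 for r in top_k if r in relevant) / k

def click_through_rate(shown, clicked):
    """Online metric: fraction of displayed recommendations users clicked."""
    return len(clicked) / len(shown) if shown else 0.0
```

The paper's finding is that an approach scoring well on a metric like the first can still score poorly on a metric like the second, because the historical ground truth is only a proxy for what users actually find useful.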

136 citations

01 Jan 2009
TL;DR: The approach called Citation Proximity Analysis (CPA) is a further development of co-citation analysis, but in addition, considers the proximity of citations to each other within an article’s full-text.
Abstract: This paper presents an approach for identifying similar documents that can be used to assist scientists in finding related work. The approach, called Citation Proximity Analysis (CPA), is a further development of co-citation analysis that additionally considers the proximity of citations to each other within an article's full text. The underlying idea is that the closer citations are to each other, the more likely it is that they are related. In comparison to existing approaches, such as bibliographic coupling, co-citation analysis, or keyword-based approaches, the advantages of CPA are higher precision and the possibility to identify related sections within documents. Moreover, CPA allows a more precise automatic document classification. CPA is used as the primary approach to analyse similarity and to classify the 1.2 million publications contained in the research paper recommender system Scienstein.org.

Introduction and Motivation: The search for related scientific work can be tedious, and important documents are often missed. Difficulties are caused by an increasing number of publications, growing exponentially at a yearly rate of 3.7 %, unclear nomenclature, synonyms, and numerous other factors [1]. In practice, most searches for related work start with some initial papers and navigate the citation web nearest to those papers. However, even the more advanced approaches for identifying related work, based on co-word analysis, collaborative filtering, Subject-Action-Object (SAO) structures, or citation analysis, often do not deliver satisfying results [2-8]. Therefore, we developed a new approach to determine the similarity of documents, which we name Citation Proximity Analysis (CPA). The approach is based on co-citation analysis and improves precision by considering the position of citations. The presented approach was developed for the research paper recommender Scienstein (www.scienstein.org, a research paper recommender focusing on identifying related work, developed by the authors) to assist researchers in finding related work.

The first part of this paper gives an overview of existing methods to identify similar documents, with a focus on the most popular citation analysis approaches and their strengths and weaknesses. The second part explains how CPA can be used to measure similarity and the steps necessary to calculate a new metric that we call the Citation Proximity Index (CPI). Afterwards, first results from an empirical study comparing the performance of co-citation analysis and CPA are presented. Finally, an outlook on further implications and on how CPA could be used in other fields is given.

Related Work: Various approaches exist to determine the degree of similarity of documents in order to identify related work. Whereas text-mining approaches are used in cases in which references are not stated, citation analysis approaches usually deliver superior results, as synonyms and unclear nomenclature do not lead to misleading results [3, 4, 5]. Many citation analysis approaches exist, and they all have their own strengths and weaknesses for identifying similar documents. Among the most widely used are the easily applicable 'cited by' approach, which considers papers as relevant that cite the same input document, and the 'reference list' approach, which considers papers as relevant that were referenced by the input document. The best results can usually be obtained by bibliographic coupling and co-citation analysis, which allow calculating the coupling strength [6]. These approaches, invented in the 1960s and 1970s, are used by scientists and on academic search engine websites like CiteSeer.
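The core idea of CPA, weighting co-citations by how close the two citations appear in the citing document's full text, can be sketched as follows. The proximity weights used here (same sentence 1.0, same paragraph 0.5, same document 0.25) and the data layout are illustrative assumptions for demonstration, not the exact CPI scheme from the paper:

```python
def citation_proximity_index(corpus, a, b):
    """Toy CPI: sum proximity weights over every co-citation of papers a and b.

    `corpus` maps each citing document to a list of (cited_id, paragraph, sentence)
    tuples, recording where in the full text each citation occurs. The weights
    below are illustrative assumptions, not the values from the paper.
    """
    score = 0.0
    for citations in corpus.values():
        locs_a = [(p, s) for cid, p, s in citations if cid == a]
        locs_b = [(p, s) for cid, p, s in citations if cid == b]
        for (pa, sa) in locs_a:
            for (pb, sb) in locs_b:
                if (pa, sa) == (pb, sb):
                    score += 1.0    # co-cited in the same sentence
                elif pa == pb:
                    score += 0.5    # co-cited in the same paragraph
                else:
                    score += 0.25   # co-cited anywhere in the same document
    return score
```

Classical co-citation analysis corresponds to the degenerate case where every co-citation in a document contributes the same weight regardless of position; CPA's refinement is precisely the position-dependent weighting.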

133 citations


Cited by
Posted Content
TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.
Abstract: Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These results highlight the importance of previously overlooked design choices, and raise questions about the source of recently reported improvements. We release our models and code.

13,994 citations

01 Jan 2009

3,235 citations

01 Jan 1985
TL;DR: In this paper, Meyrowitz shows how changes in media have created new social situations that are no longer shaped by where we are or who is "with" us, making it impossible for us to behave with each other in traditional ways.
Abstract: How have changes in media affected our everyday experience, behavior, and sense of identity? Such questions have generated endless arguments and speculations, but no thinker has addressed the issue with such force and originality as Joshua Meyrowitz in No Sense of Place. Advancing a daring and sophisticated theory, Meyrowitz shows how television and other electronic media have created new social situations that are no longer shaped by where we are or who is "with" us. While other media experts have limited the debate to message content, Meyrowitz focuses on the ways in which changes in media rearrange "who knows what about whom" and "who knows what compared to whom," making it impossible for us to behave with each other in traditional ways. No Sense of Place explains how the electronic landscape has encouraged the development of:
- More adultlike children and more childlike adults;
- More career-oriented women and more family-oriented men; and
- Leaders who try to act more like the "person next door" and real neighbors who want to have a greater say in local, national, and international affairs.
The dramatic changes fostered by electronic media, notes Meyrowitz, are neither entirely good nor entirely bad. In some ways, we are returning to older, pre-literate forms of social behavior, becoming "hunters and gatherers of an information age." In other ways, we are rushing forward into a new social world. New media have helped to liberate many people from restrictive, place-defined roles, but the resulting heightened expectations have also led to new social tensions and frustrations. Once taken-for-granted behaviors are now subject to constant debate and negotiation. The book richly explicates the quadruple pun in its title: Changes in media transform how we sense information and how we make sense of our physical and social places in the world.

1,361 citations

Journal ArticleDOI
TL;DR: A comprehensive classification of blockchain-enabled applications across diverse sectors such as supply chain, business, healthcare, IoT, privacy, and data management is presented, and key themes, trends and emerging areas for research are established.

1,310 citations

Journal ArticleDOI
TL;DR: A holistic framework which incorporates different components from IoT architectures/frameworks proposed in the literature, in order to efficiently integrate smart home objects in a cloud-centric IoT based solution is proposed.

1,003 citations