Showing papers by "J. Stephen Downie" published in 2016


Journal ArticleDOI
14 Oct 2016
TL;DR: It is found that user‐generated interpretations always outperformed lyrics in terms of classification accuracy, suggesting that user interpretations are more useful in the subject classification task than lyrics because the semantically ambiguous poetic nature of lyrics tends to confuse classifiers.
Abstract: That music seekers consider song subject metadata to be helpful in their searching/browsing experience has been noted in prior published research. In an effort to develop a subject-based tagging system, we explored the creation of automatically generated song subject classifications. Our classifications were derived from two different sources of song-related text: 1) lyrics; and 2) user interpretations of lyrics collected from songmeanings.com. While both sources contain subject-related information, we found that user-generated interpretations always outperformed lyrics in terms of classification accuracy. This suggests that user interpretations are more useful in the subject classification task than lyrics because the semantically ambiguous poetic nature of lyrics tends to confuse classifiers. An examination of top-ranked terms and confusion matrices supported our contention that users' interpretations work better for detecting the meaning of songs than what is conveyed through lyrics.

10 citations


Proceedings Article
01 Jan 2016
TL;DR: A comprehensive informetric study of the publication, authorship and citation characteristics of female researchers in the context of the ISMIR conferences shows that the percentage of lead female authors has not improved over the years, but more papers have appeared with female coauthors in very recent years.
Abstract: The Music Information Retrieval (MIR) community is becoming increasingly aware of a gender imbalance evident in ISMIR participation and publication. This paper reports upon a comprehensive informetric study of the publication, authorship and citation characteristics of female researchers in the context of the ISMIR conferences. All 1,610 papers in the ISMIR proceedings written by 1,910 unique authors from 2000 to 2015 were collected and analyzed. Only 14.1% of all papers were led by female researchers. Temporal analysis shows that the percentage of lead female authors has not improved over the years, but more papers have appeared with female coauthors in very recent years. Topics and citation numbers are also analyzed and compared between female and male authors to identify research emphasis and to measure impact. The results show that the most prolific authors of both genders published similar numbers of ISMIR papers and that the citation counts of lead authors of both genders did not differ significantly. We also analyzed the collaboration patterns to discover whether gender is related to the number of collaborators. Implications of these findings are discussed and suggestions are proposed on how to continue encouraging and supporting female participation in the MIR field.

9 citations


Proceedings ArticleDOI
19 Jun 2016
TL;DR: This paper compares the MADSRDF/MODSRDF, Bibframe, schema.org, BIBO, and FaBiO ontologies by assessing their suitability for employment by the HTRC to meet scholars' needs.
Abstract: The HathiTrust Research Center (HTRC) is engaged in the development of tools that will give scholars the ability to analyze the HathiTrust digital library's 14-million-volume corpus. A cornerstone of the HTRC's digital infrastructure is the workset -- a kind of scholar-built research collection intended for use with the HTRC's analytics platform. Because more than 66% of the digital corpus is subject to copyright restrictions, scholarly users remain dependent upon the descriptive accounts provided by traditional metadata records in order to identify and gather together bibliographic resources for analysis. This paper compares the MADSRDF/MODSRDF, Bibframe, schema.org, BIBO, and FaBiO ontologies by assessing their suitability for employment by the HTRC to meet scholars' needs. These include distinguishing among multiple versions of the same work; representing the complex historical and physical relationships among those versions; and identifying and providing access to finer grained bibliographic entities, e.g., poems, chapters, sections, and even smaller segments of content.

9 citations




Proceedings ArticleDOI
19 Jun 2016
TL;DR: This paper engineers the output of semantic analysis so that it is suitable for direct import into existing digital library metadata and index structures, and can thus be incorporated without the need for architecture modifications.
Abstract: Most existing digital libraries use traditional lexically-based retrieval techniques. For established systems, completely replacing, or even making significant changes to, the document retrieval mechanism (document analysis, indexing strategy, query processing and query interface) would require major technological effort, and would most likely be disruptive. In this paper, we describe ways to use the results of semantic analysis and disambiguation, while retaining an existing keyword-based search and lexicographic index. We engineer the output of semantic analysis (performed off-line) so that it is suitable for direct import into existing digital library metadata and index structures, and can thus be incorporated without the need for architecture modifications.

5 citations


Journal ArticleDOI
14 Oct 2016
TL;DR: This paper reports on the subset of findings, from a qualitative study of the challenges facing disability services providers in U.S. post‐secondary institutions, that addresses challenges to providing, sharing, and reusing accessible digital content.
Abstract: The population of students with disabilities in post-secondary institutions is significant and rising. The U.S. Department of Education reports that 11% of students, or more than two million students, in post-secondary education report having a disability. Providing accessible versions of materials for courses is a core service of disability-services offices in schools. Finding, obtaining, or generating accessible course content is a challenging process for disability-services providers at institutions ranging from community colleges to research universities, many of which receive hundreds of individualized requests for content each semester. Although a range of sources and services to assist in this process have emerged, they are insufficient and inefficient because they keep people from working together on a complex, shared problem. In the summer of 2015, we conducted a qualitative study of the challenges facing disability services providers in U.S. post-secondary institutions, in order to design and implement information systems that would enable large-scale sharing of locally improved, accessible course content with qualified students in the U.S. This paper reports on the subset of our findings that addresses challenges to providing, sharing, and reusing accessible digital content. Our findings suggest that there are substantial opportunities for the LIS and library communities to apply our expertise to this gap in information services for an expanding population of students.

4 citations


Proceedings ArticleDOI
12 Aug 2016
TL;DR: An overview of the entire J-DISC dataset is provided and some exemplar analyses across this dataset are presented to better illustrate the kinds of uses that musicologists could make of this collection.
Abstract: J-DISC, a specialized digital library for information about jazz recording sessions that includes rich structured and searchable metadata, has the potential to support a wide range of studies on jazz, especially the musicological work of those interested in the social network aspects of jazz creation and production. This paper provides an overview of the entire J-DISC dataset. It also presents some exemplar analyses across this dataset to better illustrate the kinds of uses that musicologists could make of this collection. Our illustrative analyses include both informetric and network analyses of the entire J-DISC dataset, which comprises data on 2,711 unique recording sessions associated with 3,744 distinct artists, including such influential jazz figures as Dizzy Gillespie, Don Byas, Charlie Parker, John Coltrane and Kenny Dorham. Our analyses also show that around 60% of the recording sessions included in J-DISC were recorded in New York City, Englewood Cliffs (NJ), Los Angeles (CA) and Paris during the years 1923 to 2011. Furthermore, the top venues captured in the J-DISC data include Rudy Van Gelder Studio, Birdland and Reeves Sound Studios. The potential research uses of the J-DISC data in both the DL (Digital Libraries) and MIR (Music Information Retrieval) domains are also briefly discussed.

3 citations