Citation Analysis of Research Articles in Music Information Retrieval

 

By Adam Wead

 

 

Music information retrieval is a new and growing field of study.  Between the first International Symposium on Music Information Retrieval (ISMIR) in 2000 and the fourth ISMIR meeting in October 2003, the output of research papers has tripled.  While there have been a few attempts at summarizing the current state of research in music IR, there have been no investigations into which particular areas of music IR research are the most active.  This brief paper attempts to address that question.

 

Background

 

This paper is the result of an independent study conducted during the summer of 2003.  The initial project was to create a sufficiently large set of bibliographic data on research papers in music or audio information retrieval and, using citation analysis, to discern which papers were the most relevant to the field.  Search terms such as “music information retrieval” and “audio information retrieval” were used to query the CiteSeer web citation database and retrieve a relevant set of bibliographic information, including the number of citations of each paper, weighted by year.[1] The 50 most highly cited articles retrieved by each query were stored in a MySQL database.  Next, the bibliographic references for each article in the most recent ISMIR conference, in this case ISMIR 2002, were added to the database to ensure that the most recent research articles were included in the data set.  After duplicate references were removed, the resulting database of 69 articles was displayed through an Apache web server, configured with PHP for easy manipulation of the data.  Finally, a Perl script was used to query CiteSeer on a weekly basis with the bibliographic information of each article and update its citation count.
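The merging and de-duplication step described above can be sketched roughly as follows.  This is a minimal illustration in Python, not the code used in the study (which relied on MySQL, PHP, and Perl).  The matching rule (normalized title plus year), the function names, and the sample records are all assumptions invented for the example.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Reduce a title to a lowercase, punctuation-free key for comparison."""
    # Strip accents so differently encoded copies of a title compare equal.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace every run of non-alphanumeric characters with a single space.
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()


def merge_references(*sources):
    """Merge reference lists, keeping the first copy of each title/year pair."""
    merged = {}
    for source in sources:
        for ref in source:
            # Assumption: two records with the same normalized title and
            # year describe the same article.
            key = (normalize_title(ref["title"]), ref["year"])
            merged.setdefault(key, ref)
    return list(merged.values())


# Hypothetical records, invented for illustration only.
query_results = [
    {"title": "An Example Paper on Melody Matching", "year": 2000},
    {"title": "A Second Example: Audio Features", "year": 2001},
]
proceedings_refs = [
    {"title": "an example paper on melody matching.", "year": 2000},
    {"title": "A Third Example Paper", "year": 2002},
]

unique_refs = merge_references(query_results, proceedings_refs)
print(len(unique_refs))  # 3: the two copies of the first title collapse into one
```

Matching on title and year alone is a deliberately simple rule; a real data set would likely also need author comparison and some tolerance for misspelled titles.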

 

Defining Areas of Research

 

Each article was reviewed and assigned a particular category of research.  Some categories were predetermined based on traditional categories of text IR.  These categories were: retrieval, indexing, representation, clustering, classification, background, and overview.  Other categories were created to accommodate unique articles or groups of articles sharing a particular theme.  These other categories were: user interface, transcription and analysis, and processing.  Categories were further divided into subcategories where needed; in some cases subcategories were not needed.  Clustering, indexing, and processing, for example, have no subcategories.  The subcategories were used to define different named areas of research in a given category.  The most diverse category was retrieval, containing 5 subcategories, followed by user interface with 4, and representation with 2.

 

For the purposes of this project, the criteria for placing an article in a given category were determined as follows.  The overview category contained articles that attempted to encapsulate the music IR field, much the same way this paper is attempting to do.  Background articles were not directly related to music IR but presented fundamental concepts, issues, and research methods used in other music IR related studies.  Retrieval concerned systems, methods, and theories of content-based music search and retrieval, in both aural and visual formats.  The representation category covered articles that discussed how music data can be represented in a digital format for searching and indexing; it did not cover the indexing itself.  User interface articles dealt with anything relating to user interface, or approached music IR issues from a user interface perspective.  This means that if an article dealt with search and retrieval but adopted a user interface approach, it was placed in the user interface category and not in retrieval.  Classification and clustering were two related categories dealing with the grouping of musical material into genres or styles, the difference being that classification relies on predefined categories and clustering does not.  Indexing covered issues of music indexing for IR purposes, and processing dealt with different ways in which audio signals may be processed to extract relevant and searchable content.
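The tie-breaking rule for articles that fit more than one category can be expressed as a simple precedence list.  The sketch below is hypothetical: only the placement of user interface ahead of retrieval comes from the criteria above, while the rest of the ordering, the function name, and the idea of tagging each article with a set of themes are assumptions made for illustration.

```python
# Categories are checked in order; the first match wins.
# Only "user interface" before "retrieval" is stated in the text.
# The remaining order is an assumption made for this example.
PRECEDENCE = [
    "user interface",
    "retrieval",
    "representation",
    "classification",
    "clustering",
    "indexing",
    "processing",
    "transcription and analysis",
    "background",
    "overview",
]


def assign_category(themes):
    """Return the highest-precedence category found in an article's themes."""
    for category in PRECEDENCE:
        if category in themes:
            return category
    return None  # no known category applies


# An article on search and retrieval written from a user interface
# perspective is placed under user interface, not retrieval.
print(assign_category({"retrieval", "user interface"}))
```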

 

The Results

 

Overall, the results were conclusive based on the citation data and the means by which the articles were grouped together for this study (see Appendix A).  The retrieval subcategories were the most diverse and the most under-cited, indicating that this area of music IR is very active with new research but has yet to reveal any definitive leaders.  The most robust of the music retrieval systems, such as MIDI-based systems and systems that rely on some form of text representation of music, had the highest citation counts, indicating that these methods are more established than other, more experimental systems.  Retrieval of polyphonic music contained the second largest number of articles, while retrieval by other means had the most.  These two points highlight the varied approaches of music IR and suggest that the retrieval of polyphony is the most significant challenge in the future of music IR.

 

In terms of age, papers related to user interface represented the newest area of research.  Almost all the papers on user interface topics in music IR came from the 2002 ISMIR proceedings.  Because these papers are so new, CiteSeer did not have any relevant citation data available, but this is apt to change in the coming months as new citation data is made available.  At the time of this writing, however, it is impossible to hypothesize what the citation data could mean for this area of music IR research.

 

Digital music representation was less varied than retrieval, but had the second highest overall number of articles.  Research in this area had consistently high citation counts, indicating that musical data representation is another important area of research.  This is not surprising, because many other areas of music IR research, such as indexing, depend on accurate ways of representing musical data.  Since representation is still an active area of research without a clear leading method, indexing is not yet possible on a large scale, hence the lack of current research in that category.

 

The areas of classification and clustering represent another very active area of music IR research, and classification could perhaps be considered the leading field at this time.  Two empirical reasons for this conclusion are that the classification category contains the article with the highest number of citations and that this category averaged the highest number of citations per article.  However, the statistical data can be misleading and should not be used as the sole test of which research areas are more important than others.  Another possible explanation is that, in general, classification is more useful at present than retrieval, given that most audio retrieval methods are in experimental stages of development and methods of audio classification are more robust than their search and retrieval counterparts.
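The per-category figures discussed here (number of articles, highest citation count, and average citations per article) can be computed as in the following sketch.  The category names and citation counts are hypothetical values chosen for illustration and are not data from this study.  They are picked so that two categories share the same average while differing sharply in their highest count, which is one way the averages can mislead.

```python
from statistics import mean


def summarize(counts_by_category):
    """Compute total articles, highest count, and average count per category."""
    summary = {}
    for category, counts in counts_by_category.items():
        # Assumes every category contains at least one article.
        summary[category] = {
            "total": len(counts),
            "highest": max(counts),
            "average": round(mean(counts), 1),
        }
    return summary


# Hypothetical citation counts, invented for illustration only.
counts = {
    "category A": [20, 3, 1, 0, 0],  # one heavily cited article
    "category B": [5, 5, 5, 5, 4],   # evenly cited articles
}

for category, stats in summarize(counts).items():
    print(category, stats)
```

Both hypothetical categories average 4.8 citations per article, yet the first owes that figure almost entirely to a single paper.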

 

Conclusions and Implications for Further Research

 

The results of studying citation data on articles relating to music and audio information retrieval point to three principal areas of research: music retrieval, representation, and classification.  Classification emerges as a “leader” in terms of research interest; in other words, people in general are most interested in content-based music classification.  Music retrieval is a more varied area with a lot of research activity, but no clear leading results.  Music data representation exhibits the same pattern of activity without definitive results, indicating that representation and, consequently, retrieval are still issues that need to be addressed in the field of music IR.

 

Further research in citation analyses of music information retrieval research papers should include a larger document set and a more rigorous set of categories and criteria.  Also, citation analysis of sources used in each article could be employed to build a corpus of common sources used by all articles and a core bibliography for others to use and reference.

 

References

 

Bibliographic citations to all articles used in this study can be found online at:

http://ella.slis.indiana.edu:8088/mir/index.php

 

 

 


Appendix A: Summary of Citation Results

 

Category                     Total     Highest     Average[2]

Retrieval                    22
    MIDI                     3         17          7
    Query-by-hum             1         0           0
    Polyphony                5         5           1.6
    Text-based               3         14          10
    Other                    10        15          2.8

Representation               10
    Summary                  3         0           0
    Other                    7         15

User Interface               9
    Input
        Preprocessing        1         0           0
        Segmentation         2         0           0
    Classification           5         3           1.2
    Usability                1         0           0

Processing                   5         8           3
Classification               5         31          6.2
Clustering                   5         5           1
Transcription/Analysis       3         3           1
Indexing                     2         0           0

 

 



[1] The CiteSeer website (http://www.citeseer.com/) calculates citation counts both as raw scores, i.e., the total number of citations for a given paper, and as a year-weighted count, so that citations from more recent papers count more than citations from older papers.

[2] Since many of these papers are very new, citation counts often yield zero results; therefore, statistics for average counts are probably not a reliable indicator of research activity.