EUSES Spreadsheet Corpus
Many years of research in the domain of spreadsheet dependability have produced a wide variety of tools for helping end users create and maintain spreadsheets. However, very little work has been done to evaluate these tools against "real-world" spreadsheets. The primary reason is the difficulty and expense of collecting and maintaining a suitable population of spreadsheets. To help mitigate this problem, we have collected a large sample of spreadsheets (5607 total files, 4499 of which are unique and suitable for automated processing in Excel) that researchers can use to evaluate their methodologies and tools for creating and maintaining spreadsheets.
Obtaining the corpus
To obtain the corpus, you must be a researcher in the field of software engineering, end-user programming, human-computer interaction, or usability. If so, email firstname.lastname@example.org with a description of the work you are interested in doing with the spreadsheets. Any work using this spreadsheet corpus should cite:
Marc Fisher II, Gregg Rothermel, the University of Nebraska - Lincoln, the EUSES Consortium, and ESQuaReD Laboratory did not create the spreadsheets in this collection and make no claims to holding the copyright for them. All spreadsheet copyrights are assumed to be held by the original developer. The spreadsheets were, predominantly, found in publicly accessible locations, and it is therefore assumed that use of these spreadsheets for academic research is within fair use guidelines set by US copyright law. If you suspect that a spreadsheet whose copyright you own is included in this collection, and you wish it removed or your ownership acknowledged, please contact email@example.com with identifying information.
We would like to thank ACBA (UK) LTD (http://www.acba.co.uk/) for providing a spreadsheet for the corpus.