EUSES Spreadsheet Corpus


Many years of research on spreadsheet dependability have produced a wide variety of tools for helping end users create and maintain spreadsheets. However, very little work has been done to evaluate these tools against "real-world" spreadsheets, primarily because collecting and maintaining a suitable population of spreadsheets is difficult and expensive. To help mitigate this problem, we have collected a large sample of spreadsheets (5606 total files, 4498 of which are unique and suitable for automated processing in Excel) that researchers can use to evaluate their methodologies and tools for creating and maintaining spreadsheets.

Obtaining the corpus


The corpus is available to researchers in software engineering, end-user programming, human-computer interaction, or usability. To request it, email mfisher@cse.unl.edu with a description of the work you intend to do with the spreadsheets. Any work using this spreadsheet corpus should cite:

  • Marc Fisher II and Gregg Rothermel. The EUSES Spreadsheet Corpus: A shared resource for supporting experimentation with spreadsheet dependability mechanisms. In Proceedings of the 1st Workshop on End-User Software Engineering, pages 47-51, St. Louis, MO, USA, May 2005.

Copyright Notice


Marc Fisher II, Gregg Rothermel, the University of Nebraska-Lincoln, the EUSES Consortium, and the ESQuaReD Laboratory did not create the spreadsheets in this collection and make no claim to hold the copyright for any of them. All spreadsheet copyrights are assumed to be held by the original developers. The spreadsheets were, predominantly, found in publicly accessible locations, and it is therefore assumed that use of these spreadsheets for academic research is within the fair use guidelines set by US copyright law. If you suspect that a spreadsheet whose copyright you own is included in this collection and you wish to have it removed, please contact mfisher@cse.unl.edu with identifying information.

Corpus Uses


  • Robin Abraham and Martin Erwig. Inferring templates from spreadsheets. In Proceedings of the International Conference on Software Engineering, pages 182-191, Shanghai, China, May 2006.
  • Robin Abraham and Martin Erwig. Type inference for spreadsheets. In Proceedings of the Symposium on Principles and Practice of Declarative Programming, pages 73-84, Venice, Italy, July 2006.
  • Marc Fisher II, Gregg Rothermel, Tyler Creelan, and Margaret Burnett. Scaling a dataflow testing methodology to the multiparadigm world of commercial spreadsheets. In Proceedings of the 17th IEEE International Symposium on Software Reliability Engineering, pages 13-22, Raleigh, NC, USA, November 2006.
  • Robin Abraham and Martin Erwig. GoalDebug: A Spreadsheet Debugger for End Users. In Proceedings of the International Conference on Software Engineering, Minneapolis, MN, USA, May 2007 (to appear).
