Position Papers
In addition to attending the workshop meeting, workshop participants will be asked to come prepared with positions to share. The workshop organizers will ask each participant to write a short (one- or two-page) position paper in advance of the meeting. These papers will be collected and distributed to participants before the meeting, and will be included in the starting points for the ultimate workshop report.
The position papers are linked below, and arranged by panel.
You might also want to see the guidelines for writing position papers.
Networking
- Jeannie Albrecht. Achieving Experiment Repeatability on PlanetLab.
Abstract: The challenges associated with archiving experiments in wide-area testbeds depend largely on the capabilities of the testbed itself and the existence of monitoring and measurement tools. Before any experiments can be archived or repeated, experimenters need mechanisms for capturing the conditions within which the initial experiment is performed. In shared, wide-area testbeds, the relevant conditions that must be captured consist of several metrics including: 1) machine-specific properties, such as available CPU, memory, and disk space, 2) experiment-specific details, such as software or operating system versions, as well as 3) network-specific characteristics, such as pair-wise bandwidth and latency. Similarly, the amount of resource isolation and configuration supported by the testbed has a large impact on the archivability and repeatability of an experiment. In this paper I will highlight some of the challenges of achieving repeatable experiments on PlanetLab, a commonly used wide-area testbed, and discuss some possible ways in which these challenges can be addressed to provide archivable and repeatable experiments.
- Eric Eide. Toward Replayable Research in Networking and Systems.
Abstract: The increasing use of data repositories, testbeds, and experiment-management systems shows that the networking and systems research communities are moving in the direction of repeatability. We assert, however, that the goal of these communities should not be repeatable research, but “replayable” research. Beyond encapsulating the definition and history of an experiment, a replayable experiment is associated with a mechanism for actually re-executing a system under test. In this paper, we outline the challenges to be overcome in building an archive of replayable experiments in computer networking and systems research.
- Steve Schwab, John Wroclawski, Terry Benzel, Bob Braden, and Ted Faber. The Challenge of Repeatable Experiment Archiving – Lessons from DETER.
Abstract: To repeat experiments, the apparatus, procedures, and assumptions that underlie them must be captured. Current testbeds and researchers generally collect and archive this information in ad hoc ways, if at all. The DETER project aims to explicitly capture this information in ways that encourage repetition, reconstruction, composition, and reasoning about the experiments. Evolving experiment representations to meet these goals faces challenges of testbed implementation inertia, identifying and explicitly encoding aspects of experiments previously left implicit, and choosing the right granularity and encodings that make the system usable.
- Ivan Seskar. Capturing an Experiment: A Wireless Testbed Perspective.
Excerpt: Compared to other scientific research fields, such as the life sciences, a culture of archiving experiments was so far mostly absent from the networking field. Related to archiving, in most fields of science, an experimental result will only become accepted when it has been repeated by peers. In contrast, in networking research, even repeating results based on simulations is difficult due to dependencies on various software and hardware configurations that are hard to capture. In addition, issues specific to RF environment including radio propagation, interference, and complexity of new devices pose a unique set of challenges in archiving wireless networking experiments. In this paper, we discuss our experience in providing support for archiving and repeating experiments and related issues that arose during the 4.5 years of ORBIT testbed use.
Compilers
- Jack Davidson. Good News, Bad News.
Excerpt: The good news is that current technology trends and new tools have made it easier to share experimental setups with other researchers. The bad news is that there are still substantial barriers to doing good experimental computer science research.
- Amer Diwan and Robert Hundt. Repeatable, Reproducible, and Useful.
Excerpt: Compared to other sciences, it is more important for computer science to explicitly consider "usefulness" of its experiments. The reason for this is that other sciences (particularly natural) primarily explore phenomena that are already "out there"; in contrast, in computer science we create our own problems and thus there is a greater danger that we may create and solve problems that are irrelevant.
- Mary Hall. Advancing the Compiler Community’s Research Agenda with Archiving and Repeatability.
Excerpt: Significant advances in programming languages and compilers have come from experimental research that measures performance, analysis precision, reduction in energy requirements and other desirable features, and attempts to compare new approaches to the state of the art. Unfortunately, often it is infeasible for researchers other than the original developers to repeat these experiments, or provide valid comparisons with prior work, which has a profoundly negative impact on moving research ideas into practice. A 2007 NSF‐sponsored compiler research workshop found the lack of experimental repeatability was hampering the field (described in a February 2009 Communications of the ACM article). In the current era of large data centers, advances in data‐mining technology and virtualization, and increasingly powerful computers, we believe that technology advances could be applied to archiving experimental results and software systems in programming languages research to dramatically advance the field.
- David Padua. Compilers and Reproducibility.
Excerpt: In an ideal world, published papers would be accompanied by a repository containing the source code of the compiler or analysis tool used for the experiment as well as the source programs. In many cases, however, this would not suffice for reproducibility since the target machine and the runtime environment would also be needed for reproducibility when performance is the metric of quality.
Physics, Geophysics, and Astrophysics
- George B. Adams III. nanoHUB.org and the HUBzero Platform for Reproducible Computational Experiments.
Excerpt: nanoHUB.org, powered by the open-source HUBzero platform, provides capabilities to make it much easier to share code, reproduce computational results, and test those results than do approaches such as shared tarballs.
- Sergey Fomel. Reproducible research in computational geophysics.
Excerpt: Maintaining scientific publications together with the associated computational experiments provides robust regression testing of software modules while allowing the community to gradually accumulate a connected body of scientific knowledge. By sharing a common environment, the community also develops a network of scientific collaborations enabled by reproducible research.
- J. Daniel Gezelter. Open Science, Reproducible Experiments, and Experimental Archives.
Excerpt: Modern science has come to rely on computer simulations, computational models, and computational analysis of very large data sets. These methods for doing science are all reproducible in principle. For very simple systems and small data sets this is nearly the same as reproducible in practice. That is, a report in the literature can usually be reproduced starting from only the data and methodology stated in the paper. As systems become more complex and the data sets become large, calculations that are reproducible in principle are no longer reproducible in practice without public access to the code, input data, and meta-data.
Biomedical Imaging and Informatics
- Carey. Perspectives on reproducibility of genome scale data analysis.
Excerpt: It is widely recognized that reproducibility of research results in modern bioinformatics is important and difficult. Prominent reports have identified failures of reproducibility in various facets of major published research projects. In most cases, discovery of reproducibility shortfalls would not have been possible had not significant effort been made in the establishment and implementation of guidelines on the obligations of authors to publish primary genomic data. Despite this, three major difficulties must be acknowledged and confronted if common reproducibility shortfalls are to be eliminated: Experimental metadata must be available and should be tightly bound to primary data. Analysis workflows should be specified very clearly with definite identification of inputs, software components and their versions.... Genomic metadata (gene and assay platform annotations, function catalogues, genomic sequences) must be versioned and versions in use must be documented.
- Lewis Frey. Measurable Interoperability for Archival Data.
Excerpt: The field of biomedical informatics has a goal to archive data within scalable interoperable systems. With the generation of multiple forms of “-omics” data sets there are two issues that have direct bearing on archival data storage and use. The first is the ability to communicate a shared understanding of the data. An example of this is the work of Brazma et al. (2001) in developing standard representations for sharing microarray data. The second is the sheer volume of the data in which scalable methods for sharing the data must be implemented. The grid architecture work of Saltz et al. (2006) is an example of a system that provides a shared infrastructural means of communicating large volumes of data. The successful implementation of interoperable systems depends upon a solution that addresses these two issues of shared meaning and means of information communication.
- Fred Prior. Archiving Experiments Based on Radiological Images.
Excerpt: It is common clinical practice to archive radiological images to permit comparison with future studies, for example to determine disease progression or response to therapy. It is, however, not common to archive research images and metadata to either permit reuse of the image data for new experiments or an experiment to be repeated. One reason for this disparity is that clinical imaging studies are collected primarily for a human observer to visually inspect and form a clinical opinion. Minimal metadata is needed to permit this experiment to be repeated as it is not really critical that precisely the same measurement is taken, only that the clinician can see the appropriate anatomy. In contrast, to repeat a quantitative imaging experiment requires a rather extensive amount of information to be carefully collected and stored in addition to the imaging data itself.
Excerpt: The term in silico experiment broadly refers to an experiment carried out on computers using databases and/or computer simulations. In biomedical research, availability of an increasing array of high-throughput and high-resolution instruments has given rise to large datasets of “omics” data -- such as genomics, proteomics, metabolomics -- and imaging data -- such as radiology and microscopy imaging. These datasets provide highly detailed views of biological systems and functions. There are an increasing number of studies that either primarily focus on in silico experiments or involve them as a significant component of their studies. In this paper we present an example of multi-scale integrative in silico investigations and discuss some of the key requirements associated with archiving in a way to enable reusability and reproducibility of in silico experiments in such studies. Multi-scale integrative investigations attempt to measure, quantify and (in many cases) simulate biomedical phenomena in a way that takes into account multiple biological, spatial, and in some cases temporal scales. Data types include multiple types of microscopy imaging, CT, micro-CT, MR, molecular imaging, high throughput genomic analyses, and protein analyses.
Enabling Technologies
- Anita de Waard. Integrating Workflows With Semantic Publications.
Excerpt: The point of publishing innovation is to enable scientists to do better science, by improving the way they communicate. There are a great number of ongoing efforts that explore the use of semantic technologies and systems to enhance the scientific communication process; for some recent efforts in this area, see e.g. [1, 2]. However, most of these efforts start with looking at the finished manuscript, which reports the thoughts, results, and conclusions of an experiment after the fact. What is generally ignored is the preceding step: how experimental results are selected, described and depicted in a paper.
- Yolanda Gil. Scientific Reproducibility through Computational Workflows and Shared Provenance Representations.
Abstract: To enable scientific reproducibility, we have been investigating the use of computational workflows to capture decisions both in the design and in the execution of the workflow. We have succeeded at automating many aspects of workflow management, and at creating abstractions that facilitate sharing and reuse. We use semantic workflows to ensure valid reuse of the experimental method rendered as a workflow. I present some observations from our recent work in reproducing results in population genomics published in the literature through workflow reuse. I also introduce the work of the W3C Provenance Group on creating a broad understanding of the requirements for shared provenance representations to enable traceability, validation, and reproducibility.
- Puneet Kishor. Interoperability as a guiding principle for long-term archives.
Abstract: The only reason we store something in an archive is so we can take it out again. And once we take it out, more likely than not, we want to do something with it, usually by mixing it with other data. The ease with which that can be done is termed "interoperability." By removing expectations and obligations from the use of our data, we remove all barriers to their use. Reciprocally, having free data available to us enables us to continue in our scientific quest unhindered and unfettered by legal hurdles. CC0, a new protocol launched by Creative Commons, helps converge scientific data toward the public domain.
- Dennis Shasha. Repeatability & Workability for the Software Community: Challenges, Experiences, and the Future.
Abstract: It's much easier to test claims in computer science than in natural science. So, we decided to give it a try in the database conference ACM SIGMOD 2008. Specifically, the repeatability committee volunteered to assess the results of database experimental papers if authors chose to submit their code and data for the purpose. For SIGMOD 2009, the testing expanded to "workability" – varying data and parameters to see how the software behaves under conditions that are different from the ones the researchers reported. This short note describes what we have learned over the past two years, the challenges we have faced, ones we still face, and directions we suggest for the future.
Attachments
- dewaard.doc (30.5 kB) - Anita de Waard. Integrating Workflows With Semantic Publications. Added by EricEide <eeide@cs.utah.edu> on 05/03/10 09:11:19.
- dewaard.pdf (58.3 kB) - Anita de Waard. Integrating Workflows With Semantic Publications. Added by EricEide <eeide@cs.utah.edu> on 05/03/10 09:12:00.
- diwan.pdf (94.7 kB) - Amer Diwan and Robert Hundt. Repeatable, Reproducible, and Useful. Added by EricEide <eeide@cs.utah.edu> on 05/03/10 13:58:00.
- shasha.pdf (41.8 kB) - Dennis Shasha. Repeatability & Workability for the Software Community: Challenges, Experiences, and the Future. Added by EricEide <eeide@cs.utah.edu> on 05/03/10 16:54:38.
- carey_genomescale.pdf (76.5 kB) - Carey Perspectives on reproducibility of genome scale data analysis. Added by stvjc@channing.harvard.edu on 05/10/10 08:48:00.
- kishor-position_paper.pdf (72.2 kB) - Puneet Kishor. 2010. Interoperability as a guiding principle for long-term archives. Position paper for Archive '10, the NSF Workshop on Archiving Experiments to Raise Scientific Standards, May 2010, Salt Lake City, Utah. Added by EricEide <eeide@cs.utah.edu> on 05/10/10 11:05:36.
- GoodNewsBadNews.pdf (38.0 kB) - Added by jwd@virginia.edu on 05/17/10 18:43:59.
- eide.pdf (96.2 kB) - Eric Eide. Toward Replayable Research in Networking and Systems. Added by EricEide <eeide@cs.utah.edu> on 05/17/10 22:19:56.
- prior.docx (18.6 kB) - Fred Prior. Archiving Experiments Based on Radiological Images. Added by EricEide <eeide@cs.utah.edu> on 05/17/10 23:05:18.
- prior.pdf (64.9 kB) - Fred Prior. Archiving Experiments Based on Radiological Images. Added by EricEide <eeide@cs.utah.edu> on 05/17/10 23:05:44.
- hall.pdf (64.3 kB) - Mary Hall. Advancing the Compiler Community’s Research Agenda with Archiving and Repeatability. Added by EricEide <eeide@cs.utah.edu> on 05/18/10 12:04:53.
- gezelter.pdf (102.3 kB) - J. Daniel Gezelter. Open Science, Reproducible Experiments, and Experimental Archives. Added by EricEide <eeide@cs.utah.edu> on 05/18/10 13:41:48.
- schwab.doc (30.0 kB) - Stephen Schwab. The Challenge of Repeatable Experiment Archiving – Lessons from DETER. Added by EricEide <eeide@cs.utah.edu> on 05/19/10 22:38:14.
- schwab.pdf (77.6 kB) - Stephen Schwab. The Challenge of Repeatable Experiment Archiving – Lessons from DETER. Added by EricEide <eeide@cs.utah.edu> on 05/19/10 22:38:46.
- padua.pdf (102.9 kB) - David Padua. Compilers and Reproducibility. Added by EricEide <eeide@cs.utah.edu> on 05/20/10 10:02:24.
- adams.pdf (79.8 kB) - My position paper. Added by gba@purdue.edu on 05/21/10 14:57:10.
- paper.pdf (62.7 kB) - Position statement. Added by jeannie@cs.williams.edu on 05/23/10 07:35:01.
- seskar.pdf (116.7 kB) - Ivan Seskar. Capturing an Experiment: A Wireless Testbed Perspective. Added by EricEide <eeide@cs.utah.edu> on 05/23/10 11:18:10.
- integrative-insilico-experiments.pdf (149.5 kB) - Added by anonymous on 05/23/10 14:11:48.
- Gil-Archive10.pdf (75.8 kB) - Yolanda Gil position paper. Added by gil@isi.edu on 05/23/10 18:17:41.
- PositionPaperFrey.pdf (17.6 kB) - Measurable Interoperability for Archival Data. Added by lewis.frey@hsc.utah.edu on 05/24/10 15:19:21.
- fomel.pdf (77.3 kB) - Added by sergey.fomel@gmail.com on 05/25/10 03:27:02.
