Wednesday 4 February 2015

New Year's digest on collaborative & reproducible research

This list is aggregated from public and private messages and from my own web browsing. Don't hesitate to send me links via our public mailing list or LinkedIn group (so that your contribution can be acknowledged):
https://groups.google.com/forum/#!forum/collective-mind
http://www.linkedin.com/groups/Reproducible-research-experimentation-in-computer-7433414


=== Misc articles ===

* "Research Wranglers: Initiatives to Improve Reproducibility of Study Findings"
  http://ehp.niehs.nih.gov/122-a188

* Dennis McCafferty, "Should Code be Released?",
  Communications of the ACM, October 2010, Vol. 53, No. 10, DOI:10.1145/1831407.1831415
  http://dl.acm.org/citation.cfm?id=1831415

* Chris Drummond, "Replicability is not Reproducibility: Nor is it Good Science"
  Proc. of the Evaluation Methods for Machine Learning Workshop
  at the 26th ICML, Montreal, Canada, 2009.
  Copyright: National Research Council of Canada
  http://cogprints.org/7691/7/ICMLws09.pdf

* Science is in a reproducibility crisis - how do we resolve it?
  http://theconversation.com/science-is-in-a-reproducibility-crisis-how-do-we-resolve-it-16998

* My blog article on "Automatic performance tuning and reproducibility as a side effect"
  for the Software Sustainability Institute:
  http://www.software.ac.uk/blog/2014-07-22-automatic-performance-tuning-and-reproducibility-side-effect

* Puzzling Measurement of "Big G" Gravitational Constant Ignites Debate
  http://www.scientificamerican.com/article/puzzling-measurement-of-big-g-gravitational-constant-ignites-debate-slide-show/

* White House takes notice of reproducibility in science, and wants your opinion
  http://retractionwatch.com/2014/09/05/white-house-takes-notice-of-reproducibility-in-science-and-wants-your-opinion/

* Problems during performance benchmarking:
** https://homes.cs.washington.edu/~bornholt/post/performance-evaluation.html

   We also experienced many similar issues during our work on auto-tuning
   and machine learning (see the timing sketch below):
** http://hal.inria.fr/hal-01054763
** http://arxiv.org/abs/1406.4020
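
One recurring pitfall discussed in the post above: reporting a single timing
hides run-to-run variance caused by frequency scaling, cache state and OS
noise. Below is a minimal Python sketch (my own illustration, not code from
any of the linked papers) of the safer practice of warming up, repeating
measurements and reporting a distribution; workload() is a hypothetical
placeholder kernel.

import statistics
import time

def workload():
    # Hypothetical placeholder standing in for the real benchmark.
    s = 0
    for i in range(1000000):
        s += i * i
    return s

def benchmark(func, warmup=3, repetitions=30):
    # Discard warm-up runs so cache and frequency effects settle first.
    for _ in range(warmup):
        func()
    # Collect repeated measurements instead of trusting a single run.
    times = []
    for _ in range(repetitions):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return times

times = benchmark(workload)
# Report the distribution, not one number: the spread itself tells you
# how (non-)reproducible the measurement is across runs.
print("min    %.4f s" % min(times))
print("median %.4f s" % statistics.median(times))
print("stdev  %.4f s" % statistics.stdev(times))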

* ACM SIGOPS Operating Systems Review - Special Issue on Repeatability
  and Sharing of Experimental Artifacts:
  http://dl.acm.org/citation.cfm?id=2723872

* Vinton G. Cerf, "Avoiding 'Bit Rot': Long-Term Preservation of Digital Information" [Point of View],
  Proceedings of the IEEE, Vol. 99, No. 6, 2011:
  http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=5768098

* Less related though interesting (about citations):
  http://www.nature.com/news/the-top-100-papers-1.16224

=== Future events ===

* February 9, 2015, CGO/PPoPP joint session on artifact evaluation experience
  San Francisco, 17:15 - 17:35
  Grigori Fursin and Bruce Childers

* November 1-4, 2015, Dagstuhl Perspectives Workshop
  "Artifact Evaluation for Publications"
  Bruce Childers, Shriram Krishnamurthi, Grigori Fursin, Andreas Zeller

* March 13-18, 2016, Dagstuhl Seminar 16111
  "Rethinking Experimental Methods in Computing"

=== Past events ===

* Oct 27-30, 2014, Washington DC, US
  "1st International Workshop on Collaborative methodologies to Accelerate
  Scientific Knowledge discovery in big data (CASK) 2014"

  In conjunction with 2014 IEEE International Conference on Big Data
  (IEEE BigData 2014)
 
  http://bigscientificdata.org/cask14

* September 1, 2014: special journal issue on reproducible research methodologies
  in IEEE Transactions on Emerging Topics in Computing (TETC):

  http://www.occamportal.org/images/reproduce/TETC-SI-REPRODUCE.pdf

* January 2015: ACM SIGOPS Operating Systems Review,
  Special Issue on Repeatability and Sharing of Experimental Artifacts:

  http://www.sigops.org/osr.html

=== Journals/Conferences with reproducible articles ===
* IPOL Journal: Image Processing On Line
  http://www.ipol.im

=== Tools ===
* NGS pipelines - integrates pipelines and user interfaces
  to help biologists analyse data produced by biological
  applications such as RNA-seq, sRNA-seq, ChIP-seq and BS-seq:
  https://mulcyber.toulouse.inra.fr/projects/ngspipelines

* Skoll: A Process & Infrastructure for Distributed, Continuous Quality Assurance:
  http://www.cs.umd.edu/projects/skoll/Skoll/Home.html

* NEPI: Simplifying network experimentation:
  http://nepi.inria.fr

* rr (Mozilla project): records nondeterministic executions and debugs them deterministically:
  http://rr-project.org

* Burrito: Rethinking the Electronic Lab Notebook
  http://pgbovine.net/burrito.html

* Collective Knowledge (cTuning v4): our tool and repository for sharing code and data as reusable components to support collaborative and reproducible R&D (see the API sketch below):
  http://github.com/ctuning/ck
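
As a taste of how CK exposes shared components programmatically, here is a
minimal Python sketch. It assumes the 'ck' package is installed from the
repository above; the single access() entry point is CK's convention, but
the exact dictionary keys shown ('lst', 'data_uoa') are assumptions that
should be checked against the current documentation.

import ck.kernel as ck

# CK routes every operation through one entry point: access() takes a
# dict describing the action and returns a dict with a 'return' code
# (0 on success) plus action-specific output.
r = ck.access({'action': 'list', 'module_uoa': 'module'})
if r['return'] > 0:
    raise RuntimeError(r.get('error', 'CK call failed'))

# Assumed output format: a 'lst' key holding one dict per shared
# component, each with a 'data_uoa' alias.
for entry in r.get('lst', []):
    print(entry.get('data_uoa', ''))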

=== Online workflows ===

* RunMyCode:
  http://www.runmycode.org

* AptLab:
  https://www.aptlab.net

=== Projects ===
* OpenLab:
  http://www.ict-openlab.eu

* EU RECODE project:
  http://recodeproject.eu/events/upcoming-events

* CERN Open Data portal:
  http://opendata.cern.ch

* Research Data Alliance:
  https://rd-alliance.org

* Open Data Institute:
  http://opendatainstitute.org

=== Online archives/repos ===

* Olive Archive (preserving executable content):
  https://olivearchive.org

* Tera-PROMISE:
  http://openscience.us/repo

* OpenAIRE (CERN):
  https://www.openaire.eu

* Zenodo:
  https://zenodo.org

* ResearchCompendia:
  http://researchcompendia.org

* Internet Archive:
  https://archive.org

* The National Archives (UK):
  http://www.nationalarchives.gov.uk

* Wikidata:
  http://www.wikidata.org/wiki/Wikidata:Introduction

* The Digital Preservation Network:
  http://www.dpn.org

* Open datasets (EU Open Data Portal):
  https://open-data.europa.eu

* DataHub:
  http://datahub.io

* DataCite: assigns DOIs to research data to make it citable (based in Germany; has connections with CERN). See the DOI lookup sketch after this list:
  https://www.datacite.org/contact

* CrossRef:
  http://www.crossref.org

* International DOI Foundation:
  http://www.doi.org

* Our new pilot Collective Knowledge repository:
  http://cknowledge.org/repo
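
To illustrate the DOI-based citation services above (DataCite, CrossRef,
doi.org): both DataCite and CrossRef support DOI content negotiation, so
citation metadata can be fetched by resolving a DOI with an appropriate
Accept header. A minimal Python sketch follows; the DOI value below is a
made-up placeholder, so substitute a real one before running it.

import urllib.request

# Placeholder DOI (hypothetical) -- replace with a real dataset DOI,
# e.g. one minted by Zenodo or another DataCite member.
doi = '10.5281/zenodo.0000000'

# Ask doi.org for BibTeX via content negotiation; CrossRef and
# DataCite both honour the 'application/x-bibtex' media type.
req = urllib.request.Request('https://doi.org/' + doi,
                             headers={'Accept': 'application/x-bibtex'})
with urllib.request.urlopen(req) as response:
    print(response.read().decode('utf-8'))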
