Wednesday 4 February 2015

New Year's digest on collaborative & reproducible research

This list is aggregated from public and private messages and from my own web browsing. Don't hesitate to send me links via our public mailing list or LinkedIn group (so that your contribution can be acknowledged):
https://groups.google.com/forum/#!forum/collective-mind
http://www.linkedin.com/groups/Reproducible-research-experimentation-in-computer-7433414


=== Misc articles ===

* "Research Wranglers: Initiatives to Improve Reproducibility of Study Findings"
  http://ehp.niehs.nih.gov/122-a188

* Dennis McCafferty, "Should Code be Released?",
  Communications of the ACM, October 2010, Vol. 53, No. 10, DOI:10.1145/1831407.1831415
  http://dl.acm.org/citation.cfm?id=1831415

* Chris Drummond, "Replicability is not Reproducibility: Nor is it Good Science"
  Proc. of the Evaluation Methods for Machine Learning Workshop
  at the 26th ICML, Montreal, Canada, 2009.
  Copyright: National Research Council of Canada
  http://cogprints.org/7691/7/ICMLws09.pdf

* Science is in a reproducibility crisis - how do we resolve it?
  http://theconversation.com/science-is-in-a-reproducibility-crisis-how-do-we-resolve-it-16998

* My blog article on "Automatic performance tuning and reproducibility as a side effect"
  for the Software Sustainability Institute:
  http://www.software.ac.uk/blog/2014-07-22-automatic-performance-tuning-and-reproducibility-side-effect

* Puzzling Measurement of "Big G" Gravitational Constant Ignites Debate
  http://www.scientificamerican.com/article/puzzling-measurement-of-big-g-gravitational-constant-ignites-debate-slide-show/

* White House takes notice of reproducibility in science, and wants your opinion
  http://retractionwatch.com/2014/09/05/white-house-takes-notice-of-reproducibility-in-science-and-wants-your-opinion/

* Problems during performance benchmarking:
** https://homes.cs.washington.edu/~bornholt/post/performance-evaluation.html

   We also experienced many similar issues during our work on auto-tuning
   and machine learning (see the timing sketch below):
** http://hal.inria.fr/hal-01054763
** http://arxiv.org/abs/1406.4020
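
One recurring pitfall discussed in the post above: reporting a single timing
hides run-to-run variance caused by frequency scaling, cache state and OS
noise. Below is a minimal Python sketch (my own illustration, not code from
any of the linked papers) of the safer practice of warming up, repeating
measurements and reporting a distribution; workload() is a hypothetical
placeholder kernel.

import statistics
import time

def workload():
    # Hypothetical placeholder standing in for the real benchmark.
    s = 0
    for i in range(1000000):
        s += i * i
    return s

def benchmark(func, warmup=3, repetitions=30):
    # Discard warm-up runs so cache and frequency effects settle first.
    for _ in range(warmup):
        func()
    # Collect repeated measurements instead of trusting a single run.
    times = []
    for _ in range(repetitions):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return times

times = benchmark(workload)
# Report the distribution, not one number: the spread itself tells you
# how (non-)reproducible the measurement is across runs.
print("min    %.4f s" % min(times))
print("median %.4f s" % statistics.median(times))
print("stdev  %.4f s" % statistics.stdev(times))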

* ACM SIGOPS Operating Systems Review - Special Issue on Repeatability
  and Sharing of Experimental Artifacts:
  http://dl.acm.org/citation.cfm?id=2723872

* Vinton G. Cerf, "Avoiding 'Bit Rot': Long-Term Preservation of Digital Information" [Point of View],
  Proceedings of the IEEE, Vol. 99, No. 6, 2011:
  http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=5768098

* Less related though interesting (about citations):
  http://www.nature.com/news/the-top-100-papers-1.16224

=== Future events ===

* February 9, 2015, CGO/PPoPP joint session on artifact evaluation experience
  San Francisco, 17:15 - 17:35
  Grigori Fursin and Bruce Childers

* November 1-4, 2015, Dagstuhl Perspectives Workshop
  "Artifact Evaluation for Publications"
  Bruce Childers, Shriram Krishnamurthi, Grigori Fursin, Andreas Zeller

* March 13-18, 2016, Dagstuhl Seminar 16111
  "Rethinking Experimental Methods in Computing"

=== Past events ===

* Oct 27-30, 2014, Washington DC, US
  "1st International Workshop on Collaborative methodologies to Accelerate
  Scientific Knowledge discovery in big data (CASK) 2014"

  In conjunction with 2014 IEEE International Conference on Big Data
  (IEEE BigData 2014)
 
  http://bigscientificdata.org/cask14

* September 1, 2014: special journal issue on reproducible research methodologies
  in IEEE Transactions on Emerging Topics in Computing (TETC):

  http://www.occamportal.org/images/reproduce/TETC-SI-REPRODUCE.pdf

* January 2015: ACM SIGOPS Operating Systems Review,
  Special Issue on Repeatability and Sharing of Experimental Artifacts:

  http://www.sigops.org/osr.html

=== Journals/Conferences with reproducible articles ===
* IPOL Journal: Image Processing On Line
  http://www.ipol.im

=== Tools ===
* NGS pipelines - integrates pipelines and user interfaces
  to help biologists analyse data produced by biological
  applications such as RNA-seq, sRNA-seq, ChIP-seq and BS-seq:
  https://mulcyber.toulouse.inra.fr/projects/ngspipelines

* Skoll: A Process & Infrastructure for Distributed, Continuous Quality Assurance:
  http://www.cs.umd.edu/projects/skoll/Skoll/Home.html

* NEPI: Simplifying network experimentation:
  http://nepi.inria.fr

* rr (Mozilla project): records nondeterministic executions and debugs them deterministically:
  http://rr-project.org

* Burrito: Rethinking the Electronic Lab Notebook
  http://pgbovine.net/burrito.html

* Collective Knowledge (cTuning v4): our tool and repository for sharing code and data as reusable components to support collaborative and reproducible R&D (see the API sketch below):
  http://github.com/ctuning/ck
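
As a taste of how CK exposes shared components programmatically, here is a
minimal Python sketch. It assumes the 'ck' package is installed from the
repository above; the single access() entry point is CK's convention, but
the exact dictionary keys shown ('lst', 'data_uoa') are assumptions that
should be checked against the current documentation.

import ck.kernel as ck

# CK routes every operation through one entry point: access() takes a
# dict describing the action and returns a dict with a 'return' code
# (0 on success) plus action-specific output.
r = ck.access({'action': 'list', 'module_uoa': 'module'})
if r['return'] > 0:
    raise RuntimeError(r.get('error', 'CK call failed'))

# Assumed output format: a 'lst' key holding one dict per shared
# component, each with a 'data_uoa' alias.
for entry in r.get('lst', []):
    print(entry.get('data_uoa', ''))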

=== Online workflows ===

* RunMyCode:
  http://www.runmycode.org

* AptLab:
  https://www.aptlab.net

=== Projects ===
* OpenLab:
  http://www.ict-openlab.eu

* EU RECODE project:
  http://recodeproject.eu/events/upcoming-events

* CERN Open Data portal:
  http://opendata.cern.ch

* Research Data Alliance:
  https://rd-alliance.org

* Open Data Institute:
  http://opendatainstitute.org

=== Online archives/repos ===

* Olive Archive (preserving executable content):
  https://olivearchive.org

* Tera-PROMISE:
  http://openscience.us/repo

* OpenAIRE (CERN):
  https://www.openaire.eu

* Zenodo:
  https://zenodo.org

* ResearchCompendia:
  http://researchcompendia.org

* Internet Archive:
  https://archive.org

* The National Archives (UK):
  http://www.nationalarchives.gov.uk

* Wikidata:
  http://www.wikidata.org/wiki/Wikidata:Introduction

* The Digital Preservation Network:
  http://www.dpn.org

* Open datasets (EU Open Data Portal):
  https://open-data.europa.eu

* DataHub:
  http://datahub.io

* DataCite: assigns DOIs to research data to make it citable (based in Germany; has connections with CERN). See the DOI lookup sketch after this list:
  https://www.datacite.org/contact

* CrossRef:
  http://www.crossref.org

* International DOI Foundation:
  http://www.doi.org

* Our new pilot Collective Knowledge repository:
  http://cknowledge.org/repo
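
To illustrate the DOI-based citation services above (DataCite, CrossRef,
doi.org): both DataCite and CrossRef support DOI content negotiation, so
citation metadata can be fetched by resolving a DOI with an appropriate
Accept header. A minimal Python sketch follows; the DOI value below is a
made-up placeholder, so substitute a real one before running it.

import urllib.request

# Placeholder DOI (hypothetical) -- replace with a real dataset DOI,
# e.g. one minted by Zenodo or another DataCite member.
doi = '10.5281/zenodo.0000000'

# Ask doi.org for BibTeX via content negotiation; CrossRef and
# DataCite both honour the 'application/x-bibtex' media type.
req = urllib.request.Request('https://doi.org/' + doi,
                             headers={'Accept': 'application/x-bibtex'})
with urllib.request.urlopen(req) as response:
    print(response.read().decode('utf-8'))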
