cTuning foundation - enabling collaborative and reproducible AI/ML/computer systems/quantum R&D: 2015

Monday 16 November 2015

Join our experiment on public discussion of ADAPT'16 paper submissions!

Dear colleagues,

As part of our ongoing initiative towards open, collaborative and reproducible computer systems research, we cordially invite you to participate in the public pre-reviewing of papers submitted to ADAPT'16 (the 6th International Workshop on Adaptive, Self-tuning Computing Systems), co-located with HiPEAC'16.

Each submission is now available on arXiv and has a separate Reddit discussion thread here:

Note that several papers have shared artifacts (benchmarks, data sets, models) to help you validate presented techniques and even build upon them!

Please feel free to comment on these papers, exchange ideas, reproduce results, suggest extensions, note missing references and related work, etc. We hope that such public pre-reviewing will speed up the dissemination of novel ideas while letting authors actively engage in discussions and improve their open articles before the final review by the ADAPT Program Committee!

You can find more details about this publication model at http://adapt-workshop.org/motivation2016.html

Looking forward to your participation!

Thursday 24 September 2015

Further posts in the new blog

Dear colleagues,

From now on, I plan to publish posts related to collaborative, systematic and reproducible computer engineering on my startup's blog:
http://dividiti.blogspot.com

Take care,
Grigori

Thursday 9 July 2015

Nice story about Rutherford and a student

Sir Ernest Rutherford, President of the Royal Academy, and recipient of the Nobel Prize in Physics, related the following story:

Some time ago I received a call from a colleague. He was about to give a student a zero for his answer to a physics question, while the student claimed a perfect score. The instructor and the student agreed to an impartial arbiter, and I was selected. I read the examination question: "Show how it is possible to determine the height of a tall building with the aid of a barometer." The student had answered: "Take the barometer to the top of the building, attach a long rope to it, lower it to the street, and then bring it up, measuring the length of the rope. The length of the rope is the height of the building." The student really had a strong case for full credit since he had really answered the question completely and correctly! On the other hand, if full credit were given, it could well contribute to a high grade in his physics course and certify competence in physics, but the answer did not confirm this. I suggested that the student have another try.

I gave the student six minutes to answer the question with the warning that the answer should show some knowledge of physics. At the end of five minutes, he hadn't written anything. I asked if he wished to give up, but he said he had many answers to this problem; he was just thinking of the best one. I excused myself for interrupting him and asked him to please go on. In the next minute, he dashed off his answer, which read: "Take the barometer to the top of the building and lean over the edge of the roof. Drop the barometer, timing its fall with a stopwatch. Then, using the formula x=0.5*a*t^2, calculate the height of the building."

At this point, I asked my colleague if he would give up. He conceded, and gave the student almost full credit. While leaving my colleague's office, I recalled that the student had said that he had other answers to the problem, so I asked him what they were. "Well," said the student, "there are many ways of getting the height of a tall building with the aid of a barometer. For example, you could take the barometer out on a sunny day and measure the height of the barometer, the length of its shadow, and the length of the shadow of the building, and by the use of simple proportion, determine the height of the building." "Fine," I said, "and others?"

"Yes," said the student, "there is a very basic measurement method you will like. In this method, you take the barometer and begin to walk up the stairs. As you climb the stairs, you mark off the length of the barometer along the wall. You then count the number of marks, and this will give you the height of the building in barometer units. A very direct method. Of course, if you want a more sophisticated method, you can tie the barometer to the end of a string, swing it as a pendulum, and determine the value of g [gravity] at the street level and at the top of the building. From the difference between the two values of g, the height of the building, in principle, can be calculated. On this same tack, you could take the barometer to the top of the building, attach a long rope to it, lower it to just above the street, and then swing it as a pendulum. You could then calculate the height of the building by the period of the precession." "Finally," he concluded, "there are many other ways of solving the problem. Probably the best," he said, "is to take the barometer to the basement and knock on the superintendent's door. When the superintendent answers, you speak to him as follows: 'Mr. Superintendent, here is a fine barometer. If you will tell me the height of the building, I will give you this barometer.'"

At this point, I asked the student if he really did not know the conventional answer to this question. He admitted that he did, but said that he was fed up with high school and college instructors trying to teach him how to think.

The name of the student was Niels Bohr.

(Niels Bohr, 1885-1962: Danish physicist; Nobel Prize 1922; best known for proposing a model of the atom with a central nucleus and discrete energy states for the surrounding electrons, the familiar icon of a small nucleus circled by three elliptical orbits, and, more significantly, a pioneer of quantum theory.)
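As an aside, several of the student's answers reduce to short formulas. Here is a minimal Python sketch of three of them: the free-fall drop, the shadow proportion, and the rope swung as a simple pendulum. The function names and sample measurements are hypothetical, air resistance is ignored, and g is assumed to be a constant 9.81 m/s^2:

```python
import math

G = 9.81  # m/s^2, assumed constant gravitational acceleration


def height_from_fall_time(t_seconds, a=G):
    """Free fall from rest, no air resistance: x = 0.5 * a * t^2."""
    return 0.5 * a * t_seconds ** 2


def height_from_shadows(barometer_height, barometer_shadow, building_shadow):
    """Simple proportion: building / building_shadow = barometer / barometer_shadow."""
    return barometer_height * building_shadow / barometer_shadow


def rope_length_from_period(period_seconds, g=G):
    """Simple pendulum, small swings: T = 2*pi*sqrt(L/g), so L = g*T^2 / (4*pi^2)."""
    return g * period_seconds ** 2 / (4 * math.pi ** 2)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    print("drop, 3.0 s fall:", height_from_fall_time(3.0), "m")
    print("shadows:", height_from_shadows(0.3, 0.2, 30.0), "m")
    print("pendulum, 13.0 s period:", rope_length_from_period(13.0), "m")
```

In practice each method is limited by how well the input is measured: the drop method, for example, is very sensitive to stopwatch error because the height grows with the square of the fall time.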

Monday 16 February 2015

Artifact Evaluation Experience presentation online

I presented our Artifact Evaluation Experience at CGO'15/PPoPP'15. The presentation is now available online: http://www.slideshare.net/GrigoriFursin/presentation-fursin-aecgoppopp2015

Overall, the feedback is positive and we plan to continue AE for CGO'16/PPoPP'16.

Our main task is to improve guidelines for artifact submission and reviewing.

We will continue validating our new publication model for ADAPT'16

For a few years we have been promoting a new publication model where papers and related material are submitted to open access archives, then publicly discussed via Slashdot and Reddit, and only then validated and selected by the program committee.

Though we did not have participants for this publication model at our ADAPT'15, we had considerable interest and some colleagues are willing to participate in our ADAPT'16 ...

Furthermore, one of the papers made it to Slashdot, generating considerable feedback and thus supporting our idea. By the way, we just noticed that a similar approach is proposed in other sciences!

Therefore, we plan to continue validating our new publication model at ADAPT'16 - please stay tuned!

Highest ranked artifacts for CGO'15/PPoPP'15

We would like to congratulate the authors of the following 2 highest-ranked artifacts from CGO/PPoPP'15:

1st place (sponsored by Nvidia)
"The SprayList: A scalable relaxed priority queue"
Justin Kopinsky, Dan Alistarh, Jerry Li and Nir Shavit

received prize "Nvidia Quadro K6000"

2nd place (sponsored by cTuning Foundation)
"A graph-based higher-order intermediate representation"
Roland Leißa, Marcel Köster and Sebastian Hack

received prize "Acer C720P"

Sunday 8 February 2015

ADAPT'15 outcome and new publication model for ADAPT'16

A few words about ADAPT'15 outcome:

* The final program with all PDFs is available online here.

* The following paper received Nvidia best paper award (Tesla K40):

A Self-adaptive Auto-scaling Method for Scientific Applications on HPC Environments and Clouds
Kiran Mantripragada¹, Alecio Binotto¹ and Leonardo Tizzei²
¹ IBM Research - Brazil
² IBM Brazil

* We had a very interesting discussion about our new open publication model. In spite of some possible issues, there seems to be support for trying it at ADAPT'16. Interestingly, we just found out that a very similar model is proposed for other scientific fields (see this blog article). Furthermore, we just found the following public discussion on Slashdot about one of the ADAPT'15 papers, which supports our idea (as a researcher, you normally publish to present your work to a broad community, initiate discussions and get feedback to improve your work, unless it's purely for academic promotion reasons).

Please follow our announcements about ADAPT'16 (it will likely be co-located with HiPEAC'16 in Prague and will likely feature the new publication model).

Anaconda Scientific Python Distribution

I recently discovered the Anaconda Scientific Python Distribution. It contains all the libraries needed by the Collective Knowledge Framework that I use for auto-tuning, statistical analysis and predictive analytics, so I wanted to share it with you:

https://store.continuum.io/cshop/anaconda

Wednesday 4 February 2015

New year's digest on collaborative & reproducible research

This list was aggregated from public and private messages and from web browsing. Don't hesitate to send me links via our public mailing list or LinkedIn group (so that you can be acknowledged):
https://groups.google.com/forum/#!forum/collective-mind
http://www.linkedin.com/groups/Reproducible-research-experimentation-in-computer-7433414

=== Misc articles ===

* "Research Wranglers: Initiatives to Improve Reproducibility of Study Findings"
http://ehp.niehs.nih.gov/122-a188

* Dennis McCafferty, "Should Code be Released?",
Communications of the ACM, 2010/10, Vol.53, No.10, DOI:10.1145/1831407.1831415
http://dl.acm.org/citation.cfm?id=1831415

* Chris Drummond, "Replicability is not Reproducibility: Nor is it Good Science"
Proc. of the Evaluation Methods for Machine Learning Workshop
at the 26th ICML, Montreal, Canada, 2009.
Copyright: National Research Council of Canada
http://cogprints.org/7691/7/ICMLws09.pdf

* Science is in a reproducibility crisis - how do we resolve it?
http://theconversation.com/science-is-in-a-reproducibility-crisis-how-do-we-resolve-it-16998

* My blog article on "Automatic performance tuning and reproducibility as a side effect"
for the Software Sustainability Institute:
http://www.software.ac.uk/blog/2014-07-22-automatic-performance-tuning-and-reproducibility-side-effect

* Puzzling Measurement of "Big G" Gravitational Constant Ignites Debate
http://www.scientificamerican.com/article/puzzling-measurement-of-big-g-gravitational-constant-ignites-debate-slide-show/

* White House takes notice of reproducibility in science, and wants your opinion
http://retractionwatch.com/2014/09/05/white-house-takes-notice-of-reproducibility-in-science-and-wants-your-opinion/

* Problems during performance benchmarking:
** https://homes.cs.washington.edu/~bornholt/post/performance-evaluation.html

We also experienced many similar issues during our work on auto-tuning and machine learning:
* http://hal.inria.fr/hal-01054763
* http://arxiv.org/abs/1406.4020

* ACM SIGOPS Operating Systems Review - Special Issue on Repeatability
and Sharing of Experimental Artifacts:
http://dl.acm.org/citation.cfm?id=2723872

* Vinton G. Cerf. "Bit Rot: Long-Term Preservation of Digital Information" [Point of View]
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=5768098

* Less related though interesting (about citations):
http://www.nature.com/news/the-top-100-papers-1.16224

=== Future events ===

* February 9, 2015, CGO/PPoPP joint session on artifact evaluation experience
San Francisco, 17:15 - 17:35
Grigori Fursin and Bruce Childers

* November 1-4, 2015, Dagstuhl Perspective Workshop
"Artifact Evaluation for Publications"
Bruce Childers, Shriram Krishnamurthi, Grigori Fursin, Andreas Zeller

* March 13-18, 2016, Dagstuhl Seminar 16111
"Rethinking Experimental Methods in Computing"

=== Past events ===

* Oct 27-30, 2014, Washington DC, US
"1st International Workshop on Collaborative methodologies to Accelerate
Scientific Knowledge discovery in big data (CASK) 2014"

In conjunction with 2014 IEEE International Conference on Big Data
(IEEE BigData 2014)

http://bigscientificdata.org/cask14

* September 1, 2014: Special journal issue on reproducible research methodologies
in IEEE Transactions on Emerging Topics in Computing (TETC).

http://www.occamportal.org/images/reproduce/TETC-SI-REPRODUCE.pdf

* January 2015:

ACM SIGOPS Operating Systems Review
Special Issue on Repeatability and Sharing
of Experimental Artifacts

http://www.sigops.org/osr.html

=== Journals/Conferences with reproducible articles ===
* IPOL Journal: Image Processing On Line
http://www.ipol.im

=== Tools ===
* NGS pipelines - integrates pipelines and user interfaces
to help biologists analyse data output by biological
applications such as RNAseq, sRNAseq, ChipSeq, BS-seq:
https://mulcyber.toulouse.inra.fr/projects/ngspipelines

* Skoll: A Process & Infrastructure for Distributed, Continuous Quality Assurance
http://www.cs.umd.edu/projects/skoll/Skoll/Home.html

* NEPI: Simplifying network experimentation:
http://nepi.inria.fr

* RR (Mozilla project): records nondeterministic executions and debugs them deterministically
http://rr-project.org

* Burrito: Rethinking the Electronic Lab Notebook
http://pgbovine.net/burrito.html

* Collective Knowledge (cTuning v4): our tool and repository to simplify code and data sharing as reusable components (for collaborative and reproducible R&D):
http://github.com/ctuning/ck

=== Online workflows ===

* RunMyCode:
http://www.runmycode.org

* AptLab:
https://www.aptlab.net

=== Projects ===
* OpenLab:
http://www.ict-openlab.eu

* EU Recode project
http://recodeproject.eu/events/upcoming-events

* CERN: opendata
http://opendata.cern.ch

* Research Data Alliance:
https://rd-alliance.org

* Open Data Institute:
http://opendatainstitute.org

=== Online archives/repos ===

* Olive Archive (preserving executable content):
https://olivearchive.org

* Tera-PROMISE:
http://openscience.us/repo

* OpenAire (CERN)
https://www.openaire.eu

* Zenodo:
https://zenodo.org

* ResearchCompendia:
http://researchcompendia.org

* Internet Archive:
https://archive.org

* The National Archives:
http://www.nationalarchives.gov.uk

* WikiData:
http://www.wikidata.org/wiki/Wikidata:Introduction

* The Digital Preservation Network:
http://www.dpn.org

* Open datasets:
https://open-data.europa.eu

* DataHub:
http://datahub.io

* Datacite: citing data as DOI (Germany, has connections with CERN)
https://www.datacite.org/contact

* CrossRef:
http://www.crossref.org

* International DOI Foundation
http://www.doi.org

* Our new pilot Collective Knowledge repository:
http://cknowledge.org/repo

Saturday 3 January 2015

CGO'15/PPoPP'15 Artifact Evaluation Results

We finished evaluating artifacts for accepted CGO'15/PPoPP'15 papers.

The list of papers with validated artifacts is now available online:
* http://cTuning.org/event/ae-cgo2015
* http://cTuning.org/event/ae-ppopp2015

It was a really interesting, though not always straightforward, experience.
Therefore, we will be discussing it and presenting ideas to improve
future AE at the joint CGO/PPoPP session on the 9th of February, 2015.

More details to follow ...