By Peitsa Veteli

As well as making data from particle physics research public, the CMS experiment has developed additional tools on GitHub for use in schools.

“It's exciting to use actual coding with real data from scientific experiments. I wish there was more of it in school.” – High-school student who had a taste of Open Data in their classroom.

Our modern world is drowning in data. Huge amounts are produced every second by governmental institutions, scientific experiments and other endeavours. This raises an interesting opportunity for educators worldwide: could we use those very sets of data and familiarize our young with the ways of handling information in the world of tomorrow?


Especially in formal education, it has long been known and debated that there is a disconnect between what is taught at schools (or how it is taught) and what actually goes on in today's scientific research. Simplified experiments that can be done on a classroom table by taking half a dozen clean data points are classics for a reason, but do they really contribute much towards understanding the methods and challenges of the complex experimental efforts of the modern era? Computers and increasingly intricate algorithms are required as we quest ever deeper into nature's secrets, and the old division into separate school subjects doesn't really help. As programming is integral to conducting any serious research and to advancing the well-being of our societies, it should also be present in classrooms as a tool to be used and appreciated, to be learned and not feared. Bridging the everyday life of students with the methods of science is now easier than ever, as they can train these skills with real-life data sets in interesting contexts, be it medical statistics or high-end physics experiments.

For some years, several experiments at CERN have pioneered Open Data in high-energy physics. Data from cutting-edge particle physics research, measured with the most accurate machinery mankind has ever produced, is publicly available and just a few clicks away from any teacher, student or curious aficionado who wants to make use of it. The CMS experiment has gone even further: in addition to providing the original data from the experiment, since 2014 we have developed tools and simplified data sets that can be used at schools on ordinary laptops, and everything is freely available on GitHub. No previous coding experience is needed to get started.

The main material bank can be found in the English folder, but it's constantly expanding as the project continues. Some other ready-to-use exercises also exist in Spanish, German, Finnish and Greek. CMS uses Jupyter notebooks (flexible, interactive, website-like programming tools) with Python as the language of choice. Try one out.
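To give a flavour of what a single notebook cell might look like, here is a minimal sketch in plain Python. It is not taken from the CMS materials: the function name `invariant_mass` and the two example events are made up for illustration, and the numbers are not real CMS data. It assumes energies and momenta are given in GeV with natural units (c = 1), and computes the invariant mass of a pair of particles from their energies and momentum components.

```python
import math


def invariant_mass(e1, px1, py1, pz1, e2, px2, py2, pz2):
    """Invariant mass of a two-particle system, in natural units (c = 1).

    Hypothetical helper for illustration: M^2 = (E1 + E2)^2 - |p1 + p2|^2.
    """
    e = e1 + e2
    px = px1 + px2
    py = py1 + py2
    pz = pz1 + pz2
    m_squared = e**2 - (px**2 + py**2 + pz**2)
    # Rounding can push m_squared slightly below zero; clamp before the root.
    return math.sqrt(max(m_squared, 0.0))


# Made-up example events (NOT real CMS data), each listed as
# (E1, px1, py1, pz1, E2, px2, py2, pz2) in GeV.
events = [
    # Two massless particles flying back to back.
    (45.6, 45.6, 0.0, 0.0, 45.6, -45.6, 0.0, 0.0),
    # Two massless particles flying in the same direction.
    (10.0, 0.0, 0.0, 10.0, 20.0, 0.0, 0.0, 20.0),
]

masses = [invariant_mass(*event) for event in events]

for mass in masses:
    print(f"Invariant mass: {mass:.1f} GeV")
```

In a classroom setting the same calculation would typically be run over many recorded events and the resulting masses plotted as a histogram, but the idea stays this small.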

These materials have been tested and developed with many science teachers and enthusiastic students. The results and reactions have been almost universally positive, with many enjoying the possibility to “do science” with real methods and data, as well as the ease of use, the lack of which is a common stumbling block for digital learning aids.


Find out more about CMS’s resources and about the Open Data project in general:
