PISA Data Visualisation Contest


The aim of the Data Visualisation Contest @ useR! 2014 is to show the potential of R for the analysis and visualisation of large and complex data sets.

Contest Tracks

You can take part in one of the two tracks related to the analysis of the data from the Programme for International Student Assessment (PISA) 2012. PISA is a worldwide study developed by the Organisation for Economic Co-operation and Development (OECD) which examines the skills of 15-year-old school students around the world. The study assesses students’ mathematics, science, and reading skills and contains a wealth of information on students’ backgrounds, their schools and the organisation of education systems. For most countries, the sample is around 5,000 students, but in some countries the number is even higher. In total, the PISA 2012 dataset contains data on 485,490 pupils.

This contest is meant to illustrate the wide range of analysis and visualisation tools that can be used with PISA. Value will be placed on submissions that creatively use the strengths of the PISA dataset within one of the two areas identified.

Submissions are welcomed in these two broad areas:

  • Track 1: Schools matter: the importance of school factors in explaining academic performance.
  • Track 2: Inequalities in academic achievement.

Each conference participant may submit at most one entry per track. Each submission must be related to the track topic.


Each submission should be sent as a separate email to: user2014contest@gmail.com.

The submission email should include the following information:

  • Name and contact information of the author. For group submissions, one person should be selected as ‘the principal investigator’. You can submit work ONLY IF you are the main author and you have full rights to the submitted work.
  • Attachment: a single png file with the visualisation. The resolution of the file should not be larger than 2048 x 1152 px. The visualisation should be related to one of the contest tracks and should contain graphs or analyses created with R. The png file can be produced directly by R or can be created using other tools (like Inkscape).
  • Attachment: one or more files containing the R code needed to replicate all graphs and analyses used in the visualisation.
  • All submissions need to be licensed under the Creative Commons BY-SA license.
  • Submitted works must be original. Previously published works will not be accepted.
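
A PNG within the stated size limit can be produced directly from R with the base png() device. The sketch below is only an illustration: the output path and the plotted numbers are placeholders, not part of the contest materials.

```r
# Minimal sketch: write a plot to a PNG at the maximum allowed resolution.
# The output path and the toy data are placeholders for illustration only.
out_file <- file.path(tempdir(), "submission.png")

# Open a PNG device sized in pixels (2048 x 1152 is the stated upper limit).
png(filename = out_file, width = 2048, height = 1152, units = "px")

plot(
  x = c(1, 2, 3, 4), y = c(480, 495, 510, 505),
  type = "b", xlab = "Group (toy data)", ylab = "Mean score (toy data)"
)

# Close the device so the file is actually written to disk.
invisible(dev.off())
```

Smaller width and height values also satisfy the limit; only the upper bound is fixed by the rules.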

Only submissions that are sent before midnight Sunday 29 June 2014 Pacific Daylight Time (UTC -7) will be considered.


Each submission will be judged using the following criteria:

  • Relevance to the track question [Is it related to the selected track topic? Is the story interesting?]
  • Visual simplicity along with data richness [How much information about student performance is presented? How many relations are explained by the visualisation?]
  • Narrative power [Is it easy to understand what this visualisation is about? Is it easy to understand why these results are interesting? Is it easy to understand how to read these results?]

The jury has three members. Jury members and their families cannot submit entries or co-author submissions.

Each category will be judged by all the members of the jury.

In each track, the winner will be the visualisation with the highest average score across the jury. In the event of a tie, the tied entries will be co-winners and will split the prize.

The jury decision is final.

Prize and awards

The prize for best visualisation in each track is 700 USD.

The prize is funded by the OECD.

Awards in this contest are the final decision of the jury and no other party, express or implied. The useR!2014 Organizing Committee, the useR!2014 Program Committee, and the Department of Statistics, UCLA, the hosting institution, assume no responsibility of any kind for the contest, materials submitted, or prizes awarded.

Winners will be announced during the useR! 2014 conference.

All submissions will be presented on the webpage: http://beta.icm.edu.pl/PISAcontest/

The dataset

The PISA 2012 dataset is available as an rda file here: http://beta.icm.edu.pl/PISAcontest/data. The main table is student2012; see the description below.
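
An rda file is read into R with load(), which restores the saved objects under their original names and returns those names. The sketch below uses a small stand-in file written to a temporary directory, because the exact file names on the download page are not given here; student_toy and its values are invented for illustration.

```r
# Toy stand-in for a downloaded rda file; the object name and values
# are invented and do not come from the contest data.
rda_path <- file.path(tempdir(), "toy.rda")
student_toy <- data.frame(CNT = c("AAA", "BBB"), PV1MATH = c(500, 450))
save(student_toy, file = rda_path)

# Load into a separate environment so the workspace is not overwritten.
env <- new.env()
loaded <- load(rda_path, envir = env)

# 'loaded' holds the names of the restored objects.
str(env[[loaded[1]]])
```

Loading into a dedicated environment is a design choice, not a requirement: calling load() without envir places the objects in the current workspace.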

Datasets in other formats and codebooks are available here: http://pisa2012.acer.edu.au/downloads.php.

A broad description of the methodology and the technical manuals are available here: http://www.oecd.org/pisa/pisaproducts/.

In student2012, pupil performance in mathematics, reading and science is coded by plausible values. You can find them in the columns PV1MATH-PV5MATH (for mathematics), PV1READ-PV5READ (for reading) and PV1SCIE-PV5SCIE (for science). For a given area, the five values PV1-PV5 are independent estimates of the student's performance in that area. For exploration it is fine to use only PV1.
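
Beyond exploration, the usual way to work with plausible values is to compute the statistic once for each of the five values and then average the five results, rather than averaging the values per student first. The following is a minimal sketch of that idea on an invented toy table; only the column names PV1MATH-PV5MATH are taken from the dataset description.

```r
# Toy stand-in for student2012; the scores are invented for illustration.
toy <- data.frame(
  PV1MATH = c(500, 450, 520, 610),
  PV2MATH = c(510, 440, 530, 600),
  PV3MATH = c(495, 455, 525, 605),
  PV4MATH = c(505, 445, 515, 615),
  PV5MATH = c(500, 460, 510, 620)
)

pv_cols <- paste0("PV", 1:5, "MATH")

# Compute the statistic (here a plain mean) once per plausible value.
pv_means <- sapply(toy[pv_cols], mean)

# Average the five results to obtain the final estimate.
math_mean <- mean(pv_means)
print(pv_means)
print(math_mean)
```

The sketch ignores sampling weights to keep the plausible-value step visible on its own.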

Because of the sampling scheme, each student has a different sampling weight. The final weights are in the variable W_FSTUWT.
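
As an illustration of how the weights enter a simple estimate, the sketch below compares an unweighted and a weighted mean using base R's weighted.mean(). The four rows are invented toy data; only the column names PV1MATH and W_FSTUWT come from the dataset description.

```r
# Toy stand-in for student2012; scores and weights are invented.
toy <- data.frame(
  PV1MATH  = c(500, 450, 520, 610),
  W_FSTUWT = c(10, 30, 20, 40)
)

# Unweighted mean treats every sampled student equally.
unweighted <- mean(toy$PV1MATH)

# Weighted mean lets each student count in proportion to the final weight.
weighted <- weighted.mean(toy$PV1MATH, w = toy$W_FSTUWT)

print(c(unweighted = unweighted, weighted = weighted))
```

The two numbers differ whenever the weights are unequal, which is why the weights matter for population-level statements.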

Items from the student questionnaire are in the student2012 dataset. Items from the school principal questionnaire are in the school2012 dataset. In some countries parents were asked to fill out a parent questionnaire; their answers are in the parent2012 dataset.
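
For analyses of school factors, student records usually need to be combined with the matching school record. The sketch below shows one way to do this with base R's merge(). It assumes that both tables share a country code column CNT and a school identifier column SCHOOLID; check the codebook before relying on these names. The toy tables and the school_size column are invented for illustration.

```r
# Toy stand-ins for student2012 and school2012. The key columns CNT and
# SCHOOLID are an assumption; school_size is a hypothetical school variable.
toy_student <- data.frame(
  CNT      = c("AAA", "AAA", "BBB"),
  SCHOOLID = c("0000001", "0000002", "0000001"),
  PV1MATH  = c(500, 450, 520),
  stringsAsFactors = FALSE
)
toy_school <- data.frame(
  CNT         = c("AAA", "AAA", "BBB"),
  SCHOOLID    = c("0000001", "0000002", "0000001"),
  school_size = c(300, 850, 420),
  stringsAsFactors = FALSE
)

# Join on both keys: in this toy data the same school ID appears in two
# countries, so the ID alone would not identify a school.
merged <- merge(toy_student, toy_school,
                by = c("CNT", "SCHOOLID"), all.x = TRUE)
print(merged)
```

Using all.x = TRUE keeps every student row even when no school record matches, so missing school data shows up as NA instead of silently dropping students.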



