Telling tallies
However, compiling this data was a particularly tedious task. The foremost challenge in compiling school data for the world was deciding what counts as a primary or a secondary school, or as a school at all. One government’s classification might differ vastly from another’s.
To ensure consistency, the data was aligned with the International Standard Classification of Education (ISCED 2011) framework. Where direct categorisation was not possible, the team classified the schools according to local government definitions.
Once the categories were set, the next problem was mismatches and inconsistencies across sources. In many cases, government figures contradicted those of independent organisations or even those of other departments within the same government. In these cases, priority was given to ‘official’ government data over independently produced data. In other cases, government data was missing, and the data scientists had to sift through data from private organisations to fill the gap.
The process had its surprises. Countries rich in resources often lacked high-quality data. For example, data for the United States was much more difficult to find than data for Sierra Leone, which had impeccable data despite being a much less prosperous country.
“Data has its own biases, and this problem manifested itself during the school mapping project,” said Sunstone’s CEO, Jan Grønbech. “We have to be humble when it comes to that. How many schools are there in North Korea? We don’t know. So we get that data from South Korea. We don’t know whether that’s accurate or not.”
Attempts have been made to count schools before this. Perhaps the most successful is Giga, a project under UNICEF, which aims to locate and map every school in the world. It has mapped the locations of 2.2 million schools out of an estimated total of 6 million worldwide. Giga also assesses each school’s level of internet connectivity.