Evaluating school classification tagging schemes for the United States
Posted by Minh Nguyen on 9 January 2022 in English.
A recent id-tagging-schema pull request introduces presets specific to Germany that would make it easier for mappers there to tag schools by type based on locally understood terminology. It’s a great idea, one that hopefully will be extended to other countries like the United States in due time. Some Americans will see a “Kindergarten” preset and intuitively expect a complementary “Elementary School” preset too.
School type presets wouldn’t just be a matter of convenience to mappers. Local maps of the U.S. conventionally distinguish between different types of schools, namely between elementary schools, middle schools, high schools, and colleges and universities. I’d like to eventually see OpenStreetMap-based maps that make the same distinctions.
School classification has long been a tricky subject in OSM. There are at least three documented, machine-readable schemes for classifying schools:
school=* and grades=* focus on characteristics that are verifiable on the ground. For Americans, there are many reasonable sources for these keys, including signs, websites, and process of elimination. But the downside is that they’re subject to local variation: school=* would be applied differently from one school district to another, and grades=* is uniform but only within the U.S.
The Oak Hall School system in Gainesville, Florida, divides its schools at the most common grade levels but uses uncommon names for the schools. Deriving school classifications from name=* would be impractical due to idiosyncrasies like this. (© 2009 Ebyabe, CC BY 2.5)
isced:level=* has been promoted as a more uniform, globally applicable alternative to both school=* and grades=*. However, the application of ISCED levels also varies by country. In some countries, the definition is objective and fits well with the local educational system. But in other countries, the levels are much less relevant to how people normally classify schools.
In the U.S., the lower ISCED levels are officially derived from grade numbers, while the higher levels are based on the type of academic degree offered. Levels 0 through 3 are divided at thresholds that are unfamiliar to most Americans, such as “secondary school” (level 3) being limited to grades 10 through 12. In 2011, the higher levels were redefined for the U.S., making any existing isced:level=5/6 tags ambiguous. I don’t know if further refinements to ISCED are planned, but it should concern us that the levels can simply be redefined independently of any changes to the U.S. educational system or to schools on the ground.
Teays Valley High School in Ashville, Ohio, is one of over 16,000 public high schools in the U.S. that serve students from 9th grade through 12th grade and thus straddle ISCED levels 2 and 3. (© 2013 Aesopposea, CC BY-SA 3.0)
To better understand the relationship between ISCED levels and the U.S. educational system, I analyzed a pair of nationwide, authoritative datasets of 99,338 public schools and 22,440 private schools published by the National Center for Education Statistics (NCES) for the 2019–20 and 2017–18 school years, respectively. These datasets include schools offering instruction up to 13th grade and some adult education schools, but they do not include postsecondary institutions. I exported the tables as Excel spreadsheets, mapped the grade spans to ISCED levels, and added pivot tables to get aggregate statistics. I’ve posted the resulting spreadsheets to Google Drive, though they’re not much more sophisticated than what you’d export directly from NCES:
Among the 121,778 schools known to NCES:
- 108,983 schools (89%) have verifiable grade numbers. (Most of the remaining private schools do not assign students to grade levels, while most of the remaining public schools did not report their grade levels.)
- The ISCED levels exactly correspond to only 2,558 schools (2.1%):
  - Level 3 is the best-fitting level among public schools, corresponding to 521 schools (0.52%). These are generally senior high schools in districts that have spun off a separate junior high school, which would not fit any ISCED level. The poorest-fitting level is level 1, corresponding to 117 schools (0.12%).
  - Level 0 is the best-fitting level among private schools, corresponding to 2,097 schools (9.3%). Private preschools and kindergartens are very common in some states. The poorest-fitting level is level 3, which doesn’t correspond to a single private school in the entire U.S.
- A mere 12,561 schools (10%) can be tagged with a single isced:level value; 80,610 (66%) must be tagged with two values, 18,896 (16%) with three values, and 7,649 (6.2%) with four values.
- Ten distinct values of isced:level are possible. The most common values would be isced:level=0;1 (44,359 or 36%), followed by isced:level=2;3 (21,904 or 18%) and isced:level=1;2 (14,347 or 12%). The least common value would be isced:level=3 (1,016 or 0.83%).
It’s fine to combine multiple ISCED levels in the same tag. After all, most schools would have multiple values in grades too. Rather, the problem is that it isn’t possible to reliably derive the colloquial school classifications from the ISCED levels: for example, isced:level=1;2 can’t distinguish between a 1–8 elementary school and a 6–8 middle school. These terms are used inconsistently from school district to school district, but the map would be expected to reflect this inconsistency.
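The grade-to-level translation discussed above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet logic behind the statistics: the thresholds (prekindergarten and kindergarten as level 0, grades 1 through 6 as level 1, grades 7 through 9 as level 2, grades 10 through 12 as level 3) are an assumption inferred from the examples in this post, the function names are made up for the sketch, and 13th grade and ungraded schools are left out.

```python
from collections import defaultdict

# Grades are encoded as integers: prekindergarten = -1, kindergarten = 0, then 1..12.
# ASSUMPTION: thresholds are inferred from the examples in this post
# (level 3 = grades 10-12; a 9-12 school straddles levels 2 and 3).
def isced_level_of_grade(grade: int) -> int:
    if grade <= 0:
        return 0
    if grade <= 6:
        return 1
    if grade <= 9:
        return 2
    return 3

def isced_value(low: int, high: int) -> str:
    """Return a semicolon-delimited isced:level value for grades low..high."""
    levels = sorted({isced_level_of_grade(g) for g in range(low, high + 1)})
    return ";".join(str(level) for level in levels)

# Enumerate every contiguous grade span from PK to 12, grouped by resulting value.
spans_by_value = defaultdict(list)
for low in range(-1, 13):
    for high in range(low, 13):
        spans_by_value[isced_value(low, high)].append((low, high))

print(len(spans_by_value))                   # 10 distinct values
print(isced_value(9, 12))                    # 2;3 (a 9-12 high school)
print(isced_value(1, 8), isced_value(6, 8))  # 1;2 1;2 (elementary vs. middle)
```

Under these assumptions, many different grade spans collapse into each of the ten values, which is why going backwards from isced:level to a colloquial school type is not reliable.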
This school in Hoboken, Georgia, has remained an elementary school since the 1960s, even as a changing grade span would’ve caused ISCED to classify it variously as levels 1 and 2, levels 0 through 2, and levels 0 and 1. (© 2013 Michael Rivera, CC BY-SA 3.0)
isced:level=* is the only classification system of the three that has global ambitions. The ISCED levels were designed for statisticians, but the scheme’s inclusion in OSM can only achieve its potential for statisticians if we also adopt them in the U.S. So despite the scheme’s problems from a local perspective, mappers might as well consider tagging isced:level=* in addition to the more immediately practical school=* and grades=* keys, even as data consumers should take care not to infer too much detail from isced:level=*.
In raw numbers, OSM actually knows about more schools than are in the NCES dataset. However, most of them were imported from GNIS; many closed long ago or have been renamed since the import. As the U.S. mapping community works to clean up the imported data, the public domain NCES dataset and similar datasets from state education departments will be a fantastic resource, but classifying schools as users expect will require more human attention than simply translating grade levels to a different numeric scale.
Discussion
Comment from lyx on 9 January 2022 at 12:36
To handle changing standards, it might be helpful to indicate the version of ISCED that was used when tagging an educational institution. There have been proposals to encode the version in the name of the key (e.g. isced:1997:level or isced:level:2011), but those have not found much support. Maybe adding an additional tag specifying the version, e.g. isced:version=2011, might work. In the absence of that information, we could check for levels that exist in only one version of the standard (01, 02, 7 and 8 in 2011; 0 in 1997) to determine which version was used. Only institutions that use levels 5 and 6 and none of 7 and 8 would have to be checked manually. To create a table mapping national education system levels to ISCED levels, the documents at http://uis.unesco.org/en/topic/international-standard-classification-education-isced give a lot of information; the most helpful is probably the “ISCED 2011 Operational Manual: Guidelines for Classifying National Education Programmes and Related Qualifications”.
Comment from Minh Nguyen on 9 January 2022 at 18:24
Yes, the post above accounts for the ISCED 2011 official mapping to U.S. education programs. The difference in levels 5 and 6 between the 1997 and 2011 standards is part of the problem. However, for the cartographic use case I mentioned above, it’s more problematic that levels 0 through 3 are unclear in both versions of the standard.
Comment from lyx on 9 January 2022 at 19:14
I don’t think that the levels 0 through 3 are all that unclear; the border between the levels is just not conveniently located at the same spot where students commonly change schools in the US. In theory it would be possible to tag the three digit P code for every grade, but that would be a LOT of effort.
Comment from Minh Nguyen on 9 January 2022 at 19:20
I think what this means for us is that we can only recommend tagging isced:level=* in conjunction with school=* and grades=*, but never on its own, at least not for amenity=school. Any presets about school types would be based on school=*; the mapper would have to fill out isced:level=* manually.
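As a rough illustration of that recommendation, a quality-assurance check might look something like the following Python sketch. The function name and warning text are hypothetical and not taken from any existing validator, and the sample tag values are placeholders.

```python
def isced_tag_warnings(tags: dict[str, str]) -> list[str]:
    """Hypothetical check: flag amenity=school features where isced:level stands alone."""
    warnings = []
    if tags.get("amenity") == "school" and "isced:level" in tags:
        # The recommendation is to pair isced:level=* with both school=* and grades=*.
        for key in ("school", "grades"):
            if key not in tags:
                warnings.append(f"isced:level is set but {key}=* is missing")
    return warnings

# Placeholder tags for a feature that carries only isced:level=*.
print(isced_tag_warnings({"amenity": "school", "isced:level": "2;3"}))
```

Features other than amenity=school are deliberately ignored here, matching the “at least not for amenity=school” caveat.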