Coding indexes

Download: Occupations_ISCO08_5dgt_26countries_coding_indexes_20220706_surveycodings.xlsx

Answers to the occupation question in surveys are typically coded by national survey agencies, using national coding indexes. Survey respondents report detailed occupational titles that are close to their actual job titles. The ISCO-08 classification is a 4-level hierarchy, but its 4-digit units are too aggregated for respondents. Survey agencies therefore face the challenging task of coding 5-digit titles into a 4-digit classification. They mostly use the national coding indexes prepared by the national statistical office (NSO). These indexes are in the native language, and with the exception of Finland no English translations are provided. Note that ILO’s ISCO-08 classification does not provide a multilingual coding index either, only an English index.

English translations of the coding indexes allow for a better understanding of occupation coding in multi-country surveys. Therefore, our downloadable CODING INDEXES DATABASE comprises coding indexes and their English translations. To this end, I browsed the websites of almost one hundred NSOs to collect their coding indexes. I included an index if the NSO used the ISCO-08 classification and the index contained 5-digit occupational titles. This resulted in coding indexes from 25 countries, of which 17 are members of the European Union and of which two, South Africa and Singapore, have indexes in English. All 5-digit occupational titles from these coding indexes were merged into the CODING INDEXES DATABASE, resulting in 83,319 entries. On average, countries provided 3,332 titles. Seven countries provided fewer than 2,000 entries, with the fewest entries from Finland (206). Sixteen countries had between 2,000 and 6,000 entries. Sweden had 8,618 entries and Austria had 13,314. The tab SOURCES in the CODING INDEXES DATABASE provides the titles and URLs of the national coding indexes, as well as the number of entries per country.

To provide the English translations of the 5-digit occupational titles, I used two translation platforms, namely Google Translate and DEEPL. For seven countries (Albania, Azerbaijan, Georgia, Iceland, Latvia, Montenegro and Turkey), DEEPL could not provide translations. For one country, Lithuania, I did not use Google Translate because of the high quality of the DEEPL translation. For one country, Austria, I did not provide an English translation, because the more than 13,000 Austrian titles were too detailed and too complex for the translation platforms. As said, the indexes from South Africa and Singapore are in English. For the remaining countries, I identified the best translation for each occupational title. Note that translating occupational titles is difficult, because most titles are country-specific. In a next step, I compared the best translation with the occupational titles in the WISCO database, which holds more than 4,000 occupational titles, extended with a database of some 6,000 titles coded to ISCO-08. Such a match was possible for 25% of the titles in the database.
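The comparison step can be pictured as looking up each translated title in a reference list of coded titles. The sketch below assumes a simple exact match after lower-casing and whitespace clean-up; the text does not say how titles were actually matched, and the helpers `normalize` and `match_titles` as well as the sample titles and codes are hypothetical.

```python
def normalize(title):
    """Lower-case a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def match_titles(translated, reference):
    """Map each translated title to a reference code, or None if unmatched.

    `reference` maps reference titles to their codes.
    """
    lookup = {normalize(t): code for t, code in reference.items()}
    return {t: lookup.get(normalize(t)) for t in translated}


# Invented example data, not taken from the database.
reference = {"Baker": "7512", "Carpenter": "7115"}
translated = ["baker", "  Carpenter ", "Reindeer herder"]

matches = match_titles(translated, reference)
# Share of translated titles that found a reference title.
share = sum(code is not None for code in matches.values()) / len(matches)
```

A stricter or fuzzier matching rule would change the share of matched titles, so the 25% reported above should not be expected from this simplified rule.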

The CODING INDEXES DATABASE includes the following variables:

LOCALE: Language and country abbreviation

COUNTRY: Name of country

ORIGINAL CODE: Original code indicated in the CODING INDEX

ISCO-08 4 DGT: Original code transformed to ISCO-08

ORIGINAL TEXT NATLANG: Original title in national language

transl_google_20220409: GOOGLE TRANSLATE translation

transl_deepl_20220409: DEEPL translation

FINAL TITLE: English title

ISCO0813: 5-digit code from WISCO database, if available

ISCO0804: 4-digit code from WISCO database, if available
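As a rough illustration of how these variables might be used, the sketch below reads rows with some of the column names listed above and groups the English titles by 4-digit code. It assumes the sheet has been exported to CSV with the variable names as headers. The sample rows, including the locale format and the original codes, and the function name `entries_by_isco4` are invented for illustration and are not taken from the database.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; only the header names follow the variable list.
SAMPLE_CSV = """LOCALE,COUNTRY,ORIGINAL CODE,ISCO-08 4 DGT,ORIGINAL TEXT NATLANG,FINAL TITLE
sv_SE,Sweden,75120,7512,bagare,Baker
fi_FI,Finland,75121,7512,leipuri,Baker
sv_SE,Sweden,71150,7115,snickare,Carpenter
"""


def entries_by_isco4(csv_text):
    """Group (country, English title) pairs by their 4-digit ISCO-08 code."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["ISCO-08 4 DGT"]].append((row["COUNTRY"], row["FINAL TITLE"]))
    return dict(groups)


grouped = entries_by_isco4(SAMPLE_CSV)
```

Codes are kept as strings here so that any leading zeros in a code would survive the import.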

A previous study used translated coding indexes to explore to what extent countries assigned the same 4-digit ISCO-08 code to the same translated occupational title (Tijdens and Kaandorp, 2018). That study covered fewer countries, and at the time translation platforms were not as advanced as they are today. Results showed that the more titles a national coding index contained, the higher the percentage of non-translatable titles (r=.80). In the validation database of translated titles, 68% were single titles and 32% had one or more duplicates, i.e. occurred at least twice in the combined database. The NSOs were expected to have assigned the same code to most duplicate occupational titles, but the validation attempt had disappointing results. Of the 20,828 entries with duplicates in two or more indexes, only 41% had the same code for the same occupational title. Of the resulting 5,754 occupational titles, slightly more than half (54%) had the same code for the same title.
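The title-level check can be sketched as follows: group entries by translated title, keep the titles that occur in two or more countries, and compute the share of those titles to which every country assigned the same code. This is a minimal sketch under assumptions, not the procedure of the cited study: titles are compared after lower-casing and trimming only, and the function name `title_agreement` and the sample entries are invented.

```python
from collections import defaultdict


def title_agreement(entries):
    """Summarise code agreement for titles shared across countries.

    `entries` is an iterable of (country, english_title, isco4_code).
    Returns (number of titles found in two or more countries,
             share of those titles that carry one single code).
    """
    codes = defaultdict(set)
    countries = defaultdict(set)
    for country, title, code in entries:
        key = title.strip().lower()
        codes[key].add(code)
        countries[key].add(country)
    shared = [k for k in codes if len(countries[k]) >= 2]
    if not shared:
        return 0, 0.0
    agreeing = sum(len(codes[k]) == 1 for k in shared)
    return len(shared), agreeing / len(shared)


# Invented entries: "baker" agrees across countries, "carpenter" does not,
# and "reindeer herder" occurs in one country only and is ignored.
entries = [
    ("Sweden", "Baker", "7512"),
    ("Finland", "baker", "7512"),
    ("Sweden", "Carpenter", "7115"),
    ("Finland", "Carpenter", "7111"),
    ("Sweden", "Reindeer herder", "6121"),
]
n_shared, share_same = title_agreement(entries)
```

The entry-level figure mentioned above would need a different denominator (entries rather than distinct titles), which this sketch does not compute.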

Tijdens KG, Kaandorp CS (2018) Validating occupational coding indexes for use in multicountry surveys. Survey Insights: Methods from the Field. surveyinsights.org/?p=10422, DOI: 10.13094/SMIF-2018-00007