This website is intended to serve as a reference for researchers interested in different ways of measuring ethnicity and other types of identity-based diversity. For a discussion of issues related to definition and measurement, see Kyle L. Marquardt and Yoshiko M. Herrera. (Forthcoming 2015). "Ethnicity as a variable: An assessment of measures and datasets of ethnicity and related identities." Social Science Quarterly.
All references to the listed datasets should cite the relevant dataset authors, as indicated on their respective websites or articles.
This database can be downloaded [here], and should be cited as follows: Steven L. Wilson, Kyle L. Marquardt and Yoshiko M. Herrera. 2015. "Ethnicity as a variable: An annotated bibliography for sources on ethnic and cultural diversity." https://faculty.polisci.wisc.edu/yherrera/ethnicity_ssq.
Scholars who wish to have their work included on the website or who would like to edit the information provided should email: ethnicity@polisci.wisc.edu.
Primary Related Work | Authors | Dataset Name | # of countries | Years* | Time series | Level of analysis | Fractionalization | Polarization | Weights | Availability | Description | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Akturk (2011) | Akturk, Sener | Regimes of ethnicity | 173 | 2014 | N | Country/Group | N | N | N | Website | Dataset of 1) contemporary ethnic demography in 173 countries with a population over 250,000; 2) 15 dichotomous indicators of whether or not a state pursued certain policies regarding ethnic and cultural factors in these countries (e.g. whether or not government maintains record of individual-level ethnic identity, religious education in schools). The project will consult three country experts with regard to each element of the dataset. Preliminary data available on website. |
Alesina and Zhuravskaya (2011) | Alesina, Alberto and Ekaterina Zhuravskaya | Segregation and the quality of governance | 78-97 | 2000 | N | Country | Y | N | Y (Regional segregation) | Website | Regional ethnic, native-language and religious fractionalization indices; aggregated to country-level for segregation index (the degree to which identity groups are segregated by region) and fractionalization. Data from country censuses and Demographic and Health Surveys. Also includes data on different religious groups' population share. |
Alesina et al. (2003) | Alesina, Alberto, Romain Wacziarg, Arnaud Devleeschauwer, William Easterly and Sergio Kurlat | Fractionalization | 215 | Not specified | N | Country | Y | N | N | Website | Ethnic fractionalization. Groups listed. |
Ashraf and Galor (2013b); Ashraf and Galor (2013a) | Ashraf, Quamrul and Oded Galor | "Out of Africa" Hypothesis | 145 | n/a | N | Country | N | N | N | Website | Database of genetic diversity, directly estimated for 21 countries and estimated as a function of migratory distance from Africa for 145 countries. |
Baldwin and Huber (2010) | Baldwin, Kate and John Huber. | Inter-ethnic inequality and political development | 46 | Not specified | N | Country | Y | N | Y (Between-group economic inequality) | Website | Data regarding between-group inequality. Data on groups based on Fearon (2003) and cross-national survey results; data on inequality from these surveys. Also includes data from Fearon (2003). |
Birnir et al. (2014) | Birnir, Johanna K., Jonathan Wilkenfeld, James D. Fearon, David D. Laitin, Ted Robert Gurr, Dawn Brancati, Stephen M. Saideman, Amy Pate and Agatha S. Hultquist. | All minorities at risk | 163 | Not specified | N | Group | N | N | N | Website | List of ethnic groups determined to be "socially-relevant" (a broader definition than the traditional Minorities at Risk inclusion criteria). Includes 1,195 groups. |
Bossert, D'Ambrosio and La Ferrara (2011) | Bossert, Walter, Conchita D'Ambrosio and Eliana La Ferrara. | Generalized index of fractionalization | 1 | 1990 | N | Country | Y | N | Y (Composite individual-level dissimilarity with regard to race, income, employment and education) | Unavailable online; complete dataset in article. | Data on diversity in American states that uses a composite fractionalization score that incorporates individual-level dissimilarity on the metrics of race, income, employment and education. Data from 1990 US Census. |
Cederman and Girardin (2007) | Cederman, Lars-Erik and Luc Girardin. | N* | 88 | Not specified | N | Country | N | N | N | Data available in the replication data for Fearon et al. (2007). Data used to construct index available in paper. | Data used to construct index representing percentage of a population that an ethnic group in power represents vis-a-vis other groups in a state. |
Cederman, Wimmer and Min (2010) | Cederman, Lars-Erik, Brian Min and Andreas Wimmer | EPR-ETH v1.1 | 155 | 1946-2005 | Y (Yearly) | Group | N | N | N | Website | Dataset includes group size estimates, level of access to the executive branch, and whether or not ethnic group was involved in an armed conflict. Note: Geo-coded data also available on website. Yearly data on 733 politically relevant ethnic groups in 155 countries, 1946 - 2005. |
Cederman, Wimmer and Min (2010) | Cederman, Lars-Erik and Manuel Vogt | EPR-ETH v2.0 | 165 | 1946-2009 | Y (Yearly) | Group | N | N | N | Website | Dataset updates EPR v1. It codes each group's level of access to government, ranging from total control of the government to overt discrimination against the group. Note: Geo-coded data also available on website. Yearly data on over 790 groups based on their access to executive state power, 1946 - 2009. |
Chandra and Wilkinson (2008) | Wilkinson, Steven | Ethnic concentration index | 40 | Not specified | N | Country | N | N | N | Unavailable online; forthcoming. | Index representing the degree to which ethnic representation in the armed forces was imbalanced. |
Collier and Hoeffler (2004); Collier, et al. (2009) | Collier, Paul and Anke Hoeffler | Greed and grievance in civil war | 215 | 1964, 2003 | N | Country | Y | N | Y (Composite measure of ethnic and religious fractionalization) | Website | Ethnolinguistic and religious fractionalization, as well as social fractionalization, a composite measure of ethnic and religious fractionalization. Ethnolinguistic fractionalization based on ANM for 2004 article; Fearon and Laitin (2003) for 2008. |
Desmet, Ortuno Ortin and Wacziarg (2012) | Wacziarg, Romain, Klaus Desmet and Ignacio Ortuno-Ortin | ELF and polarization | 226 | 2005 | N | Country | Y | Y | Y (Phylogenetic linguistic differences) | Website 1, Website 2 | Ethnic fractionalization and polarization indices, including weights based on phylogenetic linguistic difference at various levels of aggregation. Data from Ethnologue. |
Desmet, Ortuno Ortin and Weber (2009) | Desmet, Klaus, Ignacio Ortuno-Ortin, and Shlomo Weber | Linguistic diversity and redistribution | 225 | 1996 | N | Country | Y | Y | Y (Cognate-based linguistic difference) | Website | Measure of ethnic fractionalization weighted by cognate-based linguistic difference, as well as weighted distance between main language and peripheral languages; also includes Esteban-Ray, ELF and peripheral heterogeneity indices for these countries. Original indices based on Ethnologue. |
Ellingsen (2000) | Ellingsen, Tanja | Ethnic composition | 229 | 1945-1994 | Y (Yearly) | Country | N | N | N | Website | Information on the concentration of the largest and second largest linguistic (mother tongue), ethnic and religious groups; data coded by source. |
Esteban, Mayoral and Ray (2012) | Esteban, Joan, Laura Mayoral and Debraj Ray | Ethnicity and Conflict | 141 | Not specified | N | Country | Y | Y | Y (Phylogenetic linguistic differences) | Website | Measures of ethnic polarization and fractionalization, weighted by phylogenetic linguistic difference. Data on groups from Fearon (2003). |
Fearon and Laitin (2003) | Fearon, James and David Laitin | Ethnicity, insurgency and civil war | 161 | 1960, 2003 | N | Country | Y | N | N | Website | ELF indices (based on ANM), ethnic and linguistic fractionalization indices from Fearon (2003), and population share of second-largest religious and ethnic groups in each country, as well as number of distinct languages spoken in a country. Groups enumerated; 161 countries. |
Fearon (2003) | Fearon, James | Ethnic and cultural diversity by country | 160 | Not specified | N | Country | Y | N | Y (Language-family resemblance) | Website | Different measures of ethnic and ethnolinguistic fractionalization, as well as largest and second-largest groups in a territory. Also includes measure of ethnic group fractionalization weighted by language-family resemblance. Groups listed. |
Fearon, Kasara and Laitin (2007) | Fearon, James, Kimuli Kasara and David Laitin | Ethnic minority rule and the onset of civil war | 161 | Not specified | N | Country | Y | N | N | Website | Measures of fractionalization (Fearon and Laitin 2003), various codings of Cederman and Girardin (2007) index, and a measure of whether or not a country's head of state was from a minority ethnic group. |
Guiso, Sapienza and Zingales (2009) | Guiso, Luigi, Paola Sapienza and Luigi Zingales | Somatic distance | 207 | 1996 | N | Country | N | N | N | Website | Somatic and genetic distances between populations of countries. |
Lieberman and Singh (2012) | Lieberman, Evan S. and Prerna Singh | Institutionalized ethnicity | 6 | 1900-2011 | Y (Yearly) | Country | N | N | N | Website | Database of five dichotomous indicators regarding whether or not a state makes use of different categories related to ethnicity (e.g. religion, ethnic identification) over time. Yearly data for six countries, 1900 - 2011. |
Minorities at Risk Project (2009) | | Minorities at risk | 117 | 2004-2006 | N | Group | N | N | N | Website | Relevant political and cultural data on 283 ethnic groups perceived to be at risk of involvement in a political conflict. |
Nardulli et al. (2012) | | Composition of religious and ethnic groups v.1.02 | 156 | 1946-2014 | Y (Yearly) | Group | N | N | N | Website | Database of concentration of different ethnic and religious groups at the country level. Project will eventually include data regarding ascriptive differences between groups (e.g. sensory-based traits, attitudes) and country-specific traits that could affect ethnic relations. Yearly data for 156 countries (those with a population over 500,000 in 2004) 1946 - present. |
Okediji (2005) | Okediji, Tade | Dynamics of ethnic fragmentation | 132 | Not specified | N | Country | Y | N | Y (Composite index of race, ethnic, linguistic and religious affiliation) | Unavailable online; complete dataset in article. | Ethnolinguistic fractionalization and composite social diversity indices (a weighted index of race, ethnic, linguistic and religious affiliation) |
Ostby (2008) | Ostby, Gudrun | Polarization and horizontal inequalities | 39 | 1986-2004 | Y (Yearly) | Country | N | Y | Y (Composite measures of ethnic and economic polarization and social and economic inequality) | Website | Yearly data on 11 measures of social and economic inequality, economic and ethnic polarization, and composites of the aforementioned factors. Measures based on data from cross-national Demographic and Health Surveys. |
Posner (2004) | Posner, Daniel | Decade values for PREG | 42 | 1960-1990 | Y (Decades) | Country | Y | N | N | Unavailable online; complete dataset in article. | Politically-relevant ethnic group fractionalization for African countries. |
Reynal-Querol (2002); Montalvo and Reynal-Querol (2005) | Reynal-Querol, Marta | Ethnic and religious fractionalization and polarization | 137 | Not specified | N | Country | Y | Y | N | Website | Ethnic and religious fractionalization and polarization; ethnic data based on the World Christian Encyclopedia and religious data from L'Etat des religions dans le monde |
Roeder (2001) | Roeder, Philip | Ethnolinguistic Fractionalization (ELF) Indices | 183 | 1961, 1985 | N | Country | N | N | N | Website | Ethnolinguistic fractionalization, based on Soviet sources (e.g. ANM) and Europa World Yearbook. At different levels of aggregation, based on ANM coding scheme. |
Roeder (2003) | Roeder, Philip | Clash of civilizations and escalation of ethnopolitical conflicts | 130 | 1980-1999 | Y (By decade) | Group | N | N | N | Website | Dataset includes data on various ethnic groups in relation to majority population of their country of residence. Data on ethnic-groups-by-country, 1032 observations. |
Roeder (2007) | Roeder, Philip | Nation-state crises worldwide | 161 | 1955-1999 | Y (Five-year increments) | Group | N | N | N | Website | Dataset includes data on various ethnic groups, both related to their demographic and linguistic characteristics, as well as their historical relationship to their state (e.g. former or present regionized-homeland). Data on ethnic-groups-by-country, in five-year increments from 1955-1999. 8054 observations. |
Scarritt and Mozaffar (1999); Mozaffar, Scarritt and Galaich (2003) | Mozaffar, Shaheen, James R. Scarritt and Glen Galaich | Electoral institutions, ethnopolitical cleavages, and party systems in Africa's emerging democracies | 48 | Not specified | N | Country | Y | N | Y (Regional group concentration) | Unavailable online | Database on ethnic fragmentation (based on politicized groups) and concentration for 48 African countries. |
Selway (2011) | Selway, Joel Sawat | Cross-cutting cleavages dataset | 155 | Not specified | N | Country | Y | N | Y (Cross-cutting cleavages, subgroup polarization and cross-fractionalization) | Website | Indices of cross-cutting cleavages, sub-group fractionalization, sub-group polarization and cross-fractionalization. Data gathered from multiple cross-national surveys. |
Spolaore and Wacziarg (2009a) | Spolaore, Enrico and Romain Wacziarg | Diffusion of development | 206 | Not specified | N | Country | N | N | N | Website | Genetic distance between countries, based on main ethnic groups in countries and general data on genetic differences between populations. |
Taylor and Hudson (1984) | Taylor, Charles Lewis and Michael C. Hudson | World handbook of political and social indicators | 136 | 1964 | N | Country | Y | N | N | Website | Main original source of ELF indices, based on ANM. |
Vanhanen (1999) | Vanhanen, Tatu | Domestic Ethnic Conflict and Ethnic Nepotism | 183 | Not specified | N | Country | N | N | N | Website | Information on racial, linguistic and religious concentration of the largest groups; as well as composite score of these three concentration scores. Data compiled from multiple sources. |
Cederman, Wimmer and Min (2010) | Wimmer, Andreas and Philippe Duhart | EPR v3.0 | 157 | 1946-2010 | Y (Yearly) | Group | N | N | N | Website | Dataset updates EPR v2.0. Annual data for 157 countries 1946-2010; 758 politically relevant groups. Includes a marker identifying the trait (e.g. language, skin color) that differentiates group members, as well as each group's type of political inclusion. Fully geo-coded dataset available. |
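
The Fractionalization and Polarization columns in the table refer to two families of indices computed from group population shares. As a rough illustration only, the sketch below assumes the commonly used forms: fractionalization as one minus the sum of squared shares (the Herfindahl-based ELF form), and polarization in the Reynal-Querol form. This page does not define either index, and individual datasets listed here may use different group lists, weights or formulas, so the function names and the example shares are hypothetical.

```python
from typing import Sequence


def fractionalization(shares: Sequence[float]) -> float:
    """Herfindahl-based fractionalization: 1 - sum(p_i^2).

    Interpreted as the probability that two randomly drawn
    individuals belong to different groups.
    """
    return 1.0 - sum(p * p for p in shares)


def polarization_rq(shares: Sequence[float]) -> float:
    """Reynal-Querol-style polarization: 4 * sum(p_i^2 * (1 - p_i)).

    Reaches its maximum of 1 when there are two groups of equal size.
    """
    return 4.0 * sum(p * p * (1.0 - p) for p in shares)


def shares_from_counts(counts: Sequence[float]) -> list[float]:
    """Convert raw group population counts into shares that sum to 1."""
    total = float(sum(counts))
    if total <= 0:
        raise ValueError("total population must be positive")
    return [c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical country with three groups (illustrative counts only).
    example = shares_from_counts([600, 300, 100])
    print(round(fractionalization(example), 3))
    print(round(polarization_rq(example), 3))
```

Under these assumed definitions the two indices can diverge: adding many small groups pushes fractionalization toward 1 while polarization falls, which is one reason the table records them in separate columns.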