The suitability of land for agriculture (Ramankutty, Foley, Norman, and McSweeney, 2001) has become a standard control for the effect of geographical characteristics on comparative economic development. This measure, however, is rather crude: it does not capture the large variation in potential caloric yield across equally suitable land. In particular, geographical regions that are comparable in their suitability for agriculture according to this measure may differ significantly in their potential caloric output per hectare per year, because land that is suitable for agriculture is not necessarily suitable for the crops with the highest caloric return.
In light of the importance of pre-industrial population density in the subsequent course of economic development, and the instrumental role played by caloric yield in sustaining and supporting population growth, it is rather apparent that this commonly used index is not well designed for properly capturing the effect of the suitability of land for agriculture on economic development.
Galor and Özak (2016) rectify this deficiency and introduce a novel index of land suitability: “The Caloric Suitability Indices” (CSI), which capture the variation in potential crop yield across the globe, as measured in calories per hectare per year. Moreover, in light of the expansion in the set of crops available for cultivation in the course of the Columbian Exchange, the CSI provide distinct measures of caloric suitability for the pre-1500 and post-1500 eras. The CSI provide four estimates of caloric suitability for each cell of size 5′ × 5′ in the world:
- The maximum potential caloric yield attainable given the set of crops that are suitable for cultivation in the pre-1500 period.
- The maximum potential caloric yield attainable given the set of crops that are suitable for cultivation in the post-1500 period.
- The average potential yields within each cell attainable given the set of crops that are suitable for cultivation in the pre-1500 period.
- The average potential yields within each cell attainable given the set of crops that are suitable for cultivation in the post-1500 period.
The Caloric Suitability Indices measure the caloric production potential based on agriculture for the pre-1500CE and post-1500CE eras, as constructed by Galor and Özak (2016). The data can be used to assess or account for the exogenous effect of agricultural potential on various economic and social outcomes. An IPython notebook is included that shows how the data can be used and compares it with another measure of agricultural suitability. The data is provided as a service to the academic research community (see license for permitted uses).
The Caloric Suitability Indices can be downloaded as a zip file or individually. They come in GeoTIFF format and WGS84 projection. Use the links below to download (or fork this GitHub repository, which also contains an IPython notebook that works with the data).
- All files (zip): The zip file contains additional versions not downloadable individually. In particular, it includes CSI excluding Asian crop varieties in Africa pre-1500CE. Additionally, it includes rasters for the changes in CSI due to the Columbian Exchange.
- Country-level Data:
- US State-level Data:
- All files (zip): Archive contains data on pre- and post-1500CE CSI, growth cycle and daily returns, as well as their changes due to the Columbian Exchange. This is the original data used in Galor and Özak (2016). The only difference from the CSI rasters above is the constraint imposed by the availability of growth cycle data, which restricts the set of crops.
- All files (zip): Archive contains data on pre- and post-1500CE plow positive CSI, plow negative CSI, plow potential based on CSI as well as their changes due to the Columbian Exchange.
- All files (zip): Archive contains data on caloric suitability for each crop under low, medium and high input levels, as well as under rain fed and irrigation.
If you use the data, please cite:
Galor and Özak (2016) introduce novel measures of potential crop yield measured in calories for the pre-industrial and modern eras. In particular, for each cell of size 5′ × 5′ in the world, they estimate the maximum caloric yield and the growth cycle attainable given the set of crops available before and after the Columbian Exchange. Using the same methodology, additional Caloric Suitability Indices (CSI) are introduced here based on the average and maximum caloric yields attainable given the crops available before and after the Columbian Exchange.
These historical measures are constructed based on data from the Global Agro-Ecological Zones (GAEZ) project of the Food and Agriculture Organization (FAO). The GAEZ project supplies global estimates of crop yield and crop growth cycle for 48 crops in grids with a cell size of 5′ × 5′ (i.e., approximately 100 km²).
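To make the grid geometry concrete, the following minimal sketch computes the dimensions of a global 5′ × 5′ grid and the approximate area of a single cell. It assumes a spherical Earth with a mean radius of 6371.0088 km; the function names are illustrative and not part of the GAEZ or CSI data. Under this assumption a cell covers roughly 86 km² at the equator and less toward the poles, so the 100 km² figure is best read as an order of magnitude.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean radius; spherical approximation (assumption)
CELL_ARCMIN = 5              # cell size stated in the text


def grid_shape(cell_arcmin=CELL_ARCMIN):
    """Rows and columns of a global grid with square cells of this size."""
    cells_per_degree = 60 // cell_arcmin
    return 180 * cells_per_degree, 360 * cells_per_degree


def cell_area_km2(lat_south_deg, cell_arcmin=CELL_ARCMIN):
    """Area of the cell whose southern edge lies at lat_south_deg."""
    step = math.radians(cell_arcmin / 60)
    lat_south = math.radians(lat_south_deg)
    # Area of a latitude-longitude cell on a sphere:
    # R^2 * (longitude width) * (difference of the sines of the latitudes)
    band = math.sin(lat_south + step) - math.sin(lat_south)
    return EARTH_RADIUS_KM ** 2 * step * band


rows, cols = grid_shape()
print(rows, cols)                     # 2160 4320
print(round(cell_area_km2(0.0), 1))   # cell touching the equator
print(round(cell_area_km2(60.0), 1))  # cell at 60 degrees north
```

Because cell area varies with latitude, any per-cell aggregation to countries or regions may want to weight cells by area rather than count them equally.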
The crops available are alfalfa, banana, barley, buckwheat, cabbage, cacao, carrot, cassava, chickpea, citrus, coconut, coffee, cotton, cowpea, dry pea, flax, foxtail millet, greengram, groundnuts, indigo rice, maize, oat, oilpalm, olive, onion, palm heart, pearl millet, phaseolus bean, pigeon pea, rye, sorghum, soybean, sunflower, sweet potato, tea, tomato, wetland rice, wheat, spring wheat, winter wheat, white potato, yams, giant yams, subtropical sorghum, tropical highland sorghum, tropical lowland sorghum, and white yams.
For each crop, GAEZ provides estimates of crop yield based on three alternative levels of inputs (high, medium, and low) and two possible sources of water supply (rain-fed and irrigation). Additionally, for each input-water source category, it provides two separate estimates of crop yield: one based on agro-climatic conditions, which are arguably unaffected by human intervention, and one based on agro-ecological constraints, which could potentially reflect human intervention.
In order to capture the conditions that were prevalent during the pre-industrial era, while mitigating potential endogeneity concerns, the indices use the estimates of potential crop yield under a low level of inputs and rain-fed agriculture, cultivation methods that characterized early stages of development. Moreover, the estimates of potential crop yield are based on agro-climatic constraints that are largely orthogonal to human intervention. Thus, these restrictions remove the potential concern that the level of agricultural inputs, the irrigation method, and soil quality reflect endogenous choices that could be correlated with individual preferences or institutional settings. Additionally, the choice of rain-fed conditions is further justified by the fact that, although some societies had access to irrigation prior to the industrial revolution, GAEZ's data only provides estimates based on irrigation infrastructure available during the late twentieth century.
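The selection described above can be summarized in a short sketch that enumerates the estimate variants available per crop and picks out the one the indices rely on. The string labels are illustrative assumptions and do not correspond to actual GAEZ file names or identifiers.

```python
from itertools import product

# Dimensions along which GAEZ yield estimates vary, as described in the text.
# The labels are hypothetical, chosen only for readability.
INPUT_LEVELS = ("low", "medium", "high")
WATER_SOURCES = ("rain-fed", "irrigation")
CONSTRAINTS = ("agro-climatic", "agro-ecological")

# Every combination of input level, water source, and constraint type.
variants = list(product(INPUT_LEVELS, WATER_SOURCES, CONSTRAINTS))

# The single combination used for the Caloric Suitability Indices.
CSI_VARIANT = ("low", "rain-fed", "agro-climatic")

print(len(variants))            # 12
print(CSI_VARIANT in variants)  # True
```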
The FAO dataset provides, for each cell in the agro-climatic grid, the potential yield for each crop (measured in tons per hectare per year). These estimates account for the effect of temperature and moisture on the growth of the crop, the impact of pests, diseases and weeds on the yield, as well as climate-related "workability constraints".
In order to better capture the nutritional differences across crops, and thus to ensure comparability in the measure of crop yield, the yield of each crop in the GAEZ data (measured in tons per hectare per year) is converted into caloric return (measured in millions of kilocalories per hectare per year). This conversion is based on the caloric content of crops, as provided by the United States Department of Agriculture Nutrient Database for Standard Reference. Using the estimates of the caloric content of each crop in the GAEZ data (measured in kilocalories per gram), a comparable measure of crop yield (in millions of kilocalories per hectare per year) is constructed for each crop.
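The unit conversion described above can be sketched as follows. The sketch assumes yields are expressed in metric tons (1,000,000 g), and the numbers in the usage example are placeholders chosen for easy arithmetic, not values taken from GAEZ or the USDA database; the function name is likewise illustrative.

```python
GRAMS_PER_TON = 1_000_000  # metric ton (assumption)


def caloric_yield(yield_tons_per_ha, kcal_per_gram):
    """Convert tons/ha/year into millions of kcal/ha/year."""
    kcal_per_ha = yield_tons_per_ha * GRAMS_PER_TON * kcal_per_gram
    return kcal_per_ha / 1_000_000


# Placeholder inputs: 2.0 tons/ha/year at 3.5 kcal per gram.
example = caloric_yield(2.0, 3.5)
print(example)  # 7.0
```

Because one metric ton is one million grams, the yield in tons multiplied by the caloric content in kilocalories per gram gives the result directly in millions of kilocalories.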
Based on these estimates, Galor and Özak (2016) construct the maximum potential caloric yield estimate they use in their paper. Here, various additional indices of caloric suitability are constructed and presented. First, for each cell, the average caloric yield across all available crops pre- and post-1500CE is computed. Second, the analysis assigns to each cell the highest potential yield among the available crops pre- and post-1500CE. Additionally, each index raster is constructed in two versions, one including and one excluding cells where no calories can be produced; for the averages, the latter version excludes crops without caloric output.
Thus, the research constructs, for each type of index, namely Average and Maximal Caloric Suitability, four sets of grids:
- Caloric Suitability pre-1500CE (without zeros)
- Caloric Suitability pre-1500CE (with zeros)
- Caloric Suitability post-1500CE (without zeros)
- Caloric Suitability post-1500CE (with zeros)
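As a rough illustration of how these grids relate to one another, the sketch below computes the maximum and average indices for a single cell. It rests on several assumptions: the yields are placeholders rather than GAEZ values, the function and key names are hypothetical, and the "without zeros" variant is interpreted as dropping zero-yield crops from the average and reporting no value for cells where nothing can be produced. The cell is treated as an Old World location, so maize only becomes available after 1500.

```python
from statistics import mean


def caloric_indices(cell_yields, available_crops):
    """Maximum and average caloric yield for one cell, with and without zeros.

    cell_yields: dict mapping crop name to millions of kcal/ha/year
    available_crops: crops considered available in the period of interest
    """
    yields = [cell_yields[c] for c in available_crops if c in cell_yields]
    positive = [y for y in yields if y > 0]
    return {
        "max_with_zeros": max(yields, default=0.0),
        "avg_with_zeros": mean(yields) if yields else 0.0,
        # Assumed reading of "without zeros": None marks a cell with no data.
        "max_without_zeros": max(positive) if positive else None,
        "avg_without_zeros": mean(positive) if positive else None,
    }


# Placeholder yields for one hypothetical cell.
cell = {"wheat": 6.0, "barley": 4.0, "oat": 0.0, "maize": 9.0}
pre_1500_crops = {"wheat", "barley", "oat"}
post_1500_crops = pre_1500_crops | {"maize"}

pre_indices = caloric_indices(cell, pre_1500_crops)
post_indices = caloric_indices(cell, post_1500_crops)
print(pre_indices["max_with_zeros"], post_indices["max_with_zeros"])  # 6.0 9.0
print(pre_indices["avg_without_zeros"])                               # 5.0
print(post_indices["avg_with_zeros"])                                 # 4.75
```

On full rasters the same logic would typically be applied with array operations over a stack of per-crop grids rather than cell by cell.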
These grids can be used to assess the exogenous effect of agricultural potential on various economic and social outcomes. The next section shows how this can be done and compares the indices with another measure of agricultural suitability.
Find a bug? Report it via GitHub issues by providing:
- a link to download the smallest possible raster and vector dataset necessary to reproduce the error
- Python code or command to reproduce the error
- information on your environment: versions of Python, GDAL and NumPy, and system memory
This data is provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.