New Tools for the Characterization of Agricultural (crop and livestock) Environments:

The identification of pastoral ecosystems as a preliminary structure for use in sample site identification.

Dr. John D. Corbett, Dr. Jerry W. Stuth, and Dr. Paul T. Dyke

Introduction

Characterization is the process of first describing the biophysical, social, and economic environment followed by an analytical evaluation of the human-environment interaction. We have long recognized the need to make characterization activities spatially coherent - the spatial context. Georeferencing, in natural resource management research, allows for analysis that ranges from the identification of similarities and differences spatially for the purposes of accurately targeting innovation, to contributing to priority setting of research programs. Impact assessment is only possible through spatial registration of baseline data and comparison to those data through time. Characterization lays the foundation for these value-added activities.

Historically, environmental characterization of agricultural areas has been the subject of many research efforts (Koppen and Geiger 1936, Thornthwaite 1948, FAO 1981) that integrated available data and expert opinion, provided powerful interpretations of the resource map, and delineated zones of relative biophysical homogeneity designed to communicate information useful to the planning of agricultural and other human activities. These methods continued to be employed through the 1980s by researchers working on nations and regions. The Kenya Farm Management Handbook (Jaetzold and Schmidt, 1982) is an example of the evolution of the FAO methods applied to a specific country. In an effort to upgrade the information for Kenya, the FAO published - on diskettes - a 1991 report, Agroecological Land Resources Assessment in Kenya, that used a simple GIS visualization tool in combination with their historical 1978-based methods to portray crop-specific adaptations based on their general agroecological zonation scheme.

IGADD, the Intergovernmental Authority on Drought and Development, and the FAO published a report, "Crop Production System Zones of the IGADD Sub-Region" (the northern part of east Africa plus Sudan), based on an effort to re-interpret the original FAO agroecological methods for specific crops (IGADD/FAO, 1995). They also provided a simple visualization tool for the PC.

The impact of these efforts should not be underestimated. The maps showing agricultural potential from the Kenya Farm Management Handbook are still prominently cited and displayed in the present Kenya five-year development plan (Gov of Kenya, 1996). Allocation of scarce development investments in agriculture is influenced by the perceived potential of the land as portrayed by these methods (Corbett, 1990). These methods, however, remain locked into their historical roots. Production of static - literally not changeable by the user - adaptation zones is an inherent characteristic of the methods. Irrespective of the mechanism used to portray the results (including a diskette with the publication that contains viewing software), the results reflect methods that have been greatly improved upon in the last decade. Accurate identification and characterization of production (crop and livestock) zones - and potential production zones - is vital to agricultural research. New opportunities exist to greatly improve the mechanisms for characterizing agricultural possibilities.

GIS tools empowered innovations that overcame the traditional "agroecological" models that relied on static zones, analog reproduction technology, and fixed crop-environment relationships (e.g. Jones and Thornton 1996; Corbett and O'Brien 1997; Corbett, Collis, and O'Brien 1998). These tools sought mechanisms to use GIS technology and interpolated spatial data to allow the user to select boundary or 'discriminating' criteria, with the output then uniquely reflecting the user's decisions. These more dynamic tools enabled the characterization of agricultural areas to be greatly - and rapidly - enhanced.

The Foundation Database

Foundation data are databases for which some value added process has taken place and which have been placed into an integrated package (Corbett, 1998). A classified Landsat image is foundation data (but the raw Landsat image is not) as are climate surfaces (climate surfaces being the product of an interpolation of point data). Foundation data also extend into the realm of output from simulation models. For example, a crop simulation model run for every grid cell in a foundation database produces additional foundation databases for each simulation run. A key characteristic of foundation data is the placement of these value added databases into an integrated package.

Recent advancements in our 'foundation' design have led to the inclusion of digital, georeferenced documents. Databases are under construction to deliver, on a 'point and click' or text search basis, textual documentation ranging from food security reports to experiment station annual reports. These textual data, once linked in a database and georeferenced, become part of the foundation.

The current state of the African datasets and GIS tools is described below. These data are distributed with the Spatial Characterization Tool or SCT (Corbett and O'Brien, 1997). It is important to realize that these data and tools have been developed in response to specific questions that have been asked of the SCT. The SCT is designed in such a way that as new geographical questions arise, new analytical tools and datasets at various scales can be readily integrated.

The Spatial Sample Frame

Methodology: First Step, the Effective Environments

Our method involves the use of cluster analysis to create what we call "effective environments" or EEs. These EEs form the basis for subsequent interpretation into 'similar' climate types that are associated with pastoral ecologies.

We performed Ward's minimum variance cluster analysis on six variables (total precipitation, total potential evapotranspiration, minimum temperature, maximum temperature, global radiation, and number of rain days) for all twelve months of the year, beginning with the first month of the "optimum season" model (provided in the SCT). The 'optimum' season is defined as the five consecutive months that maximize the precipitation to potential evapotranspiration ratio. The purpose of ordering all data according to the first month as determined by such a model is to align the climatic data into a sensible biological sequence (not calendar months).
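The season alignment described above can be sketched in a few lines of Python. This is only an illustration, not the SCT's implementation: the function names and the sample values are hypothetical, and the sketch assumes that the "ratio" is the five-month precipitation total divided by the five-month potential evapotranspiration total, with windows allowed to wrap across the end of the calendar year.

```python
from typing import List, Sequence


def optimum_season_start(precip: Sequence[float], pet: Sequence[float],
                         length: int = 5) -> int:
    """Return the 0-based calendar index of the first month of the window of
    `length` consecutive months with the highest precipitation/PET ratio.

    Assumption: the ratio is (window precipitation total) / (window PET total).
    """
    if len(precip) != 12 or len(pet) != 12:
        raise ValueError("expected 12 monthly values for each variable")
    best_start, best_ratio = 0, float("-inf")
    for start in range(12):
        # Wrap around the year end, e.g. a window starting in November.
        months = [(start + k) % 12 for k in range(length)]
        p_total = sum(precip[m] for m in months)
        pet_total = sum(pet[m] for m in months)
        ratio = p_total / pet_total if pet_total > 0 else float("-inf")
        if ratio > best_ratio:
            best_start, best_ratio = start, ratio
    return best_start


def to_biological_sequence(values: Sequence[float], start: int) -> List[float]:
    """Rotate 12 calendar-month values so that index 0 is the season start."""
    return [values[(start + k) % 12] for k in range(12)]


# Hypothetical monthly values (mm) for one grid cell, January to December.
precip = [10, 15, 60, 140, 120, 40, 20, 15, 20, 50, 80, 30]
pet = [150] * 12

start = optimum_season_start(precip, pet)
ordered_precip = to_biological_sequence(precip, start)
print(start, ordered_precip)
```

The same rotation would be applied to each of the six variables for a cell, so that "month 1" means the same point in the growing cycle everywhere.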

Ward's minimum variance algorithm compares the squared Euclidean distance between cells (eq. 1), grouping like cells into clusters in a hierarchy which begins with each cell being treated as a separate cluster.

Eq. 1 :

$$d(A,B) = \sum_{i=1}^{p} \left( x_{A,i} - x_{B,i} \right)^{2}$$

where $d(A,B)$ is the squared Euclidean distance between grid cells A and B, $p$ is the number of variables, $\mathbf{x}_A = (x_{A,1}, \ldots, x_{A,p})$ is the vector for location A, and $\mathbf{x}_B = (x_{B,1}, \ldots, x_{B,p})$ is the vector for location B (Kendall 1980). The algorithm begins by computing a matrix of squared Euclidean distances between every possible pair of grid cells for the group of variables. Each cell is initially considered a cluster. Based on the analysis, clusters with similar characteristics are merged into the same cluster in a stepwise fashion. In Ward's minimum variance method, the distance between two clusters is the sum of squares between them, added over the variables. Within-cluster sums of squares, when divided by the total sum of squares, give proportions of variance. At each step, the within-cluster sums of squares are minimized by merging the two most similar clusters, until the last step, when all cells belong to one cluster (SAS Inst. 1987).
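As an illustration of the procedure just described, the following is a minimal, naive Python sketch of Ward's agglomeration. It is not the SAS procedure used for this study. It relies on the standard identity that merging two clusters increases the within-cluster sum of squares by n1*n2/(n1+n2) times the squared Euclidean distance between their centroids. All function names and the sample cells are hypothetical, and the pairwise search is far too slow for a database of tens of thousands of cells.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[float, ...]


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two cells (eq. 1)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(cluster: Sequence[Cell]) -> List[float]:
    """Mean of each variable over the cells in a cluster."""
    n = len(cluster)
    return [sum(column) / n for column in zip(*cluster)]


def ward_cost(c1: Sequence[Cell], c2: Sequence[Cell]) -> float:
    """Increase in within-cluster sum of squares if c1 and c2 are merged."""
    n1, n2 = len(c1), len(c2)
    return n1 * n2 / (n1 + n2) * squared_euclidean(centroid(c1), centroid(c2))


def ward_cluster(cells: Sequence[Cell], n_clusters: int) -> List[List[Cell]]:
    """Merge clusters stepwise until only `n_clusters` remain."""
    clusters: List[List[Cell]] = [[tuple(c)] for c in cells]  # one cell each
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the cheapest pair
        del clusters[j]
    return clusters


# Hypothetical two-variable cells forming two obvious groups.
cells = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
result = ward_cluster(cells, n_clusters=2)
print(result)
```

In practice each cell would carry all 72 monthly values rather than two, and a statistical package would be used for the merging.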

Ward's method tends to join clusters with few observations and is strongly biased towards producing clusters of approximately the same number of observations (SAS Inst. 1990); it is also very sensitive to outliers (Milligan, 1980). To reduce this sensitivity, 1% of observations were trimmed from the analysis on the basis of low estimated probability densities. Though SAS recommends up to 10% trim, we trimmed only 1% because the interpolation routines that produced the climate surfaces smooth the data and, at a grid size of 3 arc-minutes (about 29 square kilometers per cell), there is little climatological justification for 10% of the data to be sufficiently different from neighboring cells (the climate surfaces are by definition autocorrelated). With 76 000 cells in the east Africa database as input to the cluster analysis, and each cell covering just over 29 km², a trim of 760 cells effectively eliminates spatially restricted, extreme environments covering a total of just over 22 000 km².
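The trim figures quoted above follow from simple arithmetic, reproduced here as a short Python check. The variable names are illustrative only, and the cell area is rounded to 29 km² as in the text.

```python
# Values taken from the text; names are illustrative only.
total_cells = 76_000
cell_area_km2 = 29       # each 3 arc-minute cell covers just over 29 km2
trim_fraction = 0.01     # 1% trim (SAS suggests up to 10%)

trimmed_cells = round(total_cells * trim_fraction)
trimmed_area_km2 = trimmed_cells * cell_area_km2

print(trimmed_cells, trimmed_area_km2)
```

Because each cell covers slightly more than 29 km², the area removed is slightly more than the product computed here.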

For each variable, twelve grids representing the monthly sequence beginning with the first month of the optimum season were ported to SAS statistical software. These grids reflect not the calendar month (e.g. January, February etc.) but rather the yearly "biological" sequence (e.g. month 1 precipitation, month 2 precipitation etc.). The resulting 72 grids (six variables by twelve months) are the input to the cluster analysis. Plots of R² against the number of clusters indicated that approximately 200 clusters would be sufficient to represent the majority of the variance of the East African data set. The SAS clustering runs took approximately 47 hours on a Pentium 266 with 96 MB of RAM. The resulting clusters - our effective environments - can then be viewed using Arc/Info GRID tools or ArcView.
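The R² used to judge the number of clusters is the proportion of variance accounted for by the clusters: one minus the within-cluster sum of squares divided by the total sum of squares. A minimal Python sketch of that calculation follows; the function names and sample cells are hypothetical, and this is not the SAS output itself.

```python
from typing import List, Sequence

Cell = Sequence[float]


def sum_of_squares(cells: Sequence[Cell]) -> float:
    """Sum of squared deviations of the cells from their own centroid."""
    n = len(cells)
    center = [sum(column) / n for column in zip(*cells)]
    return sum((x - c) ** 2 for cell in cells for x, c in zip(cell, center))


def r_squared(clusters: Sequence[Sequence[Cell]]) -> float:
    """Proportion of total variance explained by a set of clusters."""
    all_cells: List[Cell] = [cell for cluster in clusters for cell in cluster]
    total = sum_of_squares(all_cells)
    if total == 0:
        raise ValueError("all cells are identical; R-squared is undefined")
    within = sum(sum_of_squares(cluster) for cluster in clusters)
    return 1.0 - within / total


# Hypothetical two-variable cells grouped into two clusters.
two_clusters = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]
one_cluster = [[(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]]

r2_two = r_squared(two_clusters)
r2_one = r_squared(one_cluster)
print(r2_two, r2_one)
```

Computing this value for a range of cluster counts and plotting it gives the kind of curve from which a cluster count can be read off.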

Methodology: Second Step, the Pastoral Ecosystems: the basis for LEWS sampling

Having created an effective environments layer, we used the querying capabilities of the SCT (see Appendix A) and the 'expert system' of the LEWS team to build our climate based target zones for the sampling frame. This is an iterative process, involving the empirical exploration of the data, integration with published maps of ecosystem delineation, and the knowledge base of the LEWS team.

For example, the Maasai Steppe, a well-described pastoral ecosystem of north-central Tanzania, was evaluated using the SCT for its climatic conditions. We examined a geographically rigorous sample of 'points' within the Maasai Steppe to build an empirical model of the range of conditions found in the Steppe - at least according to our data. We then used this general model to query the EE layer for clusters whose means met the model. These were mapped and displayed for evaluation. In an iterative process, we selectively added or removed clusters according to whether they conformed to the 'local' knowledge of the LEWS team. The results dramatize the efficiency with which an objective-driven classification can be built when linking digital data surfaces with an expert system within a spatial frame.

The other three pastoral eco-climatic zones were created in the same manner, first identifying, from local knowledge and published maps, areas known to be of a 'specific' ecosystem. We selected a geographical sample of the site and built an empirical 'model' describing conditions within the zone. This model was then used to identify all clusters or EEs with similar characteristics. In this manner, we identified drier systems (arid and semiarid) and a more moist system (moist savanna).
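The query step used for each zone can be sketched as follows. This is a minimal Python illustration rather than the SCT query itself. It assumes the empirical 'model' is simply the observed minimum and maximum of each variable over the sample points, and that a cluster matches when all of its means fall inside those ranges. The variable names, cluster identifiers, and values are invented for the example.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]


def build_range_model(samples: Sequence[Record]) -> Dict[str, Tuple[float, float]]:
    """Empirical model: (min, max) of each variable over the sample points."""
    variables = samples[0].keys()
    return {v: (min(s[v] for s in samples), max(s[v] for s in samples))
            for v in variables}


def matching_clusters(cluster_means: Dict[int, Record],
                      model: Dict[str, Tuple[float, float]]) -> List[int]:
    """Identifiers of clusters whose means all fall within the model ranges."""
    return sorted(
        cluster_id
        for cluster_id, means in cluster_means.items()
        if all(low <= means[v] <= high for v, (low, high) in model.items())
    )


# Hypothetical sample points taken inside a known pastoral ecosystem.
samples = [
    {"season_precip_mm": 350.0, "max_temp_c": 29.0},
    {"season_precip_mm": 480.0, "max_temp_c": 27.0},
    {"season_precip_mm": 410.0, "max_temp_c": 31.0},
]

# Hypothetical per-cluster means from the effective environments layer.
cluster_means = {
    17: {"season_precip_mm": 400.0, "max_temp_c": 28.0},
    42: {"season_precip_mm": 900.0, "max_temp_c": 24.0},
    88: {"season_precip_mm": 360.0, "max_temp_c": 30.5},
    120: {"season_precip_mm": 470.0, "max_temp_c": 33.0},
}

model = build_range_model(samples)
candidates = matching_clusters(cluster_means, model)
print(model, candidates)
```

The candidate list is only a starting point: the iterative review against local knowledge then adds or removes individual clusters by hand.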

Results

We created a preliminary map/database of our target eco-climatic systems. This database enables the discussion of a robust spatial sampling frame to become specific. Roads, infrastructure, and other access issues can be evaluated. A balance can be found within the team ensuring that each eco-climatic system is sampled as thoroughly as our resources allow. The LEWS team was challenged to not only cover the broadest possible area, but to work together, and exchange information as similar eco-climatic pastoral systems exist across country borders.

Map 1, the Pastoral eco-climatic systems of East Africa:

Each eco-climatic zone is fully maintained in the GIS (topologically correct). This enables the evaluation of each zone and each cluster that made up the classification. The data in Table 1 describe the results of our eco-climatic pastoral system classification effort. Because topology is maintained, we can further sub-classify each zone according to any other of our spatial data. For example, it might be significant to stratify our sampling of the Grassland / Savanna zone by such features as the length of the dry period (and thus the more unimodal systems will separate from the more truly bi-modal systems). We can also use soils, vegetation, topography, etc. and local knowledge of these areas to contribute to our final sampling regime.

Table 1, the Pastoral eco-climatic system definition:

Once created, these systems were evaluated within the SCT for their vegetative cover / land use. Though this USGS land cover / land use database is a simple system based on 18 months of AVHRR 1 kilometer imagery, the data in Table 2 are consistent with our eco-climatic characterization. Land cover data are not fully developed and the AVHRR 1 kilometer cell size is fairly coarse. The results of this independent 'check' on our eco-climatic pastoral systems do show a pattern. For example, the amount of barren land is highest in the Arid system while cropland and cropland mosaics increase dramatically from the Arid through to the Moist Savanna classification.

Perhaps the strongest indication that the land cover classification is useful is in the category of shrubland. Shrublands decrease from a high of 63 percent (Arid system) to 2.53 percent in the Moist Savanna. Clearly our effort to build a sample frame must account for variation in soils, land use, land cover and other features (e.g. topographic) that we have not considered in the eco-climatic classification. We will use data like the USGS land cover / land use database to facilitate the sampling of each eco-climatic type using the best available additional information. This initial classification sets the framework into which these future 'sub-systems' can be identified and sampled.
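A percent-cover summary of the kind discussed above can be produced by cross-tabulating, cell by cell, the eco-climatic zone against the land cover class. The Python sketch below shows one way to do this. It assumes both layers have already been resampled to a common grid, and the zone names, cover classes, and cell counts are invented for the example and do not reproduce the study's figures.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def percent_cover(cells: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Percent of each zone's cells falling in each land cover class.

    `cells` holds one (zone, land_cover) pair per grid cell.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for zone, cover in cells:
        counts[zone][cover] += 1
    return {
        zone: {cover: 100.0 * n / sum(cover_counts.values())
               for cover, n in cover_counts.items()}
        for zone, cover_counts in counts.items()
    }


# Invented cells: (eco-climatic zone, land cover class).
cells = (
    [("Arid", "Shrubland")] * 3
    + [("Arid", "Barren")]
    + [("Moist Savanna", "Cropland")] * 2
)

table = percent_cover(cells)
print(table)
```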

Table 2, Land Cover Characterization (percent cover)

Conclusion

Our LEWS survey sites will then sample these four eco-climatic types. We recognize that there are pastoral systems existing outside the delineated areas. But remember that we are attempting to build a sampling frame in which we have 'representative' sites so that we can extrapolate conditions away from our fecal observation points (see Stuth and Dyke, this volume) for the purposes of a livestock early warning system. There are pastoral environments that have relatively unique conditions. Our goal is to provide as broad a spatial coverage as we can, given our limited resources. If we attempted to include all possible environment types, it would weaken our ability to effectively monitor the largest possible area.

References

Corbett, J.D. and R.F. O'Brien, 1997. The Spatial Characterization Tool-Africa v1.0.
Texas Agricultural Experiment Station, Texas A&M University, Blackland Research
Center Report No. 97-03, December 1997, documentation and CD-ROM.
Corbett, J.D., 1990. Agricultural Potential from an Agro-Climate Analysis for a
Semiarid Area of Kitui District, Kenya. University of Minnesota, Ph.D.
Dissertation, Geography.
Corbett, J.D., 1998. The Foundation database concept.
Data column, guest author, Africa GIS, accepted for publication, July 1998.
Corbett, J.D., S.N. Collis, and R.F. O'Brien, 1998. Almanac characterization tool for Angola,
Sierra Leone, and Liberia. A USAID OFDA CD-ROM publication. A resource base for
characterizing the agricultural and natural environments including a digital library.
Texas Agricultural Experiment Station, Texas A&M University, Blackland Research Center
Report No. 98-02, April 1998, documentation and CD-ROM.
Food and Agriculture Organization of the United Nations, 1981. Report on the Agro-Ecological
Zones Project, vol. 3, Methodology and Results for South and Central America.
Rome, FAO.
Gov of Kenya, 1996. Natural Conditions and Farm Management Information. Ministry of
Agriculture, Kilimo House, Nairobi, Kenya.
IGADD (Inter-Governmental Authority on Drought and Development) and U.N. FAO 1995. Crop Production System
Zones of the IGADD Sub-Region, Agrometeorology Working Paper Series N. 10, FAO Rome, Italy.
Jaetzold, R. and Schmidt, H., 1982 Farm Management Handbook of Kenya:
Jones, P.G. and P.K. Thornton. 1996. Continental-Scale Climate Databases for Agricultural Applications.
1996 - 1997 workplan submitted to Rockefeller Foundation.
Kendall, M. 1980. Multivariate analysis. 2nd ed. MacMillan, New York.
Koppen, W. and R. Geiger, 1936. Handbuch der Klimatologie. Berlin: Gebr. Borntraeger.
Milligan, G.W. 1980. An examination of the effect of six types of error perturbation on
fifteen clustering algorithms. Psychometrika 45:325-342.
SAS Institute. 1987. SAS/STAT guide for personal computers. Version 6 ed. SAS Inst., Cary, NC.
SAS Institute. 1990. SAS/STAT User Guide, Version 6, Fourth Edition, Volume 1. SAS Inst., Cary, NC.
Stuth, J.W. and Paul T. Dyke, 1998. The Use of NIR/NUTBAL, PHYGROW, and APEX in a Meta-Modeling
Environment for an Early Warning System to Monitor Livestock Nutrition and Health.
This volume.
Thornthwaite, C.W. 1948. An approach to the rational classification of climate. Geographical Review 38:55-94.

Appendix A

The SCT's capabilities include:

i) Site characterization. Any site in the available regions can be investigated in various ways. A report on the characteristics of the site (climate, population density, soils, etc.) can be generated and a climate profile created. Selection of any site enables the comparison of information at that site for similar areas throughout the database. This query allows the user to select the basis for comparison (climate model, soils etc.) and the acceptable variance to include as 'similar'.

ii) Zone mapping. A set or range of characteristics can be entered and a map produced showing all areas that share the characteristics.

iii) Zone characterization. This facility is similar to i) but applies to areas rather than individual sites. Statistics describing the characteristics of an area such as a characterization run (from i), county, state, or other 'polygon' (e.g. watershed) are provided.

iv) Transect generation. The user can choose the data of interest, for example elevation or precipitation, and plot the change in these data along any line.

v) Dynamic plot. This facility represents the first steps in incorporating the characterization of spatial data over time. At present this plot is restricted to the mapping of climate change through the year.
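Two of the capabilities listed above, finding areas similar to a chosen site (i) and generating a transect (iv), can be illustrated with a short Python sketch. This is not the SCT's code or interface. All names and values are hypothetical; similarity is assumed to mean that every chosen variable lies within a user-supplied absolute tolerance of the site's value, and the transect is sampled with a simple nearest-cell rule along a straight line in grid coordinates.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]


def similar_cells(grid_cells: Dict[str, Record], site: Record,
                  tolerance: Record) -> List[str]:
    """Cells whose values lie within `tolerance` of the site for every variable."""
    return sorted(
        cell_id
        for cell_id, values in grid_cells.items()
        if all(abs(values[v] - site[v]) <= tol for v, tol in tolerance.items())
    )


def transect(grid: Sequence[Sequence[float]], start: Tuple[int, int],
             end: Tuple[int, int], n_samples: int) -> List[float]:
    """Sample grid values at evenly spaced points on the line start -> end."""
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    (r0, c0), (r1, c1) = start, end
    values = []
    for k in range(n_samples):
        t = k / (n_samples - 1)
        row = round(r0 + t * (r1 - r0))  # nearest row
        col = round(c0 + t * (c1 - c0))  # nearest column
        values.append(grid[row][col])
    return values


# Hypothetical cell attributes and a site of interest.
grid_cells = {
    "a": {"annual_precip_mm": 600.0, "elevation_m": 1200.0},
    "b": {"annual_precip_mm": 650.0, "elevation_m": 1250.0},
    "c": {"annual_precip_mm": 900.0, "elevation_m": 300.0},
}
site = {"annual_precip_mm": 620.0, "elevation_m": 1220.0}
tolerance = {"annual_precip_mm": 50.0, "elevation_m": 100.0}
matches = similar_cells(grid_cells, site, tolerance)

# Hypothetical 3 x 3 elevation grid (m) and a diagonal transect across it.
elevation = [[100, 200, 300], [400, 500, 600], [700, 800, 900]]
profile = transect(elevation, start=(0, 0), end=(2, 2), n_samples=3)
print(matches, profile)
```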

Maintained by the Characterization Assessment and Applications Group
Blackland Research Center
Temple, Texas
For comments or suggestions contact remartin@brc.tamus.edu