Cave Density of the Greenbrier Limestone Group, West Virginia

This is a paper I co-authored with my MU colleague Lee Stocks, forthcoming in Papers in Applied Geography.

Lee Stocks Jr.
Department of Geography and Geology
Mansfield University
Mansfield, PA 16933
Andrew Shears
Department of Geography and Geology
Mansfield University
Mansfield, PA 16933



1. Introduction

The Greenbrier Limestone Group (Figure 1), known in West Virginia as the “Big Lime”, is an extensive, calcium-pure limestone unit of Mississippian Age (350-340 million years ago). Deposited in a shallow ocean basin during the Carboniferous, the Big Lime is over 1000 feet thick in the Greenbrier Valley of West Virginia. The wet climate of central Appalachia provides the hydraulics and corrosive carbonic acid action necessary to form frequent and sizeable karst dissolution features, such as caves, sinkholes, and springs. Some of the world’s largest caves, including Scott Hollow Cave (47.5 km), Organ Cave (61.9 km), The Hole (37.0 km), and Maxwelton Sink Cave (18.3 km; Figure 2), form here as contact caves, where the Big Lime meets the underlying McCrady Shale Formation. Likewise, thousands of sinkholes and pits have formed via dissolution of bedrock and collapse into subsurface cave passages. These features pose geohazards to infrastructure and provide pathways for aquifer contamination through sediment and pollutant transport, and therefore call for a geographic understanding of karst feature density.


Figure 1: Exposed big lime in study area

Figure 2: Maxwelton Sink Entrance


This research uses the geographic and geologic analysis capabilities of ArcGIS 10 to produce a preliminary spatial analysis of the relationships between this extensive stratigraphic unit and the development of karst features, in order to explore any structural or geographic controls on their genesis. Data are derived from the West Virginia Speleological Survey database of over 4500 caves and pits, including length, depth, elevation, and stratigraphic unit attributes. Hexagonal bins with a variety of diameters were used for statistical analyses. Each of the tests performed showed statistically significant spatial relationships among cave sites at all levels of analysis.


2. Rationale

The Greenbrier Limestone, the most abundant rock formation in the study area, is composed mainly of marine limestones and shales of Mississippian Age. The Greenbrier Limestone Group, which is 122 meters thick in areas (Figure 3), represents the predominant exposed rock and produces thousands of caves in Greenbrier County. The management and use of these lands are crucial, as 10 percent of the Earth’s surface is karst (landform typified by sinkholes and caves), and more than 25 percent of the nation’s drinking water comes from karst aquifers and groundwater (Ford and Williams, 1992). Karst watersheds are more sensitive to human disturbances than other watersheds because of fast moving aquifers, slow recharge rates, and thin soils for filtration of pollutants (Doerfliger et al., 1997). Human-induced impacts in karst areas are often associated with urban growth at the watershed level. This growth produces hydrologic changes that increase sediment and pollutant delivery and impede infiltration as impervious surfaces such as rooftops, streets, buildings, and pavement increase (Doerfliger et al., 1997). These changes, in turn, increase impacts on surface and cave biota and on water quality and quantity.


Figure 3: Thick Greenbrier outcrop

Figure 4: Karst features in study area


Human use of karst lands has altered the structure and functioning of these sensitive environments (Verberg and Chen, 2000). Agriculture, forestry, and other land management practices have drastically changed their morphology. Kastning and Kastning (1997) explain that the typical problems in karst associated with development of the built environment include instability of soils, subsidence and collapse of the ground surface (Figure 4), erosion and sedimentation of sinkholes, sinkhole flooding, and groundwater contamination. These issues have become more frequent and extensive in West Virginia as surface development has increased in subsurface recharge zones. Therefore, a geographic and spatial analysis of the distribution of caves can yield useful information to aid local planning efforts in these sensitive areas, and provide a better understanding of cave formation and location.

White et al. (1986) define a cave as a natural opening in the Earth, large enough to admit humans. Caves form as a result of a complex combination of geologic and hydrologic processes. The interaction of these factors determines where, when, and how an individual cave entrance or passage may form. The vast majority of caves in the study area are formed in limestone through the hydraulic and corrosive action of carbonic acid. White and Culver (2012) point out that caves are no longer viewed as geological anomalies standing on their own; rather, they are repositories of larger climatic and geologic systems with many subsystems. Passages are fragments of conduit systems that were once part of a groundwater system for the region. Active caves can indicate current hydrologic systems, whereas dry caves can explain how drainage systems evolved. For example, Maxwelton Cave System in West Virginia (Figure 5) is an actively forming cave fed by a surface stream that disappears underground.

Figure 5: Maxwelton Cave Map Superimposed on Orthoimagery

A number of different spatial methods have been used to model sinkholes and hazard potential in karst landscapes, but very little has been done to analyze cave entrance relationships or density in a particular geologic unit. Among the former, the most popular are those based on proximity of neighboring sinkholes (Drake and Ford, 1972) or sinkhole density (Orndorff et al., 2000). These methods are used to make inferences about the relationships between karst features and geologic or hydrologic factors.


3. Study Area

The study area for this research (Figure 6) is extensive, including all known, mapped cave entrances in West Virginia. This database (Table 1) is maintained and updated by the West Virginia Speleological Survey (WVASS), a collection of amateur cavers, professional geologists, and caving groups that compile and organize county and area cave surveys. WVASS publishes monthly bulletins, monographs, and special issues related to caves of interest in West Virginia. The database is currently in Access format and contains 4,508 known cave locations, with records organized by county. It contains various information, including cave name, UTM or latitude/longitude coordinates, elevation, geologic unit, date of report, length, depth, stream ingress/egress, biology, etc., as well as short descriptions of the cave entrance, such as whether it is in a sinkhole, blind valley, sinking stream or other. This is an invaluable database for multiple purposes.

The subset of caves formed in the Greenbrier Limestone Group that was extracted from the database included 4,375 records, of which 4,050 had plottable geographic coordinates. Although the database is maintained by cavers and for cavers, the data are proprietary because of liability issues that arise from publishing the location of caves found on privately held land. Use of the data for this project was granted by its trustees under an agreement that identifiable cave locations on privately owned land be withheld from public release. Caves in the database with georeferenced coordinates (4,050 caves) were plotted using ArcGIS for purposes of mapping (Figure 6), as well as point-based geospatial analyses.

Figure 6: Study area dataset, extent of Greenbrier Limestone Group bedrock and selected cave locations in Greenbrier County

The hydrologic and environmental relationships between surface land use and cave health are well established in the literature (Bhaduri et al., 1997). Likewise, the presence of caves can pose geologic hazards via sinkhole collapse. Sudden sinkhole development in karst areas has become an increasing threat as watersheds are urbanized. Human impacts can change the hydrology and infiltration morphology, resulting in localized flows, underground soil piping, and soil cover collapse (Newton, 1984). Sinkholes and cave entrances also funnel contaminants into underlying aquifers, leading to regional impacts (Galloway et al., 1999).

Table 1: Dataset sample

4. Methods

West Virginia caves formed in the Greenbrier Limestone Group were extracted from the WVASS proprietary database and plotted, mapped, and binned in preparation for various geospatial analyses. Spatial autocorrelation and multi-distance spatial cluster analysis were performed on the point location data, which represents mapped cave entrances, to express general characteristics of clustering in the statewide dataset. The points were also binned to a hexagonal grid; the count attribute for each cell was used to perform cluster/outlier and hotspot analysis, specifically to identify promising locations for future study and exploration.

A Global Moran’s I spatial autocorrelation (after Moran, 1950) was run on the point shapefile to determine the Moran’s I Index (Figure 7), a statistical expression of point clustering in which a positive value identifies clustered points, while a negative value identifies uniform or dispersed patterns. Spatial autocorrelation is the correlation of a variable with itself through space. If there is any systematic pattern in the spatial distribution of the variable, it is spatially autocorrelated. When neighboring areas are more alike, the variable exhibits positive spatial autocorrelation; when they are more different, it exhibits negative autocorrelation. Random patterns have no autocorrelation. Therefore, this statistic can test the assumption of independence or randomness, providing insight into the genesis of cave entrances.
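
The statistic can be illustrated with a minimal sketch. It assumes a toy row of eight cells with binary contiguity weights (each cell neighbors the cells on either side); the function and variable names are hypothetical, and the sketch is not the ArcGIS implementation used in this study.

```python
import numpy as np

def global_morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights is an n x n spatial weights matrix (zero on the diagonal).
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

def chain_weights(n):
    """Binary contiguity weights for n cells arranged in a row (toy layout)."""
    w = np.zeros((n, n))
    for i in range(n - 1):
        w[i, i + 1] = w[i + 1, i] = 1.0
    return w

w = chain_weights(8)
i_clustered = global_morans_i([1, 1, 1, 1, 0, 0, 0, 0], w)    # like values adjacent
i_alternating = global_morans_i([1, 0, 1, 0, 1, 0, 1, 0], w)  # unlike values adjacent
```

In this toy layout the grouped values give a positive index and the alternating values give a negative one, matching the interpretation described above.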

Figure 7: Moran’s Spatial Autocorrelation Illustration (ArcGIS 10)

To further express localization of point clusters, a multi-distance spatial cluster analysis based on Ripley’s K-function (Figure 8) was performed, using analysis distances in 100m increments from 100m to 1000m, then 1000m increments up to 10km for each point. The multi-distance cluster analysis can determine whether caves exhibit statistically significant clustering or dispersion over a range of distances.
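
As a rough sketch of the idea, the K-function can be estimated by counting, for each distance, the pairs of points that lie within that distance of each other, and comparing the result with the value expected under complete spatial randomness. The sketch below assumes no edge correction, synthetic clustered points in a hypothetical 10km by 10km study area, and the 100m to 1000m distance range described above; it does not reproduce the ArcGIS tool or its confidence envelope.

```python
import numpy as np

def ripley_k(points, distances, area):
    """Naive Ripley's K estimate (no edge correction) for 2-D points."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[:, None, :] - pts[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise distance matrix
    np.fill_diagonal(d, np.inf)             # a point is not its own neighbor
    return np.array([area * (d <= r).sum() / (n * (n - 1)) for r in distances])

# Synthetic data (an assumption for illustration): two tight clusters of 50 points.
rng = np.random.default_rng(0)
centers = np.array([[2500.0, 2500.0], [7500.0, 7500.0]])
points = np.vstack([c + rng.normal(scale=200.0, size=(50, 2)) for c in centers])

distances = np.arange(100, 1001, 100)        # 100m to 1000m in 100m steps
k_observed = ripley_k(points, distances, area=10_000.0 ** 2)
k_expected = np.pi * distances.astype(float) ** 2   # value under spatial randomness
```

Observed K values above the expected values indicate clustering at that distance; values below indicate dispersion.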

Figure 8: Ripley’s K Function (ArcGIS 10)

The statewide scale of analysis and the extremely close proximity of many caves to their neighbors suggested that binning would allow a more accurate analysis by creating a spatially normalized surface for both visual and statistical comparisons of density. Because results of the multi-distance spatial cluster analysis provided limited insight into appropriate bin sizes for further clustering analysis, a 2500m diameter was chosen for the bins as a polygon size that was large enough for mapping at the state scale, yet small enough to provide areas manageable for further field study. The cave points were binned to a shapefile of a hexagonal polygon grid at this diameter, with the count of caves in each bin used as the attribute for mapping and analysis. A map displaying counts for each 2500m bin (Figure 9) was created to better visualize the distribution of caves in the state.
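
The binning step can be sketched without GIS software. The example below assumes projected coordinates in meters (such as UTM), pointy-top hexagons, and that the 2500m diameter is measured corner to corner; these are assumptions for illustration, since the grid orientation and origin used in ArcGIS are not stated here. The function names and sample points are hypothetical.

```python
import math
from collections import Counter

def _cube_round(q, r):
    """Round fractional axial hex coordinates to the nearest hexagon."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Re-derive the coordinate with the largest rounding error so q + r + s == 0.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return int(rq), int(rr)

def hex_cell(x, y, diameter=2500.0):
    """Axial (q, r) index of the pointy-top hexagon containing point (x, y)."""
    size = diameter / 2.0                      # center-to-corner distance
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _cube_round(q, r)

def bin_counts(points, diameter=2500.0):
    """Count points per hexagonal cell."""
    return Counter(hex_cell(x, y, diameter) for x, y in points)

# Hypothetical coordinates in meters, not real cave locations.
sample = [(0.0, 0.0), (100.0, 50.0), (-200.0, 300.0), (0.0, 3750.0)]
counts = bin_counts(sample)
```

Each key in `counts` identifies one hexagon, and its value plays the role of the count attribute used for mapping and for the cluster statistics.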

Figure 9: Count of caves, binned to 2500m diameter hexagonal grid

Though several groupings are apparent in Figure 9, especially the one centered in Greenbrier County, visual analysis alone cannot establish whether possible clusters are statistically significant. Identifying such clusters is crucial to selecting areas for further study and exploration. The creation of a binned polygon shapefile enabled the use of two additional statistical cluster analysis techniques in ArcGIS. An Anselin Local Moran’s I cluster analysis, with values of contiguous polygons used for neighbors, was used to identify high-count, statistically significant clusters of bin cells. Polygons valued as “High-High” by this test are those that are both highly clustered internally (a high cave count) and surrounded by highly clustered cells. Additionally, a Getis-Ord Gi analysis was performed to identify statistically significant “hot spot” cells, based on the count attribute for cells and their contiguous neighbors. The Getis-Ord Gi analysis was used to derive z-scores for each cell that describe how clustered caves are, based on the count of that cell and its neighbors. A z-score greater than +2.58 standard deviations for a cell (the 99 percent confidence level) denoted that the caves there were clustered in a statistically significant fashion. For both cluster analyses performed on these bin polygons, contiguous neighbors were used because the hexagonal binning had already largely controlled for distance and non-uniform polygon shapes.
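
The hot spot z-scores can be illustrated with a small sketch. It assumes the Gi* form of the Getis-Ord statistic with binary contiguity weights in which each cell counts as its own neighbor, and it uses a toy row of 30 cells rather than the hexagonal grid; the `neighbors` mapping and the counts are hypothetical, and the sketch is not the ArcGIS implementation.

```python
import numpy as np

def getis_ord_gi_star(counts, neighbors):
    """Gi* z-score per cell, with binary weights over a cell and its neighbors."""
    x = np.asarray(counts, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)      # global standard deviation
    z = np.empty(n)
    for i in range(n):
        idx = [i] + list(neighbors[i])            # the cell itself plus contiguous cells
        w_sum = float(len(idx))                   # binary weights: sum(w) == sum(w**2)
        numerator = x[idx].sum() - mean * w_sum
        denominator = s * np.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        z[i] = numerator / denominator
    return z

# Toy data: 30 cells in a row, with high counts in three adjacent cells.
counts = np.zeros(30)
counts[10:13] = 20
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 30] for i in range(30)}

z_scores = getis_ord_gi_star(counts, neighbors)
hot_spots = np.flatnonzero(z_scores > 2.58)       # threshold used in the text
```

With these toy counts, only the three high-count cells exceed the +2.58 threshold; the cells next to them are raised by their neighbors but stay below it.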


5. Results

A visual examination of Figures 6 and 9 showed that caves in the Greenbrier Limestone Group appear to be spatially clustered, largely in areas known to be geologically underlain by the Big Lime. Based on Figure 9, several groupings of caves were visually apparent:

  1. A linear grouping extending in two lines southwestward from central Randolph County, joining together in Pocahontas County and continuing southwestward through Greenbrier and into Monroe County.
  2. Approximately four linear groupings, arranged in parallel lines from northeast to southwest through Randolph and Pendleton Counties.
  3. A grouping in the state’s eastern panhandle, Berkeley and Jefferson counties.
  4. A small grouping spread across the border between Monongalia and Preston Counties.
  5. A grouping in southern Mercer County, which could be a continuation of Grouping 1 if features continued south into Virginia, outside the database’s spatial extent.

However, visual analysis is limited in its value for expressing the spatial relationships between these cave locations. Spatial autocorrelation and multi-distance cluster analyses of the point location shapefile each revealed that caves in the Greenbrier Limestone Group were spatially clustered in a statistically significant fashion. The spatial autocorrelation test yielded a Global Moran’s I value of 0.444056, indicating that locations of caves are more clustered than random or dispersed. The test’s z-score of +134.5 and reported p-value of 0.0 suggested that the clustering detected had a high degree of statistical significance. Hence, spatial autocorrelation reaffirmed the visual analysis of Figure 9, in that caves of the Greenbrier Limestone Group in West Virginia are spatially clustered.
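
A reported p-value of 0.0 is best read as a value too small to represent, not as an exact zero. A minimal sketch, assuming the z-score is compared against a standard normal distribution with a two-tailed test (how ArcGIS computes and rounds its p-value is an assumption here):

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a z-score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_at_threshold = two_tailed_p(2.58)    # just under 0.01, the 99 percent level
p_reported = two_tailed_p(134.5)       # too small for double precision; evaluates to 0.0
```

The same function shows why a z-score of 2.58 is used as a cutoff elsewhere in this paper: it corresponds to a two-tailed p-value just under 0.01.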

Multi-distance spatial cluster analysis provided further insight into the nature of clustering in the study area. Ripley’s K-function was determined twice: once using a range of distances from 100m to 1000m in 100m increments (Figure 10), and a second time using a range of distances from 1000m to 10km in 1000m increments (Figure 11).


Figure 10: K-Function up to 1000m

Figure 11: K-Function up to 15km


In both iterations of this analysis, the points are shown to be strongly and significantly clustered because the Observed K value is much higher than the Expected K from a statistically random dataset, and well outside the confidence envelope. These results suggested that clustering is statistically significant regardless of the distance around a given point used to determine clustering; in fact, the likelihood of a point’s membership in a cluster increased relative to expected values when neighbors at longer distances were considered. Cave locations in West Virginia are extremely likely to be part of a cluster of caves, and clustering is statistically significant when neighbors up to 10km around each point are considered.

The binned hexagons with cave count attributes were used to perform further geospatial analyses. Anselin Local Moran’s I cluster analysis identified 383 hexagonal cells in the 2500m diameter binning grid denoted as “High-High” (Figure 12), meaning that caves were not only highly clustered in each of those cells, but neighboring cells also hosted highly clustered caves.

Figure 12: Anselin Local Moran’s I Cluster Analysis, 2500m bins

This analysis further delineated boundaries between the groupings visually identified using Figure 9; the groups of clustered cells in Figure 12 are more clearly separated. The largest grouping remained the series of cells stretching from southern Randolph and northern Pocahontas counties southwestward through Greenbrier County and into Monroe County; however, the clusters in southern Mercer County appear more distinct from this main grouping. The parallel linear groupings in eastern Randolph and Pendleton counties were broken into six smaller groups with less linear extent. The panhandle grouping in Berkeley and Jefferson counties and the small cluster straddling the Monongalia and Preston county border are each still visible, though each is de-emphasized through this mapping analysis.

Finally, further hot spot analysis was conducted through the calculation of Getis-Ord Gi statistics for each cell; the most important measure from Getis-Ord Gi is the derived z-score, which signifies the statistical significance of clustering (Figure 13).

Figure 13: Z-Scores from Getis-Ord Gi Hot Spot Analysis, 2500m bins

Using this statistical measure, some 610 cells were identified as statistically significant clusters, with z-scores greater than +2.58 standard deviations from the mean of a normal distribution. The Getis-Ord Gi hot spots correspond closely with the groupings identified visually and with the Anselin Local Moran’s I cluster analysis, but the hot spots on the Getis-Ord Gi map are generally larger and more pronounced. The same major groupings (one from Pocahontas and Randolph counties southwestward into Monroe County, another comprising the several groups across Randolph and Pendleton counties) were visible here in a slightly larger form. The smaller groupings (the panhandle grouping in Berkeley and Jefferson counties, the grouping straddling the Monongalia-Preston County border, and the group in southern Mercer County) were far more pronounced with this analysis, and some other smaller clusters emerged for the first time.


6. Conclusion

Preliminary efforts to explore this dataset with geospatial and statistical methods yielded promising results. Getis-Ord Gi, Moran’s I, and Ripley’s K-Function statistics consistently show significant clustering and autocorrelation of caves in the Greenbrier Group. This is expected, as geology and topography are strong controls on cave formation and entrance genesis. Progressive and more extensive exploratory statistics, coupled with fieldwork, will provide more useful data and analysis that can be integrated into policy decisions at the local and regional levels, as well as contribute to a better understanding of the mechanisms of cave development in the Greenbrier Group limestones of West Virginia.


7. Future Work

Geographically Weighted Regression (GWR) is a statistical method for examining how relationships between variables vary geographically. It provides an advantage over other methods (e.g., Ordinary Least Squares regression) in that it fits a regression model at each individual point, as opposed to a global multivariate regression model that assumes relations among variables are constant across the region of interest. In the geographic model, observations are weighted by their distance from the point being modeled: observations closer to that point are weighted more heavily than those farther away, with the weight decreasing as distance increases. The results can be used to determine the degree to which variables contribute to the model predictions at specific points in space. If variables are found to be significant contributors to the model prediction, their coefficients can be mapped to provide a visual means of inference. Modern computing allows for the efficient spatial analysis and visualization of large geographic datasets within a Geographic Information System (GIS). Performing a GWR on this dataset would provide further inferences about the distributional relationships of caves in West Virginia.
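
The local-model idea can be sketched as a weighted least squares fit repeated at every location. The example below assumes a Gaussian distance-decay kernel with a fixed bandwidth, which is one common choice and not necessarily the weighting a future analysis would use; the data are synthetic, with a relationship that is deliberately stronger in the east than in the west, and all names are hypothetical.

```python
import numpy as np

def gwr_coefficients(coords, x, y, bandwidth):
    """Fit a distance-weighted linear regression at each location.

    Returns one row of [intercept, slope] per location.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    betas = np.empty((len(coords), design.shape[1]))
    for i, site in enumerate(coords):
        dist = np.linalg.norm(coords - site, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)   # Gaussian kernel (assumed)
        xtw = design.T * w                            # weight each observation
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas

# Synthetic transect: ten sites 1000m apart; the true slope is 1 in the west, 3 in the east.
coords = np.column_stack([np.arange(10) * 1000.0, np.zeros(10)])
covariate = np.arange(10) % 3
true_slope = np.where(coords[:, 0] < 5000.0, 1.0, 3.0)
response = true_slope * covariate

betas = gwr_coefficients(coords, covariate, response, bandwidth=1000.0)
```

A single global regression would report one compromise slope for the whole transect, whereas the local fits recover a slope near 1 at the western end and near 3 at the eastern end. Mapping such local coefficients is the visual means of inference described above.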

Further understanding of cave distribution in relation to environmental variables on a much smaller scale can be pursued using the results of this study. Results from the Anselin Local Moran’s I spatial cluster and Getis-Ord Gi hot spot analyses identified a number of hexagon cells with highly clustered cave entrance locations that were statistically significant in both tests. Some of these cells, near Lewisburg in Greenbrier County, will serve as sites for future study comparing caves to various environmental variables, such as soil composition, slope, and land cover, on a local scale.


8. References

Bhaduri, B., M. Grove, C. Lowry, and J. Harbor. 1997. Assessing Long-Term Hydrologic Effects of Land Use Change. Journal of the American Water Works Association 89: 94-106.

Drake, J. and D. Ford. 1972. The Analysis of Growth Patterns of Two-Generation Populations: The Examples of Karst Sinkholes. Canadian Geographer 16: 381-384.

Ford, D. and P. Williams. 1992. Karst Geomorphology & Hydrology. Chapman & Hall, NY.

Galloway, D., D. Jones, and S. Ingebritsen. 1999. Land Subsidence in the United States. USGS Circular 1182.

Kastning, E.H. and K.M. Kastning. 1997. Buffer Zones in Karst Terranes. Karst-Water Environment Symposium Proceedings, eds. T. Younos, T. Burbey, E. Kastning, and J. Poff, 80-87.

Moran, P. 1950. Notes on Continuous Stochastic Phenomena. Biometrika 37(1): 17-23.

Newton, J. 1984. Review of Induced Sinkhole Development. Sinkholes: Their Geology, Engineering and Environmental Impact. Proceedings of the First Multidisciplinary Conference on Sinkholes and the Engineering and Environmental Impacts of Karst. Rotterdam, pp. 1-13.

Orndorff, R., D. Weary, and K. Lagueux. 2000. GIS Analysis of Geologic Controls on the Distribution of Dolines in the Ozarks of South-Central Missouri, USA. Acta Carsologica 29(2): 161-175.

White, E., G. Aron, and W. White. 1986. The Influence of Urbanization on Sinkhole Development in Central Pennsylvania. Environmental Geology 8(1-2): 91-97.

White, W. and D. Culver. 2012. Encyclopedia of Caves. Academic Press.

Author: Andrew Shears

Andrew Shears is an Assistant Professor of Geography at Mansfield University in Mansfield, Pennsylvania. His research interests lie at an intersection of the human-environmental nexus, and include branches of mapping, technological, memorialization and urban geographies. He lives in Wellsboro, Pennsylvania with his wife Amy, a professional photographer.