PREDICTING SPATIAL DISTRIBUTION OF ARGILLIC HORIZON USING AUXILIARY INFORMATION AT REGIONAL SCALE

To support better soil management, the spatial distribution of soils having an argillic horizon (argillic soils) must be recognized. It can be delineated through soil survey mapping, but this activity consumes much time and money. This study aimed to build a decision tree model for predicting the spatial distribution of the argillic horizon from auxiliary information using three environmental predictor variables: geomorphic surface or substrate, landsurface unit, and ecoregion belt. A tree-based modeling technique was used to generate a classification tree model from 318 pedons of Lampung Province, Indonesia. The argillic horizon is predicted to be present in the hot belt (elevation of 0-200 m above sea level) on interfluve-seepage slope with a probability of 84% for acid igneous rock, 83% for basic igneous rock, and 90% for acid sedimentary rock. It is also predicted to be present in the hot belt on transportational midslope with a probability of 65% for transported acid sedimentary rock. The argillic horizon is predicted to be absent, with a probability of occurrence ranging from 0% to 32%, on other combinations of landsurface unit, ecoregion belt, and substrate.


INTRODUCTION
The interaction of environmental variables grouped as geological, climatological, biological, and topographical establishes the soil-forming processes, whose action on parent material manifests itself in soil morphologies, which in turn alter the nature of the ongoing processes (Chadwick and Graham, 2000). The geological factor constitutes a site factor that sets the initial condition for soil formation, whereas the climatological and biological factors represent the energy input that drives soil development. Within any of these factors, local differences in topography modify the activity of the more broadly defined variables. On this understanding, a soil surveyor may predict soil properties and their distribution across the landscape using such environmental variables.
Many researchers have developed models, such as mathematical equations, decision trees, rules, and neural networks, to predict soil properties and their distribution at either local or regional scale. They use different environmental predictor variables, approaches, techniques, targets (i.e. a soil property, soil quality, or soil type), and scales. At regional scale (scale of 1:250,000 to 1:100,000), McKenzie and Ryan (1999), for example, used geology, climate, and landscape position as predictors. On the other hand, Chaplot et al. (2003) used watershed area, landscape position, elevation, and slope, while Sulaeman (2004) used geomorphic surface, landscape position, and elevation as predictors. The technique for characterizing these predictors also varies: for landscape position, McKenzie and Ryan (1999) and Chaplot et al. (2003) used the compound topographic index, whereas Sulaeman (2004) and Park et al. (2001) used the Dalrymple model (Dalrymple et al., 1968).
In addition, some researchers (e.g. Daniels et al., 1971; Carter and Chiolkoz, 1991) prefer a parametric technique, but others (e.g. McKenzie and Ryan, 1999) prefer a non-parametric one. The parametric approach relies on several assumptions; among others, the residuals must be normally distributed with a mean of zero, the predictors must be uncorrelated, and the variance must be stable. However, soil data do not always satisfy these assumptions. Wilding and Dress (1983) showed that some soil properties follow non-normal distributions. Because of this obstacle, other researchers prefer non-parametric approaches, such as tree-based modeling, generalized additive modeling, and neural networks.
However, the choice of approach depends on the number of samples and software availability.
Argillic soils, i.e. soils having an argillic horizon, mainly Alfisols and Ultisols (Soil Survey Staff, 1999; Soil Survey Staff, 2003), are dense, as shown by a hard consistency, especially in the subsoil or diagnostic horizon. These characteristics can limit root penetration and reduce air and water penetration. For example, the infiltration rate of Ultisols under mixed garden is about 2.2 cm hour⁻¹ and that of Alfisols is about 1 cm hour⁻¹, while the infiltration rate of Andisols is about 3.5 cm hour⁻¹ (Watung et al., 2005). For better soil management, the spatial distribution of argillic soils must be recognized. It can be delineated through soil survey mapping, but this activity consumes much time and money (see e.g. Bie and Becket, 1970; Burrough et al., 1971; Becket, 1981; Sulaeman, 2004). Predicting the occurrence of argillic soils with a model is an alternative. Such a model can be built from a large soil dataset using data mining techniques.
This paper discusses the modeling of the spatial distribution of argillic horizon using auxiliary information as predictors. Tree-based modeling is used to develop the model since it is non-parametric and straightforward to interpret. The resultant model can be used, among other things, to make a hypothetical soil map.

Dataset
Data for this modeling were extracted from the soil and terrain database of Lampung Province (Sulaeman, 2004). Digital maps of point observations, substrate, landsurface, and ecoregion were derived and then superimposed to obtain the dataset (Fig. 1). Hence the dataset contains the observation code, substrate, landsurface, ecoregion, and the occurrence of an argillic subsurface horizon, for a total of 318 observations.

The dataset is predominantly from transportational midslope (144 samples). By substrate, the samples are predominantly developed from basic igneous rock, while by ecoregion belt, the samples are predominantly taken from the hot belt (Table 1).

Deriving Rule Model
This study derived a soil-landscape model, which relates environmental variables to soil properties. The soil-landscape model takes advantage of Jenny's soil formation concept (Jenny, 1954), which considers that soil properties are controlled by climate, parent material, relief, organisms, and time. Other workers (e.g. McKenzie and Ryan, 1999; Chaplot et al., 2003) have developed their own soil-landscape models. The principal difference among these models is the explanatory variables used. This study uses substrate, landsurface unit (see Dalrymple et al., 1968; Conacher and Dalrymple, 1977; Park et al., 2001), and ecoregion belt (see Mohr and van Barren, 1954) as explanatory (independent) variables and the occurrence of argillic horizon as the dependent variable. Sulaeman (2004

Explanatory Variable
The geomorphic surface, landsurface unit, and ecoregion belt represent parent rock/material, relief, and temperature, respectively. They can be generated from auxiliary information, i.e. geomorphic surface from the geological map, and landsurface unit and ecoregion belt from the topographical map. Both geological and topographical maps are available cheaply countrywide. Since these variables are also mappable, their spatial variability and extent can be recognized.
As suggested by the classification tree (Fig. 2), landsurface is the first splitting variable of the dataset. This indicates that landscape position strongly influences the spatial variability of the argillic horizon. The stability of the landsurface, shown among other things by freedom from pedoturbation and erosion, may be the source of such variability. Soil formation on interfluve-seepage slope is free from erosion and produces relatively old soil. In contrast, soil formation on colluvial footslope and alluvial toeslope is much disturbed by erosional and depositional processes and produces relatively young soil.
The deviance, however, is lower when ecoregion belt is used as a splitting variable (Table 2). For example, the deviance of interfluve decreases by 39% in the hot belt and by 27.47% in the warm and temperate belts. This suggests that the combination of landscape position and ecoregion belt is a more effective predictor than landscape position alone.
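The deviance comparison can be illustrated with a minimal sketch. It assumes the common multinomial definition of node deviance, D = -2 Σ n_k ln(n_k / n); the text does not state which definition the software used, so this is an assumption. The class counts below are hypothetical and are not taken from Table 2.

```python
import math


def node_deviance(counts):
    """Multinomial deviance of one tree node: -2 * sum(n_k * ln(n_k / n)).

    `counts` holds the number of pedons per class, e.g. (present, absent).
    Empty classes contribute nothing to the sum.
    """
    n = sum(counts)
    return -2.0 * sum(c * math.log(c / n) for c in counts if c > 0)


def deviance_reduction_pct(parent, children):
    """Percent drop in deviance when `parent` is split into `children`."""
    d_parent = node_deviance(parent)
    d_children = sum(node_deviance(c) for c in children)
    return 100.0 * (d_parent - d_children) / d_parent


# Hypothetical (present, absent) counts, for illustration only.
parent = (50, 50)
children = [(40, 10), (10, 40)]
print(round(deviance_reduction_pct(parent, children), 1))
```

A pure node (all pedons in one class) has zero deviance, so a split that separates the classes more cleanly yields a larger percentage reduction.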

The Spatial Occurrence of Argillic Horizon
The argillic horizon is predicted to be present on interfluve-seepage slope in the hot belt (Table 3). Yet the probability of occurrence differs among parent materials. Acid sedimentary rock, including sandstone, claystone, etc., has the highest probability (90%). Acid igneous rock, including acid tuff, granite, diorite, etc., has a probability of 84%, whereas basic igneous rock, namely basalt, serpentine, etc., has the lowest probability (83%). These results confirm those of Tafakresnanto and Siswanto (2004), who showed that acidic parent material tends to form argillic soils.
The argillic horizon, however, is predicted to be absent on colluvial footslope and alluvial toeslope, and also on all transportational midslope except under hot belt and acid sedimentary parent material. These landscape positions are unstable surfaces. Soil formation on these surfaces is often disturbed by erosion, deposition, mass movement, or a combination of them, and produces young non-argillic soils.

Decision Tree
The argillic horizon is also predicted to be present on transportational midslope in the hot belt, but only on transported acid sedimentary rock, with a probability of 65%, lower than on the interfluve. This also confirms the result of Tafakresnanto and Siswanto (2004). Translocation of clay requires acidic or sodic alkaline conditions (Buol et al., 1997). Acid sedimentary rocks contain acid minerals that produce an acid solution when they weather (Birkland, 1984).
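The rules reported so far can be written as a small lookup, sketched here under stated assumptions. The probabilities are the ones given in the text; the key spellings, the dictionary name `PRESENT_RULES`, and the function name `predict_argillic` are illustrative choices, not part of the original model output.

```python
# Probabilities come from the text; names and key spellings are assumptions.
PRESENT_RULES = {
    ("interfluve-seepage slope", "hot", "acid igneous"): 0.84,
    ("interfluve-seepage slope", "hot", "basic igneous"): 0.83,
    ("interfluve-seepage slope", "hot", "acid sedimentary"): 0.90,
    ("transportational midslope", "hot", "transported acid sedimentary"): 0.65,
}


def predict_argillic(landsurface, belt, substrate):
    """Return (predicted_present, probability).

    For combinations outside the four 'present' rules the text reports only
    a range (0% to 32%), so the probability is returned as None.
    """
    p = PRESENT_RULES.get((landsurface, belt, substrate))
    if p is None:
        return False, None
    return True, p


print(predict_argillic("interfluve-seepage slope", "hot", "acid sedimentary"))
print(predict_argillic("colluvial footslope", "hot", "acid sedimentary"))
```

Returning None for the absent class avoids inventing a single probability where the source gives only a range.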
Figure 2 provides a flow chart for predicting the occurrence of argillic horizon as revealed in Table 3. The reading is straightforward. Estimation begins by identifying the landscape position in the region. For example, if the landscape position is interfluve-seepage slope according to the definition in Table 3

*) hot = elevation of 0-200 m asl, warm = elevation of 200-1000 m asl, temperate = elevation of 1000 m asl or higher (see Mohr and van Barren, 1954)
**) AI = acid igneous rock, BI = basic igneous rock, AS = acid sedimentary rock

Practical Implication
Our result is a set of rules for predicting the spatial occurrence of argillic horizon in Lampung Province at a regional scale of 1:250,000. These rules can be used as a hypothesis of the regional-scale spatial distribution of argillic horizon in other regions having environmental conditions similar to those of Lampung Province. Applying these rules at a more detailed scale is acceptable as long as the landsurface unit can be identified.
The advantage of these rules is that they use predictors that are mappable and easily identified from auxiliary information. We can delineate the geomorphic surface from the geological map and determine the ecoregion belt from the contour map. Moreover, we can identify the landscape position using topographic profile analysis of the topographic map. Eventually, we can predict the spatial distribution of argillic horizon in a region. Using this set of rules, we can develop a hypothetical map that is useful in pedological research as well as in other studies (e.g. ecological, agricultural, and environmental studies).

CONCLUSION
1. Argillic horizon is present in the hot belt on interfluve-seepage slope with a probability of 84% for acid igneous rock, 83% for basic igneous rock, and 90% for acid sedimentary rock.
2. Argillic horizon is present in the hot belt on transportational midslope with a probability of 65% for transported acid sedimentary rock.
3. Argillic horizon is absent, with a probability of occurrence ranging from 0% to 32%, on other combinations of landsurface unit, ecoregion belt, and substrate.
4. Using the decision tree, one may predict the occurrence of argillic horizon in a given region as long as the predictors can be identified.
Figure 1. Flowchart of the Study

Sulaeman (2004) formulated the soil-landscape model as the following spatial equation:

$$S = G_i + L_j + E_k + \varepsilon$$

where $S$ = soil (e.g. color, texture, structure, etc.), $G_i$ = $i$-th substrate, $L_j$ = $j$-th landsurface, $E_k$ = $k$-th ecoregion belt, and $\varepsilon$ = random error factor (e.g. human error, analytical error, etc.). The rule model was derived using a tree-based modeling technique based on the Classification and Regression Tree (C&RT) algorithm. Other workers have also used this technique to derive rules. The modeling was conducted on a personal computer with the aid of STATISTICA (StatSoft Inc., 1999).
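To make the tree-based technique more concrete, the following is a minimal sketch of how a single binary split can be chosen for categorical predictors. It assumes Gini impurity as the split criterion and restricts candidate splits to "one level versus the rest"; neither choice is confirmed by the text, and the actual software settings may differ. The six records are hypothetical and were constructed only to exercise the code; they are not pedons from the Lampung dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_split(rows, predictors, target):
    """Find the 'level vs. rest' split with the largest Gini reduction.

    Returns (gain, predictor, level), or None if no valid split exists.
    """
    parent = gini([r[target] for r in rows])
    best = None
    for var in predictors:
        for level in sorted({r[var] for r in rows}):
            left = [r[target] for r in rows if r[var] == level]
            right = [r[target] for r in rows if r[var] != level]
            if not left or not right:
                continue  # a split must send records to both sides
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - weighted
            if best is None or gain > best[0]:
                best = (gain, var, level)
    return best


# Hypothetical records, for illustration only.
rows = [
    {"landsurface": "interfluve", "belt": "hot", "argillic": True},
    {"landsurface": "interfluve", "belt": "warm", "argillic": True},
    {"landsurface": "interfluve", "belt": "hot", "argillic": True},
    {"landsurface": "footslope", "belt": "hot", "argillic": False},
    {"landsurface": "footslope", "belt": "warm", "argillic": False},
    {"landsurface": "toeslope", "belt": "hot", "argillic": False},
]
print(best_binary_split(rows, ["belt", "landsurface"], "argillic"))
```

A full tree would apply the same search recursively to each resulting subset until a stopping rule is met; that recursion and any pruning are omitted here.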

Figure 2. Decision Tree for Predicting the Occurrence of Argillic Horizon

Table 1. Sample Distribution in the Study Area

Table 2. The Probability of Argillic Horizon to Occur Based on Terrain Information

and the graphic representation in the figure, then the next question is what the ecoregion belt of the site is, following the criteria in Table 2. If the ecoregion belt is hot, for instance, then the next question is what the parent material is. If the parent material is acid igneous rock, then the argillic horizon is predicted to be present at this site.

Table 3. The Probability of Argillic Horizon to Occur Based on Information of Terrain and Substrate in