ŠUMARSKI LIST 3-4/2017, p. 26

formation of a group and consists of classifiers whose errors fall at different points of the input vector space. The two most popular methods of creating collective classifiers are boosting and bagging (Freund and Schapire 1997, Drucker 1997). These techniques are based on sampling the training data and result in a different training data set for each classifier of the total classification system (Opitz and Shavlik 1996).
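The resampling idea can be illustrated with a minimal sketch of bagging-style bootstrap sampling, in which each classifier receives its own training set drawn with replacement. The function names, the seed and the toy records are assumptions made for illustration; boosting differs in that it reweights or resamples the data according to the errors of earlier classifiers, which this sketch does not do.

```python
import random


def bootstrap_sample(records, rng):
    """Draw len(records) items with replacement (bagging-style resampling)."""
    return [rng.choice(records) for _ in records]


def make_training_sets(records, n_classifiers, seed=0):
    """Build one resampled training set per classifier (hypothetical helper)."""
    rng = random.Random(seed)
    return [bootstrap_sample(records, rng) for _ in range(n_classifiers)]


# Toy record identifiers standing in for training cases.
records = list(range(10))
training_sets = make_training_sets(records, n_classifiers=3)
for i, sample in enumerate(training_sets):
    print(i, sorted(sample))
```

Because sampling is done with replacement, some records typically appear several times in a given training set and others not at all, which is what makes the resulting classifiers differ.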
In a typical supervised learning scenario, a set of samples, called the training set, is available. The classes of these samples are known, and the goal is to construct a model that assigns new samples to classes. The learning algorithm that builds the model is called an inducer. The basic idea of collective classification is to weight different classifiers and combine them into a single classifier that performs better than each of the individual classifiers. People follow the same technique when making a decision: they gather various opinions and then weigh those views to reach the final decision (Rokach 2009).
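The combination step can be sketched as a weighted vote over the class labels proposed by the individual classifiers. This is a minimal sketch: the function name, the labels and the weights are illustrative assumptions, not the weighting scheme of any particular boosting algorithm.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among the classifiers."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Three hypothetical classifiers vote on the site quality of one stand.
print(weighted_vote(["I", "II", "II"], [0.9, 0.4, 0.3]))
print(weighted_vote(["I", "II", "II"], [0.5, 0.4, 0.3]))
```

In the first call a single highly weighted classifier outvotes two weaker ones; in the second the two weaker classifiers together carry more weight.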
The productivity of a forest is described as the site’s ability to produce timber or forest biomass (Skovsgaard and Vanclay 2013). Various approaches to site productivity assessment have been developed (Pokharel and Dech 2011). However, the typical approach to site quality assessment is based on the strong correlation between height growth and volume. Hence, the site index has become an important tool in assessing site productivity (Clutter et al. 1983). Height-age observations are plotted on a graph and used in assessing site productivity (Laubhann et al. 2009). Site index curves are developed via three methods: the parameter prediction method, the guide curve method and the difference equation method (Clutter et al. 1983).
The purpose of this paper is to apply the boosting method to create a collective classifier that assigns forest stands to site quality classes, using altitude, slope, age and canopy density as inputs.
Materials and Methods
Study area – Data collection
Data were obtained from the management plan of the forest of Dadia-Lefkimi-Soufli (Consorzio Forestale Del Ticino 2005). Situated at the southeastern end of the Rhodope mountain range in northeastern Greece, at the crossroads of two continents, the National Park of Dadia-Lefkimi-Soufli Forest is of exceptional ecological significance at the European level (UNESCO World Heritage Centre 2012-2015). The elevation ranges from 10 to 604 m. Soils are shallow to moderately deep and exhibit various textures. The mean annual temperature is 14.3 °C and the mean annual precipitation is 652.9 mm (Consorzio Forestale Del Ticino 2005). The main tree species of the forest are Pinus brutia and Quercus spp. These species create pure and mixed formations; Pinus nigra is also present, mainly in reforestations and as individual trees (Consorzio Forestale Del Ticino 2005).
From 403 description sheets (i.e. 403 records or cases), the following were used: the minimum and maximum altitude (m), the minimum and maximum slope (%), the minimum and maximum age of trees (years), the minimum and maximum canopy density, and the site quality (quality I, II, III). Quality III included the stands whose site quality was recorded in the description sheets as III, IV or V. Consequently, three training sets were selected: one for site quality I (25 stands), one for site quality II (94 stands), and one for site quality III (284 stands).
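The grouping of site qualities III, IV and V into a single class can be sketched as a simple label mapping. The mapping table and the toy label list below are illustrative assumptions; only the three class sizes (25, 94 and 284 stands) come from the text.

```python
from collections import Counter

# Qualities IV and V are merged into class III, as described in the text.
MERGE = {"I": "I", "II": "II", "III": "III", "IV": "III", "V": "III"}


def merge_site_quality(label):
    """Map a recorded site quality to one of the three classes used."""
    return MERGE[label]


# Toy labels, not the real description sheets.
labels = ["I", "II", "III", "IV", "V", "III"]
counts = Counter(merge_site_quality(x) for x in labels)
print(counts)

# Class sizes reported in the text; they add up to the 403 records.
reported = {"I": 25, "II": 94, "III": 284}
print(sum(reported.values()))
```

The reported class sizes are strongly unbalanced, with class III holding roughly 70% of the stands.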
In the description sheet of each stand, more than one site quality category appears. These categories do not refer to different species. The categories were determined from the composition of each stand and the degree of species mixture. The initial site characterization for each species was made using site index curves. These site index curves had been developed for each of these species in forests in other areas of Greece (Consorzio Forestale Del Ticino 2005).
The altitudes and slopes were replaced by one variable, named topography, by applying the Anderson-Rubin (1956) method. This method adjusts the least-squares regression formula to produce factor scores that are uncorrelated with other factors and with each other. The vector of observed variables, i.e. altitudes and slopes, is multiplied by the inverse of a diagonal matrix of their variances. The resulting new variable has a mean of 0 and a standard deviation of 1. Using the same method, the ages were replaced with one variable, named age, and the canopy densities with the variable canopy density. The new variables topography, age and canopy density were used as predictors, i.e. as input, while site quality was used as the target variable.
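The following sketch only illustrates the stated property of the new variables (mean 0, standard deviation 1). It does not implement the Anderson-Rubin method: as a deliberately simplified stand-in, it averages the z-scores of the observed variables and re-standardizes the result. The function names and the stand values are hypothetical, and the population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a column to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite_score(*columns):
    """Average the z-scores of several columns, then re-standardize.

    Simplified stand-in for a factor score, NOT the Anderson-Rubin method.
    """
    standardized = [zscores(col) for col in columns]
    averaged = [mean(row) for row in zip(*standardized)]
    return zscores(averaged)


# Hypothetical stands: min/max altitude (m) and min/max slope (%).
alt_min = [50, 120, 200, 310]
alt_max = [150, 260, 380, 540]
slope_min = [5, 10, 20, 25]
slope_max = [20, 35, 50, 60]

topography = composite_score(alt_min, alt_max, slope_min, slope_max)
print([round(t, 3) for t in topography])
```

The same helper could be applied to the two age columns and the two canopy density columns to obtain one standardized predictor for each.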
Canopy density is defined as the ratio of the sum of the canopy projection areas (with the projections placed side by side) to the area that these trees occupy (Dafis 1992). Canopy density can take values from 0 to over 1. The more the crowns intertwine, the higher the canopy density becomes. Canopy density differs from ground cover, since ground cover can take values only up to 1.
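As a worked illustration of this ratio, the sketch below sums hypothetical crown projection areas and divides by the occupied area. The function name, the units and the numbers are assumptions chosen for illustration.

```python
def canopy_density(crown_areas_m2, occupied_area_m2):
    """Sum of crown projection areas divided by the area the trees occupy."""
    if occupied_area_m2 <= 0:
        raise ValueError("occupied area must be positive")
    return sum(crown_areas_m2) / occupied_area_m2


# Crowns that do not fill the area: density below 1.
print(canopy_density([20.0, 30.0], 100.0))

# Intertwined crowns: projections sum to more than the area, density above 1.
print(canopy_density([30.0, 45.0, 50.0], 100.0))
```

The second case shows why the ratio is not capped at 1: overlapping crown projections are each counted in full.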
In this study, separate site classification predictions could have been developed for pine, oak and pine – oak formations. Unfortunately, even though the areas of the mixtures and of the species are given in the description sheets, the parameters used for site classification were presented for the total stand area. For this reason, a single site classification, regardless of species, was developed. For the same reason, the dominant site quality (the one occupying the largest forested area of the stand) was used to characterize the site productivity of the stand. The description sheets did not give separate topographic features or canopy density data for the different site qualities.
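The selection of the dominant site quality can be sketched as picking the class with the largest area within a stand. The function name and the areas (in hectares) are hypothetical; how ties were handled is not stated in the text, and this sketch simply returns the first class with the maximum area.

```python
def dominant_site_quality(area_by_quality):
    """Return the site quality occupying the largest forested area."""
    return max(area_by_quality, key=area_by_quality.get)


# Hypothetical stand with three site qualities and their areas (ha).
stand = {"I": 2.5, "II": 10.0, "III": 4.0}
print(dominant_site_quality(stand))
```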