Modeling Forest Fire Occurrence in Riau Province, Indonesia using Data Mining Method

Ms. Imas Sukaesih Sitanggang, Lecturer, Faculty of Natural Science and Mathematics, Bogor Agricultural University, Indonesia
17 April 2012
Handout

Forest fires can damage communities, causing smoke haze problems and excessive carbon dioxide emissions. Thus, early warning systems are important to minimize these problems.

Ms. Imas Sukaesih Sitanggang, a PhD student at the Universiti Putra Malaysia and a faculty member at Institut Pertanian Bogor (IPB, Bogor Agricultural University) in Indonesia, has developed a spatial decision tree algorithm for modeling forest fire occurrences, referred to as hotspots. The algorithm was developed by extracting interesting patterns from large spatial data sets (a process known as data mining) on forest fires in the Rokan Hilir District, Riau Province, Indonesia.

A data mining method was chosen to develop the model because forest fire modeling involves many factors, such as geographical, weather, and socio-economic data. Data mining can process data that must be evaluated against many criteria. Models built with non-data mining methods, by contrast, are better suited to evaluating small problems with only a few criteria. Moreover, such models are often too subjective, since they are largely based on expert knowledge.

Ms. Sitanggang also explained that her study focused on spatial (geographic) data. One problem she encountered was the presence of invalid geometries (lines and polygons) in the data she used. She found that the algorithm would not function when it encountered such features, and said that fixing this issue would help streamline the use of the algorithm.
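To illustrate what an invalid geometry can look like, the sketch below screens polygon rings before they are passed to an algorithm. It is not taken from Ms. Sitanggang's work: the function names, the feature names, and the coordinates are all hypothetical, and the check covers only three common problems (too few points, an unclosed ring, and edges that cross each other). A full validity check in GIS software covers more cases.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _properly_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if segments p1-p2 and q1-q2 cross at a point interior to both."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def ring_is_valid(ring: List[Point]) -> bool:
    """Simplified validity check for one polygon ring (an assumption, not a standard).

    Rejects rings with fewer than four points, rings whose first and last
    points differ, and rings in which two non-adjacent edges cross.
    Touching or overlapping edges are NOT detected by this sketch.
    """
    if len(ring) < 4 or ring[0] != ring[-1]:
        return False
    edges = list(zip(ring[:-1], ring[1:]))
    n = len(edges)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges meet at the closing vertex
            if _properly_cross(*edges[i], *edges[j]):
                return False
    return True


def find_invalid(features: Dict[str, List[Point]]) -> List[str]:
    """Return the names of features whose ring fails the simplified check."""
    return [name for name, ring in features.items() if not ring_is_valid(ring)]


# Hypothetical features: a plain square and a self-crossing "bowtie".
features = {
    "square": [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)],
    "bowtie": [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)],
}
invalid_names = find_invalid(features)
```

Features flagged this way could be repaired or set aside before modeling, which is one possible reading of the fix described above.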

After the decision tree algorithm (based on Quinlan’s ID3 algorithm) was developed, it was compared on accuracy with other models, some of which use non-spatial algorithms. Results indicated that Ms. Sitanggang’s spatial algorithm was more accurate than the other models: her model reached 71.66% accuracy, compared with 68% for the logistic regression model, 65.24% for J48, and below 50% for ID3.
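For readers unfamiliar with ID3, the sketch below shows the criterion the standard (non-spatial) algorithm uses to choose which attribute to split on: information gain, computed from entropy. It is not the spatial algorithm described above. The attribute names (land_cover, near_road), their values, and the four records are made up purely for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows: List[Dict[str, str]], attribute: str, target: str) -> float:
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    groups: Dict[str, List[str]] = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows: List[Dict[str, str]], attributes: List[str], target: str) -> str:
    """ID3 picks the attribute with the highest information gain at each node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Made-up records: NOT data from the study.
toy_rows = [
    {"land_cover": "peat", "near_road": "yes", "hotspot": "yes"},
    {"land_cover": "peat", "near_road": "no", "hotspot": "yes"},
    {"land_cover": "mineral", "near_road": "yes", "hotspot": "no"},
    {"land_cover": "mineral", "near_road": "no", "hotspot": "no"},
]
chosen = best_attribute(toy_rows, ["land_cover", "near_road"], "hotspot")
```

In this toy set, land_cover separates the two classes completely while near_road does not separate them at all, so ID3 would split on land_cover first. A full tree is built by repeating the same choice on each resulting subset.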

In addition to the comparisons, the Keetch-Byram Drought Index (KBDI), an indicator of dryness, was calculated for a certain time frame. Hotspot occurrences for the same period were also counted, and the two were checked for an association, because earlier works have shown that the KBDI has a strong positive association with hotspot occurrences. In her study, Ms. Sitanggang observed the same relationship.
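One simple way to check two series for an association is the Pearson correlation coefficient, sketched below. The source does not say which measure was used in the study, so the choice of Pearson's r is an assumption, and the KBDI values and hotspot counts in the example are invented placeholders, not results.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented placeholder values for the same hypothetical days.
    kbdi = [120.0, 250.0, 310.0, 480.0, 560.0]
    hotspot_counts = [1, 3, 2, 7, 9]
    print(f"r = {pearson_r(kbdi, hotspot_counts):.2f}")
```

A value of r near +1 would be consistent with a strong positive association of the kind described above, while a value near 0 would indicate little linear relationship.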

One drawback of the spatial decision tree model is that it was developed for specialists. Since the model may not be readily understood by lay people, she said that further work should be done on visualizing the model.

Ms. Sitanggang is a lecturer in the Computer Science Department at IPB. Her model was presented during SEARCA’s Agriculture and Development Seminar on 17 April 2012. The work is supported by the Indonesia Directorate General of Higher Education (IDGHE), Ministry of National Education, Indonesia and partially supported by SEARCA. (Amy Christine S. Cruz)