Developing a computational landslide susceptibility workflow requires a series of steps. First, the factors that affect the triggering of landslides must be identified. These factors are the inputs of the computational workflow and the basis of the susceptibility map. Second, GIS and map data must be collected. Third, each identified landslide controlling factor must be given a computational weight, derived by comparing the factors and analyzing their relationships. Fourth, points of interest in the study area must be established. These points are locations in the study area, represented by longitude and latitude, that contain landslide triggering factors. The features at each point must be extracted and transformed into empirical data that can be fed into the computational workflow. Fifth, a landslide inventory must be acquired; this inventory is crucial to the development and verification of the landslide susceptibility map. Sixth, the frequency of each landslide inducing factor in the study area is computed using the landslide inventory. Lastly, the landslide susceptibility is computed and mapped using the weights and frequencies of each landslide inducing factor and the features extracted from each point of interest. The flow of activities in the creation of the landslide susceptibility computational workflow is shown in the figure.
Activity flowchart of the Computational Workflow for Landslide Susceptibility
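The weighting, frequency, and mapping steps of the workflow can be illustrated with a minimal sketch. It assumes that the weights are derived from a reciprocal pairwise comparison matrix using the row geometric mean approximation, that the frequency of a factor class is expressed as a frequency ratio, and that susceptibility is a weighted sum. These are illustrative choices, not necessarily the ones used in this study. The factor names, judgements, and counts are hypothetical.

```python
import math

# Hypothetical factor names and pairwise judgements, for illustration only.
factors = ["slope", "rainfall", "lithology"]

# pairwise[i][j] states how much more important factor i is than factor j.
# The matrix is reciprocal: pairwise[j][i] == 1 / pairwise[i][j].
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]


def pairwise_weights(matrix):
    """Approximate factor weights by normalising the row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def frequency_ratio(slides_in_class, slides_total, cells_in_class, cells_total):
    """Share of inventory landslides in a factor class over the class's area share."""
    return (slides_in_class / slides_total) / (cells_in_class / cells_total)


def susceptibility(weights, ratings):
    """Weighted sum of the per-factor ratings at one point of interest."""
    return sum(w * r for w, r in zip(weights, ratings))


weights = pairwise_weights(pairwise)

# Hypothetical counts: 30 of 100 inventory landslides fall in a slope class
# that covers 200 of 1000 cells in the study area.
slope_rating = frequency_ratio(30, 100, 200, 1000)

# Hypothetical ratings of the three factors at one point of interest.
point_ratings = [slope_rating, 1.0, 0.5]
score = susceptibility(weights, point_ratings)

print(dict(zip(factors, weights)), score)
```

A frequency ratio above 1 means the class contains more landslides than its share of the area would suggest, so the class raises the score at points that fall in it.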
One of the goals of this study is to determine whether the same susceptibility map can be produced with fewer controlling factors, by training a machine learning model on a portion of the features (landslide controlling factors) used in the landslide susceptibility computational workflow. Support Vector Regression is a generalization of Support Vector Classification in which the model returns a continuous-valued output rather than an output from a finite set. In other words, a regression model estimates a continuous-valued multivariate function (Awad and Rahul, 2015). The Support Vector Regression model is also expected to work on small data sets, which makes it well suited to hazard detection and prediction, where data are fundamentally difficult to acquire. To select the most relevant features and points, the pair-wise comparison matrix from the susceptibility workflow is reused: only the features of the top 5 landslide controlling factors are used to train the regression model. The support vector regression has two levels: the first creates a base prediction from the static controlling factors, and the second integrates the dynamic factors into the base prediction.
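One possible reading of the two-level arrangement is sketched below with scikit-learn's `SVR`. The data are synthetic. The kernel and hyperparameters are arbitrary, and feeding the first-level prediction to the second level as an input feature is an assumption about how the levels are connected. Rainfall stands in for the dynamic factors.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in data: 60 points, 5 static factors (the "top 5") and one
# dynamic factor (rainfall). None of these values come from the study.
n_points = 60
static_X = rng.random((n_points, 5))
rainfall = rng.random(n_points)

# Synthetic targets: a base susceptibility that depends on the static factors
# only, and a final susceptibility that also depends on rainfall.
base_y = static_X @ np.array([0.40, 0.25, 0.15, 0.12, 0.08])
final_y = base_y + 0.3 * rainfall

# Level 1: base prediction from the static controlling factors.
base_model = SVR(kernel="rbf", C=10.0, epsilon=0.01)
base_model.fit(static_X, base_y)
base_pred = base_model.predict(static_X)

# Level 2: combine the base prediction with the dynamic factor.
level2_X = np.column_stack([base_pred, rainfall])
final_model = SVR(kernel="rbf", C=10.0, epsilon=0.01)
final_model.fit(level2_X, final_y)
final_pred = final_model.predict(level2_X)

print(final_pred[:3])
```

Keeping the levels separate means that only the second, smaller model has to be evaluated again when the rainfall changes, since the base prediction depends on the static factors alone.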
Performance of the support vector regression model was measured using R-squared and mean squared error. R-squared is a statistical measure of the proportion of the variance of a dependent variable that is explained by the independent variable or variables in a regression model. Here, it is the proportion of the variation in landslide susceptibility that can be explained by the given landslide controlling factors.
The mean squared error shows how close the regression line is to the data points: the distance from each point to the regression line (the error) is squared, and the squared errors are averaged. Squaring removes negative values and gives more weight to larger errors.
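Both metrics can be written out in a few lines of plain Python. The sketch below uses the standard definitions, with R-squared computed as one minus the ratio of the residual sum of squares to the total sum of squares. The observed and predicted values are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


def r_squared(y_true, y_pred):
    """Proportion of the variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical susceptibility values at four points.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]

mse = mean_squared_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(mse, r2)
```

For these values every error is 0.05 in magnitude, so the mean squared error is about 0.0025 and R-squared is about 0.95.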
The web application serves as the tool for dynamically supplying rainfall data to the existing landslide susceptibility workflow, given the static controlling factors. A grid is overlaid on a map of the study area, and each grid cell can be toggled to increase or decrease the amount of rainfall. The rainfall data of each point is then updated and used to recalculate the susceptibility. Figure 2 shows the use cases available to an actor in the web application. The web application is also an interface for using the support vector regression models created in this study.
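The toggle-and-recalculate behaviour of the grid can be sketched as follows. The `GridCell` class, the discrete rainfall levels, and the linear rainfall contribution are all hypothetical. The sketch only shows how a per-cell rainfall change could feed a recomputed score and does not reproduce the application's actual code.

```python
from dataclasses import dataclass

MAX_LEVEL = 3          # hypothetical highest rainfall level a cell can take
RAINFALL_WEIGHT = 0.1  # hypothetical contribution of one rainfall level


@dataclass
class GridCell:
    static_score: float      # susceptibility from static factors, assumed precomputed
    rainfall_level: int = 0  # discrete rainfall level set by the user


def adjust_rainfall(cell, step):
    """Raise or lower a cell's rainfall level, clamped to [0, MAX_LEVEL]."""
    cell.rainfall_level = max(0, min(MAX_LEVEL, cell.rainfall_level + step))


def cell_susceptibility(cell):
    """Recompute the susceptibility from the static score and current rainfall."""
    return cell.static_score + RAINFALL_WEIGHT * cell.rainfall_level


# A two-cell grid keyed by (row, column).
grid = {(0, 0): GridCell(0.4), (0, 1): GridCell(0.7)}

adjust_rainfall(grid[(0, 1)], +1)
adjust_rainfall(grid[(0, 1)], +1)
adjust_rainfall(grid[(0, 0)], -1)  # already at 0, so the level stays at 0

scores = {key: cell_susceptibility(cell) for key, cell in grid.items()}
print(scores)
```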