Forecasting

To forecast crime, we fit curves to past crime data. Crime can follow a number of patterns. An increase in crime in an area can lead to a trend of rising crime rate through the broken windows effect [18,19], and crime tends to occur in areas that already have a high crime rate.

Each crime is weighted by severity so that more severe offenses receive higher priority than less severe ones. The cumulative weight for a day (the total weight of all crimes in an area on that day) is taken as the crime rate for that day. The daily rates are then analyzed to obtain a trend for each area.
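As an illustration of the daily rate described above, the following is a minimal sketch. The offense names, the weight values, and the function name daily_crime_rates are hypothetical; the actual severity weights are not listed in this section.

```python
from collections import defaultdict

# Hypothetical severity weights (assumed for illustration only).
SEVERITY_WEIGHTS = {
    "homicide": 10.0,
    "robbery": 5.0,
    "burglary": 3.0,
    "vandalism": 1.0,
}


def daily_crime_rates(incidents):
    """Sum the severity weights of all incidents in one area, per day.

    incidents is an iterable of (day, offense) pairs for a single area.
    Returns a dict mapping each day to its cumulative weight.
    """
    rates = defaultdict(float)
    for day, offense in incidents:
        rates[day] += SEVERITY_WEIGHTS[offense]
    return dict(rates)


incidents = [(1, "robbery"), (1, "vandalism"), (2, "burglary")]
rates = daily_crime_rates(incidents)
```

Here day 1 receives the combined weight of one robbery and one act of vandalism, and day 2 receives the weight of a single burglary.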

Several functions are used in the analysis. The two prediction algorithms (Weighted Linear Regression and Recent Trend Analysis) each interpret the data and attempt to predict the rate for the next day. Averaging the predictions from these two algorithms yields a forecasted data point of greater accuracy. This forecasted data point is exported as the predicted next data point, and officer placement is based on it.
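The averaging step can be sketched as follows. The function name forecast_next_rate and the idea of passing the predictors as callables are assumptions made for this example, and the two lambda predictors are stand-ins, not the algorithms themselves.

```python
def forecast_next_rate(rates, predictors):
    """Average the next-day predictions of several predictor functions.

    rates is the list of past daily crime rates for one area, and each
    predictor is a callable that maps that list to a single prediction.
    """
    predictions = [predict(rates) for predict in predictors]
    return sum(predictions) / len(predictions)


# Stand-in predictors (assumed): last observed value and overall mean.
last_value = lambda rates: rates[-1]
overall_mean = lambda rates: sum(rates) / len(rates)

forecast = forecast_next_rate([2.0, 4.0, 6.0], [last_value, overall_mean])
```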

Basic Algorithms

Basic functions are functions that do not themselves place officers or predict the crime rate in the model, but contain algorithms that are called by other functions. A map of the functions is shown in Figure 5.

The average function finds the average value of a set of values given in an iterator (an object that has multiple values that can be iterated, or looped, through).

The find_slope function finds the slope between two points, given as arguments x1, y1, x2, y2, where (x1, y1) and (x2, y2) are the two points.

The recursion function finds the sum of all of the values in an iterator.
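The three helpers described above might look like the sketch below. The function names and the find_slope argument order come from the text. The bodies are assumptions, and in particular the text does not say whether the recursion function is literally recursive, so a plain loop is used here.

```python
def average(values):
    """Return the arithmetic mean of the values in an iterable."""
    values = list(values)  # an iterator can only be consumed once
    return sum(values) / len(values)


def find_slope(x1, y1, x2, y2):
    """Return the slope of the line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


def recursion(values):
    """Return the sum of all values in an iterable."""
    total = 0
    for value in values:
        total += value
    return total


mean_rate = average(iter([1, 2, 3]))
slope = find_slope(0, 0, 2, 4)
total_time = recursion(range(1, 5))
```

Note that find_slope raises ZeroDivisionError when the two points share an x-value. How the original handles that case is not stated.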

The find_intercept function determines the y-intercept of a trend from a series of points and a slope, both given as arguments. A line with the given slope is evaluated at each of the given x-values and compared with the corresponding y-value of each point. The average distance between the line and the y-values, computed with the average function, is returned as the y-intercept.
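One reading of this description is that the intercept is the mean of y - slope * x over all points. The sketch below follows that reading. The argument order (xs, ys, slope) and the use of a signed rather than absolute distance are assumptions.

```python
def average(values):
    """Return the arithmetic mean of the values in an iterable."""
    values = list(values)
    return sum(values) / len(values)


def find_intercept(xs, ys, slope):
    """Estimate the y-intercept of a line with a known slope.

    For each point, measure how far its y-value lies above the line
    y = slope * x through the origin, then average those offsets.
    """
    offsets = [y - slope * x for x, y in zip(xs, ys)]
    return average(offsets)


# Points on the line y = 2x + 1, so the estimated intercept is 1.
intercept = find_intercept([0, 1, 2], [1, 3, 5], 2)
```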

The lin_reg function estimates the slope of a linear regression line through a series of points. It uses the find_slope function to find the slopes between the points, then uses the average function to return the average of these slopes as the slope of the regression line.
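A minimal sketch of this slope-averaging approach follows. It assumes the slopes are taken between consecutive points, which the text does not state explicitly, and the argument layout (xs, ys) is also an assumption. This is not an ordinary least-squares fit, so the two can give different slopes on the same data.

```python
def average(values):
    """Return the arithmetic mean of the values in an iterable."""
    values = list(values)
    return sum(values) / len(values)


def find_slope(x1, y1, x2, y2):
    """Return the slope of the line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


def lin_reg(xs, ys):
    """Return the mean slope between consecutive points (assumed pairing)."""
    slopes = [
        find_slope(xs[i], ys[i], xs[i + 1], ys[i + 1])
        for i in range(len(xs) - 1)
    ]
    return average(slopes)


# Consecutive slopes are 2 and 4, so the averaged slope is 3.
slope = lin_reg([0, 1, 2], [0, 2, 6])
```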

Figure 5: A map of the functions used in the prediction of the crime rate. Each rectangle represents a function, and an arrow signifies that the function pointed to is called from the function in which the arrow originates.


Weighted Linear Regression

The idea behind the Weighted Linear Regression is that a linear regression can show the general direction of an area's crime rate over the span of the data. However, recent crimes may play a much larger role in the area's future crime, so the weighted linear regression places higher weight on more recent data and crime incidents while still including older data and rates. The number of days that have elapsed since each data point determines how much weight that point is given.

The weighted linear regression uses the recursion function to find the sum of all of the times for which data is given, uses that sum to calculate the weight given to each data point, and adds the weighted points together. The final crime rate for an area using this method is calculated as

$$ r = \sum_{n=1}^{s} \frac{n}{t} w_n, \qquad t = \sum_{n=1}^{s} n $$

where r is the predicted crime rate, s is the number of data points, w_n is the weight at time n, and t is the total amount of time as calculated by the recursion function. This method gives a smaller fraction of the overall predicted weight to lower time quantities, and gives the highest fraction to the highest time quantity, which is also the most recent point.
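The weighting described above can be sketched as follows, assuming that the time index runs from 1 (oldest point) to s (most recent point) and that the point at time n receives the fraction n / t of the prediction. The function name weighted_prediction is hypothetical.

```python
def weighted_prediction(rates):
    """Predict the next crime rate as a time-weighted average.

    rates holds the daily crime rates in chronological order. The point
    at time n (1-based, assumed) contributes the fraction n / t, where
    t is the sum of all time indices, so recent points count the most.
    """
    times = range(1, len(rates) + 1)
    t = sum(times)  # total amount of time, as the recursion function gives
    return sum((n / t) * w for n, w in zip(times, rates))


# With three points, the fractions are 1/6, 2/6 and 3/6.
prediction = weighted_prediction([2.0, 4.0, 6.0])
```

Because the fractions sum to one, a constant series of rates yields that same constant as the prediction.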

Recent Trend Analysis

The recent trend analysis allows us to see how the crime rate in an area has been progressing recently. In some cases, crime may begin to drop off or increase because of outside circumstances, but only very recently. The recent trend analysis detects such a change and bases its prediction on that trend.

The recent trend analysis looks at the five most recent data points (if there are five or more points) and runs a linear regression on them. It then steps back one data point and checks whether the change from the previous point is within ±0.05 of the rest of the trend. If so, that point is included in the linear regression, and the analysis steps back another data point, continuing until the change between two data points falls outside the ±0.05 band. The slope of this regression and the data points are then run through the find_intercept function to get a line. This line is used to predict the next data point.
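The procedure above leaves some details open, so the sketch below makes several assumptions: the "change" is the slope between a candidate point and its successor, it is compared to the current regression slope with an absolute tolerance of 0.05, all available points are used when there are fewer than five, and the next data point lies one day after the last one. The helper bodies and the name recent_trend_prediction are likewise illustrative.

```python
def average(values):
    values = list(values)
    return sum(values) / len(values)


def find_slope(x1, y1, x2, y2):
    return (y2 - y1) / (x2 - x1)


def lin_reg(xs, ys):
    # Mean slope between consecutive points (assumed pairing).
    return average(
        find_slope(xs[i], ys[i], xs[i + 1], ys[i + 1])
        for i in range(len(xs) - 1)
    )


def find_intercept(xs, ys, slope):
    # Mean vertical offset of the points from the line y = slope * x.
    return average(y - slope * x for x, y in zip(xs, ys))


def recent_trend_prediction(xs, ys, window=5, tolerance=0.05):
    """Predict the next rate from the most recent consistent trend."""
    if len(xs) < 2:
        raise ValueError("at least two data points are required")
    # Start with the most recent `window` points (or all, if fewer).
    start = max(0, len(xs) - window)
    slope = lin_reg(xs[start:], ys[start:])
    # Step back one point at a time while the trend still holds.
    while start > 0:
        step = find_slope(xs[start - 1], ys[start - 1], xs[start], ys[start])
        if abs(step - slope) > tolerance:
            break  # the older point breaks the trend
        start -= 1
        slope = lin_reg(xs[start:], ys[start:])
    intercept = find_intercept(xs[start:], ys[start:], slope)
    next_x = xs[-1] + 1  # assumed: the next day follows the last point
    return slope * next_x + intercept


days = [1, 2, 3, 4, 5, 6, 7, 8]
rates = [10, 3, 4, 5, 6, 7, 8, 9]
prediction = recent_trend_prediction(days, rates)
```

In this example the rates from day 2 onward rise by one per day, so the trend is extended back to day 2, and the drop between day 1 and day 2 stops the extension.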