In this lab, we will be performing the following:
Set up a single metric job
Perform forecasting
Set up a multi-metric job
The machine learning anomaly detection features automate the analysis of time series data by creating accurate baselines of normal behavior in your data. These baselines then enable you to identify anomalous events or patterns. Data is pulled from Elasticsearch for analysis and anomaly results are displayed in Kibana dashboards. For example, the Machine Learning app provides charts that illustrate the actual data values, the bounds for the expected values, and the anomalies that occur outside these bounds.
The typical workflow for performing anomaly detection is as follows:
1. Go to Machine Learning > Anomaly Detection > Jobs in the Analytics section on the left side menu of Kibana.
2. Click Create job.
3. Select the “Kibana Sample Data Logs” data view.
4. Select the “Single metric” job wizard.
5. Change the end date in the Time range selector until you have 8 weeks of data, which is the full data set (e.g., set the end date to October 31st). This date will be some time in the future, since this is sample data.
6. Select the “Low count(Event rate)” function, then click Next.
7. Enter “lab1a_low_web_traffic” as the Job ID and “mylabs” as the Group name, then click Next.
8. The data validation step should pass without problems. Click Next to proceed.
9. Review the job configuration, make sure that “Start immediately” is checked, and click Create job to start the ML job.
10. The job should take only seconds to complete. Once it is done, click View results:
11. The results should look like this:
12. You can drag the timeline bar at the bottom to the beginning of the time period to see how ML built its model (after about three cycles at the start of the timeframe):
13. Next, drag the timeline to the red line in the middle of the chart to examine the anomaly that was found.
Note that detailed information about the anomaly can be found in the Anomalies table at the bottom of the page. The anomaly was given a severity of 95: the expected count according to the model for that time period was 28.4, but the actual count was 0, hence the high severity.
Note also that:
The drop in traffic was given a critical severity score.
Spikes in traffic, on the other hand, were not anomalous, given that we were looking only for anomalies on the low side (the “low count” function).
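For reference, the same single-metric job can also be created through the Elasticsearch ML APIs instead of the wizard. Below is a minimal sketch using the official Python client (elasticsearch-py); the host, credentials, and bucket span are assumptions for illustration, not values from this lab:

```python
from elasticsearch import Elasticsearch

# Assumed local cluster; adjust the host and authentication for your environment.
es = Elasticsearch("http://localhost:9200")

# Single-metric job: one "low_count" detector over the event rate,
# matching the "Low count(Event rate)" selection in the wizard.
es.ml.put_job(
    job_id="lab1a_low_web_traffic",
    groups=["mylabs"],
    analysis_config={
        "bucket_span": "15m",  # assumed value; the Kibana wizard estimates this for you
        "detectors": [{"function": "low_count"}],
    },
    data_description={"time_field": "timestamp"},
)

# The datafeed tells the job which index to pull its data from.
es.ml.put_datafeed(
    datafeed_id="datafeed-lab1a_low_web_traffic",
    job_id="lab1a_low_web_traffic",
    indices=["kibana_sample_data_logs"],
)
```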
1. Navigate back to Machine Learning > Anomaly Detection > Jobs.
2. Click Create job.
3. As before, select the “Kibana Sample Data Logs” data view.
4. This time, select the “Multi-metric” wizard.
5. The full data set should still be selected (e.g., choose October 31st as the end date).
6. Click Next.
7. Now select the “Count(Event rate)” function.
8. Under Split Field, split the data on "response.keyword" (the HTTP status code).
9. Under Influencers, add "clientip" as an additional Key Field.
10. Enter 1h for the bucket span and click Next.
11. Name your job “lab1c_web_traffic_per_response_code” and place it under the “mylabs” group.
12. Click Next.
13. The job should pass through the validation without any issues. Click Next.
14. Review the job settings, verify that Start immediately is ticked, and click Create job to start the ML job.
15. After the job has completed, click on View Results to drill down to the results.
16. This time we are brought directly to the Anomaly Explorer. At a glance, under Top influencers, we can tell that there were anomalies associated with response codes 404 and 200, and that the anomaly related to 404 seems to have been caused by (influenced by) IP address 30.156.16.164.
17. Click on the red square (October 6th 2023) for response code 404 to drill down further:
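As with the single-metric job, this configuration can be expressed as an API call. A sketch with the Python client, under the same host/auth assumptions as before; note that the wizard's split field becomes the detector's partition_field_name:

```python
from elasticsearch import Elasticsearch

# Assumed local cluster; adjust host/auth for your environment.
es = Elasticsearch("http://localhost:9200")

# Multi-metric job: split on the response code, with clientip and the
# response code as influencers.
es.ml.put_job(
    job_id="lab1c_web_traffic_per_response_code",
    groups=["mylabs"],
    analysis_config={
        "bucket_span": "1h",  # the value entered in the wizard
        "detectors": [{
            "function": "count",
            "partition_field_name": "response.keyword",  # one model per response code
        }],
        "influencers": ["response.keyword", "clientip"],
    },
    data_description={"time_field": "timestamp"},
)

es.ml.put_datafeed(
    datafeed_id="datafeed-lab1c_web_traffic_per_response_code",
    job_id="lab1c_web_traffic_per_response_code",
    indices=["kibana_sample_data_logs"],
)
```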
1. Navigate back to Machine Learning > Anomaly Detection > Jobs.
2. Click Edit job for “lab1c_web_traffic_per_response_code” by opening the Actions menu (the … to the right).
3. Click on the Custom URL tab and Add custom URL.
4. Enter the following details:
Label: “raw data”
Link to: “Discover”
Data view: “kibana_sample_data_logs”
Query entities: “clientip”
5. Click Add.
6. Remember to Save.
7. Click on the Anomaly Explorer link for the lab1c job.
8. Click on the red box again for 404. Click on the “Actions” icon (cogwheel) at the top right-hand corner and click on the “raw data” link:
9. This brings us to the Discover page showing us all the relevant documents filtered by “clientip":"30.156.16.164".
10. A quick click on the “response” field on the left (you can search for it!) shows that all 100 requests sent by this client IP encountered the 404 response code.
NOTE: click in the empty space between response and the plus sign when selecting the field.
11. Similarly, you can also click on the “URL” field to take a quick look at the URLs in question.
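If you prefer to script this step, custom URLs live under the job's custom_settings and can be set with the ML update job API. A rough sketch with the Python client; the url_value shown is a simplified placeholder, since Kibana builds the full Discover URL for you when you add the link through the UI:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host/auth

# $clientip$ is replaced with the anomaly's client IP when the link is clicked.
# The url_value below is a simplified placeholder, not the exact URL Kibana generates.
es.ml.update_job(
    job_id="lab1c_web_traffic_per_response_code",
    custom_settings={
        "custom_urls": [{
            "url_name": "raw data",
            "url_value": "discover#/?_a=(query:(language:kuery,query:'clientip:\"$clientip$\"'))",
        }]
    },
)
```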
1. Navigate back to Machine Learning > Anomaly Detection > Jobs.
2. Click Start datafeed for “lab1a_low_web_traffic” from the Actions menu (the … to the right).
3. Under "Search start time", select the first option “continue from ....”.
4. Select end time - “No end time (Real-time search)”.
5. Check the “Create alert rule after datafeed has started”.
6. Click Start.
7. A new section opens on the right side, showing the Create rule configuration:
Name: low_web_traffic
Check every: 120m
Severity: 75
Select a connector type
Select Email
Provide an email in the To box
Provide a sample Subject
8. Click Save.
9. Back in the Anomaly Detection jobs list, notice that for “lab1a_low_web_traffic” the job state has changed to “opened” and the datafeed state to “started”.
From now on, until you stop the datafeed, the ML job will keep collecting new data and will notify you based on the settings of the alert.
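The datafeed part of this step can also be done through the API. A minimal sketch with the Python client (host/auth assumed); the alert rule itself is a Kibana feature and is not created here:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host/auth

# The job must be open before its datafeed can run.
es.ml.open_job(job_id="lab1a_low_web_traffic")

# With no "end" parameter the datafeed runs in real time until you stop it,
# mirroring "No end time (Real-time search)" in the Kibana dialog.
es.ml.start_datafeed(datafeed_id="datafeed-lab1a_low_web_traffic")

# Later, stop it with:
# es.ml.stop_datafeed(datafeed_id="datafeed-lab1a_low_web_traffic")
```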
After your anomaly detection job creates baselines of normal behavior for your data, you can use that information to extrapolate future behavior.
You can use a forecast to estimate a time series value at a specific future date. For example, you might want to determine how many users you can expect to visit your website next Sunday at 0900.
You can also use it to estimate the probability of a time series value occurring at a future date. For example, you might want to determine how likely it is that your disk utilization will reach 100% before the end of next week.
Each forecast has a unique ID, which you can use to distinguish between forecasts that you created at different times. You can create a forecast by using the forecast anomaly detection jobs API or by using Kibana.
In this lab, we will do it through Kibana.
1. Open the Single Metric Viewer for the job “lab1a_low_web_traffic”.
2. Note that there is a Forecast button at the top right-hand side. Click it.
When you create a forecast, you specify its duration, which indicates how far the forecast extends beyond the last record that was processed. By default, the duration is 1 day. Typically the farther into the future that you forecast, the lower the confidence levels become (that is to say, the bounds increase). Eventually if the confidence levels are too low, the forecast stops. For more information about limitations that affect your ability to create a forecast, see Unsupported forecast configurations.
3. Enter the duration (e.g., 10d) for which you would like the forecast to be calculated.
4. Click Run.
5. The forecast results are shown as yellow lines:
The yellow line in the chart represents the predicted data values. The shaded yellow area represents the bounds for the predicted values, which also gives an indication of the confidence of the predictions.
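The same forecast can be requested through the forecast API mentioned earlier. A minimal sketch with the Python client, assuming the job is open:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host/auth

# "duration" uses Elasticsearch time units (e.g. "10d" for ten days).
resp = es.ml.forecast(job_id="lab1a_low_web_traffic", duration="10d")

# The forecast ID distinguishes this forecast from ones created later.
print(resp["forecast_id"])
```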
Additional job configurations
Typically after you open a job, you might find that you need to alter the job configuration or settings.
Sometimes there are periods when you expect unusual activity to take place, such as bank holidays, "Black Friday", or planned system outages. If you identify these events in advance, no anomalies are generated during that period. The machine learning model is not ill-affected and you do not receive spurious results.
You can create calendars and scheduled events in the Settings pane on the Machine Learning page in Kibana or by using Machine learning anomaly detection APIs.
A scheduled event must have a start time, end time, and description. In general, scheduled events are short in duration (typically lasting from a few hours to a day) and occur infrequently. If you have regularly occurring events, such as weekly maintenance periods, you do not need to create scheduled events for these circumstances; they are already handled by the machine learning analytics.
You can identify zero or more scheduled events in a calendar. Anomaly detection jobs can then subscribe to calendars and the machine learning analytics handle all subsequent scheduled events appropriately.
If you want to add multiple scheduled events at once, you can import an iCalendar (.ics) file in Kibana or a JSON file in the add events to calendar API.
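A minimal sketch of the calendar APIs with the Python client; the calendar ID, event, and dates are made up for illustration:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host/auth

# Create a calendar and subscribe a job to it.
es.ml.put_calendar(
    calendar_id="planned-outages",        # illustrative calendar ID
    job_ids=["lab1a_low_web_traffic"],
)

# Add one scheduled event; times are epoch milliseconds.
es.ml.post_calendar_events(
    calendar_id="planned-outages",
    events=[{
        "description": "Black Friday",
        "start_time": 1700784000000,  # 2023-11-24T00:00:00Z
        "end_time": 1700870400000,    # 2023-11-25T00:00:00Z
    }],
)
```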
By default, anomaly detection is unsupervised and the machine learning models have no awareness of the domain of your data. As a result, anomaly detection jobs might identify events that are statistically significant but are uninteresting when you know the larger context. Machine learning custom rules enable you to customize anomaly detection.
Custom rules – or job rules as Kibana refers to them – instruct anomaly detectors to change their behavior based on domain-specific knowledge that you provide. When you create a rule, you can specify conditions, scope, and actions. When the conditions of a rule are satisfied, its actions are triggered.
For example, if you have an anomaly detector that is analyzing CPU usage, you might decide you are only interested in anomalies where the CPU usage is greater than a certain threshold. You can define a rule with conditions and actions that instruct the detector to refrain from generating machine learning results when there are anomalous events related to low CPU usage. You might also decide to add a scope for the rule, such that it applies only to certain machines. The scope is defined by using machine learning filters.
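Sketching that CPU-usage example as a job rule via the update job API (Python client; the job ID, detector index, and threshold are illustrative, not from this lab):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host/auth

# Tell detector 0 to skip results whenever the actual value is below the
# threshold, so only high-CPU anomalies are reported.
es.ml.update_job(
    job_id="cpu_usage_job",  # hypothetical job
    detectors=[{
        "detector_index": 0,
        "custom_rules": [{
            "actions": ["skip_result"],
            "conditions": [{
                "applies_to": "actual",
                "operator": "lt",
                "value": 80.0,
            }],
        }],
    }],
)
```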
Filters contain a list of values that you can use to include or exclude events from the machine learning analysis. You can use the same filter in multiple anomaly detection jobs.
If you are analyzing web traffic, you might create a filter that contains a list of IP addresses. For example, maybe they are IP addresses that you trust to upload data to your website or to send large amounts of data from behind your firewall. You can define the scope of a rule such that it triggers only when a specific field in your data matches one of the values in the filter. Alternatively, you can make it trigger only when the field value does not match one of the filter values. You therefore have much greater control over which anomalous events affect the machine learning model and appear in the machine learning results.
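A sketch combining a filter with rule scope (Python client; the filter ID, IP addresses, and job are hypothetical; note that rule scope applies to a detector's by/over/partition fields):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host/auth

# A reusable list of trusted IP addresses.
es.ml.put_filter(
    filter_id="trusted_ips",  # illustrative filter ID
    description="IPs trusted to send large amounts of data",
    items=["10.0.0.1", "10.0.0.2"],
)

# Scope a rule to that filter: skip results when clientip matches an entry.
# Assumes a hypothetical job whose detector partitions on clientip.
es.ml.update_job(
    job_id="web_traffic_by_clientip",
    detectors=[{
        "detector_index": 0,
        "custom_rules": [{
            "actions": ["skip_result"],
            "scope": {
                "clientip": {"filter_id": "trusted_ips", "filter_type": "include"},
            },
        }],
    }],
)
```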
For more information, see Customizing detectors with custom rules.