Exploring Event Log Analysis with Minimum Apriori Information
The continued increase in the size and complexity of modern computer systems has led to a commensurate increase in the size of their logs. System logs are an invaluable resource to systems administrators during fault resolution, which is a time-consuming and knowledge-intensive process. Much of the time spent in fault resolution goes to sifting through large volumes of information, including event logs, to find the root cause of the problem. The ability to analyze log files automatically and accurately will therefore lead to significant savings in the time and cost of downtime events for any organization. The automatic analysis and search of system logs for fault symptoms, otherwise called alerts, is the primary motivation for the work carried out in this thesis. The proposed log alert detection scheme is a hybrid framework that incorporates anomaly detection and signature generation. Unlike previous work, it assumes minimal a priori knowledge of the system being analyzed, which enhances the platform portability of the framework. The anomaly detection component works in a bottom-up manner on the contents of historical system log data to detect regions of the log that contain anomalous (alert) behaviour. The identified anomalous regions are then passed to the signature generation component, which mines them for patterns. Consequently, future occurrences of the alert underlying an anomalous log region can be detected on a production system using the discovered pattern. The combination of anomaly detection and signature generation, which is novel when compared to previous work, yields a framework that is accurate while still being able to detect new and unknown alerts. The framework was evaluated on log data from High Performance Cluster (HPC), distributed, and cloud systems.
These systems provide a good range of the types of computer systems used in the real world today. The results indicate that the framework can generate signatures for detecting alerts that achieve, on average, a recall rate of approximately 83% and a false positive rate of approximately 0%.