- Hatice Nur Yerlikaya
Department of Computer Engineering, Ankara Yıldırım Beyazıt University, Turkey
hatice.nur.11@gmail.com.tr
- Mustafa Yeniad
Department of Computer Engineering, Ankara Yıldırım Beyazıt University, Turkey
myeniad@ybu.edu.tr
Keywords: Network Monitoring, Log Analysis, Big Data, Elastic Stack, Elasticsearch, Kibana, Logstash
Abstract
As technology tools have proliferated, access to data has become easier and internet usage has increased significantly. This growth has produced large log files that are very difficult to manage and analyze. Network monitoring tools such as Graylog, Nagios, and Elastic Stack make it easier to extract critical data by narrowing the scope of log data. We chose the Elastic Stack platform because of the performance of its Elasticsearch component on large data volumes and the rich visualizations of the Kibana interface. The data used in our study is an intrusion detection system dataset containing approximately 25 million log records (DNS, FTP, HTTP, SSL, and SSH logs), which enables analysis at the application layer. First, we transferred the log data to Elasticsearch using Logstash parsing methods, creating an index for each log file. We then built queries that group and filter on the fields in the log files, compute statistics, and provide time-based tracking; we visualized them with pie, bar, metric, table, and timeline graphs and assembled dashboards that can be monitored simultaneously. In the second stage of the study, we compared the performance of Elasticsearch with that of MongoDB and SQLite by running the same queries in each database. In the analysis of log files of different sizes, we found that the larger the data size, the faster Elasticsearch read data compared to MongoDB. Across all queries, Elasticsearch had a delay of less than 100 ms. We observed that MongoDB performs better when writing data, but its performance in filtering and grouping data is similar to that of queries in a traditional SQL database.
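As a rough illustration of the kind of filter-and-group query and timing measurement compared across the databases, the following minimal sketch uses Python's standard `sqlite3` module. The table name `http_log`, its columns, and the sample rows are hypothetical and are not taken from the study's dataset or scripts; the timings it prints only show how such a measurement might be made.

```python
import sqlite3
import time

# Hypothetical HTTP log rows (timestamp, source IP, method, status).
# These are illustrative values, not data from the study.
rows = [
    ("2020-01-01T00:00:01", "10.0.0.1", "GET", 200),
    ("2020-01-01T00:00:02", "10.0.0.2", "POST", 404),
    ("2020-01-01T00:00:03", "10.0.0.1", "GET", 200),
    ("2020-01-01T00:00:04", "10.0.0.3", "GET", 500),
    ("2020-01-01T00:00:05", "10.0.0.2", "GET", 200),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE http_log (ts TEXT, src_ip TEXT, method TEXT, status INTEGER)"
)

# Time the write path (bulk insert plus commit).
start = time.perf_counter()
conn.executemany("INSERT INTO http_log VALUES (?, ?, ?, ?)", rows)
conn.commit()
write_ms = (time.perf_counter() - start) * 1000

# Time the read path: filter on one field, then group and count by another.
start = time.perf_counter()
result = conn.execute(
    "SELECT src_ip, COUNT(*) FROM http_log "
    "WHERE method = 'GET' GROUP BY src_ip ORDER BY src_ip"
).fetchall()
read_ms = (time.perf_counter() - start) * 1000

print(result)
print(f"write: {write_ms:.3f} ms, read: {read_ms:.3f} ms")
```

With only a handful of rows the measured times are dominated by overhead; a meaningful comparison would repeat the same query over log files of increasing size, as the study describes.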