Malware Detection Using Deep Learning

As the malware landscape evolves rapidly, artificial intelligence (AI) is a promising way to keep pace with it. However, AI models require a large volume of training data, and security-related data is often highly confidential because of the sensitive information it may hold, which makes it hard to obtain. Without such data, it is difficult to train accurate models.

This research evaluates whether transfer learning, a machine learning method, can carry knowledge gained from training models in other domains over to malware detection. In particular, we focus on leveraging the success that deep convolutional neural networks have achieved in image classification, using models that have already been trained on an extensive dataset of more than a million images. We also look more closely at Linux malware, as it is an emerging threat vector that has received comparatively little research attention.

Having sourced malware samples from various platforms, we observe and record the behaviour of Linux malware using tools such as Progger, and aim to re-train the existing models, with comparatively little effort, to detect malware successfully.
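
The re-training idea above can be illustrated with a minimal sketch. It is not the models or data used in this research: a tiny fixed layer stands in for a pre-trained deep convolutional network, a logistic-regression head stands in for the re-trained part, and the feature vectors and labels are invented toy values. All names (PRETRAINED_WEIGHTS, frozen_features, train_head, predict) are hypothetical. The point it shows is only the mechanism of transfer learning: the transferred layers stay frozen, and only a small new head is trained on the scarce target-domain data.

```python
import copy
import math

# Assumption: stand-in for layers learned in another domain (e.g. images).
# In transfer learning these weights are reused and kept frozen.
PRETRAINED_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]

def frozen_features(x, weights=PRETRAINED_WEIGHTS):
    """Map an input vector through the frozen layer (never updated)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

def train_head(features, labels, epochs=200, lr=0.5):
    """Train only a new logistic-regression head with plain SGD."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of malware
            g = p - y                        # gradient of log-loss w.r.t. z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(x, w, b):
    """Return 1 (malware) or 0 (benign) for a raw input vector."""
    f = frozen_features(x)
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0

# Toy, made-up behaviour vectors: label 1 = malware, 0 = benign.
samples = [[2.0, 0.1], [1.5, -0.2], [-1.0, 0.3], [-2.0, -0.1]]
labels = [1, 1, 0, 0]

weights_before = copy.deepcopy(PRETRAINED_WEIGHTS)
head_w, head_b = train_head([frozen_features(x) for x in samples], labels)
predictions = [predict(x, head_w, head_b) for x in samples]
print(predictions)
```

Because only the head's few parameters are learned, a handful of labelled samples is enough in this toy setting, which mirrors the motivation of re-training existing models with comparatively little effort.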