Building and Defending a Machine Learning Malware Classifier: Taking Third at MLSEC 2021
Track 1
4 Feb 2022 10:00 AM - 11:00 AM
Nowadays when you read about cybersecurity, you’re almost sure to see machine learning (ML) mentioned as the silver bullet for all problems cyber. Of course, ML isn’t the cyber cure-all, and indeed suffers from problems of its own – chiefly that ML brings with it its own set of vulnerabilities and weaknesses, often termed “adversarial ML.” These weak points range from leaking the private data a model was trained on to being easily evaded by a motivated attacker in the right context.
In this talk, we’ll go through our own experience using ML to build and defend a robust malware detector as part of our submission to the 2021 Machine Learning Security Evasion Competition. We’ll start with background on adversarial ML, then show how we used these ideas to generate adversarial malware variants from which we built our model. We’ll then shift gears to how we sought to “defend” this model by explicitly attacking the models submitted by the other participants, walking through how we trained a proxy ML model and staged attacks against it.
In the end, our submission took third place in the competition, outperforming some but not all of the contestants. Along the way, our journey surfaced many lessons for those looking to get into the space, as well as for those already practicing in it. Attendees should walk away with an understanding of those lessons and with pointers to resources they can use to build their own models, including the open-source code and data behind our submission.
Andy Applebaum
Principal Cyber Security Engineer at MITRE
@andyplayse4
Andy Applebaum is a security researcher at MITRE, where he works on applied and theoretical security research problems, including as one of the leads on the CALDERA automated adversary emulation project. His work tends to lie at the intersection of security, automation, and reasoning, with a growing interest in the ability of attackers to both misuse and thwart machine learning and artificial intelligence systems. Andy has published numerous papers and spoken at multiple conferences, including Black Hat Europe, CAMLIS, BSides Las Vegas, and the FIRST Conference.
Andy received his PhD in computer science from the University of California, Davis, and he holds the OSCP certification. Outside of work, Andy is an avid chess player, having won the 2018 DEF CON chess championship.