Scaler: Image-Scaling Attacks in Machine Learning
This project studies image-scaling attacks, a new form of attack that allows an adversary to manipulate images such that they change their content during downscaling. Image-scaling attacks are a considerable threat, as scaling is omnipresent in computer vision. Moreover, these attacks are agnostic to the learning model and training data and thus affect any learning-based system operating on images.
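The core idea can be illustrated with a deliberately simplified sketch. It assumes a naive nearest-neighbor scaler that keeps every k-th pixel. Real scaling libraries and the attacks studied in this project are more involved, and the function names `downscale_nearest` and `embed_target` are hypothetical, chosen only for this illustration. Because the scaler looks at only a small fraction of the pixels, an adversary who overwrites exactly those pixels controls the downscaled output while leaving most of the image untouched.

```python
def downscale_nearest(image, factor):
    """Downscale a 2D grayscale image by keeping every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]


def embed_target(source, target, factor):
    """Overwrite only the pixels that the naive scaler will sample."""
    attacked = [row[:] for row in source]  # copy, leave the source untouched
    for i, target_row in enumerate(target):
        for j, value in enumerate(target_row):
            attacked[i * factor][j * factor] = value
    return attacked


factor = 4
source = [[200] * 16 for _ in range(16)]  # bright 16x16 image
target = [[0] * 4 for _ in range(4)]      # dark 4x4 image

attacked = embed_target(source, target, factor)

# Count how many pixels differ between the original and the manipulated image.
changed = sum(
    a != s
    for attacked_row, source_row in zip(attacked, source)
    for a, s in zip(attacked_row, source_row)
)

print(downscale_nearest(attacked, factor) == target)  # True
print(changed, "of", 16 * 16, "pixels modified")      # 16 of 256 pixels modified
```

In this toy setting, changing 16 of 256 pixels is enough to replace the downscaled image entirely, and the manipulation is independent of any model that later consumes the image.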
Imitator: Misleading Code Stylometry using Adversarial Learning
In this project, we attack methods for authorship attribution of source code using adversarial learning. We exploit the fact that these methods rest on machine learning and can thus be deceived by adversarial examples of source code. Our attack performs a series of semantics-preserving code transformations that mislead the attribution but appear plausible to a developer. Our attack and the datasets are publicly available.
Twins: Adversarial Learning and Digital Watermarking
In this research project, we explore similarities between machine learning and digital watermarking under attack. As part of the project, we have developed a unified view of attacks in both domains and created a framework for modeling evasion and poisoning attacks. The code and datasets of our case studies are publicly available.
Joern: A Robust Tool for Static Code Analysis
Joern is a tool for robust analysis of C/C++ code. It generates abstract syntax trees, control flow graphs and searchable indexes of code constructs. It has been specifically designed to meet the needs of code auditors, who often find themselves in a situation where constructing a working build environment is not feasible. Joern enables one to write quick-and-dirty but language-aware static analysis tools.
Pulsar: Protocol Learning, Simulation and Stateful Fuzzing
Pulsar is a network fuzzer with automatic protocol learning and simulation capabilities. The tool models a protocol using machine learning techniques. The learned models can be used to simulate communication between Pulsar and a real client or server. In combination with a series of fuzzing primitives, this makes it possible to test the implementation of an unknown protocol for errors in deeper states of its state machine.
Drebin: Dataset of Malicious Android Applications
The Drebin dataset consists of roughly 5,000 malicious Android applications that have been collected as part of the Mobile Sandbox project between 2010 and 2012. The dataset can be used to experiment with Android malware and compare different detection approaches.
Adagio: Structural Analysis and Detection of Android Malware
Adagio is a collection of Python modules for analyzing and detecting Android malware. These modules extract labeled call graphs from Android APKs or DEX files and apply an explicit feature map that captures their structural relationships. Additional modules provide classes for designing binary or multiclass classification experiments and applying machine learning for the detection of malicious structure.
Malheur: Automatic Analysis of Malware Behaviour
Malheur is a tool for the automatic analysis of program behavior recorded from malicious software (malware). It has been designed to support the regular analysis of malicious software and the development of detection and defense measures. Malheur allows for identifying novel classes of malware with similar behavior and assigning unknown malware to discovered classes using machine learning.
Harry: A Tool for Measuring String Similarity
Harry is a small tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings, such as the Levenshtein (edit) distance, the Jaro-Winkler distance and the compression distance. Harry is implemented using OpenMP, such that its runtime scales linearly with the number of available CPU cores.
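As an illustration of one of the measures named above, the following is a minimal sketch of the Levenshtein (edit) distance using the textbook dynamic-programming formulation. It is not taken from Harry, and the function name `levenshtein` is chosen only for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete char_a
                    current[j - 1] + 1,                   # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory use grows with the length of one string rather than with the product of both lengths.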
Sally: A Tool for Embedding Strings in Vector Spaces
Sally is a small tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows for applying techniques of machine learning and data mining to the analysis of string data. Sally can be applied to several types of string data, such as text documents, DNA sequences or log files, and it can handle common input formats such as directories, archives and text files.
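To make the notion of an embedding concrete, here is a minimal sketch that assumes a bag-of-n-grams feature map over characters, one common way of turning strings into sparse vectors. Whether this matches Sally's own feature maps is an assumption, and the helper names `embed` and `dot` are hypothetical.

```python
from collections import Counter


def embed(text: str, n: int = 2) -> Counter:
    """Map a string to a sparse vector that counts its character n-grams."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def dot(u: Counter, v: Counter) -> int:
    """Inner product of two sparse vectors (a missing n-gram counts as 0)."""
    return sum(count * v[gram] for gram, count in u.items())


x = embed("abab")  # {'ab': 2, 'ba': 1}
y = embed("abba")  # {'ab': 1, 'bb': 1, 'ba': 1}

print(dict(x))
print(dot(x, y))  # 3
```

Once strings are represented this way, standard vector-based learning methods can operate on them, for example by using the inner product as a similarity measure.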
Salad: A Content Anomaly Detector based on n-Grams
Salad is an efficient and flexible implementation of the well-known anomaly detection method Anagram. The method uses n-grams (substrings of length n) maintained in a Bloom filter for efficiently detecting anomalies in large sets of string data. Salad extends the original method by supporting n-grams of bytes and words as well as training with two classes.
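The underlying technique can be sketched as follows: store the n-grams of normal training data in a Bloom filter and score new data by the fraction of its n-grams that the filter has not seen. This is a simplified illustration, not Salad's implementation. The class `BloomFilter`, the functions `train` and `anomaly_score`, the filter size, and the SHA-256-based hashing are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash functions; no false negatives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by prefixing the item with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def ngrams(data: bytes, n: int):
    """All byte substrings of length n, in order of occurrence."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def train(bloom: BloomFilter, samples, n: int = 3) -> None:
    """Insert every n-gram of the normal samples into the filter."""
    for sample in samples:
        for gram in ngrams(sample, n):
            bloom.add(gram)


def anomaly_score(bloom: BloomFilter, data: bytes, n: int = 3) -> float:
    """Fraction of n-grams in data that were not seen during training."""
    grams = ngrams(data, n)
    if not grams:
        return 0.0
    unseen = sum(gram not in bloom for gram in grams)
    return unseen / len(grams)


bloom = BloomFilter()
train(bloom, [b"GET /index.html HTTP/1.1"])

score_normal = anomaly_score(bloom, b"GET /index.html HTTP/1.1")
score_unusual = anomaly_score(bloom, b"\x90\x90\x90\x90\x90\x90")
print(score_normal)                  # 0.0, since every n-gram was seen in training
print(score_unusual > score_normal)
```

A Bloom filter never forgets an inserted n-gram, so known data always scores 0.0. It can occasionally report an unseen n-gram as known, which is the price paid for its compact, constant-size representation.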
A Study on the Effectivity of Jailbreak Detection in Banking Apps
Jailbreaks remove vital security mechanisms that are necessary to ensure a trusted environment in which sensitive data, such as login credentials and transaction numbers (TANs), is protected. We find that all but one banking app available in the iOS App Store can be fully compromised by trivial means, without reverse-engineering, manipulating the app, or other sophisticated attacks.
Security Analysis of Devolo HomePlug Devices
We have conducted a thorough security analysis of so-called HomePlug devices by Devolo, which are used to establish network communication over power lines. We have identified multiple security issues and find that hundreds of vulnerable devices are openly connected to the Internet across Europe. Of these devices, 87% run outdated firmware, which shows the deficiency of manual updates in comparison to automatic ones.