Speaker
Anton Kalinin
(Principal Security Consultant)
Description
Abstract
There are plenty of resources on how to prepare for, investigate, and recover from critical incidents such as ransomware attacks, which are among the most common attacks incident responders deal with. However, these resources are high-level, provide very few technical details, or rely on adequate disaster recovery preparations. With modern ransomware attacks targeting hypervisors more frequently, it is important for defenders to understand the possible options for recovery beyond the typical responses of paying a ransom or standing up a new environment. This research was carried out during the recovery stage of a recent incident in a customer's environment. During the research, we identified several tools that could aid an organization's recovery efforts, but they have limited compatibility with ESXi hosts, so we had to develop our own solution. The talk aims to provide hands-on experience of manually recovering partially encrypted virtual machines on an ESXi server, along with a step-by-step guide to the recovery.
The talk covers the following topics:
- Foundational knowledge you need to recover partially encrypted VMs by yourself (Virtual Machine Disk (VMDK) and partition table basics)
- Step-by-step walkthrough of the actual case and the problems we overcame
- Hands-on approach to real-world recovery and the difficulties which may arise
- How to automate the process of recovery
We will conclude with lessons learned from the research and from the real-world case, as well as the limitations of the approach.
# Presentation Outline
- Introduction
+ What happened. Brief overview of the problems in the environment.
= After a ransomware attack, the company's infrastructure and its customers' VMs were partially encrypted, leaving the company effectively non-operational. It was therefore urgent to establish the chances of recovery, as this affected the decision on whether to pay the ransom.
+ Main characteristics of the environment.
= Multiple locations, several versions of ESXi servers. Most of the critical data is inside VMs.
+ Goal: create a guide for the customer on how to recover from the attack
- Groundwork to understand the problem better.
- Searching for the file system
+ Using a hex editor to find the file system's offset inside the VM.
= Looking for the NTFS signature to find possible file system offsets. Analyzing each candidate offset via sleuthkit to understand whether the data can be recovered.
+ Third-party software that can help.
= Tools such as the R-Studio data recovery software could help with this task, but they were created with a different problem in mind, and it might have been impossible to use them through the ESXi console.
+ Mounting the file system
+ Automation of file system search
= Writing a small script that looks for NTFS volumes we can analyze later via sleuthkit.
- Single VM recovery
+ What the default file formats are. What a "flat" VMDK is. Making educated guesses to create a similar environment.
+ Creating VMs in an environment comparable to the target environment for analysis.
= The idea behind this is simple. To recover a VM in the client’s environment, we first create several VMs in a similar environment to understand it better, simulate the encryption, and try to recover one VM to find out whether the problem has a potential solution.
+ Recovery of one partially encrypted VM
= Walking through the whole process, including the many unsuccessful attempts
- A more generic approach to VM recovery
+ New challenges (a different offset in the partition table, so the previous solution wasn’t fully applicable)
+ Leveraging open-source tools (sleuthkit, dd, ntfsfix)
+ Creating a step-by-step guide to the recovery process for the third party
- Automating the task for ESXi 8.0
+ New challenges and limitations (plenty of encrypted VMDKs, an old version of libc)
+ Choosing the language for development
+ Creating a Docker container to run the tool on ESXi
+ Speeding up the tool
- Automating the task for ESXi 6.7
+ Even more challenges and limitations (an even older version of libc, an older version of Python: 3.5)
+ Creating a Docker container to run the tool on ESXi 6.7 (building Python and one of the plugins from scratch locally)
+ Making changes to the code: Python 3.5 does not support some features
+ Using strace to debug a monolithic Python binary ("it works on my laptop")
- Results
+ How to recover a single VM: step-by-step guide
+ A word about scaling
+ Little helper tool
- Conclusions
+ Limitations of the approach
+ Lessons learned from the case and the research itself. What can be done better.
= Be prepared for various legacy challenges
= Try "manually" first, using basic tools, then automate
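The file system search step in the outline (looking for the NTFS signature, then checking candidate offsets with sleuthkit) could be sketched roughly as follows. This is a minimal illustration, not the tool presented in the talk: the function name `find_ntfs_offsets`, the 512-byte sector size, and the chunk size are assumptions made for the example. It relies on two well-known properties of an NTFS boot sector: the OEM ID `NTFS    ` starting at byte 3 and the `0x55 0xAA` marker at bytes 510-511.

```python
import io

SECTOR_SIZE = 512                # assumed sector size for the disk image
NTFS_OEM_ID = b"NTFS    "        # OEM ID at bytes 3..10 of an NTFS boot sector
BOOT_SIGNATURE = b"\x55\xaa"     # end-of-sector marker at bytes 510..511


def find_ntfs_offsets(stream, sector_size=SECTOR_SIZE, chunk_sectors=2048):
    """Return byte offsets of sectors that look like NTFS boot sectors.

    `stream` is any binary file-like object, for example an opened
    flat VMDK file. Only sector-aligned positions are checked.
    """
    offsets = []
    position = 0
    chunk_size = sector_size * chunk_sectors
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Walk the chunk one sector at a time; a trailing partial
        # sector at the end of the file is ignored.
        for start in range(0, len(chunk) - sector_size + 1, sector_size):
            sector = chunk[start:start + sector_size]
            if sector[3:11] == NTFS_OEM_ID and sector[510:512] == BOOT_SIGNATURE:
                offsets.append(position + start)
        position += len(chunk)
    return offsets


if __name__ == "__main__":
    # Synthetic image: 2048 empty sectors, one fake boot sector, padding.
    boot = bytearray(SECTOR_SIZE)
    boot[3:11] = NTFS_OEM_ID
    boot[510:512] = BOOT_SIGNATURE
    image = bytes(SECTOR_SIZE * 2048) + bytes(boot) + bytes(SECTOR_SIZE * 10)

    for offset in find_ntfs_offsets(io.BytesIO(image)):
        print("candidate at byte", offset, "= sector", offset // SECTOR_SIZE)
```

Dividing a reported byte offset by the sector size gives a sector offset that can be handed to sleuthkit, for example `fsstat -o <sector> <image>`. NTFS also keeps a backup copy of the boot sector at the end of the volume, so some hits may be backups rather than the start of a file system, and every candidate still needs to be verified.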
Length: 60 minutes
Primary author
Anton Kalinin
(Principal Security Consultant)