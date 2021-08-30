Malware analysis helps security teams improve threat detection and remediation. Through static analysis, dynamic analysis or a combination of both techniques, security professionals can determine how dangerous a particular malware sample is. They can also analyze how malware functions once on a system and improve future alerts to similar malware attacks.

In Malware Analysis Techniques: Tricks for the triage of adversarial software, published by Packt, author Dylan Barker introduces analysis techniques and tools to study malware variants.

The book begins with step-by-step instructions for installing isolated VMs to test suspicious files. From there, Barker explains beginner and advanced static and dynamic analysis techniques, as well as de-obfuscating tricks and utilizing the Mitre ATT&CK framework.

In this Chapter 2 excerpt, Barker explains how static analysis lets security teams collect data from a suspicious file without executing it. Through hashing and fuzzy hashing techniques and tools, security professionals can learn whether a malware sample has been cataloged. Given how often attackers tweak malware to create new signatures to fool antivirus software, the next step involves executing the file in an isolated VM and observing its actions. Barker also shows how security teams can use open source intelligence through VirusTotal to learn about a known malware variant. VirusTotal is a scanning engine for malware samples, comparing files, hashes, URLs and more to a database and against antivirus engines.

Malware analysis is divided into two primary techniques: dynamic analysis, in which the malware is actually executed and observed on the system, and static analysis. Static analysis covers everything that can be gleaned from a sample without actually loading the program into executable memory space and observing its behavior.

Much like shaking a gift box to ascertain what we might expect when we open it, static analysis allows us to obtain a lot of information that may later provide context for behaviors we see in dynamic analysis, as well as static information that may later be weaponized against the malware.

In this chapter, we'll review several tools suited to this purpose, and several basic techniques for shaking the box that provide the best information possible. In addition, we'll take a look at two real-world examples of malware, and apply what we've learned to show how these skills and tools can be utilized practically to both understand and defeat adversarial software.

In this chapter, we will cover the following topics:

The basics -- hashing

Avoiding rediscovery of the wheel

Getting fuzzy

Picking up the pieces

Technical requirements

The technical requirements for this chapter are as follows:

FLARE VM set up, which we covered in the previous chapter

An internet connection

.zip files containing tools and malware samples from https://github.com/PacktPublishing/Malware-Analysis-Techniques

The basics -- hashing

One of the most useful techniques an analyst has at their disposal is hashing. A hashing algorithm is a one-way function that generates a fixed-length checksum for a file, much like a fingerprint of the file. With a sound algorithm, every distinct file passed through it should produce a distinct hash, even if only a single bit differs between two files. For instance, in the previous chapter, we utilized SHA256 hashing to verify whether a file that was downloaded from VirtualBox was legitimate.

Hashing algorithms

SHA256 is not the only hashing algorithm you're likely to come across as an analyst, though it currently offers the best balance between collision resistance and computational demand. The following table outlines hashing algorithms and their corresponding output sizes in bits:

Algorithm    Output Bits    Broken
MD5          128            Yes
SHA1         160            Yes
SHA256       256            No
SHA512       512            No

Analysis Tip

In terms of hashing, a collision is an occurrence where two different files have identical hashes. Once collisions can be produced in practice, a hashing algorithm is considered broken and no longer reliable. Examples of such algorithms include MD5 and SHA1.
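To make the fingerprint idea concrete, here is a minimal sketch using Python's standard hashlib module rather than the tooling from the book. It hashes two inputs that differ by a single bit and reports the output size of each algorithm listed above. The sample byte strings and the digest_report helper are illustrative assumptions, not part of the chapter's materials.

```python
import hashlib

# Hypothetical sample inputs: they differ in exactly one bit
# (the last byte is 0x41 'A' in one and 0x40 '@' in the other).
sample_a = b"AAAA"
sample_b = b"AAA@"


def digest_report(data: bytes) -> dict[str, str]:
    """Return hex digests of `data` for the four algorithms discussed above."""
    return {
        name: hashlib.new(name, data).hexdigest()
        for name in ("md5", "sha1", "sha256", "sha512")
    }


report_a = digest_report(sample_a)
report_b = digest_report(sample_b)

for name in report_a:
    bits = len(report_a[name]) * 4  # each hex character encodes 4 bits
    differs = report_a[name] != report_b[name]
    print(f"{name:7} {bits:4} bits  digests differ: {differs}")
```

Even the broken algorithms produce different digests for these two inputs. Being broken means that colliding inputs can be deliberately constructed, not that arbitrary inputs collide.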

Obtaining file hashes

There are many different tools that can be utilized to obtain hashes of files within FLARE VM, but the simplest, and often most useful, is built into Windows PowerShell. Get-FileHash is a command we can utilize that does exactly what it says -- gets the hash of the file it is provided. We can view the usage of the cmdlet by typing Get-Help Get-FileHash, as shown in the following screenshot:

Figure 2.1 -- Get-FileHash usage

In this instance, there are two files available at https://github.com/PacktPublishing/Malware-Analysis-Techniques. These files are titled md5-1.exe and md5-2.exe. Once downloaded, Get-FileHash can be utilized on them, as shown in the next screenshot. Because these were the only two files in the directory, it was possible to use Get-ChildItem and pipe the output to Get-FileHash, as it accepts pipeline input.

Analysis Tip

Utilizing Get-ChildItem and piping the output to Get-FileHash is a great way to get the hashes of files in bulk, and it saves a great deal of time in triage compared with providing each filename to Get-FileHash manually.

In the following screenshot, we can see that the files have the same MD5 hash! They also have the same size, so it's possible that these are, in fact, the same file:

Figure 2.2 -- The matching MD5 sums for our files

However, because MD5 is known to be broken, it may be best to utilize a different algorithm. Let's try again, this time with SHA256, as illustrated in the following screenshot:

Figure 2.3 -- The SHA256 sums for our files

The SHA256 hashes differ! This indicates without a doubt that these files, while the same size and with the same MD5 hash, are not the same file, and demonstrates the importance of choosing a strong one-way hashing algorithm.
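For analysts working outside PowerShell, the bulk-hashing workflow above can be approximated with a short script. The following is a sketch in Python using only the standard library; the function names hash_file and hash_directory are hypothetical, and it is an alternative to, not a reproduction of, the Get-ChildItem and Get-FileHash pipeline. It computes both MD5 and SHA256 for every file in a directory, so a pair of files that collides under MD5 can be told apart in a single pass.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, algorithm: str = "sha256") -> str:
    """Hash one file in chunks so large samples need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def hash_directory(directory: str,
                   algorithms: tuple[str, ...] = ("md5", "sha256")
                   ) -> dict[str, dict[str, str]]:
    """Map each regular file in `directory` to its digests, keyed by algorithm."""
    results: dict[str, dict[str, str]] = {}
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():
            results[entry.name] = {
                name: hash_file(entry, name) for name in algorithms
            }
    return results
```

Calling hash_directory on the folder holding md5-1.exe and md5-2.exe should return one entry per file. Comparing the "md5" and "sha256" values side by side then mirrors the comparison made in Figures 2.2 and 2.3.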

Avoiding rediscovery of the wheel

We have already established a great way of gaining information about a file via cryptographic hashing -- akin to a file's fingerprint. Utilizing this information, we can leverage other analysts' hard work and avoid wasting time on deeper analysis if someone has already analyzed our malware sample.

Leveraging VirusTotal

A wonderful tool that is widely utilized by analysts is VirusTotal. VirusTotal is a scanning engine that scans possible malware samples against several antivirus (AV) engines and reports their findings. In addition to this functionality, it maintains a database that is free to search by hash. Navigating to https://virustotal.com/ will present this screen:

Figure 2.4 -- The VirusTotal home page

In this instance, we'll use the SHA256 hash 275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f as an example. Entering this hash into VirusTotal and clicking the Search button will yield results as shown in the following screenshot, because several thousand analysts have submitted this file previously:

Figure 2.5 -- VirusTotal search results for EICAR's test file

Within this screen, we can see that several AV engines correctly identify this SHA256 hash as being the hash for the European Institute for Computer Antivirus Research (EICAR) test file, a file commonly utilized to test the efficacy of AV and endpoint detection and response (EDR) solutions.

It should be apparent that searching VirusTotal with our hashes first may greatly reduce triage time and confirm suspected attribution much more quickly than our own analysis would. However, this may not always be an ideal solution. Let's take a look at another sample -- 8888888.png. This file may be downloaded from https://github.com/PacktPublishing/Malware-Analysis-Techniques.

Warning! 888888.png is live malware -- a sample of the Qakbot (QBot) banking Trojan threat! Handle this sample with care!

Utilizing the previous section's lesson, obtain a hash of the Qakbot file provided. Once done, paste the discovered hash into VirusTotal and click the search icon, as illustrated in the following screenshot:

Figure 2.6 -- Searching for the Qakbot hash yields no results!

It appears, based on the preceding screenshot, that this malware has an entirely unique hash. Unfortunately, it appears as though static cryptographic hashing algorithms will be of no use to our analysis and attribution of this file. This is becoming more common due to adversaries' implementation of a technique called hashbusting, which ensures each malware sample has a different static hash!

Analysis Tip

Hashbusting is quickly becoming a common technique among more advanced malware authors, such as the actor behind the EMOTET threat. Hashbusting implementations vary greatly, from adding in arbitrary snippets at compile-time to more advanced, probabilistic control flow obfuscation -- such as the case with EMOTET.
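The hash searches performed through the VirusTotal website in this section can also be scripted. The sketch below is not from the book: it assumes the VirusTotal v3 REST API (the /api/v3/files/ endpoint and the x-apikey header), which requires an API key from a VirusTotal account, and the helper names identify_hash, build_lookup and lookup are hypothetical. It first checks that the input looks like an MD5, SHA1 or SHA256 digest, then builds the request. Treat the response handling as an assumption to verify against the current API documentation.

```python
import json
import re
import urllib.error
import urllib.request

# Assumed endpoint of the VirusTotal v3 API; not described in the chapter.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"

# Hex digest length -> algorithm, matching the output sizes discussed earlier.
HASH_LENGTHS = {32: "MD5", 40: "SHA1", 64: "SHA256"}

# The EICAR test file hash used as the example in this section.
EICAR_SHA256 = "275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f"


def identify_hash(value: str) -> str:
    """Return the algorithm name implied by a hex digest's length."""
    cleaned = value.strip().lower()
    if not re.fullmatch(r"[0-9a-f]+", cleaned) or len(cleaned) not in HASH_LENGTHS:
        raise ValueError(f"not an MD5, SHA1 or SHA256 hex digest: {value!r}")
    return HASH_LENGTHS[len(cleaned)]


def build_lookup(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the lookup request for one file hash."""
    identify_hash(file_hash)  # raises ValueError on malformed input
    url = VT_FILE_URL.format(file_hash.strip().lower())
    return urllib.request.Request(url, headers={"x-apikey": api_key})


def lookup(file_hash: str, api_key: str):
    """Return the per-engine verdict counts, or None if the hash is unknown."""
    try:
        with urllib.request.urlopen(build_lookup(file_hash, api_key), timeout=30) as resp:
            body = json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no record of this hash, as with the Qakbot sample
            return None
        raise
    return body["data"]["attributes"]["last_analysis_stats"]
```

With a valid key, lookup(EICAR_SHA256, api_key) would be expected to return verdict counts, while a hashbusted sample such as the Qakbot file would be expected to return None. An empty result says only that this exact file has not been seen before, not that it is benign.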