Malware Classification & Identification
Malware classification is the process of grouping malware samples based on characteristics they share with previously analyzed samples. Examples of these characteristics include strings and binary code.
What’s wrong with hash-based identification/classification?
- Attackers change the content of their samples to evade hash-based identification/classification.
- Cryptographic hashing is only accurate if the data/content of the sample remains exactly the same. If even a single byte is changed, the hash changes.
Note: The attacker may change only a small portion of the sample, so the functionality of the malware remains the same while the hash changes completely. For example, many attackers plant random data/strings (garbage strings) in a sample to change its hash and avoid hash-based detection/identification.
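To make this concrete, here is a minimal sketch using Python's standard hashlib module. The "sample" bytes and the appended garbage string are made up for illustration and are not taken from any real malware.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for a malware sample (illustrative bytes only).
original = b"MZ\x90\x00" + b"connect_to_c2();steal_credentials();"

# The same content with a few garbage bytes appended by the "attacker".
modified = original + b"\x00garbage-1234"

original_hash = sha256_hex(original)
modified_hash = sha256_hex(modified)

print(original_hash)
print(modified_hash)
print("hashes match:", original_hash == modified_hash)
```

The two digests have nothing in common even though the modified sample still contains every byte of the original, which is why a hash blocklist misses the second file.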
What is my point? Hash-based signature identification/detection is easy to evade and should not be relied upon for accurate classification/identification of samples.
This is where YARA comes into play.
What is YARA?
YARA is a fantastic malware identification & classification tool that works by matching textual or binary patterns across various malware samples.
What can you do with YARA?
- Identify samples based on particular signatures, such as characteristic strings or byte sequences.
- Generate rules that capture those signatures, which can then be used to detect future, similar infections. (AVs)
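The idea behind pattern-based rules can be sketched in plain Python. This is not YARA itself and does not use YARA's rule syntax or engine. SimpleRule, its fields, the patterns and the sample bytes are all hypothetical names and data chosen for illustration. It only mimics the general shape of a rule: a set of patterns plus a condition on how many must be present.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SimpleRule:
    """Hypothetical, simplified stand-in for a pattern-based rule."""
    name: str
    patterns: Tuple[bytes, ...]  # byte patterns to search for
    min_matches: int             # condition: at least this many patterns present

    def matches(self, data: bytes) -> bool:
        # Count how many distinct patterns occur anywhere in the sample.
        hits = sum(1 for pattern in self.patterns if pattern in data)
        return hits >= self.min_matches


# Illustrative rule: patterns are made up, not real indicators.
rule = SimpleRule(
    name="demo_family",
    patterns=(b"steal_credentials", b"connect_to_c2", b"\x90\x90\xeb\xfe"),
    min_matches=2,
)

# Illustrative samples: two variants of the same "family" and one benign file.
samples = {
    "variant_a": b"MZ...connect_to_c2();steal_credentials();",
    "variant_b": b"MZ...connect_to_c2();steal_credentials();" + b"garbage-9f3a",
    "benign": b"MZ...print_hello();",
}

results = {label: rule.matches(data) for label, data in samples.items()}
for label, hit in results.items():
    print(f"{label}: {'match' if hit else 'no match'}")
```

Both variants match the same rule even though the appended garbage gives them different hashes, while the benign file does not match. A real YARA rule expresses the same idea with its own strings and condition sections.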