3. REQUIREMENTS FOR CONDUCTING MALWARE FORENSICS
The elicitation of requirements can draw on multiple sources, not simply stakeholders alone (Burnay, 2016). In his study, Burnay (2016) found eliciting requirements from existing documentation to be significantly faster than relying on stakeholders. He also found several instances in which stakeholders made statements that conflicted with formally documented requirements, and which were therefore either incorrect or simply misunderstood by the stakeholders.
The use of authoritative document sources such as legislation and regulatory guidance already embodies the requirements of stakeholders, and so we took a document study approach to derive the requirements for conducting malware forensics.
Having considered the methodology, addressing trusted practice in malware forensics begins with exploring what is meant by trust. This can be defined as “willingly acting without the full knowledge needed to act” (Duranti & Rogers, 2012). In the context of the Criminal Justice System and expert evidence, this translates to a Court coming to a decision on the reliability of such evidence based upon two forms of trust: trust in the expert and trust in their evidence.
The former concerns the expert’s knowledge and skills, as well as their ability to communicate these effectively and fairly. A shortfall in one of these areas can impact the interpretation of the evidence or its probative value (see Problems with expert evidence, above).
The latter relates to the trust placed in the reliability of the evidence itself. Since the repeal of section 69 of the Police and Criminal Evidence Act 1984, the Crown Prosecution Service (CPS) has issued guidance stating that any evidence produced by a computer is presumed to be reliable (CPS, 2014). However, the formation of the Forensic Science Regulator (FSR) and the associated Codes (2020b) indicate that expert evidence has transitioned from an assumed, innate trust to one that is now externally validated. The CPS, the FSR, and practitioners themselves are all stakeholders in this process, each having their own requirements.
3.1 Legal Requirements
As with digital forensic practice in general, the legal requirements for malware forensic practice can be divided into lawful practice and admissibility. For the former, the primary risks lie in the handling of the malware files themselves and potential breaches of computer misuse and/or data protection legislation. The latter requires that any output of a tool used to analyse malware, where tendered as evidence, be admissible. This broadly translates to a person familiar with the expected output of a computer being available to give evidence (Lloyd, 2020). However, few people would be familiar with the expected output of a tool used to analyse malware, which typically produces unpredictable artefacts.
Guidance on expert evidence from the CPS (2019) states that expert evidence will be admissible under common law where:
- It will be of assistance to the court
- The expert has relevant expertise
- The expert is impartial
- The expert evidence is reliable
The first of these requirements concerns the forming of a judgement on the probative value of the evidence tendered, whilst the second and third concern a judgement on the expert. The last requirement concerns both the evidence and the manner in which it was produced. In their guidance, the CPS defines reliable evidence in terms of it having a “scientific basis.” This indicates a scientific methodology characterised by attributes such as repeatability, reproducibility, a testable hypothesis, controllability, and being unbiased.
Further to the above, the CPS acknowledge that novel techniques are frequently used in a fast-evolving technology discipline and defer to the recommendations of R v Lundy ([2013] UKPC 28), hereafter referred to as the “Lundy Guidelines” (see Table 1):
Table 1. R v Lundy Guidelines.
In response to some of the problems outlined in the introduction, regulatory codes of practice have been introduced.
3.2 Regulatory Requirements
Practitioners tendering expert evidence within the criminal justice system are expected to align their practice to regulatory standards, namely the Codes published by the FSR (2020b). Currently, there is no statutory requirement for practitioners to align their work to the Codes, but the FSR is lobbying the UK Government to make this mandatory (Forensic Science Regulator, 2020a).
The Codes stipulate that software tools must be validated (Section 24.1.2 of the Codes) and that an estimate of uncertainty be provided (Section 22 of the Codes). Furthermore, any reference datasets used to test tools against should also be reported (Section 23.4 of the Codes). Each of these requirements is now considered in turn.
Validation The Codes define validation as a means to demonstrate that a “method, process or device is fit for the specific purpose intended.” Although not specifically mentioned, the meaning of ‘device’ could readily be applied to a software device or tool. However, it is not clear how such validation should be performed, or what metrics should be used to inform a decision on how ‘fit for purpose’ a device is, e.g., accuracy, repeatability, etc.
One measure readily available is that of error, i.e., the difference between the expected and observed values (Kat & Els, 2012). Malin et al. (2008) point out that the names of artefacts (such as filenames) are typically randomly assigned. In light of this, it is reasonable to expect artefact values to vary much more than the quantity of artefacts produced each time a malware binary is executed. Such behaviour can be validated by repeatedly executing malware and monitoring the quantity of artefacts produced. Furthermore, to quantify and enable a statistical analysis of the error, a methodology for testing tools used for malware analysis could examine the difference in artefact quantities rather than the values themselves.
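As a minimal sketch of this quantity-based error measure (the counts and helper names below are illustrative assumptions, not part of the framework), the error for each repeated execution can be taken as the difference between the expected and observed artefact counts, giving a series of values suitable for statistical analysis:

```python
# Illustrative sketch: error on artefact *quantities* across repeated runs,
# rather than on artefact values (which are often randomly generated).

def quantity_error(expected_count: int, observed_count: int) -> int:
    """Error as the difference between observed and expected artefact counts."""
    return observed_count - expected_count

# Hypothetical observations for one sample and one tool over five executions.
expected = 12                           # e.g. number of files the sample is known to drop
observed_runs = [12, 11, 12, 13, 12]    # counts reported by the tool per run
errors = [quantity_error(expected, o) for o in observed_runs]
print(errors)                           # [0, -1, 0, 1, 0]
```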
Validation of a tool measuring artefacts produced by malware is complicated by the fact that malware employs techniques (often termed ‘anti-forensic’) to obfuscate the truth; hence, ‘ground truth’ is difficult to establish. One way forward is to compare what is reported by a tool against an independent and trusted source, or ‘oracle.’ This requires the testing methodology to (a) determine the expected value from an independent source and (b) be capable of retrieving the observed number of artefacts from a variety of tools under test within the framework.
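The sketch below illustrates requirements (a) and (b) under stated assumptions: the oracle count is supplied externally, and the per-tool report parsers are hypothetical stand-ins for whatever extraction each real tool would need.

```python
# Hedged sketch: compare the artefact count reported by several tools against an
# independently derived 'oracle' count. Parser logic is purely illustrative.

from typing import Callable, Dict

# Hypothetical parsers that extract an artefact count from each tool's report text.
tool_parsers: Dict[str, Callable[[str], int]] = {
    "tool_a": lambda report: report.count("FILE_CREATED"),
    "tool_b": lambda report: len(report.splitlines()),
}

def compare_against_oracle(oracle_count: int, reports: Dict[str, str]) -> Dict[str, int]:
    """Return per-tool error relative to the independent oracle count."""
    return {name: tool_parsers[name](report) - oracle_count
            for name, report in reports.items()}

# Example usage with fabricated report text.
reports = {"tool_a": "FILE_CREATED a\nFILE_CREATED b\n", "tool_b": "a\nb\nc\n"}
print(compare_against_oracle(oracle_count=3, reports=reports))   # {'tool_a': -1, 'tool_b': 0}
```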
Estimate of uncertainty A measure of statistical confidence can contribute to an estimate of uncertainty. One way to calculate this would be to run multiple tests under the same conditions and record the error between the expected and observed numbers of artefacts. The rationale behind this is that the ISO/IEC 17025 Standard (ISO, 2005) upon which the Codes are based derives its definition of uncertainty from the Guide to the Expression of Uncertainty in Measurement (GUM), produced by the Joint Committee for Guides in Metrology (JCGM) (2008). In this document, uncertainty is defined in terms of the “dispersion of the values” associated with an observable quantity. Acknowledging the two components of error (systematic and random), they add that random error “. . . can usually be reduced by increasing the number of observations”. Furthermore, calculating the experimental standard deviation “of the arithmetic mean or average of a series of observations” provides “a measure of the uncertainty of the mean due to random effects.”
Hence, by running sufficient tests, it would be possible to plot frequency distributions with associated confidence intervals. Similarly, by varying the conditions of tests, it would become possible to see the impact of such changes upon the level of uncertainty in the results.
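A short worked sketch of the GUM-style treatment described above is given below; the observation values are invented for illustration, and a t-distribution multiplier would be more appropriate than the normal approximation for small numbers of runs.

```python
# Sketch: experimental standard deviation of the mean of repeated observations
# as a measure of uncertainty due to random effects, with an approximate 95% CI.

from statistics import mean, stdev
from math import sqrt

observed_counts = [12, 11, 12, 13, 12, 12, 11, 13]   # artefact counts from repeated runs
n = len(observed_counts)
m = mean(observed_counts)
s = stdev(observed_counts)          # experimental standard deviation of the observations
sem = s / sqrt(n)                   # experimental standard deviation of the mean

# Normal approximation to a 95% confidence interval around the mean.
ci = (m - 1.96 * sem, m + 1.96 * sem)
print(f"mean={m:.2f}, uncertainty={sem:.2f}, 95% CI={ci[0]:.2f}..{ci[1]:.2f}")
```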
Reference data sets Becket (2010) states there is a “need” for forensic practitioners to demonstrate that “certified reference materials” have been used to evaluate their tools, citing Section 5.6.3.2 of the ISO/IEC 17025 Standard (ISO, 2005). This is not quite accurate, as the same section of the Standard states this should be done “where possible.” Section 21.2.64(h) of the Codes requires there to be a plan in place for the use of such data. A number of attempts have been made over the years by the scientific community to address the lack of standardised test data; these include test images produced by Carrier (2010), the Computer Forensic Reference Data Sets (CFReDS) project (NIST, 2016), and the Digital Corpora (2017) developed by Garfinkel et al. (2009) as an extensive collection of both fabricated and real data. There are also datasets that include malware samples, such as Contagio (2020) and VirusShare (2020); however, these (like those above) are not labelled as being certified.
Aside from legal and regulatory requirements, the handling and analysis of malware reveal a number of technical requirements identified by the literature for malware forensics practice.
3.3 Practice Requirements
The following practice requirements have been identified:
Virtual Machines Malin et al. (2008) recommend the use of Virtual Machines (VMs), particularly as these enable testing at scale and speed; greater numbers of tests can therefore be performed for repeatability or breadth-of-testing purposes. To facilitate this, testing should be automated as far as practically possible. In addition, to minimise the risk of malware escaping from a Windows guest VM (Tank, Aggarwal, & Chaubey, 2019), a Linux host should be used.
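A hedged sketch of such automation on a Linux host is given below; it assumes VirtualBox as the hypervisor (the literature does not mandate a specific product), and the VM and snapshot names are placeholders.

```python
# Sketch of automating repeated detonations: revert a Windows guest to a clean
# snapshot, start it headless, allow a detonation window, then power it off so
# artefacts can be collected offline. Names and timings are assumptions.

import subprocess
import time

VM = "win10-analysis"          # placeholder guest VM name
SNAPSHOT = "clean-baseline"    # snapshot taken before any malware execution

def run_once() -> None:
    subprocess.run(["VBoxManage", "snapshot", VM, "restore", SNAPSHOT], check=True)
    subprocess.run(["VBoxManage", "startvm", VM, "--type", "headless"], check=True)
    time.sleep(300)            # detonation window for the sample launched inside the guest
    subprocess.run(["VBoxManage", "controlvm", VM, "poweroff"], check=True)

for _ in range(20):            # repeated runs support repeatability testing at scale
    run_once()
```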
In considering the use of virtual machines, it should be noted that malware can detect virtual environments and change its behaviour, or even become ‘misleading’ (Ferrie, 2007). However, in recognition of the now-ubiquitous use of virtual servers, a shift has also taken place whereby malware no longer routinely avoids virtual environments (Wueest, 2014).
Network services Isolating malware from a network, or even from the Internet, can limit the behaviour it exhibits (Deng & Mirkovic, 2018). To counter this, the malware should be provided with as many of the services it is likely to rely upon as possible, such as SMTP, HTTP, and DNS. iNetSim (Hungenberg & Eckert, 2016) is a popular choice in this area: Phu et al. (2019) use it to trap DNS queries from malware under analysis; Sikorski and Honig (2012) use it to simulate a broader range of network services; a mixed solution is proposed by Palkmets et al. (2014), who additionally provide a route to the Internet via an onion router network; and a malware clustering technique is offered by Fang et al. (2020), who use iNetSim to help identify the family of malware under analysis.
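One practical precaution, sketched below as an assumption rather than a feature of iNetSim itself, is a pre-flight check run from the analysis guest to confirm that the simulated DNS, HTTP, and SMTP services respond before a sample is detonated, so that missing services do not suppress behaviour unnoticed. The simulator address is a placeholder.

```python
# Hedged sketch: verify simulated network services answer before detonation.

import smtplib
import socket
import urllib.request

SIM_HOST = "10.0.0.1"   # placeholder address of the machine running the network simulator

def services_ready() -> bool:
    try:
        socket.getaddrinfo("example.com", 80)                      # DNS (guest resolver must point at SIM_HOST)
        urllib.request.urlopen(f"http://{SIM_HOST}/", timeout=5)   # HTTP
        smtplib.SMTP(SIM_HOST, 25, timeout=5).quit()               # SMTP
        return True
    except OSError:
        return False

print(services_ready())
```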
Vulnerable environment Alongside network services, vulnerable environments are also key to maximising the behaviour of malware (Szor, 2005). This is echoed by Malin et al. (2008) and by Elisan (2015), who goes further and anecdotally promotes the use of “malware friendly” configurations. These include assigning administrator rights to the default user account, disabling automatic updates, disabling User Account Control (UAC), setting the Internet browser to the minimum security level, installing commonly exploited software, and creating honeypot files with filenames such as “salaries.xls.”
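As a deliberately insecure illustration of two of these measures, run only inside an isolated analysis guest, the sketch below disables UAC via its standard registry setting and drops bait files; whether and how to apply such settings is an assumption drawn from the “malware friendly” advice above, not a prescription of the framework.

```python
# Illustrative sketch (Windows guest only, isolated analysis VM only):
# weaken the guest to encourage malware behaviour.

import winreg
from pathlib import Path

def disable_uac() -> None:
    # EnableLUA = 0 switches off User Account Control (takes effect after reboot).
    key = winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE,
        r"SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System",
        0, winreg.KEY_SET_VALUE)
    winreg.SetValueEx(key, "EnableLUA", 0, winreg.REG_DWORD, 0)
    winreg.CloseKey(key)

def create_honeypot_files() -> None:
    # Bait files with enticing names, as suggested in the configuration advice above.
    for name in ("salaries.xls", "passwords.txt"):
        Path(rf"C:\Users\Public\Documents\{name}").write_text("decoy content")

disable_uac()
create_honeypot_files()
```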
Black box testing Many of the tools used by digital forensic practitioners, including the mainstream forensic software tools, are closed source (Talib, 2018), so the underlying algorithms cannot be inspected or verified (Horsman, 2019b). A black box testing strategy is therefore more viable than a white box testing approach. Furthermore, digital forensic practitioners typically have neither the time nor the skills to review source code (Horsman, 2020).
The above requirements have been included in the framework and are summarised in Table 2.
With the requirements identified, the following section sets out the aims of a framework for testing tools used in malware forensics.