DeNIST is a crucial process during the processing phase of eDiscovery. This article explains exactly why you can’t miss this important step of the eDiscovery process!
DeNIST is the process of removing “known” files from an eDiscovery document review project. NIST stands for National Institute of Standards and Technology, the organization that, among other things, maintains lists of “known files”. Because these files are “known”, they are not user-created, so there is generally no reason to review or produce them in an eDiscovery project.
The process of DeNISTing is based on a list of files maintained by NIST. The list is updated four times a year. When eDiscovery software is used to DeNIST electronic documents, the software compares all files in the eDiscovery dataset against the NIST list. The eDiscovery software can then filter out any matches. Examples of file types removed during DeNISTing are non-user-created files such as executable files, Windows system files, fonts, etc.
The most common way of matching files is by MD5 hash value, which is essentially a digital fingerprint of each file's content. DeNISTing compares each file's hash value against the NIST list, so only files whose content matches a known file are removed. Deduplication, a process the attorneys you work with are likely already familiar with, works in a similar way!
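To make the hash comparison concrete, here is a minimal Python sketch. It is not how any particular eDiscovery product implements DeNISTing; the function names `md5_of_file` and `split_known_files` are hypothetical, and the known-hash list is assumed to be available as a plain collection of MD5 hex strings.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the uppercase hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_known_files(paths, known_hashes):
    """Partition paths into (kept, removed) by membership in known_hashes."""
    # Normalize case so "d41d..." and "D41D..." compare as equal.
    known = {h.strip().upper() for h in known_hashes}
    kept, removed = [], []
    for path in paths:
        if md5_of_file(Path(path)) in known:
            removed.append(path)
        else:
            kept.append(path)
    return kept, removed
```

Because the comparison uses only the hash of the content, a renamed copy of a known system file would still be matched, while a user document that happens to share a file name with a system file would not.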
DeNISTing (and deduplication) is generally very safe, because a file is only removed when its hash value matches an entry on the list. NIST regularly updates the master list of hash values. It is good to be diligent though, so we recommend creating a QC step to ensure no potentially relevant documents are removed from review. Both Relativity and Nuix allow you to do this, as described in our detailed guides below.
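One possible shape for such a QC step, outside of any specific platform, is to summarize the removed files by extension and flag anything that looks user-created. The Python sketch below is only an illustration: the `USER_EXTENSIONS` watch list and the `qc_removed_files` function are hypothetical and should be adapted to your own matter.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical watch list: extensions that usually indicate user-created content.
USER_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".pdf", ".msg", ".pst"}


def qc_removed_files(removed_paths, watch=USER_EXTENSIONS):
    """Return (counts by extension, paths worth a manual look)."""
    # Count removed files per lowercase extension for a quick overview.
    counts = Counter(PurePath(p).suffix.lower() for p in removed_paths)
    # Flag removed files whose extension is on the watch list.
    flagged = [p for p in removed_paths if PurePath(p).suffix.lower() in watch]
    return counts, flagged
```

A flagged file is not necessarily a problem: many applications ship with PDF manuals or sample documents, and those can legitimately appear on the NIST list. The point is simply to give a reviewer a short list to spot-check.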
DeNISTing before review is standard practice and saves time and money because it significantly reduces the number of files, especially in cases with a large number of computer images. It removes “junk” system files and other irrelevant data, and the odds of these files being relevant to a case are incredibly low. This allows law firms to focus solely on user-created data.
Relativity offers a built-in option to DeNIST during Relativity Processing. Instead of the MD5 hash, Relativity uses a similar SHA-1 hash. You can use the following steps to apply DeNIST in Relativity:
Before you can begin DeNISTing, you need to install the DeNIST database within your Invariant Worker Manager server. Contact Support to request the latest NIST package, installer, and instructions.
When creating a processing profile, you can set the DeNIST field to Yes or No. If set to Yes, processing separates and removes files found on the NIST list from the data you plan to process so that they don’t make it into Relativity when you publish a processing set. If set to No, all files found on the NIST list will be published with the processing set.
You can further define DeNIST options by specifying a value for the DeNIST Mode field. There are two options available:
The NIST table in the Invariant database contains the list of SHA-1 hashes for known files that can be DeNISTed during discovery. Files that are DeNISTed out during discovery are cataloged in the DeNIST table within the INV data store, based on a hash comparison against the known files stored in the NIST table of the Invariant database. You can check the DeNIST table to see which files have been removed.
In Nuix, a practical way to apply DeNIST is by using a digest list. The advantage of using a digest list is that you can apply the DeNIST filter at several points during your processing workflow. You can also easily QC the files that get filtered out because a digest list works like any other filter in Nuix.
Unlike Relativity, Nuix does not have a built-in system to DeNIST. Instead, it supports an MD5 hash list in several formats: .hash, .hke, .hsh, and .txt.
NIST has moved away from plain text files like .hash and .txt; instead, it now provides a .db database file. You can download the latest version of the database on the official NIST website.
Follow the instructions on the NIST website to convert the database to a flat-file format for Nuix.
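The NIST instructions remain the authoritative route for this conversion. Purely to illustrate what it involves, the Python sketch below exports MD5 values from a database to a text file with one hash per line. It rests on several assumptions that you should verify against your download: that the .db file is a SQLite database, that it has a table named `FILE` with a column named `md5`, and that a one-hash-per-line text file is a suitable input for your Nuix import. The function name `export_md5_list` is hypothetical.

```python
import sqlite3


def export_md5_list(db_path, out_path, table="FILE", column="md5"):
    """Write distinct non-null values of one column to a text file, one per line.

    The default table and column names are assumptions, not a documented schema.
    Returns the number of lines written.
    """
    # Identifiers cannot be bound as SQL parameters, so accept only simple names.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("table and column must be plain identifiers")
    query = f'SELECT DISTINCT "{column}" FROM "{table}" WHERE "{column}" IS NOT NULL'
    written = 0
    conn = sqlite3.connect(db_path)
    try:
        with open(out_path, "w", encoding="ascii", newline="\n") as out:
            # Iterate over the cursor so rows are streamed rather than loaded at once.
            for (value,) in conn.execute(query):
                out.write(f"{str(value).strip().upper()}\n")
                written += 1
    finally:
        conn.close()
    return written
```

A call such as `export_md5_list("nsrl.db", "nsrl_md5.txt")` (both file names are placeholders) would produce the flat file. If the table or column does not exist in your database, `sqlite3` raises an `OperationalError`, which is a useful signal that the assumed schema does not match.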
Within the Nuix Case, go to “Filtered Items”. Here you will see that “Digest Lists” is one of the options. Simply select the imported digest list to start filtering. You can then create an exclusion or tag to filter out the documents for your case.