DeNIST is a crucial process during the processing phase of eDiscovery. This article explains exactly why you can’t miss this important step of the eDiscovery process!
What is DeNIST?
DeNIST is the process of removing “known” files from an eDiscovery document review project. NIST stands for the National Institute of Standards and Technology, the organization that, among other things, maintains lists of “known files” through its National Software Reference Library (NSRL). Because these files are “known”, they are not user-created and are therefore very unlikely to be relevant for production in an eDiscovery project.
How Does DeNIST Work?
The process of DeNISTing is based on a list of files maintained by NIST. The list is updated four times a year. When eDiscovery software is used to DeNIST electronic documents, the software compares all files in the eDiscovery dataset against the NIST list. The eDiscovery software can then filter out any matches. Examples of file types removed during DeNISTing are non-user-created files such as executable files, Windows system files, fonts, etc.
The most common way of matching files is by using the MD5 hash value, which is essentially a digital fingerprint of each file. DeNISTing compares each file's hash value against the NIST list, so a file is only removed when its content exactly matches a known file, regardless of its name or location. Deduplication, a process the attorneys you work with are likely already familiar with, works in a similar way!
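To make the matching step concrete, here is a minimal Python sketch of hash-based filtering. It is not how any particular eDiscovery platform implements DeNIST; the function names `md5_of_file` and `split_known_files` are hypothetical, and the known hashes are assumed to be available as plain hexadecimal strings.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hash of a file as an uppercase hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_known_files(paths, known_hashes):
    """Split paths into (kept, removed) based on a set of known MD5 hashes."""
    # Normalize case so "5d41..." and "5D41..." are treated as the same hash.
    known = {h.strip().upper() for h in known_hashes}
    kept, removed = [], []
    for path in paths:
        if md5_of_file(path) in known:
            removed.append(path)  # content matches a known file
        else:
            kept.append(path)     # not on the list, stays in review
    return kept, removed
```

Because only the content is hashed, a renamed copy of a system file would still be caught, while a user document that merely shares a file name with a system file would be kept.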
Is DeNIST Safe?
DeNISTing (and deduplication) is generally considered safe because files are only removed on an exact hash match, and NIST regularly updates the master list of hash values. It is good to be diligent though, so we recommend creating a QC step to ensure no potentially relevant documents are removed from review. Both Relativity and Nuix allow you to do this, as described in our detailed guides below.
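As a tool-agnostic illustration of what such a QC step could look like, the sketch below summarizes the removed files by extension and flags any that look like user documents. The function name `qc_removed_files` and the watch list of extensions are assumptions chosen for this example, not part of Relativity or Nuix.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical watch list: extensions that usually indicate user-created content.
USER_DOC_EXTENSIONS = {
    ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
    ".pdf", ".msg", ".eml", ".txt",
}


def qc_removed_files(removed_names, watch=USER_DOC_EXTENSIONS):
    """Return (counts per extension, names flagged for manual review)."""
    counts = Counter(PurePath(name).suffix.lower() for name in removed_names)
    flagged = [
        name for name in removed_names
        if PurePath(name).suffix.lower() in watch
    ]
    return counts, flagged
```

A flagged file is not necessarily a problem (many software packages ship with .txt or .pdf files), but a short manual check of that list is a cheap way to confirm the filter behaved as expected.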
Benefits of DeNIST
DeNISTing before review is standard practice and saves time and money because it significantly reduces the number of files, especially in cases with a large number of computer images. The “junk” system files and other known data it removes have incredibly low odds of being relevant to a case. This allows law firms to focus solely on user-created data.
How to apply DeNIST in Relativity:
Relativity offers a built-in option to DeNIST during Relativity Processing. Instead of the MD5 hash, Relativity uses a similar SHA-1 hash. You can use the following steps to apply DeNIST in Relativity:
1. Install DeNIST (Relativity Server)
Before you can begin DeNISTing, you need to install the DeNIST database within your Invariant Worker Manager server. Contact Support to request the latest NIST package, installer, and instructions.
2. Set the DeNIST Option during data processing
When creating a processing profile, you can set the DeNIST field to Yes or No. If set to Yes, processing separates and removes files found on the NIST list from the data you plan to process so that they don’t make it into Relativity when you publish a processing set. If set to No, all files found on the NIST list will be published with the processing set.
3. Additional DeNIST options
You can further define DeNIST options by specifying a value for the DeNIST Mode field. There are two options available:
- DeNIST all files: breaks any parent/child groups and removes any attached files found on the NIST list from your document set.
- Do not break parent/child groups: doesn’t break any parent/child groups, regardless of whether the files are on the NIST list. Any loose NIST files are removed.
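The difference between the two modes can be illustrated with a small Python model. This is only a simplified sketch of the behavior described above, not Relativity's implementation; the `Item` class and the `apply_denist` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    on_nist_list: bool
    parent: Optional[str] = None  # name of the parent item; None for top-level items


def apply_denist(items, break_families):
    """Return the names of items that survive DeNIST under the chosen mode.

    break_families=True  models "DeNIST all files".
    break_families=False models "Do not break parent/child groups".
    """
    parents = {item.parent for item in items if item.parent is not None}
    kept = []
    for item in items:
        # An item belongs to a family if it is a child or has children.
        in_family = item.parent is not None or item.name in parents
        if item.on_nist_list and (break_families or not in_family):
            continue  # removed by DeNIST
        kept.append(item.name)
    return kept
```

For example, given an email with an attached logo that is on the NIST list, `break_families=True` would drop the logo and keep the email, while `break_families=False` would keep both and only remove loose NIST files.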
4. Checking the DeNIST Table after processing is completed (optional)
The NIST table in the Invariant database contains the list of SHA-1 hashes for known files that can be DeNISTed during discovery. Each file's hash is compared against this table, and any file that is DeNISTed out during discovery is cataloged in the DeNIST table within the INV data store. You can check the DeNIST table to see which files have been removed.
How to apply DeNIST in Nuix Workstation
In Nuix, a practical way to apply DeNIST is by using a digest list. The advantage of using a digest list is that you can apply the DeNIST filter at several points during your processing workflow. You can also easily QC the files that get filtered out because a digest list works like any other filter in Nuix.
1. Download the latest Hash list
Unlike Relativity, Nuix does not have a built-in system to DeNIST. Instead, it supports an MD5 hash list in several formats: .hash, .hke, .hsh, and .txt.
NIST has moved away from plain text files like .hash and .txt; instead, it now provides a .db database file. You can download the latest version of the database on the official NIST website.
2. Convert the .db file to a text file
Follow the instructions on the NIST website to convert the database file to a flat-file format that Nuix can import.
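For illustration only, the conversion could look roughly like the Python sketch below, assuming the .db file is a SQLite database. The table name `FILE` and the column name `md5` are assumptions about the database schema, and the function name `export_md5_list` is hypothetical; verify the schema against the NIST documentation, and check the Nuix documentation for the exact text layout it expects, before relying on the output.

```python
import sqlite3


def export_md5_list(db_path, out_path, table="FILE", column="md5"):
    """Write one distinct MD5 hash per line to out_path; return the line count.

    The default table and column names are assumptions about the schema.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Identifiers cannot be bound as SQL parameters, so they are quoted inline.
        query = (
            f'SELECT DISTINCT "{column}" FROM "{table}" '
            f'WHERE "{column}" IS NOT NULL'
        )
        count = 0
        with open(out_path, "w", encoding="ascii") as out:
            for (value,) in conn.execute(query):
                out.write(str(value).strip().upper() + "\n")
                count += 1
    finally:
        conn.close()
    return count
```

The full NIST database is very large, so streaming rows straight to the output file, as done here, avoids loading every hash into memory at once.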
3. Import the hash list as a digest list file in Nuix Workstation
- Within a Nuix Case (any case will work), go to “Global Options” -> “Digest List”
- On the Digest List options page, choose “Import”
- Here you can import the created DeNIST hash list
- Make sure to save to “Local Computer” and not “Case” so you can use the digest list everywhere!
4. Filter the digest list in your Nuix Case
Within the Nuix Case, go to “Filtered Items”. Here you will see that “Digest Lists” is one of the options. Simply select the imported digest list to start filtering. You can then create an exclusion or tag to filter out the documents for your case.