Avansic Whitepaper: 10 Things You Don't Want to See in E-Discovery
04-25-2017, Avansic - Corporate
Whether it's a certain document type, production style, or naming convention, some things make e-discovery life difficult. Keep in mind, though, that these are the exception rather than the rule. Of the hundreds of millions of documents Avansic has loaded to online review, the vast majority fall into only seven file types. So if one of the items below shows up in your case, it may simply require some exception handling, extra time, and potentially extra cost.

DWG or CAD files
These don't fit neatly into the 8.5 x 11 inch page that all of us think a document should be. They contain layers and renderings, and it's difficult to determine how to present them in a data set composed of other documents that do fit on a “page.” This is especially true for CAD drawings with multiple layers.

Isolating just the data needed for presentation can be very helpful. There may be instances where the opposing party has a viewer that can accept CAD files, in which case providing the native version is an easy solution.

Select Email on a Macintosh
In this case, the difficulty is that email headers, messages and attachments are stored separately. Common e-discovery and forensics tools don't understand the relationship between these fractured parts of an email. The solution is to re-create the email from its parts using a custom tool based on the data. Alternatively, if one has the Macintosh device, it can be used to export email into PST or MBox format.
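As a rough illustration of what "re-creating the email from its parts" can mean, the sketch below reassembles a single message with Python's standard `email` package. It assumes the headers, body text, and attachment bytes have already been recovered from the Macintosh mail store; the function name `rebuild_email` and the input shapes are hypothetical, and locating and pairing the parts on disk (the hard part in practice) is not shown.

```python
from email.message import EmailMessage


def rebuild_email(headers, body, attachments):
    """Reassemble one message from separately stored parts.

    headers:     dict of header name -> value (assumed already recovered)
    body:        plain-text message body
    attachments: list of (filename, raw_bytes) tuples
    """
    msg = EmailMessage()
    for name, value in headers.items():
        msg[name] = value
    msg.set_content(body)
    for filename, data in attachments:
        # A generic MIME type is used here; a real tool would detect the type.
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    return msg


if __name__ == "__main__":
    rebuilt = rebuild_email(
        {"From": "a@example.com", "To": "b@example.com", "Subject": "Q3 report"},
        "Please see the attached report.",
        [("report.pdf", b"%PDF-1.4 placeholder")],
    )
    # as_bytes() yields an RFC 5322 message that could be saved as an .eml file.
    print(rebuilt.as_bytes().decode("utf-8", errors="replace")[:200])
```

The reassembled message keeps the header, body, and attachments together as one family, which is what downstream e-discovery tools generally expect.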

Encrypted/Password Protected Files
Without a password, these files aren't accessible. By far the best solution is to obtain the password from the user. There are methods to recover passwords, but they require very significant computing power, time, and cost, and ultimately there is no guarantee of success.

Email Archivers
Each archiver is different and may have different problem areas in terms of e-discovery. For example, some archivers gather mail as it comes into the system and don't record a custodian (for instance, sales@company.com might redirect to whichever employee is in charge of sales at that time). Archivers also often store emails in a manner that makes the body easy to search but not the attachments. In that case, the entire archive contents must be exported in order to perform proper searches. Here, the best solution is to work with the archive vendor to determine methods for extraction.

Poor Packaging
A loose hard drive in a cardboard box is not an effective way to ensure digital media will arrive at its destination intact. Before you ship, discuss proper shipping guidelines with the party receiving the materials.

Cloud Data That Isn't E-Discovery Ready
More cloud providers are popping up, but they are not all equal. Most make it very easy to upload your data but don't necessarily have equally robust procedures for exporting it when you need it. It's common for a mass export to take several times longer than the upload did. In addition, the associated metadata may be missing or inaccurate. One of the most common email providers makes it difficult to download hosted email in a form that is ready to be processed in e-discovery. Evaluate each provider at the earliest opportunity to determine how long collection will take and whether special handling or post-processing is needed.

Files Within Files (Complex File Types)
The most common example is a PST or OST file that contains emails that themselves have a PST attached. Most e-discovery tools don't track the hierarchical relationship of attachments, which makes it difficult to understand the context of an item in a given family. Proper searching of any file container requires expanding these files beforehand, which frequently results in many more documents than originally anticipated.
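The idea of expanding containers while preserving each item's parent can be sketched in a few lines. Parsing PST or OST files requires specialized libraries, so this example uses nested ZIP archives from Python's standard library purely as a stand-in; the function name `expand_container` and the path scheme are hypothetical.

```python
import io
import zipfile


def expand_container(data, path="root.zip"):
    """Return (item_path, parent_path) pairs, expanding nested ZIP archives.

    ZIP is only a stand-in for containers such as PST; the point is that
    every extracted item keeps a reference to the container it came from.
    """
    items = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            child_path = f"{path}/{info.filename}"
            child_data = archive.read(info)
            items.append((child_path, path))
            # Recurse when the child is itself a container. Note that
            # ZIP-based formats (for example .docx) are expanded as well.
            if zipfile.is_zipfile(io.BytesIO(child_data)):
                items.extend(expand_container(child_data, child_path))
    return items
```

Even in this toy form, the item count grows with every nested level, which is why expansion tends to produce more documents than the top-level count suggests.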

Non-Searchable Types That Contain Text Within ESI
One example is a scanned document that is attached to an email within a PST. Searching the email for content within that document wouldn't necessarily locate the text in the scanned document. As with container files, proper searching requires these items to be expanded and their individual parts to be processed as well, including imaging and OCR for items without extractable text.

Legacy and Uncommon Email Formats
This includes email in older or uncommon formats or programs such as older MBox, GroupWise, and Lotus Notes NSF files. Often these types require a large amount of additional pre-processing to reach what is considered the bare minimum for other formats. The most common solution is to convert them to the EML format, which in many cases loses valuable information, including format-specific metadata; once converted, the result may still not look like regular email. A better option is to find an e-discovery vendor that has experience with the format or has a development shop that can decode it.

Legacy Hardware
Challenges include poor connection speeds, hardware types no longer supported by modern operating systems, and drives that may not spin up. A vendor with experience getting information off those devices is needed, as well as an understanding that the time frame may be longer than expected.

Each of these issues requires some expertise to address. It is worth the time to talk with your e-discovery vendor and determine their level of familiarity with the data type before sending the work. Most of these issues will require additional work, which can often be time consuming and expensive. Ask the vendor if they know of ways to get a similar result without processing the most difficult types directly.