The review phase of electronic discovery is usually the most costly, time-consuming, and error-prone. Typical review processes involve numerous people, each rapidly reading through thousands of documents and emails per day. These materials are generally reviewed out of context. The bulk of collected materials are typically irrelevant, so little time can be spent determining the relevance or importance of each item. Big-picture considerations, such as identifying the important people, timelines, and issues, are almost impossible to discern. Reviewers are stuck examining each tree in the forest in random order. Analysis tools and techniques can transform this process so that electronic discovery is conducted more quickly, less expensively, and more accurately.
Identifying Irrelevant Sub-collections
The largest opportunity is to reduce the quantity of materials that need to be reviewed by quickly identifying large sub-collections that are not relevant. As the collection is examined during early case assessment, topics and vocabulary will emerge that appear irrelevant to the case. Examining a small subset of the documents and messages containing such vocabulary, or pertaining to such topics, may quickly convince one that the entire sub-collection of such materials is irrelevant. Additional analysis techniques can help reinforce this conclusion, for example by determining that the subject materials never involve certain key people in the case, or by verifying that the context groups of these items are also devoid of important content.
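As a rough illustration, such checks could be automated along the following lines. This is a minimal sketch only; the vocabulary terms, custodian handles, and record fields are hypothetical placeholders, not features of any particular review platform:

```python
# Sketch: flag a candidate irrelevant sub-collection by vocabulary, then run a
# reinforcing check that no key custodian ever appears in it. All terms,
# names, and field layouts below are hypothetical.

IRRELEVANT_TERMS = {"fantasy league", "cafeteria menu", "newsletter"}  # hypothetical
KEY_PEOPLE = {"jsmith", "akhan"}                                       # hypothetical

def matches_irrelevant_vocabulary(doc: dict) -> bool:
    """True if the document's text contains any flagged term."""
    text = doc["text"].lower()
    return any(term in text for term in IRRELEVANT_TERMS)

def candidate_subcollection(docs: list[dict]) -> list[dict]:
    """Gather all items whose vocabulary marks them as likely irrelevant."""
    return [d for d in docs if matches_irrelevant_vocabulary(d)]

def involves_key_people(docs: list[dict]) -> bool:
    """Reinforcing check: do any key people appear as senders or recipients?"""
    return any({p.lower() for p in d.get("participants", [])} & KEY_PEOPLE
               for d in docs)

# Only treat the sub-collection as presumptively irrelevant if the vocabulary
# match holds AND no key person ever touches these items:
# subset = candidate_subcollection(collection)
# if not involves_key_people(subset):
#     ...  # set the sub-collection aside for sampling and verification
```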
With this knowledge, large irrelevant sub-collections can be eliminated from further consideration and not reviewed at all. This avoids the unnecessary time and cost of reviewing these materials and simplifies the job of reviewers, letting them focus on more relevant items, which reduces their error rate.
One pitfall in this approach concerns the obligation to produce. There are no precedents for using technology to filter materials out of the production set, other than the standard up-front culling techniques (system file exclusion, agreed custodians, date ranges, and possibly keywords). So far, only human reviewers have been accepted as judges of non-responsiveness. However, analysis-based determination of irrelevant sub-collections is still valuable, because you can simply produce the materials that were deemed irrelevant without having people review them. Although this might seem risky at first, an arbitrary degree of confidence in the irrelevance of any sub-collection can be obtained by considering more factors and materials before making the judgment. Overall this is less risky than reviewing everything, because the review team will make fewer mistakes by focusing on a smaller set of relevant items. Additionally, efforts are underway today to establish a peer-reviewed scientific basis for using technology to make responsiveness decisions that could withstand a Daubert challenge.
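To make "an arbitrary degree of confidence" concrete, statistical sampling offers one well-understood path: if a uniform random sample of a sub-collection turns up zero relevant items, an exact binomial bound limits how large the relevance rate can plausibly be. A minimal sketch, assuming uniform sampling and a human relevance judgment on each sampled item; the function names are illustrative:

```python
import random

def sample_relevance(subcollection: list, n: int, is_relevant) -> tuple[int, int]:
    """Draw a uniform random sample and count human-judged relevant hits."""
    sample = random.sample(subcollection, min(n, len(subcollection)))
    return sum(1 for doc in sample if is_relevant(doc)), len(sample)

def relevance_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Exact binomial upper bound on the relevance rate when a sample of n
    items contains zero relevant documents: solve (1 - p)**n = 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)

# Example: a clean 299-item sample supports, at 95% confidence, that fewer
# than about 1% of the sub-collection is relevant. Larger samples, or added
# factors such as custodian and context-group checks, tighten the bound.
print(f"{relevance_upper_bound(299):.4f}")  # ~0.0100
```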
Using Analysis to Get Organized
Even when analysis is not used to reduce the collection to be reviewed, it can still improve the efficiency and effectiveness of the review team by providing the knowledge needed to prepare the team correctly and organize the materials sensibly. Before review starts, a review guide is created. This defines the process to be followed and tells reviewers what to look for, what judgments to make, and how to make them. Up-front knowledge of key issues, people, vocabulary, topics, time frames, and specific important items that comes out of initial case assessment can be used to focus reviewers effectively. Generalizing from the idea of using analysis to identify irrelevant sub-collections, one can identify sub-collections with varying likelihoods of containing relevant materials. The most effective and knowledgeable reviewers can then be assigned to the sub-collections deemed most likely to contain important items, while less important sub-collections can be assigned to less expensive reviewers for a direct cost savings.
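As a simple illustration of routing by estimated likelihood, here is a hypothetical tiering sketch; the thresholds and tier names are placeholders that would be tuned for an actual matter:

```python
def assign_review_tiers(likelihoods: dict[str, float]) -> dict[str, str]:
    """Map each sub-collection to a reviewer tier based on an estimated
    probability (0..1) that it contains important material. The cutoffs
    below are illustrative only."""
    def tier(p: float) -> str:
        if p >= 0.7:
            return "senior reviewers"
        if p >= 0.3:
            return "standard reviewers"
        return "low-cost reviewers"
    return {name: tier(p) for name, p in likelihoods.items()}

print(assign_review_tiers({
    "executive email": 0.85,   # likely to contain hot documents
    "vendor invoices": 0.40,   # mixed relevance
    "HR newsletters": 0.05,    # near-certainly routine
}))
```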
Reviewers may have varying degrees of knowledge about certain areas of the case. (For example, a product liability case might involve materials on specific technical topics that require appropriate specialists to interpret.) Reviewers may also work more efficiently when focusing their attention on specific topics, rather than reading through materials on a wide range of topics in random, interleaved order. Both objectives can be served by using analysis techniques to segregate the collected documents and messages by topic and assigning specific topics to specific reviewers.
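One common way to perform this segregation is unsupervised clustering over document text. The sketch below uses TF-IDF vectors with k-means via scikit-learn, assuming that library is available; the topic count and feature limit are hypothetical tuning choices:

```python
from collections import defaultdict

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def segregate_by_topic(texts: list[str], n_topics: int = 8) -> dict[int, list[int]]:
    """Cluster documents into topical buckets; returns {topic_id: [doc indices]}
    so each bucket can be assigned to one reviewer or specialist."""
    vectors = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(texts)
    labels = KMeans(n_clusters=n_topics, n_init=10, random_state=0).fit_predict(vectors)
    buckets: dict[int, list[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        buckets[int(label)].append(idx)
    return dict(buckets)

# Each topic bucket can then be routed to the reviewer whose expertise matches
# (e.g., an engineering cluster to a technical specialist in a product
# liability case), so reviewers read related materials together rather than
# in random interleaved order.
```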
Source: EDRM (edrm.net)