Friday, November 22, 2013

Comment on early EDI - Oracle Study results

The Electronic Discovery Institute yesterday released, via Law Technology News, some preliminary results of its study on a dataset provided by Oracle related to its acquisition of Sun Microsystems. The study involved multiple providers of technology-assisted review, including Backstop. Participants categorized documents for three tags: responsive, privilege, and hot. EDI's release is the first step towards what should eventually be ground-breaking raw data and analysis. Although the release is skeletal (we have only ordinal F1 rankings thus far), it affords a basis for some thoughts, including very imperfect cost-adjusted performance measures. Interestingly, the results show no correlation between cost and accuracy ranking. Backstop and the other study participants are forbidden to identify their own entries, and EDI can tell a vendor only which results are the vendor's own. So, while we are very pleased with our results, we cannot identify them or those of any other participant. In this post I will share some thoughts on difficulties with F1 as a benchmark for accuracy, then delve into a first attempt at a cost-adjusted performance spreadsheet, which you can sort, view, edit and download.
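For readers unfamiliar with the measure, F1 is the harmonic mean of precision and recall. The minimal Python sketch below illustrates one commonly noted limitation: two reviews with very different precision and recall can earn the same F1. The function name and the figures are hypothetical, chosen only for illustration, and are not drawn from the EDI results.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical figures, not taken from the study.
# Review A finds most relevant documents but returns many false positives.
f1_high_recall = f1_score(precision=0.50, recall=0.90)

# Review B returns few false positives but misses half the relevant documents.
f1_high_precision = f1_score(precision=0.90, recall=0.50)

# F1 is symmetric in its two inputs, so it cannot tell these reviews apart.
print(round(f1_high_recall, 4), round(f1_high_precision, 4))
```

Because F1 weights precision and recall equally, a single score (and still more an ordinal ranking of scores) does not reveal which of the two a given review favored.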

Wednesday, November 20, 2013

EDI - Oracle Study Preliminary Results Released

Preliminary results have been released from the Electronic Discovery Institute - Oracle study, in which Backstop participated.  See an article at Law Technology News and the chart below.  We are very pleased to see these results and will soon share a preliminary analysis on this blog.  We also look forward to seeing more granular detail (viz., recall and precision figures) in the near future.


Thursday, May 23, 2013

Podcast on the In re Biomet decision

A couple of weeks ago, I discussed here the dubious mathematics underlying the court's approval of pre-predictive coding keyword searches in In re Biomet.  This morning I discussed the case with other e-discovery professionals on an ESI Bytes podcast.

Wednesday, May 8, 2013

Federal court approves pre-predictive coding keyword filtration based on faulty math in In re Biomet

A district court’s recent approval of keyword filtration prior to the use of predictive coding in In re Biomet, No. 3:12-MD-2391 (N.D. Ind. April 18, 2013) rests on bad math and could deprive the requesting party of over 80% of the relevant documents. Specifically, the court ruled that a defendant’s use of predictive coding on a keyword-culled dataset met its discovery obligations because only a “modest” number of documents would be excluded. But a proper analysis of the statistical sampling on which the court relied shows that defendant’s keyword filtration would deprive plaintiffs of a substantial proportion of the relevant documents. The error in the court’s finding regarding the completeness of defendant’s production underpinned, and therefore undermines, its additional holding that requiring the defendant to employ predictive coding on the full dataset would offend Rule 26(b)(2)(C) proportionality. Accordingly, the early chorus of praise that has greeted the decision is unwarranted.
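The general arithmetic behind this kind of argument can be sketched as follows. This is a rough illustration only: the function name, its parameters, and every number in it are hypothetical and are not the figures from the Biomet record. It shows how a seemingly low rate of relevant documents in the excluded set can still add up to most of the relevant documents when the excluded set is much larger than the keyword-hit set.

```python
def share_of_relevant_excluded(hit_docs, hit_prevalence,
                               null_docs, null_prevalence):
    """Estimate the fraction of all relevant documents lost to keyword culling.

    hit_docs / null_docs: sizes of the keyword-hit and excluded (null) sets.
    hit_prevalence / null_prevalence: sampled rates of relevant documents
    in each set. All inputs here are illustrative assumptions.
    """
    relevant_kept = hit_docs * hit_prevalence
    relevant_excluded = null_docs * null_prevalence
    total_relevant = relevant_kept + relevant_excluded
    if total_relevant == 0:
        return 0.0
    return relevant_excluded / total_relevant


# Hypothetical collection: keywords hit 100,000 documents, of which 10% are
# relevant; the 900,000 excluded documents are only 2% relevant.
lost = share_of_relevant_excluded(
    hit_docs=100_000, hit_prevalence=0.10,
    null_docs=900_000, null_prevalence=0.02,
)
print(f"{lost:.0%} of relevant documents fall outside the keyword hits")
```

The point of the sketch is that the prevalence in the excluded set must be weighed against the size of that set before the number of omitted relevant documents is called modest.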

Friday, April 26, 2013

Good luck to FIRST Robotics Team 116!

Backstop is a proud sponsor of Herndon High School FIRST Robotics Team 116, currently competing at the national robotics championships in St. Louis.  This year's competition calls for teams to build robots that can scoop up frisbees and shoot them into goals.  View the team website or follow the national tournament.  We wish Team 116 much success.