Threading the Clinton Emails

If one reads the FBI statement on their investigation of the Clinton e-mail controversy (a copy can be found here), it seems abundantly clear that they did much more than turn to some off-the-shelf software for predictive coding or email threading.

Although it appears her attorneys used traditional key words and subject headings to thread the emails they produced, they seem to have missed some documents that the FBI considered sensitive and that open Clinton up to continued political attack, despite the recommendation not to pursue charges. To find these, the FBI almost undoubtedly used targeted search methodologies (TSM) that go above and beyond technology-assisted review (TAR).

Those thousands of hours spent must have included careful manual searches within specified time frames, for expected replies, and in follow-up to cryptic references. These require human eyeballs at this time, with a key doc tracker database at hand. Some pain cannot be completely avoided by any AI software I am aware of.

Posted in Uncategorized | Leave a comment

“… should be HWP ….”

While proportionality seems to be significant in personal ads, ALM’s Legaltech NY 2016 left me thinking it wasn’t all that important in electronic discovery, despite the much-ballyhooed change in the Federal Rules introducing the concept. In the first place, as pointed out in one of the sessions, proportionality as a concept has always been implicit in the rules, and parties have always been free to come to court with a complaint about overly burdensome discovery requests. In the second place, recent changes in information governance have dramatically changed the field.

If, like me, you have previously shied away from the term information governance, you cannot any longer because an entity that does not practice information governance will have a hard time making a proportionality argument. If you keep everything you’ve ever done, it’s not your opponent’s fault it might cost millions to go through it all, especially now that there is a safe harbor that will protect you from most storms.

Rule 37 understands that corporations should be encouraged to purge unnecessary information and limits sanctions accordingly. Routine deletions up until a notice of impending litigation is received are protected as long as they have been in fact routine. Even after notice of impending litigation is, or should have been, received or acknowledged, the rule now requires spoliation with an intent to deprive in order for the most severe (‘case-ending’) sanctions to be appropriate. The courts henceforth will look to see if the information should have been preserved in anticipation of litigation, if reasonable steps were taken to preserve it, and if it can be restored or replaced. As was noted, establishing an intent to deprive is a high hurdle, with at least one commentator noting that the rules “went out of their way” to avoid establishing exactly who bears that burden.

By the same token, information governance should involve the input of the eDiscovery team, just to make sure the safe harbor has in fact been sailed into, and must take into account the ever-changing methods by which we communicate. So, if you’re up on eDiscovery, but not IG, it would appear you’re only halfway there.


“I’m going to bribe the authorities ….”

Dancing with the Stars of eDiscovery:

Case Studies in eDiscovery Powered by Analytics

Legaltech NY 2016 – February 4, 2016


Sponsored by Content Analyst Company

Scheduled panelists included: Ari Kaplan, Jacob Cross, Iram Arras, Michelle Drucker, Drew Lewis, Hunter McMahon, Mike Schubert, Alison Silverstein and Mark G. Walker

Not surprisingly, given that it was Legaltech, the case studies presented by this panel all noted that technology-assisted review (TAR) was critical to both cutting costs and increasing efficiency, but they made a good case, with some very big numbers being thrown around. One project would have spent $1.4M on the linear review of 450,000 documents that TAR classified as not relevant, and did spend $90,000 reviewing documents that TAR would have excluded. Citing a case involving 2M documents, wherein they were able to make a first production in 10 days and all responsives in 6 weeks, the woman from kCura’s Relativity put it this way: “cost savings and speed are givens now.”

But another point was made with respect to what I consider to be much more important: selectivity. A participant described a project in which documents were reviewed at the rate of 200 – 250 per day per reviewer, which he said was “not bad,” although it involved a lot of “hand to hand combat” with the data and was thus too labor intensive (meaning that the reviewing attorneys overwhelmed the supervising attorneys with questions and close calls on the relevance of minor documents). He added that using TAR would not only save the client time and money, but allow the litigation attorneys to find and concentrate on the KEY DOCUMENTS – the only ones that are going to make a difference in the case. Yes, finding the responsives is important, but finding the key docs is critical.

Another participant noted that if you don’t use analytics, the recipient of your production is increasingly likely to, “and will find things you don’t.” A very persuasive example was given in a case thought to involve bribes. As was pointed out, no one writes in an email, “I’m going to bribe the authorities so that we can obtain a monopoly and rig prices,” so a tough search was anticipated. But after someone thought to teach the machine the definition of bribe, the machine came back with a fistful of birthday notices. With how popular birthdays are in office culture, a simple keyword search quite possibly would have culled these all out, but it turned out birthday was the code word for the illegal conduct they were looking for (the comment was made that “we are past the point of throwing search terms against the wall to see what sticks”).

With concept clustering, it is possible to handcraft an example of the documents you want to find in your data, as well as examples of documents you hope you won’t find, feed them to the machine, and have it look for actual documents reflecting your hopes and fears, quicker and more accurately than can be done with linear review. At the very least, clustering can help identify not only the key custodians who should receive high priority, but the people those custodians have been talking to – with or without the knowledge of the company – about the subject matter of the litigation. One of the concluding remarks, and a common theme, was that we are “woefully undermanned with people who understand the process,” and that it is these people who will get the business going forward.

Time to get aboard, if you’re not already.

Richard Neidinger, J.D.


The Analytics Life Ring


FTI Consulting sponsored a session at the ALM Legaltech NY 2016 trade show this month entitled Using Analytics in E-Discovery; Swim Instead of Sink in the Era of Big Data, with panelists Nia Castelly of Google, Sandra Rampersaud of Cravath, and Jessica Ross of Deutsche Bank, moderated by Kathryn McCarthy of FTI (each putting forth personal views not necessarily those of their employers, as was the case with almost all of the speakers).

It was a fast-paced and very enlightening discussion which, like many of the sessions here, was so jam-packed with useful information it was tough to take notes. Fortunately, FTI did something sort of unique in bringing in Kelly Kingman, a graphic recorder at Kingman Ink, who created the chart I have reproduced (with permission) below.


On the principle that a picture is worth a thousand words, I will leave my summary at that. Ms. Kingman can be reached at:

graphic recording & visual notes

The Art of Predictive Coding


Emily Cobb, Jason R. Baron, Ralph Losey, & Jim Sullivan facing a full house

I was very interested in attending the ALM Legaltech NY 2016 session entitled The Science of Predictive Coding because I had written about the “black box” aspect of this to most litigators some time ago and I wanted to see if it still required a degree in statistics to understand. The answer was, sort of.

The panelists included Ralph Losey, famous for his e-discovery blog, Jim Sullivan, of Kroll Ontrack, and Jason R. Baron, the former director of Litigation, National Archives and Records Administration, now in private practice. These highly-qualified and excellent speakers certainly knew their stuff as they discussed the efforts made to ensure the accuracy of predictive coding platforms using established data sets like that from the Enron case.

But there was a frankly telling moment when Mr. Sullivan, describing his performance in a relevance test, said that he got only two of ten documents right while the machine got eight of the ten right. What was interesting about his observation was that he indicated that visually examining the documents revealed little or no difference between them that he could discern. Well, if a human can’t tell the difference between the documents, do you need the machine to? Clearly a jury or judge wouldn’t be persuaded by one, but not the other, so what exactly was the meaning of this outcome? More than one speaker at the trade show referred to technology as being full of “bright shiny objects” that need to be used well.

This is not a criticism of the process, and certainly not of Kroll, but practicing litigators do need to take into consideration the question of how many angels can dance on the head of a pin. The most important consideration is ALWAYS can you find the hot docs, the documents you need to make or break the case?



How Strong is that (Email) Thread You’re Using?

Email threading, a means of gathering related emails together for easier and more consistent review than doing so piecemeal, is a fairly commonplace feature of many document review platforms. At ALM’s Legaltech NY 2016, I asked a vendor about a concern I have always had about the practice and was pleased/distressed to have him confirm my suspicions. While it is no big deal to collect all of the emails in an actual chain – those responded to with the reply button, or forwarded to others – there IS a problem in finding related emails that are not part of the chain. Imagine A and B engage in a lengthy exchange about something adverse to their employer’s interest – a defective product, a dangerous condition, a material misrepresentation, etc.; many tools will have no trouble gathering those up and putting a bow on them for the reviewer IF, but only if, the reply button was used. If new emails are generated, they might not be picked up by threading tools. If A and B are arguing and their supervisor, C, having been forwarded the chain, writes an entirely new email telling them who he or she agrees with, a threading tool most likely will not pick that up, even though it might be the most relevant email on the topic, because it is not part of the chain. Similarly, if C writes a second email to D summarizing the situation, but without forwarding the chain, that email will also escape detection. Smart bad guys know this.

While it is true that concept clusters might pick up C’s emails, or a timeline generator might reveal them during a key time, those tools must be used in addition to the threader, or your smoking gun/needle in the haystack might not be recognized for what it is.
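To make the gap concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular review platform works: the `Email` class, `build_threads`, and `nearby_unthreaded` are hypothetical names, threading is reduced to following a single reply link (real tools typically also look at other headers, subject lines, and quoted text), and the "timeline" supplement is simply a participant-plus-time-window filter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class Email:
    """Hypothetical, simplified email record for illustration only."""
    msg_id: str
    sender: str
    recipients: Tuple[str, ...]
    sent: datetime
    in_reply_to: Optional[str] = None  # set only when Reply/Forward was used


def thread_root(email, by_id):
    """Follow reply links back to the first message in the chain."""
    seen = set()
    while email.in_reply_to in by_id and email.msg_id not in seen:
        seen.add(email.msg_id)
        email = by_id[email.in_reply_to]
    return email.msg_id


def build_threads(emails):
    """Group message ids by the root of their reply chain."""
    by_id = {e.msg_id: e for e in emails}
    threads = {}
    for e in emails:
        threads.setdefault(thread_root(e, by_id), []).append(e.msg_id)
    return threads


def nearby_unthreaded(emails, thread_ids, window):
    """Flag emails outside the thread that involve a thread participant
    and were sent within `window` of the thread's time span."""
    by_id = {e.msg_id: e for e in emails}
    members = [by_id[i] for i in thread_ids]
    people = set()
    for m in members:
        people.add(m.sender)
        people.update(m.recipients)
    start = min(m.sent for m in members) - window
    end = max(m.sent for m in members) + window
    return [
        e.msg_id
        for e in emails
        if e.msg_id not in thread_ids
        and start <= e.sent <= end
        and ({e.sender, *e.recipients} & people)
    ]


t0 = datetime(2016, 2, 1, 9, 0)
emails = [
    Email("m1", "A", ("B",), t0),
    Email("m2", "B", ("A",), t0 + timedelta(hours=1), "m1"),
    Email("m3", "A", ("B", "C"), t0 + timedelta(hours=2), "m2"),  # chain forwarded to C
    Email("m4", "C", ("A", "B"), t0 + timedelta(hours=3)),  # brand-new email from C
    Email("m5", "C", ("D",), t0 + timedelta(hours=4)),  # C's summary to D
    Email("m6", "E", ("F",), t0 + timedelta(hours=2)),  # unrelated traffic
]

threads = build_threads(emails)
flagged = nearby_unthreaded(emails, threads["m1"], timedelta(hours=6))
print(threads["m1"])  # the A/B chain: m1, m2, m3
print(flagged)        # C's two fresh emails: m4, m5
```

The threader groups only m1 through m3; C's two new emails sit in threads of their own. The second pass surfaces them for a human to look at, but it cannot say whether they matter. That judgment still belongs to the reviewer.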

Artificial intelligence and algorithms are wonderful, but there is still no substitute for the inquisitive human. A shout-out goes to Cavo eD, makers of an interesting, full-feature eDiscovery product, for their candor here.


Replacing the “Old Ways” in Document Review

Slashing E-Discovery Costs:

Innovative Approaches and New Alternative Fee Arrangements

Panelists: Anthony Lowe, Christine Hasiotis, Farrah Pepper, Brian Chebli, Greg Witczak

ALM Legaltech NY 2016 – 2/3/16

This panel was made up of in-house eDiscovery experts from several major institutions and they were quite frank about their concern over the costs of eDiscovery (please note, however, that the views expressed were their own, and not of their respective institutions). Some random remarks included:

Controlling the volume of electronically stored information (ESI) a corporation maintains is increasingly important and requires good information governance procedures. The amended FRCP makes this easier because it introduces a requirement of “intent to deprive” your opponent of information you have not kept. Defensible deletions will reduce document review costs out of the gate simply because you will not have as many documents to review under any scenario. Sixty to ninety percent of the documents in the review pipeline are only minimally relevant at best.

With as much as 89% of costs attributable to document review, some corporations have begun keeping databases on outside counsel regarding their performance, tracking things such as documents reviewed, speed of review, per-document cost, motions practice charges, etc. While the new requirement of proportionality is not thought to be of that much help to the corporations with respect to the eDiscovery itself, it appears corporations are beginning to use it in the selection of outside counsel.

“Quite frankly, you don’t want your first pass in the hands of outside counsel” – ouch!

Second-level review is seen as a hot button that can be “code” for throwing in some associates who don’t know what they are looking for or how to find it – ouch, again.

Look into creating data sets if your business is involved in repeated litigation so that you don’t have to have the same data reviewed over and over again. These sets can be augmented as needed, but certain data will be involved in many instances.

Alternative fee arrangements, based upon total project cost, fixed fees and performance guarantees, are replacing “the ‘old way’” (“per-gigabyte pricing, hourly rates, and unrealistic estimates”).

The take-away here was that the major corporations are expecting more of their outside counsel in terms of efficiency and are not afraid to challenge them, or simply hire more responsive firms.




Posted in ediscovery, FRCP, linear review, Proportionality | Leave a comment