“Natural” Selection and the Evolution of Artificial Intelligence


Part of the proof published by Yining Wang et al. in an article entitled A Theoretical Analysis of NDCG, published in the Journal of Machine Learning Research. NDCG (normalized discounted cumulative gain) is somehow used in the ranking of artificial intelligence systems ….
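The metric itself is less forbidding than the proof. As a rough sketch, NDCG scores a ranked list by rewarding relevant items near the top, then normalizes against the best possible ordering. Here is a minimal illustration with hypothetical relevance scores, using the common log2 discount (formulations vary):

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each item's relevance is
    discounted by log2 of (1-based rank + 1)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    """Normalize DCG by the ideal DCG (same scores sorted best-first),
    so a perfect ranking scores exactly 1.0."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance judgments for one ranked result list
ranking = [3, 2, 3, 0, 1]
print(round(ndcg(ranking), 3))  # → 0.972
```

Intuitively, this ranking loses a little credit because one highly relevant document (the second 3) appears in third place instead of second.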

The concluding keynote session on the third and last day of ALM Media’s LegalWeek, the Experience, How AI Benefits People and Society and Its Legal Challenges, Opportunities, and Ethics, was quite thought-provoking. The panel was excellent: high-energy and very knowledgeable. But – and only because the hour flew by – they never got around to discussing the ethics of it all.


L to R: Zev Eigen, Andrew Arruda, Natalie Pierce, Brian Kuhn, Ray Thomas

For many of us, this is key, and what makes artificial intelligence scary. Who is going to teach these machines ethics, and, frankly, can ethics really be taught to a machine?  We don’t understand equations like the ones above, and we are never going to understand them, so how can we know?

My skepticism comes from my background in evolutionary biology and its lesson that life moves forward on what works today. Apart from (some) humans, life doesn’t use long-term planning much, and ethics per se are certainly lacking. Altruism is rare in the animal kingdom and not fully understood even after decades of study because it doesn’t really make sense: why would I sacrifice myself for your benefit unless there is actually something in it for me after all, in which case it isn’t really altruism?

Humans have only been able to become truly selfless because our intellect allows us to recognize a Greater Good and to derive some form of satisfaction in serving it. That shy smile of gratitude might work for us, being susceptible to warm fuzzies, but I can already hear the descendants of Watson chuckling in their sleek lairs. Machines are not going to learn altruism.

One need only look at health care in this country to see the problem. It is so expensive that health insurance can only work well if we all pay into it throughout the healthy portions of our lives, because only that will build reserves adequate to pay for the care we are likely to require if we’re not lucky enough to drop dead out of the blue. And yet we exploded in rage when paying into it became mandatory. I’d love to meet the algorithm asked to approve the $100,000+ my 88-year-old father’s 3-day hospitalization cost – would the fact that he’s a sweet old guy and not a crotchety old man make a difference? You will not find group health insurance on the plains of the Serengeti.

While of course everyone on the panel acknowledged that AI is in its infancy, they pointed with evident enthusiasm to some of the remarkable successes already achieved in real life by the likes of Deep Blue and Watson. The idea that physicians are already able to call upon the likes of them to treat complex medical conditions is reassuring, until the next-generation supercomputer wakes up one morning and wonders why we are bothering to treat these conditions at all. Uh-oh. But true learning, I would submit, dictates that eventuality.

Fortunately, except in the highest-end laboratories we seem a ways from there. Who hasn’t wondered what Netflix was “thinking” when it recommends movies “because you watched” so-and-so? Are Amazon’s you-might-likes, based on what you’ve looked at, all that on-point? Why do targeted ads know I was researching something to buy, but not that I’ve already bought it, so they should stop wasting someone’s money on continuing to present me with targeted ads? Can enough lines of code be written to tell Flickr, the photo-sharing website that now tries to tag photographs automatically, that red pills in a blister pack are not gourmet tomatoes in a plastic sleeve? And would you really use a self-driving Uber for all of your transportation needs?

Will any software anywhere ever learn to give an older worker with some health issues and a spotty full-time employment record a job, when clearly the better decision would be not to, especially after tapping into his or her various “rewards” cards to see what they eat and what medications they take, etc.? Let’s go back to Uber, perhaps unfairly. Would their algorithm ever learn to care whether it has 1,000 drivers in a city making $10,000 apiece, or 10,000 drivers making $1,000 apiece, when more drivers benefit Uber by reducing wait time? Throw in an ability to check local unemployment rates and the median age of new hires in that town, and is it any surprise that their driver recruitment page features an older – and presumably under-employed – man behind the wheel? Whatever they have running the show is already confident enough to offer drivers a few hundred bucks for new-driver referrals because it has “learned” that existing drivers will increase their own competition for the price of an NFL ticket – now that is King of the Jungle behavior!

But another concern emerges: the day before this session, I attended one in which studies were presented showing that the precision and recall of technology-assisted review (TAR), the famous predictive coding this conference devotes considerable attention to, varied significantly depending on the processing used, which I, like many people, did not fully understand. While the two are not directly comparable in all situations, you can bet that in X number of years it will be shown that not all AI is equally smart, and yet all of it will be cloaked with that aura of invincibility technology so often confers.
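For reference, precision and recall themselves are simple to state, whatever the processing behind them: precision asks how much of what the machine flagged is actually responsive, recall asks how much of what is actually responsive the machine found. A toy sketch, with entirely hypothetical document sets:

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved documents that are relevant.
    Recall: fraction of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical TAR run: 8 documents flagged responsive,
# 10 documents truly responsive, 6 in common
flagged = {1, 2, 3, 4, 5, 6, 7, 8}
truly_responsive = {3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
p, r = precision_recall(flagged, truly_responsive)
print(p, r)  # → 0.75 0.6
```

The studies’ point was that these two numbers can shift materially with back-end processing choices, even on the same document population.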

The final irony may be that an “honest” machine would “know” that AI is not ready for prime time and would prohibit its use for anything beyond picking your next book, movie, or date. But until such honest machines are actually in charge, there is nothing to prevent some humans from deploying AI as if it were up to making life-or-death, or at least quality-of-life, decisions, thereby letting the germ out of the sealed petri dishes. And when AI is in charge, Stephen Hawking thinks we might have much bigger problems; I think Charles Darwin would agree.

Posted in Uncategorized | Leave a comment

Product Shout-Out: Cost Containment

I have avoided mentioning specific products and vendors up until now, but every once in a while, one quietly moves the earth.

Speaking primarily to in-house counsel through its website, this vendor has this to say about a small-to-medium-sized case (less than 50,000 documents):

  • … legal can handle the entire matter end-to-end, without third party involvement.  This reduces costs related to processing, review, outside counsel and professional services.

A similar sentiment is expressed with respect to handling internal investigations and regulatory inquiries.

The company even sponsors an award for those who “serve as an inspiration for all corporate legal teams to successfully bring e-discovery in-house.”

Given that the review costs of electronic discovery still account for the greater part of the budget in many cases, this is not a product corporate legal departments OR law firms should be unaware of.

The company is Zapproved, Inc., out of Portland, OR: (888) 806-6750 | zapproved.com



On Background


The conference was attended by big names from big companies and other major players so all of this is reliable, even without attribution:

“I hone discovery requests numerous times” – to make sure they are worded to accomplish getting what we actually want.

“Throwing $750/hour associates at it does not work anymore.”

In assessing the proportionality of a discovery request, you have to tell the court what you think the worth of the case is, because that’s what the ratio is based upon.

Get the Sedona Tool Kit – you apparently don’t even need to be a member.

Oppenheimer is dead.

BYOD is unsettled, with some panelists saying business messages on them are not discoverable and others saying they are – have a policy about using them AND ENFORCE IT.

There is no discovery about discovery absent a showing of deficiency; it “is not absolutely forbidden,” but a 30(b)(6) is “the stupidest way of doing things.”

The expectation is that the “guessing game of what will and will not be produced” will be eliminated.

Using “any and all” in a discovery request marks you as out-of-step.

Corporations should look for ways NOT to move their data rather than sending it out to law firms; significant cost savings can result.

On-premises data storage is dying out: “this is the time to introduce information governance” – why are you keeping what you are keeping? The “biggest challenge is the amount of data.”

When transferring to the cloud, watch that pesky metadata stuff so that you remain eDiscovery compliant.

The costs saved by using analytics are “replaced by the ridiculous cost of analytics” – reducing the ever-increasing mountains of useless data you keep is the only way to keep these costs under control.  Don’t over-preserve.

A surprising percentage of corporate law departments do not ask their outside counsel for metrics, even though review costs still account for a significant portion of their budget and most law firms generate those reports.  Why not?  Well, there didn’t really seem to be an answer.

E-mail threading remains highly mysterious to non-techies, and can still be easily defeated.

And that was just from one of the three days at ALM Media’s LegalWeek, the Experience; you should plan on going next year because I’m sure they’re already gearing up for 2018.


Predictive Coding: Sauciers & Secret Sauces


L to R: Nathaniel Huber-Fliflet, Dr. Jianping Zhang, Rishi Chhatwal, Robert Keeling

It was almost four years ago that I cautioned that “[w]ith predictive coding, you are obliged to accept documents produced by your opponent selected by an algorithm you do not understand that has been ‘trained’ by your opponent,” a situation that struck me as dicey at best. I just got back from LegalWeek, the Experience (also known as LegalTech 2017) and was not completely surprised to see that this is still the case.

The panel discussion held on February 1, 2017, entitled Predictive Coding: Deconstructing the Secret Sauce, sponsored by Navigant and Sidley Austin, started out with a basic inquiry: okay, we’ve trained the [predictive coding] model, now how well does it do what we want it to do? And there is the rub. According to the panel, all of whom could talk circles around me when it comes to this stuff, processing on the back-end can affect the outcome, sometimes dramatically. But, and paraphrasing them here, lawyers tend not to understand, or want to understand, the back-end, and vendors don’t really want to give away their “secret sauce.”

But in a discussion replete with terms like n-grams, normalized term frequency, inverse document frequency, downsampling, logistic regression, and support vector machines, it was revealed that some pretty major differences in both recall and precision can be seen. As one of the participants said, he wasn’t sure why this happened “because I do not have a Ph.D. in machine learning.” Exactly. Another of the participants did happen to have a Ph.D. in machine learning and was quite eloquent, but I’m afraid it would have taken him about a semester to explain to most of us what was going on.
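For readers without that Ph.D., one of those terms can at least be sketched. TF-IDF (normalized term frequency times inverse document frequency) is a standard way, in some variant, to turn documents into the numeric features a logistic regression or support vector machine is trained on: a term scores high in a document if it is frequent there but rare across the collection. Exact formulas vary by vendor; this toy example, with hypothetical documents, uses one common convention:

```python
import math
from collections import Counter

def tfidf(docs):
    """Return a {term: tf-idf weight} map per document.
    tf  = term count / document length (normalized term frequency)
    idf = ln(N / number of documents containing the term)
    A term appearing in every document gets idf = ln(1) = 0."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts, total = Counter(tokens), len(tokens)
        weights.append({t: (c / total) * math.log(n / doc_freq[t])
                        for t, c in counts.items()})
    return weights

docs = ["merger draft privileged", "merger lunch schedule", "lunch schedule update"]
w = tfidf(docs)
# "privileged" appears in only one of the three documents,
# so it outweighs "merger" (which appears in two) in document 0
```

The “secret sauce” differences the panel described live precisely in choices like these: how terms are counted, normalized, and discounted before the model ever sees them.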

And then when I left the session, the very first major-name vendor I talked to volunteered that his platform used one of the sauces that hadn’t done as well ….


Bits & Pieces: Law Firm Hacks

According to a panel discussion entitled 5 Forces Changing Corporate eDiscovery: What Law Firms Need to Know, hosted by ALM Media Properties, LLC, at LegalWeek, the Experience, in New York City this month, and consisting of Mira Edelman, Sharika de Freitas, and Ari Kaplan, a high percentage of the heads of corporate law department operations have data security concerns. Since these concerns extend to their outside counsel, the point was made that law firms with a Computer Security Officer may have a competitive advantage over those without one. Think of the sensitivity of the information that law firms routinely access in representing a client – is it safe from hackers?

This same panel expressed personal opinions that cloud storage – keeping in mind its own security considerations – is increasingly, and perhaps inevitably, the way to go (this point may have been lost on those too young to remember when actual hard drives chock full of relatively unsecured data used to be shipped around).


Regulators Used to Ring Twice?


What a difference a few months make, at least if they include a presidential election with an unexpected outcome. Coming as it did just a few days after President Trump’s executive order mandating a two-for-one reduction in regulations, one of the panel discussions at LegalTech 2017 – the one entitled Regulators Always Ring Twice: Using Technology in Response to Government Investigations, sponsored by GICLI and EDI – was rather lightly attended, despite a standing-room-only turnout when last presented. And yet the subject matter remained as germane as ever, and perhaps even more so, given current trends.

This is because what the panel had to say about needing to know your own data and to cooperate with the other side is applicable whether you are dealing with a federal agency or appearing before a judge in a federal court.

The panelists gently hammered home the same message: know what your data contains, the form it is in, how you can go about producing it, what it will cost to produce it, and when it can be produced.  You want no surprises.  If as counsel of record, you cannot provide that information to the other side, or to the judge, do not expect things to go smoothly for you or your client.

Arguments against production based on proportionality (i.e., cost) are more persuasive when backed up by the testimony of qualified IT people, with real answers to real problems.  Anyone requesting discovery may or may not know what you have, so it behooves you to figure out what you think they are looking for, where it is, and how it can be produced.  Things “can only go downhill” if you can’t answer questions – “‘we will have to get back to you on that’ is not a good response at a Meet & Confer,” although you better be aware of what your IT person will say before you get there.

As one of the panelists noted, a failure to cooperate in discovery can be indicative of 1) counsel playing hardball, 2) counsel having something to hide, or 3) counsel being clueless, and they all sort of look the same. The more informed the requester is about your technology, the more they may be willing to work with you. It’s like what Judge Peck said in another panel: client concerns over cooperation and transparency can often be re-cast as “what level of billing do you want?”


Changes to the FRCP: “Goal Not Accomplished”


L to R: Paul D. Weiner, Hon. Xavier Rodriguez, Ariana Tadler, Hon. Elizabeth D. LaPorte, Hon. Andrew J. Peck, Patrick Ott

In December of 2015, changes to the Federal Rules of Civil Procedure went into effect, replacing the “reasonably calculated to lead to discoverable information” standard with one of proportionality. A year later, in the keynote session The Effects of the December 2015 Amendments to the FRCP from Three Perspectives (Judges, Defendants, and Plaintiffs), the panelists for the second day of LegalTech 2017 were asked to evaluate how things were going, and the consensus answer was “goal not accomplished.”

Although “some positive direction” was acknowledged, it was telling that when the audience was presented with a hypothetical they could vote on, 43% picked the discontinued standard in support of denying the request – that it wasn’t reasonably calculated to lead to discoverable material (and the percentage went as high as 67% before getting locked in) – even though this is no longer the rule in federal court.

That was pretty much proof positive that despite the considerable efforts made to educate practitioners, more has to be done, with one of the panelists noting that that included some of the judiciary as well. According to this panelist, top lawyers – even Superlawyers – are making this mistake, which is, frankly, “shocking.” Another of the panelists noted that “this is why we are still doing this!” (participating in panel discussions).

And even when proportionality is properly invoked, there can be trouble drawing the line, with some regurgitating the proportionality defense in boilerplate fashion, just as so many of us used to do with the reasonableness standard. The producing party is in the best position to decide how to produce what is to be produced, and it behooves them to be somewhat transparent about this; the courts expect the Meet & Confer to be more than a “drive-by,” with the parties cooperating in keeping discovery requests proportional to the value of the case.

Alternative proposals were also suggested, as in “hey, it costs too much to produce A, B, and C, but we can produce A and B – why don’t you let us give you that to see if you even still need C?” Then, if the parties can’t agree on whether or not C is needed, the producing party will do best in most cases by coming into court prepared to tell the judge how much more it would cost to produce it. Judge Peck is famous for his “bring a geek to court” mantra: that way the Court can have actual information, from someone who knows, as to what is involved in satisfying a request for production.

The message to counsel, in-house and outside, is clear: reasonableness is gone, proportionality rules – get with the program!
