Category Archives: machine learning

Opinion | A.I. Is Harder Than You Think – The New York Times

“Once upon a time, before the fashionable rise of machine learning and “big data,” A.I. researchers tried to understand how complex knowledge could be encoded and processed in computers. This project, known as knowledge engineering, aimed not to create programs that would detect statistical patterns in huge data sets but to formalize, in a system of rules, the fundamental elements of human understanding, so that those rules could be applied in computer programs. Rather than merely imitating the results of our thinking, machines would actually share some of our core cognitive abilities. That job proved difficult and was never finished. But “difficult and unfinished” doesn’t mean misguided. A.I. researchers need to return to that project sooner rather than later, ideally enlisting the help of cognitive psychologists who study the question of how human cognition manages to be endlessly flexible. Today’s dominant approach to A.I. has not worked out. Yes, some remarkable applications have been built from it, including Google Translate and Google Duplex. But the limitations of these applications as a form of intelligence should be a wake-up call. If machine learning and big data can’t get us any further than a restaurant reservation, even in the hands of the world’s most capable A.I. company, it is time to reconsider that strategy.”

 

This small article is a good read, but only because it is an example of faulty reasoning about AI. Yes, it is true that AI today is nowhere near allowing a free-ranging discussion with machines on philosophy or the arts. But it is also true that the technology is pretty good at many important if limited tasks, such as asking a voice assistant to play your favorite songs in your studio when your hands are covered with paint, or using a simple voice command to get directions when you are stuck in traffic in a strange city with rambunctious kids in the back of the car. These are things that I really need, and I deeply appreciate that they work! I am not particularly distressed that I am not able to discuss Plato or Kafka with Cortana.

As the authors themselves admit, while it had grand aims, the “knowledge engineering” project of the 70s failed spectacularly, while the statistical learning approach with modest aims actually delivered quite concrete results. And the phrase “Rather than merely imitating the results of our thinking …” almost sounds elitist and snobbish! If we have learnt one thing about learning, whether it is for gaining artificial or natural intelligence, it is that imitation is one of its biggest components. And I am not just talking about spoken discourse; I am talking about the arts too – painting, music, sculpture. These are all aspects of intelligence, and the ability to create and appreciate fine art is deeply contingent on imitation and training – of the artist as well as the art connoisseur.

 

So the statement “Today’s dominant approach to A.I. has not worked out” is just factually incorrect. By every measure we are making great and rapid progress. If cognitive psychologists want to join the party, by all means come in. But know that the dominant art in the field is statistical learning, and it is far from finished delivering yet. The field is open to new ideas, but there is no reason to throw out what is working well. I can guarantee you AI will be doing more than making restaurant reservations. In fact it already is, if you would take the trouble to find out.

A purist approach to intelligence is exactly what got AI into trouble in the 70s. Let’s not repeat that mistake again!

The Business of Artificial Intelligence – from Harvard Business Review

The Business of artificial intelligence

This is a very lucid overview article on the state of AI – really a must-read, especially for technology program managers.

Although it is hard to predict exactly which companies will dominate in the new environment, a general principle is clear: The most nimble and adaptable companies and executives will thrive. Organizations that can rapidly sense and respond to opportunities will seize the advantage in the AI-enabled landscape. So the successful strategy is to be willing to experiment and learn quickly. If managers aren’t ramping up experiments in the area of machine learning, they aren’t doing their job. Over the next decade, AI won’t replace managers, but managers who use AI will replace those who don’t.

Can A.I. Be Taught to Explain Itself? – from NY Times

Can A.I. be Taught to Explain Itself?

This is a nice read and addresses an important problem in AI/Machine Learning today. It goes to the heart of the question “Can we trust something we do not understand?”

The article does not offer any solutions, but it at least poses the problem and provides some context – which is always a good place to start!

Some would say that “explaining” a decision is nothing more than making the audience sufficiently comfortable with the decision, as opposed to really making sure they “understand” it in the sense of understanding the proof of a theorem. And one way of producing the most effective “explanation” in this sense would be to use even more AI, which is a dangerously circular argument!

Others would say that certain things are inherently not explainable in simple terms (due to their intrinsic complexity). Charlatans always exploit people’s need for understanding by peddling grossly simplistic explanations that brush all complexity under the carpet. Politicians are especially good at this!

Nevertheless, there are certain classes of AI applications where a reasonable explanation is possible in principle, such as predicting a person’s credit rating or predicting fraud or predicting cancers.

My personal point of view so far (and it may change in the future) is that the best explanation is often given by providing clusters of exemplars (or “neighbors”) in latent space. This is especially easy to see with deep neural nets. As the network learns to classify, it also learns to cluster in the latent space (especially in the deepest hidden layers). So when classifying a data point, it will “squeeze” it through one of these clusters in latent space. Members of a cluster are by definition similar in some sense. We can ask manual reviewers to examine and identify in what sense – i.e. examine the clusters in latent space and attach free-form explanations to those clusters. Ideally the labeling should be done by the very same people who are demanding the explanation. So, if regulators want an explanation of credit scores, they should be hired to label the clusters with the type of explanation that best suits their purpose. After all, an explanation is good enough only if it satisfies the requirements of privacy laws like GDPR.
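To make the idea concrete, here is a minimal sketch of exemplar-based explanation, assuming a trained Keras classifier, a deep hidden layer whose name we know, and a list of free-form notes that reviewers have attached to each training exemplar (or to its latent-space cluster). The library choice, the layer name, and every variable here are illustrative assumptions on my part, not anything prescribed by the article.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from tensorflow import keras

def latent_exemplar_explainer(model, layer_name, X_train, reviewer_notes, k=5):
    """Build an explainer that, for a new input, returns the notes reviewers
    attached to its k nearest training exemplars in the latent space of a
    deep hidden layer. All arguments are supplied by the caller; nothing
    here is specific to any particular system."""
    # Sub-model that maps an input to its latent representation.
    encoder = keras.Model(inputs=model.input,
                          outputs=model.get_layer(layer_name).output)
    train_latent = encoder.predict(X_train, verbose=0)
    index = NearestNeighbors(n_neighbors=k).fit(train_latent)

    def explain(x):
        z = encoder.predict(x[np.newaxis, ...], verbose=0)
        _, idx = index.kneighbors(z)
        return [reviewer_notes[i] for i in idx[0]]

    return explain
```

The “explanation” delivered to the end user is then simply: here are the training cases your case most resembles in latent space, and here is what the human reviewers said about that group.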

This is only one approach (one which I like, for the record). But there are many other approaches and the jury is still out.

Here is an excerpt from the article:

It has become commonplace to hear that machines, armed with machine learning, can outperform humans at decidedly human tasks, from playing Go to playing “Jeopardy!” We assume that is because computers simply have more data-crunching power than our soggy three-pound brains. Kosinski’s results suggested something stranger: that artificial intelligences often excel by developing whole new ways of seeing, or even thinking, that are inscrutable to us. It’s a more profound version of what’s often called the “black box” problem — the inability to discern exactly what machines are doing when they’re teaching themselves novel skills — and it has become a central concern in artificial-intelligence research. In many arenas, A.I. methods have advanced with startling speed; deep neural networks can now detect certain kinds of cancer as accurately as a human. But human doctors still have to make the decisions — and they won’t trust an A.I. unless it can explain itself.

This isn’t merely a theoretical concern. In 2018, the European Union will begin enforcing a law requiring that any decision made by a machine be readily explainable, on penalty of fines that could cost companies like Google and Facebook billions of dollars. The law was written to be powerful and broad and fails to define what constitutes a satisfying explanation or how exactly those explanations are to be reached. It represents a rare case in which a law has managed to leap into a future that academics and tech companies are just beginning to devote concentrated effort to understanding. As researchers at Oxford dryly noted, the law “could require a complete overhaul of standard and widely used algorithmic techniques” — techniques already permeating our everyday lives.

AI image recognition fooled by single pixel change

AI image recognition fooled by single pixel change


The researchers found that changing one pixel in about 74% of the test images made the neural nets wrongly label what they saw. Some errors were near misses, such as a cat being mistaken for a dog, but others, including labelling a stealth bomber a dog, were far wider of the mark.

“More and more real-world systems are starting to incorporate neural networks, and it’s a big concern that these systems may be possible to subvert or attack using adversarial examples,” he told the BBC.

While there had been no examples of malicious attacks in real life, he said, the fact that these supposedly smart systems can be fooled so easily was worrying. Web giants including Facebook, Amazon and Google are all known to be investigating ways to resist adversarial exploitation.


This article is reporting an interesting study, but it also does an injustice to the core issue. It is far too simplistic to say this is a problem of deep learning or neural networks. And it is also untrue to say that the ML community does not know how to tackle adversarial attacks.

So let me give a little commentary to clarify. This may get a little technical, so please bear with me.

There are two distinct problems that need to be solved separately:

  1. Building a classifier that is robust to natural perturbations (i.e. naturally occurring likely variants). Here nobody is conspiring to fool the classifier. Nature is just throwing up data points for decision making based on its underlying distribution.
  2. Building a classifier that is robust to adversarial perturbations. Here an adversary is actively trying to fool the classifier.

Sensitivity to natural perturbations is a vulnerability that stems from over-fitting. This happens when you train a model with a very large number of parameters but with insufficient training data, so that the model fits/explains every training point “just so” but does not have the power to generalize well. An over-fitted model has a very chaotic (highly twisted and undulating) boundary (the fancy name is “manifold”) in the feature space on which the data is distributed. In particular, a lot of probability mass (a lot of data examples) lies close to the class boundary. The boundary weaves and zig-zags wildly through the data, trying to get the best classification accuracy on the known examples. To describe such a wild surface you need a lot of parameters; you cannot describe a complex irregular surface with a simple model, as you may know from school geometry. An over-fitted model thus has two weaknesses – poor generalization and vulnerability to perturbations – which are really two sides of the same coin.

But we do know a solution: build robust, well-fitted models, which often means giving up a little accuracy to gain robustness. Of course, any model will always face a challenge when classifying “on the boundary” of classes, a region which is by definition ambiguous and where a minor perturbation will definitely tip the decision. However, the trick to designing a robust classifier is to make sure that the decision boundary is sufficiently far away from the heavy probability mass, so that all naturally likely examples (variants that occur in nature but are not generated by an adversary) are in the deep interior of the class regions.
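As a toy illustration of this trade-off (my own sketch, not something from the article), the snippet below trains an unconstrained decision tree and a depth-limited one on a noisy synthetic problem, then measures how often a tiny random nudge flips each model’s prediction. The over-fitted, wiggly boundary typically flips far more often.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

# Noisy 2-D toy problem.
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
overfit = DecisionTreeClassifier(random_state=0).fit(X, y)               # no depth limit
regular = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)  # regularized

def flip_rate(clf, X, eps=0.05, trials=20, seed=1):
    """Fraction of points whose predicted label changes under a small random nudge."""
    rng = np.random.default_rng(seed)
    base = clf.predict(X)
    flips = 0.0
    for _ in range(trials):
        flips += np.mean(clf.predict(X + rng.normal(0, eps, X.shape)) != base)
    return flips / trials

print("over-fitted tree  :", flip_rate(overfit, X))
print("depth-limited tree:", flip_rate(regular, X))
```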

Now let’s turn to the adversarial situation. An adversary can figure out where the boundary is and then purposely give an example near the boundary. This means the adversary is choosing a test point not from the natural distribution for which the classifier was trained, but from a new, non-natural distribution. It is not surprising then that the classifier stumbles. This is what the article refers to as an “adversarial” attack.

But I take exception to the bland statement that the ML community (in particular the deep learning community) is not aware of how to treat adversarial attacks. This is patently untrue. For example, I work on fraud detection systems, and our whole world revolves around adversarial attacks.

Here is the crux of the matter: To find out where the boundary is, the adversary needs to do a lot of “probing”/“testing”. (This is what the people who did the study actually did. They did not choose a random pixel to perturb; they chose the location of the pixel very carefully by a lot of trial and error.) However, all this probing constitutes a new type of signal that the classifier system should be aware of and should react to, in order to firm up its defenses. This is definitely possible with cloud-based classifiers, where every call to the system gets recorded and can be analyzed and leveraged immediately thereafter.
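As a hedged sketch of what “watching the probes” could look like – the class name, thresholds and window sizes below are made-up placeholders, not values from any real system – one can track, per caller, how many recent queries land in the low-confidence region near the decision boundary:

```python
from collections import defaultdict, deque

class ProbeMonitor:
    """Flag callers who send an unusual number of near-boundary queries."""

    def __init__(self, model, margin=0.1, window=200, flag_at=50):
        self.model = model          # any binary classifier exposing predict_proba
        self.margin = margin        # how close to 0.5 counts as "near the boundary"
        self.flag_at = flag_at      # how many near-boundary hits trigger a flag
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, caller_id, x):
        p = self.model.predict_proba(x.reshape(1, -1))[0, 1]
        near_boundary = abs(p - 0.5) < self.margin
        self.history[caller_id].append(near_boundary)
        return self.suspicious(caller_id)

    def suspicious(self, caller_id):
        # Many near-boundary queries inside one window is a probing signal
        # worth routing to anomaly detection or human review.
        return sum(self.history[caller_id]) >= self.flag_at
```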

This is exactly what we do day in, day out in anti-fraud systems. Fraudsters are constantly probing for loopholes and trying new fraud attacks – in other words, they are trying “perturbations” at various locations in feature space. Once in a while they hit the jackpot and manage to get some fraud done. But we are always watching and looking for these shifts in the data distribution. (We call it anomaly detection.) The way we counter these new fraud attacks is to keep track of all decision requests coming into the system and rapidly “re-learn”, i.e. get fresh examples of anomalies, get fresh supervision (say from humans), and retrain the model daily or even hourly. Every such retraining fixes a “kink” in the boundary. Of course, this is not merely a matter of “re-tuning” the model; we need to make sure there is no “catastrophic forgetting” either (we do not want to create kinks elsewhere and let fraud vectors get through), for which we need to use clever techniques like progressive learning. The point is, these are well-understood ideas in ML and there is a lot of active research. It is not as if the ML community is starting from scratch on fighting adversarial attacks.
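Here is a minimal sketch of such a refresh cycle, using a simple rehearsal/replay buffer of historical examples as one common safeguard against catastrophic forgetting. This is only an illustration of the general idea – not the specific “progressive learning” machinery mentioned above – and every name and number is a placeholder.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def daily_retrain(model, new_X, new_y, replay_X, replay_y, replay_frac=0.5, seed=0):
    """Update the model on freshly labelled traffic mixed with a random
    sample of historical data, so old fraud patterns are not forgotten."""
    rng = np.random.default_rng(seed)
    n_replay = min(int(replay_frac * len(new_X)), len(replay_X))
    pick = rng.choice(len(replay_X), size=n_replay, replace=False)
    X = np.vstack([new_X, replay_X[pick]])
    y = np.concatenate([new_y, replay_y[pick]])
    # classes must be declared on the first incremental update.
    model.partial_fit(X, y, classes=np.array([0, 1]))
    return model

# Usage sketch: model = SGDClassifier(loss="log_loss"); then call daily_retrain
# every day (or hour) with the newly labelled anomalies and the replay buffer.
```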

So yes, this article is talking about an interesting flaw that can potentially be exploited. But no, the ML community is not caught napping here. There are well-understood ways of fighting adversarial attacks on ML systems, and we use them very commonly in anti-fraud systems in particular.

 

Can jobs still provide a pathway to the American dream? | The Seattle Times

“Being intelligent and being human are two different things,”

Source: Can jobs still provide a pathway to the American dream? | The Seattle Times

A thoughtful read on the future of the job market.

What should we teach our children today to prepare them for the workplace of tomorrow? I would say: “go to the basics”.

By that I mean that we should make sure our children get a solid grounding in the eternal invariants – mathematics (of which probability, computer science and data science are a part), basic sciences (such as physics, biology, materials and chemical sciences), fine arts (such as painting), performing arts (such as music or theater), humanities (history, law), and of course poetry and literature.

These are invariants because they either describe eternal truths about the universe or eternal truths about the human condition. They will never be irrelevant or passé. Everything else gets built out of these invariants and can be learnt and re-learnt as the marketplace dictates. So the invariants are like an “SDK” (software development kit) that will always be a competitive advantage to anyone with aspirations and ambitions.

 

Four fundamentals of workplace automation | McKinsey & Company

Capabilities such as creativity and sensing emotions are core to the human experience and also difficult to automate.

Source: Four fundamentals of workplace automation | McKinsey & Company

This is a really good read, and I strongly agree with the sentiments expressed here. For example, as the article points out, automation is not just going to impact low-wage jobs; it may equally impact very high-paying jobs too. A lot of what skilled professionals like doctors, financial planners and executives do is actually quite repetitive, predictable, and tedious, and can be automated. Conversely, there are certain low-wage jobs that cannot be automated at all with today’s technology, such as landscaping and yard cleanup! So the notion that machine intelligence and automation are going to put only blue-collar workers’ jobs in peril is actually quite wrong. Rather, AI and automation are going to put all repetitive, tedious and mind-numbing jobs at risk.

And what is so bad about that? Won’t that free us up for more creative work? It will give us the ability to do what humans do best – adapt to novel situations, think originally, show social and emotional understanding, and build great things by networking and using empathy. For this reason, one can be optimistic that machine intelligence will not make us redundant; rather, it will make us more human. For the same reason, a strong education in the liberal arts – having a refined taste in literature, art, music and culture – will be a key attribute of successful professionals in the future.

While it is perfectly virtuous to emphasize a good grounding in STEM (science, technology, engineering and maths), that does not mean that we should raise our children to be boorish loners who are savants with technology but incapable of holding a decent conversation or of sensing the emotional needs of an audience and modulating their message accordingly. The liberal arts hone our social and emotional intelligence. So in the rush to achieve high grades in STEM subjects, let us not forget the importance of music, art, creativity and collaboration. Paradoxically, the STEM skills of a few will enable large-scale machine intelligence and automation, which in turn will liberate the rest of the population from drudgery and allow them to flourish in creative and artistic endeavors. So STEM and liberal arts education can indeed live side by side, feed into one another, and flourish together.

Yes, by all means encourage your kid to do the hour of code. But also make sure you encourage them to do the hour of music!

An executive’s guide to machine learning | McKinsey & Company

More broadly, companies must have two types of people to unleash the potential of machine learning. “Quants” are schooled in its language and methods. “Translators” can bridge the disciplines of data, machine learning, and decision making by reframing the quants’ complex results as actionable insights that generalist managers can execute.

Source: An executive’s guide to machine learning | McKinsey & Company

BBC – Future – The best (and worst) ways to spot a liar

Ormerod’s answer was disarmingly simple: shift the focus away from the subtle mannerisms to the words people are actually saying, gently probing the right pressure points to make the liar’s front crumble.

Source: BBC – Future – The best (and worst) ways to spot a liar

A very interesting article; I would highly recommend reading it. The technique can be of great utility for all of us in our day-to-day lives. We often undervalue the art of conversation, when actually it is our best tool for gaining insight into people’s motivations.

In Machine Learning vocabulary, we would say that the open-ended question-answer pairs (and especially minor discrepancies between answers) are a far better set of features for lie detection than body language, facial tics and mannerisms.

BBC – Future – What may be self-driving cars’ biggest problem

The danger comes when the human is not concentrating on the road, and the vehicle suddenly wants them to take over.

Source: BBC – Future – What may be self-driving cars’ biggest problem

A thought provoking article, which kind of echoes my thoughts on this topic. I have had debates about driver-less cars with friends before and I always argue that they are not going to become a mass reality in the foreseeable future.

I do not doubt that the technology will quickly progress to the point where a machine can take over and manage the driving operation most of the time, say even 99.9% of the time. But in that 0.1% of the time when something unexpected happens – say a child runs across the road, or a bale of hay falls off the cart in front of you, or an overpass splatters a load of water on the windshield – the driving algorithm is likely going to be caught short. The reason for this is that we humans, who get driving licences, actually spend a lifetime understanding the human environment in which we drive our cars. We don’t just descend from the heavens into our cars. We grow up studying the intricacies and chaos of the urban human environment. For the machine to get all this training, it would need to be exposed to all those situations and be supervised constantly (just like our parents supervised us). The sheer multiplicity of unexpected situations is so large that we cannot guarantee with more than 99.9% confidence that the machine will not stumble once in a while. And if a number like 0.1% seems small to you, multiply it by the “coverage” of driving (the number of road trips happening on a day in the country, say) and you end up with a scary number of occasions on which the machine loses control.
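To make that arithmetic explicit – with a deliberately round, assumed trip count rather than a measured statistic:

```python
# Assumed, illustrative numbers -- not measured statistics.
failure_rate = 0.001           # the hypothetical 0.1% of drives with a surprise the AI cannot handle
trips_per_day = 1_000_000_000  # say, a billion road trips per day across a large country
print(f"{failure_rate * trips_per_day:,.0f} troubled trips per day")  # -> 1,000,000
```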

To take care of these situations, for the safety of the people in the car as well as the people outside on the road, we would need a human operator present to take over. But if we are going to pay a human to stay in the car to take over 0.1% of the time, we might as well ask him to drive the car! If you are thinking that we could let the human use the 99.9% idle time fruitfully – say by doing his taxes or filing his nails – think twice! As the above article aptly points out, the time when a human needs to take over is likely going to be a time of emergency. So if the human does not have the context of the past 30 seconds, it is unlikely that the transition from machine to human can happen successfully. In fact, the human taking over abruptly without context may actually increase the chance of an accident.

So really what we need is a human sitting behind the wheel, his eyes on the road, his hands hovering over the wheel, and his blood pressure dangerously elevated due to anxiety, but not actually doing the driving because the driver-less car supposedly drives itself. That is a pretty dumb and useless scenario! I doubt that anyone will be willing to pay for such a use of the technology.

So the bottom line is that, for driver-less cars to become a commonplace reality (not just a demonstration on carefully controlled sets or on empty freeways), the training of the algorithms that drive the car needs to be taken to a far superior level, comparable to the training a human gains over a lifetime. I will argue that we are nowhere near that type of practical machine learning technology yet. I am not saying that it will not happen, simply that it will not happen in the next 20 years. So at least I am not investing my life’s savings in companies that make driver-less cars, yet.

Like all technology predictions, there is a danger here that I will have to eat my hat. But if the opposite of my prediction indeed comes true, my joy would be such that I will gladly eat it!