As promised last time, I would like to take you on an illustrated tour of “data driven agile product development”. The scenic viewpoints we will be visiting on this tour are hypothesis formulation, instrumentation, data staging, experiment setup, sampling, statistical modelling, metric development, and finally hypothesis resolution and decision making. And of course, as I mentioned in my earlier post, this is an iterative process, so there is always going to be rinse and repeat.
Obviously, we cannot have an illustrated tour without an example. I was first tempted to use a comfortable example like web search, recommendation engines, social networking, productivity applications or any one of such online web services. But on second thought, that is not challenging enough. Firstly, it may lead you to think that the arguments I am making apply only to web-based services and not to brick-and-mortar services and products. That is definitely not the case. Secondly, it does not allow me to showcase the enabling power of Internet of Things technology, which I strongly feel is one of the trifecta of technologies that will drive the future of agile product development. Thirdly, by using a commonplace physical product I want to test whether my ideas, admittedly inspired by agile development of online services, stand the test of generalization to the offline world. It is easy to explain special cases with specialized arguments. True joy comes from discovering a generic principle that is applicable with few caveats, if any.
So I have decided to take a more accessible example of a product, namely the ubiquitous automobile.
There are plenty of interesting questions a car maker or a transportation authority could ask about an automobile, and the answers could drive how new models of cars are designed, how they are regulated, and how car-related infrastructure (roads, gas stations, traffic lights, etc.) is laid out. Out of those many questions, I have decided to take one particularly relevant one.
You may have heard the recent debate about whether there should be heads-up displays (HUDs) on the windshields of cars. The technology poses a hard question (or, in smarty pants language, “poses an ambiguous hypothesis”). On one hand, one may argue that displaying information on the windshield may encourage the driver to keep his/her eyes pointed straight ahead rather than surreptitiously glancing at the cellphone in his/her lap while driving. This may improve safety and reduce the number of distracted driving accidents. On the other hand, many people, including knowledgeable physiologists and experts in man-machine interfaces, contend that having something in your field of vision is no guarantee of it being noticed. There is a well-known selective attention trick that our brain can play on us. If the stuff displayed on the screen is sufficiently engaging, the user may just mentally block out what is happening beyond the screen on the road itself, with disastrous consequences.
And of course there is a third type of argument, namely that any kind of distraction from driving is bad, be it digital displays (on the windshield, on a phone screen or in-dash), playing music, phone calls (with a hands-free device or not), and even eating or talking with a co-passenger (not to mention falling asleep at the wheel)!
Clearly the hypothesis is not an easy one, and discussions about it can easily degenerate into polemic. If the question were something simpler like “is it worth having brakes in a car?”, we would have near complete unanimity. And hence the decision to put brakes in a car was probably taken by some executive in the 19th century based on “gut feeling”. But how about questions like: Is wearing a seat belt useful? Is a blood alcohol level of 0.07 safe for driving? Does the presence of traction control really reduce accidents? They get progressively more nuanced, and it starts getting difficult to instantly denounce either answer as a clear “flat earth” delusion. The HUD hypothesis is probably far out on this difficulty scale, and hence it definitely merits a data-driven answer.
So we have climbed the first step of the data driven journey! We have formulated a hypothesis that deserves data driven decision making. This hypothesis can quickly and easily be exploded into many variants and iterated upon: If HUDs do indeed tend to reduce crash risk, how much of the screen should they cover? What kind of things should be displayed? What colors and fonts should they be rendered with? Is animation worse than static rendering? Should they be automatically turned off in certain situations like bad weather, heavy traffic or tiredness of the driver? On the other hand, if HUDs of any kind tend to increase the risk of a crash, how do they fare as compared to cell phone use? Is the presence of both types of displays being used together more harmful than the presence of each one in isolation? Does the age of the driver matter to the outcome? Does the market (country) make a difference? Does left hand drive vs right hand drive have any bearing? (You may laugh at this last question, but consider that our brains and our bodies have a “handedness” – does this handedness extend to the field of vision, and hence have an interaction with the handedness of the driver’s location in the car and on the road?)
The purpose of flooding you with all these variant hypotheses is to show you that while we started with a nice clean question (does a HUD make a car safer?), its answer may be impacted by a whole bunch of other factors, and we may not even anticipate many of them! So how can we answer our original question?
The answer is: let the data speak for itself. And I really mean this in a statistical sense, and not simply as a call to action. An unbiased sample of data will automatically contain the influence of all these factors, both those that are detectable and those that are not. Detectable factors like fonts and colors in the HUD, or handedness of the driver, can, and should, be logged, so that we can later pivot our analysis on them. (I will talk about instrumentation in the next post.) But even the factors that are not easily detectable, say whether the driver was tired or whether the weather was bad, will still be present in the data in proportion to their occurrence in real life, even if they could not be logged explicitly, and so a data based decision will be able to account for them. For example, if HUDs are no problem in good weather but are a serious hazard in (rare) stormy weather, then a sufficiently large unbiased sample of data should pick up those rare but dangerous stormy weather situations too, and a good metric will tease out the badness of those cases.
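To make the stormy-weather point concrete, here is a minimal simulation sketch. All the numbers in it (storm frequency, per-drive crash probabilities, the size of the hazardous interaction) are made-up illustrative assumptions, not real data; the point is only that a rare factor we never logged still leaves its fingerprint on a large enough unbiased sample.

```python
import random

random.seed(42)

# Hypothetical, made-up probabilities: HUDs are assumed harmless in
# clear weather but hazardous in rare stormy weather. The "no_hud"
# arm has the same baseline crash risk regardless of weather.
P_STORM = 0.05
P_CRASH = {
    ("hud", "clear"): 0.001,
    ("hud", "storm"): 0.020,   # the rare but dangerous interaction
    ("no_hud", "clear"): 0.001,
    ("no_hud", "storm"): 0.001,
}

def simulate(arm: str, n_drives: int) -> float:
    """Return the observed crash rate for one experiment arm.

    Note: weather is NOT logged here -- it influences the outcome
    but never appears in the recorded data, just like an
    unobserved real-world factor.
    """
    crashes = 0
    for _ in range(n_drives):
        weather = "storm" if random.random() < P_STORM else "clear"
        if random.random() < P_CRASH[(arm, weather)]:
            crashes += 1
    return crashes / n_drives

n = 200_000  # a large unbiased sample of drives per arm
hud_rate = simulate("hud", n)
no_hud_rate = simulate("no_hud", n)

print(f"HUD crash rate:    {hud_rate:.5f}")
print(f"No-HUD crash rate: {no_hud_rate:.5f}")
```

Even though the simulated log contains no weather column at all, the HUD arm shows a measurably higher crash rate, because storms occur in the sample in proportion to their real-life frequency. A small or biased sample (say, drives collected only in sunny regions) would miss this entirely, which is exactly why the sampling step later in this tour matters.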
Next stop: Instrumentation. Now that we have identified a question that needs a data driven answer, how do we decide what data to look at, and how do we get hold of that data?