Archive for November, 2014

Better software through SCIENCE!

Saturday, November 29th, 2014

The scientific method is the time-proven way we have learned about the very principles that govern the universe.  It can be summarized by the following sequential steps:

  • Ask a question
  • Construct a hypothesis
  • Test the hypothesis with an experiment
  • Analyze results of the experiment
  • Determine whether hypothesis was correct
  • Answer the question

Testability

A hypothesis is a suggested explanation of observed phenomena.  Given such an explanation one can then make predictions about those phenomena given certain conditions. 

But for a hypothesis to be truly useful it should be a specific, testable prediction about what you expect to happen.

For example, Galileo might ask the question "Is the speed of a falling object dependent on its mass?" The hypothesis Galileo formed was that two objects of different mass, dropped from a height, would strike the ground at different times if falling speed depended on mass. The hypothesis differs from the original question in that it predicts an experimental outcome that can be tested. The experiment in turn yielded data indicating nearly simultaneous impact with the ground, and the analysis concluded that falling speed is not dependent on mass.

Data-driven quality

In my previous blog post (Data-Driven Quality explained – part 1: questions? what questions?), I introduced DDQ.  DDQ represents application of the scientific method to software quality. The steps of the scientific method can be mapped to the DDQ model as seen below.  Instead of the nature of the universe, though, we are interested in answering questions about the quality of our software.

image

Applied to software

For example, we might be interested in learning why fewer and fewer folks are using Internet Explorer as their web browser. 

Based on some preliminary research we may hypothesize that users abandon IE when they encounter web pages that do not function properly, but work better in other browsers.

We might then configure IE to malfunction with a select set of popular pages and assess whether IE abandonment rates are higher for users of those pages. 

Um….no…. that would be pretty stupid.  

So we take a cue from social scientists here. Social scientists do not send out crack teams equipped with highly addictive narcotics to supply certain neighborhoods so that they can contrast the effects with other neighborhoods.   They instead find existing populations that already exhibit the attributes they need for comparison.

In our case we would compare users of pages known to malfunction in IE to see if there is a significantly higher abandonment rate than users who do not encounter such pages. 

If confirmed we can then dig in and identify the chief offenders of browser compatibility and fix them…. then re-assess.
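A minimal Python sketch of that comparison, assuming we already have abandonment counts for the two existing populations. The counts below are invented purely for illustration, and the two-proportion z-test is just one reasonable choice of significance test, not necessarily the one a real analysis would use.

```python
import math


def two_proportion_z_test(abandoned_a, total_a, abandoned_b, total_b):
    """Return (rate_a, rate_b, z, one_sided_p) for the alternative rate_a > rate_b."""
    rate_a = abandoned_a / total_a
    rate_b = abandoned_b / total_b
    # Pooled rate under the null hypothesis that both groups abandon equally.
    pooled = (abandoned_a + abandoned_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # One-sided p-value from the standard normal survival function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts (abandoned, total), invented for illustration only.
exposed = (130, 1000)      # users of pages known to malfunction in IE
not_exposed = (100, 1000)  # users who do not encounter such pages

rate_a, rate_b, z, p = two_proportion_z_test(*exposed, *not_exposed)
print(f"exposed: {rate_a:.1%}, not exposed: {rate_b:.1%}, z = {z:.2f}, p = {p:.4f}")
```

Because the populations are observed rather than assigned, a small p-value here only says the difference is unlikely to be chance. It does not rule out other differences between the two groups of users.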

Software quality

Testability, data-driven, answering questions… these should all sound familiar to any software professional as good practices.  Using DDQ and the scientific method is a powerful way to apply these for your software.

Data-Driven Quality explained – part 1: questions? what questions?

Monday, November 24th, 2014

The "dictionary" definition of Data-Driven Quality (DDQ) is:

Application of data science techniques to a variety of data sources to assess and drive software quality.

But it is really about questions and answers, specifically using data to find those answers. Trying to derive insights from data without knowing what you are looking for can be a source for new discoveries, but more often will yield mirages instead of insights, such as the following image [Source: tylervigen.com, used under CopyLeft license CC 4.0]

(if I only had such data in 1983 I could have wasted even more of my quarter-fueled youth)
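How easily such mirages appear can be sketched in a few lines of Python. This is an illustration under assumed numbers (ten data points per series, 1,000 candidate series, an arbitrary seed), not a reconstruction of the chart above: every series is pure random noise, yet trawling through enough of them turns up a strong correlation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1983)  # arbitrary seed so the run is repeatable
YEARS = 10         # assumed: ten yearly data points per series
CANDIDATES = 1000  # assumed: number of unrelated series we trawl through

# The "metric we care about": pure noise, related to nothing.
target = [rng.gauss(0, 1) for _ in range(YEARS)]

# Trawl through unrelated noise series and keep the strongest correlation.
best_r = max(
    abs(pearson(target, [rng.gauss(0, 1) for _ in range(YEARS)]))
    for _ in range(CANDIDATES)
)
print(f"strongest |r| among {CANDIDATES} unrelated series: {best_r:.2f}")
```

With so few points per series, at least one unrelated candidate will almost always correlate strongly with the target, which is why it pays to start from a question rather than from the data.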

So what questions then? These are the questions to ask about your software:

image

In this diagram "it" is your software (service or product). The questions are divided into the three layers identified by the categories on the left:

  • Business value: Does your software contribute to the bottom line and/or strategic goals?
  • User experience: Does your software delight customers and beat the competition?
  • System Health: Does your software work as designed?

Each category layer depends on the layer beneath it. Consider that it is difficult to build a good user experience on a slow, error-riddled product. And ultimately it is the top layer, Business value, that we care about. This leads to the "trick question" I will sometimes ask software testers and SDETs: What is your job? To which I answer:

Your job is to enable the creation of software that delivers value to the business. This is the job of the tester. It also happens to be the job of the developer, program manager, designer, manager, etc.

(I explore this idea a bit more here if you are interested.)

If you think about it, you might want to add to the above statement that you create this business value through:

  • An experience that delights the users
  • The production of high-quality software

True. It also turns out that these are respectively User Experience and System Health, the layers we are dependent on to build Business Value. An interesting note about that word "quality". "High-quality software" above means system health – that it does not break. But as Brent Jensen likes to ask, which is higher quality software:

  • A. Perfect bug-less software that people do not use (or perhaps worse, they hate to use)
  • B. Quirky software with a few glitches making millions happy (and making happy $millions)

If you believe the answer is B, then DDQ will appeal to you with its "Q" for "quality" happily spanning the pyramid above and not just system health. DDQ is about a confluence of what has been called Business Intelligence (BI) and quality. They are not really different things.

Asking the right questions is an important start, but is only one piece of the DDQ puzzle. DDQ works in an environment of iterative improvement (same as Agile). The faster we can spin around these cycles, the faster we improve our software. This is the DDQ Model:

image

I will leave it as an exercise to the reader to map the above to the scientific method. (I may help you out and do this in a future blog post.)

Understanding our questions, the next step is to understand the data sources we can use to answer these questions. I will close by sharing a list of some of the types of data we can use below. You will note much of this data comes from production or near-production (think of private betas or internal dogfood). Production is a great environment to get data from as it is the most realistic environment for our software with real users doing real user things.

Business value: Is it successful?

  • Acquisition: adoption of a new feature, new users, unique users
  • Retention: market share, session duration, repeat use
  • Monetization: purchase, conversion, ad revenue (minus COGS and support costs)

User experience: Is it usable? valuable?

  • Usage data: feature use, task completion rates
  • Feedback (2nd person): user ratings, user reviews
  • Sentiment analysis (3rd person): from Twitter, forums

System health: Is it available? reliable? performant?

  • Infrastructure data: memory, CPU, network
  • Application data: errors, latency, MTTR (Mean Time to Recovery), compliance
  • Test cases (run as part of pre-production test passes, or in production as monitors): correctness, performance
  • Engineering metrics (pre-production): code coverage, code churn, delivery cadence
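To make two of the measures listed above concrete, here is a small Python sketch that computes MTTR from an incident log and a task completion rate from usage events. The record layouts, function names, and sample values are all hypothetical, chosen only to show the arithmetic; real telemetry will look different.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (outage detected, service restored).
incidents = [
    (datetime(2014, 11, 1, 9, 0), datetime(2014, 11, 1, 9, 30)),
    (datetime(2014, 11, 8, 14, 0), datetime(2014, 11, 8, 15, 30)),
    (datetime(2014, 11, 20, 2, 0), datetime(2014, 11, 20, 3, 0)),
]

# Hypothetical usage events: (user id, whether the started task was completed).
task_attempts = [
    ("user-1", True),
    ("user-2", True),
    ("user-3", False),
    ("user-4", True),
]


def mean_time_to_recovery(incident_log):
    """System health: average time from detection to restoration."""
    total_downtime = sum((end - start for start, end in incident_log), timedelta())
    return total_downtime / len(incident_log)


def task_completion_rate(attempts):
    """User experience: fraction of started tasks that were completed."""
    completed = sum(1 for _, done in attempts if done)
    return completed / len(attempts)


mttr = mean_time_to_recovery(incidents)
completion = task_completion_rate(task_attempts)
print(f"MTTR: {mttr}, task completion rate: {completion:.0%}")
```

Both numbers come straight from production-style data, which is the point of the list above: the questions determine which fields need to be captured in the first place.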

Not covered

As you can see in the DDQ model, there is plenty more to cover. Besides the other boxes in that model, here are some other things that were NOT covered in this blog post:

  • How to determine specific scenarios to frame your questions for your software
  • How this fits into a comprehensive software development life cycle, and specifically the impact on BUFT (Big Up-Front Testing)
  • Impact on team roles and responsibilities. Who does what?
  • The future of the Tester/SDET role
  • What do you need to know about actual Data Science?
  • Tools
  • Dashboards, and actionability
  • Examples :-)

Further reading

If you want to learn more, I recommend the following:

  • My former Microsoft colleague Steve Rowe has a great series of posts on DDQ
  • Adding to the acronym soup, but definitely on target with DDQ, my often co-conspirator Ken Johnston explains EaaSy and MVQ.
  • Although I had not quite tightened up the questions into the neat pyramid model above, I do fill in some of the blanks left by this blog post in my Roadmap to DDQ talk.

Finally, I wish to acknowledge and thank Monica Catunda, who collaborated with me on much of this material and co-delivered the Create Your Roadmap to Data-Driven Quality 1-day workshop at STPCon Nov 2014 in Denver.

New blog, first post

Tuesday, November 18th, 2014

This is my new blogging home.

I had previously used this space (http://setheliot.com) as a place to collect my various presentations and papers (and they are still here), but now I will also use this as my new blogging platform.

My old blog was called Your Software has Bugs, and indeed I am sure your software still does, but for this go I am going to stick with the simpler Seth Eliot’s Blog.

While the old one was primarily about software, this one has a broader scope including software, data science, and whatever I think you might be interested in.

My old blog started with a self-indulgent exploration of me, which was immediately called out by a commenter as being overly self-indulgent and too much about me (looking back at that post, I recall I declined to publish the comment… a decision I now regret).  Now instead I would like to start with an exploration of the “greatest hits” from my previous blog, with some added context.  Sort of a starting point to continue with the new blog.

Presently I am an advocate of (and enjoy helping engineers with) Data-Driven Quality (DDQ).  To get an idea of what DDQ is, peruse the deck or watch the talk Create Your Roadmap to Data-Driven Quality.  How we use data is central to how we produce quality software.  This is certainly not limited to big data, but large unstructured streams of data provide a compelling story – one that we can unlock with modern processing and tools.

image

So in conclusion, I do indeed like big data :-)


But before there was DDQ, there was TiP – Testing in Production, which I chose to introduce by showing how to do it WRONG:

Feeling TiPsy…Testing in Production Success and Horror Stories

…and also some on how to do it right, such as enabling teenagers to escape reality behind online avatars like this:

IMVU

I was recently at a Seattle area QA meeting (QASig) where the topic of finding bugs and its place in quality assessment came up.  Years before, I pondered that question, asking sarcastically:

Hooray for Buggy Software???

This was an interesting re-read for me as I saw early signs of DDQ in this diagram from the blog post.

See any resemblance to this one from one of my DDQ talks?

image

To wrap up on TiP and DDQ I will share some fun I had at the expense of Big Blue:

Testing in Production (TiP), a common and costly technical malpractice???

When I presented this story in a talk, I actually got back a comment that it was unprofessional to make fun of IBM.  I certainly want to stay professional in my interactions, but I think IBM can take it.

Finally, before I close, I would like to share my most popular blog posts… most popular that is in China!  For reasons that I do not quite grasp, the Chinese audience really responded to my posts on the Microsoft change in logo :-)

The Four Colors of the New Microsoft Company Logo

The Four Colors of Microsoft, revisited

Microsoft Logo

That’s it for now.   Look for more timely (for sufficiently broad definitions of “timely”) and compelling (well, I think it’s interesting) content soon (or not… I’ll try).