Quantifying the User Experience

Personally, I have a university background in psychology, where I learned the importance of the internal and external validity of collected data, particularly in testing. I jumped out of my seat a few times and backed a few marketing professors into a corner when they explained to me that "you validate the quant with the qual", or the reverse (it varied from one professor to the next). All of which is to say that Jeff Sauro has just published a book that is available in France: Quantifying the User Experience: Practical Statistics for User Research. Needless to say, this one leaves no room for improvisation or rough guesswork!

To get an idea of Jeff Sauro's work, you can read his blog and find some very handy statistical tools on his site.

The book's table of contents:

Introduction & How to Use this Book

Visual Guide to What Test
Skipping the formulas

Quantifying User Research

What is User Research?
Usability Tests (lab and remote)

Benchmarking
Comparative Testing
Qualitative Studies

Surveys
Requirements Gathering
A/B Testing
Questionnaires
Using Inferential Statistics with Usability Data
Sample Size, Normality and Other Statistical Concerns
Measuring Usability: Quantifiable Aspects of Usability

Introduction: Metrics as Independent of Formative and Summative Tests

Completion
Time
Satisfaction
Errors
Clicks / Page Views
Combined Scores
Problems Discovered

How precise are our estimates: Confidence Intervals

Confidence Interval = Twice the Margin of Error
Confidence Intervals Provide Precision & Location
Three Components of a Confidence Interval

Confidence Level
Variability
Sample Size

Confidence Interval for a Completion Rate

Confidence Interval History
Wald Interval: terribly inaccurate for small samples
Exact Confidence Interval
Adjusted-Wald: Add Two Successes & Two Failures

Best Point Estimates for a Completion Rate

How accurate are point estimates from small samples?

Confidence Interval for a Problem Occurrence
Confidence Interval for Rating Scales and other Continuous Data
Confidence Interval for Task Time Data

Mean or Median Task Time?
The Geometric Mean

Log Transforming Confidence Intervals for Task Time Data
Confidence Interval for a Median

Did we meet or exceed our goal?

Introduction

One-Tailed and Two-Tailed Tests

Comparing a Completion Rate to a Benchmark

Small Sample Test

Mid-Probability

Large Sample Test

Comparing a Satisfaction Score to a Benchmark

Do at Least 75% Agree? Converting Continuous Ratings to Discrete

Disadvantages to Converting Continuous Ratings to Discrete
Net Promoter Score

Comparing a Task Time to a Benchmark

Is there a statistical difference between products?

Comparing two Means (Rating Scales & Task Times)

2-sample t-test (between subjects)

Confidence Interval around the Difference

Paired t-test (within subjects)

Confidence Interval around the Difference

Comparing Completion Rates

Small Samples: Fisher Exact Test
Large Samples: The N-1 2-proportion test

Confidence Interval around the Difference
Relationship between Chi-Square Tests and 2-proportion tests

A/B Testing & Conversion Rates

What Sample Sizes Do We Need? Part 1: Summative Usability Studies

Introduction

Why Do We Care?
The Type of Usability Study Matters

Basic Principles of Summative Sample Size Estimation

Estimating Values

Example 1: A Realistic Usability Testing Example Given Estimate of Variability
Example 2: An Unrealistic Usability Testing Example
Example 3: No Estimate of Variability

Comparing Values

Example 4: Comparison with a Benchmark
Example 5: Within-Subjects Comparison of an Alternative
Example 6: Between-Subjects Comparison of an Alternative
Example 7: Where’s the Power?

What Can I Do to Control Variability

Sample Size Estimation for Binomial Confidence Intervals
Binomial Sample Size Estimation for Small Samples
Sample Size for Comparison with a Benchmark Proportion
Sample Size Estimation for Proportions & Chi-Squared Tests

What Sample Sizes Do We Need? Part 2: Problem Discovery

Using a Probabilistic Model of Problem Discovery to Estimate Sample Sizes for Formative User Research
The famous equation: $P(x \geq 1) = 1 - (1 - p)^n$

Deriving a sample size estimation equation from $1 - (1 - p)^n$
Using the tables to plan sample sizes for formative user research

Assumptions of the Binomial Probability Model
Additional Applications of the Model

Estimating the composite value of p for multiple problems or other events
Adjusting small-sample composite estimates of p

Estimating p

Adjusting the Initial Estimate of p
Using the Adjusted Estimate of p
Investigating Sample Size Effectiveness
Estimating the Number of Problems Available for Discovery
What Affects the Value of p?

Attitudinal Measurement with Questionnaires

Scales, Labels and Points
Post-Task Questionnaires
ASQ, SMEQ, 1-question Likert
Post-Test
SUS, SUMI, PSSUQ, Homegrown scales
Usability and Loyalty
Net Promoter Scores and SUS

Controversies in Measurement & Statistics

Industrial versus Scientific: Purpose of statistics is to help in better decision making over the long run
Multi-Point Scales
p-values and NHST
Parametric versus Non-Parametric Statistics
Which confidence level
When x=n or x=0 what confidence level do you use?
Multiple testing versus omnibus testing
2 x 2 tables

Final Thoughts on Statistics for User Research
Appendix A: A Crash Course in Fundamental Statistical Concepts

Central Tendency: Mean & Median
Standard Deviation & Variance
Population Parameters and Sample Statistics
Standard Deviation
Margin of Error
Alpha
Standard Error of the Mean
Central Limit Theorem
The normal distribution
The Binomial Distribution
Normal Approximation to the Binomial
Introduction to Hypothesis Testing

The Null and Alternative Hypothesis (Ho and Ha)
Type I and Type II Errors
Confidence and Power
Making decisions from p-values

If p is low reject the Ho

One and Two Tailed Tests

Mechanics of Test Statistics

z statistics
t-statistics
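To give a feel for two of the techniques named in the table of contents, here is a minimal Python sketch. It is not taken from the book. The function names are hypothetical, the interval is fixed at roughly 95% confidence (z = 1.96), and the adjustment is the literal "add two successes and two failures" rule quoted in the chapter title. The second part uses the problem discovery formula, the probability of seeing a problem at least once in n participants being 1 - (1 - p)^n, and solves it for n. The numbers in the usage lines are purely illustrative.

```python
import math


def adjusted_wald_interval(successes, trials, z=1.96):
    """Rough 95% confidence interval for a completion rate.

    Assumption: the simple "add two successes and two failures"
    adjustment, i.e. p_adj = (x + 2) / (n + 4), followed by the
    ordinary Wald formula on the adjusted values.
    """
    n_adj = trials + 4
    p_adj = (successes + 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


def discovery_probability(p, n):
    """Chance of observing a problem at least once with n participants,
    where p is the per-participant probability of hitting the problem."""
    return 1 - (1 - p) ** n


def sample_size_for_discovery(p, goal):
    """Smallest n with 1 - (1 - p)^n >= goal.

    Rearranging gives n >= ln(1 - goal) / ln(1 - p).
    Requires 0 < p < 1 and 0 < goal < 1.
    """
    return math.ceil(math.log(1 - goal) / math.log(1 - p))


if __name__ == "__main__":
    # Illustrative values only: 4 of 5 users completed the task.
    low, high = adjusted_wald_interval(4, 5)
    print(f"completion rate interval: {low:.2f} to {high:.2f}")

    # Illustrative values only: p = 0.31, target discovery of 85%.
    print(f"discovery with 5 users: {discovery_probability(0.31, 5):.3f}")
    print(f"users needed for 85%: {sample_size_for_discovery(0.31, 0.85)}")
```

With only five users the interval is very wide, which is exactly the kind of uncertainty the book's confidence interval chapter is about.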

Le bloc-notes, UX & user experience design

Author: Raphaël
