[This is a guest post from Simon Chadwick, CEO of Peanut Labs,
Managing Partner of Cambiar and Editor-in-Chief of Research World.]
One might be forgiven for thinking that the issue of suspect data and sample quality in online research has really only arisen in the past two years. After all, in that space of time, we have seen associations launching initiatives (including the huge study conducted by ARF’s ORQC), task forces springing up, conferences devoted to the issue and the launch of commercial and collaborative solutions – all aimed at bringing about a comprehensive resolution to the problem. But, in actuality, worries about data and sample quality have been around for a lot longer – Cambiar’s first study of the online industry highlighted this as the top concern for both clients and researchers, and that was back in 2005!
Despite this activity (or because of it?), that concern persists, unabated. The fourth Cambiar study, conducted in February and co-sponsored by Peanut Labs and MROps, demonstrated that sample and data quality remain stubbornly at the top of the list of concerns.

As a sidebar, it is interesting that full-service market research companies evince much more serious concern about these issues than do clients, despite the recent hype about this being a client-led revolution. Additionally, it is clear that what constitutes quality differs depending on where you are in the food chain. For full-service companies, it is defined as “data” or “sample” quality. For data collectors, the issue is much more about survey and questionnaire design. Who is right? Both are.
A literature review of what is out there on online data quality yields a plethora of articles, webinars and presentations. ORQC alone has amassed more than 300 articles on the issue, while studies such as that conducted by Burke in 2007 suggest that some 14% of respondents from online panels are in some way ‘suspect’. Indeed, our own data suggest a similar level of problems – of the more than 21 million respondents run through our digital fingerprinting software in the last 12 months across a wide variety of data collection sources, 15% were identified as being ‘suspect’.
So what does this mean in the real world? Inefficiency, extra cost and, potentially, wrong decisions based on faulty data. If our clients are paying for the data that we provide them, but a proportion of those data are suspect, the least that can be said is that they are overpaying, since research companies routinely have to oversample in order to compensate for ‘duds’ in the data set. The research companies themselves are paying more in terms of time and salary to check data at the back end and weed out the duds. And if, heaven forbid, a client makes a decision based on faulty data, then the costs can be astronomical.
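To put a rough number on the oversampling point, here is a minimal back-of-the-envelope sketch. It assumes that suspect respondents are simply discarded and that the suspect rate is known in advance; the function names and the cost per complete are hypothetical and used only for illustration.

```python
import math


def required_sample(target_valid: int, suspect_rate: float) -> int:
    """Respondents to recruit so that, on average, `target_valid`
    usable completes remain once suspect respondents are removed."""
    if not 0 <= suspect_rate < 1:
        raise ValueError("suspect_rate must be in the range [0, 1)")
    return math.ceil(target_valid / (1 - suspect_rate))


def oversample_cost(target_valid: int, suspect_rate: float,
                    cost_per_complete: float) -> float:
    """Extra spend attributable to the discarded 'duds'.
    cost_per_complete is a hypothetical, illustrative price."""
    extra = required_sample(target_valid, suspect_rate) - target_valid
    return extra * cost_per_complete


if __name__ == "__main__":
    # 15% is the suspect rate quoted above; 1,000 completes and a
    # price of 5.00 per complete are assumptions for illustration.
    print(required_sample(1000, 0.15))
    print(oversample_cost(1000, 0.15, 5.00))
```

Under these assumptions, a study that needs 1,000 usable completes at a 15% suspect rate has to recruit 1,177 respondents, so roughly one in seven interviews paid for is thrown away before the back-end checking time is even counted.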
So, is this a problem? Yes it is. We can run all the studies we want to try and prove that one or other component issue in data quality ‘really doesn’t make much difference’, but the truth is that the multivariate nature of factors that make for poor data quality means that we don’t really know what the impact is on our research and how much it is skewing our results.
More Info:
- Peanut Labs: http://peanutlabs.com
- Optimus Digital Fingerprinting: http://www.peanutlabs.com/optimus/technology
- ARF Online Research Quality Council: http://www.thearf.org
[Simon Chadwick is the CEO of Peanut Labs, Managing Partner of
Cambiar and Editor-in-Chief of Research World. He has over 30 years'
experience in the research profession, both corporately and as an
entrepreneur.]









