OpenFiction [Blog]

Toward an Effective Understanding of Website Users

Posted in Uncategorized by scarsonmsm on March 12, 2007

Diane Harley’s group at UC Berkeley has come out with another great report supporting better evaluation of open educational resources. If you report data from your project to stakeholders of any kind, or are on the receiving end of project data, this report is a helpful look at the usefulness and limitations of web surveys and transaction log analysis (or web metrics).

Harley and crew were able to link survey respondents to their respective transaction logs, and thus determine how representative the respondents were with respect to overall site usage (and the answer in non-math terms is “not very”). If I’m reading the report right, the surveys they used received extremely low response rates, an order of magnitude lower than those we’ve done for MIT OCW (~0.2% as opposed to our 3-6%), so it’s possible the reliability measures differ as well, but the basic point is well made: You can’t expect the people who complete your survey to represent your overall site traffic.

What Harley doesn’t address, and I think may be the next step in understanding survey results, is what they might usefully represent. One quote:

These findings confirm our fear about survey response bias; the few users who bothered to respond to the surveys are demonstrably different from the average site visitors. Since the results show that the respondents are non-representative on these three behavioral measures, we determined that it would be unwise for us to draw any conclusions from the survey about the characteristics of the site visitors overall.

I’m not sure there is cause for fear here. No, the survey results don’t represent all of the traffic to the site, but I’m not sure that information is worth having anyway. A web site survey is like conducting interviews of people passing through an art exhibit placed in a public pedestrian thoroughfare. Some may have heard of the exhibit and are coming specifically to see it, some may have just been passing by and became interested, a few may have just stopped to glance at a single piece, and a great many are just passing through on their way to lunch. It’s neither possible nor advisable to interview everybody, and only the ones most interested in the exhibit are going to sit for an interview anyway.

The good news is that these are the people you are most likely to learn from anyway. There is of course the danger you’ll only hear good news because you’re only asking the choir to sing (or something like that). You also won’t learn anything about those who choose not to stop, but for that you need a different tool. The question is, what population do survey responses really represent?

I did a calculation a while back on traffic to the MIT OCW site in October 2004 (Page 19 of the ’04 Findings Report). That month, we had 417,598 visitors. Based on the survey data, I did a back-of-the-envelope calculation that we had a core user base of about 50,000. The data I based that particular calculation on suffers from some of the bias that Harley identifies, but if I recall correctly, there was at least a rough correspondence between survey and web metric data on this point. I’m sure someone with a stronger stats background, and armed with Harley’s methodology, could do an even better job of this kind of thing to better identify the core group that survey results do represent well. If anyone out there is interested in this problem, I’m happy to work on it with them.
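As a rough illustration of the arithmetic only (this is not the method used in the Findings Report, which isn’t reproduced here), the sketch below takes the two figures quoted above and shows what share of monthly visitors the core estimate implies. It also shows how a survey percentage might be scaled to the core group rather than to all traffic. The function names and the 0.5 survey share are hypothetical placeholders, not figures from the report.

```python
# Figures quoted in the post (October 2004, MIT OCW).
MONTHLY_VISITORS = 417_598      # all visitors that month
CORE_USERS_ESTIMATE = 50_000    # back-of-the-envelope core user base


def core_share(core: int, total: int) -> float:
    """Return the fraction of total visitors that the core estimate represents."""
    if total <= 0:
        raise ValueError("total must be positive")
    return core / total


def scale_to_population(survey_share: float, population: int) -> float:
    """Scale a survey proportion (0..1) to a head count in a chosen population.

    Hypothetical helper: the point is that the population passed in should be
    the group the survey plausibly represents, not necessarily all traffic.
    """
    if not 0.0 <= survey_share <= 1.0:
        raise ValueError("survey_share must be between 0 and 1")
    return survey_share * population


share = core_share(CORE_USERS_ESTIMATE, MONTHLY_VISITORS)
print(f"Core users as a share of monthly visitors: {share:.1%}")

# Placeholder survey share (an assumption, not a reported figure).
example_share = 0.5
print("Scaled to core group: ", scale_to_population(example_share, CORE_USERS_ESTIMATE))
print("Scaled to all traffic:", scale_to_population(example_share, MONTHLY_VISITORS))
```

The gap between the last two numbers is the reason the choice of population matters: the same survey percentage yields a count roughly eight times larger when applied to all traffic instead of the core group.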

The other reason to suspect that surveys represent at least some stable portion of a user population is that the numbers appear to be very consistent over the years we’ve done our surveys, with observable trends in figures such as educational role and satisfaction with breadth, depth and quality. I completely agree with Harley that there are problems with extrapolating survey figures to all traffic to a site. At best, survey figures will always represent a subset of traffic. My hope is that over time we can better describe the subset.


4 Responses


  1. Patrick said, on March 19, 2007 at 12:00 pm

    Hi Steve,
    Thank you very much for this post which pointed me towards a very interesting paper. On the openlearn site I have been reluctant to push a questionnaire because it won’t tell us about many in our audience. However, just recently we used the access statistics to help us find some of those who had made significant use of the site. When people register we also ask if they are prepared to be approached for research purposes. So we used these in combination to select a deliberately biased sample who were asked to complete the questionnaire. These users will not be typical of the site but we certainly managed to get informative and useful replies from them. I guess this supports your statement “No, the survey results don’t represent all of the traffic to the site, but I’m not sure that information is worth having anyway.”


  2. […] March 19th, 2007 I have just been looking at Steve Carson’s blog at the OpenFiction Project. In particular a post on Toward an Effective Understanding of Website Users a paper by Diane Harley. I felt both the paper and the blog entry made very good points and so I wanted to make a comment, but in the end failed. Looking back I was not the first to have failed as Stephen Downes had had some trouble commenting on an earlier post. I have therefore followed Stephen’s example and made the comment here on my blog. Not such a famous place so not sure it will ever get read! […]

  3. Stephen Carson said, on March 19, 2007 at 12:56 pm

    Just a clarification on the issue Patrick mentions above. I did fix the issue Stephen encountered, which was the system was not set to accept registrations for comments. Patrick was able to register and submit a comment, but I have all comments set to require manual approval as a spam check. I usually get two or three spam comments submitted each week, even with Akismet set up. If any WordPress users have suggestions for a more user-friendly approach, please do let me know.

  4. […] Anyway, reading it right after reading the report, I found several passages that spoke to the issues raised regarding web surveys and what they represent. The first passage recounts the aha moment that Bill Gross had for his paid search engine, which presaged the Google AdWords system: Put simply, it’s not the quantity of traffic, Gross realized, it’s the quality. Any business would be willing to pay a lot more than seven to ten cents a click for the right traffic. That realization that became Gross’s eureka moment—a moment that, more than any other, spawned today’s Internet advertising economy. For every single online business (even, it turns out, portals) undifferentiated traffic is worth very little, but specific traffic, traffic with an intent to act in relation to a business’s goods or services [Battelle’s emphasis], is worth quite a lot. Gross realized that businesses will pay quite a bit to acquire the right kind of traffic. All he had to do was build an engine that created intentional traffic. […]
