Credibility Judgement and Meta-Content

Most of us know that there’s a lot of rubbish on the web: content that is wrong for one reason or another, whether because it’s out of date, because the author didn’t understand the subject or because they were deliberately trying to mislead. Most of us would also like to think that we can tell the difference between “good” and “bad” content and act accordingly. But is that really true? Can I really differentiate between reliable and unreliable information about, say, a particular health problem? Even if some people can tell the difference all of the time, something that I highly doubt, it’s clear that some users can’t. In some cases, maybe this doesn’t matter too much. Health and finance, though, are areas where relying on bad information could have serious repercussions.

So what’s this got to do with meta-content? I mentioned previously the similarities between the mass publication of bad meta-content that Web 2.0 brought about and the mass publication of bad content that was facilitated by the web itself. I’m most interested, though, in how meta-content could help individual users to make better judgments about the credibility of the information that they find online.

Social bookmarking, the ability to share, classify and comment on web links, is a relatively common activity, albeit not something that your average web user takes part in. Services like Delicious and StumbleUpon help users to locate information that may be of relevance or interest, but they also allow users to write comments or reviews of the resources that they bookmark. In this way, social bookmarking services effectively allow users to annotate the resources that they find with their own opinions. My hypothesis is that these comments could help users to make more accurate credibility judgments about the information that they encounter online, even in domains where they have relatively little prior knowledge or experience.

Not all meta-content is created equal, though. If some meta-content can help people to make better credibility judgements then the challenge is how to encourage the meta-content that is helpful in this respect and minimise the amount of noise. To accomplish this, I propose the use of “nudge”-like techniques within the user interface to influence users as they create meta-content.

There are a few (subtly) different ways to describe what a nudge is, but the original definition, provided by Thaler & Sunstein in their influential 2008 book “Nudge: Improving Decisions About Health, Wealth, and Happiness” is:

“… any aspect of the choice architecture that alters people’s behaviour in a predictable way without forbidding any options or significantly changing their economic incentives.”

I’m currently running a study to test out whether nudges could be useful in this way, and to keep the experiment “clean” I won’t explain the nudges that I’ve designed yet. I’d love more people to take part, though. If you’ve any interest in health, fitness or well-being then head on over to fitness.gathr.co.uk to take part!

Web Science: What I think it is and why we might not be doing it.

Web Science is not doing science on the web, it’s not about the web, and it isn’t science. My view on what Web Science is and why sometimes I think we don’t actually do it.

Reader beware: The post below is an awful mishmash of half-formed ideas and potentially contentious thinkings. That said, I’d love to hear what you think, so have a read and leave a comment!

The question “what is Web Science” is one that comes up again and again, to the point of becoming a running joke. “Web Science is whatever you want it to be” is one of the more liberal caricatures that I often hear. What’s clear, though, is that up until this point most of the definitions have been given by people that I (respectfully) refer to as “Web Science Immigrants”. So, what is Web Science to a “Web Science Native”, someone who now has “MSc Web Science” affixed to their CV for the rest of eternity (or long enough at least for the distinction to be irrelevant) and (supposedly) should have a feel for what the whole thing is all about?

What seems to be quite clear, certainly to me and to some of the other people I speak to, is that some of what’s labelled “web science” isn’t really Web Science at all. Some of it is Web Technology, and some is “science about the Web”, and neither of these is the same as Web Science, although there is evidently some overlap. There is no shame in that, and there is undoubtedly some fantastic “web science” research going on, but Web Science should be more than a catch-all term for things that combine science and the web. As Wendy Hall sometimes says: “There are two problems with the name ‘Web Science’: ‘Web’, and ‘Science’.”

The problem with ‘Web’

The first problem with the word ‘Web’ is that everybody seems to have a different idea of what ‘Web’ is. Here are just some of the definitions that I’ve come across:

  1. An abstract information concept, the idea of having interlinked resources with unique identifiers (hypertext)
  2. A set of technologies
  3. The set of interlinked HTML (etc.) documents that exist now
  4. A series of social phenomena arising from 1 or 2
  5. A subset of the interlinked documents that we have. This suggests that our “personal web” is just one web in a potentially infinite webiverse. (If an HTML document is generated but nobody bothers to read it, does it really exist?)
  6. All of the above

The second problem with the word ‘Web’ is that web science isn’t just about the Web. Even allowing a broad definition that encompasses all the previous definitions (and allowing for the cardinal sin of conflating “web” and “internet”) there are, in my opinion, genuinely Web Science questions that don’t involve the Web. In fact, I see the word web as shorthand for “technology and people”, although I would be prepared to strengthen that definition slightly to “information technology”, since I don’t see Web Science legitimately encompassing the impact of trains on society.

So, this leads me to rule number 1: Web Science research should consider both the technology and the people that are involved in a system. Yes, this definition excludes just studying the web graph and making statements about density or the average shortest path between two web pages. We needn’t exclude graph theory or network analysis from Web Science, though; quite the contrary, it’s clearly massively relevant. Web Science requires that, having done the maths, we can go on to say something about the people. Or, conversely, that having studied some human behaviour, we can say something about the technology. It’s all about the co-constitution, after all.
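To make “density” and “average shortest path” a little more concrete, here is a minimal sketch of the sort of purely structural calculation I have in mind. The four-page graph and the function names are made up for illustration; a real web crawl would be vastly larger and messier.

```python
from collections import deque

# Hypothetical toy web graph: each page maps to the pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def density(graph):
    """Fraction of all possible directed links (ignoring self-links) that exist."""
    n = len(graph)
    edges = sum(len(targets) for targets in graph.values())
    return edges / (n * (n - 1))


def average_shortest_path(graph):
    """Mean number of clicks between ordered pairs of pages, counting
    only pairs where the second page is reachable from the first."""
    total, pairs = 0, 0
    for start in graph:
        # Breadth-first search gives the fewest clicks from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            page = queue.popleft()
            for nxt in graph[page]:
                if nxt not in dist:
                    dist[nxt] = dist[page] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the start page itself
    return total / pairs


print(density(links))
print(average_shortest_path(links))
```

Both numbers describe the shape of the graph and nothing else. Under rule 1, the Web Science part only begins once we ask what those numbers mean for the people creating and following the links.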

The problem with ‘Science’

The problem with the word ‘Science’ is that it excludes disciplines that don’t see themselves as sciences and invites the “hard” sciences to deploy all manner of inter-disciplinary name calling and stereotypes in order to “defend” “real science” from “woolly” “rigourless” “qualitative” “social science”.

Try to explain how the web and people influence one another without mentioning law or the humanities. You can’t do it. The law defines aspects of the web graph as much as (if not more than) the technology itself. A court order could ban links to, or prevent access to, a website that offers illegal material; a court order can alter the web graph.

So, here’s rule number 2: Web Science research involves knowledge, methods or epistemologies from both human-centric and technology-centric disciplines and it needs to do more than just pay them lip service. In fact, to properly stick to rule 1 and comment on the relationship between the people AND the technology, it’s highly likely that there will need to be a mix of research methods.

We study the Web itself

Even if we adhere to the two rules above, there is huge scope for variation within Web Science and clearly some research will be more about the social aspects and some more about the technical. But social/technical distinctions aside (and I think a discussion about whether that’s even a distinction worth making would be genuinely useful) there are different ways to combine disciplines. We have to choose not just which disciplines to use, but whether we want to make use of knowledge, research methods or entire methodologies. We can combine disciplines, analyse the web and still not be doing Web Science. Allow me to illustrate this point:

In November of last year, a group of us visited Tsinghua University Graduate School in Shenzhen, China, to undertake a collaborative project looking at how young people in China and the UK view other countries. We took data from fora and bulletin boards, used natural language processing techniques to generate statistics and then visualised those numbers.

We learnt something about attitudes (people) by using technology and even something of the state of the technology itself, but I don’t feel like we said anything about how the technology and the people interact, or how the technology and people shape one another. No, this felt to me like using web technology to answer a sociology or politics question. To me, this was not quite web science. It was science ON the web, it was not science ABOUT the Web.

“How do young people view other countries” is a sociology question, and we tackled it using data from the web and methods from computer science. It was interdisciplinary in the sense that we attempted to answer a question from one discipline with methods from another, but it still didn’t feel like we were ‘living the Web Science dream’. I think that true Web Science would instead ask “How does the web influence young people’s views of other countries?” or “How does the web expose people to other cultures?”

So, here is rule number 3: Web Science should say something about the relationship between the people and the technology. We should question how technology facilitates and alters behaviour or beliefs, how it impacts upon the economy or how laws evolve to counter new problems, how people create new technology and how social pressures impact upon its adoption and potentially translate into obstacles or social problems such as exclusion or deviant behaviour.

Summary

I don’t believe that a lot of “web science” is actually Web Science. Web Science is not necessarily about the web, nor is it necessarily science; it is the study of how technology and humanity work together, shaping one another. Maybe we should really be calling it “Information technology-and-people studies”. We may need to use any or all of the models, knowledge and methodologies that humanity has found in order to study itself, and all of the models, knowledge and methodologies that humanity has found to study and create technology.

I believe that, in order to be considered Web Science, research should satisfy at least the following three conditions:

  1. Web Science research should consider both the technology and the people that are involved in a system,
  2. Web Science research involves knowledge, methods or epistemologies from both human-centric and technology-centric disciplines,
  3. Web Science should say something about the relationship between the people and the technology.

Want to add something, think I’m wrong or have your own view on what Web Science is? Leave a comment and let’s work it out together!

Thoughts on Meta Content

Over the next few posts, I want to tackle some of the issues from my PhD research around “meta-content” (comments, reviews etc.). Here’s an introduction to meta-content, my research, and why I think it’s interesting.

Web 2.0 is characterised, in part, by a massive increase in user-generated content. YouTube, Flickr, Blogger, Tumblr et al. let anyone publish just about anything: videos, photos, essays, news reports. But in addition to this new “primary” content comes a wave of user-generated opinion in the form of comments, reviews, trackbacks, discussions, video responses, flaming, trolling and rick-rolling. We now have billions of dollars’ worth of everybody’s two cents.

Content             | Meta-content
Wikipedia Article   | Article Talk Page
YouTube Video       | Viewer comments
Blog Post           | Reader comments
Online news article | Reader comments
Website             | Comments on delicious / StumbleUpon
*                   | Discussion on reddit

It’s this “other stuff” that I’m most interested in, and it is the direction in which my PhD is heading. It’s this other stuff that I call “meta-content”: content that is about other content. The table shows a type of online content on the left and a corresponding type of meta-content on the right.

Often, when looking for information, the content itself is what seems most useful, or most interesting; but frequently the meta-content surrounding it provides a resource in itself.

It’s not hard to think of a situation where the meta-content might be more useful than the content itself. Take the Wikipedia example: The article might provide a fairly neutral account of a topic, a subset of the “facts” that everyone can agree on, but the talk page can provide a much better understanding of the discourse around an issue, of the opposing points of view or which aspects of an issue are contentious.

Similarly, the comments on a news article can provide a better idea of the debate surrounding events than the story itself. In many cases comments provide balance to biased reporting or correct inaccuracies.

Of course, meta-content is not all balanced intellectual discussion. The most obvious issue is comment spam, although there are technological solutions that do a reasonable job of stemming that. The spam problem aside, some types of meta-content have a reputation for being particularly unhelpful or unpleasant (the comments on YouTube videos are a good example) and, far from contributing helpful information, much meta-content contributes nothing but anecdote and rumour.

In many ways, the problems posed by meta-content are no different to those posed by web content in general. The move away from a publishing model where publishers and peers act as “gatekeepers” to a model where anyone can publish anything brought with it new problems with inaccurate or deliberately misleading information. As the barriers to publishing are lowered, it is almost inevitable that more bad content will follow. We still don’t really have a solution to the problem of bad content on the web (although Hypothes.is is trying), save for educating people to be a bit more critical about the information that they find.

The problem of useless or malicious meta-content might not be insurmountable, though. Meta-content is the result of social engagement with content and is, therefore, mediated in part by the social norms within the community that produces it. Online communities have their own cultures and norms and these undoubtedly arise as a result of both the people within those communities (and the cultures that they bring with them) and the online environment (design, usability, affordances) itself.

There’s some interesting research that showed how the design of a website affected the thoughtfulness of user contributions, and my own research is trying to use psychological “nudges” to alter the composition of user-provided reviews in a social bookmarking context. The basic premise is that if we can find ways to shape the cultures and norms of an online community, or to promote certain types of thinking, then we potentially have the ability to start steering meta-content in the direction that we want.