“Put your tweets on your website”

Have you seen a message like the one below when using Twitter?


Wonder why Twitter’s so keen for you to do that? A quick check of the settings page gives it away:


Yep – Twitter’s encouraging website owners to embed Twitter widgets on their websites so that Twitter can track the websites that its users visit. The real travesty here is that, I suspect, the majority of people embedding the widget and the majority of Twitter users have no idea that the tracking is taking place. Not cool, Twitter.


US “Fusion Centers” in Privacy #fail: Organisational Approaches to Privacy

Last Tuesday, the United States Senate published a report into so-called “fusion centers” that were set up post-9/11 to share intelligence between agencies to support counter-terrorism activities. The report is pretty damning on a range of issues (poor financial accountability, out-of-date or poor-quality intelligence, officials insisting that non-existent centres did exist) but of particular interest, to me at least, are some of the findings related to privacy.

In summary, the fusion centres each represent a geographical area (states or cities) as “a collaborative effort of 2 or more Federal, State, local, or tribal government agencies that combines resources, expertise or information with the goal of maximizing the ability of such agencies to detect, prevent, investigate, apprehend, and respond to criminal or terrorist activity.” To this end, the centres produce intelligence reports that are sent to the Department of Homeland Security (DHS, similar in scope to the UK Home Office / Ministry of Justice and created by the consolidation of a range of security-related departments).

Privacy Failures

In the USA, the Privacy Act* governs the collection, maintenance, use and dissemination of personally identifiable information (PII) by federal agencies. One finding of the Senate report was that “if published, some draft reporting could have violated the Privacy Act.” Specifically “DHS officials also nixed 40 reports filed by DHS personnel … at fusion centers after reviewers raised concerns the documents potentially endangered the civil liberties or legal privacy protections of the US persons they mentioned.”

This, to me, raises two concerns:

  1. Why, given the fundamental nature of the privacy protections in both the Privacy Act and US Constitution, were fusion centre staff not better trained to compile reports?
  2. Since the Senate report focuses on counter-terrorism efforts but acknowledges that fusion centres play a significant role in other intelligence activities, it seems possible (even likely) that other privacy-sensitive reports could have been compiled and not checked/corrected/stopped by staff at the DHS.

Both of the above look like symptoms of the fairly standard “privacy as the last thing to think about” syndrome that seems pervasive in most organisations. So, how could organisations implement privacy protection as more than just a reactive bolt-on?

* Unlike EU Data Protection rules, the US Privacy Act applies only to federal agencies (and not bodies such as courts) and has no equivalent of (eg) the UK’s ICO.

Organisational Approaches to Privacy Protection

These two issues made me think about how organisations can structure themselves to protect the privacy of the individuals they collect data about. After some thought, I have identified three models that could be used. I expect there could be more, and in practice I expect that most organisations have a hybrid arrangement.

1. The Firewall


The first model I identified is the one that seems to be used by the fusion centres – I call it “the firewall.” Within the organisation, there is little consideration given to privacy protection, but publications and data dissemination are controlled by a “firewall” that is designed to prevent the publication or dissemination of materials that could undermine individuals’ privacy.

This is similar to the model used by some companies for PR purposes – Employees are not allowed to talk directly to the media and are expected to route such communications via the Public Relations department.

Pros:
  • It’s probably easier (and cheaper) to train employees to send materials via the correct channel than to train them on privacy protection policies and best practice.
  • As in the case of the fusion centres, there is a failsafe in place even where employees should know better.

Cons:
  • The firewall can prevent publication or dissemination, but it’s less clear how it could be used to enforce restrictions on internal processing or storage of data.
  • In large organisations, internal firewalls might be required to properly control data but would certainly slow down communication and introduce a layer of bureaucracy and expense.
  • Whilst, on the face of it, the firewall looks like the most rigorous way to ensure data dissemination and publication doesn’t violate privacy-protection policies, it is impractical to shut down all channels of communication, especially when the lines between organisations are blurred, such as in the fusion centres.

2. The Point of Reference

The second model I identified I call the “point of reference” – This is the model that universities use to enforce research ethics. A body within the organisation is tasked with maintaining privacy policies and advising other parts of the organisation about what they can and cannot do. The rest of the organisation needn’t understand all the intricacies of privacy protection, but must know enough to identify when they should consult the point of reference.

Here at the University of Southampton, the rule for when we should contact the Ethics Committee is fairly* straightforward: Whenever we conduct research that involves humans or animals.

Pros:
  • Unlike the Firewall model, the Point of Reference can be equally applied to data collection, maintenance, storage, use and dissemination.
  • It is easier for employees to identify WHEN they need to consult the Point of Reference than to understand all of an organisation’s privacy policies.

Cons:
  • Unlike the firewall, which can provide a reasonably good failsafe (as in the case of the fusion centres – At least so far as DHS reports are concerned), unless the point of reference also has the authority to pro-actively check activities throughout the organisation it could easily be bypassed.
  • The point of reference could become a point of friction if employees do not understand enough about organisational privacy policies to understand decisions that conflict with their goals.

* I say fairly, because there are some edge cases; does scraping Twitter involve human participants?

3. Culture of Privacy

My third model is what I call the “Culture of Privacy”. In this model, each employee within an organisation has a working knowledge of the organisation’s privacy policies and privacy is seen as an integral part of the organisation’s operations. In this model, employees are responsible for more than just knowing when to refer to a point of reference but have a personal responsibility for protecting the privacy of data subjects in the course of their work. This model involves the most training and support, and probably also involves appropriate sanctions for employees that engage in their own “privacy counter-culture.”

Pros:
  • This model applies privacy principles to all aspects of an organisation and allows for a degree of monitoring between employees.
  • If privacy is seen as part of an organisation’s core principles or even identity, then it is less likely to be seen as a hindrance.

Cons:
  • In practice, making privacy a core value is probably a pretty difficult thing to do (especially in engineering companies [hello Google, Bing, Facebook] where “what we can do” is more of a concern than “the side effects of what we do”).
  • An internal culture of privacy is likely to be dependent on a wider culture that respects privacy. There seem to be differences between the EU and the US in this regard and the motivation to create such a culture might be stronger in the EU, given the stricter Data Protection regime.
  • Even with good training, employees are likely to require additional advice and support – So this model probably doesn’t work well by itself and probably needs to be considered alongside a point of reference.

Hybrid Models

As I alluded to previously, adopting a single model to try to enforce privacy protection within an organisation is probably not a good approach. None of the models is perfect and (in the EU at least) the implications of failing to adequately protect data subjects’ privacy are serious enough that privacy protection is worth doing properly.

Creating hybrid models of privacy protection, for instance combining a point of reference with a firewall model for any substantial inter-organisation data transfers, is probably a better way to ensure that data subjects’ privacy is respected than (as the DHS appears to have done in the case of the fusion centres) relying on a single measure to enforce privacy protection.


The case of the US Fusion Centres illustrates atrocious project management on a number of fronts – But the apparent lack of robust privacy protection measures for data subjects is perhaps among the most unsettling. I’ve briefly explained three ways in which privacy protection could be implemented in an organisation, one of which (the firewall) appears to have saved the Fusion Centres from an even more damning report. However, in reality privacy protection needs to be at the heart of what organisations, especially data-intensive ones, do; and that probably involves a hybrid approach in which failsafe procedures are combined with a supportive environment and a culture in which employees consider privacy an important part of what they and their organisation strive to be.

There are issues that I haven’t explored about how privacy needs to be re-framed from a hindrance to engineers and service designers to being an enabler for the rest of us.

The First Interdisciplinary Web Privacy Seminar @ Southampton: Thursday 1st November 2012

Thursday, November 1st, 10:00 – 15:00, Building 32 Coffee Room

Many of us within the Web Science DTC at Southampton, and beyond, have research interests related to privacy. To foster collaboration and to help develop some common understanding and direction, we’re arranging a day-long seminar on web privacy on Thursday, November 1st 2012. Refreshments will be provided by the Web Science Doctoral Training Centre.

We’d like to invite anybody who’s working on privacy to take part and we hope that all attendees will give a short presentation (5-20 minutes) about their research or interest in privacy, focusing (if possible) on some or all of the following questions:

  1. What IS privacy?
  2. Why is privacy important?
  3. What changes, if any, do you think could improve our privacy? Technical, social, legal or otherwise.

After the presentations, we’ll discuss the questions that have arisen and examine possibilities for future research.

To register for the seminar, please use the form below and do get in touch, R.Gomer (at) soton.ac.uk, if you have any questions.

Richard & Maire

On the Ethics of “Consent”

TL;DR: Consent matters when it comes to cookies that could expose sensitive personal attributes (health, income, age, sexuality, religion, ethnicity), even if you don’t mean to collect them. Collecting these things could put the subject at a small but appreciable risk. The only person in a position to decide whether a personal attribute is sensitive is the subject (and even they may have trouble). Getting consent is different to getting someone to click the “I consent” button. People are irrational, don’t pay attention and are goal-focussed – It’s not OK to exploit that in order to get a meaningless but legally-acceptable “consent” signal.

Hand-Waving in the general direction of consent

Consent is one of those ideas that seems to permeate through every level of society. At a macroscopic level we talk of citizens being governed and policed by consent, and at a smaller scale consent underlies the relationships between individuals. It is only rarely that someone can be compelled to do something without their consent at some level – Whether that’s macroscopic consent derived from their participation in a democratic society or case-by-case consent formed through contract or interpersonal agreement.

What underpins the idea of consent is that the entity giving consent (whether an individual or a group, and sometimes both) has a meaningful choice to make: Do I or do I not want to enter into a particular set of rules or conditions?

Consent and Cookies

So, what does consent have to do with cookies? An advertising network that tracks my visits over multiple sites isn’t compelling me to do anything, but it is taking decisions, the right to digital self-determination, away from me. As I’ll come on to later, people deserve a choice when it comes to data about them, and when an advertising network starts covertly collecting data that choice is taken away. Secondly, the EU 2009 e-privacy directive specifically requires that

“the storing of information, or the gaining of access to information already stored … is only allowed in the event that subscriber or user concerned has given his or her consent, having been provided with clear and comprehensive information … about the purposes of the processing.”

Consent vs “I Consent”

When piloting the study I’m working on at the moment, I spoke to several people about their experiences with cookies and asked most of them about the new “consent” dialogues that have sprung up on UK websites since May*. The overwhelming response seems to be that people have seen them, but don’t really pay attention to what they say or understand the decision that has to be made. That’s not surprising; people have been ignoring warnings about security certificates for years.

Here’s the difference between actually consenting to something and clicking on a consent button (or worse, “continuing to use this website indicates your consent”). The legal basis for determining whether a user has consented seems to be rooted in the same discredited notion as economics: that human beings are rational and self-interested. Moreover, it assumes that people will always read, understand and give proper thought to the information that they’re shown. We know that both these things are categorically not true. Relying on human psychology to trick users into “giving consent” whilst simultaneously pretending that such consent is in any way meaningful is ethically indefensible. What matters is not whether you can get a user to click a button (probably after a shallow heuristic evaluation rather than critical thought) but whether you can say with any certainty that users are actually happy for you to do what you’re doing (and you can’t assume that they’d be happy if you haven’t actually told them).

If these techniques were proposed as “nudges” (and default options can be legitimate nudges) they would be rejected on the grounds that they’re not in the interest of the subject or even of broader society.

* May is when the Information Commissioner’s Office claimed that it would start enforcing the UK’s Electronic Communications Regulations, as amended.

Why “digital self-determination” matters

By “digital self-determination” I mean the right to control data about oneself – Even in situations where it would be hard (although not impossible) to link that data back to the individual it relates to. Every time data about a person is stored, there is an unknown increase in the risk of harm to the data subject. It’s not the job of Bing, DoubleClick or Facebook to make risk decisions on behalf of the data subjects – The data subject is best placed to know which personal attributes are potentially sensitive given their personal circumstances.

Why does it matter if a company collects data about the web pages I’ve visited? There’s no single answer to that question. Some people have no reason to care, but others may have several. Advertising companies know that the web pages people visit can tell you something about them – They exploit that knowledge to target adverts based on what they think you’re likely to buy. What somebody’s likely to buy is not the only thing you can infer. Consider the following examples:

A web user searches for advice about problems with their eyesight and tremors. In the UK those web searches wouldn’t be too sensitive – Our health care is free at the point of use. In countries where people rely on private health insurance that web search could be construed as evidence of a pre-existing medical condition and preclude the data subject from appropriate care if they were later diagnosed with Multiple Sclerosis.

You could make a reasonable inference as to the sexuality of somebody who routinely visits PinkNews.co.uk. For some people that’s not a problem, but for some people such a revelation could cause family or employment difficulties.

What about the social stigma around depression and suicide that might be invoked by disclosure of visits to the Samaritans website? Or the consequences of an abusive partner finding that their victim was seeking domestic violence support? An employer that found out you’d been uploading a CV to Monster?

Shouldn’t those sites just stop using third party services that could track their visitors? Probably. But that’s not enough – Newspapers carry stories about these topics and links to those websites. Bloggers that rely on free services don’t have a choice which third parties get to track their visitors.

“We use behavioral advertisers – People can accept it or leave”

Do people have a choice of whether they take a risk with their personal information? Perhaps they do, but should people have to make a choice between risking personal data and using a website? That is surely a form of indirect discrimination.

So, what’s your point?

The current system of tracking, the paternalistic attitude that companies have towards subjects’ data and the technology that allows companies to track users with no consent is broken. Something has to give: Either data protection legislation needs to be strengthened (or just enforced – Yes, ICO, looking at you), companies that make money from surreptitiously stealing people’s data need to start behaving more responsibly, or the technology needs to be tweaked to give web users a break.

Introductory Thoughts on Cookies

What’s this all about?

During my internship at MSRC, I’ve been focussing on how we can visualise cookies to help people better understand what they’re doing and how they work. But there are other issues tied into this: Privacy (what it means for privacy to be undermined, and who has the ability to determine whether an action undermines an individual’s privacy), technical issues (how can we guard against “abusive” tracking cookies without breaking the cookies that really are needed?), legal issues (particularly around data protection, the EU privacy directive and what informed consent is) and even some economics (how do cookies support content providers via ad networks, and how do you balance that against user privacy or make content worth the privacy risk?).

I think there are a few issues that keep coming up, no matter which way you approach cookies: The technical insolubility of preventing an ID from being used for several purposes, knowing what an ID is being used for, balancing the needs of websites to track users internally to optimise content versus users’ right not to have their browsing history across multiple sites snaffled by advertising networks.

Why selectively blocking tracking cookies might backfire

I’d be interested to see whether one could differentiate between “types” of cookie with accuracy good enough for general use – There’s no technical distinction, so I think it would be far from easy. The P3P approach, in which cookies are delivered with a machine-readable privacy policy, seems like it might address some of the problems with categorising cookies, but (from experience of trying to implement P3P policies for websites) it’s pretty complicated and feels out of place alongside the simplicity of HTTP itself.
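To make the P3P idea a little more concrete, here’s a minimal sketch (in Python, purely for illustration) of pulling the token list out of a P3P “compact policy”, the abbreviated machine-readable policy that P3P-enabled servers send in a `P3P` response header. The three tokens in the example are genuine P3P 1.0 compact-policy tokens; a real implementation would map the full vocabulary to human-readable meanings.

```python
# Extract the token list from a P3P compact-policy header value,
# which looks like:  P3P: CP="NOI DSP COR"

def parse_compact_policy(header_value):
    """Return the list of compact-policy tokens from a P3P header value."""
    start = header_value.find('CP="')
    if start == -1:
        return []  # no compact policy present
    end = header_value.find('"', start + 4)
    return header_value[start + 4:end].split()

print(parse_compact_policy('CP="NOI DSP COR"'))  # ['NOI', 'DSP', 'COR']
```

Even this fragment hints at the complexity problem: the tokens are opaque three-letter codes that neither users nor most developers can read without the specification to hand.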

But if we could tell “this is a cookie that keeps you logged in” and “this cookie is just for targeted advertising”, would that help?

Sites that set multiple cookies generally seem to do so out of convenience (it’s easier for, eg, product teams to have their own cookie); a single cookie would probably suffice technically for the overwhelming majority of sites that currently use multiple cookies. There may therefore be a downside to widespread categorisation of cookies as “authentication” and “tracking”: sites might start consolidating into fewer multiple-purpose cookies that are harder for users to control individually, removing any shred of transparency that currently exists. I also suspect that, rather than reduce the lifespan of that single cookie to reflect the often limited lifetime of the current cookies, companies would just give the single cookie the lifetime of the longest-lived current cookie, which undermines privacy further.

Knowing what cookies are for: A policy problem?

There could be a policy response that insisted on a certain level of atomicity in cookie use (not using a single identifier for technically-necessary identification like authentication and non-essential uses like tracking). Implementing that seems like it would either a) have a lot of side-effects for eg companies (like Facebook) that operate advertising only within an authenticated environment (differentiating the ID of the user and the advertising recipient makes little sense) or b) have a lot of loopholes to accommodate them.

Which cookies do I even want to control?

Contexts seem to play a role in the idea of privacy – I don’t care so much that the Guardian knows which stories I’ve read, but I do care that an advertising network knows which stories I read on the Guardian and which stories I read on the Telegraph and which product I looked at on Amazon – A third-party that doesn’t respect the “natural contexts” in my browsing is more troubling to me.

Applying contexts to the Cookie Jar

I think contexts could be implemented in the web browser. Sites could, by default, operate in a “sandbox” – A cookie for Facebook set in a first-party scenario (I’m on a Facebook URL at the time) can only be seen by Facebook. A DoubleClick cookie set in a third-party context while I’m on Guardian.co.uk can only be seen by DoubleClick when I’m on the Guardian – When I’m on the Telegraph, DoubleClick sets/sees a different DoubleClick cookie. This wouldn’t interfere with analytics on the site itself, and would still allow sites to track return visits without bothering the user for all the largely-innocent cookies.
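The sandbox described above amounts to keying cookies by a *pair* – the domain that set the cookie plus the top-level site the user was on at the time – rather than by the setting domain alone. Here’s a toy Python model of that idea (class and method names are my own invention, not any real browser API):

```python
# A toy model of the "sandboxed" cookie jar: cookies are keyed by
# (cookie_domain, top_level_site), so a third party like DoubleClick
# gets a *different* cookie on each first-party site it appears on.

class SandboxedCookieJar:
    def __init__(self):
        self._jar = {}  # (cookie_domain, top_level_site) -> value

    def set_cookie(self, cookie_domain, top_level_site, value):
        self._jar[(cookie_domain, top_level_site)] = value

    def get_cookie(self, cookie_domain, top_level_site):
        # Lookups from a different top-level site miss, by design.
        return self._jar.get((cookie_domain, top_level_site))

jar = SandboxedCookieJar()
# DoubleClick sets an ID while the user is reading the Guardian...
jar.set_cookie("doubleclick.net", "guardian.co.uk", "id-123")
# ...which it can read back on the Guardian, but not on the Telegraph.
print(jar.get_cookie("doubleclick.net", "guardian.co.uk"))   # id-123
print(jar.get_cookie("doubleclick.net", "telegraph.co.uk"))  # None
```

First-party cookies behave exactly as before (the setting domain and the top-level site are the same), which is why site-local analytics and return-visit tracking keep working.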

HTTP cookies already have a couple of properties that can be specified at creation (to eg restrict them to HTTPS connections or prevent access from client-side scripts) – A new property could allow cookies to break the sandbox and become global, accompanied by a user confirmation, perhaps using a P3P-like policy to tell the user what the cookie is for, like you get when adding an App on Facebook.

“Facebook.com wants to set a tracking ID on your browser. It will be used to:
– Keep you logged in to Facebook services provided on other websites
– Track your browsing activities for the purpose of behavioural advertising
Do you want to accept this tracking ID?”
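As a sketch of what setting such a cookie might look like: `Secure` and `HttpOnly` are real `Set-Cookie` attributes, but the sandbox-escape flag (called `Global` below) is entirely hypothetical – it’s the new property proposed above, not anything browsers actually support.

```python
# Build a Set-Cookie header value. Secure and HttpOnly are real
# attributes; "Global" is a hypothetical flag for the proposed
# sandbox-escape mechanism, which would trigger a user confirmation.

def build_set_cookie(name, value, secure=False, http_only=False,
                     global_scope=False):
    parts = [f"{name}={value}"]
    if secure:
        parts.append("Secure")    # real: only send over HTTPS
    if http_only:
        parts.append("HttpOnly")  # real: no access from client-side scripts
    if global_scope:
        parts.append("Global")    # hypothetical: escape the sandbox
    return "; ".join(parts)

print(build_set_cookie("track_id", "abc123", secure=True, global_scope=True))
# track_id=abc123; Secure; Global
```

Because the flag is set at creation time, just like `Secure` and `HttpOnly`, the browser knows at the moment the cookie arrives whether a confirmation prompt is needed.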

Sandboxing the browser cache in a similar manner would help to prevent some of the other tracking mechanisms, like caching a unique image and then reading it back using a JavaScript canvas. I think that prevents large-scale tracking of a user’s browsing across many websites, but still allows cookies for legitimate cross-domain purposes (like Facebook comments on blogs) to work. The policy response then just needs to deal with companies that misinform users about the purpose of the cookies they ask to have un-sandboxed, and possibly to require that sites use separate global cookies for different purposes, so that users get some granularity in what they allow.
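The cache sandbox is the same partitioning trick again, applied to cached resources instead of cookies – a sketch (illustrative names only, including the tracker URL):

```python
# Partition the browser cache by (resource URL, top_level_site): a
# cached "tracking pixel" fetched on one site is a cache miss on
# another, so cache-timing and cached-ID tricks stop working across
# sites.

class SandboxedCache:
    def __init__(self):
        self._cache = {}  # (url, top_level_site) -> response body

    def store(self, url, top_level_site, body):
        self._cache[(url, top_level_site)] = body

    def lookup(self, url, top_level_site):
        return self._cache.get((url, top_level_site))

cache = SandboxedCache()
cache.store("https://tracker.example/pixel.png", "guardian.co.uk",
            b"pixel-bytes")
# Hit on the site where it was cached, miss everywhere else:
print(cache.lookup("https://tracker.example/pixel.png", "guardian.co.uk"))
print(cache.lookup("https://tracker.example/pixel.png", "telegraph.co.uk"))
```

The cost is some duplicate caching of genuinely shared resources, which seems a fair price for closing the side channel.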

A social nudge?

There’s space for a social nudge here, I think. I sometimes feel like if I don’t accept eg an app’s permission request I’ll miss out (possibly coupled with a strong cultural influence to avoid saying no at all costs!). A message like “3000 people have said no today” lets people feel that rejecting a cookie/request a) is socially acceptable and b) won’t disadvantage them, at least relative to this big number of other people.

If DoubleClick wants to incentivise the user to accept a global tracking ID by giving them something in return, then great!