
A single version of you

In my last post I suggested that there are around 200 organisations that have their own version of you in their systems. Your shopping history with them (and only them), your credit card information, your address (possibly an old one), your email (possibly out of date, or incorrectly spelled through manual error), your age (guessed), financial status (assumed), marital status (incorrect), mobile number (work not personal) and so on. Some right, some wrong, some just out of date.

What would it be like if you could manage a single version of you? You’d want to make sure that not all of the data was held in one place – a potential honey-pot for hackers – but instead you could pull it together from other sources on demand so that you could make sense of it. Indeed, those 200 organisations could ask you for permission to subscribe to it, on your terms. And you’d be able to manage the context of how it’s shared – who gets to see what. On one hand you’d be able to give a bank access to your credit history so that they can authorise a loan, but then you could rescind that access after seven days. You’d be able to prove who you are to a hotel without them having to take a photocopy of your passport.

You’d be able to share your contact details with your employer, your children’s school and your family, but then be able to keep that information hidden from a high street retailer who asks for your mailing list details. You’d then be able to share your delivery contact information with that same high street retailer’s delivery company because that has context (but it would only be until the TV was delivered; afterwards they have no need for the data and you’d be able to remove that organisation’s subscription to those details). And perhaps interestingly, you could share with those organisations the information about what you actually want and need to buy. In real time, they could see your intentions. Some trusted organisations would even be allowed to recommend things that you don’t yet know you want, so there would still be some sense of serendipity in the system.
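To make the subscription idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Grant` and `PersonalDataStore` names, the field names, the sample values and the start date are all hypothetical and do not describe any real personal data service. For simplicity the sketch keeps the values in a dictionary, whereas the idea above is that they would be pulled from their sources on demand rather than held in one place.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Grant:
    """Permission for one organisation to read one field (hypothetical model)."""
    subscriber: str
    field_name: str
    expires_at: Optional[datetime] = None  # None means "until I revoke it"
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        if self.revoked:
            return False
        return self.expires_at is None or now < self.expires_at


@dataclass
class PersonalDataStore:
    """The individual's own view: their data plus the grants they have issued."""
    data: Dict[str, str]
    grants: List[Grant] = field(default_factory=list)

    def grant(self, subscriber: str, field_name: str, now: datetime,
              days: Optional[int] = None) -> Grant:
        expires = now + timedelta(days=days) if days is not None else None
        new_grant = Grant(subscriber, field_name, expires)
        self.grants.append(new_grant)
        return new_grant

    def read(self, subscriber: str, field_name: str, now: datetime) -> Optional[str]:
        """Return the value only if an active grant covers this subscriber and field."""
        for g in self.grants:
            if (g.subscriber == subscriber and g.field_name == field_name
                    and g.is_active(now)):
                return self.data.get(field_name)
        return None


now = datetime(2012, 9, 1)  # arbitrary illustrative date
me = PersonalDataStore({"credit_history": "good", "delivery_address": "1 High Street"})

me.grant("bank", "credit_history", now, days=7)          # lapses after seven days
delivery = me.grant("courier", "delivery_address", now)  # open-ended until revoked

bank_day_3 = me.read("bank", "credit_history", now + timedelta(days=3))
bank_day_8 = me.read("bank", "credit_history", now + timedelta(days=8))
retailer = me.read("retailer", "delivery_address", now)  # never granted

delivery.revoked = True  # the TV has arrived, so the subscription is removed
courier_after = me.read("courier", "delivery_address", now)
```

The design choice worth noting is that access is decided per subscriber and per field at the moment of reading, so the bank and the courier each see only what their context needs, and only for as long as they need it.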

Something for everyone

Just think about the benefits here for businesses. No more buying data about customers on the open market to make sure their Customer Relationship Management systems are up-to-date. No more guesswork about what individuals might buy and when, spending good money after bad on advertising which has a hit-rate of 0.5%. Instead there will be deeper, richer, more valuable relationships with customers. And more accurate stories about the individual, written by them, on their terms.

And if that wasn’t enough, the single truth about an individual could be combined with others to create a single truth about a whole group – no more need for guesswork about demographic approximations and customer segmentations. Instead you’ll have customers – who trust the organisations – forming groups to tell lots about themselves; data which could be pulled together to make sense of a particular need or group behaviour. Very useful indeed; it would turn customer insight on its head.

If it was possible to manage such a personal data single truth to which others could subscribe, then it would be a place for more than this idea of ‘Volunteered Personal Information’ (i.e. all those forms you fill in); it could also be a place to capture and manage lots of other types of personal data, for example your receipts (think email receipts), travel information (think London’s Oyster scheme data) and bank statements (a la lovemoney). I called this your ‘created data’ – the stuff you generate daily by shopping, browsing, travelling and sharing.

If we could pull all this information into one place it would create a much richer picture of who you are – a much richer view than those 200 organisations have about you today. It could capture the books you bought on Amazon as well as the ones at Waterstones or Barnes and Noble. It could capture what you spent on food at restaurants as well as in the supermarket. Wouldn’t that be handy? A holistic view of you – wouldn’t that be useful for companies who want to ‘personalise’ their service? Once again, if it was pulled together in a meaningful way – with a story written by you – wouldn’t it have more value than the dry data organisations buy today from the data markets?

A new way to tell stories

Imagine that this idea of a single place for your personal data could pull together all that information – not to store, but for reference – and then help you make sense of it, help you tell your story. (It’s important to note that this single place isn’t a centralised ‘vault’ or ‘cloud backup’ for all the original data, but instead is a place where all these different feeds can come together to help you make sense of them all – a bit like Flipboard for your personal data, with some parts that are authored by you.)

And imagine being able to share some of that information with others like you who have similar intentions and needs, and to express them in groups so that organisations can see aggregated demand, and perhaps even build new products and services to meet those group demands, knowing that there’s enough volume to warrant the investment.

Wouldn’t that be brilliant.

And it’s possible today.

By the end of the year, it’s expected that there will be at least half a dozen personal data services available to the public that will do just this – some of them are listed here. There’s been more than $100m invested by venture capitalists around the world in these ideas. Together with the myriad of personalised services appearing all around us, together with digital banking and ewallets and e-health systems and smart meters and social platforms, they are all straws in the wind. Together these things are going to fundamentally change the way we tell our stories. And who writes, edits and publishes them.

Why stories matter

This is the second of three posts about your personal data, about why they tell a story, and why that matters.

When you sew together the many separate pieces of my personal data, you can start to tell a story…

  • A story about my shopping habits (think Tesco clubcard)
  • A story about my finances (think Mint)
  • A story about my likes and dislikes (think Pandora)
  • A story about my relationships (think Facebook)
  • A story about my skills and career (think LinkedIn)

And indeed when you put my data together with a larger group, you get even more powerful, collective stories…

But the cautionary tale here – and one which I’ve told above, and many times before – is of organisations trying to tell the story about you, but not with you. If it’s not you doing the writing (or it’s done without your permission) then the story won’t be a real one; it’ll always be at least in part a fictional one. In the same way a Hollywood movie is sometimes ‘based on a true story’, the organisation is likely to exaggerate some parts of the tale and completely miss out others so that the story has a particular ‘angle’. Braveheart, Titanic, 127 Hours and The Queen all used poetic license to twist a true story into a beginning, middle and an end so that it could tug at the heart strings, ultimately to achieve blockbuster success.

This idea of ‘taking an angle’ is what I think Facebook, Google and others do when they target ads at you. They have pieced together their version of your story without your involvement, and in most cases their assumption is that you are always looking to buy stuff. To their mind it’s the only reason you’ll be on the internet. You’re not there to connect, share and learn, but instead to buy. (To be fair, these companies often have nobler goals like to connect everyone or to organise the world’s information, but to fund these bold endeavours they feel they have to flog stuff to us which is a shame.) Anyway, this assumption – their use of poetic license – is wrong. Sometimes of course, they’ll get the story right and you’ll be looking for something to buy, but we can see that it’s only once in a thousand (and once in two thousand for Facebook) that they’ll show you something you’re actually interested in, and that’s probably more luck than judgement.

Becoming the author, editor and publisher

Stories need to be crafted. This means someone needs to write the words, someone needs to edit them and someone needs to publish them. Most of the time we’re the writer, filling out form after form usually in order to benefit from a product or service. At other times we let others write about us, for example our credit reference agencies or hospitals. And sometimes we’re the editor, when we get to update our data through a self-service portal, or we’re able to mash up our own datasets for other purposes, like with mint.com. And finally social media tools have enabled us to become publishers in our own right.

It’s worth noting that once the data is written down or captured on someone else’s computer we lose control over what happens to it; just like the book that’s put on the bookstore’s shelves, we can’t control who buys and reads it, nor control if it’s copied or if the contents are mashed up with other material to make something new. Indeed I don’t believe that we should have such control, something that certain elements of copyright have been trying to achieve for a long time… In a previous post I’ve written about how it’s not really possible to have control over our personal data once it’s published; instead we need more transparency over where our data is held and a right to reply when we’re not happy with how it’s used.

Anyway, back to stories. Becoming the author, editor or publisher means that our personal data stories can be set in context, and have intent. In so many cases though it’s someone else writing, editing and publishing our personal data and so our stories aren’t told in the right way. Some organisations even sell on our data so that even more organisations can write their version of the story, once again though without our editorial or publishing control.

It’s possible, just possible, that you know more about you than the organisations which want to sell things to you. Like where you actually live. Or what you like. Or when’s best to contact you, if at all. Or the actual reason you bought that album. It’s estimated that right now there are about 200 organisations that have their own ‘version of you’; government departments, online retailers, the driving licence authority, your current and past employers, your broadband provider, your grocery store’s loyalty team.

But what if you could write, edit and publish a single, true story about you, so that these 200 businesses could get what they need, each in context, and each with permission?

Wouldn’t that be a great idea?

The answer is yes. And it’s coming to a cinema near you soon…

Thinking more about personal data

Relationships are everything.

Relationships are the reason we look after each other, the reason we reproduce, the reason we form groups and ultimately the reason we evolve. Relationships are simply one of the fundamental parts of being human.

These relationships with others – our family, our friends, our neighbours, our colleagues, our customers – are all based on different levels of trust. When we talk about ‘deep’ or ‘strong’ relationships, we just mean to say that we trust each other a lot, sometimes unconditionally. It’s obvious then – but worth making the point – that when we don’t trust each other we form weaker or perhaps shallower relationships. Trust and relationships are not only related, but symbiotic. They need and feed off each other.

Personal data is naturally one of those things we share with those we trust. Information about who we are, what we are doing, where we are, our physical and emotional selves and so on. But sometimes when we share personal data, trusting that what’s shared will be handled with care, there are unintended or unwanted outcomes. In this post I wanted to look at those unwanted outcomes from sharing personal data, and some of the steps we take to manage it.

Who said you could do that?

When we share our personal data, it’s sometimes used in ways we don’t agree with, in ways we didn’t sign up to. I think there are three of these outcomes…

  1. Being contacted without permission (or good reason)
  2. Being impersonated without permission
  3. Being exposed without permission (or good reason)

When we have a relationship, often implicitly or perhaps culturally we agree the rules of engagement – how often, when and where we are happy to contact each other. And because we have a relationship, we are able to set those boundaries (and reset them when they are crossed). But sometimes we are contacted by people without our permission or good reason, and by people or companies with whom we have no relationship. So the first of the unwanted outcomes is about spam, stalking and unsolicited advertising. In order to contact us, people either need to obtain our contact details (phone numbers, email address, twitter handles etc.) or they need to track us so they can target their communication by knowing where we are, what we’re doing, or what device we’re carrying (here’s a great link to a recent New York Times article entitled That’s No Phone. That’s My Tracker).

The second is about identity theft. That is, someone we don’t know using our personal data in order to access our money, our government benefits or citizen rights (for example using our passport information to get into the country). Sometimes the data is obtained through phishing, and sometimes it’s hacked. Experian recently released a report showing that more than 12 million pieces of personal information were illegally traded online by identity fraudsters in the first four months of 2012 – outstripping the whole of 2010 (interestingly, about 90% of it was password/login combinations). Regardless of how our personal data is obtained, it’s often being used to impersonate us without our permission.

The third is more interesting – it’s about permitting, or seeking to have control over information about us which is shared with others. Naturally, we’re pretty good at doing this for our physical selves – we use clothes and curtains to keep private what we don’t want other people to see. But when it comes to personal information it’s different. What are the ‘clothes and curtains’ for our personal information? Is it even possible?

The thing is, information has some interesting characteristics. George Bernard Shaw once said (something like): “if you have an apple and I have an apple and we exchange these apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas.” His point was that some things behave as if they’re abundant. It doesn’t matter how many times you copy them and share them, the original remains the same, as do the copies. These things are known as ‘non-rival’ goods. This idea of abundance is a powerful one, because it helps explain how we treat abundant things.

For a long time, sharing things was limited to people who were in the same place at the same time, or limited to those who could write things down, copy them and take the bit of paper or parchment away. In other words, there was a cost to sharing, a friction to sharing. And so sharing was contained, for better or for worse. But then the printing press came along, then later the telephone and more recently the Internet, and we’ve been able to copy information at an increasingly low cost. In fact today, the costs to copy are pretty much zero – as Kevin Kelly brilliantly puts it, “The internet is a copy machine”.

Anyway, sharing our digital information has now become so easy and so cheap that all day, every day we share things without thinking. And like Bernard Shaw’s ideas, we’re now sharing our personal data abundantly – perfect copies of this data can be made and shared widely at pretty much zero cost. And this abundance of sharing begins to scratch away at our sense of relationship with those with whom we share our personal data. Where, how, why and when it is shared is often unclear to us. And with the loss of these relationships, we’ve lost the trust in how that data is handled; people started contacting us without permission, impersonating us without permission and sharing information about us without permission.

Protecting our privacy

Let’s take an example to bring this to life a bit. Earlier this year, much was written about how Target, a goods retailer in the US, figured out a teenage girl was pregnant before her father did.

Aside from the fact that there are some social and ethical issues to be explored here, the point is that whilst Target were correct in their analysis, they contacted the girl about the pregnancy without her permission, and they exposed her personal data without her permission. As we go about our daily lives we leave a digital exhaust – a digital footprint – and our personal data is often left behind like Hansel and Gretel’s breadcrumbs. Track enough of it (like what lotions and vitamin supplements you buy) and compare this with other known group behaviours (like those who you know are pregnant and who are buying baby clothes, nappies and pregnancy books) and of course it’s possible to make some accurate assumptions about an individual. I’ve previously called this your ‘inferred data’. So it’s understandable that we’re becoming more wary about what is being shared – both with and without our permission – and we’re seeking to protect our privacy to avoid these unwanted outcomes.

To look more closely at how we protect ourselves, I’ve broken down the lifecycle of personal data:

  1. Data is produced (or observed if it’s self-evident);
  2. Data is captured and stored;
  3. Data is analysed or processed; and then
  4. Data is used

Here’s an example of this in action…

  • I wear clothes that expose my Harley-Davidson tattoo
  • My tattoo is seen by the man serving me at the bar
  • The barman makes an assumption – that I’m into biking and believe in what Harley-Davidson stands for
  • The barman strikes up a conversation about bikes, and because he too is into bikes, we share information about each other. The result is that we start to trust each other. We form a relationship. He might even give me a beer on the house.

Now let’s take a more obvious digital example…

  • I browse the web using an internet browser
  • Using cookies, my browsing activity is tracked by the web sites I visit
  • My behaviours are analysed – both in real time and afterwards
  • My subsequent web browsing is targeted with ads to better ‘personalise’ the service. Importantly, the targeted ads are paid for by companies trying to build a relationship with me. But it’s not really a relationship. And there’s no trust. It’s really just a transaction at best, and I’m seen as a sales lead to be sold on

This use of my personal data means I get a better experience (like remembering my ‘shopping basket’) and sometimes I get a good deal on my purchases. But mostly it just makes my browsing experience a bit noisy because the ‘targeted’ ads are assumption-based and are often more miss than hit. These two examples highlight how it’s the context of sharing that determines the permissions to share – some are explicit, while others are implicit – and therefore the outcomes i.e. stronger relationships and lower prices or instead a loss of trust and shopping frustration. As we live more and more of our lives online these issues have become increasingly apparent, and there are now many groups and bodies who are looking at the social, ethical, economic and political issues surrounding personal data.

I see that these projects fall into two camps… The first are looking at who knows what about us – in other words, steps 1 and 2 above. For example there is lots of work going into making the public aware of exactly how much data is being captured about them, by whom and for what purpose. The second group are looking at how this data is handled once it’s captured; that’s steps 3 and 4.

Privacy in action

Now rather than delve into the ins and outs, rights and wrongs of digital privacy (not least because there are many more qualified people than I who have written credibly about it, and at length), I wanted to point to some of the main activities aiming to help us manage our personal data and avoid those unwanted outcomes I suggested at the start of this post.

Below is a list of some of the main things going on around personal data; I’ve broken them down into the stages of the personal data lifecycle, steps 1-4. (Note that some of these are links to specific projects, and others are just linked to sites that provide more information)…

1. Produce
2. Capture
3. Analysis
4. Use

Who’s in control?

A big part of sharing our personal data is the bargain we make with online services when we agree to give up a bunch of data in return for some utility – a better deal, access to my friends’ information, accurate search results and more. Cory Doctorow highlights one of the great underlying issues here when he points out that “…even if you read the fine print, human beings are awful at pricing out the net present value of a decision whose consequences are far in the future.”

So I would suggest that we’re sharing our data abundantly, and not really ‘pricing in’ the full cost of doing so. The thing is, culturally we’re so much more comfortable with scarcity. When things are scarce we value them more highly, and when things are abundant we treat them cheaply (in Clay Shirky’s words, abundance means ‘cheap enough to waste’ and therefore ultimately ‘cheap enough to experiment’). And so it is with our data – we value it and so want it to remain scarce. Our instinct is to hold on to it, restrict it, secure it and sometimes misdirect others around it (like when we give out a fake email address to avoid getting spammed). And yet we give so much of it away, not really fully aware of the T&Cs under which we agree to share it. This pretence of scarcity means we end up saying things like ‘who owns the data?’ or ‘who controls the data?’, something pretty much impossible once it’s been shared in this digital age.

In my view, we should instead reflect on the idea that our personal data is now in many ways a non-rival good, it’s abundant, and perhaps behave differently around it. That would mean we would instead say things like “who has access to the data” and “what are people doing with my data”. It would mean new terms and conditions for sharing, perhaps those under which we can feel more confident about how our data is being used, and under which we can benefit from the products and services exchanged. Sharing would be more transparent, and we’d have the right to take action if our data is incorrect, or there’s an abuse of the data. Once we get some degree of visibility of who has our data, in what format, why and how they are using it, I think something interesting will happen: trust will emerge. And with that trust, new relationships. Indeed, we may begin to actually share more – an idea already proposed by those looking at Volunteered Personal Information. And as we share more – under clear and transparent terms – everyone will win: new products and services will become available (think of patientslikeme.com but for everything), our existing services will get even better because they will matter to us (and not be based on guess work), and guess what, we’ll feel better about it all because there won’t be a sense of any hidden agenda with our personal data, which after all, is personal.

A couple of suggestions

So I’d say that we need two main changes to how we behave around our personal data:

  • We need to recognise that we can’t control data in every circumstance: instead let’s accept that and turn to ways to improve transparency: information sharing agreements, regulation for organisations to be clear about what data they gather and how they use it, and perhaps new ways to make us more aware of what we’re sharing in the first place so we can make informed decisions
  • We need to better understand personal data in context: what it is we really need to share, when and with whom (here’s a good example: to prove we are old enough to buy alcohol, we often use a document that proves we can drive. We can and should get better at using personal data in context – we only need to share what we need to share)
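The alcohol example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names, the shape of the response and the sample dates are all hypothetical. The point is simply that the shop receives a yes/no answer to the question it actually asked, and never sees the date of birth or anything else printed on a driving licence.

```python
from datetime import date


def is_at_least(birth_date: date, years: int, today: date) -> bool:
    """True if the person has reached the given age on `today`."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    age = today.year - birth_date.year - (0 if had_birthday else 1)
    return age >= years


def answer_age_check(birth_date: date, minimum_age: int, today: date) -> dict:
    """Share only the yes/no claim; the birth date itself stays private."""
    return {
        "claim": f"age >= {minimum_age}",
        "result": is_at_least(birth_date, minimum_age, today),
    }


# Illustrative values only.
response = answer_age_check(date(1990, 6, 15), 18, today=date(2012, 9, 1))
day_before_18th = is_at_least(date(1994, 9, 2), 18, today=date(2012, 9, 1))
```

In practice the answer would need to be vouched for by someone the shop trusts, which is outside the scope of this sketch.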

I’m hopeful that much of this is on the way. But there’s a lot more to do.

Thinking about identity

So recently I’ve been trying to get my head around identity, privacy, trust and personal data. And how it all fits together, and why that’s important.

I thought it’d be best to go back to basics and try to define identity, so here are a few thoughts.

Identity is context

If identity is how you express yourself to others – a statement of what you believe, ‘who’ you are – then it must vary by who you’re with, where you are, what you are doing, when you are doing it, and why. For example, my identity can be a number of things.

  • Football supporter (e.g. who I’m with)
  • Temple-goer (e.g. what I believe in)
  • Volunteer (e.g. what I’m doing)
  • Soldier (e.g. what I’m wearing)
  • Employee or pupil (e.g. where I am)
  • Conference attendee (e.g. what interests me)

I can of course be any of these things at the same time (I can be a football supporter at work, or a soldier while volunteering), but the important thing is that it’s all about the context.

Identity is sharing

If identity is about you in a context, then it must also be about how others perceive you in that context (indeed we often say “I can identify WITH her” – that we have a connection with each other in some way). This means that by definition, identity has to be about sharing – sharing things about ourselves with others – so that an opinion can be formed.

But what is being shared?

  • The logo from a personal device?
  • A uniform?
  • A username and password?
  • A medical record?

Of course it’s all of these and much more. It’s all personal though – all Personal Data.

Identity is personal

Before we can understand identity we must first look at personal data.

What is personal data? It’s clear to me that this term means so many different things to different people. I think now is a good time to define what exactly personal data is, and how different types of personal data have different characteristics. So here are some terms I think are helpful to explain what Personal Data means.

Self Data

This is the stuff you’re born with – blood type, sex, fingerprint, genome information, date and location of birth. It’s both self-evident and usually captured at birth. A lot of it is called your ‘biometric’ data.

The value here is two-fold: as a set of data it is entirely personal to you and can’t be duplicated; and it’s perpetual – it will never change.

Being Data

These are the types of data that are a result of being alive. Height, weight, religion, sexual orientation, BMI, shoe size and health diagnoses are all types of Being Data. These types of data are likely to be steady-state for most of your life, though are subject to change during times of physical, emotional or spiritual growth or upheaval. It’s captured in various ways; most often as medical records.

The value here is being able to see patterns of cause and effect, and to influence behaviours throughout our lives.

Attributed Data

This is the data that is attributed to you by others, or is data that you claim yourself (and which is usually validated by a 3rd party). This includes education, awards, achievements and most government data held about you – driving licence data, criminal record, National Insurance number and census data.

Attributed Data is also any record of things you own (or look after) – including your car, your credit card, white goods, pets and mobile phones. As such, it also includes your contact details: your address, phone number, email address and social media handles e.g. Skype, Twitter etc. (of which of course, you can have multiple versions e.g. work email and personal email).

The value of Attributed Data is that we can determine levels of trust: so that we can both trust others and ask others to trust us.

Created Data

This is pretty much everything you generate and produce yourself: photos and videos, status updates, browsing and shopping history, banking and financial records, plus the records of all your communications (phone calls, SMS etc.). It’s also your intentions, your wish-lists and personal ‘check-ins’ (a la Foursquare).

Importantly, Created Data must be published somewhere, even if that’s in your own private documents.

Created Data reflects snapshots of you, frozen in time. The value is that together this data tells a story about you, the richness of which increases over time.

Inferred Data

This is the last category of personal data; and it’s not really personal. Inferred Data is data that someone (or something) has assumed about you. This data is not produced for you, nor is it generated on your behalf; it is there to serve some other purpose. Some would say this is so that a company can better ‘target’, ‘acquire’ or ‘own’ you.

Inferred Data includes your credit score and other segmentation data that companies hold about you. The value of inferred data is ultimately for organisations to make sense of their customers.
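For anyone who prefers to see the five terms as a data model, here is a small Python sketch. The class names, the `is_assumed` property and the sample values are my own assumptions, invented purely for illustration; only the five category names and the example items come from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    SELF = "Self Data"
    BEING = "Being Data"
    ATTRIBUTED = "Attributed Data"
    CREATED = "Created Data"
    INFERRED = "Inferred Data"


@dataclass(frozen=True)
class Datum:
    """One piece of personal data, tagged with its category."""
    name: str
    value: str  # made-up sample value
    category: Category

    @property
    def is_assumed(self) -> bool:
        """Inferred Data is assumed by others; the other four categories are not."""
        return self.category is Category.INFERRED


records = [
    Datum("blood_type", "O+", Category.SELF),
    Datum("shoe_size", "9", Category.BEING),
    Datum("driving_licence", "valid", Category.ATTRIBUTED),
    Datum("wish_list", "new TV", Category.CREATED),
    Datum("credit_score", "720", Category.INFERRED),
]

assumed = [d.name for d in records if d.is_assumed]
not_assumed = [d.name for d in records if not d.is_assumed]
```

Keeping the category on each record makes it easy to separate what has been assumed about a person from everything else that is held about them.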

It became clear to me as I was writing this that these different types of personal data are actually layered upon each other, a bit like Maslow’s Hierarchy of Needs. See the diagram below.

Self Data is the core of us – indeed it’s used by forensic and security teams to prove who we are. Being Data is a by-product of living our lives; of interacting with each other and of changing as we grow older. Attributed Data is the next layer, which in Maslow’s terms is social: data about things we are responsible for and about how we interact with our community/society. Created Data is at the top of the hierarchy, as this is about self esteem and self expression (note that much of this is a product of our digital society – consider how much Created Data exists for a homeless person, or a Buddhist monk).

You’ll see that there are two halves – the ‘real me’ is on top – visible and shared – and the ‘hollow me’ is underneath – not usually visible to us as it’s held by others (usually organisations). I’ve done this deliberately to show that the real me is my actual data – my actual medical records, my actual location, my actual intentions. This is very different to the Inferred Data which is really guess work, and which is produced by – and only ever for – others.

You can see that Inferred Data can reflect any type of personal data – inferred sex or age, inferred state of health, inferred level of education, inferred address, inferred financial activity, inferred location, inferred intentions. Inferred isn’t real; it’s based on assumptions.

So it’s clear – to me at least – that identity is contextual, about sharing and about personal data. And that inferred personal data can result in an inferred identity – a ‘hollow me’.

I believe these ideas underpin what we mean by privacy and trust, but more on that another time.