Is the Expectation of Privacy Reasonable?

If anything has become clear to me in the past few weeks, it is that the issues and contradictions of identity and privacy online are coming to a head. With the process of virtualization having taken place over years and years, human society has now reached such a mass of users on the network that we are encountering disruptive externalities of the establishment of a network identity with increasing frequency.

While I have made my own personal position on Facebook’s recent issues clear, I think that their situation is in fact reflective of a larger issue than even that massive social network’s decisions. Hundreds of millions of people now experience some portion of their lives online, and they are beginning to come to grips with the relationship between their ‘real life’ identities and their network identity. This can take the form of any number of expected and unexpected consequences as a result of aspects of one of their identities becoming publicly accessible to the connections of another.

In many respects, it is the tension between the varying expectations of privacy of these two classes of identity that is causing the outcry that we have seen among some users of Facebook. Jeff Jarvis and danah boyd have both written compellingly about this, suggesting that there are numerous legitimate reasons why a person might have an expectation of being able to separate different aspects of their identity. Mark Zuckerberg’s contrasting position has been stated to be that the maintenance of a separation between different aspects of an individual’s identity suggests a lack of integrity.

Looming over this conversation (and supporting Zuckerberg’s stance, if not validating his opinion) is the technical reality that the capabilities of the network permit the centralized aggregation of multiple identities. One of the strongest sentiments that I have seen reacting against complaints of Facebook’s changes is that users should never publish anything on the internet that they wouldn’t want to be made public. We have known for years that privacy is something of an illusion on the internet, and with the rise of big data analysis, even the notion of ‘privacy through obscurity’ is becoming quaint.

So where does this leave us? It seems to me that we can clearly see that there is a disconnect between users’ expectations of their privacy online, and the reality. There is also some degree of ignorance or misunderstanding about the costs and benefits of living publicly and sharing information with a wide range of entities. What I think needs to happen is a public, wide-ranging discussion of privacy and identity online that both allows users to understand the exchange that they engage in when they trade personal information for access to systems online, and also reaches an agreement on what an ethical model of user data collection and sharing would look like (perhaps something similar to the informed consent required for most medical and social sciences research). Following such a discussion, a society could then more formally negotiate rules and regulations that govern this crucial development in social conditions.

Without any such discussion, I believe that we are doomed to repeat this disruption of the clash of identities that is empowered by the network society.

Why I’m Deleting my Facebook Profile

It’s time that I put my money where my mouth is, so to speak. I’ve told many friends, and will announce soon on my Facebook profile, that I will be queuing my account for deletion by the end of the weekend. I expect that some people might wonder why I would do such a thing, and so this blog post is intended to explain my motivations.

As I recall, I joined Facebook sometime in late 2004 or perhaps early 2005. I had known about the site before, but it was open only to students at certain universities at that time. I submitted a request for my school to be added to the list of supported universities, and joined as soon as I was alerted by email that it had been. At about that time, Facebook had this privacy policy, according to archive.org: “No personal information that you submit to Thefacebook will be available to any user of the Web Site who does not belong to at least one of the groups specified by you in your privacy settings.” A post at the Electronic Frontier Foundation, and this visual infographic, demonstrate how that privacy policy has changed over time to one that is fundamentally more public by default. The most recent changes, announced late last year and then expanded at the Facebook developer conference f8 last month, have pushed me over the edge.

With these changes, Facebook created a new category of information known as Publicly Available Information, which includes your name, profile picture, gender, current city, networks, friends list and pages. If you want to see what information of yours is now publicly available through the Open Graph Protocol, try using this nifty website. Compounding the problem, Facebook has now transformed your profile page to restructure it around your ‘Connections’, which primarily consist of Pages that you’ve ‘Liked’. As we just noted, Pages are publicly available information, so effectively the majority of your page is now made up of publicly available information.

Although I am quite concerned with the issue of identity and privacy online (and have written about it here before), I do in fact share quite a lot of information publicly on the web. I have a public Twitter feed, I share papers that I have written on Scribd and presentations that I’ve delivered on SlideShare. I maintain a LinkedIn account that I use for professional networking, answer anonymous questions posed to me on Formspring.me and share pictures on Twitpic. And, as I explained at the top of this post, I have been on Facebook for years to connect with my personal friends.

Of all of these tools that I use to share information that I find interesting, relevant or otherwise worthy of attaching to my identity and publishing to a group of self-selecting listeners (aka my friends, followers and contacts), only Facebook has frequently and unilaterally altered the understanding of privacy that I am granted by their service. Their behavior in this regard is unpredictable and inconvenient, as it requires me to stay abreast of the latest alteration to their privacy policy, and adjust each new addition to the privacy settings once I discover it.

There has been a lot of vitriol and emotional outrage among some in reaction to Facebook’s recent changes. I don’t want to make my decision a part of that. It’s not a matter of opt-in vs. opt-out, or the appropriate role of social graph data-mining in powering targeted advertising to support a free web service. For me, the question is whether or not I trust Facebook with the privilege of serving as the broker of my identity online. In light of recent decisions taken by Facebook, and based on my own expectation of where I see this path leading for Facebook’s decisions in the future, I have decided that I no longer trust Facebook with that privilege.

I will be queuing my profile for deletion by the end of this weekend. If you decide that you no longer trust Facebook with the privilege of serving as the mediator of your identity online, click here to reach the page where you can request the deletion of your account.

If you are a friend of mine on Facebook and want to remain in touch, look for me here at this website, or at one of the other social media sites that I listed above, or call/email me with the contact info you’ve already got – it hasn’t changed.

Reflected Attention and Network Influence

Twitter has always paid attention to its most popular users, as defined by the number of followers who subscribe to their tweets. That number can be a useful metric because it provides some suggestion of the number of people who might receive a given message. The higher the number, this thinking goes, the more people are likely to receive the message, and the more influential the user who sends the message. With more than 200 Twitter users boasting over 1,000,000 followers according to Twitterholic, the potential reach of Twitter is considerable indeed.

Most of these Twitter Millionaire users are “real world” celebrities or public figures who have existing notoriety supported by more traditional forms of awareness-building in the mainstream press. However, given that there are these Twitter users whose audience is so large, how can we observe their influence manifested on a communication network like Twitter? Comedian Conan O’Brien has offered us a glimpse of this virtual influence in action.

A few weeks ago Conan O’Brien joined Twitter after the loss of his late night television program. He’s quickly gathered over half a million followers with his relatively spare tweets (only 11 tweets since 2/24/2010). In addition to publishing few tweets, Conan had not followed any other user on Twitter. Had not, that is, until yesterday when he announced that he was adding a user “at random.”

When I learned of this, I decided to make a brief observation of the growth in that user’s followers as a consequence of Conan’s follow. Sarah Killen, the user behind @LovelyButton whom Conan followed, has herself stated that she had 3 followers at the time Conan selected her. This article in TechCrunch yesterday afternoon notes that her followers had jumped to 1,300 in the time since Conan followed her. The article includes an update that, in the “few minutes” that transpired between the writing and publishing of the blog post, her followers had doubled to 2,600. By the time that I first took a sample of @LovelyButton’s followers, she had gained 8,755 followers in eight hours. As of the posting of this blog, @LovelyButton’s follower growth has slowed somewhat, but has reached 13,252 in a little more than 24 hours. This figure represents somewhat more than 2% of Conan’s total following. Not too shabby for reflected attention!
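The arithmetic behind those figures can be checked with a rough sketch like the one below. It uses only the numbers quoted in this post, with a few loud assumptions: Conan's audience is taken as exactly 500,000 (the post only says "over half a million"), the 8,755 figure is treated as her follower count at the eight-hour mark, and "a little more than 24 hours" is rounded to 24. The function names are my own, for illustration only.

```python
# Figures quoted in the post. CONAN_FOLLOWERS is an assumed round number,
# since the post only says "over half a million".
CONAN_FOLLOWERS = 500_000
START_FOLLOWERS = 3
AFTER_8_HOURS = 8_755    # assumption: treated as her count at the first sample
AFTER_24_HOURS = 13_252  # assumption: "a little more than 24 hours" rounded to 24


def share_of_audience(followers: int, audience: int) -> float:
    """Return followers as a percentage of a larger audience."""
    return 100 * followers / audience


def hourly_rate(start: int, end: int, hours: float) -> float:
    """Average number of new followers per hour over an interval."""
    return (end - start) / hours


share = share_of_audience(AFTER_24_HOURS, CONAN_FOLLOWERS)
early_rate = hourly_rate(START_FOLLOWERS, AFTER_8_HOURS, 8)
late_rate = hourly_rate(AFTER_8_HOURS, AFTER_24_HOURS, 16)

print(f"Share of Conan's audience: {share:.2f}%")
print(f"First 8 hours: {early_rate:.0f} new followers per hour")
print(f"Next 16 hours: {late_rate:.0f} new followers per hour")
```

Under these assumptions the share works out to roughly 2.65%, and the pace falls from about 1,094 new followers per hour in the first eight hours to about 281 per hour afterwards, which is what "slowed somewhat" looks like in numbers.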

Apart from adding new followers, @LovelyButton has also generated a lot of conversation online. Beyond the mainstream coverage online in the Los Angeles Times, MTV, EW.com and the Huffington Post – all big deal media hits, btw – Sarah has been the talk of many users on social media platforms like Twitter and Facebook. I created a search profile in Radian6 to collect a large number of tweets mentioning @LovelyButton, and it has 2,931 mentions (instances of the term ‘@lovelybutton’) to date. On the public Facebook status update search there are also hundreds of mentions of ‘Sarah Killen’, as well as a fan page with 150 members.

So what’s the point of all of this? It looks like @LovelyButton’s follower growth may have peaked already (see this graph of mentions), and is not likely to have a major resurgence without additional support from Conan (he tweeted about her again today) or her media appearances. Still, I believe that this event provides a valuable case study for understanding the impact of the network effect on virtual influence. Through the strength of his relationship with his audience, Conan was able to compel over 10,000 of them to take a discrete action: to follow @LovelyButton simply because he chose her.

And beyond Conan’s role in identifying her, we can also observe here the way in which the virtual audience achieves a level of agency in their choice to participate in and experience the event of @LovelyButton’s selection. Facebook users will choose a pickle over a rock band, 4chan users will spoil a news magazine’s reader poll, Slashdot readers will crash an under-prepared server with visits and YouTube viewers will love a video of Susan Boyle singing. In the same way, Conan’s choice of Sarah as a tabula rasa for his followers to project themselves onto is indicative of what influence and attention have become in the virtual world of friends, followers, page views and links.

Twitter Survey – Initial Results

**UPDATE**
Some more details from the survey can be seen in the presentation I gave this past week.
The report itself should be completed by Wednesday, and I have learned that it is to be published in the Online Journalism Review this summer!
**

My Twitter Survey is up! If you are a Twitter user, please take the time to fill it out.

Although the survey was only posted yesterday, a few interesting trends are already beginning to develop in the response data (n=70 so far). Demographically there are few surprises (users tend to be young, urban and educated), but it was interesting to note that fully 75% of the sample are true Internet veterans (they claimed 10+ years of experience online). This, combined with the demographic details, seems to suggest that a large portion of Twitter’s users are twenty-something college grads who grew up with the Internet at home or in school.

More than half of the sample (~60%) started using Twitter within the past 6 months, and about the same number of users also Tweet daily or more. This would seem to bode well for the further growth of Twitter, as there have been periods in its use when a significant number of my followers started accounts, Tweeted something to the effect of “what’s all this, then?” and never posted again.

Finally, a key element that had held back my own participation in Twitter appears to have been dispensed with. Twitter has now reached some degree of ‘critical mass’ that has allowed new users to find their friends on the site. Nearly everyone in the sample said that they follow people that they know in real life, and a personal relationship was the second most common reason for deciding to follow a user. The social elements of Twitter were not ignored, either, with 70% of the sample indicating that they engage in @replies with other Tweeple (that aligns well with one of the main findings from my study of Identity Performance in microblogging last year). This should be a relief to the people at Obvious, because for a minute there it looked like Facebook Status Updates could usurp Twitter’s role as the lead microblogging platform.

I have reserved most of the juiciest bits of information from the survey, but don’t despair. Once I complete my study I’ll certainly be posting more details, and of course keep an eye out for the white paper that I am compiling this data for. If you’re interested in more footnotes and theory than one would ever dream of packing into a blog post, take a look at my dissertation for the London School of Economics on Identity Performance and Microblogging.

Print is out of fashion, and out of time.

While only a year ago it seemed like print publications were going to hang on for years to come, over the last few months they have been dropping like flies. Newspapers are closing across the nation – Seattle and Denver both became one-paper towns this year, and the San Francisco Chronicle, New Jersey Star-Ledger and the Los Angeles Times are all in really bad shape, along with a host of others.

Magazines are suffering too. Just today, Alpha Media Group (publisher of Maxim) axed the print version of their music magazine Blender, and last month Conde Nast (publisher of The New Yorker, Vanity Fair, and Vogue) eked out their smallest magazine ever by ad pages.

This week, the University of Michigan Press has become the first major academic publisher to embrace digital monographs as their primary product. On their blog, the university’s provost Teresa Sullivan wrote “there’s no denying that the realities of how people learn and access information are changing traditional notions of education and scholarship.”

So is print dead? Apart from those coupon booklets that get shoved in your mailbox, I’d say that the evidence is increasingly mounting to suggest that it is, at least as a mainstream consumer product. Heck, countries like Australia and Mexico even have plastic banknotes these days, so you won’t even need to pay with paper money when you buy your luxury laminated Sunday edition of the New York Times!

The reality is that Sullivan is right: the way that people access information is changing. While we are seeing print newspapers closing down, we are also seeing them continue to function online, even if in a smaller capacity. This demonstrates their ability to adapt (finally) to the times. The printed newspaper is a decidedly industrial product, but the kinds of information resources that are thriving these days are products of our contemporary networked society. It’s not that people have stopped being interested in ‘the news’, but rather that they have found even more efficient, relevant and accessible information media than newsprint.

Malleable Media

I had been all over the place for a couple of weeks over the holidays and New Year’s, so maybe I had missed some advance warning. All I know is that one day I tuned my radio to Indie 103.1 only to discover that it had transformed into El Gato. Just like that, the frequency that had until recently been carrying both legitimate and faux indie rock began broadcasting Spanish-language music, repurposed at the whims of the frequency’s licensee Entravision (Indie was run in partnership with Clear Channel Communications).

In the past, perhaps I would have made a defiant stand and condemned Clear Channel and Entravision for betraying the community of listeners that had developed around the channel. At the very least I will say that closing down operations abruptly and firing the entire workforce without any notice is pretty ruthless. But what I am more interested in is how a radio channel can change its identity so suddenly. Of course there is a limited amount of radio spectrum to go around, so it’s not as if Entravision could have just created a new frequency to broadcast on, but it all goes to show the malleability of radio waves as a content delivery channel. Despite that, it is hard to ignore the real social consequences of this metamorphosis of the 103.1 MHz frequency in the Los Angeles area. The radio spectrum is a public good, and to be sure its licensee has the right to do as it pleases within the limitations of its license, but there are evidently different kinds of responsibilities when managing a public good as compared to a private channel, and the risks of offending that public trust are worthy of consideration. On the other hand, of course, one must consider the audience that will now be served by El Gato that was previously not listening to the station – we cannot allow the frustration of one group’s loss to erase the potential benefits of another group’s gain.

Still, it leaves us with the question of where the Indie (or most any other non-mainstream music genre) lover has to go. Although the internet has blossomed as an avenue for musicians of all degrees to get their music out (through channels like MySpace, iTunes and the thousands of internet radio broadcasts), it too is no panacea. Its resources, like the radio spectrum, are finite. There’s plenty of room yet to create considerably more focused channels online (although some people do talk about the potential limits of the present system of Internet Protocol (IP) allocation), and it’s worthwhile to note that an online El Gato wouldn’t need to replace Indie 103.1 in order to succeed; it could simply outperform it on its own channel.

Terrestrial radio isn’t going to go away – indeed, swathes of new spectrum have recently been auctioned for additional uses, and the impending switch to Digital TV transmission will open up further spectrum allocation battles. But fortunately for my desire to hear Henry Rollins play 63-minute-long experimental hardcore songs, Indie 103.1 understands the value that its target audience places on a direct channel through the internet. Click here to listen to their live stream at indie1031.com. El Gato, for its part, is still en construccion.

CAPTCHA, or discrete tasks, as Internet Currency?

An interesting story in the New York Times about CAPTCHA technology sparked an idea in my head for a proposal to address the real problem of poor financial returns in web content.

First, a quick primer: CAPTCHAs, or Completely Automated Public Turing tests to tell Computers and Humans Apart, are those sometimes annoying little boxes that you encounter when signing up for a web service such as email. Within the box is a Turing test, meant to determine whether you are a human or a machine that is automatically filling in the question field. The most common types ask you to type in a sequence of distorted letters and numbers, although now that automated bots are getting better at cracking those CAPTCHAs, you may encounter a box showing you a picture of a dog and a cat and asking you to select the one that is a cat.

The idea behind these tests is that it is quite easy for a human to tell a cat from a dog, but it is actually surprisingly difficult for a computer program to tell the difference, especially given the nearly infinite variety of poses, angles and lighting attributes that are possible in a given picture of Fido. Thus, CAPTCHAs provide a security layer that is intended to – and largely does – keep automated bots outside of the walls.

So what does all of this have to do with bringing higher financial returns for web content? Well, the Times article I linked before brought my attention to reCAPTCHA, a service that is using the 60+ million CAPTCHAs that are solved every day and applying them to the problem of automated book digitization. There are numerous projects underway to digitize massive amounts of books and store them in digital libraries, but these projects are often slowed by the fact that the recognition software is not foolproof. Some words are blurred or otherwise misinterpreted by the software when translating them to a digital form. reCAPTCHA uses the images of those misinterpreted words in its tests, which effectively brings the power of the Turing test to bear on these misidentified words. Once enough human users have given the same reading of a word, that reading is substituted into the digital book’s text, replacing the garbled or mistaken translation.
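To make the agreement step concrete, here is a minimal sketch of how such a consensus rule might work. It is not reCAPTCHA's actual implementation: the function names, the `MIN_VOTES` and `MIN_AGREEMENT` thresholds and the sample data are all hypothetical, chosen only to illustrate the idea that a garbled word is replaced once enough people give the same reading.

```python
from collections import Counter
from typing import Optional

# Hypothetical thresholds; the post only says "enough" users must agree.
MIN_VOTES = 3
MIN_AGREEMENT = 0.75


def consensus_word(answers: list[str]) -> Optional[str]:
    """Return the agreed transcription, or None if there is no consensus yet."""
    # Normalize so that "Quick" and "quick " count as the same answer.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < MIN_VOTES:
        return None
    word, votes = Counter(normalized).most_common(1)[0]
    if votes / len(normalized) >= MIN_AGREEMENT:
        return word
    return None


def repair_text(ocr_words: list[str], human_answers: dict[int, list[str]]) -> list[str]:
    """Replace OCR words at the given positions once humans agree on them."""
    repaired = list(ocr_words)
    for position, answers in human_answers.items():
        agreed = consensus_word(answers)
        if agreed is not None:
            repaired[position] = agreed
    return repaired


ocr = ["the", "qu1ck", "brown", "f0x"]
answers = {
    1: ["quick", "quick", "Quick", "quack"],  # 3 of 4 readings agree
    3: ["fox", "fix"],                        # too few votes so far
}
print(repair_text(ocr, answers))
```

In this toy run the second word is corrected because three of four readings match, while the last word stays garbled until more people have weighed in. A real system would also need to preserve capitalization and guard against coordinated wrong answers, which this sketch ignores.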

It’s a pretty nifty trick, and certainly a noble cause. But could the same kind of process be put towards more commercial purposes, so that the people who tag images or decipher text that computers cannot would earn money or credit that could then be put towards consuming digital content? One existing example that is something like what I am thinking of is Amazon’s Mechanical Turk initiative, which offers micro-payments to human users who go through the company’s vast stores of digital photos and page-scans to tag and identify data for Amazon’s stores and other sites. Amazon and its clients pay users in cash (albeit at very low levels) for the work, but what if the client instead offered an exchange: tag photos or copy scanned documents for our digital archive and we’ll give you a month’s subscription to our content. For some media houses like Time Warner or even the New York Times, there are decades of back issues that could benefit from this kind of arrangement, and it would provide them with some value to help bridge the gap between costs and revenues from advertising.

There are certainly a lot of potential problems with this kind of a system at the outset (an article from Salon.com suggested that mturk might be considered a “virtual sweatshop”), and many people might be unwilling to provide this sort of service in exchange for their daily dose of media – not to mention some media houses would no doubt balk at the idea of untrained, anonymous users having any measure of control over their archives – but it does seem like something worth giving further consideration to. Until internet advertising revenues reach more comfortable heights (which may take some more time) or users get more comfortable whipping out their credit cards to pay for content online (which seems quite unlikely), this method may be worth a try.

Who Clicks Those Ads?

One of the big trends of the Web 2.0 movement is towards advertising-supported websites and services. I am sure that there are a number of VCs out there who (after a while) started pulling out their hair every time a new-jack kid came down to the Valley talking about their ‘breakthrough’ social networking/bookmarking/video hosting/rating website and how they would fund it with ads, usually from Google’s AdSense or Yahoo’s Sponsored Search. But who is clicking those ads?

I’m not saying that advertising isn’t a reasonable means of making money off of your website, but advertising is not the be-all end-all answer to your financial problems. Not all sites are conducive to ads. For instance, YouTube and other video sites have been trying to figure out how to better monetize their sites for some time now – YouTube only features text ads on search result pages, instead of the video ads that one might expect on a site that, after all, hosts hundreds of television ad videos – without success. Some video sites have experimented with including ads before or after a video, but none of these sites have traffic anywhere near that of YouTube, the dominant video site on the net.

Despite these complaints by yours truly, money is pouring into sites in the form of ads. So this raises the question: who is clicking those ads? Personally, I try to not even look at them, and when some annoying ad chooses to float across my screen or unroll when I happen to drag my mouse across it, it tends to drive me away from the site hosting the ad rather than encourage me to click on it. After conducting an informal poll among some friends and colleagues, none of them said that they clicked online ads either – or at least they wouldn’t admit to it.

So who is generating those billions of dollars a year for web properties? Is there some kind of demographic split, where there are just scads of people who are happily clicking away at every ad that catches their eye, while I carefully avoid them? Apparently so, as online ads generated as much as $15 billion last year and are poised to continue to grow this year. At first I considered that this was similar to the split between those who flocked early on to DVR products like TiVo and those who were happy to watch the ads, but now that we’ve learned that two-thirds of DVR owners still watch ads, I don’t know what to think anymore!

Don’t get me wrong, I see a massive future for contextual, targeted ads – online and elsewhere. Obviously it is a benefit for the advertiser to be able to target their audience, so that they don’t waste time sending ads to people who aren’t interested in their product. Of course, there are some pitfalls to be wary of, but I’m sure they can be worked out in the future.

The New Proprietary Video Site

Ever since the emergence of online video on the internet, the new medium has presented a major challenge to the profit models of traditional television broadcasters and cable service providers. As the networks and cable providers were used to the idea of controlling the time and venue of broadcasts (requiring you to sit down on your couch at a certain time to see your favorite show), it has taken them a while to begin to see some of the inherent advantages of digital distribution of video content. Now, belatedly, video content providers are beginning to realize just how valuable digital distribution can be for their product. But is it too late?

The real issue behind the recent announcement of News Corp and NBC’s decision to create their own video site to challenge Google’s YouTube is the future of video content distribution. We are seeing the end of the network schedule. Consumers will no longer be willing to watch the shows they want to watch when the network wants them to watch them. This much is pretty well accepted, and between video on demand, DVR and Apple’s recently released Apple TV, it is clear that the way that people watch television is changing rapidly.

As it is, YouTube is the premier video website out there. It is as a result of this popularity that YouTube finds itself hosting all of these proprietary videos that are (illegally) uploaded by users. Google, citing the Digital Millennium Copyright Act’s Safe Harbor provision, says that it is not liable for copyright violations posted by its users, so long as it promptly removes them when requested to by the rights holder. Frankly I am not sure whether the networks should be creating such a scene over these copyright infringements, as YouTube has proven itself to be a very useful tool in promotions of television programming. Generally speaking, networks WANT their users to be more involved with their products, and few statistics speak better to the popularity of Viacom’s programming than the fact that their content has been viewed 1.5 billion times on YouTube, as they allege in their lawsuit. Certainly they will say that those viewings represent 1.5 billion missed advertising opportunities; however, it is not clear that the people who watch a Daily Show clip on YouTube would necessarily watch the program on television. As with the controversy surrounding pirated music, it is not always true that a download represents a sale lost, but it is always true that a download represents an impression gained.

The fact is that many, if not most, of the people watching The Colbert Report on YouTube would happily watch it on a Viacom-backed alternative. The problem is simply that that Viacom alternative does not exist. Sure, Comedy Central (and other Viacom properties) do offer selected video clips on their sites – indeed, they were one of the first of the major broadcasters to provide this kind of service – but the message that is being clearly sent by users who post copyrighted clips on YouTube is that they want to choose which clips they can see. If Viacom had been providing comprehensive video clips of their programming on their website from the get-go, then I see no reason why users would feel compelled to upload the same clips to YouTube or any other online video site. Instead, by telling consumers what they want, Viacom is driving their own loyal viewers to infringe on Viacom’s copyright by uploading the clips that they actually want to see.

Ultimately, judging from this newfound desire to defy YouTube by the major content providers, we may find YouTube returning to its roots – hosting amateur and independent video. Although I suspect that Google would like to get into the content-brokering business (and it has tied up with a number of serious rights-holders), it could be that YouTube and the major broadcasters decide to go separate ways, with the broadcasters providing their own clips and shows on a proprietary site while YouTube continues to grow in the amateur space.

What will be really interesting to see is if, given the widespread choice of videos by amateur and independent content creators on YouTube, consumers begin to spurn the major studios who fought so hard to separate their expensive content from the riff-raff, effectively walling themselves off from the vibrant marketplace contained there.