December 2008 Archives

Privacy and Anonymity

| 0 Comments | 1 TrackBack

A few weeks ago, I had the opportunity to have a discussion with Dr. Greg Shannon about privacy and anonymity. He had some very interesting insights, and with his permission, I thought I would share some of the conversation. I've made some edits-- mostly my responses, to make them more concise-- but nothing too major.


  1. What is your definition of privacy? anonymity? Especially from an information-theoretic point of view?

  2. I contend that privacy and anonymity are illusions, and always have been. In particular, I contend that your privacy and anonymity are mostly a function of the fact that no one cares about you per se. What do you think?

  1. I would define anonymity, from an information-theoretic point of view, as the ability to conceal identifying characteristics of yourself. This might be as simple as your name, but information warfare tends to be much more complex than that-- so for instance, being able to act with anonymity may require you to conceal not just who you are, but who gave you the information upon which you are acting; this, in turn, may prevent you from taking certain actions, due to the risk of revealing an information source.

    Privacy is somewhat more nebulous; I suppose I would posit that privacy is the ability to release only the information that you choose, in an information theory sense. For instance, I might consider intimate relations with a significant other a matter of privacy; they are certainly not anonymous, but they are not things one would wish to share with the outside world. (Of course, there are exceptions.)

  2. It is nearly always possible to breach a wall of anonymity, it is true-- especially given the unlimited resources of a government (especially one endowing itself with police powers). I don't think that makes the concept invalid. A related example might be that we classify safes not as "impenetrable," but rather on their ability to withstand an attacker with certain grades of tools and expertise for a certain length of time-- and yet we do not consider physical security an illusion. I would submit that in a similar vein, we construct a wall of anonymity in proportion to the attacks that it is likely to face-- a government compared to, say, a disgruntled classmate.

Hmm. But what does "conceal" mean?

Do you see any difference between identifying characteristics such as labels like name, address, etc., and behavioral characteristics like walks fast, avoids the letter "e", likes video blogs over audio blogs, etc.?

Reputations are hard to maintain and verify as it is without the Internet. For example:

http://www.nytimes.com/2008/12/15/business/15madoff.html

Reputations cut both ways. And, since it's hard enough to gauge an entity when you can look at them, I can understand why people have little interest in truly anonymous on-line interactions, for the most part.

I believe that part of this comes from the fact that reputations are built in part by reputation by association. A degree from Harvard, JHU or Purdue is meaningful precisely because of the reputation by association. Names and binding matter.

I don't know that I agree with your assumption that the Internet makes it harder to gauge the true nature of an entity. One of the advantages of the Internet, in my view, is that it allows us to fully separate appearance from thought.

Of course, this is unrealistic-- primarily (and perhaps most depressingly) because people don't want to have to make a full-stack valuation on everyone with whom they come into contact, on or off the Internet, though there are other reasons. So we do need a system for determination and communication of reputation. (One such system would be that in my previous blog post.) My point would be that there's nothing in reputation that can't be preserved in anonymity-- to take from a recent movie, people can know V's words and deeds, without ever knowing who he truly is. The same is true with reputation by association; to take Shade (from another previous blog post) as an example, we can ascertain that she works as a key player in the Identity space, simply through her active association with OpenID and its major players. If Purdue, in your example, has a special reputation, that is conferred upon its students, regardless of their names.

The only useful definition of concealment of which I am aware related to information theory would essentially be a type of deniability-- that is, actions related to the information acquired from a secret source do not reveal the existence of that source, and toward that end, do not increase beyond some threshold the likelihood of the existence of a secret data channel of which the enemy needs to be aware.

Perhaps, then, there's a better definition of privacy: preventing any one entity, or coalition of entities, from forming a coherent picture whose outline reveals details you would like to keep secret. As an example, a friend of mine once bought two rolls of duct tape, twenty-five feet of rope, one box of condoms, and a birthday card; had he simply bought them at different stores or on different occasions, none of them would have been exceptional, nor would there be any particular thing to tie them together-- but their simultaneous purchase greatly horrified the saleslady at the store in question, even though he assures me they were for four different purposes.


So then, people of the Internet: what do you think? Are there additional properties you think are important to either privacy or anonymity?

Dispatch From Montana

| 0 Comments | 0 TrackBacks

Hello, friendly blogpeople!

I'm back in Montana at the moment, surrounded by several feet of snow that's only just now begun to melt. I've been taking a bit of time off from academic work, mostly to do the traditional student-on-vacation activities of eating Mom's cooking and sleeping. I did play for the Christmas mass at my family's church for the umpteenth year-- but this year, the fabulous Joe Mace (another student home for the break) sang with us, which was great fun.

I've also completed my transition from DreamHost, the host for four years of USSJoin.com and related sites, to the much more powerful Linode-- it's really great to have full root on my own (virtual) server, and to be able to run anything I want without fear of being trampled by other shared-host users. Even Movable Type, which powers this blog, and which was already quite fast on DH, has seen major improvements; a blog rebuild which took 45 seconds on DreamHost takes just 11 on Linode, and I have the smallest level of service. I'm very impressed.

For all those of you waiting with bated breath, I did get all my final projects turned in on time (I, for one, count 17 seconds before the deadline as turning it in early :-) ), and do get to return to Hopkins for my final semester-- and to write and present my Master's thesis, the preparations for which I'm happy with at the moment (my advisor, of course, will no doubt push me onward-- as he should!), and on which I'll get back to working soon.

At long last (to me, at least), I have also concluded my search for a job after graduating from Hopkins. I interviewed with a wide range of companies, and had several really great offers-- but in the end, I've accepted a position where I think I can make a really valuable contribution, both to the work the company is doing, and to the global collection of knowledge; since it's a research position, I'll have the ability to pursue awesome new areas of inquiry. (And no, I won't say where it is at the moment.)

So the general plan is to graduate, move to Arlington, VA (where I'll be living), then go cause various amounts of mayhem before starting in August. Good stuff.

In the meantime, we're throwing our usual New Year's Eve party at our house in Montana-- so if you want to come, give me a call!

Reputation Stock Market

| 0 Comments | 0 TrackBacks

In Accelerando, Charles Stross mentions as a side note that reputation is handled as a matter of course in the immediate pre-acceleration society, and specifically, that it is communicated by trusted servers and built up in the manner of a stock market, with, as he terms them, "goodwill dividends" from highly-rated reputations. People and companies can be traded on the same market, so the protagonist has a reputation traded above IBM.

Let us consider, then, what it would take to set up a reputation stock market on today's Web, and its implications.

First, we need to have a concrete definition of the actors involved: identity. How to determine the contents of an online identity for an online reputation transaction is worthy of study in its own right, but let's set it aside right now, and simply postulate that we will trade OpenID signifiers. This leads to one form of the Sybil question: how do we prevent people who dislike the reputation they've acquired from simply setting up a new identity and reinventing themselves? We don't-- but they'll start out as new, of course, and this will negate many of the benefits. More on this a bit later. Note that this doesn't mean we need to have a true name-- an online presence can exist with or without full identity disclosure.

So now that we have people, let's define value, before we go on to define money. One of the other Sybil questions is how to prevent a small but determined group of people-- let us say for the sake of argument, 4chan-- from creating many accounts to destroy the reputation of some group they find odious. Since we want anyone to be able to participate without draconian identification requirements, we have to find a way to make the attack useless, rather than trying to prevent it, so let us define the value of a reputation to be the current number of shares outstanding of that reputation-- that is, if five people own ten shares of a reputation apiece, the reputation value is 50. This way, new reputations start at 0 (well, 1, but we'll get to that in a moment) and there is no "vote down" mechanism-- only a vote upward, or a removal of one's preexisting vote upward.

So now we have people and stocks-- we need money. This, of course, is a tricky proposition on the Internet-- lots of companies have tried, and ultimately failed, to create a useful virtual currency. (What we really need for this is a grounded virtual bank-- to pull from another novel, we need Epiphyte Corporation.) We don't have a good currency, though, so let's just create a currency that works internally, and we'll set up exchanges later, Linden Dollars-style.

Let's combine all these elements, then, and outline the stock market. People start (in a state of nature-- this isn't that relevant to UX, but is how the simulation works, and we'll get to UX momentarily) with no money, and one share of their own reputation. People gain money through dividends on stocks they own-- say, 10% of the value of the stock, rounded down, per share, per time interval, so owning a share of a reputation valued at 100 will provide its owner 10 credits, and shares valued at less than 10 provide no dividends (as they are insufficiently notable to provide widespread "goodwill").

People can buy shares of reputations whose owners are not yet themselves purchasing reputation shares, however, and so when a popular figure joins the service, he may find that as an owner of himself, he has amassed some credit simply through owning his own reputation.

The cost to buy and sell a share? The value of the share-- so a stock valued at 100 will cost 100 credits to buy, and provide 100 credits if sold.
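To make the rules above concrete, here is a minimal sketch of the market in Python. It is only a toy under stated assumptions, not a finished design: the class and method names are made up for illustration, identities are plain strings standing in for OpenID signifiers, a trade is priced at the reputation's value at the moment of the transaction, and buying shares of someone who has not yet joined is left out entirely.

```python
from collections import defaultdict

DIVIDEND_PERCENT = 10  # "10% of the value of the stock, rounded down"


class ReputationMarket:
    """Toy model of the reputation market (all names here are hypothetical)."""

    def __init__(self):
        self.members = set()
        # holdings[owner][reputation] -> number of shares the owner holds
        self.holdings = defaultdict(lambda: defaultdict(int))
        self.credits = defaultdict(int)

    def join(self, identity):
        """New members start with no money and one share of themselves."""
        if identity in self.members:
            return
        self.members.add(identity)
        self.holdings[identity][identity] += 1

    def value(self, reputation):
        """A reputation's value is its number of shares outstanding."""
        return sum(owned.get(reputation, 0) for owned in self.holdings.values())

    def dividend_per_share(self, reputation):
        # Integer division rounds down, so values below 10 pay nothing.
        return self.value(reputation) * DIVIDEND_PERCENT // 100

    def buy(self, buyer, reputation):
        # Assumption: the price is the value just before the purchase.
        price = self.value(reputation)
        if price == 0:
            raise ValueError("unknown reputation")
        if self.credits[buyer] < price:
            raise ValueError("not enough credits")
        self.credits[buyer] -= price
        self.holdings[buyer][reputation] += 1

    def sell(self, seller, reputation):
        if self.holdings[seller][reputation] < 1:
            raise ValueError("no shares to sell")
        # Assumption: the payout is the value just before the sale.
        price = self.value(reputation)
        self.holdings[seller][reputation] -= 1
        self.credits[seller] += price

    def pay_dividends(self):
        """Run one time interval: every share pays its dividend to its owner."""
        payouts = defaultdict(int)
        for owner, owned in self.holdings.items():
            for reputation, shares in owned.items():
                payouts[owner] += shares * self.dividend_per_share(reputation)
        for owner, amount in payouts.items():
            self.credits[owner] += amount
```

Note that there is no method for voting a reputation down: the only moves are buying a share or selling one you already hold, so a crowd of fresh, penniless accounts has nothing to attack with. The sketch also makes the bootstrapping problem easy to see, since a brand-new market has no credits in it at all until someone is seeded with some.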

There are a few bootstrapping issues, naturally, but I think these are workable, so the system can work.

The advantage, then, is we get reputation free of context, which has all sorts of nice properties; I'll go into a few of them in my next post.

In Defense of Anonymity

| 0 Comments | 0 TrackBacks

For the last few days, an incensed debate has been taking place on the OpenID Mailing List, regarding the purpose and value of anonymous people participating in the OpenID lists generally, and in the development of the OpenID standards specifically. One of the most heated instantiations of this has come in this thread, where many members have stated that one of the most prolific participants on the mailing list, "Shade," shouldn't be allowed to contribute to the work of the group, because no one could know her (well, his or her, but I've always thought of Shade as female) intentions or biases. One member went so far as to say that anonymity was fine for people on the periphery of the conversation, but as Shade had consumed too much of his time and attention, he was no longer finding her anonymity acceptable.

Another recent quote on the subject comes from the very influential Esther Dyson, who said in a recent interview that anonymity on the Internet "really encourages bad behavior," and that like abortion, "Everybody should have the right to it, but it's not something one wants to encourage."

The response to both of these viewpoints has not been what I would have hoped-- open outrage and calls for public apologies, that is. Instead, everyone seems to have nodded in agreement-- on the OpenID lists, a few people (like Eddy Nigg) have attacked Shade fairly viciously, not merely content with attacking anonymity.

Why do we need real names on the Internet? What value does it give? I have one pseudonym I use essentially everywhere, and since I've used it for more than 17 years, it's been my name for nearly as long as my given name has. (It's also supported more places, since an incredibly obnoxious number of sites are unwilling to accept "special characters," like apostrophes, not just in screen names but in full names as well-- is the airline industry completely unfamiliar with Ireland? Really?) So as a cloak of anonymity, it's not much-- but it's not meant to be. When I wish to remain anonymous, however, I maintain that right-- and accordingly, have other email addresses, and other domains, that I can use for that purpose.

Ideally, I should be able to participate fully on the Internet as an anonymous/pseudonymous individual, because (setting aside e-commerce at the moment) nearly nothing needs a real name. There's no advantage to me of giving real identifying information to every Tm, Dck, and Hrry (their Web 2.0 names, complete with a lack of vowels) on the tubes; they have no need for the information, and no right to it.

More and more companies, though, are asserting a right to true identity, regardless; in one recent case, a woman was convicted of a crime for giving a false name to MySpace. Because MySpace, clearly, is a bastion of true-name discourse on the Internet.

As I've written about before, I think people are already giving up too much personal data on the Internet. So it's hardly new for me to be annoyed at attacks on anonymous participation. A friend with whom I discussed the situation, however, pointed out something I hadn't considered as a reason for real names in discussion groups (such as the OpenID group): people don't have the time to separate signal from noise by really considering every nuance of every message, and so they use the names to do enough research to get a sense of reputation-- so they can figure out who's worth listening to.

While it's not ideal, this doesn't seem like a unique use case, and so perhaps we could deflect some of the criticism of anonymity-- that people can't trust the intentions and motivations of the hidden users, or in my friend's case, they simply want to filter out the crazies before giving up their valuable time to their comments-- with a reputation score. Is it possible to create a reputation score that can be effectively shielded from gaming by legions of anonymous (or even known) griefers, that can still provide value to both named and pseudonymous users?

I believe it is, and my next blog post will address how I think we can do it.

Useful Tools

| 3 Comments | 0 TrackBacks

I've been working on a lot of different projects lately, and run into a few really great tools; I thought I'd mention some of them, in addition to a new utility I've written.

  • Bort - one of the things I dislike most about Rails is that unlike development in Perl, there's no centralized repository of knowledge; CPAN provides the place where everything in the Perl-verse lives. In Ruby and Rails, there's RubyForge, yes, but there's GitHub and a thousand other places where really necessary stuff lives-- and so unlike with Perl, it takes real experience just to find the things you need to do useful development work. Bort is one of those things, however; it takes away the first 6-8 hours of scut Rails code that you need for every project, and installs the modules you're going to need anyway, without forcing any additional conventions on you. It made finishing my Security and Privacy project so much easier; there actually wasn't a single plugin I needed besides what was in Bort.

  • glTail - While I really enjoy Woopra for realtime visualization of people visiting my website, it only works for the sites I have keys for; since Woopra is in beta, only one site per user is allowed, which means I can't put it on every site I run (each subdomain counts as its own site). glTail gives me insight into what all of my sites are doing-- and all the time, as well, since it uses my logs rather than a JavaScript tracker (so it shows when the web crawlers come to visit). It's also pretty to just watch the waterfall of requests (it sends balls bouncing across my screen, with bigger balls representing more data in a request), and it shows me when errors happen, and oftentimes why. It's even easy to set up and customize for whatever system you're using-- it can understand Apache logs, Rails logs, almost anything.

  • Checkmate - In response to the news that InUrSite is stopping development, I decided to take some of the good ideas it had, and create my own utility in the same vein. Checkmate is pretty simple: you give it a site, and it crawls all over it, making sure all your (internal) links work, and that your pages are all valid (X)HTML. This latter is particularly helpful, I find, as I (like many other people) often only check the front page of a website to make sure it's compliant, and don't check it on an ongoing basis. Checkmate periodically rechecks, to make sure you're still OK (and you can ask it to do so immediately). Now that I have the basic functionality down, I'm taking suggestions for what to add; I've already had one request for deep link support (that is, crawling websites behind a username and password, given, of course, a username and password to use), but I'd love to know other features that people would like.
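For the curious, the internal-link half of what Checkmate does can be sketched in a few lines of Python. This is not Checkmate's actual code, just an illustration of the idea: the function and class names are invented, the page fetcher is passed in as a parameter (assumed to return the page's HTML, or None if the page can't be retrieved), and the (X)HTML validation and periodic rechecking are left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_internal_links(start_url, fetch):
    """Crawl one site breadth-first and report internal links that fail.

    `fetch` is a hypothetical callable: fetch(url) returns the HTML as a
    string, or None when the page cannot be retrieved. The result is a list
    of (referring_page, broken_url) pairs; only the first page found linking
    to each broken URL is reported.
    """
    site = urlparse(start_url).netloc
    queue = deque([(start_url, None)])
    seen = {start_url}
    broken = []
    while queue:
        url, referrer = queue.popleft()
        html = fetch(url)
        if html is None:
            broken.append((referrer, url))
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            target, _fragment = urldefrag(urljoin(url, href))
            if urlparse(target).netloc != site:
                continue  # external links (and mailto: etc.) are skipped
            if target not in seen:
                seen.add(target)
                queue.append((target, url))
    return broken
```

Passing the fetcher in keeps the crawl logic testable without touching the network; a real one would wrap an HTTP client and treat error responses as None.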

In other news, less than a week before I fly home for the break; I'll be back in Baltimore after a few weeks, but hopefully still blogging in the interim.

Opening a Closed Book

| 0 Comments | 0 TrackBacks

Facebook is an extraordinarily closed platform.

This, in itself, is a pretty neat thing to be able to say. Not too long ago, it wouldn't've been considered closed, because there wasn't anything that was particularly open, in the sense we mean today. Now, however, with instant messaging being opened and federated with Jabber, nearly every social network publishing feeds of its users' actions and friends, news filters like Digg and video networks like YouTube allowing remixing and story data to flow around, OpenID decentralizing authentication, and incredible conceptual startups like Heekya forming to let people take advantage of all this openness-- it's pretty disappointing that one of the largest social networks in the world is sitting on its hands, especially when all of these are sources from which Facebook has adopted features and ideas.

One particularly good example of this came with Facebook Connect-- when an open standard, implemented by thousands of websites, and with over half a billion identities already in existence, existed for precisely the purpose they wanted-- Facebook chose to implement a closed framework instead. And they did a really, really nice job-- which does sort of dampen the blow, one must admit. Their user experience is something OpenID could (and indeed, must) learn from-- but why couldn't they have let their users flow across the web without forcing everyone into the Facebook box?

I've taken what I hope will be a first step in the other direction, though, and am releasing YourData-- a Facebook application to let users take some control over their own data.

At the moment, it does something pretty simple: it queries Facebook to get a user's status, and then publishes it to an RSS or Atom feed. Nothing too frightening here.
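The publishing half of that is easy to illustrate. Below is a rough Python sketch of turning a list of status updates into an Atom feed; it is not YourData's actual code, the function name and the example identifiers are invented, and the step that fetches statuses from Facebook is deliberately omitted (the statuses are simply passed in as tuples).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def _rfc3339(moment):
    """Format a timezone-aware datetime the way Atom expects (UTC, 'Z')."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def statuses_to_atom(author_name, feed_id, statuses):
    """Build an Atom feed from (entry_id, text, updated) tuples.

    `updated` values are assumed to be timezone-aware datetimes; how the
    statuses are obtained is outside this sketch.
    """
    ET.register_namespace("", ATOM_NS)

    def child(parent, tag, text=None):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}")
        element.text = text
        return element

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    child(feed, "id", feed_id)
    child(feed, "title", f"Status updates from {author_name}")
    newest = max((updated for _, _, updated in statuses),
                 default=datetime.now(timezone.utc))
    child(feed, "updated", _rfc3339(newest))
    author = child(feed, "author")
    child(author, "name", author_name)

    for entry_id, text, updated in statuses:
        entry = child(feed, "entry")
        child(entry, "id", entry_id)
        child(entry, "title", text)
        child(entry, "updated", _rfc3339(updated))
        child(entry, "content", text)

    return ET.tostring(feed, encoding="unicode")
```

Calling statuses_to_atom("Example User", "urn:example:feed", [...]) returns the feed as a string, ready to be served to a feed reader. An RSS version would follow the same shape with different element names.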

The advantage, though, is that users can now update their Facebook status, and show it on their blogs, share it with their friends, or do anything else they might like with it-- features we take for granted with (for one example of many) Twitter. People with Action Streams might be interested to hear that I've built a plugin to add support for this new feed to the Action Streams platform-- you can see it in action on my website, and download it at my software page.

Over time, I hope to add more features to it, allowing users to choose to export photos and more to every place they have on the web.

So, I hope you like it. You can install it at http://apps.facebook.com/yourdata. Send comments, suggestions, or criticism to me, either through the comments on this article, or through my contact page. Happy Openness!

My Worst Interview

| 0 Comments | 0 TrackBacks

In my heading-on five years at Hopkins, I've interviewed with an incredible array of companies. Big, small, from stealth startups to companies with 80K people. Some of them I've liked-- a few, I've even had the pleasure of working for. Even the ones I didn't like, I've treated with respect; I don't believe in burning companies just for its own sake. I have never yet, however, been as deceived or as horrified in an interview as I was today, and because of the nature of the business (illegal) and its owner (working under false pretenses), I feel that I have a duty to share my story.

A company contacted my department asking for a student to complete a short contract coding job-- something of medium length over the next few weeks. I've done several of these in the last few years, and they're usually interesting-- make a few bucks and meet someone new, often a department here at Hopkins. My freshman year, one of them turned into a two-semester thing, which was kind of enjoyable; an opportunity to meet a group at Hopkins with which I would otherwise not have had contact.

So I responded, and we agreed to an interview today. They wanted to meet at the Hopkins library, which I found odd, but I figured they might just want a coffee shop for a fairly informal interview, and there's one there. The person scheduling the interview mentioned they'd send an employee down to do all the interviews at Hopkins from New Jersey. Great.

I arrived as scheduled, and met the interviewer. He proceeded to tell me that he worked in their office in northern New Jersey, and then went on to explain their business model. "We're a marketing machine," he told me. Briefly, it works like this:

  1. Sell coupons to businesses around a campus.
  2. Whenever a business sells something, give the student a coupon.
  3. The student goes to their website, enters the code on the coupon, and gets reward points, which can be used to get prizes.
  4. The website takes all the data about the student that they collect under the pretense of sending them a prize, such as their email, phone number, and home address, as well as the data provided by the business about what they bought and when, and sells it to the highest bidder.

Well, this seems bad, but not the worst thing I've heard. Unfortunately, there are complications:

  • The site explicitly says that the information collected is used only to send prizes.
  • The site encourages you to invite all your friends, and spams them whether or not they consent-- and indeed, doesn't check things like whether it's legal to do so (for instance, if they're under 13).
  • The site tracks all your actions, and uses them for behavioral mining to produce more data they can sell-- again, without information or consent.

This rapidly seems like one of the more evil sites on the net. In this age of progressively stronger privacy policies, aren't we supposed to be getting away from this? I've written about information sluts before, but this seems much more like information rape-- and especially with the false pretenses for data collection, I think this crosses into illegal business practices.

At the end of the interview, the interviewer said he would send my resume back to "corporate" and they'd contact me tomorrow (Sunday), after doing a "background check" on my resume. This made me wonder what sort of checks they could do over a weekend, but whatever.

Upon returning to my apartment, I went to their website, and found out a few additional useful pieces of information:

  • They don't have an office in New Jersey; their contact address is right here in Baltimore.
  • They don't mention that they're using Hopkins resources, which is odd, as said mailing address is actually a dorm-- and their fax number (in their Whois information) is a JHU number. JHU policy very explicitly forbids using JHU resources for personal profit.
  • The person I interviewed with, who said he'd just come from New Jersey, is, in point of fact, a sophomore Economics major here at Hopkins. He doesn't work full-time in a different state; he couldn't. It would appear that the dorm address would be his dorm.

Add in what I'm reasonably sure are stolen stock photos (they're definitely stock, and I don't believe they're from public stock sites; if someone can identify them, please let me know, and I'll be happy to update this post), and you have one entire stack of lies-- especially with the map showing their influence in offices on all continents. They even mention student scholarships that their "huge corporation" gives away!

And the kicker? If you click on "Privacy Policy," you get a 404 error. That says it all, I think.

So then, if you're in the Baltimore area, make very sure to stay away from anything sounding like "KA Synergy"-- and for the rest of you, please make a resolution: never lie to people you're interviewing. It makes you seem like a slimebag. Even in his future ventures, I'll never trust this Arjun Kapur again; a deception like this calls an entire character into question.

If anyone can identify where the stock photos (and layout) are coming from, please let me know (either via email or in the comments)-- I'd love to identify the rights holders.

Have you ever had an experience like this? What did you do? Would you have handled this situation differently? Let me know-- I'm somewhat at a loss for what to do, and I value additional input greatly.

"Why do you have a chicken on your blog now?" asked my sister, Shannon. "Because I moved where I'm hosting it." "What's that got to do with it?" "Uh... nothing."

So I've moved my blog (and indeed, all of the startlingly large USSJoin.com empire, now comprising 9 domains with many more subdomains, email, jabber, etc.) to Linode, whom I've been using as the Mnikr host for several months with great success. For years now, I've been a happy DreamHost customer, but I've outgrown what they can provide-- they can't host Mnikr as I need special abilities to run Rails applications (and I need a server on which I have root to get those abilities), and their shared offerings are just too slow these days (not noticeable to users of my statically generated blog, but really obnoxious to me as I use the backend and other hosted services).

As a side effect of the move, I had the opportunity to recreate my Movable Type installation from scratch, letting me ditch lots of code I'd messed up while learning MT. The most visible consequence of this is that yes, both http://blog.ussjoin.com and http://ussjoin.com now have pictures of chickens on them. (There are other, more positive, template changes as well, but they're somewhat more subtle.) I'll find a more-interesting replacement soon, and then the chickens will just be a bad memory, I promise. :-)

And very soon now, I'll write a more substantive blog post or three; I have topics, but I've been sprinting from school project to school project recently, with no time for blogging. Soon, though.

Edit: I've now removed the chicken and replaced it, but if you'd like to see the chicken image in question, it's right here.