I know what you bought last summer

The rumblings over Facebook banning Robert Scoble have opened up all sorts of conversations about who owns or controls your data – see also: Data as currency. One issue that has been highlighted is how easy it is for people to scrape enough information about you to form an identity. Scoble was running an automated script to pull out contact details by the thousand.

Yesterday, another related article cropped up on Techmeme – Sears Exposes Customer Purchase History. It appears that Sears added a feature on their web site where you could look up your purchase history. All you had to do was enter your name, address and telephone number. Trouble is, whilst you had to have an account and log in to the site, you could then enter anybody’s name, address and telephone number to view their purchases. Somebody forgot to restrict access to only those purchases associated with the authenticated user. Since the news became public, Sears have disabled the feature to sort it out.
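To make the missing check concrete, here is a minimal sketch in Python. It is not Sears’ code (which has not been published); the Purchase record, the PURCHASES list and the purchase_history function are all made up for illustration. The point is simply that being logged in is not enough: the lookup has to compare the authenticated user with the person whose records are being requested.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Purchase:
    owner_id: str  # hypothetical field: the account that made the purchase
    item: str


# Hypothetical in-memory stand-in for the retailer's purchase database.
PURCHASES = [
    Purchase(owner_id="alice", item="lawnmower"),
    Purchase(owner_id="bob", item="washing machine"),
]


def purchase_history(authenticated_user_id: str, requested_user_id: str) -> List[Purchase]:
    """Return purchases only when the requester is asking about their own account."""
    # The step that appears to have been missing: logging in proves who you are,
    # but it does not entitle you to read somebody else's records.
    if authenticated_user_id != requested_user_id:
        raise PermissionError("purchase history is only available to the account owner")
    return [p for p in PURCHASES if p.owner_id == requested_user_id]
```

With this in place, purchase_history("alice", "alice") returns Alice’s lawnmower, whilst purchase_history("alice", "bob") raises PermissionError instead of revealing Bob’s purchases.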

But it does raise yet another warning about how easy it is for companies to accidentally make too much information public, be it downloading database records to a CD or making those records available online. Mash up poor (or missing) security controls with automated scripts that gather contact details and our criminal friends won’t need to go phishing for dinner.

Controlling the Data Cloud

Nicholas Carr wrote another post about the emerging world of giant data centres (yup, English-English spelling on this post as opposed to English-US) – Google’s Cloud – describing how Google wants to host all of our data and our applications. Microsoft is pursuing a similar strategy through its Live plans.

Whilst there are lots of advantages in going down the hosted route and handing over big chunks of your I.T. to a hosted data centre, plenty of large organisations are reluctant to let go of their data completely. Not just because of their own concerns about security and privacy (as long as employees can send unencrypted emails outside of the company – or just use their vocal cords – there is only so much you can do on that road) but also because of government and industry regulations that dictate responsibilities regarding data ownership. Some governments simply do not allow companies to host their data overseas. Whilst business and society may be becoming increasingly globalised, governments still operate very much on a localised level.

But never mind all the government-related challenges, Salesforce has just demonstrated the much simpler concern voiced by many organisations, as reported in the Washington Post – Salesforce.com Acknowledges Data Loss. It appears that an employee fell foul of a phishing scam and accidentally handed over the keys to their customer database. Oops!

Security challenges in Web 2.0

An interesting blog post has highlighted how Gmail accounts can be hacked – Google Email Hijack Technique. Aside from the issue that it appears quite easy for anyone (or anything) that knows what it is doing to start snooping on your email (more than slightly worrying), the blog post highlights a new security challenge for anyone beginning to rely on hosting data in ‘the cloud’ – i.e. stored in remote data centres and accessed using online services. Think Gmail, Flickr, YouTube, Facebook, Office Live, MySpace, LiveJournal, Salesforce.

When viruses first appeared, the primary method of spread was through infected disks. People had a habit of leaving floppy disks in computers. When the computer was next switched on, a virus would copy across from the floppy disk (way back when, the floppy disk drive was the first item read when your computer started up and the most common form of network for file sharing). Your computer would start to behave oddly as files became corrupted and you lost all your data. People, through training, threats and learning the hard way through experience, began to get better at not leaving disks inserted in computers when they switched off. But it didn’t matter because the threat changed…

Along came email and networks. New ways of hacking accounts, crashing computers and corrupting data arose that no longer relied on a floppy disk to spread the havoc. And new challenges appeared – spam overwhelming inboxes, phishing scams persuading people to willingly hand over bank details. Whilst some attacks were purely web-based (fake sites pretending to be your friendly bank), the majority of attacks still focused on taking control of your computer and doing bad stuff with it. But having a computer crash has become less of a worry as more data is being uploaded onto the web. Our need to have our data available regardless of the device we happen to be using means that damage to any one device matters less. If your computer gets hacked, wipe it and rebuild it, then re-sync with your online services. And so the threat changes again…

The Gmail exploit doesn’t care about your computer, or your mobile phone or whatever device you choose to use. It lives in ‘the cloud’, hacking directly into the online services that are hosting your data. If Gmail gets hacked, what do you do? You can’t just format and rebuild, as has worked in the past with computers. You don’t control the service or the computers where your data is stored. Instead, you have to trust Google (or whichever service provider you happen to be using) to fix the issue. It’s a different dynamic and one that will need to be considered by any organisation planning to switch from local servers to fully hosted services.

Who controls your data

There is a bit of a furore going on over a piece of code being leaked to the web that enables you to crack HD-DVDs. However, one of the blog posts/news articles includes a snippet of information that I am more interested in, because it highlights a big flaw in the strategy for moving your data into the Internet cloud. Snippet from a blog on Wired, documenting a takedown notice from Google to someone using their Google Notebook application (bold highlighting is mine):

… Google has been notified, according to the terms of the Digital Millennium Copyright Act (DMCA), that content in your notebook Google Notebook Entry allegedly infringes upon the copyrights of others. The particular section of your notebook in question is the section covering www.digg.com/users/entangledstate/news/dugg

… If you do not do this within the next 3 days (by 4/30/07), we will be forced to remove your entire notebook. If we did not do so, we would be subject to a claim of copyright infringement, regardless of its merits. We can reinstate this content into your blog upon receipt of a counter notification pursuant to sections 512(g)(2) and (3) of the DMCA…

Back in March, I wrote a post – Google and Microsoft looking alike – talking about Google’s strategy for getting us to use their online services for storing our data. If they are happy to act as big brother on behalf of people who use the DMCA as an easy form of censorship, will we be comfortable handing over the keys to our information?

Take a simple scenario. I use Gmail for email. Someone sends me an email containing content that might infringe copyright. Google receives a notification from the copyright owner and issues a notice similar to the one above, with 3 days to comply. I happen to be on holiday and don’t check my email, so have not even read the allegedly offending email, let alone seen the takedown notice. When I return to work, my entire Gmail account has been deleted. What if I ran my entire business using Google services? Would they all be deleted too? Hmmm…

I last blogged about the DMCA in January 2006 – Post and be damned. New Scientist magazine had published an article examining the use of the DMCA as a form of censorship. One study found that 47% of takedown notices concerned material that would likely have been deemed fair use. However, the DMCA enables content owners to issue takedown notices without having to go to court, placing the onus on the individual to legally challenge them. Targeting the Internet Service Providers (ISPs) has proven effective – they will simply remove the content unless the individual web site owner is prepared to finance a legal challenge to the notice. Picking on Google (and any other player in the web software/services playground) makes it even easier. Google can simply shrug and say ‘we have to do this or else we would be subject to a claim’. But the impact on the individual or organisation targeted is now even bigger. You don’t just lose your web site, you could lose your entire ability to do business if you rely on web-based services…

Security versus Control

Microsoft has a (relatively new and not well known) technology called Rights Management Services (RMS). When used with Office 2003, it provides the ability to apply rights to individual documents and emails, enabling an author to control access and distribution. For example, if you wanted to send out an email containing sensitive data, and did not want any recipient to forward the email on to other people, you click a button and, hey presto, the email is sent with certain features unavailable. Recipients cannot forward the email, print it, or cut/copy & paste it, and if they reply to the email, the original message is removed (they can’t even open the email to read it if they are not on the approved recipients list). Another example: if you have a document containing time-sensitive content, such as a price list, you might want to set an expiry date. Beyond the expiry date, the document can no longer be opened – this could prevent people from accidentally using an out-of-date price sheet when selling products. If you have a document you want to collaborate on with only a limited group of people, you can restrict who has the right to view, edit and print the document.
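The rules described above (an approved recipients list, per-action permissions and an expiry date) can be sketched as a small policy check. This is a toy model in Python, not the RMS API or how Microsoft implements it: the Rights class, its field names and the is_allowed function are all invented for illustration, and a real system would enforce the policy with encryption and certificates rather than a simple lookup.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Rights:
    """Hypothetical rights attached to a single document or email."""
    viewers: FrozenSet[str]                 # the approved recipients list
    editors: FrozenSet[str] = frozenset()   # subset allowed to edit
    can_print: bool = False
    can_forward: bool = False
    expires_on: Optional[date] = None       # last day the content may be opened


def is_allowed(rights: Rights, user: str, action: str, today: date) -> bool:
    """Decide whether `user` may perform `action` on the protected content."""
    # Beyond the expiry date nobody can open the content, whatever their rights.
    if rights.expires_on is not None and today > rights.expires_on:
        return False
    # Anyone not on the approved list cannot even open it.
    if user not in rights.viewers:
        return False
    if action == "view":
        return True
    if action == "edit":
        return user in rights.editors
    if action == "print":
        return rights.can_print
    if action == "forward":
        return rights.can_forward
    return False  # unknown actions are denied by default


# Example: a price list that one (made-up) address may read until the end of January.
price_list = Rights(
    viewers=frozenset({"sales@example.com"}),
    expires_on=date(2008, 1, 31),
)
```

Here is_allowed(price_list, "sales@example.com", "view", date(2008, 1, 15)) is True, but the same request on 1 February 2008 is refused, as is any attempt to forward or print, or any request from someone not on the list.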

This ability is sometimes called document security, but that description is wrong and can be misleading. The accurate definition is controlling distribution of content. It’s a subtle but important difference. When a document has rights applied to it using RMS, the rights (let’s call them ‘a lock’) live with the document. When someone tries to access the document, they will be challenged – the appropriate certificate (let’s call it a ‘key’) is required before the document can be opened. However, because the rights live with the document, and the document is allowed to travel outside the boundaries of a company’s own IT systems, the potential will always exist for someone, with suitable tools and patience, to crack open the document without a key. It’s just like a safety deposit box. You put items into a safety deposit box (locked) to control who has access to those items (the key holders). However, if you decide to leave the safety deposit box in the park, someone is going to pick it up and, eventually, they will get the box open by fair means or foul. That’s why you store the safety deposit box in a vault. The vault is the security layer. Yes, vaults do occasionally get broken into. But it’s a lot harder to do than taking that safety deposit box home and working on it in your own time, and when it happens, you know it has happened. The big hole in the wall and the people wearing balaclavas are a bit of a giveaway.

Rights Management Services is a useful tool when you have a need to control distribution of content. It is not unbreakable – you can’t stop someone using a camera to take a photo of the document whilst it is displayed on their monitor screen – but it is a lot, lot better than no restrictions at all when handling sensitive content, and is certainly better than traditional methods, such as sealed and recorded delivery of physical copies of the documents. If you want document security, you need to consider the vault – the store where the document will reside – and you need to consider the security implications of allowing the document to be removed from that vault.