June 10, 2013

Where things stand in US government surveillance

Edit: Please see the comment thread below for updates. Please also see a follow-on post about how the surveillance data is actually used.

US government surveillance has exploded into public consciousness since last Thursday. With one major exception, the news has just confirmed what was already thought or known. So where do we stand?

My views about domestic data collection start:

*Recall that these comments are US-specific. Data retention legislation has been proposed or passed in multiple countries to require recording of, among other things, all URL requests, with the stated goal of fighting either digital piracy or child pornography.

As for foreign data:

Beyond that, use your imagination.

The big question is how much domestic or quasi-domestic communications-content data the US government currently captures. I think it’s a lot more than we previously acknowledged. For example:

And cost is not a barrier. I would guess the order of magnitude* for all email in the US at 10 petabytes/day uncompressed. (100s of billions of messages, 10s of KB per message.) Phone call volumes are probably less. (Fewer than 10 billion calls per day.) The Feds can afford to store that. Hadoop or NoSQL clusters, for example, can be set up for low six figures per petabyte.** HP Vertica will sell anybody an RDBMS cluster (hardware and software) for around $2 million/petabyte.**

*In the most literal high-school-chemistry sense of the phrase.

**Of raw data; particularly compressible data might be managed yet more cheaply.
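
As a rough check on the arithmetic above, here is a minimal Python sketch. The specific inputs (300 billion messages/day, 30 KB per message, $200,000 per petabyte as a stand-in for "low six figures") are illustrative assumptions picked from within the ranges stated above, not measured figures, and the function names are made up for this example. It further assumes a full year of retention with no compression, replication, or operating costs.

```python
# Back-of-envelope storage estimate. Every constant here is an assumption
# chosen for illustration, not a measured or reported figure.

KB = 10**3   # bytes per kilobyte (decimal units)
PB = 10**15  # bytes per petabyte (decimal units)


def daily_volume_pb(messages_per_day, bytes_per_message):
    """Raw, uncompressed volume in petabytes per day."""
    return messages_per_day * bytes_per_message / PB


def yearly_storage_cost(pb_per_day, dollars_per_pb, days=365):
    """Cost in dollars to buy capacity for `days` days of raw data."""
    return pb_per_day * days * dollars_per_pb


# Assumed: 300 billion emails/day ("100s of billions") at 30 KB each ("10s of KB").
email_pb_per_day = daily_volume_pb(300 * 10**9, 30 * KB)

# Assumed: round to the 10 PB/day order of magnitude used in the text.
cheap_cluster = yearly_storage_cost(10, 200_000)      # "low six figures" per PB
rdbms_cluster = yearly_storage_cost(10, 2_000_000)    # ~$2 million per PB

print(f"email volume: {email_pb_per_day} PB/day")
print(f"cheap cluster, one year: ${cheap_cluster:,}")
print(f"RDBMS cluster, one year: ${rdbms_cluster:,}")
```

Under those assumptions, the email estimate comes to 9 PB/day, and a year of retention at 10 PB/day costs roughly $730 million on the cheaper clusters or $7.3 billion on the pricier RDBMS option. Either figure is small next to a national intelligence budget, which is the point.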

Coverage of all this has of course been intense. In particular:

And my views can be summarized much as I did three years ago:

  • It is inevitable* that governments and other constituencies will obtain huge amounts of information, which can be used to drastically restrict everybody’s privacy and freedom.
  • To protect against this grave threat, multiple layers of defense are needed, technical and legal/regulatory/social/political alike.
  • One particular layer is getting insufficient attention, namely restrictions upon the use (as opposed to the acquisition or retention) of data.

*And indeed in many ways even desirable


10 Responses to “Where things stand in US government surveillance”

  1. aaron on June 10th, 2013 8:11 pm

    Government has lots of data, and historically it has been segregated and unintegrated by policy and law. This is clearly changing, and there is likely a lot of integration that would have been considered illegal or unethical in the near past.

    I think you are radically underestimating how much streaming traffic is both reviewed and staged. Also, there is at least some evidence that much traffic, such as URLs and weblogs, is being captured: of course where there is incriminating use or Urdu or such, but more innocuous-seeming stuff as well, including (illegal) analysis of traffic in the US.

    The amazing part of this journey is the intentional lack of oversight. Does anyone believe that use of this data is being monitored? Has anyone been prosecuted for passing info to a friend or stalking an ex from these hairballs? That is the fundamental problem in this for me.

  2. Mike on June 11th, 2013 1:02 am

    Good luck actually restricting use of such information by the government. They have NICS for firearm background checks. They are required by law to destroy the information about successful checks, so as to avoid creating a national firearm registry. Yet during the Malvo shootings, the FBI was running around confiscating .223 rifles for ballistic checks and there is no legal way they could have known who had them. BTW, there is a HUGE difference between a company buying or acquiring information that is known to be public (CC purchases) in order to better target customers and a government illegally acquiring information thought to be private, especially when they have a monopoly on force (police, military, IRS, prisons, etc.).

  3. Curt Monash on June 11th, 2013 2:31 am

    https://medium.com/prism-truth/82a1791c94d3 has a good discussion of some likely PRISM misunderstanding, although with a nothing-to-see-here spin I don’t agree with.


    You could be right about what is reviewed/staged. It’s always tough to know everything secret that’s going on — after all, that’s why it’s secret.

  4. Curt Monash on June 11th, 2013 6:07 am

    http://touch.latimes.com/#section/-1/article/p2p-76220035/ is supportive of Aaron’s views on streaming data.

  5. Mark Jaquith on June 11th, 2013 1:34 pm

    Just because I call into doubt the PRISM reporting doesn’t mean I think there’s nothing to see in general. I’d argue that there’s a lot the public should know, but doesn’t, about how our government looks at private communications, but we’re all getting sidetracked over PRISM, which seems like it might be something less than alleged.

  6. Curt Monash on June 11th, 2013 1:41 pm

    Fair enough, Mark.

    As previously noted, the $20 million PRISM budget proves it’s only a small part of the puzzle.

  7. How is the surveillance data used? | DBMS 2 : DataBase Management System Services on June 13th, 2013 11:36 am

    […] week, discussion has exploded about US government surveillance. After summarizing, as best I could, what data the government appears to collect, now I’d like to consider what they actually do with it. More precisely, I’d like to […]

  8. Curt Monash on June 14th, 2013 8:43 am

    Whoops. Declan McCullagh argues in http://news.cnet.com/8301-13578_3-57589078-38/nsa-chief-drops-hint-about-isp-web-e-mail-surveillance/ that the cell phone data includes wireless web surfing logs.

  9. Curt Monash on June 15th, 2013 5:28 am

    http://www.motherjones.com/kevin-drum/2013/06/some-questions-and-about-edward-snowden is nice and straightforward about inaccuracies in the Snowden disclosures.

  10. Notes and comments, July 2, 2013 | DBMS 2 : DataBase Management System Services on July 2nd, 2013 4:11 am

    […] recent posts based on surveillance news have been partly superseded by – well, by more news. Some of that news, along with some good […]
