Discussion of public policy around technological issues, especially but not only surveillance and privacy.
In which I observe that Tim Cook and the EFF, while thankfully on the right track, haven’t gone nearly far enough.
Traditionally, the term “chilling effect” referred specifically to inhibitions on what in the US are regarded as First Amendment rights — the freedoms of speech, the press, and in some cases public assembly. Similarly, when the term “chilling effect” is used in a surveillance/privacy context, it usually refers to the fear that what you write or post online can later be held against you. This concern has been expressed by, among others, Tim Cook of Apple, Laura Poitras, and the Electronic Frontier Foundation, and several research studies have supported the point.
But that’s only part of the story. As I wrote in July, 2013,
… with the new data collection and analytic technologies, pretty much ANY action could have legal or financial consequences. And so, unless something is done, “big data” privacy-invading technologies can have a chilling effect on almost anything you want to do in life.
The reason, in simplest terms, is that your interests could be held against you. For example, models can estimate your future health, your propensity for risky hobbies, or your likelihood of changing your residence, career, or spouse. Any of these insights could be useful to employers or financial services firms, and not in a way that redounds to your benefit. And if you think enterprises (or governments) would never go that far, please consider an argument from the sequel to my first “chilling effects” post: Read more
It’s difficult to project the rate of IT change in health care, because:
- Health care is suffused with technology — IT, medical device and biotech alike — and hence has the potential for rapid change. However, it is also the case that …
- … health care is heavily bureaucratic, political and regulated.
Timing aside, it is clear that health care change will be drastic. The IT part of that starts with vastly comprehensive electronic health records, which will be accessible (in part or whole as the case may be) by patients, care givers, care payers and researchers alike. I expect elements of such records to include:
- The human-generated part of what’s in ordinary paper health records today, but across a patient’s entire lifetime. This of course includes notes created by doctors and other care-givers.
- Large amounts of machine-generated data, including:
- The results of clinical tests. Continued innovation can be expected in testing, for reasons that include:
- Most tests exploit electronic technology. Progress in electronics is intense.
- Biomedical research is itself intense.
- In particular, most research technologies (for example gene sequencing) can be made cheap enough over time to be affordable clinically.
- The output of consumer health-monitoring devices — e.g. Fitbit and its successors. The buzzword here is “quantified self”, but what it boils down to is that every moment of our lives will be measured and recorded.
- The results of clinical tests. Continued innovation can be expected in testing, for reasons that include:
These vastly greater amounts of data cited above will allow for greatly changed analytics.
There are numerous ways that technology, now or in the future, can significantly improve personal safety. Three of the biggest areas of application are or will be:
- Crime prevention.
- Vehicle accident prevention.
- Medical emergency prevention and response.
Implications will be dramatic for numerous industries and government activities, including but not limited to law enforcement, automotive manufacturing, infrastructure/construction, health care and insurance. Further, these technologies create a near-certainty that individuals’ movements and status will be electronically monitored in fine detail. Hence their development and eventual deployment constitutes a ticking clock toward a deadline for society deciding what to do about personal privacy.
Theoretically, humans aren’t the only potential kind of tyrants. Science fiction author Jack Williamson postulated a depressing nanny-technology in With Folded Hands, the idea for which was later borrowed by the humorous Star Trek episode I, Mudd.
Of these three areas, crime prevention is the furthest along; in particular, sidewalk cameras, license plate cameras and internet snooping are widely deployed around the world. So let’s consider the other two.
Vehicle accident prevention
|Categories: Health care, Predictive modeling and advanced analytics, Public policy, Surveillance and privacy||3 Comments|
Most IT innovation these days is focused on machine-generated data (sometimes just called “machine data”), rather than human-generated. So as I find myself in the mood for another survey post, I can’t think of any better idea for a unifying theme.
1. There are many kinds of machine-generated data. Important categories include:
- Web, network and other IT logs.
- Game and mobile app event data.
- CDRs (telecom Call Detail Records).
- “Phone-home” data from large numbers of identical electronic products (for example set-top boxes).
- Sensor network output (for example from a pipeline or other utility network).
- Vehicle telemetry.
- Health care data, in hospitals.
- Digital health data from consumer devices.
- Images from public-safety camera networks.
- Stock tickers (if you regard them as being machine-generated, which I do).
That’s far from a complete list, but if you think about those categories you’ll probably capture most of the issues surrounding other kinds of machine-generated data as well.
2. Technology for better information and analysis is also technology for privacy intrusion. Public awareness of privacy issues is focused in a few areas, mainly: Read more
Everybody is confused about privacy and surveillance. So I’m renewing my efforts to consciousness-raise within the tech community. For if we don’t figure out and explain the issues clearly enough, there isn’t a snowball’s chance in Hades our lawmakers will get it right without us.
How bad is the confusion? Well, even Edward Snowden is getting it wrong. A Wired interview with Snowden says:
“If somebody’s really watching me, they’ve got a team of guys whose job is just to hack me,” he says. “I don’t think they’ve geolocated me, but they almost certainly monitor who I’m talking to online. Even if they don’t know what you’re saying, because it’s encrypted, they can still get a lot from who you’re talking to and when you’re talking to them.”
That is surely correct. But the same article also says:
“We have the means and we have the technology to end mass surveillance without any legislative action at all, without any policy changes.” The answer, he says, is robust encryption. “By basically adopting changes like making encryption a universal standard—where all communications are encrypted by default—we can end mass surveillance not just in the United States but around the world.”
That is false, for a myriad of reasons, and indeed is contradicted by the first excerpt I cited.
What privacy/surveillance commentators evidently keep forgetting is:
- There are many kinds of privacy-destroying information. I think people frequently overlook just how many kinds there are.
- Many kinds of organization capture that information, can share it with each other, and gain benefits from eroding or destroying privacy. Similarly, I think people overlook just how pervasive the incentive is to snoop.
- Privacy is invaded through a variety of analytic techniques applied to that information.
So closing down a few vectors of privacy attack doesn’t solve the underlying problem at all.
Worst of all, commentators forget that the correct metric for danger is not just harmful information use, but chilling effects on the exercise of ordinary liberties. But in the interest of space, I won’t reiterate that argument in this post.
Perhaps I can refresh your memory why each of those bulleted claims is correct. Major categories of privacy-destroying information (raw or derived) include:
- The actual content of your communications – phone calls, email, social media posts and more.
- The metadata of your communications — who you communicate with, when, how long, etc.
- What you read, watch, surf to or otherwise pay attention to.
- Your purchases, sales and other transactions.
- Video images, via stationary cameras, license plate readers in police cars, drones or just ordinary consumer photography.
- Monitoring via the devices you carry, such as phones or medical monitors.
- Your health and physical state, via those devices, but also inferred from, for example, your transactions or search engine entries.
- Your state of mind, which can be inferred to various extents from almost any of the other information areas.
- Your location and movements, ditto. Insurance companies also want to put monitors in cars to track your driving behavior in detail.
|Categories: Health care, Predictive modeling and advanced analytics, Surveillance and privacy, Telecommunications||2 Comments|
As per the links and quotes below, my views on the network neutrality debate may be summarized as:
- There should be a fairly good level of internet delivery that is, by regulation, available to any website or other internet service. This is essential so that ideas can blossom, speech can be shared, etc.
- There should be a way to pay for arbitrarily good levels of internet delivery. Entertainment would benefit from that. Medicine, in the future, might require it.
- Any payment for better delivery should happen through a marketplace open to all.
In this post I’ll add detail as to how that marketplace could work.
A couple of points that arise frequently in conversation, but that I don’t seem to have made clearly online.
“Metadata” is generally defined as “data about data”. That’s basically correct, but it’s easy to forget how many different kinds of metadata there are. My list of metadata kinds starts with:
- Data about data structure. This is the classical sense of the term. But please note:
- In a relational database, structural metadata is rather separate from the data itself.
- In a document database, each document might carry structure information with it.
- Other inputs to core data management functions. Two major examples are:
- Column statistics that inform RDBMS optimizers.
- Value ranges that inform partition pruning or, more generally, data skipping.
- Inputs to ancillary data management functions — for example, security privileges.
- Support for human decisions about data — for example, information about authorship or lineage.
What’s worse, the past year’s most famous example of “metadata”, telephone call metadata, is misnamed. This so-called metadata, much loved by the NSA (National Security Agency), is just data, e.g. in the format of a CDR (Call Detail Record). Calling it metadata implies that it describes other data — the actual contents of the phone calls — that the NSA strenuously asserts don’t actually exist.
And finally, the first bullet point above has a counter-intuitive consequence — all common terminology notwithstanding, relational data is less structured than document data. Reasons include:
- Relational databases usually just hold strings — or maybe numbers — with structural information being held elsewhere.
- Some document databases store structural metadata right with the document data itself.
- Some document databases store data in the form of (name, value) pairs. In some cases additional structure is imposed by naming conventions.
- Actual text documents carry the structure imposed by grammar and syntax.
- A lengthy survey of metadata kinds, biased to Hadoop (August, 2012)
- Metadata as derived data (May, 2011)
- Dataset management (May, 2013)
- Structured/unstructured … multi-structured/poly-structured (May, 2011)
|Categories: Data models and architecture, Hadoop, Structured documents, Surveillance and privacy, Telecommunications||5 Comments|
1. Censorship worries me, a lot. A classic example is Vietnam, which basically has outlawed online political discussion.
And such laws can have teeth. It’s hard to conceal your internet usage from an inquisitive government.
2. Software and software related patents are back in the news. Google, which said it was paying $5.5 billion or so for a bunch of Motorola patents, turns out to really have paid $7 billion or more. Twitter and IBM did a patent deal as well. Big numbers, and good for certain shareholders. But this all benefits the wider world — how?
The purpose of legal intellectual property protections, simply put, is to help make it a good decision to create something. …
Why does “securing … exclusive Right[s]” to the creators of things that are patented, copyrighted, or trademarked help make it a good decision for them to create stuff? Because it averts competition from copiers, thus making the creator a monopolist in what s/he has created, allowing her to at least somewhat value-price her creation.
I.e., the core point of intellectual property rights is to prevent copying-based competition. By way of contrast, any other kind of intellectual property “right” should be viewed with great suspicion.
That Constitutionally-based principle makes as much sense to me now as it did then. By way of contrast, “Let’s give more intellectual property rights to big corporations to protect middle-managers’ jobs” is — well, it’s an argument I view with great suspicion.
But I find it extremely hard to think of a technology industry example in which development was stimulated by the possibility of patent protection. Yes, the situation may be different in pharmaceuticals, or for gadgeteering home inventors, but I can think of no case in which technology has been better, or faster to come to market, because of the possibility of a patent-law monopoly. So if software and business-method patents were abolished entirely – even the ones that I think could be realistically adjudicated — I’d be pleased.
3. In November, 2008 I offered IT policy suggestions for the incoming Obama Administration, especially: Read more
|Categories: Buying processes, Google, IBM and DB2, Public policy, Surveillance and privacy||1 Comment|
In response to the uproar created by the Edward Snowden revelations, the White House commissioned five dignitaries to produce a 300-page report, released last December 12. (Official name: Report and Recommendations of The President’s Review Group on Intelligence and Communications Technologies.) I read or skimmed a large minority of it, and I found enough substance to be worthy of a blog post.
Many of the report’s details fall in the buckets of bureaucratic administrivia,* internal information security, or general pabulum. But the commission started with four general principles that I think have great merit. Read more
Thanks to a court decision that overturned some existing regulations, network neutrality is back in the news. Most people think the key issue is whether
- Telecommunication companies (e.g. wireless and/or broadband services providers) should be allowed to charge …
- … other internet companies (website owners, game companies, streaming media providers, etc., collectively known as edge providers) for …
- … shipping data to internet service consumers in particularly attractive ways.
But I think some forms of charging can be OK — albeit not the ones currently being discussed — and so the question should instead be how the charges are designed.
When I wrote about network neutrality in 2006-7, the issue was mainly whether broadband providers would be allowed to ship different kinds of data at different speeds or reliability. Now the big controversy is whether mobile data providers should be allowed to accept “sponsorship” so as to have certain kinds of data not count against mobile data plan volume caps. Either way:
- The “anything goes” strategy has obvious free-market appeal.
- But proponents of network neutrality regulation — such as Fred Wilson and Nilay Patel — point out a major risk: By striking deals that smaller companies can’t imitate, large, established “edge provider” services may strangle upstart competitors in their cribs.
I think the anti-discrimination argument for network neutrality has much merit. But I also think there are some kinds of payment structure that could leave the playing field fairly level. Imagine, if you will, that: Read more