Liberty and privacy
Discussion of issues related to liberty and privacy, and especially how they are affected by and interrelated with data management and analytic technologies. Related subjects include:
- Petabyte-scale data management
- Privacy, censorship, and freedom (in The Monash Report)
Aster Data business trends
Last month, I reviewed with the Aster Data folks which markets they were targeting and selling into, subsequent to acquisition by their new orange overlords. The answers aren’t what they used to be. Aster no longer focuses much on what it used to call frontline (i.e., low-latency, operational) applications; those are of course a key strength for Teradata. Rather, Aster focuses on investigative analytics — they’ve long endorsed my use of the term — and on the batch run/scoring kinds of applications that inform operational systems.
| Categories: Analytic technologies, Application areas, Aster Data, Data warehousing, DataStax, Liberty and privacy, RDF and graphs, Teradata, Web analytics | 1 Comment |
Application areas for SAS HPA
When I talked with SAS about its forthcoming in-memory parallel SAS HPA offering, we talked briefly about application areas. The three SAS cited were:
- Consumer financial services. The idea here is to combine information about customers’ use of all kinds of services (banking, credit cards, loans, etc.). SAS believes this serves both marketing and risk analysis purposes.
- Insurance. We didn’t go into detail.
- Mobile communications. SAS’ customers aren’t giving it details, but they’re excited about geocoding/geospatial data.
Meanwhile, in another interview I heard about, SAS emphasized retailers. Indeed, that’s what spawned my recent post about logistic regression.
The mobile communications one is a bit scary. Your cell phone — and hence your cellular company — know where you are, pretty much from moment to moment. Even without advanced analytic technology applied to it, that’s a pretty direct privacy threat. Throw in some analytics, and your cell company might know, for example, who you hang out with (in person), where you shop, and how those things predict your future behavior. And so the government — or just your employer — might know those things too.
| Categories: Application areas, Liberty and privacy, Predictive modeling and advanced analytics, SAS Institute, Telecommunications | 2 Comments |
So how many columns can a single table have anyway?
I have a client who is hitting a 1000 column-per-table limit in Oracle Standard Edition. As you might imagine, I’m encouraging them to consider columnar alternatives. Be that as it may, just what ARE the table width limits in various analytic or general-purpose DBMS products?
By the way — the answer SHOULD be “effectively unlimited.” Like it or not,* there are a bunch of multi-thousand-column marketing-prospect-data tables out there.
*Relational purists may dislike the idea for one reason, privacy-concerned folks for quite another.
| Categories: Data warehousing, Liberty and privacy | 34 Comments |
The technology of privacy threats
This post is the second of a series. The first one was an overview of privacy dangers, replete with specific examples of kinds of data that are stored for good reasons, but can also be repurposed for more questionable uses. More on this subject may be found in my August, 2010 post Big Data is Watching You!
There are two technology trends driving electronic privacy threats. Taken together, these trends raise scenarios such as the following:
- Your web surfing behavior indicates you’re a sports car buff, and you further like to look at pictures of scantily-clad young women. A number of your Facebook friends are single women. As a result, you’re deemed a risk to have a mid-life crisis and divorce your wife, thus increasing the interest rate you have to pay when refinancing your house.
- Your cell phone GPS indicates that you drive everywhere, instead of walking. There is no evidence of you pursuing fitness activities, but forum posting activity suggests you’re highly interested in several TV series. Your credit card bills show that your taste in restaurant food tends to the fatty. Your online photos make you look fairly obese, and a couple have ashtrays in them. As a result, you’re judged a high risk of heart attack, and your medical insurance rates are jacked up accordingly.
- You did actually have that mid-life crisis and get divorced. At the child-custody hearing, your ex-spouse’s lawyer quotes a study showing that football-loving upper-income Republicans are 27% more likely to beat their children than yoga-class-attending moderate Democrats, and the probability goes up another 8% if they ever bought a jersey featuring a defensive lineman. What’s more, several of the more influential people in your network of friends also fit angry-male patterns, taking the probability of abuse up another 13%. Because of the sound statistics behind such analyses, the judge listens.
Not all these stories are quite possible today, but they aren’t far off either.
| Categories: Facebook, Liberty and privacy, Predictive modeling and advanced analytics, Telecommunications, Web analytics | 3 Comments |
Privacy dangers — an overview
This post is the first of a series. The second one delves into the technology behind the most serious electronic privacy threats.
The privacy discussion has gotten more active, and more complicated as well. A year ago, I still struggled to get people to pay attention to privacy concerns at all, at least in the United States, with my first public breakthrough coming at the end of January. But much has changed since then.
On the commercial side, Facebook modified its privacy policies, garnering great press attention and an intense user backlash, leading to a quick partial retreat. The Wall Street Journal then launched a long series of articles — 13 so far — recounting multiple kinds of privacy threats. Other media joined in, from Forbes to CNet. Various forms of US government rule-making to inhibit advertising-related tracking have been proposed as an apparent result.
In the US, the government had a lively year as well. The Transportation Security Administration (TSA) rolled out what have been dubbed “porn scanners,” and backed them up with “enhanced patdowns.” For somebody who is, for example, female, young, a sex abuse survivor, and/or a follower of certain religions, those can be highly unpleasant, if not traumatic. Meanwhile, the Wikileaks/Cablegate events have spawned a government reaction whose scope is only beginning to be seen. A couple of “highlights” so far are some very nasty laptop seizures, and the recent demand for information on over 600,000 Twitter accounts. (Christopher Soghoian provided a detailed, nuanced legal analysis of same.)
At this point, it’s fair to say there are at least six different kinds of legitimate privacy fear. Read more
| Categories: Analytic technologies, Facebook, GIS and geospatial, Health care, Liberty and privacy, Telecommunications, Web analytics | 4 Comments |
The privacy discussion is heating up
Internet privacy issues are getting more and more attention. Frankly, I think we’re getting past the point where the only big risk is loss of liberty. More and more, the risk of an excessive backlash is upon us as well. (In the medical area, I’d say it’s already more than a risk — it’s a life-wrecking reality. But now the problem is poised to become wider-spread.) Read more
| Categories: Health care, Liberty and privacy, Web analytics | 2 Comments |
Notes and links October 22, 2010
A number of recent posts have had good comments. This time, I won’t call them out individually.
Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on “sensor data” …
… and, even better, went on to say: Read more
A few notes from XLDB 4
As much as I believe in the XLDB conferences, I only found time to go to (a big) part of one day of XLDB 4 myself. In general: Read more
| Categories: Analytic technologies, Health care, Liberty and privacy, Michael Stonebraker, MySQL, Open source, Parallelization, Petabyte-scale data management, Scientific research | 2 Comments |
Notes and links October 10 2010
More quick-hit notes, links, and so on: Read more
| Categories: Analytic technologies, Aster Data, Data warehousing, Greenplum, Health care, Liberty and privacy, XtremeData | Leave a Comment |
A rant about medical records
It is very difficult to convey utterly tedious frustration without — well, without thoroughly boring one’s audience. And hence I will not try to explain the full awfulness of modern medical records and information compartmentalization. But I was personally present 5 times in one recent week while Linda gave detailed information about her contact information, medical history, etc. — and all 5 times it was to the same hospital.
In our case, that just costs time. But the information flow in my father’s case upsets me more. Read more
| Categories: Health care, Liberty and privacy | 2 Comments |
