Discussion of public policy around technological issues, especially but not only surveillance and privacy.
That said, while Steve Bannon is firmly established as Trump’s puppet master, they don’t agree on quite everything, and one of the documented disagreements had been in their view of skilled, entrepreneurial founder-type immigrants: Bannon opposes them, but Trump has disagreed with his view. And as per the speech, Trump seems to be maintaining his disagreement.
At least, that seems implied by his call for “a merit-based immigration system.”
And by the way — Trump managed to give a whole speech without saying anything overtly racist. Indeed, he specifically decried the murder of an Indian-immigrant engineer. By Trump standards, that counts as a kind of progress.
Edit (March 5): But now there is negative-seeming news about H1-B visas.
I’d like to argue that a single frame can be used to view a lot of the issues that we think about. Specifically, I’m referring to coordination, which I think is a clearer way of characterizing much of what we commonly call communication or collaboration.
It’s easy to argue that computing, to an overwhelming extent, is really about communication. Most obviously:
- Data is constantly moving around — across wide area networks, across local networks, within individual boxes, or even within particular chips.
- Many major developments are almost purely about communication. The most important computing device today may be a telephone. The World Wide Web is essentially a publishing platform. Social media are huge. Etc.
Indeed, it’s reasonable to claim:
- When technology creates new information, it’s either analytics or just raw measurement.
- Everything else is just moving information around, and that’s communication.
A little less obvious is the much of this communication could be alternatively described as coordination. Some communication has pure consumer value, such as when we talk/email/Facebook/Snapchat/FaceTime with loved ones. But much of the rest is for the purpose of coordinating business or technical processes.
Among the technical categories that boil down to coordination are:
- Operating systems.
- Anything to do with distributed computing.
- Anything to do with system or cluster management.
- Anything that’s called “collaboration”.
That’s a lot of the value in “platform” IT right there. Read more
|Categories: Business intelligence, Predictive modeling and advanced analytics, Public policy||2 Comments|
The United States and consequently much of the world are in political uproar. Much of that is about very general and vital issues such as war, peace or the treatment of women. But quite a lot of it is to some extent tech-industry-specific. The purpose of this post is outline how and why that is.
- There’s a worldwide backlash against “elites” — and tech industry folks are perceived as members of those elites.
- That perception contains a lot of truth, and not just in terms of culture/education/geography. Indeed, it may even be a bit understated, because trends commonly blamed on “trade” or “globalization” often have their roots in technological advances.
- There’s a worldwide trend towards authoritarianism. Surveillance/ privacy and censorship issues are strongly relevant to that trend.
- Social media companies are up to their neck in political considerations.
Because they involve grave threats to liberty, I see surveillance/privacy as the biggest technology-specific policy issues in the United States. (In other countries, technology-driven censorship might loom larger yet.) My views on privacy and surveillance have long been:
- Fixing the legal frameworks around information use is a difficult and necessary job. The tech community should be helping more than it is.
- Until those legal frameworks are indeed cleaned up, the only responsible alternative is to foot-drag on data collection, on data retention, and on the provision of data to governmental agencies.
Given the recent election of a US president with strong authoritarian tendencies, that foot-dragging is much more important than it was before.
Other important areas of technology/policy overlap include: Read more
The United States presidency was recently assumed by an Orwellian lunatic.* Sadly, this is not an exaggeration. The dangers — both of authoritarianism and of general mis-governance — are massive. Everybody needs in some way to respond.
*”Orwellian lunatic” is by no means an oxymoron. Indeed, many of the most successful tyrants in modern history have been delusional; notable examples include Hitler, Stalin, Mao and, more recently, Erdogan. (By way of contrast, I view most other Soviet/Russian leaders and most jumped-up-colonel coup leaders as having been basically sane.)
There are many candidates for what to focus on, including:
- Technology-specific issues — e.g. privacy/surveillance, network neutrality, etc.
- Issues in which technology plays a large role — e.g. economic changes that affect many people’s employment possibilities.
- Subjects that may not be tech-specific, but are certainly of great importance. The list of candidates here is almost endless, such as health care, denigration of women, maltreatment of immigrants, or the possible breakdown of the whole international order.
But please don’t just go on with your life and leave the politics to others. Those “others” you’d like to rely on haven’t been doing a very good job.
What I’ve chosen to do personally includes: Read more
1. The cloud is super-hot. Duh. And so, like any hot buzzword, “cloud” means different things to different marketers. Four of the biggest things that have been called “cloud” are:
- The Amazon cloud, Microsoft Azure, and their competitors, aka public cloud.
- Software as a service, aka SaaS.
- Co-location in off-premises data centers, aka colo.
- On-premises clusters (truly on-prem or colo as the case may be) designed to run a broad variety of applications, aka private cloud.
Further, there’s always the idea of hybrid cloud, in which a vendor peddles private cloud systems (usually appliances) running similar technology stacks to what they run in their proprietary public clouds. A number of vendors have backed away from such stories, but a few are still pushing it, including Oracle and Microsoft.
This is a good example of Monash’s Laws of Commercial Semantics.
2. Due to economies of scale, only a few companies should operate their own data centers, aka true on-prem(ises). The rest should use some combination of colo, SaaS, and public cloud.
This fact now seems to be widely understood.
Five years ago, in a taxonomy of analytic business benefits, I wrote:
A large fraction of all analytic efforts ultimately serve one or more of three purposes:
- Problem and anomaly detection and diagnosis
- Planning and optimization
That continues to be true today. Now let’s add a bit of spin.
1. A large fraction of analytics is adversarial. In particular: Read more
|Categories: Business intelligence, Investment research and trading, Log analysis, Predictive modeling and advanced analytics, RDF and graphs, Surveillance and privacy, Web analytics||3 Comments|
One of the most important issues in privacy and surveillance is also one of the least-discussed — the use of new surveillance technologies in ordinary law enforcement. Reasons for this neglect surely include:
- Governments, including in the US, lie about this subject a lot. Indeed, most of the reporting we do have is exposure of the lies.
- There’s no obvious technology industry ox being gored. What I wrote in another post about Apple, Microsoft et al. upholding their customers’ rights doesn’t have a close analogue here.
One major thread in the United States is: Read more
Numerous tussles fit the template:
- A government wants access to data contained in one or more devices (mobile/personal or server as the case may be).
- The computer’s manufacturer or operator doesn’t want to provide it, for reasons including:
- That’s what customers prefer.
- That’s what other governments require.
- Being pro-liberty is the right and moral choice. (Yes, right and wrong do sometimes actually come into play. )
As a general rule, what’s best for any kind of company is — pricing and so on aside — whatever is best or most pleasing for their customers or users. This would suggest that it is in tech companies’ best interest to favor privacy, but there are two important quasi-exceptions: Read more
|Categories: Amazon and its cloud, Google, Microsoft and SQL*Server, Surveillance and privacy, Web analytics||2 Comments|
This year, privacy and surveillance issues have been all over the news. The most important, in my opinion, deal with the tension among:
- Personal privacy.
- General law enforcement.
More precisely, I’d say that those are the most important in Western democracies. The biggest deal worldwide may be China’s movement towards an ever-more-Orwellian surveillance state.
The main examples on my mind — each covered in a companion post — are:
- The Apple/FBI conflict(s) about locked iPhones.
- The NSA’s propensity to share data with civilian law enforcement.
Legislators’ thinking about these issues, at least in the US, seems to be confused but relatively nonpartisan. Support for these assertions includes:
- The recent unanimous passage in the US House of Representatives of a law restricting police access to email.
- An absurd anti-encryption bill proposed in the US Senate.
- The infrequent mention of privacy/surveillance issues in the current election campaign.
I do think we are in for a spate of law- and rule-making, especially in the US. Bounds on the possible outcomes likely include: Read more
- I’ve suggested in the past that multi-data-center capabilities are important for “data sovereignty”/geo-compliance.
- The need for geo-compliance just got a lot stronger, with the abolition of the European Union’s Safe Harbour rule for the US. If you collect data in multiple countries, you should be at least thinking about geo-compliance.
- Cassandra is an established leader in multi-data-center operation.
But when I made that connection and checked in accordingly with my client Patrick McFadin at DataStax, I discovered that I’d been a little confused about how multi-data-center Cassandra works. The basic idea holds water, but the details are not quite what I was envisioning.
The story starts:
- Cassandra groups nodes into logical “data centers” (i.e. token rings).
- As a best practice, each physical data center can contain one or more logical data center, but not vice-versa.
- There are two levels of replication — within a single logical data center, and between logical data centers.
- Replication within a single data center is planned in the usual way, with the principal data center holding a database likely to have a replication factor of 3.
- However, copies of the database held elsewhere may have different replication factors …
- … and can indeed have different replication factors for different parts of the database.
In particular, a remote replication factor for Cassandra can = 0. When that happens, then you have data sitting in one geographical location that is absent from another geographical location; i.e., you can be in compliance with laws forbidding the export of certain data. To be clear (and this contradicts what I previously believed and hence also implied in this blog):
- General multi-data-center operation is not what gives you geo-compliance, because the default case is that the whole database is replicated to each data center.
- Instead, you get that effect by tweaking your specific replication settings.
|Categories: Cassandra, Clustering, DataStax, HBase, NoSQL, Open source, Specific users, Surveillance and privacy||3 Comments|