Big data problems we face today can be traced to the social ordering practices of the 19th century

“This situation brings with it a socio-political dimension of interest to us, one in which our understanding of people and our actions on individuals, groups and populations are deeply implicated. The collection of social data had a purpose – understanding and controlling the population in a time of significant social change. To achieve this, new kinds of information and new methods for generating knowledge were required. Many ideas, concepts and categories developed during that first data revolution remain intact today, some uncritically accepted more now than when they were first developed. In this piece we draw out some connections between these two data ‘revolutions’ and the implications for the politics of information in contemporary society. It is clear that many of the problems in this first big data age and, more specifically, their solutions persist down to the present big data era.”
“These changes in knowledge were facilitated not only by large quantities of new information pouring in from around the world but by shifts in the production, processing and analysis of that information. Many of these methods are still with us, including information taxonomies and knowledge trees, to name but two. Hacking observed that while social categories are epistemic products their application can have marked ontological effects. Knowledge of the natural world was rapidly applied to the social world and the politicking of social identities began in earnest, supported by a rising tide of data and analytical methods. Conservatives and social critics alike relied on the production and dissemination of data, both large and small, to support repression and reform. The public inquiry emerged as another 19th century mechanism that persists in the present, with the same general focus – poverty, crime, health and systemic failures.

These new knowledge demands saw some contextual successes, such as in the demographic and statistical sciences, and some failures, such as Babbage’s analytical engine design, which was conceived but not completed during his lifetime. In some ways growing academic specialisations created a situation in which what was gained through a narrowing of focus and growth in sub-disciplinary activity was also lost in generalisability. This distinctly Victorian problem endures to the present day despite interdisciplinary projects of various kinds. Floridi, writing on the philosophy of big data, has said quite specifically that the real big data problem we face today is less one of the quantity or quality of data or even technical skills but rather one of epistemology.
“Much of the data collected about human beings by bureaucratic systems has a history not simply of description or even understanding but one of control. Foucault’s power/knowledge nexus is situated in a selection of bureaucratic and institutional forms for this reason. Every deviant or ‘underperforming’ social category is a warrant for action once documented. Consequently a great deal of social data is coercive in nature. Social data is rarely neutral and the persistence of ‘wicked’ social problems illustrates how regulation has been favoured in preference to their solution. That a census or a social survey is a snapshot of the way our societies are regulated is rarely remarked on and instead emphasis is given to the presumed objectivity of the categories and their data. This is the ideology of the small data era in action – the claim that it is science and not society that we are seeing through such instruments.


The targets of social policy interventions for more than two centuries have essentially been the same categories of people – groups marked as moral outsiders (deviants) in their societies. The collection of data about these categories of people, in particular, was a marked feature of the first big data environment. These categories were operationalised through society’s regulatory processes and institutions including education, the law and of course healthcare. These are the same locations where debates about structure, agency and morality continue to intersect and where the use of data and technology are represented as largely emancipatory. The risk is that ‘big data’ replicates the ideological underpinnings common to much of what has been produced under the small data paradigm.
“Our question then is how do we go about re-writing the ideological inheritance of that first data revolution? Can we or will we unpack the ideological sequelae of that past revolution during this present one? The initial indicators are not good in that there is a pervasive assumption in this broad interdisciplinary field that reductive categories are both necessary and natural. Our social ordering practices have influenced our social epistemology. We run the risk in the social sciences of perpetuating the ideological victories of the first data revolution as we progress through the second. The need for critical analysis grows apace not just with the production of each new technique or technology but with the uncritical acceptance of the concepts, categories and assumptions that emerged from that first data revolution. That first data revolution proved to be a successful anti-revolutionary response to the numerous threats to social order posed by the incredible changes of the nineteenth century, rather than the Enlightenment emancipation that was promised.
“Information is not new and nor is data – of whatever order of magnitude. We are in a period that can reasonably be seen as the second ‘big data’ revolution and it is revolutionary because it challenges our accepted understanding of the world and not simply because of the volumes and velocity of data generation in our new digital information technologies. Many social categories were designed to control, coerce and even oppress their targets. The poor, the unmarried mother, the illegitimate child, the black, the unemployed, the disabled, the dependent elderly – none of these social categories of person is a neutral framing of individual or collective circumstances. They are instead a judgement on their place in modernity and material grounds for research, analysis and policy interventions of various kinds. Two centuries after the first big data revolution many of these categories remain with us almost unchanged and, given what we know of their consequences, we have to ask what will be their situation when this second data revolution draws to a close?

Like that first data revolution, this present one also has ambitions for people and their interactions with the new media emerging in its wake. These discussions are useful and necessary because discussion and negotiation are essential in the face of revolution. The responses to revolution in the late 18th and 19th centuries were often violent but we now have better methods available for the maintenance of social order, such as Foucault’s technologies of the self and Bourdieu’s habitus. Where we see this becoming highly problematic is in the continuity of ideologically informed notions of ourselves and others and the reproduction of such ideologies in and through our new digital environments. Following Floridi, this is a significant epistemic and ethical problem in our current big data era.”

from The London School of Economics and Political Science


