“There is a risk that we are developing an ideological system that insists on the reality of the digital in preference to the analogue. This is partly because big data collection and analysis sits at the core of many surveillance technologies and therefore at the core of systems of power and control. The social, political and personal focus on risk, so clearly identified by Beck as the prevailing organizing principle of the 20th century, has developed along with its enabling technologies into a seemingly endless desire for data.
It has become clear that what is missing from this field is a fully-fledged theory of digital political economy. Many information systems still collect highly abstracted and abbreviated data about people for two reasons. First, the process of reduction by definition has always been central to traditional data collection processes. The seemingly basic question of “in which city were you born?” belies the analogue complexity of the answer “St Petersburg/Leningrad/Petrograd/St Petersburg”.”
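The birthplace example can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the `NamePeriod` type and the helper functions are hypothetical names introduced here, not part of any existing system, and the years are the commonly cited renaming dates for the city. It contrasts a record that keeps the city's naming history with the single value a conventional "city of birth" field retains.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class NamePeriod:
    """One official name of a place and the years it applied (hypothetical type)."""
    name: str
    start_year: int           # year the name came into use
    end_year: Optional[int]   # year the name was replaced; None if still current


# Illustrative record: commonly cited renaming years for the city.
ST_PETERSBURG = (
    NamePeriod("St Petersburg", 1703, 1914),
    NamePeriod("Petrograd", 1914, 1924),
    NamePeriod("Leningrad", 1924, 1991),
    NamePeriod("St Petersburg", 1991, None),
)


def names_in_year(periods: Sequence[NamePeriod], year: int) -> list:
    """Return every name in use during `year` (two names in a renaming year)."""
    return [
        p.name
        for p in periods
        if p.start_year <= year and (p.end_year is None or year <= p.end_year)
    ]


def flatten(periods: Sequence[NamePeriod]) -> str:
    """What a single text field typically keeps: the current name only."""
    return periods[-1].name


if __name__ == "__main__":
    for birth_year in (1900, 1920, 1950, 2000):
        print(birth_year, names_in_year(ST_PETERSBURG, birth_year), "->", flatten(ST_PETERSBURG))
```

Under these assumptions, people born in 1900, 1920, 1950 and 2000 would each name a different or recurring city, yet the flattened field stores the same string for all of them. The reduction happens in the schema, before any analysis begins.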
“Secondly, the transition from electro-mechanical information systems to fully digital ones has not yet seen a revision of the way database fields rely on negotiated, abbreviated and contested social concepts. While the mechanisms for collecting, storing and even analyzing data have become far more sophisticated, the philosophical, political and economic debates continue. The persistence in 2015, for example, of the linguistically convoluted and largely meaningless “culturally and linguistically diverse backgrounds” as the nomenclature for people from (primarily) non-British, non-Indigenous origins in Australia speaks to the disjuncture in the debate between technology, language and the humanities. The labels matter because they shape our responses and the data supports the labels.
Broad age categories (rather than specific years) are a useful example. Such categories remain routinely utilized in survey tools. Their origins (much like the QWERTY keyboard) remain partly a function of historical necessity. The recognition of the potential value of demographic data emerged at a time when computers were largely human beings utilizing pen and paper. Computation might still have been guided by emerging mathematical and statistical rules but the actual work was done by and repeatedly checked by people, not machines.”
“This politics of the social has become ingrained in information systems through the categories we use, including those associated with social phenomena such as gender, sexuality, ethnicity, disability and so on. These categories are malleable, but in an information system they tend to be highly reductive, down to the field name and the binary “yes or no” answer common in this particular area. This is one way that past identity politics are carried forward in information systems and analytical assumptions. Re-hypothesising social data is essential in this big data era.
But big data approaches don’t share these inherited limitations. The long-standing assumptions behind such traditional methods no longer apply. Now we can easily look for ‘natural’ breaks in the data and analyse accordingly or simply analyse all of the data that was originally collected or go back and re-analyse old data. It is now clearly possible to collect, store and analyse so much more data than could be done in the past. Our inherited concepts and methods need to change to catch up to the technology instead of reducing the data to the technology we once had.
A major problem with ‘big data’ at the moment is the inherited knowledge many people hold, ranging from basic statistical assumptions through to ideas of what ‘science’ is, that was built on and around these technical constraints. Big data analytics does not share these limitations, but the risk is that they persist as normalized knowledge in our information systems and societies.”
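The contrast between inherited age bands and ‘natural’ breaks can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the band boundaries in `FIXED_BANDS`, the sample ages, and the function names are invented, and the break-finding method is a deliberately simple one (cutting at the largest gaps between neighbouring values), not a recommended statistical technique.

```python
from collections import Counter

# Hypothetical survey bands of the kind inherited from pen-and-paper tabulation.
FIXED_BANDS = [(0, 17), (18, 34), (35, 54), (55, 120)]


def fixed_band(age: int) -> str:
    """Map an exact age onto its predefined band label."""
    for low, high in FIXED_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age {age} is outside every band")


def gap_breaks(values, k: int):
    """Split values into k groups by cutting at the k-1 largest gaps
    between neighbouring sorted values (a simple stand-in for 'natural' breaks)."""
    ordered = sorted(values)
    if not 1 <= k <= len(ordered):
        raise ValueError("k must be between 1 and the number of values")
    # Index i marks a possible cut between ordered[i - 1] and ordered[i].
    by_gap = sorted(
        range(1, len(ordered)),
        key=lambda i: ordered[i] - ordered[i - 1],
        reverse=True,
    )
    cuts = sorted(by_gap[: k - 1])
    groups, start = [], 0
    for cut in cuts:
        groups.append(ordered[start:cut])
        start = cut
    groups.append(ordered[start:])
    return groups


if __name__ == "__main__":
    ages = [19, 20, 21, 22, 33, 34, 35, 36, 37, 61, 63, 64]  # invented sample
    print(Counter(fixed_band(a) for a in ages))
    print(gap_breaks(ages, 3))
```

With this invented sample, the fixed bands split the cluster of respondents aged 33 to 37 across two categories, while the gap-based grouping keeps them together. The point is not that one method is correct, only that the grouping is a choice made on the data rather than a constraint inherited from the collection form, and that it is only available when exact ages were recorded.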
“These issues are political because there is a real importance to inclusion and exclusion in the information systems of contemporary society. Inclusion means an individual or group gets counted, and as such may have access to resources that those who are not counted generally do not. How you are included or excluded also matters. To be included through an extensive range of reductive processes has the capacity to make an individual a digital abstraction in information systems – one barely present in the external world. Being excluded means that the person or group can take on varying degrees of risk from the system because to be excluded is not the same as being ignored. Marginalised groups have a long history of being excluded from official counting systems only to have a great deal of, usually negative, attention paid to them by those same systems. This is Phoenix’s concept of normalized absence and pathologised presence in both theory and practice.”
“Kitchin, amongst other theorists and critics, argues that information systems and the socio-technical assemblages that implement them are far from neutral. In reality they are politics in motion. This is a key point in the growth of big data processes. Their outcomes will rely on the way people are constituted in such systems via their ‘datafication’. Big data may offer an opportunity to reform the last century or so of social data collection and analysis by providing a broader and deeper source of data. To be truly “big” in both scope and impact, data practitioners must engage with the politics of categorization and therefore with individuals, communities, and social scientists about the origins and “fit” of inherited and emerging data categories. This will be an important political project in its own right.”
from Discover Society