Why I’m taking a ‘data-driven science’ approach to research

In the age of big data, many new debates have emerged about the ‘best’ approach to research.

Some scholars argue there’s no longer any real need for theory, and claim that we should allow the ‘data to speak for themselves’. Others argue that all data carries inherent bias. That means we need knowledge of existing theory to provide the context necessary for meaningful understanding.

This is especially important in the social and political sciences, where big data researchers use massive computational datasets to understand complex human phenomena such as wars, genocide or racism. It’s not easy for quantitative big data models to offer new insights into areas like these without drawing on existing knowledge, which may still be relevant even if it dates back decades.

Boyd and Crawford (2012) support this view, describing an ‘arrogant undercurrent’ in big data research that’s all too hasty to sideline older forms of research. Big data is also less objective than it looks. For example, cleaning a large social media dataset (from Twitter, say) is ‘inherently subjective’, because the researcher decides which attributes to include and which to ignore.
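To make the point about cleaning concrete, here’s a minimal Python sketch. Everything in it is an assumption for illustration: the tweet records are made up, the field names (`text`, `lang`, `is_retweet`, `location`) are hypothetical rather than a real Twitter API schema, and `clean` is just a helper I’ve invented to show how each cleaning choice changes what ends up in the dataset.

```python
# Made-up tweet records; the field names are illustrative, not a real API schema.
RAW_TWEETS = [
    {"id": 1, "text": "Peace talks resume", "lang": "en", "is_retweet": False, "location": "Geneva"},
    {"id": 2, "text": "Peace talks resume", "lang": "en", "is_retweet": True, "location": ""},
    {"id": 3, "text": "Les pourparlers reprennent", "lang": "fr", "is_retweet": False, "location": "Paris"},
]


def clean(tweets, keep_fields, drop_retweets=True, languages=None):
    """Return a cleaned copy of the tweets. Every argument is a researcher's choice."""
    cleaned = []
    for tweet in tweets:
        # Choice 1: do retweets count as separate observations?
        if drop_retweets and tweet["is_retweet"]:
            continue
        # Choice 2: which languages are in scope?
        if languages is not None and tweet["lang"] not in languages:
            continue
        # Choice 3: which attributes are kept and which are ignored?
        cleaned.append({field: tweet[field] for field in keep_fields})
    return cleaned


# Two defensible cleaning strategies applied to the same raw data.
strict = clean(RAW_TWEETS, keep_fields=["id", "text"], languages={"en"})
loose = clean(RAW_TWEETS, keep_fields=["id", "text", "location"], drop_retweets=False)

print(len(strict), len(loose))
```

The same three raw records produce two different datasets depending on the choices made, and neither version is the ‘neutral’ one.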

With these debates in mind, I’ve decided to use a ‘data-driven science’ approach (Kitchin, 2014) in my PhD research. That means using existing behavioural science theory as a foundation to help me interpret findings in large-scale social media datasets, and blending qualitative methods with big data approaches based in computational social science.

It means I’ll need to get better at programming (Python is my language of choice), and venture into the exciting new world of machine learning. At the same time, I won’t abandon older research methods (such as interviews) if they seem the right fit for the job.

In this blog, I’ll discuss the code I’m using in my research as it evolves (yes, there will be code snippets!). I’m relatively new to programming, so it’ll be a learning journey of sorts, probably with its fair share of mishaps and zig-zags.

I’m also fascinated by the many areas of social and political life that technology has affected, so expect a smattering of posts with musings about AI, ethics, life and so on. I’m looking forward to interacting with the community and having some interesting conversations.



Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878

Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1), 2053951714528481. https://doi.org/10.1177/2053951714528481