Category: Data Journalism

Can Google search data help solve Islamophobia?

For decades, social scientists have conducted research using some combination of surveys, census data, focus groups, interviews, and observation techniques. With the exception of covert observation, which brings its own ethical issues, these methods have something in common: the dishonesty of people. These methods are all subject to human lies and therefore unable to paint a reliable picture of society’s true beliefs and darkest fears. In fact, the most objective forms of data are given up willingly, in private, where people are free from the worry of being judged. Short of stealing people’s diaries or tapping their phone calls, what else can researchers do to gather the most objective data possible?

Better than surveys

In our digital era the most obvious answer is also the correct one. But until now, few people have thought to leverage this tool and publicise their findings in such an accessible way and at such a pertinent time. What is the technology we all use to ask questions, seek validation, and search for the most outrageous things? Why, of course, it’s Google. Many people would be embarrassed to publicly display their Google search history. I know mine is full of very silly things. But at the same time, these queries are deeply revealing, which is precisely why they strike a nerve. They display some of our deepest secrets. For example, a few years ago I used to get occasional panic attacks. I remember waking up at 3 am in an unfamiliar country, caught in the midst of an attack, gasping for breath. To calm myself, I searched Google for reassurance that it was ‘just’ a panic attack.

Google as ‘truth serum’

People search Google for all manner of things. Seth Stephens-Davidowitz (see below for video of his recent RSA talk), the researcher who produced this study, found many searches for terms involving ‘abortions’, ‘closet gays’, ‘penis size’, and ‘breastfeeding of husbands’ (the latter being apparently popular in India). He also found other more sinister patterns, ones suggesting American racism was far more widespread than previously thought. In fact, search data shows the idea of America as a ‘post-racial’ society, much-touted after the 2008 election of Barack Obama, to be quite absurd. Google showed American racism and Islamophobia were thoroughly alive and kicking, even in places where people didn’t publicly admit to holding racist views. They espouse very different opinions in the privacy of their own homes, face-to-face only with Google. It’s Google as ‘truth serum’. Almost ten years later, with Trump at the helm, perhaps America is finally showing its true face.

Tracking Islamophobia in searches

Obama’s address to the nation after the 2015 San Bernardino attack provides an interesting example of how search data reflects hidden social views. In the speech, he aimed to calm outraged people and restore order to the country. In particular, he wanted to counteract the backlash that Muslim-Americans would surely face. While he was speaking of Muslims as ‘our friends’, ‘neighbours’ and so on, Google search data was telling a different story. After each terrorist attack (and this happens in the UK too) the volume of negative and violent searches about Muslims skyrockets. Islamophobic searches like ‘kill all Muslims’ become alarmingly frequent.

During most of Obama’s speech, these searches didn’t reduce or even level off. Instead they became even more frequent. This makes sense, because challenging people’s world views acts as an attack on their fundamental identity. In response, most will cling tighter to whatever they believe. But later in his speech, Obama changed tack. He introduced new images: not just of Muslim-Americans as friends and neighbours, who should be respected, but also of ‘Muslim soldiers’, willing to die for America, and ‘Muslim athletes’, representing the country on the world stage.

From ‘terrorists’ to soldiers and athletes

And then, something changed in the data. Islamophobic searches slowed down, to be replaced with searches for ‘Muslim athletes’ and ‘Muslim soldiers’. Something had resonated with the people searching; instead of responding predictably to Obama’s perceived ‘attack’ on their entrenched world views, they had become curious. I believe this happened for two reasons: partly because the idea of Muslims as athletes and soldiers resonated with ‘patriotic’ American audiences, but also because these images perhaps helped to ‘de-otherise’ public perceptions of Muslims. By drawing on resonant all-American themes, Obama associated Muslims with a set of positive images rather than just trying to convince wider America to accept them as a group. In response, albeit temporarily, the volume of Islamophobic searches fell and more positive searches appeared.

This is encouraging in some ways, because despite the fleeting nature of this positivity, its presence suggests two important things: 1) that Islamophobia is largely a problem of perceptions, and 2) that the tide can be turned back. Negative views of Muslims have become deeply entrenched over the last three decades. Islamophobia as a public perception is regularly reinforced by mainstream media, by certain think tanks and their ‘experts’, and by reactions to the terrible deeds of ISIS, a group that has hijacked the image of Islam worldwide.

How can this data help us?

Can Google search data offer us the chance to fix some of society’s ills? Its revealing nature shows our darkest fears in a way no survey ever can. This information (anonymised, of course) could be used to bring issues into the open and address their root causes. In the case of Islamophobia, analysing Google searches could reveal where the gaps and misperceptions lie in wider society’s understanding of Muslims. It could allow us to categorise the fears, misunderstandings, and false perceptions. This could inform the design of social initiatives targeting specific problems, helping people understand each other better and gain a stronger sense of reality over perception.

Who’s winning on the digital battlefield?

On the eve of the French presidential elections, there’s a sudden flurry of activity on social media. A candidate’s name – #Macron – is trending on Twitter. So what’s the news? A large stash of Emmanuel Macron’s private emails has been hacked and leaked online.

Sound familiar?

That’s because it’s happened before. You probably remember last year’s debacle about Hillary Clinton’s leaked emails. This more than likely contributed to her losing the election to Donald Trump. If nothing else, it created an air of public suspicion around Clinton that did irreparable damage to her reputation. I still think back to that hacking event and recall it as a haze of rumours and misinformation; I was never totally clear what the core of the issue really was.

And in light of this latest development with France, I begin to wonder if confusion is actually the goal in all this. Perhaps we give whoever is behind this too much credit by assuming they’re actually pulling the strings of public opinion. What would be easier, and perhaps just as damaging, would be simply to sow the seeds of mistrust. With everyone at each other’s throats, arguing bitterly about what is and isn’t ‘fake news’, there’s room for the malevolent forces to continue their underhand work of sabotaging democracy. When journalists digging deep to report the truth on something can so easily have their work discredited as ‘fake news’ by none other than the US president himself, we really are veering into a disturbing new reality.

Who is actually responsible for this mischief? Sources point to a Russian hacking group known, among a variety of other names, as “Fancy Bear”. It’s the same group said to be responsible for hacking Hillary Clinton’s emails last year. “The key goals and objectives of the campaign appear to be to undermine Macron’s presidential candidacy and cast doubt on the democratic electoral process in general,” said Vitali Kremez, director of research at Flashpoint, a business risk intelligence company in New York, in an interview with the New York Times.

We should not underestimate the abilities of Russia in this arena. Dmitri Alperovitch, of CrowdStrike, told the MIT Technology Review that Russia ‘gets the true nature of the battlefield’ in a way the West does not. “They’ve been thinking about this for a very long time,” he said. “It actually goes at least as far back as the Tsarist era in the 1860s, when they created one of the first modern intelligence agencies, the Okhranka.” So Russia has been doing this sort of thing for well over a century, but the rise of digital offers the perfect new landscape for even deeper subterfuge.

But there’s one ray of hope, and that’s in how the French media has responded to the Macron email leak so far: by not reporting on its contents. This seems a smart move. French law requires candidates to stop campaigning from midnight on Friday until the polls close at 8pm on Sunday. Candidates are forbidden to give media interviews or issue statements. The timing of the email hack was likely designed to coincide with this, in an attempt to release the emails while Macron was unable to respond. But denying the fake news trolls the oxygen of media publicity cuts the head off the snake, removing much of its potential to harm. The same goes for terrorist incidents. ‘Propaganda of the deed’, as terrorism was once known, relies on shock and awe to achieve its ends. In an ‘always-on’ digital society this effect is massively amplified, and completely fake incidents can even be instigated by anyone, anywhere. If the media had denied the ‘oxygen of publicity’ to groups like ISIS from the very beginning, the world might be less messy today.

The emergence of ISIS fuelled the rise of the far-right, giving white supremacists and ultraconservatives the opportunity to rise up and gain power under the guise of ‘protecting’ the nation from threat. Of course that ‘threat’ is constantly portrayed as emanating from Islam and Muslims. And so the cycle continues. But the example of France is certainly a promising one. The election outcome will reveal if it actually worked. Perhaps going forward, these issues could be mitigated by a more scrupulous mainstream media, one that’s less desperate for ‘clicks’ to ensure its survival, along with citizen journalism collectives such as Bellingcat, to shed light on old issues and reveal new cracks in existing narratives.

Nuanced communities: Mapping ISIS support on Twitter

As every content marketer knows, creating resonant narratives requires intimate knowledge of the audience in question.

Nowhere is this more true than in attempts to counter the potent messaging of ISIS. The terrorist group is infamous for its ability to attract recruits from across the world to commit violence in the name of the ‘caliphate.’

ISIS has been a fixture in the global public consciousness for over two years, from its dramatic emergence in summer 2014 to its near-decline earlier this year, followed by a resurgence with its latest attack in Berlin just weeks ago. Long before Berlin, the group had already become notorious for the quality and power of its social media messaging, professionally produced videos and slick English-language print publications.

Concerned national governments and civil society groups have made numerous attempts to counter the ISIS narrative in various ways, ranging from shutting down followers’ Twitter accounts en masse to creating alternative narratives that aim to discredit the group, its ideology and its actions. But despite all these attempts, attacks against European cities remain a very real threat.

As another gloomy and blood-soaked year of ISIS activity comes to an end, the group shows no sign of fading away. Although it has lost physical territory in Iraq and Syria, the ongoing risk of the ISIS virtual caliphate persists.

A whole range of diverse factors determine an individual’s likelihood to become radicalised, many of which have been studied in significant depth elsewhere. Social media is not necessarily the most influential factor, but it undoubtedly plays a role.

RAND, a US-based think-tank, conducted a detailed research study, published in 2016, to examine ISIS support and opposition networks on Twitter, aiming to gather insights that could inform future counter-messaging efforts.

The study used a mixed-method analytics approach to map publicly available Twitter data from across the Arabic-speaking Twitter-verse. Specific techniques used were community detection algorithms to detect links between Twitter users that could signify the presence of interactive communities, along with social network analysis and lexical analysis to draw out key themes from among the chatter.

Research goals were to learn how to differentiate between ISIS opponents and supporters; to understand who they are and what they are saying; and to understand the connections between them while identifying the influencers.
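To make the "identifying the influencers" goal concrete, here is a minimal sketch of one common social network measure: in-degree, or how many distinct accounts mention or retweet a given account. This is not the RAND team's method, which isn't described in detail here. The account names and interactions below are hypothetical, and a real analysis would start from collected Twitter data.

```python
from collections import Counter

# Hypothetical (source, target) pairs meaning "source mentioned or retweeted target".
interactions = [
    ("user_a", "user_c"),
    ("user_b", "user_c"),
    ("user_d", "user_c"),
    ("user_a", "user_b"),
    ("user_c", "user_b"),
    ("user_a", "user_c"),  # repeat interaction, counted only once below
]


def in_degree(edges):
    """Count how many distinct accounts point at each account."""
    # De-duplicate edges and ignore self-mentions.
    unique_edges = {(src, dst) for src, dst in edges if src != dst}
    return Counter(dst for _, dst in unique_edges)


def top_influencers(edges, n=3):
    """Return the n accounts with the highest in-degree, ties broken by name."""
    counts = in_degree(edges)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


print(top_influencers(interactions, n=2))
```

In-degree is only a starting point. It rewards accounts that attract attention, whether or not that attention is supportive.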

Lexical analysis uncovered four major groups, or ‘meta-communities’ among the Arabic-speaking ISIS conversation on Twitter. These were Shia, Sunni, Syrian Mujahideen, and ISIS Supporters. They are characterised by certain distinct patterns in their tweets. Shia tend to condemn ISIS and hold positive views of Christians/the West/the international coalition fighting ISIS. This is unsurprising considering the long-standing hostility between Sunni and Shia Muslims and the fact that ISIS is a Sunni group.

The Syrian Mujahideen group is anti-Assad, holds mixed views of ISIS, and negative views of the coalition. ISIS supporters talk positively in bombastic overblown language about ISIS and the caliphate. They insult Shia, the Assad regime, and the West. Notably, their approach to social media strategy is by far the most sophisticated of the lot. And finally, the Sunni group is heavily divided along nationalistic lines, which includes most countries of the Arab world.

Key findings of interest

1. Unique audiences, essential nuance

Telling the difference in large datasets between ISIS supporters and opponents was key for this study. RAND researchers chose a simple rule: Twitter users who tweeted the Arabic term for ‘Islamic State’ (الدولة الإسلامية) were considered to be supporters, while those who used the acronym ‘DAESH’ (داعش) were opponents. This dividing line isn’t foolproof but, based on what’s known about the significance of these two Arabic terms, it seems a valid way to approach the task. The research found that although opponents outnumbered supporters six to one, the supporters were far more active, producing 50% more tweets daily.
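The term-based dividing line can be sketched in a few lines of Python. This is an illustration of the idea rather than RAND's implementation: the function name, the sample tweets, and the handling of users who use both terms or neither are all assumptions of mine.

```python
SUPPORTER_TERM = "الدولة الإسلامية"  # "Islamic State"
OPPONENT_TERM = "داعش"  # "Daesh"


def label_user(tweets):
    """Label a user from the terms found in their tweets.

    Assumption: users who use both terms are labelled "mixed" and users
    who use neither are labelled "unknown". The study may have treated
    these cases differently.
    """
    uses_supporter_term = any(SUPPORTER_TERM in tweet for tweet in tweets)
    uses_opponent_term = any(OPPONENT_TERM in tweet for tweet in tweets)
    if uses_supporter_term and uses_opponent_term:
        return "mixed"
    if uses_supporter_term:
        return "supporter"
    if uses_opponent_term:
        return "opponent"
    return "unknown"


# Hypothetical sample: each user maps to a list of tweet texts.
sample_users = {
    "user_1": ["أخبار الدولة الإسلامية اليوم"],
    "user_2": ["داعش لا يمثلنا"],
    "user_3": ["صباح الخير"],
}
labels = {user: label_user(tweets) for user, tweets in sample_users.items()}
print(labels)
```

A rule this simple will mislabel quotes, sarcasm and news reports, which is one reason the dividing line isn't foolproof.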

This could point to a couple of things. Firstly the outnumbering suggests that the majority of the Arab world (or at least the Twitter sphere) is anti-ISIS; while the volume of pro-ISIS tweets could suggest passionate support for the group, or on the other hand could point to the presence of armies of pro-ISIS bots or perhaps the use of astro-turfing. The latter two could be an interesting case for new research, especially in the present climate where the curtain has been lifted on use of social media bots, astro-turfing armies and persona management software.

2. Jordanian pilot, Turkish soldiers

The researchers also plotted Twitter activity levels for all four groups between July 2014 (when ISIS emerged and announced itself to the world) and May 2015. One notable finding was that both the anti-ISIS groups (Shia and Sunni States) showed similar activity patterns, suggesting that both were responding to the same ISIS-related events. All four groups experienced a large spike in activity in early February 2015, when ISIS released a video showing Jordanian pilot Moath al-Kasasbeh being burned alive.

After this event, the ISIS supporters’ activity decreased sharply, while the Syrian Mujahideen’s grew to almost match the Shia and Sunni States groups. Possible explanations (assuming the ISIS supporters are not bots) could include outrage at the murder of a fellow Muslim, and/or outrage at the way he was killed: by burning, which is widely considered forbidden in Islam. It would be interesting to compare the Twitter response to al-Kasasbeh’s murder with the response to another ISIS burning video, released last week, in which two Turkish soldiers were killed.

This comparison could reveal further insights about the nature of the original 2015 spike; or reveal changing attitudes towards Turkey, which has started fighting against ISIS in recent months and has most likely become hated among the group’s supporters as a result.

3. Social media mavens

The ISIS supporters’ Twitter community analysed in the study showed particular features that made it distinct from the other groups. Its members were more active than those of the other three groups (despite smaller numbers overall). They tweeted a lot of pro-ISIS terms and phrases, predictably. But most notable about this group was their fluency and command of advanced social media strategy, as shown by their use of certain terms on Twitter. In the study, the supporters group used disproportionately high levels of terms such as ‘spread’, ‘link’, ‘breaking news’, ‘media office’, and ‘pictorial evidence’.
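One simple way to check whether a group uses a term "disproportionately" is to compare how often the term appears per word in that group's tweets against everyone else's. The sketch below does this with add-one smoothing so that a term missing from one side doesn't cause a division by zero. The sample tweets are hypothetical, the tokenisation is naive whitespace splitting (so it only handles single-word terms), and the report's own lexical analysis may well have used a different statistic.

```python
from collections import Counter


def word_counts(tweets):
    """Return per-word counts and the total word count for a list of tweets."""
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    return counts, sum(counts.values())


def overuse_ratio(term, group_tweets, other_tweets):
    """Ratio of the term's smoothed rate in the group to its rate elsewhere.

    Values above 1 mean the group uses the term more often per word.
    """
    group_counts, group_total = word_counts(group_tweets)
    other_counts, other_total = word_counts(other_tweets)
    group_rate = (group_counts[term] + 1) / (group_total + 1)
    other_rate = (other_counts[term] + 1) / (other_total + 1)
    return group_rate / other_rate


# Hypothetical English stand-ins for translated tweets.
supporter_tweets = ["spread the link", "spread this"]
other_tweets = ["news today", "match tonight ok"]

print(overuse_ratio("spread", supporter_tweets, other_tweets))
```

With real data, the same ratio could be computed for every term and sorted to surface each group's most distinctive vocabulary.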

In general, ISIS has always been exceptionally conversant with social media marketing tools and techniques, in fact far superior to the efforts of many national governments. I would be very interested to see a study that uncovers who exactly is responsible for the ISIS propaganda, what their backgrounds are, and how they were recruited and trained (if indeed they weren’t already expert in this area).

4. CVE insights from Twitter data

Finally, the report offers insights for policy-makers and for those engaged in online CVE efforts across the Arab world. The most important of these is a reiteration of the need for counter-messaging that’s not just tailored, but that shows deep levels of insight into the mindsets of its target audiences. Research like this can help reveal useful themes and connections to build upon.

Also, the ongoing efforts by Twitter to ban pro-ISIS accounts have undoubtedly driven many of them to other channels, most notoriously Telegram. Analysing activity on new channels would be of great use in revealing any shifts in ISIS supporters’ focus or mindset. Much in the landscape has changed since this report was released, and continues to do so at a rapid rate.

The data journalism landscape worldwide

What does data journalism look like the world over?

Finding data to work with is fairly straightforward in many Western countries, with increasing numbers of resources (such as the Guardian data blog) dedicated to assembling and showcasing varied datasets. No matter what topics you’re interested in, you will probably find relevant data to explore. But what about beyond, in countries where the value of open data has yet to make an impact? Or in countries where the leadership prefers to keep information under wraps? How do we begin digging it out?

In Turkey, for example, the situation feels very different. Just recently I was working on a piece for Middle East Eye about the trials, tribulations and triumphs of Turkish female software developers. I wanted to add an extra layer to the story by integrating a simple data visualisation. It would show the numbers of Turkish women in software careers.

My early efforts to find suitable data didn’t get very far. I Googled, asked fellow journalists (including Turkish speakers), and contacted organisations such as Kadin Yazilimci (Women Developers). The latter told me that they’re also looking for this data – and haven’t found it yet. I was at a dead end, running out of leads to find the information I wanted.

During a tech-related procrastination session, I discovered a fascinating blog called Source. It focuses on stories from the intersection of journalism and code. While browsing the articles, I came across one in particular that resonated in light of my earlier problems. The opening paragraph, with the irresistible words “Yangon tech hub”, couldn’t fail to hook me. I’ve been intrigued by Myanmar since visiting in 2012, just as the country was starting to open up.

Eva Constantaras, who wrote the piece, is a data journalist who also trains journalists in developing countries. She reflects on her experiences covering the November 2015 general elections in Myanmar, elections that represented its biggest leap towards democracy for decades. The article highlights the substantial quantity of election coverage, but points out that complexity remains lacking. This is unsurprising as Myanmar only gained widespread internet access in the last couple of years.

Cross-border partnerships to collect and manage open data are a ‘natural next step’, according to Constantaras. In this kind of collaboration, local journalists work with international news teams to provide deeper coverage of stories, including those involving data. Local topics of interest would benefit from global visibility with coverage on international platforms. These partnerships could obviously benefit from more open access to local data.

I faced the same problem when I visited Jamaica last summer. The only data sources I could find were from external organisations like the UN or World Bank, or a few rusty spreadsheets from local online sources. But they weren’t up to date enough to be really useful.

It would be great to have a one-stop online portal for all existing global data sources, from inside countries as well as from international organisations. I don’t know if anything like this already exists, or is being developed, but if so, I’d like to hear about it.

UPDATE: Since writing this post, I found Dag Medya, Turkey’s first data journalism portal. It just got shortlisted for the 2016 Data Journalism Awards, so it is definitely worth keeping an eye on.

Mapping global passport power

Passport power can shape your life. A strong passport allows the holder to move smoothly through the world, breezing through borders with ease and opening doors for new opportunities in travel, work and investment.

But a weak passport spells endless rounds of visa applications, being denied, being treated with suspicion (in case you’re an economic migrant masquerading as a tourist), and having to stump up large sums of money just so the authorities can be sure you’ll go home again.

Around 200 passports exist today. All offer varying levels of travel flexibility. Some, like the German passport, can take their holders to 177 countries visa-free or with a visa on arrival. Others, like the Afghan passport, are practically useless, allowing entry across a mere 25 national borders.

Dataset (Excel): http://samanthanorth.com/wp-content/uploads/2016/04/Visa-Restrictions-Index-Dataset.xlsx

The aim of this project is to map global passport power using CartoDB. The original data came from a study called the Visa Restrictions Index, created by Henley & Partners. It was embedded in a PDF, making it impossible to access by scraping or copy-pasting. This was a chance for me to try out Tabula, an invaluable tool that extracts raw data in table form from within PDFs. Tabula worked really well in this case. It pulled the data neatly into Excel format, with the minimum of additional tweaking needed. Now I had a nicely formatted Excel sheet ready to work with.

The dataset contains three columns: Rank, Country Name, and Score (i.e. how many countries the passport holder can enter visa-free). I can imagine the final visualisation involving a world map, with the ranks perhaps mapped as pins (or circles) of increasing sizes (smallest for weakest passports, biggest for strongest). Circles would be geographically located, with the individual score showing up inside each circle.
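The circle idea can be sketched as a small scaling function. A common choice, and an assumption here rather than something the mapping tool requires, is to make each circle's area (not its radius) proportional to the score, so that a passport with twice the score doesn't look four times as powerful. Only the two scores quoted above are used; the maximum radius of 30 pixels is an arbitrary illustrative value.

```python
import math

# Scores quoted in the post; ranks are omitted because they aren't given in the text.
passports = [
    {"country": "Germany", "score": 177},
    {"country": "Afghanistan", "score": 25},
]


def circle_radius(score, max_score, max_radius=30.0):
    """Scale the radius so that circle area is proportional to the score."""
    return max_radius * math.sqrt(score / max_score)


max_score = max(row["score"] for row in passports)
for row in passports:
    radius = circle_radius(row["score"], max_score)
    print(f"{row['country']}: score {row['score']}, radius {radius:.1f}px")
```

The strongest passport gets the full 30-pixel radius, and every other circle shrinks in proportion to the square root of its score.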

Here’s how it ended up. I like how the choropleth feature looks; it seems to work well for this particular map. From looking at this map I can see some starting points for potential stories involving national image, economic power, public diplomacy and more.