Category: Data Journalism

Who’s winning on the digital battlefield?

On the eve of the French presidential elections, there’s a sudden flurry of activity on social media. A candidate’s name – #Macron – is trending on Twitter. So what’s the news? A large stash of Emmanuel Macron’s private emails has been hacked and leaked online.

Sound familiar?

That’s because it’s happened before. You probably remember last year’s debacle about Hillary Clinton’s leaked emails. This more than likely contributed to her losing the election to Donald Trump. If nothing else, it created an air of public suspicion around Clinton that did irreparable damage to her reputation. I still think back to that hacking event and recall it as a haze of rumours and misinformation; I was never totally clear what the core of the issue really was.

And in light of this latest development in France, I begin to wonder if confusion is actually the goal in all this. Perhaps we give whoever is behind this too much credit by assuming they’re actually pulling the strings of public opinion. What would be easier, and perhaps just as damaging, would be simply to sow the seeds of mistrust. With everyone at each other’s throats, arguing bitterly about what is and isn’t ‘fake news’, there’s room for the malevolent forces to continue their underhand work of sabotaging democracy. When journalists digging deep to report the truth can so easily have their work discredited as ‘fake news’ by none other than the US president himself, we really are veering into a disturbing new reality.

Who is actually responsible for this mischief? Sources point to a Russian hacking group known, among a variety of other names, as “Fancy Bear”. It’s the same group said to be responsible for hacking Hillary Clinton’s emails last year. “The key goals and objectives of the campaign appear to be to undermine Macron’s presidential candidacy and cast doubt on the democratic electoral process in general,” said Vitali Kremez, director of research at Flashpoint, a business risk intelligence company in New York, in an interview with the New York Times.

We should not underestimate the abilities of Russia in this arena. Dmitri Alperovitch, of CrowdStrike, told the MIT Technology Review that Russia ‘gets the true nature of the battlefield’ in a way the West does not. “They’ve been thinking about this for a very long time,” he said. “It actually goes at least as far back as the Tsarist era in the 1860s, when they created one of the first modern intelligence agencies, the Okhranka.” So Russia has been doing this sort of thing for well over a century, but the rise of digital media offers the perfect new landscape for even deeper subterfuge.

But there’s one ray of hope, and that’s in how the French media has responded to the Macron email leak so far: by not reporting on its contents. This seems a smart move. French law requires candidates to stop campaigning from midnight on Friday until the polls close at 8pm on Sunday. Candidates are forbidden to give media interviews or issue statements. The timing of the email hack was likely designed to coincide with this, in an attempt to release the emails while Macron was unable to respond. But denying the fake news trolls the oxygen of media publicity cuts the head off the snake, removing much of its potential to harm. The same goes for terrorist incidents. ‘Propaganda of the deed’, as terrorism was once known, relies on shock and awe to achieve its ends. In an ‘always-on’ digital society this effect is massively amplified, and completely fake incidents can even be instigated by anyone, anywhere. If the media had denied the ‘oxygen of publicity’ to groups like Isis from the very beginning, the world might be less messy today.

The emergence of Isis fuelled the rise of the far-right, giving white supremacists and ultraconservatives the opportunity to rise up and gain power under the guise of ‘protecting’ the nation from threat. Of course that ‘threat’ is constantly portrayed as emanating from Islam and Muslims. And so the cycle continues. But the example of France is certainly a promising one. The election outcome will reveal if it actually worked. Perhaps going forward, these issues could be mitigated by a more scrupulous mainstream media, one that’s less desperate for ‘clicks’ to ensure its survival, along with citizen journalism collectives such as Bellingcat, to shed light on old issues and reveal new cracks in existing narratives.

Nuanced communities: Mapping ISIS support on Twitter

As every content marketer knows, creating resonant narratives requires intimate knowledge of the audience in question.

Nowhere is this more true than in attempts to counter the potent messaging of ISIS. The terrorist group is infamous for its ability to attract recruits from across the world to commit violence in the name of the ‘caliphate.’

ISIS has been a fixture in the global public consciousness for over two years, from its dramatic emergence in summer 2014 to its near-decline earlier this year, followed by a resurgence with its latest attack in Berlin just weeks ago. Long before Berlin, the group had already become notorious for the quality and power of its social media messaging, professionally produced videos and slick English-language print publications.

Concerned national governments and civil society groups have made numerous attempts to counter the ISIS narrative in various ways, ranging from shutting down followers’ Twitter accounts en masse to creating alternative narratives that aim to discredit the group, its ideology and its actions. But despite all these attempts, attacks against European cities remain a very real threat.

As another gloomy and blood-soaked year of ISIS activity comes to an end, the group shows no sign of fading away. Although it has lost physical territory in Iraq and Syria, the ongoing risk of the ISIS virtual caliphate persists.

A whole range of factors determines an individual’s likelihood of becoming radicalised, many of which have been studied in significant depth elsewhere. Social media is not necessarily the most influential factor, but it undoubtedly plays a role.

RAND, a US-based think-tank, conducted a detailed research study, published in 2016, to examine ISIS support and opposition networks on Twitter, aiming to gather insights that could inform future counter-messaging efforts.

The study used a mixed-method analytics approach to map publicly available Twitter data from across the Arabic-speaking Twitter-verse. Specific techniques used were community detection algorithms to detect links between Twitter users that could signify the presence of interactive communities, along with social network analysis and lexical analysis to draw out key themes from among the chatter.

Research goals were to learn how to differentiate between ISIS opponents and supporters; to understand who they are and what they are saying; and to understand the connections between them while identifying the influencers.

Lexical analysis uncovered four major groups, or ‘meta-communities’ among the Arabic-speaking ISIS conversation on Twitter. These were Shia, Sunni, Syrian Mujahideen, and ISIS Supporters. They are characterised by certain distinct patterns in their tweets. Shia tend to condemn ISIS and hold positive views of Christians/the West/the international coalition fighting ISIS. This is unsurprising considering the long-standing hostility between Sunni and Shia Muslims and the fact that ISIS is a Sunni group.

The Syrian Mujahideen group is anti-Assad, holds mixed views of ISIS, and negative views of the coalition. ISIS supporters talk positively in bombastic overblown language about ISIS and the caliphate. They insult Shia, the Assad regime, and the West. Notably, their approach to social media strategy is by far the most sophisticated of the lot. And finally, the Sunni group is heavily divided along nationalistic lines, which includes most countries of the Arab world.

Key findings of interest

1. Unique audiences, essential nuance

Telling the difference in large datasets between ISIS supporters and opponents was key for this study. RAND researchers chose a simple heuristic: Twitter users who tweeted the Arabic term for ‘Islamic State’ (الدولة الإسلامية) were considered to be supporters, while those who used the acronym ‘DAESH’ (داعش) were considered opponents. This dividing line isn’t foolproof but, based on what’s known about the significance of these two Arabic terms, it seems a valid way to approach the task. The research found that although opponents outnumbered supporters six to one, the supporters were far more active, producing 50% more tweets daily.
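To make the heuristic concrete, here is a minimal Python sketch of how tweets and users could be labelled by which of the two terms they use. It is an illustration under stated assumptions, not RAND’s actual code: the function names, the sample tweets and the exact spellings of the two terms are all my own choices, and real Arabic text would need normalising first (for example, alef written with or without hamza).

```python
from collections import Counter

# Marker terms from the heuristic described above. The exact spellings are an
# assumption; real tweets vary and would need normalising before matching.
SUPPORTER_TERM = "الدولة الإسلامية"  # "Islamic State"
OPPONENT_TERM = "داعش"  # "Daesh"


def label_tweet(text):
    """Label one tweet as 'supporter', 'opponent', 'mixed' or 'unknown'."""
    has_supporter = SUPPORTER_TERM in text
    has_opponent = OPPONENT_TERM in text
    if has_supporter and has_opponent:
        return "mixed"
    if has_supporter:
        return "supporter"
    if has_opponent:
        return "opponent"
    return "unknown"


def label_users(tweets):
    """Label each user by the term they use most often.

    `tweets` is an iterable of (user, text) pairs. Users who never use either
    term are left out; users with an equal count of both are 'mixed'.
    """
    votes = {}
    for user, text in tweets:
        label = label_tweet(text)
        if label in ("supporter", "opponent"):
            votes.setdefault(user, Counter())[label] += 1

    labels = {}
    for user, counts in votes.items():
        if counts["supporter"] > counts["opponent"]:
            labels[user] = "supporter"
        elif counts["opponent"] > counts["supporter"]:
            labels[user] = "opponent"
        else:
            labels[user] = "mixed"
    return labels


# Hypothetical sample data, invented purely for illustration.
sample = [
    ("user_a", "خبر عن داعش"),
    ("user_a", "داعش"),
    ("user_b", "خبر عن الدولة الإسلامية"),
    ("user_c", "داعش"),
    ("user_c", "الدولة الإسلامية"),
    ("user_d", "good morning"),
]
labels = label_users(sample)
print(labels)
```

A plain substring match like this will miss spelling variants and can’t tell quotation or sarcasm from endorsement, which is part of why the dividing line isn’t foolproof.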

This could point to a couple of things. Firstly the outnumbering suggests that the majority of the Arab world (or at least the Twitter sphere) is anti-ISIS; while the volume of pro-ISIS tweets could suggest passionate support for the group, or on the other hand could point to the presence of armies of pro-ISIS bots or perhaps the use of astro-turfing. The latter two could be an interesting case for new research, especially in the present climate where the curtain has been lifted on use of social media bots, astro-turfing armies and persona management software.

2. Jordanian pilot, Turkish soldiers

The researchers also plotted Twitter activity levels for all four groups from July 2014 (when ISIS emerged and announced itself to the world) to May 2015. Notable findings were firstly that both the anti-ISIS groups (Shia and Sunni States) showed similar activity patterns, suggesting that both were responding to the same ISIS-related events. All four groups experienced a large spike in activity in early February 2015, when ISIS released a video showing Jordanian pilot Moath al-Kasasbeh being burned alive.

After this event, the ISIS supporters’ activity decreased sharply, while the Syrian Mujahideen’s grew to almost match that of the Shia and Sunni States groups. Possible explanations (assuming the ISIS supporters are not bots) could include outrage at the murder of a fellow Muslim, and/or outrage at the way he was killed: burning, which is forbidden in the Qur’an. It would be interesting to compare the Twitter response to al-Kasasbeh’s murder with the response to another ISIS burning video, released last week, in which two Turkish soldiers were killed.

This comparison could reveal further insights about the nature of the original 2015 spike; or reveal changing attitudes towards Turkey, which has started fighting against ISIS in recent months and has most likely become hated among the group’s supporters as a result.

3. Social media mavens

The ISIS supporters’ Twitter community analysed in the study showed particular features that made it distinct from the other groups. Its members were more active than those of the other three groups (despite smaller numbers overall). They tweeted a lot of pro-ISIS terms and phrases, predictably. But most notable about this group was their fluency and command of advanced social media strategy, as shown by their use of certain terms on Twitter. In the study, the supporters group used disproportionately high levels of terms such as spread, link, breaking news, media office, and pictorial evidence.
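One simple way to put a number on ‘disproportionately high’ is to compare how often a term appears in one group’s tweets against everyone else’s. The Python sketch below does that with a smoothed ratio of rates. It is only a rough illustration: the function name, the smoothing choice and the sample tweets are my own assumptions, the study’s own lexical analysis may well have used a different statistic, and the original terms were Arabic rather than these English glosses.

```python
def overuse_ratio(group_tweets, other_tweets, term, smoothing=1.0):
    """How much more often `term` appears in one group's tweets than elsewhere.

    Returns the ratio of the two smoothed rates: above 1 means the group uses
    the term more than the comparison tweets do, below 1 means less.
    Smoothing avoids dividing by zero when a term never appears elsewhere.
    """
    term = term.lower()
    group_hits = sum(term in text.lower() for text in group_tweets)
    other_hits = sum(term in text.lower() for text in other_tweets)
    group_rate = (group_hits + smoothing) / (len(group_tweets) + 2 * smoothing)
    other_rate = (other_hits + smoothing) / (len(other_tweets) + 2 * smoothing)
    return group_rate / other_rate


# Hypothetical tweets, invented for illustration only.
supporter_tweets = [
    "spread this link",
    "breaking news from the media office",
    "link to pictorial evidence",
    "good morning",
]
other_tweets = [
    "good morning",
    "news about football",
    "coffee time",
    "see this link",
]

for term in ["link", "breaking news", "good morning"]:
    ratio = overuse_ratio(supporter_tweets, other_tweets, term)
    print(term, round(ratio, 2))
```

With real data the groups would be very different sizes, so comparing rates rather than raw counts matters: a small, busy group can dominate a term without dominating the overall conversation.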

In general, ISIS has always been exceptionally conversant with social media marketing tools and techniques, in fact far superior to the efforts of many national governments. I would be very interested to see a study that uncovers who exactly is responsible for the ISIS propaganda, what their backgrounds are, and how they were recruited and trained (if indeed they weren’t already expert in this area).

4. CVE insights from Twitter data

Finally, the report offers insights for policy-makers and for those engaged in online countering violent extremism (CVE) efforts across the Arab world. The most important of these is a reiteration of the need for counter-messaging that’s not just tailored, but that shows deep levels of insight into the mindsets of its target audiences. Research like this can help reveal useful themes and connections to build upon.

Also, the ongoing efforts by Twitter to ban pro-ISIS accounts have undoubtedly driven many of them to other channels, most notoriously Telegram. Analysing activity on new channels would be of great use in revealing any shifts in ISIS supporters’ focus or mindset. Much in the landscape has changed since this report was released, and it continues to change at a rapid rate.

Mapping global passport power

Passport power can shape your life. A strong passport allows the holder to move smoothly through the world, breezing through borders with ease and opening doors for new opportunities in travel, work and investment.

But a weak passport spells endless rounds of visa applications, being denied, being treated with suspicion (in case you’re an economic migrant masquerading as a tourist), and having to stump up large sums of money just so the authorities can be sure you’ll go home again.

Around 200 passports exist today. All offer varying levels of travel flexibility. Some, like the German passport, can take their holders to 177 countries visa-free or visa on arrival. Others, like the Afghan passport, are practically useless, allowing entry across a mere 25 national borders.

[embeddoc url=”https://samanthanorth.com/wp-content/uploads/2016/04/Visa-Restrictions-Index-Dataset.xlsx” download=”all” viewer=”microsoft”]

The aim of this project is to map global passport power using CartoDB. The original data came from a study called the Visa Restrictions Index, created by Henley & Partners. It was embedded in a PDF, making it impossible to access by scraping or copy-pasting. This was a chance for me to try out Tabula, an invaluable tool that extracts raw data in table form from within PDFs. Tabula worked really well in this case. It pulled the data neatly into Excel format, with the minimum of additional tweaking needed. Now I had a nicely formatted Excel sheet ready to work with.

The dataset contains three columns: Rank, Country Name, and Score (i.e. how many countries the passport holder can enter visa-free). I can imagine the final visualisation involving a world map, with the ranks perhaps mapped as pins (or circles) of increasing sizes (smallest for weakest passports, biggest for strongest). Circles would be geographically located, with the individual score showing up inside each circle.

Here’s how it ended up. I like how the choropleth feature looks; it seems to work well for this particular map. From looking at this map I can see some starting points for potential stories involving national image, economic power, public diplomacy and more.
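For anyone curious about what a choropleth does with the Score column, here is a small Python sketch that sorts passports into shading classes. It assumes equal-interval classes, which may not be the method CartoDB applied to this map. Only the German and Afghan scores come from the post above; the other two rows and the rank values are made-up placeholders, and the function name is my own.

```python
import csv
import io

# Toy extract in the same three-column shape as the dataset.
# Germany (177) and Afghanistan (25) are from the post; "Example A/B" and
# all rank values are placeholders invented for this sketch.
SAMPLE_CSV = """Rank,Country Name,Score
1,Germany,177
2,Example A,120
3,Example B,70
4,Afghanistan,25
"""


def passport_classes(rows, n_classes=4):
    """Assign each country to an equal-interval class from 0 to n_classes - 1.

    Class 0 is the weakest band (lightest shade on the map) and the highest
    class is the strongest band (darkest shade).
    """
    scores = [int(row["Score"]) for row in rows]
    lowest, highest = min(scores), max(scores)
    # Fall back to a width of 1 if every score is identical.
    width = (highest - lowest) / n_classes or 1
    classes = {}
    for row, score in zip(rows, scores):
        band = int((score - lowest) / width)
        # The top score lands exactly on the upper edge, so clamp it.
        classes[row["Country Name"]] = min(band, n_classes - 1)
    return classes


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
classes = passport_classes(rows)
print(classes)
```

Equal intervals are easy to explain, but quantile classes (the same number of countries in each band) can show more contrast when scores bunch together, so it is worth trying both before settling on a map.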

Super-simple D3 bar chart

No, it’s not quite a visually stunning world map or amazing graphic artistry, sadly.

BUT nevertheless –  here’s a little data visualisation that I wrote today with D3:

 

[Image: D3 bar chart]

It doesn’t look like much but took a ton of code to produce (well probably not, but it certainly felt like it). I had fun playing with the pretty colours, finally settling on a blue/purple scheme. The bar colours turn more blue the larger the number and more purple the smaller the number.

The dataset is just a simple array of numbers that I threw in. There’s a touch of CSS added in the document head to make the bits play nicely together. Other than that, this was an exercise in learning how to use SVGs, how to bind data onto elements in the DOM, and how to style the results in a very basic way. This is fascinating stuff. I love the way it can give me the potential to mingle coding with journalism and tell stronger, deeper stories.
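The core idea (one SVG rectangle per number, with a fill colour that slides from purple towards blue as the value grows) can be sketched without D3 at all. The Python below is not the original D3 code, just an illustration of the same logic: the function names, the exact purple and blue RGB values, the chart dimensions and the sample numbers are all assumptions of mine.

```python
PURPLE = (128, 0, 128)  # assumed colour for the smallest values
BLUE = (0, 0, 255)      # assumed colour for the largest values


def bar_colour(value, max_value):
    """Blend linearly from purple (value 0) to blue (value == max_value)."""
    t = value / max_value if max_value else 0.0
    r, g, b = (round(p + (q - p) * t) for p, q in zip(PURPLE, BLUE))
    return f"rgb({r},{g},{b})"


def bar_chart_svg(data, width=300, height=100, gap=2):
    """Return an SVG string with one <rect> per number in `data`."""
    header = (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    )
    if not data:
        return header + "</svg>"

    max_value = max(data) or 1  # avoid dividing by zero for all-zero data
    bar_width = width / len(data)
    rects = []
    for i, value in enumerate(data):
        bar_height = height * value / max_value
        rects.append(
            f'<rect x="{i * bar_width:.1f}" y="{height - bar_height:.1f}" '
            f'width="{bar_width - gap:.1f}" height="{bar_height:.1f}" '
            f'fill="{bar_colour(value, max_value)}"/>'
        )
    return header + "".join(rects) + "</svg>"


# A simple array of numbers, made up for this sketch.
numbers = [5, 10, 15, 20, 25]
svg = bar_chart_svg(numbers)
print(svg)
```

Saving the printed string to a .svg file and opening it in a browser shows the bars. In D3 the loop is replaced by binding the array to a selection of rect elements, and the colour blend by a scale, but the arithmetic underneath is much the same.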

I can see those gorgeous interactive maps awaiting me in the not-so-distant future… 😀

Mapping the island

I’m trying to figure out the best way to tackle a data story about Jamaica. I’m a newcomer to the world of data journalism, so I want to choose a plan of attack that’s not overwhelming but still gives me a good chance to learn and showcase new skills.

Here’s my first idea, which I’m currently in the process of fleshing out. I thought of creating a story with a simple title, e.g. ‘This is Jamaica’. It would have a map of the island at the top, with pinpoints plotted from address data scraped from a page of Google search results. I could use import.io for the scraping process.

Then I’d put the scraped data into a Google spreadsheet and finally use Google Fusion Tables to turn it into a map. There would probably be some data cleaning involved, which could be done once the data was safely in the spreadsheet. I’ll choose some data that isn’t too extensive, e.g. mapping all the coffee shops and bakeries in Jamaica that have wifi.

Once the map was ready, I’d like to make some graphs that display the country’s ‘vital signs’. These could include data such as GDP, employment rates, FDI, tourist inflows, number of new businesses registered, foreign aid, national debt, etc. A lot of this data could be taken from the World Bank’s website. It would give an overall picture of Jamaica’s present situation on the global stage.

I’m also interested in figuring out a way to visualise some aspects of data from the Good Country Index, but perhaps that’s better saved for another project altogether.