Requesting politely to stay in the dark will not serve journalism

At Salon, Richard constantly analyzed revenue per thousand page views vs. cost per thousand page views, unit by unit, story by story, author by author, and section by section. People didn’t want to look at this data because they were afraid that unprofitable pieces would be cut. It was the same pushback years before with basic traffic data. People in the newsroom didn’t want to consult it because they assumed you’d end up writing entirely for SEO. But this argument assumes that when we get data, we dispense with our wisdom. It doesn’t work that way. You can continue producing the important but unprofitable pieces, but as a business, you need to know what’s happening out there. Requesting politely to stay in the dark will not serve journalism.

- from Matt Stempeck’s liveblog of Richard Gingras’s Nieman Foundation speech
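
For anyone who wants to see what that story-by-story comparison looks like in practice, here is a minimal sketch. It is not Salon's method: the Story fields, the function names and every figure are made up for illustration, and it assumes you can attribute a revenue and a cost figure to each piece at all.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    page_views: int
    revenue: float  # revenue attributed to the story (hypothetical figure)
    cost: float     # cost of producing the story (hypothetical figure)


def per_thousand(amount: float, page_views: int) -> float:
    """Return an amount per 1,000 page views, or 0.0 if there were no views."""
    if page_views <= 0:
        return 0.0
    return amount * 1000 / page_views


def margin_per_thousand(story: Story) -> float:
    """Revenue per thousand page views minus cost per thousand page views."""
    return (per_thousand(story.revenue, story.page_views)
            - per_thousand(story.cost, story.page_views))


# Invented examples: one cheap, popular piece and one expensive, important one.
stories = [
    Story("Celebrity goat gallery", 200_000, 1_000.0, 300.0),
    Story("Council budget investigation", 20_000, 100.0, 900.0),
]

for story in stories:
    print(story.title, round(margin_per_thousand(story), 2))
```

A negative margin is information, not a verdict. Whether the investigation still runs is an editorial decision, and it is better made with the number in front of you.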

Stop blaming the internet for rubbish news content

Newspapers and newsrooms generally have always striven to publish stories that are important, interesting, informative and entertaining. Not everyone puts those in the same order or gives them the same importance. But the internet hasn’t changed that much.

The unbundling effects of the net mean that instead of relying on the front page to sell the whole bundle, each piece has to sell itself. That can be hard; suddenly the relative market sizes for different sorts of content are much starker, and for people who care more about important/interesting/informative than entertaining, that’s been a depressing flood of data. But the internet didn’t create that demand – it just made it more obvious. Whether we should feed it or not is an editorial question. Personally, I think it’s fine to give people a little of what they want – as long as a newsroom is putting out informative and important stories, a few interesting and entertaining ones are good too, so long as they’re not lies, unethically acquired or vicious.

If you spend a lot of time online you will see a filter bubble effect, where stories from certain news organisations are not often shared by your friends and don’t often turn up in your sphere unless you actively go looking for them. That means the ones that break through will be those that outrage, titillate or carry such explosive revelations that they cannot be ignored. That does not mean those stories are the sum total output of a newsroom – any more than the 3AM Girls are the sum total of the Mirror in print – but those pieces attract a new audience and serve to put that wider smorgasbord of content in front of them (assuming the article pages are well designed).

Of course, some news organisations publish poor stories – false, misleading, purposefully aggravating or just badly written – in the name of chasing the trend. That’s also far from an internet-only phenomenon. The Express puts pictures of Diana on the front, and routinely lies for impact in its headlines. The Star splashes on Big Brother for 10 weeks running. The editorial judgement about the biggest story for the front is about sales as much as it is about newsworthiness. Sometimes those goals align. Sometimes they don’t, and editors make a choice.

It is ridiculous to blame the internet for the publishing of crap stories to chase search traffic or trend-based clicks – just as it’s ridiculous to blame the printing press for the existence of phone hacking. In both cases it’s the values and choices of the newsroom that should be questioned.

News SEO: optimising for robots is all about the people

Some people in the news business get very wary of SEO in general. There seems to be a perception that content farming and low-quality stories are a sort of natural consequence of making sure your stories can be found via Google. But in fact there is a wide spectrum of approaches here, and news organisations make editorial judgements over whether to cover something that’s interesting to the public just because the public is interested. No Google robot forces a newsroom to make that choice, just as no print-sales-bot forces the Daily Star to splash on scantily-clad women and celebrity gossip.

If your editorial strategy is to chase search terms, then you’re not optimising for robots – you’re optimising for the millions of people online who search for certain sorts of stories. Websites like Gawker and the Mail Online create content to attract the potential millions who read celebrity gossip or who want the light relief of weird Chinese goats – and many of those people also care about the budget or the war in Afghanistan, because people are multi-faceted and have many, many interests at the same time.

If your production strategy includes making sure your headlines accurately describe your content, make sense out of context and use words people would actually use in real life, then you are optimising your content for search. Not for robots, again, but for people – potential and actual readers or viewers – some of whom happen to use search engines to find out about the news.

For example, search-optimised headlines may well have the keywords for the story right at the beginning. Google lends greater weight to words at the start of a headline than at the end. But it does so because people do too. If you’re scanning a Google search results page, you tend to read in an F shape, taking account of the first few words of an item before either engaging further or moving on. [Edit: via @badams on Twitter, a more recent study backing up the F-shape reading pattern.] Google’s algorithm mimics how people work, because it wants to give people what they’re going to find most relevant. Optimising for the robot is the same thing as optimising for human behaviour – just as we do in print, taking time to design pages attractively, and taking account of the way people scan pages and spend time on images and headlines in certain ways.
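
A rough way to check whether a headline front-loads its keyword is sketched below. The function names, the example headlines and the three-word window are all my own assumptions for illustration; nothing here reflects how Google actually weights headline words, it just flags headlines where a scanning reader would hit the keyword late.

```python
import re


def words(text: str) -> list:
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_word_index(headline: str, keyword: str):
    """Return the word position where the keyword phrase starts, or None."""
    headline_words = words(headline)
    keyword_words = words(keyword)
    if not keyword_words:
        return None
    span = len(keyword_words)
    for i in range(len(headline_words) - span + 1):
        if headline_words[i:i + span] == keyword_words:
            return i
    return None


def is_front_loaded(headline: str, keyword: str, window: int = 3) -> bool:
    """True if the keyword starts within the first `window` words.

    The default window of 3 is an arbitrary assumption, not a search-engine rule.
    """
    index = keyword_word_index(headline, keyword)
    return index is not None and index < window


# Two hypothetical headlines for the same story.
print(is_front_loaded("Budget 2012: what it means for savers", "budget"))
print(is_front_loaded("What savers need to know about the budget", "budget"))
```

A check like this is only a prompt for a human sub. A headline that front-loads its keyword and still reads badly is not an improvement.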

News SEO is a very different beast from, say, e-commerce SEO or SEO for a small business that wants to pick up some leads online. Once you get beyond the basics it does not follow the same rules or require the same strategies. Link building for breaking news articles is worse than pointless, for example; your news piece has a half-life of a day, or an hour, or perhaps a whole week if you’re lucky and it really hits a nerve. Social sharing has a completely different impact for news organisations that want their content read than for, say, a company that wants to sell shoes online. For retailers, optimising for the algorithm might start to make some sense – if the only difference between you and your competitors is your website, then jostling for position in the search results on particular pages gets competitive in a way that news doesn’t. For news, though, optimising for robots always means optimising for humans. It’s just a matter of choosing which ones.

URL manipulation, libel, and Kate Middleton jelly beans

Regular readers here (all 6 of you) will probably already know about Jellybeangate. Yesterday, a URL from the Independent was rewritten to say something rather uncomplimentary about a PR-churned story on their site, which revealed that Kate Middleton’s face had been discovered in a jelly bean. The link went viral on Twitter after several fairly well-respected sources assumed it was the work of a disgruntled sub and not a prank. Then the corrections went viral, along with several other versions of the link. This sort of URL behaviour is remarkably common.

According to the Nieman Lab, there are vast numbers of other news organisations whose URLs can be manipulated in this way (Citywire, my employer, is one of them) – and third parties with agendas could easily make it seem at a casual glance as though their URLs are libellous or offensive. But most URLs – if not all – can be manipulated very simply, using parameters. I can add ?this=utter-rubbish to the end of almost any link (or &this=utter-rubbish if the link already has a query string) and the link will still resolve, leaving my additions intact.
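
To make the parameter trick concrete, here is a small sketch using Python's standard urllib.parse module. The example.com addresses and the helper names add_param and page_key are invented for illustration, and page_key assumes a site that picks the article from the path alone; real sites vary in how they treat unknown parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_param(url: str, key: str, value: str) -> str:
    """Append a query parameter, using '?' or '&' as the URL requires."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def page_key(url: str) -> str:
    """What a path-based site (an assumption here) would use to find the page."""
    return urlsplit(url).path


original = "https://example.com/news/story-123"
doctored = add_param(original, "this", "utter-rubbish")

print(doctored)
# The doctored link still points at the same page, because the path is unchanged.
print(page_key(doctored) == page_key(original))
```

The extra parameter rides along in the address bar and in anything that gets shared, but the publisher never wrote it.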

There shouldn’t be any fear of being liable for this sort of manipulation, any more than there is in someone copying a newspaper masthead and pasting their own words underneath. For a statement to be libellous it must have been published, and in this case the individual who wrote, manipulated and then distributed the URL is the publisher. This seems clear for manipulated parameters marked by “?” and I have a hard time believing anyone would find otherwise for parameters within the URL itself.

If I were the Indie’s SEO team right now, I’d be more worried that the doctored URL is able to rank above their original. It might just be a good idea to get some rel=canonical tags on their article pages.
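
As a sketch of how that could work on a site that finds articles by a numeric ID and ignores the words in front of it: look up the path the newsroom actually published and print it in a rel=canonical tag, whatever wording the visitor arrived with. The URL pattern, the PUBLISHED_PATHS lookup, the example domain and the function names are all hypothetical; I don't know how the Independent's system is built.

```python
import re
from html import escape

# Hypothetical lookup: article ID -> the path the newsroom actually published.
PUBLISHED_PATHS = {
    "12345": "/news/royal-jelly-bean-story-12345.html",
}

BASE_URL = "https://news.example.com"  # placeholder domain


def canonical_url(requested_path: str):
    """Return the published URL for the requested article, or None if unknown.

    Assumes paths end in '-<digits>.html' and that only the digits matter.
    """
    match = re.search(r"-(\d+)\.html$", requested_path)
    if match is None:
        return None
    published = PUBLISHED_PATHS.get(match.group(1))
    if published is None:
        return None
    return BASE_URL + published


def canonical_tag(requested_path: str) -> str:
    """Build the <link rel="canonical"> tag, or an empty string if unknown."""
    url = canonical_url(requested_path)
    if url is None:
        return ""
    return '<link rel="canonical" href="{}">'.format(escape(url, quote=True))


# A doctored path still resolves to the same article ID, so the tag
# points search engines back at the published address.
print(canonical_tag("/news/utter-rubbish-12345.html"))
```

The tag is a hint to search engines rather than a guarantee, and it does nothing to stop the doctored link being shared. Redirecting non-matching wording to the published URL would be the stricter fix.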