When I was working as a web editor at a metro daily paper a few years ago, I pro...

VonGuard · on Dec 10, 2014

Absolutely true that journalists are too busy to bother with markdown. What would be great, however, is to have tools for copy editing that do things like recognize people's names, Google them and spell check them. Recognize people's titles, Google and confirm them, and spell check them. Style passes that could do simple grammatical edits around a site's style guide: call it a style filter.

I have to say, for me as a journalist in tech, the one thing that ends up taking the most time in my stories is looking up name spellings and people's titles, and most importantly, trying to figure out if your freaking company is spelled TheCompany, The Company, or Thee Cmpany or some crazy variant. You startups and your mid-word-capital-letters. The bane of copyeditors everywhere.
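As a rough illustration of the "style filter" idea, here is a minimal Python sketch that flags spacing and capitalization variants of names listed in a style guide. The `STYLE_GUIDE` entries and the `check_names` helper are hypothetical, made up for this example. It only catches variants that differ in case, spaces, or hyphens; an outright misspelling like "Thee Cmpany" would need fuzzy matching on top.

```python
import re

# Hypothetical style-guide entries: the one approved spelling of each name.
STYLE_GUIDE = ["TheCompany", "New Castle County Council"]


def check_names(text):
    """Return (found, preferred) pairs for spellings that differ from the style guide."""
    issues = []
    for canonical in STYLE_GUIDE:
        # Allow optional whitespace or hyphens between the letters of the name,
        # so "The Company" and "the-company" both match "TheCompany".
        chars = [re.escape(c) for c in canonical if c.isalnum()]
        pattern = r"\b" + r"[\s-]*".join(chars) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            if match.group(0) != canonical:
                issues.append((match.group(0), canonical))
    return issues


if __name__ == "__main__":
    for found, preferred in check_names("We spoke to The Company and thecompany."):
        print(f"{found!r} should be {preferred!r}")
```

A real tool would load the approved spellings from the publication's own style guide rather than a hard-coded list.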

scholia · on Dec 11, 2014

You can do quite a lot of that in Microsoft Word, including the use of research tools, but it's a lot of work to set up.

Agree about the problems coping with company and product titles, etc. The problem could be reduced by refusing to play that game, eg by using registered names rather than marketing styles or logos.

VonGuard · on Dec 11, 2014

That only works until marketing calls your sales people and says "you spelled our name wrong!"

scholia · on Dec 11, 2014

So you point to the company registration documents and tell them you are happy to display their logos in the ads they can buy to correct it ;-)

schnevets · on Dec 10, 2014

  <priority value="1">This amounts to de facto resegregation. <priority value="4">(And we all know how well segregation worked out the first time.)</priority> If the school district still values integrated schools, it must act swiftly to correct this effect.</priority>

Forget about editors, use this kind of mark-up for your readers. Imagine changing an in-depth article into a truncated, 200-word summary with the click of a button. Activating a different tag would include the reporter's subjective commentary (or perhaps have multiple editorials based on the same "scaffolding")?
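To make the idea concrete, here is a minimal Python sketch of a reader-side filter over this kind of priority markup. It assumes every span uses a `value` attribute and that the markup is well-formed XML; the `render` and `summarize` names are invented for the example, and a lower number is taken to mean higher priority.

```python
import xml.etree.ElementTree as ET

# Sample markup in the style of the comment above (assumed well-formed).
SAMPLE = (
    '<priority value="1">This amounts to de facto resegregation. '
    '<priority value="4">(And we all know how well segregation worked out '
    'the first time.)</priority> If the school district still values '
    'integrated schools, it must act swiftly to correct this effect.</priority>'
)


def render(element, max_priority):
    """Return the element's text, dropping nested spans above max_priority."""
    if int(element.get("value", "1")) > max_priority:
        return ""
    parts = [element.text or ""]
    for child in element:
        parts.append(render(child, max_priority))
        # The tail is text that follows the child but belongs to the parent,
        # so it is kept even when the child itself is dropped.
        parts.append(child.tail or "")
    return "".join(parts)


def summarize(markup, max_priority):
    """Parse the markup and return the filtered text with whitespace collapsed."""
    text = render(ET.fromstring(markup), max_priority)
    return " ".join(text.split())


if __name__ == "__main__":
    print(summarize(SAMPLE, max_priority=1))  # short version
    print(summarize(SAMPLE, max_priority=4))  # full version
```

Calling `summarize(SAMPLE, 1)` drops the parenthetical aside, while a threshold of 4 keeps the whole passage.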

kd5bjo · on Dec 10, 2014

In print journalism, you're supposed to include information in descending order of importance. As a reader, you're expected to stop reading once you get to details you don't really care about, confident that you won't miss something more important buried farther down.

jawns · on Dec 10, 2014

That's largely true for hard news stories. But what about news features or op-eds or sports profiles, where the traditional inverted pyramid structure isn't employed?

apozem · on Dec 10, 2014

Exactly. Feature stories often use a delayed lead to set the scene or catch the reader's attention.

dredmorbius · on Dec 10, 2014

While I can understand the arguments for this, in practice the approach frustrates me to no end.

Worse are authors whose writing has no excerptable lede. Gina Kolata and mumber Morgenstern (health / Well articles in the NY Times) especially do this.

I find some old-school journos -- Dan Gillmor particularly comes to mind, I've called out a few others in G+ posts -- still practice strong ledes and heads. Many newer ones start with "My latest at <some website somewhere> read more".

Which ... tells me fucking nothing.

Lede with your lede. Trail with your link or call-to-action.

dredmorbius · on Dec 10, 2014

The so-called "inverted pyramid". I'm finding it applied less and less frequently.

I've also noted that it's become pretty much standard practice for news bureaus to write single-sentence paragraphs. Literally, every sentence of a story is its own paragraph. I don't know when that became standard practice, but sometime in the later 1990s or 2000s, particularly as stories moved online.

mpclark · on Dec 11, 2014

Sadly, hardly anyone writing now outside big traditional media orgs has done any sort of formal journalism training.

dredmorbius · on Dec 11, 2014

That's ... not all bad.

I think there's too much insularity within much of the journo community. But many of the newcomers are also subject to outside influences which call their credibility strongly into question. Lack of uniform copyediting, for better or worse, means a wide range of writing quality.

Though I'm seeing that even in long-standing brands -- NY Times, Forbes, and elsewhere.

ganeumann · on Dec 11, 2014

Zinsser mentions this one-sentence-paragraph thing in "On Writing Well", citing an AP article from 1993. (In the "Paragraphs" section of Chapter 10.)

dredmorbius · on Dec 11, 2014

Thanks. And yes, it makes sense that it would be AP style or similar.

rafekett · on Dec 10, 2014

this seems to be a lost art, unfortunately.

coldtea · on Dec 11, 2014

>Forget about editors, use this kind of mark-up for your readers. Imagine changing an in-depth article into a truncated, 200-word summary with the click of a button.

I imagine it, and most users would still not care. They skim articles anyway.

jawns · on Dec 10, 2014

Actually, this is an extremely simplified example, and once you get into the nitty gritty, it becomes a lot more difficult to add/remove various elements. For instance, you might need to capitalize a word differently depending on whether something has been excised immediately before it, or you may need to adjust punctuation in ways that you can't do simply using an XML-like format. Really, you need what amounts to a Natural Language Generation library to implement a robust system.
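A small Python sketch shows where the trouble starts. The hypothetical `tidy` function below patches up the most obvious damage left after spans are excised (stray commas, doubled spaces, a lowercase sentence start). It is a handful of regex heuristics, not a Natural Language Generation library, and it will get things wrong, for instance by capitalizing the word after an abbreviation such as "e.g.".

```python
import re


def tidy(text):
    """Repair simple punctuation and capitalization damage after excising text."""
    # Collapse runs of whitespace left behind by removed spans.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove spaces that now sit in front of punctuation.
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    # Drop a dangling comma directly before sentence-ending punctuation.
    text = re.sub(r",\s*([.!?])", r"\1", text)
    # Drop punctuation orphaned at the very start of the text.
    text = re.sub(r"^[,;:]\s*", "", text)
    # Capitalize the first letter of the text and of each following sentence.
    text = re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )
    return text


if __name__ == "__main__":
    # "According to officials" and "despite objections" have been excised.
    print(tidy(", the council voted on Tuesday, ."))
```

Even this toy case needs five separate rules, which gives a sense of how quickly the rule set grows for real copy.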

jerf · on Dec 10, 2014

I implemented that for my blog somewhere around 2001 or so. It's quite tedious to write and I haven't done it since. Even two versions of a story is a lot.

arebop · on Dec 10, 2014

I did something like this in college, and I was thinking semantic annotations would be an editorial pass like copyediting. These days you'd probably let some ML system take a crack at it before using human effort to bring the quality up to your publication's standard. In any case, it doesn't have to get in the way of writing a good story.

ajuc · on Dec 11, 2014

You can do this with lisp-style syntax (which also (despite opinions to the contrary) is quite natural).
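One possible reading of this, sketched in Python with nested tuples standing in for s-expressions. The `(p PRIORITY part ...)` form is a made-up convention for illustration, not something the thread specifies, and the `flatten` and `render_sexpr` names are hypothetical.

```python
# Hypothetical s-expression form: ("p", PRIORITY, part, part, ...), where each
# part is either a plain string or another ("p", ...) form.
ARTICLE = (
    "p", 1,
    "This amounts to de facto resegregation.",
    ("p", 4, "(And we all know how well segregation worked out the first time.)"),
    "If the school district still values integrated schools, "
    "it must act swiftly to correct this effect.",
)


def flatten(form, max_priority):
    """Return the strings in form, skipping sub-forms above max_priority."""
    _tag, priority, *parts = form
    if priority > max_priority:
        return []
    out = []
    for part in parts:
        if isinstance(part, str):
            out.append(part)
        else:
            out.extend(flatten(part, max_priority))
    return out


def render_sexpr(form, max_priority):
    """Join the surviving strings into one passage."""
    return " ".join(flatten(form, max_priority))


if __name__ == "__main__":
    print(render_sexpr(ARTICLE, max_priority=1))
```

The nesting carries the same information as the XML-like tags, with less ceremony per span.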

dangayle · on Dec 10, 2014

>> It turns out, though, even when you create a UI that lets reporters and editors easily plug in this metadata without having to understand XML, they are not apt to fill it in, because they are just so overworked as it is.

Nailed it. I'm a developer in a newsroom, and I'm dealing with flak just asking them to write a non-automated teaser text for their blog posts. They can't be bothered.

limelight · on Dec 10, 2014

> They've got bigger things to think about, like how to find a viable business model.

This is actually potentially a big part of that. People are reading more than they ever have—it's just not necessarily newspapers that they're reading.

basisword · on Dec 10, 2014

>> "People are reading more than they ever have—it's just not necessarily newspapers that they're reading."

Any stats on that? I agree people are reading more than ever but disagree that they're not reading newspapers (online). I think they are, they just aren't paying for it anymore.

limelight · on Dec 10, 2014

If you look at this NiemanLab post, table 1 shows that even as online time has increased, time on online newspapers has definitely decreased.

http://www.niemanlab.org/2014/06/are-online-ads-more-valuabl...

basisword · on Dec 10, 2014

Interesting, thanks.

lsseckman · on Dec 11, 2014

stats requested, stats delivered, hackernews

lstamour · on Dec 10, 2014

That said, you could end up applying this to a "news article IDE" automatically, with less human intervention required -- or at least, provide automatic suggestions. I couldn't find any clear links on the topic, but here are a few that can be followed with a bit of research:

http://www.nltk.org/book/ch07.html

http://stanbol.apache.org/docs/trunk/components/enhancer/nlp...

This last one was actually trained on WSJ content: http://nlp.lsi.upc.edu/freeling/index.php?option=com_content...

At this point, though, I'm thinking it'd be the equivalent to spelling and grammar suggestions in Word, appreciated somewhat but ultimately considered useless the first time it screws up. But still better than nothing, right? ;-)

mindcrime · on Dec 10, 2014

Stanbol could definitely be used as part of a system like this. In fact, that sort of thing is a big part of how we're using it at Fogbeam, although aimed at assorted knowledge workers in an enterprise setting, and not at journalists specifically.

djb_hackernews · on Dec 11, 2014

There are automated services that let you do this; I've used Open Calais and MetaCarta in the past with great results.

To be honest I'm surprised services like those aren't automatically used on new content within every major media outlet as a standard.

Cthulhu_ · on Dec 11, 2014

Yeah, just that: they're too busy. Most news articles are valid for just one edition of a newspaper, so about half a day or even less if the newspaper is published more than once a day. It's write it and move on, in a lot of cases. Investigative journalism probably has more use for a system like this, but even then I doubt they'd want to fill in XML forms when they could just write sentences. Besides, usually you can trust an editor to read an article and remove the bits that aren't relevant based on their own judgment, instead of a 'priority' hint by the original author.

7952 · on Dec 10, 2014

A lot of this information can be extracted directly from the text. It is not like "New Castle County Council" is particularly ambiguous. The trouble for a lot of reporting is that the stories simply lack depth and good links to external content.
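As a crude illustration of pulling such names straight from the text, here is a Python sketch that collects runs of capitalized words. This is a regex heuristic standing in for a real named-entity recognizer; it is an assumption of this example, not how any particular tool works, and it will misfire on sentence-initial words followed by a proper noun. The `candidate_entities` name is hypothetical.

```python
import re

# Runs of two or more capitalized words, e.g. "New Castle County Council".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def candidate_entities(text):
    """Return distinct multi-word capitalized phrases in order of first appearance."""
    seen = []
    for match in CAPITALIZED_RUN.finditer(text):
        # Normalize internal whitespace so line-wrapped names compare equal.
        name = " ".join(match.group(0).split())
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    story = (
        "The vote by New Castle County Council was delayed. "
        "New Castle County Council meets Tuesday."
    )
    print(candidate_entities(story))
```

The candidates could then be offered to an editor as suggested tags rather than applied blindly.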

EGreg · on Dec 11, 2014

I agree that this can be generated at publish time. The question is, how much value does it provide over something generated at read time by a browser plugin? The answer is - at publish time presumably someone at the publisher outfit will take a cursory look at it. That's it. May as well have readers mark up the articles!