‘Checking in on Grammar Checking’ by Robert Dale is the latest Industry Watch column to be published in the journal Natural Language Engineering.
Reflecting back to 2004, industry expert Robert Dale reminds us of a time when Microsoft Word was the dominant software used for grammar checking. Bringing us up to date in 2016, Dale discusses the evolution, capabilities and current marketplace of grammar checking, and its diverse range of users: from academics and men on dating websites to the fifty top celebrities on Twitter.
Below is an extract from the article, which is available to read in full here.
An appropriate time to reflect
I am writing this piece on a very special day. It’s National Grammar Day, ‘observed’ (to use Wikipedia’s crowdsourced choice of words) in the US on March 4th. The word ‘observed’ makes me think of citizens across the land going about their business throughout the day quietly and with a certain reverence; determined, on this day of all days, to ensure that their subjects agree with their verbs, to not their infinitives split, and to avoid using prepositions to end their sentences with. I can’t see it, really. I suspect that, for most people, National Grammar Day ranks some distance behind National Hug Day (January 21st) and National Cat Day (October 29th). And, at least in Poland and Lithuania, it has to compete with St Casimir’s Day, also celebrated on March 4th. I suppose we could do a study to see whether Polish and Lithuanian speakers have poorer grammar than Americans on that day, but I doubt we’d find a significant difference. So National Grammar Day might not mean all that much to most people, but it does feel like an appropriate time to take stock of where the grammar checking industry has got to. I last wrote a piece on commercial grammar checkers for the Industry Watch column over 10 years ago (Dale 2004). At the time, there really was no alternative to the grammar checker in Microsoft Word. What’s changed in the interim? And does anyone really need a grammar checker when so much content these days consists of generated-on-a-whim tweets and SMS messages?
The evolution of grammar checking
Grammar checking software has evolved through three distinct paradigms. First-generation tools were based on simple pattern matching and string replacement, using tables of suspect strings and their corresponding corrections. For example, we might search a text for any occurrences of the string isnt and suggest replacing them by isn’t. The basic technology here was pioneered by Bell Labs in the UNIX Writer’s Workbench tools (Macdonald 1983) in the late 1970s and early 1980s, and was widely used in a range of more or less derivative commercial software products that appeared on the market in the early ’80s. Anyone who can remember that far back might dimly recall using programs like RightWriter on the PC and Grammatik on the Mac.
Second-generation tools embodied real syntactic processing. IBM’s Epistle (Heidorn et al. 1982) was the first really visible foray into this space, and key members of the team that built that application went on to develop the grammar checker that, to this day, resides inside Microsoft Word (Heidorn 2000). These systems rely on large rule-based descriptions of permissible syntax, in combination with a variety of techniques for detecting ungrammatical elements and posing potential corrections for those errors.
Perhaps not surprisingly, the third generation of grammar-checking software is represented by solutions that make use of statistical language models in one way or another. The most impressive of these is Google’s context-aware spell checker (Whitelaw et al. 2009)—when you start taking context into account, the boundary between spell checking and grammar checking gets a bit fuzzy. Google’s entrance into a marketplace is enough to make anyone go weak at the knees, but there are other third-party developers brave enough to explore what’s possible in this space. A recent attempt that looks interesting is Deep Grammar (www.deepgrammar.com).
We might expect to find that modern grammar checkers draw on techniques from each of these three paradigms. You can get a long way using simple table lookup for common errors, so it would be daft to ignore that fact, but each generation adds the potential for further coverage and capability.
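That first-generation approach is simple enough to sketch in a few lines. Below is a minimal, hypothetical table-lookup checker in the spirit of those early tools; the table entries are invented examples, not the actual Writer’s Workbench data:

```python
import re

# Toy table of suspect strings and suggested corrections
# (invented examples, purely for illustration).
CORRECTIONS = {
    "isnt": "isn't",
    "dont": "don't",
    "alot": "a lot",
    "could of": "could have",
}

def check(text):
    """Return (suspect, suggestion, offset) triples found in text."""
    hits = []
    for suspect, suggestion in CORRECTIONS.items():
        # Word boundaries stop 'isnt' from firing inside e.g. 'raisins'.
        for m in re.finditer(r"\b" + re.escape(suspect) + r"\b", text):
            hits.append((suspect, suggestion, m.start()))
    return sorted(hits, key=lambda h: h[2])

print(check("This isnt hard, but you could of done better."))
```

The sketch also makes the limitation obvious: the checker only ever finds errors someone has thought to put in the table, which is precisely why the later generations added syntactic and statistical machinery.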
The remainder of the article discusses the following:
- Today’s grammar-checking marketplace
- Who needs a grammar checker?
‘Checking in on grammar checking’ is an Open Access article. You may also be interested in complimentary access to a collection of related articles about grammar published in Natural Language Engineering. These papers are fully available until 30th June 2016.
Other recent Industry Watch articles by Robert Dale:
Google’s translation service is used more than a billion times a day worldwide
Extract from the article ‘How to make money in the translation business’ by industry expert Robert Dale published in the journal Natural Language Engineering.
An anniversary year
2016 marks the fiftieth anniversary of an important event in the history of Machine Translation (MT). In 1966, after two years of work, the group of seven scientists who constituted the US National Science Foundation’s Automatic Language Processing Advisory Committee (ALPAC) handed down a 124-page report that was, well, somewhat negative about the state of MT research and its prospects. The ALPAC report is widely credited with causing the US government to drastically reduce funding in MT, and other countries to follow suit.
As it happens, 2016 also marks the tenth anniversary of the launch of the Google Translate web-based translation service, which was soon followed in 2007 by Microsoft’s Translator. Google says its translation service is used more than a billion times a day worldwide, by more than 500 million people a month. In mid-2015, one market research report estimated that, by 2020, the global MT market will be worth $10B.
Not a bad turnaround in outlook, even if it did take a few decades.
MT is special
In the portfolio of language technology applications that are the focus of interest of this journal’s readership, MT occupies a special place. MT was the goal of one of the very first experiments in Natural Language Processing. In 1954, the Georgetown–IBM MT system automatically translated sixty Russian sentences into English, leading its authors to claim that within three or five years, MT might be a solved problem. You can still find the original press release on the web; it’s a fascinating read, with its detailed description of a ‘brain’ that ‘dashed off its English translations . . . at the breakneck speed of two and a half lines per second.’
MT is also special because it’s one of the first areas of Natural Language Processing where statistical methods took hold in a big way. Although the idea of statistical MT was first raised by Warren Weaver in a 1949 memorandum, it was IBM’s influential statistical MT work in the late 1980s and early 1990s that caused researchers to sit up and take notice. I think it’s reasonable to claim that the perceived successes of Statistical Machine Translation (SMT) have been a major driver for the application of statistical techniques in other areas of Natural Language Processing since that time.
And MT is special because it’s possibly the most accessible form of language technology in terms of the popular understanding. It can be a struggle to explain to the layperson exactly what text analytics is, or why it is that grammar checkers and speech recognisers make mistakes. But most people get what MT is about, and can see that it might be a hard thing to do; many people have struggled with learning a second language. Nobody doubts the value of a technology that can take one human language as input and provide another as output.
In fact, universal translators have been a staple of science fiction, and thus part of the popular imagination, since at least 1945. Devices that can translate languages have played a role in many popular sci-fi TV shows. You can even guess someone’s age bracket by the movie or TV show whose name comes to mind when you mention the idea—for me, it’s Star Trek, where the back-story is that the Universal Translator was first used in the late twenty-second century for the translation of well-known Earth languages.
From where we stand now, Star Trek’s creator, Gene Roddenberry, looks to have been just a bit on the cautious side with his predictions. Perhaps he had read the ALPAC report: the Universal Translator first showed up in a 1967 episode of the show.
In the rest of the article, Robert Dale looks at where we are now and at MT delivery models, considers humans versus machines, and gives his opinion on where the commercial potential lies.
Read the article ‘How to make money in the translation business’
You may also be interested in complimentary access* to a collection of related articles on Machine Translation from the journal Natural Language Engineering. *Free access available until 31 March 2016
Blog post written by Vivi Nastase based on the special issue ‘Graphs and Natural Language Processing’ in the journal Natural Language Engineering.
Graph structures naturally model connections. In natural language processing (NLP), connections are ubiquitous, at every scale from individual words to the web: between words, as structural/grammatical or semantic connections; between concepts in ontologies or semantic repositories; between web pages; and between entities in social networks. Such connections are relatively obvious, and the parallel with graph structures straightforward. Less obviously, with a little mathematical imagination, graphs can be applied to typo correction, machine translation, document structuring, sentiment analysis and more.
Graphs can be extremely useful for revealing regularities and patterns in data. Graph formalisms have been adopted as an unsupervised learning approach to numerous problems – such as language identification, part-of-speech (POS) induction, or word sense induction – and also in semi-supervised settings, where a small set of annotated seed examples is used together with the graph structure to spread the annotations throughout the graph. Graphs’ appeal is further enhanced by the fact that, as a representation, they lend themselves to human inspection, and can thus provide insights and ideas for automatic methods.
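That semi-supervised setting, where a few annotated seed examples spread their labels through the graph, can be illustrated with a toy label-propagation loop. This is a generic sketch under simplifying assumptions (unweighted edges, simple majority vote), not any specific algorithm from the special issue:

```python
from collections import Counter

def propagate(edges, seeds, iterations=10):
    """Spread seed labels over an undirected graph by iterated
    neighbour majority vote; ties leave a node's label unchanged."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    labels = dict(seeds)
    for _ in range(iterations):
        new = dict(labels)  # synchronous update from the previous round
        for node in graph:
            if node in seeds:  # seed labels stay fixed
                continue
            votes = Counter(labels[n] for n in graph[node] if n in labels)
            if votes:
                (label, count), *rest = votes.most_common(2)
                if not rest or count > rest[0][1]:  # unambiguous majority only
                    new[node] = label
        labels = new
    return labels

# Two seed words label the rest of a tiny word graph.
edges = [("bank", "money"), ("bank", "river"), ("river", "shore")]
print(propagate(edges, {"money": "finance", "shore": "nature"}))
```

Here ‘bank’ ends up with the ‘finance’ label and ‘river’ with the ‘nature’ label, purely because of where each sits in the graph relative to the seeds.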
We find not only standard graphs – consisting of a set of nodes and edges that connect pairs of nodes – but also heterogeneous graphs (to model the network of tweeters and their tweets, or the network of articles, their authors and references), hypergraphs (which allow edges with more than two nodes, and could model grammatical rules, for example), and graphs with multi-layered edges, all adopted to fit more complex problems and data.
In the special issue we include a survey of graph-based methods in natural language processing, to show both the variety of graph formalisms and of tasks they can be useful for. The core of the issue consists of four articles, each of which showcases and exploits a different facet of graphs for different tasks in NLP: graphs as a framework for the organization of complex knowledge; using the graph structure of knowledge repositories for the computation of semantic relatedness between texts; revealing and exploiting sub-structures in word co-occurrence graphs for approximating word senses and performing sense-level translations; tracking changes in word co-occurrence graphs to identify diachronic sense changes.
Read the special issue ‘Graphs and Natural Language Processing’ in the journal Natural Language Engineering.
In his latest Industry Watch column, Robert Dale, Chief Technology Officer for Arria NLG, takes a look at what’s on offer in the NLP microservices space, reviewing five SaaS offerings as of June 2015.
Below is an extract from the column.
With NLP services now widely available via cloud APIs, tasks like named entity recognition and sentiment analysis are virtually commodities. We look at what’s on offer, and make some suggestions for how to get rich.
Software as a service, or SaaS – the mode of software delivery where you pay a monthly or annual subscription to use a cloud-based service, rather than having a piece of software installed on your desktop – just gets more and more popular. If you’re a user of Evernote or CrashPlan, or in fact even GMail or Google Docs, you’ve used SaaS. The biggest impact of the model is in the world of enterprise software, with applications like Salesforce, Netsuite and Concur now part of the furniture for many organisations. SaaS is big business: depending on which industry analyst you trust, the SaaS market will be worth somewhere between US$70 billion and US$120 billion by 2018. The benefits from the software vendor’s point of view are well known: you only have one instance of your software to maintain and upgrade, provisioning can be handled elastically, the revenue model is very attractive, and you get better control of your intellectual property. And customers like the hassle-free access from any web-enabled device without setup or maintenance, the ability to turn subscriptions on and off with no up-front licence fees, and not having to talk to the IT department to get what they want.
The SaaS model meets the NLP world in the area of cloud-based microservices: a specific form of SaaS where you deliver a small, well-defined, modular set of services through some lightweight mechanism. By combining NLP microservices in novel ways with other functionalities, you can easily build a sophisticated mashup that might just net you an early retirement. The economics of commercial NLP microservices offerings make these an appealing way to get your app up and running without having to build all the bits yourself, with your costs scaling comfortably with the success of your innovation. So what is out there in the NLP microservices space? That early retirement thing sounded good to me, so I decided to take a look. But here’s the thing: I’m lazy.
I want to know with minimal effort whether someone’s toolset is going to do the job for me; I don’t want to spend hours digging through a website to understand what’s on offer. So, I decided to evaluate SaaS offerings in the NLP space using, appropriately, the SAS (Short Attention Span) methodology: I would see how many functioning NLP service vendors I could track down in an afternoon on the web, and I would give each website a maximum of five minutes of exploration time to see what it offered up. If after five minutes on a site I couldn’t really form a clear picture of what was on offer, how to use it, or what it would cost me, I would move on. Expecting me to read more than a paragraph of text is so Gen X.
Before we get into specifics, some general comments about the nature of these services are in order, because what’s striking are the similarities that hold across the different providers. Taken together, these almost constitute a playbook for rolling out a SaaS offering in this space.
Read the rest of the article, including reviews of AlchemyAPI, TextRazor and more, in the journal Natural Language Engineering
Choosing the best word or phrase for a given context from among candidate near-synonyms, such as “slim” and “skinny”, is something that human writers, given some experience, do naturally; but for choices with this level of granularity, it can be a difficult selection problem for computers.
Researchers from Macquarie University in Australia have published an article in the journal Natural Language Engineering, investigating whether they could use machine learning to re-predict a particular choice among near-synonyms made by a human author – a task known as the lexical gap problem.
They took a supervised machine learning approach to this problem, in which the weights of different features of a document are learned computationally. Using this approach, they were able to predict the author’s chosen synonym more accurately and reduce errors.
The initial approach solidly outperformed some standard baselines, and predictions of synonyms made using a small window around the word outperformed those made using a wider context (such as the whole document).
However, they found that this was not the case uniformly across all types of near-synonyms. Those that embodied connotational or affective differences — such as “slim” versus “skinny”, with differences in how positively the meaning is presented — behaved quite differently, in a way that suggested that broader features related to the ‘tone’ of the document could be useful, including document sentiment, document author, and a distance metric for weighting the wider lexical context of the gap itself. (For instance, if the chosen near-synonym was negative in sentiment, this might be linked to other expressions of negative sentiment in the document.)
The distance weighting was particularly effective, resulting in a 38% decrease in errors, and these models turned out to improve accuracy not just on affective word choice, but on non-affective word choice also.
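The distance-weighting idea is easy to illustrate. The sketch below is a hypothetical toy, not the paper’s actual model: the sentiment lexicon, the candidate connotation scores and the exponential decay are all invented for illustration. The only point is that context words nearer the gap count for more:

```python
import math

# Invented sentiment lexicon and candidate connotations,
# purely for illustration.
SENTIMENT = {"gorgeous": 1.0, "elegant": 0.8, "unhealthy": -1.0}
CANDIDATES = {"slim": 1.0, "skinny": -1.0}

def choose(context, gap_index, decay=0.5):
    """Pick the candidate whose connotation best matches the
    distance-weighted sentiment of the surrounding words."""
    score = 0.0
    for i, word in enumerate(context):
        weight = math.exp(-decay * abs(i - gap_index))  # nearer words count more
        score += weight * SENTIMENT.get(word, 0.0)
    # Candidate whose connotation is closest to the context score.
    return min(CANDIDATES, key=lambda c: abs(CANDIDATES[c] - score))

tokens = ["she", "looked", "gorgeous", "and", "___", "in", "that", "dress"]
print(choose(tokens, tokens.index("___")))  # the positive context favours 'slim'
```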
Read the full article ‘Predicting word choice in affective text’ online in the journal Natural Language Engineering
Examples of humorous and sometimes awkward autocorrect substitutions happen all the time. Typing ‘funny autocorrect’ into Google brings up page upon page of examples where phones seem to have a mind of their own.
A group of researchers at the University of Helsinki, led by Professor Hannu Toivonen, have been examining word substitution and sentence formation, to see the extent to which they can implement a completely automatic form of humour generation. The results have been published online in the journal Natural Language Engineering.
Building on the ideas and methods of computational humour that Alessandro Valitutti has explored for several years, the researchers worked with short text messages, changing one word into another to turn the text into a pun, possibly using a taboo word. By isolating and manipulating the main components of such pun-based texts, they were able to generate humorous texts in a more controllable way.
For example, they showed that replacing a word at the end of the sentence surprised recipients, contributing to the humorous effect. They also found that a replacement is funnier if the new word is phonetically similar to the original word and is a “humorously inappropriate” taboo word.
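A toy version of the phonetic-similarity criterion can be sketched using plain orthographic edit distance as a crude stand-in for a proper phonetic distance, and an innocuous candidate list rather than taboo words; none of this is the researchers’ actual implementation:

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def best_pun(word, candidates):
    """Pick the replacement most similar to the original word."""
    return min(candidates, key=lambda c: edit_distance(word, c))

# 'fast' -> 'fart' is the kind of one-letter swap behind 'back so fart?'
print(best_pun("fast", ["fart", "slow", "moon"]))
```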
The experiment involved over 70,000 assessments in total, and used crowdsourcing to test the funniness of the texts. This is the largest experiment related to this field of research that Professor Toivonen knows of.
People were asked to assess individual messages for their funniness on a scale of 0 to 4, with 0 indicating the text wasn’t funny. And comedians can sigh with relief – the initial median score from the research was just 0.55, indicating that on average the texts could hardly be called funny. But when a combination of the rules was followed, this median increased by 67%, showing that applying certain criteria could affect how funny a text message was.
Does this mean that in the future people will ‘rofl’ (roll on the floor laughing) in response to a funny quip or witty banter made by a phone?
Professor Toivonen sees a future where programs will be able to generate humorous automated responses and sentences:
“Some of the first applications of this type of research are likely to be seen in the automated production of funny marketing messages and help with creative writing. But who knows, maybe phones will one day be intelligent enough to make you laugh.”
Read the article ‘Computational generation and dissection of lexical replacement humor’ online in the journal Natural Language Engineering – please note that the article contains language that some may find offensive.
5 of the funniest texts*
- Okie, pee ya later
- How come u r back so fart?
- Now u makin me more curious…smell me pls…
- Dunno…My mum is kill bathing.
- No choice have to eat her
*There were funnier texts but due to offensive language we were not able to publish them on this blog