Microsoft Research makes great strides in automatic image captioning

By Sofia Elizabella Wyciślik-Wilson
Published 9 years ago

Microsoft Research is home to all manner of interesting projects and experiments, and one of the latest that the team is keen to share news about is automatic image captioning. There are no prizes for guessing what this is -- it's very much what it says on the tin -- and the technology has now reached a stage where the automatically generated captions for an image are at least as good as those written by people.

A team of just 12 worked on the project, and the results are pretty impressive. The system analyzes an image and identifies its key components. Once the objects and their characteristics have been determined, they are evaluated in relation to each other to help decide what is important and what can be ignored.

A series of possible image descriptions is then created, and these are ranked according to how much sense they make in natural language, and whether importance has been attributed to the correct components. A post on the Machine Learning Blog gives you the opportunity to see if you can determine which image captions were written by hand and which came from a machine. You might be surprised at just how difficult it is to distinguish between the two!
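For the curious, the generate-and-rank idea described above can be sketched in a few lines of Python. This is a toy illustration only, not Microsoft's system: the `Detection` class, the `fluency` and `coverage` scores, and the `known_phrases` set are all hypothetical stand-ins for what would in practice be trained vision and language models.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A component found in the image (hypothetical structure)."""
    label: str
    importance: float  # assumed salience score between 0 and 1


def fluency(caption, known_phrases):
    """Toy stand-in for a language model: the share of adjacent
    word pairs in the caption that appear in known_phrases."""
    words = caption.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    seen = sum(1 for pair in pairs if " ".join(pair) in known_phrases)
    return seen / len(pairs)


def coverage(caption, detections):
    """Importance-weighted share of detected components that the
    caption actually mentions."""
    total = sum(d.importance for d in detections)
    if total == 0:
        return 0.0
    words = set(caption.lower().split())
    mentioned = sum(d.importance for d in detections if d.label in words)
    return mentioned / total


def rank_captions(candidates, detections, known_phrases):
    """Order candidate captions from best to worst by combined score."""
    def score(caption):
        return fluency(caption, known_phrases) + coverage(caption, detections)
    return sorted(candidates, key=score, reverse=True)


detections = [
    Detection("dog", 0.9),
    Detection("frisbee", 0.7),
    Detection("fence", 0.1),
]
known_phrases = {"a dog", "dog catches", "catches a", "a frisbee"}
candidates = [
    "a fence",
    "a dog catches a frisbee",
    "frisbee dog a catches",
]
ranked = rank_captions(candidates, detections, known_phrases)
```

In this sketch, a caption wins by both reading naturally and mentioning the components judged most important, which mirrors the two ranking criteria mentioned in the article.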

There are lots of potential applications for this technology, but two of the most obvious are voice control and accessibility options. At the moment it is possible to conduct web searches for images by controlling a computer with your voice, but this relies on images having already been appropriately tagged and captioned. If software can be used to automatically caption images on the fly, voice-activated image searches can cast a wider net.

But perhaps a more interesting use of the technology would be to make computers in general, and the internet specifically, more accessible to people with sight problems. Text-to-speech is great for hearing what has been written on a particular page, but what about the accompanying imagery? With automatic captioning, pictures that have been added to an article could be described by text-to-speech software regardless of whether a descriptive caption had been added by the author.
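As a rough sketch of how that fallback might work, a screen reader could prefer the author's caption and only call on a captioning system when none exists. The function name `describe_image` and the `generate_caption` callback are hypothetical and do not correspond to any real screen reader or Microsoft API.

```python
from typing import Callable, Optional


def describe_image(author_caption: Optional[str],
                   generate_caption: Callable[[], str]) -> str:
    """Return the text a text-to-speech tool would read for an image.

    Prefers the author's caption; falls back to an automatically
    generated one (supplied by the hypothetical generate_caption
    callback) when the author left the caption empty or missing.
    """
    if author_caption and author_caption.strip():
        return author_caption.strip()
    return "Automatically generated description: " + generate_caption()


# Example: no author caption, so the automatic one is used.
spoken = describe_image(None, lambda: "a dog catching a frisbee")
```

Labelling the fallback text as automatically generated lets the listener know how much weight to give the description.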

Photo credit: Zmiter / Shutterstock
