Microsoft Research makes great strides in automatic image captioning
Microsoft Research is home to all manner of interesting projects and experiments, and one of the latest the team is keen to share news about is automatic image captioning. There are no prizes for guessing what this is -- it's very much what it says on the tin -- and the technology has now reached a stage where the automatically generated captions for an image are at least as good as those written by people.
A team of just 12 worked on the project, and the results are pretty impressive. The system analyzes an image and identifies its key components. Once the objects and their characteristics have been determined, they can be evaluated in relation to one another to help decide what is important and what can be ignored.
A series of possible image descriptions is then created, and these are ranked according to how much sense they make in natural language, and whether importance has been attributed to the correct components. A post on the Machine Learning Blog gives you the opportunity to see if you can determine which image captions were written by hand and which came from a machine. You might be surprised at just how difficult it is to distinguish between the two!
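The pipeline described above can be sketched in miniature. To be clear, this is a hypothetical illustration, not Microsoft's actual system: the object detector and language-model scorer below are stand-in functions, and the template-based candidate generation is a simplification of what a real captioning model does.

```python
from itertools import permutations

def detect_objects(image):
    # Stand-in detector: a real system would run a vision model here.
    # We assume it returns (object, importance) pairs for the image.
    return image  # e.g. [("dog", 0.9), ("frisbee", 0.8), ("tree", 0.3)]

def generate_candidates(objects, importance_threshold=0.5):
    # Keep only the components judged important, then build simple
    # template-based descriptions from orderings of those objects.
    salient = [name for name, score in objects if score >= importance_threshold]
    return ["a " + " with a ".join(order) for order in permutations(salient)]

def language_score(caption, preferred=("dog with a frisbee",)):
    # Stand-in for a language model: reward candidates that contain
    # phrasings which "make sense" in natural language.
    return sum(1 for phrase in preferred if phrase in caption)

def caption_image(image):
    # Detect components, generate candidate descriptions, then rank
    # them and return the most natural-sounding one.
    objects = detect_objects(image)
    return max(generate_candidates(objects), key=language_score)

print(caption_image([("dog", 0.9), ("frisbee", 0.8), ("tree", 0.3)]))
# → a dog with a frisbee
```

Note how the importance threshold discards the low-scoring "tree" before any captions are generated, mirroring the step where the system decides what can be ignored.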
There are lots of potential applications for this technology, but two of the most obvious are voice control and accessibility options. At the moment it is possible to conduct web searches for images by controlling a computer with your voice, but this is reliant on images having been appropriately tagged and captioned already. If software can be used to automatically caption images on the fly, voice-activated image searches can cast a wider net.
But perhaps a more interesting use of the technology would be to make computers in general, and the internet specifically, more accessible to people with sight problems. Text-to-speech is great for hearing what has been written on a particular page, but what about the accompanying imagery? With automatic captioning, pictures that have been added to an article could be described by text-to-speech software regardless of whether a descriptive caption had been added by the author.
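That fallback logic is simple to sketch. The `auto_caption` function below is a hypothetical stand-in for a captioning service; in practice a screen reader would speak whichever description this returns.

```python
def auto_caption(src):
    # Hypothetical captioning model; here just a placeholder string.
    # A real implementation would send the image to a vision model.
    return f"an automatically generated description of {src}"

def describe_image(src, alt=None):
    # Prefer the author's own caption; fall back to the generated one
    # only when no descriptive text was provided.
    if alt:
        return alt
    return auto_caption(src)

print(describe_image("cat.jpg", alt="a tabby cat asleep on a sofa"))
# → a tabby cat asleep on a sofa
print(describe_image("beach.jpg"))
# → an automatically generated description of beach.jpg
```

The design point is that automatic captions supplement rather than replace human-written ones: author-supplied text always wins when it exists.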