Basic Group

ARTIFICIAL INTELLIGENCE RECOGNIZES OBJECTS BY VOICE DESCRIPTION

Scientists from the Massachusetts Institute of Technology have created an algorithm that can recognize objects in an image from a simple spoken description, without further explanation.

Earlier algorithms required a large number of annotations and transcriptions. The new algorithm is much simpler to use: for example, it is enough to say “blue shirt”, and the artificial intelligence will find the matching object in the image.

The system consists of two neural networks: the first divides the image into a grid of small cells, and the second divides the sound spectrogram into short segments of one to two seconds each. The artificial intelligence then checks how closely each audio segment corresponds to each cell of the image grid.
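The comparison between grid cells and audio segments can be illustrated with a small sketch. This is not the researchers' implementation: it assumes each grid cell and each audio segment has already been turned into a vector of the same length by some network, and it uses a plain dot product as the similarity score. The function names (`matchmap`, `best_cell_per_segment`) and the toy data are hypothetical.

```python
import numpy as np


def matchmap(image_cells, audio_segments):
    """Score every grid cell against every audio segment.

    image_cells:    array of shape (rows, cols, dim), one vector per cell
    audio_segments: array of shape (n_segments, dim), one vector per segment
    Returns an array of shape (rows, cols, n_segments) of dot products.
    """
    return np.einsum("rcd,sd->rcs", image_cells, audio_segments)


def best_cell_per_segment(scores):
    """For each audio segment, return the (row, col) of the best-scoring cell."""
    rows, cols, n_segments = scores.shape
    flat = scores.reshape(rows * cols, n_segments)
    best = flat.argmax(axis=0)
    return [(int(i // cols), int(i % cols)) for i in best]


# Toy data (assumption): a 4x4 grid and 2 audio segments, 8-dimensional vectors.
rng = np.random.default_rng(0)
image_cells = rng.normal(size=(4, 4, 8)) * 0.1
audio_segments = rng.normal(size=(2, 8)) * 0.1

# Plant a clear match: cell (1, 2) and segment 0 share the same vector,
# standing in for "the blue shirt is here" and the spoken words "blue shirt".
shared = np.ones(8)
image_cells[1, 2] = shared
audio_segments[0] = shared

scores = matchmap(image_cells, audio_segments)
located = best_cell_per_segment(scores)
print("Segment 0 points at cell", located[0])
```

In this toy setup the highest score for the first audio segment falls on the cell where the match was planted, which is the kind of cell-to-segment correspondence the paragraph above describes.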

The scientists are convinced that the development can be used to create translators that recognize the spoken language and select the corresponding translation with an accuracy of 100%.
