For artificial intelligence (AI) systems to identify what is happening in a video, they need training: they must locate the part of the frame where the key event takes place and understand what that event is. Scientists at the MIT-IBM Watson AI Lab and IBM Research AI have now compiled a training dataset for this purpose, called “Moments in Time”. It consists of one million three-second clips from movies, TV shows and amateur videos that developers can feed into their AI systems. The full dataset can be requested on the project website.
A demo version shows an AI finding and labeling the actions in a number of videos. The next step could be an AI that also interprets and reacts to these actions, perhaps for applications in autonomous vehicles.
“Moments in Time” is not the first video collection for training AI systems. Google unveiled the “YouTube-8M Dataset” last year, which classifies millions of YouTube videos into 4,716 categories. The dataset of video links is available free of charge as a TensorFlow file and is covered by a Creative Commons Attribution 4.0 license (CC BY 4.0).