Artificial intelligence (AI) and machine learning are increasingly becoming part of everyday life and further impacting how we use our devices.
We are entering an era where devices can see rather than simply respond to touch. Recent smartphone launches from Apple (iPhone X), Huawei (Mate 10 Pro), and Google (Pixel 2 and Google Clips) are starting to shift the focus from what we see and do with our devices, to what the devices see and do for us.
With this technology growth comes the need for devices and applications to become more aware of the user’s surroundings, which in turn creates a need for vision systems that capture this information and process it for inferencing.
Applications for computer vision are rather extensive, ranging from image cataloguing/sorting to securing an individual’s phone through facial recognition and detecting objects/persons in public.
Market movers like Apple, Google, Microsoft, Intel, Qualcomm, Huawei, and NVIDIA have started to implement this technology in consumer- and enterprise-level devices; over time it will trickle down to a wider breadth of devices and pricing tiers.
Major features in security (e.g. Apple Face ID), social networking (Apple Animoji), and content (Google Clips) are early examples of this confluence of computer vision, AI, and machine learning.
These devices also highlight a trend in which both computer vision and AI are moving onto the end device instead of relying on the cloud. This creates additional market opportunity for chip suppliers to bring more robust chips and GPUs to market, along with solutions like accelerators and VPUs (vision processing units, a class of processor intended for accelerating machine vision tasks).
Michael Inouye, principal analyst at ABI Research, said that the combination of AI, machine learning, and computer vision will help people use and interact with their devices in new and more profound ways.
“We will move from one-to-one connections between devices and the Web and remote services to an increasingly connected ecosystem of components that work together,” he said.
ABI Research forecasts that by 2022, over 650 million mobile devices will support more advanced vision applications on the device.
“Our phones will move from retouching our photos in the cloud to using vision to recognise when we are upset and perhaps starting some music to ease our troubled minds,” he said.
While some of the markets supporting embedded vision, like VR and AR, may appear to be new, he said that they are in fact much older than the recent product launches; they are spreading now because the technology has reached a critical level where science fiction and imagination are starting to become reality.
He said that mobile devices will remain the largest ‘market opportunity’ by volume of devices for computer vision. AR/VR applications will leverage or rely on smartphones. Computer vision will in some ways be bounded by what the camera can see.
“Google Clips offers a new take on cameras which attempts to automatically look for and capture great moments. Smartphone cameras’ intelligence may be most limited by their field of view. They will be blind when they are in our pockets; a 180-degree front-facing camera may have immensely more useful intelligence compared to approximately 100 degrees of many devices today,” he said.
More nascent markets like AR and VR already rely on vision tech, but he said that they will benefit greatly as more developers create new ways to use devices, expand upon existing ones, and as users assimilate these applications into their daily lives.