As the developed world bemoans the loss of jobs to artificial intelligence, a company in a remote little town in central-west Kerala is reaping the benefits of preparing data for AI and machine learning. Infolks, set up by high-school dropout Mujeeb Kolasseri at Kumaramputhur in Palakkad, offers employment to about 250 people, providing image annotation services for clients based in the US and Europe.
It all began with Kolasseri’s efforts to find a solution to his predicament. Having left school to fend for himself and his family due to their dire financial situation, he thought he had found his vocation in aluminium fabrication after several attempts at various odd jobs. However, ill-health forced him to give up the steady income. Restricted to home but determined to earn a living, the technologically inclined Kolasseri went back to his account with Amazon Mechanical Turk (MTurk), the marketplace for crowdsourcing, and this time around, he stuck with it.
“Over two and a half years I finished more than 300,000 tasks with an approval rating of 99.7 per cent,” says 28-year-old Kolasseri, Founder, MD and CEO of Infolks. “Impressed by my rating, in 2016 a German company approached me to work directly with me.”
A steady stream of projects was promised if Kolasseri established a small company. It was a win-win situation for the German client - same quality output minus MTurk’s 20 per cent commission.
So Kolasseri did what most Keralites with limited cash do to raise money - sell off the family’s gold jewellery. With a meagre Rs25,000, he put together a six-member team from his village, Changaleeri. They set to work for the Fortune 10 client labelling all elements of images that autonomous vehicles’ cameras and sensors would see.
What's data labelling?
The labelling of data - in this case, of images - is necessary for machines to learn and identify various objects without being explicitly programmed. “Up to 80 per cent of the work with an AI project is collecting and preparing data – labelling is a big part of it, as is removal of duplicate data,” says Huma Abidi, Director of Machine Learning/Deep Learning Software Engineering at Intel Corporation. “The more labelled data is available for training, the better trained the AI models will be. The higher the precision in data labelling, the more accurate the results will be. Since these models are used for critical applications such as training self-driving cars and detecting tumours, this could literally be a matter of life and death.”
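To make the idea concrete, here is a minimal sketch of what one labelled object in an image might look like, and of one common way to check how precisely a box was drawn. The `BoxLabel` record, its field names and the sample coordinates are hypothetical illustrations, not Infolks' or any client's actual format. The check shown is intersection-over-union (IoU), a widely used overlap score, offered here only as an assumed example of a quality measure.

```python
from dataclasses import dataclass


@dataclass
class BoxLabel:
    """One hypothetical annotation: a category plus a pixel bounding box."""
    category: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        # Clamp at zero so a malformed box never yields a negative area.
        return max(0, self.x_max - self.x_min) * max(0, self.y_max - self.y_min)


def iou(a: BoxLabel, b: BoxLabel) -> float:
    """Intersection-over-union: 1.0 for identical boxes, 0.0 for no overlap."""
    inter_w = max(0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h
    union = a.area() + b.area() - inter
    return inter / union if union else 0.0


# Made-up example: an annotator's box compared with a reviewer's reference box.
annotator_box = BoxLabel("pedestrian", 100, 50, 140, 150)
reference_box = BoxLabel("pedestrian", 102, 52, 141, 150)
print(f"overlap score: {iou(annotator_box, reference_box):.2f}")
```

A score close to 1 means the two boxes nearly coincide; a reviewer could send lower-scoring labels back for correction, with the cut-off left to each project.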
As more and more AI solutions come on stream, research firm Cognilytica expects the market for data preparation to grow from $500 million in 2018 to $1.2 billion by 2023. With data labelling an indispensable part of AI, especially for autonomous vehicles, object and image recognition, as well as text and image annotation, the market for third-party data labelling providers is also expected to grow from $150 million to $1 billion over the same period.
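As a rough check on what those forecasts imply, the sketch below converts the start and end figures quoted above into a compound annual growth rate. The function name `cagr` and the assumption of steady year-on-year growth over the five years are illustrative choices, not part of Cognilytica's forecast.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, assuming steady growth each year."""
    return (end / start) ** (1 / years) - 1


YEARS = 2023 - 2018  # five-year forecast window quoted in the article

# Figures in millions of US dollars, taken from the forecast quoted above.
data_prep = cagr(500, 1200, YEARS)
third_party = cagr(150, 1000, YEARS)

print(f"data preparation market: {data_prep:.1%} a year")
print(f"third-party labelling:   {third_party:.1%} a year")
```

On these numbers the overall data preparation market would expand by roughly 19 per cent a year, while third-party labelling would grow by about 46 per cent a year.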
Most large corporations are turning to companies that specialise in data labelling, such as iMerit, Hive (both with offices in India and the US) and Figure Eight, which employ hundreds of thousands of people in developing countries around the world, says Abidi.
Taking a sliver of the pie dominated by big data labelling companies and sub-contractors are entrepreneurs such as Kolasseri who directly serve clients across the world. What’s more remarkable is how such companies are changing the locality they are set in by providing employment to low-skilled workers.
Kumaramputhur, tucked away in the municipality of Mannarkkad, is a place of mainly rubber plantations where the youth look towards the Gulf for employment. Here, Infolks offers jobs with remuneration comparable to that in IT hubs such as Bengaluru.
“About 80 per cent of our employees are straight out of college,” says Kolasseri. “We train and provide them with jobs that they can only get in metros such as Bengaluru or Kochi. Around 50 per cent of our employees are from Mannarkkad, 30 per cent from my district, Palakkad and the rest from other parts of Kerala.”
Kolasseri did consider moving to a Special Economic Zone to avail of the tax benefits, but when he came to know that he would have to reinvest the capital in the SEZ itself, he dropped the plan.
“My focus is not only to make my company grow but also transform my village. I want to bring a world-class working environment and better pay to my village. We have bought land in Changaleeri to construct our own building that can accommodate 1,000 employees in real time, providing jobs to 2,000 employees in shifts. Within two to three years, imagine the growth and development a company employing 2,000 workers will bring to my village.”
Lesson learnt
But it hasn’t always been smooth sailing for Infolks. Lack of entrepreneurial experience among the top management, led by Kolasseri, his brother Muneer and friend Navas Thazhathethil, brought the company to a standstill last year. “I learnt my lesson the hard way that a company shouldn’t rely on a single client when the project from the German client ended,” says Kolasseri. For the next phase in the project, the client needed someone with more than 2,000 people, which Infolks couldn’t provide.
Stuck with 100-110 employees and no projects in hand, it was time for Infolks to start afresh. Realising his mistake, Kolasseri set about putting in place a vision and a mission for his company, establishing all the relevant departments a tech start-up should have, including business development. Today Infolks counts Fortune 100 companies such as Daimler among its clients.
Although crowdsourcing services give clients access to an abundant pool of independent microworkers, Kolasseri is optimistic about the future.
“Many of the annotating companies in India use crowdsourcing,” he says. “They have annotators across the world working on their platforms. The main problem here is that the data is not secure - we are sharing the data with their personal computers in their homes.
“And in my experience, for big clients, the first priority is data security, then comes quality and pricing.
“As data security becomes a main concern, clients would start looking for companies with in-house employees to ensure security. Even crowdsourcing annotating companies are now looking for partners with data security and in-house employees to meet the requirements of international clients.”
However, this opportunity, ripe for the picking, may not last long either. Abidi says new and future technology may allow machine learning to happen without the need for labelled data or enable machines to self-learn. “Tools are being developed that help with data annotation, e.g. first layer is annotated by machines and humans help validate the labelling done by machines.”
Until then, it’s up to companies such as Infolks to make a killing.