
Helping machines understand visual content with AI
In today’s data-driven world, businesses strive to make informed decisions. Yet a significant blind spot persists: the inability to fully leverage the insights hidden in their vast stores of unstructured content, including images, audio, and video. This is where Coactive, a company founded by MIT alumni Cody Coleman and William Gaviria Rojas, is making a difference.
Coactive has developed an artificial intelligence-powered platform that can instantly search, organize, and analyze unstructured visual and audio content. This capability lets businesses derive new insights and make faster, more effective decisions, moving beyond the limitations of traditional structured data analysis.
“In the first big data revolution, businesses got better at getting value out of their structured data,” explains Cody Coleman, CEO of Coactive. “But now, approximately 80 to 90 percent of the data in the world is unstructured. In the next chapter of big data, companies will have to process data like images, video, and audio at scale, and AI is a key piece of unlocking that capability.”
The company is already collaborating with major media and retail entities, helping them interpret their visual content without the arduous task of manual sorting and tagging. This not only accelerates content delivery to users but also aids in removing explicit material and understanding how specific content influences user behavior.
The founders envision Coactive as a prime example of AI’s potential to enhance human efficiency and tackle previously unsolvable problems. “The word coactive means to work together concurrently, and that’s our grand vision: helping humans and machines work together,” Coleman asserts. “We believe that vision is more important now than ever because AI can either pull us apart or bring us together. We want Coactive to be an agent that pulls us together and gives human beings a new set of superpowers.”
Coleman and Gaviria Rojas’ journey began during their time at MIT, where they both majored in electrical engineering and computer science. Their collaborative spirit was evident early on, even working together to bring MIT OpenCourseWare content to Mexican universities. Coleman’s early exposure to AI came through his graduate research at MIT Open Learning, where he applied machine learning to analyze how humans learn on MITx, giving him his first deep dive into applying AI to video content.
After MIT, Coleman pursued his PhD at Stanford, focusing on democratizing AI usage. His work with companies like Pinterest and Meta solidified his vision for an “AI operating system” for multimodal content. Simultaneously, Gaviria Rojas, working as a data scientist at eBay, recognized the impending explosion of multimodal data. A chance encounter during a couch move sparked the idea for Coactive, born from their shared realization that a critical technology was missing to unlock this data at scale.
Coactive’s platform is designed to be model agnostic, allowing the company to integrate and adapt new AI models as they evolve. It includes prebuilt applications that enable business customers to effortlessly search, generate metadata, and conduct analytics to extract valuable insights from their visual content. “Before AI, computers would see the world through bytes, whereas humans would see the world through vision,” Coleman says. “Now with AI, machines can finally see the world like we do, and that’s going to cause the digital and physical worlds to blur.”
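Coactive has not published its architecture, but the "model agnostic" idea Coleman describes can be loosely sketched: hide each model behind a common embedding interface, so that the applications built on top never depend on any particular model. Everything below, including the toy `BagOfWordsEmbedder`, is a hypothetical illustration, not Coactive's actual design.

```python
from typing import Dict, List, Protocol


class Embedder(Protocol):
    """Any model that maps a piece of content to a fixed-length vector."""

    def embed(self, item: str) -> List[float]: ...


class BagOfWordsEmbedder:
    """Toy stand-in for a real vision/language model: hashes each
    token into a small deterministic vector. Illustrative only."""

    DIM = 16

    def embed(self, item: str) -> List[float]:
        vec = [0.0] * self.DIM
        for tok in item.lower().split():
            vec[sum(map(ord, tok)) % self.DIM] += 1.0
        return vec


def index_assets(model: Embedder, assets: Dict[str, str]) -> Dict[str, List[float]]:
    """Precompute one embedding per asset ID. The caller only sees the
    Embedder interface, so models can be swapped without changing this code."""
    return {asset_id: model.embed(text) for asset_id, text in assets.items()}
```

Swapping `BagOfWordsEmbedder` for any other `Embedder` leaves `index_assets` untouched, which is the practical payoff of a model-agnostic layer as new models evolve.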
The impact of Coactive’s technology is already evident. Reuters, a global news agency, previously relied on manual tagging for its vast image database, a process that was slow and expensive. With Coactive’s “Enable AI Search,” journalists can now instantly retrieve relevant content based on AI’s understanding of image and video details, significantly improving the quality and speed of news reporting.
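Reuters’ exact pipeline isn’t detailed here, but the general mechanism behind this kind of AI search can be sketched: embed every asset once, embed the query at search time, and rank by vector similarity instead of relying on hand-written tags. The `embed` function below is a deterministic toy stand-in for a real vision model; the names are hypothetical.

```python
import math
from typing import Dict, List

DIM = 16


def embed(text: str) -> List[float]:
    """Toy embedding: hash each token into one of DIM buckets.
    A real system would use a vision/language model instead."""
    vec = [0.0] * DIM
    for tok in text.lower().split():
        vec[sum(map(ord, tok)) % DIM] += 1.0
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def search(query: str, index: Dict[str, List[float]], top_k: int = 3) -> List[str]:
    """Return the top_k asset IDs ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda aid: cosine(q, index[aid]), reverse=True)
    return ranked[:top_k]
```

In practice the index would hold image or video embeddings produced by the same model family as the query encoder; the ranking logic itself would be unchanged.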
Another prominent client, Fandom, one of the world’s largest platforms for entertainment information, leverages Coactive to moderate its online communities. By codifying their community guidelines, Fandom can now review new content for excessive gore and sexualized material in an average of 500 milliseconds, a drastic improvement from the previous 24 to 48 hours.
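Fandom’s actual moderation system isn’t public; the sketch below only shows the general shape of "codified guidelines": a model scores each new item against each guideline, and anything over a per-guideline threshold is flagged, in milliseconds rather than via a human review queue. The guideline names and thresholds here are hypothetical.

```python
from typing import Dict, List

# Hypothetical per-guideline scores in [0, 1], e.g. from a classifier
# head running on top of content embeddings.
GuidelineScores = Dict[str, float]


def review(scores: GuidelineScores, thresholds: Dict[str, float]) -> List[str]:
    """Return the names of the guidelines a piece of content violates.

    A model forward pass plus this comparison runs in milliseconds,
    which is how automated review can replace a multi-hour human queue.
    Guidelines with no configured threshold are never flagged.
    """
    return [name for name, score in scores.items()
            if score >= thresholds.get(name, 1.1)]
```

A deployment would tune each threshold against labeled examples to balance false flags against missed violations.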
These real-world applications underscore Coactive’s core mission: redefining human-computer interaction. “Throughout the history of human-computer interaction, we’ve had to bend over a keyboard and mouse to input information in a way that machines could understand,” Coleman reflects. “Now, for the first time, we can just speak naturally, we can share images and video with AI, and it can understand that content. That’s a fundamental change in the way we think about human-computer interactions. The core vision of Coactive is because of that change, we need a new operating system and a new way of working with content and AI.”



