
Insights and Musings from the Grace Hopper Panel: Navigating Conflicts and Synergies in Multimodal AI Across Industries

December 2, 2024
Vishakha Gupta

At Grace Hopper this year, I got a chance to moderate a panel of experts with over 35 years of combined data science and AI industry experience, spanning a wide variety of roles including founder, VP of data science or engineering, and AI engineer. These panelists have worked on solutions deployed all the way from edge to cloud across various industry verticals. With such a star cast of panelists, we spent the hour exploring how various use cases have evolved from traditional machine learning (ML) to the rapidly growing fields of generative and multimodal AI. Our goal was to demystify these technologies, understand how they interplay, and help our audience see both the similarities and differences across industries.

How the Panelists’ Backgrounds in AI Stacked Up

Our panelists represented a range of industry verticals, career trajectories, company sizes, and experiences as they have navigated the evolution from classic ML to generative AI use cases. Nairwita described her background in retail, working at Walmart now and earlier at a startup called Standard Cognition, with a focus on computer vision challenges, primarily images and embeddings. Estelle drew on her background leading data science teams at Home Depot and Shopify, focused on e-commerce and supply chain challenges involving computer vision and metadata management. Harishma described her experience deploying video-based ML solutions at the edge for industrial and visual inspection use cases, pointing to the unique challenges of training video models as well as deploying them in constrained edge environments. Danishta introduced text and audio examples from her experience deploying models in constrained environments for emergency workers and government applications. These unique backgrounds, and the knowledge of applying ML across such diverse environments, data types, and verticals, are what made this panel so interesting and fun to moderate.

AI/ML Applications Across Industry Verticals

Given that our panelists have worked in various industries, we spent a few minutes listing use cases where AI/ML has shown promise or is already being used in production. In retail, ML plays a crucial role in frictionless checkout, fraud tracking, inventory management, and shopper insights (see our blog on these use cases). E-commerce has its own set of classic ML examples around product recommendations, supply chain management, and attribute classification, along with rapidly growing generative AI use cases around customer support, generating marketing material, and creating product descriptions from images and other metadata. Industrial and visual inspection often involves worker safety, hazard detection, quality analysis, and defect detection (see our blog on these use cases). Speech and language understanding tools are being built for first responders, medics, cybersecurity, and defense, including language prediction and understanding tools that help soldiers in critical situations in the field. All these use cases tend to rely on multiple modalities of data and often generate huge volumes of it.

Anecdotes from the Old and New World of AI

Our panelists shared some interesting anecdotes around AI/ML, ranging from kids now being able to create their own podcasts, to how challenging speech recognition can be in diverse environments, to the importance of finding something intriguing within the data. They also reminisced about the days when tried-and-tested methods often outperformed new approaches.

This was our segue into how the world of ML has evolved since the early days of regression models, to deep learning, to generative and multimodal AI. 

Classic Machine Learning to Multimodal and Generative AI

The panelists discussed how ML has evolved within their respective verticals. 

Harishma noted that while core fundamentals remain unchanged, such as requiring the right data, identifying the right metrics to measure model performance against, and the need for optimizations to run smaller and faster models, significant advancements have been made in video data processing, particularly in dealing with real-time constraints and large datasets. She also pointed out that while the new large vision models are too large and inefficient to run on the edge, they have certainly created possibilities for better labeled datasets as well as representative synthetic data for training the models that can then be deployed at the edge. This has sped up the model development lifecycle and lowered the cost of getting access to quality datasets.
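
Her point about large vision models feeding the training loop rather than running at the edge can be illustrated with a short, hypothetical sketch. The snippet below (not the panelists' code) uses an off-the-shelf CLIP model through the Hugging Face transformers zero-shot image classification pipeline to propose labels for raw frames; a smaller, edge-friendly model would then be trained on the reviewed labels. The model choice, label set, and file path are assumptions for illustration.

```python
# A minimal sketch: use a large vision-language model (CLIP via the Hugging Face
# `transformers` pipeline) to propose labels for raw images, which can then be
# reviewed and used to train a smaller model for edge deployment.
from transformers import pipeline
from PIL import Image

labeler = pipeline(
    task="zero-shot-image-classification",
    model="openai/clip-vit-large-patch14",   # assumed model choice
)

CANDIDATE_LABELS = ["scratch", "dent", "no defect"]  # assumed label set

def auto_label(image_path: str) -> str:
    """Return the highest-scoring label; pipeline results come back sorted by score."""
    image = Image.open(image_path)
    scores = labeler(image, candidate_labels=CANDIDATE_LABELS)
    return scores[0]["label"]

print(auto_label("frames/sample_000.jpg"))  # hypothetical path
```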

Estelle concurred with this observation and shared similar challenges in vision accuracy and taking models to production, emphasizing the importance of A/B testing and data management to ensure a flawless customer experience, something that has stayed true from classic ML to generative AI deployments.
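
To make the A/B testing point concrete, here is a minimal sketch (not from the panel) that compares conversion rates between an incumbent model and a candidate using a two-proportion z-test from statsmodels; the counts and significance threshold are illustrative assumptions.

```python
# A minimal sketch of an A/B test comparing conversion rates between an
# existing model (A) and a candidate model (B). The counts are made-up values.
from statsmodels.stats.proportion import proportions_ztest

conversions = [530, 585]       # conversions observed under model A and model B
exposures = [10_000, 10_000]   # users exposed to each variant

stat, p_value = proportions_ztest(count=conversions, nobs=exposures)
print(f"z = {stat:.2f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Difference is statistically significant; consider rolling out model B.")
else:
    print("No significant difference detected; keep model A.")
```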

Danishta highlighted advancements in speech recognition, moving from traditional deep neural networks to transformer models, but noted that the challenges of training models for low-compute environments like Arduino boards still remain.
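
Fitting models onto hardware of that class typically relies on aggressive compression. Below is a minimal sketch, not the panelists' code, of full-integer quantization with the TensorFlow Lite converter, one common route to running models under TensorFlow Lite Micro on microcontrollers; the toy model, input shape, and calibration data are placeholder assumptions.

```python
# A minimal sketch of shrinking a Keras model for a low-compute target via
# full-integer (int8) quantization with the TensorFlow Lite converter.
import numpy as np
import tensorflow as tf

# Tiny stand-in for a trained keyword-spotting model; in practice, use the real model.
keyword_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(49, 10)),   # e.g., MFCC frames (assumed shape)
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(8, activation="softmax"),
])

# A handful of calibration samples so the converter can choose integer ranges.
rep_audio_frames = np.random.rand(100, 49, 10).astype(np.float32)

def representative_data():
    for frame in rep_audio_frames:
        yield [np.expand_dims(frame, axis=0)]

converter = tf.lite.TFLiteConverter.from_keras_model(keyword_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("keyword_model_int8.tflite", "wb") as f:
    f.write(converter.convert())
```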

Nairwita spoke about the acceleration of innovation lifecycles, thanks to synthetic data and advanced computational resources, while emphasizing the importance of product-specific requirements. She also reiterated how much easier it has become to get access to datasets compared to the earlier days, when it was really a chicken-and-egg problem: you had to show the value of a model to get access to customer data from stores, but you needed customer data to train those models!

Overall, we all agreed that generative AI is the new iteration, but the same sorts of challenges around data collection, multimodal data management, metrics, and deployment remain. Computational resources have certainly evolved a lot, which has naturally accelerated the innovation lifecycle, and the tooling ecosystem has made it so that you do not need to build everything from scratch anymore. While fewer human resources are therefore needed in the development cycle, humans are now more valuable for reviewing the quality of data and outcomes. Another truism is that different AI solutions have different requirements, some real-time and some offline, so it continues to be important that teams clearly define and understand product requirements.

It is also important to evaluate whether AI is the right answer for the problem at hand, and to align the various teams, from data engineering to model development, deployment, and monitoring, in order to achieve the desired outcome (there is a discussion on this in this podcast).

Adoption of AI/ML Across Industries

The panelists also shared insights on AI/ML adoption within their own industries and others they had experience with. Estelle observed that industries like insurance are only now starting to adopt AI, having previously focused on traditional number-crunching. Harishma remarked on the rapid changes and the necessity for education and managing expectations regarding AI capabilities in verticals like inspection. Danishta noted that while the speech industry is progressing rapidly, adoption varies across sectors like healthcare, media, and the arts, often due to concerns about factual accuracy and persona integrity. Nairwita mentioned increased awareness and demand for AI products, though some industries are still grappling with data management and readiness.

Predictions and Job Impacts

Lastly, the panelists shared their predictions about AI's impact on jobs. Nairwita believes that AI won't take away jobs but will require evolving skillsets. Danishta concurred and emphasized that those who don't adapt to using AI will be left behind. Estelle compared AI to the internet or the phone: an essential tool that enhances productivity and that people come to depend on for a potentially better life. Harishma suggested that while AI is ultimately a tool for effectiveness, its disruptive potential varies by industry and cannot be ignored.

Key Takeaways

Consistency in Fundamentals: Despite the rapid evolution in AI technologies, the core principles such as the importance of quality data, metrics, and optimization strategies remain unchanged. This consistency provides a solid foundation for innovation.

Shared Layers and Customization: Different industries share certain layers within the AI stack, while customization is essential to meet specific needs. Understanding these shared layers can help professionals transition across fields with greater ease.

Advancements in Tools and Techniques: The evolution from traditional methods to advanced generative models and multimodal AI highlights significant advancements in tools and techniques, making AI more accessible and effective across diverse applications.

Importance of Real-World Use Cases: Leveraging real-world examples and use cases is crucial for demystifying AI concepts and making them relatable. This approach helps bridge knowledge gaps and fosters a better understanding of AI's practical applications.

Educational Imperatives: There is a critical need for continuous education and awareness to manage expectations and address concerns about AI, such as biases and ethical implications. This education is essential for broader adoption and responsible use of AI technologies.

AI as a Tool, Not a Replacement: AI is seen as a powerful tool that can enhance productivity and effectiveness. While it may automate certain tasks, it is unlikely to replace jobs entirely. Instead, it will push skillsets to evolve and require expertise in applying AI effectively.

Taken together, these takeaways underscored the transformative potential of AI while providing actionable insights to help professionals navigate this rapidly evolving landscape. At ApertureData, we have been working on solving one of the most commonly mentioned problems above: efficient multimodal data management and search.
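
As a flavor of what that can look like, here is a minimal sketch of a combined metadata-plus-similarity query using the aperturedb Python client; the connection parameters, descriptor set name, property names, and query embedding are assumptions for illustration, and the exact client interface may differ across versions.

```python
# A minimal sketch, assuming the aperturedb Python client's Connector and JSON
# query interface; host, credentials, descriptor set, and property names below
# are placeholder assumptions.
import numpy as np
from aperturedb import Connector

db = Connector.Connector(host="localhost", user="admin", password="admin")

# Find the 5 stored embeddings closest to a query embedding, filtered by metadata.
query_embedding = np.random.rand(512).astype(np.float32)  # stand-in for a real embedding
query = [{
    "FindDescriptor": {
        "set": "product_embeddings",                       # assumed descriptor set name
        "k_neighbors": 5,
        "constraints": {"category": ["==", "footwear"]},   # assumed property
        "results": {"list": ["_distance", "product_id"]}
    }
}]

response, blobs = db.query(query, [query_embedding.tobytes()])
print(response)
```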

If you’re interested in learning more about how ApertureDB works, reach out to us at team@aperturedata.io. Stay informed about our journey by subscribing to our blog.

I want to acknowledge the insights and valuable edits from Danishta Sayed, Sonam Gupta, and Deniece Moxy.

