When we left Intel a few years ago, our plan for ApertureDB was to scale out the open-source VDMS code we started with, implement a very long list of features to enrich our API, and focus on some predefined ideas of what our customers and their deployments would look like (command-line scripts, anyone?). After all, as computer science researchers, we do a lot of application analysis and domain exploration, and read plenty of related work. But quite a few people behind quite a few successful startups rightfully told us to “know thy users”, and they meant not just that we zero in on the exact titles and application domains, but also that we clearly understand:
- how they write the code that they write
- how they deploy and access their software tools
- how they move their data around
- how their choice of tools affects other stakeholders
Those mentors couldn’t have given better advice.
Our primary vision, the challenges we have covered in our earlier blog, and the “why” that inspired us to set out to build ApertureDB have not changed. But thanks to our early adopters, our assumptions around what it takes to change data management habits have certainly changed.
Why Care About Visual Data Management?
Through hundreds of conversations across application domains like smart retail, e-commerce, visual inspection, medical imaging, and smart agriculture, we have uncovered some fascinating commonalities.
Visual AI tools and techniques are improving by leaps and bounds. Companies across these domains are not only collecting terabytes of images and videos, sometimes per day, but also training models to recognize objects of interest, using the data labeling, curation, MLOps, and model management tools at their disposal.
Figure 1: Regardless of the domain, ML tasks have a lot of expectations from the data layer but not a standard or simple way to achieve them.
However, regardless of the specific ML application, there remain complex and time-consuming tasks (shown in Figure 1) such as:
- Managing different modalities of data in “one” location
- Searching and visualizing complex data, at scale
- Managing datasets and annotations
- Debugging models by understanding the data they were trained on
- Automating the retraining of models on newer and varied data
- Running inference on large sets of real data
These are often beyond the reach of data science teams without the resources of companies like Google, Meta, or Netflix, due to challenges we have described in our previous blog.
ApertureDB: A purpose-built database that understands visual data and data science needs
The key to solving the challenges above has been to bring together the lessons of big data management and the unique access patterns of visual AI applications. We have achieved this by offering a specialized, purpose-built database, ApertureDB, that intrinsically understands visual data and data science requirements. ApertureDB exposes a unified API that allows data scientists working with visual data to manage and query all supported data types in a single easy-to-deploy, fully managed or self-hosted database service.
As shown in Figure 2, ApertureDB has always natively supported images and videos, along with the necessary pre-processing and augmentation operations. Given the key role application metadata plays in these applications, metadata as well as annotations are managed in an in-memory knowledge graph for fast and insightful queries. A graph database also allows for easy updates to the schema so that the metadata can evolve with the application. A final and critical dimension to enabling complex visual searches and building sophisticated visual AI pipelines is vector, or similarity, search. ApertureDB offers built-in similarity matching for high-dimensional feature vectors with native support for k-nearest-neighbor search.
Figure 2: ApertureDB encapsulates visual data and metadata together and exposes a unified API to AI pipelines and users.
As a result of unifying access to various data types, ApertureDB is well suited for ML pipelines, flexible to evolve as data and requirements change, performant due to tight internal integration between data subsystems, and scalable for production deployment.
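To make the similarity-search dimension concrete, here is a minimal k-nearest-neighbor sketch in plain Python. This is purely illustrative of the concept; it is not ApertureDB code, and a production vector index would use approximate search rather than a brute-force scan like this one.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(query, vectors, k=2):
    """Return the ids of the k feature vectors closest to the query."""
    scored = sorted(vectors.items(), key=lambda item: l2_distance(query, item[1]))
    return [vid for vid, _ in scored[:k]]

# Toy 3-dimensional "feature vectors" keyed by image id.
features = {
    "img1": [0.0, 0.0, 1.0],
    "img2": [0.9, 0.1, 0.0],
    "img3": [1.0, 0.0, 0.0],
}

print(k_nearest([0.9, 0.05, 0.0], features, k=2))  # -> ['img2', 'img3']
```

In a real deployment the vectors would be high-dimensional embeddings produced by a model, and the ids would link back to the images and metadata in the graph.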
Launching ApertureDB 2.0: Seamlessly Integrating with the Data Science and Engineering Ecosystem
The insights uncovered through customer deployments and our engineering efforts building on the architecture in Figure 2 have enabled us to launch ApertureDB 2.0. As Figure 3 shows, the ApertureDB ecosystem integrations make it easy for our users to transition their current workflows and start taking advantage of the benefits offered by our unified API. With ApertureDB 2.0, users have been able to build better pipelines to achieve their business goals.
Figure 3: ApertureDB can easily fit in the user ecosystem thanks to our collection of convenient tools, integrations, and interfaces.
In addition to integrating the various visual data types behind a unified, ML-friendly API, ApertureDB 2.0 addresses other important considerations when deploying a database, such as how easily it fits into the user’s ecosystem. The rest of this blog summarizes the key ApertureDB 2.0 features; we will cover the details in upcoming blogs.
Convenient UI for Dataset Exploration and Debugging
It’s very reasonable to want to understand what your data looks like, especially to know what to expect when training or debugging a model. Even for such a mundane task, we have heard quite a few painful stories of having to download data into local folders and hunt for the right viewers, particularly for augmented data. The problem is even worse with videos. But just like platform engineers look at logs to debug, data professionals need to find and look at their images and videos. That’s what prompted us to move beyond our command-line mode and build a graphical user interface for ApertureDB. The ApertureDB UI lets our users graphically search and navigate their visual datasets, along with all of their metadata, in a single place.
Simple and Consistent API
We want our users to have a unified system to manage their various data types. In order to support all of our users’ data requirements, ApertureDB uses a query engine, or orchestrator, to redirect user queries to the right internal components, collect the results, and return a coherent response to the user, all via a unified JSON-based native API. ML pipelines and end users can then execute queries to add, modify, or search visual data and metadata, annotations, or feature vectors, perform on-the-fly visual preprocessing, and do more ML tasks like data snapshots. In fact, to simplify the life of our users even further, we are now working on a Python SDK that abstracts common tasks, such as searching by labels or managing datasets, into simple Python calls. We also offer a REST API that can easily be integrated with labeling frameworks or your in-house web frontends.
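To give a flavor of a JSON-based query, here is a sketch in Python of what a single image search with a metadata constraint and an on-the-fly operation might look like. The command and field names here are illustrative assumptions in the style described above, not a definitive reference for the actual API.

```python
import json

# Illustrative sketch of a JSON query: find images matching a metadata
# constraint, resize them server-side, and return selected properties.
# Command and field names are assumptions for illustration only.
query = [
    {
        "FindImage": {
            "constraints": {"camera_id": ["==", "cam-42"]},              # metadata filter
            "operations": [{"type": "resize", "width": 224, "height": 224}],  # preprocessing
            "results": {"limit": 10, "list": ["camera_id", "timestamp"]},
        }
    }
]

print(json.dumps(query, indent=2))
```

Because the whole query is plain JSON, the same structure can be built programmatically by an ML pipeline or typed by hand while debugging.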
Sophisticated Video Support
Videos can make data management and AI particularly nasty to deal with. When storing large videos, end users often have to split them into snippets for scalability reasons, sometimes causing an interesting event to span multiple snippets. You need a large enough machine to process large amounts of video. People sometimes need to store key frames separately, increasing the storage footprint. And of course, it is not always easy to query, visualize, or debug videos. One of the best features of ApertureDB is its video API and visualization capabilities. You can not only search for the videos you need based on application metadata (e.g. find all videos collected by camera X in store Y yesterday), but also ask for thumbnails to be generated on the fly, sample the videos to return clips or frames, and set up training or inference pipelines directly with the ML framework of your choice.
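As a hypothetical sketch of the camera/store example above, the query below asks for videos matching two metadata constraints and requests a sampled clip rather than the full video blob. As with the earlier example, the command and field names are assumptions for illustration, not the exact schema.

```python
# Hypothetical metadata-driven video query: filter by camera and store,
# and sample a clip server-side instead of fetching the whole video.
# Names and fields are illustrative assumptions, not the real schema.
find_clips = [
    {
        "FindVideo": {
            "constraints": {
                "camera": ["==", "X"],
                "store": ["==", "Y"],
            },
            # Ask the server to return only a sampled interval of frames.
            "operations": [{"type": "interval", "start": 0, "stop": 120}],
            "results": {"list": ["camera", "store", "timestamp"]},
        }
    }
]

def commands_in(query):
    """List the top-level command names in a query."""
    return [name for cmd in query for name in cmd]

print(commands_in(find_clips))  # -> ['FindVideo']
```

The point of pushing sampling and thumbnail generation to the server is that the client never has to download or decode full videos just to look at a few frames.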
Parallel Loaders for Fast Data Ingestion
As we started deploying earlier versions of ApertureDB with our early users in the smart retail and e-commerce spaces (we will address use cases in another blog soon), we were reminded of how important and complex data ingestion is to an ML pipeline, especially for images and videos. We have developed several enhancements that significantly improve data ingestion at scale while simplifying the process of loading metadata and binary data together. In fact, we also provide bindings to ingest data directly from an existing database.
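The general idea behind a parallel loader can be sketched in a few lines: split the records into batches and issue batch inserts from a pool of workers. This is a generic illustration of the pattern, not ApertureDB's actual loader implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def batches(records, size):
    """Yield successive fixed-size batches from a list of records."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

def ingest_batch(batch):
    # Stand-in for a real insert call that would send metadata plus
    # binary image/video data to the database in one round trip.
    return len(batch)

def parallel_ingest(records, batch_size=4, workers=3):
    """Ingest batches concurrently and return the number of records loaded."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(ingest_batch, batches(records, batch_size)))

print(parallel_ingest([{"file": f"img_{i}.jpg"} for i in range(10)]))  # -> 10
```

Batching amortizes per-request overhead, while the worker pool keeps the network and the database busy; the interesting engineering is in error handling, retries, and keeping metadata and binary blobs consistent.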
Ingestion and Support for Complex Annotations
Annotations, from simple labels to complex regions of interest, are usually among the metadata that our users require for their ML workflows. They either use third-party labelers (common for simple objects like cars, people, boxes, and furniture) or in-house experts (often needed for complex cases like medical artifacts and manufacturing defects). Regardless of the source, it remains an important and often unsolved problem to make these annotations searchable and keep them associated with the original data, along with other requirements like detecting overlapping annotations and evaluating annotation quality. ApertureDB supports annotations through our API and stores them as part of the metadata to meet these requirements. In fact, we introduced the ability to visualize these annotations on the original data when our users asked for it for visual debugging use cases.
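One of the requirements mentioned above, detecting overlapping annotations, is commonly done with intersection over union (IoU). Here is a minimal sketch; the `(x, y, w, h)` box format is an assumption for illustration and not tied to any particular annotation schema.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h).

    The box format is an assumed convention for this illustration.
    Returns 0.0 when the boxes do not overlap.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix = max(ax, bx)
    iy = max(ay, by)
    ix2 = min(ax + aw, bx + bw)
    iy2 = min(ay + ah, by + bh)
    inter = max(0, ix2 - ix) * max(0, iy2 - iy)
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 10, 10)))  # 25/175, roughly 0.143
```

A quality check might flag pairs of annotations on the same image whose IoU exceeds some threshold, or compare a labeler's boxes against an expert's.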
Dataset Management and Loaders for ML Frameworks
A database for AI, while essential, is only useful if it can integrate with model training and inference frameworks. We provide dataset loaders for frameworks like PyTorch and TensorFlow that automatically manage the complexity of fetching training or classification data at scale; users just need to specify queries to find the right data. In fact, our flexible metadata schema makes it possible to do iterative model tuning and enrich the existing metadata with inference results that can later be used to search for and find business-relevant information.
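The shape of a query-driven dataset wrapper can be sketched in framework-agnostic Python (the real PyTorch and TensorFlow loaders are part of ApertureDB's tooling; this is only an illustration of the pattern). Any object with `__len__` and `__getitem__` can be fed to a PyTorch `DataLoader`, for example.

```python
class QueryDataset:
    """Map-style dataset backed by the results of a metadata query.

    `records` stands in for metadata rows returned by a query, and
    `fetch_blob` for a callable that loads the binary image data.
    Both names are assumptions for this sketch.
    """

    def __init__(self, records, fetch_blob):
        self.records = records
        self.fetch_blob = fetch_blob

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        rec = self.records[idx]
        return self.fetch_blob(rec["id"]), rec["label"]

# Toy usage with an in-memory blob store standing in for the database.
blobs = {"a": b"\x00\x01", "b": b"\x02\x03"}
ds = QueryDataset(
    [{"id": "a", "label": 0}, {"id": "b", "label": 1}],
    fetch_blob=blobs.__getitem__,
)
print(len(ds), ds[1])
```

The useful property is that changing the training set is just changing the query; no files need to be copied or re-organized on disk.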
Enterprise Ready
Imagine having to maintain production features even though your primary job is to surface insights from data. Data science teams often have to wrestle with infrastructure and software quality requirements when deploying their innovations. In production, security and privacy requirements become as important as performance, scalability, reliability, and efficiency. With our Intel origins, performance and scalability were always baked into ApertureDB. Based on feedback from our users, we have now added the essential security, privacy, and auditing support required when managing visual data at scale. In short, ApertureDB removes technical debt, significantly lowering maintenance and subscription costs.
What’s Next?
A purpose-built system can really simplify data pipelines and allow teams to focus on machine learning and data understanding. That’s what we have seen from our customers, ranging from startups to Fortune 50 companies.
As for where we are going next, we are aiming to support more and more users in their visual AI journey through our data infrastructure capabilities, large scale video analytics, finer grained access control capabilities, and much more.
If your organization uses or intends to use ML on visual data (with a small or large team), or you are simply curious about our technology, our approach to infrastructure development, or where we are headed, please contact us at team@aperturedata.io or try out our online trial. If you’re excited to join an early-stage startup and make a big difference, we’re hiring. Last but not least, we will be documenting our journey and explaining all the components listed above on our blog; subscribe here.
I want to acknowledge the insights and valuable edits from Priyanka Somrah, Steve Huber, Josh Stoddard, and Luis Remis.