In our previous blog, we covered the need to define infrastructure in a cloud-agnostic manner. In this article, we’ll dive deeper into the core challenges posed by designing cloud-agnostic infrastructure. Because we ran into many of these challenges while building ApertureDB, a cloud-agnostic database specifically built for multimodal data and metadata, we’ll also add some color from our learnings along the way.
Challenge #1: How to handle distributed deployments
We knew from the start that building a modern database for advanced AI applications meant that high levels of scalability, availability, and performance were going to be table stakes for small startups and large enterprises alike. That’s why one of the goals for ApertureDB was to efficiently manage distributed deployments.
But when you’re building a cloud-agnostic database or similar infrastructure software, handling distributed deployments requires careful planning to ensure portability, consistency, and performance across cloud platforms.
Kubernetes as the key
Kubernetes is an open-source platform that makes it easy to deploy and scale containerized applications, such as a database instance deployed via a Docker image. For example, you could choose a MongoDB database, containerize the database using Docker images, and then write Kubernetes YAML manifests to deploy the container. Kubernetes then makes it easy to horizontally scale the MongoDB database by adding new “pods,” to monitor the pods’ performance and logs, and to manage database upgrades. Essentially, Kubernetes handles a lot of the heavy lifting required in application management.
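As a rough illustration of that heavy lifting (a minimal sketch, not production deployment code; the `mongodb` Deployment name and `app=mongodb` label are assumptions for this example), here is how the official Kubernetes Python client can scale a Deployment and pull pod logs:

```python
# pip install kubernetes
from kubernetes import client, config

# Load credentials from the local kubeconfig (e.g., ~/.kube/config).
config.load_kube_config()

# Horizontally scale a hypothetical "mongodb" Deployment to three pods;
# Kubernetes schedules the extra pods and maintains the desired count.
apps = client.AppsV1Api()
apps.patch_namespaced_deployment_scale(
    name="mongodb",
    namespace="default",
    body={"spec": {"replicas": 3}},
)

# Monitor the pods by reading recent logs from the first matching pod.
core = client.CoreV1Api()
pods = core.list_namespaced_pod(namespace="default", label_selector="app=mongodb")
if pods.items:
    print(core.read_namespaced_pod_log(pods.items[0].metadata.name,
                                       namespace="default", tail_lines=20))
```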
Given Kubernetes' status as the most popular open-source container orchestration platform, it’s vital to understand how it integrates with your chosen cloud provider. Most cloud providers offer managed Kubernetes services, and Kubernetes also runs well in private data centers. This is why we chose to package ApertureDB’s components into Docker containers, making them easy to deploy in Kubernetes clusters using Terraform or Helm chart configurations.
ApertureDB's cloud-agnostic architecture
At its core, ApertureDB is a set of Docker images deployed in a Kubernetes environment via Terraform (or a Helm chart). ApertureDB implements some modules differently depending on the cloud provider, such as the object store or file-system interaction layer. Written in Terraform, these implementations expose standardized interfaces to ApertureDB’s core module, making it easy to plug and play with any cloud provider. This modular approach ensures that ApertureDB can seamlessly integrate with any cloud provider offering Kubernetes, including AWS, Azure, and GCP.
By building the foundation of your data layer on Kubernetes, you can reuse a significant portion of your deployment code across different cloud providers. This strategy not only simplifies the deployment process but also enhances the flexibility and portability of your system, ensuring that you can easily adapt to any cloud environment.
Challenge #2: How to scale storage capacity
Another goal we had when building a cloud-agnostic database was to be able to scale storage capacity while continuing to work across cloud platforms. Beyond the typical considerations, like which data APIs are supported and how storage fees are calculated, there are additional aspects to consider. For example, what happens when you start operating with multiple simultaneous cloud object stores? How do you navigate each store’s permission structures, throughput characteristics, and file systems?
As we built ApertureDB, we realized that the complexity of storage-agnostic cloud deployment was begging for a simple, high-performance abstraction layer. By connecting to each major cloud provider's object store through its SDK, and building a storage abstraction into our own server that could take advantage of provider-specific configurations and optimizations, we were able to simplify storage management across clouds.
Challenge #3: How to standardize the unstandardized
Designing for multi-cloud support meant we had to standardize our storage and load-balancer interfaces to work with any cloud provider, which we detail below.
Storage interface
ApertureDB’s standardized storage interface is designed to function uniformly regardless of the cloud provider chosen by the user. This interface accepts the same inputs and produces consistent outputs across different platforms, making integration straightforward and hassle-free.
Key Elements:
The primary inputs for the storage interface are the bucket name and the object name, plus a way to specify credentials that is handled independently of the storage interface, since credential management varies across cloud providers. These credentials ensure secure, authenticated access to the data. By abstracting these details, ApertureDB simplifies interaction with various cloud storage services, such as Amazon S3 buckets and Google Cloud Storage (GCS) buckets, and can even be used with a POSIX-compliant filesystem.
This standardization allows ApertureDB to provide a consistent storage experience, whether the user is operating on AWS, GCP, or any other supported cloud platform.
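As a minimal sketch of what such an interface can look like (the class and method names below are our own invention for illustration, not ApertureDB’s actual code), each backend accepts the same bucket and object-name inputs and returns the same bytes, while credentials are resolved by each SDK outside the interface:

```python
# pip install boto3 google-cloud-storage
from abc import ABC, abstractmethod
from pathlib import Path

class ObjectStore(ABC):
    """Uniform interface: same inputs (bucket, object name), same outputs (bytes)."""

    @abstractmethod
    def get(self, bucket: str, name: str) -> bytes: ...

    @abstractmethod
    def put(self, bucket: str, name: str, data: bytes) -> None: ...

class S3Store(ObjectStore):
    def __init__(self):
        import boto3
        # boto3 resolves credentials from the environment or instance role,
        # independently of this interface.
        self._s3 = boto3.client("s3")

    def get(self, bucket: str, name: str) -> bytes:
        return self._s3.get_object(Bucket=bucket, Key=name)["Body"].read()

    def put(self, bucket: str, name: str, data: bytes) -> None:
        self._s3.put_object(Bucket=bucket, Key=name, Body=data)

class GCSStore(ObjectStore):
    def __init__(self):
        from google.cloud import storage
        # Uses GCP Application Default Credentials, again outside the interface.
        self._client = storage.Client()

    def get(self, bucket: str, name: str) -> bytes:
        return self._client.bucket(bucket).blob(name).download_as_bytes()

    def put(self, bucket: str, name: str, data: bytes) -> None:
        self._client.bucket(bucket).blob(name).upload_from_string(data)

class PosixStore(ObjectStore):
    """A 'bucket' is simply a directory on a POSIX-compliant filesystem."""

    def __init__(self, root: str):
        self._root = Path(root)

    def get(self, bucket: str, name: str) -> bytes:
        return (self._root / bucket / name).read_bytes()

    def put(self, bucket: str, name: str, data: bytes) -> None:
        path = self._root / bucket / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
```

Callers depend only on the common interface, so swapping AWS for GCP (or a local filesystem) becomes a one-line change at construction time.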
Load-balancer interface
Similarly, ApertureDB’s load-balancer interface is standardized to operate uniformly across different cloud providers. This interface ensures that the necessary inputs are compatible with any environment, facilitating seamless load balancing and high availability. We chose to implement our own load balancer because each cloud provider imposes different limitations, which created scale-out hurdles when dealing with large objects, something ApertureDB does regularly given its support for image and video data.
Key Elements:
- Inputs: The load-balancer interface requires two primary inputs:
  - Kubernetes Node Group: This defines the group of nodes in the Kubernetes cluster that will handle the load.
  - Kubernetes Node Port Details: These include ApertureDB’s TCP port and the HTTP/HTTPS ports of ApertureDB’s internal application load balancer.
By using these standardized inputs, ApertureDB can configure load balancing appropriately for each cloud provider, based on its respective load-balancing method: a Network Load Balancer (NLB) on AWS and global forwarding rules on GCP.
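To make that dispatch concrete, here is a toy sketch under our own naming, not ApertureDB’s actual Terraform; the node-group name and port numbers are illustrative. The two standardized inputs are mapped onto each provider’s load-balancing primitive:

```python
from dataclasses import dataclass

@dataclass
class NodePorts:
    tcp: int    # ApertureDB's TCP port
    http: int   # HTTP port of the internal application load balancer
    https: int  # HTTPS port of the internal application load balancer

@dataclass
class LoadBalancerSpec:
    node_group: str  # Kubernetes node group that will handle the load
    ports: NodePorts

def render_lb_config(provider: str, spec: LoadBalancerSpec) -> dict:
    """Map the standardized inputs onto a provider-specific load-balancing method."""
    if provider == "aws":
        # AWS: a Network Load Balancer (NLB) in front of the node group.
        return {
            "resource": "aws_lb",
            "type": "network",
            "target_group": spec.node_group,
            "listeners": [spec.ports.tcp, spec.ports.http, spec.ports.https],
        }
    if provider == "gcp":
        # GCP: global forwarding rules pointing at the node group's backend.
        return {
            "resource": "google_compute_global_forwarding_rule",
            "backend": spec.node_group,
            "ports": [spec.ports.tcp, spec.ports.http, spec.ports.https],
        }
    raise ValueError(f"Unsupported provider: {provider}")

# The same spec renders differently per provider (port values illustrative).
spec = LoadBalancerSpec(node_group="aperturedb-nodes", ports=NodePorts(55555, 80, 443))
print(render_lb_config("aws", spec))
print(render_lb_config("gcp", spec))
```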
Ensuring feature parity in your cloud-agnostic architecture
If you’re worried about feature parity across platforms, it’s important to note that most major cloud providers cover much of the same ground and offer very similar feature sets, resource types, and tools. Take a single feature at random—say, container registries for storing container images. Amazon offers its ECR, Google its Artifact Registry, and Azure its ACR. ApertureDB users already deploy or are in the process of deploying across all three of these providers.
Lessons learned
In today's dynamic landscape, where you might need features from OpenAI in Azure, Gemini in GCP, or Anthropic in AWS, staying cloud-agnostic in your build is a smart, strategic choice that preserves optionality. But architecting a product to support that is hard: you have to optimize performance, cost, and ease of use, and you need scalable elements that won’t lock you into nightmare tech debt in a year’s time.
The most important thing when building or choosing a tool like ApertureDB for multimodal data management is to understand exactly how the tool can plug into other pieces of a product’s architecture. Choosing a cloud provider is just one of many elements involved in architecting a product.
We thought we’d close out with some of the key lessons learned along the way while building our own cloud-agnostic database:
- Choose flexible software: picking tools that work across cloud providers to form your fundamental infrastructure layers will reduce the burden down the line.
- Prioritize resource planning: even with cloud-agnostic tools, there are always configurations that can throw you off when deploying to a new cloud provider. The cost of machine resources, mapping instance types with equivalent performance across providers, and the time it takes your team to actually deploy on different clouds can add months to your project.
- Don’t underestimate the learning curve: working across cloud providers is no small task and can add weeks to your timeline, unless you build a team that has already worked in multi-cloud environments.
By choosing software that offers maximum flexibility and scalability, you can build a robust, cloud-agnostic architecture. If you work with multimodal data and want to see how ApertureDB can simplify these challenges, reach out to us at team@aperturedata.io, and stay informed about our journey by subscribing to our blog.
I want to acknowledge the insights and valuable edits from JJ Nguyen, Ali Asadpoor, and Ian Yanusko.