As you may know, Calibo empowers developers to create UIs and APIs in just minutes, with seamless deployment to multiple platforms including EC2, Kubernetes, and OpenShift.
In this blog post, we’ll walk you through the steps to create a React web app and a chatbot widget, along with an API that loads Meta’s Llama2 model. We’ll also show you how to enhance your chatbot by adding contextual understanding using an in-memory embedding database.
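To make the "contextual understanding" part concrete: an in-memory embedding database boils down to storing (text, vector) pairs and ranking them by cosine similarity against the query's vector. The sketch below is illustrative only, using hand-written toy vectors; in a real chatbot you would obtain the vectors from an embedding model (which one is your choice, the post doesn't prescribe a specific library).

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class InMemoryEmbeddingDB:
    """Tiny in-memory vector store: add (text, vector) pairs, query by vector."""
    def __init__(self):
        self.items = []  # list of (text, vector) tuples

    def add(self, text, vector):
        self.items.append((text, vector))

    def top_k(self, query_vector, k=1):
        # Rank stored texts by similarity to the query vector, best first.
        ranked = sorted(self.items, key=lambda it: cosine(it[1], query_vector), reverse=True)
        return [text for text, _ in ranked[:k]]

# Toy usage with hand-made 3-d vectors; a real app would call an embedding model.
db = InMemoryEmbeddingDB()
db.add("Calibo deploys to EC2 and Kubernetes.", [0.9, 0.1, 0.0])
db.add("Llama2 is a large language model.", [0.1, 0.9, 0.1])
print(db.top_k([0.0, 1.0, 0.0], k=1))  # → ['Llama2 is a large language model.']
```

The retrieved snippets would then be prepended to the user's question before it reaches the LLM, giving the chatbot context it wasn't trained on.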
By the end of this guide, you’ll have a solid grasp of how to leverage Calibo’s powerful features to streamline your development and deployment processes across various environments in no time.
In a previous post, I demonstrated how to use Calibo to deploy APIs and UIs to a Kubernetes cluster. Today, we’re building on that by reusing the UI component from the earlier tutorial and taking it a step further to deploy your own large language model (LLM), specifically Meta’s Llama2.
Given Llama2’s hardware requirements, and to highlight EC2’s deployment capabilities, we’ll be using EC2 for this example.

Check out the deployment diagram below, and then follow the steps to deploy your own AI model with Calibo.
Let’s dive in!
First, you’ll need to create your new product in your digital portfolio. The screenshot below shows the product name and the phases that have been defined. In this post, we’ll be focusing on the development and deployment phases.
The next step involves setting up the policy template for your new product. While configuring the policy template is beyond the scope of this post, it’s important to note that this is a one-time setup. For this use case, we’ve configured the template (calibo-all-v3) to include React, FastAPI, AWS EC2, and Kubernetes deployments.
With this setup, these technologies are readily available for your developers, along with all the environments where these workloads need to be deployed within your organization.
Now, let’s create a feature within the product we set up in the previous steps. This feature will house the UI and API that we’ll be implementing.
The feature has been created. You can now click on it to open a panel that displays Calibo’s 4Ds: Define, Design, Develop, and Deploy. To continue building our chatbot, click on Develop.
Now that you’re within the feature, you can add new technologies. For our use case, we will be adding React 18.
You can select and configure multiple technologies at once. Next, select FastAPI, then click the “Add” button to configure both technologies.
After clicking the “Add” button, you will be prompted to name your workloads and provide names for your repositories.
Calibo will then create these repositories and initialize them with Calibo code, including the CI/CD configurations needed to deploy these workloads into EC2 instances and Kubernetes (which will be covered in the following steps).
This means you only need to focus on developing the API service and the UI.
You are now ready to start coding your workloads. When you expand one of your technologies, you’ll see options like in the screenshot below.
For example, with the React UI technology expanded, you have two choices: clone the created `chatbot-llama2-ui` repository or start coding directly within Calibo.
You can apply the same process to the API, where both options are also available. To view your changes in an environment created by Calibo, you need to configure the deployment of these technologies to AWS.
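Before moving on to deployment, one detail worth knowing when coding the API: Llama2’s chat variants expect their input wrapped in a specific `[INST]` / `<<SYS>>` template. The helper below sketches that formatting for a single-turn request; the runtime you hand the prompt to (e.g. llama-cpp-python or transformers) and the endpoint shape are assumptions, not something Calibo prescribes.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt using Llama2's chat template.

    Inside the FastAPI service, the chatbot widget's message would be passed
    through this before being handed to the model runtime (llama-cpp-python,
    transformers, etc. -- the choice of runtime is up to you).
    """
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_llama2_prompt(
    "You are a helpful assistant embedded in a web chatbot.",
    "What platforms does Calibo deploy to?",
)
print(prompt)
```

The model’s completion follows the closing `[/INST]`, so the API can return the generated text directly to the chatbot widget.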
You can switch to the deployment view by clicking on the top navigation bar. Once on the deployment screen, select the Kubernetes deployment for the UI and add a new technology to the preconfigured cluster by clicking the “Add Technologies” button.
You will see the previously configured technologies appear. To configure the UI in the Kubernetes cluster, select `chatbot-llama2-ui` and click the “Add” button.
Next, select the branch, cluster namespace, and the path and port where the application will listen. Once you’ve made these selections, click the “Add” button to complete the setup.
The configuration is now complete, so it’s time to deploy it. Click the three-dot menu in the technology box and select “Deploy”. Wait until the status in the top right corner of the technology box turns green.
Next, let’s deploy the API on an EC2 instance. Click on the Docker Container tab and add a new instance. After adding the instance, deploy the API to it.
When configuring the instance, be sure to adjust the storage size. For the model we are deploying, you’ll need more than 13 GB for the model itself, plus around 10 GB for the operating system, so I recommend setting the storage to 30 GB. Adjust this size based on the specific requirements of your model.
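The sizing above is easy to sanity-check: model plus OS, with headroom for logs, package caches, and temp files (the ~30% headroom factor is my own rule of thumb, not a Calibo requirement).

```python
# Rough sizing for the EC2 root volume (illustrative numbers; adjust for your model).
model_gb = 13      # approximate size of the Llama2 weights being served
os_gb = 10         # operating system plus runtime dependencies
headroom = 1.3     # ~30% extra for logs, package caches, temp files

recommended_gb = (model_gb + os_gb) * headroom
print(f"Recommended volume: {recommended_gb:.0f} GB")  # → Recommended volume: 30 GB
```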
Once the instance is ready, the next step is to deploy the API onto it.
Add and configure the API, then deploy it in the same manner as the UI. Click the “Deploy” button in the technology’s three-dot menu to complete the process.
At this stage, both the API and UI are deployed and ready to handle requests. Calibo has efficiently managed all aspects of CI/CD, including environment setup and technical complexities.
By automating these processes, Calibo allows developers to focus entirely on coding, free from the burdens of manual deployment and configuration management. This streamlined approach not only boosts productivity but also enables teams to deliver applications more quickly and reliably.
In conclusion, Calibo’s comprehensive automation of CI/CD pipelines and environment configurations empowers developers to focus squarely on coding. By abstracting away the complexities of AWS EC2 and Kubernetes, developers can achieve their goals without needing specialized knowledge of these platforms.
The net effect is faster, more reliable application delivery, and development teams that can spend their time on business value rather than infrastructure.
Interested in the specifics of how Calibo works? Have a look at our factsheets here.