High Plains Computing (HPC) Generative AI Starter is a great way to start an enterprise-wide generative AI initiative and set up your own LLM for customized training and inference work. Stop paying thousands of dollars to OpenAI to fine-tune models on your documents or to generate embeddings for your document repositories. Training and querying costs become significant if you have a large collection of private documents and multimedia content. Fine-tuning OpenAI's GPT-3.5 model on a 1 GB repository of textual data can easily cost over $10K, and querying the fine-tuned model costs extra.
Instead, launch your own copy of an open-source LLM inside your private Amazon Virtual Private Cloud (VPC) on Amazon Web Services (AWS). All your cloud applications, on-premises applications, and users can use it, at a cost thousands of dollars lower.
All training data will stay private in your own AWS account.
Introducing Clement, the AI Starter
Generative AI Starter is a large language model (LLM) project starter that lets enterprise customers train and use their own model instead of relying on ChatGPT or another API-based, closed-source model.
It sets up any large model that is challenging to run in a local environment, e.g., the Falcon 7B/40B large language model, on Amazon Elastic Kubernetes Service (Amazon EKS). These models are large enough that they are tough to run locally, even for inference, and a training job needs more GPUs than a desktop-class machine can hold. Scaling clusters up and down on demand, and deleting and recreating cluster nodes as needed, saves substantial cost in generative AI development work. HPC provides the automation and self-service to do that job.
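To see why models of this size are hard to run locally, a rough estimate of the memory needed just to hold the weights can help. The sketch below is a back-of-the-envelope calculation, not a sizing tool: it assumes nominal parameter counts of 7 billion and 40 billion and the usual storage widths of 4, 2, and 1 bytes per parameter, and it ignores activations, caches, and optimizer state.

```python
# Back-of-the-envelope estimate of the memory needed to hold model weights.
# Assumptions (not from the product description): nominal parameter counts
# of 7e9 and 40e9, and storage widths of 4, 2, and 1 bytes per parameter.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}
NOMINAL_PARAMS = {"Falcon 7B": 7_000_000_000, "Falcon 40B": 40_000_000_000}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Print one line per model and precision.
    for name, params in NOMINAL_PARAMS.items():
        for dtype in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, dtype)
            print(f"{name} in {dtype}: {gb:.0f} GB")
```

Under these assumptions, the 40B model needs about 80 GB in fp16 for the weights alone, which is more than consumer GPUs commonly offer. Training needs considerably more, since gradients and optimizer state must also fit in memory.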
You can use the Clement UI to set up an Amazon EKS cluster and deploy training and inference jobs on it.
- Setup is done in the client's AWS account or in an HPC-managed client account
- We deploy the Amazon EKS cluster to run both model training and inference jobs
- The setup process installs a secure API endpoint for everyday tasks such as prompting and text completion, chat, question answering, text classification, entity detection, and text summarization
- The service endpoint can be secured using an OpenID/OAuth token and protected by a web application firewall
- The cluster comes fully configured with jobs to fine-tune on client-specific data sets and to create a low-rank model to save cost
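The cost saving from a low-rank model can be illustrated with a simple parameter count. The sketch below assumes that "low-rank model" refers to a LoRA-style adapter, where a weight matrix W is left frozen and only two small factors B and A (with update B @ A) are trained. That reading, the layer size, and the rank are assumptions chosen for illustration, not details of Clement's jobs.

```python
# Illustrative parameter count for a low-rank (LoRA-style) update of one matrix.
# Assumption: the adapter has the form W + B @ A, with B of shape (d_out, r)
# and A of shape (r, d_in). The sizes used below are hypothetical.


def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is fine-tuned."""
    return d_out * d_in


def low_rank_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters when only the rank-r factors B and A are trained."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 8  # hypothetical layer size and rank
    full = full_params(d_out, d_in)
    low = low_rank_params(d_out, d_in, rank)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"rank-{rank} adapter: {low:,} trainable parameters")
    print(f"fraction trained: {low / full:.2%}")
```

For this hypothetical layer, the adapter trains well under 1% of the parameters that full fine-tuning would. Fewer trainable parameters generally means less GPU memory for gradients and optimizer state, which is where the saving comes from.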
Setup of Clement is free. HPC provides MLOps support plans ranging from $1,000 to $24,000 per month, with a dedicated team of three ML developers and one MLOps engineer for your generative AI projects.