OpenWebUI + Ollama: Your Private AI Powerhouse

$0.45 – $30.38 / Hour

This instance combines the robust capabilities of OpenWebUI, an extensible and user-friendly AI interface, with Ollama, a powerful local LLM runner, to create a complete solution for running and managing Large Language Models (LLMs) in a private, controlled environment.


Select the required level of Performance

$0.45 / Hour 
$0.51 / Hour 
$0.60 / Hour 
$0.60 / Hour 
$0.90 / Hour 
$1.16 / Hour 
$6.83 / Hour 
$7.08 / Hour 
$30.38 / Hour 

Description

Core Features and Benefits:

[Screenshot of OpenWebUI]

  1. Complete Control Over Your AI:

  • Run various open-source LLMs privately on your dedicated GPU instance
  • Full data sovereignty with no external API dependencies
  • Customizable model configurations and parameters
  • Support for multiple popular models including Llama, Gemma, and Mistral
  2. User-Friendly Interface:

  • Intuitive chat interface through OpenWebUI
  • Easy model management and switching
  • Built-in RAG (Retrieval-Augmented Generation) capabilities
  • No command-line expertise required
  • Multi-chat session support
  3. Flexible GPU Options: We offer a range of GPU configurations to match your specific needs:

  • Entry-level: RTX A4000 (16GB) for basic LLM operations
  • Mid-range: RTX A5000 (24GB) / RTX 4090 (24GB) for improved performance
  • High-performance: RTX A6000 (48GB) for handling larger models
  • Enterprise-grade: H100 configurations (up to 4x 80GB) for maximum performance and multi-model operations
  4. Perfect For:

  • AI researchers requiring private model experimentation
  • Developers building AI-powered applications
  • Organizations needing secure, offline AI capabilities
  • Teams requiring collaborative AI workspace
  • Projects requiring fine-tuned or custom models
  5. Technical Advantages:

  • Built-in inference engine for RAG
  • Support for model fine-tuning
  • Extensible plugin system
  • Concurrent execution of multiple models
  • Easy model downloading and management
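When matching the GPU tiers listed above to a model, a common rule of thumb (a ballpark approximation, not a guarantee from this platform) is the quantized weight size plus roughly 20% headroom for the KV cache and activations:

```python
def vram_estimate_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough VRAM (GB) needed to run a model: quantized weight size
    plus ~20% headroom for KV cache and activations. A ballpark only."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8-bit ~= 1 GB
    return weight_gb * overhead

# 7B at 4-bit -> ~4.2 GB: comfortable on an RTX A4000 (16GB)
# 70B at 4-bit -> ~42 GB: RTX A6000 (48GB) or H100 territory
print(vram_estimate_gb(7), vram_estimate_gb(70))
```

Higher-precision weights (8- or 16-bit) and long context windows push these numbers up, so leave extra margin when choosing a tier.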

Our combination of OpenWebUI and Ollama, powered by enterprise-grade GPUs, provides you with a complete solution for running sophisticated AI models in a private, controlled environment. Whether you’re running basic chatbots or complex AI applications, our platform provides the flexibility, power, and ease of use you need.


How to Use?

  1. Initial Setup Process:

  • After deploying your instance, an automated setup process begins that downloads and configures both OpenWebUI and Ollama
  • The initial setup typically takes around 8 minutes to complete
  • Setup time may vary depending on server internet speed and the size of applications/models being deployed
  • You can monitor progress in the “Instance Status” section by refreshing the page
  • Note: Because we respect user privacy and don’t access your instance directly, the setup time above is an estimate
  2. Accessing OpenWebUI:

  • Once setup is complete, click the “Launch OpenWebUI” button to open the interface in a new tab
  • Your instance comes pre-loaded with Qwen 2.5 (1.5B parameters), a lightweight but capable model to get you started immediately
  3. Adding New Models: To add additional models to your instance:

  • Navigate to Settings > Admin Settings > Models tab
  • In the “Pull a model from Ollama.com” section, enter the model name you wish to download
  • Click the download button to begin the model installation
  • A progress indicator will show the download and setup status
  • Once complete, return to the home screen
  • Select your newly downloaded model from the model selector dropdown at the top
  4. Using Models:

  • Start a “New Chat” to begin interacting with your selected model
  • You can switch between different models using the dropdown menu at the top of the chat interface
  • Each model maintains separate chat histories
  • You can create multiple chat sessions with different models running simultaneously
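For scripted setups, the “Pull a model from Ollama.com” step above can also be issued against Ollama’s local REST API (port 11434 by default; the endpoint and field names follow Ollama’s public API docs — treat them as assumptions if your version differs). This sketch only builds the request, so nothing is downloaded:

```python
import json
import urllib.request

OLLAMA = "http://localhost:11434"  # Ollama's default local API address

def pull_request(name: str) -> urllib.request.Request:
    """Build (but do not send) the POST that asks Ollama to download a
    model -- the same action as the "Pull a model from Ollama.com" field."""
    body = json.dumps({"name": name}).encode()
    return urllib.request.Request(
        f"{OLLAMA}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = pull_request("mistral")
print(req.full_url)
```

Sending it with `urllib.request.urlopen(req)` streams download-progress JSON; the same request pattern works against `/api/chat` to drive an installed model programmatically.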

Tips:

  • Start with the pre-installed Qwen 2.5 model while downloading larger models in the background
  • Check the Ollama model library for compatible models and their requirements
  • Consider your GPU specifications when choosing models to download
  • Keep track of your storage usage when downloading multiple models
  • You can delete unused models through the Models tab to free up space
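To act on the storage tip above, you can total what is on disk; Ollama stores pulled models under `~/.ollama/models` by default (the path is an assumption — it can differ per install):

```python
import os

def dir_size_gb(path: str) -> float:
    """Sum the size of all files under `path`, in decimal GB.
    Returns 0.0 if the directory does not exist."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):  # skip dangling symlinks
                total += os.path.getsize(fp)
    return total / 1e9

# Default Ollama model directory (adjust if your install differs)
models_dir = os.path.expanduser("~/.ollama/models")
print(f"{dir_size_gb(models_dir):.1f} GB of models on disk")
```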

Remember: Your instance runs completely privately – all data and interactions remain within your dedicated environment. The system is designed to be user-friendly while providing the flexibility to run various AI models according to your needs.

 
