Hi all, i am quite an old fart, so i just recently got excited about self hosting an AI, some LLM…

What i want to do is:

  • chat with it
  • eventually integrate it into other services, where needed

I read about OLLAMA, but it’s all unclear to me.

Where do i start, preferably with containers (but “bare metal”) is also fine?

(i already have a linux server rig with all the good stuff on it, from immich to forjeio to the arrs and more, reverse proxy, Wireguard and the works, i am looking for input on AI/LLM, what to self host and such, not general selfhosting hints)

  • hendrik@palaver.p3x.de
    link
    fedilink
    English
    arrow-up
    12
    ·
    edit-2
    10 hours ago

    There’s another community for this: [email protected]
    Though we mostly discuss the news and specific questions there, beginner questions are a bit more rare.

    I think you already got a lot of good answers here, LMStudio, OpenWebUI, LocalAI…
    I’d like to add KoboldCpp that’s kind of made for gaming/dialogue, but it can do everything. And from my experience it’s very easy to set up and bundles everything into one program.

  • vane@lemmy.world
    link
    fedilink
    English
    arrow-up
    6
    ·
    9 hours ago

    You can host ollama and open-webui on container. If you want to wire search you can connect open-webui to playwright (also container) and searxng (also container) and llm will search the web for answers

  • splendoruranium@infosec.pub
    link
    fedilink
    English
    arrow-up
    8
    arrow-down
    1
    ·
    edit-2
    11 hours ago

    I read about OLLAMA, but it’s all unclear to me.

    There’s really nothing more to it than the initial instructions tell you. Literally just a “curl -fsSL https://ollama.com/install.sh | sh”. Then you’re just a “ollama run qwen3:14b” away from having a chat with the model in your terminal.
    That’s the “chat with it”-part done.

    After that you can make it more involved by serving the model via API, manually adding .gguf quantizations (usually smaller or special-purpose modified bootleg versions of big published models) to your Ollama library with a modelcard, ditching Ollama altogether for a different environment or, the big upgrade, giving your chats a shiny frontend in the form of Open-WebUI.

  • Mike Wooskey@lemmy.thewooskeys.com
    link
    fedilink
    English
    arrow-up
    4
    ·
    edit-2
    10 hours ago

    Sounds like you already know what you need to know to host Ollama in a Docker container. Ollama is an LLM “engine” - you can interact with LLM models via a CLI or you can integrate them into other services via an API.

    To have a web page chat like ChatGPT or others, I installed OpenWebU. I love it! A friend of mine likes LMStudio, which i think is a desktop app, but I don’t know anything about it.