Getting Started with Self Hosted LLM

Possibly linux@lemmy.zip · 8 months ago

Getting Started with Self Hosted LLM

hyperhypervisor@programming.dev · 8 months ago

There’s also llamafile, super simple: download and run it.

Possibly linux@lemmy.zip · 8 months ago

Not as cool or flexiable though

hyperhypervisor@programming.dev · 8 months ago

Iirc it can run anything llama.cpp can because it just uses that under the hood.

Possibly linux@lemmy.zip · 8 months ago

Except you can’t control it as easily. I like the UI and toolset of open web UI

hyperhypervisor@programming.dev · 8 months ago

Ok. I haven’t tried it so I’ll take your word for it. I’m just offering an easier alternative since the topic was “getting started”

Scrubbles@poptalk.scrubbles.tech · 8 months ago

Personally I’ve really enjoyed text-generation-webui. It made it really easy to ramp up and learn. Very cool stuff you got though, I’ll probably be looking at a comparison between them!

RedNight@lemmy.ml · 8 months ago

Ollama has been great for self-hosting, but also checkout vLLM as its the new shiny self-hosting toy

mozz@mbin.grits.dev · 8 months ago

Does it work out okay with 12 cores purely on CPU? About how fast is the interaction?

I played around a little with Ollama and gpt4all but it seemed to me like it wasn’t fast enough to be useful on pure CPU, but if I could just throw cores at it then I might revisit the issue.

Possibly linux@lemmy.zip · 8 months ago

It wasn’t usable a few months ago. However, when I setup ollama it was “fast” and it works ok. It takes anywhere from instant to 5min for responses. LLava seems to take the longest which makes sense. For llama2 it is fairly fast unless you ask it for obscure information.

thantik@lemmy.world · 8 months ago

The biggest thing that I want to learn is how to either A: add “tools” for the AI to run, or B: “fine-tune” the model by feeding it data that’s relevant to me.

Possibly linux@lemmy.zip · 8 months ago

You can teach it things and upload documents for it to process

thantik@lemmy.world · 8 months ago

Yeah, I couldn’t find that on ollama; but I did find it in text-generation-webui - which is a little more complicated, but for me, I think it might help springboard me into understanding a few more things.

Possibly linux@lemmy.zip · 8 months ago

Ollama is just a the backend. You need open web UI or a similar application to use it

Willdrick@lemmy.world · 8 months ago

Check AnythingLLM out, its just an appimage

Possibly linux@lemmy.zip · 8 months ago

Not as maintainable long term and it doesn’t have user management

Willdrick@lemmy.world · 8 months ago

There’s a dockerized version if you need those

https://github.com/Mintplex-Labs/anything-llm/blob/master/docker/HOW_TO_USE_DOCKER.md

Possibly linux@lemmy.zip · edit-2 8 months ago

So why is it better than OpenwebUI? It seems like each has there own use case.

I’ll give it a try just for fun but it doesn’t seem to be better as far as I can tell