Advice - Getting started with LLMs

its_me_xiphos@beehaw.org · 5 months ago

Zworf@beehaw.org · edit-2 5 months ago

Training your own model from scratch will be very difficult. You would need to gather an enormous amount of text just to get a model with basic language understanding.

What I would do (and am doing) is just take something like llama3 or mistral and add your own content using RAG (retrieval-augmented generation) techniques.
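
To make the RAG idea concrete, here's a minimal sketch of the retrieval step: pick the notes most relevant to a question and paste them into the prompt. This is only an illustration. The notes are made up, the function names are hypothetical, and plain word overlap stands in for the embedding model a real setup would normally use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents that share the most wording with the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Put the retrieved documents in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only this context:\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up notes standing in for "your own content" (assumption).
notes = [
    "The backup job runs every night at 02:00 on the NAS.",
    "The office printer lives at 192.168.1.50.",
    "Each backup older than 30 days is deleted from the NAS.",
]

prompt = build_prompt("When does the backup job run?", notes)
print(prompt)
```

The resulting prompt is what you would then send to llama3 or mistral. The model itself is never retrained, it just gets your content handed to it alongside each question.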

But fair play if you do manage to train a real model!

BaroqueInMind@lemmy.one · 5 months ago

Ollama is so fucking slow. Even with a 16-core overclocked Intel, 64 GB of RAM, and an Nvidia 3080 with 10 GB of VRAM, generating a simple haiku with a 22B parameter model takes 20 minutes.

xcjs@programming.dev · 5 months ago

No offense intended, but are you sure it’s using your GPU? Twenty minutes is about how long my CPU-locked instance takes to run some 70B parameter models.

On my RTX 3060, I generally get responses in seconds.
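
If you want to check, running `ollama ps` while a model is loaded should show how it is split between CPU and GPU. Here's a rough sketch that does the same over the local HTTP API. It assumes Ollama is listening on its default address (`http://localhost:11434`) and that `/api/ps` reports `size` and `size_vram` for each loaded model, as I understand the API docs. The helper names are mine.

```python
import json
import urllib.request


def gpu_fraction(model_info):
    """Share of a loaded model that sits in VRAM (0.0 = all CPU, 1.0 = all GPU)."""
    size = model_info.get("size", 0)
    return model_info.get("size_vram", 0) / size if size else 0.0


def running_models(base_url="http://localhost:11434"):
    """Ask a local Ollama server which models are currently loaded."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return json.load(resp).get("models", [])


def report(base_url="http://localhost:11434"):
    """Print one line per loaded model, e.g. after sending it a prompt."""
    for m in running_models(base_url):
        print(f"{m.get('name', '?')}: {gpu_fraction(m):.0%} in VRAM")
```

Call `report()` right after sending a prompt. Anything well below 100% means part of the model is running from system RAM on the CPU, which would go a long way toward explaining very slow generation.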

kiku123@feddit.de · 5 months ago

I agree. My 3070 runs the 8B Llama3 model with response times of about 250 ms, at least for short responses.
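
For comparing numbers between machines, tokens per second is easier to line up than wall-clock time. A small sketch, assuming a local Ollama server on the default port and that a non-streaming `/api/generate` response carries `eval_count` (tokens generated) and `eval_duration` (nanoseconds). The function names and the `llama3` model tag are placeholders.

```python
import json
import urllib.request


def tokens_per_second(response):
    """Generation speed from eval_count (tokens) and eval_duration (nanoseconds)."""
    duration_ns = response.get("eval_duration", 0)
    if not duration_ns:
        return 0.0
    return response.get("eval_count", 0) / (duration_ns / 1e9)


def generate(prompt, model="llama3", base_url="http://localhost:11434"):
    """Send one non-streaming prompt to a local Ollama server and return its JSON reply."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return json.load(resp)
```

Something like `tokens_per_second(generate("Write a haiku about autumn."))` then gives a figure that can be compared across GPUs and model sizes.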