[Question] Why is there no Q8 quantization for Phi-3-V?

paintenzero@programming.dev to LocalLLaMA@sh.itjust.works · English · 24 days ago

Hello! I am looking for some expertise from you. I have a hobby project where Phi-3-vision fits perfectly. However, the PyTorch version is a little too big for my 8GB video card. I looked for a quantized model, but all I found was 4-bit, and unfortunately its output quality is too poor for my use case. So, for the first time, I am faced with quantizing a model myself.

I found some guides for quantizing Phi-3-vision for ONNX, but the only options there are fp32 (I think), fp16, and int4. Then I found a nice tool for AutoGPTQ, but I haven't been able to get it working for this job yet.

Does anybody know why there is no int8/int6 quantization for Phi-3-vision? Also, has anybody used AutoGPTQ to quantize vision models?
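
For context, here is a rough back-of-the-envelope sketch of why an 8-bit version would be the sweet spot on an 8GB card. It only estimates the size of the weights at each precision. The parameter count of roughly 4.2 billion is an assumption taken from the public model card, not something I measured, and the names `PARAM_COUNT` and `weight_gib` are just for illustration. Real quantized files will differ somewhat, since some layers are usually kept in higher precision and activations need memory too.

```python
# Rough weight-only memory estimate per precision.
# Ignores activations, KV cache, and framework overhead.
# ASSUMPTION: ~4.2e9 parameters (figure from the public model card).
PARAM_COUNT = 4.2e9

BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_gib(param_count: float, bits: int) -> float:
    """Return the size of the weights alone, in GiB."""
    return param_count * bits / 8 / 2**30


estimates = {
    name: weight_gib(PARAM_COUNT, bits)
    for name, bits in BITS_PER_WEIGHT.items()
}

for name, gib in estimates.items():
    print(f"{name}: {gib:.2f} GiB")
```

Under that assumption, the fp16 weights alone come to about 7.8 GiB, which leaves almost nothing for activations on an 8GB card, while int8 would need about 3.9 GiB.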
