Q8K Quantization

Advanced 8-bit block quantization technique for large language models

Glossary

Quantization: Reducing the numerical precision of a model's weights (and optionally activations), for example from FP32 to INT8
Block Quantization: Quantizing weights in fixed-size blocks, with each block storing its own scale factor
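
To make the two glossary terms concrete, here is a minimal sketch of generic symmetric block quantization to INT8 in plain Python. It is an illustration under stated assumptions, not the Q8K format itself: the block size of 32, the absmax scaling rule, and the function names `quantize_blocks` and `dequantize_blocks` are all hypothetical choices made for this example.

```python
def quantize_blocks(weights, block_size=32):
    """Quantize a flat list of floats to INT8, one scale per block.

    Assumptions (not taken from the Q8K format): block_size=32 and
    symmetric absmax scaling, so that the largest magnitude in each
    block maps to +/-127.
    """
    if len(weights) % block_size != 0:
        raise ValueError("len(weights) must be a multiple of block_size")

    blocks = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block)
        # An all-zero block gets a dummy scale to avoid dividing by zero.
        scale = absmax / 127.0 if absmax > 0 else 1.0
        # Round to the nearest integer and clamp to the symmetric INT8 range.
        quantized = [max(-127, min(127, round(w / scale))) for w in block]
        blocks.append((scale, quantized))
    return blocks


def dequantize_blocks(blocks):
    """Reconstruct approximate floats from (scale, int8 values) pairs."""
    return [scale * q for scale, values in blocks for q in values]


if __name__ == "__main__":
    weights = [(i - 32) / 10.0 for i in range(64)]
    blocks = quantize_blocks(weights)
    restored = dequantize_blocks(blocks)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{len(blocks)} blocks, worst-case error {worst:.5f}")
```

Because each block carries its own scale, an outlier only degrades the precision of the block it falls in rather than the whole tensor. The per-element rounding error in this sketch is bounded by half of that block's scale.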
