TurboQuant
What is TurboQuant?
TurboQuant is an AI memory compression algorithm developed by Google. It reduces the memory used by an LLM's key-value (KV) cache by at least 6x and delivers up to 8x speedup with zero accuracy loss.
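To put the 6x figure in context, the sketch below estimates how large a KV cache is and what a 6x reduction would leave. It is only back-of-the-envelope arithmetic, not TurboQuant itself: the function name `kv_cache_bytes` and the model dimensions (32 layers, 8 KV heads, head dimension 128, 16-bit values, 32,768-token context) are assumptions chosen for illustration, not details from Google.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int = 1, bytes_per_value: int = 2) -> int:
    """Approximate KV cache size in bytes for a decoder-only transformer.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


GIB = 2 ** 30

# Hypothetical model shape and context length (assumed, for illustration only).
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=32_768)

# Apply the "at least 6x" reduction quoted in the definition.
compressed = baseline / 6

print(f"baseline KV cache:   {baseline / GIB:.2f} GiB")
print(f"after 6x reduction:  {compressed / GIB:.2f} GiB")
```

Under these assumed dimensions the uncompressed cache for a single sequence is 4 GiB, so a 6x reduction would bring it to roughly 0.67 GiB. The saving scales linearly with context length and batch size.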