Researchers from Princeton and Stanford Engineering have developed a technique to compress large language models (LLMs), a ...