SearchUnify uses GPTQ to quantize the XGen-7B model, significantly reducing its memory footprint and speeding up response generation.
Mountain View, CA | August 4, 2023 – SearchUnify, a leading unified cognitive platform, is pleased to announce a breakthrough achievement in optimizing Salesforce’s XGen-7B Model for rapid response generation.
Salesforce’s XGen-7B is a Large Language Model (LLM) that supports longer context windows than other available open-source LLMs. The 7B in the name refers to the model’s 7 billion parameters. While this scale helps the model produce accurate responses, its heavy compute requirements (CPUs, GPUs, RAM, and storage) slow down response generation.
The SearchUnify team harnessed the state-of-the-art quantization method, GPTQ (Generative Pre-trained Transformer Quantization), to compress the model’s weights by 4x, i.e., from 16 bits to 4 bits per weight. This led to remarkable inference speedups and cut the memory needed to store the weights to roughly a quarter of the original. The optimized XGen-7B model now generates responses faster, ensuring superior user experiences on the Salesforce platform.
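To make the 4x figure concrete, the short Python sketch below estimates how much memory the weights alone occupy at 16 bits and at 4 bits. It is a back-of-the-envelope illustration, not SearchUnify’s actual tooling: the function name `weight_memory_gib`, the rounded parameter count, and the group size of 128 with per-group scale and zero-point overhead are all assumptions made for illustration.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# "7B" taken from the model name; the exact parameter count differs slightly.
NUM_PARAMS = 7_000_000_000

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # original 16-bit weights
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # ideal 4-bit weights, no metadata

# Assumption: group-wise quantization stores one 16-bit scale and one
# 4-bit zero point for every group of 128 weights.
GROUP_SIZE = 128
overhead_bits_per_weight = (16 + 4) / GROUP_SIZE
int4_grouped_gib = weight_memory_gib(NUM_PARAMS, 4 + overhead_bits_per_weight)

print(f"16-bit weights:          {fp16_gib:.2f} GiB")
print(f"4-bit weights (ideal):   {int4_gib:.2f} GiB")
print(f"4-bit weights (grouped): {int4_grouped_gib:.2f} GiB")
print(f"Ideal compression ratio: {fp16_gib / int4_gib:.1f}x")
```

Under these assumptions the weights shrink from about 13 GiB to a little over 3 GiB. The ideal ratio is exactly 4x, and per-group metadata brings the practical ratio slightly below that. Activations and the attention cache are not included in this estimate.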
“Large Language Models are making waves for their capability to generate human-like responses and perform various natural language processing tasks. However, they require significant computational power and memory, leading to high infrastructure costs. Our team has transformed this adversity into an advantage by improving Salesforce’s XGen-7B for performance and inference process,” states Vishal Sharma, CTO, SearchUnify. “Such significant achievements motivate us to continue our unwavering commitment to advancing natural language processing and delivering cost-effective solutions.”
SearchUnify is a unified cognitive platform by Grazitti Interactive, built on a machine learning and insights engine. The platform offers a suite of AI-powered products, including Cognitive Search, SUVA (the World’s First Federated, Information Retrieval Augmented Chatbot for Fine-tuned, Contextual, and Intent-driven Conversational Experiences at Scale), Agent Helper, Knowbler (the World’s First Knowledge-centered Customer Service Software), Escalation Predictor, and Community Helper. Leading enterprises globally rely on SearchUnify to revolutionize information discovery and elevate support outcomes.
Senior Social Media Manager, SearchUnify