
AI: Use a RAG to Dress Up Your LLM

So, your large language model (LLM) does not give your staff the best answers. Can you make the LLM perform better by dressing it up with rags? Well, not a rag from the thrift store, but implementing RAG may help. Here, RAG stands for Retrieval-Augmented Generation, and, implemented well, it can improve the responses an LLM produces.

As most know, an LLM is pre-trained on general knowledge. One issue with LLMs is that this information is unlikely to be specific to your business needs: they are trained to generate human-sounding responses from general knowledge. Unfortunately, some measures put the accuracy of those responses at only about 30%. Relying on a general-purpose LLM to generate insights for your specific business is expecting too much. Will implementing RAG improve that? To answer that question, we first need to briefly describe what RAG is. A short article cannot cover all the details, but it can give you a basic understanding.

In simple terms, a RAG implementation improves LLM responses by drawing on internal, and possibly proprietary, company or agency data. Data exposed to the RAG system adds context to the prompt a user feeds to the LLM. In essence, RAG modifies the user’s prompt by pulling data from accessible sources, so the LLM has better context and can generate a reply with a higher degree of accuracy.
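To make the flow concrete, here is a minimal sketch of that retrieve-then-augment step. The documents and helper names are hypothetical, and a real system would use vector embeddings and a vector database for retrieval; this toy version uses simple word-overlap scoring so it runs with no dependencies.

```python
# Toy sketch of the RAG flow: retrieve relevant internal documents,
# then prepend them to the user's prompt as context for the LLM.
# Real systems use embeddings and a vector store; word overlap is a
# stand-in for a similarity search so this example is self-contained.

def score(query: str, document: str) -> int:
    """Count the words shared by query and document (toy relevance score)."""
    return len(set(query.lower().split()) & set(document.lower().split()))

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most relevant to the query."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    return ranked[:top_k]

def build_augmented_prompt(query: str, documents: list[str]) -> str:
    """Modify the user's prompt by pulling in retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Hypothetical internal data (stand-ins for file shares, FAQs, manuals).
docs = [
    "Expense reports must be filed within 30 days of travel.",
    "The VPN portal is vpn.example.com; use your badge ID to log in.",
    "Annual benefits enrollment opens the first week of November.",
]

prompt = build_augmented_prompt("When does benefits enrollment open?", docs)
print(prompt)
```

The augmented prompt, not the raw question, is what gets sent to the LLM, which is why the model can answer from data it was never trained on.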

Implementing RAG should not be considered easy; it takes planning and a clear understanding of the intended outcomes. One alternative is retraining the LLM on the new data, but the computational and financial costs of that effort are quite high. Another option is “fine-tuning” the LLM, which is a form of retraining in which the input dataset is typically much smaller and more refined than the dataset the LLM was originally trained on. Fine-tuning is faster than the original effort to create the LLM, but there are still costs to consider. Currently, RAG is considered the more cost-effective approach. Note, though, that RAG can be combined with fine-tuning.

To add to the discussion, RAG is not only the data but also the technique used to make that data available to an LLM. For now, consider that the data can come from many places:

  • Document repositories (e.g. network file shares, etc.)
  • Databases (company and/or business unit specific)
  • FAQ repositories
  • Application systems’ APIs
  • Manuals

Now, before you pursue RAG and make all your data available to your LLM, understand that there is significant complexity involved and more than one approach to implementing RAG. RAG implementations can address security concerns, but the CISO, CDO, and system architects must be involved in designing the base implementation. RAG is more complex than this short article can address. The intent is to give readers enough understanding of the technology that, when they are involved in AI discussions, they can recognize the concepts and potentially contribute to the conversation at a much earlier stage.


Dan Kempton is the Sr. IT Advisor at North Carolina Department of Information Technology. An accomplished IT executive with over 35 years of experience, Dan has worked nearly equally in the private sector, including startups and mid-to-large scale companies, and the public sector. His Bachelor’s and Master’s degrees in Computer Science fuel his curiosity about adopting and incorporating technology to reach business goals. His experience spans various technical areas including system architecture and applications. He has served on multiple technology advisory boards, ANSI committees, and he is currently an Adjunct Professor at the Industrial & Systems Engineering school at NC State University. He reports directly to the CIO for North Carolina, providing technical insight and guidance on how emerging technologies could address the state’s challenges.

Image by Alexandra Koch at Pixabay.com
