Generative artificial intelligence (GenAI) is having a transformational impact on the world, and it’s not hard to see why. By opening new opportunities for automation and optimization, increased productivity, accelerated iteration cycles, and improved decision-making, among many other potential benefits, GenAI has piqued the interest of researchers, analysts, operations admins, and strategic decision-makers across all kinds of industries. One GenAI use case, the in-house chatbot, is especially intriguing. By combining a publicly available large language model (LLM) with an organization’s own private data, an in-house chatbot can give users a centralized and secure knowledge base that offers AI-powered search and content generation, improved internal communications, better support for employees, and readily accessible insights from existing data.

For small- and medium-sized businesses interested in what an in-house chatbot can do for them, the associated hardware costs may seem prohibitive. While many AI workloads do rely on multiple GPUs to crunch huge amounts of data in a timely fashion, an organization can also set up a responsive private chatbot on an affordable server platform with a single powerful CPU. Such a solution also gives administrators the option of adding GPUs later if their needs grow.

We explored this possibility by using the PTChatterly chatbot testing service to compare the performance of two configurations of single-socket Dell PowerEdge R6615 servers running an in-house GenAI chatbot built on the small Llama 3.2 1B LLM. Both configurations had to meet a five-second response time threshold, meaning that at least 90 percent of all questions received complete answers in five seconds or less. The first configuration, with only a single 64-core AMD EPYC 9534 processor, supported nine simultaneous users. The second configuration, the same EPYC 9534-equipped PowerEdge R6615 with an NVIDIA L4 GPU added, supported 23 simultaneous users. Because typical users engage with a chatbot for only short periods during their workday, these configurations could likely support even larger numbers of real-world users.
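To illustrate the pass/fail criterion described above, here is a minimal sketch of how one might check whether at least 90 percent of responses complete within five seconds. This is not part of the PTChatterly methodology; the function name and the sample latencies are illustrative assumptions, not measured data.

```python
# Sketch: check whether at least 90 percent of chatbot responses
# completed within a 5-second threshold.
# The sample latencies below are made up for illustration only.

def meets_threshold(latencies_s, threshold_s=5.0, required_fraction=0.90):
    """Return True if at least `required_fraction` of responses
    finished within `threshold_s` seconds."""
    within = sum(1 for t in latencies_s if t <= threshold_s)
    return within / len(latencies_s) >= required_fraction

sample = [3.1, 4.8, 2.5, 5.6, 4.9, 3.3, 4.0, 2.2, 4.7, 3.8]
print(meets_threshold(sample))  # 9 of 10 answers within 5 s -> True
```

In a real test, the number of simultaneous users would be increased until a configuration fails this check; the highest passing user count is the reported result.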

These results suggest that the Dell PowerEdge R6615 server powered by an AMD EPYC 9534 processor is a suitable choice for small- and medium-sized organizations that want to develop in-house GenAI chatbots without investing in multi-GPU hardware. In addition, as the number of users grows, the same platform lets companies easily scale the server’s capabilities by adding a GPU to support more than twice as many users. Such an investment could open new opportunities for efficiency and innovation, all at an affordable entry price point.

To read more about how we measured in-house chatbot performance on single-socket Dell PowerEdge R6615 servers, check out the report below.