3. Building your GenAI tech stack

Build vs. buy

Do you want to be an AI company, or do you want to be a company that uses AI in its products?

Where to begin

A recurring decision with any technology in the software industry is whether to build or buy your tools. Engineers build software, so it’s natural for them to want to build something when they face a new engineering problem. However, for those planning how an engineering team spends its time, building new solutions to already-solved problems isn’t always the best investment. There’s a third option that some call “borrow,” where you use open-source software and keep the option of contributing code or forking the repo.

So when you’re embarking on your GenAI journey (or pivoting, for that matter), it’s important to ask: do you want to be an AI company, or do you want to be a company that uses AI in its products? The distinction matters: if you have a specialized use case or want to control the code for your dependencies, then building may make sense. For most organizations, buying a solution or using open-source software will work. As always, an organization thrives when it focuses on building the software that directly affects its business.

Let’s suppose you want to build an AI stack. Here’s what that would take.

Build-a-bot

Building a foundation model is a massive and expensive undertaking. OpenAI reportedly spent around $100 million training GPT-4, which is rumored to have about one trillion parameters. There are cheaper ways to train models that aren’t that massive; MosaicML trains custom LLMs for clients, and their cheapest is $2,000 for a 1.3B-parameter GPT-3 clone. The trade-off is that you give up some of the quality and reliability of the big models in exchange for the control and security of a custom one. Additionally, the bigger your model is, the more it costs to host.
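As a rough illustration of why hosting cost grows with model size, the memory needed just to hold a model’s weights is approximately the parameter count multiplied by the bytes per parameter. The sketch below assumes 16-bit weights (2 bytes each) and ignores activations, caches, and serving overhead, so it is a lower bound rather than a sizing guide. The function name is hypothetical, and the one-trillion figure is only the rumored size mentioned above.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in decimal gigabytes.

    Assumes every parameter is stored at the same precision
    (2 bytes = 16-bit by default). Real deployments need more memory
    than this for activations, caches, and runtime overhead.
    """
    return num_params * bytes_per_param / 1e9


# Illustrative sizes: the 1.3B-parameter model and the rumored
# one-trillion-parameter scale discussed in the text.
models = [
    ("1.3B-parameter model", 1_300_000_000),
    ("1T-parameter model (rumored size)", 1_000_000_000_000),
]

for name, params in models:
    print(f"{name}: ~{weight_memory_gb(params):,.1f} GB of weights at 16-bit precision")
```

Under these assumptions, the smaller model’s weights fit in about 2.6 GB, while a one-trillion-parameter model would need about 2,000 GB, which means spreading it across many accelerators.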

You could take the borrow route and fine-tune an open-source model. This still requires paying for GPU time, but it lets you get the benefits of a large pretrained LLM while customizing it on your data. Techniques like LoRA make this efficient: instead of updating every weight in the LLM, they freeze the original weights and train a small set of additional low-rank matrices.
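To make the LoRA idea concrete, here is a minimal pure-Python sketch of the low-rank update applied to a single weight matrix. It assumes the commonly described formulation, in which the frozen weight W is left untouched and the output is W x plus a scaled product of two small trainable matrices, B and A. The helper names, the 4096-wide layer, and the rank of 8 are illustrative assumptions, not values taken from any particular model or library.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning trains all d_out * d_in entries of W.
    LoRA trains only A (rank x d_in) and B (d_out x rank).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


def lora_forward(x, W, A, B, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x) for a column vector x.

    W stays frozen; only A and B would be updated during fine-tuning.
    """
    base = matmul(W, x)
    update = matmul(B, matmul(A, x))
    scale = alpha / rank
    return [[b[0] + scale * u[0]] for b, u in zip(base, update)]


# Trainable parameters for one hypothetical 4096 x 4096 layer at rank 8.
full, lora = lora_param_counts(4096, 4096, 8)
print(f"full fine-tuning: {full:,} parameters; LoRA: {lora:,} parameters")

# Tiny worked example: with B all zeros, the adapter changes nothing,
# so the output equals the frozen model's output.
W = [[1, 2], [3, 4]]
x = [[1], [1]]
A = [[1, 0]]
B = [[0], [0]]
print(lora_forward(x, W, A, B, alpha=2, rank=1))
```

For this hypothetical layer, the adapter trains 65,536 parameters instead of 16,777,216, which is where most of the GPU-time savings come from.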

It’s unlikely that you’d want to build your own database or your own machine learning framework to replace something like PyTorch. Many of the standard tools in these areas are open-source, and reinventing the wheel here is usually more trouble than it’s worth.

Here are a few areas where you may find yourself building your own specialized tooling:

Orchestration and agent frameworks

Much of what software does is automate the repetitive stuff, so anything that connects LLM output to other processes falls in this category—think LlamaIndex, AutoGPT, or AutoChain.

Retrieval-augmented generation

RAG systems improve LLM output by adding relevant context to the prompt, typically retrieved from a vector database and assembled with orchestration tools. A RAG system will likely be something that you build in-house. There are tools that will do this for you, but they are often built into other tools, like data platforms.
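The retrieval-then-prompt flow can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production design: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and all function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector.

    A real system would call an embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Place the retrieved context ahead of the question in the prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical knowledge-base snippets.
docs = [
    "Refunds are processed within five business days.",
    "Our office is closed on public holidays.",
    "Password resets require a verified email address.",
]

print(build_prompt("How long do refunds take?", docs, k=1))
```

The resulting string is what would be sent to the LLM. Swapping in a real embedding model and vector store changes `embed` and `retrieve`, but the overall shape stays the same.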

Fine-tuning and inference

Fine-tuning lets you update an LLM with new information by continuing its training, which changes the model’s weights and biases. Most applications want this sort of feedback mechanism, and there are many different approaches and implementations. This is an area where, even if you want to build your own, you’d want to check the open-source options first.

Monitoring, explainability, and debiasing

There’s a whole slew of tools and techniques for evaluating, tracking, and adjusting how generative models produce results. If you have an excellent data science team, you may end up coming up with novel solutions. But with all the visibility around hallucinations and GenAI’s failings, there is a lot of active research and development in this area, so it’s worth checking what already exists before building.

Summary

Knowing what you need and how to judge which tools will do the job can be a challenge in itself. For our own AI efforts, we set up a separate Community within our internal Stack Overflow for Teams and gathered AI-related knowledge there. You’ll need some way to keep up with how fast the field is moving and to track which new developments will affect your business goals.