Building your GenAI tech stack
For some companies, access to an AI-powered chatbot or code generator will be enough. These are widely available as a service or via an API. For those seeking deeper integration, however, understanding the technologies GenAI uses and how they fit into your existing tech stack is essential. GenAI starts with the base foundation model. But which one should you use?
Components of GenAI
GenAI leverages additional advanced technologies that require thoughtful integration into your existing systems, and understanding these components is crucial for effective implementation.
Vector storage

A database that stores the vectors output by the embedding model. While there are plenty of dedicated vector databases, many databases are adding vector capabilities to their feature sets, so you can use a single database to store object and vector data.
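To make the idea concrete, here is a minimal sketch of what a vector store does under the hood: it keeps embeddings alongside their source objects and answers nearest-neighbor queries by similarity. The `VectorStore` class and the toy three-dimensional embeddings below are illustrative, not any particular product's API; real stores use approximate-nearest-neighbor indexes rather than a full sort.

```python
import math

class VectorStore:
    """Toy in-memory vector store: maps ids to (vector, payload) pairs."""

    def __init__(self):
        self._rows = {}  # id -> (vector, payload)

    def add(self, doc_id, vector, payload):
        self._rows[doc_id] = (vector, payload)

    @staticmethod
    def _cosine(a, b):
        # Cosine similarity: dot product over the product of magnitudes.
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def query(self, vector, k=1):
        """Return payloads of the k vectors most similar to `vector`."""
        ranked = sorted(
            self._rows.values(),
            key=lambda row: self._cosine(vector, row[0]),
            reverse=True,
        )
        return [payload for _, payload in ranked[:k]]

store = VectorStore()
store.add("a", [1.0, 0.0, 0.0], "doc about billing")
store.add("b", [0.0, 1.0, 0.0], "doc about onboarding")
print(store.query([0.9, 0.1, 0.0]))  # -> ['doc about billing']
```

In production, the vectors would come from an embedding model and the payloads would be document chunks or references back to your object storage.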
Other storage

Possibly a data lakehouse or other storage for unstructured objects.
Machine-learning framework

Like PyTorch, Keras, or TensorFlow. Python is the primary language for these, but there are frameworks for other languages.
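What these frameworks automate, at bottom, is computing gradients and updating model parameters. A hand-rolled single-parameter gradient-descent step in plain Python (illustrative only; PyTorch, Keras, and TensorFlow derive the gradients for you via automatic differentiation) shows the loop they take off your hands:

```python
# Fitting y = w * x to one data point with a hand-derived gradient,
# the step that frameworks like PyTorch automate for whole networks.
def train(x, y_true, w=0.0, lr=0.1, steps=50):
    for _ in range(steps):
        y_pred = w * x
        grad = 2 * (y_pred - y_true) * x  # d/dw of (w*x - y_true)^2
        w -= lr * grad                    # gradient-descent update
    return w

w = train(x=1.0, y_true=3.0)
print(round(w, 3))  # -> 3.0, the weight that fits the data point
```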
Access to GPUs or TPUs

For accelerated inference and model training.
Where to start?
It can be intimidating to know where to start with these technologies, as research and development are advancing rapidly right now. Multiple organizations offer access to LLMs via APIs, and open-source models proliferate on sites like Hugging Face, so finding one that fits your organization can be daunting. You might even want to train your own model if your needs are specific enough. But a wide variety of AI, including foundation models, is available as a service. You’ll likely need the rest of the tech listed here, but exactly which pieces, and their requirements, will depend on what you’re doing with GenAI.
Fit the tech around the use case
Before diving into selecting tech, buying infrastructure and cloud compute, and integrating GenAI tech into your codebase, you’ll need to think about your use case:
Custom data will require a data platform with vector storage, embedding models, and, if you want retrieval-augmented generation (RAG), an orchestration library like LlamaIndex.

Responses sent to the user may need an inferencing stack with additional governance or prompt engineering to manage potential hallucinations, unintentional biases, or prompt injections. This can involve providing the LLM with system instructions and rules for how to handle and respond to prompts. You can also use other LLMs to screen prompts and responses so that nothing dangerous makes it through to either the generating LLM or the user. Automated prompting and prompt handling may require fine-tuning an LLM to ensure that the responses fit the intended systems.
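The retrieval-plus-governance flow just described can be sketched as: retrieve relevant chunks, wrap them with system instructions, and screen the prompt before it reaches the model. Everything here is hypothetical scaffolding: `retrieve` stands in for a real embedding-model-plus-vector-store lookup, and `looks_like_injection` is a crude placeholder for the kind of screening a dedicated guardrail model would do.

```python
# Hypothetical RAG prompt assembly with a simple guardrail.
SYSTEM_RULES = "Answer only from the provided context. If unsure, say so."

def retrieve(question, corpus, k=2):
    """Naive retrieval: rank chunks by word overlap with the question.
    A real system would embed the question and query a vector store."""
    words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda chunk: len(words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def looks_like_injection(text):
    """Crude screen; production systems often use another LLM here."""
    return "ignore previous instructions" in text.lower()

def build_prompt(question, corpus):
    if looks_like_injection(question):
        raise ValueError("prompt rejected by guardrail")
    context = "\n".join(retrieve(question, corpus))
    return f"{SYSTEM_RULES}\n\nContext:\n{context}\n\nQuestion: {question}"

corpus = ["Refunds are issued within 5 days.", "Support is open weekdays."]
print(build_prompt("When are refunds issued?", corpus))
```

The final prompt, not the raw user question, is what goes to the generating LLM; the same screening can be applied again to the model's response before it reaches the user.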
With any of these use cases (and more), it’s important that everyone in the organization be on the same page. GenAI integrations can be large, expensive projects that touch a lot of teams, so having a plan for the tech before diving in can make the eventual execution much easier.