[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"sanity-UsZW3rXaRhRwsb2-YBDq8gQ6A7EYXntnqPsKIRPiRuc":3},{"data":4,"sourceMap":-1},{"meta":5,"resource":9},{"sidebarCta":6},{"body":7,"title":8},"Subscribe to receive Stack Internal content around knowledge sharing, collaboration, and AI.","Stay updated",{"_createdAt":10,"_id":11,"_rev":12,"_type":13,"_updatedAt":14,"backgroundImage":15,"body":20,"category":40,"chapters":42,"displayMinimal":5875,"image":5876,"preface":5879,"product":5880,"publishedAt":5964,"resourceType":5965,"slug":5969,"subcategory":5971,"title":5973,"visible":5974},"2025-01-29T18:00:41Z","91da7ff8-1355-4466-bfe3-ce99af6bcda9","a16d4PP1Zddk6u8Hnh0Rkb","resource","2025-02-12T15:07:35Z",{"_type":16,"asset":17},"image",{"_ref":18,"_type":19},"image-d4807d1db0937f61d0938cd5791d843b2258825e-2000x1000-png","reference",[21,32],{"_key":22,"_type":23,"children":24,"markDefs":30,"style":31},"7b47d2b0bb72","block",[25],{"_key":26,"_type":27,"marks":28,"text":29},"69100fe53d7a","span",[],"This guide outlines how to build a GenAI program within your organization, from understanding the beginnings of the AI landscape and how to upskill your team for the GenAI era, to key decisions for your organization and practical explanations of how to implement GenAI.",[],"normal",{"_key":33,"_type":23,"children":34,"markDefs":39,"style":31},"b51bb5f4e90b",[35],{"_key":36,"_type":27,"marks":37,"text":38},"e45eb16be230",[],"As we've seen over the past year, the AI/ML landscape is constantly changing and evolving. 
We plan to continue updating this guide as relevant information, learnings, and best practices are uncovered in the space.",[],{"_ref":41,"_type":19},"ef0e02bc-df9c-4b7c-b754-59031956a409",[43,172,1376,2792,3219,4716,5784],{"_key":44,"_type":45,"body":46,"fullwidthImage":158,"seo":161,"sidebarCta":167,"slug":168,"title":171},"8121935f23e2","chapter",[47,64,72,80,88,92,100,104,124,133,141],{"_key":48,"_type":23,"children":49,"markDefs":63,"style":31},"37cb41e1a102",[50,54,59],{"_key":51,"_type":27,"marks":52,"text":53},"994b226e0ae50",[],"When Stack Overflow introduced our ",{"_key":55,"_type":27,"marks":56,"text":58},"994b226e0ae51",[57],"em","Industry Guide to AI",{"_key":60,"_type":27,"marks":61,"text":62},"994b226e0ae52",[]," back in January 2024, the world was at an inflection point. It was clear that this new wave of technology powered by Generative AI was poised to have a profound impact across nearly every industry, but very few companies had figured out how to build and scale their GenAI efforts. Most were not ready to put this new technology into production, or allow their employees and customers to freely interact with it.",[],{"_key":65,"_type":23,"children":66,"markDefs":71,"style":31},"7822e153803d",[67],{"_key":68,"_type":27,"marks":69,"text":70},"a104a34deb4d0",[],"Today, most major organizations have explored or adopted GenAI in some capacity. Software development can be enhanced by AI code suggestions and the automated creation of documentation and tests. Marketers can write and refine more copy, design and illustrate campaigns, and even transform ideas and images into working web code. Video and audio production are enhanced by systems that can automatically clean, edit, enhance, and even create new material on the fly. 
The list goes on.",[],{"_key":73,"_type":23,"children":74,"markDefs":79,"style":31},"e960ae43dc0e",[75],{"_key":76,"_type":27,"marks":77,"text":78},"e5369da674b20",[],"In our updated guide, we reviewed and refreshed many sections to reflect how the tools and technologies that underpin this new era have evolved. We also updated our guide to implementation to give readers a sense of where we sit in the hype cycle. Some of the foundational elements of GenAI have changed significantly. For example, the style of retrieval augmented generation we highlighted last year is now known as “naive RAG,” because best practices have evolved to a far more elaborate set of tools and techniques to improve on the basic approach.",[],{"_key":81,"_type":23,"children":82,"markDefs":87,"style":31},"cdf2e4bd373e",[83],{"_key":84,"_type":27,"marks":85,"text":86},"18270615ecae0",[],"We have also seen new modalities appear. Previously, all models were built to provide an immediate answer to a user’s query. Now, there are models that can be trained to receive a query, think through a response, iteratively work out an answer, and provide the user with a much more nuanced reply. On the subject of AI innovation, while some hackers had created AI agents a year ago, almost all interaction with GenAI was in the form of a chatbot that could respond with text or images. 
We now have systems that can take a user prompt and, with the user’s permission, use their computer to carry out instructions: everything from a basic web search to building a chart in a spreadsheet to tweaking and then running a software program.",[],{"_key":89,"_type":90,"copy":91},"b98d1250b517","quote","Needless to say, 2024 was a year of rapid experimentation and change in the AI space.",{"_key":93,"_type":23,"children":94,"markDefs":99,"style":31},"03363ea86875",[95],{"_key":96,"_type":27,"marks":97,"text":98},"30cfaabee5c60",[],"And while capabilities and adoption have greatly expanded, a major challenge to continued progress also seems to have emerged: model performance. As has been widely reported, the largest AI labs are no longer seeing large improvements in the performance of their models, despite increasing the size, cost, and complexity of the training. Incremental gains are still occurring, but the scaling laws that predicted the improvements from GPT-1 through GPT-4 no longer seem to hold.",[],{"_key":101,"_type":16,"asset":102},"4f3409bbf183",{"_ref":103,"_type":19},"image-4355b980499b5227c4cc1baa2671f6dade1b01df-1430x682-png",{"_key":105,"_type":23,"children":106,"markDefs":120,"style":31},"321399f5fdd3",[107,111,116],{"_key":108,"_type":27,"marks":109,"text":110},"6c6ab4561d3d0",[],"Ilya Sutskever, co-founder and former chief scientist of OpenAI, ",{"_key":112,"_type":27,"marks":113,"text":115},"6c6ab4561d3d1",[114],"0b48954e17b4","told reporters",{"_key":117,"_type":27,"marks":118,"text":119},"6c6ab4561d3d2",[]," that gains made by simply increasing the amount of training data and size of the models have 
plateaued.",[121],{"_key":114,"_type":122,"href":123},"link","https://www.reuters.com/technology/artificial-intelligence/openai-rivals-seek-new-path-smarter-ai-current-methods-hit-limitations-2024-11-11/",{"_key":125,"_type":23,"children":126,"markDefs":131,"style":132},"23831c5cf78d",[127],{"_key":128,"_type":27,"marks":129,"text":130},"f0f2ec51ba7f0",[],"“The 2010s were the age of scaling, now we’re back in the age of wonder and discovery once again. Everyone is looking for the next thing,” Sutskever said. “Scaling the right thing matters more now than ever.”",[],"blockquote",{"_key":134,"_type":23,"children":135,"markDefs":140,"style":31},"30ac76b3fe5d",[136],{"_key":137,"_type":27,"marks":138,"text":139},"aad9a5224daa0",[],"Not everyone agrees that the industry has reached a plateau, so it’s best to keep an open mind. While we’ll continue to add updates to this guide, the picture will likely come into better focus in 2025, as major updates to foundation models from the leading AI labs are already beginning to feel overdue.",[],{"_key":142,"_type":23,"children":143,"markDefs":157,"style":31},"a318b5bb2248",[144,148,153],{"_key":145,"_type":27,"marks":146,"text":147},"47a6d71b18820",[],"Given that this is one of the central topics being debated in the GenAI industry, our goal is to focus on providing as much information and context as we can to Sutskever’s question: ",{"_key":149,"_type":27,"marks":150,"text":152},"47a6d71b18821",[151],"strong","If simply scaling is no longer enough, what kind of data or techniques will have the biggest impact on pushing AI models to new levels of intelligence and capability?",{"_key":154,"_type":27,"marks":155,"text":156},"47a6d71b18822",[]," How should organizations think about generating, cleaning, annotating, and preparing their internal data to take best advantage of 
this?",[],{"_type":16,"asset":159},{"_ref":160,"_type":19},"image-23c89918175b7d876ddf2a19a80c0cc6ea22e195-2880x700-png",{"_type":162,"seoDescription":163,"seoImage":164},"seo","Explore our interactive guide on GenAI and LLMs, offering insights on AI strategy, tech integration, and safety considerations.",{"_type":16,"asset":165},{"_ref":166,"_type":19},"image-14ca641b67c15cade1ff57002fd4a137e09053d9-2400x1260-png",[],{"_type":169,"current":170},"slug","summary","Executive summary",{"_key":173,"_type":45,"body":174,"sections":259,"seo":1367,"sidebarCta":1372,"slug":1373,"title":1375},"80c5511a89ce",[175,183,188,196,204,213,221,229,237,243,251],{"_key":176,"_type":23,"children":177,"markDefs":182,"style":31},"c0a1710430d6",[178],{"_key":179,"_type":27,"marks":180,"text":181},"5e90eccff6190",[],"The quest to create artificial intelligence through machines dates back to the 1940s and has been a persistent topic of interest for scientists, technologists, writers, and philosophers ever since. Alan Turing, the legendary British mathematician, was one of the first to write on this topic in a formal way, and his Turing test served as a useful yardstick for many decades. The test asked a human being to engage in a conversation through a text-based chat interface. 
If the user couldn’t distinguish whether they were speaking to a real person or a computer program, then the system had at least a certain level of intelligence: it had passed the Turing test.",[],{"_key":184,"_type":185,"caption":186,"url":187},"04ffe87dca50","embed","The Voight-Kampff Test, a fictional exam for signs of humanity and empathy used in the Blade Runner film series.","https://www.youtube.com/watch?v=Umc9ezAyJv0",{"_key":189,"_type":23,"children":190,"markDefs":195,"style":31},"ac49a272e60c",[191],{"_key":192,"_type":27,"marks":193,"text":194},"30e2ca1afab80",[],"While an AI system that can hold a conversation with humans may seem like a recent development, there are examples of this technology dating all the way back to the mid-20th century. ELIZA, a therapy bot, was designed to take any input from a user and return it in the form of a question. People found it very useful as a personal therapist. Until recently, though, it was usually easy enough to trip an AI up into revealing its true nature. The latest generation of chatbots, powered by large language models (LLMs), can arguably pass this test with ease.",[],{"_key":197,"_type":23,"children":198,"markDefs":203,"style":31},"a9d8746df44a",[199],{"_key":200,"_type":27,"marks":201,"text":202},"234009bcaec90",[],"The dominant approach to AI from the 1970s onwards involved creating systems that could learn rules and facts, then put them to use inside a closed system. Prominent examples include Deep Blue, the program that beat Garry Kasparov in chess, and IBM’s Watson system, which bested human champions in Jeopardy! These AIs could become quite skilled at mastering the rules of certain games and could retain more information and explore more moves in advance than any human brain. They were limited, however, to their particular domain. Deep Blue couldn’t play checkers, and Watson would flop at Twenty Questions unless it was reengineered for an entirely new rule set. 
They lacked one key aspect of animal intelligence: the ability to take knowledge from one area and generalize it to another.",[],{"_key":205,"_type":23,"children":206,"markDefs":211,"style":212},"a22dfbe30752",[207],{"_key":208,"_type":27,"marks":209,"text":210},"db52be699bdf0",[],"The birth of neural networks",[],"h2",{"_key":214,"_type":23,"children":215,"markDefs":220,"style":31},"f4f51287a0df",[216],{"_key":217,"_type":27,"marks":218,"text":219},"6657018829cc0",[],"Previously, AIs had become very good at spotting patterns, and were useful in specific domains, like chess, stock trades, or medicine, where they might spot a cancer that a radiologist would miss.",[],{"_key":222,"_type":23,"children":223,"markDefs":228,"style":31},"b75ab2ab87d8",[224],{"_key":225,"_type":27,"marks":226,"text":227},"95e7b17df5820",[],"While AI built on a system of rules dominated for many decades, a few academics pursued another path. They believed that the best way to mimic intelligence generated by our brains was to build a digital version of our biological brain. To do this, they created artificial neurons that would interact in roughly the same way as the connective nodes in our own heads. The theory drew on a principle of neuroscience: by changing the strength of the connections between nodes in a network, you can teach it to encode knowledge. These were known as artificial neural networks.",[],{"_key":230,"_type":23,"children":231,"markDefs":236,"style":31},"536b086e9e78",[232],{"_key":233,"_type":27,"marks":234,"text":235},"088bd70e37bf0",[],"This field did not bear much fruit during the 20th century, and many of the most famous and well-respected names in AI today spent decades toiling in relative academic obscurity. 
Those who maintained their conviction in the neural network approach, however, were validated in the 2000s, and especially in the early 2010s, when the growing amount of available data, combined with massive amounts of compute, made it possible to scale these neural networks and produce astounding results. As the internet grew, AI systems powered by what was being called machine learning were put to use for prediction and recommendation. They now guide our shopping, news consumption, social media feeds, and many other fields, including tens of billions of dollars per day in automated trades executed by high-frequency bots.",[],{"_key":238,"_type":16,"asset":239,"caption":241,"source":242},"a4f3dbd99d71",{"_ref":240,"_type":19},"image-3d6816674ce079fdfef7cca394778f31d1144197-1260x601-png","Error rate in the ImageNet Large Scale Visual Recognition Challenge","https://www.researchgate.net/figure/Error-rate-in-the-ImageNet-Large-Scale-Visual-Recognition-Challenge-Data-for-AI_fig1_340502861",{"_key":244,"_type":23,"children":245,"markDefs":250,"style":31},"575d31013983",[246],{"_key":247,"_type":27,"marks":248,"text":249},"7ea4863b463c0",[],"Machine learning was followed by deep learning, named for the growing number of layers in each neural network. The ImageNet 2012 Challenge is seen as a watershed moment. A neural network-based approach vastly outperformed the rest of the field and would soon surpass human performance. 
In short order, this approach became standard in the field and began to drive incredible advances in natural language processing, image recognition, and several other domains, including most recently generative models that can create text, images, video, or sound based on a user’s prompt.",[],{"_key":252,"_type":23,"children":253,"markDefs":258,"style":31},"85116dd7695a",[254],{"_key":255,"_type":27,"marks":256,"text":257},"c6dc1cc15de50",[],"Continue reading to dive into the details of GenAI and why it’s shaking up the world.",[],[260,475,865,1004],{"_key":261,"_type":262,"body":263,"seo":468,"slug":472,"title":474},"8680b4b37d9d","section",[264,272,280,299,304,312,320,328,336,345,353,361,369,388,396,415,419,433,441,460],{"_key":265,"_type":23,"children":266,"markDefs":271,"style":212},"438adbe22ddb",[267],{"_key":268,"_type":27,"marks":269,"text":270},"bef43d9223640",[],"The birth of the transformer",[],{"_key":273,"_type":23,"children":274,"markDefs":279,"style":31},"3fed97528522",[275],{"_key":276,"_type":27,"marks":277,"text":278},"7fee70027f1f0",[],"Neural networks have become the leading approach to AI. Prior to ChatGPT, they were recognized for breakthroughs in image recognition, natural language processing, and gameplay. Until recently, however, they rarely created anything original. Instead, they mastered a specific task or system.",[],{"_key":281,"_type":23,"children":282,"markDefs":296,"style":31},"33dc611ebad1",[283,287,292],{"_key":284,"_type":27,"marks":285,"text":286},"3795b2728a7d0",[],"GenAI ventured into new territory: a neural network system that could create something unique in response to a user prompt. Systems like DALL-E and Midjourney could generate images in response to input from users. 
In 2017, researchers at Google published a paper proposing a new architecture for neural networks: ",{"_key":288,"_type":27,"marks":289,"text":291},"3795b2728a7d1",[290],"24e064819347","the transformer",{"_key":293,"_type":27,"marks":294,"text":295},"3795b2728a7d2",[],". This approach allowed networks to scale to much larger sizes and make better use of compute provided by graphics processing units (GPUs).",[297],{"_key":290,"_type":122,"href":298},"https://stackoverflow.blog/2024/08/22/llms-evolve-quickly-their-underlying-architecture-not-so-much/",{"_key":300,"_type":16,"asset":301,"caption":303},"f9feca39bced",{"_ref":302,"_type":19},"image-f0fba41cba8efb9ecb1b19337356d09aa34012a7-4800x2696-png","Components of the Decoder-only Transformer – Input Layer, Causal Self-Attention, Feed-Forward Transformation, Classification Head, Transformer Block",{"_key":305,"_type":23,"children":306,"markDefs":311,"style":31},"d6f65cc25be6",[307],{"_key":308,"_type":27,"marks":309,"text":310},"c80a32ca3c470",[],"The transformer opened the door to the large language model (LLM), a generative system trained on text to respond in kind. In 2018, more than four years before ChatGPT burst onto the scene, OpenAI released GPT-1, where GPT stands for generative pre-trained transformer. When prompted, GPT-1 could generate coherent sentences and even paragraphs. But it also made mistakes and often wandered off-course. The subsequent releases of GPT-2 and GPT-3 made big waves in the world of data science and AI, but they didn’t generate any mainstream recognition.",[],{"_key":313,"_type":23,"children":314,"markDefs":319,"style":31},"bed8bebe697a",[315],{"_key":316,"_type":27,"marks":317,"text":318},"8c430fc2c8380",[],"The arrival of ChatGPT (roughly GPT-3.5) was a watershed moment. 
Something about the scale of the training and the subsequent work to fine-tune the system through reinforcement learning from human feedback produced a GenAI that was accurate, knowledgeable, and rational enough to capture the world’s imagination.",[],{"_key":321,"_type":23,"children":322,"markDefs":327,"style":31},"1603f61ba037",[323],{"_key":324,"_type":27,"marks":325,"text":326},"fc98f0a6b9170",[],"Today’s neural networks have achieved a staggering scale in just a few short years. Systems like ChatGPT, Google’s Gemini, or Anthropic’s Claude are estimated to train on a once-unfathomable amount of text (more than 10 TB of internet data!) that continues to scale. These AI companies use special-purpose compute clusters composed of tens of thousands of high-end GPUs to train their AI models on this raw material. The process can take weeks or months and cost tens of millions of dollars. But, as we’re seeing, the results can be transformative.",[],{"_key":329,"_type":23,"children":330,"markDefs":335,"style":212},"4cd007f8f9f5",[331],{"_key":332,"_type":27,"marks":333,"text":334},"cb5995e5afa20",[],"The capabilities of LLMs are soaring",[],{"_key":337,"_type":23,"children":338,"markDefs":343,"style":344},"d3d051da1e1e",[339],{"_key":340,"_type":27,"marks":341,"text":342},"403d8eaf6b030",[],"Multimodal AI",[],"h3",{"_key":346,"_type":23,"children":347,"markDefs":352,"style":31},"6e4aa4ecfb5c",[348],{"_key":349,"_type":27,"marks":350,"text":351},"2b90f5a9ff640",[],"Since we published the first edition of this guide, foundational GenAI models have improved in their ability to reason and converse. We’ve seen the emergence of multimodal AI models like Google’s Gemini, which is capable of understanding and creating content across mediums: images, audio, video, and text. OpenAI’s GPT-4o also reasons in real time across images, video, audio, and text. 
For example, a multimodal model can receive a photo of a smoothie and produce the recipe (and the other way around).",[],{"_key":354,"_type":23,"children":355,"markDefs":360,"style":31},"23df52ca636d",[356],{"_key":357,"_type":27,"marks":358,"text":359},"89af100c6e900",[],"Multimodal AI has enormous potential to transform nearly every aspect of how people live, including how software developers learn and work. Multimodal AI models are less like software programs and more like consulting experts or assistants. They don’t just tackle toil (although they do an impressive job of that); they provide guidance via organic, humanlike interactions.",[],{"_key":362,"_type":23,"children":363,"markDefs":368,"style":344},"2b685344bd20",[364],{"_key":365,"_type":27,"marks":366,"text":367},"14b3f6930d1d0",[],"Reasoning LLMs",[],{"_key":370,"_type":23,"children":371,"markDefs":385,"style":31},"7d9424001a6f",[372,376,381],{"_key":373,"_type":27,"marks":374,"text":375},"46f1225d3a6e0",[],"We’re also seeing a new series of ",{"_key":377,"_type":27,"marks":378,"text":380},"46f1225d3a6e1",[379],"2b2babe8e472","reasoning models",{"_key":382,"_type":27,"marks":383,"text":384},"46f1225d3a6e2",[]," trained with reinforcement learning to think through hard problems. 
These models, like OpenAI’s o1, represent a pivotal evolution in how AI solves problems.",[386],{"_key":379,"_type":122,"href":387},"https://medium.com/@cognidownunder/openais-o1-vs-gpt-4o-a-deep-dive-into-ai-s-reasoning-revolution-fd9f7891e364",{"_key":389,"_type":23,"children":390,"markDefs":395,"style":31},"2c147c830d00",[391],{"_key":392,"_type":27,"marks":393,"text":394},"b11587b29ee60",[],"The difference between reasoning models and a model like GPT-4o is the difference between someone who has memorized their times tables and someone who understands the principles behind multiplication and can apply them in new contexts.",[],{"_key":397,"_type":23,"children":398,"markDefs":412,"style":31},"c9f40336667d",[399,403,408],{"_key":400,"_type":27,"marks":401,"text":402},"c203bccc857d0",[],"These models are trained using chain-of-thought prompting that mirrors a human approach to problem solving. Chain-of-thought prompting breaks problems down into manageable chunks of data that can be arranged sequentially to lead to an answer, like stepping stones across a stream. This makes reasoning models like o1 ideally suited for jobs that require complex, logic-driven problem solving, like STEM research or advanced coding projects. 
",{"_key":404,"_type":27,"marks":405,"text":407},"c203bccc857d1",[406],"d803f0cd7753","Research has shown",{"_key":409,"_type":27,"marks":410,"text":411},"c203bccc857d2",[]," that chain-of-thought prompting significantly improves LLMs’ ability to perform this type of reasoning.",[413],{"_key":406,"_type":122,"href":414},"https://arxiv.org/abs/2201.11903",{"_key":416,"_type":16,"asset":417},"f3725f4aa303",{"_ref":418,"_type":19},"image-70034f1294217e3c1e9ad48396c09df0ab66e5f7-1431x681-png",{"_key":420,"_type":23,"children":421,"markDefs":431,"style":31},"f5e99350d101",[422,427],{"_key":423,"_type":27,"marks":424,"text":426},"576c522fbed00",[425],"4c9b4cbcb585","Reasoning models",{"_key":428,"_type":27,"marks":429,"text":430},"576c522fbed01",[]," are high-latency compared to real-time models that respond in literal milliseconds. GPT-4o, for example, is up to 30x faster than o1. And they’re not just slower; they’re also more expensive. o1 costs $60 per one million input tokens, while GPT-4o costs $15 per one million.",[432],{"_key":425,"_type":122,"href":387},{"_key":434,"_type":23,"children":435,"markDefs":440,"style":212},"208d38765259",[436],{"_key":437,"_type":27,"marks":438,"text":439},"beccf652ea040",[],"Higher token limits",[],{"_key":442,"_type":23,"children":443,"markDefs":457,"style":31},"0e4cf38c3151",[444,448,453],{"_key":445,"_type":27,"marks":446,"text":447},"26f7639e53c60",[],"Another major factor behind the continued improvement of AI models is their ",{"_key":449,"_type":27,"marks":450,"text":452},"26f7639e53c61",[451],"b14253b30a78","higher token limits",{"_key":454,"_type":27,"marks":455,"text":456},"26f7639e53c62",[],". 
Tokens are the pieces of text the model uses to process language, with the number of tokens dictating the amount of information the model can use in its reasoning.",[458],{"_key":451,"_type":122,"href":459},"https://medium.com/@jaimonjk/how-can-large-token-limits-in-new-llm-models-transform-the-learning-and-development-function-5fc643c8df0d",{"_key":461,"_type":23,"children":462,"markDefs":467,"style":31},"cc2e770b4881",[463],{"_key":464,"_type":27,"marks":465,"text":466},"39938786612f0",[],"Both OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro boast high token limits that allow them to manage and understand much bigger pieces of text within a single interaction. Higher token limits enable us to use AI for more complicated, data-intensive projects than were previously feasible, with no need for extensive fine-tuning of the model.",[],{"_type":162,"seoImage":469},{"_type":16,"asset":470},{"_ref":471,"_type":19},"image-5a6c699f79f68f799c090ff5cf157477dbf3d084-2400x1261-png",{"_type":169,"current":473},"genai-and-llms","The rise of GenAI and LLMs",{"_key":476,"_type":262,"body":477,"slug":862,"title":864},"40cbe3ddb61e",[478,508,527,535,554,558,577,585,593,645,650,658,709,717,725,733,741,770,778,808,816,824,832],{"_key":479,"_type":23,"children":480,"markDefs":503,"style":31},"41f2c56ef8de",[481,485,490,494,499],{"_key":482,"_type":27,"marks":483,"text":484},"b40dadd128650",[],"AI hype reached a fever pitch in 2024, fueled by mega investments in technology and a surge in product licenses. ",{"_key":486,"_type":27,"marks":487,"text":489},"b40dadd128651",[488],"c065eab9b69c","PwC projects",{"_key":491,"_type":27,"marks":492,"text":493},"b40dadd128652",[]," global AI investment will hit $15.7 trillion by 2030, growing at 38% annually. 
OpenAI’s valuation ",{"_key":495,"_type":27,"marks":496,"text":498},"b40dadd128653",[497],"5efc2aca3c92","soared to $150 billion",{"_key":500,"_type":27,"marks":501,"text":502},"b40dadd128654",[]," after securing the largest venture capital investment round in history. These investments underpin the potential for AI but raised questions about long-term profitability and growth, especially as training frontier models becomes increasingly costly.",[504,506],{"_key":488,"_type":122,"href":505},"https://www.pwc.com/gx/en/issues/ai.html",{"_key":497,"_type":122,"href":507},"https://pitchbook.com/profiles/company/149504-14",{"_key":509,"_type":23,"children":510,"markDefs":524,"style":31},"b85b1aa3b1b5",[511,515,520],{"_key":512,"_type":27,"marks":513,"text":514},"57c71348b0c00",[],"For business users, AI is finally starting to bear fruit: ",{"_key":516,"_type":27,"marks":517,"text":519},"57c71348b0c01",[518],"2d6cebaa3ef6","73% of companies",{"_key":521,"_type":27,"marks":522,"text":523},"57c71348b0c02",[]," already use AI in at least one business area. Startups are hacking different AI tools for automation, while enterprises use both bespoke and off-the-shelf solutions for wide-ranging tasks from personalizing sales emails to supply chain optimization.",[525],{"_key":518,"_type":122,"href":526},"https://www.forbes.com/councils/forbestechcouncil/2024/05/06/finding-roai-strategic-benchmarking-for-ai-powered-business-success/",{"_key":528,"_type":23,"children":529,"markDefs":534,"style":212},"2a34c39765b7",[530],{"_key":531,"_type":27,"marks":532,"text":533},"6d114928c7580",[],"LLMs went mainstream",[],{"_key":536,"_type":23,"children":537,"markDefs":551,"style":31},"0490f4b99533",[538,542,547],{"_key":539,"_type":27,"marks":540,"text":541},"12ddaa32a5ed0",[],"For mature adopters, large language models (LLMs) transitioned last year from pilot programs to establishing generative AI (GenAI) tools in many workflows. 
OpenAI’s ChatGPT, Microsoft’s Copilot, and Anthropic’s Claude vied for attention to become your LLM of choice. AI companies hype each release as a step change, as with OpenAI’s ",{"_key":543,"_type":27,"marks":544,"text":546},"12ddaa32a5ed1",[545],"0f65a82957f7","12 Days of Ship-mas",{"_key":548,"_type":27,"marks":549,"text":550},"12ddaa32a5ed2",[]," in December 2024, which launched new products and features including Sora, an advanced video generation tool.",[552],{"_key":545,"_type":122,"href":553},"https://openai.com/12-days/",{"_key":555,"_type":185,"caption":556,"url":557},"08f6cb741398","Introducing Sora — OpenAI’s text-to-video model","https://www.youtube.com/watch?v=HK6y8DAPN_0",{"_key":559,"_type":23,"children":560,"markDefs":574,"style":31},"9f36a0f3ea4e",[561,565,570],{"_key":562,"_type":27,"marks":563,"text":564},"cda12911789e0",[],"Beyond summarizing your scrum actions, AI solutions and the rising demand for compute power are driving foundational model advancements in life-enhancing fields like biology, genomics, and neuroscience. Expanded AI models have bolstered cybersecurity with enhanced attack detection and real-time network protection. Edge AI development and reduced reliance on the cloud are improving latency for IoT and mobile applications. While some early users are reaping rewards, many businesses are still in the cautious adoption stage, not yet achieving the promised gains in productivity. Gartner reported that ",{"_key":566,"_type":27,"marks":567,"text":569},"cda12911789e1",[568],"c0a46f822f0e","less than 4%",{"_key":571,"_type":27,"marks":572,"text":573},"cda12911789e2",[]," of IT leaders find Microsoft Copilot valuable. In organizations with inadequate data access protocols, the tool leaked sensitive company data like salaries and HR files internally. Successful organizations need to have a defined plan for adoption and integration with IT and security systems. 
They must avoid “shiny object syndrome”: the temptation to chase each new and improved AI model launch.",[575],{"_key":568,"_type":122,"href":576},"https://substack.com/redirect/41a49453-4a6e-451e-90d4-fceb0d4dfb22?j=eyJ1IjoiM3A3ZmkyIn0.pgl_QlfsV2LRdHpuMW8ww4qVsrC9DcXgf9ASmQmfNlg",{"_key":578,"_type":23,"children":579,"markDefs":584,"style":212},"254fd9f814aa",[580],{"_key":581,"_type":27,"marks":582,"text":583},"b5e7c3b4cca10",[],"CodeGen tools streamline development",[],{"_key":586,"_type":23,"children":587,"markDefs":592,"style":31},"399d562fad6b",[588],{"_key":589,"_type":27,"marks":590,"text":591},"183decd0b0b30",[],"AI code generation tools are rapidly gaining traction among developers. They’re becoming more advanced, allowing for real-time code generation, automated bug detection, and performance optimization.",[],{"_key":594,"_type":23,"children":595,"markDefs":636,"style":31},"c443e489728d",[596,600,605,609,614,618,623,627,632],{"_key":597,"_type":27,"marks":598,"text":599},"5699fe5a2143",[],"The AI code tools market is ",{"_key":601,"_type":27,"marks":602,"text":604},"183decd0b0b31",[603],"6e0ff6abf736","projected to triple in value",{"_key":606,"_type":27,"marks":607,"text":608},"183decd0b0b32",[]," to $12.6 billion by 2028, by which time ",{"_key":610,"_type":27,"marks":611,"text":613},"183decd0b0b33",[612],"6ae7a4c924ab","Gartner predicts",{"_key":615,"_type":27,"marks":616,"text":617},"183decd0b0b34",[]," that three in four enterprise software engineers will use AI code assistants, up from one in ten in 2023. Three in four (76%) of our ",{"_key":619,"_type":27,"marks":620,"text":622},"183decd0b0b35",[621],"0595f0ee4b4a","survey",{"_key":624,"_type":27,"marks":625,"text":626},"183decd0b0b36",[]," respondents currently use or are planning to use code assistants. 
Google is leading this adoption, announcing in its Q3 earnings call that ",{"_key":628,"_type":27,"marks":629,"text":631},"183decd0b0b37",[630],"8672fe3a3d65","over 25% of its new code is now generated by AI",{"_key":633,"_type":27,"marks":634,"text":635},"183decd0b0b38",[],".",[637,639,641,643],{"_key":603,"_type":122,"href":638},"https://www.marketsandmarkets.com/Market-Reports/ai-code-tools-market-239940941.html",{"_key":612,"_type":122,"href":640},"https://blogs.oracle.com/ai-and-datascience/post/ai-code-assistants-are-on-the-rise-big-time",{"_key":621,"_type":122,"href":642},"https://stackoverflow.blog/2024/05/29/developers-get-by-with-a-little-help-from-ai-stack-overflow-knows-code-assistant-pulse-survey-results/",{"_key":630,"_type":122,"href":644},"https://www.businessinsider.com/google-earnings-q3-2024-new-code-created-by-ai-2024-10",{"_key":646,"_type":16,"asset":647,"caption":649,"source":642},"6acd53afcb56",{"_ref":648,"_type":19},"image-88c7e8204b76e2d6dcdd19d8d3808fdeaaa68f81-2400x1256-png","Top 3 code assistants – ChatGPT (84%), GitHub Copilot (49%), Visual Studio IntelliCode (11%)",{"_key":651,"_type":23,"children":652,"markDefs":657,"style":31},"048ae0299ed0",[653],{"_key":654,"_type":27,"marks":655,"text":656},"a13dee46bb7c0",[],"Expanding context windows, now reaching up to one million tokens, let models process vast amounts of code or documents at once, streamlining tasks like large-scale code refactoring and summarizing documentation repositories.",[],{"_key":659,"_type":23,"children":660,"markDefs":701,"style":31},"3d747267be5d",[661,665,670,674,679,683,688,692,697],{"_key":662,"_type":27,"marks":663,"text":664},"b676d060893d0",[],"Code assistants are becoming more advanced, offering real-time code generation, automated bug detection, and performance optimization. 
Cursor, an AI-driven code editor, ",{"_key":666,"_type":27,"marks":667,"text":669},"b676d060893d1",[668],"8e3b1d1197b5","soared in popularity",{"_key":671,"_type":27,"marks":672,"text":673},"b676d060893d2",[],", securing $60 million in funding in 2024. It streamlines tasks like creating database schemas and generating user interfaces. An A/B test showed that developers using ",{"_key":675,"_type":27,"marks":676,"text":678},"b676d060893d3",[677],"4481b4c3eb3b","GitHub Copilot",{"_key":680,"_type":27,"marks":681,"text":682},"b676d060893d4",[]," completed tasks 55% faster than those without it, saving on average 90 minutes per task. LLM performance is improving with better reasoning and multimodal capabilities for processing text and images. Yet accuracy remains a concern, with hallucinations still ",{"_key":684,"_type":27,"marks":685,"text":687},"b676d060893d5",[686],"898c449085a6","a feature rather than a bug of models",{"_key":689,"_type":27,"marks":690,"text":691},"b676d060893d6",[],". In our ",{"_key":693,"_type":27,"marks":694,"text":696},"b676d060893d7",[695],"267393c7d010","annual survey",{"_key":698,"_type":27,"marks":699,"text":700},"b676d060893d8",[],", 38% of developers said that responses from AI assistants are inaccurate at least half the time, resulting in additional validation and busywork that undermine AI productivity goals. 
LLMs still struggle with context, complexity, and obscurity to deliver accurate code.",[702,704,706,708],{"_key":668,"_type":122,"href":703},"https://randomcoding.com/blog/2024-09-15-is-cursor-ais-code-editor-any-good",{"_key":677,"_type":122,"href":705},"https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/",{"_key":686,"_type":122,"href":707},"https://arxiv.org/abs/2409.05746",{"_key":695,"_type":122,"href":642},{"_key":710,"_type":23,"children":711,"markDefs":716,"style":212},"ec4827ae4baf",[712],{"_key":713,"_type":27,"marks":714,"text":715},"7643b913c5f30",[],"Where next: Narrow AI tools to agentic AI",[],{"_key":718,"_type":23,"children":719,"markDefs":724,"style":31},"f2ebd6ad210a",[720],{"_key":721,"_type":27,"marks":722,"text":723},"f0cadc5abbc60",[],"While general-purpose LLMs like GPT-4 address a wide range of needs, niche AI tools—often built on OpenAI APIs—offer specialized functionality. Open-source platforms like Hugging Face have surpassed 250,000 pre-trained models, supporting developers to create custom solutions. IT decision-makers must review broader company licenses against requests for “narrow AI”—AI tools to complete specific tasks—and consider how to balance opportunities with security concerns. Agentic AI represents the next stage of AI development. Agentic systems will autonomously plan, reason, and execute tasks across complex workflows without direct human input. 
Potential applications include diverse use cases from autonomous software management to advanced robotics.",[],{"_key":726,"_type":23,"children":727,"markDefs":732,"style":212},"c6888a86408f",[728],{"_key":729,"_type":27,"marks":730,"text":731},"eaf8c8543f950",[],"Responsible AI in the spotlight",[],{"_key":734,"_type":23,"children":735,"markDefs":740,"style":31},"36e1d26f12d0",[736],{"_key":737,"_type":27,"marks":738,"text":739},"4ba5cc48926d0",[],"This year, conversations about the existential risks to humanity of artificial general intelligence (AGI) have refocused to address pragmatic issues like vulnerabilities, bias, and misuse, prompting calls for stronger safeguards.",[],{"_key":742,"_type":23,"children":743,"markDefs":765,"style":31},"9be8598651c4",[744,748,753,757,762],{"_key":745,"_type":27,"marks":746,"text":747},"1570de72a2c80",[],"The ",{"_key":749,"_type":27,"marks":750,"text":752},"1570de72a2c81",[751],"0f7fce77ba88","EU AI Act",{"_key":754,"_type":27,"marks":755,"text":756},"1570de72a2c82",[],", which entered into force in August 2024, impacts organizations whose AI system outputs can be accessed by EU citizens. The Act requires developers to prioritize responsible AI and robust data management practices. It bans real-time biometric identification in public spaces and manipulative advertising techniques. High-risk applications in law enforcement and recruitment must meet strict protocols to ensure fairness, equity, and transparency. 
In the US, state-led efforts have shaped preliminary frameworks, though the new presidency in 2025 could stall outgoing President Biden's proposed ",{"_key":758,"_type":27,"marks":759,"text":761},"1570de72a2c83",[760],"af03e838ad9e","Blueprint for an AI Bill of Rights",{"_key":763,"_type":27,"marks":764,"text":635},"1570de72a2c84",[],[766,768],{"_key":751,"_type":122,"href":767},"https://artificialintelligenceact.eu/high-level-summary/",{"_key":760,"_type":122,"href":769},"https://www.whitehouse.gov/ostp/ai-bill-of-rights/",{"_key":771,"_type":23,"children":772,"markDefs":777,"style":212},"8c666e032def",[773],{"_key":774,"_type":27,"marks":775,"text":776},"6fec09683cca0",[],"AI skills gaps could hold back growth",[],{"_key":779,"_type":23,"children":780,"markDefs":803,"style":31},"08b9a743073a",[781,785,790,794,799],{"_key":782,"_type":27,"marks":783,"text":784},"064b64d6578d0",[],"The rapid pace of AI development has created significant skills gaps across industries. ",{"_key":786,"_type":27,"marks":787,"text":789},"064b64d6578d1",[788],"e70d88dec40c","AI skills have a “half-life",{"_key":791,"_type":27,"marks":792,"text":793},"064b64d6578d2",[],",” meaning they become outdated twice as fast as other skills. To keep pace with this fast-evolving technology, developers and other technologists need to invest in ongoing learning. 
Professor Ethan Mollick of the Wharton School ",{"_key":795,"_type":27,"marks":796,"text":798},"064b64d6578d3",[797],"6bdebd923a70","suggests spending about 10 hours exploring each new AI system,",{"_key":800,"_type":27,"marks":801,"text":802},"064b64d6578d4",[]," starting with tasks familiar to you and gradually expanding until the “jagged edge” of the system’s limits becomes clear.",[804,806],{"_key":788,"_type":122,"href":805},"https://www.forbes.com/sites/joemckendrick/2024/04/30/ai-puts-the-squeeze-on-the-shrinking-half-life-of-skills/",{"_key":797,"_type":122,"href":807},"https://www.oneusefulthing.org/p/getting-started-with-ai-good-enough?utm_source=www.theneurondaily.com&utm_medium=newsletter&utm_campaign=claude-masters-your-voice&_bhlid=553b8009f1de79067abc51ca6d04749e8bc0b32f",{"_key":809,"_type":23,"children":810,"markDefs":815,"style":31},"f60a7ef1ee5d",[811],{"_key":812,"_type":27,"marks":813,"text":814},"b8eafff8f0db0",[],"To an experienced developer’s eye, errors in AI-generated code are easier to spot and test. For early career developers, the limitations of AI system outputs, which often appear plausible at first look, can be more challenging to identify and correct.",[],{"_key":817,"_type":23,"children":818,"markDefs":823,"style":31},"ad4ac4d3db90",[819],{"_key":820,"_type":27,"marks":821,"text":822},"17ebef4a88c60",[],"Prompt engineering has evolved from a novel skill to a critical competency. Effective prompting remains crucial for technical and creative briefs. 
For simpler tasks, LLMs are becoming more intuitive, requiring less user input to understand queries and making AI more accessible to non-technical users.",[],{"_key":825,"_type":23,"children":826,"markDefs":831,"style":212},"639994d67df8",[827],{"_key":828,"_type":27,"marks":829,"text":830},"84cf1bedccd50",[],"Where AI is heading for developers in 2025",[],{"_key":833,"_type":23,"children":834,"markDefs":857,"style":31},"4ef6346c3ed1",[835,839,844,848,853],{"_key":836,"_type":27,"marks":837,"text":838},"f53d0dd62d100",[],"AI in 2024 transformed from an evolving technology to an essential tool for ambitious organizations. If, as experts like Yale University’s ",{"_key":840,"_type":27,"marks":841,"text":843},"f53d0dd62d101",[842],"a3acd8e3050f","Luciano Floridi predict",{"_key":845,"_type":27,"marks":846,"text":847},"f53d0dd62d102",[],", the hype cycle follows the path of other general-purpose technologies like the internet, 2025 could be the year AI reaches its “Plateau of Productivity” when ",{"_key":849,"_type":27,"marks":850,"text":852},"f53d0dd62d103",[851],"5d156c93264b","mainstream adoption takes off",{"_key":854,"_type":27,"marks":855,"text":856},"f53d0dd62d104",[],". 
Developers must continue to test tools and establish the right use cases to integrate AI into their workflows and integrate responsible AI principles into the development of new systems.",[858,860],{"_key":842,"_type":122,"href":859},"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4960826",{"_key":851,"_type":122,"href":861},"https://www.gartner.com/en/research/methodologies/gartner-hype-cycle",{"_type":169,"current":863},"recent-evolution","How GenAI evolved in 2024",{"_key":866,"_type":262,"body":867,"seo":997,"slug":1001,"title":1003},"7b5058677686",[868,876,884,892,900,908,916,924,932,946,958,970,978,981,989],{"_key":869,"_type":23,"children":870,"markDefs":875,"style":31},"b681cb19379d",[871],{"_key":872,"_type":27,"marks":873,"text":874},"e422cea80bdc0",[],"Any new technology, especially one that’s captured the public imagination like GenAI, eventually faces a reality check. When GenAI tools like ChatGPT first became generally available, the excitement over their potential quickly struck a fever pitch. These models were seen as groundbreaking innovations on the cusp of revolutionizing every aspect of our existence. However, as with all technological advancements, a clearer picture emerges with time; the focus shifts from unbridled potential to practical application.",[],{"_key":877,"_type":23,"children":878,"markDefs":883,"style":31},"c67621e3af36",[879],{"_key":880,"_type":27,"marks":881,"text":882},"cd552524bdaa0",[],"A few years out from ChatGPT’s explosive arrival on the marketplace, we can have a much more grounded conversation about how people are actually using AI. 
Of particular interest to our global audience of programmers and technologists is how developers are incorporating AI tools into their workflows.",[],{"_key":885,"_type":23,"children":886,"markDefs":891,"style":212},"8e6a1c5ce90f",[887],{"_key":888,"_type":27,"marks":889,"text":890},"b68bf934c4a70",[],"From big potential to practical application",[],{"_key":893,"_type":23,"children":894,"markDefs":899,"style":31},"3bf6eeb8ca38",[895],{"_key":896,"_type":27,"marks":897,"text":898},"0a84b2d6910d0",[],"Developers and businesses have begun integrating AI tools into their operations in ways both predictable and surprising. From accelerating code quality testing and shortening time to production to automating customer service chats, AI applications have evolved to meet real-world needs.",[],{"_key":901,"_type":23,"children":902,"markDefs":907,"style":31},"7023e5f7f2c4",[903],{"_key":904,"_type":27,"marks":905,"text":906},"0666be566b370",[],"In many development environments, AI coding tools improve developer productivity and enhance the learning process, especially for junior devs, by suggesting code snippets, debugging errors, and automating security and code quality tests. They help streamline developers’ workflows, allowing them to focus on more complex problem-solving tasks and higher-order creative work.",[],{"_key":909,"_type":23,"children":910,"markDefs":915,"style":212},"dcd568529c79",[911],{"_key":912,"_type":27,"marks":913,"text":914},"0cbf6e34f8600",[],"Reasons for excitement",[],{"_key":917,"_type":23,"children":918,"markDefs":923,"style":31},"ddeecbb24c89",[919],{"_key":920,"_type":27,"marks":921,"text":922},"ec6296c8d7870",[],"One of the most promising aspects of today’s LLMs is how quickly their capabilities are improving. 
Each new generation of LLMs arrives with improved accuracy, understanding, and usability, making them more valuable to developers with each iteration.",[],{"_key":925,"_type":23,"children":926,"markDefs":931,"style":31},"9189e8d9664f",[927],{"_key":928,"_type":27,"marks":929,"text":930},"1381771cf98e0",[],"As we mentioned in the previous section, several developments underscore the substantial progress AI technology has made and reveal a future rich with possibilities.",[],{"_key":933,"_type":23,"children":934,"level":943,"listItem":944,"markDefs":945,"style":31},"5335effb445a",[935,939],{"_key":936,"_type":27,"marks":937,"text":938},"3ae6d51784780",[151],"Multimodal LLMs:",{"_key":940,"_type":27,"marks":941,"text":942},"3ae6d51784781",[]," These models can process and generate not just text but also images, video, and other forms of data, allowing for richer, more versatile user experiences. By combining different types of information, these systems offer more relevant insights and comprehensive solutions.",1,"bullet",[],{"_key":947,"_type":23,"children":948,"level":943,"listItem":944,"markDefs":957,"style":31},"8baa05bcd499",[949,953],{"_key":950,"_type":27,"marks":951,"text":952},"0edb72ebbaae0",[151],"Reasoning capabilities:",{"_key":954,"_type":27,"marks":955,"text":956},"0edb72ebbaae1",[]," The rise of reasoning-based LLMs marks another monumental step. These models go beyond simple language prediction to engage in deeper reasoning tasks, simulating a form of understanding more in line with human cognition. 
This enhances their ability to aid in problem-solving and decision-making processes.",[],{"_key":959,"_type":23,"children":960,"level":943,"listItem":944,"markDefs":969,"style":31},"86ed07ad54ae",[961,965],{"_key":962,"_type":27,"marks":963,"text":964},"6d6b026d35610",[151],"Expanding token limits: ",{"_key":966,"_type":27,"marks":967,"text":968},"6d6b026d35611",[],"Another important leap forward has been in expanding token limits, which enable models to handle larger contexts of text input. Higher token limits enhance the models’ understanding of nuanced and complex conversations, making them useful in scenarios requiring sophisticated dialogue and complex problem-solving.",[],{"_key":971,"_type":23,"children":972,"markDefs":977,"style":212},"c4e840118269",[973],{"_key":974,"_type":27,"marks":975,"text":976},"363c04006d9f0",[],"Pressure to produce tangible results",[],{"_key":979,"_type":185,"url":980},"c961996e30ee","https://platform.twitter.com/embed/Tweet.html?dnt=false&embedId=twitter-widget-0&features=eyJ0ZndfdGltZWxpbmVfbGlzdCI6eyJidWNrZXQiOltdLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X2ZvbGxvd2VyX2NvdW50X3N1bnNldCI6eyJidWNrZXQiOnRydWUsInZlcnNpb24iOm51bGx9LCJ0ZndfdHdlZXRfZWRpdF9iYWNrZW5kIjp7ImJ1Y2tldCI6Im9uIiwidmVyc2lvbiI6bnVsbH0sInRmd19yZWZzcmNfc2Vzc2lvbiI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfZm9zbnJfc29mdF9pbnRlcnZlbnRpb25zX2VuYWJsZWQiOnsiYnVja2V0Ijoib24iLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X21peGVkX21lZGlhXzE1ODk3Ijp7ImJ1Y2tldCI6InRyZWF0bWVudCIsInZlcnNpb24iOm51bGx9LCJ0ZndfZXhwZXJpbWVudHNfY29va2llX2V4cGlyYXRpb24iOnsiYnVja2V0IjoxMjA5NjAwLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X3Nob3dfYmlyZHdhdGNoX3Bpdm90c19lbmFibGVkIjp7ImJ1Y2tldCI6Im9uIiwidmVyc2lvbiI6bnVsbH0sInRmd19kdXBsaWNhdGVfc2NyaWJlc190b19zZXR0aW5ncyI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfdXNlX3Byb2ZpbGVfaW1hZ2Vfc2hhcGVfZW5hYmxlZCI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfdmlkZW9faGxzX2R5bmFtaWNfbWFuaWZlc3RzXzE1MDgyIjp7ImJ1Y2tldCI6InRydWVfYml0cmF0ZSIsInZlcnNpb24iOm51bGx9LCJ0ZndfbGVnYWN5X3RpbWVsaW5lX3N1bnNldCI6e
yJidWNrZXQiOnRydWUsInZlcnNpb24iOm51bGx9LCJ0ZndfdHdlZXRfZWRpdF9mcm9udGVuZCI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9fQ%3D%3D&frame=false&hideCard=false&hideThread=false&id=1876104315296968813&lang=en&origin=https%3A%2F%2Fpublish.twitter.com%2F%3Fquery%3Dhttps%253A%252F%252Fx.com%252Fsama%252Fstatus%252F1876104315296968813%25E2%2580%259D%26widget%3DTweet&sessionId=35835483d04b73bfd85847d7211f006f72874ca8&theme=light&widgetsVersion=2615f7e52b7e0%3A1702314776716&width=550px",{"_key":982,"_type":23,"children":983,"markDefs":988,"style":31},"2b94a72050a9",[984],{"_key":985,"_type":27,"marks":986,"text":987},"a64a0f584c820",[],"Despite these promising developments, organizations that have heavily invested in AI over the last couple of years face an increasingly pressing need to demonstrate real results. Stakeholders, eager to see returns on substantial investments, are pushing for AI applications that not only showcase a company’s technical prowess but also drive measurable business outcomes.",[],{"_key":990,"_type":23,"children":991,"markDefs":996,"style":31},"b3410e21ac06",[992],{"_key":993,"_type":27,"marks":994,"text":995},"fee168bdfd4a0",[],"In response, companies are intently focused on integrating AI with clear business metrics, whether that’s increasing employee productivity, enhancing customer satisfaction, or another marker of success that fits your business goal. 
This desire for accountability is not only shaping how AI is implemented across industries; it’s also driving a more strategic approach to AI deployments.",[],{"_type":162,"seoImage":998},{"_type":16,"asset":999},{"_ref":1000,"_type":19},"image-c24c01a7e7c09081d32a97ae6b1910d22bd9cdfa-2400x1260-png",{"_type":169,"current":1002},"from-hype-to-reality","From hype to reality",{"_key":1005,"_type":262,"body":1006,"seo":1360,"slug":1364,"title":1366},"ecd5b6085ef3",[1007,1015,1023,1027,1046,1054,1073,1081,1089,1093,1112,1120,1139,1147,1155,1159,1189,1197,1216,1224,1232,1236,1255,1263,1282,1290,1298,1302,1321,1329,1337,1341],{"_key":1008,"_type":23,"children":1009,"markDefs":1014,"style":212},"3ce4de3706df",[1010],{"_key":1011,"_type":27,"marks":1012,"text":1013},"7a080f2b436b0",[],"Medical",[],{"_key":1016,"_type":23,"children":1017,"markDefs":1022,"style":344},"6efdb7e5ef4a",[1018],{"_key":1019,"_type":27,"marks":1020,"text":1021},"262d2c2a40840",[],"Distilling the latest research.",[],{"_key":1024,"_type":16,"asset":1025},"1d7af4cb3522",{"_ref":1026,"_type":19},"image-a230f68d2ebf516709316d850401e3d9f559dede-1431x681-png",{"_key":1028,"_type":23,"children":1029,"markDefs":1043,"style":31},"a2b3ccd19adb",[1030,1034,1039],{"_key":1031,"_type":27,"marks":1032,"text":1033},"7fddf6d5d0930",[],"Every day, hundreds of new research papers and trial results are released. It would be impossible for any one person, or even a small team, to keep up with it all, especially when research is published in dozens of different languages. That’s why ",{"_key":1035,"_type":27,"marks":1036,"text":1038},"7fddf6d5d0931",[1037],"630294905421","Sorcero",{"_key":1040,"_type":27,"marks":1041,"text":1042},"7fddf6d5d0932",[],", an AI firm focused on medical intelligence, has built a system to ingest the torrent of data being published each day. Teams inside pharmaceutical companies can then ask for updates on topics that are relevant to the disease, drug, or procedure they’re focused on. 
Their GenAI technology can produce a synopsis, translate across languages, and help distill complex medical terminology into something that’s easier to understand.",[1044],{"_key":1037,"_type":122,"href":1045},"https://www.sorcero.com/",{"_key":1047,"_type":23,"children":1048,"markDefs":1053,"style":344},"20329cc739b5",[1049],{"_key":1050,"_type":27,"marks":1051,"text":1052},"e1871ef5bdc80",[],"Speeding up drug breakthroughs.",[],{"_key":1055,"_type":23,"children":1056,"markDefs":1070,"style":31},"0dca026589ce",[1057,1061,1066],{"_key":1058,"_type":27,"marks":1059,"text":1060},"c5bc2a55ff610",[],"Pfizer is making clinical trials faster and smarter with",{"_key":1062,"_type":27,"marks":1063,"text":1065},"c5bc2a55ff611",[1064],"00ccda64c646"," AI-powered optimization",{"_key":1067,"_type":27,"marks":1068,"text":1069},"c5bc2a55ff612",[],". Machine learning tools analyze vast clinical datasets to predict outcomes and risks and automate drug trial designs. This AI “medicine” is fast acting: It's led to shorter research timelines and quicker delivery of breakthrough treatments to patients.",[1071],{"_key":1064,"_type":122,"href":1072},"https://www.pfizer.com/news/articles/artificial_intelligence_on_a_mission_to_make_clinical_drug_development_faster_and_smarter",{"_key":1074,"_type":23,"children":1075,"markDefs":1080,"style":212},"d3537f75f3dc",[1076],{"_key":1077,"_type":27,"marks":1078,"text":1079},"23aca2e8c47f0",[],"Finance",[],{"_key":1082,"_type":23,"children":1083,"markDefs":1088,"style":344},"0cdcf958cd82",[1084],{"_key":1085,"_type":27,"marks":1086,"text":1087},"35d4c3697b510",[],"Allowing a broader group of less sophisticated investors to access, understand, and make use of market 
data.",[],{"_key":1090,"_type":16,"asset":1091},"4d3acc1247fa",{"_ref":1092,"_type":19},"image-bcf5e24ec770840278b41eec22a6af0563a9a6ae-1430x681-png",{"_key":1094,"_type":23,"children":1095,"markDefs":1109,"style":31},"95028388c85a",[1096,1100,1105],{"_key":1097,"_type":27,"marks":1098,"text":1099},"e1627f9dd73e0",[],"Bloomberg created its own LLM, BloombergGPT, based on its extensive collection of financial data. The system has two purposes. First, it can improve on automated tasks Bloomberg is already doing in-house every day, like natural language processing, news classification, and sentiment analysis. Second, the system will allow clients to make sense of the vast amounts of data flowing through their Bloomberg Terminal, providing synopses of market moving events that separate the signal from the noise. (",{"_key":1101,"_type":27,"marks":1102,"text":1104},"e1627f9dd73e1",[1103],"193d76d7d42f","Source",{"_key":1106,"_type":27,"marks":1107,"text":1108},"e1627f9dd73e2",[],")",[1110],{"_key":1103,"_type":122,"href":1111},"https://arxiv.org/abs/2303.17564",{"_key":1113,"_type":23,"children":1114,"markDefs":1119,"style":344},"8eccc1a543e8",[1115],{"_key":1116,"_type":27,"marks":1117,"text":1118},"46b1a28aa3210",[],"Smarter investment portfolios with AI.",[],{"_key":1121,"_type":23,"children":1122,"markDefs":1136,"style":31},"d2d944411d6f",[1123,1127,1132],{"_key":1124,"_type":27,"marks":1125,"text":1126},"20de0dd26efc0",[],"BlackRock is optimizing investments with its ",{"_key":1128,"_type":27,"marks":1129,"text":1131},"20de0dd26efc1",[1130],"cf04af5efc3d","AI-powered platform Aladdin",{"_key":1133,"_type":27,"marks":1134,"text":1135},"20de0dd26efc2",[],". It isn’t magic, but it’s like a genie in a bot(tle). Aladdin uses machine learning to analyze big datasets and help fund managers discover investment opportunities and manage risk. 
By simulating real-world market conditions, Aladdin delivers predictive insights to build resilient approaches to portfolio management. Aladdin uses scalable AI frameworks, robust APIs, and cloud integration to deliver financial market insights.",[1137],{"_key":1130,"_type":122,"href":1138},"https://www.blackrock.com/us/individual/insights/ai-investing",{"_key":1140,"_type":23,"children":1141,"markDefs":1146,"style":212},"b45b0967e234",[1142],{"_key":1143,"_type":27,"marks":1144,"text":1145},"58640bd826b10",[],"Legal",[],{"_key":1148,"_type":23,"children":1149,"markDefs":1154,"style":344},"048aebbce0b5",[1150],{"_key":1151,"_type":27,"marks":1152,"text":1153},"3cbf0e9792a90",[],"Providing advice and crafting early drafts for lawyers.",[],{"_key":1156,"_type":16,"asset":1157},"3dae72a76222",{"_ref":1158,"_type":19},"image-3cbc4ee35ad6fbe9ac397c72f8a5ac87e5ca1e76-1431x681-png",{"_key":1160,"_type":23,"children":1161,"markDefs":1184,"style":31},"3c224bab6a47",[1162,1166,1171,1175,1180],{"_key":1163,"_type":27,"marks":1164,"text":1165},"8a7c45d3d7020",[],"As the startup Harvey AI explains: “Legal work is the ultimate text-in, text-out business—a bull’s-eye for language models.” Their GenAI assistant tackles tasks like legal research and due diligence that require time-consuming labor across large amounts of text. With the AI searching through legal libraries and case files, the law firm has more time to focus on client relationships and strategic work. 
In February 2023, ",{"_key":1167,"_type":27,"marks":1168,"text":1170},"8a7c45d3d7021",[1169],"ff055ca17408","Allen & Overy",{"_key":1172,"_type":27,"marks":1173,"text":1174},"8a7c45d3d7022",[]," became the first announced enterprise customer, and the following month ",{"_key":1176,"_type":27,"marks":1177,"text":1179},"8a7c45d3d7023",[1178],"6b94ae34d4b8","PwC",{"_key":1181,"_type":27,"marks":1182,"text":1183},"8a7c45d3d7024",[]," announced it was coming on board.",[1185,1187],{"_key":1169,"_type":122,"href":1186},"https://www.allenovery.com/en-gb/global/news-and-insights/news/ao-announces-exclusive-launch-partnership-with-harvey",{"_key":1178,"_type":122,"href":1188},"https://www.pwc.com/gx/en/news-room/press-releases/2023/pwc-announces-strategic-alliance-with-harvey-positioning-pwcs-legal-business-solutions-at-the-forefront-of-legal-generative-ai.html",{"_key":1190,"_type":23,"children":1191,"markDefs":1196,"style":344},"0af04cd2b37d",[1192],{"_key":1193,"_type":27,"marks":1194,"text":1195},"b467f1d440e00",[],"AI as a trusted legal eagle.",[],{"_key":1198,"_type":23,"children":1199,"markDefs":1213,"style":31},"dd0bbd5f90fb",[1200,1204,1209],{"_key":1201,"_type":27,"marks":1202,"text":1203},"cec6b346bde80",[],"Alexi is another company lightening the load for lawyers with an AI assistant. Built with advanced NLP and machine learning, Alexi automates routine legal tasks like contract reviews and document analysis. This helps legal pros focus on strategic, high-value work. Developers behind Alexi integrated compliance checks and oversight mechanisms to improve accuracy and meet legal standards. 
Hear from Alexi CEO Mark Doble ",{"_key":1205,"_type":27,"marks":1206,"text":1208},"cec6b346bde81",[1207],"189f5df76b3b","on our podcast",{"_key":1210,"_type":27,"marks":1211,"text":1212},"cec6b346bde82",[]," about how they reduced inaccuracies.",[1214],{"_key":1207,"_type":122,"href":1215},"https://stackoverflow.blog/2024/12/17/legal-advice-from-an-ai-is-illegal/",{"_key":1217,"_type":23,"children":1218,"markDefs":1223,"style":212},"7ff982576083",[1219],{"_key":1220,"_type":27,"marks":1221,"text":1222},"f6b8e672f9490",[],"Educational",[],{"_key":1225,"_type":23,"children":1226,"markDefs":1231,"style":344},"c9fa96a56f75",[1227],{"_key":1228,"_type":27,"marks":1229,"text":1230},"5ea9cca20dc40",[],"Using the Socratic method to help students learn without giving away the answers.",[],{"_key":1233,"_type":16,"asset":1234},"fa3c1a38df3c",{"_ref":1235,"_type":19},"image-9a56ce1c785fd712a707bee52817ee022b39aafa-1430x681-png",{"_key":1237,"_type":23,"children":1238,"markDefs":1252,"style":31},"de4683d8fee9",[1239,1243,1248],{"_key":1240,"_type":27,"marks":1241,"text":1242},"f0b169af0ef50",[],"Khan Academy was one of the first institutions to announce it would work with OpenAI’s GPT-4. The benefit of a large language model, according to founder Sal Khan, is that it can adapt to the grade level and language ability of each student: “I think we're at the cusp of using AI for probably the biggest positive transformation that education has ever seen,” he said. “The way we're going to do that is by giving every student on the planet an artificially intelligent, but amazing, personal tutor,” ",{"_key":1244,"_type":27,"marks":1245,"text":1247},"f0b169af0ef51",[1246],"698a2a4ac165","Khan said",{"_key":1249,"_type":27,"marks":1250,"text":1251},"f0b169af0ef52",[]," in a TED Talk about his company’s plans for utilizing GenAI. And don’t worry, the students won’t simply be using the AI to do their homework. 
It can be given system prompts to follow the Socratic method—meaning it will try to help students find their way to the correct answer, rather than simply providing them with the solution.",[1253],{"_key":1246,"_type":122,"href":1254},"https://www.ted.com/talks/sal_khan_how_ai_could_save_not_destroy_education",{"_key":1256,"_type":23,"children":1257,"markDefs":1262,"style":344},"d685a7910a07",[1258],{"_key":1259,"_type":27,"marks":1260,"text":1261},"2d4731aa07140",[],"Boosting learner motivation with a personal coach.",[],{"_key":1264,"_type":23,"children":1265,"markDefs":1279,"style":31},"81c1722b7e8a",[1266,1270,1275],{"_key":1267,"_type":27,"marks":1268,"text":1269},"1f7d0addeb160",[],"Online learning is harder without the motivation of a class tutor. Enter ",{"_key":1271,"_type":27,"marks":1272,"text":1274},"1f7d0addeb161",[1273],"d210c6fb6030","Coursera Coach",{"_key":1276,"_type":27,"marks":1277,"text":1278},"1f7d0addeb162",[],", a real-time AI assistant to help learners study. Coach answers questions, simplifies complex topics, and provides tailored advice for each learner. 
The tool uses large language models (LLMs) and natural language processing (NLP) for a more personalized learning experience that has boosted course completion rates.",[1280],{"_key":1273,"_type":122,"href":1281},"https://blog.coursera.org/coursera-coach-leveraging-genai-to-empower-learners/",{"_key":1283,"_type":23,"children":1284,"markDefs":1289,"style":212},"08f3a7bc5e28",[1285],{"_key":1286,"_type":27,"marks":1287,"text":1288},"df1eac1d07f10",[],"Manufacturing",[],{"_key":1291,"_type":23,"children":1292,"markDefs":1297,"style":344},"8bcb7e651582",[1293],{"_key":1294,"_type":27,"marks":1295,"text":1296},"671c2af7ea1d0",[],"AI drives smarter motor maintenance.",[],{"_key":1299,"_type":16,"asset":1300},"db2fd00812bc",{"_ref":1301,"_type":19},"image-b53bb1c7648357ff5ce50d3c5a292c901eb224e6-1431x681-png",{"_key":1303,"_type":23,"children":1304,"markDefs":1318,"style":31},"8feb0bf7d031",[1305,1309,1314],{"_key":1306,"_type":27,"marks":1307,"text":1308},"4271f4c472420",[],"General Motors is improving reliability with ",{"_key":1310,"_type":27,"marks":1311,"text":1313},"4271f4c472421",[1312],"beae63e624ed","AI-driven predictive maintenance",{"_key":1315,"_type":27,"marks":1316,"text":1317},"4271f4c472422",[],". Sensors across GM’s assembly lines capture real-time performance data to spot inefficiencies and predict equipment failures before they happen. 
Machine learning models analyze this data to recommend proactive fixes, using cloud-based AI frameworks to cut unplanned downtime by 35%.",[1319],{"_key":1312,"_type":122,"href":1320},"https://shoplogix.com/general-motors-oee/",{"_key":1322,"_type":23,"children":1323,"markDefs":1328,"style":212},"5b874ade8dec",[1324],{"_key":1325,"_type":27,"marks":1326,"text":1327},"c495358700710",[],"Retail",[],{"_key":1330,"_type":23,"children":1331,"markDefs":1336,"style":344},"72746d6124f3",[1332],{"_key":1333,"_type":27,"marks":1334,"text":1335},"0927ee63e83f0",[],"Making AI supply chain management fashionable.",[],{"_key":1338,"_type":16,"asset":1339},"25cf2e7374e1",{"_ref":1340,"_type":19},"image-9997b2c9d6c4fa157d91afd578399eaffc624aa2-1431x681-png",{"_key":1342,"_type":23,"children":1343,"markDefs":1357,"style":31},"508c4f26a7bd",[1344,1348,1353],{"_key":1345,"_type":27,"marks":1346,"text":1347},"e388fc88781a0",[],"Global fashion retailer Zara uses ",{"_key":1349,"_type":27,"marks":1350,"text":1352},"e388fc88781a1",[1351],"72f63cdcf5a3","AI-powered predictive analytics",{"_key":1354,"_type":27,"marks":1355,"text":1356},"e388fc88781a2",[]," to get the hottest fashion pieces in store without overloading warehouses. Machine learning models process vast datasets and analyze real-time sales data to accurately forecast demand. This reduces overstock and waste—and makes sure that funky Christmas sweater trending on Instagram is in stock for the holidays. 
On the ground, robots handle sorting, packing, and restocking.",[1358],{"_key":1351,"_type":122,"href":1359},"https://usmsystems.com/robotics-in-retail/",{"_type":162,"seoImage":1361},{"_type":16,"asset":1362},{"_ref":1363,"_type":19},"image-456fd7ebde082ae05a985b248580f00e5b4e3dd6-2400x1260-png",{"_type":169,"current":1365},"cross-industry-use","GenAI use across industries",{"_type":162,"seoDescription":1368,"seoImage":1369},"Explore the evolution of AI from Turing's era to today's GenAI, and how it's revolutionizing our world.",{"_type":16,"asset":1370},{"_ref":1371,"_type":19},"image-38becfcd04c2255ed5c88fca1020a3fa95ab7845-2400x1260-png",[],{"_type":169,"current":1374},"history","A brief history of AI",{"_key":1377,"_type":45,"body":1378,"sections":1538,"seo":2783,"sidebarCta":2788,"slug":2789,"title":2791},"15a3b192f2bc",[1379,1387,1395,1403,1411,1415,1423,1431,1435,1443,1451,1455,1463,1471,1475,1483,1491,1499,1507,1515,1522,1530],{"_key":1380,"_type":23,"children":1381,"markDefs":1386,"style":31},"aa383dfdf9c2",[1382],{"_key":1383,"_type":27,"marks":1384,"text":1385},"d834499c60710",[],"For some companies, access to an AI-powered chatbot or code generator will be enough. These are widely available as a service or via an API. For those looking to get a deeper integration, however, understanding the technologies used by AI and how they fit into your existing tech stack is essential. GenAI starts with the base foundation model. 
But which one should you use?",[],{"_key":1388,"_type":23,"children":1389,"markDefs":1394,"style":212},"b6e142f99510",[1390],{"_key":1391,"_type":27,"marks":1392,"text":1393},"13715fa83a810",[],"Components of GenAI",[],{"_key":1396,"_type":23,"children":1397,"markDefs":1402,"style":31},"9e11632b9e23",[1398],{"_key":1399,"_type":27,"marks":1400,"text":1401},"7727a1bfe46d0",[],"GenAI leverages additional advanced technologies that require thoughtful integration into your existing systems. Understanding these components is crucial for effective implementation.",[],{"_key":1404,"_type":23,"children":1405,"markDefs":1410,"style":344},"9d76a7fe2282",[1406],{"_key":1407,"_type":27,"marks":1408,"text":1409},"5fb2d53454990",[],"Vector storage ",[],{"_key":1412,"_type":16,"asset":1413},"6c77a4415acc",{"_ref":1414,"_type":19},"image-8a70d878c39a99db856814ec150877925491dc9d-1431x681-png",{"_key":1416,"_type":23,"children":1417,"markDefs":1422,"style":31},"0b7b9f81ecbf",[1418],{"_key":1419,"_type":27,"marks":1420,"text":1421},"3c3402182ce6",[],"A database that stores the vectors output by the embedding model. 
While there are plenty of dedicated vector databases, many databases are adding vector capabilities to their feature sets, so you can use a single database to store object and vector data.",[],{"_key":1424,"_type":23,"children":1425,"markDefs":1430,"style":344},"57227e166a31",[1426],{"_key":1427,"_type":27,"marks":1428,"text":1429},"1a00cc6dc3050",[],"Other storage",[],{"_key":1432,"_type":16,"asset":1433},"84123265126d",{"_ref":1434,"_type":19},"image-75b5f2a402c70b17aaa20e8eac01f39dc1bdf1a3-1431x681-png",{"_key":1436,"_type":23,"children":1437,"markDefs":1442,"style":31},"724450dbf74f",[1438],{"_key":1439,"_type":27,"marks":1440,"text":1441},"0b292e5fd457",[],"Possibly a data lakehouse or other unstructured object storage.",[],{"_key":1444,"_type":23,"children":1445,"markDefs":1450,"style":344},"e1e7a6357860",[1446],{"_key":1447,"_type":27,"marks":1448,"text":1449},"b87420adc6450",[],"Machine-learning framework ",[],{"_key":1452,"_type":16,"asset":1453},"629b7ccc6ecf",{"_ref":1454,"_type":19},"image-57a2b5bcc36b71694ad9edc64ce893216b0840a3-1431x681-png",{"_key":1456,"_type":23,"children":1457,"markDefs":1462,"style":31},"52e66f6e7086",[1458],{"_key":1459,"_type":27,"marks":1460,"text":1461},"a09acc620632",[],"Like PyTorch, Keras, or TensorFlow. 
Python is the primary language for these, but there are frameworks for other languages.",[],{"_key":1464,"_type":23,"children":1465,"markDefs":1470,"style":344},"bf60861e2b12",[1466],{"_key":1467,"_type":27,"marks":1468,"text":1469},"120dc893fa2c0",[],"Access to GPUs or TPUs ",[],{"_key":1472,"_type":16,"asset":1473},"e7f80f7fe137",{"_ref":1474,"_type":19},"image-598ab2e916f4a7ddfc574a50e30bb2cc5ef16b38-1431x681-png",{"_key":1476,"_type":23,"children":1477,"markDefs":1482,"style":31},"eb3846ac99da",[1478],{"_key":1479,"_type":27,"marks":1480,"text":1481},"58ba0497efbd",[],"For accelerated inference and model training.",[],{"_key":1484,"_type":23,"children":1485,"markDefs":1490,"style":212},"90172b5eaf04",[1486],{"_key":1487,"_type":27,"marks":1488,"text":1489},"e83b3d823d830",[],"Where to start?",[],{"_key":1492,"_type":23,"children":1493,"markDefs":1498,"style":31},"e86493f14842",[1494],{"_key":1495,"_type":27,"marks":1496,"text":1497},"5848d3bae0420",[],"It can be intimidating to know where to start with these technologies, as research and development is advancing rapidly right now. Multiple organizations offer access to LLMs via APIs, and open-source models proliferate on sites like Hugging Face, so finding one that fits your organization can be daunting. You might even want to train your own model if your needs are specific enough. But a wide variety of AI—including foundation models—are available as services. 
You’ll likely need the rest of the tech listed here, but specifically which ones and their requirements will depend on what you’re doing with GenAI.",[],{"_key":1500,"_type":23,"children":1501,"markDefs":1506,"style":212},"e191e9a88e5b",[1502],{"_key":1503,"_type":27,"marks":1504,"text":1505},"39e5674f65060",[],"Fit the tech around the use case",[],{"_key":1508,"_type":23,"children":1509,"markDefs":1514,"style":31},"8c35e93d8444",[1510],{"_key":1511,"_type":27,"marks":1512,"text":1513},"491debbc47af0",[],"Before diving into selecting tech, buying infrastructure and cloud compute, and integrating GenAI tech into your codebase, you’ll need to think about your use case:",[],{"_key":1516,"_type":1517,"points":1518},"13f13d9e251e","keyPoints",[1519,1520,1521],"Will the AI serve as a general interface to your app or will it access specialized or proprietary data?","Do you need to expose AI responses to users or will this be an entirely internal tool?","Are you looking to automate or interact with other processes in your application or organization?",{"_key":1523,"_type":23,"children":1524,"markDefs":1529,"style":31},"ee315437fd26",[1525],{"_key":1526,"_type":27,"marks":1527,"text":1528},"3539e245128a0",[],"Custom data will require a data platform with vector storage, embedding models, and, if you want retrieval-augmented generation, an orchestration library like LlamaIndex. Responses sent to the user may need additional governance or prompt engineering to manage potential hallucinations, unintentional biases, or prompt injections, as well as an inferencing stack. This can involve providing the LLM with system instructions and rules for how to handle and respond to prompts. You can also use other LLMs to manage prompts and responses to ensure that nothing dangerous makes it through to either the generating LLM or the user. 
Automated prompting and prompt handling may require fine-tuning an LLM to ensure that the responses fit the intended systems.",[],{"_key":1531,"_type":23,"children":1532,"markDefs":1537,"style":31},"46c1b86f24d3",[1533],{"_key":1534,"_type":27,"marks":1535,"text":1536},"3de15cce62290",[],"With any of these use cases (and more), it’s important that everyone in the organization be on the same page. GenAI integrations can be large, expensive projects that touch a lot of teams, so having a plan about the tech before diving in can make the eventual execution much easier.",[],[1539,1723,2008,2249,2494],{"_key":1540,"_type":262,"body":1541,"seo":1716,"slug":1720,"title":1722},"64f687f0c9d3",[1542,1550,1558,1566,1570,1578,1597,1605,1613,1617,1636,1644,1652,1660,1664,1672,1680,1688,1692,1700,1708],{"_key":1543,"_type":23,"children":1544,"markDefs":1549,"style":31},"d241c79650ca",[1545],{"_key":1546,"_type":27,"marks":1547,"text":1548},"20f89a9784dc0",[],"Implementing a GenAI program in your organization is a big challenge, but so is integrating that program with your existing tech stack. Depending on your use case and the technologies that you’ve chosen, GenAI could be as simple as a few new API calls or as complicated as a series of new dependencies, including databases. How you approach this integration depends on the complexity of your GenAI use case and the existing pieces in your software stack.",[],{"_key":1551,"_type":23,"children":1552,"markDefs":1557,"style":31},"8a5980cb280a",[1553],{"_key":1554,"_type":27,"marks":1555,"text":1556},"980d079524b30",[],"In a lot of ways, adding GenAI to a software stack is the same as adding any other dependency or service. 
You need to manage how it connects to the rest of the program, how it gets deployed to production, the infrastructure that it runs on, and how you monitor and test it.",[],{"_key":1559,"_type":23,"children":1560,"markDefs":1565,"style":212},"ee9edbccd992",[1561],{"_key":1562,"_type":27,"marks":1563,"text":1564},"4897667245f70",[],"Connect",[],{"_key":1567,"_type":16,"asset":1568},"7be5f109d852",{"_ref":1569,"_type":19},"image-8a8a1e1b3a7068abbea7de692de872bfe40b0da6-1431x681-png",{"_key":1571,"_type":23,"children":1572,"markDefs":1577,"style":31},"95a046aeaeeb",[1573],{"_key":1574,"_type":27,"marks":1575,"text":1576},"3b50ef822f770",[],"If your integration just uses API-based LLMs, then all you need to do is make sure the right data gets into the payload and you handle the responses. That’s simple. If you’re sending proprietary data, then you’ll want to implement the appropriate security measures.",[],{"_key":1579,"_type":23,"children":1580,"markDefs":1594,"style":31},"2eac6398cd35",[1581,1585,1590],{"_key":1582,"_type":27,"marks":1583,"text":1584},"6e7a0ff14ad00",[],"For most other use cases, you’ll at least need some Python support, whether directly or by using Python wrappers. Python is one of the main languages for machine learning tools, so to use the tools, you’ll need to use Python. Some tools, like ",{"_key":1586,"_type":27,"marks":1587,"text":1589},"6e7a0ff14ad01",[1588],"227d0ebc719f","PyTorch",{"_key":1591,"_type":27,"marks":1592,"text":1593},"6e7a0ff14ad02",[],", have official frontends for other languages, but most will require a little connection between it and your main code.",[1595],{"_key":1588,"_type":122,"href":1596},"https://pytorch.org/cppdocs/",{"_key":1598,"_type":23,"children":1599,"markDefs":1604,"style":31},"db969866a3f6",[1600],{"_key":1601,"_type":27,"marks":1602,"text":1603},"c1e1919bddcc0",[],"Those connections can be direct references to Python or they can be through programmatic interfaces like APIs, RPCs, or event queues. 
If you’re working in a multi-container cloud, these interfaces may be a better way to go. ML processes and general application code may have different memory and compute needs, so deploying them to different containers may yield better cost and performance results.",[],{"_key":1606,"_type":23,"children":1607,"markDefs":1612,"style":212},"f2e0b1cec95d",[1608],{"_key":1609,"_type":27,"marks":1610,"text":1611},"45cf9be760940",[],"Deploy",[],{"_key":1614,"_type":16,"asset":1615},"9d1a3b0490a6",{"_ref":1616,"_type":19},"image-6ffa97b5edab4fd711bb58bd5306417f5eb0ad3e-1431x681-png",{"_key":1618,"_type":23,"children":1619,"markDefs":1633,"style":31},"771c4c77ae43",[1620,1624,1629],{"_key":1621,"_type":27,"marks":1622,"text":1623},"054ee240570e0",[],"While you may think you can deploy an ML model using your standard CI/CD pipeline, it may be ",{"_key":1625,"_type":27,"marks":1626,"text":1628},"054ee240570e1",[1627],"02be8b5f1a16","a little more complicated",{"_key":1630,"_type":27,"marks":1631,"text":1632},"054ee240570e2",[],". Models are just sets of weights and parameters; GenAI uses an algorithm on top of that to make predictions and generate text, images, and more. Unless you’re using a general-purpose LLM and never need to update, you’ll likely inference data for fine-tuning, probably from a production or analytics database. You may even configure the LLM to “reason”: that is, to pull in additional data when processing the prompt and step through a multi-step prompt refining process. Both mean your model may need to access data and process it after deployment. 
You may also need a process to update the existing models.",[1634],{"_key":1627,"_type":122,"href":1635},"https://stackoverflow.blog/2020/10/12/how-to-put-machine-learning-models-into-production/",{"_key":1637,"_type":23,"children":1638,"markDefs":1643,"style":31},"ba4eebc51ec1",[1639],{"_key":1640,"_type":27,"marks":1641,"text":1642},"3d02053440530",[],"Your model deployment may start to look like and integrate with your data pipelines. Data gets extracted from existing sources, transformed into something clean and usable by the inference algorithms, and loaded into the model as adjusted weights and parameters. There are a few tools—TFX, KubeFlow, and others—that will simplify and manage this deployment.",[],{"_key":1645,"_type":23,"children":1646,"markDefs":1651,"style":31},"ea14ec2cad3f",[1647],{"_key":1648,"_type":27,"marks":1649,"text":1650},"c9a61e16396b0",[],"Alternatively, if you’re using retrieval-augmented generation or semantic search, your pipelines will be vectorizing new data and storing it where the GenAI processes can get it.",[],{"_key":1653,"_type":23,"children":1654,"markDefs":1659,"style":212},"6c0ae4994feb",[1655],{"_key":1656,"_type":27,"marks":1657,"text":1658},"18554a26eb560",[],"Build infrastructure",[],{"_key":1661,"_type":16,"asset":1662},"220d6a5df752",{"_ref":1663,"_type":19},"image-53d6e0e2e15e0e0491932c8179808b6a50f43251-1431x681-png",{"_key":1665,"_type":23,"children":1666,"markDefs":1671,"style":31},"b9482619fa7c",[1667],{"_key":1668,"_type":27,"marks":1669,"text":1670},"e92d82d1aa590",[],"Managing your GenAI infrastructure may be as simple as scaling up your existing cloud solution to provide additional resources. However, GenAI processes tend to use massively parallel compute—GPUs and TPUs—so you may need to run these processes on servers with different configurations. 
It might even make sense to use multiple architectures in your servers.",[],{"_key":1673,"_type":23,"children":1674,"markDefs":1679,"style":31},"d1ce96d1b3b0",[1675],{"_key":1676,"_type":27,"marks":1677,"text":1678},"42f0c557c9ea0",[],"Using infrastructure-as-code, containers, and container orchestration like Kubernetes may make things a little more manageable—and more complicated. More manageable because these tools can handle deployments across multiple server types and architectures. More complicated because these tools are another layer that you’ll have to manage. It all depends on what you’re already using and what your cloud provider offers as managed services.",[],{"_key":1681,"_type":23,"children":1682,"markDefs":1687,"style":212},"e0d672d6abce",[1683],{"_key":1684,"_type":27,"marks":1685,"text":1686},"28ce32cab0490",[],"Monitor and test",[],{"_key":1689,"_type":16,"asset":1690},"f26dfe15105e",{"_ref":1691,"_type":19},"image-03cb327a0e62c570a2be2fc1bb52133624ce872f-1431x681-png",{"_key":1693,"_type":23,"children":1694,"markDefs":1699,"style":31},"a6411ffec664",[1695],{"_key":1696,"_type":27,"marks":1697,"text":1698},"2ca7ca3e16080",[],"While many folks see LLMs and other deep learning models as black boxes impenetrable to observation, there are ways to monitor and test them. The simplest is storing prompts and responses. You can perform additional analysis on these to check for sentiment, toxicity, and prompt leakage, often using another LLM.",[],{"_key":1701,"_type":23,"children":1702,"markDefs":1707,"style":31},"52a5b4234bfc",[1703],{"_key":1704,"_type":27,"marks":1705,"text":1706},"acef52a56b730",[],"ML models can—in some ways—be treated like another program, so you can monitor their performance in production in the same terms as you monitor other services: number of requests, error rates, response time, and so on. 
LLMs can be especially costly, so monitoring token usage and overall costs is important.",[],{"_key":1709,"_type":23,"children":1710,"markDefs":1715,"style":31},"771943cb3608",[1711],{"_key":1712,"_type":27,"marks":1713,"text":1714},"eb772330d2800",[],"Testing is a matter of running prompts and observing the output. You can automate these tests by chaining them to an evaluator LLM that is trained to determine whether the responses meet some given criteria. Include in these tests adversarial prompts, which are designed to produce harmful behaviors or bypass security controls.",[],{"_type":162,"seoImage":1717},{"_type":16,"asset":1718},{"_ref":1719,"_type":19},"image-73e3ebf1114cf32f228852337c7540b6a8c9e2cf-2400x1260-png",{"_type":169,"current":1721},"integrating-your-existing-tech-stack","Integrating your existing tech stack",{"_key":1724,"_type":262,"body":1725,"seo":2001,"slug":2005,"title":2007},"d5a7faa9aadb",[1726,1734,1742,1761,1769,1777,1785,1837,1845,1853,1872,1890,1898,1902,1910,1918,1922,1930,1938,1942,1950,1958,1966,1969,1977,1985,1993],{"_key":1727,"_type":23,"children":1728,"markDefs":1733,"style":132},"1e0a4e030b34",[1729],{"_key":1730,"_type":27,"marks":1731,"text":1732},"dd8e9497eda70",[],"Do you want to be an AI company, or do you want to be a company that uses AI in its products?",[],{"_key":1735,"_type":23,"children":1736,"markDefs":1741,"style":212},"514eed214cce",[1737],{"_key":1738,"_type":27,"marks":1739,"text":1740},"bcffdab5b3190",[151],"Where to begin",[],{"_key":1743,"_type":23,"children":1744,"markDefs":1758,"style":31},"459d6457c396",[1745,1749,1754],{"_key":1746,"_type":27,"marks":1747,"text":1748},"38946640970b0",[],"With any complex new technology, your organization needs to determine whether to build or buy your tools. Engineers build software, so it’s natural for them to want to build something when they need to solve new engineering problems. 
However, for those planning how an engineering team uses their time, building new solutions to solved problems isn’t always the best use of time. There’s a third option that some call “",{"_key":1750,"_type":27,"marks":1751,"text":1753},"38946640970b1",[1752],"78e803ff4271","borrow",{"_key":1755,"_type":27,"marks":1756,"text":1757},"38946640970b2",[],",” where you use open-source software but with the option of contributing code or forking the repo. This isn’t an all-or-nothing proposition; you can buy or borrow for some aspects, build others, or customize the solutions that you adopt.",[1759],{"_key":1752,"_type":122,"href":1760},"https://dtunkelang.medium.com/search-should-you-build-buy-or-borrow-44b13e0988f5",{"_key":1762,"_type":23,"children":1763,"markDefs":1768,"style":31},"4af9e53ca093",[1764],{"_key":1765,"_type":27,"marks":1766,"text":1767},"464242e1f4500",[],"So when you’re embarking on your GenAI journey (or pivoting, for that matter), it’s important to ask: Do you want to be an AI company, or do you want to be a company that uses AI in your products? It’s an important distinction. If you have a specialized use case or want to control the code for your dependencies, then perhaps building makes sense. For most orgs and use cases, buying a solution or using open-source software will work. 
As always, an organization will thrive when it focuses on building software that directly affects its business.",[],{"_key":1770,"_type":23,"children":1771,"markDefs":1776,"style":31},"3a315c9a25bc",[1772],{"_key":1773,"_type":27,"marks":1774,"text":1775},"0622925acb620",[],"Let’s suppose you want to build an AI stack. Here’s what that would take.",[],{"_key":1778,"_type":23,"children":1779,"markDefs":1784,"style":212},"3ce538b1bb3e",[1780],{"_key":1781,"_type":27,"marks":1782,"text":1783},"0e6afcf541a50",[151],"Build-a-bot",[],{"_key":1786,"_type":23,"children":1787,"markDefs":1828,"style":31},"82f3b8116071",[1788,1792,1797,1801,1806,1810,1815,1819,1824],{"_key":1789,"_type":27,"marks":1790,"text":1791},"f6e7b6a838d50",[],"Building a foundation model is a massive and expensive undertaking. OpenAI spent ",{"_key":1793,"_type":27,"marks":1794,"text":1796},"f6e7b6a838d51",[1795],"11b8d1037673","around $100 million",{"_key":1798,"_type":27,"marks":1799,"text":1800},"f6e7b6a838d52",[]," training their GPT-4 model, rumored to have one trillion parameters. Their newest o1 and o3 models, with thinking capabilities that may pull in additional data for individual prompts, likely cost more to train and more to run. ",{"_key":1802,"_type":27,"marks":1803,"text":1805},"f6e7b6a838d53",[1804],"46a35cbefa3a","OpenAI CEO Sam Altman has said",{"_key":1807,"_type":27,"marks":1808,"text":1809},"f6e7b6a838d54",[]," that their pricey $200 Pro subscription loses money for the company. Bigger isn’t always better: researchers found that ",{"_key":1811,"_type":27,"marks":1812,"text":1814},"f6e7b6a838d55",[1813],"b6892707f490","targeted training data sets let models overperform",{"_key":1816,"_type":27,"marks":1817,"text":1818},"f6e7b6a838d56",[]," in specialized tasks. 
",{"_key":1820,"_type":27,"marks":1821,"text":1823},"f6e7b6a838d57",[1822],"648a83fee4ac","On our podcast in January 2025",{"_key":1825,"_type":27,"marks":1826,"text":1827},"f6e7b6a838d58",[],", Inbal Shani, Chief Product Officer and Head of R&D at Twilio, stressed the enormous importance of data quality in achieving high-quality responses from AI. “The data is the key,” he said. “If you don't have the right data, then whatever AI you are going to apply on top of that is not useful.”",[1829,1831,1833,1835],{"_key":1795,"_type":122,"href":1830},"https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/",{"_key":1804,"_type":122,"href":1832},"https://techcrunch.com/2025/01/05/openai-is-losing-money-on-its-pricey-chatgpt-pro-plan-ceo-sam-altman-says/?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAAAkVw93yXD-0MLIBzM3SKS0FkeQ9HfceV3Z2xkF54O0SxoU3opTLyNcw7bjkj5vBS6qJML6RWgvO0URVv7WcKSP8YpcNj6sNdwL8wjzJ3Drt-M6a1WMBLqcJ29q8GxxkIxZnZ8w9qnCDIDZOVA3Apub5P9utiSJTUpPedAi3yK_Q",{"_key":1813,"_type":122,"href":1834},"https://stackoverflow.blog/2024/02/26/even-llms-need-education-quality-data-makes-llms-overperform/",{"_key":1822,"_type":122,"href":1836},"https://stackoverflow.blog/2025/01/10/data-is-the-key-twilio-s-head-of-r-and-d-on-the-need-for-good-data/",{"_key":1838,"_type":23,"children":1839,"markDefs":1844,"style":31},"05817c0a04fc",[1840],{"_key":1841,"_type":27,"marks":1842,"text":1843},"09c43d707cba0",[],"But you don’t have to run out and buy up GPUs to train your own model. Some companies offer model training as a service. The trade-off here is the quality and reliability of the big models for the control and security of a custom one. 
Additionally, the bigger your model is, the more it costs to host.",[],{"_key":1846,"_type":23,"children":1847,"markDefs":1852,"style":31},"b3c97d357172",[1848],{"_key":1849,"_type":27,"marks":1850,"text":1851},"93d6772dd60b0",[],"But as the LLM space becomes more mature, the models become more commodified, and buying a service becomes more attractive overall. The years of work that these big companies have put into these models are very hard to duplicate at this point. For a monthly or per token fee, you can gain the benefit of all this work for your own product.",[],{"_key":1854,"_type":23,"children":1855,"markDefs":1869,"style":31},"975d72c2e5c3",[1856,1860,1865],{"_key":1857,"_type":27,"marks":1858,"text":1859},"b15de4765b990",[],"You could take the “borrow” route and fine-tune an open-source model. This would still require paying for GPU time, but it would allow you to get the benefits of the big LLMs while customizing it on your data. You can use techniques like ",{"_key":1861,"_type":27,"marks":1862,"text":1864},"b15de4765b991",[1863],"b8ff774d842f","LoRA",{"_key":1866,"_type":27,"marks":1867,"text":1868},"b15de4765b992",[]," to efficiently customize model weights without having to change everything within the entire LLM.",[1870],{"_key":1863,"_type":122,"href":1871},"https://huggingface.co/docs/peft/conceptual_guides/lora",{"_key":1873,"_type":23,"children":1874,"markDefs":1887,"style":31},"96890ceaa895",[1875,1879,1883],{"_key":1876,"_type":27,"marks":1877,"text":1878},"3204da1e84540",[],"It’s unlikely that you’d want to build your own database or machine learning framework like ",{"_key":1880,"_type":27,"marks":1881,"text":1589},"3204da1e84541",[1882],"37f105a6dfc1",{"_key":1884,"_type":27,"marks":1885,"text":1886},"3204da1e84542",[],". 
Many of the standard tools in these areas are open-source, and reinventing the wheel here is usually more trouble than it’s worth.",[1888],{"_key":1882,"_type":122,"href":1889},"https://pytorch.org/",{"_key":1891,"_type":23,"children":1892,"markDefs":1897,"style":31},"6d2e2cab8578",[1893],{"_key":1894,"_type":27,"marks":1895,"text":1896},"d9a0815f71d60",[],"Here are a few areas where you may find yourself building your own specialized tooling.",[],{"_key":1899,"_type":16,"asset":1900},"58c15eec73e0",{"_ref":1901,"_type":19},"image-3f2a6c9a2ef4fbfb2c4e1e5253f765e924599b3e-1431x682-png",{"_key":1903,"_type":23,"children":1904,"markDefs":1909,"style":344},"e637f2502e40",[1905],{"_key":1906,"_type":27,"marks":1907,"text":1908},"3e21f43eb1d60",[151],"Orchestration and agent frameworks",[],{"_key":1911,"_type":23,"children":1912,"markDefs":1917,"style":31},"ce3f91771d72",[1913],{"_key":1914,"_type":27,"marks":1915,"text":1916},"78525a4d74650",[],"Much of what software does is automate the repetitive stuff, so anything that connects LLM output to other processes falls in this category—think LlamaIndex, AutoGPT, or AutoChain. These provide easy connections between processes that AI agents need to function. 
Some developers, especially those deeper into their GenAI journey, may find it easier to skip the frameworks and either call database and LLM libraries directly or build their own agentic frameworks.",[],{"_key":1919,"_type":16,"asset":1920},"d45690b776e9",{"_ref":1921,"_type":19},"image-43eb31908c985c8b0bab5a6014a3dd7438d5a2d9-1431x682-png",{"_key":1923,"_type":23,"children":1924,"markDefs":1929,"style":344},"a57bfcb95a53",[1925],{"_key":1926,"_type":27,"marks":1927,"text":1928},"2b9f75cfa6700",[151],"Retrieval-augmented generation",[],{"_key":1931,"_type":23,"children":1932,"markDefs":1937,"style":31},"0fa0d6e36893",[1933],{"_key":1934,"_type":27,"marks":1935,"text":1936},"8ba44243ca580",[],"RAG systems supplement LLM output with relevant context in the prompts using vector databases and orchestration tools. You can build these in-house, but there is a flood of new tools and frameworks that will help with this process. Some data platforms will provide easy RAG setup if you host your data with them.",[],{"_key":1939,"_type":16,"asset":1940},"acc73d6f6f8e",{"_ref":1941,"_type":19},"image-a888f513ff3ea403fa035da7e7c891e3ff1e3579-1431x682-png",{"_key":1943,"_type":23,"children":1944,"markDefs":1949,"style":344},"1781b5b3a1d2",[1945],{"_key":1946,"_type":27,"marks":1947,"text":1948},"1ccc4947c2ec0",[151],"Fine-tuning and inference",[],{"_key":1951,"_type":23,"children":1952,"markDefs":1957,"style":31},"38f2e8231bfc",[1953],{"_key":1954,"_type":27,"marks":1955,"text":1956},"e326ca2d35ba0",[],"Fine-tuning lets you update an LLM with new information and change the model’s weights and biases. Most applications want this sort of feedback mechanism, and there are a lot of different approaches and a lot of different implementations. 
This is an area where—even if you want to build your own—you’d want to check the open-source options first.",[],{"_key":1959,"_type":23,"children":1960,"markDefs":1965,"style":31},"19868135169b",[1961],{"_key":1962,"_type":27,"marks":1963,"text":1964},"32a3542ce4d60",[],"Inference is a growing concern, as you may run inference not just on prompt tokens but also on additional data pulled in to respond to specific prompts. If you’re running your own reasoning or chain-of-thought process, you’ll need to consider your capacity for real-time, always-on inference.",[],{"_key":1967,"_type":16,"asset":1968},"68be8052768a",{"_ref":1691,"_type":19},{"_key":1970,"_type":23,"children":1971,"markDefs":1976,"style":344},"cc7d86685233",[1972],{"_key":1973,"_type":27,"marks":1974,"text":1975},"dd5ef88c8e550",[151],"Monitoring, explainability, and debiasing",[],{"_key":1978,"_type":23,"children":1979,"markDefs":1984,"style":31},"23fc30a89d3f",[1980],{"_key":1981,"_type":27,"marks":1982,"text":1983},"c2ac9c2aad420",[],"There’s a whole slew of tools and techniques for evaluating, tracking, and hacking how generative models produce results. If you have an excellent data science team, you may end up coming up with novel solutions. But with all the visibility around hallucinations and GenAI’s failings, there is a lot of active research and development in this area. You may also want to consider the legal ramifications of DIY solutions here. 
Some companies, like IBM, will indemnify their clients against legal exposure caused by their products’ responses, but that’s certainly not universal.",[],{"_key":1986,"_type":23,"children":1987,"markDefs":1992,"style":212},"825de9e9d9bb",[1988],{"_key":1989,"_type":27,"marks":1990,"text":1991},"a84b253c22020",[151],"Summary",[],{"_key":1994,"_type":23,"children":1995,"markDefs":2000,"style":31},"fafea3070f0b",[1996],{"_key":1997,"_type":27,"marks":1998,"text":1999},"2ba6b484d4bf0",[],"Knowing what you need and how to judge what tools will do the job can be a challenge in itself. For our own AI efforts, we set up a separate Community within our internal Stack Overflow for Teams instance and gathered AI-related knowledge there. You’ll need some way to keep track of how fast the field is moving and what new developments will affect your business goals.",[],{"_type":162,"seoImage":2002},{"_type":16,"asset":2003},{"_ref":2004,"_type":19},"image-b5569d26bcc6f243855c436fed3b032c0ff7e7e4-2400x1260-png",{"_type":169,"current":2006},"build-vs-buy","Build vs. buy",{"_key":2009,"_type":262,"body":2010,"seo":2243,"slug":2246,"title":2248},"468b2cc5f676",[2011,2019,2027,2035,2043,2047,2055,2063,2068,2076,2084,2088,2096,2104,2112,2120,2128,2136,2142,2149,2157,2176,2184,2192,2200,2219,2227,2235],{"_key":2012,"_type":23,"children":2013,"markDefs":2018,"style":31},"e954c4cfb763",[2014],{"_key":2015,"_type":27,"marks":2016,"text":2017},"66dc4993c7d10",[],"As you’re getting your GenAI program up and running or building to the next level, you’ll be interested in what’s already available on the market. 
Since ChatGPT captured the public imagination when it was released in November 2022, researchers, open-source developers, and companies have been working to create models, techniques, and software that take advantage of the possibilities offered by these new ideas.",[],{"_key":2020,"_type":23,"children":2021,"markDefs":2026,"style":31},"852a3f4c7a04",[2022],{"_key":2023,"_type":27,"marks":2024,"text":2025},"f5ff40858fd40",[],"For anyone integrating GenAI in their software, this is a gold mine of opportunity. You can get a huge headstart on integrating GenAI features by building on what others have already created. Here’s a quick overview of the landscape.",[],{"_key":2028,"_type":23,"children":2029,"markDefs":2034,"style":212},"9493d68c65fa",[2030],{"_key":2031,"_type":27,"marks":2032,"text":2033},"64da60ffe99e0",[],"Foundation models",[],{"_key":2036,"_type":23,"children":2037,"markDefs":2042,"style":31},"ba978b190692",[2038],{"_key":2039,"_type":27,"marks":2040,"text":2041},"7f93c41d0c050",[],"There are several types of foundation models that you’ll be interested in:",[],{"_key":2044,"_type":16,"asset":2045},"0f364e04360b",{"_ref":2046,"_type":19},"image-ba89a0d3698e089820a7d6fe04a48c7f2e882071-1431x682-png",{"_key":2048,"_type":23,"children":2049,"markDefs":2054,"style":344},"b37cc9af4ab0",[2050],{"_key":2051,"_type":27,"marks":2052,"text":2053},"986cb758c59c0",[],"Large language models (LLMs)",[],{"_key":2056,"_type":23,"children":2057,"markDefs":2062,"style":31},"27253873a917",[2058],{"_key":2059,"_type":27,"marks":2060,"text":2061},"2b26ba40966f0",[],"These use a massive dataset of text to provide general-purpose language generation and understanding. This category includes OpenAI's GPT models (GPT-4o and o1), Google's Gemini, Meta's LLaMa 3, and Anthropic's Claude 3. 
This category is getting more crowded and more advanced as companies like Amazon, Nvidia, Databricks, DeepSeek, IBM, and Alibaba Cloud have released major general LLMs.",[],{"_key":2064,"_type":16,"asset":2065,"caption":2067,"source":1760},"4134437629d7",{"_ref":2066,"_type":19},"image-1a94acf9961f69db0f381e10028d7cffc28176f9-1431x682-png","Prompt and output from DALL·E 3, one of the most advanced deep learning text-to-image models available today.",{"_key":2069,"_type":23,"children":2070,"markDefs":2075,"style":344},"1d2b03f17c7f",[2071],{"_key":2072,"_type":27,"marks":2073,"text":2074},"ae4dc250c4230",[],"Images",[],{"_key":2077,"_type":23,"children":2078,"markDefs":2083,"style":31},"3f8956c6c9c6",[2079],{"_key":2080,"_type":27,"marks":2081,"text":2082},"615fa067ea5f0",[],"A number of models and services can generate and understand images. OpenAI's DALL-E 3, Google Brain's Imagen, StabilityAI's Stable Diffusion, and Midjourney are major players, though not all of them allow programmatic access at this time. A number of specialized image generators have arisen—Ideogram for accurate text, Adobe Firefly for integrating generative AI with photos, and Generative AI by Getty for commercially-safe images.",[],{"_key":2085,"_type":16,"asset":2086},"83c8507417ed",{"_ref":2087,"_type":19},"image-5d011b1fcf0cf99a23343bf01a5838d18964b4bf-1431x682-png",{"_key":2089,"_type":23,"children":2090,"markDefs":2095,"style":344},"dc6d73de7e15",[2091],{"_key":2092,"_type":27,"marks":2093,"text":2094},"ae5c014e4d960",[],"Multimodal",[],{"_key":2097,"_type":23,"children":2098,"markDefs":2103,"style":31},"e6c95677a479",[2099],{"_key":2100,"_type":27,"marks":2101,"text":2102},"14be5ed9bbd10",[],"These models can reason and generate across multiple areas, including text, image, sound, and more. 
While some of the image-based models can perform image-to-text and text-to-image generation, general-purpose multimodal AI includes ChatGPT and Gemini, which may seamlessly link multiple individual models together.",[],{"_key":2105,"_type":23,"children":2106,"markDefs":2111,"style":31},"f23a5946c9fb",[2107],{"_key":2108,"_type":27,"marks":2109,"text":2110},"0c1453efe0fc",[],"",[],{"_key":2113,"_type":23,"children":2114,"markDefs":2119,"style":31},"1790bb581d9c",[2115],{"_key":2116,"_type":27,"marks":2117,"text":2118},"baf07755bbfc0",[],"If you’re integrating one or more of these models into your applications, you’ll have to consider how you access them and how much you can modify them with fine-tuning. You can access some via APIs, some can be installed locally, and some can be bundled with your cloud services. Some are open-source and allow you to fine-tune parameters, install locally, and modify as you see fit.",[],{"_key":2121,"_type":23,"children":2122,"markDefs":2127,"style":212},"0a35282b554f",[2123],{"_key":2124,"_type":27,"marks":2125,"text":2126},"52108c0c6f370",[],"Data platforms",[],{"_key":2129,"_type":23,"children":2130,"markDefs":2135,"style":31},"737086fa9c8b",[2131],{"_key":2132,"_type":27,"marks":2133,"text":2134},"5695487b04cf0",[],"AI runs on data, so you’ll need somewhere to store that data. While a model comes pre-trained on a massive amount of data, you’ll still need to store data for fine-tuning and retrieval-augmented generation, as well as the usual monitoring and analytics usages. 
These usually fall into two categories: vector databases and data lakehouses.",[],{"_key":2137,"_type":16,"asset":2138,"caption":2140,"source":2141},"57766d96dcd0",{"_ref":2139,"_type":19},"image-da7a09ea4f6d900a263bfe7376ce4e06cf668123-1431x682-png","“For databases that currently lack vector search functionality, it is only a matter of time before they implement these features.”","https://blog.det.life/why-you-shouldnt-invest-in-vector-databases-c0cd3f59d23c",{"_key":2143,"_type":23,"children":2144,"markDefs":2148,"style":31},"9c9542faad1f",[2145],{"_key":2146,"_type":27,"marks":2147,"text":2140},"32c018d277880",[],[],{"_key":2150,"_type":23,"children":2151,"markDefs":2156,"style":344},"9e4b989437b8",[2152],{"_key":2153,"_type":27,"marks":2154,"text":2155},"083afdabebcf0",[],"Vector databases",[],{"_key":2158,"_type":23,"children":2159,"markDefs":2173,"style":31},"5a369077aa76",[2160,2164,2169],{"_key":2161,"_type":27,"marks":2162,"text":2163},"5f42283f11c40",[],"These store vectorized data that your AI solutions will pull from for search, chat, or RAG solutions. Major providers include Pinecone, Weaviate, Chroma, Qdrant, Milvus, and more. Plenty of existing databases have added vector storage and/or search, ",{"_key":2165,"_type":27,"marks":2166,"text":2168},"5f42283f11c41",[2167],"4dc22bdc9fbc","like MongoDB",{"_key":2170,"_type":27,"marks":2171,"text":2172},"5f42283f11c42",[],", PostgreSQL, ElasticSearch, Rockset, Redis, and many more. 
You can download and host many of these solutions yourself or turn to the provider for a fully managed database solution.",[2174],{"_key":2167,"_type":122,"href":2175},"https://stackoverflow.blog/2023/09/20/do-you-need-a-specialized-vector-database-to-implement-vector-search-well/",{"_key":2177,"_type":23,"children":2178,"markDefs":2183,"style":344},"051904d61280",[2179],{"_key":2180,"_type":27,"marks":2181,"text":2182},"a34a2a67102e0",[],"Data lakehouses",[],{"_key":2185,"_type":23,"children":2186,"markDefs":2191,"style":31},"8cd0c983a592",[2187],{"_key":2188,"_type":27,"marks":2189,"text":2190},"c75d4b01254c0",[],"These are large stores of mostly unstructured data (files, objects, etc.) that can be drawn from quickly, so they balance scalability, latency, and cost. Major providers include Databricks, Snowflake, Google BigQuery, Cloudera, Amazon Redshift, and Teradata Vantage, but there are many more. For these, you’ll also have to consider your infrastructure—a significant number of these provide hosting services, as the data requirements can balloon quickly. You can build your own solution here, but it often means connecting a number of data storage solutions together.",[],{"_key":2193,"_type":23,"children":2194,"markDefs":2199,"style":212},"0fefca484c9e",[2195],{"_key":2196,"_type":27,"marks":2197,"text":2198},"80021055451c0",[],"All-in-one solutions",[],{"_key":2201,"_type":23,"children":2202,"markDefs":2216,"style":31},"41eb2f2ad45e",[2203,2207,2212],{"_key":2204,"_type":27,"marks":2205,"text":2206},"774a600ded2d0",[],"Putting together a GenAI platform can be a pretty daunting task, and evaluating all the pieces is a significant project as well. Two distinguished engineers from ",{"_key":2208,"_type":27,"marks":2209,"text":2211},"774a600ded2d1",[2210],"136974c03f20","IBM discussed",{"_key":2213,"_type":27,"marks":2214,"text":2215},"774a600ded2d2",[]," how they put together their business-focused GenAI platform, watsonx, and the effort is monumental. 
When faced with that task, you may want to consider all-in-one solutions.",[2217],{"_key":2210,"_type":122,"href":2218},"https://stackoverflow.blog/2023/06/06/mosaicml-deep-learning-models-for-sale-all-shapes-and-sizes-ep-577/",{"_key":2220,"_type":23,"children":2221,"markDefs":2226,"style":31},"a8959ea957d8",[2222],{"_key":2223,"_type":27,"marks":2224,"text":2225},"99edb1fb763d0",[],"Many cloud providers, including AWS, Google Cloud, and Microsoft Azure, have set up all-in-one solutions that can be added as part of their services. If you’re already using one of these providers, the benefit is that you get infrastructure, models, training, and data platforms without having to assemble a tech stack on your own. The downside is increased risk of vendor lock-in and reduced flexibility, as the tools available will be subject to what the provider supports.",[],{"_key":2228,"_type":23,"children":2229,"markDefs":2234,"style":31},"bbcaf195ad64",[2230],{"_key":2231,"_type":27,"marks":2232,"text":2233},"769f7e4f192a0",[],"Other providers of all-in-one solutions include the aforementioned IBM and Nvidia.",[],{"_key":2236,"_type":23,"children":2237,"markDefs":2242,"style":31},"79ef5c25c1c9",[2238],{"_key":2239,"_type":27,"marks":2240,"text":2241},"353e322c3fbc0",[],"The ecosystem of GenAI tools and technologies is already vast and growing rapidly. Major players have been consolidating their positions through acquisitions: Databricks bought MosaicML, Nvidia bought Run:ai, and OpenAI bought Rockset, just to name a few. 
Organizations looking to make their mark in the GenAI era will have to either use existing offerings or provide a unique value proposition.",[],{"_type":162,"seoImage":2244},{"_type":16,"asset":2245},{"_ref":2004,"_type":19},{"_type":169,"current":2247},"whats-on-the-market","What’s on the market?",{"_key":2250,"_type":262,"body":2251,"seo":2487,"slug":2491,"title":2493},"349f015cff2d",[2252,2260,2268,2276,2284,2292,2296,2315,2323,2331,2335,2343,2362,2381,2389,2397,2405,2413,2428,2443,2458],{"_key":2253,"_type":23,"children":2254,"markDefs":2259,"style":212},"25471bae1980",[2255],{"_key":2256,"_type":27,"marks":2257,"text":2258},"3849c9f6a1da",[],"The importance of a knowledge community",[],{"_key":2261,"_type":23,"children":2262,"markDefs":2267,"style":132},"fae8a65b8356",[2263],{"_key":2264,"_type":27,"marks":2265,"text":2266},"7bc69cb2b998",[],"For AI coding tools to add value to your work, they have to fit into your teams’ existing workflows.",[],{"_key":2269,"_type":23,"children":2270,"markDefs":2275,"style":31},"2b9bbee68233",[2271],{"_key":2272,"_type":27,"marks":2273,"text":2274},"9fca22b30e6d0",[],"As ever, new tools and technologies demand new skill sets and inspire fresh ways of solving old problems. Top of mind today is what it will take to upskill your teams to take maximum advantage of AI to maintain productivity in an accelerating industry.",[],{"_key":2277,"_type":23,"children":2278,"markDefs":2283,"style":31},"ad961eaa093a",[2279],{"_key":2280,"_type":27,"marks":2281,"text":2282},"f5e342c4fd2e0",[],"When learning is an integral part of your organizational culture, your teams are primed to leverage and implement new AI tools. Much of the public conversation around this tech centers on job loss, but this perspective is worth reframing. Every new technology that disrupts the market as profoundly as GenAI results in a change in demand for specific skills and roles—ones that AI can help developers access. 
From prompt engineering to machine learning to data science, there’s plenty for your teams to learn.",[],{"_key":2285,"_type":23,"children":2286,"markDefs":2291,"style":212},"32f8fbe27d56",[2287],{"_key":2288,"_type":27,"marks":2289,"text":2290},"da61082acda30",[],"Identifying the right tools",[],{"_key":2293,"_type":16,"asset":2294},"a32224be3d09",{"_ref":2295,"_type":19},"image-2f65b72b144c8f23610ce40dceaa783038e6ba80-1431x682-png",{"_key":2297,"_type":23,"children":2298,"markDefs":2312,"style":31},"34760999a903",[2299,2303,2308],{"_key":2300,"_type":27,"marks":2301,"text":2302},"6ef3934276210",[],"For AI coding tools to add value to your work, they have to fit into your teams’ existing workflows. When AI tools aren’t properly integrated, they interrupt rather than enhance the ",{"_key":2304,"_type":27,"marks":2305,"text":2307},"6ef3934276211",[2306],"51d5bdac0f06","flow state",{"_key":2309,"_type":27,"marks":2310,"text":2311},"6ef3934276212",[]," that’s so important for developers’ deep, focused work. Constant context-switching is costly. AI tools, like any other business tools in your repertoire, should minimize interruptions as much as possible.",[2313],{"_key":2306,"_type":122,"href":2314},"https://resources.stackoverflow.co/topic/productivity-tips/reclaim-your-flow-state-3-developer-distractions-to-eliminate/",{"_key":2316,"_type":23,"children":2317,"markDefs":2322,"style":31},"9e5a3d9285e1",[2318],{"_key":2319,"_type":27,"marks":2320,"text":2321},"37a15341d5bf0",[],"Clarify your goals in using AI coding tools. It’s tempting, but trying every product you can get your hands on creates too much noise and introduces unnecessary risk. Think about the needs of your team and the challenges your organization is facing. Assess your criteria for speed, size, security, and privacy. 
Do you want a chatbot that can make recommendations based on a set of criteria, or a search engine that will respond with relevant data when your team has complex questions about an upcoming project?",[],{"_key":2324,"_type":23,"children":2325,"markDefs":2330,"style":212},"867a27aea2bd",[2326],{"_key":2327,"_type":27,"marks":2328,"text":2329},"18d3b4c97bf80",[],"How to support your teams as they upskill",[],{"_key":2332,"_type":16,"asset":2333},"cabf51ef686d",{"_ref":2334,"_type":19},"image-c3b260e6cd71b15f6c342f86ff7751ebf9043848-1431x682-png",{"_key":2336,"_type":23,"children":2337,"markDefs":2342,"style":31},"96d49aceca9e",[2338],{"_key":2339,"_type":27,"marks":2340,"text":2341},"ef7135a731740",[],"Here are some things to keep in mind as you upskill your teams for the AI era:",[],{"_key":2344,"_type":23,"children":2345,"level":943,"listItem":944,"markDefs":2359,"style":31},"4a1dabd9a43a",[2346,2350,2355],{"_key":2347,"_type":27,"marks":2348,"text":2349},"e0dbc1998cbe0",[],"Incorporate learning opportunities into the job: Most employees prefer to learn on the job, ",{"_key":2351,"_type":27,"marks":2352,"text":2354},"e0dbc1998cbe1",[2353],"16fa0a481f50","research shows",{"_key":2356,"_type":27,"marks":2357,"text":2358},"e0dbc1998cbe2",[],", tackling opportunities as they come up. 
AI tools that can be integrated into your teams’ existing workflows and help them take better advantage of your institutional knowledge base will add more value to your organization than tools that don’t integrate seamlessly with existing tools or offer the same ease of use.",[2360],{"_key":2353,"_type":122,"href":2361},"https://www.bcg.com/publications/2021/decoding-global-trends-reskilling-career-paths",{"_key":2363,"_type":23,"children":2364,"level":943,"listItem":944,"markDefs":2378,"style":31},"84c475e171d7",[2365,2369,2374],{"_key":2366,"_type":27,"marks":2367,"text":2368},"eca7c95a7fa50",[],"Embrace the challenge: One way to incorporate learning into everyday work is through “",{"_key":2370,"_type":27,"marks":2371,"text":2373},"eca7c95a7fa51",[2372],"193c07b58cf8","stretch assignments",{"_key":2375,"_type":27,"marks":2376,"text":2377},"eca7c95a7fa52",[],"”: projects or tasks that lie slightly beyond an engineer’s current skill level or expertise, nudging them to improve their abilities and add new AI programming languages, technologies, or techniques to their repertoire.",[2379],{"_key":2372,"_type":122,"href":2380},"https://stackoverflow.blog/2021/08/16/using-stretch-work-assignments-to-help-engineers-grow/",{"_key":2382,"_type":23,"children":2383,"level":943,"listItem":944,"markDefs":2388,"style":31},"6d8d95999798",[2384],{"_key":2385,"_type":27,"marks":2386,"text":2387},"0304d81085f90",[],"Prioritize learning from the top down: Leadership should model a commitment to learning and upskilling by carving out dedicated time for people to learn at work and by honoring time spent learning new languages or getting familiar with new coding tools as essential to the job.",[],{"_key":2390,"_type":23,"children":2391,"level":943,"listItem":944,"markDefs":2396,"style":31},"b7053042d914",[2392],{"_key":2393,"_type":27,"marks":2394,"text":2395},"b4c78631693b0",[],"Give your teams a solid foundation for learning: Giving your teams what they need to upskill in the AI era 
also requires a well-structured, up-to-date knowledge base and a knowledge management strategy that harmonizes with how your employees prefer to work. You can also give engineering teams the opportunity to learn or grow their skills in a commonly used AI programming language like Python, Java, or C++.",[],{"_key":2398,"_type":23,"children":2399,"markDefs":2404,"style":212},"235b464f05e9",[2400],{"_key":2401,"_type":27,"marks":2402,"text":2403},"20e317b6520d0",[],"Recommended resources",[],{"_key":2406,"_type":23,"children":2407,"markDefs":2412,"style":31},"18d364fa1587",[2408],{"_key":2409,"_type":27,"marks":2410,"text":2411},"cd2abd8679480",[],"Here are some of our own resources that can help you get your teams up-to-speed and feeling confident with the new AI tools at your disposal.",[],{"_key":2414,"_type":23,"children":2415,"level":943,"listItem":944,"markDefs":2425,"style":31},"4274908a264c",[2416,2421],{"_key":2417,"_type":27,"marks":2418,"text":2420},"f3dacb6716640",[2419],"b1ba335ddc7f","GenAI",{"_key":2422,"_type":27,"marks":2423,"text":2424},"f3dacb6716641",[],": Our newest Stack Exchange community dedicated to GenAI enthusiasts and practitioners.",[2426],{"_key":2419,"_type":122,"href":2427},"https://genai.stackexchange.com/",{"_key":2429,"_type":23,"children":2430,"level":943,"listItem":944,"markDefs":2440,"style":31},"e86fb79d3e9d",[2431,2436],{"_key":2432,"_type":27,"marks":2433,"text":2435},"4abdfbffec440",[2434],"b6b6619b94c3","NLP",{"_key":2437,"_type":27,"marks":2438,"text":2439},"4abdfbffec441",[],": A collective focused on NLP (natural language processing), the transformation or extraction of useful information from natural language data.",[2441],{"_key":2434,"_type":122,"href":2442},"https://stackoverflow.com/collectives/nlp",{"_key":2444,"_type":23,"children":2445,"level":943,"listItem":944,"markDefs":2455,"style":31},"4277ed365315",[2446,2451],{"_key":2447,"_type":27,"marks":2448,"text":2450},"7a0466c80b6d0",[2449],"d167b8a6bcda","Start 
building your own knowledge base",{"_key":2452,"_type":27,"marks":2453,"text":2454},"7a0466c80b6d1",[]," for the AI era with Stack Overflow for Teams.",[2456],{"_key":2449,"_type":122,"href":2457},"https://try.stackoverflow.co/build-your-ai-future",{"_key":2459,"_type":23,"children":2460,"level":943,"listItem":944,"markDefs":2482,"style":31},"c0e138a83d8c",[2461,2465,2470,2474,2479],{"_key":2462,"_type":27,"marks":2463,"text":2464},"c02024ea59d70",[],"Stay current on the latest in AI news and conversations with our ",{"_key":2466,"_type":27,"marks":2467,"text":2469},"c02024ea59d71",[2468],"ead1307fa775","blog",{"_key":2471,"_type":27,"marks":2472,"text":2473},"c02024ea59d72",[]," and ",{"_key":2475,"_type":27,"marks":2476,"text":2478},"c02024ea59d73",[2477],"af9080726174","podcast",{"_key":2480,"_type":27,"marks":2481,"text":635},"c02024ea59d74",[],[2483,2485],{"_key":2468,"_type":122,"href":2484},"https://stackoverflow.blog/ai",{"_key":2477,"_type":122,"href":2486},"https://stackoverflow.blog/podcast",{"_type":162,"seoImage":2488},{"_type":16,"asset":2489},{"_ref":2490,"_type":19},"image-bbadf3981ba519c1cc36be999be03322f48544ff-2400x1260-png",{"_type":169,"current":2492},"empowering-your-teams","Empowering your teams",{"_key":2495,"_type":262,"body":2496,"seo":2776,"slug":2780,"title":2782},"5e4ee7e82e17",[2497,2505,2513,2532,2562,2566,2574,2582,2601,2609,2613,2621,2629,2637,2645,2653,2661,2669,2677,2685,2693,2701,2720,2725,2744,2752,2760,2768],{"_key":2498,"_type":23,"children":2499,"markDefs":2504,"style":31},"446511ea8d87",[2500],{"_key":2501,"_type":27,"marks":2502,"text":2503},"8dc506e7cb9b0",[],"Weighing the options available to your GenAI program can be overwhelming, so let’s talk through the major factors you’ll need to consider when making your decisions. For a technology that only really broke through into mainstream consciousness in November 2022, the ecosystem has grown surprisingly rich and accessible. 
You’ll find that there are plenty of good options available, both open-source and proprietary, locally installable and SaaS.",[],{"_key":2506,"_type":23,"children":2507,"markDefs":2512,"style":212},"eadaad085909",[2508],{"_key":2509,"_type":27,"marks":2510,"text":2511},"c4fdde4b46150",[],"Model size",[],{"_key":2514,"_type":23,"children":2515,"markDefs":2529,"style":31},"90968ed44bcf",[2516,2520,2525],{"_key":2517,"_type":27,"marks":2518,"text":2519},"b7edb14984160",[],"You may have seen large language models described in terms of number of parameters—that’s the size. The largest models have up to trillions of parameters, and as they grow, they have proven to improve their capabilities and ",{"_key":2521,"_type":27,"marks":2522,"text":2524},"b7edb14984161",[2523],"25c9ee3c2281","gain emergent abilities",{"_key":2526,"_type":27,"marks":2527,"text":2528},"b7edb14984162",[],". Massive models can tackle a wide range of tasks very well.",[2530],{"_key":2523,"_type":122,"href":2531},"https://www.assemblyai.com/blog/emergent-abilities-of-large-language-models/",{"_key":2533,"_type":23,"children":2534,"markDefs":2557,"style":31},"bd7312fe916b",[2535,2539,2544,2548,2553],{"_key":2536,"_type":27,"marks":2537,"text":2538},"8d165e49f8ed0",[],"Larger models generally require more compute power and memory, leading to increased hardware and training costs. For instance, ",{"_key":2540,"_type":27,"marks":2541,"text":2543},"8d165e49f8ed1",[2542],"f84e8d8dd5b4","OpenAI's GPT-3 has 175 billion parameters",{"_key":2545,"_type":27,"marks":2546,"text":2547},"8d165e49f8ed2",[],", while Meta's ",{"_key":2549,"_type":27,"marks":2550,"text":2552},"8d165e49f8ed3",[2551],"276c11d25253","LLaMA 2 model features up to 70 billion parameters",{"_key":2554,"_type":27,"marks":2555,"text":2556},"8d165e49f8ed4",[],", both requiring extensive computational resources. 
The trend toward ever-larger models also represents diminishing returns in performance, raising questions about cost-effectiveness.",[2558,2560],{"_key":2542,"_type":122,"href":2559},"https://developer.nvidia.com/blog/openai-presents-gpt-3-a-175-billion-parameters-language-model/",{"_key":2551,"_type":122,"href":2561},"https://azuremarketplace.microsoft.com/en-us/marketplace/apps/metagenai.meta-llama-2-70b-offer?tab=Overview",{"_key":2563,"_type":16,"asset":2564},"46e716d030ee",{"_ref":2565,"_type":19},"image-42ce43f9ba85a57176df74bf788171822a2c24a0-1431x682-png",{"_key":2567,"_type":23,"children":2568,"markDefs":2573,"style":31},"2bbacd40ffd3",[2569],{"_key":2570,"_type":27,"marks":2571,"text":2572},"2bfceb98c5f40",[],"Bigger isn’t always better. As Microsoft has shown with its series of Phi models, a smaller model trained on precise data can often perform on par with much larger models on some tasks, sometimes even besting them. Rather than train on a huge corpus of code and text from the internet, these models were trained on a hand-picked subset.",[],{"_key":2575,"_type":23,"children":2576,"markDefs":2581,"style":31},"e3cec2aedd96",[2577],{"_key":2578,"_type":27,"marks":2579,"text":2580},"4d64791908630",[],"The tradeoff is between a widely capable but expensive model and a targeted one that fits your budget but may not return great results for every task. If you have a specialized use case, then maybe a smaller model trained on focused data is the right solution.",[],{"_key":2583,"_type":23,"children":2584,"markDefs":2598,"style":31},"efe2cd50f489",[2585,2589,2594],{"_key":2586,"_type":27,"marks":2587,"text":2588},"416c05eba4d80",[],"There’s another aspect of size to consider: the size of each parameter. More accurate numbers take up more memory. 
Models may be trained with 32-bit floating point parameter values, but for storage-conscious applications, you can ",{"_key":2590,"_type":27,"marks":2591,"text":2593},"416c05eba4d81",[2592],"1f3687e7fc7d","quantize",{"_key":2595,"_type":27,"marks":2596,"text":2597},"416c05eba4d82",[]," them down to 8-bit integers. Think of it like reducing the amount of available colors in an image: Your beautiful 64-bit PNG may be what you print for posters, but the 8-bit version works fine in a thumbnail image. You may get less quality from your results, but the size reduction may be what you need for mobile or IoT applications.",[2599],{"_key":2592,"_type":122,"href":2600},"https://stackoverflow.blog/2023/08/23/fitting-ai-models-in-your-pocket-with-quantization/",{"_key":2602,"_type":23,"children":2603,"markDefs":2608,"style":212},"457585871e3a",[2604],{"_key":2605,"_type":27,"marks":2606,"text":2607},"73013e70dd9d0",[],"Cost",[],{"_key":2610,"_type":16,"asset":2611},"607d061cc54d",{"_ref":2612,"_type":19},"image-dbbde9ed38bf3e81f536af1251d5a8e773d8d093-1431x682-png",{"_key":2614,"_type":23,"children":2615,"markDefs":2620,"style":31},"36f061c52af7",[2616],{"_key":2617,"_type":27,"marks":2618,"text":2619},"92f14ceff4710",[],"For companies with a lot of resources, there may be good reasons to train your own foundation model or build your own tools. But for most organizations, this requires talent, time, and money that is better invested elsewhere. Training a new model can be as much as 1000x as expensive as fine-tuning an existing one.",[],{"_key":2622,"_type":23,"children":2623,"markDefs":2628,"style":31},"cb9ecfb69f2d",[2624],{"_key":2625,"_type":27,"marks":2626,"text":2627},"e0a226c4ee920",[],"Plenty of organizations have released open-source models under Apache or MIT licenses. You can clone these repos and start using and modifying them to your needs. That’s the cheapest option, but it may require more work on your end. 
Most cloud providers have LLM products that you can easily add to your accounts. For easy access with a cloud subscription, some companies allow API access to LLMs.",[],{"_key":2630,"_type":23,"children":2631,"markDefs":2636,"style":31},"a0f8013365b3",[2632],{"_key":2633,"_type":27,"marks":2634,"text":2635},"19201d9f47d90",[],"You’ll need a significant tech stack investment for GenAI, as there’s a whole data platform and infrastructure component. If you’ve already got a data team and they’ve created a pipeline, you can build on that foundation. From there, see how valuable the results are and consider scaling or investing more depending on how much farther you want to go.",[],{"_key":2638,"_type":23,"children":2639,"markDefs":2644,"style":31},"d358eb64dc7e",[2640],{"_key":2641,"_type":27,"marks":2642,"text":2643},"010f657142a40",[],"You’ll also need significant team expertise to develop and run GenAI models. Organizations with professionals experienced in machine learning and natural language processing are more likely to develop effective models efficiently. However, hiring top talent can be expensive, further increasing overall R&D costs. Many companies are investing in training programs to upskill their current workforce, attempting to bridge the talent gap in this rapidly evolving field.",[],{"_key":2646,"_type":23,"children":2647,"markDefs":2652,"style":344},"1a2c09a6afc9",[2648],{"_key":2649,"_type":27,"marks":2650,"text":2651},"7c7bf2368bc80",[],"Training and inference data",[],{"_key":2654,"_type":23,"children":2655,"markDefs":2660,"style":31},"b20f870ac5bf",[2656],{"_key":2657,"_type":27,"marks":2658,"text":2659},"82870442d03f0",[],"The volume and quality of training data are crucial to the success of any LLM. High-quality, diverse datasets improve model performance, but acquiring, curating, and preprocessing data can be costly. 
Improperly sourced data, especially data from copyrighted works, can expose your project to legal risks, and securing against those risks represents an additional expense.",[],{"_key":2662,"_type":23,"children":2663,"markDefs":2668,"style":31},"b97d6ceb3c38",[2664],{"_key":2665,"_type":27,"marks":2666,"text":2667},"4f5050b64f9b0",[],"It’s not just enough to have training data; today’s reasoning models often pull in additional data at inference time to answer complex prompts. Finding and processing this data quickly can be tricky—unstructured data may not provide the best results.",[],{"_key":2670,"_type":23,"children":2671,"markDefs":2676,"style":212},"dd2dd0a15d98",[2672],{"_key":2673,"_type":27,"marks":2674,"text":2675},"43a1e2144fd90",[],"Scalability",[],{"_key":2678,"_type":23,"children":2679,"markDefs":2684,"style":31},"8f91f0800971",[2680],{"_key":2681,"_type":27,"marks":2682,"text":2683},"5542d146acef0",[],"As your application grows, so does your GenAI usage. That means you’ll need to add capacity to your infrastructure, and you’ll spend more on GenAI services. Some GenAI pricing models shift as your usage grows, so this may be something you’ll need to consider up front before committing to an AI vendor.",[],{"_key":2686,"_type":23,"children":2687,"markDefs":2692,"style":31},"9320249ef858",[2688],{"_key":2689,"_type":27,"marks":2690,"text":2691},"212caeb83bb40",[],"The flip side of adding capacity to handle scaling is limiting user requests. While this isn’t always the most user-friendly approach, it may be your best option, especially if you become a victim of sudden success. 
You can build on a variety of LLMs as a hedge, and either shunt excessive traffic to cheaper options or offer a tiered pricing model.",[],{"_key":2694,"_type":23,"children":2695,"markDefs":2700,"style":212},"b2f5fcb2976e",[2696],{"_key":2697,"_type":27,"marks":2698,"text":2699},"463c0ce623ae0",[],"Security",[],{"_key":2702,"_type":23,"children":2703,"markDefs":2717,"style":31},"672c72e1374f",[2704,2708,2713],{"_key":2705,"_type":27,"marks":2706,"text":2707},"7cd1ae25d91b0",[],"Like every software application, GenAI has security concerns. But if you’ve seen exploits where users convince an AI to sell them a car for ",{"_key":2709,"_type":27,"marks":2710,"text":2712},"7cd1ae25d91b1",[2711],"4b5fbb066d07","one dollar",{"_key":2714,"_type":27,"marks":2715,"text":2716},"7cd1ae25d91b2",[],", you know that these security concerns are different from traditional cybersecurity risks, as they can be unpredictable and baked into the LLM’s usage.",[2718],{"_key":2711,"_type":122,"href":2719},"https://twitter.com/ChrisJBakke/status/1736533308849443121",{"_key":2721,"_type":16,"asset":2722,"caption":2724,"source":2719},"be2056a3929f",{"_ref":2723,"_type":19},"image-3b962e82d75bf50d3bb3f5ad8486940e5a4e6e7f-1431x1193-png","From user @ChrisJBakke on X (formerly Twitter), claiming he “just bought a 2024 Chevy Tahoe for $1.”",{"_key":2726,"_type":23,"children":2727,"markDefs":2741,"style":31},"2134a9040862",[2728,2732,2737],{"_key":2729,"_type":27,"marks":2730,"text":2731},"f8ee51fb51b20",[],"The security research organization OWASP has created a list of the ",{"_key":2733,"_type":27,"marks":2734,"text":2736},"f8ee51fb51b21",[2735],"a7d06380854f","top ten security issues for LLMs",{"_key":2738,"_type":27,"marks":2739,"text":2740},"f8ee51fb51b22",[],", which includes things like the prompt injection above and not validating outputs. Other security issues can target the training data, supply chain, and uptime of the LLM. 
Understand these unique issues and ensure that you have controls in place (or use a provider that does that for you).",[2742],{"_key":2735,"_type":122,"href":2743},"https://owasp.org/www-project-top-10-for-large-language-model-applications/",{"_key":2745,"_type":23,"children":2746,"markDefs":2751,"style":212},"48199dd028cb",[2747],{"_key":2748,"_type":27,"marks":2749,"text":2750},"2bf20befd78f0",[],"Updates",[],{"_key":2753,"_type":23,"children":2754,"markDefs":2759,"style":31},"b837aa2a063e",[2755],{"_key":2756,"_type":27,"marks":2757,"text":2758},"12579c716e350",[],"Once a model has been trained, its knowledge and understanding of language is fixed in time. You may have seen some folks mine comedy out of asking an AI about current events outside of its training data. To avoid having your LLM grow stale over time, you’ll need to augment its training data by fine-tuning the parameters on new data, using retrieval-augmented generation on a knowledge base, or both. The precise techniques to use will depend on your use case and resources.",[],{"_key":2761,"_type":23,"children":2762,"markDefs":2767,"style":31},"b25cef6cb26e",[2763],{"_key":2764,"_type":27,"marks":2765,"text":2766},"6b8bc3265a260",[],"Also in play is a concept called model drift. Over time, an LLM can become less accurate, whether because the training data no longer accurately represents the concepts used in practice or because the current dataset has changed. You can try fine-tuning a model continuously, but some folks recommend starting over and retraining your model on the newest data instead. For open-source models, you’ll have to do this yourself, while managed models may do this for you (for a fee, most likely).",[],{"_key":2769,"_type":23,"children":2770,"markDefs":2775,"style":31},"657e7e6d21f1",[2771],{"_key":2772,"_type":27,"marks":2773,"text":2774},"0b15b2d667b90",[],"As you can see, the GenAI landscape is vast and complicated, with many different options to consider and risks to account for. 
You’ll need to think through your use cases and decide which qualities of your software are most important to your customers.",[],{"_type":162,"seoImage":2777},{"_type":16,"asset":2778},{"_ref":2779,"_type":19},"image-2a9368b0a8c04626e2d546d1728e3cc276268334-2400x1260-png",{"_type":169,"current":2781},"cost-scale-security","Considerations: Cost, scale, security and more",{"_type":162,"seoDescription":2784,"seoImage":2785},"Explore AI integration options, from building in-house models to leveraging open-source ones, and understand key GenAI components.",{"_type":16,"asset":2786},{"_ref":2787,"_type":19},"image-da56fe1ba77e8a7c9b6931eceb9c64242540a5ea-2400x1260-png",[],{"_type":169,"current":2790},"building-your-genai-tech-stack","Building your GenAI tech stack",{"_key":2793,"_type":45,"body":2794,"sections":2922,"sidebarCta":3215,"slug":3216,"title":3218},"26e72c525b66",[2795,2803,2811,2819,2823,2831,2839,2843,2851,2870,2878,2886,2890,2898,2906,2914],{"_key":2796,"_type":23,"children":2797,"markDefs":2802,"style":31},"b011e9d9bcb8",[2798],{"_key":2799,"_type":27,"marks":2800,"text":2801},"b73fb2172a950",[],"Chatbots from OpenAI, Google, and Anthropic know a lot—heck, they basically read the whole internet. But to be truly useful inside your organization, a GenAI assistant needs to get at the proprietary knowledge your employees use to do their jobs. 
In this section, we’ll cover the process of adding your information to a database the AI assistant can draw on, either by fine-tuning a model on that data or incorporating it into a process called retrieval-augmented generation (RAG).",[],{"_key":2804,"_type":23,"children":2805,"markDefs":2810,"style":212},"24b2dd0cbdc4",[2806],{"_key":2807,"_type":27,"marks":2808,"text":2809},"296ffdea865b0",[],"What kind of data should I feed my LLM?",[],{"_key":2812,"_type":23,"children":2813,"markDefs":2818,"style":31},"f419ca0b2280",[2814],{"_key":2815,"_type":27,"marks":2816,"text":2817},"8b7bd39389030",[],"The first question you’ll need to ask yourself is how you want to deploy a GenAI. If you plan for it to be a helper inside your work chat, then it might make sense to train it on your organization’s documentation. If it’s going to act as tech support for your customers, you could train it on your FAQ and the forum posts from your technical support website. A GenAI trained on your codebase might prove useful as a productivity booster for your developers; a system trained on your HR and payroll materials might be something employees can turn to with questions that would normally be routed to your HR team.",[],{"_key":2820,"_type":16,"asset":2821},"0e07637741a5",{"_ref":2822,"_type":19},"image-ebba366989887c256cd3c2d14e4971cda0eb31f9-1430x682-png",{"_key":2824,"_type":23,"children":2825,"markDefs":2830,"style":212},"4dca9c0547e5",[2826],{"_key":2827,"_type":27,"marks":2828,"text":2829},"97fff9c22a460",[],"Does data quality matter?",[],{"_key":2832,"_type":23,"children":2833,"markDefs":2838,"style":31},"3e0158a3d783",[2834],{"_key":2835,"_type":27,"marks":2836,"text":2837},"08aebfaea93a0",[],"One of the biggest themes to emerge in the GenAI space has been the importance of data quality. 
When Google released its latest model, Gemini, they wrote that “data quality is critical to a high-performing model.” We know it's an important component of training, alongside the algorithms that guide the process and the hardware that executes it. But what many industry leaders are now saying is that data quality trumps these other factors.",[],{"_key":2840,"_type":90,"citation":2841,"copy":2842},"c6e61252404a","Tri Dao, Princeton University computer science teacher","“All the architecture stuff is fun, making the hardware efficient is fun, but I think ultimately it’s about data. If you look at the scaling law curve, different modern architectures will have the same slope, just a different offset. The only thing that changes the slope is data quality.”",{"_key":2844,"_type":23,"children":2845,"markDefs":2850,"style":212},"c9dc6ee04bc9",[2846],{"_key":2847,"_type":27,"marks":2848,"text":2849},"5128881373430",[],"Making it happen",[],{"_key":2852,"_type":23,"children":2853,"markDefs":2867,"style":31},"02c9e767f18e",[2854,2858,2863],{"_key":2855,"_type":27,"marks":2856,"text":2857},"de0f773becb80",[],"The method quickly becoming an industry best practice for getting a GenAI model to work with your data is called ",{"_key":2859,"_type":27,"marks":2860,"text":2862},"de0f773becb81",[2861],"983b6dc7a161","retrieval-augmented generation",{"_key":2864,"_type":27,"marks":2865,"text":2866},"de0f773becb82",[]," (RAG). 
With this method, the GenAI system retains all the intelligence of its training and fine-tuning, but restricts the material it draws on to the information you provide, giving it access to proprietary knowledge and helping to reduce factual errors.",[2868],{"_key":2861,"_type":122,"href":2869},"https://stackoverflow.co/teams/resources/ai-industry-guide/key-tools-technologies-terms/rag/",{"_key":2871,"_type":23,"children":2872,"markDefs":2877,"style":31},"02bb122e7f27",[2873],{"_key":2874,"_type":27,"marks":2875,"text":2876},"e1bf438da6910",[],"To take this approach, you’ll need to pick an embedding model and store the resulting vectors in a vector database. In simple terms, you turn text into numbers organized as points in a spatial cloud. By learning which terms are related, the model comes to understand their meaning and context.",[],{"_key":2879,"_type":23,"children":2880,"markDefs":2885,"style":31},"4eb9e5d6670c",[2881],{"_key":2882,"_type":27,"marks":2883,"text":2884},"f3a74f0ae60f0",[],"A great RAG system allows you to reduce the factual inaccuracies and hallucinations an LLM can produce. It also allows you to include annotations, so users can see the ground truth the LLM assistant used to provide its answer to each query.",[],{"_key":2887,"_type":16,"asset":2888},"affbb71c7584",{"_ref":2889,"_type":19},"image-68424323b003488386d2aa7b9ff5240a7f7d3635-1430x682-png",{"_key":2891,"_type":23,"children":2892,"markDefs":2897,"style":212},"f47887576fb7",[2893],{"_key":2894,"_type":27,"marks":2895,"text":2896},"2598ef401fd70",[],"Conclusion",[],{"_key":2899,"_type":23,"children":2900,"markDefs":2905,"style":31},"7e9cb751339d",[2901],{"_key":2902,"_type":27,"marks":2903,"text":2904},"cb631b2b699b0",[],"A lot of GenAI assistants are going to be built as chatbots that provide answers to users’ questions. At Stack Overflow, we’re lucky that our approach to documentation was already organized as a Q&A system. 
This information is also packed with rich metadata—which answer is the most recent, which answer got the most votes, which answer was accepted, and what tags are associated with this question.",[],{"_key":2907,"_type":23,"children":2908,"markDefs":2913,"style":31},"3079bd188fe0",[2909],{"_key":2910,"_type":27,"marks":2911,"text":2912},"3b6450a965270",[],"A crowdsourced system has another advantage: data quality. If an AI is pulling in a huge amount of internal documentation or lines of code, it has no way of knowing which information is most accurate, relevant, and up-to-date. It might be great at understanding the text from your wiki or the code from your repos, but it has no way of knowing which parts of the wiki have gone stale or which code is due to be deprecated unless you provide that context.",[],{"_key":2915,"_type":23,"children":2916,"markDefs":2921,"style":31},"38fe4e155a69",[2917],{"_key":2918,"_type":27,"marks":2919,"text":2920},"be8e1712b7af0",[],"If you’re working to create a GenAI assistant at your organization that will have access to proprietary information or code, make sure you spend time with your data science team figuring out how to clean and improve its quality before using it for training, fine-tuning, or RAG. Also, be sure to check with your legal and security teams to ensure that any data which isn’t meant to be widely available is excluded from training. 
There’s no way to remove it once the model has been finished without starting the training process all over again.",[],[2923],{"_key":2924,"_type":262,"body":2925,"slug":3212,"title":3214},"efd586a52276",[2926,2934,2964,2983,3002,3010,3018,3026,3030,3038,3042,3061,3080,3088,3096,3100,3118,3126,3129,3137,3145,3153,3157,3176,3196,3204],{"_key":2927,"_type":23,"children":2928,"markDefs":2933,"style":212},"54437f7c16f5",[2929],{"_key":2930,"_type":27,"marks":2931,"text":2932},"2ebefe9727950",[],"What is synthetic data?",[],{"_key":2935,"_type":23,"children":2936,"markDefs":2959,"style":31},"4827e7c6eb5d",[2937,2941,2946,2950,2955],{"_key":2938,"_type":27,"marks":2939,"text":2940},"378b9699b4430",[],"With machine learning, especially the large language models and other models currently in vogue with GenAI, getting good outputs means training those models on a lot of data—terabytes of text for even the smallest current models. ",{"_key":2942,"_type":27,"marks":2943,"text":2945},"378b9699b4431",[2944],"d5954942816f","A paper",{"_key":2947,"_type":27,"marks":2948,"text":2949},"378b9699b4432",[]," by researchers at Google Deepmind found the optimal number of tokens—a fraction of text—per parameter is around 15, though most of the top models are using 1000 to 2000 tokens per parameter. ",{"_key":2951,"_type":27,"marks":2952,"text":2954},"378b9699b4433",[2953],"cd6a10cdf93f","GPT-4 has over 1000 parameters",{"_key":2956,"_type":27,"marks":2957,"text":2958},"378b9699b4434",[]," and was trained on 1000 terabytes of data. 
Newer models have more parameters trained on more data.",[2960,2962],{"_key":2944,"_type":122,"href":2961},"https://arxiv.org/abs/2203.15556",{"_key":2953,"_type":122,"href":2963},"https://www.enterpriseappstoday.com/stats/chatgpt-4-statistics.html?utm_content=cmp-true",{"_key":2965,"_type":23,"children":2966,"markDefs":2980,"style":31},"9d40e16c2e47",[2967,2971,2976],{"_key":2968,"_type":27,"marks":2969,"text":2970},"c601aa4c6c9c0",[],"Further improvements to these LLMs mean ",{"_key":2972,"_type":27,"marks":2973,"text":2975},"c601aa4c6c9c1",[2974],"8c75b770d240","more data",{"_key":2977,"_type":27,"marks":2978,"text":2979},"c601aa4c6c9c2",[],", whether that is by training for more parameters or overtraining each parameter. The creators of this data (humans) saw AI’s insatiable hunger for our work and pushed back. We had worked hard on it, didn’t appreciate being fodder for someone else’s product, and asked for recognition of our contributions (if not payment). And now many LLM companies have begun crediting the people who created their content.",[2981],{"_key":2974,"_type":122,"href":2982},"https://stackoverflow.blog/2024/10/17/training-data-scarcity-synthetic-quality-model-genai-ai/",{"_key":2984,"_type":23,"children":2985,"markDefs":2999,"style":31},"eb8533a3aefe",[2986,2990,2995],{"_key":2987,"_type":27,"marks":2988,"text":2989},"cf34d7cb29400",[],"There’s a hard limit on how much data is available for training. Even if you’re compensating all the copyright holders (or willing to risk their wrath), the amount of data available on the internet and in the world is finite. ",{"_key":2991,"_type":27,"marks":2992,"text":2994},"cf34d7cb29401",[2993],"ddde3e8a74e9","Researchers estimate",{"_key":2996,"_type":27,"marks":2997,"text":2998},"cf34d7cb29402",[]," that model trainers will run out of human-created data between 2026 and 2032. 
At that point, LLM trainers will need to accept this ceiling or find other avenues for training data.",[3000],{"_key":2993,"_type":122,"href":3001},"https://arxiv.org/html/2211.04325v2",{"_key":3003,"_type":23,"children":3004,"markDefs":3009,"style":31},"fc50e2799be2",[3005],{"_key":3006,"_type":27,"marks":3007,"text":3008},"3d90203f79d90",[],"One that has shown some promise is synthetic data. This is data that is created by a machine process, whether that’s an LLM or a computer simulation. For machine learning processes hungry for data, synthetic data can provide. It has secondary uses, too, as a source of focused data or a privacy screen.",[],{"_key":3011,"_type":23,"children":3012,"markDefs":3017,"style":344},"67e08b965826",[3013],{"_key":3014,"_type":27,"marks":3015,"text":3016},"098933406a220",[],"More training data",[],{"_key":3019,"_type":23,"children":3020,"markDefs":3025,"style":31},"431fee7efc1b",[3021],{"_key":3022,"_type":27,"marks":3023,"text":3024},"4de7023275a20",[],"If what models need is more data, then computers can do that. By using existing machine learning models to generate training data, you can train up a model on the cheap using the results of other training processes. Combined with human-generated data, this can allow you to create larger models based on better-formatted data.",[],{"_key":3027,"_type":16,"asset":3028},"07c4ef02cb58",{"_ref":3029,"_type":19},"image-34297aef92b00db4aabdd61a01a7367fa72caabd-1431x682-png",{"_key":3031,"_type":23,"children":3032,"markDefs":3037,"style":31},"cee9613a75f5",[3033],{"_key":3034,"_type":27,"marks":3035,"text":3036},"cc8a98a04e470",[],"For some models, synthetic data may be the only way to get complete sets. For use cases like autonomous driving, you can build models with more complete training data by using synthetic data generated by simulations. 
",[],{"_key":3039,"_type":90,"citation":3040,"copy":3041},"135751b9e6f8","Kalyan Veeramachaneni, principal research scientist at MIT and co-founder of DataCebo","“It’s not possible to acquire training data that represents every possible driving scenario that could occur. In this case, synthetic data is a useful method to introduce the system to as many different situations as possible.”",{"_key":3043,"_type":23,"children":3044,"markDefs":3058,"style":31},"93a5c5a47c90",[3045,3049,3054],{"_key":3046,"_type":27,"marks":3047,"text":3048},"9a816594c1880",[],"Recently, the DeepSeek R1 model showed the power of good synthetic data and targeted training sets. Reports claim that it used OpenAI to produce responses to train its model in ",{"_key":3050,"_type":27,"marks":3051,"text":3053},"9a816594c1881",[3052],"6bc633bea274","a process known as distilling",{"_key":3055,"_type":27,"marks":3056,"text":3057},"9a816594c1882",[],". While the licensing issues are currently in question here, relying on the data produced by another LLM can certainly lower the costs of model training.",[3059],{"_key":3052,"_type":122,"href":3060},"https://www.theverge.com/news/601195/openai-evidence-deepseek-distillation-ai-data",{"_key":3062,"_type":23,"children":3063,"markDefs":3077,"style":31},"da2c9b46ad76",[3064,3068,3073],{"_key":3065,"_type":27,"marks":3066,"text":3067},"f62534d9c33e0",[],"There is a danger with ",{"_key":3069,"_type":27,"marks":3070,"text":3072},"f62534d9c33e1",[3071],"bb5768e3ea5c","synthetic data as the primary training source",{"_key":3074,"_type":27,"marks":3075,"text":3076},"f62534d9c33e2",[],": model collapse. This is when the repeated hallucinations, biases, and errors produced by any model amplify when used to train other models. The outliers in the original statistical model are lost, and the new model uses a narrower statistical distribution. 
While that would likely remove some of the more comical AI fails, it would also remove the full breadth of understanding.",[3078],{"_key":3071,"_type":122,"href":3079},"https://arxiv.org/abs/2404.05090",{"_key":3081,"_type":23,"children":3082,"markDefs":3087,"style":31},"4de0e6723502",[3083],{"_key":3084,"_type":27,"marks":3085,"text":3086},"c0060e30148b0",[],"One of the current dangers around AI is the amount of AI-generated content now on the internet. GenAI has been used to quickly create SEO-friendly primers for every organization trying to rank for a given keyword. Anyone training on a full crawl of the internet is going to be gathering up this content and putting themselves at risk for model collapse.",[],{"_key":3089,"_type":23,"children":3090,"markDefs":3095,"style":344},"c1c856184e81",[3091],{"_key":3092,"_type":27,"marks":3093,"text":3094},"e6a357eebfdb0",[],"Optimized training data",[],{"_key":3097,"_type":16,"asset":3098},"6cac097da80c",{"_ref":3099,"_type":19},"image-3ef225f0a8cc07f9141e0513331968e935b8cf5c-1430x682-png",{"_key":3101,"_type":23,"children":3102,"markDefs":3116,"style":31},"94fab6480b10",[3103,3107,3112],{"_key":3104,"_type":27,"marks":3105,"text":3106},"6bead0d3fb360",[],"While a general purpose model trained on synthetic data is at risk of model collapse, some model trainers have used synthetic data as a focused training set to ",{"_key":3108,"_type":27,"marks":3109,"text":3111},"6bead0d3fb361",[3110],"86c3fe3802ca","get better-than-average results out of small models",{"_key":3113,"_type":27,"marks":3114,"text":3115},"6bead0d3fb362",[],". A group from Microsoft trained their phi-1 model on a synthetic Python textbook and exercises with answers. They created the textbook by prompting GPT 3.5 to create topics that would promote reasoning and algorithmic skills. 
The final model has 1.5B parameters and matches the performance of models with 10x the number of parameters.",[3117],{"_key":3110,"_type":122,"href":1834},{"_key":3119,"_type":23,"children":3120,"markDefs":3125,"style":31},"0e721c682f01",[3121],{"_key":3122,"_type":27,"marks":3123,"text":3124},"52be9a8d88970",[],"Focused data, even when produced by another LLM, can train a model to punch above its weight for a fraction of the cost.",[],{"_key":3127,"_type":90,"copy":3128},"7858efedc36e","“We conjecture that language models would benefit from a training set that has the same qualities as a good ‘textbook’: it should be clear, self-contained, instructive, and balanced.”",{"_key":3130,"_type":23,"children":3131,"markDefs":3136,"style":31},"285aaaaeb3ed",[3132],{"_key":3133,"_type":27,"marks":3134,"text":3135},"da6f50139c700",[],"Smaller, targeted LLMs not only provide more bang for their buck from training costs, but they are also cheaper to run inference and fine-tuning on. If you want resource and cost efficiency and don’t need the creativity and comprehensiveness of a massive model, you might do better by selecting an LLM with fewer parameters.",[],{"_key":3138,"_type":23,"children":3139,"markDefs":3144,"style":344},"8cc47e2054a5",[3140],{"_key":3141,"_type":27,"marks":3142,"text":3143},"0aa29a34e5970",[],"Privacy",[],{"_key":3146,"_type":23,"children":3147,"markDefs":3152,"style":31},"1011b5e48a7f",[3148],{"_key":3149,"_type":27,"marks":3150,"text":3151},"e789b05aface0",[],"Another common use for synthetic data is to protect privacy. In the course of gathering data, whether about customers or their usage of an application, you may want to analyze it or share it with vendors. But that could expose your customers’ PII and leave them vulnerable for exploitation. 
Synthetic data creates a statistically similar data set that doesn’t have the same risks of PII leakage.",[],{"_key":3154,"_type":16,"asset":3155},"31c168f65b03",{"_ref":3156,"_type":19},"image-c31647e2d02e62d31861657ccb2f211998dc351b-1430x682-png",{"_key":3158,"_type":23,"children":3159,"markDefs":3173,"style":31},"bc5a5dac4f92",[3160,3164,3169],{"_key":3161,"_type":27,"marks":3162,"text":3163},"b8f0033f45500",[],"On the ",{"_key":3165,"_type":27,"marks":3166,"text":3168},"b8f0033f45501",[3167],"a1ac7c687e83","Stack Overflow Podcast",{"_key":3170,"_type":27,"marks":3171,"text":3172},"b8f0033f45502",[],", John Myers, CTO and cofounder of Gretel, told us how this works: “What our synthetic data capability does is build a machine learning model on the original data, at which point you can throw out the original data. And then you can use that model to create records that look and feel like the original records. We have a bunch of post-processing that removes outliers or overly similar records, what we call privacy filtering.”",[3174],{"_key":3167,"_type":122,"href":3175},"https://stackoverflow.blog/2022/01/28/gretel-ai-privacy-engineering-synthetic-data/",{"_key":3177,"_type":23,"children":3178,"markDefs":3195,"style":31},"4aa9c473f6ea",[3179,3183,3187,3191],{"_key":3180,"_type":27,"marks":3181,"text":3182},"3a577c0c280c0",[],"This privacy-filtered data can then be used instead of the original data while maintaining the general shape of that data. You can run analytics on it, train other models, or use it in demos. ",{"_key":3184,"_type":27,"marks":3185,"text":3186},"5aba1beb4805",[57],"“Synthetic data needs to meet certain criteria to be reliable and effective—for example, preserving column shapes, category coverage, and correlations,”",{"_key":3188,"_type":27,"marks":3189,"text":3190},"a9268c58467b",[]," said Veeramachaneni, the MIT research scientist. 
",{"_key":3192,"_type":27,"marks":3193,"text":3194},"be0b14c8bc19",[57],"“To enable this, the processes used to generate the data can be controlled by specifying particular statistical distributions for columns, model architectures and data transformation methods.”",[],{"_key":3197,"_type":23,"children":3198,"markDefs":3203,"style":31},"83489158c03e",[3199],{"_key":3200,"_type":27,"marks":3201,"text":3202},"59e896391ce40",[],"This statistically similar data can then be used to train other models to produce responses that have no chance of leaking any sort of private or personally-identifiable information. These models can make accurate predictions on production data without giving away sensitive information to your contractors. And they aren’t vulnerable to model inference attacks or re-identification attacks.",[],{"_key":3205,"_type":23,"children":3206,"markDefs":3211,"style":31},"a028a4ac9d2b",[3207],{"_key":3208,"_type":27,"marks":3209,"text":3210},"ea6872d66b7b0",[],"For model trainers looking for more data, targeted data, or depersonalized data, synthetic can be even better than the real thing. It can add to existing models and push a model to perform in desired ways. 
However, if all you’re using is synthetic data, then you are at risk of model collapse.",[],{"_type":169,"current":3213},"synthetic-data","Synthetic data",[],{"_type":169,"current":3217},"data-quality","The importance of data quality",{"_key":3220,"_type":45,"body":3221,"sections":3614,"seo":4707,"sidebarCta":4712,"slug":4713,"title":4715},"dd65502b26a8",[3222,3230,3238,3242,3250,3258,3262,3292,3300,3308,3315,3319,3327,3335,3339,3358,3366,3370,3378,3386,3390,3398,3406,3424,3432,3436,3444,3451,3455,3463,3470,3474,3482,3490,3498,3506,3514,3522,3530,3534,3553,3561,3565,3582,3590,3598,3606],{"_key":3223,"_type":23,"children":3224,"markDefs":3229,"style":31},"8dfa81af8a7b",[3225],{"_key":3226,"_type":27,"marks":3227,"text":3228},"3b828d3b62b40",[],"The technologies and tools supporting GenAI's developments are moving fast. Here's an overview of the technologies, terms, and principles AI developers need to know.",[],{"_key":3231,"_type":23,"children":3232,"markDefs":3237,"style":344},"82fb63206ae0",[3233],{"_key":3234,"_type":27,"marks":3235,"text":3236},"8fa86e61e8470",[],"Python",[],{"_key":3239,"_type":16,"asset":3240},"254753f44ba3",{"_ref":3241,"_type":19},"image-298b5ab3febe147eeeb2429f57f8d7c532753100-1430x682-png",{"_key":3243,"_type":23,"children":3244,"markDefs":3249,"style":31},"cbb566a175a4",[3245],{"_key":3246,"_type":27,"marks":3247,"text":3248},"ffddbea61bae0",[],"Python remains the primary programming language for machine learning. It doesn't need compilation to test changes, making it the perfect tool for data scientists who may not have expert programming skills and want to run AI experiments. As Python has been around since the 1990s, an ecosystem has arisen around it. 
Although not everything can be written in Python, it wraps nicely around other faster languages like C.",[],{"_key":3251,"_type":23,"children":3252,"markDefs":3257,"style":344},"042e0c3d489c",[3253],{"_key":3254,"_type":27,"marks":3255,"text":3256},"d3f780a8732b0",[],"Hardware accelerators",[],{"_key":3259,"_type":16,"asset":3260},"30db9ffccbee",{"_ref":3261,"_type":19},"image-e91648014014301f151caea3b1bbf4c892776682-1430x682-png",{"_key":3263,"_type":23,"children":3264,"markDefs":3287,"style":31},"b217316d29f0",[3265,3269,3274,3278,3283],{"_key":3266,"_type":27,"marks":3267,"text":3268},"ac94cbcd6a8b0",[],"Hardware accelerators are essential for processing complex AI computations. They grew out of 3D graphics, which calculate multiple points in space and light sources to render an image. Accelerators found new life in machine learning and AI, which need to calculate thousands of weights and biases in parallel. For decades, the primary computing engine of most computers has been the central processing unit (CPU). This is a general-purpose serial computing unit that handles several operations concurrently and uses a memory cache to store interim computations. Hardware accelerators like GPUs (graphics processing units) and TPUs (tensor processing units) can process thousands of computations in parallel. By June 2024, ",{"_key":3270,"_type":27,"marks":3271,"text":3273},"ac94cbcd6a8b1",[3272],"f508b6a5db14","Nvidia owned 88% of the GPU market",{"_key":3275,"_type":27,"marks":3276,"text":3277},"ac94cbcd6a8b2",[],". This is beneficial for consolidating standards, but poses a risk of a single point of failure with one dominant player. 
But in January 2025, ",{"_key":3279,"_type":27,"marks":3280,"text":3282},"ac94cbcd6a8b3",[3281],"df8dd047a7c1","they lost $600 billion in valuation",{"_key":3284,"_type":27,"marks":3285,"text":3286},"ac94cbcd6a8b4",[]," as DeepSeek unveiled their R1 model, sparking concerns about the entry of cheaper Chinese tech eliminating the need for expensive, high-end GPU servers.",[3288,3290],{"_key":3272,"_type":122,"href":3289},"https://www.techradar.com/computing/gpu/nvidia-now-owns-88-of-the-gpu-market-but-that-might-not-be-a-bad-thing-yet",{"_key":3281,"_type":122,"href":3291},"https://www.cnbc.com/2025/01/27/nvidia-sheds-almost-600-billion-in-market-cap-biggest-drop-ever.html",{"_key":3293,"_type":23,"children":3294,"markDefs":3299,"style":212},"db6d876cc8eb",[3295],{"_key":3296,"_type":27,"marks":3297,"text":3298},"e360592672310",[],"Neural networks",[],{"_key":3301,"_type":23,"children":3302,"markDefs":3307,"style":31},"c1a467513e5f",[3303],{"_key":3304,"_type":27,"marks":3305,"text":3306},"9d04314c3a5b",[],"Neural networks are the basis for most GenAI models. There are several different types that you'll encounter when considering and implementing GenAI.",[],{"_key":3309,"_type":23,"children":3310,"markDefs":3314,"style":344},"82b605030adc",[3311],{"_key":3312,"_type":27,"marks":3313,"text":2053},"f44c6afacb620",[],[],{"_key":3316,"_type":16,"asset":3317},"f17f20e1cdde",{"_ref":3318,"_type":19},"image-bf95a749a9fb6384771e6307393553969917f6c4-1430x682-png",{"_key":3320,"_type":23,"children":3321,"markDefs":3326,"style":31},"7014c6d82717",[3322],{"_key":3323,"_type":27,"marks":3324,"text":3325},"21cc1c8dad890",[],"LLMs use machine learning to understand and generate language. 
They’ve advanced significantly in recent years, with models like OpenAI's GPT-4 supporting multimodal interactions, including text and image analysis.",[],{"_key":3328,"_type":23,"children":3329,"markDefs":3334,"style":344},"65d66a762037",[3330],{"_key":3331,"_type":27,"marks":3332,"text":3333},"7d94662c4ad30",[],"Generative adversarial networks (GANs) and synthetic data generation",[],{"_key":3336,"_type":16,"asset":3337},"9e746c485ba2",{"_ref":3338,"_type":19},"image-461b0aff937eddf3cbee62f0315d7ef5d08fc507-1430x682-png",{"_key":3340,"_type":23,"children":3341,"markDefs":3355,"style":31},"1c8ad82517a5",[3342,3346,3351],{"_key":3343,"_type":27,"marks":3344,"text":3345},"eb99b710cd2d0",[],"GANs are widely used for synthetic data generation, particularly in image and video synthesis. This synthetic data is used to train AI models, enhancing privacy and diversity. In 2023, ",{"_key":3347,"_type":27,"marks":3348,"text":3350},"eb99b710cd2d1",[3349],"fed31bf8d7c6","Gartner predicted",{"_key":3352,"_type":27,"marks":3353,"text":3354},"eb99b710cd2d2",[]," that by 2024, 60% of data used in AI and analytics projects would be synthetically generated.",[3356],{"_key":3349,"_type":122,"href":3357},"https://www.gartner.com/en/newsroom/press-releases/2023-08-01-gartner-identifies-top-trends-shaping-future-of-data-science-and-machine-learning",{"_key":3359,"_type":23,"children":3360,"markDefs":3365,"style":344},"65df20de4856",[3361],{"_key":3362,"_type":27,"marks":3363,"text":3364},"4d7f5260662e0",[],"Variational auto-encoders (VAEs)",[],{"_key":3367,"_type":16,"asset":3368},"6c6fe303c97e",{"_ref":3369,"_type":19},"image-2237a8d36a28c829e19587c20ccc7b441c0dc33d-1430x682-png",{"_key":3371,"_type":23,"children":3372,"markDefs":3377,"style":31},"fdf1475d1e20",[3373],{"_key":3374,"_type":27,"marks":3375,"text":3376},"b52f5042d8b80",[],"VAEs generate new data across various domains, including music and 
art.",[],{"_key":3379,"_type":23,"children":3380,"markDefs":3385,"style":344},"33a22d841848",[3381],{"_key":3382,"_type":27,"marks":3383,"text":3384},"5d1146f6a6110",[],"Transformer-based LLMs",[],{"_key":3387,"_type":16,"asset":3388},"cf7c6aa63a54",{"_ref":3389,"_type":19},"image-57e6172c7ae28f1e672428284aaa577b664a3209-1430x682-png",{"_key":3391,"_type":23,"children":3392,"markDefs":3397,"style":31},"677fa0a5f78e",[3393],{"_key":3394,"_type":27,"marks":3395,"text":3396},"84252d6fba6e0",[],"Transformer LLM models speed up natural language processing (NLP) tasks and are customizable for specific domains.",[],{"_key":3399,"_type":23,"children":3400,"markDefs":3405,"style":344},"eb2f2abd5176",[3401],{"_key":3402,"_type":27,"marks":3403,"text":3404},"6273ecb7ca8c0",[],"Multimodal models",[],{"_key":3407,"_type":23,"children":3408,"markDefs":3421,"style":31},"63683b1f5b7c",[3409,3413,3417],{"_key":3410,"_type":27,"marks":3411,"text":3412},"497572f9acac0",[],"Multimodal LLMs handle data across text, image, and audio. They’ve seen wider adoption thanks to mass market tools like Google's Gemini and Microsoft's Copilot giving easy access to text and visual creation inside one tool. 
",{"_key":3414,"_type":27,"marks":3415,"text":613},"497572f9acac1",[3416],"a19ba664bb32",{"_key":3418,"_type":27,"marks":3419,"text":3420},"497572f9acac2",[]," that 40% of GenAI solutions will be multimodal by 2027.",[3422],{"_key":3416,"_type":122,"href":3423},"https://www.gartner.com/en/newsroom/press-releases/2024-09-09-gartner-predicts-40-percent-of-generative-ai-solutions-will-be-multimodal-by-2027",{"_key":3425,"_type":23,"children":3426,"markDefs":3431,"style":344},"6cd9e592db34",[3427],{"_key":3428,"_type":27,"marks":3429,"text":3430},"3a6f3c32724f0",[],"Machine learning frameworks",[],{"_key":3433,"_type":16,"asset":3434},"9dcfd10663d6",{"_ref":3435,"_type":19},"image-9b1c3369afce24b24cfbfbdf4d722bb4d492bacf-1430x682-png",{"_key":3437,"_type":23,"children":3438,"markDefs":3443,"style":31},"ef58846fb9b5",[3439],{"_key":3440,"_type":27,"marks":3441,"text":3442},"7ca5966cab3c0",[],"The complex math used by ML models can be complex for developers to implement. Open-source Python libraries like PyTorch and TensorFlow make training and fine-tuning models more accessible and standardized.",[],{"_key":3445,"_type":23,"children":3446,"markDefs":3450,"style":344},"c017a1d733d6",[3447],{"_key":3448,"_type":27,"marks":3449,"text":2182},"7d9e577004ac0",[],[],{"_key":3452,"_type":16,"asset":3453},"d61c888db96a",{"_ref":3454,"_type":19},"image-b82db719d814d0c754673289a735c729aab7e1ef-1430x682-png",{"_key":3456,"_type":23,"children":3457,"markDefs":3462,"style":31},"95e4d525fc9c",[3458],{"_key":3459,"_type":27,"marks":3460,"text":3461},"9248f73316620",[],"GenAI relies on large amounts of data for training, fine-tuning, and semantic search. This data is often stored in data lakehouses, which combine the structured reliability and low latency of data warehouses with the cost efficiency of a data lake. 
AI processes can access business intelligence and analytics data, enabling more relevant insights from AI systems.",[],{"_key":3464,"_type":23,"children":3465,"markDefs":3469,"style":344},"888254eadcc3",[3466],{"_key":3467,"_type":27,"marks":3468,"text":2155},"21ecbb75dbed0",[],[],{"_key":3471,"_type":16,"asset":3472},"70eea0cc96fb",{"_ref":3473,"_type":19},"image-aaeb15277d6085044a0f060df6787b3afae7acb4-1430x682-png",{"_key":3475,"_type":23,"children":3476,"markDefs":3481,"style":31},"57b56b52618e",[3477],{"_key":3478,"_type":27,"marks":3479,"text":3480},"1bee80469b720",[],"LLMs convert text into numerical patterns, like coordinates on a map, to represent language in a structured way. Vector databases store these patterns efficiently, even with thousands of data points, and make it easy to search and compare them quickly. Vector databases store high-dimensional vectors for AI applications. They’re crucial for retrieval-augmented generation (RAG) and semantic search.",[],{"_key":3483,"_type":23,"children":3484,"markDefs":3489,"style":344},"dc99a9f70b07",[3485],{"_key":3486,"_type":27,"marks":3487,"text":3488},"2395282e48700",[],"Cloud and edge AI",[],{"_key":3491,"_type":23,"children":3492,"markDefs":3497,"style":31},"16fa61445dfa",[3493],{"_key":3494,"_type":27,"marks":3495,"text":3496},"550ccc0cf13c0",[],"Deploying AI models on edge devices enables real-time processing with reduced latency, enhanced data privacy, and reduced dependence on network connectivity. Cloud AI involves centralizing the processing of data on remote cloud servers. 
Developers can access advanced tools without investing heavily in development or hardware, using on-demand computing resources instead of physical infrastructure.",[],{"_key":3499,"_type":23,"children":3500,"markDefs":3505,"style":344},"978ab7c05f85",[3501],{"_key":3502,"_type":27,"marks":3503,"text":3504},"59d0c2e644970",[],"Federated learning",[],{"_key":3507,"_type":23,"children":3508,"markDefs":3513,"style":31},"1c11d8110aa5",[3509],{"_key":3510,"_type":27,"marks":3511,"text":3512},"65b7676fc5790",[],"Federated learning is a decentralized machine learning method where multiple devices train an AI model without sharing data with a central server. Each device trains a local model using its data and sends updates to the cloud for refinement, enhancing privacy and reducing data transfer by up to 90%. This method can be used to analyze user behavior while maintaining data security. TensorFlow Federated (TFF) is one prominent tool in this space.",[],{"_key":3515,"_type":23,"children":3516,"markDefs":3521,"style":212},"7d42d3986724",[3517],{"_key":3518,"_type":27,"marks":3519,"text":3520},"f772786db7170",[],"AI model principles",[],{"_key":3523,"_type":23,"children":3524,"markDefs":3529,"style":344},"6177094e78bc",[3525],{"_key":3526,"_type":27,"marks":3527,"text":3528},"5fd6c7e280970",[],"Hallucinations",[],{"_key":3531,"_type":16,"asset":3532},"1effdfe8f0fe",{"_ref":3533,"_type":19},"image-a330bbffa8c9548792f744afe885f7390248943d-1430x682-png",{"_key":3535,"_type":23,"children":3536,"markDefs":3550,"style":31},"51ddb0ea7b56",[3537,3541,3546],{"_key":3538,"_type":27,"marks":3539,"text":3540},"64b43a0fa9bf0",[],"Hallucinations are instances when AI models generate plausible-sounding but incorrect information. 
The bad news: ",{"_key":3542,"_type":27,"marks":3543,"text":3545},"64b43a0fa9bf1",[3544],"f57bba320598","AI researchers believe hallucinations",{"_key":3547,"_type":27,"marks":3548,"text":3549},"64b43a0fa9bf2",[]," are a feature rather than a bug in LLM tools, as LLMs aren’t drawing on existing knowledge but are programmed to come up with plausible-sounding responses. Addressing hallucinations remains a priority for developing more reliable AI models. Retrieval-augmented generation (RAG), which grounds responses in specified data sources, can counteract hallucinations.",[3551],{"_key":3544,"_type":122,"href":3552},"https://casmi.northwestern.edu/news/articles/2024/the-hallucination-problem-a-feature-not-a-bug.html",{"_key":3554,"_type":23,"children":3555,"markDefs":3560,"style":344},"7a187448668c",[3556],{"_key":3557,"_type":27,"marks":3558,"text":3559},"d77c81e4289d0",[],"Model drift",[],{"_key":3562,"_type":16,"asset":3563},"7006705c7205",{"_ref":3564,"_type":19},"image-cabbfdf1bd05f60b3479604bbad9329a894bdaf7-1430x682-png",{"_key":3566,"_type":23,"children":3567,"markDefs":3580,"style":31},"db92693e66c2",[3568,3572,3576],{"_key":3569,"_type":27,"marks":3570,"text":3571},"2ec2ea748f450",[],"Model drift happens when an AI model's performance degrades due to changes in the underlying data patterns. Drift impacts the accuracy of AI results. 
Managing drift is crucial for supporting responsible AI practices, a requirement of new regulations like the ",{"_key":3573,"_type":27,"marks":3574,"text":752},"2ec2ea748f451",[3575],"0cc0755de618",{"_key":3577,"_type":27,"marks":3578,"text":3579},"2ec2ea748f452",[],", which affects organizations developing AI systems or using their output in the EU.",[3581],{"_key":3575,"_type":122,"href":767},{"_key":3583,"_type":23,"children":3584,"markDefs":3589,"style":31},"9192becb6cf3",[3585],{"_key":3586,"_type":27,"marks":3587,"text":3588},"0e796b9b64a40",[],"Model drift can be reduced with:",[],{"_key":3591,"_type":23,"children":3592,"level":943,"listItem":944,"markDefs":3597,"style":31},"4fb013231708",[3593],{"_key":3594,"_type":27,"marks":3595,"text":3596},"1c92069b4d320",[],"Fine-tuning: Updating a pre-trained model with new data to improve its performance.",[],{"_key":3599,"_type":23,"children":3600,"level":943,"listItem":944,"markDefs":3605,"style":31},"e0442c9b3cf1",[3601],{"_key":3602,"_type":27,"marks":3603,"text":3604},"4e536219ee950",[],"Explainability: The degree to which the internal workings of an AI system can be explained in human terms.",[],{"_key":3607,"_type":23,"children":3608,"level":943,"listItem":944,"markDefs":3613,"style":31},"585c70c1d93f",[3609],{"_key":3610,"_type":27,"marks":3611,"text":3612},"f9293fb9a2670",[],"Debiasing: Techniques aimed at reducing bias in AI models to ensure 
fairness.",[],[3615,4127,4396],{"_key":3616,"_type":262,"body":3617,"seo":4121,"slug":4125,"title":1928},"c95b3225d252",[3618,3637,3656,3664,3667,3675,3683,3691,3699,3707,3715,3723,3731,3739,3747,3755,3763,3771,3779,3783,3791,3795,3803,3807,3815,3823,3831,3835,3843,3847,3855,3859,3867,3871,3879,3887,3895,3903,3911,3919,3927,3935,3943,3951,3959,3967,3975,3983,3991,3999,4011,4019,4027,4035,4043,4051,4059,4067,4075,4094,4102],{"_key":3619,"_type":23,"children":3620,"markDefs":3634,"style":31},"8140db9de33b",[3621,3625,3630],{"_key":3622,"_type":27,"marks":3623,"text":3624},"a69790a7853f0",[],"Retrieval-augmented generation (RAG) makes GenAI outputs more accurate and relevant. RAG works by combining retrieval systems with large language models (LLMs) to produce reliable answers grounded in real-world data. AI-chip manufacturer ",{"_key":3626,"_type":27,"marks":3627,"text":3629},"a69790a7853f1",[3628],"10a5b6490c45","Nvidia describes RAG",{"_key":3631,"_type":27,"marks":3632,"text":3633},"a69790a7853f2",[]," as the \"court clerk of AI\" by showing the user source data they can review, similar to the clerk bringing legal files out of the vaults.",[3635],{"_key":3628,"_type":122,"href":3636},"https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/",{"_key":3638,"_type":23,"children":3639,"markDefs":3653,"style":31},"c7a2333ae5ee",[3640,3644,3649],{"_key":3641,"_type":27,"marks":3642,"text":3643},"515ec7f4ee3c0",[],"It's becoming a popular method to reduce the hallucinations haunting the wider adoption of AI. ",{"_key":3645,"_type":27,"marks":3646,"text":3648},"515ec7f4ee3c1",[3647],"930aaa198ff6","The RAG market is forecast to grow",{"_key":3650,"_type":27,"marks":3651,"text":3652},"515ec7f4ee3c2",[]," from $1.2 billion in 2024 to $11 billion by 2030. 
Developers are applying RAG in customer support systems and documentation platforms to anchor responses to verified data.",[3654],{"_key":3647,"_type":122,"href":3655},"https://www.grandviewresearch.com/industry-analysis/retrieval-augmented-generation-rag-market-report",{"_key":3657,"_type":23,"children":3658,"markDefs":3663,"style":212},"c4288fef19ca",[3659],{"_key":3660,"_type":27,"marks":3661,"text":3662},"4172c59ee72f0",[],"What is RAG?",[],{"_key":3665,"_type":16,"asset":3666},"22e0b503acdf",{"_ref":1921,"_type":19},{"_key":3668,"_type":23,"children":3669,"markDefs":3674,"style":31},"14ff52fe61e3",[3670],{"_key":3671,"_type":27,"marks":3672,"text":3673},"b6b43201cd660",[],"The RAG technique integrates two components: retrieval and generation. First, the system retrieves relevant data from an external knowledge source or an internal database. Then the LLM processes this data to produce a context-aware response.",[],{"_key":3676,"_type":23,"children":3677,"markDefs":3682,"style":31},"a9414904c470",[3678],{"_key":3679,"_type":27,"marks":3680,"text":3681},"762f59b9c7240",[],"RAG helps developers address challenges with standalone LLMs, like producing incorrect answers or relying on outdated knowledge. 
It's the ideal technique to generate information and insight from proprietary knowledge, like a dev team's code library or a sales team's case studies file.",[],{"_key":3684,"_type":23,"children":3685,"markDefs":3690,"style":212},"9996deda6a94",[3686],{"_key":3687,"_type":27,"marks":3688,"text":3689},"c91072d09bc60",[],"Why are developers embracing RAG?",[],{"_key":3692,"_type":23,"children":3693,"markDefs":3698,"style":31},"c5ff25f35d52",[3694],{"_key":3695,"_type":27,"marks":3696,"text":3697},"86bcf671e9f90",[],"For developers building search tools, customer support bots, or knowledge applications, RAG offers a balance between precision and flexibility.",[],{"_key":3700,"_type":23,"children":3701,"markDefs":3706,"style":31},"5548773e31ca",[3702],{"_key":3703,"_type":27,"marks":3704,"text":3705},"745f6d5aa71c0",[],"In addition to trust and accuracy, RAG integrates with large datasets for faster info retrieval. Developers can fine-tune RAG pipelines to produce results tailored to their domain and needs. Unlike pre-trained LLMs whose knowledge is static, RAG can work with dynamic datasets to offer up-to-date responses.",[],{"_key":3708,"_type":23,"children":3709,"markDefs":3714,"style":212},"3fe25d05a3eb",[3710],{"_key":3711,"_type":27,"marks":3712,"text":3713},"d664bb19b4740",[],"How does RAG work?",[],{"_key":3716,"_type":23,"children":3717,"markDefs":3722,"style":31},"1dbfade29c29",[3718],{"_key":3719,"_type":27,"marks":3720,"text":3721},"5ef8d08785b10",[],"There are two stages in the RAG pipeline.",[],{"_key":3724,"_type":23,"children":3725,"markDefs":3730,"style":31},"1ad46f8d372b",[3726],{"_key":3727,"_type":27,"marks":3728,"text":3729},"b5243d0123690",[151],"Stage 1 - Retrieve ",[],{"_key":3732,"_type":23,"children":3733,"markDefs":3738,"style":31},"0ae15a940196",[3734],{"_key":3735,"_type":27,"marks":3736,"text":3737},"bdf6d6f15456",[],"Tools like Elasticsearch, FAISS, or Pinecone identify relevant data from structured or unstructured sources. 
Developers typically use vector-based retrieval for similarity-driven searches. When given a query, the system retrieves the top-k relevant documents from a knowledge base.",[],{"_key":3740,"_type":23,"children":3741,"markDefs":3746,"style":31},"54519861a275",[3742],{"_key":3743,"_type":27,"marks":3744,"text":3745},"03ed366848b00",[151],"Stage 2 - Generate ",[],{"_key":3748,"_type":23,"children":3749,"markDefs":3754,"style":31},"f9fb03faeab3",[3750],{"_key":3751,"_type":27,"marks":3752,"text":3753},"808e00fc3087",[],"The retrieved data is passed to a generative LLM, such as GPT-4, which compiles a detailed response using the provided context. ",[],{"_key":3756,"_type":23,"children":3757,"markDefs":3762,"style":212},"dffc0dbbfed5",[3758],{"_key":3759,"_type":27,"marks":3760,"text":3761},"60e12bb2f5fa0",[],"Stack Overflow's RAG method",[],{"_key":3764,"_type":23,"children":3765,"markDefs":3770,"style":31},"5cf20c475828",[3766],{"_key":3767,"_type":27,"marks":3768,"text":3769},"e7611a8387940",[],"At Stack Overflow, here's how we narrow the dataset.",[],{"_key":3772,"_type":23,"children":3773,"markDefs":3778,"style":31},"677b6a694141",[3774],{"_key":3775,"_type":27,"marks":3776,"text":3777},"c5ba9085cc730",[],"Step 1: A user asks a question.",[],{"_key":3780,"_type":16,"asset":3781},"2b6a9a4b0d40",{"_ref":3782,"_type":19},"image-c360eef8f311dedd47ac3c61c527a31e5678f2ec-1430x682-png",{"_key":3784,"_type":23,"children":3785,"markDefs":3790,"style":31},"017bf15a3fee",[3786],{"_key":3787,"_type":27,"marks":3788,"text":3789},"c6d3bd18f64c0",[],"Step 2: The LLM looks only at data from questions on Stack Overflow that have an accepted answer.",[],{"_key":3792,"_type":16,"asset":3793},"f29928967fe4",{"_ref":3794,"_type":19},"image-0c5a4235fd186846c6533a501a39ea503ee47bce-1430x682-png",{"_key":3796,"_type":23,"children":3797,"markDefs":3802,"style":31},"bdf2b4f6a1e8",[3798],{"_key":3799,"_type":27,"marks":3800,"text":3801},"4f2be12981750",[],"Step 3: The LLM generates a 
response based on that answer. This answer is a short synthesis of what it has just read and other texts reviewed.",[],{"_key":3804,"_type":16,"asset":3805},"340eb13db80c",{"_ref":3806,"_type":19},"image-52084fefba49b8ebe9e1313a37905ca9e0c5a775-1430x682-png",{"_key":3808,"_type":23,"children":3809,"markDefs":3814,"style":31},"bc08bcb0eaeb",[3810],{"_key":3811,"_type":27,"marks":3812,"text":3813},"20c27868592a0",[],"Because it looked at a comparatively small dataset, it provides annotations so users can verify the source material for accuracy and recency.",[],{"_key":3816,"_type":23,"children":3817,"markDefs":3822,"style":31},"45d88a3d5eca",[3818],{"_key":3819,"_type":27,"marks":3820,"text":3821},"e17e1b51b7400",[],"We use these hidden, system-level prompts to guide the process:",[],{"_key":3824,"_type":23,"children":3825,"markDefs":3830,"style":31},"9095bded6ad2",[3826],{"_key":3827,"_type":27,"marks":3828,"text":3829},"7ea0a0a641b60",[],"Prompt 1: Take the query and use your large foundation model to process it, tokenize it, and understand it.",[],{"_key":3832,"_type":16,"asset":3833},"aab40398b82a",{"_ref":3834,"_type":19},"image-fe7913b989705368313a2853a045b00d8286cd3b-1430x682-png",{"_key":3836,"_type":23,"children":3837,"markDefs":3842,"style":31},"19801e02b41e",[3838],{"_key":3839,"_type":27,"marks":3840,"text":3841},"61a3f193418c0",[],"Prompt 2: If the query is understood, consult our chosen dataset of Stack Overflow answers.",[],{"_key":3844,"_type":16,"asset":3845},"a427b42fa33a",{"_ref":3846,"_type":19},"image-a09ae83085fd5c8fec0d11ba448ca341f3ad847a-1430x682-png",{"_key":3848,"_type":23,"children":3849,"markDefs":3854,"style":31},"df1411bb729d",[3850],{"_key":3851,"_type":27,"marks":3852,"text":3853},"4883128c2d220",[],"Prompt 3: If you don’t find valid data, tell the user that you don’t have a viable 
response.",[],{"_key":3856,"_type":16,"asset":3857},"b2934fbb18c5",{"_ref":3858,"_type":19},"image-ea005cf88af06ccdbf72ef3712bd791060562bb1-1430x682-png",{"_key":3860,"_type":23,"children":3861,"markDefs":3866,"style":31},"87d5a2a8cf20",[3862],{"_key":3863,"_type":27,"marks":3864,"text":3865},"4217de640fd40",[],"Prompt 4: If you find valid data to produce an answer, create a short synthesis that provides users with a helpful reply in 200-300 words. Provide links to the data that supports your answer.",[],{"_key":3868,"_type":16,"asset":3869},"ed0722ed5bd2",{"_ref":3870,"_type":19},"image-f327667295c6373a737553752b1472949afb2120-1430x682-png",{"_key":3872,"_type":23,"children":3873,"markDefs":3878,"style":212},"510b6f356570",[3874],{"_key":3875,"_type":27,"marks":3876,"text":3877},"6fdbc2b1369d0",[],"Developer tools for RAG",[],{"_key":3880,"_type":23,"children":3881,"markDefs":3886,"style":31},"73d51f691b0b",[3882],{"_key":3883,"_type":27,"marks":3884,"text":3885},"99f8af1ad98c0",[],"Developers can use a mix of open-source and commercial tools for implementing RAG. Vector databases like Pinecone, FAISS, and Milvus allow fast and scalable retrieval of data points, represented as vector embeddings. These embeddings are numerical formats that capture the meaning of text or data. The foundation of RAG relies on knowledge sources like structured databases, document libraries, or APIs to get the right information to retrieve and process.",[],{"_key":3888,"_type":23,"children":3889,"markDefs":3894,"style":31},"0f63d43024b3",[3890],{"_key":3891,"_type":27,"marks":3892,"text":3893},"c8e476e94e0d0",[],"Structured data allows search engines like Elasticsearch and Weaviate to provide hybrid retrieval options using keywords and semantic search. 
Pre-trained LLMs like GPT-4 and Claude or open-source models like LLaMA integrate with retrieval systems to generate responses.",[],{"_key":3896,"_type":23,"children":3897,"markDefs":3902,"style":212},"9eba6ecec77c",[3898],{"_key":3899,"_type":27,"marks":3900,"text":3901},"928baa1bb3150",[],"Five avenues to improve RAG performance",[],{"_key":3904,"_type":23,"children":3905,"markDefs":3910,"style":31},"8e3c1c5282cf",[3906],{"_key":3907,"_type":27,"marks":3908,"text":3909},"b4f6354b5b4f0",[],"Building a high-performance RAG application needs more than just good data sources. Use these five methods to refine your pipeline.",[],{"_key":3912,"_type":23,"children":3913,"markDefs":3918,"style":344},"1d46d3490801",[3914],{"_key":3915,"_type":27,"marks":3916,"text":3917},"82b087349f2d0",[],"1. Hybrid search",[],{"_key":3920,"_type":23,"children":3921,"markDefs":3926,"style":31},"4aaff1ef2086",[3922],{"_key":3923,"_type":27,"marks":3924,"text":3925},"fe07854b28f10",[],"Developers can fine-tune retrieval quality to ensure that only the most relevant documents are passed to the model. This involves techniques like hybrid search, which blends semantic and keyword searches.",[],{"_key":3928,"_type":23,"children":3929,"markDefs":3934,"style":344},"d38ec7d701fc",[3930],{"_key":3931,"_type":27,"marks":3932,"text":3933},"82b0db2a14710",[],"2. Data cleansing",[],{"_key":3936,"_type":23,"children":3937,"markDefs":3942,"style":31},"bbd7e8790fcb",[3938],{"_key":3939,"_type":27,"marks":3940,"text":3941},"1193a82d7d900",[],"Filtering the retrieved data before generation helps remove irrelevant information, reducing noise in the final output.",[],{"_key":3944,"_type":23,"children":3945,"markDefs":3950,"style":344},"d4b9c116f280",[3946],{"_key":3947,"_type":27,"marks":3948,"text":3949},"d87b2a642a050",[],"3. 
Prompt engineering",[],{"_key":3952,"_type":23,"children":3953,"markDefs":3958,"style":31},"62ac2f578654",[3954],{"_key":3955,"_type":27,"marks":3956,"text":3957},"ef660934a15d0",[],"Adjusting the prompt and context length can enhance the model’s understanding and avoid overloading it with unnecessary details.",[],{"_key":3960,"_type":23,"children":3961,"markDefs":3966,"style":344},"0017d4c419ec",[3962],{"_key":3963,"_type":27,"marks":3964,"text":3965},"5ba2f20d5a6e0",[],"4. Evaluation",[],{"_key":3968,"_type":23,"children":3969,"markDefs":3974,"style":31},"dd2552319daa",[3970],{"_key":3971,"_type":27,"marks":3972,"text":3973},"942d6d72f86e0",[],"Set up repeatable evaluation processes that assess the RAG pipeline and its components. The retrieval stage can be evaluated using metrics like DCG and nDCG, while the generation stage can be assessed with an LLM-as-a-judge approach. Tools like RAGAS help measure the pipeline's performance for consistent results.",[],{"_key":3976,"_type":23,"children":3977,"markDefs":3982,"style":344},"7f18af997b3c",[3978],{"_key":3979,"_type":27,"marks":3980,"text":3981},"0ff6accf2bd00",[],"5. Data collection",[],{"_key":3984,"_type":23,"children":3985,"markDefs":3990,"style":31},"bc931309d9d9",[3986],{"_key":3987,"_type":27,"marks":3988,"text":3989},"544106c3ddf90",[],"After deploying a RAG application, collect data to improve its performance. This could involve fine-tuning retrieval models based on query-text chunk pairs or refining LLMs using high-quality outputs. 
Run A/B tests to measure whether pipeline changes improve performance over time.",[],{"_key":3992,"_type":23,"children":3993,"markDefs":3998,"style":212},"dfeae50f9891",[3994],{"_key":3995,"_type":27,"marks":3996,"text":3997},"55c43c57f3390",[],"Recent updates to RAG",[],{"_key":4000,"_type":23,"children":4001,"markDefs":4010,"style":31},"8c7e9698f989",[4002,4006],{"_key":4003,"_type":27,"marks":4004,"text":4005},"7b1076fc04f50",[],"Retrieval-augmented generation (RAG) is likely the most widely used GenAI technique. It’s been adopted by organizations of varying sizes across disparate industries because it solves a simple problem: ",{"_key":4007,"_type":27,"marks":4008,"text":4009},"7b1076fc04f51",[57],"How can you use a state-of-the-art model that has the information it needs to answer questions specific to your company or institution without the cost of building it yourself?",[],{"_key":4012,"_type":23,"children":4013,"markDefs":4018,"style":31},"9a7463c8f3dd",[4014],{"_key":4015,"_type":27,"marks":4016,"text":4017},"f023781851f00",[],"When we initially wrote about the process of implementing RAG back in January 2024, we were describing what is today known as naive RAG. This isn’t a slight we take personally; rather, it speaks to the rapid advance of this field and the wide array of techniques and tools that have been built upon the foundational approach we discussed a year ago.",[],{"_key":4020,"_type":23,"children":4021,"markDefs":4026,"style":344},"3ee4d779e066",[4022],{"_key":4023,"_type":27,"marks":4024,"text":4025},"61cd4f24611c0",[],"Naive RAG",[],{"_key":4028,"_type":23,"children":4029,"markDefs":4034,"style":31},"d00b63f80725",[4030],{"_key":4031,"_type":27,"marks":4032,"text":4033},"c3eb81837a7f0",[],"In a naive implementation of RAG, a user query is used to retrieve relevant documents, after which the query and the retrieved documents are fed as a prompt to a model that delivers an answer. 
In more advanced versions, the system might take the user’s original prompt and enhance it—rewriting it or expanding it—before matching it with relevant documents. This takes a lot of the burden off the end user, who may not be familiar with GenAI or prompt engineering.",[],{"_key":4036,"_type":23,"children":4037,"markDefs":4042,"style":31},"6255c85c3f42",[4038],{"_key":4039,"_type":27,"marks":4040,"text":4041},"658834eae5270",[],"A key benefit, even with naive RAG, is that the dataset the AI model uses at inference time can be regularly updated. It is quite expensive and time-consuming to retrain or even fine-tune large AI models with fresh data. RAG solves for this, allowing models like ChatGPT to search for news stories or stock market data needed to answer questions about current events. Inside an organization, this same principle can be applied to ensure your GenAI agent is up-to-date with any changes to your codebase or documentation. For a customer service agent or medical providers, this could be used to ensure communications from previous support chats or consultations are included as context in a following session.",[],{"_key":4044,"_type":23,"children":4045,"markDefs":4050,"style":31},"579d1cd2bab1",[4046],{"_key":4047,"_type":27,"marks":4048,"text":4049},"53fcfc2d9d6f0",[],"A second stage employed in advanced RAG takes place after the information is retrieved. When an LLM system has to produce an answer based on a large set of documents, it can suffer from a form of information overload, causing it to leave key context or insight out of its response to the user. More advanced systems use AI agents to rank the material in terms of the best match, summarize long documents into shorter, more digestible chunks, and fuse various source materials together to provide the richest context. 
The information is then fed as a prompt to the model and an output is provided that delivers more value to the end user.",[],{"_key":4052,"_type":23,"children":4053,"markDefs":4058,"style":344},"10e80c5f379b",[4054],{"_key":4055,"_type":27,"marks":4056,"text":4057},"738fbd9e23f50",[],"Modular RAG",[],{"_key":4060,"_type":23,"children":4061,"markDefs":4066,"style":31},"2fabd4691267",[4062],{"_key":4063,"_type":27,"marks":4064,"text":4065},"7f5546ce72c30",[],"In modular RAG, the techniques of advanced RAG are taken a step further. The system might have a step that first looks at the relevant documents, reasons over them, and distills the key high-level concepts and abstractions. It can then use these to guide its evaluation of the source material, improving the chances that the final answer won’t be constrained by a small subset of specific documents. Other techniques break the user’s initial question into a series of smaller questions or produce a hypothetical answer that is used to help find the best source material.",[],{"_key":4068,"_type":23,"children":4069,"markDefs":4074,"style":344},"ed90430310c2",[4070],{"_key":4071,"_type":27,"marks":4072,"text":4073},"09c2c89d76d30",[],"Security vulnerabilities",[],{"_key":4076,"_type":23,"children":4077,"markDefs":4091,"style":31},"bccec848e222",[4078,4082,4087],{"_key":4079,"_type":27,"marks":4080,"text":4081},"7f00546362260",[],"As RAG becomes a more commonplace technique, it provides a large and novel attack surface for bad actors. ",{"_key":4083,"_type":27,"marks":4084,"text":4086},"7f00546362261",[4085],"b49e4266dac7","Studies",{"_key":4088,"_type":27,"marks":4089,"text":4090},"7f00546362262",[]," have shown that it’s possible to inject malicious text or code into the source material a RAG system might draw on. 
For example, if an open-source code library is commonly referenced by a RAG system when generating answers to questions about software development, attackers might add a backdoor that allows them access to systems that adopt code from this library without running it through the proper security checks.",[4092],{"_key":4085,"_type":122,"href":4093},"https://arxiv.org/pdf/2402.07867",{"_key":4095,"_type":23,"children":4096,"markDefs":4101,"style":344},"66c9052944be",[4097],{"_key":4098,"_type":27,"marks":4099,"text":4100},"404b759717400",[],"Data quality and RAG",[],{"_key":4103,"_type":23,"children":4104,"markDefs":4120,"style":31},"5e2b518e635e",[4105,4109,4112,4116],{"_key":4106,"_type":27,"marks":4107,"text":4108},"6ff65718b5860",[],"In this edition of our ",{"_key":4110,"_type":27,"marks":4111,"text":58},"6ff65718b5861",[57],{"_key":4113,"_type":27,"marks":4114,"text":4115},"6ff65718b5862",[],", we’re examining the state of GenAI through a lens of data quality. And the most important ingredient in a great RAG system, regardless of how simple or complex, is the quality of the data it searches when trying to provide an answer to a user’s question. 
This means",{"_key":4117,"_type":27,"marks":4118,"text":4119},"6ff65718b5863",[151]," organizations with well-organized knowledge and codebases have an advantage in the GenAI era.",[],{"_type":162,"seoImage":4122},{"_type":16,"asset":4123},{"_ref":4124,"_type":19},"image-67a38c68f24729f46b72ef8407c297948cbecde9-2400x1260-png",{"_type":169,"current":4126},"rag",{"_key":4128,"_type":262,"body":4129,"seo":4389,"slug":4393,"title":4395},"6197d61b8267",[4130,4149,4157,4165,4173,4192,4196,4204,4212,4220,4238,4242,4250,4279,4287,4295,4303,4321,4325,4333,4341,4353,4365,4373,4381],{"_key":4131,"_type":23,"children":4132,"markDefs":4146,"style":31},"70c5df39547f",[4133,4137,4142],{"_key":4134,"_type":27,"marks":4135,"text":4136},"ede154a509f00",[],"Large language models (LLMs) are now widely accepted; uptake has increased among business and consumer users. According to McKinsey's ",{"_key":4138,"_type":27,"marks":4139,"text":4141},"ede154a509f01",[4140],"ca1b3cab5c4c","2024 report",{"_key":4143,"_type":27,"marks":4144,"text":4145},"ede154a509f02",[],", 65% of global organizations are actively using GenAI tools, double the uptake from a year prior. To most users, they're more commonly known by LLM brand names like OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude. LLMs are trained on large datasets and fine-tuned to generate, summarize, and translate text, and can now do much more than text-based tasks in a chat window. They're becoming multimodal, meaning they can process and generate multiple data types, including text, images, and video, from a text, visual, or audio prompt. 
Google’s Gemini and other models can interpret and combine diverse inputs to create visuals and animations in addition to text.",[4147],{"_key":4140,"_type":122,"href":4148},"https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai",{"_key":4150,"_type":23,"children":4151,"markDefs":4156,"style":31},"163a902f1dba",[4152],{"_key":4153,"_type":27,"marks":4154,"text":4155},"fd0f2af08d8d0",[],"New interfaces like ChatGPT's Canvas feature let end users fine-tune and revise initial outputs, allowing users to leverage the tool for end-to-end content creation.",[],{"_key":4158,"_type":23,"children":4159,"markDefs":4164,"style":212},"817ca08d97e8",[4160],{"_key":4161,"_type":27,"marks":4162,"text":4163},"75a18d8a62e30",[],"Three in four developers are now using LLMs",[],{"_key":4166,"_type":23,"children":4167,"markDefs":4172,"style":31},"bb52be5e777c",[4168],{"_key":4169,"_type":27,"marks":4170,"text":4171},"3c44ff3c0a590",[],"LLMs are reshaping developers' workflows by automating code generation and debugging. Tools like GitHub Copilot offer real-time coding suggestions, making development faster and more efficient.",[],{"_key":4174,"_type":23,"children":4175,"markDefs":4189,"style":31},"3df84a27c40a",[4176,4180,4185],{"_key":4177,"_type":27,"marks":4178,"text":4179},"34212b4de59e0",[],"They're now widely adopted in software development. 
According to our ",{"_key":4181,"_type":27,"marks":4182,"text":4184},"34212b4de59e1",[4183],"0148f1e5344d","2024 Stack Overflow survey",{"_key":4186,"_type":27,"marks":4187,"text":4188},"34212b4de59e2",[],", more than three in four respondents (76%) use or are planning to use AI to assist with coding, up from seven in ten (70%) in 2023.",[4190],{"_key":4183,"_type":122,"href":4191},"https://survey.stackoverflow.co/2024/ai",{"_key":4193,"_type":16,"asset":4194},"9092e1a76238",{"_ref":4195,"_type":19},"image-cbc4722d06782dc7de2dbb3d63fbdcdc7bc2e1b0-2400x930-png",{"_key":4197,"_type":23,"children":4198,"markDefs":4203,"style":212},"c36ed4ee9704",[4199],{"_key":4200,"_type":27,"marks":4201,"text":4202},"920916ca61140",[],"Understanding LLM explainability",[],{"_key":4205,"_type":23,"children":4206,"markDefs":4211,"style":31},"89cf955b8304",[4207],{"_key":4208,"_type":27,"marks":4209,"text":4210},"2428a5f921450",[],"A core challenge with LLMs is understanding how they produce specific outputs. This concept, explainability, identifies the reasoning behind a model’s predictions. It helps determine why an LLM makes certain suggestions and ensures outputs meet expectations, reducing unexpected results and bias. Explainability is an evolving and critical aspect of responsible AI practices. Efforts are growing to improve the trustworthiness and usability of AI systems, keeping them from becoming black boxes whose internal workings are not visible or easily understood. If you’re explaining how LLMs work to a non-technical audience, try this analogy:",[],{"_key":4213,"_type":23,"children":4214,"markDefs":4219,"style":31},"7ce30243e5df",[4215],{"_key":4216,"_type":27,"marks":4217,"text":4218},"01cca333dd690",[57],"LLMs are like advanced Google search autocomplete systems that guess what word you may want to see next. They learn this by training on patterns from huge datasets. 
They generate outputs as predictions based on prior data, not explicit understanding of the task, and they don't understand the meaning behind the words they produce. Mistakes, or hallucinations, occur when the model produces plausible but incorrect information.",[],{"_key":4221,"_type":23,"children":4222,"markDefs":4235,"style":31},"5a5e8b95264b",[4223,4227,4232],{"_key":4224,"_type":27,"marks":4225,"text":4226},"223fba95ce7c0",[],"For more ways to talk about LLMs, see our guide ",{"_key":4228,"_type":27,"marks":4229,"text":4231},"223fba95ce7c1",[4230],"2b88843bc9ab","explaining generative language models",{"_key":4233,"_type":27,"marks":4234,"text":635},"223fba95ce7c2",[],[4236],{"_key":4230,"_type":122,"href":4237},"https://stackoverflow.blog/2024/06/27/explaining-generative-language-models-to-almost-anyone/",{"_key":4239,"_type":16,"asset":4240},"8d0203d6e18c",{"_ref":4241,"_type":19},"image-10ab4ad2313d74f8dd080e31b53d133122913960-8192x4301-jpg",{"_key":4243,"_type":23,"children":4244,"markDefs":4249,"style":212},"5340a733f9f3",[4245],{"_key":4246,"_type":27,"marks":4247,"text":4248},"cd97312da3360",[],"The evolving architecture of LLMs",[],{"_key":4251,"_type":23,"children":4252,"markDefs":4275,"style":31},"1b9759edaa84",[4253,4257,4262,4266,4271],{"_key":4254,"_type":27,"marks":4255,"text":4256},"5ca9f643b88a0",[],"Despite advances in capabilities, the underlying architecture of LLMs has remained relatively stable. Models still largely rely on the ",{"_key":4258,"_type":27,"marks":4259,"text":4261},"5ca9f643b88a1",[4260],"1bc5082f65a0","transformer architecture introduced by Google",{"_key":4263,"_type":27,"marks":4264,"text":4265},"5ca9f643b88a2",[]," in 2017. Newer techniques enhance efficiency and scaling. 
Our ",{"_key":4267,"_type":27,"marks":4268,"text":4270},"5ca9f643b88a3",[4269],"614500df0a88","LLM analysis",{"_key":4272,"_type":27,"marks":4273,"text":4274},"5ca9f643b88a4",[]," highlights the reality that improvements typically come from refining data and training methods. Deployment optimization also plays an outsized role in enhancing LLMs’ performance.",[4276,4278],{"_key":4260,"_type":122,"href":4277},"https://research.google/blog/transformer-a-novel-neural-network-architecture-for-language-understanding/",{"_key":4269,"_type":122,"href":298},{"_key":4280,"_type":23,"children":4281,"markDefs":4286,"style":212},"281a803829bf",[4282],{"_key":4283,"_type":27,"marks":4284,"text":4285},"de2e4067b17d0",[],"How LLMs learn relationships with masked self-attention",[],{"_key":4288,"_type":23,"children":4289,"markDefs":4294,"style":31},"e5719bdf2532",[4290],{"_key":4291,"_type":27,"marks":4292,"text":4293},"b9b66f9b34ee0",[],"A core principle of LLMs is a mechanism called masked self-attention, which allows models to understand relationships in a sentence between tokens (words or \"subword\" fragments). Instead of processing text sequentially, the transformer architecture allows LLMs to consider multiple tokens simultaneously, assigning attention weights to focus on the most relevant parts of the input.",[],{"_key":4296,"_type":23,"children":4297,"markDefs":4302,"style":31},"a31421863017",[4298],{"_key":4299,"_type":27,"marks":4300,"text":4301},"8c6ea64f94e50",[],"In the sentence “The developer fixed the bug,” for example, the model identifies that “developer” and “fixed” are closely related. By masking parts of the data during training, the LLM learns to predict missing tokens (in this example, words) and better understand context. 
This process is core to the model’s ability to generate coherent and relevant outputs.",[],{"_key":4304,"_type":23,"children":4305,"markDefs":4318,"style":31},"0fb469b2dfd6",[4306,4310,4315],{"_key":4307,"_type":27,"marks":4308,"text":4309},"2a488b5360cb0",[],"For a detailed breakdown, explore our article on ",{"_key":4311,"_type":27,"marks":4312,"text":4314},"2a488b5360cb1",[4313],"3fe376086c88","masked self-attention",{"_key":4316,"_type":27,"marks":4317,"text":635},"2a488b5360cb2",[],[4319],{"_key":4313,"_type":122,"href":4320},"https://stackoverflow.blog/2024/09/26/masked-self-attention-how-llms-learn-relationships-between-tokens/",{"_key":4322,"_type":16,"asset":4323},"df15d7977be4",{"_ref":4324,"_type":19},"image-2baeb81a19aabf313b55391d775111bcff4b3a8f-2386x1338-jpg",{"_key":4326,"_type":23,"children":4327,"markDefs":4332,"style":212},"59e2a6bf7918",[4328],{"_key":4329,"_type":27,"marks":4330,"text":4331},"f14354e5b5fa0",[],"Parameters and precision",[],{"_key":4334,"_type":23,"children":4335,"markDefs":4340,"style":31},"d4103ea20864",[4336],{"_key":4337,"_type":27,"marks":4338,"text":4339},"30ced7ca81eb0",[],"You may have seen a couple of numbers thrown around in regard to LLM size and power: numbers of parameters and precision. Together, these correspond to the accuracy and capabilities of a model, as well as its storage size, resource requirements, and cost to run.",[],{"_key":4342,"_type":23,"children":4343,"markDefs":4352,"style":31},"07be8f41baa6",[4344,4348],{"_key":4345,"_type":27,"marks":4346,"text":4347},"b41d774e94810",[151],"Parameters",{"_key":4349,"_type":27,"marks":4350,"text":4351},"b41d774e94811",[]," are the various biases and weights that are adjusted during training and fine-tuning. Each parameter is a vector—an array of hundreds of numbers. More parameters let the model make deeper connections and can lead to emergent abilities. 
Cutting-edge models have hundreds of billions or even trillions of parameters, though not all parameters will be used for every request.",[],{"_key":4354,"_type":23,"children":4355,"markDefs":4364,"style":31},"7e587f015b05",[4356,4360],{"_key":4357,"_type":27,"marks":4358,"text":4359},"1ba2df7ebecf0",[151],"Precision",{"_key":4361,"_type":27,"marks":4362,"text":4363},"1ba2df7ebecf1",[]," refers to the size and accuracy of each number within a parameter’s vector. These numbers are described in terms of the amount of memory they take up (for example, 32-bit or 8-bit) and the form of the number (for example, floating point or integer). A high-precision model using 32-bit floating point values will be more accurate but require more resources than one using 8-bit integers. High-precision models can be quantized down to lower precision levels by reducing the amount of information stored in each number (say, by rounding to fewer decimal places).",[],{"_key":4366,"_type":23,"children":4367,"markDefs":4372,"style":31},"a2a31587b5d6",[4368],{"_key":4369,"_type":27,"marks":4370,"text":4371},"d499db4c1e490",[],"The assumption that more parameters and higher precision always result in better, more accurate responses is being challenged. Smaller models have shown comparable results by training on targeted data or limiting responses to some knowledge domains. Lower-precision models have shown themselves competent in answering many common questions. 
Recently, DeepSeek released a reasoning model that disrupted the LLM market by doing both.",[],{"_key":4374,"_type":23,"children":4375,"markDefs":4380,"style":212},"e99bbb25b922",[4376],{"_key":4377,"_type":27,"marks":4378,"text":4379},"4ea2cda88c080",[],"Key considerations for developers",[],{"_key":4382,"_type":23,"children":4383,"markDefs":4388,"style":31},"cc1fa579cd04",[4384],{"_key":4385,"_type":27,"marks":4386,"text":4387},"69111324dea40",[],"LLMs can generate incorrect outputs, known as hallucinations, so verifying results is critical in applications where accuracy matters (and where doesn’t it?). Fine-tuning and prompt engineering are effective ways to optimize performance and tailor outputs for specific tasks. Understanding explainability, or the degree to which an AI system’s internal workings can be explained in human terms, is essential for building trust and encouraging broader adoption. As LLMs continue to advance, developers will play a vital role in refining and responsibly integrating them into development workflows.",[],{"_type":162,"seoImage":4390},{"_type":16,"asset":4391},{"_ref":4392,"_type":19},"image-e0b209fdef3893b3bc811453a00359077170fa32-2400x1260-png",{"_type":169,"current":4394},"llm","Large language models",{"_key":4397,"_type":262,"body":4398,"seo":4700,"slug":4704,"title":4706},"661336addb4f",[4399,4407,4411,4419,4435,4443,4451,4470,4478,4486,4494,4502,4506,4514,4522,4563,4571,4579,4587,4606,4614,4622,4630,4657,4665,4684,4692],{"_key":4400,"_type":23,"children":4401,"markDefs":4406,"style":31},"9b370578b294",[4402],{"_key":4403,"_type":27,"marks":4404,"text":4405},"7c0e818c144e0",[],"Reasoning and context windows have become a critical focus in GenAI progress as developers test models to their limits. 
Advancements in this space are changing how we design AI systems to handle increasingly complex reasoning tasks.",[],{"_key":4408,"_type":16,"asset":4409},"e24089b8c7a8",{"_ref":4410,"_type":19},"image-94c41871faa876551e2667f86968dfe96b002290-1430x682-png",{"_key":4412,"_type":23,"children":4413,"markDefs":4418,"style":212},"cad3cb6d77d8",[4414],{"_key":4415,"_type":27,"marks":4416,"text":4417},"a4b81000a6ac0",[],"What are reasoning and context windows?",[],{"_key":4420,"_type":23,"children":4421,"markDefs":4434,"style":31},"14164a59ae98",[4422,4426,4430],{"_key":4423,"_type":27,"marks":4424,"text":4425},"bdee15bdeacd0",[],"Reasoning refers to an AI model's ability to process information to generate accurate responses. In human terms: Your weather app reports rain, and seeing water splashing on your window, you ",{"_key":4427,"_type":27,"marks":4428,"text":4429},"bdee15bdeacd1",[57],"reason",{"_key":4431,"_type":27,"marks":4432,"text":4433},"bdee15bdeacd2",[]," that it's prudent to pack an umbrella.",[],{"_key":4436,"_type":23,"children":4437,"markDefs":4442,"style":31},"e00c6a63870a",[4438],{"_key":4439,"_type":27,"marks":4440,"text":4441},"4ccc40cceb350",[],"Context windows are the limits on how much input data (tokens) a model can remember during a single query. Similar to limited working memory, models eventually forget context and prompts after processing extensive activity, like a goldfish forgetting its last swim around its bowl. 
This makes complex tasks like database coding or a multi-chapter report difficult to accomplish without re-prompting, which can lead to a higher error rate.",[],{"_key":4444,"_type":23,"children":4445,"markDefs":4450,"style":212},"172513bc67fd",[4446],{"_key":4447,"_type":27,"marks":4448,"text":4449},"f382c9788aa80",[],"Context windows are opening up",[],{"_key":4452,"_type":23,"children":4453,"markDefs":4467,"style":31},"c79488bd494b",[4454,4458,4463],{"_key":4455,"_type":27,"marks":4456,"text":4457},"7e00662cccd10",[],"Context windows widened significantly in 2024. OpenAI's GPT-4 can process context windows from ",{"_key":4459,"_type":27,"marks":4460,"text":4462},"7e00662cccd11",[4461],"05b2a6f4bbea","8,000 to 128,000 tokens",{"_key":4464,"_type":27,"marks":4465,"text":4466},"7e00662cccd12",[],", depending on the model. 128,000 tokens is equivalent to processing roughly 96,000 words or a full-length novel. Llama 3.1 matches OpenAI's upper limit, and Claude 2 by Anthropic now offers up to 100,000 tokens, allowing developers to process entire datasets in a single query.",[4468],{"_key":4461,"_type":122,"href":4469},"https://help.openai.com/en/articles/7127966-what-is-the-difference-between-the-gpt-4-model-versions",{"_key":4471,"_type":23,"children":4472,"markDefs":4477,"style":31},"573f7adc825e",[4473],{"_key":4474,"_type":27,"marks":4475,"text":4476},"2534524b1dc80",[],"These expanding windows allow developers to build applications that solve complex problems with extensive inputs. 
These systems can condense extensive documentation into actionable insights and process information from multiple sources.",[],{"_key":4479,"_type":23,"children":4480,"markDefs":4485,"style":212},"3f2d9fc782e5",[4481],{"_key":4482,"_type":27,"marks":4483,"text":4484},"7aeb19d1601b0",[],"Bigger context windows are not always better",[],{"_key":4487,"_type":23,"children":4488,"markDefs":4493,"style":31},"90ff21cc1082",[4489],{"_key":4490,"_type":27,"marks":4491,"text":4492},"5bfe113870770",[],"While context windows are growing, developers still face challenges balancing reasoning capabilities and model performance. There are trade-offs: longer context windows need more computing power, memory, and storage, which increases operational cost.",[],{"_key":4495,"_type":23,"children":4496,"markDefs":4501,"style":31},"5b9cc07dbca2",[4497],{"_key":4498,"_type":27,"marks":4499,"text":4500},"9d9c3ea471870",[],"Longer context windows also don’t necessarily translate to a better-performing model or more accurate answers. In fact, longer context windows create more opportunities for the model to hallucinate. Models processing large context windows often take longer to respond, which adds latency. Extended reasoning can lead to inaccuracies or irrelevant conclusions, a phenomenon sometimes described as \"model drift.\"",[],{"_key":4503,"_type":90,"citation":4504,"copy":4505},"f5da28d5446c","Matt White, AI researcher","“Larger context windows can affect the data processing pipeline, model fine-tuning, and even the design of applications that utilize these AI models.”",{"_key":4507,"_type":23,"children":4508,"markDefs":4513,"style":31},"8bc174856999",[4509],{"_key":4510,"_type":27,"marks":4511,"text":4512},"440db0a1bf430",[],"To prevent bloating from irrelevant data, larger inputs need effective pre-processing and careful token management. 
Modular pipelines allow models to reason iteratively over subsets of data, improving efficiency without overwhelming the context window.",[],{"_key":4515,"_type":23,"children":4516,"markDefs":4521,"style":212},"fe22cba7bbd1",[4517],{"_key":4518,"_type":27,"marks":4519,"text":4520},"683ab3cf9e880",[],"Reasoning model and framework trends",[],{"_key":4523,"_type":23,"children":4524,"markDefs":4556,"style":31},"e5ec942f0f26",[4525,4529,4534,4538,4543,4547,4552],{"_key":4526,"_type":27,"marks":4527,"text":4528},"476cdded11cf0",[],"Reasoning frameworks have taken a leap forward in recent years. Developers are now integrating multi-modal reasoning systems that process text, images, and code in unified workflows. The leading AI firms released a wave of updates during 2024. OpenAI's updates to ",{"_key":4530,"_type":27,"marks":4531,"text":4533},"476cdded11cf1",[4532],"d0e405781831","GPT-4 Turbo",{"_key":4535,"_type":27,"marks":4536,"text":4537},"476cdded11cf2",[]," optimize reasoning accuracy in extended contexts while improving latency for long prompts. Anthropic's Claude 3 ",{"_key":4539,"_type":27,"marks":4540,"text":4542},"476cdded11cf3",[4541],"b22338bfd35a","has pushed reasoning benchmarks",{"_key":4544,"_type":27,"marks":4545,"text":4546},"476cdded11cf4",[]," by prioritizing retrieval-augmented generation (RAG) for faster, context-aware outputs. ",{"_key":4548,"_type":27,"marks":4549,"text":4551},"476cdded11cf5",[4550],"235440f27aff","DeepMind's Gemini",{"_key":4553,"_type":27,"marks":4554,"text":4555},"476cdded11cf6",[]," integrates multi-modal capabilities, making significant progress in reasoning across audio, video, and documents. DeepSeek has shown that powerful reasoning models do not need expensive training runs by using a combination of targeted and synthetic data and reducing training precision from 32-bit to 8-bit. 
This approach is paving the way for the next wave of agentic AI assistants that can automate complex end-to-end tasks.",[4557,4559,4561],{"_key":4532,"_type":122,"href":4558},"https://openai.com/index/new-models-and-developer-products-announced-at-devday/",{"_key":4541,"_type":122,"href":4560},"https://ragaboutit.com/claude-3-5-sonnet-the-new-benchmark-for-rag-models/",{"_key":4550,"_type":122,"href":4562},"https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/",{"_key":4564,"_type":23,"children":4565,"markDefs":4570,"style":212},"65ffabc05936",[4566],{"_key":4567,"_type":27,"marks":4568,"text":4569},"84cf3ab9a8610",[],"Developing with reasoning and context windows",[],{"_key":4572,"_type":23,"children":4573,"markDefs":4578,"style":31},"b08a50a3305d",[4574],{"_key":4575,"_type":27,"marks":4576,"text":4577},"5b922ddc73660",[],"To make the most of reasoning and context windows, consider these tips:",[],{"_key":4580,"_type":23,"children":4581,"markDefs":4586,"style":344},"62ac433b7565",[4582],{"_key":4583,"_type":27,"marks":4584,"text":4585},"e39754c63a270",[],"Optimize input size",[],{"_key":4588,"_type":23,"children":4589,"markDefs":4603,"style":31},"a46d15b76dbf",[4590,4594,4599],{"_key":4591,"_type":27,"marks":4592,"text":4593},"b0b489bdec3a0",[],"Developers can optimize input size using pre-processing tools like ",{"_key":4595,"_type":27,"marks":4596,"text":4598},"b0b489bdec3a1",[4597],"44b14846eb45","LangChain",{"_key":4600,"_type":27,"marks":4601,"text":4602},"b0b489bdec3a2",[]," to prioritize relevant tokens.",[4604],{"_key":4597,"_type":122,"href":4605},"https://www.langchain.com/",{"_key":4607,"_type":23,"children":4608,"markDefs":4613,"style":344},"6fd0ff329558",[4609],{"_key":4610,"_type":27,"marks":4611,"text":4612},"3a39648f6b490",[],"Use retrieval-based methods",[],{"_key":4615,"_type":23,"children":4616,"markDefs":4621,"style":31},"e4a94f045510",[4617],{"_key":4618,"_type":27,"marks":4619,"text":4620},"3ebb977b60cf0",[],"Combine 
models with external knowledge sources to extend reasoning without overloading inputs.",[],{"_key":4623,"_type":23,"children":4624,"markDefs":4629,"style":344},"e87c12d2b3c5",[4625],{"_key":4626,"_type":27,"marks":4627,"text":4628},"fbd30c88dfde0",[],"Test iterative reasoning",[],{"_key":4631,"_type":23,"children":4632,"markDefs":4653,"style":31},"21288d359340",[4633,4637,4641,4644,4649],{"_key":4634,"_type":27,"marks":4635,"text":4636},"5309ca4a83100",[],"Break tasks into smaller steps rather than relying on a single long-form query. Tools like ",{"_key":4638,"_type":27,"marks":4639,"text":4598},"5309ca4a83101",[4640],"f45b32402ae4",{"_key":4642,"_type":27,"marks":4643,"text":2473},"5309ca4a83102",[],{"_key":4645,"_type":27,"marks":4646,"text":4648},"5309ca4a83103",[4647],"4d0cf97b6f80","LlamaIndex",{"_key":4650,"_type":27,"marks":4651,"text":4652},"5309ca4a83104",[]," (formerly GPT Index) break large tasks into modular steps.",[4654,4655],{"_key":4640,"_type":122,"href":4605},{"_key":4647,"_type":122,"href":4656},"https://www.llamaindex.ai/",{"_key":4658,"_type":23,"children":4659,"markDefs":4664,"style":344},"843ed1ebe25b",[4660],{"_key":4661,"_type":27,"marks":4662,"text":4663},"bbddf7b434b90",[],"Monitor accuracy",[],{"_key":4666,"_type":23,"children":4667,"markDefs":4681,"style":31},"0e6dad52eb6d",[4668,4672,4677],{"_key":4669,"_type":27,"marks":4670,"text":4671},"9873ab3c58490",[],"Use benchmarking tools like ",{"_key":4673,"_type":27,"marks":4674,"text":4676},"9873ab3c58491",[4675],"4f217ec3ca38","EleutherAI",{"_key":4678,"_type":27,"marks":4679,"text":4680},"9873ab3c58492",[]," to test performance at varying window sizes.",[4682],{"_key":4675,"_type":122,"href":4683},"https://www.eleuther.ai/",{"_key":4685,"_type":23,"children":4686,"markDefs":4691,"style":212},"ecc9d88850f1",[4687],{"_key":4688,"_type":27,"marks":4689,"text":4690},"c1e2ad168f920",[],"Looking 
ahead",[],{"_key":4693,"_type":23,"children":4694,"markDefs":4699,"style":31},"c71e0482e1fd",[4695],{"_key":4696,"_type":27,"marks":4697,"text":4698},"9c34a4bd070b0",[],"Reasoning and context windows are core to GenAI's progress. As models grow smarter and context handling improves, developers will be able to build more scalable and accurate multi-modal applications. Keep an eye on announcements from Anthropic, OpenAI, and DeepMind as they push the limits of reasoning capabilities.",[],{"_type":162,"seoImage":4701},{"_type":16,"asset":4702},{"_ref":4703,"_type":19},"image-a45fd6b7406682f4b142f56e8448630fce9450ab-2400x1260-png",{"_type":169,"current":4705},"reasoning-and-context-windows","Reasoning and context windows ",{"_type":162,"seoDescription":4708,"seoImage":4709},"Explore the key technologies powering GenAI, from Python and hardware accelerators to neural networks and machine learning frameworks.",{"_type":16,"asset":4710},{"_ref":4711,"_type":19},"image-4ff75aa47ca4df4dc65a79d4faa51ff6ed3d3dd2-2400x1260-png",[],{"_type":169,"current":4714},"key-tools-technologies-terms","Key tools, technologies, and terms",{"_key":4717,"_type":45,"body":4718,"sections":4754,"seo":5775,"sidebarCta":5780,"slug":5781,"title":5783},"a386dcac1077",[4719,4722,4730,4738,4746],{"_key":4720,"_type":185,"url":4721},"fc75d7b63f37","https://fast.wistia.net/embed/iframe/1enza4luor?seo=false&videoFoam=true&doNotTrack=true&seo=false&videoFoam=false&fitStrategy=cover&controlsVisibleOnLoad=false&playbar=true&settingsControl=false&smallPlayButton=true&playerColor=F48024&muted=false",{"_key":4723,"_type":23,"children":4724,"markDefs":4729,"style":31},"d7ad6744e241",[4725],{"_key":4726,"_type":27,"marks":4727,"text":4728},"4537827576b70",[],"At Stack Overflow, our journey to incorporating AI into our internal tech stack and product offerings began with a desire to make learning easier and more efficient for our global user base. 
Every day, Stack Overflow helps engineers, developers, and technologists of every kind learn by answering their questions and guiding them to discover new solutions and fresh approaches.",[],{"_key":4731,"_type":23,"children":4732,"markDefs":4737,"style":31},"86e47d1263c7",[4733],{"_key":4734,"_type":27,"marks":4735,"text":4736},"57bb5e273f920",[],"We know that learning is integral to this work—and that it’s hard. Especially for beginners, it’s hard to know where to start, which questions to ask first. After all, you don’t know what you don’t know, especially when you’re new to a topic. We also know that our Stack Overflow for Teams customers are looking for innovative ways for their teams to find information faster and collaborate more seamlessly.",[],{"_key":4739,"_type":23,"children":4740,"markDefs":4745,"style":31},"1cea20415d50",[4741],{"_key":4742,"_type":27,"marks":4743,"text":4744},"004e327a54c60",[],"That’s why we built OverflowAI, a GenAI-powered add-on for Stack Overflow for Teams Enterprise. 
We wanted to streamline and improve your workflows with AI-powered features including Enhanced Search; Stack Overflow for Visual Studio Code, an IDE extension; and Auto-Answer App for Slack and Microsoft Teams.",[],{"_key":4747,"_type":23,"children":4748,"markDefs":4753,"style":31},"33ca51157b03",[4749],{"_key":4750,"_type":27,"marks":4751,"text":4752},"4e386a03835b0",[],"But before we lay out how OverflowAI can help your organization meet its goals, let’s back up and explore how GenAI is continuing to transform search, chat, and developer environments.",[],[4755,5027,5472],{"_key":4756,"_type":262,"body":4757,"slug":5024,"title":5026},"fb6fe8dd3274",[4758,4767,4775,4783,4791,4799,4807,4811,4819,4827,4835,4850,4858,4866,4869,4877,4896,4904,4912,4924,4936,4948,4960,4968,4984,4992,5000,5008,5016],{"_key":4759,"_type":23,"children":4760,"markDefs":4765,"style":4766},"8167a9863de7",[4761],{"_key":4762,"_type":27,"marks":4763,"text":4764},"a7363dd9839a0",[],"How GenAI can help you ask the right questions and find the right answers",[],"h1",{"_key":4768,"_type":23,"children":4769,"markDefs":4774,"style":31},"6c18c8456b18",[4770],{"_key":4771,"_type":27,"marks":4772,"text":4773},"5dabe7d0f00f0",[],"The next generation of AI-powered search and Q&A doesn’t just help your employees find answers to their questions; it helps them ask better questions.",[],{"_key":4776,"_type":23,"children":4777,"markDefs":4782,"style":31},"c4d8a24aa056",[4778],{"_key":4779,"_type":27,"marks":4780,"text":4781},"8c61842910d60",[],"You can’t talk about finding knowledge without talking about search, and search has been core to the Stack Overflow experience from the beginning. It was among the first user features we built. From the start, most of our visitors arrived via search engines. 
Users of Stack Overflow’s public site encounter a massive store of questions and answers, including plenty of duplicates; they have to navigate to the answer they need, using community-contributed comments and votes to find the best solution.",[],{"_key":4784,"_type":23,"children":4785,"markDefs":4790,"style":31},"90abed5e91d2",[4786],{"_key":4787,"_type":27,"marks":4788,"text":4789},"fd84d197764d0",[],"Now, we’re seeing AI models steeped in specialized knowledge that can quickly determine what answers users are looking for, use natural language to help them refine and improve their questions, and continually improve themselves.",[],{"_key":4792,"_type":23,"children":4793,"markDefs":4798,"style":31},"b663721093b8",[4794],{"_key":4795,"_type":27,"marks":4796,"text":4797},"25ab7fac3a300",[],"Keep reading to understand how AI-powered search works, how it can help your business, and best practices for implementation.",[],{"_key":4800,"_type":23,"children":4801,"markDefs":4806,"style":212},"eca506dee172",[4802],{"_key":4803,"_type":27,"marks":4804,"text":4805},"d8e59c3177ea0",[],"Ask the right questions",[],{"_key":4808,"_type":16,"asset":4809},"a50b9b5aae25",{"_ref":4810,"_type":19},"image-1809653bbf125f33107ecc5065146840bfb50c00-1431x682-png",{"_key":4812,"_type":23,"children":4813,"markDefs":4818,"style":31},"365ddb3fa9f9",[4814],{"_key":4815,"_type":27,"marks":4816,"text":4817},"8c84535040140",[],"Often, the hardest part of solving a problem is knowing which questions to ask. This is particularly true when you’re brand new to a topic or technology. Getting oriented and up to speed enough to ask relevant questions takes time, and employees still have to parse answers they don’t fully understand. They might bounce between several questions and answers before landing on the right solution. 
Naturally, all of this takes time away from other work these employees could be doing.",[],{"_key":4820,"_type":23,"children":4821,"markDefs":4826,"style":31},"11d083004c29",[4822],{"_key":4823,"_type":27,"marks":4824,"text":4825},"83e0b4e4f67d0",[],"AI-powered search can cut down the time it takes people to understand and articulate their problem, then guide them in finding the solution they need. AI-powered search and Q&A platforms are rapidly evolving to provide users with instant solutions aggregated by models trained on your organization’s internal data. Users can then ask follow-up questions in a chat format to get additional detail, context, or insight, just as they might work through an issue with a human colleague. That’s where semantic search comes in.",[],{"_key":4828,"_type":23,"children":4829,"markDefs":4834,"style":212},"2f33b5a8f0e7",[4830],{"_key":4831,"_type":27,"marks":4832,"text":4833},"07ab0a12b3180",[],"Semantic search, personalized",[],{"_key":4836,"_type":23,"children":4837,"markDefs":4847,"style":31},"9426cd716fa3",[4838,4843],{"_key":4839,"_type":27,"marks":4840,"text":4842},"26e085aaa6e50",[4841],"581132fe2ccd","Semantic search",{"_key":4844,"_type":27,"marks":4845,"text":4846},"26e085aaa6e51",[]," converts content into numerical vectors based on meaning assigned by machine learning. The search function can then traverse the numerical vectors like a physical space. Semantic search enables faster, higher-quality results and more efficient storage of search data. 
More importantly, it allows users to search using natural language instead of keyword manipulation.",[4848],{"_key":4841,"_type":122,"href":4849},"https://stackoverflow.blog/2023/07/31/ask-like-a-human-implementing-semantic-search-on-stack-overflow/",{"_key":4851,"_type":23,"children":4852,"markDefs":4857,"style":31},"0a4c2549b2e5",[4853],{"_key":4854,"_type":27,"marks":4855,"text":4856},"e24106c720880",[],"Semantic search can draw knowledge from a wide array of accurate, trustworthy, community-vetted sources and quickly offer possible solutions. Your employees might need more detailed or personalized answers, depending on the context they’re working in, so semantic search allows them to ask follow-up questions in a natural, conversational fashion. This format also allows employees to clarify and refine their questions as they go.",[],{"_key":4859,"_type":23,"children":4860,"markDefs":4865,"style":212},"c29589e02083",[4861],{"_key":4862,"_type":27,"marks":4863,"text":4864},"5ef9d56461760",[],"Garbage in, garbage out",[],{"_key":4867,"_type":16,"asset":4868},"3257b714ddae",{"_ref":2889,"_type":19},{"_key":4870,"_type":23,"children":4871,"markDefs":4876,"style":31},"66ac1378b05b",[4872],{"_key":4873,"_type":27,"marks":4874,"text":4875},"6164dbe099760",[],"An AI trained on your company’s data streamlines and speeds up onboarding for new employees as well as upskilling/reskilling for existing staff. But as we’ve mentioned, AI can’t make something from nothing. Models trained on outdated, incomplete, or just plain inaccurate information will tend to hallucinate, providing nonsensical, incorrect, or irrelevant answers. 
The old computing adage of “garbage in, garbage out” pretty much sums it up.",[],{"_key":4878,"_type":23,"children":4879,"markDefs":4893,"style":31},"ac5c2e69c668",[4880,4884,4889],{"_key":4881,"_type":27,"marks":4882,"text":4883},"3a7505d4369e0",[],"For an AI to provide your employees with high-quality answers, it needs access to accurate, up-to-date, and well-organized data. That’s why a knowledge-sharing and collaboration platform like Stack Overflow for Teams is ",{"_key":4885,"_type":27,"marks":4886,"text":4888},"3a7505d4369e1",[4887],"8382aab45706","critical to the success",{"_key":4890,"_type":27,"marks":4891,"text":4892},"3a7505d4369e2",[]," of AI-enhanced search and Q&A tools.",[4894],{"_key":4887,"_type":122,"href":4895},"https://stackoverflow.blog/2023/07/06/why-knowledge-management-is-foundational-to-ai-success/",{"_key":4897,"_type":23,"children":4898,"markDefs":4903,"style":212},"19f1e47050f3",[4899],{"_key":4900,"_type":27,"marks":4901,"text":4902},"e427dbd154f50",[],"Features to look for",[],{"_key":4905,"_type":23,"children":4906,"markDefs":4911,"style":31},"bf3e1064df37",[4907],{"_key":4908,"_type":27,"marks":4909,"text":4910},"dd03a0a966170",[],"In deciding which knowledge sharing and search/Q&A tools to adopt at your organization, there are certain features and capabilities you should look for:",[],{"_key":4913,"_type":23,"children":4914,"level":943,"listItem":944,"markDefs":4923,"style":31},"fc8854c9221b",[4915,4919],{"_key":4916,"_type":27,"marks":4917,"text":4918},"eae9b650f57d0",[151],"Trusted sources:",{"_key":4920,"_type":27,"marks":4921,"text":4922},"eae9b650f57d1",[]," Trust where the information came from, because the AI provides sources and attributions for all answers.",[],{"_key":4925,"_type":23,"children":4926,"level":943,"listItem":944,"markDefs":4935,"style":31},"c5a6542e8653",[4927,4931],{"_key":4928,"_type":27,"marks":4929,"text":4930},"5e7c55f9a1190",[151],"Personalizable: 
",{"_key":4932,"_type":27,"marks":4933,"text":4934},"5e7c55f9a1191",[],"Configure preferences like length of answer and level of technical detail.",[],{"_key":4937,"_type":23,"children":4938,"level":943,"listItem":944,"markDefs":4947,"style":31},"41115d03a91f",[4939,4943],{"_key":4940,"_type":27,"marks":4941,"text":4942},"29a43df337e40",[151],"Shorter time-to-solution: ",{"_key":4944,"_type":27,"marks":4945,"text":4946},"29a43df337e41",[],"Find solutions faster without bouncing between answers; solutions can be summarized within a single search prompt.",[],{"_key":4949,"_type":23,"children":4950,"level":943,"listItem":944,"markDefs":4959,"style":31},"9a336f40e0f9",[4951,4955],{"_key":4952,"_type":27,"marks":4953,"text":4954},"402a78fc12bf0",[151],"Conversational interface: ",{"_key":4956,"_type":27,"marks":4957,"text":4958},"402a78fc12bf1",[],"Easily ask the system for more information; get suggested follow-up questions to continue the conversation or get deeper insights.",[],{"_key":4961,"_type":23,"children":4962,"markDefs":4967,"style":212},"c59fc4d30322",[4963],{"_key":4964,"_type":27,"marks":4965,"text":4966},"14419683e9f00",[],"Introducing the next generation of search",[],{"_key":4969,"_type":23,"children":4970,"markDefs":4983,"style":31},"d5d715917979",[4971,4975,4979],{"_key":4972,"_type":27,"marks":4973,"text":4974},"3a988f319aa90",[],"Today’s AI-assisted search technology understands not just what users are asking, but also what they ",{"_key":4976,"_type":27,"marks":4977,"text":4978},"3a988f319aa91",[57],"actually ",{"_key":4980,"_type":27,"marks":4981,"text":4982},"3a988f319aa92",[],"need to know—and where to find it. Enhanced Search for Stack Overflow for Teams shortens the time it takes to articulate and summarize your question and then comb through possible solutions to find the one most relevant to your situation. 
Enhanced Search upgrades the search experience for Stack Overflow for Teams users by:",[],{"_key":4985,"_type":23,"children":4986,"level":943,"listItem":944,"markDefs":4991,"style":31},"8633f5ca77af",[4987],{"_key":4988,"_type":27,"marks":4989,"text":4990},"b106232658710",[],"Summarizing multiple answers across your knowledge base into new insights.",[],{"_key":4993,"_type":23,"children":4994,"level":943,"listItem":944,"markDefs":4999,"style":31},"fb8d3e1a7308",[4995],{"_key":4996,"_type":27,"marks":4997,"text":4998},"52db271bf3a90",[],"Sourcing and synthesizing knowledge to help users move past blockers faster.",[],{"_key":5001,"_type":23,"children":5002,"level":943,"listItem":944,"markDefs":5007,"style":31},"662a4ca3dd1f",[5003],{"_key":5004,"_type":27,"marks":5005,"text":5006},"4516924fbcd30",[],"Delivering summarized insights from the global Stack Overflow community.",[],{"_key":5009,"_type":23,"children":5010,"markDefs":5015,"style":31},"a095a3115041",[5011],{"_key":5012,"_type":27,"marks":5013,"text":5014},"0f250727acaf0",[],"Instead of a time-consuming process of searching for and parsing information, you’ll get an answer sourced from a wealth of community-validated sources. Responses will include sources and citations, so users can validate the quality of the results. Solutions can be summarized within a single search prompt. You can ask follow-up questions to work toward a more personalized solution, refining your question as you go. 
Additionally, you can offer feedback to support reinforcement learning, in which humans apply their judgment and expertise to AI-generated content to coach the model to improve itself.",[],{"_key":5017,"_type":23,"children":5018,"markDefs":5023,"style":31},"d4d3af77f715",[5019],{"_key":5020,"_type":27,"marks":5021,"text":5022},"c73f9f4eea730",[],"If we can help you ask better questions, we can help you find better answers.",[],{"_type":169,"current":5025},"search","Search",{"_key":5028,"_type":262,"body":5029,"slug":5469,"title":5471},"66a82fea4bc2",[5030,5038,5057,5098,5106,5114,5118,5126,5156,5175,5194,5212,5220,5224,5254,5262,5270,5278,5281,5289,5293,5301,5309,5343,5366,5378,5390,5402,5410,5418,5426,5434,5442,5450,5453,5461],{"_key":5031,"_type":23,"children":5032,"markDefs":5037,"style":4766},"6bc5487f33bc",[5033],{"_key":5034,"_type":27,"marks":5035,"text":5036},"a37abadc64830",[],"Why IDEs are important to developer workflows",[],{"_key":5039,"_type":23,"children":5040,"markDefs":5054,"style":31},"9a43b455aa51",[5041,5045,5050],{"_key":5042,"_type":27,"marks":5043,"text":5044},"e2cce27087a40",[],"For many developers, where they write their code is as important as the code itself. Some developers are purists in search of an uncluttered experience and may use a simple text editor or one optimized for the coding experience. When ",{"_key":5046,"_type":27,"marks":5047,"text":5049},"e2cce27087a41",[5048],"10ad1fa40aa5","we playfully suggested",{"_key":5051,"_type":27,"marks":5052,"text":5053},"e2cce27087a42",[]," that those who still write code in Emacs or Vim live in the past, boy, did we get an earful. The message was loud and clear: a programmer stays in their flow state by never letting their fingers leave the keyboard. 
For writing code, there are few tools better than these text editors.",[5055],{"_key":5048,"_type":122,"href":5056},"https://stackoverflow.blog/2020/11/09/modern-ide-vs-vim-emacs/",{"_key":5058,"_type":23,"children":5059,"markDefs":5091,"style":31},"294b344f68f9",[5060,5064,5069,5073,5078,5082,5087],{"_key":5061,"_type":27,"marks":5062,"text":5063},"052f71e1d57a0",[],"But modern software engineering is more than just writing code in a single file. Web applications can span multiple services stored in multiple repos. A software engineer will have to worry about debugging code (sometimes ",{"_key":5065,"_type":27,"marks":5066,"text":5068},"052f71e1d57a1",[5067],"a06bc063da98","spinning up test environments on the fly",{"_key":5070,"_type":27,"marks":5071,"text":5072},"052f71e1d57a2",[],"), committing changes to ",{"_key":5074,"_type":27,"marks":5075,"text":5077},"052f71e1d57a3",[5076],"68a980633d7e","version control",{"_key":5079,"_type":27,"marks":5080,"text":5081},"052f71e1d57a4",[],", managing ",{"_key":5083,"_type":27,"marks":5084,"text":5086},"052f71e1d57a5",[5085],"198884eb3f00","build and deploy pipelines",{"_key":5088,"_type":27,"marks":5089,"text":5090},"052f71e1d57a6",[],", and more. Integrated development environments (IDEs) can help with these tasks. 
They can help with writing code, as well, but they really shine as a battlestation for creating software, with all the additional activities that entails.",[5092,5094,5096],{"_key":5067,"_type":122,"href":5093},"https://stackoverflow.blog/2021/07/21/why-you-should-build-on-kubernetes-from-day-one/",{"_key":5076,"_type":122,"href":5095},"https://stackoverflow.blog/2023/01/09/beyond-git-the-other-version-control-systems-developers-use/",{"_key":5085,"_type":122,"href":5097},"https://stackoverflow.blog/2021/12/20/fulfilling-the-promise-of-ci-cd/",{"_key":5099,"_type":23,"children":5100,"markDefs":5105,"style":31},"82c2676a15cd",[5101],{"_key":5102,"_type":27,"marks":5103,"text":5104},"bc82b3b6f4810",[],"Let’s explore the various ways that an IDE can make a developer’s life easier.",[],{"_key":5107,"_type":23,"children":5108,"markDefs":5113,"style":212},"e24189793cb9",[5109],{"_key":5110,"_type":27,"marks":5111,"text":5112},"44f6a6b957cd0",[],"Writing and editing code",[],{"_key":5115,"_type":16,"asset":5116},"add15b29d41a",{"_ref":5117,"_type":19},"image-3b338ed03891be7985b5ef0ea1ad16cf39a2de5c-1430x682-png",{"_key":5119,"_type":23,"children":5120,"markDefs":5125,"style":31},"ce6d80d82dda",[5121],{"_key":5122,"_type":27,"marks":5123,"text":5124},"45b9c99e8cfc0",[],"While many prefer the pure text experience, IDEs do have a number of tools that make writing code easier. More importantly, as many developers spend a huge chunk of their time editing other people’s code, they make navigating and understanding that code easier.",[],{"_key":5127,"_type":23,"children":5128,"markDefs":5151,"style":31},"711c2631d53e",[5129,5133,5138,5142,5147],{"_key":5130,"_type":27,"marks":5131,"text":5132},"1a9c9f51a9130",[],"We often take for granted syntax highlighting in code editors. 
But this highlighting has been found to ",{"_key":5134,"_type":27,"marks":5135,"text":5137},"1a9c9f51a9131",[5136],"a1fbd9b054af","improve code comprehension",{"_key":5139,"_type":27,"marks":5140,"text":5141},"1a9c9f51a9132",[]," and may reduce the number of context switches within a task. While syntax highlighting can help identify those nagging errors of unmatched brackets and quotes (as entire sections of your code are suddenly colored as string literals), most IDEs will automatically highlight syntax errors and match brackets for the programming languages of your choice. To top it off, many IDEs will format your code to match your house style, either using a ",{"_key":5143,"_type":27,"marks":5144,"text":5146},"1a9c9f51a9133",[5145],"89fb30dba590","linter",{"_key":5148,"_type":27,"marks":5149,"text":5150},"1a9c9f51a9134",[]," or otherwise.",[5152,5154],{"_key":5136,"_type":122,"href":5153},"https://ppig.org/papers/2015-ppig-26th-sarkar1/",{"_key":5145,"_type":122,"href":5155},"https://stackoverflow.blog/2020/07/20/linters-arent-in-your-way-theyre-on-your-side/",{"_key":5157,"_type":23,"children":5158,"markDefs":5172,"style":31},"6ff6ca29ed07",[5159,5163,5168],{"_key":5160,"_type":27,"marks":5161,"text":5162},"a4dd0f7c9abd0",[],"While many curse boilerplate code and whisper “don’t repeat yourself” until three letters alone are enough to scold (DRY!), there are still a lot of repetitive tasks in writing code, and IDEs help with those too. You can store code snippets to quickly provide `try`-`catch` or `if`-`else` blocks when you need them. 
Some IDEs, like IntelliJ, allow you to ",{"_key":5164,"_type":27,"marks":5165,"text":5167},"a4dd0f7c9abd1",[5166],"7999fc91fb5a","place multiple cursors",{"_key":5169,"_type":27,"marks":5170,"text":5171},"a4dd0f7c9abd2",[]," and write the same code in two places.",[5173],{"_key":5166,"_type":122,"href":5174},"https://www.jetbrains.com/help/idea/multicursor.html",{"_key":5176,"_type":23,"children":5177,"markDefs":5191,"style":31},"7bd407831b01",[5178,5182,5187],{"_key":5179,"_type":27,"marks":5180,"text":5181},"e83ef6e3f0d30",[],"For many, the biggest benefit is code completion or IntelliSense. This lets you quickly complete the function/variable/class names by typing just a few characters. Modern programming uses languages with massive standard libraries, multiple complex dependencies, and sprawling multi-service architectures, so knowing the name of every piece of that code isn’t feasible. In fact, research found that some developers use code completion ",{"_key":5183,"_type":27,"marks":5184,"text":5186},"e83ef6e3f0d31",[5185],"d7ae04c0e44f","as an exploratory tool",{"_key":5188,"_type":27,"marks":5189,"text":5190},"e83ef6e3f0d32",[]," to find new functions to use.",[5192],{"_key":5185,"_type":122,"href":5193},"https://ppig.org/papers/2015-ppig-26th-marasoiu/",{"_key":5195,"_type":23,"children":5196,"markDefs":5209,"style":31},"dac000e13b1b",[5197,5201,5206],{"_key":5198,"_type":27,"marks":5199,"text":5200},"692c873f29160",[],"For those who still want to speed up their code editor, well, you can often ",{"_key":5202,"_type":27,"marks":5203,"text":5205},"692c873f29161",[5204],"e7b4c01b6d4f","use code editors within 
IDEs",{"_key":5207,"_type":27,"marks":5208,"text":635},"692c873f29162",[],[5210],{"_key":5204,"_type":122,"href":5211},"https://www.barbarianmeetscoding.com/boost-your-coding-fu-with-vscode-and-vim/installing-vim-in-vscode/",{"_key":5213,"_type":23,"children":5214,"markDefs":5219,"style":212},"587c1370a2d1",[5215],{"_key":5216,"_type":27,"marks":5217,"text":5218},"1ceaf1ea7a2b0",[],"Debugging",[],{"_key":5221,"_type":16,"asset":5222},"61ae57196a92",{"_ref":5223,"_type":19},"image-bfd500bdfc2dc6ac20d214c89783de7deae3df00-1430x682-png",{"_key":5225,"_type":23,"children":5226,"markDefs":5249,"style":31},"1ebc64abdb78",[5227,5231,5236,5240,5245],{"_key":5228,"_type":27,"marks":5229,"text":5230},"643783037fac0",[],"Developers would probably love to spend the majority of their time writing new code. But in most cases, unless you’re working at a young startup or on a greenfield project, you’re going to work with existing code, and that means debugging. Depending on who you ask, debugging code takes ",{"_key":5232,"_type":27,"marks":5233,"text":5235},"643783037fac1",[5234],"09ba7abfdac5","between 20% and 60%",{"_key":5237,"_type":27,"marks":5238,"text":5239},"643783037fac2",[]," (",{"_key":5241,"_type":27,"marks":5242,"text":5244},"643783037fac3",[5243],"f76421d2cd9c","though some say 90%",{"_key":5246,"_type":27,"marks":5247,"text":5248},"643783037fac4",[],") of a developer’s time. These bugs are usually more pernicious than just syntax errors, so they require a bit of investigation.",[5250,5252],{"_key":5234,"_type":122,"href":5251},"https://arxiv.org/pdf/2105.02162",{"_key":5243,"_type":122,"href":5253},"https://stackoverflow.com/questions/2325994/what-of-programming-time-do-you-spend-debugging",{"_key":5255,"_type":23,"children":5256,"markDefs":5261,"style":31},"8ceb1e426261",[5257],{"_key":5258,"_type":27,"marks":5259,"text":5260},"bde73621c33c0",[],"Fortunately, most IDEs have strong debugging capabilities. 
To see where a program goes wrong, you can view the program state with breakpoints, which freeze execution on a particular line of code. From there, you can inspect the values of the variables in play, view thread and memory states, and step through the execution of the remainder of the program.",[],{"_key":5263,"_type":23,"children":5264,"markDefs":5269,"style":31},"64a7e4d36596",[5265],{"_key":5266,"_type":27,"marks":5267,"text":5268},"d9c7a53ae1bb0",[],"For complex applications, you can use some more advanced techniques. You can attach a condition to a breakpoint: an expression that is evaluated while the program runs and pauses execution only when it holds. Whenever execution pauses, you can walk through the whole call stack and see every function call that led there. Instead of setting breakpoints, you can configure IDEs to handle exceptions in specific ways.",[],{"_key":5271,"_type":23,"children":5272,"markDefs":5277,"style":31},"e200f8bdcb04",[5273],{"_key":5274,"_type":27,"marks":5275,"text":5276},"0ac3da973c200",[],"Of course, that won’t help you solve the bug. 
For that you’ll need some external research, and more than a few Stack Overflow tabs.",[],{"_key":5279,"_type":185,"url":5280},"f46cc3c0f1c3","https://embed.reddit.com/r/ProgrammerHumor/comments/g8b8i4/after_you_solve_that_mysterious_bug/?embed=true&ref_source=embed&ref=share&utm_medium=widgets&utm_source=embedv2&utm_term=23&utm_name=post_embed&embed_host_url=https%3A%2F%2Fpublish.reddit.com%2Fembed",{"_key":5282,"_type":23,"children":5283,"markDefs":5288,"style":212},"f8948d4d7f3c",[5284],{"_key":5285,"_type":27,"marks":5286,"text":5287},"a4c74a4b43320",[],"Customizability",[],{"_key":5290,"_type":16,"asset":5291},"e56c8728dcce",{"_ref":5292,"_type":19},"image-526606a768cc57c63827d0f5f17315c79ff2611f-1430x682-png",{"_key":5294,"_type":23,"children":5295,"markDefs":5300,"style":31},"2e57b9ddb75d",[5296],{"_key":5297,"_type":27,"marks":5298,"text":5299},"fa6cde872ea00",[],"If there’s anything developers like, it’s customizing their systems, and IDEs are no exception. As such, most IDEs have a robust plugin or extension system that lets you add additional functionality. These plugins are key to maintaining the centrality of the IDE to a developer’s workflow, as modern software development has a lot of moving parts beyond the code. Plugins can handle much of the work that comes after you save your code to a file.",[],{"_key":5302,"_type":23,"children":5303,"markDefs":5308,"style":31},"fc099fdf3216",[5304],{"_key":5305,"_type":27,"marks":5306,"text":5307},"365e7c8c9b450",[],"Some of the tasks that plugins can help with include:",[],{"_key":5310,"_type":23,"children":5311,"level":943,"listItem":944,"markDefs":5338,"style":31},"836e66bbe717",[5312,5316,5320,5325,5329,5334],{"_key":5313,"_type":27,"marks":5314,"text":5315},"d3fa3af297510",[151],"Testing:",{"_key":5317,"_type":27,"marks":5318,"text":5319},"d3fa3af297511",[]," Many popular testing frameworks have plugins or extensions that let you run tests directly from your IDE. 
For example, Jest, a JavaScript testing framework that ",{"_key":5321,"_type":27,"marks":5322,"text":5324},"d3fa3af297512",[5323],"1ceed4c1f320","we use",{"_key":5326,"_type":27,"marks":5327,"text":5328},"d3fa3af297513",[],", has a ",{"_key":5330,"_type":27,"marks":5331,"text":5333},"d3fa3af297514",[5332],"cfa05815762c","VS Code extension",{"_key":5335,"_type":27,"marks":5336,"text":5337},"d3fa3af297515",[]," that integrates with many VS Code features, including IntelliSense.",[5339,5341],{"_key":5323,"_type":122,"href":5340},"https://stackoverflow.blog/2022/07/04/how-stack-overflow-is-leveling-up-its-unit-testing-game/",{"_key":5332,"_type":122,"href":5342},"https://marketplace.visualstudio.com/items?itemName=Orta.vscode-jest",{"_key":5344,"_type":23,"children":5345,"level":943,"listItem":944,"markDefs":5363,"style":31},"0e3c4e56836d",[5346,5350,5354,5359],{"_key":5347,"_type":27,"marks":5348,"text":5349},"965e00cc3ed60",[151],"Version control:",{"_key":5351,"_type":27,"marks":5352,"text":5353},"965e00cc3ed61",[]," You can browse repos, commit code, and manage pull requests without leaving your IDE. 
This may be especially useful to folks since GitHub ",{"_key":5355,"_type":27,"marks":5356,"text":5358},"965e00cc3ed62",[5357],"98ffae5aa742","sunsetted Atom",{"_key":5360,"_type":27,"marks":5361,"text":5362},"965e00cc3ed63",[],", their official text editor/IDE.",[5364],{"_key":5357,"_type":122,"href":5365},"https://github.blog/2022-06-08-sunsetting-atom/",{"_key":5367,"_type":23,"children":5368,"level":943,"listItem":944,"markDefs":5377,"style":31},"3d4fe18d85a3",[5369,5373],{"_key":5370,"_type":27,"marks":5371,"text":5372},"41a1a43540de0",[151],"Build automation:",{"_key":5374,"_type":27,"marks":5375,"text":5376},"41a1a43540de1",[]," While many build processes happen within a CI/CD pipeline, you can still integrate build processes in your IDE, which is especially helpful to ensure that code actually compiles without errors.",[],{"_key":5379,"_type":23,"children":5380,"level":943,"listItem":944,"markDefs":5389,"style":31},"738a76722734",[5381,5385],{"_key":5382,"_type":27,"marks":5383,"text":5384},"50848ac38ae10",[151],"Deployment and CI/CD:",{"_key":5386,"_type":27,"marks":5387,"text":5388},"50848ac38ae11",[]," Speaking of those CI/CD pipelines, you can use plugins to directly manage your CI/CD pipeline, including things like debugging failed builds remotely.",[],{"_key":5391,"_type":23,"children":5392,"level":943,"listItem":944,"markDefs":5401,"style":31},"459d8ab1ce1a",[5393,5397],{"_key":5394,"_type":27,"marks":5395,"text":5396},"4f99158ad1c00",[151],"Task runners and scripting:",{"_key":5398,"_type":27,"marks":5399,"text":5400},"4f99158ad1c01",[]," There are a wealth of additional extensions and plugins that allow you to run various actions, scripts, or processes directly within your IDE, thus never breaking out of your workflow.",[],{"_key":5403,"_type":23,"children":5404,"markDefs":5409,"style":31},"d1661c352d71",[5405],{"_key":5406,"_type":27,"marks":5407,"text":5408},"0d157e87ceed0",[],"Extensions and plugins like this are key to maintaining a state of 
flow when building software. Remember that video about debugging above, where the developer had to leave their IDE to close all the Stack Overflow tabs they had opened while exploring the problem? We bring that to the IDE.",[],{"_key":5411,"_type":23,"children":5412,"markDefs":5417,"style":212},"99f7f57822ac",[5413],{"_key":5414,"_type":27,"marks":5415,"text":5416},"e59449b6e1020",[],"Uplevel developer experience",[],{"_key":5419,"_type":23,"children":5420,"markDefs":5425,"style":31},"5fc358c2762d",[5421],{"_key":5422,"_type":27,"marks":5423,"text":5424},"d094aef9fc2b0",[],"Our VS Code extension, Stack Overflow for Visual Studio Code, connects your developers’ IDE workspace with the answers they need to write their best code. It puts your developer experience a step ahead by:",[],{"_key":5427,"_type":23,"children":5428,"level":943,"listItem":944,"markDefs":5433,"style":31},"6c064e7d5606",[5429],{"_key":5430,"_type":27,"marks":5431,"text":5432},"bee0732ffb270",[],"Bringing the context-rich knowledge of Stack Overflow directly into your coding environment.",[],{"_key":5435,"_type":23,"children":5436,"level":943,"listItem":944,"markDefs":5441,"style":31},"9b487fd3da8e",[5437],{"_key":5438,"_type":27,"marks":5439,"text":5440},"3cf95f408ec60",[],"Helping developers understand how your code works with community-validated explanations.",[],{"_key":5443,"_type":23,"children":5444,"level":943,"listItem":944,"markDefs":5449,"style":31},"2d993917a903",[5445],{"_key":5446,"_type":27,"marks":5447,"text":5448},"d9caef6c82840",[],"Allowing developers to share insights and discoveries with their team without breaking 
flow.",[],{"_key":5451,"_type":185,"url":5452},"f5af761dbd87","https://fast.wistia.net/embed/iframe/qb9aga4dx1?seo=false&videoFoam=true&doNotTrack=true&seo=false&videoFoam=false&fitStrategy=cover&autoPlay=true&muted=true&controlsVisibleOnLoad=false&playbar=false&volumeControl=false&fullscreenButton=false&silentAutoPlay=true&settingsControl=false&plugin[captions-v1]=false&smallPlayButton=false&endVideoBehavior=loop",{"_key":5454,"_type":23,"children":5455,"markDefs":5460,"style":31},"57f9de07b17e",[5456],{"_key":5457,"_type":27,"marks":5458,"text":5459},"a4a7ef86f0740",[],"Users can ask questions directly from the IDE, summarize and explain code, and connect with your organization’s Stack Overflow for Teams knowledge base. The extension allows your developers to find and document the reasoning behind certain technical decisions without cluttering code with long comments or burying information in commit messages. Because the less they’re context-switching, the happier (and more productive) they are.",[],{"_key":5462,"_type":23,"children":5463,"markDefs":5468,"style":31},"80812f86c78f",[5464],{"_key":5465,"_type":27,"marks":5466,"text":5467},"fb1a8db558d20",[],"If they use one, a developer’s IDE is pretty central to their workflow. It’s more than just a place to write code—it can help guide the entire process from debugging through commit and deploy. A well-configured and customized IDE can be the key to keeping a developer in a flow state. 
Stack Overflow for Visual Studio Code ensures that a question about code doesn’t break them out of that flow.",[],{"_type":169,"current":5470},"ide","Developer environments",{"_key":5473,"_type":262,"body":5474,"slug":5772,"title":5774},"de5d0013c905",[5475,5483,5491,5499,5507,5515,5523,5531,5539,5547,5555,5563,5567,5579,5587,5591,5607,5615,5619,5627,5635,5643,5651,5669,5676,5684,5692,5700,5708,5716,5724,5732,5740,5748,5756,5764],{"_key":5476,"_type":23,"children":5477,"markDefs":5482,"style":4766},"329c62c7f9ad",[5478],{"_key":5479,"_type":27,"marks":5480,"text":5481},"6c0e1a197cae0",[],"How GenAI can help you learn as you go",[],{"_key":5484,"_type":23,"children":5485,"markDefs":5490,"style":31},"dac8f2293c24",[5486],{"_key":5487,"_type":27,"marks":5488,"text":5489},"76911a95c4d50",[],"One way to put the power of AI at your teams’ fingertips is through chat. This might take the form of a customer-facing chatbot that helps users find answers to their questions without help from your support teams. 
It could be an AI-powered chatbot for internal users, trained on your knowledge base to help employees find answers to their questions, work past blockers, and get up to speed on new technologies.",[],{"_key":5492,"_type":23,"children":5493,"markDefs":5498,"style":31},"82b6a203e0ee",[5494],{"_key":5495,"_type":27,"marks":5496,"text":5497},"044c9f00af6e0",[],"Whether it’s built for your customers, your internal teams, or both, an AI-powered chatbot can gather generated solutions to commonly encountered technical challenges and help users navigate your knowledge base, just like a human collaborator sitting at their side.",[],{"_key":5500,"_type":23,"children":5501,"markDefs":5506,"style":31},"04b4fcb46734",[5502],{"_key":5503,"_type":27,"marks":5504,"text":5505},"22a2ec3016270",[],"Let’s explore some best practices for integrating AI chat technology into your teams’ workflows, key features to look for, and some possible pitfalls to keep in mind.",[],{"_key":5508,"_type":23,"children":5509,"markDefs":5514,"style":212},"9c442289deed",[5510],{"_key":5511,"_type":27,"marks":5512,"text":5513},"cfcaaeddb73f0",[],"Self-serve knowledge",[],{"_key":5516,"_type":23,"children":5517,"markDefs":5522,"style":31},"1c97ac13659b",[5518],{"_key":5519,"_type":27,"marks":5520,"text":5521},"140cda417d7e0",[],"We’ve come a long way since the days of Clippy. Now chatbots trained on your codebase or internal knowledge base can offer timely, relevant assistance to users without interrupting their flow states or forcing them to switch between platforms.",[],{"_key":5524,"_type":23,"children":5525,"markDefs":5530,"style":31},"9d8e710239f9",[5526],{"_key":5527,"_type":27,"marks":5528,"text":5529},"ed75352cbc010",[],"When it comes to coding tasks, the friendly, accessible chat interface helps democratize software development by making it easier for anyone to get started writing code. 
More experienced software developers and engineers can use AI chatbots trained on your codebase to get unstuck or to gain comfort with a new programming language.",[],{"_key":5532,"_type":23,"children":5533,"markDefs":5538,"style":212},"b4862fea8be2",[5534],{"_key":5535,"_type":27,"marks":5536,"text":5537},"5d5a0849484b0",[],"Integration is everything",[],{"_key":5540,"_type":23,"children":5541,"markDefs":5546,"style":31},"c41d29a4870f",[5542],{"_key":5543,"_type":27,"marks":5544,"text":5545},"bacf355e3d900",[],"Chatbots can be built into familiar tools your employees are already using, from Slack to Stack Overflow for Teams. This kind of integration gives engineers and developers access to external knowledge resources without the need for costly context-switching: time- and attention-consuming switches to different platforms and delays while answers are sought, formulated, and delivered. The familiar, intuitive chat interface combined with natural language processing (NLP) makes asking questions of the AI as simple as pinging a colleague.",[],{"_key":5548,"_type":23,"children":5549,"markDefs":5554,"style":212},"a898b9545435",[5550],{"_key":5551,"_type":27,"marks":5552,"text":5553},"0f7f2e4fa72c0",[],"Best practices for building a value-add chatbot",[],{"_key":5556,"_type":23,"children":5557,"markDefs":5562,"style":31},"bc84b81496ac",[5558],{"_key":5559,"_type":27,"marks":5560,"text":5561},"2ac5604a67420",[],"AI can make the difference between a chatbot that adds huge value for users and one that’s merely an annoying pop-up (sorry, Clippy). But the best practices for building and implementing a chatbot still apply. 
Here are some to keep in mind as you develop your strategy:",[],{"_key":5564,"_type":16,"asset":5565},"128a6376ae9f",{"_ref":5566,"_type":19},"image-5d4ed9ed63e6f4993e2853ef8d5aab4019278dd5-1430x682-png",{"_key":5568,"_type":23,"children":5569,"markDefs":5578,"style":344},"910381c419c0",[5570,5574],{"_key":5571,"_type":27,"marks":5572,"text":5573},"bbda834e74220",[151],"Know what problem(s) the chatbot will solve",{"_key":5575,"_type":27,"marks":5576,"text":5577},"bbda834e74221",[]," ",[],{"_key":5580,"_type":23,"children":5581,"markDefs":5586,"style":31},"ed4e3d0d0526",[5582],{"_key":5583,"_type":27,"marks":5584,"text":5585},"0d34717c42d5",[],"What friction points for users are you trying to address with the chatbot? Maybe people are taking too long to find answers to their questions, leading to wasted time and lost productivity. Maybe customers are peppering your support team with mostly-straightforward questions that a bot could answer easily. Thinking about how your AI-powered chatbot can solve at least one specific problem will help ensure that you build something people will find useful and valuable.",[],{"_key":5588,"_type":16,"asset":5589},"6927f52bb638",{"_ref":5590,"_type":19},"image-da688d733978c93ddc3734253c19286c8c573485-1430x682-png",{"_key":5592,"_type":23,"children":5593,"markDefs":5606,"style":344},"67fa639b1ff2",[5594,5598,5602],{"_key":5595,"_type":27,"marks":5596,"text":5597},"e2963481c6a20",[151],"Don’t expect it to solve ",{"_key":5599,"_type":27,"marks":5600,"text":5601},"e2963481c6a21",[151,57],"every ",{"_key":5603,"_type":27,"marks":5604,"text":5605},"e2963481c6a22",[151],"problem",[],{"_key":5608,"_type":23,"children":5609,"markDefs":5614,"style":31},"2fdf110247d0",[5610],{"_key":5611,"_type":27,"marks":5612,"text":5613},"e2963481c6a23",[],"The other side of the coin is that you can’t expect a chatbot, even one powered by rapidly evolving GenAI technology, to solve every problem your users encounter. 
The goal of an AI-powered chatbot is to allow users to self-serve answers to their questions more quickly, without interrupting a knowledgeable human and jerking both questioner and prospective respondent out of their flow states. But there will still be times when an AI chatbot returns a nonsense answer or struggles to grasp the nature of the question. It’s important to recognize these inflection points and give users an easy way to connect with a human when they need it.",[],{"_key":5616,"_type":16,"asset":5617},"6c68b5e15a7f",{"_ref":5618,"_type":19},"image-2521723fed80368d038ae1855920b8fbfb16fc0e-1430x682-png",{"_key":5620,"_type":23,"children":5621,"markDefs":5626,"style":344},"edd89df9725c",[5622],{"_key":5623,"_type":27,"marks":5624,"text":5625},"b89da1fe50660",[151],"Improve as you go",[],{"_key":5628,"_type":23,"children":5629,"markDefs":5634,"style":31},"66a57f54d7b2",[5630],{"_key":5631,"_type":27,"marks":5632,"text":5633},"b89da1fe50661",[],"This brings us to the good news, which is that AI chatbots add more value over time, as they learn from the questions and other input they receive and improve their ability to deliver specific, accurate answers. Giving direction and feedback to chatbots allows them to make themselves more useful to your users and your organization as a whole.",[],{"_key":5636,"_type":23,"children":5637,"markDefs":5642,"style":212},"0ba3e49bc21b",[5638],{"_key":5639,"_type":27,"marks":5640,"text":5641},"297ffb198e570",[],"Data quality makes a difference",[],{"_key":5644,"_type":23,"children":5645,"markDefs":5650,"style":31},"b1a1370bbac4",[5646],{"_key":5647,"_type":27,"marks":5648,"text":5649},"030c69b1e05e0",[],"As with any other AI-powered tool, the quality of the information a model has access to has everything to do with the quality of its answers. AI models given access to incomplete or inaccurate information are likely to return illogical or incorrect answers, known as hallucinations. 
The information your AI-powered chatbot has access to should be complete, up-to-date, well-organized, and free of errors.",[],{"_key":5652,"_type":23,"children":5653,"markDefs":5667,"style":31},"c9adbe6611e0",[5654,5658,5663],{"_key":5655,"_type":27,"marks":5656,"text":5657},"0d10a0486b910",[],"This is where a knowledge-sharing and collaboration platform like Stack Overflow for Teams becomes ",{"_key":5659,"_type":27,"marks":5660,"text":5662},"0d10a0486b911",[5661],"485217b761cb","vital to the success",{"_key":5664,"_type":27,"marks":5665,"text":5666},"0d10a0486b912",[]," of AI initiatives, from chatbots to advanced search and code completion.",[5668],{"_key":5661,"_type":122,"href":4895},{"_key":5670,"_type":23,"children":5671,"markDefs":5675,"style":212},"b22d942cfd86",[5672],{"_key":5673,"_type":27,"marks":5674,"text":4902},"5ce5996ab1600",[],[],{"_key":5677,"_type":23,"children":5678,"markDefs":5683,"style":31},"74ce20f56e26",[5679],{"_key":5680,"_type":27,"marks":5681,"text":5682},"3594db7d34ab0",[],"AI chatbots aren’t interchangeable; there are specific features you should look for in building or shopping around for the right tool. 
Users should be able to:",[],{"_key":5685,"_type":23,"children":5686,"level":943,"listItem":944,"markDefs":5691,"style":31},"8edc1a0d962f",[5687],{"_key":5688,"_type":27,"marks":5689,"text":5690},"db5735b261ab0",[],"Ask questions/receive answers in natural language, to make the interface simple and straightforward for all users.",[],{"_key":5693,"_type":23,"children":5694,"level":943,"listItem":944,"markDefs":5699,"style":31},"07b46a247f50",[5695],{"_key":5696,"_type":27,"marks":5697,"text":5698},"54bb169229e70",[],"Learn while solving actual coding tasks, as engineers and developers prefer.",[],{"_key":5701,"_type":23,"children":5702,"level":943,"listItem":944,"markDefs":5707,"style":31},"4351cfbf3d05",[5703],{"_key":5704,"_type":27,"marks":5705,"text":5706},"f60566a02a060",[],"Integrate the chat technology with existing tools and workflows.",[],{"_key":5709,"_type":23,"children":5710,"level":943,"listItem":944,"markDefs":5715,"style":31},"be49e2b179c7",[5711],{"_key":5712,"_type":27,"marks":5713,"text":5714},"91ec7a84a8830",[],"Get an explanation for various problem-solving approaches rooted in your company’s internal knowledge base.",[],{"_key":5717,"_type":23,"children":5718,"level":943,"listItem":944,"markDefs":5723,"style":31},"92ee3153c2ff",[5719],{"_key":5720,"_type":27,"marks":5721,"text":5722},"706654b2b3410",[],"Understand the context behind organizational best practices, based on your internal knowledge base.",[],{"_key":5725,"_type":23,"children":5726,"markDefs":5731,"style":212},"b84d4079e497",[5727],{"_key":5728,"_type":27,"marks":5729,"text":5730},"192fe6567b100",[],"Answers when and where you need them",[],{"_key":5733,"_type":23,"children":5734,"markDefs":5739,"style":31},"c157f4552c78",[5735],{"_key":5736,"_type":27,"marks":5737,"text":5738},"8b9252511c560",[],"Auto-Answer App for Stack Overflow for Teams automates access to essential knowledge at your organization, so your teams have the information they need when and where they need it. 
The app:",[],{"_key":5741,"_type":23,"children":5742,"level":943,"listItem":944,"markDefs":5747,"style":31},"59c3625d472d",[5743],{"_key":5744,"_type":27,"marks":5745,"text":5746},"308dee8336090",[],"Allows teams to spend less time and resources searching for and providing answers.",[],{"_key":5749,"_type":23,"children":5750,"level":943,"listItem":944,"markDefs":5755,"style":31},"c2953fd86986",[5751],{"_key":5752,"_type":27,"marks":5753,"text":5754},"a6384196d4ff0",[],"Summarizes chat threads and posts as digestible Q&A content for future reuse.",[],{"_key":5757,"_type":23,"children":5758,"level":943,"listItem":944,"markDefs":5763,"style":31},"96aab3440033",[5759],{"_key":5760,"_type":27,"marks":5761,"text":5762},"2b523b34e5240",[],"Sources information automatically without needing user commands.",[],{"_key":5765,"_type":23,"children":5766,"markDefs":5771,"style":31},"ebaeb39058ae",[5767],{"_key":5768,"_type":27,"marks":5769,"text":5770},"2f31781c841e0",[],"Auto-Answer App integrates with Slack and Microsoft Teams to give users access to insights from your knowledge community without the need for context-switching that costs time and energy. It searches your Stack Overflow for Teams instance and returns answers within your team’s preferred chat platform, without requiring user actions or accessing integrations. 
A familiar and intuitive chat interface makes it simple for any user, technical or not, to ask questions and get answers, work through coding problems, or locate the institutional knowledge they need to do their best work.",[],{"_type":169,"current":5773},"chat","Chat",{"_type":162,"seoDescription":5776,"seoImage":5777},"Discover Stack Overflow's journey to AI, enhancing learning and collaboration with AI-powered search and course recommendations.",{"_type":16,"asset":5778},{"_ref":5779,"_type":19},"image-efb0062f6081c194175604303d94e172ba9cd7c8-2400x1260-png",[],{"_type":169,"current":5782},"our-ai-journey","Our AI journey",{"_key":5785,"_type":45,"body":5786,"seo":5870,"sidebarCta":5872,"slug":5873,"title":2896},"d28e04f7a20b",[5787,5803,5807,5815,5823,5831,5839,5847,5855],{"_key":5788,"_type":23,"children":5789,"markDefs":5802,"style":31},"26efe2ebeaa2",[5790,5794,5798],{"_key":5791,"_type":27,"marks":5792,"text":5793},"6a43585e73b40",[],"In the year since we first launched our ",{"_key":5795,"_type":27,"marks":5796,"text":5797},"6a43585e73b41",[57],"Industry Guide to AI ",{"_key":5799,"_type":27,"marks":5800,"text":5801},"6a43585e73b42",[],"(January 2024), many companies have evolved from learning the basics of the tools and techniques they needed to implementing, iterating, and improving on their implementation. ",[],{"_key":5804,"_type":90,"citation":5805,"copy":5806},"16e9f7a26052","Research from Menlo Ventures","2024 marks the year that generative AI became a mission-critical imperative for the enterprise. The numbers tell a dramatic story: AI spending surged to $13.8 billion this year, more than 6x the $2.3 billion spent in 2023—a clear signal that enterprises are shifting from experimentation to execution, embedding AI at the core of their business strategies.\n\nThis spike in spending reflects a wave of organizational optimism; 72% of decision-makers anticipate broader adoption of generative AI tools in the near future. 
This confidence isn’t just speculative—generative AI tools are already deeply embedded in the daily work of professionals, from programmers to healthcare providers.",{"_key":5808,"_type":23,"children":5809,"markDefs":5814,"style":31},"e38e211c9a5d",[5810],{"_key":5811,"_type":27,"marks":5812,"text":5813},"a1764f6bc7e70",[],"While advancement from foundation models may slow, there is still an enormous amount of progress to be made to the speed, cost, and accuracy of GenAI inside your organization by adopting the best practices of peers and researchers.",[],{"_key":5816,"_type":23,"children":5817,"markDefs":5822,"style":31},"b6ef3a314832",[5818],{"_key":5819,"_type":27,"marks":5820,"text":5821},"7891665a8d960",[],"RAG was the first example, as it was often the gateway for companies to begin experimenting with GenAI. There are now far more advanced and flexible styles of RAG, as well as tools and service providers who can help you to optimize your use of this technique.",[],{"_key":5824,"_type":23,"children":5825,"markDefs":5830,"style":31},"5bfe75a6e879",[5826],{"_key":5827,"_type":27,"marks":5828,"text":5829},"982a20f47d4a0",[],"Routers are another great example of how quickly the industry is changing and the benefits that are accruing to end users. Today, you can build your stack on top of a router that allows you to easily swap one model for another, shifting from private to open-source, third party to in-house, with minimal interruption to your GenAI functionality.",[],{"_key":5832,"_type":23,"children":5833,"markDefs":5838,"style":31},"9369bec2adce",[5834],{"_key":5835,"_type":27,"marks":5836,"text":5837},"2f089bb2a40c0",[],"Agentic AI was around when we first published this guide, but it was largely being used by individuals hacking together personal projects—a wild west of an open-source community. 
AI agents have now gone mainstream, with companies like Anthropic, OpenAI, and Google offering agents that will take actions on a user’s behalf, controlling and interacting with various apps and services on their mobile device or desktop.",[],{"_key":5840,"_type":23,"children":5841,"markDefs":5846,"style":31},"6079f0ca3ec4",[5842],{"_key":5843,"_type":27,"marks":5844,"text":5845},"e48fb86d363e0",[],"As progress on the pre-training stage of AI models has slowed, focus has shifted to adding more horsepower to the inference stage of the process. In the past, no matter how complex the query, most GenAI models aimed to deliver their response quickly. What the end user received was a sort of initial response—a first thought, if you will. Today, many systems allow users to specify if a complex problem should be routed to a system that takes more time to think, plan, test, and consider before responding. For use cases like basic customer service, this is probably not needed, and would add latency that could irritate customers. For users who are pursuing complex research and have no issue waiting minutes or even hours for high-quality answers, however, this new modality has the potential to deliver enormous value.",[],{"_key":5848,"_type":23,"children":5849,"markDefs":5854,"style":31},"a40eee73811a",[5850],{"_key":5851,"_type":27,"marks":5852,"text":5853},"86baf86568ed0",[],"Here at Stack Overflow, 2024 brought some monumental changes to our business. 
We announced marquee partnerships for our data licensing business, built out our product offerings, and conducted research to substantiate the value our data adds to the performance of models fine-tuned on it.",[],{"_key":5856,"_type":23,"children":5857,"markDefs":5869,"style":31},"adf7fcf552b2",[5858,5862,5865],{"_key":5859,"_type":27,"marks":5860,"text":5861},"1139c80d0de50",[],"As the new year continues, we hope this refreshed version of our ",{"_key":5863,"_type":27,"marks":5864,"text":58},"1139c80d0de51",[57],{"_key":5866,"_type":27,"marks":5867,"text":5868},"1139c80d0de52",[]," helps ground you in the most important developments in the GenAI space and offers practical information and advice that you can apply inside your organization.",[],{"_type":162,"seoDescription":5871},"Explore the transformative potential of GenAI for businesses, from leveraging existing data to optimizing performance with high-quality data.",[],{"_type":169,"current":5874},"conclusion",true,{"_type":16,"asset":5877},{"_ref":5878,"_type":19},"image-b7f7790df6595991424dbe1aee780ce448459027-1200x630-png","With a thorough understanding of this new era in tech, you can better equip your team and your organization to leverage 
AI.",{"_createdAt":5881,"_id":5882,"_rev":5883,"_system":5884,"_type":5887,"_updatedAt":5888,"abbr":5889,"addons":5890,"color":5903,"descriptionShort":5920,"features":5921,"marketo":5946,"name":5947,"nameFull":5948,"plans":5949,"slug":5962},"2022-02-25T09:59:05Z","339bc91a-69c6-4a69-8add-a12977a22ad5","orKTSb5LIQENoAxH3BtKvT",{"base":5885},{"id":5882,"rev":5886},"d1opYIms5MkNkJ1qGGJAQy","product","2026-04-29T12:21:12Z","SOI",[5891,5894,5897,5900],{"_key":5892,"_ref":5893,"_type":19},"8bdd13cb0b67","productAddon-api",{"_key":5895,"_ref":5896,"_type":19},"f580fbca3462","productAddon-mcp",{"_key":5898,"_ref":5899,"_type":19},"01cd1dddf29a","productAddon-knowledge-ingestion",{"_key":5901,"_ref":5902,"_type":19},"3debbe7afe59","productAddon-services",{"_type":5904,"alpha":943,"hex":5905,"hsl":5906,"hsv":5911,"rgb":5915},"color","#2b2d6e",{"_type":5907,"a":943,"h":5908,"l":5909,"s":5910},"hslaColor",238.2089552238806,0.30000000000000004,0.4379084967320261,{"_type":5912,"a":943,"h":5908,"s":5913,"v":5914},"hsvaColor",0.6090909090909091,0.43137254901960786,{"_type":5916,"a":943,"b":5917,"g":5918,"r":5919},"rgbaColor",110,45,43,"Where developers & technologists share private knowledge with coworkers.",[5922,5925,5928,5931,5934,5937,5940,5943],{"_key":5923,"_ref":5924,"_type":19},"120fa387c6f4","productFeatureCategory-core-features",{"_key":5926,"_ref":5927,"_type":19},"c4d6798fe9a0","productFeatureCategory-search",{"_key":5929,"_ref":5930,"_type":19},"ceba5f99fe0d","productFeatureCategory-community",{"_key":5932,"_ref":5933,"_type":19},"198ea8e841d9","productFeatureCategory-customisation",{"_key":5935,"_ref":5936,"_type":19},"d587a358a655","productFeatureCategory-admin-support",{"_key":5938,"_ref":5939,"_type":19},"75af52fe8595","productFeatureCategory-integrations",{"_key":5941,"_ref":5942,"_type":19},"3a88eafc1b08","7c088242-c7bb-4d91-93db-6f7a042b1484",{"_key":5944,"_ref":5945,"_type":19},"5edeecc1eb92","productFeatureCategory-security","Stack Internal","Internal","Stack 
Overflow Internal",[5950,5953,5956,5959],{"_key":5951,"_ref":5952,"_type":19},"ad71aa58305c","productPlan-free",{"_key":5954,"_ref":5955,"_type":19},"a9fec6266434","productPlan-basic",{"_key":5957,"_ref":5958,"_type":19},"17aa7628c024","productPlan-business",{"_key":5960,"_ref":5961,"_type":19},"a9c801fd0fb6","productPlan-enterprise",{"_type":169,"current":5963},"internal","2024-02-06T13:00:00.000Z",[5966],{"_key":5967,"_ref":5968,"_type":19},"1c26592156a6","b4721a78-6640-4834-b1fd-d2f6e434a4f4",{"_type":169,"current":5970},"ai-industry-guide",{"_ref":5972,"_type":19},"c18a6a4e-32f6-4b21-ad23-67d6add15691","Stack Overflow’s Industry Guide to AI",false]