Improve the performance of AI models & products.

OverflowAPI is a subscription-based API service that provides continuous access to Stack Overflow’s public dataset to train and fine-tune large language models.

The world’s leading AI companies partner with Stack Overflow.

Learn how we’re working together to empower developers through innovative, socially responsible AI solutions:

Google Cloud

Stack Overflow and Google Cloud partner to bring Generative AI to millions of developers through the Stack Overflow platform, Google Cloud Console, and Duet AI.

OpenAI

Stack Overflow and OpenAI partner to strengthen the world’s most popular large language models.

Join us in creating a new era of socially responsible AI.

We believe AI models and products must provide proper attribution and contribute value back to the communities creating and curating the data that fuels them. Learn more about our definition of socially responsible AI and the commitments we require from our partners.

Access high-quality technical content for commercial use cases.

Only 42% of developers trust the accuracy of AI tools [1]. Improve accuracy, product differentiation, and personalization with Stack Overflow’s dataset.

58M+ human-generated questions and answers with feedback signals from users and moderators.
Top-class technical expertise and experience, expressed in natural language, make the dataset well suited to LLM training.
Includes diverse tasks related to coding, advising, debugging, explaining, testing, reviewing, brainstorming, and troubleshooting.
Continuous access to newly created, up-to-date technical knowledge.

Improve model performance with specialized and precise data.

In internal and independent tests, fine-tuning on Stack Overflow data raised the share of “Perfect” answers by roughly 17 to 18 percentage points (Figure 1) and improved pass@1 on HumanEval and MBPP (Figure 2).

Figure 1. Percent of “Perfect” answers (internal testing)

Based on a proprietary evaluation set of 1,000 question-and-answer pairs with ground-truth answers, created from Stack Exchange and Prosus AI Assistant technical Q&A with the highest user ratings.

Pre Stack Overflow training / fine-tuning:
MPT 30B, instruction fine-tuned: 14.13%
Code Llama-2 34B, code and instruction fine-tuned: 37.38%

Post Stack Overflow training / fine-tuning:
MPT 30B, Stack Overflow fine-tuned: 31.52%
Code Llama-2 34B, Stack Overflow fine-tuned: 55.30%
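
To put the Figure 1 percentages side by side, the gains can be worked out directly. The sketch below is illustrative arithmetic only. It assumes each Stack Overflow fine-tuned model is compared with the baseline of the same model family, and the names `FIGURE_1` and `gains` are hypothetical.

```python
# Illustrative arithmetic on the Figure 1 percentages.
# Assumption: each baseline is paired with the Stack Overflow fine-tuned
# variant of the same model family. All names here are hypothetical.

FIGURE_1 = {
    # model: (percent "Perfect" before, percent "Perfect" after)
    "MPT 30B": (14.13, 31.52),
    "Code Llama-2 34B": (37.38, 55.30),
}


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (gain in percentage points, ratio of after to before)."""
    return round(after - before, 2), round(after / before, 2)


if __name__ == "__main__":
    for model, (before, after) in FIGURE_1.items():
        points, ratio = gains(before, after)
        print(f"{model}: +{points} points, {ratio}x the baseline")
```

Under that pairing, MPT 30B gains 17.39 percentage points (2.23x the baseline) and Code Llama-2 34B gains 17.92 percentage points (1.48x the baseline).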

Figure 2. ‘InCoder’ model

The InCoder authors found that including Stack Overflow data improved pass@1 on the HumanEval and MBPP (Mostly Basic Programming Problems) benchmarks.
HumanEval pass@1: 5 (baseline), 9 (with Stack Overflow data)
MBPP pass@1: 6.1 (baseline), 9.8 (with Stack Overflow data)
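
For readers unfamiliar with the metric, pass@1 is the share of problems for which a single generated solution passes the unit tests. The sketch below uses the unbiased pass@k estimator that is commonly reported with HumanEval. The page does not say how the InCoder numbers were computed, so this is an assumption, and the sample counts in the usage example are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: samples generated, c: samples that passed the tests, k: sample budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Subtract the probability that a random size-k subset has no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given one (n, c) pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts: (samples generated, samples passing) per problem.
    made_up = [(10, 1), (10, 0), (10, 3)]
    print(f"pass@1 = {mean_pass_at_k(made_up, k=1):.3f}")
```

With k = 1 the estimator reduces to c / n for each problem, so the benchmark score is simply the average fraction of passing samples.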