Market Impact: 0.2

GitHub: We're going to train on your data after all

MSFT, DELL, ORCL
Artificial Intelligence, Technology & Innovation, Cybersecurity & Data Privacy, Regulation & Legislation, Management & Governance
On April 24, GitHub will begin using customer interaction data (inputs, outputs, code snippets, surrounding context, file names/repo structure, chats, and feedback) to train its AI models for Copilot Free/Pro/Pro+ users unless they opt out via /settings/copilot/features. Copilot Business/Enterprise and student/teacher accounts are exempt. GitHub defends the change as improving suggestion acceptance and bug detection, but it applies the opt-out model that is standard in the US industry rather than EU-style opt-in consent. The change allows collection from private repos while a user is actively using Copilot, and it has drawn notable community backlash (59 thumbs-down vs 3 positive reactions across 39 posts).

Analysis

This change sets up a predictable near-term pattern: loss of developer trust will force a two-track adoption pattern in which enterprises pay for contract-level guarantees while individual and small-team users either opt out or seek air-gapped alternatives. If even 10–20% of active non-enterprise users reduce Copilot engagement, telemetry-driven quality improvements will slow, creating a visible churn/engagement trough over the next 3–9 months that could show up in Microsoft developer metrics and partner KPIs.

Second-order winners are vendors that sell on-prem/air-gapped LLMs, code-provenance and SBOM tooling, and enterprise database/cloud providers that can bundle contractual data protections. These players will see procurement cycles accelerate as security teams tighten policies within 1–4 quarters. Conversely, consumer-facing usage metrics for Copilot (a product-led growth channel) are now more vulnerable to PR-driven spikes in opt-outs and to regulatory actions, which raises short-term legal and compliance tail risk that could crystallize within 6–18 months in the EU or via class-action precedents in the US.

The market consensus focuses on privacy optics but underprices the commercial lever Microsoft holds: the ability to upsell enterprise contracts that exempt customers from training data collection. That creates a median path where MSFT faces a modest near-term drag but gains a durable long-term moat if it converts a fraction of free users into revenue-protecting enterprise customers. Tradeable windows are therefore short-to-intermediate: express tactical downside protection on MSFT while buying selective exposure to enterprise software names that can monetize the tightening of data-governance preferences.