Source: The Register
The UK government's National Data Library (NDL) initiative assumes AI developers will voluntarily use public datasets, but the economics work against it: private platforms like Hugging Face and commercial dataset brokers have already solved the friction problems (preprocessing, documentation, integration) that the NDL would need to match. If the library launches with raw, hard-to-parse datasets while private alternatives offer plug-and-play solutions, developers will route around it, leaving the NDL as infrastructure no one uses. The real cost is not building the library but the unglamorous, continuous work of data curation and tooling that makes datasets adoptable at scale.