AI firms will soon exhaust most of the internet’s data

A mining train going into the mine, full of 0s and 1s.

2024-07-23

The internet provided not only the images, but also the resources for labelling them. Once search engines had delivered pictures of what they took to be dogs, cats, chairs or whatever, these images were inspected and annotated by humans recruited through Mechanical Turk, a crowdsourcing service provided by Amazon which allows people to earn money by doing mundane tasks. The result was a database of millions of curated, verified images. It was through using parts of ImageNet for its training that, in 2012, a program called AlexNet demonstrated the remarkable potential of “deep learning”, that is to say, of neural networks with many more layers than had previously been used. This was the beginning of the AI boom, and of a labelling industry designed to provide it with training data.
