AI training dataset used by tech giants allegedly created by scraping YouTube videos in violation of terms

Mike Dalton

Non-profit AI research group EleutherAI scraped YouTube subtitles to create a dataset in violation of YouTube’s terms of service, ProofNews said on July 16. The dataset, called the Pile, allegedly includes subtitles of 173,536 YouTube videos from over 48,000 channels. About 12,000 deleted videos are part of the dataset. Several top tech and AI firms, […]

The post AI training dataset used by tech giants allegedly created by scraping YouTube videos in violation of terms appeared first on CryptoSlate.
