Damien Benveniste on Twitter: "What is STABLE DIFFUSION? As opposed to DALL-E 2, it is open source with a PyTorch implementation and a pre-trained version on HuggingFace. It is trained using
![Extract high quality corpus from common crawl efficiently using CCNet – Random Notes – Some random post of my study research and other random stuff](https://raw.githubusercontent.com/theblackcat102/theblackcat102.github.io/master/images/CCNet_pipeline.png)
Extract high quality corpus from common crawl efficiently using CCNet – Random Notes – Some random post of my study research and other random stuff
![DepCC: A Dependency-Parsed Web-Scale Corpus based on CommonCrawl : Language Technology Group (LT) : Universität Hamburg](https://www.inf.uni-hamburg.de/7382899/conllu-7d3b3eb19f454dcf94f6785a56fe8903b62e2d2f.png)
DepCC: A Dependency-Parsed Web-Scale Corpus based on CommonCrawl : Language Technology Group (LT) : Universität Hamburg
![Common Crawl Foundation: use their 5 billion page dataset with fairly unrestricted terms of service. : r/datasets](https://external-preview.redd.it/P2au8qlWfHtIBAHuAzQkSHfhQb9oiQsS21r41hj3-4c.jpg?auto=webp&s=5a0de92f2502faad780f9dc10d795f9d19c8975c)