LLaMA/Alpaca + GPT-1
datasets for large language models + EleutherAI - The Pile + LLaMA source data (data was open): Common Crawl; C4; GitHub; Wikipedia; books; arXiv; StackExchange
voice: https://github.com/34j/so-vits-svc-fork + GPT-2 + LLaMA + Stable Diffusion + ESRGAN + Whisper