When you download Vicuna or Stable Diffusion XL, they’re a handful of gigabytes. But when you go download LAION-5B, it’s 240TB. So where did that data go if it’s being copy/pasted and regurgitated in its entirety?
Exactly! If these models were just outputting exact copies of their training data, the companies behind them wouldn't bother with generating new works; they'd pivot to selling the world's greatest compression algorithm.
That said, researchers have done some work on heavily modifying these models so that they overfit and do exactly this.
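To put rough numbers on the size comparison at the top of the thread, here's a back-of-the-envelope sketch. The 240 TB figure is the one quoted above; the 7 GB model size (standing in for "a handful of gigabytes") and the roughly 5.85 billion image count for LAION-5B are my own assumptions for illustration, so treat the output as an order-of-magnitude estimate, not a measurement.

```python
# Back-of-the-envelope: how much model is there per training image?
DATASET_BYTES = 240e12  # 240 TB, the LAION-5B figure quoted in the thread
MODEL_BYTES = 7e9       # ASSUMPTION: "a handful of gigabytes" taken as ~7 GB
NUM_IMAGES = 5.85e9     # ASSUMPTION: approximate LAION-5B image-text pair count


def size_ratio(dataset_bytes: float, model_bytes: float) -> float:
    """How many times larger the dataset is than the model."""
    return dataset_bytes / model_bytes


def bytes_per_item(model_bytes: float, num_items: float) -> float:
    """Model bytes available per training item if it stored every one."""
    return model_bytes / num_items


ratio = size_ratio(DATASET_BYTES, MODEL_BYTES)
per_image = bytes_per_item(MODEL_BYTES, NUM_IMAGES)

print(f"dataset is ~{ratio:,.0f}x larger than the model")
print(f"~{per_image:.2f} bytes of model per training image")
```

Under those assumptions the dataset comes out tens of thousands of times larger than the model, which leaves about one byte of model per image. That's nowhere near enough to store every image verbatim, though it doesn't rule out a model memorizing a small number of heavily repeated examples.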