Arthur Besse@lemmy.ml to Technology@beehaw.org · English · 1 year ago
Sarah Silverman and other authors are suing OpenAI and Meta for copyright infringement, alleging that they're training their LLMs on books via Library Genesis and Z-Library
www.thedailybeast.com · 130 comments
ISMETA@lemmy.zip · English · 1 year ago
GPT-3 is 800GB, while the entirety of the English Wikipedia is around 10GB compressed. So yeah, it doesn’t store every detail of everything, but LLMs do memorize a lot of things verbatim. Also see https://bair.berkeley.edu/blog/2020/12/20/lmmem/
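For anyone who wants to sanity-check the size comparison, here is a rough back-of-envelope sketch. It assumes the published GPT-3 parameter count of 175 billion (not stated in the comment) and simply multiplies by the bytes used per parameter; the 10GB Wikipedia figure is taken from the comment as-is. The helper name `model_size_gb` is made up for illustration.

```python
def model_size_gb(n_params: int, bytes_per_param: int) -> float:
    """Raw weight size in gigabytes (1 GB = 1e9 bytes), ignoring file-format overhead."""
    return n_params * bytes_per_param / 1e9


# Assumption: published GPT-3 parameter count; it does not appear in the comment.
GPT3_PARAMS = 175_000_000_000
# Figure quoted in the comment for compressed English Wikipedia.
WIKI_GB = 10

for label, width in (("fp16", 2), ("fp32", 4)):
    size = model_size_gb(GPT3_PARAMS, width)
    ratio = size / WIKI_GB
    print(f"{label}: {size:.0f} GB, about {ratio:.0f}x the Wikipedia figure")
```

Depending on the precision assumed, the weights land in the hundreds of gigabytes, which is the same order of magnitude as the 800GB quoted above and well beyond the Wikipedia figure either way.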