In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.
In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.
Practically you would have to separate model architecture from weights. Weights are licensed as research use only, while the architecture is the actual scientific contribution. Maybe some instructions on best train the model.
Only problem is that you can’t really prove if someone just retrained research weights or trained from scratch using randomized weights. Also certain alterations to the architecture are possible, so only the “headless” models are used.
I think there’s some research into detecting retraining, but I can imagine it’s not fool proof.