>The majority of anons can't be expected to understand the nuances of AI techniques & technologies. They just want their robowaifus to talk to them effectively.
Yeah, some thought is gonna have to go into how to utilize people's computers effectively and automatically. They might only have a 2 GB toaster GPU. That's not ideal, but it could still help with prototyping smaller models. I think feedback will be important, otherwise people will shut the program off one night and forget to turn it back on when they don't see any results. Larger models could be compressed for people to run on their own computers so they can directly reap the benefits of their contributions. It'll need to run on Windows, Mac and Linux to reach the most users.
I imagine when they boot up this distributed training program it shows a list of projects their hardware is capable of contributing to, and the user selects which one they want to help. Part of the responsibility will fall on project owners to make their project pages look good enough that people want to lend their GPUs. Users could also dedicate their GPU to a project owner, so that owner can use it for any of their projects or prototypes.
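A rough sketch of how that project list could be filtered, just to make the idea concrete. Everything here is hypothetical: the `Project` fields, the `min_vram_gb` requirement declared by the project owner, and the `dedicated_owners` set are names made up for illustration, not part of any existing program.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    min_vram_gb: float  # hypothetical: minimum GPU memory the owner says is needed
    owner: str


def eligible_projects(projects, vram_gb, dedicated_owners=()):
    """Return the projects this machine can run.

    Projects from owners the user has dedicated their GPU to are listed first.
    """
    runnable = [p for p in projects if p.min_vram_gb <= vram_gb]
    # sorted() is stable and False sorts before True,
    # so projects from dedicated owners move to the front.
    return sorted(runnable, key=lambda p: p.owner not in dedicated_owners)


projects = [
    Project("chat-small", 2.0, "anon_a"),
    Project("chat-large", 24.0, "anon_a"),
    Project("twin-primes", 0.5, "anon_b"),
]

# A 2 GB toaster GPU whose user has dedicated it to anon_b
for p in eligible_projects(projects, vram_gb=2.0, dedicated_owners={"anon_b"}):
    print(p.name)
```

With the sample data, the 24 GB project is hidden and the dedicated owner's project is shown first. A real client would also want to check disk space, RAM and bandwidth, which this leaves out.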
I plan on making a simple version soon to utilize all my computers and friends' computers. I'm sure a proof of concept will eventually attract other developers. The biggest issue will be securing it without nerfing what devs can do with it. The simplest solution would be to review code, manually approve projects and basically have package maintainers. And devs could choose to join untrusted projects that haven't been approved yet since they can review the code themselves. It wouldn't be much different from the risk taken when installing open-source software. But there could also be a sandboxed version where people can prototype vanilla models by defining hyper-parameters and network structure from existing modules.
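For the sandboxed version, one way it could work is that projects submit a plain data description of the model instead of code, and the client only accepts module names and hyper-parameters from a whitelist. This is a minimal sketch under that assumption. The module names, argument names and spec layout are all invented for the example.

```python
# Hypothetical whitelist: module name -> argument names it accepts
ALLOWED_MODULES = {
    "linear": {"in_features", "out_features"},
    "relu": set(),
    "dropout": {"p"},
}
ALLOWED_HYPERPARAMS = {"learning_rate", "batch_size", "epochs"}


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_spec(spec):
    """Return a list of problems found in a model spec (empty list means OK)."""
    errors = []
    for key, value in spec.get("hyperparams", {}).items():
        if key not in ALLOWED_HYPERPARAMS:
            errors.append(f"unknown hyper-parameter: {key}")
        elif not _is_number(value):
            errors.append(f"hyper-parameter {key} must be a number")
    for i, layer in enumerate(spec.get("layers", [])):
        kind = layer.get("type")
        if not isinstance(kind, str) or kind not in ALLOWED_MODULES:
            errors.append(f"layer {i}: module not allowed: {kind!r}")
            continue
        for arg, value in layer.items():
            if arg == "type":
                continue
            if arg not in ALLOWED_MODULES[kind]:
                errors.append(f"layer {i}: unexpected argument: {arg}")
            elif not _is_number(value):
                errors.append(f"layer {i}: argument {arg} must be a number")
    return errors


good = {
    "hyperparams": {"learning_rate": 0.001, "batch_size": 32},
    "layers": [
        {"type": "linear", "in_features": 128, "out_features": 64},
        {"type": "relu"},
        {"type": "dropout", "p": 0.1},
    ],
}
bad = {
    "hyperparams": {"learning_rate": "__import__('os')"},
    "layers": [{"type": "exec", "code": "whatever"}],
}
print(validate_spec(good))
print(validate_spec(bad))
```

Since the spec is only data and every value has to be a plain number, nothing submitted by a project owner gets executed. The tradeoff is the one mentioned above: devs can only build from modules that already exist.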
>Would you guys contribute GPU cycles to create a GPT-3 clone?
The problem with trying to clone GPT-3 is that the model is too big to fit on anyone's GPU or in memory. The full-size GPT-3 requires around 16 x 48GB GPUs, and they likely have a few hundred or a few thousand of them, not just 16, doing gradient accumulation. The attention heads can be split across devices and run in parallel, but the layers can't be split so easily since each one depends on the output of the one before it. Swapping them in and out would incur a huge cost going back and forth between GPU and RAM, plus there's the bandwidth cost of sending all that data to the next computer to perform the next substep of the training step. It would be really inefficient, and the whole network would have to work together to do any inference on the model.
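Some back-of-the-envelope arithmetic on the memory side. The 175 billion parameter count is GPT-3's published size. The bytes-per-parameter figures are my assumptions: 2 for fp16 weights, 4 for fp32 weights, and roughly 16 as a common rule of thumb for mixed-precision training with Adam (weights, gradients, master weights and two optimizer moments), not counting activations.

```python
import math

PARAMS = 175e9  # GPT-3 parameter count
GPU_GB = 48     # memory per GPU, matching the 48GB cards mentioned above


def gpus_needed(bytes_per_param):
    """Return (total memory in GB, number of 48GB GPUs needed to hold it)."""
    total_gb = PARAMS * bytes_per_param / 1e9
    return total_gb, math.ceil(total_gb / GPU_GB)


for label, bytes_per_param in [
    ("fp16 weights only", 2),
    ("fp32 weights only", 4),
    ("mixed-precision Adam training (assumed ~16 B/param)", 16),
]:
    total_gb, gpus = gpus_needed(bytes_per_param)
    print(f"{label}: {total_gb:.0f} GB, at least {gpus} GPUs")
```

Under these assumptions the weights alone come to 350 GB in fp16 or 700 GB in fp32, so 16 x 48GB (768 GB) is about what it takes just to hold the model, and training state pushes it to several times that. Either way it's far beyond a single consumer card.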
Its purpose would be much more general. People could use the system for doing other projects unrelated to robowaifus and AI, such as finding twin primes or something else. It would be more like a crowd-sourced cloud computing platform. Adding a privacy mode is a good idea though in case people do give embarrassing names to their projects so other people using the computer only see 'Distributed Computing' or something like that.
If necessary, the bandwidth can be greatly reduced at the expense of accuracy. A little bit of noise from high compression doesn't seem to impact gradients too much since they're already quite noisy. We don't have to be too pessimistic about bandwidth growth though. Once Starlink finishes rolling out its satellites it's supposed to offer connections of up to 1 Gbps. ISPs are already getting nervous that their cartel is threatened and have been doubling bandwidth to customers to keep them.
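To illustrate the bandwidth/accuracy tradeoff, here is a minimal sketch of one simple option: uniform 8-bit quantization of a gradient vector. This is just one example technique picked for illustration, and the function names are made up. Sending 8-bit codes instead of 32-bit floats cuts the payload to about a quarter, plus two floats of overhead for the offset and scale.

```python
import random


def quantize(grads, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] plus an offset and scale."""
    levels = 2 ** bits - 1
    lo, hi = min(grads), max(grads)
    # Fall back to 1.0 when every value is identical, to avoid dividing by zero
    scale = (hi - lo) / levels or 1.0
    codes = [round((g - lo) / scale) for g in grads]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


random.seed(0)
grads = [random.gauss(0.0, 1.0) for _ in range(1000)]

codes, lo, scale = quantize(grads)
restored = dequantize(codes, lo, scale)

worst = max(abs(a - b) for a, b in zip(grads, restored))
print(f"step size: {scale:.5f}, worst-case error: {worst:.5f}")
```

The rounding error per value is at most about half a step, which is small next to the spread of the gradients themselves. Dropping `bits` lower saves more bandwidth and adds more noise.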
>what would running robowaifu@home provide as a benefit to the end user?
A virtual waifu and all her functions. Once basic chat is solved people are going to expand their virtual waifus to perform other functions such as playing video games, composing music, drawing, debating, summarizing research papers, searching the web, etc. People wanting these functions will contribute to those projects and receive a compressed version of the training results that their hardware can run or the full size version if they wish. Alternatively, someone could create a marketplace where people can pay crypto for compute, but I'm not familiar with how to do that. I think SingularityNET does something like that with AI services.