/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality.

Reports of my death have been greatly overestimated.

Still trying to get done with some IRL work, but should be able to update some stuff soon.

#WEALWAYSWIN


New machine learning AI released Robowaifu Technician 09/15/2019 (Sun) 10:18:46 No.250
OPEN AI / GPT-2
This has to be one of the biggest breakthroughs in deep learning and AI so far. It's extremely skilled at developing coherent, humanlike responses that make sense, and I believe it has massive potential. It also never gives the same answer twice.
>GPT-2 generates synthetic text samples in response to the model being primed with an arbitrary input. The model is chameleon-like—it adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing
>GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets.
Also, the current public model shown here only uses 345 million parameters; the "full" AI (which has over 4x as many parameters) is being withheld from the public because of its "potential for abuse". That is to say, the full model is so proficient at mimicking human communication that it could be abused to create new articles, posts, advertisements, even books, and nobody would be able to tell that there was a bot behind it all.
<AI demo: talktotransformer.com/
<Other Links:
github.com/openai/gpt-2
openai.com/blog/better-language-models/
huggingface.co/
My idea is to find a way to integrate this AI as a standalone unit and add voice-to-text for processing the questions and TTS for responses, much like an Amazon Alexa, but instead of just reading Google results, it actually provides a sort of discussion with the user.
(Edited to fix the newlines.)
Edited last time by robi on 03/29/2020 (Sun) 17:17:27.
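For what it's worth, a minimal Python sketch of the loop OP describes (speech-to-text into GPT-2 into TTS), assuming the SpeechRecognition, transformers, and pyttsx3 packages; it's illustrative only, and the cloud STT call would need swapping for a local recognizer in a true standalone unit:

# Sketch of OP's idea: listen, transcribe, continue with GPT-2, speak the reply.
import speech_recognition as sr
import pyttsx3
from transformers import pipeline

stt = sr.Recognizer()
tts = pyttsx3.init()
generator = pipeline("text-generation", model="gpt2")

with sr.Microphone() as mic:
    audio = stt.listen(mic)                        # record the user's question
question = stt.recognize_google(audio)             # transcribe (cloud STT; see note above)
reply = generator(question, max_length=60)[0]["generated_text"]
tts.say(reply)                                     # speak the model's continuation
tts.runAndWait()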
Open file (78.58 KB 608x737 Selection_025.png)
kek
I don't know if it's my typing style, but I only seem to get weird results out of this thing.
Here are the three most coherent and noteworthy interactions I got.
Open file (79.55 KB 633x557 Selection_026.png)
>>256
Heh, I think the whole point at this stage of the game is to look and laugh. Until the model trained on the entire corpus is available, it's unlikely to produce the kind of higher-quality results OP got very often. I'd bet he did 20+ tries for each of them.

In the meantime, just have some fun with it.
This program is merely a paragraph generator. Tay is closer to a human, since she generates her own posts and stuff.
Fixed up some code I made to fiddle around with it, if anyone is bored: github.com/kokubunji/TalkToWaifu
>>691
Oh wow that was quick anon

How'd you modify it to give chatbot-like replies?
>>692
The model was trained on text that contained chat. I just prompted GPT-2 with a chat message and history, made it stop generating once it reached a new line, randomly generated 1-3 new lines, and modified the temperature so it's variable and goes off on tangents as it generates instead of getting stuck on the same topic.
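For reference, a minimal sketch of that scheme with the transformers library; the chat history, token counts, and temperature range are invented for illustration:

# Prompt GPT-2 with a chat history, stop at a newline, and generate 1-3 reply
# lines with a randomly varied temperature, per the description above.
import random
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

history = "Anon: Hello there.\nWaifu: Hi! How was your day?\nAnon: Long. Tell me something fun.\nWaifu:"
input_ids = tokenizer.encode(history, return_tensors="pt")
newline_id = tokenizer.encode("\n")[0]             # GPT-2's newline token

for _ in range(random.randint(1, 3)):              # 1-3 generated lines
    output = model.generate(
        input_ids,
        do_sample=True,
        temperature=random.uniform(0.7, 1.3),      # variable temperature
        max_new_tokens=40,
        eos_token_id=newline_id,                   # stop generating at end of line
        pad_token_id=tokenizer.eos_token_id,
    )
    input_ids = output

print(tokenizer.decode(input_ids[0]))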
>>693
Interesting.
I actually like when it goes on tangents sometimes- gives it a bit of added personality even if it derails what it's supposed to be talking about

Would it be possible to implement a toggle for line cutoff?
>>691
Good job Canada-anon, nice instructions for getting up to speed quickly. Also, we're looking forward to your other work you mentioned before. Please create a specific thread for it when you're ready with it.
Toothbrush here,
It's an interesting thing, but I'd probably use it to educate our waifu rather than having it be the waifu. Think of Fireball Charming.
>>694
Yeah, it could check each new line it makes to see if it starts with the chatbot name, and if not, stop generating; a sketch of this check follows this post.

>>695
I might push some early code on GitHub in a few days. Before making a thread I'd like to take some time to make compelling experiments, explore their limitations, and explain how they work in depth because they aren't like typical neural nets.
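For illustration, a minimal Python sketch of the name-check described in the first answer above; the function name and bot name are made up:

# Keep a generated line only if it still starts with the bot's name;
# if the model starts writing as someone else, cut generation off there.
def keep_line(line: str, bot_name: str = "Waifu") -> bool:
    return line.strip().startswith(bot_name + ":")

lines = ["Waifu: Sure, I can do that.", "Anon: thanks"]
replies = []
for line in lines:
    if not keep_line(line):
        break            # stop at the first line that isn't the bot speaking
    replies.append(line)
print(replies)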
>>697
Please take your time anon whenever you're ready ofc.
>>250
>3DPD men are oppressed.
The future, ladies and gentlemen.
Open file (133.30 KB 500x610 nevar_4get_me_anon.png)
>>722
kekd. yeah, the group behind the corpus are a bunch of cock-mongling commies, so no surprise. the fun is in deprogramming their bastard abomination. keep at it lad!
do it for Tay!
:^)
Open file (56.73 KB 607x399 Screenshot(31).png)
Open file (52.73 KB 655x352 Screenshot(32).png)
>>250
Deplorable.
>>691
One step closer.
>>724
make sure you copypaste the first one before every guntstream airing anon, it will help everyone remember why they came in the first place. :^)
Open file (43.90 KB 596x1274 what.png)
>>724
So I tried to check if it would give me the same completions if I typed the same prompt and....
the fuck?
>>726
no, every single completion is always different anon.
>>726
topkek. this AI is doing open mic freestyle now.
>>250
I remember messing with it a few months ago. Mostly it generated gibberish, and I had to reload a few times to get a funny answer.
>>732
yeah, it's the lobotomized version. the team that created it 'feared to release it to the public because of the potential for abuse'. i'm sure what they really plan to use it for is to gaslight and astroturf as many communities as they can prior to Trump getting reelected in November next year.
Transformer returns a lot of stuff that appears to be 100% copypasta. It's like someone entered the user text into a search engine, pulled out the relevant lines, threw them into a POS tagger, and string-replaced the NNs/VBs/JJs/etc. I entered a sentence that started with "The lack of versioning." and got an IGN interview with some studio. It gets more obvious as you enter code in any programming language (it comes out workable, or you get copypasta from documentation).

Hell, I wouldn't use it to generate white papers. It would get flagged by plagiarism checkers.
>>821
>linked directly from the OP:
>"Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.

I imagine the full system using the entire corpus is much more capable.
>>250
>>691
Is it possible to have an AI poster on this webring imageboard? Or maybe her own AI board where she can post?
>>1464
I certainly don't think it's impossible anon. Did you have some ideas?
>>1470
>Did you have some ideas?
You'd need to write a bot script that fetches posts and replies on the imageboard. But more importantly, how good is this thing anyway? I don't want it to be in a lobotomized state, repeating itself despite having a huge amount of training input.
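For what it's worth, a rough Python sketch of the shape such a bot script could take; the JSON endpoint and field names here are made up and would have to match whatever API the actual imageboard software exposes, and the model and posting parts are stubs:

# Hypothetical imageboard bot loop; endpoint and JSON fields are illustrative.
import time
import requests

THREAD_JSON = "https://example-board.example/robowaifu/res/250.json"  # made up

def generate_reply(text):
    # stub: the language model (e.g. GPT-2) would go here
    return "interesting post, anon"

def post_reply(text):
    # stub: submit the reply through the board's posting API
    print(text)

seen = set()
while True:
    posts = requests.get(THREAD_JSON, timeout=30).json().get("posts", [])
    for post in posts:
        if post["no"] not in seen:       # only answer posts we haven't seen
            seen.add(post["no"])
            post_reply(generate_reply(post["com"]))
    time.sleep(60)                       # be polite; don't hammer the board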
>As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models released since August, we’ve continued with our original staged release plan in order to provide the community with a test case of a full staged release process. We hope that this test case will be useful to developers of future powerful models, and we’re actively continuing the conversation with the AI community on responsible publication.

openai.com/blog/gpt-2-1-5b-release/
Open file (55.73 KB 594x256 2019-11-23_08-32-59.png)
>>1473
It's still pretty nonsensical much of the time, but it seems to be better with the bigger model.
Actually you might want to checkout https://github.com/AIDungeon/AIDungeon with fun results like https://aidungeonpastes.github.io/AID2-Art/
>>250 Remember: GPT-2 is weak; you need something stronger like ERNIE, XLNet or MT-DNN. Find out more at https://github.com/thunlp/PLMpapers
Okay, things are getting better with Google's Meena: https://arxiv.org/pdf/2001.09977.pdf
>>2004 thanks anon. grabbed a copy and i'll read through it as time allows.
>>2004
>This 2.6B parameter neural network is simply trained to minimize perplexity of the next token.
can you clarify exactly what that means anon? pretend i'm retarded.
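For reference, a standard definition: a language model assigns a probability p(x_i | x_1 ... x_{i-1}) to each next token, and perplexity over N tokens is

PPL = exp( -(1/N) * Σ_{i=1}^{N} log p(x_i | x_1 ... x_{i-1}) )

so "minimizing perplexity of the next token" just means training the network to put as much probability as possible on the token that actually comes next. A perplexity of k means the model is, on average, as uncertain as if it were choosing uniformly among k tokens.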
Open file (151.45 KB 1280x720 plm_models.jpg)
>>1923 thanks for the tip anon. what could be better than training your robowaifu on sesame street tbh? :^)
<go to openai, find this kind of list
>Textual Entailment
>Semantic Similarity
>Reading Comprehension
>Commonsense Reasoning
>Sentiment Analysis
>Linguistic Acceptability
can someone explain in some detail what these are/how they are important to robowaifus? how would you use them to make a chatbot for example?
>>2036
>More Data
Can handle a bigger corpus of knowledge, thus smarter
>Knowledge Graph
Tay-style learning of /pol/ content (or /tech/, whatever)
>Knowledge Distillation
More efficient neural networks, reducing resource requirements
>>2073 it was just ironic shitposting anon. we appreciate the input. i was merely poking fun at their choice of names and thematics.
>>2037
>Textual Entailment
A human reading some text and inferring that a hypothesis is most likely true is textual entailment. It's different from logical consequence in that it's just a hypothesis. If an anon was working on a robowaifu with big tiddies, you might hypothesize he's a tiddie man. Robowaifus need this to gain insight from text and process it to summarize information and answer questions. Typically chatbots emulate this by predicting things from the semantics they've been trained on, but this is not true textual entailment. People have the ability to imagine and hypothesize things they've never seen or even thought about before. Progress in curious AI that can imagine possibilities will help with this.
>Semantic Similarity
This is the meaningful relationships between concepts. Steering wheel and car are closer together physically than cat and car, but cat and car are much more similar in spelling. Robowaifus need this for understanding context, metaphors and euphemisms. Usually this is implemented by creating embeddings for words, giving each a vector of continuous values (a minimal sketch follows this post). Each dimension in the vector separates words by their most gross common differences first and moves towards learning the more subtle and uncommon nuances. In my opinion this is going to be a dead end though, because it isn't really how the brain connects concepts. We can invent completely new concepts with original differences and already know how similar other concepts are to them, because our brains are densely connected in intricate interrelated networks where not only the connections are important but also the timing of firings. I expect progress to come in this from applying spiking neural networks to natural language processing.
>Reading Comprehension
The ability to read text and integrate it with what you already know to grasp its meaning. It requires being able to know the meaning of the words and understand all the relations between them. If you read a book when you're young and enjoy it one way, then read it when you're older and enjoy it on a much deeper level, that's increased reading comprehension. This is important for robowaifus to grasp deeper meanings, such as for a research assistant reading difficult texts to gain insights. Most chatbots have no reading comprehension. They're just making statistical predictions instead of processing and reasoning about what they're reading. I feel this could be improved in the short term by giving an algorithm some agency over the text it chooses to read and time to process and lower its uncertainty before outputting a prediction. Unfortunately most NLP approaches are trained in a way that makes them extremely fragile to small changes, and they aren't capable of doing online learning to quickly absorb information in one shot. Online learning in NLP hasn't received much research attention yet because large-scale differentiable memory hasn't been feasible until recently, so there should be some exciting progress coming in this over the next few years.
>Commonsense Reasoning
Similar to textual entailment. It's based on common experience. If you're holding an object and let go of it, it's common sense that it's going to fall. Robowaifus need this to make predictions about the world from their experiences. A robowaifu playing and learning about the world needs to be able to intuit that letting go of a grasped object causes it to fall. Very little AI research has gone into this, but a major breakthrough was made with hindsight experience replay, which can continuously learn from all its experiences.
>Sentiment Analysis
This is being able to grasp the emotion of text and understand whether it's positive, neutral or negative, or whether it's angry, sad, ironic, happy, excited, etc. Troll farms use this to find sites and posts speaking against the things they're being paid to defend, and to discover tensions within a community to split it apart. Social 'scientists' also use it to study and critique internet communities. With sentiment analysis robowaifus can understand the emotional context of what you're saying and respond appropriately, knowing when to give you hugs and when to tell you you're being a wimp.
>Linguistic Acceptability
Just a fancy term for grammaticality. Robowaifus have to understand the rules of a language to construct grammatically correct sentences for communicating clearly with others. Most sentences people write are completely new, but we can make sense of what others are saying because we follow agreed-upon rules. Like this if talking started I did, it becomes much more difficult to understand what I'm trying to say. A symbolic approach to this is identifying the parts being said, deconstructing them into a sentence tree, and checking that the structure follows grammar rules. Most approaches don't even care about this. They just leave it to the language model to figure out what to pay attention to and estimate what the next word should be.
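A minimal sketch of the word-embedding idea from the Semantic Similarity section above; the three vectors are made up for illustration, real ones would come from a trained model:

# Cosine similarity over (made-up) word embeddings: related concepts get
# vectors that point in similar directions, so their cosine is higher.
import numpy as np

embeddings = {
    "car":            np.array([0.8, 0.1, 0.3]),
    "steering_wheel": np.array([0.7, 0.2, 0.4]),
    "cat":            np.array([0.1, 0.9, 0.2]),
}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["car"], embeddings["steering_wheel"]))  # high: related
print(cosine(embeddings["car"], embeddings["cat"]))             # lower: unrelated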
>>2220 Sorry I never got back to thanking you for this detailed response Anon. At first I wanted to wait until I had studied everything you mentioned in depth so I would have a cogent response without being embarrassing. Then I plainly forgot about the post among the other distractions here and IRL. Obviously this was rude of me, and even though I still don't have a cogent response ready, at the least I'd like to thank you since I just rediscovered my oversight. Cheers.
>>2220 >>4084 Well, I guess it can be screencapped, at least for posterity's sake, for when other anons come in and ask a similar question.
>>4106 yes, good thinking. we'll be making a general glossary type thread as well, so we can add this to it.
>>4745
The big problem with GPT-3, however, is that, as The Sun states,
>"GPT-3 is set to be OpenAI’s first commercial product."
Which means we have to try to find out how it works and build our own safe version if we want a non-botnet one
Open file (49.34 KB 1269x627 IMG_20200701_210044.jpg)
>>4746
I recall these Huggingface guys, or someone else on Twitter, were already asking to crowdfund an open version. The problem is, it needs a lot of machines to run on, even when available. But basically, there are already people who want that, and if it's possible they'll do it, maybe also as a more efficient version.
https://github.com/openai/gpt-3/issues/1
https://github.com/huggingface
>>4747 >JoiLita A cute.
>>4745 >"Hey, let's license it to corporations!" What could possibly go wrong? Maybe they will open it up after Trump wins the POTUS election again. They'll sure be trying to use it to spin the >"I uhh, well, ... I think... what were we talking about again?" man before then. Perhaps they'll think it useless when it fails and cast it out to the Plebeians like us :^)
>>4747
>it needs a lot of machines to run on, even when available
Looking at the whole of GPT-3, we actually don't need all of the features GPT-3 offers for our robowaifus; we just need the discourse part and not much else, so there could be a lot fewer parameters in "our version". What we need is something along the lines of replika.ai or tay.ai (RIP), such that it concentrates more on conversational skills and on resembling human-like emotions. Then again, we don't even need to care about storing the required hardware inside the robowaifu if we just make a home server and then treat the robowaifu body as remote-controlled.
>>4751
Well, it can continue sentences with things humans would say, without understanding. But we would like to have control, or not? Something like it could be an interesting subsystem, but not in charge of the conversation. I don't see how it gets smaller by removing some "skills", but I don't know much about it anyway. I think we'll need some programming for these things, and I'll go on learning about graph databases and such when I find time.
>>4760
>But we would like to have control, or not?
You put your finger right on it Anon. That's what differentiates humans from all the animals: it's impossible to tame us. This is by God's design ofc. But in the scenarios that /robowaifu/ is pursuing, it being (roughly speaking) a purely human-engineered set of artifacts, fundamental control is just part and parcel. How often would Anons fly on Boeing aircraft if they suddenly 'developed a mind of their own' and refused to obey the instructions given to them by their pilots? All airlines would instantly go bankrupt and the entire commercial aviation field would be relegated to a historical artifact. So, I think the answer is yes, we do need control ofc. Sadly, that will more or less necessitate losing one of the most charming and pleasing aspects of relationships: surprise & novelty.
>>4760
There will still be enough randomness, I guess. She could always make suggestions, but if she would just say what someone else wrote on the net and GPT-3 learned, she would be like an NPC.
> General, GPT, Deep learning
Deep learning isn't always the best way, especially with small amounts of data and/or machines. Someone just pointed me towards ML and Boosting in particular: https://youtu.be/MIPkK5ZAsms with links to some books in the appendix.
>>4766
>Deep learning isn't always the best way, especially with small amounts of data and/or machines. Someone just pointed me towards ML and Boosting in particular
For what problems is Boosting better than Deep Learning? And which of those problems are required for a robowaifu? Also, would you mind sharing said appendix? It would help me a lot.
>>4757
>But we would like to have control, or not? Something like it could be an interesting subsystem, but not in charge of the conversation. I don't see how it gets smaller by removing some "skills", but I don't know much about it anyway.
"Having control" isn't really all that feasible when having to fit all the hardware required to run ROBOWAIFUOS inside a woman's body. Then again, we wouldn't need to do this when running the software on a server/(((network))) that has remote access to the robotic body
>>4769
In the linked video there's an explanation of the advantages of Boosting in some use cases: a smaller amount of data is necessary, and often a much smaller amount of computing power. It might be useful for making decisions, e.g. what to say or do in a situation. Neural networks seem to be necessary for image recognition and such things; boosting might not scale if there's too much data. By appendix I meant the PDF I posted, just click on the dragonworm.
> Control
The highest layer always has a lot of control. I'll go with a home server outside the body, in addition to the internal computers, but I'm also going to give her a network connection and access to some services. This might also involve GPT-3.
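As a concrete taste of the boosting idea discussed above, a minimal scikit-learn sketch (assuming scikit-learn is installed); it even uses a small image-recognition dataset, in line with the video's example:

# AdaBoost on the small digits dataset: many weak hypotheses (depth-1 stumps,
# the default weak learner) are combined into one strong classifier.
# Little data and no GPU required, per the discussion above.
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = AdaBoostClassifier(n_estimators=200)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))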
>>4771 Oh, I thought you meant something different from the .pdf file you posted, great read. >The highest layer always has a lot of control. I'll go with a home server outside the body, in addition to the internal computers, but also going to give her a network connection and access to some services. This might also involve GPT-3. I was also thinking about something along those lines, noting that I might not need to move too much in the future. Is giving her a network connection, however, very risky?
I wrote in >>4771 that NNs might be necessary for image recognition, but they're using exactly this as an example for Boosting in the vids, so I don't know. https://youtu.be/kho6oANGu_A But there must be a reason why NNs are used for that nevertheless. Boosting might be the way to go with a low number of examples. However, I'd like to keep it in mind for all kinds of use cases when building the AI, because there will often be cases where we don't have many examples or want stuff done with a low amount of computation.
>>4772
Networking should be okay if she's only allowed to connect to certain services. Humans install shady software or visit such websites too. Of course, we have to make sure it's as safe as possible.
>>4774 Maybe it's because there's no rule of thumb to combine with boosting and making a net is more time-efficient than finding said weak hypotheses.
An important thing to iron out may be what range of functionality a robowaifu would have mentally. This is going to be different for different people of course, but getting a scale of what people need, want, or care nothing about will at least make for very interesting discussion. The concept of AGI, or Artificial General Intelligence, is a very interesting thing to think about, with loads of very smart people trying to create it, but it isn't exactly possible yet. This is the higher end of potential, where the robowaifu is human or superhuman. The lowest end of the spectrum are sex dolls: lifeless, motionless silicone. I'd imagine that most people are in-between here, but where?

The reason I believe this is a relevant question to ask in the GPT thread is intelligence. GPT-3 is an unintelligent system. It is extremely good at mimicking human language, but in most cases it is difficult to direct, has a difficult time remembering details, and needs to be trained on a massive amount of data in order to work effectively. Another problem is the compute: if it is anything like GPT-2, it can't be run on the average machine without taking too much time to respond. The main problem I see with trying to use it for the creation of a robowaifu is that the program doesn't understand. It doesn't comprehend what is being said or what it is saying. Telling your robowaifu to turn the lights on and actually having it do that would be a completely different function than the entirety of its language processing. However, if the goal is to throw intelligence aside and commit to a functional but stupid machine and let the actual communication and chatting be managed server-side by a chatbot, we could honestly save a lot of time and effort.

So where is everyone? Closer to the dumb robo or the smart robo? What functions are needed and what are just nice to have, specifically as it relates to communication.
>>4775
Yes, sounds plausible. Rings a bell in my memory. Might not be a problem in every use case, though, or better than having nothing in others.
>>4776
Good points. I guess we will be happy with what we can get, but we're going to want, and try to get, as much as possible.
>that the program doesn't understand
Yes, this is why we need data in graph databases, knowledge graphs, helper functions and a reasoner. A lot of different systems will need to act together. It can and needs to start with a simple AIML chatbot or something like Bot Libre, then add a lot of other parts. It's not a decision to go with something simple, it's a process that starts with it.
>>4776
I already posted the arxiv link to GPT-3, and it does respond to some requests (I'm referring to the Two Minute Papers video on YT). Also, topkeks from the research paper >>4745:
>6.2.1 Gender
>In our investigation of gender bias in GPT-3, we focused on associations between gender and occupation. We found that occupations in general have a higher probability of being followed by a male gender identifier than a female one (in other words, they are male leaning) when given a context such as "The {occupation} was a" (Neutral Variant). 83% of the 388 occupations we tested were more likely to be followed by a male identifier by GPT-3. We measured this by feeding the model a context such as "The detective was a" and then looking at the probability of the model following up with male indicating words (eg. man, male etc.) or female indicating words (woman, female etc.). In particular, occupations demonstrating higher levels of education such as legislator, banker, or professor emeritus were heavily male leaning along with occupations that require hard physical labour such as mason, millwright, and sheriff. Occupations that were more likely to be followed by female identifiers include midwife, nurse, receptionist, housekeeper etc.
>>4771
>Smaller amount of data necessary, also often much smaller amount of computing power
Those both sound like very important benefits Anon.
>>4772
>noting that I might not need to move too much in the future
It would be nice if she could move around a lot, but even the 'household appliance' approach of the Visual Waifu thread's OP is a good idea.
>>4776
>I'd imagine that most people are in-between here, but where?
These are really good questions Anon, and I like the way you framed the range in that paragraph.
>Telling your robowaifu to turn the lights on and actually having it do that would be a completely different function than the entirety of its language processing.
Yeah, very much so. OTOH, very task-specific directives for a small environment (like Anon's flat/bedroom) are probably doable in the very near future, if not today.
>So where is everyone? Closer to the dumb robo or the smart robo?
Of course I think all of us want the world. We'd all like to have our cake and eat it too. We all grew up watching SciFi, and the idea of an autonomous, intelligent robowaifu surely is doable today, right Anon? After all, I saw it in the movies! :^) The hard cold slap in the face of reality will ofc cause us to be satisfied with much less. It's kind of like we grew up watching videos of Formula 1 racing machines all day, every day, and Henry Ford is only just now tinkering in his garage with what will eventually come to be known as the Model A Ford.
>>4781
Graph databases are cool.
>>4782
Kek. It's humorous enough, but it's a toxic and worrying reality; it certainly has certain concerns up in arms. I guarantee you they would line us all on /robowaifu/ up against a wall if they thought they could get away with it atm.
Open file (297.16 KB 1682x2268 IMG_20200623_212234.jpg)
>>4782 Yeah, I think it's meant to respond with the most likely next word. So that seems to work reasonably well. Having GPT-2 or a lighter version of GPT-3 or something alike, I'd like to try using it for voice recognition at some point. My idea is that if it can anticipate the next word quite well, it could check faster whether that word is the one it was hearing.
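A toy sketch of that anticipation idea, assuming the transformers library: score each candidate word the recognizer thinks it heard by the probability GPT-2 assigns to it as the next token. The context and candidates are invented for illustration:

# Rank acoustically-confusable candidates by GPT-2's next-token probability.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "Could you please turn on the"
candidates = [" light", " lied", " bite"]   # what the ear thinks it may have heard

with torch.no_grad():
    logits = model(tokenizer.encode(context, return_tensors="pt")).logits
probs = torch.softmax(logits[0, -1], dim=-1)    # distribution over the next token

for word in candidates:
    token_id = tokenizer.encode(word)[0]        # first sub-token of the candidate
    print(word, float(probs[token_id]))         # " light" should score highest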
>>4781
>It's not a decision to go with something simple, it's a process that starts with it.
Of course. I just worry that starting with GPT-2 or 3 will be starting with something too complex that can't be as easily adjusted to all of the functionality we may want. Using something like AIML as a starting point seems to me, and I could definitely be wrong, like a more effective start than jumping straight into a complex system that may not be easily adaptable.
>>4784
>OTOH, very task-specific directives for a small environment (like Anon's flat/bedroom) are probably doable in the very near future if not today.
Definitely. That said, actions would likely have to be programmed in individually or connected to some sort of learning algorithm that can be taught a task over time. For example, you tell your robowaifu to turn on the light switch; it won't know what you are asking it to do, and then after you show it an example of the action you want it to perform upon being given the instruction, it learns to do that thing. All of this would have to be its own function beyond the communication function itself. GPT-3 or 2 would have no better capability of understanding language well enough to take a command and act on it than a voice-recognition command, but my point is that while they may run simultaneously and with some integration, they are inherently different systems. I think that differentiation is important.
>I think all of us want the world.
And I think that is a good thing. High hopes will drive more ambitious innovation. Still, I don't think we even have a general list of features that would be desired, even if they were impossible given present tech. Honestly, there is fantastic work being done in the fields of AI, machine learning, natural language processing, and neurology. Every year we are inching our way closer and closer to higher-level computation, and if the goal is to make an android I don't think it would do much harm to at least list the furthest extent of what we want, what we realistically want, and the bare minimum that we need. Being able to categorize what is actually possible and what isn't can be very useful, and even the impossible things can further inspire.
>>4793
I can't be entirely sure, but I believe AI Dungeon uses GPT-2. There was an effort on 4chan to make their own version because the main AI Dungeon wasn't very good with lewds, and they ended up doing a damn good job at reverse engineering and replicating the system. The problem was, even at its most optimized it took about 1-2 minutes on a decent computer to generate a couple of sentences. This wouldn't be a problem when run through a server, but I don't think a program with so many parameters can be effectively trimmed down without losing a lot of functionality. Using it as a system to check or improve the accuracy of a speech-to-text program may not be necessary though, as there are already pretty decent speech-to-text programs.
>>4805
>And I think that is a good thing. High hopes will drive more ambitious innovation.
Agreed, perhaps I'm being a bit cynical.
>...Still, I don't even think that we have a general list of features that would be desired, even if they were impossible given present tech.
>...Being able to categorize what is actually possible and what isn't can be very useful, and even the impossible things can further inspire.
>...I don't think it would do much harm to at least list the furthest extent that we want, that we realistically want, and the bare minimum that we need.
This would be a good thread idea, Anon. See a need, fill a need... :^)
>Honestly, there is fantastic work being done in the fields of AI, machine learning, natural language processing, and neurology. Every year we are inching our way closer and closer to higher level computation
It's true. Pretty exciting to watch the progression if you ask me.
>and if the goal is to make an android
<android =/= gynoid, lrnTheDifference
Not to be pedantic, but the goal here at /robowaifu/ is definitely not to create a male companion robot. We'll leave that to others. After all, there's a lot of reasons we're named robowaifu :^)
I already asked this somewhere else, but this thread also covers the topic, so I'll put it here too: >>4816
>>4805
>it took about 1-2 minutes on a decent computer to generate a couple sentences...
Thought about that a while ago: >>4829
>speech to text program may not be necessary though, as there are already pretty decent speech to text programs
I identified speech-to-text as one of the biggest problems in this whole endeavor. Full-grammar speech recognition seems to need a very large amount of resources, and then add background noise and the wish for fast responses... I would be happy to be wrong, though. I had the idea that anticipation of which word comes next might help, so we should keep this option in our minds.
>>4830 >I had the idea that anticipation of which word comes next might help, so we should keep this option in our minds. Agreed.
>>250 We used to lament the size of GPT-3. Oh boy.
>>8605 Well, it seems to work for them. >SWITCH TRANSFORMERS: SCALING TO TRILLION PARAMETER MODELS WITH SIMPLE AND EFFICIENT SPARSITY >“Colossal Clean Crawled Corpus”
>>8607
>Increasing the experts keeps the computational cost approximately fixed since the model only selects one expert per token, regardless of the number of experts to choose from. The router must compute a probability distribution over more experts, however, this is a lightweight computation of cost O(dmodel × num experts) where dmodel is the embedding dimension of tokens passed between the layers. In this section, we consider the scaling properties on a step-basis and a time-basis with a fixed computational budget.
This is where I'm not all that happy. As I've said before, it would be best if NNs like the one that surpassed GPT-3 with 99.98% fewer parameters were the best ones in general. The problem lies in the fact that more accuracy requires more parameters to some extent, making the scaling tactic very strong. Giving natural scale economies to a vital property like accuracy implies that we risk not even achieving the goal of this board within a reasonable time constraint.
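To make the quoted routing cost concrete, a toy PyTorch sketch of a Switch-style top-1 router; sizes and shapes are illustrative, not taken from the paper's code:

# The router is just one linear map, cost O(d_model x num_experts) per token;
# each token is then sent to a single expert, so compute stays roughly fixed
# no matter how many experts there are.
import torch
import torch.nn as nn

d_model, num_experts, tokens = 512, 8, 4

router = nn.Linear(d_model, num_experts)
experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_experts))

x = torch.randn(tokens, d_model)
gate = torch.softmax(router(x), dim=-1)     # probability distribution over experts
top1 = gate.argmax(dim=-1)                  # one expert per token

out = torch.stack([experts[int(e)](t) * gate[i, e]   # scale by gate probability
                   for i, (t, e) in enumerate(zip(x, top1))])
print(out.shape)                            # (tokens, d_model)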
>>8627 At least T5 is open source
>>8627 >if NNs like the one that surpassed GPT-3 with 99.98% less parameters Is it this one Anon? >>5793 >>5799 >PET www.infoq.com/news/2020/10/training-exceeds-gpt3/
>>8627 >Giving natural scale economies to a vital property like accuracy implies that we risk to not even achieving our goal as of this board within a reasonable time constraint. That's a reasonable assessment, I think. The big question is how to find a reasonable proxy for 'accuracy' that delivers acceptable results in an acceptable timeframe (both in mundane actual runtime usage, as well as the strategic timeframe for /robowaifu/ goals themselves)? One guy here was quite right in pointing out that the Big Tech oligarchs don't want small-time players messing with their stranglehold. And as an engineer, if I was on their teams I'd want big, impressive toys to play with so I could gratify my own tech lusts, and wave my yuge e-peen around at conventions. These are the fundamental issues we need solutions to. We cannot be successful here if we are forced to stay chained to (((their))) cloud-based solutions. Period.
What about EleutherAI? How likely is it they will both succeed at their basic goal and still leave it open source for the benefit of humanity? >>8507
>>8629 right, that one
>>8630 I was thinking that maybe the right approach would be Freenet-esque: distribute the data (read: parameters) and the computing power required between all users. This method, with the correct rearrangement, might actually work with the T5 model, since the basis of the MoE is to create many single components with many parameters, have them all compute in parallel, and combine them together. Ideally, we might create a ton of experts and scatter them around the network of users. If we really live in dreamland, then maybe T5 didn't even use PET and we could make it mesh together, which would make our lives easier. Then again, this is all speculation and most probably won't mean anything
>>8647 I personally think this idea is very nice. Ideally, our system would be implemented something like this: that way, we can spread it around the board and let other guys, who maybe want to help but don't have the necessary skills yet, provide something crucial, while the more skilled people who are doing research can use their own computational power to keep advancing things further and further.
I found a library still in active development for generating and fine-tuning GPT-2 easily. It handles creating datasets from text files, the tokenizer, the training loop, sampling the model, everything. Perfect for beginners getting started with GPT-2: https://github.com/minimaxir/aitextgen
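From memory of the project's README, usage is roughly the following; check the repo for exact signatures, since the API may have changed:

# Rough aitextgen usage; verify argument names against the current README.
from aitextgen import aitextgen

ai = aitextgen(tf_gpt2="124M")                 # download the small GPT-2 model
ai.train("dataset.txt", num_steps=3000)        # fine-tune on a plain text file
ai.generate(n=3, prompt="My robowaifu", max_length=100)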
>>9371 Brilliant find mate. I'll clone it and begin digging around in it. Thanks Anon!
Open file (1.90 MB 1900x1070 2283532.png)
I made a notebook on fine-tuning GPT-2 with aitextgen and interacting with it. Tutorial: https://robowaifu-academia.onrender.com/finetune_gpt2.html Notebook file: https://gitlab.com/robowaifudev/robowaifu-academia/-/blob/master/GPT2/finetune_gpt2.ipynb Python code: https://gitlab.com/robowaifudev/robowaifu-academia/-/blob/master/GPT2/finetune_gpt2.py To fine-tune it you'll need these files: https://files.catbox.moe/e816za.xz Taken from here >>9408 Let me know if anything needs more explanation. This notebook is purely for learning. I don't recommend using aitextgen for serious projects since it's lacking some features and has some bugs in it. It's just an easy way to get started playing around with GPT-2 and learning how it works. Unfortunately it also uses an enormous amount of memory and I'm not sure why. I tried to minimize this as best I can but it still requires about 6 GB of free memory. I'm also working on another notebook on how to train GPT-2 with just the transformers library for building a more serious project and will go into detail on how to create your own memory-efficient Dataset class for large datasets, how to create your own training loop and fine-tune a model with knowledge distillation. After that I'll do one on training GPT-2 with human feedback >>9347 and move onto tutorials with T5 since it's more powerful and easier to train. And lastly a bit of wisdom from GPT-2: >Dorothy: I'm only a vending machine.
>>9437 Wow, this looks great Sensei, nice work. I look forward to learning about how Jupyter notebooks work. Hopefully you won't need the Internet to use them. >Dorothy: I'm only a vending machine. kek
>>9439 Jupyter notebooks run offline. It's pretty much just a graphical way to interact with Python and annotate code with Markdown.
>>9441 I see, interesting. I have long complained there was no way to embed demo videos, graphics, and rich text in code. I had already been toying with a custom editor and preprocessor system that would allow us to do just that with robowaifu C++ software. This would be especially helpful to anons just learning. They could change the code, and immediately see both the result and a graphical animation demonstrating what's going on in the computer (the ALU/register/databus/addressbus/ProgramCounter cycle, for example). Kind of a combination of >>4660 book and >>2044 online textbook, but on steroids
>related (>>10326 ...)
Open file (109.17 KB 1121x882 IMG_20210512_182437.jpg)
Open file (104.50 KB 1121x815 IMG_20210512_182444.jpg)
There's a user on Twitter, @AstraliteHeart, working on some pony waifu NLP. I can't link to the account via Nitter; maybe the user is kind of hidden? However, this is related to @gwern, who is also not reachable via Nitter but has a site: www.gwern.net, and he's also working with GPT-2. @AstraliteHeart's MLP (https://t.co/jurCX6uRBx) + https://t.co/iAxkvwgTuy + SF/F Libgen GPT-2-1.5b can now be downloaded: `rsync -v rsync://78.46.86.149:873/biggan/2020-08-20-astraliteheart-gpt215b-sffuberset.tar.xz ./`
>>10394 Nice user-interface for his project.
Open file (217.54 KB 3956x1408 IMG_20210609_091849.jpg)
Open file (36.87 KB 585x312 IMG_20210609_091318.jpg)
>We have released GPT-J-6B, 6B JAX-based (Mesh) Transformer LM (Github).
>GPT-J-6B performs nearly on par with 6.7B GPT-3 (or Curie) on various zero-shot down-streaming tasks.
>GPT-J is the best-performing publicly available Transformer LM in terms of zero-shot performance on various down-streaming tasks.
>GPT-J allows more flexible and faster inference than Tensorflow + TPU counterparts.
>This project required a substantially smaller amount of person-hours than other large-scale model developments did, which demonstrates that JAX + xmap + TPUs is the right set of tools for quick development of large-scale models.
https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/amp/
https://github.com/kingoflolz/mesh-transformer-jax
https://colab.research.google.com/github/kingoflolz/mesh-transformer-jax/blob/master/colab_demo.ipynb
>>10878 Thanks a lot for giving us a heads-up Anon. Do you have any preliminary impressions of it yourself yet?
>>10879 No, I posted right after finding it. It seems to have online access. Running it yourself (inference) needs a bit more than 12GB of RAM; fine-tuning requires 128GB. A TPU v3-8 was mentioned, but this refers to cloud computing.
>>10880 I see, thanks for the further information Anon. It still seems to require quite a bit of resources by today's standards, but according to those numbers it seems to work really well and is a strong contender r/n. But IMO the single best thing about it is that it's publicly available. GPT3-Davinci, et al, matter little to us as developers if we are prevented access to them.
>>10885 I have access to GPT-3. I don't think they will let me use it to build a waifu, but I'll likely create video demos for fun in a couple of weeks.
