/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality.


C++ General Robowaifu Technician 09/09/2019 (Mon) 02:49:55 No.12
C++ Resources general

The C++ programming language is currently the primary AI-engine language in use.





BTW if you're new to C++ and you're stuck on Windows (either you can't or won't upgrade to Linux) then you can at least incorporate a good, open shell into your system to begin with so you can follow along. Start at this link, and if you have any questions just ask ITT:
Edited last time by Chobitsu on 10/05/2019 (Sat) 20:16:32.
>>7904
>parallel (multi-core) JSON file parsing working
>muh_code_snippet:
/**-----------------------------------------------------------------------------
 @brief load the word-counting map 'io_w_pt' in parallel from each JSON file
 located inside the 'all_jsons' directory

 @param io_w_pt [IN/OUT] The word-counting map to load
 */
void ld_map(map<string, set_pair_ints>& io_w_pt)
{
  //
  // lambda to perform the JSON file parsing into the map
  auto parse_json_file{[&io_w_pt](auto& filepath) {
    const auto thread = rw::get_jsonfile_dat(filepath);  // each filepath == 1 imageboard thread

    const uint32_t thrd_id{thread["threadId"].asUInt()};
    const auto thrd_subj = thread["subject"].asString();

    // store this thread's subject off into the global var container
    thrd_subjs[thrd_id] = rw::clean_str(thrd_subj);

    // OP's subject+message text
    clean_store_words(io_w_pt,
                      string{thrd_subj + ' ' + thread["message"].asString()},
                      thrd_id, thrd_id);

    // each reply post's subject+message text
    for (const auto& post : thread["posts"])
      clean_store_words(
          io_w_pt,
          string{post["subject"].asString() + ' ' + post["message"].asString()},
          post["postId"].asUInt(), thrd_id);
  }};

  // first, read in all JSON file paths from the directory
  vector<file_path> paths{};
  for (const auto& jsonfile : dir_iter{jsons_dir})
    paths.push_back(jsonfile);

  // then run filepath processing in parallel, using the lambda above
  for_each(par_unseq, cbegin(paths), cend(paths), parse_json_file);
}
Not too pretty, compared to some pre-canned functions in another language, but it suits myself well enough for now. In my experience, you usually have to deal with more complexity if you want more performance/efficiency so yea.
>===
-minor code edit for readability
Edited last time by Chobitsu on 12/22/2020 (Tue) 09:20:42.
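For anyone following along, the word map that ld_map() fills can be sketched on its own. This is only a minimal illustration under assumptions: set_pair_ints is taken to be a set of (post id, thread id) pairs, and store_words() is a hypothetical, simplified stand-in for clean_store_words() that skips all the character scrubbing.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Assumed shape of the alias used in the post: a set of (post id, thread id).
using set_pair_ints = std::set<std::pair<std::uint32_t, std::uint32_t>>;

// Hypothetical, simplified stand-in for clean_store_words(): no scrubbing,
// just split on whitespace and record where each word occurred.
void store_words(std::map<std::string, set_pair_ints>& io_w_pt,
                 const std::string& text, std::uint32_t post, std::uint32_t thrd)
{
  std::istringstream in{text};
  std::string word;
  while (in >> word)
    io_w_pt[word].insert(std::make_pair(post, thrd));  // the set drops dupes
}
```

Because the value is a set, a word repeated inside one post is only recorded once for that post, which matches the "one count for a word per post" behaviour described for the counts file.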
>>7904 >updated JSON archive of the board https://files.catbox.moe/3n02er.7z
>>7904 >It's chewing through the 135 files (~10MB) and parsing every post and every word in about 2.5 seconds now, and that's on my old potato. BTW, during that two and a half seconds, it's also generating this 33K line, reverse-sorted text file of word_post_counts (one count for a word per post, excluding dupes). https://files.catbox.moe/o38sax.txt
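A rough sketch of how a reverse-sorted word_post_counts listing could be produced from such a map. The function name and the exact map type are assumptions made for illustration, not the tool's actual code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Assumed alias: a set of (post id, thread id) pairs per word.
using set_pair_ints = std::set<std::pair<std::uint32_t, std::uint32_t>>;

// Hypothetical: (word, number of distinct posts), highest count first.
std::vector<std::pair<std::string, std::size_t>>
word_post_counts(const std::map<std::string, set_pair_ints>& w_pt)
{
  std::vector<std::pair<std::string, std::size_t>> counts;
  counts.reserve(w_pt.size());
  for (const auto& [word, posts] : w_pt)
    counts.emplace_back(word, posts.size());

  // stable_sort keeps the map's alphabetical order among equal counts
  std::stable_sort(counts.begin(), counts.end(),
                   [](const auto& a, const auto& b) { return a.second > b.second; });
  return counts;
}
```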
>>7906
Blimey. Still didn't find the time to look into the program, but I downloaded and backed up the results at least.
>>7907
Okay, that's fast. How flexible is it, if you want to do something slightly different, like searching for every pair of two words?
>o38sax.txt
I'm looking through the list to see if it can be used for filtering out the relevant terms in a reasonable amount of time. If I recall correctly there's already a search available, so we could find the postings with these terms then and get their ID numbers?
>>7908
>How flexible is it, if you want to do something slightly different, like searching for every pair of two words?
Since I'm writing it myself from scratch using nothing but standard C++ & the jsoncpp library, it's as flexible as we can dream up how to make it. Yes, we could do any number of permutations of 'sliding window' views into the posts, if I take your meaning Anon.
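A two-word 'sliding window' over a post could look something like this. It's only a sketch; word_pairs() is a hypothetical helper and assumes the message has already been cleaned down to whitespace-separated words.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: every adjacent pair of words in an already-cleaned message.
std::vector<std::pair<std::string, std::string>> word_pairs(const std::string& msg)
{
  std::istringstream in{msg};
  std::vector<std::pair<std::string, std::string>> pairs;
  std::string prev, cur;
  if (!(in >> prev)) return pairs;  // empty message: no pairs

  while (in >> cur) {
    pairs.emplace_back(prev, cur);  // the window covers (prev, cur)
    prev = cur;                     // then slides forward by one word
  }
  return pairs;
}
```

Widening the window to three or more words would follow the same pattern with a small buffer in place of the single prev string.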
Update: I have exact term matching working for the search tool now. Eg, here's a 'hello world' search: > There are 8 posts with all those words, and in that exact order (ORDERED), and 2 posts with all those words, but not in that exact order (UN-ORDERED). So, things are coming along. I have to chase down a random crash (probably have to reel in the multi-core just a bit), and some clean up. Then I should be able to set it all up on a RaspberryPi and test it there. If all goes well, I'll create a step-by-step tutorial for building it and using it on your own RPis, and then eventually use the whole project as a set of non-trivial lessons on using Modern C++ correctly on a small-scale (it's about 1'000 lines of code). Cheers.
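The ORDERED vs UN-ORDERED distinction can be sketched roughly as below. This is not the tool's code: classify() and the Match enum are hypothetical, and the cleaned message is assumed to be lowercase words separated by single spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

enum class Match { none, unordered, ordered };

// Hypothetical classifier; 'cleaned_msg' is assumed to hold words separated
// by single spaces (a trailing space is harmless).
Match classify(const std::string& cleaned_msg, const std::string& phrase)
{
  // pad with spaces so that only whole words can match
  const std::string hay = ' ' + cleaned_msg + ' ';

  std::istringstream in{phrase};
  std::vector<std::string> terms;
  for (std::string t; in >> t;) terms.push_back(t);
  if (terms.empty()) return Match::none;

  std::string needle = " ";
  for (const auto& t : terms) {
    if (hay.find(' ' + t + ' ') == std::string::npos)
      return Match::none;  // a term is missing entirely
    needle += t + ' ';     // build " term1 term2 ... "
  }
  return hay.find(needle) != std::string::npos ? Match::ordered : Match::unordered;
}
```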
>>7921
>I have to chase down a random crash (probably have to reel in the multi-core just a bit)
In accord w/ >>2701
>Confessional
I'll reveal the crashing bug I finally found. But I'm still unsure just how to fix it yet (other than just not using parallelism for that op).
>fubar.cpp
//----------------------------------
void clean_store_words(map<string, set_pair_ints>& io_w_pt,
                       const std::string& in, const uint32_t post,
                       const uint32_t thrd)
{
  string temp{in};
  scrub_chars(temp);
  rw::rm_dupes(temp, '\'');

  // parse each message into a vector<string>
  stringstream message{temp};
  vector<string> words{};
  while (message >> temp) {  // '>>' stream operator parses on whitespace here
    scrub_string(temp);
    if (temp != "")
      words.push_back(temp);
  }

  string cleaned_msg{};

  // DEBUG: making this normal seems to have stopped the crashing.
  // test.
  // -UD: went 10 times straight w/o crashing. Will try bringing the others back
  // 'online'.
  // -UD2: 10 more times crash-free w/ others @ par_unseq. Seems OK now,
  // apparently this was the problem. Wonder what was happening?
  for_each(cbegin(words), cend(words), [&](const auto& word) {
    // for_each(par_unseq, cbegin(words), cend(words), [&](const auto& word) {
    io_w_pt[word].insert(make_pair(post, thrd));
    cleaned_msg += word + ' ';  // DEBUG: i wonder if the string concat could
                                // be part of the crashing issue w/ par ops?
                                // -test.
    // -UD: Yes, this appears to be the fundamental issue.
    // -how to solve? exact-matching won't w/o this insertion into container.
    // -some kind of atomic op?
    //
  });

  // insert the cleaned message into the global (for exact-matches)
  all_msgs[post] = cleaned_msg;
}
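On the "how to solve?" question: appending to one std::string (or inserting into one std::map) from several threads at once is a data race, and the word order needed for exact matching would be scrambled anyway. One possible approach, sketched here under assumptions, is to keep the per-message loop sequential and only guard the write to the shared container. The names store_cleaned() and all_msgs_mtx are made up for illustration. Note that taking a lock is only permitted under the par policy, not par_unseq.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>

std::map<std::uint32_t, std::string> all_msgs;  // shared between worker threads
std::mutex all_msgs_mtx;                        // hypothetical guard for all_msgs

// Hypothetical sketch: safe to call from several threads at once.
void store_cleaned(const std::string& text, std::uint32_t post)
{
  // build the message locally and in order: no other thread can see these
  std::istringstream in{text};
  std::string cleaned_msg;
  for (std::string word; in >> word;)
    cleaned_msg += word + ' ';

  // only the write into the shared container is serialized
  const std::lock_guard<std::mutex> lock{all_msgs_mtx};
  all_msgs[post] = std::move(cleaned_msg);
}
```

The same kind of guard would be needed around the shared word map if the outer per-file loop keeps running in parallel.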
>>7928
>Merry Christmas /robowaifu/ .
Thanks for the program. At least I intend to look into it soon. Merry Christmas as well.
>>7929 Heh, sorry I fixed some typos and added another first build step. Cheers.
Alright, I've decided that the search tool is good enough at a basic level to go ahead and provide it as a download before Christmas. I've decided to name it waifusearch. It's not packed or anything yet, so you'll have to build it yourself from codefiles using a C++ compiler. You can download and unzip this somewhere:
https://files.catbox.moe/ptpzgm.7z
def73cb9eafed554363cadb6685e072575d5b1322b409dc96de9352d4f945647 waifusearch-0.1.tar.xz

BTW, I provided the current /robowaifu/ JSON files inside this archive as well. They are located inside the 'all_jsons' subdirectory. And the program will be expecting them to be there, even if you overwrite them all with newer versions provided from the Library thread in the future.

Notes

Dependencies
-The software has a dependency on the jsoncpp library, and I also use the meson build system. You'll need both if you follow my build instructions below.
-This is a multi-core enabled program. I use GCC to compile with, and they apparently use Intel's TBB to supply the underlying task queue system for g++ to provide the standard C++17 parallel execution policies.
>tl;dr
You'll also need to install the TBB dependency.

All of these should be available from your package manager, and that's definitely the recommended approach here. But just in case, here's the source locations if not.
-Meson
https://github.com/mesonbuild/meson
-JsonCpp
https://github.com/open-source-parsers/jsoncpp
-Threading Building Blocks
https://github.com/oneapi-src/oneTBB

Building
-Initialize the project as a mesonbuild one. From the code directory:
meson build
-Be sure to tell meson to use release mode before building the project. From the code directory:
cd build && meson configure --buildtype=release && cd ..
-Either build from Juci, or you can just build w/ meson from the terminal instead. From the code directory:
cd build && meson compile && cd ..

Running
-Execute the waifusearch program. From the code directory:
build/waifusearch
-You should see something like this appear ('hello world' search example given):

time to process local /robowaifu/ JSON data: 2322 ms

(q to quit)
enter search phrase: hello world

ORDERED:
========
THREAD SUBJECT                      POST LINK
C++ General                      -> https://alogs.theГунтretort.com/robowaifu/res/12.html#6063 @ch: ~66
"                                -> https://alogs.theГунтretort.com/robowaifu/res/12.html#6095 @ch: ~391
Selecting a Programming Language -> https://alogs.theГунтretort.com/robowaifu/res/128.html#134 @ch: ~1102
Robowaifu-OS & Robowaifu-Brain(c -> https://alogs.theГунтretort.com/robowaifu/res/201.html#208 @ch: ~75
Modern C++ Group Learning Thread -> https://alogs.theГунтretort.com/robowaifu/res/4895.html#5422 @ch: ~150
"                                -> https://alogs.theГунтretort.com/robowaifu/res/4895.html#5423 @ch: ~64
"                                -> https://alogs.theГунтretort.com/robowaifu/res/4895.html#6529 @ch: ~242
Haute Sepplesberry Cuisine TBH   -> https://alogs.theГунтretort.com/robowaifu/res/4969.html#6313 @ch: ~772

UN-ORDERED:
===========
THREAD SUBJECT                      POST LINK
Robowaifu fiction to promote the -> https://alogs.theГунтretort.com/robowaifu/res/29.html#50
/robowaifu/ Embassy Thread       -> https://alogs.theГунтretort.com/robowaifu/res/2823.html#5158

full search took: 225 us
' hello world ' [8 : 2] = 10 results

(q to quit)
enter search phrase:

-If so, you're all set (ignore the Гунт-abused links in this post, the real ones from the program work). You should be able to search across /robowaifu/'s posts quickly. Once you find a result you're interested in, just right-click over the post link and choose 'Copy Link Address' then paste into your browser.
>note:
the program will also auto-generate two files, all_posts.txt & word_post_counts.txt when you run it. Feel free to look into them for search ideas or whatever you'd like.

I hope it's helpful enough for now Anon. Eventually, we'll expand it out to work across any imageboard that Bumpmaster supports.
Cheers, and I hope you all have a Merry Christmas /robowaifu/ .
OK for waifusearch, today added character trimming into the text parsing to properly handle bold and italic text in posts. Also, began a proper version log as well:
>version.log
// waifusearch, the lightning IB searcher
// ======================================
// -This software is designed to search an imageboard's text fast

201226 - v0.1a
--------------
-add proper char trimming to address bold and italics text in posts
-add board_nm string, and output_files flag, user global vars
-add version log, and various comments to meson.build

201224 - v0.1
-------------
-initial release

NOTES:
-Currently, waifusearch only supports Lynxchan-based JSON files.

https://files.catbox.moe/sw058v.7z
90db32390fc2eb18e567548af9381088d421cbd7c5ee88805e8510464098013d waifusearch-0.1a.tar.xz
Cheers, /robowaifu/ .
OK, I've done some needed housekeeping to clean up redundant processing and make maintenance easier in the future. No real performance differences, this is mostly just to ensure all text processing is uniform and help developers.
>version.log
201228 - v0.1b
--------------
-refactor to remove redundant code, simplify & unify text processing
-add vec_pair_s_vpi alias

https://files.catbox.moe/pfn2tu.7z
eaa25db2a04caf480ca76bd7d4c92c29f62774b4e6e1251154e834a0fb9ddb8f waifusearch-0.1b.tar.xz
For waifusearch, today I added some timing breakdown details to the output. There are four distinct phases this tool goes through to optimize the speed of searching across the entire collection of words here, and each of these is detailed both as a percentage of the whole, and as raw microseconds.
>
I can now easily test any perf optimizations in different sections, as well as have a better feel for the underlying behavior. For example, I suspected it before but now have numeric data that the first phase (sea) needs front-loading the first time through. BTW, there is a user-global flag you can use to turn this display off if you'd like (as well as other vars):
const bool show_timing{true};
I'm probably pretty close now to the planned functionality I wanted for the tool at this early stage. If anyone has any suggestions, questions, or requests about it then feel free to let me know ITT.
>version.log
201229 - v0.1c
--------------
-add display of timing breakdown details
-add disp_unfounds() function and outputs
-add percent_str() function to rw_text_utils.hpp
-rename alias' to better align w/ cppcoreguidelines (NL.5)
-minor function re-ordering

https://files.catbox.moe/dj7vr5.7z
a19eb6fc377422734944cb2344c82ee996dce77be35e9ade66f104fbcbfcb602 waifusearch-0.1c.tar.xz
Alright, I added parallelization to the loop inside x, the costliest part of the system time-wise.
void locate_terms(const Map_words_2_posts& w_tp,
                  const std::vector<std::string>& terms)
{
  const auto bgn_loc = clk_time.now();
  const uint32_t pad_len{27};

  Map_words_2_posts uniques{};
  Vec_words_2_posts vw_vtp_valids{};

  // attempt a term locate inside the w_tp & copy it if found
  for_each(par_unseq, cbegin(terms), cend(terms), [&](const auto& term) {
    try {
      uniques[term] = w_tp.at(term);
    } catch (const out_of_range& e) {
      unfounds.emplace_back(rw::make_pad_str(term, pad_len));  // global
    }
  });

  cp_map_2_vec(uniques, vw_vtp_valids);
  sort_counts(vw_vtp_valids);  // sort terms by post counts, low to high

  time_spans.emplace_back(clk_time.now() - bgn_loc);  // time_spans[1]: loc

  if (show_terms)
    disp_terms(vw_vtp_valids, pad_len);

  if (! vw_vtp_valids.empty())
    find_intersects(vw_vtp_valids);
}
It doesn't bring too much to the table for very short (1 or 2 word) searches b/c setup overhead, but for searches that are sentence-length or longer it basically cuts the time by a quarter to a half (roughly speaking) on my box.
>
I decided to go ahead and push a new rev on the same day since it's a fairly nice advance for waifusearch. Should've done this refactor a bit sooner heh. :^)
https://files.catbox.moe/7k95kq.7z
a82f71469d2db60834e12a03e76621201732afa1ff1547554d62e868f29447c5 waifusearch-0.1d.tar.xz
Cheers.
>===
-edit sorting terms comment
Edited last time by Chobitsu on 12/30/2020 (Wed) 23:55:36.
>>7966 >to the loop inside locate_terms() * lol
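One caution with a parallel loop like the one in locate_terms(): writing into a shared map and a shared vector from several threads at once is a data race. A possible race-free shape for the lookup is sketched below, where each term only ever writes to its own slot of a pre-sized result and the word map is only read. This is an illustration under assumptions, not the tool's code; lookup_terms(), Posts and Word_map are simplified stand-ins. The sketch runs sequentially as written, but since no two invocations of the lambda share writable state, an execution policy could be passed as the first argument to std::transform.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

using Posts    = std::set<std::uint32_t>;        // simplified stand-in: post ids only
using Word_map = std::map<std::string, Posts>;

// Hypothetical race-free lookup: slot i of the result belongs to terms[i]
// alone, and the map is only read.
std::vector<const Posts*> lookup_terms(const Word_map& w_tp,
                                       const std::vector<std::string>& terms)
{
  std::vector<const Posts*> found(terms.size(), nullptr);
  std::transform(terms.cbegin(), terms.cend(), found.begin(),
                 [&w_tp](const std::string& term) -> const Posts* {
                   const auto it = w_tp.find(term);
                   return it != w_tp.cend() ? &it->second : nullptr;
                 });
  return found;  // a nullptr entry marks an unfound term
}
```

Collecting the unfound terms and copying the valid ones can then happen in a plain sequential pass over the result, which also avoids using exceptions for the not-found case.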
Alright, to wrap things up for the time being for waifusearch, I've broken down the timings a bit further in the costliest region to get more details. I've also added a bit of a safety check for the file writes which is probably a good idea in a multi-threaded context. I completed the javadoc comments for each function such as they are. I think I've worked out all the kinks as far as I can tell at this point. We'll go ahead and add a link to this tool in the OP of the Library thread as well now for anyone interested in research here.

Sometime next year I'll probably make time to get this building on the Raspberry Pi, so any Anons following along with that project can use waifusearch on their boxes at will. As things allow, eventually I'll integrate all this into the Bumpmaster project to search across the entire collections of boards an anon might have (I currently keep around 60 or so boards archived with BUMP).

I also have a plan to create a small course around this tool in the C++ Learning Thread. Having a specific project to learn from will probably be a beneficial thing for all concerned. And this utility is in fact doing a few interesting things that will be helpful for anyone in the class to know about in detail.

That aside, the ultimate goal here ofc is to lay a foundation for our waifus to have a good search system to retrieve any textual information generally speaking, and not just to shitpost along with us. But little-by-little. And who really knows? This may very well turn out to be the year of the robowaifu for us all, or at least bring additional good progress on that timeline. Here's hoping! :^)

BTW, I thought it would be fun to adopt Sumomo as the official mascot of the project, in honor of the many times she served in doing searches!
201231 - v0.1e
--------------
-add add'l timing sections for copy and sort operations
-add write_file_ready() to rw_text_utils.hpp
-add std::accumulate in disp_timing()
-for locate_terms(), wrap processing after loop in sanity check block
-add have_results flag & test
-add javadoc comments for functions
-minor function re-ordering

https://files.catbox.moe/uc4uit.7z
7c13ced53ed67e5d21e7cdd9456a57d2f2922f13643b91a8abd80f612ef49477 waifusearch-0.1e.tar.xz
Cheers, and I hope you all have a very Happy New Year /robowaifu/ . Godspeed to us all this year. :^)
>>7975
Thanks. This all sounds great. Didn't really use it yet, because I tried to get another piece of software done by the end of the year, but it turned out to take more time and I dropped out for a while doing other stuff like sorting data.
>building on the Raspberry Pi
Thanks, I was looking into it. I think "Threading Building Blocks" would need to come from Github, while jsoncpp and meson are available in the Raspbian repo.
> adopt Sumomo as the official mascot of the project
Great idea, she's great.
> Happy New Year /robowaifu/
Thanks again. I'm sure we will have a good year, and I wish everyone the same.
>>7999
>Thanks, I was looking into it. I think "Threading Building Blocks" would need to come from Github, while jsoncpp and meson are available in the Raspbian repo.
Welp, I have good news and bad news Anon. First the good news; TBB is included already in the Raspbian repo:
>
If you search Synaptic for TBB, you'll see
>libtbb-dev
>parallelism library for C++ - development files
Now the bad news; GCC8 doesn't actually support the C++17 parallelism execution policies. Only version 9 (and later) does. My apologies, I wasn't aware and ATM only g++8 is on the RPi.
Now, for the other good news... :^) I was able to get waifusearch working on the RaspberryPi despite that (though it runs more slowly):
>
There are a number of changes I needed to make to pull it off, so it will take me a bit to get things cleaned up and to explore anything I can improve upon with it. I'll try to package up a new revision by this weekend (possibly a RPi-specific one) and post it ITT. Stay tuned Anon.
>>7999 >>8017 I went ahead and did a full tutorial of the build process in the Sepplesberry thread for everyone, Anon. >>8026 Cheers.
Implemented an easy to use named-timer mechanism today. A container is maintained system-wide for all timings, and whenever a Timer object is created it begins timing automatically, and when it goes out of scope or is explicitly stopped it calculates the difference and stores that into the system container.
>Timer.hpp
#pragma once

#include <chrono>
#include <string>
#include <utility>
#include <vector>

namespace rw {

// alias'
using Clock   = std::chrono::steady_clock;
using Time_pt = std::chrono::time_point<Clock>;

// globals
// -note: each entry holds an elapsed duration, stored as a Time_pt offset
//  from the clock's epoch
static std::vector<std::pair<std::string, Time_pt>> named_timings{};

// usings
using std::string;

class Timer {
 public:
  Timer(const std::string& timer_name)
      : name{timer_name}, start{Clock::now()}
  {}

  ~Timer()
  {
    if (! have_stop)
      stop();
  }

  void stop()
  {
    if (have_stop)  // only ever record a timing once
      return;

    named_timings.emplace_back(name, Time_pt{Clock::now() - start});
    have_stop = true;
  }

 private:
  const string  name;
  const Time_pt start;
  bool          have_stop{false};
};

}  // namespace rw
Using it is quite simple, and you can rely on the stop() function, or simply let it fall through and RAII will do the trick. Here's an example of both ways:
>snippet.cpp
void foo()
{
  Timer t{"foo"};
  sleep_for(Sec{1});
}

int main()
{
  Timer t{"main"};
  cout << "Hello World!\n";
  t.stop();

  foo();

  if (disp_timings)
    // rw::disp_timings();
    rw::disp_us_timings();
}
Note how clean and simple the RAII form is when appropriate. I'll probably use this all over my general-purpose code now as a form of perma-profiling since the perf impact is practically nil and iterating the named_timings container is strictly a standard approach.
Example output from above snippet:
Hello World!
main timer took: 74 us
foo timer took: 1000231 us
OK, so I've made a few developer-oriented improvements to waifusearch, nothing that really changes its usage but makes the code a little cleaner internally. I've integrated my nifty little 'fire-and-forget' named timer class into the tool, and also changed the timings display to make it a bit more readable (raw microseconds and percentage-of-total numbers are now grouped together with the timer name itself).
>
I also added a flag to the meson.build that should stop the annoying notes coming up from GCC when you build on the RaspberryPi. The comments in that file have been tweaked a little for the RPi as well.
>version.log
210112 - v0.1g
--------------
-add Timer.hpp for cleaner automated timings
-move to disp_pct_timings() using new Timer class
-move user-configurable globals to user_flags.hpp
-simplify parallel execution guard macros
-move to std::copy algorithm in assemble_words()
-relocate assemble_words() to rw_text_utils.hpp
-add build comments for RPi version to meson.build
-add '-Wno-psabi' project arg to meson.build
-minor code-consistency cleanups

https://files.catbox.moe/dlijqn.7z
189aea5e788b4d163f918407a03052633a9cc6b3e33460688efbee9c06972128 waifusearch-0.1g.tar.xz
I hope you're doing well so far this year, /robowaifu/ . Cheers.
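A grouped 'microseconds plus percentage-of-total' display could be computed along these lines. This is a sketch only: pct_line(), total_time() and the Timings alias are hypothetical names, and the actual disp_pct_timings() may well work differently. The std::accumulate call mirrors the one mentioned in the v0.1e log.

```cpp
#include <chrono>
#include <cstdio>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

using Usec    = std::chrono::microseconds;
using Timings = std::vector<std::pair<std::string, Usec>>;

// Sum of all recorded timings.
Usec total_time(const Timings& ts)
{
  return std::accumulate(ts.begin(), ts.end(), Usec{0},
                         [](Usec sum, const auto& t) { return sum + t.second; });
}

// Hypothetical formatter: "name: N us (P%)" for one timing out of a total.
std::string pct_line(const std::pair<std::string, Usec>& t, Usec total)
{
  const double pct =
      total.count() > 0 ? 100.0 * t.second.count() / total.count() : 0.0;

  char buf[128];
  std::snprintf(buf, sizeof buf, "%s: %lld us (%.1f%%)", t.first.c_str(),
                static_cast<long long>(t.second.count()), pct);
  return buf;
}
```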
OK, I've finally gotten around to adding command-line parsing to waifusearch. This will make it far more convenient to change its behavior a little. For example, switching between outputting in-board crosslinks, or full hyperlinks.
>
>version.log
210115 - v0.1h
--------------
-add command-line arguments parsing
-patch minor RPi using bug

No other improvements included with this version, but the benefit of having program arguments is more than valuable enough to make a special push just for that.
https://files.catbox.moe/bkflel.7z
312a27a8cbb86e0b8f808fda0c36695fefc91c73ed7c870b6a21384f8752914b waifusearch-0.1h.tar.xz
Cheers.
>>8103 >the archive of /robowaifu/ thread JSONs is available for researchers >latest revision 210120: https://files.catbox.moe/0qyvy6.7z Just in case the place is deplatformed, here's today's version. It may be a while before I do anything else with waifusearch (beginning to mostly work on Bumpmaster instead now) so if you have anything else in it you'd like to see I'd say ask me about it soon while the code is still fresh in my mind. So, the latest versions of the JSON archives should be posted occasionally in the Library thread's OP. Cheers.
OK, so I made a couple of small display improvements and patched a minor bug related to unfound terms display.
>version.log
210126 - v0.1i
--------------
-show unfound terms early in the results display
-strip unfound terms from the final search phrase display
-add clean_unfounds() wrapper
-add squash_n_trim() func to rw_text_utils.hpp
-rename trim_lr() func in rw_text_utils.hpp
-rename rm_substrings() func in rw_text_utils.hpp
-remove unnecessary short-circuits in find_intersects()
-patch minor display bug with unfound terms

https://files.catbox.moe/yuq4z3.7z
3b7a5d7a8ff662c448a1f71d57fb6d82f170019806ddd7fe958343edce29d280 waifusearch-0.1i.tar.xz
>>8293 Thanks, I hope I can make it to try it out today.
>>8297
>software building instructions
>>7933
>RaspberryPi building instructions
>>8026
The RPi instructions are probably a little more hand-holding and basically apply to any Linux box. Just ask ITT if you get stuck Anon.
Comments from inside the file meson.build :
# ===
# DEPENDENCIES: jsoncpp, mesonbuild (and other than on RaspberryPi), tbb
#   You should install these from your distro's package manager if possible.
#
# As of this date in 2021, these are the sources of the 3 external dependencies;
#
#  -JsonCpp
#   https://github.com/open-source-parsers/jsoncpp
#  -Threading Building Blocks
#   https://github.com/oneapi-src/oneTBB
#  -Mesonbuild (optional, but recommended)
#   https://github.com/mesonbuild/meson

# ===
# BUILDING WITH MESON (from the code directory):
#  -Initialize the project as a mesonbuild one.
#     meson build
#   -note: This step is needed only once, or again after a version change.
#
#  -Build, Execute.
#     cd build && ninja && cd ..
#     build/waifusearch
#   https://mesonbuild.com/Reference-manual.html

# ===
# BUILDING WITH g++ (from the code directory):
#  -Build, Execute.
#     g++ main.cpp -std=c++20 -O3 -ljsoncpp -ltbb -o waifusearch
#   -(or on RPi):
#     g++ main.cpp -std=c++2a -O3 -ljsoncpp -lstdc++fs -o waifusearch
#     ./waifusearch

#  -With either build approach, only the final execution step is
#   needed anytime thereafter.
Made a minor usage change where hyperlinks are now the default (change this with the '-y' flag). My original thinking on this was looking towards the future when waifusearch will be fully integrated with the upcoming Bumpmaster imageboard system. But until then while this tool is still just a simple CLI-based search, it's more useful to present hyperlinks in the terminal instead I think. Other than that, made some minor display cosmetics and one small, proactive code modification.
>version.log
210128 - v0.1j
--------------
-rename 'make_crosslinks' to 'make_hyperlinks' (hyperlinks are now the default)
-move clean_unfounds() earlier in processing chain (avoiding a potential bug)
-only display 'ORDERED' header if both types are present
-similarly, only segregate counts if 'UNORDERED' results are present
-minor javadoc edits

https://files.catbox.moe/52t4ku.7z
224c28818ed61406419550c08f4bd25544838950eddb28a4ec1bfac60b876448 waifusearch-0.1j.tar.xz
Should have the build file squared this time, cheers.
>>8337
Your work is much appreciated. I'm only looking into it now, though, for various reasons. One is that I wanted to find a way to check C++ code for dangerous commands before using something I downloaded from an anonymous source on the web. Also, maybe understanding a bit of it. I'm using cppcheck now, and got some errors:
Checking .../waifusearch-0.1j/rw_text_utils.hpp ...
/waifusearch-0.1j/rw_text_utils.hpp:27:0: error: failed to evaluate #if condition, division/modulo by zero [preprocessorErrorDirective]
#if __has_include(<jsoncpp/json/json.h>) // RPi's version
Maybe it works nevertheless, but cppcheck stops checking the file with that error in it.
I think this stuff here is unimportant or just warnings about what a file does:
[spoiler]
.../waifusearch/waifusearch-0.1j/CLI11.hpp:7109:0: error: Exception thrown in function declared not to throw exceptions. [throwInNoexceptFunction]
for(const App_p &com : subcommands_) {
^
Checking ... /waifusearch/waifusearch-0.1j/rw_cli_args.hpp: CLI11_HAS_FILESYSTEM;__has_include...
Checking ... waifusearch/waifusearch-0.1j/rw_cli_args.hpp: CLI11_HAS_FILESYSTEM;__has_include;_GLIBCXX_RELEASE;__cpp_lib_filesystem...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: CLI11_HAS_FILESYSTEM;__has_include;__GLIBCXX__;__cpp_lib_filesystem...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: CLI11_HAS_FILESYSTEM;__has_include;__MAC_OS_X_VERSION_MIN_REQUIRED...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: CLI11_HAS_FILESYSTEM;__has_include;__cpp_lib_filesystem...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: _HAS_STATIC_RTTI;__GXX_RTTI;__cpp_rtti...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: _MSC_VER...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: _WIN32...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: __CUDACC__...
Checking .../waifusearch/waifusearch-0.1j/rw_cli_args.hpp: __GNUC__...
[/spoiler]
>>8387
That's fine, it's a good idea actually. The first one is failing because cppcheck isn't properly evaluating the __has_include preprocessor operator (a long-standing GCC extension, standardized in C++17):
https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005finclude.html
Not too surprising, but it should be checking pretty common preprocessor features like that.
The second one is probably a minor bug on the part of the CLI11 library writer. It's pretty irrelevant in this case, in particular given the quite simplistic way I'm using the library to just check the command argument and then set the global user flags. These are OK Anon.
>Also, maybe understanding a bit of it.
I wrote this software myself, so ask away if you ever have any questions Anon. BTW, I'll be doing some new work on waifusearch to add Boolean logical operators into the search capabilities. This will obviously be a helpful improvement. Starting with OR today.
>>8390
I tried this, after reading the link. I probably did something wrong. It didn't work.
#if defined __has_include && if __has_include(<jsoncpp/json/json.h>) // RPi's version
# include <jsoncpp/json/json.h>
# else
# include <json/json.h>
#endif
>>8549
If you want to perform the defined test first, then use this:
#if defined __has_include
#  if __has_include(<jsoncpp/json/json.h>)  // RPi's version
#    include <jsoncpp/json/json.h>
#  else
#    include <json/json.h>
#  endif
#else  // no __has_include available: fall back to the normal location
#  include <json/json.h>
#endif
BTW, if you aren't building for the RPi, then you don't need any of this. Just include the JsonCpp header from its normal location:
#include <json/json.h>
Let me know how it goes if you get stuck Anon.
>>8552 Oh, I just needed to comment it out, then it worked. Also, I only installed the dev version libjsoncpp after doing this check. I had the wrong one. Finally I installed it, and it's great. Didn't even know till I installed it, that it only searches the downloaded json. No problem, just a surprise. I'll look into Bump next.
>>8554 >Didn't even know till I installed it, that it only searches the downloaded json. No problem, just a surprise. Yeah, one of my long-term design goals is to lay a foundation for text processing for our robowaifus that will work A) At lightning speed. B) Directly on-board mobile, tiny processors (ie RPi & other SBCs). These are basically interrelated concerns in a sense and means both a local copy of data, and also basically necessitates a compiled-code solution in either Assembler, C, or C++ . I chose C++ for what I hope are (or at the least will be) apparent reasons to everyone in the end. >I'll look into Bump next. I actually am working on the next version (I'm that author as well) and intend to integrate waifusearch directly into the tool so anyone can search across whole libraries of shitposting in a flash. Ostensibly it's because it will be a nice feature to have in Bumpmaster. But in fact, it's to lay a groundwork for the AI in our waifus to be able to gain context quickly from ad-hoc, ad-lib, de-facto slanguage posts from Anons in English. And to do it very fast on minimal hardware. After all, I want my waifu to be able to shitpost the bantz with me right? :^)
>>8556 >waifus to be able to gain context quickly from ad-hoc, ad-lib, de-facto slanguage posts from Anons Yeah, that's gonna be awesome.
>>8556
What I'm missing in BUMP-2d are building instructions. I don't know what kind of libcurl library it needs, and don't want to try all of them out.
>>8560 It's still going to take a lot of work Anon, but yes it will be. The techniques I'm trying to master in waifusearch are basically only just prep work. 'Assistants to the Chief', as it were. The real heavy-lifting will come from libraries like mlpack & Armadillo. And even that is just the 'sharp tools for the Carpenter'. It will take the teamwork of multiple anons willing to learn & use compiled languages well enough, and some with a good understanding of the mathematical basis of ML, to join in the team and begin contributing. Hey, Lewis & Clark didn't actually build the railways out to the West themselves! They simply blazed a trail for others to follow across. >tl;dr It will go faster if you and others crack the books, and join in Anon. :^) >>8562 Any reasonably current version of the dependencies, including dev libcurl, should all work. Sorry about the lack of adequate build instructions Anon. It's roughly equivalent to those of waifusearch, since it also uses mesonbuild. If you can wait a week or so, I'll try to assemble an interim release for you where I correct this and I'll link it here. It won't really be a new version per se, but it should at least have clearer build instructions for it. The new system is called Bumpmaster, and will be a good bit more powerful. But it probably won't be till late spring/early summer when I can give it more focus that the first version will be released.
>>8565 I don't really need it anytime soon, just FYI: At first it seems like this line in meson.build is what causes the error:
>curl_dep = cxx.find_library('curl')
I have libcurlpp-dev and libcurl4 installed. It returns: "meson.build:21:0: ERROR: C++ library 'curl' not found", but if I comment out the line above, then the next line just causes an error instead.
Version: 0.52.1
Source dir: /.../BUMP-02d/BUMP-0.2d
Build dir: /.../BUMP-02d
Build type: native build
Project name: BUMP
Project version: 0.2d
C++ compiler for the host machine: c++ (gcc 8.3.0 "c++ (Raspbian 8.3.0-6+rpi1) 8.3.0")
C++ linker for the host machine: GNU ld.bfd 2.31.1
Host machine cpu family: arm
Host machine cpu: armv7l
>>8576 OK, I'll do my best to make sure that BUMP builds and runs on the RaspPi first before I push the next version Anon. (I created BUMP before I thought of having C++ classes on /robowaifu/ .)
Another suggestion, which would probably be easy to implement: It would be great if Waifusearch could be called directly in the shell with some search string and would then write the result to stdout. I think this is necessary for calling it from other programs as well, so that the calling program can then work with the result.
>>8620 Good suggestion, thanks. I understand that would be a better fit with the Unix Philosophy. I simply wanted to do a boatload of different searches when I first got it working and so I did a 'search loop' instead of the more normal approach.
/**-----------------------------------------------------------------------------
  @brief Front end to search-loop the word@post map 'w_tp' for user's query terms.
  @param w_tp [IN] The word/thread+post map to search through.
*/
void do_search(Map_words_2_posts& w_tp)
{
  //...
It should be rather straightforward to make the system run in a single-shot mode instead. When I make time, I'll add a flag, probably -s with a string argument to it that should work. R/n I'm working on all the weird edge cases in doing simple Boolean ORs. I should have that ready by this weekend I think, and then I promised Anon to have a go at getting BUMP working on RPi. Once those are out of the way I'll add your flag so it can be scripted, etc., and I'm planning to go full-CAS with /robowaifu/ text searching at some point after that so we can really spell out exactly what kinds of things to look for for our waifus. I should probably just skip that whole ordeal and go with the regex library, but a) I want to understand it more deeply first before I just give over >>8568, and b) it's just possible that I may be able to eventually turn this system into a fully-distributed system that can run searches concurrently across many machines at once. If we can pull that off here, then chewing through various large corpora at high speed should be a snap for our waifus.
>>8622 >and I'm planning to go full-CAS I just checked and was surprised to see so many different CAS, so I'll be explicit: https://en.wikipedia.org/wiki/Computer_algebra_system I mean I want us to be able to ask our waifus for detailed searches that might look something like this in the underlying C++ code's execution path:
waifusearch> (Chii OR Sumomo) AND Hideki NOT (Hibiya OR Yumi) AND School
Probably not a great example off the top of my head but you should get the picture. A long way from an NLP front-end, but eventually that front-end would need to produce something like the above to run through a lot of data. Since that's actually a useful enough feature (and simpler!) I'll try to get that underlying part working very fast, first.
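To make the idea concrete, a Boolean query like the one above can be evaluated as set operations over the post IDs each word appears in. This is only a sketch under assumptions: the Index type, the helper names, and the left-to-right reading of the query (with NOT treated as 'AND NOT') are all made up here, not taken from waifusearch.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <map>
#include <set>
#include <string>

using Post_ids = std::set<std::uint32_t>;
using Index = std::map<std::string, Post_ids>;  // word -> posts containing it

// Posts containing 'word' (empty set if the word is unknown).
Post_ids posts_for(const Index& idx, const std::string& word) {
  const auto it = idx.find(word);
  return it == idx.end() ? Post_ids{} : it->second;
}

Post_ids op_or(const Post_ids& a, const Post_ids& b) {
  Post_ids r;
  std::set_union(a.begin(), a.end(), b.begin(), b.end(), std::inserter(r, r.end()));
  return r;
}

Post_ids op_and(const Post_ids& a, const Post_ids& b) {
  Post_ids r;
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(), std::inserter(r, r.end()));
  return r;
}

// "a NOT b" is read here as "a AND NOT b".
Post_ids op_not(const Post_ids& a, const Post_ids& b) {
  Post_ids r;
  std::set_difference(a.begin(), a.end(), b.begin(), b.end(), std::inserter(r, r.end()));
  return r;
}

// (Chii OR Sumomo) AND Hideki NOT (Hibiya OR Yumi) AND School,
// evaluated left to right (an assumption about operator precedence).
Post_ids example_query(const Index& idx) {
  auto r = op_or(posts_for(idx, "Chii"), posts_for(idx, "Sumomo"));
  r = op_and(r, posts_for(idx, "Hideki"));
  r = op_not(r, op_or(posts_for(idx, "Hibiya"), posts_for(idx, "Yumi")));
  return op_and(r, posts_for(idx, "School"));
}
```

An NLP front-end would eventually have to emit a tree of these three operations. Hard-coding one query just shows what the back-end has to support.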
>>8622 > I'll add a flag, probably -s with a string argument Yeah, this is the way. :) >go with the regex library If it was just for some search engine, then there's also Elastic Search. Don't know much about it, but it seems to be good. However, the interesting thing about your project might be that it is so small that it might fit into a small app for a phone or tablet. I already mentioned BUMP somewhere a while ago, to people who might be interested in building something like the Omnichan app. >I want to understand it more deeply first before Yeah I understand, and won't criticize that notion. Just keep in mind that building on already existing stuff might let you build more powerful things faster. >full-CAS This looks very useful.
>>8624 >However, the interesting thing about your project might be that it is so small that it might fit into a small app for a phone or tablet. Exactly. My target platform is low-end SBCs like the RPi, actually. I want our robowaifus to be able to interact with us verbally in realtime totally disconnected from other computing resources in a fast and energy-efficient way. Onboard, compact data sources, and efficient, compiled code (or even critical sections of actual machine code) are the only way to pull that off. This software would easily run 100 times over in a modern phone. >protip: waifusearch> Omnichan app :^)
OK so here's the new version of waifusearch. We've moved up a point level. We now support using a Boolean OR to combine different (but usually related) phrases into the same search. This is definitely the largest modification yet to the code. It not only supports Boolean OR for now, but also refactors the architecture to support more sophisticated CAS (computer algebra system) type operations on the searches in the future. Performing a one-shot search is also supported now using the -s flag, combined with a string argument containing the search text. Please note, this isn't very performant ATM because the code has been designed to process the JSON text into an efficiently-traversed memory structure, and then to reuse that same structure over and over. Performing a one-shot means (for now) re-doing that front-loading each time. In the future, I plan to serialize this already-processed data out to a disk file. That would then provide the ability to read that file back directly into the waifusearch memory structures instead of recreating them from scratch each time. But for now -- while it actually works -- doing a one-shot isn't anywhere near as responsive as normal searches are.
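For the planned serialization step, here is a minimal sketch of writing a word-to-post-IDs map out and reading it back. It isn't the actual waifusearch structure: the Index alias, the one-line-per-word text format, and the function names are assumptions for illustration, and it assumes words contain no whitespace.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory structure: word -> IDs of posts containing it.
using Index = std::map<std::string, std::vector<std::uint32_t>>;

// Write one line per word: the word, then its post IDs, space-separated.
void save_index(const Index& idx, std::ostream& out) {
  for (const auto& [word, ids] : idx) {
    out << word;
    for (const auto id : ids) out << ' ' << id;
    out << '\n';
  }
}

// Rebuild the map from the format written by save_index().
Index load_index(std::istream& in) {
  Index idx;
  for (std::string line; std::getline(in, line);) {
    std::istringstream fields{line};
    std::string word;
    if (!(fields >> word)) continue;  // skip blank lines
    auto& ids = idx[word];
    for (std::uint32_t id; fields >> id;) ids.push_back(id);
  }
  return idx;
}
```

With a std::ofstream and std::ifstream in place of the generic streams, a one-shot run could load the saved file instead of re-parsing every JSON file. A binary format would likely load faster, but text is easier to inspect while the structures are still changing.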
>version.log
210220 - v0.2a
--------------
-refactoring to support Boolean operators 'OR' , '|' , in search specification
-add one-shot search capability, suited to scripting waifusearch
-add search phrase/words to listing output
-use 'waifusearch' prompt instead
-add disp_offsets user flag
-add search_once user arg
-add do_Boolean_ops() func
-add map_phrase_groups() func
-add match_phrases() func
-add search_n_tag() func
-add sep_search_phrases() func
-ren ld_phrase_map() func
-add cp_sort_disp() func
-add parse_to_csv() func in rw_text_utils.hpp
-add trunc_pad_str() wrapper in rw_text_utils.hpp
-add rw_gen_utils.hpp header
-add Vec_phrase_grps alias
-add multi-threading memory fencing
-remove redundant unfounds.clear() operation

https://files.catbox.moe/kiasfm.7z
5e0360c493892863a1741c4143a8e9c7320eccaf722a7c026fe9cbe082318687 waifusearch-0.2a.tar.xz
>>8678 Here's an example usage of the new Boolean OR feature:
build/waifusearch -y false -t true -e true
time to process local /robowaifu/ JSON data: 3017 ms
waifusearch> hello world | foo bar
term: foo    count: 15
term: bar    count: 18
term: hello  count: 63
term: world  count: 309
# terms found: 4

ORDERED:
========
THREAD SUBJECT                     POST LINK
C++ General                        >>3717   foo bar
"                                  >>6063   hello world
"                                  >>6095   "
"                                  >>7921   "
"                                  >>7933   "
"                                  >>8057   "
Selecting a Programming Language   >>134    "
Robowaifu-OS & Robowaifu-Brain(c   >>208    "
Modern C++ Group Learning Thread   >>5420   "
"                                  >>5422   "
"                                  >>5423   hello world, foo bar
"                                  >>5425   foo bar
"                                  >>6529   hello world
Haute Sepplesberry Cuisine TBH     >>5730   "
"                                  >>6313   "

UN-ORDERED:
===========
THREAD SUBJECT                     POST LINK
C++ General                        >>1075   (foo, bar)
Robowaifu fiction to promote the   >>50     (hello, world)
/robowaifu/ Embassy Thread         >>5158   "
Modern C++ Group Learning Thread   >>5424   (foo, bar)

' hello world | foo bar ' [15 : 4] = 19 results
--------------------------------------------------------------------------------
total:   sea:74  loc:238  ldm:33  dbo:19  snt:86  mch:22  us
472 us     :16     :50     : 7     : 4     :18     : 5   %
--------------------------------------------------------------------------------
Note how when the two phrases both appear inside the same post, then both phrases also appear at the end of the result. Here's an example where two exact matches both appear inside the same post:
>>5423 hello world, foo bar
Whenever all words in a phrase appear in a post, but they aren't in the exact order specified in the search, then the words of the phrase will appear as CSVs inside parentheses:
C++ General >>1075 (foo, bar)
Robowaifu fiction to promote the >>50 (hello, world)
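The ordered/un-ordered distinction in that listing could be sketched roughly as below. These are not the match_phrases() or sep_search_phrases() functions from the release: the names and the Match enum are hypothetical, and an 'ordered' hit is assumed to mean the phrase's words appear as one contiguous run in the post.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

enum class Match { none, unordered, ordered };

// Split a query on '|' into its separate phrases.
std::vector<std::string> split_phrases(const std::string& query) {
  std::vector<std::string> phrases;
  std::istringstream in{query};
  for (std::string p; std::getline(in, p, '|');) phrases.push_back(p);
  return phrases;
}

// Split text into whitespace-separated words (surrounding blanks are dropped).
std::vector<std::string> split_words(const std::string& text) {
  std::vector<std::string> words;
  std::istringstream in{text};
  for (std::string w; in >> w;) words.push_back(w);
  return words;
}

// ordered:   the phrase's words appear as one contiguous run in the post
// unordered: every word of the phrase appears somewhere in the post
Match classify(const std::vector<std::string>& post,
               const std::vector<std::string>& phrase) {
  if (phrase.empty()) return Match::none;
  if (std::search(post.begin(), post.end(), phrase.begin(), phrase.end()) != post.end())
    return Match::ordered;
  const bool all_present =
      std::all_of(phrase.begin(), phrase.end(), [&post](const std::string& w) {
        return std::find(post.begin(), post.end(), w) != post.end();
      });
  return all_present ? Match::unordered : Match::none;
}
```

Running classify() once per phrase from split_phrases() gives a post one tag per phrase, which matches how >>5423 lists both 'hello world' and 'foo bar'. The linear scans are only for clarity, where a real index would look words up in a map instead.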
>>8678 Wow, that's great. Seems to be some big leap forward. >Performing a one-shot means (for now) always re-doing that front-loading each time Yes, that's always a problem. I ran into the same problem with my chatbot and graph parsing. The alternative to storing it, would be to let the parsing be done in a program which runs in a daemon mode, while the search part can be called by another program.
>>8688 >daemon Yes, I considered exactly that as the solution for a good bit later when we're actually implementing such a tool in a production runtime for our robowaifus. And as you basically indicated, I imagine there are entire classes of solutions that would require multiple, cooperating components for their systems to function correctly. For example, when we adopt a Neuromorphic response strategy for our robowaifu's 'nerve & muscle' sense/response cycles, we'll need to embed a tiny real-time kernel directly into the sensor hardware to provide for instantaneous responses. This will enable her to have self-defense for things like avoiding dangerous, hot or sharp objects for example -- same as the designs we bios enjoy having. OTOH, within a short time-frame (say 5ms), there may also be the need for her higher-order functions to kick in and direct her to go on in and proceed -- even in the face of danger. Again (but in a little different, 'human being' way) same as we people do. To my thinking this simple example spells out one of the many design-separation boundaries between static & one-shot/repetition processing that will be needed for an effective Robowaifu Systems Engineering design strategy.
Updated BUMP to work with the RPi, but the instructions for building & running should work similarly for other Linux distros. >>8769
>>8562 >>8576 I meant to have this for you sooner Anon. >>8769
