Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially slow code in DocTools::merge_from #9125

Closed
nlupugla opened this issue Feb 18, 2024 · 4 comments · Fixed by godotengine/godot#88514
Closed

Potentially slow code in DocTools::merge_from #9125

nlupugla opened this issue Feb 18, 2024 · 4 comments · Fixed by godotengine/godot#88514

Comments

@nlupugla
Copy link

Describe the project you are working on

Godot.

Describe the problem or limitation you are having in your project

Noticed some potentially slow code in DocTools::merge_from https://github.com/godotengine/godot/blob/5f05e2b9b1a3fedcdd7ecb2ab976a2687fd6f19a/editor/doc_tools.cpp#L79. The current algorithm used to merge is O(n^2) when it could be O(n), where n is the number of members in a class. For example,
for each property in this, the following code block scans through all the properties in p_data to look for a match.

                for (int i = 0; i < c.properties.size(); i++) {
                        DocData::PropertyDoc &p = c.properties.write[i];


                        for (int j = 0; j < cf.properties.size(); j++) {
                                if (cf.properties[j].name != p.name) {
                                        continue;
                                }
                                const DocData::PropertyDoc &pf = cf.properties[j];


                                p.description = pf.description;
                                p.is_deprecated = pf.is_deprecated;
                                p.deprecated_message = pf.deprecated_message;
                                p.is_experimental = pf.is_experimental;
                                p.experimental_message = pf.experimental_message;
                                p.keywords = pf.keywords;
                                break;
                        }
                }

Describe the feature / enhancement and how it helps to overcome the problem or limitation

I haven't profiled to see whether the current algorithm is actually a problem, but I just thought I'd write up a formal issue to bring it to people's attention since the CI runs the doctool on every PR, so making it even a bit more efficient might help a lot.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

First step is profile to check whether it's actually a bottleneck. If it is, I have a few ideas to try out:

  1. Start the inner loop index j at the current value of the outer loop index i on the assumption that this and p_data usually have the same number of elements in the same order. Optionally, ensure both DocData instances are sorted before merging.
  2. Store the class members in a HashSet or HashMap so that the common members can be found more efficiently. Even though this would make the algorithm O(n), I doubt it would actually be faster because of the extra hashing overhead, but it's perhaps worth trying anyway.

If this enhancement will not be used often, can it be worked around with a few lines of script?

No.

Is there a reason why this should be core and not an add-on in the asset library?

Doctool is core.

@AThousandShips
Copy link
Member

AThousandShips commented Feb 18, 2024

I'd say to do a merge by keeping both i and j outside the loops and iterate skipping by comparison, a classic merge algorithm, assuming the inputs are sorted that will work, something like:

while (i < size_i && j < size_j) {
    while (i < size_i && names[i].compare(names[j]) < 0) {
        ++i;
    }
    while (j < size_j && names[j].compare(names[i]) < 0) {
        ++j;
    }
    if (i >= size_i || j >= size_j) {
        break;
    }
    // Do our merge
    ++i;
    ++j;
}

Will take a look at something when I have the time, I've done code like that in the past

Constructors and operators would need some special treatment but should work

@AThousandShips
Copy link
Member

AThousandShips commented Feb 18, 2024

Working on this code, it works currently, will test it out further

Edit: Basic implementation up

@nlupugla
Copy link
Author

That's awesome! Didn't expect anyone to get to this so quickly :)

Did you get a chance to profile it? I'd hate to send you down a rabbit hole that doesn't make much of a performance difference in the end.

@AThousandShips
Copy link
Member

AThousandShips commented Feb 18, 2024

Haven't profiled but it should be better, will do some tests later, but it does improve general code quality and stability in either case I'd say with my testing

Edit: Gonna do some benchmarking to see

Edit 2: Seems the benchmarking does show that this improves things, not a lot, but it does seem that the merging step takes 9 ms on my computer instead of roughly 30 ms

Here's the benchmark code to compare with:

diff --git a/main/main.cpp b/main/main.cpp
index 4c9c159f47..f87e02eb44 100644
--- a/main/main.cpp
+++ b/main/main.cpp
@@ -3257,10 +3257,18 @@ bool Main::start() {
                GLOBAL_DEF(PropertyInfo(Variant::INT, "dotnet/project/assembly_reload_attempts", PROPERTY_HINT_RANGE, "1,16,1,or_greater"), 3);
 #endif

+               OS::get_singleton()->benchmark_begin_measure("Documentation", "Total");
+
+               OS::get_singleton()->benchmark_begin_measure("Documentation", "Generation");
+
                Error err;
                DocTools doc;
                doc.generate(gen_flags);

+               OS::get_singleton()->benchmark_end_measure("Documentation", "Generation");
+
+               OS::get_singleton()->benchmark_begin_measure("Documentation", "Loading");
+
                DocTools docsrc;
                HashMap<String, String> doc_data_classes;
                HashSet<String> checked_paths;
@@ -3299,24 +3307,44 @@ bool Main::start() {
                ERR_FAIL_COND_V_MSG(err != OK, false, "Error loading classes from: " + index_path + ": " + itos(err));
                checked_paths.insert(index_path);

+               OS::get_singleton()->benchmark_end_measure("Documentation", "Loading");
+
+               OS::get_singleton()->benchmark_begin_measure("Documentation", "Merging");
+
                print_line("Merging docs...");
                doc.merge_from(docsrc);

+               OS::get_singleton()->benchmark_end_measure("Documentation", "Merging");
+
+               OS::get_singleton()->benchmark_begin_measure("Documentation", "Erasing");
+
                for (const String &E : checked_paths) {
                        print_line("Erasing old docs at: " + E);
                        err = DocTools::erase_classes(E);
                        ERR_FAIL_COND_V_MSG(err != OK, false, "Error erasing old docs at: " + E + ": " + itos(err));
                }

+               OS::get_singleton()->benchmark_end_measure("Documentation", "Erasing");
+
+               OS::get_singleton()->benchmark_begin_measure("Documentation", "Saving");
+
                print_line("Generating new docs...");
                err = doc.save_classes(index_path, doc_data_classes);
                ERR_FAIL_COND_V_MSG(err != OK, false, "Error saving new docs:" + itos(err));

+               OS::get_singleton()->benchmark_end_measure("Documentation", "Saving");
+
+               OS::get_singleton()->benchmark_begin_measure("Documentation", "Deleting");
+
                print_line("Deleting docs cache...");
                if (FileAccess::exists(EditorHelp::get_cache_full_path())) {
                        DirAccess::remove_file_or_error(EditorHelp::get_cache_full_path());
                }

+               OS::get_singleton()->benchmark_end_measure("Documentation", "Deleting");
+
+               OS::get_singleton()->benchmark_end_measure("Documentation", "Total");
+
                OS::get_singleton()->set_exit_code(EXIT_SUCCESS);
                return false;
        }

With some different steps

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants