Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API support for string slice parameters instead of null-terminated strings #494

Open
bitshifter opened this issue Jan 20, 2016 · 38 comments

Comments

@bitshifter
Copy link

I discovered when trying to call ImGui from Rust that Rust strings are not null terminated and thus using there are some hurdles to jump through to produce a null terminated string to be use with ImGui.

I thought it would be convenient if I could use character data arrays instead of null terminated strings with ImGui.

I've made a branch with a proof-of-concept of this idea by replacing char* parameters in ImGui functions with ImStr - a struct containing a pointer to the start and the end of a string slice. It has a implicit char* constructor, so no calling code needs to be changed. I have a proof of concept branch at https://github.com/bitshifter/imgui/tree/imstr.

I haven't converted everything to ImStr, but it's enough to get the idea I think.

There may actually be some advantages in doing this for the C++ API:

  • string_view is being added to C++ - if you aren't already familiar with it, string_view is simply a struct with a const char* pointer to the start and end of a char array, so you can use it to reference char data without needing a null terminator. I've used something similar in code bases that were suffering from severe std::string abuse to reduce a lot of string allocating and copying. It's pretty handy, until you need a null terminated string, which is why it would be nice if ImGui supported this idea
  • Most C++ codebases use some kind of string class or classes, whether it's std::string or a custom string. They usually require calling c_str() or something to get a char*. With ImStr, users could add an implicit constructor that creates an ImStr from their string type, meaning they can easily use it with ImGui. String types usually know they length of the string, avoiding the need call strlen in ImGui
  • For FFI it helps Rust, and probably other languages that either don't use null-terminated strings, or already know the length of their strings avoiding unnecessary strlen calls.
  • On the down side, this probably isn't convenient for C users, but it could at least be hidden behind the C API.

Performance wise, it didn't seem to make a big difference in my profiling, I thought strlen might rise up in the profile but it didn't seem to for me at least.

If you are interested in this idea, I can try finish off converting the rest of the char* parameters a make a PR.

@MrSapps
Copy link

MrSapps commented Jan 20, 2016

string_view will never be used because its C++11

@Cthutu
Copy link

Cthutu commented Jan 20, 2016

Use std::string to produce your null terminated string: std::string(start,
end).c_str()

On Wed, 20 Jan 2016 at 07:36 Paul [email protected] wrote:

string_view will never be used because its C++11


Reply to this email directly or view it on GitHub
#494 (comment).

@bitshifter
Copy link
Author

string_view will never be used because its C++11

I'm not suggesting that ImGui should use string_view and I haven't used it (or any other C++11) in this change. My point is that people who use ImGui may use string_view in their code and this change makes it easy use with ImGui, while still allowing people to use char*, or other sting types.

@bitshifter
Copy link
Author

Use std::string to produce your null terminated string: std::string(start, end).c_str()

This will cause many many allocations and string copies, which can be avoided with this change.

@ocornut
Copy link
Owner

ocornut commented Jan 20, 2016

edited

@bitshifter I'll have to get back to you but give me some time. This is potentially a big change to follow on and I don't have sufficient bandwidth at the moment. One of my concern, which is arguably hard to measure, is that I don't want to give the impression to people skimming at the library that this is a high-level library that uses lots of mysterious and heavy data structures. Note that I am merely talking about the impression here. C/C++ developers are very reluctant to use libraries and ImGui aim to combat that. char* carry a natural feeling of accessibility to many devs. Now, your point are valid and I don't want to reject it. Interested in your progress and thoughts down the line.

@Cthutu : very slow very heavy. The whole point of slices is to avoid it.

@bitshifter
Copy link
Author

@ocornut Yeah, I totally agree/understand your concern, so wouldn't be offended if the idea is ultimately rejected :)

I'll have a look at converting the remaining char* at some point, some of them might be a bit less straight forward than the labels so it would be interesting to see how it pans out.

@ocornut
Copy link
Owner

ocornut commented Jan 20, 2016

Some ideas:

  • I'd suggest to make ImStr as simple as possible aka const char* Begin (as members but no accessor). I realise it seems odd but default compilation settings for Visual Studio in Debug make even trivial normally inlined function calls have measurable cost. As I aim to minimize CPU footprint even in the worse condition. I also found it very liberating to ditch unnecessary accessors and also ditch size_t in favor of int (but here we don't store either).
  • This is something that can be investigated later (or by me) but I might like a system where the End point can be lazily evaluated, allowing for single pass on large blurb of texts instead of two passes (e.g. TextUnformatted is tuned to do single pass on all text that is clipped).
    That would mean the constructors would NOT do the strlen() but certain end points of the code would do it and store back only if they need it (defaulting to zero-terminated). (the act of storing back may imply passing ImStr by value for convenience in the various code using it).
    However I may be overthinking that, TextUnformatted already allows for End to be passed so I could drop the guarantee Begin-only version does one pass and request user to pass a proper slice if they want that guarantee, which is reasonable if you are going to submit kilobytes of texts. So maybe just ignore that paragraph for me, just laying out the idea out loud.

@ocornut
Copy link
Owner

ocornut commented Jan 20, 2016

A potential plan B (IF we can't get plan A to work) would be to make the necessary adjustments to master so that maintaining a fork is painless. I am also open to make that sort of work easier.

@bitshifter
Copy link
Author

Good point about function call overhead in debug builds. I wonder if anything can be done about the function call overhead of the ImStr constructors themselves.

I've committed some updates to my branch based on your feedback:

  • Simplified ImStr by removing all accessors and methods except for;
  • Lazily evaluate End pointer when needed, added ImStr::CalcEnd() to do this

I think the lazy evaluation works well, as many method calls don't need End to be set anyway, e.g. if all they are doing is calculating a hash.

I didn't do anything specific with TextUnformatted, it currently does a CalcEnd(), it was already doing a strlen before so no change in behaviour here.

Regarding a fork, it seems reasonable to have gestate this in a separate branch and see if there's any demand for it.

I'm not sure what would make maintenance easier. Code wise, the changes are reasonably straight forward, the only problem areas were were the new function signature would match old code (e.g. the new CalcTextSize(ImStr, bool, float) function can be called with old CalcTextSize(char*, NULL) and the NULL will be used as the bool parameter). Some RenderText calls also had this problem. They're pretty easy to spot though. Of course there maybe other things I haven't noticed yet.

I think what would probably help most with maintaining a fork is probably more along the infrastructure side. If there was someway to automate merging and building commits to master against the imstr branch that would remove a lot of manual work. It's possible that github/travis already does this.

@ratchetfreak
Copy link

There is a bug in your ImStr::CalcEnd() if ImStr is constructed with NULL (as happens with some default parameters like for column).

And why doesn't ImStrDup use CalcEnd itself?

@bitshifter
Copy link
Author

@ratchetfreak The intention is that ImStrdup and CalcEnd are only used internally by ImGui, which means it's easier to expect preconditions like ImStr.Begin being non-null before CalcEnd is called.

This is effectively the case in ImGui already (master, not my branch), often a text_end value is calculated by calling strlen on a char* text parameter, it is expected that the char* parameter is not null, but this is not checked at runtime.

Still, I am tempted to make CalcEnd a function which is completely internal to ImGui to prevent people from using it externally, since it is expected to be used in a specific way.

I've changed the signature of ImStrdup so hopefully it's clearer that it should be given a valid End pointer. The main reason I don't call CalcEnd in there is I think it's preferable that End is calculated as soon as possible if it's needed, to avoid repeatedly recalculating it in leaf functions like ImStrdup.

@ocornut I've made a few more commits porting more of the API over to use ImStr. It's mostly done now, just combo lists and list boxes and text filtering to go I think.

@ratchetfreak
Copy link

There is also formatting to be done

I suggest adding a ImFormatString(ImStr buffer, ImStr fmt, ...) that will check fmt.End and if it's null just pass to the char* version (because it expects a null terminated string anyway)

char* fmt_buf;
if(fmt.End){
    fmt_buf = alloca(fmt.End-fmt.Begin+1);
    memcpy(fmt_buf, fmt.Begin, fmt.End-fmt.Begin);
    fmtbuf[fmt.End-fmt.Begin] = 0;
} else {
    fmt_buf = fmt;
}
ImFormatString(buffer.Begin, buffer.End - buffer.Begin, fmt_buf, args);

@bitshifter
Copy link
Author

Yes that's true. I was intentionally leaving the format strings as char*, but as you point out the format copy is only required if End is actually set to something, so that doesn't seem so bad.

@bitshifter
Copy link
Author

A small update: I got a branch of the Rust imgui wrapper calling through my branch of imgui so that strings are passed from Rust to C++ without needed special macros in Rust or copying of strings. I also branched the cimgui library as well, since that's how Rust talks to imgui as it can't call C++ directly.

I made some observations from completing this work that I thought it might be useful to share.

I didn't end up changing any of the methods that take a format string to use an ImStr for the format string. I instead used overloads of ImGui methods that just take text as an ImStr. My feeling was it wasn't necessary to provide an ImStr format string, and was probably undesirable:

  • The format string passed to printf must be null terminated, so if a string slice is used a copy of the string must always be made, this would introduce additional overhead, especially for large strings
  • In the demo code and in my own ImGui code, it's more common that ImGui::*Text* methods are called with strings that don't require formatting at all, adding to my desire to not add unnecessary overhead from needing to copy strings
  • Rust (and other languages) have their own string formatting techniques, which often don't look like printf, so it doesn't necessarily feel natural to take advantage of ImGui's printf formatting from other languages

On the C++ side of things, I looked up some ImGui code I wrote a few months for a project that used string slices heavily. This code wasn't using my ImStr changes, since it was written earlier. I found that I called ImGui::*Text* methods like so:

ImGui::Text("%.*s", str.size(), str.data());

I used string literal format strings and I was able to call ImGui methods with string slices parameters by using printf format string precision. So it's certainly doable to use string slices with ImGui currently, as least with some of the API, though it's not as nice as just calling ImGui::Text(str).

  • On the C++ side, if you do require formatting, I think it's likely people will use a string literal format string at the call site.
  • If formatting is not required at all then C++ code could use the methods that take pre-formatted strings as ImStr parameters.

@ratchetfreak ratchetfreak mentioned this issue Feb 11, 2016
7 tasks
@josiahmanson
Copy link
Contributor

I like slices, but having a function call silently invoke a constructor seems like a bad idea. It seems like it would be better to have another function (or overload) that takes a pointer to the end of the string or a size.

@ocornut
Copy link
Owner

ocornut commented May 6, 2016

Sorry @bitshifter I haven't got around to react and help much with this topic. It's a very useful and interesting thing but just far low in the priority list from my angle at the moment. I'm hoping that keeping this branch alive and sync isn't too painful? At minimum if you think there are way I can help making updates less painful let me know.

@bitshifter
Copy link
Author

@josiahmanson my goals with this change is to avoid breaking user code as much as possible, avoid greatly increase the size of the API (i.e. through overloads) and avoid impacting users who just want to use char* (by requiring an end of string pointer). The ImStr type was the only way I could come up with to achieve this. A quick grep found around 140 functions which take char* parameters. The char* parameter is usually first, or followed by var args, which means I can't use a default value for end of string most of the time. Providing and alternative function that takes an end of string parameter would greatly increase the size of the API. So for now the silent constructor seems like the approach that best meets my goals. Keep in mind that the constructor doesn't do much other than assign two member variables.

@ocornut My intention was to just update the branch when a new dear imgui release was made. I don't think that's happened since I made the branch (unless I missed it) so I haven't synced so far. I do have a PR which merges the latest changes from master and builds them. So far everything merges cleanly, so I think it's mostly new functionality that I'll have to deal with. I'll let you know how it goes when I try updating.

@bitshifter
Copy link
Author

@ocornut I noticed 1.49 was out so I merged to latest, as it turns out I'd also missed 1.48. The PR I set up didn't do what I thought (was merging master against my unmodified master), so there were quite a few conflicts although most of them were simple to resolve. I can't think of much that would reduce the number of conflicts. I think having comments on the same line as declarations probably causes a conflict for a comment change. Also lack of white space between function declarations might be confusing git a bit. In saying that, changing style now would also generate a lot more conflicts!

I think I'll set up a PR against your master to keep track of breaking changes.

@bitshifter bitshifter mentioned this issue Jun 5, 2016
@ocornut
Copy link
Owner

ocornut commented Jun 5, 2016

I am getting awfully tempted to bite the bullet and merge the part that would typedef ImStr to const char* and use that.

@bitshifter
Copy link
Author

That would make merging imgui.h easier. It might pave the way for this change too I guess. There's a couple of caveats; printf format strings are still const char* and there's a few methods which take a const char* begin and an end which I've either left alone or converted to an ImStr.

@bjadamson
Copy link

I'd like just like to chime in my use case for this as well. I've recently started using imgui with rust for my project, and I'm really excited about the prospect of this branch getting merged someday. Right now it's quite painful to write im_str!("text") instead of "text" on the rust side of application development.

Anyways thanks for working on this, if you need someone to do some work on this I can work on it, just need some direction on what's needed next.

@bitshifter
Copy link
Author

Thanks @bjadamson, you can always use my https://github.com/bitshifter/imgui-rs/tree/imstr branch to use this feature, however I haven't merged with imgui-rs or imgui in a long time so it's not very actively maintained right now. As far as work to do, if you're interested in updating the branches for imgui, cimgui and imgui-rs I'd merge them in. I'm not sure how much work is involved at this point, there's probably a lot of conflicts which might be confusing if you aren't familiar with the code

@ice1000
Copy link
Contributor

ice1000 commented Jul 16, 2018

Is this issue left abandoned? I think it'll be very useful. What is the current status of ImStr?

@ocornut
Copy link
Owner

ocornut commented Jul 17, 2018

It's not abandoned (the issue is open) but stagnant and low priority at this point.

It's very useful for Rust users but has a non-trivial cognitive load for C++ users so it is a really hard move to make considering dear imgui users are mostly C++ centered. From my point of view there's enough workload on dear imgui that putting that sort of weight in favor of a potential new user base is a hard price to pay. The fact that there's a Rust macro hack (used by imgui-rs) to workaround the problem, however not elegant, reduce the pressure to solve this. It's a very frustrating situation because the PR #683 by bitshifter does everything it needs to do correctly, but I think it is more pragmatic to defer that decision until a more major imgui refactor (e.g. 2.0).

(One possible way forward to keep this maintained would be to consider partly-automating the change so the ImStr version can be derived from the master version + a smaller patch, presumably easier to maintain.?)

@meshula
Copy link

meshula commented Jul 17, 2018

FWIW, string_view is getting used more frequently nowadays, so C++ users of string_view are faced with the extra load of copying the string to add a terminator. I'm not pushing for higher priority, just want to mention working around string_view and ImGui together is also a cognitive load :)

@ocornut
Copy link
Owner

ocornut commented Jul 17, 2018

That's a fair point, although I presume the majority of imgui usage relies on a combination of literals + generated strings (via sprintf or another mechanism) in both cases the strings would end up being zero terminated (and for stuff like ImGui::Text() you can already pass non-zero-terminated ranges).
As a very different topic there is an ongoing idea of adding an explicit-context API and maybe that lower-level api could have some of the strings parameters rewired to accommodate for this change.

@meshula
Copy link

meshula commented Jul 17, 2018

In my code I am now extensively using the rather verbose TextUnformatted. At first it was a workaround for string_view, and then because I realized all my Text("trivial") calls were subject to formatting and therefore a call to sprintf. I'm not sure how to pass non-zero-terminated values to ImGui::Text?

@ocornut
Copy link
Owner

ocornut commented Jul 17, 2018 via email

@meshula
Copy link

meshula commented Jul 17, 2018

A terrible idea I had around a name for TextUnformatted was two versions of Text.

void Text(const char* str); // same as TextUnformatted
template<typename T>
void Text(const char* str, T trick, ...); // trick the compiler in picking the variadic version

@ocornut
Copy link
Owner

ocornut commented Jul 18, 2018

Yeah :/ I'd prefer to avoid that sort of thing especially for such a common function and as we are trying to make imgui friendly and naturally bindable to many languages without being scary, and still obvious to step into etc.
However you can perfectly create those wrapper in your own header files, either in a different namespace either in the name space with a different name?
(If you want to answer to this message please open a new thread so we don't pollute this one, thanks!)

@matt77hias
Copy link
Contributor

matt77hias commented Sep 16, 2018

Some things to note:

  • Replacing const char* with std::string_view will not break existing code, since std::string_view can be constructed in a non-implicit way from a const char*. So std::string_view should rather be considered as a generalization than a different new alternative to const char*.

  • The only requirement imposed by using std::string_view is the C++11 standard. Though, no special C++11 features are used for implementing this utility class, so ImGui could provide a fallback implementation if not available.

  • std::string_view is as lightweight as can be (i.e. just pass by value) and should thus not frighten any user at all ;-) .

Imho, in the presence of std's charconv (C++17), which operates on character ranges instead of null-terminating strings, and 3th party libraries such as fmt, supporting both std::string and std::string_view formatting arguments as opposed to printf, I think std::string_view will really become (be) the de facto standard string view for C++ (as opposed to C'ish const char*).

@ocornut
Copy link
Owner

ocornut commented Sep 17, 2018

I do agree string_view is good change going in the right direction and should head toward becoming a well used standard. It's however not in the DNA of imgui to use C++ constructs.

It would frighten some user who explicitly want to have zero dependency on STL (those users constitute a fair portion of professional users of dear imgui). Anything STL also tends to have overhead in not optimized builds so they are not exactly as lightweight as they can be.

(Also std::string_view(const char*) enforce a strlen() calculation on construction - there's no way to calculate it lazily? - so it's not as easy as providing a zero-cost drop in replacement.

For the above reasons and everything well discussed here would be more sane to have our own implementation of string_view and PR #683 is taking the right approach, we can perfectly make it constructable from std::string_view as well. I already mentioned this is an important change, which I think could be tied to an important release (2.0). It's just too out of scope in the short term, when the 1.7x line is released and well polished we can start considering our options for future priorities.

@matt77hias
Copy link
Contributor

matt77hias commented Sep 17, 2018

(Also std::string_view(const char*) enforce a strlen() calculation on construction - there's no way to calculate it lazily? - so it's not as easy as providing a zero-cost drop in replacement.

If you know the string at compile time (e.g., formatting), you can calculate the string length at compile time by using the array type char(&)[N] before decaying or by using constexpr (but it depends a bit on the C++ standard, you are aiming at).

For the above reasons and everything well discussed here would be more sane to have our own implementation of string_view and PR #683 is taking the right approach, we can perfectly make it constructable from std::string_view as well.

Yes indeed. In fact the minimum requirement ImGui requires to (in)directly support std::string_view is to generalize to a pair of forward char iterators, which can be restricted to contiguous memory char iterators (i.e. const char* begin, const char* end) to avoid the need for templates. Then it really starts looking like <charconv>'s from_chars and to_chars. As a general hack/feature (which unfortunately does not work for <charconv> in debug versions :( ), one can set the const char* end = nullptr for null-terminating strings (= no cost, since you need to perform the \0 anyway for the other case).

P.S.: const char* begin, const char* end seems imho somewhat more flexible than const char* begin, size_t length (Cfr. #683) if you iterate like:

for(auto it = begin; it < end && *it != '\0'; ++it) { ... }

If you ever decide to support general forward iterators, all the code will just keep working ;-)

@yuriks
Copy link

yuriks commented Feb 11, 2020

Has there been any movement on this in the intervening 2 years since the last comment, or is it still waiting for a major breaking version change in the library to be considered? Fine either way, but the 3rd party branches aren't maintained anymore so it seems whatever momentum this had has been lost for now.

@ocornut
Copy link
Owner

ocornut commented Feb 11, 2020

Apologies for not making a move on this, as you understand it's not a free change from the point of view of C++ users, and it hasn't been a big priority on my pile of work. I would happily assist to find ways to may this process painless, but in all likehood this is more like something we would adopt for 2.0.

Technically the submitted PR #683 is probably not even a breaking change! (or if it is they are probably minor or fixables issues), the overheads I am trying to avoid are

  • Runtime overhead in unoptimized builds. I have experimented with turning some MSVC debug features off and I think we may be able to find a solution that would make debug cost minimal. Going to investigate this further now that we have a better framework for perf measurements.
  • Cognitive overhead and giving the sense that dear imgui is higher-level that in needs to be. I realize this sounds a little bit religious-cargo-cult-ish but it is a strong component of the library DNA to not have unnecessary layers of abstractions.

There's an overlap with another "someday" change which would be to support wchar_t* all accross the board, because this is what Unreal uses naturally. So if we add some layer of abstraction for strings we might as well figure out a way to handle both cases.

I generally feel we should be going in this direction but it's going to be a slow steering. I'll look into getting the PR resurrected for the latest version.

@ocornut
Copy link
Owner

ocornut commented Feb 24, 2020

FYI @rokups has revived this PR as #3038

@ocornut
Copy link
Owner

ocornut commented Nov 30, 2020

(EDITED)
Moved message to #3038 (comment)

@kevyuu
Copy link

kevyuu commented Oct 7, 2024

+1 for this issue. Currently I have to check whether string_view is null terminated, and have to copy it to a pre-allocated buffer if it is not, which is a huge waste of performance if it is being done every frame. string_view is considered a standard way to pass immutable reference to string currently in modern C++.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests