Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bunch of (I hope) improvements #53

Closed
wants to merge 3 commits into from
Closed

Conversation

AlexIII
Copy link

@AlexIII AlexIII commented Nov 6, 2021

Hi!

I've made an Adurino library based on your project.

It required a few changes to be made to incbin.h which I think improve versatility and may be of use. All change are non-breaking and backward-compatible and I tried to follow the original code style closely.

Explanation of every commit

bbc3000 - Add an option to specify different sections for data and its size

Arduino is based on AVR MCU, which has true harvard architecture. Its program memory cannot be directly read from the program, it requires special instructions. Due to this limitation it is very inconvenient to store the size of the included binary data in ROM. The size variable should be stored in RAM, hence it should be in different section.
The commit adds an option to specify section for data size variable by defining INCBIN_SIZE_OUTPUT_SECTION macro. If undefined, it defaults to the value of INCBIN_OUTPUT_SECTION.

2e4c322 - Add terminating NULL at the end of the data (optionallny)

When including text files, it is convenient to treat them as a c-string. The problem is that c-string should be NULL-terminated. This commit adds an option to add NULL byte at the end of the included data by defining INCBIN_APPEND_TERMINATING_NULL. Disabled by default.

Please check if I did ok there. I have a feeling this change may mess-up the alignment or something. I admit I don't fully understand the code here. For example, I don't understand why you append .byte 1 at the end of the data.

566be48 - Add an option to specify typename for data

Adds an option to specify the resulting pointer type of the included data.
Just a convenience feature that removes the need of the cast every time you use the included data.
Adds two new macros INCBIN_PTR_TYPE(NAME, FILENAME, DATA_PTR_TYPE) and INCBIN_EXTERN_PTR_TYPE(NAME, DATA_PTR_TYPE) with one additional argument alongside the existing.

@graphitemaster
Copy link
Owner

graphitemaster commented Nov 6, 2021

Hey Alexill. This looks really good. I have some suggestions before merging the PR.

  • I like the idea of specifying the different sections for data and size, I think maybe we should rename INCBIN_OUTPUT_SECTION to INCBIN_OUTPUT_DATA_SECTION for consistency, so we have both SIZE and DATA, I think that's less confusing. We'd leave INCBIN_OUTPUT_SECTION as an option to define both when either is not defined, this keeps backwards compat then.

  • Specifying the optional typename for data seems useful, I'd like to use overloading for it though rather than having two separate macros, this way you can just do INCBIN(type, name, "file") and INCBIN_EXTERN(type, name) but also allow INCBIN(name, "file") and INCBIN_EXTERN(name), I think this is cleaner?

  • The terminating null seems like a bad idea to me. I don't want this to be a global operation that expands the size of every include by one extra byte when it's not needed. I understand the use case for strings, but it also breaks including LUTs used for SIMD kernels and other things that assume file size, so we maybe need to come up with a separate interface for it all together, or we have to add an option to the INCBIN macro itself somehow.

@graphitemaster
Copy link
Owner

graphitemaster commented Nov 6, 2021

I've gone ahead and implemented all three of these in commit

c6a9227

You may now use INCTXT to include text, and INCBIN / INCBIN_EXTERN with an optional first argument type to change type. INCTXT always defaults to char (as it should) and adds an implicit NUL byte. The default is still unsigned char for binary.

You can override DATA and SIZE sections independently of one another with INCBIN_OUTPUT_DATA_SECTION and INCBIN_OUTPUT_SIZE_SECTION, or together at once with INCBIN_OUTPUT_SECTION like discussed in my above comment.

Thanks for the suggestions. I'm sorry I didn't use your PR directly. I think you'll find this is a lot cleaner and has a lot more options :)

@AlexIII
Copy link
Author

AlexIII commented Nov 7, 2021

@graphitemaster I generally agree with what you've said, great work!

A few comments, if I may

On typing

INCTXT always defaults to char (as it should)

Ah, you see, it's on "normal" architectures (with common address space) a string is expected to be const char*. But on AVR there's no way to tell if the data is in ROM or in RAM simply by looking at the pointer (ROM and RAM address spaces overlap), and they require drastically different methods to works with. So Arduino implemented a little trick, casting all strings in ROM to dummy type const __FlashStringHelper* instead of const char*. All functions that work with strings have overloads for both types, so now the code can differentiate between the two.

In my lib I can just use INCBIN_COMMON() to re-define INCTXT and set the desired type, no problem, it will be concealed from the user. But those who will use your library directly may find it convenient to have an overload of INCTXT with the type argument though, as writing something like INCBIN_BYTE "0\n" in the last argument of INCBIN_COMMON() in user-code seems like a leakage of implementation details to me.

On terminating null

In my implementation the null byte wasn't included in the size count as to show character count like strlen() does. But I don't think it's necessary better in all cases, both ways have their arguments.

And a question

Could you please explain to me why appending the byte with the value of 1 at the end of the data?

INCBIN_BYTE "1\n" \

I couldn't figure it out, and it just bus me at this point =)

@AlexIII AlexIII deleted the incbin branch November 7, 2021 05:30
@AlexIII AlexIII restored the incbin branch November 7, 2021 05:31
@AlexIII
Copy link
Author

AlexIII commented Nov 7, 2021

Also, I suggest adding INCTXT_EXTERN to complement INCTXT same way as INCBIN / INCBIN_EXTERN

@graphitemaster
Copy link
Owner

graphitemaster commented Nov 7, 2021

INCTXT_EXTERN exists. That __FlashStringHelper* thing is interesting. Do you have some resources on that I can look into, I'm very curious. I think I can add an overload for INCTXT like the others to accommodate it.

@graphitemaster
Copy link
Owner

To answer your question about that value. It's just a dummy value that needed to be something. There is an alignment of 1 for that end object after the data that is used to calculate the length, since putting an end variable after it may have padding bytes before it (would use same alignment set for the data), thus computing the wrong size. So an object is needed of some form. Originally the code just used a label but that broke when linked with LTO. The actual value of 1 doesn't particularly matter, it can be any byte value other than 0 since for some reason zero values when compiled with -fdata-sections tend to get reordered into the bss section on some toolchains and breaks the computation as well.

@graphitemaster
Copy link
Owner

It's pretty late here. I'll look into making the INCTXT interface the same as INCBIN so you can overload the type tomorrow most likely. I'd really appreciate a resource on the __FlashStringHelper thing though, just so I can better document it and learn something.

@AlexIII
Copy link
Author

AlexIII commented Nov 7, 2021

INCTXT_EXTERN exists. That __FlashStringHelper* thing is interesting. Do you have some resources on that I can look into, I'm very curious. I think I can add an overload for INCTXT like the others to accommodate it.

Oh, Arduino is not particularly good at documenting it's inner workings. It's better to just see the source

https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/WString.h

class __FlashStringHelper;
#define F(string_literal) (reinterpret_cast<const __FlashStringHelper *>(PSTR(string_literal)))

As you can see here, __FlashStringHelper is just a dummy non-existent class, and with F() macro we cast all ROM strings from const char* to const __FlashStringHelper *.

Then, most of the functions that work with string has two overloads, for example String class constructor:

	String(const char *cstr);
	String(const __FlashStringHelper *str);

These both will construct String object, but will use different implementations to read the actual string data. First one will read from RAM, and the other will read from ROM.

@AlexIII
Copy link
Author

AlexIII commented Nov 7, 2021

The actual value of 1 doesn't particularly matter, it can be any byte value other than 0

Thanks for the explanation. Of course my first thought was "why don't we just change 1 to 0 and have ourselves a null terminator".

INCTXT_EXTERN exists

Sorry, missed that

@graphitemaster
Copy link
Owner

It should also be noted that even if we changed that 1 to a 0, the size would be incorrect since it doesn't count it.

@AlexIII
Copy link
Author

AlexIII commented Nov 8, 2021

@graphitemaster was that sufficient amount of info on __FlashStringHelper*?

@AlexIII
Copy link
Author

AlexIII commented Dec 16, 2021

@graphitemaster haven't heard from you for a while. There's still a problem #54 after the last update. Please take a look when you can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants