Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a 'Scoped Buffer' implementation as smart-pointer #2987

Merged
merged 6 commits into from
Oct 2, 2020

Conversation

andy31415
Copy link
Contributor

Problem

Pairing malloc and frees is easily error prone. In C++ we can use RAII.

Summary of Changes

Defines a smart pointer around Platform::MemoryAlloc and friends. Tested via Godbolt that this is a 0-cost abstraction.

used this new class in a couple of places. Specifically for linux storage, I also refactored the layout of some functions to avoid the large if nesting (was even missing an out of memory check which would have made it larger) at the expese of not having a single return (I believe we overly try for a single return and this hurts readability).

Fixes #2896

@andy31415
Copy link
Contributor Author

For reference, in a malloc/free wrapper for scoped memory:

size_t test() {
    ScopedMemoryBuffer b;

    b.Alloc(256);
    strcpy(b.Ptr<char>(), "this is a test");

    return strlen(b.Ptr<char>());
}

results in this on gcc:

.LC0:
        .ascii  "this is a test\000"
test():
        push    {r4, r5, r6, lr}
        mov     r0, #256
        bl      malloc
        mov     r4, r0
        ldr     r1, .L3
        bl      strcpy
        mov     r0, r4
        bl      strlen
        mov     r5, r0
        mov     r0, r4
        bl      free
        mov     r0, r5
        pop     {r4, r5, r6, lr}
        bx      lr
.L3:
        .word   .LC0

@mspang
Copy link
Contributor

mspang commented Oct 1, 2020

Problem

Pairing malloc and frees is easily error prone. In C++ we can use RAII.

Summary of Changes

Defines a smart pointer around Platform::MemoryAlloc and friends. Tested via Godbolt that this is a 0-cost abstraction.

Did you test single vs multiple returns on godbolt? I think that's more likely where costs will be incurred.

used this new class in a couple of places. Specifically for linux storage, I also refactored the layout of some functions to avoid the large if nesting (was even missing an out of memory check which would have made it larger) at the expese of not having a single return (I believe we overly try for a single return and this hurts readability).

Fixes #2896

src/lib/support/ScopedBuffer.h Outdated Show resolved Hide resolved
operator bool() const { return mBuffer != nullptr; }

/** Take over managing memory from the specified buffer */
ScopedMemoryBufferBase & Aquire(void * buffer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Acquire". Maybe "Adopt" is a better name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does the instance know where to call to free the pointer? the buffer in question has to be understood by the Impl's static MemoryFree() function?

I'd really like to see a callback (or a closure) passed in here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a crutch for existing code if memory is already managed 'the old way'.
We can assume that any porting/new code can start to use this code from the start and then drop this method.

* Releases the undelying buffer. Buffer stops being managed and will not be
* auto-freed.
*/
void * Release()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to call this "Release", or "Forget"? I guess "Release" more closely matches the unique_ptr API, but then we should use "Reset" instead of "Acquire", maybe....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped aquire and thinking to keep release for matching with unique_ptr. This class is basically uniqe_ptr but without exception handling.

src/lib/support/ScopedBuffer.h Show resolved Hide resolved
src/lib/support/tests/TestScopedBuffer.cpp Outdated Show resolved Hide resolved
@@ -867,7 +868,7 @@ CHIP_ERROR GenericConfigurationManagerImpl<ImplClass>::_ComputeProvisioningHash(
using HashAlgo = chip::Crypto::Hash_SHA256_stream;

HashAlgo hash;
uint8_t * dataBuf = NULL;
chip::Platform::ScopedMemoryBuffer dataBuf;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inside namespace chip, right? Why the prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazingly enough compiler complained about Platform being ambigous, so I used it as such.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically at least chip::Platform and chip::System::Platform exist and for code that has 'using' on both chip and system, we get conflicts.

We should not get conflicts if headers had a rule of 'no using inside headers' but the ported code does not enforce that so we are stuck until we clean up code. Would prefer that cleanup to be independent of this PR.

Comment on lines +143 to +146
static void MemoryFree(void * p) { chip::Platform::MemoryFree(p); }
static void * MemoryAlloc(size_t size) { return chip::Platform::MemoryAlloc(size); }
static void * MemoryAlloc(size_t size, bool longTerm) { return chip::Platform::MemoryAlloc(size, longTerm); }
static void * MemoryCalloc(size_t num, size_t size) { return chip::Platform::MemoryCalloc(num, size); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this stuff is inside namespace chip, so why the prefixes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we have some include ambiguity. I did not try to fix it but rather just be explicit. It looks slightly annoying, but not unreadable.


#include <nlunit-test.h>

using namespace chip;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should obviate the need for the chip:: in various places in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped the line. we don't have too many chip:: remaining so this should be fine. Somehow Platform cannot be without chip:: prefix so the worst offender cannot be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually rechecked and in this context I could remove chip:: and keep just Platform. Still unsure if worth it.

CHIP_ERROR retval = CHIP_NO_ERROR;
char * encodedData = nullptr;
CHIP_ERROR retval = CHIP_NO_ERROR;
chip::Platform::ScopedMemoryBuffer encodedData;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inside namespace chip?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would think so, but Inet and storage have different ones :(

../../src/platform/Linux/CHIPLinuxStorageIni.cpp:222:79: error: reference to ‘Platform’ is ambiguous
  222 | CHIP_ERROR ChipLinuxStorageIni::GetBinaryBlobDataAndLengths(const char * key, Platform::ScopedMemoryBuffer & encodedData,
      |                                                                               ^~~~~~~~
In file included from ../../src/lib/core/CHIPCore.h:39,
                 from ../../src/include/platform/CHIPDeviceLayer.h:25,
                 from ../../src/include/platform/internal/CHIPDeviceLayerInternal.h:23,
                 from ../../src/platform/Linux/CHIPLinuxStorageIni.cpp:31:
../../src/inet/InetLayer.h:111:11: note: candidates are: ‘namespace chip::Inet::Platform { }’
  111 | namespace Platform {
      |           ^~~~~~~~
In file included from ../../src/platform/Linux/CHIPLinuxStorageIni.h:30,
                 from ../../src/platform/Linux/CHIPLinuxStorageIni.cpp:30:
../../src/include/platform/PersistedStorage.h:34:11: note:                 ‘namespace chip::Platform { }’
   34 | namespace Platform {

I don't want to refactor too much, so keeping the super specific bit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I didn't know we had multiple Platform namespaces. Just ignore all the namespace comments.

void * Ptr() { return mBuffer; }
const void * Ptr() const { return mBuffer; }

template <typename T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether it makes sense to make the following changes:

  1. Add a template parameter for the type of the things in the buffer (to either the base class or the underlying class). Let's call that type Type.
  2. static_assert that Type is a POD type, to avoid issues with people putting things with destructors (which will not get run) in the buffer
  3. If we have static knowledge about alignment of the things our allocator returns maybe static_assert that Type does not have stronger alignment requirements.
  4. Add conversion operators returning Type* and const Type*.
  5. Maybe add an operator[] returning Type& (and a const equivalent).

This would still allow effectively doing reinterpret_cast via the Ptr method if we want to allow that, but would also allow "normal" use to specify the type up front and not have to mention it at every place where the buffer is actually being used. The various places in the changeset that use someBuffer.Ptr<sometype>() could just use someBuffer.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that however my intent is to avoid template explosion (this was the main objection I heard about using templates in embedded). I do not want duplicate implementations inside flash for uint8_t, char, etc.

In this approach, we have only the Get duplicated, but that one is a trivial function and inlined, so it should go away and we get 0 cost abstraction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may refactor it later... maybe subclassing works without exploding things (just make cast operators easy to use). I would defer it for the future though. At first I thought that this Memory alloc/free is used all over in chip, but in reality we have very limited places where we actually use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe subclassing works without exploding things

Right. Codesize explosion is a good reason to not template the base class. Templating the derived class, and making sure that the templated thing only has inline methods, would avoid the codesize problems, though.

I'm probably ok with this change as a followup, especially since it would involve codesize measurements, but I do think we should make it, not just for ease of use but also because it enables us to add safety asserts (like the is_pod assertion I suggested). Are you willing to take the time to do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do it, please add a PR and assign it to me.
I am unsure timing wise, however it seems small enough of a change that I can do it.
Also thinking we probably want a implementation for 'secure memory' to call a clear on every free. Our existing code seems somewhat buggy in that regard.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Boris, the current implementation strikes me as type-unsafe and we should be able to fix it without increasing code size.

I would suggest removing the template parameter from Ptr<> in this initial PR, that's really just hiding a typecast under a less obvious syntax now. Potentially unsafe casts are the last thing we want to hide in a benign-looking accessor function.

@bzbarsky-apple
Copy link
Contributor

Also, this is excellent. :)

}

// Read the device intermediate CA certificates.
err = Impl()->_GetManufacturerDeviceIntermediateCACerts(dataBuf, certsLen, certsLen);
err = Impl()->_GetManufacturerDeviceIntermediateCACerts(dataBuf.Ptr<uint8_t>(), certsLen, certsLen);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this <uint8_t> templating necessary here, but not below for ClearSecretData?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried verbatim copy. Data was uint8_t by default hence conversions everywhere, but clear secret data takes void* I think. I did not check the rest.

Speaking of which: clearsecretdata is called on the last free, however we have an internal alloc that does NOT clear it. Unsure if intentional or not, so I kept 'as is' but I do assume a bug where using this class would have made easier to spot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clearsecretdata() takes a uint8_t*

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... new theory (verified though): when I do 'run functional and unit tests' in vscode, this file does not get called at all, so it does not notice that the clearsecretdata call is bad.

I updated it blindly and will rely on CI to check things. This is very odd... is this file even used?

ScopedMemoryBufferBase & Alloc(size_t size)
{
Free();
mBuffer = Impl::MemoryAlloc(size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any instance data for the Impl? what if we want to allocate from pools, or allocate and free across linking boundaries?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impl was just just because I needed to test things. In reality these are supposed to be Platform::Memory* methods which are not inside a class.

Static polyphormisms seems awkward, but at least has no cost and is testable.

Copy link
Contributor

@Damian-Nordic Damian-Nordic Oct 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if that's useful at all, but you could probably support stateful allocators, by reversing the class hierarchy and following STL convention:

struct DefaultAllocator
{
    static void MemoryFree(void * p) { chip::Platform::MemoryFree(p); }
    static void * MemoryAlloc(size_t size) { return chip::Platform::MemoryAlloc(size); }
    static void * MemoryAlloc(size_t size, bool longTerm) { return chip::Platform::MemoryAlloc(size, longTerm); }
    static void * MemoryCalloc(size_t num, size_t size) { return chip::Platform::MemoryCalloc(num, size); }
};

template <class Allocator=DefaultAllocator>
class ScopedMemoryBuffer : Allocator
{
public:
    explicit ScopedMemoryBufferBase(const Allocator& a) : Allocator(a) {}
...
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the suggestion, however I am unsure of the effect of those on template code explosion. I believe we do not use std pointers because exceptions and because potential code bloat.

I would be happy to review this separately, however as a first pass I would prefer code that I was able to double-check with godbolt.

@boring-cyborg boring-cyborg bot added the nrf5 label Oct 2, 2020
void * Ptr() { return mBuffer; }
const void * Ptr() const { return mBuffer; }

template <typename T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Boris, the current implementation strikes me as type-unsafe and we should be able to fix it without increasing code size.

I would suggest removing the template parameter from Ptr<> in this initial PR, that's really just hiding a typecast under a less obvious syntax now. Potentially unsafe casts are the last thing we want to hide in a benign-looking accessor function.

@rwalker-apple rwalker-apple merged commit f43f552 into project-chip:master Oct 2, 2020
@andy31415
Copy link
Contributor Author

Oh ... just had a refactor to address Michael/Boris coments regarding typesafety.

Will submit them as a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Start using RAII for buffer frees within the code
6 participants