Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Review & Discuss] Update TxQueue to have a backing AVL Tree (Issues #189 #195) #198

Merged
merged 23 commits into from
Mar 6, 2019

Conversation

ntakouris
Copy link

  • Not all tests pass, but please - keep reading:
  • API's signature has not changed regarding Qos', but all Qos logic has been removed / we no longer check if Qos is same in order to have entries sorted in the queue => I corrected some tests regarding Qos changes

  • Important Currently equal-priority entries (which is calculated purely from frame priority) is not taken into account. Shall I add functionality where if nodes have the same priority, they are linked to the same node (and afterwards traversed/removed at the order they are inserted?). For example if at tx_queue.cpp test you try to add any frame twice in a row (without removing it), you'll get a push rejected (and maybe a memory leak).

  • On some tests I am unsure on how to proceed to correct them - or what changed their behaviour: ex. on FirmwareUpdateTrigger.MultiNode , some timer is running when it should not (towards the end).

Failing tests

I messed with DynamicNodeIDClient.Basic, CanIOManager.Transmission but unsure if it's correct.

For now, can_io.hpp, uc_can_io.cpp, avl_tree.hpp, tx_queue.cpp, io.cpp (and the other test behaviours) should be reviewed and discussed. :)

(I was writing end-of-semester tests and missed the dev call, but this week I'll join for sure)

@thirtytwobits
Copy link
Contributor

Can you squash your commit history so we're reviewing a single commit into our master? Thanks.

Copy link
Contributor

@thirtytwobits thirtytwobits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're getting there! Thanks so much for this. The biggest thing is to get the unit tests built.

libuavcan/include/uavcan/driver/can.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/transport/can_io.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/transport/can_io.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/transport/can_io.hpp Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
@ntakouris
Copy link
Author

ntakouris commented Jan 29, 2019

Squashed, working on adhering to changes & tests!

(Note that TxQueue sanity checks already work though).

Should equal priority entries be supported?

Note: On merge PR as single squashed commit: https://help.github.com/articles/configuring-commit-squashing-for-pull-requests/

@pavel-kirienko
Copy link
Member

Excellent work @Zarkopafilis

Shall I add functionality where if nodes have the same priority, they are linked to the same node (and afterwards traversed/removed at the order they are inserted?). For example if at tx_queue.cpp test you try to add any frame twice in a row (without removing it), you'll get a push rejected (and maybe a memory leak).

Yes. Multi-frame transfers work by injecting multiple frames with the same CAN ID into the TX queue. They must be retrieved by the driver in the same order they were pushed, otherwise things won't work at all.

Some of the tests could be broken because of the failing multi-frame transfers, so I suggest to fix that first.

@pavel-kirienko
Copy link
Member

pavel-kirienko commented Jan 29, 2019

I am noticing that the new code relies on C++11 features (such as std::function<> and new member initialization). This is perfectly fine for Blue Sky which requires at least C++11, but not so much for the current master, because it is supposed to be C++03-compatible. On the other hand, advancing the current master doesn't make sense because this codebase will be discontinued very soon anyway.

@thirtytwobits what should we do?

@pavel-kirienko
Copy link
Member

@Zarkopafilis is this ready for a final review?

@ntakouris
Copy link
Author

ntakouris commented Feb 11, 2019 via email

Copy link
Member

@pavel-kirienko pavel-kirienko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I left some comments. Among the important stuff is the fact that your code currently relies on heap allocation, which is absolutely not permitted (the reliance is implicit through closures). Please find a way to avoid them.

Please also fix the style issues (I didn't highlight all of them, an automatic style checker should help).

Feel free to drop QoS completely.

libuavcan/include/uavcan/transport/can_io.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/transport/can_io.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/transport/can_io.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
@ntakouris
Copy link
Author

@pavel-kirienko the newly added static_casts are due to conversion warnings treated as errors. Removed the heap allocations as well.

Copy link
Member

@pavel-kirienko pavel-kirienko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reformat the code to match the style guide. Particularly, ensure that braces are placed correctly and entities are named properly (camel case vs. snake case). We will have another pass afterwards.

CMakeFiles/3.13.2/CompilerIdCXX/CMakeCXXCompilerId.cpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/node/abstract_node.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/node/generic_publisher.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
libuavcan/test/util/avl_tree.cpp Outdated Show resolved Hide resolved
Copy link
Member

@pavel-kirienko pavel-kirienko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly, the auto-formatter configuration requires work. Is my understanding correct that you have run the formatter against the entire codebase?

Mixing both style changes and logical changes inside the same PR doesn't seem to be working well: the diffs are just incomprehensible. I suggest either:

  • Roll back the autoformatting changes and fix the styling issues manually.
  • Fix the autoformatter configuration, run it against the master, and submit the style-only changes in a separate PR (and also please add the autoformat config file to the repo). Then after the style PR is merged, rebase or merge it into this branch.

libuavcan/include/uavcan/node/generic_publisher.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/node/generic_publisher.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/node/generic_publisher.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/transport/can_io.hpp Show resolved Hide resolved
libuavcan/include/uavcan/transport/transfer.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/transport/transfer.hpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
libuavcan/src/transport/uc_can_io.cpp Outdated Show resolved Hide resolved
libuavcan/test/transport/dispatcher.cpp Outdated Show resolved Hide resolved
@ntakouris
Copy link
Author

Removed the explicit this->

I only ran the formatter for the files that this PR changed (additions on queue, avl tree, qos removal).

I see no reason for all this formatting hassle; the repository is itself inconsistent and the provided formatter(s) on v1.0 and uncrustify have no effect on this - correcting styling mistakes like the initializer list from where I copied exactly styling from other parts of the codebase does not lead in a proper solution.

@pavel-kirienko pavel-kirienko dismissed their stale review February 24, 2019 19:27

ignoring the style issues for now

@pavel-kirienko
Copy link
Member

Okay. I suggest then that we merge this ignoring the style issues for now, and then I will fix the clang autoformatting config myself and submit style corrections via another PR.

The changeset is huge here, so I would like @thirtytwobits to take a pedantic look at it once again (ignoring style) and confirm everything once more.

@thirtytwobits
Copy link
Contributor

Sure. I'll look at this later today (Feb 25, PST). Sorry that my .clang-format is messed up. I'm pretty sure that's my fault. I'll also look into changing all the fonts for this code base to Russo One.

Copy link
Contributor

@thirtytwobits thirtytwobits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't have time to complete this review today like I said. I'm making progress through it though. I have a few comments to start. More to follow as I have time. Thanks again for contributing!

protected:
struct Node {
T* data = UAVCAN_NULLPTR;
int16_t h = 1; // initially added as leaf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's with the int16 for height? It won't save space on most architectures because of padding and can introduce weird alignment issues if packed. Any reason to not just use an int so we get better alignments? (and it would still be 16 bits on a 16 bit architecture).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MISRA prohibits native types because of associated portability issues and risks. The rest of the codebase is mostly non-compliant in this sense, but as I wrote in #190, we'll have a chance to have those non-compliance issues rectified in BlueSky. If int16_t is considered too small (which I don't think is the case here considering the purpose of this container), it should be replaced with int32_t.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure you are reading MISRA correctly here. Rule 3-9-2 (which is only Advisory) recommends typedefs to manually control the size of integer types. The intent of all the integer rules is to prevent defects. My recommendation to use a machine word is also to prevent alignment defects (which are especially insidious where un-aligned access is supported but slow). If we must we can use a ptrdiff_t but that'd just look weird. Let's feel free to add "breaks MIRSRA x-x-x because foo is safer" comments (we can even make them Coverity annotations) rather than slavishly adhering to the standard. But I do agree we should consider and use MISRA where it make sense.

That all said, I don't think we're packing these nodes into anything so my comment here is probably just cosmetic.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: it can't be unsigned because it may be negative while performing balancing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. ptrdif_t and int are both signed machine words on all platforms...except 8-bit platforms...crap. int is 16 bits on AVR according to the spec (caveat: most AVR program make use of typedef "fast int" which is 8-bits) and ptrdiff_t can be larger than a machine word. Okay. I'm really over-indexing on this, sorry. Let's keep the int16_t and I'll get to the part where this data structure is actually used to be sure we're not causing un-aligned accesses on some platforms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure you are reading MISRA correctly here. Rule 3-9-2 (which is only Advisory) recommends typedefs to manually control the size of integer types.

Indeed, I was misremembering things by thinking that the rule is required rather than advisory in MISRA. Still, well-defined, deterministic width may eliminate certain overflow-related defects. One example comes to mind -- there was a discussion (either on the mailing list or on Gitter) about this code misbehaving on AVR due to integer overflow: https://github.com/UAVCAN/libcanard/pull/17/files#diff-ec9999a8cb27a176bf926dd68dfd7426R303

I have completely eliminated native types from libcanard since then; the only thing that stopped me from doing the same in libuavcan is the size of its codebase.

HIC++ also agrees here:

To enhance portability, instead of using the standard integer types (signed char, short, int, long, long long, and the unsigned counterparts), size specific types should be defined in a project-wide header file, so that the definition can be updated to match a particular platform (16, 32 or 64bit). Where available, intN t and uintN t types (e.g. int8 t) defined in the cstdint header file should be used for this purpose.

I don't see how the alignment argument is relevant at all, considering that we don't use packed structs ever. Consequently, I don't see how reliance on native types or platform-defined types like ptrdiff_t makes the code less prone to defects.

Generally, we should emphasize robustness over optimization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're in agreement here. Sorry (I commented too fast on this int16 and now Zarkopafilis has to listen to us drone on about alignment issues where there aren't any).

Generally, we should emphasize robustness over optimization.

Agreed but unaligned access is a hard-fault on some processors so this wasn't a comment that was just about optimization.

What's probably more common (and yes, Coverity has griped about this in the current code base), when using fixed-width integers for things that are dimensions you can get interesting issues when forming loops where the size is a size_t (i.e. a machine word) and the other comparison term is a uint16_t or some other non-machine dependent word. You'll see possible infinite loops occur in such conditions. What I'm getting at is that this int16_t is bound to cause some humans to scratch their heads about implicit integer promotions at some point. The problem moves around though since (see my later comment) you need to worry about this all over again when you subtract two signed integers.

I guess I'm fine with it but I don't love it.

int count = 0;
bool res = true;

tree->walkPostOrder([expected, &count, &res](Entry*& in) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this frankly concerns me. We're relying on a given std::function implementation to have a small buffer optimization large enough to hold at least 3 machine words of data for the capture. This seems reasonable but I have no data to know that. Perhaps we should just define a functor we use to do this traversal? Also, what are we using this method for exactly? Only testing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is on the tests, where heap allocation is permitted. In general, the traverse is only used for tests. It's the only proper way of checking the contents of the tree, without exposing Node and root_

libuavcan/include/uavcan/util/avl_tree.hpp Show resolved Hide resolved
Copy link
Contributor

@thirtytwobits thirtytwobits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'm really sorry I haven't had the time to review the actual AVL part of this code. Here are some comments in the meantime that do need to be addressed.

libuavcan/include/uavcan/transport/can_io.hpp Show resolved Hide resolved
protected:
struct Node {
T* data = UAVCAN_NULLPTR;
int16_t h = 1; // initially added as leaf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're in agreement here. Sorry (I commented too fast on this int16 and now Zarkopafilis has to listen to us drone on about alignment issues where there aren't any).

Generally, we should emphasize robustness over optimization.

Agreed but unaligned access is a hard-fault on some processors so this wasn't a comment that was just about optimization.

What's probably more common (and yes, Coverity has griped about this in the current code base), when using fixed-width integers for things that are dimensions you can get interesting issues when forming loops where the size is a size_t (i.e. a machine word) and the other comparison term is a uint16_t or some other non-machine dependent word. You'll see possible infinite loops occur in such conditions. What I'm getting at is that this int16_t is bound to cause some humans to scratch their heads about implicit integer promotions at some point. The problem moves around though since (see my later comment) you need to worry about this all over again when you subtract two signed integers.

I guess I'm fine with it but I don't love it.

libuavcan/include/uavcan/transport/can_io.hpp Outdated Show resolved Hide resolved
return 0;
}

return static_cast<int16_t>(heightOf(n->left) - heightOf(n->right));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right. You should write some unit tests to cover the difference being > 0xFFFF since you'll get integer overflow and the static cast is doing nothing except making the code harder to read. If you want to adjust the value to be simply +-2 you'll need to promote it to an int32_t first:

Suggested change
return static_cast<int16_t>(heightOf(n->left) - heightOf(n->right));
const int32_t diff = heightOf(n->left) - heightOf(n->right);
return std::max(-2, std::min(2, diff));

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if you got 65536 pending messages to transmit, you are going to run out of resources way sooner than filling up the tree. Keep in mind that expired nodes get removed too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if you got 65536 pending messages to transmit, you are going to run out of resources way sooner than filling up the tree.

That's a bold assumption that is too fragile to be as implicit as it is now (specifically, memory-rich applications like GNU/Linux-based nodes can theoretically exceed this limit). I think Scott is right.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from this part, the problem still exists on the int16_t chosen as the data type for each node's height.

I'm adding a static cast to the proposed changes so that the compiler doesn't treat the implicit warning as an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Scott is right..

I'll take what I can get at this point. Also, Scott is wrong about the static_cast. You do need it.

return static_cast<int16_t>(heightOf(n->left) - heightOf(n->right));
}

static int16_t maxOf(int16_t a, int16_t b) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to write this ourselves: https://en.cppreference.com/w/cpp/algorithm/max

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we care about semi-broken C++ implementations like that of NuttX (unless they fixed it since I last dealt with it), we must be extremely careful introducing new dependencies on STL. This is the reason why we have reinvented a huge wheel in uavcan/util/templates.hpp - none of this stuff one would expect to find in the standard library is available in NuttX. It is also why we can't easily just include algorithm, so I suggest keeping this code here unchanged.

I don't think that we should retain such drastic compatibility policies in bluesky, but this legacy code should remain compatible with older platforms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Shouldn't this go into templates.hpp then? I don't want to have multiple "min/max" functions lying around.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this go into templates.hpp then?

I guess it should.

libuavcan/include/uavcan/util/avl_tree.hpp Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Show resolved Hide resolved
Copy link
Contributor

@thirtytwobits thirtytwobits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're so close now. Thanks again for sticking with us. We just need to cleanup uc_can_io.cpp to make it so I can patch the CAN-FD branch and to finish cleaning up the whole balanceOf int16 debacle (sorry).

I did a test patch and everything looks good in my downstream source. Coverity didn't flag anything and my quick read though the AVL core logic didn't raise any flags.

libuavcan/src/transport/uc_can_io.cpp Show resolved Hide resolved
}

const int32_t diff = heightOf(n->left) - heightOf(n->right);
return static_cast<int16_t>(std::max(-2, std::min(2, diff)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pavel wanted to use your maxOf but moved into templates.hpp. We'd also need minOf and an int32 version which suggests a template which then brings type traits to play if we want to prevent inefficient uses which will lead to another round of reviews and diving into the c++ spec...

fuck it.

Let's just go with minOf and maxOf here in this class.

I'm really sorry about the churn on this one line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this fails for lpc11c24, note: mismatched types 'std::initializer_list<_Tp>' and 'int' return static_cast<int16_t>(std::max(-2, std::min(2, diff)));

Do we keep the last version?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should work when you switch back to your own versions of minOf/maxOf. Its failing because of changes to stl (see: @pavel-kirienko is almost always right)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but do I need to keep the maxOf & minOf as-is, or move it to the templates.hpp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep them here.

libuavcan/include/uavcan/util/avl_tree.hpp Outdated Show resolved Hide resolved
libuavcan/include/uavcan/util/avl_tree.hpp Show resolved Hide resolved
@@ -0,0 +1,610 @@
/*
* Copyright (C) 2019 Theodoros Ntakouris <[email protected]>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a coverage report. Looks pretty good. It'd be nice to cover some more of the branch conditions missed but overall this is excellent work. Thanks!

covhtml.zip

@pavel-kirienko
Copy link
Member

pavel-kirienko commented Mar 1, 2019 via email

@pavel-kirienko pavel-kirienko merged commit 5853215 into OpenCyphal-Garage:master Mar 6, 2019
@pavel-kirienko
Copy link
Member

🎉 great job @Zarkopafilis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants