-
Notifications
You must be signed in to change notification settings - Fork 502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Review & Discuss] Update TxQueue to have a backing AVL Tree (Issues #189 #195) #198
Conversation
Can you squash your commit history so we're reviewing a single commit into our master? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're getting there! Thanks so much for this. The biggest thing is to get the unit tests built.
Squashed, working on adhering to changes & tests! (Note that TxQueue sanity checks already work though). Should equal priority entries be supported? Note: On merge PR as single squashed commit: https://help.github.com/articles/configuring-commit-squashing-for-pull-requests/ |
Excellent work @Zarkopafilis
Yes. Multi-frame transfers work by injecting multiple frames with the same CAN ID into the TX queue. They must be retrieved by the driver in the same order they were pushed, otherwise things won't work at all. Some of the tests could be broken because of the failing multi-frame transfers, so I suggest to fix that first. |
I am noticing that the new code relies on C++11 features (such as @thirtytwobits what should we do? |
@Zarkopafilis is this ready for a final review? |
It is.
Short recap:
- Replace Tx Queue with AVL tree
- Simplify queue management
- Remove all QoS logic (function prototypes and api signatures left as-is)
On Sun, 10 Feb 2019, 23:47 Pavel Kirienko, ***@***.***> wrote:
@Zarkopafilis <https://github.com/Zarkopafilis> is this ready for a final
review?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#198 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AFL1MmbMS0btMLsvYv0GX_Oe6ePvFvvvks5vMJNagaJpZM4aVA0y>
.
--
Best Regards,
Theodoros 'Zarkopafilis' Ntakouris
+30 698 92 90 272 (Greece)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I left some comments. Among the important stuff is the fact that your code currently relies on heap allocation, which is absolutely not permitted (the reliance is implicit through closures). Please find a way to avoid them.
Please also fix the style issues (I didn't highlight all of them, an automatic style checker should help).
Feel free to drop QoS completely.
@pavel-kirienko the newly added static_casts are due to conversion warnings treated as errors. Removed the heap allocations as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please reformat the code to match the style guide. Particularly, ensure that braces are placed correctly and entities are named properly (camel case vs. snake case). We will have another pass afterwards.
cmake-build-debug/CMakeFiles/3.13.2/CompilerIdCXX/CMakeCXXCompilerId.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearly, the auto-formatter configuration requires work. Is my understanding correct that you have run the formatter against the entire codebase?
Mixing both style changes and logical changes inside the same PR doesn't seem to be working well: the diffs are just incomprehensible. I suggest either:
- Roll back the autoformatting changes and fix the styling issues manually.
- Fix the autoformatter configuration, run it against the master, and submit the style-only changes in a separate PR (and also please add the autoformat config file to the repo). Then after the style PR is merged, rebase or merge it into this branch.
Removed the explicit I only ran the formatter for the files that this PR changed (additions on queue, avl tree, qos removal). I see no reason for all this formatting hassle; the repository is itself inconsistent and the provided formatter(s) on v1.0 and uncrustify have no effect on this - correcting styling mistakes like the initializer list from where I copied exactly styling from other parts of the codebase does not lead in a proper solution. |
ignoring the style issues for now
Okay. I suggest then that we merge this ignoring the style issues for now, and then I will fix the clang autoformatting config myself and submit style corrections via another PR. The changeset is huge here, so I would like @thirtytwobits to take a pedantic look at it once again (ignoring style) and confirm everything once more. |
Sure. I'll look at this later today (Feb 25, PST). Sorry that my .clang-format is messed up. I'm pretty sure that's my fault. I'll also look into changing all the fonts for this code base to Russo One. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't have time to complete this review today like I said. I'm making progress through it though. I have a few comments to start. More to follow as I have time. Thanks again for contributing!
protected: | ||
struct Node { | ||
T* data = UAVCAN_NULLPTR; | ||
int16_t h = 1; // initially added as leaf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's with the int16 for height? It won't save space on most architectures because of padding and can introduce weird alignment issues if packed. Any reason to not just use an int so we get better alignments? (and it would still be 16 bits on a 16 bit architecture).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MISRA prohibits native types because of associated portability issues and risks. The rest of the codebase is mostly non-compliant in this sense, but as I wrote in #190, we'll have a chance to have those non-compliance issues rectified in BlueSky. If int16_t
is considered too small (which I don't think is the case here considering the purpose of this container), it should be replaced with int32_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you are reading MISRA correctly here. Rule 3-9-2 (which is only Advisory) recommends typedefs to manually control the size of integer types. The intent of all the integer rules is to prevent defects. My recommendation to use a machine word is also to prevent alignment defects (which are especially insidious where un-aligned access is supported but slow). If we must we can use a ptrdiff_t but that'd just look weird. Let's feel free to add "breaks MIRSRA x-x-x because foo is safer" comments (we can even make them Coverity annotations) rather than slavishly adhering to the standard. But I do agree we should consider and use MISRA where it make sense.
That all said, I don't think we're packing these nodes into anything so my comment here is probably just cosmetic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: it can't be unsigned because it may be negative while performing balancing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep. ptrdif_t and int are both signed machine words on all platforms...except 8-bit platforms...crap. int is 16 bits on AVR according to the spec (caveat: most AVR program make use of typedef "fast int" which is 8-bits) and ptrdiff_t can be larger than a machine word. Okay. I'm really over-indexing on this, sorry. Let's keep the int16_t and I'll get to the part where this data structure is actually used to be sure we're not causing un-aligned accesses on some platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you are reading MISRA correctly here. Rule 3-9-2 (which is only Advisory) recommends typedefs to manually control the size of integer types.
Indeed, I was misremembering things by thinking that the rule is required rather than advisory in MISRA. Still, well-defined, deterministic width may eliminate certain overflow-related defects. One example comes to mind -- there was a discussion (either on the mailing list or on Gitter) about this code misbehaving on AVR due to integer overflow: https://github.com/UAVCAN/libcanard/pull/17/files#diff-ec9999a8cb27a176bf926dd68dfd7426R303
I have completely eliminated native types from libcanard since then; the only thing that stopped me from doing the same in libuavcan is the size of its codebase.
HIC++ also agrees here:
To enhance portability, instead of using the standard integer types (signed char, short, int, long, long long, and the unsigned counterparts), size specific types should be defined in a project-wide header file, so that the definition can be updated to match a particular platform (16, 32 or 64bit). Where available, intN t and uintN t types (e.g. int8 t) defined in the cstdint header file should be used for this purpose.
I don't see how the alignment argument is relevant at all, considering that we don't use packed structs ever. Consequently, I don't see how reliance on native types or platform-defined types like ptrdiff_t
makes the code less prone to defects.
Generally, we should emphasize robustness over optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're in agreement here. Sorry (I commented too fast on this int16 and now Zarkopafilis has to listen to us drone on about alignment issues where there aren't any).
Generally, we should emphasize robustness over optimization.
Agreed but unaligned access is a hard-fault on some processors so this wasn't a comment that was just about optimization.
What's probably more common (and yes, Coverity has griped about this in the current code base), when using fixed-width integers for things that are dimensions you can get interesting issues when forming loops where the size is a size_t (i.e. a machine word) and the other comparison term is a uint16_t or some other non-machine dependent word. You'll see possible infinite loops occur in such conditions. What I'm getting at is that this int16_t is bound to cause some humans to scratch their heads about implicit integer promotions at some point. The problem moves around though since (see my later comment) you need to worry about this all over again when you subtract two signed integers.
I guess I'm fine with it but I don't love it.
int count = 0; | ||
bool res = true; | ||
|
||
tree->walkPostOrder([expected, &count, &res](Entry*& in) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this frankly concerns me. We're relying on a given std::function implementation to have a small buffer optimization large enough to hold at least 3 machine words of data for the capture. This seems reasonable but I have no data to know that. Perhaps we should just define a functor we use to do this traversal? Also, what are we using this method for exactly? Only testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is on the tests, where heap allocation is permitted. In general, the traverse is only used for tests. It's the only proper way of checking the contents of the tree, without exposing Node
and root_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'm really sorry I haven't had the time to review the actual AVL part of this code. Here are some comments in the meantime that do need to be addressed.
protected: | ||
struct Node { | ||
T* data = UAVCAN_NULLPTR; | ||
int16_t h = 1; // initially added as leaf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're in agreement here. Sorry (I commented too fast on this int16 and now Zarkopafilis has to listen to us drone on about alignment issues where there aren't any).
Generally, we should emphasize robustness over optimization.
Agreed but unaligned access is a hard-fault on some processors so this wasn't a comment that was just about optimization.
What's probably more common (and yes, Coverity has griped about this in the current code base), when using fixed-width integers for things that are dimensions you can get interesting issues when forming loops where the size is a size_t (i.e. a machine word) and the other comparison term is a uint16_t or some other non-machine dependent word. You'll see possible infinite loops occur in such conditions. What I'm getting at is that this int16_t is bound to cause some humans to scratch their heads about implicit integer promotions at some point. The problem moves around though since (see my later comment) you need to worry about this all over again when you subtract two signed integers.
I guess I'm fine with it but I don't love it.
return 0; | ||
} | ||
|
||
return static_cast<int16_t>(heightOf(n->left) - heightOf(n->right)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is right. You should write some unit tests to cover the difference being > 0xFFFF since you'll get integer overflow and the static cast is doing nothing except making the code harder to read. If you want to adjust the value to be simply +-2 you'll need to promote it to an int32_t first:
return static_cast<int16_t>(heightOf(n->left) - heightOf(n->right)); | |
const int32_t diff = heightOf(n->left) - heightOf(n->right); | |
return std::max(-2, std::min(2, diff)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if you got 65536 pending messages to transmit, you are going to run out of resources way sooner than filling up the tree. Keep in mind that expired nodes get removed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if you got 65536 pending messages to transmit, you are going to run out of resources way sooner than filling up the tree.
That's a bold assumption that is too fragile to be as implicit as it is now (specifically, memory-rich applications like GNU/Linux-based nodes can theoretically exceed this limit). I think Scott is right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from this part, the problem still exists on the int16_t chosen as the data type for each node's height.
I'm adding a static cast to the proposed changes so that the compiler doesn't treat the implicit warning as an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Scott is right..
I'll take what I can get at this point. Also, Scott is wrong about the static_cast. You do need it.
return static_cast<int16_t>(heightOf(n->left) - heightOf(n->right)); | ||
} | ||
|
||
static int16_t maxOf(int16_t a, int16_t b) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to write this ourselves: https://en.cppreference.com/w/cpp/algorithm/max
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we care about semi-broken C++ implementations like that of NuttX (unless they fixed it since I last dealt with it), we must be extremely careful introducing new dependencies on STL. This is the reason why we have reinvented a huge wheel in uavcan/util/templates.hpp
- none of this stuff one would expect to find in the standard library is available in NuttX. It is also why we can't easily just include algorithm
, so I suggest keeping this code here unchanged.
I don't think that we should retain such drastic compatibility policies in bluesky, but this legacy code should remain compatible with older platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Shouldn't this go into templates.hpp then? I don't want to have multiple "min/max" functions lying around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this go into templates.hpp then?
I guess it should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're so close now. Thanks again for sticking with us. We just need to cleanup uc_can_io.cpp to make it so I can patch the CAN-FD branch and to finish cleaning up the whole balanceOf int16 debacle (sorry).
I did a test patch and everything looks good in my downstream source. Coverity didn't flag anything and my quick read though the AVL core logic didn't raise any flags.
} | ||
|
||
const int32_t diff = heightOf(n->left) - heightOf(n->right); | ||
return static_cast<int16_t>(std::max(-2, std::min(2, diff))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pavel wanted to use your maxOf but moved into templates.hpp. We'd also need minOf and an int32 version which suggests a template which then brings type traits to play if we want to prevent inefficient uses which will lead to another round of reviews and diving into the c++ spec...
fuck it.
Let's just go with minOf and maxOf here in this class.
I'm really sorry about the churn on this one line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this fails for lpc11c24
, note: mismatched types 'std::initializer_list<_Tp>' and 'int' return static_cast<int16_t>(std::max(-2, std::min(2, diff)));
Do we keep the last version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should work when you switch back to your own versions of minOf/maxOf. Its failing because of changes to stl (see: @pavel-kirienko is almost always right)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but do I need to keep the maxOf
& minOf
as-is, or move it to the templates.hpp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can keep them here.
@@ -0,0 +1,610 @@ | |||
/* | |||
* Copyright (C) 2019 Theodoros Ntakouris <[email protected]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a coverage report. Looks pretty good. It'd be nice to cover some more of the branch conditions missed but overall this is excellent work. Thanks!
The autoformatter has to be fixed asap
If I were Scott, I would have suggested to keep it here as is. There is no
use having non-generic utility functions in the common area.
…On Sat, Mar 2, 2019, 00:18 Theodoros Ntakouris ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In libuavcan/include/uavcan/util/avl_tree.hpp
<#198 (comment)>:
> +
+ void deleteNode(Node*& n) {
+ if (n != UAVCAN_NULLPTR) {
+ n->~Node();
+ allocator_.deallocate(n);
+ n = UAVCAN_NULLPTR;
+ }
+ }
+
+ static int16_t balanceOf(Node* n) {
+ if (n == UAVCAN_NULLPTR) {
+ return 0;
+ }
+
+ const int32_t diff = heightOf(n->left) - heightOf(n->right);
+ return static_cast<int16_t>(std::max(-2, std::min(2, diff)));
Yes, but do I need to keep the maxOf & minOf as-is, or move it to the
templates.hpp?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#198 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ADJUZAVAQB5uVgvGq6z934lhYMr32VFgks5vSacYgaJpZM4aVA0y>
.
|
🎉 great job @Zarkopafilis |
API's signature has not changed regarding Qos', but all Qos logic has been removed / we no longer check if Qos is same in order to have entries sorted in the queue => I corrected some tests regarding Qos changes
Important Currently equal-priority entries (which is calculated purely from frame priority) is not taken into account. Shall I add functionality where if nodes have the same priority, they are linked to the same node (and afterwards traversed/removed at the order they are inserted?). For example if at
tx_queue.cpp
test you try to add any frame twice in a row (without removing it), you'll get a push rejected (and maybe a memory leak).On some tests I am unsure on how to proceed to correct them - or what changed their behaviour: ex. on
FirmwareUpdateTrigger.MultiNode
, some timer is running when it should not (towards the end).I messed with
DynamicNodeIDClient.Basic
,CanIOManager.Transmission
but unsure if it's correct.For now,
can_io.hpp
,uc_can_io.cpp
,avl_tree.hpp
,tx_queue.cpp
,io.cpp
(and the other test behaviours) should be reviewed and discussed. :)(I was writing end-of-semester tests and missed the dev call, but this week I'll join for sure)