Poor quality control #34096
Hey Guys,

This issue isn't about a particular problem I am experiencing, but rather the lack of quality control around Salt. I've experienced so many issues in setting up our Salt environment, most of which already have tickets in GitHub. It's really disheartening to find out that the project you championed is so buggy. I'm about to rip Salt out of our infrastructure because I cannot get it to work reliably, and I've lost confidence in it.

I hope this ticket might bring attention to the serious state of affairs regarding the quality of Salt.

Thanks,
Coen
I have to agree, there are a lot of regression bugs lately, some serious, some just annoying. Some to mention:
I love Salt, but I really feel that there should be more focus on bugs, because I'm starting to feel the same pain as @coen-hyde; it's getting frustrating. Keep up all the good work, Salt is amazing, but don't lose control over it.
Yes, this is a never-ending challenge in software development, and one that is particularly difficult in Salt. We do have quite a lot in place to help minimize these issues and are working on more. Our automated test suite is run against multiple platforms for every pull request and must pass before every release and before any code is introduced into Salt. We are currently working hard on expanding the test suite as well.

One of the main problems we face is the sheer coverage and usage of Salt. In this regard we face many of the same problems as kernel and OS developers: we cannot control the environment in which our software runs, and we need to manage that environment very broadly.

With all this said, we do not shrink from the challenge, and we feel that we have greatly improved! 2016.3.0 is a .0 release, and I am very pleased to say that while regressions were introduced, it was by far the most stable .0 release we have had in years, which I am very excited about. We have also been able to quickly handle almost all of the regressions and many bugs in the .1 release. So I think that our ongoing efforts are paying off!

As for what more we can do: bug fixes and more testing have been a big part of SaltStack's efforts recently, and we have ramped them up even more as we have been able to allocate more resources to the effort. As members of the community, your continued help in reining in bugs and issues is greatly appreciated; putting out software as extensive and large as Salt is a huge effort.

In a nutshell, making Salt is like making enterprise operating systems. Just as Red Hat 7.0 came with many bugs and was not widely used until 7.1 and 7.2, Salt shares a similar level of coverage, and you should expect issues, particularly with .0 releases. We are always very grateful for your efforts and help in making a great open source project!
Thanks for the reply @thatch45. I just had a look at the recently merged PRs. It's good to see a lot of work going into the test suite. Test coverage is always very hard to add in retrospect; given all the different setups and environments Salt might run in, writing tests is hard. But that's all the more reason to be very diligent about making sure good tests are written. After reviewing some of the PRs unrelated to the test suite, I can see functionality and bug fixes are still being merged without tests, or with failing tests.

There are currently 91 P1 bugs and 469 P2 bugs open. Each one makes Salt unusable for someone (or many people). They might be able to work around the issue, but stack a few and all of a sudden it's not even possible to use Salt. As an armchair critic, it might be time to stop going wide on functionality and ensure the core of Salt is operating as intended.

Like @githubcdr I'm very appreciative of what you've created. I just want to use it! Thanks for hearing us out.
Hi Coen,

Regarding the PRs being merged with failing tests, there are a couple of […] We've made improvements here but have a long road still to travel. I can […] Like you, I would love to get to the point where all tests pass for every […]

As far as merging PRs without tests, this is tricky. We've debated […] Right now, my view at least, is that the test suite isn't sufficiently […] Internally, we have moved the responsibility for the creation of new tests […] That said, I'm all ears when it comes to hearing what we can do to […]

As far as the bug list goes, believe it or not that list is, for the most […] That said, the number of outstanding bugs is way, WAY too large. Internally, I can assure you that the focus is on bug-fixing versus feature […]

I'm more than happy to answer additional questions about this issue because […]

EDIT: Updated to correct a mistake. The goal is a dozen new tests a sprint, which is 3 weeks. -mp
More tests == awesome! Is there a list of "tests that need writing / rewriting" somewhere? I'd be happy to knock one or two off that list. And if there isn't a list, I'm gonna tackle the ipset stuff, because it's become mission critical for us and because the code could use some attention. :)
Hmm, well I've just been dealing with an issue, and I certainly don't want to point the finger of blame or even appear to be doing so, but one thing that is clear is that failing to document things properly can be a huge source of regressions, when people who don't know what the code is supposed to do (and reasonably have no good way of knowing) make changes. So one essential part of quality control is documentation. Let's have the people responsible for merging always insist on good documentation of the functionality.
@ahammond I actually asked QA for that very thing a while ago and I'm hoping that they get it to me very soon. When they do, I will get it online somewhere that we can all see. Thank you for the reminder! (cc: @ssgward ) If there's anything I can do to assist in the ipset testing, please let me know!

@hrumph I had this exact thought this week, actually. I completely agree with you. From this point forward, I'm going to start being considerably more aggressive about requiring both in-line docs and function docs. It's long overdue.
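As a concrete illustration of the kind of function docs being discussed, here is a minimal sketch in the reST style Salt's execution modules already use; the module, function, and version number are invented for the example:

```python
def set_limit(name, value):
    '''
    Set a resource limit for the named service (hypothetical example).

    .. versionadded:: 2016.3.0

    name
        The service whose limit should be changed.

    value
        The new limit, as an integer.

    CLI Example:

    .. code-block:: bash

        salt '*' limits.set_limit nginx 1024
    '''
    # Documenting expectations up front tells both reviewers and later
    # contributors exactly what this function is supposed to do.
    if not isinstance(value, int):
        raise ValueError('value must be an integer')
    ...
```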
The test suite also needs to run on other platforms, e.g. Windows, Solaris x86, SmartOS, etc.
It might also be worth timing each test, to see whether run times increase, and somehow recording and graphing the results so internal Salt staff can review them.
@damon-atkins QA is almost done with OS X integration and @twangboy is hard at work getting the test suite to run on Windows. We'd love help with the other two platforms, or with either of those projects, if there are volunteers. We do time every single test. Those results are available in the 'Performance Trend' graphs on each branch, for example here: https://jenkins.saltstack.com/job/branch_tests/job/2015.8/job/salt-2015_8-linode-debian8/performance/
In my opinion, this discussion is essential for the future of Salt. As you already pointed out, the community is currently facing bad regressions with each version, making us lose confidence in this amazing tool. To summarize:
Additionally, here are my thoughts:
It would also be great if SaltStack could organise internal (and maybe even external?) bug sprints. We all love new features, but I would very much like to see stability and bugfixes as the primary goal in the coming releases.
@BABILEN I'd love to see that too. I'm cc:'ing our community manager, @UtahDave, to see if perhaps he can help spearhead something there.

@tomlaredo It's actually engineering's job. Prior to this point, test-writing was the responsibility of QA. Internally, that has switched to engineering. Those tests will be primarily integration tests, but unit tests will be written where appropriate. I agree entirely on the point about documentation. I'm going to personally start doing a much better job of ensuring that documentation is written before accepting PRs.

As far as the regression fixes go, I believe that we do tend to get fixes written and merged as soon as regressions are discovered. The more difficult problem seems to be the rush of fixes that come in in the days following a release. Frequently, as one is discovered and fixed and we prepare to package the release, another one comes in. Of course, the speed of packaging is something that we're working hard to improve as well, so that we can get these releases out the door as quickly as possible when it's critical to do so.

Finally, I'm going to ask @rallytime if she can please modify the template for PRs to include a link to the page you mention above. I agree that this might help.
@tomlaredo and @cachedout Great idea - I've added the link to the contributing docs to the PR template in #34144.
@rallytime: I think that you can even go a bit further by replacing the "Tests written? Yes/No" prompt (which is an incentive not to write tests at all) with "Unit tests description" (with no question at all).
Unit tests are good, as you can just test your own code bits... But I think there needs to be some doco on how to write tests, with examples for people who have never written any tests before, or a video presentation. It should also cover the traps around multiple platforms, e.g. a Windows test not running on Unix (see the sketch below).
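To illustrate one of those platform traps, here is a minimal sketch of guarding a Windows-only test so it skips cleanly on Unix; the test class is invented and it uses only the standard library rather than Salt's own platform helpers:

```python
import sys
import unittest

# Without this guard, the winreg import below would blow up the whole
# test run on Linux/macOS instead of reporting a clean skip.
@unittest.skipIf(not sys.platform.startswith('win'), 'Windows-only test')
class WinRegistryTestCase(unittest.TestCase):
    '''Hypothetical test for Windows-only code.'''

    def test_registry_available(self):
        # winreg only exists on Windows, so it is imported inside the
        # test body that the skipIf decorator protects.
        import winreg
        self.assertTrue(hasattr(winreg, 'HKEY_LOCAL_MACHINE'))

if __name__ == '__main__':
    unittest.main()
```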
@tomlaredo Thank you for the suggestion. @damon-atkins We do have documentation on how to write tests for Salt. There is a tutorial on Salt's test suite here, and there are also links to more in-depth documentation on how to write each type of test (unit or integration) in the tutorial. Those links are here: We certainly can answer any questions about that documentation or suggestions on how it can be improved.
Hi @rallytime, I used Salt's doco to write some non-MagicMock tests, i.e. real tests which change items and verify they have changed. It's the MagicMock side that needs more doco: a bit of an intro on how it works, and the sample code should have comments explaining each line. The MagicMock doco seems to jump into details at 200%. Most tests seem to be very similar, so just a bit more time spent explaining them would help everyone. Most people are time-deprived, so helping them pick up the needed skills quickly is important to get tests developed. Actually, sometimes I find bad doco a lot easier to swallow after watching a video of someone explaining an example. People seem to put more "explanation detail" into a video when they explain something vs. words in a document.
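As a hedged example of the line-by-line commented walk-through being asked for, here is a minimal, self-contained sketch of the mocking pattern Salt's unit tests commonly use. The `service_running` function is invented; real Salt tests patch the module's actual `__salt__` dunder (injected by the loader) via Salt's own test support code rather than a module-level dict:

```python
import unittest
from unittest.mock import MagicMock, patch

__salt__ = {}  # populated by Salt's loader in real life; empty here


def service_running(name):
    '''Hypothetical execution-module function under test.'''
    return __salt__['cmd.retcode']('service {0} status'.format(name)) == 0


class ServiceRunningTestCase(unittest.TestCase):
    def test_running(self):
        # Build a fake cmd.retcode that always reports success (0),
        # so the test never shells out to the real system.
        mock_retcode = MagicMock(return_value=0)
        # patch.dict swaps the dunder's contents for the duration of
        # the with block, then restores the original dict afterwards.
        with patch.dict(__salt__, {'cmd.retcode': mock_retcode}):
            self.assertTrue(service_running('nginx'))
        # The mock remembers its calls, so we can also assert on the
        # exact command the function constructed.
        mock_retcode.assert_called_once_with('service nginx status')


if __name__ == '__main__':
    unittest.main()
```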
Honestly, unit tests, especially mocked tests, should be reserved for cases where they are really needed because there's no other way. Unit tests should test a specific function, and yes, Salt has unit tests which should actually be integration tests (that's where mock is most used). Of course, we still need to add, for example, Solaris test suite runs, and because of that, perhaps a unit test is better than an integration test which currently does not run. But the main idea that I'd like to leave here is that integration tests should be preferred. So if you're investing time in learning how to write tests, invest in integration tests; they are slower, but they run against a real system.
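For contrast, a Salt integration test of that era looked roughly like the sketch below. It is an approximation, not a copy-paste recipe: it only runs inside Salt's test harness, which spins up a real master and minion before handing control to the test, and the class name here is invented:

```python
import integration  # Salt's test-suite support module (not on PyPI)


class PingModuleTest(integration.ModuleCase):
    '''Runs against a live, locally spun-up test minion.'''

    def test_ping(self):
        # run_function executes a real execution-module call on the
        # test minion, exercising the loader, transport, and module
        # code together rather than a mocked slice of it.
        self.assertTrue(self.run_function('test.ping'))
```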
Mmm the "integration unit test" I developed run, as part of the suite, the only reason they do not do anything as they were windows tests for windows only code. "tests/unit/modules/.." |
This is the main reason why I am moving away from Salt after using it for more than a year. Salt looks great on paper, but once you start using it you see that those cool modules that promise to save you a lot of work eventually make you waste a lot of time. These are the main problems that I see:
Salt is a great project, but it needs to focus on what is most important for a tool that handles hundreds of servers: stability and reliability.
@diego-XA Those are all valid points that should be addressed. Out of curiosity: What are you planning to move to?
I am planning to move to Ansible, as it is also Python-based and it also uses YAML, so possibly the migration will be easier. But I am still gathering information to see how suitable it is for me. I am reading the book Ansible for DevOps, which I think is great.
I think you will find that some people have also moved from Ansible to Salt. There are other configuration management products, which I will not name, that have similar issues. In the end you get what you pay for. People can always offer bounties for fixes which are important to them, or pay for support. +1500 high-severity bugs is a bit much, though; I wonder how many of them are still valid.
Yes, I know, but it still looks more stable, and as it does not use an agent there is no possibility of causing problems like the salt-minion consuming 100% of the CPU. I think they are different types of birds, and each of them may or may not be suitable for somebody. If the bugs are not valid, then they should be closed as invalid or inactive. I personally ran into a couple of them that were quite old, although not in the most basic modules.
@diego-XA All the best. Let us know how that went. :)
@diego-XA still has a valid point, and I agree with him and feel the same. Stability matters; it creates trust, and when you lose trust in Salt it becomes a monster. I hate debugging issues related to Salt bugs; what I hate even more is a broken server configuration because of them. Moving to Ansichefenginepuppet is not a solution to this issue, but if Salt won't get better at managing stability, it will be. 1500 […]
I agree with most of @diego-XA's comments. I'm the reporter of one of the issues mentioned, and I'm surprised that I can reliably have my minions die off on their own and harm my systems with excessive CPU usage, over and over again, and this is meant to be production-ready software. There was no movement on the bug report (from my perspective) until I continued to complain that the issue is persistent and I created (what eventually were found to be) duplicate issues on GitHub.

I understand that no non-trivial software is bug-free. I understand SaltStack is free software partially created by unpaid contributors, and I should be happy with what I pay for: nothing (side note: I was planning on throwing money at SaltStack on my next round of funding; I no longer am).

I'll echo the comment about how it feels to be a champion of particular software and then watch as it fails miserably. It feels bad. Worse in my case, as I feel the issue of crappy software being released to the wild as production-ready is larger in scope than SaltStack. It is an increasingly big problem everywhere I look in the free software world that I've been championing for my entire career (I've lost count of the number of kernel-panic-causing bugs I have found in stable releases of the Linux kernel).

Please focus on quality. If software harms the system it runs on or does not reliably do its job, then it is useless garbage regardless of how many features it has or what magic its developers claim it can do. I'm not yet ready to jump ship to Ansichefenginepuppet, and I hope that SaltStack improves, but I'm certainly not pleased with its current state.
@jwhite530 I see that there is no progress on the bug report, which is a shame because it looks like quite an important bug to me. My guess is that they are saturated with so many bugs. It may be just a coincidence, but I've been checking the "pulse" of different big GitHub projects for the last month, and Salt seems to have more bugs opened than closed. Not sure how relevant this metric is, but I found it at least curious. I only looked at projects that came to mind and were on GitHub, so hopefully it is not biased, hehe. Here are some links: https://github.com/saltstack/salt/pulse/monthly
@diego-XA, issues are not bugs. I similarly watch many projects, and Salt is far more willing than most to leave an issue open even if they will not be actively working on it for a while. Many, many issues are people looking for help with how to write a state, or configure a salt master for large scale, or whatever. Other projects will simply close the issue immediately, maybe sending the user to a mailing list, or close it asking for debug logs or other info, so that if the user doesn't come back (which is often, probably because the user figured it out themselves) the issue doesn't linger. So, very different approaches to managing issues. I happen to appreciate both. Salt's approach is very open and welcoming to everyone; other approaches perhaps make it a bit easier to manage the backlog. I don't say that to take away from the legitimate issues you raise. I just want to put another perspective on it.
@lorengordon I actually think that one of the problems in Salt is the documentation, which causes people to report problems as bugs when they are actually documentation problems (documentation is wrong/unclear, syntax changes were made without making them explicit during upgrade, etc.). I would treat these issues as bugs.
Oh, other projects definitely provide the user with some information that they hope will help and close the ticket immediately, as a primary approach to their ticket-management plan. They're also very aggressive about closing old/inactive tickets. I would say that it is more common on popular projects than Salt's approach.

The pulse links are fascinating to me. Salt is as active as Docker! But I think without as much recognition or funding. At least in my circles, people know Ansichefenginepuppet but not Salt. Yet Salt is routinely far more active than any of them. Salt is much more community-driven, largely I think because of their open and welcoming approach.

And it is always odd to me when I hear people complain about Salt's documentation. I find it quite extensive, and it has improved significantly over the last two years. I can't remember a single thing I've wanted to do (that was possible/supported), that wasn't in the docs, that a quick google search didn't turn up. Now, sure, early on it was harder to figure out how to do what I wanted to do, but I largely see that as my problem, not Salt's. Every product has a learning curve, where the user at first is entirely confused about syntax and usage and terminology, and then as they use it more and more, things that were hard become easy. Nothing about the product changes, as much as the user just gets better at understanding the product's approaches to things. Plus, now I've learned a lot about yaml, jinja, and python, when two years ago I knew basically nothing about any of them, not to mention git and open source workflows in general, so, bonus! 😉

I do see the regressions with upgrades and packaging. Those are frustrating. Though packaging is seriously hard, especially across so many distributions and platforms, and while trying to maintain a fairly quick release cycle. So, I don't know, it's frustrating, but I understand it. I've also seen how much work Salt has been putting into improving and automating the packaging process, which makes me optimistic. I just try to control what I can, which means I do a fair amount of testing of new releases to make sure they work for our use cases, before using them in production. I also watch new and active issues and pull requests so I have an idea of what the problems are. And we skip releases with issues we know will bite us. We also wrap the install/upgrade/configure process with our own script so we can easily run pre- or post- tasks to handle our special edge cases.

Anyway, I definitely went off on a tangent there. Dunno if it's helpful, ¯\_(ツ)_/¯
Just stumbled across this issue and skimmed it. Interesting stuff. While we're talking about issues lingering around ;) I might as well play devil's advocate and ask: when will this issue be able to be closed? Could we quantify when "quality control" is good enough? This feels like one of those issues that could be discussed forever.
We're still lacking integration tests to stop regressions, like the one that crept into 2016.11.8.
After about a year of Salt, with some relatively advanced configuration, I concur with the premise of this issue. In general, I think the documentation, issue response, etc. -- the process-related stuff -- is pretty good for Salt. What's not so great (in my opinion) is the design of Salt's internals: the non-idiomatic "magic" globals, strange and unintuitive APIs, and spaghetti code. It's really not trivial to follow the flow of execution in Salt by diving into the code. The sprinkling of […]

Another sore spot for me, as a user, is the lack of user testing tools (as opposed to testing tools for developers of the Salt project itself). I don't want to spin up a VM to test Salt; that's too slow. Users need fast iteration over their modules, pillars, etc. They need to be able to easily and quickly test that Jinja is rendering correctly, that states used in modules have all the correct required arguments and no invalid arguments, that pillars get correctly rendered, that "includes" or "excludes" in modules do the right thing, etc. In other words, the SaltStack equivalent of […]

Finally, I have noticed that PRs are frequently merged with little to no tests. In some cases, these PRs drastically rewrite existing functionality. In other cases they end up as brand new "features". These changes get distributed with Salt core, and yet somehow have little to no support from the actual maintainers of Salt (even if you have a SaltStack enterprise subscription). To me, it seems pretty disingenuous to accept new features into Salt core that are not well tested and have little to no support, without any sort of disclaimer about the stability of said features. In many (if not most) cases, you cannot tell from looking at the documentation whether a particular module is stable, with proper deprecation handling for changes, or not. New features are touted, but under-tested. This leads, in my opinion, to the rather high volume of issues that you see today.
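On the fast-iteration point, one partial workaround available at the time is masterless rendering via salt-call or the Caller client. The sketch below is an assumption-laden example: it presumes a minion config with `file_client: local` and `file_roots` pointing at your state tree, and `webserver` is a hypothetical SLS name:

```python
import pprint

import salt.client

# Caller reads the local minion config; for masterless rendering, that
# config needs file_client: local and file_roots set to your state tree.
caller = salt.client.Caller()

# state.show_sls renders the Jinja + YAML of an SLS file and returns
# the resulting highstate data structure -- Jinja and YAML errors
# surface here without touching the live system or spinning up a VM.
rendered = caller.cmd('state.show_sls', 'webserver')
pprint.pprint(rendered)
```

It is not the user-facing test framework being asked for, but it does catch rendering errors in seconds rather than a full VM cycle.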
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.