-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add reduction using SAT and QBF solvers #407
base: devel
Are you sure you want to change the base?
Conversation
Add implementation of two new types of reduction using solvers. The basic idea was the possibility of using solvers for reduction and whether it is possible to create minimal nondeterministic automata. The solvers work with an automata representation using the CNF formula with variables representing its transitions, initial and final states. The solver then tries to find a solution that satisfies the given clause. The sets are updated until an equivalent automaton is found. The number of states is increasing from the default value making sure that the found automaton is minimal. The reductions are working with structs SatStats and QbfStats, both inheriting from the base stat AutStats, which defines the number of states and symbols of the automaton, the two sets of words, and an output stream. Declarations can be found in include/mata/nfa/types.hh. The reductions are working with the same interface as the other possible reduction algorithms which are parametrized using the ParameterMap.
We discussed with @notValord to remove the linux binaries of SAT/QBF solvers and add them as requirements. |
Hello. Thank you for the PR. One MacOS test seems to be failing. Otherwise, let me know when you finish with the changes so that the PR can be reviewed. In general, the implementation looks great to me after a quick glance.
|
Sounds good to me. However, I would go with optional requirements, ideally. However, we do not currently have a way to specify optional requirements. Maybe for now only a note in the comment when selecting the specific algorithm would be enough? That way, everyone can add complex algorithms with dependencies, but the main Mata implementation will remain as small and clean as possible. What do you and anyone else think about this? |
I agree with optional requirements and tests (maybe generating a warning instead of a failed test). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a pass through the code. There are some comments about possible refactorization and some performance fixes. We should discuss these and address them before merging the PR.
#define TSEY_NOT -3 | ||
|
||
// define charactes used for SAT and QBF solvers | ||
#define SOL_EOL std::string("0\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use only string literals here for the macros. Otherwise, the strings will have to be allocated for every single use of Mata during runtime. Furthermore, it would be ideal if the types were defined using constexpr char* SOL_EOL
, or by using string view and string view literals.
@@ -48,6 +50,315 @@ struct Nfa; ///< A non-deterministic finite automaton. | |||
/// An epsilon symbol which is now defined as the maximal value of data type used for symbols. | |||
constexpr Symbol EPSILON = Limits::max_symbol; | |||
|
|||
// defining indexes of logic operators used in an input vector for tseytin transformation | |||
#define TSEY_AND -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go for clarity here and use TSEYTIN_{AND, OR, NOT}
. Furthermore, to ensure proper usage, I would define these as constexpr int TSEYTIN_{AND,OR, NOT} = <value>;
.
Are these indices and SOL characters to be used by the users of the library, or is it only necessary inside the library algorithms? If it is the latter, I would move these definitions to the appropriate files and enclosed them in as small scope as possible. For example, TSEY_AND
is currently only used in src/nfa/operations.cc
. Therefore, they should be defined in an anonymous namespace inside the file. Similarly for SOL_xxx
.
* @param[in] output Output stream | ||
* @return The first variable index that was not utilized | ||
*/ | ||
size_t reduction_tseytin(const std::vector <int>& input, size_t max_index, std::ostream& output); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this method be called tseytin_transformation()
instead of a reduction
? Furthermore, if I am seeing right, this method does not use the underlying automaton in any way, so it can be a utility function, not even a method o
NFA, no? Stored somewhere in utils/
folder, probably.
@@ -582,10 +619,13 @@ inline bool is_included(const Nfa& smaller, const Nfa& bigger, const Alphabet* c | |||
* @param[in] alphabet Alphabet of both NFAs to compute with. | |||
* @param[in] params[ Optional parameters to control the equivalence check algorithm: | |||
* - "algorithm": "naive", "antichains" (Default: "antichains") | |||
* @param[out] default_runs Optional parameter for intersection runs to return the found word |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is wasteful to pass by reference and have a statically initialized pair of pointers (which will get modified when the equivalence check gets run the second time -- therefore, they will no longer be the default runs). A pointer to a pair of runs should be used instead, default initialized to nullptr
.
However, I think the global variable for default runs should be removed entirely and the parameter here should be a pointer default initialized to nullptr
.
* @param[in] debug Flag for debug output as the reduction can take some time to get the info about the current status | ||
* @return Reduced automaton | ||
*/ | ||
Nfa reduce_sat(const Nfa &aut, const ParameterMap& params = {{"solver", "sat"}}, bool debug = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that both the specific reduction algorithms should be hidden behind the general reduce function. That is, they should not be in the public API for Nfa
, and be only accessible in the anonymous namespace of operations.cc
. Or, since we are starting to have way to many reduction algorithms, we could create a new source file specifically for reductions reduction.cc
with potentially a new header file reduction.hh
which can be included when one want to call the specific reduction algorithms. Creating a new source and header files seems like the best approach to me.
size_t mata::nfa::SatStats::example_clauses(size_t max_index) { | ||
size_t transitions_num = this->alpha_num * this->state_num * this->state_num; | ||
|
||
for(auto word: this->accept) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will create a local copy of every vector. Use auto&
when you want to create a reference to an auto deduced data type. This appear on multiple places around the code. If possible, try to find all of them and use references.
return max_index; | ||
} | ||
|
||
void mata::nfa::SatStats::recurse_tseytin_accept(const std::vector<int>& base, size_t state, Word word, const unsigned pos, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
word
should be passed by a reference, here, no? And similarly in other functions. If possible, try to have a look at the function declarations and see whether some arguments cannot be passed by reference.
/** | ||
* Base class for representing the input parameters for SAT and QBF reduction | ||
*/ | ||
struct AutStats { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of all of these structs should be somewhere else. These types should not be used by the user of the library, correct? Therefore, I think creating a header file and a source file for reduction (reduction.cc
and reduction.hh
and moving code of all reductions and all these structs inside is a reasonable approach).
Add implementation of two new types of reduction using solvers. The basic idea was the possibility of using solvers for reduction and whether it is possible to create minimal nondeterministic automata.
The solvers work with an automata representation using the CNF formula with variables representing its transitions, initial and final states. The solver then tries to find a solution that satisfies the given clause. The sets are updated until an equivalent automaton is found. The number of states is increasing from the default value making sure that the found automaton is minimal.
The reductions are working with structs SatStats and QbfStats, both inheriting from the base stat AutStats, which defines the number of states and symbols of the automaton, the two sets of words, and an output stream. Declarations can be found in include/mata/nfa/types.hh.
The reductions are working with the same interface as the other possible reduction algorithms which are parametrized using the ParameterMap.
The tests will be added shortly in future commits together with some small fixes. I was not sure with the proper location for the reduction implementation and its structures, as it is an another type of reduction the src/nfa/operations.cc seemed appropriate and the used structures were put into include/mata/nfa/types.hh but can be moved if necessary.