-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discussion on refactoring of some things, only comments, not intended for merge. #277
base: devel
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented on majority of the suggestions/questions. If others want to discuss the comments, I would suggest continuing with the existing conversations, if the problem they want to discuss is the same as in the conversation and talks about the same comment in the code.
@@ -112,6 +132,7 @@ Simlib::Util::BinaryRelation compute_relation( | |||
* @param[out] prod_map Mapping of pairs of the original states (lhs_state, rhs_state) to new product states. | |||
* @return NFA as a product of NFAs @p lhs and @p rhs with ε-transitions preserved. | |||
*/ | |||
// all parameters needed/used? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed some time ago that all functions performing some kind of renaming of states should return the renaming map. But it seems to me that we never use these. I think we have no use case for the renaming so far. There might be some in the future, but we do not have one at the moment. Do we want to keep the renaming maps, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can ask at the mejtyng what the uses can be, and remove it if there is nothing concrete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe really just putting the one struct of the special type as the optional parameter there, with some arbitrary initial choice of optional parameters as members? That would allow us to change the insides of the struct later without having to modify the interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that here we are talking about output parameters. When we pass the parameters, we normally do not want to be able to change the parameter class inside the function (we want to pass it by const reference). Hence, mixing the input, output, and input/output parameters in a single class would be confusing and error-prone, I believe.
But you are right that this would solve the issue with changing interfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as in the above comment, for these maps, I would probably put them to the key-value store
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #277 (comment) for what problems I find with this approach for output parameters, as stated further in the linked comment.
bool is_included_naive(const Nfa& smaller, const Nfa& bigger, const Alphabet* alphabet = nullptr, Run* cex = nullptr); | ||
//naive is not a good word, maybe textbook, or explicit, construct_first | ||
//the alphabet parameter is useless in inclusion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe _basic
or _simple
? But I do not mind _naive
. Why do you think it is not ideal? We talk about the implementations/algorithms as naive
in our discussions all the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussions are private, code is public. I would go for textbook
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the algorithm, however basic, has a name, we can use the name instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the explicit
. I have read a textbook with the antichain-based inclusion checking. Basically, you could append textbook
to every function name in mata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like naive because it is not really clear what it is outside our group and it is somehow pejorative.
I would like textbook, I think everybody will know, I don't think it can be really confused with antichains. Explicit is ok too, but it requires a lot of context to get the meaning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think explicit
is little bit better. textbook
seems a little bit weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like basic
/simple
, but textbook
is also ok for me. I agree with Lukas, naive
sounds pejorative, explicit
does not really give me any information.
bool is_included_naive(const Nfa& smaller, const Nfa& bigger, const Alphabet* alphabet = nullptr, Run* cex = nullptr); | ||
//naive is not a good word, maybe textbook, or explicit, construct_first | ||
//the alphabet parameter is useless in inclusion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alphabet parameter is not useless. If you already have the alphabet, you can pass it to the function and save yourself creating the alphabet in the function again. Inclusion check can be rewritten, but for the current implementation, the alphabet parameter is an optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need an alphabet in inclusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because inclusion is computed using a complement of an automaton, which in our interface requires the alphabet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In inlcusion, the complement can be with respect to the symbols used in the other automaton, not with respect to an external alphabet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, the alphabet can be inherited. There might be an issue when you construct the complement wrt an alphabet that do not contain all symbols from the automaton before complement (but who knows).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @Adda0 said (and it is clearly explained in the comment for this function), alphabet is there as an optimization if you already have it. Otherwise, you have to go trough the whole delta of both automata to create the alphabet and that could be costly. But maybe not, you are going to do complement anyway, that will probably take much more time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, the efficient antichain inclusion does not need the alphabet at all, the explicit one will do much more constly operations than collecting alphabet anyway.
@@ -98,6 +102,22 @@ bool is_universal_naive(const Nfa& aut, const Alphabet& alphabet, Run* cex); | |||
*/ | |||
bool is_universal_antichains(const Nfa& aut, const Alphabet& alphabet, Run* cex); | |||
|
|||
//the alphabet parameter should be optional? The automaton can have its alphabet too. Run should be optional. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that Run
should be optional.
Making alphabet optional would mean that we can (in this order), check for existence of alphabet in:
- function parameter,
- automaton member variable
alphabet
, and - construct alphabet from transitions of the automaton (this is optional and probably not wanted - it seems strange to check for universality from symbols on transitions of the checked automaton only - therefore, we could fail in the case that neither 1) or 2) have alphabet specified .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
last time we said that automaton would always have an alphabet, so 3 would not apply
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I am not sure if it is possible to implement such an alphabet. It will need to be investigated further.
And 3. would still apply. We would check for universality for only symbols in the automaton. Therefore, it is up to us whether that seems OK or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, alphabet
should be optional, if it is not there, use alphabet of the automaton. Otherwise I would throw error.
Nfa Mata::Nfa::Algorithms::minimize_brzozowski(const Nfa& aut) { | ||
//compute the minimal deterministic automaton, Brzozovski algorithm | ||
return determinize(revert(determinize(revert(aut)))); | ||
} | ||
|
||
//method? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed that functions should return new instance and methods should compute inplace, if at all possible. Therefore, this function should remain a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, algorithms that merge states could be implemented in-place. Simulation, Hopcroft, ... Not Brzozovski probably.
It could makse sense to implement them in place. ... ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, we could use chaining and write something like intersection(nfa1.trim().revert(), nfa2.determinize())
. This would allow us to eliminate some optional parameters from several functions, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if the in-place implementation brings us something here (performance wise). I don't know how to do quotienting efficiently on our structure of NFA without a fresh new construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, true, one could avoid some of the allocations, not all.
The main question is whether we really want this dogma : "not in place therefore function" and not use methods that return new automata.
Maybe I just don't remember what the arguments were.
I
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is more a matter of taste. It would be consistent if we kept the dogma, but there is not much to gain from that. I would be OK with using both approaches interchangeably. For example, we can have one (or more) from these options implemented:
Nfa Mata::Nfa::concatenate(const Nfa& nfa1, const Nfa& nfa2);
Nfa Mata::Nfa::concatenation(const Nfa& nfa1, const Nfa& nfa2);
Nfa Mata::Nfa::concatenated(const Nfa& nfa1, const Nfa& nfa2);
Nfa Mata::Nfa::Nfa::concatenate_with(const Nfa& nfa2);
Nfa Mata::Nfa::Nfa::concatenated_with(const Nfa& nfa2);
All the options are acceptable, in my opinion. But maybe for operations working with two automata, having functions is a better design, clearly showing that the original automata are not modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like concatenate
and concatenate_with
. And I'm fine with concatenation
. I like the idea of chaining, it could lead to elegant and clear code.
I have no strong feeling regarding the inplace algorithsm, I think @vhavlena is right, that the savings will be minimal.
@@ -789,6 +819,7 @@ Nfa Mata::Nfa::reduce(const Nfa &aut, bool trim_input, StateToStateMap *state_ma | |||
return result; | |||
} | |||
|
|||
//Might be a method, but it will not be in-place anyway, so I don't know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed that functions should return new instance and methods should compute in-place, if at all possible. Therefore, this function should remain a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, but, actually, would it be better to call it determinized - identification of the returned thing? determinize sounds a bit like modifying the given thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. But maybe then it should be a method which operates on the single instance and returns a new instance. That is the same as nfa.single_symbol_automaton()
. However, we need to keep in mind that
// ...
nfa.determinized().trim();
std::cout << nfa.print_to_mata();
// ...
does not show all that clearly that what is printed is the original instance, not the determinized and trimmed one. This code is in fact a bug.
However, if we get used to nouns returning a new instance, it could work nicely, and it will be clearer.
@@ -874,6 +905,8 @@ std::ostream& std::operator<<(std::ostream& os, const Mata::Nfa::Nfa& nfa) { | |||
return os; | |||
} | |||
|
|||
//Are the following functions for alphabets there to stay or to be refacvtored together with alphabets? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are meant to stay (we use them in Noodler), but they will work with different alphabets and they may be refactored due to that quite a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. Maybe we would not need all the variants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not know yet. We will have to refactor alphabets, refactor how we work with alphabets in Noodler, and finally review which of these functions we need and use and which can be deleted.
I would not close the conversations as resolved when they are resolved but not implemented yet. Others will not see these discussions, and we will easily skip over the entire discussion and forget to act on the issue according to the final agreement in the discussion. At least until we have a separate issue, record in notes from the meetings, etc. where it cannot be missed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this was some volume of discussion.
My general view: I would really advocate for keeping it conservative and traditional: nouns for objects/subjects and verbs for actions, embracing as much as C++ idioms as possible. Going for sheer number of characters will not lead to clarity, quite otherwise. Beware of Curse of Knowledge; knowing right now that one_letter_aut
returns something and is not a property does not mean that next year you will scream in horror, what the fuck it is. Because, since I never used this function, I hardly have any idea what it really should do.
My comments were not meant to offend anyone, I apologize, if I hit some nerves, but I'm really thinking of long term not short term future and I always regretted being too smartass with my naming convetions.
P.S. Sorry for late review >.<
bool is_included_naive(const Nfa& smaller, const Nfa& bigger, const Alphabet* alphabet = nullptr, Run* cex = nullptr); | ||
//naive is not a good word, maybe textbook, or explicit, construct_first | ||
//the alphabet parameter is useless in inclusion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think explicit
is little bit better. textbook
seems a little bit weird.
//actually, what about to use this, with more optional parameters through structures | ||
//struct univ_params | ||
//{ | ||
// Alphabet* alphabet = NULL, | ||
// Run* cex = NULL, | ||
//}; | ||
// bool is_universal_antichains(const Nfa& aut, univ_params={NULL,NULL}); | ||
// and call as | ||
// bool is_universal_antichains(const Nfa& aut); | ||
// bool is_universal_antichains(const Nfa& aut,{.alphabet = bla}); | ||
// bool is_universal_antichains(const Nfa& aut,{.cex = bli}); | ||
// bool is_universal_antichains(const Nfa& aut,{.alphabet = bla, .cex = bli}); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, we could implement singleton-like configure structure (or structure with class parameters), where one could set configurations up and not use function parameters? The idea is something like following:
struct Configuration {
string minimization_strategy = "default";
string determinization_strategy = "default";
void* cex = nullptr;
void* alphabet = nullptr;
...
}
bool is_universal_antichains(const Nfa& aut);
Configuration.alphabet = bla;
bool is_universal_antichains(const Nfa& aut);
Configuration.cex = bli;
bool is_universal_antichains(const Nfa& aut);
Configuration.alphabet = blabla;
Configuration.cex = blibli;
bool is_universal_antichains(const Nfa& aut);
The benefits are that (1) configuration parameters are at one place only, (2) you can set some parameters for bigger blocks of code and not keep writing lots of parameters, ...
The cons are that (1) you write more code, (2) you might need to apply config to single scope only and then reset it, which could again results to some problems.
//should we drop the create_ from all the names? | ||
Nfa create_single_word_nfa(const std::vector<Symbol>& word); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said before, I would prefer Build::
namespace and no create_
, however, for me unity is more important, and if one function does not make sense without create
, then I would keep the prefix for all functions.
Nfa construct(const Mata::Parser::ParsedSection& parsec, Alphabet* alphabet, StringToStateMap* state_map = nullptr); | ||
//construct does not cary much info, what is better? In the comment, don't know what Parsed object is, overall I don't get what this function does. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to adapt system from other libraries, that would name these kinds of fucntions as from_<FORMAT>
and to_<FORMAT>
. So this could be Builder::from_parsed_section()
, or Builder::nfa_from_parsed_section()
.
//the delta way is the cleanest, no? a private data member | ||
//lets do it that way? | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong feelings either way.
using StateBoolArray = std::vector<bool>; ///< Bool array for states in the automaton. | ||
|
||
namespace { | ||
//Remove direct_, there are no close alternatives for NFA (unlike for Buchi). Remove "compute_" ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should stick to nouns = classes/structures, verbs = methods/functions. To me if we used just "noun" it would convey that fw_simulation()
is a property, and it will be fast just returning some precomputed member, which in this case might not be. The compute_
fully conveys that it will take some time before it is computed.
@@ -270,6 +285,7 @@ Nfa Mata::Nfa::remove_epsilon(const Nfa& aut, Symbol epsilon) { | |||
return result; | |||
} | |||
|
|||
//TODO: experiment with reverts and chose which should survive | |||
Nfa Mata::Nfa::fragile_revert(const Nfa& aut) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: it is probably better to name it revert_fragile
, then if you are using autocomplete, and write revert
you will see better what options you have.
@@ -558,6 +581,7 @@ std::pair<Run, bool> Mata::Nfa::get_word_for_path(const Nfa& aut, const Run& run | |||
} | |||
|
|||
//TODO: this is not efficient | |||
//method? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably a long abandoned relict of sad past that is already surpassed by the Run
structure anyway. Is it even used?
@@ -597,6 +622,7 @@ Mata::Parser::ParsedSection Mata::Nfa::serialize( | |||
return {}; | |||
} | |||
|
|||
//method? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be explicit, while it might not be confusing for us (though I think we will eventually forget these little rules and will be angry instead) it might be confusing by new contributors, especially students.
Nfa Mata::Nfa::Algorithms::minimize_brzozowski(const Nfa& aut) { | ||
//compute the minimal deterministic automaton, Brzozovski algorithm | ||
return determinize(revert(determinize(revert(aut)))); | ||
} | ||
|
||
//method? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like concatenate
and concatenate_with
. And I'm fine with concatenation
. I like the idea of chaining, it could lead to elegant and clear code.
I have no strong feeling regarding the inplace algorithsm, I think @vhavlena is right, that the savings will be minimal.
Bitching about various things, for the sake of discussion.
Please comment things worth commenting, or just say ok or nothing.