-
Notifications
You must be signed in to change notification settings - Fork 19
Record name and zone name validation for Bind zone file generation #268
Comments
I agree that some validation on names is not only desirable but necessary. RFC1035, Section 2.3.1 formulates rather strict restrictions (although the phrasing 'will result in fewer problems' is already an indication of weaker requirements to come):
This would in fact imply, for instance, that a label must start with a letter, then letters, digits and hyphens are allowed, and it must end with a letter or digit. So far, so good, but look at the In RFC 2181, Section 11 things look completely different:
However, this raises another interesting issue: The RFCs refer to the length of labels and domain names in terms of octets, and to the wire format and not to the textual representation in zone files. Many characters that are by the definition in RFC 2181 legal need to be escaped in zone files, which makes their textual representation longer than they actually are on the wire. So before calculating the length, names need to be converted to wire format to determine their actual lengths. Implementing this in NetBox DNS itself would be a major effort and a massive duplication of work. The good news is that probably we don't need to take care of that at all (no regular expressions etc.) and have
So, in a first approach, I suggest I'll just run name server, record and zone names through the |
I'm still looking for an exact specification of where a DNS name is a hostname (and thus requires stricter restrictions than those described in RFC 2181). I'll look into what further validation |
As it turned out, The latest commit adds validation to
The names are matched against combinations of Let me go through the list @jean1 provided in detail, as I think it's a very good starting point.
As mentioned before, RFC 1035 in general is not a good basis for validating DNS RR names and values. There are multiple reasons, the most important of all that not every DNS record refers to a host name and the restrictions in RFC 1035 apply to host names only.
These conditions are partly handled by The remainder is done for address and pointer records (at least that's the current state of affairs) by validating them against the URLValidator RegExes. As opposed to the simple
Done by
That is something that might be worth some additional investigation, but honestly I don't think this is the case.
This must be universal, or zones created in BIND containing dotted names would not be accepted by secondaries running under a different implementation. The current implementation of this works by accepting names matching
which results in everything from a simple host name to an FQDN being accepted as a name.
In the latter case, the validator also checks whether the absolute host name is within the zone, since otherwise the entry would never be served (I consider that undesirable unless someone can point me to a use case :-))
Again, empty labels (which is what multiple dots imply) are caught by
This is in fact not only something BIND uses (which is what I initially assumed), but
Unfortunately the
It would also help with canonicalising host and domain names, e.g. the 'trailing dot/no trailing dot' problem with domain names.
Actually some record types must not ... see RFC 2181. I was a bit surprised about this myself. Currently NetBox DNS applies the strict rules for host names (RFC 1035) only to names of address records and values of pointer records, but that will probably need to be expanded a bit. Regarding underscores, that is still a somewhat unclear area. Normally underscores are not allowed in host names, but one certain company obviously never gave a hoot about standards and is still allowing them. My gut feeling is that we shouldn't care, but pragmatism will probably require to at least provide a compatibility setting. What do you think?
Well ... IDN are supported as well, but they are converted to punycode in the database so DNS servers can be fed the data directly. In detail view, the Unicode representation is displayed (if it is different from the textual representation), and you can enter either Unicode characters or punycode when creating or editing records - the conversion happens when the input is validated. So if you find the time, please check it out and tell me what you're thinking. |
We have to do it anyway ... There is a new setting in the NetBox DNS plugin config:
When set to |
After some additional testing and consideration using URLValidator RegExes does not seem such a good idea anymore. They are just too complicated and test for some conditions (e.g. label lengths, formatting of punycode names) that have already been tested before. Replaced them with much simpler RegExes that only check the characters in labels against RFC 1035 with additional handling of the |
More testing, more consideration and a lot of RFCs later ... There are about 120 different RR types described in the IANA DNS Parameters web page. Some are obsolete, some are experimental, some are more or less obscure and some are probably very rarely used, if at all. Bottom line: Without thoroughly reading all of the underlying RFCs and similar documents, finding out which RR names to validateand which not and getting it right is next to impossible. Plan B is to make it configurable. There are two new plugin settings variables, both resolving to a list:
RR types listed in RR types listed in @jean1, would you care to check the PR out and have a look if it suits you (everyone else is invited as well - this is a major change and will likely break some edge cases where the old version was more tolerant). |
Hello, @peteeckel, I find it a remarkable improvement.
Another question: should the wildcard record ("*") be allowed ? About the allowed RR types, for me, a reasonable list would be: Thanks again for your work. -- |
Hi @jean1, thanks for the quick reply!
good point - I was assuming (without testing it, obviously) that dnspython would get that one. Obviously not ...
that one's on me. Just didn't think of it.
yes, definitely. Another oversight by me, thanks!
I'm not sure either, but so far it seems to be OK.
Modification of So, on the positive side, all the RR types you proposed (and a couple of dozen more) are already supported. We might need to restrict which ones we'll create specific forms for (#160 is still something that's waiting for a good implementation idea). As for deprecated/obsolete/experimental types, we could remove them from the list but when we want to be as general as possible and be able to actually maintain "historically grown" data sets we might as well leave them in. Another thing I am thinking about is validating host names within RR values. There are a couple of RR types which also have host names as fields in structured values ( |
Hi @jean1, here's another iteration ...
Done that, including the special handling for the punycode
That's implemented as well now.
Wildcards are now legal again. Note that PTR generation for wildcards is disabled - I'm not quite sure whether this should be possible, so rather err on the safe side. When you find the time it would be great if you could have a look at it. |
Hi @jean1, did you have a chance to look at the latest changes yet? |
Hello @peteeckel,
On Mon, Jan 30, 2023 at 05:31:49AM -0800, Peter Eckel wrote:
Hi @jean1, did you have a chance to look at the latest changes yet?
Sorry for the long delay.
Yes, I did.
I checked for the wildcard and consecutive hyphen case and it is ok.
However, the punycode converted domain name should be decoded when taken
from the database and displayed in the edit form to have a consistent
interface.
…--
Jean
|
Hi @jean1, thanks a lot for checking! The "Punycode in edit forms" issue is something I thought about for a while as well. The problem with that is that you can edit the Punycode representation as well as the Unicode, so for someone who prefers one displaying the other in the edit views will be inconsistent ... you can, however, enter either form in the edit view and it will be converted to Punycode in the database anyway. The question is just what will be displayed initially. Currently we are using the NetBox standard templates for Edit forms, and they take their values from the database, which stores the Punycode representation (with the new PR, before as of version 0.16.1 they were stored as Unicode and also exported that way, which without further effort will stop a zone from getting loaded - I just tested that on BIND9, just to make sure). Furthermore, storing entries in both Unicode and Punycode (depending on how they were entered) in the database will break uniqueness constraints and checks for singletons, and require additional effort on updates and inserts into the database, as well as for PTR record creation. So IMHO normalising entries in the database makes sense, and given the fact that at some point we need to convert to Punycode anyway I convert everything to that representation when data is stored in the DB. Being aware that reading Punycode is not something humans do easily, the edit view will show the Unicode representation in the heading, and you can enter Punycode in the edit fields as well, but the database/Punycode representation is used to initialise the edit field as that is what the NetBox templates do. We could consider writing our own edit templates and optionally display Unicode as the initial value the edit fields, but that's a lot of effort and I'm not sure the benefit is big enough to justify that. |
Hello Peter,
On Mon, Jan 30, 2023 at 08:19:15AM -0800, Peter Eckel wrote:
The "Punycode in edit forms" issue is something I thought about for a
while as well.
This asymmetry bothers me too.
The problem with that is that you can edit the Punycode
representation as well as the Unicode, so for someone who prefers one
displaying the other in the edit views will be inconsistent ...
you can, however, enter either form in the edit view and it will be converted
to Punycode in the database anyway. The question is just what will be
displayed initially.
As there is a function to convert from one to the other, there is
no need to store both representations, right ?
I suggest, display what the user can understand, initially and afterwards,
because a form is for humans.
Currently we are using the NetBox standard templates for Edit/BulkEdit
views, and they take their values from the database, which stores the
Punycode representation (with the new PR, before as of version 0.16.1
So, to rephrase this, is displaying punycode an accidental consequence
of not using a custom template ?
they were stored as Unicode and also exported that way, which without
further effort will stop a zone from getting loaded - I just tested that
on BIND9, just to make sure).
Could this be done at a lower level then, when handling model/field
(not sure if it's the right way to do it) ?
Furthermore, storing entries in Unicode and Punycode in the database
will break uniqueness constraints and checks for singletons, and require
additional effort on updates and inserts into the database, as well as
for PTR record creation. So IMHO normalising entries in the database
makes sense, and given the fact that at some point we need to convert
to Punycode anyway I convert everything to that representation when data
is stored in the DB.
I never thought about storing both representations, that would
indeed bring a lot of problems.
Being aware that reading Punycode is not something humans do easily, the
edit view will show the Unicode representation in the heading, and you can
enter Punycode there as well, but the database/Punycode representation is
used to initialise the edit field as that is what the NetBox templates do.
We could consider writing our own edit templates and _optionally_
display Unicode as the initial value the edit fields, but that's a lot
of effort and I'm not sure the benefit is big enough to justify that.
It depends of the usefulness of the whole IDN thing. I'm unsure about
its wide adoption.
I think of another approach :
Is it possible to not store Punycode in the database but revert to unicode instead ?
We would just validate data (check if punycode conversion is ok) and
storing the unicode string as-is.
Then we would leave it to the zone generation tooling to make the final
conversion to punycode
What is your opinion on this ?
…--
Jean
|
Hi Jean, I think I could improve the logic a bit ... let's see how you like it now.
absolutely, yes, though at some point it might be the more efficient way in terms of processing effort.
After digging into the code and that of NetBox and django_tables2 I think I found a usable compromise. All tables and forms now show the Unicode representation of all IDNs (including zone and name server names), including the initial values for edit and clone forms. The sole exceptions are values of RRs containing names. I don't think there is a way to solve that without introducing horrible complexity.
As it turns out, no. :-) The solution I found now can be extended in a way that the GUI always shows Punycode if that should become necessary. Currently it's Unicode all the way through, with Punycode in the database and the values of specific records only. There are still some conversions required as a result of this choice, but all of them affect the GUI only, not internal processing, and are not really critical in terms of performance. When you want to export 10000 records at once, having to convert all the names would hurt much more that when you want to display 25 records in a table - which is probably the worst case that occurs in practice.
Obviously it can. If you know where to look it's not even difficult. You live and learn.
That wasn't what I was meaning, sorry for not being precise enough. What I meant was to store the values as entered (like the way it currenty happens with 0.16.1, i.e. without normalisation of names).
So am I, to be honest. One indicator might be that IDNs in NetBox DNS are totally broken right now and no one ever noticed :-)
Having the Unicode representation in the database is IMHO the second best option, as many operations that happen with a potentially larger number of records (think of creating a reverse zone for which many PTR records need to be created instantly as an example) would then require many conversions between Unicode and Punycode. Performance already is an issue right now, so I am pretty reluctant to introduce stuff that might affect performance in any way (unless it would improve it of course). I think it's cleaner and more straightforward - though not perfect - this way, and now the GUI issues should be gone I don't see any major disadvantages anymore. I pushed the new functionality to the PR, so if you'd like to have another look that would be very welcome. Since this issue was actually about validation and not so much about IDNs (which came as a consequence of the validation code as it doesn't make any sense to validate IDN/Punycode syntax with the whole IDN functionality broken in the first place) I would like to finalise the PR soon and rather deal with remaining IDN functionality improvements in separate issues ... the change became pretty large as it is, and already exceeded the scope of the issue by far. Thank you very much for your input! Cheers, Peter. |
Hello again,
On Tue, Jan 31, 2023 at 05:30:37AM -0800, Peter Eckel wrote:
I think I could improve the logic a bit ... let's see how you like it now.
After digging into the code and that of NetBox and django_tables2 I think I found a usable compromise.
I like it very much : it is now consistent when adding or editing a name.
It's working with only a few modifications, and with no big difference
in performance.
All tables and forms now show the Unicode representation of all IDNs
(including zone and name server names), including the initial values for
edit and clone forms. The sole exceptions are values of RRs containing
names.
I don't think there is a way to solve that without introducing
horrible complexity.
Indeed, the values of CNAME, MX and SRV as names are not converted.
Let's leave that one for now.
I pushed the new functionality to the PR, so if you'd like to have another look that would be very welcome.
After a few quick tests, I see no error or problems.
Since this issue was actually about validation and not so much about
IDNs (which came as a consequence of the validation code as it doesn't
make any sense to validate IDN/Punycode syntax with the whole IDN
functionality broken in the first place) I would like to finalise the
PR soon and rather deal with remaining IDN functionality improvements in
separate issues ... the change became pretty large as it is, and already
exceeded the scope of the issue by far.
The complication were probably unavoidable, except in deciding against IDN support.
It may be better to stop here and say these are the final changes.
Thank you very much for your input!
Thanks for all the work !
Kind regards
…--
Jean
|
Hi Jean, thanks for the feedback! I'm glad we found a solution we both can live with - and I hope everyone who uses IDNs as well (hey all, is there anyone using IDNs?). The issue with displaying values isn't as easy to fix, I'm afraid. One way would be to write some magical JavaScript stuff that translates everything in the value field if RRs that looks like it's Punycode and then displays it in form of a tooltip so the Unicode value is visible, but the values for NS, SOA, CNAME etc. (there are tons more of RR types that can contain names) actually are and must be Punycode. Let's keep that one for later so we can get validation out of the door ... the RR value issue is more or less cosmetic after all. So here we go! Thank you very much again for your help! Best regads, Peter. |
OK, I pushed another commit ... :-)
Please note that the value that is displayed in the "Unicode Value" column or field is not the real value. If anyone wants to insert Unicode in the value field NetBox DNS will currently not stop them from doing so. The functionality of displaying values in Unicode representation is just there for convenience and because the longer I thought about it the more @jean1 appeared to be right - Punycode is really sore in the eye - and the simpler this solution appeared to implement until the scales tipped over. |
Discussed in #267
Originally posted by jean1 December 21, 2022
Currently, in netbox-dns, record name (and zone name) have no validation except on string length, whereas Netbox itself validates FQDN with a regexp. The lack of input validation is fragile if you want to use netbox-dns records to directly generate a zone file in Bind: the zone won't load. Record have to follow a certain number of rules.
Each part of a name (the strings between the dots) is called a label.
According to RFC 1035, 2.3.1., a label:
Also, a label can't exceed 63 characters (RFC 1035, 2.3.1. again: " There are also some restrictions on the length. Labels must be 63 characters or less")
Other things to validate:
(the following things may be specific to bind zone file format)
My opinion is, if this is the central source of truth for your DNS records, allowing incorrectly formatted names and catching error later when generating zone file is undesirable.
I agree though that which rules to apply is not obvious (and blindly applying most of the rules would rule out IDN without an external conversion tool).
What would be an acceptable validation level ? Should this be opened as an issue ?
The text was updated successfully, but these errors were encountered: