Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: move engine/netx code to netxlite (or other packages) and avoid netx where not needed #2121

Closed
bassosimone opened this issue May 30, 2022 · 2 comments · Fixed by ooni/probe-cli#1068
Assignees
Labels
cleanup There's need to cleanup stuff a bit ooni/probe-engine priority/low refactoring techdebt This issue describes technical debt

Comments

@bassosimone
Copy link
Contributor

bassosimone commented May 30, 2022

This issue is about slowly refactoring the codebase to use netxlite (or other packages) instead of netx. While netx code is fine, it is also a bit too complex to follow/understand, because construction depens on the state of Config objects. So, the general idea here is to refactor the missing functionality to move some code inside netxlite and some code in other packages. In doing that, we'll see to kill the excessively complex or confusing code. Because this issue is about moving code away from internal/engine and into internal, this issue is related to #2115.

@bassosimone bassosimone added priority/low refactoring ooni/probe-engine techdebt This issue describes technical debt cleanup There's need to cleanup stuff a bit labels May 30, 2022
@bassosimone bassosimone self-assigned this May 30, 2022
bassosimone added a commit to ooni/probe-cli that referenced this issue May 30, 2022
This diff replaces engine/netx code with netxlite code in
the engine/session.go file. To this end, we needed to move
some code from engine/netx to netxlite. While there, we
did review and improve the unit tests.

A notable change in this diff is (or seems to be) that in
engine/session.go we're not filtering for bogons anymore so
that, in principle, we could believe a resolver returning
to us bogon IP addresses for OONI services. However, I did
not bother with changing bogons filtering because the
sessionresolver package is already filtering for bogons,
so it is actually okay to avoid doing that again the
session.go code. See:

https://github.com/ooni/probe-cli/blob/v3.15.0-alpha.1/internal/engine/internal/sessionresolver/resolvermaker.go#L88

There are two reference issues for this cleanup:

1. ooni/probe#2115

2. ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue May 30, 2022
This diff replaces engine/netx code with netxlite code in
the engine/session.go file. To this end, we needed to move
some code from engine/netx to netxlite. While there, we
did review and improve the unit tests.

A notable change in this diff is (or seems to be) that in
engine/session.go we're not filtering for bogons anymore so
that, in principle, we could believe a resolver returning
to us bogon IP addresses for OONI services. However, I did
not bother with changing bogons filtering because the
sessionresolver package is already filtering for bogons,
so it is actually okay to avoid doing that again the
session.go code. See:

https://github.com/ooni/probe-cli/blob/v3.15.0-alpha.1/internal/engine/internal/sessionresolver/resolvermaker.go#L88

There are two reference issues for this cleanup:

1. ooni/probe#2115

2. ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue May 30, 2022
This diff required us to move some code around, but no major
change actually happened, except better tests.

While there, I also slightly refactored ndt7's implementation and
removed the ProxyURL setting, which was actually unused.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue May 30, 2022
This diff required us to move some code around, but no major
change actually happened, except better tests.

While there, I also slightly refactored ndt7's implementation and
removed the ProxyURL setting, which was actually unused.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
The objective of this diff is to simplify the code inside engine/netx
while moving more bits of code inside netxlite.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
The objective of this diff is to simplify the code inside engine/netx
while moving more bits of code inside netxlite.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
This diff modifies the construction of a dialer to allow one
to insert custom dialer wrappers into the dialers chain.

The point of the chain in which we allow custom wrappers is the
optimal one for connect, read, and write measurements.

This new design is better than the previous netx design since
we don't need to construct the whole chain manually now.

The work in this diff is part of the effort to make engine/netx
just a tiny wrapper around netxlite.

See ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
This diff modifies the construction of a dialer to allow one
to insert custom dialer wrappers into the dialers chain.

The point of the chain in which we allow custom wrappers is the
optimal one for connect, read, and write measurements.

This new design is better than the previous netx design since
we don't need to construct the whole chain manually now.

The work in this diff is part of the effort to make engine/netx
just a tiny wrapper around netxlite.

See ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
Like the previous diff, but for creating QUIC dialer.s

See ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
Like the previous diff, but for creating QUIC dialers.

See ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
This diff creates a new package under netx called tracex that
contains everything we need to perform measurements using events
tracing and postprocessing (which is the technique with which
we implement most network experiments).

The general idea here is to (1) create a unique package out of
all of these packages; (2) clean up the code a bit (improve tests,
docs, apply more recent code patterns); (3) move the resulting
code as a toplevel package inside of internal.

Once this is done, netx can be further refactored to avoid
subpackages and we can search for more code to salvage/refactor.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue May 31, 2022
This diff creates a new package under netx called tracex that
contains everything we need to perform measurements using events
tracing and postprocessing (which is the technique with which
we implement most network experiments).

The general idea here is to (1) create a unique package out of
all of these packages; (2) clean up the code a bit (improve tests,
docs, apply more recent code patterns); (3) move the resulting
code as a toplevel package inside of internal.

Once this is done, netx can be further refactored to avoid
subpackages and we can search for more code to salvage/refactor.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 1, 2022
The code that is now into the tracex package was written a long
time ago, so let's start to make it more in line with the coding
style of packages that were written more recently.

I didn't apply all the changes I'd like to apply in a single diff
and for now I am committing just this diff.

Broadly, what we need to do is:

1. improve documentation

2. ~always use pointer receivers (object receives have the issue
that they are not mutable by accident meaning that you can mutate
them but their state do not change after the call returns, which
is potentially a source of bugs in case you later refactor to use
a pointer receiver, so always use pointer receivers)

3. ~always avoid embedding (let's say we want to avoid embedding
for types we define and it's instead fine to embed types that are
defined in the stdlib: if later we add a new method, we will not
see a broken build and we'll probably forget to add the new method
to all wrappers -- conversely, if we're wrapping rather than
embedding, we'll see a broken build and act accordingly)

4. prefer unit tests and group tests by type being tested rather
than using a flat structure for tests

Reference issue: ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 1, 2022
The code that is now into the tracex package was written a long
time ago, so let's start to make it more in line with the coding
style of packages that were written more recently.

I didn't apply all the changes I'd like to apply in a single diff
and for now I am committing just this diff.

Broadly, what we need to do is:

1. improve documentation

2. ~always use pointer receivers (object receives have the issue
that they are not mutable by accident meaning that you can mutate
them but their state do not change after the call returns, which
is potentially a source of bugs in case you later refactor to use
a pointer receiver, so always use pointer receivers)

3. ~always avoid embedding (let's say we want to avoid embedding
for types we define and it's instead fine to embed types that are
defined in the stdlib: if later we add a new method, we will not
see a broken build and we'll probably forget to add the new method
to all wrappers -- conversely, if we're wrapping rather than
embedding, we'll see a broken build and act accordingly)

4. prefer unit tests and group tests by type being tested rather
than using a flat structure for tests

There's a coverage slippage that I'll compensate in a follow-up diff where I'll focus on unit testing.

Reference issue: ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 1, 2022
Rather than passing functions to construct complex objects such
as Dialer and QUICDialer, pass interface implementations.

Ensure that a nil implementation does not cause harm.

Make Saver implement the correct interface either directly or
indirectly. We need to implement the correct interface indirectly
for TCP conns (or connected UDP sockets) because we have two
distinct use cases inside netx: observing just the connect event
and observing just the I/O events.

With this change, the construction of composed Dialers and
QUICDialers is greatly simplified and more obvious.

Part of ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 1, 2022
Rather than passing functions to construct complex objects such
as Dialer and QUICDialer, pass interface implementations.

Ensure that a nil implementation does not cause harm.

Make Saver implement the correct interface either directly or
indirectly. We need to implement the correct interface indirectly
for TCP conns (or connected UDP sockets) because we have two
distinct use cases inside netx: observing just the connect event
and observing just the I/O events.

With this change, the construction of composed Dialers and
QUICDialers is greatly simplified and more obvious.

Part of ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 5, 2022
For testability, replace most if-based construction logic with
calls to well-tested factories living in other packages.

While there, acknowledge that a bunch of types could now be private
and make them private, modifying the code to call the public
factories allowing to construct said types instead.

Part of ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 5, 2022
Now that we have properly refactored the caching resolvers we can
move them into netxlite as optional resolvers created using the
proper abstract factories we just added.

This diff reduces the complexity and the code size of netx.

See ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 5, 2022
Now that we have properly refactored the caching resolvers we can
move them into netxlite as optional resolvers created using the
proper abstract factories we just added.

This diff reduces the complexity and the code size of netx.

See ooni/probe#2121.
@bassosimone bassosimone changed the title refactor: replace engine/netx with netxlite refactor: move engine/netx in netxlite (or other packages) Jun 6, 2022
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 6, 2022
This diff refactors netx and netxlite to ensure we're not using
netxlite legacy names inside of netx.

To this end, we're cheating a bit. We're exposing a new factory to
get an unwrapped stdlib resolver rather than defining a legacy name
to export the private name of the same factory.

This is actually a fine place to stop, for now, the next and
netxlite refactoring at ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 6, 2022
This diff refactors netx and netxlite to ensure we're not using
netxlite legacy names inside of netx.

To this end, we're cheating a bit. We're exposing a new factory to
get an unwrapped stdlib resolver rather than defining a legacy name
to export the private name of the same factory.

This is actually a fine place to stop, for now, the next and
netxlite refactoring at ooni/probe#2121.
@bassosimone
Copy link
Contributor Author

bassosimone commented Jun 6, 2022

I have just finished a first round of netx cleanup and refactoring where I reshaped the original structure of netx to organize by topic, moved code for tracing and saving measurements in ./internal/tracex, and added extra features to netxlite and bytecounter to support all these changes. All the pull requests mentioned above have together helped to reshape the code in this way. (I organized the work in ~small pull requests so that the steps with which we're refactoring will always be obvious in the history and we could more easily perform bisection, if needed).

To conclude this chunk of work, I am now going to document what still bothers me inside of netx and I would like to eventually refactor the next time I have time to deal with this code:

  • factories, such as NewDialer write into their own Config object, which is a small modification away (changing Config to *Config to mutate the object they're passed as argument. Also, this pattern basically includes an if in the implementation. It would instead be better to (1) do not write into Config at all and (2) use a factory like ConfiguredResolverOrNewDefaultResolver(config) that includes the if into itself for testability. I am not super sure the solution I am proposing here is the one we'll eventually use, but certainly I do not like the code pattern there.
  • NewDNSClient should instead be NewDNSTransport and return a model.DNSTransport and what is called BaseResolver should instead be DNSTransport. This change would allow us to call NewDNSTransport inside of NewResolver rather than duplicating code in NewDNSClient and in NewResolver. This change would also make less confusing the link between NewDNSClient/Transport and NewResolver.
  • NewDNSClientWithOverrides should be split into smaller testable functions
  • makeValidEndpoint should probably be modernized to use net.URL.Port() and should also include more testing (e.g., testing for the case in which the host is empty)
  • We should double check whether it's correct to call MaybeWrapHTTPTransport inside of NewHTTPTransport or whether we should instead (1) ensure we're only using netx for measurements and (2) just wrap connections for counting bytes. It currently may be that we're counting bytes ~twice.
  • NewQUICDialer should support byte counting
  • We should consider whether engine/netx should survive after all or just become part of urlgetter (which seems a good opportunity to actually reduce the complexity between netx and urlgetter -- because there is definitely some extra complexity and bureaucracy in there)

So, the ones described above are the next targeting refactoring for netx. Of course, when I (or someone else) will have time to look into this, it may be that we're going to do things slightly differently than in the above points. I am now going to annotate specific code places that need change with this issue to ease future work.

@bassosimone
Copy link
Contributor Author

Also, this issue has now become long running with a large effort associated with it. TBH, it was always the case, but it was medium-large before.

@bassosimone bassosimone changed the title refactor: move engine/netx in netxlite (or other packages) refactor: move engine/netx code to netxlite (or other packages) and avoid netx where not needed Jun 8, 2022
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 8, 2022
The oohelper does not need to use netx and it's enough to use
netxlite, hence let us apply this refactor.

The original code used DoT but the explanatory comment said we were
using DoT because of unclear issues inside GitHub actions.

We are now using DoH and this is fine as well. The comment implied
that any encrypted transport would do.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 8, 2022
The oohelper does not need to use netx and it's enough to use
netxlite, hence let us apply this refactor.

The original code used DoT but the explanatory comment said we were
using DoT because of unclear issues inside GitHub actions.

We are now using DoH and this is fine as well. The comment implied
that any encrypted transport would do.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 8, 2022
The oohelperd implementation did not actually need using netx because
it was just constructing default types with logging, which is what
netxlite already does. Hence, let's avoid using netx here.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 8, 2022
The oohelperd implementation did not actually need using netx because
it was just constructing default types with logging, which is what
netxlite already does. Hence, let's avoid using netx here.

See ooni/probe#2121
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 8, 2022
We can simplify code in a bunch of places using a useful factory.

Part of ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 8, 2022
We can simplify code in a bunch of places using a useful factory.

Part of ooni/probe#2121.
bassosimone added a commit to ooni/probe-cli that referenced this issue Feb 1, 2023
This diff closes ooni/probe#2121 because it
removes the last unnecessary netx usage. All the other packages that
currently use netx are network experiments. We will eventually convert
them to not use netx as part of other tracked issues.

This diff closes ooni/probe#2135 because
now the sessionresolver does not depend on netx anymore.

We refactor the way in which we conditionally configure byte counting
for HTTP such that we use a free function rather than a method (we can
use methods with nil receiver in Go, but the free function seems to
be a better choice in this case).

We introduce and use a new bytecounter specifically for the system
resolver. This byte counter is very imprecise but seems still better
than using the system resolver doesn't use any network I/O.

We stop printing the sessionresolver stats when we close a session, since
this component has been in production for years now.

We improve the model.UnderlyingNetwork model to add support for changing
the root cert pool, which helps with writing some integration tests.

We modify the protocol to use and modify the netxlite tproxy (a singleton
instance of UnderlyingNetwork) to make it goroutine safe.

We introduce a new file inside sessionresolver, factory.go, which creates
and properly wraps the resolvers. This code replaces code for which we
previously used netx, and is the core change introduced here.

While there, we refactor how we log in the session resolver to use
the operation logger as we do in some experiments.

We write some additional tests that take advantage of the new netxlite
tproxy mocking functionality to ensure the sessionresolver continues
to work in two extreme use cases: no resolver is available and just the
system resolver is available. I introduced these new tests because I
originally broke the system resolver when I introduced factory.go and
I felt like it was useful to have more robustness here.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup There's need to cleanup stuff a bit ooni/probe-engine priority/low refactoring techdebt This issue describes technical debt
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant