Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: repository layout with per-service split #4076

Closed
oncilla opened this issue Jun 29, 2021 · 16 comments
Closed

proposal: repository layout with per-service split #4076

oncilla opened this issue Jun 29, 2021 · 16 comments
Labels
i/proposal A new idea requiring additional input and discussion

Comments

@oncilla
Copy link
Contributor

oncilla commented Jun 29, 2021

Status: WIP

Problems

The repository layout has grown organically. There have been various previous attempts at trying to get some better structure into the project. So far they have been to no avail. By now, we have at least three different epochs of "repo layout" designs.
E.g., go/lib vs go/pkg/<service>/<subpackages> vs go/lib/infra/modules vs go/cs/<subpackages>

The current structure has various disadvantages:

  • It is non-obvious where code should live and why it should be there. We tried to separate lib from pkg without a clear guidance what should be where.
    • While reading code, it is not obvious where to look for things.
    • While writing code, it is not obvious where to put things.
  • The structure makes it awkward to version services independently. Most of the logic nowadays is spread across lib and pkg. Currently, there even is an import cycle between lib and pkg making things even worse.

Goals and Non-Goals

Goals:

  • Make the code base easy to navigate. The structure should make sense to uninitiated readers.
  • Streamline development:
    • Have clear guidelines where to put code.
    • Have clear guidelines on dependencies between packages
  • Allow independent versioning of services and tools in the future.

Non-Goals:

  • We do not want to introduce nested go module support at this point in time.

Proposed Layout

The idea is to group all code that belongs to a service/tool together, irrespective of the language that it is implemented in.

Proposed structure:

root
├── acceptance
│   └── # Acceptance test definitions.
│
├── api
│   └── # OpenAPI spec definitions. The generated specifications will be placed
│       # in the service directory directly. The definitions live along side the services.
│
├── apps
│   ├── # All applications that are "deliverables" are located here.
│   │   # This is irrespective of the language the applications are written in.
│   │
│   ├── <service-name> # E.g., control, posix-router.
│   │   └── <supporting-packages> # Additional packages for the service.
│   │                             # This is the place where the business and
│   │                             # configuration logic lives.
│   │                             # E.g., config, beaconing for cs
│   │
│   ├── <toolname-variant-1> # E.g., scion, scion-pki
│   │   └── <supporting-packages> # Additional packages for the tool.
│   │                             # For example go/pkg/ping would be moved
│   │                             # here.
│   │
│   └── <supporting-packages> # Additional packages shared across applications.
│
├── cmd
│   ├── <service-name> # Main service definitions.
│   │   └── main.go
│   │
│   ├── <toolname-variant-1> # E.g., scion, scion-pki
│   │   └── main.go
│   │
│   └── <toolname-variant-2> # Very simple applications that do not export packages
│       │                    # that should/could be used by others.
│       │                    # E.g., pathdb_dump
│       └── main.go
├── core
│   └── # Core packages that are unlikely to change and sit at the very core of
│       # SCION. Packages in this directory must only import from `core`, `proto`
│       # or a third party dependency.
│       # E.g. cppki, log, metrics, slayers, tracing
│
├── tools
│   ├── scripts # Tiny bash/python scripts supporting development.
│   │
│   ├── <tool-name> # Tool that supports development and testing. For example the
│   │               # integration tests, or linting rules.
│   │               # E.g., scion_integration, end2end_integration.
│   │               # These tools can either be variant 1 or variant 2 of tools
│   │               # as shown above.
│   │
│   └── <supporting-packages> # Additional packages shared across dev tools
│
├── doc
│   └── # Documentation platform.
│
├── examples
│   └── # Examples that show case how to use sdk/api/core
│
├── proto
│   └── # Protobuf definitions. The generated code lives in core/sdk/example depending on use case.
│
├── sdk
│   └── # Minimal set of packages that are supposed to be used by downstream users
│       # to write a SCION enabled application.
│       # Packages in this directory must only import from `core`, `proto`, `sdk`
│       # or a third party dependency.
│
├── topology # Topology file definitions
│
└── wireshark # Wireshark file definition.

Import Rules:

  • packages in core MUST ONLY import third party packages and other packages in core
  • packages in sdk MUST ONLY import third party packages, core and other packages in sdk
  • packages in apps MUST ONLY import third party packages, core , sdk, and other packages in apps
  • supporting packages in an application service/tool MUST ONLY import supporting packages from an other application/tool if there is NO dependency in the reverse direction. This ensures that there is no cyclic dependency between applications.
  • packages in tools and examples MAY import anything.

These rules will be enforced by a linter.

Automated Reposhuffle:

Changing the repository layout will be done automated, I have a prototype application that is capable of moving the packages and updating the bazel build files for go.
Because our python code is very limited, this will probably be done by hand.

Automated patching of downstreams

A output of the reposhuffle application will be a go-patch file such that our downstream dependents can easily update their code with our changes.

State

Currently, this proposal has not been agreed upon. We welcome feedback and ideas regarding the repository layout. Also, the open questions need to be addressed first.

Open Questions

  • Should we differentiate between "services" and "tools", e.g., control-service vs scion, or is it okay to stick them all under applications?

I don't think it is necessary at this point

  • Do we want to bundle apps/<supporting-packages> under a sub-directory like apps/pkg/<supporting-packages>?
    The drawback of not bundling them is that they "pollute" the applications directory.
    The advantage is a cleaner import path.

    Given that we move the main packages to top-level cmd, I don't see a good reason to do it.

  • Should the generated go protobuf code be located in proto/ or core/proto

    • If it is located in proto, the go code for the control plane would be located at proto/scion/control_plane. This leads to the import path github.com/scionproto/scion/proto/scion/control_plane which is not very nice.
    • If it is located in core/proto/ we could choose it to be located at core/proto/control_plane. This would lead to the import path github.com/scionproto/scion/core/proto/control_plane.

    Generated code should be moved in the appropriate package. For control plane, it will go to core/proto/control_plane

  • Instead of having a cmd-dir per application, we could also have a top-level cmd-dir. The application logic would still live in applications/<application-name> though. This might also work nicely in a multi-module repo. cmd could act as the module that defines what versions are currently used and compatible with each other.

    I adapted the proposal to follow this style.

@oncilla oncilla added the i/proposal A new idea requiring additional input and discussion label Jun 29, 2021
@shitz
Copy link
Contributor

shitz commented Jul 1, 2021

Thanks @oncilla. Overall, I like the proposal a lot!

I'd also add a toplevel examples/ directory where we can put example applications that use the sdk.

Should we differentiate between "services" and "tools", e.g., control-service vs scion, or is it okay to stick them all under applications?

Do you mean applications/services and applications/tools or services/ and tools/? I think the former would be nice, however, would lead to longer paths - maybe we could use apps instead of applications to remedy that a bit. If you mean the latter, then we'd need to differentiate between local development tools (your current tools/) and the tools that should also be packaged (i.e., what we package as scion-utils currently). This doesn't make much sense. With that being said, I think we can just stick everything under applications/ (or apps/).

Do we want to bundle applications/ under a sub-directory like applications/pkg/?

I'd prefer having a applications/pkg/ subdirectory. It's also inline with standard go project layouts, since packages in there serve a libraries shared across applications.

Should the generated go protobuf code by located in proto/ or core/proto

I'd have a toplevel proto/ directory simply because it might be that there will be proto definitions that are not part of the core library, e.g., an example application that implements a simple client/server using gRPC over QUIC/SCION.

I'd then create proto/core/ for all proto definitions that are part of the core library with an import path of github.com/scionproto/scion/proto/core/control_plane. For other proto definitions we could have different directories in proto/, e.g., proto/examples/image_fetcher etc.

Instead of having a cmd-dir per application, we could also have a top-level cmd-dir. The application logic would still live in applications/ though. This might also work nicely in a multi-module repo. cmd could act as the module that defines what versions are currently used and compatible with each other.

I'd prefer to have a top-level cmd/ dir for the reasons you mentioned.

@oncilla
Copy link
Contributor Author

oncilla commented Jul 1, 2021

I'd also add a toplevel examples/ directory where we can put example applications that use the sdk.

Jup, I think that would be reasonable.

Do you mean applications/services and applications/tools or services/ and tools/? I think the former would be nice, however, would lead to longer paths - maybe we could use apps instead of applications to remedy that a bit. If you mean the latter, then we'd need to differentiate between local development tools (your current tools/) and the tools that should also be packaged (i.e., what we package as scion-utils currently). This doesn't make much sense. With that being said, I think we can just stick everything under applications/ (or apps/).

I meant at a top level. Originally, I called tools dev-tools, but after some discussion with @matzf we came to the conclusion that tools is the more idiomatic way. Anyway, I think there is no good reason to differentiate between services and apps if they are bundled in a top level cmd anyway.

I'd prefer having a applications/pkg/ subdirectory. It's also inline with standard go project layouts, since packages in there serve a libraries shared across applications.

This is a bit of an on going debate between pkg vs non-pkg. I honestly don't know the best course of action. pkg is neat for bundling, non-pkg is neat for better import paths.

I'd have a toplevel proto/ directory simply because it might be that there will be proto definitions that are not part of the core library, e.g., an example application that implements a simple client/server using gRPC over QUIC/SCION.

Right, this will always be the case. The question is more, where do we put the generated code. Should the generated go code for the control-plane go in the core package, or should it live under the proto package.

I'd then create proto/core/ for all proto definitions that are part of the core library with an import path of github.com/scionproto/scion/proto/core/control_plane. For other proto definitions we could have different directories in proto/, e.g., proto/examples/image_fetcher etc.

For protobuf, we need to mirror the path of the package definitions in the file paths relative to the proto root..
E.g., for protobuf package scion.control_plane.v1, we need to put the files under proto/scion/control_plane/v1.
This is actually the main reason I even bring this up, because the generated go files would then live at github.com/scionproto/scion/proto/scion/control_plane/v1 which is arguably not very nice.
Putting the generated go files in github.com/scionproto/scion/core/proto/control_plane is somewhat nicer, but in the end it does not really matter.

I'd prefer to have a top-level cmd/ dir for the reasons you mentioned.

I think I agree, it makes it also very easy to discover the applications that we provide.

@shitz
Copy link
Contributor

shitz commented Jul 1, 2021

Right, this will always be the case. The question is more, where do we put the generated code. Should the generated go code for the control-plane go in the core package, or should it live under the proto package.

Ah sorry I misunderstood. I think for protos defined in proto/core/ the generated code should be in core/proto/... and for other protos defined in proto/examples/ to be in examples/proto/...
That would then lead to ok-ish import paths github.com/scionproto/scion/core/proto/... and github.com/scionproto/scion/examples/proto/...

Does that make sense?

@oncilla
Copy link
Contributor Author

oncilla commented Jul 1, 2021

Jup, I will adjust the proposal accordingly.

@edganiukov
Copy link
Contributor

Instead of having a cmd-dir per application, we could also have a top-level cmd-dir. The application logic would still live in applications/ though. This might also work nicely in a multi-module repo. cmd could act as the module that defines what versions are currently used and compatible with each other.

I prefer to have cmd per application (or actually main dir, as it is a main package, no idea why everyone names it cmd), but without pkg, this pkg only adds one more directory level without a need. Simply apps/<app>/main/, apps/<app>/store/, apps/<app>/grpc/, etc.
Pros:

  1. All the service related files are located in one place.
  2. apps/<app> can be a separate module with its own version and all dependencies defined in its own go.mod (that are used in apps/<app>. Not like module and its deps are defined in cmd/<app>, but those deps are used in apps/<app>

@oncilla
Copy link
Contributor Author

oncilla commented Jul 2, 2021

I prefer to have cmd per application (or actually main dir, as it is a main package, no idea why everyone names it cmd), but without pkg, this pkg only adds one more directory level without a need.

I agree with the sentiment of not having pkg. It does not add much value IMO, that is also why it is not included in the proposal above.

Simply apps//main/, apps//store/, apps//grpc/, etc.

apps/<app>/main/ is something I have never seen so far.
I think it is idomatic in Go to put the main package in a path with a basename that is called the same as produced binary. This is why I had initially apps/<service-name>/cmd/<service-name>

  1. All the service related files are located in one place.

That is true, but main.go is essentially just initialization and calling out to apps/<app>/<business-logic>

  1. apps/ can be a separate module with its own version and all dependencies defined in its own go.mod (that are used in apps/. Not like module and its deps are defined in cmd/, but those deps are used in apps/

In a world where you have multiple modules, I think you would do the following:

World 1: root level cmd dir:

scionproto/scion/
├── apps
│   ├── go.mod
│   ├── service1
│   │   └── go.mod
│   └── supporting-packages
└── cmd
    ├── go.mod
    └── service1
        └── main.go

The service1 is considered its own module. cmd is also a module and essentially defines the "compatible versions"

World 2: cmd per service:

scionproto/scion/
└── apps
    ├── go.mod
    ├── service1
    │   ├── cmd     # or main 
    │   │   └── service1
    │   │       └── main.go
    │   ├── go.mod
    │   └── <supporting-packages>
    └── <supporting-packages>

I think both of these models would work well for versioning.
I have to say, I'm a bit torn between world 1 and world 2.

@edganiukov
Copy link
Contributor

@oncilla

I think it is idomatic in Go to put the main package in a path with a basename that is called the same as produced binary. 

Then, how about having main.go in the root of the application directory, e.g. apps/<app>/main.go, no cmd needed and it looks like the most idiomatic Go way.

@oncilla
Copy link
Contributor Author

oncilla commented Jul 3, 2021

The thing I don't like about this is that you waste the best package name on the binary.
That is also the reason why most projects actually use the cmd dir.

@edganiukov
Copy link
Contributor

edganiukov commented Jul 3, 2021

But such projects usually do not have per app grouped packages. IMO, having top level apps and cmd looks like mixing two styles:

scionproto/scion/
├── config
│   ├── go.mod
│   └── service1
│        └── go.mod
├── grpc
│   ├── go.mod
│   └── service1
│        └── go.mod
└── cmd
    ├── go.mod
    └── service1
        └── main.go

and

└── apps
    ├── go.mod
    ├── service1
    │   ├── cmd     # or main 
    │   │   └── service1
    │   │       └── main.go
    │   ├── go.mod
    │   └── <supporting-packages>
    └── <supporting-packages>

UPD:

The thing I don't like about this is that you waste the best package name on the binary.

But according to "packages in apps MUST ONLY import third party packages, core , sdk, and other packages in apps" seems like only cmd/<app> will import this package and I do not see the reason not to put it in the same place.
Also github.com/scionproto/scion/apps/<app> is actually a service (binary) and not a library (package) from your definition. So we just create dozens of additional directories for some unnecessary structuring.

@oncilla
Copy link
Contributor Author

oncilla commented Jul 5, 2021

But according to "packages in apps MUST ONLY import third party packages, core , sdk, and other packages in apps" seems like only cmd/ will import this package and I do not see the reason not to put it in the same place.

github.com/scionproto/scion/apps/<app>/... will contain all the packages that are used for writing an <app> service. If people want to write their own control service, daemon, etc., they will import these packages. But this is only a minor point.

The more important point for me is that wasting this package on the binary leads to worse import paths/package structuring in our code too. Take for example testgen (This is a Anapaya internal tool, sorry to the interested readers). Because the main package is at the root of the "package tree" for this application, we had to resort to a common package to define constants and helpers. IMO, this stuff should just have been in the testgen-root package as those are "testgen global"-constants.

Also github.com/scionproto/scion/apps/ is actually a service (binary) and not a library (package) from your definition. So we just create dozens of additional directories for some unnecessary structuring.

I don't follow this argument exactly. Where we put the packages and binary definitions does not affect the number of packages and binaries. So, you would get at most some additional cmd directories. (1 for the root-cmd world, num application for the per-app world). IMO they are not useless, they convey the information of where binaries are defined.
When I'm looking at any go project nowadays, I just search for the cmd dir to see where the binaries are defined.

But thinking about this, I'm slightly in favor of world 2, where we have a cmd per application, just because it makes the separation more explicit.

@oncilla
Copy link
Contributor Author

oncilla commented Jul 5, 2021

IMO, it is also a good sign that switching between world 2 and world 1 simply is a matter of moving the main packages around.
No other code change is needed.

This points to the fact that the concerns are properly separated and the package boundaries are sane.

@edganiukov
Copy link
Contributor

edganiukov commented Jul 5, 2021

But thinking about this, I'm slightly in favor of world 2, where we have a cmd per application, just because it makes the separation more explicit.

From those two options I would also choose world 2. My point was that, while looking at other projects, apps looks like cmd in many projects (given that all other packages are grouped by core, sdk), e.g. https://github.com/golang/go/tree/master/src/cmd/go

@matzf
Copy link
Contributor

matzf commented Jul 16, 2021

Regarding sdk/ and core/; why not rather combine these in one directory, e.g. pkg/? Clearly both are intended to be public APIs and it's not entirely obvious to me what should go where, and why. If the main idea is to simplify identifying the main entry point for consumers of the libraries, perhaps just clearly highlighting a main high-level entry point like e.g. pkg/snet could work just as well.

Some more (at most) half serious bike-shedding points: I find the name "core" a bit strong; cross-cutting utilities like logging, metrics, and tracing are not core to SCION. And the other one, "sdk", to me carries the negative implication that it will bring everything and the kitchen sink -- we want to use libraries, not tool kits. 😉

@matzf
Copy link
Contributor

matzf commented Mar 2, 2022

Reading through this discussion after a long time. I concur with @edganiukov's last comment; of these two options, I would prefer world 2 (weak preference).

Also, I stand by my previous comment, the split between sdk/ and core/ feels a bit forced. Importantly, the split does not seem to convey whether or not a package is part of the public API and thus ~stable -- if we say core/ does not change and sdk/ is the public interface, then everything would effectively be stable-ish. This seems to tie us down more than necessary. I think we should be relatively free to change e.g. error or logging libraries as these should not generally be used directly by consumers.
The public/stable packages should be exactly those that make sense for consumers to import directly -- the relevant question is, would an external application/library ever want to specifically import this (SCION-)library to accomplish a task? In particular, we should avoid exporting any general purpose functionality not directly related to using SCION.

My suggestion would be to combine/replace core/ and sdk, like this:

scion
├── apps
├── ...
└── pkg
    ├── drkey     # just an additional example; application library to derive/fetch keys
    ├── internal
    │   ├── daemon
    │   ├── log
    │   ├── metrics
    │   ├── serrors
    │   ├── ...
    │   └── tracing
    ├── saddr     # parse/process SCION addresses independent of snet (renamed from lib/addr)
    ├── slayers   # create/parse SCION packets independent of snet
    ├── ...
    └── snet

Instead of internal/, we can also use e.g. unstable/, private/, or z/ if the forcefully restricted access seems too strong.

@oncilla
Copy link
Contributor Author

oncilla commented Mar 4, 2022

Okay, let's do the following:

├── acceptance
│   └── # Acceptance test definitions
├── doc
│   └── # Documenation
├── examples
│   └──  # Examples that showcase how to use pk
├── pkg # Packages that are supposed to be used by third-parties
│   ├── private # supporting packages that should not be imported usually.
│   │   └── serrors
│   ├── proto # generated go code from proto
│   │   └── control_plane
│   ├── saddr
│   ├── slayers
│   └── snet
├── proto
│   └── # protobuf definitions
├── <service-name> # E.g., control, router
│   ├── cmd
│   │   ├── <service-name> # Primary application
│   │   │   └── main.go
│   │   └── <supporting-tool> # Additional executable that supports the service
│   │       └── main.go
│   └── <supporting-packages> # Additional packages for the service
├── <tool-name> # Simple flat tool
│   ├── file.go
│   └── main.go
├── tools # Development tools
│   ├── <script> # Simple bash
│   │   └── python script
│   ├── <supporting-packages>
│   └── <tool> # App structured like service or flat tool
├── topology
└── wireshark

I don't think we need the app sub directory after all.

oncilla added a commit to oncilla/scion that referenced this issue Mar 12, 2022
Restructure the whole repository layout to a more streamlined layout. This
commit focuses on the go code. Other code will be considered eventually.

The overall structure was discussed in scionproto#4076.

Each service and cli tools gets a top-level directory. Packages that are shared
across multiple applications are grouped in the `private` directory. This should
indicate that these packages are not intended to be used by external parties,
and that semantic versioning will not apply to these packages. Developer tools
are grouped in the `tools` directory.

Code that is intended for use by third external parties is grouped
in the `pkg` directory. Here we should strive for a stable package
API that does not change too often.

To smoothen transition, the following gist has all the metadata
of the move: https://gist.github.com/oncilla/96bdabb00359fdb4436a7a50d57d3cf3
You can use [go-imports.sh](https://gist.github.com/oncilla/96bdabb00359fdb4436a7a50d57d3cf3#file-go-imports-sh)
to fix the imports. [shuffle.yml](https://gist.github.com/oncilla/96bdabb00359fdb4436a7a50d57d3cf3#file-shuffle-yml)
lists all the moved packages and their targets.

GitOrigin-RevId: 1a49e3871a65a4bbeac515ada78f7208a85d7558
@oncilla
Copy link
Contributor Author

oncilla commented Mar 14, 2022

Repository layout has been changed in #4163 🎉

Thank you all for the valuable input. I hope the layout can be stable in the future and we do not have to do a shuffle again.

I have created a gist with all the packages that were moved and their target location shuffle.yml
Note that some small refactors are still required for a clean setup:

  • tools/integration/integrationlib should be refactored
  • pkg/private/common should be removed eventually
  • pkg/private/ctrl/path_mgmt should be removed eventually
  • private/segment/verifier should be removed eventually
  • private/app/appnet should be removed eventually
  • pkg/log should separate into a package that defines the interface (pkg/log) and implementation (private/log)
  • pkg/metrics should separate into a package that defines the interface (pkg/metrics) and implementation (private/metrics)
  • private/pathdb should be refactored to follow the private/storage pattern eventually.

@oncilla oncilla closed this as completed Mar 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
i/proposal A new idea requiring additional input and discussion
Projects
None yet
Development

No branches or pull requests

4 participants