Optimisation: replace three map-reduces with Renderers #2920

Merged
bboreham merged 12 commits into master from endpoints-renderers on Nov 7, 2017

Conversation

bboreham
Collaborator

@bboreham bboreham commented Nov 4, 2017

In three cases (ConnectionJoin, ProcessRenderer and HostRenderer) the code was running a Map to speculatively create Nodes, then a Reduce to match them all up with something else.

If we replace those multiple operations with a single routine which does the matching directly, we save a lot of garbage.
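
The difference between the two shapes can be illustrated with a minimal sketch. This is not the Scope code: the `node` and `endpoint` types and the functions `mapThenReduce` and `matchDirectly` are hypothetical stand-ins. The first function allocates one intermediate node per input and merges them afterwards; the second groups inputs under their target in a single pass.

```go
package main

import "fmt"

// node is a hypothetical stand-in for report.Node: an ID plus child IDs.
type node struct {
	ID       string
	Children []string
}

// endpoint is a hypothetical input record that names the host it belongs to.
type endpoint struct {
	ID, HostID string
}

// mapThenReduce mimics the old shape: create one speculative node per
// endpoint, then merge the nodes that share an ID.
func mapThenReduce(eps []endpoint) map[string]node {
	// Map step: one freshly allocated intermediate node per endpoint.
	intermediate := make([]node, 0, len(eps))
	for _, e := range eps {
		intermediate = append(intermediate, node{ID: e.HostID, Children: []string{e.ID}})
	}
	// Reduce step: merge intermediates by ID.
	out := map[string]node{}
	for _, n := range intermediate {
		merged := out[n.ID]
		merged.ID = n.ID
		merged.Children = append(merged.Children, n.Children...)
		out[n.ID] = merged
	}
	return out
}

// matchDirectly does the grouping in one pass with no intermediate nodes.
func matchDirectly(eps []endpoint) map[string]node {
	out := map[string]node{}
	for _, e := range eps {
		n := out[e.HostID]
		n.ID = e.HostID
		n.Children = append(n.Children, e.ID)
		out[e.HostID] = n
	}
	return out
}

func main() {
	eps := []endpoint{{"e1", "hostA"}, {"e2", "hostA"}, {"e3", "hostB"}}
	fmt.Println(mapThenReduce(eps))
	fmt.Println(matchDirectly(eps))
}
```

Both functions return the same grouping; the second simply never builds the throwaway slice of single-child nodes.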

I also added a benchmark to time a single topology rendering, equivalent to hitting an endpoint like /api/topology/hosts. Best time of five runs; input is a 33MB report from production on Oct 15:
Before

$ go test -tags 'unsafe netgo' -run=x -bench=Topology -bench-report-file=prod-report1.json
BenchmarkTopologyList-2         	       1	1400508627 ns/op	352382920 B/op	 2913537 allocs/op
BenchmarkTopologyHosts-2        	       1	1124918587 ns/op	302509672 B/op	 2627937 allocs/op
BenchmarkTopologyContainers-2   	       2	 691307933 ns/op	194143244 B/op	 1723873 allocs/op

After

$ go test -tags 'unsafe netgo' -run=x -bench=Topology -bench-report-file=prod-report1.json
BenchmarkTopologyList-2         	       2	 692289010 ns/op	146370204 B/op	 1529315 allocs/op
BenchmarkTopologyHosts-2        	       3	 485628292 ns/op	95328261 B/op	 1245029 allocs/op
BenchmarkTopologyContainers-2   	      10	 208411667 ns/op	36021493 B/op	  515582 allocs/op

That is roughly a 50-70% reduction in both time and objects allocated.
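
As a check on that range, the reductions can be recomputed from the ns/op and allocs/op columns above. The helper `reduction` is introduced here purely for illustration.

```go
package main

import "fmt"

// reduction returns the percentage by which after is smaller than before.
func reduction(before, after float64) float64 {
	return 100 * (before - after) / before
}

func main() {
	// Figures copied from the benchmark output above.
	rows := []struct {
		name         string
		nsBefore     float64
		nsAfter      float64
		allocsBefore float64
		allocsAfter  float64
	}{
		{"List", 1400508627, 692289010, 2913537, 1529315},
		{"Hosts", 1124918587, 485628292, 2627937, 1245029},
		{"Containers", 691307933, 208411667, 1723873, 515582},
	}
	for _, r := range rows {
		fmt.Printf("%-10s time -%.1f%%  allocs -%.1f%%\n", r.name,
			reduction(r.nsBefore, r.nsAfter),
			reduction(r.allocsBefore, r.allocsAfter))
	}
}
```

By this calculation the time reductions are about 50.6%, 56.8% and 69.9%, and the allocation reductions about 47.5%, 52.6% and 70.1%, for List, Hosts and Containers respectively.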

render/host.go Outdated
MapEndpoint2Host,
EndpointRenderer,
),
endpoints2Hosts{},

render/host.go Outdated

// Dummy struct for Renderers that do not return useful stats
type blankStats struct {
}

result.Children = result.Children.Add(m)
result.Children = result.Children.Merge(m.Children)
ret[result.ID] = result
mapped[m.ID] = result.ID

@bboreham
Collaborator Author

bboreham commented Nov 5, 2017

While working on this I had to write out the following analysis of the starting state; pasting it here for preservation:

TestShortLivedInternetNodeConnections calls
render.ContainerWithImageNameRenderer.Render(rpt, FilterNoop)

ContainerWithImageNameRenderer is a variable initialised with ContainerRenderer

First thing ContainerWithImageNameRenderer.Render() does is call
ContainerRenderer to get a list of "containers"

  ContainerRenderer is
    A merge of
      Map ColorConnectedProcessRenderer into MapProcess2Container
        ColorConnectedProcessRenderer calls ProcessRenderer then a function
          ProcessRenderer, if the report has any process nodes, is a merge of
            Map EndpointRenderer into MapEndpoint2Process
            SelectProcess
          the ColorConnected function adds a flag IsConnected to any nodes with edges
        MapProcess2Container creates container nodes if there is an ID, otherwise puts in 'uncontained'
      ConnectionJoin, which is a merge of
        Map a merge of
            Map r (SelectContainer) into nodeToIP via MapContainer2IP
            mapEndpoint2IP which is Map SelectEndpoint into endpoint2IP
          into ipToNode which removes any Endpoint2IP nodes that didn't match a container
        r (SelectContainer)
    Filtered to just nodes that have a ContainerState that is non-deleted (or pseudo-nodes)

Next Render() calls SelectContainerImage.Render(rpt, dct) to get a list of images

Then Render does:
  For each input node, if it has an image
    if the image is in the list of images
      if the image has a name
        updates the image details in the container object, and adds it to the output
  otherwise pass the input node straight through to the output
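
The final step of that outline could be sketched as follows. The `container` and `image` types and the function `withImageNames` are hypothetical simplifications, not the real `report.Node` based code. The outline leaves open whether a container whose image is unknown or unnamed is dropped or passed through; this sketch assumes it is passed through unchanged.

```go
package main

import "fmt"

// container and image are hypothetical stand-ins for report.Node values.
type container struct {
	ID, ImageID, ImageName string
}

type image struct {
	ID, Name string
}

// withImageNames copies the image name onto each container whose image is
// known and named. Every other container is passed through unchanged
// (an assumption; see the note above the code).
func withImageNames(containers []container, images map[string]image) []container {
	out := make([]container, 0, len(containers))
	for _, c := range containers {
		if img, ok := images[c.ImageID]; ok && c.ImageID != "" && img.Name != "" {
			c.ImageName = img.Name // update the image details on the container
		}
		out = append(out, c)
	}
	return out
}

func main() {
	images := map[string]image{
		"img1": {ID: "img1", Name: "nginx"},
		"img2": {ID: "img2"}, // present in the list but unnamed
	}
	containers := []container{
		{ID: "c1", ImageID: "img1"},
		{ID: "c2"}, // no image at all
		{ID: "c3", ImageID: "img2"},
	}
	for _, c := range withImageNames(containers, images) {
		fmt.Printf("%s image=%q name=%q\n", c.ID, c.ImageID, c.ImageName)
	}
}
```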

render/id.go Outdated
return report.Node{}, false
}
return NewDerivedPseudoNode(id, n), true
}

if !ok {
return report.Nodes{}
}
externalNode := NewDerivedPseudoNode(id, n)
return report.Nodes{externalNode.ID: externalNode}
}

b.StartTimer()
renderer.Render(report, decorator)
}
}

@bboreham
Collaborator Author

bboreham commented Nov 5, 2017

How I tested that it produces the same output as before:

First, obtain a test report, e.g. from your production system, and save it as a JSON file.

git checkout master
make prog/scope
prog/scope --mode=app --weave=false --app.collector=file:some-report.json &
curl -so /tmp/before http://localhost:4040/api/topology/containers
jq -S . < /tmp/before > /tmp/before-jq

(-S is used to sort keys so they are consistent)

then repeat on the branch for the 'after' case, and diff the jq output before and after.

Repeat for other topologies - hosts, etc.
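
The same comparison can be done without jq. The sketch below is an alternative, not what was used for this PR: the function name `sameJSON` and the sample documents are made up for illustration. It decodes both responses and compares the resulting values, so object key order is ignored in the same way that `jq -S` normalises it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// sameJSON reports whether two JSON documents decode to equal values.
// Object key order is ignored; array order still matters.
func sameJSON(before, after []byte) (bool, error) {
	var x, y interface{}
	if err := json.Unmarshal(before, &x); err != nil {
		return false, fmt.Errorf("decoding before: %w", err)
	}
	if err := json.Unmarshal(after, &y); err != nil {
		return false, fmt.Errorf("decoding after: %w", err)
	}
	return reflect.DeepEqual(x, y), nil
}

func main() {
	// Same content, different key order.
	before := []byte(`{"nodes": {"a": {"rank": 1}}, "id": "hosts"}`)
	after := []byte(`{"id": "hosts", "nodes": {"a": {"rank": 1}}}`)
	same, err := sameJSON(before, after)
	fmt.Println(same, err)
}
```

Unlike a textual diff this only answers yes or no, and numbers are decoded as float64, so the jq-and-diff approach remains the better tool for finding where two outputs differ.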

@rade
Member

rade commented Nov 5, 2017

Repeat for other topologies - hosts, etc.

I think /api/topology does that.

@bboreham bboreham force-pushed the endpoints-renderers branch 2 times, most recently from c8568e1 to d262cb3 Compare November 6, 2017 09:02
@bboreham bboreham changed the title Optimisation: replace endpoints map-reduce with Renderer Optimisation: replace three map-reduces with Renderers Nov 6, 2017
@@ -85,56 +77,82 @@ var ProcessNameRenderer = ConditionalRenderer(renderProcesses,
),
)

// MapEndpoint2Pseudo makes internet or host pseudo nodes from an endpoint node.
func MapEndpoint2Pseudo(n report.Node, local report.Networks) report.Nodes {
func pseudoNodeID(n report.Node, local report.Networks) (string, bool) {

render/host.go Outdated

// Add Node M to the result set ret under id, creating a new result
// node if not already there, and updating the old-id to new-id mapping
// Note we do not update any counters for child topologies here

}
var ret = make(report.Nodes)
var mapped = map[string]string{} // input node ID -> output node ID

render/host.go Outdated

func (e endpoints2Hosts) Render(rpt report.Report, dct Decorator) report.Nodes {
ns := SelectEndpoint.Render(rpt, dct)
local := LocalNetworks(rpt)

render/host.go Outdated
result.Children = result.Children.Merge(m.Children)
ret[result.ID] = result
mapped[m.ID] = result.ID
}

render/host.go Outdated
}

// Rewrite Adjacency for new nodes in ret, original nodes in input, and mapping old->new IDs in mapped
func fixupAdjancencies(input, ret report.Nodes, mapped map[string]string) {

Member

@rade rade left a comment

I've convinced myself that algorithmically this is correct. I have made some suggestions on code improvements.

I'd like to run some benchmarks myself just before this gets merged.

@bboreham bboreham force-pushed the endpoints-renderers branch from d262cb3 to 8f22634 Compare November 6, 2017 21:03
pid, timestamp, ok := n.Latest.LookupEntry(process.PID)
if !ok {
func (e endpoints2Processes) Render(rpt report.Report, dct Decorator) report.Nodes {
if len(rpt.Process.Nodes) == 0 {

This allows us to avoid creating a host of 'IP' type Nodes then
discarding them after matching; instead we match directly and create
just the result we want.
This is much more efficient as we skip creating then merging all intermediate Nodes
This means we are no longer generating a Counter for the number of
endpoint sub-nodes, but it seems that data was not used.
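
The direct IP matching described above might look roughly like the sketch below. The `container` and `endpoint` types and the function `joinByIP` are hypothetical and much simpler than the real `report.Node` based code; in particular the sketch assumes each IP belongs to at most one container.

```go
package main

import "fmt"

// Hypothetical, simplified records; the real code works on report.Node.
type container struct {
	ID  string
	IPs []string
}

type endpoint struct {
	ID, IP string
}

// joinByIP indexes containers by IP once, then attaches each endpoint to
// the container that owns its IP. Endpoints with no matching container are
// skipped without ever creating an intermediate 'IP' node for them.
func joinByIP(containers []container, endpoints []endpoint) map[string][]string {
	owner := map[string]string{} // IP -> container ID
	for _, c := range containers {
		for _, ip := range c.IPs {
			owner[ip] = c.ID
		}
	}
	matched := map[string][]string{} // container ID -> endpoint IDs
	for _, e := range endpoints {
		if id, ok := owner[e.IP]; ok {
			matched[id] = append(matched[id], e.ID)
		}
	}
	return matched
}

func main() {
	containers := []container{
		{ID: "c1", IPs: []string{"10.0.0.1", "10.0.0.2"}},
		{ID: "c2", IPs: []string{"10.0.0.3"}},
	}
	endpoints := []endpoint{
		{ID: "e1", IP: "10.0.0.2"},
		{ID: "e2", IP: "10.0.0.9"}, // no container owns this IP
		{ID: "e3", IP: "10.0.0.3"},
	}
	fmt.Println(joinByIP(containers, endpoints))
}
```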
@bboreham bboreham force-pushed the endpoints-renderers branch from 747ad94 to ec0689b Compare November 6, 2017 22:05
@bboreham
Collaborator Author

bboreham commented Nov 6, 2017

All comments addressed

This is much more efficient as we skip creating then merging all intermediate Nodes
Pass 'id' through to the create function and expect that the result Node has that ID.
Extract a function newPseudoNode for common calls.
New type joinResult is created to hold the nodes and ID mapping.
The implementation moves from host.go to render.go.
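
A rough sketch of what a `joinResult` along these lines could look like follows. It is based only on the fragments quoted in this conversation (a result node set, an input-ID to output-ID map, and an adjacency fix-up); the `node` type, the field names and the method names are guesses for illustration and may differ from the merged code.

```go
package main

import "fmt"

// node is a hypothetical, much reduced stand-in for report.Node.
type node struct {
	ID        string
	Children  []string // IDs of input nodes folded into this node
	Adjacency []string // IDs of nodes this node connects to
}

// joinResult holds the output nodes plus the input-ID -> output-ID mapping.
type joinResult struct {
	nodes  map[string]node
	mapped map[string]string
}

func newJoinResult() joinResult {
	return joinResult{nodes: map[string]node{}, mapped: map[string]string{}}
}

// addChild files input node m under output ID id, creating the output node
// on first use, and records the old-ID to new-ID mapping.
func (r joinResult) addChild(m node, id string) {
	out, ok := r.nodes[id]
	if !ok {
		out = node{ID: id}
	}
	out.Children = append(out.Children, m.ID)
	r.nodes[id] = out
	r.mapped[m.ID] = id
}

// fixupAdjacencies rewrites each input node's adjacency list in terms of
// output IDs and attaches it to the output node the input was mapped to.
func (r joinResult) fixupAdjacencies(input map[string]node) {
	for _, in := range input {
		outID, ok := r.mapped[in.ID]
		if !ok {
			continue // this input node was not mapped to any output
		}
		out := r.nodes[outID]
		for _, dst := range in.Adjacency {
			if dstID, ok := r.mapped[dst]; ok && !contains(out.Adjacency, dstID) {
				out.Adjacency = append(out.Adjacency, dstID)
			}
		}
		r.nodes[outID] = out
	}
}

func contains(ids []string, id string) bool {
	for _, v := range ids {
		if v == id {
			return true
		}
	}
	return false
}

func main() {
	input := map[string]node{
		"e1": {ID: "e1", Adjacency: []string{"e2", "e3"}},
		"e2": {ID: "e2"},
		"e3": {ID: "e3", Adjacency: []string{"e1"}},
	}
	res := newJoinResult()
	res.addChild(input["e1"], "hostA")
	res.addChild(input["e2"], "hostA")
	res.addChild(input["e3"], "hostB")
	res.fixupAdjacencies(input)
	fmt.Println(res.nodes["hostA"].Adjacency, res.nodes["hostB"].Adjacency)
}
```

Keeping the ID mapping next to the nodes means the adjacency rewrite needs only the original input and the result, with no intermediate node set in between.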
@rade
Member

rade commented Nov 6, 2017

Benchmarking this with a recent prod report by running

$ prog/scope --mode=app --weave=false --app.collector=file:prod-on-prod-report.json

and then timing topology rendering with

time curl -so /dev/null http://localhost:4040/api/topology/<topology>

yields

topology     before (s)   after (s)
processes    0.307        0.252
containers   1.277        0.796
pods         0.936        0.506
hosts        1.227        0.711

So this is a significant improvement indeed.

LGTM. Merge it!

This shows a big improvement in BenchmarkTopologyList
@bboreham bboreham merged commit 893537c into master Nov 7, 2017
@dlespiau dlespiau deleted the endpoints-renderers branch November 21, 2017 08:49