-
Notifications
You must be signed in to change notification settings - Fork 712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimisation: replace three map-reduces with Renderers #2920
Conversation
912fba8
to
3d186f7
Compare
render/host.go
Outdated
MapEndpoint2Host, | ||
EndpointRenderer, | ||
), | ||
endpoints2Hosts{}, |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/host.go
Outdated
|
||
// Dummy struct for Renderers that do not return useful stats | ||
type blankStats struct { | ||
} |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/container.go
Outdated
result.Children = result.Children.Add(m) | ||
result.Children = result.Children.Merge(m.Children) | ||
ret[result.ID] = result | ||
mapped[m.ID] = result.ID |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
While working on this I had to write out the following analysis of the starting state; pasting it here for preservation:
|
render/id.go
Outdated
return report.Node{}, false | ||
} | ||
return NewDerivedPseudoNode(id, n), true | ||
} |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/process.go
Outdated
if !ok { | ||
return report.Nodes{} | ||
} | ||
externalNode := NewDerivedPseudoNode(id, n) | ||
return report.Nodes{externalNode.ID: externalNode} | ||
} |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
app/benchmark_internal_test.go
Outdated
b.StartTimer() | ||
renderer.Render(report, decorator) | ||
} | ||
} |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
How I tested it produces the same output as before: First, obtain a test report, e.g. from your production system, and save it as a JSON file.
( then repeat on the branch for the 'after' case, and diff the Repeat for other topologies - |
I think |
c8568e1
to
d262cb3
Compare
render/process.go
Outdated
@@ -85,56 +77,82 @@ var ProcessNameRenderer = ConditionalRenderer(renderProcesses, | |||
), | |||
) | |||
|
|||
// MapEndpoint2Pseudo makes internet of host pesudo nodes from a endpoint node. | |||
func MapEndpoint2Pseudo(n report.Node, local report.Networks) report.Nodes { | |||
func pseudoNodeID(n report.Node, local report.Networks) (string, bool) { |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/host.go
Outdated
|
||
// Add Node M to the result set ret under id, creating a new result | ||
// node if not already there, and updating the old-id to new-id mapping | ||
// Note we do not update any counters for child topologies here |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/container.go
Outdated
} | ||
var ret = make(report.Nodes) | ||
var mapped = map[string]string{} // input node ID -> output node ID |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/host.go
Outdated
|
||
func (e endpoints2Hosts) Render(rpt report.Report, dct Decorator) report.Nodes { | ||
ns := SelectEndpoint.Render(rpt, dct) | ||
local := LocalNetworks(rpt) |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/host.go
Outdated
result.Children = result.Children.Merge(m.Children) | ||
ret[result.ID] = result | ||
mapped[m.ID] = result.ID | ||
} |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
render/host.go
Outdated
} | ||
|
||
// Rewrite Adjacency for new nodes in ret, original nodes in input, and mapping old->new IDs in mapped | ||
func fixupAdjancencies(input, ret report.Nodes, mapped map[string]string) { |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've convinced myself that algorithmically this is correct. I have made some suggestions on code improvements.
I'd like to run some benchmarks myself just before this gets merged.
To time a single topology, for more focused optimisation
d262cb3
to
8f22634
Compare
pid, timestamp, ok := n.Latest.LookupEntry(process.PID) | ||
if !ok { | ||
func (e endpoints2Processes) Render(rpt report.Report, dct Decorator) report.Nodes { | ||
if len(rpt.Process.Nodes) == 0 { |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This allows us to avoid creating a host of 'IP' type Nodes then discarding them after matching; instead we match directly and create just the result we want.
This is much more efficient as we skip creating then merging all intermediate Nodes
This means we are no longer generating a Counter for the number of endpoint sub-nodes, but it seems that data was not used.
747ad94
to
ec0689b
Compare
All comments addressed |
This is much more efficient as we skip creating then merging all intermediate Nodes
Pass 'id' through to the create function and expect that the result Node has that ID. Extract a function newPseudoNode for common calls.
New type joinResult is created to hold the nodes and ID mapping. The implementation move from host.go to render.go.
Benchmarking this with a recent prod report by running
and then timing topology rendering with
yields
So this is a significant improvement indeed. LGTM. Merge it! |
This shows a big improvement in BenchmarkTopologyList
In three cases:
ConnectionJoin
,ProcessRenderer
andHostRenderer
, the code was running aMap
to speculatively createNode
s, then aReduce
to match them all up with something else.If we replace those multiple operations with a single routine which does the matching directly, we save a lot of garbage.
I also added benchmark to time a single topology rendering, equivalent to hitting an endpoint like
/api/topology/hosts
. Best time of five runs; input is a 33MB report from production on Oct 15:Before
After
That is a 50-70% reduction in time and objects created.