In this document, we explain how modules are written and integrated into MIG.
The reception of a command by an agent triggers the execution of modules. A
module is a Go package that is imported into the agent at compilation, and that
performs a very specific set of tasks. For example, the file module provides a
way to scan a file system for files that contain regexes, match a checksum, and
so on. Another module, netstat, looks for IP addresses currently connected to
an endpoint. ping is a module to ping targets from endpoints.
Modules are somewhat autonomous. They can be developed outside of the MIG code base, and only imported during compilation of the agent. Go does not provide a way to load modules dynamically, so modules are compiled into the agent's static binary, and not as separate files.
There are two types of modules: standard modules, which were the initial module type supported by MIG, and persistent modules.
A standard module is invoked by the agent when it receives a command, and the
results provided by this module can be considered point in time. It does not keep
any state between runs, and is used for general investigation activities. Some
examples of standard modules include the file module for scanning the file
system for certain criteria, or the netstat module for looking at current
network communication.
A persistent module is run by the agent when it starts, and can perform more on-going tasks or analysis activities. Persistent modules are kept running by the agent. Persistent modules can be queried just like standard modules, but instead of triggering a one-time invocation, a query to a persistent module is directed at the already running module. This can be used to collect statistics or results, change the behavior of the persistent module, or perform various other activities.
Persistent modules are developed in a similar manner to standard modules, with a few additions. This document discusses elements that are common to all modules. For details specific to the implementation of persistent modules, see the persistent module documentation.
A module must import mig/modules.
A module registers itself at runtime via its init() function, which must call
modules.Register with a module name and an instance implementing
modules.Moduler:
type Moduler interface {
NewRun() Runner
}
A module must have a unique name. A good practice is to use the same name for the module name as for the Go package name. However, it is possible for a single Go package to implement multiple modules, simply by registering different Modulers with different names.
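As an illustration of one package registering several modules, here is a minimal, self-contained sketch. The available map and the register function are hypothetical stand-ins for the registry in mig/modules (their duplicate-name behavior is an assumption), and the greeter type is invented for the example.

```go
package main

import (
	"fmt"
	"io"
	"sort"
)

// Runner and Moduler mirror the interfaces described in this document.
type Runner interface {
	Run(io.Reader) string
	ValidateParameters() error
}

type Moduler interface {
	NewRun() Runner
}

// available is a hypothetical stand-in for the registry kept by mig/modules.
var available = map[string]Moduler{}

// register is a stand-in for modules.Register; it rejects duplicate names.
func register(name string, m Moduler) error {
	if _, exists := available[name]; exists {
		return fmt.Errorf("module %q is already registered", name)
	}
	available[name] = m
	return nil
}

// greeter is one Moduler type; its field configures the runs it creates.
type greeter struct{ greeting string }

func (g *greeter) NewRun() Runner { return &greetRun{greeting: g.greeting} }

type greetRun struct{ greeting string }

func (r *greetRun) Run(in io.Reader) string   { return r.greeting }
func (r *greetRun) ValidateParameters() error { return nil }

// registeredNames returns the sorted names of all registered modules.
func registeredNames() []string {
	names := make([]string, 0, len(available))
	for name := range available {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func init() {
	// One package, two modules, two distinct names.
	if err := register("hello", &greeter{greeting: "hello"}); err != nil {
		panic(err)
	}
	if err := register("goodbye", &greeter{greeting: "goodbye"}); err != nil {
		panic(err)
	}
}

func main() {
	fmt.Println(registeredNames())
}
```

Both names resolve to the same Go type here; what makes them distinct modules is only the name and the Moduler value passed at registration.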
The sole method of a Moduler creates a new instance to represent a "run" of the
module, implementing the modules.Runner interface:
type Runner interface {
Run(io.Reader) string
ValidateParameters() error
}
Any run-specific information should be associated with this instance and not with the Moduler or stored in a global variable. It should be possible for multiple runs of the module to execute simultaneously.
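The following sketch shows why per-run state matters. It is not MIG code: the module and run types are simplified stand-ins (NewRun returns a concrete *run rather than a modules.Runner), and runConcurrently is a hypothetical helper that plays the role of an agent starting several runs at once.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"sync"
)

// module is the long-lived value that would be registered once.
type module struct{}

// NewRun hands out a fresh run so that no state is shared between runs.
func (m *module) NewRun() *run { return new(run) }

// run holds everything specific to one invocation.
type run struct {
	input string
}

// Run reads its input into the run's own field and echoes it back.
func (r *run) Run(in io.Reader) string {
	data, err := io.ReadAll(in)
	if err != nil {
		return "error: " + err.Error()
	}
	r.input = string(data)
	return "processed " + r.input
}

// runConcurrently starts one run per input and collects outputs by index.
func runConcurrently(m *module, inputs []string) []string {
	outputs := make([]string, len(inputs))
	var wg sync.WaitGroup
	for i, input := range inputs {
		wg.Add(1)
		go func(i int, input string) {
			defer wg.Done()
			outputs[i] = m.NewRun().Run(strings.NewReader(input))
		}(i, input)
	}
	wg.Wait()
	return outputs
}

func main() {
	for _, out := range runConcurrently(new(module), []string{"a", "b", "c"}) {
		fmt.Println(out)
	}
}
```

If input were a field of module or a package-level variable instead, the three goroutines would race on it and could report each other's data.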
The code sample below shows how the example module uses package name example
and registers with name example.
package example
import (
"mig/modules"
)
// An instance of this type will represent this module; it's possible to add
// additional data fields here, although that is rarely needed.
type module struct {
}
func (m *module) NewRun() modules.Runner {
return new(run)
}
// init is called by the Go runtime at startup. We use this function to
// register the module in a global array of available modules, so the
// agent knows we exist
func init() {
modules.Register("example", new(module))
}
type run struct {
Parameters params
Results modules.Result
}
init() is a special Go function that is executed automatically in all
imported packages when a program starts. In the agents, modules are imported
anonymously, which means that their init() function will be executed even if
the modules are unused in the agent. Therefore, when the MIG agent starts, all
modules execute their init() function, add their names and runner function to
the global list of available modules, and stop there.
The list of modules imported in the agent is maintained in
conf/available_modules.go. You should use this file to add or remove modules.
import (
//_ "mig/modules/example"
_ "mig/modules/agentdestroy"
_ "mig/modules/file"
_ "mig/modules/netstat"
_ "mig/modules/timedrift"
_ "mig/modules/ping"
)
When the agent receives a command to execute, it looks up modules in
the global list modules.Available, and if a module is registered to execute
the command, calls its runner function to get a new instance representing the run,
and then calls that instance's Run method.
A MIG module typically defines its own run struct implementing the
modules.Runner interface and representing a single run of the module. The
run struct typically contains two fields: module parameters and module results.
The former is any format the module chooses to use, while the latter is generally
a modules.Result struct (note that this is not required, but it is the easiest
way to return a properly-formatted JSON result).
type run struct {
Parameters myModuleParams
Results modules.Result
}
When a module is available to run an operation, the agent passes the operation parameters to the module.
The easiest way to see this is to invoke the agent binary with the flag -m, followed by the name of the module:
$ mig-agent -m example <<< '{"class":"parameters", "parameters":{"gethostname": true, "getaddresses": true, "lookuphost": ["www.google.com"]}}'
[info] using builtin conf
{"foundanything":true,"success":true,"elements":{"hostname":"fedbox2.jaffa.linuxwall.info","addresses":["172.21.0.3/20","fe80::8e70:5aff:fec8:be50/64"],"lookeduphost":{"www.google.com":["74.125.196.105","74.125.196.147","74.125.196.106","74.125.196.104","74.125.196.103","74.125.196.99","2607:f8b0:4002:c07::6a"]}},"statistics":{"stufffound":3},"errors":null}
The module receives this JSON input as an io.Reader passed to its Run method.
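To make the shape of that input concrete, here is a rough sketch of reading it. The message struct is inferred from the sample command above, and readParams and parseLine are hypothetical helpers written for this illustration; they are not the implementation of modules.ReadInputParameters.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// message mirrors the envelope seen in the command above. The struct and its
// field names are inferred from that sample, not taken from mig/modules.
type message struct {
	Class      string          `json:"class"`
	Parameters json.RawMessage `json:"parameters"`
}

type params struct {
	GetHostname  bool     `json:"gethostname"`
	GetAddresses bool     `json:"getaddresses"`
	LookupHost   []string `json:"lookuphost"`
}

// readParams is a hypothetical helper: it reads one line from in, checks the
// message class, and decodes the embedded parameters into p.
func readParams(in io.Reader, p interface{}) error {
	line, err := bufio.NewReader(in).ReadBytes('\n')
	if err != nil && err != io.EOF {
		return err
	}
	var msg message
	if err := json.Unmarshal(line, &msg); err != nil {
		return fmt.Errorf("invalid message: %v", err)
	}
	if msg.Class != "parameters" {
		return fmt.Errorf("unexpected message class %q", msg.Class)
	}
	return json.Unmarshal(msg.Parameters, p)
}

// parseLine wraps readParams for a single line held in a string.
func parseLine(line string) (params, error) {
	var p params
	err := readParams(strings.NewReader(line), &p)
	return p, err
}

func main() {
	p, err := parseLine(`{"class":"parameters","parameters":{"gethostname":true,"lookuphost":["www.google.com"]}}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.GetHostname, p.GetAddresses, p.LookupHost)
}
```

Keeping the parameters as a json.RawMessage lets the envelope be decoded once while each module decodes the inner object into its own parameter type.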
The module's Run method should start by trying to read parameters from the
given io.Reader. It then validates the parameters against its own
formatting rules, performs work and returns results in a JSON string.
func (r *run) Run(in io.Reader) (out string) {
defer func() {
if e := recover(); e != nil {
r.Results.Errors = append(r.Results.Errors, fmt.Sprintf("%v", e))
r.Results.Success = false
buf, _ := json.Marshal(r.Results)
out = string(buf[:])
}
}()
err := modules.ReadInputParameters(in, &r.Parameters)
if err != nil {
panic(err)
}
err = r.ValidateParameters()
if err != nil {
panic(err)
}
return r.doModuleStuff()
}
The defer block in the sample above is used to catch potential panics and
return a nicely formatted JSON error to the agent. This is a clean way to
indicate to the MIG platform that the module has failed to run on this agent.
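The mechanism relies on a named return value: a deferred function can overwrite it after a panic has been recovered. The sketch below isolates that pattern. The result type is a trimmed-down stand-in for modules.Result, and runSafely is a hypothetical wrapper, not part of MIG.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// result is a trimmed-down stand-in for modules.Result.
type result struct {
	Success bool     `json:"success"`
	Errors  []string `json:"errors"`
}

// runSafely calls work and returns its output. If work panics, the deferred
// function overwrites the named return value with a JSON-encoded failure.
func runSafely(work func() string) (out string) {
	var res result
	defer func() {
		if e := recover(); e != nil {
			res.Success = false
			res.Errors = append(res.Errors, fmt.Sprintf("%v", e))
			buf, _ := json.Marshal(res)
			out = string(buf)
		}
	}()
	return work()
}

func main() {
	fmt.Println(runSafely(func() string { return `{"success":true,"errors":null}` }))
	fmt.Println(runSafely(func() string { panic("invalid parameters") }))
	// The second call prints {"success":false,"errors":["invalid parameters"]}
}
```

Note that recover only intercepts panics raised in the goroutine that runs the deferred function, so work started in another goroutine needs its own handler.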
A module must implement the ValidateParameters() method.
The role of that method is to go through the parameters supplied to Run
and verify that they follow a format expected by the module. This method is
useful during Run but is not called from outside the module.
Go is strongly typed, so there's no risk of finding a string when a float is expected. However, this function should verify that values are in a proper range, that regular expressions compile without errors, or that string parameters use the correct syntax.
When validation fails, an error with a descriptive validation failure must be returned to the caller.
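A small sketch of these three kinds of checks follows. The params type and its fields (MaxDepth, Pattern, Timeout) are hypothetical and exist only for this illustration; validate is written as a plain function rather than a method to keep the example short.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// params is a hypothetical parameter set used only for this illustration.
type params struct {
	MaxDepth int    `json:"maxdepth"` // must be between 1 and 1000
	Pattern  string `json:"pattern"`  // must be a valid regular expression
	Timeout  string `json:"timeout"`  // must parse as a duration, e.g. "30s"
}

// validate returns a descriptive error for the first invalid field it finds.
func validate(p params) error {
	if p.MaxDepth < 1 || p.MaxDepth > 1000 {
		return fmt.Errorf("maxdepth %d is out of range, must be between 1 and 1000", p.MaxDepth)
	}
	if _, err := regexp.Compile(p.Pattern); err != nil {
		return fmt.Errorf("pattern %q does not compile: %v", p.Pattern, err)
	}
	if _, err := time.ParseDuration(p.Timeout); err != nil {
		return fmt.Errorf("timeout %q is not a valid duration: %v", p.Timeout, err)
	}
	return nil
}

func main() {
	fmt.Println(validate(params{MaxDepth: 10, Pattern: `^/etc/.*\.conf$`, Timeout: "30s"}))
	fmt.Println(validate(params{MaxDepth: 10, Pattern: `([a-z`, Timeout: "30s"}))
}
```

Each error names the offending field and value, which is what lets an investigator correct the action without reading the module source.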
A good example of validating parameters can be found in the file module at
https://github.com/mozilla/mig/blob/master/modules/file/file.go
Results must follow a specific format defined in modules.Result. Some rules
apply to the way fields in this struct must be set.
type Result struct {
Success bool `json:"success"`
FoundAnything bool `json:"foundanything"`
Elements interface{} `json:"elements"`
Statistics interface{} `json:"statistics"`
Errors []string `json:"errors"`
}
Success must inform the investigator if the module has failed to complete its
execution. It must be set to true only if the module has run successfully. It
does not indicate anything about the results returned by the module, just that
it ran and finished.
FoundAnything must be set to true only when the module was tasked with
finding something, and at least one instance of that something was found. If
the module searched for multiple things, one find is enough to set this flag to
true. The goal is to indicate to the investigator that the results from this
agent need closer scrutiny.
Elements contains raw results from the module. This is defined as an
interface, which means that each module must define the format of the results
returned to the MIG platform. The only rule here is that modules must never
return raw data to investigators. Metadata is fine, but file contents or
memory dumps are not something MIG should be transporting ever.
Statistics is an optional struct that can contain stats about the execution
of the module. For example, the file module returns the number of files
inspected by a given search, as well as the time it took to run the
investigation. That information is often useful for investigators.
Errors is an array of strings that can contain soft and hard errors. If the
module failed to run, Success would be set to false and Errors would
contain a single error with the description of the failure. If the module
ran successfully, then Errors could contain soft failures that did not
prevent the module from finishing, but may be useful for the investigator to
know about. For example, if the memory module fails to inspect a given memory
region, the Errors array could contain an entry providing that information.
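The rules above can be sketched as a small constructor. The Result type below copies the struct shown earlier so that the example is self-contained; finish and the "found" statistic are hypothetical names chosen for the illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result copies the struct shown above so that the sketch is self-contained.
type Result struct {
	Success       bool        `json:"success"`
	FoundAnything bool        `json:"foundanything"`
	Elements      interface{} `json:"elements"`
	Statistics    interface{} `json:"statistics"`
	Errors        []string    `json:"errors"`
}

// finish fills in a Result for a run that completed. found is the number of
// matches; softErrors are problems that did not stop the run.
func finish(elements interface{}, found int, softErrors []string) Result {
	return Result{
		Success:       true,      // the run finished, whatever it found
		FoundAnything: found > 0, // one match is enough
		Elements:      elements,
		Statistics:    map[string]int{"found": found},
		Errors:        softErrors,
	}
}

func main() {
	res := finish([]string{"/etc/passwd"}, 1, []string{"cannot read /root: permission denied"})
	buf, err := json.Marshal(res)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(buf))
}
```

The run in main reports a soft error and still has Success set to true, because the error did not prevent the module from finishing.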
HasResultsPrinter is an interface used to allow a module to implement
the PrintResults() function. PrintResults() is a pretty-printer used to display
the results of a module as an array of strings. It is defined as a module-specific
interface because only the module knows how to parse its Elements and
Statistics interfaces in modules.Result.
The interface is defined as:
// HasResultsPrinter implements functions used by module to print information
type HasResultsPrinter interface {
PrintResults(result Result, showResultsOnly bool) ([]string, error)
}
A typical implementation of PrintResults takes a modules.Result struct and
a boolean that indicates whether the printer should display errors and
statistics or only found results. When that boolean is set to true, errors, stats
and empty results are not displayed. Note that the result argument is
the result of unmarshalling the marshalled value returned from the Run method.
The function returns results into an array of strings.
func (r *run) PrintResults(result modules.Result, matchOnly bool) (prints []string, err error) {
var (
el elements
stats statistics
)
err = result.GetElements(&el)
if err != nil {
panic(err)
}
// ... add things into the prints array ...
if matchOnly {
return // stop here
}
for _, e := range result.Errors {
prints = append(prints, fmt.Sprintf("error: %v", e))
}
err = result.GetStatistics(&stats)
if err != nil {
panic(err)
}
// ... add stats into the prints array ...
return
}
HasParamsCreator declares the ParamsCreator() function used to provide
interactive parameters creation in the MIG Console. The function does not take
any input value, but implements a terminal prompt for the investigator to
fill in the module parameters. The function returns a Parameters structure
that the MIG Console will add into an Action.
It can be implemented in various ways, as long as it prompts the user in the
terminal using something like fmt.Scanln().
The interface is defined as:
type HasParamsCreator interface {
ParamsCreator() (interface{}, error)
}
A module implementation would have the function:
func (r *run) ParamsCreator() (interface{}, error) {
fmt.Println("initializing timedrift parameters creation")
var err error
var p params
printHelp(false)
scanner := bufio.NewScanner(os.Stdin)
for {
fmt.Printf("drift> ")
scanner.Scan()
if err := scanner.Err(); err != nil {
fmt.Println("Invalid input. Try again")
continue
}
input := scanner.Text()
if input == "help" {
printHelp(false)
continue
}
if input != "" {
_, err = time.ParseDuration(input)
if err != nil {
fmt.Println("invalid drift duration. try again. ex: drift> 5s")
continue
}
}
p.Drift = input
break
}
r.Parameters = p
return r.Parameters, r.ValidateParameters()
}
It is highly recommended to call ValidateParameters to verify that the
parameters supplied by the users are correct.
HasParamsParser is similar to HasParamsCreator, but implements a command
line parameters parser instead of an interactive prompt. It is used by the MIG
command line to parse module-specific flags into module Parameters. Each module
must implement ParamsParser() to transform an array of strings into a
parameters interface. The recommended way to implement it is to use FlagSet
from the flag Go package.
The interface is defined as:
// HasParamsParser implements a function that parses command line parameters
type HasParamsParser interface {
ParamsParser([]string) (interface{}, error)
}
A typical implementation from the timedrift module looks as follows:
func (r *run) ParamsParser(args []string) (interface{}, error) {
var (
err error
drift string
fs flag.FlagSet
)
if len(args) >= 1 && args[0] == "help" {
printHelp(true)
return nil, fmt.Errorf("help printed")
}
if len(args) == 0 {
return r.Parameters, nil
}
fs.Init("time", flag.ContinueOnError)
fs.StringVar(&drift, "drift", "", "see help")
err = fs.Parse(args)
if err != nil {
return nil, err
}
_, err = time.ParseDuration(drift)
if err != nil {
return nil, fmt.Errorf("invalid drift duration. try help.")
}
r.Parameters.Drift = drift
return r.Parameters, r.ValidateParameters()
}
It is highly recommended to call ValidateParameters to verify that the
parameters supplied by the users are correct.
Modules can implement the HasEnhancedPrivacy interface by providing an
EnhancePrivacy function.
If extra privacy mode has been enabled in the agent configuration, results that
are returned from a module will be passed through the module's EnhancePrivacy
function. This gives the module a means to mask certain metadata as desired
in the result set.
The function is only run if extra privacy mode is enabled. If the module does
not implement HasEnhancedPrivacy, the results are returned as-is.
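As a rough illustration of what such a function might do, the sketch below masks a hostname and the host part of each address. Everything in it is an assumption made for the example: the method signature (a result in, a masked result and an error out), the trimmed-down Result stand-in, and the choice of what to mask. Check the HasEnhancedPrivacy definition in mig/modules for the real signature.

```go
package main

import (
	"fmt"
	"strings"
)

// Result is a trimmed-down stand-in for modules.Result.
type Result struct {
	Elements interface{}
}

type elements struct {
	Hostname  string
	Addresses []string
}

type run struct{}

// EnhancePrivacy masks the hostname and the host part of each address.
// The signature is an assumption made for this sketch.
func (r *run) EnhancePrivacy(in Result) (Result, error) {
	el, ok := in.Elements.(elements)
	if !ok {
		return in, fmt.Errorf("unexpected elements type %T", in.Elements)
	}
	masked := elements{Hostname: "masked"}
	for _, addr := range el.Addresses {
		// Keep only the prefix length, e.g. "172.21.0.3/20" becomes "masked/20".
		if i := strings.LastIndex(addr, "/"); i >= 0 {
			masked.Addresses = append(masked.Addresses, "masked"+addr[i:])
		} else {
			masked.Addresses = append(masked.Addresses, "masked")
		}
	}
	in.Elements = masked
	return in, nil
}

func main() {
	in := Result{Elements: elements{
		Hostname:  "fedbox2.jaffa.linuxwall.info",
		Addresses: []string{"172.21.0.3/20", "fe80::8e70:5aff:fec8:be50/64"},
	}}
	out, err := new(run).EnhancePrivacy(in)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out.Elements)
}
```

The function builds a new elements value instead of editing the input in place, so the unmasked data is never partially exposed if masking fails midway.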
An example module that can be used as a template is available in src/mig/modules/example/. We will study its structure to understand how modules are written and executed.
The first part of the module takes care of the registration and declaration of needed structs.
package example
import (
"encoding/json"
"fmt"
"io"
"mig/modules"
"net"
"os"
"regexp"
)
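// An instance of this type represents the module.
type module struct {
}
// NewRun returns a new instance representing a single run of the module.
func (m *module) NewRun() modules.Runner {
return new(run)
}
// init is called by the Go runtime at startup. We use this function to
// register the module in a global array of available modules, so the
// agent knows we exist
func init() {
modules.Register("example", new(module))
}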
type run struct {
Parameters params
Results modules.Result
}
// a simple parameters structure, the format is arbitrary
type params struct {
GetHostname bool `json:"gethostname"`
GetAddresses bool `json:"getaddresses"`
LookupHost []string `json:"lookuphost"`
}
type elements struct {
Hostname string `json:"hostname,omitempty"`
Addresses []string `json:"addresses,omitempty"`
LookedUpHost map[string][]string `json:"lookeduphost,omitempty"`
}
type statistics struct {
StuffFound int64 `json:"stufffound"`
}
Three custom structs are defined: params, elements and statistics.
params implements custom module parameters. In this instance, the module will
access two booleans (GetHostname and GetAddresses), and one array of
strings (LookupHost). We have decided that this module will return its
hostname if GetHostname is set to true. It will return its IP addresses if
GetAddresses is set to true, and it will perform DNS lookups and return the
IP addresses of each FQDN listed in the LookupHost array.
elements will contain the results found by the module. The hostname will go
into elements.Hostname. The local addresses will be appended into
elements.Addresses. Each host that was looked up will be added into the
elements.LookedUpHost map with its own array of IP addresses.
statistics just keeps a counter of the things that were found. We could also add
an execution timer in this struct to indicate how long it took the module to
run.
Next we'll implement a parameters validation function.
func (r *run) ValidateParameters() (err error) {
fqdn := regexp.MustCompilePOSIX(`^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$`)
for _, host := range r.Parameters.LookupHost {
if !fqdn.MatchString(host) {
return fmt.Errorf("ValidateParameters: LookupHost parameter is not a valid FQDN.")
}
}
return
}
Since our parameters struct is very basic, there is little verification to do.
The two booleans don't need verification, because Go is strongly typed. But we
attempt to validate the FQDN of hosts that need to be looked up with a regular
expression. If the validation fails, ValidateParameters returns an error.
Run is what the agent will call when the module is executed. It starts by
defining a panic handling routine that will transform panics into
modules.Result.Errors and return the JSON.
Then, Run() reads parameters from stdin. The call to modules.ReadInputParameters
will block until one line of input is received. If what was received isn't
valid parameters, it panics.
func (r *run) Run(in io.Reader) (out string) {
defer func() {
if e := recover(); e != nil {
r.Results.Errors = append(r.Results.Errors, fmt.Sprintf("%v", e))
r.Results.Success = false
buf, _ := json.Marshal(r.Results)
out = string(buf[:])
}
}()
err := modules.ReadInputParameters(in, &r.Parameters)
if err != nil {
panic(err)
}
err = r.ValidateParameters()
if err != nil {
panic(err)
}
moduleDone := make(chan bool)
stop := make(chan bool)
go r.doModuleStuff(&out, &moduleDone)
go modules.WatchForStop(in, &stop)
select {
case <-moduleDone:
return out
case <-stop:
panic("stop message received, terminating early")
}
}
What happens after is a little tricky to follow. We want the module to do work,
but we also want to allow the investigator to kill the module early if needed.
So we first send the module to perform the work by calling
go r.doModuleStuff(&out, &moduleDone), where &out is a pointer to the string
that Run() will return, and &moduleDone is a channel that will receive a boolean
when the module is done doing stuff.
Meanwhile, we start another goroutine, go modules.WatchForStop(in, &stop), that
will continuously read the standard input of the module. If a stop message is
received on the standard input, the goroutine inserts a boolean in the stop
channel. This method is typically used by the agent to ask a module to shut down.
Both routines are running in parallel, and we use a select {case} to detect
the first one that has activity. If the module is done, Run() exits normally
by returning the value of out. But if a stop message is received, then
Run() panics, which will generate a nicely formatted error in the defer block.
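The race between "work finished" and "stop requested" can be reduced to the following sketch. It is a simplified variant, not the module's code: runWithStop is a hypothetical helper, the result travels over the channel instead of through a pointer, and the channels are passed by value.

```go
package main

import "fmt"

// runWithStop starts work in a goroutine and returns its output, unless a
// value arrives on stop first, in which case it reports early termination.
func runWithStop(work func() string, stop <-chan bool) string {
	done := make(chan string, 1) // buffered so the worker never blocks on send
	go func() { done <- work() }()
	select {
	case out := <-done:
		return out
	case <-stop:
		return "stop message received, terminating early"
	}
}

func main() {
	// No stop message: the work finishes and its output is returned.
	fmt.Println(runWithStop(func() string { return "results" }, make(chan bool)))

	// A stop message is already pending and the work never finishes.
	stop := make(chan bool, 1)
	stop <- true
	never := make(chan string)
	fmt.Println(runWithStop(func() string { return <-never }, stop))
}
```

Buffering the done channel means a worker that finishes after a stop was handled can still deliver its value and exit, instead of blocking forever on a send that nobody receives.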
doModuleStuff and buildResults are two module specific functions that
perform the core of the module work. Their implementation is completely
arbitrary. The only requirement is that the data returned is a JSON marshalled
string of the struct modules.Result.
In the sample below, the variables el and stats implement the elements
and statistics types defined previously. Results are stored in these two
variables, then copied into results alongside potential errors.
Note in buildResults the way FoundAnything and Success are set to
implement the rules defined earlier in this page.
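func (r *run) doModuleStuff(out *string, moduleDone *chan bool) error {
var (
el elements
stats statistics
)
// panics raised in this goroutine are not seen by the recover() in Run,
// so they are converted into a JSON error here
defer func() {
if e := recover(); e != nil {
r.Results.Errors = append(r.Results.Errors, fmt.Sprintf("%v", e))
r.Results.Success = false
buf, _ := json.Marshal(r.Results)
*out = string(buf[:])
*moduleDone <- true
}
}()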
el.LookedUpHost = make(map[string][]string)
stats.StuffFound = 0 // count for stuff
// grab the hostname of the endpoint
if r.Parameters.GetHostname {
hostname, err := os.Hostname()
if err != nil {
panic(err)
}
el.Hostname = hostname
stats.StuffFound++
}
// grab the local ip addresses
if r.Parameters.GetAddresses {
addresses, err := net.InterfaceAddrs()
if err != nil {
panic(err)
}
for _, addr := range addresses {
if addr.String() == "127.0.0.1/8" || addr.String() == "::1/128" {
continue
}
el.Addresses = append(el.Addresses, addr.String())
stats.StuffFound++
}
}
// look up a host
for _, host := range r.Parameters.LookupHost {
addrs, err := net.LookupHost(host)
if err != nil {
panic(err)
}
el.LookedUpHost[host] = addrs
}
// marshal the results into a json string
*out = r.buildResults(el, stats)
*moduleDone <- true
return nil
}
func (r *run) buildResults(el elements, stats statistics) string {
if len(r.Results.Errors) == 0 {
r.Results.Success = true
}
r.Results.Elements = el
r.Results.Statistics = stats
if stats.StuffFound > 0 {
r.Results.FoundAnything = true
}
jsonOutput, err := json.Marshal(r.Results)
if err != nil {
panic(err)
}
return string(jsonOutput[:])
}
Printing results is needed to visualize module results efficiently. Nobody wants to read raw JSON, especially when querying thousands of agents at once.
The function below receives a modules.Result struct that needs to be further
analyzed to access the elements and statistics types. Because these types
are specific to the module, and not known to MIG, they need to be accessed
using result.GetElements and result.GetStatistics.
The rest of the code simply goes through the values and pretty-prints them into
the prints array of strings.
func (r *run) PrintResults(result modules.Result, matchOnly bool) (prints []string, err error) {
var (
el elements
stats statistics
)
err = result.GetElements(&el)
if err != nil {
panic(err)
}
if el.Hostname != "" {
prints = append(prints, fmt.Sprintf("hostname is %s", el.Hostname))
}
for _, addr := range el.Addresses {
prints = append(prints, fmt.Sprintf("address is %s", addr))
}
for host, addrs := range el.LookedUpHost {
for _, addr := range addrs {
prints = append(prints, fmt.Sprintf("lookedup host %s has IP %s", host, addr))
}
}
if matchOnly {
return
}
for _, e := range result.Errors {
prints = append(prints, fmt.Sprintf("error: %v", e))
}
err = result.GetStatistics(&stats)
if err != nil {
panic(err)
}
prints = append(prints, fmt.Sprintf("stat: %d stuff found", stats.StuffFound))
return
}
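The reason an accessor such as result.GetElements is needed can be seen in the sketch below. Once the JSON returned by Run has been unmarshalled into a generic interface value, the module's own types are gone and have to be rebuilt. decodeElements is a hypothetical equivalent written for this illustration (re-marshal, then unmarshal into the typed struct); it is not claimed to be how MIG implements GetElements.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result is a trimmed-down stand-in for modules.Result.
type Result struct {
	Elements interface{} `json:"elements"`
}

type elements struct {
	Hostname  string   `json:"hostname,omitempty"`
	Addresses []string `json:"addresses,omitempty"`
}

// decodeElements is a hypothetical equivalent of result.GetElements: it
// re-marshals the generic value and unmarshals it into the typed struct.
func decodeElements(r Result, out interface{}) error {
	buf, err := json.Marshal(r.Elements)
	if err != nil {
		return err
	}
	return json.Unmarshal(buf, out)
}

func main() {
	// Simulate what the client sees: JSON produced by Run, decoded generically.
	raw := `{"elements":{"hostname":"host1","addresses":["172.21.0.3/20"]}}`
	var res Result
	if err := json.Unmarshal([]byte(raw), &res); err != nil {
		panic(err)
	}
	fmt.Printf("%T\n", res.Elements) // map[string]interface {}

	var el elements
	if err := decodeElements(res, &el); err != nil {
		panic(err)
	}
	fmt.Println(el.Hostname, el.Addresses) // host1 [172.21.0.3/20]
}
```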
to be added...