Skip to content
This repository has been archived by the owner on Sep 14, 2019. It is now read-only.

Latest commit

 

History

History
771 lines (617 loc) · 24.6 KB

modules.rst

File metadata and controls

771 lines (617 loc) · 24.6 KB

MIG Modules

In this document, we explain how modules are written and integrated into MIG.

The reception of a command by an agent triggers the execution of modules. A module is a Go package that is imported into the agent at compilation, and that performs a very specific set of tasks. For example, the file module provides a way to scan a file system for files that contain regexes, match a checksum, ... Another module is called netstat, and looks for IP addresses currently connected to an endpoint. ping is a module to ping targets from endpoints, etc..

Modules are somewhat autonomous. They can be developed outside of the MIG code base, and only imported during compilation of the agent. Go does not provide a way to load modules dynamically, so modules are compiled into the agent's static binary, and not as separate files.

There are two types of modules. Standard modules, which were the initial module type supported by MIG, and persistent modules.

A standard module is invoked by the agent when it recieves by a command, and the results provided by this module can be considered point in time. It does not keep any state between runs, and is used for general investigation activities. Some examples of standard modules include the file module for scanning the file system for certain criteria, or the netstat module for looking at current network communication.

A persistent module is run by the agent when it starts, and can perform more on-going tasks or analysis activities. Persistent modules are kept running by the agent. Persistent modules can be queried just like standard modules, but instead of a one-time invocation of the module, when the investigator queries a persistent module you are querying into the already running module. This can be used to collect statistics or results, change the behavior of the persistent module, or various other activities.

Persistent modules are developed in a similar manner to standard modules with a few additions. This discusses elements that are common to all modules. For details specific to the implementation of persistent modules see the persistent module documentation.

A module must import mig/modules.

A module registers itself at runtime via its init() function which must call modules.Register with a module name and an instance implementing modules.Moduler:

type Moduler interface {
    NewRun() Runner
}

A module must have a unique name. A good practice is to use the same name for the module name as for the Go package name. However, it is possible for a single Go package to implement multiple modules, simply by registering different Modulers with different names.

The sole method of a Moduler creates a new instance to represent a "run" of the module, implementing the modules.Runner interface:

type Runner interface {
        Run(io.Reader) string
        ValidateParameters() error
}

Any run-specific information should be associated with this instance and not with the Moduler or stored in a global variable. It should be possible for multiple runs of the module to execute simultaneously.

The code sample below shows how the example module uses package name example and registers with name example.

package example

import (
        "mig/modules"
)

// An instance of this type will represent this module; it's possible to add
// additional data fields here, although that is rarely needed.
type module struct {
}

func (m *module) NewRun() interface{} {
        return new(run)
}

// init is called by the Go runtime at startup. We use this function to
// register the module in a global array of available modules, so the
// agent knows we exist
func init() {
        modules.Register("example", new(module))
}

type run struct {
        Parameters params
        Results    modules.Result
}

init() is a go builtin function that is executed automatically in all imported packages when a program starts. In the agents, modules are imported anonymously, which means that their init() function will be executed even if the modules are unused in the agent. Therefore, when MIG Agent starts, all modules execute their init() function, add their names and runner function to the global list of available modules, and stop there.

The list of modules imported in the agent is maintained in conf/available_modules.go. You should use this file to add or remove modules.

import (
        //_ "mig/modules/example"
        _ "mig/modules/agentdestroy"
        _ "mig/modules/file"
        _ "mig/modules/netstat"
        _ "mig/modules/timedrift"
        _ "mig/modules/ping"
)

When the agent receives a command to execute, it looks up modules in the global list modules.Available, and if a module is registered to execute the command, calls its runner function to get a new instance representing the run, and then calls that instance's Run method.

A mig module typically defines its own run struct implementing the modules.Runner interface and representing a single run of the module. The run struct typically contains two fields: module parameters and module results. The former is any format the module chooses to use, while the latter generally implements the modules.Result struct (note that this is not required, but it is the easiest way to return a properly-formatted JSON result).

type run struct {
        Parameters myModuleParams
        Results    modules.Result
}

When a module is available to run an operation, the agent passes the operation parameters to the module.

The easiest way to see this is to invoke the agent binary with the flag -m, followed by the name of the module:

$ mig-agent -m example <<< '{"class":"parameters", "parameters":{"gethostname": true, "getaddresses": true, "lookuphost": ["www.google.com"]}}'
[info] using builtin conf
{"foundanything":true,"success":true,"elements":{"hostname":"fedbox2.jaffa.linuxwall.info","addresses":["172.21.0.3/20","fe80::8e70:5aff:fec8:be50/64"],"lookeduphost":{"www.google.com":["74.125.196.105","74.125.196.147","74.125.196.106","74.125.196.104","74.125.196.103","74.125.196.99","2607:f8b0:4002:c07::6a"]}},"statistics":{"stufffound":3},"errors":null}

The module receives this JSON input as an io.Reader passed to its Run method.

The module's Run method should start by trying to read parameters from the given in io.Reader. It then validates the parameters against its own formatting rules, performs work and returns results in a JSON string.

func (r *run) Run(in io.Reader) string {
        defer func() {
                if e := recover(); e != nil {
                        r.Results.Errors = append(r.Results.Errors, fmt.Sprintf("%v", e))
                        r.Results.Success = false
                        buf, _ := json.Marshal(r.Results)
                        out = string(buf[:])
                }
        }()

        err := modules.ReadInputParameters(in, &r.Parameters)
        if err != nil {
                panic(err)
        }

        err = r.ValidateParameters()
        if err != nil {
                panic(err)
        }

        return r.doModuleStuff()
}

The defer block in the sample above is used to catch potential panics and returns a nicely formatted JSON error to the agent. This is a clean way to indicate to the MIG platform that the module has failed to run on this agent.

A module must implement the ValidateParameters() method.

The role of that interface is to go through the parameters supplied to Run and verify that they follow a format expected by the module. This method is useful during Run but is not called from outside the module.

Go is strongly typed, so there's no risk of finding a string when a float is expected. However, this function should verify that values are in a proper range, that regular expressions compile without errors, or that string parameters use the correct syntax.

When validation fails, an error with a descriptive validation failure must be returned to the caller.

A good example of validating parameters can be found in the file module at https://github.com/mozilla/mig/blob/master/modules/file/file.go

Results must follow a specific format defined in modules.Result. Some rules apply to the way fields in this struct must be set.

type Result struct {
        Success       bool        `json:"success"`
        FoundAnything bool        `json:"foundanything"`
        Elements      interface{} `json:"elements"`
        Statistics    interface{} `json:"statistics"`
        Errors        []string    `json:"errors"`
}

Success must inform the investigator if the module has failed to complete its execution. It must be set to true only if the module has run successfully. It does not indicate anything about the results returned by the module, just that it ran and finished.

FoundAnything must be set to true only when the module was tasked with finding something, and at least one instance of that something was found. If the module searched for multiple things, one find is enough to set this flag to true. The goal is to indicate to the investigator that the results from this agent need closer scrutiny.

Elements contains raw results from the module. This is defined as an interface, which means that each module must define the format of the results returned to the MIG platform. The only rule here is that modules must never return raw data to investigators. Metadata is fine, but file contents or memory dumps are not something MIG should be transporting ever.

Statistics is an optional struct that can contain stats about the execution of the module. For example, the file module returns the numbers of files inspected by a given search, as well as the time it took to run the investigation. That information is often useful for investigators.

Errors is an array of string that can contain soft and hard errors. If the module failed to run, Success would be set to false and Errors would contain a single error with the description of the failure. If the module succeeded to run, then Errors could contain soft failures that did not prevent the module from finishing, but may be useful for the investigator to know about. For example, if the memory module fails to inspect a given memory region, the Errors array could contain an entry providing that information.

HasResultsPrinter is an interface used to allow a module to implement the PrintResults() function. PrintResults() is a pretty-printer used to display the results of a module as an array of string. It is defined as a module-specific interface because only the module knows how to parse its Elements and Statistics interfaces in modules.Result.

The interface is defined as:

// HasResultsPrinter implements functions used by module to print information
type HasResultsPrinter interface {
        PrintResults(result Result, showResultsOnly bool) ([]string, error)
}

A typical implementation of PrintResults takes a modules.Result struct and a boolean that indicates whether the printer should display errors and statistics or only found results. When that boolean is set to true, errors, stats and empty results are not displayed. Note that the result argument is the result of unmarshalling the marshalled value returned from the Run method.

The function returns results into an array of strings.

func (r *run) PrintResults(result modules.Result, matchOnly bool) (prints []string, err error) {
        var (
                el    elements
                stats statistics
        )
        err = result.GetElements(&el)
        if err != nil {
                panic(err)
        }

        [... add things into the prints array ...]

        if matchOnly {
                return // stop here
        }
        for _, e := range result.Errors {
                prints = append(prints, fmt.Sprintf("error: %v", e))
        }
        err = result.GetStatistics(&stats)
        if err != nil {
                panic(err)
        }
        [... add stats into the prints array ...]
        return
}

HasParamsCreator implements the ParamsCreator() function used to provide interactive parameters creation in the MIG Console. The function does not take any input value, but implements a terminal prompt for the investigator to fill up the module parameters. The function returns a Parameters structure that the MIG Console will add into an Action.

It can be implemented in various ways, as long as it prompt the user in the terminal using something like fmt.Scanln().

The interface is defined as:

type HasParamsCreator interface {
        ParamsCreator() (interface{}, error)
}

A module implementation would have the function:

func (r *run) ParamsCreator() (interface{}, error) {
        fmt.Println("initializing netstat parameters creation")
        var err error
        var p params
        printHelp(false)
        scanner := bufio.NewScanner(os.Stdin)
        for {
                fmt.Printf("drift> ")
                scanner.Scan()
                if err := scanner.Err(); err != nil {
                        fmt.Println("Invalid input. Try again")
                        continue
                }
                input := scanner.Text()
                if input == "help" {
                        printHelp(false)
                        continue
                }
                if input != "" {
                        _, err = time.ParseDuration(input)
                        if err != nil {
                                fmt.Println("invalid drift duration. try again. ex: drift> 5s")
                                continue
                        }
                }
                p.Drift = input
                break
        }
        r.Parameters = p
        return r.Parameters, r.ValidateParameters()
}

It is highly recommended to call ValidateParameters to verify that the parameters supplied by the users are correct.

HasParamsParser is similar to HasParamsCreator, but implements a command line parameters parser instead of an interactive prompt. It is used by the MIG command line to parse module-specific flags into module Parameters. Each module must implement ParamsParser() to transform an array of string into a parameters interface. The recommended way to implement it is to use FlagSet from the flag Go package. The interface is defined as:

// HasParamsParser implements a function that parses command line parameters
type HasParamsParser interface {
        ParamsParser([]string) (interface{}, error)
}

A typical implementation from the timedrift module looks as follows:

func (r *run) ParamsParser(args []string) (interface{}, error) {
        var (
                err   error
                drift string
                fs    flag.FlagSet
        )
        if len(args) >= 1 && args[0] == "help" {
                printHelp(true)
                return nil, fmt.Errorf("help printed")
        }
        if len(args) == 0 {
                return r.Parameters, nil
        }
        fs.Init("time", flag.ContinueOnError)
        fs.StringVar(&drift, "drift", "", "see help")
        err = fs.Parse(args)
        if err != nil {
                return nil, err
        }
        _, err = time.ParseDuration(drift)
        if err != nil {
                return nil, fmt.Errorf("invalid drift duration. try help.")
        }
        r.Parameters.Drift = drift
        return r.Parameters, r.ValidateParameters()
}

It is highly recommended to call ValidateParameters to verify that the parameters supplied by the users are correct.

Modules can implement the HasEnhancedPrivacy interface by providing an EnhancePrivacy function.

If extra privacy mode has been enabled in the agent configuration, results that are returned from a module will be passed through the modules EnhancePrivacy function. This provides the module a means to mask certain meta-data as desired from the result set.

The function is only run if extra privacy mode is enabled, if not the function will not be run on results. If the module does not implement HasEnhancedPrivacy, the results are returned as-is.

An example module that can be used as a template is available in src/mig/modules/example/. We will study its structure to understand how modules are written and executed.

The first part of the module takes care of the registration and declaration of needed structs.

package example

import (
        "encoding/json"
        "fmt"
        "mig/modules"
        "net"
        "os"
        "regexp"
)

// init is called by the Go runtime at startup. We use this function to
// register the module in a global array of available modules, so the
// agent knows we exist
func init() {
        modules.Register("example", func() interface{} {
                return new(run)
        })
}

type run struct {
        Parameters params
        Results    modules.Result
}

// a simple parameters structure, the format is arbitrary
type params struct {
        GetHostname  bool     `json:"gethostname"`
        GetAddresses bool     `json:"getaddresses"`
        LookupHost   []string `json:"lookuphost"`
}

type elements struct {
        Hostname     string              `json:"hostname,omitempty"`
        Addresses    []string            `json:"addresses,omitempty"`
        LookedUpHost map[string][]string `json:"lookeduphost,omitempty"`
}

type statistics struct {
        StuffFound int64 `json:"stufffound"`
}

Three custom structs are defined: params, elements and statistics.

params implements custom module parameters. In this instance, the module will access two booleans (GetHostname and GetAddresses), and one array of strings (LookupHost). We have decided that this module will return its hostname if GetHostname is set to true. It will return its IP addresses if GetAddresses is set to true, and it will perform DNS lookups and return the IP addresses of each FQDN listed in the LookupHost array.

elements will contain the results found by the module. The hostname will go into elements.Hostname. The local addresses will be appended into elements.Addresses. And each host that was looked up will be added into the elements.LookedUpHost map with their own arrays of IP addresses.

statistics just keeps a counter of stuffs that was found. We could also add an execution timer in this struct to indicate how look it took the module to run.

Next we'll implement a parameters validation function.

func (r *run) ValidateParameters() (err error) {
        fqdn := regexp.MustCompilePOSIX(`^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$`)
        for _, host := range r.Parameters.LookupHost {
                if !fqdn.MatchString(host) {
                        return fmt.Errorf("ValidateParameters: LookupHost parameter is not a valid FQDN.")
                }
        }
        return
}

Since our parameters struct is very basic, there is little verification to do. The two booleans don't need verification, because Go is strongly typed. But we attempt to validate the FQDN of hosts that need to be looked up with a regular expression. If the validation fails, ValidateParameters returns an error.

Run is what the agent will call when the module is executed. It starts by defining a panic handling routine that will transform panics into modules.Result.Errors and return the JSON.

Then, Run() reads parameters from stdin. The call to modules.ReadInputParameters will block until one line of input is received. If what was received isn't valid parameters, it panics.

func (r *run) Run(in io.Reader) (out string) {
        defer func() {
                if e := recover(); e != nil {
                        r.Results.Errors = append(r.Results.Errors, fmt.Sprintf("%v", e))
                        r.Results.Success = false
                        buf, _ := json.Marshal(r.Results)
                        out = string(buf[:])
                }
        }()

        err := modules.ReadInputParameters(in, &r.Parameters)
        if err != nil {
                panic(err)
        }
        err = r.ValidateParameters()
        if err != nil {
                panic(err)
        }

        moduleDone := make(chan bool)
        stop := make(chan bool)
        go r.doModuleStuff(&out, &moduleDone)
        go modules.WatchForStop(in, &stop)

        select {
        case <-moduleDone:
                return out
        case <-stop:
                panic("stop message received, terminating early")
        }
}

What happens after is a little tricky to follow. We want the module to do work, but we also want to allow the investigator to kill the module early if needed. So we first send the module to perform the work by calling go r.doModuleStuff(&out, &moduleDone) where &out is a pointer to the string that Run() will return, and &moduleDone is a channel that will receive a boolean when the module is done doing stuff.

Meanwhile, we start another goroutine go modules.WatchForStop(in, &stop) that will continously read the standard input of the module. If a stop message is received on the standard input, the goroutine inserts a boolean in the stop channel. This method is typically used by the agent to ask a module to shutdown.

Both routines are running in parallel, and we use a select {case} to detect the first one that has activity. If the module is done, Run() exits normally by returning the value of out. But if a stop message is received, then Run() panics, which will generate a nicely formatted error in the defer block.

doModuleStuff and buildResults are two module specific functions that perform the core of the module work. Their implementation is completely arbitrary. The only requirement is that the data returned is a JSON marshalled string of the struct modules.Result.

In the sample below, the variables el and stats implement the elements and statistics types defined previously. Results are stored in these two variables, then copied into results alongside potential errors.

Note in buildResults the way FoundAnything and Success are set to implement the rules defined earlier in this page.

func (r *run) doModuleStuff(out *string, moduleDone *chan bool) error {
        var (
                el    elements
                stats statistics
        )
        el.LookedUpHost = make(map[string][]string)

        stats.StuffFound = 0 // count for stuff

        // grab the hostname of the endpoint
        if r.Parameters.GetHostname {
                hostname, err := os.Hostname()
                if err != nil {
                        panic(err)
                }
                el.Hostname = hostname
                stats.StuffFound++
        }

        // grab the local ip addresses
        if r.Parameters.GetAddresses {
                addresses, err := net.InterfaceAddrs()
                if err != nil {
                        panic(err)
                }
                for _, addr := range addresses {
                        if addr.String() == "127.0.0.1/8" || addr.String() == "::1/128" {
                                continue
                        }
                        el.Addresses = append(el.Addresses, addr.String())
                        stats.StuffFound++
                }
        }

        // look up a host
        for _, host := range r.Parameters.LookupHost {
                addrs, err := net.LookupHost(host)
                if err != nil {
                        panic(err)
                }
                el.LookedUpHost[host] = addrs
        }

        // marshal the results into a json string
        *out = r.buildResults(el, stats)
        *moduleDone <- true
        return nil
}

func (r *run) buildResults(el elements, stats statistics) string {
        if len(r.Results.Errors) == 0 {
                r.Results.Success = true
        }
        r.Results.Elements = el
        r.Results.Statistics = stats
        if stats.StuffFound > 0 {
                r.Results.FoundAnything = true
        }
        jsonOutput, err := json.Marshal(r.Results)
        if err != nil {
                panic(err)
        }
        return string(jsonOutput[:])
}

Printing results is needed to visualize module results efficiently. Nobody wants to read raw json, especially when querying thousands of agents at once.

The function below receives a modules.Result struct that need to be further analyzed to access the elements and statistics types. Because these types are specific to the module, and not known to MIG, they need to be accessed using result.GetElements and result.GetStatistics.

The rest of the code simply goes through the values and pretty-prints them into the prints array of strings.

func (r *run) PrintResults(result modules.Result, matchOnly bool) (prints []string, err error) {
        var (
                el    elements
                stats statistics
        )
        err = result.GetElements(&el)
        if err != nil {
                panic(err)
        }
        if el.Hostname != "" {
                prints = append(prints, fmt.Sprintf("hostname is %s", el.Hostname))
        }
        for _, addr := range el.Addresses {
                prints = append(prints, fmt.Sprintf("address is %s", addr))
        }
        for host, addrs := range el.LookedUpHost {
                for _, addr := range addrs {
                        prints = append(prints, fmt.Sprintf("lookedup host %s has IP %s", host, addr))
                }
        }
        if matchOnly {
                return
        }
        for _, e := range result.Errors {
                prints = append(prints, fmt.Sprintf("error: %v", e))
        }
        err = result.GetStatistics(&stats)
        if err != nil {
                panic(err)
        }
        prints = append(prints, fmt.Sprintf("stat: %d stuff found", stats.StuffFound))
        return
}

to be added...