Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write fields.yml files for datasets #239

Merged
merged 6 commits into from
Mar 9, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
64 changes: 64 additions & 0 deletions dev/import-beats/datasets.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package main

import (
"io/ioutil"
"log"
"os"
"path"

"github.com/pkg/errors"
)

type datasetContent struct {
fields fieldsContent
}

type fieldsContent struct {
files map[string][]byte
}

func createDatasets(modulePath string) (map[string]datasetContent, error) {
datasetDirs, err := ioutil.ReadDir(modulePath)
if err != nil {
return nil, errors.Wrapf(err, "cannot module directory %s", modulePath)
}

contents := map[string]datasetContent{}
for _, datasetDir := range datasetDirs {
if !datasetDir.IsDir() {
continue
}
datasetName := datasetDir.Name()

if datasetName == "_meta" {
continue
}

_, err := os.Stat(path.Join(modulePath, datasetName, "_meta"))
if os.IsNotExist(err) {
log.Printf("\t%s: not a valid dataset, skipped", datasetName)
continue
}

log.Printf("\t%s: dataset found", datasetName)
content := datasetContent{}

fieldsFiles, err := loadDatasetFields(modulePath, datasetName)
if err != nil {
return nil, errors.Wrapf(err, "loading dataset fields failed (modulePath: %s, datasetName: %s)",
modulePath, datasetName)
}

content.fields = fieldsContent{
files: map[string][]byte{
"fields.yml": fieldsFiles,
},
}
contents[datasetName] = content
}
return contents, nil
}
46 changes: 46 additions & 0 deletions dev/import-beats/fields.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package main

import (
"bufio"
"bytes"
"io/ioutil"
"log"
"os"
"path/filepath"

"github.com/pkg/errors"
)

func loadDatasetFields(modulePath, datasetName string) ([]byte, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I would concatenate the fields.yml together. The package manager can support multiple *.yml files. Instead you could keep them separate but you need to add a top block to each dataset with the right "prefix". Keeping them separate has a few benefits I think:

  • The shared part could become a component which is reused in elasticsearch templates v2
  • During development I think it helps the dev to differentiate which are global fields and which ones are not. Like this if he changes a global field, he knows to check all datasets.

It could be argued, that the global fields should be on the package level but lets skip this for now as EPM does not support it.

Copy link
Contributor Author

@mtojek mtojek Mar 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any preferences regarding naming?

in every module/dataset/foo/fields/:
package-fields.yml
fields.yml

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference. The above LGTM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll address this comment in #240

moduleFieldsPath := filepath.Join(modulePath, "_meta", "fields.yml")
moduleFields, err := ioutil.ReadFile(moduleFieldsPath)
if err != nil {
return nil, errors.Wrapf(err, "reading module fields file failed (path: %s)", moduleFieldsPath)
}

var buffer bytes.Buffer
buffer.Write(moduleFields)

datasetFieldsPath := filepath.Join(modulePath, datasetName, "_meta", "fields.yml")
datasetFieldsFile, err := os.Open(datasetFieldsPath)
if os.IsNotExist(err) {
log.Printf("Missing fields.yml file. Skipping. (path: %s)\n", datasetFieldsPath)
return buffer.Bytes(), nil
} else if err != nil {
return nil, errors.Wrapf(err, "reading dataset fields file failed (path: %s)", moduleFieldsPath)
}
defer datasetFieldsFile.Close()

scanner := bufio.NewScanner(datasetFieldsFile)
for scanner.Scan() {
line := scanner.Text()
buffer.Write([]byte(" "))
buffer.WriteString(line)
buffer.WriteString("\n")
}
return buffer.Bytes(), nil
}
12 changes: 6 additions & 6 deletions dev/import-beats/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -35,21 +35,21 @@ func main() {
// where logs and metrics are distributed with different beats (oriented either on logs or metrics - metricbeat,
// filebeat, etc.).
func build(beatsDir, outputDir string) error {
packages := packageMap{}
repository := newPackageRepository()

for _, beatName := range logSources {
err := packages.loadFromSource(beatsDir, beatName, "logs")
err := repository.createPackagesFromSource(beatsDir, beatName, "logs")
if err != nil {
return errors.Wrap(err, "loading logs source failed")
return errors.Wrap(err, "creating form logs source failed")
}
}

for _, beatName := range metricSources {
err := packages.loadFromSource(beatsDir, beatName, "metrics")
err := repository.createPackagesFromSource(beatsDir, beatName, "metrics")
if err != nil {
return errors.Wrap(err, "loading metrics source failed")
return errors.Wrap(err, "creating from metrics source failed")
}
}

return packages.writePackages(outputDir)
return repository.save(outputDir)
}
129 changes: 104 additions & 25 deletions dev/import-beats/packages.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,67 +8,146 @@ import (
"io/ioutil"
"log"
"os"
"path"
"path/filepath"
"strings"

"github.com/pkg/errors"
"gopkg.in/yaml.v2"

"github.com/elastic/package-registry/util"
)

type packageMap map[string]*util.Package
var ignoredModules = map[string]bool{"apache2": true}

func (p packageMap) loadFromSource(beatsDir, beatName, packageType string) error {
path := filepath.Join(beatsDir, beatName, "module")
moduleDirs, err := ioutil.ReadDir(path)
type packageContent struct {
manifest util.Package
datasets map[string]datasetContent
}

func newPackageContent(name string) packageContent {
title := strings.Title(name)
return packageContent{
manifest: util.Package{
FormatVersion: "1.0.0",
Name: name,
Description: strings.Title(name + " integration"),
Title: &title,
Version: "0.0.1", // TODO
Type: "integration",
License: "basic",
},
datasets: map[string]datasetContent{},
}
}

func (pc *packageContent) addDatasets(ds map[string]datasetContent) {
for k, v := range ds {
pc.datasets[k] = v
}
}

type packageRepository struct {
packages map[string]packageContent
}

func newPackageRepository() *packageRepository {
return &packageRepository{
packages: map[string]packageContent{},
}
}

func (r *packageRepository) createPackagesFromSource(beatsDir, beatName, packageType string) error {
beatPath := filepath.Join(beatsDir, beatName)
beatModulesPath := filepath.Join(beatPath, "module")

moduleDirs, err := ioutil.ReadDir(beatModulesPath)
if err != nil {
return errors.Wrapf(err, "cannot read directory '%s'", path)
return errors.Wrapf(err, "cannot read directory '%s'", beatModulesPath)
}

for _, moduleDir := range moduleDirs {
if !moduleDir.IsDir() {
continue
}
moduleName := moduleDir.Name()

log.Printf("Visit '%s:%s'\n", beatName, moduleDir.Name())
log.Printf("%s %s: module found\n", beatName, moduleName)
if _, ok := ignoredModules[moduleName]; ok {
log.Printf("%s %s: module skipped\n", beatName, moduleName)
continue
}

_, ok := p[moduleDir.Name()]
_, ok := r.packages[moduleName]
if !ok {
p[moduleDir.Name()] = &util.Package{
FormatVersion: "1.0.0",
Name: moduleDir.Name(),
Version: "0.0.1", // TODO
Type: "integration",
Categories: []string{},
License: "basic",
}
r.packages[moduleName] = newPackageContent(moduleName)
}

p[moduleDir.Name()].Categories = append(p[moduleDir.Name()].Categories, packageType)
aPackage := r.packages[moduleName]
manifest := aPackage.manifest
manifest.Categories = append(manifest.Categories, packageType)
aPackage.manifest = manifest

modulePath := path.Join(beatModulesPath, moduleName)
datasets, err := createDatasets(modulePath)
if err != nil {
return err
}

aPackage.addDatasets(datasets)
r.packages[moduleDir.Name()] = aPackage
}
return nil
}

func (p packageMap) writePackages(outputDir string) error {
for name, content := range p {
log.Printf("Writing package '%s' (version: %s)\n", name, content.Version)
func (r *packageRepository) save(outputDir string) error {
for packageName, content := range r.packages {
manifest := content.manifest

log.Printf("%s-%s write package content\n", packageName, manifest.Version)

path := filepath.Join(outputDir, name+"-"+content.Version)
err := os.MkdirAll(path, 0755)
packagePath := filepath.Join(outputDir, packageName+"-"+manifest.Version)
err := os.MkdirAll(packagePath, 0755)
if err != nil {
return errors.Wrapf(err, "cannot make directory '%s'", path)
return errors.Wrapf(err, "cannot make directory for module: '%s'", packagePath)
}

m, err := yaml.Marshal(content)
m, err := yaml.Marshal(content.manifest)
if err != nil {
return errors.Wrapf(err, "marshaling package content failed (package name: %s)", name)
return errors.Wrapf(err, "marshaling package content failed (package packageName: %s)", packageName)
}

manifestFilePath := filepath.Join(path, "manifest.yml")
manifestFilePath := filepath.Join(packagePath, "manifest.yml")
err = ioutil.WriteFile(manifestFilePath, m, 0644)
if err != nil {
return errors.Wrapf(err, "writing manifest file failed (path: %s)", manifestFilePath)
}

for datasetName, dataset := range content.datasets {
datasetPath := filepath.Join(packagePath, "dataset", datasetName)
err := os.MkdirAll(datasetPath, 0755)
if err != nil {
return errors.Wrapf(err, "cannot make directory for dataset: '%s'", datasetPath)
}

if len(dataset.fields.files) > 0 {
datasetFieldsPath := filepath.Join(datasetPath, "fields")
err := os.MkdirAll(datasetFieldsPath, 0755)
if err != nil {
return errors.Wrapf(err, "cannot make directory for dataset fields: '%s'", datasetPath)
}

for fieldsFileName, fieldsFile := range dataset.fields.files {
log.Printf("\t%s: write '%s' file\n", datasetName, fieldsFileName)

fieldsFilePath := filepath.Join(datasetFieldsPath, fieldsFileName)
err = ioutil.WriteFile(fieldsFilePath, fieldsFile, 0644)
if err != nil {
return errors.Wrapf(err, "writing fields file failed (path: %s)", fieldsFilePath)
}
}
}
}
}
return nil
}
12 changes: 0 additions & 12 deletions dev/package-beats/activemq-0.0.1/manifest.yml

This file was deleted.

11 changes: 0 additions & 11 deletions dev/package-beats/aerospike-0.0.1/manifest.yml

This file was deleted.

12 changes: 0 additions & 12 deletions dev/package-beats/apache-0.0.1/manifest.yml

This file was deleted.

11 changes: 0 additions & 11 deletions dev/package-beats/apache2-0.0.1/manifest.yml

This file was deleted.

11 changes: 0 additions & 11 deletions dev/package-beats/appsearch-0.0.1/manifest.yml

This file was deleted.

11 changes: 0 additions & 11 deletions dev/package-beats/auditd-0.0.1/manifest.yml

This file was deleted.

12 changes: 0 additions & 12 deletions dev/package-beats/aws-0.0.1/manifest.yml

This file was deleted.

Loading