Merge pull request #6 from 128technology/gschrock/cherry-pick-execd-fix
cherry-pick execd fix for one metric per batch
gregschrock authored Jun 9, 2020
2 parents ad6dce1 + aec56f1 commit 476018c
Showing 21 changed files with 228 additions and 118 deletions.
25 changes: 9 additions & 16 deletions 128tech.md
@@ -2,31 +2,24 @@

This README will describe how to modify and build Telegraf for telegraf-128tech.

## Fetching Dependencies
## Pulling in Upstream Changes

At some point, all of the Telegraf dependencies will need to be pulled down. This will be done during build automatically unless you've already done so and use the flag to skip that step (see "Building a New RPM"). _It's nice to do this manually because there's poor visibility into the build step_.
When updating an existing version, the tasks are straightforward.

If you try to run commands without having the dependencies downloaded, you will see errors of the following form.
1. Merge the upstream branch
2. Cherry-pick desired changes that exist on some other upstream branch (master for example)

```
internal/internal.go:24:2: cannot find package "github.com/alecthomas/units" in any of:
/usr/local/go/src/github.com/alecthomas/units (from $GOROOT)
/go/src/github.com/alecthomas/units (from $GOPATH)
```

To fetch dependencies directly, you can do it simply from the shell. See "Using the Shell" for how to get into it. From the shell's default directory, simply run:
## Moving to a New Upstream Version

```
dep ensure --vendor-only -v
```
When moving to a new upstream version, things are a little more complicated. You must identify what has been added to our custom Telegraf version and pull those changes into the new release branch. This can be done as described in [this little article](https://til.hashrocket.com/posts/18139f4f20-list-different-commits-between-two-branches).

The above command provides the best visibility. The technically sanctioned fetch step is:
First, pull down the new upstream branch. Then, determine what's been added locally and needs to be included in the new custom build. Do this by finding the commits that were added in the custom branch. This example uses release 1.14, but that will change as time passes.

```
make deps
git log --no-merges --left-right --graph --cherry-pick --oneline release-1.14..release-128tech-1.14
```

It does take some time to complete. After that, the dependencies exist in the `vendor` folder and don't need to be fetched again.
That should yield a limited number of commits that need to be cherry-picked from the original custom branch to the new one. Some of them may already exist in the new upstream branch if they were backported to the custom branch.

## Building a New RPM

13 changes: 12 additions & 1 deletion CHANGELOG.md
@@ -1,8 +1,19 @@
## v1.14.3 [unreleased]
## v1.14.4 [unreleased]

#### Bugfixes

- [#7325](https://github.com/influxdata/telegraf/issues/7325): Fix "cannot insert the value NULL error" with PerformanceCounters query.
- [#7579](https://github.com/influxdata/telegraf/pull/7579): Fix numeric to bool conversion in converter processor.
- [#7551](https://github.com/influxdata/telegraf/issues/7551): Fix typo in name of gc_cpu_fraction field of the influxdb input.

## v1.14.3 [2020-05-19]

#### Bugfixes

- [#7412](https://github.com/influxdata/telegraf/pull/7412): Use same timestamp for all objects in arrays in the json parser.
- [#7343](https://github.com/influxdata/telegraf/issues/7343): Handle multiple metrics with the same timestamp in dedup processor.
- [#5905](https://github.com/influxdata/telegraf/issues/5905): Fix reconnection of timed out HTTP2 connections influxdb outputs.
- [#7468](https://github.com/influxdata/telegraf/issues/7468): Fix negative value parsing in ipmi_sensor input.

## v1.14.2 [2020-04-28]

11 changes: 11 additions & 0 deletions internal/http.go
@@ -4,6 +4,7 @@ import (
"crypto/subtle"
"net"
"net/http"
"net/url"
)

type BasicAuthErrorFunc func(rw http.ResponseWriter)
@@ -95,3 +96,13 @@ func (h *ipRangeHandler) ServeHTTP(rw http.ResponseWriter, req *http.Request) {

h.onError(rw, http.StatusForbidden)
}

func OnClientError(client *http.Client, err error) {
	// Close connection after a timeout error. If this is a HTTP2
	// connection this ensures that next interval a new connection will be
	// used and name lookup will be performed.
	//   https://github.com/golang/go/issues/36026
	if err, ok := err.(*url.Error); ok && err.Timeout() {
		client.CloseIdleConnections()
	}
}
15 changes: 0 additions & 15 deletions internal/http_go1.11.go

This file was deleted.

9 changes: 0 additions & 9 deletions internal/http_go1.12.go

This file was deleted.

4 changes: 2 additions & 2 deletions plugins/inputs/influxdb/README.md
@@ -59,7 +59,7 @@ and may vary between versions.
- heap_sys
- mcache_sys
- next_gc
- gcc_pu_fraction
- gc_cpu_fraction
- other_sys
- alloc
- stack_inuse
@@ -95,7 +95,7 @@ telegraf --config ~/ws/telegraf.conf --input-filter influxdb --test
> influxdb_measurement,database=_internal,host=tyrion,measurement=tsm1_filestore,url=http://localhost:8086/debug/vars numSeries=2 1463590500247354636
> influxdb_measurement,database=_internal,host=tyrion,measurement=tsm1_wal,url=http://localhost:8086/debug/vars numSeries=4 1463590500247354636
> influxdb_measurement,database=_internal,host=tyrion,measurement=write,url=http://localhost:8086/debug/vars numSeries=1 1463590500247354636
> influxdb_memstats,host=tyrion,url=http://localhost:8086/debug/vars alloc=7642384i,buck_hash_sys=1463471i,frees=1169558i,gc_sys=653312i,gcc_pu_fraction=0.00003825652361068311,heap_alloc=7642384i,heap_idle=9912320i,heap_inuse=9125888i,heap_objects=48276i,heap_released=0i,heap_sys=19038208i,last_gc=1463590480877651621i,lookups=90i,mallocs=1217834i,mcache_inuse=4800i,mcache_sys=16384i,mspan_inuse=70920i,mspan_sys=81920i,next_gc=11679787i,num_gc=141i,other_sys=1244233i,pause_total_ns=24034027i,stack_inuse=884736i,stack_sys=884736i,sys=23382264i,total_alloc=679012200i 1463590500277918755
> influxdb_memstats,host=tyrion,url=http://localhost:8086/debug/vars alloc=7642384i,buck_hash_sys=1463471i,frees=1169558i,gc_sys=653312i,gc_cpu_fraction=0.00003825652361068311,heap_alloc=7642384i,heap_idle=9912320i,heap_inuse=9125888i,heap_objects=48276i,heap_released=0i,heap_sys=19038208i,last_gc=1463590480877651621i,lookups=90i,mallocs=1217834i,mcache_inuse=4800i,mcache_sys=16384i,mspan_inuse=70920i,mspan_sys=81920i,next_gc=11679787i,num_gc=141i,other_sys=1244233i,pause_total_ns=24034027i,stack_inuse=884736i,stack_sys=884736i,sys=23382264i,total_alloc=679012200i 1463590500277918755
> influxdb_shard,database=_internal,engine=tsm1,host=tyrion,id=4,path=/Users/sparrc/.influxdb/data/_internal/monitor/4,retentionPolicy=monitor,url=http://localhost:8086/debug/vars fieldsCreate=65,seriesCreate=26,writePointsOk=7274,writeReq=280 1463590500247354636
> influxdb_subscriber,host=tyrion,url=http://localhost:8086/debug/vars pointsWritten=7274 1463590500247354636
> influxdb_tsm1_cache,database=_internal,host=tyrion,path=/Users/sparrc/.influxdb/data/_internal/monitor/1,retentionPolicy=monitor,url=http://localhost:8086/debug/vars WALCompactionTimeMs=0,cacheAgeMs=2809192,cachedBytes=0,diskBytes=0,memBytes=0,snapshotCount=0 1463590500247354636
2 changes: 1 addition & 1 deletion plugins/inputs/influxdb/influxdb.go
@@ -242,7 +242,7 @@ func (i *InfluxDB) gatherURL(
"pause_total_ns": m.PauseTotalNs,
"pause_ns": m.PauseNs[(m.NumGC+255)%256],
"num_gc": m.NumGC,
"gcc_pu_fraction": m.GCCPUFraction,
"gc_cpu_fraction": m.GCCPUFraction,
},
map[string]string{
"url": url,
2 changes: 1 addition & 1 deletion plugins/inputs/influxdb/influxdb_test.go
@@ -92,7 +92,7 @@ func TestInfluxDB(t *testing.T) {
"heap_sys": int64(33849344),
"mcache_sys": int64(16384),
"next_gc": int64(20843042),
"gcc_pu_fraction": float64(4.287178819113636e-05),
"gc_cpu_fraction": float64(4.287178819113636e-05),
"other_sys": int64(1229737),
"alloc": int64(17034016),
"stack_inuse": int64(753664),
2 changes: 1 addition & 1 deletion plugins/inputs/ipmi_sensor/ipmi.go
@@ -21,7 +21,7 @@ var (
execCommand = exec.Command // execCommand is used to mock commands in tests.
re_v1_parse_line = regexp.MustCompile(`^(?P<name>[^|]*)\|(?P<description>[^|]*)\|(?P<status_code>.*)`)
re_v2_parse_line = regexp.MustCompile(`^(?P<name>[^|]*)\|[^|]+\|(?P<status_code>[^|]*)\|(?P<entity_id>[^|]*)\|(?:(?P<description>[^|]+))?`)
re_v2_parse_description = regexp.MustCompile(`^(?P<analogValue>[0-9.]+)\s(?P<analogUnit>.*)|(?P<status>.+)|^$`)
re_v2_parse_description = regexp.MustCompile(`^(?P<analogValue>-?[0-9.]+)\s(?P<analogUnit>.*)|(?P<status>.+)|^$`)
re_v2_parse_unit = regexp.MustCompile(`^(?P<realAnalogUnit>[^,]+)(?:,\s*(?P<statusDesc>.*))?`)
)

78 changes: 54 additions & 24 deletions plugins/inputs/ipmi_sensor/ipmi_test.go
@@ -7,6 +7,7 @@ import (
"testing"
"time"

"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal"
"github.com/influxdata/telegraf/testutil"
"github.com/stretchr/testify/assert"
@@ -664,11 +665,10 @@ func Test_parseV2(t *testing.T) {
measuredAt time.Time
}
tests := []struct {
name string
args args
wantFields map[string]interface{}
wantTags map[string]string
wantErr bool
name string
args args
expected []telegraf.Metric
wantErr bool
}{
{
name: "Test correct V2 parsing with analog value with unit",
@@ -677,14 +677,19 @@ func Test_parseV2(t *testing.T) {
cmdOut: []byte("Power Supply 1 | 03h | ok | 10.1 | 110 Watts, Presence detected"),
measuredAt: time.Now(),
},
wantFields: map[string]interface{}{"value": float64(110)},
wantTags: map[string]string{
"name": "power_supply_1",
"status_code": "ok",
"server": "host",
"entity_id": "10.1",
"unit": "watts",
"status_desc": "presence_detected",
expected: []telegraf.Metric{
testutil.MustMetric("ipmi_sensor",
map[string]string{
"name": "power_supply_1",
"status_code": "ok",
"server": "host",
"entity_id": "10.1",
"unit": "watts",
"status_desc": "presence_detected",
},
map[string]interface{}{"value": 110.0},
time.Unix(0, 0),
),
},
wantErr: false,
},
@@ -695,26 +700,51 @@ func Test_parseV2(t *testing.T) {
cmdOut: []byte("Intrusion | 73h | ok | 7.1 |"),
measuredAt: time.Now(),
},
wantFields: map[string]interface{}{"value": float64(0)},
wantTags: map[string]string{
"name": "intrusion",
"status_code": "ok",
"server": "host",
"entity_id": "7.1",
"status_desc": "ok",
expected: []telegraf.Metric{
testutil.MustMetric("ipmi_sensor",
map[string]string{
"name": "intrusion",
"status_code": "ok",
"server": "host",
"entity_id": "7.1",
"status_desc": "ok",
},
map[string]interface{}{"value": 0.0},
time.Unix(0, 0),
),
},
wantErr: false,
},
{
name: "parse negative value",
args: args{
hostname: "host",
cmdOut: []byte("DIMM Thrm Mrgn 1 | B0h | ok | 8.1 | -55 degrees C"),
measuredAt: time.Now(),
},
expected: []telegraf.Metric{
testutil.MustMetric("ipmi_sensor",
map[string]string{
"name": "dimm_thrm_mrgn_1",
"status_code": "ok",
"server": "host",
"entity_id": "8.1",
"unit": "degrees_c",
},
map[string]interface{}{"value": -55.0},
time.Unix(0, 0),
),
},
wantErr: false,
},
}
for _, tt := range tests {
var acc testutil.Accumulator

t.Run(tt.name, func(t *testing.T) {
var acc testutil.Accumulator
if err := parseV2(&acc, tt.args.hostname, tt.args.cmdOut, tt.args.measuredAt); (err != nil) != tt.wantErr {
t.Errorf("parseV2() error = %v, wantErr %v", err, tt.wantErr)
}
testutil.RequireMetricsEqual(t, tt.expected, acc.GetTelegrafMetrics(), testutil.IgnoreTime())
})

acc.AssertContainsTaggedFields(t, "ipmi_sensor", tt.wantFields, tt.wantTags)
}
}
12 changes: 6 additions & 6 deletions plugins/inputs/kapacitor/README.md
@@ -33,7 +33,7 @@ The Kapacitor plugin collects metrics from the given Kapacitor instances.
- [notification_dropped](#notification_dropped) _(integer)_
- [primary-handle-count](#primary-handle-count) _(integer)_
- [secondary-handle-count](#secondary-handle-count) _(integer)_
- (Kapacitor Enterprise only) [kapacitor_cluster](#kapacitor_cluster) _(integer)_
- (Kapacitor Enterprise only) [kapacitor_cluster](#kapacitor_cluster) _(integer)_
- [dropped_member_events](#dropped_member_events) _(integer)_
- [dropped_user_events](#dropped_user_events) _(integer)_
- [query_handler_errors](#query_handler_errors) _(integer)_
@@ -49,7 +49,7 @@ The Kapacitor plugin collects metrics from the given Kapacitor instances.
- [buck_hash_sys_bytes](#buck_hash_sys_bytes) _(integer)_
- [frees](#frees) _(integer)_
- [gc_sys_bytes](#gc_sys_bytes) _(integer)_
- [gc_cpu_fraction](#gcc_pu_fraction) _(float)_
- [gc_cpu_fraction](#gc_cpu_fraction) _(float)_
- [heap_alloc_bytes](#heap_alloc_bytes) _(integer)_
- [heap_idle_bytes](#heap_idle_bytes) _(integer)_
- [heap_in_use_bytes](#heap_in_use_bytes) _(integer)_
@@ -109,8 +109,8 @@ The `kapacitor_alert` measurement stores fields with information related to
[Kapacitor alerts](https://docs.influxdata.com/kapacitor/v1.5/working/alerts/).

#### notification-dropped
The number of internal notifications dropped because they arrive too late from another Kapacitor node.
If this count is increasing, Kapacitor Enterprise nodes aren't able to communicate fast enough
The number of internal notifications dropped because they arrive too late from another Kapacitor node.
If this count is increasing, Kapacitor Enterprise nodes aren't able to communicate fast enough
to keep up with the volume of alerts.

#### primary-handle-count
@@ -199,7 +199,7 @@ The number of allocated objects.
The number of heap bytes released to the operating system.

#### heap_sys_bytes
The number of heap bytes obtained from `system`.
The number of heap bytes obtained from `system`.

#### last_gc_ns
The nanosecond epoch time of the last garbage collection.
@@ -293,7 +293,7 @@ The `kapacitor_topics` measurement stores fields related to
Kapacitor topics](https://docs.influxdata.com/kapacitor/latest/working/using_alert_topics/).

#### collected
The number of events collected by Kapacitor topics.
The number of events collected by Kapacitor topics.

---

6 changes: 3 additions & 3 deletions plugins/inputs/sqlserver/sqlserver.go
@@ -617,10 +617,10 @@ SET @SQL = N'SELECT DISTINCT
OR RTRIM(spi.object_name) LIKE ''%:Advanced Analytics'')
AND TRY_CONVERT(uniqueidentifier, spi.instance_name)
IS NOT NULL -- for cloud only
THEN d.name
WHEN RTRIM(object_name) LIKE ''%:Availability Replica''
THEN ISNULL(d.name,RTRIM(spi.instance_name)) -- Elastic Pools counters exist for all databases but sys.databases only has current DB value
WHEN RTRIM(object_name) LIKE ''%:Availability Replica''
AND TRY_CONVERT(uniqueidentifier, spi.instance_name) IS NOT NULL -- for cloud only
THEN d.name + RTRIM(SUBSTRING(spi.instance_name, 37, LEN(spi.instance_name)))
THEN ISNULL(d.name,RTRIM(spi.instance_name)) + RTRIM(SUBSTRING(spi.instance_name, 37, LEN(spi.instance_name)))
ELSE RTRIM(spi.instance_name)
END AS instance_name,'
ELSE 'RTRIM(spi.instance_name) as instance_name, '
8 changes: 5 additions & 3 deletions plugins/outputs/influxdb/http.go
@@ -209,6 +209,7 @@ func (c *httpClient) CreateDatabase(ctx context.Context, database string) error

resp, err := c.client.Do(req.WithContext(ctx))
if err != nil {
internal.OnClientError(c.client, err)
return err
}
defer resp.Body.Close()
@@ -311,7 +312,7 @@ func (c *httpClient) Write(ctx context.Context, metrics []telegraf.Metric) error
}

func (c *httpClient) writeBatch(ctx context.Context, db, rp string, metrics []telegraf.Metric) error {
url, err := makeWriteURL(c.config.URL, db, rp, c.config.Consistency)
loc, err := makeWriteURL(c.config.URL, db, rp, c.config.Consistency)
if err != nil {
return err
}
@@ -322,13 +323,14 @@ func (c *httpClient) writeBatch(ctx context.Context, db, rp string, metrics []te
}
defer reader.Close()

req, err := c.makeWriteRequest(url, reader)
req, err := c.makeWriteRequest(loc, reader)
if err != nil {
return err
}

resp, err := c.client.Do(req.WithContext(ctx))
if err != nil {
internal.OnClientError(c.client, err)
return err
}
defer resp.Body.Close()
@@ -505,5 +507,5 @@ func makeQueryURL(loc *url.URL) (string, error) {
}

func (c *httpClient) Close() {
internal.CloseIdleConnections(c.client)
c.client.CloseIdleConnections()
}