[query] Add query ID generation scheme sections to yaml files and use…
…r guides (#1381)
arnikola authored Feb 16, 2019
1 parent a5e1a27 commit 41c5bce
Showing 21 changed files with 174 additions and 22 deletions.
36 changes: 35 additions & 1 deletion docs/how_to/query.md
@@ -40,7 +40,7 @@ You will notice that in the setup linked above, M3DB has just one unaggregated n
resolution: 10s
```

If you run Statsite, m3agg, or some other aggregation tier, you will want to set the `all` flag under `downsample` to `false`. Otherwise, you will be aggregating metrics that have already been aggregated.

```json
- namespace: metrics_10s_48h
@@ -51,6 +51,40 @@ You will notice that in the setup linked above, M3DB has just one unaggregated n
all: false
```

## ID generation

The default ID generation scheme is prone to collisions but remains the default for backwards compatibility. We suggest setting the ID generation scheme to either `quoted` or `prepend_meta`. The `quoted` scheme yields the most human-readable IDs, whereas `prepend_meta` yields more compact IDs and is the better choice if tags are expected to contain non-ASCII characters. To set the ID generation scheme, add the following to your coordinator configuration YAML file:

```yaml
tagOptions:
  idScheme: <name>
```
As an example of how these schemes generate IDs, consider a series with the following four tags:
`[{"t1":v1}, {t2:"v2"}, {t3:v3}, {t4:v4}]`. Each scheme generates the following ID for this series:

```
legacy: "t1"=v1,t2="v2",t3=v3,t4=v4,
prepend_meta: 4,2,2,4,2,2,2,2!"t1"v1t2"v2"t3v3t4v4
quoted: {\"t1\"="v1",t2="\"v2\"",t3="v3",t4="v4"}
```
If there is a chance that your metric tags will contain "control" characters, specifically `,` and `=`, we highly recommend specifying either the `quoted` or `prepend_meta` scheme, as the `legacy` scheme may cause ID collisions. As a general guideline, we suggest `quoted`, as it mirrors the more familiar Prometheus-style IDs.
We technically have a fourth ID generation scheme that is used for Graphite IDs, but it is exclusive to the Graphite ingestion path and is not selectable as a general scheme.
**WARNING:** Once a scheme is selected, be very careful about changing it. If it is changed, all incoming metrics will resolve to a new ID, effectively doubling the metric cardinality until all of the older-style metric IDs fall out of retention.
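
To make the collision risk concrete, here is a minimal Go sketch of the three selectable schemes. It is an illustration only and not the m3coordinator implementation: the `tag` type and the `legacyID`, `prependMetaID` and `quotedID` helpers are hypothetical names, and the escaping rules (backslash-escaping `"` and `\`) are an assumption chosen to match the example IDs shown above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// tag is a hypothetical name/value pair used only for this sketch.
type tag struct{ name, value string }

// legacyID joins name=value pairs with commas and performs no escaping.
func legacyID(tags []tag) string {
	var b strings.Builder
	for _, t := range tags {
		b.WriteString(t.name + "=" + t.value + ",")
	}
	return b.String()
}

// prependMetaID writes the byte length of every name and value, then '!',
// then the raw names and values.
func prependMetaID(tags []tag) string {
	lengths := make([]string, 0, 2*len(tags))
	var body strings.Builder
	for _, t := range tags {
		lengths = append(lengths, strconv.Itoa(len(t.name)), strconv.Itoa(len(t.value)))
		body.WriteString(t.name + t.value)
	}
	return strings.Join(lengths, ",") + "!" + body.String()
}

// quotedID escapes names, and quotes and escapes values (assumed escaping rules).
func quotedID(tags []tag) string {
	esc := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	parts := make([]string, 0, len(tags))
	for _, t := range tags {
		parts = append(parts, esc.Replace(t.name)+`="`+esc.Replace(t.value)+`"`)
	}
	return "{" + strings.Join(parts, ",") + "}"
}

func main() {
	// Two different series: two tags, versus one tag whose value contains ',' and '='.
	a := []tag{{"t1", "v1"}, {"t2", "v2"}}
	b := []tag{{"t1", "v1,t2=v2"}}

	fmt.Println(legacyID(a) == legacyID(b))           // true: the legacy IDs collide
	fmt.Println(prependMetaID(a) == prependMetaID(b)) // false
	fmt.Println(quotedID(a) == quotedID(b))           // false
}
```

In this sketch both series produce the legacy ID `t1=v1,t2=v2,`, while the length prefix or the quoting keeps the other two schemes unambiguous.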
### Migration
We recently updated our ID generation scheme in m3coordinator to avoid the collision issues discussed above. To ease migration, we're temporarily enforcing that an ID generation scheme be explicitly provided in the m3coordinator configuration files.
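
The enforcement can be pictured roughly as follows. This is a hedged sketch rather than the actual m3coordinator code: `validateIDScheme`, `selectableSchemes` and the error texts are hypothetical, and the only facts taken from this document are the three selectable scheme names, the requirement that a scheme be set, and the fact that the Graphite scheme cannot be chosen in configuration.

```go
package main

import (
	"errors"
	"fmt"
)

// selectableSchemes lists the scheme names this document says may be configured.
var selectableSchemes = map[string]bool{
	"legacy":       true,
	"quoted":       true,
	"prepend_meta": true,
}

// errNoScheme is a hypothetical error for a missing idScheme setting.
var errNoScheme = errors.New("an ID generation scheme is required in the coordinator configuration")

// validateIDScheme is a hypothetical helper: it rejects an empty scheme,
// the Graphite-only scheme, and any unknown name.
func validateIDScheme(name string) error {
	switch {
	case name == "":
		return errNoScheme
	case name == "graphite":
		return errors.New("the graphite scheme is reserved for the Graphite ingestion path")
	case !selectableSchemes[name]:
		return fmt.Errorf("unknown ID generation scheme %q", name)
	}
	return nil
}

func main() {
	for _, name := range []string{"", "quoted", "graphite", "not_a_known_type"} {
		fmt.Printf("%q -> %v\n", name, validateIDScheme(name))
	}
}
```

Failing fast on an empty value, instead of silently falling back to a default, forces operators to make the migration decision described below explicitly.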
If you have already been running m3query or m3coordinator, you may, counterintuitively, want to select the collision-prone `legacy` scheme: the IDs for all of your current metrics were generated with this scheme, and choosing another will effectively double your index size. If a twofold increase in cardinality is acceptable (unfortunately, this is likely to mean doubled cardinality until your longest-retention cluster rotates out), we suggest choosing a collision-resistant scheme instead.
An example of a configuration file with the ID generation scheme can be found [here](https://github.com/m3db/m3/blob/master/scripts/docker-integration-tests/prometheus/m3coordinator.yml).
If none of these options work for you, or you would like further clarification, please stop by our [gitter channel](https://gitter.im/m3db/Lobby) and we'll be happy to help you.
## Grafana
You can also set up m3query as a [datasource in Grafana](http://docs.grafana.org/features/datasources/prometheus/). To do this, add a new datasource with a type of `Prometheus`. The URL should point to the host/port running m3query. By default, m3query runs on port `7201`.
2 changes: 2 additions & 0 deletions kube/bundle.yaml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions kube/m3dbnode-configmap.yaml
@@ -23,6 +23,8 @@ data:
sanitization: prometheus
samplingRate: 1.0
extended: none
tagOptions:
idScheme: quoted
db:
logging:
2 changes: 1 addition & 1 deletion scripts/development/m3_stack/m3aggregator.yml
@@ -216,7 +216,7 @@ aggregator:
jitterEnabled: true
maxJitters:
- flushInterval: 5s
maxJitterPercent: 1.0
- flushInterval: 10s
maxJitterPercent: 0.5
- flushInterval: 1m
3 changes: 3 additions & 0 deletions scripts/development/m3_stack/m3coordinator.yml
@@ -52,3 +52,6 @@ ingest:
carbon:
ingester:
listenAddress: "0.0.0.0:7204"

tagOptions:
idScheme: quoted
3 changes: 3 additions & 0 deletions scripts/docker-integration-tests/carbon/m3coordinator.yml
@@ -55,3 +55,6 @@ carbon:
policies:
- resolution: 5s
retention: 10h

tagOptions:
idScheme: quoted
3 changes: 3 additions & 0 deletions scripts/docker-integration-tests/prometheus/m3coordinator.yml
@@ -34,3 +34,6 @@ clusters:
- dbnode01:2379
writeConsistencyLevel: majority
readConsistencyLevel: unstrict_majority

tagOptions:
idScheme: quoted
9 changes: 8 additions & 1 deletion src/cmd/services/m3query/config/config.go
@@ -22,6 +22,7 @@ package config

import (
"errors"
"fmt"
"time"

etcdclient "github.com/m3db/m3/src/cluster/client/etcd"
@@ -33,6 +34,7 @@ import (
"github.com/m3db/m3/src/query/models"
"github.com/m3db/m3/src/query/storage"
"github.com/m3db/m3/src/query/storage/m3"
xdocs "github.com/m3db/m3/src/x/docs"
xconfig "github.com/m3db/m3x/config"
"github.com/m3db/m3x/config/listenaddress"
"github.com/m3db/m3x/instrument"
@@ -48,6 +50,9 @@ const (
M3DBStorageType BackendStorageType = "m3db"

defaultCarbonIngesterListenAddress = "0.0.0.0:7204"
errNoIDGenerationScheme = "error: a recent breaking change means that an ID " +
"generation scheme is required in coordinator configuration settings. " +
"More information is available here: %s"
)

var (
@@ -340,7 +345,9 @@ func TagOptionsFromConfig(cfg TagOptionsConfiguration) (models.TagOptions, error
}

if cfg.Scheme == models.TypeDefault {
cfg.Scheme = models.TypeLegacy
// If no config has been set, error.
docLink := xdocs.Path("how_to/query#migration")
return nil, fmt.Errorf(errNoIDGenerationScheme, docLink)
}

opts = opts.SetIDSchemeType(cfg.Scheme)
63 changes: 54 additions & 9 deletions src/cmd/services/m3query/config/config_test.go
@@ -21,9 +21,11 @@
package config

import (
"fmt"
"testing"

"github.com/m3db/m3/src/query/models"
xdocs "github.com/m3db/m3/src/x/docs"
xconfig "github.com/m3db/m3x/config"

"github.com/stretchr/testify/assert"
@@ -32,18 +34,34 @@ import (
yaml "gopkg.in/yaml.v2"
)

func TestTagOptionsFromEmptyConfig(t *testing.T) {
func TestTagOptionsFromEmptyConfigErrors(t *testing.T) {
cfg := TagOptionsConfiguration{}
opts, err := TagOptionsFromConfig(cfg)
require.NoError(t, err)
require.NotNil(t, opts)
assert.Equal(t, []byte("__name__"), opts.MetricName())
require.Error(t, err)
require.Nil(t, opts)
}

func TestTagOptionsFromConfigWithIDGenerationScheme(t *testing.T) {
schemes := []models.IDSchemeType{models.TypeLegacy,
models.TypePrependMeta, models.TypeQuoted}
for _, scheme := range schemes {
cfg := TagOptionsConfiguration{
Scheme: scheme,
}

opts, err := TagOptionsFromConfig(cfg)
require.NoError(t, err)
require.NotNil(t, opts)
assert.Equal(t, []byte("__name__"), opts.MetricName())
assert.Equal(t, scheme, opts.IDSchemeType())
}
}

func TestTagOptionsFromConfig(t *testing.T) {
name := "foobar"
cfg := TagOptionsConfiguration{
MetricName: name,
Scheme: models.TypeLegacy,
}
opts, err := TagOptionsFromConfig(cfg)
require.NoError(t, err)
@@ -97,14 +115,41 @@ func TestConfigValidation(t *testing.T) {
}
}

func TestDefaultTagOptionsConfig(t *testing.T) {
func TestDefaultTagOptionsConfigErrors(t *testing.T) {
var cfg TagOptionsConfiguration
require.NoError(t, yaml.Unmarshal([]byte(""), &cfg))
opts, err := TagOptionsFromConfig(cfg)
require.NoError(t, err)
assert.Equal(t, []byte("__name__"), opts.MetricName())
assert.Equal(t, []byte("le"), opts.BucketName())
assert.Equal(t, models.TypeLegacy, opts.IDSchemeType())

docLink := xdocs.Path("how_to/query#migration")
expectedError := fmt.Sprintf(errNoIDGenerationScheme, docLink)
require.EqualError(t, err, expectedError)
require.Nil(t, opts)
}

func TestGraphiteIDGenerationSchemeIsInvalid(t *testing.T) {
var cfg TagOptionsConfiguration
require.Error(t, yaml.Unmarshal([]byte("idScheme: graphite"), &cfg))
}

func TestTagOptionsConfigWithTagGenerationScheme(t *testing.T) {
var tests = []struct {
schemeStr string
scheme models.IDSchemeType
}{
{"legacy", models.TypeLegacy},
{"prepend_meta", models.TypePrependMeta},
{"quoted", models.TypeQuoted},
}

for _, tt := range tests {
var cfg TagOptionsConfiguration
schemeConfig := fmt.Sprintf("idScheme: %s", tt.schemeStr)
require.NoError(t, yaml.Unmarshal([]byte(schemeConfig), &cfg))
opts, err := TagOptionsFromConfig(cfg)
require.NoError(t, err)
assert.Equal(t, []byte("__name__"), opts.MetricName())
assert.Equal(t, tt.scheme, opts.IDSchemeType())
}
}

func TestTagOptionsConfig(t *testing.T) {
4 changes: 4 additions & 0 deletions src/dbnode/config/m3dbnode-all-config.yml
@@ -30,6 +30,10 @@ coordinator:
limits:
maxComputedDatapoints: 10000

tagOptions:
# Configuration setting for generating metric IDs from tags.
idScheme: quoted

db:
# Minimum log level which will be emitted.
logging:
4 changes: 4 additions & 0 deletions src/dbnode/config/m3dbnode-cluster-template.yml
@@ -19,6 +19,10 @@ coordinator:
samplingRate: 1.0
extended: none

tagOptions:
# Configuration setting for generating metric IDs from tags.
idScheme: quoted

db:
logging:
level: info
4 changes: 4 additions & 0 deletions src/dbnode/config/m3dbnode-local-etcd.yml
@@ -22,6 +22,10 @@ coordinator:
limits:
maxComputedDatapoints: 10000

tagOptions:
# Configuration setting for generating metric IDs from tags.
idScheme: quoted

db:
logging:
level: info
4 changes: 4 additions & 0 deletions src/dbnode/config/m3dbnode-local.yml
@@ -19,6 +19,10 @@ coordinator:
samplingRate: 1.0
extended: none

tagOptions:
# Configuration setting for generating metric IDs from tags.
idScheme: quoted

db:
logging:
level: info
3 changes: 3 additions & 0 deletions src/query/config/m3coordinator-cluster-template.yml
@@ -12,6 +12,9 @@ metrics:
samplingRate: 1.0
extended: none

tagOptions:
idScheme: quoted

clusters:
## Fill-out the following and un-comment before using, and
## make sure indent by two spaces is applied.
3 changes: 3 additions & 0 deletions src/query/config/m3coordinator-local-etcd.yml
@@ -34,3 +34,6 @@ clusters:
endpoint: http://127.0.0.1:2380
writeConsistencyLevel: majority
readConsistencyLevel: unstrict_majority

tagOptions:
idScheme: quoted
5 changes: 4 additions & 1 deletion src/query/config/m3query-dev-etcd.yml
@@ -59,4 +59,7 @@ readWorkerPoolPolicy:

writeWorkerPoolPolicy:
grow: false
size: 10

tagOptions:
idScheme: quoted
4 changes: 3 additions & 1 deletion src/query/config/m3query-local-etcd.yml
@@ -12,6 +12,9 @@ metrics:
samplingRate: 1.0
extended: none

tagOptions:
idScheme: quoted

clusters:
- namespaces:
- namespace: default
@@ -49,4 +52,3 @@ clusters:
jitter: true
backgroundHealthCheckFailLimit: 4
backgroundHealthCheckFailThrottleFactor: 0.5

7 changes: 7 additions & 0 deletions src/query/models/config.go
@@ -72,6 +72,13 @@ func (t *IDSchemeType) UnmarshalYAML(unmarshal func(interface{}) error) error {
}

for _, valid := range validIDSchemes {
if valid == TypeGraphite {
// NB: while the graphite scheme is valid, it is not available to choose
// as a general ID scheme; instead, it is set on any metric coming through
// the graphite ingestion path.
continue
}

if str == valid.String() {
*t = valid
return nil
11 changes: 10 additions & 1 deletion src/query/models/config_test.go
@@ -50,7 +50,13 @@ func TestMetricsTypeUnmarshalYAML(t *testing.T) {
Type IDSchemeType `yaml:"type"`
}

for _, value := range validIDSchemes {
validParseSchemes := []IDSchemeType{
TypeLegacy,
TypeQuoted,
TypePrependMeta,
}

for _, value := range validParseSchemes {
str := fmt.Sprintf("type: %s\n", value.String())

var cfg config
@@ -60,6 +66,9 @@ func TestMetricsTypeUnmarshalYAML(t *testing.T) {
}

var cfg config
// Graphite fails.
require.Error(t, yaml.Unmarshal([]byte("type: graphite\n"), &cfg))
// Bad type fails.
require.Error(t, yaml.Unmarshal([]byte("type: not_a_known_type\n"), &cfg))

require.NoError(t, yaml.Unmarshal([]byte(""), &cfg))
17 changes: 12 additions & 5 deletions src/query/models/types.go
@@ -54,14 +54,17 @@ const (
TypeLegacy
// TypeQuoted describes a scheme where IDs are generated by appending
// tag names with explicitly quoted and escaped tag values. Tag names are
// also escaped if they contain invalid characters.
// {t1:v1},{t2:v2} -> t1"v1"t2"v2"
// {t1:v1,t2:v2} -> t1"v1,t2:v2"
// also escaped if they contain invalid characters. This is equivalent to
// the Prometheus ID style.
// {t1:v1},{t2:v2} -> {t1="v1",t2="v2"}
// {t1:v1,t2:v2} -> {t1="v1,t2:v2"}
// {"t1":"v1"} -> {\"t1\"="\"v1\""}
TypeQuoted
// TypePrependMeta describes a scheme where IDs are generated by prepending
// the length of each tag at the start of the ID
// {t1:v1},{t2:v2} -> 44t1v1t2v2
// {t1:v1,t2:v2} -> 10t1v1,t2:v2
// {t1:v1},{t2:v2} -> 2,2,2,2!t1v1t2v2
// {t1:v1,t2:v2} -> 2,8!t1v1,t2:v2
// {"t1":"v1"} -> 4,4!"t1""v1"
TypePrependMeta
// TypeGraphite describes a scheme where IDs are generated to match graphite
// representation of the tags. This scheme should only be used on the graphite
@@ -71,6 +74,10 @@ const (
//
// NB: when TypeGraphite is specified, tags are ordered numerically rather
// than lexically.
//
// NB 2: while the graphite scheme is valid, it is not available to choose as
// a general ID scheme; instead, it is set on any metric coming through the
// graphite ingestion path.
TypeGraphite
)
