Improve /database/create API #1350

Merged
merged 36 commits into from
Feb 6, 2019
31 changes: 6 additions & 25 deletions docs/how_to/single_node.md
@@ -15,14 +15,12 @@ directory on your host for durability:

```
docker pull quay.io/m3/m3dbnode:latest
docker run -p 7201:7201 -p 7203:7203 -p 9003:9003 --name m3db -v $(pwd)/m3db_data:/var/lib/m3db -v $GOPATH/src/github.com/m3db/m3/src/dbnode/config/m3dbnode-local-etcd.yml:/etc/m3dbnode/m3dbnode.yml quay.io/m3/m3dbnode:latest
docker run -p 7201:7201 -p 7203:7203 -p 9003:9003 --name m3db -v $(pwd)/m3db_data:/var/lib/m3db -v <PATH_TO_M3DB_CONFIG.yml>:/etc/m3dbnode/m3dbnode.yml quay.io/m3/m3dbnode:latest
```

**Note:** If you don't have `M3` setup in your `$GOPATH`, you can find the [m3dbnode-local-etcd.yml file here](https://github.com/m3db/m3/blob/master/src/dbnode/config/m3dbnode-local-etcd.yml).
**Note:** For the single node case, we recommend that you start with this [sample config file](https://github.com/m3db/m3/blob/master/src/dbnode/config/m3dbnode-local-etcd.yml). If you inspect the file, you'll see that all the configuration is namespaced by `coordinator` or `db`. That's because this setup runs `M3DB` and `M3Coordinator` as one application. While this is convenient for testing and development, you'll want to run clustered `M3DB` with a separate `M3Coordinator` in production. You can read more about that [here](cluster_hard_way.md).

**Note:** This setup runs `M3DB` and `M3Coordinator` as one application and should only be used for testing/development purposes. If you want to run a clustered `M3DB` with a separate `M3Coordinator` process (which is our recommended production setup), please [see here](cluster_hard_way.md).

Next, create an initial namespace for your metrics in the database:
Next, create an initial namespace for your metrics in the database using the cURL below. Keep in mind that the provided `namespaceName` must match the namespace in the `local` section of the `M3Coordinator` YAML configuration, and if you choose to [add any additional namespaces](../operational_guide/namespace_configuration.md) you'll need to add them to the `local` section of `M3Coordinator`'s YAML config as well.

```json
curl -X POST http://localhost:7201/api/v1/database/create -d '{
@@ -32,24 +30,9 @@ curl -X POST http://localhost:7201/api/v1/database/create -d '{
}'
```

**Note:** If you want to create more than one namespace, you should follow the [instructions here](../operational_guide/namespace_configuration.md) and also add the namespace you created to the `local` section of the `m3dbnode-local-etcd.yml` file used in the `docker run` command above with the appropriate aggregation options specified - for more information on our aggregation functionality, check out our [M3Query documentation](query.md). For example:

<!-- TODO: link to aggregation documentation (outside of query.md) -->

```json
local:
# Review comment (Collaborator): We should probably have this somewhere, no?
# Reply (Contributor Author): Aggregation is hard to talk about without the Prometheus context, so I added a link in the Prometheus integration section to the Query aggregation docs.

namespaces:
- namespace: default
type: unaggregated
retention: 48h
- namespace: <new_namespace>
type: aggregated
retention: <new_retention>
resolution: <new_resolution>
```
**Note**: The `api/v1/database/create` endpoint is an abstraction over two concepts in M3DB called [placements](../operational_guide/placement.md) and [namespaces](../operational_guide/namespace_configuration.md). If a placement doesn't exist, it will create one based on the `type` argument; otherwise, if the placement already exists, it just creates the specified namespace. For now it's enough to just understand that it creates M3DB namespaces (tables), but if you're going to run a clustered M3 setup in production, make sure you familiarize yourself with the links above.

Shortly after, you should see your node complete bootstrapping! Don't worry if you see warnings or
errors related to a local cache file, such as `[W] could not load cache from file
Shortly after, you should see your node complete bootstrapping! Don't worry if you see warnings or errors related to a local cache file, such as `[W] could not load cache from file
/var/lib/m3kv/m3db_embedded.json`. Those are expected for a local instance and in general any
warn-level errors (prefixed with `[W]`) should not block bootstrapping.

@@ -131,6 +114,4 @@ curl -sSf -X POST http://localhost:9003/query -d '{
}
```

## Integrations

[Prometheus as a long term storage remote read/write endpoint](../integrations/prometheus.md).
Now that you've got the M3 stack up and running, take a look at the rest of our documentation to see how you can integrate with [Prometheus](../integrations/prometheus.md) and [Graphite](../integrations/graphite.md).
2 changes: 1 addition & 1 deletion docs/integrations/prometheus.md
@@ -8,7 +8,7 @@ To write to a remote M3DB cluster the simplest configuration is to run `m3coordi

Start by downloading the [config template](https://github.com/m3db/m3/blob/master/src/query/config/m3coordinator-cluster-template.yml). Update the `namespaces` and the `client` section for a new cluster to match your cluster's configuration.

You'll need to specify the static IPs or hostnames of your M3DB seed nodes, and the name and retention values of the namespace you set up. You can leave the namespace storage metrics type as `unaggregated` since it's required by default to have a cluster that receives all Prometheus metrics unaggregated. In the future you might also want to aggregate and downsample metrics for longer retention, and you can come back and update the config once you've setup those clusters.
You'll need to specify the static IPs or hostnames of your M3DB seed nodes, and the name and retention values of the namespace you set up. You can leave the namespace storage metrics type as `unaggregated` since it's required by default to have a cluster that receives all Prometheus metrics unaggregated. In the future you might also want to aggregate and downsample metrics for longer retention, and you can come back and update the config once you've set up those clusters. You can read more about our aggregation functionality [here](../how_to/query.md).

It should look something like:

130 changes: 76 additions & 54 deletions docs/operational_guide/namespace_configuration.md
@@ -2,7 +2,81 @@

## Introduction

Namespaces in M3DB are analogous to tables in other databases. Each namespace has a unique name as well as distinct configuration with regards to data retention and blocksize. For more information about namespaces, read our [storage engine documentation](../architecture/engine.md).
Namespaces in M3DB are analogous to tables in other databases. Each namespace has a unique name as well as distinct configuration with regards to data retention and blocksize. For more information about namespaces and the technical details of their implementation, read our [storage engine documentation](../m3db/architecture/engine.md).

## Namespace Operations

The operations below include sample cURLs, but you can always review the API documentation by navigating to

`http://<M3_COORDINATOR_HOST_NAME>:<CONFIGURED_PORT(default 7201)>/api/v1/openapi` or our [online API documentation](https://m3db.io/openapi/).

### Adding a Namespace

#### Recommended (Easy way)

The recommended way to add a namespace to M3DB is to use our `api/v1/database/namespace/create` endpoint. This API abstracts over a lot of the complexity of configuring a namespace and requires only two pieces of configuration to be provided: the name of the namespace and its retention.

For example, the following cURL:

```bash
curl -X POST <M3_COORDINATOR_IP_ADDRESS>:<CONFIGURED_PORT(default 7201)>/api/v1/database/namespace/create -d '{
"namespaceName": "default_unaggregated",
"retentionTime": "24h"
}'
```

will create a namespace called `default_unaggregated` with a retention of `24 hours`. All of the other namespace options will either use reasonable default values or be calculated based on the provided `retentionTime`.

Adding a namespace does not require restarting M3DB, but will require modifying the M3Coordinator configuration to include the new namespace, and then restarting it.

If you feel the need to configure the namespace options yourself (for performance or other reasons), read the `Advanced` section below.
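
The request described above can be sketched as a pair of small shell helpers. This is only an illustration: `namespace_payload`, `create_namespace_cmd`, and the `COORDINATOR` variable are hypothetical names that are not part of M3, and the sketch prints the cURL command rather than running it, so the request can be reviewed first. It assumes the namespace name and retention contain no characters that need JSON escaping.

```shell
# Hypothetical helper (not part of M3): builds the JSON body for the
# namespace-creation request from a name and a retention period.
namespace_payload() {
  local name="$1" retention="$2"
  printf '{"namespaceName": "%s", "retentionTime": "%s"}' "$name" "$retention"
}

# Hypothetical helper: prints the curl command instead of executing it.
# COORDINATOR is an assumed variable; it defaults to the local coordinator.
create_namespace_cmd() {
  local name="$1" retention="$2"
  local base="${COORDINATOR:-http://localhost:7201}"
  printf "curl -sSf -X POST %s/api/v1/database/namespace/create -d '%s'\n" \
    "$base" "$(namespace_payload "$name" "$retention")"
}

create_namespace_cmd default_unaggregated 24h
```

Printing the command keeps the sketch free of side effects; once the output looks right, it can be pasted into a terminal or piped to `sh`.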

#### Advanced (Hard Way)

The "advanced" API allows you to configure every aspect of the namespace that you're adding, which can sometimes be helpful for development, debugging, and tuning clusters for maximum performance.
Adding a namespace is as simple as using the `POST` `api/v1/namespace` API on an M3Coordinator instance.

```
curl -X POST <M3_COORDINATOR_IP_ADDRESS>:<CONFIGURED_PORT(default 7201)>/api/v1/namespace -d '{
"name": "default_unaggregated",
"options": {
"bootstrapEnabled": true,
"flushEnabled": true,
"writesToCommitLog": true,
"cleanupEnabled": true,
"snapshotEnabled": true,
"repairEnabled": false,
"retentionOptions": {
"retentionPeriodDuration": "2d",
"blockSizeDuration": "2h",
"bufferFutureDuration": "10m",
"bufferPastDuration": "10m",
"blockDataExpiry": true,
"blockDataExpiryAfterNotAccessPeriodDuration": "5m"
},
"indexOptions": {
"enabled": true,
"blockSizeDuration": "4h"
}
}
}'
```

Adding a namespace does not require restarting M3DB, but will require modifying the M3Coordinator configuration to include the new namespace, and then restarting it.

### Deleting a Namespace

Deleting a namespace is as simple as using the `DELETE` `/api/v1/namespace` API on an M3Coordinator instance.

`curl -X DELETE <M3_COORDINATOR_IP_ADDRESS>:<CONFIGURED_PORT(default 7201)>/api/v1/namespace/<NAMESPACE_NAME>`

Note that deleting a namespace will not have any effect on the M3DB nodes until they are all restarted. In addition, the namespace will need to be removed from the M3Coordinator configuration and then the M3Coordinator node will need to be restarted.

### Modifying a Namespace

There is currently no atomic namespace modification endpoint. Instead, you will need to delete a namespace and then add it back again with the same name, but modified settings. Review the individual namespace settings above to determine whether or not a given setting is safe to modify. For example, it is never safe to modify the blockSize of a namespace.

Also, be very careful not to restart the M3DB nodes after deleting the namespace, but before adding it back. If you do this, the M3DB nodes may detect the existing data files on disk and delete them since they are not configured to retain that namespace.
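
The delete-then-re-add sequence above can be sketched as a helper that only prints the two requests in order. `modify_namespace_plan` and `COORDINATOR` are hypothetical names, not part of M3, and the options argument is assumed to be a complete, valid `options` JSON object like the one in the advanced example. Nothing is executed, which leaves room to confirm that no M3DB node restarts between the two calls.

```shell
# Hypothetical sketch (not part of M3): prints the DELETE and POST requests
# needed to "modify" a namespace. It does not run them.
modify_namespace_plan() {
  local name="$1" options_json="$2"
  local base="${COORDINATOR:-http://localhost:7201}"
  # Step 1: delete the existing namespace.
  printf 'curl -sSf -X DELETE %s/api/v1/namespace/%s\n' "$base" "$name"
  # Step 2: re-add it under the same name with the modified options.
  printf "curl -sSf -X POST %s/api/v1/namespace -d '{\"name\": \"%s\", \"options\": %s}'\n" \
    "$base" "$name" "$options_json"
}

# The options object here is an empty placeholder, not a working configuration.
modify_namespace_plan default_unaggregated '{}'
```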

## Namespace Attributes

@@ -44,7 +118,7 @@ Can be modified without creating a new namespace: `yes`

#### blockSize

This is the most important value to consider when tuning the performance of an M3DB namespace. Read the [storage engine documentation](../architecture/engine.md) for more details, but the basic idea is that larger blockSizes will use more memory, but achieve higher compression. Similarly, smaller blockSizes will use less memory, but have worse compression.
This is the most important value to consider when tuning the performance of an M3DB namespace. Read the [storage engine documentation](../../m3db/architecture/engine.md) for more details, but the basic idea is that larger blockSizes will use more memory, but achieve higher compression. Similarly, smaller blockSizes will use less memory, but have worse compression.

Can be modified without creating a new namespace: `no`

@@ -77,55 +151,3 @@ Can be modified without creating a new namespace: `yes`
### Index Options

TODO

## Namespace Operations

The operations below include sample CURLs, but you can always review the API documentation by navigating to

`http://<M3_COORDINATOR_HOST_NAME>:<CONFIGURED_PORT(default 7201)>/api/v1/openapi` or our [online API documentation](https://m3db.io/openapi/).

### Adding a Namespace

Adding a namespace is a simple as using the `POST` `api/v1/namespace` API on an M3Coordinator instance.

```
curl -X POST <M3_COORDINATOR_IP_ADDRESS>:<CONFIGURED_PORT(default 7201)>api/v1/namespace -d '{
"name": "default_unaggregated",
"options": {
"bootstrapEnabled": true,
"flushEnabled": true,
"writesToCommitLog": true,
"cleanupEnabled": true,
"snapshotEnabled": true,
"repairEnabled": false,
"retentionOptions": {
"retentionPeriodDuration": "2d",
"blockSizeDuration": "2h",
"bufferFutureDuration": "10m",
"bufferPastDuration": "10m",
"blockDataExpiry": true,
"blockDataExpiryAfterNotAccessPeriodDuration": "5m"
},
"indexOptions": {
"enabled": true,
"blockSizeDuration": "4h"
}
}
}'
```

Adding a namespace does not require restarting M3DB, but will require modifying the M3Coordinator configuration to include the new namespace, and then restarting it.

### Deleting a Namespace

Deleting a namespace is a simple as using the `DELETE` `/api/v1/namespace` API on an M3Coordinator instance.

`curl -X DELETE <M3_COORDINATOR_IP_ADDRESS>:<CONFIGURED_PORT(default 7201)>/api/v1/namespace/<NAMESPACE_NAME>`

Note that deleting a namespace will not have any effect on the M3DB nodes until they are all restarted. In addition, the namespace will need to be removed from the M3Coordinator configuration and then the M3Coordinator node will need to be restarted.

### Modifying a Namespace

There is currently no atomic namespace modification endpoint. Instead, you will need to delete a namespace and then add it back again with the same name, but modified settings. Review the individual namespace settings above to determine whether or not a given setting is safe to modify. For example,it is never safe to modify the blockSize of a namespace.

Also, be very careful not to restart the M3DB nodes after deleting the namespace, but before adding it back. If you do this, the M3DB nodes may detect the existing data files on disk and delete them since they are not configured to retain that namespace.
98 changes: 26 additions & 72 deletions scripts/docker-integration-tests/common.sh
@@ -46,90 +46,44 @@ function wait_for_db_init {
ATTEMPTS=10 TIMEOUT=2 retry_with_backoff \
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/namespace | jq ".namespaces | length")" == "0" ]'

echo "Adding namespace"
curl -vvvsSf -X POST 0.0.0.0:7201/api/v1/namespace -d '{
"name": "agg",
"options": {
"bootstrapEnabled": true,
"flushEnabled": true,
"writesToCommitLog": true,
"cleanupEnabled": true,
"snapshotEnabled": true,
"repairEnabled": false,
"retentionOptions": {
"retentionPeriodDuration": "48h",
"blockSizeDuration": "2h",
"bufferFutureDuration": "10m",
"bufferPastDuration": "10m",
"blockDataExpiry": true,
"blockDataExpiryAfterNotAccessPeriodDuration": "5m"
},
"indexOptions": {
"enabled": true,
"blockSizeDuration": "2h"
echo "Adding placement and agg namespace"
curl -vvvsSf -X POST 0.0.0.0:7201/api/v1/database/create -d '{
"type": "cluster",
"namespaceName": "agg",
"retentionTime": "24h",
"replicationFactor": 1,
"hosts": [
{
"id": "m3db_local",
"isolation_group": "rack-a",
"zone": "embedded",
"weight": 1024,
"address": "dbnode01",
"port": 9000
}
}
]
}'

echo "Wait until namespace is init'd"
echo "Wait until placement is init'd"
ATTEMPTS=4 TIMEOUT=1 retry_with_backoff \
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/namespace | jq .registry.namespaces.agg.indexOptions.enabled)" == true ]'

curl -vvvsSf -X POST 0.0.0.0:7201/api/v1/namespace -d '{
"name": "unagg",
"options": {
"bootstrapEnabled": true,
"flushEnabled": true,
"writesToCommitLog": true,
"cleanupEnabled": true,
"snapshotEnabled": true,
"repairEnabled": false,
"retentionOptions": {
"retentionPeriodDuration": "48h",
"blockSizeDuration": "2h",
"bufferFutureDuration": "10m",
"bufferPastDuration": "10m",
"blockDataExpiry": true,
"blockDataExpiryAfterNotAccessPeriodDuration": "5m"
},
"indexOptions": {
"enabled": true,
"blockSizeDuration": "2h"
}
}
}'
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/placement | jq .placement.instances.m3db_local.id)" == \"m3db_local\" ]'

echo "Sleep until namespace is init'd"
echo "Wait until agg namespace is init'd"
ATTEMPTS=4 TIMEOUT=1 retry_with_backoff \
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/namespace | jq .registry.namespaces.unagg.indexOptions.enabled)" == true ]'
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/namespace | jq .registry.namespaces.agg.indexOptions.enabled)" == true ]'

echo "Placement initialization"
curl -vvvsSf -X POST 0.0.0.0:7201/api/v1/placement/init -d '{
"num_shards": 64,
"replication_factor": 1,
"instances": [
{
"id": "m3db_local",
"isolation_group": "rack-a",
"zone": "embedded",
"weight": 1024,
"endpoint": "dbnode01:9000",
"hostname": "dbnode01",
"port": 9000
}
]
echo "Adding unagg namespace"
curl -vvvsSf -X POST 0.0.0.0:7201/api/v1/database/namespace/create -d '{
"namespaceName": "unagg",
"retentionTime": "24h"
}'

echo "Sleep until placement is init'd"
echo "Wait until unagg namespace is init'd"
ATTEMPTS=4 TIMEOUT=1 retry_with_backoff \
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/placement | jq .placement.instances.m3db_local.id)" == \"m3db_local\" ]'
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/namespace | jq .registry.namespaces.unagg.indexOptions.enabled)" == true ]'

echo "Sleep until bootstrapped"
echo "Wait until bootstrapped"
ATTEMPTS=10 TIMEOUT=2 retry_with_backoff \
'[ "$(curl -sSf 0.0.0.0:9002/health | jq .bootstrapped)" == true ]'

echo "Waiting until shards are marked as available"
ATTEMPTS=10 TIMEOUT=1 retry_with_backoff \
'[ "$(curl -sSf 0.0.0.0:7201/api/v1/placement | grep -c INITIALIZING)" -eq 0 ]'
}
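
The script above leans on `retry_with_backoff`, whose definition is outside this diff. As a rough sketch of the pattern only, and not the repository's actual helper, a function with the same calling convention (`ATTEMPTS` and `TIMEOUT` set in the environment, the condition passed as one string) might look like this. The name `retry_with_backoff_sketch` and the doubling delay are assumptions.

```shell
# Hypothetical re-implementation for illustration; the real helper in
# common.sh is not shown in this diff and may behave differently.
retry_with_backoff_sketch() {
  local max_attempts="${ATTEMPTS:-5}"
  local delay="${TIMEOUT:-1}"
  local attempt=1
  while true; do
    # The condition arrives as a single string, so it is run with eval.
    if eval "$1"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts: $1" >&2
      return 1
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2)) # double the wait before the next attempt
  done
}

# Usage: the condition succeeds on the third try; TIMEOUT=0 avoids real waiting.
tries=0
ATTEMPTS=5 TIMEOUT=0 retry_with_backoff_sketch 'tries=$((tries + 1)); [ "$tries" -ge 3 ]'
echo "succeeded after ${tries} tries"
```

Because the condition is evaluated in the calling shell, it can update variables such as `tries`; the same property is what lets the script's conditions embed `curl` and `jq` pipelines as plain strings.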

6 changes: 5 additions & 1 deletion src/query/api/v1/handler/database/common.go
@@ -45,7 +45,11 @@ func RegisterRoutes(
) {
logged := logging.WithResponseTimeLogging

r.HandleFunc(CreateURL, logged(NewCreateHandler(client, cfg, embeddedDbCfg)).ServeHTTP).Methods(CreateHTTPMethod)
// Register the same handler under two different endpoints. This just makes explaining things in
// our documentation easier so we can separate out concepts, but share the underlying code.
createHandler := logged(NewCreateHandler(client, cfg, embeddedDbCfg)).ServeHTTP
r.HandleFunc(CreateURL, createHandler).Methods(CreateHTTPMethod)
r.HandleFunc(CreateNamespaceURL, createHandler).Methods(CreateNamespaceHTTPMethod)

r.HandleFunc(ConfigGetBootstrappersURL, logged(NewConfigGetBootstrappersHandler(client)).ServeHTTP).Methods(ConfigGetBootstrappersHTTPMethod)
r.HandleFunc(ConfigSetBootstrappersURL, logged(NewConfigSetBootstrappersHandler(client)).ServeHTTP).Methods(ConfigSetBootstrappersHTTPMethod)