Feature: full backups #2710
moved orchestrate to its own pkg within worker. adding backup to worker and alpha.
…e. additional logging.
dgraph/cmd/alpha/admin.go
Outdated
return
}
w.Header().Set("Content-Type", "application/json")
w.Write([]byte(`{"code": "Success", "message": "Backup completed."}`))
Error return value of w.Write is not checked
worker/backup.go
Outdated
}
readTs := ts.ReadOnly
glog.Infof("Got readonly ts from Zero: %d\n", readTs)
posting.Oracle().WaitForTs(ctx, readTs)
Error return value of (*github.com/dgraph-io/dgraph/posting.oracle).WaitForTs is not checked
worker/stream/stream.go
Outdated
@@ -24,21 +24,27 @@ import (
"github.com/dgraph-io/badger"
"github.com/dgraph-io/dgraph/protos/pb"
"github.com/dgraph-io/dgraph/x"
File is not goimports-ed
…slabelled request field.
Reviewable status: 0 of 20 files reviewed, 19 unresolved discussions (waiting on @golangcibot, @srfrog, and @manishrjain)
ee/backup/backup.go, line 18 at r2 (raw file):
// Worker has all the information needed to perform a backup. type Worker struct {
type Backup struct?
ee/backup/backup.go, line 21 at r2 (raw file):
ReadTs uint64 // Timestamp to read at. GroupId uint32 // The group ID of this node. SeqTs string // Sequence data to label backup at the target.
UnixTs string // UTC, as we do in export.
ee/backup/backup.go, line 40 at r2 (raw file):
sl.ItemToKVFunc = func(key []byte, itr *badger.Iterator) (*pb.KV, error) { item := itr.Item() val, err := item.ValueCopy(nil)
Look at predicate.go. We need to convert to posting list, and then marshal to KV.
if schema, then do value copy. Again, same logic as predicate.go.
ee/backup/handler.go, line 22 at r2 (raw file):
type handler interface { Copy(string, string) error Session(string, string) error
Send(kvs) error, so it would work with orchestrate.
Flush error
ee/backup/handler.go, line 32 at r2 (raw file):
// s3 - Amazon S3 // as - Azure Storage var handlers struct {
Do we really need to store this at Dgraph? The user request can pass in the destination.
ee/backup/writer.go, line 65 at r2 (raw file):
// tmp file is our main working file. // we will prepare this file and then copy to dst when done. w.tmp, err = ioutil.TempFile("", dgraphBackupTempPrefix)
We don't need to first write to a tmp file, then upload to destination. We can stream directly to the destination.
ee/backup/writer.go, line 86 at r2 (raw file):
var err error for _, kv := range kvs.Kv { _, err = pbutil.WriteDelimited(w.tmp, kv)
Let's skip the lib. Just use fixed length header for the length of marshalled proto, then write the proto.
protos/pb.proto, line 444 at r2 (raw file):
message BackupRequest { uint64 start_ts = 1;
read_ts
protos/pb.proto, line 449 at r2 (raw file):
} message BackupResponse {
I've never been happy with the ExportPayload, that this one is based on. Technically, all we need to pass back is an error.
message Status {
string err = 1; // If empty, then OK. If not, then it has the relevant error.
int exit_code = 2; // If 0, then OK. If 1, then error.
}
api.Payload, which can be used for this purpose as well. But, I think a special Status one is helpful.
worker/backup.go, line 78 at r2 (raw file):
// TODO: add stop to all goroutines to cancel on failure. func backupDispatch(ctx context.Context, readTs uint64, target string, gids []uint32,
Move some args down, so the next line doesn't need to start with end-bracket?
worker/backup.go, line 79 at r2 (raw file):
// TODO: add stop to all goroutines to cancel on failure. func backupDispatch(ctx context.Context, readTs uint64, target string, gids []uint32, ) chan *pb.BackupResponse {
This should just return error.
worker/backup.go, line 80 at r2 (raw file):
func backupDispatch(ctx context.Context, readTs uint64, target string, gids []uint32, ) chan *pb.BackupResponse { out := make(chan *pb.BackupResponse)
Out should not be returned. Rename to statusCh.
statusCh := make(chan *pb.Status, )
The function would just return error or status. That keeps all the goroutines, channels, etc. local to the function. We'd want a select case checking for context.done and the read from statusCh.
worker/backup.go, line 132 at r3 (raw file):
// This will dispatch the request to all groups and wait for their response. // If we receive any failures, we cancel the process. for resp := range backupDispatch(ctx, readTs, target, gids) {
backupDispatch should return an error.
ee/backup/handler_file.go, line 37 at r3 (raw file):
// Copy is called when we are ready to transmit a file to the target. // Returns error on failure, nil on success. func (h *fileHandler) Copy(in, out string) error {
Send(kvs *pb.KVS) error
So, this can plug in to orchestrate. Also, have a Flush() method, which can be called at the end.
worker/stream/stream.go, line 33 at r3 (raw file):
) const pageSize = 1 << 20 * 4 // 4MB
4 << 20
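In Go, `<<` and `*` share one precedence level and associate left to right, so both spellings evaluate to the same 4 MiB value. The suggested form just spares the reader from working that out (in C, `*` binds tighter than `<<`). A small check, with the hypothetical names `pageSizeOld` and `pageSizeNew`:

```go
package main

import "fmt"

const (
	pageSizeOld = 1 << 20 * 4 // parsed as (1 << 20) * 4 in Go
	pageSizeNew = 4 << 20     // same value, no precedence question
)

func main() {
	fmt.Println(pageSizeOld, pageSizeNew, pageSizeOld == pageSizeNew)
}
```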
ee/backup/writer_local.go, line 65 at r2 (raw file):
defer dst.Close() if _, err = io.Copy(dst, src); err != nil {
Why not just write directly to the destination?
Reviewable status: 0 of 20 files reviewed, 19 unresolved discussions (waiting on @golangcibot and @manishrjain)
dgraph/cmd/alpha/admin.go, line 87 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of w.Write is not checked
We'll come back to this later
ee/backup/backup.go, line 18 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
type Backup struct?
backup.Backup{} looks weird. this is backup.Worker{}. I have changed it to backup.Request{}, maybe this looks better?
ee/backup/backup.go, line 21 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
UnixTs string // UTC, as we do in export.
Done.
ee/backup/backup.go, line 40 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Look at predicate.go. We need to convert to posting list, and then marshal to KV.
if schema, then do value copy. Again, same logic as predicate.go.
Done.
ee/backup/handler.go, line 22 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Send(kvs) error, so it would work with orchestrate.
Flush error
backup.writer implements kvStream, and imports the handler. i'll look to simplify it after it's working.
ee/backup/handler.go, line 32 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Do we really need to store this at Dgraph? The user request can pass in the destination.
The user sends it in the HTTP request as destination, e.g.:
curl -X POST -F 'destination=s3://dgraph.s3.amazonaws.com/backups/201810/' http://localhost:8080/admin/backup
# Dgraph will grab the AWS auth info from the env.
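Choosing a handler from the destination's URI scheme could look roughly like the sketch below. `handlerFor` and the returned handler names are hypothetical, and treating a bare path as a local file destination is an assumption. The PR's actual handler registry is not shown in this thread.

```go
package main

import (
	"fmt"
	"net/url"
)

// handlerFor maps a destination string to a hypothetical handler name based
// on its URI scheme. A destination without a scheme is treated as a path.
func handlerFor(destination string) (string, error) {
	uri, err := url.Parse(destination)
	if err != nil {
		return "", err
	}
	switch uri.Scheme {
	case "", "file":
		return "file", nil
	case "s3":
		return "s3", nil
	}
	return "", fmt.Errorf("unsupported destination scheme %q", uri.Scheme)
}

func main() {
	for _, dst := range []string{
		"s3://dgraph.s3.amazonaws.com/backups/201810/",
		"/some/path",
		"ftp://example.com/backups",
	} {
		name, err := handlerFor(dst)
		fmt.Println(name, err)
	}
}
```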
ee/backup/writer.go, line 65 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
We don't need to first write to a tmp file, then upload to destination. We can stream directly to the destination.
Done.
ee/backup/writer.go, line 86 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Let's skip the lib. Just use fixed length header for the length of marshalled proto, then write the proto.
Done.
protos/pb.proto, line 444 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
read_ts
Done.
protos/pb.proto, line 449 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
I've never been happy with the ExportPayload, that this one is based on. Technically, all we need to pass back is an error.
message Status {
string err = 1; // If empty, then OK. If not, then it has the relevant error.
int exit_code = 2; // If 0, then OK. If 1, then error.
}
api.Payload, which can be used for this purpose as well. But, I think a special Status one is helpful.
Done.
worker/backup.go, line 124 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of (*github.com/dgraph-io/dgraph/posting.oracle).WaitForTs is not checked
Done.
worker/backup.go, line 78 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Move some args down, so the next line doesn't need to start with end-bracket?
Done.
worker/backup.go, line 79 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This should just return error.
Done.
worker/backup.go, line 80 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Out should not be returned. Rename to statusCh.
statusCh := make(chan *pb.Status, )
The function would just return error or status. That keeps all the goroutines, channels, etc. local to the function. We'd want a select case checking for context.done and the read from statusCh.
Done.
worker/backup.go, line 132 at r3 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
backupDispatch should return an error.
Done.
ee/backup/handler_file.go, line 37 at r3 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Send(kvs *pb.KVS) error
So, this can plug in to orchestrate. Also, have a Flush() method, which can be called at the end.
i need to call this from writer, but i'll refactor if i can. i have Close() that does the final Flush()
worker/stream/stream.go, line 27 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not goimports-ed
Done.
worker/stream/stream.go, line 33 at r3 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
4 << 20
Done.
ee/backup/writer_local.go, line 65 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Why not just write directly to the destination?
Done.
removed binary encoding package, using encoding/binary with size delimiter. fixed race and added posting list values to backup. fixed issue with file handler that was breaking badger. refactored backup process to be simpler. added generic Status proto for any service response. added minio and used it for S3 backup uploads.
ee/backup/writer.go
Outdated
"github.com/golang/glog"
)

const dgraphBackupTempPrefix = "dgraph-backup-*"
dgraphBackupTempPrefix is unused
@@ -24,21 +24,24 @@ import (
"github.com/dgraph-io/badger"
"github.com/dgraph-io/dgraph/protos/pb"
"github.com/dgraph-io/dgraph/x"
File is not goimports-ed
Reviewable status: 0 of 23 files reviewed, 21 unresolved discussions (waiting on @golangcibot and @manishrjain)
ee/backup/writer.go, line 17 at r4 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
dgraphBackupTempPrefix is unused
Done.
stream/stream.go, line 27 at r4 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not goimports-ed
Done.
"github.com/dgraph-io/dgraph/protos/pb"
"github.com/dgraph-io/dgraph/stream"
"github.com/dgraph-io/dgraph/x"
File is not goimports-ed
ee/backup/handler_file.go
Outdated
"path/filepath"

"github.com/dgraph-io/dgraph/x"
File is not goimports-ed
ee/backup/handler_s3.go
Outdated
"strings"

"github.com/dgraph-io/dgraph/x"
File is not goimports-ed
Reviewable status: 0 of 23 files reviewed, 24 unresolved discussions (waiting on @golangcibot and @manishrjain)
dgraph/cmd/alpha/admin.go, line 87 at r1 (raw file):
Previously, srfrog (Gus) wrote…
We'll come back to this later
Done.
ee/backup/backup.go, line 18 at r2 (raw file):
Previously, srfrog (Gus) wrote…
backup.Backup{} looks weird. this is backup.Worker{}. I have changed it to backup.Request{}, maybe this looks better?
Done.
ee/backup/backup.go, line 16 at r5 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not goimports-ed
Done.
ee/backup/handler.go, line 22 at r2 (raw file):
Previously, srfrog (Gus) wrote…
backup.writer implements kvStream, and imports the handler. i'll look to simplify it after it's working.
Done.
ee/backup/handler.go, line 32 at r2 (raw file):
Previously, srfrog (Gus) wrote…
The user sends it in the HTTP request as destination, e.g.:
curl -X POST -F 'destination=s3://dgraph.s3.amazonaws.com/backups/201810/' http://localhost:8080/admin/backup
# Dgraph will grab the AWS auth info from the env.
Done.
ee/backup/handler_s3.go, line 16 at r5 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not goimports-ed
Done.
ee/backup/handler_file.go, line 37 at r3 (raw file):
Previously, srfrog (Gus) wrote…
i need to call this from writer, but i'll refactor if i can. i have Close() that does the final Flush()
Done.
ee/backup/handler_file.go, line 13 at r5 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not goimports-ed
Done.
The PR is a lot closer to submission. I'd like to change a couple of things here and there, see what code we can remove to slim it down a bit more. Let's chat after lunch.
Reviewed 3 of 19 files at r1, 3 of 13 files at r3, 12 of 17 files at r4, 5 of 5 files at r5.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @golangcibot, @manishrjain, and @srfrog)
ee/backup/backup.go, line 63 at r5 (raw file):
} glog.V(3).Infof("Backup started ...")
This can be V(2).
ee/backup/backup.go, line 67 at r5 (raw file):
return err } glog.V(3).Infof("Backup finishing ...")
Can be V(2).
worker/backup.go, line 40 at r5 (raw file):
// sanity, make sure this is our group. if groups().groupId() != req.GroupId { err := x.Errorf("Backup request group mismatch. Mine: %d. Requested: %d\n",
return x.Errorf
worker/backup.go, line 50 at r5 (raw file):
// create backup request and process it. br := &backup.Request{ ReadTs: req.ReadTs,
Instead of copying, why not pass pb.BackupRequest to backup package.
worker/backup.go, line 77 at r5 (raw file):
func backupDispatch(ctx context.Context, in *pb.BackupRequest, gids []uint32) chan error { statusCh := make(chan error) go func() {
There's something not right about a function spawning a goroutine and just leaving them there. I find it easier to follow and a good practice when the func which spawns a goroutine also ensures that the goroutine reaches its conclusion.
This func backupDispatch, probably an apt name for what it's doing right now, should really do the dispatching, but also run through the results, returning a single error (not a chan).
worker/backup.go, line 132 at r5 (raw file):
// If we receive any failures, we cancel the process. req := &pb.BackupRequest{ReadTs: readTs, Target: target} for err := range backupDispatch(ctx, req, gids) {
There';s
x/file.go, line 44 at r5 (raw file):
// Writeq is a quiet write. Writes b to w and eats up the return. func Writeq(w io.Writer, b []byte) { _, _ = w.Write(b)
Can probably check? x.Check2(w.Write(b))
ee/backup/handler_file.go, line 35 at r5 (raw file):
return err } glog.V(3).Infof("using file path: %q", path)
Can be V(2).
ee/backup/handler_file.go, line 47 at r5 (raw file):
} }() if err := h.fp.Sync(); err != nil {
If Sync fails, then just do fp.Close().
But, do return the error of Close, so remove the defer, and bring it inline after Sync.
ee/backup/handler_s3.go
Outdated
glog.V(3).Infof("sent %d bytes, actual %d bytes, time elapsed %s", n, h.n, time.Since(start))
break
}
pr.CloseWithError(nil) // EOF
Error return value of pr.CloseWithError is not checked
ee/backup/handler_s3.go
Outdated
const (
s3defaultEndpoint = "s3.amazonaws.com"
s3accelerateHost = "s3-accelerate"
s3accelerateHost is unused
ee/backup/writer.go
Outdated
"github.com/golang/glog"
)

const dgraphBackupSuffix = ".dgraph-backup"
dgraphBackupSuffix is unused
renamed s3handler.send to s3handler.upload and removed all buffering. s3handler tests that bucket exists before working, we can't assume a region. s3handler.Close blocks until the upload is complete.
ee/backup/handler_s3.go
Outdated
const (
s3DefaultEndpoint = "s3.amazonaws.com"
s3AccelerateHost = "s3-accelerate"
s3AccelerateHost is unused
ee/backup/handler_s3.go
Outdated
const (
s3DefaultEndpoint = "s3.amazonaws.com"
s3AccelerateHost = "s3-accelerate"
s3MinioChunkSize = 64 << 20 // 64MiB, minimum upload size for single file.
const s3MinioChunkSize is unused
ee/backup/handler_s3.go
Outdated
// progress allows us to monitor the progress of an upload.
// TODO: I used this during testing, maybe keep it turned on for -v 5 ?
type progress struct{ n uint64 }
type progress is unused
Reviewable status: 12 of 20 files reviewed, 26 unresolved discussions (waiting on @manishrjain and @golangcibot)
ee/backup/backup.go, line 63 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This can be V(2).
Done.
ee/backup/backup.go, line 67 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Can be V(2).
Done.
ee/backup/handler_s3.go, line 26 at r6 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
s3accelerateHost is unused
Done.
ee/backup/handler_s3.go, line 107 at r6 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of pr.CloseWithError is not checked
Done.
ee/backup/handler_s3.go, line 27 at r8 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
s3AccelerateHost is unused
Done.
ee/backup/handler_s3.go, line 28 at r8 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
const s3MinioChunkSize is unused
Done.
ee/backup/handler_s3.go, line 103 at r8 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
type progress is unused
Done.
ee/backup/writer.go, line 18 at r7 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
dgraphBackupSuffix is unused
Done.
worker/backup.go, line 40 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
return x.Errorf
Done.
worker/backup.go, line 50 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Instead of copying, why not pass pb.BackupRequest to backup package.
Done.
worker/backup.go, line 77 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
There's something not right about a function spawning a goroutine and just leaving them there. I find it easier to follow and a good practice when the func which spawns a goroutine also ensures that the goroutine reaches its conclusion.
This func backupDispatch, probably an apt name for what it's doing right now, should really do the dispatching, but also run through the results, returning a single error (not a chan).
We catch all the errors at the receiver in the order they finish and stop if we get the first error. the part missing (which i'm adding now) is a sync stop to stop any acting go routines.
worker/backup.go, line 132 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
There';s
Done.
x/file.go, line 44 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Can probably check?
x.Check2(w.Write(b))
Done.
ee/backup/handler_file.go, line 35 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Can be V(2).
Done.
ee/backup/handler_file.go, line 47 at r5 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
If Sync fails, then just do fp.Close().
But, do return the error of Close, so remove the defer, and bring it inline after Sync.
Done.
ee/backup/s3_handler.go
Outdated
if err != nil {
return x.Errorf("Error while looking for bucket: %s at host: %s. Error: %v",
h.bucket, uri.Host, err)
return err
unreachable code
The PR is good to go! Tested and all.
Reviewable status: 11 of 20 files reviewed, 23 unresolved discussions (waiting on @manishrjain, @golangcibot, and @srfrog)
ee/backup/backup.go, line 66 at r9 (raw file):
return err } if err = w.cleanup(); err != nil {
Close is the typical expected fn call at the end, or Flush. Cleanup sounds strange here.
…h into feature/roadmap-backups
Reviewable status: 9 of 20 files reviewed, 23 unresolved discussions (waiting on @manishrjain and @golangcibot)
ee/backup/backup.go, line 66 at r9 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Close is the typical expected fn call at the end, or Flush. Cleanup sounds strange here.
Done.
ee/backup/s3_handler.go, line 87 at r10 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
unreachable code
Done.
* moved orchestrate to its own pkg within worker. adding backup to worker and alpha.
* trying to get the handler and file writers working.
* added destination parameter. handler support to destination URI scheme. additional logging.
* file handler rename on same volume. added more comments and logging.
* changed worker to use stream pkg. updated protos for backup. fixed mislabelled request field.
* logging changes for debugging
* added some error checks, tweaked comments.
* moved stream pkg out of worker. removed binary encoding package, using encoding/binary with size delimiter. fixed race and added posting list values to backup. fixed issue with file handler that was breaking badger. refactored backup process to be simpler. added generic Status proto for any service response. added minio and used it for S3 backup uploads.
* removed unused const. format fixes.
* Initial pass at simplifying things.
* cleaned up redundant code. renamed s3handler.send to s3handler.upload and removed all buffering. s3handler tests that bucket exists before working, we can't assume a region. s3handler.Close blocks until the upload is complete.
* unused const
* missing space
* added progress monitoring. fixed issues found by CI
* Small fixes here and there.
* Rename handler files.
* Both S3 uploads and file writes are tested to work.
* renamed writer.cleanup to writer.close
* regenerated protos
* removed unneeded fallthrough
This PR adds full backups to Dgraph.
Backups are started from any alpha with an HTTP POST to /admin/backup. A destination variable must be sent with the request to indicate where to save the backup:
curl -XPOST -F 'destination=/some/path' localhost:8080/admin/backup
The destination path must exist, except for local paths, which Dgraph will attempt to create. Currently only local or NFS paths are supported, but we will be adding AWS, GCP, Azure, and HTTP next.