-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Live & bulk loader changes #2961
Conversation
…generic test data filenames.
… name. Improve live loader messages.
z/exec.go
Outdated
} | ||
} | ||
|
||
cmd[i].Start() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of (*os/exec.Cmd).Start
is not checked (from errcheck
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 21 of 21 files at r1.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @codexnull and @golangcibot)
dgraph/cmd/bulk/loader.go, line 40 at r1 (raw file):
"github.com/dgraph-io/dgraph/x" "github.com/dgraph-io/dgraph/xidmap" "google.golang.org/grpc"
Why is this line being moved?
dgraph/cmd/bulk/loader.go, line 69 at r1 (raw file):
shardOutputDirs []string keyFields []string }
If you are going to have two fields that look the same, there should probably a comment explaining the difference to avoid confusion in the future.
Personally, if I understand the difference, I would call keyFields something like parsedKeyFields.
dgraph/cmd/live/load_json_test.go, line 39 at r1 (raw file):
var testDataDir string var dg *dgo.Dgraph var tmpDir string
move these into a var block
dgraph/cmd/live/run.go, line 69 at r1 (raw file):
var Live x.SubCommand var keyFields []string
also move all these vars into a var block
dgraph/cmd/live/run.go, line 174 at r1 (raw file):
} // forward file to the RDF or JSON processor as appropriate
change it to "processFile forwards file to the ...."
edgraph/nquads_from_json.go, line 422 at r1 (raw file):
func replaceBlankIds(nqs []*api.NQuad, keyFields *[]string) error { field2idx := make(map[string]int)
I think this would be a bit more readable if it was called fieldToIdx
edgraph/nquads_from_json.go, line 450 at r1 (raw file):
if err == nil { if nqs[0].Subject == "_:blank-0" { if keyFields == nil {
maybe assert this at the very beginning and exit early if keyFields == nil?
loadfile/chunk.go, line 62 at r1 (raw file):
func (rdfChunker) Begin(r *bufio.Reader) error { return nil
Add a comment on why this chunker just returns nil.
loadfile/chunk.go, line 68 at r1 (raw file):
batch.Grow(1 << 20) for lineCount := 0; lineCount < 1e5; lineCount++ {
maybe assign these constants to a variable and explain why these values were chosen.
loadfile/chunk.go, line 122 at r1 (raw file):
func (rdfChunker) End(r *bufio.Reader) error { return nil
a comment explaining why this function is empty would also be helpful
loadfile/chunk.go, line 145 at r1 (raw file):
func (jsonChunker) Chunk(r *bufio.Reader) (*bytes.Buffer, error) { out := new(bytes.Buffer) out.Grow(1 << 20)
Also this constant but I assume since it has the same value as above it represents the same quantity.
z/exec.go, line 67 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of
(*os/exec.Cmd).Start
is not checked (fromerrcheck
)
Fix this
z/json.go, line 91 at r1 (raw file):
func sortJSONArray(a []interface{}) uint64 { h := uint64(0)
var h uint64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @golangcibot and @martinmr)
dgraph/cmd/bulk/loader.go, line 40 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
Why is this line being moved?
Because the automatic formatting in my editor decided to do it. Sometimes it rearranges imports and I don't always catch it.
dgraph/cmd/bulk/loader.go, line 69 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
If you are going to have two fields that look the same, there should probably a comment explaining the difference to avoid confusion in the future.
Personally, if I understand the difference, I would call keyFields something like parsedKeyFields.
Done.
dgraph/cmd/live/load_json_test.go, line 39 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
var testDataDir string var dg *dgo.Dgraph var tmpDir string
move these into a var block
OK. Is that just because it's the preferred Go style?
dgraph/cmd/live/run.go, line 69 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
also move all these vars into a var block
Done.
dgraph/cmd/live/run.go, line 174 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
change it to "processFile forwards file to the ...."
Done.
edgraph/nquads_from_json.go, line 422 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
I think this would be a bit more readable if it was called fieldToIdx
Done.
edgraph/nquads_from_json.go, line 450 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
maybe assert this at the very beginning and exit early if keyFields == nil?
It's OK for keyFields to be nil most of the time. Only when a JSON map does not contain a uid field (which is what the "_:blank-0" check is for) is it required.
loadfile/chunk.go, line 62 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
Add a comment on why this chunker just returns nil.
Done.
loadfile/chunk.go, line 68 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
batch.Grow(1 << 20) for lineCount := 0; lineCount < 1e5; lineCount++ {
maybe assign these constants to a variable and explain why these values were chosen.
Those values were there already, but my guess is they are arbitrary.
loadfile/chunk.go, line 122 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
a comment explaining why this function is empty would also be helpful
Done.
loadfile/chunk.go, line 145 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
Also this constant but I assume since it has the same value as above it represents the same quantity.
Yes, this I just copied for consistency.
z/json.go, line 91 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
var h uint64
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r2.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @codexnull, @golangcibot, and @martinmr)
dgraph/cmd/live/load_json_test.go, line 39 at r1 (raw file):
Previously, codexnull (Javier Alvarado) wrote…
OK. Is that just because it's the preferred Go style?
I believe so
loadfile/chunk.go, line 61 at r2 (raw file):
} // RDF files don't require any special processing at the beginning of the file
period at the end
loadfile/chunk.go, line 122 at r2 (raw file):
} // RDF files don't require any special processing at the end of the file
period at the end
z/json.go, line 91 at r2 (raw file):
func sortJSONArray(a []interface{}) uint64 { h := uint64(0)
var h uint64 here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 21 files at r1, 7 of 7 files at r2.
Reviewable status: all files reviewed, 30 unresolved discussions (waiting on @codexnull, @golangcibot, and @martinmr)
dgraph/cmd/bulk/loader.go, line 204 at r2 (raw file):
defer thr.Done() r, cleanup_fn := x.FileReader(file)
r, cleanup := x.FileReader(file)
defer cleanup()
dgraph/cmd/bulk/mapper.go, line 120 at r2 (raw file):
} func (m *mapper) run(inputFormat int, keyFields *[]string) {
Slices are pointers by default. No need to pass in a pointer to a slice.
If you're changing the slice, then the Go way is to return the updated slice as a result.
dgraph/cmd/bulk/mapper.go, line 135 at r2 (raw file):
} gqlNqs := make([]gql.NQuad, len(nqs))
What is this for? Are you copying them over for some reason?
dgraph/cmd/bulk/run.go, line 51 at r2 (raw file):
flag := Bulk.Cmd.Flags() flag.StringP("rdfs", "r", "", "Location of RDF data to load.")
I think the previous comment was clearer than this one.
dgraph/cmd/bulk/run.go, line 54 at r2 (raw file):
// would be nice to use -j to match -r, but already used by --num_go_routines flag.String("jsons", "", "Location of JSON data to load.")
same here. This isn't as clear -- unless you changed how we detect files.
dgraph/cmd/bulk/run.go, line 94 at r2 (raw file):
flag.String("custom_tokenizers", "", "Comma separated list of tokenizer plugins") flag.StringP("key", "k", "", "Comma-separated list of JSON fields to identify a uid")
Add (Optional)
prefix in the description.
... to identify a node uniquely.
dgraph/cmd/bulk/run.go, line 184 at r2 (raw file):
for _, f := range strings.Split(opt.KeyFields, ",") { opt.parsedKeyFields = append(opt.parsedKeyFields, strings.TrimSpace(f))
We should avoid having 2 fields in opt, which both represent the same information. You can just overwrite opt.KeyFields
here.
dgraph/cmd/live/run.go, line 85 at r2 (raw file):
flag := Live.Cmd.Flags() flag.StringP("rdfs", "r", "", "Location of RDF or JSON file(s) to load")
Rename to files
, and f
. What do you expect the pattern to be? like this:
Go uses camelcase. But, as mentioned above, no need for _fn
. Just call it cleanup. Similar to how context.WithCancel works.
dgraph/cmd/live/run.go, line 189 at r2 (raw file):
} else { isJson, err = x.IsJSONData(rd) if isJson {
Shouldn't you check for error first? But, also, why try to guess? Just complain and crash out. Users' responsibility to give us the right files.
dgraph/cmd/live/run.go, line 199 at r2 (raw file):
} func (l *loader) nextNquads(batch []*api.NQuad, nqs []*api.NQuad) []*api.NQuad {
Typically these things just get inlined into the loops, instead of being funcs on their own.
dgraph/cmd/live/run.go, line 211 at r2 (raw file):
mu := api.Mutation{Set: batch[:opt.batchSize]} l.reqs <- mu batch = batch[opt.batchSize:]
I smell some funkiness here. The way batch is being passed around doesn't look good. Yeah, just inline this in the caller.
Also, after passing batch to a channel, create a new batch, don't modify it anymore. This particular case might just work, but in general, a recipe for disaster.
dgraph/cmd/live/run.go, line 217 at r2 (raw file):
} func (l *loader) finalNquads(batch []*api.NQuad) {
Inline this func.
dgraph/cmd/live/run.go, line 253 at r2 (raw file):
} func (l *loader) processJsonFile(ctx context.Context, rd *bufio.Reader) error {
Not a fan of 2-line functions. Rather inline it and avoid the code jumps required to read this code.
dgraph/cmd/live/run.go, line 258 at r2 (raw file):
} func (l *loader) processRdfFile(ctx context.Context, rd *bufio.Reader) error {
And here.
dgraph/cmd/live/run.go, line 325 at r2 (raw file):
for _, f := range strings.Split(opt.keyFields, ",") { keyFields = append(keyFields, strings.TrimSpace(f))
Some other place, this was being set back into the opt? (looks like in bulk) But comments there apply here as well. Overwrite opt.keyFields, decrease a variable.
edgraph/nquads_from_json.go, line 437 at r2 (raw file):
for _, nq := range nqs { nq.Subject = str
This won't work if another node is pointing to this node. The ObjectId which was a variable before would have been correctly set by xidToUid
map, but now the subjects get changed, leaving those pointers dangling.
loadfile/chunk.go, line 62 at r1 (raw file):
Previously, codexnull (Javier Alvarado) wrote…
Done.
I don't see the changes required by Martin.
loadfile/chunk.go, line 17 at r2 (raw file):
*/ package loadfile
package chunker.
Also, move rdf parsing within this package.
loadfile/chunk.go, line 214 at r2 (raw file):
} nqs, err := edgraph.JsonToNquads(chunkBuf.Bytes(), keyFields)
Move it to this package.
x/file.go, line 105 at r2 (raw file):
// and decompressed automatically even without the gz extension. The caller is responsible for // calling the returned cleanup function when done with the reader. func FileReader(file string) (rd *bufio.Reader, cleanup func()) {
Move this to chunker pkg.
z/client.go, line 2 at r2 (raw file):
/* * Copyright 2017-2018 Dgraph Labs, Inc. and Contributors
2019
z/exec.go, line 2 at r2 (raw file):
/* * Copyright 2017-2018 Dgraph Labs, Inc. and Contributors
2019
z/json.go, line 2 at r2 (raw file):
/* * Copyright 2017-2018 Dgraph Labs, Inc. and Contributors
2019
dgraph/cmd/bulk/mapper.go
Outdated
@@ -139,7 +140,7 @@ func (m *mapper) run(inputFormat int) { | |||
} | |||
} | |||
|
|||
m.processNQuad(nq) | |||
m.processNQuad(gql.NQuad{nq}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gql.NQuad
composite literal uses unkeyed fields (from govet
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 21 files reviewed, 31 unresolved discussions (waiting on @codexnull, @golangcibot, @manishrjain, and @martinmr)
dgraph/cmd/bulk/loader.go, line 204 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
r, cleanup := x.FileReader(file)
defer cleanup()
Done.
dgraph/cmd/bulk/mapper.go, line 120 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Slices are pointers by default. No need to pass in a pointer to a slice.
If you're changing the slice, then the Go way is to return the updated slice as a result.
Done.
dgraph/cmd/bulk/mapper.go, line 135 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
What is this for? Are you copying them over for some reason?
chunker.Parse() returns a slice of *api.NQuad, but the call to mapper.processNquad() below needs a gql.NQuad, so this code was converting from a slice of *api.NQuad to a slice of gql.NQuad. Looking at the code again, though, it's not necessary to create an array of gql.NQuad; each one can be just converted at the mapper.processNquad() invocation.
dgraph/cmd/bulk/mapper.go, line 143 at r3 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
gql.NQuad
composite literal uses unkeyed fields (fromgovet
)
Done.
dgraph/cmd/bulk/run.go, line 51 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
I think the previous comment was clearer than this one.
The reason I changed the wording is because for the live loader, the argument isn't necessarily a directory. It may be the rdf file directly, a comma-separated list of file, or a directory containing the rdf files. I changed the bulk loader to behave the same way as the live loader.
dgraph/cmd/bulk/run.go, line 54 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
same here. This isn't as clear -- unless you changed how we detect files.
The argument here may also be a file, a list of files, or a directory.
dgraph/cmd/bulk/run.go, line 94 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Add
(Optional)
prefix in the description.... to identify a node uniquely.
Aren't all options optional? :)
Removing the --key option since the feature requires more thought.
dgraph/cmd/bulk/run.go, line 184 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
We should avoid having 2 fields in opt, which both represent the same information. You can just overwrite
opt.KeyFields
here.
I don't think we can do that given that opt.KeyFields is the string given as the option argument (e.g. "last,first,middle") and opt.parsedKeyFields is a slice of the individual fields (e.g. ["last","first","middle"]), but I'll see if I can move parsedKeyFields out of opt to some other (non-global variable) spot.
dgraph/cmd/live/run.go, line 85 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Rename to
/*.json.gz? Whatever it is, mention it in the description so it is clear.files
, andf
. What do you expect the pattern to be? like this:
I kept --rdfs because I didn't want to break backward compatibility. Should we keep it and deprecate it? I will change the option to --files in the bulk loader so that they are consistent.
dgraph/cmd/live/run.go, line 179 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Go uses camelcase. But, as mentioned above, no need for
_fn
. Just call it cleanup. Similar to how context.WithCancel works.
Done.
dgraph/cmd/live/run.go, line 189 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Shouldn't you check for error first? But, also, why try to guess? Just complain and crash out. Users' responsibility to give us the right files.
I check isJson first because it can only be true if err is nil and I wanted to let errors fall through to the return at the end of the function, but I'll change it to check err since that seems to be the preferred go idiom.
As for detecting JSON data, the reason is so that the user can do something like "curl | dgraph live ..." to stream data in (that is, there's no filename to check for an extension). Would it be better to make the user specify something like --format=json instead?
dgraph/cmd/live/run.go, line 199 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Typically these things just get inlined into the loops, instead of being funcs on their own.
OK
dgraph/cmd/live/run.go, line 211 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
I smell some funkiness here. The way batch is being passed around doesn't look good. Yeah, just inline this in the caller.
Also, after passing batch to a channel, create a new batch, don't modify it anymore. This particular case might just work, but in general, a recipe for disaster.
batch is not a mutation, it's just a regular slice that holds a queue of nquads. When there are batchSize or more nquads in the queue, then the mutation is created, the leading batchSize items are copied to the mutation and removed from the queue, and the mutation is sent off through the channel and forgotten about. I think the queue will be unnecessary when I inline the functions
dgraph/cmd/live/run.go, line 253 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Not a fan of 2-line functions. Rather inline it and avoid the code jumps required to read this code.
Done.
dgraph/cmd/live/run.go, line 258 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
And here.
Done.
dgraph/cmd/live/run.go, line 325 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Some other place, this was being set back into the opt? (looks like in bulk) But comments there apply here as well. Overwrite opt.keyFields, decrease a variable.
Yes, the bulk and live loaders have different opt structures. Removing this feature since it requires more thought.
edgraph/nquads_from_json.go, line 437 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This won't work if another node is pointing to this node. The ObjectId which was a variable before would have been correctly set by
xidToUid
map, but now the subjects get changed, leaving those pointers dangling.
Removing this feature since it requires more thought.
loadfile/chunk.go, line 61 at r2 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
period at the end
Done.
loadfile/chunk.go, line 122 at r2 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
period at the end
Done.
z/exec.go, line 67 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
Fix this
Done.
z/json.go, line 91 at r2 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
var h uint64 here as well
Done.
@@ -31,6 +31,7 @@ import ( | |||
"sync/atomic" | |||
|
|||
"github.com/dgraph-io/dgo/protos/api" | |||
"github.com/dgraph-io/dgraph/chunker" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File is not goimports
-ed (from goimports
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 30 files reviewed, 32 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)
dgraph/cmd/bulk/mapper.go, line 135 at r2 (raw file):
Previously, codexnull (Javier Alvarado) wrote…
chunker.Parse() returns a slice of *api.NQuad, but the call to mapper.processNquad() below needs a gql.NQuad, so this code was converting from a slice of *api.NQuad to a slice of gql.NQuad. Looking at the code again, though, it's not necessary to create an array of gql.NQuad; each one can be just converted at the mapper.processNquad() invocation.
Done.
dgraph/cmd/bulk/mapper.go, line 34 at r4 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not
goimports
-ed (fromgoimports
)
Done.
dgraph/cmd/bulk/run.go, line 184 at r2 (raw file):
Previously, codexnull (Javier Alvarado) wrote…
I don't think we can do that given that opt.KeyFields is the string given as the option argument (e.g. "last,first,middle") and opt.parsedKeyFields is a slice of the individual fields (e.g. ["last","first","middle"]), but I'll see if I can move parsedKeyFields out of opt to some other (non-global variable) spot.
Removing the --key option since the feature requires more thought.
dgraph/cmd/live/run.go, line 85 at r2 (raw file):
Previously, codexnull (Javier Alvarado) wrote…
I kept --rdfs because I didn't want to break backward compatibility. Should we keep it and deprecate it? I will change the option to --files in the bulk loader so that they are consistent.
Done.
dgraph/cmd/live/run.go, line 217 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Inline this func.
Done.
x/file.go, line 105 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Move this to chunker pkg.
Done.
z/client.go, line 2 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
2019
Done.
z/exec.go, line 2 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
2019
Done.
z/json.go, line 2 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
2019
Done.
loadfile/chunk.go, line 17 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
package chunker.
Also, move rdf parsing within this package.
Done.
loadfile/chunk.go, line 214 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Move it to this package.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made and pushed a small change to fix an issue. Good to go.
Reviewed 3 of 9 files at r3, 8 of 15 files at r4, 12 of 13 files at r5, 1 of 1 files at r6.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @codexnull, @golangcibot, @manishrjain, and @martinmr)
dgraph/cmd/live/run.go, line 189 at r2 (raw file):
Previously, codexnull (Javier Alvarado) wrote…
I check isJson first because it can only be true if err is nil and I wanted to let errors fall through to the return at the end of the function, but I'll change it to check err since that seems to be the preferred go idiom.
As for detecting JSON data, the reason is so that the user can do something like "curl | dgraph live ..." to stream data in (that is, there's no filename to check for an extension). Would it be better to make the user specify something like --format=json instead?
OK.
dgraph/cmd/live/run.go, line 221 at r5 (raw file):
batch = append(batch, nqs...) for len(batch) >= opt.batchSize { mu := api.Mutation{Set: batch[:opt.batchSize]}
send the whole batch out, and create a new batch. Don't modify a slice after its ownership has changed. I've pushed a change to this
edgraph/server.go, line 35 at r5 (raw file):
"github.com/dgraph-io/dgo/y" nqjson "github.com/dgraph-io/dgraph/chunker/json"
cjson? short of chunker/json?
Fixes dgraph-io#2889, dgraph-io#2927. Summary of loader changes: * Support live loading JSON files * Support loading RDF or JSON stream instead of requiring files in live loader * Auto-detect compressed load data instead of requiring extension in filename in both loaders * Auto-detect JSON load data instead of requiring extension in filename in both loaders * Lots of refactoring of live and bulk loaders to share code for the same functionality
Fixes #2889, #2848, #2927.
Summary of loader changes:
Support live loading JSON files
Support creating UID on the fly based on one or more other JSON fields in both loaders
Support loading RDF or JSON stream instead of requiring files in live loader
Auto-detect compressed load data instead of requiring extension in filename in both loaders
Auto-detect JSON load data instead of requiring extension in filename in both loaders
Lots of refactoring of live and bulk loaders to share code for the same functionality
This change is