BatchCommitLog
is a HDFSMetadataLog with metadata as regular text (i.e. String
).
Note
|
HDFSMetadataLog is a MetadataLog that uses Hadoop HDFS for a reliable storage.
|
BatchCommitLog
is created exclusively for batch commit log in StreamExecution
.
serialize(metadata: String, out: OutputStream): Unit
Note
|
serialize is a part of HDFSMetadataLog Contract to write a metadata in serialized format.
|
serialize
writes out the version prefixed with v
on a single line (e.g. v1
) followed by the empty JSON (i.e. {}
).
Note
|
The version in Spark 2.2 is 1 with the charset being UTF-8. |
Note
|
serialize always writes an empty JSON as the name of the files gives the meaning.
|
$ ls -tr [checkpoint-directory]/commits
0 1 2 3 4 5 6 7 8 9
$ cat [checkpoint-directory]/commits/8
v1
{}