Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compulsory MIMEType or description in addData!????? #579

Closed
dehann opened this issue Aug 8, 2020 · 10 comments
Closed

Compulsory MIMEType or description in addData!????? #579

dehann opened this issue Aug 8, 2020 · 10 comments
Labels
Milestone

Comments

@dehann
Copy link
Member

dehann commented Aug 8, 2020

Old:

addDataEntry!( dfg, :x1, datastore, :datalabel, "application/json", Vector{UInt8}(JSON2.write( someDict )) 

New (missing the mimetype or description field)

addData!(fg, :default_folder_store, :x1, :datalabel, Vector{UInt8}(JSON2.write( someDict ))  )

EDIT, see (Standard Interface <: AbstractBlobEntry) #594


Conclusion is

addData!(fg, :default_folder_store, :x1, :datalabel, Vector{UInt8}(JSON2.write( someDict )), mimeType="application/json"  )
@dehann dehann added API user experience data: entry=>blob Previously bigdata, suggested over `smalldata` regression labels Aug 8, 2020
@dehann dehann added this to the v0.10.0 milestone Aug 8, 2020
@dehann dehann changed the title Compuloray MIMEType or description in Data????? Compulsory MIMEType or description in Data????? Aug 8, 2020
@dehann dehann changed the title Compulsory MIMEType or description in Data????? Compulsory MIMEType or description in addData!????? Aug 8, 2020
@Affie
Copy link
Member

Affie commented Aug 8, 2020

I don't know exactly what the issue is. Do you want it as compulsory arguments instead of keyword arguments? It makes it hard to be general.

I went on how it previously was:

GeneralDataEntry(key::Symbol, storeKey::Symbol;
mimeType::String="application/octet-stream") =
GeneralDataEntry(key, storeKey, now(), now(), mimeType)
function GeneralDataEntry(dfg::G, var::V, key::Symbol;
mimeType::String="application/octet-stream") where {G <: AbstractDFG, V <: AbstractDFGVariable}
return GeneralDataEntry(key, _uniqueKey(dfg, var, key), mimeType=mimeType)
end

So keyword argument on MIME Type and added a new field discription also set with a keyword argument.

Also in v0.8

GeneralBigDataEntry(key::Symbol, storeKey::Symbol;
mimeType::String="application/octet-stream") =
GeneralBigDataEntry(key, storeKey, now(), now(), mimeType)
function GeneralBigDataEntry(dfg::G, var::V, key::Symbol;
mimeType::String="application/octet-stream") where {G <: AbstractDFG, V <: AbstractDFGVariable}
return GeneralBigDataEntry(key, _uniqueKey(dfg, var, key), mimeType=mimeType)
end

@Affie
Copy link
Member

Affie commented Aug 8, 2020

Here is the function signature:

function addData!(dfg::AbstractDFG, blobstore::AbstractBlobStore, label::Symbol, key::Symbol,
blob::Vector{UInt8}, timestamp=now(localzone()); description="", mimeType = "", id::UUID = uuid4(), hashfunction = sha256)

@dehann
Copy link
Member Author

dehann commented Aug 9, 2020

don't know exactly what the issue is

Okay, lets try figure that out first. We previously spoke about always storing the format in which the data is serialized (don't think we captured that in an issue). I think that discussion has just gotten lost and I'm trying to make sure we still capture it. The latest example does not have a mimetype or a description field: #580 (comment)

Previously, I did do the example with keyword: #391 (comment) but now I'm thinking the mimetype as keyword is a mistake, I think it needs to be compulsory.

xref:

hard to be general

Maybe I'm not understanding this part, but i think it should be a mandatory field that must be carried around everywhere (similar to e.g. timestamp). Users need to communicate how the data blob was serialized otherwise its near impossible to use. E.g., if you pull an image from the store (that someone else pushed) and it's just a binary data blob, how is a user supposed to know what they have (png, jpg, bitmap, ?). This is more tricky if the default mimetype was assumed as application/octet-stream. Hence, to me mimetype is super critical.

v0.8 API

Yep, still figuring it out and the API has been changing quite a bit. Currently _uniqueKey and default mimetype=application/octet-stream get locked together inside the Entry without the user knowing, and cannot be changed under the immutability requirement. So the user later learns there is a mimetype and then has to redo the upload because they alse need to learn that key=>blob pairings are immutable. Then they might get the label collision as in #578 too.

PS, I'd almost push for a second description/note field that the user can populate. Example, someone wants to store an acoustic wav recording so use the mimetype application/json, but then add the description="this is the one with bad electronics board v0.13". We also spoke about this before, but this is fine as a keyword interface that can be added after.

@dehann
Copy link
Member Author

dehann commented Aug 9, 2020

old discussion,

@Affie
Copy link
Member

Affie commented Aug 9, 2020

I'll just restate a short explanation and history here, there are two (almost competing) implementations of storing data in the file system:

  • BlobStoreEntry working with a FolderStore
    struct BlobStoreEntry <: AbstractDataEntry
    label::Symbol
    id::UUID
    blobstore::Symbol
    hash::String # Probably https://docs.julialang.org/en/v1/stdlib/SHA
    origin::String # E.g. user|robot|session|varlabel
    description::String
    mimeType::String
    createdTimestamp::ZonedDateTime # of when the entry was created
    end
  • FileDataEntry
    struct FileDataEntry <: AbstractDataEntry
    label::Symbol
    id::UUID
    folder::String
    hash::String #using bytes2hex or perhaps Vector{Uint8}?
    createdTimestamp::ZonedDateTime
    function FileDataEntry(label, id, folder, hash, timestamp)
    if !isdir(folder)
    @warn "Folder '$folder' doesn't exist - creating."
    # create new folder
    mkpath(folder)
    end
    return new(label, id, folder, hash, timestamp)
    end
    end

I created and used FileDataEntry as a first solution to quickly save data blobs. Then the stores were created with the surrounding functions, but I didn't find it flexible and it didn't fill my original need.
I then refactored it a bit to be able to use a store in the DFG structure itself.
That makes the BlobStoreEntry + FolderStore look like a better solution. I kept both alive (but didn't add the description and MIMEType to FileDataEntry.

As you can see BlobStoreEntry has a MIME type and a description field.

If I understand correctly, the issue then is

  • mimeType as a compulsory or keyword argument in the convenience functions (addData)

    • JT I have never had the need for it and think it should be a bit more hidden and automated. so would say optional keyword argument (perhaps default to application/octet-stream, as the implementation (blob::Vector{UInt8}) is binary). I can see the usefulness in cases such as image/jpg etc. In those cases, it would be nice to have it automatically filled in.
    • DF I frequently need this, and every time we have worked as multiple people together on a project this quickly becomes an issue, so I would push for compulsory mimetype, and write the docs to have that field empty "" from default user call. So we'd need to break the tie -- perhaps @GearsAD can decide the vote on mimetype API between optional keyword or compulsory for all entry => blob.
  • description as a compulsory or keyword argument in the convenience functions (addData)

    • JT I think its optional, so keyword argument
    • DF, optional on this is fine with me too, and +1 for definitely having the field available

@dehann
Copy link
Member Author

dehann commented Aug 9, 2020

thanks, added my votes above.

there are two (almost competing) implementations of storing data

Sounds like main effort will be a consolidation between two half solutions. It's easier to build out new duplicate functionality and harder to update the old code.

@dehann
Copy link
Member Author

dehann commented Aug 11, 2020

@Affie
Copy link
Member

Affie commented Aug 11, 2020

I like this part:

User can upload anything,
- Then go back and reinsert new bigData with proper MIME type declarations,

@dehann
Copy link
Member Author

dehann commented Aug 12, 2020

overall design issue #594

@dehann
Copy link
Member Author

dehann commented Aug 12, 2020

Related JuliaRobotics/Caesar.jl#274

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants