add metadata #48

Merged
merged 44 commits into from
Sep 19, 2022
ff31aa7
add `getmetadata`
bkamins May 8, 2022
96b1e39
Update Project.toml
bkamins May 8, 2022
f7cb305
Apply suggestions from code review
bkamins May 22, 2022
77e3604
Update runtests.jl
bkamins May 22, 2022
7d53c4b
Update src/DataAPI.jl
bkamins May 22, 2022
c28d7c5
Update test/runtests.jl
bkamins May 22, 2022
37ef469
Update test/runtests.jl
bkamins May 22, 2022
4617f60
Update test/runtests.jl
bkamins May 22, 2022
5a493b6
update API
bkamins May 23, 2022
aa15118
improve test coverage
bkamins May 23, 2022
992d1e8
change column to key
bkamins May 23, 2022
bbeda80
Apply suggestions from code review
bkamins May 23, 2022
45b433a
Update src/DataAPI.jl
bkamins May 23, 2022
6897052
Update src/DataAPI.jl
bkamins May 23, 2022
2ac9c88
add colmetadata
bkamins May 30, 2022
43cec64
fix typo
bkamins May 30, 2022
320f9db
fix another typo
bkamins May 30, 2022
3e7ed28
fix tests
bkamins May 30, 2022
f2e4eb1
improve contract description
bkamins May 30, 2022
76bfba5
update error message
bkamins May 31, 2022
9db264a
drop table-level colmetadata
bkamins Jun 1, 2022
b863431
Update test/runtests.jl
bkamins Jun 3, 2022
0df50d7
Apply suggestions from code review
bkamins Jun 23, 2022
0c685ad
add hascolmetadata for whole table
bkamins Jun 27, 2022
0d75fce
new metadata style
bkamins Jul 30, 2022
40edcd9
small fixes
bkamins Jul 30, 2022
c3d7f94
Apply suggestions from code review
bkamins Jul 31, 2022
a75b965
changes after code review
bkamins Jul 31, 2022
47629ed
update specification
bkamins Jul 31, 2022
51add3c
update docstrings
bkamins Aug 1, 2022
f0c9fc0
Apply suggestions from code review
bkamins Aug 2, 2022
be8102f
Apply suggestions from code review
bkamins Aug 2, 2022
435e062
add metadata deletion
bkamins Aug 3, 2022
56f98b7
fix typo
bkamins Aug 3, 2022
b909f01
fix another typo
bkamins Aug 3, 2022
3e32920
fix tests
bkamins Aug 3, 2022
b2d2636
one more fix to tests
bkamins Aug 3, 2022
a589cee
final test fix
bkamins Aug 3, 2022
c612a12
improve test coverage
bkamins Aug 3, 2022
15303a6
add emptymetadata! and emptycolmetadata!
bkamins Aug 4, 2022
3b61e32
Update src/DataAPI.jl
bkamins Aug 6, 2022
edfd350
Update src/DataAPI.jl
bkamins Aug 7, 2022
b33b62e
make :none style more precise
bkamins Aug 30, 2022
c5f699e
change :none to :default
bkamins Sep 17, 2022
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "DataAPI"
uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
authors = ["quinnj <[email protected]>"]
-version = "1.10.0"
+version = "1.11.0"

[compat]
julia = "1"
154 changes: 154 additions & 0 deletions src/DataAPI.jl
@@ -287,4 +287,158 @@ using a `sink` function to materialize the table.
"""
function allcombinations end

const STYLE_INFO = """
One of the uses of the metadata `style` is to decide
how the metadata should be propagated when `x` is transformed. This interface
defines the `:default` style that indicates that metadata should not be propagated
under any operations (it is only preserved when a copy of the source table is
made). All types supporting metadata allow at least this style.
"""

const COL_INFO = """
`col` must have a type that is supported by table `x` for column indexing.
Following the Tables.jl contract `Symbol` and `Int` are always allowed.
Passing `col` that is not a column of `x` throws an error.
"""

"""
metadata(x, key::AbstractString; style::Bool=false)

Return metadata value associated with object `x` for key `key`.
If `x` does not support metadata throw `ArgumentError`.
If `x` supports metadata, but does not have a mapping for `key` throw
`KeyError`.

If `style=true` return a tuple of metadata value and metadata style. Metadata
style is additional information about the kind of metadata that is stored
for the `key`.

$STYLE_INFO
"""
metadata(::T, ::AbstractString; style::Bool=false) where {T} =
    throw(ArgumentError("Objects of type $T do not support getting metadata"))

"""
metadatakeys(x)

Return an iterator of metadata keys for which `metadata(x, key)` returns a
metadata value. If `x` does not support metadata return `()`.
"""
metadatakeys(::Any) = ()

"""
metadata!(x, key::AbstractString, value; style)
Member

Thinking about it, maybe the syntax would be more natural as metadata!(x, key => value)? That would allow extending this in the future to pass multiple pairs if it appears to be convenient.

A counter-argument is that setindex! doesn't use that syntax, but it's almost never called that way since x[key] = value is nicer. Of course both syntaxes could be allowed as they are not ambiguous (we will probably never allow keys to be pairs).
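
To make the suggestion concrete, here is a minimal sketch of how a `Pair` method could sit next to the positional one without ambiguity. `ToyMeta` and the local `metadata!` are hypothetical stand-ins written for this illustration; they are not the DataAPI.jl definitions in this PR.

```julia
# Hypothetical stand-in type; not part of DataAPI.jl.
struct ToyMeta
    table::Dict{String, Tuple{Any, Symbol}}
end
ToyMeta() = ToyMeta(Dict{String, Tuple{Any, Symbol}}())

# Positional form, matching the signature under review.
function metadata!(x::ToyMeta, key::AbstractString, value; style)
    x.table[key] = (value, style)
    return x
end

# Pair form from the suggestion (an assumption, not merged API):
# it only forwards to the positional method.
metadata!(x::ToyMeta, kv::Pair{<:AbstractString}; style) =
    metadata!(x, first(kv), last(kv); style=style)

tm = ToyMeta()
metadata!(tm, "caption", "Sales 2022"; style=:note)
metadata!(tm, "source" => "ERP export"; style=:note)
```

The two methods differ in the number of positional arguments, so dispatch stays unambiguous as long as keys are never pairs.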

Member Author

bkamins Sep 2, 2022

As a first reaction it makes sense.

My only reservation was that in DataFrames.jl => is used for operation specification language, so we would have yet a third way to interpret => there.

The question is how would it look for colmetadata!?

colmetadata!(x, col, key => value; style=style)

(which does not look that nice)

Also note that metadata!(x, key => value) would not be allowed; you would need to write metadata!(x, key => value; style=style).

In summary - I would keep the things as they are here and consider what we design here a low-level API.

I assume that the extra new package planned (tentatively named TableMetadataTools.jl) will provide convenient high-level functions. In practice I even expect that if we define there:

caption!(table, str) = metadata!(table, "caption", str, style=:note)
caption(table) = metadata(table, "caption")
label!(table, col, str) = colmetadata!(table, col, "label", str, style=:note)
label(table, col) = colmetadata(table, col, "label")

this will cover 95% of use cases of metadata in practice.

In summary - I propose to discuss a convenience high-level API in TableMetadataTools.jl, as I expect that in that package we will drop the requirement to specify style which we have in low-level API, as in high level API all styles will be :note.
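
As a rough, self-contained sketch of the high-level helpers described above: `ToyTable` and the local `metadata`, `metadata!`, `colmetadata` and `colmetadata!` below are hypothetical stand-ins that mirror the names in this PR, and the helper names follow the comment. None of this is the API of TableMetadataTools.jl, which is only tentatively named here.

```julia
# Toy container and local stand-ins mirroring the DataAPI.jl names in this PR.
# They are assumptions made for illustration, not the package's methods.
struct ToyTable
    meta::Dict{String, Tuple{Any, Symbol}}
    colmeta::Dict{Symbol, Dict{String, Tuple{Any, Symbol}}}
end
ToyTable() = ToyTable(Dict{String, Tuple{Any, Symbol}}(),
                      Dict{Symbol, Dict{String, Tuple{Any, Symbol}}}())

metadata(t::ToyTable, key::AbstractString) = first(t.meta[key])
function metadata!(t::ToyTable, key::AbstractString, value; style)
    t.meta[key] = (value, style)
    return t
end

colmetadata(t::ToyTable, col::Symbol, key::AbstractString) =
    first(t.colmeta[col][key])
function colmetadata!(t::ToyTable, col::Symbol, key::AbstractString, value; style)
    # Create the per-column dictionary on first use.
    d = get!(() -> Dict{String, Tuple{Any, Symbol}}(), t.colmeta, col)
    d[key] = (value, style)
    return t
end

# High-level helpers in the spirit of the comment; every call uses :note,
# so the caller never has to pass `style`.
caption!(t, str) = metadata!(t, "caption", str; style=:note)
caption(t) = metadata(t, "caption")
label!(t, col, str) = colmetadata!(t, col, "label", str; style=:note)
label(t, col) = colmetadata(t, col, "label")

t = ToyTable()
caption!(t, "Quarterly revenue")
label!(t, :rev, "Revenue (USD)")
```

The helpers hard-code both the key and the style, which is what lets the low-level API keep `style` mandatory.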


Set metadata for object `x` for key `key` to have value `value`
and style `style` and return `x`.
If `x` does not support setting metadata throw `ArgumentError`.

$STYLE_INFO
"""
metadata!(::T, ::AbstractString, ::Any; style) where {T} =
    throw(ArgumentError("Objects of type $T do not support setting metadata"))

"""
deletemetadata!(x, key::AbstractString)

Delete metadata for object `x` for key `key` and return `x`
(if metadata for `key` is not present do not perform any action).
If `x` does not support metadata deletion throw `ArgumentError`.
"""
deletemetadata!(::T, ::AbstractString) where {T} =
    throw(ArgumentError("Objects of type $T do not support metadata deletion"))

"""
emptymetadata!(x)

Delete all metadata for object `x`.
If `x` does not support metadata deletion throw `ArgumentError`.
"""
emptymetadata!(::T) where {T} =
    throw(ArgumentError("Objects of type $T do not support metadata deletion"))

"""
colmetadata(x, col, key::AbstractString; style::Bool=false)

Return metadata value associated with table `x` for column `col` and key `key`.
If `x` does not support metadata for column `col` throw `ArgumentError`. If `x`
supports metadata, but does not have a mapping for column `col` for `key` throw
`KeyError`.

$COL_INFO

If `style=true` return a tuple of metadata value and metadata style. Metadata
style is additional information about the kind of metadata that is stored for
the `key`.

$STYLE_INFO
"""
colmetadata(::T, ::Int, ::AbstractString; style::Bool=false) where {T} =
    throw(ArgumentError("Objects of type $T do not support getting column metadata"))
colmetadata(::T, ::Symbol, ::AbstractString; style::Bool=false) where {T} =
    throw(ArgumentError("Objects of type $T do not support getting column metadata"))

"""
colmetadatakeys(x, [col])

If `col` is passed return an iterator of metadata keys for which
`colmetadata(x, col, key)` returns a metadata value. If `x` does not support
metadata for column `col` return `()`.

`col` must have a type that is supported by table `x` for column indexing.
Following the Tables.jl contract `Symbol` and `Int` are always allowed. Passing
`col` that is not a column of `x` either throws an error (this is a
preferred behavior if it is possible) or returns `()` (this duality is allowed
as some Tables.jl tables do not have a schema).

If `col` is not passed return an iterator of `col => colmetadatakeys(x, col)`
pairs for all columns that have metadata, where each `col` is a `Symbol`.
If `x` does not support column metadata return `()`.
"""
colmetadatakeys(::Any, ::Int) = ()
colmetadatakeys(::Any, ::Symbol) = ()
colmetadatakeys(::Any) = ()

"""
colmetadata!(x, col, key::AbstractString, value; style)

Set metadata for table `x` for column `col` for key `key` to have value `value`
and style `style` and return `x`.
If `x` does not support setting metadata for column `col` throw `ArgumentError`.

$COL_INFO

$STYLE_INFO
"""
colmetadata!(::T, ::Int, ::AbstractString, ::Any; style) where {T} =
    throw(ArgumentError("Objects of type $T do not support setting metadata"))
colmetadata!(::T, ::Symbol, ::AbstractString, ::Any; style) where {T} =
    throw(ArgumentError("Objects of type $T do not support setting metadata"))

"""
deletecolmetadata!(x, col, key::AbstractString)

Delete metadata for table `x` for column `col` for key `key` and return `x`
(if metadata for `key` is not present do not perform any action).
If `x` does not support metadata deletion for column `col` throw `ArgumentError`.
"""
deletecolmetadata!(::T, ::Symbol, ::AbstractString) where {T} =
    throw(ArgumentError("Objects of type $T do not support metadata deletion"))
deletecolmetadata!(::T, ::Int, ::AbstractString) where {T} =
    throw(ArgumentError("Objects of type $T do not support metadata deletion"))

"""
emptycolmetadata!(x, [col])

Delete all metadata for table `x` for column `col`.
If `col` is not passed delete all column level metadata for table `x`.
If `x` does not support metadata deletion for column `col` throw `ArgumentError`.
"""
emptycolmetadata!(::T, ::Symbol) where {T} =
    throw(ArgumentError("Objects of type $T do not support metadata deletion"))
emptycolmetadata!(::T, ::Int) where {T} =
    throw(ArgumentError("Objects of type $T do not support metadata deletion"))
emptycolmetadata!(::T) where {T} =
    throw(ArgumentError("Objects of type $T do not support metadata deletion"))

end # module
131 changes: 131 additions & 0 deletions test/runtests.jl
@@ -10,6 +10,79 @@ Base.size(x::TestArray) = size(x.x)
Base.getindex(x::TestArray, i) = x.x[i]
DataAPI.levels(x::TestArray) = reverse(DataAPI.levels(x.x))

# An example implementation of metadata
# For simplicity Int col indexing is not implemented
# and no checking if col is a column of a table is performed

struct TestMeta
    table::Dict{String, Any}
    col::Dict{Symbol, Dict{String, Any}}

    TestMeta() = new(Dict{String, Any}(), Dict{Symbol, Dict{String, Any}}())
end

function DataAPI.metadata(x::TestMeta, key::AbstractString; style::Bool=false)
    return style ? x.table[key] : x.table[key][1]
end

DataAPI.metadatakeys(x::TestMeta) = keys(x.table)

function DataAPI.metadata!(x::TestMeta, key::AbstractString, value; style)
    x.table[key] = (value, style)
    return x
end

# Return `x`, as the `deletemetadata!` contract requires
DataAPI.deletemetadata!(x::TestMeta, key::AbstractString) = (delete!(x.table, key); x)
DataAPI.emptymetadata!(x::TestMeta) = empty!(x.table)

function DataAPI.colmetadata(x::TestMeta, col::Symbol, key::AbstractString; style::Bool=false)
    return style ? x.col[col][key] : x.col[col][key][1]
end

function DataAPI.colmetadatakeys(x::TestMeta, col::Symbol)
    haskey(x.col, col) && return keys(x.col[col])
    return ()
end

function DataAPI.colmetadatakeys(x::TestMeta)
    isempty(x.col) && return ()
    return (col => keys(x.col[col]) for col in keys(x.col))
end

function DataAPI.colmetadata!(x::TestMeta, col::Symbol, key::AbstractString, value; style)
    if haskey(x.col, col)
        x.col[col][key] = (value, style)
    else
        # Match the value type declared for the `col` field
        x.col[col] = Dict{String, Any}(key => (value, style))
    end
    return x
end

function DataAPI.deletecolmetadata!(x::TestMeta, col::Symbol, key::AbstractString)
    if haskey(x.col, col)
        delete!(x.col[col], key)
    else
        throw(ArgumentError("column $col not found"))
    end
    return x
end

function DataAPI.emptycolmetadata!(x::TestMeta, col::Symbol)
    if haskey(x.col, col)
        delete!(x.col, col)
    else
        throw(ArgumentError("column $col not found"))
    end
    return x
end

DataAPI.emptycolmetadata!(x::TestMeta) = empty!(x.col)

@testset "DataAPI" begin

@testset "defaultarray" begin
@@ -173,4 +246,62 @@ end
@test DataAPI.unwrap(missing) === missing
end

@testset "metadata" begin
    @test_throws ArgumentError DataAPI.metadata!(1, "a", 10, style=:default)
    @test_throws ArgumentError DataAPI.deletemetadata!(1, "a")
    @test_throws ArgumentError DataAPI.emptymetadata!(1)
    @test_throws ArgumentError DataAPI.metadata(1, "a")
    @test_throws ArgumentError DataAPI.metadata(1, "a", style=true)
    @test DataAPI.metadatakeys(1) == ()

    @test_throws ArgumentError DataAPI.colmetadata!(1, :col, "a", 10, style=:default)
    @test_throws ArgumentError DataAPI.deletecolmetadata!(1, :col, "a")
    @test_throws ArgumentError DataAPI.emptycolmetadata!(1, :col)
    @test_throws ArgumentError DataAPI.deletecolmetadata!(1, 1, "a")
    @test_throws ArgumentError DataAPI.emptycolmetadata!(1, 1)
    @test_throws ArgumentError DataAPI.emptycolmetadata!(1)
    @test_throws ArgumentError DataAPI.colmetadata(1, :col, "a")
    @test_throws ArgumentError DataAPI.colmetadata(1, :col, "a", style=true)
    @test_throws ArgumentError DataAPI.colmetadata!(1, 1, "a", 10, style=:default)
    @test_throws ArgumentError DataAPI.colmetadata(1, 1, "a")
    @test_throws ArgumentError DataAPI.colmetadata(1, 1, "a", style=true)
    @test DataAPI.colmetadatakeys(1, :col) == ()
    @test DataAPI.colmetadatakeys(1, 1) == ()
    @test DataAPI.colmetadatakeys(1) == ()

    tm = TestMeta()
    @test isempty(DataAPI.metadatakeys(tm))
    @test DataAPI.metadata!(tm, "a", "100", style=:note) == tm
    @test collect(DataAPI.metadatakeys(tm)) == ["a"]
    @test_throws KeyError DataAPI.metadata(tm, "b")
    @test_throws KeyError DataAPI.metadata(tm, "b", style=true)
    @test DataAPI.metadata(tm, "a") == "100"
    @test DataAPI.metadata(tm, "a", style=true) == ("100", :note)
    DataAPI.deletemetadata!(tm, "a")
    @test isempty(DataAPI.metadatakeys(tm))
    @test DataAPI.metadata!(tm, "a", "100", style=:note) == tm
    DataAPI.emptymetadata!(tm)
    @test isempty(DataAPI.metadatakeys(tm))

    @test DataAPI.colmetadatakeys(tm) == ()
    @test DataAPI.colmetadatakeys(tm, :col) == ()
    @test DataAPI.colmetadata!(tm, :col, "a", "100", style=:note) == tm
    @test [k => collect(v) for (k, v) in DataAPI.colmetadatakeys(tm)] == [:col => ["a"]]
    @test collect(DataAPI.colmetadatakeys(tm, :col)) == ["a"]
    @test_throws KeyError DataAPI.colmetadata(tm, :col, "b")
    @test_throws KeyError DataAPI.colmetadata(tm, :col, "b", style=true)
    @test_throws KeyError DataAPI.colmetadata(tm, :col2, "a")
    @test_throws KeyError DataAPI.colmetadata(tm, :col2, "a", style=true)
    @test DataAPI.colmetadata(tm, :col, "a") == "100"
    @test DataAPI.colmetadata(tm, :col, "a", style=true) == ("100", :note)
    DataAPI.deletecolmetadata!(tm, :col, "a")
    @test isempty(DataAPI.colmetadatakeys(tm, :col))
    @test DataAPI.colmetadata!(tm, :col, "a", "100", style=:note) == tm
    DataAPI.emptycolmetadata!(tm, :col)
    @test isempty(DataAPI.colmetadatakeys(tm, :col))
    @test DataAPI.colmetadata!(tm, :col, "a", "100", style=:note) == tm
    DataAPI.emptycolmetadata!(tm)
    @test isempty(DataAPI.colmetadatakeys(tm))
end

end # @testset "DataAPI"
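
The test implementation above deliberately skips `Int` column indexing and does not check that `col` is a column of the table. A possible way to cover both is sketched below, assuming a table that knows its column names. `NamedMeta`, `resolvecol` and the local `colmetadata*` functions are hypothetical stand-ins that mirror the DataAPI.jl names; they are not part of this PR.

```julia
# Hypothetical table type with a fixed column order; not part of DataAPI.jl.
struct NamedMeta
    names::Vector{Symbol}
    col::Dict{Symbol, Dict{String, Tuple{Any, Symbol}}}
end
NamedMeta(names::Vector{Symbol}) =
    NamedMeta(names, Dict{Symbol, Dict{String, Tuple{Any, Symbol}}}())

# Resolve a column reference to its name, throwing if it is not a column.
function resolvecol(x::NamedMeta, col::Symbol)
    col in x.names || throw(ArgumentError("column $col not found"))
    return col
end
function resolvecol(x::NamedMeta, col::Int)
    1 <= col <= length(x.names) || throw(ArgumentError("column $col not found"))
    return x.names[col]
end

function colmetadata!(x::NamedMeta, col::Union{Symbol, Int},
                      key::AbstractString, value; style)
    d = get!(() -> Dict{String, Tuple{Any, Symbol}}(), x.col, resolvecol(x, col))
    d[key] = (value, style)
    return x
end

function colmetadata(x::NamedMeta, col::Union{Symbol, Int},
                     key::AbstractString; style::Bool=false)
    # Throws KeyError when the column or the key has no stored metadata.
    value, st = x.col[resolvecol(x, col)][key]
    return style ? (value, st) : value
end

function colmetadatakeys(x::NamedMeta, col::Union{Symbol, Int})
    name = resolvecol(x, col)
    return haskey(x.col, name) ? keys(x.col[name]) : ()
end

nm = NamedMeta([:id, :price])
colmetadata!(nm, 2, "label", "Unit price"; style=:note)  # set by Int index
colmetadata(nm, :price, "label"; style=true)             # read the same entry by name
```

Funnelling both index types through one resolver keeps the `Symbol` and `Int` methods from drifting apart, at the cost of requiring a known schema.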