update to use RegistryInstances #68

Merged on Oct 1, 2022 (4 commits).
4 changes: 3 additions & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "PackageAnalyzer"
uuid = "e713c705-17e4-4cec-abe0-95bf5bf3e10c"
authors = ["Mosè Giordano <[email protected]>"]
version = "0.1.1"
version = "1.0.0"

[deps]
Git = "d7ba0133-e1db-5d97-8f8c-041e4b3a1eb2"
@@ -10,6 +10,7 @@ JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
LicenseCheck = "726dbf0d-6eb6-41af-b36c-cd770e0f00cc"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
RegistryInstances = "2792f1a3-b283-48e8-9a74-f99dce5104f3"
TOML = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
Tokei_jll = "3ac119c9-1236-5556-b556-adc8150b0244"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
@@ -19,6 +20,7 @@ Git = "1.2.1"
GitHub = "5.4"
JSON3 = "1.5.1"
LicenseCheck = "0.2"
RegistryInstances = "0.1"
julia = "1.6"

[extras]
11 changes: 6 additions & 5 deletions docs/src/index.md
@@ -32,10 +32,9 @@ a package in a locally-installed registry (the General registry is checked by de
*NOTE*: the Git repository of the package will be cloned, in order to inspect
its content.

You can also pass a [`RegistryEntry`](@ref), a simple datastructure which points
to the directory of the package in the registry, where the file `Package.toml`
is stored. The function [`find_package`](@ref) gives you the
[`RegistryEntry`](@ref) of a package in your local copy of any registry, by
You can also pass a [`PkgEntry`](@ref) from RegistryInstances.jl.
The function [`find_package`](@ref) gives you the
[`PkgEntry`](@ref) of a package in your local copy of any registry, by
default the [General registry](https://github.com/JuliaRegistries/General).
`find_package` is invoked automatically when you pass the name of a package.

@@ -146,7 +145,7 @@ To run the analysis for multiple packages you can either use broadcasting
```julia
analyze.(registry_entries)
```
or use the method `analyze(registry_entries::AbstractVector{<:RegistryEntry})` which
or use the method `analyze(pkg_entries::AbstractVector{<:PkgEntry})` which
runs the analysis with multiple threads.

You can use the function [`find_packages`](@ref) to find all packages in a given
@@ -167,6 +166,8 @@ Do not abuse this function! Consider using the in-place function `analyze!(root,
!!! warning
Cloning all the repos in General will take more than 20 GB of disk space and can take up to a few hours to complete.

You can use the `reachable_registries()` function from RegistryInstances.jl to find other `RegistryInstance` objects to pass as the `registry` keyword argument.
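
As a rough sketch of how this might look in practice (assuming the registry you want is installed in one of your depots; the General registry's UUID is used here only as a stand-in for whichever registry you are after, and `Measurements` is just an example package name):

```julia
using UUIDs
using PackageAnalyzer
using RegistryInstances: reachable_registries

# All registries installed in the current depots.
registries = reachable_registries()

# Select a registry by UUID. This is the UUID of General; substitute the
# UUID of the registry you actually want to query.
wanted = UUID("23338594-aafe-5451-b93e-139f81909106")
idx = findfirst(r -> r.uuid == wanted, registries)
idx === nothing && error("Registry $wanted not found; is it installed?")

# Look the package up in that registry rather than in the default one.
entry = find_package("Measurements"; registry = registries[idx])
println(entry.name, " ", entry.uuid)
```

The resulting `PkgEntry` can then be passed to `analyze` like any other entry.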

## License information

The `license_files` field of the `Package` object is a [`Tables.jl`](https://github.com/JuliaData/Tables.jl) row table
98 changes: 49 additions & 49 deletions src/PackageAnalyzer.jl
@@ -8,15 +8,20 @@ using JSON3 # for interfacing with `tokei` to count lines of code
using Tokei_jll # count lines of code
using GitHub # Use GitHub API to get extra information about the repo
using Git
using RegistryInstances

export general_registry, find_package, find_packages
export analyze, analyze!

# borrowed from <https://github.com/JuliaRegistries/RegistryTools.jl/blob/77cae9ef6a075e1d6ec1592bc3e166234d3f01c8/src/builtin_pkgs.jl>
# borrowed from <https://github.com/JuliaRegistries/RegistryTools.jl/blob/841a56d8274e2857e3fd5ea993ba698cdbf51849/src/builtin_pkgs.jl>
const stdlibs = isdefined(Pkg.Types, :stdlib) ? Pkg.Types.stdlib : Pkg.Types.stdlibs
const STDLIBS = stdlibs()
# Julia 1.8 changed from `name` to `(name, version)`.
get_stdlib_name(s::AbstractString) = s
get_stdlib_name(s::Tuple) = first(s)
const STDLIBS = Dict(k => get_stdlib_name(v) for (k, v) in stdlibs())

include("count_loc.jl")

const LicenseTableEltype=@NamedTuple{license_filename::String, licenses_found::Vector{String}, license_file_percent_covered::Float64}
const ContributionTableElType=@NamedTuple{login::Union{String,Missing}, id::Union{Int,Missing}, name::Union{String,Missing}, type::String, contributions::Int}

@@ -66,16 +71,6 @@ function Package(name, uuid, repo;
license_files, licenses_in_project, lines_of_code, contributors)
end

"""
RegistryEntry(path::String)

Light data structure pointing to the directory where an entry of a registry is
stored.
"""
struct RegistryEntry
path::String
end

# define `isequal`, `==`, and `hash` just in terms of the fields
for f in (:isequal, :(==))
@eval begin
@@ -167,13 +162,22 @@ function Base.show(io::IO, p::Package)
print(io, strip(body))
end

const GENERAL_REGISTRY_UUID = UUID("23338594-aafe-5451-b93e-139f81909106")

"""
general_registry() -> String
general_registry() -> RegistryInstance

Guess the path of the General registry.
Return the `RegistryInstance` associated to the General registry.
"""
general_registry() =
first(joinpath(d, "registries", "General") for d in Pkg.depots() if isfile(joinpath(d, "registries", "General", "Registry.toml")))
function general_registry()
registries = reachable_registries()
idx = findfirst(r -> r.uuid == GENERAL_REGISTRY_UUID, registries)
if idx === nothing
throw(ArgumentError("Could not find General registry! Is it installed?"))
else
return registries[idx]
end
end


"""
@@ -183,15 +187,15 @@ Returns the [RegistryEntry](@ref) for the package `pkg`.
The singular version of [`find_packages`](@ref).
"""
function find_package(pkg::AbstractString; registry=general_registry())
registry_entries = find_packages([pkg]; registry)
if isempty(registry_entries)
pkg_entries = find_packages([pkg]; registry)
if isempty(pkg_entries)
if pkg ∈ values(STDLIBS)
throw(ArgumentError("Standard library $pkg not present in registry"))
else
throw(ArgumentError("$pkg not found in registry"))
end
end
return only(registry_entries)
return only(pkg_entries)
end

"""
@@ -213,32 +217,31 @@ find_packages(names::AbstractString...; registry = general_registry()) = find_p

function find_packages(names; registry = general_registry())
if names !== nothing
entries = RegistryEntry[]
entries = PkgEntry[]
for name in names
path = joinpath(registry, string(uppercase(first(name))), name)
if isdir(path)
push!(entries, RegistryEntry(path))
uuids = uuids_from_name(registry, name)
if length(uuids) > 1
error("There is more than one package with name $(name)! These have UUIDs $uuids")
elseif length(uuids) == 1
push!(entries, registry.pkgs[only(uuids)])
elseif name ∉ values(STDLIBS)
@error("Could not find package in registry!", name, path)
@error("Could not find package in registry!", name)
end
end
return entries
end
end

# The UUID of the "julia" pseudo-package in the General registry
const JULIA_UUID = "1222c4b2-2114-5bfd-aeef-88e4692bbb3e"
const JULIA_UUID = UUID("1222c4b2-2114-5bfd-aeef-88e4692bbb3e")

function find_packages(; registry = general_registry(),
filter = (uuid, p) -> !endswith(p["name"], "_jll") && uuid != JULIA_UUID)
# Get the list of packages in the registry by parsing the `Registry.toml`
# file in the given directory.
packages = TOML.parsefile(joinpath(registry, "Registry.toml"))["packages"]
# Get the directories of all packages. Filter out JLL packages: they are
filter = ((uuid, p),) -> !endswith(p.name, "_jll") && uuid != JULIA_UUID)
# Get the `PkgEntry` objects of all packages in the registry. Filter out JLL packages: they are
# automatically generated and we know that they don't have testing nor
# documentation. We also filter out the "julia" package which is not a real
# package and just points at the Julia source code.
return [RegistryEntry(joinpath(registry, splitpath(p["path"])...)) for (uuid, p) in packages if filter(uuid, p)]
return collect(values(Base.filter(filter, registry.pkgs)))
end


@@ -274,16 +277,13 @@ the list of contributors to the repository is also collected, after waiting for
`sleep` seconds. Only the number of contributors will be shown in the summary.
See [`PackageAnalyzer.github_auth`](@ref) to obtain a GitHub authentication.
"""
function analyze!(root, pkg::RegistryEntry; auth::GitHub.Authorization=github_auth(), sleep=0)
# Parse the `Package.toml` file in the given directory.
toml = TOML.parsefile(joinpath(pkg.path, "Package.toml"))
name = toml["name"]::String
uuid_string = toml["uuid"]::String
uuid = UUID(uuid_string)
repo = toml["repo"]::String
subdir = get(toml, "subdir", "")::String

dest = joinpath(root, uuid_string)
function analyze!(root, pkg::PkgEntry; auth::GitHub.Authorization=github_auth(), sleep=0)
name = pkg.name
uuid = pkg.uuid
info = registry_info(pkg)
repo = info.repo
subdir = something(info.subdir, "")
dest = joinpath(root, string(uuid))

isdir(dest) && return analyze_path(dest; repo, subdir, auth, sleep)

@@ -324,9 +324,9 @@ function analyze_path!(dest::AbstractString, repo::AbstractString; name="", uuid
end

"""
analyze!(root, registry_entries::AbstractVector{<:RegistryEntry}; auth::GitHub.Authorization=github_auth(), sleep=0) -> Vector{Package}
analyze!(root, pkg_entries::AbstractVector{<:PkgEntry}; auth::GitHub.Authorization=github_auth(), sleep=0) -> Vector{Package}

Analyze all packages in the iterable `registry_entries`, using threads, cloning them to `root`
Analyze all packages in the iterable `pkg_entries`, using threads, cloning them to `root`
if a directory with their `uuid` does not already exist. Returns a
`Vector{Package}`.

@@ -336,13 +336,13 @@ for `sleep` seconds for each entry (useful to avoid getting rate-limited by
GitHub). See [`PackageAnalyzer.github_auth`](@ref) to obtain a GitHub
authentication.
"""
function analyze!(root, registry_entries::AbstractVector{RegistryEntry}; auth::GitHub.Authorization=github_auth(), sleep=0)
inputs = Channel{Tuple{Int, RegistryEntry}}(length(registry_entries))
for (i,r) in enumerate(registry_entries)
function analyze!(root, pkg_entries::AbstractVector{PkgEntry}; auth::GitHub.Authorization=github_auth(), sleep=0)
inputs = Channel{Tuple{Int, PkgEntry}}(length(pkg_entries))
for (i,r) in enumerate(pkg_entries)
put!(inputs, (i,r))
end
close(inputs)
outputs = Channel{Tuple{Int, Package}}(length(registry_entries))
outputs = Channel{Tuple{Int, Package}}(length(pkg_entries))
Threads.foreach(inputs) do (i, r)
put!(outputs, (i, analyze!(root, r; auth, sleep)))
end
@@ -351,8 +351,8 @@ function analyze!(root, registry_entries::AbstractVector{RegistryEntry}; auth::G
end

"""
analyze(package::RegistryEntry; auth::GitHub.Authorization=github_auth(), sleep=0) -> Package
analyze(packages::AbstractVector{<:RegistryEntry}; auth::GitHub.Authorization=github_auth(), sleep=0) -> Vector{Package}
analyze(package::PkgEntry; auth::GitHub.Authorization=github_auth(), sleep=0) -> Package
analyze(packages::AbstractVector{<:PkgEntry}; auth::GitHub.Authorization=github_auth(), sleep=0) -> Vector{Package}

Analyzes a package or list of packages using the information in their directory
in a registry by creating a temporary directory and calling `analyze!`,
29 changes: 14 additions & 15 deletions test/runtests.jl
@@ -1,8 +1,9 @@
using Test, UUIDs
using PackageAnalyzer
using PackageAnalyzer: parse_project, RegistryEntry
using PackageAnalyzer: parse_project
using JLLWrappers
using GitHub
using RegistryInstances

get_libpath() = get(ENV, JLLWrappers.LIBPATH_env, nothing)
const orig_libpath = get_libpath()
@@ -11,36 +12,34 @@ const auth = GitHub.AnonymousAuth()

@testset "PackageAnalyzer" begin
general = general_registry()
@test isdir(general)
@test all(p -> isdir(p.path), find_packages())
@test find_package("julia") ∉ find_packages()
@test all(p -> isdir(p.path), find_packages("Flux"))
@test isdir(find_package("Flux").path)
@test general isa RegistryInstance
# Test some properties of the `Measurements` package. NOTE: they may change
# in the future!
measurements = analyze(RegistryEntry(joinpath(general, "M", "Measurements")); auth)
measurements = analyze(find_package("Measurements"); auth)
@test measurements.uuid == UUID("eff96d63-e80a-5855-80a2-b1b0885c5ab7")
@test measurements.reachable
@test measurements.docs
@test measurements.runtests
@test !measurements.buildkite
@test !isempty(measurements.lines_of_code)
packages = find_packages("Cuba", "PolynomialRoots")
# Test results of a couple of packages. Same caveat as above
packages = [RegistryEntry(joinpath(general, p...)) for p in (("C", "Cuba"), ("P", "PolynomialRoots"))]
@test Set(packages) == Set(find_packages("Cuba", "PolynomialRoots")) == Set(find_packages(["Cuba", "PolynomialRoots"]))
@test packages ⊆ find_packages()
uuids = only.(uuids_from_name.(Ref(general), ["Cuba", "PolynomialRoots"]))
# We compare by UUID, since other fields may be initialized or not
@test Set(uuids) == Set([x.uuid for x in packages]) == Set([x.uuid for x in find_packages(["Cuba", "PolynomialRoots"])])
@test uuids ⊆ [x.uuid for x in find_packages()]
results = analyze(packages; auth)
cuba, polyroots = results
@test length(filter(p -> p.reachable, results)) == 2
@test length(filter(p -> p.runtests, results)) == 2
@test cuba.drone
@test cuba.cirrus
@test !polyroots.docs # Documentation is in the README!
# We can also use broadcasting!
@test Set(results) == Set(analyze.(packages; auth))

# Test `analyze!` directly
mktempdir() do root
measurements2 = analyze!(root, RegistryEntry(joinpath(general, "M", "Measurements")); auth)
measurements2 = analyze!(root, find_package("Measurements"); auth)
@test isequal(measurements, measurements2)
@test isdir(joinpath(root, "eff96d63-e80a-5855-80a2-b1b0885c5ab7")) # not cleaned up yet
end
@@ -70,7 +69,7 @@ end

# the tests folder isn't a package!
# But this helps catch issues in error paths for when things go wrong
bad_pkg = analyze("."; auth)
bad_pkg = analyze(@__DIR__; auth)
@test bad_pkg.repo == ""
@test bad_pkg.uuid == UUID(UInt128(0))
@test !bad_pkg.cirrus
@@ -172,11 +171,11 @@ end

# has `license = "MIT"`
project_1 = (; name = "PackageAnalyzer", uuid = UUID("e713c705-17e4-4cec-abe0-95bf5bf3e10c"), licenses_in_project=["MIT"])
@test parse_project("license_in_project") == project_1
@test parse_project(joinpath(@__DIR__, "license_in_project")) == project_1

# has `license = ["MIT", "GPL"]`
project_2 = (; name = "PackageAnalyzer", uuid = UUID("e713c705-17e4-4cec-abe0-95bf5bf3e10c"), licenses_in_project=["MIT", "GPL"])
@test parse_project("licenses_in_project") == project_2
@test parse_project(joinpath(@__DIR__, "licenses_in_project")) == project_2
end

@testset "`show`" begin