standardize and export getter names #190

cecileane · 2022-12-15T02:39:46Z

@crsl4 @pbastide this is ready for discussion, started on #80. I went with these principles:

use verbs, like get... or is... or has....
use edge as suffix if the function returns an edge, instead of a node. No suffix if the function returns a node, to get shorter names.
use minor as a suffix if the function returns the minor instead of major edge. No suffix for major, again for shorter names, also because there's no minor option in some cases (e.g. parent node of a tree node).
use a verb and a preposition if the function has 2 arguments, to follow the pattern I see in other julia functions like isa, indexin, typeof etc.

Questions:

Do you agree that we should export them? Personally, think it's time and they would get used. example: PhyNEST.jl requires PhyloNetworks, and they needed to define their own GetChild and childnode functions.
Any recommendations to change the names? Should we adopt different principles? Should the order of suffixes edge and minor be switched, when both are used?
Any other function we should define?
Any other function we should rename? For example, getOtherNode(edge, node)?

What I'll do later: think about descendants and functions to get node ages or distances between nodes.

Below is a recap of the name changes so far.

new functions:

For most of them, the goal is to avoid using the internal structure (e.g. node.leaf), which is not recommended. These function will be inlined by the compiler, so there is no cost to performance

getroot(network)
isrootof(node, network)
isleaf(node)
isexternal(edge)
isparentof(node, edge)
ischildof(node, edge)
hassinglechild(node) # used to get "the" child of a node, typically a hybrid node

existing functions, renamed and exported:

getchild(edge) # replaces getChild
getchild(node) # new method
getchildren(node)  # replaces getChildren
getchildedge(node) # replaced getChildEdge
getparents(node) # replaces getParents
getparent(edge) # replaces getParent
getparent(node) # replaces getMajorParent
getparentminor(node) # replaces getMinorParent
getparentedge(node)  # replaces getMajorParentEdge
getparentedgeminor(node) # replaces getMinorParentEdge
getpartneredge(edge) # replaces getPartner

…, ischildof

codecov · 2022-12-15T02:52:09Z

Codecov Report

Merging #190 (1112fbc) into master (4ee407a) will increase coverage by 0.02%.
The diff coverage is 97.66%.

@@            Coverage Diff             @@
##           master     #190      +/-   ##
==========================================
+ Coverage   86.27%   86.29%   +0.02%     
==========================================
  Files          29       29              
  Lines       12561    12560       -1     
==========================================
+ Hits        10837    10839       +2     
+ Misses       1724     1721       -3

Impacted Files	Coverage Δ
src/addHybrid_snaq.jl	`60.33% <0.00%> (ø)`
src/types.jl	`82.71% <ø> (ø)`
src/pairwiseDistanceLS.jl	`95.70% <83.33%> (ø)`
src/auxiliary.jl	`90.67% <98.68%> (+0.53%)`	⬆️
src/addHybrid.jl	`94.62% <100.00%> (ø)`
src/bootstrap.jl	`70.26% <100.00%> (ø)`
src/compareNetworks.jl	`99.32% <100.00%> (ø)`
src/deleteHybrid.jl	`84.34% <100.00%> (ø)`
src/graph_components.jl	`99.15% <100.00%> (ø)`
src/interop.jl	`100.00% <100.00%> (ø)`
... and 7 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

cecileane · 2022-12-15T02:54:23Z

I went with standard deprecations without forcing warnings even when depwarn=no because doing so is not recommended.

for individual analyses, users will be able to reproduce their analyses even if they used internal functions like PhyloNetworks.getChild, assuming that they keep their Manifest file and re-instantiate their project from the Manifest.
for packages that depend on PhyloNetworks (like PhyloPlots or PhyNEST), semantic versioning will prevent them from updating to the new version of PhyloNetworks without the developers intentionally increasing their "compat" entry for PhyloNetworks. When the developers will do so, they will first run their package's tests, and will see the deprecation warnings, so they will know what new function names to use.

We'll need to remember to increase the minor version before this change makes it to a new release.

cecileane · 2022-12-15T03:24:12Z

For users, the documentation has information

under Library > Public > basic utilities (to view all docstrings)
and under Manual > Network manipulation for an organized list. We can discuss this list too.

cecileane · 2023-01-04T14:09:33Z

Any other thoughts on this: about the new names? about exporting them?

pbastide · 2023-01-05T10:13:16Z

Thank you Cécile for working on this !

I like the new names, and I agree that we should export them. The standard deprecation mechanism you chose seem indeed appropriate.

I have only maybe one comment about the functions : there are a lot of them, which could make them difficult to discover for the users. You already link some of them in the doc with a "See also" paragraph. Maybe we could increase the links there ? Eg refer to getchild in getparent, and vice versa ?

An other approach we could take is to use keywords arguments. So eg instead of having 4 functions: getparent, getparentminor, getparentedge, getparentedgeminor, we could have only one function with two keywords:

getparent(node; parent="node", type="major")

So that default getparent is the same as before, but we would use getparent(node, type="minor") to get the minor parent node for instance, etc.
The question then is how to choose the right keywords, these could also be booleans, eg

getparent(node; edge=false, major=true)

With this semantic, we have only one function, and all the features are easily discoverable (all are documented in the same docstring).

However, some drawbacks are that (1) the function is more complex; (2) there might be some strange combinations (eg getchild(edge, edge=true) does not exist with the named functions) and (3) it makes it a bit more difficult to replace existing functions (eg getMinorParent would need to be replaced by getparent(node, major=false)), and the call is maybe less transparent.

Anyway, these are just some thoughts, and I'm happy with the current naming scheme.

cecileane · 2023-01-08T02:36:19Z

Thank you Paul!

I have only maybe one comment about the functions : there are a lot of them, which could make them difficult to discover for the users. You already link some of them in the doc with a "See also" paragraph. Maybe we could increase the links there ? Eg refer to getchild in getparent, and vice versa ?

Ah yes absolutely (duh!!). It would greatly help discoverability.

An other approach we could take is to use keywords arguments. So eg instead of having 4 functions: getparent, getparentminor, getparentedge, getparentedgeminor, we could have only one function with two keywords:
getparent(node; parent="node", type="major")

That's a very neat idea. My main worry is performance. Code that has forks / branches (depending on keywords) could become slower. Multiple dispatch is super-efficient in Julia, so having 2 methods for parent(edge) and parent(node) is extremely efficient. But with forks, I'm afraid we would lose efficiency. I haven't tried to implement the idea and run benchmarks, though.

With this semantic, we have only one function, and all the features are easily discoverable (all are documented in the same docstring).

Aha, that's another great idea: we could, in fact, document all these functions in a single docstring. That could actually be much better than many different docstrings that refer to each other. For short docstrings at least.

cecileane · 2023-01-15T19:16:56Z

There could be a small overhead to use keyword arguments instead of different function names, based on the silly example below, so I'd rather stick to different functions.

I consolidated the docstrings (7 fewer than earlier). You can check a preview of the documentation starting around here.

I also fixed a function that would have bugged on networks with a hybrid ladder (sorry for adding unrelated changes!), and I plan to improve the docstring for the function that calculates the hardwired-cluster "distance".

function foo(x; type=:add, what=:one)
  if what == :one
    if type == :add
      fun = addone
    else
      fun = subone
    end
  else
    if type == :add
      fun = addtwo
    else
      fun = subtwo
    end
  end
  return fun(x)
end

@noinline function addone(x) return x .+ 1; end
@noinline function addtwo(x) return x .+ 2; end
@noinline function subone(x) return x .- 1; end
@noinline function subtwo(x) return x .- 2; end

foo(5.5)
@benchmark addone(rand(1)) # mean:  89.525 ns
@benchmark foo(rand(1))    # mean: 100.739 ns

pbastide · 2023-01-24T10:44:17Z

Thank you for checking the keyword option, I hadn't think of the computational overhead.
The separate functions with the common docstrings seems like the best of both worlds.

Thanks !

cecileane added 4 commits December 9, 2022 18:08

add utilities; standardize getter names: start

268271f

parent / child accessors: lower-case names

f80c60c

isleaf, isexternal, getparentedgeminor, getroot, isrootof, isparentof…

082d04d

…, ischildof

deprecations, getpartner -> getpartneredge

432a9fa

doc: new section to list utilities

e375987

fix jldoctest

6e54c51

cecileane added 2 commits January 14, 2023 20:51

consolidated docstrings (-7 of them); new tests for better coverage

3d1a6a2

get and setGammas: fixed to work with non-tree-child networks

e1f8c7e

documentation for clusters and hardwired cluster distance

1112fbc

cecileane merged commit 948975d into master Jan 22, 2023

cecileane deleted the dev branch January 22, 2023 16:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

standardize and export getter names #190

standardize and export getter names #190

cecileane commented Dec 15, 2022

codecov bot commented Dec 15, 2022 •

edited

Loading

cecileane commented Dec 15, 2022

cecileane commented Dec 15, 2022

cecileane commented Jan 4, 2023

pbastide commented Jan 5, 2023

cecileane commented Jan 8, 2023

cecileane commented Jan 15, 2023

pbastide commented Jan 24, 2023

standardize and export getter names #190

standardize and export getter names #190

Conversation

cecileane commented Dec 15, 2022

new functions:

existing functions, renamed and exported:

codecov bot commented Dec 15, 2022 • edited Loading

Codecov Report

cecileane commented Dec 15, 2022

cecileane commented Dec 15, 2022

cecileane commented Jan 4, 2023

pbastide commented Jan 5, 2023

cecileane commented Jan 8, 2023

cecileane commented Jan 15, 2023

pbastide commented Jan 24, 2023

codecov bot commented Dec 15, 2022 •

edited

Loading