Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add entrypoints #274

Closed
wants to merge 3 commits into from
Closed

add entrypoints #274

wants to merge 3 commits into from

Conversation

renxida
Copy link
Contributor

@renxida renxida commented Oct 4, 2017

  • added entrypoints flowcal and FlowCal to launch excel_ui in setup.py
  • refactored FlowCal/excel_ui.py to accomodate those changes
    • added main function
    • imported sys
    • my editor automatically clears lines with only tabs and spaces. Sorry for the mess

Tested in fresh anaconda python3 and python2 environments on Ubuntu Linux 16.10.

@castillohair
Copy link
Collaborator

Looks ok codewise. I'll test personally later.

Can you confirm that the following is true?:

  • Both FlowCal and flowcal are valid commands, and they both work from anywhere in the file system.
  • Both respond to flags the same way as python -m FlowCal.excel_ui.
  • If no input file is specified with flag -i, a window opens to select an input file.
  • python -m FlowCal.excel_ui still works as before.
  • If you have access to a windows and/or OSX machine, it will be useful to confirm that this works there.

@JS3xton do you have any comments?

@JS3xton
Copy link
Contributor

JS3xton commented Oct 4, 2017

  • It's a little confusing to have both a run function and a main function in the excel_ui module.
    • Can (/should?) we combine them?
      • Is anything (e.g. an external functionality that we're employing) expecting a run function?
      • I believe the python -m <module-name> interface simply executes code in a if __name__ == '__main__': block, so it's currently executing all of the argparse code and then calling run.
      • We could pull the argparse code into run and then call run both in the if __name__ == '__main__': block and as the console_scripts entry point. @castillohair not sure if you want to pollute what is otherwise a command line-agnostic run function with all of the argparse code, though. (could prioritize values explicitly specified to run by the user and fall back on values extracted by argparse if the user did not specify a required parameter, but I honestly don't know if I like this behavior).
    • If not, can we clarify them?
      • It looks like console_scripts entry points can point to any module function (not one specifically named "main"). Can we rename main to something else? Like run_command_line?
      • main (or whatever it ends up getting called) should be documented (i.e. via a docstring). The documentation for run and main should make it clear how the two functions are different (either implicitly or explicitly) and maybe even provide intended use cases.
  • sys.argv (a list, i.e. a mutable data type) as the default argument to main could be bad practice? (see here and here) This is kinda a special use case, though, since sys is maintaining the list, so I'm not sure if these bugs could still occur. My preference is to use a sentinel value, though, unless you can demonstrate that these bugs will never happen.
  • Good compromise to allow both FlowCal and flowcal from the command line. 👍

@renxida
Copy link
Contributor Author

renxida commented Oct 5, 2017

@castillohair All confirmed, except windows / osx part.

@renxida
Copy link
Contributor Author

renxida commented Oct 5, 2017

@JS3xton "main" is a pretty standard name for a function that parses command-line arguments and runs the actual main function in the program. In both Python and C. I don't feel uncomfortable about having an argument parsing function and naming it main.

Compared to

def main (args = None)
    if args is None:
        args=sys.argv

I prefer main(args = sys.argv) because it signifies the role of the main function as an entrypoint that takes the arguments from the system and parses them. Also, args = sys.argv just sets args to be a reference to sys.argv, making no difference at all. If we'll be changing sys.argv and if we don't want the changes to carry from one call on main to another call, the only way to do it would be to import copy and then make a deep copy of sys.argv and then operate on it. But it is not necessary because problems only happen when both of these conditions hold true:

1. `args` is mutated by `main()`
2. `main` is called several times without arguments

but argparse doesn't really change the value of args as it parses the arguments, and main()never gets executed twice, unless we're using unit tests to test main(), in which case we will be passing arguments to main to replace args.

I also think we can always get away with just main(), because whether we use python -m or the entrypoint, the main function is always called without arguments and the arguments are always fetched from sys.argv. But if we do that, in the future we won't be able to do unit tests on main when we add new arguments to the command line scripts.

@castillohair
Copy link
Collaborator

castillohair commented Oct 5, 2017

@castillohair not sure if you want to pollute what is otherwise a command line-agnostic run function with all of the argparse code, though.

No, I don't. I'd rather keep the function and the argparse stuff separated. Separation of the "main" function that processes command-line arguments and the actual function that performs the Excel UI work is good in my opinion.

Is anything (e.g. an external functionality that we're employing) expecting a run function?

Not that I know, although someone might be calling this function from python in their own scripts. Backwards compatibility is another reason to not combine run() and "main".

It looks like console_scripts entry points can point to any module function (not one specifically named "main"). Can we rename main to something else? Like run_command_line?

The point of this is that the entry points map directly into what is now executed when you call python -m FlowCal.excel_ui, which in turn runs whatever is below if __name__=='__main__'.

Edit: Apparently @JS3xton was complaining about the function name choice. I have no strong feelings one way or another. Both seem like logical choices to me.

However, I initially accepted making a main() function that is separated from the if __name__=='__main__' statement, but on second thought I fail to see the value on this. @renxida, why is this preferable to having the argparse code below if __name__=='__main__', as we had it before?

Edit: apparently a function is necessary for the entrypoint.

main (or whatever it ends up getting called) should be documented (i.e. via a docstring).

Agree. Assuming that we keep the function, we will need a docstring before accepting the request. @renxida , please check https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt for the docstring format that we use, although for this particular case it might be enough to just look at the other docstrings in our code. I am ok with the current docstring of run(), though.

sys.argv (a list, i.e. a mutable data type) as the default argument to main could be bad practice? (see here and here) This is kinda a special use case, though, since sys is maintaining the list, so I'm not sure if these bugs could still occur. My preference is to use a sentinel value, though, unless you can demonstrate that these bugs will never happen.

I'm gonna wait until @renxida justifies better why main() had to be separated from if __name__=='__main__ before giving an opinion on this.

Although main currently doesn't modify args, is probably better to future-proof the code. I think deep-copying would be the way to go. The code should then look something like this:

def main (args = None)
    if args is None:
        args=copy.deepcopy(sys.argv)
    ....

@renxida
Copy link
Contributor Author

renxida commented Oct 5, 2017

@castillohair

Thanks for the helpful comments. I'm on it with the docstring.

Importing copy for sys.argv won't be very useful. It's going to be redundant, it will damage code readability, and it wouldn't be at all useful. sys.argv isn't ever changed except when people test code the wrong way, and argparse is a pretty safe and mature language that wouldn't do something as stupid as changing an input argument. Personally, I would worry about alien invasions and zombie apocalypses before I worry about argparse or something else messing up that variable.

Ultimately though, it's your call.

@renxida
Copy link
Contributor Author

renxida commented Oct 5, 2017

Docstring added. argv moved to function body.

@JS3xton
Copy link
Contributor

JS3xton commented Oct 5, 2017

I don't think the deepcopy of sys.argv is necessary, but I also don't think it hurts. I at least want a sentinel value used in the function definition, though, so that the function binds sys.argv when the function is called, not when the function is defined.

Please rename main to run_command_line.

  • I'm familiar with main()'s use in other programming languages; much of my initial training was in C. I haven't seen main() used in Python much, and, more relevantly, Python and the console_scripts Entry Point mechanism afford us the opportunity to have a more descriptive function name, which I think is important. main is painfully vague (outside of its historical use in other languages, as discussed), whereas run_command_line appropriately alludes to the existing run function and concisely communicates the relevant information that it's differentiated from run for reasons related to the command line.
  • @castillohair and I discussed this in person, and he established something that I agreed with upon reflection: the behavior of python -m FlowCal and calling flowcal from the terminal should be the same, and that behavior should be to run FlowCal from the command line. Given that their behavior should be the same (even though their implementation mechanisms are different), it makes sense to call the same function to implement both behaviors. In my opinion, that behavior is much more clearly described via run_command_line, though, than main.

@renxida
Copy link
Contributor Author

renxida commented Oct 5, 2017 via email

@renxida
Copy link
Contributor Author

renxida commented Oct 5, 2017

Done.

@castillohair
Copy link
Collaborator

Currently fails on Windows 10.
The following opens the "open file" window in the current develop branch.

> python -m FlowCal.excel_ui

In renxida/FlowCal (c43b8a9) the following error message appears:

usage: excel_ui.py [-h] [-i [INPUTPATH]] [-o [OUTPUTPATH]] [-v] [-p] [-H]
excel_ui.py: error: unrecognized arguments: d:\[redacted]\programs\flowcal\FlowCal\excel_ui.py

Same if I try to run flowcal.


Notes
----------
Use main(argv = ['arguments', 'you', 'want', 'to', 'test'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not called "main" anymore.

@@ -1521,11 +1545,14 @@ def run(input_path=None,
"--histogram-sheet",
action="store_true",
help="generate sheet in output Excel file specifying histogram bins")
args = parser.parse_args()
args = parser.parse_args(args=argv)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fails on Windows 10, both when using flowcal and python -m FlowCal.excel_ui. Replacing this line with the previous version, args = parser.parse_args(), makes everything work.

I think the reason is that the list of arguments, as given by sys.argv, includes the path of the script as the first element on Windows. However, parser.parse_args doesn't expect this. Therefore, it fails.

An immediate solution is to replace this line with the old version, and eliminate the option to specify argument argv to the function run_command_line.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same behavior in Linux Mint 18. It looks like the first element of sys.argv is always the script name, which parser.parse_args does not expect. This is true in all OS, both in python2 and python3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it appears that argparse version 1.1 (the default argparse module included in Python 2.7 [source] and Python 3.6 [source]) strips out the first argument here [source]:

    def parse_known_args(self, args=None, namespace=None):
        if args is None:
            # args default to the system args
            args = _sys.argv[1:]
        else:
            # make sure that args are mutable
            args = list(args)

Thus, I think we're justified in doing the same thing. Might be nice to explicitly state all of this in a brief comment in the code, though.

@@ -34,6 +34,12 @@ def find_version(file_path):
setup(
name='FlowCal',

entry_points = {
'console_scripts': [
'flowcal=FlowCal.excel_ui:run_command_line'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was the entrypoint FlowCal eliminated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@renxida raised this issue here: pypa/pip#4771

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In light of @renxida's post and possible uninstall problems, I think we should just go with lowercase flowcal.

Separately, I think it's a good idea to incorporate verification of a clean uninstall in the build process (I hate it when a package sloppily fails to uninstall correctly and pollutes my system).

Copy link
Collaborator

@castillohair castillohair Dec 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #281 for a discussion on installation.

Assuming that I install the package with pip install ., uninstallation proceeds cleanly, so I'm unable to reproduce this issue.

On FlowCal's folder:

$ sudo pip install .
Processing /home/[redacted]/FlowCal
Requirement already satisfied: packaging>=16.8 in /usr/local/lib/python2.7/dist-packages (from FlowCal==1.1.4)
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python2.7/dist-packages/six-1.11.0-py2.7.egg (from FlowCal==1.1.4)
Requirement already satisfied: numpy>=1.8.2 in /usr/local/lib/python2.7/dist-packages (from FlowCal==1.1.4)
Requirement already satisfied: scipy>=0.14.0 in /usr/local/lib/python2.7/dist-packages (from FlowCal==1.1.4)
Requirement already satisfied: matplotlib>=1.3.1 in /usr/local/lib/python2.7/dist-packages (from FlowCal==1.1.4)
Requirement already satisfied: palettable>=2.1.1 in /usr/local/lib/python2.7/dist-packages/palettable-3.0.0-py2.7.egg (from FlowCal==1.1.4)
Requirement already satisfied: scikit-learn>=0.16.0 in /usr/local/lib/python2.7/dist-packages (from FlowCal==1.1.4)
Requirement already satisfied: pandas>=0.16.1 in /usr/local/lib/python2.7/dist-packages (from FlowCal==1.1.4)
Requirement already satisfied: xlrd>=0.9.2 in /usr/local/lib/python2.7/dist-packages/xlrd-1.0.0-py2.7.egg (from FlowCal==1.1.4)
Requirement already satisfied: XlsxWriter>=0.5.2 in /usr/local/lib/python2.7/dist-packages/XlsxWriter-0.9.6-py2.7.egg (from FlowCal==1.1.4)
Requirement already satisfied: pyparsing in /usr/local/lib/python2.7/dist-packages (from packaging>=16.8->FlowCal==1.1.4)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python2.7/dist-packages (from matplotlib>=1.3.1->FlowCal==1.1.4)
Requirement already satisfied: python-dateutil in /usr/local/lib/python2.7/dist-packages (from matplotlib>=1.3.1->FlowCal==1.1.4)
Requirement already satisfied: functools32 in /usr/local/lib/python2.7/dist-packages (from matplotlib>=1.3.1->FlowCal==1.1.4)
Requirement already satisfied: pytz in /usr/local/lib/python2.7/dist-packages (from matplotlib>=1.3.1->FlowCal==1.1.4)
Requirement already satisfied: subprocess32 in /usr/local/lib/python2.7/dist-packages (from matplotlib>=1.3.1->FlowCal==1.1.4)
Installing collected packages: FlowCal
  Running setup.py install for FlowCal ... done
Successfully installed FlowCal-1.1.4

Somewhere else (not in the FlowCal folder):

$ flowcal
$ FlowCal

(Both commands work and open the "open file" dialog.

Same location:

$ sudo pip uninstall FlowCal
Uninstalling FlowCal-1.1.4:
  /usr/local/bin/FlowCal
  /usr/local/bin/flowcal
  /usr/local/lib/python2.7/dist-packages/FlowCal-1.1.4-py2.7.egg-info
  /usr/local/lib/python2.7/dist-packages/FlowCal/__init__.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/__init__.pyc
  /usr/local/lib/python2.7/dist-packages/FlowCal/excel_ui.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/excel_ui.pyc
  /usr/local/lib/python2.7/dist-packages/FlowCal/gate.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/gate.pyc
  /usr/local/lib/python2.7/dist-packages/FlowCal/io.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/io.pyc
  /usr/local/lib/python2.7/dist-packages/FlowCal/mef.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/mef.pyc
  /usr/local/lib/python2.7/dist-packages/FlowCal/plot.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/plot.pyc
  /usr/local/lib/python2.7/dist-packages/FlowCal/stats.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/stats.pyc
  /usr/local/lib/python2.7/dist-packages/FlowCal/transform.py
  /usr/local/lib/python2.7/dist-packages/FlowCal/transform.pyc
Proceed (y/n)? y
  Successfully uninstalled FlowCal-1.1.4
$ flowcal
bash: /usr/local/bin/flowcal: No such file or directory
$ FlowCal
bash: /usr/local/bin/FlowCal: No such file or directory

After starting another bash session:

$ flowcal
flowcal: command not found
$ FlowCal
FlowCal: command not found

So it seems that uninstalling proceeds as expected.
(All of this was performed with pip 9.0.1 in Linux Mint 18).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I successfully reproduced the problem in python 3.6. Again, after installing with sudo pip3.6 install . and confirming that everything works, I get the following:

$ sudo pip3.6 uninstall FlowCal
Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/local/lib/python3.6/dist-packages/pip/commands/uninstall.py", line 76, in run
    requirement_set.uninstall(auto_confirm=options.yes)
  File "/usr/local/lib/python3.6/dist-packages/pip/req/req_set.py", line 346, in uninstall
    req.uninstall(auto_confirm=auto_confirm)
  File "/usr/local/lib/python3.6/dist-packages/pip/req/req_install.py", line 734, in uninstall
    FakeFile(dist.get_metadata_lines('entry_points.txt'))
  File "/usr/lib/python3.6/configparser.py", line 763, in readfp
    self.read_file(fp, source=filename)
  File "/usr/lib/python3.6/configparser.py", line 718, in read_file
    self._read(f, source)
  File "/usr/lib/python3.6/configparser.py", line 1092, in _read
    fpname, lineno)
configparser.DuplicateOptionError: While reading from '<???>' [line  3]: option 'flowcal' in section 'console_scripts' already exists

This doesn't happen if I only have the entrypoint flowcal.

It is interesting that this error only happens in python 3.6 (and possibly other versions of python 3). I will remove the entrypoint FlowCal as discussed.

http://amir.rachum.com/blog/2017/07/28/python-entry-points/

"""
if argv == None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@castillohair
Copy link
Collaborator

I made a local branch from the changes in this PR and submitted a new PR to accelerate code iteration. Please continue discussion in #280.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants