[GSOC 2020] Project ideas #8373

Closed
JukkaL opened this issue Feb 5, 2020 · 33 comments
Labels
meta Issues tracking a broad area of work topic-developer Issues relevant to mypy developers

Comments

@JukkaL
Collaborator

JukkaL commented Feb 5, 2020

Update: See #10099 for GSoC 2021 project ideas.

Here are some ideas for larger mypy-related projects for contributors who want to tackle something fairly big (but also with a big potential impact).

Deep editor integrations

Currently it's possible to run the mypy daemon from an editor and display the list of errors within the editor, but we could go much further. Possible ideas include going to the definition of an arbitrary reference (such as a method, variable, or type) and displaying the inferred type of an expression. IDEs such as PyCharm can do some of this already, but mypy could support these features more reliably in some cases, since it maintains a very detailed representation of the program internally. This could also be very helpful with editors that have no or limited built-in support for these features.
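
As a rough illustration of the existing baseline (showing the error list in an editor), here is a minimal sketch that parses mypy's line-oriented output into records an editor plugin could display. The names `Diagnostic` and `parse_diagnostics` are hypothetical, and the regular expression assumes the usual `path:line: severity: message` layout, with an optional column when column numbers are enabled. It does not handle paths that contain colons, such as Windows drive letters.

```python
import re
from typing import List, NamedTuple, Optional


class Diagnostic(NamedTuple):
    """One message from the type checker (hypothetical record type)."""
    path: str
    line: int
    column: Optional[int]
    severity: str
    message: str


# Matches "pkg/mod.py:12: error: ..." and "pkg/mod.py:12:5: note: ...".
_LINE_RE = re.compile(
    r"^(?P<path>[^:]+):(?P<line>\d+):(?:(?P<column>\d+):)? "
    r"(?P<severity>error|note): (?P<message>.*)$"
)


def parse_diagnostics(output: str) -> List[Diagnostic]:
    """Turn raw checker output into a list of Diagnostic records."""
    diagnostics = []
    for raw in output.splitlines():
        m = _LINE_RE.match(raw)
        if m is None:
            continue  # skip summary lines such as "Found 1 error in ..."
        column = int(m["column"]) if m["column"] is not None else None
        diagnostics.append(
            Diagnostic(m["path"], int(m["line"]), column,
                       m["severity"], m["message"])
        )
    return diagnostics


SAMPLE_OUTPUT = (
    'main.py:3: error: Incompatible types in assignment '
    '(expression has type "str", variable has type "int")\n'
    'main.py:7:5: note: Revealed type is "builtins.int"\n'
    'Found 1 error in 1 file (checked 1 source file)\n'
)
```

An editor plugin could map each record to a marker at `path:line`. Go-to-definition and inferred-type display would need richer information than this text output carries, which is the gap the project idea describes.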

Better decorator support

Mypy can't properly support decorators that change the return type of the decorated function, such as contextmanager (contextmanager is special-cased using a plugin, but this approach doesn't generalize to arbitrary functions). Add support for the PEP 612 draft to make this better.

Generalize mypyc IR to allow non-C backends

Currently the IR of mypyc, the compiler we use to compile mypy, is tightly bound to C. This makes it impractical to experiment with alternative backends, such as an LLVM backend or a completely custom backend that directly generates assembly.

Related issue: mypyc/mypyc#709
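
To make the idea concrete, here is a toy sketch of a backend-neutral IR with two interchangeable emitters. None of these classes are mypyc's actual IR; `Const`, `AddInt`, `Ret`, `CBackend`, and `LLVMLikeBackend` are hypothetical names, and the second backend only prints LLVM-flavoured text for illustration.

```python
from dataclasses import dataclass
from typing import List, Protocol, Union


@dataclass(frozen=True)
class Const:
    dest: str
    value: int


@dataclass(frozen=True)
class AddInt:
    dest: str
    left: str
    right: str


@dataclass(frozen=True)
class Ret:
    src: str


Op = Union[Const, AddInt, Ret]


class Backend(Protocol):
    """Anything that can turn a list of ops into target text."""
    def emit(self, ops: List[Op]) -> str: ...


class CBackend:
    def emit(self, ops: List[Op]) -> str:
        lines = []
        for op in ops:
            if isinstance(op, Const):
                lines.append(f"long {op.dest} = {op.value};")
            elif isinstance(op, AddInt):
                lines.append(f"long {op.dest} = {op.left} + {op.right};")
            elif isinstance(op, Ret):
                lines.append(f"return {op.src};")
        return "\n".join(lines)


class LLVMLikeBackend:
    def emit(self, ops: List[Op]) -> str:
        lines = []
        for op in ops:
            if isinstance(op, Const):
                lines.append(f"%{op.dest} = add i64 0, {op.value}")
            elif isinstance(op, AddInt):
                lines.append(f"%{op.dest} = add i64 %{op.left}, %{op.right}")
            elif isinstance(op, Ret):
                lines.append(f"ret i64 %{op.src}")
        return "\n".join(lines)


PROGRAM: List[Op] = [
    Const("r0", 1),
    Const("r1", 2),
    AddInt("r2", "r0", "r1"),
    Ret("r2"),
]
```

The point of the sketch is that the ops carry no C-specific details, so the same `PROGRAM` can be handed to either emitter. An IR that embeds C expressions or C type names in its ops would not allow this.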

Faster callables and nested functions in mypyc

Currently calling nested functions and variables with a callable type is pretty slow in compiled code. These limitations reduce the usefulness of mypyc significantly, especially when compiling code that wasn't originally written with mypyc in mind.

Related issues: mypyc/mypyc#713, mypyc/mypyc#712 (both of these would be implemented in a GSoC project)
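
For readers unfamiliar with the two patterns, here is a small made-up example containing both: a nested function that captures a variable from its enclosing scope, and a call through a variable with a callable type. The function names are hypothetical; the code only shows the shape of the code in question and says nothing about how mypyc compiles it.

```python
from typing import Callable, List


def scale_by(factor: int) -> Callable[[int], int]:
    # 'scale' is a nested function that captures 'factor'.
    def scale(x: int) -> int:
        return x * factor

    return scale


def apply_all(fn: Callable[[int], int], items: List[int]) -> List[int]:
    # 'fn' is a variable with a callable type; the call target is only
    # known at runtime.
    return [fn(x) for x in items]
```

Code written without mypyc in mind tends to use both patterns freely (callbacks, closures, higher-order helpers), which is why the paragraph above singles them out.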

NumPy support

This is a big topic, but this can be approached a feature at a time. One of the main missing things is "shape types" -- there needs to be a way to express the number of dimensions in an array, at the very least.

@hauntsaninja
Collaborator

I'd be interested in adding support for PEP 612. I'll take a look at the pyre implementation this weekend, re-read the PEP, and scope out a plan.

@JukkaL
Collaborator Author

JukkaL commented Feb 12, 2020

@hauntsaninja Great! If you have any questions, I'm happy to help.

@JukkaL
Collaborator Author

JukkaL commented Feb 20, 2020

Another idea:

Detecting potentially undefined or misspelled locals

Some uses of undefined variables are not caught by mypy:

def f() -> None:
    if foo():
        x = 0
    print(x)  # No error

It would be useful to catch these (#2400). A related issue is reporting locals that are never read, as this is often an error (#76).
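
One common way to catch this is a "definite assignment" pass: track which locals are assigned on every path, and flag reads of locals that are not. The sketch below does this over Python's `ast` for a deliberately tiny subset (simple assignments and `if`/`else` only). It is not how mypy works internally; `possibly_undefined` is a hypothetical helper, and loops, `try`, and `del` are not modelled.

```python
import ast
from typing import List, Sequence, Set


def possibly_undefined(source: str) -> List[str]:
    """Report reads of locals that are not assigned on every path."""
    func = ast.parse(source).body[0]
    assert isinstance(func, ast.FunctionDef)
    warnings: List[str] = []
    # Names assigned anywhere in the function are treated as its locals;
    # all other names (globals, builtins) are ignored.
    local_names = {
        n.id for n in ast.walk(func)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }

    def check_reads(node: ast.AST, defined: Set[str]) -> None:
        for n in ast.walk(node):
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load):
                if n.id in local_names and n.id not in defined:
                    warnings.append(
                        f"line {n.lineno}: '{n.id}' may be undefined")

    def visit_block(stmts: Sequence[ast.stmt], defined: Set[str]) -> Set[str]:
        for stmt in stmts:
            if isinstance(stmt, ast.If):
                check_reads(stmt.test, defined)
                then_defs = visit_block(stmt.body, set(defined))
                else_defs = visit_block(stmt.orelse, set(defined))
                # Only names assigned in both branches are definitely set.
                defined = then_defs & else_defs
            elif isinstance(stmt, ast.Assign):
                check_reads(stmt.value, defined)
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        defined = defined | {target.id}
            else:
                check_reads(stmt, defined)
        return defined

    # Parameters are always defined on entry.
    visit_block(func.body, {a.arg for a in func.args.args})
    return warnings


EXAMPLE = (
    "def f() -> None:\n"
    "    if foo():\n"
    "        x = 0\n"
    "    print(x)\n"
)
```

Running `possibly_undefined(EXAMPLE)` flags the `print(x)` line, because `x` is assigned in the `if` body but not on the implicit `else` path.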

@TH3CHARLie
Collaborator

TH3CHARLie commented Feb 20, 2020

The GSoC 2020 organizations have been announced, and mypy is one of them! See https://summerofcode.withgoogle.com/organizations/6527008550420480/

I am interested in Generalize mypyc IR to allow non-C backends, and I aim to have a plan by the end of February.

@vpeiter

vpeiter commented Feb 22, 2020

Hi!
I'd be interested in working on detecting potentially undefined or misspelled locals.
I will work out a plan and get in contact soon...

@dubesar

dubesar commented Feb 22, 2020

Hi!!
I am Sarvesh, and I am interested in Faster callables and nested functions in mypyc and in NumPy support.
I am working on Faster callables and will get back soon with a solution.

@edwinjon

Hello,
My name is Jonathan. I focus on data science, and I am really interested in working on NumPy support. I once asked about the lack of "shape types". Working on this project would help my data science work too.

I am going to read the documentation and try to find out which features can be improved for NumPy. Thanks

Cheers,
Jonathan

@TH3CHARLie
Collaborator

Hi @edwinjon and @dubesar, glad to see people interested in working on the mypy-NumPy topic!

As far as I know, there is similar work that may help you develop your ideas and plans; check https://github.com/numpy/numpy-stubs.

@dubesar

dubesar commented Feb 22, 2020

@JukkaL By NumPy support, I understand that code written in a standard format needs to be changed to a NumPy format, for example by changing arrays to NumPy arrays.
Also, by shapes do you mean the shapes of the arrays?

Please elaborate on that part!

Also, for the Faster callables part, I checked the running time of the code and have addressed it in the issue. Please guide me on what needs to be done next so that I can move further with the work.

@egenedy97

Hi @JukkaL, my name is Eslam Genedy. I am a third-year student in the Computer Engineering Department at ASU, and it is my pleasure to apply for mypy. I have good knowledge of OOP in Python and of NumPy. I hope to contribute to this project. Thanks in advance.

@JukkaL
Collaborator Author

JukkaL commented Feb 26, 2020

To support NumPy, we'd need some form of "shape type" support. We want to specify the number of dimensions and the item type of an array, at least. Even better, it would often be useful to specify the exact size of an array, but this will be much harder to implement.

This example is from @ilevkivskyi's presentation from the Typing Summit at PyCon 2019:

from typing import Shape, IntVar, TypeVar

N = IntVar('N')
T = TypeVar('T')

def diff(a: ndarray[T, Shape[N]]) -> ndarray[T, Shape[N - 1]]:
    ...
def sum2d(a: ndarray[T, Shape[:, :]]) -> ndarray[T, Shape[:]]:
    ...

This means that diff takes a one-dimensional array of size N and item type T, and it returns a one-dimensional array of size N - 1 and item type T. sum2d, on the other hand, takes an arbitrary-sized two-dimensional array with item type T, and it returns a one-dimensional, arbitrary-sized array with item type T.

A reasonable goal for GSoC would be to support specifying the item type and the number of dimensions (e.g. ndarray[float, Shape[:, :]] for a two-dimensional array). This would involve at least these steps (I'm leaving a lot of details out):

  1. Add support for the new shape type syntax.
  2. Implement basic type operations involving shape types, such as displaying shapes in error messages and subtyping.
  3. Implement simple stubs for NumPy that use the shape types.
  4. If needed, add a mypy plugin to handle some common NumPy operations that can't be supported via existing type system features.
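
Step 2 can be pictured with a small stand-alone model. The sketch below is not mypy code and does not use mypy's internal type classes; `ShapeType`, `is_shape_subtype`, and `incompatible_message` are hypothetical names. It assumes one plausible compatibility rule: two shapes are compatible when they have the same number of dimensions and each expected axis is either `:` (any size) or the same fixed size.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ShapeType:
    """Toy model of Shape[...]; None stands for ':' (any size)."""
    dims: Tuple[Optional[int], ...]

    def __str__(self) -> str:
        inner = ", ".join(":" if d is None else str(d) for d in self.dims)
        return f"Shape[{inner}]"


def is_shape_subtype(left: ShapeType, right: ShapeType) -> bool:
    """Return True if a value of shape 'left' is acceptable where 'right' is expected."""
    if len(left.dims) != len(right.dims):
        return False  # different number of dimensions
    return all(r is None or l == r for l, r in zip(left.dims, right.dims))


def incompatible_message(actual: ShapeType, expected: ShapeType) -> str:
    """Format a shape mismatch for display in an error message."""
    return f'Argument has incompatible shape "{actual}"; expected "{expected}"'
```

Under this rule a `Shape[3, 4]` array is accepted where `Shape[:, :]` is expected, but a one-dimensional `Shape[3]` array is not. Arithmetic on sizes, such as `Shape[N - 1]` in the `diff` example, is the harder part and is left out here.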

This is quite a challenging project and requires a deep understanding of mypy type checking internals, so it's really only well-suited for somebody who has substantial experience with working on mypy (or another type checker), or perhaps has completed a course on type theory (beyond a compilers course).

@joybh98
Contributor

joybh98 commented Feb 27, 2020

@JukkaL, I'd like to integrate PEP 612 and take on the better decorator support project.

@JukkaL
Collaborator Author

JukkaL commented Feb 28, 2020

@joybhallaa That would be a really useful project! I'd suggest starting by closing a few smaller issues first to get familiar with working on mypy. (The same advice applies to anybody who's interested in diving into a major project. It's best to learn the codebase gradually. Otherwise there's a big risk that there's too much to learn at once, and you'll get discouraged.)

Since PEP 612 involves type inference and type variables, tackling one (or some) of these issues could be a good option: https://github.com/python/mypy/issues?q=is%3Aopen+is%3Aissue+label%3Atopic-type-variables

If you find a promising issue but you are not sure where to start, you may want to ask for hints in the issue.

@joybh98
Contributor

joybh98 commented Feb 29, 2020

Since PEP 612 involves type inference and type variables, tackling one (or some) of these issues could be a good option: https://github.com/python/mypy/issues?q=is%3Aopen+is%3Aissue+label%3Atopic-type-variables

@JukkaL I've taken up an issue and after I'm finished with that I'll take up issues that you have mentioned. 👍

@pr4k

pr4k commented Feb 29, 2020

Hi @JukkaL, I am interested in contributing to better decorator support. Right now I am working on the issues. Is there anything that should be done in parallel with this, and are there any potential mentors for this project idea?

@msullivan
Collaborator

As a heads up, we will probably have the bandwidth to mentor one or two GSoC projects. We won't make any final decisions on proposals until officially reviewing the applications.

@davidzwa
Contributor

davidzwa commented Mar 6, 2020

@JukkaL I'm posting here to let you know that three other students from TU Delft (Netherlands) and I are currently analyzing mypy from a software architecture perspective (as part of a university course).

We are currently writing multiple essays publicly visible on https://desosa2020.netlify.com/projects/mypy/
And our first essay (merely an introduction):
https://desosa2020.netlify.com/projects/mypy/2020/03/04/the-vision-behind-mypy
3 more will follow soon!

Now my questions are:

  • Is there any low-hanging fruit for us to get familiar with the codebase?
  • Is there something we can contribute to this great project from a software architecture point of view? (Documentation, analysis, bugs or advice)
  • And finally, would you be interested in seeing the results of our architectural analysis?

@emmatyping
Collaborator

@davidzwa For a list of good first issues to fix, you may want to check out https://github.com/python/mypy/issues?q=is%3Aissue+is%3Aopen+label%3Agood-first-issue+-linked%3Apr+no%3Aassignee

Also mypy is usually lowercase, unless it starts a sentence, FWIW. I'll leave the other questions to Jukka though :)

@Jahnavi-Jonnalagadda

@JukkaL and @msullivan, I have fair knowledge of Python and have solved several problems using data structures and algorithms concepts. I'm new to GSoC and would need some input to kick-start contributing to this.

@Arshaan-256

@JukkaL I would like to pick up NumPy support as my project. I have already made myself familiar with mypy and have decent experience working with NumPy. Besides creating a way to define array dimensions, we could also add checks for matrix and vector operations. I worked on issue #8344; the pull request is yet to be reviewed, though.

@nittyswami

Hey guys, I want help with setting up mypy for contributing.

@JukkaL
Collaborator Author

JukkaL commented Mar 19, 2020

Many people have expressed interest in the GSoC projects. Since we can't accept many candidates, we are setting some minimum requirements for candidates to streamline the process, and to set realistic expectations. These are the requirements (no exceptions, sorry):

  1. You have a) completed a university course "Introduction to Compilers" (or similar), or b) made multiple contributions to a type checker or compiler project (this can be mypy or another open source project).
  2. You can demonstrate practical Python experience. For the mypyc-related projects, you have experience with C or C++ as well.
  3. You've been able to create a non-trivial mypy PR and respond to reviews. (It's not necessary that your PR is merged, though that's a plus.)
  4. Contact me ([email protected]) by email and give these details:
  • A summary of your compilers-related experience. If you finished a compilers course, give a summary of what you learned and a description of your programming project (if any). If you have contributed to a compiler or type checker project, provide links to your contributions and/or describe what you've done and what's the significance of your contributions. Give a link to your mypy PR or PRs.
  • A summary of your Python experience (and C/C++, if relevant). Provide a short summary of one or two projects you've worked on in the language(s). If you can give links to code, even better.
  • Tell us which of the project ideas you would like to work on, and why.
  • (Optional but recommended) Other supporting evidence, such as other courses you've finished that may be relevant, your grades, side projects you've worked on, web sites you've created, other languages you are skilled at, etc.

As I mentioned above, the NumPy project is more challenging than the others, and thus we'd need a candidate with substantial previous experience. A handful of mypy PRs won't be enough experience to work on NumPy with a good chance of success, if that's all the type checker related experience you have.

Notes:

  • Knowledge of algorithms and data structures is also necessary, but it's not sufficient.
  • We'll pick a few of the most promising candidates and schedule interviews and/or small coding tasks. These can happen over Skype or chat, for example -- we are quite flexible.
  • Unfortunately, we don't have bandwidth to personally help everybody get their mypy development environment set up. Getting started using the available documentation is one of the first tasks you need to be able to complete on your own.
  • Reading a book is not a substitute for attending a compilers course, since the coding assignments in typical courses are very important.
  • For generic questions about getting started, read this thread, our information at the GSoC site, and our readme (https://github.com/python/mypy/blob/master/README.md). We expect that suitable candidates have enough experience to be able to create a PR with the available help.
  • If you have specific questions, feel free to email me or post here.

@JukkaL
Collaborator Author

JukkaL commented Mar 24, 2020

Clarification: The Faster callables and nested functions in mypyc GSoC project includes implementing both of the linked issues.

@JukkaL
Collaborator Author

JukkaL commented Mar 24, 2020

Note that Detecting potentially undefined or misspelled locals feels too small for a full GSoC project. If somebody wants to work on this topic in the context of GSoC, we can expand the scope by also detecting (some) uninitialized attributes, and preventing unsafe attribute deletions. I can provide more context if there is interest.
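
To illustrate the two extra checks mentioned here, the made-up class below contains both problems: an attribute that is declared but never initialized, and an attribute deletion that makes later reads fail. `Config`, `read_name`, and `read_retries` are hypothetical names. As far as I know, a type checker that lacks these checks accepts the code as written, and the errors only show up at runtime.

```python
class Config:
    name: str  # declared, but never assigned in __init__

    def __init__(self) -> None:
        self.retries = 3

    def reset(self) -> None:
        # Unsafe deletion: 'retries' is still part of the declared interface.
        del self.retries


def read_name(cfg: Config) -> str:
    return cfg.name  # raises AttributeError at runtime


def read_retries(cfg: Config) -> int:
    return cfg.retries  # raises AttributeError after cfg.reset()
```

Both failures are the attribute-level analogue of the undefined-local example earlier in the thread: the name is known to the checker, but no value is guaranteed to be bound when it is read.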

@shadaabghani1

Hello ,

Can you give a few details about NumPy support?

@JukkaL
Collaborator Author

JukkaL commented Mar 30, 2020

@shadaabghani1 It would be much easier for me to help you if you asked more specific questions, as the topic is pretty wide-ranging. I already gave an overview in this comment: #8373 (comment)

@bhack

bhack commented Apr 2, 2020

@JukkaL I don't know if for the numpy support point you could be interested in this thread: https://llvm.discourse.group/t/numpy-scipy-op-set/768

@Akuli
Contributor

Akuli commented Aug 3, 2020

Currently I'm writing sphinx docs like this and I wish there was a sphinx plugin or something that would ask mypy what the type of each attribute is:

    .. attribute:: callback
        :type: Callable[[], None]

        Blah blah. Boring text goes here.
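
A partial workaround that does not involve mypy at all is to generate the directives from the annotations already written on the class. The sketch below does that; note that it reads declared annotations rather than asking mypy for inferred types, so it is a substitute technique, not the plugin wished for above. `Button` and `attribute_directives` are hypothetical names, and the annotations are written as strings so that the output keeps the source spelling.

```python
from typing import List


class Button:
    # String annotations are stored verbatim in __annotations__.
    callback: "Callable[[], None]"
    label: "str"


def attribute_directives(cls: type) -> str:
    """Render one '.. attribute::' directive per annotated class attribute."""
    lines: List[str] = []
    for name, annotation in cls.__annotations__.items():
        lines.append(f".. attribute:: {name}")
        lines.append(f"    :type: {annotation}")
        lines.append("")  # blank line where the description would go
    return "\n".join(lines)
```

Attributes that are only assigned in `__init__` without an annotation would be missed by this approach, which is exactly where mypy's inferred types would help.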

@friyaz

friyaz commented Feb 3, 2021

Hey @JukkaL, will mypy participate in GSoC this summer? I am interested in working on the editor integration of mypy.

@TH3CHARLie
Collaborator

Hey @JukkaL, will mypy participate in GSoC this summer? I am interested in working on the editor integration of mypy.

Hi, since it may take Jukka several hours before he gets back to you, I will try to share what I know. Mypy is still deciding whether it will participate in GSoC 2021 but we are looking forward to it. For project ideas, once core team members have decided to participate, they will come up with some possible ideas or even post a public issue for brainstorming. The final ideas will most likely be summarized in a new issue. Anyway, please stay tuned to this issue and also the Gitter channel.

@friyaz

friyaz commented Feb 3, 2021

Hi, since it may take Jukka several hours before he gets back to you, I will try to share what I know.

Thanks for replying to this. I will add myself to the Gitter channel and try working on any easy to fix issue to get started.

@TH3CHARLie
Collaborator

Thanks for replying to this. I will add myself to the Gitter channel and try working on any easy to fix issue to get started.

Cool, apart from mypy's issue tracker, you may also find some easy-to-do issues at: https://github.com/mypyc/mypyc/issues. And for mypyc issues, feel free to ping me for a review

@JukkaL JukkaL changed the title Project ideas [GSOC 2020] Project ideas Mar 17, 2021
@AlexWaygood AlexWaygood added topic-developer Issues relevant to mypy developers meta Issues tracking a broad area of work labels Mar 25, 2022
@emmatyping
Collaborator

I am going to close this since I don't think there is anything useful to track here; the topics brought up can be tracked in their respective issues.
