[PYTHON] CRPYT404: Use generator expressions instead of list comprehensions in for-loops declaration (Team 1.5) #127
Explanation

Python generators resemble lazy lists from other programming languages: when iterated over, they compute their values on the fly. They lack some list behaviors (indexing, `len()`, ...) but are memory-efficient because, unlike lists, they do not store all of their values in memory. Thus, when a list comprehension is used directly as the iterable of a for-loop declaration, it can safely be replaced with a generator expression.

Code examples

The code example below creates a list through a list comprehension, only for it to be iterated over in a for loop:

    for index in [index for index in range(1_000_000)]:
        ...

The list of 1,000,000 integers is stored entirely in memory, wasting space because the list is not referenced anywhere else in the code. The preferred form is:

    for index in (index for index in range(1_000_000)):
        ...

Notice the use of parentheses instead of brackets: this creates a generator through a generator expression. A generator holds only its current state rather than all of its values, so it takes almost no space in memory, which eases memory pressure on the hardware.

Next steps

Inspect use of list comprehensions outside for-loop declarations

Explanation

Given more time, one could implement a rule that inspects list comprehensions used in local contexts (for example, in a function body) and assesses whether each one can be replaced with a generator expression.

Challenges

To avoid raising too many false positives, the rule could track, for each list defined by a list comprehension, every reference to it in the code, and ensure that no list-specific operations (indexing, `len()`, ...) are applied to it. It would also need to ensure that the list is iterated over only once, since a generator can be iterated over only once. If those two conditions can be checked through static analysis in a local context, then an issue could be raised for using a list comprehension where a generator expression would do.

Non-compliant code example:

    def func():
        my_list = [i for i in range(1_000)]
        return sum(my_list)

Compliant code example:

    def func():
        my_generator = (i for i in range(1_000))
        return sum(my_generator)

Code examples that should not raise this issue (although, arguably, they could be rewritten):

    def func():
        my_list = [i for i in range(1_000)]
        # The list is iterated over twice, which a generator does not allow.
        for index in my_list:
            ...
        for index in my_list:
            ...

    def func():
        my_list = [i for i in range(1_000)]
        # Indexing is a list-specific operation.
        return my_list[0]

Inspect use of list comprehensions in function calls for functions that take iterable arguments

Explanation

A good example is worth hundreds of words. The following code should never be used:

    sum([i ** 2 for i in range(1_000)])

This should always be preferred:

    sum(i ** 2 for i in range(1_000))

Note: when a generator expression is the sole argument of a function call, its own parentheses can be omitted.

Challenges

The main challenge here is to detect when a function actually needs a list as input and not just an iterable. Another is to define the scope of the rule: which functions would be inspected? Candidates are a predefined set of built-in functions (filter, sum, any, all, ...) or every function with a type hint indicating that a parameter should be an Iterable.
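As a rough illustration of how the checks described in this issue could be prototyped, here is a minimal sketch built on Python's standard `ast` module. It is not the rule's actual implementation, which this issue does not show. The names `ITERABLE_BUILTINS`, `ListCompFinder`, and `find_replaceable_list_comprehensions` are hypothetical, and the set of inspected built-ins is an assumption chosen for the example. The sketch only flags a list comprehension written directly in a for-loop declaration or passed directly as the first argument of one of those built-ins. It does not attempt the reference tracking needed for lists assigned to a variable first.

```python
import ast

# Hypothetical allow-list: built-ins assumed to accept any iterable
# as their first positional argument.
ITERABLE_BUILTINS = {"sum", "any", "all", "min", "max"}


class ListCompFinder(ast.NodeVisitor):
    """Collect (line, reason) pairs for list comprehensions consumed in place."""

    def __init__(self):
        self.findings = []

    def visit_For(self, node):
        # `for x in [ ... for ... ]:` -> the list is never referenced again.
        if isinstance(node.iter, ast.ListComp):
            self.findings.append((node.iter.lineno, "for-loop iterable"))
        self.generic_visit(node)

    def visit_Call(self, node):
        # `sum([ ... for ... ])` -> only plain names from the allow-list are checked.
        if (
            isinstance(node.func, ast.Name)
            and node.func.id in ITERABLE_BUILTINS
            and node.args
            and isinstance(node.args[0], ast.ListComp)
        ):
            self.findings.append(
                (node.args[0].lineno, f"argument to {node.func.id}()")
            )
        self.generic_visit(node)


def find_replaceable_list_comprehensions(source):
    """Return sorted (line, reason) findings for the given source code."""
    finder = ListCompFinder()
    finder.visit(ast.parse(source))
    return sorted(finder.findings)


SAMPLE = """\
for index in [index for index in range(10)]:
    pass
total = sum([i ** 2 for i in range(10)])
first = [i for i in range(10)][0]
"""

print(find_replaceable_list_comprehensions(SAMPLE))
# → [(1, 'for-loop iterable'), (3, 'argument to sum()')]
```

The indexed list on the last line of `SAMPLE` is deliberately left unflagged, since indexing needs a real list. Matching on the function name alone is a simplification: a user-defined function that shadows `sum` would be flagged too, which is one facet of the scoping challenge described above.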
Team 1.5°C