Commit

Modify rule S6466: Extend LaYC content for Python
anton-haubner-sonarsource committed Sep 26, 2023
1 parent c101d61 commit 111d418
Showing 1 changed file with 110 additions and 7 deletions.
117 changes: 110 additions & 7 deletions rules/S6466/python/rule.adoc
@@ -1,12 +1,39 @@
This rule raises an issue when trying to access a list or a tuple index that is out of bounds.
This rule raises an issue when trying to access a list or a tuple at an index
that is out-of-bounds.

== Why is this an issue?

Trying to access a list or a tuple index that is beyond the size of the list/tuple is probably a mistake and will result in an `IndexError`.
In Python, lists and tuples have a certain size and their elements are indexed
in the range between the starting index `0` (inclusive) and the length of the
sequence (exclusive).

=== Code examples
When trying to access a list or tuple with an index outside of this range,
an `IndexError` will be raised and the operation will fail.

==== Noncompliant code example
Negative indices are supported. A negative index is interpreted by adding it
to the length of the sequence.
The result is then used as the actual index for accessing the sequence.
Thus, the result must be non-negative and fit into the aforementioned range.
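
The following is a minimal sketch of the index resolution described above. The helper `resolve_index` is hypothetical and only mimics, under the stated rule, what Python does for lists and tuples; it is not part of any library.

```python
def resolve_index(index: int, length: int) -> int:
    """Return the effective non-negative index, or raise IndexError."""
    # A negative index is shifted by the length of the sequence.
    effective = index + length if index < 0 else index
    # The effective index must lie in the range [0, length).
    if not 0 <= effective < length:
        raise IndexError(f"index {index} is out of range for length {length}")
    return effective


values = ("a", "b", "c")
print(values[resolve_index(-1, len(values))])  # same element as values[-1]
print(values[resolve_index(-3, len(values))])  # same element as values[0]
# resolve_index(-4, len(values)) raises IndexError, just like values[-4]
```

For a sequence of length `n`, the valid indices are therefore `-n` up to and including `n - 1`.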

=== What is the potential impact?

Since accessing a sequence outside of its bounds raises an `IndexError`, it will
interrupt the normal execution of the program and can result in unexpected
crashes.
Therefore, this issue might impact the availability and reliability of your
application.

If the computation of the index is tied to user input data, this issue can
potentially be exploited by attackers to disrupt your application.

== How can I fix it?

The following are examples of code containing out-of-bounds accesses to
sequences, resulting in `IndexError`s.
These situations can be avoided by carefully considering the range of valid
index values, or even better, by comparing indices and the size of a sequence.

=== Noncompliant code example

[source,python,diff-id=1,diff-type=noncompliant]
----
@@ -16,7 +43,7 @@ def fun():
----

==== Compliant solution
=== Compliant solution

[source,python,diff-id=1,diff-type=compliant]
----
@@ -26,11 +53,87 @@ def fun():
----

== Resources
=== Noncompliant code example

[source,python,diff-id=1,diff-type=noncompliant]
----
def fun(ls: list[int]):
    print(ls[len(ls)]) # Noncompliant: Indexing starts at 0, hence the list length will always be an invalid index.
----

=== Compliant solution

[source,python,diff-id=1,diff-type=compliant]
----
def fun(ls: list[int]):
    # We can make sure ls is non-empty before trying to access its last element.
    # Also, the index `len(ls) - 1` or just `-1` will correctly select the last
    # element within bounds.
    print("Empty list!" if not ls else ls[-1])
----

=== How does it work?

In the first example, a list `ls` containing three elements is created.
Since in Python, the first element of a list has index `0`, the last valid index
is `2`.
Therefore, an `IndexError` is raised when accessing `ls` at index `3`.

The second example is similar, but we don't know the length of the list `ls` so
it is computed using `len`.
Still, accessing a list with its size as an index is not correct.

In general, when you do not know the concrete size of a sequence that you are
accessing by index, always make sure to guard the access.
That is, you should add if-else-constructs or make use of other control flow
tools to ensure that the index value you are using fits within the bounds of
the sequence.
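
One possible shape for such a guard is sketched below. The function name `get_or_default` and its `default` parameter are illustrative assumptions, not something prescribed by this rule.

```python
from typing import Optional, Sequence


def get_or_default(
    seq: Sequence[int], index: int, default: Optional[int] = None
) -> Optional[int]:
    """Return seq[index] if the index is within bounds, else the default."""
    # Valid indices range from -len(seq) up to len(seq) - 1.
    if -len(seq) <= index < len(seq):
        return seq[index]
    return default


ls = [10, 20, 30]
print(get_or_default(ls, 1))      # index within bounds
print(get_or_default(ls, 3, -1))  # index out of bounds, falls back to -1
```

Whether returning a default, raising a more specific error, or skipping the access is appropriate depends on the calling code.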

=== Pitfalls

The indices `0`, `len(...) - 1`, or `-1` for the first and last element of a
sequence are not always valid!
Make sure the list or tuple in question is non-empty before accessing these
indices.
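
As a small illustration of this pitfall, the sketch below shows that even `-1` fails on an empty list. The helper `last_or_none` is a hypothetical name used only for this example.

```python
def last_or_none(seq):
    """Return the last element, or None if the sequence is empty."""
    # An empty list or tuple is falsy, so this check guards the access.
    return seq[-1] if seq else None


empty = []
try:
    empty[-1]  # even -1 is out of bounds for an empty list
    raised = False
except IndexError:
    raised = True

print(raised)
print(last_or_none(empty))
print(last_or_none((1, 2, 3)))
```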

=== Going the extra mile

In many cases, accessing a sequence by index can be avoided.
For instance, you can make use of built-in functions like `map()` and
`filter()`, or of `functools.reduce()`, which let you operate on sequences
without using indices.
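
A brief sketch of index-free processing follows. The list contents and variable names are made up for illustration; note that in Python 3, `reduce` is imported from `functools` rather than being a built-in.

```python
from functools import reduce

ls = [1, 2, 3, 4]

# Transform every element without touching an index.
doubled = list(map(lambda x: x * 2, ls))

# Keep only the elements that satisfy a predicate.
evens = list(filter(lambda x: x % 2 == 0, ls))

# Fold the sequence into a single value, starting from 0.
total = reduce(lambda acc, x: acc + x, ls, 0)

print(doubled, evens, total)
```

All three calls also work on an empty list, so no bounds check is needed.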

If you absolutely need to know the index of an element while iterating over a
sequence, you can use `enumerate()`. It provides both the indices and the
elements of a sequence during iteration, eliminating the need to manually
retrieve elements from the sequence using indices.

==== Noncompliant code example

[source,python,diff-id=1,diff-type=noncompliant]
----
for i in range(len(ls)):
    elem = ls[i] # We can eliminate this access by index using enumerate.
    foo(i, elem)
----

==== Compliant solution

[source,python,diff-id=1,diff-type=compliant]
----
for i, elem in enumerate(ls):
    foo(i, elem)
----

== More Info

=== Documentation

* Python Documentation - https://docs.python.org/3/library/exceptions.html#IndexError[IndexError]
* https://docs.python.org/3/reference/expressions.html#subscriptions[Subscriptions]
* https://docs.python.org/3/library/exceptions.html#IndexError[`IndexError`]
* https://docs.python.org/3/library/functions.html#built-in-functions[Built-ins, including `map`, `filter`, `enumerate`, etc.]

ifdef::env-github,rspecator-view[]

