Resolve DeprecationWarning: Seeding based on hashing #656

Merged
merged 3 commits into from
Mar 12, 2024
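
For context, the warning in the title is raised when `random.seed()` receives an object outside its supported seed types, such as the `datetime` value the old code passed. The following is a minimal sketch using only the standard library; the helper name `try_seed` is hypothetical and not part of petl. It reports how the running interpreter reacts to a given seed value.

```python
import datetime
import random
import warnings


def try_seed(value):
    """Describe how random.seed() reacts to `value` (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is filtered out
        try:
            random.seed(value)
        except TypeError:
            return "TypeError"
    if any(issubclass(w.category, DeprecationWarning) for w in caught):
        return "DeprecationWarning"
    return "ok"


# Supported seed types are accepted without complaint.
print(try_seed(42))
print(try_seed("a string seed"))

# A datetime seed is what the old code passed; the result depends on the
# Python version in use.
print(try_seed(datetime.datetime.now()))
```

On Python 3.9 and 3.10 the `datetime` seed should report `DeprecationWarning`. From Python 3.11 onward, `random.seed()` raises `TypeError` for unsupported types, which is why a string seed is the safer choice.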
9 changes: 9 additions & 0 deletions docs/changes.rst
@@ -1,6 +1,15 @@
Changes
=======

Version 1.7.15
--------------

* Add unit tests for randomtable, dummytable, and their supporting functions and classes.
By :user:`bmos`, :issue:`657`.

* Fix: DeprecationWarning: Seeding based on hashing is deprecated since Python 3.9 and will be removed in a subsequent version.
By :user:`bmos`, :issue:`657`.

Version 1.7.14
--------------

93 changes: 93 additions & 0 deletions petl/test/util/test_random.py
@@ -0,0 +1,93 @@
import random as pyrandom
import time
from functools import partial

from petl.util.random import randomseed, randomtable, RandomTable, dummytable, DummyTable


def test_randomseed():
    """
    Ensure that randomseed provides a non-empty string that changes.
    """
    seed_1 = randomseed()
    time.sleep(1)
    seed_2 = randomseed()

    assert isinstance(seed_1, str)
    assert seed_1 != ""
    assert seed_1 != seed_2


def test_randomtable():
    """
    Ensure that randomtable provides a table with the right number of rows and columns.
    """
    columns, rows = 3, 10
    table = randomtable(columns, rows)

    assert len(table[0]) == columns
    assert len(table) == rows + 1


def test_randomtable_class():
    """
    Ensure that RandomTable provides a table with the right number of rows and columns.
    """
    columns, rows = 4, 60
    table = RandomTable(numflds=columns, numrows=rows)

    assert len(table[0]) == columns
    assert len(table) == rows + 1


def test_dummytable_custom_fields():
    """
    Ensure that dummytable provides a table with the right number of rows
    and that it accepts and uses custom column names provided.
    """
    columns = (
        ('count', partial(pyrandom.randint, 0, 100)),
        ('pet', partial(pyrandom.choice, ['dog', 'cat', 'cow', ])),
        ('color', partial(pyrandom.choice, ['yellow', 'orange', 'brown'])),
        ('value', pyrandom.random),
    )
    rows = 35

    table = dummytable(numrows=rows, fields=columns)
    assert table[0] == ('count', 'pet', 'color', 'value')
    assert len(table) == rows + 1


def test_dummytable_no_seed():
    """
    Ensure that dummytable provides a table with the right number of rows
    and columns when not provided with a seed.
    """
    rows = 35

    table = dummytable(numrows=rows)
    assert len(table[0]) == 3
    assert len(table) == rows + 1


def test_dummytable_int_seed():
    """
    Ensure that dummytable provides a table with the right number of rows
    and columns when provided with an integer as a seed.
    """
    rows = 35
    seed = 42
    table = dummytable(numrows=rows, seed=seed)
    assert len(table[0]) == 3
    assert len(table) == rows + 1


def test_dummytable_class():
    """
    Ensure that DummyTable provides a table with the right number of rows
    and columns.
    """
    rows = 70
    table = DummyTable(numrows=rows)

    assert len(table) == rows + 1
76 changes: 47 additions & 29 deletions petl/util/random.py
@@ -1,15 +1,24 @@
from __future__ import absolute_import, print_function, division


import datetime
import random
import hashlib
import random as pyrandom
import time
from collections import OrderedDict
from functools import partial

from petl.compat import xrange, text_type
from petl.util.base import Table


from petl.util.base import Table
def randomseed():
    """
    Obtain the hex digest of a sha256 hash of the
    current epoch time in nanoseconds.
    """

    time_ns = str(time.time()).encode()
    hash_time = hashlib.sha256(time_ns).hexdigest()
    return hash_time
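
Note that `time.time()` returns the epoch time as a float in seconds, so the value hashed above is the string form of that float rather than an integer nanosecond count. The following is a hedged sketch of a variant that hashes `time.time_ns()` instead. It assumes Python 3.7 or later, and `randomseed_ns` is a hypothetical name that is not part of this PR.

```python
import hashlib
import random
import time


def randomseed_ns():
    """Hypothetical variant: sha256 hex digest of the epoch time in nanoseconds."""
    time_ns = str(time.time_ns()).encode()  # integer nanoseconds, Python 3.7+
    return hashlib.sha256(time_ns).hexdigest()


seed = randomseed_ns()

random.seed(seed)  # a str is one of the supported seed types
sample = random.random()

random.seed(seed)  # reseeding with the same string replays the sequence
assert sample == random.random()
```

Either way the result is a 64-character hex string, which `random.seed()` accepts without a deprecation warning.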


def randomtable(numflds=5, numrows=100, wait=0, seed=None):
@@ -36,56 +45,60 @@
| 0.026535969683863625 | 0.1988376506866485 | 0.6498844377795232 |
+----------------------+----------------------+---------------------+
...
<BLANKLINE>

Note that the data are generated on the fly and are not stored in memory,
so this function can be used to simulate very large tables.
The only supported seed types are: None, int, float, str, bytes, and bytearray.

"""

return RandomTable(numflds, numrows, wait=wait, seed=seed)


class RandomTable(Table):

def __init__(self, numflds=5, numrows=100, wait=0, seed=None):
self.numflds = numflds
self.numrows = numrows
self.wait = wait
if seed is None:
self.seed = datetime.datetime.now()
self.seed = randomseed()
else:
self.seed = seed

def __iter__(self):

nf = self.numflds
nr = self.numrows
seed = self.seed

# N.B., we want this to be stable, i.e., same data each time
random.seed(seed)
pyrandom.seed(seed)

# construct fields
flds = ['f%s' % n for n in range(nf)]
flds = ["f%s" % n for n in range(nf)]
yield tuple(flds)

# construct data rows
for _ in xrange(nr):
# artificial delay
if self.wait:
time.sleep(self.wait)
yield tuple(random.random() for n in range(nf))
yield tuple(pyrandom.random() for n in range(nf))

def reseed(self):
self.seed = datetime.datetime.now()


def dummytable(numrows=100,
fields=(('foo', partial(random.randint, 0, 100)),
('bar', partial(random.choice, ('apples', 'pears',
'bananas', 'oranges'))),
('baz', random.random)),
wait=0, seed=None):
self.seed = randomseed()


def dummytable(
numrows=100,
fields=(
('foo', partial(pyrandom.randint, 0, 100)),
('bar', partial(pyrandom.choice, ('apples', 'pears', 'bananas', 'oranges'))),
('baz', pyrandom.random),
),
wait=0,
seed=None,
):
"""
Construct a table with dummy data. Use `numrows` to specify the number of
rows. Set `wait` to a float greater than zero to simulate a delay on each
@@ -108,14 +121,13 @@
| 4 | 'apples' | 0.09369523986159245 |
+-----+----------+----------------------+
...
<BLANKLINE>

>>> # customise fields
... import random
>>> import random as pyrandom
>>> from functools import partial
>>> fields = [('foo', random.random),
... ('bar', partial(random.randint, 0, 500)),
... ('baz', partial(random.choice,
... ['chocolate', 'strawberry', 'vanilla']))]
>>> fields = [('foo', pyrandom.random),
... ('bar', partial(pyrandom.randint, 0, 500)),
... ('baz', partial(pyrandom.choice, ['chocolate', 'strawberry', 'vanilla']))]
>>> table2 = etl.dummytable(100, fields=fields, seed=42)
>>> table2
+---------------------+-----+-------------+
@@ -132,20 +144,26 @@
| 0.4219218196852704 | 15 | 'chocolate' |
+---------------------+-----+-------------+
...
<BLANKLINE>

>>> table3_1 = etl.dummytable(50)
>>> table3_2 = etl.dummytable(100)
>>> table3_1[5] == table3_2[5]
False

Data generation functions can be specified via the `fields` keyword
argument.

Note that the data are generated on the fly and are not stored in memory,
so this function can be used to simulate very large tables.
The only supported seed types are: None, int, float, str, bytes, and bytearray.

"""

return DummyTable(numrows=numrows, fields=fields, wait=wait, seed=seed)


class DummyTable(Table):

def __init__(self, numrows=100, fields=None, wait=0, seed=None):
self.numrows = numrows
self.wait = wait
@@ -154,7 +172,7 @@
else:
self.fields = OrderedDict(fields)
if seed is None:
self.seed = datetime.datetime.now()
self.seed = randomseed()
else:
self.seed = seed

@@ -167,7 +185,7 @@
fields = self.fields.copy()

# N.B., we want this to be stable, i.e., same data each time
random.seed(seed)
pyrandom.seed(seed)

# construct header row
hdr = tuple(text_type(f) for f in fields.keys())
@@ -181,4 +199,4 @@
yield tuple(fields[f]() for f in fields)

def reseed(self):
self.seed = datetime.datetime.now()
self.seed = randomseed()
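
The constructors store a concrete seed instead of passing `None` through because `__iter__` reseeds the generator on every pass, so that repeated iteration yields the same rows. The following is a minimal stand-alone sketch of that pattern. It uses a hypothetical `SeededRows` class in place of petl's `RandomTable` so that it runs without petl installed.

```python
import random


class SeededRows:
    """Hypothetical stand-in mimicking the reseed-on-iteration pattern."""

    def __init__(self, numflds=3, numrows=5, seed=None):
        self.numflds = numflds
        self.numrows = numrows
        self.seed = seed

    def __iter__(self):
        # Reseed on every pass so each iteration replays the same data.
        random.seed(self.seed)
        # Header row first, then the data rows.
        yield tuple("f%s" % n for n in range(self.numflds))
        for _ in range(self.numrows):
            yield tuple(random.random() for _ in range(self.numflds))


rows = SeededRows(seed=42)
first_pass = list(rows)
second_pass = list(rows)

# Same seed, same rows on both passes.
assert first_pass == second_pass
```

If `self.seed` were left as `None`, each call to `random.seed(None)` would draw fresh entropy and the two passes would differ, so a seed is generated once when none is supplied.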