This repository has been archived by the owner on Jan 14, 2023. It is now read-only.

Adapt the app for Councilmatic 1.0 #33

Closed
jeancochrane wants to merge 16 commits into 1.0 from feature/jfc/councilmatic-1.0

jeancochrane

Overview

Make sure the app is compatible with django-councilmatic v1.0. The major changes in this update include:

  • Support Django 2.0+
  • Refactor queries in send_notifications management command to use the ORM and the new OCD model structure
  • Write some basic tests for the app

Testing instructions

  • Make a virtualenv for the project and install it for testing with pip install -e .[tests]
  • Run pytest and confirm all tests pass

@jeancochrane jeancochrane requested a review from hancush July 9, 2019 15:13
@hancush
Member

hancush commented Jul 9, 2019

@jeancochrane tests are failing for me.

first i ran into this.

(django-councilmatic-notifications) call-me-hank:django-councilmatic-notifications hannah$ pytest
Traceback (most recent call last):
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/bin/pytest", line 10, in <module>
    sys.exit(main())
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/_pytest/config/__init__.py", line 55, in main
    config = _prepareconfig(args, plugins)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/_pytest/config/__init__.py", line 200, in _prepareconfig
    pluginmanager=pluginmanager, args=args
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/hooks.py", line 289, in __call__
    return self._hookexec(self, self.get_hookimpls(), kwargs)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/manager.py", line 87, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/manager.py", line 81, in <lambda>
    firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/callers.py", line 203, in _multicall
    gen.send(outcome)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/_pytest/helpconfig.py", line 89, in pytest_cmdline_parse
    config = outcome.get_result()
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/callers.py", line 80, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
    res = hook_impl.function(*args)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/_pytest/config/__init__.py", line 661, in pytest_cmdline_parse
    self.parse(args)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/_pytest/config/__init__.py", line 869, in parse
    self._preparse(args, addopts=addopts)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/_pytest/config/__init__.py", line 825, in _preparse
    early_config=self, args=args, parser=self._parser
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/hooks.py", line 289, in __call__
    return self._hookexec(self, self.get_hookimpls(), kwargs)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/manager.py", line 87, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/manager.py", line 81, in <lambda>
    firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/callers.py", line 208, in _multicall
    return outcome.get_result()
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/callers.py", line 80, in get_result
    raise ex[1].with_traceback(ex[2])
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
    res = hook_impl.function(*args)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pytest_django/plugin.py", line 335, in pytest_load_initial_conftests
    _setup_django()
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/pytest_django/plugin.py", line 223, in _setup_django
    django.setup()
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/django/__init__.py", line 18, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/django/apps/registry.py", line 85, in populate
    app_config = AppConfig.create(entry)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/django/apps/config.py", line 116, in create
    mod = import_module(mod_path)
  File "/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'opencivicdata'

i think we need to install django-councilmatic from the 1.0 branch. i have not figured out a good way to do that in setup.py, but maybe you know one?

i did pip install -e .[tests] as directed, then pip install git+https://github.com/datamade/[email protected]. at that point, pytest threw an ImportMismatchError, so i ran export PY_IGNORE_IMPORTMISMATCH=1 as suggested in pytest-dev/py#200 (comment).

but then, my tests entered another world of hurt.

(django-councilmatic-notifications) call-me-hank:django-councilmatic-notifications hannah$ pytest tests/
=============================================================================== test session starts ================================================================================
platform darwin -- Python 3.7.3, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
Django settings: tests.test_config (from ini file)
rootdir: /Users/hannah/projects/django-councilmatic-notifications, inifile: setup.cfg
plugins: django-3.5.1, mock-1.10.4
collected 2 items / 1 errors / 1 selected

====================================================================================== ERRORS ======================================================================================
_______________________________________________________________________ ERROR collecting tests/test_views.py _______________________________________________________________________
ImportError while importing test module '/Users/hannah/projects/django-councilmatic-notifications/tests/test_views.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
ModuleNotFoundError: No module named 'tests.test_views'
================================================================================= warnings summary =================================================================================
/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/sql/base.py:49
  /Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/sql/base.py:49: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    class _DialectArgView(collections.MutableMapping):

/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py:374
[truncated output]
  /Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py:405: DeprecationWarning: `formatargspec` is deprecated since Python 3.5. Use `signature` and the `Signature` object directly
    formatvalue=lambda x: '=' + x)

/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/engine/result.py:182
  /Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/engine/result.py:182: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    from collections import Sequence

/Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/util/_collections.py:798
  /Users/hannah/.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/sqlalchemy/util/_collections.py:798: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    if not isinstance(x, collections.Iterable) or \

-- Docs: https://docs.pytest.org/en/latest/warnings.html
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
====================================================================== 465 warnings, 1 error in 0.36 seconds =======================================================================

so that seems like it was the wrong thing to do...

@hancush
Member

hancush commented Jul 9, 2019

oh, i see, it looks like test directories no longer need an __init__.py... or something? pytest-dev/pytest#3863

i deleted __init__.py and resolved the resultant complaint about relative imports by changing from .test_config_jurisdiction to from tests.test_config_jurisdiction in tests/test_config.py, but now django is protesting:

(django-councilmatic-notifications) call-me-hank:django-councilmatic-notifications hannah$ pytest -vvv -p no:warnings
=============================================================================== test session starts ================================================================================
platform darwin -- Python 3.7.3, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/hannah/.virtualenvs/django-councilmatic-notifications/bin/python3.7
cachedir: .pytest_cache
Django settings: tests.test_config (from ini file)
rootdir: /Users/hannah/projects/django-councilmatic-notifications, inifile: setup.cfg
plugins: django-3.5.1, mock-1.10.4
collected 2 items / 1 errors / 1 selected

====================================================================================== ERRORS ======================================================================================
________________________________________________________________ ERROR collecting tests/test_management_commands.py ________________________________________________________________
tests/test_management_commands.py:10: in <module>
    from notifications import models as notifications_models
notifications/models.py:35: in <module>
    class PersonSubscription(Subscription):
../../.virtualenvs/django-councilmatic-notifications/lib/python3.7/site-packages/django/db/models/base.py:95: in __new__
    "INSTALLED_APPS." % (module, name)
E   RuntimeError: Model class notifications.models.PersonSubscription doesn't declare an explicit app_label and isn't in an application in INSTALLED_APPS.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================================================= 1 error in 0.25 seconds ==============================================================================

@jeancochrane
Author

@hancush Thanks for flagging this! I did some digging and discovered that all of the mysterious errors go away when you remove the tests directory from your virtualenv (i.e. the root error is that the tests directory is getting installed by pip -- we don't actually want to ignore the ImportMismatch error in this case, since it's pointing out a real problem with the setup).

Unfortunately I'm still confused by Python packaging and I can't quite figure out why pip is installing the tests directory... setup.py specifies that notifications is the only package that should be installed, and MANIFEST.in specifically excludes tests/* from the package. I'll consult with Forest tomorrow and see if we can get past the issue.

@jeancochrane
Author

jeancochrane commented Jul 11, 2019

@hancush Got the testing environment fixed up thanks to @fgregg's help! You'll find updated installation instructions here: datamade/django-councilmatic#252

This still won't address the problem of specifying a GitHub source for django-councilmatic in setup.py, which @fgregg confirms there's no good way of doing. Instead, we agreed that the best path forward is probably to cut a 1.0 release of django-councilmatic and then pin this package to django-councilmatic>=1.0.
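A minimal sketch of what the pinned packaging metadata could look like once a 1.0 release exists. Only `notifications` and `django-councilmatic>=1.0` come from this thread; the project name, the `tests` extra contents (guessed from the pytest plugins in the output above), and the `SETUP_KWARGS` name are assumptions for illustration.

```python
# Sketch of setup.py metadata. Hypothetical names are marked as such.
SETUP_KWARGS = dict(
    name="django-councilmatic-notifications",  # assumed from the repo name
    # List the app package explicitly so tests/ is never installed.
    packages=["notifications"],
    install_requires=[
        # Pin to the released version instead of a GitHub URL.
        "django-councilmatic>=1.0",
    ],
    extras_require={
        # Enables `pip install -e .[tests]`; the exact list is an assumption.
        "tests": ["pytest", "pytest-django", "pytest-mock"],
    },
)

# A real setup.py would end with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
print(SETUP_KWARGS["install_requires"])
```

Keeping the dependency as a version specifier means pip resolves it from the package index, which sidesteps the question of how to point `install_requires` at a branch.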

Member

@hancush hancush left a comment

thank you so much for taking this on, @jeancochrane!

i especially like that you've kept the methods intact, even though they follow a similar pattern of behavior. i think that we erred a little too far on the side of abstraction in our now-obsolete import_data script, so i'm happy to see this landed closer to the middle!

to summarize the comments:

  • we started implementing custom managers for the first-class objects (events and bills, in particular) in django-councilmatic that cast their date attributes to datetime objects, but it seems like we didn't always handle null dates. do you think we ought to make that change in django-councilmatic, so you can leverage it here?
  • i think excluding objects with null end dates instead of improvising the coalesce operation would be more straightforward.
  • does distinct on id remove anything from a queryset?

happy to talk more about this out loud, if you like!

),
Value(str(self.get_threshold(minutes+1)))
),
# If the 'date' is null, set it beyond the threshold to
Member

could you simplify this by excluding actions with null dates before you run the query, e.g., BillAction.objects.exclude(date__exact='').annotate(...)?

Author

I had a feeling there was a smarter way to do this 😅 After refactoring the Coalesce logic into django-councilmatic, I'll swap this to .exclude(date__isnull=True).
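To make the two options concrete, here is a SQL-level sketch using the standard-library `sqlite3` module as a stand-in for the Postgres database. The table, column values, and sentinel are hypothetical; the point is only that coalescing a null date to a value that fails the filter selects the same rows as excluding null dates up front.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bill_action (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO bill_action (id, date) VALUES (?, ?)",
    [
        (1, "2019-07-09 15:00:00"),  # after the threshold
        (2, "2019-07-09 14:00:00"),  # before the threshold
        (3, None),                   # missing date
    ],
)

threshold = "2019-07-09 14:45:00"
sentinel = "2019-07-09 14:44:00"  # one minute before the threshold

# Approach 1: replace NULL with a value that can never pass the filter.
coalesced = conn.execute(
    "SELECT id FROM bill_action WHERE COALESCE(date, ?) >= ? ORDER BY id",
    (sentinel, threshold),
).fetchall()

# Approach 2: drop NULL dates first, then apply the same filter.
excluded = conn.execute(
    "SELECT id FROM bill_action "
    "WHERE date IS NOT NULL AND date >= ? ORDER BY id",
    (threshold,),
).fetchall()

print(coalesced, excluded)
```

Both queries return only the first row, so the exclusion reads more directly without changing the result.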

cursor = connection.cursor()
cursor.execute(new_actions, [tuple(bill_ids)])
def find_bill_action_updates(self, bill_ids, minutes=15):
new_actions = ocd_legislative_models.BillAction.objects.annotate(
Member

hm, this annotation operation strikes me as something that should happen in django-councilmatic, like we do for memberships. then you would query the proxy model, instead of the base ocd model. what do you think?

Author

That's definitely the right place for this to happen! If we can do that it'll also simplify these queries enormously. I'll go ahead and make a PR onto django-councilmatic moving this logic over.

committee_updates.append(committee_group)

for action in new_actions.distinct(
'date', 'organization__id', 'bill__id', 'id'
Member

what is the purpose of this distinct? wouldn't it return everything, if it's distinct on id (the pk?)

Author

I'm not sure! I couldn't figure out why it was in the previous query so I kept it on superstitiously. I suspect there was something about the preceding joins that could return duplicate rows. But since neither of us can figure it out I'll go ahead and remove it.

Member

yes, i think in general the distincts in the queries were because of the joins, e.g., joining event to event participant would return a row for every participant -> duplicate events. i feel ok about omitting them here!
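The join explanation can be checked with a small `sqlite3` sketch. The schema and rows here are hypothetical and much simpler than the OCD models: without a join, DISTINCT on the primary key removes nothing, while joining events to participants yields one row per participant, which is where DISTINCT starts to matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE event_participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        name TEXT
    );
    INSERT INTO event VALUES (1, 'Budget hearing'), (2, 'Zoning meeting');
    INSERT INTO event_participant VALUES
        (1, 1, 'Finance'), (2, 1, 'Rules'), (3, 2, 'Zoning');
""")

# No join: every event appears once, so DISTINCT on the pk is a no-op.
plain = conn.execute("SELECT id FROM event ORDER BY id").fetchall()
plain_distinct = conn.execute(
    "SELECT DISTINCT id FROM event ORDER BY id"
).fetchall()

# Join to participants: event 1 now appears once per participant.
join_sql = (
    "SELECT {cols} FROM event e "
    "JOIN event_participant p ON p.event_id = e.id ORDER BY e.id"
)
joined = conn.execute(join_sql.format(cols="e.id")).fetchall()
joined_distinct = conn.execute(
    join_sql.format(cols="DISTINCT e.id")
).fetchall()

print(plain, joined, joined_distinct)
```

Since the ORM queries in this PR filter on related fields rather than selecting across a raw join, dropping the distinct calls matches the first case above.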

def find_committee_event_updates(self, committee_ids, minutes=15):
committee_updates = []
for committee_id in committee_ids:
new_event_particip = ocd_legislative_models.EventParticipant.objects.annotate(
Member

it's becoming clear to me that we might need to handle null end dates in all the custom managers in django-councilmatic.

otoh, if you think it's just notifications that needs to deal with this, maybe break out this annotation into a helper method on this class, to reduce repeat code?

Member

in this particular instance, could you query the councilmatic Event, whose default manager includes the datetime annotation, then grab the event participant off those objects meeting the filter criteria?

Author

it's becoming clear to me that we might need to handle null end dates in all the custom managers in django-councilmatic.

It could be a good idea to standardize django-councilmatic model managers to provide _dt attributes that are cast to DateTimeFields with proper nulls, but I don't think I could make that change with confidence in the context of this PR. I went ahead and added/updated these attributes for models that are implicated in Notifications, however.

could you query the councilmatic Event, whose default manager includes the datetime annotation, then grab the event participant off those objects meeting the filter criteria?

I can't believe I missed EventManager! This is very smart -- done.

created_at__lte=self.get_threshold(minutes),
updated_at__gte=self.get_threshold(minutes),
start_datetime__gte=datetime.now(pytz.timezone(settings.TIME_ZONE))
).distinct('start_datetime', 'id').order_by('start_datetime')
Member

ditto distinct on id.

).filter(
created_at__gte=self.get_threshold(minutes),
start_datetime__gte=datetime.now(pytz.timezone(settings.TIME_ZONE))
).distinct('start_datetime', 'id').order_by('start_datetime')
Member

ditto distinct on id.

Author

Similar to above, is distinct necessary at all here?

SubscriptionsManageView, person_subscribe, person_unsubscribe, bill_subscribe, \
bill_unsubscribe, committee_events_subscribe, committee_events_unsubscribe, \
committee_actions_subscribe, committee_actions_unsubscribe, search_check_subscription, \
search_subscribe, search_unsubscribe, events_subscribe, events_unsubscribe, \
send_notifications

import django_rq
if django.VERSION < (1, 11):
Member

if this release depends on django-councilmatic>=1.0, and that requires django>=2.0, do we need to do this check?

Author

Great point! I thought I was being clever supporting multiple Django versions, but you're right, the dependency tree excludes Django <2.0.

from django.core.exceptions import ObjectDoesNotExist
from django.core.mail import EmailMessage
from django.core.cache import cache
from django.core import management

if django.VERSION < (2, 0):
Member

ditto on version question.

Author

@jeancochrane jeancochrane left a comment

Thanks for the review @hancush! Your point about factoring date-casting logic out into django-councilmatic was a really nice design suggestion and wound up simplifying the queries here by a big margin. I'll open up a PR in django-councilmatic corresponding to those changes.

I think my two biggest remaining questions are:

  1. What are the distinct filters doing in these queries, and do we need to preserve them in the ORM? As far as I can tell they're superfluous but I don't understand what they were doing before so I'm worried I don't grok the data model well enough to make an informed decision.

  2. Am I being too cautious about timezones? I made sure to adjust queries so that any time we retrieve a datetime.now() object we make it timezone-aware according to settings.TIME_ZONE, but if e.g. all OCD models are stored in the database in UTC time, then it might be better to enforce the timezone as UTC.
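On the timezone question, a short standard-library sketch of the behavior involved. Fixed UTC offsets stand in for real zones here (the UTC-5 offset is an assumption for illustration, not the project's `settings.TIME_ZONE`): two aware datetimes compare by instant regardless of the zone they are expressed in, while ordering a naive datetime against an aware one raises.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets stand in for real zones (assumed UTC-5 for "local").
utc = timezone.utc
local = timezone(timedelta(hours=-5))

now_utc = datetime(2019, 7, 15, 21, 21, tzinfo=utc)
now_local = now_utc.astimezone(local)

# Aware datetimes compare by instant, whatever zone they are shown in.
same_instant = now_utc == now_local
local_hour = now_local.hour

# Ordering a naive datetime against an aware one raises TypeError.
naive = datetime(2019, 7, 15, 21, 21)
try:
    naive < now_utc
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False

print(same_instant, local_hour, mixed_comparison_ok)
```

So the important property is that every value in a comparison is timezone-aware; which zone is used to build "now" does not change the outcome of the filter.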

}
for event_particip in new_event_particip.distinct(
'event__start_date', 'organization__id', 'event__id'
).order_by('-event__start_date'):
Author

@hancush Do you think this distinct filter is necessary?

@jeancochrane
Author

I think I'm all set for another look @hancush! Still curious what you think about timezones, but other than that this should be good to go.

@jeancochrane jeancochrane changed the base branch from master to 1.0 July 15, 2019 21:21
Member

@hancush hancush left a comment

re: datetime, it looks like created and updated at timestamps, plus event start times, are localized, so the way you've handled comparisons should be ok!

i do have a question about how to identify new or updated bill-related objects, e.g., bill actions and bill sponsorships. we used to separate these in the data import, but we don't do that anymore. i think we can use the created at timestamp from the bill to find new related objects, because sponsorships and actions associated with a new bill will also be new, but i don't think we can use the updated at timestamp to find updates to related objects, because it will be updated for any change in the bill.

am i missing something obvious, or misinterpreting intent? called out specific instances inline.

def find_bill_action_updates(self, bill_ids, minutes=15):
new_actions = councilmatic_models.BillAction.objects.filter(
bill__id__in=bill_ids,
date_dt__gte=self.get_threshold(minutes)
Member

hmm, this column refers to the date of the action itself, not necessarily when it was added to the database. my first instinct was to query through the updated at on the bill, however that can be toggled for any bill update, not just new actions. we used to create a new table by left joining incoming objects with the existing table to separate them into new and updated objects. i don't know that we have a good way of knowing whether a bill action (or other event- or bill-related object without its own updated at timestamp) is "new" in the refactor... @fgregg, do you have any thoughts here?
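A plain-Python sketch of the gap described here. The `Action` class, its `created_at` field, and the `threshold` helper are hypothetical stand-ins (the real `BillAction` has no timestamp of its own, which is the problem): filtering on the action's own date misses an older action that was only just imported.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Action:
    description: str
    date: datetime        # when the action happened
    created_at: datetime  # hypothetical: when the row was imported


def threshold(now, minutes=15):
    """Assumed reading of get_threshold: `minutes` before now."""
    return now - timedelta(minutes=minutes)


now = datetime(2019, 7, 15, 12, 0, tzinfo=timezone.utc)
just_imported = now - timedelta(minutes=5)
actions = [
    # Happened three days ago but only scraped five minutes ago.
    Action("Referred", now - timedelta(days=3), just_imported),
    # Happened and scraped within the window.
    Action("Passed", now - timedelta(minutes=10), just_imported),
    # Old action, imported long ago.
    Action("Introduced", now - timedelta(days=30), now - timedelta(days=30)),
]

by_date = [a.description for a in actions if a.date >= threshold(now)]
by_created = [a.description for a in actions if a.created_at >= threshold(now)]

print(by_date, by_created)
```

Filtering by date finds only "Passed", while filtering by the hypothetical import timestamp also finds "Referred", which is the notification a subscriber would otherwise never receive.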

person_updates = []
for person_id in person_ids:
new_sponsorships = councilmatic_models.BillSponsorship.objects.filter(
bill__created_at__gte=self.get_threshold(minutes),
Member

are sponsorships ever added some time after a bill is created? if so, i don't think this will capture them, but i have the same uncertainty about how to capture changes in related objects as above.

for committee_id in committee_ids:
new_actions = councilmatic_models.BillAction.objects.filter(
organization__id=committee_id,
date_dt__gte=self.get_threshold(minutes)
Member

ditto on changes in related objects.

@fgregg
Member

fgregg commented Jul 15, 2019

re: datetime, it looks like created and updated at timestamps, plus event start times, are localized, so the way you've handled comparisons should be ok!

i do have a question about how to identify new or updated bill-related objects, e.g., bill actions and bill sponsorships. we used to separate these in the data import, but we don't do that anymore. i think we can use the created at timestamp from the bill to find new related objects, because sponsorships and actions associated with a new bill will also be new, but i don't think we can use the updated at timestamp to find updates to related objects, because it will be updated for any change in the bill.

am i missing something obvious, or misinterpreting intent? called out specific instances inline.

You are right @hancush. I don't have a perfect idea. I think the best is to create subclasses in django-councilmatic that have updated_at fields.

@jeancochrane
Author

Creating subclasses with updated_at attributes seems reasonable. Will we then have to update the django-councilmatic signals to set these fields when objects get created or updated?

@fgregg
Member

fgregg commented Jul 16, 2019 via email

@jeancochrane
Author

Looks like the signals approach won't work for BillActions and BillSponsorships. Pupa uses bulk_create() to create related entities, which doesn't fire pre_save or post_save signals. @hancush and I talked about a few possible alternatives:

  1. Write a migration for django-councilmatic to set DB-level Postgres triggers, a la BGA. This is a close approximation to the signals approach, but feels like the wrong level of abstraction to me, in that it would introduce application-level functionality via a database migration.
  2. Adjust python-opencivicdata model managers to use a custom bulk_create method that fires post_save signals. This is the solution recommended by the SO threads we found on the topic (see example). This would let us preserve the signals approach as-is, but would require a higher-order change to an upstream library. The custom manager would also be a bit confusing since it would only exist to adjust behavior in an upstream library, not in python-opencivicdata itself.
  3. Adjust pupa to allow optionally creating related models with a naked save() instead of bulk_create(). This would preserve the signals approach, but require an upstream change and reduce the performance of the operation that creates related entities.
  4. Add created_at and updated_at fields to the RelatedBase class in python-opencivicdata. This is my personal favorite approach but I suspect it'll be controversial, since it requires a data model change to an upstream library. I don't fully understand why created_at and updated_at aren't already part of the spec for related entities, though, so maybe I'm missing something important.
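As a rough illustration of the problem and of option 2, here is a toy model in plain Python. None of these classes are Django's API; `Signal`, `Manager`, and `SignalingManager` are hypothetical stand-ins that only mirror the behavior described above, where a per-object save notifies receivers but a bulk write does not unless the manager sends the signal itself.

```python
class Signal:
    """Tiny stand-in for a post_save-style signal."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send(self, instance):
        for receiver in self.receivers:
            receiver(instance)


post_save = Signal()


class Manager:
    """Toy manager: save() signals, bulk_create() does not."""

    def __init__(self):
        self.rows = []

    def save(self, obj):
        self.rows.append(obj)
        post_save.send(obj)

    def bulk_create(self, objs):
        self.rows.extend(objs)  # one batch write, no per-object signal
        return objs


class SignalingManager(Manager):
    """Option 2: bulk_create that fires the signal for each new object."""

    def bulk_create(self, objs):
        created = super().bulk_create(objs)
        for obj in created:
            post_save.send(obj)
        return created


touched = []  # objects whose "updated_at" a receiver would have set
post_save.connect(touched.append)

Manager().bulk_create(["action-1", "action-2"])
after_plain = list(touched)

SignalingManager().bulk_create(["action-3"])
after_signaling = list(touched)

print(after_plain, after_signaling)
```

The trade-off is the same one noted in the list: the override restores per-object notifications at the cost of a loop over every created object, and it lives in a library that does not otherwise need it.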

Curious to hear what @fgregg thinks about all this.

@fgregg
Member

fgregg commented Jul 16, 2019 via email

@jeancochrane
Author

Superseded by #34.

@jeancochrane jeancochrane deleted the feature/jfc/councilmatic-1.0 branch July 31, 2019 16:43