feat: scytl connector #737

agreenspan24 · 2022-08-30T23:57:36Z

Scytl, or Clarity Elections, is a company that creates a tool for publishing election results in real-time. It's used in the U.S. by several states and over 800 counties, including Georgia, Colorado, Arkansas, and Dallas County, Texas.

For each participating election administrator, they publish a site with results that can be published on election night. Unfortunately, while that site contains downloadable data, that data is formatted in a complex way, making it difficult for developers to fetch election results. In general, their results either come zipped in either an unformatted text file or a complex XML document. Summary results come as a zipped CSV, but just contain top-line results. The JSON that powers the site results is even more complicated.

This connector provides methods to download the latest election results from their site and formats them into readable lists of dictionaries, which can easily be converted into a Parsons Table or Pandas dataframe.

Because this connector is can be useful for live reporting, it also contains a polling feature. As long as the class is instantiated, it will only fetch results that are new since the previous fetch of that method. To skip this feature, set force_update to true on any of the fetch methods.

Note: please do not confuse with Clarity Campaign Labs

Example microsite: https://results.enr.clarityelections.com/CO/105975/web.276603/

shaunagm · 2022-09-12T19:18:06Z

Hi @agreenspan24! Thanks so much for this contribution. I haven't had a chance to go through the whole PR yet but I'm hoping to get to it and leave a review later this week.

I appreciate you having thorough tests for you connector! I am a little hesitant about the 25-ish test files though. Are they all necessary? Might they cause issues for long-term maintainability if there are formatting changes?

agreenspan24 · 2022-09-13T01:20:13Z

Hi @agreenspan24! Thanks so much for this contribution. I haven't had a chance to go through the whole PR yet but I'm hoping to get to it and leave a review later this week.

I appreciate you having thorough tests for you connector! I am a little hesitant about the 25-ish test files though. Are they all necessary? Might they cause issues for long-term maintainability if there are formatting changes?

Hi Shauna! Makes total sense - was trying to keep the 18 county files to match the real results, but I think you're right to reduce them. I cut out all but two and adjusted the tests - run a good bit faster now too!

For maintainability, those files are just direct copies of real election results as Scytl holds them. If they change their format, I'm hoping we can just drop newer versions of those files into the tests and adjust the parser accordingly. But let me know your thoughts!

shaunagm

Overall looks good. Mostly docs issues here.

docs/scytl.rst

shaunagm · 2022-09-13T15:21:13Z

parsons/scytl/scytl.py

+            ex: Clarke
+    """
+
+    def __init__(self, state: str, election_id: str, county=""):


It's unusual for anything other than auth info to get passed in to a Parsons connector, although it certainly does happen, and it makes sense to pass in state and election_id here so they don't have to be passed in with get_summary_results, etc.

That said, county in particular confuses me. How does the county parameter here fit with the counties passed in with get_detailed_results_for_participating_counties? Does this county parameter get used at any other point? It seems like the other two are always state level but maybe I'm misunderstanding. Does passing in the county on init mean that get_summary_results and get_county_results apply only to the county?

Yeah, the setup is kinda confusing, would love your help navigating it.

Essentially, they have a concept of an election (at the state or county level), which can have summary results and detailed results. The detailed results show results by county for state level elections, while they show precinct results for a county-level election.

State-level elections can also have sub-elections in each county, and link to each of those county elections, which each have summary and detailed results. So, a county-level election won't have participating counties and the method won't make sense.

The main reason I wanted to add the state and election_id at the class level instead of the method is so the class could track the previous version- I have a feeling people would use this for polling and I want to make that as easy as possible for them to see if results have updated.

One idea might be to split this up into a StateScytlElection and CountyScytlElection class, which would have some of the same implementation, but not all of it. What do you think?

Sorry, that was a bit of a word salad. Does it make sense to you to split up the state-wide election and county-wide election into two separate connectors, since they have slightly different functions and outputs?

Separately, if we're using this for polling, to prevent fetching unnecessary results, does it make sense to you to store the previous version in the class or return it with the results and have them pass it back in the function?

parsons/scytl/scytl.py

shaunagm · 2022-09-13T15:35:43Z

@agreenspan24 I think your test fixes look good!

parsons/scytl/scytl.py

alxmrs

A few notes.

parsons/scytl/scytl.py

test/test_scytl/test_scytl.py

alxmrs · 2022-09-17T21:56:03Z

test/test_scytl/test_scytl.py

+        scy = Scytl(TEST_STATE, TEST_ELECTION_ID)
+        result = scy.get_summary_results()
+
+        with open(f'{_dir}/114729_summary_expected.csv', 'r') as expected:


I find it hard to understand if the data is correct since it's hard to compare the source and destination data files in this diff.

Yeah that's a reasonable point. I think it's important to use the zip files since unzipping is a big part of what we're testing.

I guess it's somewhat helpful to just read the expected CSV and make sure all the data looks good and reasonable, but not ideal.

test/test_scytl/test_scytl.py

agreenspan24

@alxmrs Thanks for the review! Still pretty new to proper python conventions, so I appreciate the deeper look! A couple of ideas here sound interesting to me, like typing and dataclasses, but I don't want to introduce new patterns without approval from maintainers, so @shaunagm let us know your thoughts!

parsons/scytl/scytl.py

test/test_scytl/test_scytl.py

agreenspan24 · 2022-09-18T23:27:48Z

test/test_scytl/test_scytl.py

+        scy = Scytl(TEST_STATE, TEST_ELECTION_ID)
+        result = scy.get_summary_results()
+
+        with open(f'{_dir}/114729_summary_expected.csv', 'r') as expected:


Yeah that's a reasonable point. I think it's important to use the zip files since unzipping is a big part of what we're testing.

I guess it's somewhat helpful to just read the expected CSV and make sure all the data looks good and reasonable, but not ideal.

parsons/scytl/scytl.py

shaunagm · 2022-09-19T15:36:54Z

Still pretty new to proper python conventions, so I appreciate the deeper look! A couple of ideas here sound interesting to me, like typing and dataclasses, but I don't want to introduce new patterns without approval from maintainers, so @shaunagm let us know your thoughts!

Responded re: dataclasses inline. The question of typing is a bigger one. We currently don't require any sort of type annotations and certainly not any strict enforcement, but perhaps that's something we could at least ask of contributors going forward. We don't have capacity to retroactively add typing to the existing codebase but incremental improvements are better than nothing.

Thanks for raising these questions @agreenspan24 & @alxmrs

agreenspan24 · 2022-09-19T22:14:06Z

@shaunagm any idea what's causing docs-build to fail? No clear answer from CircleCI

shaunagm · 2022-09-19T22:18:07Z

I don't think it's an issue with your PR. @SorenSpicknall? This and a couple other open PRs are failing with this same error Error: [Errno 2] No such file or directory: '/home/circleci/project/venv/bin/python3'

SorenSpicknall · 2022-09-20T21:11:08Z

I don't think it's an issue with your PR. @SorenSpicknall? This and a couple other open PRs are failing with this same error Error: [Errno 2] No such file or directory: '/home/circleci/project/venv/bin/python3'

I'm working on this now; definitely looks like a CircleCI config breakage. If we can't figure it out in the next day or so, we may just merge some of these newer PRs without the additional testing step in order to not pause ongoing work, since the PRs generally test green aside from the CI issue.

shaunagm · 2022-09-22T16:35:10Z

@SorenSpicknall I know you submitted a PR to fix the CI issue (#752), what do we need to do to get this PR passing? Do we need to rebase it onto the updated main branch?

SorenSpicknall · 2022-09-22T17:40:49Z

@SorenSpicknall I know you submitted a PR to fix the CI issue (#752), what do we need to do to get this PR passing? Do we need to rebase it onto the updated main branch?

Yes, changes from main will need to be merged in or the PR will need to be recreated from scratch based on that newly updated main branch. Sorry about the annoyance of that, @agreenspan24 - you just got unlucky with timing of a weird cache issue with our CI.

shaunagm · 2022-09-22T23:08:37Z

Let me know if you need help rebasing, @agreenspan24. It can be tricky.

shaunagm · 2022-10-20T17:28:53Z

Hi @agreenspan24! Just checking in. I know a lot of folks are busy with the election, so I wanted to offer a few different options here:

(a) I can rebase your PR for you (I should be able to as a maintainer)
(b) we can pair and I can walk you through the process (don't know if you've rebased before, but it can be intimidating)
(c) we can just wait for you to have a chance to do it yourself

Totally up to you!

agreenspan24 · 2022-10-20T18:33:10Z

Hi @agreenspan24! Just checking in. I know a lot of folks are busy with the election, so I wanted to offer a few different options here:

(a) I can rebase your PR for you (I should be able to as a maintainer) (b) we can pair and I can walk you through the process (don't know if you've rebased before, but it can be intimidating) (c) we can just wait for you to have a chance to do it yourself

Totally up to you!

@shaunagm so sorry I missed this request, I can find some time in the next few days to rebase

agreenspan24 · 2022-10-20T22:14:22Z

@shaunagm all done!

shaunagm · 2022-10-20T22:35:46Z

Thank you so much for the effort you put into this PR! Excited to finally get to merge it. :)

agreenspan24 added 8 commits August 11, 2022 22:50

SCYTL connector code

896b41e

OOP, polling handling, and some unit tests

54c52fb

precinct tests

f2c5750

force update and code annotations

5a381cd

partial doc

5a0b2e7

summary backups

dd523df

cleanup

1b9bc2c

linting

b0e1b57

fewer test files

a193879

shaunagm requested changes Sep 13, 2022

View reviewed changes

agreenspan24 added 2 commits September 15, 2022 18:14

Private method documentation

c5b8ea2

remove unnecessary backslashes

7f11a7f

alxmrs reviewed Sep 17, 2022

View reviewed changes

parsons/scytl/scytl.py Outdated Show resolved Hide resolved

alxmrs reviewed Sep 17, 2022

View reviewed changes

agreenspan24 added 2 commits September 18, 2022 19:49

code review cleanup

2ebebd3

document authentication is not needed

14cd545

agreenspan24 commented Sep 18, 2022

View reviewed changes

agreenspan24 added 3 commits September 19, 2022 12:28

Context manager, flush, and lint

4ceb1f6

use dataclass

3ca4d5c

Typing definitions

649ea5e

Merge branch 'move-coop:main' into ag-scytl-connector

1d637e8

agreenspan24 requested a review from shaunagm October 20, 2022 22:14

shaunagm approved these changes Oct 20, 2022

View reviewed changes

shaunagm added the new connector Work type - creating a new Parsons connector for a tool label Oct 20, 2022

shaunagm merged commit ba9cadd into move-coop:main Oct 20, 2022

agreenspan24 deleted the ag-scytl-connector branch October 20, 2022 22:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: scytl connector #737

feat: scytl connector #737

agreenspan24 commented Aug 30, 2022

shaunagm commented Sep 12, 2022

agreenspan24 commented Sep 13, 2022

shaunagm left a comment

shaunagm Sep 13, 2022

agreenspan24 Sep 15, 2022

agreenspan24 Sep 21, 2022

shaunagm commented Sep 13, 2022

alxmrs left a comment

alxmrs Sep 17, 2022

agreenspan24 Sep 18, 2022

agreenspan24 left a comment

agreenspan24 Sep 18, 2022

shaunagm commented Sep 19, 2022 •

edited

Loading

agreenspan24 commented Sep 19, 2022

shaunagm commented Sep 19, 2022

SorenSpicknall commented Sep 20, 2022

shaunagm commented Sep 22, 2022

SorenSpicknall commented Sep 22, 2022

shaunagm commented Sep 22, 2022

shaunagm commented Oct 20, 2022

agreenspan24 commented Oct 20, 2022

agreenspan24 commented Oct 20, 2022

shaunagm commented Oct 20, 2022

feat: scytl connector #737

feat: scytl connector #737

Conversation

agreenspan24 commented Aug 30, 2022

shaunagm commented Sep 12, 2022

agreenspan24 commented Sep 13, 2022

shaunagm left a comment

Choose a reason for hiding this comment

shaunagm Sep 13, 2022

Choose a reason for hiding this comment

agreenspan24 Sep 15, 2022

Choose a reason for hiding this comment

agreenspan24 Sep 21, 2022

Choose a reason for hiding this comment

shaunagm commented Sep 13, 2022

alxmrs left a comment

Choose a reason for hiding this comment

alxmrs Sep 17, 2022

Choose a reason for hiding this comment

agreenspan24 Sep 18, 2022

Choose a reason for hiding this comment

agreenspan24 left a comment

Choose a reason for hiding this comment

agreenspan24 Sep 18, 2022

Choose a reason for hiding this comment

shaunagm commented Sep 19, 2022 • edited Loading

agreenspan24 commented Sep 19, 2022

shaunagm commented Sep 19, 2022

SorenSpicknall commented Sep 20, 2022

shaunagm commented Sep 22, 2022

SorenSpicknall commented Sep 22, 2022

shaunagm commented Sep 22, 2022

shaunagm commented Oct 20, 2022

agreenspan24 commented Oct 20, 2022

agreenspan24 commented Oct 20, 2022

shaunagm commented Oct 20, 2022

shaunagm commented Sep 19, 2022 •

edited

Loading