Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spc #267

Merged
merged 6 commits into from
Jan 13, 2020
Merged

Spc #267

merged 6 commits into from
Jan 13, 2020

Conversation

AodhanSweeney
Copy link

Adding SPC

@lesserwhirls
Copy link
Collaborator

I had no idea Siphon was failing in so many places. It's hard for me to tell for sure, but it seems like it's not been passing for some time now. The good news is that your two classes are not causing any of the errors at this point, which is good.

The next step will be to write some tests for SPC and NCH. To avoid hitting live servers (which can go down and may not be reliable for testing purposes), we utilize vcrpy under the hood. Here is an example:

https://github.com/Unidata/siphon/blob/master/siphon/tests/test_wyoming.py

(the vcrpy library gets pulled in by from siphon.testing import get_recorder).

Basically, the first time you run the test code, any web request gets "recorded" by vcrpy in a "cassette" that will be played back each time you run the test in the future. These "cassettes" are stored in the github repository just like the code, and are found here.

It's not super complicated, but it's a little tricky the first time you go through the testing processing like this.

@zbruick
Copy link
Contributor

zbruick commented Jun 27, 2019

I'll get my hands dirty with siphon and work on fixing the Travis errors.

@AodhanSweeney
Copy link
Author

AodhanSweeney commented Jun 27, 2019 via email

@AodhanSweeney AodhanSweeney force-pushed the spc branch 2 times, most recently from b44c49a to 727cc1e Compare July 15, 2019 18:44
@zbruick
Copy link
Contributor

zbruick commented Jul 15, 2019

Note that this PR will close #259, #257, and #228.

@zbruick
Copy link
Contributor

zbruick commented Jul 15, 2019

The remaining Travis issue isn't caused by this PR (see discussion on #269). So as far as issues to fix, I think you're good for right now. I think this just needs to go to everyone (@lesserwhirls @dopplershift) for final review now.

@AodhanSweeney
Copy link
Author

I saw that it also wanted a copyright so i gave it that as well, I think this is as far as I can take it though... Thanks for all the help though man! @zbruick

url_count = 0
for url in url_links:
# Checking if url is valid, if status_code is 200 then website is active
if requests.get(url).status_code == 200:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be cases where you get another value of 2xx code that is actually ok. Rather than checking for 200 exactly, you can use raise_for_status() on the response, and if something went wrong (4xx, 5xx), requests will raise a requests.exceptions.HTTPError error. See https://3.python-requests.org/user/quickstart/#response-status-codes.

You could replace the whole if block with something like:

try:
    response = requests.get(url)
    response .raise_for_status()
    # rest of code after old if statement
    ...
except requests.exceptions.HTTPError as http_error:
    # give helpful message on console
    print("url {} was not valid, select different storm.".format(url))
    # raise the caught error
    raise    

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I'm wondering...since there is only one direct use of the requests library in your PR, perhaps class NHCD should subclass HTTPEndPoint, much like IAStateUpperAir does. @dopplershift @zbruick? Or is that something we refactor later, given the summer timeline?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think consistent methods for accessing data whenever possible across the package would be nice. Since I haven't played around with any of this, I'm not sure how much refactoring it would take, but given how IAStateUpperAir uses it, I don't think it would take that much time. @AodhanSweeney, chime in if this would take too long - not sure what's left on your to-do list for the summer. If too much, we can refactor down the road.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would still need to check if the URL was valid though, right? NDBC still does this with _check_if_url_valid, so I don't believe that's handled by HTTPEndPoint?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of two things to do to address this.

  1. replace all r.status_code == 200 checks with r.raise_for_status() (when you have a response obtained from HTTPEndPoint)

  2. extend HTTPEndPoint to have a method for checking if a url is reachable (is_url_reachable(url)?) which simply does a head request on the url followed by a raise_for_status() (would replace all _check_if_url_valid type things). Maybe something like:

def is_url_reachable(self, path, params=None):
        """Test if a url is reachable.

        Make a HEAD request, optionally including a parameters, to a path.
        The path of the request is the full URL.
        Parameters
        ----------
        path : str
            The URL to request
        params : DataQuery, optional
            The query to pass when making the request
        Returns
        -------
        resp : requests.Response
            The server's response to the request
        Raises
        ------
        HTTPError
            If the server returns a 4xx or 5xx code

        """
        resp = self._session.head(path, params=params)
        resp.raise_for_status()

        return true

Since get requests follow redirect by default, we should probably make sure the head call includes the parameter allow_redirects=True (head does not follow redirects by default), since the url would ultimately be reachable by a get request.

How does that sound?

import pandas as pd


class SpcData:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SpcData should also use HTTPEndPoint, in the way that NDBC does. In this case, we would use the methods from HTTPEndPoint to read the data from a URL, and pass that to pandas pd.read_csv by wrapping the HTTPEndPoint return in a StringIO. Again, @dopplershift @zbruick, is that something we refactor later, given the summer timeline?

@AodhanSweeney
Copy link
Author

This most recent pull request is good for approval, but I want to work something else out so no one needs to look at it quite yet

@codecov
Copy link

codecov bot commented Jul 23, 2019

Codecov Report

Merging #267 into master will increase coverage by 0.05%.
The diff coverage is 98.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #267      +/-   ##
==========================================
+ Coverage   97.79%   97.85%   +0.05%     
==========================================
  Files          40       44       +4     
  Lines        4318     4716     +398     
==========================================
+ Hits         4223     4615     +392     
- Misses         95      101       +6
Impacted Files Coverage Δ
siphon/tests/test_nhc.py 100% <100%> (ø)
siphon/tests/test_spc.py 100% <100%> (ø)
siphon/simplewebservice/iastate.py 98.43% <100%> (ø) ⬆️
siphon/simplewebservice/nhc.py 100% <100%> (ø)
siphon/http_util.py 98.18% <100%> (+0.06%) ⬆️
siphon/simplewebservice/spc.py 94.33% <94.33%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 881c77d...89ceea3. Read the comment docs.

@AodhanSweeney
Copy link
Author

This is now ready for review, once this is in I will be pushing respective GUIs into the python library.

@AodhanSweeney AodhanSweeney force-pushed the spc branch 2 times, most recently from b2c7ddf to 181dc9a Compare August 2, 2019 13:50
@AodhanSweeney
Copy link
Author

AodhanSweeney commented Aug 2, 2019

This should be ready for review, it addresses issues #228 #259 and #257

zbruick
zbruick previously approved these changes Aug 22, 2019
Copy link
Contributor

@zbruick zbruick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve but I'll defer to others on this before merging, as this is pretty extensive.

@dopplershift dopplershift added this to the Winter 2018 milestone Aug 26, 2019
@zbruick zbruick mentioned this pull request Sep 30, 2019
@lesserwhirls
Copy link
Collaborator

The build isn't passing at the moment, but outside of that, what remains on this one?

@dopplershift
Copy link
Member

A review of the API for the NHC stuff (signature of public functions) to make sure we're ok with supporting that interface long term.

@dopplershift dopplershift merged commit f39e322 into Unidata:master Jan 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants