Request to add Wycombe / Chiltern Council #154

Closed
adamcarter81 opened this issue Jan 6, 2023 · 16 comments
Labels: council request (A new council request)

Comments

@adamcarter81

adamcarter81 commented Jan 6, 2023

Name of Council

Chiltern Council

Example Postcode

HP10 9TX

Additional Information

First page: https://chiltern.gov.uk/collection-dates requires you to enter a postcode, press Submit, then choose a house number and press Submit again.
The next page gives the results.

The first page calls this URL after the postcode is entered:
https://chiltern.gov.uk/apiserver/postcode?callback=jQuery2240753028558035197_1673034357951&jsonrpc={"id":+25266114,"method":+"postcodeSearch","params":+{"provider":+"",++"postcode":+"hp10+9tx"}}&_=1673034357952
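
A minimal sketch of replaying that lookup with requests might look like the following (the endpoint, method name and payload come straight from the URL above; the JSONP handling and the response structure are my assumptions, and whether the endpoint answers a plain script at all is another matter):

    import json
    import requests

    # Sketch only: mirror the postcodeSearch call captured above
    payload = {
        "id": 25266114,
        "method": "postcodeSearch",
        "params": {"provider": "", "postcode": "hp10 9tx"},
    }
    resp = requests.get(
        "https://chiltern.gov.uk/apiserver/postcode",
        params={"callback": "jQuery123", "jsonrpc": json.dumps(payload)},
    )

    # The response is JSONP, e.g. jQuery123({...}); strip the callback wrapper
    # before parsing (assumption about the response format)
    body = resp.text
    result = json.loads(body[body.find("(") + 1 : body.rfind(")")])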

It seems to set a session ID after choosing a postcode/house number; this is then set as a cookie and passed to the next page, which displays the results for the next bin collection days (there is no calendar that I could find).

PS: This plugin is awesome! I wish the councils would standardise the data and have a public API!

Edit: #38 looks to be the same website

dp247 added the council request label on Jan 7, 2023
@dp247
Collaborator

dp247 commented Jan 7, 2023

Ah yes, Chiltern. Otherwise known as the bane of my Python skills 😆

@adamcarter81
Author

I've been getting nowhere for a while trying to get this to work in HA, so I was really happy when I found this project! Only to realise others have been here before :/

@dp247
Collaborator

dp247 commented Jan 7, 2023

It's been a while since I looked at it, so I'll give it another go 😁

@preator67
Contributor

preator67 commented Jan 27, 2023

@dp247 I might have a working implementation of this, utilising Selenium. It needs a little bit of work to bring it into line with the rest of the package, but I'm happy to do this if you don't already have something in the pipeline?

@dp247
Collaborator

dp247 commented Jan 30, 2023

@preator67 I've delegated it to @robbrad to look at. I'm not entirely against the idea, but I'm not sure how Selenium's dependencies would change/interfere with the project.

@robbrad
Owner

robbrad commented Jan 30, 2023

I'll take a look at this tomorrow night 👍

@robbrad
Owner

robbrad commented Jan 31, 2023

I'm not having the best fun with this one - the following should work as far as I can see, but no result is returned - they must have some rate limiting set up.

  # Requires (not shown in this snippet): import requests;
  # from bs4 import BeautifulSoup; from urllib.parse import urlparse, parse_qs
  def parse_data(self, page: str, **kwargs) -> dict:
      # Set up our session with browser-like headers
      s = requests.Session()
      headers = {'Host': 'chiltern.gov.uk',
          'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/109.0',
          'Accept': 'text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, */*; q=0.01',
          'Accept-Language': 'en-GB,en;q=0.5',
          'Accept-Encoding': 'gzip, deflate, br',
          'X-Requested-With': 'XMLHttpRequest',
          'Connection': 'keep-alive',
          'Sec-Fetch-Dest': 'empty',
          'Sec-Fetch-Mode': 'cors',
          'Sec-Fetch-Site': 'same-origin'}

      s.headers.update(headers)

      first_load = s.get('https://chiltern.gov.uk/collection-dates')

      soup = BeautifulSoup(first_load.text, features="html.parser")
      soup.prettify()
      action = soup.find('form', id='COPYOFECHOCOLLECTIONDATES_FORM').get('action')
      parsed_url = urlparse(action)
      captured_values = parse_qs(parsed_url.query)

      form_data = {
          'COPYOFECHOCOLLECTIONDATES_PAGESESSIONID': captured_values['pageSessionId'][0],
          'COPYOFECHOCOLLECTIONDATES_SESSIONID': captured_values['fsid'][0],
          'COPYOFECHOCOLLECTIONDATES_NONCE': captured_values['fsn'][0],
          'COPYOFECHOCOLLECTIONDATES_VARIABLES': 'e30=',
          'COPYOFECHOCOLLECTIONDATES_PAGENAME': 'ADDRESSSELECTION',
          'COPYOFECHOCOLLECTIONDATES_PAGEINSTANCE': '0',
          'COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_ADDRESSSELECTIONPOSTCODE': 'HP10 9TX',
          'COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_ADDRESSSELECTIONADDRESS': 14,
          'COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_SELECTEDADDRESS':'31 STATION ROAD\nLOUDWATER\nHIGH WYCOMBE\nHP10 9TX',
          'COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_UPRN':'100081167425',
          'COPYOFECHOCOLLECTIONDATES_FORMACTION_NEXT': 'COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_NAV1'
          }
      inter_request = s.post(action, data=form_data)
      bin_data_page = s.get(f"https://chiltern.gov.uk/collection-dates?pageSessionId={captured_values['pageSessionId'][0]}&fsn={captured_values['fsn'][0]}")

      data = {"bins": []}

@preator67 - are you using https://selenium-python.readthedocs.io/ ? As long as:

  1. It's in Python
  2. It has an integration test
  3. It has unit test coverage
  4. And above all produces JSON (see the sketch below)

I'm more than happy to have Selenium do the heavy lifting - do you want to submit a PR?
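
For reference, the dict the parsers in this thread build (which then gets serialised to JSON) looks roughly like this - the bin type names and dates below are made up, only the structure matters:

    # Illustrative only - type names and dates are invented
    data = {
        "bins": [
            {"type": "Domestic Waste", "collectionDate": "17/02/2023"},
            {"type": "Recycling", "collectionDate": "24/02/2023"},
        ]
    }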

@preator67
Contributor

@robbrad yes, that’s the one. What I have working ticks the boxes on 1 and 4. Give me a little time to fully address/check 2 and 3, and I’ll submit a PR

@robbrad
Owner

robbrad commented Feb 4, 2023

I can help on 2/3 if you need it or get stuck 👍

Check contributing.md if you need guidance, or look at the Cheshire East council and previous PRs.

I'm actually quite excited to see some Selenium in action.

@preator67
Contributor

Sorry to be slow on this. I'm struggling with the integration/unit tests, but this may be because I've had to go about things abnormally. @robbrad I would therefore appreciate some input on whether this approach is workable with the existing framework.

Firstly, I've had to execute the file as:
python collect_data.py Chilterns https://chiltern.gov.uk/collection-dates -p "HP14 4LA" -n "HUGHENDEN MANOR, MANOR ROAD, HUGHENDEN VALLEY, HIGH WYCOMBE" -s SKIP_GET_URL

The number needs to be the address as it appears on the council website, which isn't ideal. I've also had to add the SKIP_GET_URL argument so I can use a custom get_data function. This initially returns an error:
UnboundLocalError: local variable 'bin_data_dict' referenced before assignment
but this can be solved by assigning the else statement to bin_data_dict on line 60 of get_bin_data.py (roughly as sketched below) - although I'm not sure if this is an acceptable change?
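
To illustrate what I mean, this is only a hypothetical reproduction of the pattern, not the actual code in get_bin_data.py:

    # Hypothetical sketch of the error pattern, not the real get_bin_data.py code
    def run(council, url, skip_get_url, **kwargs):
        if not skip_get_url:
            page = council.get_data(url)
            bin_data_dict = council.parse_data(page, **kwargs)
        else:
            # Without an assignment here, the return below raises
            # UnboundLocalError whenever -s SKIP_GET_URL is passed
            bin_data_dict = council.parse_data("", **kwargs)
        return bin_data_dict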

The following then produces a JSON in the correct format:

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options
from uk_bin_collection.uk_bin_collection.common import *
from uk_bin_collection.uk_bin_collection.get_bin_data import AbstractGetBinDataClass


class CouncilClass(AbstractGetBinDataClass):
    """
    Concrete classes have to implement all abstract operations of the
    base class. They can also override some operations with a default
    implementation.
    """

    def get_data(self, df) -> dict:

        # Create dictionary of data to be returned
        data = {"bins": []}

        # Output collection data into dictionary
        for i, row in df.iterrows():
            dict_data = {
                "type": row['Collection Name'],
                "collectionDate": row['Next Collection Due'],
                        }
    
            data["bins"].append(dict_data)

        return data


    def parse_data(self, page: str, **kwargs) -> dict:

        page = 'https://chiltern.gov.uk/collection-dates'

        # Assign user info
        user_postcode = kwargs.get("postcode")
        user_paon = kwargs.get("paon")
        
        # Set up Selenium to run 'headless'
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')
        options.add_argument('--no-sandbox')
        options.add_argument('--disable-gpu')
        options.add_argument('--disable-dev-shm-usage')

        # Create Selenium webdriver
        driver = webdriver.Chrome(options=options)
        driver.get(page)

        # Enter postcode in text box and wait
        inputElement_pc = driver.find_element(
            By.ID, "COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_ADDRESSSELECTIONPOSTCODE")
        inputElement_pc.send_keys(user_postcode)
        inputElement_pc.send_keys(Keys.ENTER)

        time.sleep(4)

        # Select address from dropdown and wait
        inputElement_ad = Select(driver.find_element(
            By.ID,"COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_ADDRESSSELECTIONADDRESS"))

        inputElement_ad.select_by_visible_text(user_paon)
        
        time.sleep(4)

        # Submit address information and wait
        inputElement_bn = driver.find_element(
            By.ID, "COPYOFECHOCOLLECTIONDATES_ADDRESSSELECTION_NAV1_NEXT").click()
        
        time.sleep(4)
       
        # Read next collection information into Pandas
        table = driver.find_element(By.ID, "COPYOFECHOCOLLECTIONDATES_PAGE1_DATES2").get_attribute('outerHTML')
        df = pd.read_html(table, header=[1])
        df = df[0]

        # Quit the headless browser now that the table HTML has been captured
        driver.quit()

        # Parse data into dict
        data = self.get_data(df)

        return data

I believe it is then failing the integration test because it is trying to execute without the -s option, but I'm unclear on how I can add this to the input.json. For instance, I tried:

"Chilterns": { "url": "https://chiltern.gov.uk/collection-dates", "postcode": "HP14 4LA", "house_number": "HUGHENDEN MANOR, MANOR ROAD, HUGHENDEN VALLEY, HIGH WYCOMBE", "SKIP_GET_URL": "SKIP_GET_URL" },

but this did not seem to have any effect?

@robbrad
Owner

robbrad commented Feb 13, 2023

Looks really good @preator67 - to add the ability to take it from the input JSON, you need to add a small change to https://github.com/robbrad/UKBinCollectionData/blob/master/uk_bin_collection/tests/features/steps/validate_council.py#L25

This will then use the switch as specified - if you want to share a repo where you have this, I could try it out?
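
Something along these lines should do it (a rough sketch only - the key name and variable names are just to illustrate the idea, not the exact contents of validate_council.py):

    # Rough illustration only - key names mirror the input.json entry above;
    # the surrounding step code in validate_council.py is not quoted here
    metadata = {
        "url": "https://chiltern.gov.uk/collection-dates",
        "postcode": "HP14 4LA",
        "house_number": "HUGHENDEN MANOR, MANOR ROAD, HUGHENDEN VALLEY, HIGH WYCOMBE",
        "skip_get_url": "SKIP_GET_URL",
    }
    args = [metadata["url"], "-p", metadata["postcode"], "-n", metadata["house_number"]]
    if metadata.get("skip_get_url"):
        args.extend(["-s", "SKIP_GET_URL"])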

@preator67
Contributor

preator67 commented Feb 18, 2023

Thanks for the pointer @robbrad - makes sense when you know where to look. I've submitted a PR. If it's all OK, I can write some info for the Wiki - as this requires a bit more info from the user than other councils.

@robbrad
Owner

robbrad commented Feb 18, 2023

Fantastic work @preator67

@robbrad
Owner

robbrad commented Feb 18, 2023

@preator67 - the integration tests came back saying lxml was missing - does this need adding to the Poetry.toml?

https://robbrad.github.io/UKBinCollectionData/3.9/449/#categories/66e5ec8c5c97ebb7160e51a452a5e3ba/54003f6f2df25442/

@preator67
Contributor

@robbrad Yes, it does - apologies, not sure how that got missed off.
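
(For anyone following along: assuming Poetry manages the dependencies here, running poetry add lxml should add it to pyproject.toml and the lock file.)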

@OliverCullimore
Collaborator

This one looks to be all implemented, closing.
