feat: allow to reuse a browser by passing a browserContext #884

Open
wants to merge 11 commits into
base: master

Conversation

daniel-hauser
Contributor

This pull request changes the page creation logic by allowing a browserContext to be passed instead of a browser (which is currently closed when the scraping ends).

@daniel-hauser daniel-hauser changed the title feat: Allow to reuse a browser by passing a browserContext feat: allow to reuse a browser by passing a browserContext Oct 3, 2024
@daniel-hauser daniel-hauser marked this pull request as ready for review October 5, 2024 20:36

Copilot reviewed 3 out of 3 changed files in this pull request and generated no suggestions.

@Copilot Copilot AI left a comment

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (3)

src/scrapers/base-scraper-with-browser.ts:85

  • [nitpick] The word 'bang' might be confusing. It would be clearer to use 'exclamation mark (!)' instead.
NOTICE - it is discouraged to use bang (!) in general.

src/scrapers/base-isracard-amex.ts:33

  • The ExtendedScraperOptions interface was removed, but there was no mention of its removal in the context provided. Ensure that this interface is not used elsewhere in the codebase.
type CompanyServiceOptions = {

src/scrapers/base-isracard-amex.ts:258

  • [nitpick] The parameter name 'options' is of type 'CompanyServiceOptions'. It would be clearer to rename it to 'companyServiceOptions'.
async function getExtraScrapTransaction(page: Page, options:CompanyServiceOptions, month: Moment, accountIndex: number, transaction: Transaction): Promise<Transaction> {

src/scrapers/base-scraper-with-browser.ts
@@ -26,49 +26,40 @@ export interface FutureDebit {
bankAccountNumber?: string;
}

export interface ScraperOptions {
interface ExternalBrowserOptions {
Collaborator

Please explain how the single ScraperOptions interface became a union of three interfaces plus a fourth interface.

Sorry, something is not clear here, as you can see.

Maybe it is only a naming issue, but it is not clear to me what the difference is between ExternalBrowserOptions and ExternalBrowserContextOptions. Are they both new ways to reuse a browser?
And if they are, why is there a third interface, DefaultBrowserOptions? I guess this is the regular way of launching a new browser for each scraper.

Now after your change it is clear there are two different things inside one interface: Browser Settings and Scraping Settings. Maybe we need to split the interfaces or at least insert one of the subjects inside a property (scrapeConfig: ScrapeConfig) in the other one.

Contributor Author

> Please explain how the single ScraperOptions interface became a union of three interfaces plus a fourth interface.
> ...
> Maybe it is only a naming issue, but it is not clear to me what the difference is between ExternalBrowserOptions and ExternalBrowserContextOptions. Are they both new ways to reuse a browser?

The code had two ways to create a page:

  1. The default way: the scraper creates the browser, and the caller can supply some options for its creation.
    • Named DefaultBrowserOptions in this PR
  2. The non-default way: the caller supplies an external, single-use browser (single-use because browser.close() is called when scraping ends).
    • Named ExternalBrowserOptions in this PR

In this PR I added a third way, where the caller supplies only a BrowserContext and remains responsible for creating and tearing down the browser (named ExternalBrowserContextOptions).
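The three ways described above can be sketched as a union type. This is only an illustration, not the PR's actual code: the interface names and ScraperBrowserOptions come from this thread, while the field names (showBrowser, executablePath, browser, browserContext), the stand-in Browser and BrowserContext shapes, and the planPageCreation helper are assumptions made for the example. It also assumes the scraper closes a browser it launched itself.

```typescript
// Stand-ins for the puppeteer types, reduced to what this sketch needs (assumed shapes).
interface Page { url(): string }
interface BrowserContext { newPage(): Promise<Page> }
interface Browser { newPage(): Promise<Page>; close(): Promise<void> }

// 1. Default way: the scraper launches the browser itself (field names are assumptions).
interface DefaultBrowserOptions { showBrowser?: boolean; executablePath?: string }
// 2. Non-default way: the caller hands over a browser that the scraper closes afterwards.
interface ExternalBrowserOptions { browser: Browser }
// 3. New way: the caller hands over only a context and keeps ownership of the browser.
interface ExternalBrowserContextOptions { browserContext: BrowserContext }

type ScraperBrowserOptions =
  | DefaultBrowserOptions
  | ExternalBrowserOptions
  | ExternalBrowserContextOptions;

// Hypothetical helper: reports where the page would come from and who closes the browser.
interface PagePlan {
  source: "launched" | "external-browser" | "external-context";
  scraperClosesBrowser: boolean;
}

function planPageCreation(options: ScraperBrowserOptions): PagePlan {
  if ("browserContext" in options) {
    // Caller owns the browser, so the scraper must not close it.
    return { source: "external-context", scraperClosesBrowser: false };
  }
  if ("browser" in options) {
    // Single-use external browser: the scraper calls browser.close() when done.
    return { source: "external-browser", scraperClosesBrowser: true };
  }
  // No external object supplied: the scraper launches and (assumed) closes its own browser.
  return { source: "launched", scraperClosesBrowser: true };
}
```

Under these assumptions, only the browserContext branch leaves the browser open, which is what makes reuse across several scrapers possible.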

> Now after your change it is clear there are two different things inside one interface: Browser Settings and Scraping Settings. Maybe we need to split the interfaces or at least insert one of the subjects inside a property (scrapeConfig: ScrapeConfig) in the other one.

To stay backwards compatible, I didn't want to add a new browserOptions property to ScraperOptions, so I left it as export type ScraperOptions = ScraperBrowserOptions & { ... /* All non-browser options */ }
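A minimal sketch of why the intersection keeps existing callers working, under stated assumptions: the non-browser fields (companyId, startDate), the browser field names, the stand-in types, and the callerOwnsBrowser helper are all hypothetical and chosen only for illustration. The types are repeated so the block stands alone.

```typescript
// Stand-in types (assumed shapes), repeated here so this block is self-contained.
interface BrowserContext { newPage(): Promise<unknown> }
interface Browser { newPage(): Promise<unknown>; close(): Promise<void> }

interface DefaultBrowserOptions { showBrowser?: boolean }
interface ExternalBrowserOptions { browser: Browser }
interface ExternalBrowserContextOptions { browserContext: BrowserContext }

type ScraperBrowserOptions =
  | DefaultBrowserOptions
  | ExternalBrowserOptions
  | ExternalBrowserContextOptions;

// Browser options stay at the top level, intersected with the non-browser options.
// companyId and startDate are placeholder fields for this example.
type ScraperOptions = ScraperBrowserOptions & {
  companyId: string;
  startDate: Date;
};

// A flat, pre-existing style of call still type-checks unchanged.
const legacyCall: ScraperOptions = {
  companyId: "exampleBank",
  startDate: new Date(2024, 0, 1),
  showBrowser: false,
};

// The new style only adds a top-level browserContext property.
const sharedContext: BrowserContext = { newPage: async () => ({}) };
const contextCall: ScraperOptions = {
  companyId: "exampleBank",
  startDate: new Date(2024, 0, 1),
  browserContext: sharedContext,
};

// Hypothetical helper: true when the caller keeps ownership of the browser.
function callerOwnsBrowser(options: ScraperOptions): boolean {
  return "browserContext" in options;
}
```

The nested alternative raised in the review (a separate property holding one group of settings) would separate the two concerns more visibly, but it would require every existing caller to move fields into the new property.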

2 participants