
API poller

Amy Farrell edited this page Jan 18, 2025 · 3 revisions

Part of USAGov Architecture patterns

We can use the API poller pattern when the data will vary over time, but will be the same for every visitor within a given timeframe.

  • Audience: general public
  • Purpose: Retrieve data from an API server and summarize it for use in a web page or web application
    • Authenticate securely with the API server
    • Limit the number of API requests we make, compared with on-demand queries
    • Refresh data at a regular rate, independent of updates to the Public static site

In the "API poller" pattern, an app gathers data from an API server on a periodic basis and provides the resulting data (typically on a web page) via client-side JavaScript. This approach is suitable for data that changes over time but does not change based on the parameters of a request. For example, the data is:

  • Not personalized
  • Not generated in response to a query
  • Public
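One polling cycle can be sketched roughly as follows. This is a minimal illustration, not the code of any existing USAGov poller. It assumes a JSON API with bearer-token authentication and an app that has the AWS CLI installed. The names `API_URL`, `API_TOKEN`, `BUCKET`, the `wait_seconds` field, and the `summary.json` key are all hypothetical.

```python
import json
import os
import subprocess
import tempfile
import urllib.request
from datetime import datetime, timezone

# Hypothetical settings; real names and values depend on the deployment.
API_URL = os.environ.get("API_URL", "https://api.example.gov/v1/status")
API_TOKEN = os.environ.get("API_TOKEN", "")
BUCKET = os.environ.get("BUCKET", "example-api-results")


def fetch_raw(url=API_URL, token=API_TOKEN):
    """Call the external API server-to-server, so the token never reaches a browser."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    # urllib honors the https_proxy environment variable, which is one way
    # to route the request through an egress proxy.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize(raw):
    """Keep only the public fields the page needs (the field name is an assumption)."""
    return {
        "wait_seconds": raw.get("wait_seconds"),
        "updated": datetime.now(timezone.utc).isoformat(),
    }


def write_to_s3(summary, bucket=BUCKET, key="summary.json"):
    """Upload the summary as a JSON object with the AWS CLI (`aws s3 cp`)."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as handle:
        json.dump(summary, handle)
    try:
        subprocess.run(
            ["aws", "s3", "cp", handle.name, f"s3://{bucket}/{key}",
             "--content-type", "application/json"],
            check=True,
        )
    finally:
        os.unlink(handle.name)


def poll_once(fetch=fetch_raw, write=write_to_s3):
    """One polling cycle: fetch, summarize, publish."""
    summary = summarize(fetch())
    write(summary)
    return summary
```

Because every visitor reads the same stored object, the external API sees one request per cycle regardless of how much traffic the page receives.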

Advantages

  • Highly scalable; traffic from the public hits an S3 bucket. We could add CloudFront if further caching is needed.
  • Consistent rate of use of the polled API; no danger of exceeding a quota
  • Server-to-server API access protects API credentials
C4Context
  title API poller
  Boundary(internet, "Internet", "the web") {
    Person(publicUser, "Public web user")
    System("externalAPI", "External API Server")
    Boundary(cloud_gov_boundary, "Cloud.gov", "") {
      Boundary(usagov_boundary, "USAgov org boundary", "") {
        Boundary(asg_restricted, "Restricted egress space", "") {
          Container(api_poller, "API poller", "periodic updates", "Accesses External API, then writes data to S3", "")
        }
        Boundary(asg_public_egress, "Public egress space", "") {
          Container(egress_proxy, "Egress proxy", "", "")
        }
        ContainerDb(s3_api_storage, "S3 (API results)", "")
      }
    }
  }
  Rel(api_poller, s3_api_storage, "aws cli", "WRITE")
  Rel(api_poller, egress_proxy, "HTTPS", "HEAD/GET/*")
  Rel(egress_proxy, externalAPI, "HTTPS", "HEAD/GET/*")
  Rel(publicUser, s3_api_storage, "HTTPS", "HEAD/GET")
  UpdateRelStyle(publicUser, s3_api_storage, $offsetX="0", $offsetY="-90")
  UpdateRelStyle(api_poller, s3_api_storage, $offsetX="40", $offsetY="50")
  UpdateRelStyle(api_poller, egress_proxy, $offsetX="-20", $offsetY="10")
  UpdateRelStyle(egress_proxy, externalAPI, $offsetX="120", $offsetY="200")
  UpdateElementStyle(egress_proxy, $bgColor="yellow", $fontColor="black")

Implementation notes

The S3 bucket must be publicly readable and must have a CORS policy that permits client-side requests from the origins of the web pages that rely on the stored data.
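As an illustration of such a CORS policy, the sketch below builds a read-only S3 CORS configuration and writes it to a file. The origin `https://www.example.gov` and the file name `cors.json` are placeholders, and the `MaxAgeSeconds` value is an arbitrary choice.

```python
import json

# Hypothetical origin; list the sites whose pages will fetch the stored data.
ALLOWED_ORIGINS = ["https://www.example.gov"]


def build_cors_configuration(origins):
    """Return an S3 CORS configuration that allows read-only browser access."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": list(origins),
                "AllowedMethods": ["GET", "HEAD"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    }


if __name__ == "__main__":
    # Write the policy to a file that can be passed to the AWS CLI.
    with open("cors.json", "w") as handle:
        json.dump(build_cors_configuration(ALLOWED_ORIGINS), handle, indent=2)
```

The resulting file can then be applied with `aws s3api put-bucket-cors --bucket <bucket> --cors-configuration file://cors.json`. Allowing only `GET` and `HEAD` matches the read-only access the public needs.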

Implementation of the API poller application varies. Periodic tasks can be achieved by any of:

  • Using the Cron app
  • Running a loop (the Analytics Reporter does this)
  • Using a task runner (e.g., in CircleCI)

The Cron app can host a small script directly, or it can trigger a Cloud Foundry task in another app.
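For the loop option, a minimal sketch might look like the following. It is not the Analytics Reporter's actual code. The function name `run_periodically`, the five-minute default interval, and the `max_cycles` parameter (included only so the loop can be stopped in tests) are assumptions.

```python
import time


def run_periodically(task, interval_seconds=300, sleep=time.sleep, max_cycles=None):
    """Run `task` every `interval_seconds`; a failed cycle does not stop the loop."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        started = time.monotonic()
        try:
            task()
        except Exception as error:
            # Keep polling; the last good data remains available in S3.
            print(f"poll failed: {error}")
        cycles += 1
        # Sleep for the remainder of the interval so the request rate stays steady.
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval_seconds - elapsed))
    return cycles
```

Subtracting the time the task took from the sleep keeps the polling rate close to one request per interval, which supports the quota advantage described earlier. With cron or a task runner, the scheduler provides this timing instead.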

Examples

  • Analytics Reporter
    • The Analytics Reporter is the API poller. It is a Node.js application that runs a loop to query the Google Analytics API periodically.
    • The Public static web site serves JavaScript which, when executed by the client, retrieves data from the S3 bucket, plus files to format the results. This is the analytics dashboard.
  • Call center wait time poller
    • A script on the Cron app is the API poller. In this case, cron provides the periodicity.
    • The Public static web site serves JavaScript which, when executed by the client, retrieves data from the S3 bucket.

Key connections to other systems

  • The Egress proxy (which is shown in this diagram in yellow)
  • A log drain connects the API poller app to the Log shipper.