Dashboard not initially showing proper traffic at first + N/A data #23

Open
kratsg opened this issue Apr 13, 2021 · 2 comments

kratsg commented Apr 13, 2021

Hi, thanks a lot for the very nice write-up and documentation here (and dashboard here: https://grafana.com/grafana/dashboards/2870 ).

There are a few things that still mystify me perhaps, and I'm not sure what. First, the entrypoints confused me a fair bit since I could barely find documentation on it. Here's what my docker-compose ended up looking like:

  traefik:
    restart: always
    image: "traefik:v2.3"
    container_name: "traefik"
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      # enable 80, 443, 27017, 8082
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.websecure-mongodb.address=:27017"
      - "--entrypoints.metrics.address=:8082"
      # redirect 80 to 443
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      # automatic certificate generation for SSL
      - "--certificatesresolvers.le.acme.tlschallenge=true"
      - "--certificatesresolvers.le.acme.email=[email protected]"
      - "--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json"
      - "--log.level=DEBUG"
      # get dashboard/api
      - "--api=true"
      - "--api.dashboard=true"
      # enable metrics with prometheus
      - "--metrics=true"
      - "--metrics.prometheus=true"
      - "--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000"
      - "--metrics.prometheus.entrypoint=metrics"
      - "--metrics.prometheus.addEntryPointsLabels=true"
      - "--metrics.prometheus.addServicesLabels=true"

which I don't think is so bad. You can see the metrics settings towards the end, and I picked 8082 (just so I could understand what's different from the default 8080 that traefik uses). My prometheus service configuration looked just about the same (you can judge for yourself):

  prometheus:
    restart: unless-stopped
    image: prom/prometheus
    container_name: "prometheus"
    volumes:
      - ./config/prometheus/:/etc/prometheus/
      - prometheus-storage:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.prometheus.rule=(Host(`itkpix-srv.ucsc.edu`) && PathPrefix(`/prometheus`))"
      - "traefik.http.routers.prometheus.entrypoints=websecure"
      - "traefik.http.routers.prometheus.tls.certresolver=le"
      - "traefik.http.services.prometheus.loadbalancer.server.port=9090"
      - "traefik.http.middlewares.strip-prometheus.stripprefix.prefixes=/prometheus"
      - "traefik.http.middlewares.strip-prometheus.stripprefix.forceSlash=false"
      - "traefik.http.routers.prometheus.middlewares=strip-prometheus@docker"
    networks:
      - internal

so far, so good. My "traefik" is "web" here. However, when I reach the prometheus yaml file, I found that the whole thing about "listening" for a docker swarm (which I wasn't using) wasn't working, so I changed this up to use the static config example just like the prometheus job you already defined:

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    static_configs:
        - targets: ['localhost:9090']

  - job_name: 'traefik'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    static_configs:
        - targets: ['traefik:8082']

Once I made these changes, the dashboard you provided started having some amount of data in it! However, I get a bit confused about how the "total number of services" matches the service drop down:

[Screenshot: Screen Shot 2021-04-12 at 8 05 15 PM]

and whether or not there's some delay between one of them updating and the other updating? I'm also wondering how long it takes (or how much data one needs to collect) before the other parts of the dashboard stop showing "N/A":

[Screenshot: Screen Shot 2021-04-12 at 8 06 04 PM]

Here's where the metrics are exposed: itkpix-srv.ucsc.edu:8082/metrics. Let me know if I should hide this behind IP filtering as well (it's not clear to me whether this ever needed to be exposed to the outside world, or whether it was enough to expose it only to prometheus via a dependency on the traefik service).

Again thanks for all your work on this!

kratsg changed the title from "Dashboard not initially showing proper traffic at first" to "Dashboard not initially showing proper traffic at first + N/A data" on Apr 13, 2021
@vegasbrianc
Owner

Hi @kratsg, and thanks for your comment. I would recommend having a look at my Traefik training repo ( https://github.com/56kcloud/traefik-training ) for more documentation. As for the data, it should be real-time, but be sure to check that the refresh rate in the upper right corner of the graph is set to about 5 minutes. The number of services you see is what Traefik sees connecting to Traefik. However, I need to check again to make sure the dashboard is working correctly.

Also, I would recommend not exposing any metrics outside of your network.

Author

kratsg commented Apr 14, 2021

> Also, I would recommend not exposing any metrics outside of your network.

Thanks! For reference for anyone looking at this issue (it took a little while to figure out the pieces): I told traefik to let me do manual routing

      # enable metrics with prometheus
      - "--metrics=true"
      - "--metrics.prometheus=true"
      - "--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000"
      - "--metrics.prometheus.addEntryPointsLabels=true"
      - "--metrics.prometheus.addServicesLabels=true"
      - "--metrics.prometheus.manualrouting=true"

and just did a PathPrefix

      - "traefik.http.routers.metrics.entrypoints=metrics"
      - "traefik.http.routers.metrics.rule=PathPrefix(`/metrics`)"
      - "traefik.http.routers.metrics.service=prometheus@internal"

so then my prometheus config just pointed at the same port/entrypoint that I had already defined previously

  - job_name: 'traefik'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    metrics_path: /metrics/
    static_configs:
      - targets: ['traefik:8082']

since I didn't want to deal with the TLS headaches. I also explicitly did not expose port 8082 on traefik which means it is only accessible from the internal network as I understand it.

traefik config:
  traefik:
    restart: always
    image: "traefik:v2.3"
    container_name: "traefik"
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      # enable 80, 443, 27017, 8082
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.websecure-mongodb.address=:27017"
      - "--entrypoints.metrics.address=:8082"
      # redirect 80 to 443
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      # automatic certificate generation for SSL
      - "--certificatesresolvers.le.acme.tlschallenge=true"
      - "--certificatesresolvers.le.acme.email=[email protected]"
      - "--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json"
      - "--log.level=DEBUG"
      # get dashboard/api
      - "--api=true"
      - "--api.dashboard=true"
      # enable metrics with prometheus
      - "--metrics=true"
      - "--metrics.prometheus=true"
      - "--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000"
      - "--metrics.prometheus.addEntryPointsLabels=true"
      - "--metrics.prometheus.addServicesLabels=true"
      - "--metrics.prometheus.manualrouting=true"
    ports:
      - "27017:27017" # mongo
      - "443:443" # https
      - "80:80" # http
      # Note: do not publish this port. Containers on the shared "internal" network (e.g. prometheus) can still reach traefik:8082; depends_on only controls start order.
      #- "8082:8082" # metrics
    volumes:
      - "./letsencrypt:/letsencrypt"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    networks:
      - web
      - internal
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=PathPrefix(`/api`) || PathPrefix(`/dashboard`)"
      - "traefik.http.routers.api.entrypoints=websecure"
      - "traefik.http.routers.api.tls.certresolver=le"
      - "traefik.http.routers.api.service=api@internal"
      - "traefik.http.middlewares.api-auth.basicauth.users=${TRAEFIK_BASICAUTH}"
      - "traefik.http.routers.api.middlewares=api-auth@docker"
      - "traefik.http.middlewares.allowed-ips.ipwhitelist.sourcerange=128.114.130.0/24"
      - "traefik.http.routers.metrics.entrypoints=metrics"
      - "traefik.http.routers.metrics.rule=PathPrefix(`/metrics`)"
      - "traefik.http.routers.metrics.service=prometheus@internal"
      #- "traefik.http.routers.metrics.middlewares=allowed-ips@docker"
    logging:
      driver: "json-file"
      options:
        max-file: '5'
        max-size: '50m'

prometheus config:
  prometheus:
    restart: unless-stopped
    image: prom/prometheus
    container_name: "prometheus"
    volumes:
      - ./config/prometheus/:/etc/prometheus/
      - prometheus-storage:/prometheus
    depends_on:
      - traefik
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
      - '--web.external-url=/prometheus/'
      - '--web.route-prefix=/prometheus/'
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.prometheus.rule=(Host(`itkpix-srv.ucsc.edu`) && PathPrefix(`/prometheus/`))"
      - "traefik.http.routers.prometheus.entrypoints=websecure"
      - "traefik.http.routers.prometheus.tls.certresolver=le"
      - "traefik.http.services.prometheus.loadbalancer.server.port=9090"
    networks:
      - internal
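
Both service definitions reference the networks `web` and `internal` and the named volume `prometheus-storage`, but the top-level declarations were not included in the post. A minimal sketch of what they might look like, assuming default bridge networks (the actual declarations in the original compose file are unknown):

```yaml
# Assumed top-level declarations; these were NOT shown in the original post.
networks:
  # Network traefik shares with publicly routed services.
  web: {}
  # Network shared by traefik and prometheus. Prometheus reaches traefik:8082
  # over this network, so port 8082 never has to be published on the host.
  internal: {}

volumes:
  # Named volume backing the Prometheus TSDB mounted at /prometheus.
  prometheus-storage: {}
```

With this layout, only the ports listed under `ports:` on the traefik service (80, 443, 27017) are reachable from outside the Docker host.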

The training looks great and helped clarify some things. I do want to share some screenshots of why I'm a bit confused.

[Screenshot: Screen Shot 2021-04-14 at 12 25 23 PM]

Sometimes I see that the top left has "2" services, when there's definitely more than 2 (see dropdown expanded). But refreshing over time, it'll update with "4" services instead:

[Screenshot: Screen Shot 2021-04-14 at 11 44 08 AM]

Which looks better. The "N/A" I assumed was because only data for a specific service would be shown, so I pick a specific service, such as influxdb

[Screenshot: Screen Shot 2021-04-14 at 11 44 15 AM]

so this looks great! One thing I did want to do (but have no idea how, since grafana is somewhat new to me) is edit the panels so I could add units to the numbers (I assume times are measured in milliseconds?)
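
On the units question: Traefik's Prometheus duration metrics carry a `_seconds` suffix, and the bucket boundaries configured above (0.1, 0.3, 1.2, 5.0) are in seconds, so the raw values are seconds unless a dashboard query rescales them (not verified here for this particular dashboard). In Grafana the display unit is usually set per panel in the panel editor under "Standard options" > "Unit". As a rough sketch, a Prometheus recording rule could make the unit explicit in the series name. The file name `traefik_rules.yml`, the group name, and the recorded series name are hypothetical, and the file would need to be listed under `rule_files:` in `prometheus.yml`:

```yaml
# Hypothetical file: ./config/prometheus/traefik_rules.yml
# (visible in the container as /etc/prometheus/traefik_rules.yml via the
# volume mount shown in the prometheus service above).
# Assumes prometheus.yml contains:
#   rule_files:
#     - /etc/prometheus/traefik_rules.yml
groups:
  - name: traefik_latency
    rules:
      # Average request duration per service over the last 5 minutes, in seconds.
      - record: traefik:service_request_duration_seconds:avg5m
        expr: |
          sum by (service) (rate(traefik_service_request_duration_seconds_sum[5m]))
          /
          sum by (service) (rate(traefik_service_request_duration_seconds_count[5m]))
```

A panel built on this series would use the "seconds (s)" unit; multiplying the expression by 1000 would give milliseconds instead.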
