Fill data viz placeholders #67

Closed
rviscomi opened this issue Jun 30, 2019 · 11 comments
Labels
development Building the Almanac tech stack

Comments

@rviscomi
Member

For each metric, its results will be queried and saved as JSON. We need to build a tool that generates data visualizations from the JSON data into SVG. The type of visualization will depend on the metric and the context in which it's displayed in the chapter.
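To make the JSON-to-SVG idea concrete, here is a minimal sketch of what such a tool could do for a simple bar chart. The input shape (a list of `label`/`value` rows), the function name `bars_to_svg`, and the sample values are all assumptions for illustration; the real result schemas had not been decided at this point.

```python
import json
from xml.sax.saxutils import escape


def bars_to_svg(rows, width=400, height=200, gap=4):
    """Render rows shaped like {"label": str, "value": number} as SVG bars."""
    header = f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    if not rows:
        return header + "</svg>"
    # Scale bars against the largest value; fall back to 1 to avoid dividing by zero.
    max_value = max(row["value"] for row in rows) or 1
    bar_width = (width - gap * (len(rows) + 1)) / len(rows)
    parts = [header]
    for i, row in enumerate(rows):
        bar_height = row["value"] / max_value * height
        x = gap + i * (bar_width + gap)
        y = height - bar_height
        label = escape(str(row["label"]))
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{bar_width:.1f}" '
            f'height="{bar_height:.1f}"><title>{label}: {row["value"]}</title></rect>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up sample data, not real query results.
sample = json.loads('[{"label": "desktop", "value": 60}, {"label": "mobile", "value": 40}]')
svg = bars_to_svg(sample)
```

This only handles non-negative values and one chart type; axes, legends, and styling would still need to be layered on top.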

For now this is a placeholder issue to brainstorm ideas and discuss what the requirements of the tool may be. TODO:

  • types of visualizations needed
  • mobile/desktop requirements
  • interactivity requirements
@rviscomi rviscomi added the development Building the Almanac tech stack label Jun 30, 2019
@rviscomi rviscomi added this to the Content written milestone Jun 30, 2019
@mikegeyser mikegeyser mentioned this issue Jul 15, 2019
5 tasks
@mikegeyser
Contributor

Would it be possible to get a sample JSON file and visualisation, so that we can try and work on plumbing it into the authoring?

@rviscomi
Member Author

rviscomi commented Jul 24, 2019

Sure, I've added the query, JSON, and a link to an example visualization for the 01_01 metric: https://gist.github.com/rviscomi/e50c893e3d045853215045e9dc62e47d

For the plumbing, are you referring to the generation of SVG from JSON, the integration of SVG into the markdown, or both?

@rviscomi
Member Author

rviscomi commented Aug 1, 2019

We've got several open PRs to add queries for the metrics and one thing I wasn't expecting is that @HTTPArchive/data-analysts are returning multiple values in a single query. This makes sense because there are many ways to represent the results, for example a full histogram in addition to significant percentiles (25/50/75).

On the development side, I think we should plan for this when handling the JSON data. For example, in the markdown an author could annotate a placeholder for a histogram visualization:

getHistogram('04_18', 'distribution')

Where the metric ID is 04_18 (chapter 4, metric 18) and the field name in the JSON is distribution containing data formatted in a way that can be visualized as a histogram (array of x/y values).

If the metric JSON also has summary stats, those could also be extracted using similar syntax:

showBigNumber('04_18', 'p50')

This would find the field p50 (the median) and render it in large text.
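As a rough sketch of how those two calls might resolve against a metric's JSON, assuming the results are loaded into a dict keyed by metric ID. The `METRICS` contents, the helper names, and the `big-number` class are hypothetical, and the numbers are made up.

```python
# Hypothetical in-memory results; in practice these would be loaded from
# the saved JSON file for each metric. The values here are made up.
METRICS = {
    "04_18": {
        "p50": 12,
        "distribution": [{"x": 0, "y": 5}, {"x": 10, "y": 3}],
    }
}


def get_field(metric_id, field):
    """Look up one named field in a metric's results, with a readable error."""
    try:
        return METRICS[metric_id][field]
    except KeyError as err:
        raise KeyError(f"No field {field!r} for metric {metric_id!r}") from err


def get_histogram(metric_id, field):
    """Return the field as (x, y) pairs ready to hand to a chart renderer."""
    return [(point["x"], point["y"]) for point in get_field(metric_id, field)]


def show_big_number(metric_id, field):
    """Wrap a single summary stat in markup that can be styled as large text."""
    return f'<p class="big-number">{get_field(metric_id, field)}</p>'


histogram = get_histogram("04_18", "distribution")
big_number = show_big_number("04_18", "p50")
```

Keeping the field lookup separate from the rendering means one metric file can feed several placeholders, which is the multi-value case described above.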

@patrickhulce
Contributor

Oh yes big plus one there! There are also several circumstances in third-party where there's a multi-tiered breakdown and I would need the ability to pull out multiple percentiles from the same metric. I'll put up the PR later today so you can get a sense for it, but it might add another layer of complexity to the situations you've already mentioned.

@rviscomi rviscomi mentioned this issue Aug 1, 2019
@rviscomi
Member Author

rviscomi commented Aug 1, 2019

SG thanks for flagging @patrickhulce. However we implement the data rendering, it would need to be flexible for many data types and result schemas.

@rviscomi
Member Author

Related part of the pipeline: #177

@rviscomi
Member Author

rviscomi commented Oct 2, 2019

Now that the analysis phase is just about done (:raised_hands:), it's a good time to flesh out the data viz process a bit more. I have a few thoughts I'd just like to regurgitate for the record and gather feedback :)

Consistency over simplicity

Primary goal: ensure consistency across chapters. The easiest thing to do would be to allow authors to create whatever charts they need to support their narrative. But what we don't want is for each chapter to have its own visualization style. Having a cohesive "brand and identity ™️" used throughout the website and data viz is ideal for UX.

How fancy should we get?

We can take this in a few directions. The optimal approach would be for authors to simply annotate their content with a placeholder for the data viz. These annotations would specify the metric to visualize, the chart type, and any grouping/filtering needed to optimize the appearance of the chart. As part of our chapter generation process (converting markdown to HTML) we will replace these annotations with a dynamically generated visualization based on the JSON results for that metric. @mikegeyser has made a lot of great progress on this in #114. There's still more work to do: support various other chart types, style the charts (designs pending), and handle grouping/filtering. If it's not feasible to get this work completed before launch we might also want to consider a lower-tech approach.
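The annotation-replacement step could look something like the sketch below. It assumes placeholders sit on their own line in the markdown and follow the `name('metric', 'field')` shape used earlier in this thread; the `RENDERERS` table and the `<figure>` stub are hypothetical and are not taken from the work in #114.

```python
import re

# Matches a whole line such as: getHistogram('04_18', 'distribution')
PLACEHOLDER = re.compile(r"^(\w+)\('([^']+)',\s*'([^']+)'\)$", re.MULTILINE)


def render_chart(metric_id, field):
    """Stand-in renderer; a real one would emit the generated visualization."""
    return f'<figure data-metric="{metric_id}" data-field="{field}"></figure>'


# Hypothetical mapping from annotation name to renderer.
RENDERERS = {"getHistogram": render_chart}


def fill_placeholders(markdown):
    """Swap each known annotation for its rendered markup."""
    def replace(match):
        name, metric_id, field = match.groups()
        renderer = RENDERERS.get(name)
        if renderer is None:
            # Leave unknown annotations untouched so they are easy to spot.
            return match.group(0)
        return renderer(metric_id, field)

    return PLACEHOLDER.sub(replace, markdown)


chapter = "Intro text.\n\ngetHistogram('04_18', 'distribution')\n\nMore text."
html = fill_placeholders(chapter)
```

Running this before the markdown-to-HTML conversion keeps the authoring syntax independent of whichever chart backend ends up being used.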

One alternative approach would be to use the visualizations in Sheets or Data Studio to create the look and feel of the charts and publish/embed those interactive charts in the chapters. Or at least screenshots. If we do go in this direction I think it's worth continuing the development of the process described above and upgrading to it when ready.

Next steps

  1. Decide what types of data visualizations will be supported. We might need to see what authors are writing about and what their visualization needs are before finalizing this list. Alternatively, the results are all available in spreadsheets and we can anticipate what the best ways to visualize that data might be. This requires proactively scanning through all results and pigeon-holing each into a specific type of chart.
  2. Work with the UX designer to give us style templates we can use for each chart type.
  3. Decide on the chart creation process and build the needed tooling.
  4. Create the charts and inject them into the chapters.

cc @OBTo @HTTPArchive/developers

@rviscomi
Member Author

rviscomi commented Oct 10, 2019

In the absence of any discussion or objection, I decided to go with the Google Sheets data viz approach. The advantages are that we already have almost all of the data viz in Sheets, so we can easily generate data viz from within the web app; Sheets provides publishing and embedding capabilities; and we can copy/paste charts and update the underlying data source to retain a consistent look and feel.

  1. Decide what types of data visualizations will be supported. We might need to see what authors are writing about and what their visualization needs are before finalizing this list. Alternatively, the results are all available in spreadsheets and we can anticipate what the best ways to visualize that data might be. This requires proactively scanning through all results and pigeon-holing each into a specific type of chart.

I've created this demo sheet as an example of the types of visualizations we might need. Ideally every chart needed by all of the chapters is represented in this sheet.

  2. Work with the UX designer to give us style templates we can use for each chart type.

I've assigned our designer to work on applying our Almanac UX to the charts (colors, typography, etc) so we should have the final look and feel by the end of the week.

One important question is how it will look on mobile. Worst case, the charts will require zooming/scrolling. I can build a prototype to see exactly what the UX would be. Hopefully the charts are responsive, so adjusting the iframe's dimensions with media queries would ensure they fit on mobile screens.

  3. Decide on the chart creation process and build the needed tooling.

No special tools needed. This would be a manual process.

  4. Create the charts and inject them into the chapters.

Analysts/developers/authors can create the charts in Sheets and generate the markup to embed each chart (part of the publish flow in Sheets). As part of the final editing process, we'll replace all data viz placeholders in the chapters with their respective iframes.
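Although the process is manual, the iframe markup itself is regular enough to sketch. The helper below assumes the chart's embed URL has been copied from the Sheets publish dialog; the URL shown is a made-up example, and the default dimensions and attributes are assumptions rather than the final embed markup.

```python
from html import escape


def sheets_iframe(src, width=600, height=371):
    """Build iframe markup for a published chart, given its embed URL."""
    # Escape the URL so ampersands in the query string are valid in an attribute.
    safe_src = escape(src, quote=True)
    return (
        f'<iframe src="{safe_src}" width="{width}" height="{height}" '
        f'frameborder="0" scrolling="no"></iframe>'
    )


# Hypothetical URL for illustration only; real ones come from the publish flow.
example_src = "https://docs.google.com/spreadsheets/d/e/EXAMPLE/pubchart?oid=0&format=interactive"
markup = sheets_iframe(example_src)
```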

One nice side effect of this approach is that the results of the analyses can be made available in published Sheets for each chapter, and each query would include the raw results, any pivot tables, and the corresponding visualization all in one sheet.


I would still like to rerun all of the queries and save their results to the repo. Instead of JSON, though, I'd like to try to output the query results as CSV, which is a more compatible format for importing directly into Sheets if needed.
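For results that already exist as JSON, a conversion along these lines would produce CSV that Sheets can import. This is a sketch that assumes each result file is a flat list of row objects; the column names and values in the sample are made up.

```python
import csv
import io
import json


def json_rows_to_csv(rows):
    """Convert a list of flat dicts into CSV text with a header row."""
    if not rows:
        return ""
    # Collect every key in first-seen order so rows with missing fields still line up.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample rows, not real query output.
results = json.loads('[{"client": "desktop", "p50": 12}, {"client": "mobile", "p50": 9}]')
csv_text = json_rows_to_csv(results)
```

Nested fields (for example an embedded histogram array) would need flattening first, since CSV has no natural way to hold them.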

@foxdavidj
Contributor

@rviscomi I'm a fan of using Google sheets for the visualizations. Helps to keep it simple, consistent and makes it possible for authors to help create their chapter's charts/graphs/etc if they would like.

Two things:

  1. If chapters have additional graphics besides these visualizations (e.g., showing what a skip link looks like), can we submit the graphics along with the markdown document?
  2. If at all possible, I'd like to avoid the use of pie charts. They look nice, but they make it very difficult for readers to compare different "slices".

@rviscomi rviscomi changed the title from "Generate SVG from JSON data" to "Fill data viz placeholders" Oct 10, 2019
@rviscomi
Member Author

Good points!

If chapters have additional graphics besides these visualizations (e.g., showing what a skip link looks like), can we submit the graphics along with the markdown document?

Yes, it's a good idea to have a place for non-data graphics. I'll set aside chapter-specific subdirs under https://github.com/HTTPArchive/almanac.httparchive.org/tree/master/src/static/images/2019

If at all possible, I'd like to avoid the use of pie charts. They look nice, but they make it very difficult for readers to compare different "slices".

Agreed! Only as a last resort for the cases when they do help, of which there should be few.

@rviscomi
Member Author

Closing this as a duplicate of #237


4 participants