Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds coverage for CSV and XLSX file download #10494

Merged
merged 4 commits into from
Mar 6, 2023

Conversation

filipefurtad0
Copy link
Contributor

@filipefurtad0 filipefurtad0 commented Feb 28, 2023

What? Why?

Adds coverage for sales tax by orders report downloads, for file types:

  • csv - adds a fixture csv file
  • xlsx - adds a fixture csv file
  • pdf - we only check if a file was downloaded here, and not its content

What should we test?

  • green build

Release notes

Changelog Category: Technical changes

The title of the pull request will be included in the release notes.

Dependencies

Documentation updates

@filipefurtad0 filipefurtad0 changed the title Adds coverage for CSV file download [draft] Adds coverage for CSV file download Feb 28, 2023
@filipefurtad0 filipefurtad0 force-pushed the test_xlsx_reports branch 4 times, most recently from 215be32 to 619cadd Compare March 1, 2023 14:28
@filipefurtad0 filipefurtad0 changed the title [draft] Adds coverage for CSV file download Adds coverage for CSV file download Mar 1, 2023
@filipefurtad0 filipefurtad0 changed the title Adds coverage for CSV file download Adds coverage for CSV and XLSX file download Mar 1, 2023
@filipefurtad0
Copy link
Contributor Author

This one is ready for review I think.

Some comments:

  • I've implemented these tests for a single report. I'm thinking this would be very heavy on the test suite to add this to all report types. Although it would be the safest approach. I wonder if there is an alternative to this, like testing it on the request level. We'd still need to create fixtures for all reports, I guess.

  • We could extend this approach (creating a file fixture, opening, comparing with the downloaded file) to pdf downloads, too. If it works well then maybe even use this approach on our dreaded invoice_print_spec.rb (currently a feature spec using :rack_test) I think both could be addressed in a separate PR.

Happy for your thoughts and reviews.

@abdellani
Copy link
Member

@filipefurtad0

This looks good to me 👍

I did some research about the PDF format.

  • This is a good document that describes the internal structure of PDFs.
  • two PDF files that have the same content will not have the same hash (I tested with md5) due to the metadata stored on the file.
  • One of the metadata that changes from one report to another is CreationDate (maybe this is not the only one)
  • I found this solution which uses grep to remove CreationDate.
    I'm not sure if this solution is reliable because it uses grep -a which process the binary file as a text
       -a, --text
              Process a binary file as if it were text; this is equivalent to the --binary-files=text option.

  • I found many gems that be used for our purpose, the most popular are: pdf-reader, pdf-inspector and identikal.
  • I didn't test any of them, but from the first glimpse, pdf-reader gives a similar API to what we already use Roo::Excelx

I hope that you'll find this useful :)

@abdellani
Copy link
Member

I've implemented these tests for a single report. I'm thinking this would be very heavy on the test suite to add this to all report types. Although it would be the safest approach. I wonder if there is an alternative to this, like testing it on the request level. We'd still need to create fixtures for all reports, I guess.

I think the same thing, the system tests are slow. I was thinking about request-level vs object-level tests.
It'll be faster to ask the report directly to generate CSV/XLSX/PDF without sending a request through the controller.

what do you think?

@filipefurtad0
Copy link
Contributor Author

I did some research about the PDF format.

@abdellani
I'm really very thankful for those insights - I think this is great push in the right direction 🙏
I'll have a go at this issue.

It'll be faster to ask the report directly to generate CSV/XLSX/PDF without sending a request through the controller.

For sure, I agree. I'm not very confident to be able to write object-level tests, I'll ping you for a some more insight in this when I pick this up, if that's ok.

PS: maybe we could gain some insight from Matomo on which reports users download which file types the most, and add a request level coverage on those. Maybe a good compromise.

@dacook dacook self-requested a review March 2, 2023 23:11
Copy link
Member

@dacook dacook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

I agree that it's not necessary to have a system test for every test, every format. But it's necessary to test one of the reports using the report framework to ensure it generally works.
I notice that we have a more general spec/system/admin/reports_spec.rb spec file. Maybe this test could have gone there. But I don't think it really matters.

Here, we test that it works, and can output the three formats. It also tests the content of those two formats that are most crucial (because they are structured data that could be used to import/integrate with other systems).

I had a suggestion, so will put back to in dev. But even if you don't use it, it's ready to go.

CSV.read("spec/fixtures/reports/sales_tax_by_order/sales_tax_by_order.csv")
end
it 'as csv' do
expect(downloaded_filenames.length).to eq(0) # downloads folder should be empty
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 downloaded_filenames looks like a very handy helper! Looks like it was unused until now.

Comment on lines 457 to 460
downloaded_xlsx = Roo::Excelx.new(downloaded_filename)
downloaded_content = [downloaded_xlsx.row(1), downloaded_xlsx.row(2),
downloaded_xlsx.row(3), downloaded_xlsx.row(4),
downloaded_xlsx.row(5)]
fixture_xlsx = Roo::Excelx.new(report_file_xlsx)
fixture_content = [fixture_xlsx.row(1), fixture_xlsx.row(2), fixture_xlsx.row(3),
fixture_xlsx.row(4),
fixture_xlsx.row(5)]
expect(downloaded_content).to eq(fixture_content)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why, but I couldn't help but try to reduce the code down :D

Suggested change
downloaded_xlsx = Roo::Excelx.new(downloaded_filename)
downloaded_content = [downloaded_xlsx.row(1), downloaded_xlsx.row(2),
downloaded_xlsx.row(3), downloaded_xlsx.row(4),
downloaded_xlsx.row(5)]
fixture_xlsx = Roo::Excelx.new(report_file_xlsx)
fixture_content = [fixture_xlsx.row(1), fixture_xlsx.row(2), fixture_xlsx.row(3),
fixture_xlsx.row(4),
fixture_xlsx.row(5)]
expect(downloaded_content).to eq(fixture_content)
end
downloaded_content = extract_xlsx_rows(downloaded_filename, 1..5)
fixture_content = extract_xlsx_rows(report_file_xlsx, 1..5)
expect(downloaded_content).to eq(fixture_content)
end
def extract_xlsx_rows(file, range)
xlsx = Roo::Excelx.new(file)
range.map { |i| xlsx.row(i) }
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for that suggestion David. I've committed and squashed with the last commit (in which the contexts are added). I wanted to squash with the commit the which I introduced this XLSX test, but this would introduce several conflicts.

Hope this is ok now - moved back to code review.

@dacook
Copy link
Member

dacook commented Mar 2, 2023

PDF:

  • I didn't test any of them, but from the first glimpse, pdf-reader gives a similar API to what we already use Roo::Excelx

Good idea @abdellani , I think this looks good. It provides a text method which presumably is similar to Capybara's text method that extracts a plaintext version of the page, allowing us to assert the presence of certain content.

Object-level tests (aka unit tests):
I think that the reports themselves are already well tested for the generic tabular format, but it looks like we don't have tests on how the tabular format is rendered into each of these types. This could be added in the future here: spec/lib/reports/report_renderer_spec.rb, and we could use roo / pdf-reader to test the output.

@mkllnk mkllnk merged commit b9a7ff9 into openfoodfoundation:master Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Test automation] XLSX files
4 participants