Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #8542 - Escape HTML characters. #8544

Merged
merged 4 commits into from
Mar 1, 2021
Merged

Fix #8542 - Escape HTML characters. #8544

merged 4 commits into from
Mar 1, 2021

Conversation

jmarrec
Copy link
Contributor

@jmarrec jmarrec commented Feb 17, 2021

Pull request overview

Pull Request Author

Add to this list or remove from it as applicable. This is a simple templated set of guidelines.

  • Title of PR should be user-synopsis style (clearly understandable in a standalone changelog context)
  • Label the PR with at least one of: Defect, Refactoring, NewFeature, Performance, and/or DoNoPublish
  • Pull requests that impact EnergyPlus code must also include unit tests to cover enhancement or defect repair
  • Author should provide a "walkthrough" of relevant code changes using a GitHub code review comment process
  • If any diffs are expected, author must demonstrate they are justified using plots and descriptions
  • If changes fix a defect, the fix should be demonstrated in plots and descriptions
  • If any defect files are updated to a more recent version, upload new versions here or on DevSupport
  • If IDD requires transition, transition source, rules, ExpandObjects, and IDFs must be updated, and add IDDChange label
  • If structural output changes, add to output rules file and add OutputChange label
  • If adding/removing any LaTeX docs or figures, update that document's CMakeLists file dependencies

Reviewer

This will not be exhaustively relevant to every PR.

  • Perform a Code Review on GitHub
  • If branch is behind develop, merge develop and build locally to check for side effects of the merge
  • If defect, verify by running develop branch and reproducing defect, then running PR and reproducing fix
  • If feature, test running new feature, try creative ways to break it
  • CI status: all green or justified
  • Check that performance is not impacted (CI Linux results include performance check)
  • Run Unit Test(s) locally
  • Check any new function arguments for performance impacts
  • Verify IDF naming conventions and styles, memos and notes and defaults
  • If new idf included, locally check the err file and other outputs

Added an optional boolean argument to ConvertToEscape to not convert " and ' like it would in the original routine which was meant for XML. This is to support HTML4 mostly, but might as well be correct.
@jmarrec jmarrec added the Defect Includes code to repair a defect in EnergyPlus label Feb 17, 2021
@jmarrec jmarrec self-assigned this Feb 17, 2021
@jmarrec
Copy link
Contributor Author

jmarrec commented Feb 17, 2021

Unit Test before at 0dcf25b

/home/julien/Software/Others/EnergyPlus/tst/EnergyPlus/unit/OutputReportTabular.unit.cc:8483: Failure
Expected equality of these values:
  expectedHTMLString
    Which is: "<td align=\"right\">My Coil &lt;coil is DX&gt;</td>"
  found_cell
    Which is: "<td align=\"right\">My Coil <coil is DX></td>"
<td align="right">My Coil <coil is DX></td>
/home/julien/Software/Others/EnergyPlus/tst/EnergyPlus/unit/OutputReportTabular.unit.cc:8483: Failure
Expected equality of these values:
  expectedHTMLString
    Which is: "<td align=\"right\">My Design Day where it's &gt;= 8&deg;</td>"
  found_cell
    Which is: "<td align=\"right\">My Design Day where it's >= 8\xC2\xB0</td>"
    As Text: "<td align="right">My Design Day where it's >= 8°</td>"
<td align="right">My Design Day where it's >= 8°</td>

After it passes

@jmarrec
Copy link
Contributor Author

jmarrec commented Feb 17, 2021

I added a MCVE on https://github.com/NREL/EnergyPlusDevSupport/blob/c8f21351be86526a2970cf009d2f2309f6578846/DefectFiles/8000s/8542/mcve.idf#L7

Before:

image

After:

image

HTML diff:

image

Note: In both cases the CO2 table looks the same visually:

image

@jmarrec
Copy link
Contributor Author

jmarrec commented Mar 1, 2021

Merged develop and resolved conflicts

@jmarrec jmarrec added the NotIDDChange Code does not impact IDD (can be merged after IO freeze) label Mar 1, 2021
Comment on lines +14555 to +14558
// " ' < > & degree-sign
// If isXML=false, it does an HTML conversion, so only ` < > & ` and degree sign
// Technically HTML4 doesn't support &quot, though most browsers would anyways.
// Also, escaping single and double quotes is only needed inside attributes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mod ConvertToEscaped to add a bool isXML so it can be used for HTML escaping too, which is quite similar

Comment on lines +733 to +734
std::string ConvertToEscaped(std::string const &inString, // Input String
bool isXML=true); // isXML if false assumes HTML and will not convert quotes and apostrophes, for HTML4
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mod.

Comment on lines +9616 to +9650
TEST_F(SQLiteFixture, OutputReportTabularTest_EscapeHTML)
{
// Test for #8542 - Ensures strings are escaped before going to HTML
EnergyPlus::sqlite->sqliteBegin();
EnergyPlus::sqlite->createSQLiteSimulationsRecord(1, "EnergyPlus Version", "Current Time");

auto &ort(state->dataOutRptTab);
ort->numStyles = 1;
ort->TableStyle(1) = OutputReportTabular::iTableStyle::HTML;
ort->del(1) = DataStringGlobals::CharSpace; // space - this is not used much for HTML output

ort->WriteTabularFiles = true;

SetupUnitConversions(*state);
ort->unitsStyle = OutputReportTabular::iUnitsStyle::JtoKWH;

SetPredefinedTables(*state);
std::string CompName = "My Coil <coil is DX>";

PreDefTableEntry(*state, state->dataOutRptPredefined->pdchDXCoolCoilType, CompName, "Coil:Cooling:DX:SingleSpeed");
// This would normally be called with numerics such as CompName, 0.006, 8, but I don't really care
PreDefTableEntry(*state, state->dataOutRptPredefined->pdch2CoilLvgHumRatIdealPeak, CompName, "My Design Day where it's >= 8\u00B0");
PreDefTableEntry(*state, state->dataOutRptPredefined->pdst2CoilSummaryCoilSelection, CompName, "My Design Day where it's >= 8\u00B0"); // this is >= 8 degree sign

// We enable the reports we care about, making sure we have the right ones
EXPECT_EQ("HVACSizingSummary", state->dataOutRptPredefined->reportName(6).name);
state->dataOutRptPredefined->reportName(6).show = true;

OutputReportTabular::OpenOutputTabularFile(*state);

WritePredefinedTables(*state);

OutputReportTabular::CloseOutputTabularFile(*state);

std::vector<std::string> lines = read_lines_in_file(DataStringGlobals::outputTblHtmFileName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setup the unit test. We write some stuff to the HTML with special characters. Ensures the HTML is closed. Then we read back the html lines.

Comment on lines +9652 to +9671
// Lambda helper to locate a line in the html file, and compare that line with the expected html after trimming
auto compare_html_output = [this, &lines](const std::string& lookup, const std::string& expectedHTMLString) {
std::string found_cell;
for (const auto& line: lines) {
if (line.find(lookup) != std::string::npos) {
found_cell = line;
break;
}
}
EXPECT_FALSE(found_cell.empty())
<< "Did not find the lookup string '" << lookup
<< "' string in the html output at '" << DataStringGlobals::outputTblHtmFileName
<< "'..." << '\n' << delimited_string(lines);

// Trim leading and trailing spaces
found_cell.erase(0, found_cell.find_first_not_of(' ')); // ltrim
found_cell.erase(found_cell.find_last_not_of(' ') + 1); // rtrim

EXPECT_EQ(expectedHTMLString, found_cell) << found_cell;
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lambda to check we can actually find something in the HTML lines.

EXPECT_EQ(expectedHTMLString, found_cell) << found_cell;
};

compare_html_output("My Coil", "<td align=\"right\">My Coil &lt;coil is DX&gt;</td>");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First test. I can locate the line header for My Coil <coil is DX> and it is properly escaped.

Comment on lines +9705 to +9706
// Clean up
FileSystem::removeFile(DataStringGlobals::outputTblHtmFileName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we clean up to be a good citizen.

Comment on lines +3078 to +3082
tbl_stream << "<p>Building: <b>" << ConvertToEscaped(BuildingName, false) << "</b></p>\n";
if (state.dataEnvrn->EnvironmentName == state.dataEnvrn->WeatherFileLocationTitle) {
tbl_stream << "<p>Environment: <b>" << state.dataEnvrn->EnvironmentName << "</b></p>\n";
tbl_stream << "<p>Environment: <b>" << ConvertToEscaped(state.dataEnvrn->EnvironmentName, false) << "</b></p>\n";
} else {
tbl_stream << "<p>Environment: <b>" << state.dataEnvrn->EnvironmentName << " ** " << state.dataEnvrn->WeatherFileLocationTitle << "</b></p>\n";
tbl_stream << "<p>Environment: <b>" << ConvertToEscaped(state.dataEnvrn->EnvironmentName, false) << " ** " << ConvertToEscaped(state.dataEnvrn->WeatherFileLocationTitle, false) << "</b></p>\n";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Escape at the very top of the HTML file.

@@ -15331,7 +15331,7 @@ namespace EnergyPlus::OutputReportTabular {
for (iCol = 1; iCol <= colsColumnLabels; ++iCol) {
outputLine = " <td align=\"right\">";
for (jRow = 1; jRow <= maxNumColLabelRows; ++jRow) {
outputLine += colLabelMulti(iCol, jRow);
outputLine += ConvertToEscaped(colLabelMulti(iCol, jRow), false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Escape column headers, isXML=False

@@ -15343,13 +15343,13 @@ namespace EnergyPlus::OutputReportTabular {
for (jRow = 1; jRow <= rowsBody; ++jRow) {
tbl_stream << " <tr>\n";
if (rowLabels(jRow) != "") {
tbl_stream << " <td align=\"right\">" << InsertCurrencySymbol(state, rowLabels(jRow), true) << "</td>\n";
tbl_stream << " <td align=\"right\">" << ConvertToEscaped(InsertCurrencySymbol(state, rowLabels(jRow), true), false) << "</td>\n";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Escape row headers

} else {
tbl_stream << " <td align=\"right\">&nbsp;</td>\n";
}
for (iCol = 1; iCol <= colsBody; ++iCol) {
if (body(iCol, jRow) != "") {
tbl_stream << " <td align=\"right\">" << InsertCurrencySymbol(state, body(iCol, jRow), true) << "</td>\n";
tbl_stream << " <td align=\"right\">" << ConvertToEscaped(InsertCurrencySymbol(state, body(iCol, jRow), true), false) << "</td>\n";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Escape every regular table cell.

@jmarrec jmarrec requested a review from JasonGlazer March 1, 2021 10:41
@jmarrec jmarrec added this to the EnergyPlus 9.5.0 milestone Mar 1, 2021
Comment on lines +9680 to +9703
// We ensure that SQL doesn't have the same escape
for (const std::string reportName: {"HVACSizingSummary"}) {


auto result = queryResult("SELECT RowName, Value From TabularDataWithStrings "
"WHERE ReportName = \"" + reportName + "\""
" AND ColumnName = \"Coil Leaving Air Humidity Ratio at Ideal Loads Peak\"",
"TabularDataWithStrings");

EnergyPlus::sqlite->sqliteCommit();

EXPECT_EQ(1u, result.size());
// Because the table has 8 cols
EXPECT_EQ(8u, result[0].size());

// 0.006 is a ratio, so unitconv = 1
std::string s = result[0][0];
// Trim the string, it has leading spaces
//s.erase(std::remove_if(s.begin(), s.end(), ::isspace), s.end());

EXPECT_EQ("My Coil <coil is DX>", s);

EXPECT_EQ("My Design Day where it's >= 8\u00B0", result[0][1]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@Myoldmopar
Copy link
Member

OK, pulled develop and verified the html is not sanitized, and indeed, brackets are treated as child html elements and hidden away. With this branch it looks great. Thanks for the fix @jmarrec , merging.

@Myoldmopar Myoldmopar merged commit e5d85ad into develop Mar 1, 2021
@Myoldmopar Myoldmopar deleted the 8542_EscapeHTML branch March 1, 2021 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Defect Includes code to repair a defect in EnergyPlus NotIDDChange Code does not impact IDD (can be merged after IO freeze)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some IDF object names can break the HTML output file
7 participants