Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix(html): Handle <br> elements to insert line breaks in text #1950

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Dhruv-Maradiya
Copy link

Fixes #1090 by updating the DOM parser to handle <br> elements and insert line breaks (\n) when converting HTML content to plain text.

Initially, I thought adding a simple condition might not be a reliable solution. So, I decided to check how HTML-to-text conversion is handled in Chromium and found a similar approach. Here's the link.


  • I’ve reviewed the contributor guide and applied the relevant portions to this PR.
Contribution guidelines:

Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.

@Dhruv-Maradiya Dhruv-Maradiya changed the title Fix(html): Handle <br> elements to insert line breaks in text Fix(html): Handle <br> elements to insert line breaks in text Dec 27, 2024
@kevmoo kevmoo requested a review from devoncarew December 28, 2024 02:55
Copy link

PR Health

Breaking changes ✔️
Package Change Current Version New Version Needed Version Looking good?
html None 0.15.5 0.15.5 0.15.5 ✔️
Changelog Entry ✔️
Package Changed Files

Changes to files need to be accounted for in their respective changelogs.

Coverage ✔️
File Coverage
pkgs/html/lib/dom_parsing.dart 💚 58 % ⬆️ 3 %

This check for test coverage is informational (issues shown here will not fail the PR).

API leaks ⚠️

The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.

Package Leaked API symbols
html HtmlTokenizer
Token
HtmlInputStream
TagToken
DoctypeToken
StringToken
TreeBuilder
ActiveFormattingElements
StartTagToken
TagAttribute
CommentToken
CharactersToken
SpaceCharactersToken
EndTagToken

This check can be disabled by tagging the PR with skip-leaking-check.

License Headers ⚠️
// Copyright (c) 2024, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files
pkgs/html/lib/dom_parsing.dart
pkgs/html/test/parser_feature_test.dart

All source files should start with a license header.

Unrelated files missing license headers
Files
pkgs/bazel_worker/benchmark/benchmark.dart
pkgs/bazel_worker/example/client.dart
pkgs/bazel_worker/example/worker.dart
pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart
pkgs/boolean_selector/example/example.dart
pkgs/clock/lib/clock.dart
pkgs/clock/lib/src/clock.dart
pkgs/clock/lib/src/default.dart
pkgs/clock/lib/src/stopwatch.dart
pkgs/clock/lib/src/utils.dart
pkgs/clock/test/clock_test.dart
pkgs/clock/test/default_test.dart
pkgs/clock/test/stopwatch_test.dart
pkgs/clock/test/utils.dart
pkgs/html/example/main.dart
pkgs/html/lib/dom.dart
pkgs/html/lib/html_escape.dart
pkgs/html/lib/parser.dart
pkgs/html/lib/src/constants.dart
pkgs/html/lib/src/encoding_parser.dart
pkgs/html/lib/src/html_input_stream.dart
pkgs/html/lib/src/list_proxy.dart
pkgs/html/lib/src/query_selector.dart
pkgs/html/lib/src/token.dart
pkgs/html/lib/src/tokenizer.dart
pkgs/html/lib/src/treebuilder.dart
pkgs/html/lib/src/utils.dart
pkgs/html/test/dom_test.dart
pkgs/html/test/parser_test.dart
pkgs/html/test/query_selector_test.dart
pkgs/html/test/selectors/level1_baseline_test.dart
pkgs/html/test/selectors/level1_lib.dart
pkgs/html/test/selectors/selectors.dart
pkgs/html/test/support.dart
pkgs/html/test/tokenizer_test.dart
pkgs/pubspec_parse/test/git_uri_test.dart
pkgs/stack_trace/example/example.dart
pkgs/watcher/test/custom_watcher_factory_test.dart
pkgs/yaml_edit/example/example.dart

This check can be disabled by tagging the PR with skip-license-check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

<br/> Tag does not product /n
1 participant