Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retain text color when pasting #27399

Closed
wants to merge 6 commits into from

Conversation

grzim
Copy link
Contributor

@grzim grzim commented Dec 1, 2020

Description

Not ready for merging yet!

According to issue #27212 there is a problem that when text is copied and pasted it is losing text colouring.
The problem is - as far as I investigated it looks like it was no mechanics to handle it from the very beginning, therefore it is not a regression bug but this functionality was missing from the beggining.
Every letter is wrapped in a <span> and later altered to html tag which refers to span style (strong, i, etc.) or unwrapped.
The problem lays in the architectural design of this mechanism. Not all styles can be represented as a corresponding HTML element and this is why colouring is not working. My solution was to add a special tag 'coloured' that is only used for wrapping in order to preserve text color, and then in the final html code, every is altered to with a color style applied.
All in all, I don't like the mechanism that is here as it is not future proof. Adding any new feature, like text highlight, different text sizes etc or changing the existing ones will need to be done separately in this part of code, which will lead to problems in the future for sure.
Also in the current state, sub and sup are not working when copy-pasted. The same when we edit something as html and will add some styling - they will be lost when copy-pasting. The source of all of those problems is the same.

Another issue connected to the same problem:
#27345

How has this been tested?

Tested visualy - e2e tests are not passed. I can fix them but it is a good idea to discuss the solution first.
This mechanics is used in many other parts of code (not only copy-pasting) so it is vital to work out a wise solution.

Screenshots

Types of changes

New feature

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows the accessibility standards.
  • My code has proper inline documentation.
  • I've included developer documentation if appropriate.
  • I've updated all React Native files affected by any refactorings/renamings in this PR.

grzegorz_marzencki added 2 commits November 30, 2020 22:10
@grzim grzim requested a review from ellatrix as a code owner December 1, 2020 14:56
@grzim grzim requested review from adamziel and draganescu December 1, 2020 14:59
@grzim grzim changed the title Copy paste is losing information about text color Retain text color when pasting Dec 1, 2020
const openingTagRegExp = new RegExp( `<${ tag }`, 'g' );
const closingTagRegExp = new RegExp( `</${ tag }`, 'g' );

const withOriginalOpeningTag = HTML.replace(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would walk the tree instead of relying on regexps - they may have unintended consequences.

Copy link
Contributor Author

@grzim grzim Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All manually placed special chars like < have escape characters so this should not be dangerous, nevertheless, indeed tree traversing is just safe, so it is better to approach, thanks.
I am just not convinced with the entire idea - as I've mentioned in the desc - this is not scalable and there would be needed a huge amount of logic to cover all cases (colors are just a tip of an iceberg). I think it will be better to change the mechanics of unfolding letters from their corresponding spans and the way spans are translated to native html tags.

} = node.style;

// adds color to the wrapper
if ( color && ! defaultColor.includes( color ) ) {
const wrapper = doc.createElement( 'coloured' );
Copy link
Contributor

@adamziel adamziel Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My memory is hazy, but I remember using custom tags had some surprising effects. I'm not completely opposed to custom tags as long as we can be sure they works as intended in all the browsers. Maybe using one of the allowed tags with a custom attribute like data-has-color="true" would be easier?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe just checking for style.color later on would suffice?

Copy link
Contributor Author

@grzim grzim Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This custom tag is not placed in html. I don't like the idea to put any custom tag directly to html unless it is a component. I use them only in the internal mechanics not to lose it, as every letter is stripped out of a span tag. Therefore, every custom tag is transformed back to span at the very end of all transformations

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not insert custom tags or data attributes. We should just preserve the color style property, and add the has-inline-color class.

@adamziel
Copy link
Contributor

adamziel commented Dec 2, 2020

All in all, I don't like the mechanism that is here as it is not future proof. Adding any new feature, like text highlight, different text sizes etc or changing the existing ones will need to be done separately in this part of code, which will lead to problems in the future for sure.

How else would you approach this? The overall approach here is allowlisting a specific subset of html. My understanding is that regardless of the implementation details we will always have to explicitly add/allow new attributes and ways of formatting.

Also CCing @ellatrix @noisysocks as authors of the code.

@grzim
Copy link
Contributor Author

grzim commented Dec 2, 2020

Explicitly adding new attributes is not the right choice, as a user can use any attrs they want - they can edit the text as HTML. If the attr is not set in contentSchema then it will be lost - not only when pasted but also when edited (#27345) . I am not sure why there was a need to introduce contentSchemas, I will be happy if authors could put some light on it.
The proposed solution is to get rid of schemas that define which attrs can be used to which elements. Instead, preserve all attributes of an element wrapping a letter (it can be span or any other tag). As the list of all atrrs (especially style) will be long, I would remove all attributes that are there by default.
If this diff (element/defaultElement) is empty then a letter is unwrapped from a span. If the diff is not empty then transform it to the corresponding HTML tag if there is such (like <strong>), if there is no corresponding tag, leave it as it is. Thanks to this we don't have to define all possible attributes in advance.

Copy link
Member

@ellatrix ellatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi!

You're right that this is not a bug, it's just not implemented yet.

The raw handler strips all HTML except for what we'd explicitly like to preserve. In the case of text colouring, it seems there are some classes and style properties that need to be preserved: has-inline-color, has-...-color (in case of a named colour, and style="color:#...".

A fix should just preserve these attributes, nothing more I think.

@grzim
Copy link
Contributor Author

grzim commented Dec 3, 2020

Preserving this content will work for colouring but this will not help in a general issue of loosing additional data. User can edit text as html and add other formatting. This will be lost.

I will make another PR describing the entire mechanics as focusing only on colours will not help in solving the general problem

@ellatrix
Copy link
Member

ellatrix commented Dec 6, 2020

What additional data are you referring to?

@grzim
Copy link
Contributor Author

grzim commented Dec 7, 2020

By additional data, I mean all the styles and attributes that a user can add but are not listed in schemas. As users can edit the text as html, they have a huge amount of possibilities. Listing them all is not possible. We need te to be secure, so I would suggest using third party sanitization instead of hardcoded schemas

grzegorz_marzencki added 2 commits December 8, 2020 02:47
…d - 1. get default styling (black text) 2. diff the default styling with styles of pasted element to get all nondefault styles applied to element 3. Sanitize output 4. paste the output
@grzim
Copy link
Contributor Author

grzim commented Dec 14, 2020

@ellatrix how about adding to the schema styles we want to preserve?

Also - schemas define the form of HTML sanitization, shouldn't it be done on the server-side?

@ellatrix
Copy link
Member

You mean that they can add a style attribute in HTML mode and put anything in there? This is already possible right? No need to sanitise it. At most we can generally sanitise the attribute value, but no need to sanitise colors or style attributes.

@grzim
Copy link
Contributor Author

grzim commented Dec 15, 2020

As far as I understood schemas are defined in order to describe what kind of attributes elements can have. Otherwise, they are erased. What is the purpose of it if not to sanitize?
Adding 'style' attr to schema would prevent from removing color style, which is added when a user adds text coloring.

@draganescu draganescu added the Needs Technical Feedback Needs testing from a developer perspective. label Feb 17, 2021
@ellatrix
Copy link
Member

Does #27967 resolve this issue? For internally pasted content we should not run all the paste filters. For externally pasted content we should and we should not allow color formatting because there can be a lot of unwanted styling present.

Base automatically changed from master to trunk March 1, 2021 15:44
@grzim
Copy link
Contributor Author

grzim commented Mar 2, 2021

Looks like #27967 has nothing to do with retaining text color when pasting. Tested on master and the issue still appears.

@grzim
Copy link
Contributor Author

grzim commented Mar 2, 2021

Closing this PR as this issue should be fixed in a more general way - to handle different custom styles which users can add to their text and are lost on pasting.

@grzim grzim closed this Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Paste Needs Technical Feedback Needs testing from a developer perspective.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants