Epic: Refactor Quotes and Dialogue #1773
Comments
Yes, an implicit feature request here is to allow creating dialogue/quoting rules per Project. I can even picture the need for customised rules per Novel! (Perhaps the project is multilingual? Perhaps the author needs flexibility across works in the same project?) |
So in Brazilian Portuguese we usually do dialogue like this: — Tudo bem? — ele pergunta. — Você falou com ele? Quick copy-paste from my own manuscript, each dialogue always starts in a new line. Italics are narrator breaks. But of course we also use the dashes in the middle of a paragraph the same way in English, and that has nothing to do with dialogue. I can get novelwriter to detect dialogue paragraphs but not the narrator breaks. |
See #1771 Basically, I asked @tmarplatt about the syntax (for Spanish) and it has been implemented with the assumption there is no space between the dash and the narrator text. It is clear that the current narrator break implementation doesn't work in the general case. It also does not work for Polish, as someone else pointed out. I am a little uncertain how to solve it. Currently, novelWriter uses full line RegEx to parse syntax. It works for all other highlighting, but I haven't managed to solve this one yet. What it does now is to assume the whole line is dialogue, and then revert the highlighting back to text for text between the narrator break symbols without padding spaces. It would be possible to add a special format that handles lines starting with a dash (long or short) differently and do a split on all dashes, and alternate the highlighting of dialogue on and off for each dash, starting with dialogue enabled. Would that be enough to solve it? It can be handled with a simple RegEx split. I can look into it, and test it. Could you please create a new feature request for it? I'm currently refactoring the whole text formatting module of novelWriter, so it's a very good time to fix it. |
Yes, but in those cases, the line doesn't start with a dash, correct? So the highlighter can tell the difference between a dialogue line and a line with just a dash in it. Or would you also use dashes in a dialogue line for other purposes than narrator break? If so, I think the only fool-proof way to handle narrator breaks is to add a shortcode for it. But those are annoying to use if they're needed often. I tend to use narrator breaks in my writing a fair bit. |
In those cases it all works as intended, sorry I wasn't clear, I was just saying that we do use dashes the same way as in English as well. I checked that other issue too, and the only difference would be we do use space between the dash and the text when there is a break.
For the uses in Brazilian Portuguese, I think so, yeah. How are you gonna implement that in the options, though? I mean, it seems like every language has its own way of doing dialogue (though I do see that a lot of Brazilian writers sometimes use the English way). |
I was thinking of leaving the options as-is. They already accept input for dialogue line character and narrator break character separately. So I was thinking of allowing the dialogue character to turn on line dialogue mode, and then split the remainder of the line on the narrator break character, and turn on/off dialogue highlighting on alternate text pieces of that split, starting with "on" for the first bit. As I understand it, it will solve the Portuguese usage without disrupting the Spanish usage. But it does mean there can be no dash (if that is the narrator break character set) in the line for other use-cases like a pause. I think this will also solve the Polish case as discussed in #1976. Maybe @Nauthizz can confirm? |
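For readers following along, the split-and-alternate idea described above can be sketched roughly as below. This is only an illustration, not novelWriter's actual highlighter code: the function name `split_dialogue` and its parameters are hypothetical, and it assumes single-character symbols.

```python
import re

EM_DASH = "\u2014"


def split_dialogue(line, line_symbol=EM_DASH, break_symbol=EM_DASH):
    """Return (text, is_dialogue) pieces for one paragraph (hypothetical helper)."""
    # Only paragraphs that start with the dialogue line symbol enter line dialogue mode.
    if not line_symbol or not line.startswith(line_symbol):
        return [(line, False)]
    body = line[len(line_symbol):]
    if not break_symbol:
        # No narrator break symbol set: the whole remainder is dialogue.
        return [(body, True)]
    # Split the remainder on the narrator break symbol and alternate,
    # starting with dialogue "on" for the first piece.
    pieces = re.split(re.escape(break_symbol), body)
    return [(piece, index % 2 == 0) for index, piece in enumerate(pieces)]


line = f"{EM_DASH} Tudo bem? {EM_DASH} ele pergunta. {EM_DASH} Você falou com ele?"
for text, is_dialogue in split_dialogue(line):
    print(is_dialogue, repr(text))
```

As noted in the comment above, a scheme like this treats every occurrence of the break symbol in a dialogue line as a toggle, so a dash used for a pause in the same line would flip the highlighting as well.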
You're too fast, I was just writing the feature request 😅
It does seem like in Polish they use mostly the same dialogue punctuation as we do (and I say PT-BR because as far as I know PT-PT does it just like in Spanish with those weird «» comillas ;)) so it would probably work most of the time, except maybe when the dialogue just starts with a dash in the middle of the text like in that last example. But then if in Polish there are no em dashes for other types of pause mid-text wouldn't that work by using that "alternate dialogue symbol" option (and allowing open-ended dialogue lines)? |
I can add a switch for letting the dialogue start mid-line. It complicates the parsing a little, and will probably slow down highlighting when all of these are turned on, but unless you write the entire novel in a single document, it will not be noticeable. In any event, the whole document is only highlighted on open. After that, it only refreshes the paragraph you're writing in. But it does so on every keypress. But after I switched to a true plain text editor, it performs much better in general. I've just tried to avoid character by character parsing for text paragraphs. That's where it starts to get really slow. But when the rules get too complicated, it's hard to get a single regex to handle the rules. That's why the markdown support is also limited. |
... what kind of monster would do that... cof cof >.> As for the options, though, at least in pt-br and Polish, it seems to me this usage should be covered if I add an em dash to the "narrator break symbol" option, no? At least that's what I would expect it did. |
Oops! 😄 But 60k words is not so bad. I use 800k words for stress testing. It still works fine, but takes a noticeable amount of time to load.
Yes, setting both the dialogue line symbol and narrator break symbol settings to a long dash should then do what you are requesting. I've queued up the feature request for the upcoming release. It fits with the other features I'm adding as I'm restructuring the whole document formatting engine and adding more formats and more capabilities. I've added DocX and PDF formats to the manuscript build options as well. |
It turns out there is a Unicode character that is the (more) correct symbol for dialogue than the em dash, the horizontal bar. I'm wondering if it's worth adding an auto-replace feature for it so that it can be used to distinguish between narrator breaks and pauses in text? Like for instance, if you type |
Ok, I've added #2070 for this. It will of course be hidden behind a switch. If any of you have a proposal for a keyboard sequence to replace with a dialogue/narrator dash, please add it to the new issue. There is also a keyboard shortcut already for inserting it: |
Ok, I realised I could simplify the dropdown box to just a switch: The modes are the same though. If the switch is turned off, the "Dialogue/narrator symbol" will alternate highlighting anywhere on a line when it sees the symbol provided. If the switch is turned on, the line must start with the "Dialogue line symbol" and will alternate on the "Dialogue/narrator symbol" if one is provided. |
Ok, more tinkering now. I wrote yet another parser, and I don't think I can get any closer than this without having to implement completely separate parser for each language. The settings are the same, but I re-labelled them a little: Spanish uses alt dialog symbols for «», which I understand is mainly used for inner dialogue, so that makes sense? Portuguese is pretty simple. Just set Alternate dialogue/narrator symbol to em-dash. Polish is the same, just set Alternate dialogue/narrator symbol to en-dash instead. |
I'm not sure I understand the options in this last change you did 🤔 In PT-BR we still use em dashes in the middle of the text to indicate pause or something like that — like this —, and it has nothing to do with dialogue. Will that work without that option to only highlight dialogue when it starts with the dialogue symbol? |
I was back and forth on the condition for alternating freely, like in Polish, and I think the merged version does not work like that. But maybe I should change it so that setting the dialogue line symbols blocks the "Polish mode". It would mean that you'd add em-dash in both boxes, and get "Spanish mode" which also allows punctuation after the dash. It's almost impossible to distinguish between "Spanish mode" and "Portuguese mode" here. I may have to add an extra setting to achieve it. I've also added support for horizontal bar by typing 4 dashes. It looks exactly the same as em-dash, but is Unicode U+2015 instead of U+2014, so the rules can be set for one but not the other. Horizontal bar is the technically correct symbol for dialogue in many standards. |
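To make the distinction between the two characters concrete, here is a small sketch. The helper `replace_four_hyphens` is hypothetical and works on a whole string; how the editor actually performs the replacement while typing is not shown in this thread, so treat that part as an assumption.

```python
import unicodedata

EM_DASH = "\u2014"
HORIZONTAL_BAR = "\u2015"


def replace_four_hyphens(text):
    """Replace each run of four hyphens with a horizontal bar (hypothetical helper)."""
    return text.replace("----", HORIZONTAL_BAR)


# The two symbols look alike but are distinct code points,
# so a highlighting rule can target one without matching the other.
print(unicodedata.name(EM_DASH), hex(ord(EM_DASH)))
print(unicodedata.name(HORIZONTAL_BAR), hex(ord(HORIZONTAL_BAR)))
print(replace_four_hyphens("---- Tudo bem?"))
```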
I already regret opening this dialogue highlighting can of worms 😅 I bet I'm going to end up with a dropdown list of language modes instead in the end. Each rule set is much easier to implement isolated from the others. Either that, or we just accept that they won't be perfect. |
Ok, so another iteration on this. I wanted to avoid adding more settings in Preferences, but the only way to separate the Polish style from Spanish/Portuguese was to add separate settings for it. So I did: "Dialogue line symbols" and "Dialogue narrator break symbol" work more or less as before, except the first can now take multiple characters, so you can also add whatever continuation symbol is used. It can take up to four. The Polish feature uses only the "Alternating dialogue/narration symbol" setting. The three settings work this way:
Spanish and Portuguese would thus set 1. and 2. to a long dash. The highlighting will detect the difference in padding around the end of narration dash on its own. Polish would set 3. to a short dash. Norwegian, and other languages that have dash symbols for dialogue lines, but not narrator breaks would set 1. to a short dash and leave 2. empty. So this works for all cases I currently know of. I hope. Does this make sense, @tmarplatt, @nyex and @Nauthizz? I will clearly need some documentation on it. Although with only three settings, trial and error would work too. Edit: Oh, and 1. and 2. work completely separately from 3. If all three are set, a paragraph that triggers 1. or 2. will not apply 3. They even work at the same time! |
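The precedence between the three settings could be pictured like this. It is a rough sketch under stated assumptions, not the real implementation: `DialogueSettings` and `pick_mode` are hypothetical names, and it assumes a paragraph "triggers" settings 1. and 2. only when it starts with one of the dialogue line symbols.

```python
from dataclasses import dataclass


@dataclass
class DialogueSettings:
    line_symbols: str = ""    # setting 1: up to four dialogue line symbols
    narrator_break: str = ""  # setting 2: narrator break symbol
    alternating: str = ""     # setting 3: alternating dialogue/narration symbol


def pick_mode(paragraph, cfg):
    """Decide which rule set applies to a paragraph (hypothetical helper)."""
    # Settings 1 and 2 take precedence: a paragraph starting with a
    # dialogue line symbol never falls through to setting 3.
    if cfg.line_symbols and paragraph[:1] and paragraph[0] in cfg.line_symbols:
        return "line"
    if cfg.alternating and cfg.alternating in paragraph:
        return "alternating"
    return "none"


# Spanish/Portuguese style: 1. and 2. set to a long dash.
portuguese = DialogueSettings(line_symbols="\u2014", narrator_break="\u2014")
# Polish style: only 3. set, to a short dash.
polish = DialogueSettings(alternating="\u2013")
# Norwegian style: 1. set to a short dash, 2. left empty.
norwegian = DialogueSettings(line_symbols="\u2013")

print(pick_mode("\u2014 Tudo bem? \u2014 ele pergunta.", portuguese))
print(pick_mode("Tekst \u2013 dialog \u2013 tekst.", polish))
print(pick_mode("\u2013 Hei, sa hun.", norwegian))
```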
This isn't even necessary, you could just leave it as it works in Spanish, with the alternating dashes inside dialogue the same colour as the narrator break, but it seems good as it is as well (just saying this in case the code is better without this extra check or whatever it is). Also because (SORRY) sometimes this happens in Portuguese: — I like this — he said —, but not that. (I don't use it like this, because the dash is already punctuation and it seems a bit overkill, but it's correct.) That said, I'm not very particular about what colour the dash itself is in the end ;) And the options look clear enough to me, it's what I had expected to happen before :) |
Yeah, that will group the dash colour-wise to the right: It basically detects the space before the dash and assigns it left or right, that's all. It's a simple if-statement, so it isn't particularly complicated code. I just thought it looked better. I'm trying to get the highlighting accurate enough that it would be possible to provide a dialogue word count for the manuscript. But dashes and punctuation don't matter for that. Edit: There, fixed that too by slightly tweaking the rule: The reason I made the second dash highlighted was to be consistent with alternating dialogue, but perhaps this is not necessary at all since practically no one will use both features at the same time. |
Thank you for wrestling that can of worms @vkbo because this is looking fine. I've refrained from suggesting changes to the controls before for that reason: it's hard to see how an improvement for my use case impacts formatting requirements in other languages. Two observations:
|
Sure, I can change the label. As for the highlighting, since I split the narrator break feature from the alternating (Polish) style, I made it as simple as highlighting the entire break, including up to one trailing punctuation, as narration. When using the new alternating field, it instead renders each section as on/off where the dash is associated with the following text. It made more sense when considering both the cases where dialogue starts at the beginning and in the middle of the line. My initial struggle was to make both these work when only considering the narrator break setting. Using separate settings is much simpler and much less code for guessing what style you're using. (And hence also faster highlighting.) |
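As an illustration of "the entire break, including up to one trailing punctuation", a regex along these lines would find the narration spans in a dialogue line. This is a hedged sketch only: `narration_spans` is a hypothetical helper, it assumes a single-character symbol, and the punctuation set `.,;:!?` is a guess rather than the set the highlighter uses.

```python
import re


def narration_spans(line, symbol="\u2014"):
    """Return (start, end) offsets of narrator breaks in a dialogue line."""
    if not line.startswith(symbol):
        return []
    sym = re.escape(symbol)
    # A break runs from one symbol to the next, plus at most one trailing
    # punctuation mark, or to the end of the line if it is never closed.
    pattern = re.compile(f"{sym}[^{sym}]*(?:{sym}[.,;:!?]?|$)")
    # Start searching after the leading dialogue line symbol.
    return [(m.start(), m.end()) for m in pattern.finditer(line, len(symbol))]


line = "\u2014 I like this \u2014 he said \u2014, but not that."
for start, end in narration_spans(line):
    print(repr(line[start:end]))
```

Everything outside the returned spans would be highlighted as dialogue, which covers both the Spanish form without padding spaces and the Portuguese form with them.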
|
Hmm, no, it is an alternative to the setting it is already below, so I think it makes sense the way it is. It has nothing at all to do with the dialogue line feature. |
That's correct, I misunderstood. |
Anyway, I'm pretty much ready to make the 2.6 Beta release now, so you should all be able to test it properly. I think this is a good starting point. If I get around to finishing up the last few things, I should be able to make the release this weekend. |
Prompted by @tmarplatt, this Epic ticket will track some efforts into improving handling of auto-replacement of quote symbols, and highlighting of dialogue. I've considered making some changes here for a while, and the feature requests contain a lot of ideas to build on.
Firstly, I would like to separate the quote replacement from the dialogue highlighting. The quote symbol settings are currently there to replace key presses on the keyboard of ' and " straight quotes with the user's preference of symbols. I don't want to change this feature. It is also not really what the feature requests are about.
Instead, I would like to define a set of rules for highlighting dialogue that works better for different languages. They should also work for alternative styles other than the standard rules. Sometimes an author also needs some creative flexibility here.
Feature Issues
Implementation
A form to set these rules should have an easy way to pick the symbols to use without having to type them on the keyboard. Similar to the current quotes selector dialog, but with additional symbol support, like at least en and em dashes.
Add a new section named "Dialogue Highlighting" in Preferences for these settings, and consider making it possible to override them on a per-project basis by adding an "Override Dialogue Highlighting" page in Project Settings.
Related to #1770:
::Person saying something:: indicates the communication is non-verbal through technology.
Related to #1771:
Related to #1772: