Add definition for plain text #1800

Open
wants to merge 3 commits into
base: master

staab
Member

@staab staab commented Feb 21, 2025

No description provided.

@vitorpamplona
Collaborator

vitorpamplona commented Feb 21, 2025

Looks good, but we really need to stop calling this "plain text". It doesn't make any sense.

Adds:

  • words starting with #([^\s!@#$%^&*()=+./,\[{\]};:'"?><]+) represent pointers to the t-tagged hashtag of its name
  • words starting with lnbc represent lightning invoices
  • words starting with lnurl1 represent lightning wallet uris
  • words starting with lno1 represent lightning offers
  • words starting with cashuA and cashuB represent cashu tokens.
  • words starting with $<a-zA-Z0-9> represent a stock ticker or an asset price.
  • words starting with !<geohash> represent a pointer to a g-tagged geohash feed.
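
A minimal sketch of how a client might classify a single word under the rules listed above. The hashtag pattern is copied from the first bullet; the function name, the returned labels, and the geohash alphabet are assumptions made for illustration, and the sketch assumes the caller has already split the content on whitespace and stripped trailing punctuation.

```python
import re

# Hashtag pattern taken from the list above.
HASHTAG = re.compile(r"""#([^\s!@#$%^&*()=+./,\[{\]};:'"?><]+)""")
# "$" followed by letters or digits, per the ticker rule.
TICKER = re.compile(r"\$[a-zA-Z0-9]+")
# Assumption: the standard geohash base32 alphabet (digits plus letters except a, i, l, o).
GEOHASH = re.compile(r"![0-9b-hjkmnp-z]+")

# Plain prefix checks; the labels on the right are hypothetical names.
PREFIXES = [
    ("lnbc", "lightning_invoice"),
    ("lnurl1", "lightning_wallet_uri"),
    ("lno1", "lightning_offer"),
    ("cashuA", "cashu_token"),
    ("cashuB", "cashu_token"),
]


def classify_word(word: str) -> str:
    """Return a label for one whitespace-delimited word, or "text" if no rule applies."""
    if HASHTAG.fullmatch(word):
        return "hashtag"
    if TICKER.fullmatch(word):
        return "ticker"
    if GEOHASH.fullmatch(word):
        return "geohash"
    for prefix, kind in PREFIXES:
        if word.startswith(prefix):
            return kind
    return "text"
```

Anything that matches no rule falls through to "text", so a client that implements only some of the rules still shows the original word.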

Additional rules:

  • URLs with a scheme, like https://, are expected to be previewed as cards, images, audio players, or video players, with the link itself hidden from the user.
  • Domain names without a scheme, like nytimes.com, are expected to render as is, without previews.
  • nip19 uris are expected to either be previewed by the app as a quote OR be a link to an outside app that can view them using NIP-89/typed-scheme (Typed URI schemes. #1539)
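
The three additional rules could be sketched roughly like this. The regular expressions, the function name, and the returned hint labels are assumptions for illustration, not wording from the PR; the prefix list covers the shareable NIP-19 entities and tolerates an optional `nostr:` prefix.

```python
import re

# Assumption: anything shaped like "<scheme>://..." counts as a URL with a scheme.
SCHEME_URL = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://\S+$")
# Assumption: a loose check for bare domains such as "nytimes.com" (optionally with a path).
BARE_DOMAIN = re.compile(r"^(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(?:/\S*)?$")
# Shareable NIP-19 entity prefixes.
NIP19_PREFIXES = ("npub1", "nprofile1", "note1", "nevent1", "naddr1")


def render_hint(token: str) -> str:
    """Suggest how a client might treat one token; the labels are hypothetical."""
    bare = token[len("nostr:"):] if token.startswith("nostr:") else token
    if bare.startswith(NIP19_PREFIXES):
        return "quote_or_external_link"
    if SCHEME_URL.match(token):
        return "preview"
    if BARE_DOMAIN.match(token):
        return "as_is"
    return "text"
```

For example, `render_hint("https://example.com/a.png")` yields `"preview"` while `render_hint("nytimes.com")` yields `"as_is"`.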

@staab
Member Author

staab commented Feb 21, 2025

> Looks good, but we really need to stop calling this "plain text".

I thought about introducing the term "nostr flavored markdown" but I think that communicates the wrong thing. "Human writable text" is the closest name I can come up with to the actual justification for this, but it also makes no sense.

Those are good additions, I've added most of them (although making the language somewhat less prescriptive). I left out geohashes since those aren't really human readable or writable. Topics were already in the list.

@vitorpamplona
Collaborator

"Human writable text" should not include any NIP-19 URIs IMO. Either way, it should be either all or nothing. Keeping very long unreadable, unwritable URIs like nprofile1, nevent, naddr, lnbcs, cashuB and not accepting others like nembed doesn't make much sense.

@staab
Member Author

staab commented Feb 21, 2025

Humans copy/paste urls, they can copy/paste nevent1s too (and they do). They're all the same thing — a URL, i.e. a reference. base64 encoded images and nembeds are content, and while yes, you could copy/paste those too, it's not idiomatic to (for example) copy a pdf's contents into an email instead of attaching it.

10.md Outdated
Comment on lines 69 to 70
- Unordered lists MUST be one level deep, and SHOULD use the `-` character
- Ordered lists MUST be one level deep, and MUST be sequentially numbered
Member

I've never seen this consistently implemented anywhere, and I don't think it's wise to specify and conjure it into existence

Member Author

It's less about it being implemented (since there's not really anything a parser has to do) than about it being commonly used in rich text editors, and something users do anyhow.

Member

Do they? I think users do lists in many different ways, like 1), 1., 1:, or *, -, . They also do multi-level lists.

Member Author

That's a good point. Clients should probably parse those things too. I've changed the language slightly so that the given list is what clients should add on behalf of users, but clients will have to parse a much wider range of nonsense.
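
As a rough illustration of the "wider range" a parser would have to accept, here is a sketch that recognizes the marker styles mentioned in this thread (`1)`, `1.`, `1:`, `*`, `-`) and reports the indentation, so that multi-level lists can at least be detected. The function name and return shape are assumptions, not part of the PR.

```python
import re

# Ordered markers "1)", "1.", "1:" and unordered markers "*", "-",
# each followed by whitespace, with optional leading indentation.
LIST_ITEM = re.compile(
    r"^(?P<indent>\s*)(?:(?P<num>\d+)[).:]|(?P<bullet>[*\-]))\s+(?P<text>.+)$"
)


def parse_list_item(line: str):
    """Return (kind, indent_width, text) for a list item, or None for any other line."""
    m = LIST_ITEM.match(line)
    if m is None:
        return None
    kind = "ordered" if m.group("num") else "unordered"
    return kind, len(m.group("indent")), m.group("text")
```

Requiring whitespace after the marker keeps ordinary text such as "1.5 apples" or "-5 degrees" from being read as a list item.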

Collaborator

Can we add that lists only hold one paragraph?

Meaning that the parser doesn't need to connect lists that are more than a new line away from each other.

This would not work, for instance.

  • Item 1: ... ... ...
    .. asdjfasdf
    asdfasdf
    asdfasdf
    asdf

  • Item 2

  • Item 3
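
One way to read the "one paragraph" suggestion, sketched under the assumption that items start with `- ` and continuation lines are indented: a blank line always closes the current list, so the three items above would come out as three separate one-item lists rather than one list. The function name is hypothetical.

```python
def split_lists(text: str) -> list[list[str]]:
    """Group "- " items into lists; a blank or unindented non-item line ends the list."""
    lists, current = [], []
    for line in text.splitlines():
        if line.startswith("- "):
            current.append(line[2:])
        elif line.strip() and line[0].isspace() and current:
            # Indented continuation line: same paragraph, same item.
            current[-1] += " " + line.strip()
        elif current:
            # Blank line (or ordinary text) closes the list.
            lists.append(current)
            current = []
    if current:
        lists.append(current)
    return lists
```

With this rule the parser never has to look past a blank line to decide whether a list continues.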

@fiatjaf
Member

fiatjaf commented Feb 21, 2025

Although I mostly agree with @staab's rationale about human-writable and copyable text, after leaving a bunch of comments my conclusion is that we actually shouldn't specify any of these, and the only thing we really should do is follow NIP-27.

Maybe mention somewhere how URLs and image URLs are usually handled, as the behavior regarding these is fairly consistent and expected.

Nostrudel adds a special link every time a NIP is mentioned, should that also be standardized? I don't think so. The same principle applies to $, *, >.

Emojis are also specified somewhere else and very optional. I live very well (better) without them.

…implicity, not because it won't come up elsewhere
@vitorpamplona
Collaborator

People are copy-pasting nembeds between a DM chat with their doctor to a DM with a pharmacist or an optician.

But I digress.

@fiatjaf I somewhat disagree. The lack of a definition of what clients are expected to render for each kind is a massive UX problem. If some clients show a custom emoji and others don't, that is a problem. The scope kind, say kind 1, must decide to either use or not use custom emojis. But once the decision is made, all kind 1 clients must follow. There is no option to not implement it. Same for rendering ln invoices, tokens, nip19 uris, etc.

The more we spec what is expected to happen in each "scoping kind" (which defines the use case), the better it gets for users.

@staab
Member Author

staab commented Feb 21, 2025

> If some clients show a custom emoji and others don't, that is a problem.

I think it's fine. Coracle doesn't render emojis, and while it probably should, it really doesn't bother me as a user. I understand what they mean by the shortcode. Same for lnurls, cashu, urls, nip19, whatever. In fact, I use coracle in a no-media mode. Clients should have total leeway to render notes however they (their users) want.

@vitorpamplona
Collaborator

vitorpamplona commented Feb 21, 2025

If that is fine, then everything is fine. We can do a full markdown post on Kind 1 and your users will still be able to read it. Which has been what Amethyst has done for years now. If there are no complaints, we can just put markdown (or any other readable format) on every kind.

@vitorpamplona
Collaborator

For the name, how about "Nostr Structured Text" or "Structured Note Format"?

@staab
Member Author

staab commented Feb 21, 2025

Let's worry about the semantics for now and change the name later. "plain text" is used all over the place already, I'm just trying to define it.

@fiatjaf
Member

fiatjaf commented Feb 21, 2025

> If that is fine, then everything is fine. We can do a full markdown post on Kind 1 and your users will still be able to read it. Which has been what Amethyst has done for years now. If there are no complaints, we can just put markdown (or any other readable format) on every kind.

This is not true. While things like Markdown code blocks don't hurt, Markdown links are horrible for reading anywhere that doesn't support Markdown natively, same for Markdown tables, and even those Markdown titles with leading #. I don't believe you don't understand this.

What about HTML? Or LaTeX, or the PDF syntax, I don't know. Any client can decide to support anything, and we certainly shouldn't expect all clients to render all these formats, but also users probably cannot read any of them comfortably.

But users can read an :alpaca: emoji blob and imagine an alpaca just fine.
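
To make the fallback concrete, here is a sketch assuming NIP-30-style `:shortcode:` syntax and a hypothetical mapping from shortcode to image URL (for example, built from the event's `emoji` tags). Shortcodes the client cannot resolve are simply left in the text, which is the readable case described above.

```python
import re

# Assumption: shortcodes are letters, digits, and underscores between colons.
SHORTCODE = re.compile(r":([a-zA-Z0-9_]+):")


def split_shortcodes(content: str, emoji_urls: dict) -> list:
    """Split content into ("text", str) and ("emoji", url) parts.

    emoji_urls is a hypothetical shortcode -> image URL mapping.
    """
    parts, pos = [], 0
    for m in SHORTCODE.finditer(content):
        url = emoji_urls.get(m.group(1))
        if url is None:
            continue  # unknown shortcode stays as readable text
        if m.start() > pos:
            parts.append(("text", content[pos:m.start()]))
        parts.append(("emoji", url))
        pos = m.end()
    if pos < len(content):
        parts.append(("text", content[pos:]))
    return parts
```

A client that passes an empty mapping gets the whole content back as a single text part.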

@fiatjaf
Member

fiatjaf commented Feb 21, 2025

By the way, the initial Markdown proposal doesn't mention anywhere the horrible []() links or the # titles.

https://daringfireball.net/projects/markdown/index.text

See how the links are much more human-readable with the actual URLs going after the text on their own lines, and the titles are much better too. That's why Markdown was said to be "plain text with an optional rendering step", but of course soon afterwards it got corrupted; we shouldn't be complicit in that corruption.

@fiatjaf
Member

fiatjaf commented Feb 21, 2025

In any case I'm ok with merging this as an experiment.

@vitorpamplona
Collaborator

vitorpamplona commented Feb 21, 2025

I agree about the []() links. But you realize how arbitrary that point is, right? That's my problem with leaving things open and why being highly prescriptive is awesome.

If we don't provide any guidance, every dev will choose their own arbitrary line. If we do provide guidance (like what this PR is doing), then devs have at least something to be based on...

@cypherhoodlum
Contributor

What about markdown tables such as:

| Column 1 | Column 2 |
| --- | --- |
| Cell 1, Row 1 | Cell 2, Row 1 |
| Cell 1, Row 2 | Cell 1, Row 2 |

Are these human readable enough without being displayed as an actual table? The bigger the table, the harder it becomes. Worth a mention in the spec?

@staab
Member Author

staab commented Feb 24, 2025

I don't think tables are easily readable. They're also not "light formatting", they're really structured data.

@vitorpamplona
Collaborator

We have tables at home.

The tables at home:

Column 1       | Column 2
---------------|----------------
Cell 1, Row 1  | Cell 2, Row 1
Cell 1, Row 2  | Cell 1, Row 2

@fiatjaf
Member

fiatjaf commented Feb 25, 2025

first column | second column of what is considered a valid markdown table | third column | final row with sums
as you can see | second value of the first row | third value of the first row | 12312
this is not very readable | more values here | another value | 673
another | row | messy | 2316
this is not really plaintext by any stretch of imagination, it's not readable unless it's formatted, so this defeats the purpose entirely
|first column|second column of what is considered a valid markdown table|third column|final row with sums||
|-|-|-|-|-|
|as you can see|second value of the first row|third value of the first row|12312||
|this is not very readable|more values here|another value|673||
|another|row|super|messy|2316||
|this is not really plaintext by any stretch of imagination, it's not readable unless it's formatted, so this defeats the purpose entirely|||||

@mikedilger
Contributor

NACK

I think this idea is too prescriptive. PLAINTEXT can be > Anything, > Nothing, > Made Up Formatting with
__NO__ particular predefined meaning.

I know you want to parse and interpret plaintext to render it "better" but I think that is more dangerous than not (rerendering plaintext may totally screw up ascii art, for example).

@staab
Member Author

staab commented Feb 26, 2025

> I know you want to parse and interpret plaintext to render it "better" but I think that is more dangerous than not (rerendering plaintext may totally screw up ascii art, for example).

That's what backticks are for. Anyway, this is unavoidable: clients already parse note content and render it to a greater or lesser extent. This PR is about encouraging people to opt for "lesser".
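
A sketch of what "that's what backticks are for" could mean for a parser, assuming Markdown-style inline and triple-backtick spans: content inside backticks is set aside untouched, and only the remaining text is handed to any further parsing. The pattern and function name are assumptions for illustration.

```python
import re

# Triple-backtick blocks first, then single-backtick inline spans.
CODE_SPAN = re.compile(r"```.*?```|`[^`\n]*`", re.DOTALL)


def split_code(content: str) -> list:
    """Split content into ("code", raw) and ("text", raw) parts, in order."""
    parts, pos = [], 0
    for m in CODE_SPAN.finditer(content):
        if m.start() > pos:
            parts.append(("text", content[pos:m.start()]))
        parts.append(("code", m.group(0)))  # left exactly as written
        pos = m.end()
    if pos < len(content):
        parts.append(("text", content[pos:]))
    return parts
```

Only the "text" parts would then go through hashtag, link, or list detection, so ASCII art wrapped in backticks is never reinterpreted.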

5 participants