Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show Contents of Readme on Crate Pages #81

Closed
Tracked by #9 ...
killercup opened this issue Nov 24, 2014 · 15 comments
Closed
Tracked by #9 ...

Show Contents of Readme on Crate Pages #81

killercup opened this issue Nov 24, 2014 · 15 comments
Labels
C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works

Comments

@killercup
Copy link
Member

I consider the inline Readme file to be one of the most useful things on npm.

The Readme data is already available (and searchable, cf. #72), but I couldn't find any mention of a plan to show it on crate pages. As this can add a lot of content to a crate page (with possibly complicated markup), a new layout for the page might also be necessary.

@sfackler
Copy link
Member

This'd be great.

@Luthaf
Copy link
Contributor

Luthaf commented Feb 18, 2016

Any new on this? How hard would it be to do?

@steveklabnik
Copy link
Member

None yet. It shouldn't be that hard, though it would require paying close attention to escapes.

NPM does this with markymark, I would suggest blatantly copying whatever they do.

@Luthaf
Copy link
Contributor

Luthaf commented Feb 19, 2016

NPM does this with markymark, I would suggest blatantly copying whatever they do.

Wouldn't that need a server-side node process? The current server is written in Rust, if I understand everything correctly.

@steveklabnik
Copy link
Member

I didn't mean literally running their code; I meant investigating how they do so. What HTML do they escape? Do they show the whole README or a summary? Etc.

On Feb 19, 2016, 04:17 -0500, [email protected], wrote:

NPM does this with markymark, I would suggest blatantly copying whatever they do.

Wouldn't that need a server-side node process? The current server is written in Rust, if I understand everything correctly.


Reply to this email directly orview it on GitHub(#81 (comment)).

@Luthaf
Copy link
Contributor

Luthaf commented Mar 2, 2016

After some investigation, I managed to get a local version running, but when I try to upload a simple example crate it fails because it can not connect to S3. How can I do to load example data in a local website instance?

@steveklabnik
Copy link
Member

It might be worth using the --proxy option and use the data from the actual
backend instead.

On Wed, Mar 2, 2016 at 4:45 PM, Luthaf [email protected] wrote:

After some investigation, I managed to get a local version running, but
when I try to upload a simple example crate it fails because it can not
connect to S3. How can I do to load example data in a local website
instance?


Reply to this email directly or view it on GitHub
#81 (comment).

@Luthaf
Copy link
Contributor

Luthaf commented Mar 2, 2016

Yes, but I do not see how I can change the behaviour of the backend, to add an API point to get the README content as HTML in that case.

@steveklabnik
Copy link
Member

Ah, sorry, I forgot about that part. I was only thinking about the frontend rendering aspect :(

@krzkaczor
Copy link

krzkaczor commented Jul 9, 2016

I have some time and willingness to work on this issue. @Luthaf can u help me with backend changes? Frontend changes seems to be trivial when we have nice endpoint returning readme as html :D

I couldn't find the project which is responsible actually for adding new packages into the repository. crates.io seems to be frontend and backend but it's only querying data - not adding them. I wanted to research textsearchable_index_col field in database because it seems to hold README contents and be used for full text search.

@Luthaf
Copy link
Contributor

Luthaf commented Jul 9, 2016

@Luthaf can u help me with backend changes?

Sorry, I have no knowledge at all of crates.io backend. I tried to do this, but gave up as I did not managed to get a local version of the backend working with some data.

@shssoichiro
Copy link

I'd like to step up and take a shot at implementing this.

@steveklabnik
Copy link
Member

steveklabnik commented Oct 25, 2016

I mentioned this bug to @ashleygwilliams yesterday, and she mentioned that for @npm, they actually made this a separate service. One issue is that right now, crates.io displays stuff in the metadata, displaying READMEs would mean moving README stuff into the metadata, but that means that it blows up in size....

(By separate service I mean "implemented as a microservice", not a totally different website)

@ashleygwilliams
Copy link
Member

👋 hey all! i'm currently writing a blog post about the interesting history around dealing with READMEs for npm. right now, READMEs are part of the package metadata, and due to the potential for a DDoS, as well as the performance hit large READMEs cause, we've had to place an arbitrary restriction of 64kb on the READMEs. yes, yuck, arbitrary restrictions.

our goal with the READMEs in npm registry v3 will be to remove them from the package metadata completely and run them as a separate service that pre-renders the markdown and serves the generated html statically. this will remove any threat large READMEs would cause to performance or security and it will also have the added benefit of allowing us to render READMEs per version. it will also speed up our website since they will be pre-rendered and served statically.

i would greatly encourage ya'll to learn from our struggles and opt to keep the README out of the package metadata. when ya'll become as popular as npm you'll thank me :)

if this is still open when the blogpost comes out later this week i'll be sure to post it here.

@jdm
Copy link

jdm commented Nov 15, 2016

Blog post: http://blog.npmjs.org/post/153186129005/zero-one-infinity-readmes

@carols10cents carols10cents added C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works P-high labels Dec 15, 2016
bors-voyager bot added a commit that referenced this issue Aug 17, 2017
869: Show Contents of Readme on Crate Pages r=carols10cents

Hey everyone, I started working on README rendering earlier and I thought that I should get some feedback now that it is working correctly on my local instance.

I edited the `/crates/new` endpoint so that when a crate is uploaded to the registry, its README (markdown only) is rendered to HTML and a `readmes/{crate}/{crate}-{version}.html` file is created on S3 when the crate file is uploaded there.
Then, when visiting a crate page, the Ember application makes a request to `{s3}/readmes/{crate}/{crate}-{version}.html`: if the file is found, its content is rendered in place, if not nothing happens.
This way, the renderd README is cached and versioned, following @ashleygwilliams [recommendations](#81 (comment)).

On the `crate/version` route, the README contents is displayed in place of the package description.

The generated HTML is sanitized using [Ammonia](https://crates.io/crates/ammonia). This crate removes all tags and attributes that are not whitelisted, making the resulting markup as safe as possible for the end user. Bonus point: is uses the html5ever crate (created by the Servo project) to parse HTML.

~~~Codeblocks syntax highlighting is done server-side using [syntect](https://crates.io/crates/syntect). This crate uses Sublime syntax files to highlight code samples, which means that we automatically have access to a lot of pre-existing grammars and we can easily add/create missing ones.~~~ Syntax highlighting is now done client-side using PrismJS due to syntect's dependency onigumura conflicting with git2.

Finally, I added a new script in `src/bin` named `render-readmes`. Its goal is to render the README of every uploaded crate version and to upload it on S3. This means that if modify the renderer behavior and we want to retroactively apply it, we're just one command (and a lil' bit of time) away ! (Though I might have left too much `unwrap()`/`expect()` in it)

TODO:

* [x] Security checks in `render::markdown_to_html` (script tag, javascript callbacks, more ?)
* [x] Tests
* [x] Documentation
* [x] ~Edit delete-crate & delete-version scripts to delete README files~ It would seem that the scripts only delete database entries, not stored files
* [x] Script to render published crates README (?)

Fixes #81

Preview:

![screenshot from 2017-07-10 00-22-45](https://user-images.githubusercontent.com/1275463/27998080-f83bad7a-6505-11e7-884e-95ee4f7c2a47.png)
Turbo87 pushed a commit to Turbo87/crates.io that referenced this issue Jan 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works
Projects
None yet
Development

No branches or pull requests

9 participants