this directory its open #240

Closed
taylitor opened this issue Oct 13, 2015 · 11 comments · Fixed by #301

Comments

@taylitor

https://nodejs.org/download/

@stevemao
Contributor

Can you provide some description of what the issue is rather than throwing a random link?

@Trott
Member

Trott commented Oct 14, 2015

I believe they are reporting that the URL pulls up an index of the directory, which is often reported by security scanners and the like as a security issue. In this case, though, I don't think it is. I think it's a feature, not a bug.

@rvagg
Member

rvagg commented Oct 14, 2015

I'm thinking we might need to put a simple index.html in there because (a) we keep getting these reports from folks who think it's a problem and (b) it's still a primary result when you google "node download" or "download node.js" or similar, regardless of the robots.txt (if anyone has a suggestion for fixing this other than an index.html for that page, please let us know!).

@silverwind
Contributor

I too like the browsability of the downloads folder; it's a simple way to access old versions. There are things like the empty test folder that should probably be cleaned up.

Of course, it'd be nicer if Google pointed to https://nodejs.org/en/download/ when searching for "nodejs download", but we probably need to do some SEO for that.

@rnsloan
Contributor

rnsloan commented Oct 26, 2015

I have pinged my SEO friend about the problem. Will report back his thoughts.

@bnb
Contributor

bnb commented Oct 26, 2015

Here's my angle on this: when I was a younger developer, there were programming language/framework/whatever sites that used a directory instead of having a page. This was pretty discouraging as a newcomer, as I didn't really know what to do with this structure that I'd never seen on the web before.

It was frustrating when there were just two software versions + 32/64 bit versions + Windows/Mac/Linux versions. This directory is much harder than that - to get to the latest Node release, you have to pick the right folder (release) and scroll all the way down to the arbitrary location of the latest version in the middle of the page. Very poor usability.

@rnsloan
Contributor

rnsloan commented Oct 27, 2015

Here is the advice I have received on stopping this from appearing in search results.

  1. Remove the robots.txt block on /download/
  2. Pass a noindex X-Robots-Tag via the HTTP header (this assumes you don't have an index page on https://nodejs.org/download/; if you do, you can use a meta robots tag <meta name="robots" content="noindex" /> in the <head>)
  3. Use the nofollow link attribute for all links that point to https://nodejs.org/download/ from this website (e.g. in your navigation, wiki, etc.)
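
For anyone wanting to verify step 2 once it is deployed, here is a minimal Python sketch that checks a set of response headers for a `noindex` directive in `X-Robots-Tag`. The function name `has_noindex` is made up for illustration, and it deliberately ignores bot-specific prefixes such as `googlebot: noindex`. Step 1 matters here because a crawler that robots.txt keeps away from /download/ never fetches the page, so it would never see the header.

```python
def has_noindex(headers):
    """Return True if any X-Robots-Tag header value lists the noindex directive.

    `headers` is a plain mapping of header name to header value, for example
    the headers of a response fetched from https://nodejs.org/download/.
    Header names are compared case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() != "x-robots-tag":
            continue
        # A single header may carry several comma-separated directives.
        directives = [part.strip().lower() for part in value.split(",")]
        if "noindex" in directives:
            return True
    return False


if __name__ == "__main__":
    print(has_noindex({"X-Robots-Tag": "noindex, nofollow"}))
    print(has_noindex({"Content-Type": "text/html"}))
```

The headers could come from any HTTP client; passing a plain dict keeps the check easy to try without touching the live server.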

@rvagg do you want to just stop this from appearing in search results, or additionally stop it from being flagged by people who think it is a problem by plugging it with an index.html? (Personally I share @silverwind's view, but I have no idea how often this comes up.)

@rvagg
Member

rvagg commented Oct 27, 2015

We've had 3 or 4 reports of this now (perhaps more via other channels) from people who think it's a problem, and the reason is that they are finding it via Google:

[Screenshot of the Google search results, taken 2015-10-28 at 9:58:17 am]

That search really should go to https://nodejs.org/en/download/ instead because it's the place where we can guide them through the selection process.

@rnsloan
Contributor

rnsloan commented Oct 28, 2015

Made a start on this: https://github.com/rnsloan/new.nodejs.org/commit/aaf435494420d1bd01749ca4a4d94f71d22d0ca5

@rvagg do you still want these two directories inside /download to be crawled?

robots.txt:

Allow: /download/release/latest/
Allow: /download/release/latest/docs/api/

I am unsure if they will end up supplanting the current top search result. Might be best if we try removing all of /download from being crawled.
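
To see how the two options behave, here is a rough sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line and the `Disallow: /download/` rule are assumptions (only the two Allow lines are quoted above), and the helper names are invented for illustration. Real crawlers may resolve Allow/Disallow conflicts differently from this parser, so treat it as an illustration rather than a prediction of what Google will do.

```python
from urllib.robotparser import RobotFileParser

# Variant A: keep the two Allow lines and block the rest of /download/.
# The User-agent and Disallow lines are assumed, not quoted from the thread.
WITH_ALLOW = """\
User-agent: *
Allow: /download/release/latest/
Allow: /download/release/latest/docs/api/
Disallow: /download/
"""

# Variant B: block all of /download/ from being crawled (also assumed).
BLOCK_ALL = """\
User-agent: *
Disallow: /download/
"""


def build_parser(robots_txt):
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawlable(robots_txt, url):
    """Return True if a generic crawler ("*") may fetch `url` under these rules."""
    return build_parser(robots_txt).can_fetch("*", url)


if __name__ == "__main__":
    urls = (
        "https://nodejs.org/download/",
        "https://nodejs.org/download/release/latest/",
        "https://nodejs.org/en/download/",
    )
    for label, rules in (("with Allow lines", WITH_ALLOW), ("block all", BLOCK_ALL)):
        for url in urls:
            print(label, url, crawlable(rules, url))
```

Under both variants the directory index at /download/ is off limits to crawlers while /en/download/ stays crawlable; the variants differ only in whether /download/release/latest/ remains reachable.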

@rvagg
Member

rvagg commented Oct 28, 2015

The server config needs to go in https://github.com/nodejs/build/blob/master/setup/www/resources/config/nodejs.org

And yeah, I'm fine with disallowing all of /download; the API docs get served from /api anyway.

@rnsloan
Contributor

rnsloan commented Oct 30, 2015

Once this nginx config change (nodejs/build#231) is deployed, I will create a PR to update robots.txt.


7 participants