Audit the website for SEO issues #286
Already started ;) |
I made a first check pass and wrote down my findings in draft mode. I'll create an issue for each one + details on how to correct / enhance :) |
This one is not quite correct. The 307 is because of HSTS, which the browser has cached. It’s an internal redirect added by your browser. If you use an online tool like https://wheregoes.com/ on http://almanac.httparchive.org then you’ll see it does a 302 to https://almanac.httparchive.org/ (this should probably be a 301) and then a further 302 to https://almanac.httparchive.org/en/2019/ (which probably needs to stay as a 302 because next year it will redirect to the 2020 directory, so it is a temporary redirect, albeit a year-long one). |
Yeah, the point is http=>https should be 301 and not 302, 307 or any other redirection type. |
The second 302 is not there because of the one-year redirect. It is mainly for language detection. Once the translations are added, if your accept-language is, for example, es, it should redirect to /es/2019 with a 302. |
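The language detection described here could be sketched roughly as follows. This is only an illustration, not the site's actual routing code: `pick_language`, `language_redirect`, the `supported` set and the `year` default are all hypothetical names chosen for the example.

```python
def pick_language(accept_language, supported, default="en"):
    """Return the best supported language for an Accept-Language header value."""
    candidates = []
    for position, part in enumerate(accept_language.split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q value counts as q=1
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by highest quality first, then by position in the header.
        candidates.append((-quality, position, tag))
    for neg_quality, _, tag in sorted(candidates):
        if neg_quality == 0:  # q=0 means "not acceptable"
            continue
        primary = tag.split("-")[0]  # "es-ES" -> "es"
        if primary in supported:
            return primary
    return default


def language_redirect(accept_language, supported, year="2019"):
    """Return (status, location); 302 because the target varies per visitor."""
    language = pick_language(accept_language, supported)
    return 302, "/{}/{}/".format(language, year)
```

With `supported={"en", "es"}`, a header of `es-ES,es;q=0.9,en;q=0.8` would be sent to `/es/2019/`, while an unsupported preference falls back to `en`.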
I ran a crawl of the site in its current state, and everyone can access the findings here. In addition to the points in @AymenLoukil's Google doc:
I'll schedule another crawl when the chapters have been added in to look for any other issues, and am happy to run more as we go along and share those as and when needed. |
@AymenLoukil In addition to your doc, the |
Agreed but my point is the 307 isn’t a real redirect served by the server. It’s a fake one generated by Chrome to represent the automatic upgrade to HTTPS that HSTS does (see here: https://www.seroundtable.com/googlebot-hsts-redirects-301-307-21405.html). So you can’t “fix” this (and in fact there is nothing to fix). But you are 100% right about the first 302 (that those clients who have not loaded the HSTS instruction will use - including GoogleBot) - which is basically the point you were making! 😀 |
Yeah, I confirm, @bazzadp! You're right on the 307. I went too fast on this :) |
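To make the agreed rule concrete (the http to https hop should be a 301, while the language hop can stay a 302), here is a small sketch that checks a redirect chain. It assumes the hops were recorded by a client with no cached HSTS policy, such as the online tool mentioned above, so the browser-internal 307 never appears in the input. The function name `audit_redirect_chain` is hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def audit_redirect_chain(start_url, hops):
    """Flag HTTPS upgrades that are not permanent redirects.

    hops is a list of (status, location) pairs in the order the server
    returned them, starting from start_url.
    """
    findings = []
    current = start_url
    for status, location in hops:
        target = urljoin(current, location)  # resolve relative Location values
        src, dst = urlsplit(current), urlsplit(target)
        if src.scheme == "http" and dst.scheme == "https" and status != 301:
            findings.append(
                "{} -> {}: HTTPS upgrade uses {}, expected 301".format(
                    current, target, status
                )
            )
        current = target
    return findings


# The chain reported earlier in this thread: two 302s in a row.
chain = [(302, "https://almanac.httparchive.org/"), (302, "/en/2019/")]
for finding in audit_redirect_chain("http://almanac.httparchive.org", chain):
    print(finding)
```

Only the first hop is reported; the second 302 is left alone because its target depends on the visitor's language.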
@rachellcostello two chapters are currently live and ready for another audit: Markup and Performance. |
The updated crawl can be accessed here. It confirms that the canonical tags are being picked up and the author link errors have been resolved. Page reports for the Markup and Performance chapters aren't showing any issues, which is good! |
OK so at the moment the issues I see are:
So it looks to me that once we add the Sitemap, hopefully no further work is required here other than rerunning a scan when the full site is launched. Let's hold this open until then, but I wanted to make sure there was no expectation that any other work was ongoing here. Let me know if there is! |
Add native lazy loading for all the imgs / iframes |
Raised #351 for this. |
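One way to find the elements that still need the attribute is to scan the rendered HTML for `img` and `iframe` tags without `loading="lazy"`. The sketch below uses only the standard library; `LazyLoadingAudit` and `find_non_lazy` are hypothetical names, and it flags every such element, as the suggestion above proposes.

```python
from html.parser import HTMLParser


class LazyLoadingAudit(HTMLParser):
    """Collect <img> and <iframe> tags that lack loading="lazy"."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "iframe"):
            return
        attributes = dict(attrs)
        # A valueless attribute is reported as None, hence the "or".
        if (attributes.get("loading") or "").lower() != "lazy":
            self.missing.append((tag, attributes.get("src")))


def find_non_lazy(html):
    """Return (tag, src) pairs for elements without native lazy loading."""
    audit = LazyLoadingAudit()
    audit.feed(html)
    audit.close()
    return audit.missing
```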
It is a security + performance issue :) |
Hmm. Just got this email from Google Search Console:
I also submitted the sitemap.xml file and it returned a few errors similar to this:
Visiting the file itself shows an error page: https://almanac.httparchive.org/sitemap.xml |
OK will fix that with my sitemap fix PR. |
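The exact Search Console errors are not shown in this thread, so the following is only a generic sketch of producing a well-formed sitemap with absolute URLs, not the fix that went into the PR. `build_sitemap` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as a string for a list of absolute URLs."""
    for url in urls:
        if not url.startswith(("http://", "https://")):
            raise ValueError("sitemap URLs must be absolute: {!r}".format(url))
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # text is XML-escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap(["https://almanac.httparchive.org/en/2019/"]))
```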
Any ideas if these are resolved now? Guessing the Sitemap is, but does it list hreflang issues in GSC so can see if fixed there (may not be until it’s next crawled)? |
Not sure. I wasn't able to view the error anywhere other than the vague email. I'll keep an eye out for any more vague emails. |
One of the great things about staffing CDS is that there are real live humans who work on Search Console here! They helped identify the hreflang issue we were seeing and pointed me to this support doc: https://support.google.com/webmasters/answer/189077?hl=en
It's still kind of unclear what SC is complaining about, but maybe
should be
|
Could it be you shouldn’t implement it unless you have more than one language? |
No, @rviscomi. We implemented it right. As I said earlier, just change en-US to en, and make the paths absolute. That's it. Every page should reference itself and reference the other languages (if they exist). And if A references B, B should also reference A. The x-default one tells search engines which page is the default (for an international company, it could be the language selector page). It is optional, but we could add it. Could you please resubmit the website in SC, ask for indexing, and submit the Sitemap.xml? |
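The rules listed in this comment (self-reference, return links, absolute URLs) can be checked mechanically. The sketch below assumes the hreflang targets have already been extracted per page; `find_hreflang_problems` and the shape of its `pages` argument are hypothetical.

```python
def find_hreflang_problems(pages):
    """Check hreflang annotations for the rules described above.

    pages maps each page URL to the set of URLs it lists in its
    hreflang tags.
    """
    problems = []
    for page, alternates in sorted(pages.items()):
        if page not in alternates:
            problems.append("{} does not reference itself".format(page))
        for target in sorted(alternates):
            if not target.startswith(("http://", "https://")):
                problems.append(
                    "{} lists a non-absolute URL: {}".format(page, target)
                )
            elif target != page and page not in pages.get(target, set()):
                problems.append(
                    "{} references {} without a return link".format(page, target)
                )
    return problems
```

An empty result means every page references itself and every cross-language link is reciprocated.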
Ok so this |
https://www.seroundtable.com/one-language-hreglang-google-23970.html I really think we should add an if statement around this so it doesn't show until the second language is launched. It makes no sense to have it as of now. |
It's a two line change to base.html:
|
Both versions are fine: https://support.google.com/webmasters/answer/189077?hl=en. However agree we should just do the language and not the country since we are unlikely to have |
Yes :) |
OK for that. When we implemented the hreflang tags, I thought we would have at least one other language at launch (es). So why not hide them until we publish a new language? |
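The behaviour agreed on here (language-only codes, absolute URLs, and no hreflang output until a second language exists) might look like the sketch below. It is not the actual change made to base.html, which is a template; `hreflang_links` and its parameters are hypothetical.

```python
from html import escape


def hreflang_links(urls_by_language, x_default=None):
    """Build hreflang <link> tags for one page and its translations.

    urls_by_language maps a language code such as "en" to the absolute
    URL of that translation. Nothing is emitted for a single language.
    """
    if len(urls_by_language) < 2:
        return []  # hide the tags until a second language is published
    links = []
    for language, url in sorted(urls_by_language.items()):
        if not url.startswith(("http://", "https://")):
            raise ValueError("hreflang URLs must be absolute: {!r}".format(url))
        links.append(
            '<link rel="alternate" hreflang="{}" href="{}">'.format(
                escape(language), escape(url)
            )
        )
    if x_default is not None:
        # Optional: point x-default at one of the listed languages.
        links.append(
            '<link rel="alternate" hreflang="x-default" href="{}">'.format(
                escape(urls_by_language[x_default])
            )
        )
    return links
```

Emitting the same list on every translation of a page gives the self-reference and the return links for free.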
Agree, as per here: https://support.google.com/webmasters/answer/189077?hl=en:
We have no Included a fix for that in 5e9beba. In that, I say that we support region codes, but I'm not 100% sure we do, since the directory structure is based on language only. There is probably more cleanup we could do, but that will do for now; we can look again when we have more languages and/or regions, as it is difficult to test with just the one we have now.
Included in 0810d2d. This could do with thorough testing when the first additional language goes live! Should we raise a separate issue for that? |
The hreflang issue appears to have been resolved. @AymenLoukil do you feel comfortable closing this issue? |
@rviscomi good! |
Of course! Thank you! |
Also, contributors page is shuffling on reload which is not the best choice from SEO / crawl POV. |
Yeah, I don't think it's a big deal, and it's nicer to our contributors, so I say we accept this. @AymenLoukil are we good to close this issue? Alt tags are being tracked in #379 and the rest have either been dealt with or can be accepted IMHO. |
What about the figure file names?
Will you create a separate issue? |
I think it's a minor thing. And having done the last big bulk update not too inclined to do another! I care much more about the content being SEO-friendly than the images anyway. Another item could be that we should watermark all the images with "© HTTP Web Almanac 2019" in case anyone uses them, but again I think that's one for next year unless anyone takes that on?
Nope. I'm saying we accept it. This was a conscious decision and I don't see a big SEO loss with it, to be honest. |
Visual search is so important and is one of the 3 pillars of Google search announced on its 20th birthday. Images are part of the content, and text is neither more nor less important. Having a descriptive, SEO-friendly file name is one of the basics of image SEO.
I understand. We could slugify the figure title and use it as the image filename.
There is no big SEO loss here. The fact is that Google receives a different answer each time it crawls the page, but that is OK. |
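For reference, deriving a filename from a figure title, as suggested in this comment, could be sketched like this. The helper names and the example titles are made up for illustration, and nothing like this was adopted in the thread.

```python
import re
import unicodedata


def slugify(title):
    """Turn a figure title into a lowercase, hyphen-separated slug."""
    normalized = unicodedata.normalize("NFKD", title)
    # Drop accents and any other non-ASCII characters.
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def figure_filename(title, extension="png"):
    """Return a descriptive image filename derived from the figure title."""
    slug = slugify(title)
    if not slug:
        raise ValueError("title has no usable characters")
    return "{}.{}".format(slug, extension)


print(figure_filename("Distribution of JavaScript bytes per page"))
```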
Personally, I think the effort involved, particularly with the translations well underway now, far outweighs any benefit here. To prove the point, I just Googled what I think might be a typical search request that one of the titles might help with: As you can see, we already appear as the 3rd and 5th search results. I really don't think renaming the image files would bring a noticeable improvement. They are already loaded with context from the text around them, |
I think this issue is ok to close and we can open new issues for anything SEO related that comes up. |
Since we have SEO experts on-hand it'd be great if we can make sure that we're following our own best practices on the website.
The version pushed to https://almanac.httparchive.org has most of the site structure except for the chapter content (the most important part) so if needed we can revisit the website when that's pushed in ~a week or we can audit the local website. The website is also missing translations, so we won't be able to check for that class of SEO issues yet.
CC @ymschaap @rachellcostello @AVGP @clarkeclark @andylimn @voltek62 @AymenLoukil @catalinred