Website & API Doc site generator using DocFx script #206
Conversation
This looks great, thanks for the initiative! @NightOwl888 I assume many files' code comments are still broken, so we will gradually get them fixed so it looks better. Then @wwb, is there a way we can use our CI to generate the docs for each build (and then, as a next step, maybe automatically publish them to static hosting of some kind, e.g. GitHub Pages)?
Thanks for this!
We definitely want them. Lucene has HTML documents that they add to each package, and often this is where the best code samples and detailed overview of the API can be found. It would be best if we could add the HTML documents unmodified from Lucene to our repo and have the script convert them to be used in the documentation. Then we just need to copy over the files from the next version and that part of the documentation will be automatic. Here is an example of one of those HTML files: …

If there is a way to automate converting the code samples (preferably both to C# and VB.NET), that would be ideal; but if converting the code samples is not possible, at least that would likely be the only part of the document that needs to change. It occurred to me that we also need to re-map the namespaces, but we should be able to easily automate that part. For the home page, we should also aim to provide the same information as the rest of the Java API docs: https://lucene.apache.org/core/4_8_0/.

Yes, many of the files in Lucene.Net and Lucene.Net.Codecs have not been cleaned up yet. Plus there are some other places where the comments need to be fixed up a bit. I have been doing this bit by bit during the hour-long test runs when I can't really do much else. We could really use some help with this, as it would take one person the better part of a week to get it all done. If 50 people contributed an hour each, we would be done in an hour ;).
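As a rough illustration of automating that namespace re-mapping (one possible approach, not the project's actual tooling): a converter could translate the Java package names found in the Lucene HTML pages into the corresponding .NET namespaces. The sketch below covers the common case; a handful of namespaces deviate from this pattern and would need a small lookup table of exceptions.

```csharp
// Sketch only: maps Java package names from the Lucene HTML docs,
// e.g. "org.apache.lucene.analysis.standard", to .NET namespaces
// such as "Lucene.Net.Analysis.Standard". Some real namespaces do not
// follow this simple pattern and would need explicit overrides.
using System;
using System.Globalization;
using System.Linq;

internal static class PackageNameMapper
{
    public static string ToDotNetNamespace(string javaPackage)
    {
        const string prefix = "org.apache.lucene";
        if (!javaPackage.StartsWith(prefix, StringComparison.Ordinal))
            return javaPackage;

        var rest = javaPackage.Substring(prefix.Length).TrimStart('.');
        var textInfo = CultureInfo.InvariantCulture.TextInfo;
        var segments = rest.Length == 0
            ? Array.Empty<string>()
            : rest.Split('.').Select(textInfo.ToTitleCase).ToArray();

        return string.Join(".", new[] { "Lucene.Net" }.Concat(segments));
    }

    private static void Main()
    {
        // Prints "Lucene.Net.Analysis.Standard"
        Console.WriteLine(ToDotNetNamespace("org.apache.lucene.analysis.standard"));
    }
}
```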
OK cool, I'm happy to update this PR with what I can and let you know what I get done. It probably isn't going to happen super fast, but I can put in some work each week!
Thanks again. I took a look, and the documentation generated perfectly. The documentation and code samples look great. I have done all of the grunt work to update the documentation comments to get rid of nearly all of the compile warnings (at least in Visual Studio). However, there are a few issues/limitations that I found with the generated documentation, as well as some features that would be nice to build in.

Package Breakdown

The Lucene documentation (https://lucene.apache.org/core/4_8_0/) breaks the API down by package first, and then allows you to drill into types. I am torn between that approach and putting everything into one "bucket" like we currently have, which is similar to MSDN. The filter makes it easy to find something specific, but it is difficult to tell where the core types are vs the specialized add-ons, and the amount of data that you have to wade through is a bit overwhelming. For example, the navigation initializes with mostly obscure analysis packages in view before more useful namespaces. If we could somehow arrange it so the main namespaces show up at the top level, and allow a drill down to the levels below (or at least have an additional navigation feature that does this), that would seem more appropriate.

.NET Standard vs .NET Framework

The APIs for each framework are similar, but there are places where they diverge. Namely, there are several types that are not supported in .NET Standard and therefore don't exist. One such example is ConcurrentMergeScheduler. If you look at that class in the documentation, there is no indication at all that it doesn't exist in .NET Core. Ideally, the fix for that would be to generate framework/version-specific documents with a "drop-down" (or similar) navigation feature that allows switching between available frameworks (just like MSDN). Is this (or a workaround) possible?

Missing Links

Some of the documentation I updated has links that are not being generated in the output, even though they show up fine in IntelliSense. Here are some problematic files:
In the first case, several of the links (such as CodecHeader) are not showing up. In the second case, all of them are showing up except for the one after Attributes. I haven't figured out why this is the case. But actually this is a symptom of another problem. In Lucene, they are able to change the link text of a code reference, but I haven't worked out what the syntax for that is (if it is even possible). You can see here that the link after Attributes has the text …. I tried the obvious way to create that type of link (…).

HTML pages

I mentioned this before, but after looking at this there are more than 250 HTML pages, so this is a huge amount of missing documentation, and most of the code samples are in it. I recall reading that some documentation generators allow you to specify "namespace documentation", and if that is the case with DocFx, perhaps we should use that to solve this. If you could provide a specification as to what format the "package documentation" needs to be in (i.e. Markdown) and what convention it needs to follow (where the documentation needs to be in order to show up under a specific namespace), I would be happy to put together a tool to convert the existing HTML pages to that format and location.

Viewport Width

A minor complaint: on a large monitor, only about 2/3 of the available width is being utilized. I checked, and regular MSDN pages are using roughly 10% more width, and some of the newer pages (example) are using about 25% more of the available width. Is there a way to specify a wider maximum width?

Token Replacement

In Lucene there are a couple of tokens, such as …, that need to be replaced in the documentation. Worst case, we could just find and replace in Visual Studio, but it seems better maintenance-wise to use similar functionality if it is available in the doc generator.
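Regarding the custom link text question above, the following is a minimal sketch of the two link styles available in standard C# XML documentation comments (the namespace and type names are only examples). Whether DocFx honors inner text on a <see> element may depend on the DocFx version.

```csharp
namespace Lucene.Net.Example
{
    /// <summary>
    /// Default link text: <see cref="System.Collections.Generic.IList{T}"/> renders
    /// the member name as the link text.
    /// Custom link text: <see cref="System.Collections.Generic.IList{T}">a generic list</see>
    /// should render the inner text ("a generic list") as the link text instead.
    /// </summary>
    public class LinkTextExample
    {
    }
}
```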
Awesome feedback and questions. I'm currently overseas at the moment, but will see what answers I can provide next week. I know the answers to some, but others will require a bit of investigation. I'll get back to you in about a week.
@Shazwazza - Added another minor issue to the above list. Any chance you will be able to answer some of these questions soon? In particular, I would like to know if there is a spec that the HTML docs can be converted to (and whether there is a convention we can use for changing the code links within them into the correct hyperlinks). Even if it is imperfect or still incomplete, it would be nice to have some documentation hosted so people using the beta have somewhere more relevant to turn to than the Lucene 4.8.0 docs.

@synhershko - Any particular reason you are suggesting GitHub Pages instead of hosting at http://lucenenet.apache.org/docs/3.0.3/Index.html? I think it would be less confusing if users only have to modify the version number in the URL to get to the latest. Although, since most of the new classes are not in the same location as the old, now would be the ideal time to jump to a different host if that is indeed the plan.

Question: For pre-releases, should we be releasing new docs on each release in a new versioned location, or updating the existing 4.8.0 version location until it is fully released? It seems the former would be a better option in terms of legacy usage and automation of deployment, but it may end up taking up a lot of space if we end up with a lot of pre-releases.
Hi all, here's some feedback on many of the above questions/comments. I've pushed some updates to this PR which:
I cannot figure out why docfx is complaining about System crefs such as …

If you wish to test this setup without waiting for the entire metadata for all classes to be created, you can update the metadata/src/files section of the /docfx.json file to include only a subset of the projects.

Currently DocFx does not support the namespace-style documentation that Sandcastle used to support; there's an open issue for that here: dotnet/docfx#952. So for namespace-style documentation such as https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.8.0/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package.html, we would currently have to host these as documentation articles. Currently I've put documentation articles in the /docs folder, but it's possible to have any number of different articles folders if required.

As for changing how the namespaces are shown on the left-hand side and ordering the more important ones first, this could be achieved by modifying the generated /api/toc.yml file after it is built. This file is autogenerated by docfx when it builds the API docs. As far as I can tell, one way to do this would be with a custom Post Processor: https://dotnet.github.io/docfx/tutorial/howto_add_a_customized_post_processor.html, but OOTB I don't think this is possible with standard configuration.

I'm not really sure what we can do about .NET Standard vs .NET Framework; there is some mention of this in this issue: dotnet/docfx#1518, which apparently is fixed in this PR: dotnet/docfx#1549. I will just need to figure out exactly what all this means and what the options are.

For token replacement, I think this could also be achieved with a Post Processor in one way or another (https://dotnet.github.io/docfx/tutorial/howto_add_a_customized_post_processor.html), though I did see this feature in later release notes: dotnet/docfx#1737.

There's quite a lot of docs on docfx here: http://dotnet.github.io/docfx/tutorial/docfx_getting_started.html

Hope this answers a few of your questions. I'll keep researching the new docfx versions, what support they have, and why we can't use them currently.
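For example, narrowing the metadata build down to a single project might look roughly like the snippet below in docfx.json (the project path, exclude globs, and dest folder are assumptions for illustration, not the PR's actual configuration):

```json
{
  "metadata": [
    {
      "src": [
        {
          "files": [ "src/Lucene.Net/Lucene.Net.csproj" ],
          "exclude": [ "**/bin/**", "**/obj/**" ]
        }
      ],
      "dest": "obj/api"
    }
  ]
}
```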
I have added some comments inline. I also asked a question about creating code links in DocFx with custom link text; you might find the comments on that question helpful.
Normally, when faced with post-build issues such as these, I either overwrite the contents of the file by generating it in the PowerShell script, or use the PowerShell script to update the contents of the file, depending on how much of the file I need control over.

But I wasn't referring to the order of them so much as the depth. For example, it would be best if we had a link to …

Anyway, I will wade through the rest of this and get back to you if I have any other questions/comments.
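A minimal sketch of that PowerShell idea (not part of this PR): after docfx generates the toc, a post-build step could rewrite it so a handful of core namespaces are listed first. The toc path and the namespace list below are assumptions for illustration.

```powershell
# Sketch only: pull a few "core" namespaces to the top of the generated toc.yml.
# The file path and the namespace list are illustrative assumptions.
$tocPath   = "websites/apidocs/obj/api/toc.yml"
$preferred = "Lucene.Net.Documents", "Lucene.Net.Index", "Lucene.Net.Search", "Lucene.Net.Analysis"

$content = Get-Content $tocPath -Raw

# Split into the YAML header plus one chunk per top-level "- uid:" entry
$parts  = $content -split '(?m)^(?=- uid:)'
$header = $parts[0]
$blocks = @($parts | Select-Object -Skip 1)

# Read the uid from the first line of a chunk
$getUid = { param($b) (($b -split "`n")[0] -replace '- uid:', '').Trim() }

# Preferred namespaces first (in the order listed), everything else unchanged
$top  = foreach ($ns in $preferred) { $blocks | Where-Object { (& $getUid $_) -eq $ns } }
$rest = @($blocks | Where-Object { $preferred -notcontains (& $getUid $_) })

Set-Content -Path $tocPath -Value ($header + ((@($top) + $rest) -join ''))
```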
I like having them at lucenenet.apache.org/docs; I think that's the right solution.
I'd say cross the space issue when it becomes an issue.
…es filter config, updates to latest docfx version, updates to correct LUCENENET TODO
Some updated info:
There's still lots for me to look into based on the previous questions. I'll keep trying out things as I find time.
…in new metadata instead
Much appreciated - keep up the good work 👍

I am almost to the point where I will start documenting the new CLI tool. Originally, I was thinking about making 1 page per command like Microsoft did for their dotnet tool, but there isn't quite enough here for all of that. It would be easier to have 1 document for each of the 4 subcommands, with a small section below for each command, plus 1 overview document describing the tool in general (so 5 pages of docs). This tool contains all of the index maintenance tasks (checking, fixing, upgrading, splitting, merging, moving segments around, etc.) plus a set of demos that can be run, have their source code viewed, or be exported. The plan is to put this tool on Chocolatey so it can be easily installed and updated, as well as make it part of the CI release process.

Would building the docs in Markdown and placing them in a subdirectory of tools be appropriate, or would something else work better?
This is one point that is a bit unclear to me. In the past I have tried to make links between pages on GitHub and they didn't always work; I ended up using absolute URLs to avoid the problems (but I don't recall exactly what they were). Do you have a suggestion about the correct way to make relative links between Markdown pages?
@NightOwl888 If you put markdown files in the /apidocs/tools folder, that should be fine, and then I can update any "toc" files to point to them. Currently that folder already exists for downloaded 'tools' that help the docfx build process, but I'll move those to a better temporary folder (i.e. 'obj/tools'). The correct way to create links between MD pages can be seen here: Shazwazza@d440348#diff-a121785cab27b808ad3b4d2fbd049bc7R6, and docfx will ensure it's all wired up correctly when it builds. Of course, if you want to go up a level, it's the standard "../" syntax.
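For example (the file paths here are hypothetical), a relative link from one markdown doc to another would look like the following, and docfx rewrites the .md link to the generated .html page when it builds:

```markdown
<!-- hypothetical paths, for illustration only -->
See the [lucene-cli documentation](tools/lucene-cli/index.md) for details,
or use the standard "../" syntax to go up a level: [back to the docs home](../index.md).
```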
Actually, I was referring to the new "tools" folder under "src" (to keep the docs near the source code, the same as they would be for the docs converted from the Java HTML pages). BTW, there is some discussion about the WIKI happening on the dev mailing list. If you are not already, you should subscribe to stay looped in on this.
Ah I see, we can include any md files from anywhere in the solution, so wherever you want to put them will work just fine :) I'm on the list so all good, just haven't had a chance to reply quite yet, will do soon.
…ugin for parsing lucene tokens.
…ags inside the triple-slash comments ... but can't get it to work so will revert back to dfm, just keeping this here for history
…pecial tags inside the triple-slash comments ... but can't get it to work so will revert back to dfm, just keeping this here for history" This reverts commit efb0b00.
…l tags are replaced.
Fixing links to the download-package
@Shazwazza I will go ahead and merge this beast. Again, thank you for all your efforts and time getting this together. Well done!
Looks great. However, on first pass I was unable to find the docs for the lucene-cli tool. Did they not get included, or are they just hard to find? There probably should be a link to them from the home page, as they contain the demos in both executable and exportable form. There are also some updates to those docs, because I have now set up the deployment so the tool can be installed using …
Hi @NightOwl888, welcome back :) Yes, these are hard to find; the docs site isn't really finished or "live" yet, it's sort of pseudo-live. I haven't had time to get this into the correct state, but maybe soon I can.

Currently, from the new website, if you go to the Documentation tab (https://lucenenet.apache.org/docs.html), it links to the different API doc sites. The first one there is the new 4.8 docs, but it is still hosted on my own Azure account since it's still the demo site (but better than nothing for now). The menu on this doc site is broken, so it shows a hamburger menu even for the desktop site; if you click that, you'll see them: https://lucenenetdocs.azurewebsites.net/cli/index.html

I should try to find some time to get the docs site running correctly, and then we should get it hosted properly too.
…)" This reverts commit 0d56d20.
I am working on rolling another beta and would like to try getting the docs updated to reflect the install instructions for the …

I'd like to try to get the doc building functionality hooked into the release build. Maybe we won't have it automated to the point of doc site deployment, but it would be nice to at least get it to the point where running a build produces the docs and main website as build artifacts so they can be manually downloaded and deployed. Ideally, we would parameterize the version number and base URLs that the docs use and pass them into a command to generate the docs. Those parameters would be put into …

If you don't have time right now, we don't necessarily have to do this before the release (the docs could be manually generated and synced up afterward). But if you have time to work on this, it would be great if we could get it partially done. Thanks again for all your hard work!
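As a purely hypothetical illustration of that parameterization (the -LuceneNetVersion and -BaseUrl parameters do not exist in the current scripts, and the version and URL values are made up), the release build might eventually invoke something like:

```powershell
# Hypothetical sketch only: -LuceneNetVersion and -BaseUrl are not existing
# parameters of docs.ps1; they illustrate the parameterization suggested above.
$version = "4.8.0-beta00001"                              # assumed pre-release version number
$baseUrl = "https://lucenenet.apache.org/docs/$version/"  # assumed docs base URL

./websites/apidocs/docs.ps1 -Clean 1 -LuceneNetVersion $version -BaseUrl $baseUrl
```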
@NightOwl888 Sorry, I ran out of time today. I have this pinned in my inbox and will get back to you on Monday, hopefully with some updates too. Cheers!
No problem. Just out of curiosity, how long does it take to generate the docs? We could probably do it on a dedicated server in parallel with one or more of the other jobs so it doesn't add any time to the overall run, but we have a 1-hour cap on how long anything can run on a single server. Also, it seems the servers take about 2x as long as my local machine to run anything.
Hi @NightOwl888, I have pushed a PR here with notes; we can discuss the above stuff there: #229
This is a PR to build both a new website and the API documentation using DocFx.
There are several changes:
To test the website you can run:
websites/site/site.ps1
which will build the site and start a webserver at http://localhost:8080. For any changes made to the site, just stop the script (Ctrl + C) and re-run it, and it will do an incremental build. To just build the website for deployment, run:
websites/site/site.ps1 -ServeDocs 1 -Clean 1
which will clean all temp files and compile the website to a static website at websites/site/_site.

To test the docs you can run:
websites/apidocs/docs.ps1
which will build the site and start a webserver at http://localhost:8080. For any changes made to the docs, just stop the script (Ctrl + C) and re-run it, and it will do an incremental build. To just build the docs for deployment, run:
websites/apidocs/docs.ps1 -ServeDocs 1 -Clean 1
which will clean all temp files and compile the docs to a static website at websites/apidocs/_site.

(In both cases, the build operation takes a few minutes!)
Website tasks to complete:
gitpubsub is the easiest and most flexible to use.

Docs tasks to complete:
Additional tasks (nice to have):