Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Separate core and ancillary components [WIP, DO NOT MERGE] #38

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bensetterholm
Copy link
Contributor

Hello Brian,

I am planning to make use of this library in my next project, since Zemax is on the one hand overkill for my application and on the other hand is not scalable to run on a Linux cluster after an initial exploration.

Unfortunately, some of the dependencies in this library are causing problems on my overly-managed workstation (PyCall is not building properly on one of them and I cannot get Pluto to run on either of the two work computers I use, plus other seemingly non-deterministic build issues).

Since the dependencies that are causing issues for me are not necessary to the main functionality of the library (for example, all the Python-related code is limited to the cloud computing infrastructure), I would like to propose splitting the library into core and ancillary components - each with their own Project.toml file - whilst keeping OpticSim as both a monorepo and an "all inclusive" metalibrary.

As such, I have reorganized as follows:

  • OpticSimCore (contains the GlassCat, Geometry, Optical, and RepeatingStructures [sans Examples] subfolders)
  • OpticSimVis
  • OpticSimOptimization
  • OpticSimCloud
  • OpticSimExamples (contains Examples and Data subfolders, plus the RepeatingStructures multilens example)

This then allows me to load in only the components I care about without having to deal with precompilation errors (for which I was previously commenting out includes and removing Project.toml entries locally).

The extra advantage of this exercise is that I found a handful of dependencies that are either no longer used in the project, or are only used in the test or examples code, and do not need to be part of the core codebase.

I am opening a pull request now to solicit your opinion of whether or not this is a change you would support for your library, before putting in any more time to validate correctness, fix potentially broken examples with this change, and further reduce unnecessary dependencies (improving start-up times) - for example Plots within core could be eventually migrated to Makie and moved to visualization submodule and DataFrames could be moved to a package extension, etc, etc.

Currently, the changes I have made should preserve the [exported] behavior of OpticSim as-is (with minimal changes to unit tests only that call unexported methods), and all of the unit tests pass on my machine, but I have not yet validated the Examples or updated any documentation.

I am very happy to expand/reduce the scope of what is "core" and put in the necessary elbow greace into moving components as appropriate.

Cheers,
Benjamin

@rambunctiousapple
Copy link

Thank you for taking on this reorganization. This is a good idea which I wholly support. I've wanted to break up OpticSim into smaller modules for a long time but haven't had the time.

Two suggestions:

  1. It might be better to have some kind of AddOns package and put RepeatingStructures there rather than have it be part of OpticSimCore. We wrote the repeating structures code because we were designing ultra thin multi-lens optics for head mounted displays which was unreasonably difficult to do in Zemax as well as being excruciatingly slow to simulate. 99% of users won't need it so it doesn't make sense to always have to load it.

  2. GlassCat could be in its own package since it's quite independent from the rest of the code. I made a start here https://github.com/rambunctiousapple/AGFFileReader.jl. Not sure you'd want to use this as a model, I was trying to put the glass catalogs into artifacts rather than downloading them directly from the manufacturer's websites.

We wanted to have VSCode intellisense completion of glass names but the only way we could make this work was to generate the glass code at install time and include this code in the main OpticSim package. Pretty sure that packages are no longer allowed to modify their source at installation so you may not be able to get OptiSim into the general registry without changing this.

Another problem was keeping track of the urls the glass companies use for their glass catalogs. These are hard coded into the GlassCat code in the src/GlassCat/data/sources.txt file. The companies change these urls occasionally so after a while some of the catalogs wouldn't be found and users would start out with an incomplete set of glasses on installation of OpticSim. Adding glasses after the original installation is possible but as I recall it is not straightforward.

Rethinking how this part of the code works could eliminate many of the maintenance headaches we had.

Another thing that might benefit from a rethink is the documentation build. This renders the illustrations on GitHub headless servers. Getting xvfb and its dependencies to load and execute properly on the GitHub servers was a continuing problem. Every few months something would break, for non-obvious reasons. Fixing these problems consumed many maintenance hours.

Perhaps it would be better to build the documentation locally on a machine that has a GPU and then include the docs as part of the commit. We never had problems getting rendering to work locally.

# interval imports
using OpticSim: α, halfspaceintersection, positivehalfspace, lower, upper, EmptyInterval, rayorigininterval, intervalcomplement, intervalintersection, intervalunion, RayOrigin, Infinity, Intersection
using OpticSim: EmptyInterval, RayOrigin, Infinity, Intersection

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The items used from OpticSim (EmptyInterval, etc.) seem to be things that would be defined in OpticSimCore from which you import a bunch of other stuff related to rays and ray tracing. Was there a reason you broke them apart like this?

# lens imports
using OpticSim: reflectedray, snell, refractedray, trace, intersection, pathlength, power
using OpticSim: trace, intersection, pathlength, power
using OpticSim.Core: reflectedray, snell, refractedray

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here. why are the imports split between OpticSim and OpticSim.Core. These all look like functions that would be part of OpticSimCore

@@ -2,7 +2,8 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# See LICENSE in the project root for full license information.

using OpticSim:area,Rectangle,ParaxialLensRect,virtualpoint
using OpticSim: Rectangle, ParaxialLensRect, virtualpoint
using OpticSim.Core: area

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question again about the split between OpticSim and OpticSimCore imports

@bensetterholm
Copy link
Contributor Author

Thank you for your prompt reply. As I mentioned above, I am happy to take on the morcellation task. Please note that I created this pull request as a means to start a conversation (and I thus put "DO NOT MERGE" explicitly in the title).

  1. Sounds good; I will separate out the RepeatingStructures code from Core. To be honest, I did not dig too deeply into these submodules.
  2. I have an idea for the GlassCat code using the Scratch library, but since it is a bit tangential to this pull request, I was planning to prototype it separately later next week. I did something similar in a previous project I worked on, though intellisense was not a consideration there so I am not certain if it will pan out.

More generally, I would like to map out explicitly what should and what should not be included in "core" before proceeding further. And how many ancillary modules do we want to split off? (I am of the opinion that we should keep everything in the single repo though rather than separating an important component like the glass catalog into a separate repository. I am also linking this discussion in JuliaLang so that I can find it again later.)

I see your comments regarding documentation and do not have any additional thoughts at this time...

With regards to your comments on the tests, my primary goal when submitting this PR was to demonstrate that splitting the library is viable without changing the API as much as possible while still making the tests pass. To that effect, anything that used to work with a simple using OpticSim should remain working, with the exception that some private methods are no longer accessible from the OpticSim namespace due to the fact that the Reexport library only reexports exported methods.

I will write more on Monday. It is late here now in Germany. Have a nice weekend!

@rambunctiousapple
Copy link

rambunctiousapple commented Aug 24, 2024

As for what should be in core I am 99% certain that nothing you currently have in core uses any functionality from repeating structures and this is super specialized functionality that most people won't use. So the Repeating Structures folder could move up two directory levels into its own subdirectory at the same level as OpticSimVis or OpticSimOptimization.

I am agnostic about putting GlassCat in its own project. If it would be simpler to leave it part of a monorepo as is then that is fine with me.

@bensetterholm
Copy link
Contributor Author

I have been working on trying to separate the GlassCat code into its own module like the others in the PR above and I am constantly running into an issue with tests with the glasses defined in the Examples module. The root problem is this running ID number that gets incremented each time a new glass is added. (The specific problem when just loading the Examples module is I think that it adds its glasses somehow to a different namespace than what is accessible to the user, but I cannot figure out why yet).

As an entirely different, unrelated exploration, I was able to get the module to dynamically load AGF files without the intermediate "generate jl files during build" step AND keep intellisense working (via Expr evals creating the submodules on-the-fly in the GlassCat module's __init__ function), but again the thing that bit me here is the ID numbering, because in this case the Example glasses get loaded first and then the library glasses start loading but they think their IDs should start from the beginning... easy enough to fix once I knew what was going on, but took a long time to track that bug down.

The fundamental thing I don't understand is why there is this ID number in the first place? To me it seems that machinery is in place for the sole purpose of keeping the AGF_GLASS_NAMES array populated, which in turn exists so that the show method for glasses returns something sensible. But why does the Glass type not just hold have the name as one of its fields instead? The Glass type is not isbitstype anyway given the Union type for the transmission field. I must be missing something else, which is why I want to ask before I subtly break things further :)

@brianguenter
Copy link
Owner

It's been a while since I looked at that code. Checking it out now, might take a bit to figure out what is up.

@brianguenter
Copy link
Owner

Looked at the code and am not seeing any obvious reason why there is a numerical identifier for each glass. It seems that instead you could just use the Glass object itself. It's been too long for (4+ years) for me to remember the precise reason for this choice.

We started on this project before Julia 1.0; the old compilers were not great at preventing allocation. Any allocation killed multithreaded performance, so we contorted the design to ensure that allocations were 0 for performance critical code.

Preventing allocation may have been the motivation for the numerical id. Today it seems the better design would be to get rid of GlassID entirely and just use the Glass type directly.

@brianguenter
Copy link
Owner

The GlassId is pervasive; many functions expect this type and access either the name or the id to get at the glass.

Initially It might be better to put aside splitting off the glass catalog code and instead get the entire system to build and run properly with the glasscat stuff as is. Then address the issue of splitting off glasscat. Otherwise, you might introduce subtle bugs into the code that could take a long time to track down.

@brianguenter
Copy link
Owner

I got the glass cat code working as a standalone project here https://github.com/brianguenter/AGFFileReader.jl. In case you reconsider having several packages instead of a monorepo. There's enough functionality in AGFFileReader so that it is useful standalone - you can load glasses, plot index against wavelength, draw glass maps, and a few other things.

@bensetterholm
Copy link
Contributor Author

Hello Brian,

I've been working on a massive refactor of the glass catalog code into a new Materials module. A draft of this code is here if you want to play with it; note that it is still undocumented and there is still some functionality missing compared to the current version.

A brief synopsis:

  • A new [internal] Material data type that contains only the material information needed by OpticSimCore. I have already implemented conversions from AGFGlass, ModelGlass, and MILGlass to this type. In the future, we can add other "raw" types to ingest data from other sources (such as refractiveindex.info)
  • New data types AGFGlass and AGFCatalog, which contain all of the information within a AGF file. I have already implemented read(file_path, AGFCatalog), which entirely eliminates the need to build bespoke jl files. It would also be trivial to implement a write method in the future. Note that the AGFGlass type parses new tokens added to the file standard since the OpticSim module was written. The AGFCatalog type includes a list of symbols of all the loaded glasses and a Dict that lazily generates Material objects only when the user wants them. The list of symbols is emitted with a call to propertynames so all available glasses can be seen with REPL tab-completion once a catalog file is loaded in.
  • Dispersion formulations are now their own types, meaning new ones can be easily added in the future (or by the end user)
  • Notably, this module does not include code that downloads external AGF files or manages the catalog as a whole (i.e. ID numbering and search functionality). I do not believe this behavior belongs in OpticSimCore, and will thus be incorporated elsewhere, see further down this post.
  • I have not implemented the Unitful or plotting functions yet. These I think would be best moved to package extensions anyway to keep the dependencies in OpticSimCore to a minimum.

I plan to finish this draft and submit it as a separate PR. Then, I will rebase this PR and propose the following layout to the main OpticSim repository:

src/
    <code for glass catalog management, optimization, and repeating structures, eye model/liquid refractive index materials>
OpticSimCore/
    src/
        <Materials, Geometry, Optical>
    test/...
OpticSimVisualization/
    src/...
    test/...
Examples/
    <notebooks, docs example functions, cloud computing>
⋮
<deps, tests, docs, paper, license, readme, etc>

Notably, the "less used" items you mentioned above (like repeating structures) will remain in the umbrella OpticSim module, with only Core and Visualization components split off. And the cloud computing to me seems more of an "example" rather than library functionality. The only component I am on the fence about is the Optimization code. It does have the potential to be its own separate OpticSimOptimization top-level module, but currently I do not think it is sufficiently developed to justify taking it out of the main umbrella OpticSim module.

With regards to the glass catalog management within OpticSim, my proposal here is:

  1. Download AGF files from various sources to a scratch-space (via Scratch.jl) rather than into the cloned repo itself
  2. Ship a copy of the sources.txt file with the module as a backup, but give preference to an entry in a user's julia session ENV dictionary if present or else have a step where the library downloads the most recent sources.txt from this github repo directly to the scratch space (so that we can update the sources file when companies move their URLs without needing to release a new version of the module)
  3. Implement glass ID, glass search within the main OpticSim module only, structured so that intellisense in VSCode continues to work (I have prototype code that seems to work for this).

In summary, OpticSim will do all the catalog management and OpticSimCore will contain everything needed to work with glasses themselves. Any user who only needs the latter can read in their own AGF files directly.

@bensetterholm
Copy link
Contributor Author

The GlassId is pervasive; many functions expect this type and access either the name or the id to get at the glass.

Initially It might be better to put aside splitting off the glass catalog code and instead get the entire system to build and run properly with the glasscat stuff as is. Then address the issue of splitting off glasscat. Otherwise, you might introduce subtle bugs into the code that could take a long time to track down.

My goal, as far as the main OpticSim module is concerned, is to keep all of the current behavior while at the same time reworking the internals so that they will be more extendable and easier to refactor in the future. But for now, I'm trying my utmost not to touch any unit test code.

@brianguenter
Copy link
Owner

Looked over your code. Much more organized than before. Your changes should be a big improvement.

One minor suggestion which you may feel free to ignore: it would make the code easier for others to read if types like AGF_MD were instead AGF_MechanicalData.

@bensetterholm
Copy link
Contributor Author

Great thanks, will do. Naming things is hard, haha!

I will do my best when moving/renaming files to keep the git history as organized as I can. Do you prefer a philosophy where each commit should be strictly non-breaking or one where git blame is maximally preserved? (I tend to prefer the latter in my own projects, but that often results in extra git gymnastics when splitting methods from one file into multiple files or vice versa).

Secondly, do you want me to keep the following header in each file?

# MIT license
# Copyright (c) Microsoft Corporation. All rights reserved.
# See LICENSE in the project root for full license information.

(Happy to do so; I don't really care either way).

@brianguenter
Copy link
Owner

It can be too difficult to make every commit non-breaking, especially when making such big changes. Go for the second option.

It is simplest to include the header for the license in every file. The Microsoft license is essentially the MIT license so it's very permissive.

@brianguenter
Copy link
Owner

I messed more with https://github.com/brianguenter/AGFFileReader.jl and made it work without GlassID stuff. Not sure yet what the performance implications will be. Feel free to swipe any of it that is useful.

@bensetterholm
Copy link
Contributor Author

The problem is that there are a significant number of uses of GlassID in Optical/OpticalInterface.jl. I have not looked too carefully here whether it can be easily removed or not. I am currently working on refactoring the rest of the GlassCat code while keeping the current API so that tests don't break. I am almost done with this work, but the scope was a little larger than I initially expected. I should have something to share early next week, sorry for the radio silence.

@brianguenter
Copy link
Owner

For a separate project which I haven't made public yet. I've been working with a colleague to implement Gaussian Beams and eventually add them to OpticSim. This would dramatically expand the types of systems that could be accurately simulated. It's still far in the future though, assuming I can get it to work.

I pulled out all the ray tracing code and modified it to work with the new AGF code that doesn't use GlassID. I think most of the code that references GlassID is in this part of the code.

I can cycle that back to you after you are done with your refactoring and you can see if that reduces the work of switching to an acceptable level. First I want to test the code to make sure using the glass object directly doesn't result in a huge performance hit.

@bensetterholm
Copy link
Contributor Author

I posted a draft of what I have been working on in 2748cea (please do not merge this commit). I still need to fix a couple things in tests, port over the plotting functions, add/fix documentation, and overall clean up the codebase. But this is a working prototype and gives a general outline of the refactor with the following highlights:

  • the Materials module will move into OpticSimCore
  • the Catalog module is a drop-in replacement for the previous GlassCat module. It should export [nearly] the same symbols as GlassCat (after I carefully fix everything)
  • AGF files are no longer downloaded directly into the project directory

Once I tidy up everything, I will open a new PR and rebase this PR. Since you are working on other changes to the library, and this rebase is pretty big, I just wanted to show you a sketch of my refactor earlier rather than later in case you have any comments.

@brianguenter
Copy link
Owner

The new organization looks good. Should be much easier for new users to navigate through. Look forward to the new PR.

@brianguenter
Copy link
Owner

@bensetterholm You mentioned that you have intellisense working for Glasses without requiring the project to modify itself during build:

  1. Implement glass ID, glass search within the main OpticSim module only, structured so that intellisense in VSCode continues to work (I have prototype code that seems to work for this).

Could you point me at the part of your code that does this? I've continued working on the Gaussian Beam implementation and am slowly moving over code from OpticSim and simplifying and optimizing it. It would help me to know how you got intellisense working so I can incorporate in my code.

@bensetterholm
Copy link
Contributor Author

Hi @brianguenter,

Yes. Instead of the build process creating .jl files that hard code the .agf file information and including them during module precompilation, I instead generate the julia code on-the-fly during the module __init__. Since Intellisense dynamically loads the module in the background, the symbols are available for tab completion, etc.

See src/Catalog/collection.jl around line 20 in 2748cea.

Sorry it's been a while since I pushed the needle here... life has gotten busy!

@brianguenter
Copy link
Owner

brianguenter commented Nov 27, 2024

@bensetterholm , totally understand about life being busy. I've 3x the open source code to write as time to write it. But, for now, I'm concentrating on getting OpticSim back in usable form. Key new features are Gaussian beam wave optics, polarization, and complex indices of refraction so reflection from metals can be less of a hack. These will make OpticSim much more powerful than it currently is.

Thanks for the pointer to __init__ I'll be sure to add this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants