Modeling Files and File Dependencies #70
Comments
@ormsbee I really like this idea! Though I do feel that if we could avoid supporting subfolders within the Folder Components, I think it would be simpler and better.
I consider this partly a matter of missing UI that was never fully built out. The library component's static files tab should have a "Use an existing file..." button that shows you a combined, searchable view of all the static assets attached to other components in the library, and allows you to copy the asset into the current component. Things are de-duped at the storage layer so it's fine to copy an asset into many components.
There could also be a "Files & Uploads" view that shows you all the assets in a course, and groups identical assets so you can easily bulk update any asset that's used in multiple components. Again, mostly a UX overlay without changing any functionality, but a huge improvement to workflow.
I believe that would be totally sufficient for libraries, though for courses it's clear that there's a need for "general" uploads for the course like PDFs that may not be tied to any component, and I think your proposal of being able to create Folder components at the top level for that is great. I guess my main concern is that if you don't "strongly encourage" authors to link components to where they're used, and people default to a big course-wide "Everything" folder, we won't see much improvement compared to the current situation. So I'd like to see some serious UX thinking on how to nudge users to be inherently organized.
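As a rough illustration of why copying an asset into many components is cheap when storage is de-duplicated, here is a minimal sketch of a content-addressed store. The class and method names (`DedupedAssetStore`, `attach`, `components_using`) are hypothetical and are not the actual Learning Core API; the sketch only assumes that identical bytes are stored once and referenced by hash.

```python
import hashlib


class DedupedAssetStore:
    """Hypothetical content-addressed store: identical bytes are kept once."""

    def __init__(self):
        self._blobs = {}  # sha256 hex digest -> file bytes
        self._links = {}  # (component_key, file_path) -> digest

    def attach(self, component_key, file_path, data):
        """Attach a file to a component, reusing the blob if it already exists."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        self._links[(component_key, file_path)] = digest
        return digest

    def blob_count(self):
        """Number of distinct file contents actually stored."""
        return len(self._blobs)

    def link_count(self):
        """Number of (component, path) references to stored contents."""
        return len(self._links)

    def components_using(self, digest):
        """Group identical assets: which components reference this blob?"""
        return sorted({key for (key, _path), d in self._links.items() if d == digest})


store = DedupedAssetStore()
image = b"fake image bytes"
d1 = store.attach("problem-1", "static/diagram.png", image)
d2 = store.attach("problem-2", "static/diagram.png", image)
```

The same lookup that backs `components_using` is what a "Files & Uploads" view could use to group identical assets for bulk updates, assuming the real storage layer exposes something comparable.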
Thanks for putting this up.
@ormsbee Not having seen the mockups - are we putting uploaded files and videos into the same view together? If that's not what we're talking about, feel free to ignore this bit. Someone had asked us about that one a while back and it sounded like a bad idea. Videos and other files need totally different information shown at a glance.
Is "user" here a course author or a learner?
That sounds just fine to me. How will all this look when the course is exported? Will components just include a "files referenced" attribute that points to stuff in
To be clear, there would be no nesting of Folder Components (Filesystem Components? Ugh). But we do need to be able to have subdirectories within a Folder Component since that's likely going to be a common use case when we have pre-packaged interactives with JS, images, and such.
I think there is a value from the UI point of view of having one authoritative, shared place where the thing in question "lives", if it's explicitly intended to be a shared resource.
Right. I think I'm leaning towards upload-to-component to be the default behavior exposed in the UI, with an option to make a reference to a Folder Component as a secondary/advanced option.
It's not currently in scope, and I think there's a lot more iteration that would be required, but the proposal was to be able to upload a video file and see it appear next to your other files and uploads, so that it's possible to organize them in folders and such. Except the screens around Videos implied a lot more metadata, like where it's used in the course. I had major concerns with such a view because:
That being said, I am supportive of searching and organizing components in various ways, and while I'm skittish about having a Component masquerade as a File, I'm fine with a group of files being a Component. Then they could be organized via filter/search/tagging in a common place.
Ah, good call out. I meant author here.
Import/Export Format
To try to keep backwards compatibility as much as possible, I was thinking something like this: The unorganized stuff in static files imports and exports exactly as it does today. Assets that are bound to a specific Component export into a directory under where that component's OLX goes. So for instance, if there is a problem that exports its OLX to Assets in these new Folder Components (really needs a different name) follow the conventions for other Components. So that means that the top level metadata for that component would go in something like References to files in these new Folder Components would be done via some sort of link prefix convention. So instead of
Migration Path
Goals for any sort of migration path:
(See rough plan at the end of the previous comment.) We have some big pieces that I'd like to eventually pull together into a common set of Learning Core data models, but I think we can tackle them individually for now:
Phase 1: Creating File Groupings
Some technical notes:
Phase 2: Unifying Components and File Groupings?
This would require a lot of UX consideration, but it's possible to do once Modulestore data has been ported over to Learning Core data models and its blocks are Components as well. Import/export would stay the same as Phase 1, but we'd make the data model associations between Components when one uses assets from another (e.g. several ProblemBlocks using the same image). At this point we could use filter/tagging as well. It's possible that we completely subsume the current files and uploads set in this step–no visible changes to authors or the import/export, but we would effectively make a "course run default Folder Component" and stick all the unorganized stuff in there, so we could get rid of old code.
I'm not going to speculate too much about what future phases might bring, but I think it would be consistent with where we're going to have a more unified Library/Course content filtering/browsing experience.
BTW folks, I fly out to Korea tomorrow afternoon and don't come back until August 22nd–so I likely won't be responsive to comments on this ticket over the next week. I just really wanted to get these thoughts out as soon as I could so that folks could think it over.
Side Note on Storage Growth
ComponentVersions are currently modeled in a way that stores a full set of mappings from each ComponentVersion to the RawContent that it uses, meaning that a series of small changes to a ComponentVersion with many files is very inefficient. Mitigation suggestions:
We can also punt this question for now and leave the existing files and uploads backend as-is, while creating new groupings of files in this new system.
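To put rough numbers on the storage growth concern in the side note above, here is a small simulation of the "full mapping per version" behavior. It is only a sketch under stated assumptions: versions are modeled as plain dicts from file path to a content identifier, the function name `simulate_full_mapping` is made up for illustration, and the file and edit counts are hypothetical.

```python
def simulate_full_mapping(num_files, num_edits):
    """Count mapping rows when every version stores a complete file mapping.

    Each edit changes exactly one file and produces a new version that
    re-records the mapping for every file, changed or not.
    Returns (mapping_rows, distinct_contents).
    """
    current = {f"file_{i}.txt": f"content-{i}-v1" for i in range(num_files)}
    versions = [dict(current)]
    for edit in range(num_edits):
        path = f"file_{edit % num_files}.txt"
        current[path] = f"content-{path}-edit{edit}"
        versions.append(dict(current))

    mapping_rows = sum(len(version) for version in versions)
    distinct_contents = len({c for version in versions for c in version.values()})
    return mapping_rows, distinct_contents


# Hypothetical component: 2,000 files followed by 49 single-file edits.
rows, contents = simulate_full_mapping(2000, 49)
```

With these made-up numbers, 50 versions produce 100,000 mapping rows even though only 2,049 distinct pieces of content exist, which is the inefficiency the mitigation ideas are meant to address.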
Enjoy!
In defense of the 16k+ file course, I have no actual defense; that's my mistake. I was un-tarring items with the assumption that it would overwrite the previous file structure. It did not, and sometimes my folks still had the old folder structure in place without realizing they were being merged. We now have 2GB courses that contain nearly all the files from every course we have. We're fixing it. In related news, I am eagerly anticipating the bulk delete functionality in the new Files page. 2k files is still a legit size for us, though.
All of that sounds reasonable to me. I may need to tell
More Storage Thoughts
Okay, so I've been mulling over the storage thing again. I'm writing this up on a train, so it's a little rushed/incomplete. There are broadly three paths I can think of:
1. Model ComponentVersion to RawContent mappings with range awareness
This would mean having a model that might look something like this:

    class ComponentVersionRangeRawContent(models.Model):
        first_version = models.ForeignKey(ComponentVersion, on_delete=models.RESTRICT)
        last_version = models.ForeignKey(ComponentVersion, on_delete=models.RESTRICT, null=True)
        uuid = immutable_uuid_field()
        key = key_field()
        # range_num is sort of like version_num for this one piece of content, but it exists here
        # primarily to guard against race conditions.
        range_num = models.PositiveBigIntegerField(null=False, validators=[MinValueValidator(1)])
        learner_downloadable = models.BooleanField(default=False)

This is much more efficient for storing a large set of content related to ComponentVersion, since we only make one new row when a piece of content changes (as opposed to the current implementation that makes a new row for every associated piece of RawContent for a ComponentVersion whenever there is a change in any one of them). Drawbacks:
2. Model each file as a Component
If we did it this way, then each file becomes a FileComponent, and we have some higher-level entity that keeps references to all the children, like how we planned to make the Unit->Component relationship. In order to guarantee that there are no conflicts in file names, the metadata for that naming would have to exist at this Unit-like layer. Drawbacks:
3. Make a FileSystemComponent-specific mapping of RawContent
Another alternative is to make this new collection of files a Component, but give that component type its own way of defining the relationship between ComponentVersions and RawContent. So it would still make ComponentVersions and still have ways of declaring dependencies on them. But instead of using Component's simple mapping mechanism, it would use its own models. The advantage of this approach is that we can opt to use this more complex and fragile system for the one Component where the efficiency problem will really be noticed, while keeping other Components simple. We can also define a common model for Component dependencies. Disadvantages:
I'm currently in favor of approach (3). It addresses the efficiency problem in a way that still fairly closely matches the semantics of how Components are supposed to work, but doesn't risk introducing the burden of an overly complex model on Components as a whole. We always intended to let Component types extend the data model with their own additions (though I hadn't really thought of extending it in this way). Also, it lets the data model for groups of files develop independently, in a way that can accommodate its very different set of use cases from most of the Component types we care about.
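Whichever component type ends up owning it, the range-aware mapping from path 1 can be sketched in plain Python to show how a single edit adds one row and how the file set for any version is recovered. This is only an illustration under assumptions: versions are plain integers rather than ComponentVersion foreign keys, `content_id` stands in for a RawContent reference, and the names `ContentRange` and `RangeMappedComponent` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentRange:
    key: str                             # file path within the component
    content_id: str                      # stand-in for a RawContent reference
    first_version: int                   # first version that uses this content
    last_version: Optional[int] = None   # None means "still current"


class RangeMappedComponent:
    """Hypothetical component that records content per version range."""

    def __init__(self, initial_files):
        self.version_num = 1
        self.ranges = [ContentRange(k, c, 1) for k, c in initial_files.items()]

    def update_file(self, key, content_id):
        """Create a new version in which exactly one file changes."""
        previous = self.version_num
        self.version_num += 1
        for r in self.ranges:
            if r.key == key and r.last_version is None:
                r.last_version = previous  # close the old range
        self.ranges.append(ContentRange(key, content_id, self.version_num))
        return self.version_num

    def files_at(self, version_num):
        """Reconstruct the path -> content mapping for a given version."""
        return {
            r.key: r.content_id
            for r in self.ranges
            if r.first_version <= version_num
            and (r.last_version is None or version_num <= r.last_version)
        }


comp = RangeMappedComponent({"index.html": "c1", "app.js": "c2"})
v2 = comp.update_file("app.js", "c3")
```

Here the second version costs one new row plus an update to the closed range, instead of a fresh row per file. The lookup in `files_at` hints at the extra query complexity this buys, which is part of why confining it to one component type may be attractive.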
For option 3, are you imagining a single FileSystemComponent would behave like a folder or that folders would be a concept of that component and you would associate a single one of these globally with a learning context? How does option 3 handle inter-file dependencies? Is this also something that we would implement inside the new component type? You mentioned in Option 2 drawbacks, the HTML file that references a JS file and I'm trying to understand how you imagine that working in option 3.
Folders would be a concept within it. So instead of having a single In this scenario, the course-wide "Files and Uploads" is one instance of this FileSystemComponent that's there for backwards compatibility.
Option 3 wouldn't really model inter-file dependencies at all. You would be able to say, "This ProblemBlock uses this FileSystemComponent that has a molecular editor and assorted assets", but there would be no mapping of "this HTML file uses these JS files". My criticism of Option 2 (where individual files are Components) was that it would make the dependency mapping misleading. It would show that ProblemBlock uses this particular File, but not all the transitive dependencies of that file–because in either option, I don't think we want to try to parse HTML/JS/Python/WhateverRandomThing to figure those out. It's simpler to just treat the whole thing as a single component for dependency's sake, and leave it to people to structure things in a sane way.
The point of the dependency tracking would largely be for update purposes, so it makes sense to treat the whole set of files as one component–i.e. if my ProblemBlock uses v. 12 of this library, and there's now a version 13 published, that's the level of granularity I care about as an author of the problem.
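A minimal sketch of that component-level update check might look like the following. The dataclasses and the `outdated_dependencies` helper are hypothetical, and version numbers are assumed to be simple integers; nothing here inspects individual files inside the component.

```python
from dataclasses import dataclass, field


@dataclass
class FileSystemComponent:
    key: str
    published_version: int


@dataclass
class ProblemBlock:
    key: str
    # component key -> version number this block was authored against
    dependencies: dict = field(default_factory=dict)


def outdated_dependencies(block, components):
    """Return {component_key: (used_version, published_version)} for stale deps."""
    stale = {}
    for comp_key, used in block.dependencies.items():
        published = components[comp_key].published_version
        if published > used:
            stale[comp_key] = (used, published)
    return stale


components = {"molecular-editor": FileSystemComponent("molecular-editor", 13)}
problem = ProblemBlock("problem-1", {"molecular-editor": 12})
stale = outdated_dependencies(problem, components)
```

The author of `problem-1` learns only that `molecular-editor` moved from 12 to 13, not which HTML or JS file changed, which matches the granularity described above.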
Gotcha, you don't care that they changed the JS file or the HTML file, if they get versioned together and the component gets bumped if any relevant file gets updated. That makes sense to me. So files and uploads would map to one
Random thoughts I had as I'm mucking with the static file code:
How are things stored today?
Courses in Studio Storage
Libraries v2 Storage
Current shortcomings
Other considerations
Proposal: Folders as a type of Component with explicit dependencies
So in this case, a Folder is its own namespace–you could use it for something like "all the PDFs in this course". It can have subdirectories in it, but these aren't Components.
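One way to picture "subdirectories that aren't Components" is a Folder that stores only files keyed by relative path, with directories derived from those paths on demand. This is a sketch under assumptions, not the proposed data model: the `FolderComponent` class, its methods, and the example keys are all hypothetical.

```python
from pathlib import PurePosixPath


class FolderComponent:
    """Hypothetical Folder: one namespace of files keyed by relative path."""

    def __init__(self, key):
        self.key = key
        self.files = {}  # relative POSIX path -> content identifier

    def add_file(self, path, content_id):
        """Add a file, rejecting paths that would escape this namespace."""
        p = PurePosixPath(path)
        if p.is_absolute() or ".." in p.parts:
            raise ValueError(f"path must stay inside the folder: {path}")
        self.files[str(p)] = content_id

    def subdirectories(self):
        """Directories are implied by file paths; they are not stored entities."""
        dirs = set()
        for path in self.files:
            for parent in PurePosixPath(path).parents:
                if str(parent) != ".":
                    dirs.add(str(parent))
        return sorted(dirs)


folder = FolderComponent("course-interactives")
folder.add_file("syllabus.pdf", "c1")
folder.add_file("js/app.js", "c2")
folder.add_file("js/lib/editor.js", "c3")
```

Because directories exist only as path prefixes, there is nothing to nest or version separately; the Folder itself remains the single unit other Components would depend on.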
Implications
Migration Path
We'd put everything in a current course into one top level Folder Component. I'm not sure how we'd differentiate this in the UX though–we definitely need a better name than Folder Component to differentiate between creating different ones of these vs. nested sub-directories.
We might be able to do this in a way that works in our favor by having them explicitly move the stuff that they want out of the legacy space, and they can leave/ignore the stuff they don't care about.