Reclaim page tables #32

mark-i-m · 2018-06-14T02:10:57Z

There are some concerns to be addressed. copy/pasting from previous thread:

I'm not sure about the design of the reclaim_page_tables function. It seems a bit dangerous (it's possible to reclaim page tables that are still in use)

Yes, it is a bit dangerous. In particular, it is very OS-specific to know if you can reclaim a page table. The problem is that it is impossible to know just from the page tables if a page is shared without some other source of information. If you have ideas for how to get around this, please let me know.

and surprising (e.g. if a 1GiB range is passed the bounds are rounded so that not all page tables in the range might get freed).

Hmm... so would you propose not having reclaim_page_tables take an S: PageSize parameter? So it would always take two Page<Size4KiB> and free everything in between?

Also, there is one thing I want to do first: make the range be specified as RangeBound so that one can do reclaim_page_tables(page..other_page) and reclaim_page_tables(page..=other_page).

mark-i-m · 2018-06-14T02:11:32Z

Oh, sorry, and one other thing is that this should not merge before #30.

mark-i-m · 2018-06-16T02:04:00Z

@phil-opp Any thoughts on the above?

phil-opp · 2018-06-17T13:45:20Z

Hmm... so would you propose not having reclaim_page_tables take an S: PageSize parameter? So it would always take two Page and free everything in between?

Yes, I think that would be less confusing. Every range of 2MiB or 1GiB pages can be converted to a 4KiB range. So a 1GiB range should not lead to different behavior than when it's converted to a 4KiB range.
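To make the conversion concrete, here is a minimal sketch. It assumes a toy model in which a page is just its page number (a plain `u64`), not the crate's `Page<S>` type; `as_4kib_range` and the two constants are hypothetical names used only for illustration.

```rust
use std::ops::Range;

/// Number of 4KiB pages covered by one 2MiB page.
pub const PAGES_4KIB_PER_2MIB: u64 = 512;
/// Number of 4KiB pages covered by one 1GiB page.
pub const PAGES_4KIB_PER_1GIB: u64 = 512 * 512;

/// Converts a half-open range of large-page numbers into the equivalent
/// half-open range of 4KiB page numbers. Returns `None` on overflow.
pub fn as_4kib_range(range: Range<u64>, pages_per_large_page: u64) -> Option<Range<u64>> {
    let start = range.start.checked_mul(pages_per_large_page)?;
    let end = range.end.checked_mul(pages_per_large_page)?;
    Some(start..end)
}
```

Because both endpoints scale by the same factor, the 4KiB range covers exactly the same addresses as the original range, which is why the two forms should behave identically.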

Also, there is one thing I want to do first: make the range be specified as RangeBound so that one can do reclaim_page_tables(page..other_page) and reclaim_page_tables(page..=other_page).

Range syntax for page ranges is something that I wanted for a long time. The last time that I tried it, there were some traits required that I didn't want to implement. Maybe this is no longer required with the latest updates.

Yes, it is a bit dangerous. In particular, it is very OS-specific to know if you can reclaim a page table. The problem is that it is impossible to know just from the page tables if a page is shared without some other source of information. If you have ideas for how to get around this, please let me know.

Shared normal pages are not the problem since we only reclaim page table frames. Sharing page table frames is possible too, but less common. Either way, reclaiming a non-empty page table is typically not what you want, because it means that you also do an unmap without reclaiming the mapped frames. Instead you normally want to unmap all pages before reclaiming page tables. If a frame is shared between two pages, you do two unmap calls but only return the frame once to an allocator.

So the reclaim_page_tables function should not change any mappings. It should only reclaim empty page table frames that can be replaced by an empty entry in the parent table.
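A rough sketch of that rule, assuming a toy model where a page table is a plain array of 512 raw `u64` entries (not the crate's `PageTable` type). `Table`, `is_empty`, and `reclaim_if_empty` are hypothetical names.

```rust
/// Toy model: a page table is 512 raw 64-bit entries.
pub type Table = [u64; 512];

/// A table counts as empty only if every entry is zero.
pub fn is_empty(table: &Table) -> bool {
    table.iter().all(|&entry| entry == 0)
}

/// If `child` is empty, clear the parent entry that pointed to it and return
/// the old entry so the caller can hand the frame back to its allocator.
/// No mapping changes, because an empty table maps nothing.
pub fn reclaim_if_empty(parent: &mut Table, index: usize, child: &Table) -> Option<u64> {
    if parent[index] != 0 && is_empty(child) {
        let old = parent[index];
        parent[index] = 0;
        Some(old)
    } else {
        None
    }
}
```

A non-empty child is left untouched, so the function can never unmap anything by accident.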

mark-i-m · 2018-06-17T22:21:31Z

Yes, I think that would be less confusing. Every range of 2MiB or 1GiB pages can be converted to a 4KiB range. So a 1GiB range should not lead to different behavior than when it's converted to a 4KiB range.

Done.

Range syntax for page ranges is something that I wanted for a long time. The last time that I tried it, there were some traits required that I didn't want to implement. Maybe this is no longer required with the latest updates.

I did a bit of looking into this. There are two parts, IIUC:

Accepting range arguments. This is easy. Just have your method/function accept a generic argument of type R: RangeBounds<T>, where T is the type we are ranging over (e.g. Page<S>).
Using .. and ..= syntax. These operators currently always return one of the built-in Rust range types (see the "Implementors" section of https://doc.rust-lang.org/nightly/std/ops/trait.RangeBounds.html for a list of these types).

This is all pretty straightforward. It is pretty easy for us to allow the use of reclaim_page_tables(start_page..end_page) or reclaim_page_tables(my_page_range), etc.
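For illustration, accepting range arguments could look roughly like this. The `Page` struct here is a toy stand-in identified by a page number, and `describe_range` is a hypothetical function; only `RangeBounds` and `Bound` are real standard-library items (`std::ops` re-exports the `core::ops` versions).

```rust
use std::ops::{Bound, RangeBounds};

/// Toy stand-in for a 4KiB page, identified by its page number.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Page(pub u64);

/// Accepts `a..b`, `a..=b`, `..b`, and any other type that implements
/// `RangeBounds<Page>`, and returns the two bounds for inspection.
pub fn describe_range<R: RangeBounds<Page>>(range: R) -> (Bound<Page>, Bound<Page>) {
    (range.start_bound().cloned(), range.end_bound().cloned())
}
```

A caller can then write `describe_range(Page(1)..Page(4))` or `describe_range(Page(1)..=Page(4))`, and the function sees `Excluded` or `Included` as the end bound respectively.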

However, I found that there is one confusing point: core::ops::Range<Page<S>> and PageRange<S> are not the same type. There are a few options:

Get rid of the PageRange and PageRangeInclusive types and use the built-in core::ops::Range* types. This would mean losing PageRange::as_4kib_page_range, though maybe it could be implemented as a standalone function?
Do nothing and risk some confusion about range types.
Do nothing and discourage the use of .. and ..=.
impl<S> From<core::ops::Range*<Page<S>>> for PageRange* for the appropriate range types. This would require writing a bunch of .into()s everywhere...

My preference is for the first one.
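For comparison, the fourth option might look like the sketch below. `Page` and `PageRange` are simplified stand-ins for the crate's types, and `count_pages` is a hypothetical consumer.

```rust
use std::ops::Range;

/// Toy stand-in for a page, identified by its page number.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Page(pub u64);

/// Toy stand-in for the crate's own half-open page range type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PageRange {
    pub start: Page,
    pub end: Page,
}

impl From<Range<Page>> for PageRange {
    fn from(range: Range<Page>) -> Self {
        PageRange { start: range.start, end: range.end }
    }
}

/// A function that only accepts the custom range type.
pub fn count_pages(range: PageRange) -> u64 {
    range.end.0.saturating_sub(range.start.0)
}
```

Every call site that uses `..` syntax then needs an explicit conversion, e.g. `count_pages((Page(2)..Page(5)).into())`, which is the `.into()` noise mentioned above.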

Sharing page table frames is possible too, but less common.

This is true. However, sharing page tables does happen, especially in hypervisors. So I would like to be flexible enough to support this use case.

So the reclaim_page_tables function should not change any mappings. It should only reclaim empty page table frames that can be replaced by an empty entry in the parent table.

Agreed. I will add some sanity checks that at each level, all PTEs should be not present.

mark-i-m · 2018-06-17T22:51:38Z

Design question:

So the reclaim_page_tables function should not change any mappings. It should only reclaim empty page table frames that can be replaced by an empty entry in the parent table.

I think we could do this two ways:

1. We could make reclaim_page_tables reclaim all empty page tables (i.e. all zeros) and leave others in place.
2. We could make reclaim_page_tables try to reclaim every page and panic if one of the page tables is non-zero. We only attempt to reclaim page tables that lie entirely within the given range.

My preference is for (2).

phil-opp · 2018-06-18T09:01:48Z

Get rid of the PageRange and PageRangeInclusive types and use the built-in core::ops::Range* types. This would mean losing PageRange::as_4kib_page_range, though maybe it could be implemented as a standalone function?

Sounds good to me!

I think we could do this two ways:

1. We could make reclaim_page_tables reclaim all empty page tables (i.e. all zeros) and leave others in place.
2. We could make reclaim_page_tables try to reclaim every page and panic if one of the page tables is non-zero. We only attempt to reclaim page tables that lie entirely within the given range.

Both things would be useful in my opinion. (1) could be used in some kind of garbage collect thread that periodically cleans up no longer used page tables. (2) can be used when it's 100% certain that only empty page tables should exist in a given range.

mark-i-m · 2018-06-19T01:41:02Z

@phil-opp

Get rid of the PageRange and PageRangeInclusive types and use the built-in core::ops::Range* types. This would mean losing PageRange::as_4kib_page_range, though maybe it could be implemented as a standalone function?

Sounds good to me!

Done.

Both things would be useful in my opinion. (1) could be used in some kind of garbage collect thread that periodically cleans up no longer used page tables. (2) can be used when it's 100% certain that only empty page tables should exist in a given range.

I added a mode argument to reclaim_page_tables that lets you choose the behavior. Let me know what you think.
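As a rough illustration of the idea (not the code in this PR), a mode argument could be modelled like this. The enum name `ReclaimMode` and the function `reclaim` are assumptions; only the mode names `Panic` and `Skip` come from the commit message. Tables are modelled as vectors of raw entries, with `None` marking a slot that has already been reclaimed.

```rust
/// Hypothetical mode enum with the two behaviors discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReclaimMode {
    /// Panic if a table in the range is not empty.
    Panic,
    /// Leave non-empty tables in place and carry on.
    Skip,
}

/// Reclaims every empty table in `tables` and returns how many were reclaimed.
pub fn reclaim(tables: &mut [Option<Vec<u64>>], mode: ReclaimMode) -> usize {
    let mut reclaimed = 0;
    for slot in tables.iter_mut() {
        let is_empty = match &*slot {
            Some(entries) => entries.iter().all(|&entry| entry == 0),
            // Nothing to do for slots that hold no table.
            None => continue,
        };
        if is_empty {
            *slot = None;
            reclaimed += 1;
        } else if mode == ReclaimMode::Panic {
            panic!("non-empty page table in reclaim range");
        }
    }
    reclaimed
}
```

`Skip` fits the periodic clean-up use case, while `Panic` turns a violated "everything here is already unmapped" assumption into a loud failure.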

phil-opp

Thanks for the updates, @mark-i-m, and sorry for the delay!

I like the mode parameter that allows using both variants. Adding support for the range syntax also looks good to me, even though I don't like the Step trait in its current form.

I'm unsure about the implementation of reclaim_page_tables and left some line comments. It is somewhat confusing at the moment and I think that I found some bugs too.

Maybe you could use ranges of 2MiB and 1GiB pages for iterating over P2/P3 tables. This would be more clear and also avoid setting the indices manually to 0.

phil-opp · 2018-06-27T11:20:11Z

src/structures/paging/recursive.rs

+    ///
+    /// # Panics
+    ///
+    /// - If one of the range endpoints is `Unbounded`.


What's the problem with unbounded? I would expect that it just walks over the whole page table.

phil-opp · 2018-06-27T11:21:41Z

src/structures/paging/recursive.rs

+            }
+        };
+        let end = match range.end_bound() {
+            Bound::Included(endpoint) => *endpoint + 1,


Doesn't this lead to an overflow if the endpoint is the address space end address?
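One way to sidestep the overflow, sketched under the assumption that pages are plain `u64` page numbers: resolve every range into an inclusive `(first, last)` pair, so `end + 1` is never computed. `resolve` and `LAST_PAGE` are hypothetical names, and `LAST_PAGE` is set to `u64::MAX` purely so the overflow case is easy to hit. The same function also gives `Unbounded` a natural meaning instead of panicking.

```rust
use std::ops::{Bound, RangeBounds};

/// Last page of the (toy) address space.
pub const LAST_PAGE: u64 = u64::MAX;

/// Resolves any range into an inclusive `(first, last)` pair, or `None` if the
/// range is empty. `Unbounded` means "from the first page" or "up to the last
/// page", so `..` covers the whole address space.
pub fn resolve<R: RangeBounds<u64>>(range: R) -> Option<(u64, u64)> {
    let first = match range.start_bound() {
        Bound::Included(&start) => start,
        Bound::Excluded(&start) => start.checked_add(1)?,
        Bound::Unbounded => 0,
    };
    let last = match range.end_bound() {
        Bound::Included(&end) => end,
        // An exclusive end of 0 denotes an empty range.
        Bound::Excluded(&end) => end.checked_sub(1)?,
        Bound::Unbounded => LAST_PAGE,
    };
    if first <= last {
        Some((first, last))
    } else {
        None
    }
}
```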

phil-opp · 2018-06-27T11:32:02Z

src/structures/paging/recursive.rs

+                start.p4_index(),
+                start.p3_index(),
+                start.p2_index(),
+                start.p1_index() + u9::new(1),


I'm not sure what you're trying to do here. How does adding 1 to the P1 index change anything?

Should be form_page_table_idx(p4, p3, p2 + 1, 0).
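To see why bumping the P1 index does not move to another table, here is a small sketch of the index arithmetic on raw addresses. It assumes standard x86_64 4-level paging (9 index bits per level, 12 offset bits) and ignores sign extension; all three function names are hypothetical and are not the helpers used in the PR.

```rust
/// Splits a virtual address into its four 9-bit table indices (P4, P3, P2, P1).
pub fn indices(addr: u64) -> (u16, u16, u16, u16) {
    let idx = |shift: u32| ((addr >> shift) & 0x1ff) as u16;
    (idx(39), idx(30), idx(21), idx(12))
}

/// Rebuilds an address from the four indices.
pub fn from_indices(p4: u16, p3: u16, p2: u16, p1: u16) -> u64 {
    (u64::from(p4) << 39) | (u64::from(p3) << 30) | (u64::from(p2) << 21) | (u64::from(p1) << 12)
}

/// Address of the first page mapped by the next P1 table: round down to a
/// 2MiB boundary, then add 2MiB. Returns `None` on overflow.
pub fn next_p1_table_start(addr: u64) -> Option<u64> {
    const SIZE_2MIB: u64 = 1 << 21;
    (addr & !(SIZE_2MIB - 1)).checked_add(SIZE_2MIB)
}
```

Adding 1 to the P1 index stays inside the same P1 table, whereas `next_p1_table_start` yields indices `(p4, p3, p2 + 1, 0)`. Doing the addition on the address also carries into P3 automatically when `p2` is already 511.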

phil-opp · 2018-06-27T11:32:28Z

src/structures/paging/recursive.rs

+            start.p4_index(),
+            start.p3_index(),
+            start.p2_index(),
+            u9::new(0),


Why is the P1 index always zero? Isn't the range below always empty this way? Or did you mean to use end instead of start for the other indices?

Yes, it is meant to be end.

phil-opp · 2018-06-27T11:36:20Z

src/structures/paging/recursive.rs

+                Err(FrameError::FrameNotPresent) => {
+                    continue;
+                }
+                Err(FrameError::HugeFrame) => unreachable!(),


Why is this unreachable? 1GiB pages are totally possible.

Not sure what I was thinking here...

phil-opp · 2018-06-27T11:36:30Z

src/structures/paging/recursive.rs

+                Err(FrameError::FrameNotPresent) => {
+                    continue;
+                }
+                Err(FrameError::HugeFrame) => unreachable!(),


Why is this unreachable? 2MiB pages are totally possible.

...or here...

phil-opp · 2018-06-27T11:37:06Z

src/structures/paging/recursive.rs

+            start.p3_index(),
+            u9::new(0),
+            u9::new(0),
+        );


Same questions as above with p1_start and p1_end.

phil-opp · 2018-06-27T11:37:56Z

src/structures/paging/recursive.rs

+        );
+
+        // Free all the page tables!
+        for page in p1_start..p1_end {


Aren't you iterating over the same page table multiple times (same for the other page table levels)?

Yes, but after you free the table the first time, it should short-circuit in subsequent iterations.

Hmm, I think it would be less confusing to use 2MiB pages instead, because then each P1 table would occur only once.
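A sketch of what stepping in 2MiB units could look like, assuming pages are plain 4KiB page numbers. `p1_tables_within` is a hypothetical helper that lists the P1 tables lying entirely inside a half-open range, with each table appearing exactly once.

```rust
/// Number of 4KiB pages mapped by one P1 table (512 entries, i.e. 2MiB).
pub const PAGES_PER_P1: u64 = 512;

/// Returns the index of every P1 table that lies entirely within the
/// half-open 4KiB page range `start..end`. Each table is listed once.
pub fn p1_tables_within(start: u64, end: u64) -> Vec<u64> {
    // Round the start up and the end down to a P1 table boundary.
    let first = start / PAGES_PER_P1 + u64::from(start % PAGES_PER_P1 != 0);
    let end_table = end / PAGES_PER_P1;
    (first..end_table).collect()
}
```

Compared with iterating over every 4KiB page, this visits 512 times fewer elements and does not rely on short-circuiting for tables that were already freed.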

phil-opp · 2018-06-27T11:42:36Z

src/structures/paging/recursive.rs

+    ///
+    /// - If one of the range endpoints is `Unbounded`.
+    /// - If the `mode` specifies that we should.
+    unsafe fn reclaim_page_tables<D, R>(


I'm not sure whether this should be a method of the Mapper trait. It does not really have something to do with mapping/unmapping and it is completely independent of the generic PageSize parameter. Simply adding it as a method to RecursivePageTable might be a better fit and would also avoid the identical implementations for all page sizes.

phil-opp · 2018-06-27T11:49:38Z

src/structures/paging/mod.rs

+/// just changes the page size.
+pub fn as_4kib_page_range(Range { start, end }: Range<Page<Size2MiB>>) -> Range<Page<Size4KiB>> {
+    Range {
+        start: Page::from_start_address(start.start_address()).unwrap(),


Why not use containing_address here?
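The difference between the two constructors, shown on a toy `Page` type (this is a simplified model, not the crate's implementation, and it returns `Option` only to keep the sketch short):

```rust
/// Size of a 4KiB page in bytes.
pub const PAGE_SIZE: u64 = 4096;

/// Toy 4KiB page, stored as its start address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Page {
    start: u64,
}

impl Page {
    /// Succeeds only if `addr` is exactly page-aligned.
    pub fn from_start_address(addr: u64) -> Option<Page> {
        if addr % PAGE_SIZE == 0 {
            Some(Page { start: addr })
        } else {
            None
        }
    }

    /// Never fails: rounds `addr` down to the start of its page.
    pub fn containing_address(addr: u64) -> Page {
        Page { start: addr - addr % PAGE_SIZE }
    }

    /// Start address of this page.
    pub fn start_address(self) -> u64 {
        self.start
    }
}
```

The start address of a 2MiB page is always 4KiB-aligned, so both constructors return the same page there, and the rounding variant needs no `unwrap()`.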

mark-i-m · 2018-06-30T16:22:30Z

Thanks @phil-opp :)

I will take a look. This might take me a while. It's kind of hard to get right without tests... If I have time, I might try your bootimage test :)

phil-opp · 2018-07-01T14:44:59Z

@mark-i-m Thanks for the continuous updates!

This is very tricky code, so tests are a good idea. I already added basic support for bootimage test in the testing subdirectory. It basically works by creating a small testing executable similar to test-basic that checks the desired properties and does serial_println!("ok") if successful. Tests for (un)mapping pages and cleaning up page tables might be complex to write, but would be very very useful.

mark-i-m · 2018-07-03T03:11:13Z

Thanks :)

This will take me a while to work through. I think I have an idea for a clean way to iterate over page tables, but I'm still figuring out the details. I'll push when I at least have something that compiles :P

mark-i-m · 2018-07-05T02:04:38Z

@phil-opp Could you take a look at my latest commit? It is rather large and messy and very WIP, but I think it is a cleaner approach than what I previously had. Any thoughts?

phil-opp · 2018-07-05T10:53:24Z

I'm a bit busy this week, but I'll try to take a look after the weekend.

phil-opp

I really like the visitor abstraction! It makes the code much clearer.

I currently see two problems:

The visits occur top-down, i.e. first the P2 table, then its P1 tables. For the delete visitor we would want a bottom-up order, because freeing P1 tables could make the parent P2 table empty.
If the delete visitor wants to overwrite the visit_p* methods, it needs to reimplement the visitor logic, which leads to code duplication.

I think both of them could be solved by introducing additional before_visit_* and after_visit_* methods (the names are only placeholders). Then the visit_p3 method would do the following:

Call before_visit_p3, by default a no-op.
Call visit_p2 for all P3 entries.
Call after_visit_p3, by default a no-op.

The delete visitor could then just overwrite the after_visit_* methods. Thus we would get the correct order and we wouldn't need to duplicate the visitor code.
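A minimal sketch of the hook idea, reduced to two levels (a P2 table and its P1 tables) to keep it short. The tree types, the trait, and `DeleteOrder` are hypothetical; the hook names follow the placeholders above.

```rust
/// Toy stand-in for a P1 table.
pub struct P1 {
    pub id: u32,
}

/// Toy stand-in for a P2 table and the P1 tables it points to.
pub struct P2 {
    pub children: Vec<P1>,
}

pub trait Visitor {
    /// Hooks are no-ops by default.
    fn before_visit_p2(&mut self, _table: &P2) {}
    fn after_visit_p2(&mut self, _table: &P2) {}
    fn visit_p1(&mut self, _table: &P1) {}

    /// The traversal logic lives here once; implementors override only hooks.
    fn visit_p2(&mut self, table: &P2) {
        self.before_visit_p2(table);
        for child in &table.children {
            self.visit_p1(child);
        }
        self.after_visit_p2(table);
    }
}

/// Records the order in which a delete visitor would free tables.
#[derive(Default)]
pub struct DeleteOrder {
    pub log: Vec<String>,
}

impl Visitor for DeleteOrder {
    fn visit_p1(&mut self, table: &P1) {
        self.log.push(format!("free P1 #{}", table.id));
    }

    // Runs after all children, so the parent is handled bottom-up.
    fn after_visit_p2(&mut self, _table: &P2) {
        self.log.push("free P2".to_string());
    }
}
```

Since `DeleteOrder` only overrides hooks, the children are always processed before the parent, and the traversal code is not duplicated.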

phil-opp · 2018-07-10T11:45:02Z

src/structures/paging/recursive.rs

+                            .as_mut_ptr()
+                    };
+                    self.visit_1gib_page(entry, frame);
+                    return;


Shouldn't this be a continue instead of a return? I.e. shouldn't we visit the following P3 entries too?

phil-opp · 2018-07-10T11:46:07Z

src/structures/paging/recursive.rs

+                            .as_mut_ptr()
+                    };
+                    self.visit_2mib_page(entry, frame);
+                    return;


Same as above (continue instead of return?)
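The effect of `continue` versus `return` can be seen on a toy entry list. `Entry` and `visit_all` are hypothetical and only model the control flow, not the real page table walk.

```rust
/// Toy page table entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Entry {
    NotPresent,
    HugePage(u32),
    Table(u32),
}

/// Visits every present entry and returns the ids in visiting order.
pub fn visit_all(entries: &[Entry]) -> Vec<u32> {
    let mut visited = Vec::new();
    for entry in entries {
        match *entry {
            Entry::NotPresent => continue,
            Entry::HugePage(id) => {
                visited.push(id);
                // `continue`, not `return`: the entries after a huge page
                // must still be visited.
                continue;
            }
            Entry::Table(id) => visited.push(id),
        }
    }
    visited
}
```

With `return` in the huge-page arm, the walk would stop at the first huge page and silently skip the rest of the table.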

phil-opp · 2018-07-10T11:47:40Z

src/structures/paging/recursive.rs

+                // No entry... skip.
+                Err(FrameError::FrameNotPresent) => continue,
+                // Cannot have 512GiB huge pages (yet)!
+                Err(FrameError::HugeFrame) => unreachable!(),


Can we return a Result from this function instead of possibly panicking?

phil-opp · 2018-07-10T11:48:16Z

src/structures/paging/recursive.rs

+                // No entry... skip.
+                Err(FrameError::FrameNotPresent) => continue,
+                // We are already at 4KiB. No huge pages here.
+                Err(FrameError::HugeFrame) => unreachable!(),


Can we turn this into an error type instead of a possible panic?
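One possible shape for this, as a sketch: the walk returns a `Result` and reports the offending index instead of hitting `unreachable!()`. `FrameError` mirrors the two variants visible in the diff; `WalkError` and `walk` are hypothetical.

```rust
/// Mirrors the two error variants that appear in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameError {
    FrameNotPresent,
    HugeFrame,
}

/// Hypothetical error type returned by the walk.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WalkError {
    /// A huge frame was found at a level where none was expected.
    UnexpectedHugeFrame { index: usize },
}

/// Collects the frames of all present entries, skipping missing ones and
/// returning an error (rather than panicking) on an unexpected huge frame.
pub fn walk(entries: &[Result<u64, FrameError>]) -> Result<Vec<u64>, WalkError> {
    let mut frames = Vec::new();
    for (index, entry) in entries.iter().enumerate() {
        match *entry {
            Ok(frame) => frames.push(frame),
            Err(FrameError::FrameNotPresent) => continue,
            Err(FrameError::HugeFrame) => {
                return Err(WalkError::UnexpectedHugeFrame { index });
            }
        }
    }
    Ok(frames)
}
```

The caller can then decide whether a corrupted or unexpected entry is fatal, which a panic inside the library would not allow.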

phil-opp · 2018-07-10T11:49:51Z

src/structures/paging/recursive.rs

+}
+
+/// Page table visitor that visits the page tables mapping a certain range of memory and
+/// deallocates them.


Maybe we should mention that this only frees page tables, not pages.

mark-i-m · 2018-07-22T00:58:06Z

Update: I am working on this, but very slooowly... sorry for the delay. If you want to close the PR to keep the issue tracker clean, feel free. I will reopen when this is in a state reasonable for another review.

phil-opp · 2018-07-22T16:01:30Z

@mark-i-m No worries! There's no hurry

mark-i-m · 2018-09-05T14:59:04Z

Update: I am still working on this. I wanted to get page tables working in my own project so I could play with this and test it. That's done now, so I can continue with this... However, sigh time...

mark-i-m · 2020-01-24T20:24:41Z

I'm going to go ahead and close this because it has been a long time, my own project has evolved in a different way, and I suspect that some sort of Visitor pattern might be better anyway...

mark-i-m mentioned this pull request Jun 14, 2018

Create a separate PhysMemAllocator trait #30

Merged

mark-i-m force-pushed the reclaim_page_tables branch from c8c80ff to 13de794 Compare June 16, 2018 02:18

mark-i-m added 2 commits June 17, 2018 16:25

Add reclaim_page_tables API

b97b9e2

Have a single reclaim_page_tables

7179da0

mark-i-m force-pushed the reclaim_page_tables branch from 3c208c0 to 7179da0 Compare June 17, 2018 21:26

range bounds for reclaim_page_tables

89bc62b

sanity check reclaim_page_tables ptes

23ebd8a

mark-i-m added 2 commits June 18, 2018 20:08

page ranges

c1e8352

allow Panic and Skip modes for reclaim_page_tables

baedd39

fix test

3824003

mark-i-m changed the title ~~[WIP] Reclaim page tables~~ Reclaim page tables Jun 19, 2018

phil-opp reviewed Jun 27, 2018

start on some improvements

4bdae70

[VERY WIP] Create a Visitor for PageTables

c3134ef

phil-opp reviewed Jul 10, 2018

working on improvements to visitor

b5b1671

mark-i-m closed this Jan 24, 2020

mark-i-m mentioned this pull request Jan 24, 2020

Page Table Visitors #121

Open
