Reclaim page tables #32
Oh, sorry, and one other thing: this should not merge before #30.
@phil-opp Any thoughts on the above?
c8c80ff to 13de794
Yes, I think that would be less confusing. Every range of 2MiB or 1GiB pages can be converted to a 4KiB range. So a 1GiB range should not lead to different behavior than when it's converted to a 4KiB range.
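To illustrate the conversion mentioned above, here is a minimal sketch that models pages as plain page numbers instead of the crate's `Page` type. The function name `as_4kib_page_numbers` and the constants are assumptions made for this example, not part of the PR.

```rust
use std::ops::Range;

const SIZE_4KIB: u64 = 4096;
const SIZE_2MIB: u64 = 512 * SIZE_4KIB;
const SIZE_1GIB: u64 = 512 * SIZE_2MIB;

/// Hypothetical helper: converts a half-open range of page numbers, counted
/// in units of `page_size`, into the equivalent range of 4KiB page numbers.
fn as_4kib_page_numbers(range: Range<u64>, page_size: u64) -> Range<u64> {
    // Every supported page size is a multiple of 4KiB.
    assert!(page_size % SIZE_4KIB == 0);
    let factor = page_size / SIZE_4KIB;
    (range.start * factor)..(range.end * factor)
}
```

Under this model, a range of huge pages and its converted 4KiB range cover exactly the same addresses, so both can be handled by the same code path.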
Range syntax for page ranges is something that I wanted for a long time. The last time that I tried it, there were some traits required that I didn't want to implement. Maybe this is no longer required with the latest updates.
Shared normal pages are not the problem since we only reclaim page table frames. Sharing page table frames is possible too, but less common. Either way, reclaiming a non-empty page table is typically not what you want, because it means that you also do an unmap without reclaiming the mapped frames. Instead you normally want to unmap all pages before reclaiming page tables. If a page is shared with another page, you do two unmap calls but only return the frame once to an allocator. So the
3c208c0 to 7179da0
Done.
I did a bit of looking into this. There are two parts, IIUC:
This is all pretty straightforward. It is pretty easy for us to allow the use of However, I found that there is one confusing point:
My preference is for the first one.
This is true. However, sharing page tables does happen, especially in hypervisors. So I would like to be flexible enough to support this use case.
Agreed. I will add sanity checks that, at each level, no PTE is present.
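A sanity check of this kind could look roughly like the sketch below. It models a page table as a raw array of 512 `u64` entries rather than the crate's `PageTable` type, and the name `is_table_empty` is an assumption for this example. The only hardware fact relied on is that bit 0 of an x86_64 page table entry is the present bit.

```rust
/// Bit 0 of an x86_64 page table entry is the "present" flag.
const PRESENT: u64 = 1;

/// Hypothetical check: a table may be reclaimed only if none of its
/// 512 entries is marked present.
fn is_table_empty(table: &[u64; 512]) -> bool {
    table.iter().all(|entry| entry & PRESENT == 0)
}
```

Running this at every level before freeing a table would catch the case where a table still maps something.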
Design question:
I think we could do this two ways:
My preference is for (2).
Sounds good to me!
Both things would be useful in my opinion. (1) could be used in some kind of garbage-collection thread that periodically cleans up page tables that are no longer used. (2) can be used when it's 100% certain that only empty page tables should exist in a given range.
Done.
I added a
Thanks for the updates, @mark-i-m, and sorry for the delay!
I like the `mode` parameter, which allows using both variants. Adding support for the range syntax also looks good to me, even though I don't like the `Step` trait in its current form.
I'm unsure about the implementation of `reclaim_page_tables` and left some line comments. It is somewhat confusing at the moment, and I think I found some bugs too.
Maybe you could use ranges of 2MiB and 1GiB pages for iterating over P2/P3 tables. This would be clearer and would also avoid setting the indices manually to `0`.
///
/// # Panics
///
/// - If one of the range endpoints is `Unbounded`.
What's the problem with unbounded? I would expect that it just walks over the whole page table.
src/structures/paging/recursive.rs
Outdated
    }
};
let end = match range.end_bound() {
    Bound::Included(endpoint) => *endpoint + 1,
Doesn't this lead to an overflow if the endpoint is the address space end address?
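One way to avoid both the overflow and the panic on `Unbounded` is sketched below. It works on plain 4KiB page numbers instead of `Page` values, and `resolve_page_range` and `PAGE_COUNT` are names invented for this example. `PAGE_COUNT` assumes a 48-bit virtual address space and ignores the canonical-address hole.

```rust
use std::ops::{Bound, RangeBounds};

/// Assumption for this sketch: 2^48 bytes of address space / 4KiB pages.
const PAGE_COUNT: u64 = 1 << 36;

/// Hypothetical helper: resolves any range of page numbers to a half-open
/// `(start, end)` pair. Returns `None` instead of overflowing or panicking.
fn resolve_page_range<R: RangeBounds<u64>>(range: R) -> Option<(u64, u64)> {
    let start = match range.start_bound() {
        Bound::Included(&s) => s,
        Bound::Excluded(&s) => s.checked_add(1)?,
        // An unbounded start means "from the first page".
        Bound::Unbounded => 0,
    };
    let end = match range.end_bound() {
        // `checked_add` catches the overflow at the top of the number range.
        Bound::Included(&e) => e.checked_add(1)?,
        Bound::Excluded(&e) => e,
        // An unbounded end means "up to the end of the address space".
        Bound::Unbounded => PAGE_COUNT,
    };
    if start <= end && end <= PAGE_COUNT {
        Some((start, end))
    } else {
        None
    }
}
```

With this shape, `..` simply walks the whole address space, and an inclusive range ending at the last page resolves to an exclusive end of `PAGE_COUNT`.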
src/structures/paging/recursive.rs
Outdated
start.p4_index(),
start.p3_index(),
start.p2_index(),
start.p1_index() + u9::new(1),
I'm not sure what you're trying to do here. How does adding 1 to the P1 index change anything?
Should be `form_page_table_idx(p4, p3, p2 + 1, 0)`.
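The index arithmetic behind that correction can be sketched on plain 4KiB page numbers. The helper names below (`indices`, `from_indices`, `next_p1_table_start`) are assumptions for this example and do not mirror the PR's `form_page_table_idx`. Each level contributes 9 bits of index, and advancing to the next P1 table means incrementing the P2 index with carry and resetting the P1 index to zero.

```rust
/// Splits a 4KiB page number into its (P4, P3, P2, P1) table indices,
/// 9 bits per level.
fn indices(page: u64) -> (u16, u16, u16, u16) {
    let idx = |shift: u32| ((page >> shift) & 0x1ff) as u16;
    (idx(27), idx(18), idx(9), idx(0))
}

/// Inverse of `indices`: rebuilds the page number from the four indices.
fn from_indices(p4: u16, p3: u16, p2: u16, p1: u16) -> u64 {
    ((p4 as u64) << 27) | ((p3 as u64) << 18) | ((p2 as u64) << 9) | (p1 as u64)
}

/// First page mapped by the P1 table that follows the one containing `page`.
/// Dropping the low 9 bits and adding one carries into P3/P4 when P2 is 511.
fn next_p1_table_start(page: u64) -> u64 {
    ((page >> 9) + 1) << 9
}
```

Adding 1 to the P1 index only moves to the next page inside the same P1 table, which is why it does not help when the goal is to step from table to table.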
src/structures/paging/recursive.rs
Outdated
start.p4_index(),
start.p3_index(),
start.p2_index(),
u9::new(0),
Why is the P1 index always zero? Isn't the range below always empty this way? Or did you mean to use `end` instead of `start` for the other indices?
Yes, it is meant to be `end`.
src/structures/paging/recursive.rs
Outdated
Err(FrameError::FrameNotPresent) => {
    continue;
}
Err(FrameError::HugeFrame) => unreachable!(),
Why is this unreachable? 1GiB pages are totally possible.
Not sure what I was thinking here...
src/structures/paging/recursive.rs
Outdated
Err(FrameError::FrameNotPresent) => {
    continue;
}
Err(FrameError::HugeFrame) => unreachable!(),
Why is this unreachable? 2MiB pages are totally possible.
...or here...
src/structures/paging/recursive.rs
Outdated
    start.p3_index(),
    u9::new(0),
    u9::new(0),
);
Same questions as above with `p1_start` and `p1_end`.
src/structures/paging/recursive.rs
Outdated
);

// Free all the page tables!
for page in p1_start..p1_end {
Aren't you iterating over the same page table multiple times (same for the other page table levels)?
Yes, but after you free the table the first time, it should short-circuit in subsequent iterations.
Hmm, I think it would be less confusing to use 2MiB pages instead, because then each P1 table would occur only once.
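As a rough illustration of that suggestion, the sketch below maps a half-open range of 4KiB page numbers to the range of 2MiB page numbers that covers it. Each 2MiB page number corresponds to exactly one P1 table, so iterating over the result visits every P1 table once. The name `p1_tables_covering` is hypothetical.

```rust
use std::ops::Range;

/// Hypothetical helper: returns the half-open range of 2MiB page numbers
/// (one per P1 table) touched by a half-open range of 4KiB page numbers.
fn p1_tables_covering(pages: Range<u64>) -> Range<u64> {
    if pages.start >= pages.end {
        // An empty page range touches no tables.
        return 0..0;
    }
    let first = pages.start >> 9; // 512 4KiB pages per P1 table
    let last = (pages.end - 1) >> 9; // table holding the last page
    first..(last + 1)
}
```

The same idea extends one level up: shifting by 18 instead of 9 yields one step per P2 table, i.e. per 1GiB page.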
src/structures/paging/recursive.rs
Outdated
///
/// - If one of the range endpoints is `Unbounded`.
/// - If the `mode` specifies that we should.
unsafe fn reclaim_page_tables<D, R>(
I'm not sure whether this should be a method of the `Mapper` trait. It does not really have anything to do with mapping/unmapping, and it is completely independent of the generic `PageSize` parameter. Simply adding it as a method to `RecursivePageTable` might be a better fit and would also avoid the identical implementations for all page sizes.
/// just changes the page size.
pub fn as_4kib_page_range(Range { start, end }: Range<Page<Size2MiB>>) -> Range<Page<Size4KiB>> {
    Range {
        start: Page::from_start_address(start.start_address()).unwrap(),
Why not use `containing_address` here?
Thanks @phil-opp :) I will take a look. This might take me a while. It's kind of hard to get right without tests... If I have time, I might try your bootimage test :)
@mark-i-m Thanks for the continuous updates! This is very tricky code, so tests are a good idea. I already added basic support for
Thanks :) This will take me a while to work through. I think I have an idea for a clean way to iterate over page tables, but I'm still figuring out the details. I'll push when I at least have something that compiles :P
@phil-opp Could you take a look at my latest commit? It is rather large and messy and very WIP, but I think it is a cleaner approach than what I previously had. Any thoughts?
I'm a bit busy this week, but I'll try to take a look after the weekend.
I really like the visitor abstraction! It makes the code much clearer.
I currently see two problems:
- The visits occur top-down, i.e. first the P2 table, then its P1 tables. For the delete visitor we would want a bottom-up order, because freeing P1 tables could make the parent P2 table empty.
- If the delete visitor wants to override the `visit_p*` methods, it needs to reimplement the visitor logic, which leads to code duplication.
I think both of them could be solved by introducing additional `before_visit_*` and `after_visit_*` methods (the names are only placeholders). Then the `visit_p3` method would do the following:
- Call `before_visit_p3`, by default a no-op.
- Call `visit_p2` for all P3 entries.
- Call `after_visit_p3`, by default a no-op.
The delete visitor could then just override the `after_visit_*` methods. Thus we would get the correct order and we wouldn't need to duplicate the visitor code.
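The hook idea above can be sketched on a simplified, two-level model. The `P1Table` and `P2Table` structs, the `RecordFrees` visitor, and the exact hook signatures are all assumptions for this example. The real code would walk hardware page tables, and the comment itself says the hook names are placeholders.

```rust
/// Hypothetical, simplified table tree: a P2 table that owns its P1 tables.
struct P1Table {
    id: u32,
}

struct P2Table {
    id: u32,
    children: Vec<P1Table>,
}

trait Visitor {
    // Hooks are no-ops by default; implementors override only what they need.
    fn before_visit_p2(&mut self, _table: &P2Table) {}
    fn after_visit_p2(&mut self, _table: &P2Table) {}
    fn before_visit_p1(&mut self, _table: &P1Table) {}
    fn after_visit_p1(&mut self, _table: &P1Table) {}

    // The traversal logic lives in one place and is not meant to be overridden.
    fn visit_p2(&mut self, table: &P2Table) {
        self.before_visit_p2(table);
        for child in &table.children {
            self.visit_p1(child);
        }
        self.after_visit_p2(table);
    }

    fn visit_p1(&mut self, table: &P1Table) {
        self.before_visit_p1(table);
        self.after_visit_p1(table);
    }
}

/// Stand-in for the delete visitor: it only records the order in which
/// tables would be freed.
struct RecordFrees {
    freed: Vec<u32>,
}

impl Visitor for RecordFrees {
    fn after_visit_p1(&mut self, table: &P1Table) {
        self.freed.push(table.id);
    }

    fn after_visit_p2(&mut self, table: &P2Table) {
        // Runs after all children were handled, i.e. bottom-up.
        self.freed.push(table.id);
    }
}
```

Because the `after_visit_*` hooks fire once the children are done, the parent table is processed last without the visitor reimplementing the traversal.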
src/structures/paging/recursive.rs
Outdated
        .as_mut_ptr()
};
self.visit_1gib_page(entry, frame);
return;
Shouldn't this be a `continue` instead of a `return`? I.e. shouldn't we visit the following P3 entries too?
src/structures/paging/recursive.rs
Outdated
        .as_mut_ptr()
};
self.visit_2mib_page(entry, frame);
return;
Same as above (`continue` instead of `return`?)
// No entry... skip.
Err(FrameError::FrameNotPresent) => continue,
// Cannot have 512GiB huge pages (yet)!
Err(FrameError::HugeFrame) => unreachable!(),
Can we return a `Result` from this function instead of possibly panicking?
// No entry... skip.
Err(FrameError::FrameNotPresent) => continue,
// We are already at 4KiB. No huge pages here.
Err(FrameError::HugeFrame) => unreachable!(),
Can we turn this into an error type instead of a possible panic?
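A minimal sketch of what returning an error could look like is shown below. `FrameError` is redefined locally as a stand-in with the two variants that appear in the diff, while `WalkError` and `count_present` are names invented for this example.

```rust
/// Local stand-in with the two variants used in the diff above.
#[derive(Debug, PartialEq)]
enum FrameError {
    FrameNotPresent,
    HugeFrame,
}

/// Hypothetical error type for the page table walk.
#[derive(Debug, PartialEq)]
enum WalkError {
    /// A huge-page entry was found at a level where none is expected.
    UnexpectedHugePage { index: usize },
}

/// Walks the lookup results of one table and counts the present entries.
/// An unexpected huge page is reported to the caller instead of panicking.
fn count_present(entries: &[Result<u64, FrameError>]) -> Result<usize, WalkError> {
    let mut present = 0;
    for (index, entry) in entries.iter().enumerate() {
        match entry {
            Ok(_) => present += 1,
            // No entry... skip.
            Err(FrameError::FrameNotPresent) => continue,
            Err(FrameError::HugeFrame) => {
                return Err(WalkError::UnexpectedHugePage { index });
            }
        }
    }
    Ok(present)
}
```

The caller can then decide whether a malformed table is fatal, which keeps a corrupted or unusual page table from bringing down the kernel inside a library call.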
src/structures/paging/recursive.rs
Outdated
}

/// Page table visitor that visits the page tables mapping a certain range of memory and
/// deallocates them.
Maybe we should mention that this only frees page tables, not pages.
Update: I am working on this, but very slooowly... sorry for the delay. If you want to close the PR to keep the issue tracker clean, feel free. I will reopen when this is in a state reasonable for another review.
@mark-i-m No worries! There's no hurry.
Update: I am still working on this. I wanted to get page tables working in my own project so I could play with this and test it. That's done now, so I can continue with this... However, sigh time...
I'm going to go ahead and close this because it has been a long time, my own project has evolved in a different way, and I suspect that some sort of Visitor pattern might be better anyway... |
There are some concerns to be addressed. Copy/pasting from the previous thread:
Yes, it is a bit dangerous. In particular, it is very OS-specific to know if you can reclaim a page table. The problem is that it is impossible to know just from the page tables if a page is shared without some other source of information. If you have ideas for how to get around this, please let me know.
Hmm... so would you propose not having `reclaim_page_tables` take an `S: PageSize` parameter? So it would always take two `Page<Size4KiB>` and free everything in between?
Also, there is one thing I want to do first: make the range be specified as `RangeBound` so that one can do `reclaim_page_tables(page..other_page)` and `reclaim_page_tables(page..=other_page)`.