Fix PT page allocation #665
|
I think that the PT page insertion logic is correct (from our work with R-MAX, we rebuilt the page table structure from tracing and found it was correctly done). The PTW is providing the level index for the get_pte_pa calls. |
|
Please use my test branch here and run random workloads like:
The output logs show unrealistically high counts of 'new active pte page', and the count of 'new next pte page' is also higher than that of 'new data page'. For example, the 436.cactusADM-1804B workload has a memory footprint of only about 4 MB but allocates thousands of pte pages (it should need only tens of them). |
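As a rough sanity check on the "tens of pages" estimate, here is a minimal sketch that computes a lower bound on the number of page table pages for a contiguous footprint. The function name `min_pt_pages` and all parameter values (4 KiB pages, 8-byte PTEs, 5 levels) are illustrative assumptions, not values read from ChampSim's configuration. A scattered footprint needs more pages than this bound.

```cpp
#include <cstdint>

// Lower bound on page table pages needed to map a contiguous footprint.
// Hypothetical helper for illustration; all parameters are assumptions.
std::uint64_t min_pt_pages(std::uint64_t footprint_bytes, std::uint64_t page_size,
                           std::uint64_t pte_bytes, int levels)
{
  const std::uint64_t entries_per_page = page_size / pte_bytes;

  // Number of data pages that the lowest level has to map.
  std::uint64_t covered = (footprint_bytes + page_size - 1) / page_size;

  std::uint64_t total = 0;
  for (int level = 0; level < levels; ++level) {
    // Each level needs enough pages to hold one entry per page below it.
    covered = (covered + entries_per_page - 1) / entries_per_page;
    total += covered;
  }
  return total;
}
```

Under these assumptions, `min_pt_pages(4ull << 20, 4096, 8, 5)` returns 6: two leaf pages plus one page at each of the four upper levels. Even a heavily fragmented 4 MB footprint should stay orders of magnitude below thousands of pages.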
|
@ngober This seems like a major issue. We are over-allocating a significant number of pages for pte entries. As mentioned above, translation is going to be much slower as well. |
|
This probably needs a test added to the test suite. Also, from some initial runs, this could account for as much as a 5% performance difference for some workloads. Others may be closer to 1%. The effect drops off as the TLB fills and page table walks become less frequent. |
|
Oh, yeah, that does look like a bug. That will over-allocate pages, especially when warming up. Let's add a test on this and get it merged. |
|
Resolved with #688 |
Currently, each `VirtualMemory::get_pte_pa` function call allocates a new PT page, because `champsim::page_offset{next_pte_page} == champsim::page_offset{0}` is always true.

Also, the PT page insertion logic seems to use the wrong level for index comparison. Consider level = 1: the function will try to insert the full PFN into `page_table`, while `VirtualMemory::va_to_pa` will perform another page address translation. As a result, the fault penalty and the overall memory usage are doubled.
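To illustrate the first point, here is a minimal toy model, not ChampSim's actual code. It assumes that page table pages are the same size as data pages, so every PT page address is page-aligned and its page offset is always zero. `ToyVmem`, `lookup_buggy`, `lookup_on_miss`, and `pages_after_repeated_lookups` are hypothetical names introduced for this sketch, and the allocate-on-miss variant is only one possible alternative, not necessarily the change made in #688.

```cpp
#include <cstdint>
#include <map>
#include <utility>

constexpr std::uint64_t kPageSize = 4096; // assumed size of both data and PT pages

// Toy stand-in for a virtual memory model (hypothetical, for illustration only).
struct ToyVmem {
  std::uint64_t next_free = kPageSize; // next unallocated, page-aligned frame
  std::uint64_t next_pte_page = 0;     // frame to hand to the next PT entry
  std::uint64_t pt_pages = 0;          // number of PT pages allocated so far
  std::map<std::pair<std::uint64_t, int>, std::uint64_t> page_table; // (index, level) -> PT page

  std::uint64_t pop_page()
  {
    const std::uint64_t page = next_free;
    next_free += kPageSize;
    return page;
  }

  // Mirrors the reported pattern: the offset of a page-aligned address is
  // always zero, so a new page is allocated on every call.
  std::uint64_t lookup_buggy(std::uint64_t index, int level)
  {
    if (next_pte_page % kPageSize == 0) {
      next_pte_page = pop_page();
      ++pt_pages;
    }
    // emplace keeps the existing mapping if the key is already present.
    return page_table.emplace(std::make_pair(index, level), next_pte_page).first->second;
  }

  // One possible alternative: allocate only when the lookup misses.
  std::uint64_t lookup_on_miss(std::uint64_t index, int level)
  {
    auto it = page_table.find(std::make_pair(index, level));
    if (it == page_table.end()) {
      it = page_table.emplace(std::make_pair(index, level), pop_page()).first;
      ++pt_pages;
    }
    return it->second;
  }
};

// Count PT pages allocated after `calls` lookups of the same (index, level).
std::uint64_t pages_after_repeated_lookups(bool buggy, int calls)
{
  ToyVmem vm;
  for (int i = 0; i < calls; ++i) {
    if (buggy)
      vm.lookup_buggy(42, 1);
    else
      vm.lookup_on_miss(42, 1);
  }
  return vm.pt_pages;
}
```

In this model, 100 repeated lookups of the same entry allocate 100 pages with `lookup_buggy` and a single page with `lookup_on_miss`. A regression test along these lines could assert that repeated translations of the same address do not increase the allocated page count.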