DRRIP Set Dueling + SHiP Set Sampling Fix#603
DRRIP Set Dueling + SHiP Set Sampling Fix#603Quangmire wants to merge 5 commits intoChampSim:masterfrom
Conversation
ngober
left a comment
There was a problem hiding this comment.
This looks more or less like what I hoped it would look like. Thank you very much! Should we add tests for this to validate it?
| auto set_lower = set & 0x1F; // Bits 0 - 4 inclusive | ||
| auto set_upper = (set >> 5) & 0x1F; // Bits 5 - 9 inclusive |
There was a problem hiding this comment.
Should these be dynamic based on the number of sets? It seems that this algorithm breaks when the number of sets is less than 2^10. I know you had stated that there isn't literature about RRIP in anything other than the LLC, but given ChampSim's place as a research tool, it shouldn't surprise us if someone tries to put it in the L1.
There was a problem hiding this comment.
In general, increasing the percentage of leader sets (beyond 1 in 32 per policy) can cause performance degradation since the leader sets are statically assigned a policy and the leader sets for the unselected policy will always do worse. This is might be further exacerbated in small caches which don't have that many sets to begin with. The above trade-off would need to be better explored to determine what a reasonable amount of sampling would be for smaller caches.
That said, from the small set of SPEC workloads that I've tested, a 64-set L1D seems largely insensitive to the replacement policy (IPC is within +- 0.01).
There was a problem hiding this comment.
I don't mean to debate whether or not it's a good idea to put this policy in the L1. What I'm saying is that someone is definitely going to try it, and these hard-coded magic numbers are going to unfairly penalize them for doing that. This user won't know that they need to go in and change 0x1F to 0x3 or some other good value. This isn't a microarchitectural point, it's a defense-against-intrepid-users point.
There was a problem hiding this comment.
Gotcha. What I was trying to say that it's not clear what a reasonable default would be for a small cache like the L1. If we don't want to leave it up to the user, I suppose we could arbitrarily assign some defaults:
- 1024+ sets => 1 in 32 per policy (64 total leader sets at 1024)
- 256 to 1024 sets => 1 in 16 per policy (32 total leader sets at 256)
- 64 to 256 sets => 1 in 8 per policy (16 total leader sets at 64)
- 8 to 64 sets => 1 in 4 per policy (4 total leader sets at 8)
- < 8 sets => Assertion Failure?
To make it somewhat more transparent (and less magic-number-y), I could extract out the leader set detection:
bool is_leader_set(bool brrip, uint64_t set) {
uint64_t mask = SET_SAMPLE_RATE - 1;
uint64_t shift = champsim::lg2(SET_SAMPLE_RATE);
if(brrip)
return (set & mask) == ~((set >> shift) & mask);
return (set & mask) == ((set >> shift) & mask);
}
where SET_SAMPLE_RATE is a power of two such as 32.
There was a problem hiding this comment.
That seems like a reasonable set of arbitrary defaults. And, refactoring out is_leader_set() feels like a good idea. I might do something more like
enum class leader_class {
follower, brrip_leader, srrip_leader
};
[[nodiscard]] constexpr leader_class is_leader_set(long set) {
auto mask = SET_SAMPLE_RATE - 1;
auto shift = champsim::lg2(SET_SAMPLE_RATE);
auto low_slice = set & mask;
auto high_slice = (set >> shift) & mask;
if (high_slice == ~low_slice) {
return leader_class::brrip_leader;
} else if (high_slice == low_slice) {
return leader_class::srrip_leader;
}
return leader_class::follower;
}
This is a bit more explicit about how the function tells you if the set is a leader and, if so, which kind.
There was a problem hiding this comment.
I think this is a good idea. We should put it in champsim::modules::replacement, rather than CACHE. The latter is bloated enough already. A naming consideration: the word "index" seems overloaded here. Should we consider a name like get_set_sample_category()?
There was a problem hiding this comment.
Hi,
I have a doubt in chosing the total sampler sets
With this set_sampling_rate we set the default sampling rate based on the cache level that uses this replacement policy. But will the code also give provision to set the total number of sampled sets??
Just like if i want to track only k sets then which parameter to change? The code only has provision to fix the rate but not the number of such sets. lets say for 1 in 32 policy if we get 32 leader sets per policy, if i need only first 16 leader sets out of that then what should be done?? In the code i could see if sampling_rate is fixed then no of sampled sets gets fixed. If I wanted to choose a few out of that then what needs to be done?? Could you pls help me with this query.
There was a problem hiding this comment.
Are you asking for a parameter that controls the number of sampled sets? Currently, the only way to do this is to modify the code (which, given that this is in a user module, seems reasonable to me). If you want a parameter in the configuration file, I have a call for contributions open in #422. Would you consider contributing to that?
There was a problem hiding this comment.
1)I don't mean to having parameter in config file, but may be i thought the parameter could be declared and used in replacement policy file itself.
2) What exactly is the user module means?
3)I am not really sure how to develop logic to set the number of sampled sets and use it. So couldn't be able to contribute at this point of time.
There was a problem hiding this comment.
I suggest you read the documentation. I think that if you want the things you're asking for, you'll have to get your hands dirty and modify the source code.
ngober
left a comment
There was a problem hiding this comment.
All of these changes look great. I particularly like that all of the set-dueling behavior gets factored out. Factoring it out does give us a good opportunity to test the factored code, and the wider codebase benefits. Of the functions you've added, only get_set_sample_category(long, long) seems to be a "pure" function that is easily testable. All of the others connect out to other components (such as the underlying cache model). Are there any other opportunities you see for us to pull things out and test them?
| auto mask = set_sample_rate - 1; | ||
| auto shift = champsim::lg2(set_sample_rate); |
There was a problem hiding this comment.
| auto mask = set_sample_rate - 1; | |
| auto shift = champsim::lg2(set_sample_rate); | |
| champsim::data::bits shift{champsim::lg2(set_sample_rate)}; | |
| auto mask = champsim::bitmask(shift); |
|
Resolved by #689 |
Fixes for issues brought up in #599.
Issues with Set Dueling / Sampling
Rather than utilizing random sampling for DRRIP / SHiP, I instead used the leader set selection mechanism proposed in the original set dueling paper. Namely, to determine the leader set for one policy we check
set[0:5] == set[5:10]and to determine the other we checkset[0:5] == ~set[5:10]. For SHiP, I used the same approach for determining which sets should be sampled for tracking.By adopting the above approach, there will be one BRRIP and one SRRIP leader set per 32 sets and one SHiP sampled set per 32 sets.
Other Issues
SHiP had a separate bug where the
if (match->used)condition should have been reversed.In order to make both policies work in light of #570 and #598, I modified both policies to have
replacement_cache_filland disregarded misses as needed inupdate_replacement_state.Overall Results
After making these changes, the performance now typically follows the SHiP >= DRRIP > LRU trend. Before, SHiP, DRRIP, and LRU would often have similar performance.