I’ve been studying the memory model and saw this (quote from https://research.swtch.com/hwmm):
Litmus Test: Write Queue (also called Store Buffer)
Can this program see r1 = 0, r2 = 0?
// Thread 1 // Thread 2
x = 1 y = 1
r1 = y r2 = x
On sequentially consistent hardware: no.
On x86 (or other TSO): yes!
-
Fact 1: This is the store buffer litmus test mentioned in many articles. They all say that both r1 and r2 being zero could happen on TSO because of the existence of the store buffer. They seem to assume that all the stores and loads are executed in order, and yet the result is both r1 and r2 being zero. This later concludes that “store/load reordering could happen”, as a “consequence of the store buffer’s existence”.
-
Fact 2: However we know that OoO execution could also reorder the store and the load in both threads. In this sense, regardless of the store buffer, this reordering could result in both r1 and r2 being zero, as long as all four instructions retire without seeing each other’s invalidation to x or y. And this seems to me that “store/load reordering could happen”, just because “they are executed out of order”. (I might be very wrong about this since this is the best I know of speculation and OoO execution.)
I wonder how these two facts converge (assuming I happen to be right about both): Is store buffer or OoO execution the reason for “store/load reordering”, or both are?
Alternatively speaking: Say I somehow observed this litmus test on an x86 machine, was it because of the store buffer, or OoO execution? Or is it even possible to know which?
EDIT: Actually my major confusion is the unclear causality among the following points from various literatures:
- OoO execution can cause the memory reordering;
- Store/load reordering is caused by the store buffer and demonstrated by a litmus test (and thus named as “store buffer”);
- Some program having the exact same instructions as the store buffer litmus test is used as an observable OoO execution example, just as this article https://preshing.com/20120515/memory-reordering-caught-in-the-act does.
1 + 2 seems to imply that the store buffer is the cause, and OoO execution is the consequence. 3 + 1 seems to imply that OoO execution is the cause, and memory reordering is the consequence. I can no more tell which causes which. And it is that litmus test sitting in the middle of this mystery.