Intel 253668-032US User Manual

MEMORY CACHE CONTROL
The WC memory type is weakly ordered by definition. Once the eviction of a WC 
buffer has started, the data is subject to the weak ordering semantics of that 
definition. Ordering is not maintained between successive allocations and 
deallocations of WC buffers (for example, writes to WC buffer 1 followed by writes 
to WC buffer 2 may appear as buffer 2 followed by buffer 1 on the system bus). 
When a WC buffer is evicted to memory as partial writes, there is no guaranteed 
ordering between successive partial writes (for example, a partial write for 
chunk 2 may appear on the bus before the partial write for chunk 1 or vice versa). 
The only elements of WC propagation to the system bus that are guaranteed are 
those provided by transaction atomicity. For example, with a P6 family processor, a 
completely full WC buffer will always be propagated as a single 32-byte burst 
transaction using any chunk order. In a WC buffer eviction where data will be evicted as 
partials, all data contained in the same chunk (0 mod 8 aligned) will be propagated 
simultaneously. Likewise, for more recent processors starting with those based on 
the Intel NetBurst microarchitecture, a full WC buffer will always be propagated as 
a single burst transaction, using any chunk order within the transaction. For partial 
buffer propagations, all data contained in the same chunk will be propagated 
simultaneously.
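
The eviction guarantees above can be illustrated with a small software model. This 
is only a sketch, not a description of any real hardware interface: the names 
wc_buffer_model, wc_partial, wc_is_full, and wc_partial_writes are hypothetical, and 
the model assumes a 32-byte or 64-byte buffer divided into 8-byte chunks. It shows 
the one property the text guarantees, namely that all valid bytes of a chunk travel 
together in one partial write. The order of entries in the output array carries no 
meaning, because the bus order of partial writes is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_BYTES 8u /* chunk granularity from the text (0 mod 8 aligned) */

/* Hypothetical model of one WC buffer. Bit i of valid_mask is set if
 * byte i of the buffer has been written. */
typedef struct {
    unsigned buf_bytes;  /* assumed buffer size: 32 or 64 bytes */
    uint64_t valid_mask;
} wc_buffer_model;

/* One modeled partial write: which chunk, and which of its bytes are valid. */
typedef struct {
    unsigned chunk;
    uint8_t  byte_enables;
} wc_partial;

/* Returns 1 if every byte is valid, in which case the buffer is modeled
 * as leaving in a single burst transaction. */
static int wc_is_full(const wc_buffer_model *b)
{
    assert(b->buf_bytes == 32 || b->buf_bytes == 64);
    uint64_t full = (b->buf_bytes == 64)
                        ? UINT64_MAX
                        : ((UINT64_C(1) << b->buf_bytes) - 1);
    return (b->valid_mask & full) == full;
}

/* Lists the partial writes a non-full buffer would produce, one per chunk
 * that holds at least one valid byte. Returns the number of entries stored
 * (at most cap). A full buffer yields 0 entries because it is evicted as a
 * burst instead. Array order does NOT imply bus order. */
static size_t wc_partial_writes(const wc_buffer_model *b,
                                wc_partial *out, size_t cap)
{
    if (wc_is_full(b))
        return 0;

    size_t n = 0;
    for (unsigned c = 0; c < b->buf_bytes / CHUNK_BYTES; c++) {
        uint8_t enables = (uint8_t)(b->valid_mask >> (c * CHUNK_BYTES));
        if (enables != 0 && n < cap) {
            out[n].chunk = c;
            out[n].byte_enables = enables;
            n++;
        }
    }
    return n;
}
```

For instance, a 32-byte model buffer with bytes 0-1 and 8-11 written (valid_mask 
0x0F03) yields two modeled partial writes, one for chunk 0 and one for chunk 1, 
and software must not assume which of the two reaches the bus first.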
11.3.2 Choosing a Memory Type
The simplest system memory model does not use memory-mapped I/O with read or 
write side effects, does not include a frame buffer, and uses the write-back memory 
type for all memory. An I/O agent can perform direct memory access (DMA) to write-
back memory and the cache protocol maintains cache coherency.
A system can use strong uncacheable memory for other memory-mapped I/O, and 
should always use strong uncacheable memory for memory-mapped I/O with read 
side effects.
Dual-ported memory can be considered to have a write side effect, which makes 
relatively prompt writes desirable, because those writes cannot be observed at the 
other port until they reach the memory agent. A system can use strong uncacheable, 
uncacheable, write-through, or write-combining memory for frame buffers or 
dual-ported memory that contains pixel values displayed on a screen. Frame buffer 
memory is typically large (a 
few megabytes) and is usually written more than it is read by the processor. Using 
strong uncacheable memory for a frame buffer generates very large amounts of bus 
traffic, because operations on the entire buffer are implemented using partial writes 
rather than line writes. Using write-through memory for a frame buffer can displace 
almost all other useful cached lines in the processor's L2 and L3 caches and L1 data 
cache. Therefore, systems should use write-combining memory for frame buffers 
whenever possible.
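
The guidance in the preceding paragraphs can be summarized as a small decision 
function. This is a minimal sketch under stated assumptions, not an operating 
system interface: region_desc, mem_type, and choose_mem_type are hypothetical names, 
and the fallback to strong uncacheable memory when write combining is unavailable 
is an assumption made for illustration (the text also permits uncacheable or 
write-through memory for frame buffers).

```c
#include <stdbool.h>

/* Memory types named in the text (hypothetical enum for illustration). */
typedef enum { MT_WB, MT_WT, MT_WC, MT_UC } mem_type;

/* Hypothetical description of a physical memory region. */
typedef struct {
    bool mmio;              /* region is memory-mapped I/O */
    bool read_side_effects; /* reads change device state */
    bool frame_buffer;      /* frame buffer or dual-ported pixel memory */
    bool wc_available;      /* platform supports write combining here */
} region_desc;

static mem_type choose_mem_type(const region_desc *r)
{
    /* Memory-mapped I/O with read side effects: always strong uncacheable. */
    if (r->mmio && r->read_side_effects)
        return MT_UC;

    /* Frame buffers: write combining whenever possible. Falling back to
     * strong uncacheable is an assumption of this sketch. */
    if (r->frame_buffer)
        return r->wc_available ? MT_WC : MT_UC;

    /* Other memory-mapped I/O: strong uncacheable is allowed by the text. */
    if (r->mmio)
        return MT_UC;

    /* Ordinary memory, including DMA targets: write-back. */
    return MT_WB;
}
```

The read-side-effect check comes first so that a region flagged both as a frame 
buffer and as having read side effects is never given a type that permits 
speculative or combined accesses.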
Software can use page-level cache control to assign appropriate effective memory 
types when software will not access data structures in ways that benefit from write-
back caching. For example, software may read a large data structure once and not 
access the structure again until the structure is rewritten by another agent. Such a