Investigating CPU Topology

2013-03-07 07:56

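When writing multicore programs (Erlang programs, for example), we need to understand the CPU topology: how logical CPUs map to physical CPUs, and the CPU's internal hardware parameters, such as the L1 and L2 cache sizes.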

On Linux, /proc/cpuinfo provides some of this information, but it is incomplete. /sys/devices/system/cpu/ also exposes the topology, but it is hard to interpret.
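As a rough illustration of what reading the sysfs tree involves, here is a minimal Python sketch. It assumes each online CPU has a topology/ directory containing the physical_package_id and core_id files described in the kernel's sysfs ABI documentation. The function names read_topology and group_by_core are made up for this example.

```python
import glob
import os
import re
from collections import defaultdict


def read_topology(base="/sys/devices/system/cpu"):
    """Return {os_cpu: (package_id, core_id)} read from sysfs topology files.

    Assumption: every online CPU directory has topology/physical_package_id
    and topology/core_id. Directories without topology/ are skipped.
    """
    topo = {}
    for path in glob.glob(os.path.join(base, "cpu[0-9]*")):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        tdir = os.path.join(path, "topology")
        if not match or not os.path.isdir(tdir):
            continue  # not a CPU directory, or the CPU is offline
        with open(os.path.join(tdir, "physical_package_id")) as f:
            pkg = int(f.read())
        with open(os.path.join(tdir, "core_id")) as f:
            core = int(f.read())
        topo[int(match.group(1))] = (pkg, core)
    return topo


def group_by_core(topo):
    """Group OS CPU numbers that share the same (package, core) pair."""
    groups = defaultdict(list)
    for cpu, key in sorted(topo.items()):
        groups[key].append(cpu)
    return dict(groups)
```

Calling group_by_core(read_topology()) on a Linux machine returns a dictionary keyed by (package, core), where each value lists the OS CPU numbers that are hyperthread siblings on that core. It does not report cache sizes, which is one reason a dedicated tool is still useful.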

Often we need a more specialized tool, and Intel provides one. See: http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
Download it, compile it, and run it.

[admin@my174 cpu-topology]$ ./cpu_topology64.out

Advisory to Users on system topology enumeration

This utility is for demonstration purpose only. It assumes the hardware topology

configuration within a coherent domain does not change during the life of an OS

session. If an OS support advanced features that can change hardware topology

configurations, more sophisticated adaptation may be necessary to account for

the hardware configuration change that might have added and reduced the number

of logical processors being managed by the OS.

User should also be aware that the system topology enumeration algorithm is

based on the assumption that CPUID instruction will return raw data reflecting

the native hardware configuration. When an application runs inside a virtual

machine hosted by a Virtual Machine Monitor (VMM), any CPUID instructions

issued by an app (or a guest OS) are trapped by the VMM and it is the VMM’s

responsibility and decision to emulate/supply CPUID return data to the virtual

machines. When deploying topology enumeration code based on querying CPUID

inside a VM environment, the user must consult with the VMM vendor on how an VMM

will emulate CPUID instruction relating to topology enumeration.

Software visible enumeration in the system:

Number of logical processors visible to the OS: 16

Number of logical processors visible to this process: 16

Number of processor cores visible to this process: 8

Number of physical packages visible to this process: 2

Hierarchical counts by levels of processor topology:

# of cores in package 0 visible to this process: 4 .

# of logical processors in Core 0 visible to this process: 2 .

# of logical processors in Core 1 visible to this process: 2 .

# of logical processors in Core 2 visible to this process: 2 .

# of logical processors in Core 3 visible to this process: 2 .

# of cores in package 1 visible to this process: 4 .

# of logical processors in Core 0 visible to this process: 2 .

# of logical processors in Core 1 visible to this process: 2 .

# of logical processors in Core 2 visible to this process: 2 .

# of logical processors in Core 3 visible to this process: 2 .

Affinity masks per SMT thread, per core, per package:

Individual:

P:0, C:0, T:0 --> 1

P:0, C:0, T:1 --> 100

Core-aggregated:

P:0, C:0 --> 101

Individual:

P:0, C:1, T:0 --> 4

P:0, C:1, T:1 --> 400

Core-aggregated:

P:0, C:1 --> 404

Individual:

P:0, C:2, T:0 --> 10

P:0, C:2, T:1 --> 1z3

Core-aggregated:

P:0, C:2 --> 1010

Individual:

P:0, C:3, T:0 --> 40

P:0, C:3, T:1 --> 4z3

Core-aggregated:

P:0, C:3 --> 4040

Pkg-aggregated:

P:0 --> 5555

Individual:

P:1, C:0, T:0 --> 2

P:1, C:0, T:1 --> 200

Core-aggregated:

P:1, C:0 --> 202

Individual:

P:1, C:1, T:0 --> 8

P:1, C:1, T:1 --> 800

Core-aggregated:

P:1, C:1 --> 808

Individual:

P:1, C:2, T:0 --> 20

P:1, C:2, T:1 --> 2z3

Core-aggregated:

P:1, C:2 --> 2020

Individual:

P:1, C:3, T:0 --> 80

P:1, C:3, T:1 --> 8z3

Core-aggregated:

P:1, C:3 --> 8080

Pkg-aggregated:

P:1 --> aaaa

APIC ID listings from affinity masks

OS cpu 0, Affinity mask 000001 - apic id 10

OS cpu 1, Affinity mask 000002 - apic id 0

OS cpu 2, Affinity mask 000004 - apic id 12

OS cpu 3, Affinity mask 000008 - apic id 2

OS cpu 4, Affinity mask 000010 - apic id 14

OS cpu 5, Affinity mask 000020 - apic id 4

OS cpu 6, Affinity mask 000040 - apic id 16

OS cpu 7, Affinity mask 000080 - apic id 6

OS cpu 8, Affinity mask 000100 - apic id 11

OS cpu 9, Affinity mask 000200 - apic id 1

OS cpu 10, Affinity mask 000400 - apic id 13

OS cpu 11, Affinity mask 000800 - apic id 3

OS cpu 12, Affinity mask 001000 - apic id 15

OS cpu 13, Affinity mask 002000 - apic id 5

OS cpu 14, Affinity mask 004000 - apic id 17

OS cpu 15, Affinity mask 008000 - apic id 7

Package 0 Cache and Thread details

Box Description:

Cache is cache level designator

Size is cache size

OScpu# is cpu # as seen by OS

Core is core#[_thread# if > 1 thread/core] inside socket

AffMsk is AffinityMask(extended hex) for core and thread

CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache

CmbMsk will differ from AffMsk if > 1 hw_thread/cache

Extended Hex replaces trailing zeroes with 'z#'

where # is number of zeroes (so '8z5' is '0x800000')

L1D is Level 1 Data cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 4

L1I is Level 1 Instruction cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 4

L2 is Level 2 Unified cache, size(KBytes)= 256, Cores/cache= 2, Caches/package= 4

L3 is Level 3 Unified cache, size(KBytes)= 8192, Cores/cache= 8, Caches/package= 1

      +-----------+-----------+-----------+-----------+
Cache |  L1D      |  L1D      |  L1D      |  L1D      |
Size  |  32K      |  32K      |  32K      |  32K      |
OScpu#|    0     8|    2    10|    4    12|    6    14|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|    1   100|    4   400|   10   1z3|   40   4z3|
CmbMsk|  101      |  404      | 1010      | 4040      |
      +-----------+-----------+-----------+-----------+
Cache |  L1I      |  L1I      |  L1I      |  L1I      |
Size  |  32K      |  32K      |  32K      |  32K      |
      +-----------+-----------+-----------+-----------+
Cache |   L2      |   L2      |   L2      |   L2      |
Size  | 256K      | 256K      | 256K      | 256K      |
      +-----------+-----------+-----------+-----------+
Cache |   L3                                          |
Size  |   8M                                          |
CmbMsk| 5555                                          |
      +-----------------------------------------------+

Combined socket AffinityMask= 0x5555

Package 1 Cache and Thread details

Box Description:

Cache is cache level designator

Size is cache size

OScpu# is cpu # as seen by OS

Core is core#[_thread# if > 1 thread/core] inside socket

AffMsk is AffinityMask(extended hex) for core and thread

CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache

CmbMsk will differ from AffMsk if > 1 hw_thread/cache

Extended Hex replaces trailing zeroes with 'z#'

where # is number of zeroes (so '8z5' is '0x800000')

      +-----------+-----------+-----------+-----------+
Cache |  L1D      |  L1D      |  L1D      |  L1D      |
Size  |  32K      |  32K      |  32K      |  32K      |
OScpu#|    1     9|    3    11|    5    13|    7    15|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|    2   200|    8   800|   20   2z3|   80   8z3|
CmbMsk|  202      |  808      | 2020      | 8080      |
      +-----------+-----------+-----------+-----------+
Cache |  L1I      |  L1I      |  L1I      |  L1I      |
Size  |  32K      |  32K      |  32K      |  32K      |
      +-----------+-----------+-----------+-----------+
Cache |   L2      |   L2      |   L2      |   L2      |
Size  | 256K      | 256K      | 256K      | 256K      |
      +-----------+-----------+-----------+-----------+
Cache |   L3                                          |
Size  |   8M                                          |
CmbMsk| aaaa                                          |
      +-----------------------------------------------+

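The output clearly shows the details of our CPUs, such as the L1, L2, and L3 cache sizes and which hardware threads share each cache. We often need this information when writing programs.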
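The affinity masks in the output use the tool's "extended hex" notation, where 'z#' stands for # trailing zeroes. The following Python sketch decodes such a mask into OS CPU numbers. The helper names parse_ext_hex and mask_to_cpus are invented for this example, and the code assumes the count after 'z' is written in decimal, as in the tool's own '8z5' example.

```python
def parse_ext_hex(text):
    """Decode 'extended hex' such as '1z3' into an integer mask.

    Assumption: the number after 'z' is a decimal count of trailing
    hex zeroes, so '8z5' means 0x800000.
    """
    digits, sep, zeroes = text.lower().partition("z")
    if sep:
        digits += "0" * int(zeroes)
    return int(digits, 16)


def mask_to_cpus(mask):
    """List the OS CPU numbers whose bits are set in an affinity mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Core 2 of package 0: threads with masks '10' and '1z3'.
    print(mask_to_cpus(parse_ext_hex("10") | parse_ext_hex("1z3")))  # [4, 12]
    # All hardware threads of package 0.
    print(mask_to_cpus(parse_ext_hex("5555")))  # [0, 2, 4, 6, 8, 10, 12, 14]
```

On Linux with Python 3.3 or later, the resulting list could be passed to os.sched_setaffinity(0, cpus) to restrict the current process to those CPUs, for example to keep it on a single package so that its threads share one L3 cache.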

Have fun!

References:

1. https://kevinclosson.wordpress.com/2009/04/22/linux-thinks-its-a-cpu-but-what-is-it-really-mapping-xeon-5500-nehalem-processor-threads-to-linux-os-cpus/
2. http://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu
3. http://chemnitzer.linux-tage.de/2010/vortraege/shortpaper/470-slides.pdf
4. http://software.intel.com/sites/oss/pdfs/mclinux.pdf