The lscpu Command in Detail

Background

  • Kernel source: v5.15-rc1 (linux-stable)
  • Test kernel: v5.4.18 (kylin)
  • Test system: KylinOS V10 SP1

Note: this analysis covers the arm64 platform only, using a Phytium machine as the example.


Overview

**lscpu** displays information about the CPU architecture.
Its help output:

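Usage:
 lscpu [options]

Display information about the CPU architecture.

Options:
 -a, --all               print both online and offline CPUs (default for -e)
 -b, --online            print online CPUs only (default for -p)
 -B, --bytes             print sizes in bytes rather than in human readable format
 -C, --caches[=<list>]   info about caches in extended readable format
 -c, --offline           print offline CPUs only
 -J, --json              use JSON for default or extended format
 -e, --extended[=<list>] print out an extended readable format
 -p, --parse[=<list>]    print out a parsable format
 -s, --sysroot <dir>     use specified directory as system root
 -x, --hex               print hexadecimal masks rather than lists of CPUs
 -y, --physical          print physical instead of logical IDs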
     --output-all        print all available columns for -e, -p or -C

 -h, --help              display this help
 -V, --version           display version

Available output columns for -e or -p:
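           CPU  logical CPU number
          CORE  logical core number
        SOCKET  logical socket number
          NODE  logical NUMA node number
          BOOK  logical book number
        DRAWER  logical drawer number
         CACHE  shows how caches are shared between CPUs
  POLARIZATION  CPU dispatching mode on virtual hardware
       ADDRESS  physical address of a CPU
    CONFIGURED  shows if the hypervisor has allocated the CPU
        ONLINE  shows if Linux currently makes use of the CPU
        MAXMHZ  shows the maximum MHz of the CPU
        MINMHZ  shows the minimum MHz of the CPU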

Available output columns for -C:
      ALL-SIZE  size of all system caches
         LEVEL  cache level
          NAME  cache name
      ONE-SIZE  size of one cache
          TYPE  cache type
          WAYS  ways of associativity

For more details see the lscpu man page.

Running lscpu on the Phytium platform:

user@user-D2000:~$ lscpu
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              1
Core(s) per socket:              8
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       0x70
Model:                           3
Model name:                      Phytium,D2000/8
Stepping:                        0x1
CPU max MHz:                     2300.0000
CPU min MHz:                     575.0000
BogoMIPS:                        96.00
L1d cache:                       256 KiB
L1i cache:                       256 KiB
L2 cache:                        8 MiB
L3 cache:                        4 MiB
NUMA node0 CPU(s):               0-7
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Vulnerable
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Vulnerable
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

lscpu Source Code Analysis

Getting the source:

According to the man page:

The lscpu command is part of the util-linux package and is available from https://www.kernel.org/pub/linux/utils/util-linux/.

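So we need the source of the util-linux package, which can be obtained in either of the following two ways:

The files most relevant to lscpu are lscpu.c and lscpu-arm.c.

lscpu Command Output

The entire default output of lscpu is produced by the print_summary() function.

First, here is how each item printed by print_summary() maps to its data and code: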

  • Architecture:
    desc->arch

  • CPU op-mode(s):
    desc->mode

  • Byte Order:

    #if !defined(WORDS_BIGENDIAN)
        add_summary_s(tb, _("Byte Order:"), "Little Endian");
    #else
        add_summary_s(tb, _("Byte Order:"), "Big Endian");
    #endif
  • CPU(s):
    desc->ncpus

  • On-line CPU(s) list:

    if (desc->online)
        print_cpuset(tb, mod->hex ? _("On-line CPU(s) mask:") :
                        _("On-line CPU(s) list:"),
                desc->online, mod->hex);
  • Thread(s) per core:

  • Core(s) per socket:

  • Socket(s):

    if (desc->mtid)
        threads_per_core = atoi(desc->mtid) + 1;
    add_summary_n(tb, _("Thread(s) per core:"),
        threads_per_core ?: desc->nthreads / desc->ncores);
    add_summary_n(tb, _("Core(s) per socket:"),
        cores_per_socket ?: desc->ncores / desc->nsockets);
    if (desc->nbooks) {
        add_summary_n(tb, _("Socket(s) per book:"),
            sockets_per_book ?: desc->nsockets / desc->nbooks);
        if (desc->ndrawers) {
            add_summary_n(tb, _("Book(s) per drawer:"),
                books_per_drawer ?: desc->nbooks / desc->ndrawers);
            add_summary_n(tb, _("Drawer(s):"), drawers ?: desc->ndrawers);
        } else {
            add_summary_n(tb, _("Book(s):"), books_per_drawer ?: desc->nbooks);
        }
    } else {
        add_summary_n(tb, _("Socket(s):"), sockets_per_book ?: desc->nsockets);
    }
  • NUMA node(s):
    desc->nnodes

  • Vendor ID:
    desc->vendor

  • Model:
    desc->revision ? desc->revision : desc->model

  • Model name:
    desc->cpu ? desc->cpu : desc->modelname

  • Stepping:
    desc->stepping

  • CPU max MHz:
    cpu_max_mhz(desc, buf, sizeof(buf))
    Scans desc->maxmhz[i] for the highest frequency

  • CPU min MHz:
    cpu_min_mhz(desc, buf, sizeof(buf))
    Scans desc->minmhz[i] for the lowest frequency

  • BogoMIPS:
    desc->bogomips

  • cache:

    if (desc->ncaches) {
        for (i = desc->ncaches - 1; i >= 0; i--) {
            uint64_t sz = 0;
            char *tmp;
            struct cpu_cache *ca = &desc->caches[i];
    
            if (ca->size == 0)
                continue;
            if (get_cache_full_size(desc, ca, &sz) != 0 || sz == 0)
                continue;
            if (mod->bytes)
                xasprintf(&tmp, "%" PRIu64, sz);
            else
                tmp = size_to_human_string(
                    SIZE_SUFFIX_3LETTER | SIZE_SUFFIX_SPACE,
                    sz);
            snprintf(buf, sizeof(buf), _("%s cache:"), ca->name);
            add_summary_s(tb, buf, tmp);
            free(tmp);
        }
    }
    if (desc->necaches) {
        for (i = desc->necaches - 1; i >= 0; i--) {
            char *tmp;
            struct cpu_cache *ca = &desc->ecaches[i];
    
            if (ca->size == 0)
                continue;
            if (mod->bytes)
                xasprintf(&tmp, "%" PRIu64, ca->size);
            else
                tmp = size_to_human_string(
                    SIZE_SUFFIX_3LETTER | SIZE_SUFFIX_SPACE,
                    ca->size);
            snprintf(buf, sizeof(buf), _("%s cache:"), ca->name);
            add_summary_s(tb, buf, tmp);
            free(tmp);
        }
    }
  • NUMA node0 CPU(s):

    for (i = 0; i < desc->nnodes; i++) {
        snprintf(buf, sizeof(buf), _("NUMA node%d CPU(s):"), desc->idx2nodenum[i]);
        print_cpuset(tb, buf, desc->nodemaps[i], mod->hex);
    }
  • vulnerability

    if (desc->vuls) {
            for (i = 0; i < desc->nvuls; i++) {
                snprintf(buf, sizeof(buf), ("Vulnerability %s:"), desc->vuls[i].name);
                add_summary_s(tb, buf, desc->vuls[i].text);
            }
        }
  • Flags:
    desc->flags

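[Figure: mind map summarizing the mapping above]

So the lscpu output is driven mainly by desc, a variable of type struct lscpu_desc. Where does its data come from?
As the mind map above shows, mainly from the following interfaces:

  • the uname system call
  • /proc/cpuinfo
  • /sys/devices/system/cpu
  • /sys/devices/system/node
  • /sys/devices/system/cpu/vulnerabilities

The sections below look at each of these interfaces in turn and at how the data is parsed.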

/proc/cpuinfo

Contents on the Phytium machine. Only one CPU (CPU 0) is shown; the other seven are similar:

user@user-D2000:~$ cat /proc/cpuinfo 
processor	: 0
model name	: Phytium,D2000/8
BogoMIPS	: 96.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x70
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x663
CPU revision	: 3

Architecture
The uname system call returns kernel and architecture information, which lscpu stores in struct utsname utsbuf.
desc->arch = xstrdup(utsbuf.machine); copies the architecture string into desc->arch.

lscpu then walks /proc/cpuinfo, looks up the following fields, and stores them:

"CPU implementer"  desc->vendor    /* ARM and aarch64 */
"cpu family"       desc->family
"CPU part"         desc->model    /* ARM and aarch64 */
"model name"       desc->modelname
"CPU variant"      desc->stepping   /* aarch64 */ 
"cpu MHz"          desc->mhz
"Features"         desc->flags     /* aarch64 */   
"BogoMIPS"         desc->bogomips  /* aarch64 */  
"CPU revision"     desc->revision  /* aarch64 */    

Cache:
lookup_cache() searches for the cache-related fields and stores them in desc->necaches and desc->ecaches.

CPU op-mode(s) (32-bit / 64-bit):
desc->mode = init_mode(mod);
How init_mode() handles ARM64:

#if defined(__aarch64__)
	{
		/* personality() is the most reliable way (since 4.7)
		 * to determine aarch32 support */
		int pers = personality(PER_LINUX32);
		if (pers != -1) {
			personality(pers);
			m |= MODE_32BIT;
		}
		m |= MODE_64BIT;
	}
#endif

arch/arm64/kernel/sys.c:

SYSCALL_DEFINE1(arm64_personality, unsigned int, personality)
{
	if (personality(personality) == PER_LINUX32 &&
		!system_supports_32bit_el0())
		return -EINVAL;
	return ksys_personality(personality);
}

/sys/devices/system/cpu

Contents on the Phytium machine:

user@user-D2000:~$ ls /sys/devices/system/cpu/
cpu0  cpu3  cpu6     cpuidle   kernel_max  online    present  vulnerabilities
cpu1  cpu4  cpu7     hotplug   modalias    possible  smt
cpu2  cpu5  cpufreq  isolated  offline     power     uevent

The most important entries:
kernel_max gives maxcpus.
possible gives a cpuset, from which the number of possible CPUs is derived: desc->ncpuspos
present gives the CPUs currently present: desc->ncpus
online gives the online CPUs: desc->online, desc->nthreads

For each CPU, lscpu then reads:

read_topology(desc, i);      => desc->ncores  desc->nsockets
read_cache(desc, i);
read_polarization(desc, i);
read_address(desc, i);
read_configured(desc, i);
read_max_mhz(desc, i);       => desc->maxmhz[i]
read_min_mhz(desc, i);       => desc->minmhz[i]

/sys/devices/system/node

Contents on the Phytium machine:

user@user-D2000:~$ ls /sys/devices/system/node/
has_cpu  has_memory  has_normal_memory  node0  online  possible  power  uevent

read_nodes() collects the NUMA node information and stores it in desc->nnodes, desc->nodemaps, and related fields.

/sys/devices/system/cpu/vulnerabilities

Contents on the Phytium machine:

user@user-D2000:~$ ls /sys/devices/system/cpu/vulnerabilities/
itlb_multihit  mds       spec_store_bypass  spectre_v2
l1tf           meltdown  spectre_v1         tsx_async_abort

read_vulnerabilities() reads the vulnerability entries and stores them in desc->vuls.

arm_cpu_decode()

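Source: lscpu-arm.c
This function handles the ARM-specific details. It mainly matches IDs against lookup tables to turn them into human-readable strings.
Each vendor and each model has its own ID, somewhat like a UUID.
A few examples: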

static const struct id_part arm_part[] = {
    ...
    { 0xd04, "Cortex-A35" },
    { 0xd05, "Cortex-A55" },
    { 0xd07, "Cortex-A57" },
    { 0xd08, "Cortex-A72" },
    { 0xd09, "Cortex-A73" },
    ...
};

The Corresponding Kernel Interfaces

The uname system call

Source: kernel/sys.c
Main data structure: struct new_utsname

The stored system information is returned mainly through utsname():

static inline struct new_utsname *utsname(void)
{
	/* current points to the task structure of the current process */
	return &current->nsproxy->uts_ns->name;
}

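Newer Linux kernels add UTS namespaces to the picture.

For processes in different UTS namespaces, nsproxy->uts_ns in the task structure points to different structures. The details are outside the scope of this article.

The initial values are assigned in init/version.c: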

struct uts_namespace init_uts_ns = {
	.ns.count = REFCOUNT_INIT(2),
	.name = {
		.sysname	= UTS_SYSNAME,
		.nodename	= UTS_NODENAME,
		.release	= UTS_RELEASE,
		.version	= UTS_VERSION,
		.machine	= UTS_MACHINE,
		.domainname	= UTS_DOMAINNAME,
	},
	.user_ns = &init_user_ns,
	.ns.inum = PROC_UTS_INIT_INO,
#ifdef CONFIG_UTS_NS
	.ns.ops = &utsns_operations,
#endif
};
EXPORT_SYMBOL_GPL(init_uts_ns);

/proc/cpuinfo

Source for the node: fs/proc/cpuinfo.c

extern const struct seq_operations cpuinfo_op;
static int cpuinfo_open(struct inode *inode, struct file *file)
{
	arch_freq_prepare_all();
	return seq_open(file, &cpuinfo_op);
}

static const struct proc_ops cpuinfo_proc_ops = {
	.proc_flags	= PROC_ENTRY_PERMANENT,
	.proc_open	= cpuinfo_open,
	.proc_read_iter	= seq_read_iter,
	.proc_lseek	= seq_lseek,
	.proc_release	= seq_release,
};

static int __init proc_cpuinfo_init(void)
{
	proc_create("cpuinfo", 0, NULL, &cpuinfo_proc_ops);
	return 0;
}

The cpuinfo_op operations are defined in arch/arm64/kernel/cpuinfo.c:

const struct seq_operations cpuinfo_op = {
	.start	= c_start,
	.next	= c_next,
	.stop	= c_stop,
	.show	= c_show
};

The output is produced mainly by c_show():

static int c_show(struct seq_file *m, void *v)
{
	int i, j;
	bool compat = personality(current->personality) == PER_LINUX32;

	for_each_online_cpu(i) {
		struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i);
		u32 midr = cpuinfo->reg_midr;

		/*
		 * glibc reads /proc/cpuinfo to determine the number of
		 * online processors, looking for lines beginning with
		 * "processor".  Give glibc what it expects.
		 */
		seq_printf(m, "processor\t: %d\n", i);
		if (compat)
			seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n",
				   MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);

		seq_printf(m, "BogoMIPS\t: %lu.%02lu\n",
			   loops_per_jiffy / (500000UL/HZ),
			   loops_per_jiffy / (5000UL/HZ) % 100);

		/*
		 * Dump out the common processor features in a single line.
		 * Userspace should read the hwcaps with getauxval(AT_HWCAP)
		 * rather than attempting to parse this, but there's a body of
		 * software which does already (at least for 32-bit).
		 */
		seq_puts(m, "Features\t:");
		if (compat) {
#ifdef CONFIG_COMPAT
			for (j = 0; j < ARRAY_SIZE(compat_hwcap_str); j++) {
				if (compat_elf_hwcap & (1 << j)) {
					/*
					 * Warn once if any feature should not
					 * have been present on arm64 platform.
					 */
					if (WARN_ON_ONCE(!compat_hwcap_str[j]))
						continue;

					seq_printf(m, " %s", compat_hwcap_str[j]);
				}
			}

			for (j = 0; j < ARRAY_SIZE(compat_hwcap2_str); j++)
				if (compat_elf_hwcap2 & (1 << j))
					seq_printf(m, " %s", compat_hwcap2_str[j]);
#endif /* CONFIG_COMPAT */
		} else {
			for (j = 0; j < ARRAY_SIZE(hwcap_str); j++)
				if (cpu_have_feature(j))
					seq_printf(m, " %s", hwcap_str[j]);
		}
		seq_puts(m, "\n");

		seq_printf(m, "CPU implementer\t: 0x%02x\n",
			   MIDR_IMPLEMENTOR(midr));
		seq_printf(m, "CPU architecture: 8\n");
		seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr));
		seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr));
		seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr));
	}

	return 0;
}

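c_show() iterates over the online CPUs and prints the information for each one. Where does that information come from?

Source of cpu_data:
A per-CPU variable cpu_data of type struct cpuinfo_arm64 is defined.
__cpuinfo_store_cpu() reads the relevant registers through read_cpuid() and stores the results in that per-CPU struct cpuinfo_arm64 variable.
Call chain of __cpuinfo_store_cpu():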

smp_prepare_boot_cpu() [arch/arm64/kernel/smp.c]
    -> cpuinfo_store_boot_cpu() [arch/arm64/kernel/cpuinfo.c]
        -> __cpuinfo_store_cpu()
secondary_start_kernel() [arch/arm64/kernel/smp.c]
    -> cpuinfo_store_cpu() [arch/arm64/kernel/cpuinfo.c]
        -> __cpuinfo_store_cpu()

Source of loops_per_jiffy:
loops_per_jiffy is assigned in the following flow:

`start_kernel() [init/main.c]`
    -> `calibrate_delay() [init/calibrate.c]`

adjust_jiffies() [drivers/cpufreq/cpufreq.c] also modifies it when the CPU changes frequency; see the cpufreq documentation and source for details.

Source of compat_elf_hwcap and compat_elf_hwcap2:
Main source: arch/arm64/kernel/cpufeature.c

setup_cpu_features()
    -> setup_elf_hwcaps()
        -> `cap_set_elf_hwcap() `

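In short, the relevant registers are read and the matching bits are set in compat_elf_hwcap and compat_elf_hwcap2; see the source for details.

[Figure: the main correspondences]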

/sys/devices/system/cpu

Several important nodes

  • cpuX nodes
    Call chain:

    `topology_init() [arch/arm64/kernel/setup.c]` 
        -> `register_cpu() [drivers/base/cpu.c]` 
            -> `device_register() [drivers/base/core.c]` 
                -> `device_add() [drivers/base/core.c]`

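    device_add() creates the corresponding node under sysfs.

  • kernel_max node

  • possible node

  • present node

  • online node
    Main source: drivers/base/cpu.c
    Main structure: cpu_root_attrs
    The nodes are registered through subsys_system_register() and their contents are printed from the corresponding bitmap (cpumask); see the source for details.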

/sys/devices/system/node

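  • nodeX nodes
  • nodeX/cpumap nodes
  • nodeX/cpulist nodes
    These are created much like the CPU nodes.
    The corresponding structure is node_dev_bin_attrs.
    The nodes are registered through device_register().
    Their contents are printed from the corresponding bitmap (cpumask); see the source for details.
    Call chain: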
    `topology_init() [arch/arm64/kernel/setup.c]` 
        -> `__register_one_node() [drivers/base/node.c]`
            -> `register_node() [drivers/base/node.c]`
                -> `device_register() [drivers/base/core.c]` 
                    -> `device_add() [drivers/base/core.c]`

References

  • Kernel source
  • util-linux package source