insmod analys by shakesear

Source Code Analysis of the Modutils Tools: insmod

Author: 吴晖 (Wu Hui), December 29, 2005, email: wuhui1973@21cn.com

Contents

Preface
Insmod: main function
Insmod: INSMOD_MAIN function
Insmod: config_read function
Insmod: do_read function
Insmod: build_list function
Insmod: SHELL_META macro
Insmod: OPT_LIST structure
Insmod: gen_file array
Insmod: gen_files structure
Insmod: gen_file_env function
Insmod: ETC_MODULES_CONF macro
Insmod: fgets_strip function
Insmod: strip_end function
Insmod: GLOB_LIST structure
Insmod: meta_expand function
Insmod: ME_ALL macro
Insmod: split_line function
Insmod: gen_file_conf function
Insmod: decode_list function
Insmod: search_module_path function
Insmod: config_lstmod function
Insmod: config_add function
Insmod: xftw function
Insmod: prune array
Insmod: xftw_dirent structure
Insmod: xftw_readdir function
Insmod: xftw_dir_name function
Insmod: xftw_add_dirent function
Insmod: xftw_sortdir function
Insmod: xftw_type2 function
Insmod: xftw_do_name function
Insmod: get_kernel_info function
Insmod: new_get_kernel_info function
Insmod: set_ncv_prefix function
Insmod: obj_load function
Insmod: arch_new_file function
Insmod: Elf32_hdr structure
Insmod: Elf32_Shdr structure
Insmod: obj_insert_section_load_order function
Insmod: obj_load_order_prio function
Insmod: Elf32_Sym structure
Insmod: obj_add_symbol function
Insmod: Elf32_Rel structure
Insmod: get_kernel_version function
Insmod: get_module_version function
Insmod: obj_set_symbol_compare function

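Preface

UNIX, the predecessor of Linux, is a monolithic-kernel operating system. Such a system runs efficiently, but the kernel occupies a fairly large amount of resources and, worse, every device driver has to be loaded at boot time whether it is needed or not. In addition, adding or modifying a driver means recompiling the kernel. With new devices appearing all the time, this approach has become impractical. As an alternative, Carnegie Mellon University developed Mach, a microkernel variant of UNIX. Mach showed many advanced features and offered capabilities that traditional UNIX lacks. A microkernel, however, pushes many functions that traditionally belong to the kernel, such as file systems, out to user space, where they can interact with the kernel only through IPC, and this costs some efficiency.

"Stand neither on the left nor on the right, stand in the middle": in keeping with the Chinese doctrine of the mean, Linux came up with loadable modules. Under this scheme, what belonged to the kernel stays in the kernel, but things such as file systems and device drivers can be built so that they are added and removed dynamically. They are loaded when needed and unloaded when not, which keeps the kernel as small as possible at all times. At first glance this looks like the dynamic libraries we write, but it is not the same. A dynamic library exports variables and functions for outside use, but it normally does not rely on variables or functions defined by the program that loads it. File systems and device drivers obviously cannot avoid using functions in the kernel. The kernel is in fact the part that resembles a dynamic library: it does not use variables and functions exported by modules (it cannot predict them); instead it exports a set of variables and functions for modules to use. The catch is that the kernel must be an executable program; waiting to be called, as a dynamic library does, is out of the question. This is less of a problem than it sounds: it is enough for the kernel to have this one library-like property. A module, for its part, exports variables and functions to modules loaded after it, and uses variables and functions exported by the kernel and by modules already loaded. The file that holds a module must therefore be relocatable, which is why modules use the ELF file format. ELF stands for Executable and Linkable Format. A possible layout of an ELF file is shown in the figure below.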


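In the figure above, the program header is meaningless for a .o object file (a module file is exactly such a file). When an executable or a library is run, on the other hand, the system generally looks only at the program header, because it already carries enough information. The entries of the program header are what we usually call segments (a process image is organized by segments; see the ELF loading code in binfmt_elf.c in the Linux source). Note also that the contents of a segment may overlap the contents of sections (and usually do); segments and sections are simply two ways of describing the program's data. For a .o object file the program header is generally absent, because external symbols have not been resolved and the file has not been linked with other object files, so the memory layout of the process cannot be determined yet. In a .o object file, therefore, all information is kept in sections. Depending on what they hold, sections are classified as string sections, symbol sections, data sections, code sections, the uninitialized data section (.bss), relocation sections, comment sections, Global Offset Table sections, and so on. The most important of these are the symbol section and the relocation sections. The relationship between the two is shown in the figure below.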
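The distinction between a relocatable object and an executable is visible in the ELF header itself. The following is a minimal sketch, not code from modutils: it assumes a 32-bit little-endian header (the i386 case), and the names demo_le16 and demo_is_plain_object are made up for illustration. The field offsets (e_type at byte 16, e_phnum at byte 44, header size 52) follow the ELF32 header layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* e_type value for a relocatable object, from the ELF specification. */
#define DEMO_ET_REL 1

/* Read a little-endian 16-bit field from a byte buffer. */
static uint16_t demo_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Return 1 if the 52-byte ELF32 header in hdr describes a relocatable
 * object that carries no program headers, 0 otherwise.
 * (Hypothetical helper, for illustration only.)
 */
int demo_is_plain_object(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 52)
        return 0;                               /* too short for an ELF32 header */
    if (memcmp(hdr, "\177ELF", 4) != 0)
        return 0;                               /* wrong magic */
    if (hdr[4] != 1 /* ELFCLASS32 */ || hdr[5] != 1 /* ELFDATA2LSB */)
        return 0;
    return demo_le16(hdr + 16) == DEMO_ET_REL   /* e_type */
        && demo_le16(hdr + 44) == 0;            /* e_phnum */
}
```

A caller would read the first 52 bytes of a module file and pass them in; an executable fails the e_type test, and a file with program headers fails the e_phnum test.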


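The symbol section holds all the symbols of the module, both those the module defines and those it references from outside. It is an array of fixed-size entries, each holding the information about one symbol. The most important piece of information is the symbol's value. For a symbol that has not been allocated yet (a common symbol), the value is the alignment requirement; for an undefined external reference the address can only be determined at link time, and nothing about it is known at compile time. For a symbol defined in the file, the value is the offset from the start of the section that holds the symbol itself (not the section that holds the symbol information) to the symbol's position, which is essentially the symbol's relative address. A relocation section holds the relocation information for the symbols concerned, together with the offset of each relocation site from the start of the section it lives in; every entry in a relocation section refers to one symbol. Starting from a relocation section one can find the section containing every site that needs relocating, and from there the symbol information. The relocation operation recorded in the entry then tells how to compute and set the symbol's absolute address. To export symbols to modules, the kernel places them in a dedicated section (__ksymtab; modules do the same, and a separate __kallsyms section can carry a full symbol table for debugging) and provides a system call, sys_query_module (in v2.1.x and later), through which the symbols exported by the kernel and by other loaded modules can be queried. Even with these conveniences, loading a module is not a light task (the work resembles linking several .o files). Dedicated tools exist for it: the modutils package. Modutils contains several tools: insmod, rmmod, depmod, modprobe, lsmod, kerneld, ksyms and kallsyms. Of these, insmod loads a module and rmmod unloads one. For reasons of time and space only insmod is discussed here; the others will be added later as time permits. insmod was chosen because its work resembles that of GNU ld: once it is understood, the principle of ld is largely understood too, along with the ELF file format.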
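The address arithmetic described above can be made concrete with a small sketch. It is not taken from insmod; it only illustrates the two most common i386 relocation types (R_386_32 computes S + A, R_386_PC32 computes S + A - P, as defined in the i386 ELF supplement). All names beginning with demo_ are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric values match R_386_32 and R_386_PC32. */
enum demo_rel_type { DEMO_R_386_32 = 1, DEMO_R_386_PC32 = 2 };

/* Absolute address of a defined symbol: load address of its section
 * plus the offset stored in the symbol's value field. */
uint32_t demo_symbol_address(uint32_t section_base, uint32_t st_value)
{
    return section_base + st_value;
}

/*
 * Compute the 32-bit word to store at a relocation site.
 *   sym_addr  (S) absolute address of the referenced symbol
 *   addend    (A) value already stored at the site (REL-style relocation)
 *   site_addr (P) absolute address of the site being patched
 */
uint32_t demo_apply_reloc(enum demo_rel_type type, uint32_t sym_addr,
                          uint32_t addend, uint32_t site_addr)
{
    switch (type) {
    case DEMO_R_386_32:
        return sym_addr + addend;               /* S + A */
    case DEMO_R_386_PC32:
        return sym_addr + addend - site_addr;   /* S + A - P */
    }
    assert(!"unknown relocation type");
    return 0;
}
```

The unsigned arithmetic wraps modulo 2^32, which is what a PC-relative relocation with a negative addend relies on.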


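Although the insmod code is quite long, its logic is not complicated. The basic flow is shown in the figure below; its boxes, in order, are:

1. Process the command-line arguments and set the program options (in insmod these mainly set the filters used when searching for modules)
2. Set the module search directories
3. Parse the configuration file
4. Determine the module path and open the module file
5. Get the symbols exported by the kernel
6. Load the module file
7. Check that the module version matches the kernel
8. Bind the referenced symbols exported by the kernel
9. Allocate resources for referenced external symbols and static symbols
10. Check the module's run-time parameter settings
11. Parse the module's run-time parameters
12. Decision: do the module's run-time parameters use a file?
13. Check that the file name and program settings are valid
14. Parse the module run-time parameters in the file
15. Add the kallsyms section
16. Compute the module size and create the module in the kernel
17. Decision: do the module's run-time parameters use a file?
18. Check that the kernel supports this feature
19. Complete symbol relocation
20. Complete the kallsyms section
21. Initialize the kernel module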

Insmod: main function

The entry point of insmod is in insmod.c, under ./modutils-2.4.0/insmod/.

1944  /* For common 3264 code, only compile main in the 64 bit version. */
1945  #if defined(COMMON_3264) && defined(ONLY_32)
1946  /* Use the main in the 64 bit version */
1947  #else
1948  /* This mainline looks at the name it was invoked under, checks that the name
1949   * contains exactly one of the possible combined targets and invokes the
1950   * corresponding handler for that function.
1951   */
1952  int main(int argc, char **argv)
1953  {
1954      /* List of possible program names and the corresponding mainline routines */
1955      static struct {
1956          char *name;
1957          int (*handler)(int, char **);
1958      } mains[] = {
1959          { "insmod", &insmod_main },
1960  #ifdef COMBINE_modprobe
1961          { "modprobe", &modprobe_main },
1962  #endif
1963  #ifdef COMBINE_rmmod
1964          { "rmmod", &rmmod_main },
1965  #endif
1966  #ifdef COMBINE_ksyms
1967          { "ksyms", &ksyms_main },
1968  #endif
1969  #ifdef COMBINE_lsmod
1970          { "lsmod", &lsmod_main },
1971  #endif
1972  #ifdef COMBINE_kallsyms
1973          { "kallsyms", &kallsyms_main },
1974  #endif
1975      };
1976  #define MAINS_NO (sizeof(mains)/sizeof(mains[0]))
1977      static int mains_match;
1978      static int mains_which;
1979      char *p = strrchr(argv[0], '/');
1980      char error_id1[2048] = "The ";    /* Way oversized */
1981      char error_id2[2048] = "";        /* Way oversized */
1982      int i;
1983
1984      p = p ? p + 1 : argv[0];
1985      for (i = 0; i < MAINS_NO; ++i) {
1986          if (i) {
1987              xstrcat(error_id1, "/", sizeof(error_id1));
1988              if (i == MAINS_NO-1)
1989                  xstrcat(error_id2, " or ", sizeof(error_id2));
1990              else
1991                  xstrcat(error_id2, ", ", sizeof(error_id2));
1992          }
1993          xstrcat(error_id1, mains[i].name, sizeof(error_id1));
1994          xstrcat(error_id2, mains[i].name, sizeof(error_id2));
1995          if (strstr(p, mains[i].name)) {
1996              ++mains_match;
1997              mains_which = i;
1998          }
1999      }
2000
2001      /* Finish the error identifiers */
2002      if (MAINS_NO != 1)
2003          xstrcat(error_id1, " combined", sizeof(error_id1));
2004      xstrcat(error_id1, " binary", sizeof(error_id1));
2005
2006      if (mains_match == 0 && MAINS_NO == 1)
2007          ++mains_match;    /* Not combined, any name will do */
2008      if (mains_match == 0) {
2009          error("%s does not have a recognisable name, "
2010              "the name must contain one of %s.",
2011              error_id1, error_id2);
2012          return(1);
2013      }
2014      else if (mains_match > 1) {
2015          error("%s has an ambiguous name, it must contain %s%s.",
2016              error_id1, MAINS_NO == 1 ? "" : "exactly one of ", error_id2);
2017          return(1);
2018      }
2019      else
2020          return((mains[mains_which].handler)(argc, argv));
2021  }
2022  #endif /* defined(COMMON_3264) && defined(ONLY_32) */

When a command such as insmod -x ... is entered, insmod_main is called. Its definition goes through a macro:

1416  #if defined(COMMON_3264) && defined(ONLY_32)
1417  #define INSMOD_MAIN insmod_main_32    /* 32 bit version */
1418  #elif defined(COMMON_3264) && defined(ONLY_64)
1419  #define INSMOD_MAIN insmod_main_64    /* 64 bit version */
1420  #else
1421  #define INSMOD_MAIN insmod_main       /* Not common code */
1422  #endif
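The name matching that main performs before it reaches a handler can be sketched in a few lines. This is an illustration only, not the modutils code: the table demo_names and the function demo_pick_applet are hypothetical, and the real table is built from the COMBINE_* macros and maps names to the *_main handlers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical applet table standing in for mains[]. */
static const char *const demo_names[] = { "insmod", "modprobe", "rmmod" };
#define DEMO_NAMES_NO (sizeof(demo_names) / sizeof(demo_names[0]))

/*
 * Return the index of the single applet whose name occurs in the base
 * name of argv0, -1 if none matches, -2 if more than one matches.
 */
int demo_pick_applet(const char *argv0)
{
    const char *base;
    size_t i;
    int matches = 0, which = -1;

    assert(argv0 != NULL);
    base = strrchr(argv0, '/');
    base = base ? base + 1 : argv0;     /* strip any directory part */
    for (i = 0; i < DEMO_NAMES_NO; ++i) {
        if (strstr(base, demo_names[i])) {
            ++matches;
            which = (int)i;
        }
    }
    if (matches == 0)
        return -1;
    if (matches > 1)
        return -2;
    return which;
}
```

Because the test is a substring match on the base name only, a binary installed as insmod.static still selects the insmod handler, while a name containing two applet names is rejected as ambiguous.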

Whatever the architecture, execution always ends up in the INSMOD_MAIN function, which is in insmod.c.

Insmod: INSMOD_MAIN function

1424  int INSMOD_MAIN(int argc, char **argv)
1425  {
1426      int k_version;
1427      int k_crcs;
1428      char k_strversion[STRVERSIONLEN];
1429      struct option long_opts[] = {
1430          {"force", 0, 0, 'f'},
1431          {"help", 0, 0, 'h'},
1432          {"autoclean", 0, 0, 'k'},
1433          {"lock", 0, 0, 'L'},
1434          {"map", 0, 0, 'm'},
1435          {"noload", 0, 0, 'n'},
1436          {"probe", 0, 0, 'p'},
1437          {"poll", 0, 0, 'p'},    /* poll is deprecated, remove in 2.5 */
1438          {"quiet", 0, 0, 'q'},
1439          {"root", 0, 0, 'r'},
1440          {"syslog", 0, 0, 's'},
1441          {"kallsyms", 0, 0, 'S'},
1442          {"verbose", 0, 0, 'v'},
1443          {"version", 0, 0, 'V'},
1444          {"noexport", 0, 0, 'x'},
1445          {"export", 0, 0, 'X'},
1446          {"noksymoops", 0, 0, 'y'},
1447          {"ksymoops", 0, 0, 'Y'},
1448
1449          {"persist", 1, 0, 'e'},
1450          {"name", 1, 0, 'o'},
1451          {"blob", 1, 0, 'O'},
1452          {"prefix", 1, 0, 'P'},
1453          {0, 0, 0, 0}
1454      };
1455      char *m_name = NULL;
1456      char *blob_name = NULL;       /* Save object as binary blob */
1457      int m_version;
1458      ElfW(Addr) m_addr;
1459      unsigned long m_size;
1460      int m_crcs;
1461      char m_strversion[STRVERSIONLEN];
1462      char *filename;
1463      char *persist_name = NULL;    /* filename to hold any persistent data */
1464      int fp;
1465      struct obj_file *f;
1466      struct obj_section *kallsyms = NULL, *archdata = NULL;
1467      int o;
1468      int noload = 0;
1469      int dolock = 1;               /*Note: was: 0; */
1470      int quiet = 0;
1471      int exit_status = 1;
1472      int force_kallsyms = 0;
1473      int persist_parms = 0;        /* does module have persistent parms? */
1474      int i;
1475
1476      error_file = "insmod";
1477
1478      /* To handle repeated calls from combined modprobe */
1479      errors = optind = 0;
1480
1481      /* Process the command line. */
1482      while ((o = getopt_long(argc, argv, "fhkLmnpqrsSvVxXyYe:o:O:P:R:",
1483                  &long_opts[0], NULL)) != EOF)
1484          switch (o) {
1485          case 'f':    /* force loading */
1486              flag_force_load = 1;
1487              break;
1488          case 'h':    /* Print the usage message. */
1489              insmod_usage();
1490              break;
1491          case 'k':    /* module loaded by kerneld, auto-cleanable */
1492              flag_autoclean = 1;
1493              break;
1494          case 'L':    /* protect against recursion. */
1495              dolock = 1;
1496              break;
1497          case 'm':    /* generate load map */
1498              flag_load_map = 1;
1499              break;
1500          case 'n':    /* don't load, just check */
1501              noload = 1;
1502              break;
1503          case 'p':    /* silent probe mode */
1504              flag_silent_probe = 1;
1505              break;
1506          case 'q':    /* Don't print unresolved symbols */
1507              quiet = 1;
1508              break;
1509          case 'r':    /* allow root to load non-root modules */
1510              root_check_off = !root_check_off;
1511              break;
1512          case 's':    /* start syslog */
1513              setsyslog("insmod");
1514              break;
1515          case 'S':    /* Force kallsyms */
1516              force_kallsyms = 1;
1517              break;
1518          case 'v':    /* verbose output */
1519              flag_verbose = 1;
1520              break;
1521          case 'V':
1522              fputs("insmod version " MODUTILS_VERSION "\n",
1523                  stderr);
1524              break;
1525          case 'x':    /* do not export externs */
1526              flag_export = 0;
1527              break;
1528          case 'X':    /* do export externs */
1529              flag_export = 1;
1530              break;
1531          case 'y':    /* do not define ksymoops symbols */
1532              flag_ksymoops = 0;
1533              break;
1534          case 'Y':    /* do define ksymoops symbols */
1535              flag_ksymoops = 1;
1536              break;
1537          case 'e':    /* persistent data filename */
1538              free(persist_name);
1539              persist_name = xstrdup(optarg);
1540              break;
1541          case 'o':    /* name the output module */
1542              m_name = optarg;
1543              break;
1544          case 'O':    /* save the output module object */
1545              blob_name = optarg;
1546              break;
1547          case 'P':    /* use prefix on crc */
1548              set_ncv_prefix(optarg);
1549              break;
1550          default:
1551              insmod_usage();
1552              break;
1553          }
1554
1555      if (optind >= argc) {
1556          insmod_usage();
1557      }
1558      filename = argv[optind++];

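The code above declares the variables and processes the command-line arguments. The meaning of each option is given in the comments, and its use becomes clear in the code that follows. One point worth noting is the call set_ncv_prefix(optarg) on line 1548; this function is also in insmod.c. It lets the user set the prefix used for version identification. To keep the kernel compatible with the modules it uses and to avoid errors, Linux tags kernels and modules of different versions with a prefix or suffix (the author is not certain of the exact scheme), so that at load time the kernel can tell from the tag whether a module was built for the right version.

1561      if (config_read(0, NULL, "", NULL) < 0) {
1562          error("Failed handle configuration");
1563      }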

The modutils tools are frighteningly configurable; the configuration syntax comes close to a small language. In insmod, however, most configuration items have no effect. The only one used is prune, which serves as a filter when module path names are looked up. The function config_read is the key to reading the modutils configuration file. It is in ./modutils-2.4.0/util/config.c.

Insmod: config_read function

1359  int config_read(int all, char *force_ver, char *base_dir, char *conf_file)
1360  {
1361      int r;
1362      if (modpath != NULL)
1363          return 0;    /* already initialized */
1364
1365      if (uname(&uts_info) < 0) {
1367          error("Failed to find kernel name information");
1368          return -1;
1369      }
1370
1371      r = do_read(all, force_ver, base_dir, conf_file, 0);
1372
1373      if (quick && !r && !need_update (force_ver, base_dir))
1374          exit (0);
1375
1376      return r;
1377  }

The function do_read is in the same file.

Insmod: do_read function

475  /*
476   * Read the configuration file.
477   * If parameter "all" == 0 then ignore everything except path info
478   * Return -1 if any error.
479   * Error messages generated.
480   */
481  static int do_read(int all, char *force_ver, char *base_dir, char *conf_file, int depth)
482  {
483  #define MAX_LEVEL 20
484      FILE *fin;
485      GLOB_LIST g;
486      int i;
487      int assgn;
488      int drop_default_paths = 1;
489      int lineno = 0;
490      int ret = 0;
491      int state[MAX_LEVEL + 1];    /* nested "if" */
492      int level = 0;
493      char buf[3000];
494      char tmpline[100];
495      char **pathp;
496      char *envpath;
497      char *version;
498      char *type;
499      char **glb;
500      char old_name[] = "/etc/conf.modules";
501      int conf_file_specified = 0;
502
503      /*
504       * The configuration file is optional.
505       * No error is printed if it is missing.
506       * If it is missing the following content is assumed.
507       *
508       * path[boot]=/lib/modules/boot
509       *
510       * path[toplevel]=/lib/modules/`uname -r`
511       *
512       * path[toplevel]=/lib/modules/`kernelversion`
513       * (where kernelversion gives the major kernel version: "2.0", "2.2"...)
514       *
515       * path[toplevel]=/lib/modules/default
516       *
517       * path[kernel]=/lib/modules/kernel
518       * path[fs]=/lib/modules/fs
519       * path[net]=/lib/modules/net
520       * path[scsi]=/lib/modules/scsi
521       * path[block]=/lib/modules/block
522       * path[cdrom]=/lib/modules/cdrom
523       * path[ipv4]=/lib/modules/ipv4
524       * path[ipv6]=/lib/modules/ipv6
525       * path[sound]=/lib/modules/sound
526       * path[fc4]=/lib/modules/fc4
527       * path[video]=/lib/modules/video
528       * path[misc]=/lib/modules/misc
529       * path[pcmcia]=/lib/modules/pcmcia
530       * path[atm]=/lib/modules/atm
531       * path[usb]=/lib/modules/usb
532       * path[ide]=/lib/modules/ide
533       * path[ieee1394]=/lib/modules/ieee1394
534       * path[mtd]=/lib/modules/mtd
535       *
536       * The idea is that modprobe will look first if the
537       * modules are compiled for the current release of the kernel.
538       * If not found, it will look for modules that fit for the
539       * general kernelversion (2.0, 2.2 and so on).
540       * If still not found, it will look into the default release.
541       * And if still not found, it will look in the other directories.
542       *
543       * The strategy should be like this:
544       * When you install a new linux kernel, the modules should go
545       * into a directory related to the release (version) of the kernel.
546       * Then you can do a symlink "default" to this directory.
547       *
548       * Each time you compile a new kernel, the make modules_install
549       * will create a new directory, but it won't change thee default.
550       *
551       * When you get a module unrelated to the kernel distribution
552       * you can place it in one of the last three directory types.
553       *
554       * This is the default strategy. Of course you can overide
555       * this in /etc/modules.conf.
556       *
557       * 2.3.15 added a new file tree walk algorithm which made it possible to
558       * point at a top level directory and get the same behaviour as earlier
559       * versions of modutils. 2.3.16 takes this one stage further, it
560       * removes all the individual directory names from most of the scans,
561       * only pointing at the top level directory. The only exception is the
562       * last ditch scan, scanning all of /lib/modules would be a bad idea(TM)
563       * so the last ditch scan still runs individual directory names under
564       * /lib/modules.
565       *
567       * Additional syntax:
568       *
569       * [add] above module module1 ...
570       * Specify additional modules to pull in on top of a module
571       *
572       * [add] below module module1 ...
573       * Specify additional modules needed to be able to load a module
574       *
575       * [add] prune filename ...
576       *
577       * [add] probe name module1 ...
578       * When "name" is requested, modprobe tries to install each
579       * module in the list until it succeeds.
580       *
581       * [add] probeall name module1 ...
582       * When "name" is requested, modprobe tries to install all
583       * modules in the list.
584       * If any module is installed, the command has succeeded.
585       *
586       * [add] options module option_list
587       *
588       * For all of the above, the optional "add" prefix is used to
589       * add to a list instead of replacing the contents.
590       *
591       * include FILE_TO_INCLUDE
592       * This does what you expect. No limitation on include levels.
593       *
594       * persistdir=persist_directory
595       * Name the directory to save persistent data from modules.
596       *
597       * In the following WORD is a sequence if non-white characters.
598       * If ' " or ` is found in the string, all characters up to the
599       * matching ' " or ` will also be included, even whitespace.
600       * Every WORD will then be expanded w.r.t. meta-characters.
601       * If the expanded result gives more than one word, then only
602       * the first word of the result will be used.
603       *
604       *
605       * define CODE WORD
606       * Do a putenv("CODE=WORD")
607       *
608       * EXPRESSION below can be:
609       * WORD compare_op WORD
610       * where compare_op is one of == != < <= >= >
611       * The string values of the WORDs are compared
612       * or
613       * -n WORD compare_op WORD
614       * where compare_op is one of == != < <= >= >
615       * The numeric values of the WORDs are compared
616       * or
617       * WORD
618       * if the expansion of WORD fails, or if the
619       * expansion is "0" (zero), "false" or "" (empty)
620       * then the expansion has the value FALSE.
621       * Otherwise the expansion has the value TRUE
622       * or
623       * -f FILENAME
624       * Test if the file FILENAME exists
625       * or
626       * -k
627       * Test if "autoclean" (i.e. called from the kernel)
628       * or
629       * ! EXPRESSION
630       * A negated expression is also an expression
631       *
632       * if EXPRESSION
633       * any config line
634       * ...
635       * elseif EXPRESSION
636       * any config line
637       * ...
638       * else
639       * any config line
640       * ...
641       * endif
642       *
643       * The else and elseif keywords are optional.
644       * "if"-statements nest up to 20 levels.
645       */


Please read the comment above carefully. The modutils configuration file is powerful, isn't it? If not everything is clear yet, don't worry; it will become clear further on.

646      state[0] = 1;
647
648      if (force_ver)
649          version = force_ver;
650      else
651          version = uts_info.release;
652
653      config_version = xstrdup(version);

Here force_ver is NULL, and uts_info has already been filled in by config_read (see line 1365 of config_read). The code above obtains the appropriate version information and saves it in config_version.

655      /* Only read the default entries on the first file */
656      if (depth == 0) {
657          maxpath = 100;
658          modpath = (struct PATH_TYPE*)xmalloc(maxpath*sizeof(struct PATH_TYPE));
659          nmodpath = 0;
660
661          maxexecs = 10;
662          execs = (struct EXEC_TYPE*)xmalloc(maxexecs*sizeof(struct EXEC_TYPE));
663          nexecs = 0;
664
665          /*
666           * Build predef options
667           */
668          if (all && optlist[0])
669              n_opt_list = build_list(optlist, &opt_list, version, 1);
670
671          /*
672           * Build predef above
673           */
674          if (all && above[0])
675              n_abovelist = build_list(above, &abovelist, version, 0);
676
677          /*
678           * Build predef below
679           */
680          if (all && below[0])
681              n_belowlist = build_list(below, &belowlist, version, 0);
682
683          /*
684           * Build predef prune list
685           */
686          if (prune[0])
687              n_prunelist = build_list(prune, &prunelist, version, 0);
688
689          /*
690           * Build predef aliases
691           */
692          if (all && aliaslist[0])
693              n_aliases = build_list(aliaslist, &aliases, version, 0);


The arrays optlist, above, below, prune and aliaslist are all defined in ./modutils/util/alias.h and hold predefined rules. In insmod, all of them except aliaslist are "empty". The function build_list, in the same file as do_read, builds the lists for these predefined rules.

Insmod: build_list function

380  static int build_list(char **in, OPT_LIST **out, char *version, int opts)
381  {
382      GLOB_LIST *pg;
383      int i;
384
385      for (i = 0; in[i]; ++i) {
386          char *p = xstrdup(in[i]);
387          char *pt = next_word(p);
388          char *pn = p;
389
390          *out = (OPT_LIST *)xrealloc(*out, (i + 2) * sizeof(OPT_LIST));
391          (*out)[i].autoclean = 1;
392          if (opts && !strcmp (p, "-k")) {
393              pn = pt;
394              pt = next_word(pn);
395              (*out)[i].autoclean = 0;
396          }
397          pg = (GLOB_LIST *)xmalloc(sizeof(GLOB_LIST));
398          meta_expand(pt, pg, NULL, version, ME_ALL);
399          (*out)[i].name = xstrdup(pn);
400          (*out)[i].opts = pg;
401          free(p);
402      }
403      memset(&(*out)[i], 0, sizeof(OPT_LIST));
404
405      return i;
406  }

This code is quite simple. Each input entry has the form "name value" or "-k name value"; after parsing, the entries are stored in the out table. If -k is given, the autoclean field of that entry is set to 0 instead of the default 1. The flag is later set in the module, where it controls whether the module may be unloaded automatically. In addition, the predefined entries may use shell meta characters such as $ and *, which are interpreted as they are in the shell. SHELL_META defines these meta characters, in ./modutils-2.4.0/include/util.h.

Insmod: SHELL_META macro

31  #define SHELL_META "&();|<>$`\"'\\!{}[]~=+:?*" /* Sum of bj0rn and Debian */

OPT_LIST is defined in ./modutils-2.4.0/include/config.h:

Insmod: OPT_LIST structure

51  typedef struct {
52      char *name;
53      GLOB_LIST *opts;
54      int autoclean;
55  } OPT_LIST;

Next, the module search directories are set up.
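Before moving on, the entry format that build_list accepts can be illustrated with a small stand-alone sketch. It is not the modutils implementation: demo_next_word is a simplified stand-in for next_word (it only splits on whitespace and ignores quoting), the structure demo_entry is hypothetical, and the meta-character expansion done by meta_expand is left out entirely.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

struct demo_entry {
    char *name;      /* first word after an optional -k */
    char *rest;      /* remaining text, not meta-expanded here */
    int autoclean;   /* 1 by default, 0 when the entry starts with -k */
};

/* Terminate the first word of s in place and return the start of the rest. */
static char *demo_next_word(char *s)
{
    while (*s && !isspace((unsigned char)*s))
        ++s;
    if (*s)
        *s++ = '\0';
    while (isspace((unsigned char)*s))
        ++s;
    return s;
}

/* Split one predefined entry in place; line must be writable. */
void demo_parse_entry(char *line, int opts, struct demo_entry *out)
{
    char *name = line;
    char *rest;

    assert(line != NULL && out != NULL);
    rest = demo_next_word(line);
    out->autoclean = 1;
    if (opts && strcmp(name, "-k") == 0) {
        name = rest;                    /* the real name follows -k */
        rest = demo_next_word(name);
        out->autoclean = 0;
    }
    out->name = name;
    out->rest = rest;
}
```

As in build_list, the -k prefix is honoured only when the opts argument is non-zero, which in do_read is the case for the options list alone.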

696          /* Order and priority is now: (MODPATH + modules.conf) || (predefs + modules.conf) */
697          if ((envpath = getenv("MODPATH")) != NULL && !safemode) {
698              size_t len;
699              char *p;
700              char *path;
701
702              /* Make a copy so's we can mung it with strtok. */
703              len = strlen(envpath) + 1;
704              p = alloca(len);
705              memcpy(p, envpath, len);
706
707              path = alloca(PATH_MAX);
708              for (p = strtok(p, ":"); p != NULL; p = strtok(NULL, ":")) {
709                  len = snprintf(path, PATH_MAX, p, version);
710                  modpath[nmodpath].path = xstrdup(path);
711                  if ((type = strrchr(path, '/')) != NULL)
712                      type += 1;
713                  else
714                      type = "misc";
715                  modpath[nmodpath].type = xstrdup(type);
716                  if (++nmodpath >= maxpath) {
717                      maxpath += 100;
718                      modpath = (struct PATH_TYPE *)xrealloc(modpath,
719                          maxpath * sizeof(struct PATH_TYPE));
720                  }
721              }
722          } else {
723              /*
724               * Build the default "path[type]" configuration
725               */
726              int n;
727              char *k;
728
729              /* The first entry in the path list */
730              modpath[nmodpath].type = xstrdup("boot");
731              snprintf(tmpline, sizeof(tmpline), "%s/lib/modules/boot", base_dir);
732              modpath[nmodpath].path = xstrdup(tmpline);
733              ++nmodpath;
734
735              /* The second entry in the path list, `uname -r` */
736              modpath[nmodpath].type = xstrdup("toplevel");
737              snprintf(tmpline, sizeof(tmpline), "%s/lib/modules/%s", base_dir, version);
738              modpath[nmodpath].path = xstrdup(tmpline);
739              ++nmodpath;
740
741              /* The third entry in the path list, `kernelversion` */
742              modpath[nmodpath].type = xstrdup("toplevel");
743              for (n = 0, k = version; *k; ++k) {
744                  if (*k == '.' && ++n == 2)
745                      break;
746              }
747              snprintf(tmpline, sizeof(tmpline), "%s/lib/modules/%.*s", base_dir,
748                  (/* typecast for Alpha */ int)(k - version), version);
749              modpath[nmodpath].path = xstrdup(tmpline);
750              ++nmodpath;
751
752              /* The rest of the entries in the path list */
753              for (pathp = tbpath; *pathp; ++pathp) {
754                  char **type;
755
756                  for (type = tbtype; *type; ++type) {
757                      char path[PATH_MAX];
758
759                      snprintf(path, sizeof(path), "%s%s/%s", base_dir, *pathp, *type);
760                      if (meta_expand(path, &g, NULL, version, ME_ALL))
761                          return -1;
762
763                      for (glb = g.pathv; glb && *glb; ++glb) {
764                          modpath[nmodpath].type = xstrdup(*type);
765                          modpath[nmodpath].path = *glb;
766                          if (++nmodpath >= maxpath) {
767                              maxpath += 100;
768                              modpath = (struct PATH_TYPE *)xrealloc(modpath,
769                                  maxpath * sizeof(struct PATH_TYPE));
770                          }
771                      }
772                  }
773              }
774          }

Note that safemode is defined as a global variable in config.c. insmod never assigns it a value, so it is 0 here.

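Because safemode is 0, if the MODPATH environment variable is set its value is used to build the module paths; otherwise the default paths are used, as described in the comment at the start of the function. The code shows that the paths in the environment variable are separated by ":". The type of each path is the text after its last "/"; a path that contains no "/" at all is classified as "misc". When the default paths are used, insmod passes an empty string as base_dir (see the config_read call on line 1561), so the resulting paths are exactly the defaults listed in the comment at the head of the function.

776          /* Environment overrides for testing only, undocumented */
777          for (i = 0; i < gen_file_count; ++i)
778              gen_file_env(gen_file+i);
779
780      } /* End of depth == 0 */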
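Looking back at the default path construction, the third entry (the kernelversion path) is the least obvious: it keeps only the part of the release string before the second dot. The sketch below isolates that step. The function name demo_kernelversion_path is made up for illustration, and the buffer handling is simplified compared with do_read.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<base_dir>/lib/modules/<major.minor>" into out, where
 * major.minor is the part of release before its second '.'.
 * Returns the value reported by snprintf.
 */
int demo_kernelversion_path(char *out, size_t outlen,
                            const char *base_dir, const char *release)
{
    const char *k;
    int dots = 0;

    assert(out != NULL && base_dir != NULL && release != NULL);
    for (k = release; *k; ++k) {
        if (*k == '.' && ++dots == 2)
            break;                      /* k now points at the second dot */
    }
    /* "%.*s" prints at most (k - release) characters of release. */
    return snprintf(out, outlen, "%s/lib/modules/%.*s",
                    base_dir, (int)(k - release), release);
}
```

With an empty base_dir, which is what insmod passes, a release such as 2.4.0-test9 yields /lib/modules/2.4. A release with fewer than two dots is used whole.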

The gen_file_env loop at the end of the depth == 0 block uses the gen_file array, which is defined in the same file.

Insmod: gen_file array

112  /* The initialization order must match the gen_file_enum order in config.h */
113  struct gen_files gen_file[] = {
114      {"generic_string", NULL, 0},
115      {"pcimap", NULL, 0},
116      {"isapnpmap", NULL, 0},
117      {"usbmap", NULL, 0},
118      {"parportmap", NULL, 0},
119      {"dep", NULL, 0},
120  };

The gen_files structure is defined in ./modutils-2.4.0/include/config.h:

Insmod: gen_files structure

79  /* Information about generated files */
80  struct gen_files {
81      char *base;      /* xxx in /lib/modules/`uname -r`/modules.xxx */
82      char *name;      /* name actually used */
83      time_t mtime;
84  };
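The base names in gen_file are turned into environment variable names by gen_file_env, which is described next. The naming rule on its own can be sketched as follows; demo_env_name is a hypothetical helper that uses plain malloc where modutils uses xmalloc, and it only builds the name without looking it up in the environment.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated string holding base in upper case followed by
 * "PATH" (for example "dep" gives "DEPPATH"). The caller frees the result.
 * Returns NULL if memory cannot be allocated.
 */
char *demo_env_name(const char *base)
{
    size_t n, i;
    char *e;

    assert(base != NULL);
    n = strlen(base);
    e = malloc(n + 5);                  /* base + "PATH" + terminating NUL */
    if (e == NULL)
        return NULL;
    for (i = 0; i < n; ++i)
        e[i] = (char)toupper((unsigned char)base[i]);
    strcpy(e + n, "PATH");
    return e;
}
```

A caller would pass the result to getenv and, if the variable is set, use its value as the file name in place of the default.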

The function gen_file_env is also in ./modutils-2.4.0/util/config.c.

Insmod: gen_file_env function

408  /* Environment variables can override defaults, testing only */
409  static void gen_file_env(struct gen_files *gf)
410  {
411      if (!safemode) {
412          char *e = xmalloc(strlen(gf->base)+5), *p1 = gf->base, *p2 = e;
413          while ((*p2++ = toupper(*p1++))) ;    /* safe, xmalloc */
414          strcpy(p2-1, "PATH");
415          if ((p2 = getenv(e)) != NULL) {
416              free(gf->name);
417              gf->name = xstrdup(p2);
418          }
419          free(e);
420      }
421  }

What this function does is plain: it converts the base string of a gen_files entry to upper case and appends "PATH"; if an environment variable of that name exists, its value is assigned to the entry's name field. According to the comment on the structure, these entries name the generated files /lib/modules/`uname -r`/modules.xxx, for example modules.dep and modules.pcimap.

782      if (conf_file ||
783          ((conf_file = getenv("MODULECONFIG")) != NULL && *conf_file && !safemode)) {
784          if (!(fin = fopen(conf_file, "r"))) {
785              error("Can't open %s", conf_file);
786              return -1;
787          }
788          conf_file_specified = 1;
789      } else {
790          if (!(fin = fopen((conf_file = ETC_MODULES_CONF), "r"))) {
791              /* Fall back to non-standard name */
792              if ((fin = fopen((conf_file = old_name), "r"))) {
793                  fprintf(stderr,
794                      "Warning: modutils is reading from %s because\n"
795                      " %s does not exist. The use of %s is\n"
796                      " deprecated, please rename %s to %s\n"
797                      " as soon as possible. Command\n"
798                      " mv %s %s\n",
799                      old_name, ETC_MODULES_CONF,
800                      old_name, old_name, ETC_MODULES_CONF,
801                      old_name, ETC_MODULES_CONF);
802              }
803              /* So what... use the default configuration */
804          }
805      }

If a conf_file pointer was passed to config_read, or the MODULECONFIG environment variable is set, the file named by conf_file (the configuration file) is opened and read here. Otherwise the file named by ETC_MODULES_CONF is opened and read. ETC_MODULES_CONF is defined in ./modutils-2.4.0/include/config.h as:

Insmod: ETC_MODULES_CONF macro

31  #define ETC_MODULES_CONF "/etc/modules.conf"

If "/etc/modules.conf" cannot be opened, the file named by old_name is tried. old_name was defined earlier, on line 500, as "/etc/conf.modules". Presumably this is the configuration file path that earlier Linux systems used.

The next part of do_read covers lines 807 to 854:

    if (fin) {
        struct stat statbuf1, statbuf2;
        if (fstat(fileno(fin), &statbuf1) == 0)
            config_mtime = statbuf1.st_mtime;
        config_file = xstrdup(conf_file);    /* Save name actually used */
        if (!conf_file_specified &&
            stat(ETC_MODULES_CONF, &statbuf1) == 0 &&
            stat(old_name, &statbuf2) == 0) {
            /* Both /etc files exist */
            if (statbuf1.st_dev == statbuf2.st_dev &&
                statbuf1.st_ino == statbuf2.st_ino) {
                if (lstat(ETC_MODULES_CONF, &statbuf1) == 0 &&
                    S_ISLNK(statbuf1.st_mode))
                    fprintf(stderr,
                        "Warning: You do not need a link from %s to\n"
                        " %s. The use of %s is deprecated,\n"
                        " please remove %s and rename %s\n"
                        " to %s as soon as possible. Commands.\n"
                        " rm %s\n"
                        " mv %s %s\n",
                        ETC_MODULES_CONF, old_name, old_name,
                        ETC_MODULES_CONF, old_name, ETC_MODULES_CONF,
                        ETC_MODULES_CONF, old_name, ETC_MODULES_CONF);
                else {
    #ifndef NO_WARN_ON_OLD_LINK
                    fprintf(stderr,
                        "Warning: You do not need a link from %s to\n"
                        " %s. The use of %s is deprecated,\n"
                        " please remove %s as soon as possible. Command\n"
                        " rm %s\n",
                        old_name, ETC_MODULES_CONF, old_name, old_name, old_name);
    #endif
                }
            }
            else
                fprintf(stderr,
                    "Warning: modutils is reading from %s and\n"
                    " ignoring %s. The use of %s is deprecated,\n"
                    " please remove %s as soon as possible. Command\n"
                    " rm %s\n",
                    ETC_MODULES_CONF, old_name, old_name, old_name, old_name);
        }
    }

This code tests whether "/etc/modules.conf" and "/etc/conf.modules" both exist, or whether one is a symbolic link to the other, and prints a warning if so.

856      /*
857       * Finally, decode the file
858       */
859      while (fin && fgets_strip(buf, sizeof(buf) - 1, fin, &lineno) != NULL) {
860          char *arg2;
861          char *parm = buf;
862          char *arg;
863          int one_err = 0;
864          int adding;
865
866          while (isspace(*parm))
867              parm++;
868
869          if (strncmp(parm, "add", 3) == 0) {
870              adding = 1;
871              parm += 3;
872              while (isspace(*parm))
873                  parm++;
874          } else
875              adding = 0;
876
877          arg = parm;
878
879          if (*parm == '\0')
880              continue;
881
882          one_err = 1;
883
884          while (*arg > ' ' && *arg != '=')
885              arg++;
886
887          if (*arg == '=')
888              assgn = 1;
889          else
890              assgn = 0;
891          *arg++ = '\0';
892          while (isspace(*arg))
893              arg++;

The function fgets_strip is in the same file; it extracts each line of the file.
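Before looking at fgets_strip, the keyword split performed at the top of the decode loop can be tried in isolation. The following is a sketch under stated assumptions, not the modutils code: the structure demo_line and the function demo_split_config_line are hypothetical, and unlike the listing this version does not step past the terminating NUL when the keyword is the last thing on the line.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

struct demo_line {
    int adding;   /* line started with the "add" prefix */
    int assgn;    /* keyword was followed directly by '=' */
    char *parm;   /* keyword, NUL terminated in place */
    char *arg;    /* start of the remaining text */
};

/* Split buf in place. Returns 0 for a blank line, 1 otherwise. */
int demo_split_config_line(char *buf, struct demo_line *out)
{
    char *parm = buf, *arg;

    assert(buf != NULL && out != NULL);
    while (isspace((unsigned char)*parm))
        parm++;
    out->adding = 0;
    if (strncmp(parm, "add", 3) == 0) {     /* optional "add" prefix */
        out->adding = 1;
        parm += 3;
        while (isspace((unsigned char)*parm))
            parm++;
    }
    if (*parm == '\0')
        return 0;
    arg = parm;
    while ((unsigned char)*arg > ' ' && *arg != '=')
        arg++;                              /* scan to end of keyword */
    out->assgn = (*arg == '=');
    if (*arg != '\0')
        *arg++ = '\0';                      /* cut the keyword off */
    while (isspace((unsigned char)*arg))
        arg++;
    out->parm = parm;
    out->arg = arg;
    return 1;
}
```

For a line such as "add options de620 irq=7" this yields the keyword options with the adding flag set, and for "path[net]=/lib/modules/x" it yields the keyword path[net] with the assignment flag set.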

Insmod: fgets_strip function

215  /*
216   * Read a line of a configuration file and process continuation lines.
217   * Return buf, or NULL if EOF.
218   * Blank at the end of line are always stripped.
219   * Everything on a line following comchar is a comment.
220   *
221   * Continuation character is \
222   * Comment character is #
223   */
224  char *fgets_strip(char *buf, int sizebuf, FILE * fin, int *lineno)
225  {
226      int nocomment = 1; /* No comments found ? */
227      int contline = 0;
228      char *start = buf;
229      char *ret = NULL;
230      char comchar = '#';
231      char contchar = '\\';
232
233      *buf = '\0';
234
235      while (fgets(buf, sizebuf, fin) != NULL) {
236          char *end = strip_end(buf);
237          char *pt = strchr(buf, comchar);
238
239          if (pt != NULL) {
240              nocomment = 0;
241              *pt = '\0';
242              end = strip_end(buf);
243          }
244
245          if (lineno != NULL)
246              (*lineno)++;
247          ret = start;
248          if (contline) {
249              char *pt = buf;
250
251              while (isspace(*pt))
252                  pt++;
253              if (pt > buf + 1) {
254                  strcpy(buf + 1, pt); /* safe, backward copy */
255                  buf[0] = ' ';
256                  end -= (int) (pt - buf) - 1;
257              } else if (pt == buf + 1) {
258                  buf[0] = ' ';
259              }
260          }
261          if (end > buf && *(end - 1) == contchar) {
262              if (end == buf + 1 || *(end - 2) != contchar) {
263                  /* Continuation */
264                  contline = 1;
265                  end--;
266                  *end = '\0';
267                  buf = end;
268              } else {
269                  *(end - 1) = '\0';
270                  break;
271              }
272          } else {
273              break;
274          }
275      }
276
277      return ret;
278  }
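To make the comment and continuation rules concrete, here is a minimal sketch that applies the same rules to an in-memory array of lines rather than to a FILE. The names join_logical_line and trim_end, the array interface and the 256-byte scratch buffer are assumptions made for illustration; they are not part of modutils.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Strip trailing whitespace; return a pointer to the terminating '\0'. */
static char *trim_end(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    return s + len;
}

/*
 * Hypothetical helper: build one logical line starting at lines[*idx].
 * Advances *idx past the physical lines consumed.
 * Returns out, or NULL when no lines are left.
 */
char *join_logical_line(const char *lines[], int nlines, int *idx,
                        char *out, size_t outsz)
{
    size_t used = 0;
    int continued = 0;

    if (*idx >= nlines || outsz == 0)
        return NULL;
    out[0] = '\0';

    while (*idx < nlines) {
        char piece[256];
        char *hash, *end;
        const char *src = piece;
        size_t n;
        int more = 0;

        snprintf(piece, sizeof(piece), "%s", lines[(*idx)++]);
        if ((hash = strchr(piece, '#')) != NULL)
            *hash = '\0';           /* everything after '#' is a comment */
        end = trim_end(piece);

        if (end > piece && end[-1] == '\\') {
            /* a lone trailing backslash continues the line */
            if (end == piece + 1 || end[-2] != '\\')
                more = 1;
            *--end = '\0';          /* one backslash is dropped either way */
        }

        if (continued) {
            /* leading blanks of a continuation collapse into one space */
            while (isspace((unsigned char)*src))
                src++;
            if (src > piece && used + 1 < outsz)
                out[used++] = ' ';
        }

        n = strlen(src);
        if (used + n >= outsz)
            n = outsz - 1 - used;   /* truncate rather than overflow */
        memcpy(out + used, src, n);
        used += n;
        out[used] = '\0';

        if (!more)
            break;
        continued = 1;
    }
    return out;
}
```

Calling the helper repeatedly with the same idx variable yields one logical line per call, in the same way that do_read calls fgets_strip in a loop.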


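The strip_end function called at line 236 is also in config.c; it removes whitespace from the end of a line.

Insmod: strip_end function

202  /*
203   * Strip white char at the end of a string.
204   * Return the address of the last non white char + 1 (point on the '\0').
205   */
206  static char *strip_end(char *str)
207  {
208      int len = strlen(str);
209
210      for (str += len - 1; len > 0 && (isspace(*str)); --len, --str)
211          *str = '\0';
212      return str + 1;
213  }

Lines 237-243 strip comments: everything from the first '#' to the end of the line is discarded. Line 261 tests whether the last remaining character is the continuation character '\' (the end > buf check guards against an empty line). If the line is empty or does not end in '\', the break at line 273 leaves the loop. Otherwise the if at line 262 decides: when '\' is the only character on the line, or is not immediately preceded by another '\', the line is continued. A trailing '\\' is treated as an escaped backslash, so one backslash is dropped and the loop ends. For a continued line, buf is advanced to the end of the text read so far (line 267) and control returns to the while at line 235, so the next fgets call appends the following line in place. The if block at line 248 then runs: lines 248-260 simply collapse the leading whitespace of the appended line into a single space, so that the two physical lines form one logical line.

Back in do_read, lines 869-873 check whether the directive is prefixed with add (see the comment at the start of the function). Lines 877-893 cut out parm and position arg at the start of the next argument.

The directory ./modutils-2.4.0/depmod/ contains a sample configuration file, Example.module.conf.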

1   # This is an example of additional definitions you can put in /etc/modules.conf
2   # Note that modprobe has some default aliases built in ("modprobe -c").
3   # The built-in aliases will be overridden by any definitions in this file.
4
5   keep # keep the default set of paths and _add_ the following path(s)
6   path[net]=/lib/modules/`uname -r`/some_special_directory
7
8   alias scsi_hostadapter aha1542
9   alias eth0 3c509
10  alias eth1 de620
11  options de620 irq=7 bnc=1
12  # override:
13  alias char-major-14 sound
14
15  # Conditional decoding via: if, else, elseif, endif
16  #
17  # Avoid having "path" definitions in conditional parts,
18  # unless you are _sure_ that the modules.dep file generated
19  # by depmod is always correct whenever modprobe executes.
20  #
21  # version dependence:
22  if `kernelversion` > 2.0
23  alias char-major-14 sb
24  endif
25
26  # Include another config file: include FILE
27  # in this case some additional aliases
28  if -f /etc/devfs.aliases
29  include /etc/devfs.aliases
30  endif
31
32  # Additional dependencies, "pull in"
33  above sb adlib_card
34
35  # Set parameters (also in the environment): define PARAM VALUE

The rest of this section uses this file as an example to analyse how a configuration file is parsed.

Lines 895-906:

/* * endif */ if (!assgn && strcmp(parm, "endif") == 0) { if (level > 0) --level; else { error("unmatched endif in line %d", lineno); return -1; } continue; }

Lines 907-935:

/* * else */ if (!assgn && strcmp(parm, "else") == 0) { if (level <= 0) { error("else without if in line %d", lineno); return -1; } state[level] = !state[level]; continue; } /* * elseif */ if (!assgn && strcmp(parm, "elseif") == 0) { if (level <= 0) { error("elseif without if in line %d", lineno); return -1; } if (state[level] != 0) { /* * We have already found a TRUE * if statement in this "chain". * That's what "2" means. */ state[level] = 2; continue; }

Lines 936-992:

/* else: No TRUE if has been found, cheat */ /* * The "if" handling increments level, * but this is the _same_ level as before. * So, compensate for it. */ --level; parm = "if"; /* Fallthru to "if" */ } /* * if */ if (strcmp(parm, "if") == 0) { char *cmp; int not = 0; int numeric = 0; if (level >= MAX_LEVEL) { error("Too many nested if's in line %d\n", lineno); return -1; } state[++level] = 0; /* default false */ if (*arg == '!') { not = 1; arg = next_word(arg); } if (strncmp(arg, "-k", 2) == 0) { state[level] = flag_autoclean; continue; } if (strncmp(arg, "-f", 2) == 0) { char *file = next_word(arg); meta_expand(file, &g, NULL, version, ME_ALL); if (access(g.pathc ? g.pathv[0] : file, R_OK) == 0) state[level] = !not; else state[level] = not; continue; } if (strncmp(arg, "-n", 2) == 0) { numeric = 1; arg = next_word(arg); }

cmp = next_word(arg); if (*cmp) { GLOB_LIST g2; long n1 = 0; long n2 = 0; char *w1 = "";

Lines 993-1049:

char *w2 = ""; arg2 = next_word(cmp); meta_expand(arg, &g, NULL, version, ME_ALL); if (g.pathc && g.pathv[0]) w1 = g.pathv[0]; meta_expand(arg2, &g2, NULL, version, ME_ALL); if (g2.pathc && g2.pathv[0]) w2 = g2.pathv[0]; if (numeric) { n1 = strtol(w1, NULL, 0); n2 = strtol(w2, NULL, 0); } if (strcmp(cmp, "==") == 0 || strcmp(cmp, "=") == 0) { if (numeric) state[level] = (n1 == n2); else state[level] = strcmp(w1, w2) == 0; } else if (strcmp(cmp, "!=") == 0) { if (numeric) state[level] = (n1 != n2); else state[level] = strcmp(w1, w2) != 0; } else if (strcmp(cmp, ">=") == 0) { if (numeric) state[level] = (n1 >= n2); else state[level] = strcmp(w1, w2) >= 0; } else if (strcmp(cmp, "<=") == 0) { if (numeric) state[level] = (n1 <= n2); else state[level] = strcmp(w1, w2) <= 0; } else if (strcmp(cmp, ">") == 0) { if (numeric) state[level] = (n1 > n2); else state[level] = strcmp(w1, w2) > 0; } else if (strcmp(cmp, "<") == 0) { if (numeric) state[level] = (n1 < n2); else state[level] = strcmp(w1, w2) < 0; } } else { /* Check defined value, if any */ /* undef or defined as * "" or "0" or "false" => false * defined => true */ if (!meta_expand(arg, &g, NULL, version, ME_ALL) && g.pathc > 0 && strcmp(g.pathv[0], "0") != 0 &&

Lines 1050-1106:

strcmp(g.pathv[0], "false") != 0 && strlen(g.pathv[0]) != 0) state[level] = 1; /* true */ } if (not) state[level] = !state[level]; continue; } /* * Should we bother? */ if (state[level] != 1) continue; /* * define */ if (!assgn && strcmp(parm, "define") == 0) { char env[PATH_MAX]; arg2 = next_word(arg); meta_expand(arg2, &g, NULL, version, ME_ALL); snprintf(env, sizeof(env), "%s=%s", arg, (g.pathc ? g.pathv[0] : "")); putenv(env); one_err = 0; } /* * include */ if (!assgn && strcmp(parm, "include") == 0) { meta_expand(arg, &g, NULL, version, ME_ALL); if (!do_read(all, version, base_dir, g.pathc ? g.pathv[0] : arg, depth+1)) one_err = 0; else error("include %s failed\n", arg); } /* * above */ else if (all && !assgn && strcmp(parm, "above") == 0) { decode_list(&n_abovelist, &abovelist, arg, adding, version, 0); one_err = 0; } /* * below */ else if (all && !assgn && strcmp(parm, "below") == 0) { decode_list(&n_belowlist, &belowlist, arg, adding, version, 0); one_err = 0; }

Lines 1107-1163:

/* * prune */ else if (all && !assgn && strcmp(parm, "prune") == 0) { decode_list(&n_prunelist, &prunelist, arg, adding, version, 0); one_err = 0; } /* * probe */ else if (all && !assgn && strcmp(parm, "probe") == 0) { decode_list(&n_probe_list, &probe_list, arg, adding, version, 0); one_err = 0; } /* * probeall */ else if (all && !assgn && strcmp(parm, "probeall") == 0) { decode_list(&n_probeall_list, &probeall_list, arg, adding, version, 0); one_err = 0; } /* * options */ else if (all && !assgn && strcmp(parm, "options") == 0) { decode_list(&n_opt_list, &opt_list, arg, adding, version, 1); one_err = 0; } /* * alias */ else if (all && !assgn && strcmp(parm, "alias") == 0) { /* * Replace any previous (default) definitions * for the same module */ decode_list(&n_aliases, &aliases, arg, 0, version, 0); one_err = 0; } /* * Specification: /etc/modules.conf * The format of the commands in /etc/modules.conf are: * * pre-install module command * install module command * post-install module command * pre-remove module command * remove module command * post-remove module command * * The different words are separated by tabs or spaces. */

Lines 1164-1220:

/* * pre-install */ else if (all && !assgn && (strcmp(parm, "pre-install") == 0)) { decode_exec(arg, EXEC_PRE_INSTALL); one_err = 0; } /* * install */ else if (all && !assgn && (strcmp(parm, "install") == 0)) { decode_exec(arg, EXEC_INSTALL); one_err = 0; } /* * post-install */ else if (all && !assgn && (strcmp(parm, "post-install") == 0)) { decode_exec(arg, EXEC_POST_INSTALL); one_err = 0; } /* * pre-remove */ else if (all && !assgn && (strcmp(parm, "pre-remove") == 0)) { decode_exec(arg, EXEC_PRE_REMOVE); one_err = 0; } /* * remove */ else if (all && !assgn && (strcmp(parm, "remove") == 0)) { decode_exec(arg, EXEC_REMOVE); one_err = 0; } /* * post-remove */ else if (all && !assgn && (strcmp(parm, "post-remove") == 0)) { decode_exec(arg, EXEC_POST_REMOVE); one_err = 0; } /* * insmod_opt= */ else if (assgn && (strcmp(parm, "insmod_opt") == 0)) { insmod_opt = xstrdup(arg); one_err = 0; } /*

Lines 1221-1277:

* keep */ else if (!assgn && (strcmp(parm, "keep") == 0)) { drop_default_paths = 0; one_err = 0; } /* * path...= */ else if (assgn && strncmp(parm, "path", 4) == 0) { /* * Specification: config file / path parameter * The path parameter specifies a directory to * search for modules. * This parameter may be repeated multiple times. * * Note that the actual path may be defined using * wildcards and other shell meta-chars, such as "*?`". * For example: * path[misc]=/lib/modules/1.1.5?/misc * * Optionally the path keyword carries a tag. * This tells us a little more about the purpose of * this directory and allows some automated operations. * A path is marked with a tag by adding the tag, * enclosed in square brackets, to the path keyword: *# * path[boot]=/lib/modules/boot *# * This case identifies the path a of directory * holding modules loadable a boot time. */ if (drop_default_paths) { int n; /* * Specification: config file / path / default * * Whenever there is a path[] specification * in the config file, all the default * path are reset. * * If one instead wants to _add_ to the default * set of paths, one has to have the option * keep * before the first path[]-specification line * in the configuration file. */ drop_default_paths = 0; for (n = 0; n < nmodpath; n++) { free(modpath[n].path); free(modpath[n].type); } nmodpath = 0; }

Lines 1278-1337:

            /*
             * Get (the optional) tag
             * If the tag is missing, the word "misc"
             * is assumed.
             */
            type = "misc";

            if (parm[4] == '[') {
                char *pt_type = parm + 5;

                while (*pt_type != '\0' && *pt_type != ']')
                    pt_type++;

                if (*pt_type == ']' && pt_type[1] == '\0') {
                    *pt_type = '\0';
                    type = parm + 5;
                } /* else CHECKME */
            }

            /*
             * Handle the actual path description
             */
            if (meta_expand(arg, &g, base_dir, version, ME_ALL))
                return -1;
            for (glb = g.pathv; glb && *glb; ++glb) {
                modpath[nmodpath].type = xstrdup(type);
                modpath[nmodpath].path = *glb;
                if (++nmodpath >= maxpath) {
                    maxpath += 100;
                    modpath = (struct PATH_TYPE *)xrealloc(modpath,
                        maxpath * sizeof(struct PATH_TYPE));
                }
            }
            one_err = 0;
        }

        /*
         * persistdir
         */
        else if (assgn && strcmp(parm, "persistdir") == 0) {
            meta_expand(arg, &g, NULL, version, ME_ALL);
            persistdir = xstrdup(g.pathc ? g.pathv[0] : arg);
            one_err = 0;
        }

        /* Names for generated files in config file */
        for (i = 0; one_err && i < gen_file_count; ++i)
            one_err = gen_file_conf(gen_file+i, assgn, parm, arg);

        /*
         * any errors so far?
         */
        if (all == 0)
            one_err = 0;
        else if (one_err) {
            error("Invalid line %d in %s\n\t%s",
                  lineno, conf_file, buf);
            ret = -1;
        }
    }
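The keyword and argument split that lines 866-893 perform on every logical line can be sketched as a small stand-alone function. This is an illustration under stated assumptions, not the modutils code: struct conf_line and split_conf_line are hypothetical names, and unlike the original the sketch does not step past the terminating '\0' when a keyword such as keep has no argument.

```c
#include <ctype.h>
#include <string.h>

struct conf_line {      /* hypothetical result type */
    int adding;         /* line started with "add" */
    int assgn;          /* keyword was immediately followed by '=' */
    char *parm;         /* keyword, e.g. "alias" or "path[net]" */
    char *arg;          /* remainder of the line, possibly empty */
};

/* Split buf in place. Returns 0 for an empty line, 1 otherwise. */
int split_conf_line(char *buf, struct conf_line *out)
{
    char *parm = buf;
    char *arg;

    while (isspace((unsigned char)*parm))
        parm++;

    out->adding = 0;
    if (strncmp(parm, "add", 3) == 0) {
        out->adding = 1;
        parm += 3;
        while (isspace((unsigned char)*parm))
            parm++;
    }
    if (*parm == '\0')
        return 0;

    /* the keyword ends at the first blank or '=' */
    arg = parm;
    while (*arg > ' ' && *arg != '=')
        arg++;
    out->assgn = (*arg == '=');

    if (*arg != '\0')
        *arg++ = '\0';  /* terminate the keyword */
    while (isspace((unsigned char)*arg))
        arg++;

    out->parm = parm;
    out->arg = arg;
    return 1;
}
```

For "path[net]=/lib/modules/misc" the sketch reports parm "path[net]" with assgn set, while "path[net] =/x" leaves assgn at 0 because the scan stops at the blank before it sees the '='.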

First, consider line 5 of Example.module.conf. After lines 866-893, parm="keep" and arg is empty. Control then reaches the else if block at line 1223, which sets drop_default_paths and one_err to 0. That completes line 5.

Line 6 is read next. After lines 866-893, parm="path[net]", arg="/lib/modules/`uname -r`/some_special_directory" and assgn=1. Note that there must be no whitespace between path[net] and =; otherwise the scan at line 884 stops at the blank before it reaches the '=', and assgn stays 0. Control then enters the else if block at line 1231. The comment in lines 1232-1253 explains that path may carry a tag in square brackets to help automated tools; here the tag is net. Shell meta-characters may also be used in the path, which is a powerful feature. The comment in lines 1259-1269 explains that if keep is declared before the first path statement the default paths are retained, and otherwise they are replaced. keep was declared here, so lines 1255-1277 are skipped. Lines 1286-1296 extract the tag. At line 1301, meta_expand expands the meta-characters in the path that arg points to. Its second argument, g, is a GLOB_LIST structure, defined in ./modutils-2.4.0/include/util.h.

Insmod: GLOB_LIST structure

65  /*
66   * Generic globlist <bj0rn@blox.se>
67   */
68  typedef struct {
69      int pathc;      /* Count of paths matched so far */
70      char **pathv;   /* List of matched pathnames. */
71  } GLOB_LIST;

The meta_expand function is defined in ./modutils-2.4.0/util/meta_expand.c.

Insmod: meta_expand function

152  /*
153   * Expand the string (including meta-character) to a list of matches
154   *
155   * Return 0 if OK else -1
156   */
157  int meta_expand(char *pt, GLOB_LIST *g, char *base_dir, char *version, int type)
158  {
159      FILE *fin;
160      int len = 0;
161      char *line = NULL;
162      char *p, *p1;
163      char tmpline[PATH_MAX + 1];
164      char wrk[sizeof(tmpline)];
165      char tmpcmd[2*sizeof(tmpline)+20]; /* room for /bin/echo "text" */
166
167      g->pathc = 0;
168      g->pathv = NULL;
169
170      /*
171       * Take care of version dependent expansions
172       * Needed for forced version handling
173       */

Lines 174-230:

if ((p = strchr(pt, '`')) != NULL && (type & ME_BUILTIN_COMMAND)) { do { char *s; for (s = p + 1; isspace(*s); ++s) ; if (strncmp(s, "uname -r", 8) == 0) { while (*s && (*s != '`')) ++s; if (*s == '`') { *p = '\0'; snprintf(wrk, sizeof(wrk), "%s%s%s", pt, version, s + 1); *p = '`'; } strcpy(tmpline, wrk); /* safe, same size */ pt = tmpline; } else if (strncmp(s, "kernelversion", 13) == 0) { while (*s && (*s != '`')) ++s; if (*s == '`') { int n; char *k; *p = '\0'; for (n = 0, k = version; *k; ++k) { if (*k == '.' && ++n == 2) break; } snprintf(wrk, sizeof(wrk), "%s%.*s%s", pt, /* typecast for Alpha */ (int)(k - version), version, s + 1); *p = '`'; strcpy(tmpline, wrk); /* safe, same size */ pt = tmpline; } } else break; } while ((p = strchr(pt, '`')) != NULL); } /* * Any remaining meta-chars? */ if (strpbrk(pt, SHELL_META) == NULL) { /* * No meta-chars. * Split into words, delimited by whitespace. */ snprintf(wrk, sizeof(wrk), "%s%s", (base_dir ? base_dir : ""), pt); strcpy(tmpline, wrk); /* safe, same size */

Lines 231-287:

if ((p = strtok(tmpline, " \t\n")) != NULL) { while (p) { g->pathv = (char **)xrealloc(g->pathv, (g->pathc + 2) * sizeof(char *)); g->pathv[g->pathc++] = xstrdup(p); p = strtok(NULL, " \t\n"); } } if (g->pathc) g->pathv[g->pathc] = NULL; return 0; } /* else */ /* * Handle remaining meta-chars */ /* * Just plain quotes? */ if (strpbrk(pt, "&();|<>$`!{}[]~=+:?*") == NULL && (p = strpbrk(pt, "\"'\\"))) { split_line(g, pt, 1); return 0; } if (strpbrk(pt, "&();|<>$`\"'\\!{}~+:[]~?*") == NULL) { /* Only "=" remaining, should be module options */ split_line(g, pt, 0); return 0; } /* * If there are meta-characters and * if they are only shell glob meta-characters: do globbing */ #if HAVE_WORDEXP if (strpbrk(pt, "&();|<>`\"'\\!{}~=+:") == NULL && strpbrk(pt, "$[]~?*")) #else if (strpbrk(pt, "&();|<>$`\"'\\!{}~=+:") == NULL && strpbrk(pt, "[]~?*")) #endif if ((type & ME_GLOB) && glob_it(pt, g) == 0) return 0; if (strpbrk(pt, "&();|<>$`\"'\\!{}~+:[]~?*") == NULL) { /* Only "=" remaining, should be module options */ split_line(g, pt, 0); return 0; } /* * Last resort: Use "echo". * DANGER: Applying shell expansion to user supplied input is a * major security risk. Modutils code should only do meta * expansion via shell commands for trusted data. Basically

288       * this means only for data in the config file.  Even that assumes that
289       * the user cannot run modprobe as root with their own config file.
290       * Programs (including the kernel) that invoke modprobe as root with
291       * user supplied input must pass exactly one user supplied parameter and
292       * must set safe mode.
293       */
294      if (!(type & ME_SHELL_COMMAND))
295          return 0;
296
297      snprintf(wrk, sizeof(wrk), "%s%s", (base_dir ? base_dir : ""), pt);
298      strcpy(tmpline, wrk); /* safe, same size */
299      snprintf(tmpcmd, sizeof(tmpcmd), "/bin/echo \"");
300      for (p = tmpline, p1 = tmpcmd + strlen(tmpcmd); *p; ++p, ++p1) {
301          if (*p == '"' || *p == '\\')
302              *p1++ = '\\';
303          *p1 = *p;
304      }
305      *p1++ = '"';
306      *p1++ = '\0';
307      if (p1 - tmpcmd > sizeof(tmpcmd)) {
308          error("tmpcmd overflow, should never happen");
309          exit(1);
310      }
311      if ((fin = popen(tmpcmd, "r")) == NULL) {
312          error("Can't execute: %s", tmpcmd);
313          return -1;
314      }
315      /* else */
316
317      /*
318       * Collect the result
319       */
320      while (fgets(tmpcmd, PATH_MAX, fin) != NULL) {
321          int l = strlen(tmpcmd);
322
323          line = (char *)xrealloc(line, len + l + 1);
324          line[len] = '\0';
325          strcat(line + len, tmpcmd); /* safe, realloc */
326          len += l;
327      }
328      pclose(fin);
329
330      if (line) {
331          /* shell used to strip one set of quotes.  Paranoia code in
332           * 2.3.20 stops that strip so we do it ourselves.
333           */
334          split_line(g, line, 1);
335          free(line);
336      }
337
338      return 0;
339  }

The call to meta_expand at line 1301 passes ME_ALL as type; this macro is defined in ./modutils-2.4.0/include/util.h.

Insmod: ME_ALL macro

76  #define ME_ALL (ME_GLOB|ME_SHELL_COMMAND|ME_BUILTIN_COMMAND)

With this type, execution enters the if block at line 174. The for loop at line 178 locates the text enclosed in backticks, here uname -r. Lines 182-193 replace uname -r with the version string. Lines 195-215 do the same for kernelversion, which is replaced by the x.xx part of version, that is, the major and minor version numbers. From line 224 onwards the shell meta-characters are handled. In our case execution enters the if block at line 224. base_dir, the root directory for the path, is prepended here; in insmod, base_dir is NULL. Lines 231-237 split the string into whitespace-separated words with strtok and store a copy of each word in g->pathv.

Let us look at the rest of meta_expand. Lines 251-255 handle the characters "\' which are simply the string quoting characters. Lines 257-260 handle strings whose only remaining meta-character is "=". The split_line function is in the same file.

Insmod: split_line function

47   /*
48    * Split into words delimited by whitespace,
49    * handle remaining quotes though...
50    * If strip_quotes != 0 then strip one level of quotes from the line.
51    */
52   static void split_line(GLOB_LIST *g, char *line, int strip_quotes)
53   {
54       int len;
55       char *d;
56       char *e;
57       char *p;
58       char tmpline[PATH_MAX];
59
60       for (p = line; *p; p = e) {
61           /* Skip leading whitespace */
62           while (*p && isspace(*p))
63               ++p;
64
65           /* find end of word */
66           d = tmpline;
67           for (e = p; *e && !(isspace(*e)); ++e) {
68               char match;
69
70               /* Quote handling */
71               switch (*e) {
72               case '\\':
73                   if (!strip_quotes)
74                       *d++ = *e;
75                   break;
76
77               case '"':
78               case '\'':
79                   match = *e;
80                   if (!strip_quotes)
81                       *d++ = *e;
82                   for (++e; *e && *e != match; ++e) {
83                       *d++ = *e;
84                       if (*e == '\\' && *(e + 1) == match)
85                           *d++ = *++e;
86                   }
87                   if (!strip_quotes)
88                       *d++ = *e;
89                   break;
90
91               default:
92                   *d++ = *e;
93                   break;
94               }
95           }
96
97           if ((len = (int)(d - tmpline)) > 0) {
98               char *str = xmalloc(len + 1);
99               strncpy(str, tmpline, len);
100              str[len] = '\0';
101              g->pathv = (char **)xrealloc(g->pathv,
102                  (g->pathc + 2) * sizeof(char *));
103              g->pathv[g->pathc++] = str;
104          }
105      }
106
107      if (g->pathc)
108          g->pathv[g->pathc] = NULL;
109  }

In the "=" case nothing special happens: the characters are simply copied into the buffer. Note that lines 84 and 85 handle an escaped quote inside a quoted string.
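The way meta_expand picks a strategy from the characters present in the string can be summarised as a small classifier. The sketch below mirrors the order of the strpbrk tests in the listing for the build without HAVE_WORDEXP. The enum, the function name classify_expansion and the ALL_META string are assumptions for illustration: ALL_META is taken to be the union of the character sets in the listing, because the definition of SHELL_META is not shown in this section.

```c
#include <string.h>

enum expand_kind {      /* hypothetical labels for the branches */
    EXPAND_WORDS,       /* no meta-characters: split on whitespace */
    EXPAND_QUOTES,      /* only quotes or backslashes: split_line(g, pt, 1) */
    EXPAND_ASSIGN,      /* only '=' left: split_line(g, pt, 0) */
    EXPAND_GLOB,        /* only glob characters: candidates for glob_it() */
    EXPAND_SHELL        /* anything else: /bin/echo through popen() */
};

/* Assumed stand-in for SHELL_META (union of the sets used below). */
#define ALL_META "&();|<>$`\"'\\!{}[]~=+:?*"

enum expand_kind classify_expansion(const char *s)
{
    if (strpbrk(s, ALL_META) == NULL)
        return EXPAND_WORDS;
    if (strpbrk(s, "&();|<>$`!{}[]~=+:?*") == NULL &&
        strpbrk(s, "\"'\\") != NULL)
        return EXPAND_QUOTES;
    if (strpbrk(s, "&();|<>$`\"'\\!{}~+:[]~?*") == NULL)
        return EXPAND_ASSIGN;
    if (strpbrk(s, "&();|<>$`\"'\\!{}~=+:") == NULL &&
        strpbrk(s, "[]~?*") != NULL)
        return EXPAND_GLOB;     /* the real code also requires ME_GLOB */
    return EXPAND_SHELL;        /* the real code also requires ME_SHELL_COMMAND */
}
```

In the real function a glob candidate still falls through to the shell branch when ME_GLOB is not set or glob_it fails; the classifier only shows which branch is tried first.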

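Lines 267-275 handle the shell glob characters []~?* (plus $ when HAVE_WORDEXP is defined, so that shell variables can be expanded). If that does not settle the matter, execution continues: lines 277-281 test for "=" once more. If the string contains more than "=" and globbing cannot handle it either, the only option left is to ask echo for help. Lines 297-310 build the echo command line, lines 311-313 run it through a pipe, lines 320-328 collect the output from the pipe, and lines 330-335 split that output into arguments with split_line.

Back in do_read, lines 1303-1311 store the expanded path information in modpath. That completes line 6 of Example.module.conf. Control then reaches the for loop at line 1325, but because one_err=0 the loop body is never executed. The gen_file_conf function is also in config.c.

Insmod: gen_file_conf function

423  /* Read a config option for a generated filename */
424  static int gen_file_conf(struct gen_files *gf, int assgn, const char *parm, const char *arg)
425  {
426
427      int l = strlen(gf->base);
428      if (assgn &&
429          strncmp(parm, gf->base, l) == 0 &&
430          strcmp(parm+l, "file") == 0 &&
431          !gf->name) {
432          gf->name = xstrdup(arg);
433          return(0);
434      }
435      return(1);
436  }

Clearly, this checks whether a statement of the form x=xxx defines the path of one of the gen_file entries. one_err is still 1 at this point only when no earlier branch has accepted the line, for example when assgn is 1 and parm is none of insmod_opt, path or persistdir; only then is gen_file_conf called.

Now consider line 8. insmod calls do_read with all set to 1, so control enters the else if block at line 1142. decode_list is in the same file. At this point arg points to "scsi_hostadapter".

Insmod: decode_list function

323  static void decode_list(int *n, OPT_LIST **list, char *arg, int adding,
324                          char *version, int opts)
325  {
326      GLOB_LIST *pg;
327      GLOB_LIST *prevlist = NULL;
328      int i, autoclean = 1;
329      int where = *n;
330      char *arg2 = next_word(arg);
331
332      if (opts && !strcmp (arg, "-k")) {
333          if (!*arg2)
334              error("Missing module argument after -k\n");
335          arg = arg2;
336          arg2 = next_word(arg);
337          autoclean = 0;
338      }
339
340      for (i = 0; i < *n; ++i) {
341          if (strcmp((*list)[i].name, arg) == 0) {
342              if (adding)
343                  prevlist = (*list)[i].opts;
344              else
345                  free((*list)[i].opts);
346              (*list)[i].opts = NULL;
347              where = i;
348              break;
349          }
350      }
351      if (where == *n) {
352          (*list) = (OPT_LIST *)xrealloc((*list),
353              (*n + 2) * sizeof(OPT_LIST));
354          (*list)[*n].name = xstrdup(arg);
355          (*list)[*n].autoclean = autoclean;
356          *n += 1;
357          memset(&(*list)[*n], 0, sizeof(OPT_LIST));
358      } else if (!autoclean)
359          (*list)[where].autoclean = 0;
360      pg = (GLOB_LIST *)xmalloc(sizeof(GLOB_LIST));
361      meta_expand(arg2, pg, NULL, version, ME_ALL);
362      (*list)[where].opts = addlist(prevlist, pg);
363  }

With the code above as background, this function is quite simple. It first handles the -k option and then, depending on whether adding is set, appends to or replaces the matching entry. That completes line 8 of Example.module.conf. Lines 9, 10, 11 and 13 are handled in the same way and are skipped here. The `kernelversion` expansion needed for line 22 has already been covered, so we jump straight to line 28. After the same preprocessing, parm = if and arg = -f. Control enters the if block at line 950 of do_read; at this point level=0. arg = -f asks for a file-existence test, so line 973 fetches the file path and line 974 runs it through meta_expand to resolve any meta-characters, as already discussed. The access function then tests whether the file exists. We assume here that it does. Note that level is now 1. The code further down in this if block handles the comparison operators and negation; it is straightforward and is not discussed further.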
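The bookkeeping behind if, elseif, else and endif in the do_read listing can be isolated into a few small functions. The sketch below is an assumption-laden illustration: struct cond_stack and the cond_* names are hypothetical, MAX_LEVEL is set to 20 only as a placeholder because its value is not shown in this section, and state[0] is initialised to 1 on the assumption that top-level lines must pass the state[level] != 1 test at line 1063.

```c
#define MAX_LEVEL 20    /* placeholder value, not taken from the listing */

struct cond_stack {
    int level;
    /* 0 = false, 1 = true, 2 = an earlier branch of this chain was true */
    int state[MAX_LEVEL + 1];
};

void cond_init(struct cond_stack *c)
{
    c->level = 0;
    c->state[0] = 1;    /* assumed: lines outside any "if" are active */
}

/* "if": open a new level holding the value of the condition. */
int cond_if(struct cond_stack *c, int value)
{
    if (c->level >= MAX_LEVEL)
        return -1;      /* too many nested ifs */
    c->state[++c->level] = value ? 1 : 0;
    return 0;
}

/* "elseif": either mark the chain as done, or re-enter as a fresh "if". */
int cond_elseif(struct cond_stack *c, int value)
{
    if (c->level <= 0)
        return -1;      /* elseif without if */
    if (c->state[c->level] != 0) {
        c->state[c->level] = 2;
        return 0;
    }
    --c->level;         /* compensate for the increment in cond_if */
    return cond_if(c, value);
}

/* "else": invert the state; 2 becomes 0, so a finished chain stays off. */
int cond_else(struct cond_stack *c)
{
    if (c->level <= 0)
        return -1;      /* else without if */
    c->state[c->level] = !c->state[c->level];
    return 0;
}

int cond_endif(struct cond_stack *c)
{
    if (c->level <= 0)
        return -1;      /* unmatched endif */
    --c->level;
    return 0;
}

/* Mirrors the "Should we bother?" test: only state 1 executes lines. */
int cond_active(const struct cond_stack *c)
{
    return c->state[c->level] == 1;
}
```

Like the listing, the sketch looks only at the innermost level when deciding whether a line is active; it makes no attempt to consult the enclosing levels.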

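Next comes line 29, the statement inside the if block. Note that at line 1063 state[level]=1 (we assumed the file exists), so the continue is not taken and the line is processed. Note also that line 646 sets state[0] to 1 when the function first runs, which is what allows lines outside any if block to pass this test. Execution proceeds to line 1082, where meta_expand is again applied first, this time to the path of the include file. Note that if the path contains no meta-characters, meta_expand merely splits it on whitespace, so g.pathc is 1 and g.pathv[0] holds a copy of the path; arg itself is used only if the expansion produces nothing. The path is then passed to do_read as conf_file in a recursive call, which goes through the same long process and eventually returns.

The endif statement is handled in lines 897-906 and is very simple. While we are here, let us look at lines 910-945, which handle else and elseif. else is easy: state[level] is inverted, and state[level] then takes care of how the lines in the else block are treated. elseif takes a little more thought. If an earlier condition in the chain was true, state[level] is set to 2 so that chains of three or more branches work: the value 2 still satisfies the test at line 927, so later elseif branches are skipped, and it fails the test at line 1063, so their bodies are skipped as well. If, on the other hand, no earlier condition was true, parm is set to if and the statement is handled by the if code. It is a rather neat implementation.

The final directive in Example.module.conf looks different but is in fact also handled by decode_list, so we will not go through it. That completes the work of do_read, and config_read returns as well. We continue with INSMOD_MAIN.

1565     if (persist_name && !*persist_name &&
1566         (!persistdir || !*persistdir)) {
1567         free(persist_name);
1568         persist_name = NULL;
1569         if (flag_verbose)
1570             lprintf("insmod: -e \"\" ignored, no persistdir");
1571     }
1572
1573     if (m_name == NULL) {
1574         size_t len;
1575         char *p;
1576
1577         if ((p = strrchr(filename, '/')) != NULL)
1578             p++;
1579         else
1580             p = filename;
1581         len = strlen(p);
1582         if (len > 2 && p[len - 2] == '.' && p[len - 1] == 'o')
1583             len -= 2;
1584         else if (len > 4 && p[len - 4] == '.' && p[len - 3] == 'm'
1585                  && p[len - 2] == 'o' && p[len - 1] == 'd')
1586             len -= 4;
1587 #ifdef CONFIG_USE_ZLIB
1589         else if (len > 5 && !strcmp(p + len - 5, ".o.gz"))
1590             len -= 5;
1591 #endif
1592
1593         m_name = xmalloc(len + 1);
1594         memcpy(m_name, p, len);
1595         m_name[len] = '\0';
1596     }

Lines 1565-1571: if persist_name is an empty string and no persistdir has been configured, the buffer is freed and the option is ignored (systems code has to be this frugal). If no output module name was given, lines 1573-1596 derive one. The rule is simple: take the input file name without its directory part and strip the suffix.

1597     /* Locate the file to be loaded. */
1598     if (!strchr(filename, '/') && !strchr(filename, '.')) {
1599         char *tmp = search_module_path(filename);
1600         if (tmp == NULL) {
1601             error("%s: no module by that name found", filename);
1602             return 1;
1603         }
1604         filename = tmp;
1605         lprintf("Using %s", filename);
1606     } else if (flag_verbose)
1607         lprintf("Using %s", filename);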
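The name derivation in lines 1573-1596 is easy to try out in isolation. The sketch below follows the same rule; module_name_from_path is a hypothetical name, plain malloc stands in for xmalloc, and the ".o.gz" case, which the original compiles only under CONFIG_USE_ZLIB, is always enabled here.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: derive a module name from a file name by
 * dropping the directory part and a ".o", ".mod" or ".o.gz" suffix.
 * Returns a malloc'ed string the caller must free, or NULL.
 */
char *module_name_from_path(const char *filename)
{
    const char *p = strrchr(filename, '/');
    size_t len;
    char *name;

    p = p ? p + 1 : filename;       /* skip the directory part */
    len = strlen(p);

    if (len > 2 && strcmp(p + len - 2, ".o") == 0)
        len -= 2;
    else if (len > 4 && strcmp(p + len - 4, ".mod") == 0)
        len -= 4;
    else if (len > 5 && strcmp(p + len - 5, ".o.gz") == 0)
        len -= 5;

    name = malloc(len + 1);
    if (name == NULL)
        return NULL;
    memcpy(name, p, len);
    name[len] = '\0';
    return name;
}
```

With this rule, "/lib/modules/2.4.0/net/3c509.o" and a bare "3c509" both give the module name "3c509".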

If only a bare module name was given, search_module_path has to look up the module's path. This function is in ./modutils-2.4.0/util/config.c.

Insmod: search_module_path function

1566 /* Given a bare module name, poke through the module path to find the file. */
1567 char *search_module_path(const char *base)
1568 {
1569     GLOB_LIST *g;
1570
1571     if (config_read(0, NULL, "", NULL) < 0)
1572         return NULL;
1573     /* else */
1574     g = config_lstmod(base, NULL, 1);
1575     if (g == NULL || g->pathc == 0) {
1576         char base_o[PATH_MAX];
1577
1578         snprintf(base_o, sizeof(base_o), "%s.o", base);
1579         g = config_lstmod(base_o, NULL, 1);
1580 #ifdef CONFIG_USE_ZLIB
1581         if (g == NULL || g->pathc == 0) {
1582             snprintf(base_o, sizeof(base_o), "%s.o.gz", base);
1583             g = config_lstmod(base_o, NULL, 1);
1584         }
1585 #endif
1586     }
1587     if (g == NULL || g->pathc == 0)
1588         return NULL;
1589     /* else */
1590     return g->pathv[0];
1591 }

The function first calls config_read, presumably to build modpath. However, INSMOD_MAIN already made the same call with the same arguments at line 1560. The repeat is probably there because this function is also called by the modinfo tool, but it still feels wasteful. Next comes config_lstmod, which is in the same file.

Insmod: config_lstmod function

1485 /*
1486  * Find modules matching the name "match" in directory of type "type"

Lines 1487-1543:

* (type == NULL matches all) * * Return a pointer to the list of modules found (or NULL if error). * Update the counter (sent as parameter). */ GLOB_LIST *config_lstmod(const char *match, const char *type, int first_only) { /* * Note: * There are _no_ wildcards remaining in the path descriptions! */ struct stat sb; int i; int ret = 0; char *path = NULL; char this[PATH_MAX]; if (!match) match = "*"; one_only = first_only; found = 0; filter_by_file = match; filter_by_dir = NULL; if (type) { char tmpdir[PATH_MAX]; snprintf(tmpdir, sizeof(tmpdir), "/%s/", type); filter_by_dir = xstrdup(tmpdir); } /* In safe mode, the module name is always handled as is, without meta * expansion. It might have come from an end user via kmod and must * not be trusted. Even in unsafe mode, only apply globbing to the * module name, not command expansion. We trust config file input so * applying command expansion is safe, we do not trust command line input. * This assumes that the only time the user can specify -C config file * is when they run under their own authority. In particular all * mechanisms that call modprobe as root on behalf of the user must * run in safe mode, without letting the user supply a config filename. */ meta_expand_type = 0; if (strpbrk(match, SHELL_META) && strcmp(match, "*") && !safemode) meta_expand_type = ME_GLOB|ME_BUILTIN_COMMAND; list = (char **)xmalloc((favail = 100) * sizeof(char *)); for (i = 0; i < nmodpath; i++) { path = modpath[i].path; /* Special case: insmod: handle single, non-wildcard match */ if (first_only && strpbrk(match, SHELL_META) == NULL) { /* Fix for "2.1.121 syntax */ snprintf(this, sizeof(this), "%s/%s/%s", path, modpath[i].type, match); if (stat(this, &sb) == 0 && config_add(this, &sb)) break; /* End fix for "2.1.121 syntax */ snprintf(this, sizeof(this), "%s/%s", path, match);

Lines 1544-1564:

            if (stat(this, &sb) == 0 && config_add(this, &sb))
                break;
        }

        /* Start looking */
        if ((ret = xftw(path, config_add))) {
            break;
        }
    }

    if (ret >= 0) {
        GLOB_LIST *g = (GLOB_LIST *)xmalloc(sizeof(GLOB_LIST));
        g->pathc = found;
        g->pathv = list;
        free(filter_by_dir);
        return g;
    }
    free(list);
    free(filter_by_dir);
    return NULL;
}
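The special case for a single, wildcard-free match in config_lstmod tries two candidate paths for each modpath entry before it falls back to the tree walk. A minimal sketch of that lookup order follows. candidate_paths and first_regular_file are hypothetical names, and the sketch checks only that a candidate is a regular file, whereas the real config_add applies further checks such as ownership and duplicate filtering.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Build the two candidates tried for one module directory. */
void candidate_paths(const char *dir, const char *type, const char *name,
                     char *with_type, char *without_type, size_t size)
{
    snprintf(with_type, size, "%s/%s/%s", dir, type, name);   /* path/type/match */
    snprintf(without_type, size, "%s/%s", dir, name);         /* path/match */
}

/* Return the first candidate that names a regular file, or NULL. */
const char *first_regular_file(const char *first, const char *second)
{
    struct stat sb;

    if (stat(first, &sb) == 0 && S_ISREG(sb.st_mode))
        return first;
    if (stat(second, &sb) == 0 && S_ISREG(sb.st_mode))
        return second;
    return NULL;    /* the real code would now start the xftw walk */
}
```

For the directory "/lib/modules/2.4.0" with type "net" and name "3c509.o", the candidates are "/lib/modules/2.4.0/net/3c509.o" and then "/lib/modules/2.4.0/3c509.o".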

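When the function is called here, match=filename, type=NULL and first_only=1. Start with the comment at line 1515: in safe mode the file name is never subjected to meta-character expansion, because it cannot be trusted. Even outside safe mode, only globbing is applied to a module name from the command line, never command expansion. modutils applies full expansion only to meta-characters in the configuration file. Lines 1525-1527 set the expansion rules according to this principle. modpath, used at line 1531, is a global variable that config_read has already filled in. It lists the directories to search for each class of module; see the comment at the start of do_read. When only the first match is wanted and the module name contains no meta-characters, line 1536 builds the candidate path path/type/match, in the same way that line 759 of do_read builds the search paths for the various module classes. If line 1538 obtains the file status successfully, config_add is called. It is in the same file.

Insmod: config_add function

1390 /*
1391  * Add a file name if it exist
1392  */
1393 static int config_add(const char *file, const struct stat *sb)
1394 {
1395     int i;
1396     int npaths = 0;
1397     char **paths = NULL;
1398
1399     if (meta_expand_type) {
1400         GLOB_LIST g;
1401         char **p;
1402         char full[PATH_MAX];
1403
1404         snprintf(full, sizeof(full), "%s/%s", file, filter_by_file);
1405
1406         if (filter_by_dir && !strstr(full, filter_by_dir))
1407             return 0;
1408
1409         if (meta_expand(full, &g, NULL, config_version, meta_expand_type))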

Lines 1410-1466:

return 1; for (p = g.pathv; p && *p; ++p) { paths = (char **)xrealloc(paths, (npaths + 1) * sizeof(char *)); paths[npaths++] = *p; } } else { /* normal path match or match with "*" */ if (!S_ISREG(sb->st_mode)) return 0; if (strcmp(filter_by_file, "*")) { char *p; if ((p = strrchr(file, '/')) == NULL) p = (char *)file; else p += 1; if (strcmp(p, filter_by_file)) return 0; } if (filter_by_dir && !strstr(file, filter_by_dir)) return 0; paths = (char **)xmalloc(sizeof(char **)); *paths = xstrdup(file); npaths = 1; } for (i = 0; i < npaths; ++i) { struct stat sbuf; if (S_ISDIR(sb->st_mode)) { if (stat(paths[i], &sbuf) == 0) sb = &sbuf; } if (S_ISREG(sb->st_mode) && sb->st_mode & S_IRUSR) { int j; char **this; if (!root_check_off) { if (sb->st_uid != 0) { error("%s is not owned by root", paths[i]); continue; } } /* avoid duplicates */ for (j = 0, this = list; j < found; ++j, ++this) { if (strcmp(*this, paths[i]) == 0) { free(paths[i]); goto next; } } list[found] = paths[i]; if (++found >= favail) list = (char **)xrealloc(list,

Lines 1467-1483:

                    (favail += 100) * sizeof(char *));
            if (one_only) {
                for (j = i + 1; j < npaths; ++j)
                    free(paths[j]);
                free(paths);
                return 1; /* finish xftw */
            }
        }
    next:
    }

    if (npaths > 0)
        free(paths);
    return 0;
}

Here meta_expand_type is 0. filter_by_file is simply match and contains no meta-characters at all, wildcards (*) included. file is the string this that was built in config_lstmod, assumed here to be path/type/match. Execution reaches line 1416 of config_add: if the path is not a regular file, this lookup fails. Otherwise the last component of the path is compared with filter_by_file to decide whether the file has been found. If path/type/match is not found, path/match is tried next. If that fails too, xftw has to do the work, which is another laborious process. If the file is found, its path is passed all the way back to INSMOD_MAIN.

xftw is in ./modutils-2.4.0/util/xftw.c. A long comment at the top of that file explains where this family of functions comes from; it is quoted below.

Insmod: xftw function

24  /*
25      modutils requires special processing during the file tree walk
26      of /lib/modules/<version> and any paths that the user specifies.
27      The standard ftw() does a blind walk of all paths and can end
28      up following the build symlink down the kernel source tree.
29      Although nftw() has the option to get more control such as not
30      automatically following symbolic links, even that is not enough
31      for modutils.  The requirements are:
32
33      Paths must be directories or symlinks to directories.
34
35      Each directory is read and sorted into alphabetical order
36      before processing.
37
38      A directory is type 1 iff it was specified on a path statement
39      (either explicit or default) and the directory contains a
40      subdirectory with one of the known names and the directory name
41      does not end with "/kernel".  Otherwise it is type 2.
42
43      In a type 1 directory, walk the kernel subdirectory if it exists,
44      then the old known names in their historical order then any
45      remaining directory entries in alphabetical order and finally any
46      non-directory entries in alphabetical order.
47

- 43 -

48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84

Entries in a type 1 directory are filtered against the "prune" list. A type 1 directory can contain additional files which are not modules nor symlinks to modules. The prune list skips known additional files, if a distribution wants to store additional text files in the top level directory they should be added to the prune list. A type 2 directory must contain only modules or symlinks to modules. They are processed in alphabetical order, without pruning. Symlinks to directories are an error in type 2 directories. The user function is not called for type 1 directories, nor for pruned entries. It is called for type 2 directories and their contents. It is also called for any files left in a type 1 directory after pruning and processing type 2 subdirectories. The user function never sees symlinks, they are resolved before calling the function. Why have different directory types? The original file tree walk was not well defined. Some users specified each directory individually, others just pointed at the top level directory. Either version worked until the "build" symlink was added. Now users who specify the top level directory end up running the entire kernel source tree looking for modules, not nice. We cannot just ignore symlinks because pcmcia uses symlinks to modules for backwards compatibility. Type 1 is when a user specifies the top level directory which needs special processing, type 2 is individual subdirectories. But the only way to tell the difference is by looking at the contents. The "/kernel" directory introduced in 2.3.12 either contains nothing (old make modules_install) or contains all the kernel modules using the same tree structure as the source. Because "/kernel" can contain old names but is really a type 2 directory, it is detected as a special case. */ 这里面提到，一些应用程序会指定具体的路径，一些则只指定最上层的目录，在没有引入build

符号链接前（build符号链接是引用内核头文件最可靠的方法，它会自动定位到正确的版本）， linux原有的ftw和nftw函数都能工作，但是引入build符号链接后，指定最上层目录的应用程序将会遍历整个内核查找模块，在效率上不够好。因此，在这里区分出了2种目录。一般来说，类型1对应于指定上层目录的情况。因为它包含非模块文件或不指向模块的符号链接，因此需要特殊对待——使用prune list过滤。遍历的次序是：内核子目录、按历史排序的已知的目录、按字母排序的目录、按字母排序的非目录项。类型1的目录遍历时，只有在完成prune list过滤后，才对剩余的文件使用用户提供的函数（类似于ftw中的fn参数）。类型2的目录仅包含模块或指向模块的符号链接，一般来说，这就包括以/kernel结尾的目录路径。它们按字母顺序被依次处理，而且不使用prune list过滤，并且自始至终使用用户提供的函数进行处理。
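The type 1 / type 2 decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions: classify_directory, ends_with_kernel and the shortened known_names table are hypothetical names, not modutils code, and the sketch ignores the "specified on a path statement" condition mentioned in the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shortened, illustrative stand-in for the table of known names. */
static const char *const known_names[] = { "kernel", "fs", "net", "misc", NULL };

/* Return 1 if the path ends in "/kernel", ignoring one trailing slash. */
static int ends_with_kernel(const char *directory)
{
    size_t len = strlen(directory);

    if (len > 0 && directory[len - 1] == '/')
        --len;
    return len >= 7 && strncmp(directory + len - 7, "/kernel", 7) == 0;
}

/*
 * Classify a directory as type 1 or type 2.
 * subdirs holds the names of its immediate subdirectories.
 */
int classify_directory(const char *directory,
                       const char *const *subdirs, size_t nsubdirs)
{
    size_t i, j;

    assert(directory != NULL);
    if (ends_with_kernel(directory))
        return 2;                       /* "/kernel" is always type 2 */
    for (i = 0; i < nsubdirs; ++i)
        for (j = 0; known_names[j] != NULL; ++j)
            if (strcmp(subdirs[i], known_names[j]) == 0)
                return 1;               /* found a known subdirectory */
    return 2;
}
```

With these assumptions, a directory such as /lib/modules/2.4.0 that contains a net subdirectory is reported as type 1, while the same contents under a path ending in /kernel are reported as type 2.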


Both kinds of directory obey two common rules:
1) A path must be a directory or a symlink to a directory.
2) A directory is sorted into alphabetical order before it is processed.

Now back to xftw, the modutils counterpart of ftw. First a look at the prune list. It is defined in ./modutils-2.4.0/util/alias.h.

Insmod: prune array

221 /*
222  * This is the list of pre-defined "prune"s,
223  * used to exclude paths from scan of /lib/modules.
224  * /etc/modules.conf can add entries but not remove them.
225  */
226 char *prune[] =
227 {
228     "modules.dep",
229     "modules.generic_string",
230     "modules.pcimap",
231     "modules.isapnpmap",
232     "modules.usbmap",
233     "modules.parportmap",
234     "System.map",
235     ".config",
236     "build",        /* symlink to source tree */
237     "vmlinux",
238     "vmlinuz",
239     "bzImage",
240     "zImage",
241     ".rhkmvtag",    /* wish RedHat had told me before they did this */
242     NULL            /* marks the end of the list! */
243 };

The xftw function itself follows.

310 /* Only external visible function. Decide on the type of directory and
311  * process accordingly.
312  */
313 int xftw(const char *directory, xftw_func_t funcptr)
314 {
315     struct stat statbuf;
316     int ret, i, j, type;
317     xftw_tree_t *t;
318     struct xftw_dirent *c;
319 
320     verbose("xftw starting at %s ", directory);
321     if (lstat(directory, &statbuf)) {
322         verbose("lstat on %s failed\n", directory);
323         return(0);
324     }
325     if (S_ISLNK(statbuf.st_mode)) {
326         char real[PATH_MAX];
327         verbose("resolving symlink to ");
328         if (!(directory = realpath(directory, real))) {
329             if (errno == ENOENT) {
330                 verbose("%s: does not exist, dangling symlink ignored\n", real);
331                 return(0);
332             }


333             perror("... failed");
334             return(-1);
335         }
336         verbose("%s ", directory);
337         if (lstat(directory, &statbuf)) {
338             error("lstat on %s failed ", directory);
339             perror("");
340             return(-1);
341         }
342     }
343     if (!S_ISDIR(statbuf.st_mode)) {
344         error("%s is not a directory\n", directory);
345         return(-1);
346     }
347     verbose("\n");
348 
349     /* All returns after this point must be via cleanup */
350 
351     if ((ret = xftw_readdir(directory, 0)))
352         goto cleanup;
353 
354     t = tree; /* depth 0 */
355     type = 2;
356     for (i = 0 ; type == 2 && i < t->used; ++i) {
357         c = t->contents+i;
358         for (j = 0; tbtype[j]; ++j) {
359             if (strcmp(c->name, tbtype[j]) == 0 &&
360                 S_ISDIR(c->statbuf.st_mode)) {
361                 const char *p = directory + strlen(directory) - 1;
362                 if (*p == '/')
363                     --p;
364                 if (p - directory >= 6 && strncmp(p-6, "/kernel", 7) == 0)
365                     continue; /* "/kernel" path is a special case, type 2 */
366                 type = 1; /* known subdirectory */
367                 break;
368             }
369         }
370     }
371     if (type == 1) {
372         OPT_LIST *p;
373         /* prune entries in type 1 directories only */
374         for (i = 0 ; i < t->used; ++i) {
375             for (p = prunelist; p->name; ++p) {
376                 c = t->contents+i;
377                 if (strcmp(p->name, c->name) == 0) {
378                     verbose("pruned %s\n", c->name);
379                     *(c->name) = '\0'; /* ignore */
380                 }
381             }
382         }
383         /* run known subdirectories first in historical order,
384            "kernel" is now top of list */
385         for (i = 0 ; i < t->used; ++i) {
386             c = t->contents+i;
387             for (j = 0; tbtype[j]; ++j) {
388                 if (*(c->name) &&
389                     strcmp(c->name, tbtype[j]) == 0 &&


390                     S_ISDIR(c->statbuf.st_mode)) {
391                     if ((ret = xftw_type2(directory, c->name, 1, funcptr)))
392                         goto cleanup;
393                     *(c->name) = '\0'; /* processed */
394                 }
395             }
396         }
397         /* any other directories left, in alphabetical order */
398         for (i = 0 ; i < t->used; ++i) {
399             c = t->contents+i;
400             if (*(c->name) &&
401                 S_ISDIR(c->statbuf.st_mode)) {
402                 if ((ret = xftw_type2(directory, c->name, 1, funcptr)))
403                     goto cleanup;
404                 *(c->name) = '\0'; /* processed */
405             }
406         }
407         /* anything else is passed to the user function */
408         for (i = 0 ; i < t->used; ++i) {
409             c = t->contents+i;
410             if (*(c->name)) {
411                 verbose("%s found in type 1 directory %s\n", c->name, directory);
412                 if ((ret = xftw_do_name(directory, c->name, &(c->statbuf), funcptr)))
413                     goto cleanup;
414                 *(c->name) = '\0'; /* processed */
415             }
416         }
417     }
418     else {
419         /* type 2 */
420         xftw_free_tree(0);
421         if ((ret = xftw_type2(directory, NULL, 0, funcptr)))
422             goto cleanup;
423     }
424 
425     /* amazing, it all worked */
426     ret = 0;
427 cleanup:
428     for (i = 0; i < XFTW_MAXDEPTH; ++i)
429         xftw_free_tree(i);
430     return(ret);
431 }

With the comment above in mind, this function is not hard to follow. It first builds a sorted tree from the result of reading the directory. The tree is defined in the same file.

Insmod: xftw_dirent structure

104 struct xftw_dirent {
105     struct stat statbuf;
106     char *name;
107     char *fullname;
108 };
109 
110 #define XFTW_MAXDEPTH 64 /* Maximum directory depth handled */
111 
112 typedef struct {
113     struct xftw_dirent *contents;


114     int size;
115     int used;
116 } xftw_tree_t;
117 
118 static xftw_tree_t tree[XFTW_MAXDEPTH];

The index into the tree array corresponds to the directory depth. The purpose of lines 321 to 342 of xftw is plain: a symlink is converted to the file or directory it really points at, and only a directory is acceptable here. Once the real path of the directory is known, xftw_readdir is called. This function is in the same file.

Insmod: xftw_readdir function

217 /* Read a directory and sort it, ignoring "." and ".." */
218 static int xftw_readdir(const char *directory, int depth)
219 {
220     DIR *d;
221     struct dirent *ent;
222     verbose("xftw_readdir %s\n", directory);
223     if (!(d = opendir(directory))) {
224         perror(directory);
225         return(1);
226     }
227     while ((ent = readdir(d))) {
228         char *name;
229         struct xftw_dirent *f;
230         if (strcmp(ent->d_name, ".") == 0 ||
231             strcmp(ent->d_name, "..") == 0)
232             continue;
233         name = xftw_dir_name(directory, ent->d_name);
234         xftw_add_dirent(depth);
235         f = tree[depth].contents+tree[depth].used-1;
236         f->name = xstrdup(ent->d_name);
237         f->fullname = name; /* do not free name, it is in use */
238         if (lstat(name, &(f->statbuf))) {
239             perror(name);
240             return(1);
241         }
242     }
243     closedir(d);
244     qsort(tree[depth].contents, tree[depth].used, sizeof(*(tree[0].contents)), &xftw_sortdir);
245     return(0);
246 }

The code up to line 233 is simple: it opens the directory and skips the entries named "." and "..". xftw_dir_name is then used to build the full path of each entry. xftw_dir_name is also in xftw.c.

Insmod: xftw_dir_name function

152 /* Concatenate directory name and entry name into one string.
153  * Note: caller must free result or leak.
154  */
155 static char *xftw_dir_name(const char *directory, const char *entry)
156 {
157     int i = strlen(directory);
158     char *name;
159     if (entry)
160         i += strlen(entry);
161     i += 2;
162     name = xmalloc(i);
163     strcpy(name, directory); /* safe, xmalloc */
164     if (*directory && entry)
165         strcat(name, "/"); /* safe, xmalloc */
166     if (entry)
167         strcat(name, entry); /* safe, xmalloc */
168     return(name);
169 }

xftw_add_dirent grows the contents array of an xftw_tree_t when necessary. It is in the same file as well.

Insmod: xftw_add_dirent function

135 /* Increment dirents used at this depth, resizing if necessary */
136 static void xftw_add_dirent(int depth)
137 {
138     xftw_tree_t *t = tree+depth;
139     int i, size = t->size;
140     if (++t->used < size)
141         return;
142     size += 10; /* arbitrary increment */
143     t->contents = xrealloc(t->contents, size*sizeof(*(t->contents)));
144     for (i = t->size; i < size; ++i) {
            memset(&(t->contents[i].statbuf), 0, sizeof(t->contents[i].statbuf));
145         t->contents[i].name = NULL;
146         t->contents[i].fullname = NULL;
147     }
148     t->size = size;
149 }

Back in xftw_readdir, lines 235 to 241 confirm that the path just found is valid and store it in tree. When the directory has been read completely, the entries are sorted with xftw_sortdir. That function is very simple and lives in the same file.

Insmod: xftw_sortdir function

211 /* Sort directory entries into alphabetical order */
212 static int xftw_sortdir(const void *a, const void *b)
213 {
214     return(strcmp(((struct xftw_dirent *)a)->name, ((struct xftw_dirent *)b)->name));
215 }

When xftw_readdir returns, all entries of the first directory level are stored, sorted, in tree[0]. The for loop at line 356 of xftw then checks whether this is a type 1 directory. Here we finally learn what the so-called known directory names and their historical order are: they are the entries of the tbtype array and their indexes. The array is in ./modutils-2.4.0/util/alias.h. There is one exception: a directory whose own path ends in "/kernel" is treated as type 2.

 5 /*
 6  * tbpath and tbtype are used to build the complete set of paths for finding
 7  * modules, but only when we search for individual directories, they are not
 8  * used for [boot] and [toplevel] searches.
 9  */
10 static char *tbpath[] =
11 {
12     "/lib/modules",
13     NULL            /* marks the end of the list! */
14 };
15 
16 
17 char *tbtype[] =
18 {
19     "kernel",       /* as of 2.3.14 this must be first */
20     "fs",
21     "net",
22     "scsi",
23     "block",
24     "cdrom",
25     "ipv4",
26     "ipv6",
27     "sound",
28     "fc4",
29     "video",
30     "misc",
31     "pcmcia",
32     "atm",
33     "usb",
34     "ide",
35     "ieee1394",
36     "mtd",
37     NULL            /* marks the end of the list! */
38 };

As soon as one known directory is found, the directory is classified as type 1 and the loop exits at once. Then, at line 375 and as the comment explains, the entries are filtered against prunelist. After filtering, the for loop at line 385 handles the known directories first. xftw_type2 is in the same file.

Insmod: xftw_type2 function

248 /* Process a type 2 directory */

249 int xftw_type2(const char *directory, const char *entry, int depth, xftw_func_t funcptr)
250 {
251     int ret, i;
252     xftw_tree_t *t = tree+depth;
253     struct stat statbuf;
254     char *dirname = xftw_dir_name(directory, entry);
255     verbose("type 2 %s\n", dirname);
256 
257     if (depth > XFTW_MAXDEPTH) {
258         error("xftw_type2 exceeded maxdepth\n");
259         ret = 1;
260         goto cleanup;
261     }
262     if ((ret = xftw_readdir(dirname, depth)))
263         goto cleanup;
264 
265     t = tree+depth;
266     /* user function sees type 2 directories */
267     if ((ret = lstat(dirname, &statbuf)) ||
268         (ret = xftw_do_name("", dirname, &statbuf, funcptr)))
269         goto cleanup;
270 
271     /* user sees all contents of type 2 directory, no pruning */
272     for (i = 0; i < t->used; ++i) {
273         struct xftw_dirent *c = t->contents+i;
274         if (S_ISLNK(c->statbuf.st_mode)) {
275             if (!stat(c->name, &(c->statbuf))) {
276                 if (S_ISDIR(c->statbuf.st_mode)) {
277                     error("symlink to directory is not allowed, %s ignored\n", c->name);
278                     *(c->name) = '\0'; /* ignore it */
279                 }
280             }
281         }
282         if (!*(c->name))
283             continue;
284         if (S_ISDIR(c->statbuf.st_mode)) {
285             /* recursion is the curse of the programming classes */
286             ret = xftw_type2(dirname, c->name, depth+1, funcptr);
287             if (ret)
288                 goto cleanup;
289         }
290         else if ((ret = xftw_do_name(dirname, c->name, &(c->statbuf), funcptr)))
291             goto cleanup;
292         *(c->name) = '\0'; /* processed */
293     }
294 
295     ret = 0;
296 cleanup:
297     free(dirname);
298     return(ret);
299 }

Line 254 builds a complete path. Line 255 shows that the known directories are handled here as type 2 directories; according to the comment they contain only modules or symlinks to modules. The directory depth that can be searched is limited by XFTW_MAXDEPTH, which is defined as 64. As long as the limit is not exceeded, xftw_do_name is used to look for the target module under dirname. This function is in the same file too.

Insmod: xftw_do_name function

171 /* Call the user function for a directory entry */
172 static int xftw_do_name(const char *directory, const char *entry, struct stat *sb, xftw_func_t funcptr)
173 {
174     int ret = 0;
175     char *name = xftw_dir_name(directory, entry);
176 
177     if (S_ISLNK(sb->st_mode)) {
178         char real[PATH_MAX], *newname;
179         verbose("resolving %s symlink to ", name);
180         if (!(newname = realpath(name, real))) {
181             if (errno == ENOENT) {
182                 verbose("%s: does not exist, dangling symlink ignored\n", real);
183                 goto cleanup;
184             }
185             perror("... failed");
186             goto cleanup;
187         }
188         verbose("%s ", newname);
189         if (lstat(newname, sb)) {
190             error("lstat on %s failed ", newname);
191             perror("");
192             goto cleanup;
193         }
194         free(name);
195         name = xstrdup(newname);
196     }
197 
198     if (!S_ISREG(sb->st_mode) &&
199         !S_ISDIR(sb->st_mode)) {
200         error("%s is not plain file nor directory\n", name);
201         goto cleanup;
202     }
203 
204     verbose("user function %s\n", name);
205     ret = (*funcptr)(name, sb);
206 cleanup:
207     free(name);
208     return(ret);
209 }

The function first deals with symlinks (see line 177). Once it has the real path name, it calls the function pointer funcptr that was passed in, which examines the path and records it if it is the one wanted. That function is in fact config_add, which we have already seen. It uses filter_by_file and filter_by_dir to filter and match paths. If config_add finds a match and only the first match is wanted, it returns 1 and xftw_type2 ends. If dirname names a directory, however, config_add returns 0. If it names a regular file, the corresponding tree entry is empty. When dirname is a directory, execution continues into the loop at line 272, which handles each entry of dirname in turn. It first makes sure that a symlink does not point at a directory. Subdirectories are handled by a recursive call to xftw_type2, files by xftw_do_name.

Back in xftw, note that after each tree entry has been processed its name is set to the empty string, so that it is not processed twice. Line 398 starts processing the remaining paths in alphabetical order: first the subdirectories, then, at line 408, the files. If the directory given at the start contains no known subdirectory name (or its own path ends in /kernel), it is considered type 2 and is handed straight to xftw_type2. That completes xftw.

After this long trek we finally return to config_lstmod. That function at last finishes its lengthy search of modpath and returns with the hoped-for paths in list. Back in search_module_path: if no path is found for base, the search is repeated with base.o, and if that fails as well and zlib is allowed, once more with base.o.gz. However many paths are found, search_module_path returns only the first one (so the order of the search matters). Back to the main battlefield, INSMOD_MAIN!

1610     /* And open it. */
1611     if ((fp = gzf_open(filename, O_RDONLY)) == -1) {
1612         error("%s: %m", filename);
1613         return 1;
1614     }
1615     /* Try to prevent multiple simultaneous loads. */
1616     if (dolock)
1617         flock(fp, LOCK_EX);

gzf_open first tries to open the file as a compressed file and then falls back to opening it uncompressed. If locking was requested, the file is locked.

1618     if (!get_kernel_info(K_SYMBOLS))
1619         goto out;

get_kernel_info is in ./modutils-2.4.0/util/modstat.c.

Insmod: get_kernel_info function

409 int get_kernel_info(int type)
410 {
411     k_new_syscalls = !query_module(NULL, 0, NULL, 0, NULL);
412 
413 #ifdef COMPAT_2_0
414     if (!k_new_syscalls)
415         return old_get_kernel_info(type);
416 #endif /* COMPAT_2_0 */
417 
418     return new_get_kernel_info(type);
419 }

Line 411 probes the kernel version: the sys_query_module system call was added in the 2.1.x series, and an older kernel returns -ENOSYS. new_get_kernel_info is the heart of this function and is in the same file.

Insmod: new_get_kernel_info function

 84 static int new_get_kernel_info(int type)
 85 {
 86     struct module_stat *modules;
 87     struct module_stat *m;
 88     struct module_symbol *syms;
 89     struct module_symbol *s;
 90     size_t ret;
 91     size_t bufsize;
 92     size_t nmod;
 93     size_t nsyms;
 94     size_t i;
 95     size_t j;
 96     char *module_names;
 97     char *mn;
 98 
 99     drop();
100 
101     /*
102      * Collect the loaded modules
103      */
104     module_names = xmalloc(bufsize = 256);
105     while (query_module(NULL, QM_MODULES, module_names, bufsize, &ret)) {
106         if (errno != ENOSPC) {
107             error("QM_MODULES: %m\n");
108             return 0;
109         }
110         module_names = xrealloc(module_names, bufsize = ret);
111     }
112     module_name_list = module_names;
113     l_module_name_list = bufsize;
114     n_module_stat = nmod = ret;
115     module_stat = modules = xmalloc(nmod * sizeof(struct module_stat));
116     memset(modules, 0, nmod * sizeof(struct module_stat));
117 
118     /* Collect the info from the modules */
119     for (i = 0, mn = module_names, m = modules;
120          i < nmod;
121          ++i, ++m, mn += strlen(mn) + 1) {
122         struct module_info info;
123 
124         m->name = mn;
125         if (query_module(mn, QM_INFO, &info, sizeof(info), &ret)) {
126             if (errno == ENOENT) {
127                 /* The module was removed out from underneath us. */
128                 m->flags = NEW_MOD_DELETED;
129                 continue;
130             }
131             /* else oops */
132             error("module %s: QM_INFO: %m", mn);
133             return 0;
134         }
135 
136         m->addr = info.addr;
137 
138         if (type & K_INFO) {
139             m->size = info.size;
140             m->flags = info.flags;
141             m->usecount = info.usecount;
142             m->modstruct = info.addr;
143         }
144 
145         if (type & K_REFS) {
146             int mm;
147             char *mrefs;
148             char *mr;
149 
150             mrefs = xmalloc(bufsize = 64);
151             while (query_module(mn, QM_REFS, mrefs, bufsize, &ret)) {
152                 if (errno != ENOSPC) {
153                     error("QM_REFS: %m");
154                     return 1;
155                 }
156                 mrefs = xrealloc(mrefs, bufsize = ret);
157             }
158             for (j = 0, mr = mrefs;
159                  j < ret;
160                  ++j, mr += strlen(mr) + 1) {
161                 for (mm = 0; mm < i; ++mm) {
162                     if (strcmp(mr, module_stat[mm].name) == 0) {
163                         m->nrefs += 1;
164                         m->refs=xrealloc(m->refs, m->nrefs * sizeof(struct module_stat **));
165                         m->refs[m->nrefs - 1] = module_stat + mm;
166                         break;
167                     }
168                 }
169             }
170             free(mrefs);
171         }
172 
173         if (type & K_SYMBOLS) { /* Want info about symbols */
174             syms = xmalloc(bufsize = 1024);
175             while (query_module(mn, QM_SYMBOLS, syms, bufsize, &ret)) {
176                 if (errno == ENOSPC) {
177                     syms = xrealloc(syms, bufsize = ret);
178                     continue;
179                 }
180                 if (errno == ENOENT) {
181                     /*
182                      * The module was removed out
183                      * from underneath us.
184                      */
185                     m->flags = NEW_MOD_DELETED;
186                     free(syms);
187                     goto next;
188                 } else {
189                     error("module %s: QM_SYMBOLS: %m", mn);
190                     return 0;
191                 }
192             }
193             nsyms = ret;
194 
195             m->nsyms = nsyms;
196             m->syms = syms;
197 
198             /* Convert string offsets to string pointers */
199             for (j = 0, s = syms; j < nsyms; ++j, ++s)
200                 s->name += (unsigned long) syms;
201         }
202     next:
203     }
204 

205     if (type & K_SYMBOLS) { /* Want info about symbols */
206         /* Collect the kernel's symbols. */
207         syms = xmalloc(bufsize = 16 * 1024);
208         while (query_module(NULL, QM_SYMBOLS, syms, bufsize, &ret)) {
209             if (errno != ENOSPC) {
210                 error("kernel: QM_SYMBOLS: %m");
211                 return 0;
212             }
213             syms = xrealloc(syms, bufsize = ret);
214         }
215         nksyms = nsyms = ret;
216         ksyms = syms;
217 
218         /* Convert string offsets to string pointers */
219         for (j = 0, s = syms; j < nsyms; ++j, ++s)
220             s->name += (unsigned long) syms;
221     }
222 
223     return 1;
224 }

The call to drop at line 99 first clears the contents of module_stat. module_stat is a global variable that holds the module information. Line 105 obtains the names of all loaded kernel modules. Then, at line 119, the information for every module is fetched. It is returned in a module_info structure, which is defined in linux/include/linux/module.h.

 95 struct module_info
 96 {
 97     unsigned long addr;
 98     unsigned long size;
 99     unsigned long flags;
100     long usecount;
101 };

Next, because type is K_SYMBOLS, the if block at line 173 is executed and collects the symbol information of each module. At line 205 symbols are queried once more, but this time no module name is given, so the query returns the symbols of the kernel itself (the kernel is not consulted during the per-module queries). INSMOD_MAIN then sets the prefix to be used when the version of the module being loaded does not match the kernel.

Insmod: set_ncv_prefix function

136 /* Only set prefix once. If set by the user, use it. If not set by the
137  * user, look for a well known kernel symbol and derive the prefix from
138  * there. Otherwise set the prefix depending on whether uts_info
139  * includes SMP or not for backwards compatibility.
140  */
141 static void set_ncv_prefix(char *prefix)
142 {
143     static char derived_prefix[256];
144     static const char *well_known_symbol[] = { "get_module_symbol_R",
145                                                "inter_module_get_R",
146                                              };
147     struct module_symbol *s;
148     int i, j, l, m, pl;
149     const char *name;

150     char *p;
151 
152     if (ncv_prefix)
153         return;
154 
155     if (prefix)
156         ncv_prefix = prefix;
157     else {
158         /* Extract the prefix (if any) from well known symbols */
159         for (i = 0, s = ksyms; i < nksyms; ++i, ++s) {
160             name = (char *) s->name;
161             l = strlen(name);
162             for (j = 0; j < sizeof(well_known_symbol)/sizeof(well_known_symbol[0]); ++j) {
163                 m = strlen(well_known_symbol[j]);
164                 if (m + 8 > l ||
165                     strncmp(name, well_known_symbol[j], m))
166                     continue;
167                 pl = l - m - 8;
168                 if (pl > sizeof(derived_prefix)-1)
169                     continue; /* Prefix is wrong length */
170                 /* Must end with 8 hex digits */
171                 (void) strtoul(name+l-8, &p, 16);
172                 if (*p == 0) {
173                     strncpy(derived_prefix, name+m, pl);
174                     *(derived_prefix+pl) = '\0';
175                     ncv_prefix = derived_prefix;
176                     break;
177                 }
178             }
179         }
180     }
181     if (!ncv_prefix) {
182         p = strchr(uts_info.version, ' ');
183         if (p && *(++p) && !strncmp(p, "SMP ", 4))
184             ncv_prefix = "smp_";
185         else
186             ncv_prefix = "";
187     }
188     ncv_plen = strlen(ncv_prefix);
189     if (flag_verbose)
190         lprintf("Symbol version prefix '%s'", ncv_prefix);
191 }

ncv_prefix can be specified by the user, and it is set only once. If the user has not set it, lines 157 to 180 derive it. The rule is simple: the kernel symbols are searched for names beginning with "get_module_symbol_R" or "inter_module_get_R" (my guess is that 'R' stands for release, so what follows must be version information). If such a symbol exists and its last eight characters are hexadecimal digits, the characters between the well-known name and those eight digits become the content of ncv_prefix; this string may be empty. If no such symbol is found (which is unlikely), the version information obtained through sys_newuname is examined: on an SMP (symmetric multiprocessing) system ncv_prefix is set to "smp_", otherwise it is left empty.

Next, INSMOD_MAIN checks whether the module to be installed is already loaded. If it is, there is of course no going further.


1627     for (i = 0; i < n_module_stat; ++i) {
1628         if (strcmp(module_stat[i].name, m_name) == 0) {
1629             error("a module named %s already exists", m_name);
1630             goto out;
1631         }
1632     }
1633 
1634     error_file = filename;
1635     if ((f = obj_load(fp, ET_REL, filename)) == NULL)
1636         goto out;
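As a side note on the prefix rule of set_ncv_prefix, here is a minimal sketch of how a prefix is cut out of a single symbol name. It is an illustration only, not the modutils code: derive_prefix is a hypothetical helper, the symbol names in the usage note are made up, and the eight trailing characters are checked with isxdigit rather than with strtoul as in the original.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Try to derive a version prefix from one kernel symbol name.
 * well_known is a stem such as "get_module_symbol_R".
 * Returns 1 and fills out on success, 0 if the name does not fit.
 */
int derive_prefix(const char *name, const char *well_known,
                  char *out, size_t outsz)
{
    size_t l = strlen(name);
    size_t m = strlen(well_known);
    size_t pl, i;

    assert(out != NULL && outsz > 0);
    if (m + 8 > l || strncmp(name, well_known, m) != 0)
        return 0;                       /* wrong stem or too short */
    pl = l - m - 8;                     /* length of the prefix part */
    if (pl > outsz - 1)
        return 0;                       /* prefix would not fit */
    for (i = l - 8; i < l; ++i)         /* must end with 8 hex digits */
        if (!isxdigit((unsigned char)name[i]))
            return 0;
    memcpy(out, name + m, pl);
    out[pl] = '\0';
    return 1;
}
```

Under these assumptions, a name such as get_module_symbol_Rsmp_1a2b3c4d yields the prefix "smp_", and a name such as inter_module_get_R12345678 yields the empty prefix.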

Now we reach obj_load. At last the real work begins! obj_load is in ./modutils-2.4.0/obj/obj_load.c.

Insmod: obj_load function

36 struct obj_file *
37 obj_load (int fp, Elf32_Half e_type, const char *filename)
38 {
39   struct obj_file *f;
40   ElfW(Shdr) *section_headers;
41   int shnum, i;
42   char *shstrtab;
43 
44   /* Read the file header. */
45 
46   f = arch_new_file();
47   memset(f, 0, sizeof(*f));
48   f->symbol_cmp = strcmp;
49   f->symbol_hash = obj_elf_hash;
50   f->load_order_search_start = &f->load_order;
51 
52   gzf_lseek(fp, 0, SEEK_SET);
53   if (gzf_read(fp, &f->header, sizeof(f->header)) != sizeof(f->header))
54     {
55       error("error reading ELF header %s: %m", filename);
56       return NULL;
57     }
58 
59   if (f->header.e_ident[EI_MAG0] != ELFMAG0
60       || f->header.e_ident[EI_MAG1] != ELFMAG1
61       || f->header.e_ident[EI_MAG2] != ELFMAG2
62       || f->header.e_ident[EI_MAG3] != ELFMAG3)
63     {
64       error("%s is not an ELF file", filename);
65       return NULL;
66     }
67 
68   if (f->header.e_ident[EI_CLASS] != ELFCLASSM
69       || f->header.e_ident[EI_DATA] != ELFDATAM
70       || f->header.e_ident[EI_VERSION] != EV_CURRENT
71       || !MATCH_MACHINE(f->header.e_machine))
72     {
73       error("ELF file %s not for this architecture", filename);
74       return NULL;
75     }
76 
77   if (f->header.e_type != e_type && e_type != ET_NONE)

78     {
79       switch (e_type) {
80       case ET_REL:
81         error("ELF file %s not a relocatable object", filename);
82         break;
83       case ET_EXEC:
84         error("ELF file %s not an executable object", filename);
85         break;
86       default:
87         error("ELF file %s has wrong type, expecting %d got %d",
88               filename, e_type, f->header.e_type);
89         break;
90       }
91       return NULL;
92     }
93 
94   /* Read the section headers. */
95 
96   if (f->header.e_shentsize != sizeof(ElfW(Shdr)))
97     {
98       error("section header size mismatch %s: %lu != %lu",
99             filename,
100            (unsigned long)f->header.e_shentsize,
101            (unsigned long)sizeof(ElfW(Shdr)));
102      return NULL;
103    }
104 
105  shnum = f->header.e_shnum;
106  f->sections = xmalloc(sizeof(struct obj_section *) * shnum);
107  memset(f->sections, 0, sizeof(struct obj_section *) * shnum);
108 
109  section_headers = alloca(sizeof(ElfW(Shdr)) * shnum);
110  gzf_lseek(fp, f->header.e_shoff, SEEK_SET);
111  if (gzf_read(fp, section_headers, sizeof(ElfW(Shdr))*shnum) != sizeof(ElfW(Shdr))*shnum)
112    {
113      error("error reading ELF section headers %s: %m", filename);
114      return NULL;
115    }
116 
117  /* Read the section data. */
118 
119  for (i = 0; i < shnum; ++i)
120    {
121      struct obj_section *sec;
122 
123      f->sections[i] = sec = arch_new_section();
124      memset(sec, 0, sizeof(*sec));
125 
126      sec->header = section_headers[i];
127      sec->idx = i;
128 
129      switch (sec->header.sh_type)
130        {
131        case SHT_NULL:
132        case SHT_NOTE:
133        case SHT_NOBITS:
134          /* ignore */

135          break;
136 
137        case SHT_PROGBITS:
138        case SHT_SYMTAB:
139        case SHT_STRTAB:
140        case SHT_RELM:
141          if (sec->header.sh_size > 0)
142            {
143              sec->contents = xmalloc(sec->header.sh_size);
144              gzf_lseek(fp, sec->header.sh_offset, SEEK_SET);
145              if (gzf_read(fp, sec->contents, sec->header.sh_size) != sec->header.sh_size)
146                {
147                  error("error reading ELF section data %s: %m", filename);
148                  return NULL;
149                }
150            }
151          else
152            sec->contents = NULL;
153          break;
154 
155 #if SHT_RELM == SHT_REL
156        case SHT_RELA:
157          if (sec->header.sh_size) {
158            error("RELA relocations not supported on this architecture %s", filename);
159            return NULL;
160          }
161          break;
162 #else
163        case SHT_REL:
164          if (sec->header.sh_size) {
165            error("REL relocations not supported on this architecture %s", filename);
166            return NULL;
167          }
168          break;
169 #endif
170 
171        default:
172          if (sec->header.sh_type >= SHT_LOPROC)
173            {
174              if (arch_load_proc_section(sec, fp) < 0)
175                return NULL;
176              break;
177            }
178 
179          error("can't handle sections of type %ld %s",
180                (long)sec->header.sh_type, filename);
181          return NULL;
182        }
183    }
184 
185  /* Do what sort of interpretation as needed by each section. */
186 
187  shstrtab = f->sections[f->header.e_shstrndx]->contents;
188 
189  for (i = 0; i < shnum; ++i)
190    {
191      struct obj_section *sec = f->sections[i];

192      sec->name = shstrtab + sec->header.sh_name;
193    }
194 
195  for (i = 0; i < shnum; ++i)
196    {
197      struct obj_section *sec = f->sections[i];
198 
199      /* .modinfo and .modstring should be contents only but gcc has no
200       * attribute for that. The kernel may have marked these sections as
201       * ALLOC, ignore the allocate bit.
202       */
203      if (strcmp(sec->name, ".modinfo") == 0 ||
204          strcmp(sec->name, ".modstring") == 0)
205        sec->header.sh_flags &= ~SHF_ALLOC;
206 
207      if (sec->header.sh_flags & SHF_ALLOC)
208        obj_insert_section_load_order(f, sec);
209 
210      switch (sec->header.sh_type)
211        {
212        case SHT_SYMTAB:
213          {
214            unsigned long nsym, j;
215            char *strtab;
216            ElfW(Sym) *sym;
217 
218            if (sec->header.sh_entsize != sizeof(ElfW(Sym)))
219              {
220                error("symbol size mismatch %s: %lu != %lu",
221                      filename,
222                      (unsigned long)sec->header.sh_entsize,
223                      (unsigned long)sizeof(ElfW(Sym)));
224                return NULL;
225              }
226 
227            nsym = sec->header.sh_size / sizeof(ElfW(Sym));
228            strtab = f->sections[sec->header.sh_link]->contents;
229            sym = (ElfW(Sym) *) sec->contents;
230 
231            /* Allocate space for a table of local symbols. */
232            j = f->local_symtab_size = sec->header.sh_info;
233            f->local_symtab = xmalloc(j *= sizeof(struct obj_symbol *));
234            memset(f->local_symtab, 0, j);
235 
236            /* Insert all symbols into the hash table. */
237            for (j = 1, ++sym; j < nsym; ++j, ++sym)
238              {
239                const char *name;
240                if (sym->st_name)
241                  name = strtab+sym->st_name;
242                else
243                  name = f->sections[sym->st_shndx]->name;
244 
245                obj_add_symbol(f, name, j, sym->st_info, sym->st_shndx,
246                               sym->st_value, sym->st_size);
247 
248              }

249          }
250          break;
251        }
252    }
253 
254  /* second pass to add relocation data to symbols */
255  for (i = 0; i < shnum; ++i)
256    {
257      struct obj_section *sec = f->sections[i];
258      switch (sec->header.sh_type)
259        {
260        case SHT_RELM:
261          {
262            unsigned long nrel, j;
263            ElfW(RelM) *rel;
264            struct obj_section *symtab;
265            char *strtab;
266            if (sec->header.sh_entsize != sizeof(ElfW(RelM)))
267              {
268                error("relocation entry size mismatch %s: %lu != %lu",
269                      filename,
270                      (unsigned long)sec->header.sh_entsize,
271                      (unsigned long)sizeof(ElfW(RelM)));
272                return NULL;
273              }
274            nrel = sec->header.sh_size / sizeof(ElfW(RelM));
275            rel = (ElfW(RelM) *) sec->contents;
276            symtab = f->sections[sec->header.sh_link];
277            strtab = f->sections[symtab->header.sh_link]->contents;
278 
279            /* Save the relocate type in each symbol entry. */
280            for (j = 0; j < nrel; ++j, ++rel)
281              {
282                ElfW(Sym) *extsym;
283                struct obj_symbol *intsym;
284                unsigned long symndx;
285                symndx = ELFW(R_SYM)(rel->r_info);
286                if (symndx)
287                  {
288                    extsym = ((ElfW(Sym) *) symtab->contents) + symndx;
289                    if (ELFW(ST_BIND)(extsym->st_info) == STB_LOCAL)
290                      {
291                        /* Local symbols we look up in the local table to be sure
292                           we get the one that is really intended. */
293                        intsym = f->local_symtab[symndx];
294                      }
295                    else
296                      {
297                        /* Others we look up in the hash table. */
298                        const char *name;
299                        if (extsym->st_name)
300                          name = strtab + extsym->st_name;
301                        else
302                          name = f->sections[extsym->st_shndx]->name;
303                        intsym = obj_find_symbol(f, name);
304                      }

305                    intsym->r_type = ELFW(R_TYPE)(rel->r_info);
306                  }
307              }
308          }
309          break;
310        }
311    }
312 
313  f->filename = xstrdup(filename);
314 
315  return f;
316 }

This function is rather large, and what it does is not simple either. To begin with, a module is stored in ELF format, in its relocatable form. Line 46 calls an architecture-dependent function that creates a temporary file object. On x86 this function is in ./modutils-2.4.0/obj/obj_i386.c.

Insmod: arch_new_file function

56 struct obj_file *
57 arch_new_file (void)
58 {
59   struct i386_file *f;
60   f = xmalloc(sizeof(*f));
61   f->got = NULL;
62   return &f->root;
63 }

The definition of i386_file is in the same file.

41 struct i386_file
42 {
43   struct obj_file root;
44   struct obj_section *got;
45 };

obj_file corresponds to the structure of an ELF file. It is defined in ./modutils-2.4.0/include/obj.h.

 94 struct obj_file
 95 {
 96   ElfW(Ehdr) header;
 97   ElfW(Addr) baseaddr;
 98   struct obj_section **sections;
 99   struct obj_section *load_order;
100   struct obj_section **load_order_search_start;
101   struct obj_string_patch_struct *string_patches;
102   struct obj_symbol_patch_struct *symbol_patches;
103   int (*symbol_cmp)(const char *, const char *);
104   unsigned long (*symbol_hash)(const char *);
105   unsigned long local_symtab_size;
106   struct obj_symbol **local_symtab;
107   struct obj_symbol *symtab[HASH_BUCKETS];
108   const char *filename;
109   char *persist;
110 };

To understand this structure we first need to say something about the ELF file format. In outline, an ELF file consists of a file header, an optional program header, data, and a number of sections. Anyone who has used C++ knows that uninitialized static and global variables are placed in .bss; in an ELF file .bss is simply one concrete section. The main section types are string (holding the names of sections and symbols), symbol (listing all symbols) and relocation (holding relocation information). ElfW is a macro defined in ./modutils-2.4.0/include/obj.h.

35 #ifndef ElfW
36 # if ELFCLASSM == ELFCLASS32
37 #  define ElfW(x) Elf32_ ## x
38 #  define ELFW(x) ELF32_ ## x
39 # else
40 #  define ElfW(x) Elf64_ ## x
41 #  define ELFW(x) ELF64_ ## x
42 # endif
43 #endif

The type of header is therefore Elf32_Ehdr, and the other members follow the same pattern. Because the layout of Elf32_Ehdr matters for the explanation here, it is listed as well. All definitions related to the ELF format are in linux-2.4.0/linux/include/linux/elf.h.

Insmod: Elf32_Ehdr structure

407 #define EI_NIDENT 16
408 
409 typedef struct elf32_hdr{
410   unsigned char e_ident[EI_NIDENT];   file identification bytes
411   Elf32_Half e_type;                  file type
412   Elf32_Half e_machine;               machine information
413   Elf32_Word e_version;               version information
414   Elf32_Addr e_entry; /* Entry point */   virtual start address of the process
415   Elf32_Off e_phoff;                  offset of the program header table (from the start of the file)
416   Elf32_Off e_shoff;                  offset of the section header table (from the start of the file)
417   Elf32_Word e_flags;                 processor-specific flags
418   Elf32_Half e_ehsize;                size of the ELF header
419   Elf32_Half e_phentsize;             size of one program header
420   Elf32_Half e_phnum;                 number of program headers
421   Elf32_Half e_shentsize;             size of one section header
422   Elf32_Half e_shnum;                 number of section headers
423   Elf32_Half e_shstrndx;              section header table index of the string section that holds the section names
424 } Elf32_Ehdr;
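To make the role of e_ident and e_type concrete, here is a minimal sketch of the kind of check a loader runs on these header fields. It is not the obj_load code: check_elf32_rel is a hypothetical helper, it relies on the Elf32_Ehdr type and the ELFMAG, SELFMAG, ELFCLASS32, EV_CURRENT and ET_REL constants from the glibc header <elf.h>, and it assumes the file uses the host byte order.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Inspect the first bytes of a file image.
 * Returns 1 for a 32-bit relocatable ELF object, 0 for any other ELF
 * file, and -1 if the buffer is too short or the magic is wrong.
 */
int check_elf32_rel(const unsigned char *buf, size_t len)
{
    Elf32_Ehdr hdr;

    assert(buf != NULL);
    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));     /* avoid unaligned access */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                      /* not an ELF file */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS32 ||
        hdr.e_ident[EI_VERSION] != EV_CURRENT)
        return 0;
    return hdr.e_type == ET_REL;
}
```

A caller would read at least sizeof(Elf32_Ehdr) bytes from the start of the file and pass them in; only a return value of 1 means the file could be a loadable module in the sense used here.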

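Clearly, in the obj_file structure, header corresponds to the ELF file header, sections corresponds to the ELF sections, and symtab holds the symbols parsed out of the symbol section; these symbols are additionally chained together through a hash table.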
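The hash-table chaining mentioned above can be sketched as follows. This is an illustration under assumptions, not the modutils source: elf_hash is the classic System V ELF hash, which obj_elf_hash is assumed to follow, and N_BUCKETS with its value 521 and bucket_of are hypothetical stand-ins for HASH_BUCKETS and the bucket selection.

```c
#include <assert.h>
#include <stddef.h>

#define N_BUCKETS 521   /* illustrative; stands in for HASH_BUCKETS */

/* Classic System V ELF hash of a NUL-terminated string. */
unsigned long elf_hash(const char *name)
{
    unsigned long h = 0, g;

    assert(name != NULL);
    while (*name != '\0') {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000UL;
        if (g != 0)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Index of the bucket (hash chain) a symbol name falls into. */
size_t bucket_of(const char *name)
{
    return (size_t)(elf_hash(name) % N_BUCKETS);
}
```

Symbols whose names land in the same bucket are linked through a next pointer, so a lookup hashes the name once and then walks a single short chain.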


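The ELF file header contains a magic number that identifies the format, and lines 59~66 perform this sanity check. The header also records the version, the data encoding (big-endian or little-endian), the target machine and so on, and these are checked in lines 68~75. An ELF file of type ET_REL is a relocatable file, while ET_EXEC marks an executable. A loadable module must be an ET_REL file, and lines 77~92 make sure of this. Lines 96~103 make sure that the section header structure declared here matches the one used by the file (by comparing sizes). Lines 105~109 get the number of sections and allocate a buffer. Lines 110~115 seek to where the section headers are stored in the file and read all of them in. The for loop at line 119 then reads in the contents of each section in turn. To make this easier to follow, the definition of Elf32_Shdr is listed below.

Insmod: Elf32_Shdr structure

511 typedef struct {
512   Elf32_Word sh_name;       /* section name; the string itself is stored in the string section that e_shstrndx in the ELF header points to, and this field is the index into that section */
513   Elf32_Word sh_type;       /* section type */
514   Elf32_Word sh_flags;      /* flag bits (section attributes) */
515   Elf32_Addr sh_addr;       /* if the section is visible in the process, its address in the process */
516   Elf32_Off sh_offset;      /* offset of the section contents from the start of the ELF file */
517   Elf32_Word sh_size;       /* size of the section */
518   Elf32_Word sh_link;       /* special meaning, explained below */
519   Elf32_Word sh_info;       /* same as above */
520   Elf32_Word sh_addralign;  /* address alignment */
521   Elf32_Word sh_entsize;    /* if the section consists of fixed-size entries, the size of one entry */
522 } Elf32_Shdr;

Sections whose type is SHT_LOPROC or higher are reserved either for the processor architecture or for the user and need some special handling, although on i386 arch_load_proc_section is an empty function. Line 187 obtains the string section that holds the section names, and lines 189~194 look up the name of each section. For the .modinfo and .modstring sections, the comment at line 199 says that only gcc uses them and that their SHF_ALLOC flag has to be cleared. A section with SHF_ALLOC set is visible while the process runs; without the flag, the section occupies no process address space. Stripping SHF_ALLOC from .modinfo and .modstring is therefore reasonable, since it shrinks the loaded image. If a section does have SHF_ALLOC set, obj_insert_section_load_order adds it to the list that load_order_search_start in obj_file points to. This function is in ./modutils-2.4.0/obj/obj_common.c.

Insmod: obj_insert_section_load_order function

253 void
254 obj_insert_section_load_order (struct obj_file *f, struct obj_section *sec)
255 {
256   struct obj_section **p;
257   int prio = obj_load_order_prio(sec);
258   for (p = f->load_order_search_start; *p ; p = &(*p)->load_next)
259     if (obj_load_order_prio(*p) < prio)
260       break;
261   sec->load_next = *p;
262   *p = sec;
263 }

The sections in this list are kept sorted. The priority is computed by obj_load_order_prio, which is also in obj_common.c.

Insmod: obj_load_order_prio function

232 static int
233 obj_load_order_prio(struct obj_section *a)
234 {
235   unsigned long af, ac;
236
237   af = a->header.sh_flags;
238
239   ac = 0;
240   if (a->name[0] != '.' || strlen(a->name) != 10 ||
241       strcmp(a->name + 5, ".init")) ac |= 32;
242   if (af & SHF_ALLOC) ac |= 16;
243   if (!(af & SHF_WRITE)) ac |= 8;
244   if (af & SHF_EXECINSTR) ac |= 4;
245   if (a->header.sh_type != SHT_NOBITS) ac |= 2;
246 #if defined(ARCH_ia64)
247   if (af & SHF_IA_64_SHORT) ac -= 1;
248 #endif
249
250   return ac;
251 }

A section whose name starts with ".", is 10 characters long and ends in the 5 characters ".init" gets a very low priority, in fact the lowest. The order of the list is the order in which the sections will be laid out. Because obj_insert_section_load_order puts a section in front of the first entry whose priority is lower than its own, the list is in descending priority order: sections that occupy memory (SHF_ALLOC) come before those that do not, and read-only sections come before writable ones. Once the position of the section has been settled, the section is processed according to its type. The contents of each kind of section have a matching type: a symbol section holds Elf32_Sym structures, and a relocation section holds Elf32_Rel or Elf32_Rela structures. Symbol sections are considered first. In the ELF format, symbols are divided by binding and visibility into global, weak and local. The first two are both globally visible, but weak has a lower precedence than global. In addition, several global symbols of the same name are not allowed at link time, whereas weak symbols cause no such conflict. Elf32_Sym is defined as follows:

Insmod: Elf32_Sym structure

388 typedef struct elf32_sym{
389   Elf32_Word st_name;      /* name of the symbol */
390   Elf32_Addr st_value;     /* value of the symbol; may be a number or an address */
391   Elf32_Word st_size;      /* size of the symbol */
392   unsigned char st_info;   /* binding and type of the symbol */
393   unsigned char st_other;  /* reserved */
394   Elf32_Half st_shndx;     /* index of the section the symbol relates to */
395 } Elf32_Sym;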
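The st_info byte packs the binding into its upper four bits and the type into its lower four bits. The sketch below mirrors the standard ELF32_ST_BIND, ELF32_ST_TYPE and ELF32_ST_INFO macros under DEMO_ names so that it compiles without any ELF header; the helper demo_is_global_scope is an assumption added for illustration.

```c
/* Stand-ins for ELF32_ST_BIND, ELF32_ST_TYPE and ELF32_ST_INFO. */
#define DEMO_ST_BIND(info)        ((info) >> 4)
#define DEMO_ST_TYPE(info)        ((info) & 0xf)
#define DEMO_ST_INFO(bind, type)  (((bind) << 4) + ((type) & 0xf))

/* Binding and type values as defined by the ELF specification. */
enum { DEMO_STB_LOCAL = 0, DEMO_STB_GLOBAL = 1, DEMO_STB_WEAK = 2 };
enum { DEMO_STT_NOTYPE = 0, DEMO_STT_OBJECT = 1, DEMO_STT_FUNC = 2 };

/* Hypothetical helper: global and weak symbols are both visible
   outside the object file, local symbols are not. */
static int demo_is_global_scope(unsigned char info)
{
    return DEMO_ST_BIND(info) != DEMO_STB_LOCAL;
}
```

For example, a global function symbol has st_info 0x12, and a weak data object has 0x21.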


Note line 232: for a symbol section, sh_info holds the number of local symbols, and these symbols are placed first in the section. The first ++sym at line 237 skips the first entry of the symbol table, which is unused. Lines 240~243 determine the name of the symbol. The main work is done by obj_add_symbol, in ./modutils-2.4.0/obj/obj_common.c.

Insmod: obj_add_symbol function

90 struct obj_symbol *
91 obj_add_symbol (struct obj_file *f, const char *name, unsigned long symidx,
92                 int info, int secidx, ElfW(Addr) value, unsigned long size)
93 {
94   struct obj_symbol *sym;
95   unsigned long hash = f->symbol_hash(name) % HASH_BUCKETS;
96   int n_type = ELFW(ST_TYPE)(info);
97   int n_binding = ELFW(ST_BIND)(info);
98
99   for (sym = f->symtab[hash]; sym; sym = sym->next)
100     if (f->symbol_cmp(sym->name, name) == 0)
101       {
102         int o_secidx = sym->secidx;
103         int o_info = sym->info;
104         int o_type = ELFW(ST_TYPE)(o_info);
105         int o_binding = ELFW(ST_BIND)(o_info);
106
107         /* A redefinition! Is it legal? */
108
109         if (secidx == SHN_UNDEF)
110           return sym;
111         else if (o_secidx == SHN_UNDEF)
112           goto found;
113         else if (n_binding == STB_GLOBAL && o_binding == STB_LOCAL)
114           {
115             /* Cope with local and global symbols of the same name
116                in the same object file, as might have been created
117                by ld -r. The only reason locals are now seen at this
118                level at all is so that we can do semi-sensible things
119                with parameters. */
120
121             struct obj_symbol *nsym, **p;
122
123             nsym = arch_new_symbol();
124             nsym->next = sym->next;
125             nsym->ksymidx = -1;
126
127             /* Excise the old (local) symbol from the hash chain. */
128             for (p = &f->symtab[hash]; *p != sym; p = &(*p)->next)
129               continue;
130             *p = sym = nsym;
131             goto found;
132           }
133         else if (n_binding == STB_LOCAL)
134           {
135             /* Another symbol of the same name has already been defined.
136                Just add this to the local table. */
137             sym = arch_new_symbol();
138             sym->next = NULL;
139             sym->ksymidx = -1;
140             f->local_symtab[symidx] = sym;
141             goto found;
142           }
143         else if (n_binding == STB_WEAK)
144           return sym;
145         else if (o_binding == STB_WEAK)
146           goto found;
147         /* Don't unify COMMON symbols with object types the programmer
148            doesn't expect. */
149         else if (secidx == SHN_COMMON
150                  && (o_type == STT_NOTYPE || o_type == STT_OBJECT))
151           return sym;
152         else if (o_secidx == SHN_COMMON
153                  && (n_type == STT_NOTYPE || n_type == STT_OBJECT))
154           goto found;
155         else
156           {
157             /* Don't report an error if the symbol is coming from
158                the kernel or some external module. */
159             if (secidx <= SHN_HIRESERVE)
160               error("%s multiply defined", name);
161             return sym;
162           }
163       }
164
165   /* Completely new symbol. */
166   sym = arch_new_symbol();
167   sym->next = f->symtab[hash];
168   f->symtab[hash] = sym;
169   sym->ksymidx = -1;
170
171   if (ELFW(ST_BIND)(info) == STB_LOCAL && symidx != -1) {
172     if (symidx >= f->local_symtab_size)
173       error("local symbol %s with index %ld exceeds local_symtab_size %ld",
174             name, (long) symidx, (long) f->local_symtab_size);
175     else
176       f->local_symtab[symidx] = sym;
177   }
178
179 found:
180   sym->name = name;
181   sym->value = value;
182   sym->size = size;
183   sym->secidx = secidx;
184   sym->info = info;
185   sym->r_type = 0; /* should be R_arch_NONE for all arch */
186
187   return sym;
188 }

The function first uses the hash function and the string comparison function stored in obj_file (for the moment the latter is still strcmp) to look for a symbol of the same name. If there is none, the symbol is added to the symtab array of obj_file, and if it is a local symbol it is added to local_symtab as well.


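If a symbol of the same name does exist, the attributes decide which one wins. If the symbol being added has section index SHN_UNDEF, i.e. it is undefined, nothing is done. Conversely, if the existing symbol is SHN_UNDEF, the new symbol replaces it. If the symbol being added is STB_GLOBAL and the existing one is STB_LOCAL, the global symbol replaces the local one in symtab; the replaced symbol can still be found through local_symtab (which is where a local symbol belongs anyway). If the symbol being added is STB_LOCAL, it is put into local_symtab but not into symtab, so it never takes part in the comparison done by the for loop. If the symbol being added is STB_WEAK, nothing is done; conversely, if the existing symbol is STB_WEAK, the symbol being added must be STB_GLOBAL at this point, and the replacement always happens. In the ELF format some section indexes have a special meaning and are set aside for special purposes. SHN_COMMON is one of them; for C it corresponds to external symbols that have not been allocated yet. Symbols also come in several types according to what they represent and how they are used: STT_OBJECT stands for a data object, and STT_NOTYPE for a symbol whose type is not specified (original text: "The symbol's type is not specified"). So at line 149, if one side is a data object and the other is an unallocated common symbol, the data object takes precedence (let us accept this for now). When obj_add_symbol returns, local_symtab in obj_file has been filled with the local symbols, although symtab may also still contain some local symbols (those for which there is no global symbol of the same name). After all the symbol sections have been processed in turn, the relocation sections are processed, starting at line 254. For a relocation section, sh_link in the section header holds the index of the associated symbol section, and sh_link of that symbol section holds the index of the associated string section. This is what lines 277 and 278 are about. A relocation section contains Elf32_Rel or Elf32_Rela structures.

Insmod: Elf32_Rel structure

366 typedef struct elf32_rel {
367   Elf32_Addr r_offset;
368   Elf32_Word r_info;
369 } Elf32_Rel;

Insmod: Elf32_Rela structure

376 typedef struct elf32_rela{
377   Elf32_Addr r_offset;
378   Elf32_Word r_info;
379   Elf32_Sword r_addend;
380 } Elf32_Rela;
381

The difference between the two structures is that elf32_rela carries the constant used during relocation explicitly (original text: addend), whereas with elf32_rel the addend is implicit and is taken from the location being relocated. In r_info, the upper 24 bits give the index of the symbol in the symbol section (apart from string sections, essentially every section handled here consists of fixed-size entries, which makes ELF well suited to relocation), and the low 8 bits give the relocation operation to perform (there are several kinds, depending on the parameters they use). Line 289 therefore fetches the symbol entry that this relocation entry refers to. If the symbol is local it is taken from local_symtab; otherwise it is looked up in symtab (the hash table), and the relocation type is recorded in it. After this, obj_load is finished. To sum up what it did: it loaded the ELF file of the module, analysed the symbol sections and relocation sections, stored the symbols in local_symtab and symtab according to their attributes, and used the relocation sections to determine how each symbol is to be relocated. Back to INSMOD_MAIN.

1638   /* Version correspondence? */
1639   k_version = get_kernel_version(k_strversion);
1640   m_version = get_module_version(f, m_strversion);
1641   if (m_version == -1) {
1642     error("couldn't find the kernel version the module was compiled for");
1643     goto out;
1644   }
1645
1646   k_crcs = is_kernel_checksummed();
1647   m_crcs = is_module_checksummed(f);
1648   if ((m_crcs == 0 || k_crcs == 0) &&
1649       strncmp(k_strversion, m_strversion, STRVERSIONLEN) != 0) {
1650     if (flag_force_load) {
1651       lprintf("Warning: kernel-module version mismatch\n"
1652               "\t%s was compiled for kernel version %s\n"
1653               "\twhile this kernel is version %s",
1654               filename, m_strversion, k_strversion);
1655     } else {
1656       if (!quiet)
1657         error("kernel-module version mismatch\n"
1658               "\t%s was compiled for kernel version %s\n"
1659               "\twhile this kernel is version %s.",
1660               filename, m_strversion, k_strversion);
1661       goto out;
1662     }
1663   }
1664   if (m_crcs != k_crcs)
1665     obj_set_symbol_compare(f, ncv_strcmp, ncv_symbol_hash);
1666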

Once the module file has been given this first pass, and before the symbols known to the kernel are processed, the kernel and module versions have to be checked. The functions that look up the two versions are also in insmod.c. They are fairly simple, so they are listed below without further comment.

Insmod: get_kernel_version function

107 /* Get the kernel version in the canonical integer form. */
108
109 static int get_kernel_version(char str[STRVERSIONLEN])
110 {
111   char *p, *q;
112   int a, b, c;
113
114   strncpy(str, uts_info.release, STRVERSIONLEN);
115   p = uts_info.release;
116
117   a = strtoul(p, &p, 10);
118   if (*p != '.')
119     return -1;
120   b = strtoul(p + 1, &p, 10);
121   if (*p != '.')
122     return -1;
123   c = strtoul(p + 1, &q, 10);
124   if (p + 1 == q)
125     return -1;
126
127   return a << 16 | b << 8 | c;
128 }
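The "canonical integer form" can be tried out in isolation with the sketch below. The function demo_parse_version is a hypothetical stand-alone variant that takes the release string as an argument instead of reading uts_info; the parsing steps follow the listing above.

```c
#include <stdlib.h>

/* Parse "a.b.c..." into (a << 16) | (b << 8) | c.
   Returns -1 if the string does not contain three dotted numbers. */
static int demo_parse_version(const char *release)
{
    char *p, *q;
    unsigned long a, b, c;

    a = strtoul(release, &p, 10);
    if (*p != '.')
        return -1;
    b = strtoul(p + 1, &p, 10);
    if (*p != '.')
        return -1;
    c = strtoul(p + 1, &q, 10);
    if (p + 1 == q)             /* no digits after the second dot */
        return -1;

    return (int)(a << 16 | b << 8 | c);
}
```

With this encoding "2.4.0" becomes 0x020400, and anything after the third number (for example the "-pre3" in "2.4.18-pre3") is ignored.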

Insmod: get_module_version function

557 /* Get the module's kernel version in the canonical integer form. */
558 static int get_module_version(struct obj_file *f, char str[STRVERSIONLEN])
559 {
560   int a, b, c;
561   char *p, *q;
562
563   if ((p = get_modinfo_value(f, "kernel_version")) == NULL) {
564     struct obj_symbol *sym;
565
566     m_has_modinfo = 0;
567     if ((sym = obj_find_symbol(f, "kernel_version")) == NULL)
568       sym = obj_find_symbol(f, "__module_kernel_version");
569     if (sym == NULL)
570       return -1;
571     p = f->sections[sym->secidx]->contents + sym->value;
572   } else
573     m_has_modinfo = 1;
574
575   strncpy(str, p, STRVERSIONLEN);
576
577
578   a = strtoul(p, &p, 10);
579   if (*p != '.')
580     return -1;
581   b = strtoul(p + 1, &p, 10);
582   if (*p != '.')
583     return -1;
584   c = strtoul(p + 1, &q, 10);
585   if (p + 1 == q)
586     return -1;
587
588   return a << 16 | b << 8 | c;
589 }

Next, insmod tests whether the kernel and the module use the extra version information (symbol checksums). These two functions are in the same file.

Insmod: is_kernel_checksummed function

590 /* Return the kernel symbol checksum version, or zero if not used. */
591 static int is_kernel_checksummed(void)
592 {
593   struct module_symbol *s;
594   size_t i;
595
596   /*
597    * Using_Versions might not be the first symbol,
598    * but it should be in there.
599    */
600   for (i = 0, s = ksyms; i < nksyms; ++i, ++s)
601     if (strcmp((char *) s->name, "Using_Versions") == 0)
602       return s->value;
603
604   return 0;
605 }

Insmod: is_module_checksummed function

607 static int is_module_checksummed(struct obj_file *f)
608 {
609   if (m_has_modinfo) {
610     const char *p = get_modinfo_value(f, "using_checksums");
611     if (p)
612       return atoi(p);
613     else
614       return 0;
615   } else
616     return obj_find_symbol(f, "Using_Versions") != NULL;
617 }

If the extra information is not used, both functions return 0. In lines 1648~1662, if the versions do not match and loading is not forced, insmod reports an error and exits at this point; otherwise it prints a warning and carries on. If the kernel and the module disagree on the use of the extra information (m_crcs != k_crcs), obj_set_symbol_compare installs a new symbol comparison function and a new hash function, and rebuilds the symtab table with the new hash function. This function is in obj_common.c.

Insmod: obj_set_symbol_compare function

62 void
63 obj_set_symbol_compare (struct obj_file *f,
64                         int (*cmp)(const char *, const char *),
65                         unsigned long (*hash)(const char *))
66 {
67   if (cmp)
68     f->symbol_cmp = cmp;
69   if (hash)
70     {
71       struct obj_symbol *tmptab[HASH_BUCKETS], *sym, *next;
72       int i;
73
74       f->symbol_hash = hash;
75
76       memcpy(tmptab, f->symtab, sizeof(tmptab));
77       memset(f->symtab, 0, sizeof(f->symtab));
78
79       for (i = 0; i < HASH_BUCKETS; ++i)
80         for (sym = tmptab[i]; sym ; sym = next)
81           {
82             unsigned long h = hash(sym->name) % HASH_BUCKETS;
83             next = sym->next;
84             sym->next = f->symtab[h];
85             f->symtab[h] = sym;
86           }
87     }
88 }

With this done, we arrive at another key operation.

1667   /* Let the module know about the kernel symbols. */
1668   add_kernel_symbols(f);

add_kernel_symbols is also in insmod.c.

Insmod: add_kernel_symbols function

266 static void add_kernel_symbols(struct obj_file *f)
267 {
268   struct module_stat *m;
269   size_t i, nused = 0;
270
271   /* Add module symbols first. */
272   for (i = 0, m = module_stat; i < n_module_stat; ++i, ++m)
273     if (m->nsyms &&
274         add_symbols_from(f, SHN_HIRESERVE + 2 + i, m->syms, m->nsyms))
275       m->status = 1 /* used */, ++nused;
276   n_ext_modules_used = nused;
277
278   /* And finally the symbols from the kernel proper. */
279   if (nksyms)
280     add_symbols_from(f, SHN_HIRESERVE + 1, ksyms, nksyms);
281 }

The function deals with the already loaded modules first. The symbols of each module are handled by add_symbols_from, which is also in insmod.c.

Insmod: add_symbols_from function

233 static int add_symbols_from(struct obj_file *f, int idx,
234                             struct module_symbol *syms, size_t nsyms)
235 {
236   struct module_symbol *s;
237   size_t i;
238   int used = 0;
239
240   for (i = 0, s = syms; i < nsyms; ++i, ++s) {
241     /*
242      * Only add symbols that are already marked external.
243      * If we override locals we may cause problems for
244      * argument initialization.
245      * We will also create a false dependency on the module.
246      */
247     struct obj_symbol *sym;
248
249     sym = obj_find_symbol(f, (char *) s->name);
250     if (sym && !ELFW(ST_BIND) (sym->info) == STB_LOCAL) {
251       sym = obj_add_symbol(f, (char *) s->name, -1,
252                            ELFW(ST_INFO) (STB_GLOBAL, STT_NOTYPE),
253                            idx, s->value, 0);
254       /*
255        * Did our symbol just get installed?
256        * If so, mark the module as "used".
257        */
258       if (sym->secidx == idx)
259         used = 1;
260     }
261   }
262
263   return used;
264 }

The function first looks in the symtab of obj_file for a symbol with the same name as the given module symbol. As long as that symbol is not local, obj_add_symbol is called to test and add the symbol. Note that the info argument passed in is ELFW(ST_INFO) (STB_GLOBAL, STT_NOTYPE), so the symbol being added is global with type STT_NOTYPE, and the symidx argument is -1. Inside obj_add_symbol the existing symbol is therefore replaced only if it is undefined (SHN_UNDEF), weak, or has section index SHN_COMMON. If a replacement took place, used is set to 1 to indicate that the module is in use, and the symbol now carries the new section index (note that a symbol whose section index used to be SHN_COMMON no longer has that index; it is important to remember this!). SHN_HIRESERVE is the upper bound of the reserved section indexes, so add_kernel_symbols uses section indexes above SHN_HIRESERVE to hold the symbols of the kernel and of the loaded modules; see lines 274 and 280. When the function finishes, n_ext_modules_used holds the number of modules that are used. Back to INSMOD_MAIN.

1670   /* Allocate common symbols, symbol tables, and string tables.
1671    *
1672    * The calls marked DEPMOD indicate the bits of code that depmod
1673    * uses to do a pseudo relocation, ignoring undefined symbols.
1674    * Any changes made to the relocation sequence here should be
1675    * checked against depmod.
1676    */
1677 #ifdef COMPAT_2_0
1678   if (k_new_syscalls
1679       ? !create_this_module(f, m_name)
1680       : !old_create_mod_use_count(f))
1681     goto out;
1682 #else
1683   if (!create_this_module(f, m_name))
1684     goto out;
1685 #endif

COMPAT_2_0 means that 2.0 kernels are supported (on such an old kernel k_new_syscalls is 0), in which case old_create_mod_use_count is called; we will not look at it here. Kernels after 2.0 use create_this_module, which is in the same file.

Insmod: create_this_module function

429 static int create_this_module(struct obj_file *f, const char *m_name)
430 {
431   struct obj_section *sec;
432
433   sec = obj_create_alloced_section_first(f, ".this", tgt_sizeof_long,
434                                          sizeof(struct module));
435   memset(sec->contents, 0, sizeof(struct module));
436
437   obj_add_symbol(f, "__this_module", -1, ELFW(ST_INFO) (STB_LOCAL, STT_OBJECT),
438                  sec->idx, 0, sizeof(struct module));
439
440   obj_string_patch(f, sec->idx, offsetof(struct module, name), m_name);
441
442   return 1;
443 }

It first creates a section with obj_create_alloced_section_first. That function is in obj_common.c.

Insmod: obj_create_alloced_section_first function

290 struct obj_section *
291 obj_create_alloced_section_first (struct obj_file *f, const char *name,
292                                   unsigned long align, unsigned long size)
293 {
294   int newidx = f->header.e_shnum++;
295   struct obj_section *sec;
296
297   f->sections = xrealloc(f->sections, (newidx+1) * sizeof(sec));
298   f->sections[newidx] = sec = arch_new_section();
299
300   memset(sec, 0, sizeof(*sec));
301   sec->header.sh_type = SHT_PROGBITS;
302   sec->header.sh_flags = SHF_WRITE|SHF_ALLOC;
303   sec->header.sh_size = size;
304   sec->header.sh_addralign = align;
305   sec->name = name;
306   sec->idx = newidx;
307   if (size)
308     sec->contents = xmalloc(size);
309
310   sec->load_next = f->load_order;
311   f->load_order = sec;
312   if (f->load_order_search_start == &f->load_order)
313     f->load_order_search_start = &sec->load_next;
314
315   return sec;
316 }

This function adds an SHT_PROGBITS section to obj_file. SHT_PROGBITS means that the section holds information defined by the program, whose format and meaning are determined solely by the program. The section is named ".this", is aligned to tgt_sizeof_long (which is simply sizeof(long)) and has length sizeof(struct module). Evidently the module structure is going to be stored in it. Back in create_this_module, once the .this section has been added, obj_add_symbol adds a symbol to obj_file. The symbol is named "__this_module", its binding is STB_LOCAL and its type is STT_OBJECT. Note that symidx is -1 in this call to obj_add_symbol, so the symbol is not added to local_symtab (because it was not originally part of the file). Finally obj_string_patch adds the module name to the string section. This function is in ./modutils-2.4.0/obj/obj_reloc.c.

Insmod: obj_string_patch function

33 int
34 obj_string_patch(struct obj_file *f, int secidx, ElfW(Addr) offset,
35                  const char *string)
36 {
37   struct obj_string_patch_struct *p;
38   struct obj_section *strsec;
39   size_t len = strlen(string)+1;
40   char *loc;
41
42   p = xmalloc(sizeof(*p));
43   p->next = f->string_patches;
44   p->reloc_secidx = secidx;
45   p->reloc_offset = offset;
46   f->string_patches = p;
47
48   strsec = obj_find_section(f, ".kstrtab");
49   if (strsec == NULL)
50     {
51       strsec = obj_create_alloced_section(f, ".kstrtab", 1, len);
52       p->string_offset = 0;
53       loc = strsec->contents;
54     }
55   else
56     {
57       p->string_offset = strsec->header.sh_size;
58       loc = obj_extend_section(strsec, len);
59     }
60   memcpy(loc, string, len);
61
62   return 1;
63 }

Recall that every string so far belongs either to a section name or to a symbol, whereas the module name has no corresponding symbol or section. So that the module name can still be referenced inside obj_file, obj_file introduces the obj_string_patch_struct structure, which takes in such orphaned strings. The structure is defined in ./modutils-2.4.0/include/obj.h.

Insmod: obj_string_patch_struct structure

121 struct obj_string_patch_struct
122 {
123   struct obj_string_patch_struct *next;
124   int reloc_secidx;
125   ElfW(Addr) reloc_offset;
126   ElfW(Addr) string_offset;
127 };

obj_extend_section is in obj_common.c.

Insmod: obj_extend_section function

318 void *
319 obj_extend_section (struct obj_section *sec, unsigned long more)
320 {
321   unsigned long oldsize = sec->header.sh_size;
322   sec->contents = xrealloc(sec->contents, sec->header.sh_size += more);
323   return sec->contents + oldsize;
324 }

Once more we work our way back to INSMOD_MAIN.

1687   if (!obj_check_undefineds(f, quiet)) /* DEPMOD, obj_clear_undefineds */
1688     goto out;
1689   obj_allocate_commons(f); /* DEPMOD */

This time obj_check_undefineds is used to check for unresolved symbols. The function is in obj_reloc.c.

Insmod: obj_check_undefineds function

81 int
82 obj_check_undefineds(struct obj_file *f, int quiet)
83 {
84   unsigned long i;
85   int ret = 1;
86
87   for (i = 0; i < HASH_BUCKETS; ++i)
88     {
89       struct obj_symbol *sym;
90       for (sym = f->symtab[i]; sym ; sym = sym->next)
91         if (sym->secidx == SHN_UNDEF)
92           {
93             if (ELFW(ST_BIND)(sym->info) == STB_WEAK)
94               {
95                 sym->secidx = SHN_ABS;
96                 sym->value = 0;
97               }
98             else if (sym->r_type) /* assumes R_arch_NONE is 0 on all arch */
99               {
100                 if (!quiet)
101                   error("unresolved symbol %s", sym->name);
102                 ret = 0;
103               }
104           }
105     }
106
107   return ret;
108 }

The ELF documentation states that when a weak symbol cannot be resolved, the link editor does not go to any trouble over it and simply sets its value to 0. The section index SHN_ABS means that the value is absolute and is not affected by relocation. If the symbol is not weak and a relocation is required for it (R_arch_NONE is 0 and means that no relocation is needed; the local symbols in symtab carry this value), it is an error, the familiar "unresolved symbol" error seen when building C or C++ programs. Next come the symbols with section index SHN_COMMON, which have not been allocated any storage yet. For these, st_value in the symbol structure gives the required alignment. The function obj_allocate_commons is in obj_reloc.c.


Insmod: obj_allocate_commons function

126 void
127 obj_allocate_commons(struct obj_file *f)
128 {
129   struct common_entry
130   {
131     struct common_entry *next;
132     struct obj_symbol *sym;
133   } *common_head = NULL;
134
135   unsigned long i;
136
137   for (i = 0; i < HASH_BUCKETS; ++i)
138     {
139       struct obj_symbol *sym;
140       for (sym = f->symtab[i]; sym ; sym = sym->next)
141         if (sym->secidx == SHN_COMMON)
142           {
143             /* Collect all COMMON symbols and sort them by size so as to
144                minimize space wasted by alignment requirements. */
145             {
146               struct common_entry **p, *n;
147               for (p = &common_head; *p ; p = &(*p)->next)
148                 if (sym->size <= (*p)->sym->size)
149                   break;
150
151               n = alloca(sizeof(*n));
152               n->next = *p;
153               n->sym = sym;
154               *p = n;
155             }
156           }
157     }
158
159   for (i = 1; i < f->local_symtab_size; ++i)
160     {
161       struct obj_symbol *sym = f->local_symtab[i];
162       if (sym && sym->secidx == SHN_COMMON)
163         {
164           struct common_entry **p, *n;
165           for (p = &common_head; *p ; p = &(*p)->next)
166             if (sym == (*p)->sym)
167               break;
168             else if (sym->size < (*p)->sym->size)
169               {
170                 n = alloca(sizeof(*n));
171                 n->next = *p;
172                 n->sym = sym;
173                 *p = n;
174                 break;
175               }
176         }
177     }
178
179   if (common_head)
180     {
181       /* Find the bss section. */
182       for (i = 0; i < f->header.e_shnum; ++i)
183         if (f->sections[i]->header.sh_type == SHT_NOBITS)
184           break;
185
186       /* If for some reason there hadn't been one, create one. */
187       if (i == f->header.e_shnum)
188         {
189           struct obj_section *sec;
190
191           f->sections = xrealloc(f->sections, (i+1) * sizeof(sec));
192           f->sections[i] = sec = arch_new_section();
193           f->header.e_shnum = i+1;
194
195           memset(sec, 0, sizeof(*sec));
196           sec->header.sh_type = SHT_PROGBITS;
197           sec->header.sh_flags = SHF_WRITE|SHF_ALLOC;
198           sec->name = ".bss";
199           sec->idx = i;
200         }
201
202       /* Allocate the COMMONS. */
203       {
204         ElfW(Addr) bss_size = f->sections[i]->header.sh_size;
205         ElfW(Addr) max_align = f->sections[i]->header.sh_addralign;
206         struct common_entry *c;
207
208         for (c = common_head; c ; c = c->next)
209           {
210             ElfW(Addr) align = c->sym->value;
211
212             if (align > max_align)
213               max_align = align;
214             if (bss_size & (align - 1))
215               bss_size = (bss_size | (align - 1)) + 1;
216
217             c->sym->secidx = i;
218             c->sym->value = bss_size;
219
220             bss_size += c->sym->size;
221           }
222
223         f->sections[i]->header.sh_size = bss_size;
224         f->sections[i]->header.sh_addralign = max_align;
225       }
226     }
227
228   /* For the sake of patch relocation and parameter initialization,
229      allocate zeroed data for NOBITS sections now. Note that after
230      this we cannot assume NOBITS are really empty. */
231   for (i = 0; i < f->header.e_shnum; ++i)
232     {
233       struct obj_section *s = f->sections[i];
234       if (s->header.sh_type == SHT_NOBITS)
235         {
236           if (s->header.sh_size)
237             s->contents = memset(xmalloc(s->header.sh_size),
238                                  0, s->header.sh_size);
239           else
240             s->contents = NULL;
241           s->header.sh_type = SHT_PROGBITS;
242         }
243     }
244 }

The for loop at line 137 finds all symbols in symtab (which contains global, weak and local symbols) whose section index is SHN_COMMON and sorts them by size; the comment says this reduces the space wasted by alignment requirements. The for loop at line 159 finds all symbols in local_symtab (which contains only local symbols, some of which may also appear in symtab) whose section index is SHN_COMMON and sorts them by size as well. Lines 166~167 are there to avoid adding a symbol twice. If any SHN_COMMON symbol was found, the loop at line 182 looks for the .bss section. SHT_NOBITS means that the section takes up no space in the file but otherwise has the same meaning as SHT_PROGBITS; this is exactly the type of the .bss section. If no .bss section is found, lines 187~200 add one; note that the type of the added section is SHT_PROGBITS and its flags are SHF_WRITE and SHF_ALLOC. Lines 203~225 compute the total size taken by the SHN_COMMON symbols and the final alignment the section requires. The loop at line 231 then allocates storage for the SHT_NOBITS sections and changes their type to SHT_PROGBITS. Note that the section index of these symbols has now been changed as well; keep this in mind. What this shows is that the variables of the file that are still COMMON at this point, together with everything that was already in .bss, end up in the .bss section. Back to INSMOD_MAIN.

1691   check_module_parameters(f, &persist_parms);
1692
1693   if (optind < argc) {
1694     if (!process_module_arguments(f, argc - optind, argv + optind, 1))
1695       goto out;
1696   }
1697   arch_create_got(f); /* DEPMOD */
1698   hide_special_symbols(f);
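Before following these calls, the alignment arithmetic used in lines 203~225 of obj_allocate_commons can be tried in isolation. The sketch below uses the same round-up expression; the demo_common structure and the function names are assumptions made for the example, and the alignment is assumed to be a power of two.

```c
/* Round size up to a multiple of align (align must be a power of two).
   Same expression as in obj_allocate_commons. */
static unsigned long demo_align_up(unsigned long size, unsigned long align)
{
    if (size & (align - 1))
        size = (size | (align - 1)) + 1;
    return size;
}

/* Hypothetical description of one COMMON symbol. */
struct demo_common {
    unsigned long size;    /* st_size */
    unsigned long align;   /* st_value of a COMMON symbol */
    unsigned long offset;  /* filled in: offset inside .bss */
};

/* Place n symbols one after another starting at offset start.
   Returns the new end of the section. */
static unsigned long demo_layout(struct demo_common *c, int n, unsigned long start)
{
    int i;
    for (i = 0; i < n; ++i) {
        start = demo_align_up(start, c[i].align);
        c[i].offset = start;
        start += c[i].size;
    }
    return start;
}
```

Placing symbols of size 1, 2 and 4 (each aligned to its own size) in that order yields offsets 0, 2 and 4 and a section size of 8.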

When a program is run from the command line, arguments can follow the program name. The same is possible with insmod, but the module has to do a little work for this to happen. To accept command-line parameters, a module must declare the variables concerned with MODULE_PARM. An example from the Linux ipv6 code:

124 /* Default to forward because I got too much mail already. */
125 static int forward = NF_ACCEPT;
126 MODULE_PARM(forward, "i");

Insmod: MODULE_PARM macro

MODULE_PARM is defined in linux-2.4.0/include/linux/module.h.

209 /* Used to verify parameters given to the module. The TYPE arg should
210    be a string in the following format:
211        [min[-max]]{b,h,i,l,s}
212    The MIN and MAX specifiers delimit the length of the array. If MAX
213    is omitted, it defaults to MIN; if both are omitted, the default is 1.
214    The final character is a type specifier:
215        b byte
216        h short
217        i int
218        l long
219        s string
220 */
221
222 #define MODULE_PARM(var,type) \
223 const char __module_parm_##var[] \
224 __attribute__((section(".modinfo"))) = \
225 "parm_" __MODULE_STRING(var) "=" type
226
227 #define MODULE_PARM_DESC(var,desc) \
228 const char __module_parm_desc_##var[] \
229 __attribute__((section(".modinfo"))) = \
230 "parm_desc_" __MODULE_STRING(var) "=" desc

151 /* Indirect stringification. */
152
153 #define __MODULE_STRING_1(x) #x
154 #define __MODULE_STRING(x) __MODULE_STRING_1(x)

For a variable declared with MODULE_PARM and MODULE_PARM_DESC, entries of the form "parm_var=type" and "parm_desc_var=desc" are added to the .modinfo section. Taking the example above and assuming the module is called ipv6, the command insmod ipv6 forward=1 can be used, and it sets the variable forward inside the module to 1. Now let us look at check_module_parameters, which is in the same file as INSMOD_MAIN.

Insmod: check_module_parameters function

1335 /* Check that all module parameters have reasonable definitions */
1336 static void check_module_parameters(struct obj_file *f, int *persist_flag)
1337 {
1338   struct obj_section *sec;
1339   char *ptr, *value, *n, *endptr;
1340   int namelen, err = 0;
1341
1342   sec = obj_find_section(f, ".modinfo");
1343   if (sec == NULL) {
1344     /* module does not support typed parameters */
1345     return;
1346   }
1347
1348   ptr = sec->contents;
1349   endptr = ptr + sec->header.sh_size;
1350   while (ptr < endptr && !err) {
1351     value = strchr(ptr, '=');
1352     n = strchr(ptr, '\0');
1353     if (value) {
1354       namelen = value - ptr;
1355       if (namelen >= 5 && strncmp(ptr, "parm_", 5) == 0
1356           && !(namelen > 10 && strncmp(ptr, "parm_desc_", 10) == 0)) {
1357         char *pname = xmalloc(namelen + 1);
1358         strncpy(pname, ptr + 5, namelen - 5);
1359         pname[namelen - 5] = '\0';
1360         err = check_module_parameter(f, pname, value+1, persist_flag);
1361         free(pname);
1362       }
1363     } else {
1364       if (n - ptr >= 5 && strncmp(ptr, "parm_", 5) == 0) {
1365         error("parameter %s found with no value", ptr);
1366         err = 1;
1367       }
1368     }
1369     ptr = n + 1;
1370   }
1371
1372   if (err)
1373     *persist_flag = 0;
1374   return;
1375 }
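The scan that check_module_parameters performs can be reduced to the following sketch, which only counts the parameter entries. It assumes the .modinfo contents are a run of NUL-terminated "key=value" strings, as the listing above implies; demo_count_parms is a hypothetical name, not a modutils function.

```c
#include <stddef.h>
#include <string.h>

/* Count the "parm_NAME=TYPE" entries in a buffer of NUL-terminated
   strings, skipping "parm_desc_NAME=..." entries. The buffer must
   end with a NUL byte. */
static int demo_count_parms(const char *buf, size_t len)
{
    const char *p = buf, *end = buf + len;
    int count = 0;

    while (p < end) {
        const char *eq = strchr(p, '=');   /* stops at the NUL of this entry */
        size_t entry_len = strlen(p);

        if (eq) {
            size_t namelen = (size_t)(eq - p);
            if (namelen >= 5 && strncmp(p, "parm_", 5) == 0 &&
                !(namelen > 10 && strncmp(p, "parm_desc_", 10) == 0))
                ++count;
        }
        p += entry_len + 1;                /* step over the terminating NUL */
    }
    return count;
}
```

The prefix test has to exclude "parm_desc_" explicitly, because a description entry also starts with "parm_".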

The main work is done by check_module_parameter, which is also in the same file.

Insmod: check_module_parameter function

1266 /* Check that a module parameter has a reasonable definition */
1267 static int check_module_parameter(struct obj_file *f, char *key, char *value, int *persist_flag)
1268 {
1269   struct obj_symbol *sym;
1270   int min, max;
1271   char *p = value;
1272
1273   sym = obj_find_symbol(f, key);
1274   if (sym == NULL) {
1275     /* FIXME: For 2.2 kernel compatibility, only issue warnings for
1276      * most error conditions. Make these all errors in 2.5.
1277      */
1278     lprintf("Warning: %s symbol for parameter %s not found", error_file, key);
1279     return(1);
1280   }
1281
1282   if (isdigit(*p)) {
1283     min = strtoul(p, &p, 10);
1284     if (*p == '-')
1285       max = strtoul(p + 1, &p, 10);
1286     else
1287       max = min;
1288   } else
1289     min = max = 1;
1290
1291   if (max < min) {
1292     lprintf("Warning: %s parameter %s has max < min!", error_file, key);
1293     return(1);
1294   }
1295
1296   switch (*p) {
1297   case 'c':
1298     if (!isdigit(p[1])) {
1299       lprintf("%s parameter %s has no size after 'c'!", error_file, key);
1300       return(1);
1301     }
1302     while (isdigit(p[1]))
1303       ++p; /* swallow c array size */
1304     break;
1305   case 'b': /* drop through */
1306   case 'h': /* drop through */
1307   case 'i': /* drop through */
1308   case 'l': /* drop through */
1309   case 's':
1310     break;
1311   case '\0':
1312     lprintf("%s parameter %s has no format character!", error_file, key);
1313     return(1);
1314   default:
1315     lprintf("%s parameter %s has unknown format character '%c'", error_file, key, *p);
1316     return(1);
1317   }
1318   switch (*++p) {
1319   case 'p':
1320     if (*(p-1) == 's') {
1321       error("parameter %s is invalid persistent string", key);
1322       return(1);
1323     }
1324     *persist_flag = 1;
1325     break;
1326   case '\0':
1327     break;
1328   default:
1329     lprintf("%s parameter %s has unknown format modifier '%c'", error_file, key, *p);
1330     return(1);
1331   }
1332   return(0);
1333 }
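The type string format [min[-max]]{b,h,i,l,s} documented with MODULE_PARM can be parsed on its own as in the sketch below. It is a simplified illustration that covers only the five type characters named in that kernel comment and ignores the c type and the p modifier that modutils also accepts; demo_spec and demo_parse_spec are hypothetical names.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct demo_spec {
    int min, max;   /* allowed number of values */
    char type;      /* one of b, h, i, l, s */
};

/* Parse "[min[-max]]{b,h,i,l,s}". Returns 0 on success, -1 on error. */
static int demo_parse_spec(const char *s, struct demo_spec *out)
{
    char *end;

    if (isdigit((unsigned char)*s)) {
        out->min = (int)strtoul(s, &end, 10);
        s = end;
        if (*s == '-') {
            out->max = (int)strtoul(s + 1, &end, 10);
            s = end;
        } else {
            out->max = out->min;
        }
    } else {
        out->min = out->max = 1;    /* both omitted: exactly one value */
    }

    if (out->max < out->min)
        return -1;
    if (*s == '\0' || strchr("bhils", *s) == NULL)
        return -1;                  /* missing or unknown type character */
    out->type = *s;
    return s[1] == '\0' ? 0 : -1;   /* nothing may follow in this sketch */
}
```

For instance, "1-4h" describes an array of one to four shorts, and a plain "i" describes a single int.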

函数首先确定符号存在。根据前面注释的解释，在声明类型之前，可以 min-max 这样的形式声明该参数可以接受的值的个数。所以，1282~1289 行解析可能存在的参数值个数声明，如果没有参数值个数取值范围，就默认为 1。1296 行处理变量的类型。这里与 2.4.0 内核有点不符，modutils2.4.0 还可以处理 c 和 p{c, i, l, h, s}这样的类型，如果变量前有 p，就表明这些参数来自文件，而且模块退出时，参数值需要回写。做完这个检查后，如果命令行里带入了参数（optind<argc），处理这些参数。函数 process_module_arguments 也在这个文件中。 Insmod——process_module_arguments 函数 733 static int process_module_arguments(struct obj_file *f, int argc, char **argv, int required) 734 { 735 for (; argc > 0; ++argv, --argc) { 736 struct obj_symbol *sym; 737 int c; 738 int min, max; 739 int n; 740 char *contents; 741 char *input;

- 83 -

742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 758 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798

char *fmt; char *key; char *loc; if ((input = strchr(*argv, '=')) == NULL) continue; n = input - *argv; input += 1; /* skip '=' */ key = alloca(n + 6); if (m_has_modinfo) { memcpy(key, "parm_", 5); memcpy(key + 5, *argv, n); key[n + 5] = '\0'; if ((fmt = get_modinfo_value(f, key)) == NULL) { if (required) { error("invalid parameter %s", key); return 0; } else { if (flag_verbose) lprintf("ignoring %s", *argv); continue; /* silently ignore optional parameters */ } } key += 5; if (isdigit(*fmt)) { min = strtoul(fmt, &fmt, 10); if (*fmt == '-') max = strtoul(fmt + 1, &fmt, 10); else max = min; } else min = max = 1; } else { /* not m_has_modinfo */ memcpy(key, *argv, n); key[n] = '\0'; if (isdigit(*input)) fmt = "i"; else fmt = "s"; min = max = 0; } sym = obj_find_symbol(f, key); /* * Also check that the parameter was not * resolved from the kernel. */ if (sym == NULL || sym->secidx > SHN_HIRESERVE) { error("symbol for parameter %s not found", key); return 0;

- 84 -

799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 849 850 851 852 853 854 855 856

} contents = f->sections[sym->secidx]->contents; loc = contents + sym->value; n = 1; while (*input) { char *str; switch (*fmt) { case 's': case 'c': /* * Do C quoting if we begin with a ", * else slurp the lot. */ if (*input == '"') { char *r; str = alloca(strlen(input)); for (r = str, input++; *input != '"'; ++input, ++r) { if (*input == '\0') { error("improperly terminated string argument for %s", key); return 0; } /* else */ if (*input != '\\') { *r = *input; continue; } /* else handle \ */ switch (*++input) { case 'a': *r = '\a'; break; case 'b': *r = '\b'; break; case 'e': *r = '\033'; break; case 'f': *r = '\f'; break; case 'n': *r = '\n'; break; case 'r': *r = '\r'; break; case 't': *r = '\t'; break; case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': c = *input - '0'; if ('0' <= input[1] && input[1] <= '7') { c = (c * 8) + *++input - '0'; if ('0' <= input[1] && input[1] <= '7') c = (c * 8) + *++input - '0'; } *r = c; break;

- 85 -

857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913

default: *r = *input; break; } } *r = '\0'; ++input; } else { /* * The string is not quoted. * We will break it using the comma * (like for ints). * If the user wants to include commas * in a string, he just has to quote it */ char *r; /* Search the next comma */ if ((r = strchr(input, ',')) != NULL) { /* * Found a comma * Recopy the current field */ str = alloca(r - input + 1); memcpy(str, input, r - input); str[r - input] = '\0'; /* Keep next fields */ input = r; } else { /* last string */ str = input; input = ""; } } if (*fmt == 's') { /* Normal string */ obj_string_patch(f, sym->secidx, loc - contents, str); loc += tgt_sizeof_char_p; } else { /* Array of chars (in fact, matrix !) */ long charssize; /* size of each member */ /* Get the size of each member */ /* Probably we should do that outside the loop ? */ if (!isdigit(*(fmt + 1))) { error("parameter type 'c' for %s must be followed by" " the maximum size", key); return 0; } charssize = strtoul(fmt + 1, (char **) NULL, 10); /* Check length */ if (strlen(str) >= charssize-1) { error("string too long for %s (max %ld)", key, charssize - 1); return 0; } /* Copy to location */

- 86 -

914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970

strcpy((char *) loc, str); loc += charssize;

/* safe, see check above */

} /* * End of 's' and 'c' */ break; case 'b': *loc++ = strtoul(input, &input, 0); break; case 'h': *(short *) loc = strtoul(input, &input, 0); loc += tgt_sizeof_short; break; case 'i': *(int *) loc = strtoul(input, &input, 0); loc += tgt_sizeof_int; break; case 'l': *(long *) loc = strtoul(input, &input, 0); loc += tgt_sizeof_long; break; default: error("unknown parameter type '%c' for %s", *fmt, key); return 0; } /* * end of switch (*fmt) */ while (*input && isspace(*input)) ++input; if (*input == '\0') break; /* while (*input) */ /* else */ if (*input == ',') { if (max && (++n > max)) { error("too many values for %s (max %d)", key, max); return 0; } ++input; /* continue with while (*input) */ } else { error("invalid argument syntax for %s: '%c'", key, *input); return 0; } } /* end of while (*input) */ if (min && (n < min)) {

- 87 -

971 972 973 974 975 976 977

error("too few values for %s (min %d)", key, min); return 0; } } /* end of for (;argc > 0;) */ return 1; } 因为使用命令行参数时，一定是 param=value 这样的形式。所以，746~752 行就是分开=2 边的

字符串。如果模块文件包含.modinfo 这个段，那么就通过 get_modinfo_value 函数在.modinfo 段里查找参数的格式。这个函数在同一文件里，而且很简单。 Insmod——get_modinfo_value 函数 401 static char * get_modinfo_value(struct obj_file *f, const char *key) 402 { 403 struct obj_section *sec; 404 char *p, *v, *n, *ep; 405 size_t klen = strlen(key); 406 407 sec = obj_find_section(f, ".modinfo"); 408 if (sec == NULL) 409 return NULL; 410 411 p = sec->contents; 412 ep = p + sec->header.sh_size; 413 while (p < ep) { 414 v = strchr(p, '='); 415 n = strchr(p, '\0'); 416 if (v) { 417 if (v - p == klen && strncmp(p, key, klen) == 0) 418 return v + 1; 419 } else { 420 if (n - p == klen && strcmp(p, key) == 0) 421 return n; 422 } 423 p = n + 1; 424 } 425 426 return NULL; 427 } 在这里调用 process_module_parameters 时，参数 required 为 1。因此，如果在.modinfo 段里找不到这个参数的格式，就要出错返回。如果没有.modinfo 段，那么只能简单地根据参数设置值是数字还是字符来确定格式。790 行在文件的符号集中查找与=左边字符串相同的符号。796 行确定该符号存在，而且不是内核提供的（回忆一下，来自内核的符号保存在 SHN_HIRESERVE+1 的段，其他模块的符号保存在 SHN_HIRESERVE+2 以上的模块。由此，可以知道，命令行参数设定的只能是模块自己的符号。 805 行开始根据数据声明的格式处理数据。对于字符型数据，如果字符串没有用””包含，则 “,”就是分割符。而且在””里的字符串可以使用转义字符。每个字符串通过 obj_string_patch 保存


到 obj_file 里。如果是数值型数据，则直接写到符号对应的值里。另外，还要测试传给参数的值的个数是否符合声明。处理完命令行参数，INSMOD_MAIN 接下来调用 arch_create_got。这个函数在./modutils2.4.0/obj/obj_i386.c 中。 Insmod——arch_create_got 函数 158 int 159 arch_create_got (struct obj_file *f) 160 { 161 struct i386_file *ifile = (struct i386_file *)f; 162 int i, n, offset = 0, gotneeded = 0; 163 164 n = ifile->root.header.e_shnum; 165 for (i = 0; i < n; ++i) 166 { 167 struct obj_section *relsec, *symsec, *strsec; 168 Elf32_Rel *rel, *relend; 169 Elf32_Sym *symtab; 170 const char *strtab; 171 172 relsec = ifile->root.sections[i]; 173 if (relsec->header.sh_type != SHT_REL) 174 continue; 175 176 symsec = ifile->root.sections[relsec->header.sh_link]; 177 strsec = ifile->root.sections[symsec->header.sh_link]; 178 179 rel = (Elf32_Rel *)relsec->contents; 180 relend = rel + (relsec->header.sh_size / sizeof(Elf32_Rel)); 181 symtab = (Elf32_Sym *)symsec->contents; 182 strtab = (const char *)strsec->contents; 183 184 for (; rel < relend; ++rel) 185 { 186 Elf32_Sym *extsym; 187 struct i386_symbol *intsym; 188 const char *name; 189 190 switch (ELF32_R_TYPE(rel->r_info)) 191 { 192 case R_386_GOTPC: 193 case R_386_GOTOFF: 194 gotneeded = 1; 195 default: 196 continue; 197 198 case R_386_GOT32: 199 break; 200 } 201 202 extsym = &symtab[ELF32_R_SYM(rel->r_info)]; 203 if (extsym->st_name) 204 name = strtab + extsym->st_name; 205 else 206 name = f->sections[extsym->st_shndx]->name;

(arch_create_got, continued: lines 207 to 222)

      intsym = (struct i386_symbol *)obj_find_symbol(&ifile->root, name);
      if (!intsym->gotent.offset_done)
        {
          intsym->gotent.offset_done = 1;
          intsym->gotent.offset = offset;
          offset += 4;
        }
    }
  }

  if (offset > 0 || gotneeded)
    ifile->got = obj_create_alloced_section(&ifile->root, ".got", 4, offset);

  return 1;
}

This function creates the file's .got section. What is that for? GOT stands for global offset table.
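The slot assignment at the end of arch_create_got can be illustrated in isolation. The following is a minimal sketch, not the modutils code: struct got_sym and assign_got_offsets are hypothetical names, and the array of references stands in for the walk over R_386_GOT32 relocations.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct i386_symbol: only the GOT bookkeeping. */
struct got_sym {
    int offset;               /* byte offset of this symbol's slot in .got */
    unsigned offset_done : 1; /* set once a slot has been assigned */
};

/*
 * refs[] lists the symbols in the order their GOT32-style relocations are
 * met; the same symbol may appear several times. Each symbol receives one
 * 4-byte slot the first time it is seen. Returns the .got size in bytes.
 */
static int assign_got_offsets(struct got_sym **refs, size_t nrefs)
{
    int offset = 0;

    for (size_t i = 0; i < nrefs; ++i) {
        struct got_sym *s = refs[i];
        if (!s->offset_done) {
            s->offset_done = 1;
            s->offset = offset;
            offset += 4; /* one 32-bit entry per symbol on i386 */
        }
    }
    return offset;
}
```

Because of the offset_done flag, a symbol referenced by many relocations still occupies a single entry, so the section size grows with the number of distinct symbols rather than the number of relocations.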

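This section holds absolute addresses, which are not affected by relocation. If a program needs to refer directly to the absolute address of a symbol, that symbol must appear in the .got section (in the original wording: "that symbol will have a global offset table entry"). When does a program need a symbol's absolute address? That depends on the relocation type. Relocations of type R_386_GOTOFF, R_386_GOTPC and R_386_GOT32 use the .got section. In the code above, only R_386_GOT32 gives the symbol an entry of its own; R_386_GOTPC and R_386_GOTOFF merely record (through gotneeded) that the section must exist. The function uses the i386_file and i386_symbol structures, both defined in the same file (lines 34 to 51).

struct i386_got_entry
{
  int offset;
  unsigned offset_done : 1;
  unsigned reloc_done : 1;
};

struct i386_file
{
  struct obj_file root;
  struct obj_section *got;
};

struct i386_symbol
{
  struct obj_symbol root;
  struct i386_got_entry gotent;
};

These structures are simple; they only add a few auxiliary control bits. Next, INSMOD_MAIN calls hide_special_symbols to change the binding of the symbols "cleanup_module", "init_module" and "kernel_version" to local, so that they are not visible from outside. This function is also in insmod.c.

Insmod: hide_special_symbols function

283 static void hide_special_symbols(struct obj_file *f)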

(hide_special_symbols, continued: lines 284 to 298)

{
    struct obj_symbol *sym;
    const char *const *p;
    static const char *const specials[] = {
        "cleanup_module",
        "init_module",
        "kernel_version",
        NULL
    };

    for (p = specials; *p; ++p)
        if ((sym = obj_find_symbol(f, *p)) != NULL)
            sym->info =
                ELFW(ST_INFO) (STB_LOCAL, ELFW(ST_TYPE) (sym->info));
}

Next, if the module's parameters are to come from a file, the file name is saved so that the parameter values can be read from it later.

1700 if (persist_parms && persist_name && *persist_name) { 1701 f->persist = persist_name; 1702 persist_name = NULL; 1703 } 1704 1705 if (persist_parms && 1706 persist_name && !*persist_name) { 1707 /* -e "". This is ugly. Take the filename, compare it against 1708 * each of the module paths until we find a match on the start 1709 * of the filename, assume the rest is the relative path. Have 1710 * to do it this way because modprobe uses absolute filenames 1711 * for module names in modules.dep and the format of modules.dep 1712 * does not allow for any backwards compatible changes, so there 1713 * is nowhere to store the relative filename. The only way this 1714 * should fail to calculate a relative path is "insmod ./xxx", for 1715 * that case the user has to specify -e filename. 1716 */ 1717 int j, l = strlen(filename); 1718 char *relative = NULL; 1719 char *p; 1720 for (i = 0; i < nmodpath; ++i) { 1721 p = modpath[i].path; 1722 j = strlen(p); 1723 while (j && p[j] == '/') 1724 --j; 1725 if (j < l && strncmp(filename, p, j) == 0 && filename[j] == '/') { 1726 while (filename[j] == '/') 1727 ++j; 1728 relative = xstrdup(filename+j); 1729 break; 1730 } 1731 } 1732 if (relative) { 1733 i = strlen(relative); 1734 if (i > 3 && strcmp(relative+i-3, ".gz") == 0) 1735 relative[i -= 3] = '\0'; 1736 if (i > 2 && strcmp(relative+i-2, ".o") == 0) 1737 relative[i -= 2] = '\0';


1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 }

else if (i > 4 && strcmp(relative+i-4, ".mod") == 0) relative[i -= 4] = '\0'; f->persist = xmalloc(strlen(persistdir) + 1 + i + 1); strcpy(f->persist, persistdir); /* safe, xmalloc */ strcat(f->persist, "/");/* safe, xmalloc */ strcat(f->persist, relative); /* safe, xmalloc */ free(relative); } else error("Cannot calculate persistent filename");
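The name derivation for the -e "" case can be sketched as a standalone helper. This is an illustration under stated assumptions, not the insmod code: persist_name_for is a hypothetical function, it handles a single module path rather than the modpath[] array, and unlike the original it trims trailing slashes from the path by looking at the last character.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Derive "<persistdir>/<module name relative to modpath>" from a module
 * file name, dropping a trailing ".gz" and then ".o" or ".mod".
 * Returns a malloc'd string, or NULL if filename is not under modpath.
 */
static char *persist_name_for(const char *filename, const char *modpath,
                              const char *persistdir)
{
    size_t plen = strlen(modpath);
    size_t flen = strlen(filename);

    /* Ignore trailing slashes on the module path. */
    while (plen > 0 && modpath[plen - 1] == '/')
        --plen;

    /* filename must start with modpath followed by a '/'. */
    if (plen >= flen || strncmp(filename, modpath, plen) != 0 ||
        filename[plen] != '/')
        return NULL;

    const char *rel = filename + plen;
    while (*rel == '/')
        ++rel;

    /* Strip the suffixes by shortening the length, not by copying. */
    size_t rlen = strlen(rel);
    if (rlen > 3 && strcmp(rel + rlen - 3, ".gz") == 0)
        rlen -= 3;
    if (rlen > 2 && strncmp(rel + rlen - 2, ".o", 2) == 0)
        rlen -= 2;
    else if (rlen > 4 && strncmp(rel + rlen - 4, ".mod", 4) == 0)
        rlen -= 4;

    size_t total = strlen(persistdir) + 1 + rlen + 1;
    char *out = malloc(total);
    if (out == NULL)
        return NULL;
    snprintf(out, total, "%s/%.*s", persistdir, (int)rlen, rel);
    return out;
}
```

As the source comment notes, a name such as ./xxx matches no module path, so the helper returns NULL and the user has to pass an explicit file name with -e.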

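There is not much to say about this code. Lines 1700 to 1703 record a file name given explicitly with -e. Lines 1705 to 1748 handle the awkward case of -e "" (an empty name): the module's file name is compared against each module path, the part after the matching path is taken as a relative name, a trailing .gz and then .o or .mod is stripped, and the result is appended to persistdir. Next come some sanity checks, followed by the processing of the parameter values read from the file (lines 1750 to 1789).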

if (f->persist && *(f->persist) != '/') { error("Persistent filenames must be absolute, ignoring '%s'", f->persist); free(f->persist); f->persist = NULL; } if (f->persist && !flag_ksymoops) { error("has persistent data but ksymoops symbols are not available"); free(f->persist); f->persist = NULL; } if (f->persist && !k_new_syscalls) { error("has persistent data but the kernel is too old to support it"); free(f->persist); f->persist = NULL; } if (persist_parms && flag_verbose) { if (f->persist) lprintf("Persist filename '%s'", f->persist); else lprintf("No persistent filename available"); } if (f->persist) { FILE *fp = fopen(f->persist, "r"); if (!fp) { if (flag_verbose) lprintf("Cannot open persist file '%s' %m", f->persist); } else { int pargc = 0; char *pargv[1000]; /* hard coded but big enough */ char line[3000]; /* hard coded but big enough */ char *p; while (fgets(line, sizeof(line), fp)) { p = strchr(line, '\n'); if (!p) {


1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 }

error("Persistent data line is too long\n%s", line); break; } *p = '\0'; p = line; while (isspace(*p)) ++p; if (!*p || *p == '#') continue; if (pargc == sizeof(pargv)/sizeof(pargv[0])) { error("More than %d persistent parameters", pargc); break; } pargv[pargc++] = xstrdup(p); } fclose(fp); if (!process_module_arguments(f, pargc, pargv, 0)) goto out; while (pargc--) free(pargv[pargc]); }
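The per-line handling of the persistent data file can be sketched on its own. This is a simplified illustration: persist_line_payload is a hypothetical helper, and unlike the loop above it tolerates a missing newline instead of reporting an over-long line.

```c
#include <ctype.h>
#include <string.h>

/*
 * Prepare one line read from the persistent data file: cut it at the
 * newline, skip leading white space, and return a pointer to the
 * "param=value" text, or NULL for blank lines and '#' comments.
 * The buffer is modified in place.
 */
static char *persist_line_payload(char *line)
{
    char *nl = strchr(line, '\n');
    if (nl != NULL)
        *nl = '\0';

    char *p = line;
    while (isspace((unsigned char)*p))
        ++p;

    if (*p == '\0' || *p == '#')
        return NULL;
    return p;
}
```

Each non-NULL result corresponds to one pargv entry in the original loop, which is why every setting in the file needs a line of its own.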

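The core of this processing is process_module_arguments, which we have already examined; here it is called with its last argument set to 0. The loop also shows what the file looks like: one setting of the form A=a per line, with blank lines and lines beginning with # skipped. The next statement is at lines 1813 to 1814.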

if (flag_ksymoops) add_ksymoops_symbols(f, filename, m_name);

If flag_ksymoops is set, add_ksymoops_symbols is called. ksymoops is a debugging aid: it tries to turn the code bytes of an oops report into instructions and to map stack values to kernel symbols. In many cases this information is enough to determine the likely cause of a fault. This function is also in insmod.c.

Insmod: add_ksymoops_symbols function

619 /* add module source, timestamp, kernel version and a symbol for the
620  * start of some sections. this info is used by ksymoops to do better
621  * debugging.
622  */
623 static void add_ksymoops_symbols(struct obj_file *f, const char *filename,
624                                  const char *m_name)
625 {
626     struct obj_section *sec;
627     struct obj_symbol *sym;
628     char *name, *absolute_filename;
629     char str[STRVERSIONLEN], real[PATH_MAX];
630     int i, l, lm_name, lfilename, use_ksymtab, version;
631     struct stat statbuf;
632
633     static const char *section_names[] = {
634         ".text",
635         ".rodata",
636         ".data",
637         ".bss"
638     };

(add_ksymoops_symbols, continued: lines 639 to 695)

if (realpath(filename, real)) { absolute_filename = xstrdup(real); } else { int save_errno = errno; error("cannot get realpath for %s", filename); errno = save_errno; perror(""); absolute_filename = xstrdup(filename); } lm_name = strlen(m_name); lfilename = strlen(absolute_filename); /* add to ksymtab if it already exists or there is no ksymtab and other symbols * are not to be exported. otherwise leave ksymtab alone for now, the * "export all symbols" compatibility code will export these symbols later. */ use_ksymtab = obj_find_section(f, "__ksymtab") || !flag_export; if ((sec = obj_find_section(f, ".this"))) { /* tag the module header with the object name, last modified * timestamp and module version. worst case for module version * is 0xffffff, decimal 16777215. putting all three fields in * one symbol is less readable but saves kernel space. */ l = sizeof(symprefix)+ /* "__insmod_" */ lm_name+ /* module name */ 2+ /* "_O" */ lfilename+ /* object filename */ 2+ /* "_M" */ 2*sizeof(statbuf.st_mtime)+ /* mtime in hex */ 2+ /* "_V" */ 8+ /* version in dec */ 1; /* nul */ name = xmalloc(l); if (stat(absolute_filename, &statbuf) != 0) statbuf.st_mtime = 0; version = get_module_version(f, str); /* -1 if not found */ snprintf(name, l, "%s%s_O%s_M%0*lX_V%d", symprefix, m_name, absolute_filename, 2*sizeof(statbuf.st_mtime), statbuf.st_mtime, version); sym = obj_add_symbol(f, name, -1, ELFW(ST_INFO) (STB_GLOBAL, STT_NOTYPE), sec->idx, sec->header.sh_addr, 0); if (use_ksymtab) add_ksymtab(f, sym); } free(absolute_filename); /* record where the persistent data is going, same address as previous symbol */ if (f->persist) { l = sizeof(symprefix)+

/* "__insmod_" */

(add_ksymoops_symbols, continued: lines 696 to 731)

lm_name+ /* module name */ 2+ /* "_P" */ strlen(f->persist)+ /* data store */ 1; /* nul */ name = xmalloc(l); snprintf(name, l, "%s%s_P%s", symprefix, m_name, f->persist); sym = obj_add_symbol(f, name, -1, ELFW(ST_INFO) (STB_GLOBAL, STT_NOTYPE), sec->idx, sec->header.sh_addr, 0); if (use_ksymtab) add_ksymtab(f, sym); } /* tag the desired sections if size is non-zero */ for (i = 0; i < sizeof(section_names)/sizeof(section_names[0]); ++i) { if ((sec = obj_find_section(f, section_names[i])) && sec->header.sh_size) { l = sizeof(symprefix)+ /* "__insmod_" */ lm_name+ /* module name */ 2+ /* "_S" */ strlen(sec->name)+ /* section name */ 2+ /* "_L" */ 8+ /* length in dec */ 1; /* nul */ name = xmalloc(l); snprintf(name, l, "%s%s_S%s_L%ld", symprefix, m_name, sec->name, (long)sec->header.sh_size); sym = obj_add_symbol(f, name, -1, ELFW(ST_INFO) (STB_GLOBAL, STT_NOTYPE), sec->idx, sec->header.sh_addr, 0); if (use_ksymtab) add_ksymtab(f, sym); } } } 这个函数做的事情不复杂，就是记录经过修饰的模块名和经过修饰的长度不为 0 的段名。另

外，如果模块运行时参数是通过文件传入的，也要修饰、记录这个文件名。在这里修饰的目的，应该是为了唯一确定所使用的模块。在这里 flag_export 是全局变量，初始化为 1，当运行 insmod 时，如果使用了-x，就置为 0，如果使用了-X，就置为 1。默认为 1。在存在__ksymtab 段或者不存在这个段，而且其他符号都不导出的情况下，还要将这些符号添加入__symtab 段。这个由函数 add_ksymtab 完成。这个函数在同一文件里。 Insmod——add_ksymtab 函数 476 /* add an entry to the __ksymtab section, creating it if necessary */ 477 static void add_ksymtab(struct obj_file *f, struct obj_symbol *sym) 478 { 479 struct obj_section *sec; 480 ElfW(Addr) ofs; 481 482 /* ensure __ksymtab is allocated, EXPORT_NOSYMBOLS creates a non-alloc section.

(add_ksymtab, continued: lines 483 to 502)

* If __ksymtab is defined but not marked alloc, x out the first character * (no obj_delete routine) and create a new __ksymtab with the correct * characteristics. */ sec = obj_find_section(f, "__ksymtab"); if (sec && !(sec->header.sh_flags & SHF_ALLOC)) { *((char *)(sec->name)) = 'x'; /* override const */ sec = NULL; } if (!sec) sec = obj_create_alloced_section(f, "__ksymtab", tgt_sizeof_void_p, 0); if (!sec) return; sec->header.sh_flags |= SHF_ALLOC; ofs = sec->header.sh_size; obj_symbol_patch(f, sec->idx, ofs, sym); obj_string_patch(f, sec->idx, ofs + tgt_sizeof_void_p, sym->name); obj_extend_section(sec, 2 * tgt_sizeof_char_p); } 在这个函数里，如果存在__symtab 段，但是没有 SHF_ALLOC 标志（表明段在程序执行时不占

据内存资源），那么直接在段的开头写入“x”，标识它已删除。为了使段容易扩展，在段里面保存的都是“指针”。所以，函数使用 obj_symbol_patch 来保存这些“额外”的 symbol。这个函数在 obj_reloc.c 中。 Insmod——obj_symbol_patch 函数 65 int 66 obj_symbol_patch(struct obj_file *f, int secidx, ElfW(Addr) offset, 67 struct obj_symbol *sym) 68 { 69 struct obj_symbol_patch_struct *p; 70 71 p = xmalloc(sizeof(*p)); 72 p->next = f->symbol_patches; 73 p->reloc_secidx = secidx; 74 p->reloc_offset = offset; 75 p->sym = sym; 76 f->symbol_patches = p; 77 78 return 1; 79 } 如果内核版本在 v2.1.x 以上，调用 create_module_symtab。这个函数也在 insmod.c 里。 1816 if (k_new_syscalls) 1817 create_module_ksymtab(f); Insmod——create_module_ksymtab 函数 504 static int create_module_ksymtab(struct obj_file *f) 505 { 506 struct obj_section *sec; 507 int i; 508 509 /* We must always add the module references. */


(create_module_ksymtab, continued: lines 510 to 555)

    if (n_ext_modules_used) {
        struct module_ref *dep;
        struct obj_symbol *tm;

        sec = obj_create_alloced_section(f, ".kmodtab", tgt_sizeof_void_p,
                                         (sizeof(struct module_ref)
                                          * n_ext_modules_used));
        if (!sec)
            return 0;

        tm = obj_find_symbol(f, "__this_module");
        dep = (struct module_ref *) sec->contents;
        for (i = 0; i < n_module_stat; ++i)
            if (module_stat[i].status /* used */) {
                dep->dep = module_stat[i].addr;
                obj_symbol_patch(f, sec->idx,
                                 (char *) &dep->ref - sec->contents, tm);
                dep->next_ref = 0;
                ++dep;
            }
    }

    if (flag_export && !obj_find_section(f, "__ksymtab")) {
        int *loaded;

        /* We don't want to export symbols residing in sections that
           aren't loaded. There are a number of these created so that
           we make sure certain module options don't appear twice. */

        loaded = alloca(sizeof(int) * (i = f->header.e_shnum));
        while (--i >= 0)
            loaded[i] = (f->sections[i]->header.sh_flags & SHF_ALLOC) != 0;

        for (i = 0; i < HASH_BUCKETS; ++i) {
            struct obj_symbol *sym;
            for (sym = f->symtab[i]; sym; sym = sym->next) {
                if (ELFW(ST_BIND) (sym->info) != STB_LOCAL
                    && sym->secidx <= SHN_HIRESERVE
                    && (sym->secidx >= SHN_LORESERVE
                        || loaded[sym->secidx])) {
                    add_ksymtab(f, sym);
                }
            }
        }
    }

    return 1;
}

n_ext_modules_used is the number of modules that this module references. Information about every module loaded in the kernel is kept in the module_stat array, and the status field of each referenced module's entry is 1 (see add_kernel_symbols). In lines 511 to 530 the function creates a .kmodtab section that records the modules this module references, including the reference from each of them back to this module, which is patched in through the __this_module symbol.

If symbols are to be exported and no __ksymtab section exists (recall that in this situation add_ksymoops_symbols does not create a __ksymtab section), the section is created through add_ksymtab and the symbols the module exports are added to it. Notice that when the module exports symbols, the decorated module name, module file name and even the parameter file name are added to __ksymtab here too, because add_ksymoops_symbols gave those symbols STB_GLOBAL binding and a section index pointing at the .this section.

Pay attention to how the exported symbols are selected (the if at line 545). A symbol is exported when it is not local, its section index does not exceed SHN_HIRESERVE, and it either lies in the reserved range SHN_LORESERVE to SHN_HIRESERVE (indices that never appear in the section header array) or belongs to a loaded section, that is, one with SHF_ALLOC set. Recall that indices above SHN_HIRESERVE hold the external symbols the module references, so those are never exported. Back to INSMOD_MAIN (lines 1819 to 1825).

/* archdata based on relocatable addresses */ if (add_archdata(f, &archdata)) goto out; /* kallsyms based on relocatable addresses */ if (add_kallsyms(f, &kallsyms, force_kallsyms)) goto out;

The add_archdata function is also in insmod.c.

Insmod: add_archdata function

1034 /* Add an arch data section if the arch wants it. */
1035 static int add_archdata(struct obj_file *f,
1036                         struct obj_section **sec)
1037 {
1038     size_t i;
1039
1040     *sec = NULL;
1041     /* Add an empty archdata section to the module if necessary */
1042     for (i = 0; i < f->header.e_shnum; ++i) {
1043         if (strcmp(f->sections[i]->name, ARCHDATA_SEC_NAME) == 0) {
1044             *sec = f->sections[i];
1045             break;
1046         }
1047     }
1048     if (!*sec)
1049         *sec = obj_create_alloced_section(f, ARCHDATA_SEC_NAME, 16, 0);
1050
1051     /* Size and populate archdata */
1052     if (arch_archdata(f, *sec))
1053         return(1);
1054     return 0;
1055 }

The function looks for a section named "__archdata" (the value of the macro ARCHDATA_SEC_NAME) and creates an empty one if the module does not already have it. On x86 the arch_archdata function is empty. Next comes the call to add_kallsyms, which is in the same file.

Insmod: add_kallsyms function

979 /* Add a kallsyms section if the kernel supports all symbols. */
980 static int add_kallsyms(struct obj_file *f,
981                         struct obj_section **module_kallsyms, int force_kallsyms)
982 {


983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 }

struct module_symbol *s; struct obj_file *f_kallsyms; struct obj_section *sec_kallsyms; size_t i; int l; const char *p, *pt_R; unsigned long start = 0, stop = 0; for (i = 0, s = ksyms; i < nksyms; ++i, ++s) { p = (char *)s->name; pt_R = strstr(p, "_R"); if (pt_R) l = pt_R - p; else l = strlen(p); if (strncmp(p, "__start_" KALLSYMS_SEC_NAME, l) == 0) start = s->value; else if (strncmp(p, "__stop_" KALLSYMS_SEC_NAME, l) == 0) stop = s->value; } if (start >= stop && !force_kallsyms) return(0); /* The kernel contains all symbols, do the same for this module. */ /* Add an empty kallsyms section to the module if necessary */ for (i = 0; i < f->header.e_shnum; ++i) { if (strcmp(f->sections[i]->name, KALLSYMS_SEC_NAME) == 0) { *module_kallsyms = f->sections[i]; break; } } if (!*module_kallsyms) *module_kallsyms = obj_create_alloced_section(f, KALLSYMS_SEC_NAME, 0, 0); /* Size and populate kallsyms */ if (obj_kallsyms(f, &f_kallsyms)) return(1); sec_kallsyms = f_kallsyms->sections[KALLSYMS_IDX]; (*module_kallsyms)->header.sh_addralign = sec_kallsyms->header.sh_addralign; (*module_kallsyms)->header.sh_size = sec_kallsyms->header.sh_size; free((*module_kallsyms)->contents); (*module_kallsyms)->contents = sec_kallsyms->contents; sec_kallsyms->contents = NULL; obj_free(f_kallsyms); return 0;
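The name test in the loop above ignores anything from "_R" onward, which is where a versioned kernel symbol carries its checksum suffix. A minimal sketch of that comparison follows; same_base_name is a hypothetical helper, and it is deliberately stricter than the original in that it also requires the base lengths to be equal, whereas the strncmp in add_kallsyms accepts any name that is a prefix of the wanted one.

```c
#include <string.h>

/*
 * Compare a kernel symbol name with a wanted name, ignoring a version
 * suffix that starts at the first "_R" in the kernel symbol.
 * Returns 1 on a match, 0 otherwise.
 */
static int same_base_name(const char *ksym, const char *wanted)
{
    const char *suffix = strstr(ksym, "_R");
    size_t base_len = suffix ? (size_t)(suffix - ksym) : strlen(ksym);

    return base_len == strlen(wanted) &&
           strncmp(ksym, wanted, base_len) == 0;
}
```

The strings used with it would be "__start_" and "__stop_" followed by KALLSYMS_SEC_NAME, which the walkthrough gives as "__kallsyms".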

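This function decides whether the module should carry a kallsyms section. From the code we can see that the kernel's own kallsyms data lies between the addresses of the symbols named "__start_" and "__stop_" followed by KALLSYMS_SEC_NAME; if the kernel has no such data (start >= stop) and force_kallsyms is not set, nothing is added. The real work is done by obj_kallsyms, which is in ./modutils-2.4.0/obj/obj_kallsyms.c.

Insmod: obj_kallsyms function

84 /* Extract all symbols from the input obj_file, ignore ones that are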

(obj_kallsyms, continued: lines 85 to 141)

* no use for debugging, build an output obj_file containing only the * kallsyms section. * * The kallsyms section is a bit unusual. It deliberately has no * relocatable data, all "pointers" are represented as byte offsets * into the the section. This means it can be stored anywhere without * relocation problems. In particular it can be stored within a kernel * image, it can be stored separately from the kernel image, it can be * appended to a module just before loading, it can be stored in a * separate area etc. * * Format of the kallsyms section. * * Header: * Size of header. * Total size of kallsyms data, including strings. * Number of loaded sections. * Offset to first section entry from start of header. * Size of each section entry, excluding the name string. * Number of symbols. * Offset to first symbol entry from start of header. * Size of each symbol entry, excluding the name string. * * Section entry - one per loaded section. * Start of section[1]. * Size of section. * Offset to name of section, from start of strings. * Section flags. * * Symbol entry - one per symbol in the input file[2]. * Offset of section that owns this symbol, from start of section data. * Address of symbol within the real section[1]. * Offset to name of symbol, from start of strings. * * Notes: [1] This is an exception to the "represent pointers as * offsets" rule, it is a value, not an offset. The start * address of a section or a symbol is extracted from the * obj_file data which may contain absolute or relocatable * addresses. If the addresses are relocatable then the * caller must adjust the section and/or symbol entries in * kallsyms after relocation. * [2] Only symbols that fall within loaded sections are stored. 
*/ int obj_kallsyms (struct obj_file *fin, struct obj_file **fout_result) { struct obj_file *fout; int i, loaded = 0, *fin_to_allsym_map; struct obj_section *isec, *osec; struct kallsyms_header *a_hdr; struct kallsyms_section *a_sec; ElfW(Off) sec_off; struct kallsyms_symbol *symbols = NULL, a_sym; ElfW(Word) symbols_size = 0, symbols_left = 0; char *strings = NULL, *p; ElfW(Word) strings_size = 0, strings_left = 0;

(obj_kallsyms, continued: lines 142 to 198)

ElfW(Off) file_offset; static char strtab[] = "\000" KALLSYMS_SEC_NAME; /* Create the kallsyms section. */ fout = arch_new_file(); memset(fout, 0, sizeof(*fout)); fout->symbol_cmp = strcmp; fout->symbol_hash = obj_elf_hash; fout->load_order_search_start = &fout->load_order; /* Copy file characteristics from input file and modify to suit */ memcpy(&fout->header, &fin->header, sizeof(fout->header)); fout->header.e_type = ET_REL; /* Output is relocatable */ fout->header.e_entry = 0; /* No entry point */ fout->header.e_phoff = 0; /* No program header */ file_offset = sizeof(fout->header); /* Step over Elf header */ fout->header.e_shoff = file_offset; /* Section headers next */ fout->header.e_phentsize = 0; /* No program header */ fout->header.e_phnum = 0; /* No program header */ fout->header.e_shnum = KALLSYMS_IDX+1; /* Initial, strtab, kallsyms */ fout->header.e_shstrndx = KALLSYMS_IDX-1; /* strtab */ file_offset += fout->header.e_shentsize * fout->header.e_shnum; /* Populate the section data for kallsyms itself */ fout->sections = xmalloc(sizeof(*(fout->sections))*fout->header.e_shnum); memset(fout->sections, 0, sizeof(*(fout->sections))*fout->header.e_shnum); fout->sections[0] = osec = arch_new_section(); memset(osec, 0, sizeof(*osec)); osec->header.sh_type = SHT_NULL; osec->header.sh_link = SHN_UNDEF; fout->sections[KALLSYMS_IDX-1] = osec = arch_new_section(); memset(osec, 0, sizeof(*osec)); osec->name = ".strtab"; osec->header.sh_type = SHT_STRTAB; osec->header.sh_link = SHN_UNDEF; osec->header.sh_offset = file_offset; osec->header.sh_size = sizeof(strtab); osec->contents = xmalloc(sizeof(strtab)); memcpy(osec->contents, strtab, sizeof(strtab)); file_offset += osec->header.sh_size; fout->sections[KALLSYMS_IDX] = osec = arch_new_section(); memset(osec, 0, sizeof(*osec)); osec->name = KALLSYMS_SEC_NAME; osec->header.sh_name = 1; /* Offset in strtab */ osec->header.sh_type = SHT_PROGBITS; /* Load it */ osec->header.sh_flags = SHF_ALLOC; /* Read only data */ 
osec->header.sh_link = SHN_UNDEF; osec->header.sh_addralign = sizeof(ElfW(Word)); file_offset = (file_offset + osec->header.sh_addralign - 1) & -(osec->header.sh_addralign); osec->header.sh_offset = file_offset; /* How many loaded sections are there? */ for (i = 0; i < fin->header.e_shnum; ++i) {

(obj_kallsyms, continued: lines 199 to 255)

if (fin->sections[i]->header.sh_flags & SHF_ALLOC) ++loaded; } /* Initial contents, header + one entry per input section. No strings. */ osec->header.sh_size = sizeof(*a_hdr) + loaded*sizeof(*a_sec); a_hdr = (struct kallsyms_header *) osec->contents = xmalloc(osec->header.sh_size); memset(osec->contents, 0, osec->header.sh_size); a_hdr->size = sizeof(*a_hdr); a_hdr->sections = loaded; a_hdr->section_off = a_hdr->size; a_hdr->section_size = sizeof(*a_sec); a_hdr->symbol_off = osec->header.sh_size; a_hdr->symbol_size = sizeof(a_sym); a_hdr->start = (ElfW(Addr))(~0); /* Map input section numbers to kallsyms section offsets. */ sec_off = 0; /* Offset to first kallsyms section entry */ fin_to_allsym_map = xmalloc(sizeof(*fin_to_allsym_map)*fin->header.e_shnum); for (i = 0; i < fin->header.e_shnum; ++i) { isec = fin->sections[i]; if (isec->header.sh_flags & SHF_ALLOC) { fin_to_allsym_map[isec->idx] = sec_off; sec_off += a_hdr->section_size; } else fin_to_allsym_map[isec->idx] = -1; /* Ignore this section */ } /* Copy the loaded section data. */ a_sec = (struct kallsyms_section *) ((char *) a_hdr + a_hdr->section_off); for (i = 0; i < fin->header.e_shnum; ++i) { isec = fin->sections[i]; if (!(isec->header.sh_flags & SHF_ALLOC)) continue; a_sec->start = isec->header.sh_addr; a_sec->size = isec->header.sh_size; a_sec->flags = isec->header.sh_flags; a_sec->name_off = strings_size - strings_left; append_string(isec->name, &strings, &strings_size, &strings_left); if (a_sec->start < a_hdr->start) a_hdr->start = a_sec->start; if (a_sec->start+a_sec->size > a_hdr->end) a_hdr->end = a_sec->start+a_sec->size; ++a_sec; } /* Build the kallsyms symbol table from the symbol hashes. 
*/ for (i = 0; i < HASH_BUCKETS; ++i) { struct obj_symbol *sym = fin->symtab[i]; for (sym = fin->symtab[i]; sym ; sym = sym->next) { if (!sym || sym->secidx >= fin->header.e_shnum) continue; if ((a_sym.section_off = fin_to_allsym_map[sym->secidx]) == -1) continue; if (strcmp(sym->name, "gcc2_compiled.") == 0 ||

(obj_kallsyms, continued: lines 256 to 289)

strncmp(sym->name, "__insmod_", 9) == 0) continue; a_sym.symbol_addr = sym->value; if (fin->header.e_type == ET_REL) a_sym.symbol_addr += fin->sections[sym->secidx]->header.sh_addr; a_sym.name_off = strings_size - strings_left; append_symbol(&a_sym, &symbols, &symbols_size, &symbols_left); append_string(sym->name, &strings, &strings_size, &strings_left); ++a_hdr->symbols; } } free(fin_to_allsym_map); /* Sort the symbols into ascending order by address and name */ sym_strings = strings; /* For symbol_compare */ qsort((char *) symbols, (unsigned) a_hdr->symbols, sizeof(* symbols), symbol_compare); sym_strings = NULL; /* Put the lot together */ osec->header.sh_size = a_hdr->total_size = a_hdr->symbol_off + a_hdr->symbols*a_hdr->symbol_size + strings_size - strings_left; a_hdr = (struct kallsyms_header *) osec->contents = xrealloc(a_hdr, a_hdr->total_size); p = (char *)a_hdr + a_hdr->symbol_off; memcpy(p, symbols, a_hdr->symbols*a_hdr->symbol_size); free(symbols); p += a_hdr->symbols*a_hdr->symbol_size; a_hdr->string_off = p - (char *)a_hdr; memcpy(p, strings, strings_size - strings_left); free(strings); *fout_result = fout; return 0; } 先看注释，这个函数的作用是将输入 obj_file 里的符号提取出来，忽略不用于调试的符号。构

建出的 obj_file 只包含 kallsyms 段。这个段是特别设计，是可任意重定位（否则，debugger 的设计将是噩梦！）。它的格式在注释中已经讲得非常清楚，在./modutils-2.40/include/obj_kallsyms.h 中也有明确的定义。 Insmod——kallsyms_header 结构 60 /* Format of data in the kallsyms section. 61 * Most of the fields are small numbers but the total size and all 62 * offsets can be large so use the 32/64 bit types for these fields. 63 * 64 * Do not use sizeof() on these structures, modutils may be using extra 65 * fields. Instead use the size fields in the header to access the 66 * other bits of data. 67 */ 68 69 struct kallsyms_header { 70 int size; /* Size of this header */ 71 ElfW(Word) total_size; /* Total size of kallsyms data */ 72 int sections; /* Number of section entries */ 73 ElfW(Off) section_off; /* Offset to first section entry */

(struct kallsyms_header, continued: lines 74 to 93)

    int         section_size;   /* Size of one section entry */
    int         symbols;        /* Number of symbol entries */
    ElfW(Off)   symbol_off;     /* Offset to first symbol entry */
    int         symbol_size;    /* Size of one symbol entry */
    ElfW(Off)   string_off;     /* Offset to first string */
    ElfW(Addr)  start;          /* Start address of first section */
    ElfW(Addr)  end;            /* End address of last section */
};

struct kallsyms_section {
    ElfW(Addr)  start;          /* Start address of section */
    ElfW(Word)  size;           /* Size of this section */
    ElfW(Off)   name_off;       /* Offset to section name */
    ElfW(Word)  flags;          /* Flags from section */
};

struct kallsyms_symbol { ElfW(Off) section_off; /* Offset to section that owns this symbol */ ElfW(Addr) symbol_addr; /* Address of symbol */ ElfW(Off) name_off; /* Offset to symbol name */ }; 回到obj_kallsyms函数。到167行是对elf文件头的设置，具体请参照elf格式文档。169~172行是

对section（段头）第一项的设置，在elf文件里这是保留不用的。174~183行设置第2项section，这是个string section，其内容初始化为”\000__kallsyms”，实际上是不可见的。185行开始设置最后一项 section，这是这个obj_file的主要部分。 198行计算已加载段的数目。203~214行设置kallsyms段头。216~227行根据段的序号计算各个已加载段头相对第一个段头的偏移。230~245行依据已加载的段，设置已加载段参数，并把各个段的名字保存到string指向的缓存。函数append_string也在同一文件里。 Insmod——append_string 函数 /* Append a string to the big list of strings */ 34 35 36 static void 37 append_string (const char *s, char **strings, 38 ElfW(Word) *strings_size, ElfW(Word) *strings_left) 39 { 40 int l = strlen(s) + 1; 41 while (l > *strings_left) { 42 *strings = xrealloc(*strings, *strings_size += EXPAND_BY); 43 *strings_left += EXPAND_BY; 44 } 45 memcpy((char *)*strings+*strings_size-*strings_left, s, l); 46 *strings_left -= l; 46 } 248~266行将合适的符号保存入kallsyms段里。从代码里可以看到，这里忽略符号名为 gcc2_complied.或包含__insmod_的符号。这里生成的obj_file的属性是ET_REL，所以执行260行，计算出符号在内存中的地址（注意现在得到的还是理论上的地址，也就是文件在内存的起始位置为 0，段头的sh_addr表示段在内存的起始位置，就是以这个为假设的）。然后通过append_ symbol将


符号的内容存入缓存，接着保存符号名。append_symbol在同一文件里。 Insmod——append_symbol 函数 50 /* Append a symbol to the big list of symbols */ 51 52 static void 53 append_symbol (const struct kallsyms_symbol *s, 54 struct kallsyms_symbol **symbols, 55 ElfW(Word) *symbols_size, ElfW(Word) *symbols_left) 56 { 57 int l = sizeof(*s); 58 while (l > *symbols_left) { 59 *symbols = xrealloc(*symbols, *symbols_size += EXPAND_BY); 60 *symbols_left += EXPAND_BY; 61 } 62 memcpy((char *)*symbols+*symbols_size-*symbols_left, s, l); 63 *symbols_left -= l; 64 } 接着在270~273行，对符号进行排序。Symbol_compare函数也还是在同一文件里。 Insmod——symbol_compare 函数 66 /* qsort compare routine to sort symbols */ 67 68 static const char *sym_strings; 69 70 static int 71 symbol_compare (const void *a, const void *b) 72 { 73 struct kallsyms_symbol *c = (struct kallsyms_symbol *) a; 74 struct kallsyms_symbol *d = (struct kallsyms_symbol *) b; 75 76 if (c->symbol_addr > d->symbol_addr) 77 return(1); 78 if (c->symbol_addr < d->symbol_addr) 79 return(-1); 80 return(strcmp(sym_strings+c->name_off, sym_strings+d->name_off)); 80 } 可以看到符号按它们在内存地址的高低排序，地址相同，再使用名字排序。最后，276~285 行，将这些经过处理内容拷贝到输出文件里。到这里为止，输出文件的内容如下图。
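The ordering applied by symbol_compare (address first, name as the tie-breaker) can be tried out with a small stand-alone sketch. The names demo_symbol, demo_symbol_compare and sort_demo_symbols are hypothetical; in particular the sketch stores a name pointer in each entry, whereas the original keeps only an offset and reaches the string table through the file-scope sym_strings pointer, since qsort offers no context argument.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified form of struct kallsyms_symbol. */
struct demo_symbol {
    unsigned long addr; /* address of the symbol */
    const char *name;   /* symbol name */
};

/* qsort comparison: ascending address, then ascending name. */
static int demo_symbol_compare(const void *a, const void *b)
{
    const struct demo_symbol *c = a;
    const struct demo_symbol *d = b;

    if (c->addr > d->addr)
        return 1;
    if (c->addr < d->addr)
        return -1;
    return strcmp(c->name, d->name);
}

static void sort_demo_symbols(struct demo_symbol *syms, size_t n)
{
    qsort(syms, n, sizeof(*syms), demo_symbol_compare);
}
```

Comparing the addresses explicitly, rather than subtracting them, avoids overflow when the difference does not fit in an int.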


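In effect, this file just reorganizes the information in the input file so that it is easier to use (remember that the __kallsyms section exists to assist kernel debugging); the output file as a whole is only a temporary holding area. Lines 1022 to 1030 of add_kallsyms take the contents of the output file's kallsyms section (the section at index KALLSYMS_IDX, the grey part of the figure) and install them as the contents of the module's __kallsyms section. Back to INSMOD_MAIN.

1826 /**** No symbols or sections to be changed after kallsyms above ***/

None of the operations so far has touched the kernel. Now things get serious (lines 1828 to 1840).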

if (errors) goto out; /* If we were just checking, we made it. */ if (flag_silent_probe) { exit_status = 0; goto out; } /* Module has now finished growing; find its size and install it. */ m_size = obj_load_size(f); /* DEPMOD */ if (noload) { /* Don't bother actually touching the kernel. */


1841         m_addr = 0x12340000;
1842     } else {
1843         errno = 0;
1844         m_addr = create_module(m_name, m_size);
1845         switch (errno) {
1846         case 0:
1847             break;
1848         case EEXIST:
1849             if (dolock) {
1850                 /*
1851                  * Assume that we were just invoked
1852                  * simultaneous with another insmod
1853                  * and return success.
1854                  */
1855                 exit_status = 0;
1856                 goto out;
1857             }
1858             error("a module named %s already exists", m_name);
1859             goto out;
1860         case ENOMEM:
1861             error("can't allocate kernel memory for module; needed %lu bytes",
1862                   m_size);
1863             goto out;
1864         default:
1865             error("create_module: %m");
1866             goto out;
1867         }
1868     }

First, if flag_silent_probe is set, we did not want to install the module at all but only to test it; at this point the test is complete and the module is fine. Otherwise the real work begins, starting with computing the size needed to load the module. The function obj_load_size is in ./modutils-2.4.0/obj/obj_reloc.c.

Insmod: obj_load_size function

246 unsigned long
247 obj_load_size (struct obj_file *f)
248 {
249   unsigned long dot = 0;
250   struct obj_section *sec;
251
252   /* Finalize the positions of the sections relative to one another. */
253
254   for (sec = f->load_order; sec ; sec = sec->load_next)
255     {
256       ElfW(Addr) align;
257
258       align = sec->header.sh_addralign;
259       if (align && (dot & (align - 1)))
260         dot = (dot | (align - 1)) + 1;
261
262       sec->header.sh_addr = dot;
263       dot += sec->header.sh_size;
264     }
265
266   return dot;
267 }
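The rounding step at lines 259 to 260 is worth seeing with numbers. The sketch below repeats the same arithmetic over a plain array; struct demo_section and layout_sections are hypothetical names, and like the original the rounding assumes each alignment is a power of two.

```c
#include <stddef.h>

/* Hypothetical reduced section: just what the layout pass needs. */
struct demo_section {
    unsigned long addr;  /* assigned start offset (output) */
    unsigned long size;  /* size in bytes */
    unsigned long align; /* required alignment, a power of two or 0 */
};

/*
 * Place the sections one after another in array order, rounding the
 * running offset up to each section's alignment. Returns the total size.
 */
static unsigned long layout_sections(struct demo_section *secs, size_t n)
{
    unsigned long dot = 0;

    for (size_t i = 0; i < n; ++i) {
        unsigned long align = secs[i].align;

        /* (dot | (align - 1)) + 1 is the next multiple of align. */
        if (align && (dot & (align - 1)))
            dot = (dot | (align - 1)) + 1;

        secs[i].addr = dot;
        dot += secs[i].size;
    }
    return dot;
}
```

With sections of (size 5, align 4), (size 3, align 16) and (size 2, align 1) in that order the total comes to 21 bytes; placing the 16-byte-aligned section first gives 11, which is the saving obtained by ordering sections by alignment.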


As mentioned earlier, sections are ordered by their alignment to reduce wasted space; this is where that pays off. If noload is set we choose not to load the module for real, and any load address will do (0x12340000 is used). A real load cannot be taken so lightly: first, create_module has to create the module object inside the kernel. This function is in ./modutils-2.4.0/util/sys_cm.c.

Insmod: create_module function

39 #define __NR__create_module __NR_create_module
40 static inline _syscall2(long, _create_module, const char *, name, size_t, size)
41
42 unsigned long create_module(const char *name, size_t size)
43 {
44     /* Why all this fuss?
45
46        In linux 2.1, the address returned by create module point in
47        kernel space which is now mapped at the top of user space (at
48        0xc0000000 on i386). This looks like a negative number for a
49        long. The normal syscall macro of linux 2.0 (and all libc compile
50        with linux 2.0 or below) consider that the return value is a
51        negative number and consider it is an error number (A kernel
52        convention, return value are positive or negative, indicating the
53        error number).
54
55        By checking the value of errno, we know if we have been fooled by
56        the syscall2 macro and we fix it. */
57
58     long ret = _create_module(name, size);
59     if (ret == -1 && errno > 125)
60     {
61         ret = -errno;
62         errno = 0;
63     }
64     return ret;
65 }

The function carries an interesting comment. The module object created by the kernel lives in kernel space, at or above 0xc0000000 on i386, so its address looks like a negative long. When the system call returns, the syscall macro of Linux 2.0 (and any libc built against 2.0 or earlier) takes such a value for an error number, so a check is needed here. _syscall2 is the macro that issues the __NR_create_module system call; it is defined in linux/include/asm-i386/unistd.h (lines 263 to 271).

#define _syscall2(type,name,type1,arg1,type2,arg2) \ type name(type1 arg1,type2 arg2) \ {\ long __res; \ __asm__ volatile ("int $0x80" \ : "=a" (__res) \ : "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2))); \ __syscall_return(type,__res); \ }
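The fix-up in create_module can be tried out in user space. The sketch below is only a model under stated assumptions: old_syscall_return is a hypothetical imitation of the Linux 2.0-era macro on a 32-bit target (any value with the sign bit set is treated as -errno), fixup_create_module mirrors the errno > 125 test, and the error number is passed through a plain int instead of the real errno.

```c
#include <stdint.h>

/* Hypothetical model of the old syscall macro: a raw return value with the
   sign bit set is taken to be -errno, and the caller sees -1. */
static int32_t old_syscall_return(uint32_t raw, int *err)
{
    if (raw & 0x80000000u) {
        *err = (int)(0u - raw);   /* "error number" = -raw */
        return -1;
    }
    return (int32_t)raw;
}

/* Mirrors the check in create_module(): real error numbers are small, so an
   "errno" above 125 means the kernel actually returned an address. */
static uint32_t fixup_create_module(int32_t ret, int *err)
{
    if (ret == -1 && *err > 125) {
        uint32_t addr = 0u - (uint32_t)*err;  /* undo the negation */
        *err = 0;
        return addr;
    }
    return (uint32_t)ret;
}
```

A kernel address such as 0xC8800000 first comes back as -1 with a huge error number and is then recovered by the fix-up; a genuine error such as 17 (EEXIST) is left untouched.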


In this file the problem described in that comment has already been fixed; see __syscall_return, a macro defined in the same file.

    /* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
    #define __syscall_return(type, res) \
    do { \
        if ((unsigned long)(res) >= (unsigned long)(-125)) { \
            errno = -(res); \
            res = -1; \
        } \
        return (type) (res); \
    } while (0)

The function ends up invoking the system call sys_create_module, which creates a module object in kernel space and links it into the kernel's module list (see the book "Linux 源代码情景分析" for details). Creating the kernel module object can fail; the switch at line 1845 of INSMOD_MAIN checks whether it succeeded.

    1870    /* module is already built, complete with ksymoops symbols for the
    1871     * persistent filename. If the kernel does not support persistent data
    1872     * then give an error but continue. It is too difficult to clean up at
    1873     * this stage and this error will only occur on backported modules.
    1874     * rmmod will also get an error so warn the user now.
    1875     */
    1876    if (f->persist && !noload) {
    1877        struct {
    1878            struct module m;
    1879            int data;
    1880        } test_read;
    1881        memset(&test_read, 0, sizeof(test_read));
    1882        test_read.m.size_of_struct = -sizeof(test_read.m); /* -ve size => read, not write */
    1883        test_read.m.read_start = m_addr + sizeof(struct module);
    1884        test_read.m.read_end = test_read.m.read_start + sizeof(test_read.data);
    1885        if (sys_init_module(m_name, (struct module *) &test_read)) {
    1886            int old_errors = errors;
    1887            error("has persistent data but the kernel is too old to support it."
    1888                  " Expect errors during rmmod as well");
    1889            errors = old_errors;
    1890        }
    1891    }

If the module's run-time parameters are backed by a file and the module really is to be loaded, the block at line 1876 runs to check whether the kernel supports persistent data (the modutils documentation describes persistent data as data read from a file when the module initializes and written back to the file when the module exits). The struct module used here is not the one the kernel uses; it is defined in ./modutils-2.4.0/include/module.h.

    struct module
    {
        unsigned tgt_long size_of_struct;   /* == sizeof(module) */
        unsigned tgt_long next;
        unsigned tgt_long name;
        unsigned tgt_long size;

        tgt_long usecount;
        unsigned tgt_long flags;            /* AUTOCLEAN et al */

        unsigned nsyms;
        unsigned ndeps;

        unsigned tgt_long syms;
        unsigned tgt_long deps;
        unsigned tgt_long refs;
        unsigned tgt_long init;
        unsigned tgt_long cleanup;
        unsigned tgt_long ex_table_start;
        unsigned tgt_long ex_table_end;
    #ifdef __alpha__
        unsigned tgt_long gp;
    #endif
        /* Everything after here is extension. */
        unsigned tgt_long read_start;       /* Read data from existing module */
        unsigned tgt_long read_end;
        unsigned tgt_long can_unload;
        unsigned tgt_long runsize;
        unsigned tgt_long kallsyms_start;
        unsigned tgt_long kallsyms_end;
        unsigned tgt_long archdata_start;
        unsigned tgt_long archdata_end;
        unsigned tgt_long kernel_data;
    };

For comparison, here is the kernel's own definition of a module.

    struct module
    {
        unsigned long size_of_struct;       /* == sizeof(module) */
        struct module *next;
        const char *name;
        unsigned long size;

        union
        {
            atomic_t usecount;
            long pad;
        } uc;                               /* Needs to keep its size - so says rth */

        unsigned long flags;                /* AUTOCLEAN et al */

        unsigned nsyms;
        unsigned ndeps;

        struct module_symbol *syms;
        struct module_ref *deps;
        struct module_ref *refs;
        int (*init)(void);
        void (*cleanup)(void);
        const struct exception_table_entry *ex_table_start;
        const struct exception_table_entry *ex_table_end;
    #ifdef __alpha__
        unsigned long gp;
    #endif
        /* Members past this point are extensions to the basic
           module support and are optional. Use mod_member_present()
           to examine them. */
        const struct module_persist *persist_start;
        const struct module_persist *persist_end;
        int (*can_unload)(void);
        int runsize;                        /* In modutils, not currently used */
        const char *kallsyms_start;         /* All symbols for kernel debugging */
        const char *kallsyms_end;
        const char *archdata_start;         /* arch specific data for module */
        const char *archdata_end;
        const char *kernel_data;            /* Reserved for kernel internal use */
    };

The two structures correspond almost field for field; to keep things simple, the modutils version uses unsigned integers (unsigned tgt_long) in place of the various pointers. If the insmod and kernel versions are too far apart, though, the two structures can still disagree. The real work here is done by sys_init_module, which is in ./modutils-2.4.0/util/sys_nim.c.

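    #ifndef CONFIG_USE_SYSCALL

    extern int init_module(const char *name, const struct module *info);

    int sys_init_module(const char *name, const struct module *info)
    {
        return init_module(name, info);
    }

    #else

    #define __NR_sys_init_module __NR_init_module
    _syscall2(int, sys_init_module, const char *, name,
              const struct module *, info)

    #endif

Which function gets called depends on whether the macro CONFIG_USE_SYSCALL is defined. The file ./modutils-2.4.0/INSTALL notes that such options are false by default. The macro exists because some C libraries cannot make this system call themselves, in which case the call has to go through a macro such as _syscall2.

Back in INSMOD_MAIN: at line 1882 size_of_struct is an unsigned type (unsigned tgt_long), so -sizeof(test_read.m) becomes a very large value. Why do this? The file patch-2.4.0-test13-pre2 has the answer. To use this feature the kernel must be patched to extend the semantics of the sys_init_module system call, so that a negative size means reading from an existing module structure rather than setting up a new module. The clues from that patch follow.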


     if ((error = get_user(mod_user_size, &mod_user->size_of_struct)) != 0)
         goto err1;
    -if (mod_user_size < (unsigned long)&((struct module *)0L)->persist_start ||
    -    mod_user_size > sizeof(struct module) + 16*sizeof(void*)) {

The lines above show the check the kernel performs. If the size is negative, the condition is always true, because as an unsigned value it exceeds the upper bound. The patch removes this check (the lines marked -) and adds the following code to handle a negative size.

    +/* A negative mod_user_size indicates reading data from an
    + * existing module.
    + */
    +for (i = 0; i < 2; ++i) {
    +    if (mod_user_size >= (unsigned long)&((struct module *)0L)->read_start &&
    +        mod_user_size <= sizeof(struct module) + 16*sizeof(void*))
    +        break;
    +    mod_user_size = -mod_user_size; /* Try with negated size */
    +}
    +if (i == 1) {
    +    /* Negative size, read from existing module */
    +    error = read_module_data(mod_user_size, mod_user, mod);
    +    goto err1;
    +}
    +if (i == 2) {
         printk(KERN_ERR "init_module: Invalid module header size.\n"
    -           KERN_ERR "A new version of the modutils is likely "
    -                    "needed.\n");
    +           KERN_ERR "A new version of modutils may be needed.\n");
         error = -EINVAL;
         goto err1;
     }
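The two-pass loop in the patch is compact, so here is a minimal user-space sketch of the same decision. classify_size, enum size_kind and the min_size and max_size parameters are hypothetical names; in the kernel the bounds come from the offset of read_start and from sizeof(struct module) + 16*sizeof(void*).

```c
/* Outcome of the size check; the values match the loop counter i in the patch. */
enum size_kind {
    SIZE_INIT = 0,     /* size valid as given: initialise a new module */
    SIZE_READ = 1,     /* size valid once negated: read from an existing module */
    SIZE_INVALID = 2   /* neither form is in range */
};

static enum size_kind classify_size(unsigned long size,
                                    unsigned long min_size,
                                    unsigned long max_size)
{
    int i;

    for (i = 0; i < 2; ++i) {
        if (size >= min_size && size <= max_size)
            break;
        size = 0UL - size;  /* try with negated size (unsigned wrap-around) */
    }
    return (enum size_kind)i;
}
```

With bounds 60 and 160, a size of 96 is classified as an ordinary init, the unsigned negation of 96 as a read request, and 4 as invalid, which corresponds to the three cases i == 0, i == 1 and i == 2 in the patch.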

    /* A negative mod_user_size to sys_init_module indicates that the caller wants
     * to read data out of an existing module instead of initializing a new module.
     * This usage overloads the meaning of sys_init_module, but the alternative was
     * yet another system call and changes to glibc. sys_init_module already does
     * much of the work needed to read from an existing module so it was easier to
     * extend that syscall. Keith Owens <kaos@ocs.com.au> November 2000
     */
    static int
    read_module_data(unsigned long mod_user_size, struct module *mod_user,
                     struct module *mod_exist)
    {
        struct module mod;
        int error;

        if (!try_inc_mod_count(mod_exist))
            return(-ENOENT);
        error = copy_from_user(&mod, mod_user, mod_user_size);
        if (error) {
            error = -EFAULT;
            goto err1;
        }
        mod.size_of_struct = mod_user_size;
        error = -EINVAL;

        /* read_start and read_end must be present and must point inside the
         * existing module. The module data from read_start to read_end-1 is
         * copied back to the user, immediately after the user's struct module.
         */
        if (!mod_member_present(&mod, read_end) ||
            !mod_bound(mod.read_start, 0, mod_exist) ||
            !mod_bound(mod.read_end, -1, mod_exist) ||
            mod.read_start >= mod.read_end) {
            printk(KERN_ERR "init_module: mod->read_xxx data out of bounds.\n");
            goto err1;
        }
        error = copy_to_user(((char *)mod_user)+mod_user_size,
                             mod.read_start, mod.read_end - mod.read_start);
        if (error) {
            error = -EFAULT;
            goto err1;
        }
        error = 0;
    err1:
        __MOD_DEC_USE_COUNT(mod_exist);
        return(error);
    }

The code above only illustrates the principle. It lives in the kernel, where read_start and read_end correspond to the kernel's persist_start and persist_end. In read_module_data, mod_exist is the module the kernel created. Note that at line 1844 of INSMOD_MAIN m_addr holds the address of that kernel module, and test_read.m.read_start = m_addr + sizeof(struct module), so the copy_to_user call in read_module_data copies from exactly that position, for sizeof(int) bytes. At run time that position is the start of the module name string. read_module_data therefore does nothing of substance here; on any patched kernel the sys_init_module call will not fail. The main purpose is to test whether the kernel has been patched. Back to INSMOD_MAIN.

    1893    if (!obj_relocate(f, m_addr)) { /* DEPMOD */
    1894        if (!noload)
    1895            delete_module(m_name);
    1896        goto out;
    1897    }

The function obj_relocate is in ./modutils-2.4.0/obj/obj_reloc.c.

Insmod: obj_relocate function

    int
    obj_relocate (struct obj_file *f, ElfW(Addr) base)
    {
        int i, n = f->header.e_shnum;
        int ret = 1;

        /* Finalize the addresses of the sections. */

        arch_finalize_section_address(f, base);

        /* And iterate over all of the relocations. */


        for (i = 0; i < n; ++i) {
            struct obj_section *relsec, *symsec, *targsec, *strsec;
            ElfW(RelM) *rel, *relend;
            ElfW(Sym) *symtab;
            const char *strtab;

            relsec = f->sections[i];
            if (relsec->header.sh_type != SHT_RELM)
                continue;

            symsec = f->sections[relsec->header.sh_link];
            targsec = f->sections[relsec->header.sh_info];
            strsec = f->sections[symsec->header.sh_link];

            rel = (ElfW(RelM) *)relsec->contents;
            relend = rel + (relsec->header.sh_size / sizeof(ElfW(RelM)));
            symtab = (ElfW(Sym) *)symsec->contents;
            strtab = (const char *)strsec->contents;

            for (; rel < relend; ++rel) {
                ElfW(Addr) value = 0;
                struct obj_symbol *intsym = NULL;
                unsigned long symndx;
                ElfW(Sym) *extsym = 0;
                const char *errmsg;

                /* Attempt to find a value to use for this relocation. */

                symndx = ELFW(R_SYM)(rel->r_info);
                if (symndx) {
                    /* Note we've already checked for undefined symbols. */

                    extsym = &symtab[symndx];
                    if (ELFW(ST_BIND)(extsym->st_info) == STB_LOCAL) {
                        /* Local symbols we look up in the local table to be sure
                           we get the one that is really intended. */
                        intsym = f->local_symtab[symndx];
                    } else {
                        /* Others we look up in the hash table. */
                        const char *name;
                        if (extsym->st_name)
                            name = strtab + extsym->st_name;
                        else
                            name = f->sections[extsym->st_shndx]->name;
                        intsym = obj_find_symbol(f, name);
                    }

                    value = obj_symbol_final_value(f, intsym);
                }

    #if SHT_RELM == SHT_RELA
    #if defined(__alpha__) && defined(AXP_BROKEN_GAS)
                /* Work around a nasty GAS bug, that is fixed as of 2.7.0.9. */
                if (!extsym || !extsym->st_name ||
                    ELFW(ST_BIND)(extsym->st_info) != STB_LOCAL)
    #endif
                    value += rel->r_addend;
    #endif

                /* Do it! */
                switch (arch_apply_relocation(f,targsec,symsec,intsym,rel,value)) {
                case obj_reloc_ok:
                    break;

                case obj_reloc_overflow:
                    errmsg = "Relocation overflow";
                    goto bad_reloc;
                case obj_reloc_dangerous:
                    errmsg = "Dangerous relocation";
                    goto bad_reloc;
                case obj_reloc_unhandled:
                    errmsg = "Unhandled relocation";
                    goto bad_reloc;
                case obj_reloc_constant_gp:
                    errmsg = "Modules compiled with -mconstant-gp cannot be loaded";
                    goto bad_reloc;
                bad_reloc:
                    if (extsym) {
                        error("%s of type %ld for %s", errmsg,
                              (long)ELFW(R_TYPE)(rel->r_info),
                              strtab + extsym->st_name);
                    } else {
                        error("%s of type %ld", errmsg,
                              (long)ELFW(R_TYPE)(rel->r_info));
                    }
                    ret = 0;
                    break;
                }
            }
        }

        /* Finally, take care of the patches. */

        if (f->string_patches) {
            struct obj_string_patch_struct *p;
            struct obj_section *strsec;
            ElfW(Addr) strsec_base;
            strsec = obj_find_section(f, ".kstrtab");
            strsec_base = strsec->header.sh_addr;

            for (p = f->string_patches; p ; p = p->next) {

                struct obj_section *targsec = f->sections[p->reloc_secidx];
                *(ElfW(Addr) *)(targsec->contents + p->reloc_offset)
                    = strsec_base + p->string_offset;
            }
        }

        if (f->symbol_patches) {
            struct obj_symbol_patch_struct *p;

            for (p = f->symbol_patches; p; p = p->next) {
                struct obj_section *targsec = f->sections[p->reloc_secidx];
                *(ElfW(Addr) *)(targsec->contents + p->reloc_offset)
                    = obj_symbol_final_value(f, p->sym);
            }
        }

        return ret;
    }

The logic of this function is not complicated. First, the parameter base is m_addr here, that is, the module's address in the kernel. In the module's ELF file the sections' memory positions were computed as if the file were loaded at address 0, so they must now be adjusted by base. The function arch_finalize_section_address is in ./modutils-2.4.0/obj/obj_i386.c.

Insmod: arch_finalize_section_address function

    int
    arch_finalize_section_address(struct obj_file *f, Elf32_Addr base)
    {
        int i, n = f->header.e_shnum;

        f->baseaddr = base;
        for (i = 0; i < n; ++i)
            f->sections[i]->header.sh_addr += base;
        return 1;
    }

Next, the symbols that need relocating are processed. They can all be found through the relocation sections (and only there). Such symbols generally involve pointers, addresses, external symbols and other things that can only be settled at link time, so the st_value field of the Elf32_Sym structure that describes them ought to be an address.

In fact st_value means one of only two things. When the symbol's section index is SHN_COMMON, it is the required alignment. In every other case it is the offset of the symbol from the start of the section that holds it. Recall that a symbol with section index SHN_COMMON that corresponds to an external symbol within the file (for example, file B refers to a variable defined in file A) was already handled by obj_allocate_commons, which pointed its section index at the .bss section. A symbol that corresponds to one exported by the kernel or by an already loaded module was handled by add_kernel_symbols and has an index above SHN_HIRESERVE.

In an ELF file, symbols whose section index is SHN_LORESERVE or above have no corresponding section, so the value stored in the symbol is taken to be an absolute address (see line 209). Symbols exported by the kernel and by loaded modules fall in this range. As seen earlier, symbols with local binding are stored in local_symtab, while external and global symbols are stored in the symtab hash table. Once the structure holding the symbol has been found, its absolute address is computed. The function obj_symbol_final_value is in ./modutils-2.4.0/obj/obj_common.c.

Insmod: obj_symbol_final_value function

    203 ElfW(Addr)
    204 obj_symbol_final_value (struct obj_file *f, struct obj_symbol *sym)
    205 {
    206   if (sym)
    207     {
    208       if (sym->secidx >= SHN_LORESERVE)
    209         return sym->value;
    210
    211       return sym->value + f->sections[sym->secidx]->header.sh_addr;
    212     }
    213   else
    214     {
    215       /* As a special case, a NULL sym has value zero. */
    216       return 0;
    217     }
    218 }

For relocation entries of type rela, which carry an explicit addend r_addend, that value is added as well. With the absolute address in hand, relocation can begin. The function arch_apply_relocation is in obj_i386.c. For the relocation operations involved, see the ELF documentation; they are not covered further here.

Insmod: arch_apply_relocation function

    enum obj_reloc
    arch_apply_relocation (struct obj_file *f,
                           struct obj_section *targsec,
                           struct obj_section *symsec,
                           struct obj_symbol *sym,
                           Elf32_Rel *rel,
                           Elf32_Addr v)
    {
        struct i386_file *ifile = (struct i386_file *)f;
        struct i386_symbol *isym = (struct i386_symbol *)sym;

        Elf32_Addr *loc = (Elf32_Addr *)(targsec->contents + rel->r_offset);
        Elf32_Addr dot = targsec->header.sh_addr + rel->r_offset;
        Elf32_Addr got = ifile->got ? ifile->got->header.sh_addr : 0;

        enum obj_reloc ret = obj_reloc_ok;

        switch (ELF32_R_TYPE(rel->r_info))
        {
        case R_386_NONE:
            break;

        case R_386_32:
            *loc += v;
            break;

        case R_386_PLT32:
        case R_386_PC32:
            *loc += v - dot;

            break;

        case R_386_GLOB_DAT:
        case R_386_JMP_SLOT:
            *loc = v;
            break;

        case R_386_RELATIVE:
            *loc += f->baseaddr;
            break;

        case R_386_GOTPC:
            assert(got != 0);
            *loc += got - dot;
            break;

        case R_386_GOT32:
            assert(isym != NULL);
            if (!isym->gotent.reloc_done)
            {
                isym->gotent.reloc_done = 1;
                *(Elf32_Addr *)(ifile->got->contents + isym->gotent.offset) = v;
            }
            *loc += isym->gotent.offset;
            break;

        case R_386_GOTOFF:
            assert(got != 0);
            *loc += v - got;
            break;

        default:
            ret = obj_reloc_unhandled;
            break;
        }

        return ret;
    }

Once this function returns, what remains are the extra symbols generated while the file was being loaded. They are kept in linked lists in obj_file (string_patches and symbol_patches), and relocating them is quite simple. Back to INSMOD_MAIN.

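    1899    /* Do archdata again, this time we have the final addresses */
    1900    if (add_archdata(f, &archdata))
    1901        goto out;
    1902
    1903    /* Do kallsyms again, this time we have the final addresses */
    1904    if (add_kallsyms(f, &kallsyms, force_kallsyms))
    1905        goto out;

Both functions were examined earlier. Because the __archdata section already exists, add_archdata does nothing. add_kallsyms regenerates the contents of the __kallsyms section using absolute addresses (only now is the __kallsyms section really usable).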
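To make the relocation arithmetic in arch_apply_relocation concrete, here is a minimal sketch of its two most common cases, R_386_32 and R_386_PC32. It is not modutils code: apply_reloc, enum reloc_kind and the flat byte buffer are hypothetical simplifications, and the word stored at the target location is treated as the implicit addend, as it is for REL-type relocations on i386.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for R_386_32 and R_386_PC32. */
enum reloc_kind { RELOC_ABS32, RELOC_PC32 };

/* Patch the 32-bit word at contents + offset.
   sec_addr is the final address of the section (sh_addr after relocation),
   v is the final value of the symbol being referenced. */
static void apply_reloc(unsigned char *contents, uint32_t sec_addr,
                        uint32_t offset, enum reloc_kind kind, uint32_t v)
{
    uint32_t word;
    uint32_t dot = sec_addr + offset;               /* address being patched */

    memcpy(&word, contents + offset, sizeof word);  /* implicit addend */
    if (kind == RELOC_ABS32)
        word += v;                                  /* *loc += v */
    else
        word += v - dot;                            /* *loc += v - dot */
    memcpy(contents + offset, &word, sizeof word);
}
```

For an absolute reference with addend 4 to a symbol at 0xc8801000, the patched word becomes 0xc8801004. For a PC-relative reference with addend -4, patched at address 0xc8800004 and pointing to 0xc8800100, the patched word is 0xf8, the distance from the end of the 4-byte field to the target.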


    1907 #ifdef COMPAT_2_0
    1908    if (k_new_syscalls)
    1909        init_module(m_name, f, m_size, blob_name, noload, flag_load_map);
    1910    else if (!noload)
    1911        old_init_module(m_name, f, m_size);
    1912 #else
    1913    init_module(m_name, f, m_size, blob_name, noload, flag_load_map);
    1914 #endif
    1915    if (errors) {
    1916        if (!noload)
    1917            delete_module(m_name);
    1918        goto out;
    1919    }
    1920
    1921    exit_status = 0;
    1922
    1923 out:
    1924    if (dolock)
    1925        flock(fp, LOCK_UN);
    1926    close(fp);
    1927    if (!noload)
    1928        snap_shot(NULL, 0);
    1929    return exit_status;
    1930 }

Assume COMPAT_2_0 is not defined. This is the last job in INSMOD_MAIN, though not a light one. Its core is the init_module function, which is also in insmod.c.

Insmod: init_module function

    1058 static int init_module(const char *m_name, struct obj_file *f,
    1059                unsigned long m_size, const char *blob_name,
    1060                unsigned int noload, unsigned int flag_load_map)
    1061 {
    1062    struct module *module;
    1063    struct obj_section *sec;
    1064    void *image;
    1065    int ret = 0;
    1066    tgt_long m_addr;
    1067
    1068    sec = obj_find_section(f, ".this");
    1069    module = (struct module *) sec->contents;
    1070    m_addr = sec->header.sh_addr;
    1071
    1072    module->size_of_struct = sizeof(*module);
    1073    module->size = m_size;
    1074    module->flags = flag_autoclean ? NEW_MOD_AUTOCLEAN : 0;
    1075
    1076    sec = obj_find_section(f, "__ksymtab");
    1077    if (sec && sec->header.sh_size) {
    1078        module->syms = sec->header.sh_addr;
    1079        module->nsyms = sec->header.sh_size / (2 * tgt_sizeof_char_p);
    1080    }
    1081    if (n_ext_modules_used) {
    1082        sec = obj_find_section(f, ".kmodtab");
    1083        module->deps = sec->header.sh_addr;
    1084        module->ndeps = n_ext_modules_used;

    1085    }
    1086    module->init = obj_symbol_final_value(f, obj_find_symbol(f, "init_module"));
    1087    module->cleanup = obj_symbol_final_value(f,
    1088        obj_find_symbol(f, "cleanup_module"));
    1089
    1090    sec = obj_find_section(f, "__ex_table");
    1091    if (sec) {
    1092        module->ex_table_start = sec->header.sh_addr;
    1093        module->ex_table_end = sec->header.sh_addr + sec->header.sh_size;
    1094    }
    1095    sec = obj_find_section(f, ".text.init");
    1096    if (sec) {
    1097        module->runsize = sec->header.sh_addr - m_addr;
    1098    }
    1099    sec = obj_find_section(f, ".data.init");
    1100    if (sec) {
    1101        if (!module->runsize ||
    1102            module->runsize > sec->header.sh_addr - m_addr)
    1103            module->runsize = sec->header.sh_addr - m_addr;
    1104    }
    1105    sec = obj_find_section(f, ARCHDATA_SEC_NAME);
    1106    if (sec && sec->header.sh_size) {
    1107        module->archdata_start = sec->header.sh_addr;
    1108        module->archdata_end = module->archdata_start + sec->header.sh_size;
    1109    }
    1110    sec = obj_find_section(f, KALLSYMS_SEC_NAME);
    1111    if (sec && sec->header.sh_size) {
    1112        module->kallsyms_start = sec->header.sh_addr;
    1113        module->kallsyms_end = module->kallsyms_start + sec->header.sh_size;
    1114    }
    1115    if (!arch_init_module(f, module))
    1116        return 0;
    1117
    1118    /*
    1119     * Whew! All of the initialization is complete.
    1120     * Collect the final module image and give it to the kernel.
    1121     */
    1122    image = xmalloc(m_size);
    1123    obj_create_image(f, image);
    1124
    1125    if (flag_load_map)
    1126        print_load_map(f);
    1127
    1128    if (blob_name) {
    1129        int fd, l;
    1130        fd=open(blob_name,O_WRONLY|O_CREAT|O_TRUNC,S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH);
    1131        if (fd < 0) {
    1132            error("open %s failed %m", blob_name);
    1133            ret = -1;
    1134        }
    1135        else {
    1136            if ((l = write(fd, image, m_size)) != m_size) {
    1137                error("write %s failed %m", blob_name);
    1138                ret = -1;
    1139            }
    1140            close(fd);
    1141        }
    1142    }
    1143
    1144    if (ret == 0 && !noload) {
    1145        fflush(stdout); /* Flush any debugging output */
    1146        ret = sys_init_module(m_name, (struct module *) image);
    1147        if (ret) {
    1148            error("init_module: %m");
    1149            lprintf("Hint: insmod errors can be caused by incorrect module parameters, "
    1150                "including invalid IO or IRQ parameters");
    1151        }
    1152    }
    1153
    1154    free(image);
    1155
    1156    return ret == 0;
    1157 }
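The runsize computation in lines 1095 to 1104 of init_module boils down to taking the smaller of two offsets. A minimal sketch follows, assuming a hypothetical helper compute_runsize in which a NULL pointer stands for a section that obj_find_section did not find.

```c
#include <stddef.h>

/* m_addr:    start address of the module.
   text_init: address of .text.init, or NULL if the section is absent.
   data_init: address of .data.init, or NULL if the section is absent.
   Returns the offset of whichever init section starts first, or 0 when
   neither section exists. */
static unsigned long compute_runsize(unsigned long m_addr,
                                     const unsigned long *text_init,
                                     const unsigned long *data_init)
{
    unsigned long runsize = 0;

    if (text_init)
        runsize = *text_init - m_addr;
    if (data_init) {
        if (!runsize || runsize > *data_init - m_addr)
            runsize = *data_init - m_addr;
    }
    return runsize;
}
```

For a module at 0xc8800000 with .text.init at offset 0x3000 and .data.init at offset 0x2800, the result is 0x2800: everything from that offset to the end of the module is needed only during initialization.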

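The __ksymtab section holds the symbols that the kernel or a module exports; they correspond to syms in struct module. We have already seen that the deps member of struct module corresponds to the .kmodtab section. Everything apart from runsize is easy to understand. So what is runsize about?

In the kernel, all data and code used only once, during system initialization, are placed in special sections, namely data.init and text.init. The space these sections occupy is reclaimed after the system has booted. If a module is not loaded dynamically, its initialization data and code certainly go into these sections. If the module is loaded dynamically, however, it knows nothing of the kernel's sections at compile time, so it has to generate these two sections in its own file, and reclaiming their space is then up to the module. Naturally these sections are placed at the end of the module's space, which makes reclaiming them as easy as possible (the obj_load_order_prio function seen earlier makes this quite clear). After the module has been initialized these two sections are not used again, and should not be (whether or not they are freed), and their relative order is not fixed. Lines 1095 to 1104 therefore find the smaller of the two offsets from the start of the module to the start of each section; that is the module's runsize. This variable is not actually used yet, though. If the module has no initialization data or code, the code shows that runsize cannot be determined.

arch_init_module at line 1115 does nothing on x86 and simply returns 1. Lines 1122 and 1123 allocate the resources and build the module image. The function obj_create_image is in obj_reloc.c.

Insmod: obj_create_image function

    int
    obj_create_image (struct obj_file *f, char *image)
    {
        struct obj_section *sec;
        ElfW(Addr) base = f->baseaddr;

        for (sec = f->load_order; sec ; sec = sec->load_next)
        {
            char *secimg;

            if (sec->contents == 0)
                continue;
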

            secimg = image + (sec->header.sh_addr - base);

            /* Note that we allocated data for NOBITS sections earlier. */
            memcpy(secimg, sec->contents, sec->header.sh_size);
        }

        return 1;
    }

If a load map is to be printed, print_load_map is called; it too is in insmod.c.

Insmod: print_load_map function

    static void print_load_map(struct obj_file *f)
    {
        struct obj_symbol *sym;
        struct obj_symbol **all, **p;
        struct obj_section *sec;
        int load_map_cmp(const void *a, const void *b) {
            struct obj_symbol **as = (struct obj_symbol **) a;
            struct obj_symbol **bs = (struct obj_symbol **) b;
            unsigned long aa = obj_symbol_final_value(f, *as);
            unsigned long ba = obj_symbol_final_value(f, *bs);
            return aa < ba ? -1 : aa > ba ? 1 : 0;
        }
        int i, nsyms, *loaded;

        /* Report on the section layout. */

        lprintf("Sections: Size %-*s Align",
            (int) (2 * sizeof(void *)), "Address");

        for (sec = f->load_order; sec; sec = sec->load_next) {
            int a;
            unsigned long tmp;

            for (a = -1, tmp = sec->header.sh_addralign; tmp; ++a)
                tmp >>= 1;
            if (a == -1)
                a = 0;

            lprintf("%-16s%08lx %0*lx 2**%d",
                sec->name,
                (long)sec->header.sh_size,
                (int) (2 * sizeof(void *)),
                (long)sec->header.sh_addr,
                a);
        }

        /* Quick reference which section indicies are loaded. */

        loaded = alloca(sizeof(int) * (i = f->header.e_shnum));
        while (--i >= 0)
            loaded[i] = (f->sections[i]->header.sh_flags & SHF_ALLOC) != 0;

        /* Collect the symbols we'll be listing. */

        for (nsyms = i = 0; i < HASH_BUCKETS; ++i)

            for (sym = f->symtab[i]; sym; sym = sym->next)
                if (sym->secidx <= SHN_HIRESERVE
                    && (sym->secidx >= SHN_LORESERVE || loaded[sym->secidx]))
                    ++nsyms;

        all = alloca(nsyms * sizeof(struct obj_symbol *));

        for (i = 0, p = all; i < HASH_BUCKETS; ++i)
            for (sym = f->symtab[i]; sym; sym = sym->next)
                if (sym->secidx <= SHN_HIRESERVE
                    && (sym->secidx >= SHN_LORESERVE || loaded[sym->secidx]))
                    *p++ = sym;

        /* Sort them by final value. */
        qsort(all, nsyms, sizeof(struct obj_file *), load_map_cmp);

        /* And list them. */
        lprintf("\nSymbols:");
        for (p = all; p < all + nsyms; ++p) {
            char type = '?';
            unsigned long value;

            sym = *p;
            if (sym->secidx == SHN_ABS) {
                type = 'A';
                value = sym->value;
            } else if (sym->secidx == SHN_UNDEF) {
                type = 'U';
                value = 0;
            } else {
                struct obj_section *sec = f->sections[sym->secidx];

                if (sec->header.sh_type == SHT_NOBITS)
                    type = 'B';
                else if (sec->header.sh_flags & SHF_ALLOC) {
                    if (sec->header.sh_flags & SHF_EXECINSTR)
                        type = 'T';
                    else if (sec->header.sh_flags & SHF_WRITE)
                        type = 'D';
                    else
                        type = 'R';
                }
                value = sym->value + sec->header.sh_addr;
            }

            if (ELFW(ST_BIND) (sym->info) == STB_LOCAL)
                type = tolower(type);

            lprintf("%0*lx %c %s", (int) (2 * sizeof(void *)), value,
                type, sym->name);
        }
    }

This function is not complicated, so no more needs to be said about it. In addition, if blob_name names a file to which the generated module image should be written, the image is saved there as well.
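The type letters that print_load_map prints follow the same conventions as nm. As a rough illustration, the sketch below isolates the branch for symbols that live in an ordinary section (not SHN_ABS or SHN_UNDEF). section_symbol_type and the SK_ constants are hypothetical names; the constants carry the standard ELF values and are defined locally so that the sketch does not depend on <elf.h>.

```c
#include <ctype.h>

/* Standard ELF values, redefined locally under hypothetical names. */
#define SK_SHT_NOBITS    8u
#define SK_SHF_WRITE     0x1u
#define SK_SHF_ALLOC     0x2u
#define SK_SHF_EXECINSTR 0x4u

/* Map a section's type and flags to a one-letter symbol class:
   B = uninitialised data, T = code, D = writable data, R = read-only data,
   ? = anything else. Local symbols get the lower-case letter. */
static char section_symbol_type(unsigned sh_type, unsigned sh_flags, int is_local)
{
    char type = '?';

    if (sh_type == SK_SHT_NOBITS)
        type = 'B';
    else if (sh_flags & SK_SHF_ALLOC) {
        if (sh_flags & SK_SHF_EXECINSTR)
            type = 'T';
        else if (sh_flags & SK_SHF_WRITE)
            type = 'D';
        else
            type = 'R';
    }
    if (is_local)
        type = (char)tolower((unsigned char)type);
    return type;
}
```

A global function in .text would come out as T, a static variable in .data as d, and a global in .bss as B.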


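The last step calls sys_init_module to copy the module contents that image points to into the kernel's module object. Notice that all the earlier relocation work used the module object's address in the kernel (the m_addr seen earlier) as the base address; that was done precisely for this copy. After this step our module is finally part of the kernel and can do its job. With that, INSMOD_MAIN is finished.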
