Lab1

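Basics

The lab handout already describes the memory layout of early PCs in detail, runs the bootloader, and explains what the BIOS does after power-on, so none of that is repeated here. What matters is the bootloader's job:

Enter protected mode
Load the kernel from disk into memory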

boot/boot.S

The lab handout asks us to understand boot/boot.S and boot/main.c thoroughly. Start with the assembly file.

# Start the CPU: switch to 32-bit protected mode, jump into C.
# The BIOS loads this code from the first sector of the hard disk into
# memory at physical address 0x7c00 and starts executing in real mode
# with %cs=0 %ip=7c00.

.set PROT_MODE_CSEG, 0x8         # kernel code segment selector
.set PROT_MODE_DSEG, 0x10        # kernel data segment selector
.set CR0_PE_ON,      0x1         # protected mode enable flag

.globl start
start:
  .code16                     # Assemble for 16-bit mode
  cli                         # Disable interrupts
  cld                         # String operations increment

The first instruction loaded at 0x7c00 is cli, which disables interrupts. After that, a few registers are initialized.

# Set up the important data segment registers (DS, ES, SS).
  xorw    %ax,%ax             # Segment number zero
  movw    %ax,%ds             # -> Data Segment
  movw    %ax,%es             # -> Extra Segment
  movw    %ax,%ss             # -> Stack Segment

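Next, the A20 address line is enabled. The early 8086 processor had only 20 address lines, so addresses wrapped back to zero after 0xfffff. By the time of the 80286 the address bus had grown to 24 lines, but to stay compatible with early programs A20 (the 21st address line) is disabled by default. To use memory above 1 MB in protected mode we have to enable this line ourselves, which is done through the I/O ports of the 8042 chip.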
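The wraparound described above can be sketched in C. This is a minimal model, not code from the lab: the function name real_mode_phys and its a20_enabled parameter are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Physical address produced by a real-mode segment:offset pair.
 * a20_enabled == 0 models the legacy behaviour in which address
 * bit 20 is forced low. (Hypothetical helper, for illustration.) */
uint32_t real_mode_phys(uint16_t seg, uint16_t off, int a20_enabled)
{
    uint32_t addr = ((uint32_t)seg << 4) + off;  /* segment * 16 + offset */
    assert(addr <= 0x10FFEFu);                   /* largest address real mode can form */
    if (!a20_enabled)
        addr &= ~(UINT32_C(1) << 20);            /* bit 20 tied low: wrap at 1 MB */
    return addr;
}
```

With A20 disabled, 0xFFFF:0x0010 wraps to physical address 0; with A20 enabled the same pair reaches 0x100000, the first byte above 1 MB.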

 # Enable A20:
  #   For backwards compatibility with the earliest PCs, physical
  #   address line 20 is tied low, so that addresses higher than
  #   1MB wrap around to zero by default.  This code undoes this.
seta20.1:
  inb     $0x64,%al               # Wait for not busy
  testb   $0x2,%al
  jnz     seta20.1

  movb    $0xd1,%al               # 0xd1 -> port 0x64
  outb    %al,$0x64

seta20.2:
  inb     $0x64,%al               # Wait for not busy
  testb   $0x2,%al
  jnz     seta20.2

  movb    $0xdf,%al               # 0xdf -> port 0x60
  outb    %al,$0x60

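First, the 8042 status register:

Bit 7: PARITY-EVEN (P_E): parity error in the data received from the keyboard
Bit 6: RCV-TMOUT (R_T): set to 1 on a receive timeout
Bit 5: TRANS_TMOUT (T_T): set to 1 on a transmit timeout
Bit 4: KYBD_INH (K_I): 1 means the keyboard is not inhibited, 0 means it is inhibited
Bit 3: CMD_DATA (C_D): 1 means the input buffer holds a command, 0 means it holds data
Bit 2: SYS_FLAG (S_F): system flag, 0 at power-on, set to 1 once the self-test passes
Bit 1: INPUT_BUF_FULL (I_B_F): set to 1 when the input buffer is full, cleared to 0 once the i8042 has taken the byte
Bit 0: OUT_BUF_FULL (O_B_F): set to 1 when the output buffer is full, cleared to 0 once the CPU has read it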

Next, the port addresses. The 8042 uses only two ports:

0x64: command port (reading it returns the status register)

0x60: data port

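Then the commands. Only the one used by this code is listed here.

A driver sends a command to the keyboard controller by writing to port 64h:

D1h
Prepare to write the Output Port. The byte subsequently written through port 60h is placed in the Output Port.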

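What has to happen before a command is sent to the 8042:

To send a command to the i8042, first read the status register and check bit 1. Bit 1 = 0 means the input buffer is empty and can be written. Once bit 1 is 0, write the command to port 64h.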

With that, the assembly above is easy to follow. The value 0xdf is explained as follows:

0xdd: disable A20

0xdf: enable A20
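The handshake can be replayed against a small software model. This is only a sketch: struct kbc_model and the model_inb/model_outb helpers are invented here to stand in for the real inb/outb port instructions, and the model assumes that bit 1 of the Output Port is the A20 gate (which is why 0xdf enables it and 0xdd does not).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for the 8042; real code uses inb/outb. */
struct kbc_model {
    uint8_t status;       /* what a read of port 0x64 returns */
    uint8_t pending_cmd;  /* last command written to port 0x64 */
    uint8_t output_port;  /* 8042 Output Port; bit 1 gates A20 */
};

uint8_t model_inb(const struct kbc_model *k, uint16_t port)
{
    assert(port == 0x64);            /* the model only implements the status read */
    return k->status;
}

void model_outb(struct kbc_model *k, uint16_t port, uint8_t val)
{
    if (port == 0x64) {
        k->pending_cmd = val;        /* command byte */
    } else if (port == 0x60 && k->pending_cmd == 0xD1) {
        k->output_port = val;        /* data byte following D1h */
        k->pending_cmd = 0;
    }
}

/* Same four steps as seta20.1 / seta20.2 in boot.S. */
void enable_a20(struct kbc_model *k)
{
    while (model_inb(k, 0x64) & 0x02)   /* wait until input buffer is empty */
        ;                               /* (the model never changes status) */
    model_outb(k, 0x64, 0xD1);          /* "write Output Port" command */
    while (model_inb(k, 0x64) & 0x02)
        ;
    model_outb(k, 0x60, 0xDF);          /* new Output Port value, A20 bit set */
}

int a20_gate(const struct kbc_model *k)
{
    return (k->output_port >> 1) & 1;
}
```

Starting from a zeroed model, enable_a20 leaves output_port at 0xDF and a20_gate returns 1.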

A20 is now enabled.

Next comes the code that loads the GDT:

# Switch from real to protected mode, using a bootstrap GDT
  # and segment translation that makes virtual addresses 
  # identical to their physical addresses, so that the 
  # effective memory map does not change during the switch.
  lgdt    gdtdesc
  movl    %cr0, %eax
  orl     $CR0_PE_ON, %eax
  movl    %eax, %cr0
  #......
  # Bootstrap GDT
.p2align 2                                # force 4 byte alignment
gdt:
  SEG_NULL				# null seg
  SEG(STA_X|STA_R, 0x0, 0xffffffff)	    # code seg
  SEG(STA_W, 0x0, 0xffffffff)	        # data seg
gdtdesc:
  .word   0x17                            # sizeof(gdt) - 1
  .long   gdt                             # address gdt

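The lgdt instruction loads the GDT descriptor. Its 0x17 field is the limit of the global descriptor table, that is, the table size minus one: three descriptors of 8 bytes each make 24 bytes, so the limit is 23 = 0x17.

[Figure: GDT layout]

The definitions of the three segments rely on the macros in inc/mmu.h (#include <inc/mmu.h>):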

/*
 * Macros to build GDT entries in assembly.
 */
#define SEG_NULL						\
	.word 0, 0;						\
	.byte 0, 0, 0, 0
#define SEG(type,base,lim)					\
	.word (((lim) >> 12) & 0xffff), ((base) & 0xffff);	\
	.byte (((base) >> 16) & 0xff), (0x90 | (type)),		\
		(0xC0 | (((lim) >> 28) & 0xf)), (((base) >> 24) & 0xff)
//......
#define STA_X		0x8	    // Executable segment
#define STA_E		0x4	    // Expand down (non-executable segments)
#define STA_C		0x4	    // Conforming code segment (executable only)
#define STA_W		0x2	    // Writeable (non-executable segments)
#define STA_R		0x2	    // Readable (executable segments)
#define STA_A		0x1	    // Accessed

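For a better understanding, look at the description of protected-mode memory management in the Intel manual (chapter 3). A few points deserve attention. The first is the GDTR:

[Figure: GDTR register layout]

This is the register that lgdt loads, which makes the meaning of the code above clear.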
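The 6-byte operand that lgdt reads (gdtdesc in boot.S) can be built by hand as a check on the 0x17 value. A minimal sketch: make_gdt_pseudo_desc is a made-up helper, and the base address used in the usage note is a placeholder, not the real address of gdt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GDT_ENTRY_SIZE 8u   /* every descriptor is 8 bytes */

/* Fill the operand of lgdt: 16-bit limit, then 32-bit base, little endian.
 * (Hypothetical helper, for illustration only.) */
void make_gdt_pseudo_desc(uint8_t out[6], uint32_t base, size_t n_entries)
{
    assert(n_entries >= 1 && n_entries <= 8192);  /* limit must fit in 16 bits */
    uint16_t limit = (uint16_t)(n_entries * GDT_ENTRY_SIZE - 1);

    out[0] = (uint8_t)(limit & 0xff);
    out[1] = (uint8_t)(limit >> 8);
    out[2] = (uint8_t)(base & 0xff);
    out[3] = (uint8_t)((base >> 8) & 0xff);
    out[4] = (uint8_t)((base >> 16) & 0xff);
    out[5] = (uint8_t)((base >> 24) & 0xff);
}
```

For three entries the first two bytes come out as 0x17 0x00, matching `.word 0x17`; the remaining four bytes hold whatever base address is passed in.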

The next figure covers the segment type field.

[Figure: segment type field encodings]

Therefore:

#define STA_X		0x8	    // Executable segment
#define STA_W		0x2	    // Writeable (non-executable segments)
#define STA_R		0x2	    // Readable (executable segments)

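So (STA_X|STA_R) gives type 1010, that is, an execute/read code segment.

The next figure shows the segment descriptor.

[Figure: segment descriptor layout]

A few of its fields are of interest here: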

Segment limit field
Specifies the size of the segment. The processor puts together the two segment limit fields to form a 20-bit value.

Base address fields
Defines the location of byte 0 of the segment within the 4-GByte linear address space. The processor puts together the three base address fields to form a single 32-bit value. Segment base addresses should be aligned to 16-byte boundaries. Although 16-byte alignment is not required, this alignment allows programs to maximize performance by aligning code and data on 16-byte boundaries.

Type field

Indicates the segment or gate type and specifies the kinds of access that can be made to the segment and the direction of growth.

S (descriptor type) flag
Specifies whether the segment descriptor is for a system segment (S flag is clear) or a code or data segment (S flag is set).

DPL (descriptor privilege level) field
Specifies the privilege level of the segment. The privilege level can range from 0 to 3, with 0 being the most privileged level.

P (segment-present) flag
Indicates whether the segment is present in memory (set) or not present (clear).

D/B (default operation size/default stack pointer size and/or upper bound) flag
Performs different functions depending on whether the segment descriptor is an executable code segment, an expand-down data segment, or a stack segment. (This flag should always be set to 1 for 32-bit code and data segments and to 0 for 16-bit code and data segments.)

L (64-bit code segment) flag
In IA-32e mode, bit 21 of the second doubleword of the segment descriptor indicates whether a code segment contains native 64-bit code. A value of 1 indicates instructions in this code segment are executed in 64-bit mode. A value of 0 indicates the instructions in this code segment are executed in compatibility mode.

G (granularity) flag
Determines the scaling of the segment limit field. When the granularity flag is clear, the segment limit is interpreted in byte units; when flag is set, the segment limit is interpreted in 4-KByte units. (This flag does not affect the granularity of the base address; it is always byte granular.) When the granularity flag is set, the twelve least significant bits of an offset are not tested when checking the offset against the segment limit. For example, when the granularity flag is set, a limit of 0 results in valid offsets from 0 to 4095.
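The field descriptions above can be turned into a small decoder. This is a sketch under stated assumptions: struct seg_info and decode_descriptor are names invented here, the 8 bytes are taken in memory order (lowest address first), and expand-down segments, which interpret the limit differently, are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one 8-byte segment descriptor (hypothetical type). */
struct seg_info {
    uint32_t base;         /* three base fields joined into 32 bits */
    uint32_t last_offset;  /* highest valid offset once G is applied */
    uint8_t  type;         /* 4-bit type field */
    uint8_t  s, dpl, p, db, g;
};

struct seg_info decode_descriptor(const uint8_t d[8])
{
    struct seg_info si;
    assert(d != NULL);

    /* 20-bit limit: bytes 0-1 plus the low nibble of byte 6 */
    uint32_t raw_limit = d[0] | ((uint32_t)d[1] << 8)
                       | ((uint32_t)(d[6] & 0x0f) << 16);

    si.base = d[2] | ((uint32_t)d[3] << 8)
            | ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
    si.type = d[5] & 0x0f;
    si.s    = (d[5] >> 4) & 1;
    si.dpl  = (d[5] >> 5) & 3;
    si.p    = (d[5] >> 7) & 1;
    si.db   = (d[6] >> 6) & 1;
    si.g    = (d[6] >> 7) & 1;

    /* G = 1: limit counts 4 KB units and the low 12 offset bits are not checked */
    si.last_offset = si.g ? ((raw_limit << 12) | 0xfff) : raw_limit;
    return si;
}
```

A descriptor with G set and a limit field of 0 decodes to last_offset 4095, which is the example the manual gives.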

With these fields in mind, the boot.S code becomes clear. Take

SEG(STA_X|STA_R, 0x0, 0xffffffff)	    # code seg

as an example. First read the macro itself (a C preprocessor macro that expands to assembler directives):

#define SEG(type,base,lim)					\
	.word (((lim) >> 12) & 0xffff), ((base) & 0xffff);	\
	.byte (((base) >> 16) & 0xff), (0x90 | (type)),		\
		(0xC0 | (((lim) >> 28) & 0xf)), (((base) >> 24) & 0xff)

.word (((lim) >> 12) & 0xffff), ((base) & 0xffff)

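The first .word takes bits 12-27 of lim, 16 bits in all, and (((lim) >> 28) & 0xf) takes the top 4 bits, for a total of 20 bits. In other words, the macro drops the low 12 bits of the byte-granular limit and stores the rest in the 20-bit limit field. To map the whole 4 GB space the limit has to reach 0xffffffff. Because the G bit of the descriptor is set further on, the stored limit 0xfffff is counted in 4 KB units, which extends it to the full 4 GB.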

(0x90 | (type)),(0xC0 | (((lim) >> 28) & 0xf))

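Only two constants need explaining here:

0x90: P=1 (present in memory), DPL=00 (privilege level 0), S=1 (code or data segment)

0xC0: G=1 (limit counted in 4 KB units), D/B=1 (should always be 1 for 32-bit code), L=0, AVL=0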
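To check the reading of the macro, the same byte layout can be reproduced in C. A minimal sketch: seg_encode is a made-up function that writes the bytes in the order the .word/.byte directives emit them on a little-endian machine. The STA_* values are copied from inc/mmu.h.

```c
#include <assert.h>
#include <stdint.h>

#define STA_X 0x8   /* executable segment */
#define STA_W 0x2   /* writeable (non-executable segments) */
#define STA_R 0x2   /* readable (executable segments) */

/* C counterpart of the SEG(type, base, lim) assembler macro
 * (hypothetical helper, for illustration). */
void seg_encode(uint8_t out[8], uint8_t type, uint32_t base, uint32_t lim)
{
    assert((type & 0xf0) == 0);                    /* type is a 4-bit field */
    out[0] = (uint8_t)((lim >> 12) & 0xff);        /* limit bits 12..19 */
    out[1] = (uint8_t)((lim >> 20) & 0xff);        /* limit bits 20..27 */
    out[2] = (uint8_t)(base & 0xff);               /* base 0..7 */
    out[3] = (uint8_t)((base >> 8) & 0xff);        /* base 8..15 */
    out[4] = (uint8_t)((base >> 16) & 0xff);       /* base 16..23 */
    out[5] = (uint8_t)(0x90 | type);               /* P=1, DPL=0, S=1, type */
    out[6] = (uint8_t)(0xC0 | ((lim >> 28) & 0xf)); /* G=1, D/B=1, limit 28..31 */
    out[7] = (uint8_t)((base >> 24) & 0xff);       /* base 24..31 */
}
```

SEG(STA_X|STA_R, 0x0, 0xffffffff) therefore becomes the bytes ff ff 00 00 00 9a cf 00, and the data segment differs only in the access byte, 0x92.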

That concludes the GDT part.

Next:

movl    %cr0, %eax
orl     $CR0_PE_ON, %eax
movl    %eax, %cr0

First, the relevant part of CR0:

CR0.PE
Protection Enable (bit 0 of CR0) — Enables protected mode when set; enables real-address mode when clear. This flag does not enable paging directly. It only enables segment-level protection. To enable paging, both the PE and PG flags must be set.

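So setting PE turns on protected mode and with it segment-level protection (paging, PG, stays off), which is why the GDT had to be set up first. The processor is now in protected mode and addresses are formed differently, although it keeps executing 16-bit code until the far jump below reloads CS with a 32-bit code segment.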
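The three-instruction sequence is a read-modify-write of one bit. As a rough C illustration (the helper names are invented, and the starting CR0 value in the usage note is arbitrary):

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE UINT32_C(0x00000001)  /* bit 0: protection enable */
#define CR0_PG UINT32_C(0x80000000)  /* bit 31: paging */

/* Mirrors "movl %cr0,%eax; orl $CR0_PE_ON,%eax; movl %eax,%cr0". */
uint32_t cr0_enable_protection(uint32_t cr0)
{
    /* PG without PE is not a valid combination to start from */
    assert(!((cr0 & CR0_PG) && !(cr0 & CR0_PE)));
    return cr0 | CR0_PE;
}

int cr0_paging_on(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0;
}
```

Starting from 0x00000010, the result is 0x00000011: PE is set, every other bit is preserved, and paging is still off.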

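First, the segment selector.

In protected mode an address is formed from "segment selector + offset within the segment", which yields the linear address (equal to the physical address while paging is off).

A selector is 16 bits wide: bits 15..3 are the index into the descriptor table, bit 2 is TI, which selects the GDT (0) or the LDT (1), and bits 1..0 are the requested privilege level (RPL).

Therefore: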

# Jump to next instruction, but in 32-bit code segment.
# Switches processor into 32-bit mode.
ljmp    $PROT_MODE_CSEG, $protcseg

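The segment selector is $PROT_MODE_CSEG, i.e. 0x8, or 0000000000001 0 00 in binary (index = 1, TI = 0, RPL = 0).

It therefore selects the second entry of the GDT, the code segment, whose base is 0x00000000. Execution continues at the same addresses as before, when CS was 0.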
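Splitting a selector into its fields takes a few shifts. A minimal sketch with invented names (struct selector_fields, decode_selector, gdt_byte_offset):

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of a 16-bit segment selector (hypothetical type). */
struct selector_fields {
    uint16_t index;  /* bits 15..3: entry number in the table */
    uint8_t  ti;     /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;    /* bits 1..0: requested privilege level */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 1;
    f.rpl   = sel & 3;
    return f;
}

/* Byte offset of the selected descriptor from the start of the GDT. */
uint32_t gdt_byte_offset(uint16_t sel)
{
    assert(((sel >> 2) & 1) == 0);   /* only meaningful when TI selects the GDT */
    return (uint32_t)(sel >> 3) * 8; /* 8 bytes per descriptor */
}
```

PROT_MODE_CSEG (0x8) decodes to index 1 at byte offset 8, and PROT_MODE_DSEG (0x10) to index 2 at byte offset 16, matching the order null, code, data in the bootstrap GDT.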

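A little more on segment selectors and segment registers in protected mode [from Intel manual Volume 3].

To reduce address translation time and coding complexity, the processor provides six segment registers to hold segment selectors.

Each segment register supports a specific kind of memory reference. For any program to run, at least CS, DS, and SS must hold valid segment selectors. For a program to access a segment, that segment's selector must be loaded into one of the segment registers, so although many segments can be defined, only six are directly usable at a time. Each segment register has a visible part and a hidden part: when a selector is loaded into the visible part, the processor also fills the hidden part with the information from the segment descriptor the selector points to.

Two kinds of instructions load the segment registers:

Direct loads: MOV, POP, LDS, LES, LSS, LGS, and LFS.

Implicit loads: the far-pointer versions of CALL, JMP, and RET, plus IRET, INT n, INTO, and INT3. These change the CS register (and sometimes other segment registers) as part of their operation.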

# Set up the stack pointer and call into C.
  movl    $start, %esp
  call bootmain

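The top of the stack is set to start, which is 0x7c00, so the stack grows downward from just below the boot code. After that the C function bootmain is called.

References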
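Why it is safe to point the stack at the first byte of the boot code follows from how a push works. A rough model, with an invented push32 helper and a byte array standing in for low memory:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BOOT_START 0x7c00u   /* address of the start label */

/* A 32-bit push: decrement ESP by 4 first, then store the value.
 * (Hypothetical helper; mem stands in for physical memory.) */
void push32(uint8_t *mem, uint32_t *esp, uint32_t value)
{
    assert(*esp >= 4);
    *esp -= 4;
    memcpy(mem + *esp, &value, sizeof value);
}
```

With esp starting at 0x7c00, the first push writes to 0x7bfc..0x7bff and leaves esp at 0x7bfc, so the bytes at 0x7c00 and above (the boot sector itself) are never overwritten by the stack.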

[Detailed walkthrough of the 8042]https://blog.csdn.net/wyyy2088511/article/details/108847079

[On the A20 address line]https://docs.huihoo.com/gnu_linux/own_os/booting-a20_4.htm

[On 0xdf]https://stackoverflow.com/questions/15768683/the-a20-line-with-jos

[A20 and programming]https://www.win.tue.nl/~aeb/linux/kbd/A20.html

[GDT wiki]https://en.wikipedia.org/wiki/Global_Descriptor_Table

[lgdt]https://www.jianshu.com/p/2cb94c4c0cd0

[ucore]https://zhuanlan.zhihu.com/p/67259776

[CHAPTER 3 protected-mode]https://pdos.csail.mit.edu/6.828/2018/readings/ia32/IA32-3A.pdf

[ucore boot]https://www.cnblogs.com/maruixin/p/3175894.html

boot/main.c

Start with a fairly simple function:

void waitdisk(void)
{
	// wait for disk ready
	while ((inb(0x1F7) & 0xC0) != 0x40)
		/* do nothing */;
}

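When read, port 1F7 acts as the status register: bit 7 = 0 means the controller is idle, and bit 6 = 1 means the drive is ready. waitdisk therefore stops waiting only when both conditions hold. The mask 0xC0 keeps just those two bits, so the loop exits as soon as they read 01 (masked value 0100 0000), whatever the other six bits are.

Next is a function that reads one sector: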
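The exit condition can be isolated as a predicate and tried on a few status bytes. A small sketch; ide_ready and first_ready are illustrative names, and the status values in the usage note are made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True when waitdisk would stop: bit 7 (busy) clear, bit 6 (ready) set. */
int ide_ready(uint8_t status)
{
    return (status & 0xC0) == 0x40;
}

/* Index of the first status byte in a recorded sequence at which
 * polling would stop, or -1 if none qualifies. */
int first_ready(const uint8_t *statuses, int n)
{
    assert(statuses != NULL && n >= 0);
    for (int i = 0; i < n; i++)
        if (ide_ready(statuses[i]))
            return i;
    return -1;
}
```

For the sequence 0x80, 0xC0, 0x50, polling stops at the third byte: 0x80 and 0xC0 both have the busy bit set, while 0x50 has busy clear and ready set even though another low bit is also on.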

void readsect(void *dst, uint32_t offset)
{
	// wait for disk to be ready
	waitdisk();

	outb(0x1F2, 1);		// count = 1
	outb(0x1F3, offset);
	outb(0x1F4, offset >> 8);
	outb(0x1F5, offset >> 16);
	outb(0x1F6, (offset >> 24) | 0xE0);
	outb(0x1F7, 0x20);	// cmd 0x20 - read sectors

	// wait for disk to be ready
	waitdisk();

	// read a sector
	insl(0x1F0, dst, SECTSIZE/4);
}

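First, the two helpers:

outb(port, data): write the 1-byte value data to port

insl(port, addr, cnt): read cnt dwords from port into addr

For the meaning of each port, see the UCORE lab material.

So this function reads SECTSIZE/4 dwords from the disk, which is 512 bytes, exactly one sector.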
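The five outb calls spread one sector number across the controller's registers. The sketch below computes the values without touching any port; struct ide_regs and lba28_regs are invented names, and the reading of 0xE0 as "LBA mode, drive 0" is my understanding of the ATA register layout rather than something stated in the lab code.

```c
#include <assert.h>
#include <stdint.h>

/* Values readsect writes to ports 0x1F2..0x1F6 (hypothetical type). */
struct ide_regs {
    uint8_t count;       /* 0x1F2: number of sectors */
    uint8_t lba_lo;      /* 0x1F3: sector number bits 0..7 */
    uint8_t lba_mid;     /* 0x1F4: bits 8..15 */
    uint8_t lba_hi;      /* 0x1F5: bits 16..23 */
    uint8_t drive_head;  /* 0x1F6: bits 24..27 plus mode/drive bits */
};

struct ide_regs lba28_regs(uint32_t sector)
{
    struct ide_regs r;
    assert(sector < (UINT32_C(1) << 28));   /* only 28 address bits exist */
    r.count      = 1;
    r.lba_lo     = (uint8_t)sector;         /* outb keeps the low byte only */
    r.lba_mid    = (uint8_t)(sector >> 8);
    r.lba_hi     = (uint8_t)(sector >> 16);
    r.drive_head = (uint8_t)((sector >> 24) | 0xE0);
    return r;
}
```

Sector 1, the first sector of the kernel image, gives 01 01 00 00 e0; a larger number such as 0x01234567 gives 01 67 45 23 e1.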

The next function does the work of loading the kernel: it is a helper that reads from the disk into a given address in memory.

// Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
// Might copy more than asked
void
readseg(uint32_t pa, uint32_t count, uint32_t offset)
{
	uint32_t end_pa;
	end_pa = pa + count;
	// round down to sector boundary
	pa &= ~(SECTSIZE - 1);

	// translate from bytes to sectors, and kernel starts at sector 1
	offset = (offset / SECTSIZE) + 1;

	// If this is too slow, we could read lots of sectors at a time.
	// We'd write more to memory than asked, but it doesn't matter --
	// we load in increasing order.
	while (pa < end_pa) {
		// Since we haven't enabled paging yet and we're using
		// an identity segment mapping (see boot.S), we can
		// use physical addresses directly.  This won't be the
		// case once JOS enables the MMU.
		readsect((uint8_t*) pa, offset);
		pa += SECTSIZE;
		offset++;
	}
}
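The address arithmetic in readseg is easier to see without the disk access. The sketch below only computes what the loop would do; struct read_plan and plan_readseg are invented for this illustration, and SECTSIZE is assumed to be 512 as in the lab code.

```c
#include <assert.h>
#include <stdint.h>

#define SECTSIZE 512u

/* What readseg would do for a given request (hypothetical type). */
struct read_plan {
    uint32_t first_pa;      /* pa rounded down to a sector boundary */
    uint32_t first_sector;  /* first disk sector read */
    uint32_t nsectors;      /* number of readsect calls */
};

struct read_plan plan_readseg(uint32_t pa, uint32_t count, uint32_t offset)
{
    struct read_plan p;
    uint32_t end_pa = pa + count;
    assert(end_pa >= pa);                      /* request must not wrap */

    pa &= ~(SECTSIZE - 1);                     /* round down to sector boundary */
    p.first_pa = pa;
    p.first_sector = offset / SECTSIZE + 1;    /* kernel starts at sector 1 */
    /* one sector per loop iteration while pa < end_pa */
    p.nsectors = (uint32_t)(((uint64_t)end_pa - pa + SECTSIZE - 1) / SECTSIZE);
    return p;
}
```

Reading the first page, plan_readseg(0x10000, 4096, 0), covers sectors 1 to 8. A request such as plan_readseg(0x10010, 0x100, 0x210) starts writing at 0x10000 from sector 2 and reads a single sector, which is the "might copy more than asked" case; the data lands where expected in this example because pa and offset sit at the same position within a sector.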

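Before explaining bootmain, a few words on the ELF file format.

ELF (Executable and Linkable Format) is a file storage format; object files and executables on Linux are stored in it. The instructions produced by compilation go into the code section, and global variables and static local variables go into the data section. The file begins with a "file header" that records the properties of the whole file.

Here is part of inc/elf.h: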

#define ELF_MAGIC 0x464C457FU	/* "\x7FELF" in little endian */

struct Elf {
	uint32_t e_magic;	// must equal ELF_MAGIC
	uint8_t e_elf[12];
	uint16_t e_type;
	uint16_t e_machine;
	uint32_t e_version;
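	uint32_t e_entry;  // virtual address of the program entry point
	uint32_t e_phoff;  // file offset of the program header table
	uint32_t e_shoff;
	uint32_t e_flags;
	uint16_t e_ehsize;
	uint16_t e_phentsize;
	uint16_t e_phnum;  // number of entries in the program header table
	uint16_t e_shentsize;
	uint16_t e_shnum;
	uint16_t e_shstrndx;
};
// an ELF file is divided into several segments (code, data, and so on)
struct Proghdr {
	uint32_t p_type;	// segment type
	uint32_t p_offset;  // offset of the segment from the start of the file
	uint32_t p_va;		// virtual address where the first byte of the segment goes
	uint32_t p_pa;
	uint32_t p_filesz;
	uint32_t p_memsz;	// number of bytes the segment occupies in the memory image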
	uint32_t p_flags;
	uint32_t p_align;
};
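The comment on ELF_MAGIC can be verified by assembling the first four bytes of a file by hand. A minimal sketch; read_le32 and is_elf are illustrative helpers, not part of the lab code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ELF_MAGIC 0x464C457FU   /* "\x7FELF" in little endian */

/* Assemble a 32-bit little-endian value from four bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Same check bootmain performs on e_magic, done on raw bytes. */
int is_elf(const uint8_t *image, size_t len)
{
    assert(image != NULL);
    return len >= 4 && read_le32(image) == ELF_MAGIC;
}
```

The bytes 7f 45 4c 46 ("\x7F", 'E', 'L', 'F') read as a little-endian word give 0x464C457F, so is_elf accepts them and rejects anything else.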

Here is the implementation of bootmain, with comments on the key parts:

void
bootmain(void)
{
	// two program-header pointers: current entry and end of table
	struct Proghdr *ph, *eph;

	// read 1st page off disk:
	// load SECTSIZE*8 = 512*8 = 4 KB from offset 0 into memory
	// at ELFHDR (0x10000)
	readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);

	// is this a valid ELF?
	if (ELFHDR->e_magic != ELF_MAGIC)
		goto bad;

	// load each program segment (ignores ph flags)
	// ph: start of the program header table, i.e. the start of
	// the ELF image plus e_phoff
	ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);

	// eph: one past the last entry, i.e. ph plus the number of entries
	eph = ph + ELFHDR->e_phnum;

	for (; ph < eph; ph++)
		// copy every segment listed in the table (code, data)
		// into memory
		// p_pa is the load address of this segment (as well
		// as the physical address)
		// arguments: physical address, size in memory, file offset
		readseg(ph->p_pa, ph->p_memsz, ph->p_offset);

	// the kernel is now loaded; hand over control
	// call the entry point from the ELF header
	// note: does not return!
	// the loaded kernel image begins with an ELF header, which is
	// where the entry address comes from
	((void (*)(void)) (ELFHDR->e_entry))();
bad:
	outw(0x8A00, 0x8A00);
	outw(0x8A00, 0x8E00);
	while (1)
		/* do nothing */;
}
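The header walk in bootmain can be tried on an in-memory image instead of a disk. This is a sketch under several assumptions: it parses raw bytes at the standard 32-bit ELF field offsets rather than using struct Elf and struct Proghdr, it records what would be passed to readseg instead of reading anything, it steps through the table by e_phentsize where bootmain steps by sizeof(struct Proghdr), and collect_segments and struct load_req are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ELF_MAGIC 0x464C457FU
#define EHDR_SIZE 52u   /* size of the 32-bit ELF header */
#define PHDR_SIZE 32u   /* size of one 32-bit program header */

/* Arguments bootmain would hand to readseg for one segment. */
struct load_req { uint32_t pa, memsz, offset; };

static uint16_t le16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the number of program headers found, or -1 for a bad image.
 * On success *entry receives e_entry. */
int collect_segments(const uint8_t *img, size_t len,
                     struct load_req *out, int max, uint32_t *entry)
{
    assert(img != NULL && out != NULL && entry != NULL);
    if (len < EHDR_SIZE || le32(img) != ELF_MAGIC)
        return -1;                              /* same test as "goto bad" */

    uint32_t phoff     = le32(img + 28);        /* e_phoff */
    uint16_t phentsize = le16(img + 42);        /* e_phentsize */
    uint16_t phnum     = le16(img + 44);        /* e_phnum */
    if (phnum > max || phentsize < PHDR_SIZE ||
        (uint64_t)phoff + (uint64_t)phnum * phentsize > len)
        return -1;

    for (uint16_t i = 0; i < phnum; i++) {
        const uint8_t *ph = img + phoff + (size_t)i * phentsize;
        out[i].offset = le32(ph + 4);           /* p_offset */
        out[i].pa     = le32(ph + 12);          /* p_pa */
        out[i].memsz  = le32(ph + 20);          /* p_memsz */
    }
    *entry = le32(img + 24);                    /* e_entry */
    return phnum;
}
```

Given an 84-byte image holding one header and one program header, the function reports one segment together with its physical address, size, and file offset, plus the entry address that bootmain would jump to.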

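At this point the bootloader code holds no more secrets. It is a very low-level program, but it helps a lot in understanding the first stage of how a computer boots.

The next article covers the kernel part and the Lab 1 exercises.

References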

[ports]https://bochs.sourceforge.io/techspec/PORTS.LST

[ports]http://web.archive.org/web/20040304063834/http://members.iweb.net.au/~pstorr/pcbook/book2/ioassign.htm

[ucore ports]https://chyyuu.gitbooks.io/ucore_os_docs/content/lab1/lab1_3_2_3_dist_accessing.html

[insl]https://stackoverflow.com/questions/38410829/why-cant-find-the-insl-instruction-in-x86-document

[ucore elf]https://chyyuu.gitbooks.io/ucore_os_docs/content/lab1/lab1_3_2_4_elf.html

[elf]https://blog.csdn.net/qqNCer/article/details/105984272
