diff --git a/Documentation/teaching/labs/introduction.rst b/Documentation/teaching/labs/introduction.rst index 9709d3dda98d99..fb80ceb5df7cc4 100644 --- a/Documentation/teaching/labs/introduction.rst +++ b/Documentation/teaching/labs/introduction.rst @@ -64,11 +64,17 @@ cscope `Cscope `__ is a tool for efficient navigation of C sources. To use it, a cscope database must be geberated from the existing sources. In a Linux tree, the command -:command:`make ARCH = x86 cscope` is sufficient. Specification of the +:command:`make ARCH=x86 cscope` is sufficient. Specification of the architecture through the ARCH variable is optional but recommended; otherwise, some architecture dependent functions will appear multiple times in the database. +You can build the cscope database with the command :command:`make +ARCH=x86 COMPILED_SOURCE=1 cscope`. This way, the cscope database will +only contain symbols that have already been used in the compile +process before, thus resulting in better performance when searching +for symbols. + Cscope can also be used as stand-alone, but it is more useful when combined with an editor. To use cscope with :command:`vim`, it is necessary to install both packages and add the following lines to the file diff --git a/Documentation/teaching/lectures/intro.rst b/Documentation/teaching/lectures/intro.rst index 2640e534b66910..2d42f8d097d088 100644 --- a/Documentation/teaching/lectures/intro.rst +++ b/Documentation/teaching/lectures/intro.rst @@ -42,7 +42,7 @@ User vs Kernel * User-space -Kernel and user are two terms that are often use in operating +Kernel and user are two terms that are often used in operating systems. 
Their definition is pretty straight forward: The kernel is the part of the operating system that runs with higher privileges while user (space) usually means by applications running with low @@ -265,7 +265,7 @@ goal: inline functions, function pointers -There are a class of operating systems that (used to) claim to be +There is a class of operating systems that (used to) claim to be hybrid kernels, in between monolithic and micro-kernels (e.g. Windows, Mac OS X). However, since all of the typical monolithic services run in kernel-mode in these operating systems, there is little merit to diff --git a/Documentation/teaching/lectures/syscalls.rst b/Documentation/teaching/lectures/syscalls.rst index 9c31e5caa362f5..46262a09a9ccbd 100644 --- a/Documentation/teaching/lectures/syscalls.rst +++ b/Documentation/teaching/lectures/syscalls.rst @@ -95,7 +95,7 @@ similar with how interrupts and exception are handled (in fact on some architectures this transition happens as a result of an exception). The system call entry point will save registers (which contains values -from userspace, including system call number and system call +from user space, including system call number and system call parameters) on stack and then it will continue with executing the system call dispatcher. @@ -172,7 +172,7 @@ number and run the kernel function associated with the system call. -To demonstrate the system call flow are are going to use the virtual +To demonstrate the system call flow we are going to use the virtual machine setup, attach gdb to a running kernel, add a breakpoint to the dup2 system call and inspect the state. @@ -203,10 +203,10 @@ In summary, this is what happens during a system call: * The system call dispatcher identifies the system call function and runs it - * The userspace registers are restored and execution is switched + * The user space registers are restored and execution is switched back to user (e.g. 
calling IRET) - * Userspace application resumes + * The user space application resumes System call table @@ -245,11 +245,11 @@ system call numbers to kernel functions: System call parameters handling ------------------------------- -Handling system calls parameters is tricky. Since these values are -setup by userspace, the kernel can not assume correctness and must +Handling system call parameters is tricky. Since these values are +set up by user space, the kernel cannot assume correctness and must always verify them throughly. -Pointers have a few important special case that must be checked: +Pointers have a few important special cases that must be checked: .. slide:: System Calls Pointer Parameters :inline-contents: True @@ -267,7 +267,7 @@ applications might get read or write access to kernel space. For example, lets consider the case where such a check is not made for the read or write system calls. If the user passes a kernel-space pointer to a write system call then it can get access to kernel data -by later reading the file. If it passes an kernel-space pointer to a +by later reading the file. If it passes a kernel-space pointer to a read system call then it can corrupt kernel memory. @@ -306,11 +306,11 @@ space to determine the cause: :level: 2 * Copy on write, demand paging, swapping: both the fault and - faulting addresses are in userspace; the fault address is + faulting addresses are in user space; the fault address is valid (checked against the user address space) * Invalid pointer used in system call: the faulting address is - in kernel space; the fault address is in userspace and it is + in kernel space; the fault address is in user space and it is invalid * Kernel bug (kernel accesses invalid pointer): same as above @@ -319,7 +319,7 @@ But in the last two cases we don't have enough information to determine the cause of the fault.
In order to solve this issue Linux uses special APIs (e.g -:c:func:`copy_to_user`) to accesses userspace that are specially +:c:func:`copy_to_user`) to access user space that are specially crafted: .. slide:: Marking kernel code that accesses user space @@ -335,7 +335,7 @@ crafted: Although the fault handling case may be more costly overall depending on the address space vs exception table size, and it is more complex, -it does optimizes for the common case and that is why it is preferred +it is optimized for the common case and that is why it is preferred and used in Linux. @@ -372,7 +372,8 @@ With VDSO the system call interface is decided by the kernel: :level: 2 * a stream of instructions to issue the system call is generated by - the kernel in a special memory area (as en ELF .so) + the kernel in a special memory area (formatted as an ELF shared + object) * that memory area is mapped towards the end of the user address space @@ -392,7 +393,7 @@ With VDSO the system call interface is decided by the kernel: An interesting development of the VDSO are the virtual system calls -(vsyscalls) which run directly from userspace. These vsyscalls are +(vsyscalls) which run directly from user space. These vsyscalls are also part of VDSO and they are accessing data from the VDSO page that is either static or modified by the kernel in a separate read-write map of the VDSO page. Examples of system calls that can be implemented @@ -403,7 +404,7 @@ as vsyscalls are: getpid or gettimeofday. :inline-contents: True :level: 2 - * "System calls" that run directly from userspace, part of the VDSO + * "System calls" that run directly from user space, part of the VDSO * Static data (e.g. getpid()) @@ -411,16 +412,16 @@ as vsyscalls are: getpid or gettimeofday. (e.g.
gettimeofday(), time(), ) -Accessing userspace from system calls +Accessing user space from system calls ===================================== -As we mentioned earlier, userspace must be accessed with special APIs +As we mentioned earlier, user space must be accessed with special APIs (:c:func:`get_user`, :c:func:`put_user`, :c:func:`copy_from_user`, -:c:func:`copy_to_user`) that checks the pointer to be in userspace and -also handles the fault if the pointer is invalid. In case of invalid -pointers they returns a non zero value. +:c:func:`copy_to_user`) that check whether the pointer is in user space +and also handle the fault if the pointer is invalid. In case of invalid +pointers they return a non-zero value. -.. slide:: Accessing userspace from system calls +.. slide:: Accessing user space from system calls :inline-contents: True :level: 2 @@ -434,7 +435,7 @@ pointers they returns a non zero value. memcpy(&kernel_buffer, user_ptr, size); -Lets examine the simplest of API, get_user, as implemented for x86: +Let's examine the simplest API, get_user, as implemented for x86: .. slide:: get_user implementation :inline-contents: True @@ -458,7 +459,7 @@ Lets examine the simplest of API, get_user, as implemented for x86: The implementation uses inline assembly, that allows inserting ASM -sequences in C code and also handles access to / from variable in the +sequences in C code and also handles access to / from variables in the ASM code. Based on the type size of the x variable, one of __get_user_1, @@ -518,11 +519,11 @@ with the addr_limit field of the current task (process) descriptor to make sure that we don't have a pointer to kernel space. Then, SMAP is disabled, to allow access to user from kernel, and the -access to userspace is done with the instruction at the 1: label. EAX +access to user space is done with the instruction at the 1: label. EAX is then zeroed to mark success, SMAP is enabled, and the call returns.
-The movzbl instruction is the one that does the access to userspace -and it's address is captured with the 1: label and stored in a special +The movzbl instruction is the one that does the access to user space +and its address is captured with the 1: label and stored in a special section: .. slide:: Exception table entry @@ -544,7 +545,7 @@ section: _ASM_EXTABLE_HANDLE(from, to, ex_handler_default) -For each address that accesses userspace we have an entry in the +For each address that accesses user space we have an entry in the exception table, that is made up of: the faulting address(from), where to jump to in case of a fault, and a handler function (that implements the jump logic). All of these addresses are stored on 32bit in