
What are the return values of system calls in Assembly?

Question Detail

When I research the return values of Linux kernel system calls, I find tables that describe each call and what I need to put in the different registers to make it work. However, I can’t find any documentation that states what the return value I get back from a system call actually means. All I find, in various places, is that the result will be in the EAX register.


TutorialsPoint:

The result is usually returned in the EAX register.

Assembly Language Step-By-Step: Programming with Linux book by Jeff Duntemann states many times in his programs:

  • Look at sys_read’s return value in EAX

  • Copy sys_read return value for safe keeping


None of the websites I have found explain this return value. Is there an Internet source that does? Or can someone explain these values to me?

Question Answer

See also this excellent LWN article about system calls which assumes C knowledge.

Also: The Definitive Guide to Linux System Calls (on x86), and related: What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?


C is the language of Unix systems programming, so all the documentation is in terms of C. And then there’s documentation for the minor differences between the C interface and the asm on any given platform, usually in the Notes section of man pages.

sys_read means the raw system call (as opposed to the libc wrapper function). The kernel implementation of the read system call is a kernel function called sys_read(). You can’t call it with a call instruction, because it’s in the kernel, not a library. But people still talk about “calling sys_read” to distinguish it from the libc function call. However, it’s ok to say read even when you mean the raw system call (especially when the libc wrapper doesn’t do anything special), like I do in this answer.

Also note that sys/syscall.h defines constants like SYS_read with the actual system call number; asm/unistd.h provides the Linux __NR_read names for the same constants. (This is the value you put in EAX before an int 0x80 or syscall instruction.)
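As a small sketch of where those constants come from, the following C helpers return the two spellings of the read system call number so they can be compared. The helper names read_syscall_number and read_nr_number are made up for this illustration; the numeric value depends on the architecture the code is compiled for, so none is assumed here.

```c
#include <sys/syscall.h>   /* SYS_read; also pulls in the __NR_* names on Linux */

/* Hypothetical helper: the SYS_ spelling of the read system call number. */
long read_syscall_number(void)
{
    return SYS_read;
}

/* Hypothetical helper: the Linux __NR_ spelling of the same number. */
long read_nr_number(void)
{
    return __NR_read;
}
```

Printing either value with printf("%ld\n", ...) shows the number you would load into EAX/RAX for that build target.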


Linux system call return values (in EAX/RAX on x86) are either “normal” success, or a -errno code for error. e.g. -EFAULT if you pass an invalid pointer. This behaviour is documented in the syscalls(2) man page.

-1 to -4095 means error, anything else means success. See AOSP non-obvious syscall() implementation for more details on this -4095UL .. -1UL range, which is portable across architectures on Linux, and applies to every system call. (In the future, a different architecture could use a different value for MAX_ERRNO, but the value for existing arches like x86-64 is guaranteed to stay the same as part of Linus’s don’t-break-userspace policy of keeping kernel ABIs stable.)

For example, glibc’s generic syscall(2) wrapper function uses this sequence: cmp rax, -4095 / jae SYSCALL_ERROR_LABEL, which is guaranteed to be future-proof for all Linux system calls.

You can use that wrapper function to make any system call, like syscall( __NR_mmap, ... ). (Or use an inline-asm wrapper header like https://github.com/linux-on-ibm-z/linux-syscall-support/blob/master/linux_syscall_support.h that has safe inline-asm for multiple ISAs, avoiding problems like missing "memory" clobbers that some other inline-asm wrappers have.)
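A minimal sketch of using the generic wrapper from C, assuming Linux with glibc. The helper names raw_getpid and close_invalid_fd are hypothetical. Note that syscall() has already decoded the kernel's value by the time it returns, so on failure you see -1 plus errno rather than the raw -errno code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* make sure unistd.h declares syscall() */
#endif
#include <errno.h>
#include <sys/syscall.h>   /* SYS_getpid, SYS_close */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: invoke getpid through the generic wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Hypothetical helper: closing file descriptor -1 always fails, so this
 * shows the decoded error convention: return -1, errno set to EBADF. */
long close_invalid_fd(void)
{
    errno = 0;
    return syscall(SYS_close, -1);
}
```

In asm, the second call would leave -EBADF in EAX/RAX; the -1/errno pair only exists on the C side of the wrapper.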


Interesting cases include getpriority, where the kernel ABI maps the -20..19 nice range to return values 40..1 (it returns 20 - nice, so a successful result is never negative), and libc decodes it. More details in a related answer about decoding syscall error return values.
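The getpriority decoding can be sketched as follows, assuming an architecture such as x86 where the getpriority(2) man page describes the kernel result as 20 minus the nice value. The function names are hypothetical, and error handling is left out to keep the sketch short.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/resource.h>  /* PRIO_PROCESS */
#include <sys/syscall.h>   /* SYS_getpriority */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: undo the kernel's translation.
 * The kernel reports 20 - nice, so nice = 20 - raw. */
int nice_from_raw_getpriority(long raw)
{
    return 20 - (int)raw;
}

/* Hypothetical helper: fetch the calling process's nice value through the
 * generic wrapper. who == 0 means "this process", so failure is not
 * expected; a real caller would check for a -1 return first. */
int current_nice_via_raw_syscall(void)
{
    long raw = syscall(SYS_getpriority, PRIO_PROCESS, 0);
    return nice_from_raw_getpriority(raw);
}
```

The translation exists because a nice value like -1 would otherwise be indistinguishable from the error code -EPERM in the return register.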

For mmap, if you wanted you could also detect an error just by checking that the return value isn’t page-aligned (e.g. any non-zero bits in the low 12, for a 4k page size), if that would be more efficient than checking p > -4096ULL.


To find the actual numeric values of constants for a specific platform, you need to find the C header file where they’re #defined. See my answer on a question about that for details. e.g. in asm-generic/errno-base.h / asm-generic/errno.h.
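If reading headers is inconvenient, one hedged alternative is to let the compiler resolve the constants and print them. The helper names below are hypothetical; errno.h and strerror() are standard, and the numbers printed are whatever the target platform's headers define.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print one errno constant with its description. */
void print_errno_constant(const char *name, int value)
{
    printf("%-7s = %3d  (%s)\n", name, value, strerror(value));
}

/* Hypothetical helper: print a few constants an asm program is likely
 * to see negated in EAX/RAX after a failed system call. */
void print_some_errno_constants(void)
{
    print_errno_constant("EINTR", EINTR);
    print_errno_constant("EBADF", EBADF);
    print_errno_constant("EFAULT", EFAULT);
    print_errno_constant("EINVAL", EINVAL);
}
```

Calling print_some_errno_constants() from main gives a quick lookup table for the platform you compiled on.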


The meanings of return values for each system call are documented in the section 2 man pages, like read(2). (sys_read is the raw system call that the glibc read() function is a very thin wrapper for.) Most man pages have a whole section for the return value, e.g.

RETURN VALUE

On success, the number of bytes read is returned (zero indicates
end of file), and the file position is advanced by this number. It
is not an error if this number is smaller than the number of bytes
requested; this may happen for example because fewer bytes are
actually available right now (maybe because we were close to end-of-
file, or because we are reading from a pipe, or from a terminal), or
because read() was interrupted by a signal. See also NOTES.

On error, -1 is returned, and errno is set appropriately. In this
case, it is left unspecified whether the file position (if any)
changes.

Note that the last paragraph describes what the glibc wrapper does, not the raw system call: if the raw return value is in the error range (-4095..-1), the wrapper sets errno to the negated value and returns -1. So if the raw system call returned -EFAULT, you get errno=EFAULT and a return value of -1.
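The decoding step can be sketched in C like this. It is an illustration of the convention only, not glibc's actual implementation, and decode_raw_syscall_result is a made-up name; the input is the value the kernel left in EAX/RAX.

```c
#include <errno.h>

/* Hypothetical helper: turn a raw kernel return value into the C
 * convention of "-1 and errno" on failure, unchanged value on success. */
long decode_raw_syscall_result(long raw)
{
    /* -4095 .. -1, viewed as unsigned, is the top 4095 values. */
    if ((unsigned long)raw >= -4095UL) {
        errno = (int)-raw;   /* e.g. raw == -EFAULT gives errno == EFAULT */
        return -1;
    }
    return raw;              /* success: byte count, fd, address, ... */
}
```

An asm program has no errno, so it would typically do the range check itself and then negate EAX/RAX to get the positive error number.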

And there’s a whole section listing all the possible error codes that read() is allowed to return, and what they mean specifically for read(). (POSIX standardizes most of this behaviour.)
