
Use of floating point in the Linux kernel

I am reading Robert Love's "Linux Kernel Development", and I came across the following passage:

No (Easy) Use of Floating Point

When a user-space process uses floating-point instructions, the kernel manages the transition from integer to floating point mode. What the kernel has to do when using floating-point instructions varies by architecture, but the kernel normally catches a trap and then initiates the transition from integer to floating point mode. Unlike user-space, the kernel does not have the luxury of seamless support for floating point because it cannot easily trap itself. Using a floating point inside the kernel requires manually saving and restoring the floating point registers, among other possible chores. The short answer is: Don’t do it! Except in the rare cases, no floating-point operations are in the kernel.

I've never heard of these "integer" and "floating-point" modes. What exactly are they, and why are they needed? Does this distinction exist on mainstream hardware architectures (such as x86), or is it specific to some more exotic environments? What exactly does a transition from integer to floating point mode entail, both from the point of view of the process and the kernel?

The book confuses the issue a bit by talking about a "mode". The integer instructions are always available, but the FPU can be disabled entirely or in part. No useful function ever consisted entirely of FP ops; for example, all the control instructions are considered "integer". See below for more.
@DigitalRoss: I agree about the terminology. Thanks for the answer BTW, it made things crystal clear.
It would be interesting to know what the desire to do floating point ops in the kernel stems from. It's tempting to say "poor design" in the sense of trying to do something in the kernel that should be done outside of it, but perhaps there are things a kernel truly should be doing where leveraging the FPU would be an innovative solution?
Since nobody mentioned it, if you use FP (or SIMD) inside the kernel, you need to call kernel_fpu_begin() / kernel_fpu_end() before/after your code to make sure user-space FPU state isn't corrupted. This is what Linux's md code does for RAID5 / RAID6.
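
For the curious, here is a minimal sketch of what such a protected region might look like. The helper name and the computation are made up for illustration; on x86 the declarations live in <asm/fpu/api.h>, and real kernel code doing FP/SIMD also has to be compiled with the appropriate per-file compiler flags (as the RAID code is), since the kernel is normally built with FP code generation disabled.

    #include <linux/kernel.h>
    #include <asm/fpu/api.h>    /* kernel_fpu_begin()/kernel_fpu_end() on x86 */

    /* Illustrative only: scale a value using FP inside the kernel.
     * kernel_fpu_begin() saves the user-space FPU state and disables
     * preemption; kernel_fpu_end() undoes both. FP instructions are only
     * legal between the two calls. */
    static int scale_value(int raw)
    {
        int result;

        kernel_fpu_begin();
        result = (int)(raw * 1.5f);
        kernel_fpu_end();

        return result;
    }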

DigitalRoss

Because...

many programs don't use floating point or don't use it on any given time slice; and

saving the FPU registers and other FPU state takes time; therefore

...an OS kernel may simply turn the FPU off. Presto, no state to save and restore, and therefore faster context-switching. (This is what "mode" meant: simply whether the FPU was enabled.)

If a program attempts an FPU op, the program will trap into the kernel, the kernel will turn the FPU on, restore any saved state that may already exist, and then return to re-execute the FPU op.

At context-switch time, the kernel knows to actually go through the state-save logic. (And then it may turn the FPU off again.)
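
In rough pseudocode, the lazy scheme described above looks something like the sketch below. The names are hypothetical, not the actual Linux implementation; on x86 the enable/disable step is the CR0.TS bit and the save/restore would be FXSAVE/FXRSTOR.

    /* Conceptual sketch of lazy FPU switching -- not real kernel code. */

    struct fpu_state { unsigned char regs[512]; };   /* e.g. an FXSAVE area */

    struct task {
        struct fpu_state fpu;
        /* ... other per-task state ... */
    };

    static struct task *fpu_owner;   /* task whose registers are live in the FPU */

    /* Hardware primitives, stubbed for illustration. */
    static void enable_fpu(void)                    { /* clear CR0.TS */ }
    static void disable_fpu(void)                   { /* set CR0.TS   */ }
    static void hw_save_fpu(struct fpu_state *s)    { (void)s; /* FXSAVE  */ }
    static void hw_restore_fpu(struct fpu_state *s) { (void)s; /* FXRSTOR */ }

    /* "Device not available" trap: a task executed an FP op with the FPU off. */
    void fpu_unavailable_trap(struct task *t)
    {
        enable_fpu();
        if (fpu_owner != t) {
            if (fpu_owner)
                hw_save_fpu(&fpu_owner->fpu);   /* save the previous owner's state */
            hw_restore_fpu(&t->fpu);            /* load this task's state */
            fpu_owner = t;
        }
        /* returning from the trap re-executes the faulting FP instruction */
    }

    /* Context switch: save nothing eagerly; just turn the FPU off so the next
     * FP instruction from whichever task runs takes the trap above. */
    void fpu_context_switch(void)
    {
        disable_fpu();
    }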

By the way, I believe the book's explanation for the reason kernels (and not just Linux) avoid FPU ops is ... not perfectly accurate.1

The kernel can trap into itself and does so for many things. (Timers, page faults, device interrupts, others.) The real reason is that the kernel doesn't particularly need FPU ops and also needs to run on architectures without an FPU at all. Therefore, it simply avoids the complexity and runtime required to manage its own FPU context by not doing ops for which there are always other software solutions.
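
(The usual "software solution" for fractional math in kernels is fixed-point, i.e. scaled-integer, arithmetic rather than software-emulated IEEE floats. A small standalone illustration, not taken from any particular kernel:)

    #include <stdint.h>
    #include <stdio.h>

    /* Q16.16 fixed point: 16 integer bits, 16 fractional bits. A common way
     * to express fractional quantities using only integer instructions. */
    #define FP_SHIFT 16
    #define FP_ONE   (1 << FP_SHIFT)

    typedef int32_t q16_16;

    static q16_16 fp_from_int(int x)         { return (q16_16)x << FP_SHIFT; }
    static int    fp_to_int(q16_16 x)        { return x >> FP_SHIFT; }
    static q16_16 fp_mul(q16_16 a, q16_16 b) { return (q16_16)(((int64_t)a * b) >> FP_SHIFT); }

    int main(void)
    {
        q16_16 ratio = (3 * FP_ONE) / 4;                        /* 0.75 */
        int result   = fp_to_int(fp_mul(fp_from_int(200), ratio));

        printf("75%% of 200 = %d\n", result);                   /* prints 150 */
        return 0;
    }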

It's interesting to note how often the FPU state would have to be saved if the kernel wanted to use FP: every system call, every interrupt, every switch between kernel threads. Even if there were a need for occasional kernel FP,2 it would probably be faster to do it in software.

1. That is, dead wrong.

2. There are a few cases I know about where kernel software contains a floating point arithmetic implementation. Some architectures implement traditional FPU ops in hardware but leave some complex IEEE FP operations to software. (Think: denormal arithmetic.) When some odd IEEE corner case happens they trap to software which contains a pedantically correct emulation of the ops that can trap.


The kernel would generally not “turn the FPU off.” I am not even sure that is possible on current processor models. Rather, the kernel merely omits saving and restoring FPU data for processes that are not using it. If process F uses the FPU for a while and then is interrupted or stopped temporarily, other processes that are not using the FPU can come and go. You do not want to turn the FPU off for them, because it is still holding the data from process F. The kernel is merely letting that data sit without saving it…
… When process F runs again, it can continue using the FPU with its data already there. If another process that wants to use the FPU runs, then the kernel needs to save F’s FPU data and load the new process’ data.
Hot Licks

With some kernel designs the floating-point registers are not saved when a "kernel" or "system" task is task-switched out. (This is because the FP registers are large and take both time and space to save.) So if you attempt to use FP the values will go "poof" randomly.

In addition, some hardware floating-point schemes rely on the kernel to handle "oddball" situations (e.g., division by zero) via a trap, and the required trap mechanism may be at a higher "level" than the one the kernel task is currently running at.

For these reasons (and a couple more) some hardware FP schemes will trap when you use an FP instruction for the first time in a task. If you're permitted to use FP, then a floating-point flag is turned on in the task; if not, you're shot by the firing squad.
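
A conceptual sketch of that first-use trap (hypothetical names and fields, not any specific kernel's code):

    #include <stdbool.h>

    struct task {
        bool fp_allowed;   /* policy: is this task permitted to use FP at all? */
        bool fp_used;      /* set on first FP use; tells the task switcher it
                              must now save/restore FP registers for this task */
    };

    /* Stubs for the actions involved, for illustration only. */
    static void enable_fp_access(struct task *t)     { (void)t; }
    static void deliver_fatal_signal(struct task *t) { (void)t; /* e.g. SIGILL */ }

    /* Trap taken the first time a task executes an FP instruction. */
    void first_fp_use_trap(struct task *t)
    {
        if (!t->fp_allowed) {
            deliver_fatal_signal(t);   /* the "firing squad" */
            return;
        }
        t->fp_used = true;             /* FP state is saved/restored from now on */
        enable_fp_access(t);
        /* return re-executes the trapped FP instruction */
    }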


In Linux, use kernel_fpu_begin() / kernel_fpu_end() before/after your code to trigger save/restore of the user-space FPU state (and, I guess, to protect your kernel FPU state against preemption).
tjarco

I'm hitting this result regarding floating-point usage in kernel space. What I'm wondering is: isn't this an "old" implementation (kept for compatibility), dating from older architectures in which a dedicated FPU was implemented as a physical 'co-processor' with its own pipelines, to which FPU instructions were outsourced?

I can imagine that such an architecture would have to handle 'outsourced' instructions differently because of pipeline delays and the like, but as I recall, current architectures (ARMv8) have their IEEE 754 instructions as part of the instruction set, not in an external FPU module. So it can't be turned on or off, nor does it have the issue of pipeline delays. Yes, there are probably some core registers that should be saved/restored, but that seems negligible (compared to the overhead of stack management).

In my opinion there's no argument against using floats in the kernel. As already said above, they're already used in kernel space for RAID.


Hello and welcome to StackOverflow. The question you have answered is already 9 years old, with 2 other well-upvoted answers. So before posting an answer to an old question like this, please make sure it contains new information.
If you have a new question, please ask it by clicking the Ask Question button. Include a link to this question if it helps provide context. - From Review