Is there a way to configure Linux so that the OOM killer will report/post only which process is killed, but not dump the entire machine state with it?

The reason I ask is that several of our customers run their workloads on our production systems. And OOM kills on some processes are bloating the log files - hence causing a lot of IO on the OS disk, and the system becomes unstable and hard to interact with.

We have spent time googling this, but we mostly find how to adjust the OOM score (the kill priority of each process), not how to change the logging level of the OOM killer.

Thanks for your time!

1 Answer

And OOM kills on some processes are bloating the log files - hence causing a lot of IO on the OS disk, and the system becomes unstable and hard to interact with.

NO!

Your system ran out of memory, so badly that the OOM killer was triggered. That is what makes your system unstable.

You're mistaking cause and effect there.


IMHO, if the out-of-memory killer needs to run frequently, that is the actual issue you need to address.

Unless your system is really large, or ridiculously undersized, generating and storing the task dump produced by the OOM killer should not be a real issue.

But on large systems, you can set the kernel tuneable vm.oom_dump_tasks to 0 to disable the task dump.

See https://www.kernel.org/doc/Documentation/sysctl/vm.txt

oom_dump_tasks

Enables a system-wide task dump (excluding kernel threads) to be produced when the kernel performs an OOM-killing and includes such information as pid, uid, tgid, vm size, rss, pgtables_bytes, swapents, oom_score_adj score, and name. This is helpful to determine why the OOM killer was invoked, to identify the rogue task that caused it, and to determine why the OOM killer chose the task it did to kill.

If this is set to zero, this information is suppressed. On very large systems with thousands of tasks it may not be feasible to dump the memory state information for each one. Such systems should not be forced to incur a performance penalty in OOM conditions when the information may not be desired.

If this is set to non-zero, this information is shown whenever the OOM killer actually kills a memory-hogging task.

The default value is 1 (enabled).
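
As a rough sketch of how that setting could be applied: the function below writes 0 to the tunable at runtime and drops a sysctl.d file so the value survives a reboot. The function name, the two overridable variables and the conf file name are my own placeholders, not anything from the kernel documentation, and it assumes your distribution reads /etc/sysctl.d/*.conf at boot. Going by the documentation quoted above, this suppresses only the per-task table; the message naming the killed process should still be logged.

```shell
#!/bin/sh
# Sketch: turn off the per-task dump that accompanies each OOM kill.
# PROC_FILE and CONF_FILE can be overridden to try this against scratch
# paths; the conf file name is a hypothetical choice, pick your own.
PROC_FILE="${PROC_FILE:-/proc/sys/vm/oom_dump_tasks}"
CONF_FILE="${CONF_FILE:-/etc/sysctl.d/90-oom-dump-tasks.conf}"

disable_oom_dump_tasks() {
    # Runtime change, same effect as `sysctl -w vm.oom_dump_tasks=0` (needs root).
    echo 0 > "$PROC_FILE" || return 1
    # Persist the value across reboots with a sysctl.d drop-in.
    echo "vm.oom_dump_tasks = 0" > "$CONF_FILE" || return 1
}
```

Call `disable_oom_dump_tasks` as root on the host, then check the result with `sysctl vm.oom_dump_tasks`.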

  • Ah, yes. That's what I'm looking for. Sorry that I didn't mention that the OOM kills are happening inside a container and, from my understanding, the host sees each OOM kill and dumps it to the log. I think this is causing the IO, as the OOM kills in the container are happening continuously. Once again, sorry for not mentioning that. Thanks for your answer! 20 hours ago
