Restart an OOM killed docker automatically

RustyNova@lemmy.world · 6 months ago

lemmyng@lemmy.ca · 6 months ago

Use -m and limit the build job’s memory so it doesn’t kill the docker daemon.
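
As a rough sketch of what that looks like (the image name, the 4g figure and the helper name are placeholders, not something from this thread), here is a small Python helper that assembles a docker run command with a hard memory cap. --memory is the long form of -m:

```python
import shlex

def docker_run_with_limit(image, memory="4g", command=()):
    """Return argv for `docker run` with a hard memory cap and no extra swap."""
    return [
        "docker", "run", "--rm",
        "--memory", memory,        # long form of -m
        "--memory-swap", memory,   # equal to --memory: no swap on top of the cap
        image, *command,
    ]

# Hypothetical example: run a cargo build inside a capped container.
argv = docker_run_with_limit("rust:latest", memory="4g", command=["cargo", "build"])
print(shlex.join(argv))
```

With a cap like this, a runaway build should be killed inside its own cgroup instead of dragging the daemon down with it.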

RustyNova@lemmy.world · 6 months ago

Fair enough. But I don’t want a band-aid fix, especially since I do all my docker work through Portainer and the option isn’t there.

It could also be useful if a container has a memory leak and no memory limit.

just_another_person@lemmy.world · 6 months ago

This isn’t a band-aid, it’s the literal fix.

Structuring the available CPU and Memory reservations for containers is LITERALLY the entire reason containers exist. Just because you’re only familiar with the “dumb” way of using them doesn’t mean you should be dismissive when someone offers you advice when you come here asking for it.

You’re also seemingly just a dick for being lazy, because I looked, and wuddyaknow. So now you’re just rude, dickish, and lazy.

Take the advice from the original responder, and then go and learn how to use the things you’re asking for help with, along with some manners.

drkt@lemmy.dbzer0.com · edit-2 16 days ago

deleted by creator

Bo7a@lemmy.ca · 6 months ago

You can’t expect people who are knowledgeable about this stuff to just forever accept that someone asks for advice, gets told the solution, and then ignores/belittles the person with knowledge.

This is our daily life experience. We get hired to be experts, and get told by non-experts that our solutions are not tenable every single day. Only for that solution to eventually be accepted when the user in question figures out their idea was not useful and the expert was correct.

We have to put up with it at work, we are not obliged to accept it here.

drkt@lemmy.dbzer0.com · edit-2 16 days ago

deleted by creator

Bo7a@lemmy.ca · 6 months ago

In which way am I complaining? I am explaining why calling a valid solution a bandaid might be construed as belittling their very real knowledge of this process, and how that is a regular pattern in a lot of technical fields.

And don’t give me this shit about ‘I’m not the person you were talking to’. This is an open forum, not a direct/private message.

gravitas_deficiency@sh.itjust.works · 6 months ago

You sound like you work in product

drkt@lemmy.dbzer0.com · edit-2 16 days ago

deleted by creator

just_another_person@lemmy.world · 6 months ago

I was obliged to respond to let him know that he was actually provided the correct answer, and that he didn’t need to respond like that to the person who provided it. I don’t feel it’s right to sit idly by and let people who are only trying to help for free get snark like that. Obliged, much.

RustyNova@lemmy.world · 5 months ago

There’s a difference between helping people who misunderstand a tool and belittling them for being wrong. It’s just a matter of wording that separates a helpful answer from a toxic one.

I could tell you “You should actually use Y instead of X. There are numerous benefits like A, B and C. The docs actually have a great example you may have missed, or not understood was for this purpose. It will help you a lot more than what you are thinking of doing.” And this would be fine.

But “Just use Y. X is bad because Y is made for that. You not being willing to use Y shouldn’t make you do X. It’s even the first Google link on how to do it” isn’t fine.

And I have not belittled them at all. I have said that it wasn’t what I was looking for. A lot of the time people post a question they think will solve their issue, only to realise that they didn’t understand the full picture and their problem is on a larger scale.

RustyNova@lemmy.world · edit-2 5 months ago

Alright, sorry for calling it a “bandaid fix”. It just wasn’t the right term for what I wanted to say. I was more referring to how it would only fix issues during builds, and not at actual runtime, which can also be an issue if I am not careful. So yeah, it’s the fix for the issue in the post, but this solution made me realise that this isn’t the only thing I want.

But the second part is… Just chill. It’s a home server. Not a high availability cluster. I can afford stupid things. Heck, I’m only asking this question because I got stupid and didn’t limit the job count of a cargo build, downing my server. I don’t care that my build crashes. I just want to not have to manually restart it, because when I’m not here I can’t do it.
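
A minimal sketch of capping the job count, assuming roughly 1 GiB of RAM per parallel compile job (a guess, not a measured number). cargo_jobs is a made-up helper name; --jobs is the real cargo flag:

```python
def cargo_jobs(cpu_count, mem_budget_mib, per_job_mib=1024):
    """Pick a job count bounded by both CPU count and a memory budget.

    per_job_mib is an assumed per-job footprint, not a measured value.
    """
    by_memory = mem_budget_mib // per_job_mib
    return max(1, min(cpu_count, by_memory))

# Hypothetical machine: 8 cores, 4 GiB set aside for the build.
jobs = cargo_jobs(cpu_count=8, mem_budget_mib=4096)
print(["cargo", "build", "--release", "--jobs", str(jobs)])
# prints ['cargo', 'build', '--release', '--jobs', '4']
```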

As for the link that you sent, it’s about container limits, not image building limits. And I have already set one up on my hungriest container; stats showed that it blew past it, so idk what’s going on there.

Edit: NVM. This is a bandaid fix. What if you forget to put the flag? Like it’s been 5 months since the last time and you forget to do the same fix? Or you accidentally remove it while editing the command? I’m actually looking for a solution that fixes my problem fully, not a partial solution.

just_another_person@lemmy.world · 5 months ago

Then you didn’t explain the issue very well, because what you’re asking for was given to you exactly. Builds also have flags, and you should know that if you’re complaining about advice given to you. I’m not saying that to admonish you, just giving you the info.

The next step down is that you’re using Portainer, and having user-error issues somehow. So another solution is renaming these actions with a very obvious prefix like “BUILD ACTION”, and also setting memory limits.
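
One way to guard against the user-error case is a small wrapper that refuses to launch without a cap. This is only a sketch: the helper names and the 2g default are invented here, and the check is naive (it scans the whole argument list, so a -m meant for the program inside the container would also count):

```python
def has_memory_limit(argv):
    """True if argv contains -m VALUE, --memory VALUE or --memory=VALUE."""
    return any(
        arg in ("-m", "--memory") or arg.startswith("--memory=")
        for arg in argv
    )

def require_memory_limit(argv, default="2g"):
    """Return argv unchanged if it caps memory, else add a default cap.

    Assumes argv starts with ["docker", "run", ...].
    """
    if has_memory_limit(argv):
        return list(argv)
    return [*argv[:2], "--memory", default, *argv[2:]]

print(require_memory_limit(["docker", "run", "--rm", "builder-image"]))
# prints ['docker', 'run', '--memory', '2g', '--rm', 'builder-image']
```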

The very last step is making sure your swap is in order. Allocate 2x your system memory to swap; this will help alleviate OOM issues up to a point, especially during builds.
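
To see how far a machine is from that 2x figure, something like the following works as a sketch. It parses text in the /proc/meminfo format; the sample numbers are invented, and on a real Linux box you would pass in the contents of /proc/meminfo instead:

```python
def parse_meminfo(text):
    """Return /proc/meminfo-style fields as a dict of integers (KiB)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields

def swap_shortfall_kib(meminfo_text, factor=2):
    """How much more swap (KiB) is needed to reach factor * MemTotal."""
    info = parse_meminfo(meminfo_text)
    target = info["MemTotal"] * factor
    return max(0, target - info.get("SwapTotal", 0))

# Invented sample, not taken from any real machine.
SAMPLE = """MemTotal:       16000000 kB
SwapTotal:       8000000 kB
"""
print(swap_shortfall_kib(SAMPLE))  # prints 24000000
```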

If you come back and say this is a band-aid solution, get a better machine and stop asking questions to solve the impossible in here. It’s your fault this is an issue to begin with, you don’t know how to run your machines (regardless of it just being a home server or whatever), and you’re just being rude.

Badabinski@kbin.earth · 6 months ago

The other person may have responded with a fair amount of hostility, but they’re absolutely correct. I run Kubernetes clusters hosting millions of containers across hundreds of thousands of VMs at my job, and OOMKills are just a fact of life. Apps will leak memory, and you’re powerless to fix it unless you’re willing to debug the app and fix the leak. It’s better for the container to run out of memory and trigger a cgroup-scoped OOM kill. A system-wide OOM kill will murder the things you love, shit in your hat, and lick your face like David Tennant licked Krysten Ritter.
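
If you want to check whether a container hit its own limit, docker inspect includes an OOMKilled flag under State in its JSON output. A minimal Python sketch that reads that JSON (the sample document and container names are invented for illustration):

```python
import json

def oom_killed(inspect_json):
    """Return names of containers whose State.OOMKilled flag is true."""
    return [
        c.get("Name", "?")
        for c in json.loads(inspect_json)
        if c.get("State", {}).get("OOMKilled")
    ]

# Invented sample shaped like `docker inspect` output (heavily trimmed).
SAMPLE = """[
  {"Name": "/builder", "State": {"OOMKilled": true, "ExitCode": 137}},
  {"Name": "/web", "State": {"OOMKilled": false, "ExitCode": 0}}
]"""
print(oom_killed(SAMPLE))  # prints ['/builder']
```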

RustyNova@lemmy.world · 5 months ago

Oh, it’s not a problem to let a container get killed. It’s perfectly fine. I just don’t want my whole server crippled because one container did a funny.

If it keeps docker and the Portainer VM running I’ll be 100% ok, because I can just restart the container. I don’t want to have remote access to my server outside of my home for security reasons, so this is just the bare minimum.

Findmysec@infosec.pub · 5 months ago

Those remote access fears can be solved with a wireguard VPN

null@slrpnk.net · 5 months ago

I don’t want to have remote access to my server outside of my home for security reasons, so this is just the bare minimum

What are your security concerns?

Treczoks@lemmy.world · 5 months ago

This is not a bandaid, this is the solution. What you’re trying is, at least for this scenario, the band aid.

KairuByte@lemmy.dbzer0.com · 5 months ago

??? Your original proposed solution is literally a bandaid fix.