NVIDIA DevBox with Ubuntu 16.04 and 4.4.0-137-generic kernel randomly reboots and automatically shuts down...
I've recently started using an NVIDIA DevBox with an ASUS BIOS, running the kernel and Ubuntu versions mentioned above. For some reason the machine can't be left on overnight the way other laptops and desktops can. Normally you can leave a machine on, it locks itself after a couple of minutes and/or goes to sleep, and the next day, once you move the mouse or type on the keyboard, it wakes up with all your programs running just as you left them.
That hasn't been happening with this machine. A previous user, who hasn't touched the machine in about a year, may have changed some power-saving configuration, but everything looks fine when I check the power options (I have suspend set to 1 hour and lock set to 1 hour). The odd thing I've noticed is that if I come back after lunch and the machine is locked/suspended, it resumes the session without any problems, but if I leave it overnight, I arrive the next day to find that the machine has turned itself off. The building is locked, so nobody could have physically pressed the power button overnight. I've also checked the other user's shell history (we both have admin privileges, and he doesn't use the computer) for remotely issued shutdowns, and nothing shows up there either.
I've read in a couple of places that it could be a heating issue due to a poor or broken power supply, but how can I check whether that is the case? I have the psensor app, but it only seems to show temperatures in real time, without saving them to a file where I could look up what the temperature of any of the graphics cards (there are 4) or the motherboard was.
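For the temperature-logging part, something along these lines might be a starting point. It is only a sketch under assumptions: the function names, the log path and the one-minute interval are illustrative, it assumes the kernel exposes sensors under `/sys/class/thermal` (which may not cover every motherboard sensor), and it assumes `nvidia-smi` is on the PATH for the GPU readings.

```shell
#!/bin/sh
# Sketch: append timestamped temperature readings to a log file.
# Function names and arguments are illustrative, not from any existing tool.

# log_zone_temps BASE_DIR LOG_FILE
# Appends one line per kernel thermal zone found under BASE_DIR
# (normally /sys/class/thermal): "<timestamp> <zone> <degrees C>".
log_zone_temps() {
    base_dir=$1
    log_file=$2
    for zone in "$base_dir"/thermal_zone*; do
        # Skip when the glob matched nothing or the temp file is unreadable.
        [ -r "$zone/temp" ] || continue
        millideg=$(cat "$zone/temp")
        # The kernel reports millidegrees Celsius; integer division is enough here.
        printf '%s %s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" \
            "$(basename "$zone")" "$((millideg / 1000))" >> "$log_file"
    done
}

# log_gpu_temps LOG_FILE
# Appends one CSV line per GPU (timestamp, index, temperature) if nvidia-smi exists.
log_gpu_temps() {
    command -v nvidia-smi >/dev/null 2>&1 || return 0
    nvidia-smi --query-gpu=timestamp,index,temperature.gpu \
        --format=csv,noheader >> "$1"
}
```

Calling both functions once a minute, for example from a loop such as `while true; do log_zone_temps /sys/class/thermal "$HOME/temps.log"; log_gpu_temps "$HOME/temps.log"; sleep 60; done`, would leave a file whose last entries show the temperatures shortly before the machine went down.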
What is another way to diagnose the automatic shutdown of the machine?
How can I tell whether it's a heating issue, a faulty power supply, or potentially a kernel issue? The machine has no really demanding programs installed for now (it's almost new) except for the NVIDIA drivers, which I'm quite experienced with installing, so maybe I should consider a fresh Ubuntu install? That would be pretty much pointless if there is a hardware issue, though.
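One check that might help separate a software-initiated shutdown from a sudden power-off is to look at the boot and shutdown records that `last -x` reads from wtmp. The sketch below is an assumption-laden helper (the function name is made up for illustration): it counts boots that were not preceded by a recorded clean shutdown. A non-zero count would suggest the machine lost power or crashed rather than being shut down by the OS, though it cannot say whether heat, the power supply or the kernel was responsible.

```shell
#!/bin/sh
# Sketch: count boots that followed an unclean stop.
# Reads `last -x reboot shutdown` style output on stdin (newest entry first).
# Two consecutive "reboot" records with no "shutdown" record between them
# mean the older boot ended without a clean shutdown being logged.
count_unclean_boots() {
    awk '
        $1 == "reboot"   { if (prev == "reboot") n++; prev = "reboot" }
        $1 == "shutdown" { prev = "shutdown" }
        END              { print n + 0 }
    '
}
```

It could be used as `last -x reboot shutdown | count_unclean_boots`. If the overnight power-offs do show up as `shutdown` records, that would point towards something in software (a power-management setting or a scheduled job) rather than the hardware.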
Other details:
The NVIDIA drivers are correctly installed.
The driver broke and the machine responded quite badly after I forced the following command. The machine stayed on for 2 consecutive days (which should be a breeze for these machines), and then, after 2 consecutive random reboots in the middle of the night, it had a hard time staying on for more than 5 minutes:
$ unset autologoff
I later had to reinstall the drivers correctly (and set the autolog option back on), and the system went back to its current state, where it "needs" to shut itself off if it's idle for more than 24 hours (idle meaning it receives no human input, although background processes may still be running).
- Motherboard: ASUS EATX DDR4 LGA 2011-3 Motherboards X99-E WS/USB 3.1
- CPU: Intel Xeon E5-2690 v4 2.6 GHz 14-Core LGA 2011 Processor 135 W
- Cooler: Corsair Hydro Series H80i v2 Extreme Performance Liquid CPU Cooler, Black
- Power Supply: EVGA SuperNOVA 1600 P2 80+ PLATINUM, 1600W ECO Mode Fully Modular NVIDIA SLI and Crossfire Ready 10 Year Warranty Power Supply 220-P2-1600-X1
- Graphics Card: 4 Titan X Pascal
I added pci=noaer to the boot parameters after finding out that the machine was giving me this error: https://askubuntu.com/questions/771899/pcie-bus-error-severity-corrected
Output of :
$ cat /proc/cmdline
is
BOOT_IMAGE=/boot/vmlinuz-4.4.0-137-generic.efi.signed root=UUID=569dd2ad-c5a6-4ae4-a167-f849b8f6ae9e ro quiet splash pci=noaer vt.handoff=7
power-management reboot pci
asked 2 hours ago
Arturo