What happens to a TCP connection when the network is abruptly terminated
Say a user-space application has a TCP connection to some non-local endpoint. At some point, the network is abruptly broken (e.g. the connection is deleted in the network manager, the Wi-Fi dongle is unplugged, or the Ethernet cable is cut).
What conceptually happens inside the kernel to cope with the situation, and how does it manifest itself to the userspace application?
Guideline sub-questions:
- what are the timeouts involved?
- will the kernel try to hide from userspace that the connection is lost while it attempts to reconnect?
- can waiting for a response prevent the userspace app from quitting gracefully?
linux networking linux-kernel tcp
There has been a close vote for the question being off-topic, without a comment saying in what particular way. As I see it, the question addresses two on-topic points: "Using or administering a *nix desktop or server" and "UNIX C API and System Interfaces". I would therefore ask for a comment on the close vote, please.
– TheMeaningfulEngineer
Dec 5 '18 at 13:38
I personally voted to leave this open; one vote currently says "too broad" while the other thinks it's a request for learning materials. Since I didn't cast those votes, I can't speak to them.
– Jeff Schaller
Dec 5 '18 at 15:50
The network interface or other infrastructure going down does not mean the “connection is lost”. TCP will keep trying to retransmit for a long time before killing the connection (9 minutes?). It’s not up to the kernel; it is the TCP protocol, and the “userspace app” could very well wait until the connection is dropped before exiting.
– Murray Jensen
Dec 8 '18 at 5:23
@MurrayJensen Your comment is pretty close to an answer. If you can phrase it to address my questions, I will accept it as an answer.
– TheMeaningfulEngineer
yesterday
asked Dec 5 '18 at 10:10
TheMeaningfulEngineer
1 Answer
The network interface or other infrastructure going down does not necessarily mean the “connection is lost”: TCP may keep retransmitting for a long time before killing the connection. It depends on what happened: an error on the local interface will probably cause an immediate error, but a router going down somewhere along the path may not.
The kernel does not decide this on its own; it follows the TCP protocol, and the “userspace app” could very well wait a long time before receiving an error on the socket.
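To put a rough number on "a long time": on Linux the retransmission limit for an established connection is the `net.ipv4.tcp_retries2` sysctl (default 15), and the kernel documentation quotes a hypothetical timeout of 924.6 seconds for that default. The sketch below reproduces that figure, assuming exponential backoff that starts at a minimum RTO of 200 ms and is capped at 120 s. The function and parameter names are mine, and a real connection backs off from its measured RTO, so the effective timeout is usually somewhat longer.

```python
def hypothetical_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Sum the backoff timers for `retries` retransmissions plus the final wait.

    Assumes the timer doubles after every expiry and is capped at rto_max.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):  # N retransmits, then give up at the (N+1)th RTO
        total += min(rto, rto_max)
        rto *= 2
    return total


print(round(hypothetical_timeout(), 1))  # 924.6
```

Lowering `retries` in this model shows how quickly the bound shrinks: most of the 15 minutes comes from the last few timers, which are all pinned at the 120 s cap.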
To answer each sub-question specifically:
- I've seen suggestions of up to 9 minutes before timeout. On Linux the limit for an established connection is governed by the net.ipv4.tcp_retries2 sysctl, whose default of 15 corresponds, according to the kernel documentation, to a hypothetical timeout of 924.6 seconds (roughly 15 minutes). Such timeouts are configurable where the protocol allows it, and things like TCP keepalives can be configured to cause timeouts earlier;
- the kernel doesn't hide things or try to "reconnect"; it simply follows the TCP protocol, continually retrying the sending of unacknowledged segments. Meanwhile, a "userspace app" that is blocked in a system call (e.g. read(), or write()/sendto() once the socket's send buffer is full) is simply suspended: it is in kernel mode, its context is switched out, and it won't be switched back in until some event makes the process "runnable" again;
- while suspended, a process may be "uninterruptible", which means you can't kill it, even with SIGKILL (i.e. kill -9) as root, so a "graceful exit" may not be an option. I don't think this can happen with a send on a socket, though; it has to be something that is considered short-term and high-priority (e.g. a write to a file on NFS with a hard mount and the intr flag not set can do it). Even where a graceful exit is an option, the "app" must be written to catch errors and exit gracefully itself: if the kernel makes the "app" exit, it won't be graceful :-) (e.g. it won't run exit handlers or free resources allocated outside the "app").
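As a rough illustration of configuring earlier timeouts per socket, here is a minimal sketch that enables TCP keepalives and, where available, the Linux `TCP_USER_TIMEOUT` option. The helper name and the specific values are assumptions chosen for the example, not recommendations; the `TCP_*` options are Linux-specific, so the code skips them when Python does not expose them.

```python
import socket


def make_bounded_socket(idle=60, interval=10, probes=5, user_timeout_ms=30_000):
    """Create a TCP socket that gives up on a dead path sooner than the defaults."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Keepalive probes let the kernel notice a dead path even when the
    # application is only waiting to read and has nothing to retransmit.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe, seconds between
        # probes, and number of unanswered probes before the error.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    if hasattr(socket, "TCP_USER_TIMEOUT"):
        # Maximum time in milliseconds that transmitted data may remain
        # unacknowledged before the kernel drops the connection.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, user_timeout_ms)
    return sock
```

Once the kernel gives up, a blocked `recv()` or `send()` on such a socket fails with `ETIMEDOUT`, which Python raises as `TimeoutError` (a subclass of `OSError`). Catching that in the application is what makes a graceful exit possible.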
edited 5 mins ago
answered 10 mins ago
Murray Jensen