Performance bottlenecks of Linux traffic shaping (tc)
I have an interesting problem.
We have a Linux server with Debian Stretch installed. The server has a single E5-2680 v2 CPU and Intel 82599ES 10G dual-port NICs. There are no other services (like HTTP and so on) running on it.
All seems good until I enable traffic shaping. The shaping rules consist of an htb qdisc on the bond0 and bond1 interfaces.
There is a three-step hashing filter:
1. Four-rule filter to select a large subnet. The rules look like this:
protocol ip u32 ht 800:: match ip src 10.15.0.0/18 hashkey mask 0x0000ff00 at 12 link 1:
2. 60 or so rule tables to select a /24 out of that large subnet:
protocol ip u32 ht 1:10 match ip src 10.15.16.0/24 hashkey mask 0x000000ff at 12 link 100:
3. 256 rule tables to match the IP:
parent 1:0 prio 100 protocol ip u32 ht 100:20: match ip src 10.15.16.32 flowid 1:20
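For reference, the three fragments above could be expanded into full commands roughly as follows. This is a minimal dry-run sketch that only prints the commands and follows a single path (10.15.0.0/18, then 10.15.16.0/24, then 10.15.16.32). The bare u32 root filter, the two "divisor 256" hash-table lines, and the "tc filter add dev bond0 parent 1:0 prio 100" prefix on the first two rules are assumptions, since the post shows only fragments.

```shell
#!/bin/sh
# Dry run: print the tc filter commands instead of executing them.
# ASSUMPTIONS (not shown in the post): the bare u32 root filter, the two
# hash-table creation lines, and the "parent 1:0 prio 100" prefix on the
# first two rules. The match/hashkey/link parts are copied from the post.
DEV=bond0
PREFIX="tc filter add dev $DEV parent 1:0 prio 100"

emit_filter_rules() {
    cat <<EOF
$PREFIX protocol ip u32
$PREFIX handle 1: protocol ip u32 divisor 256
$PREFIX handle 100: protocol ip u32 divisor 256
$PREFIX protocol ip u32 ht 800:: match ip src 10.15.0.0/18 hashkey mask 0x0000ff00 at 12 link 1:
$PREFIX protocol ip u32 ht 1:10 match ip src 10.15.16.0/24 hashkey mask 0x000000ff at 12 link 100:
$PREFIX protocol ip u32 ht 100:20: match ip src 10.15.16.32 flowid 1:20
EOF
}

emit_filter_rules
```

The bucket numbers are hexadecimal: bucket 10 in table 1: corresponds to third octet 16, and bucket 20 in table 100: corresponds to last octet 32.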
class add dev bond0 parent 1:0 classid 1:1 htb rate 24Gbit ceil 24Gbit burst 4096k cburst 256k
class add dev bond0 parent 1:1 classid 1:20 htb rate 102400Kbit burst 4096k cburst 256k
qdisc add dev bond0 parent 1:20 pfifo_fast
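As a back-of-the-envelope check on the class parameters above, the sketch below estimates how many milliseconds of traffic at the configured rate fit into a burst allowance. It assumes the usual tc unit conventions (sizes: 1k = 1024 bytes; rates: 1Kbit = 1000 bit/s); the function name burst_ms is made up for illustration.

```shell
#!/bin/sh
# Milliseconds needed to send a full burst at the class rate.
# ASSUMPTION: tc units, i.e. "4096k" = 4096 * 1024 bytes and
# "102400Kbit" = 102400 * 1000 bit/s (which is 102400 bits per ms).
burst_ms() {
    burst_kbytes=$1   # e.g. 4096 for "burst 4096k"
    rate_kbit=$2      # e.g. 102400 for "rate 102400Kbit"
    # bits in the burst divided by bits per millisecond (integer division)
    echo $(( burst_kbytes * 1024 * 8 / rate_kbit ))
}

burst_ms 4096 102400   # burst 4096k at 102400Kbit, prints 327
burst_ms 256 102400    # cburst 256k at 102400Kbit, prints 20
```

So for class 1:20, the 4096k burst corresponds to roughly a third of a second of traffic at its configured rate.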
Now, when I enable these rules, I get at most around 10Gbps total traffic and then ping starts to go up to 100ms or more. However, during that time, the CPU load is almost none: uptime shows a load of 0.07 and each CPU core is only about 5% utilized. ping instantly becomes OK once I clear the shaping rules.
Previously I used sfq instead of pfifo_fast, and that did result in a large CPU load. I then switched to red, which produced slightly better results with no CPU load, then I switched to pfifo_fast, which produces almost the same results as red.
- What other bottleneck could be there?
- Has anyone tried using the Linux traffic shaper on a network that's over 10Gbps?
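One possible way to narrow down where packets are being queued or dropped is to look at the per-class statistics. The helper below is only a sketch: it filters the output of "tc -s class show dev bond0" and lists classes with non-zero drop or overlimit counters. The function name drops_by_class is hypothetical, and the parsing assumes the common "Sent N bytes N pkt (dropped N, overlimits N requeues N)" statistics line, which may differ between iproute2 versions.

```shell
#!/bin/sh
# List classes whose "dropped" or "overlimits" counters are non-zero.
# ASSUMPTION: each class block starts with "class <kind> <classid> ..." and
# contains a line "Sent N bytes N pkt (dropped N, overlimits N requeues N)".
drops_by_class() {
    awk '
        $1 == "class" { cls = $3 }            # remember the current classid
        $1 == "Sent" {
            d = $7; sub(/,/, "", d)           # "7," -> "7"
            o = $9
            if (d + 0 > 0 || o + 0 > 0)
                print cls, "dropped=" d, "overlimits=" o
        }
    '
}

# Usage on the shaper itself:
#   tc -s class show dev bond0 | drops_by_class
```

Classes that show up here while the latency is high would be the first place to look; an empty result would suggest the queueing happens elsewhere.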
linux networking performance tc
Please add the brand and model of your NICs.
– Rui F Ribeiro
Dec 21 '17 at 23:37
@RuiFRibeiro added
– Pentium100
Dec 22 '17 at 4:32
FWIW, I had similar performance problems using tc to emulate a WAN. I don't remember the exact details as it was a few years ago, but I was only using 100baseT Ethernet, as that was about the bandwidth of the WAN connection I was emulating, so I was hoping I could just use tc to add latency. I still had performance problems similar to what you're describing when I started increasing the simulated latency, including severe bandwidth limitations well beneath what the underlying hardware supported without tc shaping.
– Andrew Henle
Feb 8 '18 at 12:47
edited Feb 7 '18 at 11:31
Yaron
asked Dec 21 '17 at 21:11
Pentium100
1 Answer
Did you find a way to get around the problem?
answered 30 mins ago
mctailer