Why is `sed` no op much faster than `awk` in this case












2















I am trying to understand some performance issues related to sed and awk, and I did the following experiment



$ seq 100000 > test
$ yes 'NR==100001{print}' | head -n 5000 > test.awk
$ yes '100001{p;b}' | head -n 5000 > test.sed
$ time sed -nf test.sed test
real 0m3.436s
user 0m3.428s
sys 0m0.004s
$ time awk -F@ -f test.awk test
real 0m11.615s
user 0m11.582s
sys 0m0.007s
$ sed --version
sed (GNU sed) 4.5
$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)


Here, since the test file only contains 100000 lines, all the commands in test.sed and test.awk are no-ops. Both programs only need to match the line number with the address (in sed) or NR(in awk) to decide that the command does not need to be executed, but there is still a huge difference in the time cost. Why is it the case? Are there anyone with different versions of sed and awk installed that gives a different result on this test?



Edit:
The results for mawk (as suggested by @mosvy) and perl are given below,



$ time mawk -F@ -f test.awk test
real 0m5.934s
user 0m5.919s
sys 0m0.004s
$ yes 'print if $.==100001;' | head -n 5000 > test.pl
$ time perl -n test.pl test
real 0m33.245s
user 0m33.110s
sys 0m0.019s
$ mawk -W version
mawk 1.3.4 20171017
$ perl --version
This is perl 5, version 28, subversion 1 (v5.28.1) built for x86_64-linux-thread-multi









share|improve this question




















  • 2





    I also suggest you try it with mawk ;-)

    – mosvy
    1 hour ago
















2















I am trying to understand some performance issues related to sed and awk, and I did the following experiment



$ seq 100000 > test
$ yes 'NR==100001{print}' | head -n 5000 > test.awk
$ yes '100001{p;b}' | head -n 5000 > test.sed
$ time sed -nf test.sed test
real 0m3.436s
user 0m3.428s
sys 0m0.004s
$ time awk -F@ -f test.awk test
real 0m11.615s
user 0m11.582s
sys 0m0.007s
$ sed --version
sed (GNU sed) 4.5
$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)


Here, since the test file only contains 100000 lines, all the commands in test.sed and test.awk are no-ops. Both programs only need to match the line number with the address (in sed) or NR(in awk) to decide that the command does not need to be executed, but there is still a huge difference in the time cost. Why is it the case? Are there anyone with different versions of sed and awk installed that gives a different result on this test?



Edit:
The results for mawk (as suggested by @mosvy) and perl are given below,



$ time mawk -F@ -f test.awk test
real 0m5.934s
user 0m5.919s
sys 0m0.004s
$ yes 'print if $.==100001;' | head -n 5000 > test.pl
$ time perl -n test.pl test
real 0m33.245s
user 0m33.110s
sys 0m0.019s
$ mawk -W version
mawk 1.3.4 20171017
$ perl --version
This is perl 5, version 28, subversion 1 (v5.28.1) built for x86_64-linux-thread-multi









share|improve this question




















  • 2





    I also suggest you try it with mawk ;-)

    – mosvy
    1 hour ago














2












2








2


1






I am trying to understand some performance issues related to sed and awk, and I did the following experiment



$ seq 100000 > test
$ yes 'NR==100001{print}' | head -n 5000 > test.awk
$ yes '100001{p;b}' | head -n 5000 > test.sed
$ time sed -nf test.sed test
real 0m3.436s
user 0m3.428s
sys 0m0.004s
$ time awk -F@ -f test.awk test
real 0m11.615s
user 0m11.582s
sys 0m0.007s
$ sed --version
sed (GNU sed) 4.5
$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)


Here, since the test file only contains 100000 lines, all the commands in test.sed and test.awk are no-ops. Both programs only need to match the line number with the address (in sed) or NR(in awk) to decide that the command does not need to be executed, but there is still a huge difference in the time cost. Why is it the case? Are there anyone with different versions of sed and awk installed that gives a different result on this test?



Edit:
The results for mawk (as suggested by @mosvy) and perl are given below,



$ time mawk -F@ -f test.awk test
real 0m5.934s
user 0m5.919s
sys 0m0.004s
$ yes 'print if $.==100001;' | head -n 5000 > test.pl
$ time perl -n test.pl test
real 0m33.245s
user 0m33.110s
sys 0m0.019s
$ mawk -W version
mawk 1.3.4 20171017
$ perl --version
This is perl 5, version 28, subversion 1 (v5.28.1) built for x86_64-linux-thread-multi









share|improve this question
















I am trying to understand some performance issues related to sed and awk, and I did the following experiment



$ seq 100000 > test
$ yes 'NR==100001{print}' | head -n 5000 > test.awk
$ yes '100001{p;b}' | head -n 5000 > test.sed
$ time sed -nf test.sed test
real 0m3.436s
user 0m3.428s
sys 0m0.004s
$ time awk -F@ -f test.awk test
real 0m11.615s
user 0m11.582s
sys 0m0.007s
$ sed --version
sed (GNU sed) 4.5
$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)


Here, since the test file only contains 100000 lines, all the commands in test.sed and test.awk are no-ops. Both programs only need to match the line number with the address (in sed) or NR(in awk) to decide that the command does not need to be executed, but there is still a huge difference in the time cost. Why is it the case? Are there anyone with different versions of sed and awk installed that gives a different result on this test?



Edit:
The results for mawk (as suggested by @mosvy) and perl are given below,



$ time mawk -F@ -f test.awk test
real 0m5.934s
user 0m5.919s
sys 0m0.004s
$ yes 'print if $.==100001;' | head -n 5000 > test.pl
$ time perl -n test.pl test
real 0m33.245s
user 0m33.110s
sys 0m0.019s
$ mawk -W version
mawk 1.3.4 20171017
$ perl --version
This is perl 5, version 28, subversion 1 (v5.28.1) built for x86_64-linux-thread-multi






awk sed perl performance






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 59 mins ago







Weijun Zhou

















asked 1 hour ago









Weijun ZhouWeijun Zhou

1,553325




1,553325








  • 2





    I also suggest you try it with mawk ;-)

    – mosvy
    1 hour ago














  • 2





    I also suggest you try it with mawk ;-)

    – mosvy
    1 hour ago








2




2





I also suggest you try it with mawk ;-)

– mosvy
1 hour ago





I also suggest you try it with mawk ;-)

– mosvy
1 hour ago










0






active

oldest

votes











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "106"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f506892%2fwhy-is-sed-no-op-much-faster-than-awk-in-this-case%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























0






active

oldest

votes








0






active

oldest

votes









active

oldest

votes






active

oldest

votes
















draft saved

draft discarded




















































Thanks for contributing an answer to Unix & Linux Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f506892%2fwhy-is-sed-no-op-much-faster-than-awk-in-this-case%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

濃尾地震

How to rewrite equation of hyperbola in standard form

No ethernet ip address in my vocore2