Pattern Matching Exclude Duplicate Characters












Is there a regular expression that matches the characters of a character set, but each of them at most once? In other words, once a character has been found, it is removed from the set.



If grep cannot do this, is there a built-in utility which can?



Example:



Characters to match only once:   spine


Input:



spine
spines
spin
pine
seep
spins


Output:



spine
spin
pine


EDIT:

There are many ways to achieve this output (one example below), but I'm looking for a way to do this without having to customize the command for each pattern I want to match.



grep '[spine]' input_file | grep -v 's.*s' | ... | grep -v 'e.*e'
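The requirement can also be stated without a regular expression: keep a line when it contains at least one of the listed characters and none of them twice. The following is a minimal sketch of that reading using awk. It is not taken from the thread, and the function name `match_once_awk` is a made-up helper for illustration.

```shell
# match_once_awk CHARS: filter stdin, printing lines that contain at least one
# character of CHARS and no character of CHARS more than once.
# CHARS is compared literally, so regex metacharacters need no escaping.
# (Hypothetical helper; because CHARS is passed with -v, awk would interpret
# backslash escapes in it, so plain characters are assumed.)
match_once_awk() {
    awk -v chars="$1" '{
        split("", seen)              # reset the per-line "already used" set
        found = 0; dup = 0
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (index(chars, c)) {
                found = 1
                if (seen[c]++) { dup = 1; break }   # second use of c
            }
        }
        if (found && !dup) print
    }'
}

printf '%s\n' spine spines spin pine seep spins | match_once_awk spine
```

With the sample input from the question this prints spine, spin and pine, and the character set is the only thing that changes between runs.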







Tags: grep, regular-expression, patterns














edited Jul 21 '11 at 23:02
asked Jul 21 '11 at 21:41 by Steven













  • Question: What is the application for this?

    – mdpc
    Jul 22 '11 at 5:01





























3 Answers


















          With regular expressions in the mathematical sense, it's possible, but the size of the regular expressions grows exponentially relative to the size of the alphabet, so it isn't practical.



          There's a simple way with negation and backreferences.



          grep '[spine]' | grep -Ev '([spine]).*\1'


          The first grep selects lines that contain at least one of the characters einps; the second grep rejects lines in which any of them occurs more than once (e.g. allowing spinal and spend but not foobar or see).
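To avoid typing the set twice, the pipeline above can be wrapped in a function that takes the characters as a parameter. This is only a sketch: `match_once` is a hypothetical name, the set is assumed to consist of plain letters or digits (no `]`, `^`, `-` or backslash), and backreferences under `grep -E` are an extension (GNU grep supports it) rather than part of POSIX extended regular expressions.

```shell
# match_once CHARS: filter stdin, printing lines that contain at least one
# character of CHARS and no character of CHARS more than once.
# (Hypothetical helper; CHARS is assumed to be plain letters or digits.)
match_once() {
    chars=$1
    # First grep: at least one character of the set is present.
    # Second grep: drop lines where a captured set character appears again.
    grep "[$chars]" | grep -Ev "([$chars]).*\\1"
}

printf '%s\n' spine spines spin pine seep spins | match_once spine
```

Because the set is interpolated into a bracket expression, the same command works for any word without editing the pattern by hand.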

















          answered Jul 22 '11 at 10:38 by Gilles

























              Inspired by your expression, I can come up with a shorter one, using egrep:



              egrep -v '(s.*s|p.*p|i.*i|n.*n|e.*e)' FILE


              which is equivalent to



              sed '/s.*s/d;/p.*p/d;/i.*i/d;/n.*n/d;/e.*e/d' FILE


              And this is how to produce the sed-command from the input automatically:



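              #!/bin/bash
              # Usage: SCRIPT WORD FILE
              # Builds one sed delete command per character of WORD (/s.*s/d;/p.*p/d;...).
              # WORD is assumed to contain only plain letters, no regex metacharacters.
              word=$1
              file=$2
              expr=$(for c in $(echo "$word" | sed 's/./& /g'); do printf '/%s.*%s/d;' "$c" "$c"; done)
              sed "$expr" "$file"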
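The loop is not strictly needed: a single sed substitution can turn every character of the word into its own delete command. A minimal sketch, assuming the word contains only plain letters or digits and using a hypothetical function name `no_repeats_sed`:

```shell
# no_repeats_sed WORD: filter stdin, dropping lines in which any character
# of WORD occurs twice. (Hypothetical helper name.)
no_repeats_sed() {
    # "spine" becomes "/s.*s/d;/p.*p/d;/i.*i/d;/n.*n/d;/e.*e/d;"
    script=$(printf '%s\n' "$1" | sed 's/./\/&.*&\/d;/g')
    sed "$script"
}

printf '%s\n' spine spines spin pine seep spins | no_repeats_sed spine
```

Reading from standard input instead of a file argument keeps the function usable at the end of a pipeline.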


              I tried a similar approach with grep, but couldn't get the shell to take the grep pattern from a variable. When I echoed the command and pasted the result back in, it worked:



              expr="'("$(for c in $(echo $word | sed 's/./& /g'); do echo -n $c".*"$c"|"; done)

              egrep -v ${expr/%|/)'} FILE
              # doesn't work, filters nothing, whole file is printed
              # check:
              echo egrep -v $(echo $exp) FILE
              egrep -v '(s.*s|p.*p|i.*i|n.*n|e.*e)' FILE
              # manually:
              egrep -v '(s.*s|p.*p|i.*i|n.*n|e.*e)' FILE
              spine
              spin
              pine


              Maybe I made a mistake with the variable expansion.
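A likely explanation (an assumption, since the exact shell session is not shown) is that the single quotes stored inside the variable reach egrep as literal characters, so the pattern only matches lines that contain quote characters and `-v` then prints everything. The sketch below keeps the quotes out of the variable and quotes the expansion instead. The function name `no_repeats_grep` is hypothetical, and the word is assumed to be non-empty and made of plain letters or digits.

```shell
# no_repeats_grep WORD: filter stdin, dropping lines in which any character
# of WORD occurs twice. (Hypothetical helper; WORD is assumed to be non-empty
# and to contain only plain letters or digits.)
no_repeats_grep() {
    word=$1
    pattern=
    while [ -n "$word" ]; do
        rest=${word#?}                 # word without its first character
        c=${word%"$rest"}              # the first character itself
        pattern="${pattern:+$pattern|}$c.*$c"   # s.*s|p.*p|...
        word=$rest
    done
    # The variable holds only the pattern; the quoting happens here.
    grep -Ev "($pattern)"
}

printf '%s\n' spine spines spin pine seep spins | no_repeats_grep spine
```

Quotes typed on the command line are removed by the shell before grep runs, while quotes that come out of a variable expansion are passed through unchanged, which is why pasting the echoed command behaved differently.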















              edited Jul 22 '11 at 2:41
              answered Jul 21 '11 at 22:31 by user unknown













              • See my edited post for desired output. Also, I'm looking for a solution which doesn't require a complex, tedious, pattern-specific command.

                – Steven
                Jul 21 '11 at 23:06











              • Yes, I see. Maybe I can find a way to produce the sed command from the word 'spine'.

                – user unknown
                Jul 22 '11 at 1:55











              • Finally found out how to solve it with sed - is that acceptable?

                – user unknown
                Jul 22 '11 at 2:53






























                  Y
                  Hgcvvbbjnnjnmll jjbkmn gggghh















                  answered 9 mins ago by user343044



