Remove all duplicate words from a string using a shell script












9















I have a string like



"aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"


I want to remove the duplicate words from the string, so that the output looks like



"aaa,bbb,ccc"


I tried this code (source):



$ echo "zebra ant spider spider ant zebra ant" | xargs -n1 | sort -u | xargs


It works fine with that literal value, but when I pass the value of my own variable, the output still contains all the duplicate words.



How can I remove the duplicate values?



UPDATE



What I am trying to do is collect all the values that belong to the same user into a single string. I have data like this:



user name | colour
AAA | red
AAA | black
BBB | red
BBB | blue
AAA | blue
AAA | red
CCC | red
CCC | red
AAA | green
AAA | red
AAA | black
BBB | red
BBB | blue
AAA | blue
AAA | red
CCC | red
CCC | red
AAA | green


In my code I fetch all distinct users and then concatenate the colour strings for each one, which works. For that I am using this code:



# inside the loop that reads the records:
if [ -z "$c" ]; then   # $c is defined globally
    c="$colour1"
else
    c="$c,$colour1"
fi


When I print the $c variable I get this output (for user AAA):



"red,black,blue,red,green,red,black,blue,red,green,"


I want to remove the duplicate colours, so the desired output is



"red,black,blue,green"


To get this output I used the code above:



 echo "zebra ant spider spider ant zebra ant" | xargs -n1 | sort -u | xargs


but it still prints the output with the duplicate values, like



"red,black,blue,red,green,red,black,blue,red,green,"
Thanks






























  • 3





    Please clarify what is wrong with what you are using. I don't understand what you mean by "when I give my variable value". What value do you give? Where does it fail?

    – terdon
    Mar 23 '17 at 12:57











  • echo 'aaa aaa aaa bbb bbb ccc bbb ccc' | xargs -n1 | sort -u | xargs gives aaa bbb ccc, so you need to show the exact code you tried and the output you got, with the string in a variable: s='aaa aaa aaa bbb bbb ccc bbb ccc'; echo "$s" | xargs -n1 | sort -u | xargs

    – Sundeep
    Mar 23 '17 at 13:01











  • The string value is generated dynamically. The command prints the same value back, duplicates included.

    – Urvashi
    Mar 23 '17 at 13:02






  • 1





    yeah, show the code that failed, otherwise how would we know what could've gone wrong?

    – Sundeep
    Mar 23 '17 at 13:02











  • Does the order matter?

    – Jacob Vlijm
    Mar 23 '17 at 14:06























shell-script shell text-processing xargs duplicate














Question asked Mar 23 '17 at 12:41 by Urvashi; edited May 23 '17 at 12:39 by Community


















10 Answers


















10














One more awk, just for fun:



$ a="aaa bbb aaa bbb ccc aaa ddd bbb ccc"
$ echo "$a" | awk '{for (i=1;i<=NF;i++) if (!a[$i]++) printf("%s%s",$i,FS)}{printf("\n")}'
aaa bbb ccc ddd


By the way, even your solution works fine with variables:



$ b="zebra ant spider spider ant zebra ant" 
$ echo "$b" | xargs -n1 | sort -u | xargs
ant spider zebra
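
One plausible reason the pipeline appeared to fail for the asker (an assumption, since the failing command was never posted) is that xargs splits its input on whitespace, so a comma-separated value with no spaces is a single "word" and comes back unchanged. The sketch below shows this and adapts the awk idea to comma-separated input. The helper name dedupe_csv is hypothetical and not part of the thread.

```shell
# Hypothetical helper (not from the thread): de-duplicate a comma-separated
# string, keeping the first occurrence of each item and skipping empty items.
dedupe_csv() {
    printf '%s\n' "$1" | awk -F, '{
        split("", seen)          # reset the lookup table for every line
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i != "" && !seen[$i]++)
                out = (out == "" ? $i : out "," $i)
        print out
    }'
}

c="red,black,blue,red,green,red,black,blue,red,green,"
echo "$c" | xargs -n1 | sort -u | xargs   # no spaces, so one word: printed unchanged
dedupe_csv "$c"                            # red,black,blue,green
```

Setting the field separator with -F, is what makes awk see each colour as a separate field; the original order of first appearance is kept.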































  • This works for me. Thanks, @George Vasiliou.

    – Urvashi
    Mar 24 '17 at 5:59











  • Neat approach. The only adjustment I had to make was to use %s instead of %s%s, because I was running a for loop over the results and the two whitespace characters caused problems with regex matches.

    – JeremyCanfield
    20 hours ago



















8














$ echo "zebra ant spider spider ant zebra ant" | awk -v RS="[ \n]+" '!n[$0]++'
zebra
ant
spider
























  • 1





    Very clever!!!!

    – George Vasiliou
    Mar 24 '17 at 0:54











  • @GeorgeVasiliou, thank you [or to tell the truth, very lazy :-) ]

    – JJoao
    Mar 24 '17 at 8:44



















7














With tr, sort and uniq



echo "zebra ant spider spider ant zebra ant" | tr ' ' '\n' | sort | uniq


or



echo "zebra ant spider spider ant zebra ant" | tr ' ' '\n' | sort | uniq | xargs


to get one line
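
For the comma-separated string in the question, the same idea might look like the sketch below: split on commas instead of spaces, drop empty items, and join the lines again with paste. The function name dedupe_csv_sorted is made up for illustration. As with any sort-based approach, the result is in sorted order, not in the original order.

```shell
# Hypothetical helper (not from the thread): sort-based de-duplication of a
# comma-separated string. The original order of the items is NOT preserved.
dedupe_csv_sorted() {
    printf '%s\n' "$1" |
        tr ',' '\n' |      # one item per line
        sed '/^$/d' |      # drop empty items, e.g. from a trailing comma
        sort -u |          # sort and remove duplicates
        paste -sd, -       # join the lines back together with commas
}

dedupe_csv_sorted "red,black,blue,red,green,red,black,blue,red,green,"
```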
































  • You need to add | xargs to join the output to one line again

    – Philippos
    Mar 23 '17 at 12:59






  • 3





    Or use sort -u. Or even awk '!u[$0]++'.

    – Benoît
    Mar 23 '17 at 18:42






  • 1





    @Benoît Wow, I did not know about sort -u. I've been using sort | uniq all this time. The wasted keystrokes...

    – gardenhead
    Mar 24 '17 at 1:25



















2














With GNU sed:



sed ':s;s/\(\<\S*\>\)\(.*\)\<\1\>/\1\2/g;ts'


You may add ;s/  */ /g to remove duplicate spaces.



It works like this: if a word occurs a second time in the line, remove that later occurrence and start over, until no duplicates are left.
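
As a rough illustration of that loop, the sketch below wraps the command in a function and adds the space clean-up. It assumes GNU sed, because \<, \> and \S are GNU extensions. The function name dedupe_words_sed and the final trailing-space removal are additions for this example, not part of the answer.

```shell
# Hypothetical wrapper (GNU sed assumed): remove later duplicates of each
# whitespace-separated word, keeping the first occurrence.
dedupe_words_sed() {
    printf '%s\n' "$1" | sed \
        -e ':s' \
        -e 's/\(\<\S*\>\)\(.*\)\<\1\>/\1\2/g' \
        -e 'ts' \
        -e 's/  */ /g' \
        -e 's/ $//'
}

dedupe_words_sed "zebra ant spider spider ant zebra ant"
```

Each pass deletes one later copy of a word and leaves its surrounding spaces behind, which is why the last two expressions squeeze runs of spaces and trim the end of the line.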






























  • What are \< and \>?

    – someonewithpc
    Mar 23 '17 at 20:19











  • @someonewithpc They match no character, but the beginning and end of a word to prevent substrings from being matched.

    – Philippos
    Mar 23 '17 at 21:29











  • Nice, but is that portable? Also, aren't words separated by whitespace? Seems redundant to match not whitespace followed by the end of a word.

    – someonewithpc
    Mar 23 '17 at 21:34






  • 1





    @someonewithpc No, it's not standard, which is why I wrote GNU sed. The nice part is that you don't have to handle the first and last string separately.

    – Philippos
    Mar 23 '17 at 21:44



















2














perl -lane '$,=$";print grep { ! $h{$_}++ } @F'




































    2














    Obligatory awk solution:



    $ echo "ant zebra ant spider spider ant zebra ant" | 
    awk -vRS=" " -vORS=" " '!a[$1] {a[$1]++} END{ for (x in a) print x; } ' ; echo
    zebra ant spider


    (The final echo is there for the newline)
































    • Plus one for the awk! I was also building an awk solution just for fun. Note that the words may be printed in an arbitrary order in the END section, because the order in which awk iterates over array keys is unspecified.

      – George Vasiliou
      Mar 23 '17 at 14:14











    • Yes, they will be printed in an essentially random order. The sort solution doesn't keep the original order either, though.

      – ilkkachu
      Mar 23 '17 at 14:17











    • Yes, good point! Even sort prints in different order than input.

      – George Vasiliou
      Mar 23 '17 at 14:18






    • 1





      @ilkkachu Actually, we don't need to wait for the input to end. We can decide whether to print each word as it is read, with a slight modification to your code: awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo. This preserves the order.

      – user218374
      Mar 23 '17 at 14:31





















    1














    Python



    Option 1





    #!/usr/bin/env python
    # get_unique_words.py

    import sys

    l = []  # unique words, in order of first appearance
    for w in sys.argv[1].split(','):
        if w not in l:
            l += [ w ]
    print(','.join(l))


    Make executable, then call from Bash:



    $ ./get_unique_words.py "aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"
    aaa,bbb,ccc


    Or you could implement it as a Bash function, but the syntax is messy.



    get_unique_words(){
    python -c "
    l = []
    for w in '$1'.split(','):
        if w not in l:
            l += [ w ]
    print(','.join(l))"
    }


    Option 2



    This option can become a one-liner if needed:





    #!/usr/bin/env python
    # get_unique_words.py

    import sys

    s_in = sys.argv[1]
    l_in = s_in.split(',') # Turn string into a list.
    set_out = set(l_in) # Turning a list into a set removes duplicate items (order is not kept).
    s_out = ','.join(set_out)
    print(s_out)


    In Bash:



    get_unique_words(){
    python -c "print(','.join(set('$1'.split(','))))"
    }






































      0














      cat filename | awk '{ delete a; for (i=1; i<=NF; i++) a[$i]++; n=asorti(a, b); for (i=1; i<=n; i++) printf b[i]" "; print "" }' > newfile































      • I do not get it

        – Pierre.Vriens
        Dec 2 '18 at 7:00











      • Your code lacks an explanation, and without one it's difficult to follow what's happening. You also seem to make assumptions about the data that seem wrong (whitespace-delimited fields) and about the particular awk implementation being used (asorti() is not a standard awk function).

        – Kusalananda
        4 hours ago



















      0














      Using the original tabular data in the file called file:



      sed '1d' file | sort -u |
      awk '{ color[$1] = ( color[$1] == "" ? $3 : color[$1] "," $3 ) }
      END { for (user in color) print user, color[user] }'


      This generates



      CCC red
      BBB blue,red
      AAA black,blue,green,red


      The three steps of the pipeline:




      1. The sed command removes the first line which is a header that we don't want to read.


      2. The sort command gives us unique lines. The sample data after the sort looks like



        AAA | black
        AAA | blue
        AAA | green
        AAA | red
        BBB | blue
        BBB | red
        CCC | red


      3. The awk command takes this data and produces a comma-delimited string for each user in the array color (where the username is the key into the array). At the end (in the END block), all collected data is outputted.





































        -2














        a="aaa aaa aaa bbb bbb ccc bbb ccc"
        for item in $a
        do
        echo $item
        done | sort -u | (while read i; do ans="$ans $i"; done ; echo $ans)































        • Please add an explanation on how your code works and why you did this and that.
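
Another possibility, sketched here as an assumption about how the asker's loop could be changed, is to avoid adding duplicates in the first place while the string is being built. The helper name append_unique is hypothetical; the variables c and colour1 are the ones from the question.

```shell
# Hypothetical helper: print list $1 with value $2 appended, unless $2 is
# already one of its comma-separated items.
append_unique() {
    case ",$1," in
        *",$2,"*)               # already present: leave the list unchanged
            printf '%s\n' "$1" ;;
        *)
            if [ -z "$1" ]; then
                printf '%s\n' "$2"
            else
                printf '%s\n' "$1,$2"
            fi ;;
    esac
}

c=""
for colour1 in red black blue red green red black blue red green; do
    c=$(append_unique "$c" "$colour1")
done
echo "$c"
```

Wrapping the list in commas before matching keeps "red" from matching inside a longer item such as "darkred".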

          – xhienne
          Mar 24 '17 at 1:37












































        George Vasiliou's answer ("One more awk, just for fun"): answered Mar 23 '17 at 14:12, edited Mar 23 '17 at 14:20























        JJoao's answer (awk with a regex record separator): answered Mar 23 '17 at 15:25
















        Michael D.'s answer ("With tr, sort and uniq"): answered Mar 23 '17 at 12:55, edited Mar 23 '17 at 13:01













        • You need to add | xargs to join the output to one line again

          – Philippos
          Mar 23 '17 at 12:59






        • 3





          Or use sort -u. Or even a awk '!u[$0]++.

          – Benoît
          Mar 23 '17 at 18:42






        • 1





          @Benoît Wow, I did not know about sort -u. I've been using sort | uniq all this time. The wasted keystrokes...

          – gardenhead
          Mar 24 '17 at 1:25






























        2



























        With GNU sed:



        sed ':s;s/\(\<\S*\>\)\(.*\)\<\1\>/\1\2/g;ts'


        You may add ;s/  */ /g to collapse the duplicate spaces that are left behind.



        It works like this: if a word occurs a second time on the line, the later occurrence is removed, and the substitution starts over until no duplicate is left.
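
A small sketch of the loop above in use, assuming GNU sed (the \<, \> and \S operators are GNU extensions). The function name dedupe_words and the final s/ $// cleanup are additions for illustration:

```shell
# dedupe_words: hypothetical helper; requires GNU sed.
# Removes repeated words from a space-separated string, keeping the
# first occurrence of each word in its original position.
dedupe_words() {
    printf '%s\n' "$1" |
        sed ':s;s/\(\<\S*\>\)\(.*\)\<\1\>/\1\2/g;ts;s/  */ /g;s/ $//'
}

dedupe_words "zebra ant spider spider ant zebra ant"
```

This prints zebra ant spider. Each pass deletes one later copy of a word, and the t command jumps back to the label s for as long as a substitution was made.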

















        answered Mar 23 '17 at 12:52









        Philippos













        • What are \< and \>?

          – someonewithpc
          Mar 23 '17 at 20:19











        • @someonewithpc They match no character, but the beginning and end of a word to prevent substrings from being matched.

          – Philippos
          Mar 23 '17 at 21:29











        • Nice, but is that portable? Also, aren't words separated by whitespace? Seems redundant to match not whitespace followed by the end of a word.

          – someonewithpc
          Mar 23 '17 at 21:34






        • 1





          @someonewithpc No, it's not standard, that's why I wrote gnu sed. The nice part is that you don't have to handle first and last string separately

          – Philippos
          Mar 23 '17 at 21:44






























        2














        perl -lane '$,=$";print grep { ! $h{$_}++ } @F'















            answered Mar 23 '17 at 13:07







            user218374






























                2





























                Obligatory awk solution:



                $ echo "ant zebra ant spider spider ant zebra ant" | 
                awk -vRS=" " -vORS=" " '!a[$1] {a[$1]++} END{ for (x in a) print x; } ' ; echo
                zebra ant spider


                (The final echo is there for the newline)
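
The for (x in a) loop visits the keys in an unspecified order. If the original order matters, or the input is comma-separated as in the question, a variant along these lines may help. It is only a sketch: the function name dedupe_keep_order and the array name seen are made up, and it treats the comma as the field separator instead of changing RS:

```shell
# dedupe_keep_order: hypothetical helper, not part of the original answer.
# Prints each comma-separated item the first time it appears, so the
# input order is kept; empty items (e.g. after a trailing comma) are skipped.
dedupe_keep_order() {
    printf '%s\n' "$1" | awk -F, '{
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i != "" && !seen[$i]++)
                out = (out == "" ? $i : out "," $i)
        print out
    }'
}

dedupe_keep_order "red,black,blue,red,green,red,black,blue,red,green,"
```

This prints red,black,blue,green, which is the output the question asks for.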















                edited Mar 23 '17 at 13:58

























                answered Mar 23 '17 at 13:52









                ilkkachu













                • Plus one for the awk! I was also building an awk solution just for fun. Note that the words may be printed in arbitrary order in the END section, because awk iterates over array keys in an unspecified order.

                  – George Vasiliou
                  Mar 23 '17 at 14:14











                • Yes, they will be printed in an essentially random order. The sort solution doesn't keep the original order either, though.

                  – ilkkachu
                  Mar 23 '17 at 14:17











                • Yes, good point! Even sort prints in different order than input.

                  – George Vasiliou
                  Mar 23 '17 at 14:18






                • 1





                  @ilkkachu Actually we don't need to wait for the input to end. We can decide whether to print each word as it is read, with a slight modification to your code: awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo. This preserves the order.

                  – user218374
                  Mar 23 '17 at 14:31


































                1





























                    Python



                    Option 1





                    #!/usr/bin/env python
                    # get_unique_words.py

                    import sys

                    l = []  # unique words, in order of first appearance
                    for w in sys.argv[1].split(','):
                        if w not in l:
                            l.append(w)
                    print(','.join(l))


                    Make executable, then call from Bash:



                    $ ./get_unique_words.py "aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"
                    aaa,bbb,ccc


                    Or you could implement it as a Bash function, but the syntax is messy.



                    get_unique_words(){
                        python -c "
                    l = []
                    for w in '$1'.split(','):
                        if w not in l:
                            l.append(w)
                    print(','.join(l))"
                    }


                    Option 2



                    This option can become a one-liner if needed:





                    #!/usr/bin/env python
                    # get_unique_words.py

                    import sys

                    s_in = sys.argv[1]
                    l_in = s_in.split(',')     # Turn the string into a list.
                    set_out = set(l_in)        # Turning a list into a set removes duplicate items (the order is not kept).
                    s_out = ','.join(set_out)
                    print(s_out)


                    In Bash:



                    get_unique_words(){
                        python -c "print(','.join(set('$1'.split(','))))"
                    }














                    edited May 14 '17 at 3:19

























                    answered Mar 23 '17 at 20:34









                    wjandrea























                        0





























                        awk '{ delete a; for (i=1; i<=NF; i++) a[$i]++; n=asorti(a, b); for (i=1; i<=n; i++) printf "%s ", b[i]; print "" }' filename > newfile














                        edited 4 hours ago









                        George Vasiliou










                        answered Dec 2 '18 at 4:18









                        天津神 こと













                        • I do not get it

                          – Pierre.Vriens
                          Dec 2 '18 at 7:00











                        • Your code lacks an explanation. Without one, it's difficult to follow what's happening. You also seem to make assumptions about the data that seem wrong (whitespace-delimited fields) and about the particular awk implementation being used (asorti() is not a standard awk function).

                          – Kusalananda
                          4 hours ago






























                        0



























                            Using the original tabular data in the file called file:



                            sed '1d' file | sort -u |
                            awk '{ color[$1] = ( color[$1] == "" ? $3 : color[$1] "," $3 ) }
                            END { for (user in color) print user, color[user] }'


                            This generates



                            CCC red
                            BBB blue,red
                            AAA black,blue,green,red


                            The three steps of the pipeline:




                            1. The sed command removes the first line which is a header that we don't want to read.


                            2. The sort command gives us unique lines. The sample data after the sort looks like



                              AAA         | black
                              AAA | blue
                              AAA | green
                              AAA | red
                              BBB | blue
                              BBB | red
                              CCC | red


                            3. The awk command takes this data and produces a comma-delimited string for each user in the array color (where the username is the key into the array). At the end (in the END block), all collected data is outputted.







                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered 3 hours ago









                            Kusalananda























                                -2














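                                a="aaa aaa aaa bbb bbb ccc bbb ccc"
                                # $a is left unquoted on purpose so that the shell splits it into words.
                                for item in $a
                                do
                                printf '%s\n' "$item"
                                done | sort -u | (ans=; while read -r i; do ans="$ans $i"; done; echo $ans)
                                # The unquoted $ans in the final echo drops the leading space.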
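The loop above splits on whitespace, as does the xargs pipeline from the question, so neither breaks up a comma-separated string such as the one the question starts with. A possible variant for that case, offered only as a sketch and not as part of the original answer, converts the commas to newlines first and joins the sorted unique words again with paste. The variable names s and unique are chosen here for illustration.

```shell
s="aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"

# Put one word per line, sort and drop duplicates, then rejoin with commas.
unique=$(printf '%s\n' "$s" | tr ',' '\n' | sort -u | paste -sd, -)

printf '%s\n' "$unique"
```

Note that sort -u returns the words in sorted order, not in the order in which they first appeared.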













                                edited Mar 24 '17 at 0:27

























                                answered Mar 24 '17 at 0:18









                                Tododo Fly













                                • Please add an explanation on how your code works and why you did this and that.

                                  – xhienne
                                  Mar 24 '17 at 1:37




































