Remove duplicate lines from a file but leave 1 occurrence

I'm looking to remove duplicate lines from a file but leave 1 occurrence in the file.

Example of the file:

this is a string
test line
test line 2
this is a string

From the above example, I would want to remove 1 occurrence of "this is a string".

Best way to do this?

edited May 19 '18 at 18:33

asked May 19 '18 at 13:09

Tom Bailey

With such questions you should always provide example input and output.

– Hauke Laging
May 19 '18 at 13:12

Possibly related: Remove duplicate lines while keeping the order of the lines

– steeldriver
May 19 '18 at 13:12

Are the duplicated lines adjacent to one another? Is the output to remain in the same order or would it be ok to sort the data?

– Kusalananda
May 19 '18 at 13:14

Keep one occurrence of a duplicate (i.e. two identical lines per match) or simply "remove all duplicate lines, leaving only one line per set of duplicates"? Does the final order matter?

– roaima
May 19 '18 at 13:17

If it is not a problem for you that the lines will be sorted, then sort file | uniq will do what you want.

– peterh
May 19 '18 at 19:03

2 Answers

This leaves the first occurrence:

awk '! a[$0]++' inputfile



start cmd:> echo 'this is a string
cont. cmd:> test line
cont. cmd:> test line 2
cont. cmd:> this is a string' | awk '! a[$0]++'
    this is a string
    test line
    test line 2
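For readers unfamiliar with the idiom, the one-liner can be spelled out more verbosely. The following is a minimal sketch that should behave the same way; the function name dedupe_awk_verbose and the array name seen are made up for illustration and are not part of the answer.

```shell
# Hypothetical helper: print each distinct line once, in first-seen order.
dedupe_awk_verbose() {
    awk '{
        # seen[] is indexed by the full text of the line ($0).
        if (!($0 in seen)) {
            print            # first time this exact line appears
        }
        seen[$0] = 1         # remember the line for later records
    }' "$@"
}
```

Calling `dedupe_awk_verbose inputfile` prints the deduplicated lines to standard output and leaves inputfile untouched, just like the short form.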

edited May 19 '18 at 20:16

answered May 19 '18 at 13:16

Hauke Laging

It seems to just print the result and not actually make any changes in the file.

– Tom Bailey
May 19 '18 at 15:49

@TomBailey That's why I told you to provide example input and output. I did test it and it works fine for me.

– Hauke Laging
May 19 '18 at 16:49

I have edited it now.

– Tom Bailey
May 19 '18 at 19:29

@TomBailey works fine for me.

– Hauke Laging
May 19 '18 at 20:16
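On the point raised in the comments that the command only prints its result: awk writes to standard output, so changing the file itself needs an extra step. One possible sketch, assuming mktemp is available (it is common but not required by POSIX); the function name dedupe_in_place is hypothetical.

```shell
# Hypothetical helper: deduplicate a file and write the result back to it.
dedupe_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write the deduplicated lines to a temporary file first; only
    # overwrite the original if awk succeeded.
    awk '!seen[$0]++' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Redirecting straight back with `awk ... file > file` would not work, because the shell truncates the file before awk gets to read it. Using `cat "$tmp" > "$file"` rather than mv keeps the original file's permissions and ownership.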


Demo file `stuff.txt` contains:

one
two
three
one
two
four
five

Remove duplicate lines from a file, assuming you don't mind that the lines end up sorted:

$ sort -u stuff.txt

five
four
one
three
two

Explanation: the -u flag tells sort to sort the lines of the file and print only one copy of each distinct line.
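If the sorted result should replace the original file, sort can write to its own input through -o. A small sketch under that assumption; the function name dedupe_sorted_in_place is hypothetical, and LC_ALL=C is my addition so that lines are compared byte by byte rather than by locale collation rules.

```shell
# Hypothetical helper: sort a file, drop duplicate lines, and store the
# result in the same file.
dedupe_sorted_in_place() {
    # sort reads all of its input before it opens the -o output file,
    # so naming the input file as the output is safe.
    LC_ALL=C sort -u -o "$1" "$1"
}
```

After `dedupe_sorted_in_place stuff.txt`, the demo file would contain five, four, one, three, two, one per line.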

Remove duplicate lines from a file, preserve original ordering, keep the first:

$ cat -n stuff.txt | sort -uk2 | sort -nk1 | cut -f2-

one
two
three
four
five

Explanation: the -n flag makes cat prepend a line number and a tab to every line. The first sort (-uk2) sorts and removes duplicates while comparing only from the second field onward, so the line numbers are ignored. The second sort (-nk1) sorts numerically on the line numbers added in the first step, which restores the original order. Finally, cut -f2- strips the line-number field.
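The same pipeline can be wrapped in a function with the tab separator made explicit, so that lines containing spaces are compared as a whole. This is only a sketch: the name dedupe_keep_first is hypothetical, cat -n is a common extension rather than POSIX, and keeping the first occurrence relies on sort -u printing the first line of each run of equal lines, which is how I understand GNU sort to behave.

```shell
# Hypothetical helper: remove duplicate lines, keep the first occurrence,
# and preserve the original order.
dedupe_keep_first() {
    tab=$(printf '\t')                       # cat -n separates number and text with a tab
    cat -n "$1" |
        LC_ALL=C sort -t "$tab" -u -k2 |     # unique on the text, ignoring the line number
        LC_ALL=C sort -t "$tab" -n -k1,1 |   # restore input order by line number
        cut -f2-                             # drop the line-number field
}
```

With the demo file, `dedupe_keep_first stuff.txt` should print one, two, three, four, five.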

Remove duplicate lines from a file, preserve order, keep the last:

tac stuff.txt > stuff2.txt; cat -n stuff2.txt | sort -uk2 | sort -nk1 | cut -f2- > stuff3.txt; tac stuff3.txt > stuff4.txt; cat stuff4.txt

three
one
two
four
five

Explanation: same as before, but tac reverses the file before the duplicates are removed and again afterwards, so the last occurrence of each line is the one that is kept.
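The intermediate files are not strictly needed. As an alternative sketch (not the method used in this answer), the awk idiom from the other answer can be placed between two tac calls; the function name dedupe_keep_last is hypothetical, and tac is assumed to be available (it ships with GNU coreutils).

```shell
# Hypothetical helper: remove duplicate lines, keep the last occurrence,
# and preserve the relative order of the surviving lines.
dedupe_keep_last() {
    tac "$1" |                 # reverse, so the last occurrence comes first
        awk '!seen[$0]++' |    # keep the first occurrence in the reversed stream
        tac                    # reverse back to the original direction
}
```

With the demo file, `dedupe_keep_last stuff.txt` should print three, one, two, four, five, matching the output above.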

answered 35 mins ago

Eric Leschinski
