Get checksum of directory on bash
I was wondering:
- Is it possible to create a checksum of a directory (using something like md5sum)?
- Is it possible to recursively create a checksum for each file inside the directory (and then print it out)?
- Or both?
I'm using bash.
bash files directory hashsum
edited Sep 10 '15 at 22:12
Gilles
asked Sep 10 '15 at 7:53
drigler
4 Answers
md5sum won't take a directory as input. However,
tar cf - FOO | md5sum
will checksum it: if a file changes anywhere within FOO, the checksum will change, but you won't have any hint as to which file. The checksum will also change if any file metadata changes (permissions, timestamps, …).
You might consider using:
find FOO -type f -exec md5sum {} \; > FOO.md5
which will md5 every file individually and save the result in FOO.md5. This makes it easier to check which file has changed. This variant depends only on file content, not on metadata.
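To find out later which file changed, the saved list can be fed back to md5sum. The following is only a minimal sketch: it assumes GNU md5sum (for `--quiet`) and that it is run from the same directory in which FOO.md5 was created, since the recorded paths are relative. The function name `check_manifest` is made up for this example.

```bash
#!/bin/bash
# check_manifest FILE: re-hash every file listed in FILE (as written by
# "md5sum ... > FILE") and print only the entries that no longer match.
# Returns non-zero if any listed file changed or can no longer be read.
check_manifest() {
    md5sum --quiet -c -- "$1"
}
```

Typical use would be `check_manifest FOO.md5`. Files added to FOO after the list was written are not reported, because the check only covers what the list contains.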
edited Jun 28 '17 at 3:33
Community♦
answered Sep 10 '15 at 9:12
Archemar
With bad luck, both proposed commands may take a week. Regarding the second one: find(1) was enhanced in 1989, so you may want to reread the man page and search for "execplus" (-exec cmd +).
– schily
Sep 10 '15 at 9:41
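The `-exec cmd {} +` form mentioned in this comment hands many file names to each md5sum run instead of starting one process per file. A sketch of that variant, wrapped in a function whose name (`hash_tree`) is invented here:

```bash
#!/bin/bash
# hash_tree DIR: print "checksum  path" for every regular file under DIR,
# letting find batch the file names into as few md5sum runs as possible.
hash_tree() {
    find "$1" -type f -exec md5sum {} +
}
```

`hash_tree FOO > FOO.md5` should produce the same lines as the `\;` form; only the number of md5sum processes differs.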
I suppose the OP checksums files to monitor changes in a "small" directory, but yes, if you have a zillion files and a few terabytes, this could take long.
– Archemar
Sep 10 '15 at 9:46
Well, the main issue here is that tar traverses the whole tree and this may affect a lot of files unless you e.g. use the star -D option. For small trees, the command may be sufficient, but unless you carefully specify the archive format this may not even be helpful, as modern TAR archive formats include the last file access time.
– schily
Sep 10 '15 at 9:57
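If a single value for the whole tree is wanted without tar's metadata, one option (not proposed in the thread, so treat it as an assumption) is to hash the sorted list of per-file checksums. The sketch below assumes GNU find, sort and xargs for `-print0`, `-z` and `-0 -r`; `dir_md5` is a hypothetical name.

```bash
#!/bin/bash
# dir_md5 DIR: print one MD5 for the tree under DIR, derived only from
# the relative file paths and the file contents (no timestamps or modes).
dir_md5() {
    (
        # cd first so the recorded paths do not depend on where DIR lives
        cd -- "$1" || exit 1
        # NUL-separated names survive spaces and newlines; sorting in the
        # C locale gives a stable order regardless of directory layout on disk
        find . -type f -print0 | LC_ALL=C sort -z |
            xargs -0 -r md5sum | md5sum | cut -d ' ' -f 1
    )
}
```

Two copies of the same tree should give the same value, and changing, renaming, adding or removing a file changes it. Empty directories and file permissions are ignored by design.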
Thanks a lot, this is what I was looking for. My directories are around 1 GB with a reasonable file count. The last one takes about a minute.
– drigler
Sep 10 '15 at 11:27
ZFS creates such a checksum, but it is for internal use only.
In order to calculate a checksum of an object, you need to read that object, and for directories this is not guaranteed to be possible.
Many filesystems return the EISDIR error when you try to read(2) a directory, and ZFS in addition sets the "size" of a directory to the number of entries in that directory instead of the number of bytes (which is what you might expect if you wanted to read(2) a directory).
The fact that some filesystems still allow you to read(2) directories is a bug: a leftover artefact from the 1970s, when there was no readdir() call.
BTW: on my WOFS, all directories have size 0 because they don't have any file-type content; instead, the files say in their metadata "hey, I am in this directory". If you wonder what WOFS is: all other basic concepts from WOFS have been incorporated into ZFS.
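In practice this means a script should not hand a directory to md5sum and hope for the best. A small guard, sketched here with an invented function name (`hash_if_file`), makes the distinction explicit before anything is read:

```bash
#!/bin/bash
# hash_if_file PATH: checksum PATH only when it is not a directory.
# For a directory, print a hint on stderr and return non-zero.
hash_if_file() {
    if [ -d "$1" ]; then
        echo "$1: is a directory, hash the files inside it instead" >&2
        return 1
    fi
    md5sum -- "$1"
}
```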
edited Sep 10 '15 at 8:25
answered Sep 10 '15 at 8:09
schily
Thanks for your answer, the zero-size didn't cross my mind. :)
– drigler
Sep 10 '15 at 11:21
1. Why do you need md5sum on a directory? If you want to be sure of file consistency and/or integrity, you can tar it and hash the tar/tar.gz file.
2. If you have subdirectories, it will be a little harder. If you have files only, try this as an example:
#!/bin/bash
# {username} is a placeholder; the trailing /* makes the loop visit each file in the directory
for i in /home/{username}/Books/rhel/*; do
    md5sum "$i"
done
Output example:
dd7a684cc8668d208ca5dcf00bc58e8d Red_Hat_Enterprise_Linux-6-Deployment...
775602071a1ec5a1ac1a99cee9d065fa Red_Hat_Enterprise_Linux-7-High_Av..
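For the subdirectory case, bash itself can walk the tree. This is a sketch rather than part of the original answer: it assumes bash 4 or later for the `globstar` option, and `hash_recursive` is a hypothetical name.

```bash
#!/bin/bash
# hash_recursive DIR: md5sum every regular file under DIR, at any depth.
# The body runs in a subshell so the shopt changes do not leak to the caller.
hash_recursive() (
    # globstar: ** matches files and directories recursively
    # dotglob:  include hidden files
    # nullglob: an empty match expands to nothing instead of the pattern
    shopt -s globstar dotglob nullglob
    for f in "$1"/**; do
        # ** also yields directories, so hash regular files only
        if [ -f "$f" ]; then
            md5sum "$f"
        fi
    done
)
```

For very large trees the find-based commands from the other answer are likely the better fit, since the glob is expanded in memory before the loop starts.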
edited Jan 25 '17 at 23:21
answered Sep 10 '15 at 8:51
obohovyk
You're right, but I don't want to tar my files :) Thanks for your answer
– drigler
Sep 10 '15 at 11:21
On https://blake2.net/ they claim:
"BLAKE2 is a cryptographic hash function faster than MD5, SHA-1, SHA-2, and SHA-3, yet is at least as secure as the latest standard SHA-3. BLAKE2 has been adopted by many projects due to its high speed, security, and simplicity."
Therefore, some tools use it, e.g. rmlint (https://github.com/sahib/rmlint).
Anyway, following https://unix.stackexchange.com/a/228758/9689, you can also do:
find "${dirname}"/ -type f -exec b2sum -b -l 256 {} \; > "${dirname}".blake2sum_l256
And later check with:
b2sum -c "${dirname}".blake2sum_l256
Or this to see only the failing ones:
b2sum --quiet -c "${dirname}".blake2sum_l256
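One possible refinement, offered as a sketch and not as part of the answer: recording paths relative to the directory lets the same list verify a copy of the tree stored elsewhere. It assumes GNU coreutils (`b2sum`, `realpath`); the function names `write_manifest` and `verify_manifest` are invented for illustration, and the list file should live outside the directory being hashed.

```bash
#!/bin/bash
# write_manifest DIR FILE: store 256-bit BLAKE2 checksums of all regular
# files under DIR in FILE, with paths relative to DIR.
write_manifest() {
    local out
    out=$(realpath -- "$2") || return 1   # absolute, because we cd below
    ( cd -- "$1" && find . -type f -exec b2sum -b -l 256 {} + ) > "$out"
}

# verify_manifest DIR FILE: re-check DIR against FILE and print only the
# entries that fail; returns non-zero on any mismatch.
verify_manifest() {
    local list
    list=$(realpath -- "$2") || return 1
    ( cd -- "$1" && b2sum --quiet -c -- "$list" )
}
```

For example, `write_manifest "$dirname" /tmp/tree.b2` followed later by `verify_manifest /mnt/backup/"$dirname" /tmp/tree.b2` would compare a backup against the original (both paths are placeholders).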
answered 1 hour ago
Grzegorz Wierzowiecki