finding duplicate files in backups

Abhishek Dixit abhidixit87 at gmail.com
Mon Jun 6 21:37:15 UTC 2011


On Tue, Jun 7, 2011 at 12:02 AM, Abhishek Dixit <abhidixit87 at gmail.com> wrote:
> Hi,
> I have a 1 TB USB hard disk used as a backup drive. For various
> reasons the same file existed on different file systems and has been
> backed up multiple times to this disk.
> I want to keep only a single copy of each of those files. The problem
> is that these files are spread across different file systems
> (multiple partitions) and sit in various locations which I do not
> remember.
> I want to achieve the following:
>
> 1) Reduce the n occurrences of the same file at different locations
> to 1 occurrence.
>
> 2) Since I do not know the names of the files which have multiple
> occurrences, how can I easily find them?
>
> 3) Is there a way I can create an index of the files and directories
> present on my laptop? For example, when you open a book, it has an
> index page which tells you what is on which page number. I want the
> same thing for my files.
>
> What would be an easy way to achieve the above?
> --
> Regards
> Abhi
>
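
For points 1 and 2 above, here is a minimal sketch in Python of one
common approach: group files by size first, then compare SHA-256
hashes only within groups that share a size. The function names
(file_digest, find_duplicates) are made up for illustration, and the
sketch only reports duplicates; it deliberately deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    # Pass 1: group regular files by size (cheap, no file reads).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with another file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_hash[(size, file_digest(path))].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling find_duplicates() on the mount point of the backup drive
(whatever that path is on your system) returns one list per set of
identical files, so you can review each set before deciding which
copy to keep.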

On Tue, Jun 7, 2011 at 12:53 AM, compdoc <compdoc at hotrodpc.com> wrote:
> It's unclear exactly what you're trying to achieve with step 3, but you
> might google 'deduplication' for the rest.
>

I want to create an index of files. It could be a plain text file, an
XML file, or an HTML page, similar to the index a search engine
crawler builds. The purpose is to avoid browsing every subdirectory to
see what it contains. When I do browse a subdirectory I can recall
what is inside it, but with so many directories and subdirectories it
is hard to get a quick overview of just the directory structure,
especially when I am under pressure.

For example, if I had a pictorial representation of my home directory
and its subfolders, I would not need to open each subdirectory just to
find out where I left a particular file, given that my content is
highly organized. Maybe I am not putting it clearly, so imagine a
picture map of directory names, their subdirectories, and the files in
each directory, which lets me see everything at a glance without
browsing each folder.

If I have not made myself clear, let me know and I will try to
describe it further.
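
To make the idea concrete, here is a rough sketch in Python of such an
index as an indented plain-text tree. The function name build_index is
hypothetical, and this is only one possible layout (directories end
with "/", and the files of a directory are listed before its
subdirectories).

```python
import os


def build_index(root):
    """Return the lines of an indented text tree of root."""
    root = os.path.abspath(root)
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in alphabetical order
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        # Fall back to the full path if basename is empty (e.g. "/").
        label = os.path.basename(dirpath) or dirpath
        lines.append("    " * depth + label + "/")
        for name in sorted(filenames):
            lines.append("    " * (depth + 1) + name)
    return lines
```

Joining the result with newlines and writing it to a text file gives a
single page that can be read or searched at a glance, without opening
each folder.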



