Removing duplicate files in your iTunes library

Freshness Warning
This blog post is over 13 years old. It's possible that the information you read below isn't current and the links no longer work.

One of the problems with a networked drive acting as the iTunes library for multiple computers in the house is that I often end up with duplicate songs in the library. I think this comes from importing the library directory when you also have "Copy files to the iTunes library" set in the iTunes advanced prefs. iTunes imports a song and then tries to copy it over itself. Seeing that there’s already a file by that name, it creates a file called "songname 1.mp3".

At least that’s my theory.

To clean up a library full of duplicate files, here’s what I did. I installed Duff, a Unix utility that quickly finds duplicate files. Duff works by comparing the actual contents of any two files that have identical sizes. I sent Duff’s output to a text file with the command

duff -r /Volumes/music/ > duplicatemusic.txt

Once Duff was done running, I ran a short command to grab all the lines from the output that end in 1.mp3 and delete them.

cat duplicatemusic.txt | grep ' 1\.mp3$' | tr '\012' '\000' | xargs -0 rm

If you’ve done whatever it is that causes duplicates a number of times, you might have *2.mp3, *3.mp3, etc. Just run that command again, replacing 1.mp3 with 2.mp3 and so on. The one-liner above could probably be improved to grep for any single digit followed by .mp3, but it’s quick enough to run it a few times that I didn’t bother.
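
For anyone who does want the single-digit version, here is a minimal sketch. The function name `list_numbered_dupes` is made up for illustration, and it assumes the duplicates follow the "songname 1.mp3" pattern described above (a space, one digit, then .mp3 at the end of the line). It only prints the matching lines, so you can review them before deleting anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of Duff): print every line of a Duff output
# file that ends in " <digit>.mp3", for example "songname 1.mp3".
# The leading space and the escaped dot keep names such as "Track 01.mp3"
# from matching.
list_numbered_dupes() {
    grep ' [1-9]\.mp3$' "$1"
}

# Dry run: look over the list first.
#   list_numbered_dupes duplicatemusic.txt
#
# Delete: turn newlines into NULs so xargs copes with spaces in names.
#   list_numbered_dupes duplicatemusic.txt | tr '\012' '\000' | xargs -0 rm
```

A song that really is named something like "Track 1.mp3" would match too, which is one more reason to read the dry-run list before running the delete.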

Andrew
October 31, 2007 1:38 AM

Hi Adam, I'm having a slightly different problem. iTunes indicates that I have 28.58GB of music in the library. When I go to my Music folder and highlight all the music folders the total size is 32GB. 3.42GB is a pretty big discrepancy. Do you know of any program that would match the iTunes library with the actual files on the hard drive and delete the ones that are not in the library? Thanks.

bill johnston
October 31, 2007 3:18 PM

create an empty working file
`touch .tmpDupeFile .tmpSortedFile`
build a big list of md5 signatures
`find . -type f -print0 | xargs -0 md5 -r > .tmpDupeFile`
sort the signatures
`cat .tmpDupeFile | sort > .tmpSortedFile`
create a list of duplicated files
`cat .tmpSortedFile | awk '{ if ($1 == oldmd5) { printf "rm %s \n", $2 } oldmd5 = $1 }' > duplicates.sh`
clean up
`rm .tmpDupeFile`
`rm .tmpSortedFile`

Andrew
October 31, 2007 5:14 PM

Bill, I appreciate your help but I'm afraid that I just don't know what to do with all that information since I'm not a programmer. I was hoping for some kind of software solution to my problem. Thanks.

Adam Kalsey
January 3, 2008 12:06 PM

Bill, your awk script doesn't take into account that most iTunes filenames contain spaces. A file called "My Music" ends up creating 'rm My' instead of 'rm "My Music"'. I changed it to...
cat .tmpSortedFile | awk '{ if ($1 == oldmd5) { printf "rm \"%s\" \n", substr($0, index($0, " ")+1) } oldmd5 = $1 }' > duplicates.sh
This grabs the whole filename and wraps it in quotes so that duplicates.sh works properly.

Todd
January 13, 2008 9:10 PM

Adam, you are THE MAN! Thank you so much for this. I had imported my MP3s into iTunes from an external drive, and M3U files in the folders caused duplicates to be created in my iTunes library. I was just getting ready to delete my entire iTunes library and re-import because of the duplicates. Your work here saved me hours of reimporting. For some reason, I didn't find your site earlier (when Googling for iTunes duplicate solutions), but did find it when searching for info on making sure deleting items from iTunes would also delete the files (hah). I had booted to Windows to use Windows and Robocopy to move my M3U files into a backup folder. I was literally just launching Mac OS to delete my library and start over when I found your site. I'm running Mac OSX Leopard, and I'm not sure that you are?... Either way, I found some discrepancies in your method when running under OSX. I've created a blog entry on my site with the changes for OSX, including a quick tutorial on running the compilation process for Duff, for those who might not be familiar with it. (I wasn't, so I documented it as I went.) Anyway, you can check out my post at this link. http://www.togeo.com/togeo/wordpress/?p=47 Thanks again for the excellent post on your site, you've got a cool blog otherwise, too!!!

Franky
May 12, 2008 1:47 PM

How do you send Duff's output to a text file? I'm dumb. I'm running Leopard on an iMac and I have triples and quadruples of the same songs and it's killing storage capacity.

Timothy Appnel
July 22, 2008 7:39 AM

Thanks for the informative, as always, post Adam. One thing that you seem to overlook is that the duplicates are not removed from the iTunes library. This means that, while the files are gone off the drive, you will see multiple copies of tracks, some marked with (!) markers. I used Super Remove Dead Tracks from Doug's AppleScripts to clean up the catalog. http://dougscripts.com/itunes/scripts/ss.php?sp=removedeadsuper It took a while to run, but that finished the job and got my iTunes library back in shape.

nicolas
August 2, 2008 8:06 PM

Hi, For some reason, the 2nd step did not work for me. I use macosx 10.5.4 and had the following error:
> cat duplicate.txt | grep 1.mp3 | tr '\012' ' \000' | xargs -0 rm
rm: /bibliotheque/iTunesFusion/iTunes Music/Daft Punk/Unknown Album/32 Daft Punk - Revolution 909(1) 1.mp3 /bibli.. ...arepusher/Lost in Translation/04 Tommib 1.mp3 : File name too long
For some reason, xargs is not splitting the entry...! I think I get the \n vs null terminated replacement and I don't see why it does not...
It WORKED however by replacing the command by
cat duplicate.txt | grep " 1.mp3" | while read file; do echo \"$file\"| xargs echo ; done
to see what would actually be deleted and
cat duplicate.txt | grep " 1.mp3" | while read file; do echo \"$file\"| xargs rm -v ; done
to delete the files.
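
A likely explanation for the failure quoted above: the second tr argument is typed as ' \000', with a space before the backslash. tr maps the newline to the first character of that set, which is the space, so no NUL bytes are ever produced and xargs -0 sees the whole list as one enormous filename. The loop workaround can also be written without the extra echo and xargs, as in this sketch. The name `each_file` is made up for illustration, and it assumes one path per line with no newlines inside filenames.

```shell
#!/bin/sh
# Hypothetical helper: run a command once for each line read from stdin.
# "IFS=" and "-r" keep leading spaces and backslashes in the name intact,
# and the quoted "$file" passes a name with spaces as a single argument.
each_file() {
    while IFS= read -r file; do
        "$@" "$file"
    done
}

# Dry run: print what would be deleted.
#   grep ' 1\.mp3$' duplicate.txt | each_file printf '%s\n'
#
# Delete the files.
#   grep ' 1\.mp3$' duplicate.txt | each_file rm -v
```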

kikizee
January 13, 2009 10:11 PM

Someone was looking for a software solution....try Beyond Compare by Scooter software. It's really easy to compare files and folders, then you can adjust everything into one folder or the other so you have everything in one spot. Awesome for everything, not just music. Just google it, I don't know the URL. HTH :)

Joshua
February 26, 2009 7:57 PM

To improve your one-liner, just use a regex like the following.
cat dupes.txt | grep ' [1-9]\.mp3$' | tr '\012' '\000' | xargs -0 rm

sam finney
November 20, 2009 5:47 AM

In iTunes 9.0.2 with Home Sharing set up, there is a "Show" box in the bottom left of the screen. You can select "Show items not in my library" and get a list of allegedly non-duplicate items that you can then transfer. This works pretty well. However, it is not perfect--the filtered list it showed me excluded some items that were not in my library...

Roland Beuker
February 8, 2010 12:45 PM

Hello Adam; Your changed awk script doesn’t run;
cat .tmpSortedFile | awk ‘{ if ($1 == oldmd5) { printf “rm "%s" \n”, substr($0, index($0, ” “)+1) } oldmd5 = $1 }’ > duplicates.sh
-bash: syntax error near unexpected token `('
What's wrong here?

Adam Kalsey
February 8, 2010 12:59 PM

On the web page, the quotes are being changed to curly quotes. They're typographically very nice, but don't actually work as part of the unix command line. Take those double and single quotes and re-type them to make them just normal quotation marks.

markbun
June 20, 2013 11:13 PM

I'm using "DuplicateFilesDeleter". It's a guaranteed fix for duplicates.


© 1999-2020 Adam Kalsey.