I have many gigabytes of PDF files, and collecting and organizing them in a way that is both easy and useful is something of a challenge. I’ve tried a number of approaches and am still working out which is best, but here are my results to date.
What I actually use as of now:
EagleFiler is a general-purpose collection management system (it is in the same general market as DevonThink, Yojimbo, and Together, and is perhaps more distantly similar to Journler and Mori). The reasons I have adopted EagleFiler are:
- It is under active development and is getting better and better, fast.
- It does not store my files in any specialized database; they are just in a folder. Although you are not supposed to move files in and out of this folder except through EagleFiler, this means that if EagleFiler were ever to disappear, my files would still be just as accessible. It also means that other programs can link to files there.
- Getting files into EagleFiler can be very easy: there are two system-wide hotkeys, one that sends whatever you are looking at straight into the active library and one that lets you choose which open library to send it to.
- It does integrity checking on the files, so I can be relatively confident no files have gotten corrupted.
- It does duplicate checking upon import, so I can just drop any newly-acquired PDF files there, and it will generally refuse to import files I’ve added already.
- It indexes the contents of files, and searching is fast.
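To make the integrity-checking and duplicate-checking points concrete, here is a minimal sketch of the general idea: record a checksum for each file, then compare checksums later. This is not EagleFiler’s actual mechanism, which I have not inspected; the function names, the choice of SHA-256, and the restriction to `*.pdf` files are all my own assumptions for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(library_dir):
    """Map each PDF path under library_dir (hypothetical layout) to its digest."""
    return {
        str(p): file_digest(p)
        for p in sorted(Path(library_dir).rglob("*.pdf"))
        if p.is_file()
    }


def is_duplicate(candidate, library_dir):
    """True if some PDF already in the library has identical contents."""
    return file_digest(candidate) in set(build_manifest(library_dir).values())


def changed_files(manifest):
    """Return paths whose contents no longer match the recorded digest.

    A file that has gone missing is reported as changed as well.
    """
    return [
        path
        for path, digest in manifest.items()
        if not Path(path).is_file() or file_digest(path) != digest
    ]
```

Saving the manifest when files are added and running `changed_files` on it later is the integrity check; calling `is_duplicate` before copying a new file in is the duplicate check. Note that this only catches byte-for-byte duplicates, so two different scans of the same paper would both be accepted.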
So, my primary papers collection is in an EagleFiler library, within which I have created folders by source (like “Journals”, with subcategories like “LI” and “NLLT”).
EagleFiler can add tags (including hierarchically organized tags) to files, but I use them relatively sparingly. I find the tag interface a bit clumsy and not very well suited to working with lots of tags. A small number is manageable, but not hundreds.
Every file in the EagleFiler library has a “Title” and a “From” field (among others), and I have been putting paper titles in the Title field and listing authors in the From field. I have also tried to regularize my naming convention to AuthorYear-source.pdf. This makes my library browsable and the filenames predictable, but the big downside is that this information is not used by anything else. In particular, making a bibliographic entry requires entering the same information again.
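One way to soften the double-entry problem would be to derive a bibliography stub from the filename plus the Title and From values. The following is only a sketch under my own assumptions: the regular expression is my guess at what “AuthorYear-source.pdf” looks like in practice (for example a made-up file called `Smith2001-LI.pdf`), the source is assumed to be a journal abbreviation, and the title and author strings have to be supplied by hand, since I am not assuming any particular way of reading them out of EagleFiler.

```python
import re

# Assumed shape of the convention: AuthorYear-source.pdf, with an optional
# lowercase letter after the year (e.g. "Smith2001a-NLLT.pdf").
FILENAME_RE = re.compile(
    r"^(?P<author>\D+?)(?P<year>\d{4})(?P<suffix>[a-z]?)-(?P<source>.+)\.pdf$"
)


def parse_filename(filename):
    """Split an AuthorYear-source.pdf filename into its named parts."""
    m = FILENAME_RE.match(filename)
    if m is None:
        raise ValueError(f"not in AuthorYear-source.pdf form: {filename!r}")
    return m.groupdict()


def bibtex_stub(filename, title, authors):
    """Build a BibTeX @article stub (assumes the source is a journal).

    title and authors are the values kept in the Title and From fields,
    passed in as plain strings.
    """
    parts = parse_filename(filename)
    key = f"{parts['author']}{parts['year']}{parts['suffix']}"
    return "\n".join([
        f"@article{{{key},",
        f"  author  = {{{authors}}},",
        f"  title   = {{{title}}},",
        f"  journal = {{{parts['source']}}},",
        f"  year    = {{{parts['year']}}},",
        "}",
    ])
```

For instance, `print(bibtex_stub("Smith2001-LI.pdf", "A Made-Up Title", "Smith, Jane"))` would produce an entry keyed `Smith2001` that could be pasted into a `.bib` file and completed there (volume, pages, and the full journal name are still missing).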
I actually also use EagleFiler for many other collections, including my own work documents (papers, talk handouts, student work, etc.).
BibDesk is primarily for managing LaTeX bibliography files, but it also doubles as a pretty decent paper archive and browser. You can attach PDF files to the entries, which are then indexed and can be searched.
Below are some notes about other programs that I have tried.
Yep. Nice browser, but it only views PDF files (no PostScript, no Word documents, no text files), and the metadata doesn’t really include everything you’d want. Moreover, since it is not integrated with any kind of bibliography management, it would probably require double entry of a fair amount of information.
Bookends. This is really a bibliography manager, but it isn’t that useful to me now that I’m working with TeX (though it’s a good replacement for EndNote if you use Word or Mellel). It also allows you to attach and browse PDF files, but this too is likely to require double entry (or risk going out of sync).
Zotero. This is a more general-purpose bibliography/collection tool for capturing whatever you find on the web, tagging it, adding notes, and so on. It’s a Firefox plugin, and it works pretty well. You can keep bibliography information in it, add tags, attach files (and web pages), and index them for searching. I want to like this, and the notes function is nice, but there are certain things that make it a little bit clunky for just keeping your papers in.
Journler. This is maybe not the best for a papers archive, but it is great for taking notes and organizing them. I’m still working out how to integrate it with everything else.
Papers. This looks promising, and it is aimed at more or less exactly our market, but it’s still a bit more friendly to medical papers than to linguistics papers. It also has only rudimentary bibliography features, which means more double entry.
Skim. This is not a paper organizer, but it is a PDF viewer that allows you to add notes to PDFs in a nondestructive way. It is by the same people as BibDesk. I’ll have more to say about Skim and how to store Skim notes.