Extract BibTeX references from a table of contents

I have been trying different tools to extract bibliography references from the table of contents of a PDF book.

Suppose you have the PDF of an edited book that contains 20 articles. If you want the reference data for every entry, you have to go to Google Scholar and extract it for each article separately. The process is tedious. Furthermore, Google Scholar often returns incomplete data, so you then have to edit each reference by hand. It is a lot of work.

Wouldn’t it be easier if you could just pick the reference data directly from the table of contents of the PDF book?

Yes, in principle.

But in practice, you need to know a lot of programming and understand how PDF files work under the hood. I know neither. Therefore, I came up with a simpler but equally workable solution: using Keyboard Maestro and JabRef.

The process is a bit complex, but the output is much better and faster than Google Scholar or any of the other reference extraction methods I have tried.

  1. Fill in the reference data of the main book in JabRef (from WorldCat).
  2. Copy the BibTeX of the edited book to a specific clipboard inside Keyboard Maestro. (If you are importing my macro, simply hit CTRL+C; that copies the BibTeX and does some calculations to get the publication year.)
  3. Copy the title, author and page number of each article in the PDF book. Each reference must be copied in that order.
  4. Hit a shortcut (CMD+ALT+9) that opens a Keyboard Maestro window asking for the number of copied references. I count the references I copied and enter the number. I typically copy 8 references at a time.
  5. Click OK. Keyboard Maestro turns the clipboards into references, calculates the page numbers for each entry, and crossrefs the entries with the parent book.

I get perfectly formatted references from the copied clipboards. Once you understand how it works, it is a very powerful script.
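
For readers who prefer code to macros, the steps above can be roughly sketched in Python. This is not the Keyboard Maestro macro itself, only an illustration of the same idea under some assumptions: the function names `parse_book` and `chapters_to_bibtex`, the citation key format, and the optional `last_page` argument are all made up for this example, and I assume that each article ends on the page before the next one starts and that page numbers are plain Arabic numerals.

```python
import re


def parse_book(bibtex):
    """Pull the citation key and the publication year out of the book's BibTeX."""
    key = re.search(r"@\w+\s*\{\s*([^,\s]+)\s*,", bibtex)
    if key is None:
        raise ValueError("no citation key found in the book entry")
    year = re.search(r'\byear\s*=\s*[{"]?\s*(\d{4})', bibtex, re.IGNORECASE)
    return {"key": key.group(1), "year": year.group(1) if year else None}


def chapters_to_bibtex(clips, book, last_page=None):
    """Turn copied clips into BibTeX entries that crossref the parent book.

    clips is a flat list in the copy order from step 3:
    [title, author, start_page, title, author, start_page, ...]
    """
    if len(clips) % 3 != 0:
        raise ValueError("expected title, author, page for every article")
    triples = [clips[i:i + 3] for i in range(0, len(clips), 3)]
    starts = [int(page) for _, _, page in triples]
    entries = []
    for i, (title, author, _) in enumerate(triples):
        start = starts[i]
        # Assumption: an article ends one page before the next one begins.
        # The last article only gets an end page if last_page is supplied.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else last_page
        pages = f"{start}--{end}" if end is not None else str(start)
        # Hypothetical key scheme: first author's surname + year + start page.
        first_author = author.split(" and ")[0]
        surname = first_author.split(",")[0].split()[-1].lower()
        key = f"{surname}{book['year'] or ''}p{start}"
        entries.append(
            f"@incollection{{{key},\n"
            f"  author = {{{author.strip()}}},\n"
            f"  title = {{{title.strip()}}},\n"
            f"  pages = {{{pages}}},\n"
            f"  crossref = {{{book['key']}}}\n"
            f"}}"
        )
    return "\n\n".join(entries)


if __name__ == "__main__":
    # Made-up sample data, only to show the call shape.
    book = parse_book("@book{smith2015, title = {Sample Volume}, year = {2015}}")
    clips = ["Introduction", "Jane Doe", "1", "Methods", "Roe, John", "15"]
    print(chapters_to_bibtex(clips, book, last_page=40))
```

The book's year is only used to build the citation keys here. Every other book-level field is left to the `crossref` line, which is the point of crossreferencing the parent book.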

Keyboard Maestro script

You can ask if you are interested in the script.
