Powerful search tools in Windows & Mac

If you are an information worker (an academic, say), having great support from a powerful search tool is crucial. Unless you have that sharp searching tool, you will have trouble picking that grain of information out of the gigantic jungle of information coded in the form of data, sentences, or books.

There are great tools everywhere, but some stand out more than others in their capabilities.

The three giants in the Windows environment, plus two Mac tools, that you might want to check are:

  1. Dtsearch (Windows)
  2. X1 search (Windows)
  3. Copernic desktop (Windows)
  4. FoxTrot Professional search (Mac)
  5. ? Devonthink (Mac)

 

Personally, I am not that fond of Copernic, mainly because it has no internal preview tool, and it seems to consume too many of my machine’s resources.

My number 1 pick is DtSearch. It is the best in its class at digging out the tiniest piece of information. Its proximity search is an invaluable tool for finding associated ideas.

X1 comes close. It is more of a document manager, like Devonthink on the Mac, than a dedicated search tool. X1 is also a wonderful application, and it is cheaper than DtSearch.

 

As for FoxTrot, it is quite comparable to DtSearch. But I like the preview system in FoxTrot even more.

The proximity search in DtSearch requires you to write the distance between the words (or phrases) explicitly, like Mary w/5 John (‘search for Mary and John within a distance of 5 words’), while FoxTrot has a little scrolling window for searching within a paragraph, within a sentence, or at closer distances.
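To make the idea of a proximity query concrete, here is a minimal Python sketch of what "Mary within 5 words of John" means. It is only an illustration: the function name `within_distance` is made up for this note, and it says nothing about how DtSearch or FoxTrot actually tokenize text or count distances.

```python
import re


def within_distance(text, first, second, max_words):
    """Return True if `first` and `second` occur at most `max_words` words apart.

    Assumption: a "word" is any run of letters, digits or underscores,
    and matching ignores case. Real search tools may count differently.
    """
    words = [w.lower() for w in re.findall(r"\w+", text)]
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # Compare every position of the first word with every position of the second.
    return any(
        abs(i - j) <= max_words
        for i in first_positions
        for j in second_positions
    )


near = within_distance("Mary met John at the market", "Mary", "John", 5)
far = within_distance(
    "Mary left early, and long after the party had ended John arrived",
    "Mary", "John", 5,
)
print(near, far)  # True False
```

In the second sentence the two names are ten words apart, so a w/5 style query would skip it while a wider distance would catch it.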

One might put DT (Devonthink) up as a competitor to FoxTrot on the Mac. But I think FT (FoxTrot) is much superior on the search side, while DT rocks for its AI and other organizational tools.

(Note: I don’t like giving links to the products because I don’t want to sound as if I want to earn a penny by associating them with my small, free blog. I am dropping these notes because I believe they might help somebody out there, not because I have some other agenda. I used to keep these notes in my internal system; I am putting them out now in case somebody gets something useful out of them.)


Where Bookends sucks

I am now checking out alternatives, as the fate of Sente looks dismal. The best alternative for Sente users looks like Bookends. I think there is some kind of communication between the developers of the two applications. One of the reasons we suspected the abandonment of Sente came from developments on the Bookends side: as we have seen, the developers of Bookends have been preparing to grab the former users of Sente. In their updates since June 2015, they have been modifying their application to import Sente references. Still, there are a lot of glitches in moving references from Sente.

But, personally, I am more worried about the capabilities of Bookends as a reference manager than about the migration. The migration is a few days’ work. But if the application has some fundamental weakness, that will be a pain for a long time, and I am afraid that migrating my references might not be worth the effort.

First the strengths:

  1. Bookends seems faster than Sente, at least at startup.
  2. It works well with a number of other applications such as Devonthink, Tinderbox and Scrivener.
  3. And, most importantly, it has some cool tools called **Global Change** which seem very useful. These tools help you change a number of references in one sweep. Sente also has such a system, implemented differently, but I think Bookends has the upper hand here. One of the worst footprints that Mendeley (from my Windows days) left in my references for ages was that the titles of the references were placed in the Journal field. I don’t know why Mendeley does that, but the titles were exported as journals. I was not able to change that for a number of years within Sente. Bookends does it in one sweep: just a couple of seconds to fix about 1500 references. Yes, that is great programming.

 

And the issues:

I think Bookends is quite a good reference manager. But it has some really deep issues:

  1. The PDF reader is ugly, and not even comparable to the reader in Sente. Sente gives the best PDF reading experience ever; not even dedicated PDF readers like PDF Expert, iAnnotate or Acrobat Reader can reach it. Bookends has a mediocre PDF reader.
  2. It is not well organized: the tools and features are jumbled here and there. It is really not clear which menu does what. At first look, the app generally looks unattractive. But, honestly, I am less worried about that. I just think some people might not appreciate it. Personally, I just want my job done. I am not going to wear this app to my birthday party.
  3. But the real issue, and the true depth of shit of Bookends, is in the reference detection and downloading part. The whole focus of Bookends seems to be on PubMed. Generally, most reference managers can detect references from the major databases (search engines) like PubMed and Google Scholar. But Sente has been efficient at doing it from a broad array of sources, more than I can list here. The most important of them, for me, have been WorldCat and the Stanford University Library. These two sources offer the cleanest references, while Google Scholar gives the most incomplete metadata, and Sente made the process perfect with its feature called Targeted Browsing. So, when I was trying Bookends, I was hoping that it would do the same.

Assume that I received a PDF book from a friend and I want to download its metadata. In Sente, I would drop the PDF into the library; Sente then displays its Citation Lookup dialogue box, in which I select the title of the book and choose WorldCat.

[Screenshot: the Citation Lookup dialogue box in Sente]

The top 5 sources are the most important. The title of the book, “Italian Syntax…”, is automatically pasted into WorldCat. Now, look at the WorldCat website.

[Screenshot: the WorldCat website with the red circle]

That red circle makes the insertion of a reference so elegant. Clicking the red button populates the reference information. WorldCat gives complete reference data; I rarely find a mistake. Note that these 5 sources can be expanded, if required. I used to have a large number of other sources, including Stanford University.

Bookends has a similar feature. But the implementation is restricted to a few sources, so if I want to do the same in Bookends, the process is clunky and inefficient. The sources are not expandable, so if they don’t work for you, you are stuck. The fact that I cannot grab references from the WorldCat database is really disconcerting to me, because that is my number one source. All my references for books come from it. It gives the most complete metadata.

 

First, Bookends asks you to attach the PDF. You get the citation window only after the PDF is attached. That is already one extra step. After you attach the PDF, you will have the following window:

[Screenshot: the Bookends citation window, with the title field marked by a red rectangle]

This is where I am frustrated. The part I marked with the big red rectangle is supposed to display the title of the book. But it doesn’t. Therefore, if you need to enter the title, you have to do it manually. You have to remember the title, or copy it beforehand. That is strangely sluggish. Second, the sources at the lower right corner are really useless to me.

Except for Google Scholar, the rest are useless, really. I cannot modify or add a new engine either. So I am stuck. The offer is: take it or leave it. You will be happy if you are a medical student; fucked otherwise. Google Scholar is quite OK for articles, but it is one of the most incomplete sources.

Bookends has another method of downloading references, using an internal browser. But I think that one is even worse. I entered the title of an article; out of the 20 articles listed in Google Scholar, all of which Sente detected, Bookends was able to detect just two.

All in all, I think Bookends is not really polished at downloading metadata.

The difference between Sente and Bookends might seem minimal here, from the outside. But for someone who will go through the process thousands of times, even the tiniest extra step is one more pain. While I like many of the features of the application, I find it hard to adopt it as my main reference manager because of this problem. So I am contemplating either staying with Sente to the last breath of the app, or checking out other alternatives, probably Papers 3.

 

2017-02-05: update

  • Bookends has now included further sources for data extraction, and I am now using Bookends as my main reference manager. Even if it has some weaknesses on the reading side, it turns out to be one of the most complete reference managers out there. I have noted my observations here.

Is Sente abandoned?

Sente has been my favorite reference manager for the last couple of years. It has the most elegant reading interface; its annotation and quotation features are incomparable to those of any other PDF reader, let alone reference manager. I enjoyed every bit of the time I spent with Sente. Importing reference data, and downloading the PDF files alongside, has never been as great. Unlike any other reference manager in either the Windows or the Mac environment (I have tried many of them), Sente allows downloading references from a very wide variety of sources. Its targeted browsing has been of the utmost service to me. I really love how the application is designed and how it is all implemented. Sente is an extremely well-crafted application, much better than Papers and Bookends in many aspects.

But, unfortunately, there has been no update of any kind from Third Street Software for the last few days. They shut down the blog and stopped replying to emails. There are also some internal rumors that Sente might not see further development. I am truly worried that Sente is vanishing into nonexistence, along with all the time I spent organizing my library and all the annotations and notes I made. I don’t know how to live without it. This is the very sad part of proprietary software; the end is always ugly.

It also makes me wonder what kind of person would develop such a polished application for years and ultimately abandon it. They were developing it for iOS quite recently. There must be something seriously wrong!

 

 

Replace Tinderbox with Scapple

Tinderbox has been very proud of its mapping features. That feature is indeed the main selling point of the application. But I just realized that one doesn’t need to go through the pain of learning an application as complex as Tinderbox to gain these benefits. Scapple, a little sister of Scrivener, can do the mapping of ideas at a small fraction of the cost and with the gentlest learning curve. It is also better than mind mapping applications, mainly because it is portable (that is, you can just drag your notes to Scrivener), and the rigid outlining forced on you by mind mapping applications doesn’t exist in Scapple. It is just a free, clean piece of paper. Write, and connect your ideas the way you want them to flow.

 

Go and try it. You don’t have to worry about learning a complex, arcane piece of software to graph your ideas.

 

Why you need to split your big PDF books

I have one secret tool with which I beat all my classmates when it comes to digging out the nitty-gritty of small pieces of information.

When we discuss some issue with professors or classmates, we sometimes come up with wild ideas. We ponder them and ask whether anybody else has thought of them before us. What my classmates usually do is google. I also sometimes google an idea to see if anybody thought of it before us (me). But the fact of the matter is, Google returns a lot of noise with the same keywords and has little to offer on the very specific information I am looking for.

That is where an internal database comes to the rescue. I collect as many books and articles as I can on my disk so that I can dig into them whenever I want to find specific ideas. The concept is known as “text mining” in a different camp of linguistics.

Right now, I have over 2000 books and articles on my disk, all of which deal with theoretical linguistics.

If you have a collection of books and articles like mine, and have tried searching it for a specific phrase using Alfred, Spotlight or Devonthink, you will immediately learn that the biggest book always comes out on top regardless of the quality of the material in it. The reason behind it is the word count. The larger the book, the more likely it is to contain the queried word multiple times. Especially if your collection contains gigantic encyclopedias, there is no chance that a short article comes out on top of your search results, however relevant the article may be.

Therefore, to make each small article as competitive as any other material, so that your search tools can pick up the small articles whenever they are relevant, you need to split the books into article-sized pieces.
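The effect is easy to see with toy numbers. The following Python sketch assumes, purely for illustration, that a search tool ranks by raw hit count; I do not know the actual ranking formulas of Spotlight, Devonthink or FoxTrot. The data and the function names `count_hits` and `split_into_chunks` are made up.

```python
def count_hits(words, term):
    """Count how many times `term` occurs in a list of words."""
    return sum(1 for word in words if word == term)


def split_into_chunks(words, chunk_size):
    """Cut a list of words into consecutive chunks of `chunk_size` words."""
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


# Toy data: a focused 500-word article and a 10,000-word book
# that mentions the term only in passing, twice per 500 words.
article = ["shift"] * 12 + ["filler"] * 488
book = (["shift"] * 2 + ["filler"] * 498) * 20

raw_hits = {
    "article": count_hits(article, "shift"),
    "book": count_hits(book, "shift"),
}
print(raw_hits)  # {'article': 12, 'book': 40}

# After splitting the book into article-sized chunks,
# no single chunk beats the focused article any more.
chunks = split_into_chunks(book, 500)
best_chunk_hits = max(count_hits(chunk, "shift") for chunk in chunks)
print(len(chunks), best_chunk_hits)  # 20 2
```

Unsplit, the book wins 40 hits to 12; once it is split, its best chunk has only 2 hits and the article rises to the top.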

I have experimented with different tools for splitting my books, from Apple’s own Automator to a number of Python and shell scripts. Most of them work by bursting the book into single pages.

Bursting a book into single pages can be feasible when you have fewer than 1000 books. As your collection grows, bursting creates too many files to manage. In addition, single pages don’t contain enough material to read within the search results (FoxTrot, for me). That is where splitting into 10-15 page (article-size) ranges becomes crucial.

Right now, I have a shell script that breaks my books down into 10-page ranges, a script that I adapted from a South African guy (I forget his name; I met him on Academia.edu).
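The shell script itself is not reproduced here. As a rough stand-in, this Python sketch shows only the arithmetic behind the splitting: which page ranges a book of a given length would be cut into. The function names and the file-naming pattern are hypothetical, and the actual cutting of the PDF would still have to be done by whatever PDF tool you use.

```python
def page_ranges(total_pages, chunk_size=10):
    """Return inclusive, 1-based (start, end) page ranges covering the book."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


def chunk_name(stem, start, end):
    """Build a file name for one chunk (hypothetical naming pattern)."""
    return f"{stem}_p{start:04d}-{end:04d}.pdf"


for start, end in page_ranges(25):
    print(chunk_name("Book", start, end))
# Book_p0001-0010.pdf
# Book_p0011-0020.pdf
# Book_p0021-0025.pdf
```

The zero-padded page numbers keep the chunks in reading order when the folder is sorted by name.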

Any book or article I add to the Sente library gets copied directly to another folder (using Hazel) and gets split into article-size ranges. All the pieces finally move to another folder for ultimate archival, which my search tools such as Devonthink and FoxTrot index.

I will come back to the full workflow and the scripts I use to achieve the task in another post.

Workflow with Sente, Devonthink, Scrivener using Hazel and Dropbox as glue: part 2

On Mirroring

In this second post, I am going to talk about a method, rather than a tool (software). I call the method “mirroring”. The method is a complementary approach to syncing. I generally like syncing files across my Macs and iOS devices. The problem is that syncing is possible only when the app developers offer it. With Sente, for example, you can sync your Sente library to Sente on iOS. But you cannot do so to other applications such as Devonthink or Scrivener. The tags in Sente are not visible in Finder, and the notes and annotations are all specific to the application. It is a locked application in that sense. Most reference managers are locked-down applications, unfortunately. It would be wise to avoid them, but they facilitate the workflow.

 

Therefore, since I am relying on Sente and other locked applications for my workflow, mirroring is a way around the locking weakness. What do I mirror? I mirror my projects.

My work is project based. I move from one project to another; writing small articles and developing small pieces of work for my dissertation is what I am doing, and will be doing for the next two years. I have already talked about how I organize my PDF files based on projects. How do I mirror that? I mirror my project inside Sente to Finder by creating a folder. For example, if I am working on a project called “Object Shift”, I will have a tag in Sente with the same name. All the PDF files that I need to read will be tagged “Object Shift”. Look at the following picture: [Picture: the “Object Shift” tag in Sente] When I double click the tag, Sente hooks me into what I call the project mode. The project mode is my favorite mode for reading in Sente. It also helps me see the relationships and differences between the papers. [Picture: the project mode in Sente]

 

Now I have all the papers I believe are important for the project. I then read and annotate them as fast as I can, and export the annotations to a folder in Finder. The folder I create inside Dropbox is a mirrored folder, with the same name as the tag. The folder itself is inside a big folder called “Projects”, which itself is inside Dropbox. That mirrored folder (“Object Shift”) is where I keep all the notes I export from Sente as well as the TeX file I will finally compile into a finished paper. The “Projects” folder is indexed inside Devonthink. Therefore, anything I add inside “Object Shift” is automatically available inside DT. Now, you see, I am in good shape. My project files are in a separate folder inside Dropbox, but still in communication with the rest of my files inside Devonthink. The next step is to develop a dozen search algorithms (smart groups) inside DT that will hunt down all the files relevant to my topic. File selection and grouping in Sente is manual. Grouping inside DT is automatic. There are both pros and cons to the manual and automatic approaches to grouping files for a project. I combine the two to get the best results.

 

As I have mentioned, I have “Object Shift” inside Sente, Dropbox (a folder) and Devonthink (indexed). I also open a project under the same name inside Scrivener (I use it for some projects), and keep a paper folder labeled with the same name where I put all the printed papers relevant to the project. That is mirroring.
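I create the mirrored folders by hand, but the Finder side of the habit is simple enough to sketch in Python. This is an illustration only: the function name `mirror_project` is made up, the “Projects” folder name follows the layout described above, and a temporary directory stands in for the real Dropbox folder so that the example can run anywhere.

```python
import tempfile
from pathlib import Path


def mirror_project(dropbox_root, project_name):
    """Create (or reuse) the folder that mirrors a Sente tag of the same name."""
    folder = Path(dropbox_root) / "Projects" / project_name
    # exist_ok=True makes the call safe to repeat for an existing project.
    folder.mkdir(parents=True, exist_ok=True)
    return folder


# A temporary directory plays the role of the Dropbox folder here.
with tempfile.TemporaryDirectory() as fake_dropbox:
    folder = mirror_project(fake_dropbox, "Object Shift")
    existed = folder.is_dir()
    relative = folder.relative_to(fake_dropbox).parts

print(existed, relative)  # True ('Projects', 'Object Shift')
```

The only rule that matters is that the folder carries exactly the same name as the tag, so the same search finds the project in every application.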

It is a way of organizing myself wherever syncing is not available globally.

I mirror not only the projects and folders, but also the Statuses. The Statuses that I assign in Sente, demonstrated in the first post, are used across the board: inside Devonthink, Finder (Path Finder), Scrivener and even on printed papers and books. Their application to printed materials is actually quite interesting. I was reading a book titled “How to Read a Book”. In that book, the authors have a notion called x-raying the book. X-raying a book means going through its major sections and examining the organization of the topics to evaluate them for your purpose. It is a very effective method. I have developed the habit of examining the table of contents, the sections and the sub-sections of books before I read them. As soon as I finish examining the book, which takes just 2 minutes, I assign my statuses to the sections, with small notes, by attaching small stickers to them. That way, I make sure that I read the “Must Read” sections and skip the “Repelling” sections (too much detail or digression), etc. As one can see from its multiple applications (on folders, projects, books and articles), mirroring is really a habit, a useful habit for getting things done.

You can make it your habit too.

 

Workflow with Sente, Devonthink, Scrivener using Hazel and Dropbox as glue: part 1

Since I want to follow information on the internet about a few applications that I am interested in, I have set up Google Alerts to deliver blog posts and articles to my email inbox. Under my Devonthink tag, today I got this small visual workflow from a website called Pinterest.

[Image: the workflow illustration from Pinterest]

I have never visited Pinterest before, but the visual illustration looks beautiful. There is no explanation of how to build the workflow in the visual maps, but the illustration is elegant, something I have been planning to design myself. Since this is the workflow I have already been using for the last year, it seems a good idea to write down how I put these different apps to work together to get a “virtually perfect” workflow for my research (the PhD dissertation I am starting soon). I have already written short articles in my previous posts on some of the connections I made between these four majestic apps. In this extended post, I will demonstrate how I developed my workflow using Sente, Devonthink, Scrivener, nvALT, xMind, Sublime Text and ultimately LaTeX. Hazel, Automator, Dropbox and Keyboard Maestro are the core glue of the workflow. I will spell out how each app works together with the other applications to create a coherent and elegant workflow.

Now, have a look at a pictorial overview of the workflow we will have by the end of this series.

[Figure: research workflow overview]

In the next few posts of the series, I will try to explain each step of the workflow: how I developed the connections and how the chosen tools talk to each other to build a solid workflow. (I didn’t represent Hazel in this drawing, mainly because of the kind of tasks Hazel performs. The role of Hazel will be clear by the end of the post.)

Let me start from the app I use to gather resources for research: Sente.

# Sente

For me, the heaviest organizational burden is lifted by Sente. Sente has four crucial features which are the life and soul of my workflow: Targeted browsing, File renaming, QuickTags and Status. I will explain how I use each of these features to download, store, rename and organize my PDF files, and then how I incorporate the organized PDFs into the Devonthink hemisphere using Hazel as glue.

I am not going to explain the whole importing and referencing process in Sente. That is left to the user manual; my task here is to show how to use what I called above the crucial features to get things done.

Let me start from setting up the Sente library:

## File naming

The library is set up to store PDF attachments inside the Sente bundle. Setting up the library inside the bundle is very important for syncing the library to the iPad via the Sente server.

The attachments will also be named as:

[First author Last name] [Year of publication] [Title of publication]

The file naming is important because I find some of the PDF files hard to read inside Sente; hence, I have to open them in Acrobat. If you have the files properly named based on the author and title, getting to a file is just a matter of seconds, especially if you are using Alfred.
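Sente does the renaming itself, so no code is needed for it. Still, the naming pattern is easy to pin down with a small Python sketch. The function `attachment_name` is hypothetical, the example reference is made up, and the cleaning of characters that are awkward in file names is my own assumption, not Sente’s documented rule.

```python
import re


def attachment_name(last_name, year, title, extension=".pdf"):
    """Build '[First author last name] [Year] [Title]' as a file name."""
    # Assumption: replace characters that many file systems dislike with a space.
    clean_title = re.sub(r'[\\/:*?"<>|]', " ", title)
    # Collapse any runs of whitespace left behind by the replacement.
    clean_title = " ".join(clean_title.split())
    return f"{last_name} {year} {clean_title}{extension}"


# A made-up reference, only to show the resulting pattern.
print(attachment_name("Doe", 2014, "Object Shift: A Case Study"))
# Doe 2014 Object Shift A Case Study.pdf
```

With names like this, typing the author or a word from the title into a launcher is enough to find the file.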

## Targeted browsing

After you set up your library, the next step is to search for files and download or import them into your Sente library. There are two ways of getting your PDF files into the Sente library, both of which support setting up the reference of the file on the spot. The first method is downloading directly from the internet. Sente has this wonderful feature called targeted browsing. The main advantage of targeted browsing is that you have a choice of downloading bibliographic information from a plethora of websites; you are not limited to Google Scholar or PubMed (the main weakness of Mendeley, by the way, is the absence of such a choice; it can download only from Scholar and PubMed). Even if Google Scholar is one of the greatest data sources on the internet, the data you retrieve from it are usually incomplete. Many people like PubMed. But for my field, PubMed is irrelevant. Therefore, for articles, I haven’t found any better source than Google Scholar. To get both the reference and the PDF file from Google Scholar into your library, you first import the reference information by clicking the red button on the Scholar website, and then download the related PDF file. I am sure you know how to do this; I don’t need to explain it in depth.

The second approach is to have the PDF file on your disk, and then drag it or import it into your library. Sente will present you with a window for adding reference information to your PDF file. At this point, what I usually do is highlight the title of the PDF file and take the title to my favorite search engines. For research articles, there is no better choice than Scholar.

As for books, thanks to their ISBN numbers, there are a lot of choices, WorldCat being one of the most popular. I used to retrieve the data for books from WorldCat for a while. But after some time, I learned that it sometimes confuses the Affiliations metadata with the Author. Therefore, I have been looking for alternative sources of bibliographic information for books. Google Books is quite good, but Sente’s targeted browsing seems to have some difficulty retrieving data from Google Books; it takes a longer time. Finally, to my surprise, I discovered that the Stanford library has the cleanest data, and Sente is very happy with it.

The templates you develop to modify targeted browsing in Sente are called Autolink Templates. Here is how I set up my Autolink Templates:

[Screenshot: Autolink Templates in Sente]

 

## Quicktags

Quicktags are the tools for organizing your research resources into groups. I have three major classes of Quicktags:

a) the Class: this is the class of quicktags that I assign according to a PDF’s inherent classes. I assign these tags to classify a paper into the basic inherent classes of my field, Linguistics. Linguistics has many sub-branches of study and sub-topics of research. Therefore, whenever I download a PDF paper, I assign these classes to the paper so that I can easily search for it whenever I am looking at certain sub-topics. For example, I use tags like Syntax, Semantics, Pragmatics, Phonology, plus sub-topics such as VP, PP, RC, DP under Syntax. I also have hierarchies of tags for the families of languages that I am interested in.

Afro-Asiatic [Semitic[Classical[Arabic, Hebrew]][Modern]] etc

b) the Project: this is a group of tags that I assign to papers on a project basis. The projects are usually transient tasks that I plan and finish before moving on to the next project. They can be part of a specific Class; they can also run across many classes. I assign these transient tags to the papers I am downloading, or to those which are already in my library. I search my library, Google Scholar and many other sources, and use these Project tags to combine all the resources I need to finish a project.

c) the third group of tags is what I call meta tags. They are organizational tags. Whenever I print a PDF file, I assign the tag xPrint; whenever I am reading the paper in Acrobat Reader, I assign a tag called xAcrobat; and when I have finished reading the PDF and exporting its annotations, I mark it xExport.

The Meta tags are supporters while Project tags are brothers of the Statuses, which I will explain in a moment.
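The language-family hierarchy from class (a) above can also be written down as a nested structure. This Python sketch is only a way of visualizing it; it is not how Sente stores tags, and the function name `tag_paths` and the " > " separator are my own choices.

```python
# The hierarchy Afro-Asiatic [Semitic[Classical[Arabic, Hebrew]][Modern]]
# written as nested dictionaries; an empty dict marks a tag with no children.
LANGUAGE_TAGS = {
    "Afro-Asiatic": {
        "Semitic": {
            "Classical": {"Arabic": {}, "Hebrew": {}},
            "Modern": {},
        }
    }
}


def tag_paths(tree, prefix=()):
    """List every tag with its full path from the top of the hierarchy."""
    paths = []
    for name, children in tree.items():
        path = prefix + (name,)
        paths.append(" > ".join(path))
        # Recurse into the children, carrying the path so far.
        paths.extend(tag_paths(children, path))
    return paths


for line in tag_paths(LANGUAGE_TAGS):
    print(line)
# Afro-Asiatic
# Afro-Asiatic > Semitic
# Afro-Asiatic > Semitic > Classical
# Afro-Asiatic > Semitic > Classical > Arabic
# Afro-Asiatic > Semitic > Classical > Hebrew
# Afro-Asiatic > Semitic > Modern
```

Seeing the full paths makes it clear that tagging a paper "Hebrew" also places it under Classical, Semitic and Afro-Asiatic.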

## Statuses

I basically use Statuses as meta-tags and to track the progress of projects. As you can see from the following snapshot, I have about 12 Statuses that I assign to my PDF files:

[Screenshot: Statuses in Sente]

  • To be read next: for example, a status that I assign to a paper that I want to read immediately after I finish the current paper.
  • Must read: another status, for a paper that needs to be read by hook or by crook before I finish my PhD; a paper that I believe can offer a significant and profound insight for my research.
  • Repelling: on the negative side, a paper that is hard to read or written in bad language; the point is that I am dropping that paper and might delete it from my library completely.

You get the idea.