Finding a free and effective academic database/search engine
Key points:
- I lost access to academic databases and search engines through my university, meaning I can no longer use my standard literature search method of Web of Science + Google Scholar.
- I searched for replacements.
- My solution is OpenAlex + Semantic Scholar + Google Scholar, perhaps complemented by some field-specific databases when appropriate.
Recently, my university library cut off my access to Web of Science as an alumna. I emailed them to find out why, and the answer seems to be that alumni were never supposed to have access to Web of Science in the first place!
To do my work effectively, I need a searchable academic database that a) covers many, many records and b) supports advanced search in some form. Without such a database, I feel like I’m fighting with my hands tied behind my back.
Fortunately, we are living in interesting times. There are many new tools emerging that did not exist even 5 years ago, when I was finishing my training as a researcher.
As a benchmark, Web of Science indexes roughly 200M records. I need a database, or a combination of databases, that achieves this level of coverage of the literature. I often complement my main database with additional searches of, for example, grey literature, non-English literature, or the wonderful catch-all search engine that is Google Scholar. I also need the tool to be cheap or free; my small, grant-funded organisation doesn’t have a small fortune readily available to pay for database access.
I am only interested in the search engine functionality. I do not necessarily need access to the papers’ full texts, as I can build a separate system to obtain those. I also have no interest in the million additional features offered as part of many “innovative” AI-based software tools. I don’t need a full clunky ecosystem; I need an effective search engine.
Here are a few AI-based search engines that seem popular (I’m ignoring the other functions of these tools, including automatically generating summaries of the literature, as such functions are often useless for me):
- Semantic Scholar: ~200M records across many key publishers (e.g. Wiley, Sage, Science, Springer) and preprint servers
- Consensus: ~200M records; as it turns out, this search engine actually uses the Semantic Scholar corpus
- Elicit: as it turns out, this search engine actually uses the Semantic Scholar corpus
So, I’ll use Semantic Scholar. The search functionality isn’t amazing: it doesn’t support wildcard or boolean searches. For this reason, I might end up complementing it with one of the other AI-based search engines (Consensus or Elicit) in the future.
I also found OpenAlex, which appears to be a purpose-built, not-for-profit replacement for the big databases like Web of Science. OpenAlex reports having ~240M records, which it compiles itself from a variety of sources. This means that OpenAlex is probably the best like-for-like Web of Science replacement; moreover, the fact that it doesn’t use the Semantic Scholar corpus means that these two search engines might complement each other well.
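For anyone who wants to combine the two programmatically, here is a minimal sketch in Python. It builds search URLs for the public OpenAlex and Semantic Scholar APIs and merges two result lists by DOI so that overlapping records are only counted once. The function names and the flattened record shape (a dict with "title" and "doi" keys) are my own assumptions for illustration, not something either service prescribes; the real API responses would need to be mapped into that shape first.

```python
from urllib.parse import urlencode

# Public search endpoints of the two services.
OPENALEX_WORKS = "https://api.openalex.org/works"
SEMANTIC_SCHOLAR_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"


def openalex_url(query, per_page=25):
    """Build an OpenAlex works search URL (helper name is hypothetical)."""
    return f"{OPENALEX_WORKS}?{urlencode({'search': query, 'per-page': per_page})}"


def semantic_scholar_url(query, limit=25):
    """Build a Semantic Scholar paper search URL (helper name is hypothetical)."""
    params = {"query": query, "limit": limit, "fields": "title,year,externalIds"}
    return f"{SEMANTIC_SCHOLAR_SEARCH}?{urlencode(params)}"


def normalise_doi(doi):
    """Lower-case a DOI and strip common prefixes so both sources compare equal."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def merge_by_doi(*result_lists):
    """Merge result lists, keeping the first record seen for each DOI.

    Assumes each record is a dict with "title" and "doi" keys (my own
    simplified shape). Records without a DOI are kept and appended at the
    end, because they cannot be deduplicated reliably.
    """
    merged = {}
    no_doi = []
    for results in result_lists:
        for record in results:
            doi = normalise_doi(record.get("doi"))
            if doi is None:
                no_doi.append(record)
            elif doi not in merged:
                merged[doi] = record
    return list(merged.values()) + no_doi
```

Deduplicating on DOI is a deliberately simple choice. It misses records that only one source has a DOI for, so a more careful version might also compare normalised titles.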
I’ll probably complement these two with Google Scholar, as I have been doing already. Google Scholar has less versatile search functionality, but it has enormous coverage (~100M records). I might also add field-specific databases that are relevant for my work, including AgEcon Search, EconBiz, and Open Access Theses and Dissertations.