  1. Learn to preprocess data exported from bibliography management software

  2. Learn to build R functions to extract metadata from the Crossref API

  3. Learn to visualize metadata to understand trends across different variables

Manually looking up the metadata for each academic paper is laborious. Crossref offers a way to retrieve the metadata for a whole bibliography at once: through its API, you can extract the metadata for tens of thousands of papers in a single run. By the end of this project, learners will be able to create their own tailored R function to retrieve paper metrics from the Crossref API. The function, which is built step by step, can easily be reused when new articles are added or when learners want the most up-to-date metrics. In this guided project, the instructor walks learners through understanding the Crossref API, tailoring an R function, and wrangling the bibliography dataset. A good grasp of this method makes it convenient to analyze bibliographic metrics across different fields, such as impact and number of collaborators.

  1. Task 1: Understand Crossref API and Import Libraries

  2. Task 2: Import Raw Data and Conduct Basic Data Wrangling

  3. Task 3: Build a Function to Extract General Crossref Information

  4. Task 4: Tailor Variables From the General Crossref Function Output

  5. Task 5: Build a Function to Extract Citation Counts

  6. Task 6: Combine Previous Functions Into an Upper-Level Function

  7. Task 7: Connect Crossref Output to the Original Dataset and Visualize the Data
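
To give a feel for what Tasks 3 to 6 build toward, here is a minimal sketch of an R function that looks up one paper by DOI. It is not the function developed in the project: the names `extract_fields` and `get_crossref_metadata` are hypothetical, and the choice of the `httr` and `jsonlite` packages is an assumption (both must be installed). The sketch relies on the public Crossref REST endpoint `https://api.crossref.org/works/{doi}` and its `is-referenced-by-count` field for the citation count.

```r
# Hypothetical helper: keep a few fields from one parsed Crossref "message".
# Missing fields become NA instead of raising an error.
extract_fields <- function(msg) {
  cites <- msg[["is-referenced-by-count"]]
  data.frame(
    doi = if (is.null(msg$DOI)) NA_character_ else msg$DOI,
    title = if (length(msg$title) == 0) NA_character_ else msg$title[[1]],
    n_authors = length(msg$author),  # 0 when no author list is present
    citation_count = if (is.null(cites)) NA_integer_ else as.integer(cites),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: query the Crossref REST API for a single DOI and
# return a one-row data frame. Requires network access.
get_crossref_metadata <- function(doi) {
  url <- paste0("https://api.crossref.org/works/", utils::URLencode(doi))
  resp <- httr::GET(url)
  httr::stop_for_status(resp)  # stop on HTTP errors such as 404
  parsed <- jsonlite::fromJSON(
    httr::content(resp, as = "text", encoding = "UTF-8"),
    simplifyVector = FALSE       # keep nested lists, e.g. one entry per author
  )
  extract_fields(parsed$message)
}
```

For a vector of DOIs taken from a reference-manager export, the one-row results could be stacked with `do.call(rbind, lapply(dois, get_crossref_metadata))`, where `dois` is a hypothetical character vector. Keeping the field extraction separate from the HTTP call makes it easy to tailor which variables are kept without touching the request logic.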

