TF-IDF and LDA applied to extracting trends in innovation from patent data
The full manuscript can be found in "Innovation hotspots in anaerobic food waste treatment.pdf".
Full Jupyter notebooks with the R code can be found in:
- FW_NLP_Final (food waste patents)
- Biogas_NLP_Final (biogas patents)
- AD_NLP_Final (AD patents)
In addition, all the text data can be found in the .csv files:
- fw1.csv (food waste patent text data)
- biogas1.csv (biogas patent text data)
- ad1.csv (AD patent text data)