PatristicTextArchive/analyse_data

Analyse PTA with Jupyter Notebooks

This repository contains several Jupyter Notebooks to explore the Patristic Text Archive beyond the tools provided in the web frontend.

This repository contains the following notebooks:

  • pta_nlp_cld.ipynb (using CLD)
  • pta_nlp_spacy.ipynb (using grc_proiel_sm, cf. greCy: Ancient Greek models for spaCy)
  • keywords-in-context.ipynb
  • collocations.ipynb
  • analyse_corpus.ipynb (using TF-IDF)
  • biblical_quotations.ipynb (using severian_quotes.json and the pta_metadata repository)
  • convert_pta_totext.ipynb: helper notebook to convert PTA-XML to a CSV file
  • lemmatize_all.ipynb: helper notebook to lemmatize all text generated by convert_pta_totext.ipynb
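
To give an idea of what a keywords-in-context analysis does, here is a minimal sketch in plain Python. It is not taken from keywords-in-context.ipynb; the function name `keywords_in_context`, the `window` parameter, and the sample sentence are all assumptions made for illustration. Text is normalized to NFC first so that polytonic Greek letters are tokenized as whole words.

```python
import re
import unicodedata


def keywords_in_context(text, keyword, window=3):
    """Return (left, match, right) tuples for each occurrence of keyword.

    Hypothetical helper for illustration only; the notebook in this
    repository may work differently.
    """
    # NFC keeps accented Greek letters as single code points that \w matches.
    text = unicodedata.normalize("NFC", text)
    keyword = unicodedata.normalize("NFC", keyword).lower()
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Illustrative sample sentence (not taken from the repository's data).
sample = "ἐν ἀρχῇ ἦν ὁ λόγος καὶ ὁ λόγος ἦν πρὸς τὸν θεόν"
for left, match, right in keywords_in_context(sample, "λόγος", window=2):
    print(f"{left:>20} [{match}] {right}")
```

Matching here is on exact (case-insensitive) word forms; searching the lemmatized text instead would also catch inflected forms.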

Data for use in notebooks

in folder data

  • DejaVuSans.ttf (to be able to use Greek Extended in wordclouds)
  • severian_plaintext.csv as generated by convert_pta_totext.ipynb
  • severian_plaintext_lemmatized.csv as generated by lemmatize_all.ipynb
  • severian_quotes.json as generated by convert_pta_totext.ipynb
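
The CSV files can also be inspected outside the notebooks. The following is a minimal sketch using only the standard library; the helper name `preview_csv` is hypothetical, and the comma delimiter and UTF-8 encoding are assumptions, since the column layout is not documented here. It prints whatever header names it finds rather than assuming any.

```python
import csv
import itertools
import os

# Assumption: cells may hold whole texts, so raise the default field size limit.
csv.field_size_limit(10**9)


def preview_csv(path, n=3):
    """Return the header names and the first n rows of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)  # assumes a header row and "," delimiter
        rows = list(itertools.islice(reader, n))
        return reader.fieldnames, rows


# Path taken from the file list above; skipped if the file is not present.
path = os.path.join("data", "severian_plaintext.csv")
if os.path.exists(path):
    columns, rows = preview_csv(path)
    print(columns)
    for row in rows:
        print(row)
```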

in folder assets
