Over the past five days I have received several positive comments on version 1.0 of the analysis… so I’ve been thinking… what could the next version be?
The most practical option would have been to work down the list of terms in the word count, like “from chart position #11 to #20”… but things get fuzzier once you move beyond the top terms, because ambiguity and semantics start to play a heavier role, and a stronger semantic engine would be required.
Thinking about parts of the narrative which are less ambiguous, I decided to consider the temporal components of the Encyclical Letter, namely… years. These appear both in the actual text and in the references at the end.
As with the previous version, the caveat is that the text mining algorithm has an easy job identifying single years (and will detect nouns related to time). However, it will not “understand” relative time references such as “five years after the publication of document X”.
The chart shows the number of occurrences of a given year in the text.
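For readers who want to try something similar, here is a minimal sketch of how single years could be counted in a text. It is not the algorithm used for the chart: the regular expression, the function name count_years and the sample sentence are all assumptions made up for illustration. Note how the relative reference (“five years after…”) is simply not picked up.

```python
import re
from collections import Counter

# Assumption: a "year" is a standalone four-digit number between 1000 and 2099.
# The lookarounds stop the pattern from matching inside longer digit runs.
YEAR_PATTERN = re.compile(r"(?<!\d)(1\d{3}|20\d{2})(?!\d)")


def count_years(text):
    """Return a Counter mapping each year (as an int) to its number of occurrences."""
    return Counter(int(match) for match in YEAR_PATTERN.findall(text))


# Made-up sample sentence, not taken from the Encyclical Letter.
sample = (
    "The 1992 declaration was recalled in 2012, "
    "five years after the publication of document X. "
    "A further statement followed in 2012."
)

year_counts = count_years(sample)
for year in sorted(year_counts):
    print(year, year_counts[year])
```

A pattern like this will also count four-digit numbers that are not years (page numbers, paragraph numbers), so the results still need a manual check.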
The trend clearly shows that references to years, which are typically the dates of documents, protocols, and events, increase substantially over time. At the same time, we can observe phases, typically around the middle of a decade, where the number of references tends to go down.
Is this because there was less concern for the environment at these moments in time, or because the Vatican considers these phases less relevant to its vision of the Care of the Common Home?
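One way to check the mid-decade dip beyond eyeballing the chart would be to add up the counts by position within the decade (years ending in 0, 1, … 9). The sketch below assumes the per-year counts are already available as a dictionary; the function name and the toy numbers are hypothetical and are not the values behind the chart.

```python
from collections import defaultdict


def counts_by_decade_position(year_counts):
    """Sum occurrence counts by the last digit of the year (0 to 9)."""
    totals = defaultdict(int)
    for year, count in year_counts.items():
        totals[year % 10] += count
    # Sort by last digit so the result reads from the start to the end of a decade.
    return dict(sorted(totals.items()))


# Toy input for illustration only, not real data from the analysis.
toy_counts = {1991: 2, 1995: 1, 2001: 3, 2005: 1, 2009: 4}
print(counts_by_decade_position(toy_counts))
```

If the middle positions (4, 5, 6) consistently show lower totals than the rest, that would support the pattern seen in the chart; with few data points it could just as well be noise.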
Let’s see what the next version of the algorithm may bring…
As with my previous article, please write to email@example.com with comments, or if you are interested in collaborating on this type of work.