-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathREADME.Rmd
72 lines (54 loc) · 4.41 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
---
output:
rmarkdown::github_document
---
```{r, echo=FALSE}
knitr::opts_chunk$set(
collapse = FALSE,
comment = "##",
fig.path = "images/",
dpi = 150,
fig.height = 5,
fig.width = 10
)
```
# LSS: Semi-supervised algorithm for document scaling
<!-- badges: start -->
[![CRAN
Version](https://www.r-pkg.org/badges/version/LSX)](https://CRAN.R-project.org/package=LSX)
[![Downloads](https://cranlogs.r-pkg.org/badges/LSX)](https://CRAN.R-project.org/package=LSX)
[![Total
Downloads](https://cranlogs.r-pkg.org/badges/grand-total/LSX?color=orange)](https://CRAN.R-project.org/package=LSX)
[![R build
status](https://github.com/koheiw/LSX/workflows/R-CMD-check/badge.svg)](https://github.com/koheiw/LSX/actions)
[![codecov](https://codecov.io/gh/koheiw/LSX/branch/master/graph/badge.svg)](https://app.codecov.io/gh/koheiw/LSX)
<!-- badges: end -->
In quantitative text analysis, the cost of training supervised machine learning models tend to be very high when the corpus is large. Latent Semantic Scaling (LSS) is a semi-supervised document scaling technique that I developed to perform large scale analysis at low cost. Taking user-provided *seed words* as weak supervision, it estimates polarity of words in the corpus by latent semantic analysis and locates documents on a unidimensional scale (e.g. sentiment).
## Installation
From CRAN:
```{r, eval=FALSE}
install.packages("LSX")
```
From Github:
```{r, eval=FALSE}
devtools::install_github("koheiw/LSX")
```
## Examples
Please visit the package website to understand the usage of the functions:
- [Introduction to LSX](https://koheiw.github.io/LSX/articles/pkgdown/basic.html)
- [Application in research](https://koheiw.github.io/LSX/articles/pkgdown/research.html)
- [Selection of seed words](https://koheiw.github.io/LSX/articles/pkgdown/seedwords.html)
Please read the following papers for the algorithm and methodology, and its application to non-English texts (Japanese and Hebrew):
- Watanabe, Kohei. 2020. ["Latent Semantic Scaling: A Semisupervised Text Analysis Technique for New Domains and Languages"](https://www.tandfonline.com/doi/full/10.1080/19312458.2020.1832976), *Communication Methods and Measures*.
- Watanabe, Kohei, Segev, Elad, & Tago, Atsushi. (2022). ["Discursive diversion: Manipulation of nuclear threats by the conservative leaders in Japan and Israel"](https://journals.sagepub.com/doi/full/10.1177/17480485221097967), *International Communication Gazette*.
## Other publications
LSS has been used for research in various fields of social science.
- Nakamura, Kentaro. 2022 [Balancing Opportunities and Incentives: How Rising China’s Mediated Public Diplomacy Changes Under Crisis](https://ijoc.org/index.php/ijoc/article/view/18676/3968), *International Journal of Communication*.
- Zollinger, Delia. 2022 [Cleavage Identities in Voters’ Own Words: Harnessing Open-Ended Survey Responses](https://onlinelibrary.wiley.com/doi/10.1111/ajps.12743), *American Journal of Political Science*.
- Brändle, Verena K., and Olga Eisele. 2022. ["A Thin Line: Governmental Border Communication in Times of European Crises"](https://onlinelibrary.wiley.com/doi/full/10.1111/jcms.13398) *Journal of Common Market Studies*.
- Umansky, Natalia. 2022. ["Who gets a say in this? Speaking security on social media"](https://journals.sagepub.com/doi/10.1177/14614448221111009). *New Media & Society*.
- Rauh, Christian, 2022. ["Supranational emergency politics? What executives’ public crisis communication may tell us"](https://www.tandfonline.com/doi/full/10.1080/13501763.2021.1916058), *Journal of European Public Policy*.
- Trubowitz, Peter and Watanabe, Kohei. 2021. ["The Geopolitical Threat Index: A Text-Based Computational Approach to Identifying Foreign Threats"](https://academic.oup.com/isq/advance-article/doi/10.1093/isq/sqab029/6278490), *International Studies Quarterly*.
- Vydra, Simon and Kantorowicz, Jaroslaw. 2020. ["Tracing Policy-relevant Information in Social Media: The Case of Twitter before and during the COVID-19 Crisis"](https://www.degruyter.com/document/doi/10.1515/spp-2020-0013/html). *Statistics, Politics and Policy*.
- Watanabe, Kohei. 2017. ["Measuring News Bias: Russia's Official News Agency ITAR-TASS’s Coverage of the Ukraine Crisis"](http://journals.sagepub.com/eprint/TBc9miIc89njZvY3gyAt/full), *European Journal Communication*.
More publications are available on [Google Scholar](https://scholar.google.com/scholar?oi=bibs&hl=en&cites=5312969973901591795).