Converting Reddit posts and their comments to 1-1 dialogues

emorynlp/reddit-to-dialogue

Reddit-to-Dialogue

Purpose

Reddit-to-Dialogue is a tool that transforms a Reddit post and its comments into a one-to-one dialogue.

Development

This project began at Emory University in the Emory NLP Lab under the direction of Dr. Jinho Choi.

The work formed two separate undergraduate honors theses, by Daniil Huryn and Mack Hutsell, and resulted in a long paper accepted to COLING 2022.

Installation

A pip package is coming soon. Once it is published, the intended install command is `pip3 install reddit-to-dialogue`.

Usage

Input

Input data should be in JSON format, organized as defined by PRAW. See the reddit folder for an example.

Output

Dialogues are returned in the following format (see exampleoutput for an example):

[{
    "sid": "",
    "link": "",
    "title:": "",
    "text": "",
    "author": "",
    "created": unix timestamp,
    "updated": unix timestamp,
    "over_18": boolean,
    "upvotes": integer,
    "upvote_ratio": decimal value 0 - 1,
    "response": [
        "",
    ],
    "dialogue": [
        "",
    ],
    "score":
}, ]

Here, response is a list of Speaker 2 statements, and dialogue alternates between Speaker 1 and Speaker 2 statements.
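As a rough illustration of how the output might be consumed, the sketch below loads a list of records in the format above and labels each entry of dialogue with its speaker. It assumes the first entry of dialogue belongs to Speaker 1, which the format description suggests but does not state outright. The helper name `label_turns` and the sample record are hypothetical and not part of the tool.

```python
import json


def label_turns(dialogue):
    """Pair each utterance with a speaker label.

    Assumption: utterances strictly alternate and Speaker 1 goes first.
    """
    speakers = ("Speaker 1", "Speaker 2")
    return [(speakers[i % 2], utterance) for i, utterance in enumerate(dialogue)]


# Invented sample record for illustration only; real output carries more fields.
sample_output = """
[{
    "sid": "example",
    "text": "What is a good first programming language?",
    "response": ["Python is a common choice."],
    "dialogue": [
        "What is a good first programming language?",
        "Python is a common choice.",
        "Why Python?"
    ]
}]
"""

records = json.loads(sample_output)
for record in records:
    for speaker, utterance in label_turns(record["dialogue"]):
        print(f"{speaker}: {utterance}")
```

To read real output, replace `json.loads(sample_output)` with `json.load` on an open file handle.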
