- System: MongoDB 4.2.5
- Installed through Homebrew (any recent version should work with what we have)
- Data Summary:
- Loading Data & Preprocessing:
- Run `python3 preprocessing.py` in a terminal to load and preprocess the data
- Make sure numpy, pandas, and pymongo are installed
- Make sure the mongo daemon is running in a separate terminal
- Make sure the extracted csv file is in the same directory as the script
- Note: to run the preprocessing script on the shorter 1,000-data-point version, change `pd.read_csv("stage3.csv")` to `pd.read_csv("shortStage3.csv")` in the script
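
The prerequisites above can be checked before running the script. The following is a minimal sketch, not part of our submission: the helper names (`missing_modules`, `port_is_open`, `preflight`) are made up for illustration, and it assumes mongod listens on the default localhost:27017.

```python
"""Preflight checks to run before preprocessing.py (illustrative sketch)."""
import importlib.util
import socket
from pathlib import Path

# Packages the preprocessing step expects to be installed.
REQUIRED_MODULES = ["numpy", "pandas", "pymongo"]


def missing_modules(names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(csv_name="stage3.csv", script_dir="."):
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    for name in missing_modules(REQUIRED_MODULES):
        problems.append(f"missing Python package: {name}")
    if not port_is_open():
        problems.append("nothing is listening on localhost:27017 (is mongod running?)")
    if not (Path(script_dir) / csv_name).is_file():
        problems.append(f"{csv_name} not found in {script_dir}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

When checking the shorter data set, call `preflight(csv_name="shortStage3.csv")` instead.
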
- Setup Requirements (Programming Language Versions)
- Python 3 (for preprocessing)
- Third-party libraries: NumPy, pandas, PyMongo (for preprocessing)
- Java 8 (for setting up the Cloud9Agent of Knowi)
- For Knowi:
- Setup:
- Make sure the mongo daemon is running by running the `mongod` command in a terminal
- Sign up for a free trial
- Install the Cloud 9 Agent (Java 8 required, anything higher does not work)
- To start the agent, change directory to the Cloud 9 Agent folder in a new terminal window, then run `./run.sh` (Mac) or `run.bat` (Windows)
- Next, sign in to your account on Knowi
- To create a data source, go to the Data sources tab on the left-hand side of the screen, then enter localhost for Hosts and 27017 for Port. For Database Name, use gunviolence (based on our preprocessing script, that is the name of our collection). Then click Use Agent and select the API key for the agent you downloaded
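
Because the agent needs Java 8 specifically, it can save time to confirm which Java is on the PATH before starting it. The sketch below is an illustration only: the function names are made up, and it assumes `java -version` prints a line containing `version "..."`, as common JDK builds do.

```python
"""Check that the `java` on PATH is Java 8 (illustrative sketch)."""
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        return None
    parts = match.group(1).split(".")
    # Java 8 and earlier report "1.8.0_292"; later releases report "11.0.2".
    major = parts[1] if parts[0] == "1" and len(parts) > 1 else parts[0]
    digits = re.match(r"\d+", major)
    return int(digits.group()) if digits else None


def installed_java_major():
    """Run `java -version` (it prints to stderr) and parse the result."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except OSError:
        # No `java` executable could be started.
        return None
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major == 8:
        print("Java 8 found")
    else:
        print(f"Expected Java 8 but found: {major}")
```
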
- Querying & Visualizing:
- To create and run queries, go to the Queries tab and create a new query. Copy and paste our queries from Query.docx into the Mongo Query box and click Save and Run now
- On the dashboard, add a visualization by dragging a widget (generated by running the query in the previous step) onto it. To select the type of visualization, click the three dots at the top right and choose Analyze
- Using this link, you can see the visualizations we created and the specific parameters we used to set them up. Each visualization has a title and a number, corresponding to the ones listed in the query document and on the last slide of our presentation. In case the link expires after our free trial ends, we have also provided a PDF of all the visualizations, named Visualization.pdf.
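
Before pasting a query into Knowi, it can help to try a similar one against the local database with PyMongo. The sketch below is not one of the queries from Query.docx: the field name `state` is a hypothetical column used only for illustration, and using gunviolence for both the database and the collection name is an assumption.

```python
"""Build and run a count-per-value aggregation (illustrative sketch)."""


def count_by_field_pipeline(field, limit=10):
    """Build a pipeline that counts documents per distinct value of `field`."""
    return [
        {"$group": {"_id": f"${field}", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},  # most frequent values first
        {"$limit": limit},
    ]


def run_pipeline(pipeline, db_name="gunviolence", collection_name="gunviolence"):
    """Run the pipeline on the local mongod; both names are assumptions."""
    # Imported here so the pipeline builder works without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    try:
        return list(client[db_name][collection_name].aggregate(pipeline))
    finally:
        client.close()


# Example usage (needs a running mongod and loaded data):
#   for row in run_pipeline(count_by_field_pipeline("state")):
#       print(row)
```

Keeping the pipeline builder separate from the database call makes it possible to inspect the query as plain Python data before running it.
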
- Everything was written by us. We did not use GitHub code or anything else.
File Name | Description |
---|---|
shortStage3.csv | Small portion of the data used (1,000 data points) |
preprocessing.py | Script for preprocessing |
Query.docx | Doc that contains all the queries and their brief descriptions corresponding to the visualizations on the dashboard |
Visualization.pdf | A pdf of all the visualizations we created |