<!DOCTYPE html>
<html>
<head>
<title> CS 499 Group 7: Supreme Court </title>
<meta charset="UTF-8">
<!-- Put JS headers here -->
<!-- CSS Stylesheets -->
<link rel="stylesheet" href="main.css" type="text/css">
</head>
<body>
<div id="main-page" class="container">
<div id="main-page-header" class="header">
<br>
<h1> Supreme Court Rulings Analysis </h1>
<h2>University of Kentucky</h2>
<h2>CS 499 Spring 2018</h2>
</div>
<br><br><br>
<div class="content">
<div class="links">
<a href="index.html">Home</a>
<br><br>
<a href="introduction.html">Introduction</a>
<br><br>
<a href="requirements.html">Requirements</a>
<br><br>
<a href="updates.html">Updates</a>
<br><br>
<a href="schedule.html">Schedule</a>
<br><br>
<a href="design.html">Design</a>
<br><br>
<a href="testing.html">Testing</a>
<br><br>
<a href="useCases.html"> Use Cases</a>
<br><br>
<a href="designAndImplementation.html">Design Considerations/Implementation Issues</a>
<br><br>
<a href="enhancementsMaintenance.html">Future Enhancements/Maintenance</a>
<br><br>
<a href="conclusions.html">Conclusions</a>
<br><br>
<a href="installation.html">Installation</a>
<br><br>
<a href="references.html"> References</a>
<br><br>
</div>
<div class="text">
<h2>Conclusion</h2>
<p>
The goal of this project was to create a web application that allows users to sort and analyze articles about the Supreme Court of the United States, drawn from a variety of sources the user can filter. The application also presents analytics derived from the downloaded articles, including sentiment score, magnitude, keywords, images, and the entities detected in those images. Because the articles' data is downloaded and analyzed ahead of time, a single query is comparable to running a large number of Google searches at once and reading the results yourself, so the application has the potential to save the user hundreds of hours.
</p>
<p>
Most ideas from previous groups' work on this project were adopted, including the scraping process. We improved on that work by adding more sources to draw from when collecting articles and by making the UI design more convenient. A simplified outline of the steps taken to produce the application follows, with a short sketch of the scrape-and-store step after the list:
<ul>
<li>Create the database, which holds each article's data: link, text, images, sentiment, magnitude, keywords, and entities.</li>
<li>Set up the site scrapers and configure them to run on a preset time interval (e.g., daily).</li>
<li>Host the source code and article collection files on an Amazon Web Services (AWS) Apache HTTP server, so that anyone with the link can access and use the application.</li>
</ul>
</p>
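<p>
As a rough illustration of the scrape-and-store step, the sketch below downloads a single article with Newspaper3k and inserts its text into MySQL. The table name, column names, connection details, and the choice of the pymysql client are placeholders for illustration only; the real scrapers also store images, sentiment, magnitude, keywords, and entities.
</p>
<pre><code>
# Minimal sketch of the scrape-and-store step. The "articles" table,
# its columns, and the pymysql client are assumptions for illustration.
import pymysql
from newspaper import Article  # Newspaper3k

def scrape_and_store(url, connection):
    """Download one article with Newspaper3k and insert its text into MySQL."""
    article = Article(url)
    article.download()
    article.parse()
    with connection.cursor() as cursor:
        cursor.execute(
            "INSERT INTO articles (link, text) VALUES (%s, %s)",
            (url, article.text),
        )
    connection.commit()

connection = pymysql.connect(host="localhost", user="scotus_user",
                             password="********", db="scotus")
scrape_and_store("https://example.com/some-scotus-article", connection)
</code></pre>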
<p>
The tools and platforms we used in production are as follows:
<table style="width: 100%">
<tr>
<td>Python 3.6</td><td>AWS EC2 Instance t2.micro</td>
<td>MySQL</td>
</tr>
<tr>
<td>PHP 7.2.2</td><td>Newspaper3k 0.2.6</td>
<td>Google Cloud Vision API</td>
</tr>
<tr>
<td>Google Cloud Language API</td><td>JavaScript 1.8.2</td>
<td>NEWSAPI 0.1.1</td>
</tr>
</table>
</p>
<p>
We ran into several problems throughout the semester. A major one was the loss of a group member: early in the semester he simply stopped showing up, with no communication at all. This was after he had pitched a machine learning classification feature for the articles, which we had described to the professors, who were excited about it; because he was the most knowledgeable member in that area, we had to scale back the goals of the project after he left. Another problem was making the project sustainable, which meant migrating all of the files that make up the project onto UK's database so that the professors can keep using it after this semester. Lastly, there were paywalls. When scraping articles from the web, some websites had paywalls or presented articles in an unknown format, so the "articles" that were collected were too short or were not articles at all. To detect a paywall or an irrelevant result, we analyzed the length of the collected text; if it was too short, it was likely a paywall and the article was not added to the database.
</p>
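<p>
The length check itself is simple; a minimal sketch is shown below. The exact character threshold we used is not recorded here, so the cutoff and the save_to_database helper are hypothetical placeholders.
</p>
<pre><code>
# Hypothetical length-based paywall filter; the real threshold may differ.
MIN_ARTICLE_CHARS = 500  # assumed cutoff, for illustration only

def looks_like_paywall(text):
    """Treat very short scraped text as a paywall or non-article page."""
    return text is None or len(text.strip()) &lt; MIN_ARTICLE_CHARS

# Only articles that pass the check are added to the database.
if not looks_like_paywall(article_text):
    save_to_database(article_text)  # placeholder for the real insert step
</code></pre>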
<p>
One thing we took away from this project was how to communicate efficiently with a customer: phrasing things in a non-technical fashion and explaining details clearly enough that the customers and the team stayed on the same level. We also learned how web scraping works, and we became familiar with Amazon Web Services and Google Cloud services. Both of these helped make the project runnable and presentable to the customers and the public; anyone with the link can access the web application and use it freely.
</p>
<p>
For now, the project is complete and its functions can save someone a great deal of time, but there are still features we wanted to add. One is an administrative page where the customers can view source statistics, such as how many articles from a given website have been rejected; bad sources could then be identified and blacklisted. A classifying agent built on machine learning would be an impressive addition to the classification of articles. The application functions on mobile, but the display does not adapt to mobile browsers, so adding mobile support to the UI code would be a worthwhile future improvement. Lastly, regarding paywalls, a user with an account on a paywalled website could add their account details so the scraper could access all of that website's articles unhindered.
</p>
<p>
Although the web application requires little maintenance, there are some things to note in case an unexpected error occurs. The Google Alerts account may expire, or the Google Alerts feeds may become unreachable. The NEWSAPI key or the Google Vision API key may expire. Less likely problems are visual ones in the UI, such as display issues. These cases are explained in more detail on the maintenance page.
</p>
</div>
</div>
</div>
</body>
</html>