Commit

feat: Search engine UI added (#3)
* feat: initial commit

* chore: cleanup

* chore: reorg

* feat: final and cleanup

* chore: final touchups

* feat: added build script

* style: style fix

* feat: updated readme and moved to src

* feat: search engine added

* feat: finsihed search Engine

* chore: search engine Done
AnsahMohammad authored Apr 30, 2024
1 parent b08d233 commit af4294f
Showing 18 changed files with 402 additions and 384 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -164,3 +164,5 @@ cython_debug/
 logs.txt
 index.json
 indexed.json
+titles.json
+.archive
49 changes: 0 additions & 49 deletions Phantom_local/query_engine.py

This file was deleted.

22 changes: 20 additions & 2 deletions README.md
@@ -1,2 +1,20 @@
-# Phantom
-Distributed Crawler Indexing Engine
+# Phantom Search
+Light weight python based search engine
+
+## Set-up
+1) open `crawl.sh` and update the parameters
+
+```shell
+python phantom.py --num_threads 8 --urls "site1.com" "site2.com"
+```
+2) now run crawl.sh by typing
+```shell
+./crawl.sh
+```
+This crawls the web and saves indices into `index.json` file
+
+3) run `build.sh` to Process the indices and run the `Query Engine`
+
+4) now everytime you can start the query engine by running the file `query_engine.py`
+
+
9 changes: 9 additions & 0 deletions build.sh
@@ -0,0 +1,9 @@
+source .env/bin/activate
+
+pip install -r requirements.txt
+clear
+echo "Installation done"
+python3 -m src.phantom_indexing
+echo "Phantom Processing done"
+clear
+python3 -m src.query_engine
3 changes: 1 addition & 2 deletions crawl.sh
@@ -1,6 +1,5 @@
 python3 -m venv .env
 source .env/bin/activate
 
-cd phantom_crawler
 pip install -r requirements.txt
-python3 phantom_engine.py
+python3 -m src.phantom --num_threads 10 --urls "https://www.geeksforgeeks.org/" "https://stackoverflow.com/questions" --show_logs True --print_logs True --sleep 60
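
Note on the new `crawl.sh` invocation: the commit does not show how `src.phantom` reads its flags, so the following is only a minimal sketch of how such a command line could be parsed with `argparse`. The flag names come from `crawl.sh`; the types, defaults, the `str_to_bool` helper, and the `build_parser` function are assumptions made for illustration, and the meaning of `--sleep` is not stated in the commit.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert 'True'/'False'-style strings, since crawl.sh passes `--show_logs True`."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: flag names mirror crawl.sh, everything else is assumed.
    parser = argparse.ArgumentParser(description="Crawler CLI sketch")
    parser.add_argument("--num_threads", type=int, default=1)
    parser.add_argument("--urls", nargs="+", required=True)  # one or more seed URLs
    parser.add_argument("--show_logs", type=str_to_bool, default=False)
    parser.add_argument("--print_logs", type=str_to_bool, default=False)
    parser.add_argument("--sleep", type=int, default=0)  # unit and meaning assumed
    return parser


# Parse the same arguments that crawl.sh passes on the command line.
example_argv = [
    "--num_threads", "10",
    "--urls", "https://www.geeksforgeeks.org/", "https://stackoverflow.com/questions",
    "--show_logs", "True",
    "--print_logs", "True",
    "--sleep", "60",
]
args = build_parser().parse_args(example_argv)
print(args.num_threads, args.urls, args.show_logs, args.print_logs, args.sleep)
```

Passing `True` as a separate token only works with an explicit converter like the one above; a plain `type=bool` would treat any non-empty string, including `"False"`, as true.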
26 changes: 26 additions & 0 deletions phantom.py
@@ -0,0 +1,26 @@
+from flask import Flask, render_template, request
+from src.query_engine import Phantom_Query
+from src.phantom_engine import Parser
+
+app = Flask(__name__)
+engine = Phantom_Query("src/indexed.json", titles="src/titles.json")
+parser = Parser()
+
+@app.route('/', methods=['GET', 'POST'])
+def home():
+    input_text = ""
+    if request.method == 'POST':
+        input_text = request.form.get('input_text')
+        result = process_input(input_text)
+        return render_template('result.html', result=result, input_text=input_text)
+    return render_template('home.html', input_text=input_text)
+
+def process_input(input_text):
+    result = engine.query(input_text, count=20)
+    #(doc, score, title)
+    print("results ; \n\n")
+    print(result)
+    return result
+
+if __name__ == '__main__':
+    app.run()
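
Note on `engine.query(input_text, count=20)`: the body of `Phantom_Query` is not part of the lines shown here. The only hint is the `#(doc, score, title)` comment in `process_input`. As a rough sketch of a query method that returns tuples of that shape, assuming an in-memory index that maps each document to per-term weights (the `TinyQuery` class, the index layout, and the sample data are all hypothetical and not taken from the project):

```python
from collections import defaultdict


class TinyQuery:
    """Hypothetical stand-in for Phantom_Query; not the project's implementation."""

    def __init__(self, index, titles):
        # Assumed layout: index = {doc_url: {term: weight}}, titles = {doc_url: title}
        self.index = index
        self.titles = titles

    def query(self, text, count=10):
        terms = text.lower().split()
        scores = defaultdict(float)
        # Sum the weights of every query term that appears in a document.
        for doc, weights in self.index.items():
            for term in terms:
                if term in weights:
                    scores[doc] += weights[term]
        # Highest score first, truncated to `count` results.
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        return [(doc, score, self.titles.get(doc, doc)) for doc, score in ranked[:count]]


# Made-up sample data, for illustration only.
sample_index = {
    "a.example": {"python": 0.5, "flask": 0.25},
    "b.example": {"python": 0.125},
    "c.example": {"rust": 0.5},
}
sample_titles = {"a.example": "Page A", "b.example": "Page B"}

results = TinyQuery(sample_index, sample_titles).query("Python Flask", count=20)
print(results)
```

Documents that match none of the query terms never enter `scores`, so they are left out of the results instead of being listed with a score of zero.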