Skip to content

Latest commit

 

History

History
58 lines (48 loc) · 4.53 KB

readme.md

File metadata and controls

58 lines (48 loc) · 4.53 KB

AI於釣魚郵件辨別之應用

Report Project

(2) Backround

  • Motivation

    During the course, the combination of generative AI with fraud was mentioned, which proved to be quite an intriguing perspective. With this initial idea and data collection, we believed that we could focus on the topic of AI analyzing URLs for phishing website links. Phishing website links are commonly associated with cybersecurity breaches, appearing frequently in attacks through mediums such as emails and text messages. Despite their long history, they continue to be a persistent threat.

    Furthermore, there is another advantage to analyzing website links. Phishing emails often target businesses, and the emails themselves involve privacy concerns. By simply analyzing the submitted URL to determine whether it leads to a phishing website, we can better protect privacy.

    Existing phishing website databases are typically populated through manual submissions and subsequent verification. Using AI for detection provides a quicker and more real-time approach to the task, aligning with the need for swifter detection. We have chosen to employ GPT-3.5 for this purpose.

(3) Solutions

Reference

  1. 機器學習分析垃圾&釣魚郵件標頭檔
  2. Learning OpenAI API
  3. OpenAI 專屬助理--網頁部分
  4. ChatGPT Writer
  5. Detecting Phishing Sites Using ChatGPT-2023/06/09
  6. AnomalyDetectioninEmailsusingMachine LearningandHeaderInformation-2022/03/19
  7. Phishing by Form: The Abuse of Form Sites-2011/10/18 IEEE

Error

  • Alt text 一直沒辦法使用 chatGPT 的機器人,後來跟可以使用的組員比對後發現缺少了一個 persist 資料夾

一鍵搜索資料夾

  • Confusion Matrix:計算與挑選模糊矩陣的樣本
  • crawl:bug1只抓文字;bug2抓取整個html;craw以firefox引擎為模板;web抓取 html 並截圖已做後續OCR處理
  • de-identification:一鍵將儲存資料夾裡的html抓取文本。coool有 meta 值、html、url,有資料夾;lighter沒有 meta 值
  • weeeb:如果 chatGPT 無法分析的備案,再傳入URL時將初步訓練的所有步驟全部在後台跑一次
  • main:後端訓練
  • gpt-master:串接與網站 Demo