Zhihu Crawler Repository

How to use the crawler:

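  • Create a conda virtual environment, then install the dependencies with pip install -r requirements.txt
  • Next, set the search keywords and other options in ./config/base_config.py
  • Start the crawl with the command python main.py --platform zhihu --lt qrcode --type search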
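
As a rough illustration of the configuration step, the sketch below shows what keyword settings in ./config/base_config.py might look like. The variable names (PLATFORM, KEYWORDS, LOGIN_TYPE, CRAWLER_TYPE) and the keyword_list helper are assumptions chosen to mirror the command-line flags above; they are not confirmed by this repository, so check the actual file before copying anything.

```python
# Hypothetical sketch of ./config/base_config.py.
# All names below are assumptions for illustration, not the repository's confirmed settings.

PLATFORM = "zhihu"               # assumed to mirror --platform
KEYWORDS = "Python,机器学习"      # assumed: comma-separated search keywords
LOGIN_TYPE = "qrcode"            # assumed to mirror --lt
CRAWLER_TYPE = "search"          # assumed to mirror --type


def keyword_list(raw: str = KEYWORDS) -> list[str]:
    """Split a comma-separated keyword string into clean, non-empty keywords."""
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    for keyword in keyword_list():
        print(f"would search {PLATFORM} for: {keyword}")
```

Keeping the keywords in one comma-separated string makes it easy to edit by hand, while the helper guards against stray spaces and empty entries.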

This repository draws on the following references:

  1. https://github.com/NanmiCoder/MediaCrawler
  2. https://github.com/NanmiCoder/CrawlerTutorial
  3. https://blog.csdn.net/ChenBinBini/article/details/109739116
  4. https://playwright.dev/
  5. https://github.com/lining0806/PythonSpiderNotes

How to generate a word cloud:

After configuring the data source in wordcloud.py, simply run python wordcloud.py
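
A word cloud is driven by term frequencies, so the counting step can be sketched with the standard library alone. The example below is a minimal sketch, not the repository's wordcloud.py: the term_frequencies function, its parameters, and the sample tokens are hypothetical, and it assumes the crawled text has already been split into tokens (Chinese text would need a segmenter first).

```python
from collections import Counter


def term_frequencies(tokens, stopwords=frozenset(), min_length=2):
    """Count token occurrences, skipping stopwords and very short tokens.

    Hypothetical helper for illustration; tokens are lower-cased before counting.
    """
    cleaned = (token.strip().lower() for token in tokens)
    return Counter(
        token
        for token in cleaned
        if len(token) >= min_length and token not in stopwords
    )


# Sample data standing in for tokenized crawl results (an assumption).
tokens = ["Python", "crawler", "python", "the", "Zhihu", "crawler", "a"]
freqs = term_frequencies(tokens, stopwords={"the"})

for term, count in freqs.most_common():
    print(term, count)
```

A mapping like freqs can then be handed to a rendering library; for example, the wordcloud package's WordCloud.generate_from_frequencies accepts a dictionary of term counts.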