Skip to content

Latest commit

 

History

History
221 lines (154 loc) · 13.9 KB

README_en.md

File metadata and controls

221 lines (154 loc) · 13.9 KB

天机 Tianji


 中文  |  English  |  日本語

PyPI - Python Version LICENSE

Tianji is a free, non-commercial artificial intelligence system. You can utilize it for tasks involving worldly wisdom, such as "art of conversation," to enhance your emotional intelligence and core competitiveness. We firmly believe that worldly wisdom are the future core competency of AI, and let us join hands to witness the advent of general artificial intelligence.

What You Will Learn in Tianji 🍉

After completing all the content, you will gain entry-level full-stack application development capabilities for large language models. 🕶️

SocialAI is a non-profit organization established in China. We have fully open-sourced the Tianji series of work. The current open-source system technology route involves Prompt, AI games, Agent, knowledge base, model training. For specific content, please refer to the documentation and corresponding repositories.

Based on the organized social etiquette data, the social etiquette large model system - Tianji includes seven major areas of common interpersonal communication (specific details can be found in the scenario refinement details in Scene Classification), which can be broadly divided into:

- Toasting etiquette culture Etiquette
- Hospitality etiquette culture Hospitality
- Gift-giving etiquette culture Gifting
- Sending blessings Wishes
- How to speak Communication
- Resolving "awkward" situations Awkwardness
- Dealing with conflicts & confrontations Conflict

Combining these areas, Tianji involves four technical routes:

  • Pure prompt (including AI games): Built-in system prompt based on the large model's own ability to dialogue.
  • Agent (MetaGPT, etc.): Using Agent architecture to get richer, more customized detailed answers.
  • Knowledge base: Directly retrieve social etiquette rules (such as how to drink at the dinner table).
  • Model training: Based on different excellent model bases, perform Lora fine-tuning or full fine-tuning when accumulating a large amount of data.

You can find the corresponding source code for the four routes in the tianji directory. If you want to refer to the project architecture, data management, and technical route of Tianji to recreate your own vertical domain AI application, welcome to fork or directly refer to it. We will open source all processes including project start-up, data direction exploration, data construction and management, AI application creation from scratch, in-depth combination of domain (such as social etiquette) and technical routes; we hope to see further acceleration of AI native applications in life.

Table of Contents

📰News

Roadmap

  • Release the simplest initial version (involving prompt, aigame, agent, knowledge base, model fine-tuning)
  • Complete the reproducible documentation of the model fine-tuning data collection to fine-tuning process for Social Etiquette Large Model - Sending Blessings
  • Open-source social etiquette corpus - Sending Blessings to huggingface
  • Iterate better data production tools and cleaning solutions, open-source data cleaning scripts
  • Complete knowledge base part iteration
  • Complete Agent part documentation
  • Open knowledge base corpus acquisition details to huggingface
  • Add documentation (how to build your own application prompt, agent, knowledge base, fine-tuning application by referencing this project)
  • Organize multi-dimensional data, open-source relatively complete social etiquette corpus
  • Collect & open-source social etiquette corpus - 20,000 and train to give results and the entire process

Running Environment

Environment Installation

In this project, execute the following commands to install the project's complete requirements:

pip install -r requirements.txt
pip install .

Key Configuration

To ensure the project runs normally, please create a new .env file in the project and set your API keys in it. You can write the corresponding keys according to the following example to successfully run the call. Currently, zhipuai is used by default, so you can just write ZHIPUAI_API_KEY to use it.

If you encounter extremely slow or unable to download issues when downloading models from Hugging Face, please set the value of HF_ENDPOINT to https://hf-mirror.com in the .env file. Please note that some Hugging Face repositories may require access permission (e.g., Jina Ai). For this, please register a Hugging Face account and add HF_TOKEN in the .env file. You can find and obtain your token here.

OPENAI_API_KEY=
OPENAI_API_BASE=
ZHIPUAI_API_KEY=
OPENAI_API_MODEL=
HF_HOME='./cache/'
HF_ENDPOINT = 'https://hf-mirror.com'
HF_TOKEN=

File Directory Description

assets/: Static image files
run/: Includes various demonstration frontends
test/: This contains test files for various functions, including core modules and llm runtime unit tests
tianji/: Source code directory, containing main logic and algorithm implementations
tools/: Tools to help collect data, organize data cleaning corpus

Documentation Description

TODO, to cover all documentation

🍺How to Run

The following provides the application methods for prompt and agent. Before running, please make sure you have created a .env file:

# Run prompt webui frontend
python3 run/tianji_prompt_webui.py

# Run agent frontend
streamlit run run/metagpt_webui.py

Examples

Fine-tuning - Blessing Module

The current blessing module supports three modes of triggering:

Normal style, directly write the blessing recipient, conforming to common conversation patterns. Xiaohongshu style, can be triggered by adding "Xiaohongshu style" after specifying the blessing recipient; comes with emojis and cute conversation style. Serious style, can be triggered by adding "serious style" after specifying the blessing recipient; formal writing style.

Prompt - How to Make a Toast

Here's a dialogue example of Tesla founder Elon Musk toasting to OpenAI founder Sam Altman (generated based on built-in prompts):

elon musk: Sam, your Openai should be called CloseAI instead
sam altman: You better manage your grok first, I'm punishing you with three drinks!
elon musk: Look at what you're saying, do you want me to finish this bottle of baijiu?
sam altman: Haha, Elon, you're a man aiming for the stars and seas, what's a bottle of baijiu? I heard you even dare to drink rocket fuel at SpaceX parties! Come on, I'll finish this glass, you do as you please, but don't lose the dignity of the world's richest man!
elon musk: You're challenging me, brother Sam! Well, I won't be polite then, I'll drink it, you take half a glass. But speaking of which, OpenAI's achievements aren't just about boasting, you better keep an eye on it, don't let it really become "CloseAI".
sam altman: Hey, I'm always ready to take off this CEO hat and let you wear it, old Musk. But tonight, let's forget about all that, come on, let's toast to our friendship! I'll finish it, you do as you please, but remember, we still have to continue changing the world tomorrow!
elon musk: Alright, just for your "changing the world" phrase, I'll finish this glass! But don't think I'll let you off easily, wait until I bring my Tesla charging pile next time, see how you deal with that!
sam altman: Haha, I'm looking forward to that day! But for now, let's enjoy tonight. Come on, last glass, I'll finish it, you do as you please, drink happily, don't forget, OpenAI's AIs are still waiting for us to go back and continue training them!

Knowledge Base - xxx

TODO

How to Participate in This Project

Issue Board

Here records some issues existing in this project (welcome to solve them), as well as some roadmap that needs to be completed in the future. Currently, there's no time to write, if you have time and want to play together, please issue or contact me directly via email: [email protected]

Submit Your First PR

Thanks to good ci facilities, you only need to refer to the example PR to quickly propose your first prompt PR! After submitting the PR, the new prompt will be automatically merged into the json file under tianji/prompt, convenient for one-click calling. If you don't know what to write, you can refer to the various scenario refinement details in Scene Classification to write prompts for different social etiquette domains.

How to Replicate This Project

The initial intention of this project is, first, to make AI learn core technology, and second, to allow more people (domains/industries) to build their own AI systems, accelerating AI's penetration into every field. You can learn about this project through the following ways:

You can refer to this project to create new vertical domain applications:

  • Rental assistant (agent)
  • Childcare assistant (data collection and knowledge base)
  • Life guide (data collection and knowledge base) ......

Contributors

Some contributors don't have GitHub, we thank each and every contributor!, and welcome you to join!

Acknowledgements

Thanks to all the following people for their help with this project:

Star History

Star History Chart