Padie is an evolving, open-source Python package designed to enable conversational AI systems with support for Nigerian languages, including Pidgin, Yoruba, Hausa, and Igbo. It aims to provide AI-powered tools for language detection, intent recognition, and response generation, while fostering community collaboration to enhance its capabilities.
🔧 Note: Padie is a work in progress. Models are being trained and refined, and we’re actively gathering datasets to improve accuracy. Your contributions can make a difference!
- 🌍 AI-Powered Language Detection: Automatically identify the language of input text across supported Nigerian languages.
- 🎯 AI-Powered Intent Recognition: Accurately understand user intentions across multiple domains and contexts.
- 🤖 Dynamic Response Generation: Generate intelligent, context-aware responses tailored to user input.
- ⚙️ Framework-Agnostic Design: Seamlessly integrate Padie into any framework or application.
- 🧑‍🤝‍🧑 Community Contributions: An open platform for developers, linguists, and AI enthusiasts to contribute datasets, models, and features.
Datasets should follow a structured format. Each entry must include at least the following fields:
{
  "text": "Example input text.",
  "label": "language_label",
  "format": "format_type",
  "source": {
    "provider": null,
    "url": null
  },
  "citation": null
}
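Before submitting a dataset, it can help to check each entry against the fields listed above. The following is a minimal sketch of such a check, not part of Padie itself: the function name `validate_entry` and the specific rules (for example, that `text` must be a non-empty string) are assumptions made for illustration, derived only from the example entry shown here.

```python
import json

# Field names taken from the example entry above. The project may
# enforce additional rules; these checks are illustrative assumptions.
REQUIRED_FIELDS = ("text", "label", "format", "source", "citation")
REQUIRED_SOURCE_FIELDS = ("provider", "url")


def validate_entry(entry):
    """Return a list of problems found in one dataset entry (empty if none)."""
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]

    problems = []

    # Every top-level field must be present, even if its value is null.
    for field in REQUIRED_FIELDS:
        if field not in entry:
            problems.append(f"missing field: {field}")

    # Assumed rule: the text itself must be a non-empty string.
    if "text" in entry:
        text = entry["text"]
        if not isinstance(text, str) or not text.strip():
            problems.append("text must be a non-empty string")

    # The source object must carry both provider and url keys.
    if "source" in entry:
        source = entry["source"]
        if not isinstance(source, dict):
            problems.append("source must be an object")
        else:
            for field in REQUIRED_SOURCE_FIELDS:
                if field not in source:
                    problems.append(f"missing field: source.{field}")

    return problems


if __name__ == "__main__":
    raw = """
    {
      "text": "Example input text.",
      "label": "language_label",
      "format": "format_type",
      "source": {"provider": null, "url": null},
      "citation": null
    }
    """
    print(validate_entry(json.loads(raw)))
```

An empty list means the entry has every field from the template; any other result lists what needs fixing before the entry is contributed.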
- If the data includes a citation, create a unique citation code (e.g., "c1", "c2") and add the citation details to a separate file named citations.json.
- Replace the citation field in the dataset with the corresponding code.
{
  "c2": {
    "title": "Naija Language Translation",
    "author": "Wazobia",
    "year": 2021,
    "url": "https://example.com/naija-paper"
  }
}
{
  "text": "Example text in a Nigerian language.",
  "label": "pidgin",
  "format": "article",
  "source": {
    "provider": "BBC",
    "url": "https://www.bbc.com/article"
  },
  "citation": "c2"
}
- Ensures clarity by centralising citation details.
- Prevents redundancy and reduces dataset file size.
- Facilitates easier management and updates.
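To illustrate how a citation code links a dataset entry to citations.json, here is a small sketch that looks up an entry's code and reports codes that have no matching citation. The helper names `resolve_citation` and `find_unknown_codes` are hypothetical and not part of Padie; the sample data is copied from the examples above and loaded inline rather than from a file.

```python
import json


def resolve_citation(entry, citations):
    """Return the citation details for an entry, or None if it has no citation."""
    code = entry.get("citation")
    if code is None:
        return None
    if code not in citations:
        raise KeyError(f"citation code {code!r} not found in citations.json")
    return citations[code]


def find_unknown_codes(entries, citations):
    """Return the sorted citation codes used by entries but absent from citations."""
    used = {e["citation"] for e in entries if e.get("citation") is not None}
    return sorted(used - set(citations))


if __name__ == "__main__":
    # Contents of citations.json, as in the example above.
    citations = json.loads("""
    {
      "c2": {
        "title": "Naija Language Translation",
        "author": "Wazobia",
        "year": 2021,
        "url": "https://example.com/naija-paper"
      }
    }
    """)

    # A dataset entry that refers to the citation by its code.
    entry = {
        "text": "Example text in a Nigerian language.",
        "label": "pidgin",
        "format": "article",
        "source": {"provider": "BBC", "url": "https://www.bbc.com/article"},
        "citation": "c2",
    }

    print(resolve_citation(entry, citations)["title"])
    print(find_unknown_codes([entry], citations))
```

Running a check like `find_unknown_codes` over a whole dataset before opening a pull request would catch entries whose codes were never added to citations.json.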
We welcome contributions from everyone—whether you're a developer, linguist, or data scientist! Here's how you can get involved:
- Fork the Repository: Click the "Fork" button at the top of the repository page to create your own copy.
- Clone Your Fork (replace <your-username> with your GitHub username):
  git clone https://github.com/<your-username>/Padie.git
- Create a Branch:
  git checkout -b feature-name
- Make Your Changes:
  - Contribute datasets for supported or new languages.
  - Add citations and ensure proper referencing in citations.json.
  - Improve AI models or build new ones.
  - Fix bugs or add features.
- Commit and Push:
  git add .
  git commit -m "Describe your changes"
  git push origin feature-name
- Submit a Pull Request: Open a pull request against the main branch with a clear description of your changes.
🚧 Coming Soon!
We’re finalizing the core framework and will provide installation instructions once it’s ready.
Padie is licensed under the MIT License, ensuring it remains free and open for everyone to use, contribute to, and enhance.