An open-source Python package for conversational AI with support for Nigerian languages, including Pidgin, Yoruba, Hausa, and Igbo.

🌟 Padie 🌟

Padie is an evolving, open-source Python package for building conversational AI systems that support Nigerian languages, including Pidgin, Yoruba, Hausa, and Igbo. It aims to provide AI-powered tools for language detection, intent recognition, and response generation, while fostering community collaboration to enhance its capabilities.

🔧 Note: Padie is a work in progress. Models are being trained and refined, and we’re actively gathering datasets to improve accuracy. Your contributions can make a difference!


Features

  • 🌍 AI-Powered Language Detection
    Automatically identify the language of input text across supported Nigerian languages.

  • 🎯 AI-Powered Intent Recognition
    Accurately understand user intentions across multiple domains and contexts.

  • 🤖 Dynamic Response Generation
    Generate intelligent, context-aware responses tailored to user input.

  • ⚙️ Framework-Agnostic Design
    Seamlessly integrate Padie into any framework or application.

  • 🧑‍🤝‍🧑 Community Contributions
    An open platform for developers, linguists, and AI enthusiasts to contribute datasets, models, and features.


🛠️ Handling Datasets and Citations

Dataset Format

Datasets should follow a structured format. Each entry must include at least the following fields:

{
    "text": "Example input text.",
    "label": "language_label",
    "format": "format_type",
    "source": {
        "provider": null,
        "url": null
    },
    "citation": null
}
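As a rough illustration of the format above, the following sketch checks that an entry carries the listed fields before it is submitted. The helper name `validate_entry` and the rule that `text` must be a non-empty string are assumptions made for this example; they are not part of Padie.

```python
import json

# Field names come from the dataset format above.
# The helper itself is hypothetical and not part of Padie.
REQUIRED_FIELDS = ("text", "label", "format", "source", "citation")
REQUIRED_SOURCE_FIELDS = ("provider", "url")


def validate_entry(entry):
    """Return a list of problems found in one dataset entry (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in entry:
            problems.append(f"missing field: {field}")

    # Assumption: "text" should be a non-empty string.
    if "text" in entry:
        text = entry["text"]
        if not isinstance(text, str) or not text.strip():
            problems.append("text must be a non-empty string")

    # "source" is a nested object with its own required keys.
    if "source" in entry:
        source = entry["source"]
        if not isinstance(source, dict):
            problems.append("source must be an object")
        else:
            for field in REQUIRED_SOURCE_FIELDS:
                if field not in source:
                    problems.append(f"missing field: source.{field}")
    return problems


entry = json.loads("""
{
    "text": "Example input text.",
    "label": "language_label",
    "format": "format_type",
    "source": {"provider": null, "url": null},
    "citation": null
}
""")
print(validate_entry(entry))
```

An empty list means the entry has every required field; `null` values are accepted because the format above allows them for `provider`, `url`, and `citation`.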

Citations

  • If the data includes a citation, create a unique citation code (e.g., "c1", "c2") and add the citation details to a separate file named citations.json.
  • Replace the citation field in the dataset with the corresponding code.

Example Citation File (citations.json):

{
    "c2": {
        "title": "Naija Language Translation",
        "author": "Wazobia",
        "year": 2021,
        "url": "https://example.com/naija-paper"
    }
}

Example Dataset Entry with Citation Code:

{
    "text": "Example text in a Nigerian language.",
    "label": "pidgin",
    "format": "article",
    "source": {
        "provider": "BBC",
        "url": "https://www.bbc.com/article"
    },
    "citation": "c2"
}

Why Use a Citation File?

  • Ensures clarity by centralising citation details.
  • Prevents redundancy and reduces dataset file size.
  • Facilitates easier management and updates.
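The citation workflow in this section can be sketched in a few lines of Python. The function names `add_citation` and `resolve_citation` are hypothetical, as is the choice to pick the lowest unused `c<number>` code; the README only requires that each code be unique.

```python
import json


def add_citation(citations, details):
    """Store citation details under a unique code and return that code."""
    # Reuse the existing code if identical details are already registered.
    for code, existing in citations.items():
        if existing == details:
            return code
    # Assumption: codes look like "c1", "c2", ...; take the lowest unused one.
    number = 1
    while f"c{number}" in citations:
        number += 1
    code = f"c{number}"
    citations[code] = details
    return code


def resolve_citation(entry, citations):
    """Return the citation details for an entry, or None if it has no citation."""
    code = entry.get("citation")
    if code is None:
        return None
    if code not in citations:
        raise KeyError(f"unknown citation code: {code}")
    return citations[code]


citations = {}  # in practice, the parsed contents of citations.json
details = {
    "title": "Naija Language Translation",
    "author": "Wazobia",
    "year": 2021,
    "url": "https://example.com/naija-paper",
}
entry = {
    "text": "Example text in a Nigerian language.",
    "label": "pidgin",
    "format": "article",
    "source": {"provider": "BBC", "url": "https://www.bbc.com/article"},
    # The dataset entry stores only the code, not the full citation.
    "citation": add_citation(citations, details),
}
print(entry["citation"])
print(json.dumps(citations, indent=4))
```

Writing `citations` back with `json.dump` would produce a file shaped like the `citations.json` example above, with the dataset entries carrying only the short code.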

📋 How to Contribute

We welcome contributions from everyone—whether you're a developer, linguist, or data scientist! Here's how you can get involved:

  1. Fork the Repository:
    Click the "Fork" button at the top of the repository page to create your copy.

  2. Clone Your Fork:

    git clone https://github.com/<your-username>/Padie.git
  3. Create a Branch:

    git checkout -b feature-name
  4. Make Your Changes:

    • Contribute datasets for supported or new languages.
    • Add citations and ensure proper referencing in citations.json.
    • Improve AI models or build new ones.
    • Fix bugs or add features.
  5. Commit and Push:

    git add .
    git commit -m "Describe your changes"
    git push origin feature-name
  6. Submit a Pull Request:
    Open a pull request against the main branch with a clear description of your changes.


📦 Installation

🚧 Coming Soon!
We’re finalising the core framework and will publish installation instructions once it’s ready.


🌍 Open Source Contribution

Padie is licensed under the MIT License, ensuring it remains free and open for everyone to use, contribute to, and enhance.
