AI Engine

A powerful asynchronous framework for orchestrating Large Language Model (LLM) workflows through JSON-based configurations. This engine enables structured data extraction and complex decision-making processes using various LLM providers (OpenAI, Google, Mistral).

Core Concepts

The AI Engine is a flexible system for structured data collection and workflow processing using LLMs. It processes JSON-based configurations to orchestrate complex data extraction and decision-making workflows.

Key Components

  1. Data Definitions - Define collectable fields with specific types:

    • string - Text-based responses
    • numeric - Numerical values
    • object - Complex nested structures
    • list - Arrays of items
  2. Workflow Definitions - Organize execution flow (see the configuration sketch after this list):

    • Prompt-based workflows, which run independently
    • Explanation-based workflows, which depend on the results of other workflows
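
To make these definitions concrete, here is a hypothetical configuration sketch in the style of the usage example below. The field names are illustrative, and the keys used to express nesting and workflow dependencies ("attributes", "requires") are assumptions, since this README only documents type, prompt, and data.

# Hypothetical configuration sketch. Field names are illustrative; the
# "attributes" and "requires" keys are assumptions not documented in this
# README, which only shows "type", "prompt", and "data".
config = {
    "data": {
        "title": {"type": "string", "prompt": "Extract the document title"},
        "page_count": {"type": "numeric", "prompt": "Estimate the page count"},
        "author": {
            "type": "object",  # complex nested structure
            "prompt": "Extract author details",
            "attributes": {  # assumed key for nested fields
                "name": {"type": "string", "prompt": "Author name"}
            },
        },
        "topics": {"type": "list", "prompt": "List the main topics covered"},
    },
    "workflow": {
        # Prompt-based workflow: runs independently against the input.
        "extract": {"prompt": "Extract metadata", "data": ["title", "author"]},
        # Explanation-based workflow: depends on another workflow's results.
        "classify": {
            "prompt": "Classify the document using the extracted metadata",
            "requires": ["extract"],  # assumed key for the dependency
            "data": ["topics"],
        },
    },
}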

Implementation Details

The implementation is organized around the following components and features:

  1. AIEngine Class

    • Central orchestrator
    • Parallel execution support
    • Result caching
    • Workflow management
  2. Executors

    • DataExecutor - Handles data processing operations
    • WorkflowExecutor - Manages workflow orchestration
  3. Key Features

    • Parallel processing
    • Thread-safe caching
    • Multiple LLM provider support
    • Robust error handling
    • Extensible architecture
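
As a concrete illustration of the parallel-processing and caching claims, the sketch below fans engine.execute out over several inputs with asyncio.gather. It relies only on the execute method shown in the usage example below, plus the assumption that concurrent calls are safe, which the thread-safe caching implies.

import asyncio

# Sketch: process several documents concurrently. Assumes `engine` is an
# AIEngine configured as in the usage example below, and that execute() can
# be awaited concurrently (implied by the engine's thread-safe caching).
async def run_batch(engine, documents):
    tasks = [engine.execute(doc) for doc in documents]
    return await asyncio.gather(*tasks)

# results = asyncio.run(run_batch(engine, ["doc one", "doc two"]))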

Usage Example

Here's how to use the AI Engine:

from ai_engine import AIEngine
from langchain_openai import ChatOpenAI
import asyncio

# Initialize model
model = ChatOpenAI(
    openai_api_key="your-key",
    model_name="gpt-4o"
)

# Configure engine
config = {
    "data": {
        "summary": {
            "type": "string",
            "prompt": "Summarize the content"
        }
    },
    "workflow": {
        "analyze": {
            "prompt": "Analyze the content",
            "data": ["summary"]
        }
    }
}

# Initialize engine
engine = AIEngine(config, model)

# Execute
async def run():
    result = await engine.execute("Your content here")
    print(result)

# Run with asyncio
asyncio.run(run())

Project Setup

Prerequisites

  1. Install direnv:
# macOS
brew install direnv

# Linux
curl -sfL https://direnv.net/install.sh | bash
  2. Install devbox:
curl -fsSL https://get.jetpack.io/devbox | bash

Setup Steps

  1. Clone the repository:
git clone https://github.com/jazibjohar/ai-engine.git
cd ai-engine
  2. Create .envrc file:
export OPENAI_API_KEY="your-key-here"
  3. Allow direnv:
direnv allow
  4. Initialize devbox:
devbox init
  5. Install dependencies:
devbox install
  6. Start the development shell:
devbox shell
  7. Install Poetry:
curl -sSL https://install.python-poetry.org | python3 -

Or install it through devbox (the devbox setup should already provide it):

devbox add poetry
  8. Install project dependencies through Poetry:
poetry install

Publishing to PyPI (Work in progress)

The project uses GitHub Actions for automated publishing to PyPI. The workflow is triggered when you push a version tag.

  1. Configure GitHub repository:

    • Go to repository Settings → Secrets and variables → Actions
    • Add a new secret named PYPI_TOKEN with your PyPI API token
  2. Update the version in pyproject.toml:

poetry version patch  # For patch version bump
# or
poetry version minor  # For minor version bump
# or
poetry version major  # For major version bump
  3. Commit your changes:
git add pyproject.toml
git commit -m "Bump version to x.y.z"
  4. Create and push a version tag:
git tag vx.y.z  # Replace with your version (e.g., v1.0.0)
git push origin vx.y.z

The GitHub Action will automatically:

  • Build the package
  • Publish to PyPI
  • Create a release on GitHub

Note: To publish to Test PyPI first, you can manually run:

poetry config repositories.testpypi https://test.pypi.org/legacy/
poetry publish --build -r testpypi

The project uses modern development tools (devbox and direnv) to ensure consistent development environments and secure credential management. The implementation supports parallel processing, caching, and multiple LLM providers while maintaining a clean, extensible architecture.
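
Because the engine accepts the model object at construction time, switching providers should amount to constructing a different LangChain chat model. A minimal sketch, assuming the standard langchain-google-genai and langchain-mistralai integrations (the model names are illustrative):

# Sketch: swapping LLM providers. Model names are illustrative assumptions;
# `config` is the same configuration dict shown in the usage example above.
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_mistralai import ChatMistralAI

from ai_engine import AIEngine

google_model = ChatGoogleGenerativeAI(model="gemini-1.5-pro", google_api_key="your-key")
mistral_model = ChatMistralAI(model="mistral-large-latest", mistral_api_key="your-key")

# The same config works with either provider.
engine = AIEngine(config, google_model)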

Future Work

RAG (Retrieval-Augmented Generation) Integration

The AI Engine roadmap includes implementing robust RAG capabilities:

  1. Document Processing

    • PDF, markdown, and plain text ingestion
    • Document chunking and preprocessing
    • Metadata extraction and indexing
  2. Vector Store Integration

    • Support for multiple vector databases (Pinecone, Weaviate, etc.)
    • Efficient similarity search
    • Hybrid search capabilities
  3. Context Enhancement

    • Dynamic context window management
    • Relevance scoring and filtering
    • Context compression techniques
  4. Advanced Features

    • Multi-document reasoning
    • Cross-reference validation
    • Source attribution and citation
    • Incremental learning capabilities

These enhancements will enable the AI Engine to:

  • Process and understand large document collections
  • Provide more accurate and contextual responses
  • Support domain-specific knowledge bases
  • Maintain traceability to source materials
