Feature/cloudwatch #699

Open: wants to merge 34 commits into base: master
2 changes: 2 additions & 0 deletions api/Makefile
@@ -0,0 +1,2 @@
rerun:
	bash ./rerun.sh
95 changes: 95 additions & 0 deletions api/Readme.md
@@ -0,0 +1,95 @@
* install

Run the installer:

`sudo bash ./install.sh`

To redo all the steps, remove the lock files:

`rm ${ROOT}/opt/swarms/install/*`

or, on my system:
```
export ROOT=/mnt/data1/swarms
sudo rm ${ROOT}/opt/swarms/install/*
```

Then rerun:
```
export ROOT=/mnt/data1/swarms;
sudo rm ${ROOT}/opt/swarms/install/*;
sudo bash ./install.sh
```
* setup

To install the Session Manager plugin on Linux (see https://docs.aws.amazon.com/systems-manager/ for details):

```
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/ubuntu_64bit/session-manager-plugin.deb" -o "session-manager-plugin.deb"
sudo dpkg -i ./session-manager-plugin.deb
```

* run

To redo the installation steps for the Swarms tool on your system, follow these commands sequentially:

1. Set the ROOT variable:
```bash
export ROOT=/mnt/data1/swarms
```

2. Remove the lock files:
```bash
sudo rm ${ROOT}/opt/swarms/install/*
```

3. Run the installation script again:
```bash
sudo bash ./install.sh
```

For setting up the Session Manager plugin on Linux, you can follow these commands:

1. Download the Session Manager plugin:
```bash
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/ubuntu_64bit/session-manager-plugin.deb" -o "session-manager-plugin.deb"
```

2. Install the plugin:
```bash
sudo dpkg -i ./session-manager-plugin.deb
```
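
To confirm the plugin installed correctly, running the binary with no arguments should print a success message; this is just a quick sanity check, and the expected output shown in the comment is approximate:
```bash
session-manager-plugin
# Expected to print something like:
# "The Session Manager plugin was installed successfully. Use the AWS CLI to start a session."
```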

After that, you can run your desired commands or workflows.

** get the instance id
`aws ec2 describe-instances`

** start a session
`aws ssm start-session --target i-XXXX`

** on the machine:
```
sudo su -
tail /var/log/cloud-init-output.log
```

Convert these steps into an automation of your choice and run them on all the instances; one possible sketch follows.
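
A minimal bash sketch of such an automation (untested; it assumes the AWS CLI is configured with suitable permissions and reuses the `instanceids` target key and log-tail command from the scripts in this PR):
```bash
# For every running instance, run the cloud-init log tail via SSM
# and print the resulting CommandId for later inspection.
for id in $(aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].InstanceId" --output text); do
  echo "Sending command to ${id}"
  aws ssm send-command \
    --document-name "AWS-RunShellScript" \
    --targets "Key=instanceids,Values=${id}" \
    --parameters '{"commands":["sudo su - -c \"tail /var/log/cloud-init-output.log\""]}' \
    --query "Command.CommandId" --output text
done
```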

To get the instance ID and start a session using AWS CLI, follow these steps:

1. **Get the Instance ID:**
Run the following command to list your instances and their details:
```bash
aws ec2 describe-instances
```

2. **Start a Session:**
Replace `i-XXXX` with your actual instance ID from the previous step:
```bash
aws ssm start-session --target i-XXXX
```

3. **On the Machine:**
After starting the session, you can execute the following commands:
```bash
sudo su -
tail /var/log/cloud-init-output.log
```
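
When many instances exist, a filtered query (a convenience sketch, not required by the steps above) narrows step 1 to just the running instance IDs:
```bash
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query "Reservations[].Instances[].InstanceId" --output text
```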


14 changes: 13 additions & 1 deletion api/agent_api_test.py
@@ -1,11 +1,22 @@
import os
import time

import requests
from loguru import logger

import time
from typing import Dict, Optional, Tuple
from uuid import UUID

BASE_URL = "http://0.0.0.0:8000/v1"
# # Configure loguru
# logger.add(
# "api_tests_{time}.log",
# rotation="100 MB",
# level="DEBUG",
# format="{time} {level} {message}",
# )

BASE_URL = os.getenv("SWARMS_URL","http://localhost:8000/v1")

def check_api_server() -> bool:
"""Check if the API server is running and accessible."""
@@ -33,6 +44,7 @@ def __init__(self):
@property
def headers(self) -> Dict[str, str]:
"""Get headers with authentication."""
print("HEADERS",self.api_key)
return {"api-key": self.api_key} if self.api_key else {}


32 changes: 32 additions & 0 deletions api/boot.sh
@@ -0,0 +1,32 @@
#!/bin/bash

# to be run as swarms user
set -e
set -x
export ROOT=""
export HOME="${ROOT}/home/swarms"
unset CONDA_EXE
unset CONDA_PYTHON_EXE
export PATH="${ROOT}/var/swarms/agent_workspace/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

if [ ! -d "${ROOT}/var/swarms/agent_workspace/.venv/" ];
then
virtualenv "${ROOT}/var/swarms/agent_workspace/.venv/"
fi
ls "${ROOT}/var/swarms/agent_workspace/"
. "${ROOT}/var/swarms/agent_workspace/.venv/bin/activate"

pip install fastapi uvicorn termcolor
# the app tries to install these itself on boot
pip install sniffio pydantic-core httpcore exceptiongroup annotated-types pydantic anyio httpx ollama
pip install -e "${ROOT}/opt/swarms/"
cd "${ROOT}/var/swarms/"
pip install -e "${ROOT}/opt/swarms-memory"
pip install "fastapi[standard]"
pip install "loguru"
pip install "hunter" # for tracing
pip install pydantic==2.8.2
pip install pathos || echo oops
pip freeze
# launch as systemd
# python /opt/swarms/api/main.py
17 changes: 17 additions & 0 deletions api/boot_fast.sh
@@ -0,0 +1,17 @@
#!/bin/bash

# to be run as swarms user
set -e
set -x
export ROOT=""
export HOME="${ROOT}/home/swarms"
unset CONDA_EXE
unset CONDA_PYTHON_EXE
export PATH="${ROOT}/var/swarms/agent_workspace/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

ls "${ROOT}/var/swarms/agent_workspace/"
. "${ROOT}/var/swarms/agent_workspace/.venv/bin/activate"

pip install -e "${ROOT}/opt/swarms/"
cd "${ROOT}/var/swarms/"
pip install -e "${ROOT}/opt/swarms-memory"
52 changes: 52 additions & 0 deletions api/check_ssm.py
@@ -0,0 +1,52 @@
import os
import json
import boto3

# Create .cache directory if it doesn't exist
os.makedirs('.cache', exist_ok=True)

def cache(name, value):
    # Write the value to a cache file once; serialize non-strings as JSON.
    cache_file = f'.cache/{name}'
    if not os.path.isfile(cache_file):
        with open(cache_file, 'w') as f:
            f.write(value if isinstance(value, str) else json.dumps(value, default=str))

# Initialize Boto3 SSM client
ssm = boto3.client('ssm')

# List commands from AWS SSM
response = ssm.list_commands()

cache("aws_ssm_list_commands", response)

# Retrieve commands
print(response)
commands = response["Commands"]
run_ids = [cmd['CommandId'] for cmd in commands]
print(f"RUNIDS: {run_ids}")

# Check the status of each command
for command in commands:
#print(command)
command_id = command['CommandId']
status = command['Status']
#eG: command= {'CommandId': '820dcf47-e8d7-4c23-8e8a-bc64de2883ff', 'DocumentName': 'AWS-RunShellScript', 'DocumentVersion': '$DEFAULT', 'Comment': '', 'ExpiresAfter': datetime.datetime(2024, 12, 13, 12, 41, 24, 683000, tzinfo=tzlocal()), 'Parameters': {'commands': ['sudo su - -c "tail /var/log/cloud-init-output.log"']}, 'InstanceIds': [], 'Targets': [{'Key': 'instanceids', 'Values': ['i-073378237c5a9dda1']}], 'RequestedDateTime': datetime.datetime(2024, 12, 13, 10, 41, 24, 683000, tzinfo=tzlocal()), 'Status': 'Success', 'StatusDetails': 'Success', 'OutputS3Region': 'us-east-1', 'OutputS3BucketName': '', 'OutputS3KeyPrefix': '', 'MaxConcurrency': '50', 'MaxErrors': '0', 'TargetCount': 1, 'CompletedCount': 1, 'ErrorCount': 0, 'DeliveryTimedOutCount': 0, 'ServiceRole': '', 'NotificationConfig': {'NotificationArn': '', 'NotificationEvents': [], 'NotificationType': ''}, 'CloudWatchOutputConfig': {'CloudWatchLogGroupName': '', 'CloudWatchOutputEnabled': False}, 'TimeoutSeconds': 3600, 'AlarmConfiguration': {'IgnorePollAlarmFailure': False, 'Alarms': []}, 'TriggeredAlarms': []}], 'ResponseMetadata': {'RequestId': '535839c4-9b87-4526-9c01-ed57f07d21ef', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Fri, 13 Dec 2024 16:58:53 GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length': '2068', 'connection': 'keep-alive', 'x-amzn-requestid': '535839c4-9b87-4526-9c01-ed57f07d21ef'}, 'RetryAttempts': 0}}

if status == "Success":
print(f"Check logs of {command_id}")
# use ssm to fetch logs using CommandId

# Assuming you have the command_id from the previous command output
command_id = command['CommandId']
instance_id = command['Targets'][0]['Values'][0] # Get the instance ID

# Fetching logs using CommandId
log_response = ssm.get_command_invocation(
CommandId=command_id,
InstanceId=instance_id
)
print(log_response['StandardOutputContent']) # Output logs
print(log_response['StandardErrorContent']) # Error logs (if any)
print(f"aws ssm start-session --target {instance_id}")


54 changes: 54 additions & 0 deletions api/get_logs.py
@@ -0,0 +1,54 @@
import time

import boto3
#from dateutil import tz


def parse_command_id(send_command_output):
return send_command_output['Command']['CommandId']

def main():
ec2_client = boto3.client('ec2')
ssm_client = boto3.client('ssm')

# Get the list of instance IDs and their states
instances_response = ec2_client.describe_instances()
instances = [
(instance['InstanceId'], instance['State']['Name'])
for reservation in instances_response['Reservations']
for instance in reservation['Instances']
]

for instance_id, state in instances:
if state == 'running':
print(f"Starting command for instance: {instance_id}")

# Send command to the instance
send_command_output = ssm_client.send_command(
DocumentName="AWS-RunShellScript",
Targets=[{"Key": "instanceids", "Values": [instance_id]}],
Parameters={'commands': ['sudo su - -c "tail /var/log/cloud-init-output.log"']}
)

# Get the command ID
command_id = parse_command_id(send_command_output)

# Poll the command status every 20 seconds, up to 4 times
for _ in range(4):
time.sleep(20)
command_status = ssm_client.list_command_invocations(CommandId=command_id, Details=True)

print(command_status)
for invocation in command_status['CommandInvocations']:
if invocation['Status'] == 'Success':
for plugin in invocation['CommandPlugins']:
if plugin['Status'] == 'Success':
print(f"Output from instance {instance_id}:\n{plugin['Output']}")
else:
print(f"Error in plugin execution for instance {instance_id}: {plugin['StatusDetails']}")
else:
print(f"Command for instance {instance_id} is still in progress... Status: {invocation['Status']}")


if __name__ == "__main__":
main()