
Develop #214

Merged 9 commits on Feb 26, 2025.
2 changes: 1 addition & 1 deletion .github/workflows/publish-pages.yaml
@@ -122,7 +122,7 @@ jobs:
cache-name: cache-pages-yarn
with:
path: ./pages/node_modules/
key: cache-pages-yarn-${{ hashFiles('pages/package-lock.json') }}
key: cache-pages-yarn-${{ hashFiles('pages/yarn.lock') }}

- name: Install Dependencies
if: steps.cache-pages-yarn.outputs.cache-hit != 'true'
62 changes: 24 additions & 38 deletions README.md
@@ -6,14 +6,20 @@ Data Dumps: https://data.lemmyverse.net/

This project provides a simple way to explore Lemmy Instances and Communities.

![List of Communities](./docs/images/communities.png)
![List of Communities](./docs/images/0.10.0-communities.png)

The project consists of four modules:
## Project Structure

1. Crawler (NodeJS, Redis) `/crawler`
2. Frontend (ReactJS, MUI Joy, TanStack) `/frontend`
3. Deploy (Amazon CDK v2) `/cdk`
4. Data Site (GitHub Pages) `/pages`
The project consists of the following modules:

| Module Description | Path | Readme |
| --------------------------------------------- | ----------- | ------------------------------ |
| Crawler _(NodeJS, Redis)_ | `/crawler` | [README](./crawler/README.md) |
| Frontend _(ReactJS, MUI Joy, TanStack)_ | `/frontend` | [README](./frontend/README.md) |
| Deployment _(Amazon CDK v2)_ | `/cdk` | [README](./cdk/README.md) |
| Data Dump Site _(ReactJS, MUI, GitHub Pages)_ | `/pages` | [README](./pages/README.md) |

Each module has its own README with more details.

## FAQ

@@ -36,11 +42,12 @@ Additionally, instance tags and trust data are fetched from [Fediseer](https://gu

The NSFW filter is a client-side filter that filters out NSFW communities and instances from results by default.
The "NSFW Toggle" checkbox has three states that you can toggle through:
| State | Filter | Value |
| --- | --- | --- |
| Default | Hide NSFW | false |
| One Click | Include NSFW | null |
| Two Clicks | NSFW Only | true |

| State | Filter | Value |
| ---------- | ------------ | ----- |
| Default | Hide NSFW | false |
| One Click | Include NSFW | null |
| Two Clicks | NSFW Only | true |

When you try to switch to a non-SFW state, a popup will appear to confirm your choice. You can save your response in your browser's cache and it will be remembered.
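The three-state cycle above maps naturally onto a `false`/`null`/`true` filter value. A minimal sketch of that cycle and the matching client-side filter — the names here are illustrative assumptions, not taken from the actual frontend code:

```typescript
// Hypothetical sketch of the tri-state NSFW filter described in the table above.
type NsfwFilter = false | null | true; // Hide NSFW -> Include NSFW -> NSFW Only

// Each click advances the checkbox to the next state, wrapping back to the default.
function nextNsfwState(current: NsfwFilter): NsfwFilter {
  if (current === false) return null; // one click: include NSFW
  if (current === null) return true; // two clicks: NSFW only
  return false; // wrap back to default: hide NSFW
}

// Decide whether an item passes the current filter state.
function matchesNsfwFilter(itemIsNsfw: boolean, filter: NsfwFilter): boolean {
  if (filter === null) return true; // Include NSFW: show everything
  return itemIsNsfw === filter; // false: SFW only, true: NSFW only
}
```

The `null` middle state is what lets a single checkbox express "no filtering at all" without a second control.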

@@ -75,47 +82,26 @@ You can also download [Latest ZIP](https://nightly.link/tgxn/lemmy-explorer/work
- `instances.full.json` - list of all instances
- `overview.json` - metadata and counts

## Crawler

[Crawler README](./crawler/README.md)

## Frontend

[Frontend README](./frontend/README.md)

## Data Site

[Data Site README](./pages/README.md)

## Deploy

The deploy is an Amazon CDK v2 project that deploys the crawler and frontend to AWS.

`config.example.json` has the configuration for the deploy.

then run `cdk deploy --all` to deploy the frontend to AWS.
## Awesome Lemmy Links

## Similar Sites
### General

- https://browse.feddit.de/
- https://join-lemmy.org/instances
- https://github.com/maltfield/awesome-lemmy-instances
- https://lemmymap.feddit.de/
- https://browse.toast.ooo/
- https://lemmyfind.quex.cc/

## Lemmy Stats Pages
### Lemmy Stats Pages

- https://lemmy.fediverse.observer/dailystats
- https://the-federation.info/platform/73
- https://fedidb.org/software/lemmy
- https://fedidb.org/current-events/threadiverse

## Thanks / Related Lemmy Tools
### Thanks / Related Lemmy Tools

- https://github.com/db0/fediseer
- https://github.com/LemmyNet/lemmy-stats-crawler

# Credits

Logo made by Andy Cuccaro (@andycuccaro) under the CC-BY-SA 4.0 license.
- Logo made by Andy Cuccaro (@andycuccaro) under the CC-BY-SA 4.0 license.
- Lemmy Developers and Community for creating [Lemmy](https://github.com/LemmyNet).
6 changes: 5 additions & 1 deletion cdk/README.md
@@ -1,6 +1,10 @@
# Lemmy Explorer Deployment (Amazon CDK v2)

This is a CDK v2 project for deploying the Lemmy Explorer to AWS.
The deploy is an Amazon CDK v2 project that deploys the Lemmy Explorer frontend to AWS.

`config.example.json` has the configuration for the deploy; rename it to `config.json` and fill in the values.

Then run `cdk deploy --all` (or `yarn deploy`) to deploy the frontend to AWS.

## Deployment

4 changes: 2 additions & 2 deletions crawler/src/lib/crawlStorage.ts
@@ -131,12 +131,12 @@ export class CrawlStorage {
}

async getAttributesWithScores(baseUrl: string, attributeName: string): Promise<any> {
const start = Date.now() - this.attributeMaxAge;
// const start = Date.now() - this.attributeMaxAge;
const end = Date.now();

const keys = await this.client.zRangeByScoreWithScores(
`attributes:instance:${baseUrl}:${attributeName}`,
start,
0, //start,
end,
);
return keys;
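The change above widens the `zRangeByScoreWithScores` query: instead of starting at `Date.now() - attributeMaxAge`, it starts at score `0`, so the full recorded history of the attribute is returned rather than only recent entries. A small in-memory stand-in for a Redis sorted set (not the real `node-redis` client) illustrates the score-range semantics:

```typescript
// Minimal in-memory sketch of sorted-set range-by-score semantics,
// mirroring what zRangeByScoreWithScores does against Redis.
interface ScoredEntry {
  score: number; // timestamp in ms, used as the sorted-set score
  value: string;
}

// Return entries whose score falls in [min, max], ascending by score.
function zRangeByScore(entries: ScoredEntry[], min: number, max: number): ScoredEntry[] {
  return entries
    .filter((e) => e.score >= min && e.score <= max)
    .sort((a, b) => a.score - b.score);
}

const history: ScoredEntry[] = [
  { score: 1_000, value: "0.18.0" },
  { score: 2_000, value: "0.18.1" },
  { score: 3_000, value: "0.19.0" },
];

// min = 0 returns every recorded entry, oldest first.
const full = zRangeByScore(history, 0, Date.now());
// A recent cutoff (the old behaviour) excludes older entries.
const recent = zRangeByScore(history, 2_500, Date.now());
```

This is why the commented-out `start` in the diff matters: with `0` as the lower bound, the attribute-history output can cover the instance's whole lifetime.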
20 changes: 19 additions & 1 deletion crawler/src/output/file_writer.ts
@@ -142,10 +142,28 @@ export default class OutputFileWriter {
await this.writeJsonFile(`${this.publicDataFolder}/tags.meta.json`, JSON.stringify(fediTags));
}

async storeMetricsSeries(data: { versions: any }) {
await this.writeJsonFile(`${this.publicDataFolder}/metrics.series.json`, JSON.stringify(data));
}
/**
* this method is used to store the instance metrics data
*/
public async storeInstanceMetricsData(instanceBaseUrl: String, data: any) {

public async storeInstanceMetricsData(
instanceBaseUrl: string,
data: {
instance: any[];
communityCount: number;
users: any[];
communities: any[];
posts: any[];
comments: any[];
versions: any[];
usersActiveDay: any[];
usersActiveMonth: any[];
usersActiveWeek: any[];
},
) {
await mkdir(this.metricsPath, {
recursive: true,
});
Expand Down
151 changes: 151 additions & 0 deletions crawler/src/output/output.ts
@@ -367,6 +367,12 @@
const returnInstanceArray = await this.getInstanceArray();
await this.fileWriter.storeInstanceData(returnInstanceArray);

// VERSIONS DATA
await this.outputAttributeHistory(
returnInstanceArray.map((i) => i.baseurl),
"version",
);

const returnCommunityArray = await this.getCommunityArray(returnInstanceArray);
await this.fileWriter.storeCommunityData(returnCommunityArray);

@@ -518,8 +524,21 @@
private async generateInstanceMetrics(instance, storeCommunityData) {
// get timeseries
const usersSeries = await storage.instance.getAttributeWithScores(instance.baseurl, "users");
const usersActiveDaySeries = await storage.instance.getAttributeWithScores(
instance.baseurl,
"users_active_day",
);
const usersActiveMonthSeries = await storage.instance.getAttributeWithScores(
instance.baseurl,
"users_active_month",
);
const usersActiveWeekSeries = await storage.instance.getAttributeWithScores(
instance.baseurl,
"users_active_week",
);
const postsSeries = await storage.instance.getAttributeWithScores(instance.baseurl, "posts");
const commentsSeries = await storage.instance.getAttributeWithScores(instance.baseurl, "comments");
const communitiesSeries = await storage.instance.getAttributeWithScores(instance.baseurl, "communities");
const versionSeries = await storage.instance.getAttributeWithScores(instance.baseurl, "version");

// generate array with time -> value
const users = usersSeries.map((item) => {
return {
time: item.score,
value: item.value,
};
});

const usersActiveDay = usersActiveDaySeries.map((item) => {
return {
time: item.score,
value: item.value,
};
});

const usersActiveMonth = usersActiveMonthSeries.map((item) => {
return {
time: item.score,
value: item.value,
};
});

const usersActiveWeek = usersActiveWeekSeries.map((item) => {
return {
time: item.score,
value: item.value,
};
});

const posts = postsSeries.map((item) => {
return {
time: item.score,
value: item.value,
};
});

const communities = communitiesSeries.map((item) => {
return {
time: item.score,
value: item.value,
};
});

const versions = versionSeries.map((item) => {
return {
time: item.score,
value: item.value,
};
});

await this.fileWriter.storeInstanceMetricsData(instance.baseurl, {
users,
posts,
comments,
versions,
usersActiveDay,
usersActiveMonth,
usersActiveWeek,
communities,
});
}
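The seven `.map` blocks above all perform the same `score` → `time` transform. As a review-style sketch (not part of the PR), a single generic helper could express this once:

```typescript
// A time-series point produced from a Redis sorted-set entry.
interface SeriesPoint<T> {
  time: number;
  value: T;
}

// One shared transform instead of seven near-identical .map blocks:
// each { score, value } entry becomes a { time, value } point.
function toTimeSeries<T>(series: { score: number; value: T }[]): SeriesPoint<T>[] {
  return series.map((item) => ({ time: item.score, value: item.value }));
}
```

With this helper, `generateInstanceMetrics` would reduce to one `toTimeSeries(...)` call per attribute, keeping the output shape identical.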

@@ -895,6 +948,104 @@
return instanceErrors;
}

/// VERSION HISTORY

private async outputAttributeHistory(
countInstanceBaseURLs: string[],
metricToAggregate: string,
): Promise<any> {
// this function needs to output an aggregated array of versions, and be able to show change over time
// this will be used to show version history on the website

// basically, it creates a snapshot each 12 hours, and calculates the total at that point in time
// maybe it should use a floating window, so that it can show the change over time

// load all versions for all instances
let aggregateDataObject: {
time: number;
value: string;
}[] = [];

console.log("countInstanceBaseURLs", countInstanceBaseURLs.length);

for (const baseURL of countInstanceBaseURLs) {
const attributeData = await storage.instance.getAttributeWithScores(baseURL, metricToAggregate);
// console.log("MM attributeData", attributeData);

if (attributeData) {
for (const metricEntry of attributeData) {
const time = metricEntry.score;
const value = metricEntry.value;

aggregateDataObject.push({ time, value });
}
}
}

console.log("aggregateDataObject", aggregateDataObject.length);

// console.log("aggregateDataObject", aggregateDataObject);

const snapshotWindow = 12 * 60 * 60 * 1000; // 12 hours
const totalWindows = 600; // 600 snapshots

// generate sliding window of x hours, look backwards
const currentTime = Date.now();

const buildWindowData = {};

let currentWindow = 0;
// let countingData = true;
while (currentWindow <= totalWindows) {
// console.log("currentWindow", currentWindow);
const windowOffset = currentWindow * snapshotWindow;

// get this
const windowStart = currentTime - windowOffset;
const windowEnd = windowStart - snapshotWindow;

// filter to all entries recorded before the window start (a cumulative total at that point in time)
const windowData = aggregateDataObject.filter((entry) => {
return entry.time < windowStart;
});
console.log("currentWindow", currentWindow, windowStart, windowEnd, windowData.length);

// // stop if no data
// if (windowData.length === 0) {
// countingData = false;
// break;
// }

// console.log("windowData", windowData);

// count data
const countData = {};
windowData.forEach((entry) => {
if (!countData[entry.value]) {
countData[entry.value] = 1;
} else {
countData[entry.value]++;
}
});

// console.log("countData", countData);

// store data
buildWindowData[windowStart] = countData;

currentWindow++;
}

console.log("buildWindowData", buildWindowData);

await this.fileWriter.storeMetricsSeries({
versions: buildWindowData,
});

// throw new Error("Not Implemented");
}
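The snapshot loop above reduces to a small pure function: for each window-start timestamp, count how many entries of each value were recorded at or before that time. A self-contained sketch of that aggregation, with illustrative names rather than the PR's actual exports:

```typescript
// An attribute observation, e.g. one instance's Lemmy version at a point in time.
interface TimedValue {
  time: number; // ms timestamp (the sorted-set score)
  value: string; // e.g. a Lemmy version string
}

// For each snapshot time, count entries recorded before it, grouped by value --
// the same cumulative tally the while-loop above builds per 12-hour window.
function snapshotCounts(
  entries: TimedValue[],
  snapshotTimes: number[],
): Record<number, Record<string, number>> {
  const out: Record<number, Record<string, number>> = {};
  for (const t of snapshotTimes) {
    const counts: Record<string, number> = {};
    for (const e of entries) {
      if (e.time < t) counts[e.value] = (counts[e.value] ?? 0) + 1;
    }
    out[t] = counts;
  }
  return out;
}

const sample: TimedValue[] = [
  { time: 100, value: "0.19.0" },
  { time: 200, value: "0.19.0" },
  { time: 300, value: "0.19.1" },
];
const snapshots = snapshotCounts(sample, [250, 400]);
```

Because the filter keeps everything before the window start (not just entries inside the window), each snapshot is a running total, which matches the "total at that point in time" comment in the method.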

// FEDIVERSE

private async outputFediverseData(outputInstanceData): Promise<IFediverseDataOutput[]> {
Binary file added docs/images/0.10.0-communities.png
File renamed without changes
1 change: 1 addition & 0 deletions frontend/package.json
@@ -32,6 +32,7 @@
"@tanstack/react-query-devtools": "^4.29.23",
"@uidotdev/usehooks": "^2.0.1",
"axios": "^1.4.0",
"d3-scale": "^4.0.2",
"masonic": "^3.7.0",
"moment": "^2.29.4",
"notistack": "^3.0.1",