first commit
Anmol2059 committed Sep 2, 2024
0 parents commit 14e0e6d
Showing 14 changed files with 152,625 additions and 0 deletions.
60 changes: 60 additions & 0 deletions chipsal-datasets/datasetInfo.txt
@@ -0,0 +1,60 @@
Datasets are from: https://github.com/therealthapa/chipsal24
---------------------------------------
Sub Task 1 --> Devanagari Script Language Identification
---------------------------------------
'Nepali' : 0
'Marathi' : 1
'Sanskrit' : 2
'Bhojpuri' : 3
'Hindi' : 4

[Train Data]
-------------
Nepali : 12544
Marathi : 11034
Sanskrit : 10996
Bhojpuri : 10184
Hindi : 7664

[Evaluation Data]
-----------------
Nepali : 2688
Marathi : 2364
Sanskrit : 2356
Bhojpuri : 2182
Hindi : 1643

---------------------------------------
Sub Task 2 --> Hate Speech Detection in Devanagari Script Language
---------------------------------------
'non-hate' : 0
'hate' : 1

[Train Data]
-------------
non-hate : 16805
hate : 2214

[Evaluation Data]
-----------------
non-hate : 3602
hate : 474

---------------------------------------
Sub Task 3 --> Target Identification for Hate Speech in Devanagari Script Language
---------------------------------------
'Individual' : 0
'Organization' : 1
'Community' : 2

[Train Data]
-------------
Individual : 1074
Organization : 856
Community : 284

[Evaluation Data]
-----------------
Individual : 230
Organization : 183
Community : 61
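
The training counts above are imbalanced, most sharply in Sub Task 2. As a rough illustration only (the repository is not shown doing this), inverse-frequency class weights can be derived directly from the listed counts. The names `TRAIN_COUNTS`, `class_weights`, and `minority_share` are hypothetical, and the formula total / (n_classes * count) is one common convention assumed here.

```python
# Training counts copied from datasetInfo.txt; the dictionary layout is an assumption.
TRAIN_COUNTS = {
    "task1": {"Nepali": 12544, "Marathi": 11034, "Sanskrit": 10996,
              "Bhojpuri": 10184, "Hindi": 7664},
    "task2": {"non-hate": 16805, "hate": 2214},
    "task3": {"Individual": 1074, "Organization": 856, "Community": 284},
}


def class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * count) per class."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


def minority_share(counts):
    """Fraction of examples that belong to the rarest class."""
    return min(counts.values()) / sum(counts.values())


# Print a per-task summary of the imbalance and the resulting weights.
for task, counts in TRAIN_COUNTS.items():
    weights = class_weights(counts)
    print(task, f"minority share = {minority_share(counts):.3f}")
    for label, weight in weights.items():
        print(f"  {label}: weight = {weight:.3f}")
```

Weights of this kind are typically passed to a weighted loss. Whether they help on these tasks would need to be checked empirically.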
60 changes: 60 additions & 0 deletions chipsal-datasets/eval/info.txt
@@ -0,0 +1,60 @@
Datasets are from: https://github.com/therealthapa/chipsal24
---------------------------------------
Sub Task 1 --> Devanagari Script Language Identification
---------------------------------------
'Nepali' : 0
'Marathi' : 1
'Sanskrit' : 2
'Bhojpuri' : 3
'Hindi' : 4

[Train Data]
-------------
Nepali : 12544
Marathi : 11034
Sanskrit : 10996
Bhojpuri : 10184
Hindi : 7664

[Evaluation Data]
-----------------
Nepali : 2688
Marathi : 2364
Sanskrit : 2356
Bhojpuri : 2182
Hindi : 1643

---------------------------------------
Sub Task 2 --> Hate Speech Detection in Devanagari Script Language
---------------------------------------
'non-hate' : 0
'hate' : 1

[Train Data]
-------------
non-hate : 16805
hate : 2214

[Evaluation Data]
-----------------
non-hate : 3602
hate : 474

---------------------------------------
Sub Task 3 --> Target Identification for Hate Speech in Devanagari Script Language
---------------------------------------
'Individual' : 0
'Organization' : 1
'Community' : 2

[Train Data]
-------------
Individual : 1074
Organization : 856
Community : 284

[Evaluation Data]
-----------------
Individual : 230
Organization : 183
Community : 61
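
The counts in this file can be sanity-checked with a little arithmetic. The sketch below compares class proportions between the train and evaluation splits and confirms that the Sub Task 3 totals equal the Sub Task 2 `hate` counts (2214 and 474). The helper names `proportions` and `max_proportion_gap` are hypothetical, and the nested dictionaries are just one assumed way to hold the numbers.

```python
# Counts copied from info.txt; the nested-dict layout is an assumption.
TRAIN = {
    "task1": {"Nepali": 12544, "Marathi": 11034, "Sanskrit": 10996,
              "Bhojpuri": 10184, "Hindi": 7664},
    "task2": {"non-hate": 16805, "hate": 2214},
    "task3": {"Individual": 1074, "Organization": 856, "Community": 284},
}
EVAL = {
    "task1": {"Nepali": 2688, "Marathi": 2364, "Sanskrit": 2356,
              "Bhojpuri": 2182, "Hindi": 1643},
    "task2": {"non-hate": 3602, "hate": 474},
    "task3": {"Individual": 230, "Organization": 183, "Community": 61},
}


def proportions(counts):
    """Map each class to its share of the split."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def max_proportion_gap(train_counts, eval_counts):
    """Largest absolute difference in class share between two splits."""
    p, q = proportions(train_counts), proportions(eval_counts)
    return max(abs(p[label] - q[label]) for label in train_counts)


# Report split sizes and how closely the class shares agree.
for task in TRAIN:
    gap = max_proportion_gap(TRAIN[task], EVAL[task])
    print(task, "train:", sum(TRAIN[task].values()),
          "eval:", sum(EVAL[task].values()), f"max share gap: {gap:.5f}")

# Sub Task 3 covers exactly the tweets labelled 'hate' in Sub Task 2.
print(sum(TRAIN["task3"].values()) == TRAIN["task2"]["hate"])
print(sum(EVAL["task3"].values()) == EVAL["task2"]["hate"])
```

A small gap in every task suggests the evaluation split was drawn with class proportions close to those of the training split.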
19 changes: 19 additions & 0 deletions chipsal-datasets/eval/run.sh
@@ -0,0 +1,19 @@
#!/bin/bash
# Download the task CSVs from Google Drive.
# Requires gdown (pip install gdown). Newer gdown releases deprecate the
# --id flag, so the file ID is passed as a positional argument here.
set -euo pipefail

# Download task1_index_tweet.csv
gdown 1wmivix0utKHmq6d6ICvpRBTo1pebV3yI -O task1_index_tweet.csv

# Download task1_index_label.csv
gdown 1IicURjnKv8IRvcB99VmSBUIO7SzDfwSu -O task1_index_label.csv

# Download task2_index_tweet.csv
gdown 1SD7bn-5bU0g13GQrZ-RXGDGyAFNSy6VC -O task2_index_tweet.csv

# Download task2_index_label.csv
gdown 1apPJPZnZTke9PJi7z1NvkJKaxn70bCYT -O task2_index_label.csv

# Download task3_index_tweet.csv
gdown 1-2TjS6xPfjWj9YaJGSf-JXXXfNz-2pNT -O task3_index_tweet.csv

# Download task3_index_label.csv
gdown 1-1k1yHOGP7Wij1mUG2iKaSTN8i1WUgPz -O task3_index_label.csv
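
Each task is downloaded as a pair of files, one with tweets and one with labels. A minimal sketch of joining such a pair is given below, using only the standard library. The column names `index`, `tweet`, and `label` are guesses based on the file names and are not confirmed by the source. The function names `read_rows` and `join_on_index` are hypothetical.

```python
import csv


def read_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def join_on_index(tweet_path, label_path, key="index",
                  text_col="tweet", label_col="label"):
    """Pair each tweet with its label via a shared key column.

    The default column names are assumptions; adjust them to match
    the actual CSV headers.
    """
    labels = {row[key]: row[label_col] for row in read_rows(label_path)}
    joined = []
    for row in read_rows(tweet_path):
        if row[key] in labels:  # skip tweets that have no label row
            joined.append((row[key], row[text_col], int(labels[row[key]])))
    return joined
```

For example, `join_on_index("task2_index_tweet.csv", "task2_index_label.csv")` would return a list of `(index, tweet, label)` tuples if the files use those headers.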