feat: added line numbers and hyperlink by function and file #107

Merged
merged 27 commits into from
Jun 6, 2024
Changes from 11 commits
04be813
feat: added file_function_lineno_map preloading upon initialization o…
May 27, 2024
9574976
addressed Tiff comment: maintaining the same format as demonstrated l…
May 27, 2024
b90ac6b
Merge branch '82-export-report-refactor' into 104-add-line-number
May 27, 2024
2177285
feat: added Line Numbers in report
May 27, 2024
b9c61bf
feat: added hyperlink in the report
May 28, 2024
e266afd
saved demo
May 28, 2024
60a6dec
cleaned up demo
May 28, 2024
715aee0
resolved merge conflict
May 30, 2024
5e8a6e6
resolved bugs due to merge conflict
May 30, 2024
a364786
saved demo
May 30, 2024
08b7d13
fix misleading variable name
SoloSynth1 May 30, 2024
bf615cf
Merge branch 'main' into 104-add-line-number
SoloSynth1 Jun 4, 2024
5a26a62
Merge branch 'main' into 104-add-line-number
SoloSynth1 Jun 4, 2024
7f02ed5
break down `Repository` and separate git-related methods to `GitContext`
SoloSynth1 Jun 4, 2024
613f73e
move abstract `CodeAnalyzer`
SoloSynth1 Jun 4, 2024
93160a7
include basic tests for gitcontext; improve tests by parametrizing
SoloSynth1 Jun 4, 2024
01ba5b0
change prompt to improve response quality from LLM
SoloSynth1 Jun 4, 2024
52dfea2
add `GitPython` as dependency
SoloSynth1 Jun 4, 2024
c6b4a13
fix validation logic to return invalid responses
SoloSynth1 Jun 4, 2024
7157a8b
wip: change calling procedure for line number formatting
SoloSynth1 Jun 4, 2024
d8f2890
change schema of `evaluationreseponse` to include repository and chec…
SoloSynth1 Jun 5, 2024
d401443
reduce redundant computation on finding out which remote service to use
SoloSynth1 Jun 5, 2024
e74525b
change parser to attach link to github repo on function names
SoloSynth1 Jun 5, 2024
835334a
fix error raising lines being too long
SoloSynth1 Jun 5, 2024
9edb286
implement naive parser for function line number as fallback; remove u…
SoloSynth1 Jun 5, 2024
2bf68a7
add test for checking links when no remote is provided
SoloSynth1 Jun 5, 2024
836e056
fix incorrectly set default argument
SoloSynth1 Jun 5, 2024
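Several commits above attach GitHub links and line numbers to function names in the report, and one adds a test for the case where no remote is provided. The following is a minimal sketch of how such a link could be built, assuming GitHub's usual `blob/<ref>/<path>#L<line>` URL layout. `normalize_remote` and `build_line_link` are hypothetical names for illustration, not functions from this PR, and the plain `path:line` fallback is an assumption about what a no-remote result might look like.

```python
from typing import Optional
from urllib.parse import quote


def normalize_remote(remote_url: str) -> str:
    """Turn an SSH or HTTPS GitHub remote into a browsable https base URL."""
    url = remote_url.strip()
    if url.startswith("git@github.com:"):
        url = "https://github.com/" + url[len("git@github.com:"):]
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url.rstrip("/")


def build_line_link(remote_url: Optional[str], ref: str,
                    file_path: str, lineno: int) -> str:
    """Return a URL pointing at one line of a file (hypothetical helper)."""
    if not remote_url:
        # Assumption: without a remote, fall back to a plain-text location.
        return f"{file_path}:{lineno}"
    base = normalize_remote(remote_url)
    # quote() keeps "/" by default, so nested paths stay readable.
    return f"{base}/blob/{quote(ref)}/{quote(file_path)}#L{lineno}"


print(build_line_link("git@github.com:owner/repo.git", "main", "tests/test_x.py", 12))
```

Resolving the remote once and reusing the base URL for every function would match the intent of the "reduce redundant computation" commit, though the PR's actual approach is not shown here.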
96 changes: 29 additions & 67 deletions src/test_creation/demo.ipynb
@@ -33,7 +33,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:32<00:00, 10.71s/it]\n"
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:33<00:00, 11.16s/it]\n"
]
}
],
@@ -55,64 +55,34 @@
 "output_type": "stream",
 "text": [
 "Report:\n",
-" Requirement \\\n",
-"ID Title \n",
-"2.1 Ensure Data File Loads as Expected Ensure that data-loading functions correctly l... \n",
-"3.2 Data in the Expected Format Verify that the data to be ingested matches th... \n",
-"3.5 Check for Duplicate Records in Data Check for duplicate records in the dataset and... \n",
-"4.2 Verify Data Split Proportion Check that the data is split into training and... \n",
-"5.3 Ensure Model Output Shape Aligns with Expectation Ensure the shape of the model's output aligns ... \n",
-"6.1 Verify Evaluation Metrics Implementation Verify that the evaluation metrics are correct... \n",
-"6.2 Evaluate Model's Performance Against Thresholds Compute evaluation metrics for both the traini... \n",
-"\n",
 " is_Satisfied \\\n",
 "ID Title \n",
 "2.1 Ensure Data File Loads as Expected 0.0 \n",
 "3.2 Data in the Expected Format 0.0 \n",
 "3.5 Check for Duplicate Records in Data 0.0 \n",
 "4.2 Verify Data Split Proportion 0.5 \n",
 "5.3 Ensure Model Output Shape Aligns with Expectation 0.0 \n",
-"6.1 Verify Evaluation Metrics Implementation 1.0 \n",
+"6.1 Verify Evaluation Metrics Implementation 0.5 \n",
 "6.2 Evaluate Model's Performance Against Thresholds 0.5 \n",
 "\n",
-" n_files_tested \\\n",
-"ID Title \n",
-"2.1 Ensure Data File Loads as Expected 3 \n",
-"3.2 Data in the Expected Format 3 \n",
-"3.5 Check for Duplicate Records in Data 3 \n",
-"4.2 Verify Data Split Proportion 3 \n",
-"5.3 Ensure Model Output Shape Aligns with Expectation 3 \n",
-"6.1 Verify Evaluation Metrics Implementation 3 \n",
-"6.2 Evaluate Model's Performance Against Thresholds 3 \n",
-"\n",
-" Observations \\\n",
-"ID Title \n",
-"2.1 Ensure Data File Loads as Expected [(test_cross_validation.py) The code does not ... \n",
-"3.2 Data in the Expected Format [(test_cross_validation.py) The code does not ... \n",
-"3.5 Check for Duplicate Records in Data [(test_cross_validation.py) The code does not ... \n",
-"4.2 Verify Data Split Proportion [(test_cross_validation.py) The code tests the... \n",
-"5.3 Ensure Model Output Shape Aligns with Expectation [(test_cross_validation.py) The code does not ... \n",
-"6.1 Verify Evaluation Metrics Implementation [(test_cross_validation.py) The code does not ... \n",
-"6.2 Evaluate Model's Performance Against Thresholds [(test_cross_validation.py) The code does not ... \n",
+" n_files_tested \n",
+"ID Title \n",
+"2.1 Ensure Data File Loads as Expected 3 \n",
+"3.2 Data in the Expected Format 3 \n",
+"3.5 Check for Duplicate Records in Data 3 \n",
+"4.2 Verify Data Split Proportion 3 \n",
+"5.3 Ensure Model Output Shape Aligns with Expectation 3 \n",
+"6.1 Verify Evaluation Metrics Implementation 3 \n",
+"6.2 Evaluate Model's Performance Against Thresholds 3 \n",
 "\n",
-" Function References \n",
-"ID Title \n",
-"2.1 Ensure Data File Loads as Expected [{'File Path': '../../data/raw/openja/lightfm_... \n",
-"3.2 Data in the Expected Format [{'File Path': '../../data/raw/openja/lightfm_... \n",
-"3.5 Check for Duplicate Records in Data [{'File Path': '../../data/raw/openja/lightfm_... \n",
-"4.2 Verify Data Split Proportion [{'File Path': '../../data/raw/openja/lightfm_... \n",
-"5.3 Ensure Model Output Shape Aligns with Expectation [{'File Path': '../../data/raw/openja/lightfm_... \n",
-"6.1 Verify Evaluation Metrics Implementation [{'File Path': '../../data/raw/openja/lightfm_... \n",
-"6.2 Evaluate Model's Performance Against Thresholds [{'File Path': '../../data/raw/openja/lightfm_... \n",
-"\n",
-"Score: 2.0/7\n",
+"Score: 1.5/7\n",
 "\n"
 ]
 },
 {
 "data": {
 "text/plain": [
-"'2.0/7'"
+"'1.5/7'"
 ]
 },
 "execution_count": 4,
@@ -121,7 +91,7 @@
 }
 ],
 "source": [
-"parser = ResponseParser(response)\n",
+"parser = ResponseParser(response, repo)\n",
 "parser.get_completeness_score(verbose=True)"
 ]
 },
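The reported score appears to be the sum of the `is_Satisfied` column over the number of checklist items: the new column values sum to 1.5 across 7 items, and 1.5/7 ≈ 0.214286 matches the `score` column further down. A minimal sketch under that assumption follows; `completeness_score` here is a hypothetical standalone function, not the `ResponseParser.get_completeness_score` method, whose internals are not shown in this diff.

```python
from typing import Sequence, Tuple


def completeness_score(is_satisfied: Sequence[float]) -> Tuple[str, float]:
    """Return the "sum/count" label and the same score as a fraction.

    Assumption: each checklist item contributes 0.0, 0.5 or 1.0.
    """
    if not is_satisfied:
        raise ValueError("at least one checklist item is required")
    total = float(sum(is_satisfied))
    count = len(is_satisfied)
    return f"{total}/{count}", total / count


# is_Satisfied values from the updated report above.
label, fraction = completeness_score([0.0, 0.0, 0.0, 0.5, 0.0, 0.5, 0.5])
print(label, round(fraction, 6))
```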
@@ -199,7 +169,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:35<00:00, 11.94s/it]\n"
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:31<00:00, 10.40s/it]\n"
]
},
{
@@ -213,7 +183,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:32<00:00, 10.70s/it]\n"
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:31<00:00, 10.36s/it]\n"
]
},
{
@@ -228,7 +198,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:38<00:00, 12.83s/it]\n"
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:39<00:00, 13.16s/it]\n"
]
},
{
@@ -242,7 +212,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:40<00:00, 13.34s/it]\n"
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:42<00:00, 14.24s/it]\n"
]
}
],
@@ -306,14 +276,14 @@
 " </tr>\n",
 " <tr>\n",
 " <th>1</th>\n",
-" <td>0.142857</td>\n",
+" <td>0.214286</td>\n",
 " <td>ID ...</td>\n",
 " <td>gpt-3.5-turbo</td>\n",
 " <td>2</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>2</th>\n",
-" <td>0.714286</td>\n",
+" <td>0.642857</td>\n",
 " <td>ID ...</td>\n",
 " <td>gpt-4o</td>\n",
 " <td>1</td>\n",
@@ -332,8 +302,8 @@
 "text/plain": [
 " score report model_name \\\n",
 "0 0.214286 ID ... gpt-3.5-turbo \n",
-"1 0.142857 ID ... gpt-3.5-turbo \n",
-"2 0.714286 ID ... gpt-4o \n",
+"1 0.214286 ID ... gpt-3.5-turbo \n",
+"2 0.642857 ID ... gpt-4o \n",
 "3 0.714286 ID ... gpt-4o \n",
 "\n",
 " test_no \n",
@@ -428,7 +398,7 @@
 " <td>Check that the data is split into training and...</td>\n",
 " <td>0.5</td>\n",
 " <td>3</td>\n",
-" <td>[(test_cross_validation.py) The code does spli...</td>\n",
+" <td>[(test_cross_validation.py) The code includes ...</td>\n",
 " <td>[{'File Path': '../../data/raw/openja/lightfm_...</td>\n",
 " </tr>\n",
 " <tr>\n",
@@ -488,7 +458,7 @@
 "0 3 [(test_cross_validation.py) The code does not ... \n",
 "1 3 [(test_cross_validation.py) The code does not ... \n",
 "2 3 [(test_cross_validation.py) The code does not ... \n",
-"3 3 [(test_cross_validation.py) The code does spli... \n",
+"3 3 [(test_cross_validation.py) The code includes ... \n",
 "4 3 [(test_cross_validation.py) The code does not ... \n",
 "5 3 [(test_cross_validation.py) The code does not ... \n",
 "6 3 [(test_cross_validation.py) The code does not ... \n",
@@ -559,12 +529,12 @@
 " <tbody>\n",
 " <tr>\n",
 " <th>gpt-3.5-turbo</th>\n",
-" <td>0.002551</td>\n",
+" <td>0.000000</td>\n",
 " <td>2</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>gpt-4o</th>\n",
-" <td>0.000000</td>\n",
+" <td>0.002551</td>\n",
 " <td>2</td>\n",
 " </tr>\n",
 " </tbody>\n",
@@ -575,8 +545,8 @@
 " score \n",
 " var count\n",
 "model_name \n",
-"gpt-3.5-turbo 0.002551 2\n",
-"gpt-4o 0.000000 2"
+"gpt-3.5-turbo 0.000000 2\n",
+"gpt-4o 0.002551 2"
 ]
 },
 "execution_count": 11,
@@ -597,19 +567,11 @@
 "id": "5b1f94c8-1883-4435-84c7-b0687a6e6387",
 "metadata": {},
 "outputs": [
-{
-"name": "stderr",
-"output_type": "stream",
-"text": [
-"/var/folders/vd/r3dvzdx10pxf47gvdqf81r9h0000gn/T/ipykernel_42405/1426530661.py:5: RuntimeWarning: divide by zero encountered in scalar divide\n",
-" f_score = score_var[('score', 'var')]['gpt-3.5-turbo'] / score_var[('score', 'var')]['gpt-4o'] # var(prev) / var(curr)\n"
-]
-},
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"p-value: 0.0\n",
+"p-value: 1.0\n",
 "\n",
 "2-tail test:\n",
 " Successfully reject the null hypothesis: Var(Completeness_Score(Current Version)) == Var(Completeness_Score(Last Week Version))\n"
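The p-value flips from 0.0 to 1.0 because the zero variance moved from the denominator to the numerator of `var(prev) / var(curr)`, which also removes the divide-by-zero warning. The sketch below reproduces both outcomes with the standard library only. It assumes the p-value is the upper-tail probability of an F statistic with (1, 1) degrees of freedom, which follows from two runs per model, and uses the closed form P(F > x) = 1 - (2/π)·atan(√x) that holds for that case. The notebook's own p-value code is not visible in this diff, and `f_test_two_runs` is a hypothetical name.

```python
import math
from statistics import variance
from typing import Sequence, Tuple


def f_test_two_runs(prev: Sequence[float], curr: Sequence[float]) -> Tuple[float, float]:
    """Variance-ratio test for two groups of exactly two scores each.

    Returns (f_score, upper_tail_p_value), guarding the zero-variance cases
    instead of letting the division raise or emit a warning.
    """
    if len(prev) != 2 or len(curr) != 2:
        raise ValueError("the closed form below only holds for two scores per group")
    var_prev, var_curr = variance(prev), variance(curr)
    if var_curr == 0:
        if var_prev == 0:
            return math.nan, math.nan  # 0/0: the ratio is undefined
        return math.inf, 0.0           # x/0: the statistic diverges
    f_score = var_prev / var_curr
    # Survival function of F(1, 1): 1 - (2/pi) * atan(sqrt(x)).
    p_value = 1.0 - (2.0 / math.pi) * math.atan(math.sqrt(f_score))
    return f_score, p_value


# Scores from the tables above: gpt-3.5-turbo as "prev", gpt-4o as "curr".
print(f_test_two_runs([0.214286, 0.142857], [0.714286, 0.714286]))  # before this change
print(f_test_two_runs([0.214286, 0.214286], [0.642857, 0.714286]))  # after this change
```

With only two runs per model, either extreme mostly reflects run-to-run noise in a single pair of scores, so the rejection message in the output should be read with caution.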
11 changes: 10 additions & 1 deletion src/test_creation/modules/code_analyzer/analyzers/python.py
@@ -1,5 +1,6 @@
 import ast
 from functools import wraps
+from collections import defaultdict

 from . import CodeAnalyzer

@@ -24,6 +25,14 @@ def read(self, file_path: str):
self.content = f.read()
self._tree = ast.parse(self.content)

+    @assert_have_read_content
+    def _get_function_lineno_map(self):  # FIXME: when to use _xxx? when to use xxx?
+        function_lineno_map = defaultdict(int)
+        for node in ast.walk(self._tree):
+            if isinstance(node, ast.FunctionDef):
+                function_lineno_map[node.name] = node.lineno
+        return function_lineno_map
+
     @assert_have_read_content
     def list_imported_packages(self):
         packages = set()
@@ -36,7 +45,7 @@ def list_imported_packages(self):

     @assert_have_read_content
     def list_all_functions(self):
-        raise NotImplementedError()
+        return self._get_function_lineno_map().keys()

     @assert_have_read_content
     def contains_test(self):
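The new `_get_function_lineno_map` walks the parsed tree and records the line of each `def`. A standalone sketch of the same idea follows, using only the standard `ast` module. `function_lineno_map` and the sample `SOURCE` are illustrative, not code from this repository. Unlike the method in the diff, this variant also matches `ast.AsyncFunctionDef`, which is an optional extension rather than the PR's behavior.

```python
import ast
from typing import Dict

SOURCE = '''\
def test_a():
    pass


class TestB:
    def test_b(self):
        pass


async def test_c():
    pass
'''


def function_lineno_map(source: str) -> Dict[str, int]:
    """Map each function name in `source` to the line of its `def`."""
    tree = ast.parse(source)
    return {
        node.name: node.lineno
        for node in ast.walk(tree)
        # Assumption: async functions are included as well.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


print(function_lineno_map(SOURCE))
```

Because the map is keyed by bare function name, two functions with the same name (for example, methods on different classes) overwrite each other. Keying on a qualified name would avoid that, at the cost of a slightly more involved traversal.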