CMS Forms Flexible Imports #419

DevChima · 2025-02-17T10:58:58Z

Purpose

CMS forms need a more robust import mechanism.
Error handling should also be more intuitive, so that users get better guidance when an import fails.

Solution

This PR includes:
A header validator function that validates the headers before rows are written
A list of mandatory headers that each import must have
A snake-case function that converts headers to snake case so the import is more flexible
For example, if a user enters Generic Error instead of generic_error as a header name, it is converted and passes validation.
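
The flow described in this list can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the contents of `MANDATORY_HEADERS` and the helper names `normalize_header` and `validate_headers` as written here are assumptions, and `ValueError` stands in for whatever exception the importer actually raises.

```python
# Hypothetical subset; the real mandatory header list lives in the PR's code.
MANDATORY_HEADERS = ["title", "generic_error"]


def normalize_header(header: str) -> str:
    """Lowercase a header and turn spaces into underscores."""
    return header.strip().lower().replace(" ", "_")


def validate_headers(headers: list[str]) -> None:
    """Raise ValueError if any mandatory header is absent."""
    missing = [h for h in MANDATORY_HEADERS if h not in headers]
    if missing:
        raise ValueError(f"Missing mandatory headers: {', '.join(missing)}")


raw_headers = ["Title", "Generic Error"]
headers = [normalize_header(h) for h in raw_headers]
validate_headers(headers)  # passes: "Generic Error" became "generic_error"
```

Validating once on the normalized headers, before any row is processed, lets the importer report a header problem up front rather than partway through the file.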

Checklist

Added or updated unit tests
Added to release notes
Updated readme/documentation (if necessary)

erikh360

Nice! A few comments.

erikh360 · 2025-02-17T11:21:27Z

home/tests/test_assessment_import_export.py

@@ -425,6 +425,41 @@ def test_single_assessment(self, impexp: ImportExport) -> None:
        imported = impexp.get_assessment_json()
        assert imported == orig

+    def test_snake_case_assessments(self, csv_impexp: ImportExport) -> None:
+        """
+        Importing a csv with spaces in header names and uppercase text should be coverted to snake case


Spelling: "coverted"

erikh360 · 2025-02-17T11:22:57Z

home/tests/test_assessment_import_export.py

+        model_fields = [field.name for field in assessment_model._meta.get_fields()]
+
+        expected_fields = {
+            "title": True,


Is the True value here not redundant since they're all True? Removing this would make your for loop below less complicated

Actually that's true.
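
A small sketch of the simplification suggested here, assuming the test only needs to know which fields must exist. The field names other than `title` and the plain list standing in for the model's fields are illustrative assumptions.

```python
# Hypothetical stand-ins: "slug" and this field list are assumptions,
# replacing the Django model introspection used in the real test.
model_fields = ["id", "title", "slug", "questions"]

# A set of names carries the same information as a dict of name -> True.
expected_fields = {"title", "slug"}

# Set difference replaces the loop over dict items.
missing = expected_fields - set(model_fields)
assert not missing, f"Missing model fields: {sorted(missing)}"
```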

erikh360 · 2025-02-17T11:27:49Z

home/import_assessments.py

+        row_iterator = parse_file(self.file_content, self.file_type)
+        rows = [row for _, row in row_iterator]
+
+        if rows:


Doesn't look like we have a test covering rows being empty?

I've added it in but I also had to modify the helper function here:

contentrepo/home/import_helpers.py

Line 172 in ad6ca7b

first_row = next(rows)

It would throw a runtime stop-iteration error because it tries to call next(rows) on an empty iterator when the file is empty.
So I put it in a try-except block to catch the exception if the iterator is empty.
Then in the parse_file function, I added an if not rows check to catch empty rows.
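
The failure mode described here can be reproduced in isolation. The sketch below assumes the helper is a generator, which is consistent with the error described but is not shown in this thread; `unguarded` and `guarded` are hypothetical names, not functions from import_helpers.py.

```python
from collections.abc import Iterator

Row = dict[str, str]


def unguarded(rows: Iterator[Row]) -> Iterator[Row]:
    # On empty input, next() raises StopIteration. Inside a generator,
    # Python converts that into a RuntimeError (PEP 479).
    first_row = next(rows)
    yield first_row
    yield from rows


def guarded(rows: Iterator[Row]) -> Iterator[Row]:
    try:
        first_row = next(rows)
    except StopIteration:
        return  # empty input: yield nothing instead of crashing
    yield first_row
    yield from rows
```

With the guard in place the caller receives an empty sequence and can decide how to report it, for example with an explicit "file is empty" import error.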

@erikh360

erikh360 · 2025-02-17T13:30:33Z

home/import_assessments.py

@@ -185,6 +214,21 @@ def create_shadow_assessment_from_row(
        )
        assessment.questions.append(question)

+    def validate_headers(


Do we have to pass in MANDATORY_HEADERS here? validate_headers is in the same class where it is called and can also access the variable directly.

erikh360 · 2025-02-17T13:33:54Z

home/import_assessments.py

+        original_headers = rows[0].keys()
+        headers_mapping = {
+            header: self.to_snake_case(header) for header in original_headers
+        }
+        snake_case_headers = list(headers_mapping.values())
+        self.validate_headers(snake_case_headers, MANDATORY_HEADERS, row_num=1)
+        transformed_rows = [
+            {headers_mapping[key]: value for key, value in row.items()} for row in rows
+        ]


Should some of this not maybe live inside the parse_file function? I see we have a function in there called fix_rows?

I'm not very familiar with all the import code so I'm just asking

I believe we can move some there so it's also more accessible to other apps that will share the helper functions. There we can check if there are rows. We could perhaps handle snake_case conversion there too. I'll take a look.

… apps for file imports.
- Check empty rows() now actioned for assessment, content pages imports and any other CMS imports that use the parse_file function.
- We are not able to move snake_case into fix_rows() in import_helpers.py because contentset uses PascalCase for headers. If we snake_cased every import, there would be key errors in contentsets.

erikh360 · 2025-02-18T06:41:17Z

home/import_assessments.py

+                row_num=row_num,
+            )
+
+    def to_snake_case(self, s: str) -> str:


Is this being used?

jerith · 2025-02-18T07:20:51Z

home/import_helpers.py

+def to_snake_case(s: str) -> str:
+    """
+    Converts string to snake_case.
+    """
+    return re.sub(r"[\W_]+", "_", s).lower().strip("_")


I'd rather have a specific set of characters that we convert, perhaps just spaces and dashes. Anything too far off should be an error to avoid confusion.

If we do it that way then making it reusable across all imports could be more cumbersome since they all have their various headers?

jerith · 2025-02-19T06:49:49Z

home/import_helpers.py

+    if any(char in s for char in INVALID_CHARACTERS):
+        raise ImportException(
+            f"Invalid header: '{s}' contains invalid characters.", row_num=row_num
+        )
+    return re.sub(r"[\W_]+", "_", s).strip("_")


Rather than maintaining a long list of invalid non-word characters (some of which we may potentially want to use in the future) and replacing everything else with _, we can replace just the _-equivalent characters:

Suggested change

-    if any(char in s for char in INVALID_CHARACTERS):
-        raise ImportException(
-            f"Invalid header: '{s}' contains invalid characters.", row_num=row_num
-        )
-    return re.sub(r"[\W_]+", "_", s).strip("_")
+    return s.replace(" ", "_").replace("-", "_").strip("_")

(I typed this directly into the PR comment, so check that it works before accepting the suggestion.)
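
To make the difference concrete, here is a side-by-side sketch of the two approaches discussed in this thread. The function names are hypothetical labels for comparison. Note that the suggestion as typed does not lowercase the header; whether lowercasing happens elsewhere is not shown in this thread.

```python
import re


def to_snake_case_regex(s: str) -> str:
    """Earlier approach: collapse every run of non-word characters into '_'."""
    return re.sub(r"[\W_]+", "_", s).lower().strip("_")


def to_snake_case_simple(s: str) -> str:
    """Suggested approach: convert only spaces and dashes."""
    return s.replace(" ", "_").replace("-", "_").strip("_")


print(to_snake_case_regex("Generic Error!"))   # generic_error
print(to_snake_case_simple("Generic Error!"))  # Generic_Error!
```

The regex version silently accepts a header with stray punctuation, while the simple version leaves the unexpected character in place, so the header fails later validation and the user sees an error instead of a quiet rewrite.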

jerith · 2025-02-19T09:34:23Z

home/import_assessments.py

+    @classmethod
+    def check_missing_fields(cls, row: dict[str, str], row_num: int) -> None:
+        """
+        Checks for missing required fields in the row and raises an exception if any is missing.
+        """
+        missing_fields = [field for field in MANDATORY_HEADERS if field not in row]
+        if missing_fields:
+            raise ImportAssessmentException(
+                f"The import file is missing required fields: {', '.join(missing_fields)}",
+                row_num,
+            )
+


Isn't this already covered by AssessmentImporter.validate_headers?

No, validate_headers checks for missing headers before writing to rows. check_missing_fields then checks that the rows are not missing fields for the mandatory headers.

All rows will have exactly the same set of keys, so if the headers aren't missing there they won't be missing here. If the intent is to check for missing values, we should change the code to do that instead of just checking the keys.

(Unless we're removing empty fields somewhere along the way and I missed it?)
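
The distinction between missing keys and missing values can be shown with a short sketch. It assumes rows are dicts that always carry every header as a key; `MANDATORY_HEADERS` here is a hypothetical subset and `missing_values` is an illustrative name, not the PR's function.

```python
MANDATORY_HEADERS = ["title", "generic_error"]  # hypothetical subset


def missing_keys(row: dict[str, str]) -> list[str]:
    """Key-only check: passes whenever the header exists, even if the cell is blank."""
    return [f for f in MANDATORY_HEADERS if f not in row]


def missing_values(row: dict[str, str]) -> list[str]:
    """Value check: flags mandatory fields that are absent or blank in this row."""
    return [f for f in MANDATORY_HEADERS if not (row.get(f) or "").strip()]


row = {"title": "Quiz", "generic_error": ""}
print(missing_keys(row))    # []
print(missing_values(row))  # ['generic_error']
```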

jerith · 2025-02-19T11:42:28Z

home/import_assessments.py

            raise ImportAssessmentException(
-                "The import file is missing some required fields."
+                f"The import file is missing required fields: {', '.join(missing_fields)}",


Perhaps "Row missing values for required fields: {...}" instead?

DevChima added 3 commits February 17, 2025 05:48

CMS Forms Flexible Imports

7ea1b7c

Merge branch 'main' into assessment-flexible-import-misisng-error-fields

92cc3d9

Update Changelog

c3270af

erikh360 reviewed Feb 17, 2025

Handle case where import files are empty

10bd8b2

erikh360 reviewed Feb 17, 2025

erikh360 reviewed Feb 18, 2025

Remove redundant function that was moved to import_helpers file

3206672

jerith reviewed Feb 18, 2025

DevChima added 2 commits February 18, 2025 05:54

specific list of characters we convert and remove unneeded tests

3b890da

Add check for invalid characters in headers, an exception and a test to handle it

cc507cd

jerith reviewed Feb 19, 2025

We do not want to catch all invalid characters. Remove invalid character exception

27d1c37

jerith reviewed Feb 19, 2025

Update dataclass s mandatory fields have no default value

3329359

jerith reviewed Feb 19, 2025

DevChima added 2 commits February 19, 2025 06:49

Update exception error message

ce3cbc9

Add more tests for headers and missing values

5e95476

jerith approved these changes Feb 19, 2025

DevChima merged commit 9cca1d2 into main Feb 19, 2025
2 checks passed

DevChima deleted the assessment-flexible-import-misisng-error-fields branch February 19, 2025 13:11
