Refactor inputdata (#14)

* Strip preceding whitespace from column names in csv parsing
* Fix issue where only the part of the input stream before the first whitespace character was put into the "temp" variable
* Add tests for parsing csv log data with whitespace in headers and symbol columns
* Add inputdata tests to cmakelists
* Publish test results even if they fail
* Clear static lists and maps of inputdata class so no data from previous runs affects the running of tests
* mem_store: Add functionality for alternative input data implementations
  Big refactor of input data processing, still work in progress
* add refactored csv reader tests
* add configurable whitespace stripping and delimiter to csv parsing
* abbadingoreader wip
* add apta friend class
* Add abbadingo reading functionality to refactored inputdata
* Abbadingoreader works
* Rename inputdata classes to match the style of the rest of flexfringe
* Refactor WIP
* CSV refactor WIP
* Refactor csvreader to not need abbadingo format conversion stuff anymore
* clean up tests dir
* Remove abbadingo methods from csv reader
* clean up csv reading test, and add test for special characters
* clean up cmakelists tests build
* Ignore CMakeFiles
* Raise error for streaming mode which is currently not working
* Add csv parsing library
* add csv input parser WIP
* add csv parser to cmakelists
* Add parser interface
* Add csv parser WIP
* Bump c++ standard to 23
* Add csv parsing tests
* Add new parser files
* Parsing tests
* Very broken WIP
* Big broken WIP again
* CSV parsing sort of works
* Add abbadingo parser & test
* Include iterator in strutil since it is used there
* Add convenience method to add single element to symbolinfo instead of vector
* Allow column names to only be a label, if the label has been added before (or is a default label)
* Add constructor with perfect forwarding to make construction less annoying
* Make parser inheritance public
* Add test to check header parsing with no column names, just labels
* Fix includes
* Inputdata smoke tests
* More tests
* Update cmakelists
* Delete unused files
* Add convenience methods for string getting and contains
* Fix includes
* Update inputdata to read from the new parser classes
* Bump cmake versions
* Fix includes again
* Remove unused test file
* Add sstream include
* Add sliding window mode back in
* Smoke test for sliding window mode
* Set default param for sliding_window_type to false
* Add test for sliding window inputdata with abbadingo format
* Bump required cmake version to 3.24
* Add lexy as dependency for making recursive descent parsers
* Add lexy parser for enhanced abbadingo header + tests
* Include lexy in the utility folder instead of downloading it with cmake.
* Switch to LEXY_LIT macro instead of dsl::lit to ensure compatibility with compilers that do not support c++20 non-type template parameters yet
* Use new lexy abbadingo header parser and start work on symbol parser
* Abbadingo symbol parsing with optional attributes and data works!
* Abbadingo trace parsing WIP
* Add sstream include because CI complains about it not being there
* Remove old :0 symbol attr designator
* We can now parse abbadingo traces with lexy too!
* Add fmt string formatting library
* Use new abbadingo trace parser
* Handle parsing empty abbadingo trace "0 0"
* Use string views in abbadingo trace parser to avoid a bunch of unnecessary string copies. Is it actually faster? Maybe?!
* Use stringview in abbadingoparser
* Woops forgot fmt include in cmakelists
* Woops forgot quote
* Fix line number shown on parsing error being off
* Improve error messages shown by abbadingo trace parser
* Enable redirecting stdin and stderr in catch2 so the error messages printed by lexy show up in the test results
* Add test cases for incomplete specification of attribute and data
* Add some index bookkeeping to inputdata
* Add virtual destructor to keep the compiler from complaining about things being deleted by smart pointers
* Remove virtual specifier from method that doesn't need it
* Make state merger not delete memory it does not own
* Fix issue where inputdata would not correctly finalize traces by adding another tail with symbol -1
* Add additional checks to smoke test 1
* Fix order of expected traces
* Remove stdout redirect that was breaking xml test result generation
* Abbadingo parser now correctly adds symbol and trace attribute information to symbol_info objects
* Remove unused col_ids
* Allow constructing attribute from attribute info
* Make inputdata use new way of processing trace and symbol attributes
* CSV symbol and trace attr parsing WIP
* csv header attribute parsing WIP
* csvparser now uses the lexy csv header parser
* csvparser attr and tattr WIP
* Allow overwriting of trace attributes if they are specified more than once per trace
* Actually, don't allow overwriting trace attributes. We throw an exception now instead.
* Don't use lexy::trace in tests as it messes up the xml output formatting
* Remove old unused way of processing attributes
* Add finalization of trace into class method
* Sliding windows appear to work again!! (at least in the simple test cases)
* WIP
* dummyparser
* Allow reading inputdata trace by trace, with varying strategies
* Make standard inputdata read use strategy to prevent code duplication
* Make sliding window use strategy
* Make predict mode use the inputdata directly instead of having it try to read it from an input stream again
* Handle the special case of empty traces in abbadingo input files
* Remove unneeded statements from test case
* Fix handling of empty abbadingo traces, they now consist of a single finalized symbol which does not count towards the length of the trace
* Only finalize traces that aren't already finalized (like empty traces are by default)
* Fixed same bug as in Publications/CSS_and_streaming branch
* Changed problem with PAutomaC files, when newline character was wrong
* Small bug in reading the apta. Fixed by @SiccoVerwer
* Comment out idat, according to robert alphabet mismatch fix
* Fix inputdata parsing bug where it would return the last trace read infinitely
* Convert tail data pointer in tail objects to shared_ptr to prevent memory leaks, and make mem store free the tail data only if it is no longer in use
* Convert tail data attr pointer to unique_ptr to prevent another memory leak
* Fix dot output test to use new inputdata
* Require abbadingo header to either end with a newline or EOF
* Allow abbadingo parser to parse single lines. We currently do this by faking a header, which is probably not ideal and only works for traces without trace and symbol attributes
* Re-enable reading access traces from json
* Run tests on all 3 platforms for pull requests
* Refer to reusable workflow as local file
* Don't move windows executables again?
* Remove unused coroutine include
* Add workaround for apple clang not implementing c++20 string views correctly yet
* Remove "using namespace" statements from all header files, as this pollutes the namespace of all other files including them
* Forgot one rename
* Patch loguru for mingw builds on windows
* Change windows build to msys2-mingw64
* CI fiddling #1
* Actually screw that, let's try fixing the native windows build
* Revert "Actually screw that, let's try fixing the native windows build"
  This reverts commit f63ec65.
* Add mingw static linker flags
* Skip cmake setup on windows, since we're using the msys2 provided one
* Disable LTO for apple builds for now due to known issues with xcode
* Typo
* Remove todos that were already done
* Remove incorrectly copy pasted comments
* Add deprecation warnings to abbadingo inputdata
* Factor out input file reading and take into account sliding window params
* Remove separate predict_csv function, as all input file type related functionality is now handled upon parsing the input file
* Add factory method that creates a new inputdata that matches the alphabet mapping of another inputdata
* Fix row numbers in predict not being incremented
* Use separate inputdata instance for predict mode, so the access traces added by read_json aren't predicted on as well
* Switch to streaming mode for predict, to mirror the old behavior where traces were read and predicted one by one to save on memory

---------

Co-authored-by: Robert Baumgartner <[email protected]>
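
For readers unfamiliar with the trace format the message refers to, here is a minimal sketch of parsing one abbadingo trace line with `std::string_view`, including the empty trace `0 0`. It is not the lexy-based parser added in this commit. The names `ParsedTrace`, `next_token` and `parse_trace_line` are hypothetical, and the sketch assumes the plain `label length sym1 sym2 ...` layout without symbol or trace attributes.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical result type for this sketch; not flexfringe's trace class.
// All views point into the line passed to parse_trace_line, so that
// buffer must outlive the ParsedTrace.
struct ParsedTrace {
    std::string_view label;
    std::vector<std::string_view> symbols;
};

// Return the next whitespace-delimited token and advance `rest` past it.
// Returns an empty view once only whitespace (or nothing) is left.
inline std::string_view next_token(std::string_view& rest) {
    const auto start = rest.find_first_not_of(" \t\r");
    if (start == std::string_view::npos) {
        rest = {};
        return {};
    }
    rest.remove_prefix(start);
    const auto end = rest.find_first_of(" \t\r");
    const std::string_view token = rest.substr(0, end);
    rest.remove_prefix(end == std::string_view::npos ? rest.size() : end);
    return token;
}

// Parse "label length sym1 sym2 ..." without copying any substrings.
// Returns std::nullopt if a field is missing, the length is not a
// non-negative integer, or the symbol count does not match the length.
inline std::optional<ParsedTrace> parse_trace_line(std::string_view line) {
    ParsedTrace trace;
    trace.label = next_token(line);
    const std::string_view length_tok = next_token(line);
    if (trace.label.empty() || length_tok.empty()) {
        return std::nullopt;
    }

    std::size_t length = 0;
    const char* last = length_tok.data() + length_tok.size();
    const auto [ptr, ec] = std::from_chars(length_tok.data(), last, length);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }

    for (std::string_view tok = next_token(line); !tok.empty();
         tok = next_token(line)) {
        trace.symbols.push_back(tok);
    }

    // The empty trace "0 0" passes this check with zero symbols.
    if (trace.symbols.size() != length) {
        return std::nullopt;
    }
    return trace;
}
```

Calling `parse_trace_line("1 3 a b c")` yields label `1` with three symbols, while `parse_trace_line("0 0")` yields a valid trace with no symbols. How such an empty trace is then finalized (the message mentions a single finalized symbol that does not count towards the length) is left out of this sketch.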
tudelft-cda-lab · Mar 19, 2024 · 94f415b
1 parent e1e0547
commit 94f415b
Showing 342 changed files with 110,461 additions and 67,100 deletions.
diff --git a/.github/workflows/build-test-all.yml b/.github/workflows/build-test-all.yml
@@ -25,22 +25,28 @@ jobs:
       - uses: actions/checkout@v1
 
       - name: Setup cmake
+        if: runner.os != 'Windows'
         uses: jwlawson/[email protected]
         with:
           cmake-version: '3.25.x'
 
       - name: prepare-windows
         if: runner.os == 'Windows'
-        uses: ilammy/msvc-dev-cmd@v1
+        uses: msys2/setup-msys2@v2
+        with:
+          update: true
+          msystem: mingw64
+          install: >-
+            mingw-w64-x86_64-gcc
+            mingw-w64-x86_64-cmake
 
       - name: build-windows
+        shell: msys2 {0}
         if: runner.os == 'Windows'
         run: |
           mkdir build && cd build
           cmake ..
           cmake --build . --config Release
-          mv Release/flexfringe.exe .
-          mv Release/runtests.exe .
 
       - name: build-unix
         if: runner.os == 'Linux' || runner.os == 'macOS'

diff --git a/.github/workflows/run-test-only.yml b/.github/workflows/run-test-only.yml
@@ -3,28 +3,5 @@ on: [pull_request]
 
 jobs:
   build-and-test:
-    name: build-test-linux
-    runs-on: ubuntu-latest
-
-    steps:
-      - uses: actions/checkout@v1
-
-      - name: Setup cmake
-        uses: jwlawson/[email protected]
-        with:
-          cmake-version: '3.16.x'
-
-      - name: build
-        run: |
-          mkdir build && cd build
-          cmake .. -DCMAKE_BUILD_TYPE=Release
-          make -j$(nproc)
-
-      - name: run-tests
-        run: build/runtests -r junit > testresults.xml
-
-      - name: publish-test-results
-        uses: EnricoMi/publish-unit-test-result-action@v1
-        with:
-          check_name: "Unit Test Results"
-          files: "testresults.xml"
+    uses: ./.github/workflows/build-test-all.yml
+    secrets: inherit
diff --git a/.gitignore b/.gitignore
@@ -23,3 +23,4 @@ tests/tests
 # Generated files
 source/gitversion.cpp
 source/evaluators.h
+CMakeFiles/
diff --git a/CMakeLists.txt b/CMakeLists.txt
@@ -1,16 +1,23 @@
-cmake_minimum_required(VERSION 3.16)
+cmake_minimum_required(VERSION 3.24)
 set(CMAKE_OSX_SYSROOT "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk")
 project(flexfringe)
 
-set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD 20)
 set(CMAKE_CXX_STANDARD_REQUIRED ON)
 
+if(MINGW)
+    set(CMAKE_EXE_LINKER_FLAGS "-static -static-libgcc")
+endif()
+
 if(NOT CMAKE_BUILD_TYPE)
   set(CMAKE_BUILD_TYPE Release)
 endif()
 
 option(COMPILE_DOCS "This is settable from the command line" OFF)
 
+add_subdirectory(source/utility/lexy)
+add_subdirectory(source/utility/fmt-9.1.0)
+
 if (MSVC)
     add_compile_options("/W4" "$<$<CONFIG:RELEASE>:/O2>")
 else()
@@ -20,11 +27,14 @@ else()
 endif()
 
 add_compile_definitions(LOGURU_WITH_STREAMS=1)
+#add_compile_definitions(CATCH_CONFIG_EXPERIMENTAL_REDIRECT=1)
 
 include_directories("${PROJECT_SOURCE_DIR}"
                     "${PROJECT_SOURCE_DIR}/source"
                     "${PROJECT_SOURCE_DIR}/source/evaluation"
-                    "${PROJECT_SOURCE_DIR}/source/utility")
+                    "${PROJECT_SOURCE_DIR}/source/utility"
+                    "${PROJECT_SOURCE_DIR}/source/utility/lexy/include"
+                    "${PROJECT_SOURCE_DIR}/source/utility/fmt-9.1.0/include")
 
 add_subdirectory(source)
 
@@ -51,7 +61,9 @@ endif()
 
 target_link_libraries(flexfringe
         Source
-        Util)
+        Util
+        foonathan::lexy
+        fmt::fmt)
 
 
 find_package(Threads)
@@ -73,9 +85,11 @@ if(NOT SKIP_TESTS)
 
     add_executable(runtests
             tests/tests.cpp
-            tests/tail.cpp
             tests/smoketest.cpp
-            source/main.cpp tests/input_data.h)
+            tests/testcsvheaderparser.cpp
+            tests/testcsvparser.cpp
+            tests/testabbadingoparser.cpp
+            source/main.cpp tests/testinputdata.cpp)
 
     if(MSVC)
         target_link_libraries(runtests
@@ -96,6 +110,8 @@ if(NOT SKIP_TESTS)
             Catch2::Catch2
             Source
             Util
+            foonathan::lexy
+            fmt::fmt
             ${CMAKE_THREAD_LIBS_INIT})
 
     if(NOT WIN32)
@@ -111,3 +127,10 @@ if(MSVC)
     set_property(TARGET flexfringe Evaluation Source Util runtests PROPERTY
             MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:Debug>")
 endif()
+
+# XCode 15 currently has known issues with LTO:
+# https://developer.apple.com/documentation/xcode-release-notes/xcode-15-release-notes#Known-Issues
+if (NOT APPLE)
+    set_property(TARGET flexfringe PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)
+    set_property(TARGET runtests PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)
+endif()