Replace every occurrence of searched words #6

Open
Seb35 opened this issue Dec 22, 2018 · 1 comment

Comments

Member

Seb35 commented Dec 22, 2018

This is a fairly simple issue and could be a good first bug for a newcomer to the SedLex codebase.

When SedLex generates diffs (in AddDiffVisitor) and the DuraLex tree says to replace a word (or rather an expression, possibly made of several words) with another, only the first occurrence is replaced. The perimeter is currently delimited by (self.begin, self.end); this should be changed into a list of perimeters, with the edit operation then applied in each perimeter. Some care is needed because the text must not be reset between sub-edit operations, particularly for exact diffs.
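
As a rough illustration of the "list of perimeters" idea, here is a minimal standalone sketch. It is not SedLex code: the function names `find_perimeters` and `replace_in_perimeters` are hypothetical, and it assumes that occurrences are matched as plain, non-overlapping substrings. Applying the edits from right to left is one way to avoid resetting the text between sub-edits, since the offsets of earlier occurrences stay valid.

```python
def find_perimeters(text, needle, begin=0, end=None):
    """Return (start, stop) offsets of every non-overlapping occurrence
    of `needle` inside text[begin:end]."""
    if not needle:
        return []  # an empty needle would otherwise match forever
    end = len(text) if end is None else end
    perimeters = []
    start = text.find(needle, begin, end)
    while start != -1:
        stop = start + len(needle)
        perimeters.append((start, stop))
        start = text.find(needle, stop, end)
    return perimeters


def replace_in_perimeters(text, perimeters, replacement):
    """Apply the replacement in every perimeter of the same text.

    Perimeters are processed right to left so that the offsets of the
    occurrences still to be edited are not shifted by earlier edits.
    """
    for start, stop in sorted(perimeters, reverse=True):
        text = text[:start] + replacement + text[stop:]
    return text


article = "Le truc est ici. Le truc n'est pas ici."
perimeters = find_perimeters(article, "truc")
amended = replace_in_perimeters(article, perimeters, "machin")
print(perimeters)
print(amended)
```

In AddDiffVisitor the single (self.begin, self.end) pair would presumably bound the search, playing the role of `begin` and `end` here.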

This can be tested on Durafront, where it can also be checked that the DuraLex tree is correct.

Amendment (in English: the word "truc" is replaced by the word "machin"):

Le mot "truc" est remplacé par le mot "machin".

Text to be amended:

Le truc est ici. Le truc n'est pas ici.

Currently only the first "truc" is changed to "machin".

Once this is implemented, the diff for the above example will look like:

--- "unnamed article"
+++ "unnamed article"
@@ -1 +1 @@
-Le truc est ici. Le truc n'est pas ici.
+Le machin est ici. Le machin n'est pas ici.

and the exact diff will look like:

--- "unnamed article"
+++ "unnamed article"
@@ -4,4 +4,6 @@
-truc
+machin
@@ -21,4 +21,6 @@
-truc
+machin
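
The exact diff above could be produced along these lines. This is only a sketch under assumptions: `exact_diff_hunks` is a hypothetical name, and the hunk header convention (a 1-based character offset in the original text on both sides, followed by the lengths of the old and new expressions) is inferred from the example rather than taken from the SedLex code.

```python
def exact_diff_hunks(text, needle, replacement):
    """Build one exact-diff hunk per non-overlapping occurrence of `needle`.

    Assumption: both positions in the header are 1-based character offsets
    in the ORIGINAL text, as in the example of this issue.
    """
    hunks = []
    if not needle:
        return hunks
    start = text.find(needle)
    while start != -1:
        pos = start + 1  # convert 0-based index to 1-based offset
        header = f"@@ -{pos},{len(needle)} +{pos},{len(replacement)} @@"
        hunks.append(f"{header}\n-{needle}\n+{replacement}")
        start = text.find(needle, start + len(needle))
    return hunks


article = "Le truc est ici. Le truc n'est pas ici."
hunks = exact_diff_hunks(article, "truc", "machin")
print("\n".join(hunks))
```

Because every offset is computed against the original text, no hunk depends on an earlier sub-edit having been applied, which is the property the issue asks for.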

Seb35 changed the title Replace every occurrence of searched words [simple/good first bug] Replace every occurrence of searched words Dec 29, 2018

Member Author

Seb35 commented Dec 29, 2018

Removed the 'easy' tag because this may be linked to the way issue #9 of DuraLex is solved. That other issue is about how multiple articles are changed together, and this one is about how multiple occurrences are handled within a single article, but I think the data model in AddDiffVisitor should be rewritten to take both characteristics into account.

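A complete example would be an amendment such as the following (in English: in the first alinéa of article 3 and in the fifth alinéa of article 5, the words "truc" are replaced by the words "machin"):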

Au premier alinéa de l’article 3 et au cinquième alinéa de l’article 5,
les mots "truc" sont remplacés par les mots "machin".

Here the word "truc" appears multiple times in each of the two locations specified in the amendment. (According to the rules of the French assemblies, an amendment in the classical sense cannot change multiple articles, but here the word “amendment” is used as a synonym for “modifying text”, and this type of amendment can be found in law projects/proposals or in in-force laws modifying other laws.)

The DuraLex tree of such an amendment would be something like:

{
  "children": [
    {
      "children": [
        {
          "children": [
            {
              "children": [
                {
                  "type": "quote",
                  "words": "truc"
                }
              ],
              "type": "word-reference"
            }
          ],
          "order": 1,
          "type": "alinea-reference"
        }
      ],
      "id": "3",
      "type": "article-reference"
    },
    {
      "children": [
        {
          "children": [
            {
              "children": [
                {
                  "type": "quote",
                  "words": "truc"
                }
              ],
              "type": "word-reference"
            }
          ],
          "order": 5,
          "type": "alinea-reference"
        }
      ],
      "id": "5",
      "type": "article-reference"
    },
    {
      "children": [
        {
          "type": "quote",
          "words": "machin"
        }
      ],
      "type": "word-definition"
    }
  ],
  "editType": "replace",
  "type": "edit"
}
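
To make the combined data model more concrete, the sketch below walks a tree of this shape and lists every (article, alinea, words) target together with the replacement words. It is an illustration only: `collect_targets` and `replacement_words` are hypothetical helpers, not part of DuraLex or SedLex, and the tree is a compacted copy of the one above.

```python
def quote(words):
    return {"type": "quote", "words": words}


def word_target(article_id, alinea_order, words):
    """Build one article > alinea > word-reference branch of the tree."""
    word_ref = {"type": "word-reference", "children": [quote(words)]}
    alinea = {"type": "alinea-reference", "order": alinea_order,
              "children": [word_ref]}
    return {"type": "article-reference", "id": article_id,
            "children": [alinea]}


EDIT = {
    "type": "edit",
    "editType": "replace",
    "children": [
        word_target("3", 1, "truc"),
        word_target("5", 5, "truc"),
        {"type": "word-definition", "children": [quote("machin")]},
    ],
}


def collect_targets(node, context=None):
    """Return one dict per searched expression, with its article and alinea."""
    context = dict(context or {})
    node_type = node.get("type")
    if node_type == "article-reference":
        context["article"] = node["id"]
    elif node_type == "alinea-reference":
        context["alinea"] = node["order"]
    elif node_type == "word-reference":
        return [dict(context, words=child["words"])
                for child in node.get("children", [])
                if child.get("type") == "quote"]
    elif node_type == "word-definition":
        return []  # the replacement is not a search target
    targets = []
    for child in node.get("children", []):
        targets.extend(collect_targets(child, context))
    return targets


def replacement_words(edit):
    """Return the words of the first word-definition child, or None."""
    for child in edit.get("children", []):
        if child.get("type") == "word-definition":
            return " ".join(c["words"] for c in child.get("children", [])
                            if c.get("type") == "quote")
    return None


targets = collect_targets(EDIT)
new_words = replacement_words(EDIT)
print(targets, new_words)
```

Each target would then be expanded into its own list of perimeters within the corresponding alinea, so that multiple articles and multiple occurrences are handled by the same structure.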
