Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose getLocationLineAndColumn() as a public API #1412

Conversation

martinduffy1
Copy link
Contributor

Allows better use of the offset data returned by getStructuredErrors()

Based on the work from @KyleFromKitware and comments from @BillyDonahue on #1282

@billhoffman
Copy link

billhoffman commented Jul 15, 2022

@BillyDonahue do you think you can look at this? It is an important change required by the CMake project to allow for better error reporting in the CMake presets functionality. I guess this needs to be merged as well #1409

@billhoffman
Copy link

@martinduffy1 looks like the branch might need a rebase.

@martinduffy1
Copy link
Contributor Author

#1409 is the more important change for this.

@martinduffy1 martinduffy1 force-pushed the getlocationlineandcolumn branch from f81fee5 to 6d1b347 Compare July 21, 2022 17:05
@@ -1991,6 +1978,32 @@ bool parseFromStream(CharReader::Factory const& fact, IStream& sin, Value* root,
return reader->parse(begin, end, root, errs);
}

Json::LineAndColumn getLocationLineAndColumn(char const* beginDoc,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove redundant Json:: qualifiers.

Json::LineAndColumn getLocationLineAndColumn(char const* beginDoc,
char const* endDoc,
size_t location) {
Json::LineAndColumn lineAndColumn;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable not needed.

@@ -395,6 +395,15 @@ bool JSON_API parseFromStream(CharReader::Factory const&, IStream&, Value* root,
*/
JSON_API IStream& operator>>(IStream&, Value&);

struct LineAndColumn {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
struct LineAndColumn {
/** Line and column within a document (1-based). */
struct DocumentLocation {

int column;
};

/** Get the line and column of a character within a string.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** Get the line and column of a character within a string.
/** Return the location of a character within a string. */

Comment on lines 405 to 406
LineAndColumn JSON_API getLocationLineAndColumn(char const* beginDoc, char const* endDoc,
size_t location);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LineAndColumn JSON_API getLocationLineAndColumn(char const* beginDoc, char const* endDoc,
size_t location);
DocumentLocation JSON_API locateInDocument(char const* beginDoc, char const* endDoc,
size_t offset);

Comment on lines 770 to 774
Json::LineAndColumn lineAndColumn = Json::getLocationLineAndColumn(
document_.data(), document_.data() + document_.length(),
location - document_.data());
line = lineAndColumn.line;
column = lineAndColumn.column;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Json::LineAndColumn lineAndColumn = Json::getLocationLineAndColumn(
document_.data(), document_.data() + document_.length(),
location - document_.data());
line = lineAndColumn.line;
column = lineAndColumn.column;
auto loc = locationInDocument(
document_.data(),
document_.data() + document_.size(),
location - document_.data());
line = lineAndColumn.line;
column = lineAndColumn.column;

Comment on lines 1985 to 2002
char const* current = beginDoc;
char const* lastLineStart = current;
int line = 0;
while (current < beginDoc + location && current < endDoc) {
char c = *current++;
if (c == '\r') {
if (*current == '\n')
++current;
lastLineStart = current;
++line;
} else if (c == '\n') {
lastLineStart = current;
++line;
}
}
// column & line start at 1
lineAndColumn.column = int(beginDoc + location - lastLineStart) + 1;
lineAndColumn.line = ++line;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function can be written more simply than this.
You don't need to introduce variables for lineAndColumn, current, or lastLineStart.

You don't need endDoc. It's not handled correctly anyway.
If the location indicates a position beyond endDoc, the result is silently considered to be a position in an imaginary last line of infinite length. This is not a good failure mode and so don't even try, I think.

    int line = 1;
    int col = 1;
    const char* last = beginDoc + location;
    for (; beginDoc != last; ++beginDoc) {
      switch (*beginDoc) {
        case '\r':
          if (beginDoc + 1 != last && beginDoc[1] == '\n')
            continue;  // consume CRLF as a single token.
          [[fallthrough]];  // CR without a following LF is same as LF
        case '\n':
          col = 1;
          ++line;
          break;
        default:
          ++col;
          break;
      }
    }
    return {line, col};

Comment on lines 3918 to 3922
size_t found_l = -1;
size_t found_i = -1;
size_t found_n = -1;
size_t found_e = -1;
size_t found_space = -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is too clever I think. Trying to reuse these variables and use a fixture function.
It can just hard-code these positions and use a more creative test string.
Just hard-code some inputs and outputs in a table, and run a loop over that table.
It would be less code and easier to maintain.

Improve the name "test".

Needs more coverage.
I think it's missing scenarios where:
- input has trailing CR,LF,CRLF at end of string or beginning.
- input with blank lines.
- input with embedded \0.
- weird stuff like LFCR or CRCRLF.

@@ -3903,6 +3903,37 @@ JSONTEST_FIXTURE_LOCAL(FuzzTest, fuzzDoesntCrash) {
example.size()));
}

struct GetLocationLineAndColumnTest : JsonTest::TestCase {
void testLineAndColumn(const std::string& doc, ptrdiff_t location, int row,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subtly changing size_t to ptrdiff_t here. No reason to do that I think.

Comment on lines 3931 to 3933
testLineAndColumn(example, found_n, row, 3);
testLineAndColumn(example, found_e, row, 4);
testLineAndColumn(example, found_space, row, 5);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the 3rd, 4th, and 5th finds tell you anything the first 2 didn't already tell you.

Allows better use of the offset data returned by getStructuredErrors()
@martinduffy1 martinduffy1 force-pushed the getlocationlineandcolumn branch from 2ffa426 to 459332d Compare July 25, 2022 15:43
@martinduffy1
Copy link
Contributor Author

Thanks for the feedback @BillyDonahue, I addressed the comments so far.
I'll also mention that #1409 has the more important changes for this and probably makes more sense to progress first, as it allows us to get the offset from the error message that we would use with this API.

@martinduffy1
Copy link
Contributor Author

@BillyDonahue since #1409 is approved now, can you take another look at this?

@baylesj
Copy link
Contributor

baylesj commented Sep 12, 2024

Closing this due to inactivity. Feel free to reopen if you are able to address feedback.

@baylesj baylesj closed this Sep 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants