Added codeAction (extract subSchema to defs) #133

arpitkuriyal · 2025-02-20T17:46:33Z

#132
Point to note:

Child nodes of $defs are added from the top
- Instead of appending child elements at the bottom, they are inserted at the top.
- This prevents inconsistencies caused by offset + textLength changes due to spaces and brackets.

Screen.Recording.2025-02-20.at.10.47.01.PM.mov

jdesrosiers

This is great progress! But, it looks like there's still some work to do.

Adding the definitions to the top isn't good enough. Also, the sloppy formatting of the generated code isn't acceptable either. The biggest motivating factor for this project is to encourage best practices and good style. Two of those rules are that $schema and $id always come first and that $defs always goes last. It's important that the definitions are in the right place and reasonably formatted when moved. You might have to somehow run the moved code through a formatter once it's in its new location. jsonc-parser is used internally to parse the schema. It has a formatting feature that you might be able to use somehow.

I noticed that the schema you're using in your demo video has a problem. The "shipping_address" property has a property "address" that then has the actual schema in it. That subschema will do nothing at all. "address" is not a JSON Schema keyword, so it gets ignored. You can tell something's wrong because the schema in "address" doesn't have the expected syntax highlighting. JSON Schema keywords should be highlighted differently than plain JSON properties.

I see you're using the property name as the definition name. That's a nice default, but that's not always going to work. Subschemas can be in quite a few places other than properties values. For example, this would work for the items keyword. I think what needs to happen is that the user needs to provide the definition name. Have a look at the if/then completion provider. There's a special syntax that allows you to set placeholders that the user will be prompted to fill in. That approach may also work here.

You'll need to add tests for this feature that cover whatever edge cases you can think of.

I appreciate the clean and well written code so far. I suggest installing the ESLint plugin in your editor to get linting feedback while developing. It will help avoid getting build errors in your PRs if you forget to run the linter before pushing.

arpitkuriyal · 2025-02-21T07:02:05Z

Thank you for the detailed feedback. However, I encountered an issue while looking into what you mentioned:

I think what needs to happen is that the user needs to provide the definition name.
There's a special syntax that allows you to set placeholders that the user will be prompted to fill in. That approach may also work here.

I found that InsertTextFormat: InsertTextFormat.Snippet is coming in LSP 3.18 (see here). I’ll try using the command feature as an alternative and see how it goes.

jdesrosiers · 2025-02-21T19:37:15Z

Thanks for figuring that out. Yes, it looks like SnippetTextEdit is what we need. Although 3.18 isn't released yet, it looks like vscode-languageserver-node already supports it. That means that vscode's language server client probably supports it too. Other clients may not support it yet, but that's ok.

arpitkuriyal · 2025-02-21T20:15:07Z

I think it's not supported by vscode-languageserver-node yet.

As you can see in the screenshots, the latest available version is still 3.17.5, and this feature is introduced in 3.18. I also checked the node_modules directory and couldn't find any trace of this feature there.
Could you please confirm this on your end as well?

version ScreenShot

nodeModule Screenshot

Screenshot of the PR that you mentioned here:

Although 3.18 isn't released yet, it looks like vscode-languageserver-node already supports it.

jdesrosiers · 2025-02-21T21:26:41Z

Yes, you're right. Although the code was merged over a year ago, it hasn't made it to an official release yet. It's planned for the next major release (v10) and that only has pre-releases published so far. We could change the dependency to `"vscode-languageserver": "^10.0.0-next"` to use the pre-release. But, that's only the server. It probably wouldn't work in vscode yet.

I guess that means we have to hold off on that detail. I was hoping this would give us a way to avoid having to come up with a more robust way to generate definition names, but it looks like we're going to have to find a temporary solution until this feature is released.

arpitkuriyal · 2025-02-22T00:50:59Z

I think, for now, we should keep it simple and implement a basic defCounter that generates definition names like def1, def2, and so on.

Do you have any other solutions in mind? Please let me know your thoughts.

jdesrosiers · 2025-02-22T19:12:17Z

Number-based naming can work, but there are some edge cases.

Extract a schema and you'd get def1. Then restart the language server and extract another schema and you get def1 again. Ideally people will rename after extraction and this won't come up often, but it could be a problem.

Also, if you do refactorings in one schema and then go to another schema, it would be weird for it to generate def5 or something when there isn't def1 - def4 yet.

One thing that I think would work is to generate number-based names starting at def1 and check if the name already exists. If it does, increment and check again until you have a unique name.

Or, you could inspect all the definition names looking for the def{number} pattern and increment starting from the number you find or 1 if the pattern isn't found.

Or, you could generate a UUID and use that as the name and not have to check anything. But, a bulky UUID might not make for as good a user experience.

Any of those options would be ok.
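
The first option (start at def1 and increment until the name is free) can be sketched as below. This is a minimal illustration, not the code from this PR: `nextUniqueDefName` is a hypothetical helper, and `existingNames` is assumed to be an array holding the property names already present in the schema's definitions.

```javascript
// Hypothetical helper, not part of the project.
// `existingNames` stands in for the names already used in the definitions object.
const nextUniqueDefName = (existingNames) => {
  const taken = new Set(existingNames);
  let counter = 1;

  // Keep incrementing until a name is found that is not in use.
  while (taken.has(`def${counter}`)) {
    counter++;
  }

  return `def${counter}`;
};
```

Because the check runs against the current document every time, restarting the language server or switching to another schema does not affect the result. Given the names `["def1", "address", "def3"]`, this returns `def2`, so gaps left by renamed definitions get reused.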

arpitkuriyal · 2025-02-23T04:22:59Z

Alright, I’ll work on it and let you know once it’s done.

arpitkuriyal · 2025-02-23T19:21:58Z

Screen.Recording.2025-02-24.at.12.48.49.AM.mov

Please review it. If everything looks good, I will start writing the test cases.

arpitkuriyal · 2025-02-23T19:25:48Z

When switching the dialect URI of the schema, we need to change $defs to definitions for older drafts. Therefore, I added a condition: if the dialect is 2020-12 or 2019-09, use $defs; otherwise, use definitions. Is this correct, or is there anything else I should do?

jdesrosiers · 2025-02-24T17:16:44Z

Is this correct, or is there anything else I should do?

Use the getKeywordName function from @hyperjump/json-schema/experimental. It takes a keyword URI and a dialect URI and returns the right label for the dialect.

arpitkuriyal · 2025-02-24T17:27:38Z

Got it! I'll use getKeywordName from @hyperjump/json-schema/experimental for the right label.

arpitkuriyal · 2025-02-24T18:01:53Z

I just made the change; you can check it now. I'll write the test cases as soon as possible.

jdesrosiers

Have another try at the JSON formatting part. The other things I mentioned should be small and easy to address. I included a few code style suggestions that aren't caught by the linter. In general, I thought the code made better use of whitespace the last time I reviewed. Things are more compact now making the code harder to read.

jdesrosiers · 2025-02-24T17:34:10Z

language-server/src/features/codeAction/extractSubschema.js

+    // Helper function to format new def using jsonc-parser
+    const formatNewDef = (/** @type {string} */ newDefText) => {
+      try {
+        /** @type {unknown} */
+        const parsedDef = jsoncParser.parse(newDefText);
+        return JSON.stringify(parsedDef, null, 2).replace(/\n/g, "\n    ");
+      } catch {
+        return newDefText;
+      }
+    };


Pull this out of the constructor. Either make it a private function or a utility function outside of the class. I'm sure other refactorings will need to use a function like this as well, so maybe it belongs in util.js?

jdesrosiers · 2025-02-24T17:56:18Z

language-server/src/features/codeAction/extractSubschema.js

+        const parsedDef = jsoncParser.parse(newDefText);
+        return JSON.stringify(parsedDef, null, 2).replace(/\n/g, "\n    ");


This isn't what I meant when I suggested jsonc-parser. You're not using it for anything you couldn't have used JSON.parse for. Look for the format function from jsonc-parser. This solution using JSON.stringify to format isn't going to work for embedded schemas.

{
  "$defs": {
    "this-is-an-embedded-schema-because-it-has-$id": {
      "$id": "my-embedded-schema",
      "$defs": {
        "def1": {
          "$comment": "This definition will need more indentation because it's nested"
        }
      }
    }
  }
}

jsoncParser.format should help get around that problem because you can give it the whole document with the replaced text and tell it to format just the replaced text.
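
To see why a hardcoded prefix breaks for embedded schemas, here is a small hand-rolled sketch. It is not the jsonc-parser approach recommended above and is only meant to show that the required indentation depends on where the definition lands. `serializeAtDepth` is a hypothetical helper, and the depth values assume a two-space document.

```javascript
// Illustration only; `serializeAtDepth` is a hypothetical helper, not project code.
const serializeAtDepth = (value, depth, indentUnit = "  ") => {
  const prefix = indentUnit.repeat(depth);

  // Every continuation line needs the indentation of the insertion point.
  return JSON.stringify(value, null, indentUnit).replace(/\n/g, `\n${prefix}`);
};

const definition = { type: "string" };

// A definition directly under the root "$defs" sits at depth 2.
const topLevel = serializeAtDepth(definition, 2);

// A definition under the "$defs" of an embedded schema sits at depth 4.
const embedded = serializeAtDepth(definition, 4);
```

The fixed `"\n    "` replacement in the diff matches only the depth 2 case. Formatting the replaced range within the whole document avoids having to compute the depth by hand.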

jdesrosiers · 2025-02-24T18:13:02Z

language-server/src/features/codeAction/extractSubschema.js

+      try {
+        /** @type {unknown} */
+        const parsedDef = jsoncParser.parse(newDefText);
+        return JSON.stringify(parsedDef, null, 2).replace(/\n/g, "\n    ");


Hardcoding the indentation strategy to two spaces isn't going to work. You're going to need to determine what indentation strategy the client is using and match that. You should be able to get the information from the configuration service, but it's currently only configured to retrieve this server's configs. I think there's another "section" you'd have to request, but you'll have to figure out what that is.

arpitkuriyal · 2025-02-25

I have made all the changes, but I am stuck on the formatting part. It always takes the default tab size of four instead of the current tab size. Even though I am fetching the editor settings, it doesn't seem to reflect the actual tab size used in the document. Do you have any suggestions on how to correctly retrieve the active document's indentation settings?

jdesrosiers · 2025-02-25

Try sending the document's URI (schemaDocument.textDocument.uri) in the scopeUri parameter in the configuration request. That should return the settings for that file, instead of the settings for the workspace. That's my best guess.

I think it should be ok to add an optional parameter to the get function so you can pass in the document URI.
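
A rough sketch of what such a request could look like is below. The `items`, `section`, and `scopeUri` fields follow the LSP `workspace/configuration` request. `buildConfigurationParams` is a hypothetical helper, and using `"editor"` as the section name is an assumption about where the client keeps its indentation settings.

```javascript
// Hypothetical helper that builds the params for a `workspace/configuration` request.
// The "editor" section name used below is an assumption, not confirmed in this thread.
const buildConfigurationParams = (section, scopeUri) => {
  const item = { section };

  if (scopeUri !== undefined) {
    // Scope the lookup to a single document instead of the whole workspace.
    item.scopeUri = scopeUri;
  }

  return { items: [item] };
};

const workspaceParams = buildConfigurationParams("editor");
const documentParams = buildConfigurationParams("editor", "file:///path/to/schema.json");
```

Making `scopeUri` optional keeps existing callers working while letting a code action pass in the URI of the document it is editing.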

arpitkuriyal · 2025-02-25

I also tried this approach, but when I log it to the console, it still always shows 4. Not sure why it's not picking up the actual tab size. Any other suggestions?

jdesrosiers · 2025-02-25

I don't have any more guesses. I'll try to find some time tonight to try some things and see if I can figure anything out.

jdesrosiers · 2025-02-26

Here's what I figured out.

I think getting 4 is technically correct. Your editor is configured for a 4-space indentation by default. However, vscode also has a setting called "Detect Indentation". If that is set, it ignores the tabSize setting and figures out the indentation based on the content of the file. I haven't found any way to get vscode to tell us the detected indentation. There might not be a way. I think that all we can do is check for the detectIndentation config and if it's true, do our own indentation detection on the server.
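
Server-side detection could look roughly like the sketch below. It is a simple heuristic written for illustration and does not reproduce vscode's own detection algorithm. `detectIndentation` is a hypothetical helper, and `fallback` stands in for the configured tabSize and insertSpaces values.

```javascript
// Hypothetical helper: guesses the indentation style from a document's text.
// `fallback` stands in for the editor's configured settings.
const detectIndentation = (text, fallback = { insertSpaces: true, tabSize: 4 }) => {
  let tabLines = 0;
  let previousWidth = 0;
  const stepCounts = new Map();

  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") {
      continue; // Blank lines carry no indentation information.
    }
    if (line.startsWith("\t")) {
      tabLines++;
      continue;
    }

    // Count how often each change in leading-space width occurs between lines.
    const width = line.length - line.trimStart().length;
    const step = Math.abs(width - previousWidth);
    if (step > 0) {
      stepCounts.set(step, (stepCounts.get(step) || 0) + 1);
    }
    previousWidth = width;
  }

  // The most frequent change in width is taken as the indentation size.
  let bestStep = 0;
  let bestCount = 0;
  for (const [step, count] of stepCounts) {
    if (count > bestCount) {
      bestStep = step;
      bestCount = count;
    }
  }

  if (tabLines > bestCount) {
    return { insertSpaces: false, tabSize: fallback.tabSize };
  }
  return bestStep > 0 ? { insertSpaces: true, tabSize: bestStep } : fallback;
};
```

A caller would run this only when the detectIndentation setting is true and otherwise use the configured values directly. Documents with no indented lines fall back to the configured settings.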

arpitkuriyal · 2025-02-26

Okay, then I'll set detectIndentation: true and handle the indentation detection. Thanks for your help!

jdesrosiers · 2025-02-24T18:24:18Z

language-server/src/features/codeAction/extractSubschema.js

+        const defsContent = schemaDocument.textDocument.getText().slice(
+          definitionsNode.offset,
+          definitionsNode.offset + definitionsNode.textLength
+        );
+        const defMatches = [...defsContent.matchAll(/"def(\d+)":/g)];
+        defMatches.forEach((match) =>
+          highestDefNumber = Math.max(highestDefNumber, parseInt(match[1], 10))
+        );


This isn't a good approach. Use keys from schema-node.js to loop over all the property names of the definitionsNode. Then you can use the regex on those values to determine the highestDefNumber.
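
The difference between the two approaches can be shown with plain objects. In this sketch `Object.keys` stands in for the project's `keys` helper from schema-node.js, whose exact signature is not shown in this thread, and `highestDefNumber` is a hypothetical function name.

```javascript
// `definitionNames` stands in for the property names of the definitionsNode.
const highestDefNumber = (definitionNames) => {
  let highest = 0;

  for (const name of definitionNames) {
    // Anchor the pattern so only names that are exactly def{number} count.
    const match = /^def(\d+)$/.exec(name);
    if (match) {
      highest = Math.max(highest, parseInt(match[1], 10));
    }
  }

  return highest;
};

// Only "def1" is a direct child here. "def7" belongs to a nested schema.
const definitions = {
  def1: { type: "string" },
  address: { $defs: { def7: { type: "number" } } }
};

const fromKeys = highestDefNumber(Object.keys(definitions));

// Scanning the raw text also picks up the nested "def7".
const rawMatches = [...JSON.stringify(definitions).matchAll(/"def(\d+)":/g)];
const fromRawText = Math.max(0, ...rawMatches.map((match) => parseInt(match[1], 10)));
```

Here `fromKeys` is 1 while `fromRawText` is 7, because the text scan cannot tell direct children from names that appear deeper in the document.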

arpitkuriyal added 2 commits February 20, 2025 22:58

added codeAction (extract subSchema to defs)

f298838

changed the params to Destructured Parameter

2e9930e

jdesrosiers requested changes Feb 21, 2025

correction in the extractSubschema.js

77808f4

added getKeywordName to get definition keyword

1845509

jdesrosiers requested changes Feb 24, 2025

arpitkuriyal and others added 5 commits February 26, 2025 02:49

improve code style

6ac18c5

Co-authored-by: Jason Desrosiers <[email protected]>

improve CO

a135b7b

Co-authored-by: Jason Desrosiers <[email protected]>

typing the function as a whole rather than inline annotations.

62e5463

Co-authored-by: Jason Desrosiers <[email protected]>

improve code style

f87c1b2

Co-authored-by: Jason Desrosiers <[email protected]>

change hardCoded "$Defs" to ${defName}

ac7909a

Co-authored-by: Jason Desrosiers <[email protected]>
