-
-
Notifications
You must be signed in to change notification settings - Fork 965
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: saml federation #2653
base: master
Are you sure you want to change the base?
feat: saml federation #2653
Conversation
persistence/sql/migrations/templates/20201201161451_credential_types_values.down.fizz
Outdated
Show resolved
Hide resolved
8f81269
to
3e5ad65
Compare
Hello again - just wanted to check in on this: When would you like us to take a look at this PR? |
Hi @kmherrmann, I think there is still some minor tidying up to do on the branch, but the people in our team working on it are currently on vacation, and should come back next week. However, as the PR is pretty large, I'd say you can start reviewing now. |
daefc17
to
1177f82
Compare
ffe5337
to
5e715a5
Compare
0bcd948
to
6cf0778
Compare
Codecov Report
@@ Coverage Diff @@
## master #2653 +/- ##
==========================================
- Coverage 77.50% 73.46% -4.05%
==========================================
Files 314 305 -9
Lines 19897 17517 -2380
==========================================
- Hits 15421 12868 -2553
- Misses 3285 3638 +353
+ Partials 1191 1011 -180
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thank you for working on this feature! There is a couple of topics I'd like to address, please see my comments. Thanks! :)
selfservice/flow/saml/handler.go
Outdated
@@ -0,0 +1,336 @@ | |||
package saml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To follow the rest of the design pattern, this handler should move into the strategy. I don't think that it belongs into the flow as it doesn't have its own flow object nor any other properties of a flow (hooks, ui, ...) as far as I can tell.
selfservice/flow/saml/handler.go
Outdated
} | ||
|
||
// Key pair to encrypt and sign SAML requests | ||
keyPair, err := tls.LoadX509KeyPair(strings.Replace(c.SAMLProviders[len(c.SAMLProviders)-1].PublicCertPath, "file://", "", 1), strings.Replace(c.SAMLProviders[len(c.SAMLProviders)-1].PrivateKeyPath, "file://", "", 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This "use the last entry" functionality is copy & pasted across several lines. Do we need it everywhere? ANd why do we care only about the last entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This is a leftover from the very beginning of our work on SAML implementation. We implemented the Provider() method incorrectly (provider_config.go), I'll fix that quickly :)
selfservice/flow/saml/handler.go
Outdated
} | ||
keyPair.Leaf, err = x509.ParseCertificate(keyPair.Certificate[0]) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please wrap errors in descriptive error messages using the herodot
package (e.g. herodot.ErrBadRequest
) and use errors.WithStack()
to include stack traces
selfservice/flow/saml/handler.go
Outdated
|
||
metadataURL := c.SAMLProviders[len(c.SAMLProviders)-1].IDPInformation["idp_metadata_url"] | ||
// The metadata file is provided | ||
if strings.HasPrefix(metadataURL, "file://") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a library for that in ory/x called fetcher, please use that one!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, we've done the file reorganization, it should fit your design pattern better now. We are now about to take into account the rest of your review :)
How could we design end-to-end tests? Or are there any simpler solutions? Some SAML IdP reference implementation? |
I think there is an online SAML test server, I saw it somewhere on slack. I'd like to avoid setting up Gloo, I've tried before but it was really difficult to get it working (it was ~1 year ago). |
We used a lot samltest.id and samltool, I recommend these :) |
Good news, this PR is ready for review! We will soon take into account the first feedbacks and then take care of the missing tests. Thank you! |
selfservice/flow/saml/handler.go
Outdated
// We have to get the SessionID from the cookie to inject it into the context to ensure continuity | ||
cookie, err := r.Cookie(continuity.CookieName) | ||
if err != nil { | ||
h.d.SelfServiceErrorManager().Forward(r.Context(), w, r, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If cookie is not set, nil pointer dereference error is thrown down the code.
Have added registration test: ovh#4 |
Hey, I think we've taken in account all points of your review, don't hesitate to take a look! |
I tried pushing some changes required for merging the PR to your fork & branch, but it appears that I am not allowed to do so 😕
But the good news is, giving access is easy! If the repository belongs to an organization, please add me for the project as a collaborator! |
I wanted to fix the formatter error as well as the SDK generation error :) |
@aeneasr thanks a lot for your help! You should have access to push on ovh/kratos now :) |
Great, thanks! I have allocated some time on monday to look into the PR :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! This is already in a great state. I've reviewed everything outside the SAML module for now, the saml module itself is still required to review on my end.
continuity/manager_relaystate.go
Outdated
@@ -0,0 +1,151 @@ | |||
package continuity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is almost a 1:1 copy of manger_cookie. Can you please remove it and use manager_cookie instead? If you need to parametrize things, feel ree to adjust the manager_cookie file :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! We now use the manager_cookie and make the use of the RelayState parametrizable :)
@@ -1,3 +1,6 @@ | |||
// 20221125150145 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove please - invalid json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
driver/registry_default.go
Outdated
// If m.continuityManager is nil or not a continuity.ManagerCookie | ||
switch m.continuityManager.(type) { | ||
case *continuity.ManagerCookie: | ||
default: | ||
m.continuityManager = continuity.NewManagerCookie(m) | ||
} | ||
return m.continuityManager | ||
} | ||
|
||
func (m *RegistryDefault) RelayStateContinuityManager() continuity.Manager { | ||
// If m.continuityManager is nil or not a continuity.ManagerRelayState | ||
switch m.continuityManager.(type) { | ||
case *continuity.ManagerRelayState: | ||
default: | ||
m.continuityManager = continuity.NewManagerRelayState(m, m) | ||
} | ||
return m.continuityManager | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks incorrect and racy. If you need two separate managers, save them in separate variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need separate managers anymore, so problem solved :)
identity/credentials_saml.go
Outdated
// swagger:model identityCredentialsSamlProvider | ||
type CredentialsSAMLProvider struct { | ||
Subject string `json:"subject"` | ||
Provider string `json:"samlProvider"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use underscore :)
saml_provider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
} | ||
|
||
// Create an uniq identifier for user in database. Its look like "id + the id of the saml provider" | ||
func NewCredentialsSAML(subject string, provider string) (*Credentials, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
@@ -0,0 +1 @@ | |||
INSERT INTO identity_credential_types (id, name) SELECT 'ff5a1823-8b47-4255-860f-4b70ed122740', 'saml' WHERE NOT EXISTS ( SELECT * FROM identity_credential_types WHERE name = 'saml'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you generate this UUID? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generated it by hand because I still didn't understand how it all worked, how am I supposed to do that?
x/provider.go
Outdated
@@ -29,6 +29,11 @@ type CookieProvider interface { | |||
ContinuityCookieManager(ctx context.Context) sessions.StoreExact | |||
} | |||
|
|||
type RelayStateProvider interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We potentially don't need this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, we removed it
) | ||
|
||
// SessionGetRelayState returns a string of the content of the relaystate for the current session. | ||
func SessionGetStringRelayState(r *http.Request, s sessions.StoreExact, id string, key interface{}) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks quite complex - which means that it needs a test :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's covered by the test in continuity/manager_relaystate_test.go :)
// SessionGetRelayState returns a string of the content of the relaystate for the current session. | ||
func SessionGetStringRelayState(r *http.Request, s sessions.StoreExact, id string, key interface{}) (string, error) { | ||
|
||
cipherRelayState := r.PostForm.Get("RelayState") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be underscored and lowercase:
cipherRelayState := r.PostForm.Get("relay_state")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to use this "RelayState" notation because it is a SAML standard
Hey @aeneasr, thank you very much for your review. We wanted to finish our current task before taking it on, so here is some information about it: Don't hesitate if you have any questions or feedback! |
I don't have much technical to contribute here, but just wanted to say this is really exciting to see being developed and it'll be great to see Kratos get SAML SP support when this is merged 🚀 |
Signed-off-by: ThibaultHerard <[email protected]> Co-authored-by: sebferrer <[email protected]> Co-authored-by: psauvage <[email protected]> Co-authored-by: alexGNX <[email protected]> Co-authored-by: Stoakes <[email protected]>
Signed-off-by: ThibaultHerard <[email protected]> Co-authored-by: sebferrer <[email protected]>
Signed-off-by: ThibaultHerard <[email protected]> Co-authored-by: sebferrer <[email protected]>
Signed-off-by: sebferrer <[email protected]> Co-authored-by: ThibaultHerard <[email protected]>
+ update saml tests Signed-off-by: ThibaultHerard <[email protected]> Co-authored-by: sebferrer <[email protected]>
Signed-off-by: sebferrer <[email protected]> Co-authored-by: ThibaultHerard <[email protected]>
Signed-off-by: ThibaultHerard <[email protected]> Co-authored-by: sebferrer <[email protected]>
+ test new credentials saml + resolving conflicts with master Signed-off-by: sebferrer <[email protected]> Co-authored-by: ThibaultHerard <[email protected]>
Hey @aeneasr! I think all your last reviews have been taken into account, we did a refactoring of the continuity manager in the case of using RelayState. |
Conflicts @sebferrer And thank you for all the work on this! Definitely excited to see this get in! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally had some time to look more into the code base - thank you very much. It's a lot of comments and I think there are several areas that have to be addressed before this rolls out in any production system.
As it stands currently, there are several areas of improvements. This ranges from code path issues (forgot to add return) to incorrect function use that would end up with responses that clients can't understand (e.g. sending a HTTP redirect instead of XML).
The traits pull-model on login is not working at the moment, as the identity is not updated as part of login currently.
Further, the traits mapping is naive in a sense that it just copies anything the SAML client sends into the traits, and it does not run the vital validation pipeline. Identities will not be able to perform e.g. account recovery or other basic flows (adding 2FA for example) as far as I can read this.
There seems to be the concept of JsonNet in the test data, but it is - as far as I can tell - not being used by the code base.
There are a couple more topics such as missing thread safety which will eventually end up with panics and invalid configuration. The problem is that we're using a global map-based cache for the configs. This will not work with hot-reloading.
Lastly, there are no functional / integration / e2e tests that perform the complete flow from a-z and prove that it works / test for regression.
In summary, it is great how far this has come! But there are stil miles to do before this can run in a production system.
config := h.d.Config() | ||
pid := ps.ByName("provider") | ||
|
||
if samlMiddlewares[pid] == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not thread safe - need to address
config := h.d.Config() | ||
pid := ps.ByName("provider") | ||
|
||
if samlMiddlewares[pid] == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
|
||
// Checks if the user already have an active session | ||
if e := new(session.ErrNoActiveSessionFound); errors.As(e, &e) { | ||
// No session exists yet, we start the auth flow and create the session |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this enough? What if we are refreshing the flow?
} | ||
|
||
func DestroyMiddlewareIfExists(pid string) { | ||
if samlMiddlewares[pid] != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not thread-safe
} | ||
|
||
// Key pair to encrypt and sign SAML requests | ||
keyPair, err := tls.LoadX509KeyPair(strings.Replace(providerConfig.PublicCertPath, "file://", "", 1), strings.Replace(providerConfig.PrivateKeyPath, "file://", "", 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing: base64 / https source. Should use fetcherx instead
if err != nil { | ||
return err | ||
} | ||
i.Traits = identity.Traits(traits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what this is doing - it is updating the traits of the user with the data from the SAML provider. Is that intentional?
s.d.Logger(). | ||
WithRequest(r). | ||
WithField("oidc_provider", provider.Config().ID). | ||
WithSensitiveField("identity_traits", i.Traits). | ||
WithField("mapper_jsonnet_url", provider.Config().Mapper). | ||
Debug("Merged form values and OpenID Connect Jsonnet output.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an incorrect log
delete(traitsMap, "iss") | ||
delete(traitsMap, "email_verified") | ||
delete(traitsMap, "sub") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we delete these fields?
} | ||
|
||
i := identity.NewIdentity(s.d.Config().DefaultIdentityTraitsSchemaID(r.Context())) | ||
if err := s.setTraits(w, r, claims, provider, jsonClaims, i); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naive approach to traits mapping, will not work in practice. Need to add JsonNet parsing and more
dsig "github.com/russellhaering/goxmldsig" | ||
"gotest.tools/assert" | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question to self: are we testing the underlying library here?
Hey @aeneasr, thanks a lot for your review! The points of improvement are very clear to us and will keep us moving in the right direction. We had to move on to other priority tasks so we won't be making the changes right away. We can't make any commitment on a specific date to consider your review, but this PR is definitely still on the table for us. If it takes too long for you, feel free to take over the code yourself, from our side we will keep you informed when we resume our work on this project. Thank you again! 🙂 |
Hi @aeneasr, we are looking for this SAML support for quite some time. Would like to know when this feature will be merged and available? This PR merge is blocked for some code issue? Thanks. |
The original PR #2148 has been accidentally closed, so here's a new one.
Completion Progress
Main endpoints
Concerning the first part, the goal is to develop the two main endpoints :
/metadata
(GET) : Generate the metadata of the SP (Kratos)/acs
(POST) : Handle SAML requestThen we have to deal with the way endpoints work. The library already implements what we want to make these endpoints work. The library allows you to create a metadata file very easily so we will need to incorporate it into Kratos to allow the endpoint
/metadata
to create them easily. Concerning the endpoint/acs
, the Crewjam library allows to receive the SAML requests, to understand them and to treat them accordingly.SDK adaptation
The goal here is to allow the SDK to call our SAML methods. Currently, the SDK allows to protect a route via a redirection to the login page. We should copy this system a little and allow to protect a route via SAML by a redirection to the IDP. After authentication, the IDP will redirect the user to the desired page. There is also the very important problem of converting the session created by the Crewjam/saml library into a Kratos session to remain homogeneous.
Responses consumption
Now that the endpoints are created, the SAML responses must be processed by Kratos. This means that the endpoint
/acs
must receive the SAML responses, consume them and translate them into a language that Kratos can understand. More clearly, this endpoint must allow Kratos to support SAML requests and to perform the actions associated with these requests.It is also in this part that you must check if the session has not expired (according to the duration indicated in option). If it is the case, you have to send a SAML Request to the IDP.
RelayState continuity
Kratos has a continuity management system that allows it to validate the continuity of a login flow (in particular to prevent the possibility of forging login requests). This system is based on the transit of a continuity cookie along the login flow. In the case of SAML, the cookie is lost at the end of the chain because the response is sent from the IdP to the SP in POST, and the SameSite cookies is set to "Lax". The idea is to keep the already present pattern which consists in starting from a continuity cookie to end with the same cookie, rebuilding the lost cookie that would have been kept via the RelayState. It is therefore necessary to add a continuity manager which uses the same principles as the current one, but which would also be based on the transit of the continuity value via the RelayState. More information here: #2486
UI adaptation
Now we need to make the buttons corresponding to our SAML configuration appear in the UI. This requires an adaptation of the Nodes in Kratos, but also in Elements.
Indeed, Ory moved all of the UI rendering and handling of the flow object to Ory Elements so that it is reusable across many examples as well as the kratos-selfservice-ui-node repository.
Right now they opted to expect certain flow nodes in the structure they want. Here is the related PR in ory/elements: ory/elements#54
YAML Configuration
Finally, the last part will concern the configuration. Not everyone wants to use SAML so we will have to use the YAML and Kratos configuration system to adapt it to SAML by adding new options to indicate if we want to use SAML and fill in the endpoints. The objective here is to make the final link between Kratos and SAML and thus be able to create instances of Kratos implementing SAML.
Concerning the options, here are the variables we can modify :
Related issue(s)
Design Document
Checklist
introduces a new feature.
contributing code guidelines.
vulnerability. If this pull request addresses a security. vulnerability, I
confirm that I got green light (please contact
[email protected]) from the maintainers to push
the changes.
works.
Disclaimer
At the moment, this is only a first version which is not intended to be merge. All the documentation and tests are still to be done.