Photo by Everyday basics on Unsplash

Auth acronyms explained — part 2

SCIM, SAML and OIDC

Adrian Bednarz
7 min read · Nov 6, 2022

In this article we are going to talk about SCIM, a mechanism for synchronizing user information in federated applications, and about the protocols for exchanging authentication information between systems (SAML, OIDC). The latter specify the content of the messages exchanged during SSO.

This is the second article in the series of articles about auth acronyms. You can check out part 1 here.

SCIM

SSO (and the federated identity model in general) allows you to have a single place where user identities are stored. Sometimes, though, you want to synchronize user identities between two places. A real-world example might be synchronization between an on-premises Active Directory and cloud-based ADFS, or a migration from one cloud provider to another that keeps accounts in both IAM (Identity and Access Management) systems in sync.

SCIM (System for Cross-domain Identity Management) is a specification that introduces APIs and entity definitions for moving users and their identities around the cloud. It is based on three RFCs: RFC 7642, RFC 7643 and RFC 7644. It defines two main entities: users and groups. Groups, as one may expect, are a means of defining organizational structures of users. SCIM is based on REST APIs and JSON entities. It defines the following APIs:

  • resource creation: POST /[version]/[resource]
  • resource management: GET, PUT, DELETE, PATCH /[version]/[resource]/[id]
  • resource browsing: GET /[version]/[resource]?sortBy=[...]&sortOrder=[...]&filter=[...]
  • along with feature discovery endpoints (API version, supported resource types and their schemas).

In the above definitions, resource can refer to a user or a group. SCIM allows implementers to define their own resource types too. The specification has evolved over the years; the latest version is 2.0 (v2 in the resource URLs).
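To make the URL patterns above a bit more tangible, here is a minimal sketch that builds SCIM 2.0 URLs and a request body for creating a user. The base URL, user name and resource id are made up for illustration; the schema URN is the core User schema from RFC 7643. It only constructs strings, it doesn't talk to any server.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://scim.example.com"  # hypothetical SCIM service
VERSION = "v2"                         # SCIM 2.0

USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"


def resource_url(resource, resource_id=None, **query):
    """Build /[version]/[resource][/[id]][?query] for a SCIM endpoint."""
    url = f"{BASE_URL}/{VERSION}/{resource}"
    if resource_id is not None:
        url += f"/{resource_id}"
    if query:
        url += "?" + urlencode(query)
    return url


def new_user_payload(user_name, given_name, family_name):
    """JSON body for POST /v2/Users with a handful of core attributes."""
    return json.dumps({
        "schemas": [USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "active": True,
    })


create_url = resource_url("Users")                # POST target
manage_url = resource_url("Users", "2819c223")    # GET / PUT / PATCH / DELETE target
browse_url = resource_url(
    "Users", filter='userName eq "jdoe"', sortBy="userName", sortOrder="ascending"
)
```

The same helper works for groups by passing "Groups" as the resource name.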

Now, let’s change the subject and focus on authentication protocols. From the last article we know that authentication is about confirming that users are who they claim to be. With SSO, the user has confirmed their identity in some external system, so we need a protocol to securely transfer that information from the IdP to the SP. In FIM systems, trust between the federated parties is built using protocols such as SAML or OIDC.

Note that in such browser-based flows there is typically no direct communication between the SP and the IdP. The client’s browser acts as an intermediary that is redirected back and forth.

SAML

Of the two, SAML (Security Assertion Markup Language) is the older, more established one. It has been a gold standard for user authentication for years. The protocol is based on XML messages that carry statements about the user, called assertions. Some of you might be wondering why XML was selected instead of JSON. This technology wasn’t created for the cloud world: SAML 2.0 dates back to 2005, and protocols back then were XML based 🤷‍♂.

The initial idea behind SAML was to exchange information in the on-premises world, with directories such as Active Directory at the heart of the system. So technically it was a predecessor of SSO as we know it today. SAML was generic enough to adapt to federated identity systems and is still in use.

Even back then, SAML was built on certain principles that have stood the test of time. One of the interesting ones is the distinction between SP-initiated and IdP-initiated flows:

  • SP-initiated flow happens if the user accesses the application unauthenticated. The SP (application) has to determine which IdP to redirect the user to. This might not be a trivial task if the application supports multiple IdPs. Usually web applications work in multi-tenant mode and the system can determine the right identity provider based on a subdomain. Once the IdP is selected, a SAML request is sent to it. After successful authentication, the SAML response comes back containing user information,
  • IdP-initiated flow happens if the user uses some sort of dashboard on the IdP side to access applications. There is no SAML request; the browser is redirected straight to the right application with a SAML response. Note that due to the asynchronous nature of SSO workflows, the process should be stateless. The SAML response is rich enough to contain all the necessary information about the user and the application route they wish to access.

Example of Okta dashboard. Clicking the app icon will initiate IdP flow. Source: https://www.okta.com/sites/default/files/styles/1640w_scaled/public/media/image/2021-04/New%20side%20navigation%20design.png?itok=QoOmxJcH
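The SP-initiated flow can be sketched in a few lines. This is only an illustration under several assumptions: the tenant-to-IdP mapping, the host names and the entity ID are made up, and the AuthnRequest is unsigned and reduced to the bare minimum. The encoding step (raw DEFLATE, then Base64, then URL encoding) is the one used by SAML's HTTP-Redirect binding.

```python
import base64
import uuid
import zlib
from datetime import datetime, timezone
from urllib.parse import urlencode
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mapping from tenant subdomain to the IdP sign-in URL.
IDP_SSO_URLS = {
    "acme": "https://idp.acme.example/sso/saml",
    "globex": "https://login.globex.example/saml2",
}


def idp_for_host(host):
    """Pick the IdP based on the first label of the host name (the tenant)."""
    tenant = host.split(".")[0].lower()
    return IDP_SSO_URLS[tenant]  # a KeyError here means an unknown tenant


def build_authn_request(sp_entity_id, acs_url):
    """Minimal, unsigned SAML 2.0 AuthnRequest as an XML string."""
    issue_instant = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
        'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" '
        f'ID="_{uuid.uuid4().hex}" Version="2.0" IssueInstant="{issue_instant}" '
        f"AssertionConsumerServiceURL={quoteattr(acs_url)}>"
        f"<saml:Issuer>{escape(sp_entity_id)}</saml:Issuer>"
        "</samlp:AuthnRequest>"
    )


def redirect_url(host, sp_entity_id, acs_url, relay_state):
    """URL the browser is redirected to: IdP endpoint + encoded request."""
    xml = build_authn_request(sp_entity_id, acs_url).encode("utf-8")
    # wbits=-15 produces a raw DEFLATE stream without the zlib header.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    deflated = compressor.compress(xml) + compressor.flush()
    params = {
        "SAMLRequest": base64.b64encode(deflated).decode("ascii"),
        "RelayState": relay_state,  # e.g. the route the user asked for
    }
    return idp_for_host(host) + "?" + urlencode(params)
```

Calling redirect_url("acme.app.example.com", ...) yields a URL on the acme IdP; a real deployment would typically also sign the request and load the mapping from tenant configuration.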

I just mentioned that the SSO flow should be stateless; that’s my opinion rather than a strict requirement. Of course, you can correlate the SAML request and response by persisting some state locally, but I will argue that there is little use in doing so. Protocols support fields such as RelayState in SAML, whose purpose is to pass extra context between the request and the response. Asynchronous processing means that you might never get the response back from the IdP server, or the response might come back arbitrarily delayed, so there is no point in maintaining complex logic around managing such state.
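If the route the user wanted travels in RelayState, the SP can pick it up from the response without keeping anything in a session. Below is a small sketch of how that value might be consumed. Treating RelayState as untrusted input and only accepting local paths is my own precaution here, not something the protocol mandates, and the function name is made up.

```python
from urllib.parse import urlsplit


def safe_relay_target(relay_state, default="/"):
    """Return RelayState if it is a local path, otherwise fall back to default."""
    if not relay_state:
        return default
    parts = urlsplit(relay_state)
    is_local = (
        not parts.scheme                        # no "https:" etc.
        and not parts.netloc                    # no host part
        and relay_state.startswith("/")         # absolute path within the app
        and not relay_state.startswith("//")    # protocol-relative URL
        and "\\" not in relay_state             # some browsers treat "\" like "/"
    )
    return relay_state if is_local else default
```

After validating the SAML response, the application would redirect the user to safe_relay_target(relay_state) instead of looking up a stored request.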

These problems and dilemmas are not limited to SAML; any federated SSO-based service has to face them. That’s why so many IDaaS (Identity as a Service) solutions are emerging: managing identities in the cloud-based SaaS world gets more and more complex.

To implement the SAML protocol you need:

  • a certificate shared between the SP and the IdP. They need some way of validating the assertions they exchange: the IdP signs its assertions with a private key and the SP verifies the signature using the IdP’s certificate,
  • IdP Sign-In gateway: in SP-initiated flows this is the endpoint to which the user’s browser is redirected with a SAML request containing information about which application the user is trying to access,
  • Assertion Consumer Service: the SP endpoint to which the IdP redirects the user with a SAML response. The SP validates the response and determines who the user is.
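To give a feeling for what the Assertion Consumer Service does with a response, here is a deliberately incomplete sketch. It only checks the audience and the validity window and then reads the NameID. Signature verification, which is the most important step, is left out because it needs an XML signature library; the sample response, audience value and function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}

# Hand-written, unsigned sample; a real response is signed by the IdP.
SAMPLE_RESPONSE = """\
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
                xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Assertion>
    <saml:Subject><saml:NameID>jdoe@example.com</saml:NameID></saml:Subject>
    <saml:Conditions NotBefore="2030-01-01T00:00:00Z" NotOnOrAfter="2030-01-01T00:05:00Z">
      <saml:AudienceRestriction>
        <saml:Audience>https://app.example.com/saml/metadata</saml:Audience>
      </saml:AudienceRestriction>
    </saml:Conditions>
  </saml:Assertion>
</samlp:Response>
"""


def parse_instant(value):
    """Parse a UTC timestamp without fractional seconds, e.g. 2030-01-01T00:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def consume_response(xml_text, expected_audience, now=None):
    """Return the NameID if audience and validity window are acceptable.

    WARNING: the signature is NOT verified here, so this must not be used as is.
    """
    now = now or datetime.now(timezone.utc)
    assertion = ET.fromstring(xml_text).find("saml:Assertion", NS)
    if assertion is None:
        raise ValueError("response contains no assertion")
    conditions = assertion.find("saml:Conditions", NS)
    if conditions is None:
        raise ValueError("assertion has no conditions")
    audience = conditions.findtext("saml:AudienceRestriction/saml:Audience", namespaces=NS)
    if audience != expected_audience:
        raise ValueError("assertion was issued for a different service provider")
    not_before = conditions.get("NotBefore")
    if not_before and now < parse_instant(not_before):
        raise ValueError("assertion is not valid yet")
    if now >= parse_instant(conditions.get("NotOnOrAfter")):
        raise ValueError("assertion has expired")
    name_id = assertion.findtext("saml:Subject/saml:NameID", namespaces=NS)
    if not name_id:
        raise ValueError("assertion has no NameID")
    return name_id
```

In practice you would rely on a SAML library for all of this, including the signature check against the IdP’s certificate.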

There are many libraries that simplify the implementation. In the Java world java-saml is the most popular, though it is temporarily not maintained (if you are interested in the progress on the issue, check out this ticket).

OIDC

The modern alternative to SAML is OIDC (OpenID Connect). It was born out of OAuth 2.0 and is closely related to it. Given that OAuth is in fact the standard way of giving third-party apps access to certain user information on the web without granting full access or exposing user credentials, it seems that in the long run OIDC will become a standard for the web (some may consider it a standard already).

At heart, OIDC focuses on problems specific to web and mobile applications. The regular SSO flow presented in the first part of this series didn’t touch on user re-authentication. On one hand you don’t want long-lived tokens; on the other hand you don’t want to force users to re-authenticate too frequently. The former is a security threat, the latter results in an inconvenient user experience.

OIDC, just like OAuth, uses the concept of refresh tokens. In the authorization code flow, user authentication doesn’t directly return a token with the user’s identity. Instead, it returns a short-lived authorization code that is exchanged for the ID token, optionally together with a refresh token that can later be used to fetch new tokens. This indirection allows different management strategies for each kind of token. OIDC can’t promise that tokens won’t be compromised, so it tries to provide measures for reducing the possible impact.

  • as ID tokens can be refetched whenever needed using refresh tokens, they can be short-lived,
  • a poorly managed refresh token can be a serious security threat; after all, it can be used to issue new ID tokens. A smart way of securing refresh tokens involves token rotation: every time a refresh token is used, a new one is issued and the old one is invalidated. Refresh tokens and the connected ID tokens form a token family. Whenever someone tries to use a refresh token that has already been rotated out, all existing tokens in the family are invalidated and the user is asked to re-authenticate. This implies that only one valid refresh token can be in circulation, so a possible refresh token breach will have limited impact. To protect against replay attacks, nonces can be validated: the server should reject a message that it has already processed.
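Token rotation with token families is easier to grasp in code. The sketch below is an in-memory toy, not how any particular provider implements it: class and method names are mine, token expiry is ignored, and ID tokens are left out so that only the refresh token bookkeeping remains.

```python
import secrets


class ReuseDetected(Exception):
    """A refresh token that is no longer the current one was presented."""


class RefreshTokenStore:
    """In-memory refresh token rotation with family-wide revocation."""

    def __init__(self):
        self._family_of = {}  # every token ever issued -> its family id
        self._current = {}    # family id -> the single valid refresh token

    def issue(self):
        """Start a new token family after a full user authentication."""
        return self._mint(secrets.token_hex(8))

    def rotate(self, token):
        """Exchange a refresh token for a fresh one, or revoke the family."""
        family = self._family_of.get(token)
        if family is None:
            raise KeyError("unknown refresh token")
        if self._current.get(family) != token:
            # An old token was replayed (or the family is already revoked):
            # invalidate the whole family and force re-authentication.
            self._current.pop(family, None)
            raise ReuseDetected("re-authentication required")
        return self._mint(family)

    def _mint(self, family):
        token = secrets.token_urlsafe(32)
        self._family_of[token] = family
        self._current[family] = token  # replaces the previous valid token
        return token
```

If an attacker steals a refresh token and uses it after the legitimate client has already rotated it, rotate raises ReuseDetected and the newest token of that family stops working as well, so both parties are forced back to the login page.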

OIDC is an interesting protocol as it can be used with single-page applications (SPAs). With token security strategies in place, tokens can be stored in the browser relatively securely. OIDC supports re-authentication in the background (as long as the user is authenticated with the IdP) through the silent authentication feature (the prompt=none parameter). Using refresh tokens is optional: they are definitely useful for backend applications, but silent authentication can be more than enough for SPAs. In that case, the Authorization Endpoint will return the ID token directly.
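A silent authentication request is just a regular authentication request with prompt=none added. Here is a minimal sketch that builds such a URL; the endpoint, client ID and redirect URI are placeholders, while the parameter names come from the OIDC specification.

```python
import secrets
from urllib.parse import urlencode


def silent_auth_url(authorization_endpoint, client_id, redirect_uri):
    """Build an authentication request that must not show any login UI."""
    state = secrets.token_urlsafe(16)  # echoed back by the provider
    nonce = secrets.token_urlsafe(16)  # must reappear inside the ID token
    params = {
        "response_type": "id_token",   # ID token returned directly
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid",
        "state": state,
        "nonce": nonce,
        "prompt": "none",              # fail instead of prompting the user
    }
    return authorization_endpoint + "?" + urlencode(params), state, nonce
```

A SPA would typically load this URL in a hidden iframe. If the user no longer has a session with the provider, the response carries an error such as login_required instead of a token, and the application falls back to a regular interactive login.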

Most components in OIDC nomenclature can be mapped to the ones in SAML:

  • Relying Party is the Service Provider,
  • OpenID provider is the Identity Provider,
  • ID token serves the same purpose as SAML response,
  • Claim is the same as SAML assertion (information about the end user).
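An ID token is a JWT, so its claims are just a Base64url-encoded JSON object in the middle segment. The sketch below decodes them from a hand-made, unsigned sample token. The claim values are invented, and the function intentionally skips signature verification, so it is fine for looking inside a token but not for trusting it.

```python
import base64
import json


def b64url_encode(data):
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment):
    """Decode Base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def unverified_claims(id_token):
    """Return the claims from a JWT payload WITHOUT checking the signature."""
    _header, payload, _signature = id_token.split(".")
    return json.loads(b64url_decode(payload))


# Made-up claims; a real ID token is signed by the OpenID provider.
sample_claims = {
    "iss": "https://op.example.com",
    "sub": "248289761001",
    "aud": "spa-client",
    "exp": 1893456000,
    "email": "jdoe@example.com",
}
sample_token = ".".join([
    b64url_encode(json.dumps({"alg": "none"}).encode("utf-8")),
    b64url_encode(json.dumps(sample_claims).encode("utf-8")),
    "",  # empty signature segment
])

claims = unverified_claims(sample_token)
```

Here iss, sub, aud and exp are standard claims (issuer, subject, audience, expiry), while email is an example of additional information about the end user.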

From the architecture perspective there are three main components:

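  • Authorization Endpoint: where the OpenID provider authenticates users and, depending on the flow, returns an authorization code or the tokens themselves to the Relying Party,
  • Token Endpoint: a place to exchange an authorization code or a refresh token for tokens (ID token, access token),
  • UserInfo Endpoint: a place to fetch claims about the user using an access token.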

Not all components have to be used in each of the workflows supported by OIDC. For instance, the flow without refresh tokens (called implicit flow) doesn’t use the Token Endpoint. Keep in mind that these components are shared by OAuth and OIDC. There is even an option to fetch both the authentication result (ID token) and the authorization grant (OAuth access token) in one go.
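For the flows that do use the Token Endpoint, the Relying Party sends a form-encoded POST body. A rough sketch of the two most common bodies follows. The parameter names are defined by OAuth 2.0; the client ID, code and token values are placeholders, and client authentication (which confidential clients normally add) is left out.

```python
from urllib.parse import urlencode


def code_exchange_body(code, redirect_uri, client_id):
    """Body for exchanging an authorization code for tokens."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,  # must match the one used in the request
        "client_id": client_id,
    })


def refresh_body(refresh_token, client_id):
    """Body for obtaining new tokens with a refresh token."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
    })
```

Both bodies are sent with the Content-Type application/x-www-form-urlencoded, and the response is a JSON document containing the issued tokens.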

Conclusion

In this article we went over two topics: SCIM (which controls how user identity information can be synchronized between systems) and authentication protocols (SAML, OIDC). We learned that SCIM is a protocol for user account synchronization, allowing companies to quickly create accounts for new employees in all IT systems and automagically remove the accounts of people who were let go from all of them. You should also have a better grasp of the differences between SAML and its younger brother OIDC, built on top of OAuth 2.0.

This is part 2 of my article series on auth acronyms. Check out part 1 here.

Written by Adrian Bednarz

Staff Data Engineer — AI / Big Data / ML
