Problem Statement
Have you heard of identity and access management (IAM), attribute-based access control (ABAC), or role-based access control (RBAC)? You’re in the right place. If not, feel free to share this with one of your IT or Security engineers or managers.
We ourselves are IT and Security administrators who provision access requests day-to-day and manage identity systems for our users, just like you or one of the teams at your company.
We’ve been asking ourselves: why do we have so many access requests? Why do we have access requests in the first place? Isn’t there a better way?
GitLab Origin Story
At GitLab, like many rapidly growing technology companies, our IT and Security departments faced significant challenges in managing access control across an expanding ecosystem of applications and services. As the company grew, the manual processes for handling user access, entitlements, and provisioning became increasingly time-consuming, error-prone, and difficult to audit.
In 2023, it took four people working full time every month to manually provision baseline (birthright and job-role) entitlements and handle the ad-hoc day-to-day access requests. Team members and contractors were waiting several days to get provisioned access to applications. Auditing was manual. Offboarding from applications was manual and time-consuming for multiple teams.
After several years of approving and provisioning access requests, we grew tired of it and started looking for a solution that could automate the provisioning process. However, we realized that we needed good granular data about our users to determine what they should have access to. The closest that we had was a handful of high-level attributes (e.g. department, generic job title) from Workday that fed our Okta user profile attributes. We could create group rules with those attributes, so we had an Okta group for every department. We looked at using complex expression string matching with department and manager or department and job title. However, it quickly became unsustainable since the strings changed constantly and you simply can’t keep track of it all.
For example, how do you solve for job titles that are renamed but have the same access, or job titles that change and should be attached to new policies?
In 2024, GitLab renamed 40% of departments with functional realignments and new naming standards. Can you imagine how much rework we had to do to adjust our group rules that use string matching? It’s not the first time, and it won’t be the last.
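To make the fragility concrete, here is a minimal sketch of a group rule keyed on a department name string. The rule table, group names, department names, and the `groups_for` helper are all hypothetical and only illustrate the failure mode. This is not Okta's expression language or our actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    email: str
    department: str  # free-text name synced from the HRIS (hypothetical field)


# Hypothetical group rules keyed on the department *name* string.
GROUP_RULES = {
    "Infrastructure": "dept-infrastructure",
    "Security Operations": "dept-security-operations",
}


def groups_for(user: User, rules: dict[str, str]) -> list[str]:
    """Return the groups whose rule string exactly matches the user's department."""
    return [group for name, group in rules.items() if user.department == name]


# The same person before and after a department rename in the HRIS.
before = User("alex@example.com", "Security Operations")
after = User("alex@example.com", "Security Engineering")

print(groups_for(before, GROUP_RULES))  # ['dept-security-operations']
print(groups_for(after, GROUP_RULES))   # [] because the rule silently stops matching
```

Nothing fails loudly here. The user simply falls out of the group until someone notices and rewrites the rule, and that step repeats for every rule that mentions the old string.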
We spent many years trying to use configuration-as-code technologies including Ansible, Terraform, and various JSON/YAML data files for our homegrown scripts, and we understand the importance of state management using Git repositories. However, at some point you start to stretch the limits when you operate at higher scale and try to offer self-service. It’s not realistic to get all of your non-engineering users to open merge requests for their access requests. You’ve just shifted the configuration state to a new location, and it still requires users to do a lot of busy work.
Pain Points
If you struggle with any of these pain points, rest assured that we feel your pain and you’re in the right place.
- Access Changes for New Job Roles: Making permission changes for users who change roles (movers) is tedious.
- Access Requests: Process is manual and time consuming for team members and application provisioners in all departments.
- Attribute Name Changes: String matching rules are starting to cause problems, and HRIS changes cause a lot of busy (re-)work, with some changes being missed due to configuration sprawl.
- Audit: Providing audit evidence is time consuming and not all systems are covered.
- Automation: We do not have a platform that integrates with Okta or directly with the 250+ apps not currently managed by Okta. As a result, we cannot automate user onboarding and offboarding.
- Centralization: Without a centralized database, you can only perform point-to-point automations with scripts or no-code solutions. You cannot achieve end-to-end access management with approval policies and have an audit trail with all approvals, access review audits, and provisioning logs. You also don’t have relationships that can be visualized in the UI or fetched with API.
- Checklists Out of Date: Managing onboarding (joiner) provisioning and offboarding (leaver) deprovisioning checklists is hard, and the checklists are often incomplete or inconsistent.
- Complex Group Rules: Group rules in Okta are not powerful enough for multi-dimensional calculations.
- Cross-System Sync: Keeping group users in sync across multiple systems is difficult (e.g. GitLab Groups, Google Groups, Okta Groups, Slack Groups).
- Integrations: No 3rd party vendors support provisioning for a large enough number of our tech stack apps (API integrations, etc.).
- Job Role Automation: You have reasonable automation for baseline/birthright access for a large group of users but have weak processes for granular/smaller teams with specific access needs.
- Long Waiting Times: Users complain that getting access to systems takes too long due to complex manual provisioning and deprovisioning or high access request volume.
- Manager Fatigue: Managers are tired of approving access requests and just rubber stamp them for non-sensitive systems.
- No Code Workflow Automation: Do you have to build scripts and workflows to detect changes and trigger alerts or send notifications to different systems?
- Process Design: Business Policies and Processes for Identity Management and Role-Based Access Control could be improved or need to be redesigned
- Provisioning is Time Consuming: Provisioning and Deprovisioning
- Tech Stack: Maintaining the list of application approvers in the tech stack is manual and challenging, leading to inaccuracy, delays, and compliance risk.
- 🤯 Time for a Vacation? Do you feel buried in tickets and email threads and they never seem to end? What if we could reduce the volume of access requests by 20%? 50%? 70%?
Thinking Different
These pain points are a day-in-the-life of IT and Security operations team members. The reality is that it takes a lot of time and investment by your teams to thoughtfully solve these problems.
Access Control is not trying to directly compete with the other Identity Governance and Administration (IGA) vendors. See the Industry and Market Competitors to learn more about the vendors that coexist and solve similar problems to Access Control.
We are trying to think different about why the problems that IGA vendors solve exist in the first place, and to solve for the root cause of the ones that are preventable, with an emphasis on access requests and role-based access control using comprehensive user directory attribute metadata.
You can use Access Control alongside an IGA vendor. At GitLab, we use Lumos as our IGA vendor as a turnkey solution to solve for our access requests (that should decrease over time) and use Access Control to manage our policies (to improve our policies and automated provisioning over time).
Decision to Build
Jeff Martin incubated Access Control as an open source side project at GitLab while working in Corporate Security (formerly IT Operations). It started as a way to scratch our own itch: we had outgrown group rules in Okta and manually managing members in GitLab, Google, and Slack groups, and we struggled with policy management for baseline/birthright role-based access control (RBAC).
We had the opportunity to try to solve the root cause of these pain points and Access Control is a product of those engineering solutions.
We believe that by focusing on doing a few of the hard things well, specifically the last 10-20% that the vendors aren’t solving, we can solve many of the pain points above that many vendors are only scratching the surface of. In other words, we’re not trying to reinvent the wheel. We’re trying to solve the last-mile automation that we haven’t been able to find a solution for on the market.
As we started building, we realized how generic and universal our problem statement was to the industry, so we decided that since we’re already doing it, let’s open source it. It’s separate from the DevSecOps product that GitLab is known for. However, we have a strong engineering and security culture that believes in Security First, and sometimes that means you have to build your way out of the problem instead of buying your way out of it.
There are several “bat cave”, “security automation and research”, or “skunkworks” engineers at larger tech companies that have to solve internal problems like this. We just happen to be a company that believes in transparency and open source so we have the joy of being able to open source our work and build in public.
By open-sourcing Access Control, GitLab continues its commitment to its transparency value of “public by default” and its collaboration value of “it’s impossible to know everything”, hopefully benefiting other organizations (like yours!) facing similar challenges in the industry.
Market Vendor Gaps
We talked to several of the vendors, ran demos and proofs of concept, and were left with one glaring question: where is the policy management for group rules? The general sentiment is that most vendors seem to be focused on access requests and access reviews with human intervention, with automation happening only after humans have performed manual approvals for each request.
As we started looking at the problem, we realized how much wider an impact group rules and policy management can have on determining role-based access control. We discovered the need for Identity Governance and Administration (IGA), but were perplexed by several overarching gaps:
- Access Requests: Why are we creating yet another ticket in yet another system with yet another process? Isn’t the goal to automate it?
- Appropriate Access vs Circumstantial Time-Based Approval: Is access appropriate at any time or only for specific circumstances? Can we pre-approve the business reason for their access, so any just-in-time access is pre-approved without delay?
- Checklist Automation: We have a checklist of applications/groups/resources to provision for each job role. How does that vendor automate that?
- Configuration Management: How do we standardize the configuration that makes it easy for administrators to manage holistically and granularly? Many UIs have limitations and are prone to mistakes.
- Group Management: How do we keep all of our team groups in sync across all systems? Why are systems managed independently?
- Manager Approval: Why does the manager need to approve? They will rubber stamp it anyway. Are we trying to determine if their access is appropriate at the specific time, or because we didn’t have their approval earlier that the role is appropriate?
- Necessity: Why was the access request needed? Is this missing from the pre-approved checklist?
- Perpetual Access: When is perpetual access acceptable? There are many non-sensitive systems where it’s easier to just have access perpetually provisioned.
- Pre-Determined Granular Role: Don’t we know what their job role is? Can we use their profile attribute data to make a decision?
- Policy Management: If our birthright/baseline/role policies are configured properly, why do we need access requests? Why not just push updates to the policy that is then used to provision access on downstream systems that use that policy?
- SCIM Limitations: How do we avoid the problems of losing user profile data on downstream applications if they are provisioned and deprovisioned using traditional technologies like SCIM?
- Security Risks: Many decisions are based on security risk intentions, not security operations realities. Even fewer decisions are made with user experience in mind. How do we solve for the security risks with pre-approved access, whether or not it’s pre-provisioned?
- SSO Limitations: How do we solve for the limitations with group rules that we had?
- Why: Why do we impose this process on users to get approval and managers to provide that approval?
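The Policy Management and Group Management questions above can be pictured as a reconciliation loop: compute who should be in a group from a policy, then diff that against each downstream system. The sketch below is illustrative only. The user records, `role_id` values, policy shape, and system names are hypothetical and do not reflect Access Control's actual data model.

```python
# Hypothetical directory records and a policy keyed on role IDs.
users = [
    {"email": "alex@example.com", "role_id": 101},
    {"email": "sam@example.com", "role_id": 101},
    {"email": "kit@example.com", "role_id": 202},
]
policy = {"name": "security-engineering-baseline", "role_ids": {101}}

# Hypothetical current membership reported by two downstream systems.
actual_members = {
    "google_group": {"alex@example.com", "kit@example.com"},
    "slack_group": {"alex@example.com", "sam@example.com"},
}


def desired_members(users: list[dict], policy: dict) -> set[str]:
    """Everyone whose role is covered by the policy is pre-approved."""
    return {u["email"] for u in users if u["role_id"] in policy["role_ids"]}


def plan_changes(desired: set[str], actual: set[str]) -> dict[str, list[str]]:
    """Diff the desired state against one system's actual membership."""
    return {"add": sorted(desired - actual), "remove": sorted(actual - desired)}


desired = desired_members(users, policy)
plans = {name: plan_changes(desired, members) for name, members in actual_members.items()}

for name, plan in plans.items():
    print(name, plan)
```

In this sketch, a change to the policy or to a user's role changes the desired set, and every system converges on it without anyone filing a request.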
Technical Justification
After building and subsequently outgrowing previous generations of homegrown scripts, and finding that none of the vendor tools met our needs, we decided to buy the tool that best fit our needs for access requests and approvals (Lumos) and build the remaining pieces for role-based entitlements ourselves.
We tried to build earlier incarnations of Access Control with scripts and CI/CD pipelines. We internally referred to it as the Access Request Configuration Hyperautomation Identity Engine (ARCHIE). At first, the JSON and YAML policy files and manifests appeared to solve our problem for configuration management. However, new problems appeared: how users could make self-service updates, how much skills training we needed to provide for business owners to make configuration changes, and how to present diffs between policies in an aesthetically pleasing way (like seeing a user’s job history or when different attributes were assigned to the policy over time) without having to crawl through Git commit history. It was sufficient for engineers, but we had just built something else that required highly skilled team members to maintain. We aren’t the first to have these problems, and we won’t be the last. Our friends over at GitHub had a similar journey with their open source Entitlements project, and they also use Lumos for their IGA.
The reality is that you get a lot of benefits with a relational database and generated IDs that you simply can’t achieve with strings in JSON flat files running in CI/CD pipelines (at scale).
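As a rough illustration of that point, the sketch below uses an in-memory SQLite database where a policy references a department by its generated ID rather than by its name. The table names, columns, and values are hypothetical and are not the Access Control schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE policies (
        id INTEGER PRIMARY KEY,
        department_id INTEGER NOT NULL REFERENCES departments(id),
        entitlement TEXT NOT NULL
    );
    """
)

# The database generates the ID; policies point at the ID, never the name.
dept_id = conn.execute(
    "INSERT INTO departments (name) VALUES (?)", ("Security Operations",)
).lastrowid
conn.execute(
    "INSERT INTO policies (department_id, entitlement) VALUES (?, ?)",
    (dept_id, "security-baseline-group"),
)


def entitlements_for(department_id: int) -> list[str]:
    """Look up entitlements through the stable ID."""
    rows = conn.execute(
        "SELECT entitlement FROM policies WHERE department_id = ? ORDER BY id",
        (department_id,),
    )
    return [row[0] for row in rows]


# Renaming the department is a one-row update; no policy needs to change.
conn.execute(
    "UPDATE departments SET name = ? WHERE id = ?", ("Security Engineering", dept_id)
)
new_name = conn.execute(
    "SELECT name FROM departments WHERE id = ?", (dept_id,)
).fetchone()[0]

print(new_name, entitlements_for(dept_id))
```

The rename touches a single row, and every policy attached to the department keeps resolving because the relationship was never expressed as a string.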
We started to see the need and benefits for a unified relational database, API endpoints, recurring sync background jobs, and a self-service UI for our employees and contractors, along with a strong policy engine that solves for pre-approval logic so no just-in-time approval is needed.
We have had success in the past with building homegrown self-service portals for infrastructure and demo/training related access requests, so we decided to apply our experience to solve the wider problem with access requests across more systems in the company.
Business Justification
The decision to build Access Control was driven by several key factors:
- Scalability: Manual processes were no longer sustainable as the company expanded.
- Security: Ensuring proper access control and least privilege became more critical as the organization grew.
- Compliance: Meeting audit requirements and maintaining accurate records of access rights became more complex.
- Efficiency: Reducing the time and resources spent on access management tasks was essential.
- User Experience: Streamlining the process for employees to request and receive access to necessary tools and applications.
- Integration: The need for a solution that could work seamlessly with GitLab’s existing tech stack and SSO provider.
By developing Access Control in-house as a side incubation project, GitLab was able to create a tailored solution that addressed these specific challenges while aligning with the company’s unique culture and workflows. This custom approach allowed for greater flexibility and control over the access management process, enabling GitLab to implement best practices in identity and access management (IAM) and role-based access control (RBAC) that were specifically suited to their needs.
Overengineered Solution
Overengineering can be useful if we put in the effort to simplify the way that it’s used while allowing us to leverage the power under the hood when we need it. Sports cars (refined overengineering) have a lot of power under the hood, but they are still more practical to drive to the grocery store than a semi truck (unrefined overengineering).