Authenticating Users for Access Control with RAG for LangChain in Python

An example CLI tool in Python demonstrating how to integrate Pangea's AuthN and AuthZ services into a LangChain app to filter out RAG documents based on user permissions.

Prerequisites

Setup

Pangea AuthN

After activating AuthN, under AuthN > General > Redirect (Callback) Settings, add http://localhost:3000 as a redirect and save.

Under AuthN > Users > New > Create User, create at least one user.

Pangea AuthZ

The setup in AuthZ should look something like this:

Resource types

Name         Permissions
engineering  read
finance      read

Roles & access

Tip: Create two new roles, named engineering and finance, under the Roles & Access tab in the Pangea console.

Role: engineering

Resource type  Permissions (read)
engineering    ✔️
finance

Role: finance

Resource type  Permissions (read)
engineering
finance        ✔️

Assigned roles & relations

Subject type  Subject ID           Role/Relation
user          your AuthN username  engineering
user          [email protected]    finance

Note: Change or add assigned roles for your user to change permissions and access over time.
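To make the role setup above concrete, here is a minimal in-memory sketch of how the two roles map to read permissions. It is an illustration only and does not call the Pangea SDK; the names `ROLE_PERMISSIONS`, `ASSIGNED_ROLES`, and `can`, and the usernames `alice` and `bob`, are hypothetical.

```python
# Hypothetical in-memory model of the AuthZ setup above; not the Pangea SDK.
# Each role grants a set of (resource type, permission) pairs.
ROLE_PERMISSIONS = {
    "engineering": {("engineering", "read")},
    "finance": {("finance", "read")},
}

# Subject ID -> assigned roles. "alice" and "bob" are placeholder usernames.
ASSIGNED_ROLES = {
    "alice": {"engineering"},
    "bob": {"finance"},
}


def can(user_id: str, action: str, resource_type: str) -> bool:
    """Return True if any role assigned to user_id grants action on resource_type."""
    return any(
        (resource_type, action) in ROLE_PERMISSIONS.get(role, set())
        for role in ASSIGNED_ROLES.get(user_id, set())
    )
```

With this model, `can("alice", "read", "engineering")` is true and `can("alice", "read", "finance")` is false. Adding `"finance"` to a user's assigned roles is the in-memory equivalent of assigning that role in the console.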

Configure the Code

git clone https://github.com/pangeacyber/langchain-python-user-authn.git
cd langchain-python-user-authn

Install libmagic

On Windows, libmagic is provided by the python-magic-bin package.

On macOS, you can install via this shell command:

brew install libmagic

Set up Project

If using pip:

python -m venv .venv
source .venv/bin/activate
pip install .

Or, if using uv:

uv sync
source .venv/bin/activate

The sample can then be executed with:

python -m langchain_user_authn "What is the software architecture of the company?"

Usage

Usage: python -m langchain_user_authn [OPTIONS] PROMPT

Options:
  --authn-client-token TEXT  Pangea AuthN Client API token. May also be set
                             via the `PANGEA_AUTHN_CLIENT_TOKEN` environment
                             variable.  [required]
  --authn-hosted-login TEXT  Pangea AuthN Hosted Login URL. May also be set
                             via the `PANGEA_AUTHN_HOSTED_LOGIN` environment
                             variable.  [required]
  --authz-token SECRET       Pangea AuthZ API token. May also be set via the
                             `PANGEA_AUTHZ_TOKEN` environment variable.
                             [required]
  --pangea-domain TEXT       Pangea API domain. May also be set via the
                             `PANGEA_DOMAIN` environment variable.  [default:
                             aws.us.pangea.cloud; required]
  --model TEXT               OpenAI model.  [default: gpt-4o-mini; required]
  --openai-api-key SECRET    OpenAI API key. May also be set via the
                             `OPENAI_API_KEY` environment variable.
                             [required]
  --help                     Show this message and exit.
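Because every required option except `--model` can also come from an environment variable, it can help to check the environment before running the tool. The following is a small sketch of such a check; the helper name `missing_env_vars` is hypothetical and is not part of the sample. `PANGEA_DOMAIN` is left out because the help text gives it a default.

```python
import os
from collections.abc import Mapping

# Environment variables that the help text above marks as required.
REQUIRED_ENV_VARS = (
    "PANGEA_AUTHN_CLIENT_TOKEN",
    "PANGEA_AUTHN_HOSTED_LOGIN",
    "PANGEA_AUTHZ_TOKEN",
    "OPENAI_API_KEY",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```

Any variable reported as missing can instead be passed through its matching command-line option.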

Let's assume the current user is "[email protected]" and that they should have permission to see engineering documents. They can query the LLM on information regarding those documents:

$ python -m langchain_user_authn "What is the software architecture of the company?"

This opens a new tab in the user's default web browser where they can log in through AuthN. Afterwards, their permissions are checked against AuthZ, and they receive a response derived from the engineering documents:

The company's software architecture consists of a frontend built with ReactJS,
Redux, and Axios, along with Material-UI for design components. The backend
utilizes Node.js and Express.js, with MongoDB as the database. Authentication
and authorization are managed through JSON Web Tokens (JWT) and OAuth 2.0, and
version control is handled using Git and GitHub.

But they cannot query finance information:

$ python -m langchain_user_authn "What is the top salary in the Engineering department?"

[login flow]

I don't know the answer to that question, and you may not be authorized to know the answer.

And vice versa for "[email protected]", who is in finance but not engineering.
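The behavior in these examples can be sketched as a filter applied to retrieved documents before they reach the LLM. This is a rough illustration under stated assumptions, not the sample's actual code: `Doc`, `filter_docs`, and the `is_allowed` callback are hypothetical names, and in a real app the callback would wrap a call to AuthZ rather than the stub shown here.

```python
from collections.abc import Callable, Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    category: str  # assumed to match an AuthZ resource type, e.g. "engineering"
    text: str


def filter_docs(
    docs: Iterable[Doc],
    user_id: str,
    is_allowed: Callable[[str, str, str], bool],
) -> list[Doc]:
    """Keep only the documents whose category user_id may read."""
    return [d for d in docs if is_allowed(user_id, "read", d.category)]


# Stub permission check standing in for an AuthZ call (assumption):
# the placeholder user "alice" may read engineering documents only.
def stub_is_allowed(user_id: str, action: str, resource_type: str) -> bool:
    return (user_id, action, resource_type) == ("alice", "read", "engineering")


docs = [
    Doc("engineering", "Frontend: ReactJS, Redux, Axios."),
    Doc("finance", "Salary bands by department."),
]
visible = filter_docs(docs, "alice", stub_is_allowed)
```

Here `visible` contains only the engineering document. When the filter leaves no relevant context, the LLM has nothing to draw on for the finance question, which is consistent with the "I don't know" response shown above.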