GitHub Agent Integration

Auto-Generated Documentation

This page is automatically synchronized with integration components.

Last Updated: 2025-12-04

Component Version Tracking:

GitHub MCP: v1 (updated 2025-12-04)
GitHub Document Loader: v3 (updated 2025-12-04)

Overview

GitHub is the world's leading software development platform, providing Git repository hosting, collaborative code review, project management, and DevOps automation. The AnswerAgentAI integration with GitHub enables you to build intelligent workflows that interact with your repositories, issues, pull requests, and development processes.

With this integration, you can:

Load repository data into AI workflows using the Document Loader
Manage repositories programmatically through the MCP (Model Context Protocol) server
Automate code reviews with AI-powered analysis and suggestions
Generate documentation from codebases automatically
Triage and respond to issues using natural language understanding
Create and update pull requests as part of automated workflows
Search across repositories using semantic understanding

GitHub's comprehensive API and webhook system make it ideal for AI integrations, enabling everything from automated code analysis and documentation generation to intelligent issue management and developer productivity tools.

Quick Start

Obtaining Credentials

This integration authenticates to the GitHub API with a Personal Access Token (PAT). The scopes or permissions the token needs depend on your use case.

GitHub Personal Access Token

Required Credential:

Personal Access Token (PAT): Fine-grained or classic token with appropriate permissions

How to obtain:

Access Personal Access Tokens Settings:
- Log in to your GitHub account at https://github.com/
- Click your profile photo (top right) → Settings
- Scroll to Developer settings → Personal access tokens
Choose Token Type:

GitHub offers two types of tokens:

Fine-grained tokens (Recommended):
- Click Fine-grained tokens → Generate new token
- More secure with repository-specific access
- Expiration required (maximum 1 year)
- Granular permission control
Classic tokens:
- Click Tokens (classic) → Generate new token (classic)
- Broader permissions
- Can be set to never expire (not recommended)
- Simpler setup but less secure
Configure Token:

Token Name: Give it a descriptive name (e.g., "AnswerAgentAI Integration")

Expiration: Choose an expiration period (recommended: 90 days)

Repository Access (Fine-grained only):
- Select specific repositories, or
- Choose "All repositories" for organization-wide access
Set Permissions/Scopes:

For Fine-grained tokens:

Repository permissions:
- Contents: Read and write (for loading files and creating commits)
- Issues: Read and write (for issue management)
- Pull requests: Read and write (for PR management)
- Metadata: Read-only (automatically included)
- Commit statuses: Read and write (for CI/CD integration)
- Discussions: Read and write (if using discussions)
For Classic tokens, select these scopes:
- repo - Full control of private repositories
  - repo:status - Access commit status
  - repo_deployment - Access deployment status
  - public_repo - Access public repositories (if you only need public access)
- read:org - Read organization membership (if working with org repos)
- read:user - Read user profile data
- read:discussion - Read discussions (if needed)
Generate and Copy Token:
- Click Generate token
- IMPORTANT: Copy the token immediately - you won't be able to see it again!
- Store it securely (password manager recommended)

Security Best Practices

Never commit tokens to version control
Use fine-grained tokens with minimal required permissions
Set expiration dates and rotate tokens regularly
Use repository-specific access when possible
Revoke tokens immediately if compromised
Use organization secrets for team environments

Documentation Reference: GitHub Personal Access Tokens

GitHub App Authentication (Advanced)

For organization-wide integrations and enhanced security, consider using GitHub Apps:

Benefits:

More granular permissions
Higher rate limits
No user account dependency
Audit trail via app identity

Setup:

Create a GitHub App in organization settings
Install app to specific repositories
Generate private key for authentication
Use app installation tokens in API calls

Documentation: Creating a GitHub App

Available Components

Auto-Generated

This section is automatically generated from component metadata in scripts/integration-mapping.json. Last updated: December 4, 2025

This integration provides components across multiple categories:

Document Loaders

Load code, documentation, and repository data into your AI workflows.

GitHub Document Loader (v3)

Description: Load data from GitHub repositories including code files, documentation, issues, and pull requests.

Key Features:

Clone and load entire repositories or specific paths
Support for public and private repositories
Load specific file types or patterns
Include commit history and metadata
Filter by branch, tag, or specific commits
Support for markdown documentation
Load issues and pull request descriptions

Configuration Options:

Repository URL: Full GitHub repository URL (e.g., https://github.com/owner/repo)
Branch: Target branch (default: repository default branch)
File Pattern: Glob pattern for files to include (e.g., **/*.md, src/**/*.ts)
Recursive: Load files recursively from subdirectories
Max Files: Limit number of files to load
Include Metadata: Include commit info, file paths, and timestamps

Use Cases:

Building documentation chatbots from README and wiki files
Code search and navigation with semantic understanding
Generating code embeddings for similarity search
Analyzing repository structure and dependencies
Creating knowledge bases from open-source projects

Learn More: GitHub Document Loader Documentation

MCP Servers

Interact with GitHub's API through natural language using the Model Context Protocol.

GitHub MCP Server (v1)

Description: MCP Server for the GitHub API enabling comprehensive repository and project management through AI agents.

Available Tools:

Repository Management:

create_repository - Create new repositories
get_repository - Get repository details and metadata
list_repositories - List repositories for user or organization
update_repository - Update repository settings
delete_repository - Delete repositories
fork_repository - Create repository forks

File Operations:

get_file_contents - Read file contents from repository
create_or_update_file - Create or update files
delete_file - Remove files from repository
list_directory - List directory contents

Branch & Commit Management:

list_branches - List repository branches
create_branch - Create new branches
get_commit - Get commit details
list_commits - List commit history
compare_commits - Compare two commits or branches

Issue Management:

create_issue - Create new issues
update_issue - Update existing issues
list_issues - Search and filter issues
add_issue_comment - Comment on issues
close_issue - Close issues
assign_issue - Assign issues to users

Pull Request Management:

create_pull_request - Create new pull requests
update_pull_request - Update PR details
list_pull_requests - Filter and search PRs
merge_pull_request - Merge approved PRs
review_pull_request - Submit PR reviews
list_pull_request_files - Get changed files in PR

Search & Discovery:

search_repositories - Search GitHub for repositories
search_code - Search code across repositories
search_issues - Advanced issue search
search_users - Find GitHub users

Collaboration:

add_collaborator - Add repository collaborators
list_collaborators - List collaborators
create_webhook - Set up webhooks
get_user - Get user profile information

Use Cases:

Automated issue triage and labeling
Code review assistance and suggestions
Documentation generation from codebases
Repository maintenance automation
Development workflow orchestration

Learn More: GitHub MCP Documentation

Use Cases

Common Scenarios

1. Intelligent Code Documentation

Automatically generate and maintain documentation from your codebase.

Workflow:

Use GitHub Document Loader to import code files (e.g., **/*.ts, **/*.py)
Parse code to extract functions, classes, and their purposes
Generate comprehensive documentation using AI
Create or update README.md and wiki pages via GitHub MCP
Commit documentation back to repository

Benefits:

Always up-to-date documentation
Consistent documentation style
Reduced manual documentation overhead
Better onboarding for new developers

2. Automated Code Review Assistant

Provide AI-powered code review suggestions for pull requests.

Workflow:

Use webhook to trigger on new PR
Load changed files using list_pull_request_files
Analyze code for:
- Security vulnerabilities
- Performance issues
- Style inconsistencies
- Best practice violations
Post review comments via review_pull_request
Suggest improvements and alternatives

Benefits:

Faster code reviews
Consistent coding standards
Catch issues before human review
Educational feedback for developers

3. Issue Triage and Auto-Labeling

Automatically categorize and prioritize new issues.

Workflow:

Webhook triggers on new issue creation
Retrieve issue via get_issue
Analyze issue content with AI:
- Identify issue type (bug, feature, question)
- Determine severity and priority
- Detect affected components
Apply labels using update_issue
Assign to appropriate team member
Add standardized response or ask for clarification

Benefits:

Faster issue triage
Consistent categorization
Better resource allocation
Improved response times

4. Repository Knowledge Base

Create a searchable knowledge base from repository contents.

Workflow:

Load repository documentation via Document Loader
Include README, wiki, and markdown files
Split and generate embeddings
Store in vector database
Provide semantic search interface
Answer developer questions about the codebase

Benefits:

Faster developer onboarding
Self-service documentation access
Reduced repetitive questions
Better knowledge retention

5. Automated Release Notes

Generate comprehensive release notes from commits and PRs.

Workflow:

Fetch commits between releases using compare_commits
Load associated PR descriptions with list_pull_requests
Categorize changes (features, fixes, breaking changes)
Generate formatted release notes
Create GitHub release with notes
Update CHANGELOG.md

Benefits:

Professional release documentation
Time saved on manual writing
Consistent formatting
Complete change tracking

6. Dependency Update Management

Monitor and manage dependency updates automatically.

Workflow:

Scan package.json, requirements.txt, etc. using get_file_contents
Check for available updates
Analyze breaking changes and compatibility
Create branch with create_branch
Update dependency files with create_or_update_file
Run tests and create PR with results
Assign to maintainer for review

Benefits:

Stay current with dependencies
Security vulnerability mitigation
Reduced manual update work
Better change management

Example Workflows

Example 1: Documentation Chatbot

Goal: Create a chatbot that answers questions about your codebase and documentation.

Chatflow Configuration:

1. GitHub Document Loader
   - Credential: GitHub API (Personal Access Token)
   - Repository URL: https://github.com/yourorg/yourrepo
   - Branch: main
   - File Pattern: **/*.md,**/*.ts,**/*.js
   - Include Metadata: true
   - Recursive: true

2. Recursive Character Text Splitter
   - Chunk Size: 2000
   - Chunk Overlap: 200

3. OpenAI Embeddings
   - Model: text-embedding-3-small

4. Pinecone Vector Store
   - Index Name: codebase-docs
   - Namespace: main

5. Conversational Retrieval QA Chain
   - Retriever: Pinecone
   - LLM: GPT-4
   - Return Source Documents: true

Result: Developers can ask questions like "How do I authenticate users?" and get answers that cite the source files they were drawn from (file paths come from the loader metadata when Include Metadata is enabled).
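The Chunk Size and Chunk Overlap settings in the chatflow above control how loaded files are cut into pieces before embedding. The sketch below only illustrates what a 2000-character chunk with a 200-character overlap means. It is a plain fixed-window splitter, not the Recursive Character Text Splitter itself (which also tries to break on separators such as paragraphs and lines), and `chunk_text` is a hypothetical helper name.

```python
def chunk_text(text, size=2000, overlap=200):
    """Split text into windows of `size` characters that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(5000))
    pieces = chunk_text(sample)
    print([len(p) for p in pieces])
```

The overlap means the tail of one chunk is repeated at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.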

Example 2: Automated Issue Responder

Goal: Automatically respond to new issues with helpful information or questions.

Chatflow Configuration:

Conversational Agent (OpenAI Functions)
   - LLM: GPT-4
   - Tools: GitHub MCP

System Prompt:
   "You are an issue triage assistant for a software project.
   When a new issue is created:
   1. Read the issue description carefully
   2. Determine if it's a bug report, feature request, or question
   3. Check if similar issues exist using search_issues
   4. If it's a bug, ask for reproduction steps if not provided
   5. If it's a feature request, ask for use case details
   6. Apply appropriate labels using update_issue
   7. Add a welcoming comment thanking the user"

Trigger: GitHub webhook on issue creation
Memory: Buffer Memory

Example Interaction:

Webhook: New issue created #123 "App crashes on startup"

Agent:
1. Gets issue details with get_issue
2. Searches for similar issues: search_issues({"q": "crash startup"})
3. Analyzes issue description
4. Adds labels: ["bug", "needs-reproduction"]
5. Adds comment:
   "Thank you for reporting this! To help us fix this, could you provide:
   - Your operating system and version
   - Steps to reproduce the crash
   - Any error messages you see

   I found similar issues #87 and #102 - are you experiencing the same thing?"

Example 3: Pull Request Code Review

Goal: Automatically review pull requests for common issues and best practices.

Chatflow Configuration:

1. Conversational Agent
   - LLM: GPT-4-Turbo
   - Tools: [GitHub MCP, Code Analysis Tool]

2. Workflow:
   - Triggered by PR webhook
   - Get PR files using list_pull_request_files
   - For each changed file:
     * Load file contents with get_file_contents
     * Analyze for:
       - Security vulnerabilities (SQL injection, XSS, etc.)
       - Performance issues (N+1 queries, inefficient loops)
       - Code style violations
       - Missing tests for new functions
       - Outdated dependencies
   - Generate review comments
   - Submit review using review_pull_request

3. Review Comment Example:
   {
     "path": "src/api/users.ts",
     "position": 42,
     "body": "⚠️ Security: This SQL query is vulnerable to injection. Consider using parameterized queries instead: `db.query('SELECT * FROM users WHERE id = ?', [userId])`"
   }

Benefits:

Immediate feedback on PRs
Consistent review standards
Catches security issues early
Educates developers

Advanced Configuration

Document Loader Configuration

Repository and Branch Selection

{
    "repositoryUrl": "https://github.com/owner/repository",
    "branch": "main",
    "accessToken": "ghp_xxxxxxxxxxxx"
}

Branch Options:

main or master - Default branches
develop - Development branch
feature/xyz - Specific feature branch
Commit SHA - Specific commit (e.g., abc1234)
Tag - Release tag (e.g., v1.0.0)

File Pattern Filtering

Use glob patterns to load specific files:

{
    // Set exactly one filePattern; the commented-out lines are alternatives.
    "filePattern": "**/*.md", // All markdown files
    // "filePattern": "src/**/*.ts", // TypeScript in src/
    // "filePattern": "{README,CONTRIBUTING}.md", // Specific files
    // "filePattern": "**/*.{ts,js,json}", // Multiple extensions
    "recursive": true
}

Common Patterns:

**/*.md - All documentation
src/**/* - All source code
**/test/** - All test files
!**/node_modules/** - Exclude node_modules

Metadata Configuration

Include git metadata for better context:

{
    "includeMetadata": true,
    "metadata": {
        "includeCommitInfo": true,
        "includeFileStats": true,
        "includeBlameInfo": false
    }
}

Metadata Fields:

File Path: Relative path in repository
Last Modified: Last commit timestamp
Author: Last commit author
Commit Message: Last commit message
File Size: Size in bytes
Language: Detected programming language

MCP Server Configuration

Environment Variables

For production deployments, use environment variables:

GITHUB_PERSONAL_ACCESS_TOKEN=ghp_xxxxxxxxxxxx
GITHUB_DEFAULT_OWNER=your-org
GITHUB_DEFAULT_REPO=your-repo
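If your own code also calls the GitHub REST API alongside the MCP server, the same variable can be read at runtime instead of hard-coding the token. A minimal sketch, assuming the variable name shown above; `github_headers` is a hypothetical helper, and the header values follow GitHub's documented REST conventions.

```python
import os


def github_headers(env=os.environ):
    """Build REST API request headers from the token stored in the environment."""
    token = env.get("GITHUB_PERSONAL_ACCESS_TOKEN")
    if not token:
        # Fail early rather than sending unauthenticated requests by accident.
        raise RuntimeError("GITHUB_PERSONAL_ACCESS_TOKEN is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }


if __name__ == "__main__":
    try:
        headers = github_headers()
        print("Token loaded; headers ready:", sorted(headers))
    except RuntimeError as err:
        print(err)
```

Keeping the token in the environment (or a secrets manager) keeps it out of flow definitions and version control.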

Rate Limiting

GitHub API has rate limits:

Authenticated requests:

5,000 requests per hour (per user)
Primary rate limit

Search API:

30 requests per minute
Separate limit for search endpoints

Best Practices:

Cache frequently accessed data
Use conditional requests (ETags)
Implement exponential backoff on 403 and 429 rate-limit responses
Monitor rate limit headers
Consider GitHub Apps for higher limits (up to 15,000/hour for installations on GitHub Enterprise Cloud organizations)
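The backoff and header-monitoring practices above can be combined into one decision function. This is a sketch under stated assumptions: `seconds_to_wait` is a hypothetical helper, the caller supplies the response status and headers from whatever HTTP client it uses, and the header names (`retry-after`, `x-ratelimit-remaining`, `x-ratelimit-reset`) are the ones GitHub documents for its REST API.

```python
import time


def seconds_to_wait(status, headers, attempt, now=None, base=1.0, cap=60.0):
    """Return how long to sleep before retrying, or None if a retry will not help."""
    if status not in (403, 429):
        return None
    h = {k.lower(): v for k, v in headers.items()}
    now = time.time() if now is None else now
    if "retry-after" in h:
        # The server states the wait explicitly, in seconds.
        return float(h["retry-after"])
    if h.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in h:
        # Primary limit exhausted: wait until the reset time (epoch seconds).
        return max(0.0, float(h["x-ratelimit-reset"]) - now)
    if status == 429:
        # No guidance from the server: exponential backoff with a ceiling.
        return min(cap, base * 2 ** attempt)
    # A 403 without rate-limit headers is more likely a permissions problem.
    return None


if __name__ == "__main__":
    print(seconds_to_wait(429, {"Retry-After": "7"}, attempt=0))
```

Returning None for a bare 403 avoids retrying requests that fail because the token lacks a permission, which no amount of waiting will fix.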

Check rate limit:

// Use MCP tool to check current rate limit
{
    "tool": "get_rate_limit",
    "parameters": {}
}

Webhook Configuration

For real-time event handling:

Create webhook:

{
    "tool": "create_webhook",
    "parameters": {
        "owner": "your-org",
        "repo": "your-repo",
        "config": {
            "url": "https://your-server.com/webhook",
            "content_type": "json",
            "secret": "your-webhook-secret"
        },
        "events": ["push", "pull_request", "issues"]
    }
}

Common webhook events:

push - Code pushed to repository
pull_request - PR opened, updated, merged
issues - Issue created, updated, closed
issue_comment - Comments on issues/PRs
release - Release published
workflow_run - GitHub Actions workflow events

Webhook security:

Validate webhook signatures
Use HTTPS endpoints only
Rotate secrets regularly
Limit events to what you need
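Signature validation is the step most often skipped. GitHub signs each delivery with HMAC-SHA256 over the raw request body, using the webhook secret as the key, and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal sketch of the check follows; `sign` and `is_valid_signature` are hypothetical helper names, and how you obtain the raw body and header depends on your web framework.

```python
import hashlib
import hmac


def sign(secret, body):
    """Compute the signature value expected for a raw request body (bytes)."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_signature(secret, body, header_value):
    """Compare the received X-Hub-Signature-256 header against the expected value."""
    if not header_value:
        return False
    expected = sign(secret, body).encode("utf-8")
    # compare_digest runs in constant time to avoid leaking timing information.
    return hmac.compare_digest(expected, header_value.encode("utf-8"))


if __name__ == "__main__":
    payload = b'{"action": "opened"}'
    header = sign("your-webhook-secret", payload)
    print(is_valid_signature("your-webhook-secret", payload, header))
```

Verify against the raw bytes exactly as received; re-serializing parsed JSON can change whitespace and invalidate the signature.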

Organization-Level Access

For organization-wide operations:

{
    "owner": "your-organization",
    "type": "org",
    "permissions": {
        "repositories": "read",
        "members": "read",
        "projects": "write"
    }
}

Organization features:

Manage multiple repositories
Team-based access control
Organization-wide webhooks
Security alerts and policies

Working with Large Repositories

For repositories with many files:

Strategy 1: Incremental Loading

{
    "maxFiles": 100,
    "startPath": "src/components",
    "pagination": true
}

Strategy 2: Filtered Loading

{
    "filePattern": "**/*.{md,mdx}",
    "excludePatterns": ["**/node_modules/**", "**/dist/**", "**/.git/**"]
}

Strategy 3: Branch-Specific Loading

{
    "branch": "main",
    "paths": ["docs/", "README.md", "CONTRIBUTING.md"]
}

Frequently Asked Questions

Setup & Configuration

Q: What's the difference between fine-grained and classic Personal Access Tokens?

Fine-grained tokens (Recommended):
- Repository-specific access
- More granular permissions
- Required expiration (max 1 year)
- Better security audit trail
- Introduced in public beta in October 2022

Classic tokens:
- Access to every repository your account can reach
- Coarser permission scopes
- Can be set to never expire (not recommended)
- Simpler to set up
- Legacy approach

Q: What token permissions do I need for read-only access?

A:

Fine-grained: Only the Contents: Read permission
Classic: Only the public_repo scope (for public repos) or repo (for private repos)

Q: What permissions do I need to create issues and PRs?

A: Fine-grained:

Contents: Read and write (for file changes)
Issues: Read and write
Pull requests: Read and write

Classic:

repo (full repository access)
write:discussion (if using discussions)

Q: Can I use the same token for multiple repositories?

A: Yes! Classic tokens work across all repositories you have access to. Fine-grained tokens can be configured for specific repositories or all repositories.

Q: How do I handle token expiration?

Set calendar reminders before expiration
Generate new token with same permissions
Update credentials in AnswerAgentAI
Revoke old token after confirming new one works
Consider using GitHub Apps for longer-lived authentication

Usage & Best Practices

Q: How do I load only documentation files from a repository?

A: Use file pattern filtering:

{
    "filePattern": "**/*.md",
    "paths": ["docs/", "README.md", "CONTRIBUTING.md"]
}

Q: Can I load from private repositories?

A: Yes! Ensure your Personal Access Token has:

Fine-grained: Repository access configured for specific private repos
Classic: repo scope selected

Q: How do I avoid hitting rate limits?

Cache aggressively - Store loaded data locally
Use conditional requests - Check ETags before fetching
Batch operations - Group multiple file operations
Monitor usage - Check rate limit headers
Upgrade to GitHub Apps - Get up to 3x higher limits (15,000/hour for installations on GitHub Enterprise Cloud organizations)
Use GraphQL API - More efficient than REST for complex queries

Q: Can I monitor multiple repositories simultaneously?

A: Yes! Create separate Document Loader nodes for each repository, or use the MCP server to list and iterate through repositories programmatically.

Q: How do I handle merge conflicts when updating files?

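A: When updating an existing file, the MCP server's create_or_update_file requires the file's current blob SHA (the sha value returned when you read the file), not a commit SHA. If the file changed since you read it, the SHA no longer matches and you'll get a conflict error. Always:

Fetch the latest file version
Read the file's current SHA from the response
Include that SHA in the update request
Handle a conflict by re-fetching, reapplying your change, and retrying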
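The fetch, update, and retry steps above amount to an optimistic-concurrency loop. The sketch below shows the control flow only: `get_file` and `put_file` are hypothetical callables you would wire to the MCP tools (or the REST API), `ConflictError` is a hypothetical exception your wrapper would raise when the supplied SHA is stale, and `transform` is your edit expressed as a function of the current content.

```python
class ConflictError(Exception):
    """Raised by the hypothetical put_file callable when the supplied SHA is stale."""


def update_file_with_retry(get_file, put_file, transform, max_attempts=3):
    """Re-read the file and reapply the change until the write succeeds."""
    for _ in range(max_attempts):
        content, sha = get_file()  # always start from the latest version
        try:
            return put_file(transform(content), sha)
        except ConflictError:
            continue  # someone else wrote first; fetch again and retry
    raise ConflictError(f"gave up after {max_attempts} attempts")


if __name__ == "__main__":
    state = {"content": "hello", "sha": "sha-1"}

    def get_file():
        return state["content"], state["sha"]

    def put_file(content, sha):
        if sha != state["sha"]:
            raise ConflictError("stale SHA")
        state.update(content=content, sha="sha-2")
        return state["sha"]

    print(update_file_with_retry(get_file, put_file, str.upper))
```

Expressing the edit as a function of the current content, rather than as a fixed new file body, is what lets the retry merge with concurrent changes instead of overwriting them.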

Q: Can I trigger workflows on file changes?

A: Yes! When you commit files via the MCP server:

Use create_or_update_file to commit changes
GitHub Actions workflows trigger automatically if configured
Monitor workflow status using get_workflow_run
Wait for completion before proceeding

Q: How do I search for issues with specific criteria?

A: Use GitHub's search syntax:

{
    "tool": "search_issues",
    "parameters": {
        "q": "is:open is:issue label:bug repo:owner/repo created:>2024-01-01"
    }
}

Search qualifiers:

is:issue or is:pr
is:open, is:closed, is:merged
label:bug, label:"good first issue"
author:username
assignee:username
created:>2024-01-01
updated:<2024-12-31
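When queries are assembled from user input or workflow variables, building the string from parts avoids quoting mistakes. A small sketch, assuming only the qualifiers listed above; `build_issue_query` is a hypothetical helper, and the resulting string is what you would pass as the `q` parameter.

```python
def _quote(value):
    """Wrap values containing spaces in double quotes, as search syntax requires."""
    return f'"{value}"' if " " in value else value


def build_issue_query(repo=None, kind="issue", state=None, labels=(),
                      exclude_labels=(), created_after=None):
    """Assemble a GitHub issue search string from individual qualifiers."""
    parts = []
    if state:
        parts.append(f"is:{state}")
    if kind:
        parts.append(f"is:{kind}")
    parts.extend("label:" + _quote(label) for label in labels)
    parts.extend("-label:" + _quote(label) for label in exclude_labels)
    if repo:
        parts.append(f"repo:{repo}")
    if created_after:
        parts.append(f"created:>{created_after}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_issue_query(repo="owner/repo", state="open",
                            labels=["bug"], created_after="2024-01-01"))
```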

Troubleshooting

Q: Getting "401 Unauthorized" errors

Solutions:

Verify token hasn't expired
Check token has required permissions/scopes
Confirm token is correctly stored in credential
For private repos, verify token has repository access
Check if organization requires SSO - authorize token for SSO

Q: Document Loader returns empty results

Possible causes:

Wrong branch - Verify branch name is correct
File pattern doesn't match - Test pattern against actual files
Repository is empty - Check repository has files
Private repo without access - Verify token permissions
Path doesn't exist - Check specified paths exist in repo

Q: Rate limit exceeded errors

Solutions:

Check current limit: GET /rate_limit
Wait for reset (check X-RateLimit-Reset header)
Implement caching to reduce requests
Use conditional requests with ETags
Consider GitHub App for higher limits
For search, spread queries over time (30/min limit)

Q: Cannot create pull request

Common issues:

Branch doesn't exist - Create branch first with create_branch
No changes - Ensure commits differ from base branch
Branch protection - Check repository branch protection rules
Permissions - Verify the token has the Pull requests: Read and write permission (fine-grained) or the repo scope (classic)
Base branch incorrect - Verify base branch exists and is correct

Q: Webhook not receiving events

Debug steps:

Check webhook URL is publicly accessible
Verify webhook secret matches in code
Check webhook delivery history in GitHub settings
Ensure events are configured correctly
Validate SSL certificate (GitHub requires valid HTTPS)
Check webhook response status (must return 2xx)

Q: File content appears truncated or garbled

Causes:

Binary files - GitHub API doesn't support binary content well
Large files - Files >1MB need special handling
Encoding issues - Specify encoding in request
Rate limit truncation - Check if response was rate limited

Solutions:

Use Git LFS for large files
Download raw file via download_url for large files
Send the Accept: application/vnd.github.raw+json header to receive raw file contents
Filter out binary files in document loader
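Garbled output often comes from treating the API's base64 payload as text. The REST contents endpoint returns file data in a `content` field with `encoding` set to `base64`; for files above roughly 1 MB the `content` field comes back empty and the raw download should be used instead. The sketch below decodes a response item and skips anything that looks binary. `decode_contents` is a hypothetical helper, and the null-byte check is a simple heuristic rather than a guaranteed binary detector.

```python
import base64


def decode_contents(item):
    """Return the file text from a contents-API item, or None if it is not usable text."""
    if item.get("encoding") != "base64" or not item.get("content"):
        return None  # large file or unexpected encoding: fetch the raw file instead
    raw = base64.b64decode(item["content"])  # embedded newlines are ignored
    if b"\x00" in raw:
        return None  # null bytes strongly suggest a binary file
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # not valid UTF-8; treat as binary or re-decode explicitly


if __name__ == "__main__":
    print(decode_contents({"encoding": "base64", "content": "aGVs\nbG8=\n"}))
```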

Q: Search returns unexpected results

Tips:

Use quotes for exact phrases: "error handling"
Specify repository: repo:owner/name
Limit by date: created:>2024-01-01
Combine qualifiers: is:open is:issue label:bug
Use NOT operator: -label:wontfix
Remember search is case-insensitive

Q: Cannot update organization repositories

Requirements:

Token must have organization access
User must have write permissions to repository
Organization may require SSO authentication
Some operations require admin access
Organization may have IP allowlist

Q: Commit/push fails with authentication error

Checklist:

Token has Contents: Write permission
Branch is not protected (or you have bypass rights)
Repository is not archived
File size under 100MB (use Git LFS for larger)
Commit message follows repository requirements
No required status checks failing

Resources

Official Documentation

GitHub REST API Documentation - Complete REST API reference
GitHub GraphQL API - GraphQL API for complex queries
Personal Access Tokens Guide - Token creation and management
GitHub Apps Documentation - Building GitHub Apps for enhanced integrations
Webhooks Guide - Setting up and securing webhooks

Guides & Tutorials

GitHub API Quickstart - Getting started with GitHub API
Search Syntax - Advanced search queries
Rate Limiting - Understanding and handling rate limits
OAuth Apps vs GitHub Apps - Choosing authentication method
Security Best Practices - Securing integrations

AnswerAgentAI Documentation

GitHub Document Loader - Detailed loader configuration
GitHub MCP Server - Complete MCP tool reference
Document Loaders Overview - General document loader concepts
MCP Servers Overview - Introduction to MCP integrations

Community & Support

GitHub Community - Forums and discussions
GitHub Public Roadmap - Upcoming features
Stack Overflow - Q&A for GitHub developers
GitHub Changelog - API updates and new features
GitHub Status - Service status and incidents

Tools & SDKs

Octokit - Official GitHub API client libraries
GitHub CLI - Command-line interface for GitHub
REST API Browser - Interactive API explorer
GitHub Apps Marketplace - Pre-built integrations and tools
