SGAgent

Advanced AI-Powered Autonomous Software Engineering

60.67%
SWE-bench Lite Success Rate

SWE-bench Lite Leaderboard

Rank  System                                Score (%)  Date        Status
1     SGAgent + Claude 4 Sonnet (Ours)      60.67      2025-12-20  Competitive
1     ExpeRepair-v1.0 + Claude 4 Sonnet     60.33      2025-06-25  Active
2     Refact.ai Agent                       60.00      2025-04-25  Active
3     SWE-agent + Claude 4 Sonnet           56.67      2025-05-26  Active
4     Isea + Claude 3.5 Sonnet (Ours)       51.33      2025-09-10  Competitive
5     ExpeRepair-v1.0                       48.33      2025-06-13  Active
6     SWE-agent + Claude 3.7 Sonnet         48.00      2025-02-26  Active
7     DARS Agent                            47.00      2025-02-05  Active

System Overview

SGAgent is a multi-agent issue-fixing system that achieves a 60.67% success rate on SWE-bench Lite with Claude 4 Sonnet, the highest score on the leaderboard above. It combines a code knowledge graph (optionally backed by Neo4j), three specialized agents (Localizer, Suggester, Fixer), and test-based patch verification to locate, analyze, and fix software issues automatically.

[Figure: SGAgent system overview]

Multi-Agent Architecture

Knowledge Graph

Code structure & relationships

Localizer Agent

Identifies ≤5 issue locations

Suggester Agent

Proposes repair strategies

Fixer Agent

Generates 1 patch per round (optional multi-candidate)

Core Components

State Management

Sophisticated state graph orchestrating agent interactions and maintaining conversation context throughout the debugging process.

Neo4j Knowledge Graph

Optional graph backend that, when enabled, provides global code structure navigation and relationship analysis.

Advanced Tool Framework

Rich tool suite for code analysis, search, and manipulation, based on static analysis with optional Neo4j integration.

Dynamic Summarization

Intelligent context management that summarizes long conversations to maintain focus and prevent token limit issues.

Localizer / Suggester / Fixer Output Example

Localizer Output
{
    "locations": [
        {
            "path": "/root/temp_container/astropy__astropy-12907/astropy/modeling/separable.py",
            "start_line": 245,
            "end_line": 245
        }
    ],
    "reasons": [
        "The bug is in the _cstack function at line 245. When the right operand is an ndarray..."
    ]
}
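
A consumer of this output would likely validate it before passing it to the Suggester. The following is a minimal sketch that assumes only the JSON shape shown above; `parse_localizer_output` and `MAX_LOCATIONS` are hypothetical names, not SGAgent's actual API, and the path in the example is shortened for readability.

```python
import json

MAX_LOCATIONS = 5  # the Localizer is described as returning at most 5 locations


def parse_localizer_output(raw: str) -> list[dict]:
    """Parse Localizer JSON and return its validated location entries."""
    data = json.loads(raw)
    locations = data.get("locations", [])
    if len(locations) > MAX_LOCATIONS:
        raise ValueError(f"expected at most {MAX_LOCATIONS} locations, got {len(locations)}")
    for loc in locations:
        if not isinstance(loc.get("path"), str):
            raise ValueError("each location needs a string 'path'")
        start, end = loc.get("start_line"), loc.get("end_line")
        # Line numbers are treated as 1-indexed and inclusive (an assumption).
        if not (isinstance(start, int) and isinstance(end, int) and 1 <= start <= end):
            raise ValueError(f"invalid line range in {loc!r}")
    return locations


example = json.dumps({
    "locations": [{"path": "astropy/modeling/separable.py", "start_line": 245, "end_line": 245}],
    "reasons": ["..."],
})
print(parse_localizer_output(example))
```

Rejecting malformed output early keeps later agents from acting on a location that does not exist.
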
Suggester Output
{
    "suggestions": [
        {
            "title": "Fix _cstack function to properly handle nested CompoundModels",
            "rationale": [
                "The current _cstack function incorrectly handles nested compound models by setting all values to 1 instead of preserving the actual relationship matrix",
                "When the right operand is an array (from a recursive call), it should properly copy the matrix layout",
                "This fix ensures that nested compound models preserve their separability characteristics"
            ],
            "confidence": 0.8,
            "impact_area": "astropy.modeling.separable._cstack",
            "actions": [
                {
                    "path": "/root/temp_container/astropy__astropy-12907/astropy/modeling/separable.py",
                    "operation": "replace",
                    "start_line": 244,
                    "end_line": 245,
                    "symbol": null,
                    "patch_preview": "        cright = np.zeros((noutp, right.shape[1]))\n        cright[-right.shape[0]:, -right.shape[1]:] = right"
                }
            ],
            "risks": [
                "May affect other separability calculations if the change is too broad"
            ],
            "tests": [
                "Test nested compound models with separability_matrix function",
                "Verify that simple compound models still work correctly",
                "Run the existing test suite for separability module"
            ],
            "references": [
                "astropy.modeling.separable._cstack",
                "astropy.modeling.separable.separability_matrix",
                "astropy.modeling.core.CompoundModel"
            ]
        }
    ]
}
Fixer Output (Diff)
diff --git a/astropy/modeling/separable.py b/astropy/modeling/separable.py
index a308e2729..45bea3608 100644
--- a/astropy/modeling/separable.py
+++ b/astropy/modeling/separable.py
@@ -242,7 +242,7 @@ def _cstack(left, right):
         cright = _coord_matrix(right, 'right', noutp)
     else:
         cright = np.zeros((noutp, right.shape[1]))
-        cright[-right.shape[0]:, -right.shape[1]:] = 1
+        cright[-right.shape[0]:, -right.shape[1]:] = right
 
     return np.hstack([cleft, cright])

Comprehensive Tool Suite

Code Structure Analysis Tools

analyze_file_structure
Get a complete overview of a Python file: list all classes and methods with names, fully qualified names, and parameters.
get_code_relationships
Discover calls, inheritance, references, and dependencies of a code entity for impact analysis.
find_methods_by_name
Find all methods with a given name across the project, returning implementations and file paths.
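
To illustrate what a structure overview of this kind might return, here is a minimal sketch built on Python's standard `ast` module. It is not SGAgent's implementation: the function body, the `module` parameter, and the result layout are assumptions, and it covers only top-level classes and positional parameters.

```python
import ast


def analyze_file_structure(source: str, module: str = "module") -> list[dict]:
    """List top-level classes and their methods with qualified names and parameters."""
    entries = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.ClassDef):
            continue
        entries.append({"kind": "class", "name": node.name,
                        "qualified_name": f"{module}.{node.name}", "params": []})
        for item in node.body:
            if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                entries.append({
                    "kind": "method",
                    "name": item.name,
                    "qualified_name": f"{module}.{node.name}.{item.name}",
                    "params": [arg.arg for arg in item.args.args],
                })
    return entries


sample = '''
class Stack:
    def __init__(self, items=None):
        self.items = items or []

    def push(self, value):
        self.items.append(value)
'''
for entry in analyze_file_structure(sample, module="demo"):
    print(entry["kind"], entry["qualified_name"], entry["params"])
```
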

Method & Class Analysis Tools

extract_complete_method
Extract full method code and automatically analyze its relationships to other methods, classes, and variables.
find_class_constructor
Locate and extract the __init__ method (constructor) of a class with its full implementation.
list_class_attributes
List all instance attributes and class variables of a class, including types and values when available.

Variable & Import Analysis Tools

find_variable_usage
Find all usages of a variable in a specific file, with line numbers and surrounding context.
find_all_variables_named
Search the entire project for variables with a specific name, returning file paths and qualified names.
show_file_imports
Extract all import statements from a Python file to reveal its external dependencies.
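
Import extraction can be sketched with the standard `ast` module. This is one plausible approach under stated assumptions, not the tool's actual code: the function takes source text rather than a file path, and it also reports imports nested inside functions.

```python
import ast


def show_file_imports(source: str) -> list[str]:
    """Return every import statement in the source, ordered by line number."""
    nodes = [
        node for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]
    # ast.unparse (Python 3.9+) renders each node back into source form.
    return [ast.unparse(node) for node in sorted(nodes, key=lambda n: n.lineno)]


sample = (
    "import os\n"
    "from collections import OrderedDict as OD\n"
    "\n"
    "def load():\n"
    "    import json\n"
    "    return json\n"
)
print(show_file_imports(sample))
```
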

Content Search Tools

search_code_with_context
Search Python files for a keyword and return matches with 3 lines of context before and after.
find_files_containing
Find files whose content or filename contains a specific keyword.
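
The context window used by search_code_with_context can be sketched as follows. This simplified version works on a single string instead of walking the project's Python files, and the result fields (`line`, `start_line`, `snippet`) are hypothetical.

```python
def search_code_with_context(text: str, keyword: str, context: int = 3) -> list[dict]:
    """Return each line containing keyword, with `context` lines before and after."""
    lines = text.splitlines()
    matches = []
    for idx, line in enumerate(lines):
        if keyword in line:
            start = max(0, idx - context)               # clamp at the start of the file
            end = min(len(lines), idx + context + 1)    # clamp at the end of the file
            matches.append({
                "line": idx + 1,            # 1-indexed line of the match
                "start_line": start + 1,    # 1-indexed first line of the snippet
                "snippet": lines[start:end],
            })
    return matches


sample = "\n".join(f"line{i}" for i in range(10))
for match in search_code_with_context(sample, "line5"):
    print(match["line"], match["snippet"])
```
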

File System & Editing Tools

explore_directory
List files and subdirectories in a given path to understand project structure.
read_file_lines
Read a specific range of lines from a file (up to 50 lines) with line numbers.
edit_file_by_lineno
Replace or delete content between specified line numbers.
edit_file_by_content
Replace content using regex pattern matching, with safety checks for multiple matches.
insert
Insert new content before a specified line number.
create_file
Create a new file with given content at a relative path.
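
The "safety checks for multiple matches" in edit_file_by_content could look roughly like the sketch below, which refuses to edit unless the pattern matches exactly once. The function name, the in-memory string interface, and the literal (non-backreference) replacement are assumptions made for illustration.

```python
import re


def edit_by_content(text: str, pattern: str, replacement: str) -> str:
    """Replace the single regex match in text; refuse zero or multiple matches."""
    matches = list(re.finditer(pattern, text))
    if len(matches) != 1:
        raise ValueError(f"expected exactly 1 match for {pattern!r}, found {len(matches)}")
    match = matches[0]
    # Splice manually so the replacement is inserted literally.
    return text[:match.start()] + replacement + text[match.end():]


source = "a = 1\nb = 2\n"
print(edit_by_content(source, r"b = \d+", "b = 3"))
```

An ambiguous pattern fails loudly instead of silently rewriting the wrong occurrence, which matters when an agent edits files without human review.
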

Helper & Thinking Tools

create_tool
Create a runtime Python helper function for reusable logic (pure Python only).
sequential_thinking
Break down complex problems into sequential reasoning steps, supporting revisions and branches.

Complete System Pipeline

ISEA Multi-Agent Issue Fixing Pipeline

Inputs:

  • Repository Input: SWE-bench project
  • Code Index / Neo4j Graph (Optional): static code index by default, with optional Neo4j knowledge graph when enabled
  • Problem Statement: bug description input

Agents:

  • Localizer Agent: identifies ≤5 locations
  • Suggester Agent: receives locations and proposes strategies
  • Fixer Agent: patch implementation

Patch generation modes:

  • Default (Single Patch): 1 patch per round → direct output → final solution (verification only)
  • Optional (Multi-Candidate): multi-temperature generation → 40 candidates → 4-level filtering (starting with regression tests, then reproduction tests) → optimal patch (best of 40)

Complete Workflow

4-Phase Pipeline

Phase 1: Repository Preprocessing

Code Index & Optional Neo4j Graph: Parse and index the repository into an internal code index. When Neo4j is configured, the same information is also stored in a knowledge graph for richer global queries.

Phase 2: Issue Location Analysis

Input: Problem Statement from SWE-bench

Localizer Agent: Analyzes the problem description, navigates the code index (or the knowledge graph, when enabled), and identifies up to 5 suspicious issue locations

Suggester Agent: Receives identified locations, collects contextual information, proposes coordinated repair strategies

Output: Issue locations + comprehensive repair suggestions

Phase 3: Patch Generation

Fixer Agent: Implements coordinated patches for identified locations

Generation Strategy:

  • Default: Generates 1 high-precision patch per round using deterministic sampling (T=0.0)
  • Optional Multi-Candidate: Can be configured to generate 40 total variants (10 per round) with diverse temperature settings for difficult issues
  • Multi-location coordination for interconnected fixes
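
The two generation modes can be sketched as a temperature schedule. The split of each 10-candidate round into one deterministic call (T=0.0) and nine diverse calls (T=0.8) is an assumption drawn from the temperatures listed elsewhere on this page, and `fake_llm` is a stand-in for a real model call.

```python
from typing import Callable

ROUNDS = 4                        # 4 rounds x 10 candidates = 40 candidates
PRECISE_T, CREATIVE_T = 0.0, 0.8


def temperature_schedule(rounds: int = ROUNDS, creative_per_round: int = 9) -> list[float]:
    """One precise call plus several creative calls for every round."""
    schedule = []
    for _ in range(rounds):
        schedule.append(PRECISE_T)
        schedule.extend([CREATIVE_T] * creative_per_round)
    return schedule


def generate_candidates(prompt: str, llm: Callable[[str, float], str],
                        multi_candidate: bool = False) -> list[str]:
    """Default mode yields a single patch for the round; multi-candidate yields 40."""
    temperatures = temperature_schedule() if multi_candidate else [PRECISE_T]
    return [llm(prompt, t) for t in temperatures]


def fake_llm(prompt: str, temperature: float) -> str:
    """Hypothetical model stub that only echoes the temperature it was given."""
    return f"patch@T={temperature}"


print(len(generate_candidates("fix bug", fake_llm, multi_candidate=True)))
print(generate_candidates("fix bug", fake_llm))
```
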

Phase 4: Verification & Selection

Verification Strategy:

  • Single Patch Mode: Direct verification against reproduction script and regression tests.
  • Multi-Candidate Mode (Filtering Hierarchy):
    1. Regression Test Pass Rate: Select patches with maximum passing tests
    2. Reproduction Test Pass Rate: Prioritize patches that pass original reproduction tests
    3. Normalized Patch Diversity: Choose most frequent normalized patterns
    4. Patch Size Optimization: Prefer patches with larger meaningful changes

Intelligent State Management

Dynamic Routing

Conditional edges route between agents based on current state and message content, enabling adaptive workflow management.
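
A routing function of this kind might look like the sketch below: a plain function that inspects the state and names the next node, which is the sort of callable a state-graph library typically accepts for conditional edges. The node names, the dictionary layout, and the order of the checks are assumptions, not SGAgent's actual routing logic.

```python
def route(state: dict) -> str:
    """Pick the next node from the current state (a hypothetical routing rule)."""
    messages = state.get("messages", [])
    last = messages[-1] if messages else {}
    if last.get("tool_calls"):                        # the agent asked for a tool
        return "tools"
    if len(messages) > state.get("summary_threshold", 16):
        return "summarizer"                           # conversation grew too long
    if not state.get("issue_locations"):
        return "localizer"
    if not state.get("repair_suggestions"):
        return "suggester"
    if not state.get("generated_patches"):
        return "fixer"
    return "end"


print(route({"messages": []}))
print(route({"messages": [{"tool_calls": [{"name": "read_file_lines"}]}]}))
```
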

Context Summarization

Automatic conversation summarization when message count exceeds thresholds, maintaining essential context while preventing token overflow.
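
Threshold-triggered summarization can be sketched as follows. The threshold of 16 messages comes from the Model Configuration section; keeping the four most recent messages verbatim (`keep_recent`) and the `summarize` callback are assumptions made for illustration.

```python
from typing import Callable

CONTEXT_THRESHOLD = 16  # message count that triggers summarization


def maybe_summarize(messages: list[str], summarize: Callable[[list[str]], str],
                    threshold: int = CONTEXT_THRESHOLD, keep_recent: int = 4) -> list[str]:
    """Replace older messages with one summary once the threshold is exceeded."""
    if len(messages) <= threshold:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [f"[summary] {summarize(older)}"] + recent


history = [f"m{i}" for i in range(20)]
compact = maybe_summarize(history, lambda older: f"{len(older)} earlier messages")
print(compact)
```

In a real system the `summarize` callback would call a model; here it only counts the messages it would have condensed.
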

Error Recovery

Robust error handling with JSON parsing fallbacks and tool execution error management.
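
JSON parsing fallbacks are commonly layered from strict to lenient. The sketch below tries the raw response, then a fenced block, then the outermost pair of braces; this particular ordering and the function name are assumptions, not a description of SGAgent's parser.

```python
import json
import re

FENCE = "`" * 3  # built indirectly so the example can itself sit inside a fenced block


def parse_json_with_fallbacks(response: str):
    """Return the first JSON value that parses, or None if every attempt fails."""
    attempts = [response]
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, response, re.DOTALL)
    if fenced:
        attempts.append(fenced.group(1))
    start, end = response.find("{"), response.rfind("}")
    if 0 <= start < end:
        attempts.append(response[start:end + 1])
    for text in attempts:
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue
    return None


print(parse_json_with_fallbacks('The result is {"ok": true} as requested.'))
```
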

API Statistics

Comprehensive tracking of API calls, token usage, and performance metrics for optimization and analysis.
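
A tracker for these statistics might be as small as the sketch below. The class and field names are hypothetical; the quantities tracked (call count, prompt, completion, and total tokens, JSON export) mirror the ones this page lists.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ApiStats:
    calls: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0
    records: list = field(default_factory=list)

    def record(self, prompt_tokens: int, completion_tokens: int, seconds: float) -> None:
        """Accumulate totals and keep a per-call record."""
        self.calls += 1
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        self.records.append({"prompt_tokens": prompt_tokens,
                             "completion_tokens": completion_tokens,
                             "seconds": seconds})

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    def to_json(self) -> str:
        return json.dumps({"calls": self.calls,
                           "prompt_tokens": self.prompt_tokens,
                           "completion_tokens": self.completion_tokens,
                           "total_tokens": self.total_tokens}, indent=2)


stats = ApiStats()
stats.record(prompt_tokens=1200, completion_tokens=300, seconds=2.5)
stats.record(prompt_tokens=800, completion_tokens=200, seconds=1.0)
print(stats.to_json())
```
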

Technical Implementation

Knowledge Graph Schema

# Neo4j Node Types and Relationships

Nodes: Class, Method, Variable, Test

Relationships:
  • BELONGS_TO: Method/Variable → Class
  • CALLS: Method → Method
  • HAS_METHOD: Class ↔ Method
  • HAS_VARIABLE: Class ↔ Variable
  • INHERITS: Class → Class
  • REFERENCES: Method → Variable/Class
  • TESTED: Method → Test
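
To make the schema concrete, the sketch below models the same relationship types in memory and answers two typical impact-analysis questions: what a method calls and who calls it. It stands in for the Neo4j backend purely for illustration; the `CodeGraph` class and the entity names are hypothetical.

```python
from collections import defaultdict

RELATIONSHIPS = {"BELONGS_TO", "CALLS", "HAS_METHOD", "HAS_VARIABLE",
                 "INHERITS", "REFERENCES", "TESTED"}


class CodeGraph:
    def __init__(self):
        self.edges = defaultdict(set)  # (relationship, source) -> set of targets

    def add(self, source: str, relationship: str, target: str) -> None:
        if relationship not in RELATIONSHIPS:
            raise ValueError(f"unknown relationship: {relationship}")
        self.edges[(relationship, source)].add(target)

    def targets(self, source: str, relationship: str) -> list[str]:
        """Follow edges forward, e.g. the methods that `source` calls."""
        return sorted(self.edges.get((relationship, source), set()))

    def sources(self, relationship: str, target: str) -> list[str]:
        """Follow edges backward, e.g. the callers of `target`."""
        return sorted(src for (rel, src), found in self.edges.items()
                      if rel == relationship and target in found)


graph = CodeGraph()
graph.add("Parser.parse", "CALLS", "Lexer.tokenize")
graph.add("Repl.run", "CALLS", "Lexer.tokenize")
graph.add("Parser.parse", "BELONGS_TO", "Parser")
print(graph.targets("Parser.parse", "CALLS"))
print(graph.sources("CALLS", "Lexer.tokenize"))
```
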

Interactive Neo4j Knowledge Graph Visualization

Interactive Neo4j knowledge graph: drag nodes to reposition, click nodes to highlight connections, hover for details.
Shows real relationships: Classes (pink), Methods (blue), Variables (orange) with CALLS, BELONGS_TO, HAS_METHOD edges

Complete Pipeline Implementation

# ISEA Complete Workflow Pseudocode

## Phase 1: System Initialization
INITIALIZE multi_agent_system
SET precise_llm = LLM(model=CLAUDE_SONNET, temperature=0.0)
SET creative_llm = LLM(model=CLAUDE_SONNET, temperature=0.8)

CREATE Agent_Localizer(tools=NEO4J_TOOLS + FILE_TOOLS)
CREATE Agent_Suggester(tools=NEO4J_TOOLS + FILE_TOOLS)
CREATE Agent_Fixer(tools=NEO4J_TOOLS + FILE_TOOLS)

## Phase 2: Multi-Agent Workflow Execution
INITIALIZE workflow_graph = StateGraph(AgentState)
ADD_NODES(Localizer, Suggester, Fixer, ToolNodes, Summarizer)
ADD_CONDITIONAL_EDGES(routing_logic)
COMPILE workflow_graph

EXECUTE workflow_graph.stream(initial_state)
    → Localizer identifies ≤5 issue locations
    → Suggester analyzes context and proposes repair strategies
    → Fixer generates coordinated patches

## Phase 3: Multi-Variant Patch Generation
FOR EACH issue_location IN identified_locations:
    BUILD context_prompt(location, surrounding_code, imports, suggestions)

    // Generate precise patch variant
    precise_response = precise_llm.INVOKE(context_prompt)
    precise_code = EXTRACT_CODE(precise_response)
    STORE precise_patches[location_id] = precise_code

    // Generate diverse patch variants
    diverse_patches = []
    FOR variant_num = 1 TO 9 :
        variant_response = creative_llm.INVOKE(context_prompt)
        variant_code = EXTRACT_CODE(variant_response)
        diverse_patches.APPEND(variant_code)
    END FOR
    STORE variant_patches[location_id] = diverse_patches
END FOR

## Phase 4: Atomic Multi-File Patch Application
FUNCTION apply_patches_and_generate_diff(patch_collection):
    file_modifications = CREATE_EMPTY_MAP()
    // Group patches by target files
    FOR EACH location, patch_code IN patch_collection:
        target_file = GET_FILE_PATH(location)
        line_range = GET_LINE_RANGE(location)
        file_modifications[target_file].ADD(line_range, patch_code)
    END FOR

    // Apply modifications per file in reverse line order so earlier line numbers stay valid
    original_state = CREATE_EMPTY_MAP()
    FOR EACH file IN file_modifications:
        original_content = READ_FILE(file)
        original_state[file] = original_content
        modified_content = original_content
        modifications = SORT_REVERSE_BY_LINE_NUMBER(file_modifications[file])
        FOR EACH modification IN modifications:
            modified_content = REPLACE_LINES(modified_content, modification.range, modification.code)
        END FOR
        WRITE_FILE(file, modified_content)
    END FOR

    diff_output = EXECUTE_GIT_DIFF(repository_root)
    RESTORE_ORIGINAL_FILES(original_state)
    RETURN diff_output
END FUNCTION

## Phase 5: Comprehensive Results Export
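all_patch_variants = INITIALIZE_COLLECTION()
diff_collection = INITIALIZE_COLLECTION()
all_patch_variants["precise_patches"] = precise_patches
diff_collection["precise"] = apply_patches_and_generate_diff(precise_patches)
FOR variant_index = 1 TO 9:
    variant_set = EXTRACT_VARIANT(variant_patches, variant_index)
    variant_diff = apply_patches_and_generate_diff(variant_set)
    all_patch_variants[f"variant_{variant_index}"] = variant_set
    diff_collection[f"variant_{variant_index}"] = variant_diff
END FOR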

final_results = {
    "patch_variants": all_patch_variants,
    "git_diffs": diff_collection,
    "metadata": execution_statistics
}
EXPORT_JSON(final_results, output_directory)
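
The reverse-order application step in the pseudocode above is worth seeing in runnable form, because it is what keeps line numbers valid when one file receives several edits. This is a minimal sketch operating on a string; the `(start_line, end_line, code)` tuple layout is an assumption, with lines 1-indexed and inclusive.

```python
def apply_modifications(content: str, modifications: list[tuple[int, int, str]]) -> str:
    """Apply (start_line, end_line, new_code) replacements from the bottom up."""
    lines = content.splitlines(keepends=True)
    # Editing the highest line range first means earlier ranges never shift.
    for start, end, code in sorted(modifications, key=lambda m: m[0], reverse=True):
        new_lines = code.splitlines(keepends=True)
        if new_lines and not new_lines[-1].endswith("\n"):
            new_lines[-1] += "\n"
        lines[start - 1:end] = new_lines
    return "".join(lines)


original = "a\nb\nc\nd\ne\n"
patched = apply_modifications(original, [(2, 2, "B1\nB2"), (4, 5, "D")])
print(patched)
```

Here the first edit grows line 2 into two lines, yet the second edit still lands on the original lines 4 and 5 because it was applied first.
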

Key Technical Innovations

🔍 CKGRetriever Integration

Custom Neo4j retriever with singleton pattern ensuring efficient database connections and query optimization.
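
The singleton pattern mentioned here can be sketched as below. The class name `CKGRetrieverSketch`, the placeholder URI, and the stand-in connection object are assumptions; a real retriever would hold a database driver instead.

```python
import threading


class CKGRetrieverSketch:
    """Hypothetical singleton: every construction returns the same instance."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls, uri: str = "bolt://localhost:7687"):
        with cls._lock:  # guard against two threads creating instances at once
            if cls._instance is None:
                instance = super().__new__(cls)
                instance.uri = uri
                instance.connection = object()  # stand-in for a real driver or session
                cls._instance = instance
        return cls._instance


first = CKGRetrieverSketch()
second = CKGRetrieverSketch("bolt://other-host:7687")
print(first is second, second.uri)
```

Note that later constructor arguments are ignored once the instance exists, a common trade-off of this pattern that is worth making explicit in real code.
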

🎛️ Dynamic Temperature Control

Two temperature settings (0.0 for a deterministic patch, 0.8 for diverse alternatives) balance precision against candidate diversity during patch generation.

📏 Intelligent Truncation

Smart output truncation preventing token overflow while preserving essential information integrity.
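
One common way to truncate tool output while keeping the most informative parts is to retain the head and the tail and mark what was dropped. The sketch below assumes that strategy; the character budget and the marker text are hypothetical.

```python
def truncate_output(text: str, max_chars: int = 2000, head_ratio: float = 0.5) -> str:
    """Keep the start and end of long output and note how much was removed."""
    if len(text) <= max_chars:
        return text
    head = int(max_chars * head_ratio)
    tail = max_chars - head
    omitted = len(text) - max_chars
    # Slice from an explicit index so a zero-length tail yields an empty string.
    return f"{text[:head]}\n... [{omitted} characters truncated] ...\n{text[len(text) - tail:]}"


long_output = "a" * 50 + "b" * 50
print(truncate_output(long_output, max_chars=20))
```
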

🔧 Process Management

Patch processing that tracks line numbers across edits (changes are applied in reverse line order) and preserves the surrounding code context.

Model Configuration

# LLM System Configuration
PRIMARY_MODEL = ADVANCED_LLM_BACKEND
TEMPERATURE_PRECISE = 0.0 // Deterministic responses
TEMPERATURE_CREATIVE = 0.8 // Diverse solution generation
CONTEXT_THRESHOLD = 16 // Message count for summarization trigger
TOKEN_OPTIMIZATION = ENABLED // Intelligent content compression

# Performance Monitoring System
ENABLE api_statistics_collection()
TRACK prompt_content, response_content
MONITOR token_usage(prompt_tokens, completion_tokens, total_tokens)
LOG execution_timestamps
EXPORT performance_metrics TO json_format
IMPLEMENT real_time_analytics_dashboard()

Core Agent State Definition

# Multi-Agent State Management Schema

DEFINE AgentState EXTENDS MessagesState:
    // Core workflow state
    conversation_history: MessageSequence
    issue_locations: List[LocationDescriptor]
    repair_suggestions: StrategicAnalysis
    generated_patches: PatchCollection

    // Agent coordination flags
    locator_ready: Boolean
    suggester_ready: Boolean
    fixer_ready: Boolean

    // Context management
    conversation_summary: CompressedContext
    current_agent: AgentIdentifier
    next_agent: AgentIdentifier
    execution_metrics: PerformanceCounters

    // Problem context
    problem_statement: ProblemDescription
    project_context: ProjectMetadata
    failed_attempts: List[FailureRecord]
END DEFINE
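
For readers who prefer concrete types, the schema above could be approximated with dataclasses as in the sketch below. The field names follow the schema; the concrete Python types, the defaults, and the omission of the metrics and project-context fields are assumptions made to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Location:
    path: str
    start_line: int
    end_line: int


@dataclass
class AgentState:
    # Problem context
    problem_statement: str
    failed_attempts: list[str] = field(default_factory=list)
    # Core workflow state
    conversation_history: list[str] = field(default_factory=list)
    issue_locations: list[Location] = field(default_factory=list)
    repair_suggestions: list[dict] = field(default_factory=list)
    generated_patches: list[str] = field(default_factory=list)
    # Agent coordination flags
    locator_ready: bool = False
    suggester_ready: bool = False
    fixer_ready: bool = False
    # Context management
    conversation_summary: str = ""
    current_agent: str = "localizer"
    next_agent: Optional[str] = None


state = AgentState(problem_statement="separability_matrix is wrong for nested models")
state.issue_locations.append(Location("astropy/modeling/separable.py", 245, 245))
state.locator_ready = True
state.next_agent = "suggester"
print(state.current_agent, state.next_agent, len(state.issue_locations))
```
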