SGAgent

SWE-bench Lite Leaderboard

Rank	System	Score (%)	Date	Status
1	SGAgent + Claude 4 Sonnet (Ours)	60.67	2025-12-20	Competitive
1	ExpeRepair-v1.0 + Claude 4 Sonnet	60.33	2025-06-25	Active
2	Refact.ai Agent	60.00	2025-04-25	Active
3	SWE-agent + Claude 4 Sonnet	56.67	2025-05-26	Active
4	Isea + Claude 3.5 Sonnet (Ours)	51.33	2025-09-10	Competitive
5	ExpeRepair-v1.0	48.33	2025-06-13	Active
6	SWE-agent + Claude 3.7 Sonnet	48.00	2025-02-26	Active
7	DARS Agent	47.00	2025-02-05	Active

System Overview

SGAgent is an advanced multi-agent issue fixing system that achieves a 60.67% success rate on SWE-bench Lite with Claude 4 Sonnet, ranking #1 among all submissions. The system combines Neo4j knowledge graphs, specialized AI agents, and intelligent patch generation to automatically locate, analyze, and fix software issues.

Multi-Agent Architecture

Knowledge Graph

Code structure & relationships

→

Localizer Agent

Identifies ≤5 issue locations

→

Suggester Agent

Proposes repair strategies

→

Fixer Agent

Generates 1 patch per round (optional multi-candidate)

Core Components

State Management

Sophisticated state graph orchestrating agent interactions and maintaining conversation context throughout the debugging process.

Neo4j Knowledge Graph

Optional graph backend that, when enabled, provides global code structure navigation and relationship analysis.

Advanced Tool Framework

Rich tool suite for code analysis, search, and manipulation, based on static analysis with optional Neo4j integration.

Dynamic Summarization

Intelligent context management that summarizes long conversations to maintain focus and prevent token limit issues.

Localizer / Suggester / Fixer Output Example

Localizer Output

localizer

{
    "locations": [
        {
            "path": "/root/temp_container/astropy__astropy-12907/astropy/modeling/separable.py",
            "start_line": 245,
            "end_line": 245
        }
    ],
    "reasons": [
        "The bug is in the _cstack function at line 245. When the right operand is an ndarray..."
    ]
}

Suggester Output

suggester

{
    "suggestions": [
        {
            "title": "Fix _cstack function to properly handle nested CompoundModels",
            "rationale": [
                "The current _cstack function incorrectly handles nested compound models by setting all values to 1 instead of preserving the actual relationship matrix",
                "When the right operand is an array (from a recursive call), it should properly copy the matrix layout",
                "This fix ensures that nested compound models preserve their separability characteristics"
            ],
            "confidence": 0.8,
            "impact_area": "astropy.modeling.separable._cstack",
            "actions": [
                {
                    "path": "/root/temp_container/astropy__astropy-12907/astropy/modeling/separable.py",
                    "operation": "replace",
                    "start_line": 244,
                    "end_line": 245,
                    "symbol": null,
                    "patch_preview": "        cright = np.zeros((noutp, right.shape[1]))\n        cright[-right.shape[0]:, -right.shape[1]:] = right"
                }
            ],
            "risks": [
                "May affect other separability calculations if the change is too broad"
            ],
            "tests": [
                "Test nested compound models with separability_matrix function",
                "Verify that simple compound models still work correctly",
                "Run the existing test suite for separability module"
            ],
            "references": [
                "astropy.modeling.separable._cstack",
                "astropy.modeling.separable.separability_matrix",
                "astropy.modeling.core.CompoundModel"
            ]
        }
    ]
}

Fixer Output (Diff)

fixer

diff --git a/astropy/modeling/separable.py b/astropy/modeling/separable.py
index a308e2729..45bea3608 100644
--- a/astropy/modeling/separable.py
+++ b/astropy/modeling/separable.py
@@ -242,7 +242,7 @@ def _cstack(left, right):
     cright = _coord_matrix(right, 'right', noutp)
 else:
     cright = np.zeros((noutp, right.shape[1]))
-    cright[-right.shape[0]:, -right.shape[1]:] = 1
+    cright[-right.shape[0]:, -right.shape[1]:] = right
 
 return np.hstack([cleft, cright])

Comprehensive Tool Suite

Code Structure Analysis Tools

analyze_file_structure

Get a complete overview of a Python file: list all classes and methods with names, fully qualified names, and parameters.

get_code_relationships

Discover calls, inheritance, references, and dependencies of a code entity for impact analysis.

find_methods_by_name

Find all methods with a given name across the project, returning implementations and file paths.

Method & Class Analysis Tools

extract_complete_method

Extract full method code and automatically analyze its relationships to other methods, classes, and variables.

find_class_constructor

Locate and extract the __init__ method (constructor) of a class with its full implementation.

list_class_attributes

List all instance attributes and class variables of a class, including types and values when available.

Variable & Import Analysis Tools

find_variable_usage

Find all usages of a variable in a specific file, with line numbers and surrounding context.

find_all_variables_named

Search the entire project for variables with a specific name, returning file paths and qualified names.

show_file_imports

Extract all import statements from a Python file to reveal its external dependencies.

Content Search Tools

search_code_with_context

Search Python files for a keyword and return matches with 3 lines of context before and after.

find_files_containing

Find files whose content or filename contains a specific keyword.

File System & Editing Tools

explore_directory

List files and subdirectories in a given path to understand project structure.

read_file_lines

Read a specific range of lines from a file (up to 50 lines) with line numbers.

edit_file_by_lineno

Replace or delete content between specified line numbers.

edit_file_by_content

Replace content using regex pattern matching, with safety checks for multiple matches.

insert

Insert new content before a specified line number.

create_file

Create a new file with given content at a relative path.

Helper & Thinking Tools

create_tool

Create a runtime Python helper function for reusable logic (pure Python only).

sequential_thinking

Break down complex problems into sequential reasoning steps, supporting revisions and branches.

Complete System Pipeline

ISEA Multi-Agent Issue Fixing Pipeline

Repository Input

SWE-bench Project

→

Code Index / Neo4j Graph (Optional)

Static code index by default, with optional Neo4j knowledge graph when enabled

↓

Problem Statement

Bug Description Input

→

Localizer Agent

Identifies ≤5 Locations

→

Suggester Agent

Receives Locations
+ Proposes Strategies

↓

Fixer Agent

Patch Implementation

↓

Default: Single Patch

1 Patch / Round

Direct Output

↓

Final Solution

Verification Only

Optional: Multi-Candidate

Multi-Temp Gen

40 Candidates

↓

4-Level Filtering

1. Regression Tests

2. Reproduction Tests

↓

Optimal Patch

Best of 40

Complete Workflow

4-Phase Pipeline

Phase 1: Repository Preprocessing

Code Index & Optional Neo4j Graph: Parse and index the repository into an internal code index. When Neo4j is configured, the same information is also stored in a knowledge graph for richer global queries.

Phase 2: Issue Location Analysis

Input: Problem Statement from SWE-bench

Localizer Agent: Analyzes problem description, navigates knowledge graph, identifies up to 5 suspicious issue locations

Suggester Agent: Receives identified locations, collects contextual information, proposes coordinated repair strategies

Output: Issue locations + comprehensive repair suggestions

Phase 3: Patch Generation

Fixer Agent: Implements coordinated patches for identified locations

Generation Strategy:

Default: Generates 1 high-precision patch per round using deterministic sampling (T=0.0)
Optional Multi-Candidate: Can be configured to generate 40 total variants (10 per round) with diverse temperature settings for difficult issues
Multi-location coordination for interconnected fixes

Phase 4: Verification & Selection

Verification Strategy:

Single Patch Mode: Direct verification against reproduction script and regression tests.
Multi-Candidate Mode (Filtering Hierarchy):
1. Regression Test Pass Rate: Select patches with maximum passing tests
2. Reproduction Test Pass Rate: Prioritize patches that pass original reproduction tests
3. Normalized Patch Diversity: Choose most frequent normalized patterns
4. Patch Size Optimization: Prefer patches with larger meaningful changes

Intelligent State Management

Dynamic Routing

Conditional edges route between agents based on current state and message content, enabling adaptive workflow management.

Context Summarization

Automatic conversation summarization when message count exceeds thresholds, maintaining essential context while preventing token overflow.

Error Recovery

Robust error handling with JSON parsing fallbacks and tool execution error management.

API Statistics

Comprehensive tracking of API calls, token usage, and performance metrics for optimization and analysis.

Technical Implementation

Knowledge Graph Schema

# Neo4j Node Types and Relationships

Nodes: Class, Method, Variable, Test

Relationships:

  • BELONGS_TO: Method/Variable → Class

  • CALLS: Method → Method

  • HAS_METHOD: Class ↔ Method

  • HAS_VARIABLE: Class ↔ Variable

  • INHERITS: Class → Class

  • REFERENCES: Method → Variable/Class

  • TESTED: Method → Test

Interactive Neo4j Knowledge Graph Visualization

Interactive Neo4j knowledge graph: Drag nodes to reposition • Click nodes to highlight connections • Hover for details
Shows real relationships: Classes (pink), Methods (blue), Variables (orange) with CALLS, BELONGS_TO, HAS_METHOD edges

Complete Pipeline Implementation

# ISEA Complete Workflow Pseudocode

## Phase 1: System Initialization

INITIALIZE multi_agent_system

SET precise_llm = LLM(model=CLAUDE_SONNET, temperature=0.0)

SET creative_llm = LLM(model=CLAUDE_SONNET, temperature=0.8)

CREATE Agent_Localizer(tools=NEO4J_TOOLS + FILE_TOOLS)

CREATE Agent_Suggester(tools=NEO4J_TOOLS + FILE_TOOLS)

CREATE Agent_Fixer(tools=NEO4J_TOOLS + FILE_TOOLS)

## Phase 2: Multi-Agent Workflow Execution

INITIALIZE workflow_graph = StateGraph(AgentState)

ADD_NODES(Localizer, Suggester, Fixer, ToolNodes, Summarizer)

ADD_CONDITIONAL_EDGES(routing_logic)

COMPILE workflow_graph

EXECUTE workflow_graph.stream(initial_state)

    → Localizer identifies ≤5 issue locations

    → Suggester analyzes context and proposes repair strategies

    → Fixer generates coordinated patches

## Phase 3: Multi-Variant Patch Generation

FOR EACH issue_location IN identified_locations:

    BUILD context_prompt(location, surrounding_code, imports, suggestions)

    // Generate precise patch variant

    precise_patch = precise_llm.INVOKE(context_prompt)

    EXTRACT code_block FROM precise_patch.response

    STORE precise_patch[location_id] = extracted_code

    // Generate diverse patch variants

    diverse_patches = []

    FOR variant_num = 1 TO 9 :

        variant_response = creative_llm.INVOKE(context_prompt)

        variant_code = EXTRACT_CODE(variant_response)

        diverse_patches.APPEND(variant_code)

    END FOR

    STORE variant_patches[location_id] = diverse_patches

END FOR

## Phase 4: Atomic Multi-File Patch Application

FUNCTION apply_patches_and_generate_diff(patch_collection):

    file_modifications = CREATE_EMPTY_MAP()

    // Group patches by target files

    FOR EACH location, patch_code IN patch_collection:

        target_file = GET_FILE_PATH(location)

        line_range = GET_LINE_RANGE(location)

        file_modifications[target_file].ADD(line_range, patch_code)

    END FOR

    // Apply modifications atomically (reverse order)

    FOR EACH file IN file_modifications:

        original_content = READ_FILE(file)

        modifications = SORT_REVERSE_BY_LINE_NUMBER(file_modifications[file])

        FOR EACH modification IN modifications:

            REPLACE_LINES(original_content, modification.range, modification.code)

        END FOR

        WRITE_FILE(file, modified_content)

    END FOR

    diff_output = EXECUTE_GIT_DIFF(repository_root)

    RESTORE_ORIGINAL_FILES(original_state)

    RETURN diff_output

END FUNCTION

## Phase 5: Comprehensive Results Export

all_patch_variants = INITIALIZE_COLLECTION()

all_patch_variants["precise_patches"] = precise_patches

FOR variant_index = 1 TO 9:

    variant_set = EXTRACT_VARIANT(diverse_patches, variant_index)

    variant_diff = apply_patches_and_generate_diff(variant_set)

    all_patch_variants[f"variant_{variant_index}"] = variant_diff

END FOR

final_results = {

    "patch_variants": all_patch_variants,

    "git_diffs": diff_collection,

    "metadata": execution_statistics

}

EXPORT_JSON(final_results, output_directory)

Key Technical Innovations

🔍 CKGRetriever Integration

Custom Neo4j retriever with singleton pattern ensuring efficient database connections and query optimization.

🎛️ Dynamic Temperature Control

Variable temperature settings (0.0 for precision, 0.8 for creativity) optimizing patch generation diversity.

📏 Intelligent Truncation

Smart output truncation preventing token overflow while preserving essential information integrity.

🔧 Process Management

Sophisticated patch processing with line number management and context preservation.

Model Configuration

# LLM System Configuration

PRIMARY_MODEL = ADVANCED_LLM_BACKEND

TEMPERATURE_PRECISE = 0.0  // Deterministic responses

TEMPERATURE_CREATIVE = 0.8  // Diverse solution generation

CONTEXT_THRESHOLD = 16  // Message count for summarization trigger

TOKEN_OPTIMIZATION = ENABLED  // Intelligent content compression

# Performance Monitoring System

ENABLE api_statistics_collection()

TRACK prompt_content, response_content

MONITOR token_usage(prompt_tokens, completion_tokens, total_tokens)

LOG execution_timestamps

EXPORT performance_metrics TO json_format

IMPLEMENT real_time_analytics_dashboard()

Core Agent State Definition

# Multi-Agent State Management Schema

DEFINE AgentState EXTENDS MessagesState:

    // Core workflow state

    conversation_history: MessageSequence

    issue_locations: List[LocationDescriptor]

    repair_suggestions: StrategicAnalysis

    generated_patches: PatchCollection

    // Agent coordination flags

    locator_ready: Boolean

    suggester_ready: Boolean

    fixer_ready: Boolean

    // Context management

    conversation_summary: CompressedContext

    current_agent: AgentIdentifier

    next_agent: AgentIdentifier

    execution_metrics: PerformanceCounters

    // Problem context

    problem_statement: ProblemDescription

    project_context: ProjectMetadata

    failed_attempts: List[FailureRecord]

END DEFINE