SWE-bench Lite Leaderboard š
Rank | System | Score (%) | Date | Status |
---|---|---|---|---|
š„ | ExpeRepair-v1.0 + Claude 4 Sonnet | 60.33 | 2025-06-25 | ā Active |
š„ | Refact.ai Agent | 60.00 | 2025-04-25 | ā Active |
š„ | SWE-agent + Claude 4 Sonnet | 56.67 | 2025-05-26 | ā Active |
4 | ISEA + Claude 3-5 Sonnet (Ours) | 51.33 | 2025-09-10 | š Competitive |
5 | ExpeRepair-v1.0 | 48.33 | 2025-06-13 | ā Active |
6 | SWE-agent + Claude 3.7 Sonnet | 48.00 | 2025-02-26 | ā Active |
7 | DARS Agent | 47.00 | 2025-02-05 | ā Active |
š System Overview
ISEA is an advanced multi-agent issue fixing system that achieves a 51.33% success rate on SWE-bench Lite, ranking #4 among all submissions. The system combines Neo4j knowledge graphs, specialized AI agents, and intelligent patch generation to automatically locate, analyze, and fix software issues.
šļø Multi-Agent Architecture
Neo4j Knowledge Graph
Code structure & relationships
Locator Agent
Identifies ā¤5 issue locations
Suggester Agent
Proposes repair strategies
Fixer Agent
Generates 40 patch variants
Core Components
š§ LangGraph State Management
Sophisticated state graph orchestrating agent interactions and maintaining conversation context throughout the debugging process.
š Neo4j Knowledge Graph
Comprehensive code structure database enabling intelligent navigation and relationship analysis across the codebase.
š ļø Advanced Tool Framework
14 specialized tools for code analysis, search, and manipulation, powered by custom Neo4j queries and file operations.
š Dynamic Summarization
Intelligent context management that summarizes long conversations to maintain focus and prevent token limit issues.
š ļø Comprehensive Tool Suite
š Core Neo4j Knowledge Graph Tools (9 tools)
š File System & Analysis Tools (3 tools)
Key Performance Metrics
š Complete System Pipeline
Repository Input
SWE-bench Project
Neo4j Knowledge Graph
Code Structure Indexing
Problem Statement
Bug Description Input
Locator Agent
Identifies ā¤5 Locations
Suggester Agent
Receives Locations
+ Proposes Strategies
Fixer Agent
Patch Implementation
Multi-Temperature Generation
T=0.0: 1 precise patch
T=0.8: 9 diverse patches
40 Patch Candidates
Multi-variant Pool
4-Level Filtering
Optimal Patch
Final Solution
āļø Complete Workflow
š 4-Phase Pipeline
Phase 1: Repository Preprocessing
Neo4j Knowledge Graph Construction: Parse and index the entire repository into a comprehensive knowledge graph, capturing classes, methods, variables, and their relationships (inheritance, calls, references, etc.)
Phase 2: Issue Location Analysis
Input: Problem Statement from SWE-bench
Locator Agent: Analyzes problem description, navigates knowledge graph, identifies up to 5 suspicious issue locations
Suggester Agent: Receives identified locations, collects contextual information, proposes coordinated repair strategies
Output: Issue locations + comprehensive repair suggestions
Phase 3: Multi-Round Patch Generation
Fixer Agent: Implements coordinated patches for identified locations
Generation Strategy:
- 4 rounds Ć 10 patches = 40 total variants
- Each round: 1 precise patch (T=0.0) + 9 diverse patches (T=0.8)
- Multi-location coordination for interconnected fixes
Phase 4: Intelligent Patch Selection
4-Level Filtering Hierarchy:
- Regression Test Pass Rate: Select patches with maximum passing tests
- Reproduction Test Pass Rate: Prioritize patches that pass original reproduction tests
- Normalized Patch Diversity: Choose most frequent normalized patterns
- Patch Size Optimization: Prefer patches with larger meaningful changes
Intelligent State Management
š Dynamic Routing
Conditional edges route between agents based on current state and message content, enabling adaptive workflow management.
š Context Summarization
Automatic conversation summarization when message count exceeds thresholds, maintaining essential context while preventing token overflow.
š”ļø Error Recovery
Robust error handling with JSON parsing fallbacks and tool execution error management.
š API Statistics
Comprehensive tracking of API calls, token usage, and performance metrics for optimization and analysis.
š» Technical Implementation
Knowledge Graph Schema
Nodes: Class, Method, Variable, Test
Relationships:
⢠BELONGS_TO: Method/Variable ā Class
⢠CALLS: Method ā Method
⢠HAS_METHOD: Class ā Method
⢠HAS_VARIABLE: Class ā Variable
⢠INHERITS: Class ā Class
⢠REFERENCES: Method ā Variable/Class
⢠TESTED: Method ā Test
Interactive Neo4j Knowledge Graph Visualization
Interactive Neo4j knowledge graph: Drag nodes to reposition ⢠Click nodes to highlight connections ⢠Hover for details
Shows real relationships: Classes (pink), Methods (blue), Variables (orange) with CALLS, BELONGS_TO, HAS_METHOD edges
Complete Pipeline Implementation
## Phase 1: System Initialization
INITIALIZE multi_agent_system
SET precise_llm = LLM(model=CLAUDE_SONNET, temperature=0.0)
SET creative_llm = LLM(model=CLAUDE_SONNET, temperature=0.8)
CREATE Agent_Locator(tools=NEO4J_TOOLS + FILE_TOOLS)
CREATE Agent_Suggester(tools=NEO4J_TOOLS + FILE_TOOLS)
CREATE Agent_Fixer(tools=NEO4J_TOOLS + FILE_TOOLS)
## Phase 2: Multi-Agent Workflow Execution
INITIALIZE workflow_graph = StateGraph(AgentState)
ADD_NODES(Locator, Suggester, Fixer, ToolNodes, Summarizer)
ADD_CONDITIONAL_EDGES(routing_logic)
COMPILE workflow_graph
EXECUTE workflow_graph.stream(initial_state)
ā Locator identifies ā¤5 issue locations
ā Suggester analyzes context and proposes repair strategies
ā Fixer generates coordinated patches
## Phase 3: Multi-Variant Patch Generation
FOR EACH issue_location IN identified_locations:
BUILD context_prompt(location, surrounding_code, imports, suggestions)
// Generate precise patch variant
precise_patch = precise_llm.INVOKE(context_prompt)
EXTRACT code_block FROM precise_patch.response
STORE precise_patch[location_id] = extracted_code
// Generate diverse patch variants
diverse_patches = []
FOR variant_num = 1 TO 8:
variant_response = creative_llm.INVOKE(context_prompt)
variant_code = EXTRACT_CODE(variant_response)
diverse_patches.APPEND(variant_code)
END FOR
STORE variant_patches[location_id] = diverse_patches
END FOR
## Phase 4: Atomic Multi-File Patch Application
FUNCTION apply_patches_and_generate_diff(patch_collection):
file_modifications = CREATE_EMPTY_MAP()
// Group patches by target files
FOR EACH location, patch_code IN patch_collection:
target_file = GET_FILE_PATH(location)
line_range = GET_LINE_RANGE(location)
file_modifications[target_file].ADD(line_range, patch_code)
END FOR
// Apply modifications atomically (reverse order)
FOR EACH file IN file_modifications:
original_content = READ_FILE(file)
modifications = SORT_REVERSE_BY_LINE_NUMBER(file_modifications[file])
FOR EACH modification IN modifications:
REPLACE_LINES(original_content, modification.range, modification.code)
END FOR
WRITE_FILE(file, modified_content)
END FOR
diff_output = EXECUTE_GIT_DIFF(repository_root)
RESTORE_ORIGINAL_FILES(original_state)
RETURN diff_output
END FUNCTION
## Phase 5: Comprehensive Results Export
all_patch_variants = INITIALIZE_COLLECTION()
all_patch_variants["precise_patches"] = precise_patches
FOR variant_index = 1 TO 8:
variant_set = EXTRACT_VARIANT(diverse_patches, variant_index)
variant_diff = apply_patches_and_generate_diff(variant_set)
all_patch_variants[f"variant_{variant_index}"] = variant_diff
END FOR
final_results = {
"patch_variants": all_patch_variants,
"git_diffs": diff_collection,
"metadata": execution_statistics
}
EXPORT_JSON(final_results, output_directory)
Key Technical Innovations
š CKGRetriever Integration
Custom Neo4j retriever with singleton pattern ensuring efficient database connections and query optimization.
šļø Dynamic Temperature Control
Variable temperature settings (0.0 for precision, 0.8 for creativity) optimizing patch generation diversity.
š Intelligent Truncation
Smart output truncation preventing token overflow while preserving essential information integrity.
š§ Process Management
Sophisticated patch processing with line number management and context preservation.
Model Configuration
PRIMARY_MODEL = ADVANCED_LLM_BACKEND
TEMPERATURE_PRECISE = 0.0 // Deterministic responses
TEMPERATURE_CREATIVE = 0.8 // Diverse solution generation
CONTEXT_THRESHOLD = 16 // Message count for summarization trigger
TOKEN_OPTIMIZATION = ENABLED // Intelligent content compression
# Performance Monitoring System
ENABLE api_statistics_collection()
TRACK prompt_content, response_content
MONITOR token_usage(prompt_tokens, completion_tokens, total_tokens)
LOG execution_timestamps
EXPORT performance_metrics TO json_format
IMPLEMENT real_time_analytics_dashboard()
Core Agent State Definition
DEFINE AgentState EXTENDS MessagesState:
// Core workflow state
conversation_history: MessageSequence
issue_locations: List[LocationDescriptor]
repair_suggestions: StrategicAnalysis
generated_patches: PatchCollection
// Agent coordination flags
locator_ready: Boolean
suggester_ready: Boolean
fixer_ready: Boolean
// Context management
conversation_summary: CompressedContext
current_agent: AgentIdentifier
next_agent: AgentIdentifier
execution_metrics: PerformanceCounters
// Problem context
problem_statement: ProblemDescription
project_context: ProjectMetadata
failed_attempts: List[FailureRecord]
END DEFINE
š Performance Results & Analysis
Advanced Multi-Round Patch Generation Strategy
šÆ 40-Variant Strategy
4 Rounds Ć 10 Patches
Each round: 1 precise patch (T=0.0) + 9 diverse patches (T=0.8)
Total: 40 candidate solutions per issue
š§ Temperature-Based Diversity
Precision vs Creativity Balance
T=0.0: Deterministic, focused solutions
T=0.8: Creative, diverse approaches
š 4-Level Selection Hierarchy
Intelligent Filtering Process
1. Regression test pass rate
2. Reproduction test validation
3. Normalized pattern frequency
4. Patch size optimization
ā” Multi-Location Coordination
Interconnected Fixes
Up to 5 issue locations
Coordinated patch application
Atomic rollback on failure
šÆ Future Enhancements & Optimization Roadmap
Performance Gap Analysis: 9% gap to current leader (60.33%) presents clear optimization opportunities
Next Steps:
⢠Claude 4 Integration: Upgrading to latest language model for enhanced reasoning
⢠Multimodal Integration: Adding vision capabilities for diagram and UI debugging
⢠Advanced Test Generation: Automatic test case creation for patch validation
⢠Performance Optimization: Enhanced caching and parallel processing