SWE-bench Lite Leaderboard 🆕
| Rank | System | Score (%) | Date | Status |
|---|---|---|---|---|
| 🥇 | ExpeRepair-v1.0 + Claude 4 Sonnet | 60.33 | 2025-06-25 | ✅ Active |
| 🥈 | Refact.ai Agent | 60.00 | 2025-04-25 | ✅ Active |
| 🥉 | SWE-agent + Claude 4 Sonnet | 56.67 | 2025-05-26 | ✅ Active |
| 4 | ISEA + Claude 3.5 Sonnet (Ours) | 51.33 | 2025-09-10 | 🚀 Competitive |
| 5 | ExpeRepair-v1.0 | 48.33 | 2025-06-13 | ✅ Active |
| 6 | SWE-agent + Claude 3.7 Sonnet | 48.00 | 2025-02-26 | ✅ Active |
| 7 | DARS Agent | 47.00 | 2025-02-05 | ✅ Active |
🚀 System Overview
ISEA is a multi-agent issue-fixing system that achieves a 51.33% success rate on SWE-bench Lite, ranking #4 on the leaderboard above. It combines a Neo4j knowledge graph, specialized AI agents, and multi-variant patch generation to locate, analyze, and fix software issues automatically.
🏗️ Multi-Agent Architecture
- Neo4j Knowledge Graph: Code structure & relationships
- Locator Agent: Identifies ≤5 issue locations
- Suggester Agent: Proposes repair strategies
- Fixer Agent: Generates 40 patch variants
Core Components
🧠 LangGraph State Management
Sophisticated state graph orchestrating agent interactions and maintaining conversation context throughout the debugging process.
🔍 Neo4j Knowledge Graph
Comprehensive code structure database enabling intelligent navigation and relationship analysis across the codebase.
🛠️ Advanced Tool Framework
14 specialized tools for code analysis, search, and manipulation, powered by custom Neo4j queries and file operations.
🔄 Dynamic Summarization
Intelligent context management that summarizes long conversations to maintain focus and prevent token limit issues.
🛠️ Comprehensive Tool Suite
🔗 Core Neo4j Knowledge Graph Tools (9 tools)
📁 File System & Analysis Tools (3 tools)
Key Performance Metrics
🔄 Complete System Pipeline
- Repository Input: SWE-bench project
- Neo4j Knowledge Graph: Code structure indexing
- Problem Statement: Bug description input
- Locator Agent: Identifies ≤5 locations
- Suggester Agent: Receives locations and proposes strategies
- Fixer Agent: Patch implementation
- Multi-Temperature Generation (per round): 1 precise patch at T=0.0 and 9 diverse patches at T=0.8
- 40 Patch Candidates: Multi-variant pool
- 4-Level Filtering
- Optimal Patch: Final solution
⚙️ Complete Workflow
📋 4-Phase Pipeline
Phase 1: Repository Preprocessing
Neo4j Knowledge Graph Construction: Parse and index the entire repository into a comprehensive knowledge graph, capturing classes, methods, variables, and their relationships (inheritance, calls, references, etc.)
Phase 2: Issue Location Analysis
Input: Problem Statement from SWE-bench
Locator Agent: Analyzes problem description, navigates knowledge graph, identifies up to 5 suspicious issue locations
Suggester Agent: Receives identified locations, collects contextual information, proposes coordinated repair strategies
Output: Issue locations + comprehensive repair suggestions
Phase 3: Multi-Round Patch Generation
Fixer Agent: Implements coordinated patches for identified locations
Generation Strategy:
- 4 rounds × 10 patches = 40 total variants
- Each round: 1 precise patch (T=0.0) + 9 diverse patches (T=0.8)
- Multi-location coordination for interconnected fixes
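The generation schedule above can be sketched as a simple loop. This is a minimal illustration, not ISEA's code: the `generate` callable is a hypothetical stand-in for the LLM call, and the constant and function names are assumptions chosen for the example.

```python
from typing import Callable, List, Tuple

ROUNDS = 4                # rounds per issue, from the strategy above
DIVERSE_PER_ROUND = 9     # T=0.8 samples per round
T_PRECISE, T_DIVERSE = 0.0, 0.8


def temperature_schedule() -> List[Tuple[int, float]]:
    """Return one (round_index, temperature) pair per patch to generate."""
    schedule: List[Tuple[int, float]] = []
    for round_idx in range(ROUNDS):
        # One deterministic patch first, then the diverse samples.
        schedule.append((round_idx, T_PRECISE))
        schedule.extend((round_idx, T_DIVERSE) for _ in range(DIVERSE_PER_ROUND))
    return schedule


def generate_candidates(generate: Callable[[str, float], str], prompt: str) -> List[str]:
    """Call the (hypothetical) generator once per schedule entry."""
    return [generate(prompt, temperature) for _, temperature in temperature_schedule()]
```

With these constants the schedule has 4 entries at T=0.0 and 36 at T=0.8, which gives the 40 candidates described above.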
Phase 4: Intelligent Patch Selection
4-Level Filtering Hierarchy:
- Regression Test Pass Rate: Select patches with maximum passing tests
- Reproduction Test Pass Rate: Prioritize patches that pass original reproduction tests
- Normalized Patch Diversity: Choose most frequent normalized patterns
- Patch Size Optimization: Prefer patches with larger meaningful changes
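One way to read the four-level hierarchy is as a lexicographic ranking, where each later criterion only breaks ties left by the earlier ones. The sketch below assumes that reading; the `Candidate` type, the whitespace-based `normalize` function, and the changed-line count used as "patch size" are all hypothetical choices, not details confirmed by the source.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    diff: str
    regression_pass_rate: float    # fraction of regression tests that pass
    reproduction_pass_rate: float  # fraction of reproduction tests that pass


def normalize(diff: str) -> str:
    """Assumed normalization: strip whitespace and drop blank lines."""
    return "\n".join(line.strip() for line in diff.splitlines() if line.strip())


def changed_lines(diff: str) -> int:
    """Count added/removed lines, skipping the '+++'/'---' file headers."""
    return sum(
        1
        for line in diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


def select_patch(candidates: List[Candidate]) -> Candidate:
    """Rank by the four levels in order; ties fall through to the next level."""
    frequency = Counter(normalize(c.diff) for c in candidates)
    return max(
        candidates,
        key=lambda c: (
            c.regression_pass_rate,
            c.reproduction_pass_rate,
            frequency[normalize(c.diff)],
            changed_lines(c.diff),
        ),
    )
```

A staged filter that recomputes pattern frequency only among the survivors of levels 1 and 2 would be an equally plausible reading and could pick a different patch in some cases.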
Intelligent State Management
🔄 Dynamic Routing
Conditional edges route between agents based on current state and message content, enabling adaptive workflow management.
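A routing function of this kind might look like the sketch below. The node names, the dictionary-shaped messages, and the precedence of the checks are assumptions made for illustration; the state keys reuse the field names from the agent state definition in this document. In LangGraph, a function like this would typically be registered through `add_conditional_edges`.

```python
from typing import Any, Dict, List


def route(state: Dict[str, Any]) -> str:
    """Pick the next node name from the current state (hypothetical node names)."""
    messages: List[Dict[str, Any]] = state.get("messages", [])

    # A pending tool call in the last message takes priority.
    if messages and messages[-1].get("tool_calls"):
        return "tools"

    # Long conversations are summarized before any agent continues.
    if len(messages) > state.get("context_threshold", 16):
        return "summarizer"

    # Otherwise advance through the Locator -> Suggester -> Fixer sequence.
    if not state.get("issue_locations"):
        return "locator"
    if not state.get("repair_suggestions"):
        return "suggester"
    if not state.get("generated_patches"):
        return "fixer"
    return "end"
```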
📝 Context Summarization
Automatic conversation summarization when message count exceeds thresholds, maintaining essential context while preventing token overflow.
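The trigger can be sketched as follows. The threshold of 16 messages matches the `CONTEXT_THRESHOLD` value in the model configuration; keeping the last few messages verbatim (`KEEP_RECENT`) and the `summarize` callable are assumptions added for the example.

```python
from typing import Callable, List

CONTEXT_THRESHOLD = 16  # message count that triggers summarization
KEEP_RECENT = 4         # assumption: recent messages kept verbatim


def maybe_summarize(
    messages: List[str], summarize: Callable[[List[str]], str]
) -> List[str]:
    """Collapse older messages into one summary once the threshold is exceeded."""
    if len(messages) <= CONTEXT_THRESHOLD:
        return messages
    older, recent = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
    return [f"[summary] {summarize(older)}"] + recent
```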
🛡️ Error Recovery
Robust error handling with JSON parsing fallbacks and tool execution error management.
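A JSON parsing fallback chain could be sketched like this. The three stages (direct parse, fenced block, outermost brace span) are an assumed design; the source only states that fallbacks exist.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_with_fallback(text: str) -> Optional[Any]:
    """Try progressively looser strategies; return None if all of them fail."""
    candidates = [text]

    # Fallback 1: the payload inside a fenced block.
    match = FENCED_BLOCK.search(text)
    if match:
        candidates.append(match.group(1))

    # Fallback 2: the span from the first '{' to the last '}'.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start : end + 1])

    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None
```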
📊 API Statistics
Comprehensive tracking of API calls, token usage, and performance metrics for optimization and analysis.
💻 Technical Implementation
Knowledge Graph Schema
Nodes: Class, Method, Variable, Test
Relationships:
• BELONGS_TO: Method/Variable → Class
• CALLS: Method → Method
• HAS_METHOD: Class ↔ Method
• HAS_VARIABLE: Class ↔ Variable
• INHERITS: Class → Class
• REFERENCES: Method → Variable/Class
• TESTED: Method → Test
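To make the schema concrete, the sketch below encodes the relationship types as a lookup table and checks edges against it using an in-memory stand-in for the graph. It does not use Neo4j, and it assumes that `HAS_METHOD` and `HAS_VARIABLE` are stored in the Class → member direction; the helper names and the sample node names in the usage note are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# relationship type -> (allowed source labels, allowed target labels)
SCHEMA: Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]] = {
    "BELONGS_TO": (frozenset({"Method", "Variable"}), frozenset({"Class"})),
    "CALLS": (frozenset({"Method"}), frozenset({"Method"})),
    "HAS_METHOD": (frozenset({"Class"}), frozenset({"Method"})),      # assumed direction
    "HAS_VARIABLE": (frozenset({"Class"}), frozenset({"Variable"})),  # assumed direction
    "INHERITS": (frozenset({"Class"}), frozenset({"Class"})),
    "REFERENCES": (frozenset({"Method"}), frozenset({"Variable", "Class"})),
    "TESTED": (frozenset({"Method"}), frozenset({"Test"})),
}

Edge = Tuple[str, str, str]  # (source node, relationship type, target node)


def is_valid_edge(labels: Dict[str, str], edge: Edge) -> bool:
    """Check that an edge connects node labels the schema allows."""
    source, relationship, target = edge
    if relationship not in SCHEMA:
        return False
    sources, targets = SCHEMA[relationship]
    return labels.get(source) in sources and labels.get(target) in targets


def callers_of(edges: List[Edge], method: str) -> List[str]:
    """Return the methods with a CALLS edge into the given method."""
    return sorted(s for s, rel, t in edges if rel == "CALLS" and t == method)
```

For example, with nodes `Parser` (Class), `Parser.parse` and `Parser.read` (Method), the edge `("Parser.parse", "CALLS", "Parser.read")` is valid, while `("Parser", "CALLS", "Parser.read")` is rejected because a Class cannot be the source of `CALLS`.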
Interactive Neo4j Knowledge Graph Visualization
Figure: Neo4j knowledge graph with Classes (pink), Methods (blue), and Variables (orange) connected by CALLS, BELONGS_TO, and HAS_METHOD edges.
Complete Pipeline Implementation
## Phase 1: System Initialization
INITIALIZE multi_agent_system
    SET precise_llm = LLM(model=CLAUDE_SONNET, temperature=0.0)
    SET creative_llm = LLM(model=CLAUDE_SONNET, temperature=0.8)
    CREATE Agent_Locator(tools=NEO4J_TOOLS + FILE_TOOLS)
    CREATE Agent_Suggester(tools=NEO4J_TOOLS + FILE_TOOLS)
    CREATE Agent_Fixer(tools=NEO4J_TOOLS + FILE_TOOLS)
## Phase 2: Multi-Agent Workflow Execution
INITIALIZE workflow_graph = StateGraph(AgentState)
    ADD_NODES(Locator, Suggester, Fixer, ToolNodes, Summarizer)
    ADD_CONDITIONAL_EDGES(routing_logic)
    COMPILE workflow_graph
EXECUTE workflow_graph.stream(initial_state)
    → Locator identifies ≤5 issue locations
    → Suggester analyzes context and proposes repair strategies
    → Fixer generates coordinated patches
## Phase 3: Multi-Variant Patch Generation
FOR EACH location IN identified_locations:
    BUILD context_prompt(location, surrounding_code, imports, suggestions)
    // Generate the precise patch variant (T=0.0)
    precise_response = precise_llm.INVOKE(context_prompt)
    EXTRACT extracted_code FROM precise_response
    STORE precise_patches[location.id] = extracted_code
    // Generate the diverse patch variants (T=0.8)
    diverse_patches = []
    FOR variant_num = 1 TO 9:
        variant_response = creative_llm.INVOKE(context_prompt)
        variant_code = EXTRACT_CODE(variant_response)
        diverse_patches.APPEND(variant_code)
    END FOR
    STORE variant_patches[location.id] = diverse_patches
END FOR
## Phase 4: Atomic Multi-File Patch Application
FUNCTION apply_patches_and_generate_diff(patch_collection):
    file_modifications = CREATE_EMPTY_MAP()
    original_state = CREATE_EMPTY_MAP()
    // Group patches by target file
    FOR EACH location, patch_code IN patch_collection:
        target_file = GET_FILE_PATH(location)
        line_range = GET_LINE_RANGE(location)
        file_modifications[target_file].ADD(line_range, patch_code)
    END FOR
    // Apply modifications bottom-up so earlier line numbers stay valid
    FOR EACH file IN file_modifications:
        original_content = READ_FILE(file)
        original_state[file] = original_content
        modified_content = original_content
        modifications = SORT_REVERSE_BY_LINE_NUMBER(file_modifications[file])
        FOR EACH modification IN modifications:
            modified_content = REPLACE_LINES(modified_content, modification.range, modification.code)
        END FOR
        WRITE_FILE(file, modified_content)
    END FOR
    diff_output = EXECUTE_GIT_DIFF(repository_root)
    // Restore the saved originals so the next variant starts from a clean tree
    RESTORE_ORIGINAL_FILES(original_state)
    RETURN diff_output
END FUNCTION
## Phase 5: Comprehensive Results Export
all_patch_variants = INITIALIZE_COLLECTION()
diff_collection = INITIALIZE_COLLECTION()
all_patch_variants["precise_patches"] = precise_patches
FOR variant_index = 1 TO 9:
    // Take the variant_index-th diverse patch of every location
    variant_set = EXTRACT_VARIANT(variant_patches, variant_index)
    variant_diff = apply_patches_and_generate_diff(variant_set)
    all_patch_variants[f"variant_{variant_index}"] = variant_diff
    diff_collection[f"variant_{variant_index}"] = variant_diff
END FOR
final_results = {
    "patch_variants": all_patch_variants,
    "git_diffs": diff_collection,
    "metadata": execution_statistics
}
EXPORT_JSON(final_results, output_directory)
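The reverse-order application in Phase 4 is what keeps line numbers valid: replacing the lowest range first would shift every range below it. The sketch below shows the idea on an in-memory list of lines and uses the standard library's `difflib` in place of `git diff`; the `Modification` tuple layout and the function names are assumptions for the example.

```python
import difflib
from typing import List, Tuple

# (start_line, end_line, replacement_lines); line numbers are 1-indexed, inclusive
Modification = Tuple[int, int, List[str]]


def apply_modifications(lines: List[str], mods: List[Modification]) -> List[str]:
    """Apply non-overlapping range replacements from the bottom of the file up."""
    result = list(lines)  # work on a copy so the original stays available
    for start, end, new_lines in sorted(mods, key=lambda m: m[0], reverse=True):
        result[start - 1 : end] = new_lines
    return result


def unified_diff(path: str, before: List[str], after: List[str]) -> str:
    """Render a unified diff between two versions of one file."""
    return "".join(
        difflib.unified_diff(before, after, fromfile=f"a/{path}", tofile=f"b/{path}")
    )
```

Because the original list is never mutated, "restoring" the file after diffing amounts to discarding the modified copy.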
Key Technical Innovations
🔍 CKGRetriever Integration
Custom Neo4j retriever with singleton pattern ensuring efficient database connections and query optimization.
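The singleton idea can be sketched as below. The class name `CKGRetrieverSketch`, the injected `connect` factory, and the double-checked locking are illustrative assumptions; the source does not describe how CKGRetriever implements the pattern.

```python
import threading
from typing import Any, Callable, Optional


class CKGRetrieverSketch:
    """Hypothetical retriever that shares one database connection per process."""

    _instance: Optional["CKGRetrieverSketch"] = None
    _lock = threading.Lock()

    def __init__(self, connect: Callable[[], Any]) -> None:
        # The factory is called exactly once, when the singleton is created.
        self.connection = connect()

    @classmethod
    def get_instance(cls, connect: Callable[[], Any]) -> "CKGRetrieverSketch":
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock in case another thread won the race.
                if cls._instance is None:
                    cls._instance = cls(connect)
        return cls._instance
```

Every tool that calls `get_instance` then reuses the same connection instead of opening a new one per query.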
🎛️ Dynamic Temperature Control
Variable temperature settings (0.0 for precision, 0.8 for creativity) optimizing patch generation diversity.
📏 Intelligent Truncation
Smart output truncation preventing token overflow while preserving essential information integrity.
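One common way to truncate tool output while keeping useful context is to retain the head and tail and drop the middle. The sketch below assumes that approach, a character budget rather than a token budget, and a hypothetical marker string; the source does not say which strategy ISEA uses.

```python
def truncate_middle(
    text: str, max_chars: int = 2000, marker: str = "\n...[truncated]...\n"
) -> str:
    """Keep the start and end of the text, replacing the middle with a marker."""
    if len(text) <= max_chars:
        return text
    budget = max_chars - len(marker)
    if budget <= 0:
        # The marker alone would exceed the limit, so fall back to a hard cut.
        return text[:max_chars]
    head = (budget + 1) // 2
    tail = budget - head
    return text[:head] + marker + text[len(text) - tail :]
```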
🔧 Process Management
Sophisticated patch processing with line number management and context preservation.
Model Configuration
PRIMARY_MODEL = ADVANCED_LLM_BACKEND
TEMPERATURE_PRECISE = 0.0 // Deterministic responses
TEMPERATURE_CREATIVE = 0.8 // Diverse solution generation
CONTEXT_THRESHOLD = 16 // Message count for summarization trigger
TOKEN_OPTIMIZATION = ENABLED // Intelligent content compression
# Performance Monitoring System
ENABLE api_statistics_collection()
TRACK prompt_content, response_content
MONITOR token_usage(prompt_tokens, completion_tokens, total_tokens)
LOG execution_timestamps
EXPORT performance_metrics TO json_format
IMPLEMENT real_time_analytics_dashboard()
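The token tracking and JSON export steps above could be implemented along these lines. This is a minimal sketch: the `ApiStats` class, its field names, and the export layout are assumptions, and the dashboard step is left out.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class ApiStats:
    """Collects per-call token usage and exports it as JSON."""

    calls: List[Dict[str, Any]] = field(default_factory=list)

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        self.calls.append(
            {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "prompt_tokens": prompt_tokens,
                "completion_tokens": completion_tokens,
                "total_tokens": prompt_tokens + completion_tokens,
            }
        )

    def summary(self) -> Dict[str, int]:
        return {
            "api_calls": len(self.calls),
            "prompt_tokens": sum(c["prompt_tokens"] for c in self.calls),
            "completion_tokens": sum(c["completion_tokens"] for c in self.calls),
            "total_tokens": sum(c["total_tokens"] for c in self.calls),
        }

    def export_json(self) -> str:
        return json.dumps({"summary": self.summary(), "calls": self.calls}, indent=2)
```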
Core Agent State Definition
DEFINE AgentState EXTENDS MessagesState:
    // Core workflow state
    conversation_history: MessageSequence
    issue_locations: List[LocationDescriptor]
    repair_suggestions: StrategicAnalysis
    generated_patches: PatchCollection
    // Agent coordination flags
    locator_ready: Boolean
    suggester_ready: Boolean
    fixer_ready: Boolean
    // Context management
    conversation_summary: CompressedContext
    current_agent: AgentIdentifier
    next_agent: AgentIdentifier
    execution_metrics: PerformanceCounters
    // Problem context
    problem_statement: ProblemDescription
    project_context: ProjectMetadata
    failed_attempts: List[FailureRecord]
END DEFINE
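In Python, a state definition like this might be written as a `TypedDict`. The sketch below uses plain built-in types in place of the named types above (`LocationDescriptor`, `PatchCollection`, and so on), which are not specified in the source, and it does not extend LangGraph's `MessagesState` so that it runs without dependencies; the `initial_state` helper and the `messages` key are hypothetical.

```python
from typing import Any, Dict, List, TypedDict


class AgentState(TypedDict, total=False):
    # Core workflow state
    messages: List[Dict[str, Any]]
    issue_locations: List[Dict[str, Any]]
    repair_suggestions: str
    generated_patches: Dict[str, List[str]]
    # Agent coordination flags
    locator_ready: bool
    suggester_ready: bool
    fixer_ready: bool
    # Context management
    conversation_summary: str
    current_agent: str
    next_agent: str
    execution_metrics: Dict[str, int]
    # Problem context
    problem_statement: str
    project_context: Dict[str, Any]
    failed_attempts: List[Dict[str, Any]]


def initial_state(problem_statement: str) -> AgentState:
    """Build the state handed to the first agent (assumed defaults)."""
    return AgentState(
        messages=[],
        issue_locations=[],
        generated_patches={},
        locator_ready=False,
        suggester_ready=False,
        fixer_ready=False,
        current_agent="locator",
        problem_statement=problem_statement,
        failed_attempts=[],
    )
```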
📊 Performance Results & Analysis
Advanced Multi-Round Patch Generation Strategy
🎯 40-Variant Strategy
4 Rounds × 10 Patches
Each round: 1 precise patch (T=0.0) + 9 diverse patches (T=0.8)
Total: 40 candidate solutions per issue
🔧 Temperature-Based Diversity
Precision vs Creativity Balance
T=0.0: Deterministic, focused solutions
T=0.8: Creative, diverse approaches
📊 4-Level Selection Hierarchy
Intelligent Filtering Process
1. Regression test pass rate
2. Reproduction test validation
3. Normalized pattern frequency
4. Patch size optimization
⚡ Multi-Location Coordination
Interconnected Fixes
Up to 5 issue locations
Coordinated patch application
Atomic rollback on failure
🎯 Future Enhancements & Optimization Roadmap
Performance Gap Analysis: The 9.00 percentage-point gap to the current leader (51.33% vs. 60.33%) presents clear optimization opportunities
Next Steps:
• Claude 4 Integration: Upgrading to latest language model for enhanced reasoning
• Multimodal Integration: Adding vision capabilities for diagram and UI debugging
• Advanced Test Generation: Automatic test case creation for patch validation
• Performance Optimization: Enhanced caching and parallel processing