# joycode-agent

**Repository Path**: JD-opensource/joycode-agent

## Basic Information

- **Project Name**: joycode-agent
- **Description**: Repository-level Repair Agent Based on SWE-Bench (JoyCode Agent)
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 39
- **Forks**: 2
- **Created**: 2025-10-15
- **Last Updated**: 2025-10-24

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# JoyCode SWE-bench Agent Pipeline

[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Docker](https://img.shields.io/badge/docker-required-blue.svg)](https://www.docker.com/)

**JoyCode** is an end-to-end LLM-powered pipeline for fixing real-world open-source software issues. It generates patches, creates and verifies tests, and employs intelligent retry mechanisms to achieve high success rates on the SWE-bench dataset.

**Project Status:** JoyCode has achieved a **74.6% resolution rate** on the SWE-bench Verified split, demonstrating state-of-the-art performance in automated software engineering.

**Key Innovation:** Our pipeline combines patch generation with intelligent test creation and failure attribution, enabling robust automated code repair with comprehensive validation and smart retry mechanisms.

*(Figure: Code Agent)*

## ✨ Features

### 🏆 **High Performance & Cost Efficiency**

- **74.6% Success Rate** on SWE-bench Verified, ranking 2nd globally
- **30-50% Lower Resource Consumption** than top competitors
- Exceptional cost-performance ratio with near state-of-the-art results

### 🔄 **Patch-Test Co-generation**

- **Smart Test Generation**: Automatic Fail2Pass and Pass2Pass test creation with pre-validation
- **Collaborative Verification**: Patches and tests generated together for comprehensive validation
- **Closed-loop Iteration**: A "Generate → Validate → Refine" cycle replacing one-shot approaches

### 🧠 **Intelligent Failure Attribution**

- **Root Cause Analysis**: Precise attribution of failures to patch vs. test issues
- **Targeted Retry Strategy**: Experience-driven retries based on failure analysis
- **CSR-Powered Learning**: Historical success pattern retrieval for optimization

### 🏗️ **Multi-Agent Architecture**

- **Specialized Agents**: Testing, Patch, CSR, and Decision agents with distinct roles
- **ReAct-based Workflow**: An "Observe-Think-Act" loop mimicking human developers
- **Smart Decision Making**: LLM-powered voting for optimal patch selection

### 💡 **Smart Resource Management**

- **Token-Efficient Design**: Targeted LLM calls avoiding wasteful parallel sampling
- **Early Failure Detection**: Pre-validation to filter out invalid paths
- **Quality-First Generation**: Fewer, higher-quality patches instead of massive sampling

### 🐳 **Production-Ready Engineering**

- **Containerized Execution**: Isolated Docker environments with SWE-bench images
- **Repository-Level Understanding**: Multi-file coordination and cross-module reasoning
- **Comprehensive Logging**: Full trajectory recording with optional compression
- **Multi-LLM Support**: Flexible model configuration for different pipeline stages

## 🚀 Installation

### Requirements

- Python 3.11+
- Docker with access to `docker.1ms.run`
- LLM API keys (OpenAI, Anthropic, etc.)
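A quick way to confirm the first two prerequisites before installing; these are standard commands, nothing project-specific:

```bash
# Check the Python interpreter version (3.11 or newer is required)
python --version

# Check that the Docker daemon is running and reachable
docker info
```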
### Setup

```bash
# Clone repository
git clone https://gitee.com/JD-opensource/joycode-agent.git
cd joycode-agent

# Create conda environment
conda create -n joycode python=3.11
conda activate joycode

# Install dependencies
pip install -r requirements.txt
```

## ⚙️ Configuration

### LLM Configuration

Configure your models in `llm_server/model_config.json`:

```json
{
  "patch_generation": {
    "api_key": "your_api_key_here",
    "base_url": "https://api.openai.com/v1",
    "model_name": "gpt-5",
    "max_tokens": 4000,
    "temperature": 1
  }
}
```

### Docker Setup

Ensure Docker is running and that you can access the registry:

```bash
# Test Docker connectivity
docker pull docker.1ms.run/swebench/sweb.eval.x86_64.django__django-11099:latest
```

### Instance Configuration

Specify the instances to process in `instance_id.txt`:

```
django__django-11099
matplotlib__matplotlib-23562
sympy__sympy-18189
...
```

## 📖 Usage

### Quick Start

```bash
# Run with default settings
python run_patch_pipeline.py --num-processes 1 --enable-post-processing
```

### Common Usage Patterns

```bash
# Simple mode (patch only, no tests)
python run_patch_pipeline.py --simple-mode

# Single instance processing
python run_patch_pipeline.py --problem-id django__django-11099 --num-processes 1

# Batch processing with limits
python run_patch_pipeline.py --num-examples 10 --num-processes 4

# Disable test generation
python run_patch_pipeline.py --no-generate-tests --no-validate-with-tests

# Custom configuration
python run_patch_pipeline.py --enable-post-processing --num-processes 2
```

### Command Line Options

| Option | Description | Default |
|--------|-------------|---------|
| `--num-processes` | Number of parallel processes | 1 |
| `--enable-post-processing` | Enable trajectory compression and retries | False |
| `--simple-mode` | Patch generation only | False |
| `--problem-id` | Process a single instance | None |
| `--num-examples` | Limit the number of instances | All |
| `--no-generate-tests` | Skip test generation | False |
| `--no-validate-with-tests` | Skip test validation | False |

## 🛠️ Advanced Features

### Patch Voting System

Compare and select between multiple patch candidates:

```bash
# Prepare voting inputs
python scripts/prepare_voting_data.py

# Run voting
python vote.py
```

**Input Requirements** (a format sketch follows the list):
- `patch_1.json`: Primary patch candidates
- `patch_2.json`: Alternative patch candidates
- `test-00000-of-00001.parquet`: Instance metadata with problem statements
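The exact schema of the patch candidate files is defined by the repository's voting scripts. As a rough sketch only, assuming the common SWE-bench predictions style (the field names below are assumptions, not documented guarantees), an entry in `patch_1.json` might look like:

```json
[
  {
    "instance_id": "django__django-11099",
    "model_name_or_path": "joycode",
    "model_patch": "diff --git a/django/contrib/auth/validators.py b/django/contrib/auth/validators.py\n--- a/django/contrib/auth/validators.py\n+++ b/django/contrib/auth/validators.py\n..."
  }
]
```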
### Pipeline Stages

1. **Container Setup**: Pull and start SWE-bench Docker images
2. **Test Generation** (optional): Create and validate tests on the original code
3. **Agent Execution**: Generate patches using LLM agents via `cli.py`
4. **Validation**: Run tests and evaluate patch quality
5. **Post-processing** (optional): Trajectory compression, similarity matching, intelligent retries

### Output Structure

```
output_files/
├── <instance_id>/
│   ├── predictions.json              # Generated patch and metadata
│   ├── agent_logs.txt                # Main agent execution logs
│   ├── test_generation_result.json   # Test generation results
│   ├── test_generation_logs.txt      # Test generation logs
│   ├── agent_logs_retry.txt          # Retry attempt logs
│   └── compressed_trajectory.txt     # Compressed execution trajectory
├── successful_cases.txt              # Summary of successful instances
├── failed_cases.txt                  # Summary of failed instances
├── empty_diff_cases.txt              # Cases with no generated patches
└── similar_case_matches_summary.json # Similar case analysis
```

## 📊 Performance Results

```
Submission summary for 20250909_JoyCode on SWE-bench verified split
==================================================
Resolved 373 instances (74.6%)
==================================================
Resolved by Repository:
- astropy/astropy: 13/22 (59.09%)
- django/django: 178/231 (77.06%)
- matplotlib/matplotlib: 25/34 (73.53%)
- mwaskom/seaborn: 1/2 (50.0%)
- pallets/flask: 1/1 (100.0%)
- psf/requests: 3/8 (37.5%)
- pydata/xarray: 19/22 (86.36%)
- pylint-dev/pylint: 2/10 (20.0%)
- pytest-dev/pytest: 17/19 (89.47%)
- scikit-learn/scikit-learn: 28/32 (87.5%)
- sphinx-doc/sphinx: 29/44 (65.91%)
- sympy/sympy: 57/75 (76.0%)
```

## 🔧 Development

### Repository Structure

```
joycode-agent/
├── run_patch_pipeline.py                # Main entry point
├── cli.py                               # Core agent CLI
├── test_case_generator/                 # Test generation logic
├── test/                                # Test execution and validation
├── utils/docker_utils.py                # Container management
├── llm_server/                          # LLM integration layer
├── princeton-nlp___swe-bench_verified/  # Local SWE-bench dataset
└── vote.py                              # Patch voting system
```

### Troubleshooting

**Docker Issues:**

```bash
# Check Docker connectivity
docker info
docker pull docker.1ms.run/hello-world
```

**LLM Configuration:**

```bash
# Verify model config
python -c "import json; print(json.load(open('llm_server/model_config.json')))"
```

**Memory Issues:**

```bash
# Reduce parallel processes
python run_patch_pipeline.py --num-processes 1
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [SWE-bench](https://www.swe-bench.com/) for providing the benchmark dataset