mirror of https://github.com/Prowlarr/Indexers.git synced 2026-04-17 07:13:26 +01:00

Files

bakerboy448 dd2bcd1b79 Fixed: Improve indexer-sync robustness and workflow (#662 )

- Add external blocklist configuration (scripts/blocklist.txt)
- Fix conflict detection using proper git commands (diff --filter=U)
- Expand list of unwanted file types (images, code files, etc)
- Remove Jackett src/ files and unwanted assets automatically
- Better handling of both-added (AA) conflicts
- Simplify commit logic (remove retry mechanism)
- Better error handling for empty commits
- Enhanced GitHub Actions workflow with exit code handling
- Auto force-push for automated-indexer-sync branch only
- Manual force-push with -f flag for other branches

Co-authored-by: ServarrAdmin <development@lidarr.audio>

2025-10-18 22:03:30 -05:00

12 KiB

Raw Blame History

Contributing to Prowlarr Indexers

This guide covers how to contribute to the Prowlarr Indexers repository, including validation processes and schema requirements.

Prerequisites

Important

Python 3.11 or higher is required - must be installed and accessible via python3 command

Git
Basic understanding of YAML and JSON Schema

Setup

Fork and clone the repository:

git clone https://github.com/YOUR_USERNAME/Indexers.git
cd Indexers

Set up Python environment:

# Create virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Script Commands

Validation Scripts

Quick Validation

# Validate all indexer definitions (recommended)
./scripts/validate-python.sh

Python Validation Script

# Validate all definitions (auto-detects structure)
python3 scripts/validate.py

# Validate specific directory
python3 scripts/validate.py /path/to/definitions

# Validate single file against schema
python3 scripts/validate.py --single file.yml schema.json

# Show only first error (default shows all errors)
python3 scripts/validate.py --first-error-only

# Find best schema version for a file
python3 scripts/validate.py --find-best-version file.yml

Indexer Sync Script

# Basic sync from Jackett
./scripts/indexer-sync-v2.sh

# With verbose logging
./scripts/indexer-sync-v2.sh -v

# With debug logging
./scripts/indexer-sync-v2.sh -d

# Automation mode (no prompts)
./scripts/indexer-sync-v2.sh -a

# Skip backporting (recommended for automation)
./scripts/indexer-sync-v2.sh -z

# Push changes to remote
./scripts/indexer-sync-v2.sh -p

# Set remote name
./scripts/indexer-sync-v2.sh -r upstream

# Development mode with pauses
./scripts/indexer-sync-v2.sh -m dev

# Combined: automation mode with push
./scripts/indexer-sync-v2.sh -z -a -p

# See all available options
./scripts/indexer-sync-v2.sh --help

Validation Process

The validation script automatically detects directory structure and validates accordingly:

Prowlarr structure: Uses versioned directories (currently v11/) with individual schemas
Jackett structure: Uses flat directory with root schema.json

Schema Versions

Each Cardigann version has its own schema in definitions/v{VERSION}/schema.json. Current active version is:

v11 - Active version with all indexer definitions (522+ indexers) including:
- Predefined setting type: info_category_8000
- Optional selectorinputs and getselectorinputs for login section
- Extended language support and enhanced SelectorBlock validation

Warning

v10 - DEPRECATED as of 2025-08-24

All indexers migrated to v11

Schema remains for historical reference only

Note

For historical version information and deprecated schemas (v1-v9), see the main README.md.

Validation Requirements

YAML Format Requirements

All keys must be strings (no unquoted boolean or numeric keys)
Boolean values in options and case fields are automatically converted to strings
Numeric keys are automatically converted to strings during validation

Common Issues and Solutions

Warning

Boolean Keys/Values

# ❌ Problematic
options:
  true: yes    # boolean key 'true'
  false: no    # boolean key 'false'

# ✅ Preferred (quotes ensure string keys)
options:
  "true": yes
  "false": no

Warning

Numeric Keys

# ❌ Problematic
options:
  1: "Option 1"    # numeric key
  2: "Option 2"

# ✅ Preferred
options:
  "1": "Option 1"
  "2": "Option 2"

Validation Quirks and Limitations

Automatic Type Conversion

The Python validation system automatically handles common YAML parsing issues:

Boolean Keys: True/False keys are converted to "true"/"false"
Numeric Keys: Integer keys are converted to string representations
Boolean Values in Options: Boolean values in options and case fields are converted to lowercase strings

Schema Strictness

The JSON Schema validation is strict about types:

All dictionary keys must be strings
Values must match the expected type exactly
Pattern properties are enforced with regex matching

YAML vs JSON Differences

YAML parsing can introduce type conversions that don't occur in JSON:

Unquoted true/false become booleans
Unquoted numbers become integers/floats
Date-like strings may be parsed as datetime objects

Testing Your Changes

Local Testing

# Test all definitions (supports both Prowlarr and Jackett structures)
./scripts/validate-python.sh

# Test specific directory (current versions)
python3 scripts/validate.py definitions/v11
python3 scripts/validate.py definitions/v10

# Test external projects (like Jackett)
python3 scripts/validate.py ../jackett/src/Jackett.Common/Definitions

# Test single file
python3 scripts/validate.py --single yourindexer.yml schema.json

# Show all errors for comprehensive debugging
python3 scripts/validate.py  # default behavior

# Show only first error for quick fixes
python3 scripts/validate.py --first-error-only

CI/CD Workflows

The repository uses several automated workflows to ensure code quality:

1. YAML Validation (`ci.yml`)

Triggers: Push/PR to master on YAML files in definitions/
Purpose: Validates YAML syntax and formatting
Tools: yamllint with GitHub annotations
Runs: On every change to definition files

2. Python Validation (`python-validation.yml`)

Triggers: Push/PR to master on Python files in scripts/ or requirements.txt
Purpose: Validates Python script syntax and functionality
Tools: py_compile syntax checking
Runs: On script changes only

3. Schema Validation (part of `ci.yml`)

Purpose: Validates all indexer definitions against JSON schemas
Tool: Python-based validation using scripts/validate.py
Coverage: All definition files in versioned directories

4. Indexer Sync Automation (`indexer-sync.yml`)

Schedule: 3 times daily (2 AM, 10 AM, 6 PM UTC)
Purpose: Automatically syncs indexers from Jackett repository
Features:
- Automated PR creation for updates
- Manual trigger with debug options
- Caching for performance (Python deps + Jackett data)
Output: Creates/updates automated-indexer-sync branch

5. Label Actions (`label-actions.yml`)

Triggers: Label changes on issues/PRs
Purpose: Automated issue/PR management based on labels

All workflows include:

Concurrency controls to prevent duplicate runs
Proper caching for dependencies
GitHub status checks integration

Indexer Sync Process

Indexers are primarily synced from Jackett. There is an automated branch - automated-indexer-sync updated via GitHub Actions that run 3 times daily (12 AM, 8 AM, 4 PM UTC). However, this must be manually merged to master.

Community Sync Options

We sync indexer definitions from Jackett. The community can help with updates using either approach:

Option 1: PR the Automated Sync Branch

Monitor automated updates: View Latest Sync
Fork the Repo and Create a PR from your fork's automated-indexer-sync branch to upstream's master

Option 2: Run Sync Script Manually

Fork this repository and clone it

Use the sync script scripts/indexer-sync-v2.sh with available options:

Flag	Env Variable	Description
`-a`		Automation mode - runs without interactive prompts
`-b BRANCH`		Target branch to sync to (default: master)
`-c TEMPLATE`		Custom commit message template
`-d`	`DEBUG=true`	Enable DEBUG logging - shows detailed execution traces
`-f`		Force push with lease - overwrites remote branch safely
`-j URL`		Jackett repository URL to sync from
`-J BRANCH`		Jackett branch to sync from (default: master)
`-m MODE`		Execution mode: `normal` (default), `development`, `jackett`
`-n NAME`		Jackett remote name in git
`-o REMOTE`		Remote to push changes to (default: origin)
`-p`		Enable push mode - automatically pushes changes
`-r REMOTE`		Prowlarr remote name (default: origin)
`-R BRANCH`		Prowlarr release branch (default: master)
`-u URL`		Prowlarr repository URL
`-v`	`VERBOSE=true`	Enable VERBOSE logging - shows detailed progress info
`-z`		Skip backport - faster sync, skips older version updates

Create PR to master with your sync results

Quick Start

Fork this repository on GitHub

Clone your fork

git clone https://github.com/YOUR_USERNAME/Indexers.git
cd Indexers

Install dependencies

# Create virtual environment (recommended)
python -m venv .venv

# Activate virtual environment
# On Linux/Mac:
source .venv/bin/activate
# On Windows:
source .venv/Scripts/activate

# Install Python dependencies
pip install -r requirements.txt

Execute sync script

chmod +x scripts/indexer-sync-v2.sh
./scripts/indexer-sync-v2.sh -r upstream -p

Sync Script Features

The indexer sync script includes several performance and usability improvements:

Sparse Checkout: Automatically configures git sparse checkout to only download Jackett indexer definitions (src/Jackett.Common/Definitions/*), significantly reducing bandwidth and disk usage
Controlled Logging: Three logging levels for better troubleshooting:
- Default: Clean output (INFO, WARN, ERROR only)
- -v or VERBOSE=true: Shows detailed parameter and operation information
- -d or DEBUG=true: Shows all logging including debug traces
Efficient Syncing: Only fetches the necessary files from the large Jackett repository
Blocklist Support: Excludes problematic indexers via scripts/blocklist.txt (no hardcoded defaults)

Automated Sync Details

Schedule: 3 times daily (2 AM, 10 AM, 6 PM UTC) via GitHub Actions
Manual Trigger: Available through GitHub Actions workflow dispatch
Mode: Uses -z flag (skips backporting) for automated runs
Pull Requests: Automatically creates/updates PRs with sync results
Caching: Separate caches for Python dependencies and Jackett data for faster runs

Submitting Changes

Create a feature branch:

git checkout -b feature/new-indexer

Make your changes and validate:

./scripts/validate-python.sh

Commit with conventional format:

git commit -m "feat: add new indexer xyz"
git commit -m "fix: update indexer abc schema"
git commit -m "docs: improve contributing guide"

Push and create pull request:

git push origin feature/new-indexer

Schema Development

Adding New Fields

When adding fields to schemas:

Update the appropriate definitions/v{VERSION}/schema.json (currently v11)
Test against existing indexer definitions
Consider backward compatibility

Version Management

New breaking changes require a new schema version (v12+)
Current active version (v11) should remain stable
Deprecated versions (v1-v9) are frozen and no longer updated
Test schema changes against the full definition set

Troubleshooting

Common Validation Errors

"expected string or bytes-like object, got 'int'"

Usually caused by numeric keys in dictionaries
Fixed automatically by the validation system

"True is not of type 'string'"

Boolean value where string expected
Check options and case field values

"pattern does not match"

Field doesn't match required regex pattern
Check schema requirements for the specific field

Getting Help

Tip

Need assistance? Here are your best resources:

Check existing issues: https://github.com/Prowlarr/Indexers/issues

Review schema documentation: https://wiki.servarr.com/prowlarr/cardigann-yml-definition

Ask on Discord: https://requests.prowlarr.com/

Best Practices

Tip

Follow these guidelines for successful contributions:

Always validate before submitting

Use descriptive commit messages

Test changes thoroughly

Follow existing patterns in similar indexers

Quote YAML keys that might be interpreted as booleans/numbers

Keep indexer definitions focused and minimal

12 KiB Raw Blame History