GitHub Integration

GitHub API client with caching and rate limit handling.

Data Classes

RepoStats dataclass

Statistics for a GitHub repository.

RepoResult dataclass

Result of fetching repo stats for a package.

Repository Stats

fetch_repo_stats

fetch_repo_stats(
    owner: str,
    repo: str,
    conn: Connection | None = None,
    use_cache: bool = True,
) -> RepoStats

Fetch repository statistics from GitHub API.

Uses cached responses when available (24h TTL). Reads a token from the GITHUB_TOKEN or GH_TOKEN environment variable, if set, for higher rate limits. Retries with exponential backoff when the API responds with a rate-limit error (HTTP 403).

Raises HTTPError on API errors.
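
The retry behaviour described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: RateLimited, with_backoff, the retry count, and the delays are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 403 rate-limit response."""


def with_backoff(
    call: Callable[[], T],
    max_retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call`, doubling the wait after each rate-limited attempt."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Passing sleep as a parameter is a choice made for the sketch so the delays can be observed in a test without actually waiting.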

fetch_package_github_stats

fetch_package_github_stats(
    package_name: str,
    conn: Connection | None = None,
    use_cache: bool = True,
) -> RepoResult

Fetch GitHub stats for a PyPI package.

Looks up the GitHub repo URL from PyPI metadata, then fetches repository statistics from the GitHub API.

Releases

fetch_github_releases

fetch_github_releases(
    owner: str, repo: str
) -> list[dict[str, Any]] | None

Fetch release list from the GitHub Releases API.

Parameters:

Name   Type  Description        Default
owner  str   Repository owner.  required
repo   str   Repository name.   required

Returns:

Type                         Description
list[dict[str, Any]] | None  List of release dicts with tag_name, published_at, and name, sorted by published_at ascending. Returns None on API error, or an empty list if the repo has no releases.
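
One way the returned shape could be produced from raw Releases API entries is sketched below. summarize_releases is a hypothetical helper, not part of this module, and skipping entries without a published_at value is an assumption made for the example.

```python
from typing import Any


def summarize_releases(raw: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep tag_name, published_at, and name; sort oldest first."""
    slim = [
        {key: r.get(key) for key in ("tag_name", "published_at", "name")}
        for r in raw
        if r.get("published_at")  # assumption: skip entries with no publish time
    ]
    # ISO 8601 UTC timestamps in a single format sort correctly as strings.
    return sorted(slim, key=lambda r: r["published_at"])
```

Callers of fetch_github_releases should still distinguish None (API error) from an empty list (no releases), since both are falsy.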

URL Parsing

parse_github_url

parse_github_url(url: str) -> tuple[str, str] | None

Extract owner and repo name from a GitHub URL.

Returns an (owner, repo) tuple, or None if the URL is not a GitHub URL.
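
A minimal sketch of this kind of parsing, using only the standard library, might look like the following. parse_github_url_sketch is a hypothetical name, and the rules it applies (accepting only github.com hosts, stripping a trailing .git, ignoring SSH-style URLs) are assumptions rather than the documented behaviour of parse_github_url.

```python
from typing import Optional
from urllib.parse import urlparse


def parse_github_url_sketch(url: str) -> Optional[tuple[str, str]]:
    """Return (owner, repo) for a github.com URL, else None."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    # Drop empty segments produced by leading or trailing slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None  # e.g. a user profile URL with no repository
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo
```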

extract_github_url

extract_github_url(package_name: str) -> str | None

Extract GitHub repository URL from PyPI package metadata.

Queries the PyPI JSON API and looks for GitHub URLs in project_urls and home_page fields.

Returns the GitHub URL or None if not found.
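
The lookup can be illustrated on an already-decoded "info" object from the PyPI JSON API. find_github_url is a hypothetical helper, and the search order (project_urls first, then home_page) and the substring match are assumptions for this sketch.

```python
from typing import Any, Optional


def find_github_url(info: dict[str, Any]) -> Optional[str]:
    """Return the first github.com URL in project_urls or home_page."""
    project_urls = info.get("project_urls") or {}  # the field may be null
    candidates = list(project_urls.values()) + [info.get("home_page")]
    for url in candidates:
        if url and "github.com" in url.lower():
            return url
    return None
```

For example, with project_urls containing a "Source" entry that points at a GitHub repository, the helper returns that entry even when home_page points elsewhere.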

Cache Management

get_github_cache_stats

get_github_cache_stats(conn: Connection) -> dict[str, int]

Get statistics about the GitHub cache.

clear_github_cache

clear_github_cache(
    conn: Connection, expired_only: bool = True
) -> int

Clear GitHub API cache entries.

Returns number of entries cleared.
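
The effect of expired_only can be sketched against an in-memory SQLite database. Everything here is assumed for illustration: that Connection is a sqlite3.Connection, the github_cache table and its expires_at column, and the clear_cache_sketch name.

```python
import sqlite3
import time


def clear_cache_sketch(conn: sqlite3.Connection, expired_only: bool = True) -> int:
    """Delete cache rows and return how many were removed."""
    if expired_only:
        cur = conn.execute(
            "DELETE FROM github_cache WHERE expires_at <= ?", (time.time(),)
        )
    else:
        cur = conn.execute("DELETE FROM github_cache")
    conn.commit()
    return cur.rowcount


# Hypothetical schema: one expired row and one valid for another 24 hours.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE github_cache (key TEXT PRIMARY KEY, body TEXT, expires_at REAL)"
)
now = time.time()
conn.executemany(
    "INSERT INTO github_cache VALUES (?, ?, ?)",
    [("old", "{}", now - 60), ("fresh", "{}", now + 86400)],
)
conn.commit()
```

With the default expired_only=True only the "old" row is removed. Passing expired_only=False empties the table.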

Authentication

get_github_token

get_github_token() -> str | None

Get GitHub token from environment (GITHUB_TOKEN or GH_TOKEN).
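
A lookup of this kind could be written as below. token_from_env and auth_headers are hypothetical names, and checking GITHUB_TOKEN before GH_TOKEN is an assumption. The source does not state which variable takes precedence.

```python
import os
from typing import Mapping, Optional


def token_from_env(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token found, or None."""
    for name in ("GITHUB_TOKEN", "GH_TOKEN"):
        value = env.get(name)
        if value:
            return value
    return None


def auth_headers(token: Optional[str]) -> dict[str, str]:
    """Build request headers, adding Authorization only when a token exists."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Without a token, requests are sent unauthenticated and are subject to GitHub's lower rate limit.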