Skip to content

Database Operations

Low-level SQLite operations. Most users should use PackageStatsService instead of calling these directly.

Connection Management

get_db

get_db(
    db_path: str,
) -> Generator[sqlite3.Connection, None, None]

Context manager for database connections with automatic init and cleanup.

get_db_connection

get_db_connection(db_path: str) -> sqlite3.Connection

Create and return a database connection.

get_config_dir

get_config_dir() -> Path

Get the pkgdb config directory (~/.pkgdb), creating it if needed.

Schema

init_db

init_db(conn: Connection) -> None

Initialize the database schema.

Package Management

add_package

add_package(conn: Connection, name: str) -> bool

Add a package to the tracking database.

Returns True if package was added, False if it already exists.

remove_package

remove_package(conn: Connection, name: str) -> bool

Remove a package from the tracking database.

Returns True if package was removed, False if it didn't exist.

get_packages

get_packages(conn: Connection) -> list[str]

Get list of tracked package names from the database.

Stats Storage

store_stats

store_stats(
    conn: Connection,
    package_name: str,
    stats: PackageStats,
    commit: bool = True,
) -> None

Store package statistics in the database.

Parameters:

Name Type Description Default
conn Connection

Database connection.

required
package_name str

Name of the package.

required
stats PackageStats

Package statistics to store.

required
commit bool

If True, commit the transaction. Set to False for batch operations.

True

store_stats_batch

store_stats_batch(
    conn: Connection,
    stats_list: list[tuple[str, PackageStats]],
) -> int

Store multiple package statistics in a single transaction.

More efficient than calling store_stats() multiple times as it uses a single commit for all inserts.

Parameters:

Name Type Description Default
conn Connection

Database connection.

required
stats_list list[tuple[str, PackageStats]]

List of (package_name, stats) tuples to store.

required

Returns:

Type Description
int

Number of packages stored.

get_latest_stats

get_latest_stats(conn: Connection) -> list[dict[str, Any]]

Get the most recent stats for all packages, ordered by total downloads.

get_package_history

get_package_history(
    conn: Connection, package_name: str, limit: int = 30
) -> list[dict[str, Any]]

Get historical stats for a specific package, ordered by date descending.

get_all_history

get_all_history(
    conn: Connection, limit_per_package: int = 30
) -> dict[str, list[dict[str, Any]]]

Get historical stats for all packages, grouped by package name.

get_stats_with_growth

get_stats_with_growth(
    conn: Connection,
) -> list[dict[str, Any]]

Get latest stats with week-over-week and month-over-month growth metrics.

Uses a single query to fetch all history, avoiding N+1 query pattern.

Environment Stats

store_env_stats

store_env_stats(
    conn: Connection,
    package_name: str,
    python_versions: list[CategoryDownloads] | None = None,
    os_data: list[CategoryDownloads] | None = None,
    commit: bool = True,
) -> None

Store environment stats (Python versions, OS distribution) in the database.

Parameters:

Name Type Description Default
conn Connection

Database connection.

required
package_name str

Name of the package.

required
python_versions list[CategoryDownloads] | None

Python version download breakdown, or None.

None
os_data list[CategoryDownloads] | None

OS distribution download breakdown, or None.

None
commit bool

If True, commit the transaction.

True

get_cached_python_versions

get_cached_python_versions(
    conn: Connection, package_name: str
) -> list[CategoryDownloads] | None

Get cached Python version stats for a package.

Returns the most recent fetch date's data, sorted by downloads descending. Returns None if no cached data exists.

get_cached_os_stats

get_cached_os_stats(
    conn: Connection, package_name: str
) -> list[CategoryDownloads] | None

Get cached OS distribution stats for a package.

Returns the most recent fetch date's data, sorted by downloads descending. Returns None if no cached data exists.

get_cached_env_summary

get_cached_env_summary(
    conn: Connection,
) -> EnvSummary | None

Aggregate cached environment stats across all packages.

Returns dict with 'python_versions' and 'os_distribution' keys, each mapping to a list of (category, total_downloads) tuples sorted descending. Returns None if no cached data exists.

Release Data

store_pypi_releases

store_pypi_releases(
    conn: Connection,
    package_name: str,
    releases: list[PyPIRelease],
) -> None

Store PyPI release data and update cache timestamp.

get_pypi_releases

get_pypi_releases(
    conn: Connection, package_name: str
) -> list[PyPIRelease] | None

Get cached PyPI releases if cache is valid.

Returns None if cache has expired or no data exists.

get_all_pypi_releases

get_all_pypi_releases(
    conn: Connection, package_name: str
) -> list[PyPIRelease]

Get all stored PyPI releases regardless of cache validity.

store_github_releases

store_github_releases(
    conn: Connection,
    repo_key: str,
    releases: list[GitHubRelease],
) -> None

Store GitHub release data and update cache timestamp.

get_github_releases

get_github_releases(
    conn: Connection, repo_key: str
) -> list[GitHubRelease] | None

Get cached GitHub releases if cache is valid.

Returns None if cache has expired or no data exists.

get_all_github_releases

get_all_github_releases(
    conn: Connection, repo_key: str
) -> list[GitHubRelease]

Get all stored GitHub releases regardless of cache validity.

Fetch Tracking

record_fetch_attempt

record_fetch_attempt(
    conn: Connection,
    package_name: str,
    success: bool,
    commit: bool = True,
) -> None

Record a fetch attempt for a package.

Parameters:

Name Type Description Default
conn Connection

Database connection.

required
package_name str

Name of the package.

required
success bool

Whether the fetch was successful.

required
commit bool

If True, commit the transaction.

True

get_packages_needing_update

get_packages_needing_update(
    conn: Connection, hours: int = 24
) -> list[str]

Get packages that haven't been fetched in the last N hours.

Parameters:

Name Type Description Default
conn Connection

Database connection.

required
hours int

Number of hours since last attempt to consider stale.

24

Returns:

Type Description
list[str]

List of package names that need updating.

get_next_update_seconds

get_next_update_seconds(
    conn: Connection, hours: int = 24
) -> float | None

Get seconds until the next package becomes eligible for update.

Finds the oldest successful attempt within the cooldown window and computes how many seconds remain until it expires.

Returns:

Type Description
float | None

Seconds until the next package is eligible, or None if no packages are throttled.

Maintenance

cleanup_orphaned_stats

cleanup_orphaned_stats(conn: Connection) -> int

Remove stats for packages that are no longer being tracked.

Returns the number of orphaned records deleted.

prune_old_stats

prune_old_stats(conn: Connection, days: int = 365) -> int

Remove stats older than the specified number of days.

Parameters:

Name Type Description Default
conn Connection

Database connection.

required
days int

Delete stats older than this many days (default: 365).

365

Returns:

Type Description
int

Number of records deleted.

get_database_stats

get_database_stats(conn: Connection) -> dict[str, Any]

Get database statistics.

Returns:

Type Description
dict[str, Any]

Dict with package_count, record_count, first_fetch, and last_fetch.