Database Operations¶
Low-level SQLite operations. Most users should use PackageStatsService instead of calling these directly.
Connection Management¶
get_db ¶
Context manager for database connections with automatic init and cleanup.
get_db_connection ¶
Create and return a database connection.
get_config_dir ¶
Get the pkgdb config directory (~/.pkgdb), creating it if needed.
Schema¶
Package Management¶
add_package ¶
Add a package to the tracking database.
Returns True if package was added, False if it already exists.
remove_package ¶
Remove a package from the tracking database.
Returns True if package was removed, False if it didn't exist.
get_packages ¶
Get list of tracked package names from the database.
Stats Storage¶
store_stats ¶
store_stats(
conn: Connection,
package_name: str,
stats: PackageStats,
commit: bool = True,
) -> None
Store package statistics in the database.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conn
|
Connection
|
Database connection. |
required |
package_name
|
str
|
Name of the package. |
required |
stats
|
PackageStats
|
Package statistics to store. |
required |
commit
|
bool
|
If True, commit the transaction. Set to False for batch operations. |
True
|
store_stats_batch ¶
Store multiple package statistics in a single transaction.
More efficient than calling store_stats() multiple times as it uses a single commit for all inserts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conn
|
Connection
|
Database connection. |
required |
stats_list
|
list[tuple[str, PackageStats]]
|
List of (package_name, stats) tuples to store. |
required |
Returns:
| Type | Description |
|---|---|
int
|
Number of packages stored. |
get_latest_stats ¶
Get the most recent stats for all packages, ordered by total downloads.
get_package_history ¶
Get historical stats for a specific package, ordered by date descending.
get_all_history ¶
Get historical stats for all packages, grouped by package name.
get_stats_with_growth ¶
Get latest stats with week-over-week and month-over-month growth metrics.
Uses a single query to fetch all history, avoiding N+1 query pattern.
Environment Stats¶
store_env_stats ¶
store_env_stats(
conn: Connection,
package_name: str,
python_versions: list[CategoryDownloads] | None = None,
os_data: list[CategoryDownloads] | None = None,
commit: bool = True,
) -> None
Store environment stats (Python versions, OS distribution) in the database.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conn
|
Connection
|
Database connection. |
required |
package_name
|
str
|
Name of the package. |
required |
python_versions
|
list[CategoryDownloads] | None
|
Python version download breakdown, or None. |
None
|
os_data
|
list[CategoryDownloads] | None
|
OS distribution download breakdown, or None. |
None
|
commit
|
bool
|
If True, commit the transaction. |
True
|
get_cached_python_versions ¶
Get cached Python version stats for a package.
Returns the most recent fetch date's data, sorted by downloads descending. Returns None if no cached data exists.
get_cached_os_stats ¶
Get cached OS distribution stats for a package.
Returns the most recent fetch date's data, sorted by downloads descending. Returns None if no cached data exists.
get_cached_env_summary ¶
Aggregate cached environment stats across all packages.
Returns dict with 'python_versions' and 'os_distribution' keys, each mapping to a list of (category, total_downloads) tuples sorted descending. Returns None if no cached data exists.
Release Data¶
store_pypi_releases ¶
Store PyPI release data and update cache timestamp.
get_pypi_releases ¶
Get cached PyPI releases if cache is valid.
Returns None if cache has expired or no data exists.
get_all_pypi_releases ¶
Get all stored PyPI releases regardless of cache validity.
store_github_releases ¶
Store GitHub release data and update cache timestamp.
get_github_releases ¶
Get cached GitHub releases if cache is valid.
Returns None if cache has expired or no data exists.
get_all_github_releases ¶
Get all stored GitHub releases regardless of cache validity.
Fetch Tracking¶
record_fetch_attempt ¶
record_fetch_attempt(
conn: Connection,
package_name: str,
success: bool,
commit: bool = True,
) -> None
Record a fetch attempt for a package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conn
|
Connection
|
Database connection. |
required |
package_name
|
str
|
Name of the package. |
required |
success
|
bool
|
Whether the fetch was successful. |
required |
commit
|
bool
|
If True, commit the transaction. |
True
|
get_packages_needing_update ¶
Get packages that haven't been fetched in the last N hours.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conn
|
Connection
|
Database connection. |
required |
hours
|
int
|
Number of hours since last attempt to consider stale. |
24
|
Returns:
| Type | Description |
|---|---|
list[str]
|
List of package names that need updating. |
get_next_update_seconds ¶
Get seconds until the next package becomes eligible for update.
Finds the oldest successful attempt within the cooldown window and computes how many seconds remain until it expires.
Returns:
| Type | Description |
|---|---|
float | None
|
Seconds until the next package is eligible, or None if no packages are throttled. |
Maintenance¶
cleanup_orphaned_stats ¶
Remove stats for packages that are no longer being tracked.
Returns the number of orphaned records deleted.
prune_old_stats ¶
Remove stats older than the specified number of days.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conn
|
Connection
|
Database connection. |
required |
days
|
int
|
Delete stats older than this many days (default: 365). |
365
|
Returns:
| Type | Description |
|---|---|
int
|
Number of records deleted. |
get_database_stats ¶
Get database statistics.
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dict with package_count, record_count, first_fetch, and last_fetch. |