Service Layer¶
The PackageStatsService class is the primary entry point for programmatic use of pkgdb.
PackageStatsService¶
High-level service for managing package statistics.
Provides a clean abstraction over database and API operations, making it easier to test, mock, and extend.
__init__ ¶
Initialize the service with a database path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| db_path | str | Path to the SQLite database file. | required |
add_package ¶
Add a package to tracking.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Package name to add. | required |
| verify | bool | If True, verify package exists on PyPI before adding. Network errors are logged as warnings but don't block addition. | True |
Returns:
| Type | Description |
|---|---|
| bool | True if package was added, False if it already exists. |
Raises:
| Type | Description |
|---|---|
| ValueError | If package name is invalid or package not found on PyPI (when verify=True). |
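The boolean return value and the `ValueError` together give three outcomes a caller may want to tell apart. The sketch below shows one way to handle them. `FakeService` is a hypothetical in-memory stand-in so the example runs without pkgdb installed; only the `add_package(name, verify=True)` shape is taken from the tables above, and the stand-in does not contact PyPI.

```python
class FakeService:
    """Hypothetical stand-in that mimics the documented add_package contract."""

    def __init__(self) -> None:
        self._tracked: set[str] = set()

    def add_package(self, name: str, verify: bool = True) -> bool:
        # The stand-in ignores `verify`; it only rejects blank names.
        if not name.strip():
            raise ValueError("invalid package name")
        if name in self._tracked:
            return False  # already tracked
        self._tracked.add(name)
        return True


def track(service, name: str) -> str:
    """Map the documented outcomes of add_package to a status string."""
    try:
        added = service.add_package(name)
    except ValueError as exc:
        # Documented for an invalid name, or a PyPI miss when verify=True.
        return f"rejected: {exc}"
    return "added" if added else "already tracked"


service = FakeService()
print(track(service, "requests"))
print(track(service, "requests"))
print(track(service, " "))
```

The same `track` helper should work with a real service instance, provided it exposes `add_package` as documented.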
remove_package ¶
Remove a package from tracking.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Package name to remove. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if package was removed, False if it didn't exist. |
list_packages ¶
Get the list of tracked packages with the dates they were added.
Returns:
| Type | Description |
|---|---|
| list[PackageInfo] | List of PackageInfo objects. |
import_packages ¶
Import packages from a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Path to file (JSON or plain text). | required |
| verify | bool | If True, verify each package exists on PyPI before adding. | True |
Returns:
| Type | Description |
|---|---|
| tuple[int, int, list[str], list[str]] | Tuple of (added_count, skipped_count, invalid_names, not_found_names). |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If file doesn't exist. |
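Because the result is a positional 4-tuple, unpacking it into named variables right away keeps calling code readable. A small sketch of that, using the element order from the Returns table above; the helper name `summarize_import` and the sample tuple are made up for illustration.

```python
def summarize_import(result: tuple[int, int, list[str], list[str]]) -> str:
    """Turn the documented 4-tuple into a one-line summary."""
    added, skipped, invalid, not_found = result
    parts = [f"{added} added", f"{skipped} skipped"]
    if invalid:
        parts.append("invalid: " + ", ".join(invalid))
    if not_found:
        parts.append("not found: " + ", ".join(not_found))
    return "; ".join(parts)


# Sample tuple invented for illustration, not real output.
print(summarize_import((2, 1, ["bad name"], ["no-such-pkg"])))
```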
sync_packages_from_user ¶
Sync tracked packages with a PyPI user's current packages.
Fetches the user's packages from PyPI and adds any that aren't already being tracked. Optionally removes packages no longer associated with the user.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| username | str | PyPI username to fetch packages from. | required |
| prune | bool | If True, remove locally tracked packages not in user's PyPI account. | False |
Returns:
| Type | Description |
|---|---|
| SyncResult \| None | SyncResult with lists of added, already tracked, packages not on remote, and pruned packages. Returns None if unable to fetch from PyPI. |
fetch_all_stats ¶
fetch_all_stats(
    progress_callback: Callable[
        [int, int, str, PackageStats | None], None
    ]
    | None = None,
) -> FetchResult
Fetch and store stats for all tracked packages.
Skips packages for which a fetch was already attempted within the last 24 hours. Uses batch commits for better performance when storing multiple packages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| progress_callback | Callable[[int, int, str, PackageStats \| None], None] \| None | Optional callback called for each package with (current_index, total_count, package_name, stats_or_none). | None |
Returns:
| Type | Description |
|---|---|
| FetchResult | FetchResult with success/failure/skipped counts and results. |
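A callback matching the documented four-argument shape could look like the sketch below. The stats argument is typed as a plain `object` here so the example runs without pkgdb; whether `current_index` starts at 0 or 1, and what exactly a `None` stats value signals, are not stated above, so the wording of the log lines is an assumption.

```python
from typing import Callable, Optional


def make_progress_logger(
    lines: list[str],
) -> Callable[[int, int, str, Optional[object]], None]:
    """Build a callback taking (current_index, total_count, package_name, stats_or_none)."""

    def on_progress(current: int, total: int, name: str, stats: Optional[object]) -> None:
        # Record one line per package; None means no stats were returned.
        status = "ok" if stats is not None else "no stats"
        lines.append(f"[{current}/{total}] {name}: {status}")

    return on_progress


log: list[str] = []
callback = make_progress_logger(log)
# Called by hand here; the service would invoke it once per package.
callback(1, 2, "requests", object())
callback(2, 2, "no-such-pkg", None)
print("\n".join(log))
```

With a real service instance, the callback would be passed as `fetch_all_stats(progress_callback=callback)`.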
fetch_package_details ¶
Fetch detailed statistics for a single package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package | str | Package name. | required |
Returns:
| Type | Description |
|---|---|
| PackageDetails | PackageDetails with stats, Python versions, and OS breakdown. |
get_stats ¶
Get latest stats for all packages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| with_growth | bool | If True, include growth metrics. | False |
Returns:
| Type | Description |
|---|---|
| list[dict[str, Any]] | List of stats dictionaries ordered by total downloads. |
get_history ¶
Get historical stats for a package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package | str | Package name. | required |
| limit | int | Maximum number of days to return. | 30 |
Returns:
| Type | Description |
|---|---|
| list[dict[str, Any]] | List of historical stats ordered by date descending. |
get_all_history ¶
Get historical stats for all packages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| limit_per_package | int | Maximum days per package. | 30 |
Returns:
| Type | Description |
|---|---|
| dict[str, list[dict[str, Any]]] | Dict mapping package names to their history. |
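The nested return type is easier to picture with a small example that walks it. This sketch only relies on the outer shape (package name mapped to a list of row dicts); the keys inside each row are placeholders, since the real row schema is not described here.

```python
from typing import Any


def records_per_package(history: dict[str, list[dict[str, Any]]]) -> dict[str, int]:
    """Count how many history rows each package has."""
    return {name: len(rows) for name, rows in history.items()}


# Illustrative data; the "day" key is a placeholder, not the real schema.
sample = {
    "requests": [{"day": 1}, {"day": 2}],
    "flask": [],
}
print(records_per_package(sample))
```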
generate_report ¶
generate_report(
    output_file: str,
    include_env: bool = False,
    include_github: bool = False,
) -> bool
Generate HTML report for all packages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_file | str | Path to write HTML file. | required |
| include_env | bool | If True, include Python/OS distribution summary. | False |
| include_github | bool | If True, include GitHub stats (stars, forks, etc.) from cache. Packages without cached data are skipped. | False |
Returns:
| Type | Description |
|---|---|
| bool | True if report was generated, False if no data available. |
Raises:
| Type | Description |
|---|---|
| ValueError | If output path is invalid or not writable. |
generate_package_report ¶
Generate detailed HTML report for a single package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package | str | Package name. | required |
| output_file | str | Path to write HTML file. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if report was generated. |
Raises:
| Type | Description |
|---|---|
| ValueError | If output path is invalid or not writable. |
fetch_package_releases ¶
Fetch PyPI and GitHub releases for a package.
Uses cached data when available (24h TTL).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package | str | Package name. | required |
Returns:
| Type | Description |
|---|---|
| tuple[list[PyPIRelease], list[GitHubRelease]] | Tuple of (pypi_releases, github_releases). |
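To illustrate what a 24-hour TTL means in practice, here is a generic freshness check. It is a sketch of the concept only, not pkgdb's cache code: the function name, the use of naive `datetime` values, and the choice to treat an entry exactly 24 hours old as expired are all assumptions.

```python
from datetime import datetime, timedelta

TTL = timedelta(hours=24)


def is_fresh(fetched_at: datetime, now: datetime, ttl: timedelta = TTL) -> bool:
    """Return True while a cached entry is younger than the TTL."""
    return now - fetched_at < ttl


fetched = datetime(2025, 1, 1, 12, 0)
print(is_fresh(fetched, datetime(2025, 1, 2, 11, 59)))  # just under 24h old
print(is_fresh(fetched, datetime(2025, 1, 2, 12, 0)))   # exactly 24h old
```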
generate_project_report ¶
Generate a project view HTML report for a single package.
Shows download history with release markers, release timeline, and environment distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package | str | Package name. | required |
| output_file | str | Path to write HTML file. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if report was generated. |
Raises:
| Type | Description |
|---|---|
| ValueError | If output path is invalid. |
export ¶
Export stats in the specified format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| format | str | One of 'csv', 'json', 'markdown', 'md'. | required |
| output_file | str \| None | Optional path to write output. If None, returns string. | None |
Returns:
| Type | Description |
|---|---|
| str \| None | Exported string, or None if no data available. |
Raises:
| Type | Description |
|---|---|
| ValueError | If format is unknown or output path is invalid. |
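When the output path is known, the format string can be derived from its extension instead of being passed separately. The helper below is a hypothetical convenience, not part of pkgdb; it maps file suffixes onto the four format names listed above.

```python
from pathlib import Path

# Suffix -> documented format name. The mapping itself is an assumption.
_FORMATS = {".csv": "csv", ".json": "json", ".md": "md", ".markdown": "markdown"}


def format_for(path: str) -> str:
    """Pick an export format from a file extension, case-insensitively."""
    suffix = Path(path).suffix.lower()
    try:
        return _FORMATS[suffix]
    except KeyError:
        raise ValueError(f"no export format for suffix {suffix!r}") from None


print(format_for("stats.CSV"))
print(format_for("out/report.md"))
```

With a real service instance this could be combined as `export(format=format_for(path), output_file=path)`, using the parameter names from the table above.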
generate_badge ¶
Generate an SVG badge for a package's download count.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package | str | Package name. | required |
| period | str | One of "total", "month", "week", "day". | 'total' |
| color | str \| None | Badge color (default: auto-select based on count). | None |
Returns:
| Type | Description |
|---|---|
| str \| None | SVG string for the badge, or None if no stats available. |
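Since the method returns the SVG as a string rather than writing a file, callers need to handle the `None` case before saving. A minimal sketch of that step; `save_badge` is a hypothetical helper, and the SVG literal below is a placeholder rather than real badge output.

```python
import tempfile
from pathlib import Path
from typing import Optional


def save_badge(svg: Optional[str], path: Path) -> bool:
    """Write the SVG to disk; return False when no badge was produced."""
    if svg is None:
        return False
    path.write_text(svg, encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "badge.svg"
    # Placeholder SVG standing in for the value generate_badge would return.
    print(save_badge("<svg></svg>", target), target.exists())
    print(save_badge(None, Path(tmp) / "missing.svg"))
```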
fetch_github_stats ¶
fetch_github_stats(
    packages: list[str] | None = None,
    use_cache: bool = True,
) -> list[RepoResult]
Fetch GitHub repository stats for tracked packages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| packages | list[str] \| None | Specific packages to fetch. If None, fetches all tracked. | None |
| use_cache | bool | Whether to use cached GitHub API responses (24h TTL). | True |
Returns:
| Type | Description |
|---|---|
| list[RepoResult] | List of RepoResult with stats or error for each package. |
clear_github_cache ¶
Clear GitHub API cache.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expired_only | bool | If True, only clear expired entries. | True |
Returns:
| Type | Description |
|---|---|
| int | Number of entries cleared. |
get_github_cache_stats ¶
Get GitHub cache statistics.
Returns:
| Type | Description |
|---|---|
| dict[str, int] | Dict with 'total', 'valid', and 'expired' counts. |
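The three documented keys are enough to compute a simple health metric for the cache. The sketch below derives the expired fraction; the helper name and the sample counts are invented for illustration.

```python
def expired_ratio(stats: dict[str, int]) -> float:
    """Fraction of cache entries that are expired (0.0 for an empty cache)."""
    total = stats["total"]
    return stats["expired"] / total if total else 0.0


# Sample counts invented for illustration.
sample = {"total": 8, "valid": 6, "expired": 2}
print(expired_ratio(sample))
```

A caller might use a ratio like this to decide when to run `clear_github_cache` with `expired_only=True`.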
cleanup ¶
Clean up orphaned stats and return counts.
Removes stats for packages that are no longer being tracked.
Returns:
| Type | Description |
|---|---|
| tuple[int, int] | Tuple of (orphaned_deleted, packages_remaining). |
prune ¶
Remove stats older than the specified number of days.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| days | int | Delete stats older than this many days. | 365 |
Returns:
| Type | Description |
|---|---|
| int | Number of records deleted. |
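To preview which records a given `days` value would affect, the cutoff date can be computed up front. This is a sketch under assumptions: it uses calendar dates, and it does not claim to match how pkgdb itself computes the boundary or whether the cutoff day is included.

```python
from datetime import date, timedelta
from typing import Optional


def prune_cutoff(days: int = 365, today: Optional[date] = None) -> date:
    """Return the date `days` before today; older records would be candidates."""
    today = today or date.today()
    return today - timedelta(days=days)


# Fixed "today" so the result is reproducible.
print(prune_cutoff(30, today=date(2025, 3, 31)))
```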
get_database_info ¶
Get database statistics and metadata.
Returns:
| Type | Description |
|---|---|
| DatabaseInfo | DatabaseInfo with package count, record count, date range, and file size. |
Data Classes¶
PackageInfo dataclass ¶
Information about a tracked package.
FetchResult dataclass ¶
Result of a fetch operation.
PackageDetails dataclass ¶
Detailed statistics for a package.
SyncResult dataclass ¶
Result of syncing packages from a PyPI user.