CLOE Util Git Client

A Python library for retrieving and interacting with files from Git repository history based on tags.

Overview

cloe-util-git-client provides a convenient interface to access historical versions of JSON and YAML files from your Git repository. It allows you to retrieve file contents from specific tagged commits, making it ideal for configuration management, versioning workflows, and historical data access.

Installation

Install the package with pip:

pip install cloe-util-git-client

Or with uv:

uv add cloe-util-git-client

Quick Start

from pathlib import Path
from cloe_util_git_client.git_client import GitClient

# Initialize the client with your repository path and tag pattern
client = GitClient(
    model_root_path=Path("/path/to/your/repo"),
    git_tag_regex=r"v\d+\.\d+\.\d+"  # Match semantic version tags
)

# Retrieve all JSON files from a specific directory at the tagged commit
json_files = client.get_json_from_tag(Path("config/"))

# Access the retrieved files
for file_path, content in json_files.items():
    print(f"{file_path}: {content}")

Core Concepts

Tag-Based Retrieval

The GitClient works by finding the most recent Git tag that matches a specified regex pattern. It then retrieves files from that tagged commit, allowing you to access specific historical versions of your files.
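The selection step described above can be sketched in plain Python. This is an illustration only, not the library's implementation: `TagInfo` and `latest_matching_tag` are hypothetical names, and the sketch assumes that "most recent" means the latest commit date and that the pattern must match the whole tag name. The library may order or match tags differently.

```python
from __future__ import annotations

import re
from datetime import datetime, timezone
from typing import NamedTuple


class TagInfo(NamedTuple):
    """Hypothetical stand-in for a Git tag: its name and the date of its commit."""
    name: str
    committed_at: datetime


def latest_matching_tag(tags: list[TagInfo], git_tag_regex: str) -> TagInfo | None:
    """Return the newest tag whose full name matches the pattern, or None."""
    pattern = re.compile(git_tag_regex)
    # Assumption: the whole tag name must match (fullmatch), not just a prefix.
    matching = [tag for tag in tags if pattern.fullmatch(tag.name)]
    return max(matching, key=lambda tag: tag.committed_at, default=None)


tags = [
    TagInfo("v1.0.0", datetime(2023, 1, 10, tzinfo=timezone.utc)),
    TagInfo("v1.2.0", datetime(2023, 6, 2, tzinfo=timezone.utc)),
    TagInfo("nightly-42", datetime(2024, 2, 1, tzinfo=timezone.utc)),
]
selected = latest_matching_tag(tags, r"v\d+\.\d+\.\d+")
print(selected.name if selected else "no matching tag")  # v1.2.0
```

The newer `nightly-42` tag is skipped because it does not match the pattern, which is why a tight regex matters when a repository mixes several tagging schemes.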

Supported File Types

  • JSON files (.json) - Parsed and returned as Python dictionaries or lists
  • YAML files (.yaml, .yml) - Parsed and returned as Python dictionaries or lists
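
As a rough illustration of the two points above, the sketch below maps file extensions to a format and shows why parsed content can be either a dictionary or a list. `detect_format` and `SUFFIX_TO_FORMAT` are hypothetical names for this example, and the case-sensitive suffix comparison is an assumption, not documented library behavior.

```python
import json
from pathlib import PurePosixPath

# Extensions listed in this documentation, mapped to a format label.
SUFFIX_TO_FORMAT = {".json": "json", ".yaml": "yaml", ".yml": "yaml"}


def detect_format(path):
    """Return "json", "yaml", or None for unsupported extensions."""
    return SUFFIX_TO_FORMAT.get(PurePosixPath(path).suffix)


print(detect_format("config/app.json"))        # json
print(detect_format("docker-compose.yml"))     # yaml
print(detect_format("README.md"))              # None

# The top-level JSON value decides whether the parsed result is a dict or a list.
as_dict = json.loads('{"debug": true}')
as_list = json.loads('[{"name": "a"}, {"name": "b"}]')
print(type(as_dict).__name__, type(as_list).__name__)  # dict list
```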

API Reference

GitClient

The main class for interacting with Git repositories.

Constructor

GitClient(model_root_path: pathlib.Path, git_tag_regex: str)

Parameters:

  • model_root_path (Path): Absolute path to the root of your Git repository
  • git_tag_regex (str): Regular expression pattern to match Git tags

Example:

from pathlib import Path
from cloe_util_git_client.git_client import GitClient

# Match tags like "release-2024.01.15"
client = GitClient(
    model_root_path=Path("/projects/myapp"),
    git_tag_regex=r"release-\d{4}\.\d{2}\.\d{2}"
)

Methods

get_json_from_tag(target_path: pathlib.Path) -> dict[str, dict | list]

Retrieves all JSON files from the specified path at the tagged commit.

Parameters:

  • target_path (Path): Path to a file or directory within the repository (can be absolute or relative)

Returns:

Dictionary mapping file paths to their parsed JSON content.
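
Because the result is always a mapping, requesting a single file still returns a dictionary keyed by path. The sketch below shows one way to pull out a single file's content. `content_for` is a hypothetical helper, the sample `results` dictionary is made up for illustration, and the exact key format (here, POSIX-style paths relative to the repository root) is an assumption.

```python
from pathlib import PurePosixPath


def content_for(results, file_name):
    """Return the parsed content of the first entry whose path ends in file_name."""
    for file_path, content in results.items():
        if PurePosixPath(file_path).name == file_name:
            return content
    raise KeyError(file_name)


# Hypothetical return value; real keys and values depend on your repository.
results = {
    "config/settings.json": {"debug": False, "workers": 4},
    "config/features.json": ["search", "export"],
}

settings = content_for(results, "settings.json")
print(settings["workers"])  # 4
```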

Example:

# Get all JSON files from a directory
configs = client.get_json_from_tag(Path("config/production/"))

# Get a specific JSON file
settings = client.get_json_from_tag(Path("config/settings.json"))

# Get all JSON files from repository root
all_json = client.get_json_from_tag(Path())

get_yaml_from_tag(target_path: pathlib.Path) -> dict[str, dict | list]

Retrieves all YAML files from the specified path at the tagged commit.

Parameters:

  • target_path (Path): Path to a file or directory within the repository (can be absolute or relative)

Returns:

Dictionary mapping file paths to their parsed YAML content.

Example:

# Get all YAML files from a directory
manifests = client.get_yaml_from_tag(Path("kubernetes/"))

# Get a specific YAML file
config = client.get_yaml_from_tag(Path("docker-compose.yml"))

get_git_tree_list(commit_start: Commit, target_path: pathlib.Path) -> list

Returns the paths of all files found under target_path in the tree of the given commit.

Parameters:

  • commit_start (Commit): Git commit object to retrieve files from
  • target_path (Path): Path to filter files by

Returns:

List of file paths (as strings) relative to the repository root.

Note: This is a lower-level method. Most users should use get_json_from_tag() or get_yaml_from_tag() instead.
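
The path filtering this method performs can be pictured with a small standalone sketch. It does not touch Git at all: `filter_by_target` is a hypothetical function operating on a plain list of repository-relative paths, and the matching rule (a file matches if it is the target itself or lies beneath it) is an assumption about the intended behavior.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def filter_by_target(all_paths: list[str], target_path: str) -> list[str]:
    """Keep paths that equal target_path or are located underneath it."""
    target = PurePosixPath(target_path)
    return [
        path
        for path in all_paths
        # Comparing whole path components avoids matching "configs/" for "config/".
        if PurePosixPath(path) == target or target in PurePosixPath(path).parents
    ]


paths = [
    "config/app.json",
    "config/prod/db.json",
    "configs/other.json",
    "docs/readme.md",
]
print(filter_by_target(paths, "config/"))
print(filter_by_target(paths, "config/app.json"))
print(len(filter_by_target(paths, "")))  # an empty path selects the whole tree
```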

Usage Examples

Example 1: Retrieve Configuration Files

from pathlib import Path
from cloe_util_git_client.git_client import GitClient

# Initialize client for production configs
client = GitClient(
    model_root_path=Path("/app/repository"),
    git_tag_regex=r"prod-v\d+\.\d+"
)

# Load all configuration files from the last production release
configs = client.get_json_from_tag(Path("config/"))

# Process each configuration
for file_path, config_data in configs.items():
    print(f"Loading config: {file_path}")
    # Parsed JSON may be a dict or a list; only dicts support .get()
    if isinstance(config_data, dict):
        database_url = config_data.get("database", {}).get("url")

Example 2: Compare Multiple Versions

from pathlib import Path
from cloe_util_git_client.git_client import GitClient

# Load from the most recent 1.0.x tag
client_v1 = GitClient(
    model_root_path=Path("/app/repo"),
    git_tag_regex=r"v1\.0\.\d+"
)
config_v1 = client_v1.get_json_from_tag(Path("config/app.json"))

# Load from the most recent 2.0.x tag
client_v2 = GitClient(
    model_root_path=Path("/app/repo"),
    git_tag_regex=r"v2\.0\.\d+"
)
config_v2 = client_v2.get_json_from_tag(Path("config/app.json"))

# Compare configurations file by file (the result keys are file paths)
for file_path, content_v1 in config_v1.items():
    if content_v1 != config_v2.get(file_path):
        print(f"Changed: {file_path}")
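
A loop over the first version's keys does not report entries that exist only in the second version. The sketch below is one possible way to get a fuller picture by classifying keys as added, removed, or changed. `diff_mappings` is a hypothetical helper, not part of the library, and the two sample dictionaries stand in for real results.

```python
from __future__ import annotations


def diff_mappings(old: dict, new: dict) -> dict[str, list[str]]:
    """Classify keys as added, removed, or changed between two mappings."""
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(k for k in old_keys & new_keys if old[k] != new[k]),
    }


# Hypothetical results for two tags; real keys and values depend on your repository.
config_v1 = {"config/app.json": {"timeout": 30}, "config/legacy.json": {"enabled": True}}
config_v2 = {"config/app.json": {"timeout": 60}, "config/cache.json": {"ttl": 300}}

report = diff_mappings(config_v1, config_v2)
print(report)
```

The same helper works one level down: passing the parsed content of a single file from each version reports which top-level settings changed, provided both values are dictionaries.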