Examples
Basic usage
import pandas as pd
from pandas_diff import get_diffs

before = pd.DataFrame([
    {"hero": "hulk", "power": "strength"},
    {"hero": "black_widow", "power": "spy"},
])
after = pd.DataFrame([
    {"hero": "hulk", "power": "smart"},
    {"hero": "captain marvel", "power": "strength"},
])

df = get_diffs(before, after, keys="hero")
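To see which kinds of change the two example frames contain, the rows can be classified with plain pandas. This is only a rough cross-check: it does not call pandas_diff, it is not how the library works internally, and it assumes that `hero` uniquely identifies each row. The names `merged`, `removed`, `added`, and `changed` are illustrative.

```python
import pandas as pd

before = pd.DataFrame([
    {"hero": "hulk", "power": "strength"},
    {"hero": "black_widow", "power": "spy"},
])
after = pd.DataFrame([
    {"hero": "hulk", "power": "smart"},
    {"hero": "captain marvel", "power": "strength"},
])

# Outer merge on the key; the `_merge` column records which side each key came from.
merged = before.merge(
    after,
    on="hero",
    how="outer",
    suffixes=("_before", "_after"),
    indicator=True,
)

# Keys that exist only in `before` or only in `after`.
removed = merged.loc[merged["_merge"] == "left_only", "hero"].tolist()
added = merged.loc[merged["_merge"] == "right_only", "hero"].tolist()

# Keys present on both sides whose `power` value differs.
both = merged[merged["_merge"] == "both"]
changed = both.loc[both["power_before"] != both["power_after"], "hero"].tolist()
```

Here `black_widow` appears only in `before`, `captain marvel` only in `after`, and `hulk` is present in both with a different `power`.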
Multi-key comparison
df = get_diffs(before, after, keys=["region", "server_id"])
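The call above presumes frames that contain both `region` and `server_id` columns. The sketch below builds hypothetical data of that shape and checks, with plain pandas, why a composite key may be needed: `server_id` repeats across regions, so only the pair identifies a row. Whether `get_diffs` itself requires unique keys is not stated here, so treat the uniqueness check as an optional precaution.

```python
import pandas as pd

keys = ["region", "server_id"]

# Hypothetical data: server_id 1 exists in two regions, so neither column
# identifies a row on its own, but the (region, server_id) pair does.
before = pd.DataFrame([
    {"region": "eu", "server_id": 1, "status": "up"},
    {"region": "us", "server_id": 1, "status": "up"},
])
after = pd.DataFrame([
    {"region": "eu", "server_id": 1, "status": "down"},
    {"region": "us", "server_id": 1, "status": "up"},
])

# duplicated() flags rows whose key values were already seen earlier.
single_key_unique = not before.duplicated(subset=["server_id"]).any()
composite_key_unique = not before.duplicated(subset=keys).any()
```

With this data the single column is not unique while the composite key is, which is the situation where passing a list to `keys` makes sense.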
Ignoring columns
df = get_diffs(before, after, keys="id", ignore_columns=["updated_at"])
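A typical reason to ignore a column is a timestamp that changes on every export even when nothing else does. The sketch below illustrates that motivation with plain pandas and hypothetical data. It assumes that ignoring a column is comparable to dropping it from both frames before comparing; how pandas_diff actually handles `ignore_columns` internally is not shown here.

```python
import pandas as pd

# Hypothetical data: only the bookkeeping column `updated_at` differs.
before = pd.DataFrame([
    {"id": 1, "name": "alpha", "updated_at": "2024-01-01"},
    {"id": 2, "name": "beta", "updated_at": "2024-01-01"},
])
after = pd.DataFrame([
    {"id": 1, "name": "alpha", "updated_at": "2024-02-01"},
    {"id": 2, "name": "beta", "updated_at": "2024-02-01"},
])

ignore_columns = ["updated_at"]

# With the timestamp included, the frames differ.
differs_with_timestamp = not before.equals(after)

# With the timestamp dropped from both sides, no difference remains.
differs_without_timestamp = not before.drop(columns=ignore_columns).equals(
    after.drop(columns=ignore_columns)
)
```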
API Reference
Main module.
- pandas_diff.pandas_diff.get_diffs(before: DataFrame, after: DataFrame, keys: list[str] | str, ignore_columns: list[str] | None = None) -> DataFrame
Generate DataFrame with differences between two DataFrames.
- Args:
before: DataFrame representing the previous state.
after: DataFrame representing the current state.
keys: Column name(s) used to identify rows.
ignore_columns: Columns to exclude from comparison.
- Returns:
DataFrame with columns: operation, object_keys, object_values, object_json, attribute_changed, old_value, new_value.
- Raises:
TypeError: If before or after are not DataFrames.
ValueError: If keys is empty or not found in columns.
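The documented error conditions can be pictured as a small pre-check. The function below is a hypothetical illustration written for this page, not the library's source: the name `check_inputs`, the error messages, and the decision to check both frames for the key columns are all assumptions.

```python
import pandas as pd


def check_inputs(before, after, keys):
    """Hypothetical pre-check mirroring the documented error conditions."""
    if not isinstance(before, pd.DataFrame) or not isinstance(after, pd.DataFrame):
        raise TypeError("before and after must be pandas DataFrames")

    # Accept a single column name or a list of names, as the signature allows.
    key_list = [keys] if isinstance(keys, str) else list(keys)
    if not key_list:
        raise ValueError("keys must not be empty")

    # Assumption: every key column has to exist in both frames.
    for frame_name, frame in (("before", before), ("after", after)):
        missing = [k for k in key_list if k not in frame.columns]
        if missing:
            raise ValueError(f"keys {missing} not found in {frame_name}")

    return key_list
```

Running a check like this before a comparison surfaces a misspelled key column early, with a message that names the offending frame.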