ext.partition
partition
Partition helpers -- Hive-style partition path building and parsing.
Utilities for constructing and deconstructing partition paths like
year=2026/month=03/day=01/data.parquet, commonly used in Parquet
data lake workflows.
Example
ParsedPartition (dataclass)
Result of parsing a Hive-style partition path.
Attributes:
- partitions (dict[str, str]) – Ordered mapping of partition column names to values.
- filename (str) – The trailing non-partition portion of the path.
partition_path
Build a Hive-style partition path.
Parameters:
- filename (str) – Leaf file name (e.g., "data.parquet"). Must be non-empty and must not contain /.
- partitions (str | int, default: {}) – Partition key-value pairs. Values are coerced to str. Keys and coerced values must be non-empty and must not contain =.
Returns:
- str – Forward-slash-joined path like "year=2026/month=03/data.parquet".
Raises:
- ValueError – If filename is empty or contains /, if any partition key or coerced value is empty, or if any key or value contains =.
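The validation and joining rules above can be sketched in a few lines. This is a minimal stand-in, not the library's implementation: the name partition_path_sketch is hypothetical, and passing the partitions as keyword arguments is an assumption based on the documented default of {}.

```python
from typing import Union


def partition_path_sketch(filename: str, **partitions: Union[str, int]) -> str:
    """Hypothetical stand-in that follows the documented rules."""
    # The leaf file name must be non-empty and must not contain "/".
    if not filename or "/" in filename:
        raise ValueError(f"invalid filename: {filename!r}")

    segments = []
    for key, raw_value in partitions.items():
        value = str(raw_value)  # values are coerced to str before validation
        if not key or not value:
            raise ValueError("partition keys and values must be non-empty")
        if "=" in key or "=" in value:
            raise ValueError("partition keys and values must not contain '='")
        segments.append(f"{key}={value}")

    # Partition segments come first, in the order given, then the file name.
    return "/".join(segments + [filename])


print(partition_path_sketch("data.parquet", year=2026, month="03"))
# year=2026/month=03/data.parquet
```

Passing month as the string "03" rather than the integer 3 keeps the zero padding, since values are only coerced with str and not reformatted.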
parse_partition
parse_partition(path: str) -> ParsedPartition
Parse a Hive-style partition path into its components.
A segment is treated as a partition if it contains exactly one =
and the key portion is non-empty. Once a non-partition segment is
encountered, all remaining segments (including any later key=value
segments) become part of the filename.
Parameters:
- path (str) – The partition path to parse (e.g., "year=2026/month=03/data.parquet").
Returns:
- ParsedPartition – A ParsedPartition with extracted partitions and filename.
Raises:
- ValueError – If path is empty.
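To make the "first non-partition segment ends parsing" rule concrete, here is a small sketch that follows the description above. The names ParsedPartitionSketch and parse_partition_sketch are hypothetical stand-ins, not the library's code. Behavior the documentation does not specify, such as repeated keys (here the later value wins), is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedPartitionSketch:
    """Hypothetical stand-in for ParsedPartition."""
    partitions: dict = field(default_factory=dict)
    filename: str = ""


def parse_partition_sketch(path: str) -> ParsedPartitionSketch:
    if not path:
        raise ValueError("path must not be empty")

    segments = path.split("/")
    partitions = {}
    consumed = 0
    for segment in segments:
        key, _, value = segment.partition("=")
        # A partition segment has exactly one "=" and a non-empty key.
        if segment.count("=") != 1 or not key:
            break  # everything from here on belongs to the filename
        partitions[key] = value  # assumption: a repeated key is overwritten
        consumed += 1

    return ParsedPartitionSketch(partitions, "/".join(segments[consumed:]))


print(parse_partition_sketch("year=2026/month=03/data.parquet"))
# ParsedPartitionSketch(partitions={'year': '2026', 'month': '03'}, filename='data.parquet')

print(parse_partition_sketch("year=2026/raw/day=01/data.parquet").filename)
# raw/day=01/data.parquet
```

The second call shows that day=01 is not extracted as a partition, because the non-partition segment raw precedes it.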
See also
- Data Lake Patterns — guide to Hive-style partitioning and data lake layouts