write_delta_merge
WriteDeltaMergeAction
¶
Bases: PipelineAction
This class implements a Merge action for an ETL pipeline.
The MergeIntoDeltaAction merges a Dataframe to Delta Table.
Example
# Basic merge with same column names
Write Delta Merge:
action: WRITE_DELTA_MERGE
options:
table_identifier: my_catalog.my_schema.my_table
key_columns:
- id
- customer_id
cols_to_exclude_from_update:
- created_at
when_matched_update: true
when_not_matched_insert: true
use_partition_pruning: true
# Merge with different source and target column names
Write Delta Merge with Mapping:
action: WRITE_DELTA_MERGE
options:
table_identifier: my_catalog.my_schema.my_table
key_columns:
- customer_id
column_mapping:
customer_id: cust_id
full_name: name
email_address: email
when_matched_update: true
when_not_matched_insert: true
Source code in src/cloe_nessy/pipeline/actions/write_delta_merge.py
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 | |
run(context, *, table_identifier=None, key_columns=None, cols_to_exclude_from_update=None, column_mapping=None, when_matched_update=True, when_matched_delete=False, when_not_matched_insert=True, use_partition_pruning=True, ignore_empty_df=False, create_if_not_exists=True, refresh_table=True, **_)
¶
Merge the dataframe into the delta table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
context
|
PipelineContext
|
Context in which this Action is executed. |
required |
table_identifier
|
str | None
|
The identifier of the table. If passed, the UC Adapter will be used to create a table object. Otherwise the Table object will be created from the table metadata in the context. |
None
|
key_columns
|
list[str] | None
|
List of target column names that form the key for the merge operation. |
None
|
cols_to_exclude_from_update
|
list[str] | None
|
List of target column names to be excluded from the update operation in the target Delta table. |
None
|
column_mapping
|
dict[str, str] | None
|
Mapping from target column names to source column names. Use this when source and target tables have different column names. If a column is not in the mapping, it's assumed to have the same name in both source and target. |
None
|
when_matched_update
|
bool
|
Flag to specify whether to perform an update operation when matching records are found in the target Delta table. |
True
|
when_matched_delete
|
bool
|
Flag to specify whether to perform a delete operation when matching records are found in the target Delta table. |
False
|
when_not_matched_insert
|
bool
|
Flag to specify whether to perform an insert operation when matching records are not found in the target Delta table. |
True
|
use_partition_pruning
|
bool
|
Flag to specify whether to use partition pruning to optimize the performance of the merge operation. |
True
|
ignore_empty_df
|
bool
|
A flag indicating whether to ignore an empty source dataframe. |
False
|
create_if_not_exists
|
bool
|
Create the table if it not exists. |
True
|
refresh_table
|
bool
|
Refresh the table after the transaction. |
True
|
Raises:
| Type | Description |
|---|---|
ValueError
|
If the table does not exist. |
ValueError
|
If the data is not set in the pipeline context. |
ValueError
|
If the table metadata is empty. |
Returns:
| Type | Description |
|---|---|
PipelineContext
|
Pipeline Context |
Source code in src/cloe_nessy/pipeline/actions/write_delta_merge.py
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 | |