Method: Google::Cloud::Spanner::Client#execute_partition_update
Defined in: lib/google/cloud/spanner/client.rb
#execute_partition_update(sql, params: nil, types: nil, exclude_txn_from_change_streams: false, query_options: nil, request_options: nil, call_options: nil) ⇒ Integer
Also known as: execute_pdml
Executes a Partitioned DML SQL statement.
Partitioned DML is an alternate implementation with looser semantics to enable large-scale changes without running into transaction size limits or (accidentally) locking the entire table in one large transaction. At a high level, it partitions the keyspace and executes the statement on each partition in separate internal transactions.
Partitioned DML does not guarantee database-wide atomicity of the statement - it guarantees row-based atomicity, which includes updates to any indices. Additionally, it does not guarantee that it will execute exactly one time against each row - it guarantees "at least once" semantics.
Whereas standard DML statements must be executed within a Transaction (see Transaction#execute_update), Partitioned DML statements are executed outside of a read/write transaction.
Not all DML statements can be executed in Partitioned DML mode; the backend returns an error for statements that are not supported.
DML statements must be fully-partitionable. Specifically, the statement must be expressible as the union of many statements which each access only a single row of the table. InvalidArgumentError is raised if the statement does not qualify.
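As a rough sketch of how a call might look, the helper below issues a parameterized, fully-partitionable statement through execute_partition_update. The helper name, the users table and its columns, and the FakeClient stand-in are assumptions made for illustration; with the real gem, the client would typically come from Google::Cloud::Spanner.new.client "my-instance", "my-database".

```ruby
# Hypothetical helper: clears a column for inactive users in one
# Partitioned DML call. `client` is anything that responds to
# #execute_partition_update, e.g. a Google::Cloud::Spanner::Client.
def clear_inactive_friends client, active: false
  sql = "UPDATE users SET friends = NULL WHERE active = @active"
  # Each affected row is addressed independently, so the statement is
  # fully partitionable; setting a column to NULL is also idempotent.
  client.execute_partition_update sql, params: { active: active }
end

# Stand-in used only so the sketch runs without a Spanner database.
class FakeClient
  attr_reader :calls

  def initialize row_count
    @row_count = row_count
    @calls = []
  end

  def execute_partition_update sql, params: nil, types: nil
    @calls << { sql: sql, params: params }
    @row_count
  end
end

fake = FakeClient.new 42
row_count = clear_inactive_friends fake
puts "rows modified (as reported by the stand-in): #{row_count}"
```

Note that the call is made directly on the client rather than inside a transaction block, matching the description above.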
The method blocks until the update is complete. Because it does not offer exactly-once semantics, the DML statement should be idempotent. The call runs as a Partitioned DML transaction in which a single Partitioned DML statement is executed. Partitioned DML partitions the keyspace and runs the DML statement over each partition in parallel using separate, internal transactions that commit independently. Partitioned DML transactions do not need to be committed.
Partitioned DML executes a single DML statement with a different execution strategy, one that offers different, and often better, scalability for large, table-wide operations than DML run through Transaction#execute_update. Smaller-scoped statements, such as those in an OLTP workload, should use Transaction#execute_update instead.
That said, Partitioned DML is not a drop-in replacement for standard DML used in Transaction#execute_update:
- The DML statement must be fully-partitionable. Specifically, the statement must be expressible as the union of many statements which each access only a single row of the table.
- The statement is not applied atomically to all rows of the table. Rather, the statement is applied atomically to partitions of the table, in independent internal transactions. Secondary index rows are updated atomically with the base table rows.
- Partitioned DML does not guarantee exactly-once execution semantics against a partition. The statement will be applied at least once to each partition. It is strongly recommended that the DML statement be idempotent to avoid unexpected results. For instance, it is potentially dangerous to run a statement such as UPDATE table SET column = column + 1, as it could be run multiple times against some rows.
- The partitions are committed automatically; there is no support for Commit or Rollback. If the call returns an error, or if the client issuing the DML statement dies, it is possible that some rows had the statement executed on them successfully. It is also possible that the statement was never executed against other rows.
- If any error is encountered during the execution of the partitioned DML operation (for instance, a UNIQUE INDEX violation, division by zero, or a value that cannot be stored due to schema constraints), then the operation is stopped at that point and an error is returned. It is possible that at this point, some partitions have been committed (or even committed multiple times), and other partitions have not been run at all.
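The effect of at-least-once execution can be illustrated with a toy model. This is not how Spanner partitions or retries work internally; the in-memory Hash, the partition lists, and the repeat counts are all assumptions chosen to show why an increment is risky while a plain assignment is not.

```ruby
# Toy model only: `rows` is a Hash of id => value, and each partition's
# statement may be applied more than once (at-least-once semantics).
def apply_at_least_once rows, partitions, repeats
  partitions.each_with_index do |ids, index|
    # repeats[index] is how many times this partition's statement runs (>= 1).
    repeats[index].times do
      ids.each { |id| rows[id] = yield(rows[id]) }
    end
  end
  rows
end

partitions = [[1, 2], [3, 4]]
repeats    = [1, 2] # the second partition is executed twice

# Non-idempotent statement: column = column + 1
incremented = apply_at_least_once({ 1 => 0, 2 => 0, 3 => 0, 4 => 0 }, partitions, repeats) { |v| v + 1 }

# Idempotent statement: column = 1
assigned = apply_at_least_once({ 1 => 0, 2 => 0, 3 => 0, 4 => 0 }, partitions, repeats) { |_v| 1 }

p incremented # rows 3 and 4 were incremented twice
p assigned    # every row ends at the same value
```

In the model, the increment leaves rows in the re-executed partition with a different value from the rest, while the assignment produces the same result no matter how often it runs.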
Given the above, Partitioned DML is a good fit for large, database-wide operations that are idempotent, such as deleting old rows from a very large table.
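Because a failed call may leave some partitions committed and others untouched, one possible pattern for an idempotent statement is simply to run it again. The wrapper below is a minimal sketch of that idea, not part of the library: run_idempotent_pdml, FlakyClient, and the events table are hypothetical, and real code would likely rescue only the error classes it considers retryable rather than StandardError.

```ruby
# Hypothetical wrapper: re-runs an idempotent Partitioned DML statement
# when a call fails. Safe only because re-applying the statement to rows
# that were already modified leaves them unchanged.
def run_idempotent_pdml client, sql, params: nil, attempts: 3
  tries = 0
  begin
    tries += 1
    client.execute_partition_update sql, params: params
  rescue StandardError
    # Some partitions may already have committed; the rerun covers the rest.
    retry if tries < attempts
    raise
  end
end

# Stand-in client that fails a fixed number of times, then reports a count.
class FlakyClient
  attr_reader :calls

  def initialize failures, row_count
    @failures = failures
    @row_count = row_count
    @calls = 0
  end

  def execute_partition_update _sql, params: nil
    @calls += 1
    raise "simulated transient failure" if @calls <= @failures
    @row_count
  end
end

flaky = FlakyClient.new 2, 10
cutoff = Time.utc 2020, 1, 1
deleted = run_idempotent_pdml flaky,
                              "DELETE FROM events WHERE created_at < @cutoff",
                              params: { cutoff: cutoff }
puts "deleted #{deleted} rows after #{flaky.calls} calls"
```

A delete with a fixed cutoff suits this pattern because rows removed by an earlier attempt no longer match the WHERE clause on the rerun.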
# File 'lib/google/cloud/spanner/client.rb', line 773

def execute_partition_update sql, params: nil, types: nil,
                             exclude_txn_from_change_streams: false,
                             query_options: nil, request_options: nil,
                             call_options: nil
  ensure_service!

  params, types = Convert.to_input_params_and_types params, types
  request_options = Convert.to_request_options request_options,
                                               tag_type: :request_tag

  route_to_leader = LARHeaders.partition_query
  results = nil
  @pool.with_session do |session|
    transaction = pdml_transaction session, exclude_txn_from_change_streams: exclude_txn_from_change_streams
    results = session.execute_query \
      sql, params: params, types: types, transaction: transaction,
      query_options: query_options, request_options: request_options,
      call_options: call_options, route_to_leader: route_to_leader
  end
  # Stream all PartialResultSet to get ResultSetStats
  results.rows.to_a
  # Raise an error if there is not a row count returned
  if results.row_count.nil?
    raise Google::Cloud::InvalidArgumentError,
          "Partitioned DML statement is invalid."
  end
  results.row_count
end