@@ -7,15 +7,15 @@ Copy Management
77.. warning:: `postgresql.copyman` is a new feature in v1.0.
88
99The `postgresql.copyman` module provides a way to quickly move COPY data coming
10- from one connection to many connections. Alternatively, it can also be sourced
10+ from one connection to many connections. Alternatively, it can be sourced
1111by arbitrary iterators and target arbitrary callables.
1212
1313Statement execution methods offer a way for running COPY operations
1414with iterators, but the cost of allocating objects for each row is too
1515significant for transferring gigabytes of COPY data from one connection to
1616another. The interfaces available on statement objects are primarily intended to
1717be used when transferring COPY data to and from arbitrary Python
18- interfaces.
18+ objects.
1919
2020Direct connection-to-connection COPY operations can be performed using the
2121high-level `postgresql.copyman.transfer` function::
@@ -37,13 +37,13 @@ The `postgresql.copyman.CopyManager` class manages the Producer and the
3737Receivers involved in a COPY operation. Normally, these are
3838`postgresql.copyman.StatementProducer` and
3939`postgresql.copyman.StatementReceiver` instances. Naturally, a Producer is the
40- object that produces the COPY data to be given to the manager's Receivers.
40+ object that produces the COPY data to be given to the Manager's Receivers.
4141
42- Using a CopyManager directly means that there is a need for more control over
42+ Using a Manager directly means that there is a need for more control over
4343the operation. The Manager is both a context manager and an iterator. The
44- context manager interfaces handle initialization and finalization, and the
45- iterator provides an event loop emitting information about the amount of
46- COPY data copied this cycle. Normal usage takes the form::
44+ context manager interfaces handle initialization and finalization of the COPY
45+ state, and the iterator provides an event loop emitting information about the
46+ amount of COPY data transferred this cycle. Normal usage takes the form::
4747
4848 >>> from postgresql import copyman
4949 >>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
@@ -57,15 +57,14 @@ COPY data copied this cycle. Normal usage takes the form::
5757 ... for num_messages, num_bytes in copy:
5858 ... update_rate(num_bytes)
5959
60- The use of the context manager is necessary for ensuring that connection state
61- is properly restored at the end of the COPY.
62-
6360As an alternative to a for-loop inside a with-statement block, the `run` method
6461can be called to perform the operation::
6562
6663 >>> with source.xact(), destination.xact():
6764 ... copyman.CopyManager(producer, receiver).run()
6865
66+ However, calling ``run`` directly offers little benefit over using the
67+ high-level `postgresql.copyman.transfer` function.
6968
7069Manager Interface Points
7170------------------------
@@ -81,35 +80,61 @@ an iterator for controlling the COPY operation.
8180 be used until ``__exit__`` is run.
8281
8382 ``CopyManager.__exit__(typ, val, tb)``
84- Finish, abort, or fail the COPY operation. Aborts in the case of an incomplete
85- COPY or an unidentified exception, and fails in the case of an untrapped
86- fault.
83+ Finish the COPY operation. Fails in the case of an incomplete
84+ COPY, or an untrapped exception. Either returns `None` or raises the generalized
85+ exception, `postgresql.copyman.CopyFail`.
8786
8887 ``CopyManager.__iter__()``
8988 Returns the CopyManager instance.
9089
9190 ``CopyManager.__next__()``
9291 Transfer the next chunk of COPY data to the receivers. Yields a tuple
93- consisting of the number of messages and bytes transferred. Raises
94- `StopIteration` when complete.
92+ consisting of the number of messages and bytes transferred,
93+ ``(num_messages, num_bytes)``. Raises `StopIteration` when complete.
94+
95+ Raises `postgresql.copyman.ReceiverFault` when a Receiver raises an
96+ exception.
97+ Raises `postgresql.copyman.ProducerFault` when the Producer raises an
98+ exception. The original exception is available via the exception's
99+ ``__context__`` attribute.
95100
96101 ``CopyManager.reconcile(faulted_receiver)``
97102 Reconcile a faulted receiver. When a receiver faults, it will no longer
98- be in the receiver set. This method is used to signal to the manager that the
99- problem has been cleared up, and the receiver is again ready to receive.
103+ be in the set of Receivers. This method is used to signal to the manager that the
104+ problem has been corrected, and the receiver is again ready to receive.
105+
106+ ``CopyManager.receivers``
107+ The `builtins.set` of Receivers involved in the COPY operation.
108+
109+ ``CopyManager.producer``
110+ The Producer emitting the data to be given to the Receivers.
100111
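The interface points above can be modelled with a small, standard-library-only
stand-in. This is a sketch of the protocol as described here, not the
library's implementation: ``ToyManager`` and ``ToyCopyFail`` are hypothetical
names, the producer is assumed to be an iterable of lists of byte lines, and
the receivers are assumed to be plain callables.

```python
class ToyCopyFail(Exception):
    """Hypothetical stand-in for the generalized failure raised on exit."""


class ToyManager:
    """Toy model of the manager protocol; not the library's code."""

    def __init__(self, chunks, *receivers):
        self._chunks = iter(chunks)      # producer: iterable of lists of bytes
        self.receivers = set(receivers)  # callables taking a list of lines
        self._done = False

    def __enter__(self):
        return self

    def __iter__(self):
        # The manager is its own iterator.
        return self

    def __next__(self):
        try:
            lines = next(self._chunks)
        except StopIteration:
            self._done = True            # the producer ran to exhaustion
            raise
        for receive in self.receivers:
            receive(lines)
        # (num_messages, num_bytes) for this transfer cycle.
        return len(lines), sum(len(line) for line in lines)

    def __exit__(self, typ, val, tb):
        # Fail on an untrapped exception or an incomplete COPY.
        if typ is not None or not self._done:
            raise ToyCopyFail("COPY did not run to completion")
        return None


received = []


def receive(lines):
    received.extend(lines)


# Complete run: iterate until exhaustion inside the context manager.
with ToyManager([[b"1\n", b"2\n"], [b"3\n"]], receive) as copy:
    progress = [(num_messages, num_bytes) for num_messages, num_bytes in copy]

# Incomplete run: breaking out of the loop makes the exit fail.
try:
    with ToyManager([[b"1\n"], [b"2\n"]], receive) as copy:
        for _ in copy:
            break
except ToyCopyFail:
    broke_early = True
else:
    broke_early = False
```

The second run shows why the loop should normally be allowed to finish: in this
model, leaving the block before the iterator is exhausted is treated the same
way as an untrapped exception.
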
101112
102113Faults
103114======
104115
105- The CopyManager generalizes some exceptions that occur during transfer. While
116+ The CopyManager generalizes any exceptions that occur during transfer. While
106117inside the context manager, `postgresql.copyman.Fault` may be raised if a
107- Receiver raises an exception. The Manager assumes the Fault is fatal to a
108- Receiver, and immediately removes it from the set of target receivers.
109- Additionally, if the Fault goes untrapped, the copy will be aborted.
118+ Receiver or a Producer raises an exception. The exception raised is a
119+ `postgresql.copyman.ProducerFault` in the case of the Producer, and a
120+ `postgresql.copyman.ReceiverFault` in the case of a Receiver.
121+
122+ .. note::
123+ Faults are only raised by `postgresql.copyman.CopyManager.__next__`. The
124+ ``run()`` method will always raise `postgresql.copyman.CopyFail`.
125+
126+ Receiver Faults
127+ ---------------
128+
129+ The Manager assumes the Fault is fatal to a Receiver, and immediately removes
130+ it from the set of target receivers. Additionally, if the Fault exception goes
131+ untrapped, the copy will ultimately fail.
110132
111133The Fault exception references the Manager that raised the exception, and the
112- actual exceptions that occurred, associated with the Receiver that caused them::
134+ actual exceptions that occurred associated with the Receiver that caused them.
135+
136+ In order to identify the exception that caused a Fault, the ``faults`` attribute
137+ on the `postgresql.copyman.ReceiverFault` must be referenced::
113138
114139 >>> from postgresql import copyman
115140 >>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
@@ -124,36 +149,36 @@ actual exceptions that occurred, associated with the Receiver that caused them::
124149 ... try:
125150 ... for num_messages, num_bytes in copy:
126151 ... update_rate(num_bytes)
127- ... except copyman.Fault as cf:
152+ ... except copyman.ReceiverFault as cf:
153+ ... # Access the original exception using the receiver as the key.
128154 ... original_exception = cf.faults[receiver]
129155 ... if unknown_failure(original_exception):
130156 ... ...
131157 ... raise
132158
133159
134- Fault Properties
135- ----------------
160+ ReceiverFault Properties
161+ ~~~~~~~~~~~~~~~~~~~~~~~~
136162
137- The following attributes exist on `postgresql.copyman.Fault` instances:
163+ The following attributes exist on `postgresql.copyman.ReceiverFault` instances:
138164
139- ``Fault.manager``
140- The `postgresql.copyman.CopyManager` instance that raised the exception; the
141- same manager that caught the fault.
165+ ``ReceiverFault.manager``
166+ The subject `postgresql.copyman.CopyManager` instance.
142167
143- ``Fault.faults``
144- A dictionary mapping the Receiver to the exception that occurred. The Manager
145- will give processing to every Receiver, so only one Fault will occur per
146- transfer cycle.
168+ ``ReceiverFault.faults``
169+ A dictionary mapping the Receiver to the exception raised by that Receiver.
147170
148- Reconciliation
149- --------------
150171
151- When a Fault occurs, it is possible that it was not fatal. In such cases the
152- `postgresql.copyman.CopyManager.reconcile` method can be used to reintroduce the
153- Receiver to the Manager's set. That is, when a Fault occurs, the Manager
154- immediately removes the Receiver so that the COPY operation can continue.
172+ Reconciliation
173+ ~~~~~~~~~~~~~~
155174
156- Faults should be trapped from within the Manager's context::
175+ When a `postgresql.copyman.ReceiverFault` is raised, the Manager immediately
176+ removes the Receiver so that the COPY operation can continue. Continuation of
177+ the COPY can occur by trapping the exception and continuing the iteration of the
178+ Manager. However, if the fault is recoverable, the
179+ `postgresql.copyman.CopyManager.reconcile` method must be used to reintroduce the
180+ Receiver into the Manager's set. Faults must be trapped from within the
181+ Manager's context::
157182
158183 >>> import socket
159184 >>> from postgresql import copyman
@@ -169,7 +194,7 @@ Faults should be trapped from within the Manager's context::
169194 ... try:
170195 ... for num_messages, num_bytes in copy:
171196 ... update_rate(num_bytes)
172- ... except copyman.Fault as cf:
197+ ... except copyman.ReceiverFault as cf:
173198 ... if isinstance(cf.faults[receiver], socket.timeout):
174199 ... copy.reconcile(receiver)
175200 ... else:
@@ -180,6 +205,82 @@ so, often, it's best to avoid conditions in which reconciliable Faults may
180205occur.
181206
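The trap-and-reconcile flow can be exercised without a database using stand-in
classes. The sketch below is an illustration under stated assumptions, not the
library's code: ``ToyManager``, ``ToyReceiverFault`` and ``FlakyReceiver`` are
hypothetical, and redelivering the faulted chunk inside ``reconcile`` is a
choice made for this sketch only.

```python
import socket


class ToyReceiverFault(Exception):
    """Hypothetical fault carrying a receiver -> exception mapping."""

    def __init__(self, manager, faults):
        super().__init__("receiver fault")
        self.manager = manager
        self.faults = faults


class ToyManager:
    """Toy manager: removes faulted receivers until they are reconciled."""

    def __init__(self, chunks, *receivers):
        self._chunks = iter(chunks)
        self.receivers = set(receivers)
        self._last = []

    def __iter__(self):
        return self

    def __next__(self):
        lines = next(self._chunks)
        self._last = lines
        faults = {}
        for receiver in list(self.receivers):
            try:
                receiver(lines)
            except Exception as exc:
                faults[receiver] = exc
        if faults:
            # A faulted receiver leaves the set immediately.
            self.receivers.difference_update(faults)
            raise ToyReceiverFault(self, faults)
        return len(lines), sum(len(line) for line in lines)

    def reconcile(self, receiver):
        # Sketch-only behaviour: redeliver the chunk that faulted, then
        # put the receiver back into the set.
        receiver(self._last)
        self.receivers.add(receiver)


class FlakyReceiver:
    """Receiver that times out once, on its second call."""

    def __init__(self):
        self.lines = []
        self.calls = 0

    def __call__(self, lines):
        self.calls += 1
        if self.calls == 2:
            raise socket.timeout("simulated timeout")
        self.lines.extend(lines)


receiver = FlakyReceiver()
copy = ToyManager([[b"1\n"], [b"2\n"], [b"3\n"]], receiver)
reconciled = 0
while True:
    try:
        for num_messages, num_bytes in copy:
            pass
        break  # the iterator was exhausted
    except ToyReceiverFault as cf:
        if isinstance(cf.faults[receiver], socket.timeout):
            copy.reconcile(receiver)
            reconciled += 1
        else:
            raise
```

The outer ``while`` loop matters: an exception ends the ``for`` statement, so
iteration has to be restarted explicitly after the fault has been handled.
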
182207
208+ Producer Faults
209+ ---------------
210+
211+ Producer faults are normally fatal to the COPY operation and should rarely be
212+ trapped. However, the Manager makes no state changes when a Producer faults,
213+ so, unlike Receiver Faults, no reconciliation process is necessary; rather,
214+ if it's safe to continue, the Manager's iterator should continue to be
215+ processed.
216+
217+ ProducerFault Properties
218+ ~~~~~~~~~~~~~~~~~~~~~~~~
219+
220+ The following attributes exist on `postgresql.copyman.ProducerFault` instances:
221+
222+ ``ProducerFault.manager``
223+ The subject `postgresql.copyman.CopyManager`.
224+
225+ ``ProducerFault.__context__``
226+ The original exception raised by the Producer.
227+
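The ``__context__`` attribute is standard Python exception chaining: when an
exception is raised while another is being handled, the interpreter records
the original on the new exception. A minimal sketch, using a hypothetical
``ToyProducerFault`` and a plain function in place of the Manager:

```python
class ToyProducerFault(Exception):
    """Hypothetical stand-in for a producer fault."""

    def __init__(self, manager):
        super().__init__("producer fault")
        self.manager = manager


def next_chunk(manager, produce):
    try:
        return produce()
    except Exception:
        # Raising inside the except block makes Python store the
        # original exception on the new exception's __context__.
        raise ToyProducerFault(manager)


def broken_producer():
    raise OSError("simulated read error")


manager = object()  # placeholder for the subject manager
try:
    next_chunk(manager, broken_producer)
except ToyProducerFault as pf:
    original = pf.__context__
    subject = pf.manager
```
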
228+
229+ Failures
230+ ========
231+
232+ When a COPY operation is aborted, either by an exception or by the iterator
233+ being broken, a `postgresql.copyman.CopyFail` exception will be raised,
234+ generalizing the failure. When a failure occurs, the Manager will *attempt* to
235+ recover and realign the Producer and the Receivers. Regardless of the success of
236+ the recovery process, a `postgresql.copyman.CopyFail` exception will be raised.
237+
238+ The `postgresql.copyman.CopyFail` exception records any exceptions that occur
239+ during the exit of the context manager.
240+
241+
242+ CopyFail Properties
243+ -------------------
244+
245+ The following properties exist on `postgresql.copyman.CopyFail` exceptions:
246+
247+ ``CopyFail.manager``
248+ The Manager whose COPY operation failed.
249+
250+ ``CopyFail.receiver_faults``
251+ A dictionary mapping a `postgresql.copyman.Receiver` to the exception raised
252+ by that Receiver's ``__exit__``. `None` if no exceptions were raised by the
253+ Receivers.
254+
255+ ``CopyFail.producer_fault``
256+ The exception raised by the `postgresql.copyman.Producer`. `None` if no exception was raised.
257+
258+
259+ Producers
260+ =========
261+
262+ The following Producers are available:
263+
264+ ``postgresql.copyman.StatementProducer(postgresql.api.Statement)``
265+ Given a Statement producing COPY data, construct a Producer.
266+
267+ ``postgresql.copyman.IteratorProducer(collections.Iterator)``
268+ Given an Iterator producing *chunks* of COPY lines, construct a Producer to
269+ manage the data coming from the iterator.
270+
271+
272+ Receivers
273+ =========
274+
275+ ``postgresql.copyman.StatementReceiver(postgresql.api.Statement)``
276+ Given a Statement consuming COPY data, construct a Receiver.
277+
278+ ``postgresql.copyman.CallReceiver(callable)``
279+ Given a callable, construct a Receiver that will transmit COPY data in chunks
280+ of lines. That is, the callable will be given a list of COPY lines for each
281+ transfer cycle.
282+
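As an illustration of the shapes involved, the sketch below builds an iterator
that yields chunks of COPY lines and a callable that accepts one list of lines
per cycle. It assumes the default COPY text format (tab-separated columns,
newline-terminated rows) and deliberately ignores NULL markers and escaping.
``chunked_lines`` and ``collect`` are hypothetical helpers, and the plain loop
at the end stands in for the Manager.

```python
def chunked_lines(rows, chunk_size):
    """Yield lists of COPY text-format lines, chunk_size rows at a time."""
    chunk = []
    for row in rows:
        line = "\t".join(str(col) for col in row).encode("utf-8") + b"\n"
        chunk.append(line)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly short, chunk


collected = []


def collect(lines):
    """Callable target: receives a list of COPY lines for each cycle."""
    for line in lines:
        text = line.rstrip(b"\n").decode("utf-8")
        collected.append(tuple(text.split("\t")))


chunks = list(chunked_lines([(1, "a"), (2, "b"), (3, "c")], 2))

# Stand-in for the transfer loop: hand each chunk to the callable.
for lines in chunks:
    collect(lines)
```

With the library itself, an iterator like this would be wrapped in an
``IteratorProducer`` and a callable like ``collect`` in a ``CallReceiver``.
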
283+
183284Terminology
184285===========
185286
@@ -208,16 +309,13 @@ processes of the `postgresql.copyman` module:
208309 necessary steps for a Receiver's reintroduction into the COPY operation after
209310 a Fault.
210311
312+ Failed Copy
313+ A failed copy is an aborted COPY operation. This occurs in situations of
314+ untrapped exceptions or an incomplete COPY. Specifically, the COPY will be
316+ noted as failed in cases where the Manager's iterator is *not* run until
316+ exhaustion.
317+
211318 Realignment
212319 The process of providing compensating data to the receivers so that the
213- connection will be on a message boundary. Occurs when the COPY operation is
214- aborted.
215-
216- Aborted Copy
217- An aborted copy is a COPY operation that terminated prematurely. This happens
218- if a CopyManager's for-loop is terminated early by breaking or by an
219- unidentified exception being raised.
220-
221- Failed Copy
222- A failed copy is an aborted COPY operation that was
223- *terminated due to a fault*, or a producer failure.
320+ connection will be on a message boundary. Occurs when the COPY operation
321+ fails.