Commit bce7101

author
James William Pye
committed
Minor nits in copyman, update documentation.
1 parent aa8cd38 commit bce7101

24 files changed

Lines changed: 630 additions & 336 deletions

postgresql/documentation/copyman.txt

Lines changed: 7 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -127,11 +127,14 @@ Receiver Faults
127127
---------------
128128

129129
The Manager assumes the Fault is fatal to a Receiver, and immediately removes
130-
it from the set of target receivers. Additionally, if the Fault goes untrapped,
131-
the copy will ultimately fail.
130+
it from the set of target receivers. Additionally, if the Fault exception goes
131+
untrapped, the copy will ultimately fail.
132132

133133
The Fault exception references the Manager that raised the exception, and the
134-
actual exceptions that occurred, associated with the Receiver that caused them::
134+
actual exceptions that occurred associated with the Receiver that caused them.
135+
136+
In order to identify the exception that caused a Fault, the ``faults`` attribute
137+
on the `postgresql.copyman.ReceiverFault` must be referenced::
135138

136139
>>> from postgresql import copyman
137140
>>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
@@ -164,8 +167,6 @@ The following attributes exist on `postgresql.copyman.ReceiverFault` instances:
164167

165168
``ReceiverFault.faults``
166169
A dictionary mapping the Receiver to the exception raised by that Receiver.
167-
The Manager will give processing time to every Receiver, so only *one* Fault will
168-
occur per transfer cycle, each iteration.
169170

170171

171172
Reconciliation
@@ -176,7 +177,7 @@ removes the Receiver so that the COPY operation can continue. Continuation of
176177
the COPY can occur by trapping the exception and continuing the iteration of the
177178
Manager. However, if the fault is recoverable, the
178179
`postgresql.copyman.CopyManager.reconcile` method must be used to reintroduce the
179-
Receiver into the Manager's set. Faults should be trapped from within the
180+
Receiver into the Manager's set. Faults must be trapped from within the
180181
Manager's context::
181182

182183
>>> import socket
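
The Receiver Fault handling and reconciliation described in this hunk can be illustrated with a minimal, self-contained sketch. It does not use py-postgresql: ``ToyManager``, ``ToyReceiverFault``, ``FlakyReceiver`` and ``run_copy`` are hypothetical stand-ins written only for this illustration. Redelivering the missed chunk inside ``reconcile`` is an assumption of the toy, not a statement about how `postgresql.copyman.CopyManager.reconcile` is implemented.

```python
# Hypothetical stand-ins for illustration; none of these names exist in py-postgresql.

class ToyReceiverFault(Exception):
    """Raised by ToyManager.__next__ when one or more receivers raise."""
    def __init__(self, manager, faults):
        super().__init__("receiver fault")
        self.manager = manager
        self.faults = faults  # maps each faulted receiver to the exception it raised


class ToyManager:
    """Iterator that hands each chunk of lines to every receiver in its set."""
    def __init__(self, chunks, *receivers):
        self._chunks = iter(chunks)
        self._last = None
        self.receivers = set(receivers)

    def __iter__(self):
        return self

    def __next__(self):
        chunk = next(self._chunks)  # StopIteration here ends the transfer
        self._last = chunk
        faults = {}
        for receiver in list(self.receivers):
            try:
                receiver(chunk)
            except Exception as exc:
                faults[receiver] = exc
        if faults:
            # A fault is assumed fatal to the receiver: drop it from the set.
            for receiver in faults:
                self.receivers.discard(receiver)
            raise ToyReceiverFault(self, faults)
        return (len(chunk), sum(len(line) for line in chunk))

    def reconcile(self, receiver):
        # Toy assumption: redeliver the chunk the receiver missed, then readmit it.
        receiver(self._last)
        self.receivers.add(receiver)


class FlakyReceiver:
    """Callable receiver that times out on its first call only."""
    def __init__(self):
        self.lines = []
        self._failed_once = False

    def __call__(self, chunk):
        if not self._failed_once:
            self._failed_once = True
            raise TimeoutError("simulated timeout")
        self.lines.extend(chunk)


def run_copy(manager):
    """Drive the manager to exhaustion, reconciling receivers that timed out."""
    total_bytes = 0
    while True:
        try:
            for num_messages, num_bytes in manager:
                total_bytes += num_bytes
            return total_bytes
        except ToyReceiverFault as cf:
            for receiver, exc in cf.faults.items():
                if isinstance(exc, TimeoutError):
                    cf.manager.reconcile(receiver)
                else:
                    raise


receiver = FlakyReceiver()
manager = ToyManager([[b"1\n", b"2\n"], [b"3\n"]], receiver)
run_copy(manager)
```

In this sketch, bytes from the cycle in which the fault occurred are not counted by ``run_copy``, because the fault replaces the ``(num_messages, num_bytes)`` tuple for that cycle.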

postgresql/documentation/html/_sources/changes.txt

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,9 @@ Changes
77
* **DEPRECATION**: Removed 2PC support documentation.
88
* **DEPRECATION**: Removed pg_python and pg_dotconf 'scripts'.
99
They are still accessible by python3 -m postgresql.bin.pg_*
10+
* Add support for binary hstore.
11+
* Add support for user service files.
12+
* Implement a Copy manager for direct connection-to-connection COPY operations.
1013
* Added db.do() method for DO-statement support(convenience method).
1114
* Set the default client_min_messages level to WARNING.
1215
NOTICEs are often not desired by programmers, and py-postgresql's
@@ -40,4 +43,3 @@ Changes
4043
* Fix count return from .first() method. Failed to provide an empty
4144
tuple for the rformats of the bind statement.
4245
[Reported by dou dou]
43-
* Add support for binary hstore.

postgresql/documentation/html/_sources/copyman.txt

Lines changed: 150 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -7,15 +7,15 @@ Copy Management
77
.. warning:: `postgresql.copyman` is a new feature in v1.0.
88

99
The `postgresql.copyman` module provides a way to quickly move COPY data coming
10-
from one connection to many connections. Alternatively, it can also be sourced
10+
from one connection to many connections. Alternatively, it can be sourced
1111
by arbitrary iterators and target arbitrary callables.
1212

1313
Statement execution methods offer a way for running COPY operations
1414
with iterators, but the cost of allocating objects for each row is too
1515
significant for transferring gigabytes of COPY data from one connection to
1616
another. The interfaces available on statement objects are primarily intended to
1717
be used when transferring COPY data to and from arbitrary Python
18-
interfaces.
18+
objects.
1919

2020
Direct connection-to-connection COPY operations can be performed using the
2121
high-level `postgresql.copyman.transfer` function::
@@ -37,13 +37,13 @@ The `postgresql.copyman.CopyManager` class manages the Producer and the
3737
Receivers involved in a COPY operation. Normally, these are
3838
`postgresql.copyman.StatementProducer` and
3939
`postgresql.copyman.StatementReceiver` instances. Naturally, a Producer is the
40-
object that produces the COPY data to be given to the manager's Receivers.
40+
object that produces the COPY data to be given to the Manager's Receivers.
4141

42-
Using a CopyManager directly means that there is a need for more control over
42+
Using a Manager directly means that there is a need for more control over
4343
the operation. The Manager is both a context manager and an iterator. The
44-
context manager interfaces handle initialization and finalization, and the
45-
iterator provides an event loop emitting information about the amount of
46-
COPY data copied this cycle. Normal usage takes the form::
44+
context manager interfaces handle initialization and finalization of the COPY
45+
state, and the iterator provides an event loop emitting information about the
46+
amount of COPY data transferred this cycle. Normal usage takes the form::
4747

4848
>>> from postgresql import copyman
4949
>>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
@@ -57,15 +57,14 @@ COPY data copied this cycle. Normal usage takes the form::
5757
... for num_messages, num_bytes in copy:
5858
... update_rate(num_bytes)
5959

60-
The use of the context manager is necessary for ensuring that connection state
61-
is properly restored at the end of the COPY.
62-
6360
As an alternative to a for-loop inside a with-statement block, the `run` method
6461
can be called to perform the operation::
6562

6663
>>> with source.xact(), destination.xact():
6764
... copyman.CopyManager(producer, receiver).run()
6865

66+
However, there is little benefit beyond using the high-level
67+
`postgresql.copyman.transfer` function.
6968

7069
Manager Interface Points
7170
------------------------
@@ -81,35 +80,61 @@ an iterator for controlling the COPY operation.
8180
be used until ``__exit__`` is run.
8281

8382
``CopyManager.__exit__(typ, val, tb)``
84-
Finish, abort, or fail the COPY operation. Aborts in the case of an incomplete
85-
COPY or an unidentified exception, and fails in the case of an untrapped
86-
fault.
83+
Finish the COPY operation. Fails in the case of an incomplete
84+
COPY, or an untrapped exception. Either returns `None` or raises the generalized
85+
exception, `postgresql.copyman.CopyFail`.
8786

8887
``CopyManager.__iter__()``
8988
Returns the CopyManager instance.
9089

9190
``CopyManager.__next__()``
9291
Transfer the next chunk of COPY data to the receivers. Yields a tuple
93-
consisting of the number of messages and bytes transferred. Raises
94-
`StopIteration` when complete.
92+
consisting of the number of messages and bytes transferred,
93+
``(num_messages, num_bytes)``. Raises `StopIteration` when complete.
94+
95+
Raises `postgresql.copyman.ReceiverFault` when a Receiver raises an
96+
exception.
97+
Raises `postgresql.copyman.ProducerFault` when the Producer raises an
98+
exception. The original exception is available via the exception's
99+
``__context__`` attribute.
95100

96101
``CopyManager.reconcile(faulted_receiver)``
97102
Reconcile a faulted receiver. When a receiver faults, it will no longer
98-
be in the receiver set. This method is used to signal to the manager that the
99-
problem has been cleared up, and the receiver is again ready to receive.
103+
be in the set of Receivers. This method is used to signal to the manager that the
104+
problem has been corrected, and the receiver is again ready to receive.
105+
106+
``CopyManager.receivers``
107+
The `builtins.set` of Receivers involved in the COPY operation.
108+
109+
``CopyManager.producer``
110+
The Producer emitting the data to be given to the Receivers.
100111

101112

102113
Faults
103114
======
104115

105-
The CopyManager generalizes some exceptions that occur during transfer. While
116+
The CopyManager generalizes any exceptions that occur during transfer. While
106117
inside the context manager, `postgresql.copyman.Fault` may be raised if a
107-
Receiver raises an exception. The Manager assumes the Fault is fatal to a
108-
Receiver, and immediately removes it from the set of target receivers.
109-
Additionally, if the Fault goes untrapped, the copy will be aborted.
118+
Receiver or a Producer raises an exception: a `postgresql.copyman.ProducerFault`
119+
in the case of the Producer, and `postgresql.copyman.ReceiverFault` in the case
120+
of the Receivers.
121+
122+
.. note::
123+
Faults are only raised by `postgresql.copyman.CopyManager.__next__`. The
124+
``run()`` method will always raise `postgresql.copyman.CopyFail`.
125+
126+
Receiver Faults
127+
---------------
128+
129+
The Manager assumes the Fault is fatal to a Receiver, and immediately removes
130+
it from the set of target receivers. Additionally, if the Fault exception goes
131+
untrapped, the copy will ultimately fail.
110132

111133
The Fault exception references the Manager that raised the exception, and the
112-
actual exceptions that occurred, associated with the Receiver that caused them::
134+
actual exceptions that occurred associated with the Receiver that caused them.
135+
136+
In order to identify the exception that caused a Fault, the ``faults`` attribute
137+
on the `postgresql.copyman.ReceiverFault` must be referenced::
113138

114139
>>> from postgresql import copyman
115140
>>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
@@ -124,36 +149,36 @@ actual exceptions that occurred, associated with the Receiver that caused them::
124149
... try:
125150
... for num_messages, num_bytes in copy:
126151
... update_rate(num_bytes)
127-
... except copyman.Fault as cf:
152+
... except copyman.ReceiverFault as cf:
153+
... # Access the original exception using the receiver as the key.
128154
... original_exception = cf.faults[receiver]
129155
... if unknown_failure(original_exception):
130156
... ...
131157
... raise
132158

133159

134-
Fault Properties
135-
----------------
160+
ReceiverFault Properties
161+
~~~~~~~~~~~~~~~~~~~~~~~~
136162

137-
The following attributes exist on `postgresql.copyman.Fault` instances:
163+
The following attributes exist on `postgresql.copyman.ReceiverFault` instances:
138164

139-
``Fault.manager``
140-
The `postgresql.copyman.CopyManager` instance that raised the exception; the
141-
same manager that caught the fault.
165+
``ReceiverFault.manager``
166+
The subject `postgresql.copyman.CopyManager` instance.
142167

143-
``Fault.faults``
144-
A dictionary mapping the Receiver to the exception that occurred. The Manager
145-
will give processing to every Receiver, so only one Fault will occur per
146-
transfer cycle.
168+
``ReceiverFault.faults``
169+
A dictionary mapping the Receiver to the exception raised by that Receiver.
147170

148-
Reconciliation
149-
--------------
150171

151-
When a Fault occurs, it is possible that it was not fatal. In such cases the
152-
`postgresql.copyman.CopyManager.reconcile` method can be used to reintroduce the
153-
Receiver to the Manager's set. That is, when a Fault occurs, the Manager
154-
immediately removes the Receiver so that the COPY operation can continue.
172+
Reconciliation
173+
~~~~~~~~~~~~~~
155174

156-
Faults should be trapped from within the Manager's context::
175+
When a `postgresql.copyman.ReceiverFault` is raised, the Manager immediately
176+
removes the Receiver so that the COPY operation can continue. Continuation of
177+
the COPY can occur by trapping the exception and continuing the iteration of the
178+
Manager. However, if the fault is recoverable, the
179+
`postgresql.copyman.CopyManager.reconcile` method must be used to reintroduce the
180+
Receiver into the Manager's set. Faults must be trapped from within the
181+
Manager's context::
157182

158183
>>> import socket
159184
>>> from postgresql import copyman
@@ -169,7 +194,7 @@ Faults should be trapped from within the Manager's context::
169194
... try:
170195
... for num_messages, num_bytes in copy:
171196
... update_rate(num_bytes)
172-
... except copyman.Fault as cf:
197+
... except copyman.ReceiverFault as cf:
173198
... if isinstance(cf.faults[receiver], socket.timeout):
174199
... copy.reconcile(receiver)
175200
... else:
@@ -180,6 +205,82 @@ so, often, it's best to avoid conditions in which reconciliable Faults may
180205
occur.
181206

182207

208+
Producer Faults
209+
---------------
210+
211+
Producer faults are normally fatal to the COPY operation and should rarely be
212+
trapped. However, the Manager makes no state changes when a Producer faults,
213+
so, unlike Receiver Faults, no reconciliation process is necessary; rather,
214+
if it's safe to continue, the Manager's iterator should continue to be
215+
processed.
216+
217+
ProducerFault Properties
218+
~~~~~~~~~~~~~~~~~~~~~~~~
219+
220+
The following attributes exist on `postgresql.copyman.ProducerFault` instances:
221+
222+
``ProducerFault.manager``
223+
The subject `postgresql.copyman.CopyManager`.
224+
225+
``ProducerFault.__context__``
226+
The original exception raised by the Producer.
227+
228+
229+
Failures
230+
========
231+
232+
When a COPY operation is aborted, either by an exception or by the iterator
233+
being broken, a `postgresql.copyman.CopyFail` exception will be raised,
234+
generalizing the failure. When a failure occurs, the Manager will *attempt* to
235+
recover and realign the Producer and the Receivers. Regardless of the success of
236+
the recovery process, a `postgresql.copyman.CopyFail` exception will be raised.
237+
238+
The `postgresql.copyman.CopyFail` exception records any exceptions that occur
239+
during the exit of the context manager.
240+
241+
242+
CopyFail Properties
243+
-------------------
244+
245+
The following properties exist on `postgresql.copyman.CopyFail` exceptions:
246+
247+
``CopyFail.manager``
248+
The Manager whose COPY operation failed.
249+
250+
``CopyFail.receiver_faults``
251+
A dictionary mapping a `postgresql.copyman.Receiver` to the exception raised
252+
by that Receiver's ``__exit__``. `None` if no exceptions were raised by the
253+
Receivers.
254+
255+
``CopyFail.producer_fault``
256+
The exception raised by the `postgresql.copyman.Producer`. `None` if no exception was raised.
257+
258+
259+
Producers
260+
=========
261+
262+
The following Producers are available:
263+
264+
``postgresql.copyman.StatementProducer(postgresql.api.Statement)``
265+
Given a Statement producing COPY data, construct a Producer.
266+
267+
``postgresql.copyman.IteratorProducer(collections.Iterator)``
268+
Given an Iterator producing *chunks* of COPY lines, construct a Producer to
269+
manage the data coming from the iterator.
270+
271+
272+
Receivers
273+
=========
274+
275+
``postgresql.copyman.StatementReceiver(postgresql.api.Statement)``
276+
Given a Statement consuming COPY data, construct a Receiver.
277+
278+
``postgresql.copyman.CallReceiver(callable)``
279+
Given a callable, construct a Receiver that will transmit COPY data in chunks
280+
of lines. That is, the callable will be given a list of COPY lines for each
281+
transfer cycle.
282+
283+
183284
Terminology
184285
===========
185286

@@ -208,16 +309,13 @@ processes of the `postgresql.copyman` module:
208309
necessary steps for a Receiver's reintroduction into the COPY operation after
209310
a Fault.
210311

312+
Failed Copy
313+
A failed copy is an aborted COPY operation. This occurs in situations of
314+
untrapped exceptions or an incomplete COPY. Specifically, the COPY will be
315+
noted as failed in cases where the Manager's iterator is *not* run until
316+
exhaustion.
317+
211318
Realignment
212319
The process of providing compensating data to the receivers so that the
213-
connection will be on a message boundary. Occurs when the COPY operation is
214-
aborted.
215-
216-
Aborted Copy
217-
An aborted copy is a COPY operation that terminated prematurely. This happens
218-
if a CopyManager's for-loop is terminated early by breaking or by an
219-
unidentified exception being raised.
220-
221-
Failed Copy
222-
A failed copy is an aborted COPY operation that was
223-
*terminated due to a fault*, or a producer failure.
320+
connection will be on a message boundary. Occurs when the COPY operation
321+
fails.
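
The "Failed Copy" behaviour documented in this diff (the context manager's ``__exit__`` raising a generalized failure on an untrapped exception or on an iterator that was not run to exhaustion) can be sketched without a database. ``ToyCopy``, ``ToyCopyFail`` and ``outcome`` are hypothetical names invented for this illustration and are not part of py-postgresql. The real Manager also attempts recovery and realignment, which this sketch leaves out.

```python
# Hypothetical stand-ins for illustration; none of these names exist in py-postgresql.

class ToyCopyFail(Exception):
    """Generalized failure raised when the toy COPY does not complete."""
    def __init__(self, manager, reason):
        super().__init__(reason)
        self.manager = manager


class ToyCopy:
    """Context manager and iterator over chunks of COPY lines."""
    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self.exhausted = False

    def __enter__(self):
        return self

    def __iter__(self):
        return self

    def __next__(self):
        try:
            chunk = next(self._chunks)
        except StopIteration:
            # Only a fully consumed iterator counts as a complete COPY.
            self.exhausted = True
            raise
        return (len(chunk), sum(len(line) for line in chunk))

    def __exit__(self, typ, val, tb):
        if typ is not None:
            # An exception escaped the block: generalize it, keeping the original as the cause.
            raise ToyCopyFail(self, "untrapped exception") from val
        if not self.exhausted:
            # The loop was broken before the data ran out.
            raise ToyCopyFail(self, "incomplete COPY")
        return None


def outcome(chunks, stop_early=False):
    """Run a toy COPY and report 'ok' or the failure reason."""
    try:
        with ToyCopy(chunks) as copy:
            for num_messages, num_bytes in copy:
                if stop_early:
                    break
    except ToyCopyFail as fail:
        return str(fail)
    return "ok"
```

Calling ``outcome(chunks)`` runs the iterator to exhaustion, while ``outcome(chunks, stop_early=True)`` breaks out of the loop and therefore ends in the failure path.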
