Added Python (Thanks to Beholder) - it fails to build properly using my build system, so there's a precompiled binary included, with a hack in Android.mk to make it work on NDK r4b
pelya
2011-04-01 14:32:12 +03:00
parent a7cf867372
commit 9586a42a30
3953 changed files with 1480069 additions and 1 deletions

@@ -0,0 +1,217 @@
Contributors to the Python Documentation
----------------------------------------
This section lists people who have contributed in some way to the Python
documentation. It is probably not complete -- if you feel that you or
anyone else should be on this list, please let us know (send email to
docs@python.org), and we'll be glad to correct the problem.
.. acks::
* Aahz
* Michael Abbott
* Steve Alexander
* Jim Ahlstrom
* Fred Allen
* A. Amoroso
* Pehr Anderson
* Oliver Andrich
* Heidi Annexstad
* Jesús Cea Avión
* Daniel Barclay
* Chris Barker
* Don Bashford
* Anthony Baxter
* Alexander Belopolsky
* Bennett Benson
* Jonathan Black
* Robin Boerdijk
* Michal Bozon
* Aaron Brancotti
* Georg Brandl
* Keith Briggs
* Ian Bruntlett
* Lee Busby
* Lorenzo M. Catucci
* Carl Cerecke
* Mauro Cicognini
* Gilles Civario
* Mike Clarkson
* Steve Clift
* Dave Cole
* Matthew Cowles
* Jeremy Craven
* Andrew Dalke
* Ben Darnell
* L. Peter Deutsch
* Robert Donohue
* Fred L. Drake, Jr.
* Josip Dzolonga
* Jeff Epler
* Michael Ernst
* Blame Andy Eskilsson
* Carey Evans
* Martijn Faassen
* Carl Feynman
* Dan Finnie
* Hernán Martínez Foffani
* Stefan Franke
* Jim Fulton
* Peter Funk
* Lele Gaifax
* Matthew Gallagher
* Gabriel Genellina
* Ben Gertzfield
* Nadim Ghaznavi
* Jonathan Giddy
* Shelley Gooch
* Nathaniel Gray
* Grant Griffin
* Thomas Guettler
* Anders Hammarquist
* Mark Hammond
* Harald Hanche-Olsen
* Manus Hand
* Gerhard Häring
* Travis B. Hartwell
* Tim Hatch
* Janko Hauser
* Thomas Heller
* Bernhard Herzog
* Magnus L. Hetland
* Konrad Hinsen
* Stefan Hoffmeister
* Albert Hofkamp
* Gregor Hoffleit
* Steve Holden
* Thomas Holenstein
* Gerrit Holl
* Rob Hooft
* Brian Hooper
* Randall Hopper
* Michael Hudson
* Eric Huss
* Jeremy Hylton
* Roger Irwin
* Jack Jansen
* Philip H. Jensen
* Pedro Diaz Jimenez
* Kent Johnson
* Lucas de Jonge
* Andreas Jung
* Robert Kern
* Jim Kerr
* Jan Kim
* Greg Kochanski
* Guido Kollerie
* Peter A. Koren
* Daniel Kozan
* Andrew M. Kuchling
* Dave Kuhlman
* Erno Kuusela
* Thomas Lamb
* Detlef Lannert
* Piers Lauder
* Glyph Lefkowitz
* Robert Lehmann
* Marc-André Lemburg
* Ross Light
* Ulf A. Lindgren
* Everett Lipman
* Mirko Liss
* Martin von Löwis
* Fredrik Lundh
* Jeff MacDonald
* John Machin
* Andrew MacIntyre
* Vladimir Marangozov
* Vincent Marchetti
* Laura Matson
* Daniel May
* Rebecca McCreary
* Doug Mennella
* Paolo Milani
* Skip Montanaro
* Paul Moore
* Ross Moore
* Sjoerd Mullender
* Dale Nagata
* Ng Pheng Siong
* Koray Oner
* Tomas Oppelstrup
* Denis S. Otkidach
* Zooko O'Whielacronx
* Shriphani Palakodety
* William Park
* Joonas Paalasmaa
* Harri Pasanen
* Bo Peng
* Tim Peters
* Benjamin Peterson
* Christopher Petrilli
* Justin D. Pettit
* Chris Phoenix
* François Pinard
* Paul Prescod
* Eric S. Raymond
* Edward K. Ream
* Sean Reifschneider
* Bernhard Reiter
* Armin Rigo
* Wes Rishel
* Armin Ronacher
* Jim Roskind
* Guido van Rossum
* Donald Wallace Rouse II
* Mark Russell
* Nick Russo
* Chris Ryland
* Constantina S.
* Hugh Sasse
* Bob Savage
* Scott Schram
* Neil Schemenauer
* Barry Scott
* Joakim Sernbrant
* Justin Sheehy
* Charlie Shepherd
* Michael Simcich
* Ionel Simionescu
* Michael Sloan
* Gregory P. Smith
* Roy Smith
* Clay Spence
* Nicholas Spies
* Tage Stabell-Kulo
* Frank Stajano
* Anthony Starks
* Greg Stein
* Peter Stoehr
* Mark Summerfield
* Reuben Sumner
* Kalle Svensson
* Jim Tittsler
* David Turner
* Ville Vainio
* Martijn Vries
* Charles G. Waldman
* Greg Ward
* Barry Warsaw
* Corran Webster
* Glyn Webster
* Bob Weiner
* Eddy Welbourne
* Jeff Wheeler
* Mats Wichmann
* Gerry Wiener
* Timothy Wild
* Collin Winter
* Blake Winton
* Dan Wolfe
* Steven Work
* Thomas Wouters
* Ka-Ping Yee
* Rory Yorke
* Moshe Zadka
* Milan Zamazal
* Cheng Zhang

@@ -0,0 +1,151 @@
#
# Makefile for Python documentation
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#

# You can set these variables from the command line.
PYTHON       = python
SVNROOT      = http://svn.python.org/projects
SPHINXOPTS   =
PAPER        =
SOURCES      =
DISTVERSION  = $(shell $(PYTHON) tools/sphinxext/patchlevel.py)

ALLSPHINXOPTS = -b $(BUILDER) -d build/doctrees -D latex_paper_size=$(PAPER) \
                $(SPHINXOPTS) . build/$(BUILDER) $(SOURCES)

.PHONY: help checkout update build html htmlhelp clean coverage dist check

help:
	@echo "Please use \`make <target>' where <target> is one of"
	@echo "  html       to make standalone HTML files"
	@echo "  htmlhelp   to make HTML files and a HTML help project"
	@echo "  latex      to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
	@echo "  text       to make plain text files"
	@echo "  changes    to make an overview over all changed/added/deprecated items"
	@echo "  linkcheck  to check all external links for integrity"
	@echo "  suspicious to check for suspicious markup in output text"
	@echo "  coverage   to check documentation coverage for library and C API"
	@echo "  dist       to create a \"dist\" directory with archived docs for download"

checkout:
	@if [ ! -d tools/sphinx ]; then \
	  echo "Checking out Sphinx..."; \
	  svn checkout $(SVNROOT)/external/Sphinx-0.6.1/sphinx tools/sphinx; \
	fi
	@if [ ! -d tools/docutils ]; then \
	  echo "Checking out Docutils..."; \
	  svn checkout $(SVNROOT)/external/docutils-0.5/docutils tools/docutils; \
	fi
	@if [ ! -d tools/jinja2 ]; then \
	  echo "Checking out Jinja..."; \
	  svn checkout $(SVNROOT)/external/Jinja-2.1.1/jinja2 tools/jinja2; \
	fi
	@if [ ! -d tools/pygments ]; then \
	  echo "Checking out Pygments..."; \
	  svn checkout $(SVNROOT)/external/Pygments-0.11.1/pygments tools/pygments; \
	fi

update: checkout
	svn update tools/sphinx
	svn update tools/docutils
	svn update tools/jinja2
	svn update tools/pygments

build: checkout
	mkdir -p build/$(BUILDER) build/doctrees
	$(PYTHON) tools/sphinx-build.py $(ALLSPHINXOPTS)
	@echo

html: BUILDER = html
html: build
	@echo "Build finished. The HTML pages are in build/html."

htmlhelp: BUILDER = htmlhelp
htmlhelp: build
	@echo "Build finished; now you can run HTML Help Workshop with the" \
	      "build/htmlhelp/pydoc.hhp project file."

latex: BUILDER = latex
latex: build
	@echo "Build finished; the LaTeX files are in build/latex."
	@echo "Run \`make all-pdf' or \`make all-ps' in that directory to" \
	      "run these through (pdf)latex."

text: BUILDER = text
text: build
	@echo "Build finished; the text files are in build/text."

changes: BUILDER = changes
changes: build
	@echo "The overview file is in build/changes."

linkcheck: BUILDER = linkcheck
linkcheck: build
	@echo "Link check complete; look for any errors in the above output " \
	      "or in build/$(BUILDER)/output.txt"

suspicious: BUILDER = suspicious
suspicious: build
	@echo "Suspicious check complete; look for any errors in the above output " \
	      "or in build/$(BUILDER)/suspicious.txt"

coverage: BUILDER = coverage
coverage: build
	@echo "Coverage finished; see c.txt and python.txt in build/coverage"

doctest: BUILDER = doctest
doctest: build
	@echo "Testing of doctests in the sources finished, look at the " \
	      "results in build/doctest/output.txt"

pydoc-topics: BUILDER = pydoc-topics
pydoc-topics: build
	@echo "Building finished; now copy build/pydoc-topics/pydoc_topics.py " \
	      "into the Lib/ directory"

htmlview: html
	$(PYTHON) -c "import webbrowser; webbrowser.open('build/html/index.html')"

clean:
	-rm -rf build/*
	-rm -rf tools/sphinx

dist:
	-rm -rf dist
	mkdir -p dist

	# archive the HTML
	make html
	cp -pPR build/html dist/python-$(DISTVERSION)-docs-html
	tar -C dist -cf dist/python-$(DISTVERSION)-docs-html.tar python-$(DISTVERSION)-docs-html
	bzip2 -9 -k dist/python-$(DISTVERSION)-docs-html.tar
	(cd dist; zip -q -r -9 python-$(DISTVERSION)-docs-html.zip python-$(DISTVERSION)-docs-html)
	rm -r dist/python-$(DISTVERSION)-docs-html
	rm dist/python-$(DISTVERSION)-docs-html.tar

	# archive the text build
	make text
	cp -pPR build/text dist/python-$(DISTVERSION)-docs-text
	tar -C dist -cf dist/python-$(DISTVERSION)-docs-text.tar python-$(DISTVERSION)-docs-text
	bzip2 -9 -k dist/python-$(DISTVERSION)-docs-text.tar
	(cd dist; zip -q -r -9 python-$(DISTVERSION)-docs-text.zip python-$(DISTVERSION)-docs-text)
	rm -r dist/python-$(DISTVERSION)-docs-text
	rm dist/python-$(DISTVERSION)-docs-text.tar

	# archive the A4 latex
	-rm -r build/latex
	make latex PAPER=a4
	(cd build/latex; make clean && make all-pdf && make FMT=pdf zip bz2)
	cp build/latex/docs-pdf.zip dist/python-$(DISTVERSION)-docs-pdf-a4.zip
	cp build/latex/docs-pdf.tar.bz2 dist/python-$(DISTVERSION)-docs-pdf-a4.tar.bz2

	# archive the letter latex
	rm -r build/latex
	make latex PAPER=letter
	(cd build/latex; make clean && make all-pdf && make FMT=pdf zip bz2)
	cp build/latex/docs-pdf.zip dist/python-$(DISTVERSION)-docs-pdf-letter.zip
	cp build/latex/docs-pdf.tar.bz2 dist/python-$(DISTVERSION)-docs-pdf-letter.tar.bz2

check:
	$(PYTHON) tools/rstlint.py -i tools

@@ -0,0 +1,137 @@
Python Documentation README
~~~~~~~~~~~~~~~~~~~~~~~~~~~
This directory contains the reStructuredText (reST) sources to the Python
documentation. You don't need to build them yourself; prebuilt versions are
available at http://docs.python.org/download/.
Documentation on authoring Python documentation, including information about
both style and markup, is available in the "Documenting Python" chapter of the
documentation. There's also a chapter intended to point out differences to
those familiar with the previous docs written in LaTeX.
Building the docs
=================
You need to install Python 2.4 or higher; the toolset used to build the docs is
written in Python. The toolset used to build the documentation is called
*Sphinx*; it is not included in this tree, but maintained separately in the
Python Subversion repository. Also needed are Jinja, a templating engine
(included in Sphinx as a Subversion external), and optionally Pygments, a code
highlighter.
Using make
----------
Luckily, a Makefile has been prepared so that on Unix, provided you have
installed Python and Subversion, you can just run ::
make html
to check out the necessary toolset in the `tools/` subdirectory and build the
HTML output files. To view the generated HTML, point your favorite browser at
the top-level index `build/html/index.html` after running "make".
Available make targets are:
* "html", which builds standalone HTML files for offline viewing.
* "htmlhelp", which builds HTML files and a HTML Help project file usable to
convert them into a single Compiled HTML (.chm) file -- these are popular
under Microsoft Windows, but very handy on every platform.
To create the CHM file, you need to run the Microsoft HTML Help Workshop
over the generated project (.hhp) file.
* "latex", which builds LaTeX source files that can be run with "pdflatex"
to produce PDF documents.
* "text", which builds a plain text file for each source file.
* "linkcheck", which checks all external references to see whether they are
broken, redirected or malformed, and outputs this information to stdout
as well as a plain-text (.txt) file.
* "changes", which builds an overview of all versionadded/versionchanged/
deprecated items in the current version. This is meant as an aid for the
writer of the "What's New" document.
* "coverage", which builds a coverage overview for standard library modules
and C API.
* "pydoc-topics", which builds a Python module containing a dictionary
with plain text documentation for the labels defined in
`tools/sphinxext/pyspecific.py` -- pydoc needs these to show topic
and keyword help.
A "make update" updates the Subversion checkouts in `tools/`.
Without make
------------
You'll need to check out the Sphinx package to the `tools/` directory::
svn co http://svn.python.org/projects/doctools/trunk/sphinx tools/sphinx
Then, you need to install Docutils, either by checking it out via ::
svn co http://svn.python.org/projects/external/docutils-0.4/docutils tools/docutils
or by installing it from http://docutils.sf.net/.
You can optionally also install Pygments, either as a checkout via ::
svn co http://svn.python.org/projects/external/Pygments-0.9/pygments tools/pygments
or from PyPI at http://pypi.python.org/pypi/Pygments.
Then, make an output directory, e.g. under `build/`, and run ::
python tools/sphinx-build.py -b<builder> . build/<outputdirectory>
where `<builder>` is one of html, text, latex, or htmlhelp (for explanations see
the make targets above).
Contributing
============
Bugs in the content should be reported to the Python bug tracker at
http://bugs.python.org.
Bugs in the toolset should be reported in the Sphinx bug tracker at
http://www.bitbucket.org/birkenfeld/sphinx/issues/.
You can also send a mail to the Python Documentation Team at docs@python.org,
and we will process your request as soon as possible.
If you want to help the Documentation Team, you are always welcome. Just send
a mail to docs@python.org.
Copyright notice
================
The Python source is copyrighted, but you can freely use and copy it
as long as you don't change or remove the copyright notice:
----------------------------------------------------------------------
Copyright (c) 2000-2008 Python Software Foundation.
All rights reserved.
Copyright (c) 2000 BeOpen.com.
All rights reserved.
Copyright (c) 1995-2000 Corporation for National Research Initiatives.
All rights reserved.
Copyright (c) 1991-1995 Stichting Mathematisch Centrum.
All rights reserved.
See the file "license.rst" for information on usage and redistribution
of this file, and for a DISCLAIMER OF ALL WARRANTIES.
----------------------------------------------------------------------

@@ -0,0 +1,6 @@
To do
=====
* split very large files and add toctrees
* finish "Documenting Python"
* care about XXX comments

@@ -0,0 +1,33 @@
=====================
About these documents
=====================
These documents are generated from `reStructuredText
<http://docutils.sf.net/rst.html>`_ sources by *Sphinx*, a document processor
specifically written for the Python documentation.
In the online version of these documents, you can submit comments and suggest
changes directly on the documentation pages.
Development of the documentation and its toolchain takes place on the
docs@python.org mailing list. We're always looking for volunteers wanting
to help with the docs, so feel free to send a mail there!
Many thanks go to:
* Fred L. Drake, Jr., the creator of the original Python documentation toolset
and writer of much of the content;
* the `Docutils <http://docutils.sf.net/>`_ project for creating
reStructuredText and the Docutils suite;
* Fredrik Lundh for his `Alternative Python Reference
<http://effbot.org/zone/pyref.htm>`_ project from which Sphinx got many good
ideas.
See :ref:`reporting-bugs` for information on how to report bugs in Python itself.
.. including the ACKS file here so that it can be maintained separately
.. include:: ACKS.txt
It is only with the input and contributions of the Python community
that Python has such wonderful documentation -- Thank You!

@@ -0,0 +1,55 @@
.. _reporting-bugs:
************************
Reporting Bugs in Python
************************
Python is a mature programming language which has established a reputation for
stability. In order to maintain this reputation, the developers would like to
know of any deficiencies you find in Python.
Bug reports should be submitted via the Python Bug Tracker
(http://bugs.python.org/). The bug tracker offers a Web form which allows
pertinent information to be entered and submitted to the developers.
The first step in filing a report is to determine whether the problem has
already been reported. The advantage in doing so, aside from saving the
developers time, is that you learn what has been done to fix it; it may be that
the problem has already been fixed for the next release, or additional
information is needed (in which case you are welcome to provide it if you can!).
To do this, search the bug database using the search box at the top of the page.
If the problem you're reporting is not already in the bug tracker, go back to
the Python Bug Tracker. If you don't already have a tracker account, select the
"Register" link in the sidebar and undergo the registration procedure.
Otherwise, if you're not logged in, enter your credentials and select "Login".
It is not possible to submit a bug report anonymously.
Once logged in, you can submit a bug. Select the "Create New" link in the
sidebar to open the bug reporting form.
The submission form has a number of fields. For the "Title" field, enter a
*very* short description of the problem; less than ten words is good. In the
"Type" field, select the type of your problem; also select the "Component" and
"Versions" to which the bug relates.
In the "Comment" field, describe the problem in detail, including what you
expected to happen and what did happen. Be sure to include whether any
extension modules were involved, and what hardware and software platform you
were using (including version information as appropriate).
Each bug report will be assigned to a developer who will determine what needs to
be done to correct the problem. You will receive an update each time action is
taken on the bug.
.. seealso::
`How to Report Bugs Effectively <http://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_
Article which goes into some detail about how to create a useful bug report.
This describes what kind of information is useful and why it is useful.
`Bug Writing Guidelines <http://developer.mozilla.org/en/docs/Bug_writing_guidelines>`_
Information about writing a good bug report. Some of this is specific to the
Mozilla project, but describes general good practices.

@@ -0,0 +1,26 @@
.. highlightlang:: c
.. _abstract:
**********************
Abstract Objects Layer
**********************
The functions in this chapter interact with Python objects regardless of their
type, or with wide classes of object types (e.g. all numerical types, or all
sequence types). When used on object types for which they do not apply, they
will raise a Python exception.
It is not possible to use these functions on objects that are not properly
initialized, such as a list object that has been created by :cfunc:`PyList_New`,
but whose items have not been set to some non-\ ``NULL`` value yet.
.. toctree::
object.rst
number.rst
sequence.rst
mapping.rst
iter.rst
objbuffer.rst

@@ -0,0 +1,104 @@
.. highlightlang:: c
.. _allocating-objects:
Allocating Objects on the Heap
==============================
.. cfunction:: PyObject* _PyObject_New(PyTypeObject *type)
.. cfunction:: PyVarObject* _PyObject_NewVar(PyTypeObject *type, Py_ssize_t size)
.. cfunction:: void _PyObject_Del(PyObject *op)
.. cfunction:: PyObject* PyObject_Init(PyObject *op, PyTypeObject *type)
Initialize a newly-allocated object *op* with its type and initial reference.
Returns the initialized object. If *type* indicates that the object
participates in the cyclic garbage detector, it is added to the detector's set
of observed objects. Other fields of the object are not affected.
.. cfunction:: PyVarObject* PyObject_InitVar(PyVarObject *op, PyTypeObject *type, Py_ssize_t size)
This does everything :cfunc:`PyObject_Init` does, and also initializes the
length information for a variable-size object.
.. cfunction:: TYPE* PyObject_New(TYPE, PyTypeObject *type)
Allocate a new Python object using the C structure type *TYPE* and the Python
type object *type*. Fields not defined by the Python object header are not
initialized; the object's reference count will be one. The size of the memory
allocation is determined from the :attr:`tp_basicsize` field of the type object.
.. cfunction:: TYPE* PyObject_NewVar(TYPE, PyTypeObject *type, Py_ssize_t size)
Allocate a new Python object using the C structure type *TYPE* and the Python
type object *type*. Fields not defined by the Python object header are not
initialized. The allocated memory allows for the *TYPE* structure plus *size*
fields of the size given by the :attr:`tp_itemsize` field of *type*. This is
useful for implementing objects like tuples, which are able to determine their
size at construction time. Embedding the array of fields into the same
allocation decreases the number of allocations, improving the memory management
efficiency.
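The single-allocation layout described for :cfunc:`PyObject_NewVar` can be sketched in plain C without the interpreter. This is only an illustrative model: ``var_header``, ``var_alloc_size``, ``var_new`` and ``var_items`` are hypothetical names, not part of the Python C API, and the header is not CPython's real object layout. The ``basicsize`` and ``itemsize`` parameters stand in for the :attr:`tp_basicsize` and :attr:`tp_itemsize` fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a variable-size object header; this is not
   CPython's real PyVarObject layout. */
typedef struct {
    size_t refcount;   /* starts at one, like a freshly allocated object */
    size_t size;       /* number of trailing items */
} var_header;

/* Bytes needed for one allocation: the fixed part plus `size` items.
   Overflow of the multiplication is not checked in this sketch. */
size_t var_alloc_size(size_t basicsize, size_t itemsize, size_t size)
{
    return basicsize + itemsize * size;
}

/* Allocate the header and the items in a single block.
   Returns NULL if the allocation fails. */
var_header *var_new(size_t basicsize, size_t itemsize, size_t size)
{
    var_header *op;

    assert(basicsize >= sizeof(var_header));
    op = malloc(var_alloc_size(basicsize, itemsize, size));
    if (op == NULL)
        return NULL;
    op->refcount = 1;
    op->size = size;
    /* As in the description above, the item storage is left uninitialized. */
    return op;
}

/* The items live directly after the fixed-size part of the same block. */
void *var_items(var_header *op, size_t basicsize)
{
    return (char *)op + basicsize;
}
```

A caller would obtain one block with ``var_new(sizeof(var_header), sizeof(int), 3)``, reach the three items through ``var_items``, and release everything with a single ``free``, which is the saving in allocations that the text refers to.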
.. cfunction:: void PyObject_Del(PyObject *op)
Releases memory allocated to an object using :cfunc:`PyObject_New` or
:cfunc:`PyObject_NewVar`. This is normally called from the :attr:`tp_dealloc`
handler specified in the object's type. The fields of the object should not be
accessed after this call as the memory is no longer a valid Python object.
.. cfunction:: PyObject* Py_InitModule(char *name, PyMethodDef *methods)
Create a new module object based on a name and table of functions, returning the
new module object.
.. versionchanged:: 2.3
Older versions of Python did not support *NULL* as the value for the *methods*
argument.
.. cfunction:: PyObject* Py_InitModule3(char *name, PyMethodDef *methods, char *doc)
Create a new module object based on a name and table of functions, returning the
new module object. If *doc* is non-*NULL*, it will be used to define the
docstring for the module.
.. versionchanged:: 2.3
Older versions of Python did not support *NULL* as the value for the *methods*
argument.
.. cfunction:: PyObject* Py_InitModule4(char *name, PyMethodDef *methods, char *doc, PyObject *self, int apiver)
Create a new module object based on a name and table of functions, returning the
new module object. If *doc* is non-*NULL*, it will be used to define the
docstring for the module. If *self* is non-*NULL*, it will be passed to the
functions of the module as their (otherwise *NULL*) first parameter. (This was
added as an experimental feature, and there are no known uses in the current
version of Python.) For *apiver*, the only value which should be passed is
defined by the constant :const:`PYTHON_API_VERSION`.
.. note::
Most uses of this function should probably use :cfunc:`Py_InitModule3`
instead; only use this if you are sure you need it.
.. versionchanged:: 2.3
Older versions of Python did not support *NULL* as the value for the *methods*
argument.
.. cvar:: PyObject _Py_NoneStruct
Object which is visible in Python as ``None``. This should only be accessed
using the ``Py_None`` macro, which evaluates to a pointer to this object.

@@ -0,0 +1,542 @@
.. highlightlang:: c
.. _arg-parsing:
Parsing arguments and building values
=====================================
These functions are useful when creating your own extension functions and
methods. Additional information and examples are available in
:ref:`extending-index`.
The first three functions described here, :cfunc:`PyArg_ParseTuple`,
:cfunc:`PyArg_ParseTupleAndKeywords`, and :cfunc:`PyArg_Parse`, all use *format
strings* which are used to tell the function about the expected arguments. The
format strings use the same syntax for each of these functions.
A format string consists of zero or more "format units." A format unit
describes one Python object; it is usually a single character or a parenthesized
sequence of format units. With a few exceptions, a format unit that is not a
parenthesized sequence normally corresponds to a single address argument to
these functions. In the following description, the quoted form is the format
unit; the entry in (round) parentheses is the Python object type that matches
the format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed.
``s`` (string or Unicode object) [const char \*]
Convert a Python string or Unicode object to a C pointer to a character string.
You must not provide storage for the string itself; a pointer to an existing
string is stored into the character pointer variable whose address you pass.
The C string is NUL-terminated. The Python string must not contain embedded NUL
bytes; if it does, a :exc:`TypeError` exception is raised. Unicode objects are
converted to C strings using the default encoding. If this conversion fails, a
:exc:`UnicodeError` is raised.
``s#`` (string, Unicode or any read buffer compatible object) [const char \*, int (or :ctype:`Py_ssize_t`, see below)]
This variant on ``s`` stores into two C variables, the first one a pointer to a
character string, the second one its length. In this case the Python string may
contain embedded NUL bytes. Unicode objects pass back a pointer to the default
encoded string version of the object if such a conversion is possible. All
other read-buffer compatible objects pass back a reference to the raw internal
data representation.
Starting with Python 2.5 the type of the length argument can be
controlled by defining the macro :cmacro:`PY_SSIZE_T_CLEAN` before
including :file:`Python.h`. If the macro is defined, length is a
:ctype:`Py_ssize_t` rather than an int.
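The difference between ``s`` and ``s#`` comes down to whether the length travels with the pointer. The following plain C sketch models that idea only; ``has_embedded_nul`` and ``visible_length`` are hypothetical helpers, not functions from the Python C API, and no claim is made that the interpreter performs the check this way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the `len` bytes at `data` contain a NUL byte, else 0.
   This models the kind of check that lets "s" reject such strings. */
int has_embedded_nul(const char *data, size_t len)
{
    assert(data != NULL);
    return memchr(data, '\0', len) != NULL;
}

/* Length seen by a consumer that only has the pointer, as with "s":
   everything after the first NUL byte is invisible. */
size_t visible_length(const char *data)
{
    assert(data != NULL);
    return strlen(data);
}
```

For the five data bytes ``a b \0 c d``, ``visible_length`` reports 2 while the real length is 5, which is why data with embedded NUL bytes needs a format unit that also returns the length.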
``s*`` (string, Unicode, or any buffer compatible object) [Py_buffer \*]
Similar to ``s#``, this code fills a Py_buffer structure provided by the caller.
The buffer gets locked, so that the caller can subsequently use the buffer even
inside a ``Py_BEGIN_ALLOW_THREADS`` block; the caller is responsible for calling
``PyBuffer_Release`` with the structure after it has processed the data.
.. versionadded:: 2.6
``z`` (string or ``None``) [const char \*]
Like ``s``, but the Python object may also be ``None``, in which case the C
pointer is set to *NULL*.
``z#`` (string or ``None`` or any read buffer compatible object) [const char \*, int]
This is to ``s#`` as ``z`` is to ``s``.
``z*`` (string or ``None`` or any buffer compatible object) [Py_buffer*]
This is to ``s*`` as ``z`` is to ``s``.
.. versionadded:: 2.6
``u`` (Unicode object) [Py_UNICODE \*]
Convert a Python Unicode object to a C pointer to a NUL-terminated buffer of
16-bit Unicode (UTF-16) data. As with ``s``, there is no need to provide
storage for the Unicode data buffer; a pointer to the existing Unicode data is
stored into the :ctype:`Py_UNICODE` pointer variable whose address you pass.
``u#`` (Unicode object) [Py_UNICODE \*, int]
This variant on ``u`` stores into two C variables, the first one a pointer to a
Unicode data buffer, the second one its length. Non-Unicode objects are handled
by interpreting their read-buffer pointer as a pointer to a :ctype:`Py_UNICODE`
array.
``es`` (string, Unicode object or character buffer compatible object) [const char \*encoding, char \*\*buffer]
This variant on ``s`` is used for encoding Unicode and objects convertible to
Unicode into a character buffer. It only works for encoded data without embedded
NUL bytes.
This format requires two arguments. The first is only used as input, and
must be a :ctype:`const char\*` which points to the name of an encoding as a
NUL-terminated string, or *NULL*, in which case the default encoding is used.
An exception is raised if the named encoding is not known to Python. The
second argument must be a :ctype:`char\*\*`; the value of the pointer it
references will be set to a buffer with the contents of the argument text.
The text will be encoded in the encoding specified by the first argument.
:cfunc:`PyArg_ParseTuple` will allocate a buffer of the needed size, copy the
encoded data into this buffer and adjust *\*buffer* to reference the newly
allocated storage. The caller is responsible for calling :cfunc:`PyMem_Free` to
free the allocated buffer after use.
``et`` (string, Unicode object or character buffer compatible object) [const char \*encoding, char \*\*buffer]
Same as ``es`` except that 8-bit string objects are passed through without
recoding them. Instead, the implementation assumes that the string object uses
the encoding passed in as parameter.
``es#`` (string, Unicode object or character buffer compatible object) [const char \*encoding, char \*\*buffer, int \*buffer_length]
This variant on ``s#`` is used for encoding Unicode and objects convertible to
Unicode into a character buffer. Unlike the ``es`` format, this variant allows
input data which contains NUL characters.
It requires three arguments. The first is only used as input, and must be a
:ctype:`const char\*` which points to the name of an encoding as a
NUL-terminated string, or *NULL*, in which case the default encoding is used.
An exception is raised if the named encoding is not known to Python. The
second argument must be a :ctype:`char\*\*`; the value of the pointer it
references will be set to a buffer with the contents of the argument text.
The text will be encoded in the encoding specified by the first argument.
The third argument must be a pointer to an integer; the referenced integer
will be set to the number of bytes in the output buffer.
There are two modes of operation:
If *\*buffer* points to a *NULL* pointer, the function will allocate a buffer of
the needed size, copy the encoded data into this buffer and set *\*buffer* to
reference the newly allocated storage. The caller is responsible for calling
:cfunc:`PyMem_Free` to free the allocated buffer after usage.
If *\*buffer* points to a non-*NULL* pointer (an already allocated buffer),
:cfunc:`PyArg_ParseTuple` will use this location as the buffer and interpret the
initial value of *\*buffer_length* as the buffer size. It will then copy the
encoded data into the buffer and NUL-terminate it. If the buffer is not large
enough, a :exc:`ValueError` will be set.
In both cases, *\*buffer_length* is set to the length of the encoded data
without the trailing NUL byte.
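The two modes of ``es#`` can be modelled in plain C to make the contract on *buffer* and *buffer_length* concrete. This is a sketch under stated assumptions, not the interpreter's implementation: ``fill_buffer`` is a hypothetical function, ``data`` and ``len`` stand in for the already encoded bytes, ``malloc``/``free`` stand in for the Python memory allocator, and a return value of 0 stands in for a raised exception. It also assumes that a caller-supplied buffer must have room for the data plus the terminating NUL byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the two buffer modes.
   Returns 1 on success and 0 on failure. */
int fill_buffer(const char *data, int len, char **buffer, int *buffer_length)
{
    assert(data != NULL && buffer != NULL && buffer_length != NULL);
    if (len < 0)
        return 0;

    if (*buffer == NULL) {
        /* Mode 1: allocate storage of the needed size; the caller frees it. */
        *buffer = malloc((size_t)len + 1);
        if (*buffer == NULL)
            return 0;
    }
    else if (*buffer_length <= len) {
        /* Mode 2: the caller's buffer cannot hold the data plus the NUL. */
        return 0;
    }

    memcpy(*buffer, data, (size_t)len);
    (*buffer)[len] = '\0';
    /* In both modes, report the length without the trailing NUL byte. */
    *buffer_length = len;
    return 1;
}
```

Passing a pointer that holds *NULL* selects the allocating mode; passing a pointer to an existing array together with its capacity in *\*buffer_length* selects the in-place mode. On failure this sketch leaves *\*buffer_length* untouched.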
``et#`` (string, Unicode object or character buffer compatible object) [const char \*encoding, char \*\*buffer, int \*buffer_length]
Same as ``es#`` except that string objects are passed through without recoding
them. Instead, the implementation assumes that the string object uses the
encoding passed in as parameter.
``b`` (integer) [unsigned char]
Convert a nonnegative Python integer to an unsigned tiny int, stored in a C
:ctype:`unsigned char`.
``B`` (integer) [unsigned char]
Convert a Python integer to a tiny int without overflow checking, stored in a C
:ctype:`unsigned char`.
.. versionadded:: 2.3
``h`` (integer) [short int]
Convert a Python integer to a C :ctype:`short int`.
``H`` (integer) [unsigned short int]
Convert a Python integer to a C :ctype:`unsigned short int`, without overflow
checking.
.. versionadded:: 2.3
``i`` (integer) [int]
Convert a Python integer to a plain C :ctype:`int`.
``I`` (integer) [unsigned int]
Convert a Python integer to a C :ctype:`unsigned int`, without overflow
checking.
.. versionadded:: 2.3
``l`` (integer) [long int]
Convert a Python integer to a C :ctype:`long int`.
``k`` (integer) [unsigned long]
Convert a Python integer or long integer to a C :ctype:`unsigned long` without
overflow checking.
.. versionadded:: 2.3
``L`` (integer) [PY_LONG_LONG]
Convert a Python integer to a C :ctype:`long long`. This format is only
available on platforms that support :ctype:`long long` (or :ctype:`__int64` on
Windows).
``K`` (integer) [unsigned PY_LONG_LONG]
Convert a Python integer or long integer to a C :ctype:`unsigned long long`
without overflow checking. This format is only available on platforms that
support :ctype:`unsigned long long` (or :ctype:`unsigned __int64` on Windows).
.. versionadded:: 2.3
``n`` (integer) [Py_ssize_t]
Convert a Python integer or long integer to a C :ctype:`Py_ssize_t`.
.. versionadded:: 2.5
``c`` (string of length 1) [char]
Convert a Python character, represented as a string of length 1, to a C
:ctype:`char`.
``f`` (float) [float]
Convert a Python floating point number to a C :ctype:`float`.
``d`` (float) [double]
Convert a Python floating point number to a C :ctype:`double`.
``D`` (complex) [Py_complex]
Convert a Python complex number to a C :ctype:`Py_complex` structure.
``O`` (object) [PyObject \*]
Store a Python object (without any conversion) in a C object pointer. The C
program thus receives the actual object that was passed. The object's reference
count is not increased. The pointer stored is not *NULL*.
``O!`` (object) [*typeobject*, PyObject \*]
Store a Python object in a C object pointer. This is similar to ``O``, but
takes two C arguments: the first is the address of a Python type object, the
second is the address of the C variable (of type :ctype:`PyObject\*`) into which
the object pointer is stored. If the Python object does not have the required
type, :exc:`TypeError` is raised.
``O&`` (object) [*converter*, *anything*]
Convert a Python object to a C variable through a *converter* function. This
takes two arguments: the first is a function, the second is the address of a C
variable (of arbitrary type), converted to :ctype:`void \*`. The *converter*
function in turn is called as follows::
status = converter(object, address);
where *object* is the Python object to be converted and *address* is the
:ctype:`void\*` argument that was passed to the :cfunc:`PyArg_Parse\*` function.
The returned *status* should be ``1`` for a successful conversion and ``0`` if
the conversion has failed. When the conversion fails, the *converter* function
should raise an exception and leave the content of *address* unmodified.
``S`` (string) [PyStringObject \*]
Like ``O`` but requires that the Python object is a string object. Raises
:exc:`TypeError` if the object is not a string object. The C variable may also
be declared as :ctype:`PyObject\*`.
``U`` (Unicode string) [PyUnicodeObject \*]
Like ``O`` but requires that the Python object is a Unicode object. Raises
:exc:`TypeError` if the object is not a Unicode object. The C variable may also
be declared as :ctype:`PyObject\*`.
``t#`` (read-only character buffer) [char \*, int]
Like ``s#``, but accepts any object which implements the read-only buffer
interface. The :ctype:`char\*` variable is set to point to the first byte of
the buffer, and the :ctype:`int` is set to the length of the buffer. Only
single-segment buffer objects are accepted; :exc:`TypeError` is raised for all
others.
``w`` (read-write character buffer) [char \*]
Similar to ``s``, but accepts any object which implements the read-write buffer
interface. The caller must determine the length of the buffer by other means,
or use ``w#`` instead. Only single-segment buffer objects are accepted;
:exc:`TypeError` is raised for all others.
``w#`` (read-write character buffer) [char \*, Py_ssize_t]
Like ``s#``, but accepts any object which implements the read-write buffer
interface. The :ctype:`char \*` variable is set to point to the first byte of
the buffer, and the :ctype:`Py_ssize_t` is set to the length of the buffer. Only
single-segment buffer objects are accepted; :exc:`TypeError` is raised for all
others.
``w*`` (read-write byte-oriented buffer) [Py_buffer \*]
This is to ``w`` what ``s*`` is to ``s``.
.. versionadded:: 2.6
``(items)`` (tuple) [*matching-items*]
The object must be a Python sequence whose length is the number of format units
in *items*. The C arguments must correspond to the individual format units in
*items*. Format units for sequences may be nested.
.. note::
Prior to Python version 1.5.2, this format specifier only accepted a tuple
containing the individual parameters, not an arbitrary sequence. Code which
previously caused :exc:`TypeError` to be raised here may now proceed without an
exception. This is not expected to be a problem for existing code.
It is possible to pass Python long integers where integers are requested;
however no proper range checking is done --- the most significant bits are
silently truncated when the receiving field is too small to receive the value
(actually, the semantics are inherited from downcasts in C --- your mileage may
vary).
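The truncation described above can be reproduced in plain C without the interpreter. The following is a minimal sketch, assuming 8-bit ``char`` and 16-bit ``short``; the helper names ``narrow_to_uchar`` and ``narrow_to_ushort`` are hypothetical and not part of the Python API. Unsigned targets are used because narrowing to an unsigned type is well defined in C, whereas narrowing an out-of-range value to a signed type is implementation-defined.

```c
/* Hypothetical helpers, not part of the Python API: they only show what a
 * C downcast does to a value that is too large for the target type. */

/* Keep only the low-order bits that fit in an unsigned char. */
unsigned char narrow_to_uchar(unsigned long value)
{
    return (unsigned char)value;    /* value modulo (UCHAR_MAX + 1) */
}

/* Keep only the low-order bits that fit in an unsigned short. */
unsigned short narrow_to_ushort(unsigned long value)
{
    return (unsigned short)value;   /* value modulo (USHRT_MAX + 1) */
}
```

A format unit that does no overflow checking, such as ``B`` or ``H``, can be expected to lose high-order bits in this way, so range checks belong in the calling code when they matter.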
A few other characters have a meaning in a format string. These may not occur
inside nested parentheses. They are:
``|``
Indicates that the remaining arguments in the Python argument list are optional.
The C variables corresponding to optional arguments should be initialized to
their default value --- when an optional argument is not specified,
:cfunc:`PyArg_ParseTuple` does not touch the contents of the corresponding C
variable(s).
``:``
The list of format units ends here; the string after the colon is used as the
function name in error messages (the "associated value" of the exception that
:cfunc:`PyArg_ParseTuple` raises).
``;``
The list of format units ends here; the string after the semicolon is used as
the error message *instead* of the default error message. ``:`` and ``;``
mutually exclude each other.
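To make the role of ``:`` and ``;`` concrete, here is a small standalone sketch that locates the text following either marker in a format string. The function ``format_suffix`` is hypothetical; it is not how the interpreter parses format strings, only an illustration of the layout described above.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the Python API.
 * Return a pointer to the text that follows the first ':' or ';' in format,
 * or NULL when neither is present. *marker receives the character that was
 * found, or '\0' when the result is NULL.
 */
const char *format_suffix(const char *format, char *marker)
{
    const char *p;

    for (p = format; *p != '\0'; p++) {
        if (*p == ':' || *p == ';') {
            *marker = *p;
            return p + 1;
        }
    }
    *marker = '\0';
    return NULL;
}
```

For the format ``"O|O:ref"`` the suffix is ``ref``, which is the function name reported in error messages; for ``"ii;two ints required"`` the suffix replaces the default error message.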
Note that any Python object references which are provided to the caller are
*borrowed* references; do not decrement their reference count!
Additional arguments passed to these functions must be addresses of variables
whose type is determined by the format string; these are used to store values
from the input tuple. There are a few cases, as described in the list of format
units above, where these parameters are used as input values; they should match
what is specified for the corresponding format unit in that case.
For the conversion to succeed, the *arg* object must match the format
and the format must be exhausted. On success, the
:cfunc:`PyArg_Parse\*` functions return true, otherwise they return
false and raise an appropriate exception. When the
:cfunc:`PyArg_Parse\*` functions fail due to conversion failure in one
of the format units, the variables at the addresses corresponding to that
and the following format units are left untouched.
.. cfunction:: int PyArg_ParseTuple(PyObject *args, const char *format, ...)
Parse the parameters of a function that takes only positional parameters into
local variables. Returns true on success; on failure, it returns false and
raises the appropriate exception.
.. cfunction:: int PyArg_VaParse(PyObject *args, const char *format, va_list vargs)
Identical to :cfunc:`PyArg_ParseTuple`, except that it accepts a va_list rather
than a variable number of arguments.
.. cfunction:: int PyArg_ParseTupleAndKeywords(PyObject *args, PyObject *kw, const char *format, char *keywords[], ...)
Parse the parameters of a function that takes both positional and keyword
parameters into local variables. Returns true on success; on failure, it
returns false and raises the appropriate exception.
.. cfunction:: int PyArg_VaParseTupleAndKeywords(PyObject *args, PyObject *kw, const char *format, char *keywords[], va_list vargs)
Identical to :cfunc:`PyArg_ParseTupleAndKeywords`, except that it accepts a
va_list rather than a variable number of arguments.
.. cfunction:: int PyArg_Parse(PyObject *args, const char *format, ...)
Function used to deconstruct the argument lists of "old-style" functions ---
these are functions which use the :const:`METH_OLDARGS` parameter parsing
method. This is not recommended for use in parameter parsing in new code, and
most code in the standard interpreter has been modified to no longer use this
for that purpose. It does remain a convenient way to decompose other tuples,
however, and may continue to be used for that purpose.
.. cfunction:: int PyArg_UnpackTuple(PyObject *args, const char *name, Py_ssize_t min, Py_ssize_t max, ...)
A simpler form of parameter retrieval which does not use a format string to
specify the types of the arguments. Functions which use this method to retrieve
their parameters should be declared as :const:`METH_VARARGS` in function or
method tables. The tuple containing the actual parameters should be passed as
*args*; it must actually be a tuple. The length of the tuple must be at least
*min* and no more than *max*; *min* and *max* may be equal. Additional
arguments must be passed to the function, each of which should be a pointer to a
:ctype:`PyObject\*` variable; these will be filled in with the values from
*args*; they will contain borrowed references. The variables which correspond
to optional parameters not given by *args* will not be filled in; these should
be initialized by the caller. This function returns true on success and false if
*args* is not a tuple or contains the wrong number of elements; an exception
will be set if there was a failure.
This is an example of the use of this function, taken from the sources for the
:mod:`_weakref` helper module for weak references::
   static PyObject *
   weakref_ref(PyObject *self, PyObject *args)
   {
       PyObject *object;
       PyObject *callback = NULL;
       PyObject *result = NULL;

       if (PyArg_UnpackTuple(args, "ref", 1, 2, &object, &callback)) {
           result = PyWeakref_NewRef(object, callback);
       }
       return result;
   }
The call to :cfunc:`PyArg_UnpackTuple` in this example is entirely equivalent to
this call to :cfunc:`PyArg_ParseTuple`::
PyArg_ParseTuple(args, "O|O:ref", &object, &callback)
.. versionadded:: 2.2
.. cfunction:: PyObject* Py_BuildValue(const char *format, ...)
Create a new value based on a format string similar to those accepted by the
:cfunc:`PyArg_Parse\*` family of functions and a sequence of values. Returns
the value or *NULL* in the case of an error; an exception will be raised if
*NULL* is returned.
:cfunc:`Py_BuildValue` does not always build a tuple. It builds a tuple only if
its format string contains two or more format units. If the format string is
empty, it returns ``None``; if it contains exactly one format unit, it returns
whatever object is described by that format unit. To force it to return a tuple
of size 0 or one, parenthesize the format string.
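The rule above depends on how many format units appear at the top level of the format string. The sketch below counts them in plain C; ``count_top_level_units`` and ``classify_result`` are hypothetical names, and the code is a simplified model rather than the interpreter's implementation. It assumes that ``#`` and ``&`` modify the preceding unit and that space, tab, colon and comma are ignored separators.

```c
/* Hypothetical model of the "how many format units" rule. */

typedef enum { RESULT_NONE, RESULT_SINGLE_OBJECT, RESULT_TUPLE } result_kind;

/* Count the format units that are not nested inside (), [] or {}. */
int count_top_level_units(const char *format)
{
    int count = 0;
    int depth = 0;

    for (; *format != '\0'; format++) {
        switch (*format) {
        case '(': case '[': case '{':
            if (depth == 0)
                count++;            /* a bracketed group is a single unit */
            depth++;
            break;
        case ')': case ']': case '}':
            if (depth > 0)
                depth--;
            break;
        case '#': case '&':         /* modifiers of the preceding unit */
        case ' ': case '\t': case ':': case ',':   /* ignored separators */
            break;
        default:
            if (depth == 0)
                count++;
            break;
        }
    }
    return count;
}

/* Map the unit count to the kind of object that would be built. */
result_kind classify_result(const char *format)
{
    int n = count_top_level_units(format);

    if (n == 0)
        return RESULT_NONE;
    if (n == 1)
        return RESULT_SINGLE_OBJECT;
    return RESULT_TUPLE;
}
```

Under this model ``"i"`` and ``"(i)"`` both contain one top-level unit, but the unit in ``"(i)"`` is itself a tuple, which is why parenthesizing the format string forces a tuple of size one.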
When memory buffers are passed as parameters to supply data to build objects, as
for the ``s`` and ``s#`` formats, the required data is copied. Buffers provided
by the caller are never referenced by the objects created by
:cfunc:`Py_BuildValue`. In other words, if your code invokes :cfunc:`malloc`
and passes the allocated memory to :cfunc:`Py_BuildValue`, your code is
responsible for calling :cfunc:`free` for that memory once
:cfunc:`Py_BuildValue` returns.
In the following description, the quoted form is the format unit; the entry in
(round) parentheses is the Python object type that the format unit will return;
and the entry in [square] brackets is the type of the C value(s) to be passed.
The characters space, tab, colon and comma are ignored in format strings (but
not within format units such as ``s#``). This can be used to make long format
strings a tad more readable.
``s`` (string) [char \*]
Convert a null-terminated C string to a Python object. If the C string pointer
is *NULL*, ``None`` is used.
``s#`` (string) [char \*, int]
Convert a C string and its length to a Python object. If the C string pointer
is *NULL*, the length is ignored and ``None`` is returned.
``z`` (string or ``None``) [char \*]
Same as ``s``.
``z#`` (string or ``None``) [char \*, int]
Same as ``s#``.
``u`` (Unicode string) [Py_UNICODE \*]
Convert a null-terminated buffer of Unicode (UCS-2 or UCS-4) data to a Python
Unicode object. If the Unicode buffer pointer is *NULL*, ``None`` is returned.
``u#`` (Unicode string) [Py_UNICODE \*, int]
Convert a Unicode (UCS-2 or UCS-4) data buffer and its length to a Python
Unicode object. If the Unicode buffer pointer is *NULL*, the length is ignored
and ``None`` is returned.
``i`` (integer) [int]
Convert a plain C :ctype:`int` to a Python integer object.
``b`` (integer) [char]
Convert a plain C :ctype:`char` to a Python integer object.
``h`` (integer) [short int]
Convert a plain C :ctype:`short int` to a Python integer object.
``l`` (integer) [long int]
Convert a C :ctype:`long int` to a Python integer object.
``B`` (integer) [unsigned char]
Convert a C :ctype:`unsigned char` to a Python integer object.
``H`` (integer) [unsigned short int]
Convert a C :ctype:`unsigned short int` to a Python integer object.
``I`` (integer/long) [unsigned int]
Convert a C :ctype:`unsigned int` to a Python integer object or a Python long
integer object, if it is larger than ``sys.maxint``.
``k`` (integer/long) [unsigned long]
Convert a C :ctype:`unsigned long` to a Python integer object or a Python long
integer object, if it is larger than ``sys.maxint``.
``L`` (long) [PY_LONG_LONG]
Convert a C :ctype:`long long` to a Python long integer object. Only available
on platforms that support :ctype:`long long`.
``K`` (long) [unsigned PY_LONG_LONG]
Convert a C :ctype:`unsigned long long` to a Python long integer object. Only
available on platforms that support :ctype:`unsigned long long`.
``n`` (int) [Py_ssize_t]
Convert a C :ctype:`Py_ssize_t` to a Python integer or long integer.
.. versionadded:: 2.5
``c`` (string of length 1) [char]
Convert a C :ctype:`int` representing a character to a Python string of length
1.
``d`` (float) [double]
Convert a C :ctype:`double` to a Python floating point number.
``f`` (float) [float]
Same as ``d``.
``D`` (complex) [Py_complex \*]
Convert a C :ctype:`Py_complex` structure to a Python complex number.
``O`` (object) [PyObject \*]
Pass a Python object untouched (except for its reference count, which is
incremented by one). If the object passed in is a *NULL* pointer, it is assumed
that this was caused because the call producing the argument found an error and
set an exception. Therefore, :cfunc:`Py_BuildValue` will return *NULL* but won't
raise an exception. If no exception has been raised yet, :exc:`SystemError` is
set.
``S`` (object) [PyObject \*]
Same as ``O``.
``N`` (object) [PyObject \*]
Same as ``O``, except it doesn't increment the reference count on the object.
Useful when the object is created by a call to an object constructor in the
argument list.
``O&`` (object) [*converter*, *anything*]
Convert *anything* to a Python object through a *converter* function. The
function is called with *anything* (which should be compatible with :ctype:`void
\*`) as its argument and should return a "new" Python object, or *NULL* if an
error occurred.
``(items)`` (tuple) [*matching-items*]
Convert a sequence of C values to a Python tuple with the same number of items.
``[items]`` (list) [*matching-items*]
Convert a sequence of C values to a Python list with the same number of items.
``{items}`` (dictionary) [*matching-items*]
Convert a sequence of C values to a Python dictionary. Each pair of consecutive
C values adds one item to the dictionary, serving as key and value,
respectively.
If there is an error in the format string, the :exc:`SystemError` exception is
set and *NULL* returned.
.. cfunction:: PyObject* Py_VaBuildValue(const char *format, va_list vargs)
Identical to :cfunc:`Py_BuildValue`, except that it accepts a va_list
rather than a variable number of arguments.

@@ -0,0 +1,54 @@
.. highlightlang:: c
.. _boolobjects:
Boolean Objects
---------------
Booleans in Python are implemented as a subclass of integers. There are only
two booleans, :const:`Py_False` and :const:`Py_True`. As such, the normal
creation and deletion functions don't apply to booleans. The following macros
are available, however.
.. cfunction:: int PyBool_Check(PyObject *o)
Return true if *o* is of type :cdata:`PyBool_Type`.
.. versionadded:: 2.3
.. cvar:: PyObject* Py_False
The Python ``False`` object. This object has no methods. It needs to be
treated just like any other object with respect to reference counts.
.. cvar:: PyObject* Py_True
The Python ``True`` object. This object has no methods. It needs to be treated
just like any other object with respect to reference counts.
.. cmacro:: Py_RETURN_FALSE
Return :const:`Py_False` from a function, properly incrementing its reference
count.
.. versionadded:: 2.4
.. cmacro:: Py_RETURN_TRUE
Return :const:`Py_True` from a function, properly incrementing its reference
count.
.. versionadded:: 2.4
.. cfunction:: PyObject* PyBool_FromLong(long v)
Return a new reference to :const:`Py_True` or :const:`Py_False` depending on the
truth value of *v*.
.. versionadded:: 2.3

@@ -0,0 +1,119 @@
.. highlightlang:: c
.. _bufferobjects:
Buffer Objects
--------------
.. sectionauthor:: Greg Stein <gstein@lyra.org>
.. index::
object: buffer
single: buffer interface
Python objects implemented in C can export a group of functions called the
"buffer interface." These functions can be used by an object to expose its data
in a raw, byte-oriented format. Clients of the object can use the buffer
interface to access the object data directly, without needing to copy it first.
Two examples of objects that support the buffer interface are strings and
arrays. The string object exposes the character contents in the buffer
interface's byte-oriented form. An array can also expose its contents, but it
should be noted that array elements may be multi-byte values.
An example user of the buffer interface is the file object's :meth:`write`
method. Any object that can export a series of bytes through the buffer
interface can be written to a file. There are a number of format codes to
:cfunc:`PyArg_ParseTuple` that operate against an object's buffer interface,
returning data from the target object.
.. index:: single: PyBufferProcs
More information on the buffer interface is provided in the section
:ref:`buffer-structs`, under the description for :ctype:`PyBufferProcs`.
A "buffer object" is defined in the :file:`bufferobject.h` header (included by
:file:`Python.h`). These objects look very similar to string objects at the
Python programming level: they support slicing, indexing, concatenation, and
some other standard string operations. However, their data can come from one of
two sources: from a block of memory, or from another object which exports the
buffer interface.
Buffer objects are useful as a way to expose the data from another object's
buffer interface to the Python programmer. They can also be used as a zero-copy
slicing mechanism. Using their ability to reference a block of memory, it is
possible to expose any data to the Python programmer quite easily. The memory
could be a large, constant array in a C extension, it could be a raw block of
memory for manipulation before passing to an operating system library, or it
could be used to pass around structured data in its native, in-memory format.
.. ctype:: PyBufferObject
This subtype of :ctype:`PyObject` represents a buffer object.
.. cvar:: PyTypeObject PyBuffer_Type
.. index:: single: BufferType (in module types)
The instance of :ctype:`PyTypeObject` which represents the Python buffer type;
it is the same object as ``buffer`` and ``types.BufferType`` in the Python
layer.
.. cvar:: int Py_END_OF_BUFFER
This constant may be passed as the *size* parameter to
:cfunc:`PyBuffer_FromObject` or :cfunc:`PyBuffer_FromReadWriteObject`. It
indicates that the new :ctype:`PyBufferObject` should refer to *base* object
from the specified *offset* to the end of its exported buffer. Using this
enables the caller to avoid querying the *base* object for its length.
.. cfunction:: int PyBuffer_Check(PyObject *p)
Return true if the argument has type :cdata:`PyBuffer_Type`.
.. cfunction:: PyObject* PyBuffer_FromObject(PyObject *base, Py_ssize_t offset, Py_ssize_t size)
Return a new read-only buffer object. This raises :exc:`TypeError` if *base*
doesn't support the read-only buffer protocol or doesn't provide exactly one
buffer segment, or it raises :exc:`ValueError` if *offset* is less than zero.
The buffer will hold a reference to the *base* object, and the buffer's contents
will refer to the *base* object's buffer interface, starting at position
*offset* and extending for *size* bytes. If *size* is :const:`Py_END_OF_BUFFER`,
then the new buffer's contents extend to the length of the *base* object's
exported buffer data.
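The way *offset* and *size* select a window of the *base* buffer can be modelled in plain C. This is a sketch under stated assumptions: ``resolve_window`` and ``END_OF_BUFFER`` are hypothetical stand-ins (``long`` replaces :ctype:`Py_ssize_t` to keep the code self-contained), and the clamping of an oversized window to the end of the base buffer is an assumption, not something the text above guarantees.

```c
#define END_OF_BUFFER (-1L)   /* hypothetical stand-in for Py_END_OF_BUFFER */

/*
 * Compute the byte range [*start, *start + *length) that a view would cover
 * in a base buffer of base_len bytes (base_len must not be negative).
 * Returns 0 on success, or -1 when offset is negative or when size is
 * negative and not END_OF_BUFFER.
 */
int resolve_window(long base_len, long offset, long size,
                   long *start, long *length)
{
    if (offset < 0)
        return -1;
    if (size < 0 && size != END_OF_BUFFER)
        return -1;
    if (offset > base_len)
        offset = base_len;           /* assumed: empty view at the end */
    if (size == END_OF_BUFFER || size > base_len - offset)
        size = base_len - offset;    /* extend or clamp to the end of base */
    *start = offset;
    *length = size;
    return 0;
}
```

No bytes are copied at any point; only a start position and a length are computed, which is what makes a buffer object usable as a zero-copy slice.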
.. cfunction:: PyObject* PyBuffer_FromReadWriteObject(PyObject *base, Py_ssize_t offset, Py_ssize_t size)
Return a new writable buffer object. Parameters and exceptions are similar to
those for :cfunc:`PyBuffer_FromObject`. If the *base* object does not export
the writable buffer protocol, then :exc:`TypeError` is raised.
.. cfunction:: PyObject* PyBuffer_FromMemory(void *ptr, Py_ssize_t size)
Return a new read-only buffer object that reads from a specified location in
memory, with a specified size. The caller is responsible for ensuring that the
memory buffer, passed in as *ptr*, is not deallocated while the returned buffer
object exists. Raises :exc:`ValueError` if *size* is less than zero. Note that
:const:`Py_END_OF_BUFFER` may *not* be passed for the *size* parameter;
:exc:`ValueError` will be raised in that case.
.. cfunction:: PyObject* PyBuffer_FromReadWriteMemory(void *ptr, Py_ssize_t size)
Similar to :cfunc:`PyBuffer_FromMemory`, but the returned buffer is writable.
.. cfunction:: PyObject* PyBuffer_New(Py_ssize_t size)
Return a new writable buffer object that maintains its own memory buffer of
*size* bytes. :exc:`ValueError` is raised if *size* is negative.
Note that the memory buffer (as returned by :cfunc:`PyObject_AsWriteBuffer`) is
not specifically aligned.

@@ -0,0 +1,78 @@
.. highlightlang:: c
.. _bytearrayobjects:
Byte Array Objects
------------------
.. index:: object: bytearray
.. versionadded:: 2.6
.. ctype:: PyByteArrayObject
This subtype of :ctype:`PyObject` represents a Python bytearray object.
.. cvar:: PyTypeObject PyByteArray_Type
This instance of :ctype:`PyTypeObject` represents the Python bytearray type;
it is the same object as ``bytearray`` in the Python layer.
.. cfunction:: int PyByteArray_Check(PyObject *o)
Return true if the object *o* is a bytearray object or an instance of a
subtype of the bytearray type.
.. cfunction:: int PyByteArray_CheckExact(PyObject *o)
Return true if the object *o* is a bytearray object, but not an instance of a
subtype of the bytearray type.
.. cfunction:: PyObject* PyByteArray_FromObject(PyObject *o)
Return a new bytearray object from any object, *o*, that implements the
buffer protocol.
.. XXX expand about the buffer protocol, at least somewhere
.. cfunction:: PyObject* PyByteArray_FromStringAndSize(const char *string, Py_ssize_t len)
Create a new bytearray object from *string* and its length, *len*. On
failure, *NULL* is returned.
.. cfunction:: Py_ssize_t PyByteArray_Size(PyObject *bytearray)
Return the size of *bytearray* after checking for a *NULL* pointer.
.. cfunction:: Py_ssize_t PyByteArray_GET_SIZE(PyObject *bytearray)
Macro version of :cfunc:`PyByteArray_Size` that doesn't do pointer checking.
.. cfunction:: char* PyByteArray_AsString(PyObject *bytearray)
Return the contents of *bytearray* as a char array after checking for a
*NULL* pointer.
.. cfunction:: char* PyByteArray_AS_STRING(PyObject *bytearray)
Macro version of :cfunc:`PyByteArray_AsString` that doesn't check pointers.
.. cfunction:: PyObject* PyByteArray_Concat(PyObject *a, PyObject *b)
Concatenate bytearrays *a* and *b* and return a new bytearray with the result.
.. cfunction:: int PyByteArray_Resize(PyObject *bytearray, Py_ssize_t len)
Resize the internal buffer of *bytearray* to *len*.

@@ -0,0 +1,62 @@
.. highlightlang:: c
.. _cell-objects:
Cell Objects
------------
"Cell" objects are used to implement variables referenced by multiple scopes.
For each such variable, a cell object is created to store the value; the local
variables of each stack frame that references the value contain a reference to
the cells from outer scopes which also use that variable. When the value is
accessed, the value contained in the cell is used instead of the cell object
itself. This de-referencing of the cell object requires support from the
generated byte-code; these are not automatically de-referenced when accessed.
Cell objects are not likely to be useful elsewhere.
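The sharing arrangement can be pictured with a small standalone model. The types ``cell`` and ``frame`` and the two accessor functions below are hypothetical stand-ins, not the interpreter's data structures; they only show why a store made through one scope is visible through another when both refer to the same cell.

```c
/* Hypothetical stand-in for a cell: one shared slot for the value. */
typedef struct {
    long value;
} cell;

/* Hypothetical stand-in for a stack frame that reaches the variable
 * through a cell instead of holding the value directly. */
typedef struct {
    cell *captured;
} frame;

/* Store through the cell, so every frame sharing it sees the new value. */
void frame_store(frame *f, long value)
{
    f->captured->value = value;
}

/* Load by dereferencing the cell; the cell itself is never the result. */
long frame_load(const frame *f)
{
    return f->captured->value;
}
```

The explicit dereference in ``frame_load`` corresponds to the support that the generated byte-code has to provide, since cells are not dereferenced automatically.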
.. ctype:: PyCellObject
The C structure used for cell objects.
.. cvar:: PyTypeObject PyCell_Type
The type object corresponding to cell objects.
.. cfunction:: int PyCell_Check(ob)
Return true if *ob* is a cell object; *ob* must not be *NULL*.
.. cfunction:: PyObject* PyCell_New(PyObject *ob)
Create and return a new cell object containing the value *ob*. The parameter may
be *NULL*.
.. cfunction:: PyObject* PyCell_Get(PyObject *cell)
Return the contents of the cell *cell*.
.. cfunction:: PyObject* PyCell_GET(PyObject *cell)
Return the contents of the cell *cell*, but without checking that *cell* is
non-*NULL* and a cell object.
.. cfunction:: int PyCell_Set(PyObject *cell, PyObject *value)
Set the contents of the cell object *cell* to *value*. This releases the
reference to any current content of the cell. *value* may be *NULL*. *cell*
must be non-*NULL*; if it is not a cell object, ``-1`` will be returned. On
success, ``0`` will be returned.
.. cfunction:: void PyCell_SET(PyObject *cell, PyObject *value)
Sets the value of the cell object *cell* to *value*. No reference counts are
adjusted, and no checks are made for safety; *cell* must be non-*NULL* and must
be a cell object.

@@ -0,0 +1,65 @@
.. highlightlang:: c
.. _classobjects:
Class and Instance Objects
--------------------------
.. index:: object: class
Note that the class objects described here represent old-style classes, which
will go away in Python 3. When creating new types for extension modules, you
will want to work with type objects (section :ref:`typeobjects`).
.. ctype:: PyClassObject
The C structure of the objects used to describe built-in classes.
.. cvar:: PyObject* PyClass_Type
.. index:: single: ClassType (in module types)
This is the type object for class objects; it is the same object as
``types.ClassType`` in the Python layer.
.. cfunction:: int PyClass_Check(PyObject *o)
Return true if the object *o* is a class object, including instances of types
derived from the standard class object. Return false in all other cases.
.. cfunction:: int PyClass_IsSubclass(PyObject *klass, PyObject *base)
Return true if *klass* is a subclass of *base*. Return false in all other cases.
.. index:: object: instance
There are very few functions specific to instance objects.
.. cvar:: PyTypeObject PyInstance_Type
Type object for class instances.
.. cfunction:: int PyInstance_Check(PyObject *obj)
Return true if *obj* is an instance.
.. cfunction:: PyObject* PyInstance_New(PyObject *class, PyObject *arg, PyObject *kw)
Create a new instance of a specific class. The parameters *arg* and *kw* are
used as the positional and keyword parameters to the object's constructor.
.. cfunction:: PyObject* PyInstance_NewRaw(PyObject *class, PyObject *dict)
Create a new instance of a specific class without calling its constructor.
*class* is the class of new object. The *dict* parameter will be used as the
object's :attr:`__dict__`; if *NULL*, a new dictionary will be created for the
instance.

@@ -0,0 +1,56 @@
.. highlightlang:: c
.. _cobjects:
CObjects
--------
.. index:: object: CObject
Refer to :ref:`using-cobjects` for more information on using these objects.
.. ctype:: PyCObject
This subtype of :ctype:`PyObject` represents an opaque value, useful for C
extension modules that need to pass an opaque value (as a :ctype:`void\*`
pointer) through Python code to other C code. It is often used to make a C
function pointer defined in one module available to other modules, so the
regular import mechanism can be used to access C APIs defined in dynamically
loaded modules.
.. cfunction:: int PyCObject_Check(PyObject *p)
Return true if its argument is a :ctype:`PyCObject`.
.. cfunction:: PyObject* PyCObject_FromVoidPtr(void* cobj, void (*destr)(void *))
Create a :ctype:`PyCObject` from the ``void *`` *cobj*. The *destr* function
will be called when the object is reclaimed, unless it is *NULL*.
.. cfunction:: PyObject* PyCObject_FromVoidPtrAndDesc(void* cobj, void* desc, void (*destr)(void *, void *))
Create a :ctype:`PyCObject` from the :ctype:`void \*` *cobj*. The *destr*
function will be called when the object is reclaimed. The *desc* argument can
be used to pass extra callback data for the destructor function.
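The pairing of an opaque pointer with an optional destructor can be sketched without the interpreter. Everything named below (``opaque_box``, ``opaque_box_make``, ``opaque_box_get``, ``opaque_box_release``, ``count_release``) is a hypothetical stand-in; the real :ctype:`PyCObject` is reference counted and reclaimed by the interpreter, which this model replaces with an explicit release call.

```c
#include <stddef.h>

/* Hypothetical stand-in for an object carrying an opaque pointer. */
typedef struct {
    void *pointer;
    void (*destructor)(void *);
} opaque_box;

/* Wrap pointer together with an optional destructor (may be NULL). */
opaque_box opaque_box_make(void *pointer, void (*destructor)(void *))
{
    opaque_box box;

    box.pointer = pointer;
    box.destructor = destructor;
    return box;
}

/* Return the pointer the box was created with. */
void *opaque_box_get(const opaque_box *box)
{
    return box->pointer;
}

/* Model reclamation: call the destructor unless it is NULL. */
void opaque_box_release(opaque_box *box)
{
    if (box->destructor != NULL)
        box->destructor(box->pointer);
    box->pointer = NULL;
    box->destructor = NULL;
}

/* Example destructor: treats the pointer as an int counter and bumps it. */
void count_release(void *pointer)
{
    ++*(int *)pointer;
}
```

The code that stores the pointer and the code that retrieves it must agree on what the ``void *`` really points to; nothing in the box records the type.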
.. cfunction:: void* PyCObject_AsVoidPtr(PyObject* self)
Return the object :ctype:`void \*` that the :ctype:`PyCObject` *self* was
created with.
.. cfunction:: void* PyCObject_GetDesc(PyObject* self)
Return the description :ctype:`void \*` that the :ctype:`PyCObject` *self* was
created with.
.. cfunction:: int PyCObject_SetVoidPtr(PyObject* self, void* cobj)
Set the void pointer inside *self* to *cobj*. The :ctype:`PyCObject` must not
have an associated destructor. Return true on success, false on failure.

@@ -0,0 +1,132 @@
.. highlightlang:: c
.. _complexobjects:
Complex Number Objects
----------------------
.. index:: object: complex number
Python's complex number objects are implemented as two distinct types when
viewed from the C API: one is the Python object exposed to Python programs, and
the other is a C structure which represents the actual complex number value.
The API provides functions for working with both.
Complex Numbers as C Structures
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note that the functions which accept these structures as parameters and return
them as results do so *by value* rather than dereferencing them through
pointers. This is consistent throughout the API.
.. ctype:: Py_complex
The C structure which corresponds to the value portion of a Python complex
number object. Most of the functions for dealing with complex number objects
use structures of this type as input or output values, as appropriate. It is
defined as::
   typedef struct {
      double real;
      double imag;
   } Py_complex;
.. cfunction:: Py_complex _Py_c_sum(Py_complex left, Py_complex right)
Return the sum of two complex numbers, using the C :ctype:`Py_complex`
representation.
.. cfunction:: Py_complex _Py_c_diff(Py_complex left, Py_complex right)
Return the difference between two complex numbers, using the C
:ctype:`Py_complex` representation.
.. cfunction:: Py_complex _Py_c_neg(Py_complex complex)
Return the negation of the complex number *complex*, using the C
:ctype:`Py_complex` representation.
.. cfunction:: Py_complex _Py_c_prod(Py_complex left, Py_complex right)
Return the product of two complex numbers, using the C :ctype:`Py_complex`
representation.
.. cfunction:: Py_complex _Py_c_quot(Py_complex dividend, Py_complex divisor)
Return the quotient of two complex numbers, using the C :ctype:`Py_complex`
representation.
.. cfunction:: Py_complex _Py_c_pow(Py_complex num, Py_complex exp)
Return the exponentiation of *num* by *exp*, using the C :ctype:`Py_complex`
representation.
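Passing and returning the structure by value can be illustrated with a standalone sketch. The type ``complex_value`` mirrors the layout of :ctype:`Py_complex` shown above, and ``complex_sum`` and ``complex_prod`` are hypothetical reimplementations written for this example; they are not the interpreter's ``_Py_c_sum`` and ``_Py_c_prod``.

```c
/* Hypothetical mirror of the Py_complex layout. */
typedef struct {
    double real;
    double imag;
} complex_value;

/* Both operands are copied in and the result is copied out;
 * no pointers are involved and the arguments are never modified. */
complex_value complex_sum(complex_value left, complex_value right)
{
    complex_value result;

    result.real = left.real + right.real;
    result.imag = left.imag + right.imag;
    return result;
}

/* (a + bi)(c + di) = (ac - bd) + (ad + bc)i */
complex_value complex_prod(complex_value left, complex_value right)
{
    complex_value result;

    result.real = left.real * right.real - left.imag * right.imag;
    result.imag = left.real * right.imag + left.imag * right.real;
    return result;
}
```

Because the structure holds only two doubles, copying it is cheap, and by-value calls let results be nested directly, as in ``complex_sum(a, complex_prod(b, c))``.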
Complex Numbers as Python Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. ctype:: PyComplexObject
This subtype of :ctype:`PyObject` represents a Python complex number object.
.. cvar:: PyTypeObject PyComplex_Type
This instance of :ctype:`PyTypeObject` represents the Python complex number
type. It is the same object as ``complex`` and ``types.ComplexType``.
.. cfunction:: int PyComplex_Check(PyObject *p)
Return true if its argument is a :ctype:`PyComplexObject` or a subtype of
:ctype:`PyComplexObject`.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyComplex_CheckExact(PyObject *p)
Return true if its argument is a :ctype:`PyComplexObject`, but not a subtype of
:ctype:`PyComplexObject`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyComplex_FromCComplex(Py_complex v)
Create a new Python complex number object from a C :ctype:`Py_complex` value.
.. cfunction:: PyObject* PyComplex_FromDoubles(double real, double imag)
Return a new :ctype:`PyComplexObject` object from *real* and *imag*.
.. cfunction:: double PyComplex_RealAsDouble(PyObject *op)
Return the real part of *op* as a C :ctype:`double`.
.. cfunction:: double PyComplex_ImagAsDouble(PyObject *op)
Return the imaginary part of *op* as a C :ctype:`double`.
.. cfunction:: Py_complex PyComplex_AsCComplex(PyObject *op)
Return the :ctype:`Py_complex` value of the complex number *op*.
.. versionchanged:: 2.6
If *op* is not a Python complex number object but has a :meth:`__complex__`
method, this method will first be called to convert *op* to a Python complex
number object.

@@ -0,0 +1,107 @@
.. highlightlang:: c
.. _concrete:
**********************
Concrete Objects Layer
**********************
The functions in this chapter are specific to certain Python object types.
Passing them an object of the wrong type is not a good idea; if you receive an
object from a Python program and you are not sure that it has the right type,
you must perform a type check first; for example, to check that an object is a
dictionary, use :cfunc:`PyDict_Check`. The chapter is structured like the
"family tree" of Python object types.
.. warning::
While the functions described in this chapter carefully check the type of the
objects which are passed in, many of them do not check for *NULL* being passed
instead of a valid object. Allowing *NULL* to be passed in can cause memory
access violations and immediate termination of the interpreter.
.. _fundamental:
Fundamental Objects
===================
This section describes Python type objects and the singleton object ``None``.
.. toctree::
type.rst
none.rst
.. _numericobjects:
Numeric Objects
===============
.. index:: object: numeric
.. toctree::
int.rst
bool.rst
long.rst
float.rst
complex.rst
.. _sequenceobjects:
Sequence Objects
================
.. index:: object: sequence
Generic operations on sequence objects were discussed in the previous chapter;
this section deals with the specific kinds of sequence objects that are
intrinsic to the Python language.
.. toctree::
bytearray.rst
string.rst
unicode.rst
buffer.rst
tuple.rst
list.rst
.. _mapobjects:
Mapping Objects
===============
.. index:: object: mapping
.. toctree::
dict.rst
.. _otherobjects:
Other Objects
=============
.. toctree::
class.rst
function.rst
method.rst
file.rst
module.rst
iterator.rst
descriptor.rst
slice.rst
weakref.rst
cobject.rst
cell.rst
gen.rst
datetime.rst
set.rst


@@ -0,0 +1,103 @@
.. highlightlang:: c
.. _string-conversion:
String conversion and formatting
================================
Functions for number conversion and formatted string output.
.. cfunction:: int PyOS_snprintf(char *str, size_t size, const char *format, ...)
Output not more than *size* bytes to *str* according to the format string
*format* and the extra arguments. See the Unix man page :manpage:`snprintf(3)`.
.. cfunction:: int PyOS_vsnprintf(char *str, size_t size, const char *format, va_list va)
Output not more than *size* bytes to *str* according to the format string
*format* and the variable argument list *va*. See the Unix man page
:manpage:`vsnprintf(3)`.
:cfunc:`PyOS_snprintf` and :cfunc:`PyOS_vsnprintf` wrap the Standard C library
functions :cfunc:`snprintf` and :cfunc:`vsnprintf`. Their purpose is to
guarantee consistent behavior in corner cases, which the Standard C functions do
not.
The wrappers ensure that *str*[*size*-1] is always ``'\0'`` upon return. They
never write more than *size* bytes (including the trailing ``'\0'``) into *str*.
Both functions require that ``str != NULL``, ``size > 0`` and ``format !=
NULL``.
If the platform doesn't have :cfunc:`vsnprintf` and the buffer size needed to
avoid truncation exceeds *size* by more than 512 bytes, Python aborts by
calling :cfunc:`Py_FatalError`.
The return value (*rv*) for these functions should be interpreted as follows:
* When ``0 <= rv < size``, the output conversion was successful and *rv*
characters were written to *str* (excluding the trailing ``'\0'`` byte at
*str*[*rv*]).
* When ``rv >= size``, the output conversion was truncated and a buffer with
``rv + 1`` bytes would have been needed to succeed. *str*[*size*-1] is ``'\0'``
in this case.
* When ``rv < 0``, "something bad happened." *str*[*size*-1] is ``'\0'`` in
this case too, but the rest of *str* is undefined. The exact cause of the error
depends on the underlying platform.
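The three cases above can be sketched in portable C. Because
:cfunc:`PyOS_snprintf` requires :file:`Python.h`, this sketch substitutes the
C99 :cfunc:`snprintf`, which is assumed to follow the same return-value
convention on a conforming platform. The ``format_status`` enum and the
``format_point`` helper are hypothetical names introduced only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names for the three outcomes described above. */
enum format_status { FORMAT_OK, FORMAT_TRUNCATED, FORMAT_ERROR };

/* Format "(x, y)" into buf and classify the return value rv.
   C99 snprintf stands in for PyOS_snprintf in this sketch. */
static enum format_status
format_point(char *buf, size_t size, int x, int y)
{
    int rv;

    assert(buf != NULL && size > 0);  /* same preconditions as the wrappers */
    rv = snprintf(buf, size, "(%d, %d)", x, y);
    if (rv < 0)
        return FORMAT_ERROR;          /* rest of buf must not be relied on */
    if ((size_t)rv >= size)
        return FORMAT_TRUNCATED;      /* rv + 1 bytes would have been needed */
    return FORMAT_OK;                 /* rv characters written, buf[rv] is NUL */
}
```

A caller that sees ``FORMAT_TRUNCATED`` can allocate ``rv + 1`` bytes and retry;
the helper could be extended to report *rv* for that purpose.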
The following functions provide locale-independent string to number conversions.
.. cfunction:: double PyOS_ascii_strtod(const char *nptr, char **endptr)
Convert a string to a :ctype:`double`. This function behaves like the Standard C
function :cfunc:`strtod` does in the C locale. It does this without changing the
current locale, since that would not be thread-safe.
:cfunc:`PyOS_ascii_strtod` should typically be used for reading configuration
files or other non-user input that should be locale independent.
.. versionadded:: 2.4
See the Unix man page :manpage:`strtod(3)` for details.
.. cfunction:: char * PyOS_ascii_formatd(char *buffer, size_t buf_len, const char *format, double d)
Convert a :ctype:`double` to a string using ``'.'`` as the decimal
separator. *format* is a :cfunc:`printf`\ -style format string specifying the
number format. Allowed conversion characters are ``'e'``, ``'E'``, ``'f'``,
``'F'``, ``'g'`` and ``'G'``.
The return value is a pointer to *buffer* with the converted string, or *NULL*
if the conversion failed.
.. versionadded:: 2.4
.. cfunction:: double PyOS_ascii_atof(const char *nptr)
Convert a string to a :ctype:`double` in a locale-independent way.
.. versionadded:: 2.4
See the Unix man page :manpage:`atof(3)` for details.
.. cfunction:: int PyOS_stricmp(char *s1, char *s2)
Case insensitive comparison of strings. The function works almost
identically to :cfunc:`strcmp` except that it ignores the case.
.. versionadded:: 2.6
.. cfunction:: int PyOS_strnicmp(char *s1, char *s2, Py_ssize_t size)
Case insensitive comparison of strings. The function works almost
identically to :cfunc:`strncmp` except that it ignores the case.
.. versionadded:: 2.6
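As an illustration of what "identical to :cfunc:`strcmp` except that it ignores
the case" means, the comparison can be sketched in standard C. This is not
CPython's implementation; ``sketch_stricmp`` is a hypothetical helper, and it
assumes case folding through the C library's :cfunc:`tolower`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: compare two NUL-terminated strings, ignoring case.
   Returns a value <0, 0 or >0, like strcmp. */
static int
sketch_stricmp(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    /* Advance while both strings agree after lowercasing. */
    while (*s1 != '\0' &&
           tolower((unsigned char)*s1) == tolower((unsigned char)*s2)) {
        s1++;
        s2++;
    }
    /* The first differing pair (or the terminators) decides the result. */
    return tolower((unsigned char)*s1) - tolower((unsigned char)*s2);
}
```

A bounded variant in the spirit of :cfunc:`PyOS_strnicmp` would add a counter
and stop after *size* characters.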


@@ -0,0 +1,238 @@
.. highlightlang:: c
.. _datetimeobjects:
DateTime Objects
----------------
Various date and time objects are supplied by the :mod:`datetime` module.
Before using any of these functions, the header file :file:`datetime.h` must be
included in your source (note that this is not included by :file:`Python.h`),
and the macro :cfunc:`PyDateTime_IMPORT` must be invoked. The macro puts a
pointer to a C structure into a static variable, ``PyDateTimeAPI``, that is
used by the following macros.
Type-check macros:
.. cfunction:: int PyDate_Check(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_DateType` or a subtype of
:cdata:`PyDateTime_DateType`. *ob* must not be *NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyDate_CheckExact(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_DateType`. *ob* must not be
*NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_Check(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_DateTimeType` or a subtype of
:cdata:`PyDateTime_DateTimeType`. *ob* must not be *NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_CheckExact(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_DateTimeType`. *ob* must not
be *NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyTime_Check(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_TimeType` or a subtype of
:cdata:`PyDateTime_TimeType`. *ob* must not be *NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyTime_CheckExact(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_TimeType`. *ob* must not be
*NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyDelta_Check(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_DeltaType` or a subtype of
:cdata:`PyDateTime_DeltaType`. *ob* must not be *NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyDelta_CheckExact(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_DeltaType`. *ob* must not be
*NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyTZInfo_Check(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_TZInfoType` or a subtype of
:cdata:`PyDateTime_TZInfoType`. *ob* must not be *NULL*.
.. versionadded:: 2.4
.. cfunction:: int PyTZInfo_CheckExact(PyObject *ob)
Return true if *ob* is of type :cdata:`PyDateTime_TZInfoType`. *ob* must not be
*NULL*.
.. versionadded:: 2.4
Macros to create objects:
.. cfunction:: PyObject* PyDate_FromDate(int year, int month, int day)
Return a ``datetime.date`` object with the specified year, month and day.
.. versionadded:: 2.4
.. cfunction:: PyObject* PyDateTime_FromDateAndTime(int year, int month, int day, int hour, int minute, int second, int usecond)
Return a ``datetime.datetime`` object with the specified year, month, day, hour,
minute, second and microsecond.
.. versionadded:: 2.4
.. cfunction:: PyObject* PyTime_FromTime(int hour, int minute, int second, int usecond)
Return a ``datetime.time`` object with the specified hour, minute, second and
microsecond.
.. versionadded:: 2.4
.. cfunction:: PyObject* PyDelta_FromDSU(int days, int seconds, int useconds)
Return a ``datetime.timedelta`` object representing the given number of days,
seconds and microseconds. Normalization is performed so that the resulting
number of microseconds and seconds lie in the ranges documented for
``datetime.timedelta`` objects.
.. versionadded:: 2.4
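The normalization mentioned for :cfunc:`PyDelta_FromDSU` can be illustrated
with plain integer arithmetic. The sketch below is not the code used by the
:mod:`datetime` module. It assumes the ranges documented for
``datetime.timedelta`` (``0 <= microseconds < 1000000`` and
``0 <= seconds < 86400``) and uses hypothetical names (``struct dsu``,
``floor_divmod``, ``normalize_dsu``). It does not check the permitted range of
*days*.

```c
#include <assert.h>

/* Hypothetical struct mirroring the three timedelta fields. */
struct dsu { long days; long seconds; long useconds; };

/* Floor division: the quotient is rounded toward negative infinity and the
   remainder stored in *rem is always in [0, d). Requires d > 0. */
static long
floor_divmod(long n, long d, long *rem)
{
    long q, r;

    assert(d > 0);
    q = n / d;
    r = n % d;
    if (r < 0) {            /* C division truncates toward zero; adjust */
        r += d;
        q -= 1;
    }
    *rem = r;
    return q;
}

/* Carry excess microseconds into seconds, then excess seconds into days. */
static struct dsu
normalize_dsu(long days, long seconds, long useconds)
{
    struct dsu out;

    seconds += floor_divmod(useconds, 1000000L, &out.useconds);
    days += floor_divmod(seconds, 86400L, &out.seconds);
    out.days = days;
    return out;
}
```

With this arithmetic, one microsecond before zero normalizes to ``days = -1``,
``seconds = 86399``, ``useconds = 999999``, which matches how
``datetime.timedelta(microseconds=-1)`` is represented in Python.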
Macros to extract fields from date objects. The argument must be an instance of
:cdata:`PyDateTime_Date`, including subclasses (such as
:cdata:`PyDateTime_DateTime`). The argument must not be *NULL*, and the type is
not checked:
.. cfunction:: int PyDateTime_GET_YEAR(PyDateTime_Date *o)
Return the year, as a positive int.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_GET_MONTH(PyDateTime_Date *o)
Return the month, as an int from 1 through 12.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_GET_DAY(PyDateTime_Date *o)
Return the day, as an int from 1 through 31.
.. versionadded:: 2.4
Macros to extract fields from datetime objects. The argument must be an
instance of :cdata:`PyDateTime_DateTime`, including subclasses. The argument
must not be *NULL*, and the type is not checked:
.. cfunction:: int PyDateTime_DATE_GET_HOUR(PyDateTime_DateTime *o)
Return the hour, as an int from 0 through 23.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_DATE_GET_MINUTE(PyDateTime_DateTime *o)
Return the minute, as an int from 0 through 59.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_DATE_GET_SECOND(PyDateTime_DateTime *o)
Return the second, as an int from 0 through 59.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_DATE_GET_MICROSECOND(PyDateTime_DateTime *o)
Return the microsecond, as an int from 0 through 999999.
.. versionadded:: 2.4
Macros to extract fields from time objects. The argument must be an instance of
:cdata:`PyDateTime_Time`, including subclasses. The argument must not be *NULL*,
and the type is not checked:
.. cfunction:: int PyDateTime_TIME_GET_HOUR(PyDateTime_Time *o)
Return the hour, as an int from 0 through 23.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_TIME_GET_MINUTE(PyDateTime_Time *o)
Return the minute, as an int from 0 through 59.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_TIME_GET_SECOND(PyDateTime_Time *o)
Return the second, as an int from 0 through 59.
.. versionadded:: 2.4
.. cfunction:: int PyDateTime_TIME_GET_MICROSECOND(PyDateTime_Time *o)
Return the microsecond, as an int from 0 through 999999.
.. versionadded:: 2.4
Macros for the convenience of modules implementing the DB API:
.. cfunction:: PyObject* PyDateTime_FromTimestamp(PyObject *args)
Create and return a new ``datetime.datetime`` object given an argument tuple
suitable for passing to ``datetime.datetime.fromtimestamp()``.
.. versionadded:: 2.4
.. cfunction:: PyObject* PyDate_FromTimestamp(PyObject *args)
Create and return a new ``datetime.date`` object given an argument tuple
suitable for passing to ``datetime.date.fromtimestamp()``.
.. versionadded:: 2.4


@@ -0,0 +1,55 @@
.. highlightlang:: c
.. _descriptor-objects:
Descriptor Objects
------------------
"Descriptors" are objects that describe some attribute of an object. They are
found in the dictionary of type objects.
.. cvar:: PyTypeObject PyProperty_Type
The type object for the built-in descriptor types.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyDescr_NewGetSet(PyTypeObject *type, struct PyGetSetDef *getset)
.. versionadded:: 2.2
.. cfunction:: PyObject* PyDescr_NewMember(PyTypeObject *type, struct PyMemberDef *meth)
.. versionadded:: 2.2
.. cfunction:: PyObject* PyDescr_NewMethod(PyTypeObject *type, struct PyMethodDef *meth)
.. versionadded:: 2.2
.. cfunction:: PyObject* PyDescr_NewWrapper(PyTypeObject *type, struct wrapperbase *wrapper, void *wrapped)
.. versionadded:: 2.2
.. cfunction:: PyObject* PyDescr_NewClassMethod(PyTypeObject *type, PyMethodDef *method)
.. versionadded:: 2.3
.. cfunction:: int PyDescr_IsData(PyObject *descr)
Return true if the descriptor object *descr* describes a data attribute, or
false if it describes a method. *descr* must be a descriptor object; there is
no error checking.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyWrapper_New(PyObject *, PyObject *)
.. versionadded:: 2.2


@@ -0,0 +1,220 @@
.. highlightlang:: c
.. _dictobjects:
Dictionary Objects
------------------
.. index:: object: dictionary
.. ctype:: PyDictObject
This subtype of :ctype:`PyObject` represents a Python dictionary object.
.. cvar:: PyTypeObject PyDict_Type
.. index::
single: DictType (in module types)
single: DictionaryType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python dictionary type.
This is exposed to Python programs as ``dict`` and ``types.DictType``.
.. cfunction:: int PyDict_Check(PyObject *p)
Return true if *p* is a dict object or an instance of a subtype of the dict
type.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyDict_CheckExact(PyObject *p)
Return true if *p* is a dict object, but not an instance of a subtype of the
dict type.
.. versionadded:: 2.4
.. cfunction:: PyObject* PyDict_New()
Return a new empty dictionary, or *NULL* on failure.
.. cfunction:: PyObject* PyDictProxy_New(PyObject *dict)
Return a proxy object for a mapping which enforces read-only behavior. This is
normally used to create a proxy to prevent modification of the dictionary for
non-dynamic class types.
.. versionadded:: 2.2
.. cfunction:: void PyDict_Clear(PyObject *p)
Empty an existing dictionary of all key-value pairs.
.. cfunction:: int PyDict_Contains(PyObject *p, PyObject *key)
Determine if dictionary *p* contains *key*. If an item in *p* matches *key*,
return ``1``, otherwise return ``0``. On error, return ``-1``. This is
equivalent to the Python expression ``key in p``.
.. versionadded:: 2.4
.. cfunction:: PyObject* PyDict_Copy(PyObject *p)
Return a new dictionary that contains the same key-value pairs as *p*.
.. versionadded:: 1.6
.. cfunction:: int PyDict_SetItem(PyObject *p, PyObject *key, PyObject *val)
Insert *val* into the dictionary *p* with a key of *key*. *key* must be
:term:`hashable`; if it isn't, :exc:`TypeError` will be raised. Return ``0``
on success or ``-1`` on failure.
.. cfunction:: int PyDict_SetItemString(PyObject *p, const char *key, PyObject *val)
.. index:: single: PyString_FromString()
Insert *val* into the dictionary *p* using *key* as a key. *key* should be a
:ctype:`char\*`. The key object is created using ``PyString_FromString(key)``.
Return ``0`` on success or ``-1`` on failure.
.. cfunction:: int PyDict_DelItem(PyObject *p, PyObject *key)
Remove the entry in dictionary *p* with key *key*. *key* must be hashable; if it
isn't, :exc:`TypeError` is raised. Return ``0`` on success or ``-1`` on
failure.
.. cfunction:: int PyDict_DelItemString(PyObject *p, char *key)
Remove the entry in dictionary *p* which has a key specified by the string
*key*. Return ``0`` on success or ``-1`` on failure.
.. cfunction:: PyObject* PyDict_GetItem(PyObject *p, PyObject *key)
Return the object from dictionary *p* which has a key *key*. Return *NULL* if
the key *key* is not present, but *without* setting an exception.
.. cfunction:: PyObject* PyDict_GetItemString(PyObject *p, const char *key)
This is the same as :cfunc:`PyDict_GetItem`, but *key* is specified as a
:ctype:`char\*`, rather than a :ctype:`PyObject\*`.
.. cfunction:: PyObject* PyDict_Items(PyObject *p)
Return a :ctype:`PyListObject` containing all the items from the dictionary, as
in the dictionary method :meth:`dict.items`.
.. cfunction:: PyObject* PyDict_Keys(PyObject *p)
Return a :ctype:`PyListObject` containing all the keys from the dictionary, as
in the dictionary method :meth:`dict.keys`.
.. cfunction:: PyObject* PyDict_Values(PyObject *p)
Return a :ctype:`PyListObject` containing all the values from the dictionary
*p*, as in the dictionary method :meth:`dict.values`.
.. cfunction:: Py_ssize_t PyDict_Size(PyObject *p)
.. index:: builtin: len
Return the number of items in the dictionary. This is equivalent to ``len(p)``
on a dictionary.
.. cfunction:: int PyDict_Next(PyObject *p, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue)
Iterate over all key-value pairs in the dictionary *p*. The :ctype:`Py_ssize_t`
referred to by *ppos* must be initialized to ``0`` prior to the first call to
this function to start the iteration; the function returns true for each pair in
the dictionary, and false once all pairs have been reported. The parameters
*pkey* and *pvalue* should either point to :ctype:`PyObject\*` variables that
will be filled in with each key and value, respectively, or may be *NULL*. Any
references returned through them are borrowed. *ppos* should not be altered
during iteration. Its value represents offsets within the internal dictionary
structure, and since the structure is sparse, the offsets are not consecutive.
For example::
   PyObject *key, *value;
   Py_ssize_t pos = 0;

   while (PyDict_Next(self->dict, &pos, &key, &value)) {
       /* do something interesting with the values... */
       ...
   }
The dictionary *p* should not be mutated during iteration. It is safe (since
Python 2.1) to modify the values of the keys as you iterate over the dictionary,
but only so long as the set of keys does not change. For example::
   PyObject *key, *value;
   Py_ssize_t pos = 0;

   while (PyDict_Next(self->dict, &pos, &key, &value)) {
       /* PyInt_AS_LONG returns a C long, so keep the full width. */
       long i = PyInt_AS_LONG(value) + 1;
       PyObject *o = PyInt_FromLong(i);
       if (o == NULL)
           return -1;
       if (PyDict_SetItem(self->dict, key, o) < 0) {
           Py_DECREF(o);
           return -1;
       }
       Py_DECREF(o);
   }
.. cfunction:: int PyDict_Merge(PyObject *a, PyObject *b, int override)
Iterate over mapping object *b* adding key-value pairs to dictionary *a*. *b*
may be a dictionary, or any object supporting :cfunc:`PyMapping_Keys` and
:cfunc:`PyObject_GetItem`. If *override* is true, existing pairs in *a* will be
replaced if a matching key is found in *b*, otherwise pairs will only be added
if there is not a matching key in *a*. Return ``0`` on success or ``-1`` if an
exception was raised.
.. versionadded:: 2.2
.. cfunction:: int PyDict_Update(PyObject *a, PyObject *b)
This is the same as ``PyDict_Merge(a, b, 1)`` in C, or ``a.update(b)`` in
Python. Return ``0`` on success or ``-1`` if an exception was raised.
.. versionadded:: 2.2
.. cfunction:: int PyDict_MergeFromSeq2(PyObject *a, PyObject *seq2, int override)
Update or merge into dictionary *a*, from the key-value pairs in *seq2*. *seq2*
must be an iterable object producing iterable objects of length 2, viewed as
key-value pairs. In case of duplicate keys, the last wins if *override* is
true, else the first wins. Return ``0`` on success or ``-1`` if an exception was
raised. Equivalent Python (except for the return value)::
   def PyDict_MergeFromSeq2(a, seq2, override):
       for key, value in seq2:
           if override or key not in a:
               a[key] = value
.. versionadded:: 2.2


@@ -0,0 +1,560 @@
.. highlightlang:: c
.. _exceptionhandling:
******************
Exception Handling
******************
The functions described in this chapter will let you handle and raise Python
exceptions. It is important to understand some of the basics of Python
exception handling. It works somewhat like the Unix :cdata:`errno` variable:
there is a global indicator (per thread) of the last error that occurred. Most
functions don't clear this on success, but will set it to indicate the cause of
the error on failure. Most functions also return an error indicator, usually
*NULL* if they are supposed to return a pointer, or ``-1`` if they return an
integer (exception: the :cfunc:`PyArg_\*` functions return ``1`` for success and
``0`` for failure).
When a function must fail because some function it called failed, it generally
doesn't set the error indicator; the function it called already set it. It is
responsible for either handling the error and clearing the exception or
returning after cleaning up any resources it holds (such as object references or
memory allocations); it should *not* continue normally if it is not prepared to
handle the error. If returning due to an error, it is important to indicate to
the caller that an error has been set. If the error is not handled or carefully
propagated, additional calls into the Python/C API may not behave as intended
and may fail in mysterious ways.
.. index::
single: exc_type (in module sys)
single: exc_value (in module sys)
single: exc_traceback (in module sys)
The error indicator consists of three Python objects corresponding to the
Python variables ``sys.exc_type``, ``sys.exc_value`` and ``sys.exc_traceback``.
API functions exist to interact with the error indicator in various ways. There
is a separate error indicator for each thread.
.. XXX Order of these should be more thoughtful.
Either alphabetical or some kind of structure.
.. cfunction:: void PyErr_PrintEx(int set_sys_last_vars)
Print a standard traceback to ``sys.stderr`` and clear the error indicator.
Call this function only when the error indicator is set. (Otherwise it will
cause a fatal error!)
If *set_sys_last_vars* is nonzero, the variables :data:`sys.last_type`,
:data:`sys.last_value` and :data:`sys.last_traceback` will be set to the
type, value and traceback of the printed exception, respectively.
.. cfunction:: void PyErr_Print()
Alias for ``PyErr_PrintEx(1)``.
.. cfunction:: PyObject* PyErr_Occurred()
Test whether the error indicator is set. If set, return the exception *type*
(the first argument to the last call to one of the :cfunc:`PyErr_Set\*`
functions or to :cfunc:`PyErr_Restore`). If not set, return *NULL*. You do not
own a reference to the return value, so you do not need to :cfunc:`Py_DECREF`
it.
.. note::
Do not compare the return value to a specific exception; use
:cfunc:`PyErr_ExceptionMatches` instead, shown below. (The comparison could
easily fail since the exception may be an instance instead of a class, in the
case of a class exception, or it may be a subclass of the expected exception.)
.. cfunction:: int PyErr_ExceptionMatches(PyObject *exc)
Equivalent to ``PyErr_GivenExceptionMatches(PyErr_Occurred(), exc)``. This
should only be called when an exception is actually set; a memory access
violation will occur if no exception has been raised.
.. cfunction:: int PyErr_GivenExceptionMatches(PyObject *given, PyObject *exc)
Return true if the *given* exception matches the exception in *exc*. If
*exc* is a class object, this also returns true when *given* is an instance
of a subclass. If *exc* is a tuple, all exceptions in the tuple (and
recursively in subtuples) are searched for a match.
.. cfunction:: void PyErr_NormalizeException(PyObject**exc, PyObject**val, PyObject**tb)
Under certain circumstances, the values returned by :cfunc:`PyErr_Fetch` below
can be "unnormalized", meaning that ``*exc`` is a class object but ``*val`` is
not an instance of the same class. This function can be used to instantiate
the class in that case. If the values are already normalized, nothing happens.
The delayed normalization is implemented to improve performance.
.. cfunction:: void PyErr_Clear()
Clear the error indicator. If the error indicator is not set, there is no
effect.
.. cfunction:: void PyErr_Fetch(PyObject **ptype, PyObject **pvalue, PyObject **ptraceback)
Retrieve the error indicator into three variables whose addresses are passed.
If the error indicator is not set, set all three variables to *NULL*. If it is
set, it will be cleared and you own a reference to each object retrieved. The
value and traceback object may be *NULL* even when the type object is not.
.. note::
This function is normally only used by code that needs to handle exceptions or
by code that needs to save and restore the error indicator temporarily.
.. cfunction:: void PyErr_Restore(PyObject *type, PyObject *value, PyObject *traceback)
Set the error indicator from the three objects. If the error indicator is
already set, it is cleared first. If the objects are *NULL*, the error
indicator is cleared. Do not pass a *NULL* type and non-*NULL* value or
traceback. The exception type should be a class. Do not pass an invalid
exception type or value. (Violating these rules will cause subtle problems
later.) This call takes away a reference to each object: you must own a
reference to each object before the call and after the call you no longer own
these references. (If you don't understand this, don't use this function. I
warned you.)
.. note::
This function is normally only used by code that needs to save and restore the
error indicator temporarily; use :cfunc:`PyErr_Fetch` to save the current
exception state.
.. cfunction:: void PyErr_SetString(PyObject *type, const char *message)
This is the most common way to set the error indicator. The first argument
specifies the exception type; it is normally one of the standard exceptions,
e.g. :cdata:`PyExc_RuntimeError`. You need not increment its reference count.
The second argument is an error message; it is converted to a string object.
.. cfunction:: void PyErr_SetObject(PyObject *type, PyObject *value)
This function is similar to :cfunc:`PyErr_SetString` but lets you specify an
arbitrary Python object for the "value" of the exception.
.. cfunction:: PyObject* PyErr_Format(PyObject *exception, const char *format, ...)
This function sets the error indicator and returns *NULL*. *exception* should be
a Python exception (class, not an instance). *format* should be a string,
containing format codes, similar to :cfunc:`printf`. The ``width.precision``
before a format code is parsed, but the width part is ignored.
.. % This should be exactly the same as the table in PyString_FromFormat.
.. % One should just refer to the other.
.. % The descriptions for %zd and %zu are wrong, but the truth is complicated
.. % because not all compilers support the %z width modifier -- we fake it
.. % when necessary via interpolating PY_FORMAT_SIZE_T.
.. % %u, %lu, %zu should have "new in Python 2.5" blurbs.
+-------------------+---------------+--------------------------------+
| Format Characters | Type          | Comment                        |
+===================+===============+================================+
| :attr:`%%`        | *n/a*         | The literal % character.       |
+-------------------+---------------+--------------------------------+
| :attr:`%c`        | int           | A single character,            |
|                   |               | represented as a C int.        |
+-------------------+---------------+--------------------------------+
| :attr:`%d`        | int           | Exactly equivalent to          |
|                   |               | ``printf("%d")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%u`        | unsigned int  | Exactly equivalent to          |
|                   |               | ``printf("%u")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%ld`       | long          | Exactly equivalent to          |
|                   |               | ``printf("%ld")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%lu`       | unsigned long | Exactly equivalent to          |
|                   |               | ``printf("%lu")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
|                   |               | ``printf("%zd")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%zu`       | size_t        | Exactly equivalent to          |
|                   |               | ``printf("%zu")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%i`        | int           | Exactly equivalent to          |
|                   |               | ``printf("%i")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%x`        | int           | Exactly equivalent to          |
|                   |               | ``printf("%x")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%s`        | char\*        | A null-terminated C character  |
|                   |               | array.                         |
+-------------------+---------------+--------------------------------+
| :attr:`%p`        | void\*        | The hex representation of a C  |
|                   |               | pointer. Mostly equivalent to  |
|                   |               | ``printf("%p")`` except that   |
|                   |               | it is guaranteed to start with |
|                   |               | the literal ``0x`` regardless  |
|                   |               | of what the platform's         |
|                   |               | ``printf`` yields.             |
+-------------------+---------------+--------------------------------+
An unrecognized format character causes all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
.. cfunction:: void PyErr_SetNone(PyObject *type)
This is a shorthand for ``PyErr_SetObject(type, Py_None)``.
.. cfunction:: int PyErr_BadArgument()
This is a shorthand for ``PyErr_SetString(PyExc_TypeError, message)``, where
*message* indicates that a built-in operation was invoked with an illegal
argument. It is mostly for internal use.
.. cfunction:: PyObject* PyErr_NoMemory()
This is a shorthand for ``PyErr_SetNone(PyExc_MemoryError)``; it returns *NULL*
so an object allocation function can write ``return PyErr_NoMemory();`` when it
runs out of memory.
.. cfunction:: PyObject* PyErr_SetFromErrno(PyObject *type)
.. index:: single: strerror()
This is a convenience function to raise an exception when a C library function
has returned an error and set the C variable :cdata:`errno`. It constructs a
tuple object whose first item is the integer :cdata:`errno` value and whose
second item is the corresponding error message (gotten from :cfunc:`strerror`),
and then calls ``PyErr_SetObject(type, object)``. On Unix, when the
:cdata:`errno` value is :const:`EINTR`, indicating an interrupted system call,
this calls :cfunc:`PyErr_CheckSignals`, and if that set the error indicator,
leaves it set to that. The function always returns *NULL*, so a wrapper
function around a system call can write ``return PyErr_SetFromErrno(type);``
when the system call returns an error.
.. cfunction:: PyObject* PyErr_SetFromErrnoWithFilename(PyObject *type, const char *filename)
Similar to :cfunc:`PyErr_SetFromErrno`, with the additional behavior that if
*filename* is not *NULL*, it is passed to the constructor of *type* as a third
parameter. In the case of exceptions such as :exc:`IOError` and :exc:`OSError`,
this is used to define the :attr:`filename` attribute of the exception instance.
.. cfunction:: PyObject* PyErr_SetFromWindowsErr(int ierr)
This is a convenience function to raise :exc:`WindowsError`. If called with
*ierr* of :cdata:`0`, the error code returned by a call to :cfunc:`GetLastError`
is used instead. It calls the Win32 function :cfunc:`FormatMessage` to retrieve
the Windows description of error code given by *ierr* or :cfunc:`GetLastError`,
then it constructs a tuple object whose first item is the *ierr* value and whose
second item is the corresponding error message (gotten from
:cfunc:`FormatMessage`), and then calls ``PyErr_SetObject(PyExc_WindowsError,
object)``. This function always returns *NULL*. Availability: Windows.
.. cfunction:: PyObject* PyErr_SetExcFromWindowsErr(PyObject *type, int ierr)
Similar to :cfunc:`PyErr_SetFromWindowsErr`, with an additional parameter
specifying the exception type to be raised. Availability: Windows.
.. versionadded:: 2.3
.. cfunction:: PyObject* PyErr_SetFromWindowsErrWithFilename(int ierr, const char *filename)
Similar to :cfunc:`PyErr_SetFromWindowsErr`, with the additional behavior that
if *filename* is not *NULL*, it is passed to the constructor of
:exc:`WindowsError` as a third parameter. Availability: Windows.
.. cfunction:: PyObject* PyErr_SetExcFromWindowsErrWithFilename(PyObject *type, int ierr, char *filename)
Similar to :cfunc:`PyErr_SetFromWindowsErrWithFilename`, with an additional
parameter specifying the exception type to be raised. Availability: Windows.
.. versionadded:: 2.3
.. cfunction:: void PyErr_BadInternalCall()
This is a shorthand for ``PyErr_SetString(PyExc_SystemError, message)``, where
*message* indicates that an internal operation (e.g. a Python/C API function)
was invoked with an illegal argument. It is mostly for internal use.
.. cfunction:: int PyErr_WarnEx(PyObject *category, char *message, int stacklevel)
Issue a warning message. The *category* argument is a warning category (see
below) or *NULL*; the *message* argument is a message string. *stacklevel* is a
positive number giving a number of stack frames; the warning will be issued from
the currently executing line of code in that stack frame. A *stacklevel* of 1
is the function calling :cfunc:`PyErr_WarnEx`, 2 is the function above that,
and so forth.
This function normally prints a warning message to *sys.stderr*; however, it is
also possible that the user has specified that warnings are to be turned into
errors, and in that case this will raise an exception. It is also possible that
the function raises an exception because of a problem with the warning machinery
(the implementation imports the :mod:`warnings` module to do the heavy lifting).
The return value is ``0`` if no exception is raised, or ``-1`` if an exception
is raised. (It is not possible to determine whether a warning message is
actually printed, nor what the reason is for the exception; this is
intentional.) If an exception is raised, the caller should do its normal
exception handling (for example, :cfunc:`Py_DECREF` owned references and return
an error value).
Warning categories must be subclasses of :cdata:`Warning`; the default warning
category is :cdata:`RuntimeWarning`. The standard Python warning categories are
available as global variables whose names are ``PyExc_`` followed by the Python
exception name. These have the type :ctype:`PyObject\*`; they are all class
objects. Their names are :cdata:`PyExc_Warning`, :cdata:`PyExc_UserWarning`,
:cdata:`PyExc_UnicodeWarning`, :cdata:`PyExc_DeprecationWarning`,
:cdata:`PyExc_SyntaxWarning`, :cdata:`PyExc_RuntimeWarning`, and
:cdata:`PyExc_FutureWarning`. :cdata:`PyExc_Warning` is a subclass of
:cdata:`PyExc_Exception`; the other warning categories are subclasses of
:cdata:`PyExc_Warning`.
For information about warning control, see the documentation for the
:mod:`warnings` module and the :option:`-W` option in the command line
documentation. There is no C API for warning control.
.. cfunction:: int PyErr_Warn(PyObject *category, char *message)
Issue a warning message. The *category* argument is a warning category (see
below) or *NULL*; the *message* argument is a message string. The warning will
appear to be issued from the function calling :cfunc:`PyErr_Warn`, equivalent to
calling :cfunc:`PyErr_WarnEx` with a *stacklevel* of 1.
Deprecated; use :cfunc:`PyErr_WarnEx` instead.
.. cfunction:: int PyErr_WarnExplicit(PyObject *category, const char *message, const char *filename, int lineno, const char *module, PyObject *registry)
Issue a warning message with explicit control over all warning attributes. This
is a straightforward wrapper around the Python function
:func:`warnings.warn_explicit`, see there for more information. The *module*
and *registry* arguments may be set to *NULL* to get the default effect
described there.
.. cfunction:: int PyErr_WarnPy3k(char *message, int stacklevel)
Issue a :exc:`DeprecationWarning` with the given *message* and *stacklevel*
if the :cdata:`Py_Py3kWarningFlag` flag is enabled.
.. versionadded:: 2.6
.. cfunction:: int PyErr_CheckSignals()
.. index::
module: signal
single: SIGINT
single: KeyboardInterrupt (built-in exception)
This function interacts with Python's signal handling. It checks whether a
signal has been sent to the process and if so, invokes the corresponding
signal handler. If the :mod:`signal` module is supported, this can invoke a
signal handler written in Python. In all cases, the default effect for
:const:`SIGINT` is to raise the :exc:`KeyboardInterrupt` exception. If an
exception is raised the error indicator is set and the function returns ``-1``;
otherwise the function returns ``0``. The error indicator may or may not be
cleared if it was previously set.
.. cfunction:: void PyErr_SetInterrupt()
.. index::
single: SIGINT
single: KeyboardInterrupt (built-in exception)
This function simulates the effect of a :const:`SIGINT` signal arriving --- the
next time :cfunc:`PyErr_CheckSignals` is called, :exc:`KeyboardInterrupt` will
be raised. It may be called without holding the interpreter lock.
.. % XXX This was described as obsolete, but is used in
.. % thread.interrupt_main() (used from IDLE), so it's still needed.
.. cfunction:: int PySignal_SetWakeupFd(int fd)
This utility function specifies a file descriptor to which a ``'\0'`` byte will
be written whenever a signal is received. It returns the previous such file
descriptor. The value ``-1`` disables the feature; this is the initial state.
This is equivalent to :func:`signal.set_wakeup_fd` in Python, but without any
error checking. *fd* should be a valid file descriptor. The function should
only be called from the main thread.
.. cfunction:: PyObject* PyErr_NewException(char *name, PyObject *base, PyObject *dict)
This utility function creates and returns a new exception object. The *name*
argument must be the name of the new exception, a C string of the form
``module.class``. The *base* and *dict* arguments are normally *NULL*. This
creates a class object derived from :exc:`Exception` (accessible in C as
:cdata:`PyExc_Exception`).
The :attr:`__module__` attribute of the new class is set to the first part (up
to the last dot) of the *name* argument, and the class name is set to the last
part (after the last dot). The *base* argument can be used to specify alternate
base classes; it can either be only one class or a tuple of classes. The *dict*
argument can be used to specify a dictionary of class variables and methods.
.. cfunction:: void PyErr_WriteUnraisable(PyObject *obj)
This utility function prints a warning message to ``sys.stderr`` when an
exception has been set but it is impossible for the interpreter to actually
raise the exception. It is used, for example, when an exception occurs in an
:meth:`__del__` method.
The function is called with a single argument *obj* that identifies the context
in which the unraisable exception occurred. The repr of *obj* will be printed in
the warning message.
.. _standardexceptions:
Standard Exceptions
===================
All standard Python exceptions are available as global variables whose names are
``PyExc_`` followed by the Python exception name. These have the type
:ctype:`PyObject\*`; they are all class objects. For completeness, here are all
the variables:
+------------------------------------+----------------------------+----------+
| C Name | Python Name | Notes |
+====================================+============================+==========+
| :cdata:`PyExc_BaseException` | :exc:`BaseException` | (1), (4) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_Exception` | :exc:`Exception` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_StandardError` | :exc:`StandardError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ArithmeticError` | :exc:`ArithmeticError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_LookupError` | :exc:`LookupError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_AssertionError` | :exc:`AssertionError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_AttributeError` | :exc:`AttributeError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_EOFError` | :exc:`EOFError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_EnvironmentError` | :exc:`EnvironmentError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_FloatingPointError` | :exc:`FloatingPointError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_IOError` | :exc:`IOError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ImportError` | :exc:`ImportError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_IndexError` | :exc:`IndexError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_KeyError` | :exc:`KeyError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_KeyboardInterrupt` | :exc:`KeyboardInterrupt` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_MemoryError` | :exc:`MemoryError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_NameError` | :exc:`NameError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_NotImplementedError` | :exc:`NotImplementedError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_OSError` | :exc:`OSError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_OverflowError` | :exc:`OverflowError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ReferenceError` | :exc:`ReferenceError` | \(2) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_RuntimeError` | :exc:`RuntimeError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_SyntaxError` | :exc:`SyntaxError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_SystemError` | :exc:`SystemError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_SystemExit` | :exc:`SystemExit` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_TypeError` | :exc:`TypeError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ValueError` | :exc:`ValueError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_WindowsError` | :exc:`WindowsError` | \(3) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ZeroDivisionError` | :exc:`ZeroDivisionError` | |
+------------------------------------+----------------------------+----------+
.. index::
single: PyExc_BaseException
single: PyExc_Exception
single: PyExc_StandardError
single: PyExc_ArithmeticError
single: PyExc_LookupError
single: PyExc_AssertionError
single: PyExc_AttributeError
single: PyExc_EOFError
single: PyExc_EnvironmentError
single: PyExc_FloatingPointError
single: PyExc_IOError
single: PyExc_ImportError
single: PyExc_IndexError
single: PyExc_KeyError
single: PyExc_KeyboardInterrupt
single: PyExc_MemoryError
single: PyExc_NameError
single: PyExc_NotImplementedError
single: PyExc_OSError
single: PyExc_OverflowError
single: PyExc_ReferenceError
single: PyExc_RuntimeError
single: PyExc_SyntaxError
single: PyExc_SystemError
single: PyExc_SystemExit
single: PyExc_TypeError
single: PyExc_ValueError
single: PyExc_WindowsError
single: PyExc_ZeroDivisionError
Notes:
(1)
This is a base class for other standard exceptions.
(2)
This is the same as :exc:`weakref.ReferenceError`.
(3)
Only defined on Windows; protect code that uses this by testing that the
preprocessor macro ``MS_WINDOWS`` is defined.
(4)
.. versionadded:: 2.5
Deprecation of String Exceptions
================================
.. index:: single: BaseException (built-in exception)
All exceptions built into Python or provided in the standard library are derived
from :exc:`BaseException`.
String exceptions are still supported in the interpreter to allow existing code
to run unmodified, but this will also change in a future release.

@@ -0,0 +1,168 @@
.. highlightlang:: c
.. _fileobjects:
File Objects
------------
.. index:: object: file
Python's built-in file objects are implemented entirely on the :ctype:`FILE\*`
support from the C standard library. This is an implementation detail and may
change in future releases of Python.
.. ctype:: PyFileObject
This subtype of :ctype:`PyObject` represents a Python file object.
.. cvar:: PyTypeObject PyFile_Type
.. index:: single: FileType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python file type. This is
exposed to Python programs as ``file`` and ``types.FileType``.
.. cfunction:: int PyFile_Check(PyObject *p)
Return true if its argument is a :ctype:`PyFileObject` or a subtype of
:ctype:`PyFileObject`.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyFile_CheckExact(PyObject *p)
Return true if its argument is a :ctype:`PyFileObject`, but not a subtype of
:ctype:`PyFileObject`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyFile_FromString(char *filename, char *mode)
.. index:: single: fopen()
On success, return a new file object that is opened on the file given by
*filename*, with a file mode given by *mode*, where *mode* has the same
semantics as the standard C routine :cfunc:`fopen`. On failure, return *NULL*.
.. cfunction:: PyObject* PyFile_FromFile(FILE *fp, char *name, char *mode, int (*close)(FILE*))
Create a new :ctype:`PyFileObject` from the already-open standard C file
pointer, *fp*. The function *close* will be called when the file should be
closed. Return *NULL* on failure.
.. cfunction:: FILE* PyFile_AsFile(PyObject \*p)
Return the file object associated with *p* as a :ctype:`FILE\*`.
If the caller will ever use the returned :ctype:`FILE\*` object while
the GIL is released it must also call the :cfunc:`PyFile_IncUseCount` and
:cfunc:`PyFile_DecUseCount` functions described below as appropriate.
.. cfunction:: void PyFile_IncUseCount(PyFileObject \*p)
Increments the PyFileObject's internal use count to indicate
that the underlying :ctype:`FILE\*` is being used.
This prevents Python from calling f_close() on it from another thread.
Callers of this must call :cfunc:`PyFile_DecUseCount` when they are
finished with the :ctype:`FILE\*`. Otherwise the file object will
never be closed by Python.
The GIL must be held while calling this function.
The suggested use is to call this after :cfunc:`PyFile_AsFile` just before
you release the GIL.
.. versionadded:: 2.6
.. cfunction:: void PyFile_DecUseCount(PyFileObject \*p)
Decrements the PyFileObject's internal unlocked_count member to
indicate that the caller is done with its own use of the :ctype:`FILE\*`.
This may only be called to undo a prior call to :cfunc:`PyFile_IncUseCount`.
The GIL must be held while calling this function.
.. versionadded:: 2.6
.. cfunction:: PyObject* PyFile_GetLine(PyObject *p, int n)
.. index:: single: EOFError (built-in exception)
Equivalent to ``p.readline([n])``, this function reads one line from the
object *p*. *p* may be a file object or any object with a :meth:`readline`
method. If *n* is ``0``, exactly one line is read, regardless of the length of
the line. If *n* is greater than ``0``, no more than *n* bytes will be read
from the file; a partial line can be returned. In both cases, an empty string
is returned if the end of the file is reached immediately. If *n* is less than
``0``, however, one line is read regardless of length, but :exc:`EOFError` is
raised if the end of the file is reached immediately.
.. cfunction:: PyObject* PyFile_Name(PyObject *p)
Return the name of the file specified by *p* as a string object.
.. cfunction:: void PyFile_SetBufSize(PyFileObject *p, int n)
.. index:: single: setvbuf()
Available on systems with :cfunc:`setvbuf` only. This should only be called
immediately after file object creation.
.. cfunction:: int PyFile_SetEncoding(PyFileObject *p, const char *enc)
Set the file's encoding for Unicode output to *enc*. Return 1 on success and 0
on failure.
.. versionadded:: 2.3
.. cfunction:: int PyFile_SetEncodingAndErrors(PyFileObject *p, const char *enc, const char *errors)
Set the file's encoding for Unicode output to *enc*, and its error
mode to *errors*. Return 1 on success and 0 on failure.
.. versionadded:: 2.6
.. cfunction:: int PyFile_SoftSpace(PyObject *p, int newflag)
.. index:: single: softspace (file attribute)
This function exists for internal use by the interpreter. Set the
:attr:`softspace` attribute of *p* to *newflag* and return the previous value.
*p* does not have to be a file object for this function to work properly; any
object is supported (though it is only interesting if the :attr:`softspace`
attribute can be set). This function clears any errors, and will return ``0``
as the previous value if the attribute either does not exist or if there were
errors in retrieving it. There is no way to detect errors from this function,
but doing so should not be needed.
.. cfunction:: int PyFile_WriteObject(PyObject *obj, PyObject *p, int flags)
.. index:: single: Py_PRINT_RAW
Write object *obj* to file object *p*. The only supported flag for *flags* is
:const:`Py_PRINT_RAW`; if given, the :func:`str` of the object is written
instead of the :func:`repr`. Return ``0`` on success or ``-1`` on failure; the
appropriate exception will be set.
.. cfunction:: int PyFile_WriteString(const char *s, PyObject *p)
Write string *s* to file object *p*. Return ``0`` on success or ``-1`` on
failure; the appropriate exception will be set.

@@ -0,0 +1,94 @@
.. highlightlang:: c
.. _floatobjects:
Floating Point Objects
----------------------
.. index:: object: floating point
.. ctype:: PyFloatObject
This subtype of :ctype:`PyObject` represents a Python floating point object.
.. cvar:: PyTypeObject PyFloat_Type
.. index:: single: FloatType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python floating point
type. This is the same object as ``float`` and ``types.FloatType``.
.. cfunction:: int PyFloat_Check(PyObject *p)
Return true if its argument is a :ctype:`PyFloatObject` or a subtype of
:ctype:`PyFloatObject`.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyFloat_CheckExact(PyObject *p)
Return true if its argument is a :ctype:`PyFloatObject`, but not a subtype of
:ctype:`PyFloatObject`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyFloat_FromString(PyObject *str, char **pend)
Create a :ctype:`PyFloatObject` object based on the string value in *str*, or
*NULL* on failure. The *pend* argument is ignored. It remains only for
backward compatibility.
.. cfunction:: PyObject* PyFloat_FromDouble(double v)
Create a :ctype:`PyFloatObject` object from *v*, or *NULL* on failure.
.. cfunction:: double PyFloat_AsDouble(PyObject *pyfloat)
Return a C :ctype:`double` representation of the contents of *pyfloat*. If
*pyfloat* is not a Python floating point object but has a :meth:`__float__`
method, this method will first be called to convert *pyfloat* into a float.
.. cfunction:: double PyFloat_AS_DOUBLE(PyObject *pyfloat)
Return a C :ctype:`double` representation of the contents of *pyfloat*, but
without error checking.
.. cfunction:: PyObject* PyFloat_GetInfo(void)
Return a structseq instance which contains information about the
precision, minimum and maximum values of a float. It's a thin wrapper
around the header file :file:`float.h`.
.. versionadded:: 2.6
.. cfunction:: double PyFloat_GetMax(void)
Return the maximum representable finite float *DBL_MAX* as C :ctype:`double`.
.. versionadded:: 2.6
.. cfunction:: double PyFloat_GetMin(void)
Return the minimum normalized positive float *DBL_MIN* as C :ctype:`double`.
.. versionadded:: 2.6
.. cfunction:: int PyFloat_ClearFreeList(void)
Clear the float free list. Return the number of items that could not
be freed.
.. versionadded:: 2.6

@@ -0,0 +1,83 @@
.. highlightlang:: c
.. _function-objects:
Function Objects
----------------
.. index:: object: function
There are a few functions specific to Python functions.
.. ctype:: PyFunctionObject
The C structure used for functions.
.. cvar:: PyTypeObject PyFunction_Type
.. index:: single: FunctionType (in module types)
This is an instance of :ctype:`PyTypeObject` and represents the Python function
type. It is exposed to Python programmers as ``types.FunctionType``.
.. cfunction:: int PyFunction_Check(PyObject *o)
Return true if *o* is a function object (has type :cdata:`PyFunction_Type`).
The parameter must not be *NULL*.
.. cfunction:: PyObject* PyFunction_New(PyObject *code, PyObject *globals)
Return a new function object associated with the code object *code*. *globals*
must be a dictionary with the global variables accessible to the function.
The function's docstring, name and *__module__* are retrieved from the code
object; the argument defaults and closure are set to *NULL*.
.. cfunction:: PyObject* PyFunction_GetCode(PyObject *op)
Return the code object associated with the function object *op*.
.. cfunction:: PyObject* PyFunction_GetGlobals(PyObject *op)
Return the globals dictionary associated with the function object *op*.
.. cfunction:: PyObject* PyFunction_GetModule(PyObject *op)
Return the *__module__* attribute of the function object *op*. This is normally
a string containing the module name, but can be set to any other object by
Python code.
.. cfunction:: PyObject* PyFunction_GetDefaults(PyObject *op)
Return the argument default values of the function object *op*. This can be a
tuple of arguments or *NULL*.
.. cfunction:: int PyFunction_SetDefaults(PyObject *op, PyObject *defaults)
Set the argument default values for the function object *op*. *defaults* must be
*Py_None* or a tuple.
Raises :exc:`SystemError` and returns ``-1`` on failure.
.. cfunction:: PyObject* PyFunction_GetClosure(PyObject *op)
Return the closure associated with the function object *op*. This can be *NULL*
or a tuple of cell objects.
.. cfunction:: int PyFunction_SetClosure(PyObject *op, PyObject *closure)
Set the closure associated with the function object *op*. *closure* must be
*Py_None* or a tuple of cell objects.
Raises :exc:`SystemError` and returns ``-1`` on failure.

@@ -0,0 +1,153 @@
.. highlightlang:: c
.. _supporting-cycle-detection:
Supporting Cyclic Garbage Collection
====================================
Python's support for detecting and collecting garbage which involves circular
references requires support from object types which are "containers" for other
objects which may also be containers. Types which do not store references to
other objects, or which only store references to atomic types (such as numbers
or strings), do not need to provide any explicit support for garbage collection.
.. An example showing the use of these interfaces can be found in "Supporting the
.. Cycle Collector (XXX not found: ../ext/example-cycle-support.html)".
To create a container type, the :attr:`tp_flags` field of the type object must
include the :const:`Py_TPFLAGS_HAVE_GC` flag, and the type must provide an
implementation of the :attr:`tp_traverse` handler. If instances of the type are
mutable, a :attr:`tp_clear` implementation must also be provided.
.. data:: Py_TPFLAGS_HAVE_GC
:noindex:
Objects with a type with this flag set must conform with the rules documented
here. For convenience these objects will be referred to as container objects.
Constructors for container types must conform to two rules:
#. The memory for the object must be allocated using :cfunc:`PyObject_GC_New` or
:cfunc:`PyObject_GC_NewVar`.
#. Once all the fields which may contain references to other containers are
initialized, it must call :cfunc:`PyObject_GC_Track`.
.. cfunction:: TYPE* PyObject_GC_New(TYPE, PyTypeObject *type)
Analogous to :cfunc:`PyObject_New` but for container objects with the
:const:`Py_TPFLAGS_HAVE_GC` flag set.
.. cfunction:: TYPE* PyObject_GC_NewVar(TYPE, PyTypeObject *type, Py_ssize_t size)
Analogous to :cfunc:`PyObject_NewVar` but for container objects with the
:const:`Py_TPFLAGS_HAVE_GC` flag set.
.. cfunction:: TYPE* PyObject_GC_Resize(TYPE, PyVarObject *op, Py_ssize_t newsize)
Resize an object allocated by :cfunc:`PyObject_NewVar`. Returns the resized
object or *NULL* on failure.
.. cfunction:: void PyObject_GC_Track(PyObject *op)
Adds the object *op* to the set of container objects tracked by the collector.
The collector can run at unexpected times so objects must be valid while being
tracked. This should be called once all the fields followed by the
:attr:`tp_traverse` handler become valid, usually near the end of the
constructor.
.. cfunction:: void _PyObject_GC_TRACK(PyObject *op)
A macro version of :cfunc:`PyObject_GC_Track`. It should not be used for
extension modules.
The deallocator for the object must likewise conform to a pair of rules:
#. Before fields which refer to other containers are invalidated,
:cfunc:`PyObject_GC_UnTrack` must be called.
#. The object's memory must be deallocated using :cfunc:`PyObject_GC_Del`.
.. cfunction:: void PyObject_GC_Del(void *op)
Releases memory allocated to an object using :cfunc:`PyObject_GC_New` or
:cfunc:`PyObject_GC_NewVar`.
.. cfunction:: void PyObject_GC_UnTrack(void *op)
Remove the object *op* from the set of container objects tracked by the
collector. Note that :cfunc:`PyObject_GC_Track` can be called again on this
object to add it back to the set of tracked objects. The deallocator
(:attr:`tp_dealloc` handler) should call this for the object before any of the
fields used by the :attr:`tp_traverse` handler become invalid.
.. cfunction:: void _PyObject_GC_UNTRACK(PyObject *op)
A macro version of :cfunc:`PyObject_GC_UnTrack`. It should not be used for
extension modules.
The :attr:`tp_traverse` handler accepts a function parameter of this type:
.. ctype:: int (*visitproc)(PyObject *object, void *arg)
Type of the visitor function passed to the :attr:`tp_traverse` handler. The
function should be called with an object to traverse as *object* and the third
parameter to the :attr:`tp_traverse` handler as *arg*. The Python core uses
several visitor functions to implement cyclic garbage detection; it's not
expected that users will need to write their own visitor functions.
The :attr:`tp_traverse` handler must have the following type:
.. ctype:: int (*traverseproc)(PyObject *self, visitproc visit, void *arg)
Traversal function for a container object. Implementations must call the
*visit* function for each object directly contained by *self*, with the
parameters to *visit* being the contained object and the *arg* value passed to
the handler. The *visit* function must not be called with a *NULL* object
argument. If *visit* returns a non-zero value that value should be returned
immediately.
To simplify writing :attr:`tp_traverse` handlers, a :cfunc:`Py_VISIT` macro is
provided. In order to use this macro, the :attr:`tp_traverse` implementation
must name its arguments exactly *visit* and *arg*:
.. cfunction:: void Py_VISIT(PyObject *o)
Call the *visit* callback, with arguments *o* and *arg*. If *visit* returns a
non-zero value, then return it. Using this macro, :attr:`tp_traverse` handlers
look like::
   static int
   my_traverse(Noddy *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->foo);
       Py_VISIT(self->bar);
       return 0;
   }
.. versionadded:: 2.4
The :attr:`tp_clear` handler must be of the :ctype:`inquiry` type, or *NULL* if
the object is immutable.
.. ctype:: int (*inquiry)(PyObject *self)
Drop references that may have created reference cycles. Immutable objects do
not have to define this method since they can never directly create reference
cycles. Note that the object must still be valid after calling this method
(don't just call :cfunc:`Py_DECREF` on a reference). The collector will call
this method if it detects that this object is involved in a reference cycle.

@@ -0,0 +1,38 @@
.. highlightlang:: c
.. _gen-objects:
Generator Objects
-----------------
Generator objects are what Python uses to implement generator iterators. They
are normally created by iterating over a function that yields values, rather
than explicitly calling :cfunc:`PyGen_New`.
.. ctype:: PyGenObject
The C structure used for generator objects.
.. cvar:: PyTypeObject PyGen_Type
The type object corresponding to generator objects.
.. cfunction:: int PyGen_Check(ob)
Return true if *ob* is a generator object; *ob* must not be *NULL*.
.. cfunction:: int PyGen_CheckExact(ob)
Return true if *ob*'s type is *PyGen_Type*; *ob* must not be *NULL*.
.. cfunction:: PyObject* PyGen_New(PyFrameObject *frame)
Create and return a new generator object based on the *frame* object. A
reference to *frame* is stolen by this function. The parameter must not be
*NULL*.

@@ -0,0 +1,267 @@
.. highlightlang:: c
.. _importing:
Importing Modules
=================
.. cfunction:: PyObject* PyImport_ImportModule(const char *name)
.. index::
single: package variable; __all__
single: __all__ (package variable)
single: modules (in module sys)
This is a simplified interface to :cfunc:`PyImport_ImportModuleEx` below,
leaving the *globals* and *locals* arguments set to *NULL* and *level* set
to 0. When the *name*
argument contains a dot (when it specifies a submodule of a package), the
*fromlist* argument is set to the list ``['*']`` so that the return value is the
named module rather than the top-level package containing it as would otherwise
be the case. (Unfortunately, this has an additional side effect when *name* in
fact specifies a subpackage instead of a submodule: the submodules specified in
the package's ``__all__`` variable are loaded.) Return a new reference to the
imported module, or *NULL* with an exception set on failure. Before Python 2.4,
the module may still be created in the failure case --- examine ``sys.modules``
to find out. Starting with Python 2.4, a failing import of a module no longer
leaves the module in ``sys.modules``.
.. versionchanged:: 2.4
failing imports remove incomplete module objects.
.. versionchanged:: 2.6
always use absolute imports
.. cfunction:: PyObject* PyImport_ImportModuleNoBlock(const char *name)
This version of :cfunc:`PyImport_ImportModule` does not block. It's intended
to be used in C functions that import other modules to execute a function.
A regular import may block if another thread holds the import lock;
:cfunc:`PyImport_ImportModuleNoBlock` never does. It first tries to fetch
the module from ``sys.modules`` and falls back to :cfunc:`PyImport_ImportModule`
unless the lock is held, in which case the function will raise an
:exc:`ImportError`.
.. versionadded:: 2.6
.. cfunction:: PyObject* PyImport_ImportModuleEx(char *name, PyObject *globals, PyObject *locals, PyObject *fromlist)
.. index:: builtin: __import__
Import a module. This is best described by referring to the built-in Python
function :func:`__import__`, as the standard :func:`__import__` function calls
this function directly.
The return value is a new reference to the imported module or top-level package,
or *NULL* with an exception set on failure (before Python 2.4, the module may
still be created in this case). As with :func:`__import__`, the return value
when a submodule of a package was requested is normally the top-level package,
unless a non-empty *fromlist* was given.
.. versionchanged:: 2.4
failing imports remove incomplete module objects.
.. versionchanged:: 2.6
The function is an alias for :cfunc:`PyImport_ImportModuleLevel` with
-1 as level, meaning relative import.
.. cfunction:: PyObject* PyImport_ImportModuleLevel(char *name, PyObject *globals, PyObject *locals, PyObject *fromlist, int level)
Import a module. This is best described by referring to the built-in Python
function :func:`__import__`, as the standard :func:`__import__` function calls
this function directly.
The return value is a new reference to the imported module or top-level package,
or *NULL* with an exception set on failure. As with :func:`__import__`,
the return value when a submodule of a package was requested is normally the
top-level package, unless a non-empty *fromlist* was given.
.. versionadded:: 2.5
.. cfunction:: PyObject* PyImport_Import(PyObject *name)
.. index::
module: rexec
module: ihooks
This is a higher-level interface that calls the current "import hook function".
It invokes the :func:`__import__` function from the ``__builtins__`` of the
current globals. This means that the import is done using whatever import hooks
are installed in the current environment, e.g. by :mod:`rexec` or :mod:`ihooks`.
.. versionchanged:: 2.6
Always uses absolute imports.
.. cfunction:: PyObject* PyImport_ReloadModule(PyObject *m)
.. index:: builtin: reload
Reload a module. This is best described by referring to the built-in Python
function :func:`reload`, as the standard :func:`reload` function calls this
function directly. Return a new reference to the reloaded module, or *NULL*
with an exception set on failure (the module still exists in this case).
.. cfunction:: PyObject* PyImport_AddModule(const char *name)
Return the module object corresponding to a module name. The *name* argument
may be of the form ``package.module``. First check the modules dictionary for
an existing module of that name; if there is none, create a new one and insert
it in the modules dictionary. Return *NULL* with an exception set on failure.
.. note::
This function does not load or import the module; if the module wasn't already
loaded, you will get an empty module object. Use :cfunc:`PyImport_ImportModule`
or one of its variants to import a module. Package structures implied by a
dotted name for *name* are not created if not already present.
.. cfunction:: PyObject* PyImport_ExecCodeModule(char *name, PyObject *co)
.. index:: builtin: compile
Given a module name (possibly of the form ``package.module``) and a code object
read from a Python bytecode file or obtained from the built-in function
:func:`compile`, load the module. Return a new reference to the module object,
or *NULL* with an exception set if an error occurred. Before Python 2.4, the
module could still be created in error cases. Starting with Python 2.4, *name*
is removed from :attr:`sys.modules` in error cases, even if *name* was already
in :attr:`sys.modules` on entry to :cfunc:`PyImport_ExecCodeModule`. Leaving
incompletely initialized modules in :attr:`sys.modules` is dangerous, as imports of
such modules have no way to know that the module object is in an unknown (and
probably damaged with respect to the module author's intents) state.
This function will reload the module if it was already imported. See
:cfunc:`PyImport_ReloadModule` for the intended way to reload a module.
If *name* points to a dotted name of the form ``package.module``, any package
structures not already created will still not be created.
.. versionchanged:: 2.4
*name* is removed from :attr:`sys.modules` in error cases.
.. cfunction:: long PyImport_GetMagicNumber()
Return the magic number for Python bytecode files (a.k.a. :file:`.pyc` and
:file:`.pyo` files). The magic number should be present in the first four bytes
of the bytecode file, in little-endian byte order.
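As a rough illustration of the byte order described above, the following plain-C sketch decodes the first four bytes of a bytecode file into a value that could then be compared with the result of :cfunc:`PyImport_GetMagicNumber`. The helper name ``magic_from_header`` is made up for this example, and the sketch deliberately avoids :file:`Python.h` so that it compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Python/C API): decode four bytes
   stored in little-endian order into a long. */
long
magic_from_header(const unsigned char *header)
{
    unsigned long value;

    assert(header != NULL);
    /* Least significant byte comes first in the file. */
    value = (unsigned long)header[0]
          | ((unsigned long)header[1] << 8)
          | ((unsigned long)header[2] << 16)
          | ((unsigned long)header[3] << 24);
    return (long)value;
}
```

A caller would read the first four bytes of the file into a buffer, pass the buffer to the helper, and reject the file if the result differs from the interpreter's magic number.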
.. cfunction:: PyObject* PyImport_GetModuleDict()
Return the dictionary used for the module administration (a.k.a.
``sys.modules``). Note that this is a per-interpreter variable.
.. cfunction:: PyObject* PyImport_GetImporter(PyObject *path)
Return an importer object for a :data:`sys.path`/:attr:`pkg.__path__` item
*path*, possibly by fetching it from the :data:`sys.path_importer_cache`
dict. If it wasn't yet cached, traverse :data:`sys.path_hooks` until a hook
is found that can handle the path item. Return ``None`` if no hook could;
this tells our caller it should fall back to the builtin import mechanism.
Cache the result in :data:`sys.path_importer_cache`. Return a new reference
to the importer object.
.. versionadded:: 2.6
.. cfunction:: void _PyImport_Init()
Initialize the import mechanism. For internal use only.
.. cfunction:: void PyImport_Cleanup()
Empty the module table. For internal use only.
.. cfunction:: void _PyImport_Fini()
Finalize the import mechanism. For internal use only.
.. cfunction:: PyObject* _PyImport_FindExtension(char *, char *)
For internal use only.
.. cfunction:: PyObject* _PyImport_FixupExtension(char *, char *)
For internal use only.
.. cfunction:: int PyImport_ImportFrozenModule(char *name)
Load a frozen module named *name*. Return ``1`` for success, ``0`` if the
module is not found, and ``-1`` with an exception set if the initialization
failed. To access the imported module on a successful load, use
:cfunc:`PyImport_ImportModule`. (Note the misnomer --- this function would
reload the module if it was already imported.)
.. ctype:: struct _frozen
.. index:: single: freeze utility
This is the structure type definition for frozen module descriptors, as
generated by the :program:`freeze` utility (see :file:`Tools/freeze/` in the
Python source distribution). Its definition, found in :file:`Include/import.h`,
is::
   struct _frozen {
       char *name;
       unsigned char *code;
       int size;
   };
.. cvar:: struct _frozen* PyImport_FrozenModules
This pointer is initialized to point to an array of :ctype:`struct _frozen`
records, terminated by one whose members are all *NULL* or zero. When a frozen
module is imported, it is searched for in this table. Third-party code could play
tricks with this to provide a dynamically created collection of frozen modules.
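The sentinel-terminated layout can be sketched in plain C. The structure below is a local mirror of :ctype:`struct _frozen`, renamed so that the example compiles without :file:`Python.h`; ``lookup_frozen`` is a hypothetical helper and is not claimed to be how the interpreter performs the search.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local mirror of the struct _frozen layout shown above. */
struct frozen_entry {
    char *name;
    unsigned char *code;
    int size;
};

/* Hypothetical lookup: walk the table up to the all-NULL/zero sentinel
   and return the matching record, or NULL if the name is absent. */
const struct frozen_entry *
lookup_frozen(const struct frozen_entry *table, const char *name)
{
    const struct frozen_entry *p;

    assert(table != NULL && name != NULL);
    for (p = table; p->name != NULL; p++) {
        if (strcmp(p->name, name) == 0)
            return p;
    }
    return NULL;    /* reached the sentinel without a match */
}
```

Code that builds such a table dynamically must always keep the sentinel record as the last element, since the walk has no other way to know where the table ends.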
.. cfunction:: int PyImport_AppendInittab(char *name, void (*initfunc)(void))
Add a single module to the existing table of built-in modules. This is a
convenience wrapper around :cfunc:`PyImport_ExtendInittab`, returning ``-1`` if
the table could not be extended. The new module can be imported by the name
*name*, and uses the function *initfunc* as the initialization function called
on the first attempted import. This should be called before
:cfunc:`Py_Initialize`.
.. ctype:: struct _inittab
Structure describing a single entry in the list of built-in modules. Each of
these structures gives the name and initialization function for a module built
into the interpreter. Programs which embed Python may use an array of these
structures in conjunction with :cfunc:`PyImport_ExtendInittab` to provide
additional built-in modules. The structure is defined in
:file:`Include/import.h` as::
   struct _inittab {
       char *name;
       void (*initfunc)(void);
   };
.. cfunction:: int PyImport_ExtendInittab(struct _inittab *newtab)
Add a collection of modules to the table of built-in modules. The *newtab*
array must end with a sentinel entry which contains *NULL* for the :attr:`name`
field; failure to provide the sentinel value can result in a memory fault.
Returns ``0`` on success or ``-1`` if insufficient memory could be allocated to
extend the internal table. In the event of failure, no modules are added to the
internal table. This should be called before :cfunc:`Py_Initialize`.
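The two properties described above, the mandatory sentinel and the all-or-nothing behaviour on failure, can be modelled in plain C. This is only a sketch under stated assumptions: ``inittab_entry``, ``count_entries`` and ``extend_table`` are invented for the example, the code does not include :file:`Python.h`, and it is not the interpreter's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of struct _inittab, renamed so no Python header is needed. */
struct inittab_entry {
    char *name;
    void (*initfunc)(void);
};

/* Count entries up to, but not including, the NULL-name sentinel.
   Without a sentinel this loop would run off the end of the array. */
size_t
count_entries(const struct inittab_entry *tab)
{
    size_t n = 0;

    assert(tab != NULL);
    while (tab[n].name != NULL)
        n++;
    return n;
}

/* Hypothetical all-or-nothing extension: build a new table holding the
   entries of *table followed by those of newtab.  On allocation failure
   return -1 and leave *table untouched; on success return 0.  The caller
   keeps ownership of the old table and must free() the new one. */
int
extend_table(struct inittab_entry **table, const struct inittab_entry *newtab)
{
    size_t n_old, n_new;
    struct inittab_entry *merged;

    assert(table != NULL && *table != NULL);
    n_old = count_entries(*table);
    n_new = count_entries(newtab);
    merged = malloc((n_old + n_new + 1) * sizeof *merged);
    if (merged == NULL)
        return -1;
    memcpy(merged, *table, n_old * sizeof *merged);
    /* The +1 copies newtab's sentinel, which terminates the merged table. */
    memcpy(merged + n_old, newtab, (n_new + 1) * sizeof *merged);
    *table = merged;
    return 0;
}
```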


@@ -0,0 +1,27 @@
.. _c-api-index:
##################################
Python/C API Reference Manual
##################################
:Release: |version|
:Date: |today|
This manual documents the API used by C and C++ programmers who want to write
extension modules or embed Python. It is a companion to :ref:`extending-index`,
which describes the general principles of extension writing but does not
document the API functions in detail.
.. toctree::
:maxdepth: 2
intro.rst
veryhigh.rst
refcounting.rst
exceptions.rst
utilities.rst
abstract.rst
concrete.rst
init.rst
memory.rst
objimpl.rst


@@ -0,0 +1,130 @@
.. highlightlang:: c
.. _intobjects:
Plain Integer Objects
---------------------
.. index:: object: integer
.. ctype:: PyIntObject
This subtype of :ctype:`PyObject` represents a Python integer object.
.. cvar:: PyTypeObject PyInt_Type
.. index:: single: IntType (in modules types)
This instance of :ctype:`PyTypeObject` represents the Python plain integer type.
This is the same object as ``int`` and ``types.IntType``.
.. cfunction:: int PyInt_Check(PyObject *o)
Return true if *o* is of type :cdata:`PyInt_Type` or a subtype of
:cdata:`PyInt_Type`.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyInt_CheckExact(PyObject *o)
Return true if *o* is of type :cdata:`PyInt_Type`, but not a subtype of
:cdata:`PyInt_Type`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyInt_FromString(char *str, char **pend, int base)
Return a new :ctype:`PyIntObject` or :ctype:`PyLongObject` based on the string
value in *str*, which is interpreted according to the radix in *base*. If
*pend* is non-*NULL*, ``*pend`` will point to the first character in *str* which
follows the representation of the number. If *base* is ``0``, the radix will be
determined based on the leading characters of *str*: if *str* starts with
``'0x'`` or ``'0X'``, radix 16 will be used; if *str* starts with ``'0'``, radix
8 will be used; otherwise radix 10 will be used. If *base* is not ``0``, it
must be between ``2`` and ``36``, inclusive. Leading spaces are ignored. If
there are no digits, :exc:`ValueError` will be raised. If the string represents
a number too large to be contained within the machine's :ctype:`long int` type
and overflow warnings are being suppressed, a :ctype:`PyLongObject` will be
returned. If overflow warnings are not being suppressed, *NULL* will be
returned in this case.
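The radix selection for a *base* of ``0`` can be summarised by a small plain-C model. ``detect_radix`` is a hypothetical helper written for this illustration only; it covers just the prefix rule stated above and leaves out sign handling, digit validation and overflow, all of which :cfunc:`PyInt_FromString` deals with itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper modelling only the base == 0 rule described above. */
int
detect_radix(const char *str)
{
    assert(str != NULL);
    while (isspace((unsigned char)*str))    /* leading spaces are ignored */
        str++;
    if (str[0] == '0' && (str[1] == 'x' || str[1] == 'X'))
        return 16;                          /* '0x' or '0X' prefix */
    if (str[0] == '0')
        return 8;                           /* any other leading '0' */
    return 10;
}
```

Note that the hexadecimal test has to come first, because a string starting with ``'0x'`` also starts with ``'0'``.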
.. cfunction:: PyObject* PyInt_FromLong(long ival)
Create a new integer object with a value of *ival*.
The current implementation keeps an array of integer objects for all integers
between ``-5`` and ``256``; when you create an int in that range you actually
just get back a reference to the existing object. So it should be possible to
change the value of ``1``. I suspect the behaviour of Python in this case is
undefined. :-)
.. cfunction:: PyObject* PyInt_FromSsize_t(Py_ssize_t ival)
Create a new integer object with a value of *ival*. If the value exceeds
``LONG_MAX``, a long integer object is returned.
.. versionadded:: 2.5
.. cfunction:: long PyInt_AsLong(PyObject *io)
Will first attempt to cast the object to a :ctype:`PyIntObject`, if it is not
already one, and then return its value. If there is an error, ``-1`` is
returned, and the caller should check ``PyErr_Occurred()`` to find out whether
there was an error, or whether the value just happened to be -1.
.. cfunction:: long PyInt_AS_LONG(PyObject *io)
Return the value of the object *io*. No error checking is performed.
.. cfunction:: unsigned long PyInt_AsUnsignedLongMask(PyObject *io)
Will first attempt to cast the object to a :ctype:`PyIntObject` or
:ctype:`PyLongObject`, if it is not already one, and then return its value as
unsigned long. This function does not check for overflow.
.. versionadded:: 2.3
.. cfunction:: unsigned PY_LONG_LONG PyInt_AsUnsignedLongLongMask(PyObject *io)
Will first attempt to cast the object to a :ctype:`PyIntObject` or
:ctype:`PyLongObject`, if it is not already one, and then return its value as
unsigned long long, without checking for overflow.
.. versionadded:: 2.3
.. cfunction:: Py_ssize_t PyInt_AsSsize_t(PyObject *io)
Will first attempt to cast the object to a :ctype:`PyIntObject` or
:ctype:`PyLongObject`, if it is not already one, and then return its value as
:ctype:`Py_ssize_t`.
.. versionadded:: 2.5
.. cfunction:: long PyInt_GetMax()
.. index:: single: LONG_MAX
Return the system's idea of the largest integer it can handle
(:const:`LONG_MAX`, as defined in the system header files).
.. cfunction:: int PyInt_ClearFreeList(void)
Clear the integer free list. Return the number of items that could not
be freed.
.. versionadded:: 2.6


@@ -0,0 +1,635 @@
.. highlightlang:: c
.. _api-intro:
************
Introduction
************
The Application Programmer's Interface to Python gives C and C++ programmers
access to the Python interpreter at a variety of levels. The API is equally
usable from C++, but for brevity it is generally referred to as the Python/C
API. There are two fundamentally different reasons for using the Python/C API.
The first reason is to write *extension modules* for specific purposes; these
are C modules that extend the Python interpreter. This is probably the most
common use. The second reason is to use Python as a component in a larger
application; this technique is generally referred to as :dfn:`embedding` Python
in an application.
Writing an extension module is a relatively well-understood process, where a
"cookbook" approach works well. There are several tools that automate the
process to some extent. While people have embedded Python in other
applications since its early existence, the process of embedding Python is less
straightforward than writing an extension.
Many API functions are useful independent of whether you're embedding or
extending Python; moreover, most applications that embed Python will need to
provide a custom extension as well, so it's probably a good idea to become
familiar with writing an extension before attempting to embed Python in a real
application.
.. _api-includes:
Include Files
=============
All function, type and macro definitions needed to use the Python/C API are
included in your code by the following line::
#include "Python.h"
This implies inclusion of the following standard headers: ``<stdio.h>``,
``<string.h>``, ``<errno.h>``, ``<limits.h>``, and ``<stdlib.h>`` (if
available).
.. warning::
Since Python may define some pre-processor definitions which affect the standard
headers on some systems, you *must* include :file:`Python.h` before any standard
headers are included.
All user-visible names defined by :file:`Python.h` (except those defined by the included
standard headers) have one of the prefixes ``Py`` or ``_Py``. Names beginning
with ``_Py`` are for internal use by the Python implementation and should not be
used by extension writers. Structure member names do not have a reserved prefix.
**Important:** user code should never define names that begin with ``Py`` or
``_Py``. This confuses the reader, and jeopardizes the portability of the user
code to future Python versions, which may define additional names beginning with
one of these prefixes.
The header files are typically installed with Python. On Unix, these are
located in the directories :file:`{prefix}/include/python{version}/` and
:file:`{exec_prefix}/include/python{version}/`, where :envvar:`prefix` and
:envvar:`exec_prefix` are defined by the corresponding parameters to Python's
:program:`configure` script and *version* is ``sys.version[:3]``. On Windows,
the headers are installed in :file:`{prefix}/include`, where :envvar:`prefix` is
the installation directory specified to the installer.
To include the headers, place both directories (if different) on your compiler's
search path for includes. Do *not* place the parent directories on the search
path and then use ``#include <pythonX.Y/Python.h>``; this will break on
multi-platform builds since the platform independent headers under
:envvar:`prefix` include the platform specific headers from
:envvar:`exec_prefix`.
C++ users should note that though the API is defined entirely using C, the
header files do properly declare the entry points to be ``extern "C"``, so there
is no need to do anything special to use the API from C++.
.. _api-objects:
Objects, Types and Reference Counts
===================================
.. index:: object: type
Most Python/C API functions have one or more arguments as well as a return value
of type :ctype:`PyObject\*`. This type is a pointer to an opaque data type
representing an arbitrary Python object. Since all Python object types are
treated the same way by the Python language in most situations (e.g.,
assignments, scope rules, and argument passing), it is only fitting that they
should be represented by a single C type. Almost all Python objects live on the
heap: you never declare an automatic or static variable of type
:ctype:`PyObject`; only pointer variables of type :ctype:`PyObject\*` can be
declared. The sole exceptions are the type objects; since these must never be
deallocated, they are typically static :ctype:`PyTypeObject` objects.
All Python objects (even Python integers) have a :dfn:`type` and a
:dfn:`reference count`. An object's type determines what kind of object it is
(e.g., an integer, a list, or a user-defined function; there are many more as
explained in :ref:`types`). For each of the well-known types there is a macro
to check whether an object is of that type; for instance, ``PyList_Check(a)`` is
true if (and only if) the object pointed to by *a* is a Python list.
.. _api-refcounts:
Reference Counts
----------------
The reference count is important because today's computers have a finite (and
often severely limited) memory size; it counts how many different places there
are that have a reference to an object. Such a place could be another object,
or a global (or static) C variable, or a local variable in some C function.
When an object's reference count becomes zero, the object is deallocated. If
it contains references to other objects, their reference counts are decremented.
Those other objects may be deallocated in turn, if this decrement makes their
reference count become zero, and so on. (There's an obvious problem with
objects that reference each other here; for now, the solution is "don't do
that.")
.. index::
single: Py_INCREF()
single: Py_DECREF()
Reference counts are always manipulated explicitly. The normal way is to use
the macro :cfunc:`Py_INCREF` to increment an object's reference count by one,
and :cfunc:`Py_DECREF` to decrement it by one. The :cfunc:`Py_DECREF` macro
is considerably more complex than the incref one, since it must check whether
the reference count becomes zero and then cause the object's deallocator to be
called. The deallocator is a function pointer contained in the object's type
structure. The type-specific deallocator takes care of decrementing the
reference counts for other objects contained in the object if this is a compound
object type, such as a list, as well as performing any additional finalization
that's needed. There's no chance that the reference count can overflow; at
least as many bits are used to hold the reference count as there are distinct
memory locations in virtual memory (assuming ``sizeof(Py_ssize_t) >= sizeof(void*)``).
Thus, the reference count increment is a simple operation.
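The mechanism described in the last two paragraphs can be modelled in a few lines of plain C. This is a toy model and not the real :ctype:`PyObject` layout: every ``toy_`` name is invented for the illustration, and the ``toy_live_objects`` counter exists only so that the effect of deallocation can be observed.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_object;

/* Type record holding the deallocator, as a function pointer. */
struct toy_type {
    void (*dealloc)(struct toy_object *);
};

struct toy_object {
    long refcnt;
    const struct toy_type *type;
    struct toy_object *child;   /* owned reference, may be NULL */
};

int toy_live_objects = 0;       /* instrumentation for the example only */

void
toy_incref(struct toy_object *op)
{
    assert(op != NULL);
    op->refcnt++;               /* the increment is a single operation */
}

void
toy_decref(struct toy_object *op)
{
    assert(op != NULL && op->refcnt > 0);
    if (--op->refcnt == 0)
        op->type->dealloc(op);  /* the count reached zero: deallocate */
}

/* Deallocator for a compound object: release the contained reference
   first, then free the object itself. */
void
toy_dealloc(struct toy_object *op)
{
    if (op->child != NULL)
        toy_decref(op->child);
    toy_live_objects--;
    free(op);
}

const struct toy_type toy_type_record = { toy_dealloc };

/* Create an object with one reference.  A non-NULL child reference is
   taken over on success; on allocation failure it stays with the caller. */
struct toy_object *
toy_new(struct toy_object *child)
{
    struct toy_object *op = malloc(sizeof *op);

    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->type = &toy_type_record;
    op->child = child;
    toy_live_objects++;
    return op;
}
```

The decrement is the more involved of the two operations because it has to test for zero and dispatch to the deallocator, which in turn may trigger further decrements.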
It is not necessary to increment an object's reference count for every local
variable that contains a pointer to an object. In theory, the object's
reference count goes up by one when the variable is made to point to it and it
goes down by one when the variable goes out of scope. However, these two
cancel each other out, so at the end the reference count hasn't changed. The
only real reason to use the reference count is to prevent the object from being
deallocated as long as our variable is pointing to it. If we know that there
is at least one other reference to the object that lives at least as long as
our variable, there is no need to increment the reference count temporarily.
An important situation where this arises is in objects that are passed as
arguments to C functions in an extension module that are called from Python;
the call mechanism guarantees to hold a reference to every argument for the
duration of the call.
However, a common pitfall is to extract an object from a list and hold on to it
for a while without incrementing its reference count. Some other operation might
conceivably remove the object from the list, decrementing its reference count
and possibly deallocating it. The real danger is that innocent-looking
operations may invoke arbitrary Python code which could do this; there is a code
path which allows control to flow back to the user from a :cfunc:`Py_DECREF`, so
almost any operation is potentially dangerous.
A safe approach is to always use the generic operations (functions whose name
begins with ``PyObject_``, ``PyNumber_``, ``PySequence_`` or ``PyMapping_``).
These operations always increment the reference count of the object they return.
This leaves the caller with the responsibility to call :cfunc:`Py_DECREF` when
they are done with the result; this soon becomes second nature.
.. _api-refcountdetails:
Reference Count Details
^^^^^^^^^^^^^^^^^^^^^^^
The reference count behavior of functions in the Python/C API is best explained
in terms of *ownership of references*. Ownership pertains to references, never
to objects (objects are not owned: they are always shared). "Owning a
reference" means being responsible for calling :cfunc:`Py_DECREF` on it when the
reference is no longer needed. Ownership can also be transferred, meaning that
the code that receives ownership of the reference then becomes responsible for
eventually decref'ing it by calling :cfunc:`Py_DECREF` or :cfunc:`Py_XDECREF`
when it's no longer needed---or passing on this responsibility (usually to its
caller). When a function passes ownership of a reference on to its caller, the
caller is said to receive a *new* reference. When no ownership is transferred,
the caller is said to *borrow* the reference. Nothing needs to be done for a
borrowed reference.
Conversely, when a calling function passes in a reference to an object, there
are two possibilities: the function *steals* a reference to the object, or it
does not. *Stealing a reference* means that when you pass a reference to a
function, that function assumes that it now owns that reference, and you are not
responsible for it any longer.
.. index::
single: PyList_SetItem()
single: PyTuple_SetItem()
Few functions steal references; the two notable exceptions are
:cfunc:`PyList_SetItem` and :cfunc:`PyTuple_SetItem`, which steal a reference
to the item (but not to the tuple or list into which the item is put!). These
functions were designed to steal a reference because of a common idiom for
populating a tuple or list with newly created objects; for example, the code to
create the tuple ``(1, 2, "three")`` could look like this (forgetting about
error handling for the moment; a better way to code this is shown below)::
   PyObject *t;

   t = PyTuple_New(3);
   PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
   PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
   PyTuple_SetItem(t, 2, PyString_FromString("three"));
Here, :cfunc:`PyInt_FromLong` returns a new reference which is immediately
stolen by :cfunc:`PyTuple_SetItem`. When you want to keep using an object
although the reference to it will be stolen, use :cfunc:`Py_INCREF` to grab
another reference before calling the reference-stealing function.
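The bookkeeping behind this advice can be traced with a toy model in plain C. Nothing here is part of the Python/C API: an "object" is reduced to a bare reference count, ``toy_slot_set`` stands in for a reference-stealing setter, and deallocation at a count of zero is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: an "object" is just a reference count here. */
struct toy_ref {
    long refcnt;
};

/* One-slot container standing in for a tuple. */
struct toy_slot {
    struct toy_ref *item;       /* owned reference, or NULL */
};

/* Like a reference-stealing setter: the slot takes over the caller's
   reference to item, so the count is not incremented. */
void
toy_slot_set(struct toy_slot *slot, struct toy_ref *item)
{
    assert(slot != NULL && item != NULL);
    if (slot->item != NULL)
        slot->item->refcnt--;   /* release the reference being replaced */
    slot->item = item;
}

/* Caller that wants to keep using item after handing it over. */
void
store_and_keep(struct toy_slot *slot, struct toy_ref *item)
{
    assert(item != NULL);
    item->refcnt++;             /* grab a second reference first ... */
    toy_slot_set(slot, item);   /* ... because this one is stolen */
}
```

After ``store_and_keep`` the count is two: one reference is owned by the slot and one by the caller, who remains responsible for releasing it later.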
Incidentally, :cfunc:`PyTuple_SetItem` is the *only* way to set tuple items;
:cfunc:`PySequence_SetItem` and :cfunc:`PyObject_SetItem` refuse to do this
since tuples are an immutable data type. You should only use
:cfunc:`PyTuple_SetItem` for tuples that you are creating yourself.
Equivalent code for populating a list can be written using :cfunc:`PyList_New`
and :cfunc:`PyList_SetItem`.
However, in practice, you will rarely use these ways of creating and populating
a tuple or list. There's a generic function, :cfunc:`Py_BuildValue`, that can
create most common objects from C values, directed by a :dfn:`format string`.
For example, the above two blocks of code could be replaced by the following
(which also takes care of the error checking)::
   PyObject *tuple, *list;

   tuple = Py_BuildValue("(iis)", 1, 2, "three");
   list = Py_BuildValue("[iis]", 1, 2, "three");
It is much more common to use :cfunc:`PyObject_SetItem` and friends with items
whose references you are only borrowing, like arguments that were passed in to
the function you are writing. In that case, their behaviour regarding reference
counts is much saner, since you don't have to increment a reference count so you
can give a reference away ("have it be stolen"). For example, this function
sets all items of a list (actually, any mutable sequence) to a given item::
   int
   set_all(PyObject *target, PyObject *item)
   {
       Py_ssize_t i, n;

       n = PyObject_Length(target);
       if (n < 0)
           return -1;
       for (i = 0; i < n; i++) {
           PyObject *index = PyInt_FromSsize_t(i);
           if (!index)
               return -1;
           if (PyObject_SetItem(target, index, item) < 0) {
               Py_DECREF(index);   /* Don't leak index on failure */
               return -1;
           }
           Py_DECREF(index);
       }
       return 0;
   }
.. index:: single: set_all()
The situation is slightly different for function return values. While passing
a reference to most functions does not change your ownership responsibilities
for that reference, many functions that return a reference to an object give
you ownership of the reference. The reason is simple: in many cases, the
returned object is created on the fly, and the reference you get is the only
reference to the object. Therefore, the generic functions that return object
references, like :cfunc:`PyObject_GetItem` and :cfunc:`PySequence_GetItem`,
always return a new reference (the caller becomes the owner of the reference).
It is important to realize that whether you own a reference returned by a
function depends on which function you call only --- *the plumage* (the type of
the object passed as an argument to the function) *doesn't enter into it!*
Thus, if you extract an item from a list using :cfunc:`PyList_GetItem`, you
don't own the reference --- but if you obtain the same item from the same list
using :cfunc:`PySequence_GetItem` (which happens to take exactly the same
arguments), you do own a reference to the returned object.
.. index::
single: PyList_GetItem()
single: PySequence_GetItem()
Here is an example of how you could write a function that computes the sum of
the items in a list of integers; once using :cfunc:`PyList_GetItem`, and once
using :cfunc:`PySequence_GetItem`. ::
   long
   sum_list(PyObject *list)
   {
       Py_ssize_t i, n;
       long total = 0;
       PyObject *item;

       n = PyList_Size(list);
       if (n < 0)
           return -1; /* Not a list */
       for (i = 0; i < n; i++) {
           item = PyList_GetItem(list, i); /* Can't fail */
           if (!PyInt_Check(item)) continue; /* Skip non-integers */
           total += PyInt_AsLong(item);
       }
       return total;
   }
.. index:: single: sum_list()
::
   long
   sum_sequence(PyObject *sequence)
   {
       Py_ssize_t i, n;
       long total = 0;
       PyObject *item;

       n = PySequence_Length(sequence);
       if (n < 0)
           return -1; /* Has no length */
       for (i = 0; i < n; i++) {
           item = PySequence_GetItem(sequence, i);
           if (item == NULL)
               return -1; /* Not a sequence, or other failure */
           if (PyInt_Check(item))
               total += PyInt_AsLong(item);
           Py_DECREF(item); /* Discard reference ownership */
       }
       return total;
   }
.. index:: single: sum_sequence()
.. _api-types:
Types
-----
There are few other data types that play a significant role in the Python/C
API; most are simple C types such as :ctype:`int`, :ctype:`long`,
:ctype:`double` and :ctype:`char\*`. A few structure types are used to
describe static tables used to list the functions exported by a module or the
data attributes of a new object type, and another is used to describe the value
of a complex number. These will be discussed together with the functions that
use them.
.. _api-exceptions:
Exceptions
==========
The Python programmer only needs to deal with exceptions if specific error
handling is required; unhandled exceptions are automatically propagated to the
caller, then to the caller's caller, and so on, until they reach the top-level
interpreter, where they are reported to the user accompanied by a stack
traceback.
.. index:: single: PyErr_Occurred()
For C programmers, however, error checking always has to be explicit. All
functions in the Python/C API can raise exceptions, unless an explicit claim is
made otherwise in a function's documentation. In general, when a function
encounters an error, it sets an exception, discards any object references that
it owns, and returns an error indicator --- usually *NULL* or ``-1``. A few
functions return a Boolean true/false result, with false indicating an error.
Very few functions return no explicit error indicator or have an ambiguous
return value, and require explicit testing for errors with
:cfunc:`PyErr_Occurred`.
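The ambiguous-return-value case is easiest to see in a small model. The sketch below is plain C and does not use :file:`Python.h`: ``toy_error`` is an invented stand-in for the exception state, ``toy_as_long`` plays the part of a function whose error indicator ``-1`` is also a legitimate result, and ``toy_convert`` shows the kind of test a caller would make with :cfunc:`PyErr_Occurred`.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the exception state: NULL means "no error". */
const char *toy_error = NULL;

/* Hypothetical converter with an ambiguous return value: -1 is both a
   legitimate result and the error indicator. */
long
toy_as_long(const char *text)
{
    char *end;
    long value;

    assert(text != NULL);
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        toy_error = "not an integer";   /* set the error state */
        return -1;
    }
    return value;
}

/* Caller: -1 alone proves nothing; only -1 together with a pending
   error is a failure. */
int
toy_convert(const char *text, long *out)
{
    long value;

    assert(out != NULL);
    value = toy_as_long(text);
    if (value == -1 && toy_error != NULL)
        return -1;              /* propagate; do not overwrite the error */
    *out = value;
    return 0;
}
```

The caller passes the failure on with its own error indicator and leaves the recorded error untouched, so the original cause stays available to whoever finally handles it.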
.. index::
single: PyErr_SetString()
single: PyErr_Clear()
Exception state is maintained in per-thread storage (this is equivalent to
using global storage in an unthreaded application). A thread can be in one of
two states: an exception has occurred, or not. The function
:cfunc:`PyErr_Occurred` can be used to check for this: it returns a borrowed
reference to the exception type object when an exception has occurred, and
*NULL* otherwise. There are a number of functions to set the exception state:
:cfunc:`PyErr_SetString` is the most common (though not the most general)
function to set the exception state, and :cfunc:`PyErr_Clear` clears the
exception state.
.. index::
single: exc_type (in module sys)
single: exc_value (in module sys)
single: exc_traceback (in module sys)
The full exception state consists of three objects (all of which can be
*NULL*): the exception type, the corresponding exception value, and the
traceback. These have the same meanings as the Python objects
``sys.exc_type``, ``sys.exc_value``, and ``sys.exc_traceback``; however, they
are not the same: the Python objects represent the last exception being handled
by a Python :keyword:`try` ... :keyword:`except` statement, while the C level
exception state only exists while an exception is being passed on between C
functions until it reaches the Python bytecode interpreter's main loop, which
takes care of transferring it to ``sys.exc_type`` and friends.
.. index:: single: exc_info() (in module sys)
Note that starting with Python 1.5, the preferred, thread-safe way to access the
exception state from Python code is to call the function :func:`sys.exc_info`,
which returns the per-thread exception state for Python code. Also, the
semantics of both ways to access the exception state have changed so that a
function which catches an exception will save and restore its thread's exception
state so as to preserve the exception state of its caller. This prevents common
bugs in exception handling code caused by an innocent-looking function
overwriting the exception being handled; it also reduces the often unwanted
lifetime extension for objects that are referenced by the stack frames in the
traceback.
As a general principle, a function that calls another function to perform some
task should check whether the called function raised an exception, and if so,
pass the exception state on to its caller. It should discard any object
references that it owns, and return an error indicator, but it should *not* set
another exception --- that would overwrite the exception that was just raised,
and lose important information about the exact cause of the error.
.. index:: single: sum_sequence()
A simple example of detecting exceptions and passing them on is shown in the
:cfunc:`sum_sequence` example above. It so happens that this example doesn't
need to clean up any owned references when it detects an error. The following
example function shows some error cleanup. First, to remind you why you like
Python, we show the equivalent Python code::
   def incr_item(dict, key):
       try:
           item = dict[key]
       except KeyError:
           item = 0
       dict[key] = item + 1
.. index:: single: incr_item()
Here is the corresponding C code, in all its glory::
   int
   incr_item(PyObject *dict, PyObject *key)
   {
       /* Objects all initialized to NULL for Py_XDECREF */
       PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
       int rv = -1; /* Return value initialized to -1 (failure) */

       item = PyObject_GetItem(dict, key);
       if (item == NULL) {
           /* Handle KeyError only: */
           if (!PyErr_ExceptionMatches(PyExc_KeyError))
               goto error;

           /* Clear the error and use zero: */
           PyErr_Clear();
           item = PyInt_FromLong(0L);
           if (item == NULL)
               goto error;
       }
       const_one = PyInt_FromLong(1L);
       if (const_one == NULL)
           goto error;

       incremented_item = PyNumber_Add(item, const_one);
       if (incremented_item == NULL)
           goto error;

       if (PyObject_SetItem(dict, key, incremented_item) < 0)
           goto error;
       rv = 0; /* Success */
       /* Continue with cleanup code */

    error:
       /* Cleanup code, shared by success and failure path */

       /* Use Py_XDECREF() to ignore NULL references */
       Py_XDECREF(item);
       Py_XDECREF(const_one);
       Py_XDECREF(incremented_item);

       return rv; /* -1 for error, 0 for success */
   }
.. index:: single: incr_item()
.. index::
single: PyErr_ExceptionMatches()
single: PyErr_Clear()
single: Py_XDECREF()
This example represents an endorsed use of the ``goto`` statement in C!
It illustrates the use of :cfunc:`PyErr_ExceptionMatches` and
:cfunc:`PyErr_Clear` to handle specific exceptions, and the use of
:cfunc:`Py_XDECREF` to dispose of owned references that may be *NULL* (note the
``'X'`` in the name; :cfunc:`Py_DECREF` would crash when confronted with a
*NULL* reference). It is important that the variables used to hold owned
references are initialized to *NULL* for this to work; likewise, the proposed
return value is initialized to ``-1`` (failure) and only set to success after
the final call made is successful.
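
The same single-exit cleanup shape can be tried without the interpreter. The
following is a minimal plain-C sketch, not part of the Python API:
``join_copy`` is a hypothetical helper invented for illustration, and
``free()`` stands in for :cfunc:`Py_XDECREF` because it also accepts *NULL*.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Python API function): store a newly
   allocated copy of "a followed by b" in *out.
   Returns 0 on success, -1 on failure. */
int
join_copy(const char *a, const char *b, char **out)
{
    /* Owned pointers start out NULL so the cleanup code is always safe */
    char *left = NULL, *right = NULL, *joined = NULL;
    int rv = -1;                    /* assume failure until the end */

    if (a == NULL || b == NULL || out == NULL)
        goto done;
    left = (char *) malloc(strlen(a) + 1);
    if (left == NULL)
        goto done;
    strcpy(left, a);
    right = (char *) malloc(strlen(b) + 1);
    if (right == NULL)
        goto done;
    strcpy(right, b);
    joined = (char *) malloc(strlen(left) + strlen(right) + 1);
    if (joined == NULL)
        goto done;
    strcpy(joined, left);
    strcat(joined, right);

    *out = joined;
    joined = NULL;                  /* ownership moved to the caller */
    rv = 0;                         /* set only after the last step worked */

done:
    /* Shared by the success and failure paths; free(NULL) is a no-op */
    free(left);
    free(right);
    free(joined);
    return rv;
}
```

As in ``incr_item``, every early exit jumps to one label, so no path can
forget to release a resource that it owns.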
.. _api-embedding:
Embedding Python
================
The one important task that only embedders (as opposed to extension writers) of
the Python interpreter have to worry about is the initialization, and possibly
the finalization, of the Python interpreter. Most functionality of the
interpreter can only be used after the interpreter has been initialized.
.. index::
single: Py_Initialize()
module: __builtin__
module: __main__
module: sys
module: exceptions
triple: module; search; path
single: path (in module sys)
The basic initialization function is :cfunc:`Py_Initialize`. This initializes
the table of loaded modules, and creates the fundamental modules
:mod:`__builtin__`, :mod:`__main__`, :mod:`sys`, and :mod:`exceptions`. It also
initializes the module search path (``sys.path``).
.. index:: single: PySys_SetArgv()
:cfunc:`Py_Initialize` does not set the "script argument list" (``sys.argv``).
If this variable is needed by Python code that will be executed later, it must
be set explicitly with a call to ``PySys_SetArgv(argc, argv)`` subsequent to
the call to :cfunc:`Py_Initialize`.
On most systems (in particular, on Unix and Windows, although the details are
slightly different), :cfunc:`Py_Initialize` calculates the module search path
based upon its best guess for the location of the standard Python interpreter
executable, assuming that the Python library is found in a fixed location
relative to the Python interpreter executable. In particular, it looks for a
directory named :file:`lib/python{X.Y}` relative to the parent directory
where the executable named :file:`python` is found on the shell command search
path (the environment variable :envvar:`PATH`).
For instance, if the Python executable is found in
:file:`/usr/local/bin/python`, it will assume that the libraries are in
:file:`/usr/local/lib/python{X.Y}`. (In fact, this particular path is also
the "fallback" location, used when no executable file named :file:`python` is
found along :envvar:`PATH`.) The user can override this behavior by setting the
environment variable :envvar:`PYTHONHOME`, or insert additional directories in
front of the standard path by setting :envvar:`PYTHONPATH`.
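
The path rule described above can be sketched in plain C. This is only an
illustration of the "parent directory of the executable, then
``lib/pythonX.Y``" rule; the real search lives in :file:`Modules/getpath.c`
and does more than this. The helper name ``guess_lib_dir`` and its parameters
are assumptions made for the example, and the version string is supplied by
the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given ".../bin/python" and a version such as "X.Y",
   write ".../lib/pythonX.Y" into out.
   Returns 0 on success, -1 if the path has no parent directory or the
   result does not fit in out. */
int
guess_lib_dir(const char *exe, const char *version, char *out, size_t outlen)
{
    const char *slash, *parent_end;
    int n;

    if (exe == NULL || version == NULL || out == NULL)
        return -1;
    slash = strrchr(exe, '/');          /* slash before the file name */
    if (slash == NULL || slash == exe)
        return -1;
    /* Walk back to the slash that ends the parent directory */
    parent_end = slash - 1;
    while (parent_end > exe && *parent_end != '/')
        parent_end--;
    if (*parent_end != '/')
        return -1;
    n = snprintf(out, outlen, "%.*s/lib/python%s",
                 (int) (parent_end - exe), exe, version);
    return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

Called with :file:`/usr/local/bin/python`, the helper keeps
:file:`/usr/local` and appends the library directory, matching the example in
the text.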
.. index::
single: Py_SetProgramName()
single: Py_GetPath()
single: Py_GetPrefix()
single: Py_GetExecPrefix()
single: Py_GetProgramFullPath()
The embedding application can steer the search by calling
``Py_SetProgramName(file)`` *before* calling :cfunc:`Py_Initialize`. Note that
:envvar:`PYTHONHOME` still overrides this and :envvar:`PYTHONPATH` is still
inserted in front of the standard path. An application that requires total
control has to provide its own implementation of :cfunc:`Py_GetPath`,
:cfunc:`Py_GetPrefix`, :cfunc:`Py_GetExecPrefix`, and
:cfunc:`Py_GetProgramFullPath` (all defined in :file:`Modules/getpath.c`).
.. index:: single: Py_IsInitialized()
Sometimes, it is desirable to "uninitialize" Python. For instance, the
application may want to start over (make another call to
:cfunc:`Py_Initialize`) or the application is simply done with its use of
Python and wants to free memory allocated by Python. This can be accomplished
by calling :cfunc:`Py_Finalize`. The function :cfunc:`Py_IsInitialized` returns
true if Python is currently in the initialized state. More information about
these functions is given in a later chapter. Notice that :cfunc:`Py_Finalize`
does *not* free all memory allocated by the Python interpreter; for example,
memory allocated by extension modules currently cannot be released.
.. _api-debugging:
Debugging Builds
================
Python can be built with several macros to enable extra checks of the
interpreter and extension modules. These checks tend to add a large amount of
overhead to the runtime so they are not enabled by default.
A full list of the various types of debugging builds is in the file
:file:`Misc/SpecialBuilds.txt` in the Python source distribution. Builds are
available that support tracing of reference counts, debugging the memory
allocator, or low-level profiling of the main interpreter loop. Only the most
frequently-used builds will be described in the remainder of this section.
Compiling the interpreter with the :cmacro:`Py_DEBUG` macro defined produces
what is generally meant by "a debug build" of Python. :cmacro:`Py_DEBUG` is
enabled in the Unix build by adding :option:`--with-pydebug` to the
:file:`configure` command. It is also implied by the presence of the
not-Python-specific :cmacro:`_DEBUG` macro. When :cmacro:`Py_DEBUG` is enabled
in the Unix build, compiler optimization is disabled.
In addition to the reference count debugging described below, the following
extra checks are performed:
* Extra checks are added to the object allocator.
* Extra checks are added to the parser and compiler.
* Downcasts from wide types to narrow types are checked for loss of information.
* A number of assertions are added to the dictionary and set implementations.
In addition, the set object acquires a :meth:`test_c_api` method.
* Sanity checks of the input arguments are added to frame creation.
* The storage for long ints is initialized with a known invalid pattern to catch
reference to uninitialized digits.
* Low-level tracing and extra exception checking are added to the runtime
virtual machine.
* Extra checks are added to the memory arena implementation.
* Extra debugging is added to the thread module.
There may be additional checks not mentioned here.
Defining :cmacro:`Py_TRACE_REFS` enables reference tracing. When defined, a
circular doubly linked list of active objects is maintained by adding two extra
fields to every :ctype:`PyObject`. Total allocations are tracked as well. Upon
exit, all existing references are printed. (In interactive mode this happens
after every statement run by the interpreter.) This macro is implied by
:cmacro:`Py_DEBUG`.
Please refer to :file:`Misc/SpecialBuilds.txt` in the Python source distribution
for more detailed information.


@@ -0,0 +1,50 @@
.. highlightlang:: c
.. _iterator:
Iterator Protocol
=================
.. versionadded:: 2.2
There are only a couple of functions specifically for working with iterators.
.. cfunction:: int PyIter_Check(PyObject *o)
Return true if the object *o* supports the iterator protocol.
.. cfunction:: PyObject* PyIter_Next(PyObject *o)
Return the next value from the iteration *o*. If the object is an iterator,
this retrieves the next value from the iteration, and returns *NULL* with no
exception set if there are no remaining items. If the object is not an
iterator, :exc:`TypeError` is raised; if there is an error in retrieving the
item, the function returns *NULL* and passes along the exception.
To write a loop which iterates over an iterator, the C code should look
something like this::

   PyObject *iterator = PyObject_GetIter(obj);
   PyObject *item;

   if (iterator == NULL) {
       /* propagate error */
   }

   while ((item = PyIter_Next(iterator)) != NULL) {
       /* do something with item */
       ...
       /* release reference when done */
       Py_DECREF(item);
   }

   Py_DECREF(iterator);

   if (PyErr_Occurred()) {
       /* propagate error */
   }
   else {
       /* continue doing useful work */
   }


@@ -0,0 +1,62 @@
.. highlightlang:: c
.. _iterator-objects:
Iterator Objects
----------------
Python provides two general-purpose iterator objects. The first, a sequence
iterator, works with an arbitrary sequence supporting the :meth:`__getitem__`
method. The second works with a callable object and a sentinel value, calling
the callable for each item in the sequence, and ending the iteration when the
sentinel value is returned.
.. cvar:: PyTypeObject PySeqIter_Type
Type object for iterator objects returned by :cfunc:`PySeqIter_New` and the
one-argument form of the :func:`iter` built-in function for built-in sequence
types.
.. versionadded:: 2.2
.. cfunction:: int PySeqIter_Check(op)
Return true if the type of *op* is :cdata:`PySeqIter_Type`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PySeqIter_New(PyObject *seq)
Return an iterator that works with a general sequence object, *seq*. The
iteration ends when the sequence raises :exc:`IndexError` for the subscripting
operation.
.. versionadded:: 2.2
.. cvar:: PyTypeObject PyCallIter_Type
Type object for iterator objects returned by :cfunc:`PyCallIter_New` and the
two-argument form of the :func:`iter` built-in function.
.. versionadded:: 2.2
.. cfunction:: int PyCallIter_Check(op)
Return true if the type of *op* is :cdata:`PyCallIter_Type`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyCallIter_New(PyObject *callable, PyObject *sentinel)
Return a new iterator. The first parameter, *callable*, can be any Python
callable object that can be called with no parameters; each call to it should
return the next item in the iteration. When *callable* returns a value equal to
*sentinel*, the iteration will be terminated.
.. versionadded:: 2.2


@@ -0,0 +1,147 @@
.. highlightlang:: c
.. _listobjects:
List Objects
------------
.. index:: object: list
.. ctype:: PyListObject
This subtype of :ctype:`PyObject` represents a Python list object.
.. cvar:: PyTypeObject PyList_Type
.. index:: single: ListType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python list type. This is
the same object as ``list`` and ``types.ListType`` in the Python layer.
.. cfunction:: int PyList_Check(PyObject *p)
Return true if *p* is a list object or an instance of a subtype of the list
type.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyList_CheckExact(PyObject *p)
Return true if *p* is a list object, but not an instance of a subtype of the
list type.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyList_New(Py_ssize_t len)
Return a new list of length *len* on success, or *NULL* on failure.
.. note::
If *len* is greater than zero, the returned list object's items are set to
``NULL``. Thus you cannot use abstract API functions such as
:cfunc:`PySequence_SetItem` or expose the object to Python code before setting
all items to a real object with :cfunc:`PyList_SetItem`.
.. cfunction:: Py_ssize_t PyList_Size(PyObject *list)
.. index:: builtin: len
Return the length of the list object in *list*; this is equivalent to
``len(list)`` on a list object.
.. cfunction:: Py_ssize_t PyList_GET_SIZE(PyObject *list)
Macro form of :cfunc:`PyList_Size` without error checking.
.. cfunction:: PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index)
Return the object at position *index* in the list pointed to by *list*. The
position must be non-negative; indexing from the end of the list is not
supported. If *index* is out of bounds, return *NULL* and set an
:exc:`IndexError` exception.
.. cfunction:: PyObject* PyList_GET_ITEM(PyObject *list, Py_ssize_t i)
Macro form of :cfunc:`PyList_GetItem` without error checking.
.. cfunction:: int PyList_SetItem(PyObject *list, Py_ssize_t index, PyObject *item)
Set the item at index *index* in *list* to *item*. Return ``0`` on success or
``-1`` on failure.
.. note::
This function "steals" a reference to *item* and discards a reference to an item
already in the list at the affected position.
.. cfunction:: void PyList_SET_ITEM(PyObject *list, Py_ssize_t i, PyObject *o)
Macro form of :cfunc:`PyList_SetItem` without error checking. This is normally
only used to fill in new lists where there is no previous content.
.. note::
This macro "steals" a reference to *o*, and, unlike
:cfunc:`PyList_SetItem`, does *not* discard a reference to any item that is
being replaced; any reference in *list* at position *i* will be leaked.
.. cfunction:: int PyList_Insert(PyObject *list, Py_ssize_t index, PyObject *item)
Insert the item *item* into list *list* in front of index *index*. Return ``0``
if successful; return ``-1`` and set an exception if unsuccessful. Analogous to
``list.insert(index, item)``.
.. cfunction:: int PyList_Append(PyObject *list, PyObject *item)
Append the object *item* at the end of list *list*. Return ``0`` if successful;
return ``-1`` and set an exception if unsuccessful. Analogous to
``list.append(item)``.
.. cfunction:: PyObject* PyList_GetSlice(PyObject *list, Py_ssize_t low, Py_ssize_t high)
Return a list of the objects in *list* containing the objects *between* *low*
and *high*. Return *NULL* and set an exception if unsuccessful. Analogous to
``list[low:high]``.
.. cfunction:: int PyList_SetSlice(PyObject *list, Py_ssize_t low, Py_ssize_t high, PyObject *itemlist)
Set the slice of *list* between *low* and *high* to the contents of *itemlist*.
Analogous to ``list[low:high] = itemlist``. The *itemlist* may be *NULL*,
indicating the assignment of an empty list (slice deletion). Return ``0`` on
success, ``-1`` on failure.
.. cfunction:: int PyList_Sort(PyObject *list)
Sort the items of *list* in place. Return ``0`` on success, ``-1`` on failure.
This is equivalent to ``list.sort()``.
.. cfunction:: int PyList_Reverse(PyObject *list)
Reverse the items of *list* in place. Return ``0`` on success, ``-1`` on
failure. This is the equivalent of ``list.reverse()``.
.. cfunction:: PyObject* PyList_AsTuple(PyObject *list)
.. index:: builtin: tuple
Return a new tuple object containing the contents of *list*; equivalent to
``tuple(list)``.


@@ -0,0 +1,209 @@
.. highlightlang:: c
.. _longobjects:
Long Integer Objects
--------------------
.. index:: object: long integer
.. ctype:: PyLongObject
This subtype of :ctype:`PyObject` represents a Python long integer object.
.. cvar:: PyTypeObject PyLong_Type
.. index:: single: LongType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python long integer type.
This is the same object as ``long`` and ``types.LongType``.
.. cfunction:: int PyLong_Check(PyObject *p)
Return true if its argument is a :ctype:`PyLongObject` or a subtype of
:ctype:`PyLongObject`.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyLong_CheckExact(PyObject *p)
Return true if its argument is a :ctype:`PyLongObject`, but not a subtype of
:ctype:`PyLongObject`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyLong_FromLong(long v)
Return a new :ctype:`PyLongObject` object from *v*, or *NULL* on failure.
.. cfunction:: PyObject* PyLong_FromUnsignedLong(unsigned long v)
Return a new :ctype:`PyLongObject` object from a C :ctype:`unsigned long`, or
*NULL* on failure.
.. cfunction:: PyObject* PyLong_FromSsize_t(Py_ssize_t v)
Return a new :ctype:`PyLongObject` object from a C :ctype:`Py_ssize_t`, or
*NULL* on failure.
.. versionadded:: 2.6
.. cfunction:: PyObject* PyLong_FromSize_t(size_t v)
Return a new :ctype:`PyLongObject` object from a C :ctype:`size_t`, or
*NULL* on failure.
.. versionadded:: 2.6
.. cfunction:: PyObject* PyLong_FromLongLong(PY_LONG_LONG v)
Return a new :ctype:`PyLongObject` object from a C :ctype:`long long`, or *NULL*
on failure.
.. cfunction:: PyObject* PyLong_FromUnsignedLongLong(unsigned PY_LONG_LONG v)
Return a new :ctype:`PyLongObject` object from a C :ctype:`unsigned long long`,
or *NULL* on failure.
.. cfunction:: PyObject* PyLong_FromDouble(double v)
Return a new :ctype:`PyLongObject` object from the integer part of *v*, or
*NULL* on failure.
.. cfunction:: PyObject* PyLong_FromString(char *str, char **pend, int base)
Return a new :ctype:`PyLongObject` based on the string value in *str*, which is
interpreted according to the radix in *base*. If *pend* is non-*NULL*,
``*pend`` will point to the first character in *str* which follows the
representation of the number. If *base* is ``0``, the radix will be determined
based on the leading characters of *str*: if *str* starts with ``'0x'`` or
``'0X'``, radix 16 will be used; if *str* starts with ``'0'``, radix 8 will be
used; otherwise radix 10 will be used. If *base* is not ``0``, it must be
between ``2`` and ``36``, inclusive. Leading spaces are ignored. If there are
no digits, :exc:`ValueError` will be raised.
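
The C library's ``strtol()`` follows a similar convention when its base
argument is ``0``, and its end pointer plays the same role as *pend*. The
sketch below uses it to show the prefix rule in a form that runs without
Python headers. ``parse_auto_radix`` is a hypothetical wrapper written for
this example; unlike :cfunc:`PyLong_FromString`, it is limited to the range
of a C :ctype:`long` and reports a missing number by leaving the end pointer
at the start of the input rather than by raising :exc:`ValueError`.

```c
#include <stdlib.h>

/* Hypothetical helper: parse s, inferring the radix from its prefix.
   If rest is non-NULL, *rest receives a pointer to the first character
   that follows the number. */
long
parse_auto_radix(const char *s, const char **rest)
{
    char *end;
    /* Base 0: "0x"/"0X" selects 16, a leading "0" selects 8, else 10 */
    long value = strtol(s, &end, 0);

    if (rest != NULL)
        *rest = end;
    return value;
}
```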
.. cfunction:: PyObject* PyLong_FromUnicode(Py_UNICODE *u, Py_ssize_t length, int base)
Convert a sequence of Unicode digits to a Python long integer value. The first
parameter, *u*, points to the first character of the Unicode string, *length*
gives the number of characters, and *base* is the radix for the conversion. The
radix must be in the range [2, 36]; if it is out of range, :exc:`ValueError`
will be raised.
.. versionadded:: 1.6
.. cfunction:: PyObject* PyLong_FromVoidPtr(void *p)
Create a Python integer or long integer from the pointer *p*. The pointer value
can be retrieved from the resulting value using :cfunc:`PyLong_AsVoidPtr`.
.. versionadded:: 1.5.2
.. versionchanged:: 2.5
If the integer is larger than LONG_MAX, a positive long integer is returned.
.. cfunction:: long PyLong_AsLong(PyObject *pylong)
.. index::
single: LONG_MAX
single: OverflowError (built-in exception)
Return a C :ctype:`long` representation of the contents of *pylong*. If
*pylong* is greater than :const:`LONG_MAX`, an :exc:`OverflowError` is raised
and ``-1`` will be returned.
.. cfunction:: Py_ssize_t PyLong_AsSsize_t(PyObject *pylong)
.. index::
single: PY_SSIZE_T_MAX
single: OverflowError (built-in exception)
Return a C :ctype:`Py_ssize_t` representation of the contents of *pylong*. If
*pylong* is greater than :const:`PY_SSIZE_T_MAX`, an :exc:`OverflowError` is raised
and ``-1`` will be returned.
.. versionadded:: 2.6
.. cfunction:: unsigned long PyLong_AsUnsignedLong(PyObject *pylong)
.. index::
single: ULONG_MAX
single: OverflowError (built-in exception)
Return a C :ctype:`unsigned long` representation of the contents of *pylong*.
If *pylong* is greater than :const:`ULONG_MAX`, an :exc:`OverflowError` is
raised.
.. cfunction:: PY_LONG_LONG PyLong_AsLongLong(PyObject *pylong)
Return a C :ctype:`long long` from a Python long integer. If *pylong* cannot be
represented as a :ctype:`long long`, an :exc:`OverflowError` will be raised.
.. versionadded:: 2.2
.. cfunction:: unsigned PY_LONG_LONG PyLong_AsUnsignedLongLong(PyObject *pylong)
Return a C :ctype:`unsigned long long` from a Python long integer. If *pylong*
cannot be represented as an :ctype:`unsigned long long`, an :exc:`OverflowError`
will be raised if the value is positive, or a :exc:`TypeError` will be raised if
the value is negative.
.. versionadded:: 2.2
.. cfunction:: unsigned long PyLong_AsUnsignedLongMask(PyObject *io)
Return a C :ctype:`unsigned long` from a Python long integer, without checking
for overflow.
.. versionadded:: 2.3
.. cfunction:: unsigned PY_LONG_LONG PyLong_AsUnsignedLongLongMask(PyObject *io)
Return a C :ctype:`unsigned long long` from a Python long integer, without
checking for overflow.
.. versionadded:: 2.3
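
The difference between the checked conversions and the ``Mask`` variants can
be pictured with ordinary C integers. In this sketch a C :ctype:`long long`
stands in for the Python long integer, and both helper names are hypothetical;
it shows only the idea that the unchecked form keeps the low-order bits, which
C defines as reduction modulo ``ULONG_MAX + 1``.

```c
#include <limits.h>

/* Hypothetical helper: checked conversion.
   Returns 0 and stores the result in *out, or returns -1 when v is
   negative or too large for an unsigned long. */
int
to_ulong_checked(long long v, unsigned long *out)
{
    if (v < 0 || (unsigned long long) v > ULONG_MAX)
        return -1;
    *out = (unsigned long) v;
    return 0;
}

/* Hypothetical helper: unchecked conversion.
   Converting to an unsigned type reduces the value modulo ULONG_MAX + 1,
   so out-of-range inputs wrap around instead of being reported. */
unsigned long
to_ulong_mask(long long v)
{
    return (unsigned long) v;
}
```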
.. cfunction:: double PyLong_AsDouble(PyObject *pylong)
Return a C :ctype:`double` representation of the contents of *pylong*. If
*pylong* cannot be approximately represented as a :ctype:`double`, an
:exc:`OverflowError` exception is raised and ``-1.0`` will be returned.
.. cfunction:: void* PyLong_AsVoidPtr(PyObject *pylong)
Convert a Python integer or long integer *pylong* to a C :ctype:`void` pointer.
If *pylong* cannot be converted, an :exc:`OverflowError` will be raised. This
is only assured to produce a usable :ctype:`void` pointer for values created
with :cfunc:`PyLong_FromVoidPtr`.
.. versionadded:: 1.5.2
.. versionchanged:: 2.5
For values outside 0..LONG_MAX, both signed and unsigned integers are accepted.


@@ -0,0 +1,78 @@
.. highlightlang:: c
.. _mapping:
Mapping Protocol
================
.. cfunction:: int PyMapping_Check(PyObject *o)
Return ``1`` if the object provides the mapping protocol, and ``0`` otherwise. This
function always succeeds.
.. cfunction:: Py_ssize_t PyMapping_Length(PyObject *o)
.. index:: builtin: len
Returns the number of keys in object *o* on success, and ``-1`` on failure. For
objects that provide the mapping protocol, this is equivalent to the Python
expression ``len(o)``.
.. cfunction:: int PyMapping_DelItemString(PyObject *o, char *key)
Remove the mapping for object *key* from the object *o*. Return ``-1`` on
failure. This is equivalent to the Python statement ``del o[key]``.
.. cfunction:: int PyMapping_DelItem(PyObject *o, PyObject *key)
Remove the mapping for object *key* from the object *o*. Return ``-1`` on
failure. This is equivalent to the Python statement ``del o[key]``.
.. cfunction:: int PyMapping_HasKeyString(PyObject *o, char *key)
Return ``1`` if the mapping object has the key *key* and ``0``
otherwise. This is equivalent to ``o[key]``, returning ``True`` on success
and ``False`` on an exception. This function always succeeds.
.. cfunction:: int PyMapping_HasKey(PyObject *o, PyObject *key)
Return ``1`` if the mapping object has the key *key* and ``0`` otherwise.
This is equivalent to ``o[key]``, returning ``True`` on success and ``False``
on an exception. This function always succeeds.
.. cfunction:: PyObject* PyMapping_Keys(PyObject *o)
On success, return a list of the keys in object *o*. On failure, return *NULL*.
This is equivalent to the Python expression ``o.keys()``.
.. cfunction:: PyObject* PyMapping_Values(PyObject *o)
On success, return a list of the values in object *o*. On failure, return
*NULL*. This is equivalent to the Python expression ``o.values()``.
.. cfunction:: PyObject* PyMapping_Items(PyObject *o)
On success, return a list of the items in object *o*, where each item is a tuple
containing a key-value pair. On failure, return *NULL*. This is equivalent to
the Python expression ``o.items()``.
.. cfunction:: PyObject* PyMapping_GetItemString(PyObject *o, char *key)
Return the element of *o* corresponding to the object *key* or *NULL* on failure.
This is the equivalent of the Python expression ``o[key]``.
.. cfunction:: int PyMapping_SetItemString(PyObject *o, char *key, PyObject *v)
Map the object *key* to the value *v* in object *o*. Returns ``-1`` on failure.
This is the equivalent of the Python statement ``o[key] = v``.


@@ -0,0 +1,94 @@
.. highlightlang:: c
.. _marshalling-utils:
Data marshalling support
========================
These routines allow C code to work with serialized objects using the same data
format as the :mod:`marshal` module. There are functions to write data into the
serialization format, and additional functions that can be used to read the data
back. Files used to store marshalled data must be opened in binary mode.
Numeric values are stored with the least significant byte first.
The module supports three versions of the data format: version 0 is the
historical version; version 1 (new in Python 2.4) shares interned strings in the
file, and upon unmarshalling; version 2 (new in Python 2.5) uses a binary format
for floating point numbers.
*Py_MARSHAL_VERSION* indicates the current file format (currently 2).
.. cfunction:: void PyMarshal_WriteLongToFile(long value, FILE *file, int version)
Marshal a :ctype:`long` integer, *value*, to *file*. This will only write the
least-significant 32 bits of *value*, regardless of the size of the native
:ctype:`long` type.
.. versionchanged:: 2.4
*version* indicates the file format.
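
The byte layout described above (low 32 bits only, least significant byte
first) can be sketched as follows. These helpers are hypothetical and are not
the marshal implementation: they work on a byte array rather than a
:ctype:`FILE\*`, and the decoder assumes the 32-bit value is to be read back
as a signed quantity.

```c
/* Hypothetical helper: store the low 32 bits of value in out[0..3],
   least significant byte first.  Higher bits of a wider long are dropped. */
void
encode_le32(long value, unsigned char out[4])
{
    unsigned long bits = (unsigned long) value;  /* well-defined wraparound */
    int i;

    for (i = 0; i < 4; i++)
        out[i] = (unsigned char) ((bits >> (8 * i)) & 0xFF);
}

/* Hypothetical helper: rebuild a signed 32-bit value from four
   little-endian bytes. */
long
decode_le32(const unsigned char in[4])
{
    unsigned long bits = 0;
    int i;

    for (i = 0; i < 4; i++)
        bits |= (unsigned long) in[i] << (8 * i);
    /* Sign-extend from bit 31 using arithmetic that cannot overflow */
    if (bits & 0x80000000UL)
        return -(long) (0xFFFFFFFFUL - bits) - 1;
    return (long) bits;
}
```

Because only four bytes are written, a value that needs more than 32 bits on
a platform with a wider :ctype:`long` does not survive the round trip.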
.. cfunction:: void PyMarshal_WriteObjectToFile(PyObject *value, FILE *file, int version)
Marshal a Python object, *value*, to *file*.
.. versionchanged:: 2.4
*version* indicates the file format.
.. cfunction:: PyObject* PyMarshal_WriteObjectToString(PyObject *value, int version)
Return a string object containing the marshalled representation of *value*.
.. versionchanged:: 2.4
*version* indicates the file format.
The following functions allow marshalled values to be read back in.
XXX What about error detection? It appears that reading past the end of the
file will always result in a negative numeric value (where that's relevant), but
it's not clear that negative values won't be handled properly when there's no
error. What's the right way to tell? Should only non-negative values be written
using these routines?
.. cfunction:: long PyMarshal_ReadLongFromFile(FILE *file)
Return a C :ctype:`long` from the data stream in a :ctype:`FILE\*` opened for
reading. Only a 32-bit value can be read in using this function, regardless of
the native size of :ctype:`long`.
.. cfunction:: int PyMarshal_ReadShortFromFile(FILE *file)
Return a C :ctype:`short` from the data stream in a :ctype:`FILE\*` opened for
reading. Only a 16-bit value can be read in using this function, regardless of
the native size of :ctype:`short`.
.. cfunction:: PyObject* PyMarshal_ReadObjectFromFile(FILE *file)
Return a Python object from the data stream in a :ctype:`FILE\*` opened for
reading. On error, sets the appropriate exception (:exc:`EOFError` or
:exc:`TypeError`) and returns *NULL*.
.. cfunction:: PyObject* PyMarshal_ReadLastObjectFromFile(FILE *file)
Return a Python object from the data stream in a :ctype:`FILE\*` opened for
reading. Unlike :cfunc:`PyMarshal_ReadObjectFromFile`, this function assumes
that no further objects will be read from the file, allowing it to aggressively
load file data into memory so that the de-serialization can operate from data in
memory rather than reading a byte at a time from the file. Only use this
variant if you are certain that you won't be reading anything else from the
file. On error, sets the appropriate exception (:exc:`EOFError` or
:exc:`TypeError`) and returns *NULL*.
.. cfunction:: PyObject* PyMarshal_ReadObjectFromString(char *string, Py_ssize_t len)
Return a Python object from the data stream in a character buffer containing
*len* bytes pointed to by *string*. On error, sets the appropriate exception
(:exc:`EOFError` or :exc:`TypeError`) and returns *NULL*.


@@ -0,0 +1,209 @@
.. highlightlang:: c
.. _memory:
*****************
Memory Management
*****************
.. sectionauthor:: Vladimir Marangozov <Vladimir.Marangozov@inrialpes.fr>
.. _memoryoverview:
Overview
========
Memory management in Python involves a private heap containing all Python
objects and data structures. The management of this private heap is ensured
internally by the *Python memory manager*. The Python memory manager has
different components which deal with various dynamic storage management aspects,
like sharing, segmentation, preallocation or caching.
At the lowest level, a raw memory allocator ensures that there is enough room in
the private heap for storing all Python-related data by interacting with the
memory manager of the operating system. On top of the raw memory allocator,
several object-specific allocators operate on the same heap and implement
distinct memory management policies adapted to the peculiarities of every object
type. For example, integer objects are managed differently within the heap than
strings, tuples or dictionaries because integers imply different storage
requirements and speed/space tradeoffs. The Python memory manager thus delegates
some of the work to the object-specific allocators, but ensures that the latter
operate within the bounds of the private heap.
It is important to understand that the management of the Python heap is
performed by the interpreter itself and that the user has no control over it,
even if she regularly manipulates object pointers to memory blocks inside that
heap. The allocation of heap space for Python objects and other internal
buffers is performed on demand by the Python memory manager through the Python/C
API functions listed in this document.
.. index::
single: malloc()
single: calloc()
single: realloc()
single: free()
To avoid memory corruption, extension writers should never try to operate on
Python objects with the functions exported by the C library: :cfunc:`malloc`,
:cfunc:`calloc`, :cfunc:`realloc` and :cfunc:`free`. This will result in mixed
calls between the C allocator and the Python memory manager with fatal
consequences, because they implement different algorithms and operate on
different heaps. However, one may safely allocate and release memory blocks
with the C library allocator for individual purposes, as shown in the following
example::

   PyObject *res;
   char *buf = (char *) malloc(BUFSIZ); /* for I/O */

   if (buf == NULL)
       return PyErr_NoMemory();
   /* ...Do some I/O operation involving buf... */
   res = PyString_FromString(buf);
   free(buf); /* malloc'ed */
   return res;

In this example, the memory request for the I/O buffer is handled by the C
library allocator. The Python memory manager is involved only in the allocation
of the string object returned as a result.
In most situations, however, it is recommended to allocate memory from the
Python heap specifically because the latter is under control of the Python
memory manager. For example, this is required when the interpreter is extended
with new object types written in C. Another reason for using the Python heap is
the desire to *inform* the Python memory manager about the memory needs of the
extension module. Even when the requested memory is used exclusively for
internal, highly-specific purposes, delegating all memory requests to the Python
memory manager causes the interpreter to have a more accurate image of its
memory footprint as a whole. Consequently, under certain circumstances, the
Python memory manager may trigger appropriate actions, like garbage
collection, memory compaction or other preventive procedures. Note that by using
the C library allocator as shown in the previous example, the allocated memory
for the I/O buffer completely escapes the Python memory manager.
.. _memoryinterface:
Memory Interface
================
The following function sets, modeled after the ANSI C standard, but specifying
behavior when requesting zero bytes, are available for allocating and releasing
memory from the Python heap:
.. cfunction:: void* PyMem_Malloc(size_t n)
Allocates *n* bytes and returns a pointer of type :ctype:`void\*` to the
allocated memory, or *NULL* if the request fails. Requesting zero bytes returns
a distinct non-*NULL* pointer if possible, as if :cfunc:`PyMem_Malloc(1)` had
been called instead. The memory will not have been initialized in any way.
.. cfunction:: void* PyMem_Realloc(void *p, size_t n)
Resizes the memory block pointed to by *p* to *n* bytes. The contents will be
unchanged up to the minimum of the old and the new sizes. If *p* is *NULL*, the
call is equivalent to :cfunc:`PyMem_Malloc(n)`; else if *n* is equal to zero,
the memory block is resized but is not freed, and the returned pointer is
non-*NULL*. Unless *p* is *NULL*, it must have been returned by a previous call
to :cfunc:`PyMem_Malloc` or :cfunc:`PyMem_Realloc`. If the request fails,
:cfunc:`PyMem_Realloc` returns *NULL* and *p* remains a valid pointer to the
previous memory area.
.. cfunction:: void PyMem_Free(void *p)
Frees the memory block pointed to by *p*, which must have been returned by a
previous call to :cfunc:`PyMem_Malloc` or :cfunc:`PyMem_Realloc`. Otherwise, or
if :cfunc:`PyMem_Free(p)` has been called before, undefined behavior occurs. If
*p* is *NULL*, no operation is performed.
The following type-oriented macros are provided for convenience. Note that
*TYPE* refers to any C type.
.. cfunction:: TYPE* PyMem_New(TYPE, size_t n)
Same as :cfunc:`PyMem_Malloc`, but allocates ``(n * sizeof(TYPE))`` bytes of
memory. Returns a pointer cast to :ctype:`TYPE\*`. The memory will not have
been initialized in any way.
.. cfunction:: TYPE* PyMem_Resize(void *p, TYPE, size_t n)
Same as :cfunc:`PyMem_Realloc`, but the memory block is resized to ``(n *
sizeof(TYPE))`` bytes. Returns a pointer cast to :ctype:`TYPE\*`. On return,
*p* will be a pointer to the new memory area, or *NULL* in the event of
failure. This is a C preprocessor macro; *p* is always reassigned. Save
the original value of *p* to avoid losing memory when handling errors.
.. cfunction:: void PyMem_Del(void *p)
Same as :cfunc:`PyMem_Free`.
In addition, the following macro sets are provided for calling the Python memory
allocator directly, without involving the C API functions listed above. However,
note that their use does not preserve binary compatibility across Python
versions and is therefore deprecated in extension modules.
:cfunc:`PyMem_MALLOC`, :cfunc:`PyMem_REALLOC`, :cfunc:`PyMem_FREE`.
:cfunc:`PyMem_NEW`, :cfunc:`PyMem_RESIZE`, :cfunc:`PyMem_DEL`.
.. _memoryexamples:
Examples
========
Here is the example from section :ref:`memoryoverview`, rewritten so that the
I/O buffer is allocated from the Python heap by using the first function set::
PyObject *res;
char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */
if (buf == NULL)
return PyErr_NoMemory();
/* ...Do some I/O operation involving buf... */
res = PyString_FromString(buf);
PyMem_Free(buf); /* allocated with PyMem_Malloc */
return res;
The same code using the type-oriented function set::
PyObject *res;
char *buf = PyMem_New(char, BUFSIZ); /* for I/O */
if (buf == NULL)
return PyErr_NoMemory();
/* ...Do some I/O operation involving buf... */
res = PyString_FromString(buf);
PyMem_Del(buf); /* allocated with PyMem_New */
return res;
Note that in the two examples above, the buffer is always manipulated via
functions belonging to the same set. Indeed, it is required to use the same
memory API family for a given memory block, so that the risk of mixing different
allocators is reduced to a minimum. The following code sequence contains two
errors, one of which is labeled as *fatal* because it mixes two different
allocators operating on different heaps. ::
char *buf1 = PyMem_New(char, BUFSIZ);
char *buf2 = (char *) malloc(BUFSIZ);
char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
...
PyMem_Del(buf3); /* Wrong -- should be PyMem_Free() */
free(buf2); /* Right -- allocated via malloc() */
free(buf1); /* Fatal -- should be PyMem_Del() */
In addition to the functions aimed at handling raw memory blocks from the Python
heap, objects in Python are allocated and released with :cfunc:`PyObject_New`,
:cfunc:`PyObject_NewVar` and :cfunc:`PyObject_Del`.
These will be explained in the next chapter on defining and implementing new
object types in C.

@@ -0,0 +1,72 @@
.. highlightlang:: c
.. _method-objects:
Method Objects
--------------
.. index:: object: method
The following functions are useful for working with method objects.
.. cvar:: PyTypeObject PyMethod_Type
.. index:: single: MethodType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python method type. This
is exposed to Python programs as ``types.MethodType``.
.. cfunction:: int PyMethod_Check(PyObject *o)
Return true if *o* is a method object (has type :cdata:`PyMethod_Type`). The
parameter must not be *NULL*.
.. cfunction:: PyObject* PyMethod_New(PyObject *func, PyObject *self, PyObject *class)
Return a new method object, with *func* being any callable object; this is the
function that will be called when the method is called. If this method should
be bound to an instance, *self* should be the instance and *class* should be the
class of *self*, otherwise *self* should be *NULL* and *class* should be the
class which provides the unbound method.
.. cfunction:: PyObject* PyMethod_Class(PyObject *meth)
Return the class object from which the method *meth* was created; if this was
created from an instance, it will be the class of the instance.
.. cfunction:: PyObject* PyMethod_GET_CLASS(PyObject *meth)
Macro version of :cfunc:`PyMethod_Class` which avoids error checking.
.. cfunction:: PyObject* PyMethod_Function(PyObject *meth)
Return the function object associated with the method *meth*.
.. cfunction:: PyObject* PyMethod_GET_FUNCTION(PyObject *meth)
Macro version of :cfunc:`PyMethod_Function` which avoids error checking.
.. cfunction:: PyObject* PyMethod_Self(PyObject *meth)
Return the instance associated with the method *meth* if it is bound, otherwise
return *NULL*.
.. cfunction:: PyObject* PyMethod_GET_SELF(PyObject *meth)
Macro version of :cfunc:`PyMethod_Self` which avoids error checking.
.. cfunction:: int PyMethod_ClearFreeList(void)
Clear the free list. Return the total number of freed items.
.. versionadded:: 2.6

@@ -0,0 +1,121 @@
.. highlightlang:: c
.. _moduleobjects:
Module Objects
--------------
.. index:: object: module
There are only a few functions special to module objects.
.. cvar:: PyTypeObject PyModule_Type
.. index:: single: ModuleType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python module type. This
is exposed to Python programs as ``types.ModuleType``.
.. cfunction:: int PyModule_Check(PyObject *p)
Return true if *p* is a module object, or a subtype of a module object.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyModule_CheckExact(PyObject *p)
Return true if *p* is a module object, but not a subtype of
:cdata:`PyModule_Type`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyModule_New(const char *name)
.. index::
single: __name__ (module attribute)
single: __doc__ (module attribute)
single: __file__ (module attribute)
Return a new module object with the :attr:`__name__` attribute set to *name*.
Only the module's :attr:`__doc__` and :attr:`__name__` attributes are filled in;
the caller is responsible for providing a :attr:`__file__` attribute.
.. cfunction:: PyObject* PyModule_GetDict(PyObject *module)
.. index:: single: __dict__ (module attribute)
Return the dictionary object that implements *module*'s namespace; this object
is the same as the :attr:`__dict__` attribute of the module object. This
function never fails. It is recommended extensions use other
:cfunc:`PyModule_\*` and :cfunc:`PyObject_\*` functions rather than directly
manipulate a module's :attr:`__dict__`.
.. cfunction:: char* PyModule_GetName(PyObject *module)
.. index::
single: __name__ (module attribute)
single: SystemError (built-in exception)
Return *module*'s :attr:`__name__` value. If the module does not provide one,
or if it is not a string, :exc:`SystemError` is raised and *NULL* is returned.
.. cfunction:: char* PyModule_GetFilename(PyObject *module)
.. index::
single: __file__ (module attribute)
single: SystemError (built-in exception)
Return the name of the file from which *module* was loaded using *module*'s
:attr:`__file__` attribute. If this is not defined, or if it is not a string,
raise :exc:`SystemError` and return *NULL*.
.. cfunction:: int PyModule_AddObject(PyObject *module, const char *name, PyObject *value)
Add an object to *module* as *name*. This is a convenience function which can
be used from the module's initialization function. This steals a reference to
*value*. Return ``-1`` on error, ``0`` on success.
.. versionadded:: 2.0
.. cfunction:: int PyModule_AddIntConstant(PyObject *module, const char *name, long value)
Add an integer constant to *module* as *name*. This convenience function can be
used from the module's initialization function. Return ``-1`` on error, ``0`` on
success.
.. versionadded:: 2.0
.. cfunction:: int PyModule_AddStringConstant(PyObject *module, const char *name, const char *value)
Add a string constant to *module* as *name*. This convenience function can be
used from the module's initialization function. The string *value* must be
null-terminated. Return ``-1`` on error, ``0`` on success.
.. versionadded:: 2.0
.. cfunction:: int PyModule_AddIntMacro(PyObject *module, macro)
Add an int constant to *module*. The name and the value are taken from
*macro*. For example, ``PyModule_AddIntMacro(module, AF_INET)`` adds the int
constant *AF_INET* with the value of *AF_INET* to *module*.
Return ``-1`` on error, ``0`` on success.
.. versionadded:: 2.6
.. cfunction:: int PyModule_AddStringMacro(PyObject *module, macro)
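Add a string constant to *module*. The name and the value are taken from
*macro*, which must expand to a null-terminated C string. Return ``-1`` on
error, ``0`` on success.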
.. versionadded:: 2.6

@@ -0,0 +1,28 @@
.. highlightlang:: c
.. _noneobject:
The None Object
---------------
.. index:: object: None
Note that the :ctype:`PyTypeObject` for ``None`` is not directly exposed in the
Python/C API. Since ``None`` is a singleton, testing for object identity (using
``==`` in C) is sufficient. There is no :cfunc:`PyNone_Check` function for the
same reason.
.. cvar:: PyObject* Py_None
The Python ``None`` object, denoting lack of value. This object has no methods.
It needs to be treated just like any other object with respect to reference
counts.
.. cmacro:: Py_RETURN_NONE
Properly handle returning :cdata:`Py_None` from within a C function.
.. versionadded:: 2.4

@@ -0,0 +1,322 @@
.. highlightlang:: c
.. _number:
Number Protocol
===============
.. cfunction:: int PyNumber_Check(PyObject *o)
Returns ``1`` if the object *o* provides numeric protocols, and ``0`` otherwise.
This function always succeeds.
.. cfunction:: PyObject* PyNumber_Add(PyObject *o1, PyObject *o2)
Returns the result of adding *o1* and *o2*, or *NULL* on failure. This is the
equivalent of the Python expression ``o1 + o2``.
.. cfunction:: PyObject* PyNumber_Subtract(PyObject *o1, PyObject *o2)
Returns the result of subtracting *o2* from *o1*, or *NULL* on failure. This is
the equivalent of the Python expression ``o1 - o2``.
.. cfunction:: PyObject* PyNumber_Multiply(PyObject *o1, PyObject *o2)
Returns the result of multiplying *o1* and *o2*, or *NULL* on failure. This is
the equivalent of the Python expression ``o1 * o2``.
.. cfunction:: PyObject* PyNumber_Divide(PyObject *o1, PyObject *o2)
Returns the result of dividing *o1* by *o2*, or *NULL* on failure. This is the
equivalent of the Python expression ``o1 / o2``.
.. cfunction:: PyObject* PyNumber_FloorDivide(PyObject *o1, PyObject *o2)
Return the floor of *o1* divided by *o2*, or *NULL* on failure. This is
equivalent to the "classic" division of integers.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyNumber_TrueDivide(PyObject *o1, PyObject *o2)
Return a reasonable approximation for the mathematical value of *o1* divided by
*o2*, or *NULL* on failure. The return value is "approximate" because binary
floating point numbers are approximate; it is not possible to represent all real
numbers in base two. This function can return a floating point value when
passed two integers.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyNumber_Remainder(PyObject *o1, PyObject *o2)
Returns the remainder of dividing *o1* by *o2*, or *NULL* on failure. This is
the equivalent of the Python expression ``o1 % o2``.
.. cfunction:: PyObject* PyNumber_Divmod(PyObject *o1, PyObject *o2)
.. index:: builtin: divmod
See the built-in function :func:`divmod`. Returns *NULL* on failure. This is
the equivalent of the Python expression ``divmod(o1, o2)``.
.. cfunction:: PyObject* PyNumber_Power(PyObject *o1, PyObject *o2, PyObject *o3)
.. index:: builtin: pow
See the built-in function :func:`pow`. Returns *NULL* on failure. This is the
equivalent of the Python expression ``pow(o1, o2, o3)``, where *o3* is optional.
If *o3* is to be ignored, pass :cdata:`Py_None` in its place (passing *NULL* for
*o3* would cause an illegal memory access).
.. cfunction:: PyObject* PyNumber_Negative(PyObject *o)
Returns the negation of *o* on success, or *NULL* on failure. This is the
equivalent of the Python expression ``-o``.
.. cfunction:: PyObject* PyNumber_Positive(PyObject *o)
Returns *o* on success, or *NULL* on failure. This is the equivalent of the
Python expression ``+o``.
.. cfunction:: PyObject* PyNumber_Absolute(PyObject *o)
.. index:: builtin: abs
Returns the absolute value of *o*, or *NULL* on failure. This is the equivalent
of the Python expression ``abs(o)``.
.. cfunction:: PyObject* PyNumber_Invert(PyObject *o)
Returns the bitwise negation of *o* on success, or *NULL* on failure. This is
the equivalent of the Python expression ``~o``.
.. cfunction:: PyObject* PyNumber_Lshift(PyObject *o1, PyObject *o2)
Returns the result of left shifting *o1* by *o2* on success, or *NULL* on
failure. This is the equivalent of the Python expression ``o1 << o2``.
.. cfunction:: PyObject* PyNumber_Rshift(PyObject *o1, PyObject *o2)
Returns the result of right shifting *o1* by *o2* on success, or *NULL* on
failure. This is the equivalent of the Python expression ``o1 >> o2``.
.. cfunction:: PyObject* PyNumber_And(PyObject *o1, PyObject *o2)
Returns the "bitwise and" of *o1* and *o2* on success and *NULL* on failure.
This is the equivalent of the Python expression ``o1 & o2``.
.. cfunction:: PyObject* PyNumber_Xor(PyObject *o1, PyObject *o2)
Returns the "bitwise exclusive or" of *o1* and *o2* on success, or *NULL* on
failure. This is the equivalent of the Python expression ``o1 ^ o2``.
.. cfunction:: PyObject* PyNumber_Or(PyObject *o1, PyObject *o2)
Returns the "bitwise or" of *o1* and *o2* on success, or *NULL* on failure.
This is the equivalent of the Python expression ``o1 | o2``.
.. cfunction:: PyObject* PyNumber_InPlaceAdd(PyObject *o1, PyObject *o2)
Returns the result of adding *o1* and *o2*, or *NULL* on failure. The operation
is done *in-place* when *o1* supports it. This is the equivalent of the Python
statement ``o1 += o2``.
.. cfunction:: PyObject* PyNumber_InPlaceSubtract(PyObject *o1, PyObject *o2)
Returns the result of subtracting *o2* from *o1*, or *NULL* on failure. The
operation is done *in-place* when *o1* supports it. This is the equivalent of
the Python statement ``o1 -= o2``.
.. cfunction:: PyObject* PyNumber_InPlaceMultiply(PyObject *o1, PyObject *o2)
Returns the result of multiplying *o1* and *o2*, or *NULL* on failure. The
operation is done *in-place* when *o1* supports it. This is the equivalent of
the Python statement ``o1 *= o2``.
.. cfunction:: PyObject* PyNumber_InPlaceDivide(PyObject *o1, PyObject *o2)
Returns the result of dividing *o1* by *o2*, or *NULL* on failure. The
operation is done *in-place* when *o1* supports it. This is the equivalent of
the Python statement ``o1 /= o2``.
.. cfunction:: PyObject* PyNumber_InPlaceFloorDivide(PyObject *o1, PyObject *o2)
Returns the mathematical floor of dividing *o1* by *o2*, or *NULL* on failure.
The operation is done *in-place* when *o1* supports it. This is the equivalent
of the Python statement ``o1 //= o2``.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyNumber_InPlaceTrueDivide(PyObject *o1, PyObject *o2)
Return a reasonable approximation for the mathematical value of *o1* divided by
*o2*, or *NULL* on failure. The return value is "approximate" because binary
floating point numbers are approximate; it is not possible to represent all real
numbers in base two. This function can return a floating point value when
passed two integers. The operation is done *in-place* when *o1* supports it.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyNumber_InPlaceRemainder(PyObject *o1, PyObject *o2)
Returns the remainder of dividing *o1* by *o2*, or *NULL* on failure. The
operation is done *in-place* when *o1* supports it. This is the equivalent of
the Python statement ``o1 %= o2``.
.. cfunction:: PyObject* PyNumber_InPlacePower(PyObject *o1, PyObject *o2, PyObject *o3)
.. index:: builtin: pow
See the built-in function :func:`pow`. Returns *NULL* on failure. The operation
is done *in-place* when *o1* supports it. This is the equivalent of the Python
statement ``o1 **= o2`` when *o3* is :cdata:`Py_None`, or an in-place variant of
``pow(o1, o2, o3)`` otherwise. If *o3* is to be ignored, pass :cdata:`Py_None`
in its place (passing *NULL* for *o3* would cause an illegal memory access).
.. cfunction:: PyObject* PyNumber_InPlaceLshift(PyObject *o1, PyObject *o2)
Returns the result of left shifting *o1* by *o2* on success, or *NULL* on
failure. The operation is done *in-place* when *o1* supports it. This is the
equivalent of the Python statement ``o1 <<= o2``.
.. cfunction:: PyObject* PyNumber_InPlaceRshift(PyObject *o1, PyObject *o2)
Returns the result of right shifting *o1* by *o2* on success, or *NULL* on
failure. The operation is done *in-place* when *o1* supports it. This is the
equivalent of the Python statement ``o1 >>= o2``.
.. cfunction:: PyObject* PyNumber_InPlaceAnd(PyObject *o1, PyObject *o2)
Returns the "bitwise and" of *o1* and *o2* on success and *NULL* on failure. The
operation is done *in-place* when *o1* supports it. This is the equivalent of
the Python statement ``o1 &= o2``.
.. cfunction:: PyObject* PyNumber_InPlaceXor(PyObject *o1, PyObject *o2)
Returns the "bitwise exclusive or" of *o1* and *o2* on success, or *NULL* on
failure. The operation is done *in-place* when *o1* supports it. This is the
equivalent of the Python statement ``o1 ^= o2``.
.. cfunction:: PyObject* PyNumber_InPlaceOr(PyObject *o1, PyObject *o2)
Returns the "bitwise or" of *o1* and *o2* on success, or *NULL* on failure. The
operation is done *in-place* when *o1* supports it. This is the equivalent of
the Python statement ``o1 |= o2``.
.. cfunction:: int PyNumber_Coerce(PyObject **p1, PyObject **p2)
.. index:: builtin: coerce
This function takes the addresses of two variables of type :ctype:`PyObject\*`.
If the objects pointed to by ``*p1`` and ``*p2`` have the same type, increment
their reference count and return ``0`` (success). If the objects can be
converted to a common numeric type, replace ``*p1`` and ``*p2`` by their
converted value (with 'new' reference counts), and return ``0``. If no
conversion is possible, or if some other error occurs, return ``-1`` (failure)
and don't increment the reference counts. The call ``PyNumber_Coerce(&o1,
&o2)`` is equivalent to the Python statement ``o1, o2 = coerce(o1, o2)``.
.. cfunction:: int PyNumber_CoerceEx(PyObject **p1, PyObject **p2)
This function is similar to :cfunc:`PyNumber_Coerce`, except that it returns
``1`` when the conversion is not possible and when no error is raised.
Reference counts are still not increased in this case.
.. cfunction:: PyObject* PyNumber_Int(PyObject *o)
.. index:: builtin: int
Returns *o* converted to an integer object on success, or *NULL* on failure.
If the argument is outside the integer range a long object will be returned
instead. This is the equivalent of the Python expression ``int(o)``.
.. cfunction:: PyObject* PyNumber_Long(PyObject *o)
.. index:: builtin: long
Returns *o* converted to a long integer object on success, or *NULL* on
failure. This is the equivalent of the Python expression ``long(o)``.
.. cfunction:: PyObject* PyNumber_Float(PyObject *o)
.. index:: builtin: float
Returns *o* converted to a float object on success, or *NULL* on failure.
This is the equivalent of the Python expression ``float(o)``.
.. cfunction:: PyObject* PyNumber_Index(PyObject *o)
Returns *o* converted to a Python int or long on success, or *NULL* with a
:exc:`TypeError` exception raised on failure.
.. versionadded:: 2.5
.. cfunction:: PyObject* PyNumber_ToBase(PyObject *n, int base)
Returns the integer *n* converted to *base* as a string with a base
marker of ``'0b'``, ``'0o'``, or ``'0x'`` if applicable. When
*base* is not 2, 8, 10, or 16, the format is ``'x#num'`` where x is the
base. If *n* is not an int object, it is converted with
:cfunc:`PyNumber_Index` first.
.. versionadded:: 2.6
.. cfunction:: Py_ssize_t PyNumber_AsSsize_t(PyObject *o, PyObject *exc)
Returns *o* converted to a Py_ssize_t value if *o* can be interpreted as an
integer. If *o* can be converted to a Python int or long but the attempt to
convert to a Py_ssize_t value would raise an :exc:`OverflowError`, then the
*exc* argument is the type of exception that will be raised (usually
:exc:`IndexError` or :exc:`OverflowError`). If *exc* is *NULL*, then the
exception is cleared and the value is clipped to *PY_SSIZE_T_MIN* for a negative
integer or *PY_SSIZE_T_MAX* for a positive integer.
.. versionadded:: 2.5
.. cfunction:: int PyIndex_Check(PyObject *o)
Returns true if *o* is an index integer (has the ``nb_index`` slot of the
``tp_as_number`` structure filled in).
.. versionadded:: 2.5

@@ -0,0 +1,46 @@
.. highlightlang:: c
.. _abstract-buffer:
Buffer Protocol
===============
.. cfunction:: int PyObject_AsCharBuffer(PyObject *obj, const char **buffer, Py_ssize_t *buffer_len)
Returns a pointer to a read-only memory location usable as character-based
input. The *obj* argument must support the single-segment character buffer
interface. On success, returns ``0``, sets *buffer* to the memory location and
*buffer_len* to the buffer length. Returns ``-1`` and sets a :exc:`TypeError`
on error.
.. versionadded:: 1.6
.. cfunction:: int PyObject_AsReadBuffer(PyObject *obj, const void **buffer, Py_ssize_t *buffer_len)
Returns a pointer to a read-only memory location containing arbitrary data. The
*obj* argument must support the single-segment readable buffer interface. On
success, returns ``0``, sets *buffer* to the memory location and *buffer_len* to
the buffer length. Returns ``-1`` and sets a :exc:`TypeError` on error.
.. versionadded:: 1.6
.. cfunction:: int PyObject_CheckReadBuffer(PyObject *o)
Returns ``1`` if *o* supports the single-segment readable buffer interface.
Otherwise returns ``0``.
.. versionadded:: 2.2
.. cfunction:: int PyObject_AsWriteBuffer(PyObject *obj, void **buffer, Py_ssize_t *buffer_len)
Returns a pointer to a writeable memory location. The *obj* argument must
support the single-segment, character buffer interface. On success, returns
``0``, sets *buffer* to the memory location and *buffer_len* to the buffer
length. Returns ``-1`` and sets a :exc:`TypeError` on error.
.. versionadded:: 1.6

@@ -0,0 +1,395 @@
.. highlightlang:: c
.. _object:
Object Protocol
===============
.. cfunction:: int PyObject_Print(PyObject *o, FILE *fp, int flags)
Print an object *o*, on file *fp*. Returns ``-1`` on error. The flags argument
is used to enable certain printing options. The only option currently supported
is :const:`Py_PRINT_RAW`; if given, the :func:`str` of the object is written
instead of the :func:`repr`.
.. cfunction:: int PyObject_HasAttr(PyObject *o, PyObject *attr_name)
Returns ``1`` if *o* has the attribute *attr_name*, and ``0`` otherwise. This
is equivalent to the Python expression ``hasattr(o, attr_name)``. This function
always succeeds.
.. cfunction:: int PyObject_HasAttrString(PyObject *o, const char *attr_name)
Returns ``1`` if *o* has the attribute *attr_name*, and ``0`` otherwise. This
is equivalent to the Python expression ``hasattr(o, attr_name)``. This function
always succeeds.
.. cfunction:: PyObject* PyObject_GetAttr(PyObject *o, PyObject *attr_name)
Retrieve an attribute named *attr_name* from object *o*. Returns the attribute
value on success, or *NULL* on failure. This is the equivalent of the Python
expression ``o.attr_name``.
.. cfunction:: PyObject* PyObject_GetAttrString(PyObject *o, const char *attr_name)
Retrieve an attribute named *attr_name* from object *o*. Returns the attribute
value on success, or *NULL* on failure. This is the equivalent of the Python
expression ``o.attr_name``.
.. cfunction:: PyObject* PyObject_GenericGetAttr(PyObject *o, PyObject *name)
Generic attribute getter function that is meant to be put into a type
object's ``tp_getattro`` slot. It looks for a descriptor in the dictionary
of classes in the object's MRO as well as an attribute in the object's
:attr:`__dict__` (if present). As outlined in :ref:`descriptors`, data
descriptors take preference over instance attributes, while non-data
descriptors don't. Otherwise, an :exc:`AttributeError` is raised.
.. cfunction:: int PyObject_SetAttr(PyObject *o, PyObject *attr_name, PyObject *v)
Set the value of the attribute named *attr_name*, for object *o*, to the value
*v*. Returns ``-1`` on failure. This is the equivalent of the Python statement
``o.attr_name = v``.
.. cfunction:: int PyObject_SetAttrString(PyObject *o, const char *attr_name, PyObject *v)
Set the value of the attribute named *attr_name*, for object *o*, to the value
*v*. Returns ``-1`` on failure. This is the equivalent of the Python statement
``o.attr_name = v``.
.. cfunction:: int PyObject_GenericSetAttr(PyObject *o, PyObject *name, PyObject *value)
Generic attribute setter function that is meant to be put into a type
object's ``tp_setattro`` slot. It looks for a data descriptor in the
dictionary of classes in the object's MRO, and if found it takes preference
over setting the attribute in the instance dictionary. Otherwise, the
attribute is set in the object's :attr:`__dict__` (if present). Otherwise,
an :exc:`AttributeError` is raised and ``-1`` is returned.
.. cfunction:: int PyObject_DelAttr(PyObject *o, PyObject *attr_name)
Delete attribute named *attr_name*, for object *o*. Returns ``-1`` on failure.
This is the equivalent of the Python statement ``del o.attr_name``.
.. cfunction:: int PyObject_DelAttrString(PyObject *o, const char *attr_name)
Delete attribute named *attr_name*, for object *o*. Returns ``-1`` on failure.
This is the equivalent of the Python statement ``del o.attr_name``.
.. cfunction:: PyObject* PyObject_RichCompare(PyObject *o1, PyObject *o2, int opid)
Compare the values of *o1* and *o2* using the operation specified by *opid*,
which must be one of :const:`Py_LT`, :const:`Py_LE`, :const:`Py_EQ`,
:const:`Py_NE`, :const:`Py_GT`, or :const:`Py_GE`, corresponding to ``<``,
``<=``, ``==``, ``!=``, ``>``, or ``>=`` respectively. This is the equivalent of
the Python expression ``o1 op o2``, where ``op`` is the operator corresponding
to *opid*. Returns the value of the comparison on success, or *NULL* on failure.
.. cfunction:: int PyObject_RichCompareBool(PyObject *o1, PyObject *o2, int opid)
Compare the values of *o1* and *o2* using the operation specified by *opid*,
which must be one of :const:`Py_LT`, :const:`Py_LE`, :const:`Py_EQ`,
:const:`Py_NE`, :const:`Py_GT`, or :const:`Py_GE`, corresponding to ``<``,
``<=``, ``==``, ``!=``, ``>``, or ``>=`` respectively. Returns ``-1`` on error,
``0`` if the result is false, ``1`` otherwise. This is the equivalent of the
Python expression ``o1 op o2``, where ``op`` is the operator corresponding to
*opid*.
.. cfunction:: int PyObject_Cmp(PyObject *o1, PyObject *o2, int *result)
.. index:: builtin: cmp
Compare the values of *o1* and *o2* using a routine provided by *o1*, if one
exists, otherwise with a routine provided by *o2*. The result of the comparison
is returned in *result*. Returns ``-1`` on failure. This is the equivalent of
the Python statement ``result = cmp(o1, o2)``.
.. cfunction:: int PyObject_Compare(PyObject *o1, PyObject *o2)
.. index:: builtin: cmp
Compare the values of *o1* and *o2* using a routine provided by *o1*, if one
exists, otherwise with a routine provided by *o2*. Returns the result of the
comparison on success. On error, the value returned is undefined; use
:cfunc:`PyErr_Occurred` to detect an error. This is equivalent to the Python
expression ``cmp(o1, o2)``.
.. cfunction:: PyObject* PyObject_Repr(PyObject *o)
.. index:: builtin: repr
Compute a string representation of object *o*. Returns the string
representation on success, *NULL* on failure. This is the equivalent of the
Python expression ``repr(o)``. Called by the :func:`repr` built-in function and
by reverse quotes.
.. cfunction:: PyObject* PyObject_Str(PyObject *o)
.. index:: builtin: str
Compute a string representation of object *o*. Returns the string
representation on success, *NULL* on failure. This is the equivalent of the
Python expression ``str(o)``. Called by the :func:`str` built-in function and
by the :keyword:`print` statement.
.. cfunction:: PyObject* PyObject_Bytes(PyObject *o)
.. index:: builtin: bytes
Compute a bytes representation of object *o*. In 2.x, this is just an alias
for :cfunc:`PyObject_Str`.
.. cfunction:: PyObject* PyObject_Unicode(PyObject *o)
.. index:: builtin: unicode
Compute a Unicode string representation of object *o*. Returns the Unicode
string representation on success, *NULL* on failure. This is the equivalent of
the Python expression ``unicode(o)``. Called by the :func:`unicode` built-in
function.
.. cfunction:: int PyObject_IsInstance(PyObject *inst, PyObject *cls)
Returns ``1`` if *inst* is an instance of the class *cls* or a subclass of
*cls*, or ``0`` if not. On error, returns ``-1`` and sets an exception. If
*cls* is a type object rather than a class object, :cfunc:`PyObject_IsInstance`
returns ``1`` if *inst* is of type *cls*. If *cls* is a tuple, the check will
be done against every entry in *cls*. The result will be ``1`` when at least one
of the checks returns ``1``, otherwise it will be ``0``. If *inst* is not a
class instance and *cls* is neither a type object, nor a class object, nor a
tuple, *inst* must have a :attr:`__class__` attribute --- the class relationship
of the value of that attribute with *cls* will be used to determine the result
of this function.
.. versionadded:: 2.1
.. versionchanged:: 2.2
Support for a tuple as the second argument added.
Subclass determination is done in a fairly straightforward way, but includes a
wrinkle that implementors of extensions to the class system may want to be aware
of. If :class:`A` and :class:`B` are class objects, :class:`B` is a subclass of
:class:`A` if it inherits from :class:`A` either directly or indirectly. If
either is not a class object, a more general mechanism is used to determine the
class relationship of the two objects. When testing if *B* is a subclass of
*A*, if *A* is *B*, :cfunc:`PyObject_IsSubclass` returns true. If *A* and *B*
are different objects, *B*'s :attr:`__bases__` attribute is searched in a
depth-first fashion for *A* --- the presence of the :attr:`__bases__` attribute
is considered sufficient for this determination.
.. cfunction:: int PyObject_IsSubclass(PyObject *derived, PyObject *cls)
Returns ``1`` if the class *derived* is identical to or derived from the class
*cls*, otherwise returns ``0``. In case of an error, returns ``-1``. If *cls*
is a tuple, the check will be done against every entry in *cls*. The result will
be ``1`` when at least one of the checks returns ``1``, otherwise it will be
``0``. If either *derived* or *cls* is not an actual class object (or tuple),
this function uses the generic algorithm described above.
.. versionadded:: 2.1
.. versionchanged:: 2.3
Older versions of Python did not support a tuple as the second argument.
.. cfunction:: int PyCallable_Check(PyObject *o)
Determine if the object *o* is callable. Return ``1`` if the object is callable
and ``0`` otherwise. This function always succeeds.
.. cfunction:: PyObject* PyObject_Call(PyObject *callable_object, PyObject *args, PyObject *kw)
.. index:: builtin: apply
Call a callable Python object *callable_object*, with arguments given by the
tuple *args*, and named arguments given by the dictionary *kw*. If no named
arguments are needed, *kw* may be *NULL*. *args* must not be *NULL*; use an
empty tuple if no arguments are needed. Returns the result of the call on
success, or *NULL* on failure. This is the equivalent of the Python expression
``apply(callable_object, args, kw)`` or ``callable_object(*args, **kw)``.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyObject_CallObject(PyObject *callable_object, PyObject *args)
.. index:: builtin: apply
Call a callable Python object *callable_object*, with arguments given by the
tuple *args*. If no arguments are needed, then *args* may be *NULL*. Returns
the result of the call on success, or *NULL* on failure. This is the equivalent
of the Python expression ``apply(callable_object, args)`` or
``callable_object(*args)``.
.. cfunction:: PyObject* PyObject_CallFunction(PyObject *callable, char *format, ...)
.. index:: builtin: apply
Call a callable Python object *callable*, with a variable number of C arguments.
The C arguments are described using a :cfunc:`Py_BuildValue` style format
string. The format may be *NULL*, indicating that no arguments are provided.
Returns the result of the call on success, or *NULL* on failure. This is the
equivalent of the Python expression ``apply(callable, args)`` or
``callable(*args)``. Note that if you only pass :ctype:`PyObject \*` args,
:cfunc:`PyObject_CallFunctionObjArgs` is a faster alternative.
.. cfunction:: PyObject* PyObject_CallMethod(PyObject *o, char *method, char *format, ...)
Call the method named *method* of object *o* with a variable number of C
arguments. The C arguments are described by a :cfunc:`Py_BuildValue` format
string that should produce a tuple. The format may be *NULL*, indicating that
no arguments are provided. Returns the result of the call on success, or *NULL*
on failure. This is the equivalent of the Python expression ``o.method(args)``.
Note that if you only pass :ctype:`PyObject \*` args,
:cfunc:`PyObject_CallMethodObjArgs` is a faster alternative.
.. cfunction:: PyObject* PyObject_CallFunctionObjArgs(PyObject *callable, ..., NULL)
Call a callable Python object *callable*, with a variable number of
:ctype:`PyObject\*` arguments. The arguments are provided as a variable number
of parameters followed by *NULL*. Returns the result of the call on success, or
*NULL* on failure.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyObject_CallMethodObjArgs(PyObject *o, PyObject *name, ..., NULL)
Calls a method of the object *o*, where the name of the method is given as a
Python string object in *name*. It is called with a variable number of
:ctype:`PyObject\*` arguments. The arguments are provided as a variable number
of parameters followed by *NULL*. Returns the result of the call on success, or
*NULL* on failure.
.. versionadded:: 2.2
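Both ``ObjArgs`` functions depend on the trailing *NULL* to find the end of the
argument list, so omitting it makes the callee read past the last argument. The
following is a minimal sketch of that sentinel convention in plain C. It is not
the CPython implementation, and ``collect_until_null`` is a hypothetical helper
introduced only for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Python C API: store the pointer
 * arguments, starting with `first`, into `out` until a NULL sentinel is
 * seen or `cap` entries have been stored. Returns the number stored. */
static size_t
collect_until_null(const void **out, size_t cap, const void *first, ...)
{
    va_list va;
    size_t n = 0;
    const void *p = first;

    assert(out != NULL);
    va_start(va, first);
    while (p != NULL && n < cap) {
        out[n++] = p;
        p = va_arg(va, const void *);   /* next argument, or the sentinel */
    }
    va_end(va);
    return n;
}
```

In this sketch the sentinel is passed as an explicit pointer, for example
``collect_until_null(out, 4, (const void *)&a, (const void *)NULL)``, because a
bare ``0`` in a variable argument list is not guaranteed to have pointer type.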
.. cfunction:: long PyObject_Hash(PyObject *o)
.. index:: builtin: hash
Compute and return the hash value of an object *o*. On failure, return ``-1``.
This is the equivalent of the Python expression ``hash(o)``.
.. cfunction:: long PyObject_HashNotImplemented(PyObject *o)
Set a :exc:`TypeError` indicating that ``type(o)`` is not hashable and return ``-1``.
This function receives special treatment when stored in a ``tp_hash`` slot,
allowing a type to explicitly indicate to the interpreter that it is not
hashable.
.. versionadded:: 2.6
.. cfunction:: int PyObject_IsTrue(PyObject *o)
Returns ``1`` if the object *o* is considered to be true, and ``0`` otherwise.
This is equivalent to the Python expression ``not not o``. On failure, return
``-1``.
.. cfunction:: int PyObject_Not(PyObject *o)
Returns ``0`` if the object *o* is considered to be true, and ``1`` otherwise.
This is equivalent to the Python expression ``not o``. On failure, return
``-1``.
.. cfunction:: PyObject* PyObject_Type(PyObject *o)
.. index:: builtin: type
When *o* is non-*NULL*, returns a type object corresponding to the object type
of object *o*. On failure, raises :exc:`SystemError` and returns *NULL*. This
is equivalent to the Python expression ``type(o)``. This function increments the
reference count of the return value. There's really no reason to use this
function instead of the common expression ``o->ob_type``, which returns a
pointer of type :ctype:`PyTypeObject\*`, except when the incremented reference
count is needed.
.. cfunction:: int PyObject_TypeCheck(PyObject *o, PyTypeObject *type)
Return true if the object *o* is of type *type* or a subtype of *type*. Both
parameters must be non-*NULL*.
.. versionadded:: 2.2
.. cfunction:: Py_ssize_t PyObject_Length(PyObject *o)
Py_ssize_t PyObject_Size(PyObject *o)
.. index:: builtin: len
Return the length of object *o*. If the object *o* provides both the sequence
and mapping protocols, the sequence length is returned. On error, ``-1`` is
returned. This is the equivalent of the Python expression ``len(o)``.
.. cfunction:: PyObject* PyObject_GetItem(PyObject *o, PyObject *key)
Return the element of *o* corresponding to the object *key*, or *NULL* on failure.
This is the equivalent of the Python expression ``o[key]``.
.. cfunction:: int PyObject_SetItem(PyObject *o, PyObject *key, PyObject *v)
Map the object *key* to the value *v*. Returns ``-1`` on failure. This is the
equivalent of the Python statement ``o[key] = v``.
.. cfunction:: int PyObject_DelItem(PyObject *o, PyObject *key)
Delete the mapping for *key* from *o*. Returns ``-1`` on failure. This is the
equivalent of the Python statement ``del o[key]``.
.. cfunction:: int PyObject_AsFileDescriptor(PyObject *o)
Derives a file descriptor from a Python object. If the object is an integer or
long integer, its value is returned. If not, the object's :meth:`fileno` method
is called if it exists; the method must return an integer or long integer, which
is returned as the file descriptor value. Returns ``-1`` on failure.
.. cfunction:: PyObject* PyObject_Dir(PyObject *o)
This is equivalent to the Python expression ``dir(o)``, returning a (possibly
empty) list of strings appropriate for the object argument, or *NULL* if there
was an error. If the argument is *NULL*, this is like the Python ``dir()``,
returning the names of the current locals; in this case, if no execution frame
is active then *NULL* is returned but :cfunc:`PyErr_Occurred` will return false.
.. cfunction:: PyObject* PyObject_GetIter(PyObject *o)
This is equivalent to the Python expression ``iter(o)``. It returns a new
iterator for the object argument, or the object itself if the object is already
an iterator. Raises :exc:`TypeError` and returns *NULL* if the object cannot be
iterated.

@@ -0,0 +1,18 @@
.. highlightlang:: c
.. _newtypes:
*****************************
Object Implementation Support
*****************************
This chapter describes the functions, types, and macros used when defining new
object types.
.. toctree::
allocation.rst
structures.rst
typeobj.rst
gcsupport.rst

@@ -0,0 +1,74 @@
.. highlightlang:: c
.. _countingrefs:
******************
Reference Counting
******************
The macros in this section are used for managing reference counts of Python
objects.
.. cfunction:: void Py_INCREF(PyObject *o)
Increment the reference count for object *o*. The object must not be *NULL*; if
you aren't sure that it isn't *NULL*, use :cfunc:`Py_XINCREF`.
.. cfunction:: void Py_XINCREF(PyObject *o)
Increment the reference count for object *o*. The object may be *NULL*, in
which case the macro has no effect.
.. cfunction:: void Py_DECREF(PyObject *o)
Decrement the reference count for object *o*. The object must not be *NULL*; if
you aren't sure that it isn't *NULL*, use :cfunc:`Py_XDECREF`. If the reference
count reaches zero, the object's type's deallocation function (which must not be
*NULL*) is invoked.
.. warning::
The deallocation function can cause arbitrary Python code to be invoked (e.g.
when a class instance with a :meth:`__del__` method is deallocated). While
exceptions in such code are not propagated, the executed code has free access to
all Python global variables. This means that any object that is reachable from
a global variable should be in a consistent state before :cfunc:`Py_DECREF` is
invoked. For example, code to delete an object from a list should copy a
reference to the deleted object in a temporary variable, update the list data
structure, and then call :cfunc:`Py_DECREF` for the temporary variable.
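The ordering described in the warning can be sketched with a toy model in plain
C. This is an illustration under stated assumptions, not CPython code:
``toy_object``, ``toy_decref``, ``toy_list`` and the other names are
hypothetical stand-ins for :ctype:`PyObject`, :cfunc:`Py_DECREF` and a list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for PyObject and Py_DECREF. Every name in this sketch is
 * hypothetical; none of it is part of the Python C API. */
typedef struct toy_object {
    int refcnt;
    void (*dealloc)(struct toy_object *self);  /* may run arbitrary code */
} toy_object;

static void
toy_decref(toy_object *o)
{
    assert(o != NULL && o->refcnt > 0);
    if (--o->refcnt == 0)
        o->dealloc(o);
}

/* A tiny fixed-capacity list that is reachable from global state. */
typedef struct {
    toy_object *items[4];
    size_t size;
} toy_list;

static toy_list global_list;

/* Remove the last item in the order the warning recommends. */
static void
toy_list_pop_last(toy_list *list)
{
    toy_object *tmp;

    assert(list != NULL && list->size > 0);
    tmp = list->items[list->size - 1];     /* 1. keep the reference */
    list->items[list->size - 1] = NULL;    /* 2. make the list consistent */
    list->size--;
    toy_decref(tmp);                       /* 3. only now release it */
}

/* A deallocator that inspects the global list, as a __del__ method could,
 * and records the size it observed. */
static long size_seen_by_dealloc = -1;

static void
recording_dealloc(toy_object *self)
{
    (void)self;
    size_seen_by_dealloc = (long)global_list.size;
}
```

Because the list is updated before ``toy_decref`` runs, the deallocator observes
a list that no longer contains the dying object. Reversing steps 2 and 3 would
let it see a slot that still points at an object being destroyed.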
.. cfunction:: void Py_XDECREF(PyObject *o)
Decrement the reference count for object *o*. The object may be *NULL*, in
which case the macro has no effect; otherwise the effect is the same as for
:cfunc:`Py_DECREF`, and the same warning applies.
.. cfunction:: void Py_CLEAR(PyObject *o)
Decrement the reference count for object *o*. The object may be *NULL*, in
which case the macro has no effect; otherwise the effect is the same as for
:cfunc:`Py_DECREF`, except that the argument is also set to *NULL*. The warning
for :cfunc:`Py_DECREF` does not apply with respect to the object passed because
the macro carefully uses a temporary variable and sets the argument to *NULL*
before decrementing its reference count.
It is a good idea to use this macro whenever releasing the reference held in a
variable that might be traversed during garbage collection.
.. versionadded:: 2.4
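The mechanism behind :cfunc:`Py_CLEAR` (save the pointer, set the variable to
*NULL*, then drop the reference) can be sketched as a macro over a toy object
type. This is a model for illustration only: ``TOY_CLEAR``, ``toy_object`` and
``toy_decref`` are hypothetical names, and the real macro is defined in the
Python headers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; hypothetical names, not part of the Python C API. */
typedef struct toy_object {
    int refcnt;
    void (*dealloc)(struct toy_object *self);
} toy_object;

static void
toy_decref(toy_object *o)
{
    assert(o != NULL && o->refcnt > 0);
    if (--o->refcnt == 0)
        o->dealloc(o);
}

/* Save the pointer, overwrite the variable with NULL, and only then
 * drop the reference. A NULL argument is left untouched. */
#define TOY_CLEAR(op)                      \
    do {                                   \
        toy_object *toy_tmp_ = (op);       \
        if (toy_tmp_ != NULL) {            \
            (op) = NULL;                   \
            toy_decref(toy_tmp_);          \
        }                                  \
    } while (0)

/* A variable that deallocation code might look at. */
static toy_object *cached = NULL;
static int dealloc_saw_null = -1;

static void
checking_dealloc(toy_object *self)
{
    (void)self;
    dealloc_saw_null = (cached == NULL);   /* what did the destructor see? */
}
```

With this ordering, code triggered by the deallocation finds ``cached`` already
set to *NULL* and cannot reach the object that is being destroyed.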
The following functions are for runtime dynamic embedding of Python:
``Py_IncRef(PyObject *o)``, ``Py_DecRef(PyObject *o)``. They are
simply exported function versions of :cfunc:`Py_XINCREF` and
:cfunc:`Py_XDECREF`, respectively.
The following functions or macros are only for use within the interpreter core:
:cfunc:`_Py_Dealloc`, :cfunc:`_Py_ForgetReference`, :cfunc:`_Py_NewReference`,
as well as the global variable :cdata:`_Py_RefTotal`.

@@ -0,0 +1,50 @@
.. highlightlang:: c
.. _reflection:
Reflection
==========
.. cfunction:: PyObject* PyEval_GetBuiltins()
Return a dictionary of the builtins in the current execution frame,
or the interpreter of the thread state if no frame is currently executing.
.. cfunction:: PyObject* PyEval_GetLocals()
Return a dictionary of the local variables in the current execution frame,
or *NULL* if no frame is currently executing.
.. cfunction:: PyObject* PyEval_GetGlobals()
Return a dictionary of the global variables in the current execution frame,
or *NULL* if no frame is currently executing.
.. cfunction:: PyFrameObject* PyEval_GetFrame()
Return the current thread state's frame, which is *NULL* if no frame is
currently executing.
.. cfunction:: int PyEval_GetRestricted()
If there is a current frame and it is executing in restricted mode, return true,
otherwise false.
.. cfunction:: const char* PyEval_GetFuncName(PyObject *func)
Return the name of *func* if it is a function, class or instance object, else the
name of *func*'s type.
.. cfunction:: const char* PyEval_GetFuncDesc(PyObject *func)
Return a description string, depending on the type of *func*.
Return values include "()" for functions and methods, " constructor",
" instance", and " object". Concatenated with the result of
:cfunc:`PyEval_GetFuncName`, the result will be a description of
*func*.

@@ -0,0 +1,170 @@
.. highlightlang:: c
.. _sequence:
Sequence Protocol
=================
.. cfunction:: int PySequence_Check(PyObject *o)
Return ``1`` if the object provides the sequence protocol, and ``0`` otherwise.
This function always succeeds.
.. cfunction:: Py_ssize_t PySequence_Size(PyObject *o)
.. index:: builtin: len
Returns the number of objects in sequence *o* on success, and ``-1`` on failure.
This is equivalent to the Python expression ``len(o)``.
.. cfunction:: Py_ssize_t PySequence_Length(PyObject *o)
Alternate name for :cfunc:`PySequence_Size`.
.. cfunction:: PyObject* PySequence_Concat(PyObject *o1, PyObject *o2)
Return the concatenation of *o1* and *o2* on success, and *NULL* on failure.
This is the equivalent of the Python expression ``o1 + o2``.
.. cfunction:: PyObject* PySequence_Repeat(PyObject *o, Py_ssize_t count)
Return the result of repeating sequence object *o* *count* times, or *NULL* on
failure. This is the equivalent of the Python expression ``o * count``.
.. cfunction:: PyObject* PySequence_InPlaceConcat(PyObject *o1, PyObject *o2)
Return the concatenation of *o1* and *o2* on success, and *NULL* on failure.
The operation is done *in-place* when *o1* supports it. This is the equivalent
of the Python expression ``o1 += o2``.
.. cfunction:: PyObject* PySequence_InPlaceRepeat(PyObject *o, Py_ssize_t count)
Return the result of repeating sequence object *o* *count* times, or *NULL* on
failure. The operation is done *in-place* when *o* supports it. This is the
equivalent of the Python expression ``o *= count``.
.. cfunction:: PyObject* PySequence_GetItem(PyObject *o, Py_ssize_t i)
Return the *i*\ th element of *o*, or *NULL* on failure. This is the equivalent of
the Python expression ``o[i]``.
.. cfunction:: PyObject* PySequence_GetSlice(PyObject *o, Py_ssize_t i1, Py_ssize_t i2)
Return the slice of sequence object *o* between *i1* and *i2*, or *NULL* on
failure. This is the equivalent of the Python expression ``o[i1:i2]``.
.. cfunction:: int PySequence_SetItem(PyObject *o, Py_ssize_t i, PyObject *v)
Assign object *v* to the *i*\ th element of *o*. Returns ``-1`` on failure. This
is the equivalent of the Python statement ``o[i] = v``. This function *does
not* steal a reference to *v*.
.. cfunction:: int PySequence_DelItem(PyObject *o, Py_ssize_t i)
Delete the *i*\ th element of object *o*. Returns ``-1`` on failure. This is the
equivalent of the Python statement ``del o[i]``.
.. cfunction:: int PySequence_SetSlice(PyObject *o, Py_ssize_t i1, Py_ssize_t i2, PyObject *v)
Assign the sequence object *v* to the slice in sequence object *o* from *i1* to
*i2*. This is the equivalent of the Python statement ``o[i1:i2] = v``.
.. cfunction:: int PySequence_DelSlice(PyObject *o, Py_ssize_t i1, Py_ssize_t i2)
Delete the slice in sequence object *o* from *i1* to *i2*. Returns ``-1`` on
failure. This is the equivalent of the Python statement ``del o[i1:i2]``.
.. cfunction:: Py_ssize_t PySequence_Count(PyObject *o, PyObject *value)
Return the number of occurrences of *value* in *o*, that is, return the number
of keys for which ``o[key] == value``. On failure, return ``-1``. This is
equivalent to the Python expression ``o.count(value)``.
.. cfunction:: int PySequence_Contains(PyObject *o, PyObject *value)
Determine if *o* contains *value*. If an item in *o* is equal to *value*,
return ``1``, otherwise return ``0``. On error, return ``-1``. This is
equivalent to the Python expression ``value in o``.
.. cfunction:: Py_ssize_t PySequence_Index(PyObject *o, PyObject *value)
Return the first index *i* for which ``o[i] == value``. On error, return
``-1``. This is equivalent to the Python expression ``o.index(value)``.
.. cfunction:: PyObject* PySequence_List(PyObject *o)
Return a list object with the same contents as the arbitrary sequence *o*. The
returned list is guaranteed to be new.
.. cfunction:: PyObject* PySequence_Tuple(PyObject *o)
.. index:: builtin: tuple
Return a tuple object with the same contents as the arbitrary sequence *o* or
*NULL* on failure. If *o* is a tuple, a new reference will be returned,
otherwise a tuple will be constructed with the appropriate contents. This is
equivalent to the Python expression ``tuple(o)``.
.. cfunction:: PyObject* PySequence_Fast(PyObject *o, const char *m)
Returns the sequence *o* as a tuple, unless it is already a tuple or list, in
which case *o* is returned. Use :cfunc:`PySequence_Fast_GET_ITEM` to access the
members of the result. Returns *NULL* on failure. If the object is not a
sequence, raises :exc:`TypeError` with *m* as the message text.
.. cfunction:: PyObject* PySequence_Fast_GET_ITEM(PyObject *o, Py_ssize_t i)
Return the *i*\ th element of *o*, assuming that *o* was returned by
:cfunc:`PySequence_Fast`, *o* is not *NULL*, and that *i* is within bounds.
.. cfunction:: PyObject** PySequence_Fast_ITEMS(PyObject *o)
Return the underlying array of PyObject pointers. Assumes that *o* was returned
by :cfunc:`PySequence_Fast` and *o* is not *NULL*.
Note, if a list gets resized, the reallocation may relocate the items array.
So, only use the underlying array pointer in contexts where the sequence
cannot change.
.. versionadded:: 2.4
.. cfunction:: PyObject* PySequence_ITEM(PyObject *o, Py_ssize_t i)
Return the *i*\ th element of *o* or *NULL* on failure. Macro form of
:cfunc:`PySequence_GetItem` but without checking that
:cfunc:`PySequence_Check(o)` is true and without adjustment for negative
indices.
.. versionadded:: 2.3
.. cfunction:: Py_ssize_t PySequence_Fast_GET_SIZE(PyObject *o)
Returns the length of *o*, assuming that *o* was returned by
:cfunc:`PySequence_Fast` and that *o* is not *NULL*. The size can also be
obtained by calling :cfunc:`PySequence_Size` on *o*, but
:cfunc:`PySequence_Fast_GET_SIZE` is faster because it can assume *o* is a list
or tuple.

@@ -0,0 +1,171 @@
.. highlightlang:: c
.. _setobjects:
Set Objects
-----------
.. sectionauthor:: Raymond D. Hettinger <python@rcn.com>
.. index::
object: set
object: frozenset
.. versionadded:: 2.5
This section details the public API for :class:`set` and :class:`frozenset`
objects. Any functionality not listed below is best accessed using either
the abstract object protocol (including :cfunc:`PyObject_CallMethod`,
:cfunc:`PyObject_RichCompareBool`, :cfunc:`PyObject_Hash`,
:cfunc:`PyObject_Repr`, :cfunc:`PyObject_IsTrue`, :cfunc:`PyObject_Print`, and
:cfunc:`PyObject_GetIter`) or the abstract number protocol (including
:cfunc:`PyNumber_And`, :cfunc:`PyNumber_Subtract`, :cfunc:`PyNumber_Or`,
:cfunc:`PyNumber_Xor`, :cfunc:`PyNumber_InPlaceAnd`,
:cfunc:`PyNumber_InPlaceSubtract`, :cfunc:`PyNumber_InPlaceOr`, and
:cfunc:`PyNumber_InPlaceXor`).
.. ctype:: PySetObject
This subtype of :ctype:`PyObject` is used to hold the internal data for both
:class:`set` and :class:`frozenset` objects. It is like a :ctype:`PyDictObject`
in that it is a fixed size for small sets (much like tuple storage) and will
point to a separate, variable sized block of memory for medium and large sized
sets (much like list storage). None of the fields of this structure should be
considered public, and all are subject to change. All access should be done through
the documented API rather than by manipulating the values in the structure.
.. cvar:: PyTypeObject PySet_Type
This is an instance of :ctype:`PyTypeObject` representing the Python
:class:`set` type.
.. cvar:: PyTypeObject PyFrozenSet_Type
This is an instance of :ctype:`PyTypeObject` representing the Python
:class:`frozenset` type.
The following type check macros work on pointers to any Python object. Likewise,
the constructor functions work with any iterable Python object.
.. cfunction:: int PySet_Check(PyObject *p)
Return true if *p* is a :class:`set` object or an instance of a subtype.
.. versionadded:: 2.6
.. cfunction:: int PyFrozenSet_Check(PyObject *p)
Return true if *p* is a :class:`frozenset` object or an instance of a
subtype.
.. versionadded:: 2.6
.. cfunction:: int PyAnySet_Check(PyObject *p)
Return true if *p* is a :class:`set` object, a :class:`frozenset` object, or an
instance of a subtype.
.. cfunction:: int PyAnySet_CheckExact(PyObject *p)
Return true if *p* is a :class:`set` object or a :class:`frozenset` object but
not an instance of a subtype.
.. cfunction:: int PyFrozenSet_CheckExact(PyObject *p)
Return true if *p* is a :class:`frozenset` object but not an instance of a
subtype.
.. cfunction:: PyObject* PySet_New(PyObject *iterable)
Return a new :class:`set` containing objects returned by the *iterable*. The
*iterable* may be *NULL* to create a new empty set. Return the new set on
success or *NULL* on failure. Raise :exc:`TypeError` if *iterable* is not
actually iterable. The constructor is also useful for copying a set
(``c=set(s)``).
.. cfunction:: PyObject* PyFrozenSet_New(PyObject *iterable)
Return a new :class:`frozenset` containing objects returned by the *iterable*.
The *iterable* may be *NULL* to create a new empty frozenset. Return the new
set on success or *NULL* on failure. Raise :exc:`TypeError` if *iterable* is
not actually iterable.
.. versionchanged:: 2.6
Now guaranteed to return a brand-new :class:`frozenset`. Formerly,
zero-length frozensets were a singleton. This got in the way of
building up new frozensets with :cfunc:`PySet_Add`.
The following functions and macros are available for instances of :class:`set`
or :class:`frozenset` or instances of their subtypes.
.. cfunction:: Py_ssize_t PySet_Size(PyObject *anyset)
.. index:: builtin: len
Return the length of a :class:`set` or :class:`frozenset` object. Equivalent to
``len(anyset)``. Raises a :exc:`SystemError` if *anyset* is not a
:class:`set`, :class:`frozenset`, or an instance of a subtype.
.. cfunction:: Py_ssize_t PySet_GET_SIZE(PyObject *anyset)
Macro form of :cfunc:`PySet_Size` without error checking.
.. cfunction:: int PySet_Contains(PyObject *anyset, PyObject *key)
Return 1 if found, 0 if not found, and -1 if an error is encountered. Unlike
the Python :meth:`__contains__` method, this function does not automatically
convert unhashable sets into temporary frozensets. Raise a :exc:`TypeError` if
the *key* is unhashable. Raise :exc:`SystemError` if *anyset* is not a
:class:`set`, :class:`frozenset`, or an instance of a subtype.
.. cfunction:: int PySet_Add(PyObject *set, PyObject *key)
Add *key* to a :class:`set` instance. Does not apply to :class:`frozenset`
instances. Return 0 on success or -1 on failure. Raise a :exc:`TypeError` if
the *key* is unhashable. Raise a :exc:`MemoryError` if there is no room to grow.
Raise a :exc:`SystemError` if *set* is not an instance of :class:`set` or its
subtype.
.. versionchanged:: 2.6
Now works with instances of :class:`frozenset` or its subtypes.
Like :cfunc:`PyTuple_SetItem` in that it can be used to fill in the
values of brand new frozensets before they are exposed to other code.
The following functions are available for instances of :class:`set` or its
subtypes but not for instances of :class:`frozenset` or its subtypes.
.. cfunction:: int PySet_Discard(PyObject *set, PyObject *key)
Return 1 if found and removed, 0 if not found (no action taken), and -1 if an
error is encountered. Does not raise :exc:`KeyError` for missing keys. Raise a
:exc:`TypeError` if the *key* is unhashable. Unlike the Python :meth:`discard`
method, this function does not automatically convert unhashable sets into
temporary frozensets. Raise :exc:`SystemError` if *set* is not an
instance of :class:`set` or its subtype.
.. cfunction:: PyObject* PySet_Pop(PyObject *set)
Return a new reference to an arbitrary object in the *set*, and remove the
object from the *set*. Return *NULL* on failure. Raise :exc:`KeyError` if the
set is empty. Raise a :exc:`SystemError` if *set* is not an instance of
:class:`set` or its subtype.
.. cfunction:: int PySet_Clear(PyObject *set)
Empty an existing set of all elements.

@@ -0,0 +1,56 @@
.. highlightlang:: c
.. _slice-objects:
Slice Objects
-------------
.. cvar:: PyTypeObject PySlice_Type
.. index:: single: SliceType (in module types)
The type object for slice objects. This is the same as ``slice`` and
``types.SliceType``.
.. cfunction:: int PySlice_Check(PyObject *ob)
Return true if *ob* is a slice object; *ob* must not be *NULL*.
.. cfunction:: PyObject* PySlice_New(PyObject *start, PyObject *stop, PyObject *step)
Return a new slice object with the given values. The *start*, *stop*, and
*step* parameters are used as the values of the slice object attributes of the
same names. Any of the values may be *NULL*, in which case ``None`` will be
used for the corresponding attribute. Return *NULL* if the new object could not
be allocated.
.. cfunction:: int PySlice_GetIndices(PySliceObject *slice, Py_ssize_t length, Py_ssize_t *start, Py_ssize_t *stop, Py_ssize_t *step)
Retrieve the start, stop and step indices from the slice object *slice*,
assuming a sequence of length *length*. Treats indices greater than *length* as
errors.
Returns 0 on success and -1 on error with no exception set (unless one of the
indices was not :const:`None` and failed to be converted to an integer, in which
case -1 is returned with an exception set).
You probably do not want to use this function. If you want to use slice objects
in versions of Python prior to 2.3, you would probably do well to incorporate
the source of :cfunc:`PySlice_GetIndicesEx`, suitably renamed, in the source of
your extension.
.. cfunction:: int PySlice_GetIndicesEx(PySliceObject *slice, Py_ssize_t length, Py_ssize_t *start, Py_ssize_t *stop, Py_ssize_t *step, Py_ssize_t *slicelength)
Usable replacement for :cfunc:`PySlice_GetIndices`. Retrieve the start, stop,
and step indices from the slice object *slice* assuming a sequence of length
*length*, and store the length of the slice in *slicelength*. Out of bounds
indices are clipped in a manner consistent with the handling of normal slices.
Returns 0 on success and -1 on error with exception set.
.. versionadded:: 2.3
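The clipping that :cfunc:`PySlice_GetIndicesEx` applies can be sketched in
plain C for the case where *start*, *stop* and *step* are already integers.
This is a simplified illustration, not the library source: ``clip_slice`` is a
hypothetical helper, it uses ``long`` in place of :ctype:`Py_ssize_t`, and it
does not model ``None`` fields or extreme values near the limits of ``long``.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a Python C API function. Clips *start and *stop
 * for a sequence of `length` items, the way normal slices are handled, and
 * returns the number of selected items. `step` must be non-zero. */
static long
clip_slice(long length, long *start, long *stop, long step)
{
    assert(length >= 0 && step != 0);
    assert(start != NULL && stop != NULL);

    if (*start < 0) {
        *start += length;                       /* count from the end */
        if (*start < 0)
            *start = (step < 0) ? -1 : 0;
    }
    else if (*start >= length) {
        *start = (step < 0) ? length - 1 : length;
    }

    if (*stop < 0) {
        *stop += length;
        if (*stop < 0)
            *stop = (step < 0) ? -1 : 0;
    }
    else if (*stop >= length) {
        *stop = (step < 0) ? length - 1 : length;
    }

    if (step < 0)
        return (*stop < *start) ? (*start - *stop - 1) / (-step) + 1 : 0;
    return (*start < *stop) ? (*stop - *start - 1) / step + 1 : 0;
}
```

For a sequence of length 10, the slice ``[2:100:3]`` is clipped to start 2 and
stop 10 and selects 3 items. With a negative step, a clipped stop of ``-1``
means "one position before the first item", not the Python index ``-1``.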

@@ -0,0 +1,264 @@
.. highlightlang:: c
.. _stringobjects:
String/Bytes Objects
--------------------
These functions raise :exc:`TypeError` when expecting a string parameter and are
called with a non-string parameter.
.. note::
These functions have been renamed to PyBytes_* in Python 3.x. The PyBytes
names are also available in 2.6.
.. index:: object: string
.. ctype:: PyStringObject
This subtype of :ctype:`PyObject` represents a Python string object.
.. cvar:: PyTypeObject PyString_Type
.. index:: single: StringType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python string type; it is
the same object as ``str`` and ``types.StringType`` in the Python layer.
.. cfunction:: int PyString_Check(PyObject *o)
Return true if the object *o* is a string object or an instance of a subtype of
the string type.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyString_CheckExact(PyObject *o)
Return true if the object *o* is a string object, but not an instance of a
subtype of the string type.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyString_FromString(const char *v)
Return a new string object with a copy of the string *v* as value on success,
and *NULL* on failure. The parameter *v* must not be *NULL*; it will not be
checked.
.. cfunction:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)
Return a new string object with a copy of the string *v* as value and length
*len* on success, and *NULL* on failure. If *v* is *NULL*, the contents of the
string are uninitialized.
.. cfunction:: PyObject* PyString_FromFormat(const char *format, ...)
Take a C :cfunc:`printf`\ -style *format* string and a variable number of
arguments, calculate the size of the resulting Python string and return a string
with the values formatted into it. The variable arguments must be C types and
must correspond exactly to the format characters in the *format* string. The
following format characters are allowed:
.. % This should be exactly the same as the table in PyErr_Format.
.. % One should just refer to the other.
.. % The descriptions for %zd and %zu are wrong, but the truth is complicated
.. % because not all compilers support the %z width modifier -- we fake it
.. % when necessary via interpolating PY_FORMAT_SIZE_T.
.. % %u, %lu, %zu should have "new in Python 2.5" blurbs.
+-------------------+---------------+--------------------------------+
| Format Characters | Type          | Comment                        |
+===================+===============+================================+
| :attr:`%%`        | *n/a*         | The literal % character.       |
+-------------------+---------------+--------------------------------+
| :attr:`%c`        | int           | A single character,            |
|                   |               | represented as a C int.        |
+-------------------+---------------+--------------------------------+
| :attr:`%d`        | int           | Exactly equivalent to          |
|                   |               | ``printf("%d")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%u`        | unsigned int  | Exactly equivalent to          |
|                   |               | ``printf("%u")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%ld`       | long          | Exactly equivalent to          |
|                   |               | ``printf("%ld")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%lu`       | unsigned long | Exactly equivalent to          |
|                   |               | ``printf("%lu")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
|                   |               | ``printf("%zd")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%zu`       | size_t        | Exactly equivalent to          |
|                   |               | ``printf("%zu")``.             |
+-------------------+---------------+--------------------------------+
| :attr:`%i`        | int           | Exactly equivalent to          |
|                   |               | ``printf("%i")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%x`        | int           | Exactly equivalent to          |
|                   |               | ``printf("%x")``.              |
+-------------------+---------------+--------------------------------+
| :attr:`%s`        | char\*        | A null-terminated C character  |
|                   |               | array.                         |
+-------------------+---------------+--------------------------------+
| :attr:`%p`        | void\*        | The hex representation of a C  |
|                   |               | pointer. Mostly equivalent to  |
|                   |               | ``printf("%p")`` except that   |
|                   |               | it is guaranteed to start with |
|                   |               | the literal ``0x`` regardless  |
|                   |               | of what the platform's         |
|                   |               | ``printf`` yields.             |
+-------------------+---------------+--------------------------------+
An unrecognized format character causes all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
.. cfunction:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)
Identical to :cfunc:`PyString_FromFormat` except that the variable arguments are
passed as a single :ctype:`va_list`, so it takes exactly two arguments.
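The ``V`` variant exists so that a caller's own variadic function can forward
its arguments, which is not possible with the ``...`` form. The sketch below
shows the same split in plain C using ``vsnprintf``. It is an analogy under
stated assumptions rather than Python API code: ``toy_vformat``, ``toy_format``
and ``format_error`` are hypothetical names.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* va_list form: the variable arguments arrive as one va_list parameter.
 * Hypothetical; it stands in for the "V" variant. */
static int
toy_vformat(char *buf, size_t size, const char *format, va_list vargs)
{
    assert(buf != NULL && format != NULL);
    return vsnprintf(buf, size, format, vargs);
}

/* "..." form: a thin wrapper that packs its arguments into a va_list. */
static int
toy_format(char *buf, size_t size, const char *format, ...)
{
    va_list vargs;
    int n;

    va_start(vargs, format);
    n = toy_vformat(buf, size, format, vargs);
    va_end(vargs);
    return n;
}

/* A caller-defined variadic function that adds a prefix and forwards the
 * rest of its arguments. It has to call the va_list form to do so. */
static int
format_error(char *buf, size_t size, const char *format, ...)
{
    static const char prefix[] = "error: ";
    size_t used = strlen(prefix);
    va_list vargs;
    int n;

    assert(buf != NULL);
    if (used >= size)
        return -1;
    memcpy(buf, prefix, used);
    va_start(vargs, format);
    n = toy_vformat(buf + used, size - used, format, vargs);
    va_end(vargs);
    return (n < 0) ? -1 : (int)used + n;
}
```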
.. cfunction:: Py_ssize_t PyString_Size(PyObject *string)
Return the length of the string in string object *string*.
.. cfunction:: Py_ssize_t PyString_GET_SIZE(PyObject *string)
Macro form of :cfunc:`PyString_Size` but without error checking.
.. cfunction:: char* PyString_AsString(PyObject *string)
Return a NUL-terminated representation of the contents of *string*. The pointer
refers to the internal buffer of *string*, not a copy. The data must not be
modified in any way, unless the string was just created using
``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
*string* is a Unicode object, this function computes the default encoding of
*string* and operates on that. If *string* is not a string object at all,
:cfunc:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.
.. cfunction:: char* PyString_AS_STRING(PyObject *string)
Macro form of :cfunc:`PyString_AsString` but without error checking. Only
string objects are supported; no Unicode objects should be passed.
.. cfunction:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)
Return a NUL-terminated representation of the contents of the object *obj*
through the output variables *buffer* and *length*.
The function accepts both string and Unicode objects as input. For Unicode
objects it returns the default encoded version of the object. If *length* is
*NULL*, the resulting buffer must not contain embedded NUL characters; if it does, the
function returns ``-1`` and a :exc:`TypeError` is raised.
The buffer refers to an internal string buffer of *obj*, not a copy. The data
must not be modified in any way, unless the string was just created using
``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
*obj* is a Unicode object, this function computes the default encoding of
*obj* and operates on that. If *obj* is not a string object at all,
:cfunc:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.
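The rule about a *NULL* *length* exists because, without a length, the caller
can only find the end of the data with ``strlen``, which stops at the first NUL
byte. A minimal sketch of that check in plain C follows; ``has_embedded_nul`` is
a hypothetical helper, not a function of the Python C API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: report whether the first `len` bytes of `buf`
 * contain a NUL byte, i.e. whether strlen(buf) would under-report the
 * real length of the data. */
static int
has_embedded_nul(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}
```

For the five bytes ``"ab\0cd"`` the helper reports an embedded NUL, and
``strlen`` would see only the first two bytes.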
.. cfunction:: void PyString_Concat(PyObject **string, PyObject *newpart)
Create a new string object in *\*string* containing the contents of *newpart*
appended to *string*; the caller will own the new reference. The reference to
the old value of *string* will be stolen. If the new string cannot be created,
the old reference to *string* will still be discarded and the value of
*\*string* will be set to *NULL*; the appropriate exception will be set.
.. cfunction:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)
Create a new string object in *\*string* containing the contents of *newpart*
appended to *string*. This version decrements the reference count of *newpart*.
.. cfunction:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)
A way to resize a string object even though it is "immutable". Only use this to
build up a brand new string object; don't use this if the string may already be
known in other parts of the code. It is an error to call this function if the
refcount on the input string object is not one. Pass the address of an existing
string object as an lvalue (it may be written into), and the new size desired.
On success, *\*string* holds the resized string object and ``0`` is returned;
the address in *\*string* may differ from its input value. If the reallocation
fails, the original string object at *\*string* is deallocated, *\*string* is
set to *NULL*, a memory exception is set, and ``-1`` is returned.
.. cfunction:: PyObject* PyString_Format(PyObject *format, PyObject *args)
Return a new string object from *format* and *args*. Analogous to ``format %
args``. The *args* argument must be a tuple.
.. cfunction:: void PyString_InternInPlace(PyObject **string)
Intern the argument *\*string* in place. The argument must be the address of a
pointer variable pointing to a Python string object. If there is an existing
interned string that is the same as *\*string*, it sets *\*string* to it
(decrementing the reference count of the old string object and incrementing the
reference count of the interned string object), otherwise it leaves *\*string*
alone and interns it (incrementing its reference count). (Clarification: even
though there is a lot of talk about reference counts, think of this function as
reference-count-neutral; you own the object after the call if and only if you
owned it before the call.)
.. cfunction:: PyObject* PyString_InternFromString(const char *v)
A combination of :cfunc:`PyString_FromString` and
:cfunc:`PyString_InternInPlace`, returning either a new string object that has
been interned, or a new ("owned") reference to an earlier interned string object
with the same value.
.. cfunction:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)
Create an object by decoding *size* bytes of the encoded buffer *s* using the
codec registered for *encoding*. *encoding* and *errors* have the same meaning
as the parameters of the same name in the :func:`unicode` built-in function.
The codec to be used is looked up using the Python codec registry. Return
*NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)
Decode a string object by passing it to the codec registered for *encoding* and
return the result as a Python object. *encoding* and *errors* have the same
meaning as the parameters of the same name in the string :meth:`encode` method.
The codec to be used is looked up using the Python codec registry. Return *NULL*
if an exception was raised by the codec.
.. cfunction:: PyObject* PyString_Encode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)
Encode the :ctype:`char` buffer of the given size by passing it to the codec
registered for *encoding* and return a Python object. *encoding* and *errors*
have the same meaning as the parameters of the same name in the string
:meth:`encode` method. The codec to be used is looked up using the Python codec
registry. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)
Encode a string object using the codec registered for *encoding* and return the
result as a Python object. *encoding* and *errors* have the same meaning as the
parameters of the same name in the string :meth:`encode` method. The codec to be
used is looked up using the Python codec registry. Return *NULL* if an exception
was raised by the codec.


@@ -0,0 +1,274 @@
.. highlightlang:: c
.. _common-structs:
Common Object Structures
========================
There are a large number of structures which are used in the definition of
object types for Python. This section describes these structures and how they
are used.
All Python objects ultimately share a small number of fields at the beginning of
the object's representation in memory. These are represented by the
:ctype:`PyObject` and :ctype:`PyVarObject` types, which are defined, in turn, by
the expansions of some macros also used, whether directly or indirectly, in the
definition of all other Python objects.
.. ctype:: PyObject
All object types are extensions of this type. This is a type which contains the
information Python needs to treat a pointer to an object as an object. In a
normal "release" build, it contains only the object's reference count and a
pointer to the corresponding type object. It corresponds to the fields defined
by the expansion of the ``PyObject_HEAD`` macro.
.. ctype:: PyVarObject
This is an extension of :ctype:`PyObject` that adds the :attr:`ob_size` field.
This is only used for objects that have some notion of *length*. This type does
not often appear in the Python/C API. It corresponds to the fields defined by
the expansion of the ``PyObject_VAR_HEAD`` macro.
These macros are used in the definition of :ctype:`PyObject` and
:ctype:`PyVarObject`:
.. cmacro:: PyObject_HEAD
This is a macro which expands to the declarations of the fields of the
:ctype:`PyObject` type; it is used when declaring new types which represent
objects without a varying length. The specific fields it expands to depend on
the definition of :cmacro:`Py_TRACE_REFS`. By default, that macro is not
defined, and :cmacro:`PyObject_HEAD` expands to::
   Py_ssize_t ob_refcnt;
   PyTypeObject *ob_type;
When :cmacro:`Py_TRACE_REFS` is defined, it expands to::
   PyObject *_ob_next, *_ob_prev;
   Py_ssize_t ob_refcnt;
   PyTypeObject *ob_type;
.. cmacro:: PyObject_VAR_HEAD
This is a macro which expands to the declarations of the fields of the
:ctype:`PyVarObject` type; it is used when declaring new types which represent
objects with a length that varies from instance to instance. This macro always
expands to::
   PyObject_HEAD
   Py_ssize_t ob_size;
Note that :cmacro:`PyObject_HEAD` is part of the expansion, and that its own
expansion varies depending on the definition of :cmacro:`Py_TRACE_REFS`.
PyObject_HEAD_INIT
.. ctype:: PyCFunction
Type of the functions used to implement most Python callables in C. Functions of
this type take two :ctype:`PyObject\*` parameters and return one such value. If
the return value is *NULL*, an exception shall have been set. If not *NULL*,
the return value is interpreted as the return value of the function as exposed
in Python. The function must return a new reference.
.. ctype:: PyMethodDef
Structure used to describe a method of an extension type. This structure has
four fields:
+------------------+-------------+-------------------------------+
| Field            | C Type      | Meaning                       |
+==================+=============+===============================+
| :attr:`ml_name`  | char \*     | name of the method            |
+------------------+-------------+-------------------------------+
| :attr:`ml_meth`  | PyCFunction | pointer to the C              |
|                  |             | implementation                |
+------------------+-------------+-------------------------------+
| :attr:`ml_flags` | int         | flag bits indicating how the  |
|                  |             | call should be constructed    |
+------------------+-------------+-------------------------------+
| :attr:`ml_doc`   | char \*     | points to the contents of the |
|                  |             | docstring                     |
+------------------+-------------+-------------------------------+
The :attr:`ml_meth` is a C function pointer. The functions may be of different
types, but they always return :ctype:`PyObject\*`. If the function is not of
the :ctype:`PyCFunction` type, the compiler will require a cast in the method
table. Even though :ctype:`PyCFunction` defines the first parameter as
:ctype:`PyObject\*`, it is common for the method implementation to use the
specific C type of the *self* object.
The :attr:`ml_flags` field is a bitfield which can include the following flags.
The individual flags indicate either a calling convention or a binding
convention. Of the calling convention flags, only :const:`METH_VARARGS` and
:const:`METH_KEYWORDS` can be combined (but note that :const:`METH_KEYWORDS`
alone is equivalent to ``METH_VARARGS | METH_KEYWORDS``). Any of the calling
convention flags can be combined with a binding flag.
.. data:: METH_VARARGS
This is the typical calling convention, where the methods have the type
:ctype:`PyCFunction`. The function expects two :ctype:`PyObject\*` values. The
first one is the *self* object for methods; for module functions, it has the
value given to :cfunc:`Py_InitModule4` (or *NULL* if :cfunc:`Py_InitModule` was
used). The second parameter (often called *args*) is a tuple object
representing all arguments. This parameter is typically processed using
:cfunc:`PyArg_ParseTuple` or :cfunc:`PyArg_UnpackTuple`.
.. data:: METH_KEYWORDS
Methods with these flags must be of type :ctype:`PyCFunctionWithKeywords`. The
function expects three parameters: *self*, *args*, and a dictionary of all the
keyword arguments. The flag is typically combined with :const:`METH_VARARGS`,
and the parameters are typically processed using
:cfunc:`PyArg_ParseTupleAndKeywords`.
.. data:: METH_NOARGS
Methods without parameters don't need to check whether arguments are given if
they are listed with the :const:`METH_NOARGS` flag. They need to be of type
:ctype:`PyCFunction`. When used with object methods, the first parameter is
typically named ``self`` and will hold a reference to the object instance. In
all cases the second parameter will be *NULL*.
.. data:: METH_O
Methods with a single object argument can be listed with the :const:`METH_O`
flag, instead of invoking :cfunc:`PyArg_ParseTuple` with an ``"O"`` argument.
They have the type :ctype:`PyCFunction`, with the *self* parameter, and a
:ctype:`PyObject\*` parameter representing the single argument.
.. data:: METH_OLDARGS
This calling convention is deprecated. The method must be of type
:ctype:`PyCFunction`. The second argument is *NULL* if no arguments are given,
a single object if exactly one argument is given, and a tuple of objects if more
than one argument is given. There is no way for a function using this
convention to distinguish between a call with multiple arguments and a call with
a tuple as the only argument.
These two constants do not indicate the calling convention but the binding
when used with methods of classes. They may not be used for functions defined
for modules. At most one of these flags may be set for any given method.
.. data:: METH_CLASS
.. index:: builtin: classmethod
The method will be passed the type object as the first parameter rather than an
instance of the type. This is used to create *class methods*, similar to what
is created when using the :func:`classmethod` built-in function.
.. versionadded:: 2.3
.. data:: METH_STATIC
.. index:: builtin: staticmethod
The method will be passed *NULL* as the first parameter rather than an instance
of the type. This is used to create *static methods*, similar to what is
created when using the :func:`staticmethod` built-in function.
.. versionadded:: 2.3
One other constant controls whether a method is loaded in place of another
definition with the same method name.
.. data:: METH_COEXIST
The method will be loaded in place of existing definitions. Without
*METH_COEXIST*, the default is to skip repeated definitions. Since slot
wrappers are loaded before the method table, the existence of a *sq_contains*
slot, for example, would generate a wrapped method named :meth:`__contains__`
and preclude the loading of a corresponding PyCFunction with the same name.
With the flag defined, the PyCFunction will be loaded in place of the wrapper
object and will co-exist with the slot. This is helpful because calls to
PyCFunctions are optimized more than wrapper object calls.
.. versionadded:: 2.4
.. ctype:: PyMemberDef
Structure which describes an attribute of a type which corresponds to a C
struct member. Its fields are:
+------------------+-------------+-------------------------------+
| Field            | C Type      | Meaning                       |
+==================+=============+===============================+
| :attr:`name`     | char \*     | name of the member            |
+------------------+-------------+-------------------------------+
| :attr:`type`     | int         | the type of the member in the |
|                  |             | C struct                      |
+------------------+-------------+-------------------------------+
| :attr:`offset`   | Py_ssize_t  | the offset in bytes that the  |
|                  |             | member is located on the      |
|                  |             | type's object struct          |
+------------------+-------------+-------------------------------+
| :attr:`flags`    | int         | flag bits indicating if the   |
|                  |             | field should be read-only or  |
|                  |             | writable                      |
+------------------+-------------+-------------------------------+
| :attr:`doc`      | char \*     | points to the contents of the |
|                  |             | docstring                     |
+------------------+-------------+-------------------------------+
:attr:`type` can be one of many ``T_`` macros corresponding to various C
types. When the member is accessed in Python, it will be converted to the
equivalent Python type.
=============== ==================
Macro name      C type
=============== ==================
T_SHORT         short
T_INT           int
T_LONG          long
T_FLOAT         float
T_DOUBLE        double
T_STRING        char \*
T_OBJECT        PyObject \*
T_OBJECT_EX     PyObject \*
T_CHAR          char
T_BYTE          char
T_UBYTE         unsigned char
T_UINT          unsigned int
T_USHORT        unsigned short
T_ULONG         unsigned long
T_BOOL          char
T_LONGLONG      long long
T_ULONGLONG     unsigned long long
T_PYSSIZET      Py_ssize_t
=============== ==================
:cmacro:`T_OBJECT` and :cmacro:`T_OBJECT_EX` differ in that
:cmacro:`T_OBJECT` returns ``None`` if the member is *NULL* and
:cmacro:`T_OBJECT_EX` raises an :exc:`AttributeError`.
:attr:`flags` can be 0 for write and read access or :cmacro:`READONLY` for
read-only access. Using :cmacro:`T_STRING` for :attr:`type` implies
:cmacro:`READONLY`. Only :cmacro:`T_OBJECT` and :cmacro:`T_OBJECT_EX`
members can be deleted. (They are set to *NULL*).
.. cfunction:: PyObject* Py_FindMethod(PyMethodDef table[], PyObject *ob, char *name)
Return a bound method object for an extension type implemented in C. This can
be useful in the implementation of a :attr:`tp_getattro` or :attr:`tp_getattr`
handler that does not use the :cfunc:`PyObject_GenericGetAttr` function.


@@ -0,0 +1,158 @@
.. highlightlang:: c
.. _os:
Operating System Utilities
==========================
.. cfunction:: int Py_FdIsInteractive(FILE *fp, const char *filename)
Return true (nonzero) if the standard I/O file *fp* with name *filename* is
deemed interactive. This is the case for files for which ``isatty(fileno(fp))``
is true. If the global flag :cdata:`Py_InteractiveFlag` is true, this function
also returns true if the *filename* pointer is *NULL* or if the name is equal to
one of the strings ``'<stdin>'`` or ``'???'``.
.. cfunction:: long PyOS_GetLastModificationTime(char *filename)
Return the time of last modification of the file *filename*. The result is
encoded in the same way as the timestamp returned by the standard C library
function :cfunc:`time`.
.. cfunction:: void PyOS_AfterFork()
Function to update some internal state after a process fork; this should be
called in the new process if the Python interpreter will continue to be used.
If a new executable is loaded into the new process, this function does not need
to be called.
.. cfunction:: int PyOS_CheckStack()
Return true when the interpreter runs out of stack space. This is a reliable
check, but is only available when :const:`USE_STACKCHECK` is defined (currently
on Windows using the Microsoft Visual C++ compiler). :const:`USE_STACKCHECK`
will be defined automatically; you should never change the definition in your
own code.
.. cfunction:: PyOS_sighandler_t PyOS_getsig(int i)
Return the current signal handler for signal *i*. This is a thin wrapper around
either :cfunc:`sigaction` or :cfunc:`signal`. Do not call those functions
directly! :ctype:`PyOS_sighandler_t` is a typedef alias for :ctype:`void
(\*)(int)`.
.. cfunction:: PyOS_sighandler_t PyOS_setsig(int i, PyOS_sighandler_t h)
Set the signal handler for signal *i* to be *h*; return the old signal handler.
This is a thin wrapper around either :cfunc:`sigaction` or :cfunc:`signal`. Do
not call those functions directly! :ctype:`PyOS_sighandler_t` is a typedef
alias for :ctype:`void (\*)(int)`.
.. _systemfunctions:
System Functions
================
These are utility functions that make functionality from the :mod:`sys` module
accessible to C code. They all work with the current interpreter thread's
:mod:`sys` module's dict, which is contained in the internal thread state structure.
.. cfunction:: PyObject *PySys_GetObject(char *name)
Return the object *name* from the :mod:`sys` module or *NULL* if it does
not exist, without setting an exception.
.. cfunction:: FILE *PySys_GetFile(char *name, FILE *def)
Return the :ctype:`FILE*` associated with the object *name* in the
:mod:`sys` module, or *def* if *name* is not in the module or is not associated
with a :ctype:`FILE*`.
.. cfunction:: int PySys_SetObject(char *name, PyObject *v)
Set *name* in the :mod:`sys` module to *v* unless *v* is *NULL*, in which
case *name* is deleted from the sys module. Returns ``0`` on success, ``-1``
on error.
.. cfunction:: void PySys_ResetWarnOptions(void)
Reset :data:`sys.warnoptions` to an empty list.
.. cfunction:: void PySys_AddWarnOption(char *s)
Append *s* to :data:`sys.warnoptions`.
.. cfunction:: void PySys_SetPath(char *path)
Set :data:`sys.path` to a list object of paths found in *path* which should
be a list of paths separated with the platform's search path delimiter
(``:`` on Unix, ``;`` on Windows).
.. cfunction:: void PySys_WriteStdout(const char *format, ...)
Write the output string described by *format* to :data:`sys.stdout`. No
exceptions are raised, even if truncation occurs (see below).
*format* should limit the total size of the formatted output string to
1000 bytes or less -- after 1000 bytes, the output string is truncated.
In particular, this means that no unrestricted "%s" formats should occur;
these should be limited using "%.<N>s" where <N> is a decimal number
calculated so that <N> plus the maximum size of other formatted text does not
exceed 1000 bytes. Also watch out for "%f", which can print hundreds of
digits for very large numbers.
If a problem occurs, or :data:`sys.stdout` is unset, the formatted message
is written to the real (C level) *stdout*.
.. cfunction:: void PySys_WriteStderr(const char *format, ...)
As above, but write to :data:`sys.stderr` or *stderr* instead.
.. _processcontrol:
Process Control
===============
.. cfunction:: void Py_FatalError(const char *message)
.. index:: single: abort()
Print a fatal error message and kill the process. No cleanup is performed.
This function should only be invoked when a condition is detected that would
make it dangerous to continue using the Python interpreter; e.g., when the
object administration appears to be corrupted. On Unix, the standard C library
function :cfunc:`abort` is called which will attempt to produce a :file:`core`
file.
.. cfunction:: void Py_Exit(int status)
.. index::
single: Py_Finalize()
single: exit()
Exit the current process. This calls :cfunc:`Py_Finalize` and then calls the
standard C library function ``exit(status)``.
.. cfunction:: int Py_AtExit(void (*func) ())
.. index::
single: Py_Finalize()
single: cleanup functions
Register a cleanup function to be called by :cfunc:`Py_Finalize`. The cleanup
function will be called with no arguments and should return no value. At most
32 cleanup functions can be registered. When the registration is successful,
:cfunc:`Py_AtExit` returns ``0``; on failure, it returns ``-1``. The cleanup
function registered last is called first. Each cleanup function will be called
at most once. Since Python's internal finalization will have completed before
the cleanup function, no Python APIs should be called by *func*.


@@ -0,0 +1,124 @@
.. highlightlang:: c
.. _tupleobjects:
Tuple Objects
-------------
.. index:: object: tuple
.. ctype:: PyTupleObject
This subtype of :ctype:`PyObject` represents a Python tuple object.
.. cvar:: PyTypeObject PyTuple_Type
.. index:: single: TupleType (in module types)
This instance of :ctype:`PyTypeObject` represents the Python tuple type; it is
the same object as ``tuple`` and ``types.TupleType`` in the Python layer.
.. cfunction:: int PyTuple_Check(PyObject *p)
Return true if *p* is a tuple object or an instance of a subtype of the tuple
type.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyTuple_CheckExact(PyObject *p)
Return true if *p* is a tuple object, but not an instance of a subtype of the
tuple type.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyTuple_New(Py_ssize_t len)
Return a new tuple object of size *len*, or *NULL* on failure.
.. cfunction:: PyObject* PyTuple_Pack(Py_ssize_t n, ...)
Return a new tuple object of size *n*, or *NULL* on failure. The tuple values
are initialized to the subsequent *n* C arguments pointing to Python objects.
``PyTuple_Pack(2, a, b)`` is equivalent to ``Py_BuildValue("(OO)", a, b)``.
.. versionadded:: 2.4
.. cfunction:: Py_ssize_t PyTuple_Size(PyObject *p)
Take a pointer to a tuple object, and return the size of that tuple.
.. cfunction:: Py_ssize_t PyTuple_GET_SIZE(PyObject *p)
Return the size of the tuple *p*, which must be non-*NULL* and point to a tuple;
no error checking is performed.
.. cfunction:: PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos)
Return the object at position *pos* in the tuple pointed to by *p*. If *pos* is
out of bounds, return *NULL* and set an :exc:`IndexError` exception.
.. cfunction:: PyObject* PyTuple_GET_ITEM(PyObject *p, Py_ssize_t pos)
Like :cfunc:`PyTuple_GetItem`, but does no checking of its arguments.
.. cfunction:: PyObject* PyTuple_GetSlice(PyObject *p, Py_ssize_t low, Py_ssize_t high)
Take a slice of the tuple pointed to by *p* from *low* to *high* and return it
as a new tuple.
.. cfunction:: int PyTuple_SetItem(PyObject *p, Py_ssize_t pos, PyObject *o)
Insert a reference to object *o* at position *pos* of the tuple pointed to by
*p*. Return ``0`` on success.
.. note::
This function "steals" a reference to *o*.
.. cfunction:: void PyTuple_SET_ITEM(PyObject *p, Py_ssize_t pos, PyObject *o)
Like :cfunc:`PyTuple_SetItem`, but does no error checking, and should *only* be
used to fill in brand new tuples.
.. note::
This function "steals" a reference to *o*.
.. cfunction:: int _PyTuple_Resize(PyObject **p, Py_ssize_t newsize)
Can be used to resize a tuple. *newsize* will be the new length of the tuple.
Because tuples are *supposed* to be immutable, this should only be used if there
is only one reference to the object. Do *not* use this if the tuple may already
be known to some other part of the code. The tuple will always grow or shrink
at the end. Think of this as destroying the old tuple and creating a new one,
only more efficiently. Returns ``0`` on success. Client code should never
assume that the resulting value of ``*p`` will be the same as before calling
this function. If the object referenced by ``*p`` is replaced, the original
``*p`` is destroyed. On failure, returns ``-1`` and sets ``*p`` to *NULL*, and
raises :exc:`MemoryError` or :exc:`SystemError`.
.. versionchanged:: 2.2
Removed unused third parameter, *last_is_sticky*.
.. cfunction:: int PyTuple_ClearFreeList(void)
Clear the free list. Return the total number of freed items.
.. versionadded:: 2.6


@@ -0,0 +1,92 @@
.. highlightlang:: c
.. _typeobjects:
Type Objects
------------
.. index:: object: type
.. ctype:: PyTypeObject
The C structure of the objects used to describe built-in types.
.. cvar:: PyTypeObject PyType_Type
.. index:: single: TypeType (in module types)
This is the type object for type objects; it is the same object as ``type`` and
``types.TypeType`` in the Python layer.
.. cfunction:: int PyType_Check(PyObject *o)
Return true if the object *o* is a type object, including instances of types
derived from the standard type object. Return false in all other cases.
.. cfunction:: int PyType_CheckExact(PyObject *o)
Return true if the object *o* is a type object, but not a subtype of the
standard type object. Return false in all other cases.
.. versionadded:: 2.2
.. cfunction:: unsigned int PyType_ClearCache(void)
Clear the internal lookup cache. Return the current version tag.
.. versionadded:: 2.6
.. cfunction:: void PyType_Modified(PyTypeObject *type)
Invalidate the internal lookup cache for the type and all of its
subtypes. This function must be called after any manual
modification of the attributes or base classes of the type.
.. versionadded:: 2.6
.. cfunction:: int PyType_HasFeature(PyObject *o, int feature)
Return true if the type object *o* sets the feature *feature*. Type features
are denoted by single bit flags.
.. cfunction:: int PyType_IS_GC(PyObject *o)
Return true if the type object includes support for the cycle detector; this
tests the type flag :const:`Py_TPFLAGS_HAVE_GC`.
.. versionadded:: 2.0
.. cfunction:: int PyType_IsSubtype(PyTypeObject *a, PyTypeObject *b)
Return true if *a* is a subtype of *b*.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
.. versionadded:: 2.2
.. cfunction:: PyObject* PyType_GenericNew(PyTypeObject *type, PyObject *args, PyObject *kwds)
.. versionadded:: 2.2
.. cfunction:: int PyType_Ready(PyTypeObject *type)
Finalize a type object. This should be called on all type objects to finish
their initialization. This function is responsible for adding inherited slots
from a type's base class. Return ``0`` on success, or return ``-1`` and set an
exception on error.
.. versionadded:: 2.2



@@ -0,0 +1,811 @@
.. highlightlang:: c
.. _unicodeobjects:
Unicode Objects and Codecs
--------------------------
.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>
Unicode Objects
^^^^^^^^^^^^^^^
These are the basic Unicode object types used for the Unicode implementation in
Python:
.. % --- Unicode Type -------------------------------------------------------
.. ctype:: Py_UNICODE
This type represents the storage type which is used by Python internally as
the basis for holding Unicode ordinals. Python's default builds use a 16-bit type
for :ctype:`Py_UNICODE` and store Unicode values internally as UCS2. It is also
possible to build a UCS4 version of Python (most recent Linux distributions come
with UCS4 builds of Python). These builds then use a 32-bit type for
:ctype:`Py_UNICODE` and store Unicode data internally as UCS4. On platforms
where :ctype:`wchar_t` is available and compatible with the chosen Python
Unicode build variant, :ctype:`Py_UNICODE` is a typedef alias for
:ctype:`wchar_t` to enhance native platform compatibility. On all other
platforms, :ctype:`Py_UNICODE` is a typedef alias for either :ctype:`unsigned
short` (UCS2) or :ctype:`unsigned long` (UCS4).
Note that UCS2 and UCS4 Python builds are not binary compatible. Please keep
this in mind when writing extensions or interfaces.
.. ctype:: PyUnicodeObject
This subtype of :ctype:`PyObject` represents a Python Unicode object.
.. cvar:: PyTypeObject PyUnicode_Type
This instance of :ctype:`PyTypeObject` represents the Python Unicode type. It
is exposed to Python code as ``unicode`` and ``types.UnicodeType``.
The following APIs are really C macros and can be used to do fast checks and to
access internal read-only data of Unicode objects:
.. cfunction:: int PyUnicode_Check(PyObject *o)
Return true if the object *o* is a Unicode object or an instance of a Unicode
subtype.
.. versionchanged:: 2.2
Allowed subtypes to be accepted.
.. cfunction:: int PyUnicode_CheckExact(PyObject *o)
Return true if the object *o* is a Unicode object, but not an instance of a
subtype.
.. versionadded:: 2.2
.. cfunction:: Py_ssize_t PyUnicode_GET_SIZE(PyObject *o)
Return the size of the object. *o* has to be a :ctype:`PyUnicodeObject` (not
checked).
.. cfunction:: Py_ssize_t PyUnicode_GET_DATA_SIZE(PyObject *o)
Return the size of the object's internal buffer in bytes. *o* has to be a
:ctype:`PyUnicodeObject` (not checked).
.. cfunction:: Py_UNICODE* PyUnicode_AS_UNICODE(PyObject *o)
Return a pointer to the internal :ctype:`Py_UNICODE` buffer of the object. *o*
has to be a :ctype:`PyUnicodeObject` (not checked).
.. cfunction:: const char* PyUnicode_AS_DATA(PyObject *o)
Return a pointer to the internal buffer of the object. *o* has to be a
:ctype:`PyUnicodeObject` (not checked).
.. cfunction:: int PyUnicode_ClearFreeList(void)
Clear the free list. Return the total number of freed items.
.. versionadded:: 2.6
Unicode provides many different character properties. The most often needed ones
are available through these macros which are mapped to C functions depending on
the Python configuration.
.. % --- Unicode character properties ---------------------------------------
.. cfunction:: int Py_UNICODE_ISSPACE(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is a whitespace character.
.. cfunction:: int Py_UNICODE_ISLOWER(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is a lowercase character.
.. cfunction:: int Py_UNICODE_ISUPPER(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is an uppercase character.
.. cfunction:: int Py_UNICODE_ISTITLE(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is a titlecase character.
.. cfunction:: int Py_UNICODE_ISLINEBREAK(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is a linebreak character.
.. cfunction:: int Py_UNICODE_ISDECIMAL(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is a decimal character.
.. cfunction:: int Py_UNICODE_ISDIGIT(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is a digit character.
.. cfunction:: int Py_UNICODE_ISNUMERIC(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is a numeric character.
.. cfunction:: int Py_UNICODE_ISALPHA(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is an alphabetic character.
.. cfunction:: int Py_UNICODE_ISALNUM(Py_UNICODE ch)
Return 1 or 0 depending on whether *ch* is an alphanumeric character.
These APIs can be used for fast direct character conversions:
.. cfunction:: Py_UNICODE Py_UNICODE_TOLOWER(Py_UNICODE ch)
Return the character *ch* converted to lower case.
.. cfunction:: Py_UNICODE Py_UNICODE_TOUPPER(Py_UNICODE ch)
Return the character *ch* converted to upper case.
.. cfunction:: Py_UNICODE Py_UNICODE_TOTITLE(Py_UNICODE ch)
Return the character *ch* converted to title case.
.. cfunction:: int Py_UNICODE_TODECIMAL(Py_UNICODE ch)
Return the character *ch* converted to a decimal positive integer. Return
``-1`` if this is not possible. This macro does not raise exceptions.
.. cfunction:: int Py_UNICODE_TODIGIT(Py_UNICODE ch)
Return the character *ch* converted to a single digit integer. Return ``-1`` if
this is not possible. This macro does not raise exceptions.
.. cfunction:: double Py_UNICODE_TONUMERIC(Py_UNICODE ch)
Return the character *ch* converted to a double. Return ``-1.0`` if this is not
possible. This macro does not raise exceptions.
To create Unicode objects and access their basic sequence properties, use these
APIs:
.. % --- Plain Py_UNICODE ---------------------------------------------------
.. cfunction:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size)
Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u*
may be *NULL* which causes the contents to be undefined. It is the user's
responsibility to fill in the needed data. The buffer is copied into the new
object. If the buffer is not *NULL*, the return value might be a shared object.
Therefore, modification of the resulting Unicode object is only allowed when *u*
is *NULL*.
.. cfunction:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
Return a read-only pointer to the Unicode object's internal :ctype:`Py_UNICODE`
buffer, or *NULL* if *unicode* is not a Unicode object.
.. cfunction:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
Return the length of the Unicode object.
.. cfunction:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, const char *encoding, const char *errors)
Coerce an encoded object *obj* to a Unicode object and return a reference with
incremented refcount.
String and other char buffer compatible objects are decoded according to the
given encoding and using the error handling defined by errors. Both can be
*NULL* to have the interface use the default values (see the next section for
details).
All other objects, including Unicode objects, cause a :exc:`TypeError` to be
set.
The API returns *NULL* if there was an error. The caller is responsible for
decref'ing the returned objects.
.. cfunction:: PyObject* PyUnicode_FromObject(PyObject *obj)
Shortcut for ``PyUnicode_FromEncodedObject(obj, NULL, "strict")`` which is used
throughout the interpreter whenever coercion to Unicode is needed.
If the platform supports :ctype:`wchar_t` and provides a header file wchar.h,
Python can interface directly to this type using the following functions.
Support is optimized if Python's own :ctype:`Py_UNICODE` type is identical to
the system's :ctype:`wchar_t`.
.. % --- wchar_t support for platforms which support it ---------------------
.. cfunction:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
Create a Unicode object from the :ctype:`wchar_t` buffer *w* of the given size.
Return *NULL* on failure.
.. cfunction:: Py_ssize_t PyUnicode_AsWideChar(PyUnicodeObject *unicode, wchar_t *w, Py_ssize_t size)
Copy the Unicode object contents into the :ctype:`wchar_t` buffer *w*. At most
*size* :ctype:`wchar_t` characters are copied (excluding a possibly trailing
0-termination character). Return the number of :ctype:`wchar_t` characters
copied or -1 in case of an error. Note that the resulting :ctype:`wchar_t`
string may or may not be 0-terminated. It is the responsibility of the caller
to make sure that the :ctype:`wchar_t` string is 0-terminated in case this is
required by the application.
.. _builtincodecs:
Built-in Codecs
^^^^^^^^^^^^^^^
Python provides a set of builtin codecs which are written in C for speed. All of
these codecs are directly usable via the following functions.
Many of the following APIs take two arguments, *encoding* and *errors*. These
parameters have the same semantics as those of the builtin :func:`unicode`
Unicode object constructor.
Setting encoding to *NULL* causes the default encoding to be used which is
ASCII. The file system calls should use :cdata:`Py_FileSystemDefaultEncoding`
as the encoding for file names. This variable should be treated as read-only: On
some systems, it will be a pointer to a static string, on others, it will change
at run-time (such as when the application invokes setlocale).
Error handling is set by errors which may also be set to *NULL* meaning to use
the default handling defined for the codec. Default error handling for all
builtin codecs is "strict" (:exc:`ValueError` is raised).
The codecs all use a similar interface. For simplicity, only deviations from the
following generic ones are documented.
These are the generic codec APIs:
.. % --- Generic Codecs -----------------------------------------------------
.. cfunction:: PyObject* PyUnicode_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)
Create a Unicode object by decoding *size* bytes of the encoded string *s*.
*encoding* and *errors* have the same meaning as the parameters of the same name
in the :func:`unicode` builtin function. The codec to be used is looked up
using the Python codec registry. Return *NULL* if an exception was raised by
the codec.
.. cfunction:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
Encode the :ctype:`Py_UNICODE` buffer of the given size and return a Python
string object. *encoding* and *errors* have the same meaning as the parameters
of the same name in the Unicode :meth:`encode` method. The codec to be used is
looked up using the Python codec registry. Return *NULL* if an exception was
raised by the codec.
.. cfunction:: PyObject* PyUnicode_AsEncodedString(PyObject *unicode, const char *encoding, const char *errors)
Encode a Unicode object and return the result as Python string object.
*encoding* and *errors* have the same meaning as the parameters of the same name
in the Unicode :meth:`encode` method. The codec to be used is looked up using
the Python codec registry. Return *NULL* if an exception was raised by the
codec.
These are the UTF-8 codec APIs:
.. % --- UTF-8 Codecs -------------------------------------------------------
.. cfunction:: PyObject* PyUnicode_DecodeUTF8(const char *s, Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the UTF-8 encoded string
*s*. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_DecodeUTF8Stateful(const char *s, Py_ssize_t size, const char *errors, Py_ssize_t *consumed)
If *consumed* is *NULL*, behave like :cfunc:`PyUnicode_DecodeUTF8`. If
*consumed* is not *NULL*, trailing incomplete UTF-8 byte sequences will not be
treated as an error. Those bytes will not be decoded and the number of bytes
that have been decoded will be stored in *consumed*.
.. versionadded:: 2.4
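The *consumed* bookkeeping can be pictured with a small plain-C model. This is only a sketch and not CPython's decoder: ``utf8_complete_prefix`` is a hypothetical helper that inspects lead bytes only and reports how many bytes belong to complete sequences, which is the role *consumed* plays above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Python C API: return how many
   leading bytes of s[0..size) form complete UTF-8 sequences.  Only the
   lead byte is inspected; continuation bytes are not validated. */
static size_t utf8_complete_prefix(const unsigned char *s, size_t size)
{
    size_t i = 0;

    assert(s != NULL || size == 0);
    while (i < size) {
        size_t need;

        if (s[i] < 0x80)
            need = 1;                    /* ASCII */
        else if ((s[i] & 0xE0) == 0xC0)
            need = 2;                    /* 110xxxxx */
        else if ((s[i] & 0xF0) == 0xE0)
            need = 3;                    /* 1110xxxx */
        else if ((s[i] & 0xF8) == 0xF0)
            need = 4;                    /* 11110xxx */
        else
            need = 1;                    /* invalid lead byte: left to error handling */
        if (size - i < need)
            break;                       /* incomplete trailing sequence */
        i += need;
    }
    return i;
}
```

A caller reading from a stream would keep the bytes after the returned offset and prepend them to the next chunk.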
.. cfunction:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :ctype:`Py_UNICODE` buffer of the given size using UTF-8 and return a
Python string object. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_AsUTF8String(PyObject *unicode)
Encode a Unicode object using UTF-8 and return the result as Python string
object. Error handling is "strict". Return *NULL* if an exception was raised
by the codec.
These are the UTF-32 codec APIs:
.. % --- UTF-32 Codecs ------------------------------------------------------ */
.. cfunction:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
Decode *size* bytes from a UTF-32 encoded buffer string and return the
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
handling. It defaults to "strict".
If *byteorder* is non-*NULL*, the decoder starts decoding using the given byte
order::
*byteorder == -1: little endian
*byteorder == 0: native order
*byteorder == 1: big endian
and then switches if the first four bytes of the input data are a byte order mark
(BOM) and the specified byte order is native order. This BOM is not copied into
the resulting Unicode string. After completion, *\*byteorder* is set to the
current byte order at the end of input data.
In a narrow build codepoints outside the BMP will be decoded as surrogate pairs.
If *byteorder* is *NULL*, the codec starts in native order mode.
Return *NULL* if an exception was raised by the codec.
.. versionadded:: 2.6
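As an illustration of the byte order mark check described for :cfunc:`PyUnicode_DecodeUTF32`, the following plain-C sketch detects a UTF-32 BOM and reads one code unit in an explicit byte order. The helper names ``utf32_bom_order`` and ``utf32_read`` are hypothetical and the code is a model of the documented behaviour, not the interpreter's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: inspect the first four bytes of a UTF-32 buffer.
   Returns -1 for a little-endian BOM, 1 for a big-endian BOM, and 0 if
   no BOM is present (the caller keeps its starting byte order). */
static int utf32_bom_order(const unsigned char *s, size_t size)
{
    assert(s != NULL || size == 0);
    if (size < 4)
        return 0;
    if (s[0] == 0xFF && s[1] == 0xFE && s[2] == 0x00 && s[3] == 0x00)
        return -1;                       /* FF FE 00 00 */
    if (s[0] == 0x00 && s[1] == 0x00 && s[2] == 0xFE && s[3] == 0xFF)
        return 1;                        /* 00 00 FE FF */
    return 0;
}

/* Hypothetical helper: read one 32-bit code unit using an explicit
   byte order (-1 little endian, 1 big endian). */
static unsigned long utf32_read(const unsigned char *p, int order)
{
    assert(order == -1 || order == 1);
    if (order == -1)
        return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
               ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8) | (unsigned long)p[3];
}
```

The return values deliberately use the same -1/0/1 convention as the *byteorder* argument.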
.. cfunction:: PyObject* PyUnicode_DecodeUTF32Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed)
If *consumed* is *NULL*, behave like :cfunc:`PyUnicode_DecodeUTF32`. If
*consumed* is not *NULL*, :cfunc:`PyUnicode_DecodeUTF32Stateful` will not treat
trailing incomplete UTF-32 byte sequences (such as a number of bytes not divisible
by four) as an error. Those bytes will not be decoded and the number of bytes
that have been decoded will be stored in *consumed*.
.. versionadded:: 2.6
.. cfunction:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
Return a Python bytes object holding the UTF-32 encoded value of the Unicode
data in *s*. If *byteorder* is not ``0``, output is written according to the
following byte order::
byteorder == -1: little endian
byteorder == 0: native byte order (writes a BOM mark)
byteorder == 1: big endian
If byteorder is ``0``, the output string will always start with the Unicode BOM
mark (U+FEFF). In the other two modes, no BOM mark is prepended.
If *Py_UNICODE_WIDE* is not defined, surrogate pairs will be output
as a single codepoint.
Return *NULL* if an exception was raised by the codec.
.. versionadded:: 2.6
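The surrogate handling mentioned above follows the standard UTF-16 arithmetic. A minimal sketch, assuming a hypothetical helper ``combine_surrogates`` that is not part of the C API, shows how a pair becomes the single code point written to the UTF-32 output:

```c
#include <assert.h>

/* Hypothetical helper: combine a UTF-16 surrogate pair into the single
   code point that a UTF-32 encoder would write. */
static unsigned long combine_surrogates(unsigned int high, unsigned int low)
{
    assert(high >= 0xD800 && high <= 0xDBFF);   /* high (lead) surrogate */
    assert(low >= 0xDC00 && low <= 0xDFFF);     /* low (trail) surrogate */
    return 0x10000UL + (((unsigned long)(high - 0xD800)) << 10)
                     + (unsigned long)(low - 0xDC00);
}
```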
.. cfunction:: PyObject* PyUnicode_AsUTF32String(PyObject *unicode)
Return a Python string using the UTF-32 encoding in native byte order. The
string always starts with a BOM mark. Error handling is "strict". Return
*NULL* if an exception was raised by the codec.
.. versionadded:: 2.6
These are the UTF-16 codec APIs:
.. % --- UTF-16 Codecs ------------------------------------------------------ */
.. cfunction:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
Decode *size* bytes from a UTF-16 encoded buffer string and return the
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
handling. It defaults to "strict".
If *byteorder* is non-*NULL*, the decoder starts decoding using the given byte
order::
*byteorder == -1: little endian
*byteorder == 0: native order
*byteorder == 1: big endian
and then switches if the first two bytes of the input data are a byte order mark
(BOM) and the specified byte order is native order. This BOM is not copied into
the resulting Unicode string. After completion, *\*byteorder* is set to the
current byte order at the end of input data.
If *byteorder* is *NULL*, the codec starts in native order mode.
Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_DecodeUTF16Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed)
If *consumed* is *NULL*, behave like :cfunc:`PyUnicode_DecodeUTF16`. If
*consumed* is not *NULL*, :cfunc:`PyUnicode_DecodeUTF16Stateful` will not treat
trailing incomplete UTF-16 byte sequences (such as an odd number of bytes or a
split surrogate pair) as an error. Those bytes will not be decoded and the
number of bytes that have been decoded will be stored in *consumed*.
.. versionadded:: 2.4
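The two incomplete cases named above (an odd byte count and a split surrogate pair) can be modelled in plain C. This is a sketch under the assumption of little-endian input; ``utf16le_complete_prefix`` is a hypothetical helper, not the interpreter's decoder, and it does not validate that a high surrogate is followed by a low one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of leading bytes of a little-endian
   UTF-16 buffer that form complete code points.  A lone trailing byte
   or a high surrogate without its partner is left unconsumed. */
static size_t utf16le_complete_prefix(const unsigned char *s, size_t size)
{
    size_t i = 0;

    assert(s != NULL || size == 0);
    while (size - i >= 2) {
        unsigned int unit = s[i] | ((unsigned int)s[i + 1] << 8);

        if (unit >= 0xD800 && unit <= 0xDBFF) {
            if (size - i < 4)
                break;                   /* split surrogate pair */
            i += 4;
        }
        else {
            i += 2;
        }
    }
    return i;
}
```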
.. cfunction:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
Return a Python string object holding the UTF-16 encoded value of the Unicode
data in *s*. If *byteorder* is not ``0``, output is written according to the
following byte order::
byteorder == -1: little endian
byteorder == 0: native byte order (writes a BOM mark)
byteorder == 1: big endian
If byteorder is ``0``, the output string will always start with the Unicode BOM
mark (U+FEFF). In the other two modes, no BOM mark is prepended.
If *Py_UNICODE_WIDE* is defined, a single :ctype:`Py_UNICODE` value may get
represented as a surrogate pair. If it is not defined, each :ctype:`Py_UNICODE`
value is interpreted as a UCS-2 character.
Return *NULL* if an exception was raised by the codec.
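To make the explicit byte orders and the surrogate-pair case concrete, here is a plain-C sketch that writes one code point as UTF-16. ``utf16_put`` is a hypothetical helper; it covers only the two explicit modes (``-1`` and ``1``), so it never writes a BOM, and it is a model rather than the codec's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write code point cp as UTF-16 into out using an
   explicit byte order (-1 little endian, 1 big endian).  No BOM is
   written.  Returns the number of bytes written (2 or 4); out must
   have room for 4 bytes. */
static size_t utf16_put(unsigned long cp, int order, unsigned char *out)
{
    unsigned int units[2] = { 0, 0 };
    size_t n = 1, i;

    assert(order == -1 || order == 1);
    assert(cp <= 0x10FFFFUL);
    if (cp >= 0x10000UL) {
        cp -= 0x10000UL;
        units[0] = 0xD800 + (unsigned int)(cp >> 10);    /* high surrogate */
        units[1] = 0xDC00 + (unsigned int)(cp & 0x3FF);  /* low surrogate */
        n = 2;
    }
    else {
        units[0] = (unsigned int)cp;
    }
    for (i = 0; i < n; i++) {
        unsigned char lo = (unsigned char)(units[i] & 0xFF);
        unsigned char hi = (unsigned char)(units[i] >> 8);

        out[2 * i] = (order == -1) ? lo : hi;
        out[2 * i + 1] = (order == -1) ? hi : lo;
    }
    return 2 * n;
}
```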
.. cfunction:: PyObject* PyUnicode_AsUTF16String(PyObject *unicode)
Return a Python string using the UTF-16 encoding in native byte order. The
string always starts with a BOM mark. Error handling is "strict". Return
*NULL* if an exception was raised by the codec.
These are the "Unicode Escape" codec APIs:
.. % --- Unicode-Escape Codecs ----------------------------------------------
.. cfunction:: PyObject* PyUnicode_DecodeUnicodeEscape(const char *s, Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the Unicode-Escape encoded
string *s*. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
Encode the :ctype:`Py_UNICODE` buffer of the given size using Unicode-Escape and
return a Python string object. Return *NULL* if an exception was raised by the
codec.
.. cfunction:: PyObject* PyUnicode_AsUnicodeEscapeString(PyObject *unicode)
Encode a Unicode object using Unicode-Escape and return the result as Python
string object. Error handling is "strict". Return *NULL* if an exception was
raised by the codec.
These are the "Raw Unicode Escape" codec APIs:
.. % --- Raw-Unicode-Escape Codecs ------------------------------------------
.. cfunction:: PyObject* PyUnicode_DecodeRawUnicodeEscape(const char *s, Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the Raw-Unicode-Escape
encoded string *s*. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :ctype:`Py_UNICODE` buffer of the given size using Raw-Unicode-Escape
and return a Python string object. Return *NULL* if an exception was raised by
the codec.
.. cfunction:: PyObject* PyUnicode_AsRawUnicodeEscapeString(PyObject *unicode)
Encode a Unicode object using Raw-Unicode-Escape and return the result as
Python string object. Error handling is "strict". Return *NULL* if an exception
was raised by the codec.
These are the Latin-1 codec APIs. Latin-1 corresponds to the first 256 Unicode
ordinals and only these are accepted by the codecs during encoding.
.. % --- Latin-1 Codecs -----------------------------------------------------
.. cfunction:: PyObject* PyUnicode_DecodeLatin1(const char *s, Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the Latin-1 encoded string
*s*. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :ctype:`Py_UNICODE` buffer of the given size using Latin-1 and return
a Python string object. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_AsLatin1String(PyObject *unicode)
Encode a Unicode object using Latin-1 and return the result as Python string
object. Error handling is "strict". Return *NULL* if an exception was raised
by the codec.
These are the ASCII codec APIs. Only 7-bit ASCII data is accepted. All other
codes generate errors.
.. % --- ASCII Codecs -------------------------------------------------------
.. cfunction:: PyObject* PyUnicode_DecodeASCII(const char *s, Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the ASCII encoded string
*s*. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :ctype:`Py_UNICODE` buffer of the given size using ASCII and return a
Python string object. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_AsASCIIString(PyObject *unicode)
Encode a Unicode object using ASCII and return the result as Python string
object. Error handling is "strict". Return *NULL* if an exception was raised
by the codec.
These are the mapping codec APIs:
.. % --- Character Map Codecs -----------------------------------------------
This codec is special in that it can be used to implement many different codecs
(and this is in fact what was done to obtain most of the standard codecs
included in the :mod:`encodings` package). The codec uses mappings to encode and
decode characters.
Decoding mappings must map single string characters to single Unicode
characters, integers (which are then interpreted as Unicode ordinals) or None
(meaning "undefined mapping" and causing an error).
Encoding mappings must map single Unicode characters to single string
characters, integers (which are then interpreted as Latin-1 ordinals) or None
(meaning "undefined mapping" and causing an error).
The mapping objects provided must only support the __getitem__ mapping
interface.
If a character lookup fails with a LookupError, the character is copied as-is,
meaning that its ordinal value will be interpreted as a Unicode or Latin-1
ordinal, respectively. Because of this, mappings only need to contain those
entries which map characters to different code points.
.. cfunction:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
Create a Unicode object by decoding *size* bytes of the encoded string *s* using
the given *mapping* object. Return *NULL* if an exception was raised by the
codec. If *mapping* is *NULL*, Latin-1 decoding will be done. Otherwise it can be
a dictionary mapping bytes, or a unicode string, which is treated as a lookup
table. Byte values beyond the end of the string and U+FFFE "characters" are
treated as "undefined mapping".
.. versionchanged:: 2.4
Allowed unicode string as mapping argument.
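The lookup-table form of the mapping can be modelled in a few lines of plain C. The sketch below is an assumption-laden illustration, not the codec itself: ``charmap_lookup`` and ``UNDEFINED_MAPPING`` are hypothetical names, and the table is an array of code points standing in for the unicode string.

```c
#include <assert.h>
#include <stddef.h>

#define UNDEFINED_MAPPING (-1L)   /* hypothetical marker for "undefined mapping" */

/* Hypothetical helper modelling the lookup-table form of a decoding
   mapping: table[b] is the code point for byte b.  Bytes beyond the end
   of the table, and entries equal to U+FFFE, are undefined. */
static long charmap_lookup(const unsigned int *table, size_t table_len,
                           unsigned char byte)
{
    assert(table != NULL || table_len == 0);
    if (byte >= table_len)
        return UNDEFINED_MAPPING;
    if (table[byte] == 0xFFFE)
        return UNDEFINED_MAPPING;
    return (long)table[byte];
}
```

In the real codec an undefined mapping is handed to the error handler selected by *errors*; the model only reports it.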
.. cfunction:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
Encode the :ctype:`Py_UNICODE` buffer of the given size using the given
*mapping* object and return a Python string object. Return *NULL* if an
exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_AsCharmapString(PyObject *unicode, PyObject *mapping)
Encode a Unicode object using the given *mapping* object and return the result
as Python string object. Error handling is "strict". Return *NULL* if an
exception was raised by the codec.
The following codec API is special in that it maps Unicode to Unicode.
.. cfunction:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
Translate a :ctype:`Py_UNICODE` buffer of the given length by applying a
character mapping *table* to it and return the resulting Unicode object. Return
*NULL* when an exception was raised by the codec.
The mapping *table* must map Unicode ordinal integers to Unicode ordinal
integers or None (causing deletion of the character).
Mapping tables need only provide the :meth:`__getitem__` interface; dictionaries
and sequences work well. Unmapped character ordinals (ones which cause a
:exc:`LookupError`) are left untouched and are copied as-is.
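The three outcomes of a table lookup (replace, delete, copy as-is) can be sketched in plain C. Everything here is a hypothetical model: ``translate``, ``struct map_entry`` and ``DELETE_CHAR`` (standing in for a ``None`` entry) are not part of the C API, and a missing entry plays the role of a :exc:`LookupError`.

```c
#include <assert.h>
#include <stddef.h>

#define DELETE_CHAR (-1L)   /* stands in for a None entry in the table */

struct map_entry { unsigned long from; long to; };

/* Hypothetical model of table-driven translation: ordinals found in the
   table are replaced (or dropped for DELETE_CHAR); ordinals with no
   entry are copied unchanged.  Returns the output length; out must have
   room for len entries. */
static size_t translate(const unsigned long *in, size_t len,
                        const struct map_entry *table, size_t table_len,
                        unsigned long *out)
{
    size_t i, j, n = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        long mapped = (long)in[i];       /* default: copy as-is */

        for (j = 0; j < table_len; j++) {
            if (table[j].from == in[i]) {
                mapped = table[j].to;
                break;
            }
        }
        if (mapped != DELETE_CHAR)
            out[n++] = (unsigned long)mapped;
    }
    return n;
}
```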
These are the MBCS codec APIs. They are currently only available on Windows and
use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
DBCS) is a class of encodings, not just one. The target encoding is defined by
the user settings on the machine running the codec.
.. % --- MBCS codecs for Windows --------------------------------------------
.. cfunction:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the MBCS encoded string *s*.
Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_DecodeMBCSStateful(const char *s, int size, const char *errors, int *consumed)
If *consumed* is *NULL*, behave like :cfunc:`PyUnicode_DecodeMBCS`. If
*consumed* is not *NULL*, :cfunc:`PyUnicode_DecodeMBCSStateful` will not decode
a trailing lead byte and the number of bytes that have been decoded will be stored
in *consumed*.
.. versionadded:: 2.5
.. cfunction:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :ctype:`Py_UNICODE` buffer of the given size using MBCS and return a
Python string object. Return *NULL* if an exception was raised by the codec.
.. cfunction:: PyObject* PyUnicode_AsMBCSString(PyObject *unicode)
Encode a Unicode object using MBCS and return the result as Python string
object. Error handling is "strict". Return *NULL* if an exception was raised
by the codec.
.. % --- Methods & Slots ----------------------------------------------------
.. _unicodemethodsandslots:
Methods and Slot Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^
The following APIs are capable of handling Unicode objects and strings on input
(we refer to them as strings in the descriptions) and return Unicode objects or
integers as appropriate.
They all return *NULL* or ``-1`` if an exception occurs.
.. cfunction:: PyObject* PyUnicode_Concat(PyObject *left, PyObject *right)
Concatenate two strings, giving a new Unicode string.
.. cfunction:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
Split a string giving a list of Unicode strings. If sep is *NULL*, splitting
will be done at all whitespace substrings. Otherwise, splits occur at the given
separator. At most *maxsplit* splits will be done. If negative, no limit is
set. Separators are not included in the resulting list.
.. cfunction:: PyObject* PyUnicode_Splitlines(PyObject *s, int keepend)
Split a Unicode string at line breaks, returning a list of Unicode strings.
CRLF is considered to be one line break. If *keepend* is 0, the line break
characters are not included in the resulting strings.
.. cfunction:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)
Translate a string by applying a character mapping table to it and return the
resulting Unicode object.
The mapping table must map Unicode ordinal integers to Unicode ordinal integers
or None (causing deletion of the character).
Mapping tables need only provide the :meth:`__getitem__` interface; dictionaries
and sequences work well. Unmapped character ordinals (ones which cause a
:exc:`LookupError`) are left untouched and are copied as-is.
*errors* has the usual meaning for codecs. It may be *NULL* which indicates to
use the default error handling.
.. cfunction:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
Join a sequence of strings using the given separator and return the resulting
Unicode string.
.. cfunction:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
Return 1 if *substr* matches *str*[*start*:*end*] at the given tail end
(*direction* == -1 means to do a prefix match, *direction* == 1 a suffix match),
0 otherwise. Return ``-1`` if an error occurred.
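The meaning of *direction* is easiest to see on ordinary C strings. The following sketch is a model only: ``tailmatch`` is a hypothetical helper working on ``char`` data, it assumes ``0 <= start <= end <= strlen(str)``, and it has no error return because nothing in it can fail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model on plain C strings: return 1 if substr matches the
   slice str[start:end] at its head (direction < 0, prefix match) or at
   its tail (otherwise, suffix match), and 0 if it does not. */
static int tailmatch(const char *str, const char *substr,
                     size_t start, size_t end, int direction)
{
    size_t sublen = strlen(substr);

    assert(start <= end && end <= strlen(str));
    if (sublen > end - start)
        return 0;                        /* slice too short to match */
    if (direction < 0)
        return memcmp(str + start, substr, sublen) == 0;
    return memcmp(str + end - sublen, substr, sublen) == 0;
}
```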
.. cfunction:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
Return the first position of *substr* in *str*[*start*:*end*] using the given
*direction* (*direction* == 1 means to do a forward search, *direction* == -1 a
backward search). The return value is the index of the first match; a value of
``-1`` indicates that no match was found, and ``-2`` indicates that an error
occurred and an exception has been set.
.. cfunction:: Py_ssize_t PyUnicode_Count(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end)
Return the number of non-overlapping occurrences of *substr* in
``str[start:end]``. Return ``-1`` if an error occurred.
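"Non-overlapping" means the search resumes after the end of each match. A minimal plain-C sketch of that rule, using a hypothetical helper ``count_nonoverlapping`` on ``char`` strings and assuming a non-empty *substr*:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: count non-overlapping occurrences of substr in
   str.  After each match the search resumes past the matched text. */
static size_t count_nonoverlapping(const char *str, const char *substr)
{
    size_t sublen = strlen(substr), count = 0;
    const char *p = str;

    assert(sublen > 0);                  /* the empty pattern is not modelled */
    while ((p = strstr(p, substr)) != NULL) {
        count++;
        p += sublen;                     /* skip the match: no overlap */
    }
    return count;
}
```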
.. cfunction:: PyObject* PyUnicode_Replace(PyObject *str, PyObject *substr, PyObject *replstr, Py_ssize_t maxcount)
Replace at most *maxcount* occurrences of *substr* in *str* with *replstr* and
return the resulting Unicode object. *maxcount* == -1 means replace all
occurrences.
.. cfunction:: int PyUnicode_Compare(PyObject *left, PyObject *right)
Compare two strings and return -1, 0, 1 for less than, equal, and greater than,
respectively.
.. cfunction:: PyObject* PyUnicode_RichCompare(PyObject *left, PyObject *right, int op)
Rich compare two unicode strings and return one of the following:
* ``NULL`` in case an exception was raised
* :const:`Py_True` or :const:`Py_False` for successful comparisons
* :const:`Py_NotImplemented` in case the type combination is unknown
Note that :const:`Py_EQ` and :const:`Py_NE` comparisons can cause a
:exc:`UnicodeWarning` in case the conversion of the arguments to Unicode fails
with a :exc:`UnicodeDecodeError`.
Possible values for *op* are :const:`Py_GT`, :const:`Py_GE`, :const:`Py_EQ`,
:const:`Py_NE`, :const:`Py_LT`, and :const:`Py_LE`.
.. cfunction:: PyObject* PyUnicode_Format(PyObject *format, PyObject *args)
Return a new string object from *format* and *args*; this is analogous to
``format % args``. The *args* argument must be a tuple.
.. cfunction:: int PyUnicode_Contains(PyObject *container, PyObject *element)
Check whether *element* is contained in *container* and return true or false
accordingly.
*element* has to coerce to a one-element Unicode string. ``-1`` is returned if
there was an error.

@@ -0,0 +1,21 @@
.. highlightlang:: c
.. _utilities:
*********
Utilities
*********
The functions in this chapter perform various utility tasks, ranging from
helping C code be more portable across platforms, to using Python modules from C,
to parsing function arguments and constructing Python values from C values.
.. toctree::
sys.rst
import.rst
marshal.rst
arg.rst
conversion.rst
reflection.rst

@@ -0,0 +1,323 @@
.. highlightlang:: c
.. _veryhigh:
*************************
The Very High Level Layer
*************************
The functions in this chapter will let you execute Python source code given in a
file or a buffer, but they will not let you interact in a more detailed way with
the interpreter.
Several of these functions accept a start symbol from the grammar as a
parameter. The available start symbols are :const:`Py_eval_input`,
:const:`Py_file_input`, and :const:`Py_single_input`. These are described
following the functions which accept them as parameters.
Note also that several of these functions take :ctype:`FILE\*` parameters. One
particular issue which needs to be handled carefully is that the :ctype:`FILE`
structure for different C libraries can be different and incompatible. Under
Windows (at least), it is possible for dynamically linked extensions to actually
use different libraries, so care should be taken that :ctype:`FILE\*` parameters
are only passed to these functions if it is certain that they were created by
the same library that the Python runtime is using.
.. cfunction:: int Py_Main(int argc, char **argv)
The main program for the standard interpreter. This is made available for
programs which embed Python. The *argc* and *argv* parameters should be
prepared exactly as those which are passed to a C program's :cfunc:`main`
function. It is important to note that the argument list may be modified (but
the contents of the strings pointed to by the argument list are not). The return
value will be the integer passed to the :func:`sys.exit` function, ``1`` if the
interpreter exits due to an exception, or ``2`` if the parameter list does not
represent a valid Python command line.
Note that if an otherwise unhandled :exc:`SystemExit` is raised, this
function will not return ``1``, but exit the process, as long as
``Py_InspectFlag`` is not set.
.. cfunction:: int PyRun_AnyFile(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_AnyFileExFlags` below, leaving
*closeit* set to ``0`` and *flags* set to *NULL*.
.. cfunction:: int PyRun_AnyFileFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
This is a simplified interface to :cfunc:`PyRun_AnyFileExFlags` below, leaving
the *closeit* argument set to ``0``.
.. cfunction:: int PyRun_AnyFileEx(FILE *fp, const char *filename, int closeit)
This is a simplified interface to :cfunc:`PyRun_AnyFileExFlags` below, leaving
the *flags* argument set to *NULL*.
.. cfunction:: int PyRun_AnyFileExFlags(FILE *fp, const char *filename, int closeit, PyCompilerFlags *flags)
If *fp* refers to a file associated with an interactive device (console or
terminal input or Unix pseudo-terminal), return the value of
:cfunc:`PyRun_InteractiveLoop`, otherwise return the result of
:cfunc:`PyRun_SimpleFile`. If *filename* is *NULL*, this function uses
``"???"`` as the filename.
.. cfunction:: int PyRun_SimpleString(const char *command)
This is a simplified interface to :cfunc:`PyRun_SimpleStringFlags` below,
leaving the *PyCompilerFlags\** argument set to *NULL*.
.. cfunction:: int PyRun_SimpleStringFlags(const char *command, PyCompilerFlags *flags)
Executes the Python source code from *command* in the :mod:`__main__` module
according to the *flags* argument. If :mod:`__main__` does not already exist, it
is created. Returns ``0`` on success or ``-1`` if an exception was raised. If
there was an error, there is no way to get the exception information. For the
meaning of *flags*, see below.
Note that if an otherwise unhandled :exc:`SystemExit` is raised, this
function will not return ``-1``, but exit the process, as long as
``Py_InspectFlag`` is not set.
.. cfunction:: int PyRun_SimpleFile(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_SimpleFileExFlags` below,
leaving *closeit* set to ``0`` and *flags* set to *NULL*.
.. cfunction:: int PyRun_SimpleFileFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
This is a simplified interface to :cfunc:`PyRun_SimpleFileExFlags` below,
leaving *closeit* set to ``0``.
.. cfunction:: int PyRun_SimpleFileEx(FILE *fp, const char *filename, int closeit)
This is a simplified interface to :cfunc:`PyRun_SimpleFileExFlags` below,
leaving *flags* set to *NULL*.
.. cfunction:: int PyRun_SimpleFileExFlags(FILE *fp, const char *filename, int closeit, PyCompilerFlags *flags)
Similar to :cfunc:`PyRun_SimpleStringFlags`, but the Python source code is read
from *fp* instead of an in-memory string. *filename* should be the name of the
file. If *closeit* is true, the file is closed before
:cfunc:`PyRun_SimpleFileExFlags` returns.
.. cfunction:: int PyRun_InteractiveOne(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_InteractiveOneFlags` below,
leaving *flags* set to *NULL*.
.. cfunction:: int PyRun_InteractiveOneFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
Read and execute a single statement from a file associated with an interactive
device according to the *flags* argument. If *filename* is *NULL*, ``"???"`` is
used instead. The user will be prompted using ``sys.ps1`` and ``sys.ps2``.
Returns ``0`` when the input was executed successfully, ``-1`` if there was an
exception, or an error code from the :file:`errcode.h` include file distributed
as part of Python if there was a parse error. (Note that :file:`errcode.h` is
not included by :file:`Python.h`, so must be included specifically if needed.)
.. cfunction:: int PyRun_InteractiveLoop(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_InteractiveLoopFlags` below,
leaving *flags* set to *NULL*.
.. cfunction:: int PyRun_InteractiveLoopFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
Read and execute statements from a file associated with an interactive device
until EOF is reached. If *filename* is *NULL*, ``"???"`` is used instead. The
user will be prompted using ``sys.ps1`` and ``sys.ps2``. Returns ``0`` at EOF.
.. cfunction:: struct _node* PyParser_SimpleParseString(const char *str, int start)
This is a simplified interface to
:cfunc:`PyParser_SimpleParseStringFlagsFilename` below, leaving *filename* set
to *NULL* and *flags* set to ``0``.
.. cfunction:: struct _node* PyParser_SimpleParseStringFlags(const char *str, int start, int flags)
This is a simplified interface to
:cfunc:`PyParser_SimpleParseStringFlagsFilename` below, leaving *filename* set
to *NULL*.
.. cfunction:: struct _node* PyParser_SimpleParseStringFlagsFilename(const char *str, const char *filename, int start, int flags)
Parse Python source code from *str* using the start token *start* according to
the *flags* argument. The result can be used to create a code object which can
be evaluated efficiently. This is useful if a code fragment must be evaluated
many times.
.. cfunction:: struct _node* PyParser_SimpleParseFile(FILE *fp, const char *filename, int start)
This is a simplified interface to :cfunc:`PyParser_SimpleParseFileFlags` below,
leaving *flags* set to ``0``.
.. cfunction:: struct _node* PyParser_SimpleParseFileFlags(FILE *fp, const char *filename, int start, int flags)
Similar to :cfunc:`PyParser_SimpleParseStringFlagsFilename`, but the Python
source code is read from *fp* instead of an in-memory string.
.. cfunction:: PyObject* PyRun_String(const char *str, int start, PyObject *globals, PyObject *locals)
This is a simplified interface to :cfunc:`PyRun_StringFlags` below, leaving
*flags* set to *NULL*.
.. cfunction:: PyObject* PyRun_StringFlags(const char *str, int start, PyObject *globals, PyObject *locals, PyCompilerFlags *flags)
Execute Python source code from *str* in the context specified by the
dictionaries *globals* and *locals* with the compiler flags specified by
*flags*. The parameter *start* specifies the start token that should be used to
parse the source code.
Returns the result of executing the code as a Python object, or *NULL* if an
exception was raised.
.. cfunction:: PyObject* PyRun_File(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals)
This is a simplified interface to :cfunc:`PyRun_FileExFlags` below, leaving
*closeit* set to ``0`` and *flags* set to *NULL*.
.. cfunction:: PyObject* PyRun_FileEx(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals, int closeit)
This is a simplified interface to :cfunc:`PyRun_FileExFlags` below, leaving
*flags* set to *NULL*.
.. cfunction:: PyObject* PyRun_FileFlags(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals, PyCompilerFlags *flags)
This is a simplified interface to :cfunc:`PyRun_FileExFlags` below, leaving
*closeit* set to ``0``.
.. cfunction:: PyObject* PyRun_FileExFlags(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals, int closeit, PyCompilerFlags *flags)
Similar to :cfunc:`PyRun_StringFlags`, but the Python source code is read from
*fp* instead of an in-memory string. *filename* should be the name of the file.
If *closeit* is true, the file is closed before :cfunc:`PyRun_FileExFlags`
returns.
.. cfunction:: PyObject* Py_CompileString(const char *str, const char *filename, int start)
This is a simplified interface to :cfunc:`Py_CompileStringFlags` below, leaving
*flags* set to *NULL*.
.. cfunction:: PyObject* Py_CompileStringFlags(const char *str, const char *filename, int start, PyCompilerFlags *flags)
Parse and compile the Python source code in *str*, returning the resulting code
object. The start token is given by *start*; this can be used to constrain the
code which can be compiled and should be :const:`Py_eval_input`,
:const:`Py_file_input`, or :const:`Py_single_input`. The filename specified by
*filename* is used to construct the code object and may appear in tracebacks or
:exc:`SyntaxError` exception messages. This returns *NULL* if the code cannot
be parsed or compiled.
.. cfunction:: PyObject* PyEval_EvalCode(PyCodeObject *co, PyObject *globals, PyObject *locals)
This is a simplified interface to :cfunc:`PyEval_EvalCodeEx`, with just
the code object, and the dictionaries of global and local variables.
The other arguments are set to *NULL*.
.. cfunction:: PyObject* PyEval_EvalCodeEx(PyCodeObject *co, PyObject *globals, PyObject *locals, PyObject **args, int argcount, PyObject **kws, int kwcount, PyObject **defs, int defcount, PyObject *closure)
Evaluate a precompiled code object, given a particular environment for its
evaluation. This environment consists of dictionaries of global and local
variables, arrays of arguments, keywords and defaults, and a closure tuple of
cells.
.. cfunction:: PyObject* PyEval_EvalFrame(PyFrameObject *f)
Evaluate an execution frame. This is a simplified interface to
:cfunc:`PyEval_EvalFrameEx`, for backward compatibility.
.. cfunction:: PyObject* PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
This is the main, unvarnished function of Python interpretation. It is
literally 2000 lines long. The code object associated with the execution
frame *f* is executed, interpreting bytecode and executing calls as needed.
The additional *throwflag* parameter can mostly be ignored - if true, then
it causes an exception to immediately be thrown; this is used for the
:meth:`throw` methods of generator objects.
.. cfunction:: int PyEval_MergeCompilerFlags(PyCompilerFlags *cf)
This function changes the flags of the current evaluation frame, and returns
true on success, false on failure.
.. cvar:: int Py_eval_input
.. index:: single: Py_CompileString()
The start symbol from the Python grammar for isolated expressions; for use with
:cfunc:`Py_CompileString`.
.. cvar:: int Py_file_input
.. index:: single: Py_CompileString()
The start symbol from the Python grammar for sequences of statements as read
from a file or other source; for use with :cfunc:`Py_CompileString`. This is
the symbol to use when compiling arbitrarily long Python source code.
.. cvar:: int Py_single_input
.. index:: single: Py_CompileString()
The start symbol from the Python grammar for a single statement; for use with
:cfunc:`Py_CompileString`. This is the symbol used for the interactive
interpreter loop.
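The three start symbols have a counterpart at the Python level: the *mode*
argument of the built-in ``compile()`` function. The following is a minimal
sketch in Python rather than C, intended only to illustrate how the choice of
start symbol constrains what can be compiled; the source strings and the
``<example>`` filename are made up for the illustration.

```python
# Python-level counterpart of the three start symbols, via the built-in compile():
#   'eval'   ~ Py_eval_input   (one isolated expression)
#   'exec'   ~ Py_file_input   (a sequence of statements)
#   'single' ~ Py_single_input (one interactive statement)

# An isolated expression: evaluating the code object yields its value.
expr_code = compile("6 * 7", "<example>", "eval")
answer = eval(expr_code)

# A sequence of statements: executed for its effect on the namespace.
namespace = {}
module_code = compile("x = 1\ny = x + 1\n", "<example>", "exec")
exec(module_code, namespace)

# A single statement, as typed at the interactive prompt.
single_code = compile("z = 3", "<example>", "single")
exec(single_code, namespace)

# The start symbol constrains the input: a statement is rejected when only
# an expression is allowed.
try:
    compile("x = 1", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
```

As with :cfunc:`Py_CompileString`, the filename given to ``compile()`` is only
used for tracebacks and :exc:`SyntaxError` messages.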
.. ctype:: struct PyCompilerFlags
This is the structure used to hold compiler flags. In cases where code is only
being compiled, it is passed as ``int flags``, and in cases where code is being
executed, it is passed as ``PyCompilerFlags *flags``. In the latter case, ``from
__future__ import`` can modify *flags*.
Whenever ``PyCompilerFlags *flags`` is *NULL*, :attr:`cf_flags` is treated as
equal to ``0``, and any modification due to ``from __future__ import`` is
discarded. ::
struct PyCompilerFlags {
int cf_flags;
};
.. cvar:: int CO_FUTURE_DIVISION
This bit can be set in *flags* to cause division operator ``/`` to be
interpreted as "true division" according to :pep:`238`.

@@ -0,0 +1,76 @@
.. highlightlang:: c
.. _weakrefobjects:
Weak Reference Objects
----------------------
Python supports *weak references* as first-class objects. There are two
specific object types which directly implement weak references. The first is a
simple reference object, and the second acts as a proxy for the original object
as much as it can.
.. cfunction:: int PyWeakref_Check(ob)
Return true if *ob* is either a reference or proxy object.
.. versionadded:: 2.2
.. cfunction:: int PyWeakref_CheckRef(ob)
Return true if *ob* is a reference object.
.. versionadded:: 2.2
.. cfunction:: int PyWeakref_CheckProxy(ob)
Return true if *ob* is a proxy object.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyWeakref_NewRef(PyObject *ob, PyObject *callback)
Return a weak reference object for the object *ob*. This will always return
a new reference, but is not guaranteed to create a new object; an existing
reference object may be returned. The second parameter, *callback*, can be a
callable object that receives notification when *ob* is garbage collected; it
should accept a single parameter, which will be the weak reference object
itself. *callback* may also be ``None`` or *NULL*. If *ob* is not a
weakly-referenceable object, or if *callback* is neither callable, ``None``, nor
*NULL*, this will return *NULL* and raise :exc:`TypeError`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyWeakref_NewProxy(PyObject *ob, PyObject *callback)
Return a weak reference proxy object for the object *ob*. This will always
return a new reference, but is not guaranteed to create a new object; an
existing proxy object may be returned. The second parameter, *callback*, can
be a callable object that receives notification when *ob* is garbage
collected; it should accept a single parameter, which will be the weak
reference object itself. *callback* may also be ``None`` or *NULL*. If *ob*
is not a weakly-referenceable object, or if *callback* is neither callable,
``None``, nor *NULL*, this will return *NULL* and raise :exc:`TypeError`.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyWeakref_GetObject(PyObject *ref)
Return the referenced object from a weak reference, *ref*. If the referent is
no longer live, returns ``None``.
.. versionadded:: 2.2
.. cfunction:: PyObject* PyWeakref_GET_OBJECT(PyObject *ref)
Similar to :cfunc:`PyWeakref_GetObject`, but implemented as a macro that does no
error checking.
.. versionadded:: 2.2
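The behaviour described above can be observed from Python through the
:mod:`weakref` module, which exposes the same two object types. This is a
sketch in Python, not a C example; the ``Node`` class and the ``on_collect``
callback are hypothetical names chosen for the illustration.

```python
import gc
import weakref


class Node(object):
    """Instances of ordinary classes are weakly referenceable."""


notified = []


def on_collect(ref):
    # The callback receives the weak reference object itself.
    notified.append(ref)


node = Node()
ref = weakref.ref(node, on_collect)   # compare PyWeakref_NewRef
proxy = weakref.proxy(node)           # compare PyWeakref_NewProxy

alive = ref() is node                 # compare PyWeakref_GetObject

del node
gc.collect()                          # harmless where reference counting already freed it

dead = ref() is None                  # the referent is no longer live

try:
    weakref.ref(42)                   # int instances are not weakly referenceable
    type_error = False
except TypeError:
    type_error = True
```

Neither the reference nor the proxy keeps the object alive, which is why
``ref()`` yields ``None`` once the last strong reference is gone.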

@@ -0,0 +1,184 @@
# -*- coding: utf-8 -*-
#
# Python documentation build configuration file
#
# This file is execfile()d with the current directory set to its containing dir.
#
# The contents of this file are pickled, so don't put values in the namespace
# that aren't pickleable (module imports are okay, they're removed automatically).
import sys, os, time
sys.path.append(os.path.abspath('tools/sphinxext'))
# General configuration
# ---------------------
extensions = ['sphinx.ext.refcounting', 'sphinx.ext.coverage',
'sphinx.ext.doctest', 'pyspecific']
templates_path = ['tools/sphinxext']
# General substitutions.
project = 'Python'
copyright = '1990-%s, Python Software Foundation' % time.strftime('%Y')
# The default replacements for |version| and |release|.
#
# The short X.Y version.
# version = '2.6'
# The full version, including alpha/beta/rc tags.
# release = '2.6a0'
# We look for the Include/patchlevel.h file in the current Python source tree
# and replace the values accordingly.
import patchlevel
version, release = patchlevel.get_version_info()
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
today = ''
# Else, today_fmt is used as the format for a strftime call.
today_fmt = '%B %d, %Y'
# List of files that shouldn't be included in the build.
unused_docs = [
'maclib/scrap',
'library/xmllib',
'library/xml.etree',
]
# Relative filename of the reference count data file.
refcount_file = 'data/refcounts.dat'
# If true, '()' will be appended to :func: etc. cross-reference text.
add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
add_module_names = True
# Options for HTML output
# -----------------------
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
html_use_smartypants = True
# Custom sidebar templates, filenames relative to this file.
html_sidebars = {
'index': 'indexsidebar.html',
}
# Additional templates that should be rendered to pages.
html_additional_pages = {
'download': 'download.html',
'index': 'indexcontent.html',
}
# Output an OpenSearch description file.
html_use_opensearch = 'http://docs.python.org/dev'
# Additional static files.
html_static_path = ['tools/sphinxext/static']
# Output file base name for HTML help builder.
htmlhelp_basename = 'python' + release.replace('.', '')
# Split the index
html_split_index = True
# Options for LaTeX output
# ------------------------
# The paper size ('letter' or 'a4').
latex_paper_size = 'a4'
# The font size ('10pt', '11pt' or '12pt').
latex_font_size = '10pt'
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, document class [howto/manual]).
_stdauthor = r'Guido van Rossum\\Fred L. Drake, Jr., editor'
latex_documents = [
('c-api/index', 'c-api.tex',
'The Python/C API', _stdauthor, 'manual'),
('distutils/index', 'distutils.tex',
'Distributing Python Modules', _stdauthor, 'manual'),
('documenting/index', 'documenting.tex',
'Documenting Python', 'Georg Brandl', 'manual'),
('extending/index', 'extending.tex',
'Extending and Embedding Python', _stdauthor, 'manual'),
('install/index', 'install.tex',
'Installing Python Modules', _stdauthor, 'manual'),
('library/index', 'library.tex',
'The Python Library Reference', _stdauthor, 'manual'),
('reference/index', 'reference.tex',
'The Python Language Reference', _stdauthor, 'manual'),
('tutorial/index', 'tutorial.tex',
'Python Tutorial', _stdauthor, 'manual'),
('using/index', 'using.tex',
'Using Python', _stdauthor, 'manual'),
('whatsnew/' + version, 'whatsnew.tex',
'What\'s New in Python', 'A. M. Kuchling', 'howto'),
]
# Collect all HOWTOs individually
latex_documents.extend(('howto/' + fn[:-4], 'howto-' + fn[:-4] + '.tex',
'', _stdauthor, 'howto')
for fn in os.listdir('howto')
if fn.endswith('.rst') and fn != 'index.rst')
# Additional stuff for the LaTeX preamble.
latex_preamble = r'''
\authoraddress{
\strong{Python Software Foundation}\\
Email: \email{docs@python.org}
}
\let\Verbatim=\OriginalVerbatim
\let\endVerbatim=\endOriginalVerbatim
'''
# Documents to append as an appendix to all manuals.
latex_appendices = ['glossary', 'about', 'license', 'copyright']
latex_elements = {'inputenc': '\\usepackage[utf8x]{inputenc}'}
# Options for the coverage checker
# --------------------------------
# The coverage checker will ignore all modules/functions/classes whose names
# match any of the following regexes (using re.match).
coverage_ignore_modules = [
r'[T|t][k|K]',
r'Tix',
r'distutils.*',
]
coverage_ignore_functions = [
'test($|_)',
]
coverage_ignore_classes = [
]
# Glob patterns for C source files for C API coverage, relative to this directory.
coverage_c_path = [
'../Include/*.h',
]
# Regexes to find C items in the source files.
coverage_c_regexes = {
'cfunction': (r'^PyAPI_FUNC\(.*\)\s+([^_][\w_]+)'),
'data': (r'^PyAPI_DATA\(.*\)\s+([^_][\w_]+)'),
'macro': (r'^#define ([^_][\w_]+)\(.*\)[\s|\\]'),
}
# The coverage checker will ignore all C items whose names match these regexes
# (using re.match) -- the keys must be the same as in coverage_c_regexes.
coverage_ignore_c_items = {
# 'cfunction': [...]
}

@@ -0,0 +1,23 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Python Documentation contents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
.. toctree::
whatsnew/index.rst
tutorial/index.rst
using/index.rst
reference/index.rst
library/index.rst
extending/index.rst
c-api/index.rst
distutils/index.rst
install/index.rst
documenting/index.rst
howto/index.rst
glossary.rst
about.rst
bugs.rst
copyright.rst
license.rst

@@ -0,0 +1,19 @@
*********
Copyright
*********
Python and this documentation is:
Copyright © 2001-2008 Python Software Foundation. All rights reserved.
Copyright © 2000 BeOpen.com. All rights reserved.
Copyright © 1995-2000 Corporation for National Research Initiatives. All rights
reserved.
Copyright © 1991-1995 Stichting Mathematisch Centrum. All rights reserved.
-------
See :ref:`history-and-license` for complete license and permissions information.

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

@@ -0,0 +1,446 @@
.. _built-dist:
****************************
Creating Built Distributions
****************************
A "built distribution" is what you're probably used to thinking of either as a
"binary package" or an "installer" (depending on your background). It's not
necessarily binary, though, because it might contain only Python source code
and/or byte-code; and we don't call it a package, because that word is already
spoken for in Python. (And "installer" is a term specific to the world of
mainstream desktop systems.)
A built distribution is how you make life as easy as possible for installers of
your module distribution: for users of RPM-based Linux systems, it's a binary
RPM; for Windows users, it's an executable installer; for Debian-based Linux
users, it's a Debian package; and so forth. Obviously, no one person will be
able to create built distributions for every platform under the sun, so the
Distutils are designed to enable module developers to concentrate on their
specialty---writing code and creating source distributions---while an
intermediary species called *packagers* springs up to turn source distributions
into built distributions for as many platforms as there are packagers.
Of course, the module developer could be his own packager; or the packager could
be a volunteer "out there" somewhere who has access to a platform which the
original developer does not; or it could be software periodically grabbing new
source distributions and turning them into built distributions for as many
platforms as the software has access to. Regardless of who they are, a packager
uses the setup script and the :command:`bdist` command family to generate built
distributions.
As a simple example, if I run the following command in the Distutils source
tree::
python setup.py bdist
then the Distutils builds my module distribution (the Distutils itself in this
case), does a "fake" installation (also in the :file:`build` directory), and
creates the default type of built distribution for my platform. The default
format for built distributions is a "dumb" tar file on Unix, and a simple
executable installer on Windows. (That tar file is considered "dumb" because it
has to be unpacked in a specific location to work.)
Thus, the above command on a Unix system creates
:file:`Distutils-1.0.{plat}.tar.gz`; unpacking this tarball from the right place
installs the Distutils just as though you had downloaded the source distribution
and run ``python setup.py install``. (The "right place" is either the root of
the filesystem or Python's :file:`{prefix}` directory, depending on the options
given to the :command:`bdist_dumb` command; the default is to make dumb
distributions relative to :file:`{prefix}`.)
Obviously, for pure Python distributions, this isn't any simpler than just
running ``python setup.py install``\ ---but for non-pure distributions, which
include extensions that would need to be compiled, it can mean the difference
between someone being able to use your extensions or not. And creating "smart"
built distributions, such as an RPM package or an executable installer for
Windows, is far more convenient for users even if your distribution doesn't
include any extensions.
The :command:`bdist` command has a :option:`--formats` option, similar to the
:command:`sdist` command, which you can use to select the types of built
distribution to generate: for example, ::
python setup.py bdist --formats=zip
would, when run on a Unix system, create :file:`Distutils-1.0.{plat}.zip`\
---again, this archive would be unpacked from the root directory to install the
Distutils.
The available formats for built distributions are:
+-------------+------------------------------+---------+
| Format      | Description                  | Notes   |
+=============+==============================+=========+
| ``gztar``   | gzipped tar file             | (1),(3) |
|             | (:file:`.tar.gz`)            |         |
+-------------+------------------------------+---------+
| ``ztar``    | compressed tar file          | \(3)    |
|             | (:file:`.tar.Z`)             |         |
+-------------+------------------------------+---------+
| ``tar``     | tar file (:file:`.tar`)      | \(3)    |
+-------------+------------------------------+---------+
| ``zip``     | zip file (:file:`.zip`)      | \(4)    |
+-------------+------------------------------+---------+
| ``rpm``     | RPM                          | \(5)    |
+-------------+------------------------------+---------+
| ``pkgtool`` | Solaris :program:`pkgtool`   |         |
+-------------+------------------------------+---------+
| ``sdux``    | HP-UX :program:`swinstall`   |         |
+-------------+------------------------------+---------+
| ``wininst`` | self-extracting ZIP file for | (2),(4) |
|             | Windows                      |         |
+-------------+------------------------------+---------+
Notes:
(1)
default on Unix
(2)
default on Windows
**\*\*** to-do! **\*\***
(3)
requires external utilities: :program:`tar` and possibly one of :program:`gzip`,
:program:`bzip2`, or :program:`compress`
(4)
requires either external :program:`zip` utility or :mod:`zipfile` module (part
of the standard Python library since Python 1.6)
(5)
requires external :program:`rpm` utility, version 3.0.4 or better (use ``rpm
--version`` to find out which version you have)
You don't have to use the :command:`bdist` command with the :option:`--formats`
option; you can also use the command that directly implements the format you're
interested in. Some of these :command:`bdist` "sub-commands" actually generate
several similar formats; for instance, the :command:`bdist_dumb` command
generates all the "dumb" archive formats (``tar``, ``ztar``, ``gztar``, and
``zip``), and :command:`bdist_rpm` generates both binary and source RPMs. The
:command:`bdist` sub-commands, and the formats generated by each, are:
+--------------------------+-----------------------+
| Command                  | Formats               |
+==========================+=======================+
| :command:`bdist_dumb`    | tar, ztar, gztar, zip |
+--------------------------+-----------------------+
| :command:`bdist_rpm`     | rpm, srpm             |
+--------------------------+-----------------------+
| :command:`bdist_wininst` | wininst               |
+--------------------------+-----------------------+
The following sections give details on the individual :command:`bdist_\*`
commands.
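The relationship between formats and sub-commands can be expressed as a small
lookup. The sketch below is transcribed from the table above and is not read
from the Distutils themselves; ``SUBCOMMAND_FORMATS`` and ``subcommand_for``
are hypothetical names used only for this illustration.

```python
# Transcribed by hand from the sub-command table; an illustration only.
SUBCOMMAND_FORMATS = {
    "bdist_dumb": ["tar", "ztar", "gztar", "zip"],
    "bdist_rpm": ["rpm", "srpm"],
    "bdist_wininst": ["wininst"],
}


def subcommand_for(fmt):
    """Return the bdist sub-command that the table lists for *fmt*."""
    for command, formats in SUBCOMMAND_FORMATS.items():
        if fmt in formats:
            return command
    raise ValueError("no bdist sub-command listed for format %r" % fmt)
```

For instance, ``subcommand_for("zip")`` gives ``"bdist_dumb"``, so under this
mapping ``python setup.py bdist_dumb`` is the command that produces that
archive type.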
.. _creating-dumb:
Creating dumb built distributions
=================================
**\*\*** Need to document absolute vs. prefix-relative packages here, but first
I have to implement it! **\*\***
.. _creating-rpms:
Creating RPM packages
=====================
The RPM format is used by many popular Linux distributions, including Red Hat,
SuSE, and Mandrake. If one of these (or any of the other RPM-based Linux
distributions) is your usual environment, creating RPM packages for other users
of that same distribution is trivial. Depending on the complexity of your module
distribution and differences between Linux distributions, you may also be able
to create RPMs that work on different RPM-based distributions.
The usual way to create an RPM of your module distribution is to run the
:command:`bdist_rpm` command::
python setup.py bdist_rpm
or the :command:`bdist` command with the :option:`--formats` option::
python setup.py bdist --formats=rpm
The former allows you to specify RPM-specific options; the latter allows you to
easily specify multiple formats in one run. If you need to do both, you can
explicitly specify multiple :command:`bdist_\*` commands and their options::
python setup.py bdist_rpm --packager="John Doe <jdoe@example.org>" \
bdist_wininst --target-version="2.0"
Creating RPM packages is driven by a :file:`.spec` file, much as using the
Distutils is driven by the setup script. To make your life easier, the
:command:`bdist_rpm` command normally creates a :file:`.spec` file based on the
information you supply in the setup script, on the command line, and in any
Distutils configuration files. Various options and sections in the
:file:`.spec` file are derived from options in the setup script as follows:
+------------------------------------------+----------------------------------------------+
| RPM :file:`.spec` file option or section | Distutils setup script option                |
+==========================================+==============================================+
| Name                                     | :option:`name`                               |
+------------------------------------------+----------------------------------------------+
| Summary (in preamble)                    | :option:`description`                        |
+------------------------------------------+----------------------------------------------+
| Version                                  | :option:`version`                            |
+------------------------------------------+----------------------------------------------+
| Vendor                                   | :option:`author` and :option:`author_email`, |
|                                          | or :option:`maintainer` and                  |
|                                          | :option:`maintainer_email`                   |
+------------------------------------------+----------------------------------------------+
| Copyright                                | :option:`license`                            |
+------------------------------------------+----------------------------------------------+
| Url                                      | :option:`url`                                |
+------------------------------------------+----------------------------------------------+
| %description (section)                   | :option:`long_description`                   |
+------------------------------------------+----------------------------------------------+
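To make the mapping in the table concrete, here is a minimal sketch that turns
a dictionary of setup script options into the corresponding :file:`.spec`
lines. It is not how :command:`bdist_rpm` is implemented; ``spec_fields`` is a
hypothetical helper, and preferring the author over the maintainer for the
Vendor field, as well as the ``Name <email>`` layout, are assumptions of this
sketch.

```python
def spec_fields(meta):
    """Render .spec preamble lines from setup-script options (illustration only)."""
    # Assumption: use the author if given, otherwise fall back to the maintainer.
    if meta.get("author"):
        contact, email = meta["author"], meta.get("author_email", "")
    else:
        contact, email = meta.get("maintainer"), meta.get("maintainer_email", "")

    fields = [
        ("Name", meta.get("name")),
        ("Summary", meta.get("description")),
        ("Version", meta.get("version")),
        ("Vendor", "%s <%s>" % (contact, email) if contact else None),
        ("Copyright", meta.get("license")),
        ("Url", meta.get("url")),
    ]
    # Options that were not supplied produce no line at all.
    lines = ["%s: %s" % (key, value) for key, value in fields if value]

    if meta.get("long_description"):
        lines += ["", "%description", meta["long_description"]]
    return "\n".join(lines)


example = spec_fields({
    "name": "FooBar",
    "version": "1.0",
    "description": "A demonstration package",
    "author": "Jane Doe",
    "author_email": "jane@example.org",
    "license": "MIT",
})
```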
Additionally, there are many options in :file:`.spec` files that don't have
corresponding options in the setup script. Most of these are handled through
options to the :command:`bdist_rpm` command as follows:
+-------------------------------+-----------------------------+-------------------------+
| RPM :file:`.spec` file option | :command:`bdist_rpm` option | default value           |
| or section                    |                             |                         |
+===============================+=============================+=========================+
| Release                       | :option:`release`           | "1"                     |
+-------------------------------+-----------------------------+-------------------------+
| Group                         | :option:`group`             | "Development/Libraries" |
+-------------------------------+-----------------------------+-------------------------+
| Vendor                        | :option:`vendor`            | (see above)             |
+-------------------------------+-----------------------------+-------------------------+
| Packager                      | :option:`packager`          | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
| Provides                      | :option:`provides`          | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
| Requires                      | :option:`requires`          | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
| Conflicts                     | :option:`conflicts`         | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
| Obsoletes                     | :option:`obsoletes`         | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
| Distribution                  | :option:`distribution_name` | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
| BuildRequires                 | :option:`build_requires`    | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
| Icon                          | :option:`icon`              | (none)                  |
+-------------------------------+-----------------------------+-------------------------+
Obviously, supplying even a few of these options on the command-line would be
tedious and error-prone, so it's usually best to put them in the setup
configuration file, :file:`setup.cfg`\ ---see section :ref:`setup-config`. If
you distribute or package many Python module distributions, you might want to
put options that apply to all of them in your personal Distutils configuration
file (:file:`~/.pydistutils.cfg`).
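As an illustration of what such a configuration file might contain, the sketch
below generates a ``[bdist_rpm]`` section using a few of the options from the
table above. The values are made up, and the sketch uses the
:mod:`configparser` module as it is named on Python 3 (on Python 2 the module
is called :mod:`ConfigParser`); in practice the file is usually written by
hand.

```python
import configparser
import io

# Options for one command go in a section named after that command.
config = configparser.ConfigParser()
config["bdist_rpm"] = {
    "release": "1",
    "packager": "John Doe <jdoe@example.org>",
    "group": "Development/Libraries",
}

# Render the text that would be saved as setup.cfg.
buffer = io.StringIO()
config.write(buffer)
setup_cfg_text = buffer.getvalue()
```

With these settings stored in :file:`setup.cfg`, a plain ``python setup.py
bdist_rpm`` picks them up without any extra command-line options.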
There are three steps to building a binary RPM package, all of which are
handled automatically by the Distutils:
#. create a :file:`.spec` file, which describes the package (analogous to the
Distutils setup script; in fact, much of the information in the setup script
winds up in the :file:`.spec` file)
#. create the source RPM
#. create the "binary" RPM (which may or may not contain binary code, depending
on whether your module distribution contains Python extensions)
Normally, RPM bundles the last two steps together; when you use the Distutils,
all three steps are typically bundled together.
If you wish, you can separate these three steps. You can use the
:option:`--spec-only` option to make :command:`bdist_rpm` just create the
:file:`.spec` file and exit; in this case, the :file:`.spec` file will be
written to the "distribution directory"---normally :file:`dist/`, but
customizable with the :option:`--dist-dir` option. (Normally, the :file:`.spec`
file winds up deep in the "build tree," in a temporary directory created by
:command:`bdist_rpm`.)
.. % \XXX{this isn't implemented yet---is it needed?!}
.. % You can also specify a custom \file{.spec} file with the
.. % \longprogramopt{spec-file} option; used in conjunction with
.. % \longprogramopt{spec-only}, this gives you an opportunity to customize
.. % the \file{.spec} file manually:
.. %
.. % \ begin{verbatim}
.. % > python setup.py bdist_rpm --spec-only
.. % # ...edit dist/FooBar-1.0.spec
.. % > python setup.py bdist_rpm --spec-file=dist/FooBar-1.0.spec
.. % \ end{verbatim}
.. %
.. % (Although a better way to do this is probably to override the standard
.. % \command{bdist\_rpm} command with one that writes whatever else you want
.. % to the \file{.spec} file.)
.. _creating-wininst:
Creating Windows Installers
===========================
Executable installers are the natural format for binary distributions on
Windows. They display a nice graphical user interface, display some information
about the module distribution to be installed taken from the metadata in the
setup script, let the user select a few options, and start or cancel the
installation.
Since the metadata is taken from the setup script, creating Windows installers
is usually as easy as running::
python setup.py bdist_wininst
or the :command:`bdist` command with the :option:`--formats` option::
python setup.py bdist --formats=wininst
If you have a pure module distribution (only containing pure Python modules and
packages), the resulting installer will be version independent and have a name
like :file:`foo-1.0.win32.exe`. These installers can even be created on Unix
platforms or Mac OS X.
If you have a non-pure distribution, the extensions can only be created on a
Windows platform, and will be Python version dependent. The installer filename
will reflect this and now has the form :file:`foo-1.0.win32-py2.0.exe`. You
have to create a separate installer for every Python version you want to
support.
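The naming scheme described in the two paragraphs above can be summarized in a
few lines. This is only a sketch of the pattern shown in the examples;
``installer_filename`` is a hypothetical helper, not a Distutils function.

```python
def installer_filename(name, version, target_version=None, plat="win32"):
    """Build an installer name following the pattern described in the text.

    *target_version* is None for a pure distribution and a Python version
    string such as "2.0" for a distribution containing extensions.
    """
    base = "%s-%s.%s" % (name, version, plat)
    if target_version is not None:
        base += "-py%s" % target_version
    return base + ".exe"


pure_name = installer_filename("foo", "1.0")
nonpure_name = installer_filename("foo", "1.0", target_version="2.0")
```

Here ``pure_name`` is ``foo-1.0.win32.exe`` and ``nonpure_name`` is
``foo-1.0.win32-py2.0.exe``, matching the two filenames quoted above.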
The installer will try to compile pure modules into :term:`bytecode` after installation
on the target system in normal and optimizing mode. If you don't want this to
happen for some reason, you can run the :command:`bdist_wininst` command with
the :option:`--no-target-compile` and/or the :option:`--no-target-optimize`
option.
By default the installer will display the cool "Python Powered" logo when it is
run, but you can also supply your own bitmap which must be a Windows
:file:`.bmp` file with the :option:`--bitmap` option.
The installer will also display a large title on the desktop background window
when it is run, which is constructed from the name of your distribution and the
version number. This can be changed to another text by using the
:option:`--title` option.
The installer file will be written to the "distribution directory" --- normally
:file:`dist/`, but customizable with the :option:`--dist-dir` option.
.. _cross-compile-windows:
Cross-compiling on Windows
==========================
Starting with Python 2.6, distutils is capable of cross-compiling between
Windows platforms. In practice, this means that with the correct tools
installed, you can use a 32bit version of Windows to create 64bit extensions
and vice-versa.
To build for an alternate platform, specify the :option:`--plat-name` option
to the build command. Valid values are currently 'win32', 'win-amd64' and
'win-ia64'. For example, on a 32bit version of Windows, you could execute::
python setup.py build --plat-name=win-amd64
to build a 64bit version of your extension. The Windows Installers also
support this option, so the command::
python setup.py build --plat-name=win-amd64 bdist_wininst
would create a 64bit installation executable on your 32bit version of Windows.
To cross-compile, you must download the Python source code and cross-compile
Python itself for the platform you are targeting; it is not possible from a
binary installation of Python (as the .lib etc. files for other platforms are
not included). In practice, this means the user of a 32-bit operating
system will need to use Visual Studio 2008 to open the
:file:`PCBuild/PCbuild.sln` solution in the Python source tree and build the
"x64" configuration of the 'pythoncore' project before cross-compiling
extensions is possible.
Note that by default, Visual Studio 2008 does not install 64bit compilers or
tools. You may need to reexecute the Visual Studio setup process and select
these tools (using Control Panel->[Add/Remove] Programs is a convenient way to
check or modify your existing install.)
.. _postinstallation-script:
The Postinstallation script
---------------------------
Starting with Python 2.3, a postinstallation script can be specified with the
:option:`--install-script` option. The basename of the script must be
specified, and the script filename must also be listed in the scripts argument
to the setup function.
This script will be run at installation time on the target system after all the
files have been copied, with ``argv[1]`` set to :option:`-install`, and again at
uninstallation time before the files are removed with ``argv[1]`` set to
:option:`-remove`.
The installation script runs embedded in the Windows installer; all output
(``sys.stdout``, ``sys.stderr``) is redirected into a buffer and will be
displayed in the GUI after the script has finished.
Some functions especially useful in this context are available as additional
built-in functions in the installation script.
.. function:: directory_created(path)
file_created(path)
These functions should be called when a directory or file is created by the
postinstall script at installation time. Each call registers *path* with the
uninstaller, so that it will be removed when the distribution is uninstalled.
To be safe, directories are only removed if they are empty.
.. function:: get_special_folder_path(csidl_string)
This function can be used to retrieve special folder locations on Windows like
the Start Menu or the Desktop. It returns the full path to the folder.
*csidl_string* must be one of the following strings::
"CSIDL_APPDATA"
"CSIDL_COMMON_STARTMENU"
"CSIDL_STARTMENU"
"CSIDL_COMMON_DESKTOPDIRECTORY"
"CSIDL_DESKTOPDIRECTORY"
"CSIDL_COMMON_STARTUP"
"CSIDL_STARTUP"
"CSIDL_COMMON_PROGRAMS"
"CSIDL_PROGRAMS"
"CSIDL_FONTS"
If the folder cannot be retrieved, :exc:`OSError` is raised.
Which folders are available depends on the exact Windows version, and probably
also the configuration. For details refer to Microsoft's documentation of the
:cfunc:`SHGetSpecialFolderPath` function.
Vista User Access Control (UAC)
===============================
Starting with Python 2.6, bdist_wininst supports a :option:`--user-access-control`
option. The default is 'none' (meaning no UAC handling is done), and other
valid values are 'auto' (meaning prompt for UAC elevation if Python was
installed for all users) and 'force' (meaning always prompt for elevation).
.. function:: create_shortcut(target, description, filename[, arguments[, workdir[, iconpath[, iconindex]]]])
This function creates a shortcut. *target* is the path to the program to be
started by the shortcut. *description* is the description of the shortcut.
*filename* is the title of the shortcut that the user will see. *arguments*
specifies the command line arguments, if any. *workdir* is the working directory
for the program. *iconpath* is the file containing the icon for the shortcut,
and *iconindex* is the index of the icon in the file *iconpath*. Again, for
details consult the Microsoft documentation for the :class:`IShellLink`
interface.
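As a rough illustration of how these pieces fit together, the sketch below
models a postinstallation script as a plain function. The name
``run_postinstall`` and its three callable parameters are hypothetical: they
stand in for the built-ins ``get_special_folder_path``, ``create_shortcut`` and
``file_created``, which exist only while the script runs inside the Windows
installer. The shortcut name and the target path are made up as well.

```python
import os


def run_postinstall(argv, get_folder, make_shortcut, register_file):
    """Model of a postinstallation script (hypothetical helper).

    The three callables stand in for the installer built-ins
    get_special_folder_path, create_shortcut and file_created.
    """
    if len(argv) < 2 or argv[1] != "-install":
        # At uninstallation time argv[1] is "-remove"; registered files
        # are removed by the uninstaller, so nothing is done here.
        return None
    try:
        desktop = get_folder("CSIDL_DESKTOPDIRECTORY")
    except OSError:
        # The folder could not be retrieved on this system.
        return None
    link = os.path.join(desktop, "Foo.lnk")       # made-up shortcut name
    make_shortcut(r"C:\Python26\pythonw.exe",     # made-up target program
                  "Start the Foo application", link)
    register_file(link)  # lets the uninstaller remove the shortcut again
    return link
```

Inside a real script the call would presumably look like
``run_postinstall(sys.argv, get_special_folder_path, create_shortcut,
file_created)``; passing the functions in explicitly simply makes the logic
testable outside the installer.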

@@ -0,0 +1,104 @@
.. _reference:
*****************
Command Reference
*****************
.. % \section{Building modules: the \protect\command{build} command family}
.. % \label{build-cmds}
.. % \subsubsection{\protect\command{build}}
.. % \label{build-cmd}
.. % \subsubsection{\protect\command{build\_py}}
.. % \label{build-py-cmd}
.. % \subsubsection{\protect\command{build\_ext}}
.. % \label{build-ext-cmd}
.. % \subsubsection{\protect\command{build\_clib}}
.. % \label{build-clib-cmd}
.. _install-cmd:
Installing modules: the :command:`install` command family
=========================================================
The install command ensures that the build commands have been run and then runs
the subcommands :command:`install_lib`, :command:`install_data` and
:command:`install_scripts`.
.. % \subsubsection{\protect\command{install\_lib}}
.. % \label{install-lib-cmd}
.. _install-data-cmd:
:command:`install_data`
-----------------------
This command installs all data files provided with the distribution.
.. _install-scripts-cmd:
:command:`install_scripts`
--------------------------
This command installs all (Python) scripts in the distribution.
.. % \subsection{Cleaning up: the \protect\command{clean} command}
.. % \label{clean-cmd}
.. _sdist-cmd:
Creating a source distribution: the :command:`sdist` command
============================================================
**\*\*** fragment moved down from above: needs context! **\*\***
The manifest template commands are:
+-------------------------------------------+-----------------------------------------------+
| Command                                   | Description                                   |
+===========================================+===============================================+
| :command:`include pat1 pat2 ...`          | include all files matching any of the listed  |
|                                           | patterns                                      |
+-------------------------------------------+-----------------------------------------------+
| :command:`exclude pat1 pat2 ...`          | exclude all files matching any of the listed  |
|                                           | patterns                                      |
+-------------------------------------------+-----------------------------------------------+
| :command:`recursive-include dir pat1 pat2 | include all files under *dir* matching any of |
| ...`                                      | the listed patterns                           |
+-------------------------------------------+-----------------------------------------------+
| :command:`recursive-exclude dir pat1 pat2 | exclude all files under *dir* matching any of |
| ...`                                      | the listed patterns                           |
+-------------------------------------------+-----------------------------------------------+
| :command:`global-include pat1 pat2 ...`   | include all files anywhere in the source tree |
|                                           | matching any of the listed patterns           |
+-------------------------------------------+-----------------------------------------------+
| :command:`global-exclude pat1 pat2 ...`   | exclude all files anywhere in the source tree |
|                                           | matching any of the listed patterns           |
+-------------------------------------------+-----------------------------------------------+
| :command:`prune dir`                      | exclude all files under *dir*                 |
+-------------------------------------------+-----------------------------------------------+
| :command:`graft dir`                      | include all files under *dir*                 |
+-------------------------------------------+-----------------------------------------------+
The patterns here are Unix-style "glob" patterns: ``*`` matches any sequence of
regular filename characters, ``?`` matches any single regular filename
character, and ``[range]`` matches any of the characters in *range* (e.g.,
``a-z``, ``a-zA-Z``, ``a-f0-9_.``). The definition of "regular filename
character" is platform-specific: on Unix it is anything except slash; on Windows
anything except backslash or colon.
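The Unix rule described above, in which ``*`` and ``?`` never match a slash,
can be sketched as a small glob-to-regex translation. The function name
``glob_to_regex`` is an assumption made for this illustration; it is not the
Distutils' own implementation, and it ignores corner cases such as negated or
empty ranges.

```python
import re


def glob_to_regex(pattern, sep="/"):
    """Compile a glob in which * and ? never match the path separator."""
    not_sep = "[^%s]" % re.escape(sep)
    pieces = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            pieces.append(not_sep + "*")      # any run of regular characters
        elif ch == "?":
            pieces.append(not_sep)            # exactly one regular character
        elif ch == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            pieces.append("[" + pattern[i + 1:end] + "]")  # copy the range
            i = end
        else:
            pieces.append(re.escape(ch))      # literal character
        i += 1
    return re.compile("".join(pieces) + r"\Z")


matches_top_level = glob_to_regex("*.py").match("foo.py") is not None
matches_nested = glob_to_regex("*.py").match("pkg/foo.py") is not None
```

With this translation ``*.py`` matches :file:`foo.py` but not
:file:`pkg/foo.py`, which is why the recursive and global manifest commands
exist.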
**\*\*** Windows support not there yet **\*\***
.. % \section{Creating a built distribution: the
.. % \protect\command{bdist} command family}
.. % \label{bdist-cmds}
.. % \subsection{\protect\command{bdist}}
.. % \subsection{\protect\command{bdist\_dumb}}
.. % \subsection{\protect\command{bdist\_rpm}}
.. % \subsection{\protect\command{bdist\_wininst}}

@@ -0,0 +1,130 @@
.. _setup-config:
************************************
Writing the Setup Configuration File
************************************
Often, it's not possible to write down everything needed to build a distribution
*a priori*: you may need to get some information from the user, or from the
user's system, in order to proceed. As long as that information is fairly
simple---a list of directories to search for C header files or libraries, for
example---then providing a configuration file, :file:`setup.cfg`, for users to
edit is a cheap and easy way to solicit it. Configuration files also let you
provide default values for any command option, which the installer can then
override either on the command-line or by editing the config file.
The setup configuration file is a useful middle-ground between the setup script
---which, ideally, would be opaque to installers [#]_---and the command-line to
the setup script, which is outside of your control and entirely up to the
installer. In fact, :file:`setup.cfg` (and any other Distutils configuration
files present on the target system) is processed after the contents of the
setup script, but before the command-line. This has several useful
consequences:
.. % (If you have more advanced needs, such as determining which extensions
.. % to build based on what capabilities are present on the target system,
.. % then you need the Distutils ``auto-configuration'' facility. This
.. % started to appear in Distutils 0.9 but, as of this writing, isn't mature
.. % or stable enough yet for real-world use.)
* installers can override some of what you put in :file:`setup.py` by editing
:file:`setup.cfg`
* you can provide non-standard defaults for options that are not easily set in
:file:`setup.py`
* installers can override anything in :file:`setup.cfg` using the command-line
options to :file:`setup.py`
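The processing order behind these consequences can be pictured as three layers
merged in turn. The following is only a model of that precedence, not how the
Distutils stores options internally; the function name ``effective_options``
and the sample values are assumptions.

```python
def effective_options(setup_script, config_files, command_line):
    """Merge option layers so that later sources override earlier ones."""
    merged = dict(setup_script)
    merged.update(config_files)   # setup.cfg overrides the setup script
    merged.update(command_line)   # the command line overrides everything
    return merged


result = effective_options(
    {"inplace": "0", "include_dirs": "/usr/include"},  # from setup.py
    {"inplace": "1"},                                  # from setup.cfg
    {"include_dirs": "/opt/include"},                  # from the command line
)
```

Here ``result`` ends up with ``inplace`` taken from the configuration file and
``include_dirs`` taken from the command line.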
The basic syntax of the configuration file is simple::
[command]
option=value
...
where *command* is one of the Distutils commands (e.g. :command:`build_py`,
:command:`install`), and *option* is one of the options that command supports.
Any number of options can be supplied for each command, and any number of
command sections can be included in the file. Blank lines are ignored, as are
comments, which run from a ``'#'`` character until the end of the line. Long
option values can be split across multiple lines simply by indenting the
continuation lines.
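To see the syntax in action, a file of this shape can be read with the standard
library's ``configparser`` module (``ConfigParser`` on Python 2). This is an
approximation for illustration, assuming Python 3; it only handles comments
that occupy a whole line, and the helper name ``read_command_options`` and the
sample values are made up.

```python
import configparser

SAMPLE = """\
# Options for building extensions
[build_ext]
inplace=1

[install]
optimize=1
"""


def read_command_options(text):
    """Return {command: {option: value}} for a setup.cfg-style string."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    return {section: dict(parser.items(section))
            for section in parser.sections()}


options = read_command_options(SAMPLE)
```

Every value comes back as a string; interpreting ``1`` as "true" is left to the
command that consumes the option.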
You can find out the list of options supported by a particular command with the
universal :option:`--help` option, e.g. ::
> python setup.py --help build_ext
[...]
Options for 'build_ext' command:
--build-lib (-b) directory for compiled extension modules
--build-temp (-t) directory for temporary files (build by-products)
--inplace (-i) ignore build-lib and put compiled extensions into the
source directory alongside your pure Python modules
--include-dirs (-I) list of directories to search for header files
--define (-D) C preprocessor macros to define
--undef (-U) C preprocessor macros to undefine
--swig-opts list of SWIG command line options
[...]
Note that an option spelled :option:`--foo-bar` on the command-line is spelled
:option:`foo_bar` in configuration files.
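The spelling rule is mechanical, as this small sketch shows. The helper name
``config_name`` is hypothetical and exists only for this example.

```python
def config_name(command_line_option):
    """Translate a command-line spelling into its configuration-file spelling."""
    return command_line_option.lstrip("-").replace("-", "_")


translated = [config_name(opt) for opt in ("--foo-bar", "--include-dirs", "--inplace")]
```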
For example, say you want your extensions to be built "in-place"---that is, you
have an extension :mod:`pkg.ext`, and you want the compiled extension file
(:file:`ext.so` on Unix, say) to be put in the same source directory as your
pure Python modules :mod:`pkg.mod1` and :mod:`pkg.mod2`. You can always use the
:option:`--inplace` option on the command-line to ensure this::
python setup.py build_ext --inplace
But this requires that you always specify the :command:`build_ext` command
explicitly, and remember to provide :option:`--inplace`. An easier way is to
"set and forget" this option, by encoding it in :file:`setup.cfg`, the
configuration file for this distribution::
[build_ext]
inplace=1
This will affect all builds of this module distribution, whether or not you
explicitly specify :command:`build_ext`. If you include :file:`setup.cfg` in
your source distribution, it will also affect end-user builds---which is
probably a bad idea for this option, since always building extensions in-place
would break installation of the module distribution. In certain peculiar cases,
though, modules are built right in their installation directory, so this is
conceivably a useful ability. (Distributing extensions that expect to be built
in their installation directory is almost always a bad idea, though.)
Another example: certain commands take a lot of options that don't change from
run to run; for example, :command:`bdist_rpm` needs to know everything required
to generate a "spec" file for creating an RPM distribution. Some of this
information comes from the setup script, and some is automatically generated by
the Distutils (such as the list of files installed). But some of it has to be
supplied as options to :command:`bdist_rpm`, which would be very tedious to do
on the command-line for every run. Hence, here is a snippet from the Distutils'
own :file:`setup.cfg`::
[bdist_rpm]
release = 1
packager = Greg Ward <gward@python.net>
doc_files = CHANGES.txt
            README.txt
            USAGE.txt
            doc/
            examples/
Note that the :option:`doc_files` option is simply a whitespace-separated string
split across multiple lines for readability.
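A quick way to convince yourself of this is to read such a snippet with the
standard ``configparser`` module and split the value on whitespace. This is a
sketch assuming Python 3; it mimics the effect rather than reproducing the
Distutils' own option handling.

```python
import configparser

SNIPPET = """\
[bdist_rpm]
release = 1
doc_files = CHANGES.txt
    README.txt
    USAGE.txt
    doc/
    examples/
"""

parser = configparser.RawConfigParser()
parser.read_string(SNIPPET)

# Indented continuation lines are joined into one value, which is then
# split on any whitespace, newlines included.
doc_files = parser.get("bdist_rpm", "doc_files").split()
```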
.. seealso::
:ref:`inst-config-syntax` in "Installing Python Modules"
More information on the configuration files is available in the manual for
system administrators.
.. rubric:: Footnotes
.. [#] This ideal probably won't be achieved until auto-configuration is fully
supported by the Distutils.

@@ -0,0 +1,241 @@
.. _examples:
********
Examples
********
This chapter provides a number of basic examples to help get started with
distutils. Additional information about using distutils can be found in the
Distutils Cookbook.
.. seealso::
`Distutils Cookbook <http://wiki.python.org/moin/Distutils/Cookbook>`_
Collection of recipes showing how to achieve more control over distutils.
.. _pure-mod:
Pure Python distribution (by module)
====================================
If you're just distributing a couple of modules, especially if they don't live
in a particular package, you can specify them individually using the
:option:`py_modules` option in the setup script.
In the simplest case, you'll have two files to worry about: a setup script and
the single module you're distributing, :file:`foo.py` in this example::
<root>/
setup.py
foo.py
(In all diagrams in this section, *<root>* will refer to the distribution root
directory.) A minimal setup script to describe this situation would be::
from distutils.core import setup
setup(name='foo',
version='1.0',
py_modules=['foo'],
)
Note that the name of the distribution is specified independently with the
:option:`name` option, and there's no rule that says it has to be the same as
the name of the sole module in the distribution (although that's probably a good
convention to follow). However, the distribution name is used to generate
filenames, so you should stick to letters, digits, underscores, and hyphens.
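If you want to check a name against this advice before releasing, a regular
expression is enough. The helper below is a hypothetical convenience, not
something the Distutils provides or enforces.

```python
import re

_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\Z")


def is_safe_distribution_name(name):
    """True if *name* uses only letters, digits, underscores and hyphens."""
    return bool(_SAFE_NAME.match(name))


checks = {name: is_safe_distribution_name(name)
          for name in ("foobar", "foo-bar_2", "foo bar", "")}
```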
Since :option:`py_modules` is a list, you can of course specify multiple
modules, e.g. if you're distributing modules :mod:`foo` and :mod:`bar`, your
setup might look like this::
<root>/
setup.py
foo.py
bar.py
and the setup script might be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
py_modules=['foo', 'bar'],
)
You can put module source files into another directory, but if you have enough
modules to do that, it's probably easier to specify modules by package rather
than listing them individually.
.. _pure-pkg:
Pure Python distribution (by package)
=====================================
If you have more than a couple of modules to distribute, especially if they are
in multiple packages, it's probably easier to specify whole packages rather than
individual modules. This works even if your modules are not in a package; you
can just tell the Distutils to process modules from the root package, and that
works the same as any other package (except that you don't have to have an
:file:`__init__.py` file).
The setup script from the last example could also be written as ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
packages=[''],
)
(The empty string stands for the root package.)
If those two files are moved into a subdirectory, but remain in the root
package, e.g.::
<root>/
setup.py
src/
    foo.py
    bar.py
then you would still specify the root package, but you have to tell the
Distutils where source files in the root package live::
from distutils.core import setup
setup(name='foobar',
version='1.0',
package_dir={'': 'src'},
packages=[''],
)
More typically, though, you will want to distribute multiple modules in the same
package (or in sub-packages). For example, if the :mod:`foo` and :mod:`bar`
modules belong in package :mod:`foobar`, one way to lay out your source tree is
::
<root>/
setup.py
foobar/
    __init__.py
    foo.py
    bar.py
This is in fact the default layout expected by the Distutils, and the one that
requires the least work to describe in your setup script::
from distutils.core import setup
setup(name='foobar',
version='1.0',
packages=['foobar'],
)
If you want to put modules in directories not named for their package, then you
need to use the :option:`package_dir` option again. For example, if the
:file:`src` directory holds modules in the :mod:`foobar` package::
<root>/
setup.py
src/
    __init__.py
    foo.py
    bar.py
an appropriate setup script would be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
package_dir={'foobar': 'src'},
packages=['foobar'],
)
Or, you might put modules from your main package right in the distribution
root::
<root>/
setup.py
__init__.py
foo.py
bar.py
in which case your setup script would be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
package_dir={'foobar': ''},
packages=['foobar'],
)
(The empty string also stands for the current directory.)
If you have sub-packages, they must be explicitly listed in :option:`packages`,
but any entries in :option:`package_dir` automatically extend to sub-packages.
(In other words, the Distutils does *not* scan your source tree, trying to
figure out which directories correspond to Python packages by looking for
:file:`__init__.py` files.) Thus, if the default layout grows a sub-package::
<root>/
setup.py
foobar/
    __init__.py
    foo.py
    bar.py
    subfoo/
        __init__.py
        blah.py
then the corresponding setup script would be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
packages=['foobar', 'foobar.subfoo'],
)
(Again, the empty string in :option:`package_dir` stands for the current
directory.)
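The way :option:`package_dir` entries extend to sub-packages can be sketched as
a lookup that walks from the full package name towards the root package. This
is a simplified model written for this section, not the Distutils source; the
function name ``package_directory`` is an assumption, and paths are joined with
``/`` for readability.

```python
def package_directory(package, package_dir):
    """Return the source directory for *package*, relative to <root>."""
    parts = package.split(".") if package else []
    tail = []
    while parts:
        key = ".".join(parts)
        if key in package_dir:
            # An entry for this package (or a parent) was found; the
            # remaining sub-package names are appended to its directory.
            return "/".join(p for p in [package_dir[key]] + tail if p)
        tail.insert(0, parts.pop())
    # No entry matched: fall back to the root package's directory.
    return "/".join(p for p in [package_dir.get("", "")] + tail if p)


default_layout = package_directory("foobar.subfoo", {})
remapped = package_directory("foobar.subfoo", {"foobar": "src"})
```

An empty result stands for the distribution root itself, matching the
convention that the empty string means the current directory.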
.. _single-ext:
Single extension module
=======================
Extension modules are specified using the :option:`ext_modules` option.
:option:`package_dir` has no effect on where extension source files are found;
it only affects the source for pure Python modules. The simplest case, a
single extension module in a single C source file, is::
<root>/
setup.py
foo.c
If the :mod:`foo` extension belongs in the root package, the setup script for
this could be ::
from distutils.core import setup
from distutils.extension import Extension
setup(name='foobar',
version='1.0',
ext_modules=[Extension('foo', ['foo.c'])],
)
If the extension actually belongs in a package, say :mod:`foopkg`, then with
exactly the same source tree layout, this extension can be put in the
:mod:`foopkg` package simply by changing the name of the extension::
from distutils.core import setup
from distutils.extension import Extension
setup(name='foobar',
version='1.0',
ext_modules=[Extension('foopkg.foo', ['foo.c'])],
)
.. % \section{Multiple extension modules}
.. % \label{multiple-ext}
.. % \section{Putting it all together}

@@ -0,0 +1,96 @@
.. _extending-distutils:
*******************
Extending Distutils
*******************
Distutils can be extended in various ways. Most extensions take the form of new
commands or replacements for existing commands. New commands may be written to
support new types of platform-specific packaging, for example, while
replacements for existing commands may be made to modify details of how the
command operates on a package.
Most extensions of the distutils are made within :file:`setup.py` scripts that
want to modify existing commands; many simply add a few file extensions that
should be copied into packages in addition to :file:`.py` files as a
convenience.
Most distutils command implementations are subclasses of the :class:`Command`
class from :mod:`distutils.cmd`. New commands may directly inherit from
:class:`Command`, while replacements often derive from :class:`Command`
indirectly, directly subclassing the command they are replacing. Commands are
required to derive from :class:`Command`.
.. % \section{Extending existing commands}
.. % \label{extend-existing}
.. % \section{Writing new commands}
.. % \label{new-commands}
.. % \XXX{Would an uninstall command be a good example here?}
Integrating new commands
========================
There are different ways to integrate new command implementations into
distutils. The most difficult is to lobby for the inclusion of the new features
in distutils itself, and wait for (and require) a version of Python that
provides that support. This is really hard for many reasons.
The most common, and possibly the most reasonable for most needs, is to include
the new implementations with your :file:`setup.py` script, and cause the
:func:`distutils.core.setup` function to use them::
    from distutils.command.build_py import build_py as _build_py
    from distutils.core import setup

    class build_py(_build_py):
        """Specialized Python source builder."""

        # implement whatever needs to be different...

    setup(cmdclass={'build_py': build_py},
          ...)
This approach is most valuable if the new implementations are required in order
to use a particular package, since everyone interested in the package will then
need the new command implementation.
Beginning with Python 2.4, a third option is available, intended to allow new
commands to be added which can support existing :file:`setup.py` scripts without
requiring modifications to the Python installation. This is expected to allow
third-party extensions to provide support for additional packaging systems, but
the commands can be used for anything distutils commands can be used for. A new
configuration option, :option:`command_packages` (command-line option
:option:`--command-packages`), can be used to specify additional packages to be
searched for modules implementing commands. Like all distutils options, this
can be specified on the command line or in a configuration file. This option
can only be set in the ``[global]`` section of a configuration file, or before
any commands on the command line. If set in a configuration file, it can be
overridden from the command line; setting it to an empty string on the command
line causes the default to be used. This should never be set in a configuration
file provided with a package.
This new option can be used to add any number of packages to the list of
packages searched for command implementations; multiple package names should be
separated by commas. When not specified, the search is only performed in the
:mod:`distutils.command` package. When :file:`setup.py` is run with the option
:option:`--command-packages` :option:`distcmds,buildcmds`, however, the packages
:mod:`distutils.command`, :mod:`distcmds`, and :mod:`buildcmds` will be searched
in that order. New commands are expected to be implemented in modules of the
same name as the command by classes sharing the same name. Given the example
command line option above, the command :command:`bdist_openpkg` could be
implemented by the class :class:`distcmds.bdist_openpkg.bdist_openpkg` or
:class:`buildcmds.bdist_openpkg.bdist_openpkg`.
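The search order and naming convention just described can be written down as a
short function that lists the candidate classes for a command. It only builds
dotted names and imports nothing; the name ``candidate_classes`` is a
hypothetical helper for this illustration.

```python
def candidate_classes(command, command_packages=""):
    """List dotted class paths in the order they would be searched."""
    packages = ["distutils.command"]  # always searched first
    packages += [p.strip() for p in command_packages.split(",") if p.strip()]
    # The module and the class both share the command's name.
    return ["%s.%s.%s" % (package, command, command) for package in packages]


candidates = candidate_classes("bdist_openpkg", "distcmds,buildcmds")
```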
Adding new distribution types
=============================
Commands that create distributions (files in the :file:`dist/` directory) need
to add ``(command, filename)`` pairs to ``self.distribution.dist_files`` so that
:command:`upload` can upload them to PyPI. The *filename* in the pair contains no
path information, only the name of the file itself. In dry-run mode, pairs
should still be added to represent what would have been created.

@@ -0,0 +1,31 @@
.. _distutils-index:
###############################
Distributing Python Modules
###############################
:Authors: Greg Ward, Anthony Baxter
:Email: distutils-sig@python.org
:Release: |version|
:Date: |today|
This document describes the Python Distribution Utilities ("Distutils") from
the module developer's point of view, describing how to use the Distutils to
make Python modules and extensions easily available to a wider audience with
very little overhead for build/release/install mechanics.
.. toctree::
:maxdepth: 2
:numbered:
introduction.rst
setupscript.rst
configfile.rst
sourcedist.rst
builtdist.rst
packageindex.rst
uploading.rst
examples.rst
extending.rst
commandref.rst
apiref.rst

@@ -0,0 +1,208 @@
.. _distutils-intro:
****************************
An Introduction to Distutils
****************************
This document covers using the Distutils to distribute your Python modules,
concentrating on the role of developer/distributor: if you're looking for
information on installing Python modules, you should refer to the
:ref:`install-index` chapter.
.. _distutils-concepts:
Concepts & Terminology
======================
Using the Distutils is quite simple, both for module developers and for
users/administrators installing third-party modules. As a developer, your
responsibilities (apart from writing solid, well-documented and well-tested
code, of course!) are:
* write a setup script (:file:`setup.py` by convention)
* (optional) write a setup configuration file
* create a source distribution
* (optional) create one or more built (binary) distributions
Each of these tasks is covered in this document.
Not all module developers have access to a multitude of platforms, so it's not
always feasible to expect them to create a multitude of built distributions. It
is hoped that a class of intermediaries, called *packagers*, will arise to
address this need. Packagers will take source distributions released by module
developers, build them on one or more platforms, and release the resulting built
distributions. Thus, users on the most popular platforms will be able to
install most popular Python module distributions in the most natural way for
their platform, without having to run a single setup script or compile a line of
code.
.. _distutils-simple-example:
A Simple Example
================
The setup script is usually quite simple. Because it's written in Python, there
are no arbitrary limits to what you can do with it, but you should be careful
about putting arbitrarily expensive operations in your setup script.
Unlike, say, Autoconf-style configure scripts, the setup script may be run
multiple times in the course of building and installing your module
distribution.
If all you want to do is distribute a module called :mod:`foo`, contained in a
file :file:`foo.py`, then your setup script can be as simple as this::
from distutils.core import setup
setup(name='foo',
version='1.0',
py_modules=['foo'],
)
Some observations:
* most information that you supply to the Distutils is supplied as keyword
arguments to the :func:`setup` function
* those keyword arguments fall into two categories: package metadata (name,
version number) and information about what's in the package (a list of pure
Python modules, in this case)
* modules are specified by module name, not filename (the same will hold true
for packages and extensions)
* it's recommended that you supply a little more metadata, in particular your
name, email address and a URL for the project (see section :ref:`setup-script`
for an example)
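One way to act on the last observation is to keep the keyword arguments in a
dictionary and check them before calling :func:`setup`. The check below is a
hypothetical convenience, not part of the Distutils; the author details and URL
are placeholders.

```python
RECOMMENDED = ("name", "version", "author", "author_email", "url")


def missing_metadata(setup_kwargs):
    """Return the recommended metadata keys that are absent or empty."""
    return [key for key in RECOMMENDED if not setup_kwargs.get(key)]


minimal = {"name": "foo", "version": "1.0", "py_modules": ["foo"]}
fuller = dict(minimal,
              author="A. N. Author",              # placeholder values
              author_email="author@example.com",
              url="http://example.com/foo/")

still_missing = missing_metadata(minimal)
```

The dictionary would then be passed on as ``setup(**fuller)`` in the setup
script.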
To create a source distribution for this module, you would create a setup
script, :file:`setup.py`, containing the above code, and run::
python setup.py sdist
which will create an archive file (e.g., tarball on Unix, ZIP file on Windows)
containing your setup script :file:`setup.py`, and your module :file:`foo.py`.
The archive file will be named :file:`foo-1.0.tar.gz` (or :file:`.zip`), and
will unpack into a directory :file:`foo-1.0`.
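The naming scheme follows directly from the ``name`` and ``version`` arguments.
A minimal sketch, assuming only the two archive formats mentioned here (the
helper name ``sdist_names`` is made up):

```python
def sdist_names(name, version, archive_format="gztar"):
    """Return (archive file name, directory it unpacks into)."""
    base = "%s-%s" % (name, version)
    extension = {"gztar": ".tar.gz", "zip": ".zip"}[archive_format]
    return base + extension, base


tarball, unpack_dir = sdist_names("foo", "1.0")
```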
If an end-user wishes to install your :mod:`foo` module, all they have to do is
download :file:`foo-1.0.tar.gz` (or :file:`.zip`), unpack it, and---from the
:file:`foo-1.0` directory---run ::
python setup.py install
which will ultimately copy :file:`foo.py` to the appropriate directory for
third-party modules in their Python installation.
This simple example demonstrates some fundamental concepts of the Distutils.
First, both developers and installers have the same basic user interface, i.e.
the setup script. The difference is which Distutils *commands* they use: the
:command:`sdist` command is almost exclusively for module developers, while
:command:`install` is more often for installers (although most developers will
want to install their own code occasionally).
If you want to make things really easy for your users, you can create one or
more built distributions for them. For instance, if you are running on a
Windows machine, and want to make things easy for other Windows users, you can
create an executable installer (the most appropriate type of built distribution
for this platform) with the :command:`bdist_wininst` command. For example::
python setup.py bdist_wininst
will create an executable installer, :file:`foo-1.0.win32.exe`, in the current
directory.
Other useful built distribution formats are RPM, implemented by the
:command:`bdist_rpm` command, Solaris :program:`pkgtool`
(:command:`bdist_pkgtool`), and HP-UX :program:`swinstall`
(:command:`bdist_sdux`). For example, the following command will create an RPM
file called :file:`foo-1.0.noarch.rpm`::
python setup.py bdist_rpm
(The :command:`bdist_rpm` command uses the :command:`rpm` executable, therefore
this has to be run on an RPM-based system such as Red Hat Linux, SuSE Linux, or
Mandrake Linux.)
You can find out what distribution formats are available at any time by running
::
python setup.py bdist --help-formats
.. _python-terms:
General Python terminology
==========================
If you're reading this document, you probably have a good idea of what modules,
extensions, and so forth are. Nevertheless, just to be sure that everyone is
operating from a common starting point, we offer the following glossary of
common Python terms:
module
the basic unit of code reusability in Python: a block of code imported by some
other code. Three types of modules concern us here: pure Python modules,
extension modules, and packages.
pure Python module
a module written in Python and contained in a single :file:`.py` file (and
possibly associated :file:`.pyc` and/or :file:`.pyo` files). Sometimes referred
to as a "pure module."
extension module
a module written in the low-level language of the Python implementation: C/C++
for Python, Java for Jython. Typically contained in a single dynamically
loadable pre-compiled file, e.g. a shared object (:file:`.so`) file for Python
extensions on Unix, a DLL (given the :file:`.pyd` extension) for Python
extensions on Windows, or a Java class file for Jython extensions. (Note that
currently, the Distutils only handles C/C++ extensions for Python.)
package
a module that contains other modules; typically contained in a directory in the
filesystem and distinguished from other directories by the presence of a file
:file:`__init__.py`.
root package
the root of the hierarchy of packages. (This isn't really a package, since it
doesn't have an :file:`__init__.py` file. But we have to call it something.)
The vast majority of the standard library is in the root package, as are many
small, standalone third-party modules that don't belong to a larger module
collection. Unlike regular packages, modules in the root package can be found in
many directories: in fact, every directory listed in ``sys.path`` contributes
modules to the root package.
.. _distutils-term:
Distutils-specific terminology
==============================
The following terms apply more specifically to the domain of distributing Python
modules using the Distutils:
module distribution
a collection of Python modules distributed together as a single downloadable
resource and meant to be installed *en masse*. Examples of some well-known
module distributions are Numeric Python, PyXML, PIL (the Python Imaging
Library), or mxBase. (This would be called a *package*, except that term is
already taken in the Python context: a single module distribution may contain
zero, one, or many Python packages.)
pure module distribution
a module distribution that contains only pure Python modules and packages.
Sometimes referred to as a "pure distribution."
non-pure module distribution
a module distribution that contains at least one extension module. Sometimes
referred to as a "non-pure distribution."
distribution root
the top-level directory of your source tree (or source distribution); the
directory where :file:`setup.py` exists. Generally :file:`setup.py` will be
run from this directory.

@@ -0,0 +1,93 @@
.. _package-index:
**********************************
Registering with the Package Index
**********************************
The Python Package Index (PyPI) holds meta-data describing distributions
packaged with distutils. The distutils command :command:`register` is used to
submit your distribution's meta-data to the index. It is invoked as follows::
python setup.py register
Distutils will respond with the following prompt::
running register
We need to know who you are, so please choose either:
1. use your existing login,
2. register as a new user,
3. have the server generate a new password for you (and email it to you), or
4. quit
Your selection [default 1]:
Note: if your username and password are saved locally, you will not see this
menu.
If you have not registered with PyPI, then you will need to do so now. You
should choose option 2, and enter your details as required. Soon after
submitting your details, you will receive an email which will be used to confirm
your registration.
Once you are registered, you may choose option 1 from the menu. You will be
prompted for your PyPI username and password, and :command:`register` will then
submit your meta-data to the index.
You may submit any number of versions of your distribution to the index. If you
alter the meta-data for a particular version, you may submit it again and the
index will be updated.
PyPI holds a record for each (name, version) combination submitted. The first
user to submit information for a given name is designated the Owner of that
name. They may submit changes through the :command:`register` command or through
the web interface. They may also designate other users as Owners or Maintainers.
Maintainers may edit the package information, but not designate other Owners or
Maintainers.
By default PyPI will list all versions of a given package. To hide certain
versions, the Hidden property should be set to yes. This must be edited through
the web interface.
.. _pypirc:
The .pypirc file
================
The format of the :file:`.pypirc` file is as follows::
    [distutils]
    index-servers =
        pypi
    [pypi]
    repository: <repository-url>
    username: <username>
    password: <password>
*repository* can be omitted and defaults to ``http://www.python.org/pypi``.
If you want to define another server a new section can be created::
    [distutils]
    index-servers =
        pypi
        other
    [pypi]
    repository: <repository-url>
    username: <username>
    password: <password>
    [other]
    repository: http://example.com/pypi
    username: <username>
    password: <password>
The command can then be called with the -r option::
python setup.py register -r http://example.com/pypi
Or even with the section name::
python setup.py register -r other
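
Because :file:`.pypirc` uses an INI-style layout, its contents can be inspected with Python's standard ``configparser`` module. The following is a minimal sketch of reading the format described above, not the code the Distutils themselves run. The helper name ``read_servers``, the sample user name and password, and the use of Python 3's ``configparser`` are assumptions made for illustration:

```python
import configparser

# Sample file contents; the credentials are placeholders for illustration.
PYPIRC_TEXT = """\
[distutils]
index-servers =
    pypi
    other

[pypi]
username: alice
password: secret

[other]
repository: http://example.com/pypi
username: alice
password: secret
"""

# Default stated in the text above for an omitted "repository" entry.
DEFAULT_REPOSITORY = "http://www.python.org/pypi"


def read_servers(text):
    """Return a dict mapping each listed server name to its settings."""
    # RawConfigParser avoids '%' interpolation, which could mangle passwords.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    names = parser.get("distutils", "index-servers").split()
    servers = {}
    for name in names:
        servers[name] = {
            "repository": parser.get(name, "repository",
                                     fallback=DEFAULT_REPOSITORY),
            "username": parser.get(name, "username"),
        }
    return servers


servers = read_servers(PYPIRC_TEXT)
```

Here ``servers['pypi']`` falls back to the default repository, while ``servers['other']`` carries the URL given in its own section.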


@@ -0,0 +1,669 @@
.. _setup-script:
************************
Writing the Setup Script
************************
The setup script is the centre of all activity in building, distributing, and
installing modules using the Distutils. The main purpose of the setup script is
to describe your module distribution to the Distutils, so that the various
commands that operate on your modules do the right thing. As we saw in section
:ref:`distutils-simple-example` above, the setup script consists mainly of a call to
:func:`setup`, and most information supplied to the Distutils by the module
developer is supplied as keyword arguments to :func:`setup`.
Here's a slightly more involved example, which we'll follow for the next couple
of sections: the Distutils' own setup script. (Keep in mind that although the
Distutils are included with Python 1.6 and later, they also have an independent
existence so that Python 1.5.2 users can use them to install other module
distributions. The Distutils' own setup script, shown here, is used to install
the package into Python 1.5.2.) ::
    #!/usr/bin/env python
    from distutils.core import setup
    setup(name='Distutils',
          version='1.0',
          description='Python Distribution Utilities',
          author='Greg Ward',
          author_email='gward@python.net',
          url='http://www.python.org/sigs/distutils-sig/',
          packages=['distutils', 'distutils.command'],
          )
There are only two differences between this and the trivial one-file
distribution presented in section :ref:`distutils-simple-example`: more metadata, and the
specification of pure Python modules by package, rather than by module. This is
important since the Distutils consist of a couple of dozen modules split into
(so far) two packages; an explicit list of every module would be tedious to
generate and difficult to maintain. For more information on the additional
meta-data, see section :ref:`meta-data`.
Note that any pathnames (files or directories) supplied in the setup script
should be written using the Unix convention, i.e. slash-separated. The
Distutils will take care of converting this platform-neutral representation into
whatever is appropriate on your current platform before actually using the
pathname. This makes your setup script portable across operating systems, which
of course is one of the major goals of the Distutils. In this spirit, all
pathnames in this document are slash-separated.
This, of course, only applies to pathnames given to Distutils functions. If
you, for example, use standard Python functions such as :func:`glob.glob` or
:func:`os.listdir` to specify files, you should be careful to write portable
code instead of hardcoding path separators::
glob.glob(os.path.join('mydir', 'subdir', '*.html'))
os.listdir(os.path.join('mydir', 'subdir'))
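
To illustrate the conversion described above, a slash-separated pathname can be turned into the native form by splitting it on ``/`` and rejoining the pieces with :func:`os.path.join`. This is only a sketch of the idea; ``to_native_path`` is a hypothetical helper written for this example, not part of the Distutils API:

```python
import os


def to_native_path(pathname):
    """Convert a slash-separated relative pathname to the local convention."""
    if pathname.startswith("/") or pathname.endswith("/"):
        raise ValueError(
            "expected a relative path without a trailing slash: %r" % pathname)
    # Drop '.' components so 'a/./b' and 'a/b' give the same result.
    parts = [part for part in pathname.split("/") if part != "."]
    if not parts:
        return os.curdir
    return os.path.join(*parts)


native = to_native_path("mydir/subdir/index.html")
```

On Unix the result is unchanged; on a platform with a different separator, the components are joined with that separator instead.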
.. _listing-packages:
Listing whole packages
======================
The :option:`packages` option tells the Distutils to process (build, distribute,
install, etc.) all pure Python modules found in each package mentioned in the
:option:`packages` list. In order to do this, of course, there has to be a
correspondence between package names and directories in the filesystem. The
default correspondence is the most obvious one, i.e. package :mod:`distutils` is
found in the directory :file:`distutils` relative to the distribution root.
Thus, when you say ``packages = ['foo']`` in your setup script, you are
promising that the Distutils will find a file :file:`foo/__init__.py` (which
might be spelled differently on your system, but you get the idea) relative to
the directory where your setup script lives. If you break this promise, the
Distutils will issue a warning but still process the broken package anyway.
If you use a different convention to lay out your source directory, that's no
problem: you just have to supply the :option:`package_dir` option to tell the
Distutils about your convention. For example, say you keep all Python source
under :file:`lib`, so that modules in the "root package" (i.e., not in any
package at all) are in :file:`lib`, modules in the :mod:`foo` package are in
:file:`lib/foo`, and so forth. Then you would put ::
package_dir = {'': 'lib'}
in your setup script. The keys to this dictionary are package names, and an
empty package name stands for the root package. The values are directory names
relative to your distribution root. In this case, when you say ``packages =
['foo']``, you are promising that the file :file:`lib/foo/__init__.py` exists.
Another possible convention is to put the :mod:`foo` package right in
:file:`lib`, the :mod:`foo.bar` package in :file:`lib/bar`, etc. This would be
written in the setup script as ::
package_dir = {'foo': 'lib'}
A ``package: dir`` entry in the :option:`package_dir` dictionary implicitly
applies to all packages below *package*, so the :mod:`foo.bar` case is
automatically handled here. In this example, having ``packages = ['foo',
'foo.bar']`` tells the Distutils to look for :file:`lib/__init__.py` and
:file:`lib/bar/__init__.py`. (Keep in mind that although :option:`package_dir`
applies recursively, you must explicitly list all packages in
:option:`packages`: the Distutils will *not* recursively scan your source tree
looking for any directory with an :file:`__init__.py` file.)
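
The lookup rules for :option:`package_dir` can be sketched in a few lines of Python. The function below is a hypothetical illustration of the rules as described in this section (longest matching package prefix wins, and the empty string stands for the root package); it is not the Distutils implementation:

```python
import os


def package_directory(package, package_dir):
    """Return the source directory for a dotted package name.

    'package_dir' maps package names to directories relative to the
    distribution root, as in the setup script option of the same name.
    """
    parts = package.split(".")
    tail = []
    # Try the longest prefix first: 'foo.bar', then 'foo'.
    while parts:
        prefix = ".".join(parts)
        if prefix in package_dir:
            return os.path.join(package_dir[prefix], *tail)
        tail.insert(0, parts.pop())
    # No explicit entry matched: fall back to the root package directory.
    return os.path.join(package_dir.get("", ""), *tail)


default_layout = package_directory("distutils", {})
root_in_lib = package_directory("foo", {"": "lib"})
foo_in_lib = package_directory("foo.bar", {"foo": "lib"})
```

With these inputs, ``default_layout`` is :file:`distutils`, ``root_in_lib`` is :file:`lib/foo`, and ``foo_in_lib`` is :file:`lib/bar` (written with the local path separator).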
.. _listing-modules:
Listing individual modules
==========================
For a small module distribution, you might prefer to list all modules rather
than listing packages---especially in the case of a single module that goes in the
"root package" (i.e., no package at all). This simplest case was shown in
section :ref:`distutils-simple-example`; here is a slightly more involved example::
py_modules = ['mod1', 'pkg.mod2']
This describes two modules, one of them in the "root" package, the other in the
:mod:`pkg` package. Again, the default package/directory layout implies that
these two modules can be found in :file:`mod1.py` and :file:`pkg/mod2.py`, and
that :file:`pkg/__init__.py` exists as well. And again, you can override the
package/directory correspondence using the :option:`package_dir` option.
.. _describing-extensions:
Describing extension modules
============================
Just as writing Python extension modules is a bit more complicated than writing
pure Python modules, describing them to the Distutils is a bit more complicated.
Unlike pure modules, it's not enough just to list modules or packages and expect
the Distutils to go out and find the right files; you have to specify the
extension name, source file(s), and any compile/link requirements (include
directories, libraries to link with, etc.).
.. XXX read over this section
All of this is done through another keyword argument to :func:`setup`, the
:option:`ext_modules` option. :option:`ext_modules` is just a list of
:class:`Extension` instances, each of which describes a single extension module.
Suppose your distribution includes a single extension, called :mod:`foo` and
implemented by :file:`foo.c`. If no additional instructions to the
compiler/linker are needed, describing this extension is quite simple::
Extension('foo', ['foo.c'])
The :class:`Extension` class can be imported from :mod:`distutils.core` along
with :func:`setup`. Thus, the setup script for a module distribution that
contains only this one extension and nothing else might be::
from distutils.core import setup, Extension
setup(name='foo',
version='1.0',
ext_modules=[Extension('foo', ['foo.c'])],
)
The :class:`Extension` class (actually, the underlying extension-building
machinery implemented by the :command:`build_ext` command) supports a great deal
of flexibility in describing Python extensions, which is explained in the
following sections.
Extension names and packages
----------------------------
The first argument to the :class:`Extension` constructor is always the name of
the extension, including any package names. For example, ::
Extension('foo', ['src/foo1.c', 'src/foo2.c'])
describes an extension that lives in the root package, while ::
Extension('pkg.foo', ['src/foo1.c', 'src/foo2.c'])
describes the same extension in the :mod:`pkg` package. The source files and
resulting object code are identical in both cases; the only difference is where
in the filesystem (and therefore where in Python's namespace hierarchy) the
resulting extension lives.
If you have a number of extensions all in the same package (or all under the
same base package), use the :option:`ext_package` keyword argument to
:func:`setup`. For example, ::
setup(...,
ext_package='pkg',
ext_modules=[Extension('foo', ['foo.c']),
Extension('subpkg.bar', ['bar.c'])],
)
will compile :file:`foo.c` to the extension :mod:`pkg.foo`, and :file:`bar.c` to
:mod:`pkg.subpkg.bar`.
Extension source files
----------------------
The second argument to the :class:`Extension` constructor is a list of source
files. Since the Distutils currently only support C, C++, and Objective-C
extensions, these are normally C/C++/Objective-C source files. (Be sure to use
appropriate extensions to distinguish C++\ source files: :file:`.cc` and
:file:`.cpp` seem to be recognized by both Unix and Windows compilers.)
However, you can also include SWIG interface (:file:`.i`) files in the list; the
:command:`build_ext` command knows how to deal with SWIG extensions: it will run
SWIG on the interface file and compile the resulting C/C++ file into your
extension.
**\*\*** SWIG support is rough around the edges and largely untested! **\*\***
This warning notwithstanding, options to SWIG can currently be passed like
this::
    setup(...,
          ext_modules=[Extension('_foo', ['foo.i'],
                                 swig_opts=['-modern', '-I../include'])],
          py_modules=['foo'],
          )
Or on the command line like this::
    > python setup.py build_ext --swig-opts="-modern -I../include"
On some platforms, you can include non-source files that are processed by the
compiler and included in your extension. Currently, this just means Windows
message text (:file:`.mc`) files and resource definition (:file:`.rc`) files for
Visual C++. These will be compiled to binary resource (:file:`.res`) files and
linked into the executable.
Preprocessor options
--------------------
Three optional arguments to :class:`Extension` will help if you need to specify
include directories to search or preprocessor macros to define/undefine:
``include_dirs``, ``define_macros``, and ``undef_macros``.
For example, if your extension requires header files in the :file:`include`
directory under your distribution root, use the ``include_dirs`` option::
Extension('foo', ['foo.c'], include_dirs=['include'])
You can specify absolute directories there; if you know that your extension will
only be built on Unix systems with X11R6 installed to :file:`/usr`, you can get
away with ::
Extension('foo', ['foo.c'], include_dirs=['/usr/include/X11'])
You should avoid this sort of non-portable usage if you plan to distribute your
code: it's probably better to write C code like ::
#include <X11/Xlib.h>
If you need to include header files from some other Python extension, you can
take advantage of the fact that header files are installed in a consistent way
by the Distutils :command:`install_headers` command. For example, the Numerical
Python header files are installed (on a standard Unix installation) to
:file:`/usr/local/include/python1.5/Numerical`. (The exact location will differ
according to your platform and Python installation.) Since the Python include
directory---\ :file:`/usr/local/include/python1.5` in this case---is always
included in the search path when building Python extensions, the best approach
is to write C code like ::
#include <Numerical/arrayobject.h>
If you must put the :file:`Numerical` include directory right into your header
search path, though, you can find that directory using the Distutils
:mod:`distutils.sysconfig` module::
    import os
    from distutils.sysconfig import get_python_inc
    incdir = os.path.join(get_python_inc(plat_specific=1), 'Numerical')
    setup(...,
          ext_modules=[Extension(..., include_dirs=[incdir])],
          )
Even though this is quite portable---it will work on any Python installation,
regardless of platform---it's probably easier to just write your C code in the
sensible way.
You can define and undefine pre-processor macros with the ``define_macros`` and
``undef_macros`` options. ``define_macros`` takes a list of ``(name, value)``
tuples, where ``name`` is the name of the macro to define (a string) and
``value`` is its value: either a string or ``None``. (Defining a macro ``FOO``
to ``None`` is the equivalent of a bare ``#define FOO`` in your C source: with
most compilers, this sets ``FOO`` to the string ``1``.) ``undef_macros`` is
just a list of macros to undefine.
For example::
Extension(...,
define_macros=[('NDEBUG', '1'),
('HAVE_STRFTIME', None)],
undef_macros=['HAVE_FOO', 'HAVE_BAR'])
is the equivalent of having this at the top of every C source file::
#define NDEBUG 1
#define HAVE_STRFTIME
#undef HAVE_FOO
#undef HAVE_BAR
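
On a Unix-style compiler, macro definitions of this kind are conventionally passed as ``-D`` and ``-U`` command-line flags. The sketch below shows one way the two options could be turned into such flags; ``preprocessor_flags`` is a hypothetical helper, and the flag spelling is an assumption that does not hold for every compiler the Distutils support:

```python
def preprocessor_flags(define_macros, undef_macros):
    """Translate macro options into Unix-style compiler flags."""
    flags = []
    for name, value in define_macros:
        if value is None:
            # A macro defined without an explicit value.
            flags.append("-D%s" % name)
        else:
            flags.append("-D%s=%s" % (name, value))
    for name in undef_macros:
        flags.append("-U%s" % name)
    return flags


flags = preprocessor_flags(
    define_macros=[('NDEBUG', '1'), ('HAVE_STRFTIME', None)],
    undef_macros=['HAVE_FOO', 'HAVE_BAR'])
```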
Library options
---------------
You can also specify the libraries to link against when building your extension,
and the directories to search for those libraries. The ``libraries`` option is
a list of libraries to link against, ``library_dirs`` is a list of directories
to search for libraries at link-time, and ``runtime_library_dirs`` is a list of
directories to search for shared (dynamically loaded) libraries at run-time.
For example, if you need to link against libraries known to be in the standard
library search path on target systems ::
Extension(...,
libraries=['gdbm', 'readline'])
If you need to link with libraries in a non-standard location, you'll have to
include the location in ``library_dirs``::
Extension(...,
library_dirs=['/usr/X11R6/lib'],
libraries=['X11', 'Xt'])
(Again, this sort of non-portable construct should be avoided if you intend to
distribute your code.)
**\*\*** Should mention clib libraries here or somewhere else! **\*\***
Other options
-------------
There are still some other options which can be used to handle special cases.
The :option:`extra_objects` option is a list of object files to be passed to the
linker. These files must not have extensions, as the default extension for the
compiler is used.
:option:`extra_compile_args` and :option:`extra_link_args` can be used to
specify additional command line options for the respective compiler and linker
command lines.
:option:`export_symbols` is only useful on Windows. It can contain a list of
symbols (functions or variables) to be exported. This option is not needed when
building compiled extensions: Distutils will automatically add ``initmodule``
to the list of exported symbols.
Relationships between Distributions and Packages
================================================
A distribution may relate to packages in three specific ways:
#. It can require packages or modules.
#. It can provide packages or modules.
#. It can obsolete packages or modules.
These relationships can be specified using keyword arguments to the
:func:`distutils.core.setup` function.
Dependencies on other Python modules and packages can be specified by supplying
the *requires* keyword argument to :func:`setup`. The value must be a list of
strings. Each string specifies a package that is required, and optionally what
versions are sufficient.
To specify that any version of a module or package is required, the string
should consist entirely of the module or package name. Examples include
``'mymodule'`` and ``'xml.parsers.expat'``.
If specific versions are required, a sequence of qualifiers can be supplied in
parentheses. Each qualifier may consist of a comparison operator and a version
number. The accepted comparison operators are::
< > ==
<= >= !=
These can be combined by using multiple qualifiers separated by commas (and
optional whitespace). In this case, all of the qualifiers must be matched; a
logical AND is used to combine the evaluations.
Let's look at a bunch of examples:
+-------------------------+----------------------------------------------+
| Requires Expression | Explanation |
+=========================+==============================================+
| ``==1.0`` | Only version ``1.0`` is compatible |
+-------------------------+----------------------------------------------+
| ``>1.0, !=1.5.1, <2.0`` | Any version after ``1.0`` and before ``2.0`` |
| | is compatible, except ``1.5.1`` |
+-------------------------+----------------------------------------------+
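
The way qualifiers combine can be illustrated with a short Python sketch. It is a simplified model written for this example, not the Distutils implementation: the names ``parse_version`` and ``satisfies`` are hypothetical, and versions are assumed to be purely numeric and compared as tuples, so ``1.0`` and ``1.0.0`` are treated as different versions:

```python
import operator
import re

OPERATORS = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    ">": operator.gt, ">=": operator.ge, "!=": operator.ne,
}

# Two-character operators must be tried before their one-character prefixes.
QUALIFIER_RE = re.compile(r"\s*(<=|>=|==|!=|<|>)\s*(\d+(?:\.\d+)*)\s*$")


def parse_version(text):
    """Turn '1.5.1' into the tuple (1, 5, 1)."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, qualifiers):
    """Return True if 'version' matches every comma-separated qualifier."""
    candidate = parse_version(version)
    for qualifier in qualifiers.split(","):
        match = QUALIFIER_RE.match(qualifier)
        if match is None:
            raise ValueError("bad qualifier: %r" % qualifier)
        op, wanted = match.groups()
        # All qualifiers must hold: a logical AND.
        if not OPERATORS[op](candidate, parse_version(wanted)):
            return False
    return True


ok = satisfies("1.5", ">1.0, !=1.5.1, <2.0")
excluded = satisfies("1.5.1", ">1.0, !=1.5.1, <2.0")
```

Here ``ok`` is true and ``excluded`` is false, matching the second row of the table.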
Now that we can specify dependencies, we also need to be able to specify what we
provide that other distributions can require. This is done using the *provides*
keyword argument to :func:`setup`. The value for this keyword is a list of
strings, each of which names a Python module or package, and optionally
identifies the version. If the version is not specified, it is assumed to match
that of the distribution.
Some examples:
+---------------------+----------------------------------------------+
| Provides Expression | Explanation |
+=====================+==============================================+
| ``mypkg`` | Provide ``mypkg``, using the distribution |
| | version |
+---------------------+----------------------------------------------+
| ``mypkg (1.1)`` | Provide ``mypkg`` version 1.1, regardless of |
| | the distribution version |
+---------------------+----------------------------------------------+
A package can declare that it obsoletes other packages using the *obsoletes*
keyword argument. The value for this is similar to that of the *requires*
keyword: a list of strings giving module or package specifiers. Each specifier
consists of a module or package name optionally followed by one or more version
qualifiers. Version qualifiers are given in parentheses after the module or
package name.
The versions identified by the qualifiers are those that are obsoleted by the
distribution being described. If no qualifiers are given, all versions of the
named module or package are understood to be obsoleted.
Installing Scripts
==================
So far we have been dealing with pure and non-pure Python modules, which are
usually not run by themselves but imported by scripts.
Scripts are files containing Python source code, intended to be started from the
command line. Scripts don't require Distutils to do anything very complicated.
The only clever feature is that if the first line of the script starts with
``#!`` and contains the word "python", the Distutils will adjust the first line
to refer to the current interpreter location. The :option:`--executable` (or
:option:`-e`) option allows the interpreter path to be overridden explicitly.
The :option:`scripts` option is simply a list of files to be handled in this
way. From the PyXML setup script::
setup(...,
scripts=['scripts/xmlproc_parse', 'scripts/xmlproc_val']
)
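
The first-line adjustment described above can be approximated in Python as follows. This is a sketch under stated assumptions: the regular expression and the helper ``adjust_first_line`` are written for illustration and may not match every case the Distutils handle:

```python
import re
import sys

# A '#!' line that mentions "python", optionally followed by options.
FIRST_LINE_RE = re.compile(r"^#!.*python[0-9.]*([ \t].*)?$")


def adjust_first_line(source, executable=sys.executable):
    """Point a script's '#!' line at 'executable', keeping any options."""
    first, newline, rest = source.partition("\n")
    match = FIRST_LINE_RE.match(first)
    if match is None:
        # Not a Python '#!' line: leave the script untouched.
        return source
    options = match.group(1) or ""
    return "#!" + executable + options + newline + rest


script = "#!/usr/bin/env python\nprint('hello')\n"
adjusted = adjust_first_line(script, executable="/opt/python/bin/python")
```

A script whose first line does not mention "python", such as a shell script, is returned unchanged.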
Installing Package Data
=======================
Often, additional files need to be installed into a package. These files are
often data that's closely related to the package's implementation, or text files
containing documentation that might be of interest to programmers using the
package. These files are called :dfn:`package data`.
Package data can be added to packages using the ``package_data`` keyword
argument to the :func:`setup` function. The value must be a mapping from
package name to a list of relative path names that should be copied into the
package. The paths are interpreted as relative to the directory containing the
package (information from the ``package_dir`` mapping is used if appropriate);
that is, the files are expected to be part of the package in the source
directories. They may contain glob patterns as well.
The path names may contain directory portions; any necessary directories will be
created in the installation.
For example, if a package should contain a subdirectory with several data files,
the files can be arranged like this in the source tree::
    setup.py
    src/
        mypkg/
            __init__.py
            module.py
            data/
                tables.dat
                spoons.dat
                forks.dat
The corresponding call to :func:`setup` might be::
setup(...,
packages=['mypkg'],
package_dir={'mypkg': 'src/mypkg'},
package_data={'mypkg': ['data/*.dat']},
)
.. versionadded:: 2.4
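
To show how the patterns in ``package_data`` relate to files on disk, the sketch below expands them relative to a package's source directory using the standard :mod:`glob` module. ``find_package_data`` is a hypothetical helper written for this example; it models the behaviour described above rather than reproducing the Distutils code:

```python
import glob
import os


def find_package_data(package_dir, patterns):
    """Return the files under 'package_dir' matching the given patterns.

    Patterns are slash-separated and relative to 'package_dir'; the
    result is a sorted list of paths relative to 'package_dir'.
    """
    found = []
    for pattern in patterns:
        # Convert the slash-separated pattern to the local convention.
        native = os.path.join(package_dir, *pattern.split("/"))
        for path in glob.glob(native):
            if os.path.isfile(path):
                found.append(os.path.relpath(path, package_dir))
    return sorted(found)
```

For the source tree shown earlier, ``find_package_data('src/mypkg', ['data/*.dat'])`` would list the three :file:`.dat` files in :file:`data`, provided it is run from the distribution root.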
Installing Additional Files
===========================
The :option:`data_files` option can be used to specify additional files needed
by the module distribution: configuration files, message catalogs, data files,
anything which doesn't fit in the previous categories.
:option:`data_files` specifies a sequence of (*directory*, *files*) pairs in the
following way::
setup(...,
data_files=[('bitmaps', ['bm/b1.gif', 'bm/b2.gif']),
('config', ['cfg/data.cfg']),
('/etc/init.d', ['init-script'])]
)
Note that you can specify the directory names where the data files will be
installed, but you cannot rename the data files themselves.
Each (*directory*, *files*) pair in the sequence specifies the installation
directory and the files to install there. If *directory* is a relative path, it
is interpreted relative to the installation prefix (Python's ``sys.prefix`` for
pure-Python packages, ``sys.exec_prefix`` for packages that contain extension
modules). Each file name in *files* is interpreted relative to the
:file:`setup.py` script at the top of the package source distribution. No
directory information from *files* is used to determine the final location of
the installed file; only the name of the file is used.
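
The rules for computing the installed location can be sketched as follows. The helper ``data_file_targets`` is hypothetical, and the example uses :mod:`posixpath` with an assumed prefix of :file:`/usr/local` so that the result does not depend on the platform it runs on:

```python
import posixpath


def data_file_targets(data_files, prefix):
    """Return (source, destination) pairs for a data_files sequence."""
    targets = []
    for directory, files in data_files:
        if posixpath.isabs(directory):
            # Absolute directories are used as given.
            destination = directory
        else:
            # Relative directories are taken relative to the prefix.
            destination = posixpath.join(prefix, directory)
        for filename in files:
            # Only the file's base name is kept; its source directory is not.
            base = posixpath.basename(filename)
            targets.append((filename, posixpath.join(destination, base)))
    return targets


targets = data_file_targets(
    [('bitmaps', ['bm/b1.gif', 'bm/b2.gif']),
     ('config', ['cfg/data.cfg']),
     ('/etc/init.d', ['init-script'])],
    prefix='/usr/local')
```

With these inputs, :file:`bm/b1.gif` is mapped to :file:`/usr/local/bitmaps/b1.gif`, and :file:`init-script` to :file:`/etc/init.d/init-script`.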
You can specify the :option:`data_files` option as a simple sequence of files
without specifying a target directory, but this is not recommended, and the
:command:`install` command will print a warning in this case. To install data
files directly in the target directory, an empty string should be given as the
directory.
.. _meta-data:
Additional meta-data
====================
The setup script may include additional meta-data beyond the name and version.
This information includes:
+----------------------+---------------------------+-----------------+--------+
| Meta-Data | Description | Value | Notes |
+======================+===========================+=================+========+
| ``name`` | name of the package | short string | \(1) |
+----------------------+---------------------------+-----------------+--------+
| ``version`` | version of this release | short string | (1)(2) |
+----------------------+---------------------------+-----------------+--------+
| ``author`` | package author's name | short string | \(3) |
+----------------------+---------------------------+-----------------+--------+
| ``author_email`` | email address of the | email address | \(3) |
| | package author | | |
+----------------------+---------------------------+-----------------+--------+
| ``maintainer`` | package maintainer's name | short string | \(3) |
+----------------------+---------------------------+-----------------+--------+
| ``maintainer_email`` | email address of the | email address | \(3) |
| | package maintainer | | |
+----------------------+---------------------------+-----------------+--------+
| ``url`` | home page for the package | URL | \(1) |
+----------------------+---------------------------+-----------------+--------+
| ``description`` | short, summary | short string | |
| | description of the | | |
| | package | | |
+----------------------+---------------------------+-----------------+--------+
| ``long_description`` | longer description of the | long string | |
| | package | | |
+----------------------+---------------------------+-----------------+--------+
| ``download_url`` | location where the | URL | \(4) |
| | package may be downloaded | | |
+----------------------+---------------------------+-----------------+--------+
| ``classifiers`` | a list of classifiers | list of strings | \(4) |
+----------------------+---------------------------+-----------------+--------+
| ``platforms`` | a list of platforms | list of strings | |
+----------------------+---------------------------+-----------------+--------+
Notes:
(1)
These fields are required.
(2)
It is recommended that versions take the form *major.minor[.patch[.sub]]*.
(3)
Either the author or the maintainer must be identified.
(4)
These fields should not be used if your package is to be compatible with Python
versions prior to 2.2.3 or 2.3. The list of valid classifiers is available from
the `PyPI website <http://pypi.python.org/pypi>`_.
'short string'
A single line of text, not more than 200 characters.
'long string'
Multiple lines of plain text in reStructuredText format (see
http://docutils.sf.net/).
'list of strings'
See below.
None of the string values may be Unicode.
Encoding the version information is an art in itself. Python packages generally
adhere to the version format *major.minor[.patch][sub]*. The major number is 0
for initial, experimental releases of software. It is incremented for releases
that represent major milestones in a package. The minor number is incremented
when important new features are added to the package. The patch number
increments when bug-fix releases are made. Additional trailing version
information is sometimes used to indicate sub-releases. These are
"a1,a2,...,aN" (for alpha releases, where functionality and API may change),
"b1,b2,...,bN" (for beta releases, which only fix bugs) and "pr1,pr2,...,prN"
(for final pre-release testing). Some examples:
0.1.0
the first, experimental release of a package
1.0.1a2
the second alpha release of the first patch version of 1.0
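
The *major.minor[.patch][sub]* convention can be checked mechanically. The following sketch uses a regular expression written for this example; ``parse_version`` is a hypothetical helper, and treating a missing patch number as 0 is an assumption rather than something the convention specifies:

```python
import re

# major.minor, an optional .patch, and an optional a/b/pr sub-release tag.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:(a|b|pr)(\d+))?$")


def parse_version(text):
    """Split a version string into ((major, minor, patch), sub-release)."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    major, minor, patch, tag, serial = match.groups()
    release = (int(major), int(minor), int(patch or 0))
    sub = (tag, int(serial)) if tag else None
    return release, sub


first = parse_version("0.1.0")
alpha = parse_version("1.0.1a2")
```

Here ``first`` is ``((0, 1, 0), None)`` and ``alpha`` is ``((1, 0, 1), ('a', 2))``.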
:option:`classifiers` are specified in a Python list::
    setup(...,
          classifiers=[
              'Development Status :: 4 - Beta',
              'Environment :: Console',
              'Environment :: Web Environment',
              'Intended Audience :: End Users/Desktop',
              'Intended Audience :: Developers',
              'Intended Audience :: System Administrators',
              'License :: OSI Approved :: Python Software Foundation License',
              'Operating System :: MacOS :: MacOS X',
              'Operating System :: Microsoft :: Windows',
              'Operating System :: POSIX',
              'Programming Language :: Python',
              'Topic :: Communications :: Email',
              'Topic :: Office/Business',
              'Topic :: Software Development :: Bug Tracking',
              ],
          )
If you wish to include classifiers in your :file:`setup.py` file and also wish
to remain backwards-compatible with Python releases prior to 2.2.3, then you can
include the following code fragment in your :file:`setup.py` before the
:func:`setup` call. ::
    # patch distutils if it can't cope with the "classifiers" or
    # "download_url" keywords; compare version tuples rather than
    # strings, since string comparison misorders e.g. '2.10' and '2.2.3'
    import sys
    if sys.version_info < (2, 2, 3):
        from distutils.dist import DistributionMetadata
        DistributionMetadata.classifiers = None
        DistributionMetadata.download_url = None
Debugging the setup script
==========================
Sometimes things go wrong, and the setup script doesn't do what the developer
wants.
Distutils catches any exceptions when running the setup script, and prints a
simple error message before the script is terminated. The motivation for this
behaviour is to not confuse administrators who don't know much about Python and
are trying to install a package. If they get a big long traceback from deep
inside the guts of Distutils, they may think the package or the Python
installation is broken because they don't read all the way down to the bottom
and see that it's a permission problem.
On the other hand, this doesn't help the developer to find the cause of the
failure. For this purpose, the DISTUTILS_DEBUG environment variable can be set
to anything except an empty string; distutils will then print detailed
information about what it is doing, and print the full traceback if an
exception occurs.


@@ -0,0 +1,209 @@
.. _source-dist:
******************************
Creating a Source Distribution
******************************
As shown in section :ref:`distutils-simple-example`, you use the :command:`sdist` command
to create a source distribution. In the simplest case, ::
python setup.py sdist
(assuming you haven't specified any :command:`sdist` options in the setup script
or config file), :command:`sdist` creates the archive of the default format for
the current platform. The default format is a gzip'ed tar file
(:file:`.tar.gz`) on Unix, and a ZIP file on Windows.
You can specify as many formats as you like using the :option:`--formats`
option, for example::
python setup.py sdist --formats=gztar,zip
to create a gzipped tarball and a zip file. The available formats are:
+-----------+-------------------------+---------+
| Format | Description | Notes |
+===========+=========================+=========+
| ``zip`` | zip file (:file:`.zip`) | (1),(3) |
+-----------+-------------------------+---------+
| ``gztar`` | gzip'ed tar file | (2),(4) |
| | (:file:`.tar.gz`) | |
+-----------+-------------------------+---------+
| ``bztar`` | bzip2'ed tar file | \(4) |
| | (:file:`.tar.bz2`) | |
+-----------+-------------------------+---------+
| ``ztar`` | compressed tar file | \(4) |
| | (:file:`.tar.Z`) | |
+-----------+-------------------------+---------+
| ``tar`` | tar file (:file:`.tar`) | \(4) |
+-----------+-------------------------+---------+
Notes:
(1)
default on Windows
(2)
default on Unix
(3)
requires either external :program:`zip` utility or :mod:`zipfile` module (part
of the standard Python library since Python 1.6)
(4)
requires external utilities: :program:`tar` and possibly one of :program:`gzip`,
:program:`bzip2`, or :program:`compress`
.. _manifest:
Specifying the files to distribute
==================================
If you don't supply an explicit list of files (or instructions on how to
generate one), the :command:`sdist` command puts a minimal default set into the
source distribution:
* all Python source files implied by the :option:`py_modules` and
:option:`packages` options
* all C source files mentioned in the :option:`ext_modules` or
:option:`libraries` options (
**\*\*** getting C library sources currently broken---no
:meth:`get_source_files` method in :file:`build_clib.py`! **\*\***)
* scripts identified by the :option:`scripts` option
* anything that looks like a test script: :file:`test/test\*.py` (currently, the
Distutils don't do anything with test scripts except include them in source
distributions, but in the future there will be a standard for testing Python
module distributions)
* :file:`README.txt` (or :file:`README`), :file:`setup.py` (or whatever you
called your setup script), and :file:`setup.cfg`
Sometimes this is enough, but usually you will want to specify additional files
to distribute. The typical way to do this is to write a *manifest template*,
called :file:`MANIFEST.in` by default. The manifest template is just a list of
instructions for how to generate your manifest file, :file:`MANIFEST`, which is
the exact list of files to include in your source distribution. The
:command:`sdist` command processes this template and generates a manifest based
on its instructions and what it finds in the filesystem.
If you prefer to roll your own manifest file, the format is simple: one filename
per line, regular files (or symlinks to them) only. If you do supply your own
:file:`MANIFEST`, you must specify everything: the default set of files
described above does not apply in this case.
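As a rough illustration of the hand-written format, the sketch below writes a manifest with one filename per line and keeps only regular files. The helper name ``write_manifest`` is hypothetical and is not part of the Distutils.

```python
import os


def write_manifest(candidates, manifest_path="MANIFEST"):
    """Write one filename per line, keeping only regular files.

    Hypothetical helper for illustration; not a Distutils API.
    """
    # os.path.isfile() follows symlinks, so links to regular files pass too,
    # while directories and missing paths are dropped.
    names = [name for name in candidates if os.path.isfile(name)]
    with open(manifest_path, "w") as manifest:
        for name in names:
            manifest.write(name + "\n")
    return names
```

Because the default file set does not apply to a hand-written manifest, the caller has to pass every file that should be shipped, including the setup script itself.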
The manifest template has one command per line, where each command specifies a
set of files to include or exclude from the source distribution. For an
example, again we turn to the Distutils' own manifest template::
include *.txt
recursive-include examples *.txt *.py
prune examples/sample?/build
The meanings should be fairly clear: include all files in the distribution root
matching :file:`\*.txt`, all files anywhere under the :file:`examples` directory
matching :file:`\*.txt` or :file:`\*.py`, and exclude all directories matching
:file:`examples/sample?/build`. All of this is done *after* the standard
include set, so you can exclude files from the standard set with explicit
instructions in the manifest template. (Or, you can use the
:option:`--no-defaults` option to disable the standard set entirely.) There are
several other commands available in the manifest template mini-language; see
section :ref:`sdist-cmd`.
The order of commands in the manifest template matters: initially, we have the
list of default files as described above, and each command in the template adds
to or removes from that list of files. Once we have fully processed the
manifest template, we remove files that should not be included in the source
distribution:
* all files in the Distutils "build" tree (default :file:`build/`)
* all files in directories named :file:`RCS`, :file:`CVS`, :file:`.svn`,
:file:`.hg`, :file:`.git`, :file:`.bzr` or :file:`_darcs`
Now we have our complete list of files, which is written to the manifest for
future reference, and then used to build the source distribution archive(s).
You can disable the default set of included files with the
:option:`--no-defaults` option, and you can disable the standard exclude set
with :option:`--no-prune`.
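The standard exclude set can be pictured as a final filter over the file list. The following is a minimal sketch, assuming slash-separated relative paths; ``prune_standard`` is a hypothetical name and the real :command:`sdist` implementation may differ in detail.

```python
# Directory names listed in the text above as always excluded.
VCS_DIRS = {"RCS", "CVS", ".svn", ".hg", ".git", ".bzr", "_darcs"}


def prune_standard(files, build_base="build"):
    """Drop files in the build tree or inside a version-control directory."""
    kept = []
    for path in files:
        dirs = path.split("/")[:-1]  # directory components only
        if dirs and dirs[0] == build_base:
            continue  # inside the Distutils build tree
        if VCS_DIRS.intersection(dirs):
            continue  # inside RCS, CVS, .svn, .hg, .git, .bzr or _darcs
        kept.append(path)
    return kept
```

In this sketch, skipping the call altogether corresponds to passing :option:`--no-prune`.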
Following the Distutils' own manifest template, let's trace how the
:command:`sdist` command builds the list of files to include in the Distutils
source distribution:
#. include all Python source files in the :file:`distutils` and
:file:`distutils/command` subdirectories (because packages corresponding to
those two directories were mentioned in the :option:`packages` option in the
setup script---see section :ref:`setup-script`)
#. include :file:`README.txt`, :file:`setup.py`, and :file:`setup.cfg` (standard
files)
#. include :file:`test/test\*.py` (standard files)
#. include :file:`\*.txt` in the distribution root (this will find
:file:`README.txt` a second time, but such redundancies are weeded out later)
#. include anything matching :file:`\*.txt` or :file:`\*.py` in the sub-tree
under :file:`examples`,
#. exclude all files in the sub-trees starting at directories matching
:file:`examples/sample?/build`\ ---this may exclude files included by the
previous two steps, so it's important that the ``prune`` command in the manifest
template comes after the ``recursive-include`` command
#. exclude the entire :file:`build` tree, and any :file:`RCS`, :file:`CVS`,
:file:`.svn`, :file:`.hg`, :file:`.git`, :file:`.bzr` and :file:`_darcs`
directories
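The trace above can be approximated with a small simulation. This is a simplified sketch covering only the ``include``, ``recursive-include`` and ``prune`` commands, using :mod:`fnmatch` for pattern matching; the function names are hypothetical and the matching rules are an approximation of what the Distutils actually do.

```python
import fnmatch
import posixpath


def _is_under(path, dir_pattern):
    """True if path lies inside a directory matching dir_pattern."""
    depth = dir_pattern.count("/") + 1
    parts = path.split("/")
    return (len(parts) > depth
            and fnmatch.fnmatchcase("/".join(parts[:depth]), dir_pattern))


def apply_template(all_files, defaults, commands):
    """Apply template commands in order, starting from the default set."""
    selected = list(defaults)
    for line in commands:
        words = line.split()
        action, args = words[0], words[1:]
        if action == "include":
            # Only files in the distribution root are considered here.
            for pattern in args:
                selected += [f for f in all_files
                             if "/" not in f and fnmatch.fnmatchcase(f, pattern)]
        elif action == "recursive-include":
            prefix = args[0] + "/"
            for pattern in args[1:]:
                selected += [f for f in all_files
                             if f.startswith(prefix)
                             and fnmatch.fnmatchcase(posixpath.basename(f), pattern)]
        elif action == "prune":
            selected = [f for f in selected if not _is_under(f, args[0])]
        else:
            raise ValueError("unsupported command: %s" % action)
    # Weed out duplicates while preserving order.
    result = []
    for name in selected:
        if name not in result:
            result.append(name)
    return result
```

Swapping the ``prune`` and ``recursive-include`` lines in the command list leaves the files under the sample build directories in the result, which is why the order of commands matters.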
Just like in the setup script, file and directory names in the manifest template
should always be slash-separated; the Distutils will take care of converting
them to the standard representation on your platform. That way, the manifest
template is portable across operating systems.
.. _manifest-options:
Manifest-related options
========================
The normal course of operations for the :command:`sdist` command is as follows:
* if the manifest file, :file:`MANIFEST`, doesn't exist, read :file:`MANIFEST.in`
and create the manifest
* if neither :file:`MANIFEST` nor :file:`MANIFEST.in` exists, create a manifest
with just the default file set
* if either :file:`MANIFEST.in` or the setup script (:file:`setup.py`) is more
recent than :file:`MANIFEST`, recreate :file:`MANIFEST` by reading
:file:`MANIFEST.in`
* use the list of files now in :file:`MANIFEST` (either just generated or read
in) to create the source distribution archive(s)
There are a couple of options that modify this behaviour. First, use the
:option:`--no-defaults` and :option:`--no-prune` options to disable the standard
"include" and "exclude" sets.
Second, you might want to force the manifest to be regenerated---for example, if
you have added or removed files or directories that match an existing pattern in
the manifest template, you should regenerate the manifest::
python setup.py sdist --force-manifest
Or, you might just want to (re)generate the manifest, but not create a source
distribution::
python setup.py sdist --manifest-only
:option:`--manifest-only` implies :option:`--force-manifest`. :option:`-o` is a
shortcut for :option:`--manifest-only`, and :option:`-f` for
:option:`--force-manifest`.
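The decision logic described in this section can be sketched as a pure function over file modification times. This is an illustration only: ``manifest_action`` is a hypothetical name, and the handling of a missing :file:`MANIFEST.in` when regeneration is requested is an assumption of the sketch rather than documented behaviour.

```python
def manifest_action(manifest_mtime, template_mtime, setup_mtime, force=False):
    """Decide how sdist obtains its file list.

    Each *_mtime is a modification time, or None if that file is missing
    (the setup script is assumed to exist).  Returns one of
    "reuse", "from-template" or "defaults-only".
    """
    regenerate = (
        force                                  # --force-manifest
        or manifest_mtime is None              # no MANIFEST yet
        or setup_mtime > manifest_mtime        # setup.py is newer
        or (template_mtime is not None
            and template_mtime > manifest_mtime)  # MANIFEST.in is newer
    )
    if not regenerate:
        return "reuse"
    # Assumption: without a template, only the default file set is available.
    return "from-template" if template_mtime is not None else "defaults-only"
```

For example, ``manifest_action(20, 10, 5)`` returns ``"reuse"`` because neither input is newer than the manifest, while passing ``force=True`` turns the same call into ``"from-template"``.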

View File

@@ -0,0 +1,43 @@
.. _package-upload:
***************************************
Uploading Packages to the Package Index
***************************************
.. versionadded:: 2.5
The Python Package Index (PyPI) stores not only the package info but also, if
the package author wishes, the package data. The distutils command
:command:`upload` pushes the distribution files to PyPI.
The command is invoked immediately after building one or more distribution
files. For example, the command ::
python setup.py sdist bdist_wininst upload
will cause the source distribution and the Windows installer to be uploaded to
PyPI. Note that these will be uploaded even if they are built using an earlier
invocation of :file:`setup.py`, but that only distributions named on the command
line for the invocation including the :command:`upload` command are uploaded.
The :command:`upload` command uses the username, password, and repository URL
from the :file:`$HOME/.pypirc` file (see section :ref:`pypirc` for more on this
file).
You can specify another PyPI server with the :option:`--repository=*url*` option::
python setup.py sdist bdist_wininst upload -r http://example.com/pypi
See section :ref:`pypirc` for more on defining several servers.
You can use the :option:`--sign` option to tell :command:`upload` to sign each
uploaded file using GPG (GNU Privacy Guard). The :program:`gpg` program must
be available for execution on the system :envvar:`PATH`. You can also specify
which key to use for signing using the :option:`--identity=*name*` option.
Other :command:`upload` options include :option:`--repository=<url>` or
:option:`--repository=<section>` where *url* is the URL of the server and
*section* the name of the section in :file:`$HOME/.pypirc`, and
:option:`--show-response` (which displays the full response text from the PyPI
server for help in debugging upload problems).
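Since :option:`--repository` accepts either a URL or a section name, the lookup might be pictured as follows. This is a hedged sketch, not the code the :command:`upload` command uses: ``resolve_repository`` is a hypothetical helper, the sample file contents are placeholders, and the layout assumes the ``[distutils]`` / ``index-servers`` convention described in section :ref:`pypirc`.

```python
import configparser

# Placeholder contents standing in for $HOME/.pypirc.
SAMPLE_PYPIRC = """\
[distutils]
index-servers =
    main
    other

[main]
repository = http://example.org/pypi
username = alice
password = not-a-real-password

[other]
repository = http://example.com/pypi
username = bob
password = not-a-real-password
"""


def resolve_repository(value, pypirc_text):
    """Map a --repository value to a URL.

    If value names a section that has a "repository" entry, return that
    entry; otherwise treat value as the URL itself.
    """
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    if parser.has_section(value) and parser.has_option(value, "repository"):
        return parser.get(value, "repository")
    return value
```

With the sample above, ``resolve_repository("other", SAMPLE_PYPIRC)`` yields the URL stored in the ``[other]`` section, while a value that is already a URL is returned unchanged.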

View File

@@ -0,0 +1,202 @@
.. highlightlang:: rest
Differences to the LaTeX markup
===============================
Though the markup language is different, most of the concepts and markup types
of the old LaTeX docs have been kept -- environments as reST directives, inline
commands as reST roles and so forth.
However, there are some differences in the way these work, partly due to the
differences in the markup languages, partly due to improvements in Sphinx. This
section lists these differences, in order to give those familiar with the old
format a quick overview of what they might run into.
Inline markup
-------------
These changes have been made to inline markup:
* **Cross-reference roles**
Most of the following semantic roles existed previously as inline commands,
but did nothing except format the content as code. Now, they
cross-reference to known targets (some names have also been shortened):
| *mod* (previously *refmodule* or *module*)
| *func* (previously *function*)
| *data* (new)
| *const*
| *class*
| *meth* (previously *method*)
| *attr* (previously *member*)
| *exc* (previously *exception*)
| *cdata*
| *cfunc* (previously *cfunction*)
| *cmacro* (previously *csimplemacro*)
| *ctype*
Also different is the handling of *func* and *meth*: while previously
parentheses were added to the callable name (like ``\func{str()}``), they are
now appended by the build system -- appending them in the source will result
in double parentheses. This also means that ``:func:`str(object)``` will not
work as expected -- use ````str(object)```` instead!
* **Inline commands implemented as directives**
These were inline commands in LaTeX, but are now directives in reST:
| *deprecated*
| *versionadded*
| *versionchanged*
These are used like so::
.. deprecated:: 2.5
Reason of deprecation.
Also, no period is appended to the text for *versionadded* and
*versionchanged*.
| *note*
| *warning*
These are used like so::
.. note::
Content of note.
* **Otherwise changed commands**
The *samp* command previously formatted code and added quotation marks around
it. The *samp* role, however, features a new highlighting system just like
*file* does:
``:samp:`open({filename}, {mode})``` results in :samp:`open({filename}, {mode})`
* **Dropped commands**
These were commands in LaTeX, but are not available as roles:
| *bfcode*
| *character* (use :samp:`\`\`'c'\`\``)
| *citetitle* (use ```Title <URL>`_``)
| *code* (use ````code````)
| *email* (just write the address in body text)
| *filenq*
| *filevar* (use the ``{...}`` highlighting feature of *file*)
| *programopt*, *longprogramopt* (use *option*)
| *ulink* (use ```Title <URL>`_``)
| *url* (just write the URL in body text)
| *var* (use ``*var*``)
| *infinity*, *plusminus* (use the Unicode character)
| *shortversion*, *version* (use the ``|version|`` and ``|release|`` substitutions)
| *emph*, *strong* (use the reST markup)
* **Backslash escaping**
In reST, a backslash must be escaped in normal text, and in the content of
roles. However, in code literals and literal blocks, it must not be escaped.
Example: ``:file:`C:\\Temp\\my.tmp``` vs. ````open("C:\Temp\my.tmp")````.
Information units
-----------------
Information units (*...desc* environments) have been made reST directives.
These changes to information units should be noted:
* **New names**
"desc" has been removed from every name. Additionally, these directives have
new names:
| *cfunction* (previously *cfuncdesc*)
| *cmacro* (previously *csimplemacrodesc*)
| *exception* (previously *excdesc*)
| *function* (previously *funcdesc*)
| *attribute* (previously *memberdesc*)
The *classdesc\** and *excclassdesc* environments have been dropped; the
*class* and *exception* directives support classes documented with and without
constructor arguments.
* **Multiple objects**
The equivalent of the *...line* commands is::
.. function:: do_foo(bar)
do_bar(baz)
Description of the functions.
In other words, just give one signature per line, at the same indentation level.
* **Arguments**
There is no *optional* command. Just give function signatures like they
should appear in the output::
.. function:: open(filename[, mode[, buffering]])
Description.
Note: markup in the signature is not supported.
* **Indexing**
The *...descni* environments have been dropped. To mark an information unit
as unsuitable for index entry generation, use the *noindex* option like so::
.. function:: foo_*
:noindex:
Description.
* **New information units**
There are new generic information units: One is called "describe" and can be
used to document things that are not covered by the other units::
.. describe:: a == b
The equals operator.
The others are::
.. cmdoption:: -O
Describes a command-line option.
.. envvar:: PYTHONINSPECT
Describes an environment variable.
Structure
---------
The LaTeX docs were split into several toplevel manuals. Now, all files are part
of the same documentation tree, as indicated by the *toctree* directives in the
sources (though individual output formats may choose to split them up into parts
again). Every *toctree* directive embeds other files as subdocuments of the
current file (this structure is not necessarily mirrored in the filesystem
layout). The toplevel file is :file:`contents.rst`.
However, most of the old directory structure has been kept, with the
directories renamed as follows:
* :file:`api` -> :file:`c-api`
* :file:`dist` -> :file:`distutils`, with the single TeX file split up
* :file:`doc` -> :file:`documenting`
* :file:`ext` -> :file:`extending`
* :file:`inst` -> :file:`installing`
* :file:`lib` -> :file:`library`
* :file:`mac` -> merged into :file:`library`, with :file:`mac/using.tex`
moved to :file:`using/mac.rst`
* :file:`ref` -> :file:`reference`
* :file:`tut` -> :file:`tutorial`, with the single TeX file split up
.. XXX more (index-generating, production lists, ...)

View File

@@ -0,0 +1,32 @@
.. _documenting-index:
######################
Documenting Python
######################
The Python language has a substantial body of documentation, much of it
contributed by various authors. The markup used for the Python documentation is
`reStructuredText`_, developed by the `docutils`_ project, amended by custom
directives and using a toolset named `Sphinx`_ to postprocess the HTML output.
This document describes the style guide for our documentation, the custom
reStructuredText markup introduced to support Python documentation and how it
should be used, as well as the Sphinx build system.
.. _reStructuredText: http://docutils.sf.net/rst.html
.. _docutils: http://docutils.sf.net/
.. _Sphinx: http://sphinx.pocoo.org/
If you're interested in contributing to Python's documentation, there's no need
to write reStructuredText if you're not so inclined; plain text contributions
are more than welcome as well.
.. toctree::
:numbered:
intro.rst
style.rst
rest.rst
markup.rst
fromlatex.rst

View File

@@ -0,0 +1,29 @@
Introduction
============
Python's documentation has long been considered to be good for a free
programming language. There are a number of reasons for this, the most
important being the early commitment of Python's creator, Guido van Rossum, to
providing documentation on the language and its libraries, and the continuing
involvement of the user community in providing assistance for creating and
maintaining documentation.
The involvement of the community takes many forms, from authoring to bug reports
to just plain complaining when the documentation could be more complete or
easier to use.
This document is aimed at authors and potential authors of documentation for
Python. More specifically, it is for people contributing to the standard
documentation and developing additional documents using the same tools as the
standard documents. This guide will be less useful for authors using the Python
documentation tools for topics other than Python, and less useful still for
authors not using the tools at all.
If your interest is in contributing to the Python documentation, but you don't
have the time or inclination to learn reStructuredText and the markup structures
documented here, there's a welcoming place for you among the Python contributors
as well. Any time you feel that you can clarify existing documentation or
provide documentation that's missing, the existing documentation team will
gladly work with you to integrate your text, dealing with the markup for you.
Please don't let the material in this document stand between the documentation
and your desire to help out!

View File

@@ -0,0 +1,823 @@
.. highlightlang:: rest
Additional Markup Constructs
============================
Sphinx adds a lot of new directives and interpreted text roles to standard reST
markup. This section contains the reference material for these facilities.
Documentation for "standard" reST constructs is not included here, though
they are used in the Python documentation.
.. note::
This is just an overview of Sphinx' extended markup capabilities; full
coverage can be found in `its own documentation
<http://sphinx.pocoo.org/contents.html>`_.
Meta-information markup
-----------------------
.. describe:: sectionauthor
Identifies the author of the current section. The argument should include
the author's name such that it can be used for presentation (though it isn't)
and email address. The domain name portion of the address should be lower
case. Example::
.. sectionauthor:: Guido van Rossum <guido@python.org>
Currently, this markup isn't reflected in the output in any way, but it helps
keep track of contributions.
Module-specific markup
----------------------
The markup described in this section is used to provide information about a
module being documented. Each module should be documented in its own file.
Normally this markup appears after the title heading of that file; a typical
file might start like this::
:mod:`parrot` -- Dead parrot access
===================================
.. module:: parrot
:platform: Unix, Windows
:synopsis: Analyze and reanimate dead parrots.
.. moduleauthor:: Eric Cleese <eric@python.invalid>
.. moduleauthor:: John Idle <john@python.invalid>
As you can see, the module-specific markup consists of two directives, the
``module`` directive and the ``moduleauthor`` directive.
.. describe:: module
This directive marks the beginning of the description of a module (or package
submodule, in which case the name should be fully qualified, including the
package name).
The ``platform`` option, if present, is a comma-separated list of the
platforms on which the module is available (if it is available on all
platforms, the option should be omitted). The keys are short identifiers;
examples that are in use include "IRIX", "Mac", "Windows", and "Unix". It is
important to use a key which has already been used when applicable.
The ``synopsis`` option should consist of one sentence describing the
module's purpose -- it is currently only used in the Global Module Index.
The ``deprecated`` option can be given (with no value) to mark a module as
deprecated; it will then be designated as such in various locations.
.. describe:: moduleauthor
The ``moduleauthor`` directive, which can appear multiple times, names the
authors of the module code, just like ``sectionauthor`` names the author(s)
of a piece of documentation. It too does not result in any output currently.
.. note::
It is important to make the section title of a module-describing file
meaningful since that value will be inserted in the table-of-contents trees
in overview files.
Information units
-----------------
There are a number of directives used to describe specific features provided by
modules. Each directive requires one or more signatures to provide basic
information about what is being described, and the content should be the
description. The basic version makes entries in the general index; if no index
entry is desired, you can give the directive option flag ``:noindex:``. The
following example shows all of the features of this directive type::
.. function:: spam(eggs)
ham(eggs)
:noindex:
Spam or ham the foo.
The signatures of object methods or data attributes should always include the
type name (``.. method:: FileInput.input(...)``), even if it is obvious from the
context which type they belong to; this is to enable consistent
cross-references. If you describe methods belonging to an abstract protocol,
such as "context managers", include a (pseudo-)type name too to make the
index entries more informative.
The directives are:
.. describe:: cfunction
Describes a C function. The signature should be given as in C, e.g.::
.. cfunction:: PyObject* PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
This is also used to describe function-like preprocessor macros. The names
of the arguments should be given so they may be used in the description.
Note that you don't have to backslash-escape asterisks in the signature,
as it is not parsed by the reST inliner.
.. describe:: cmember
Describes a C struct member. Example signature::
.. cmember:: PyObject* PyTypeObject.tp_bases
The text of the description should include the range of values allowed, how
the value should be interpreted, and whether the value can be changed.
References to structure members in text should use the ``member`` role.
.. describe:: cmacro
Describes a "simple" C macro. Simple macros are macros which are used
for code expansion, but which do not take arguments so cannot be described as
functions. This is not to be used for simple constant definitions. Examples
of its use in the Python documentation include :cmacro:`PyObject_HEAD` and
:cmacro:`Py_BEGIN_ALLOW_THREADS`.
.. describe:: ctype
Describes a C type. The signature should just be the type name.
.. describe:: cvar
Describes a global C variable. The signature should include the type, such
as::
.. cvar:: PyObject* PyClass_Type
.. describe:: data
Describes global data in a module, including both variables and values used
as "defined constants." Class and object attributes are not documented
using this environment.
.. describe:: exception
Describes an exception class. The signature can, but need not, include
parentheses with constructor arguments.
.. describe:: function
Describes a module-level function. The signature should include the
parameters, enclosing optional parameters in brackets. Default values can be
given if it enhances clarity. For example::
.. function:: Timer.repeat([repeat=3[, number=1000000]])
Object methods are not documented using this directive. Bound object methods
placed in the module namespace as part of the public interface of the module
are documented using this, as they are equivalent to normal functions for
most purposes.
The description should include information about the parameters required and
how they are used (especially whether mutable objects passed as parameters
are modified), side effects, and possible exceptions. A small example may be
provided.
.. describe:: class
Describes a class. The signature can include parentheses with parameters
which will be shown as the constructor arguments.
.. describe:: attribute
Describes an object data attribute. The description should include
information about the type of the data to be expected and whether it may be
changed directly.
.. describe:: method
Describes an object method. The parameters should not include the ``self``
parameter. The description should include similar information to that
described for ``function``.
.. describe:: opcode
Describes a Python :term:`bytecode` instruction.
.. describe:: cmdoption
Describes a command line option or switch. Option argument names should be
enclosed in angle brackets. Example::
.. cmdoption:: -m <module>
Run a module as a script.
.. describe:: envvar
Describes an environment variable that Python uses or defines.
There is also a generic version of these directives:
.. describe:: describe
This directive produces the same formatting as the specific ones explained
above but does not create index entries or cross-referencing targets. It is
used, for example, to describe the directives in this document. Example::
.. describe:: opcode
Describes a Python bytecode instruction.
Showing code examples
---------------------
Examples of Python source code or interactive sessions are represented using
standard reST literal blocks. They are started by a ``::`` at the end of the
preceding paragraph and delimited by indentation.
Representing an interactive session requires including the prompts and output
along with the Python code. No special markup is required for interactive
sessions. After the last line of input or output presented, there should not be
an "unused" primary prompt; this is an example of what *not* to do::
>>> 1 + 1
2
>>>
Syntax highlighting is handled in a smart way:
* There is a "highlighting language" for each source file. By default,
this is ``'python'`` as the majority of files will have to highlight Python
snippets.
* Within Python highlighting mode, interactive sessions are recognized
automatically and highlighted appropriately.
* The highlighting language can be changed using the ``highlightlang``
directive, used as follows::
.. highlightlang:: c
This language is used until the next ``highlightlang`` directive is
encountered.
* The values normally used for the highlighting language are:
* ``python`` (the default)
* ``c``
* ``rest``
* ``none`` (no highlighting)
* If highlighting with the current language fails, the block is not highlighted
in any way.
Longer displays of verbatim text may be included by storing the example text in
an external file containing only plain text. The file may be included using the
``literalinclude`` directive. [1]_ For example, to include the Python source file
:file:`example.py`, use::
.. literalinclude:: example.py
The file name is relative to the current file's path. Documentation-specific
include files should be placed in the ``Doc/includes`` subdirectory.
Inline markup
-------------
As said before, Sphinx uses interpreted text roles to insert semantic markup in
documents.
Names of local variables, such as function/method arguments, are an exception:
they should be marked simply with ``*var*``.
For all other roles, you have to write ``:rolename:`content```.
There are some additional facilities that make cross-referencing roles more
versatile:
* You may supply an explicit title and reference target, like in reST direct
hyperlinks: ``:role:`title <target>``` will refer to *target*, but the link
text will be *title*.
* If you prefix the content with ``!``, no reference/hyperlink will be created.
* For the Python object roles, if you prefix the content with ``~``, the link
text will only be the last component of the target. For example,
``:meth:`~Queue.Queue.get``` will refer to ``Queue.Queue.get`` but only
display ``get`` as the link text.
In HTML output, the link's ``title`` attribute (that is e.g. shown as a
tool-tip on mouse-hover) will always be the full target name.
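The three facilities above can be summarized in a short parsing sketch. The function ``parse_xref`` is hypothetical and only mirrors the rules as stated here; it is not how Sphinx is implemented.

```python
import re

# "title <target>" form: link text, then the target in angle brackets.
_EXPLICIT = re.compile(r"^(.+?)\s*<(.*?)>$", re.DOTALL)


def parse_xref(content):
    """Return (link_text, target, make_link) for a role's content."""
    if content.startswith("!"):
        # "!" suppresses the reference; only the text is shown.
        return content[1:], content[1:], False
    match = _EXPLICIT.match(content)
    if match:
        return match.group(1), match.group(2), True
    if content.startswith("~"):
        # "~" keeps only the last dotted component as the link text.
        target = content[1:]
        return target.rsplit(".", 1)[-1], target, True
    return content, content, True
```

For instance, ``parse_xref("~Queue.Queue.get")`` gives the link text ``get`` with the full ``Queue.Queue.get`` as target.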
The following roles refer to objects in modules and are possibly hyperlinked if
a matching identifier is found:
.. describe:: mod
The name of a module; a dotted name may be used. This should also be used for
package names.
.. describe:: func
The name of a Python function; dotted names may be used. The role text
should not include trailing parentheses to enhance readability. The
parentheses are stripped when searching for identifiers.
.. describe:: data
The name of a module-level variable or constant.
.. describe:: const
The name of a "defined" constant. This may be a C-language ``#define``
or a Python variable that is not intended to be changed.
.. describe:: class
A class name; a dotted name may be used.
.. describe:: meth
The name of a method of an object. The role text should include the type
name and the method name. A dotted name may be used.
.. describe:: attr
The name of a data attribute of an object.
.. describe:: exc
The name of an exception. A dotted name may be used.
The name enclosed in this markup can include a module name and/or a class name.
For example, ``:func:`filter``` could refer to a function named ``filter`` in
the current module, or the built-in function of that name. In contrast,
``:func:`foo.filter``` clearly refers to the ``filter`` function in the ``foo``
module.
Normally, names in these roles are searched first without any further
qualification, then with the current module name prepended, then with the
current module and class name (if any) prepended. If you prefix the name with a
dot, this order is reversed. For example, in the documentation of the
:mod:`codecs` module, ``:func:`open``` always refers to the built-in function,
while ``:func:`.open``` refers to :func:`codecs.open`.
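The search order just described can be sketched as follows. Both function names are hypothetical, and the set of known names stands in for whatever index of documented objects the build system keeps.

```python
def candidate_names(name, module=None, classname=None):
    """List the fully qualified names to try, in search order."""
    reverse = name.startswith(".")
    bare = name[1:] if reverse else name
    candidates = [bare]                      # no further qualification
    if module:
        candidates.append(module + "." + bare)
        if classname:
            candidates.append(module + "." + classname + "." + bare)
    if reverse:
        candidates.reverse()                 # leading dot: most specific first
    return candidates


def resolve(name, known, module=None, classname=None):
    """Return the first candidate present in known, or None."""
    for candidate in candidate_names(name, module, classname):
        if candidate in known:
            return candidate
    return None
```

With ``known = {"open", "codecs.open"}`` and ``module="codecs"``, resolving ``open`` finds the built-in, while ``.open`` finds ``codecs.open``, matching the example in the text.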
A similar heuristic is used to determine whether the name is an attribute of
the currently documented class.
The following roles create cross-references to C-language constructs if they
are defined in the API documentation:
.. describe:: cdata
The name of a C-language variable.
.. describe:: cfunc
The name of a C-language function. Should include trailing parentheses.
.. describe:: cmacro
The name of a "simple" C macro, as defined above.
.. describe:: ctype
The name of a C-language type.
The following role may create a cross-reference, but does not refer
to objects:
.. describe:: token
The name of a grammar token (used in the reference manual to create links
between production displays).
The following role creates a cross-reference to the term in the glossary:
.. describe:: term
Reference to a term in the glossary. The glossary is created using the
``glossary`` directive containing a definition list with terms and
definitions. It does not have to be in the same file as the ``term``
markup; in fact, by default the Python docs have one global glossary
in the ``glossary.rst`` file.
If you use a term that's not explained in a glossary, you'll get a warning
during build.
---------
The following roles don't do anything special except formatting the text
in a different style:
.. describe:: command
The name of an OS-level command, such as ``rm``.
.. describe:: dfn
Mark the defining instance of a term in the text. (No index entries are
generated.)
.. describe:: envvar
An environment variable. Index entries are generated.
.. describe:: file
The name of a file or directory. Within the contents, you can use curly
braces to indicate a "variable" part, for example::
... is installed in :file:`/usr/lib/python2.{x}/site-packages` ...
In the built documentation, the ``x`` will be displayed differently to
indicate that it is to be replaced by the Python minor version.
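To make the curly-brace convention concrete, the sketch below splits role content into literal and "variable" pieces. ``split_variable_parts`` is a hypothetical helper written for this illustration and does not handle nested or escaped braces.

```python
import re


def split_variable_parts(text):
    """Split role content into ("literal", s) and ("variable", s) pairs."""
    parts = []
    # With one capturing group, re.split() alternates between the text
    # outside the braces (even indexes) and the text inside (odd indexes).
    for index, chunk in enumerate(re.split(r"\{(.*?)\}", text)):
        if chunk:
            parts.append(("variable" if index % 2 else "literal", chunk))
    return parts
```

Applied to ``/usr/lib/python2.{x}/site-packages``, this marks ``x`` as the only variable piece; the same splitting would apply to the content of the ``samp`` role.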
.. describe:: guilabel
Labels presented as part of an interactive user interface should be marked
using ``guilabel``. This includes labels from text-based interfaces such as
those created using :mod:`curses` or other text-based libraries. Any label
used in the interface should be marked with this role, including button
labels, window titles, field names, menu and menu selection names, and even
values in selection lists.
.. describe:: kbd
Mark a sequence of keystrokes. What form the key sequence takes may depend
on platform- or application-specific conventions. When there are no relevant
conventions, the names of modifier keys should be spelled out, to improve
accessibility for new users and non-native speakers. For example, an
*xemacs* key sequence may be marked like ``:kbd:`C-x C-f```, but without
reference to a specific application or platform, the same sequence should be
marked as ``:kbd:`Control-x Control-f```.
.. describe:: keyword
The name of a keyword in Python.
.. describe:: mailheader
The name of an RFC 822-style mail header. This markup does not imply that
the header is being used in an email message, but can be used to refer to any
header of the same "style." This is also used for headers defined by the
various MIME specifications. The header name should be entered in the same
way it would normally be found in practice, with the camel-casing conventions
being preferred where there is more than one common usage. For example:
``:mailheader:`Content-Type```.
.. describe:: makevar
The name of a :command:`make` variable.
.. describe:: manpage
A reference to a Unix manual page including the section,
e.g. ``:manpage:`ls(1)```.
.. describe:: menuselection
Menu selections should be marked using the ``menuselection`` role. This is
used to mark a complete sequence of menu selections, including selecting
submenus and choosing a specific operation, or any subsequence of such a
sequence. The names of individual selections should be separated by
``-->``.
For example, to mark the selection "Start > Programs", use this markup::
:menuselection:`Start --> Programs`
When including a selection that includes some trailing indicator, such as the
ellipsis some operating systems use to indicate that the command opens a
dialog, the indicator should be omitted from the selection name.
.. describe:: mimetype
The name of a MIME type, or a component of a MIME type (the major or minor
portion, taken alone).
.. describe:: newsgroup
The name of a Usenet newsgroup.
.. describe:: option
A command-line option to an executable program. The leading hyphen(s) must
be included.
.. describe:: program
The name of an executable program. This may differ from the file name for
the executable for some platforms. In particular, the ``.exe`` (or other)
extension should be omitted for Windows programs.
.. describe:: regexp
A regular expression. Quotes should not be included.
.. describe:: samp
A piece of literal text, such as code. Within the contents, you can use
curly braces to indicate a "variable" part, as in ``:file:``.
If you don't need the "variable part" indication, use the standard
````code```` instead.
.. describe:: var
A Python or C variable or parameter name.
The following roles generate external links:
.. describe:: pep
A reference to a Python Enhancement Proposal. This generates appropriate
index entries. The text "PEP *number*\ " is generated; in the HTML output,
this text is a hyperlink to an online copy of the specified PEP.
.. describe:: rfc
A reference to an Internet Request for Comments. This generates appropriate
index entries. The text "RFC *number*\ " is generated; in the HTML output,
this text is a hyperlink to an online copy of the specified RFC.
Note that there are no special roles for including hyperlinks as you can use
the standard reST markup for that purpose.
.. _doc-ref-role:
Cross-linking markup
--------------------
To support cross-referencing to arbitrary sections in the documentation, the
standard reST labels are "abused" a bit: Every label must precede a section
title; and every label name must be unique throughout the entire documentation
source.
You can then refer to these sections using the ``:ref:`label-name``` role.
Example::
.. _my-reference-label:
Section to cross-reference
--------------------------
This is the text of the section.
It refers to the section itself, see :ref:`my-reference-label`.
The ``:ref:`` invocation is replaced with the section title.
Paragraph-level markup
----------------------
These directives create short paragraphs and can be used inside information
units as well as normal text:
.. describe:: note
An especially important bit of information about an API that a user should be
aware of when using whatever bit of API the note pertains to. The content of
the directive should be written in complete sentences and include all
appropriate punctuation.
Example::
.. note::
This function is not suitable for sending spam e-mails.
.. describe:: warning
An important bit of information about an API that a user should be very aware
of when using whatever bit of API the warning pertains to. The content of
the directive should be written in complete sentences and include all
appropriate punctuation. This differs from ``note`` in that it is recommended
over ``note`` for information regarding security.
.. describe:: versionadded
This directive documents the version of Python which added the described
feature to the library or C API. When this applies to an entire module, it
should be placed at the top of the module section before any prose.
The first argument must be given and is the version in question; you can add
a second argument consisting of a *brief* explanation of the change.
Example::
.. versionadded:: 2.5
The *spam* parameter.
Note that there must be no blank line between the directive head and the
explanation; this is to make these blocks visually continuous in the markup.
.. describe:: versionchanged
Similar to ``versionadded``, but describes when and what changed in the named
feature in some way (new parameters, changed side effects, etc.).
--------------
.. describe:: seealso
Many sections include a list of references to module documentation or
external documents. These lists are created using the ``seealso`` directive.
The ``seealso`` directive is typically placed in a section just before any
sub-sections. For the HTML output, it is shown boxed off from the main flow
of the text.
The content of the ``seealso`` directive should be a reST definition list.
Example::
.. seealso::
Module :mod:`zipfile`
Documentation of the :mod:`zipfile` standard module.
`GNU tar manual, Basic Tar Format <http://link>`_
Documentation for tar archive files, including GNU tar extensions.
.. describe:: rubric
This directive creates a paragraph heading that is not used to create a
table of contents node. It is currently used for the "Footnotes" caption.
.. describe:: centered
This directive creates a centered boldfaced paragraph. Use it as follows::
.. centered::
Paragraph contents.
Table-of-contents markup
------------------------
Since reST does not have facilities to interconnect several documents, or split
documents into multiple output files, Sphinx uses a custom directive to add
relations between the single files the documentation is made of, as well as
tables of contents. The ``toctree`` directive is the central element.
.. describe:: toctree
This directive inserts a "TOC tree" at the current location, using the
individual TOCs (including "sub-TOC trees") of the files given in the
directive body. A numeric ``maxdepth`` option may be given to indicate the
depth of the tree; by default, all levels are included.
Consider this example (taken from the library reference index)::
.. toctree::
:maxdepth: 2
intro.rst
strings.rst
datatypes.rst
numeric.rst
(many more files listed here)
This accomplishes two things:
* Tables of contents from all those files are inserted, with a maximum depth
of two, that means one nested heading. ``toctree`` directives in those
files are also taken into account.
* Sphinx knows the relative order of the files ``intro.rst``,
``strings.rst`` and so forth, and it knows that they are children of the
shown file, the library index. From this information it generates "next
chapter", "previous chapter" and "parent chapter" links.
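The second point can be pictured with a small Python sketch. It is not how
Sphinx is implemented; the function name ``navigation_links`` and the file name
``index.rst`` for the library index are assumptions made for illustration only.
It derives the "previous", "next" and "parent" links from nothing more than the
order of the entries in a ``toctree`` body.

```python
def navigation_links(parent, children):
    """Map each child document to its previous, next and parent documents.

    ``parent`` is the file containing the toctree; ``children`` is the
    ordered list of files named in the directive body.
    """
    links = {}
    for position, name in enumerate(children):
        links[name] = {
            "previous": children[position - 1] if position > 0 else None,
            "next": (children[position + 1]
                     if position + 1 < len(children) else None),
            "parent": parent,
        }
    return links


# Hypothetical file names, following the example above.
links = navigation_links(
    "index.rst", ["intro.rst", "strings.rst", "datatypes.rst", "numeric.rst"])
```

The first and last entries have no "previous" or "next" sibling in this
simplified model, so the sketch stores ``None`` for them.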
In the end, all files included in the build process must occur in one
``toctree`` directive; Sphinx will emit a warning if it finds a file that is
not included, because that means that this file will not be reachable through
standard navigation.
The special file ``contents.rst`` at the root of the source directory is the
"root" of the TOC tree hierarchy; from it the "Contents" page is generated.
Index-generating markup
-----------------------
Sphinx automatically creates index entries from all information units (like
functions, classes or attributes), as discussed before.
However, there is also an explicit directive available, to make the index more
comprehensive and enable index entries in documents where information is not
mainly contained in information units, such as the language reference.
The directive is ``index`` and contains one or more index entries. Each entry
consists of a type and a value, separated by a colon.
For example::
.. index::
single: execution; context
module: __main__
module: sys
triple: module; search; path
This directive contains four entries, which will be converted to entries in the
generated index which link to the exact location of the index statement (or, in
case of offline media, the corresponding page number).
The possible entry types are:
single
Creates a single index entry. Can be made a subentry by separating the
subentry text with a semicolon (this notation is also used below to describe
what entries are created).
pair
``pair: loop; statement`` is a shortcut that creates two index entries,
namely ``loop; statement`` and ``statement; loop``.
triple
Likewise, ``triple: module; search; path`` is a shortcut that creates three
index entries, which are ``module; search path``, ``search; path, module`` and
``path; module search``.
module, keyword, operator, object, exception, statement, builtin
These all create two index entries. For example, ``module: hashlib`` creates
the entries ``module; hashlib`` and ``hashlib; module``.
For index directives containing only "single" entries, there is a shorthand
notation::
.. index:: BNF, grammar, syntax, notation
This creates four index entries.
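The expansion rules for the entry types described above can be sketched in
Python. This is only an illustration of the rules as stated in this section,
not the code Sphinx uses; the names ``expand_index_entry``,
``expand_shorthand`` and ``PAIRED_TYPES`` are made up for the example.

```python
# Entry types that create "type; value" and "value; type".
PAIRED_TYPES = ("module", "keyword", "operator", "object",
                "exception", "statement", "builtin")


def expand_index_entry(entry_type, value):
    """Return the index entries created by one ``type: value`` line."""
    parts = [part.strip() for part in value.split(";")]
    if entry_type == "single":
        return ["; ".join(parts)]
    if entry_type == "pair":
        first, second = parts
        return ["%s; %s" % (first, second), "%s; %s" % (second, first)]
    if entry_type == "triple":
        first, second, third = parts
        return ["%s; %s %s" % (first, second, third),
                "%s; %s, %s" % (second, third, first),
                "%s; %s %s" % (third, first, second)]
    if entry_type in PAIRED_TYPES:
        name = value.strip()
        return ["%s; %s" % (entry_type, name), "%s; %s" % (name, entry_type)]
    raise ValueError("unknown index entry type: %r" % entry_type)


def expand_shorthand(text):
    """Split the comma-separated shorthand into "single" entries."""
    return [name.strip() for name in text.split(",")]
```

For instance, ``expand_index_entry("pair", "loop; statement")`` yields the two
entries given in the description of ``pair``.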
Grammar production displays
---------------------------
Special markup is available for displaying the productions of a formal grammar.
The markup is simple and does not attempt to model all aspects of BNF (or any
derived forms), but provides enough to allow context-free grammars to be
displayed in a way that causes uses of a symbol to be rendered as hyperlinks to
the definition of the symbol. There is this directive:
.. describe:: productionlist
This directive is used to enclose a group of productions. Each production is
given on a single line and consists of a name, separated by a colon from the
following definition. If the definition spans multiple lines, each
continuation line must begin with a colon placed at the same column as in the
first line.
Blank lines are not allowed within ``productionlist`` directive arguments.
The definition can contain token names which are marked as interpreted text
(e.g. ``unaryneg ::= "-" `integer```) -- this generates cross-references
to the productions of these tokens.
Note that no further reST parsing is done in the production, so that you
don't have to escape ``*`` or ``|`` characters.
.. XXX describe optional first parameter
The following is an example taken from the Python Reference Manual::
.. productionlist::
try_stmt: try1_stmt | try2_stmt
try1_stmt: "try" ":" `suite`
: ("except" [`expression` ["," `target`]] ":" `suite`)+
: ["else" ":" `suite`]
: ["finally" ":" `suite`]
try2_stmt: "try" ":" `suite`
: "finally" ":" `suite`
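To make the line format concrete, here is a rough Python sketch that reads the
body of a ``productionlist`` according to the rules given above: a name, a
colon, a definition, and continuation lines that start with a colon. The
function ``parse_productions`` is hypothetical and much simpler than whatever
the documentation tools do internally; in particular it does not check the
column of the continuation colon.

```python
def parse_productions(text):
    """Return a list of (name, definition) pairs from a productionlist body."""
    productions = []
    for line in text.splitlines():
        if not line.strip():
            # Blank lines are not allowed inside the directive.
            raise ValueError("blank line inside production list")
        name, _, definition = line.partition(":")
        name = name.strip()
        definition = definition.strip()
        if name:
            # A new production starts here.
            productions.append([name, definition])
        elif productions:
            # A continuation line: append to the previous definition.
            productions[-1][1] += " " + definition
    return [(name, definition) for name, definition in productions]


body = ('try2_stmt: "try" ":" `suite`\n'
        '         : "finally" ":" `suite`')
parsed = parse_productions(body)
```

Only the first colon on a line separates the name from the definition, so
quoted ``":"`` tokens inside the definition are left untouched.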
Substitutions
-------------
The documentation system provides three substitutions that are defined by default.
They are set in the build configuration file :file:`conf.py`.
.. describe:: |release|
Replaced by the Python release the documentation refers to. This is the full
version string including alpha/beta/release candidate tags, e.g. ``2.5.2b3``.
.. describe:: |version|
Replaced by the Python version the documentation refers to. This consists
only of the major and minor version parts, e.g. ``2.5``, even for version
2.5.1.
.. describe:: |today|
Replaced by either today's date, or the date set in the build configuration
file. Normally has the format ``April 14, 2007``.
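The relationship between ``|release|`` and ``|version|`` can be illustrated
with a few lines of Python. This is a sketch under the assumption that a
release string always starts with ``major.minor``; the helper name
``short_version`` is invented here, and the real values are simply whatever the
build configuration file sets.

```python
import re


def short_version(release):
    """Reduce a full release string to its major.minor version."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("not a release string: %r" % release)
    return "%s.%s" % match.groups()
```

With this helper, both ``short_version("2.5.2b3")`` and
``short_version("2.5.1")`` give ``"2.5"``, matching the examples above.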
.. rubric:: Footnotes
.. [1] There is a standard ``.. include`` directive, but it raises errors if the
file is not found. This one only emits a warning.

@@ -0,0 +1,243 @@
.. highlightlang:: rest
reStructuredText Primer
=======================
This section is a brief introduction to reStructuredText (reST) concepts and
syntax, intended to provide authors with enough information to author documents
productively. Since reST was designed to be a simple, unobtrusive markup
language, this will not take too long.
.. seealso::
The authoritative `reStructuredText User
Documentation <http://docutils.sourceforge.net/rst.html>`_.
Paragraphs
----------
The paragraph is the most basic block in a reST document. Paragraphs are simply
chunks of text separated by one or more blank lines. As in Python, indentation
is significant in reST, so all lines of the same paragraph must be left-aligned
to the same level of indentation.
Inline markup
-------------
The standard reST inline markup is quite simple: use
* one asterisk: ``*text*`` for emphasis (italics),
* two asterisks: ``**text**`` for strong emphasis (boldface), and
* backquotes: ````text```` for code samples.
If asterisks or backquotes appear in running text and could be confused with
inline markup delimiters, they have to be escaped with a backslash.
Be aware of some restrictions of this markup:
* it may not be nested,
* content may not start or end with whitespace: ``* text*`` is wrong,
* it must be separated from surrounding text by non-word characters. Use a
backslash escaped space to work around that: ``thisis\ *one*\ word``.
These restrictions may be lifted in future versions of the docutils.
reST also allows for custom "interpreted text roles", which signify that the
enclosed text should be interpreted in a specific way. Sphinx uses this to
provide semantic markup and cross-referencing of identifiers, as described in
the appropriate section. The general syntax is ``:rolename:`content```.
Lists and Quotes
----------------
List markup is natural: just place an asterisk at the start of a paragraph and
indent properly. The same goes for numbered lists; they can also be
autonumbered using a ``#`` sign::
* This is a bulleted list.
* It has two items, the second
item uses two lines.
1. This is a numbered list.
2. It has two items too.
#. This is a numbered list.
#. It has two items too.
Nested lists are possible, but be aware that they must be separated from the
parent list items by blank lines::
   * this is
   * a list

     * with a nested list
     * and some subitems

   * and here the parent list continues
Definition lists are created as follows::
   term (up to a line of text)
      Definition of the term, which must be indented

      and can even consist of multiple paragraphs

   next term
      Description.
Paragraphs are quoted by just indenting them more than the surrounding
paragraphs.
Source Code
-----------
Literal code blocks are introduced by ending a paragraph with the special marker
``::``. The literal block must be indented::
   This is a normal text paragraph. The next paragraph is a code sample::

      It is not processed in any way, except
      that the indentation is removed.

      It can span multiple lines.

   This is a normal text paragraph again.
The handling of the ``::`` marker is smart:
* If it occurs as a paragraph of its own, that paragraph is completely left
out of the document.
* If it is preceded by whitespace, the marker is removed.
* If it is preceded by non-whitespace, the marker is replaced by a single
colon.
That way, the second sentence in the above example's first paragraph would be
rendered as "The next paragraph is a code sample:".
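The three rules can be written down as a short Python function. This is a
sketch of the behaviour described above, not the docutils implementation; the
name ``render_literal_marker`` is an assumption, and stripping the whitespace
that preceded a removed marker is a choice made for this example.

```python
def render_literal_marker(paragraph):
    """Return the paragraph text as rendered, or None if it is left out."""
    if not paragraph.endswith("::"):
        # No marker: the paragraph is rendered unchanged.
        return paragraph
    if paragraph.strip() == "::":
        # The marker is a paragraph of its own: drop the paragraph.
        return None
    before = paragraph[:-2]
    if before[-1].isspace():
        # Preceded by whitespace: the marker is removed.
        return before.rstrip()
    # Preceded by non-whitespace: the marker becomes a single colon.
    return before + ":"
```

Applied to the second sentence of the example, the function returns the text
with a single trailing colon.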
Hyperlinks
----------
External links
^^^^^^^^^^^^^^
Use ```Link text <http://target>`_`` for inline web links. If the link text
should be the web address, you don't need special markup at all; the parser
finds links and mail addresses in ordinary text.
Internal links
^^^^^^^^^^^^^^
Internal linking is done via a special reST role, see the section on specific
markup, :ref:`doc-ref-role`.
Sections
--------
Section headers are created by underlining (and optionally overlining) the
section title with a punctuation character, at least as long as the text::
=================
This is a heading
=================
Normally, there are no heading levels assigned to certain characters as the
structure is determined from the succession of headings. However, for the
Python documentation, we use this convention:
* ``#`` with overline, for parts
* ``*`` with overline, for chapters
* ``=``, for sections
* ``-``, for subsections
* ``^``, for subsubsections
* ``"``, for paragraphs
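As an illustration of the convention, the following Python sketch builds a
heading for a given level. The table ``HEADING_STYLES`` and the function
``make_heading`` are hypothetical helpers written for this example; they make
the underline exactly as long as the title, which satisfies the "at least as
long as the text" rule.

```python
# level name -> (underline character, whether an overline is used)
HEADING_STYLES = {
    "part": ("#", True),
    "chapter": ("*", True),
    "section": ("=", False),
    "subsection": ("-", False),
    "subsubsection": ("^", False),
    "paragraph": ('"', False),
}


def make_heading(title, level):
    """Return reST markup for ``title`` at the given heading level."""
    char, overline = HEADING_STYLES[level]
    rule = char * len(title)
    lines = [title, rule]
    if overline:
        lines.insert(0, rule)
    return "\n".join(lines)
```

For example, ``make_heading("Paragraphs", "subsection")`` produces the title
followed by a line of ten hyphens.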
Explicit Markup
---------------
"Explicit markup" is used in reST for most constructs that need special
handling, such as footnotes, specially-highlighted paragraphs, comments, and
generic directives.
An explicit markup block begins with a line starting with ``..`` followed by
whitespace and is terminated by the next paragraph at the same level of
indentation. (There needs to be a blank line between explicit markup and normal
paragraphs. This may all sound a bit complicated, but it is intuitive enough
when you write it.)
Directives
----------
A directive is a generic block of explicit markup. Besides roles, it is one of
the extension mechanisms of reST, and Sphinx makes heavy use of it.
Basically, a directive consists of a name, arguments, options and content. (Keep
this terminology in mind; it is used in the next chapter describing custom
directives.) Looking at this example, ::
.. function:: foo(x)
foo(y, z)
:bar: no
Return a line of text input from the user.
``function`` is the directive name. It is given two arguments here, the
remainder of the first line and the second line, as well as one option ``bar``
(as you can see, options are given in the lines immediately following the
arguments and indicated by the colons).
The directive content follows after a blank line and is indented relative to the
directive start.
Footnotes
---------
For footnotes, use ``[#]_`` to mark the footnote location, and add the footnote
body at the bottom of the document after a "Footnotes" rubric heading, like so::
Lorem ipsum [#]_ dolor sit amet ... [#]_
.. rubric:: Footnotes
.. [#] Text of the first footnote.
.. [#] Text of the second footnote.
You can also explicitly number the footnotes for better context.
Comments
--------
Every explicit markup block which isn't a valid markup construct (like the
footnotes above) is regarded as a comment.
Source encoding
---------------
Since the easiest way to include special characters like em dashes or copyright
signs in reST is to directly write them as Unicode characters, one has to
specify an encoding:
All Python documentation source files must be in UTF-8 encoding, and the HTML
documents written from them will be in that encoding as well.
Gotchas
-------
There are some problems one commonly runs into while authoring reST documents:
* **Separation of inline markup:** As said above, inline markup spans must be
separated from the surrounding text by non-word characters; you have to use
an escaped space to get around that.

@@ -0,0 +1,70 @@
.. highlightlang:: rest
Style Guide
===========
The Python documentation should follow the `Apple Publications Style Guide`_
wherever possible. This particular style guide was selected mostly because it
seems reasonable and is easy to get online.
Topics which are not covered in Apple's style guide will be discussed in
this document.
All reST files use an indentation of 3 spaces. The maximum line length is 80
characters for normal text, but tables, deeply indented code samples and long
links may extend beyond that.
Make generous use of blank lines where applicable; they help group things
together.
A sentence-ending period may be followed by one or two spaces; while reST
ignores the second space, it is customarily put in by some users, for example
to aid Emacs' auto-fill mode.
Footnotes are generally discouraged, though they may be used when they are the
best way to present specific information. When a footnote reference is added at
the end of the sentence, it should follow the sentence-ending punctuation. The
reST markup should appear something like this::
This sentence has a footnote reference. [#]_ This is the next sentence.
Footnotes should be gathered at the end of a file, or if the file is very long,
at the end of a section. The docutils will automatically create backlinks to
the footnote reference.
Footnotes may appear in the middle of sentences where appropriate.
Many special names are used in the Python documentation, including the names of
operating systems, programming languages, standards bodies, and the like. Most
of these entities are not assigned any special markup, but the preferred
spellings are given here to aid authors in maintaining the consistency of
presentation in the Python documentation.
Other terms and words deserve special mention as well; these conventions should
be used to ensure consistency throughout the documentation:
CPU
For "central processing unit." Many style guides say this should be spelled
out on the first use (and if you must use it, do so!). For the Python
documentation, this abbreviation should be avoided since there's no
reasonable way to predict which occurrence will be the first seen by the
reader. It is better to use the word "processor" instead.
POSIX
The name assigned to a particular group of standards. This is always
uppercase.
Python
The name of our favorite programming language is always capitalized.
Unicode
The name of a character set and matching encoding. This is always written
capitalized.
Unix
The name of the operating system developed at AT&T Bell Labs in the early
1970s.
.. _Apple Publications Style Guide: http://developer.apple.com/documentation/UserExperience/Conceptual/APStyleGuide/APSG_2008.pdf

@@ -0,0 +1,131 @@
.. highlightlang:: c
.. _building:
********************************************
Building C and C++ Extensions with distutils
********************************************
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
Starting in Python 1.4, Python provided, on Unix, a special make file for
generating make files that build dynamically-linked extensions and custom
interpreters. Starting with Python 2.0, this mechanism (built around
:file:`Makefile.pre.in` and :file:`Setup` files) is no longer supported.
Custom interpreters were rarely built, and extension modules can be built using
distutils.
Building an extension module using distutils requires that distutils be
installed on the build machine. It is included in Python 2.x and available
separately for Python 1.5. Since distutils also supports creation of binary
packages, users don't necessarily need a compiler and distutils to install the
extension.
A distutils package contains a driver script, :file:`setup.py`. This is a plain
Python file, which, in the simplest case, could look like this::
   from distutils.core import setup, Extension

   module1 = Extension('demo',
                       sources = ['demo.c'])

   setup (name = 'PackageName',
          version = '1.0',
          description = 'This is a demo package',
          ext_modules = [module1])
With this :file:`setup.py`, and a file :file:`demo.c`, running ::
python setup.py build
will compile :file:`demo.c`, and produce an extension module named ``demo`` in
the :file:`build` directory. Depending on the system, the module file will end
up in a subdirectory :file:`build/lib.system`, and may have a name like
:file:`demo.so` or :file:`demo.pyd`.
In the :file:`setup.py`, all execution is performed by calling the ``setup``
function. This takes a variable number of keyword arguments, of which the
example above uses only a subset. Specifically, the example specifies
meta-information to build packages, and it specifies the contents of the
package. Normally, a package will contain additional modules, like Python
source modules, documentation, subpackages, etc. Please refer to the distutils
documentation in :ref:`distutils-index` to learn more about the features of
distutils; this section explains building extension modules only.
It is common to pre-compute arguments to :func:`setup`, to better structure the
driver script. In the example above, the ``ext_modules`` argument to
:func:`setup` is a list of extension modules, each of which is an instance of
the :class:`Extension` class. In the example, the instance defines an extension
named ``demo`` which is built by compiling a single source file, :file:`demo.c`.
In many cases, building an extension is more complex, since additional
preprocessor defines and libraries may be needed. This is demonstrated in the
example below. ::
   from distutils.core import setup, Extension

   module1 = Extension('demo',
                       define_macros = [('MAJOR_VERSION', '1'),
                                        ('MINOR_VERSION', '0')],
                       include_dirs = ['/usr/local/include'],
                       libraries = ['tcl83'],
                       library_dirs = ['/usr/local/lib'],
                       sources = ['demo.c'])

   setup (name = 'PackageName',
          version = '1.0',
          description = 'This is a demo package',
          author = 'Martin v. Loewis',
          author_email = 'martin@v.loewis.de',
          url = 'http://docs.python.org/extending/building',
          long_description = '''
   This is really just a demo package.
   ''',
          ext_modules = [module1])
In this example, :func:`setup` is called with additional meta-information, which
is recommended when distribution packages have to be built. For the extension
itself, it specifies preprocessor defines, include directories, library
directories, and libraries. Depending on the compiler, distutils passes this
information in different ways to the compiler. For example, on Unix, this may
result in the compilation commands ::
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -DMAJOR_VERSION=1 -DMINOR_VERSION=0 -I/usr/local/include -I/usr/local/include/python2.2 -c demo.c -o build/temp.linux-i686-2.2/demo.o
gcc -shared build/temp.linux-i686-2.2/demo.o -L/usr/local/lib -ltcl83 -o build/lib.linux-i686-2.2/demo.so
These lines are for demonstration purposes only; distutils users should trust
that distutils gets the invocations right.
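To see how the :class:`Extension` arguments relate to the options in the
commands above, consider this Python sketch. It is not taken from distutils and
covers only a Unix-style compiler; the functions ``compile_flags`` and
``link_flags`` are invented for this illustration, and the flags that come from
the Python build itself (such as ``-fPIC`` or the Python include directory) are
left out.

```python
def compile_flags(define_macros=(), include_dirs=()):
    """Translate preprocessor defines and include directories to options."""
    flags = []
    for name, value in define_macros:
        if value is None:
            # A macro defined without a value.
            flags.append("-D%s" % name)
        else:
            flags.append("-D%s=%s" % (name, value))
    flags.extend("-I%s" % directory for directory in include_dirs)
    return flags


def link_flags(library_dirs=(), libraries=()):
    """Translate library directories and library names to linker options."""
    return (["-L%s" % directory for directory in library_dirs] +
            ["-l%s" % library for library in libraries])


cflags = compile_flags([('MAJOR_VERSION', '1'), ('MINOR_VERSION', '0')],
                       ['/usr/local/include'])
ldflags = link_flags(['/usr/local/lib'], ['tcl83'])
```

The resulting lists correspond to the ``-D``, ``-I``, ``-L`` and ``-l`` options
visible in the two ``gcc`` invocations.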
.. _distributing:
Distributing your extension modules
===================================
When an extension has been successfully built, there are three ways to use it.
End-users will typically want to install the module; they do so by running ::
python setup.py install
Module maintainers should produce source packages; to do so, they run ::
python setup.py sdist
In some cases, additional files need to be included in a source distribution;
this is done through a :file:`MANIFEST.in` file; see the distutils documentation
for details.
If the source distribution has been built successfully, maintainers can also
create binary distributions. Depending on the platform, one of the following
commands can be used to do so. ::
python setup.py bdist_wininst
python setup.py bdist_rpm
python setup.py bdist_dumb

@@ -0,0 +1,285 @@
.. highlightlang:: c
.. _embedding:
***************************************
Embedding Python in Another Application
***************************************
The previous chapters discussed how to extend Python, that is, how to extend the
functionality of Python by attaching a library of C functions to it. It is also
possible to do it the other way around: enrich your C/C++ application by
embedding Python in it. Embedding provides your application with the ability to
implement some of the functionality of your application in Python rather than C
or C++. This can be used for many purposes; one example would be to allow users
to tailor the application to their needs by writing some scripts in Python. You
can also use it yourself if some of the functionality can be written in Python
more easily.
Embedding Python is similar to extending it, but not quite. The difference is
that when you extend Python, the main program of the application is still the
Python interpreter, while if you embed Python, the main program may have nothing
to do with Python --- instead, some parts of the application occasionally call
the Python interpreter to run some Python code.
So if you are embedding Python, you are providing your own main program. One of
the things this main program has to do is initialize the Python interpreter. At
the very least, you have to call the function :cfunc:`Py_Initialize`. There are
optional calls to pass command line arguments to Python. Then later you can
call the interpreter from any part of the application.
There are several different ways to call the interpreter: you can pass a string
containing Python statements to :cfunc:`PyRun_SimpleString`, or you can pass a
stdio file pointer and a file name (for identification in error messages only)
to :cfunc:`PyRun_SimpleFile`. You can also call the lower-level operations
described in the previous chapters to construct and use Python objects.
A simple demo of embedding Python can be found in the directory
:file:`Demo/embed/` of the source distribution.
.. seealso::
:ref:`c-api-index`
The details of Python's C interface are given in this manual. A great deal of
necessary information can be found here.
.. _high-level-embedding:
Very High Level Embedding
=========================
The simplest form of embedding Python is the use of the very high level
interface. This interface is intended to execute a Python script without needing
to interact with the application directly. This can for example be used to
perform some operation on a file. ::
   #include <Python.h>

   int
   main(int argc, char *argv[])
   {
     Py_Initialize();
     PyRun_SimpleString("from time import time,ctime\n"
                        "print 'Today is',ctime(time())\n");
     Py_Finalize();
     return 0;
   }
The above code first initializes the Python interpreter with
:cfunc:`Py_Initialize`, followed by the execution of a hard-coded Python script
that prints the date and time. Afterwards, the :cfunc:`Py_Finalize` call shuts
the interpreter down, followed by the end of the program. In a real program,
you may want to get the Python script from another source, perhaps a text-editor
routine, a file, or a database. Getting the Python code from a file can better
be done by using the :cfunc:`PyRun_SimpleFile` function, which saves you the
trouble of allocating memory space and loading the file contents.
.. _lower-level-embedding:
Beyond Very High Level Embedding: An overview
=============================================
The high level interface gives you the ability to execute arbitrary pieces of
Python code from your application, but exchanging data values is quite
cumbersome to say the least. If you want that, you should use lower level calls.
At the cost of having to write more C code, you can achieve almost anything.
It should be noted that extending Python and embedding Python are much the same
activity, despite the different intent. Most topics discussed in the previous
chapters are still valid. To show this, consider what the extension code from
Python to C really does:
#. Convert data values from Python to C,
#. Perform a function call to a C routine using the converted values, and
#. Convert the data values from the call from C to Python.
When embedding Python, the interface code does:
#. Convert data values from C to Python,
#. Perform a function call to a Python interface routine using the converted
values, and
#. Convert the data values from the call from Python to C.
As you can see, the data conversion steps are simply swapped to accommodate the
different direction of the cross-language transfer. The only difference is the
routine that you call between both data conversions. When extending, you call a
C routine; when embedding, you call a Python routine.
This chapter will not discuss how to convert data from Python to C and vice
versa. Also, proper use of references and dealing with errors is assumed to be
understood. Since these aspects do not differ from extending the interpreter,
you can refer to earlier chapters for the required information.
.. _pure-embedding:
Pure Embedding
==============
The first program aims to execute a function in a Python script. As in the
section about the very high level interface, the Python interpreter does not
directly interact with the application (but that will change in the next
section).
The code to run a function defined in a Python script is:
.. literalinclude:: ../includes/run-func.c
This code loads a Python script using ``argv[1]``, and calls the function named
in ``argv[2]``. Its integer arguments are the other values of the ``argv``
array. If you compile and link this program (let's call the finished executable
:program:`call`), and use it to execute a Python script, such as::
   def multiply(a,b):
       print "Will compute", a, "times", b
       c = 0
       for i in range(0, a):
           c = c + b
       return c
then the result should be::
$ call multiply multiply 3 2
Will compute 3 times 2
Result of call: 6
Although the program is quite large for its functionality, most of the code is
for data conversion between Python and C, and for error reporting. The
interesting part with respect to embedding Python starts with ::
Py_Initialize();
pName = PyString_FromString(argv[1]);
/* Error checking of pName left out */
pModule = PyImport_Import(pName);
After initializing the interpreter, the script is loaded using
:cfunc:`PyImport_Import`. This routine needs a Python string as its argument,
which is constructed using the :cfunc:`PyString_FromString` data conversion
routine. ::
   pFunc = PyObject_GetAttrString(pModule, argv[2]);
   /* pFunc is a new reference */

   if (pFunc && PyCallable_Check(pFunc)) {
       ...
   }
   Py_XDECREF(pFunc);
Once the script is loaded, the name we're looking for is retrieved using
:cfunc:`PyObject_GetAttrString`. If the name exists, and the object returned is
callable, you can safely assume that it is a function. The program then
proceeds by constructing a tuple of arguments as normal. The call to the Python
function is then made with::
pValue = PyObject_CallObject(pFunc, pArgs);
Upon return of the function, ``pValue`` is either *NULL* or it contains a
reference to the return value of the function. Be sure to release the reference
after examining the value.
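The sequence of steps performed by the C program can also be sketched in Python itself. This is only an illustration of the control flow, not a replacement for :file:`run-func.c`. The helper name ``call_by_name`` is made up for this example, the standard :mod:`operator` module stands in for the user's script so that the sketch is self-contained, and :mod:`importlib` is assumed to be available (Python 2.7 or later).

```python
import importlib


def call_by_name(module_name, func_name, raw_args):
    """Import a module, look up a function in it, and call it with integers."""
    module = importlib.import_module(module_name)    # roughly PyImport_Import
    func = getattr(module, func_name, None)          # roughly PyObject_GetAttrString
    if func is None or not callable(func):           # roughly PyCallable_Check
        raise TypeError("cannot find callable %r in %r" % (func_name, module_name))
    args = tuple(int(a) for a in raw_args)           # argv strings become integers
    return func(*args)                               # roughly PyObject_CallObject


# Stand-in for "call multiply multiply 3 2", using operator.mul instead of a script.
print("Result of call: %d" % call_by_name("operator", "mul", ["3", "2"]))
```

The C version additionally has to manage reference counts and report errors by hand, which accounts for most of its length.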
.. _extending-with-embedding:
Extending Embedded Python
=========================
Until now, the embedded Python interpreter had no access to functionality from
the application itself. The Python API allows this by extending the embedded
interpreter. That is, the embedded interpreter gets extended with routines
provided by the application. While it sounds complex, it is not so bad. Simply
forget for a while that the application starts the Python interpreter. Instead,
consider the application to be a set of subroutines, and write some glue code
that gives Python access to those routines, just like you would write a normal
Python extension. For example::
   static int numargs=0;

   /* Return the number of arguments of the application command line */
   static PyObject*
   emb_numargs(PyObject *self, PyObject *args)
   {
       if(!PyArg_ParseTuple(args, ":numargs"))
           return NULL;
       return Py_BuildValue("i", numargs);
   }

   static PyMethodDef EmbMethods[] = {
       {"numargs", emb_numargs, METH_VARARGS,
        "Return the number of arguments received by the process."},
       {NULL, NULL, 0, NULL}
   };
Insert the above code just above the :cfunc:`main` function. Also, insert the
following two statements directly after :cfunc:`Py_Initialize`::
numargs = argc;
Py_InitModule("emb", EmbMethods);
These two lines initialize the ``numargs`` variable, and make the
:func:`emb.numargs` function accessible to the embedded Python interpreter.
With these extensions, the Python script can do things like ::
import emb
print "Number of arguments", emb.numargs()
In a real application, the methods will expose an API of the application to
Python.
.. TODO: threads, code examples do not really behave well if errors happen
(what to watch out for)
.. _embeddingincplusplus:
Embedding Python in C++
=======================
It is also possible to embed Python in a C++ program. Precisely how this is done
will depend on the details of the C++ system used. In general, you will need to
write the main program in C++, and use the C++ compiler to compile and link your
program. There is no need to recompile Python itself using C++.
.. _link-reqs:
Linking Requirements
====================
While the :program:`configure` script shipped with the Python sources will
correctly build Python to export the symbols needed by dynamically linked
extensions, this is not automatically inherited by applications which embed the
Python library statically, at least on Unix. This is an issue when the
application is linked to the static runtime library (:file:`libpython.a`) and
needs to load dynamic extensions (implemented as :file:`.so` files).
The problem is that some entry points are defined by the Python runtime solely
for extension modules to use. If the embedding application does not use any of
these entry points, some linkers will not include those entries in the symbol
table of the finished executable. Some additional options are needed to inform
the linker not to remove these symbols.
Determining the right options to use for any given platform can be quite
difficult, but fortunately the Python configuration already has those values.
To retrieve them from an installed Python interpreter, start an interactive
interpreter and have a short session like this::
>>> import distutils.sysconfig
>>> distutils.sysconfig.get_config_var('LINKFORSHARED')
'-Xlinker -export-dynamic'
.. index:: module: distutils.sysconfig
The contents of the string presented will be the options that should be used.
If the string is empty, there's no need to add any additional options. The
:const:`LINKFORSHARED` definition corresponds to the variable of the same name
in Python's top-level :file:`Makefile`.
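A build script can read the same value without an interactive session. The following is a minimal sketch. It assumes the top-level :mod:`sysconfig` module (available from Python 2.7 on) rather than :mod:`distutils.sysconfig`, and the function name ``extra_link_flags`` is hypothetical.

```python
import sysconfig


def extra_link_flags():
    """Return the LINKFORSHARED options as a list (empty if there are none)."""
    value = sysconfig.get_config_var("LINKFORSHARED")
    # The variable may be missing (None) or an empty string on some platforms.
    return value.split() if value else []


print(" ".join(extra_link_flags()))
```

An empty result means, as described above, that no additional linker options are needed.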


@@ -0,0 +1,35 @@
.. _extending-index:
##################################################
Extending and Embedding the Python Interpreter
##################################################
:Release: |version|
:Date: |today|
This document describes how to write modules in C or C++ to extend the Python
interpreter. Those modules can define not only new functions but also
new object types and their methods. The document also describes how to embed
the Python interpreter in another application, for use as an extension language.
Finally, it shows how to compile and link extension modules so that they can be
loaded dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.
This document assumes basic knowledge about Python. For an informal
introduction to the language, see :ref:`tutorial-index`. :ref:`reference-index`
gives a more formal definition of the language. :ref:`library-index` documents
the existing object types, functions and modules (both built-in and written in
Python) that give the language its wide application range.
For a detailed description of the whole Python/C API, see the separate
:ref:`c-api-index`.
.. toctree::
:maxdepth: 2
:numbered:
extending.rst
newtypes.rst
building.rst
windows.rst
embedding.rst


@@ -0,0 +1,282 @@
.. highlightlang:: c
.. _building-on-windows:
****************************************
Building C and C++ Extensions on Windows
****************************************
This chapter briefly explains how to create a Windows extension module for
Python using Microsoft Visual C++, and follows with more detailed background
information on how it works. The explanatory material is useful for both the
Windows programmer learning to build Python extensions and the Unix programmer
interested in producing software which can be successfully built on both Unix
and Windows.
Module authors are encouraged to use the distutils approach for building
extension modules, instead of the one described in this section. You will still
need the C compiler that was used to build Python, typically Microsoft Visual
C++.
.. note::
This chapter mentions a number of filenames that include an encoded Python
version number. These filenames are represented with the version number shown
as ``XY``; in practice, ``'X'`` will be the major version number and ``'Y'``
will be the minor version number of the Python release you're working with. For
example, if you are using Python 2.2.1, ``XY`` will actually be ``22``.
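If a build script needs these file names, the ``XY`` part can be derived from the running interpreter. A small sketch, in which the helper name ``import_library_name`` is hypothetical:

```python
import sys


def import_library_name(debug=False):
    """Build the pythonXY.lib (or pythonXY_d.lib) name for this interpreter."""
    major, minor = sys.version_info[:2]
    suffix = "_d" if debug else ""
    return "python%d%d%s.lib" % (major, minor, suffix)
```

Under Python 2.2.1 this would give ``python22.lib``, or ``python22_d.lib`` when ``debug`` is true.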
.. _win-cookbook:
A Cookbook Approach
===================
There are two approaches to building extension modules on Windows, just as there
are on Unix: use the :mod:`distutils` package to control the build process, or
do things manually. The distutils approach works well for most extensions;
documentation on using :mod:`distutils` to build and package extension modules
is available in :ref:`distutils-index`. This section describes the manual
approach to building Python extensions written in C or C++.
To build extensions using these instructions, you need to have a copy of the
Python sources of the same version as your installed Python. You will need
Microsoft Visual C++ "Developer Studio"; project files are supplied for VC++
version 7.1, but you can use older versions of VC++. Notice that you should use
the same version of VC++ that was used to build Python itself. The example files
described here are distributed with the Python sources in the
:file:`PC\\example_nt\\` directory.
#. **Copy the example files** --- The :file:`example_nt` directory is a
subdirectory of the :file:`PC` directory, in order to keep all the PC-specific
files under the same directory in the source distribution. However, the
:file:`example_nt` directory can't actually be used from this location. You
first need to copy or move it up one level, so that :file:`example_nt` is a
sibling of the :file:`PC` and :file:`Include` directories. Do all your work
from within this new location.
#. **Open the project** --- From VC++, use the :menuselection:`File --> Open
Solution` dialog (not :menuselection:`File --> Open`!). Navigate to and select
the file :file:`example.sln`, in the *copy* of the :file:`example_nt` directory
you made above. Click Open.
#. **Build the example DLL** --- In order to check that everything is set up
right, try building:
#. Select a configuration. This step is optional. Choose
:menuselection:`Build --> Configuration Manager --> Active Solution Configuration`
and select either :guilabel:`Release` or :guilabel:`Debug`. If you skip this
step, VC++ will use the Debug configuration by default.
#. Build the DLL. Choose :menuselection:`Build --> Build Solution`. This
creates all intermediate and result files in a subdirectory called either
:file:`Debug` or :file:`Release`, depending on which configuration you selected
in the preceding step.
#. **Testing the debug-mode DLL** --- Once the Debug build has succeeded, bring
up a DOS box, and change to the :file:`example_nt\\Debug` directory. You should
now be able to repeat the following session (``C>`` is the DOS prompt, ``>>>``
is the Python prompt; note that build information and various debug output from
Python may not match this screen dump exactly)::
C>..\..\PCbuild\python_d
Adding parser accelerators ...
Done.
Python 2.2 (#28, Dec 19 2001, 23:26:37) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import example
[4897 refs]
>>> example.foo()
Hello, world
[4903 refs]
>>>
Congratulations! You've successfully built your first Python extension module.
#. **Creating your own project** --- Choose a name and create a directory for
it. Copy your C sources into it. Note that the module source file name does
not necessarily have to match the module name, but the name of the
initialization function should match the module name --- you can only import a
module :mod:`spam` if its initialization function is called :cfunc:`initspam`,
and it should call :cfunc:`Py_InitModule` with the string ``"spam"`` as its
first argument (use the minimal :file:`example.c` in this directory as a guide).
By convention, it lives in a file called :file:`spam.c` or :file:`spammodule.c`.
The output file should be called :file:`spam.pyd` (in Release mode) or
:file:`spam_d.pyd` (in Debug mode). The extension :file:`.pyd` was chosen
to avoid confusion with a system library :file:`spam.dll` to which your module
could be a Python interface.
.. versionchanged:: 2.5
Previously, file names like :file:`spam.dll` (in release mode) or
:file:`spam_d.dll` (in debug mode) were also recognized.
Now your options are:
#. Copy :file:`example.sln` and :file:`example.vcproj`, rename them to
:file:`spam.\*`, and edit them by hand, or
#. Create a brand new project; instructions are below.
In either case, copy :file:`example_nt\\example.def` to :file:`spam\\spam.def`,
and edit the new :file:`spam.def` so its second line contains the string
'``initspam``'. If you created a new project yourself, add the file
:file:`spam.def` to the project now. (This is an annoying little file with only
two lines. An alternative approach is to forget about the :file:`.def` file,
and add the option :option:`/export:initspam` somewhere to the Link settings, by
manually editing the setting in the Project Properties dialog).
#. **Creating a brand new project** --- Use the :menuselection:`File --> New
--> Project` dialog to create a new Project Workspace. Select :guilabel:`Visual
C++ Projects/Win32/ Win32 Project`, enter the name (``spam``), and make sure the
Location is set to the parent of the :file:`spam` directory you have created (which
should be a direct subdirectory of the Python build tree, a sibling of
:file:`Include` and :file:`PC`). Select Win32 as the platform (in my version,
this is the only choice). Make sure the Create new workspace radio button is
selected. Click OK.
You should now create the file :file:`spam.def` as instructed in the previous
section. Add the source files to the project, using :menuselection:`Project -->
Add Existing Item`. Set the pattern to ``*.*`` and select both :file:`spam.c`
and :file:`spam.def` and click OK. (Inserting them one by one is fine too.)
Now open the :menuselection:`Project --> spam properties` dialog. You only need
to change a few settings. Make sure :guilabel:`All Configurations` is selected
from the :guilabel:`Settings for:` dropdown list. Select the C/C++ tab. Choose
the General category in the popup menu at the top. Type the following text in
the entry box labeled :guilabel:`Additional Include Directories`::
..\Include,..\PC
Then, choose the General category in the Linker tab, and enter ::
..\PCbuild
in the text box labeled :guilabel:`Additional Library Directories`.
Now you need to add some mode-specific settings:
Select :guilabel:`Release` in the :guilabel:`Configuration` dropdown list.
Choose the :guilabel:`Linker` tab, choose the :guilabel:`Input` category, and
append ``pythonXY.lib`` to the list in the :guilabel:`Additional Dependencies`
box.
Select :guilabel:`Debug` in the :guilabel:`Configuration` dropdown list, and
append ``pythonXY_d.lib`` to the list in the :guilabel:`Additional Dependencies`
box. Then click the C/C++ tab, select :guilabel:`Code Generation`, and select
:guilabel:`Multi-threaded Debug DLL` from the :guilabel:`Runtime library`
dropdown list.
Select :guilabel:`Release` again from the :guilabel:`Configuration` dropdown
list. Select :guilabel:`Multi-threaded DLL` from the :guilabel:`Runtime
library` dropdown list.
If your module creates a new type, you may have trouble with this line::
PyObject_HEAD_INIT(&PyType_Type)
Change it to::
PyObject_HEAD_INIT(NULL)
and add the following to the module initialization function::
MyObject_Type.ob_type = &PyType_Type;
Refer to section 3 of the `Python FAQ <http://www.python.org/doc/faq>`_ for
details on why you must do this.
.. _dynamic-linking:
Differences Between Unix and Windows
====================================
.. sectionauthor:: Chris Phoenix <cphoenix@best.com>
Unix and Windows use completely different paradigms for run-time loading of
code. Before you try to build a module that can be dynamically loaded, be aware
of how your system works.
In Unix, a shared object (:file:`.so`) file contains code to be used by the
program, and also the names of functions and data that it expects to find in the
program. When the file is joined to the program, all references to those
functions and data in the file's code are changed to point to the actual
locations in the program where the functions and data are placed in memory.
This is basically a link operation.
In Windows, a dynamic-link library (:file:`.dll`) file has no dangling
references. Instead, an access to functions or data goes through a lookup
table. So the DLL code does not have to be fixed up at runtime to refer to the
program's memory; instead, the code already uses the DLL's lookup table, and the
lookup table is modified at runtime to point to the functions and data.
In Unix, there is only one type of library file (:file:`.a`) which contains code
from several object files (:file:`.o`). During the link step to create a shared
object file (:file:`.so`), the linker may find that it doesn't know where an
identifier is defined. The linker will look for it in the object files in the
libraries; if it finds it, it will include all the code from that object file.
In Windows, there are two types of library, a static library and an import
library (both called :file:`.lib`). A static library is like a Unix :file:`.a`
file; it contains code to be included as necessary. An import library is
basically used only to reassure the linker that a certain identifier is legal,
and will be present in the program when the DLL is loaded. So the linker uses
the information from the import library to build the lookup table for using
identifiers that are not included in the DLL. When an application or a DLL is
linked, an import library may be generated, which will need to be used for all
future DLLs that depend on the symbols in the application or DLL.
Suppose you are building two dynamic-load modules, B and C, which should share
another block of code A. On Unix, you would *not* pass :file:`A.a` to the
linker for :file:`B.so` and :file:`C.so`; that would cause it to be included
twice, so that B and C would each have their own copy. In Windows, building
:file:`A.dll` will also build :file:`A.lib`. You *do* pass :file:`A.lib` to the
linker for B and C. :file:`A.lib` does not contain code; it just contains
information which will be used at runtime to access A's code.
In Windows, using an import library is sort of like using ``import spam``; it
gives you access to spam's names, but does not create a separate copy. On Unix,
linking with a library is more like ``from spam import *``; it does create a
separate copy.
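The Python side of this analogy can be made concrete with a short sketch. The module name ``spam_demo`` is hypothetical and is created on the fly so that the example is self-contained. ``from spam_demo import counter`` is used instead of ``import *`` so the snippet also works when placed inside a function.

```python
import sys
import types

# Create a throwaway module and register it so that "import" can find it.
spam_demo = types.ModuleType("spam_demo")
spam_demo.counter = 0
sys.modules["spam_demo"] = spam_demo

import spam_demo as shared                  # refers to the one module object
from spam_demo import counter as copied     # makes a separate name binding

spam_demo.counter = 1                       # change the original

# "shared" sees the change; "copied" still holds the old value.
print("shared=%d copied=%d" % (shared.counter, copied))
```

Like any analogy it is imperfect: the point is only that one form shares a single definition while the other takes its own copy.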
.. _win-dlls:
Using DLLs in Practice
======================
.. sectionauthor:: Chris Phoenix <cphoenix@best.com>
Windows Python is built with Microsoft Visual C++; using other compilers may or
may not work (though Borland seems to work). The rest of this section is
specific to MSVC++.
When creating DLLs in Windows, you must pass :file:`pythonXY.lib` to the linker.
To build two DLLs, spam and ni (which uses C functions found in spam), you could
use these commands::
cl /LD /I/python/include spam.c ../libs/pythonXY.lib
cl /LD /I/python/include ni.c spam.lib ../libs/pythonXY.lib
The first command created three files: :file:`spam.obj`, :file:`spam.dll` and
:file:`spam.lib`. :file:`spam.dll` does not contain any Python functions (such
as :cfunc:`PyArg_ParseTuple`), but it does know how to find the Python code
thanks to :file:`pythonXY.lib`.
The second command created :file:`ni.dll` (and :file:`.obj` and :file:`.lib`),
which knows how to find the necessary functions from spam, and also from the
Python executable.
Not every identifier is exported to the lookup table. If you want any other
modules (including Python) to be able to see your identifiers, you have to say
``__declspec(dllexport)``, as in ``void __declspec(dllexport) initspam(void)`` or
``PyObject __declspec(dllexport) *NiGetSpamData(void)``.
Developer Studio will throw in a lot of import libraries that you do not really
need, adding about 100K to your executable. To get rid of them, use the Project
Settings dialog, Link tab, to specify *ignore default libraries*. Add the
correct :file:`msvcrtxx.lib` to the list of libraries.


@@ -0,0 +1,550 @@
.. _glossary:
********
Glossary
********
.. if you add new entries, keep the alphabetical sorting!
.. glossary::
``>>>``
The default Python prompt of the interactive shell. Often seen for code
examples which can be executed interactively in the interpreter.
``...``
The default Python prompt of the interactive shell when entering code for
an indented code block or within a pair of matching left and right
delimiters (parentheses, square brackets or curly braces).
2to3
A tool that tries to convert Python 2.x code to Python 3.x code by
handling most of the incompatibilities which can be detected by parsing the
source and traversing the parse tree.
2to3 is available in the standard library as :mod:`lib2to3`; a standalone
entry point is provided as :file:`Tools/scripts/2to3`. See
:ref:`2to3-reference`.
abstract base class
Abstract Base Classes (abbreviated ABCs) complement :term:`duck-typing` by
providing a way to define interfaces when other techniques like :func:`hasattr`
would be clumsy. Python comes with many builtin ABCs for data structures
(in the :mod:`collections` module), numbers (in the :mod:`numbers`
module), and streams (in the :mod:`io` module). You can create your own
ABC with the :mod:`abc` module.
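As a rough illustration, the sketch below defines a small ABC and a concrete subclass. The names ``Shape``, ``Square`` and ``can_instantiate`` are invented for the example, and ``ABCMeta`` is called directly (instead of using metaclass syntax) only so that the same code runs on Python 2.6+ and Python 3.

```python
import abc


def _area(self):
    """Placeholder body; concrete subclasses must override it."""
    raise NotImplementedError


# Equivalent to a class statement that uses ABCMeta as its metaclass.
Shape = abc.ABCMeta("Shape", (object,), {"area": abc.abstractmethod(_area)})


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def can_instantiate(cls, *args):
    """Return True if cls(*args) succeeds, False if it raises TypeError."""
    try:
        cls(*args)
    except TypeError:
        return False
    return True
```

``Shape`` itself cannot be instantiated because ``area`` is abstract, while ``Square`` can, and ``isinstance(Square(3), Shape)`` is true.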
argument
A value passed to a function or method, assigned to a named local
variable in the function body. A function or method may have both
positional arguments and keyword arguments in its definition.
Positional and keyword arguments may be variable-length: ``*`` accepts (in
a function definition) or passes (in a call) several positional arguments
as a sequence, while ``**`` does the same for keyword arguments using a
dictionary.
Any expression may be used within the argument list, and the evaluated
value is passed to the local variable.
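A short sketch of the variable-length forms follows. The function name ``describe`` is made up for the example. Note that inside the function the extra positional arguments arrive as a tuple.

```python
def describe(first, *rest, **options):
    """Return the arguments exactly as the function body sees them."""
    return first, rest, options


# Definition side: extra positional and keyword arguments are collected.
collected = describe(1, 2, 3, sep="-")

# Call side: a sequence and a dictionary are unpacked into arguments.
values = [1, 2, 3]
settings = {"sep": "-"}
unpacked = describe(*values, **settings)
```

Both calls return ``(1, (2, 3), {'sep': '-'})``.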
attribute
A value associated with an object which is referenced by name using
dotted expressions. For example, if an object *o* has an attribute
*a* it would be referenced as *o.a*.
BDFL
Benevolent Dictator For Life, a.k.a. `Guido van Rossum
<http://www.python.org/~guido/>`_, Python's creator.
bytecode
Python source code is compiled into bytecode, the internal representation
of a Python program in the interpreter. The bytecode is also cached in
``.pyc`` and ``.pyo`` files so that executing the same file is faster the
second time (recompilation from source to bytecode can be avoided). This
"intermediate language" is said to run on a :term:`virtual machine`
that executes the machine code corresponding to each bytecode.
class
A template for creating user-defined objects. Class definitions
normally contain method definitions which operate on instances of the
class.
classic class
Any class which does not inherit from :class:`object`. See
:term:`new-style class`. Classic classes will be removed in Python 3.0.
coercion
The implicit conversion of an instance of one type to another during an
operation which involves two arguments of the same type. For example,
``int(3.15)`` converts the floating point number to the integer ``3``, but
in ``3+4.5``, each argument is of a different type (one int, one float),
and both must be converted to the same type before they can be added or it
will raise a ``TypeError``. Coercion between two operands can be
performed with the ``coerce`` builtin function; thus, ``3+4.5`` is
equivalent to calling ``operator.add(*coerce(3, 4.5))`` and results in
``operator.add(3.0, 4.5)``. Without coercion, all arguments of even
compatible types would have to be normalized to the same type by the
programmer, e.g., ``float(3)+4.5`` rather than just ``3+4.5``.
complex number
An extension of the familiar real number system in which all numbers are
expressed as a sum of a real part and an imaginary part. Imaginary
numbers are real multiples of the imaginary unit (the square root of
``-1``), often written ``i`` in mathematics or ``j`` in
engineering. Python has builtin support for complex numbers, which are
written with this latter notation; the imaginary part is written with a
``j`` suffix, e.g., ``3+1j``. To get access to complex equivalents of the
:mod:`math` module, use :mod:`cmath`. Use of complex numbers is a fairly
advanced mathematical feature. If you're not aware of a need for them,
it's almost certain you can safely ignore them.
context manager
An object which controls the environment seen in a :keyword:`with`
statement by defining :meth:`__enter__` and :meth:`__exit__` methods.
See :pep:`343`.
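A minimal sketch of the protocol, using an invented ``Tracker`` class that simply records when each method runs:

```python
class Tracker(object):
    """Record the order in which the with statement calls the two methods."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self                 # bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False                # do not suppress exceptions


with Tracker() as tracker:
    tracker.events.append("body")
```

After the block, ``tracker.events`` is ``['enter', 'body', 'exit']``. :meth:`__exit__` runs even if the body raises an exception.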
CPython
The canonical implementation of the Python programming language. The
term "CPython" is used when necessary to distinguish this
implementation from others such as Jython or IronPython.
decorator
A function returning another function, usually applied as a function
transformation using the ``@wrapper`` syntax. Common examples for
decorators are :func:`classmethod` and :func:`staticmethod`.
The decorator syntax is merely syntactic sugar; the following two
function definitions are semantically equivalent::
   def f(...):
       ...
   f = staticmethod(f)

   @staticmethod
   def f(...):
       ...
See :ref:`the documentation for function definition <function>` for more
about decorators.
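Decorators can also be written by hand. The following sketch shows one possible user-defined decorator. The names ``count_calls`` and ``double`` are hypothetical.

```python
import functools


def count_calls(func):
    """Wrap func so that the number of calls is recorded on the wrapper."""
    @functools.wraps(func)          # keep the original name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls                        # same as: double = count_calls(double)
def double(x):
    return 2 * x
```

Each call to ``double`` now goes through ``wrapper``, which increments ``double.calls`` before delegating to the original function.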
descriptor
Any *new-style* object which defines the methods :meth:`__get__`,
:meth:`__set__`, or :meth:`__delete__`. When a class attribute is a
descriptor, its special binding behavior is triggered upon attribute
lookup. Normally, using *a.b* to get, set or delete an attribute looks up
the object named *b* in the class dictionary for *a*, but if *b* is a
descriptor, the respective descriptor method gets called. Understanding
descriptors is a key to a deep understanding of Python because they are
the basis for many features including functions, methods, properties,
class methods, static methods, and reference to super classes.
For more information about descriptors' methods, see :ref:`descriptors`.
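As an illustration, the sketch below uses a descriptor to validate assignments to an attribute. The class names ``Positive`` and ``Account`` are invented for the example.

```python
class Positive(object):
    """Descriptor that only accepts values greater than zero."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:        # accessed on the class, not an instance
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("%s must be positive" % self.name)
        instance.__dict__[self.name] = value


class Account(object):
    balance = Positive("balance")   # class attribute that is a descriptor

    def __init__(self, balance):
        self.balance = balance      # goes through Positive.__set__
```

Because ``Positive`` defines :meth:`__set__`, it takes precedence over the instance dictionary, so every read and write of ``balance`` is routed through the descriptor.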
dictionary
An associative array, where arbitrary keys are mapped to values. The use
of :class:`dict` closely resembles that for :class:`list`, but the keys can
be any object with a :meth:`__hash__` function, not just integers.
Called a hash in Perl.
docstring
A string literal which appears as the first expression in a class,
function or module. While ignored when the suite is executed, it is
recognized by the compiler and put into the :attr:`__doc__` attribute
of the enclosing class, function or module. Since it is available via
introspection, it is the canonical place for documentation of the
object.
duck-typing
A pythonic programming style which determines an object's type by inspection
of its method or attribute signature rather than by explicit relationship
to some type object ("If it looks like a duck and quacks like a duck, it
must be a duck."). By emphasizing interfaces rather than specific types,
well-designed code improves its flexibility by allowing polymorphic
substitution. Duck-typing avoids tests using :func:`type` or
:func:`isinstance`. (Note, however, that duck-typing can be complemented
with abstract base classes.) Instead, it typically employs :func:`hasattr`
tests or :term:`EAFP` programming.
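A small sketch of the idea: the function below never checks the type of its argument and only relies on a ``readline`` method being present. The names ``first_line`` and ``Canned`` are made up for this example.

```python
import io


def first_line(source):
    """Return the first line of any object that offers readline()."""
    return source.readline().rstrip("\n")


class Canned(object):
    """Not a file at all, but it quacks like one."""

    def readline(self):
        return "hello\n"


from_stream = first_line(io.StringIO(u"spam\neggs\n"))
from_canned = first_line(Canned())
```

Both calls succeed, giving ``spam`` and ``hello``, even though the two objects share no common base class other than :class:`object`.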
EAFP
Easier to ask for forgiveness than permission. This common Python coding
style assumes the existence of valid keys or attributes and catches
exceptions if the assumption proves false. This clean and fast style is
characterized by the presence of many :keyword:`try` and :keyword:`except`
statements. The technique contrasts with the :term:`LBYL` style
common to many other languages such as C.
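The two styles can be compared side by side in a short sketch. The function names are hypothetical, and both functions return the same results.

```python
def lookup_lbyl(mapping, key, default=None):
    """Look before you leap: test for the key first."""
    if key in mapping:
        return mapping[key]
    return default


def lookup_eafp(mapping, key, default=None):
    """Easier to ask forgiveness: just try, and handle the failure."""
    try:
        return mapping[key]
    except KeyError:
        return default
```

In the EAFP version the common case (the key exists) needs a single lookup, and there is no gap between the test and the use in which the mapping could change.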
expression
A piece of syntax which can be evaluated to some value. In other words,
an expression is an accumulation of expression elements like literals, names,
attribute access, operators or function calls which all return a value.
In contrast to many other languages, not all language constructs are expressions.
There are also :term:`statement`\s which cannot be used as expressions,
such as :keyword:`print` or :keyword:`if`. Assignments are also statements,
not expressions.
extension module
A module written in C or C++, using Python's C API to interact with the core and
with user code.
finder
An object that tries to find the :term:`loader` for a module. It must
implement a method named :meth:`find_module`. See :pep:`302` for
details.
function
A series of statements which returns some value to a caller. It can also
be passed zero or more arguments which may be used in the execution of
the body. See also :term:`argument` and :term:`method`.
__future__
A pseudo module which programmers can use to enable new language features
which are not compatible with the current interpreter. For example, the
expression ``11/4`` currently evaluates to ``2``. If the module in which
it is executed had enabled *true division* by executing::
from __future__ import division
the expression ``11/4`` would evaluate to ``2.75``. By importing the
:mod:`__future__` module and evaluating its variables, you can see when a
new feature was first added to the language and when it will become the
default::
>>> import __future__
>>> __future__.division
_Feature((2, 2, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 8192)
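The two version tuples shown in that output can also be read programmatically. A minimal sketch, assuming only the documented ``getOptionalRelease`` and ``getMandatoryRelease`` methods of the feature objects:

```python
import __future__

feature = __future__.division

# First release in which "from __future__ import division" was accepted.
optional = feature.getOptionalRelease()

# Release in which the feature becomes the default behaviour.
mandatory = feature.getMandatoryRelease()

print("optional in %d.%d, mandatory in %d.%d"
      % (optional[0], optional[1], mandatory[0], mandatory[1]))
```

For ``division`` this reports that the feature was optional in 2.2 and becomes mandatory in 3.0.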
garbage collection
The process of freeing memory when it is not used anymore. Python
performs garbage collection via reference counting and a cyclic garbage
collector that is able to detect and break reference cycles.
generator
A function which returns an iterator. It looks like a normal function
except that values are returned to the caller using a :keyword:`yield`
statement instead of a :keyword:`return` statement. Generator functions
often contain one or more :keyword:`for` or :keyword:`while` loops which
:keyword:`yield` elements back to the caller. The function execution is
stopped at the :keyword:`yield` keyword (returning the result) and is
resumed there when the next element is requested by calling the
:meth:`next` method of the returned iterator.
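A minimal sketch of a generator function follows. The name ``countdown`` is made up for the example, and the builtin :func:`next` (available from Python 2.6) is used to request values from the returned iterator.

```python
def countdown(n):
    """Yield n, n-1, ..., 1."""
    while n > 0:
        yield n          # execution pauses here until the next value is requested
        n -= 1


numbers = countdown(3)
first = next(numbers)        # runs the body up to the first yield
remaining = list(numbers)    # resumes the body until it finishes
```

Here ``first`` is ``3`` and ``remaining`` is ``[2, 1]``. Calling ``countdown(3)`` on its own executes none of the body.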
.. index:: single: generator expression
generator expression
An expression that returns a generator. It looks like a normal expression
followed by a :keyword:`for` expression defining a loop variable, range,
and an optional :keyword:`if` expression. The combined expression
generates values for an enclosing function::
>>> sum(i*i for i in range(10)) # sum of squares 0, 1, 4, ... 81
285
GIL
See :term:`global interpreter lock`.
global interpreter lock
The lock used by Python threads to assure that only one thread
executes in the :term:`CPython` :term:`virtual machine` at a time.
This simplifies the CPython implementation by assuring that no two
threads can access the same memory at the same time. Locking the
entire interpreter makes it easier for the interpreter to be
multi-threaded, at the expense of much of the parallelism afforded by
multi-processor machines. Efforts have been made in the past to
create a "free-threaded" interpreter (one which locks shared data at a
much finer granularity), but so far none have been successful because
performance suffered in the common single-processor case.
hashable
An object is *hashable* if it has a hash value which never changes during
its lifetime (it needs a :meth:`__hash__` method), and can be compared to
other objects (it needs an :meth:`__eq__` or :meth:`__cmp__` method).
Hashable objects which compare equal must have the same hash value.
Hashability makes an object usable as a dictionary key and a set member,
because these data structures use the hash value internally.
All of Python's immutable built-in objects are hashable, while no mutable
containers (such as lists or dictionaries) are. Objects which are
instances of user-defined classes are hashable by default; they all
compare unequal, and their hash value is their :func:`id`.
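The following sketch shows a user-defined class that is hashable by value rather than by identity, together with a helper that tests hashability. The names ``Point`` and ``is_hashable`` are invented for the example.

```python
class Point(object):
    """Two points with the same coordinates compare equal and hash alike."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __ne__(self, other):        # needed explicitly on Python 2
        return not self == other

    def __hash__(self):
        return hash((self.x, self.y))


def is_hashable(obj):
    """Return True if hash(obj) works, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True
```

With these definitions a ``Point`` can be used as a dictionary key and found again through an equal but distinct instance, whereas a list cannot be used as a key at all. The coordinates should not be changed while a point is stored in a dictionary or set, since that would change its hash value.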
IDLE
An Integrated Development Environment for Python. IDLE is a basic editor
and interpreter environment which ships with the standard distribution of
Python. Good for beginners, it also serves as clear example code for
those wanting to implement a moderately sophisticated, multi-platform GUI
application.
immutable
An object with a fixed value. Immutable objects include numbers, strings and
tuples. Such an object cannot be altered. A new object has to
be created if a different value has to be stored. They play an important
role in places where a constant hash value is needed, for example as a key
in a dictionary.
integer division
Mathematical division discarding any remainder. For example, the
expression ``11/4`` currently evaluates to ``2`` in contrast to the
``2.75`` returned by float division. Also called *floor division*.
When dividing two integers the outcome will always be another integer
(having the floor function applied to it). However, if one of the operands
is another numeric type (such as a :class:`float`), the result will be
coerced (see :term:`coercion`) to a common type. For example, an integer
divided by a float will result in a float value, possibly with a decimal
fraction. Integer division can be forced by using the ``//`` operator
instead of the ``/`` operator. See also :term:`__future__`.
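A few cases of the ``//`` operator, collected in a sketch (the function name ``floor_div_examples`` is hypothetical):

```python
def floor_div_examples():
    """Return three results of the // operator."""
    return (
        11 // 4,      # two integers: integer result
        -11 // 4,     # the floor is taken, so the result rounds down
        11 // 4.0,    # a float operand: the result is a float
    )
```

The function returns ``(2, -3, 2.0)``. The second case shows that the result is rounded towards negative infinity rather than towards zero.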
importer
An object that both finds and loads a module; both a
:term:`finder` and :term:`loader` object.
interactive
Python has an interactive interpreter which means you can enter
statements and expressions at the interpreter prompt, immediately
execute them and see their results. Just launch ``python`` with no
arguments (possibly by selecting it from your computer's main
menu). It is a very powerful way to test out new ideas or inspect
modules and packages (remember ``help(x)``).
interpreted
Python is an interpreted language, as opposed to a compiled one,
though the distinction can be blurry because of the presence of the
bytecode compiler. This means that source files can be run directly
without explicitly creating an executable which is then run.
Interpreted languages typically have a shorter development/debug cycle
than compiled ones, though their programs generally also run more
slowly. See also :term:`interactive`.
iterable
A container object capable of returning its members one at a
time. Examples of iterables include all sequence types (such as
:class:`list`, :class:`str`, and :class:`tuple`) and some non-sequence
types like :class:`dict` and :class:`file` and objects of any classes you
define with an :meth:`__iter__` or :meth:`__getitem__` method. Iterables
can be used in a :keyword:`for` loop and in many other places where a
sequence is needed (:func:`zip`, :func:`map`, ...). When an iterable
object is passed as an argument to the builtin function :func:`iter`, it
returns an iterator for the object. This iterator is good for one pass
over the set of values. When using iterables, it is usually not necessary
to call :func:`iter` or deal with iterator objects yourself. The ``for``
statement does that automatically for you, creating a temporary unnamed
variable to hold the iterator for the duration of the loop. See also
:term:`iterator`, :term:`sequence`, and :term:`generator`.
iterator
An object representing a stream of data. Repeated calls to the iterator's
:meth:`next` method return successive items in the stream. When no more
data are available a :exc:`StopIteration` exception is raised instead. At
this point, the iterator object is exhausted and any further calls to its
:meth:`next` method just raise :exc:`StopIteration` again. Iterators are
required to have an :meth:`__iter__` method that returns the iterator
object itself so every iterator is also iterable and may be used in most
places where other iterables are accepted. One notable exception is code
which attempts multiple iteration passes. A container object (such as a
:class:`list`) produces a fresh new iterator each time you pass it to the
:func:`iter` function or use it in a :keyword:`for` loop. Attempting this
with an iterator will just return the same exhausted iterator object used
in the previous iteration pass, making it appear like an empty container.
More information can be found in :ref:`typeiter`.
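The difference between a container and an iterator on a second pass can be sketched as follows. The example assumes the built-in ``next()`` function (available from Python 2.6) instead of calling the iterator's method directly; the variable names are illustrative.

```python
numbers = [1, 2, 3]

# A container hands out a fresh iterator on every iter() call.
first_pass = list(iter(numbers))
second_pass = list(iter(numbers))

# An iterator returns itself from iter(), so a second pass finds it exhausted.
stream = iter(numbers)
same_object = iter(stream) is stream
consumed = list(stream)
leftover = list(stream)     # looks like an empty container

# Once exhausted, further calls keep raising StopIteration.
try:
    next(stream)
    raised = False
except StopIteration:
    raised = True
```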
keyword argument
Arguments which are preceded with a ``variable_name=`` in the call.
The variable name designates the local name in the function to which the
value is assigned. ``**`` is used to accept or pass a dictionary of
keyword arguments. See :term:`argument`.
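A short sketch of both uses of ``**``. The functions ``connect`` and ``collect`` are hypothetical and exist only to show the call syntax.

```python
def connect(host, port=80, timeout=10):
    """Hypothetical function used only to illustrate the call syntax."""
    return (host, port, timeout)

# Each name= in the call binds to the local name of the same spelling.
explicit = connect(host="example.org", timeout=5)

# ** in a call unpacks a dictionary into keyword arguments.
options = {"port": 8080, "timeout": 30}
unpacked = connect("example.org", **options)

def collect(**kwargs):
    # ** in a definition gathers unmatched keyword arguments into a dict.
    return kwargs

gathered = collect(colour="red", size=3)
```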
lambda
An anonymous inline function consisting of a single :term:`expression`
which is evaluated when the function is called. The syntax to create
a lambda function is ``lambda [arguments]: expression``.
LBYL
Look before you leap. This coding style explicitly tests for
pre-conditions before making calls or lookups. This style contrasts with
the :term:`EAFP` approach and is characterized by the presence of many
:keyword:`if` statements.
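The contrast between the two styles can be sketched with a dictionary lookup. Both helper functions are hypothetical and are written here only for comparison.

```python
def lookup_lbyl(mapping, key, default=None):
    # LBYL: test the pre-condition first, then perform the lookup.
    if key in mapping:
        return mapping[key]
    return default

def lookup_eafp(mapping, key, default=None):
    # EAFP: attempt the lookup and handle the failure afterwards.
    try:
        return mapping[key]
    except KeyError:
        return default

ages = {"alice": 30}
```

Both functions return the same results here; they differ only in where the check happens.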
list
A built-in Python :term:`sequence`. Despite its name it is more akin
to an array in other languages than to a linked list since access to
elements is O(1).
list comprehension
A compact way to process all or part of the elements in a sequence and
return a list with the results. ``result = ["0x%02x" % x for x in
range(256) if x % 2 == 0]`` generates a list of strings containing
even hex numbers (0x..) in the range from 0 to 255. The :keyword:`if`
clause is optional. If omitted, all elements in ``range(256)`` are
processed.
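For comparison, the comprehension above can be placed next to an explicit loop that builds the same list. The names ``expected`` and ``everything`` are illustrative.

```python
result = ["0x%02x" % x for x in range(256) if x % 2 == 0]

# Equivalent explicit loop, shown for comparison.
expected = []
for x in range(256):
    if x % 2 == 0:
        expected.append("0x%02x" % x)

# Without the if clause every element of range(256) is processed.
everything = ["0x%02x" % x for x in range(256)]
```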
loader
An object that loads a module. It must define a method named
:meth:`load_module`. A loader is typically returned by a
:term:`finder`. See :pep:`302` for details.
mapping
A container object (such as :class:`dict`) which supports arbitrary key
lookups using the special method :meth:`__getitem__`.
metaclass
The class of a class. Class definitions create a class name, a class
dictionary, and a list of base classes. The metaclass is responsible for
taking those three arguments and creating the class. Most object oriented
programming languages provide a default implementation. What makes Python
special is that it is possible to create custom metaclasses. Most users
never need this tool, but when the need arises, metaclasses can provide
powerful, elegant solutions. They have been used for logging attribute
access, adding thread-safety, tracking object creation, implementing
singletons, and many other tasks.
More information can be found in :ref:`metaclasses`.
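A minimal sketch of a custom metaclass that tracks class creation. ``Registry`` and ``Plugin`` are hypothetical names. The metaclass is called directly with the three arguments here, because the syntax for attaching a metaclass to a class statement differs between Python 2.x and 3.x.

```python
class Registry(type):
    """Hypothetical metaclass that records every class it creates."""
    created = []

    def __new__(mcls, name, bases, namespace):
        cls = super(Registry, mcls).__new__(mcls, name, bases, namespace)
        Registry.created.append(name)
        return cls

# Class name, tuple of base classes, class dictionary: the same three
# arguments a class definition hands to its metaclass.
Plugin = Registry("Plugin", (object,), {"greeting": "hello"})
```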
method
A function which is defined inside a class body. If called as an attribute
of an instance of that class, the method will get the instance object as
its first :term:`argument` (which is usually called ``self``).
See :term:`function` and :term:`nested scope`.
mutable
Mutable objects can change their value but keep their :func:`id`. See
also :term:`immutable`.
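A short illustration with a list, assuming nothing beyond the built-in types:

```python
items = [1, 2]
identity = id(items)

# Changing the value in place leaves the identity untouched.
items.append(3)
same_object = id(items) == identity
```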
named tuple
Any tuple-like class whose indexable elements are also accessible using
named attributes (for example, :func:`time.localtime` returns a
tuple-like object where the *year* is accessible either with an
index such as ``t[0]`` or with a named attribute like ``t.tm_year``).
A named tuple can be a built-in type such as :class:`time.struct_time`,
or it can be created with a regular class definition. A full featured
named tuple can also be created with the factory function
:func:`collections.namedtuple`. The latter approach automatically
provides extra features such as a self-documenting representation like
``Employee(name='jones', title='programmer')``.
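Both kinds of named tuple mentioned above can be sketched as follows; the field names of ``Employee`` are taken from the representation shown in the entry.

```python
import time
from collections import namedtuple

# A built-in named tuple: index and attribute reach the same field.
t = time.localtime()
same_year = t[0] == t.tm_year

# A full-featured named tuple from the factory function.
Employee = namedtuple("Employee", ["name", "title"])
e = Employee(name="jones", title="programmer")
text = repr(e)
```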
namespace
The place where a variable is stored. Namespaces are implemented as
dictionaries. There are the local, global and builtin namespaces as well
as nested namespaces in objects (in methods). Namespaces support
modularity by preventing naming conflicts. For instance, the functions
:func:`__builtin__.open` and :func:`os.open` are distinguished by their
namespaces. Namespaces also aid readability and maintainability by making
it clear which module implements a function. For instance, writing
:func:`random.seed` or :func:`itertools.izip` makes it clear that those
functions are implemented by the :mod:`random` and :mod:`itertools`
modules, respectively.
nested scope
The ability to refer to a variable in an enclosing definition. For
instance, a function defined inside another function can refer to
variables in the outer function. Note that nested scopes work only for
reference and not for assignment which will always write to the innermost
scope. In contrast, local variables both read and write in the innermost
scope. Likewise, global variables read and write to the global namespace.
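The difference between reference and assignment can be sketched with two small functions; ``make_reader`` and ``make_writer`` are hypothetical names.

```python
def make_reader():
    count = 10
    def read():
        # Reference: the inner function sees the enclosing variable.
        return count
    return read

def make_writer():
    count = 10
    def write():
        # Assignment: this binds a new name local to write().
        count = 99
        return count
    inner_value = write()
    return inner_value, count   # the outer count is unchanged
```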
new-style class
Any class which inherits from :class:`object`. This includes all built-in
types like :class:`list` and :class:`dict`. Only new-style classes can
use Python's newer, versatile features like :attr:`__slots__`,
descriptors, properties, and :meth:`__getattribute__`.
More information can be found in :ref:`newstyle`.
object
Any data with state (attributes or value) and defined behavior
(methods). Also the ultimate base class of any :term:`new-style
class`.
positional argument
The arguments assigned to local names inside a function or method,
determined by the order in which they were given in the call. ``*`` is
used to either accept multiple positional arguments (when in the
definition), or pass several arguments as a list to a function. See
:term:`argument`.
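Both uses of ``*`` can be sketched as follows; ``total`` is a hypothetical function.

```python
def total(first, *rest):
    # * in a definition collects extra positional arguments in a tuple.
    return first + sum(rest)

values = [1, 2, 3, 4]

# * in a call passes the items of a list as separate positional arguments.
unpacked_sum = total(*values)
direct_sum = total(1, 2, 3, 4)
```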
Python 3000
Nickname for the next major Python version, 3.0 (coined long ago
when the release of version 3 was something in the distant future). This
is also abbreviated "Py3k".
Pythonic
An idea or piece of code which closely follows the most common idioms
of the Python language, rather than implementing code using concepts
common to other languages. For example, a common idiom in Python is
to loop over all elements of an iterable using a :keyword:`for`
statement. Many other languages don't have this type of construct, so
people unfamiliar with Python sometimes use a numerical counter instead::
   for i in range(len(food)):
       print food[i]
As opposed to the cleaner, Pythonic method::
   for piece in food:
       print piece
reference count
The number of references to an object. When the reference count of an
object drops to zero, it is deallocated. Reference counting is
generally not visible to Python code, but it is a key element of the
:term:`CPython` implementation. The :mod:`sys` module defines a
:func:`getrefcount` function that programmers can call to return the
reference count for a particular object.
__slots__
A declaration inside a :term:`new-style class` that saves memory by
pre-declaring space for instance attributes and eliminating instance
dictionaries. Though popular, the technique is somewhat tricky to get
right and is best reserved for rare cases where there are large numbers of
instances in a memory-critical application.
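A minimal sketch of the declaration and one of its consequences; ``Point`` is a hypothetical class.

```python
class Point(object):
    # Only these two attributes get storage; no per-instance dictionary.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
has_dict = hasattr(p, "__dict__")

try:
    p.z = 3                 # not declared in __slots__
    assigned = True
except AttributeError:
    assigned = False
```

Rejecting undeclared attributes is one of the side effects that makes the technique tricky to retrofit onto existing classes.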
sequence
An :term:`iterable` which supports efficient element access using integer
indices via the :meth:`__getitem__` special method and defines a
:meth:`__len__` method that returns the length of the sequence.
Some built-in sequence types are :class:`list`, :class:`str`,
:class:`tuple`, and :class:`unicode`. Note that :class:`dict` also
supports :meth:`__getitem__` and :meth:`__len__`, but is considered a
mapping rather than a sequence because the lookups use arbitrary
:term:`immutable` keys rather than integers.
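A user-defined class can act as a sequence by supplying the two special methods. The sketch below is a deliberately small, hypothetical example that handles only non-negative integer indices.

```python
class Countdown(object):
    """Hypothetical sequence: Countdown(3) behaves like [3, 2, 1]."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __getitem__(self, index):
        if not 0 <= index < self.start:
            raise IndexError(index)
        return self.start - index

c = Countdown(3)
items = list(c)   # iteration falls back to __getitem__ with 0, 1, 2, ...
```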
slice
An object usually containing a portion of a :term:`sequence`. A slice is
created using the subscript notation, ``[]`` with colons between numbers
when several are given, such as in ``variable_name[1:3:5]``. The bracket
(subscript) notation uses :class:`slice` objects internally (or in older
versions, :meth:`__getslice__` and :meth:`__setslice__`).
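The slice object behind the bracket notation can be made visible with a class whose ``__getitem__`` simply returns its argument. ``Probe`` is a hypothetical helper written for this illustration.

```python
class Probe(object):
    """Hypothetical class that returns whatever index object it is given."""

    def __getitem__(self, index):
        return index

# Subscript notation with a step is passed on as a slice object.
captured = Probe()[1:3:5]

letters = list("abcdefgh")
by_notation = letters[1:7:2]
by_object = letters[slice(1, 7, 2)]
```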
special method
A method that is called implicitly by Python to execute a certain
operation on a type, such as addition. Such methods have names starting
and ending with double underscores. Special methods are documented in
:ref:`specialnames`.
statement
A statement is part of a suite (a "block" of code). A statement is either
an :term:`expression` or one of several constructs with a keyword, such
as :keyword:`if`, :keyword:`while` or :keyword:`print`.
triple-quoted string
A string which is bound by three instances of either a quotation mark
(") or an apostrophe ('). While they don't provide any functionality
not available with single-quoted strings, they are useful for a number
of reasons. They allow you to include unescaped single and double
quotes within a string and they can span multiple lines without the
use of the continuation character, making them especially useful when
writing docstrings.
type
The type of a Python object determines what kind of object it is; every
object has a type. An object's type is accessible as its
:attr:`__class__` attribute or can be retrieved with ``type(obj)``.
virtual machine
A computer defined entirely in software. Python's virtual machine
executes the :term:`bytecode` emitted by the bytecode compiler.
Zen of Python
Listing of Python design principles and philosophies that are helpful in
understanding and using the language. The listing can be found by typing
"``import this``" at the interactive prompt.


@@ -0,0 +1,356 @@
*************************
Python Advocacy HOWTO
*************************
:Author: A.M. Kuchling
:Release: 0.03
.. topic:: Abstract
It's usually difficult to get your management to accept open source software,
and Python is no exception to this rule. This document discusses reasons to use
Python, strategies for winning acceptance, facts and arguments you can use, and
cases where you *shouldn't* try to use Python.
Reasons to Use Python
=====================
There are several reasons to incorporate a scripting language into your
development process, and this section will discuss them, and why Python has some
properties that make it a particularly good choice.
Programmability
---------------
Programs are often organized in a modular fashion. Lower-level operations are
grouped together, and called by higher-level functions, which may in turn be
used as basic operations by still further upper levels.
For example, the lowest level might define a very low-level set of functions for
accessing a hash table. The next level might use hash tables to store the
headers of a mail message, mapping a header name like ``Date`` to a value such
as ``Tue, 13 May 1997 20:00:54 -0400``. A yet higher level may operate on
message objects, without knowing or caring that message headers are stored in a
hash table, and so forth.
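As a rough sketch of this layering, the hypothetical ``Message`` class below keeps its headers in a dictionary (Python's built-in hash table), while the higher-level ``describe`` function works only with message objects. All names are invented for illustration.

```python
class Message(object):
    """Hypothetical message type layered on a plain dictionary."""

    def __init__(self):
        # Lowest level: the hash table is an implementation detail.
        self._headers = {}

    def set_header(self, name, value):
        self._headers[name] = value

    def get_header(self, name, default=None):
        return self._headers.get(name, default)

def describe(message):
    # Higher level: works with message objects only, never the dict.
    return "Sent on %s" % message.get_header("Date", "an unknown date")

msg = Message()
msg.set_header("Date", "Tue, 13 May 1997 20:00:54 -0400")
summary = describe(msg)
```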
Often, the lowest levels do very simple things; they implement a data structure
such as a binary tree or hash table, or they perform some simple computation,
such as converting a date string to a number. The higher levels then contain
logic connecting these primitive operations. Using this approach, the primitives
can be seen as basic building blocks which are then glued together to produce
the complete product.
Why is this design approach relevant to Python? Because Python is well suited
to functioning as such a glue language. A common approach is to write a Python
module that implements the lower level operations; for the sake of speed, the
implementation might be in C, Java, or even Fortran. Once the primitives are
available to Python programs, the logic underlying higher level operations is
written in the form of Python code. The high-level logic is then more
understandable, and easier to modify.
John Ousterhout wrote a paper that explains this idea at greater length,
entitled "Scripting: Higher Level Programming for the 21st Century". I
recommend that you read this paper; see the references for the URL. Ousterhout
is the inventor of the Tcl language, and therefore argues that Tcl should be
used for this purpose; he only briefly refers to other languages such as Python,
Perl, and Lisp/Scheme, but in reality, Ousterhout's argument applies to
scripting languages in general, since you could equally write extensions for any
of the languages mentioned above.
Prototyping
-----------
In *The Mythical Man-Month*, Frederick Brooks suggests the following rule when
planning software projects: "Plan to throw one away; you will anyway." Brooks
is saying that the first attempt at a software design often turns out to be
wrong; unless the problem is very simple or you're an extremely good designer,
you'll find that new requirements and features become apparent once development
has actually started. If these new requirements can't be cleanly incorporated
into the program's structure, you're presented with two unpleasant choices:
hammer the new features into the program somehow, or scrap everything and write
a new version of the program, taking the new features into account from the
beginning.
Python provides you with a good environment for quickly developing an initial
prototype. That lets you get the overall program structure and logic right, and
you can fine-tune small details in the fast development cycle that Python
provides. Once you're satisfied with the GUI interface or program output, you
can translate the Python code into C++, Fortran, Java, or some other compiled
language.
Prototyping means you have to be careful not to use too many Python features
that are hard to implement in your other language. Using ``eval()``, or regular
expressions, or the :mod:`pickle` module, means that you're going to need C or
Java libraries for formula evaluation, regular expressions, and serialization,
for example. But it's not hard to avoid such tricky code, and in the end the
translation usually isn't very difficult. The resulting code can be rapidly
debugged, because any serious logical errors will have been removed from the
prototype, leaving only more minor slip-ups in the translation to track down.
This strategy builds on the earlier discussion of programmability. Using Python
as glue to connect lower-level components has obvious relevance for constructing
prototype systems. In this way Python can help you with development, even if
end users never come in contact with Python code at all. If the performance of
the Python version is adequate and corporate politics allow it, you may not need
to do a translation into C or Java, but it can still be faster to develop a
prototype and then translate it, instead of attempting to produce the final
version immediately.
One example of this development strategy is Microsoft Merchant Server. Version
1.0 was written in pure Python, by a company that subsequently was purchased by
Microsoft. Version 2.0 began to translate the code into C++, shipping with some
C++ code and some Python code. Version 3.0 didn't contain any Python at all; all
the code had been translated into C++. Even though the product doesn't contain
a Python interpreter, the Python language has still served a useful purpose by
speeding up development.
This is a very common use for Python. Past conference papers have also
described this approach for developing high-level numerical algorithms; see
David M. Beazley and Peter S. Lomdahl's paper "Feeding a Large-scale Physics
Application to Python" in the references for a good example. If an algorithm's
basic operations are things like "Take the inverse of this 4000x4000 matrix",
and are implemented in some lower-level language, then Python has almost no
additional performance cost; the extra time required for Python to evaluate an
expression like ``m.invert()`` is dwarfed by the cost of the actual computation.
It's particularly good for applications where seemingly endless tweaking is
required to get things right. GUI interfaces and Web sites are prime examples.
The Python code is also shorter and faster to write (once you're familiar with
Python), so it's easier to throw it away if you decide your approach was wrong;
if you'd spent two weeks working on it instead of just two hours, you might
waste time trying to patch up what you've got out of a natural reluctance to
admit that those two weeks were wasted. Truthfully, those two weeks haven't
been wasted, since you've learnt something about the problem and the technology
you're using to solve it, but it's human nature to view this as a failure of
some sort.
Simplicity and Ease of Understanding
------------------------------------
Python is definitely *not* a toy language that's only usable for small tasks.
The language features are general and powerful enough to enable it to be used
for many different purposes. It's useful at the small end, for 10- or 20-line
scripts, but it also scales up to larger systems that contain thousands of lines
of code.
However, this expressiveness doesn't come at the cost of an obscure or tricky
syntax. While Python has some dark corners that can lead to obscure code, there
are relatively few such corners, and proper design can isolate their use to only
a few classes or modules. It's certainly possible to write confusing code by
using too many features with too little concern for clarity, but most Python
code can look a lot like a slightly-formalized version of human-understandable
pseudocode.
In *The New Hacker's Dictionary*, Eric S. Raymond gives the following definition
for "compact":
.. epigraph::
Compact *adj.* Of a design, describes the valuable property that it can all be
apprehended at once in one's head. This generally means the thing created from
the design can be used with greater facility and fewer errors than an equivalent
tool that is not compact. Compactness does not imply triviality or lack of
power; for example, C is compact and FORTRAN is not, but C is more powerful than
FORTRAN. Designs become non-compact through accreting features and cruft that
don't merge cleanly into the overall design scheme (thus, some fans of Classic C
maintain that ANSI C is no longer compact).
(From http://www.catb.org/~esr/jargon/html/C/compact.html)
In this sense of the word, Python is quite compact, because the language has
just a few ideas, which are used in lots of places. Take namespaces, for
example. Import a module with ``import math``, and you create a new namespace
called ``math``. Classes are also namespaces that share many of the properties
of modules, and have a few of their own; for example, you can create instances
of a class. Instances? They're yet another namespace. Namespaces are currently
implemented as Python dictionaries, so they have the same methods as the
standard dictionary data type: ``.keys()`` returns all the keys, and so forth.
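The idea can be tried out with the built-in ``vars()`` function, which returns the dictionary (or a dictionary-like view) behind a module, class or instance. ``Config`` and ``settings`` are hypothetical names used only for this sketch.

```python
import math

class Config(object):
    debug = False

settings = Config()
settings.verbose = True

# Modules, classes and instances each expose their namespace with keys().
module_names = vars(math).keys()
class_names = vars(Config).keys()
instance_names = vars(settings).keys()
```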
This simplicity arises from Python's development history. The language syntax
derives from different sources; ABC, a relatively obscure teaching language, is
one primary influence, and Modula-3 is another. (For more information about ABC
and Modula-3, consult their respective Web sites at http://www.cwi.nl/~steven/abc/
and http://www.m3.org.) Other features have come from C, Icon,
Algol-68, and even Perl. Python hasn't really innovated very much, but instead
has tried to keep the language small and easy to learn, building on ideas that
have been tried in other languages and found useful.
Simplicity is a virtue that should not be underestimated. It lets you learn the
language more quickly, and then rapidly write code -- code that often works the
first time you run it.
Java Integration
----------------
If you're working with Java, Jython (http://www.jython.org/) is definitely worth
your attention. Jython is a re-implementation of Python in Java that compiles
Python code into Java bytecodes. The resulting environment has very tight,
almost seamless, integration with Java. It's trivial to access Java classes
from Python, and you can write Python classes that subclass Java classes.
Jython can be used for prototyping Java applications in much the same way
CPython is used, and it can also be used for test suites for Java code, or
embedded in a Java application to add scripting capabilities.
Arguments and Rebuttals
=======================
Let's say that you've decided upon Python as the best choice for your
application. How can you convince your management, or your fellow developers,
to use Python? This section lists some common arguments against using Python,
and provides some possible rebuttals.
**Python is freely available software that doesn't cost anything. How good can
it be?**
Very good, indeed. These days Linux and Apache, two other pieces of open source
software, are becoming more respected as alternatives to commercial software,
but Python hasn't had all the publicity.
Python has been around for several years, with many users and developers.
Accordingly, the interpreter has been used by many people, and has gotten most
of the bugs shaken out of it. While bugs are still discovered at intervals,
they're usually either quite obscure (they'd have to be, for no one to have run
into them before) or they involve interfaces to external libraries. The
internals of the language itself are quite stable.
Having the source code should be viewed as making the software available for
peer review; people can examine the code, suggest (and implement) improvements,
and track down bugs. To find out more about the idea of open source code, along
with arguments and case studies supporting it, go to http://www.opensource.org.
**Who's going to support it?**
Python has a sizable community of developers, and the number is still growing.
The Internet community surrounding the language is an active one, and is worth
being considered another one of Python's advantages. Most questions posted to
the comp.lang.python newsgroup are quickly answered by someone.
Should you need to dig into the source code, you'll find it's clear and
well-organized, so it's not very difficult to write extensions and track down
bugs yourself. If you'd prefer to pay for support, there are companies and
individuals who offer commercial support for Python.
**Who uses Python for serious work?**
Lots of people; one interesting thing about Python is the surprising diversity
of applications that it's been used for. People are using Python to:
* Run Web sites
* Write GUI interfaces
* Control number-crunching code on supercomputers
* Make a commercial application scriptable by embedding the Python interpreter
inside it
* Process large XML data sets
* Build test suites for C or Java code
Whatever your application domain is, there's probably someone who's used Python
for something similar. Yet, despite being useable for such high-end
applications, Python's still simple enough to use for little jobs.
See http://wiki.python.org/moin/OrganizationsUsingPython for a list of some of
the organizations that use Python.
**What are the restrictions on Python's use?**
They're practically nonexistent. Consult the :file:`Misc/COPYRIGHT` file in the
source distribution, or the section :ref:`history-and-license` for the full
language, but it boils down to three conditions:
* You have to leave the copyright notice on the software; if you don't include
the source code in a product, you have to put the copyright notice in the
supporting documentation.
* Don't claim that the institutions that have developed Python endorse your
product in any way.
* If something goes wrong, you can't sue for damages. Practically all software
licenses contain this condition.
Notice that you don't have to provide source code for anything that contains
Python or is built with it. Also, the Python interpreter and accompanying
documentation can be modified and redistributed in any way you like, and you
don't have to pay anyone any licensing fees at all.
**Why should we use an obscure language like Python instead of well-known
language X?**
I hope this HOWTO, and the documents listed in the final section, will help
convince you that Python isn't obscure, and has a healthily growing user base.
One word of advice: always present Python's positive advantages, instead of
concentrating on language X's failings. People want to know why a solution is
good, rather than why all the other solutions are bad. So instead of attacking
a competing solution on various grounds, simply show how Python's virtues can
help.
Useful Resources
================
http://www.pythonology.com/success
The Python Success Stories are a collection of stories from successful users of
Python, with the emphasis on business and corporate users.
.. http://www.fsbassociates.com/books/pythonchpt1.htm
The first chapter of *Internet Programming with Python* also
examines some of the reasons for using Python. The book is well worth
buying, but the publishers have made the first chapter available on
the Web.
http://home.pacbell.net/ouster/scripting.html
John Ousterhout's white paper on scripting is a good argument for the utility of
scripting languages, though naturally enough, he emphasizes Tcl, the language he
developed. Most of the arguments would apply to any scripting language.
http://www.python.org/workshops/1997-10/proceedings/beazley.html
The authors, David M. Beazley and Peter S. Lomdahl, describe their use of
Python at Los Alamos National Laboratory. It's another good example of how
Python can help get real work done. This quotation from the paper has been
echoed by many people:
.. epigraph::
Originally developed as a large monolithic application for massively parallel
processing systems, we have used Python to transform our application into a
flexible, highly modular, and extremely powerful system for performing
simulation, data analysis, and visualization. In addition, we describe how
Python has solved a number of important problems related to the development,
debugging, deployment, and maintenance of scientific software.
http://pythonjournal.cognizor.com/pyj1/Everitt-Feit_interview98-V1.html
This interview with Andy Feit, discussing Infoseek's use of Python, can be used
to show that choosing Python didn't introduce any difficulties into a company's
development process, and provided some substantial benefits.
.. http://www.python.org/psa/Commercial.html
Robin Friedrich wrote this document on how to support Python's use in
commercial projects.
http://www.python.org/workshops/1997-10/proceedings/stein.ps
For the 6th Python conference, Greg Stein presented a paper that traced Python's
adoption and usage at a startup called eShop, and later at Microsoft.
http://www.opensource.org
Management may be doubtful of the reliability and usefulness of software that
wasn't written commercially. This site presents arguments that show how open
source software can have considerable advantages over closed-source software.
http://www.faqs.org/docs/Linux-mini/Advocacy.html
The Linux Advocacy mini-HOWTO was the inspiration for this document, and is also
well worth reading for general suggestions on winning acceptance for a new
technology, such as Linux or Python. In general, you won't make much progress
by simply attacking existing systems and complaining about their inadequacies;
this often ends up looking like unfocused whining. It's much better to point
out some of the many areas where Python is an improvement over other systems.


@@ -0,0 +1,216 @@
.. highlightlang:: c
********************************
Porting Extension Modules to 3.0
********************************
:author: Benjamin Peterson
.. topic:: Abstract
Although changing the C-API was not one of Python 3.0's objectives, the many
Python-level changes made it impossible to leave 2.x's API intact. In fact, some
changes such as :func:`int` and :func:`long` unification are more obvious on
the C level. This document endeavors to document incompatibilities and how
they can be worked around.
Conditional compilation
=======================
The easiest way to compile only some code for 3.0 is to check if
:cmacro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::
#if PY_MAJOR_VERSION >= 3
#define IS_PY3K
#endif
API functions that are not present can be aliased to their equivalents within
conditional blocks.
Changes to Object APIs
======================
Python 3.0 merged together some types with similar functions while cleanly
separating others.
str/unicode Unification
-----------------------
Python 3.0's :func:`str` type (``PyUnicode_*`` functions in C) is equivalent to
2.x's :func:`unicode`. The old 8-bit string type (``PyString_*`` in 2.x) has become
:func:`bytes` (``PyBytes_*``). Python 2.6 and later provide a compatibility header,
:file:`bytesobject.h`, mapping ``PyBytes`` names to ``PyString`` ones. For best
compatibility with 3.0, :ctype:`PyUnicode` should be used for textual data and
:ctype:`PyBytes` for binary data. It's also important to remember that
:ctype:`PyBytes` and :ctype:`PyUnicode` in 3.0 are not interchangeable like
:ctype:`PyString` and :ctype:`PyUnicode` are in 2.x. The following example shows
best practices with regard to :ctype:`PyUnicode`, :ctype:`PyString`, and
:ctype:`PyBytes`. ::
   #include "stdlib.h"
   #include "Python.h"
   #include "bytesobject.h"
   /* text example */
   static PyObject *
   say_hello(PyObject *self, PyObject *args) {
       PyObject *name, *result;
       if (!PyArg_ParseTuple(args, "U:say_hello", &name))
           return NULL;
       result = PyUnicode_FromFormat("Hello, %S!", name);
       return result;
   }
   /* just a forward */
   static char * do_encode(PyObject *);
   /* bytes example */
   static PyObject *
   encode_object(PyObject *self, PyObject *args) {
       char *encoded;
       PyObject *result, *myobj;
       if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
           return NULL;
       encoded = do_encode(myobj);
       if (encoded == NULL)
           return NULL;
       result = PyBytes_FromString(encoded);
       free(encoded);
       return result;
   }
long/int Unification
--------------------
In Python 3.0, there is only one integer type. It is called :func:`int` on the
Python level, but actually corresponds to 2.x's :func:`long` type. In the
C-API, ``PyInt_*`` functions are replaced by their ``PyLong_*`` neighbors. The
best course of action here is using the ``PyInt_*`` functions aliased to
``PyLong_*`` found in :file:`intobject.h`. The abstract ``PyNumber_*`` APIs
can also be used in some cases. ::
#include "Python.h"
#include "intobject.h"

static PyObject *
add_ints(PyObject *self, PyObject *args) {
    int one, two;
    PyObject *result;

    if (!PyArg_ParseTuple(args, "ii:add_ints", &one, &two))
        return NULL;

    return PyInt_FromLong(one + two);
}
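At the Python level the unification is easy to observe. The following is a
small sketch assuming Python 3; ``add_ints`` just mirrors the name used in the
C example.

```python
# Illustrative sketch, assuming Python 3, where int is the only integer type.

def add_ints(one, two):
    """Add two integers; the result is not limited to a machine word."""
    return one + two


big = add_ints(2 ** 62, 2 ** 62)   # exceeds a signed 64-bit C long
```

The result is still a plain ``int``; there is no separate ``long`` type to
promote to.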
Module initialization and state
===============================
Python 3.0 has a revamped extension module initialization system. (See
:pep:`3121`.) Instead of storing module state in globals, it should be stored
in an interpreter specific structure. Creating modules that act correctly in
both 2.x and 3.0 is tricky. The following simple example demonstrates how. ::
#include "Python.h"

struct module_state {
    PyObject *error;
};

#if PY_MAJOR_VERSION >= 3
#define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
#else
#define GETSTATE(m) (&_state)
static struct module_state _state;
#endif

static PyObject *
error_out(PyObject *m) {
    struct module_state *st = GETSTATE(m);
    PyErr_SetString(st->error, "something bad happened");
    return NULL;
}

static PyMethodDef myextension_methods[] = {
    {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
    {NULL, NULL}
};

#if PY_MAJOR_VERSION >= 3

static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
    Py_VISIT(GETSTATE(m)->error);
    return 0;
}

static int myextension_clear(PyObject *m) {
    Py_CLEAR(GETSTATE(m)->error);
    return 0;
}

static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT,
    "myextension",
    NULL,
    sizeof(struct module_state),
    myextension_methods,
    NULL,
    myextension_traverse,
    myextension_clear,
    NULL
};

#define INITERROR return NULL

PyObject *
PyInit_myextension(void)

#else
#define INITERROR return

void
initmyextension(void)
#endif
{
#if PY_MAJOR_VERSION >= 3
    PyObject *module = PyModule_Create(&moduledef);
#else
    PyObject *module = Py_InitModule("myextension", myextension_methods);
#endif

    if (module == NULL)
        INITERROR;
    struct module_state *st = GETSTATE(module);

    st->error = PyErr_NewException("myextension.Error", NULL, NULL);
    if (st->error == NULL) {
        Py_DECREF(module);
        INITERROR;
    }

#if PY_MAJOR_VERSION >= 3
    return module;
#endif
}
Other options
=============
If you are writing a new extension module, you might consider `Cython
<http://www.cython.org>`_. It translates a Python-like language to C. The
extension modules it creates are compatible with Python 3.x and 2.x.

@@ -0,0 +1,436 @@
.. _curses-howto:
**********************************
Curses Programming with Python
**********************************
:Author: A.M. Kuchling, Eric S. Raymond
:Release: 2.03
.. topic:: Abstract
This document describes how to write text-mode programs with Python 2.x, using
the :mod:`curses` extension module to control the display.
What is curses?
===============
The curses library supplies a terminal-independent screen-painting and
keyboard-handling facility for text-based terminals; such terminals include
VT100s, the Linux console, and the simulated terminal provided by X11 programs
such as xterm and rxvt. Display terminals support various control codes to
perform common operations such as moving the cursor, scrolling the screen, and
erasing areas. Different terminals use widely differing codes, and often have
their own minor quirks.
In a world of X displays, one might ask "why bother"? It's true that
character-cell display terminals are an obsolete technology, but there are
niches in which being able to do fancy things with them is still valuable. One
is on small-footprint or embedded Unixes that don't carry an X server. Another
is for tools like OS installers and kernel configurators that may have to run
before X is available.
The curses library hides all the details of different terminals, and provides
the programmer with an abstraction of a display, containing multiple
non-overlapping windows. The contents of a window can be changed in various
ways--adding text, erasing it, changing its appearance--and the curses library
will automagically figure out what control codes need to be sent to the terminal
to produce the right output.
The curses library was originally written for BSD Unix; the later System V
versions of Unix from AT&T added many enhancements and new functions. BSD curses
is no longer maintained, having been replaced by ncurses, which is an
open-source implementation of the AT&T interface. If you're using an
open-source Unix such as Linux or FreeBSD, your system almost certainly uses
ncurses. Since most current commercial Unix versions are based on System V
code, all the functions described here will probably be available. The older
versions of curses carried by some proprietary Unixes may not support
everything, though.
No one has made a Windows port of the curses module. On a Windows platform, try
the Console module written by Fredrik Lundh. The Console module provides
cursor-addressable text output, plus full support for mouse and keyboard input,
and is available from http://effbot.org/zone/console-index.htm.
The Python curses module
------------------------
The Python module is a fairly simple wrapper over the C functions provided by
curses; if you're already familiar with curses programming in C, it's really
easy to transfer that knowledge to Python. The biggest difference is that the
Python interface makes things simpler, by merging different C functions such as
:func:`addstr`, :func:`mvaddstr`, :func:`mvwaddstr`, into a single
:meth:`addstr` method. You'll see this covered in more detail later.
This HOWTO is simply an introduction to writing text-mode programs with curses
and Python. It doesn't attempt to be a complete guide to the curses API; for
that, see the Python library guide's section on ncurses, and the C manual pages
for ncurses. It will, however, give you the basic ideas.
Starting and ending a curses application
========================================
Before doing anything, curses must be initialized. This is done by calling the
:func:`initscr` function, which will determine the terminal type, send any
required setup codes to the terminal, and create various internal data
structures. If successful, :func:`initscr` returns a window object representing
the entire screen; this is usually called ``stdscr``, after the name of the
corresponding C variable. ::
import curses
stdscr = curses.initscr()
Usually curses applications turn off automatic echoing of keys to the screen, in
order to be able to read keys and only display them under certain circumstances.
This requires calling the :func:`noecho` function. ::
curses.noecho()
Applications will also commonly need to react to keys instantly, without
requiring the Enter key to be pressed; this is called cbreak mode, as opposed to
the usual buffered input mode. ::
curses.cbreak()
Terminals usually return special keys, such as the cursor keys or navigation
keys such as Page Up and Home, as a multibyte escape sequence. While you could
write your application to expect such sequences and process them accordingly,
curses can do it for you, returning a special value such as
:const:`curses.KEY_LEFT`. To get curses to do the job, you'll have to enable
keypad mode. ::
stdscr.keypad(1)
Terminating a curses application is much easier than starting one. You'll need
to call ::
curses.nocbreak(); stdscr.keypad(0); curses.echo()
to reverse the curses-friendly terminal settings. Then call the :func:`endwin`
function to restore the terminal to its original operating mode. ::
curses.endwin()
A common problem when debugging a curses application is to get your terminal
messed up when the application dies without restoring the terminal to its
previous state. In Python this commonly happens when your code is buggy and
raises an uncaught exception. Keys are no longer echoed to the screen when
you type them, for example, which makes using the shell difficult.
In Python you can avoid these complications and make debugging much easier by
importing the module :mod:`curses.wrapper`. It supplies a :func:`wrapper`
function that takes a callable. It does the initializations described above,
and also initializes colors if color support is present. It then runs your
provided callable and finally deinitializes appropriately. The callable is
called inside a try...except clause which catches exceptions, performs curses
deinitialization, and then passes the exception upwards. Thus, your terminal
won't be left in a funny state on exception.
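As a rough sketch of how the wrapper is typically used: the names ``main`` and
``run`` below are made up for illustration, and the code assumes that
``curses.wrapper`` is reachable as an attribute of the :mod:`curses` package
and that it hands back the callable's return value.

```python
import curses


def main(stdscr):
    """Body of the application; receives the initialized screen window."""
    stdscr.addstr(0, 0, "Press any key to quit")
    stdscr.refresh()
    stdscr.getch()
    return "done"


def run():
    """Run main() with setup and teardown handled by the wrapper."""
    # The wrapper restores the terminal even if main() raises.
    return curses.wrapper(main)
```

Call ``run()`` from a real terminal; nothing is started at import time, so the
module can be imported safely from tests or from a non-interactive session.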
Windows and Pads
================
Windows are the basic abstraction in curses. A window object represents a
rectangular area of the screen, and supports various methods to display text,
erase it, allow the user to input strings, and so forth.
The ``stdscr`` object returned by the :func:`initscr` function is a window
object that covers the entire screen. Many programs may need only this single
window, but you might wish to divide the screen into smaller windows, in order
to redraw or clear them separately. The :func:`newwin` function creates a new
window of a given size, returning the new window object. ::
begin_x = 20 ; begin_y = 7
height = 5 ; width = 40
win = curses.newwin(height, width, begin_y, begin_x)
A word about the coordinate system used in curses: coordinates are always passed
in the order *y,x*, and the top-left corner of a window is coordinate (0,0).
This breaks a common convention for handling coordinates, where the *x*
coordinate usually comes first. This is an unfortunate difference from most
other computer applications, but it's been part of curses since it was first
written, and it's too late to change things now.
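Because the *y,x* order is easy to get backwards, it can help to keep the
arithmetic in one small helper. The function below is a hypothetical example,
not part of curses; it computes where a window must start in order to be
centered on a screen of a given size.

```python
def centered_origin(screen_height, screen_width, height, width):
    """Return (begin_y, begin_x) that centers a height x width window."""
    begin_y = max((screen_height - height) // 2, 0)
    begin_x = max((screen_width - width) // 2, 0)
    return begin_y, begin_x   # y first, matching the curses convention


# A 5x40 window on a 24x80 screen:
begin_y, begin_x = centered_origin(24, 80, 5, 40)
```

The pair can then be passed straight to ``curses.newwin(5, 40, begin_y,
begin_x)``.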
When you call a method to display or erase text, the effect doesn't immediately
show up on the display. This is because curses was originally written with slow
300-baud terminal connections in mind; with these terminals, minimizing the time
required to redraw the screen is very important. This lets curses accumulate
changes to the screen, and display them in the most efficient manner. For
example, if your program displays some characters in a window, and then clears
the window, there's no need to send the original characters because they'd never
be visible.
Accordingly, curses requires that you explicitly tell it to redraw windows,
using the :func:`refresh` method of window objects. In practice, this doesn't
really complicate programming with curses much. Most programs go into a flurry
of activity, and then pause waiting for a keypress or some other action on the
part of the user. All you have to do is to be sure that the screen has been
redrawn before pausing to wait for user input, by simply calling
``stdscr.refresh()`` or the :func:`refresh` method of some other relevant
window.
A pad is a special case of a window; it can be larger than the actual display
screen, and only a portion of it displayed at a time. Creating a pad simply
requires the pad's height and width, while refreshing a pad requires giving the
coordinates of the on-screen area where a subsection of the pad will be
displayed. ::
pad = curses.newpad(100, 100)
# These loops fill the pad with letters; this is
# explained in the next section
for y in range(0, 100):
    for x in range(0, 100):
        try:
            pad.addch(y, x, ord('a') + (x*x + y*y) % 26)
        except curses.error:
            pass
# Displays a section of the pad in the middle of the screen
pad.refresh(0, 0, 5, 5, 20, 75)
The :func:`refresh` call displays a section of the pad in the rectangle
extending from coordinate (5,5) to coordinate (20,75) on the screen; the upper
left corner of the displayed section is coordinate (0,0) on the pad. Beyond
that difference, pads are exactly like ordinary windows and support the same
methods.
If you have multiple windows and pads on screen there is a more efficient way to
go, which will prevent annoying screen flicker at refresh time. Use the
:meth:`noutrefresh` method of each window to update the data structure
representing the desired state of the screen; then change the physical screen to
match the desired state in one go with the function :func:`doupdate`. The
normal :meth:`refresh` method calls :func:`doupdate` as its last act.
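A minimal sketch of that two-step update follows. ``redraw_all`` is a
hypothetical helper; the ``doupdate`` parameter exists only so the function can
be exercised without a terminal, and pads are left out because their
:meth:`noutrefresh` needs the six coordinate arguments.

```python
import curses


def redraw_all(windows, doupdate=curses.doupdate):
    """Stage every window's changes, then repaint the screen once."""
    for win in windows:
        win.noutrefresh()    # update the virtual screen only
    doupdate()               # one physical update for all windows
```

In an application you would call ``redraw_all([stdscr, win])`` just before
waiting for input.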
Displaying Text
===============
From a C programmer's point of view, curses may sometimes look like a twisty
maze of functions, all subtly different. For example, :func:`addstr` displays a
string at the current cursor location in the ``stdscr`` window, while
:func:`mvaddstr` moves to a given y,x coordinate first before displaying the
string. :func:`waddstr` is just like :func:`addstr`, but allows specifying a
window to use, instead of using ``stdscr`` by default. :func:`mvwaddstr` follows
similarly.
Fortunately the Python interface hides all these details; ``stdscr`` is a window
object like any other, and methods like :func:`addstr` accept multiple argument
forms. Usually there are four different forms.
+---------------------------------+-----------------------------------------------+
| Form | Description |
+=================================+===============================================+
| *str* or *ch* | Display the string *str* or character *ch* at |
| | the current position |
+---------------------------------+-----------------------------------------------+
| *str* or *ch*, *attr* | Display the string *str* or character *ch*, |
| | using attribute *attr* at the current |
| | position |
+---------------------------------+-----------------------------------------------+
| *y*, *x*, *str* or *ch* | Move to position *y,x* within the window, and |
| | display *str* or *ch* |
+---------------------------------+-----------------------------------------------+
| *y*, *x*, *str* or *ch*, *attr* | Move to position *y,x* within the window, and |
| | display *str* or *ch*, using attribute *attr* |
+---------------------------------+-----------------------------------------------+
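The four forms in the table can be sketched as follows. ``show_forms`` is a
made-up name, and *win* is assumed to be any window object large enough to hold
the text.

```python
import curses


def show_forms(win):
    """Call addstr() in each of its four argument forms."""
    win.addstr("plain, at the cursor")
    win.addstr(" bold, at the cursor", curses.A_BOLD)
    win.addstr(2, 0, "plain, at row 2, column 0")
    win.addstr(3, 0, "underlined, at row 3", curses.A_UNDERLINE)
    win.refresh()
```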
Attributes allow displaying text in highlighted forms, such as in boldface,
underline, reverse video, or in color. They'll be explained in more detail in
the next subsection.
The :func:`addstr` function takes a Python string as the value to be displayed,
while the :func:`addch` functions take a character, which can be either a Python
string of length 1 or an integer. If it's a string, you're limited to
displaying characters between 0 and 255. SVr4 curses provides constants for
extension characters; these constants are integers greater than 255. For
example, :const:`ACS_PLMINUS` is a +/- symbol, and :const:`ACS_ULCORNER` is the
upper left corner of a box (handy for drawing borders).
Windows remember where the cursor was left after the last operation, so if you
leave out the *y,x* coordinates, the string or character will be displayed
wherever the last operation left off. You can also move the cursor with the
``move(y,x)`` method. Because some terminals always display a flashing cursor,
you may want to ensure that the cursor is positioned in some location where it
won't be distracting; it can be confusing to have the cursor blinking at some
apparently random location.
If your application doesn't need a blinking cursor at all, you can call
``curs_set(0)`` to make it invisible. Equivalently, and for compatibility with
older curses versions, there's a ``leaveok(bool)`` function. When *bool* is
true, the curses library will attempt to suppress the flashing cursor, and you
won't need to worry about leaving it in odd locations.
Attributes and Color
--------------------
Characters can be displayed in different ways. Status lines in a text-based
application are commonly shown in reverse video; a text viewer may need to
highlight certain words. curses supports this by allowing you to specify an
attribute for each cell on the screen.
An attribute is an integer, each bit representing a different attribute. You can
try to display text with multiple attribute bits set, but curses doesn't
guarantee that all the possible combinations are available, or that they're all
visually distinct. That depends on the ability of the terminal being used, so
it's safest to stick to the most commonly available attributes, listed here.
+----------------------+--------------------------------------+
| Attribute | Description |
+======================+======================================+
| :const:`A_BLINK` | Blinking text |
+----------------------+--------------------------------------+
| :const:`A_BOLD` | Extra bright or bold text |
+----------------------+--------------------------------------+
| :const:`A_DIM` | Half bright text |
+----------------------+--------------------------------------+
| :const:`A_REVERSE` | Reverse-video text |
+----------------------+--------------------------------------+
| :const:`A_STANDOUT` | The best highlighting mode available |
+----------------------+--------------------------------------+
| :const:`A_UNDERLINE` | Underlined text |
+----------------------+--------------------------------------+
So, to display a reverse-video status line on the top line of the screen, you
could code::
stdscr.addstr(0, 0, "Current mode: Typing mode",
              curses.A_REVERSE)
stdscr.refresh()
The curses library also supports color on those terminals that provide it. The
most common such terminal is probably the Linux console, followed by color
xterms.
To use color, you must call the :func:`start_color` function soon after calling
:func:`initscr`, to initialize the default color set (the
:func:`curses.wrapper.wrapper` function does this automatically). Once that's
done, the :func:`has_colors` function returns TRUE if the terminal in use can
actually display color. (Note: curses uses the American spelling 'color',
instead of the Canadian/British spelling 'colour'. If you're used to the
British spelling, you'll have to resign yourself to misspelling it for the sake
of these functions.)
The curses library maintains a finite number of color pairs, containing a
foreground (or text) color and a background color. You can get the attribute
value corresponding to a color pair with the :func:`color_pair` function; this
can be bitwise-OR'ed with other attributes such as :const:`A_REVERSE`, but
again, such combinations are not guaranteed to work on all terminals.
An example, which displays a line of text using color pair 1::
stdscr.addstr( "Pretty text", curses.color_pair(1) )
stdscr.refresh()
As I said before, a color pair consists of a foreground and background color.
:func:`start_color` initializes 8 basic colors when it activates color mode.
They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and
7:white. The curses module defines named constants for each of these colors:
:const:`curses.COLOR_BLACK`, :const:`curses.COLOR_RED`, and so forth.
The ``init_pair(n, f, b)`` function changes the definition of color pair *n*, to
foreground color *f* and background color *b*. Color pair 0 is hard-wired to
white on black, and cannot be changed.
Let's put all this together. To change color pair 1 to red text on a white
background, you would call::
curses.init_pair(1, curses.COLOR_RED, curses.COLOR_WHITE)
When you change a color pair, any text already displayed using that color pair
will change to the new colors. You can also display new text in this color
with::
stdscr.addstr(0,0, "RED ALERT!", curses.color_pair(1) )
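Putting the color steps together, a cautious version might look like the sketch
below. ``show_alert`` is a hypothetical helper, and the fallback to
:const:`A_REVERSE` on monochrome terminals is an illustrative choice rather than
anything curses prescribes.

```python
import curses


def show_alert(stdscr, text="RED ALERT!"):
    """Draw text in red on white if the terminal supports color."""
    if curses.has_colors():
        curses.start_color()     # harmless if it has already been called
        curses.init_pair(1, curses.COLOR_RED, curses.COLOR_WHITE)
        attr = curses.color_pair(1) | curses.A_BOLD
    else:
        attr = curses.A_REVERSE  # fallback for monochrome terminals
    stdscr.addstr(0, 0, text, attr)
    stdscr.refresh()
```

The function needs an initialized screen, so call it from inside your main
callable.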
Very fancy terminals can change the definitions of the actual colors to a given
RGB value. This lets you change color 1, which is usually red, to purple or
blue or any other color you like. Unfortunately, the Linux console doesn't
support this, so I'm unable to try it out, and can't provide any examples. You
can check if your terminal can do this by calling :func:`can_change_color`,
which returns TRUE if the capability is there. If you're lucky enough to have
such a talented terminal, consult your system's man pages for more information.
User Input
==========
The curses library itself offers only very simple input mechanisms. Python's
support adds a text-input widget that makes up for some of the lack.
The most common way to get input to a window is to use its :meth:`getch` method.
:meth:`getch` pauses and waits for the user to hit a key, displaying it if
:func:`echo` has been called earlier. You can optionally specify a coordinate
to which the cursor should be moved before pausing.
It's possible to change this behavior with the method :meth:`nodelay`. After
``nodelay(1)``, :meth:`getch` for the window becomes non-blocking and returns
``curses.ERR`` (a value of -1) when no input is ready. There's also a
:func:`halfdelay` function, which can be used to (in effect) set a timer on each
:meth:`getch`; if no input becomes available within a specified
delay (measured in tenths of a second), curses raises an exception.
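Non-blocking input can be wrapped up in a small helper. ``poll_key`` is a
made-up name; the sketch relies only on :meth:`nodelay`, :meth:`getch` and
``curses.ERR`` as described above.

```python
import curses


def poll_key(win):
    """Return the pending key code, or None if no key is waiting."""
    win.nodelay(1)           # make getch() non-blocking
    c = win.getch()
    if c == curses.ERR:      # -1: nothing was typed
        return None
    return c
```

A main loop can call ``poll_key(stdscr)`` on each pass and carry on with other
work when it returns ``None``.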
The :meth:`getch` method returns an integer; if it's between 0 and 255, it
represents the ASCII code of the key pressed. Values greater than 255 are
special keys such as Page Up, Home, or the cursor keys. You can compare the
value returned to constants such as :const:`curses.KEY_PPAGE`,
:const:`curses.KEY_HOME`, or :const:`curses.KEY_LEFT`. Usually the main loop of
your program will look something like this::
while 1:
    c = stdscr.getch()
    if c == ord('p'):
        PrintDocument()
    elif c == ord('q'):
        break  # Exit the while()
    elif c == curses.KEY_HOME:
        x = y = 0
The :mod:`curses.ascii` module supplies ASCII class membership functions that
take either integer or 1-character-string arguments; these may be useful in
writing more readable tests for your command interpreters. It also supplies
conversion functions that take either integer or 1-character-string arguments
and return the same type. For example, :func:`curses.ascii.ctrl` returns the
control character corresponding to its argument.
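A short sketch of both kinds of function; ``is_command_digit`` and the choice
of ``q`` as a quit key are illustrative only.

```python
import curses.ascii

quit_key = curses.ascii.ctrl('q')            # string in, string out
quit_code = curses.ascii.ctrl(ord('q'))      # integer in, integer out


def is_command_digit(c):
    """True if c (an integer or 1-character string) is an ASCII digit."""
    return curses.ascii.isdigit(c)
```

Since :meth:`getch` returns integers, ``quit_code`` is the value to compare its
result against.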
There's also a method to retrieve an entire string, :meth:`getstr`. It isn't
used very often, because its functionality is quite limited; the only editing
keys available are the backspace key and the Enter key, which terminates the
string. It can optionally be limited to a fixed number of characters. ::
curses.echo() # Enable echoing of characters
# Get a 15-character string, with the cursor on the top line
s = stdscr.getstr(0,0, 15)
The Python :mod:`curses.textpad` module supplies something better. With it, you
can turn a window into a text box that supports an Emacs-like set of
keybindings. Various methods of the :class:`Textbox` class support editing with
input validation and gathering the edit results either with or without trailing
spaces. See the library documentation on :mod:`curses.textpad` for the
details.
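For orientation, a text box might be used roughly like this. ``read_line`` and
its default position and width are assumptions made for the example; check the
library documentation for the exact editing keys.

```python
import curses
from curses.textpad import Textbox


def read_line(stdscr, y=2, x=1, width=30):
    """Let the user edit one line of text; return what was typed."""
    editwin = curses.newwin(1, width, y, x)
    box = Textbox(editwin)
    box.edit()               # blocks until the user finishes editing
    return box.gather().strip()
```

Like the other curses sketches, this only works once the screen has been
initialized.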
For More Information
====================
This HOWTO didn't cover some advanced topics, such as screen-scraping or
capturing mouse events from an xterm instance. But the Python library page for
the curses modules is now pretty complete. You should browse it next.
If you're in doubt about the detailed behavior of any of the ncurses entry
points, consult the manual pages for your curses implementation, whether it's
ncurses or a proprietary Unix vendor's. The manual pages will document any
quirks, and provide complete lists of all the functions, attributes, and
:const:`ACS_\*` characters available to you.
Because the curses API is so large, some functions aren't supported in the
Python interface, not because they're difficult to implement, but because no one
has needed them yet. Feel free to add them and then submit a patch. Also, we
don't yet have support for the menus or panels libraries associated with
ncurses; feel free to add that.
If you write an interesting little program, feel free to contribute it as
another demo. We can always use more of them!
The ncurses FAQ: http://invisible-island.net/ncurses/ncurses.faq.html

@@ -0,0 +1,308 @@
************************************
Idioms and Anti-Idioms in Python
************************************
:Author: Moshe Zadka
This document is placed in the public domain.
.. topic:: Abstract
This document can be considered a companion to the tutorial. It shows how to use
Python, and even more importantly, how *not* to use Python.
Language Constructs You Should Not Use
======================================
While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.
from module import \*
---------------------
Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^
``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for the invalidity, that does not make it more
valid, any more than having a smart lawyer makes a man innocent. Do not use it
like that ever. Even in versions where it was accepted, it made the function
execution slower, because the compiler could not be certain which names are
local and which are global. In Python 2.1 this construct causes warnings, and
sometimes even errors.
At Module Level
^^^^^^^^^^^^^^^
While it is valid to use ``from module import *`` at module level it is usually
a bad idea. For one, this loses an important property Python otherwise has ---
you can know where each toplevel name is defined by a simple "search" function
in your favourite editor. You also open yourself to trouble in the future, if
some module grows additional functions or classes.
One of the most awful questions asked on the newsgroup is why this code::
f = open("www")
f.read()
does not work. Of course, it works just fine (assuming you have a file called
"www".) But it does not work if somewhere in the module, the statement ``from os
import *`` is present. The :mod:`os` module has a function called :func:`open`
which returns an integer. While it is very useful, shadowing builtins is one of
its least useful properties.
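The difference between the two functions is easy to see side by side. The
sketch below creates a throwaway temporary file for the purpose; the variable
names are illustrative. Note that :func:`os.open` also requires a *flags*
argument, so a one-argument call would fail before returning anything.

```python
import os
import tempfile

# Create a small file to open.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello\n")
os.close(fd)

builtin_result = open(path)                  # a file object with .read()
os_result = os.open(path, os.O_RDONLY)       # a bare integer descriptor

text = builtin_result.read()
has_read = hasattr(os_result, "read")        # integers have no read()

builtin_result.close()
os.close(os_result)
os.remove(path)
```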
Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access on a per-need basis --- ``import module;print module.name``.
When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^
There are situations in which ``from module import *`` is just fine:
* The interactive prompt. For example, ``from math import *`` makes Python an
amazing scientific calculator.
* When extending a module in C with a module in Python.
* When the module advertises itself as ``from import *`` safe.
Unadorned :keyword:`exec`, :func:`execfile` and friends
-------------------------------------------------------
The word "unadorned" refers to the use without an explicit dictionary, in which
case those constructs evaluate code in the *current* environment. This is
dangerous for the same reasons ``from import *`` is dangerous --- it might step
over variables you are counting on and mess up things for the rest of your code.
Simply do not do that.
Bad examples::
>>> for name in sys.argv[1:]:
>>>     exec "%s=1" % name
>>> def func(s, **kw):
>>>     for var, val in kw.items():
>>>         exec "s.%s=val" % var  # invalid!
>>> execfile("handler.py")
>>> handle()
Good examples::
>>> d = {}
>>> for name in sys.argv[1:]:
>>>     d[name] = 1
>>> def func(s, **kw):
>>>     for var, val in kw.items():
>>>         setattr(s, var, val)
>>> d = {}
>>> execfile("handler.py", d, d)
>>> handle = d['handle']
>>> handle()
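The same idea carries over to code that must also run on Python 3, where
``exec`` is a function. The sketch below uses the call form with an explicit
dictionary; ``run_isolated`` is a hypothetical helper name and the source
string stands in for the contents of a handler file.

```python
def run_isolated(source):
    """Execute source in its own namespace and return that namespace."""
    namespace = {}
    exec(source, namespace)      # names land in namespace, not in our globals
    return namespace


result = run_isolated("def handle():\n    return 'handled'\n")
handle = result["handle"]
```

Nothing defined by the executed code can overwrite a name in the calling
module unless you copy it out yourself, as the last line does.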
from module import name1, name2
-------------------------------
This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do if you don't have good reasons to do that. The
reason it is usually a bad idea is because you suddenly have an object which lives
in two separate namespaces. When the binding in one namespace changes, the
binding in the other will not, so there will be a discrepancy between them. This
happens when, for example, one module is reloaded, or changes the definition of
a function at runtime.
Bad example::
# foo.py
a = 1

# bar.py
from foo import a
if something():
    a = 2 # danger: foo.a != a
Good example::
# foo.py
a = 1

# bar.py
import foo
if something():
    foo.a = 2
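The discrepancy also shows up in the other direction, when the module rebinds
its own name after you have imported it. The sketch below fakes a module with
``types.ModuleType`` so that it is self-contained; a plain assignment stands in
for ``from foo import a``.

```python
import types

# Build a stand-in for foo.py so the sketch is self-contained.
foo = types.ModuleType("foo")
foo.a = 1

a = foo.a        # what "from foo import a" does: copy the binding
foo.a = 2        # the module rebinds its name later

stale = a        # still 1: our copy did not follow
fresh = foo.a    # 2: attribute access always sees the current value
```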
except:
-------
Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, this makes many programming errors look
like runtime problems, and hinders the debugging process.
The following code shows a great example of why this is bad::
try:
    foo = opne("file") # misspelled "open"
except:
    sys.exit("could not open file!")
The second line triggers a :exc:`NameError` which is caught by the except
clause. The program will exit, and you will have no idea that this has nothing
to do with the readability of ``"file"``.
The example above is better written ::
try:
    foo = opne("file") # will be changed to "open" as soon as we run it
except IOError:
    sys.exit("could not open file")
There are some situations in which the ``except:`` clause is useful: for
example, in a framework when running callbacks, it is good not to let any
callback disturb the framework.
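A sketch of the callback case follows. It deliberately catches
:exc:`Exception` rather than using a bare ``except:``, so that
:exc:`KeyboardInterrupt` and :exc:`SystemExit` still get through, and it logs
the traceback so the bug is not lost; ``run_callbacks`` is a made-up name.

```python
import logging


def run_callbacks(callbacks):
    """Call each callback; log failures instead of letting them propagate."""
    results = []
    for callback in callbacks:
        try:
            results.append(callback())
        except Exception:
            # Record the traceback so the bug is not silently lost.
            logging.exception("callback %r failed", callback)
            results.append(None)
    return results


outcome = run_callbacks([lambda: 1, lambda: 1 / 0, lambda: 3])
```

The failing callback is reported, and the ones after it still run.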
Exceptions
==========
Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.
The following is a very popular anti-idiom ::
def get_status(file):
    if not os.path.exists(file):
        print "file not found"
        sys.exit(1)
    return open(file).readline()
Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. That means
the last line will throw an :exc:`IOError`. The same would happen if *file*
exists but has no read permission. Since testing this on a normal machine on
existing and non-existing files makes it seem bugless, the test results will
seem fine, and the code will get shipped. Then an unhandled
:exc:`IOError` escapes to the user, who has to watch the ugly traceback.
Here is a better way to do it. ::
def get_status(file):
    try:
        return open(file).readline()
    except (IOError, OSError):
        print "file not found"
        sys.exit(1)
In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or the message is printed and the
application aborted.
Still, :func:`get_status` makes too many assumptions --- that it will only be
used in a short running script, and not, say, in a long running server. Sure,
the caller could do something like ::
try:
    status = get_status(log)
except SystemExit:
    status = None
So, try to make as few ``except`` clauses in your code as possible --- those
will usually be a catch-all in the :func:`main`, or inside calls which should
always succeed.
So, the best version is probably ::
def get_status(file):
    return open(file).readline()
The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.
The last version is not very good either --- due to implementation details, the
file would not be closed when an exception is raised until the handler finishes,
and perhaps not at all in non-C implementations (e.g., Jython). A version that
closes the file explicitly avoids this::
def get_status(file):
    fp = open(file)
    try:
        return fp.readline()
    finally:
        fp.close()
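Assuming Python 2.6 or later, the same guarantee can be written more compactly
with the :keyword:`with` statement. The temporary file below exists only to
make the sketch self-contained.

```python
import os
import tempfile


def get_status(path):
    """Return the first line of the file; the file is always closed."""
    with open(path) as fp:
        return fp.readline()


# Small demonstration with a temporary file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"ok\nsecond line\n")
os.close(fd)
status = get_status(demo_path)
os.remove(demo_path)
```

As before, a missing or unreadable file raises an exception that the caller is
free to handle or ignore.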
Using the Batteries
===================
Every so often, people seem to rewrite things that are already in the Python
library, usually poorly. While the occasional module has a poor interface, it is
usually much better to use the rich standard library and data types that come
with Python than to invent your own.
A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.
Compare::
# ugh!
return dir+"/"+file
# better
return os.path.join(dir, file)
More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.
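A quick sketch of those three functions on a made-up path:

```python
import os.path

path = os.path.join("logs", "2011", "server.log.gz")

directory = os.path.dirname(path)        # everything before the last component
filename = os.path.basename(path)        # the last component
stem, extension = os.path.splitext(filename)   # splits at the last dot
```

Note that :func:`splitext` only removes the final extension, so ``stem`` still
ends in ``.log`` here.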
There are also many useful builtin functions people seem not to be aware of for
some reason: :func:`min` and :func:`max` can find the minimum/maximum of any
sequence with comparable semantics, for example, yet many people write their own
:func:`max`/:func:`min`. Another highly useful function is :func:`reduce`. A
classical use of :func:`reduce` is something like ::
import sys, operator
nums = map(float, sys.argv[1:])
print reduce(operator.add, nums)/len(nums)
This cute little script prints the average of all numbers given on the command
line. The :func:`reduce` adds up all the numbers, and the rest is just some
pre- and postprocessing.
Similarly, note that :func:`float`, :func:`int` and :func:`long` all
accept arguments of type string, and so are suited to parsing --- assuming you
are ready to deal with the :exc:`ValueError` they raise.
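A minimal sketch of that parsing idiom follows. The helper name
``parse_number`` and its ``default`` argument are invented here for
illustration; only :func:`int`, :func:`float` and :exc:`ValueError` come from
Python itself.

```python
def parse_number(text, default=None):
    """Return text as an int or a float, or default if it is not numeric."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue        # not valid for this type, try the next one
    return default

values = [parse_number(s) for s in ["42", "3.5", " 7 ", "spam"]]
```

Surrounding whitespace is accepted by the built-in converters, which is why
``" 7 "`` parses without any extra stripping.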
Using Backslash to Continue Statements
======================================
Since Python treats a newline as a statement terminator, and since statements
are often longer than is comfortable to put on one line, many people do::
   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
           calculate_number(10, 20) != forbulate(500, 360):
       pass
You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error, but if the code was::
   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)
then it would just be subtly wrong.
It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof::
   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))


@@ -0,0 +1,26 @@
***************
Python HOWTOs
***************
Python HOWTOs are documents that cover a single, specific topic,
and attempt to cover it fairly completely. Modelled on the Linux
Documentation Project's HOWTO collection, this collection is an
effort to foster documentation that's more detailed than the
Python Library Reference.
Currently, the HOWTOs are:
.. toctree::
   :maxdepth: 1

   advocacy.rst
   cporting.rst
   curses.rst
   doanddont.rst
   functional.rst
   regex.rst
   sockets.rst
   unicode.rst
   urllib2.rst
   webservers.rst


@@ -0,0 +1,420 @@
****************************
Socket Programming HOWTO
****************************
:Author: Gordon McMillan
.. topic:: Abstract
   Sockets are used nearly everywhere, but are one of the most severely
   misunderstood technologies around. This is a 10,000 foot overview of sockets.
   It's not really a tutorial - you'll still have work to do in getting things
   operational. It doesn't cover the fine points (and there are a lot of them), but
   I hope it will give you enough background to begin using them decently.
Sockets
=======
Sockets are used nearly everywhere, but are one of the most severely
misunderstood technologies around. This is a 10,000 foot overview of sockets.
It's not really a tutorial - you'll still have work to do in getting things
working. It doesn't cover the fine points (and there are a lot of them), but I
hope it will give you enough background to begin using them decently.
I'm only going to talk about INET sockets, but they account for at least 99% of
the sockets in use. And I'll only talk about STREAM sockets - unless you really
know what you're doing (in which case this HOWTO isn't for you!), you'll get
better behavior and performance from a STREAM socket than anything else. I will
try to clear up the mystery of what a socket is, as well as some hints on how to
work with blocking and non-blocking sockets. But I'll start by talking about
blocking sockets. You'll need to know how they work before dealing with
non-blocking sockets.
Part of the trouble with understanding these things is that "socket" can mean a
number of subtly different things, depending on context. So first, let's make a
distinction between a "client" socket - an endpoint of a conversation, and a
"server" socket, which is more like a switchboard operator. The client
application (your browser, for example) uses "client" sockets exclusively; the
web server it's talking to uses both "server" sockets and "client" sockets.
History
-------
Of the various forms of IPC (*Inter Process Communication*), sockets are by far
the most popular. On any given platform, there are likely to be other forms of
IPC that are faster, but for cross-platform communication, sockets are about the
only game in town.
They were invented in Berkeley as part of the BSD flavor of Unix. They spread
like wildfire with the Internet. With good reason --- the combination of sockets
with INET makes talking to arbitrary machines around the world unbelievably easy
(at least compared to other schemes).
Creating a Socket
=================
Roughly speaking, when you clicked on the link that brought you to this page,
your browser did something like the following::
   #create an INET, STREAMing socket
   s = socket.socket(
       socket.AF_INET, socket.SOCK_STREAM)
   #now connect to the web server on port 80
   # - the normal http port
   s.connect(("www.mcmillan-inc.com", 80))
When the ``connect`` completes, the socket ``s`` can now be used to send in a
request for the text of this page. The same socket will read the reply, and then
be destroyed. That's right - destroyed. Client sockets are normally only used
for one exchange (or a small set of sequential exchanges).
What happens in the web server is a bit more complex. First, the web server
creates a "server socket". ::
   #create an INET, STREAMing socket
   serversocket = socket.socket(
       socket.AF_INET, socket.SOCK_STREAM)
   #bind the socket to a public host,
   # and a well-known port
   serversocket.bind((socket.gethostname(), 80))
   #become a server socket
   serversocket.listen(5)
A couple of things to notice: we used ``socket.gethostname()`` so that the socket
would be visible to the outside world. If we had used
``s.bind(('localhost', 80))`` or ``s.bind(('127.0.0.1', 80))`` we would still
have a "server" socket, but one that was only visible within the same machine.
``s.bind(('', 80))`` specifies that the socket is reachable by any address the
machine happens to have.
A second thing to note: low number ports are usually reserved for "well known"
services (HTTP, SNMP etc). If you're playing around, use a nice high number (4
digits).
Finally, the argument to ``listen`` tells the socket library that we want it to
queue up as many as 5 connect requests (the normal max) before refusing outside
connections. If the rest of the code is written properly, that should be plenty.
OK, now we have a "server" socket, listening on port 80. Now we enter the
mainloop of the web server::
   while 1:
       #accept connections from outside
       (clientsocket, address) = serversocket.accept()
       #now do something with the clientsocket
       #in this case, we'll pretend this is a threaded server
       ct = client_thread(clientsocket)
       ct.run()
There are actually 3 general ways in which this loop could work: dispatching a
thread to handle ``clientsocket``, creating a new process to handle
``clientsocket``, or restructuring this app to use non-blocking sockets and
multiplexing between our "server" socket and any active ``clientsocket``\ s using
``select``. More about that later. The important thing to understand now is
this: this is *all* a "server" socket does. It doesn't send any data. It doesn't
receive any data. It just produces "client" sockets. Each ``clientsocket`` is
created in response to some *other* "client" socket doing a ``connect()`` to the
host and port we're bound to. As soon as we've created that ``clientsocket``, we
go back to listening for more connections. The two "clients" are free to chat it
up - they are using some dynamically allocated port which will be recycled when
the conversation ends.
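To see the whole dance in one place, here is a small sketch that runs the
server and a client in the same process. It rests on several assumptions of
mine: it binds to ``127.0.0.1`` on port 0 (so the operating system picks a free
port) rather than a public host on port 80, it accepts a single connection
instead of looping forever, and ``handle_client`` and ``serve_once`` are
made-up names. It is written for Python 3, where socket data is ``bytes``.

```python
import socket
import threading

def handle_client(clientsocket):
    # Hypothetical handler: read one small message, answer it, hang up.
    # A single recv suffices here only because the message is tiny and local.
    data = clientsocket.recv(1024)
    clientsocket.sendall(b"echo: " + data)
    clientsocket.close()

def serve_once(serversocket):
    # One trip through the accept loop: the "server" socket only hands out
    # a new "client" socket; the handler thread does the talking.
    clientsocket, address = serversocket.accept()
    worker = threading.Thread(target=handle_client, args=(clientsocket,))
    worker.start()
    worker.join()

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("127.0.0.1", 0))      # port 0: let the OS choose
serversocket.listen(5)
port = serversocket.getsockname()[1]

acceptor = threading.Thread(target=serve_once, args=(serversocket,))
acceptor.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client.sendall(b"hello")

# Read until the server closes its end of the conversation.
chunks = []
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
reply = b"".join(chunks)

client.close()
acceptor.join()
serversocket.close()
```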
IPC
---
If you need fast IPC between two processes on one machine, you should look into
whatever form of shared memory the platform offers. A simple protocol based
around shared memory and locks or semaphores is by far the fastest technique.
If you do decide to use sockets, bind the "server" socket to ``'localhost'``. On
most platforms, this will take a shortcut around a couple of layers of network
code and be quite a bit faster.
Using a Socket
==============
The first thing to note is that the web browser's "client" socket and the web
server's "client" socket are identical beasts. That is, this is a "peer to peer"
conversation. Or to put it another way, *as the designer, you will have to
decide what the rules of etiquette are for a conversation*. Normally, the
``connect``\ ing socket starts the conversation, by sending in a request, or
perhaps a signon. But that's a design decision - it's not a rule of sockets.
Now there are two sets of verbs to use for communication. You can use ``send``
and ``recv``, or you can transform your client socket into a file-like beast and
use ``read`` and ``write``. The latter is the way Java presents their sockets.
I'm not going to talk about it here, except to warn you that you need to use
``flush`` on sockets. These are buffered "files", and a common mistake is to
``write`` something, and then ``read`` for a reply. Without a ``flush`` in
there, you may wait forever for the reply, because the request may still be in
your output buffer.
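Just so the warning is concrete, here is a tiny sketch of the file-like style.
It assumes ``socket.socketpair()`` is available (Unix, or a recent Python on
Windows) to get two connected sockets without a network, and it is written for
Python 3, so the data is ``bytes``.

```python
import socket

# socketpair() returns two sockets that are already connected to each other.
near, far = socket.socketpair()

writer = near.makefile("wb")   # buffered file-like object for writing
reader = far.makefile("rb")    # buffered file-like object for reading

writer.write(b"PING\n")
writer.flush()                 # without this, the bytes may sit in the buffer
line = reader.readline()       # would wait forever if nothing had been sent

writer.close()
reader.close()
near.close()
far.close()
```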
Now we come to the major stumbling block of sockets - ``send`` and ``recv`` operate
on the network buffers. They do not necessarily handle all the bytes you hand
them (or expect from them), because their major focus is handling the network
buffers. In general, they return when the associated network buffers have been
filled (``send``) or emptied (``recv``). They then tell you how many bytes they
handled. It is *your* responsibility to call them again until your message has
been completely dealt with.
When a ``recv`` returns 0 bytes, it means the other side has closed (or is in
the process of closing) the connection. You will not receive any more data on
this connection. Ever. You may be able to send data successfully; I'll talk
more about this later.
A protocol like HTTP uses a socket for only one transfer. The client sends a
request, then reads a reply. That's it. The socket is discarded. This means that
a client can detect the end of the reply by receiving 0 bytes.
But if you plan to reuse your socket for further transfers, you need to realize
that *there is no "EOT" (End of Transfer) on a socket.* I repeat: if a socket
``send`` or ``recv`` returns after handling 0 bytes, the connection has been
broken. If the connection has *not* been broken, you may wait on a ``recv``
forever, because the socket will *not* tell you that there's nothing more to
read (for now). Now if you think about that a bit, you'll come to realize a
fundamental truth of sockets: *messages must either be fixed length* (yuck), *or
be delimited* (shrug), *or indicate how long they are* (much better), *or end by
shutting down the connection*. The choice is entirely yours (but some ways are
righter than others).
Assuming you don't want to end the connection, the simplest solution is a fixed
length message::
   class mysocket:
       '''demonstration class only
         - coded for clarity, not efficiency
       '''
       def __init__(self, sock=None):
           if sock is None:
               self.sock = socket.socket(
                   socket.AF_INET, socket.SOCK_STREAM)
           else:
               self.sock = sock
       def connect(self, host, port):
           self.sock.connect((host, port))
       def mysend(self, msg):
           totalsent = 0
           while totalsent < MSGLEN:
               sent = self.sock.send(msg[totalsent:])
               if sent == 0:
                   raise RuntimeError("socket connection broken")
               totalsent = totalsent + sent
       def myreceive(self):
           msg = ''
           while len(msg) < MSGLEN:
               chunk = self.sock.recv(MSGLEN-len(msg))
               if chunk == '':
                   raise RuntimeError("socket connection broken")
               msg = msg + chunk
           return msg
The sending code here is usable for almost any messaging scheme - in Python you
send strings, and you can use ``len()`` to determine a string's length (even if
it has embedded ``\0`` characters). It's mostly the receiving code that gets more
complex. (And in C, it's not much worse, except you can't use ``strlen`` if the
message has embedded ``\0``\ s.)
The easiest enhancement is to make the first character of the message an
indicator of message type, and have the type determine the length. Now you have
two ``recv``\ s - the first to get (at least) that first character so you can
look up the length, and the second in a loop to get the rest. If you decide to
go the delimited route, you'll be receiving in some arbitrary chunk size, (4096
or 8192 is frequently a good match for network buffer sizes), and scanning what
you've received for a delimiter.
One complication to be aware of: if your conversational protocol allows multiple
messages to be sent back to back (without some kind of reply), and you pass
``recv`` an arbitrary chunk size, you may end up reading the start of a
following message. You'll need to put that aside and hold onto it, until it's
needed.
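One possible shape for the delimited route, holding on to whatever follows the
delimiter, is sketched below. The class name ``DelimitedReader``, the newline
delimiter and the use of ``socket.socketpair()`` for the demonstration are all
assumptions of mine, and the code is written for Python 3 (``bytes`` on the
wire).

```python
import socket

class DelimitedReader:
    """Hypothetical helper: hands out delimiter-terminated messages."""

    def __init__(self, sock, delimiter=b"\n", chunk_size=4096):
        self.sock = sock
        self.delimiter = delimiter
        self.chunk_size = chunk_size
        self.pending = b""      # bytes received but not yet handed out

    def next_message(self):
        # Keep receiving until the buffer holds at least one whole message.
        while self.delimiter not in self.pending:
            chunk = self.sock.recv(self.chunk_size)
            if chunk == b"":
                raise RuntimeError("socket connection broken")
            self.pending += chunk
        # Hand out the first message and put the rest aside for later.
        message, _, self.pending = self.pending.partition(self.delimiter)
        return message

# Two messages sent back to back; both may well arrive in a single recv.
sender, receiver = socket.socketpair()
sender.sendall(b"first\nsecond\n")
reader = DelimitedReader(receiver)
messages = [reader.next_message(), reader.next_message()]
sender.close()
receiver.close()
```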
Prefixing the message with its length (say, as 5 numeric characters) gets more
complex, because (believe it or not), you may not get all 5 characters in one
``recv``. In playing around, you'll get away with it; but in high network loads,
your code will very quickly break unless you use two ``recv`` loops - the first
to determine the length, the second to get the data part of the message. Nasty.
This is also when you'll discover that ``send`` does not always manage to get
rid of everything in one pass. And despite having read this, you will eventually
get bit by it!
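For what it's worth, the two-loop idea can be sketched roughly like this. The
function names are hypothetical, the 5-digit ASCII header simply follows the
example in the text, and ``sendall`` (a real socket method that keeps calling
``send`` until everything is out) stands in for a hand-written send loop. The
code is written for Python 3.

```python
import socket

HEADER_LEN = 5   # five ASCII digits carrying the payload length

def recv_exactly(sock, count):
    # Loop until exactly count bytes have arrived.
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if chunk == b"":
            raise RuntimeError("socket connection broken")
        data += chunk
    return data

def send_message(sock, payload):
    if len(payload) > 99999:
        raise ValueError("payload too long for a 5-digit header")
    header = ("%05d" % len(payload)).encode("ascii")
    sock.sendall(header + payload)

def recv_message(sock):
    # First loop: the length. Second loop: the data part of the message.
    length = int(recv_exactly(sock, HEADER_LEN).decode("ascii"))
    return recv_exactly(sock, length)

# Local demonstration over a connected pair of sockets.
sender, receiver = socket.socketpair()
send_message(sender, b"hello world")
send_message(sender, b"")
received = [recv_message(receiver), recv_message(receiver)]
sender.close()
receiver.close()
```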
In the interests of space, building your character, (and preserving my
competitive position), these enhancements are left as an exercise for the
reader. Let's move on to cleaning up.
Binary Data
-----------
It is perfectly possible to send binary data over a socket. The major problem is
that not all machines use the same formats for binary data. For example, a
Motorola chip will represent a 16 bit integer with the value 1 as the two hex
bytes 00 01. Intel and DEC, however, are byte-reversed - that same 1 is 01 00.
Socket libraries have calls for converting 16 and 32 bit integers - ``ntohl,
htonl, ntohs, htons`` where "n" means *network* and "h" means *host*, "s" means
*short* and "l" means *long*. Where network order is host order, these do
nothing, but where the machine is byte-reversed, these swap the bytes around
appropriately.
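A quick sketch of the byte-order issue, using the :mod:`struct` module (my
choice for the illustration; the text above only mentions the ``hton``/``ntoh``
calls) alongside ``socket.htons`` and ``socket.ntohs``:

```python
import socket
import struct

# "!" asks struct for network (big-endian) order; "H" is an unsigned 16-bit int.
packed_network = struct.pack("!H", 1)
# "<" forces little-endian, the byte-reversed layout described above.
packed_little = struct.pack("<H", 1)

# htons/ntohs convert between host and network order; a round trip is a no-op
# whatever the host's own byte order is.
round_trip = socket.ntohs(socket.htons(1))

(unpacked,) = struct.unpack("!H", packed_network)
```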
In these days of 32 bit machines, the ASCII representation of binary data is
frequently smaller than the binary representation. That's because a surprising
amount of the time, all those longs have the value 0, or maybe 1. The string "0"
would be two bytes, while binary is four. Of course, this doesn't fit well with
fixed-length messages. Decisions, decisions.
Disconnecting
=============
Strictly speaking, you're supposed to use ``shutdown`` on a socket before you
``close`` it. The ``shutdown`` is an advisory to the socket at the other end.
Depending on the argument you pass it, it can mean "I'm not going to send
anymore, but I'll still listen", or "I'm not listening, good riddance!". Most
socket libraries, however, are so used to programmers neglecting to use this
piece of etiquette that normally a ``close`` is the same as ``shutdown();
close()``. So in most situations, an explicit ``shutdown`` is not needed.
One way to use ``shutdown`` effectively is in an HTTP-like exchange. The client
sends a request and then does a ``shutdown(1)``. This tells the server "This
client is done sending, but can still receive." The server can detect "EOF" by
a receive of 0 bytes. It can assume it has the complete request. The server
sends a reply. If the ``send`` completes successfully then, indeed, the client
was still receiving.
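That exchange might look roughly like the sketch below. Both ends live in one
process and talk over ``socket.socketpair()`` purely to keep the example
self-contained; the helper ``read_until_eof`` and the message contents are made
up. ``socket.SHUT_WR`` is the named constant for ``shutdown(1)``. The code is
written for Python 3.

```python
import socket

def read_until_eof(sock):
    # Collect data until recv returns 0 bytes, i.e. the peer is done sending.
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if chunk == b"":
            return b"".join(chunks)
        chunks.append(chunk)

client, server = socket.socketpair()

client.sendall(b"GET status")
client.shutdown(socket.SHUT_WR)      # done sending, still willing to receive

request = read_until_eof(server)     # the server sees "EOF" after the request
server.sendall(b"status: fine")
server.close()

reply = read_until_eof(client)       # the client was indeed still receiving
client.close()
```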
Python takes the automatic shutdown a step further, and says that when a socket
is garbage collected, it will automatically do a ``close`` if it's needed. But
relying on this is a very bad habit. If your socket just disappears without
doing a ``close``, the socket at the other end may hang indefinitely, thinking
you're just being slow. *Please* ``close`` your sockets when you're done.
When Sockets Die
----------------
Probably the worst thing about using blocking sockets is what happens when the
other side comes down hard (without doing a ``close``). Your socket is likely to
hang. TCP is a reliable protocol, and it will wait a long, long time
before giving up on a connection. If you're using threads, the entire thread is
essentially dead. There's not much you can do about it. As long as you aren't
doing something dumb, like holding a lock while doing a blocking read, the
thread isn't really consuming much in the way of resources. Do *not* try to kill
the thread - part of the reason that threads are more efficient than processes
is that they avoid the overhead associated with the automatic recycling of
resources. In other words, if you do manage to kill the thread, your whole
process is likely to be screwed up.
Non-blocking Sockets
====================
If you've understood the preceding, you already know most of what you need to
know about the mechanics of using sockets. You'll still use the same calls, in
much the same ways. It's just that, if you do it right, your app will be almost
inside-out.
In Python, you use ``socket.setblocking(0)`` to make a socket non-blocking. In C,
it's more complex (for one thing, you'll need to choose between the POSIX flavor
``O_NONBLOCK`` and the almost indistinguishable older flavor ``O_NDELAY``, which
is completely different from ``TCP_NODELAY``), but it's the exact same idea. You
do this after creating the socket, but before using it. (Actually, if you're
nuts, you can switch back and forth.)
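A small sketch of what "non-blocking" means in practice, again assuming
``socket.socketpair()`` is available for a local demonstration:

```python
import errno
import socket

near, far = socket.socketpair()
far.setblocking(0)            # far is now non-blocking

try:
    far.recv(1024)            # nothing has been sent yet
    would_block = False
except socket.error as exc:
    # The call came back at once instead of waiting for data.
    would_block = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)

near.close()
far.close()
```

A blocking socket would simply have sat in ``recv`` until the other end sent
something or went away.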
The major mechanical difference is that ``send``, ``recv``, ``connect`` and
``accept`` can return without having done anything. You have (of course) a
number of choices. You can check return code and error codes and generally drive
yourself crazy. If you don't believe me, try it sometime. Your app will grow
large, buggy and suck CPU. So let's skip the brain-dead solutions and do it
right.
Use ``select``.
In C, coding ``select`` is fairly complex. In Python, it's a piece of cake, but
it's close enough to the C version that if you understand ``select`` in Python,
you'll have little trouble with it in C. ::
   ready_to_read, ready_to_write, in_error = \
       select.select(
           potential_readers,
           potential_writers,
           potential_errs,
           timeout)
You pass ``select`` three lists: the first contains all sockets that you might
want to try reading; the second all the sockets you might want to try writing
to, and the last (normally left empty) those that you want to check for errors.
You should note that a socket can go into more than one list. The ``select``
call is blocking, but you can give it a timeout. This is generally a sensible
thing to do - give it a nice long timeout (say a minute) unless you have good
reason to do otherwise.
In return, you will get three lists. They have the sockets that are actually
readable, writable and in error. Each of these lists is a subset (possibly
empty) of the corresponding list you passed in. A socket that you put in more
than one input list can show up in more than one output list.
If a socket is in the output readable list, you can be
as-close-to-certain-as-we-ever-get-in-this-business that a ``recv`` on that
socket will return *something*. Same idea for the writable list. You'll be able
to send *something*. Maybe not all you want to, but *something* is better than
nothing. (Actually, any reasonably healthy socket will return as writable - it
just means outbound network buffer space is available.)
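Here is a minimal sketch of the readable case. The connected pair from
``socket.socketpair()`` and the short timeouts are choices I made to keep the
example small and local; it is written for Python 3.

```python
import select
import socket

near, far = socket.socketpair()

# Nothing sent yet: with a timeout of 0, select returns at once and
# far is not reported as readable.
readable, writable, in_error = select.select([far], [], [], 0)
idle_readable = list(readable)

near.sendall(b"hello")

# Now far has data waiting; allow up to one second for it to show up.
readable, writable, in_error = select.select([far], [], [], 1.0)
data = far.recv(1024) if far in readable else None

near.close()
far.close()
```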
If you have a "server" socket, put it in the potential_readers list. If it comes
out in the readable list, your ``accept`` will (almost certainly) work. If you
have created a new socket to ``connect`` to someone else, put it in the
potential_writers list. If it shows up in the writable list, you have a decent
chance that it has connected.
One very nasty problem with ``select``: if somewhere in those input lists of
sockets is one which has died a nasty death, the ``select`` will fail. You then
need to loop through every single damn socket in all those lists and do a
``select([sock],[],[],0)`` until you find the bad one. That timeout of 0 means
it won't take long, but it's ugly.
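The ugly loop could be written along these lines. ``find_bad_sockets`` is a
made-up helper, and which exception you actually get depends on how the socket
died and on your Python version; on Python 3, a socket object that has been
closed locally makes ``select`` raise :exc:`ValueError`, which is what this
sketch demonstrates.

```python
import select
import socket

def find_bad_sockets(sockets):
    """Hypothetical helper: return the sockets that select refuses to take."""
    bad = []
    for sock in sockets:
        try:
            select.select([sock], [], [], 0)   # timeout 0: just poll
        except (ValueError, OSError):
            # OSError covers select.error and socket.error on Python 3.3+;
            # a locally closed socket object raises ValueError.
            bad.append(sock)
    return bad

good, good_peer = socket.socketpair()
dead, dead_peer = socket.socketpair()
dead.close()                      # simulate a socket that has gone away

bad = find_bad_sockets([good, dead])

good.close()
good_peer.close()
dead_peer.close()
```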
Actually, ``select`` can be handy even with blocking sockets. It's one way of
determining whether you will block - the socket returns as readable when there's
something in the buffers. However, this still doesn't help with the problem of
determining whether the other end is done, or just busy with something else.
**Portability alert**: On Unix, ``select`` works with both sockets and
files. Don't try this on Windows. On Windows, ``select`` works with sockets
only. Also note that in C, many of the more advanced socket options are done
differently on Windows. In fact, on Windows I usually use threads (which work
very, very well) with my sockets. Face it, if you want any kind of performance,
your code will look very different on Windows than on Unix.
Performance
-----------
There's no question that the fastest sockets code uses non-blocking sockets and
select to multiplex them. You can put together something that will saturate a
LAN connection without putting any strain on the CPU. The trouble is that an app
written this way can't do much of anything else - it needs to be ready to
shuffle bytes around at all times.
Assuming that your app is actually supposed to do something more than that,
threading is the optimal solution (and using non-blocking sockets will be
faster than using blocking sockets). Unfortunately, threading support in Unixes
varies both in API and quality. So the normal Unix solution is to fork a
subprocess to deal with each connection. The overhead for this is significant
(and don't do this on Windows - the overhead of process creation is enormous
there). It also means that unless each subprocess is completely independent,
you'll need to use another form of IPC, say a pipe, or shared memory and
semaphores, to communicate between the parent and child processes.
Finally, remember that even though blocking sockets are somewhat slower than
non-blocking, in many cases they are the "right" solution. After all, if your
app is driven by the data it receives over a socket, there's not much sense in
complicating the logic just so your app can wait on ``select`` instead of
``recv``.


@@ -0,0 +1,728 @@
*****************
Unicode HOWTO
*****************
:Release: 1.02
This HOWTO discusses Python's support for Unicode, and explains various problems
that people commonly encounter when trying to work with Unicode.
Introduction to Unicode
=======================
History of Character Codes
--------------------------
In 1968, the American Standard Code for Information Interchange, better known by
its acronym ASCII, was standardized. ASCII defined numeric codes for various
characters, with the numeric values running from 0 to
127. For example, the lowercase letter 'a' is assigned 97 as its code
value.
ASCII was an American-developed standard, so it only defined unaccented
characters. There was an 'e', but no 'é' or 'Í'. This meant that languages
which required accented characters couldn't be faithfully represented in ASCII.
(Actually the missing accents matter for English, too, which contains words such
as 'naïve' and 'café', and some publications have house styles which require
spellings such as 'coöperate'.)
For a while people just wrote programs that didn't display accents. I remember
looking at Apple ][ BASIC programs, published in French-language publications in
the mid-1980s, that had lines like these::
   PRINT "FICHIER EST COMPLETE."
   PRINT "CARACTERE NON ACCEPTE."
Those messages should contain accents, and they just look wrong to someone who
can read French.
In the 1980s, almost all personal computers were 8-bit, meaning that bytes could
hold values ranging from 0 to 255. ASCII codes only went up to 127, so some
machines assigned values between 128 and 255 to accented characters. Different
machines had different codes, however, which led to problems exchanging files.
Eventually various commonly used sets of values for the 128-255 range emerged.
Some were true standards, defined by the International Standards Organization,
and some were **de facto** conventions that were invented by one company or
another and managed to catch on.
255 characters aren't very many. For example, you can't fit both the accented
characters used in Western Europe and the Cyrillic alphabet used for Russian
into the 128-255 range because there are more than 128 such characters.
You could write files using different codes (all your Russian files in a coding
system called KOI8, all your French files in a different coding system called
Latin1), but what if you wanted to write a French document that quotes some
Russian text? In the 1980s people began to want to solve this problem, and the
Unicode standardization effort began.
Unicode started out using 16-bit characters instead of 8-bit characters. 16
bits means you have 2^16 = 65,536 distinct values available, making it possible
to represent many different characters from many different alphabets; an initial
goal was to have Unicode contain the alphabets for every single human language.
It turns out that even 16 bits isn't enough to meet that goal, and the modern
Unicode specification uses a wider range of codes, 0-1,114,111 (0x10ffff in
base-16).
There's a related ISO standard, ISO 10646. Unicode and ISO 10646 were
originally separate efforts, but the specifications were merged with the 1.1
revision of Unicode.
(This discussion of Unicode's history is highly simplified. I don't think the
average Python programmer needs to worry about the historical details; consult
the Unicode consortium site listed in the References for more information.)
Definitions
-----------
A **character** is the smallest possible component of a text. 'A', 'B', 'C',
etc., are all different characters. So are 'È' and 'Í'. Characters are
abstractions, and vary depending on the language or context you're talking
about. For example, the symbol for ohms (Ω) is usually drawn much like the
capital letter omega (Ω) in the Greek alphabet (they may even be the same in
some fonts), but these are two different characters that have different
meanings.
The Unicode standard describes how characters are represented by **code
points**. A code point is an integer value, usually denoted in base 16. In the
standard, a code point is written using the notation U+12ca to mean the
character with value 0x12ca (4810 decimal). The Unicode standard contains a lot
of tables listing characters and their corresponding code points::
   0061    'a'; LATIN SMALL LETTER A
   0062    'b'; LATIN SMALL LETTER B
   0063    'c'; LATIN SMALL LETTER C
   ...
   007B    '{'; LEFT CURLY BRACKET
Strictly, these definitions imply that it's meaningless to say 'this is
character U+12ca'. U+12ca is a code point, which represents some particular
character; in this case, it represents the character 'ETHIOPIC SYLLABLE WI'. In
informal contexts, this distinction between code points and characters will
sometimes be forgotten.
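The mapping between characters and code points can be poked at from Python
itself. This sketch is written for Python 3, where ``chr`` accepts any code
point (on Python 2 the equivalent is ``unichr``); the variable names are just
for illustration.

```python
import unicodedata

code_point = ord(u"a")                  # character -> code point
character = chr(0x12ca)                 # code point -> character
name = unicodedata.name(character)      # the character's official name
notation = "U+%04x" % ord(character)    # the U+ notation used in this HOWTO
```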
A character is represented on a screen or on paper by a set of graphical
elements that's called a **glyph**. The glyph for an uppercase A, for example,
is two diagonal strokes and a horizontal stroke, though the exact details will
depend on the font being used. Most Python code doesn't need to worry about
glyphs; figuring out the correct glyph to display is generally the job of a GUI
toolkit or a terminal's font renderer.
Encodings
---------
To summarize the previous section: a Unicode string is a sequence of code
points, which are numbers from 0 to 0x10ffff. This sequence needs to be
represented as a set of bytes (meaning, values from 0-255) in memory. The rules
for translating a Unicode string into a sequence of bytes are called an
**encoding**.
The first encoding you might think of is an array of 32-bit integers. In this
representation, the string "Python" would look like this::
      P           y           t           h           o           n
   0x50 00 00 00 79 00 00 00 74 00 00 00 68 00 00 00 6f 00 00 00 6e 00 00 00
      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
This representation is straightforward but using it presents a number of
problems.
1. It's not portable; different processors order the bytes differently.
2. It's very wasteful of space. In most texts, the majority of the code points
are less than 127, or less than 255, so a lot of space is occupied by zero
bytes. The above string takes 24 bytes compared to the 6 bytes needed for an
ASCII representation. Increased RAM usage doesn't matter too much (desktop
computers have megabytes of RAM, and strings aren't usually that large), but
expanding our usage of disk and network bandwidth by a factor of 4 is
intolerable.
3. It's not compatible with existing C functions such as ``strlen()``, so a new
family of wide string functions would need to be used.
4. Many Internet standards are defined in terms of textual data, and can't
handle content with embedded zero bytes.
Generally people don't use this encoding, instead choosing other encodings that
are more efficient and convenient.
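The array-of-32-bit-integers representation corresponds to what Python's codec
registry calls UTF-32. As a small sketch of the size and byte-order problems
listed above (the codec names ``utf-32-le`` and ``utf-32-be`` are available
from Python 2.6 on):

```python
text = u"Python"

little = text.encode("utf-32-le")   # least significant byte first
big = text.encode("utf-32-be")      # most significant byte first
compact = text.encode("ascii")      # one byte per character

sizes = (len(little), len(big), len(compact))
```

The same six characters produce two different 24-byte sequences depending on
byte order, against 6 bytes for ASCII.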
Encodings don't have to handle every possible Unicode character, and most
encodings don't. For example, Python's default encoding is the 'ascii'
encoding. The rules for converting a Unicode string into the ASCII encoding are
simple; for each code point:
1. If the code point is < 128, each byte is the same as the value of the code
point.
2. If the code point is 128 or greater, the Unicode string can't be represented
in this encoding. (Python raises a :exc:`UnicodeEncodeError` exception in this
case.)
Latin-1, also known as ISO-8859-1, is a similar encoding. Unicode code points
0-255 are identical to the Latin-1 values, so converting to this encoding simply
requires converting code points to byte values; if a code point larger than 255
is encountered, the string can't be encoded into Latin-1.
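The limits of both encodings are easy to demonstrate. In this sketch the helper
``can_encode`` is invented for illustration; the codec names ``ascii`` and
``latin-1`` are standard.

```python
word = u"caf\u00e9"        # U+00E9 is above 127 but below 256
euro = u"\u20ac"           # EURO SIGN, well above 255

latin1_bytes = word.encode("latin-1")

def can_encode(text, encoding):
    # Hypothetical helper: True if every code point fits the encoding.
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

results = [
    can_encode(word, "ascii"),     # fails: U+00E9 is not ASCII
    can_encode(word, "latin-1"),   # works: code point equals byte value
    can_encode(euro, "latin-1"),   # fails: code point larger than 255
]
```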
Encodings don't have to be simple one-to-one mappings like Latin-1. Consider
IBM's EBCDIC, which was used on IBM mainframes. Letter values weren't in one
block: 'a' through 'i' had values from 129 to 137, but 'j' through 'r' were 145
through 153. If you wanted to use EBCDIC as an encoding, you'd probably use
some sort of lookup table to perform the conversion, but this is largely an
internal detail.
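Python happens to ship codecs for several EBCDIC code pages, so the gaps in the
letter values can be observed directly. The sketch below uses ``cp037``, one of
those code pages; treat the choice of code page as an assumption, since EBCDIC
comes in many variants.

```python
# Encode a few letters on either side of the gaps in the alphabet.
ebcdic = u"aijr".encode("cp037")
values = list(bytearray(ebcdic))     # the byte values as plain integers
```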
UTF-8 is one of the most commonly used encodings. UTF stands for "Unicode
Transformation Format", and the '8' means that 8-bit numbers are used in the
encoding. (There's also a UTF-16 encoding, but it's less frequently used than
UTF-8.) UTF-8 uses the following rules:
1. If the code point is <128, it's represented by the corresponding byte value.
2. If the code point is between 128 and 0x7ff, it's turned into two byte values
between 128 and 255.
3. Code points >0x7ff are turned into three- or four-byte sequences, where each
byte of the sequence is between 128 and 255.
UTF-8 has several convenient properties:
1. It can handle any Unicode code point.
2. A Unicode string is turned into a string of bytes containing no embedded zero
bytes. This avoids byte-ordering issues, and means UTF-8 strings can be
processed by C functions such as ``strcpy()`` and sent through protocols that
can't handle zero bytes.
3. A string of ASCII text is also valid UTF-8 text.
4. UTF-8 is fairly compact; the majority of commonly used characters are turned
into one or two bytes, and values less than 128 occupy only a single byte.
5. If bytes are corrupted or lost, it's possible to determine the start of the
next UTF-8-encoded code point and resynchronize. It's also unlikely that
random 8-bit data will look like valid UTF-8.
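A few of these properties can be checked with a short sketch. The sample
characters are arbitrary picks from each length class, and the code is written
for Python 3.

```python
# One sample character from each UTF-8 length class.
samples = [u"a", u"\u00e9", u"\u20ac", u"\U0001f600"]
lengths = [len(s.encode("utf-8")) for s in samples]

# ASCII text is encoded identically by both codecs.
ascii_same = u"Python".encode("utf-8") == u"Python".encode("ascii")

# No zero bytes show up in ordinary encoded text.
no_zero_bytes = b"\x00" not in u"caf\u00e9 \u20ac".encode("utf-8")

# Every byte of a multi-byte sequence is 128 or greater.
high_bytes = all(b >= 128 for b in bytearray(u"\u20ac".encode("utf-8")))
```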
References
----------
The Unicode Consortium site at <http://www.unicode.org> has character charts, a
glossary, and PDF versions of the Unicode specification. Be prepared for some
difficult reading. <http://www.unicode.org/history/> is a chronology of the
origin and development of Unicode.
To help understand the standard, Jukka Korpela has written an introductory guide
to reading the Unicode character tables, available at
<http://www.cs.tut.fi/~jkorpela/unicode/guide.html>.
Two other good introductory articles were written by Joel Spolsky
<http://www.joelonsoftware.com/articles/Unicode.html> and Jason Orendorff
<http://www.jorendorff.com/articles/unicode/>. If this introduction didn't make
things clear to you, you should try reading one of these alternate articles
before continuing.
Wikipedia entries are often helpful; see the entries for "character encoding"
<http://en.wikipedia.org/wiki/Character_encoding> and UTF-8
<http://en.wikipedia.org/wiki/UTF-8>, for example.
Python's Unicode Support
========================
Now that you've learned the rudiments of Unicode, we can look at Python's
Unicode features.
The Unicode Type
----------------
Unicode strings are expressed as instances of the :class:`unicode` type, one of
Python's repertoire of built-in types. It derives from an abstract type called
:class:`basestring`, which is also an ancestor of the :class:`str` type; you can
therefore check if a value is a string type with ``isinstance(value,
basestring)``. Under the hood, Python represents Unicode strings as either 16-
or 32-bit integers, depending on how the Python interpreter was compiled.
The :func:`unicode` constructor has the signature ``unicode(string[, encoding,
errors])``. All of its arguments should be 8-bit strings. The first argument
is converted to Unicode using the specified encoding; if you leave off the
``encoding`` argument, the ASCII encoding is used for the conversion, so
characters greater than 127 will be treated as errors::
>>> unicode('abcdef')
u'abcdef'
>>> s = unicode('abcdef')
>>> type(s)
<type 'unicode'>
>>> unicode('abcdef' + chr(255))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 6:
ordinal not in range(128)
The ``errors`` argument specifies the response when the input string can't be
converted according to the encoding's rules. Legal values for this argument are
'strict' (raise a ``UnicodeDecodeError`` exception), 'replace' (add U+FFFD,
'REPLACEMENT CHARACTER'), or 'ignore' (just leave the character out of the
Unicode result). The following examples show the differences::
>>> unicode('\x80abc', errors='strict')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
ordinal not in range(128)
>>> unicode('\x80abc', errors='replace')
u'\ufffdabc'
>>> unicode('\x80abc', errors='ignore')
u'abc'
Encodings are specified as strings containing the encoding's name. Python 2.4
comes with roughly 100 different encodings; see the Python Library Reference at
:ref:`standard-encodings` for a list. Some encodings
have multiple names; for example, 'latin-1', 'iso_8859_1' and '8859' are all
synonyms for the same encoding.
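One way to confirm that several names refer to the same encoding is to look
each of them up in the codec registry. This is a small sketch that assumes
Python 2.5 or later, where ``codecs.lookup()`` returns an object with a
``name`` attribute; the variable names are chosen only for this example.

```python
import codecs

aliases = ['latin-1', 'iso_8859_1', '8859']

# codecs.lookup() resolves an alias and returns the codec's registry entry;
# its .name attribute is the canonical name of the encoding.
canonical = [codecs.lookup(alias).name for alias in aliases]

# All three aliases should resolve to one and the same canonical name.
all_same = len(set(canonical)) == 1
```

An unknown name makes ``codecs.lookup()`` raise :exc:`LookupError`, so the same
call can be used to validate an encoding name supplied by a user.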
One-character Unicode strings can also be created with the :func:`unichr`
built-in function, which takes integers and returns a Unicode string of length 1
that contains the corresponding code point. The reverse operation is the
built-in :func:`ord` function that takes a one-character Unicode string and
returns the code point value::
>>> unichr(40960)
u'\ua000'
>>> ord(u'\ua000')
40960
Instances of the :class:`unicode` type have many of the same methods as the
8-bit string type for operations such as searching and formatting::
>>> s = u'Was ever feather so lightly blown to and fro as this multitude?'
>>> s.count('e')
5
>>> s.find('feather')
9
>>> s.find('bird')
-1
>>> s.replace('feather', 'sand')
u'Was ever sand so lightly blown to and fro as this multitude?'
>>> s.upper()
u'WAS EVER FEATHER SO LIGHTLY BLOWN TO AND FRO AS THIS MULTITUDE?'
Note that the arguments to these methods can be Unicode strings or 8-bit
strings. 8-bit strings will be converted to Unicode before carrying out the
operation; Python's default ASCII encoding will be used, so characters greater
than 127 will cause an exception::
>>> s.find('Was\x9f')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9f in position 3: ordinal not in range(128)
>>> s.find(u'Was\x9f')
-1
Much Python code that operates on strings will therefore work with Unicode
strings without requiring any changes to the code. (Input and output code needs
more updating for Unicode; more on this later.)
Another important method is ``.encode([encoding], [errors='strict'])``, which
returns an 8-bit string version of the Unicode string, encoded in the requested
encoding. The ``errors`` parameter is the same as the parameter of the
``unicode()`` constructor, with one additional possibility; as well as 'strict',
'ignore', and 'replace', you can also pass 'xmlcharrefreplace' which uses XML's
character references. The following example shows the different results::
>>> u = unichr(40960) + u'abcd' + unichr(1972)
>>> u.encode('utf-8')
'\xea\x80\x80abcd\xde\xb4'
>>> u.encode('ascii')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\ua000' in position 0: ordinal not in range(128)
>>> u.encode('ascii', 'ignore')
'abcd'
>>> u.encode('ascii', 'replace')
'?abcd?'
>>> u.encode('ascii', 'xmlcharrefreplace')
'&#40960;abcd&#1972;'
Python's 8-bit strings have a ``.decode([encoding], [errors])`` method that
interprets the string using the given encoding::
>>> u = unichr(40960) + u'abcd' + unichr(1972) # Assemble a string
>>> utf8_version = u.encode('utf-8') # Encode as UTF-8
>>> type(utf8_version), utf8_version
(<type 'str'>, '\xea\x80\x80abcd\xde\xb4')
>>> u2 = utf8_version.decode('utf-8') # Decode using UTF-8
>>> u == u2 # The two strings match
True
The low-level routines for registering and accessing the available encodings are
found in the :mod:`codecs` module. However, the encoding and decoding functions
returned by this module are usually more low-level than is comfortable, so I'm
not going to describe the :mod:`codecs` module here. If you need to implement a
completely new encoding, you'll need to learn about the :mod:`codecs` module
interfaces, but implementing encodings is a specialized task that also won't be
covered here. Consult the Python documentation to learn more about this module.
The most commonly used part of the :mod:`codecs` module is the
:func:`codecs.open` function which will be discussed in the section on input and
output.
Unicode Literals in Python Source Code
--------------------------------------
In Python source code, Unicode literals are written as strings prefixed with the
'u' or 'U' character: ``u'abcdefghijk'``. Specific code points can be written
using the ``\u`` escape sequence, which is followed by four hex digits giving
the code point. The ``\U`` escape sequence is similar, but expects 8 hex
digits, not 4.
Unicode literals can also use the same escape sequences as 8-bit strings,
including ``\x``, but ``\x`` only takes two hex digits so it can't express an
arbitrary code point. Octal escapes can go up to U+01ff, which is octal 777.
::
>>> s = u"a\xac\u1234\u20ac\U00008000"
           ^^^^ two-digit hex escape
               ^^^^^^ four-digit Unicode escape
                           ^^^^^^^^^^ eight-digit Unicode escape
>>> for c in s: print ord(c),
...
97 172 4660 8364 32768
Using escape sequences for code points greater than 127 is fine in small doses,
but becomes an annoyance if you're using many accented characters, as you would
in a program with messages in French or some other accent-using language. You
can also assemble strings using the :func:`unichr` built-in function, but this is
even more tedious.
Ideally, you'd want to be able to write literals in your language's natural
encoding. You could then edit Python source code with your favorite editor
which would display the accented characters naturally, and have the right
characters used at runtime.
Python supports writing Unicode literals in any encoding, but you have to
declare the encoding being used. This is done by including a special comment as
either the first or second line of the source file::
#!/usr/bin/env python
# -*- coding: latin-1 -*-
u = u'abcdé'
print ord(u[-1])
The syntax is inspired by Emacs's notation for specifying variables local to a
file. Emacs supports many different variables, but Python only supports
'coding'. The ``-*-`` symbols indicate to Emacs that the comment is special;
they have no significance to Python but are a convention. Python looks for
``coding: name`` or ``coding=name`` in the comment.
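The matching can be approximated with a regular expression. The sketch below
uses a simplified pattern in the spirit of the one given in PEP 263; the helper
name ``declared_encoding`` is hypothetical, and the real interpreter does more
work (it only inspects the first two lines and also handles a UTF-8 byte-order
mark).

```python
import re

# Simplified pattern: "coding", then ":" or "=", then the encoding name.
CODING_RE = re.compile(r'coding[:=]\s*([-\w.]+)')


def declared_encoding(line):
    """Return the encoding named in a comment line, or None."""
    if not line.lstrip().startswith('#'):
        return None                      # only comments can carry a declaration
    match = CODING_RE.search(line)
    if match is None:
        return None
    return match.group(1)


emacs_style = declared_encoding('# -*- coding: latin-1 -*-')
vim_style = declared_encoding('# vim: set fileencoding=utf-8 :')
plain_code = declared_encoding("u = u'abc'")
```

Both the Emacs-style and the Vim-style comment are recognized because each
contains the text ``coding`` followed by ``:`` or ``=``.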
If you don't include such a comment, the default encoding used will be ASCII.
Versions of Python before 2.4 were Euro-centric and assumed Latin-1 as a default
encoding for string literals; in Python 2.4, characters greater than 127 still
work but result in a warning. For example, the following program has no
encoding declaration::
#!/usr/bin/env python
u = u'abcdé'
print ord(u[-1])
When you run it with Python 2.4, it will output the following warning::
amk:~$ python p263.py
sys:1: DeprecationWarning: Non-ASCII character '\xe9'
in file p263.py on line 2, but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details
Unicode Properties
------------------
The Unicode specification includes a database of information about code points.
For each code point that's defined, the information includes the character's
name, its category, and the numeric value if applicable (Unicode has characters
representing the Roman numerals and fractions such as one-third and
four-fifths). There are also properties related to the code point's use in
bidirectional text and other display-related properties.
The following program displays some information about several characters, and
prints the numeric value of one particular character::
import unicodedata
u = unichr(233) + unichr(0x0bf2) + unichr(3972) + unichr(6000) + unichr(13231)
for i, c in enumerate(u):
    print i, '%04x' % ord(c), unicodedata.category(c),
    print unicodedata.name(c)
# Get numeric value of second character
print unicodedata.numeric(u[1])
When run, this prints::
0 00e9 Ll LATIN SMALL LETTER E WITH ACUTE
1 0bf2 No TAMIL NUMBER ONE THOUSAND
2 0f84 Mn TIBETAN MARK HALANTA
3 1770 Lo TAGBANWA LETTER SA
4 33af So SQUARE RAD OVER S SQUARED
1000.0
The category codes are abbreviations describing the nature of the character.
These are grouped into categories such as "Letter", "Number", "Punctuation", or
"Symbol", which in turn are broken up into subcategories. To take the codes
from the above output, ``'Ll'`` means "Letter, lowercase", ``'No'`` means
"Number, other", ``'Mn'`` is "Mark, nonspacing", and ``'So'`` is "Symbol,
other". See
<http://www.unicode.org/Public/UNIDATA/UCD.html#General_Category_Values> for a
list of category codes.
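Because the first letter of a category code identifies the major class, a small
lookup table is enough to group characters. This is an illustrative sketch; the
table and the function name ``major_class`` are not part of the
:mod:`unicodedata` module.

```python
import unicodedata

# First letter of the two-letter General Category code -> major class.
MAJOR_CLASSES = {
    'L': 'Letter',
    'M': 'Mark',
    'N': 'Number',
    'P': 'Punctuation',
    'S': 'Symbol',
    'Z': 'Separator',
    'C': 'Other',
}


def major_class(char):
    """Return the major class of a one-character Unicode string."""
    return MAJOR_CLASSES[unicodedata.category(char)[0]]


# Four of the characters from the program above, plus an ASCII "!".
classes = [major_class(c) for c in u'\xe9\u0bf2\u0f84\u33af!']
```

For the characters used above this gives Letter, Number, Mark, Symbol and
Punctuation, matching the ``Ll``, ``No``, ``Mn`` and ``So`` codes in the earlier
output.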
References
----------
The Unicode and 8-bit string types are described in the Python library reference
at :ref:`typesseq`.
The documentation for the :mod:`unicodedata` module.
The documentation for the :mod:`codecs` module.
Marc-André Lemburg gave a presentation at EuroPython 2002 titled "Python and
Unicode". A PDF version of his slides is available at
<http://downloads.egenix.com/python/Unicode-EPC2002-Talk.pdf>, and is an
excellent overview of the design of Python's Unicode features.
Reading and Writing Unicode Data
================================
Once you've written some code that works with Unicode data, the next problem is
input/output. How do you get Unicode strings into your program, and how do you
convert Unicode into a form suitable for storage or transmission?
It's possible that you may not need to do anything depending on your input
sources and output destinations; you should check whether the libraries used in
your application support Unicode natively. XML parsers often return Unicode
data, for example. Many relational databases also support Unicode-valued
columns and can return Unicode values from an SQL query.
Unicode data is usually converted to a particular encoding before it gets
written to disk or sent over a socket. It's possible to do all the work
yourself: open a file, read an 8-bit string from it, and convert the string with
``unicode(str, encoding)``. However, the manual approach is not recommended.
One problem is the multi-byte nature of encodings; one Unicode character can be
represented by several bytes. If you want to read the file in arbitrary-sized
chunks (say, 1K or 4K), you need to write error-handling code to catch the case
where only part of the bytes encoding a single Unicode character are read at the
end of a chunk. One solution would be to read the entire file into memory and
then perform the decoding, but that prevents you from working with files that
are extremely large; if you need to read a 2 GB file, you need 2 GB of RAM.
(More, really, since for at least a moment you'd need to have both the encoded
string and its Unicode version in memory.)
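The chunk-boundary problem is easy to reproduce. The sketch below splits a UTF-8
byte string in the middle of a three-byte sequence: decoding the first chunk on
its own fails, while an incremental decoder obtained from
``codecs.getincrementaldecoder()`` (available since Python 2.5) holds on to the
incomplete bytes until the rest arrives. The sample text and the split position
are arbitrary choices made for this illustration.

```python
import codecs

original = u'caf\xe9 \u20ac5'
data = original.encode('utf-8')        # 10 bytes; the euro sign takes 3
chunks = [data[:7], data[7:]]          # the split falls inside the euro sign

# Naive approach: decode each chunk independently.
try:
    chunks[0].decode('utf-8')
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

# Incremental approach: the decoder buffers a trailing partial sequence.
decoder = codecs.getincrementaldecoder('utf-8')()
pieces = [decoder.decode(chunk) for chunk in chunks]
pieces.append(decoder.decode(b'', final=True))   # flush; errors if bytes remain
text = u''.join(pieces)
```

After the loop, ``text`` equals ``original`` even though neither chunk was valid
UTF-8 by itself.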
The solution would be to use the low-level decoding interface to catch the case
of partial coding sequences. The work of implementing this has already been
done for you: the :mod:`codecs` module includes a version of the :func:`open`
function that returns a file-like object that assumes the file's contents are in
a specified encoding and accepts Unicode parameters for methods such as
``.read()`` and ``.write()``.
The function's parameters are ``open(filename, mode='rb', encoding=None,
errors='strict', buffering=1)``. ``mode`` can be ``'r'``, ``'w'``, or ``'a'``,
just like the corresponding parameter to the regular built-in ``open()``
function; add a ``'+'`` to update the file. ``buffering`` is similarly parallel
to the standard function's parameter. ``encoding`` is a string giving the
encoding to use; if it's left as ``None``, a regular Python file object that
accepts 8-bit strings is returned. Otherwise, a wrapper object is returned, and
data written to or read from the wrapper object will be converted as needed.
``errors`` specifies the action for encoding errors and can be one of the usual
values of 'strict', 'ignore', and 'replace'.
Reading Unicode from a file is therefore simple::
import codecs
f = codecs.open('unicode.rst', encoding='utf-8')
for line in f:
    print repr(line)
It's also possible to open files in update mode, allowing both reading and
writing::
f = codecs.open('test', encoding='utf-8', mode='w+')
f.write(u'\u4500 blah blah blah\n')
f.seek(0)
print repr(f.readline()[:1])
f.close()
Unicode character U+FEFF is used as a byte-order mark (BOM), and is often
written as the first character of a file in order to assist with autodetection
of the file's byte ordering. Some encodings, such as UTF-16, expect a BOM to be
present at the start of a file; when such an encoding is used, the BOM will be
automatically written as the first character and will be silently dropped when
the file is read. There are variants of these encodings, such as 'utf-16-le'
and 'utf-16-be' for little-endian and big-endian encodings, that specify one
particular byte ordering and don't skip the BOM.
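The difference between the BOM-writing codec and the fixed-byte-order variants
can be seen on in-memory strings, without creating a file. This is a brief
sketch; the variable names are only for illustration.

```python
import codecs

with_bom = u'abc'.encode('utf-16')       # native byte order, BOM prepended
little = u'abc'.encode('utf-16-le')      # fixed byte order, no BOM written

# The first two bytes of the 'utf-16' output are one of the two BOM forms.
starts_with_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# Decoding with 'utf-16' consumes the BOM silently ...
round_trip = with_bom.decode('utf-16')

# ... but 'utf-16-le' treats a leading BOM as an ordinary U+FEFF character.
kept = (codecs.BOM_UTF16_LE + little).decode('utf-16-le')
```

``round_trip`` is ``u'abc'``, whereas ``kept`` is ``u'\ufeffabc'``: when reading
a BOM-prefixed file with one of the fixed-order codecs, the program has to strip
the leading U+FEFF itself.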
Unicode filenames
-----------------
Most of the operating systems in common use today support filenames that contain
arbitrary Unicode characters. Usually this is implemented by converting the
Unicode string into some encoding that varies depending on the system. For
example, Mac OS X uses UTF-8 while Windows uses a configurable encoding; on
Windows, Python uses the name "mbcs" to refer to whatever the currently
configured encoding is. On Unix systems, there will only be a filesystem
encoding if you've set the ``LANG`` or ``LC_CTYPE`` environment variables; if
you haven't, the default encoding is ASCII.
The :func:`sys.getfilesystemencoding` function returns the encoding to use on
your current system, in case you want to do the encoding manually, but there's
not much reason to bother. When opening a file for reading or writing, you can
usually just provide the Unicode string as the filename, and it will be
automatically converted to the right encoding for you::
filename = u'filename\u4500abc'
f = open(filename, 'w')
f.write('blah\n')
f.close()
Functions in the :mod:`os` module such as :func:`os.stat` will also accept Unicode
filenames.
:func:`os.listdir`, which returns filenames, raises an issue: should it return
the Unicode version of filenames, or should it return 8-bit strings containing
the encoded versions? :func:`os.listdir` will do both, depending on whether you
provided the directory path as an 8-bit string or a Unicode string. If you pass
a Unicode string as the path, filenames will be decoded using the filesystem's
encoding and a list of Unicode strings will be returned, while passing an 8-bit
path will return the 8-bit versions of the filenames. For example, assuming the
default filesystem encoding is UTF-8, running the following program::
fn = u'filename\u4500abc'
f = open(fn, 'w')
f.close()
import os
print os.listdir('.')
print os.listdir(u'.')
will produce the following output::
amk:~$ python t.py
['.svn', 'filename\xe4\x94\x80abc', ...]
[u'.svn', u'filename\u4500abc', ...]
The first list contains UTF-8-encoded filenames, and the second list contains
the Unicode versions.
Tips for Writing Unicode-aware Programs
---------------------------------------
This section provides some suggestions on writing software that deals with
Unicode.
The most important tip is:
Software should only work with Unicode strings internally, converting to a
particular encoding on output.
If you attempt to write processing functions that accept both Unicode and 8-bit
strings, you will find your program vulnerable to bugs wherever you combine the
two different kinds of strings. Python's default encoding is ASCII, so whenever
an 8-bit string containing a byte value > 127 is in the input data, you'll get a
:exc:`UnicodeDecodeError` because that byte can't be handled by the ASCII
encoding.
It's easy to miss such problems if you only test your software with data that
doesn't contain any accents; everything will seem to work, but there's actually
a bug in your program waiting for the first user who attempts to use characters
> 127. A second tip, therefore, is:
Include characters > 127 and, even better, characters > 255 in your test
data.
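Both tips can be sketched in a few lines: decode once where data enters the
program, do all processing on Unicode strings, and encode once on the way out.
The function ``shout`` and its default encoding are hypothetical and stand in
for whatever processing an application performs.

```python
def shout(raw_bytes, encoding='utf-8'):
    """Upper-case some encoded text and append an exclamation mark."""
    text = raw_bytes.decode(encoding)     # decode once, at the input boundary
    result = text.upper() + u'!'          # all processing happens on Unicode
    return result.encode(encoding)        # encode once, at the output boundary


# Test data containing a character > 127 (e-acute) and one > 255 (c-acute).
sample = u'caf\xe9 \u0107'.encode('utf-8')
output = shout(sample)
```

With ASCII-only test data a version of ``shout`` that skipped the decoding step
would appear to work just as well; the non-ASCII sample is what exposes the
difference.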
When using data coming from a web browser or some other untrusted source, a
common technique is to check for illegal characters in a string before using the
string in a generated command line or storing it in a database. If you're doing
this, be careful to check the string once it's in the form that will be used or
stored; it's possible for encodings to be used to disguise characters. This is
especially true if the input data also specifies the encoding; many encodings
leave the commonly checked-for characters alone, but Python includes some
encodings such as ``'base64'`` that modify every single character.
For example, let's say you have a content management system that takes a Unicode
filename, and you want to disallow paths with a '/' character. You might write
this code::
def read_file(filename, encoding):
    if '/' in filename:
        raise ValueError("'/' not allowed in filenames")
    unicode_name = filename.decode(encoding)
    f = open(unicode_name, 'r')
    # ... return contents of file ...
However, if an attacker could specify the ``'base64'`` encoding, they could pass
``'L2V0Yy9wYXNzd2Q='``, which is the base-64 encoded form of the string
``'/etc/passwd'``, to read a system file. The above code looks for ``'/'``
characters in the encoded form and misses the dangerous character in the
resulting decoded form.
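A safer ordering is to decode first and validate the decoded result. The sketch
below illustrates only that ordering; ``checked_name`` and ``from_base64`` are
hypothetical helpers, and the :mod:`base64` module is used in place of the
``'base64'`` codec so that the example runs unchanged on newer Python versions.

```python
import base64


def from_base64(raw):
    # Stand-in for an attacker-selected encoding that rewrites every character.
    return base64.b64decode(raw).decode('ascii')


def checked_name(raw_name, decode):
    """Decode a filename, then validate the form that will actually be used."""
    name = decode(raw_name)               # decode first ...
    if u'/' in name:                      # ... then check the decoded string
        raise ValueError("'/' not allowed in filenames")
    return name


# A harmless name passes the check.
harmless = checked_name(base64.b64encode(b'notes.txt'), from_base64)

# The disguised '/etc/passwd' from the text is now rejected.
try:
    checked_name(b'L2V0Yy9wYXNzd2Q=', from_base64)
    rejected = False
except ValueError:
    rejected = True
```

The check itself is unchanged; only its position relative to the decoding step
differs from the vulnerable version.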
References
----------
The PDF slides for Marc-André Lemburg's presentation "Writing Unicode-aware
Applications in Python" are available at
<http://downloads.egenix.com/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf>
and discuss questions of character encodings as well as how to internationalize
and localize an application.
Revision History and Acknowledgements
=====================================
Thanks to the following people who have noted errors or offered suggestions on
this article: Nicholas Bastin, Marius Gedminas, Kent Johnson, Ken Krugler,
Marc-André Lemburg, Martin von Löwis, Chad Whitacre.
Version 1.0: posted August 5 2005.
Version 1.01: posted August 7 2005. Corrects factual and markup errors; adds
several links.
Version 1.02: posted August 16 2005. Corrects factual errors.
.. comment Additional topic: building Python w/ UCS2 or UCS4 support
.. comment Describe obscure -U switch somewhere?
.. comment Describe use of codecs.StreamRecoder and StreamReaderWriter
.. comment
Original outline:
- [ ] Unicode introduction
- [ ] ASCII
- [ ] Terms
- [ ] Character
- [ ] Code point
- [ ] Encodings
- [ ] Common encodings: ASCII, Latin-1, UTF-8
- [ ] Unicode Python type
- [ ] Writing unicode literals
- [ ] Obscurity: -U switch
- [ ] Built-ins
- [ ] unichr()
- [ ] ord()
- [ ] unicode() constructor
- [ ] Unicode type
- [ ] encode(), decode() methods
- [ ] Unicodedata module for character properties
- [ ] I/O
- [ ] Reading/writing Unicode data into files
- [ ] Byte-order marks
- [ ] Unicode filenames
- [ ] Writing Unicode programs
- [ ] Do everything in Unicode
- [ ] Declaring source code encodings (PEP 263)
- [ ] Other issues
- [ ] Building Python (UCS2, UCS4)


@@ -0,0 +1,579 @@
************************************************
HOWTO Fetch Internet Resources Using urllib2
************************************************
:Author: `Michael Foord <http://www.voidspace.org.uk/python/index.shtml>`_
.. note::
There is a French translation of an earlier revision of this
HOWTO, available at `urllib2 - Le Manuel manquant
<http://www.voidspace.org.uk/python/articles/urllib2_francais.shtml>`_.
Introduction
============
.. sidebar:: Related Articles
You may also find the following article on fetching web resources with Python
useful:
* `Basic Authentication <http://www.voidspace.org.uk/python/articles/authentication.shtml>`_
A tutorial on *Basic Authentication*, with examples in Python.
**urllib2** is a `Python <http://www.python.org>`_ module for fetching URLs
(Uniform Resource Locators). It offers a very simple interface, in the form of
the *urlopen* function. This is capable of fetching URLs using a variety of
different protocols. It also offers a slightly more complex interface for
handling common situations - like basic authentication, cookies, proxies and so
on. These are provided by objects called handlers and openers.
urllib2 supports fetching URLs for many "URL schemes" (identified by the string
before the ":" in the URL - for example "ftp" is the URL scheme of
"ftp://python.org/") using their associated network protocols (e.g. FTP, HTTP).
This tutorial focuses on the most common case, HTTP.
For straightforward situations *urlopen* is very easy to use. But as soon as you
encounter errors or non-trivial cases when opening HTTP URLs, you will need some
understanding of the HyperText Transfer Protocol. The most comprehensive and
authoritative reference to HTTP is :rfc:`2616`. This is a technical document and
not intended to be easy to read. This HOWTO aims to illustrate using *urllib2*,
with enough detail about HTTP to help you through. It is not intended to replace
the :mod:`urllib2` docs, but is supplementary to them.
Fetching URLs
=============
The simplest way to use urllib2 is as follows::
import urllib2
response = urllib2.urlopen('http://python.org/')
html = response.read()
Many uses of urllib2 will be that simple (note that instead of an 'http:' URL we
could have used a URL starting with 'ftp:', 'file:', etc.). However, it's the
purpose of this tutorial to explain the more complicated cases, concentrating on
HTTP.
HTTP is based on requests and responses - the client makes requests and servers
send responses. urllib2 mirrors this with a ``Request`` object which represents
the HTTP request you are making. In its simplest form you create a Request
object that specifies the URL you want to fetch. Calling ``urlopen`` with this
Request object returns a response object for the URL requested. This response is
a file-like object, which means you can for example call ``.read()`` on the
response::
import urllib2
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
the_page = response.read()
Note that urllib2 makes use of the same Request interface to handle all URL
schemes. For example, you can make an FTP request like so::
req = urllib2.Request('ftp://example.com/')
In the case of HTTP, there are two extra things that Request objects allow you
to do: First, you can pass data to be sent to the server. Second, you can pass
extra information ("metadata") *about* the data or about the request itself, to
the server - this information is sent as HTTP "headers". Let's look at each of
these in turn.
Data
----
Sometimes you want to send data to a URL (often the URL will refer to a CGI
(Common Gateway Interface) script [#]_ or other web application). With HTTP,
this is often done using what's known as a **POST** request. This is often what
your browser does when you submit an HTML form that you filled in on the web. Not
all POSTs have to come from forms: you can use a POST to transmit arbitrary data
to your own application. In the common case of HTML forms, the data needs to be
encoded in a standard way, and then passed to the Request object as the ``data``
argument. The encoding is done using a function from the ``urllib`` library,
*not* from ``urllib2``. ::
import urllib
import urllib2
url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {'name' : 'Michael Foord',
'location' : 'Northampton',
'language' : 'Python' }
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
Note that other encodings are sometimes required (e.g. for file upload from HTML
forms - see `HTML Specification, Form Submission
<http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13>`_ for more
details).
If you do not pass the ``data`` argument, urllib2 uses a **GET** request. One
way in which GET and POST requests differ is that POST requests often have
"side-effects": they change the state of the system in some way (for example by
placing an order with the website for a hundredweight of tinned spam to be
delivered to your door). Though the HTTP standard makes it clear that POSTs are
intended to *always* cause side-effects, and GET requests *never* to cause
side-effects, nothing prevents a GET request from having side-effects, nor a
POST request from having no side-effects. Data can also be passed in an HTTP
GET request by encoding it in the URL itself.
This is done as follows::
>>> import urllib2
>>> import urllib
>>> data = {}
>>> data['name'] = 'Somebody Here'
>>> data['location'] = 'Northampton'
>>> data['language'] = 'Python'
>>> url_values = urllib.urlencode(data)
>>> print url_values
name=Somebody+Here&language=Python&location=Northampton
>>> url = 'http://www.example.com/example.cgi'
>>> full_url = url + '?' + url_values
>>> data = urllib2.urlopen(full_url)
Notice that the full URL is created by adding a ``?`` to the URL, followed by
the encoded values.
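The URL construction can be wrapped in a small helper. This is a sketch only:
``build_get_url`` is a made-up name, the parameters are sorted so that the
result is predictable, and the import fallback is there purely so the example
also runs on Python 3, where ``urlencode`` lives in ``urllib.parse``.

```python
try:
    from urllib import urlencode            # Python 2
except ImportError:
    from urllib.parse import urlencode      # Python 3 (assumption for portability)


def build_get_url(base_url, params):
    """Append URL-encoded query parameters to base_url."""
    # urlencode() also accepts a sequence of (key, value) pairs; sorting the
    # items gives a deterministic parameter order.
    query = urlencode(sorted(params.items()))
    return base_url + '?' + query


full_url = build_get_url('http://www.example.com/example.cgi',
                         {'name': 'Somebody Here', 'language': 'Python'})
```

The space in ``'Somebody Here'`` is encoded as ``+``, as in the interactive
session above.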
Headers
-------
We'll discuss here one particular HTTP header, to illustrate how to add headers
to your HTTP request.
Some websites [#]_ dislike being browsed by programs, or send different versions
to different browsers [#]_. By default urllib2 identifies itself as
``Python-urllib/x.y`` (where ``x`` and ``y`` are the major and minor version
numbers of the Python release,
e.g. ``Python-urllib/2.5``), which may confuse the site, or just plain
not work. The way a browser identifies itself is through the
``User-Agent`` header [#]_. When you create a Request object you can
pass a dictionary of headers in. The following example makes the same
request as above, but identifies itself as a version of Internet
Explorer [#]_. ::
import urllib
import urllib2
url = 'http://www.someserver.com/cgi-bin/register.cgi'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name' : 'Michael Foord',
'location' : 'Northampton',
'language' : 'Python' }
headers = { 'User-Agent' : user_agent }
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
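A Request object is only a description of the request, so the effect of the
``data`` and ``headers`` arguments can be inspected without touching the
network. The following sketch assumes the ``get_method()`` and ``get_header()``
methods of the Request class; the import fallback exists only so the example
also runs on Python 3, where the class lives in ``urllib.request``.

```python
try:
    from urllib2 import Request             # Python 2
    from urllib import urlencode
except ImportError:
    from urllib.request import Request      # Python 3 (assumption for portability)
    from urllib.parse import urlencode

url = 'http://www.someserver.com/cgi-bin/register.cgi'
data = urlencode({'name': 'Michael Foord'}).encode('ascii')
headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}

get_req = Request(url, headers=headers)     # no data: a GET request
post_req = Request(url, data, headers)      # with data: a POST request

methods = (get_req.get_method(), post_req.get_method())

# Header names are stored capitalized, so the lookup key is 'User-agent'.
agent = post_req.get_header('User-agent')
```

Nothing is sent until one of these objects is passed to ``urlopen``.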
The response also has two useful methods. See the section on `info and geturl`_
which comes after we have a look at what happens when things go wrong.
Handling Exceptions
===================
*urlopen* raises :exc:`URLError` when it cannot handle a response (though as usual
with Python APIs, builtin exceptions such as
:exc:`ValueError`, :exc:`TypeError` etc. may also
be raised).
:exc:`HTTPError` is the subclass of :exc:`URLError` raised in the specific case of
HTTP URLs.
URLError
--------
Often, URLError is raised because there is no network connection (no route to
the specified server), or the specified server doesn't exist. In this case, the
exception raised will have a 'reason' attribute, which is a tuple containing an
error code and a text error message.
e.g. ::
>>> req = urllib2.Request('http://www.pretend_server.org')
>>> try:
...     urllib2.urlopen(req)
... except urllib2.URLError, e:
...     print e.reason
...
(4, 'getaddrinfo failed')
HTTPError
---------
Every HTTP response from the server contains a numeric "status code". Sometimes
the status code indicates that the server is unable to fulfil the request. The
default handlers will handle some of these responses for you (for example, if
the response is a "redirection" that requests the client fetch the document from
a different URL, urllib2 will handle that for you). For those it can't handle,
urlopen will raise an :exc:`HTTPError`. Typical errors include '404' (page not
found), '403' (request forbidden), and '401' (authentication required).
See section 10 of RFC 2616 for a reference on all the HTTP error codes.
The :exc:`HTTPError` instance raised will have an integer 'code' attribute, which
corresponds to the error sent by the server.
Error Codes
~~~~~~~~~~~
Because the default handlers handle redirects (codes in the 300 range), and
codes in the 100-299 range indicate success, you will usually only see error
codes in the 400-599 range.
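The first digit of a status code identifies its class, which makes it easy to
sort codes into the ranges just described. This is a small sketch; the function
name ``status_class`` is hypothetical and the class names follow RFC 2616.

```python
def status_class(code):
    """Return the RFC 2616 class name for a numeric HTTP status code."""
    classes = {
        1: 'Informational',
        2: 'Success',
        3: 'Redirection',
        4: 'Client Error',
        5: 'Server Error',
    }
    # Integer division by 100 leaves just the leading digit.
    return classes.get(code // 100, 'Unknown')


examples = [status_class(code) for code in (200, 302, 404, 503)]
```

An :exc:`HTTPError` handler could use such a function on the exception's
``code`` attribute to decide, for example, whether a retry makes sense.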
``BaseHTTPServer.BaseHTTPRequestHandler.responses`` is a useful dictionary of
response codes that shows all the response codes used by RFC 2616. The
dictionary is reproduced here for convenience ::
# Table mapping response codes to messages; entries have the
# form {code: (shortmessage, longmessage)}.
responses = {
100: ('Continue', 'Request received, please continue'),
101: ('Switching Protocols',
'Switching to new protocol; obey Upgrade header'),
200: ('OK', 'Request fulfilled, document follows'),
201: ('Created', 'Document created, URL follows'),
202: ('Accepted',
'Request accepted, processing continues off-line'),
203: ('Non-Authoritative Information', 'Request fulfilled from cache'),
204: ('No Content', 'Request fulfilled, nothing follows'),
205: ('Reset Content', 'Clear input form for further input.'),
206: ('Partial Content', 'Partial content follows.'),
300: ('Multiple Choices',
'Object has several resources -- see URI list'),
301: ('Moved Permanently', 'Object moved permanently -- see URI list'),
302: ('Found', 'Object moved temporarily -- see URI list'),
303: ('See Other', 'Object moved -- see Method and URL list'),
304: ('Not Modified',
'Document has not changed since given time'),
305: ('Use Proxy',
'You must use proxy specified in Location to access this '
'resource.'),
307: ('Temporary Redirect',
'Object moved temporarily -- see URI list'),
400: ('Bad Request',
'Bad request syntax or unsupported method'),
401: ('Unauthorized',
'No permission -- see authorization schemes'),
402: ('Payment Required',
'No payment -- see charging schemes'),
403: ('Forbidden',
'Request forbidden -- authorization will not help'),
404: ('Not Found', 'Nothing matches the given URI'),
405: ('Method Not Allowed',
'Specified method is invalid for this server.'),
406: ('Not Acceptable', 'URI not available in preferred format.'),
407: ('Proxy Authentication Required', 'You must authenticate with '
'this proxy before proceeding.'),
408: ('Request Timeout', 'Request timed out; try again later.'),
409: ('Conflict', 'Request conflict.'),
410: ('Gone',
'URI no longer exists and has been permanently removed.'),
411: ('Length Required', 'Client must specify Content-Length.'),
412: ('Precondition Failed', 'Precondition in headers is false.'),
413: ('Request Entity Too Large', 'Entity is too large.'),
414: ('Request-URI Too Long', 'URI is too long.'),
415: ('Unsupported Media Type', 'Entity body in unsupported format.'),
416: ('Requested Range Not Satisfiable',
'Cannot satisfy request range.'),
417: ('Expectation Failed',
'Expect condition could not be satisfied.'),
500: ('Internal Server Error', 'Server got itself in trouble'),
501: ('Not Implemented',
'Server does not support this operation'),
502: ('Bad Gateway', 'Invalid responses from another server/proxy.'),
503: ('Service Unavailable',
'The server cannot process the request due to a high load'),
504: ('Gateway Timeout',
'The gateway server did not receive a timely response'),
505: ('HTTP Version Not Supported', 'Cannot fulfill request.'),
}
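
As a small illustration of using this table from code, the sketch below looks
up a status code and falls back to a generic message for codes that are not
listed. It assumes the table above is the dictionary the standard library
exposes as ``BaseHTTPRequestHandler.responses``. The helper name
``describe_status`` is made up for this example, and the import fallback is
only there so that the snippet also runs on Python 3, where ``BaseHTTPServer``
became ``http.server``.

```python
try:
    from BaseHTTPServer import BaseHTTPRequestHandler   # Python 2
except ImportError:
    from http.server import BaseHTTPRequestHandler      # Python 3


def describe_status(code):
    """Return (short message, long message) for an HTTP status code."""
    # Fallback used for codes that are missing from the table.
    unknown = ('Unknown', 'Status code not in the table')
    short, long_msg = BaseHTTPRequestHandler.responses.get(code, unknown)
    return short, long_msg


short, long_msg = describe_status(404)
print('%s: %s' % (short, long_msg))
```
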
When an error is raised the server responds by returning an HTTP error code
*and* an error page. You can use the :exc:`HTTPError` instance as a response for
the page returned. This means that, as well as the ``code`` attribute, it also
has ``read``, ``geturl``, and ``info`` methods. ::

    >>> req = urllib2.Request('http://www.python.org/fish.html')
    >>> try:
    ...     urllib2.urlopen(req)
    ... except urllib2.HTTPError, e:
    ...     print e.code
    ...     print e.read()
    ...
    404
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
    <?xml-stylesheet href="./css/ht2html.css"
        type="text/css"?>
    <html><head><title>Error 404: File Not Found</title>
    ...... etc...

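
To see the response-like behaviour without contacting a server, an
:exc:`HTTPError` can be built by hand. This is only a sketch: the URL, headers
and body below are made-up values, and the import fallback exists so the
snippet also runs on Python 3, where the exception lives in ``urllib.error``.

```python
import io

try:
    from urllib2 import HTTPError          # Python 2
except ImportError:
    from urllib.error import HTTPError     # Python 3

# Made-up error page and headers, standing in for what a server would send.
body = io.BytesIO(b'<html><title>Error 404: File Not Found</title></html>')
headers = {'Content-type': 'text/html'}
error = HTTPError('http://www.example.com/fish.html', 404,
                  'Not Found', headers, body)

try:
    raise error
except HTTPError as e:
    status = e.code      # the numeric status code
    page = e.read()      # the body of the error page
    url = e.geturl()     # the URL the error refers to
    info = e.info()      # the headers object passed in above
    e.close()
```
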
Wrapping it Up
--------------
So if you want to be prepared for :exc:`HTTPError` *or* :exc:`URLError` there are two
basic approaches. I prefer the second approach.
Number 1
~~~~~~~~
::

    from urllib2 import Request, urlopen, URLError, HTTPError
    req = Request(someurl)
    try:
        response = urlopen(req)
    except HTTPError, e:
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
    except URLError, e:
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
    else:
        # everything is fine
        pass

.. note::

    The ``except HTTPError`` *must* come first, otherwise ``except URLError``
    will *also* catch an :exc:`HTTPError`.

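
The reason for this ordering is that :exc:`HTTPError` is a subclass of
:exc:`URLError`. The sketch below makes that visible without any network
access. The two helper functions and the hand-built exception are hypothetical
and exist only for this illustration.

```python
try:
    from urllib2 import URLError, HTTPError        # Python 2
except ImportError:
    from urllib.error import URLError, HTTPError   # Python 3


def which_clause(exc):
    """Name the clause that catches exc when HTTPError is listed first."""
    try:
        raise exc
    except HTTPError:
        return 'except HTTPError'
    except URLError:
        return 'except URLError'


def which_clause_wrong_order(exc):
    """Same, but with the more general URLError listed first."""
    try:
        raise exc
    except URLError:
        return 'except URLError'
    except HTTPError:
        return 'except HTTPError'   # never reached


# A hand-built error with made-up values; no request is sent.
http_error = HTTPError('http://www.example.com/', 404, 'Not Found', {}, None)
right = which_clause(http_error)
wrong = which_clause_wrong_order(http_error)
```
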
Number 2
~~~~~~~~
::

    from urllib2 import Request, urlopen, URLError
    req = Request(someurl)
    try:
        response = urlopen(req)
    except URLError, e:
        if hasattr(e, 'code'):
            # checked first: only an HTTPError carries a status code
            print 'The server couldn\'t fulfill the request.'
            print 'Error code: ', e.code
        elif hasattr(e, 'reason'):
            print 'We failed to reach a server.'
            print 'Reason: ', e.reason
    else:
        # everything is fine
        pass

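
Either approach is easy to wrap in a small helper. The following is a minimal
sketch in the style of the second approach. The function name ``fetch`` and its
``(data, problem)`` return convention are inventions for this example, and
local ``file:`` URLs are used so that it can be tried without a network
connection.

```python
import os
import tempfile

try:
    from urllib import pathname2url                      # Python 2
    from urllib2 import urlopen, URLError
except ImportError:
    from urllib.request import pathname2url, urlopen     # Python 3
    from urllib.error import URLError


def fetch(url):
    """Return (data, problem); exactly one of the two is None."""
    try:
        response = urlopen(url)
    except URLError as e:
        if hasattr(e, 'code'):
            # Only an HTTPError carries a status code.
            return None, 'The server returned error code %s' % e.code
        return None, 'We failed to reach a server: %s' % (e.reason,)
    else:
        try:
            return response.read(), None
        finally:
            response.close()


# Exercise both outcomes with a scratch file.
handle, path = tempfile.mkstemp()
os.write(handle, b'hello')
os.close(handle)
good_url = 'file:' + pathname2url(path)

data, problem = fetch(good_url)
os.remove(path)
missing_data, missing_problem = fetch(good_url)   # the file is gone now
```
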
info and geturl
===============
The response returned by ``urlopen`` (or the :exc:`HTTPError` instance) has two
useful methods, :meth:`info` and :meth:`geturl`.
**geturl** - this returns the real URL of the page fetched. This is useful
because ``urlopen`` (or the opener object used) may have followed a
redirect. The URL of the page fetched may not be the same as the URL requested.
**info** - this returns a dictionary-like object that describes the page
fetched, particularly the headers sent by the server. It is currently an
``httplib.HTTPMessage`` instance.
Typical headers include 'Content-length', 'Content-type', and so on. See the
`Quick Reference to HTTP Headers <http://www.cs.tut.fi/~jkorpela/http.html>`_
for a useful listing of HTTP headers with brief explanations of their meaning
and use.
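
Both methods can be tried offline, because ``urlopen`` also understands
``file:`` URLs. In this sketch a scratch file stands in for a web page. The
headers shown are the ones the library generates for a local file, so a real
web server will normally send more of them.

```python
import os
import tempfile

try:
    from urllib import pathname2url                      # Python 2
    from urllib2 import urlopen
except ImportError:
    from urllib.request import pathname2url, urlopen     # Python 3

# A scratch file stands in for a web page.
handle, path = tempfile.mkstemp(suffix='.txt')
os.write(handle, b'some text')
os.close(handle)

response = urlopen('file:' + pathname2url(path))
real_url = response.geturl()        # URL that was actually fetched
headers = response.info()           # dictionary-like header object
content_type = headers['Content-type']
content_length = headers['Content-length']
response.close()
os.remove(path)
```
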
Openers and Handlers
====================
When you fetch a URL you use an opener (an instance of the perhaps
confusingly-named :class:`urllib2.OpenerDirector`). Normally we have been using
the default opener - via ``urlopen`` - but you can create custom
openers. Openers use handlers. All the "heavy lifting" is done by the
handlers. Each handler knows how to open URLs for a particular URL scheme (http,
ftp, etc.), or how to handle an aspect of URL opening, for example HTTP
redirections or HTTP cookies.
You will want to create openers if you want to fetch URLs with specific handlers
installed, for example to get an opener that handles cookies, or to get an
opener that does not handle redirections.
To create an opener, instantiate an ``OpenerDirector``, and then call
``.add_handler(some_handler_instance)`` repeatedly.
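
A minimal sketch of the manual route is shown here. It installs a single
``FileHandler``, so the resulting opener only knows how to open ``file:`` URLs.
The scratch file is there purely so the example can run without a network
connection.

```python
import os
import tempfile

try:
    from urllib import pathname2url                              # Python 2
    from urllib2 import OpenerDirector, FileHandler
except ImportError:
    from urllib.request import (pathname2url, OpenerDirector,    # Python 3
                                FileHandler)

opener = OpenerDirector()
opener.add_handler(FileHandler())    # the only handler this opener has

handle, path = tempfile.mkstemp()
os.write(handle, b'local data')
os.close(handle)

# The opener's own open method is called directly.
response = opener.open('file:' + pathname2url(path))
data = response.read()
response.close()
os.remove(path)
```
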
Alternatively, you can use ``build_opener``, which is a convenience function for
creating opener objects with a single function call. ``build_opener`` adds
several handlers by default, but provides a quick way to add more and/or
override the default handlers.
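
The overriding behaviour can be observed by comparing the handlers of two
openers. In this sketch ``LoggingRedirectHandler`` is a hypothetical subclass
with no behaviour of its own, and ``opener.handlers`` is an undocumented
attribute that is used here only for inspection.

```python
try:
    from urllib2 import build_opener, HTTPRedirectHandler          # Python 2
except ImportError:
    from urllib.request import build_opener, HTTPRedirectHandler   # Python 3


class LoggingRedirectHandler(HTTPRedirectHandler):
    """Hypothetical subclass; a real one would override some methods."""


def handler_names(opener):
    """Return the class names of the handlers installed in an opener."""
    return sorted(h.__class__.__name__ for h in opener.handlers)


default_names = handler_names(build_opener())
# Passing a subclass of a default handler replaces that default handler.
custom_names = handler_names(build_opener(LoggingRedirectHandler))
```
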
Other sorts of handlers you might want to use can handle proxies,
authentication, and other common but slightly specialised situations.
``install_opener`` can be used to make an ``opener`` object the (global) default
opener. This means that calls to ``urlopen`` will use the opener you have
installed.
Opener objects have an ``open`` method, which can be called directly to fetch
URLs in the same way as the ``urlopen`` function: there's no need to call
``install_opener``, except as a convenience.
Basic Authentication
====================
To illustrate creating and installing a handler we will use the
``HTTPBasicAuthHandler``. For a more detailed discussion of this subject --
including an explanation of how Basic Authentication works -- see the `Basic
Authentication Tutorial
<http://www.voidspace.org.uk/python/articles/authentication.shtml>`_.
When authentication is required, the server sends a header (as well as the 401
error code) requesting authentication. This specifies the authentication scheme
and a 'realm'. The header looks like: ``Www-authenticate: SCHEME
realm="REALM"``.
e.g. ::
Www-authenticate: Basic realm="cPanel Users"
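
If you ever need to pick such a header apart yourself, a regular expression is
enough for the simple form shown here. This is a rough sketch rather than a
complete parser: real challenges may carry extra parameters, and the name
``parse_challenge`` is made up for the example.

```python
import re

# Matches the simple  SCHEME realm="REALM"  form only; any further
# parameters a server might send are ignored by this sketch.
CHALLENGE = re.compile(r'\s*(\w+)\s+realm="([^"]*)"', re.IGNORECASE)


def parse_challenge(header_value):
    """Return (scheme, realm), or None if the value does not match."""
    match = CHALLENGE.match(header_value)
    if match is None:
        return None
    return match.group(1), match.group(2)


scheme, realm = parse_challenge('Basic realm="cPanel Users"')
```
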
The client should then retry the request with the appropriate name and password
for the realm included as a header in the request. This is 'basic
authentication'. In order to simplify this process we can create an instance of
``HTTPBasicAuthHandler`` and an opener to use this handler.
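
The handler takes care of this for you, but it can help to see what the
retried request carries. For the Basic scheme the client sends an
``Authorization`` header containing the base64 encoding of ``name:password``.
The sketch below builds that value. The function name and the credentials are
made up, and encoding the credentials as UTF-8 is an assumption of this
example.

```python
import base64


def basic_auth_header(username, password):
    """Return the value of the Authorization header for Basic auth."""
    credentials = ('%s:%s' % (username, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    return 'Basic ' + token


# Made-up credentials, for illustration only.
header_value = basic_auth_header('Aladdin', 'open sesame')
```

Note that base64 is an encoding, not encryption, so the password is only as
safe as the connection it travels over.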
The ``HTTPBasicAuthHandler`` uses an object called a password manager to handle
the mapping of URLs and realms to passwords and usernames. If you know what the
realm is (from the authentication header sent by the server), then you can use a
``HTTPPasswordMgr``. Frequently one doesn't care what the realm is. In that
case, it is convenient to use ``HTTPPasswordMgrWithDefaultRealm``. This allows
you to specify a default username and password for a URL, which are used
unless you provide a different combination for a specific realm. We indicate
this by providing ``None`` as the realm argument to the ``add_password``
method.
The top-level URL is the first URL that requires authentication. URLs "deeper"
than the URL you pass to ``add_password()`` will also match. ::
# create a password manager
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# Add the username and password.
# If we knew the realm, we could use it instead of None.
top_level_url = "http://example.com/foo/"
password_mgr.add_password(None, top_level_url, username, password)
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
# create "opener" (OpenerDirector instance)
opener = urllib2.build_opener(handler)
# use the opener to fetch a URL
opener.open(a_url)
# Install the opener.
# Now all calls to urllib2.urlopen use our opener.
urllib2.install_opener(opener)
.. note::

    In the above example we only supplied our ``HTTPBasicAuthHandler`` to
    ``build_opener``. By default openers have the handlers for normal situations
    -- ``ProxyHandler``, ``UnknownHandler``, ``HTTPHandler``,
    ``HTTPDefaultErrorHandler``, ``HTTPRedirectHandler``, ``FTPHandler``,
    ``FileHandler``, ``HTTPErrorProcessor``.

``top_level_url`` is in fact *either* a full URL (including the 'http:' scheme
component and the hostname and optionally the port number),
e.g. "http://example.com/", *or* an "authority" (i.e. the hostname,
optionally including the port number), e.g. "example.com" or "example.com:8080"
(the latter example includes a port number). The authority, if present, must
NOT contain the "userinfo" component - for example "joe:password@example.com" is
not correct.
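
The matching rules for "deeper" URLs and for the default realm can be checked
directly on the password manager, without sending any request. The credentials
and URLs in this sketch are made up, and ``find_user_password`` is the lookup
that the authentication handler performs internally.

```python
try:
    from urllib2 import HTTPPasswordMgrWithDefaultRealm          # Python 2
except ImportError:
    from urllib.request import HTTPPasswordMgrWithDefaultRealm   # Python 3

password_mgr = HTTPPasswordMgrWithDefaultRealm()
# Made-up credentials; None as the realm means "any realm".
password_mgr.add_password(None, 'http://example.com/foo/', 'joe', 'secret')

# A URL below the top-level URL matches, whatever realm the server names.
deeper = password_mgr.find_user_password('cPanel Users',
                                         'http://example.com/foo/bar.html')
# A URL outside the top-level URL does not match.
outside = password_mgr.find_user_password('cPanel Users',
                                          'http://example.com/other/')
```
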
Proxies
=======
**urllib2** will auto-detect your proxy settings and use those. This is done
through the ``ProxyHandler``, which is part of the normal handler chain.
Normally that's a good thing, but there are occasions when it may not be
helpful [#]_. One way to bypass it is to set up our own ``ProxyHandler``, with
no proxies defined. This is done using similar steps to setting up a
`Basic Authentication`_ handler::
>>> proxy_support = urllib2.ProxyHandler({})
>>> opener = urllib2.build_opener(proxy_support)
>>> urllib2.install_opener(opener)
.. note::

    Currently ``urllib2`` *does not* support fetching of ``https`` locations
    through a proxy. However, this can be enabled by extending urllib2 as
    shown in the recipe [#]_.

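
To check what auto-detection found, and to confirm that an opener built with
an empty ``ProxyHandler`` has no proxies configured, something like the
following can be used. It is only a sketch: ``opener.handlers`` and the
handler's ``proxies`` attribute are undocumented details that are inspected
here, not relied upon.

```python
try:
    from urllib import getproxies                               # Python 2
    from urllib2 import ProxyHandler, build_opener
except ImportError:
    from urllib.request import (getproxies, ProxyHandler,       # Python 3
                                build_opener)

# Mapping of scheme to proxy URL found by auto-detection (may be empty).
detected = getproxies()

# The empty mapping takes the place of the default ProxyHandler.
opener = build_opener(ProxyHandler({}))

# Collect any proxy handler in the opener that still has proxies set.
leftover = [h for h in opener.handlers
            if isinstance(h, ProxyHandler) and h.proxies]
```
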
Sockets and Layers
==================
The Python support for fetching resources from the web is layered. urllib2 uses
the httplib library, which in turn uses the socket library.
As of Python 2.3 you can specify how long a socket should wait for a response
before timing out. This can be useful in applications which have to fetch web
pages. By default the socket module has *no timeout* and can hang. Before
Python 2.6 the socket timeout was not exposed at the httplib or urllib2 levels
(since 2.6, ``urlopen`` accepts an optional ``timeout`` argument). In any
version from 2.3 on, you can set the default timeout globally for all sockets
using ::
import socket
import urllib2
# timeout in seconds
timeout = 10
socket.setdefaulttimeout(timeout)
# this call to urllib2.urlopen now uses the default timeout
# we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
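
Because this setting is global, it affects every socket created afterwards,
including those opened by other libraries. One way to limit the effect is to
restore the previous value when you are done. The context manager below is a
sketch of that idea, and its name ``default_socket_timeout`` is made up for
this example.

```python
import socket
from contextlib import contextmanager


@contextmanager
def default_socket_timeout(seconds):
    """Temporarily change the timeout inherited by new sockets."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Put the old default back, even if the body raised an error.
        socket.setdefaulttimeout(previous)


before = socket.getdefaulttimeout()
with default_socket_timeout(10):
    sock = socket.socket()           # created, never connected
    inherited = sock.gettimeout()    # picks up the default timeout
    sock.close()
after = socket.getdefaulttimeout()
```
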
-------
Footnotes
=========
This document was reviewed and revised by John Lee.
.. [#] For an introduction to the CGI protocol see
`Writing Web Applications in Python <http://www.pyzine.com/Issue008/Section_Articles/article_CGIOne.html>`_.
.. [#] Like Google for example. The *proper* way to use Google from a program
   is to use `PyGoogle <http://pygoogle.sourceforge.net>`_ of course. See
   `Voidspace Google <http://www.voidspace.org.uk/python/recipebook.shtml#google>`_
   for some examples of using the Google API.
.. [#] Browser sniffing is a very bad practice for website design - building
   sites using web standards is much more sensible. Unfortunately a lot of
   sites still send different versions to different browsers.
.. [#] The user agent for MSIE 6 is
*'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)'*
.. [#] For details of more HTTP request headers, see
`Quick Reference to HTTP Headers`_.
.. [#] In my case I have to use a proxy to access the internet at work. If you
attempt to fetch *localhost* URLs through this proxy it blocks them. IE
is set to use the proxy, which urllib2 picks up on. In order to test
scripts with a localhost server, I have to prevent urllib2 from using
the proxy.
.. [#] urllib2 opener for SSL proxy (CONNECT method): `ASPN Cookbook Recipe
<http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/456195>`_.
