PEP: 8
Title: Style Guide for Python Code
Version: $Revision: 1.27$ (Modified by Geoscience Australia)
Author: Guido van Rossum <guido at python.org>, Barry Warsaw <barry at python.org>
Status: Active
Type: Informational
Created: 05-Jul-2001
Post-History:

Modified slightly by Ole Nielsen <Ole.Nielsen at ga.gov.au> and Duncan Gray <Duncan.gray at ga.gov.au>

Geoscience Australia 2005.

The GA modifications are clearly flagged using blue italics and reasons outlined.


Introduction

    This document gives coding conventions for the Python code
    comprising the standard library for the main Python distribution.
    Please see the companion informational PEP describing style
    guidelines for the C code in the C implementation of Python[1].
 
    This document was adapted from Guido's original Python Style
    Guide essay[2], with some additions from Barry's style guide[5].
    Where there's conflict, Guido's style rules for the purposes of
    this PEP.  This PEP may still be incomplete (in fact, it may never
    be finished <wink>).
 
 

A Foolish Consistency is the Hobgoblin of Little Minds

    A style guide is about consistency.  Consistency with this style
    guide is important.  Consistency within a project is more
    important. Consistency within one module or function is most
    important.
 
    But most importantly: know when to be inconsistent -- sometimes
    the style guide just doesn't apply.  When in doubt, use your best
    judgement.  Look at other examples and decide what looks best.
    And don't hesitate to ask!
 
    Two good reasons to break a particular rule:
 
    (1) When applying the rule would make the code less readable, even
        for someone who is used to reading code that follows the rules.
 
    (2) To be consistent with surrounding code that also breaks it
        (maybe for historic reasons) -- although this is also an
        opportunity to clean up someone else's mess (in true XP style).
 
 

Code lay-out

  Indentation
 
    Use the default of Emacs' Python-mode: 4 spaces for one
    indentation level.  For really old code that you don't want to
    mess up, you can continue to use 8-space tabs.  Emacs Python-mode
    auto-detects the prevailing indentation level used in a file and
    sets its indentation parameters accordingly.
 
  Tabs or Spaces?
 
    Never mix tabs and spaces.  The most popular way of indenting
    Python is with spaces only.  The second-most popular way is with
    tabs only.  Code indented with a mixture of tabs and spaces should
    be converted to using spaces exclusively.  (In Emacs, select the
    whole buffer and hit ESC-x untabify.)  When invoking the python
    command line interpreter with the -t option, it issues warnings
    about code that illegally mixes tabs and spaces.  When using -tt
    these warnings become errors.  These options are highly
    recommended!
 
    For new projects, spaces-only are strongly recommended over tabs.
    Most editors have features that make this easy to do.  (In Emacs,
    make sure indent-tabs-mode is nil).
 
  Maximum Line Length
 
    There are still many devices around that are limited to 80
    character lines; plus, limiting windows to 80 characters makes it
    possible to have several windows side-by-side.  The default
    wrapping on such devices looks ugly.  Therefore, please limit all
    lines to a maximum of 79 characters (Emacs wraps lines that are
    exactly 80 characters long).  For flowing long blocks of text
    (docstrings or comments), limiting the length to 72 characters is
    recommended.
 
    The preferred way of wrapping long lines is by using Python's
    implied line continuation inside parentheses, brackets and braces.
    If necessary, you can add an extra pair of parentheses around an
    expression, but sometimes using a backslash looks better.  Make
    sure to indent the continued line appropriately.  Emacs
    Python-mode does this right.  Some examples:
 
    class Rectangle(Blob):
 
        def __init__(self, width, height,
                     color='black', emphasis=None, highlight=0):
            if width == 0 and height == 0 and \
               color == 'red' and emphasis == 'strong' or \
               highlight > 100:
                raise ValueError("sorry, you lose")
            if width == 0 and height == 0 and (color == 'red' or
                                               emphasis is None):
                raise ValueError("I don't think so")
            Blob.__init__(self, width, height,
                          color, emphasis, highlight)
 
  Blank Lines
 
    Separate top-level function and class definitions with two blank
    lines.  Method definitions inside a class are separated by a
    single blank line.  Extra blank lines may be used (liberally) to
    separate groups of related functions.  Blank lines may be omitted
    between a bunch of related one-liners (e.g. a set of dummy
    implementations).
 
    When blank lines are used to separate method definitions, there is
    also a blank line between the `class' line and the first method
    definition.
 
    Use blank lines in functions, liberally, to indicate logical
    sections.
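
    The blank-line rules above can be sketched as follows.  This is
    only an illustration: Counter and count_to are hypothetical names
    that do not come from the standard library.

```python
"""Hypothetical module showing the blank-line layout described above."""


class Counter(object):

    def __init__(self, start=0):
        self.count = start

    def increment(self):
        self.count = self.count + 1
        return self.count


def count_to(limit):
    counter = Counter()

    # A blank line separates the set-up from the main loop.
    values = []
    while counter.count < limit:
        values.append(counter.increment())

    return values
```

    Two blank lines separate the top-level definitions, one blank
    line separates the methods, and single blank lines mark the
    logical sections inside count_to.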
 
    Python accepts the control-L (i.e. ^L) form feed character as
    whitespace; Emacs (and some printing tools) treat these
    characters as page separators, so you may use them to separate
    pages of related sections of your file.
 
  Encodings (PEP 263)
 
    Code in the core Python distribution should always use the ASCII or
    Latin-1 encoding (a.k.a. ISO-8859-1).  Files using ASCII should
    not have a coding cookie.  Latin-1 should only be used when a
    comment or docstring needs to mention an author name that requires
    Latin-1; otherwise, using \x escapes is the preferred way to
    include non-ASCII data in string literals.  An exception is made
    for those files that are part of the test suite for the code
    implementing PEP 263.
 
 

Imports

    - Imports should usually be on separate lines, e.g.:
 
        No:  import sys, os
        Yes: import sys
             import os
 
      it's okay to say this though:
 
        from types import StringType, ListType
 
    - Imports are always put at the top of the file (or at the top of
      the function/method/class in cases where the import is optional
      or very slow to load), just after any module comments and
      docstrings, and before module globals and constants.  Imports
      should be grouped, with the order being
 
      1. standard library imports
      2. related major package imports (i.e. all email package imports next)
      3. application specific imports
 
      You should put a blank line between each group of imports.
 
    - Relative imports for intra-package imports are highly
      discouraged.  Always use the absolute package path for all
      imports.  Package imports should be relative to a directory on
      the PYTHONPATH.
 
 
    - When importing a class from a class-containing module, it's usually
      okay to spell this
 
        from MyClass import MyClass
        from foo.bar.YourClass import YourClass
 
      If this spelling causes local name clashes, then spell them
 
        import MyClass
        import foo.bar.YourClass
 
      and use "MyClass.MyClass" and "foo.bar.YourClass.YourClass".
 
    - Using "import *" is discouraged except for special cases such as
      unit testing where only one module is imported.  The reason is
      that multiple * imports obscure the origin of individual
      objects, making debugging much harder.
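
    As a small illustration of why the origin of a name matters, the
    sketch below imports two standard library modules that both
    define sqrt.  The variable names are chosen for this example
    only.

```python
# Standard library imports, one per line.
import cmath
import math

# Explicit module prefixes show where each sqrt comes from.  With
# "from math import *" followed by "from cmath import *", the second
# import would silently replace the first sqrt.
real_root = math.sqrt(4.0)
complex_root = cmath.sqrt(-4.0)
```

    A reader can tell at a glance that real_root is a float and
    complex_root is a complex number, without searching the file for
    the import that won.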
 
 
 
 

Whitespace in Expressions and Statements

  Pet Peeves
 
    Guido hates whitespace in the following places:
 
    - Immediately inside parentheses, brackets or braces, as in:
      "spam( ham[ 1 ], { eggs: 2 } )".  Always write this as
      "spam(ham[1], {eggs: 2})".
 
    - Immediately before a comma, semicolon, or colon, as in:
      "if x == 4 : print x , y ; x , y = y , x".  Always write this as
      "if x == 4: print x, y; x, y = y, x".
 
    - Immediately before the open parenthesis that starts the argument
      list of a function call, as in "spam (1)".  Always write
      this as "spam(1)".
 
    - Immediately before the open bracket that starts an indexing or
      slicing, as in: "dict ['key'] = list [index]".  Always
      write this as "dict['key'] = list[index]".
 
    - More than one space around an assignment (or other) operator to
      align it with another, as in:
 
          x             = 1
          y             = 2
          long_variable = 3
 
      Always write this as
 
          x = 1
          y = 2
          long_variable = 3
 
    (Don't bother to argue with him on any of the above -- Guido's
    grown accustomed to this style over 20 years.)
 
 
  Other Recommendations
 
    - Always surround these binary operators with a single space on
      either side: assignment (=), comparisons (==, <, >, !=, <>, <=,
      >=, in, not in, is, is not), Booleans (and, or, not).
 
    - Use your better judgment for the insertion of spaces around
      arithmetic operators.  Always be consistent about whitespace on
      either side of a binary operator.  Some examples:
 
          i = i+1
          submitted = submitted + 1
          x = x*2 - 1
          hypot2 = x*x + y*y
          c = (a+b) * (a-b)
          c = (a + b) * (a - b)
 
    - Don't use spaces around the '=' sign when used to indicate a
      keyword argument or a default parameter value.  For instance:
 
          def complex(real, imag=0.0):
              return magic(r=real, i=imag)
 
    - Compound statements (multiple statements on the same line) are
      generally discouraged.
 
          No:  if foo == 'blah': do_blah_thing()
          Yes: if foo == 'blah':
                   do_blah_thing()
 
          No:  do_one(); do_two(); do_three()
          Yes: do_one()
               do_two()
               do_three()
 

Comments

    Comments that contradict the code are worse than no comments.
    Always make a priority of keeping the comments up-to-date when the
    code changes!
 
    Comments should be complete sentences.  If a comment is a phrase
    or sentence, its first word should be capitalized, unless it is an
    identifier that begins with a lower case letter (never alter the
    case of identifiers!).
 
    If a comment is short, the period at the end is best omitted.
    Block comments generally consist of one or more paragraphs built
    out of complete sentences, and each sentence should end in a
    period.
 
    You should use two spaces after a sentence-ending period, since it
    makes Emacs wrapping and filling work consistently.
 
    When writing English, Strunk and White apply.
 
    Python coders from non-English speaking countries: please write
    your comments in English, unless you are 120% sure that the code
    will never be read by people who don't speak your language.
 
 
  Block Comments
 
    Block comments generally apply to some (or all) code that follows
    them, and are indented to the same level as that code.  Each line
    of a block comment starts with a # and a single space (unless it
    is indented text inside the comment).  Paragraphs inside a block
    comment are separated by a line containing a single #.  Block
    comments are best surrounded by a blank line above and below them
    (or two lines above and a single line below for a block comment at
    the start of a new section of function definitions or to delimit
    logical sections).
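
    A block comment that follows these rules might look like the
    sketch below.  The clamp function is a hypothetical example, not
    part of any library.

```python
def clamp(value, low, high):
    """Return value limited to the range low..high."""

    # Swap the bounds if the caller passed them in the wrong order,
    # so the expression below always sees low <= high.
    #
    # This is a convenience for the example only; a real function
    # might prefer to raise ValueError instead.
    if low > high:
        low, high = high, low

    return max(low, min(value, high))
```

    The comment is indented to the level of the code it describes,
    each line starts with a # and a single space, and the two
    paragraphs are separated by a line containing a single #.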
 
  Inline Comments
 
    An inline comment is a comment on the same line as a statement.
    Inline comments should be used sparingly.  Inline comments should
    be separated by at least two spaces from the statement.  They
    should start with a # and a single space.
 
    Inline comments are unnecessary and in fact distracting if they state
    the obvious.  Don't do this:
 
        x = x+1                 # Increment x
 
    But sometimes, this is useful:
 
        x = x+1                 # Compensate for border
 
 

Documentation Strings

    Conventions for writing good documentation strings
    (a.k.a. "docstrings") are immortalized in PEP 257 [3].
 
    - Write docstrings for all public modules, functions, classes, and
      methods.  Docstrings are not necessary for non-public methods,
      but you should have a comment that describes what the method
      does.  This comment should appear after the "def" line.
      Ideally, all modules, functions, classes, and methods should have 
      a docstring.
 
    - PEP 257 describes good docstring conventions.  Note that most
      importantly, the """ that ends a multiline docstring should be
      on a line by itself, e.g.:
 
      """Return a foobang
 
      Optional plotz says to frobnicate the bizbaz first.
      """
 
    - For one liner docstrings, it's okay to keep the closing """ on
      the same line.
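
    Both docstring forms can be sketched as follows.  The functions
    scale and is_empty are hypothetical and serve only to show where
    the closing """ goes.

```python
def scale(values, factor=2):
    """Return a new list with every value multiplied by factor.

    The input sequence is left unmodified.
    """
    return [value * factor for value in values]


def is_empty(values):
    """Return True if values contains no items."""
    return len(values) == 0
```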
 
 

Version Bookkeeping

    If you have to have RCS or CVS crud in your source file, do it as
    follows.
 
        __version__ = "$Revision: 1.27 $"
        # $Source: /cvsroot/python/python/nondist/peps/pep-0008.txt,v $
 
    These lines should be included after the module's docstring,
    before any other code, separated by a blank line above and
    below.
 
 

Naming Conventions

    The naming conventions of Python's library are a bit of a mess, so we'll
    never get this completely consistent -- nevertheless, here are the
    currently recommended naming standards.  New modules and packages
    (including 3rd party frameworks) should be written to these standards, but
    where an existing library has a different style, internal consistency is
    preferred.
 
  Descriptive: Naming Styles
 
    There are a lot of different naming styles.  It helps to be able
    to recognize what naming style is being used, independently from
    what they are used for.
 
    The following naming styles are commonly distinguished:
 
    - b (single lowercase letter)
 
    - B (single uppercase letter)
 
    - lowercase
 
    - lower_case_with_underscores
 
    - UPPERCASE
 
    - UPPER_CASE_WITH_UNDERSCORES
 
    - CapitalizedWords (or CapWords, or CamelCase -- so named because
      of the bumpy look of its letters[4]).  This is also sometimes known as
      StudlyCaps.
 
    - mixedCase (differs from CapitalizedWords by initial lowercase
      character!)
 
    - Capitalized_Words_With_Underscores (ugly!)
 
    There's also the style of using a short unique prefix to group
    related names together.  This is not used much in Python, but it
    is mentioned for completeness.  For example, the os.stat()
    function returns a tuple whose items traditionally have names like
    st_mode, st_size, st_mtime and so on.  The X11 library uses a
    leading X for all its public functions.  (In Python, this style is
    generally deemed unnecessary because attribute and method names
    are prefixed with an object, and function names are prefixed with
    a module name.)
 
    In addition, the following special forms using leading or trailing
    underscores are recognized (these can generally be combined with any
    case convention):
 
    - _single_leading_underscore: weak "internal use" indicator
      (e.g. "from M import *" does not import objects whose name
      starts with an underscore).
 
    - single_trailing_underscore_: used by convention to avoid
      conflicts with Python keyword, e.g.
      "Tkinter.Toplevel(master, class_='ClassName')".
 
    - __double_leading_underscore: class-private names as of Python 1.4.
 
    - __double_leading_and_trailing_underscore__: "magic" objects or
      attributes that live in user-controlled namespaces,
      e.g. __init__, __import__ or __file__.  Sometimes these are
      defined by the user to trigger certain magic behavior
      (e.g. operator overloading); sometimes these are inserted by the
      infrastructure for its own use or for debugging purposes.  Since
      the infrastructure (loosely defined as the Python interpreter
      and the standard library) may decide to grow its list of magic
      attributes in future versions, user code should generally
      refrain from using this convention for its own use.  User code
      that aspires to become part of the infrastructure could combine
      this with a short prefix inside the underscores,
      e.g. __bobo_magic_attr__.
 
  Prescriptive: Naming Conventions
 
    Names to Avoid
 
      Never use the characters `l' (lowercase letter el), `O'
      (uppercase letter oh), or `I' (uppercase letter eye) as single
      character variable names.  In some fonts, these characters are
      indistinguishable from the numerals one and zero.  When tempted
      to use `l', use `L' instead, unless the mathematical formula
      using l would become harder to understand.
 
    Module Names
 
      Modules should have short, lowercase names, possibly with
      underscores.
 
      Since module names are mapped to file names, and some file systems
      are case insensitive and truncate long names, it is important that
      module names be chosen to be fairly short -- this won't be a
      problem on Unix, but it may be a problem when the code is
      transported to Mac or Windows.
 
      When an extension module written in C or C++ has an accompanying
      Python module that provides a higher level (e.g. more object
      oriented) interface, the C/C++ module has a leading underscore
      (e.g. _socket).
 
      Python packages should have short, all-lowercase names, possibly
      with underscores.
 
    Class Names
 
      Almost without exception, class names use the CapWords convention.
      Classes for internal use have a leading underscore in addition.
 
    Exception Names
 
      If a module defines a single exception raised for all sorts of
      conditions, it is generally called "error" or "Error".  It seems
      that built-in (extension) modules use "error" (e.g. os.error),
      while Python modules generally use "Error" (e.g. xdrlib.Error).
      The trend seems to be toward CapWords exception names.
 
    Global Variable Names
 
      (Let's hope that these variables are meant for use inside one
      module only.)  The conventions are about the same as those for
      functions.  Modules that are designed for use via "from M import *"
      should prefix their globals (and internal functions and classes)
      with an underscore to prevent exporting them.
 
    Function Names
 
      Function names should be lowercase, possibly with words separated by
      underscores to improve readability.  mixedCase is allowed only in
      contexts where that's already the prevailing style (e.g. threading.py),
      to retain backwards compatibility.
 
    Method Names and Instance Variables
 
      The story is largely the same as with functions: in general, use
      lowercase with words separated by underscores as necessary to improve
      readability.
 
      Use one leading underscore only for internal methods and instance
      variables which are not intended to be part of the class's public
      interface.  Python does not enforce this; it is up to programmers to
      respect the convention.
 
      Use two leading underscores to denote class-private names.  Python
      "mangles" these names with the class name: if class Foo has an
      attribute named __a, it cannot be accessed by Foo.__a.  (An insistent
      user could still gain access by calling Foo._Foo__a.)  Generally,
      double leading underscores should be used only to avoid name conflicts
      with attributes in classes designed to be subclassed.
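
      The effect of name mangling on a subclass can be sketched as
      follows.  Foo comes from the paragraph above; Bar and get_a are
      hypothetical additions for this illustration.

```python
class Foo(object):

    def __init__(self):
        # Stored on the instance under the mangled name _Foo__a.
        self.__a = 1

    def get_a(self):
        # Inside Foo, self.__a is read as self._Foo__a.
        return self.__a


class Bar(Foo):

    def __init__(self):
        Foo.__init__(self)
        # Stored as _Bar__a, so the value set by Foo is not replaced.
        self.__a = 2


bar = Bar()
```

      Here bar.get_a() still returns the value set by Foo, while
      bar._Bar__a holds the value set by Bar.  This is the name
      conflict that double leading underscores are meant to avoid.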
 
    Designing for inheritance
 
      Always decide whether a class's methods and instance variables
      should be public or non-public.  In general, never make data
      variables public unless you're implementing essentially a
      record.  It's almost always preferable to give a functional
      interface to your class instead (and some Python 2.2
      developments will make this much nicer).  Consider using
      Python's properties.
 
      Also decide whether your attributes should be private or not.
      The difference between private and non-public is that the former
      will never be useful for a derived class, while the latter might
      be.  Yes, you should design your classes with inheritance in
      mind!
 
      Private attributes should have two leading underscores, no
      trailing underscores.
 
      Non-public attributes should have a single leading underscore,
      no trailing underscores.
 
      Public attributes should have no leading or trailing
      underscores, unless they conflict with reserved words, in which
      case, a single trailing underscore is preferable to a leading
      one, or a corrupted spelling, e.g. class_ rather than klass.
      (This last point is a bit controversial; if you prefer klass
      over class_ then just be consistent. :).
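
      A minimal sketch of a property that gives a functional
      interface to a non-public attribute is shown below.  Circle is
      a hypothetical class, and the decorator spelling assumes an
      interpreter recent enough to support @property with a setter.

```python
class Circle(object):
    """A circle whose radius is validated on assignment."""

    def __init__(self, radius):
        # Goes through the property setter below.
        self.radius = radius

    @property
    def radius(self):
        """The radius, a non-negative number."""
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value
```

      Callers read and assign circle.radius as if it were a plain
      public attribute, while the stored value lives in the
      non-public _radius and every assignment is checked.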
 
 

Programming Recommendations

    - Code should be written in a way that does not disadvantage other
      implementations of Python (PyPy, Jython, IronPython, Pyrex, Psyco,
      and such).  For example, do not rely on CPython's efficient
      implementation of in-place string concatenation for statements in
      the form a+=b or a=a+b.  Those statements run more slowly in
      Jython.  In performance sensitive parts of the library, the
      ''.join() form should be used instead.  This will assure that
      concatenation occurs in linear time across various implementations.
      In performance-insensitive parts it is OK to use the more
      readable a+=b and a=a+b.
 
 
    - Comparisons to singletons like None should always be done with
      'is' or 'is not'.  Also, beware of writing "if x" when you
      really mean "if x is not None" -- e.g. when testing whether a
      variable or argument that defaults to None was set to some other
      value.  The other value might be a value that's false in a
      Boolean context!
 
    - Class-based exceptions are always preferred over string-based
      exceptions.  Modules or packages should define their own
      domain-specific base exception class, which should be subclassed
      from the built-in Exception class.  Always include a class
      docstring.  E.g.:
 
        class MessageError(Exception):
            """Base class for errors in the email package."""
 
      When raising an exception, use "raise ValueError('message')"
      instead of the older form "raise ValueError, 'message'".  The
      paren-using form is preferred because when the exception
      arguments are long or include string formatting, you don't need
      to use line continuation characters thanks to the containing
      parentheses.  The older form will be removed in Python 3000.
 
    - Use string methods instead of the string module unless
      backward-compatibility with versions earlier than Python 2.0 is
      important.  String methods are always much faster and share the
      same API with unicode strings.
 
    - Avoid slicing strings when checking for prefixes or suffixes.
      Use startswith() and endswith() instead, since they are
      cleaner and less error prone.  For example:
 
        No:  if foo[:3] == 'bar':
        Yes: if foo.startswith('bar'):
 
      The exception is if your code must work with Python 1.5.2 (but
      let's hope not!).
 
    - Object type comparisons should always use isinstance() instead
      of comparing types directly.  E.g.
 
        No:  if type(obj) is type(1):
        Yes: if isinstance(obj, int):
 
      When checking if an object is a string, keep in mind that it
      might be a unicode string too!  In Python 2.3, str and unicode
      have a common base class, basestring, so you can do:
 
        if isinstance(obj, basestring):
 
      In Python 2.2, the types module has the StringTypes type defined
      for that purpose, e.g.:
 
        from types import StringTypes
        if isinstance(obj, StringTypes):
 
      In Python 2.0 and 2.1, you should do:
 
        from types import StringType, UnicodeType
        if isinstance(obj, StringType) or \
           isinstance(obj, UnicodeType):
 
    - For sequences, (strings, lists, tuples), use the fact that empty
      sequences are false, so "if not seq" or "if seq" is preferable
      to "if len(seq)" or "if not len(seq)".
      We prefer using "if len(seq) == 0" to check for emptiness.
 
 
    - Don't write string literals that rely on significant trailing
      whitespace.  Such trailing whitespace is visually
      indistinguishable and some editors (or more recently,
      reindent.py) will trim them.
 
    - Don't compare boolean values to True or False using == (bool
      types are new in Python 2.3):
 
        No:  if greeting == True:
        Yes: if greeting:
        Yes: if greeting is True:
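
    Several of the recommendations above can be combined in one
    short sketch.  ParseError and parse_flags are hypothetical names
    invented for this illustration, and the isinstance check against
    str assumes an interpreter where str is the text type (on
    Python 2.3 and later 2.x releases, basestring would be used as
    described above).

```python
class ParseError(Exception):
    """Base class for errors raised by this hypothetical parser."""


def parse_flags(line, prefix=None):
    """Return the flag names found in line as one comma-joined string.

    prefix defaults to '--'.  An empty string is a legal prefix, so
    the default is detected with "is None" rather than "if prefix".
    """
    if prefix is None:
        prefix = '--'
    if not isinstance(line, str):
        raise ParseError("expected a string, got %r" % (line,))

    names = []
    for word in line.split():
        # startswith() avoids a hand-counted slice such as word[:2].
        if word.startswith(prefix):
            names.append(word[len(prefix):])

    # Join once at the end instead of concatenating inside the loop.
    return ','.join(names)
```

    For example, parse_flags("--a b --c") returns "a,c", while
    parse_flags("a b", prefix="") treats every word as a flag because
    the empty prefix is kept rather than mistaken for a missing one.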
 
 

References

    [1] PEP 7, Style Guide for C Code, van Rossum
 
    [2] http://www.python.org/doc/essays/styleguide.html
 
    [3] PEP 257, Docstring Conventions, Goodger, van Rossum
 
    [4] http://www.wikipedia.com/wiki/CamelCase
 
    [5] Barry's GNU Mailman style guide
        http://barry.warsaw.us/software/STYLEGUIDE.txt
 
 

Copyright

    This document has been placed in the public domain.