ttrainor edited this page Jan 8, 2012 · 5 revisions

Python data shell

Introduction

In pds you work with your data at the command line using standard python syntax (including indentation rules). See the python documentation pages for help on scripting with python.

The default behavior of pds is to import the standard numerical, scientific and plotting packages at startup. These modules in pds are:

Other modules may be imported at startup depending on how you have configured your installation.

Errors (and exceptions) are normally printed to the screen. If you are using the windows graphical user interface these errors may print to a command shell that is not in view. If you are curious about an error, find this window (e.g. the dos window in your taskbar on windows computers) and see what is printed there.

You can usually get more information by turning on debugging. Type 'pds>debug' at the command line to toggle debugging.

Also note that in the discussion below the term 'object' refers to any python data type (function, class instance, module, etc.).

Startup

To start pds follow the instructions provided for the installation you are using:

  • Instructions on startup using source code or site-packages installation are on the Install page.
  • Instructions on startup using the windows pre-built pds package can be found here.

Note you can customize your startup options by editing the files:

INSTALL_PATH/pds/startup.pds
 - and -
HOMEPATH/.pds

You can also pass options at startup. To get more help on switches when starting pds from the command line pass a '-h' option.

Navigating through your directories

At the command line you can move around and see where you are using the following:

pds>>pwd                        # what directory am I in?
pds>>ls                         # list files in this directory
pds>>cd path/to/new/directory   # change to another directory...

Note: if you get an error when you use the cd command and you know the path exists, try putting the path name in quotes.
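For comparison, the same navigation can be done with the standard os module, which is handy inside scripts. This is a minimal sketch in plain Python, not the pds implementation of these commands; the helper names where_am_i, list_files and change_dir are made up for illustration. Here the path is always an ordinary quoted string, so a directory name containing a space reaches os.chdir unchanged.

```python
import os
import tempfile


def where_am_i():
    """Rough equivalent of pwd: return the current working directory."""
    return os.getcwd()


def list_files(path="."):
    """Rough equivalent of ls: return the sorted names found in a directory."""
    return sorted(os.listdir(path))


def change_dir(path):
    """Rough equivalent of cd: the path is passed as a plain string."""
    os.chdir(path)
    return os.getcwd()


start = where_am_i()
with tempfile.TemporaryDirectory() as tmp:
    # A directory name with a space in it, passed as a quoted string
    target = os.path.join(tmp, "my data")
    os.mkdir(target)
    change_dir(target)
    here = os.path.basename(where_am_i())
    contents = list_files()
    # Step back out before the temporary directory is removed
    change_dir(start)
```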

Command shell example

An example session

pds>x = num.arange(100.)/pi
pds>y = sin(x)
pds>plot x, y
pds>y
array([ 0.        ,  0.3129618 ,  0.59448077,  0.81627311,  0.95605566,
        0.99978466,  0.94306673,  0.79160024,  0.5606028 ,  0.2732824 ,
       -0.04149429, -0.35210211, -0.62733473, -0.83953993, -0.96739776,
                                    :
       -0.36492838, -0.63797503, -0.84692525, -0.9707861 , -0.99711346,
       -0.92326225, -0.75665221, -0.51402241, -0.2197495 ,  0.09660132])
pds>
pds># clear all the variables (same as 'del x, y')
pds>clear
pds>
pds># Do it again but put variables into a group
pds>g = group()
pds>g.x = arange(100.)/pi
pds>g.y = num.sin(g.x)
pds>plot g.x, g.y
pds>
pds># Delete the variables
pds>del g
pds>
pds># Do it again but put variables into a group after creation
pds>x = arange(100.)
pds>y = sind(x)
pds>g = group(angles=x,vals=y)
pds>plot(g.angles, g.vals)
pds>

Typing show at the command line lists the commands, modules, functions and data that are currently defined:

pds>show

==== Commands ====
EOF            addcmd         alias          calc           cd             clear
debug          edit           execfile       exit           help           load
ls             mod_import     more           newplot        path           plot
pwd            quit           restore        save           show           web

==== Modules ====
num         pyplot      scandata    scipy

==== Functions / Classes ====
read_dat

==== Instances ====
g

==== Variables ====
x    y

pds>

Builtins

The show command does not show everything that is defined in your workspace. There are a number of objects that are defined as builtins. To see these pass show the -b switch:

pds>show -b

***** Builtins ******

==== Python Key Words ====
None        and         as          assert      break       class       continue    def
del         elif        else        except      exec        finally     for         from
global      if          import      in          is          lambda      not         or
pass        print       raise       return      try         while       yield

==== Functions / Classes ====
ArithmeticError              AssertionError               AttributeError
BaseException                DeprecationWarning           EOFError
EnvironmentError             Exception                    FloatingPointError
FutureWarning                GeneratorExit                IOError
                          .
                          .
                          .
sin                          sind                         slice
sorted                       sqrt                         square
staticmethod                 std                          str
sum                          super                        tan
tand                         tuple                        type
unichr                       unicode                      vars
voigt                        xrange                       zip

==== Variables ====
False    None     True     e        pi

==== Other Python Objects ====
Ellipsis          NotImplemented    copyright         credits           exit
help              license           quit

All the builtin objects are always available at the command line! In other words, don't forget to check and see what's here.

If you really want to see everything try >>show -a and >>show -ab. The -a means show hidden objects, i.e. those objects that have a leading '_' in their name. These objects are used by the program to hold various state information, and shouldn't be very interesting.

Python Key Words

The python key words are reserved by the python language. They should not be redefined by the user, and trying to do so should cause an error.
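In plain Python the standard keyword module can tell you whether a name is reserved before you try to use it. The sketch below is only an illustration (the helper is_safe_name is hypothetical and is not part of pds). Note that the key word list depends on the Python version that is running.

```python
import keyword


def is_safe_name(name):
    """True if name is a valid identifier and not a reserved key word."""
    return name.isidentifier() and not keyword.iskeyword(name)


safe = is_safe_name("data")      # an ordinary variable name
unsafe = is_safe_name("lambda")  # a reserved key word
```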

Functions and Classes

Functions and classes are callable python objects that (likely) do something interesting and possibly return another object. In the above example the statement >>x = num.arange(100.)/pi has a call to the arange function defined in the num module (see below), and >>y = sin(x) is a call to the builtin sin function (which is really the same as num.sin).

The same function can be bound to more than one name, as the above example shows: num.arange and arange refer to the same function.
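The idea that one function can live under several names can be checked with the "is" operator. This small sketch uses math.sin from the standard library as a stand-in for num.sin, so it runs without pds or NumPy.

```python
import math

# Bind a second name to the same function object
sin = math.sin

# Both names refer to one and the same object
same_object = sin is math.sin
value = sin(0.0)
```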

Variables

Variables are what we normally think of as data: an integer or floating point number, a numerical array, or any of the other standard python data types (lists, dictionaries...).

In the above example, x, y, and pi are all data. You can display the value by typing the variable name at the command line as shown above for the y values.

Instances

Instances are objects that do not fit with the typical python data types, but are not functions or modules. Instances are generally custom objects that hold data and methods that operate on the data.

In the above example the >g = group() statement created a new instance of a group object. g is just an empty container, but we can assign data and other attributes to it to keep things organized.
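To make the idea of an empty container concrete, here is a minimal sketch of a group-like class. It is an assumption about the general shape of such an object, not the actual pds group implementation; the class name Group and its __repr__ are invented for this example.

```python
class Group(object):
    """A bare container: attributes can be attached freely after creation."""

    def __init__(self, **kwargs):
        # Allow Group(angles=x, vals=y) style construction
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __repr__(self):
        # List the attribute names currently stored on this instance
        return "Group(%s)" % ", ".join(sorted(vars(self)))


# Create an empty container, then attach data to it
g = Group()
g.x = [0.0, 1.0, 2.0]
g.y = [v * v for v in g.x]

# Or hand the data over at creation time
h = Group(angles=[0, 90], vals=[0.0, 1.0])
```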

Commands

Commands are functions that can be called using >>function arg, arg syntax instead of (or in addition to) the standard function(arg,arg) syntax. You cannot assign a return value from a command.

In the example above, plot x, y is equivalent to calling the function plot(x, y).
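One way to picture command syntax is as a rewrite step that runs before the line is handed to python. The sketch below is purely illustrative: this page does not show how pds parses commands, and the function command_to_call and the set of command names are assumptions made for the example.

```python
def command_to_call(line, commands):
    """Rewrite 'name arg, arg' as 'name(arg, arg)' if name is a known command."""
    stripped = line.strip()
    if not stripped:
        return stripped
    parts = stripped.split(None, 1)
    name = parts[0]
    if name not in commands:
        # Ordinary python (assignments, normal calls): leave untouched
        return stripped
    args = parts[1] if len(parts) > 1 else ""
    return "%s(%s)" % (name, args)


known = {"plot", "clear", "pwd"}          # hypothetical command table
as_call = command_to_call("plot x, y", known)
untouched = command_to_call("y = sin(x)", known)
```

A line that already uses the function form, such as plot(x, y), does not start with a bare command name and is passed through unchanged.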

Modules

Modules are essentially objects that contain other objects. In the above example num is the standard numerical python (NumPy) module, which defines all sorts of numerical functions. To use (i.e. call) an object defined inside a module use the module.object syntax. For example the statement num.arange(100.) used above is a call to the arange function defined inside the num module.

To display the contents of a module you can use the show command:

pds>show num

Note that modules may contain more modules, functions, variables, etc. Therefore you may have to tunnel into an object to find what you want. For instance, to calculate the determinant of a matrix you can use the det function defined in the linalg module, which exists in the num module:

pds>num.linalg.det([[3.,4.],[1.,7.]])
17.0

To see what other fun stuff may be defined inside the num.linalg module try:

pds>show num.linalg
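Outside of pds, a similar listing can be put together with the builtin dir function. The following sketch is not how show is implemented; the helpers public_names and submodules are hypothetical, and the standard os module stands in for num so the example has no extra dependencies.

```python
import os
import types


def public_names(obj):
    """Names defined on obj, skipping hidden ones with a leading underscore."""
    return [n for n in dir(obj) if not n.startswith("_")]


def submodules(module):
    """Public names in module that are themselves modules."""
    return [n for n in public_names(module)
            if isinstance(getattr(module, n), types.ModuleType)]


names = public_names(os.path)   # contents of a nested module
nested = submodules(os)         # modules found inside the os module
```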

Help

To get more information about an object you can see if it has some internal documentation using the 'help' command. For example:

pds>help num.linalg.det
Help on function det in module numpy.linalg.linalg:

det(a)
    Compute the determinant of a matrix

    Parameters
    ----------
    a : array-like, shape (M, M)

    Returns
    -------
    det : float or complex
        Determinant of a

    Notes
    -----
    The determinant is computed via LU factorization, LAPACK routine z/dgetrf.

You can try displaying the help for any object. You may or may not get something useful.

There are other very useful ways to get more information about an object, for example:

pds>x = [1,2,3]
pds>type(x)
<type 'list'>

pds>dir(x)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__',
                               :
                               :
', '__setslice__', '__str__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
pds>

Note that the dir function shows the attributes that the variable x (really an instance of the python list object) has defined. This example shows that x is an instance of an object that carries with it both the data, [1,2,3], and methods for operating on that data. For example:

pds>help x.count
Help on built-in function count:

count(...)
    L.count(value) -> integer -- return number of occurrences of value

pds>x.count(1)
1
pds>

In pds there is a builtin function / command named info that displays a bunch of useful information about objects. For example:

pds>x = arange(10.)
pds>info(x)
CLASS:    ndarray
ID:       39542744
TYPE:     <type 'numpy.ndarray'>
VALUE:    array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
CALLABLE: No
pds>
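The fields printed by info can all be obtained with ordinary python builtins. The sketch below mimics the layout shown above using type, id, repr and callable; it is an approximation written for this page, not the pds source, and the names describe and format_description are invented.

```python
def describe(obj):
    """Collect a few basic facts about obj as (label, value) pairs."""
    return [
        ("CLASS", type(obj).__name__),
        ("ID", id(obj)),
        ("TYPE", repr(type(obj))),
        ("VALUE", repr(obj)),
        ("CALLABLE", "Yes" if callable(obj) else "No"),
    ]


def format_description(obj):
    """Render the pairs one per line, with the labels padded to a column."""
    return "\n".join("%-9s %s" % (label + ":", value)
                     for label, value in describe(obj))


report = format_description([1, 2, 3])
facts = dict(describe(len))
```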

Another useful builtin is the source function / command, which tries to display the source code for an object (this works as long as the interpreter has access to the python source code, which may not always be the case). For example:

pds>source read_dat
def read_col_data(fname):
    f = open(fname)
    lines = f.readlines()
    f.close()
    data = []
    for line in lines:
        if line[0] == '#':
            pass
        else:
            tmp = line.split()
            #print tmp
            d = map(float,tmp)
            #print d
            data.append(d)
    data = num.array(data)
    return num.transpose(data)

pds>
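In standard python the inspect module offers the same kind of lookup. This is a sketch of the general technique and an assumption about how a source command could work, not a copy of the pds code; get_source is a hypothetical helper. It returns None when no python source is available, which is the case mentioned above.

```python
import inspect
import json


def get_source(obj):
    """Return the source text of obj, or None when it cannot be found."""
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        # OSError: the source file is missing; TypeError: not a python-level object
        return None


text = get_source(json.dumps)  # a function written in python
missing = get_source(len)      # a builtin with no python source in CPython
```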

Reading formatted files

The example below shows a simple function for reading formatted data files:

###########################################################
## This function is defined in the module fmtfiles.read_fmt
###########################################################
def read_column(fname):
    f = open(fname)
    lines = f.readlines()
    f.close()
    data = []
    for line in lines:
        if line[0] == '#':
            pass
        else:
            tmp = line.split()
            #print tmp
            d = map(float,tmp)
            #print d
            data.append(d)
    data = num.array(data)
    return num.transpose(data)
#############################################################

To import this function:

pds>>from fmtfiles.read_fmt import read_column as read_dat

Here's a simple data file (stuff.dat):

# x, y
1 2
3 4
5 6

Read it into the variable dat:

pds>dat = read_dat("stuff.dat")
pds>dat
array([[ 1.,  3.,  5.],
       [ 2.,  4.,  6.]])
pds>dat[0]
array([ 1.,  3.,  5.])
pds>dat[1]
array([ 2.,  4.,  6.])
pds>
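The reader above relies on map returning a list, which is true in the python 2 interpreter used on this page but not in python 3, where map returns an iterator. The following variant is a sketch that runs on python 3 with the standard library only. It returns plain lists, one per column, instead of a NumPy array, and it also skips blank lines. The name read_columns and the comment argument are choices made for this example.

```python
import os
import tempfile


def read_columns(fname, comment="#"):
    """Read whitespace-separated numbers and return one list per column."""
    rows = []
    with open(fname) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comment lines
            if not line or line.startswith(comment):
                continue
            rows.append([float(tok) for tok in line.split()])
    # zip(*rows) turns the list of rows into columns (a transpose)
    return [list(col) for col in zip(*rows)]


# Recreate the small stuff.dat file from the text in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stuff.dat")
    with open(path, "w") as f:
        f.write("# x, y\n1 2\n3 4\n\n5 6\n")
    cols = read_columns(path)
```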

Writing formatted files

We can write this same array to a file using the following sequence of commands. Below we create a new file (data.out) in the current working directory:

pds>f = open('data.out','w')
pds>for j in range(len(dat[0])):
...>   l = "%6.3f, %6.3f\n" % (dat[0,j],dat[1,j])
...>   f.write(l)
...>
pds>f.close()
pds>more data.out
 1.000,  2.000
 3.000,  4.000
 5.000,  6.000
pds>
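The same output can be produced from a small reusable function. The version below is a sketch, with write_columns and its fmt argument being names chosen for this example. It uses a with block so the file is closed even if an error occurs part way through, and zip to walk through the columns row by row.

```python
import os
import tempfile


def write_columns(fname, columns, fmt="%6.3f"):
    """Write equal-length columns as comma-separated rows, one row per line."""
    with open(fname, "w") as f:
        for row in zip(*columns):
            f.write(", ".join(fmt % value for value in row) + "\n")


dat = [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.out")
    write_columns(path, dat)
    with open(path) as f:
        written = f.read()
```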

Save/Restore

The save command / function tries to save data in a python pickle (a python archive of sorts). By default, save tries to work out what is 'data' as opposed to modules, functions, etc.; in other words, it only tries to save things that aren't defined in modules. However, the pickle is finicky. Not all data can be pickled, and a pickled object may not behave nicely down the road if a module that uses its data type has changed and expects something different from your data object. It is therefore always safest to write data to formatted files, but that can be a pain, so the pickle is an option.

This example saves all the 'data' to a file with a default name built from the time stamp, here save_2009_11_10_2200.sav.

We then clear all the variables and restore the variables back again.

pds>show
   :
   :
==== Variables ====
dat    j      l      x

pds>save

pds>ls
save_2009_11_10_2200.sav

pds>clear

pds>restore save_2009_11_10_2200.sav

pds>show
   :
   :
==== Variables ====
dat    j      l      x
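The behaviour described in this section can be approximated with the standard pickle module. The sketch below is an assumption about the general approach (keep only picklable, non-hidden values that are not modules, functions or classes); it is not the pds save/restore code, and the function names and the file name session.sav are made up. Keep in mind that a pickle file should only be restored if it comes from a source you trust, since unpickling can run arbitrary code.

```python
import os
import pickle
import tempfile
import types


def picklable_items(namespace):
    """Keep values that look like data and that pickle accepts."""
    kept = {}
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # hidden state
        if isinstance(value, (types.ModuleType, types.FunctionType, type)):
            continue  # modules, functions and classes are not 'data'
        try:
            pickle.dumps(value)
        except Exception:
            continue  # not everything can be pickled
        kept[name] = value
    return kept


def save(namespace, fname):
    with open(fname, "wb") as f:
        pickle.dump(picklable_items(namespace), f)


def restore(namespace, fname):
    with open(fname, "rb") as f:
        namespace.update(pickle.load(f))


workspace = {"dat": [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]], "j": 2,
             "os": os, "f": lambda v: v}
saved_names = sorted(picklable_items(workspace))
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "session.sav")
    save(workspace, path)
    workspace.clear()          # like the clear command
    restore(workspace, path)   # bring the data back
```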