Wednesday 13 May 2015

Python Misc

Network Programming: Network byte order is big-endian, i.e. the most significant byte comes first; this matches the order in which we write numbers. Intel x86 systems are little-endian ("little end first"). socket.gethostbyaddr('8.8.8.8') does a reverse DNS lookup; socket.gethostbyname('google.com') resolves a hostname to an IP address. pcapy is a Python library for capturing network packets; it can be used to analyse offline pcap files as well. You can craft raw packets with 'scapy'.
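A quick offline sketch of byte order using only the standard library (no network needed); '>' and '<' in struct select big- and little-endian explicitly:

```python
import socket
import struct

# struct lets you pick the byte order explicitly:
# '>' means big-endian (network order), '<' means little-endian.
port = 0x1234
big = struct.pack('>H', port)     # b'\x12\x34' - big end first
little = struct.pack('<H', port)  # b'\x34\x12' - little end first
print(big == little[::-1])  # True

# htons converts a 16-bit value from host order to network order;
# on a little-endian Intel machine this swaps the bytes.
print(struct.pack('=H', socket.htons(port)) == big)  # True
```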

Monday 4 May 2015

High Performance Python

A CPU can apply a single operation to multiple pieces of data at once (SIMD) with no additional cost in time. Use vectorization to exploit this; it is possible with libraries such as 'numpy'.
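A small sketch of the idea, assuming numpy is installed: one array expression replaces a Python-level loop, and the work happens in compiled code where SIMD can be used.

```python
import numpy as np

# One NumPy expression multiplies the whole array at once; the work
# happens in compiled code where the CPU can use SIMD instructions.
a = np.arange(10)
vectorized = (a * 2).tolist()

# Equivalent Python-level loop, element by element (much slower at scale).
looped = [x * 2 for x in range(10)]
print(vectorized == looped)  # True
```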

Design your code to minimize the data the CPU has to move from RAM into the L2 cache, and to minimize the number of reads the CPU has to perform from RAM.

Natively, Python can't make use of multiple CPU cores from threads (the Global Interpreter Lock lets only one thread execute Python bytecode at a time). To avoid this problem, use multiprocessing instead of multiple threads.
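A minimal multiprocessing sketch (written for Python 3; the worker count of 4 is arbitrary):

```python
from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == '__main__':
    # Each worker is a separate process with its own interpreter and
    # GIL, so the calls can run on separate CPU cores.
    with Pool(4) as pool:
        print(pool.map(square, range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```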

line_profiler profiles a given function line by line.
heapy tracks all objects in Python's memory.
perf stat shows the number of instructions ultimately executed by the CPU.
memory_profiler shows memory usage.

$ python -m timeit -n 5 -r 5 "import test"
5 loops, best of 5: 0.763 usec per loop

simple timing using the Unix time command: time -p python test.py

or you can use cProfile

$ python -m cProfile -s cumulative test.py

         3 function calls in 0.002 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.002    0.002 test.py:1(<module>)
        1    0.000    0.000    0.000    0.000 {range}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This gives the cumulative time spent in each function.

Within the code, you can use the 'timeit' module for timing.
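For example, timeit can time either a statement string or a callable directly (the snippet timed here is arbitrary):

```python
import timeit

# Time a statement given as a string; number controls repetitions.
elapsed = timeit.timeit('sum(range(100))', number=1000)

# ...or time a callable directly.
elapsed2 = timeit.timeit(lambda: sum(range(100)), number=1000)
print(elapsed >= 0.0 and elapsed2 >= 0.0)  # True
```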



Sunday 3 May 2015

Intro to Webservers for Python

When a user enters a web site, their browser makes a connection to the site’s web server (this is called the request). The server looks up the file in the file system and sends it back to the user’s browser, which displays it (this is the response). This is roughly how the underlying protocol, HTTP, works.

Dynamic web sites are not based on files in the file system, but rather on programs which are run by the web server when a request comes in, and which generate the content that is returned to the user.

They can do all sorts of useful things, like display the postings of a bulletin board, show your email, configure software, or just display the current time. These programs can be written in any programming language the server supports. Since most servers support Python, it is easy to use Python to create dynamic web sites.

Most HTTP servers are written in C or C++, so they cannot execute Python code directly – a bridge is needed between the server and the program. These bridges, or rather interfaces, define how programs interact with the server. There have been numerous attempts to create the best possible interface, but there are only a few worth mentioning.

Not every web server supports every interface. Many web servers only support old, now-obsolete interfaces; however, they can often be extended using third-party modules to support newer ones.

Common Gateway Interface (CGI):

This interface, most commonly referred to as “CGI”, is the oldest, and is supported by nearly every web server out of the box. Programs using CGI to communicate with their web server need to be started by the server for every request. So, every request starts a new Python interpreter – which takes some time to start up – thus making the whole interface only usable for low load situations.

The upside of CGI is that it is simple – writing a Python program which uses CGI is a matter of about three lines of code. This simplicity comes at a price: it does very few things to help the developer.
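For illustration, a complete CGI response really is just a few lines: headers, a blank line, then the body. (The function wrapper here is only to make the sketch testable; the name is illustrative.)

```python
import sys

# A complete CGI response: a header block, a blank line, then the body.
def cgi_app(out=sys.stdout):
    out.write('Content-Type: text/plain\n')
    out.write('\n')                 # blank line separates headers from body
    out.write('Hello, CGI!\n')

cgi_app()
```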

Writing CGI programs, while still possible, is no longer recommended.

Web Server Gateway Interface (WSGI):

The Web Server Gateway Interface, or WSGI for short, is defined in PEP 333 and is currently the best way to do Python web programming.

WSGI is the Web Server Gateway Interface. It is a specification that describes how a web server communicates with web applications, and how web applications can be chained together to process one request.

While it is great for programmers writing frameworks, a normal web developer does not need to get in direct contact with it. When choosing a framework for web development it is a good idea to choose one which supports WSGI.

The big benefit of WSGI is the unification of the application programming interface. When your program is compatible with WSGI – which at the outer level means that the framework you are using has support for WSGI – your program can be deployed via any web server interface for which there are WSGI wrappers.
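A minimal sketch of what a WSGI application looks like, driven by hand the way a server would call it (all names here are illustrative):

```python
# A WSGI application is a callable that takes the request environ dict
# and a start_response callback, and returns an iterable of bytes.
def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, WSGI!']

# Drive the app directly, the way a WSGI server would.
seen = {}
def start_response(status, headers):
    seen['status'] = status
    seen['headers'] = headers

body = app({'REQUEST_METHOD': 'GET'}, start_response)
print(seen['status'], body)  # 200 OK [b'Hello, WSGI!']
```

To serve it for real you could use the standard library's wsgiref server, e.g. wsgiref.simple_server.make_server('localhost', 8000, app).serve_forever().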

A really great WSGI feature is middleware. Middleware is a layer around your program which can add various functionality to it. There is quite a bit of middleware already available.

For example, instead of writing your own session management (HTTP is a stateless protocol, so to associate multiple HTTP requests with a single user your application must create and manage such state via a session), you can just download middleware which does that, plug it in, and get on with coding the unique parts of your application.

The same thing with compression – there is existing middleware which handles compressing your HTML using gzip to save on your server’s bandwidth.

Authentication is another problem easily solved using existing middleware.
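A sketch of how middleware wraps an application without the application knowing (all names here are illustrative, not a real middleware package):

```python
# Middleware: wrap any WSGI app and inject an extra response header.
def add_header(app, header):
    def wrapped(environ, start_response):
        def patched(status, headers):
            start_response(status, headers + [header])
        return app(environ, patched)
    return wrapped

def inner(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hi']

stack = add_header(inner, ('X-Debug', 'on'))

captured = {}
stack({}, lambda status, headers: captured.update(headers=headers))
print(captured['headers'])
```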

Data Persistence:

Relational databases are queried using a language called SQL. Python programmers in general do not like SQL too much, as they prefer to work with objects. It is possible to save Python objects into a database using a technology called ORM (Object Relational Mapping).

ORM translates all object-oriented access into SQL code under the hood, so the developer does not need to think about it. Most frameworks use ORMs, and it works quite well.
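A toy illustration of the idea using only the standard library's sqlite3 (a real ORM such as SQLAlchemy generates this SQL generically; the classes here are just a sketch):

```python
import sqlite3

class User(object):
    def __init__(self, name, age):
        self.name, self.age = name, age

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (name TEXT, age INTEGER)')

def save(user):
    # Object attributes are translated into SQL under the hood.
    conn.execute('INSERT INTO users VALUES (?, ?)', (user.name, user.age))

save(User('alice', 30))
print(conn.execute('SELECT name, age FROM users').fetchone())  # ('alice', 30)
```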

Credit: https://docs.python.org/2/howto/webservers.html





Monday 27 April 2015

Python in a nutshell

Everything is an object in Python.

An object has an id 'id(obj)', a type 'type(obj)' and a value. The id and type won't change during the life of an object.

The most fundamental object types in Python are immutable: string, number, tuple, etc., where even the value won't change.

Mutable objects: lists, dictionaries, etc., where the object's value can be changed.

Numbers are either integers or floats.

42 // 9 = 4 (floor division)

s = r"this is a raw\n string"   # raw string: the backslash is kept literally
s = "this is split\n into two lines"   # \n is a newline

tuple is immutable, whereas list is mutable.

'is' operator compares the 'id' of objects.

'==' compares the values of objects.

conditional expression: s = "this is true" if a > b else "this is not true"

'else' can also be used with exceptions: the code inside the 'else' block executes only if no exception was raised.
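A small sketch of the full try/except/else/finally flow (the parse function is illustrative):

```python
def parse(text):
    result = []
    try:
        value = int(text)
    except ValueError:
        result.append('error')
    else:
        # Runs only when the try block raised no exception.
        result.append(value)
    finally:
        # Always runs, exception or not.
        result.append('done')
    return result

print(parse('42'))    # [42, 'done']
print(parse('oops'))  # ['error', 'done']
```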

By using a 'with' block, you’re defining a specific context, in which the contents of the block should execute.

Comprehension

list comprehension : output = [value+1 for value in range(10) if value > 5]

generator comprehension : gen = (value for value in range(10) if value > 5). Note that you can only iterate over a generator once.

set comprehension : a = {value/2 for value in range(10) if value > 5}
>>> example = {1, 2, 3, 4, 5}
>>> 4 in example   ## set membership check
remove(), add(), update(), discard(), pop(), issubset(), issuperset() are methods available on sets.
Operators: | (union), & (intersection), - (difference), ^ (symmetric difference / exclusive OR)
>>> {1, 2, 3} | {4, 5, 6}   ## union of two sets

The chain() function, in particular, accepts any number of iterables and returns a new generator that will iterate over each one in turn:

import itertools
for i in itertools.chain(range(3), range(4)):
    print i

Zipping can merge two iterables together, forming a list of tuples.

>>> zip(range(3),reversed(range(3)))
[(0, 2), (1, 1), (2, 0)]

This can be used to create a dictionary from two different lists, e.g.:
>>> dict(zip(range(3),reversed(range(3))))
{0: 2, 1: 1, 2: 0}

The map() function applies a function to a group of values:
>>> map(float, range(97, 102))
[97.0, 98.0, 99.0, 100.0, 101.0]

'from collections import *' provides container types such as OrderedDict, defaultdict, deque, etc.
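For example, two of the most commonly used collections types:

```python
from collections import OrderedDict, defaultdict

# defaultdict fills in a value for missing keys automatically.
counts = defaultdict(int)
for ch in 'banana':
    counts[ch] += 1
print(dict(counts))  # {'b': 1, 'a': 3, 'n': 2}

# OrderedDict remembers the order in which keys were inserted.
od = OrderedDict([('first', 1), ('second', 2)])
print(list(od))  # ['first', 'second']
```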

The finally suite is always executed no matter what exceptions occur within a try/except statement. 


The standard library’s pickle module lets you easily and efficiently save and restore Python data objects to disk. 

        The pickle.dump() function saves data to disk.

        The pickle.load() function restores data from disk.
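A round-trip sketch (the file path and data are arbitrary):

```python
import os
import pickle
import tempfile

data = {'numbers': [1, 2, 3], 'name': 'example'}
path = os.path.join(tempfile.gettempdir(), 'data.pkl')  # path is arbitrary

with open(path, 'wb') as f:
    pickle.dump(data, f)        # save to disk

with open(path, 'rb') as f:
    restored = pickle.load(f)   # restore from disk

print(restored == data)  # True
```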


Python scope:

Python first looks inside the locals() dictionary, which has entries for all local variables. If the name doesn't exist there, the globals() dictionary is searched. Finally, if the object isn't found there, the __builtin__ module is searched.

Implement Queue:

It is also possible to use a list as a queue, where the first element added is the first element retrieved (“first-in, first-out”); however, lists are not efficient for this purpose. While appends and pops from the end of list are fast, doing inserts or pops from the beginning of a list is slow (because all of the other elements have to be shifted by one).

To implement a queue, use collections.deque which was designed to have fast appends and pops from both ends.
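A minimal deque-as-queue sketch:

```python
from collections import deque

queue = deque(['a', 'b'])
queue.append('c')          # enqueue at the right end, O(1)
first = queue.popleft()    # dequeue from the left end, also O(1)
print(first, list(queue))  # a ['b', 'c']
```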

Default set and get method for class:

Python also has some built-in functions to modify your classes without having to create “get” and “set” methods: getattr(obj, name) to access an attribute of an object, setattr(obj, name, value) to set an attribute, hasattr(obj, name) to check for existence, and, finally, delattr(obj, name) to delete an attribute. Public attributes are, of course, accessible once the object is created.
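All four in one sketch (the Point class is illustrative):

```python
class Point(object):
    pass

p = Point()
setattr(p, 'x', 3)          # same effect as p.x = 3
print(getattr(p, 'x'))      # 3
print(hasattr(p, 'y'))      # False
print(getattr(p, 'y', 0))   # 0 - fallback when the attribute is missing
delattr(p, 'x')             # same effect as del p.x
print(hasattr(p, 'x'))      # False
```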

Method resolution order

>>> class A(object): x = 'a'
... 
>>> class B(A): pass
... 
>>> class C(A): x = 'c'
... 
>>> class D(B, C): pass
... 
>>> D.x
'c'
>>> 
Here, with new-style classes, the order is:
>>> D.__mro__
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, 
    <class '__main__.A'>, <type 'object'>)
with A forced to come in the resolution order only once and after all of its subclasses, so that overrides (i.e., C's override of member x) actually work sensibly.

module inspect


The inspect module provides several useful functions to help get information about live objects such as modules, classes, methods, functions, tracebacks, frame objects, and code objects. For example, it can help you examine the contents of a class, retrieve the source code of a method, extract and format the argument list for a function, or get all the information you need to display a detailed traceback.
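A small sketch, written for Python 3 (on the Python 2 of this blog's era you would use inspect.getargspec instead of inspect.signature); the greet function is illustrative:

```python
import inspect

def greet(name, punctuation='!'):
    return 'hello ' + name + punctuation

# Examine a live function: is it a function, and what is its signature?
print(inspect.isfunction(greet))      # True
print(str(inspect.signature(greet)))  # (name, punctuation='!')
```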



@staticmethod: defines a static method. It is just a way to organize a function into a related class; no implicit first argument is passed to a static method.

@classmethod: class-level methods; the class itself is passed as the implicit first argument (cls).

The four basic arithmetic operations (addition, subtraction, multiplication, and division) are represented in Python using the standard operators +, -, *, and /. Behind the scenes, they are powered by implementations of the __add__(), __sub__(), __mul__(), and __div__() (__truediv__() in Python 3) methods.
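A minimal operator-overloading sketch (the Vector class is illustrative):

```python
class Vector(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Invoked for v1 + v2.
        return Vector(self.x + other.x, self.y + other.y)

v = Vector(1, 2) + Vector(3, 4)
print(v.x, v.y)  # 4 6
```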

Copy object

In order to make changes to an object without those changes showing up elsewhere, you’ll need to copy the object first.

Some objects provide a copy() method.

There is also the 'copy' module ('import copy').

It provides copy.copy(objname) (shallow copy) and copy.deepcopy(objname) (deep copy).

>>> original = [[1, 2, 3], [1, 2, 3]]
>>> shallow_copy = copy.copy(original)
>>> deep_copy = copy.deepcopy(original)
>>> original[0].append(4)
>>> shallow_copy
[[1, 2, 3, 4], [1, 2, 3]]
>>> deep_copy
[[1, 2, 3], [1, 2, 3]]

%s vs %r

The %s placeholder is equivalent to calling str() directly, so the value placed into the string is the result of calling the object’s __str__() method. Similarly, if you use the %r placeholder inside the substitution string, Python will call the object’s __repr__() method instead.

>>> print "%r is me" % ('shameer\n')
'shameer\n' is me
>>> print "%s is me" % ('shameer\n')
shameer
 is me
>>>

Object visibility

Any Python object (modules, classes, functions, variables...) can be public or private. Usually the object name decides the object visibility: objects whose name starts with an underscore and doesn't end with an underscore are considered private. All the other objects (including the "magic functions" such as __add__) are public.

Python unittest


test fixture
A test fixture represents the preparation needed to perform one or more tests, and any associate cleanup actions. This may involve, for example, creating temporary or proxy databases, directories, or starting a server process.
test case
A test case is the smallest unit of testing. It checks for a specific response to a particular set of inputs. unittest provides a base class, TestCase, which may be used to create new test cases.
test suite
A test suite is a collection of test cases, test suites, or both. It is used to aggregate tests that should be executed together.
test runner
A test runner is a component which orchestrates the execution of tests and provides the outcome to the user. The runner may use a graphical interface, a textual interface, or return a special value to indicate the results of executing the tests.

import unittest

class TestStringMethods(unittest.TestCase):

  def test_upper(self):
      self.assertEqual('foo'.upper(), 'FOO')

  def test_isupper(self):
      self.assertTrue('FOO'.isupper())
      self.assertFalse('Foo'.isupper())

  def test_split(self):
      s = 'hello world'
      self.assertEqual(s.split(), ['hello', 'world'])
      # check that s.split fails when the separator is not a string
      with self.assertRaises(TypeError):
          s.split(2)

if __name__ == '__main__':
    unittest.main()

or you can execute from command line..

python -m unittest test_module1 test_module2
python -m unittest test_module.TestClass
python -m unittest test_module.TestClass.test_method




class MyTestCase(unittest.TestCase):

    @unittest.skip("demonstrating skipping")
    def test_nothing(self):
        self.fail("shouldn't happen")

    @unittest.skipIf(mylib.__version__ < (1, 3),
                     "not supported in this library version")
    def test_format(self):
        # Tests that work for only a certain version of the library.
        pass

    @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
    def test_windows_support(self):
        # windows specific testing code
        pass

running system-level commands


>>> os.system('python test1.py')
Welcome
0
>>> subprocess.call(['python','test1.py'])
Welcome
0
>>>


threading



t1.join() is called to wait for the t1 thread to finish. Then t2.join() is called to wait for the t2 thread to finish.
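The pattern above can be sketched end to end (the worker function is illustrative):

```python
import threading

results = []

def worker(n):
    results.append(n * n)

t1 = threading.Thread(target=worker, args=(2,))
t2 = threading.Thread(target=worker, args=(3,))
t1.start()
t2.start()
t1.join()   # block until t1 has finished
t2.join()   # block until t2 has finished
print(sorted(results))  # [4, 9]
```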

passing command-line arguments:

import argparse

parser = argparse.ArgumentParser()

parser.add_argument('-i', type=str, help='just testing', required=True)

cmds = parser.parse_args()

print cmds.i

enumerate wraps any iterator with a lazy generator. This generator yields pairs of the loop index and the next value from the iterator. The resulting code is much clearer.
for i, flavor in enumerate(flavor_list):
    print('%d: %s' % (i + 1, flavor))


design patterns:

creational pattern:
factory method and factory class
builder pattern
prototype pattern


The abstract factory design pattern essentially aims to provide an interface with the intention of creating families of related or dependent objects without specifying their concrete classes

Imagine that we want to create an object that is composed of multiple parts and the composition needs to be done step by step. The object is not complete unless all its parts are fully created. That's where the Builder design pattern can help us. We use the Builder pattern when we know that an object must be created in multiple steps.
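A sketch of the Builder pattern (all class names here are illustrative):

```python
class Pizza(object):
    def __init__(self):
        self.toppings = []

class PizzaBuilder(object):
    def __init__(self):
        self._pizza = Pizza()

    def add(self, topping):
        # Each step contributes one part; return self to allow chaining.
        self._pizza.toppings.append(topping)
        return self

    def build(self):
        # The object is only handed out once construction is complete.
        return self._pizza

pizza = PizzaBuilder().add('cheese').add('olives').build()
print(pizza.toppings)  # ['cheese', 'olives']
```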

The prototype pattern is about cloning existing objects; use copy.copy() and copy.deepcopy().

Structural pattern

Adapter is a structural design pattern that helps us make two incompatible interfaces compatible. First, let's answer what incompatible interfaces really mean. If we have an old component and we want to use it in a new system, or a new component that we want to use in an old system, the two can rarely communicate without requiring any code changes. But changing the code is not always possible, either because we don't have access to it (for example, the component is provided as an external library) or because it is impractical. In such cases, we can write an extra layer that makes all the required modifications for enabling the communication between the two interfaces. This layer is called the Adapter. Basically, it means writing a wrapper around an existing interface. The Adapter makes things work after they have been implemented.
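A sketch of the Adapter pattern; OldPrinter stands in for a legacy component we cannot modify (all names here are illustrative):

```python
class OldPrinter(object):
    # Legacy interface we cannot change.
    def print_text(self, text):
        return 'old: ' + text

class PrinterAdapter(object):
    # Wraps OldPrinter behind the write() interface the new code expects.
    def __init__(self, old):
        self._old = old

    def write(self, text):
        return self._old.print_text(text)

print(PrinterAdapter(OldPrinter()).write('hi'))  # old: hi
```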

A Decorator pattern can add responsibilities to an object dynamically, and in a transparent manner

Facade pattern is ideal for providing a simple interface to client code that wants to use a complex system but does not need to be aware of the system's complexity

Constructor overloading:

Do constructor overloading (polymorphism) with @classmethod.
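A sketch of a @classmethod alternate constructor (the Date class and its 'YYYY-MM-DD' format are illustrative):

```python
class Date(object):
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @classmethod
    def from_string(cls, text):
        # Alternate constructor: build a Date from 'YYYY-MM-DD'.
        year, month, day = (int(part) for part in text.split('-'))
        return cls(year, month, day)

d = Date.from_string('2015-04-27')
print(d.year, d.month, d.day)  # 2015 4 27
```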

Metaclasses:

Use metaclasses to ensure that subclasses are well formed at the time they are defined, before objects of their type are constructed.
The __new__ method of metaclasses is run after the class statement’s entire body has been processed.
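A sketch of such a validating metaclass, written in Python 3 syntax (the class names and the 'fields' rule are illustrative):

```python
class RequireFields(type):
    def __new__(mcls, name, bases, namespace):
        # Runs after the whole class body has been processed, before
        # any instance exists, so malformed subclasses fail early.
        if bases and 'fields' not in namespace:
            raise TypeError(name + ' must define fields')
        return super().__new__(mcls, name, bases, namespace)

class Base(metaclass=RequireFields):
    pass

class Good(Base):
    fields = ['x']

print(Good.fields)  # ['x']
```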



heapq, bisect

heapq implements a min-heap (priority queue) on top of a plain list; bisect inserts into and searches sorted lists using binary search.
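A quick sketch of both modules:

```python
import bisect
import heapq

# heapq keeps the smallest item at index 0 of a plain list.
heap = [5, 1, 4]
heapq.heapify(heap)
smallest = heapq.heappop(heap)
print(smallest)  # 1

# bisect finds the insertion point that keeps a list sorted.
items = [10, 20, 30]
bisect.insort(items, 25)
print(items)  # [10, 20, 25, 30]
```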

Consider Interactive Debugging with pdb

Here, I instantiate a Profile object from the cProfile module and run the test function through it using the runcall method:

from cProfile import Profile
from pstats import Stats

profiler = Profile()
profiler.runcall(test)

Once the test function has finished running, I can extract statistics about its performance using the pstats built-in module and its Stats class. Various methods on a Stats object adjust how to select and sort the profiling information to show only the things you care about.

stats = Stats(profiler)
stats.strip_dirs()
stats.sort_stats('cumulative')
stats.print_stats()

Tuesday 21 April 2015

Python coding principles


Write readable code. It is all about choosing the right names, putting that extra blank line between functions, etc.

Do not make it complex. Simplify.

Beauty is in simplicity

Make fewer assumptions. Refuse the temptation to guess. Suppose you are trying to access a dictionary with a key; don't assume that the key is always present in the dictionary.

Don't rush to finish it immediately. Take your time.

If implementation is hard to explain, it is a bad idea.

If implementation is easy to explain, it may be a good idea.

Don't repeat yourself. Turn commonly used tasks into functions.

Premature optimization is the root of all evil. First make it work; you shouldn't think about optimizing the code initially. Strike a balance.

Be conservative in what you do, be liberal in what you accept. That means your code should allow variations in incoming data.


Python tips




1. Difference between my_dict['key'] and the my_dict.get() method

Dictionaries can be accessed with my_dict['key'] and with the get() method; both seem to do the exact same thing.

With bracket syntax, a missing key results in a KeyError exception being raised.

The get() method checks to see whether the provided key is present in the dictionary; if it is, the associated value is returned. If the key isn’t in the dictionary, an alternate value is returned instead. By default, the alternate value is None. You can pass a second argument to override this default.
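The difference in one sketch (the dictionary is arbitrary):

```python
my_dict = {'a': 1}

print(my_dict['a'])          # 1
print(my_dict.get('b'))      # None - missing key falls back to the default
print(my_dict.get('b', 0))   # 0 - second argument overrides the default
try:
    my_dict['b']
except KeyError:
    print('bracket syntax raises KeyError')
```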

2. if __name__ == "__main__" : main()

Before executing a source file, the Python interpreter defines a few special variables. If the interpreter is running a source file directly, it sets the value of __name__ to '__main__'. If the file is imported instead, __name__ is set to the module's name.

When you do the main check, the code only executes when you run the module as a program; it won't execute when somebody imports it.
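A minimal sketch (my_module.py is a hypothetical file name):

```python
# Hypothetical contents of my_module.py:
def main():
    return 'running as a script'

if __name__ == '__main__':
    # Reached with `python my_module.py`, but not on `import my_module`.
    print(main())
```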

3. *args and **kwargs?

The syntax is the * and **. The names *args and **kwargs are only a convention; there's no hard requirement to use them.

You would use *args when you're not sure how many arguments might be passed to your function, i.e. it allows you to pass an arbitrary number of positional arguments to your function. For example:

>>> def print_everything(*args):
...     for count, thing in enumerate(args):
...         print '{0}. {1}'.format(count, thing)
...
>>> print_everything('apple', 'banana', 'cabbage')
0. apple
1. banana
2. cabbage

Similarly, **kwargs allows you to handle named arguments that you have not defined in advance:

>>> def table_things(**kwargs):
...     for name, value in kwargs.items():
...         print '{0} = {1}'.format(name, value)
...
>>> table_things(apple = 'fruit', cabbage = 'vegetable')
cabbage = vegetable
apple = fruit

You can use these along with named arguments too. The explicit arguments get values first and then everything else is passed to *args and **kwargs. The named arguments come first in the list. For example:

def table_things(titlestring, **kwargs):

You can also use both in the same function definition but *args must occur before **kwargs.

You can also use the * and ** syntax when calling a function. For example:

>>> def print_three_things(a, b, c):
...     print 'a = {0}, b = {1}, c = {2}'.format(a,b,c)
...
>>> mylist = ['aardvark', 'baboon', 'cat']
>>> print_three_things(*mylist)
a = aardvark, b = baboon, c = cat

As you can see, in this case it takes the list (or tuple) of items and unpacks it, matching them to the arguments in the function. Of course, you can have a * both in the function definition and in the function call.


Source (credit): http://stackoverflow.com/questions/3394835/args-and-kwargs