Friday, 30 April 2010

Hashable or Unhashable: That is the question!

Python 3 has changed a lot of things. One of them is that a class which implements __eq__ no longer inherits the __hash__ method. This behavior seems weird at first, but a closer look at the topic reveals a serious problem.
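To see the change in action, here is a minimal sketch for Python 3. The class names Plain and WithEq are made up for illustration; the only thing that matters is which of them defines __eq__.

```python
class Plain:
    """No __eq__, so the default __hash__ from object is inherited."""


class WithEq:
    """Defines __eq__ only; Python 3 then sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, WithEq):
            return NotImplemented
        return self.value == other.value


# Plain still has a usable __hash__, WithEq does not.
plain_is_hashable = Plain.__hash__ is not None
with_eq_hash = WithEq.__hash__

try:
    hash(WithEq(1))
    hash_failed = False
except TypeError:
    # Python 3 refuses to hash instances of WithEq.
    hash_failed = True
```

So in Python 3 the interpreter forces you to make a decision instead of silently handing you an inherited hash.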

Most Python developers do not really think about the design and behavior of an object when they create it. They have a class with an __init__ method, a couple of attributes and methods, and maybe a __repr__. If they implement other special methods, these are usually __getattr__, __*item__ and so on. But given an arbitrary class you created, could you tell me whether it is supposed to be hashable and, if it is supposed to be hashable, whether it actually is?

Usually you do not care about __hash__ at all: you inherit from object, which implements it. On CPython the result is based on the id of the object. On other implementations it may be the same or it may be something different altogether, but it is unique to the object.

The result of this is that every time you create a type and do not implement __hash__, objects of that type will be hashable even if they are not supposed to be. The worst case is that you implement __eq__ but not __hash__: equal objects then do not behave like equal objects when you use them as keys in dictionaries, put them into a set, or do something else that relies on hashes as a way to check whether objects are equal.
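The worst case can be reproduced in Python 3 with a small sketch. Since Python 3 would make the class unhashable on its own, the example deliberately assigns object.__hash__ back to mimic the old inherited behavior; that assignment and the class name Version are assumptions made for the demonstration, not something you should copy into real code. The results described are what I would expect on CPython.

```python
class Version:
    """Hypothetical class: compares by value but hashes by identity."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    # Assumption: restore the inherited identity-based hash on purpose
    # to imitate the behavior before Python 3.
    __hash__ = object.__hash__


a = Version(1)
b = Version(1)

objects_are_equal = a == b          # the objects compare equal ...
lookup = {a: "found"}
found_with_equal_key = b in lookup  # ... but b is not found as a key
distinct_in_set = len({a, b})       # ... and the set keeps both objects
```

The two objects are equal, yet the dictionary lookup misses and the set holds two entries, because both containers look at the hash before they ever call __eq__.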

This results in two rules you should always stick to:
1. If you implement __eq__, also implement __hash__.
2. If your object is mutable, set __hash__ to None in order to make it unhashable. This way your object behaves the way you expect from dictionaries or lists.
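Both rules can be sketched in a few lines. Point and Tally are hypothetical classes invented for this example; Point is treated as immutable by convention (nothing changes its attributes after creation), while Tally is meant to be modified.

```python
class Point:
    """Hypothetical value object, immutable by convention (rule 1)."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares.
        return hash((self._x, self._y))


class Tally:
    """Hypothetical mutable object, explicitly unhashable (rule 2)."""

    __hash__ = None

    def __init__(self):
        self.count = 0


# Equal points collapse into a single set entry.
points = {Point(1, 2), Point(1, 2)}

try:
    hash(Tally())
    tally_hashable = True
except TypeError:
    # Same reaction you get from hash([]) or hash({}).
    tally_hashable = False
```

Hashing the same tuple of fields that __eq__ compares is the simplest way I know to keep the two methods consistent with each other.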

If you write new code, stick to the rules and you are good to go. In your old code, make sure to implement those methods and you make it one step easier to port it to Python 3.

Tuesday, 27 April 2010

Congratulations! Your proposal has been accepted.

Yesterday Google released the list of accepted projects for GSoC, and my project is one of them. Over the next months I will be one of the three students working on Sphinx. My part will be porting Sphinx to Python 3.x and integrating the web application which was developed during the last year.

Currently we are in the "Community Bonding Period", which basically means we are all idling in #pocoo to get in touch with the community and get an idea of the development process until May 24, the day everybody starts coding. However, as Sphinx is not such a big project, at least in terms of the number of contributors, I think that spending so much time on "bonding" is a waste of time, so I will probably start earlier.

So that's it for now. I will try to give you as much information as possible through blog posts in the future so everyone can easily follow the progress.