GIS programmer writing Python in Emacs


EuroPython 2011 report

July 22, 2011 at 12:00 PM | categories: python

Are you a Python programmer? Do you want to learn Python? Do you want to become a better programmer? Have you ever been to Florence?

If you've answered yes to at least one of these questions and yet didn't show up in Florence in late June for the EuroPython 2011 conference, then you have no excuse. And no, living in Australia or North America doesn't qualify as a valid reason: there were people from these continents.

Whew, that was a longish attempt at making a joke. Now let's get to the real deal, that is, the report. I must also note that I didn't stay for the whole length of EuroPython. Hey, it was my honeymoon! And that's why I didn't have time to write the report sooner.

Talks

Of course, talks are important. They are the reason people come to events like this. But it's always hard to choose which one suits you best when you have 5 talks running in parallel. Sometimes you're sure to run into conflicts. And real competition between speakers is generally a good thing for a conference. Watching all of the talks might be possible if you're diagnosed with insomnia or otherwise have tons of free time. I can't really recommend any talks, as I was at the conference for only two days, but I've heard very good things about Armin Ronacher's Five Years of Bad Ideas talk. Fortunately, all of the talks are already uploaded to YouTube, so don't hesitate to watch those whose descriptions you find interesting.

My talk

Ah, yes, I forgot to mention -- I was one of the speakers too. Well, if you're thinking about using OpenStreetMap data or just looking into using Python for GIS, my talk might give some hints as to where to start. Check out the slides here or perhaps check out the talk's page on the EuroPython site.

People

People in general are great. Pythonistas are no exception to this rule. It's always fun to just talk to people around you, ask what mad things they are doing with Python, bash PyPy (ha-ha, joking) and Django (only half-joking) and just hang out talking about things not related to programming (not joking). Socializing with other geeks is fun!

Organization

Organization was actually pretty good. If you don't count the WiFi issues, it was fantastic. Just to make sure, I've made a list of good and bad things.

The good

  • Organizers were easy to tell apart from conference visitors. Seriously, using yellow t-shirts as a uniform was a fantastic idea.
  • Names on both sides of badges. YES. BOTH SIDES. That was a minor gripe with PyCon this year -- after wearing the badge for 3 days, the name was almost invisible and furthermore, just like a sandwich that always falls butter side down, the badge always turned the blank side up.
  • Equipment. It worked. 'nuff said.
  • Food. Now, this is something I like about Southern Europe in general -- good food. It's not that easy to cook something delicious for a crowd of 600 people, but the cooks in Florence actually did a very good job.
  • Wine at lunch. I love wine. I L O V E it! And having a glass of wine relaxes you enough to not care about internet issues anymore.
  • Place. Bus stop right in front of the hotel? Check. Close to city center? Check.

The bad

  • WiFi and Internet connectivity in general. Everybody would definitely have tweeted about it if they could have connected to the internet. Well, at least more people paid attention to talks. :)
  • Partner programs, PyFiorentina and marketing. Constant reminders about those on EuroPython's official Twitter account were kinda irritating.

In general, the organization was kick-ass, and though I highlighted some issues, I think the organizing team did a great job. Go Italy!

Summary

Instead of a summary, I'm just going to say that I'm certain I'll visit EuroPython next year and so should you!


PyCon tutorial slides are up!

April 14, 2011 at 02:25 PM | categories: python, pycons

I've finally managed to get the slides on my site, so here you go! On an almost unrelated note, here's a tiny cute one-liner that I've used to create a zip of all the code used in slides:

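# Collect every file referenced by \lstinputlisting and zip them up.
# -F makes grep treat the pattern as a fixed string, backslash included.
grep -F '\lstinputlisting' slides.tex | \
sed 's/.*{\(.*\)}.*/\1/' | \
zip -@ pycon2011-code.zip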
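
For anyone who'd rather stay in Python, roughly the same pipeline can be sketched with the standard re and zipfile modules. This is only an illustration and not what was used for the slides: the names listing_paths and zip_listings are made up here, and, like the sed expression, the regex assumes the file path sits in the last pair of braces on the line.

```python
import re
import zipfile

# Mirrors the sed expression: capture the text inside the last {...} on a
# line that contains \lstinputlisting.
LISTING_RE = re.compile(r"\\lstinputlisting.*\{(.*)\}")


def listing_paths(tex_source):
    """Return the file paths referenced by \\lstinputlisting commands."""
    paths = []
    for line in tex_source.splitlines():
        match = LISTING_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths


def zip_listings(tex_path, zip_path):
    """Write every referenced file into a zip archive (hypothetical helper)."""
    with open(tex_path) as tex_file:
        paths = listing_paths(tex_file.read())
    with zipfile.ZipFile(zip_path, "w") as archive:
        for path in paths:
            archive.write(path)
    return paths
```

Calling zip_listings("slides.tex", "pycon2011-code.zip") from the directory that holds the slides would then do about the same job as the shell version.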

PyCon wrap up

March 25, 2011 at 10:53 AM | categories: python, pycons

It's now almost two weeks since PyCon came to an end, so it seems like a good time to write a short report about the conference.

I must say that this was my first time at PyCon and actually my first time at a conference of such huge size, so I was a bit overwhelmed. Nevertheless, I was able to serve as tutorial instructor, bag swagger, session runner and beer drinker in the 7 days I spent at the venue. And of these positions, the hardest one was being a tutorial instructor. Not only did I feel pressured by the fact that people actually paid to get to my tutorial and that this was my first tutorial given in English, but I had also spent more than two months adding, updating and removing material. In the end the tutorial went OK, but I think I hurried a little bit, and the fact that the tutorial wasn't meant to be of the "code as I talk" type seemed to disappoint some of the listeners. I'm still waiting for some feedback from the PyCon organizers about the overall rating of my tutorial, but I'm happy that at least nobody walked out of the room during it.

The main conference featured tons of interesting talks, but of course I couldn't possibly make all of them, apparently because I don't scale to 5 rooms in a nice reproducible fashion. I'm still in the process of watching the talks at http://pycon.blip.tv and I suggest that anyone interested in Python do exactly this -- watch all the talks, one by one. Some of them might generate more interest, though. For example, I remember how crowded it was at Alex Martelli's talk about API design, where people had to stand because there wasn't a single empty chair in the room. And the same happened at both of Raymond Hettinger's talks, where people were starting to reserve seats 30 minutes prior to the talk. Some talks were not meant to be educational, or rather, emphasized the fun at the expense of content. I do like talks like these, because getting an absolutely massive amount of information all day non-stop would likely cause brain damage. So, don't forget to watch "Obfuscated Python" by Rev. Johnny Healey and "Exhibition of Atrocity" by Mike Pirnat when you get bored (and I hope you don't).

I also had some time to take part in sprints. I wouldn't say I had much choice in whom to join, as I mostly use Twisted in my current project. Well, I must say that Twisted people aren't that twisted after all, and I had quite a bit of fun trying to reproduce and fix older bugs from their bugtracker. I admit that their way of doing code review leads to much higher code quality at the expense of longer ticket processing time. Still, I was able to provide fixes for two tickets and get into the top twenty of the Twisted Highscore, which I find quite cool.

And, as a short summary -- PyCon 2011 and its attendees were awesome and I really want to make it to Santa Clara next year.


My OpenStreetMap tutorial at PyCon

February 17, 2011 at 11:51 PM | categories: python, osm

This year marks a huge milestone for me -- I'll be giving a tutorial session at PyCon. Being selected as one of the speakers is not only a great honor but also a huge responsibility. Whatever, that's not the point of this post. This is supposed to be a self-advertisement. So, if you don't know what OpenStreetMap is, how to render a map in Python or what GIS stands for (or any of the above) -- don't hesitate to spend $150 on registration. For those of you who think that they can handle this with ease, I suggest a much more sophisticated alternative. I don't recommend this workshop to people not experienced in Python, though. There are plenty of other tutorials, so don't hesitate to share this information with your friends, colleagues and random people on the Internet.

Oh, and don't forget to register for the main conference, because it's probably the only way to find out about giant telescopes, huge robots or automation of airplane engines.

On an almost unrelated note, if you have registered for my tutorial and want to try out all the examples while in the session room, I'd suggest you go through this check-list:

  • Install Mapnik. See the installation instructions. There're pre-built packages available for most major platforms, so the install process should be straightforward.
  • Install Shapely. Play with examples, try to recall your geometry classes from school. This will help you feel more comfortable with the examples I'm going to provide during the session.
  • Install PostgreSQL with the PostGIS extension. Some of the examples might include writing SQL queries, so if you're not familiar with SQL, you should probably read a basic tutorial, but that's definitely not obligatory.
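
If you want a quick way to see how far along the check-list you are, something like the sketch below might help. It is only a rough check under a few assumptions: "mapnik" and "shapely" are assumed to be the import names of the Python packages, "psql" is assumed to be on your PATH once PostgreSQL is installed, and the script doesn't verify that PostGIS itself is enabled. The function name missing_requirements is made up for this example.

```python
import importlib.util
import shutil

# Assumed names: "mapnik" and "shapely" as Python import names, "psql" as the
# PostgreSQL command-line client. Adjust if your platform differs.
REQUIRED_MODULES = ("mapnik", "shapely")
REQUIRED_COMMANDS = ("psql",)


def missing_requirements(modules=REQUIRED_MODULES, commands=REQUIRED_COMMANDS):
    """Return the names of modules and commands that cannot be found."""
    missing = [name for name in modules
               if importlib.util.find_spec(name) is None]
    missing += [name for name in commands if shutil.which(name) is None]
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Still missing:", ", ".join(problems))
    else:
        print("Everything on the check-list seems to be installed.")
```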

Of course, you could just skip all of these and try out the code samples later after the session, in a calm and quiet setting. The tutorial itself doesn't force students into constantly writing code.

In conclusion, don't miss out on your opportunity to extend your knowledge base and social network: visit PyCon 2011 in Atlanta. It's going to be cool.


Tile Server Implementation

January 16, 2011 at 05:40 PM | categories: python, osm

It's been almost 2 years since CloudMade ditched mod_tile and renderd as its main rendering solution in favour of an in-house one. As the principal designer of said alternative, I must say that this decision led to a higher development pace. This article will try to cover the general architectural approach, the reasons behind the decisions made and a short comparison to other rendering alternatives.

Before The Switch

As some of you might know, CloudMade has its roots in OpenStreetMap, and it was quite natural to adopt OSM's software stack to have something to start with. But as CloudMade grew, the needs and requirements changed rapidly and the task of supporting and developing mod_tile became more of a burden, so the decision was made to switch to a higher-level language as the main one. The language of choice was Python, due to its generous set of existing spatial libraries (e.g. Shapely, GeoAlchemy, Mapnik bindings), ease of deployment and simpler support for cross-platform development. And, well, I knew it better than Scala, Ruby or Perl at that moment. Here is a list of the tasks that we found easier to implement in Python than with mod_tile and renderd:

Variable priorities
mod_tile has the notion of "dirty" and "general" requests, with dirty requests having lower priority and thus being rendered only when there's little to no on-demand rendering required. While this seems enough for most applications, it does have its warts, as it makes the priority system less general overall. What this means in practice is that every time we need to add some special priority (e.g. when we need to health-check the system by forcing rendering) we end up adding quite a lot of code, rather than changing the "priority" property of the request. It might seem silly, but off the top of my head I can remember that we have at least 6 different priorities now.
Replicating cache
When it comes to scaling rendering and serving of tiles, the simplest solution that comes to mind is adding more servers. It's as simple as clicking several links in a web interface, or even using an automated process and the Amazon Web Services API. But when you add a new server with the rendering stack installed, it starts without any of the cache that has built up on the other servers, and furthermore the instances don't share cache, which makes caching less effective. There are several solutions to this issue, each of them making use of networking or database libraries, programming against which is a tedious task in C (and C++).
Being tied to Apache
mod_tile is an Apache module, which makes it less interesting if you look at it from a "commodity server" perspective. Having to program against the monster that is Apache, using its APR library, is one giant leap into full-blown programmer depression. The autogenerated documentation makes matters even worse. And the last two things about Apache are its comparatively slow serving of static files and its complicated configuration scheme. One might say that Apache wins in other parts of the comparison, but the things mentioned here were essential to our rendering services.
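
To make the point about variable priorities a bit more concrete, here's a minimal sketch of a render queue where priority is just a property of the request, built on the standard heapq module. It isn't CloudMade's actual code: the RenderQueue class, the priority names and their ordering are all invented for illustration.

```python
import heapq
import itertools

# Hypothetical priority levels; lower numbers are served first.
PRIORITIES = {"health_check": 0, "on_demand": 1, "dirty": 2}


class RenderQueue:
    """Tile render queue where priority is a plain property of the request."""

    def __init__(self):
        self._heap = []
        # Tie-breaker keeps requests of equal priority in arrival order.
        self._counter = itertools.count()

    def push(self, tile, priority):
        """Queue a tile, given as a (zoom, x, y) tuple, under a named priority."""
        entry = (PRIORITIES[priority], next(self._counter), tile)
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Return the most urgent tile."""
        _, _, tile = heapq.heappop(self._heap)
        return tile

    def __len__(self):
        return len(self._heap)
```

With a layout like this, adding a seventh priority would mean one more entry in the PRIORITIES dictionary rather than a new code path.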

These were the main reasons to switch, as mod_tile and renderd didn't seem like the right thing for CloudMade. Of course, there were a lot of other, more or less subjective reasons, but the ones mentioned above were enough to seriously consider a switch.

The Switch

With all the warts of the existing system and requirements for the future in mind, we decided to move on with the new approach. There were several things to consider in our system:

Decoupling
This was our main goal -- a thoroughly decoupled system, where every part does one thing and does it well. This makes scaling much easier, but also increases the amount of code, because of the need to write communication utilities. This also makes the system as a whole seem much more stable, as other parts of the system can work as replacements in case of failure. Of course, the price is network overhead and the need to supervise system parts.
Handling styles
One of the main CloudMade web services is the style editor, which lets users edit map styles in a WYSIWYG fashion. Handling thousands of Mapnik styles wasn't something any existing system was prepared for, so a unique way of doing exactly this had to be devised. Of course, this meant that the style state in every part of the system had to be consistent at any given moment, making this even harder to accomplish.
Cache expiry
To minimize load on the system, as much cache as possible has to be available. But for rapidly changing OpenStreetMap data, keeping all tiles cached for a month wouldn't work, and at the same time rendering all images on the fly would be an enormously heavy task. Whatever cache update approach is taken, unless the hardware makes it possible to render maps on the fly, someone will be unhappy about the cache expiry scheme.
Health monitoring and high availability
In order to meet the requirement of having usable web services, one of the most important things to consider is keeping service uptime as high as possible. Without health monitoring that knows the state of every part of the system, that objective is almost unreachable. Of course, the ideal cannot be achieved, but a setup that covers at least 80% of the nodes would satisfy our needs.
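
As a rough illustration of how the style and cache expiry concerns interact, here's a sketch of a tile cache keyed by style and tile coordinates, with a plain time-to-live. This article doesn't describe the expiry scheme actually used, so treat everything here as an assumption: the TileCache class, its methods and the fixed TTL are invented for the example.

```python
import time


class TileCache:
    """In-memory tile cache with a fixed time-to-live (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._tiles = {}  # (style_id, zoom, x, y) -> (stored_at, image_bytes)

    def put(self, style_id, zoom, x, y, image_bytes):
        self._tiles[(style_id, zoom, x, y)] = (self._clock(), image_bytes)

    def get(self, style_id, zoom, x, y):
        """Return cached bytes, or None if the tile is missing or expired."""
        key = (style_id, zoom, x, y)
        entry = self._tiles.get(key)
        if entry is None:
            return None
        stored_at, image_bytes = entry
        if self._clock() - stored_at > self._ttl:
            del self._tiles[key]  # expired: the caller should re-render
            return None
        return image_bytes

    def expire_style(self, style_id):
        """Drop every tile of a style, e.g. after the style was edited."""
        stale = [key for key in self._tiles if key[0] == style_id]
        for key in stale:
            del self._tiles[key]
        return len(stale)
```

Keeping the style identifier in the key means that editing one style only invalidates that style's tiles, while the TTL takes care of changes in the underlying map data.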

The system that's currently in use at CloudMade has been developed with exactly these goals in mind, with minor additions and subtractions along the way. To summarize, the goal was a system where every part is as independent as possible from every other part, while still serving the general goal of having a fast and easily deployed rendering stack.
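
The health monitoring goal can be pictured with a similarly small sketch: every node reports a heartbeat, and a monitor works out which nodes have been heard from recently. Again, this is a hypothetical design rather than the real one; the HealthMonitor class, the node names in the usage note and the timeout are assumptions.

```python
import time


class HealthMonitor:
    """Tracks the last heartbeat of each known node (hypothetical design)."""

    def __init__(self, nodes, timeout_seconds, clock=time.time):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so timeouts can be tested without sleeping
        self._last_seen = {node: None for node in nodes}

    def heartbeat(self, node):
        """Record that a node reported in just now."""
        self._last_seen[node] = self._clock()

    def healthy_nodes(self):
        """Return the sorted names of nodes heard from within the timeout."""
        now = self._clock()
        return sorted(node for node, seen in self._last_seen.items()
                      if seen is not None and now - seen <= self._timeout)

    def healthy_fraction(self):
        """Fraction of known nodes that reported recently."""
        if not self._last_seen:
            return 0.0
        return len(self.healthy_nodes()) / len(self._last_seen)
```

A monitor created with HealthMonitor(["render-1", "cache-1"], timeout_seconds=30) would treat any node that has been silent for more than 30 seconds as unhealthy.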

To Be Continued

I'll continue the story of moving from mod_tile to our in-house system in follow-ups, where I'll try to get into technical details and explain our shortcomings and the issues that arose during development.

Stay tuned.
