The real problem is not threads as such; it is threads plus shared mutable state. To solve this problem, it's not necessary to throw away threads. It is sufficient to disallow mutable state shared between threads (mutable state local to one thread is still allowed).
...and Allan McInnes adds:
The "problem with threads" lies in the current approach to sharing state by default, and "pruning away nondeterminism" to get a correctly functioning system.
...and "dbfaken" adds:
Perhaps we should have strong syntax distinctions for mutation.
Since the first versions of Dejavu (my Python mediated-DB/ORM), I've noticed that this "pruning away nondeterminism" approach is exactly the wrong direction for systems which are designed to be thread-safe; we could instead explore languages and systems which allow us to "prune away determinism". By that I mean, mutable state should not be shared between threads by default; any mutable state which needs to be shared should be explicitly declared as such. This would make systems like Dejavu much simpler to create, use, and maintain.
I've often wondered what a "strong syntax distinction for [shared] mutation" would look like in Python. The simplest solution would probably have to:
- Make all class.__dict__'s immutable. This is a natural choice given the normal usage patterns of classes by developers in the wild: generally, a class exists to share methods between instances. There are valid use cases for classes which are mutable, but they are rare; perhaps a sentinel of some kind provided by object could re-enable mutability for classes, but it should be off by default.
- Make all module.__dict__'s immutable. This has already been suggested on python-dev (IIRC by GvR himself), although I believe it was suggested as a way to reduce monkeypatching.
- Provide a @shared annotation for explicitly declaring shared mutable data.
This is just one solution to a small set of use cases: threaded programs where the explicit shared state is small compared to the total lines of code. I haven't the experience to state whether such a model is inherently damaging to other concurrent needs and designs. It has the benefit, however, of having little impact on single-threaded programs.
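To make the idea a little more concrete, here is a rough library-level approximation in present-day Python. It is only a sketch: the names shared, ImmutableClass, and HitCounter are made up for illustration, and a real language feature would look different. The metaclass blocks rebinding of class attributes (it does not stop mutation of a mutable object stored in one), and the wrapper is the single place where shared mutable state is declared.

```python
import threading


class shared:
    """Hypothetical wrapper marking one value as deliberately shared and mutable."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def update(self, func):
        # Apply func to the current value while holding the lock, so that
        # read-modify-write sequences from different threads do not interleave.
        with self._lock:
            self._value = func(self._value)
            return self._value


class ImmutableClass(type):
    """Hypothetical metaclass: class attributes cannot be rebound or deleted."""

    def __setattr__(cls, name, value):
        raise AttributeError("%s is immutable; cannot set %r" % (cls.__name__, name))

    def __delattr__(cls, name):
        raise AttributeError("%s is immutable; cannot delete %r" % (cls.__name__, name))


class HitCounter(metaclass=ImmutableClass):
    hits = shared(0)  # the only shared mutable state, declared explicitly

    def record(self):
        return HitCounter.hits.update(lambda n: n + 1)
```

Instances stay freely mutable (the metaclass only guards the class object itself), which matches the "mutable state local to one thread is still allowed" half of the argument.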
Would such a feature help catapult Python into the "large systems" space?
All systems fail, and complex systems fail in a nearly infinite number of ways, some anticipated, many unanticipated. You could publish a large manual for how to deal with every anticipated failure, but for sufficiently-complex systems, the labor of writing such a manual far outweighs the benefit of having it. Heck, the labor of reading such a manual far outweighs the benefits. Even the labor of advertising the manual outweighs the benefits. And let's not forget version control, editing, publishing, distribution, recollection, authorization, errata, indexing, and a host of other system-management duties.
Take laptop overheating. Yes, it happens. Yes, damage is done. But the damage of creating a new system to usefully and efficiently communicate the dangers of laptop overheating to all laptop users in your company is probably far greater.
But people still try. And it takes a long time to explain the above. Wouldn't it be great if you could use a single short phrase to mean all that?
Here's my contribution to the world of Getting Things Done: the Fu Filter. Use it to imply that the issue in question is not worth addressing in any meaningful way, because to do so would be more trouble than it's worth. For example, you could tell someone that laptop overheating "doesn't pass the Fu Filter." Those of you with sufficient computing experience may wish to spell it "Foo Filter" in honor of all foo everywhere. Since "fu" can mean happiness (with the right tone), you can also think of this as the "Happiness Filter".
No, really. It rocks. Rocks, rocks, rocks.
Hey, you. Do you realize what you're writing? The long-standing IT joke is that you always end up coding your own job out of existence. But what are you coding yourself into?
- You're writing a framework that turns website creation into an assembly line. Do you really want to work on an assembly line?
- You're writing an API that wraps a well-understood common object model with a domain-specific language. Do you really want to be an expert on a language nobody else knows?
- You're writing a program that needs regular maintenance. Do you really want to clean software toilets for a living?
- You're writing a community tool with a moderator mode. Do you really want to be a bouncer for the rest of your life?
Nobody else does, either.
PyCon 2007 is nearing a close; here are some notes on how it affected CherryPy:
Web application deployment
Chad Whitacre (author of Aspen) herded several cats into a room on Sunday and forced us to discuss the various issues surrounding Python web application deployment. This is hinted at in the WSGI spec:
Finally, it should be mentioned that the current version of WSGI does not prescribe any particular mechanism for "deploying" an application for use with a web server or server gateway. At the present time, this is necessarily implementation-defined by the server or gateway. After a sufficient number of servers and frameworks have implemented WSGI to provide field experience with varying deployment requirements, it may make sense to create another PEP, describing a deployment standard for WSGI servers and application frameworks.
There were three basic realms where the participants agreed we could try to collaborate/standardize:
Process control: stop, start, restart, daemonization, signal handling, socket re-use, drop privileges, etc. If you're familiar with CherryPy 3, you'll recognize this list as 95% of the current cherrypy.engine object. The CherryPy team has already been discussing ways of breaking up the Engine object; this may facilitate that (and vice-versa). Joseph Tate volunteered to look at socket re-use issues specifically, but the general consensus seemed to be that much of this would be hashed out on Web-SIG.
WSGI stack composition: Jim Fulton proposed that we could all agree on Paste Deploy (at least a good portion of the API) to manage this in a cross-framework manner. Most heads nodded, "yes". Jim also proposed that each of the framework authors take the next week to refamiliarize themselves with Deploy, and then start pestering Ian Bicking with specific API issues. Ian suggested that he should fork Paste Deploy into another project specifically for this. For CherryPy, this would first mean offering standard egg entry points. [Personally, I'd like to standardize on a pure-Python API for deploy, not a config file format API. In other words, make the config file format optional, so that users of CP-only apps could avoid having to learn a distinct config file format for deployment. It should be possible to transform various config file formats into the same Python object(s).]
Benchmarks: Jim also suggested we create a standard WSGI HTTP server benchmark suite, with various test applications and concurrency scenarios. This would compare various WSGI HTTP servers, as opposed to CherryPy's existing benchmark suite which compares successive versions of the full CP stack. Ian volunteered to begin work on that project (with the expectation that others would contribute substantial use cases, etc).
Others who were present for at least a portion of the long discussion: me, Mark Ramm, Kevin Dangoor, Ben Bangert, Jonathan Ellis, Matt Good, Brian Beck, and Calvin Hendryx-Parker.
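As a rough illustration of the pure-Python deployment idea in my bracketed aside above, the sketch below composes a WSGI stack from plain Python objects and then builds the same stack from a dict, which could have been parsed from any config file format. Everything here (hello_app, add_header, compose, from_config, the registry) is hypothetical and is not the Paste Deploy API.

```python
def hello_app(environ, start_response):
    """A trivial WSGI application used as the innermost layer."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def add_header(name, value):
    """Middleware factory: returns a function that wraps a WSGI app."""
    def wrap(app):
        def middleware(environ, start_response):
            def patched_start_response(status, headers, exc_info=None):
                # Append one response header, then defer to the real callable.
                return start_response(status, headers + [(name, value)], exc_info)
            return app(environ, patched_start_response)
        return middleware
    return wrap


def compose(app, *wrappers):
    """Wrap app so that the first wrapper listed ends up outermost."""
    for wrap in reversed(wrappers):
        app = wrap(app)
    return app


def from_config(config, registry):
    """Build the same kind of stack from a plain dict (from any file format)."""
    app = registry[config["app"]]
    wrappers = [registry[name](**kwargs)
                for name, kwargs in config.get("middleware", [])]
    return compose(app, *wrappers)
```

With this shape, a config file becomes one optional front end among several: compose(hello_app, add_header("X-Served-By", "sketch")) and a from_config call with the equivalent dict produce stacks that behave the same.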
WSGI middleware authoring
After some discussion with Mark (and he with Ian and Ben), we agreed that CherryPy could do more in the WSGI-middleware-authoring department. There is continuous pressure to simply re-use or fix up the existing CherryPy request object to fill this need; however, there are some fundamental problems with that approach (such as the use of threadlocals to manage context, and the difficulty of streaming WSGI output through a CherryPy app). At the moment, I'm leaning toward adding a new API to CherryPy which would be similar to the application API, but specifically targeted at middleware authoring.
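For reference, here is what the two sticking points look like in plain WSGI, outside of any framework. This is a minimal sketch, not a proposed CherryPy API: the class name and the "sketch.filters" environ key are invented. Context travels explicitly in environ rather than in a threadlocal, and the body is transformed chunk by chunk instead of being buffered.

```python
class UppercaseMiddleware:
    """Hypothetical streaming middleware: uppercases each body chunk lazily."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Per-request context lives in environ, not in a threadlocal.
        environ.setdefault("sketch.filters", []).append("upper")
        result = self.app(environ, start_response)
        return self._stream(result)

    def _stream(self, result):
        try:
            for chunk in result:
                # Each chunk is passed on as soon as the wrapped app yields it.
                yield chunk.upper()
        finally:
            # WSGI requires calling close() on the wrapped iterable if present.
            close = getattr(result, "close", None)
            if close is not None:
                close()
```

Because bytes.upper() leaves the length of the body unchanged, this particular filter does not need to touch Content-Length; a filter that changes the body size would have to adjust or drop that header.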
What should have been 7 HTTP requests is now 81, and what's worse is that all of the feedburner responses are 200's. This is no way to run an Internet. At the least, feedburner, please do the fancy webhit dance for only 1 of the 3 gifs for each entry in the feed.
I have created a new blog and moved it. If you have this bookmarked or are subscribed to the feed, please replace that info with the new site.
I got an uncompleted bit of spam in my inbox today. Here's the end of the headers for fun:
Received: from 192.168.0.%RND_DIGIT (203-219-%DIGSTAT2-%STATDIG.%RND_FROM_DOMAIN [203.219.%DIGSTAT2.%STATDIG]) by mail%SINGSTAT.%RND_FROM_DOMAIN (envelope-from %FROM_EMAIL) (8.13.6/8.13.6) with SMTP id %STATWORD for <%TO_EMAIL>; %CURRENT_DATE_TIME
Message-Id: <%RND_DIGIT.%STATWORD@mail%SINGSTAT.%RND_FROM_DOMAIN>
From: "%FROM_NAME" <%FROM_EMAIL>
Bcc:
Date: Tue, 23 Jan 2007 14:08:41 -0800
X-pstn-levels: (S: 1.07668/99.82653 R:95.9108 P:95.9108 M:97.0282 C:98.6951 )
X-pstn-settings: 3 (1.0000:1.0000) s gt3 gt2 gt1 r p m c
X-pstn-addresses: from <email@example.com> [81/4]
Return-Path: email@example.com
X-OriginalArrivalTime: 23 Jan 2007 23:02:12.0495 (UTC) FILETIME=[8464F9F0:01C73F42]
I was reading through a photography forum with an emphasis on tattoo photos. The thread asked people to post pictures of their tattoos and the stories or explanations behind them. As I was reading some of the entries I came across this one.
I had wanted one since I was a kid in the early 80s, no doubts about it. Our friend owned a
studio in NY and his girlfriend (with her purple hair and facial piercings and all her ink) was
the prettiest thing I'd ever seen when I was 5. I loved that she had a purple mohawk and all
those gorgeous tattoos. I was hooked then.
I turned 18 and moved right in with the guy I ended up marrying, and something always
prevented us from going and getting one done for me. It just didn't happen.
January 12, 2006, my husband died in his sleep at the age of 34 from a rare disease,
Marfan's. I went on the one year anniversary for a tattoo for him. I wear the scars from his
death, our relationship, and our life together with or without a tattoo on my arm, but I may as
well put it out there that my life has been changed by another human being, that's why I have
this birthday and date of death as the main theme.
How random, and depressing. When I read that the man's age was 34 and that he died from a "rare disease", I didn't need to know that the next word in the sentence was Marfan's; I already knew.
Reading Barry Warsaw's recent use of SQLAlchemy, I'm reminded once again of how ugly I find SQLAlchemy's PickleType and SQLObject's PickleCol concepts. I have nothing against the concept of pickle itself, mind you, but I do have an issue with implementation layer names leaking into application code.
The existence of a PickleType (and BlobType, etc.) means that the application developer needs to think in terms of database types. This adds another mental model to the user's (my) tiny brain, one which is unnecessary. It constantly places the burden on the developer to map Python types to database types.
For Dejavu, I started in the opposite direction, and decided that object properties would be declared in terms of Python types, not database types. When you write a new Unit class, you even pass the actual type (such as unicode) to the property constructor instead of a type name! Instead of separate classes for each type, there is only a single UnitProperty class. This frees programmers from having to map types in their code (and therefore in their heads); it removes an entire mental model (DB types) at coding time, and allows the programmer to remain in the Python flow.
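The general shape of "one property class, parameterized by a Python type" can be sketched with a descriptor. This is not Dejavu's actual code: TypedProperty and Animal are hypothetical names, and str stands in for unicode so that the example runs on Python 3.

```python
class TypedProperty:
    """Hypothetical single property class, declared with a real Python type."""

    def __init__(self, pytype, default=None):
        self.pytype = pytype
        self.default = default

    def __set_name__(self, owner, name):
        # Remember the attribute name this property was assigned to.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class-level access returns the property itself
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # Coerce to the declared Python type; no database type is mentioned.
        if value is not None and not isinstance(value, self.pytype):
            value = self.pytype(value)
        obj.__dict__[self.name] = value


class Animal:
    name = TypedProperty(str)
    legs = TypedProperty(int, default=4)
```

The model author only ever writes str and int; choosing a column type for each is left to a storage layer that can read pytype off the property.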
However, the first versions of Dejavu went too far in this approach, mostly due to the fact that Dejavu started from the "no legacy" side of ORM development; that is, it assumed your Python model would always create the database. This allowed Dejavu to choose appropriate database types for the declared Python types, but meant that existing applications (with existing data) were difficult to port to Dejavu, because the type-adaptation machinery had no way to recognize and handle database types other than those Dejavu preferred to create.
Dejavu 1.5 (soon to be released) corrects this by allowing true "M x N" type adaptation. What this means is that you can continue to directly use Python types in your model, but you also gain complete control over the database types. The built-in type adapters understand many more (Python type <-> DB type) adaptation pairs, now, but you also have the power to add your own. In addition, Dejavu now has DB type-introspection capabilities—the database types will be discovered for you, and appropriate adapters used on the fly. [...and Dejavu now allows you to automatically create Python models from existing databases.]
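One way to picture "M x N" adaptation is a registry keyed on (Python type, DB type) pairs, so that any Python type can be stored in any column type for which a pair has been registered. The sketch below is an illustration under that assumption only; AdapterRegistry, its methods, and the DB type names are hypothetical and do not reflect Dejavu's real adapter classes.

```python
import datetime


class AdapterRegistry:
    """Hypothetical registry of (Python type, DB type) -> converter pairs."""

    def __init__(self):
        self._pairs = {}

    def register(self, pytype, dbtype, to_db, from_db):
        self._pairs[(pytype, dbtype.upper())] = (to_db, from_db)

    def _lookup(self, pytype, dbtype):
        try:
            return self._pairs[(pytype, dbtype.upper())]
        except KeyError:
            raise TypeError("no adapter for %s <-> %s" % (pytype.__name__, dbtype))

    def to_db(self, value, dbtype):
        return self._lookup(type(value), dbtype)[0](value)

    def from_db(self, raw, pytype, dbtype):
        return self._lookup(pytype, dbtype)[1](raw)


adapters = AdapterRegistry()
# The same Python type can map to several DB types, and vice versa.
adapters.register(bool, "INTEGER", int, bool)
adapters.register(bool, "CHAR", lambda v: "Y" if v else "N", lambda raw: raw == "Y")
adapters.register(datetime.date, "VARCHAR",
                  datetime.date.isoformat, datetime.date.fromisoformat)
adapters.register(datetime.date, "INTEGER",
                  datetime.date.toordinal, datetime.date.fromordinal)
```

With introspection, the dbtype argument would come from the discovered column type rather than from the model, so the model keeps declaring plain Python types while a legacy CHAR or INTEGER column still round-trips.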
In short, it is possible to have an ORM with abstractions that don't leak (at least not on a regular basis; the construction of a custom adapter requires some thought).