Ed Crewe: 2009

Tuesday 1 December 2009

A potted history of open source software

This post provides some background history for the post about open source on the Internet development blog...

History

The open source approach has been around since the 1950s with IBM's operating systems and the 'SHARE' network. The internet was developed at the end of the 60s via an open source / standards approach, with commercial usage and software starting in 1988. The following year the web was started as another open source project at CERN (recently in the news for its Large Hadron Collider). Due to these origins, internet software has always been dominated by open source, which accounts for the majority of web servers, the protocols etc.

The 90s saw the heyday of commercial software challenging this position, driven by its total dominance in desktop operating systems (OS) and documents. Whilst commercial software and web sites helped drive the explosion in internet usage in the 90s, they also introduced a proliferation of closed systems and a move away from common open standards due to commercial interests. Hence the founding of the W3C and more official labelling of the 'open source' movement to counteract these trends. One strand of this was the targeting of open source OS, eg. Linux, as alternatives to commercial desktop software and OS, eg. Windows.

With the arrival of Google's Linux based OS, Chrome, next year, things may yet change significantly in this market, although perhaps more because the remorseless drop in hardware costs puts increasing pressure on the cost of a closed commercial OS. Apple was the other main commercial OS provider and it chose to switch to an open source based Unix OS, Darwin, in 2000, to mitigate these costs by leveraging open source development. For example its browser Safari uses WebKit, which was based on Linux software (KDE's KHTML), as does Chrome.

Present and future?

Now with the rise of the service driven charging model, cloud computing and the movement of documents to the web via web 2.0 apps, Google docs and the forthcoming Microsoft Web Office, things have moved on. As for the two iconic companies that have come to represent open vs closed source commercial interests, Google will soon match Microsoft financially with advertising replacing software revenue. Google's vision sees all hardware as always connected to the internet, hence all content resides in the service provider's global server networks (ie the 'cloud'). A desktop OS is a browser and local documents go the way of the floppy disk.

Having said that, the web itself was invented to replace locally stored bespoke documents, and it has taken 20 years for authorship to the web to match the level of authorship of local documents.

The cloud means that the goal posts have shifted for the open source movement. Commercial software will never dominate the internet and many believe it will increasingly lose its dominance elsewhere. So it is no longer seen as the main issue. The focus is now on challenging what companies like Google are doing in terms of holding content. Hence some founders of the open source movement reject the cloud out of hand, whilst others think open source should now be about enabling open cloud provision. The question remains how to do this since most cloud infrastructure already runs on open source, but the infrastructure provider still retains control of the content.

Thursday 12 November 2009

Getting up to speed with django

I finally tracked down why my experience of django has been really slow, when supposedly one of its qualities is speed. It is meant to be faster than a lot of LAMP frameworks, eg. Symfony, and also Ruby on Rails, so certainly better than a non-rdbms centric platform like zope/plone - yet I was getting really terrible performance.

Initially with a standard Apache set up I was getting 10-20 secs/req ! ... gak unusable what was wrong?
Simply adding KeepAlive Off to the apache conf had a radical effect. Now I was getting 3-4 secs/req.
So pretty terrible but not totally unusable. Things were a bit better under mod_wsgi so 2-3 secs/req.

However the real nub of the issue was Oracle: the backend was not using pooled connections, so each page was renegotiating an Oracle connection. With our distributed Oracle server network that could take 2-3 secs (ie around 90% of the request time).

I investigated pyorapool and it doubled performance, but had issues and it seemed overkill to run something more suited to pooling connections across servers.

So I went back to a post about a pooled oracle backend by Taras Halturin, which I initially couldn't get working. Having fixed it for current django 1.1.1 I have packaged it up in our repo as ilrtdjango.oracle_pool. I guess if I add some extra features to it, such as logging etc, I should contact Taras about punting it up on pypi.

This really deals with the problem by wrapping up cx_Oracle's own connection pooling. Now I am seeing 0.4 secs/req on average, with cached pages down to 0.1 secs.
There is a lot more that could be done with better fragment caching and use of memcache rather than a simple file system backend. But for the moment I am happy that django really is fast, ie. the framework can do 10 req/sec out of the box. With good queries and materialised views for remote databases, using a pooled backend a few Oracle queries should at most halve that speed, so a 5 req/sec uncached baseline.
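
To make the pooling idea concrete, here is a minimal sketch in modern Python 3. It is not the ilrtdjango.oracle_pool code and it does not use cx_Oracle's own pool, which is what the real backend wraps. The ConnectionPool class and the fake_connect function are names made up for illustration. The point is only that connections are opened once and then lent out, so a request never pays the 2-3 secs connection cost.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Tiny illustrative pool: open connections once, then lend them out."""

    def __init__(self, connect, size=2):
        # `connect` is any zero-argument callable returning a connection,
        # e.g. a wrapper around the database driver's connect function.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Wait for a free connection rather than opening a new one.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Hand the connection back for the next request to reuse.
            self._idle.put(conn)


# Hypothetical stand-in for a slow database connect call.
opened = []


def fake_connect():
    opened.append(object())
    return opened[-1]


pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):  # ten "requests"
    with pool.connection() as conn:
        pass  # queries would run here

print(len(opened))  # number of connections opened for ten requests
```

Ten requests share the two connections opened at start up, which is the whole saving.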

This is a much better performance point to work from for database applications, and if I need the sort of 50 req/sec performance of a high traffic site then I have all the caching headroom to use (e.g. memcache or a web cache frontend).
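
The secs/req and req/sec figures above are just reciprocals of each other, if you assume a single worker handling one request at a time (my simplifying assumption here, real servers run several). A quick sketch of the arithmetic, using the low end of each timing range quoted in this post and a made up helper name:

```python
def requests_per_second(seconds_per_request, workers=1):
    """Throughput if each worker handles one request at a time."""
    return workers / seconds_per_request


# Timings quoted in this post (low end of each range), in secs/req.
timings = {
    "standard Apache": 10.0,
    "KeepAlive Off": 3.0,
    "mod_wsgi": 2.0,
    "pooled Oracle backend": 0.4,
    "cached page": 0.1,
}
for label, secs in timings.items():
    print(f"{label}: {requests_per_second(secs):.1f} req/sec")
```

On that reading the 10 req/sec figure corresponds to the 0.1 secs cached pages, and the 0.4 secs average is 2.5 req/sec.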

Sunday 1 November 2009

Plone conference keynote

This is an attempt to paraphrase Alexander Limi's keynote speech, rather than reflect any of my own views (added as comments instead).

Intro

Good to see two new plone books are out, e.g. Erik Rose's Plone 3 for Education - and contrary to Google trends, usage is still increasing.

plone 4/5 roadmap:

native blobs in 4

performance should be twice as fast as 3 -
largely due to Chameleon and its compiled ZPT template acceleration (also used by Pylons for Genshi)

code base should be 30% smaller

deliverance - more into the core

dexterity - to replace archetypes but will still have AT in 5

deco - UI layout design framework using tiles to replace viewlets and portlets. All of the page elements can be rearranged by drag and drop
- so edit mode becomes a design mode too! (Live demo of deco drag and drop layout followed)

Issues about promoting plone

Can't compete against the cost and simplicity of PHP CMS for web page publishing - so sees the goal as the high end of the market - enterprise collaborative CMS, easy to extend with simple apps.

Hence the competition is Alfresco, RedDot, Sharepoint etc - Plone community need to create standard content importers from these.

Fix plone.org to drive plone again. More user groups, etc.

Tuesday 15 September 2009

University CMS away day

We had an away day (at home!) on the 11th September to discuss the next version of the CMS.
It is documented in detail on ILRT pypi.
This involved analysing a survey of users of the system and getting input from other parts of the University (mainly IS) about what we should do next time.
The main conclusions seemed to be:

We had assumed we would stick with Plone, but this may need revisiting since centrally Java based or commercial solutions were mooted. Plone was chosen last time by central IS based on its use of zope. Now that linkage is lessening, and zope is struggling as a platform amongst all the newer python ones. However the added costs of anything else may be too much.

Simplicity should be key - use of a CMS has seen no significant increase in web editors because of complexity - we need to radically simplify editing for occasional users

On the other hand all existing features must be retained - hence regular users need to be able to turn on more complex editing features

The CMS needs a deployment framework to push content, chrome and system info to other systems

Non CMS tasks such as database applications should be moved outside of Plone (but be integrated)

Monday 10 August 2009

Bristol Balloon Sprint

Sprinting in Bristol

Team Rubber with a contribution from Netsight hosted the Sprint under the auspices of Matt Wilkes (who will soon be finishing there and becoming an independent developer).

On an unusually hot and sunny weekend, I spent three days stuck inside tapping away on a computer, surrounded by plates of half eaten food and glasses of half drunk beer. This was the Bristol Balloon Sprint, a gathering of independent developers who volunteer their time to help contribute towards developing the Plone open source CMS.

There were attendees from Germany and Scandinavia as well as a gathering of a number of representatives of Bristol's large Plone community that help make it the UK Plone capital. The European attendees included the release manager for Plone 4, its next incarnation. We mainly worked on progressing Plone improvement proposals and fixing the automated tests that facilitate software quality assurance. Along with some work on the popular web form building addition for Plone.

Coding

The sprint was focussed on Plone 4 along with some PloneFormGen (PFG) work. I tried my hand at the former for the first day, working with Andreas Zeidler on plone.app.blob, fixing tests for plone 4.
This package unifies blob handling for images and files and adds native file system storage for them to plone. It was useful experience in developing / testing for plone 4 but I felt I wasn't getting a lot done.

Hence the next couple of days were spent trying out creating a save database adapter for PFG. There was some progress with a ploneformgen.sqlizer egg that creates tables based on PFG forms and generates CRUD sql handlers. The work was partly because Netsight were interested in this, and the rest of the PFG team worked on adding optional form inputs to PFG. My interest was in trying this out as a possible future for the UOBCMS DBI document solution, moving it to a PFG based one, since PFG has quite an intuitive interface for designing web forms that end users should be much more comfortable with than DBI.
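
As a rough illustration of the form-to-table idea (not the actual ploneformgen.sqlizer code, which works from real PFG form objects), here is a sketch in modern Python 3 using sqlite3. The field type names, the SQL_TYPES mapping and both helper functions are hypothetical, and the names are not sanitised as real code would need to do.

```python
import sqlite3

# Hypothetical mapping from form field types to SQL column types.
SQL_TYPES = {"string": "TEXT", "integer": "INTEGER", "boolean": "INTEGER"}


def create_table_sql(form_name, fields):
    """Build a CREATE TABLE statement from (field_name, field_type) pairs."""
    columns = ["id INTEGER PRIMARY KEY"]
    for name, field_type in fields:
        columns.append(f'"{name}" {SQL_TYPES[field_type]}')
    return f'CREATE TABLE "{form_name}" ({", ".join(columns)})'


def insert_sql(form_name, fields):
    """Build a parameterised INSERT (the 'C' of CRUD) for the same form."""
    names = ", ".join(f'"{name}"' for name, _ in fields)
    marks = ", ".join("?" for _ in fields)
    return f'INSERT INTO "{form_name}" ({names}) VALUES ({marks})'


# An imaginary contact form with three inputs.
fields = [("fullname", "string"), ("age", "integer"), ("subscribe", "boolean")]

db = sqlite3.connect(":memory:")
db.execute(create_table_sql("contact", fields))
db.execute(insert_sql("contact", fields), ("Ada", 36, 1))
rows = db.execute("SELECT fullname, age, subscribe FROM contact").fetchall()
print(rows)
```

The read, update and delete handlers would be generated from the same field list in the same way.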

Plone Futures

From what I gathered regarding the future of Plone, it looks healthy (unlike zope's!), if confusing.
Hanno Schlichting is Plone 4's release manager and is aiming to get it released by the end of the year. The main drive is refactoring the underlying code base and simplifying it, so the code base is now a lot smaller and runs on the fully eggified zope 2.12. There may also be a new skin, ie admin interface (perhaps relevant to CofE issue), although it seems as though there is little work on that yet.
There has been a phasing of new versions of plone into 4 and 5. With a fairly movable set of deliverables for each depending on who gets stuff done by when.

Broadly there are a number of things happening wrt. the whole platform underneath plone and its long term future.

In general everything is eggified and hence developers are now tending to pick and mix elements from all the python frameworks a lot more.
This means everything is going to use WSGI, and Plone is moving away from being 'a zope application'
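
For anyone who has not met WSGI, the whole interface is one callable that takes the request environment and a start_response function, and returns the body as an iterable of bytes. A minimal sketch in modern Python 3 (nothing here is Plone or repoze code, the app and the stub are just for illustration):

```python
def application(environ, start_response):
    """Minimal WSGI app: takes the request environ, returns body chunks."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Call it the way a WSGI server would, with a stub start_response.
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers


result = application({"PATH_INFO": "/plone"}, start_response)
print(captured["status"], b"".join(result))
```

Because every framework can expose that same callable, servers and middleware become interchangeable, which is what makes the pick and mix approach possible.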

Also it looks as though Zope 3 is dead - ie. the zope 3 publisher, however the Five elements of zope 2 are likely to become a standard
set of additions packaged as the zope framework 1. This would sit on zope2 that still seems to be plodding on.

Also all the ZCML stuff hasn't proved that popular and people are mooting a simplified Plone specific config language.

Repoze looks to be gaining ground, so Plone may well move to being as commonly deployed on
repoze / WSGI (ie Apache) as on zope.

On the front end skin level deliverance seems to be gaining ground, with its capability of mixing together different applications' chrome.
Also KSS is much reduced and only enabled for the edit interface for plone 4 onwards.

Archetypes are being replaced by dexterity - but probably not until plone 5.

My personal take on this is dead archetypes and KSS - hurray.
Dead zope altogether also hurray, but zope3 dying and stepping back to rebadged zope2/Five for a long time until Plone is a pure WSGI / ZODB app bad, bad, bad.
The idea of all the ZCML stuff being changed, and a more pure java-like component architecture not being a Plone goal, is also pretty bad in my book, since a lot of my inspiration for getting more into Plone recently was the promise that zope2 was soon to be ditched. I guess however that I can just readjust my view towards repoze and the idea that zope as a whole will largely be ditched eventually.

Tuesday 14 July 2009

Check out ILRT PyPi

ILRT now has its own python code repository and general documentation server at http://pypi.ilrt.bris.ac.uk

So some of the more technical posts from this blog have been moved to the HowTos there.

Any ILRT python code should have its documentation packaged with it. This is now checked out from svn to an eggserver folder on devbox and punted up into web pages via a cron job.
I will look at adding the dump of these to the windows file share as well, so code specs etc. can be tied to release tags and round trip within the code to svn to eggs to the repo web pages to windows share text docs (editable in Word or whatever).

Along with that the repo can hold 'manuals' eg. code club presentations etc. and other bits and pieces. Plus FAQs are a useful place to add any quick hints or tips - certainly if you ever waste an hour working out some undocumented thing that in itself only takes 5 minutes, please always add it as a FAQ.

Sorry the actual server is a bit bare plone, it could probably do with being made more like http://www.coactivate.org (a free basecamp stylee plone used for open source project planning)
Or possibly use sphinx, or one of the more documentation centric python tools?

NB: The server is only accessible within the University. Internet development team members can log in via their standard zope accounts if anyone has the urge to edit anything (hopefully!)

Tuesday 17 March 2009

Reading anyone?

After springing a few hundred books into the skip today, I thought it makes sense to add reviews of books that we have just read.
So that others in the team can judge whether they are worth a gander. Over the last 9 months I have read the following work books.

Friday 30 January 2009

My january laying season is over

Ok just sneaked in one more egg to pypi before the end of the month.

ilrt.migrationtool

... bringing ILRT up to a grand total of four packages in pypi.

I doubt I am likely to have the space (or inclination) to do three eggs in one month again ... unless we get some really full time UOBCMS migration funding.

This particular egg adds the controlled release functionality that was developed in a more specific manner by Dom for the UOBCMS. But this time it can be used for any site, where the migrations are added to the site's theme egg. In addition it is all rewritten the zope 3 way, and bundles a sub-tool that migrates content's workflow.

All required for changing the workflow used by the production ECU site.

... OK should be finished with plone for a little while (apart from plone2/3 site support) ... time to do a bit of django ...

Monday 19 January 2009

... egg number 3

Yep put out another one ... this one gives roughly the same workflow as we have used in the past with BaseCMS and UOBCMS ...
http://pypi.python.org/pypi/ilrt.formalworkflow

NB: learnt a couple more lessons ... added to instructions below ...
1. remember to register from a fresh svn checkout (don't want .pyc files from an instance)
2. use python2.4 explicitly, since plain python (when it is a pre-2.4 interpreter) breaks when it encounters decorators
3. make sure tag_svn_revision is set to false
4. your download counter gets reset every time you replace a file, even if they are called the same ... so try not to replace files, except if there is a glitch when testing the install from release - at release time ... this is good practice anyhow, since even minor textual changes mean it's not really the same release.
... however that doesn't mean you can't always re-edit documentation ... just do
python2.4 setup.py egg_info
python2.4 setup.py register
5. Finally remember you might as well edit setup.py so that it pumps in README, HISTORY and TODO into the long description field.
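
For point 5, a minimal sketch of what I mean, written for modern Python 3. The helper name is made up and TODO.txt is an assumed file name; missing files are simply skipped.

```python
import os


def read_long_description(base_dir, names=("README.txt", "HISTORY.txt", "TODO.txt")):
    """Join the text files that exist into one long description string."""
    parts = []
    for name in names:
        path = os.path.join(base_dir, name)
        if os.path.exists(path):
            with open(path) as handle:
                parts.append(handle.read().strip())
    return "\n\n".join(parts)


# In setup.py the result would be passed to setup() as the
# long_description keyword argument, for example:
#     long_description=read_long_description(os.path.dirname(__file__))
```

That way the pypi page is regenerated from the same text files every time you register, rather than being edited by hand.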

Thursday 8 January 2009

Another ILRT egg released

Finally got around to finishing and releasing the egg for generating content / plone sites matching different profiles - e.g. intranet, public etc.

http://pypi.python.org/pypi/collective.contentgenerator ... our second cheeseshop / plone.org release and the first that I acted as release manager for.

Technical gotchas regarding releasing code

Do all your metadata in the right places first then use setup tools to regenerate the metadata files.

So my recommendation is to start at the top with setup.cfg, which should have the release tag changed from dev to nothing and the svn revision tag from true to false ... or else you get _rSVNnumber appended to your release versions.
Next setup.py add your metadata here (classifiers can be found on the pypi form dropdown) ... then make sure the version you specify in setup.py is copied to the one in egg.name/version.txt
Next get your final text ready in README.txt and HISTORY.txt (using restructured text)
Now run
> python2.4 setup.py egg_info
this regenerates your.eggname.egg-info/PKG-INFO and the other metadata files. So you can check them before uploading.
Finally you could upload PKG-INFO to pypi by hand but it is easier to do it all in one fell swoop via setuptools register.

> python2.4 setup.py register sdist bdist_egg upload

(the sdist makes a source tarball and the bdist_egg an egg)
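
Since the setup.cfg step is the one that is easiest to forget, here is a small pre-release check sketched in modern Python 3. It assumes the usual [egg_info] section with tag_build and tag_svn_revision options; the function name and the two sample configs are made up for illustration.

```python
import configparser


def release_problems(setup_cfg_text):
    """Return reasons why this setup.cfg would not give a clean release version."""
    parser = configparser.ConfigParser()
    parser.read_string(setup_cfg_text)
    problems = []
    if parser.has_section("egg_info"):
        if parser.get("egg_info", "tag_build", fallback="").strip():
            problems.append("tag_build is set, so a suffix such as dev is appended")
        if parser.getboolean("egg_info", "tag_svn_revision", fallback=False):
            problems.append("tag_svn_revision is true, so _rSVNnumber is appended")
    return problems


# Sample contents: a development setup.cfg and one ready for release.
dev_cfg = "[egg_info]\ntag_build = dev\ntag_svn_revision = true\n"
release_cfg = "[egg_info]\ntag_build =\ntag_svn_revision = false\n"

print(release_problems(dev_cfg))
print(release_problems(release_cfg))
```

An empty list means the version in setup.py will be used as is.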

If you really need to do final tweakings ... via the pypi web form ... you can do so but you will then have to go to the generated PKG-INFO link on pypi and copy and paste the results back into README.txt etc., rerun egg_info and re-upload your egg tarball ... I did all that at first and then realised that was a really bad idea!

Lastly all you need to do is go through the whole process again via different web forms on plone.org ;-)
... though I think this is being addressed?

* Oh and one final bugbear ... where do you set platform in the metadata?

Ed Crewe