When open source goes closed
by Jonathan
I currently have a need for a forum system for a couple of small-scale clients, and my first thought was to look at Jive Forums. I used it for a big project a few years ago and was very impressed not just with the feature list, but with the clean way it was written (they give you almost all the source code, making it easy to extend, which was one of the key factors in choosing the product at the time). It's pretty well commented and full of textbook use of design patterns. It's all the more impressive because it started life as an open-source project.
Looking into the current offerings from Jive, I was disappointed to discover that they have discontinued all but one version of Forums, the one that used to be the Enterprise edition. It's priced at about US$25,000 for the first CPU. As one of my projects is a start-up and the other a charity, this is far, far beyond what they can afford (the charity would get a 20% discount, but $20,000 is still an indefensible cost for a charity when there are plenty of open-source choices around; I'm now looking at JForum).
I asked my sales rep what had happened to the cheaper versions, and he confirmed that Jive is only interested (to paraphrase) in the mega-corporate market nowadays.
I've long been an admirer of Jive Software (and still am), due to their product quality, coding standards and friendly attitude. (Full disclosure: they once sent me a free T-shirt years ago.) It's always struck me as a well-run business and I'm sure their decision makes good financial sense for them. But I can't help finding it rather sad that, in going from open source to high-end market leader, they are now ignoring the low-end markets that helped them on the way: the wider community, the word-of-mouth recommendation that helps drive adoption, and so on.
The model of open-source software with the option of commercial support is currently very widespread, but what happens if more companies take their open-source products and develop them into closed-source offerings? If Jive Software can be so successful, others will surely follow. The moral, I suppose, is always to download a copy of the sources while you still can.
Rapid web development without the RDBMS
by Steve
When I first came across Ruby on Rails, I was sceptical. This was not because of the use of Ruby (although I have some concerns, which I have mentioned in previous posts here), but mainly because of the Active Record pattern. Why take a fully object-oriented language like Ruby and define the data model relationally? It seems to me that the developer is losing much in terms of ease of development and flexibility by using such a thin layer over relational stores and having to express relationships between classes in relational terms. Maybe this effort would be justified if the relational database was shared with other applications, but my impression is that this isn't the target of RoR.
There are other ways of using relational databases in combination with OOP development that allow the data model to be expressed more fully in terms of objects rather than relationally, such as ORM. We develop using JDO and JPA, with Kodo and OpenJPA, for this reason. There are alternatives to ActiveRecord in Ruby (such as Og). There are various noteworthy projects in Python (e.g. Dejavu) and Perl (e.g. RDBO). For RoR-style development on the JVM there is Grails, based around Groovy. But there is still this focus on relational stores. I do wonder what the point of that is for small, one-off web projects.
I started to think about this again when I took another look (after several years) at db4o, a pure object store for Java and .NET. It did not take me long to realise the advantages for small projects. Java code to open a store and save an object consists of only a few lines, and there is no need for mapping or any kind of "instrumentation" of classes, at least not just to get going or for small applications.
I decided to try this using Groovy. It was just as easy:
import com.db4o.Db4o

class AClass {
    def id
    def name
    String toString() { return "$id:$name" }
}

// Connect (this is simple file access; db4o can also work multi-user)
def db = Db4o.openFile("data.db")
try {
    // Store
    db.set(new AClass(id: 1, name: "Jon"))
    db.set(new AClass(id: 2, name: "Steve"))
    // Retrieve
    db.query(AClass.class).each { println it }
    // Delete all instances
    db.query(AClass.class).each { db.delete(it) }
} finally {
    // Always release the database file, even if something above fails
    db.close()
}
The incredible simplicity of this hit me immediately. Suddenly all those paragraphs of instructions about how to set up different databases for RoR, and all the rules of how to configure different types of relational connections between classes, looked very tedious, especially as Rails was supposed to be the great time-saver.
So, I am going to start a long-term project that I will report on here. The aim is to see how simple Grails could be with an object database: without the use of Hibernate or JPA, without ORM at all. This is purely a spare-time project, but I think it could be interesting. Even if it doesn't work, it would be good to know why. I decided to start with Groovy for this because it integrates very cleanly with Java libraries, such as the db4o implementation for the JVM, and Grails is pretty stable.
I will take a look at other languages later.
Object-Relational Mapping (ORM)
by Steve
A friend wondered what I thought about this article about ORM: http://friends.newtelligence.net/clemensv/PermaLink,guid,92eaea8c-778d-4512-af03-d332785f65f5.aspx
In response, I am going to list some of these issues and discuss them in the context of a modern object persistence API standard, JDO 2.0, which was finally approved by the Java Community Process on 13th March 2006.
Object identity is difficult to manage with ORM, as not everything is handled in the database, and this can lead to concurrency problems.
In JDO, identity is handled by registering an object with a PersistenceManager (PM). Concurrency issues can be avoided either by telling the PM to use the database to generate identity itself, or via one of several very mature clustering solutions such as Coherence. These are simply configuration options and do not have an impact on the developer.
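To make the identity point a little more concrete, here is a minimal sketch in plain Java. It is not the JDO API: the names Datastore, Manager, makePersistent and getById below are hypothetical stand-ins chosen for illustration. It assumes identity is generated by the shared store (as a database sequence would do), so two managers working against the same store never hand out the same identity, and each manager returns the same in-memory instance for a given identity.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class IdentitySketch {
    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for store-generated identity, e.g. a database sequence. */
    static final class Datastore {
        private final AtomicLong sequence = new AtomicLong();
        long nextId() { return sequence.incrementAndGet(); }
    }

    /** Hypothetical manager: registers objects and keeps one instance per identity. */
    static final class Manager {
        private final Datastore store;
        private final Map<Long, Object> byId = new HashMap<>();
        private final Map<Object, Long> ids = new IdentityHashMap<>();

        Manager(Datastore store) { this.store = store; }

        /** Registers the object; the identity comes from the store, not from this manager. */
        long makePersistent(Object o) {
            Long existing = ids.get(o);
            if (existing != null) {
                return existing; // already registered: identity is stable
            }
            long id = store.nextId();
            ids.put(o, id);
            byId.put(id, o);
            return id;
        }

        /** Returns the single instance this manager holds for the identity, or null. */
        Object getById(long id) { return byId.get(id); }
    }

    public static void main(String[] args) {
        Datastore store = new Datastore();
        Manager a = new Manager(store);
        Manager b = new Manager(store);
        Person jon = new Person("Jon");
        Person steve = new Person("Steve");

        long jonId = a.makePersistent(jon);
        long steveId = b.makePersistent(steve);
        System.out.println(jonId != steveId);            // true: no clash across managers
        System.out.println(a.makePersistent(jon) == jonId); // true: registering twice is harmless
        System.out.println(a.getById(jonId) == jon);     // true: same instance comes back
    }
}
```

The design choice being illustrated is simply that the application code never invents identities itself; where they come from is a matter of configuration behind the manager.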
Management of changes is difficult. Data can be changed in the database which is not then reflected by cached objects in the ORM product.
Issues about data changing in the database are not specific to ORM; they apply wherever data is changed in memory, away from the database. This requires careful control of transactions and versioning, but this is provided in JDO through a rich API to control transactions and the caching and refreshing of objects from a data store.
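As an illustration of what versioning buys you here, the following is a small sketch of an optimistic version check in plain Java. It is not JDO code: Store, Row and commit are hypothetical names, and the in-memory map only stands in for a real data store. The assumption is that every row carries a version counter, and a commit is refused when the version read earlier no longer matches, which is how a stale in-memory copy gets detected.

```java
import java.util.HashMap;
import java.util.Map;

public class OptimisticVersionSketch {
    /** A stored row: a value plus a version counter that is bumped on every commit. */
    static final class Row {
        final String value;
        final long version;
        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Hypothetical in-memory stand-in for a data store. */
    static final class Store {
        private final Map<Integer, Row> rows = new HashMap<>();

        void insert(int id, String value) { rows.put(id, new Row(value, 1)); }

        Row read(int id) { return rows.get(id); }

        /** Commits only if the caller's copy is still current; returns false if it is stale. */
        boolean commit(int id, String newValue, long versionRead) {
            Row current = rows.get(id);
            if (current == null || current.version != versionRead) {
                return false; // the row changed (or vanished) since the caller read it
            }
            rows.put(id, new Row(newValue, versionRead + 1));
            return true;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.insert(1, "Jon");

        Row first = store.read(1);   // two callers read the same version
        Row second = store.read(1);

        System.out.println(store.commit(1, "Jonathan", first.version)); // true
        System.out.println(store.commit(1, "Steve", second.version));   // false: stale copy

        Row refreshed = store.read(1); // refresh, then retry
        System.out.println(store.commit(1, "Steve", refreshed.version)); // true
    }
}
```

The second caller has to refresh before retrying, which is the same refresh-then-commit cycle a persistence API would manage for you.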
Management of changes in objects is also a problem in terms of transactions - how can changes in the database be reflected by changes in objects in memory?
Transaction isolation and identification of changes are also handled transparently by the JDO PersistenceManager for persistence-capable objects, as are commits and rollbacks. The consequences of commits and rollbacks for both the data store and the objects managed by JDO are very clearly defined by the specification.
Tracking changes in the database schema in terms of changes in objects is a major problem.
On the issue of changing the underlying data model, two-way mapping of object models to relational systems (the 'meet in the middle' approach) is very well established, and there are some superb utilities (such as Kodo's schematool and mapping tool; see www.solarmetric.com) to handle highly complex requirements, in which part of the data model comes from the database, part from the classes in Java, and the two have to be combined.
As ORMs only handle graphs of objects, it is highly inefficient to obtain results that are very fast for relational systems, such as the sum of a field. Large numbers of objects have to be retrieved from the database for this purpose.
Modern ORMs have rich query languages so you don't have to navigate the object model to obtain aggregates. You may have to navigate through a large object graph if your persistence product is working on an object database, but on relational stores, you don't. In fact, you don't have to know what is happening: JDO 2.0 has a query language which includes functions like 'sum', 'min', 'max', and the ability to group and order results. The implementation of these functions may be dramatically different for different types of store. The great thing about JDO is that such functions are guaranteed to be available, no matter what the storage mechanism. If you do want to navigate through a large number of objects, then tuning tools like JDO's fetch groups and fetch plans mean that you can do this efficiently for a particular query without having to populate all of the objects - only the fields that you need for the query will be retrieved from the database.
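The difference between navigating objects and asking the store for an aggregate can be sketched in plain Java. Again, this is not JDO or JDOQL: OrderStore, loadAll and sumAmount are hypothetical names, and the array of amounts merely stands in for a column in a relational table. The sketch counts how many objects have to be built on each route.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregateSketch {
    static final class Order {
        final int id;
        final long amountInCents;
        Order(int id, long amountInCents) {
            this.id = id;
            this.amountInCents = amountInCents;
        }
    }

    /** Hypothetical store holding raw column data; counts the objects it has to build. */
    static final class OrderStore {
        private final long[] amounts;
        int materialised = 0;

        OrderStore(long... amounts) { this.amounts = amounts.clone(); }

        /** Navigation route: every Order is built so that the caller can iterate. */
        List<Order> loadAll() {
            List<Order> orders = new ArrayList<>();
            for (int i = 0; i < amounts.length; i++) {
                orders.add(new Order(i + 1, amounts[i]));
                materialised++;
            }
            return orders;
        }

        /** Aggregate route: the store computes the sum itself and builds no objects. */
        long sumAmount() {
            long total = 0;
            for (long amount : amounts) {
                total += amount;
            }
            return total;
        }
    }

    public static void main(String[] args) {
        OrderStore navigated = new OrderStore(1000, 2000, 1250);
        long viaObjects = 0;
        for (Order order : navigated.loadAll()) {
            viaObjects += order.amountInCents;
        }
        System.out.println(viaObjects + " via " + navigated.materialised + " objects");

        OrderStore aggregated = new OrderStore(1000, 2000, 1250);
        System.out.println(aggregated.sumAmount() + " via " + aggregated.materialised + " objects");
    }
}
```

Both routes give the same total, but only the first pays for building an object per row. With a query language that offers 'sum', the calling code takes the second route without caring how the store implements it.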
ORMs try to merge two completely different ways of working - OOP and relational theory. You will always end up with a lot of 'special cases' where this approach fails.
A good, mature ORM like JDO 2.0 provides a clean abstraction of what an RDBMS does in terms of objects. There were specific issues with earlier ORMs like JDO 1.0, so special cases did turn up because the API was, to be honest, less suited to relational stores. JDO 2.0 is a different beast.
ORM denies the relational nature of the vast majority of data.
JDO 2.0 certainly doesn't deny the relational nature of data - a significant proportion of the specification covers this nature in detail.
Approaches like Microsoft's LINQ are more suited to relational systems as they allow classes to be added 'on the fly' to represent the results of queries.
Personally, I don't like the idea of adding classes on the fly in the way that the next version of .NET development languages (required for LINQ) allows. But JDO's ability to use arbitrary existing classes for sets of results comes close to this, while ensuring that code for handling the results of queries is clean and free of specific method calls for retrieving field values (unlike JDBC or ODBC). Adding classes on the fly is useful for highly dynamic situations, but very little querying is dynamic - in practice, most is effectively hardwired. When the situation is more dynamic, JDO allows the developer to specify a Map as the result type of a query, so that field values can be selected by name.
Finally, the advantages of using ORM can be considerable. They include the ability to produce extremely portable and clean code and to use very high-performance (and, again, portable) caching and clustering systems transparently. There are many supposed barriers to the use of ORM that are often mentioned on IT forums. I believe that all these barriers have now been overcome, but that many proponents of relational systems have yet to realise this. ORM is not a threat to good use of relational systems, but a way to use such systems efficiently and cleanly in modern development languages.