[Notes] Hibernate Resources

2004-11-28 10:55
An overview of Hibernate from its founder, Gavin
Gavin King - Founder, The Hibernate Project

Gavin King is the founder of the Hibernate project, an ORM implementation for Java. He is co-author of an upcoming Manning book, 'Object Relational Persistence in Java'. Gavin lives in Melbourne, Australia, where he has been working as a J2EE consultant for several years, evangelizing open source solutions and agile methodology. He currently works for Expert IS.

Gavin, can you tell us a bit about yourself and what you're involved with?

I’m the founder of the Hibernate project. And at the moment I’m also kind of branching out a little bit peripherally into a couple of other projects, which are useful as part of the Hibernate toolset. (I've made) a couple minor contributions to XDoclet and Middlegen. I’m interested in continuing that kind of work at the moment, and keeping and pushing forward with Hibernate and working on that. I’m not interested in branching out too much in too many different things and neglecting those core things within.

How did Hibernate get started?

Hibernate started because I was trying to build applications with CMP, as well as with EJB 1.1, and I found that most of the time I was not trying to solve the business problem, not trying to solve the modeling problem, but trying to work around limitations of EJB and putting a lot of effort into coding stuff that was infrastructure related, that was not related to the business problem.

And in the end after all that effort, after all the lost productivity of working with entity beans, in particular, the resulting code was ugly, it was non-portable, it was non-reusable. It's not reusable outside of an application server, it’s not reusable with some other kind of different persistence mechanisms, it’s untestable. All these kinds of problems. We heard a lot at the conference (TheServerSide Symposium) about, and previously on TheServerSide community, about problems with entity beans and they’re real. They are real. People aren’t saying all this stuff because they’re bad at working with EJB; there are real problems and these are smart people who understand how to build Java applications really well who have been saying this.

What about EJB 2.0 and 2.1. Haven’t they alleviated some of these problems?

Yes, I believe it’s improved, but a lot of the fundamental underlying problem, which is the confusion of a component, which is a coarse grained thing, with a persistent class, with a piece of your domain model, still exists. Secondly, yes, EJB 2.0 is an improvement but we still have basic problems: Inheritance. I’ll talk a bit more probably later about problems with doing things like aggregation, and projection and some of these kinds of things which still exist. We need something that, a bit more, unifies the relational and object models and lets you work with both simultaneously, in my view.

Why do you think Hibernate has become so successful?

We’ve actually got a list of things on the Wiki like, a list of things, you know because I’ve thought a lot about this, like what makes an open source project become successful and take off, and we’ve got a list of things, but I think some of the most important things are: the ability to develop a real community to really interact properly with users, to establish the expectation with the users that, one if there’s a problem, it’ll be fixed, if there’s suggestions then they’ll be taken seriously. The idea that the community is kind of a nice thing surrounding this project that the users can feel proud of. If you develop a thing where there's perhaps too many smart people with too many really strong ideas that are conflicting, you can end up with a kind of a disagreement, and disunity. Disunity is fine, different opinions are great but if you want to develop an open source product you need a feeling of community, a feeling of doing things together.

Secondly: Documentation. So many of the great Java open source projects that are out there could be so much more successful, and so much better as a project if people took the time to take pride in the documentation. To me, it’s a matter of taking pride in your work. When you build something, don’t you want to tell everybody about it? Don’t you want to write about and document it? So people keep talking about how good the documentation for Hibernate is. That’s because we’re proud of it. We’re proud of what we produced and we want to get out there and tell people about it.

Thirdly, I think, and this is something that should be common to many open source projects, we’ve come into this, I’ve come into this particularly, saying, "I’m not an expert in persistence, I don’t know all the answers, I’m going to work through this, we’re going to work through this with the users and design something together that addresses their needs." Hibernate does things that I never thought would be useful. And some of those things are the best features. The users have come and said, "You know, what I would really like to do is this." So getting away from that idea of having experts design software, and rather have things evolve. I sometimes say, "How many people believe in a Command and Control economy? How will Command and Control also benefit the software economy?"

What are some of these features you talk about, that have become the ‘best features’?

I think I was coming from the point of view of a Java developer. One of the funny things is I never had a great background in SQL and the kind of things people were trying to do with SQL, and with working with data as sets of data. And often when we look at Java Persistence, what we’re looking at is a problem of objects, and the way you work with objects is very individual, you've got an object and then you work with another object. Whereas there’s this whole other world in the database where we’ve got this very elegant model. The relational model where we work with sets of data. And we need to be able to work with both representations.

And I admit, I wasn’t thinking about these problems in terms of working with sets of data. But what’s come up through users asking, well I want to do this. What’s really crystallized is the notion that you actually need to be able to work with sets of objects. So you need to be able to maintain your level of abstraction of where you have notions like inheritance and properties and associations and you’re working with classes, and the names of properties of classes. But at the same time you need to be able to be doing aggregation, outer joining, nested queries, all those kind of things, grouping, ordering, sorting, all those kinds of things, with your objects. And that stuff doesn’t get done on the Java side, that stuff should be done on the database side.

I think you can look at some of the other specifications that compete with Hibernate; they’re not addressing that issue; that issue of trying to unify the two worlds. Often we talk about O/R as a problem of bridging or mapping and I don’t think it should be looked at as that, it should be looked at as a problem of unifying, being able to work with sets of objects.

What’s wrong with traversing object hierarchies through Java; why the need for queries?

Firstly, traversing object graphs is simply too inefficient when you’re working with a relational database, particularly when you’re working with a database that might be physically separated from the process. So if you work by traversing graphs of objects you’re inevitably going to run into the n+1 problem. I’m talking about a more generalized n+1 problem than what people know very well from BMP, a more generalized problem where we fetch data piecemeal. And that’s simply inefficient with relational databases. I’m not an expert on object databases but I suspect in the case of object databases they’re a bit more designed to cope with that method of data access.
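The n+1 pattern described above can be sketched in plain Java. This is only an illustration under stated assumptions: it uses neither Hibernate nor JDBC, and `FakeDatabase`, its customer and order data, and the `roundTrips` counter are all hypothetical stand-ins for a remote relational database where each call costs one network round trip.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a remote relational database.
// Every query method counts as one network round trip.
class FakeDatabase {
    private final Map<Integer, List<String>> ordersByCustomer = new LinkedHashMap<>();
    int roundTrips = 0;

    FakeDatabase() {
        // Three customers with two orders each.
        for (int id = 1; id <= 3; id++) {
            List<String> orders = new ArrayList<>();
            orders.add("order-" + id + "a");
            orders.add("order-" + id + "b");
            ordersByCustomer.put(id, orders);
        }
    }

    // Roughly: select id from customer
    List<Integer> findCustomerIds() {
        roundTrips++;
        return new ArrayList<>(ordersByCustomer.keySet());
    }

    // Roughly: select ... from orders where customer_id = ?
    List<String> findOrdersFor(int customerId) {
        roundTrips++;
        return new ArrayList<>(ordersByCustomer.get(customerId));
    }

    // Roughly: one joined query returning every customer with its orders
    Map<Integer, List<String>> findAllCustomersWithOrders() {
        roundTrips++;
        Map<Integer, List<String>> copy = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<String>> e : ordersByCustomer.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return copy;
    }
}

class NPlusOneDemo {
    // Graph traversal: one query for the parents, then one more per parent.
    static int countOrdersByTraversal(FakeDatabase db) {
        int total = 0;
        for (int id : db.findCustomerIds()) {
            total += db.findOrdersFor(id).size();
        }
        return total;
    }

    // Set-oriented access: a single query fetches everything at once.
    static int countOrdersByJoin(FakeDatabase db) {
        int total = 0;
        for (List<String> orders : db.findAllCustomersWithOrders().values()) {
            total += orders.size();
        }
        return total;
    }

    public static void main(String[] args) {
        FakeDatabase a = new FakeDatabase();
        FakeDatabase b = new FakeDatabase();
        System.out.println(countOrdersByTraversal(a) + " orders, " + a.roundTrips + " round trips");
        System.out.println(countOrdersByJoin(b) + " orders, " + b.roundTrips + " round trips");
    }
}
```

Both methods return the same count, but the traversal version makes one round trip per customer on top of the initial query, which is the piecemeal fetching the answer refers to.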

And one of the problems that we’ve had is that object relational mapping is being perceived as a bit of a hack workaround and what we really want is an object database, and we’ve tried to apply object database APIs to object relational mapping. And that’s not going to work because a relational database is fundamentally different. The way you access data in the relational database is fundamentally different from the way you access data in an object database.

What are some of the ways you’re merging these two approaches in Hibernate?

The most important thing for us, the centerpiece of our effort, is the Hibernate query language, which is a minimal extension of SQL where we’re trying to create something as close as possible to what people already use, what people are already used to. Select foo from yadda yadda yadda. They know what a relational query looks like.

I’m kind of ashamed to say that in the Java community, I think often, we look at that stuff and think it’s a bit dirty and people talk about persistence layers as being able to shield the developer from having to work with that kind of stuff. But if you actually sit down and spend some time thinking about the relational model and relational modeling, you realize it’s actually very elegant. What makes it dirty is the mismatch problem, is the mapping problem, that’s the only part that makes it dirty. What makes it dirty is working with JDBC and trying to map the stuff that comes out of JDBC onto a graph of objects. That’s what makes it dirty, not writing SQL, not working with SQL.

So, we have a query language that’s based in SQL, which I think is a nice language. It’s a very small language but it’s nice. We’ve tried to put object oriented constructs into that and we're continuing to try to do that.

Did you evaluate ODMG’s OQL standard and if so, what was wrong with that standard?

OQL's great and in some ways there’s a subset of Hibernate query language that’s the same as OQL. But OQL was based around object databases so we have notions like method calls in OQL and that simply doesn’t apply to object relational mapping. Secondly, we have the problem in a relational database, null equals null evaluates to null. And that’s not the case in an object database.
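The difference in null handling mentioned above can be sketched as follows. This is an illustration only, not Hibernate or OQL code: a Java `null` stands in for SQL NULL in the operands and for the "unknown" result, and Java reference equality is used here as an assumed stand-in for object-style semantics. The class and method names are hypothetical.

```java
import java.util.Objects;

// Sketch of SQL-style three-valued equality versus object-style equality.
class ThreeValuedLogic {
    // SQL semantics: if either side is NULL, the comparison is UNKNOWN,
    // represented here by returning null.
    static Boolean sqlEquals(Object left, Object right) {
        if (left == null || right == null) {
            return null;
        }
        return left.equals(right);
    }

    // Object-style semantics: two null references compare as equal.
    static boolean objectEquals(Object left, Object right) {
        return Objects.equals(left, right);
    }

    // A WHERE clause keeps a row only when the predicate is TRUE,
    // so UNKNOWN filters the row out just as FALSE does.
    static boolean whereClauseKeepsRow(Boolean predicate) {
        return Boolean.TRUE.equals(predicate);
    }

    public static void main(String[] args) {
        System.out.println("SQL-style null = null: " + sqlEquals(null, null));
        System.out.println("Object-style null equals null: " + objectEquals(null, null));
        System.out.println("Row kept by WHERE: " + whereClauseKeepsRow(sqlEquals(null, null)));
    }
}
```

A query language that targets a relational database has to follow the first behaviour, because that is what the underlying SQL will do.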

How about JDOQL?

JDOQL is a query language for stuff. It’s designed to be able to query some kind of transactional system like CICS or some kind of flat file or perhaps even XML format or potentially a relational database, certainly an object database. It’s not specific to any particular persistence mechanism and as a result, when you’re trying to work at that level of abstraction you simply can't add in constructs which are specific to relational mapping. A particular one is outer joins, but also things like aggregation. Aggregation and projection are essential things and our users use them all the time.

How do you deal with stored procedures?

Stored procedures are essentially a non-relational view of a relational database. They're a procedure oriented, a call oriented view of a relational database. So my view, and this is a controversy, not everybody agrees with it, my view, currently, is that the goal of an object relational mapping tool should be to map between tables and objects, not between objects and "some other stuff".

Having said that, many people are using Hibernate in an environment where they have some data access via stored procedures or via some other kind of heterogeneous means. So we try very very hard to support access to data that’s heterogeneous, that’s coming from different places or is coming out of the database in different ways. And the way we’re doing this is not by hiding all that behind Hibernate, because in the end we’re trying to do one thing and we’re trying to do that well. How we’re trying to do it is by simple things like exposing the JDBC connection that Hibernate’s using to the user so that they can go and do anything they like.

Secondly, by allowing hooks into Hibernate. For example, one of the great things about Hibernate is that kind of unified model of types, the model of the types of a property of a class. And we have a notion of a custom type, where the user can build their own custom type for a property of an object. A custom type can be used to create an association to an object which is persisted by a DAO that uses direct JDBC stored procedures, yadda yadda yadda, LDAP, and that object can then become part of the object graph that the Hibernate application's loading. So that’s how we’re trying to address those things, currently.

You’re obviously very passionate about transparent persistence but what I’m hearing from the major vendors is that their customers aren’t asking for it; they just don’t seem to think it’s important and neither do certain factions within Sun. Why do you think transparent persistence is important?

Firstly, to address the first part: that some vendors’ customers aren’t asking for that. Well, my view on that is: who are they defining as their customers? Are they defining their customers as the developers who work with their products day to day? Or are they defining their customers as the organizational people they go to when they sell an application? I don’t know the answer to that, but I’m raising that as a question, right. To me, it appears the interest in Hibernate should demonstrate that, in fact, developers are very, very interested in the problem of transparent persistence.

Secondly, transparent persistence is important because, firstly it lets you concentrate on the business problem. And that’s very important. Transparent persistence forces the infrastructure implementer, or infrastructure vendor to not make you write too much code. A POJO is the simplest, in terms of LOC, the smallest object that you can write. And transparent persistence, saying that word forces us, as implementers of Hibernate to make sure that that’s all you need to write. So it’s important from that level. Secondly, it’s important from the level of portability. Code is not portable between EJB 1.1 and EJB 2.0 because the infrastructure gets its little fingers into the business model. Whereas the business model should be completely portable between Hibernate or JDO or Toplink. I might be making a bit too extreme a claim there, but potentially once you're talking about writing POJOs, you’re talking about building a system that can easily be ported to a new infrastructure. But also, a system that's easily testable, that's easily unit testable, a system that can run in a batch process or in some asynchronous listener.

And all that stuff's really important. For some reason reusability is seen as an advantage of EJB. But in fact the lack of reusability is the disadvantage of EJB. POJOs are reusable. Things which implement interfaces or inherit stuff simply aren’t as reusable.
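As a rough illustration of the POJO argument, here is a minimal sketch of a domain class that depends on nothing but the JDK. `Invoice` and its methods are hypothetical names chosen for this example, not taken from Hibernate or from the interview; the point is only that such a class has no framework interface, base class, or container callback.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain class: just state and business logic.
class Invoice {
    private Long id; // identifier that a persistence tool could populate
    private final List<BigDecimal> lineAmounts = new ArrayList<>();

    Long getId() {
        return id;
    }

    void setId(Long id) {
        this.id = id;
    }

    void addLine(BigDecimal amount) {
        if (amount.signum() < 0) {
            throw new IllegalArgumentException("line amount must not be negative");
        }
        lineAmounts.add(amount);
    }

    BigDecimal total() {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal amount : lineAmounts) {
            sum = sum.add(amount);
        }
        return sum;
    }
}

class InvoiceDemo {
    // Runs without an application server, database, or persistence library.
    public static void main(String[] args) {
        Invoice invoice = new Invoice();
        invoice.addLine(new BigDecimal("19.99"));
        invoice.addLine(new BigDecimal("5.01"));
        System.out.println("total = " + invoice.total());
    }
}
```

Because the class can be instantiated with `new` and exercised directly, the same code could in principle run in a unit test, a batch job, or behind whichever persistence tool maps it.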

One of the reasons why transparent persistence may not be getting as much attention is because a lot of people just aren’t writing domain models. Why do you think people are not yet adopting such a good object-oriented paradigm?

Firstly, not every application needs a domain model. There are lots of applications for which a domain model is absolute overkill. There are lots of applications for which a view of sets of data coming out of the database is absolutely appropriate. And there’s all kinds of good tools in Java, in the open source community, for writing that kind of application well. Things like the Spring Framework give you great ways of getting away a bit from very messy JDBC code, while still writing those kind of applications which pull some stuff from the database and display it on the screen, or insert a row of data into the database. You don’t need a domain model to do those kinds of things.

On the other hand, there's a whole other class of applications where we do significant business logic. And I should come back and say, really that’s not all applications. One of the problems is that a lot of books you read about Java assume, push on people the notion that they need to abstract their business logic out from their presentation logic. But if you’re not doing sufficient business logic you don’t need to make that level of abstraction.

But there are some applications which need to do that. For those kind of applications, I think, unfortunately, because of the lack of a really good way of doing transparent persistence, that in itself, has pushed people away from using domain models, from using simple POJO-oriented domain models. One of the fundamental aspects of the domain model is it gets persisted to a relational database, in 98% of Java applications. Not being able to do that efficiently hinders the adoption of that way of building applications.

What were some of the biggest technical challenges you faced building a persistence framework?

Technical challenges are easy. The big problems are all social, communicational mainly. We have some algorithms of a medium level of complexity; some business applications and a lot of other open source projects don’t have pieces of code which are quite that complicated. So we’ve got some algorithms and stuff going on that’s of a medium level of difficulty.

Trying to communicate to other people how Hibernate works, particularly how Hibernate works internally as opposed to semantics, the external semantics, the user visible semantics of Hibernate, it’s very, very difficult. When you’re not in the same room as someone, when you’re not able to communicate in words, when it’s even difficult to communicate with diagrams, your communication is limited to emails, text-based things and more than anything else, I’d love there to be a bigger number of people in the world who understand how Hibernate works. There’s enough now, but early on it was a real difficulty when I’m the only one who’s really able to understand that code. Things like Javadoc, and code comments can help all that kind of stuff but really to do that kind of stuff efficiently you need to be able to sit down next to someone and go, "Oh, this is how this works!"

Hibernate seems to have taken a strong stance against JDO. Why?

I’d like to think of it as, we’re not taking a strong stance against the notion of JDO. What we’re doing is we’re saying that we’re not going to implement JDO yet, because JDO has some limitations that would mean users simply would not use JDO, they’d use native APIs because they can do stuff with Hibernate that they need to be able to do, that they will not be able to do with JDO now. JDO 2.0 is an absolutely "up in the air" thing, nobody knows what it’ll be, what it’ll do, but it is possible JDO 2.0 will address these problems I’m talking about and then we’ll be able to move into implementing that. Another reason that we don’t implement it is because, now this is not an absolute show stopper from our point of view, but JDO specifies not only some user visible APIs, but also some implementation details. Now, there’s been an awful lot of discussion on TSS about bytecode processing, is it a good thing, is it a bad thing, does JDO really require bytecode processing, or can you do it some other ways with inheritance? There’s all kinds of complicated issues around that.

My position is that if you want transparent persistence, you need bytecode processing. Bytecode processing is not necessarily bad, but JDO doesn’t need to mandate it. So perhaps what we can look at is having two levels of compliance to the JDO spec. There’s the binary level of compliance where compiled POJOs are binary portable between different JDO implementations. And there’s another level of compliance where source code will be portable. And really, I don’t know where this problem of binary portability came from. I don’t know anybody who’s trying to solve that problem.

As far as I know people are very happy to have source code level of portability. It would also be nice if some of the requirements were loosened up so that you could have different implementations. Reflection-based implementations are a great way to do transparent persistence. They’re fantastic. There is no reason why JDO can’t support both approaches. I believe we can change the specs to allow both those approaches without losing anything much. So that’s another level of reservation, but fundamentally, the main problem is that our users want to do things that they can’t do with JDO code.

What kind of things could be added to JDO that might get your support?

Okay, firstly, my number one issue is that in traditional ORM there are two ways of getting stuff out of the database. There’s the query by criteria style API and there’s the query language style way of querying. It's as if they (JDO) couldn't decide between the two models which are both good models and excellent for particular cases and they decided to do something a bit in the middle where they've married something that’s a bit like a query by criteria API to a sort of an expression language that’s sort of based around some Java syntax and I don’t know where it’s going. It doesn’t seem to allow easy extension into those kind of issues that we were talking about : aggregation, projection, outer joining. It doesn’t seem to be able to move in that direction. So this is the number one problem with the JDO spec, is that the query API is inelegant, I don’t think that’s going too far, and also it’s not clear how to extend it and add important features to it.

What’s in the pipeline for you and Hibernate?

Currently, we’ve just delivered Hibernate 2.0. I think that’s gone pretty smoothly. People have now moved across from 1.2.x to Hibernate 2.0. There were a couple of little teething issues. I think it’s gone smoothly, now we’ve ironed out those and I think it’s going well. And this really does represent the culmination of a lot of work and a lot of ideas and now we’re a little bit casting around for things to do and maybe take a little bit of a break. The big thing that is immediately in front of us is distributed caching, support for some reasonably good transaction for distributed caching. And there’s a couple of things going around there. It would be really nice if they could get the JCache JSR out, but there are some other things we can do before then.

So what’s so bad about byte-code manipulation?

Nothing's so bad about bytecode manipulation. Bytecode manipulation's okay, if you need it. Particularly in the case of some of the AOP frameworks that are coming out where they want to do field level interception and all that kind of stuff. If you need to do field level interception, that’s really good, you're going to need to do that. Bytecode enhancement as part of the build process, to me, is a bit of a problem. It requires support in all your development tools or else it’s an extra step in your build cycle. Those are problems you can get past. My problem with bytecode manipulation in the JDO spec is that it’s required. Not that you can use bytecode manipulation to implement JDO, but that it’s required for no really strong reason.

What approach does Hibernate take?

One of the interesting threads of conversation that's been going on, that I’ve been involved in with a couple of people, is this question of how to implement AOP frameworks. What Rod Johnson did was he wrote up a hierarchy of the ways to implement it, where he started off with dynamic proxies and ended up with a new language like AspectJ. And in between there were these two options: there was bytecode manipulation, which interestingly AOP frameworks are trying to do at runtime by hooking up to the application server classloader, so that was an option, but there were some questions about how easy it is to make that portable between different application servers when we don’t have control over the class loaders. I really don’t know the answer to this because I haven’t looked at the problem closely, but I know there were some different opinions flying around from different people.

But there’s another solution, which is what Hibernate is using currently for proxying, which is called CGLib. Now CGLib does a few things. It’s a library of useful things you would want to do with bytecode generation. So it solves some common problems, and in particular a couple problems that we want solved. CGLib lets you do some bytecode generation to load new classes. So this is a little bit more of a light weight thing than actually manipulating the classes as they’re loaded. CGLib will load a new class. And that’s the way we’re using bytecode stuff with Hibernate.

So we’re creating these proxies which are a bit more powerful than a JDK dynamic proxy. You can think of what CGLib's doing here as being a bit like an alternative implementation of a JDK dynamic proxy, except that it works on 1.2. Importantly, you can inherit a concrete class as well as implementing a list of interfaces and a JDK dynamic proxy doesn’t let you do that. So it means that we can implement proxies for things. Often in Java and all over the place in JMX, EJB, you see a pattern where we allow proxying and indirection, interception between the container and the implementation by having an interface and an implementation device and that’s required by the specifications because they want the container to be able to get in between the client and the server transparently.
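The limitation of JDK dynamic proxies mentioned above can be seen with the standard `java.lang.reflect.Proxy` API alone. The sketch below does not use CGLib; `Greeter`, `PlainGreeter`, and the helper methods are hypothetical names for illustration. It shows interception working through an interface and the JDK refusing to proxy a concrete class.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface Greeter {
    String greet(String name);
}

class PlainGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class DynamicProxyDemo {
    // Wraps a target behind a JDK dynamic proxy that records each method call
    // before delegating to the real object.
    static Greeter intercept(Greeter target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            return method.invoke(target, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    // JDK proxies accept only interfaces. Passing a concrete class makes
    // newProxyInstance throw IllegalArgumentException.
    static boolean canProxyWithJdk(Class<?> type) {
        try {
            Proxy.newProxyInstance(
                    type.getClassLoader(),
                    new Class<?>[] { type },
                    (proxy, method, args) -> null);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Greeter greeter = intercept(new PlainGreeter(), log);
        System.out.println(greeter.greet("Gavin"));
        System.out.println("intercepted calls: " + log);
        System.out.println("interface proxyable: " + canProxyWithJdk(Greeter.class));
        System.out.println("concrete class proxyable: " + canProxyWithJdk(PlainGreeter.class));
    }
}
```

This is why interface-plus-implementation pairs show up wherever a container relies on JDK proxies for interception: without the interface there is nothing for the proxy to implement.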

Now, that’s not required with Hibernate because we have CGLib where we can do interception upon the methods of a concrete class and CGLib is what lets it do that. CGLib is something built on top of BCEL and ASM, for people who are wondering how it relates to these other things.
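To give a feel for what interception on a concrete class means, here is a hand-written sketch of the kind of subclass a bytecode generation library could produce at runtime. It is not CGLib code and does not reflect CGLib's actual API; `Customer`, `LazyCustomerProxy`, and the lazy-loading behaviour are assumptions chosen for illustration.

```java
import java.util.function.Supplier;

// Hypothetical persistent class: concrete, with no interface to proxy through.
class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hand-written stand-in for a runtime-generated subclass. It overrides the
// method, intercepts the call, and only then delegates to the real instance.
class LazyCustomerProxy extends Customer {
    private final Supplier<Customer> loader;
    private Customer target; // real instance, fetched on first use
    int loads = 0;           // how many times the loader actually ran

    LazyCustomerProxy(Supplier<Customer> loader) {
        super(null);         // the proxy's own inherited state is never read
        this.loader = loader;
    }

    @Override
    String getName() {
        if (target == null) {
            target = loader.get();
            loads++;
        }
        return target.getName();
    }
}

class SubclassProxyDemo {
    public static void main(String[] args) {
        LazyCustomerProxy proxy = new LazyCustomerProxy(() -> new Customer("Ada"));
        Customer customer = proxy; // callers only ever see the Customer type
        System.out.println("loads before access: " + proxy.loads);
        System.out.println("name: " + customer.getName());
        System.out.println("loads after access: " + proxy.loads);
    }
}
```

Because the proxy is a subclass, client code keeps working against the plain class, and no separate interface has to be introduced just to make interception possible.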