Managing DbContext the right way with Entity Framework 6: an in-depth guide by mehdime
2017-02-10 10:38
As we'll see later, Entity Framework wraps all write operations within an explicit database transaction by default. Coupled with a READ COMMITTED isolation level - the default on SQL Server - this suits the needs of most business transactions. This is especially the case if you rely on optimistic concurrency to detect and avoid conflicting updates.
Most applications however will still occasionally need to use other isolation levels for specific operations.
It's very common for example to execute reporting queries where you have determined that dirty reads aren't an issue under a READ UNCOMMITTED isolation level in order to eliminate lock contention with other queries (although if your environment allows it, you'll probably want to use READ COMMITTED SNAPSHOT instead).
And some business rules might require the use of the REPEATABLE READ or even SERIALIZABLE isolation levels (especially if your application uses pessimistic concurrency control). In that case, the service will need to have explicit control over the transaction scope.
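As a rough sketch of what explicit control over the transaction scope can look like with plain EF6, the example below starts the database transaction itself through Database.BeginTransaction() and requests a REPEATABLE READ isolation level. The Account entity, the BankingDbContext class and the TransferService are hypothetical names invented for this illustration.

```csharp
using System;
using System.Data;
using System.Data.Entity;
using System.Linq;

// Hypothetical entity and context, for illustration only.
public class Account
{
    public int Id { get; set; }
    public decimal Balance { get; set; }
}

public class BankingDbContext : DbContext
{
    public DbSet<Account> Accounts { get; set; }
}

public class TransferService
{
    public void Transfer(int fromId, int toId, decimal amount)
    {
        using (var context = new BankingDbContext())
        // Start the database transaction ourselves so that the reads below
        // run under REPEATABLE READ instead of the default isolation level.
        using (var transaction = context.Database.BeginTransaction(IsolationLevel.RepeatableRead))
        {
            var from = context.Accounts.Single(a => a.Id == fromId);
            var to = context.Accounts.Single(a => a.Id == toId);

            if (from.Balance < amount)
                throw new InvalidOperationException("Insufficient funds.");

            from.Balance -= amount;
            to.Balance += amount;

            // SaveChanges() uses the transaction we started.
            context.SaveChanges();

            // Nothing is committed until Commit() is called. Disposing the
            // transaction without committing rolls it back.
            transaction.Commit();
        }
    }
}
```

Note that the connection and the transaction stay open from BeginTransaction() until the using block exits, so the work done inside that block should be kept as short as possible.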
The way your
The architecture of a software system and the design patterns it relies on always evolve over time to adapt to new constraints, business requirements and increasing load.
You don't want the strategy you choose to manage the lifetime of your
The way your
While most applications today start off as web applications, the strategy you choose to manage the lifetime of your
It won't be long until you need to create command-line utilities for your support team to execute ad-hoc maintenance tasks or Windows Services to handle scheduled tasks and long-running background operations. When this happens, you want to be able to reference the assembly that contains your services and just use any service you need from your console or Windows Service application. You most definitely don't want to have to completely re-engineer the way your
Your
If your application needs to connect to multiple databases (for example if it uses separate reporting, logging and / or auditing databases) or if you have split your domain model into multiple aggregate groups, you will have to manage multiple
For those coming from an NHibernate background, this is the equivalent of having to manage multiple
Whatever strategy you choose should be able to let services use the appropriate
Your
In .NET 4.5, ADO.NET introduced (at very long last) support for async database queries. Async support was then included in Entity Framework 6, allowing you to use a fully async workflow for all read and write queries made through EF.
Needless to say that whatever system you use to manage your
In general,
There are several key behaviours of Entity Framework you should always keep in mind however. This list documents EF's behaviour when working against SQL Server. There might be differences when using other data stores.
You must never access your
In a multi-threaded application, you must create and use a separate instance of your
So if
Entity Framework's async features are there to support an asynchronous programming model, not to enable parallelism.
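To make the distinction concrete, here is a minimal sketch assuming a hypothetical ShopDbContext with Customers and Orders sets. Each async query is awaited before the next one starts, so the thread is released while waiting on the database but only one operation is ever in flight on the context.

```csharp
using System.Data.Entity;
using System.Threading.Tasks;

// Hypothetical entities and context, for illustration only.
public class Customer
{
    public int Id { get; set; }
}

public class Order
{
    public int Id { get; set; }
}

public class ShopDbContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Order> Orders { get; set; }
}

public class DashboardService
{
    public async Task<int[]> GetCountsAsync()
    {
        using (var context = new ShopDbContext())
        {
            // Asynchronous but sequential: the second query only starts
            // once the first one has completed.
            int customerCount = await context.Customers.CountAsync();
            int orderCount = await context.Orders.CountAsync();

            // What NOT to do with a single context instance:
            //
            //   await Task.WhenAll(
            //       context.Customers.CountAsync(),
            //       context.Orders.CountAsync());
            //
            // This starts a second operation before the first has completed.
            // DbContext isn't thread-safe and EF6 will typically fail with a
            // NotSupportedException in that situation.

            return new[] { customerCount, orderCount };
        }
    }
}
```

If the two queries really had to run in parallel, each parallel task would need its own DbContext instance.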
The canonical manner to implement a business transaction with Entity Framework is therefore:
In NHibernate, the
This EF behaviour can result in subtle bugs as it is possible to be in a situation where queries may unexpectedly return stale or incorrect data. This wouldn't be possible with NHibernate's default behaviour. On the other hand, it dramatically simplifies the issue of database transaction lifetime management.
One of the trickiest issues in NHibernate is to correctly manage the database transaction lifetime. Since NHibernate's
The only reliable method to correctly manage the database transaction lifetime with NHibernate is to wrap all your service methods in an explicit database transaction. This is what you'll see done in pretty much every NHibernate-based application.
A side-effect of this approach is that it requires keeping a database connection and transaction open for often longer than strictly necessary. It therefore increases database lock contention and the probability of database deadlocks occurring. It's also very easy for a developer to inadvertently execute a long-running computation or a remote service call without realizing or even knowing that they're within the context of an open database transaction.
With the EF approach, only the
If you've been around the block for a while, and particularly if you've used NHibernate before, you may have heard that AutoCommit (or Implicit) transactions are bad. And indeed, relying on Autocommit transactions for writes can have a disastrous impact on performance.
The story is very different for reads however. As you can see by yourself by running the SQL script below, neither Autocommit nor Implicit transactions have any significant performance impact for
Obviously, if you need to use an isolation level higher than the default READ COMMITTED, all reads will need to be part of an explicit database transaction. In that case, you will have to start the transaction yourself - EF will not do this for you. But this would typically only be done on an ad-hoc basis for specific business transactions. Entity Framework's default behaviour should suit the vast majority of business transactions.
It will use whatever default transaction isolation level the database engine has been configured to use (READ COMMITTED by default for SQL Server).
An obvious side-effect of manually controlling the database transaction scope is that you are now forcing the database connection and transaction to remain open for the duration of the transaction scope.
You should be careful to keep this scope as short-lived as possible. Keeping a database transaction running for too long can have a significant impact on your application's performance and scalability. In particular, it's generally a good idea to refrain from calling other service methods within an explicit transaction scope - they might be executing long-running operations unaware that they have been invoked within an open database transaction scope.
There's unfortunately no built-in way to override this isolation level. If you'd like to use another isolation level, you must start and manage the database transaction yourself.
The database connection opened by DbContext will enroll in an ambient
Alternatively, you can also use the
Prior to EF6, using
In practice, and unless you actually need a distributed transaction, you should avoid using
In practice however, and unless you choose to explicitly manage the database connection or transaction that the DbContext uses, not calling
This is good news as a lot of the code you'll find in the wild fails to dispose of
A DI container like StructureMap for example doesn't support decommissioning the components it created. As a result, if you rely on StructureMap to create your
As we've seen above, the responsibility of creating and disposing the
The
There are three schools of thought when it comes to making the
(in this intentionally contrived example, the repository layer is of course completely pointless. In a real-world application, you would expect the repository layer to be a lot richer. In addition, you could of course abstract your
There's no magic anywhere. The
That your repository methods require to be provided with an explicit
Things are quite different in your service layer however. Chances are that most of your service methods won't use the
It can get quite ugly. Particularly if your application uses multiple
Jon Skeet wrote an interesting article on the topic of explicitness vs ambient but couldn't come up with a good solution either.
Nevertheless, the simplicity and foolproofness of this approach is hard to beat.
In .NET itself, this pattern is used quite extensively. You've probably already used
With this approach, the top-level service method not only creates the
Anders Abel has written a simple implementation of an ambient DbContext that relies on a
As with the explicit approach, the creation and disposal of the
If your application uses multiple
Finally, the ambient
With this approach, you let your DI container manage the lifetime of your
This is what it looks like:
You then need to configure your DI container to create an instance of the
A lot of magic
The first issue is that this approach relies very heavily on magic. And when it comes to managing the correctness and consistency of your data - your most precious asset - magic isn't a word you want to hear too often.
Where do these
If you're a back-end developer working on a EF-based project, you must know the answers to these questions if you want to be able to write correct code.
The answers here aren't obvious and will require you to pore through your DI container configuration code to find out. And as we've seen earlier, getting this configuration right isn't as trivial as it may seem at first sight and may end up being fairly complex and / or subtle.
Unclear business transaction boundaries
Perhaps the most glaring issue in the code sample above is: who is responsible for committing changes to the data store? I.e. who is calling the
You could inject the
Alternatively, you could define a
Another approach sometimes seen in the wild is to let the DI container call
In short: the DI container is an infrastructure-level component - it has no knowledge of the business logic the components it manages implement. The
All that being said, if you subscribe to the Repository is Dead movement, the issue of defining who is calling
There are however a number of other issues you will run into with an injected
Forces your services to become stateful
A notable one is that
It's not the end of the world but it certainly complicates DI container configuration. Having stateless services provides tremendous flexibility and makes the configuration of their lifetime a non-issue (any lifetime would do and singleton is often your best bet). As soon as you introduce stateful services, careful consideration has to be given to your service lifetimes.
It often starts off easy (PerWebRequest or Transient lifetime for everything which suits a simple web app well) and then descends into more complexity as console apps, Windows Services and others inevitably make their appearance.
Prevents multi-threading
Another issue (related to the previous one) that will inevitably bite you quite hard is that an injected
Remember that
How can you fix this? Not easily.
Your first instinct is probably to change your services to depend on a DbContext factory instead of depending directly on a DbContext. That would allow them to create their own
Another way to approach the issue would be to add a few more layers of complexity, introduce a queuing middleware like RabbitMQ and let it distribute the workload for you. Which may or may not work depending on why you need to introduce parallelism. But in any case, you may neither need nor want the additional overhead and complexity.
With an injected
The approach presented below relies on
If you're familiar with the
This is the
The purpose of a
You can instantiate a
Within a
But that's of course only available in the method that created the
Those
You'll note that the service method doesn't need to know which type of
You're implementing a new feature that requires being able to mark a group of users as premium within a single business transaction. You can easily do it like this:
(this would of course be a very inefficient way to implement this particular feature but it demonstrates the point)
This makes creating a service method that combines the logic of multiple other service methods trivial.
It will make code review and maintenance difficult (did you intend not to call
If you requested an explicit database transaction to be started (we'll see later how to do it), not calling
The
And this is how you use it:
In the example above, the
This is made possible by the fact that
WARNING: There is one thing that you must always keep in mind when using any async flow with
I.e. if you attempt to start multiple parallel tasks within the context of a
In general, parallelizing database access within a single business transaction has little to no benefits and only adds significant complexity. Any parallel operation performed within the context of a business transaction should not access the database.
However, if you really need to start a parallel task within a
Sometimes, a service method may need to persist its changes to the underlying database regardless of the outcome of the overall business transaction it may be part of. This would be the case if:
It needs to record cross-cutting concern information that shouldn't be rolled-back even if the business transaction fails. A typical example would be logging or auditing records.
It needs to record the result of an operation that cannot be rolled back. A typical example would be service methods that interact with non-transactional remote services or APIs. E.g. if your service method uses the Facebook API to post a new status update on Facebook and then records the newly created status update in the local database, that record must be persisted even if the overall business transaction fails because of some other error occurring after the Facebook API call. The Facebook API isn't transactional - it's impossible to "rollback" a Facebook API call. The result of that API call should therefore never be rolled back.
In that case, you can pass a value of
The major issue with doing this is that this service method will use separate
The client code calling your service method may be a service method itself that created its own
Instead, either:
Don't return persistent entities. This is the easiest, cleanest, most foolproof method. E.g. if your service creates a new domain model object, don't return it. Return its ID instead and let the client load the entity in its own
If you absolutely need to return a persistent entity, switch back to the ambient context, load the entity you want to return in the ambient context and return that.
I.e. if the
The
But as I tried to use that
If even I, who had spent a significant amount of time researching, designing and implementing this component, kept getting confused when trying to use it, there clearly wasn't a hope that anyone else would find it easy to use it.
So I renamed it
The main issue I had with the
Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems.
There is no ambiguity as to what a unit of work means at the database level.
At the application level however, a "unit of work" is a very vague concept that could mean everything and nothing. And it's certainly not clear how this "unit of work" relates to Entity Framework, to the issue of managing
As a result, any developer trying to use a "
In fact, for many applications, an application-level "unit of work" doesn't even make any sense. Many applications will have to use several non-transactional services during the course of a business transaction, such as remote APIs or non-transactional legacy components. The changes made there cannot be rolled back. Pretending otherwise is counter-productive, confusing and makes it even harder to write correct code.
A
Of course, naming this component
How
The source code is well commented and I would encourage you to read through it. In addition, this excellent blog post by Stephen Toub on the ExecutionContext is a mandatory read if you'd like to fully understand how the ambient context pattern was implemented in
By doing this, you're losing pretty much every feature that Entity Framework provides via the
You're effectively reducing Entity Framework to a basic ORM in the literal sense of the term: a mapper from your objects to their relational representation in the database.
There are some applications where this type of architecture does make sense. If you're working on such an application, you should however ask yourself why you're using Entity Framework in the first place. If you're going to use it as a basic ORM and won't use any of the features that it provides on top of its ORM capabilities, you might be better off using a lightweight ORM library such as Dapper. Chances are it would simplify your code and offer better performance by not having the additional overhead that EF introduces to support its additional functionalities.
from:http://mehdi.me/ambient-dbcontext-in-ef6/
UPDATE: the source code for DbContextScope is now available on GitHub: DbContextScope on GitHub.
A bit of context
This isn't the first post that has been written about managing the DbContext lifetime in Entity Framework-based applications. In fact, there is no shortage of articles discussing this topic.
For many applications, the solutions presented in those articles (which generally revolve around using a DI container to inject DbContext instances with a PerWebRequest lifetime) will work just fine. They also have the merit of being very simple - at least at first sight.
For certain types of applications however, the inherent limitations of these approaches pose problems, to the point that certain features become impossible to implement or require resorting to increasingly complex structures or increasingly ugly hacks to work around the way the DbContext instances are created and managed.
Here is for example an overview of the real-world application that prompted me to re-think the way we managed our DbContext instances:
The application is comprised of multiple web applications built with ASP.NET MVC and WebAPI. It also includes many background services implemented as console apps and Windows Services, including a home grown task scheduler service and multiple services that process messages from MSMQ and RabbitMQ queues. Most of the articles I linked to above make the assumption that all services will execute within the context of a web request. This is not the case here.
It stores and reads data to / from multiple databases, including a main database, a secondary database, a reporting database and a logging database. Its domain model is separated into several independent groups, each with their own DbContext type. Any approach assuming a single DbContext type won't work here.
It relies heavily on third-party remote APIs, such as the Facebook, Twitter or LinkedIn APIs. These aren't transactional. Many user actions require us to make multiple remote API calls before we can return a result to the user. Many of the articles I linked to make the assumption that "1 web request = 1 business transaction" that either gets committed or rolled back in an atomic manner (hence the idea of using a PerWebRequest-scoped DbContext instance). This clearly doesn't apply here. Just because one remote API call failed doesn't mean that we can auto-magically "roll back" the results of any remote API call we may have made prior to the failed one (e.g. when you've used the Facebook API to post a status update on Facebook, you can't roll it back even if that operation was part of a wider user action that eventually failed as a whole). So in this application, a user action will often require us to execute multiple business transactions, which must be independently persisted. (You may argue that there might be ways to redesign the whole system to avoid finding ourselves in this sort of situation. And maybe there are. But that's how the application was originally designed, it works very well and that's what we have to work with.)
Many services are heavily parallelized, either by taking advantage of async I/O or (more often) by simply distributing tasks across multiple threads via the TPL's Task.Run() or Parallel.Invoke() methods. So the way we manage our DbContext instances must play well with multi-threading and parallel programming in general. Most of the common approaches suggested to manage DbContext instances don't work at all in this scenario.
In this post, I'll go in depth into the various moving parts that are involved in DbContext lifetime management. We'll look at the pros and cons of several strategies commonly used to solve this problem. Finally, we'll look in detail at one strategy (among others) to manage the DbContext lifetime that addresses all the challenges presented above and that should work for most applications regardless of their complexity.
There is of course no such thing as one-size-fits-all. But by the end of this post, you should have all the tools and knowledge you need to make an informed decision for your specific application.
Like most posts on this blog, this post is on the long and detailed side. It might take a while to read and digest. For an Entity Framework-based application, the strategy you choose to use to manage the lifetime of the DbContext will be one of the most important decisions you make. It will have a major impact on the correctness, maintainability and scalability of your application. So it's well worth taking some time to choose your strategy carefully and not rush into it.
A note on terminology
In this post, I'll often refer to the term "services". What I mean by that is not remote services (REST or otherwise). Instead, what I'm referring to is what is often called Service Objects. That is: the place where your business logic is implemented - the objects responsible for executing your business rules and defining your business transaction boundaries.
Of course, depending on the design patterns that were used to create the architecture of your application (and depending on the imagination of whoever designed it - software developers are an imaginative bunch), your code base might be using different names for this. So what I call a "service" might very well be called a "workflow", an "orchestrator", an "executor", an "interactor", a "command", a "handler" or a variety of other names in your application.
Not to mention that many applications don't have a well-defined place where business logic is implemented and rely instead on implementing (and often duplicating) business logic on an ad-hoc basis where and when needed, e.g. in controllers in an MVC application.
But none of this matters for this discussion. Whenever I say "service", read: "the place that implements the business logic", be it a random controller method or a well-defined service class in a separate service layer.
Key points to consider
When coming up with or evaluating a DbContext lifetime management strategy, it's important to keep in mind the key scenarios and functionalities that it must support.
Here are a few points that I would consider to be essential for most applications.
Your services must be in control of the business transaction boundary (but not necessarily in control of the DbContext instance lifetime)
Perhaps the main source of confusion when it comes to managing DbContext instances is understanding the difference between the lifetime of a DbContext instance and the lifetime of a business transaction and how they relate.
DbContext implements the Unit of Work pattern:
Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems.
In practice, as you use a DbContext instance to load, update, add and delete persistent entities, the instance keeps track of those changes in memory. It doesn't however persist those changes to the underlying database until you call its SaveChanges() method.
A service method, as defined above, is responsible for defining the boundary of a business transaction.
The practical consequence of this is that:
A service method must use the same DbContext instance throughout the duration of a business transaction. This is so that all the changes made to your persistent model are tracked and either committed to the underlying data store or rolled back in an atomic manner.
Your services must be the sole components in your application responsible for calling the DbContext.SaveChanges() method at the end of a business transaction. Should other parts of the application call the SaveChanges() method (e.g. repository methods), you will end up with partially committed changes, leaving your data in an inconsistent state.
The SaveChanges() method must be called exactly once at the end of each business transaction. Inadvertently calling this method in the middle of a business transaction may leave the system with inconsistent, partially committed changes.
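The three rules above can be sketched as follows. This is only an illustration under stated assumptions: the User entity, the UserManagementDbContext class and the UserService are hypothetical names, and the service creates and disposes the context itself purely to keep the example short.

```csharp
using System;
using System.Data.Entity;

// Hypothetical entity and context, for illustration only.
public class User
{
    public Guid Id { get; set; }
    public string Email { get; set; }
    public bool IsPremium { get; set; }
}

public class UserManagementDbContext : DbContext
{
    public DbSet<User> Users { get; set; }
}

public class UserService
{
    // One business transaction: a single DbContext instance is used
    // throughout and SaveChanges() is called exactly once, by the service.
    public void MarkUserAsPremium(Guid userId)
    {
        using (var context = new UserManagementDbContext())
        {
            var user = context.Users.Find(userId);
            if (user == null)
                throw new ArgumentException("Unknown user: " + userId, "userId");

            // At this point the change is only tracked in memory.
            user.IsPremium = true;

            // Persist all tracked changes atomically at the end of the
            // business transaction. If an exception is thrown before this
            // line, nothing is written to the database.
            context.SaveChanges();
        }
    }
}
```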
A DbContext instance can however span across multiple (sequential) business transactions. Once a business transaction has completed and has called the DbContext.SaveChanges() method to persist all the changes it made, it's entirely possible to just re-use the same DbContext instance for the next business transaction.
I.e. the lifetime of a DbContext instance is not necessarily bound to the lifetime of a single business transaction.
Pros and cons of managing the DbContext instance lifetime independently of the business transaction lifetime
Example
A very common scenario where the lifetime of the DbContext instance can be maintained independently from the lifetime of business transactions is in the case of web applications. It's quite common to use a configuration where a DbContext instance is created at the beginning of each web request, used by all the services invoked during the execution of the web request and eventually disposed of at the end of the request.
Pros
There are two main reasons why you would want to decouple the lifetime of the DbContext instance from the business transaction lifetime.
Possible performance gains. Each DbContext instance maintains a first-level cache of all the entities it loads from the database. Whenever you query an entity by its primary key, the DbContext will first attempt to retrieve it from its first-level cache before defaulting to querying it from the database. Depending on your data query pattern, re-using the same DbContext across multiple sequential business transactions may result in fewer database queries being made thanks to the DbContext first-level cache.
It enables lazy-loading. If your services return persistent entities (as opposed to returning view models or other sorts of DTOs) and you'd like to take advantage of lazy-loading on those entities, the lifetime of the DbContext instance from which those entities were retrieved must extend beyond the scope of the business transaction. If the service method disposed the DbContext instance it used before returning, any attempt to lazy-load properties on the returned entities would fail (whether or not using lazy-loading is a good idea is a different debate altogether which we won't get into here). In our web application example, lazy-loading would typically be used in controller action methods on entities returned by a separate service layer. In that case, the DbContext instance that was used by the service method to load these entities would need to remain alive for the duration of the web request (or at the very least until the action method has completed).
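The first-level cache behaviour mentioned in the first point can be observed with DbSet.Find(), which looks for an already-tracked entity with the given key before querying the database. The sketch below assumes a hypothetical UserManagementDbContext with a Users set and assumes that a user with the given ID exists.

```csharp
using System;
using System.Data.Entity;

// Hypothetical entity and context, for illustration only.
public class User
{
    public Guid Id { get; set; }
    public string Email { get; set; }
}

public class UserManagementDbContext : DbContext
{
    public DbSet<User> Users { get; set; }
}

public static class FirstLevelCacheExample
{
    // Returns true when both lookups yield the same tracked instance.
    public static bool SecondLookupIsServedFromContext(Guid userId)
    {
        using (var context = new UserManagementDbContext())
        {
            // The entity isn't tracked yet: this lookup queries the database.
            User first = context.Users.Find(userId);

            // The entity is now tracked by the context: this lookup returns
            // the tracked instance without querying the database again.
            User second = context.Users.Find(userId);

            return first != null && ReferenceEquals(first, second);
        }
    }
}
```

The flip side is that the second lookup won't see any change made to that row by another process in the meantime, which is the staleness issue that comes with long-lived DbContext instances.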
Issues with keeping the DbContext alive beyond the scope of a business transaction
While it can be fine to re-use a DbContext across multiple business transactions, its lifetime should still be kept short. Its first-level cache will eventually become stale, which will lead to concurrency issues. If your application uses optimistic concurrency this will result in business transactions failing with a DbUpdateConcurrencyException. Using an instance-per-web-request lifetime for your DbContext in web apps will usually be fine as a web request is short-lived by nature. But using an instance-per-form lifetime in a desktop application, which you'll often find suggested, is a lot more questionable and requires careful thought before being adopted.
Note that you can't re-use the same DbContext instance across multiple business transactions if you rely on pessimistic concurrency. Correctly implementing pessimistic concurrency involves keeping a database transaction with the correct isolation level open for the whole lifetime of a DbContext instance, which would prevent committing or rolling back individual business transactions independently.
Re-using the same DbContext instance for more than one business transaction can also lead to disastrous bugs where a service method accidentally commits the changes from a previously failed business transaction.
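To make this failure mode concrete, here is a minimal sketch of how it can happen when two service methods share one context, for example through an instance-per-web-request lifetime. The User entity, its Email property, the MyDbContext definition and both service methods are hypothetical names made up for this illustration; only Find() and SaveChanges() are actual EF6 calls.

```csharp
using System;
using System.Data.Entity;

// Hypothetical entity and context, for illustration only.
public class User
{
    public Guid Id { get; set; }
    public bool IsPremiumUser { get; set; }
    public string Email { get; set; }
}

public class MyDbContext : DbContext
{
    public DbSet<User> Users { get; set; }
}

public class UserService
{
    // Shared by every business transaction that runs through this service.
    private readonly MyDbContext _sharedContext;

    public UserService(MyDbContext sharedContext)
    {
        _sharedContext = sharedContext;
    }

    // Business transaction 1: fails after modifying a tracked entity.
    public void UpgradeToPremium(Guid userId, bool paymentSucceeded)
    {
        var user = _sharedContext.Users.Find(userId);
        user.IsPremiumUser = true; // pending change, tracked by the context

        if (!paymentSucceeded)
        {
            // SaveChanges() is never reached, but the modified entity
            // stays in the shared context's change tracker.
            throw new InvalidOperationException("Payment failed.");
        }

        _sharedContext.SaveChanges();
    }

    // Business transaction 2: unrelated to the first one.
    public void ChangeEmail(Guid userId, string email)
    {
        var user = _sharedContext.Users.Find(userId); // same tracked instance
        user.Email = email;

        // Persists ALL pending changes, including IsPremiumUser = true
        // left behind by the failed upgrade.
        _sharedContext.SaveChanges();
    }
}
```

If a caller catches the exception thrown by UpgradeToPremium() and later calls ChangeEmail() on the same service instance, the premium flag ends up in the database even though the upgrade failed. With one context per business transaction, the failed change would have been discarded along with its context.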
Finally, managing your DbContext instance lifetime outside of your services tends to tie your application to a specific infrastructure, making it a lot less flexible and much more difficult to evolve and maintain in the long run.
For example, for an application that starts off as a simple web application and relies on an instance-per-web-request strategy to manage the lifetime of its DbContext instances, it's easy to fall into the trap of relying on lazy-loading in controllers or views or on passing persistent entities across service methods on the assumption that they will all use the same DbContext instance behind the scenes. When the need to introduce multi-threading or move operations to background Windows Services inevitably arises, this carefully constructed sand castle often collapses as there are no more web requests to bind DbContext instances to.
As a result, it's advisable to avoid managing the lifetime of DbContext instances separately from business transactions. Instead, each service method (i.e. each business transaction) should create its own DbContext instance and dispose it at the end of the business transaction (i.e. before returning).
This precludes using lazy-loading outside of services (which can be addressed by modeling your domain using DDD or by getting services to return DTOs instead of persistent entities) and poses a few other constraints (e.g. you shouldn't pass persistent entities into a service method as they won't be attached to the DbContext instance that the service will use). But it brings a lot of long-term benefits for the flexibility and maintenance of the application.
Your services must be in control of the database transaction scope and isolation level
If your application works against an RDBMS that provides ACID properties for its transactions (and if you're using Entity Framework, you almost certainly are), it's essential for your services to be in control of the database transaction scope and isolation level. You can't write correct code otherwise.
As we'll see later, Entity Framework wraps all write operations within an explicit database transaction by default. Coupled with a READ COMMITTED isolation level - the default on SQL Server - this suits the needs of most business transactions. This is especially the case if you rely on optimistic concurrency to detect and avoid conflicting updates.
Most applications however will still occasionally need to use other isolation levels for specific operations.
It's very common for example to execute reporting queries where you have determined that dirty reads aren't an issue under a READ UNCOMMITTED isolation level in order to eliminate lock contention with other queries (although if your environment allows it, you'll probably want to use READ COMMITTED SNAPSHOT instead).
And some business rules might require the use of the REPEATABLE READ or even SERIALIZABLE isolation levels (especially if your application uses pessimistic concurrency control). In that case, the service will need to have explicit control over the transaction scope.
The way your DbContext is managed should be independent of the architecture of the application
The architecture of a software system and the design patterns it relies on always evolve over time to adapt to new constraints, business requirements and increasing load.
You don't want the strategy you choose to manage the lifetime of your DbContext to tie you to a specific architecture and prevent you from being able to evolve it as and when needed.
The way your DbContext is managed should be independent of the application type
While most applications today start off as web applications, the strategy you choose to manage the lifetime of your DbContext shouldn't assume that your service method will be called from within the context of a web request. More generally, your service layer (if you have one) should be independent of the type of application it's used from.
It won't be long until you need to create command-line utilities for your support team to execute ad-hoc maintenance tasks or Windows Services to handle scheduled tasks and long-running background operations. When this happens, you want to be able to reference the assembly that contains your services and just use any service you need from your console or Windows Service application. You most definitely don't want to have to completely re-engineer the way your DbContext instances are managed just to be able to use your services from a different type of application.
Your DbContext management strategy should support multiple DbContext-derived types
If your application needs to connect to multiple databases (for example if it uses separate reporting, logging and / or auditing databases) or if you have split your domain model into multiple aggregate groups, you will have to manage multiple DbContext-derived types.
For those coming from an NHibernate background, this is the equivalent of having to manage multiple SessionFactory instances.
Whatever strategy you choose should let services use the appropriate DbContext for their needs.
Your DbContext management strategy should work with EF6's async workflow
In .NET 4.5, ADO.NET introduced (at very long last) support for async database queries. Async support was then included in Entity Framework 6, allowing you to use a fully async workflow for all read and write queries made through EF.
Needless to say, whatever system you use to manage your DbContext instances must play well with Entity Framework's async features.
DbContext's default behaviour
In general, DbContext's default behaviour can be described as: "does the right thing by default".
There are several key behaviours of Entity Framework you should always keep in mind however. This list documents EF's behaviour when working against SQL Server. There might be differences when using other data stores.
DbContext is not thread-safe
You must never access your DbContext-derived instance from multiple threads simultaneously. This might result in multiple queries being sent concurrently over the same database connection. It will also corrupt the first-level cache that DbContext maintains to offer its Identity Map, change tracking and Unit of Work functionalities.
In a multi-threaded application, you must create and use a separate instance of your DbContext-derived class in each thread.
So if DbContext isn't thread-safe, how can it support the async query features introduced with EF6? Simply by preventing more than one async operation being executed at any given time (as documented in the Entity Framework specifications for its async pattern support). If you attempt to execute multiple actions on the same DbContext instance in parallel, for example by kicking off multiple SELECT queries in parallel via the DbSet<T>.ToListAsync() method, you will get a NotSupportedException with the following message:
A second operation started on this context before a previous asynchronous operation completed. Use 'await' to ensure that any asynchronous operations have completed before calling another method on this context. Any instance members are not guaranteed to be thread safe.
Entity Framework's async features are there to support an asynchronous programming model, not to enable parallelism.
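If you do need several queries to run at the same time, each one has to work against its own context instance. The helper below is a minimal sketch of that idea and deliberately has no dependency on Entity Framework: TContext stands in for your DbContext-derived class, contextFactory for whatever creates it, and each entry in queries for an async call such as ToListAsync(). The ParallelQueries class and its method name are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelQueries
{
    // Runs every query against its own context instance so that no
    // context is ever used by two operations at the same time.
    public static async Task<TResult[]> RunEachOnOwnContextAsync<TContext, TResult>(
        Func<TContext> contextFactory,
        IEnumerable<Func<TContext, Task<TResult>>> queries)
        where TContext : IDisposable
    {
        var tasks = queries
            .Select(async query =>
            {
                // One context per query, disposed as soon as the query completes.
                using (var context = contextFactory())
                {
                    return await query(context).ConfigureAwait(false);
                }
            })
            .ToList(); // ToList() starts all the queries before any is awaited.

        // Results come back in the same order as the queries.
        return await Task.WhenAll(tasks).ConfigureAwait(false);
    }
}
```

Entities loaded this way are tracked by different, already disposed contexts, so this only suits read-only work whose results are projected or used as plain data. Within a single business transaction that writes, awaiting each query in turn on one context remains the safe option.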
Changes are only persisted when SaveChanges() is called
Any changes made to your entities, be it updates, inserts or deletes, are only persisted to the database when the DbContext.SaveChanges() method is called. If a DbContext instance is disposed before its SaveChanges() method was called, none of the inserts, updates or deletes done through this DbContext will be persisted to the underlying data store.
The canonical manner to implement a business transaction with Entity Framework is therefore:
    using (var context = new MyDbContext(ConnectionString))
    {
        /*
         * Business logic here. Add, update, delete data
         * through the 'context'.
         *
         * Throw in case of any error to roll back all
         * changes.
         *
         * Do not call SaveChanges() until the business
         * transaction is complete - i.e. no partial or
         * intermediate saves. SaveChanges() must be
         * called exactly once per business transaction.
         *
         * If you find yourself needing to call SaveChanges()
         * multiple times within a business transaction, it means
         * that you are in fact implementing multiple business
         * transactions within a single service method.
         * This is the perfect recipe for disaster. Clients of
         * your service class will naturally assume that your
         * service method will either commit or roll-back all
         * changes in an atomic manner when it might in fact
         * end up doing a partial roll-back, leaving the system
         * in an inconsistent state.
         *
         * In this case, refactor your service method into
         * multiple service methods that each implement one
         * and exactly one business transaction.
         */

        [...]

        // Complete the business transaction
        // and persist all changes.
        context.SaveChanges();

        // Changes cannot be rolled back after this point.
        // context.SaveChanges() should be the last statement
        // of any business transaction.
    }
A side note for NHibernate veterans
If you're coming from an NHibernate background, the way Entity Framework persists changes to the database is one of the major differences between EF and NHibernate.
In NHibernate, the Session operates by default in AutoFlush mode. In this mode, the Session will automatically persist all changes made to entities to the database before executing any 'select' query, ensuring consistency between the persisted entities and their in-memory state within the context of a Session. Entity Framework's default behaviour is the equivalent of setting Session.FlushMode to Never in NHibernate.
This EF behaviour can result in subtle bugs as it is possible to be in a situation where queries may unexpectedly return stale or incorrect data. This wouldn't be possible with NHibernate's default behaviour. On the other hand, it dramatically simplifies the issue of database transaction lifetime management.
One of the trickiest issues in NHibernate is to correctly manage the database transaction lifetime. Since NHibernate's Session can persist outstanding changes to the database automatically at any time throughout its lifetime and may do so multiple times within a single business transaction, there is no single, well-defined point or method where to start the database transaction to ensure that all changes are either committed or rolled-back in an atomic manner.
The only reliable method to correctly manage the database transaction lifetime with NHibernate is to wrap all your service methods in an explicit database transaction. This is what you'll see done in pretty much every NHibernate-based application.
A side-effect of this approach is that it requires keeping a database connection and transaction open for often longer than strictly necessary. It therefore increases database lock contention and the probability of database deadlocks occurring. It's also very easy for a developer to inadvertently execute a long-running computation or a remote service call without realizing or even knowing that they're within the context of an open database transaction.
With the EF approach, only the SaveChanges() method must be wrapped in an explicit database transaction (unless you need a REPEATABLE READ or SERIALIZABLE isolation level of course), ensuring that the database connection and transaction are kept as short-lived as possible.
Reads are executed within an AutoCommit transaction
DbContext doesn't start explicit database transactions for read queries. It instead relies on SQL Server's Autocommit Transactions (or Implicit Transactions if you've enabled them but that would be a relatively unusual setup). Autocommit (or Implicit) transactions will use whatever default transaction isolation level the database engine has been configured to use (READ COMMITTED by default for SQL Server).
If you've been around the block for a while, and particularly if you've used NHibernate before, you may have heard that AutoCommit (or Implicit) transactions are bad. And indeed, relying on Autocommit transactions for writes can have a disastrous impact on performance.
The story is very different for reads however. As you can see for yourself by running the SQL scripts below, neither Autocommit nor Implicit transactions have any significant performance impact for SELECT statements.
    /*
     * Execute 100,000 SELECT queries under autocommit,
     * implicit and explicit database transactions.
     *
     * These scripts assume that the database they are
     * running against contains a Users table with an 'Id'
     * column of data type INT.
     *
     * If running from SQL Server Management Studio,
     * right-click in the query window, go to
     * Query Options -> Results and tick "Discard results
     * after execution". Otherwise, what you'll be measuring
     * will be the Result Grid redrawing performance and not
     * the query execution time.
     */

    ---------------------------------------------------
    -- Autocommit transaction
    -- 6 seconds
    DECLARE @i INT
    SET @i = 0
    WHILE @i < 100000
    BEGIN
        SELECT Id FROM dbo.Users WHERE Id = @i
        SET @i = @i + 1
    END

    ---------------------------------------------------
    -- Implicit transaction
    -- 6 seconds
    SET IMPLICIT_TRANSACTIONS ON
    DECLARE @i INT
    SET @i = 0
    WHILE @i < 100000
    BEGIN
        SELECT Id FROM dbo.Users WHERE Id = @i
        SET @i = @i + 1
    END
    COMMIT;
    SET IMPLICIT_TRANSACTIONS OFF

    ----------------------------------------------------
    -- Explicit transaction
    -- 6 seconds
    DECLARE @i INT
    SET @i = 0
    BEGIN TRAN
    WHILE @i < 100000
    BEGIN
        SELECT Id FROM dbo.Users WHERE Id = @i
        SET @i = @i + 1
    END
    COMMIT TRAN
Obviously, if you need to use an isolation level higher than the default READ COMMITTED, all reads will need to be part of an explicit database transaction. In that case, you will have to start the transaction yourself - EF will not do this for you. But this would typically only be done on an ad-hoc basis for specific business transactions. Entity Framework's default behaviour should suit the vast majority of business transactions.
Writes are executed within an explicit transaction
Entity Framework automatically wraps all the queries made by the DbContext.SaveChanges() method in a single explicit database transaction, therefore ensuring that all the changes applied to the context are either committed or rolled-back in full.
It will use whatever default transaction isolation level the database engine has been configured to use (READ COMMITTED by default for SQL Server).
A side note for NHibernate veterans
This is another major difference between EF and NHibernate. With NHibernate, database transactions are entirely in the hands of developers. NHibernate's Session will never start an explicit database transaction automatically.
You can override EF's default behaviour and control the database transaction scope and isolation level
With Entity Framework 6, taking explicit control of the database transaction scope and isolation level is as simple as it should be:

    using (var context = new MyDbContext(ConnectionString))
    {
        // BeginTransaction() is exposed on the context's Database property.
        using (var transaction = context.Database.BeginTransaction(IsolationLevel.RepeatableRead))
        {
            [...]
            context.SaveChanges();
            transaction.Commit();
        }
    }
An obvious side-effect of manually controlling the database transaction scope is that you are now forcing the database connection and transaction to remain open for the duration of the transaction scope.
You should be careful to keep this scope as short-lived as possible. Keeping a database transaction running for too long can have a significant impact on your application's performance and scalability. In particular, it's generally a good idea to refrain from calling other service methods within an explicit transaction scope - they might be executing long-running operations unaware that they have been invoked within an open database transaction scope.
There's no built-in way to override the default isolation level used for AutoCommit and automatic explicit transactions
As mentioned earlier, the AutoCommit transactions EF relies on for read queries and the explicit transaction it automatically starts when SaveChanges() is called use whatever default isolation level the database engine has been configured with.
There's unfortunately no built-in way to override this isolation level. If you'd like to use another isolation level, you must start and manage the database transaction yourself.
The database connection opened by DbContext will enlist in an ambient TransactionScope
Alternatively, you can also use the TransactionScope class to control the transaction scope and isolation level. The database connection that Entity Framework opens will enlist in the ambient TransactionScope.
Prior to EF6, using TransactionScope was the only practical way to control the database transaction scope and isolation level.
In practice, and unless you actually need a distributed transaction, you should avoid using TransactionScope. TransactionScope, and distributed transactions in general, are not necessary for most applications and tend to introduce more problems than they solve. EF's documentation has more details on working with TransactionScope with Entity Framework if you really need distributed transactions.
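If you do end up needing a TransactionScope, two details are easy to get wrong: a TransactionScope created with its default constructor uses the SERIALIZABLE isolation level, and the ambient transaction only follows async continuations when async flow is explicitly enabled (available from .NET 4.5.1). The sketch below shows one way to set both; the TransactionScopeSketch class and its CreateScope() method are hypothetical names used only for this example.

```csharp
using System;
using System.Transactions;

public static class TransactionScopeSketch
{
    // Creates an ambient transaction with an explicit isolation level
    // instead of TransactionScope's SERIALIZABLE default.
    public static TransactionScope CreateScope(IsolationLevel isolationLevel)
    {
        var options = new TransactionOptions
        {
            IsolationLevel = isolationLevel,
            Timeout = TransactionManager.DefaultTimeout
        };

        return new TransactionScope(
            TransactionScopeOption.Required,
            options,
            // Lets the ambient transaction flow across 'await' points.
            TransactionScopeAsyncFlowOption.Enabled);
    }
}
```

A service method would create its context inside a using block over CreateScope(...), call SaveChanges() on it and then call Complete() on the scope as its last statement. Disposing the scope without calling Complete() rolls the transaction back.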
DbContext instances should be disposed of (but you'll probably be OK if they're not)
DbContext implements IDisposable. Its instances should therefore be disposed of as soon as they're not needed anymore.
In practice however, and unless you choose to explicitly manage the database connection or transaction that the DbContext uses, not calling DbContext.Dispose() won't cause any issues as Diego Vega, an EF team member, explains.
This is good news as a lot of the code you'll find in the wild fails to dispose of DbContext instances properly. This is particularly the case for code that attempts to manage DbContext instance lifetimes via a DI container, which can be a lot trickier than it sounds.
A DI container like StructureMap for example doesn't support decommissioning the components it created. As a result, if you rely on StructureMap to create your DbContext instances, they will never be disposed of, regardless of what lifecycle you choose for them. The only correct way to manage disposable components with a DI container like this is to significantly complicate your DI configuration and use nested dependency injection containers as Jeremy Miller demonstrates.
Ambient DbContext vs Explicit DbContext vs Injected DbContext
A key decision you'll have to make at the start of any Entity Framework-based project is how your code will handle passing the DbContext instances down to the method / layer that will make the actual database queries.
As we've seen above, the responsibility of creating and disposing the DbContext lies with the top-level service methods. The data access code, i.e. the code that actually uses the DbContext instance, will however often live in a separate part of the code - be it in a private method deep down the service implementation, in a query object or in a separate repository layer.
The DbContext instance that the top-level service method creates must therefore somehow find its way down to these methods.
There are three schools of thought when it comes to making the DbContext instance available to the data access code: ambient, explicit or injected. Each approach has its pros and cons, which we'll examine now.
Explicit DbContext
What it looks like
With the explicit DbContext approach, the top-level service method creates a DbContext instance and simply passes it down the stack as a method parameter until it finally reaches the method that implements the data access part. In a traditional 3-tier architecture with both a service and a repository layer, this would look like this:
    public class UserService : IUserService
    {
        private readonly IUserRepository _userRepository;

        public UserService(IUserRepository userRepository)
        {
            if (userRepository == null) throw new ArgumentNullException("userRepository");
            _userRepository = userRepository;
        }

        public void MarkUserAsPremium(Guid userId)
        {
            using (var context = new MyDbContext())
            {
                var user = _userRepository.Get(context, userId);
                user.IsPremiumUser = true;
                context.SaveChanges();
            }
        }
    }

    public class UserRepository : IUserRepository
    {
        public User Get(MyDbContext context, Guid userId)
        {
            return context.Set<User>().Find(userId);
        }
    }
(In this intentionally contrived example, the repository layer is of course completely pointless. In a real-world application, you would expect the repository layer to be a lot richer. In addition, you could of course abstract your DbContext behind an "IDbContext" of sorts and create it via an abstract factory if you really didn't want to have a direct dependency on Entity Framework in your services. The principle would remain the same.)
The Good
This approach is by far and away the simplest approach. It results in code that's very easy to understand and maintain, even by developers new to the code base.
There's no magic anywhere. The DbContext instance doesn't materialize out of thin air. There's a clear and obvious place where the context is created. And it's really easy to climb up the stack and find it if you're wondering where a particular DbContext instance is coming from.
The Bad
The main drawback of this approach is that it requires you to pollute all your repository methods (if you have a repository layer) as well as most of your service methods with a mandatory DbContext parameter (or some sort of IDbContext abstraction if you don't want to be tied to a concrete implementation - but the point still stands). You could see this as being a sort of Method Injection pattern.
That your repository methods need to be provided with an explicit DbContext parameter isn't too much of an issue. In fact, it can even be seen as a good thing as it removes any potential ambiguity as to which context they'll run their queries against.
Things are quite different in your service layer however. Chances are that most of your service methods won't use the DbContext at all, particularly if you've isolated your data access code away in query objects or in a repository layer. As a result, these methods will only need to be provided with a DbContext parameter so that they can pass it down the line until it eventually reaches whatever method actually uses it.
It can get quite ugly. Particularly if your application uses multiple DbContext types, resulting in service methods potentially requiring two or more mandatory DbContext parameters. It also muddies your method contracts as your service methods are now forced to ask for a parameter that they neither need nor use but require purely to satisfy the dependency of a downstream method.
Jon Skeet wrote an interesting article on the topic of explicitness vs ambient but couldn't come up with a good solution either.
Nevertheless, the simplicity and foolproofness of this approach is hard to beat.
Ambient DbContext
What it looks like
NHibernate users will be very familiar with this approach as the ambient context pattern is the predominant approach used in the NHibernate world to manage NH's Session (NHibernate's equivalent to EF's DbContext). NHibernate even comes with built-in support for this pattern, which it calls contextual sessions.
In .NET itself, this pattern is used quite extensively. You've probably already used HttpContext.Current or the TransactionScope class, both of which rely on the ambient context pattern.
With this approach, the top-level service method not only creates the DbContext to use for the current business transaction but it also registers it as the ambient DbContext. The data access code can then just retrieve the ambient DbContext whenever it needs it. No need to pass the DbContext instance around anymore.
Anders Abel has written a simple implementation of an ambient DbContext that relies on a ThreadStatic variable to store the ambient DbContext. Have a look - there's less to it than it sounds.
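To give a rough idea of the mechanics, here is a minimal sketch of a ThreadStatic-based ambient scope. It is not Anders Abel's implementation: the AmbientScope<TContext> class is a hypothetical helper written for this illustration, and TContext stands in for a DbContext-derived class.

```csharp
using System;

// Registers a context as the ambient one for the current thread
// for as long as the scope is alive.
public sealed class AmbientScope<TContext> : IDisposable
    where TContext : class, IDisposable
{
    // One slot per thread (and per closed generic type).
    [ThreadStatic]
    private static AmbientScope<TContext> _current;

    private readonly AmbientScope<TContext> _parent;
    private readonly TContext _context;

    public AmbientScope(TContext context)
    {
        if (context == null) throw new ArgumentNullException("context");

        _context = context;
        _parent = _current; // remember any enclosing scope
        _current = this;
    }

    // What the data access code calls to get hold of the context.
    // Returns null when no scope has been opened on this thread.
    public static TContext Current
    {
        get { return _current == null ? null : _current._context; }
    }

    public void Dispose()
    {
        _current = _parent; // restore the enclosing scope, if any
        _context.Dispose();
    }
}
```

A top-level service method would wrap its body in a using block over new AmbientScope<MyDbContext>(new MyDbContext()), and a repository would read AmbientScope<MyDbContext>.Current instead of taking a context parameter.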
The Good
The advantages of this approach are obvious. Your service and repository methods are now free of DbContext parameters, making your interfaces cleaner and your method contracts clearer as they can now only request the parameters that they actually need to do their job. No need to pass DbContext instances all over the place anymore.
As with the explicit approach, the creation and disposal of the DbContext instance is in a clear, well-defined and logical place.
The Bad
This approach does however introduce a certain amount of magic which can certainly make the code more difficult to understand and maintain. When looking at the data access code, it's not necessarily easy to figure out where the ambient DbContext is coming from. You just have to hope that someone somehow registered it before calling the data access code.
If your application uses multiple DbContext classes, e.g. if it connects to multiple databases or if you have split your domain model into separate model groups, it can be difficult for the top-level service method to know which DbContext object(s) it must create and register. With the explicit approach, the data access methods must be provided with whatever DbContext object they need as a method parameter. There is therefore no ambiguity possible. But with an ambient context approach, the top-level service method must somehow know what DbContext type the downstream data access code will require. There are ways to solve this issue in a fairly clean manner however as we'll see later.
Finally, the ambient DbContext example I linked to above works fine in a single-threaded model. But if you intend to use Entity Framework's async query feature, this won't fly. After an async operation, you will most likely find yourself in another thread than the one where the DbContext was created. In many cases (although not in all cases - this is where async gets tricky), it means that your ambient DbContext will be gone. This is fixable as well but it will require some advanced understanding of how multi-threading, the TPL and async work behind the scenes in .NET. We'll have a look at this later in this post.
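The difference between thread-bound and flow-bound storage can be seen in isolation with the small sketch below. It uses AsyncLocal<T>, which requires .NET Framework 4.6 or later and is chosen here purely for illustration; it is not necessarily the mechanism this post relies on. The AmbientStorageDemo class is a made-up name, and a dedicated thread plays the role of "the thread an async continuation may resume on" so that the outcome is deterministic.

```csharp
using System;
using System.Threading;

public static class AmbientStorageDemo
{
    // Bound to the thread that assigned it.
    [ThreadStatic]
    private static string _threadStaticValue;

    // Bound to the logical flow of execution (travels with the ExecutionContext).
    private static readonly AsyncLocal<string> AsyncLocalValue = new AsyncLocal<string>();

    // Registers a value in both stores, then reads them back from another thread.
    // Returns { value seen via ThreadStatic, value seen via AsyncLocal }.
    public static string[] ReadFromAnotherThread()
    {
        _threadStaticValue = "registered";
        AsyncLocalValue.Value = "registered";

        var seen = new string[2];
        var worker = new Thread(() =>
        {
            seen[0] = _threadStaticValue;    // null: this thread has its own empty slot
            seen[1] = AsyncLocalValue.Value; // "registered": flowed to the new thread
        });
        worker.Start();
        worker.Join();

        return seen;
    }
}
```

An ambient DbContext stored the first way disappears as soon as execution hops threads, while one stored the second way follows the logical call flow. Note that flowing the reference does not make the DbContext itself thread-safe: it must still only be used by one operation at a time.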
Injected DbContext
What it looks like
Last but not least, the injected DbContext approach is the most often mentioned strategy in articles and blog posts addressing the issue of managing the DbContext lifetime.
With this approach, you let your DI container manage the lifetime of your DbContext and inject it into whatever component needs it (your repository objects for example).
This is what it looks like:
    public class UserService : IUserService
    {
        private readonly IUserRepository _userRepository;

        public UserService(IUserRepository userRepository)
        {
            if (userRepository == null) throw new ArgumentNullException("userRepository");
            _userRepository = userRepository;
        }

        public void MarkUserAsPremium(Guid userId)
        {
            // No context in sight: the repository uses the one it was injected with.
            var user = _userRepository.Get(userId);
            user.IsPremiumUser = true;
        }
    }

    public class UserRepository : IUserRepository
    {
        private readonly MyDbContext _context;

        public UserRepository(MyDbContext context)
        {
            if (context == null) throw new ArgumentNullException("context");
            _context = context;
        }

        public User Get(Guid userId)
        {
            return _context.Set<User>().Find(userId);
        }
    }
You then need to configure your DI container to create an instance of the DbContext with an appropriate lifetime on object graph creation. A common piece of advice you'll find is to use a PerWebRequest lifetime for web apps and a PerForm lifetime for desktop apps.
The Good
The advantage here is similar to that of the ambient approach: the code isn't littered with DbContext instances being passed all over the place. This approach goes one step further still: there is no DbContext to be seen anywhere in the service code. The service is completely oblivious of Entity Framework. Which might sound good at first sight but quickly leads to a lot of problems.
The Bad
Despite its popularity, this approach has significant drawbacks and limitations. It's important to understand them before adopting this approach.
A lot of magic
The first issue is that this approach relies very heavily on magic. And when it comes to managing the correctness and consistency of your data - your most precious asset - magic isn't a word you want to hear too often.
Where do these DbContext instances come from? How and where is the business transaction boundary defined? If a service depends on two different repositories, will they both have access to the same DbContext instance or will they each have their own instance?
If you're a back-end developer working on an EF-based project, you must know the answers to these questions if you want to be able to write correct code.
The answers here aren't obvious and will require you to pore through your DI container configuration code to find out. And as we've seen earlier, getting this configuration right isn't as trivial as it may seem at first sight and may end up being fairly complex and / or subtle.
Unclear business transaction boundaries
Perhaps the most glaring issue in the code sample above is: who is responsible for committing changes to the data store? I.e. who is calling the DbContext.SaveChanges() method? It's unclear.
You could inject the DbContext into your service for the sole purpose of calling its SaveChanges() method. That would be rather baffling and very error-prone code. Why would the service method call SaveChanges() on a context object that it neither created nor used? What changes would be saved?
Alternatively, you could define a SaveChanges() method on all your repositories, which would just delegate to the underlying DbContext. The service method would then just call SaveChanges() on the repository itself. This would be very misleading code, as it would imply that each repository implements its own unit-of-work and can persist its changes independently of the other repositories. Which would of course be incorrect as they would in fact all use the same DbContext instance behind the scenes.
Another approach sometimes seen in the wild is to let the DI container call SaveChanges() before decommissioning the DbContext instance. A disastrous approach that would merit a blog post of its own to examine.
In short: the DI container is an infrastructure-level component - it has no knowledge of the business logic the components it manages implement. The DbContext.SaveChanges() method on the other hand defines a business transaction boundary - i.e. it's a business logic concern (and a critical one at that). Mixing those two unrelated concerns together will quickly cause a lot of pain.
All that being said, if you subscribe to the "Repository is Dead" movement, the issue of defining who is calling DbContext.SaveChanges() shouldn't arise as your services will use the DbContext instance directly. They will therefore be the natural place for SaveChanges() to be called.
There are however a number of other issues you will run into with an injected DbContext regardless of the architectural style of your application.
Forces your services to become stateful
A notable one is that DbContext isn't a service. It's a resource. And a disposable one to boot. By injecting it into whatever layer implements your data access, you're making that layer, and by extension all the layers above it (which is pretty much the entire application), stateful.
It's not the end of the world but it certainly complicates DI container configuration. Having stateless services provides tremendous flexibility and makes the configuration of their lifetime a non-issue (any lifetime would do and singleton is often your best bet). As soon as you introduce stateful services, careful consideration has to be given to your service lifetimes.
It often starts off easy (PerWebRequest or Transient lifetime for everything which suits a simple web app well) and then descends into more complexity as console apps, Windows Services and others inevitably make their appearance.
Prevents multi-threading
Another issue (related to the previous one) that will inevitably bite you quite hard is that an injected DbContext prevents you from being able to introduce multi-threading or any sort of parallel execution flows in your services.
Remember that DbContext (just like Session in NHibernate) isn't thread-safe. If you need to execute multiple tasks in parallel in a service, you must make sure that each task works against its own DbContext instance or the whole thing will blow up at runtime. This is impossible to do with the injected DbContext approach since the service isn't in control of the DbContext instance creation and doesn't have any way to create new ones.
How can you fix this? Not easily.
Your first instinct is probably to change your services to depend on a DbContext factory instead of depending directly on a DbContext. That would allow them to create their own DbContext instances when needed. But that would effectively defeat the whole point of the injected DbContext approach. If services create their own DbContext instances via a factory, these instances can't be injected anymore. Which means that services will have to explicitly pass those DbContext instances down the layers to whatever components need them (e.g. the repositories). So you're effectively back to the explicit DbContext approach discussed earlier. I can think of a few ways in which this could be solved but all of them feel more like hacks than clean and elegant solutions.
Another way to approach the issue would be to add a few more layers of complexity, introduce a queuing middleware like RabbitMQ and let it distribute the workload for you. Which may or may not work depending on why you need to introduce parallelism. But in any case, you may neither need nor want the additional overhead and complexity.
With an injected DbContext, you're simply better off limiting yourself to single-threaded code or at least to a single logical flow of execution. Which is perfectly fine for many applications but it will become a major limitation in certain cases.
DbContextScope: a simple, correct and flexible way to manage DbContext instances
Time to look at a better way to manage those DbContext instances.
The approach presented below relies on DbContextScope, a custom component that implements the ambient DbContext approach presented earlier. The full source code for DbContextScope and the classes it depends on is on GitHub.
If you're familiar with the TransactionScope class, then you already know how to use a DbContextScope. They're very similar in essence - the only difference is that DbContextScope creates and manages DbContext instances instead of database transactions. But just like TransactionScope, DbContextScope is ambient, can be nested, can have its nesting behaviour disabled and works fine with async execution flows.
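If you haven't used TransactionScope before, the sketch below shows the behaviours listed above using only System.Transactions: the ambient transaction is exposed via Transaction.Current, a nested scope joins its parent, and the ambient transaction can be suppressed. The names AmbientTransactionDemo, Observe and Describe are made up for this illustration, no database is involved, and the sketch assumes it is called with no ambient transaction already in place.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

public static class AmbientTransactionDemo
{
    // Records what Transaction.Current looks like at each step.
    public static List<string> Observe()
    {
        var log = new List<string>();
        log.Add("before: " + Describe());

        // Enabling async flow lets the ambient transaction follow awaits.
        using (var outer = new TransactionScope(
            TransactionScopeOption.Required,
            TransactionScopeAsyncFlowOption.Enabled))
        {
            var outerId = Transaction.Current.TransactionInformation.LocalIdentifier;
            log.Add("outer: " + Describe());

            // "Required" joins the ambient transaction if there is one.
            using (var inner = new TransactionScope(
                TransactionScopeOption.Required,
                TransactionScopeAsyncFlowOption.Enabled))
            {
                var innerId = Transaction.Current.TransactionInformation.LocalIdentifier;
                log.Add("inner joined outer: " + (innerId == outerId));
                inner.Complete();
            }

            // "Suppress" hides the ambient transaction inside this block.
            using (var suppressed = new TransactionScope(
                TransactionScopeOption.Suppress,
                TransactionScopeAsyncFlowOption.Enabled))
            {
                log.Add("suppressed: " + Describe());
                suppressed.Complete();
            }

            outer.Complete();
        }

        log.Add("after: " + Describe());
        return log;
    }

    private static string Describe()
    {
        return Transaction.Current == null
            ? "no ambient transaction"
            : "ambient transaction";
    }
}
```

Calling AmbientTransactionDemo.Observe() returns the log of observations. DbContextScope follows the same shape, with DbContext instances taking the place of the transaction.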
This is the DbContextScope interface:
public interface IDbContextScope : IDisposable
{
    void SaveChanges();
    Task SaveChangesAsync();

    void RefreshEntitiesInParentScope(IEnumerable entities);
    Task RefreshEntitiesInParentScopeAsync(IEnumerable entities);

    IDbContextCollection DbContexts { get; }
}
The purpose of a DbContextScope is to create and manage the DbContext instances used within a code block. A DbContextScope therefore effectively defines the boundary of a business transaction. I'll explain later why I didn't name it "UnitOfWork" or "UnitOfWorkScope", which would have been a more commonly used terminology for this.
You can instantiate a DbContextScope directly. Or you can take a dependency on IDbContextScopeFactory, which provides convenience methods to create a DbContextScope with the most common configurations:
public interface IDbContextScopeFactory
{
    IDbContextScope Create(DbContextScopeOption joiningOption = DbContextScopeOption.JoinExisting);
    IDbContextReadOnlyScope CreateReadOnly(DbContextScopeOption joiningOption = DbContextScopeOption.JoinExisting);

    IDbContextScope CreateWithTransaction(IsolationLevel isolationLevel);
    IDbContextReadOnlyScope CreateReadOnlyWithTransaction(IsolationLevel isolationLevel);

    IDisposable SuppressAmbientContext();
}
Typical usage
With DbContextScope, your typical service method would look like this:
public void MarkUserAsPremium(Guid userId)
{
    using (var dbContextScope = _dbContextScopeFactory.Create())
    {
        var user = _userRepository.Get(userId);
        user.IsPremiumUser = true;
        dbContextScope.SaveChanges();
    }
}
Within a DbContextScope, you can access the DbContext instances that the scope manages in two ways. You can get them via the DbContextScope.DbContexts property like this:
public void SomeServiceMethod(Guid userId)
{
    using (var dbContextScope = _dbContextScopeFactory.Create())
    {
        // Get<T>() and Set<T>() are method calls and need their parentheses.
        var user = dbContextScope.DbContexts.Get<MyDbContext>().Set<User>().Find(userId);
        [...]
        dbContextScope.SaveChanges();
    }
}
But that's of course only available in the method that created the DbContextScope. If you need to access the ambient DbContext instances anywhere else (e.g. in a repository class), you can just take a dependency on IAmbientDbContextLocator, which you would use like this:
public class UserRepository : IUserRepository
{
    private readonly IAmbientDbContextLocator _contextLocator;

    public UserRepository(IAmbientDbContextLocator contextLocator)
    {
        if (contextLocator == null) throw new ArgumentNullException("contextLocator");
        _contextLocator = contextLocator;
    }

    public User Get(Guid userId)
    {
        // Resolves the ambient MyDbContext instance of the current scope.
        return _contextLocator.Get<MyDbContext>().Set<User>().Find(userId);
    }
}
Those DbContext instances are created lazily and the DbContextScope keeps track of them to ensure that only one instance of any given DbContext type is ever created within its scope.
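To make "created lazily, one instance per type" concrete, here is a minimal sketch of a collection that behaves that way. It is not the actual DbContextScope implementation: LazyInstanceCollection, FakeMainContext and FakeReportingContext are hypothetical names, and the fake context classes stand in for real DbContext types so that the sketch compiles without Entity Framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the collection of contexts a scope could own.
public sealed class LazyInstanceCollection : IDisposable
{
    private readonly Dictionary<Type, IDisposable> _instances =
        new Dictionary<Type, IDisposable>();
    private bool _disposed;

    // Number of distinct context types instantiated so far.
    public int Count { get { return _instances.Count; } }

    // Creates the instance on first request, then keeps returning that instance.
    public T Get<T>() where T : class, IDisposable, new()
    {
        if (_disposed) throw new ObjectDisposedException("LazyInstanceCollection");

        IDisposable existing;
        if (!_instances.TryGetValue(typeof(T), out existing))
        {
            existing = new T();
            _instances.Add(typeof(T), existing);
        }
        return (T)existing;
    }

    // Disposes every instance that was created within this collection.
    public void Dispose()
    {
        if (_disposed) return;
        foreach (var instance in _instances.Values) instance.Dispose();
        _instances.Clear();
        _disposed = true;
    }
}

// Fake context types used only for this illustration.
public sealed class FakeMainContext : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

public sealed class FakeReportingContext : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}
```

Requesting FakeMainContext twice returns the same object, requesting FakeReportingContext creates a second one, and nothing is instantiated until somebody asks for it.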
You'll note that the service method doesn't need to know which type of DbContext will be required during the course of the business transaction. It only needs to create a DbContextScope, and any component that needs to access the database within that scope will request the type of DbContext it needs.
Nesting scopes
A DbContextScope can of course be nested. Let's say that you already have a service method that can mark a user as a premium user like this:
public void MarkUserAsPremium(Guid userId)
{
    using (var dbContextScope = _dbContextScopeFactory.Create())
    {
        var user = _userRepository.Get(userId);
        user.IsPremiumUser = true;
        dbContextScope.SaveChanges();
    }
}
You're implementing a new feature that requires being able to mark a group of users as premium within a single business transaction. You can easily do it like this:
public void MarkGroupOfUsersAsPremium(IEnumerable<Guid> userIds)
{
    using (var dbContextScope = _dbContextScopeFactory.Create())
    {
        foreach (var userId in userIds)
        {
            // The child scope created by MarkUserAsPremium() will
            // join our scope. So it will re-use our DbContext instance(s)
            // and the call to SaveChanges() made in the child scope will
            // have no effect.
            MarkUserAsPremium(userId);
        }

        // Changes will only be saved here, in the top-level scope,
        // ensuring that all the changes are either committed or
        // rolled-back atomically.
        dbContextScope.SaveChanges();
    }
}
(this would of course be a very inefficient way to implement this particular feature but it demonstrates the point)
This makes creating a service method that combines the logic of multiple other service methods trivial.
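The joining behaviour can be modelled in a few lines. The toy below is only a sketch of the idea and not how DbContextScope is actually written: SketchScope, NestingDemo and CommitCount are hypothetical names, the ambient scope is held in an AsyncLocal (an assumption, requiring .NET 4.6 or later), and "saving" is reduced to incrementing a counter that is not thread-safe.

```csharp
using System;
using System.Threading;

public sealed class SketchScope : IDisposable
{
    // The scope currently in effect for this flow of execution.
    private static readonly AsyncLocal<SketchScope> Ambient =
        new AsyncLocal<SketchScope>();

    // Counts real commits, for demonstration purposes only.
    public static int CommitCount;

    private readonly SketchScope _parent;
    private readonly bool _nested;
    private bool _disposed;

    public SketchScope()
    {
        // Join the ambient scope if one exists, then become the ambient scope.
        _parent = Ambient.Value;
        _nested = _parent != null;
        Ambient.Value = this;
    }

    public void SaveChanges()
    {
        // A child scope leaves the commit to the top-level scope.
        if (_nested) return;
        CommitCount++;
    }

    public void Dispose()
    {
        if (_disposed) return;
        // Hand the ambient slot back to the parent (or clear it).
        Ambient.Value = _parent;
        _disposed = true;
    }
}

public static class NestingDemo
{
    public static void MarkOne()
    {
        using (var scope = new SketchScope())
        {
            scope.SaveChanges();
        }
    }

    public static void MarkMany(int count)
    {
        using (var scope = new SketchScope())
        {
            for (var i = 0; i < count; i++) MarkOne();
            scope.SaveChanges();
        }
    }
}
```

Calling NestingDemo.MarkMany(3) commits once, in the top-level scope, while calling NestingDemo.MarkOne() on its own commits immediately because it is then the top-level scope.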
Read-only scopes
If a service method is read-only, having to call SaveChanges() on its DbContextScope before returning can be a pain. But not calling it isn't an option either as:
It will make code review and maintenance difficult (did you intend not to call SaveChanges() or did you forget to call it?)
If you requested an explicit database transaction to be started (we'll see later how to do it), not calling SaveChanges() will result in the transaction being rolled back. Database monitoring systems will usually interpret transaction rollbacks as an indication of an application error. Having spurious rollbacks is not a good idea.
The DbContextReadOnlyScope class addresses this issue. This is its interface:
public interface IDbContextReadOnlyScope : IDisposable
{
    IDbContextCollection DbContexts { get; }
}
And this is how you use it:
public int NumberPremiumUsers()
{
    using (_dbContextScopeFactory.CreateReadOnly())
    {
        return _userRepository.GetNumberOfPremiumUsers();
    }
}
Async support
DbContextScope works with async execution flows as you would expect:
public async Task RandomServiceMethodAsync(Guid userId)
{
    using (var dbContextScope = _dbContextScopeFactory.Create())
    {
        var user = await _userRepository.GetAsync(userId);
        var orders = await _orderRepository.GetOrdersForUserAsync(userId);

        [...]

        await dbContextScope.SaveChangesAsync();
    }
}
In the example above, the OrderRepository.GetOrdersForUserAsync() method will be able to see and access the ambient DbContext instance even though it may be running on a different thread from the one where the DbContextScope was initially created.
This is made possible by the fact that DbContextScope stores itself in the CallContext. The CallContext automatically flows through async points. If you're curious about how it all works behind the scenes, Stephen Toub has written an excellent blog post about it. But if all you want to do is use DbContextScope, you just have to know that: it just works.
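To see the flow across async points for yourself, here is a small sketch. Note that it swaps in AsyncLocal<T> (available from .NET 4.6) rather than the CallContext that DbContextScope uses; both travel with the ExecutionContext, so the observable behaviour is the same for this purpose. AmbientFlowDemo, RunAsync and ReadInRepositoryAsync are hypothetical names, and a plain string stands in for the ambient scope.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class AmbientFlowDemo
{
    // Stand-in for the ambient scope; a real one would hold DbContext instances.
    private static readonly AsyncLocal<string> Ambient = new AsyncLocal<string>();

    public static async Task<string> RunAsync()
    {
        // "Open" the scope on the calling thread.
        Ambient.Value = "scope-1";

        // After this await, execution usually resumes on a thread-pool thread.
        await Task.Delay(10).ConfigureAwait(false);

        // A method further down the call chain still sees the ambient value.
        return await ReadInRepositoryAsync().ConfigureAwait(false);
    }

    private static async Task<string> ReadInRepositoryAsync()
    {
        await Task.Yield();
        return Ambient.Value;
    }
}
```

Awaiting AmbientFlowDemo.RunAsync() yields "scope-1": the value set before the first await is still visible in the repository-like method, whichever thread ends up running it.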
WARNING: There is one thing that you must always keep in mind when using any async flow with DbContextScope. Just like TransactionScope, DbContextScope only supports being used within a single logical flow of execution.
I.e. if you attempt to start multiple parallel tasks within the context of a DbContextScope (e.g. by creating multiple threads or multiple TPL Task instances), you will get into big trouble. This is because the ambient DbContextScope will flow through all the threads your parallel tasks are using. If code in these threads needs to use the database, it will all use the same ambient DbContext instance, resulting in the same DbContext instance being used from multiple threads simultaneously.
In general, parallelizing database access within a single business transaction has little to no benefit and only adds significant complexity. Any parallel operation performed within the context of a business transaction should not access the database.
However, if you really need to start a parallel task within a DbContextScope (e.g. to perform some out-of-band background processing independently from the outcome of the business transaction), then you must suppress the ambient context before starting the parallel task. Which you can easily do like this:
public void RandomServiceMethod()
{
    using (var dbContextScope = _dbContextScopeFactory.Create())
    {
        // Do some work that uses the ambient context
        [...]

        using (_dbContextScopeFactory.SuppressAmbientContext())
        {
            // Kick off parallel tasks that shouldn't be using the
            // ambient context here. E.g. create new threads,
            // enqueue work items on the ThreadPool or create
            // TPL Tasks.
            [...]
        }

        // The ambient context is available again here.
        // Can keep doing more work as usual.
        [...]

        dbContextScope.SaveChanges();
    }
}
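The sketch below illustrates both the problem and the fix: a task started while an ambient value is set inherits it, whereas a task started while the value is suppressed does not. As before, this is an assumption-laden toy rather than the library's code: it uses AsyncLocal<T> in place of the CallContext, a string in place of the scope, and AmbientSuppressionDemo, Suppress, Restorer and Run are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AmbientSuppressionDemo
{
    private static readonly AsyncLocal<string> Ambient = new AsyncLocal<string>();

    // Puts the saved ambient value back when disposed.
    private sealed class Restorer : IDisposable
    {
        private readonly string _saved;
        private bool _disposed;

        public Restorer(string saved) { _saved = saved; }

        public void Dispose()
        {
            if (_disposed) return;
            Ambient.Value = _saved;
            _disposed = true;
        }
    }

    // Hides the ambient value until the returned object is disposed.
    // Must be created and disposed in the same flow of execution.
    public static IDisposable Suppress()
    {
        var saved = Ambient.Value;
        Ambient.Value = null;
        return new Restorer(saved);
    }

    // Returns what was seen by: a task started without suppression,
    // a task started under suppression, and the caller afterwards.
    public static string[] Run()
    {
        Ambient.Value = "scope-1";

        // The task captures the current execution context, ambient value included.
        var leaked = Task.Run(() => Ambient.Value ?? "none").Result;

        string isolated;
        using (Suppress())
        {
            isolated = Task.Run(() => Ambient.Value ?? "none").Result;
        }

        var restored = Ambient.Value ?? "none";
        Ambient.Value = null; // leave no ambient value behind
        return new[] { leaked, isolated, restored };
    }
}
```

AmbientSuppressionDemo.Run() returns "scope-1", "none" and "scope-1" in that order: only the task started inside the suppression block is isolated from the ambient value, and the value is available again once the block ends.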
Creating a non-nested DbContextScope
This is an advanced feature that I would expect most applications to never need. Tread carefully when using this as it can create tricky issues and quickly lead to a maintenance nightmare.
Sometimes, a service method may need to persist its changes to the underlying database regardless of the outcome of the overall business transaction it may be part of. This would be the case if:
It needs to record cross-cutting concern information that shouldn't be rolled-back even if the business transaction fails. A typical example would be logging or auditing records.
It needs to record the result of an operation that cannot be rolled back. A typical example would be service methods that interact with non-transactional remote services or APIs. E.g. if your service method uses the Facebook API to post a new status update on Facebook and then records the newly created status update in the local database, that record must be persisted even if the overall business transaction fails because of some other error occurring after the Facebook API call. The Facebook API isn't transactional - it's impossible to "rollback" a Facebook API call. The result of that API call should therefore never be rolled back.
In that case, you can pass a value of DbContextScopeOption.ForceCreateNew as the joiningOption parameter when creating a new DbContextScope. This will create a DbContextScope that will not join the ambient scope even if one exists:
public void RandomServiceMethod()
{
    using (var dbContextScope = _dbContextScopeFactory.Create(DbContextScopeOption.ForceCreateNew))
    {
        // We've created a new scope. Even if that service method
        // was called by another service method that has created its
        // own DbContextScope, we won't be joining it.
        // Our scope will create new DbContext instances and won't
        // re-use the DbContext instances that the parent scope uses.
        [...]

        // Since we've forced the creation of a new scope,
        // this call to SaveChanges() will persist
        // our changes regardless of whether or not the
        // parent scope (if any) saves its changes or rolls back.
        dbContextScope.SaveChanges();
    }
}
The major issue with doing this is that this service method will use different DbContext instances from the ones used in the rest of that business transaction. Here are a few basic rules to always follow in that case in order to avoid weird bugs and maintenance nightmares:
1. Persistent entities returned by a service method must always be attached to the ambient context
If you force the creation of a new DbContextScope (and therefore of new DbContext instances) instead of joining the ambient one, your service method must never return persistent entities that were created / retrieved within that new scope. This would be completely unexpected and will lead to humongous complexity.
The client code calling your service method may be a service method itself that created its own DbContextScope and therefore expects all service methods it calls to use that same ambient scope (this is the whole point of using an ambient context). It will therefore expect any persistent entity returned by your service method to be attached to the ambient DbContext.
Instead, either:
Don't return persistent entities. This is the easiest, cleanest, most foolproof method. E.g. if your service creates a new domain model object, don't return it. Return its ID instead and let the client load the entity in its own DbContext instance if it needs the actual object.
If you absolutely need to return a persistent entity, switch back to the ambient context, load the entity you want to return in the ambient context and return that.
2. Upon exit, a service method must make sure that all modifications it made to persistent entities have been replicated in the parent scope
If your service method forces the creation of a new DbContextScope and then modifies persistent entities in that new scope, it must make sure that the parent ambient scope (if any) can "see" those modifications when it returns.
I.e. if the DbContext instances in the parent scope had already loaded the entities you modified in their first-level cache (ObjectStateManager), your service method must force a refresh of these entities to ensure that the parent scope doesn't end up working with stale versions of these objects.
The DbContextScope class has a handy helper method that makes this fairly painless:
public void RandomServiceMethod(Guid accountId)
{
    // Forcing the creation of a new scope (i.e. we'll be using our
    // own DbContext instances)
    using (var dbContextScope = _dbContextScopeFactory.Create(DbContextScopeOption.ForceCreateNew))
    {
        var account = _accountRepository.Get(accountId);
        account.Disabled = true;

        // Since we forced the creation of a new scope,
        // this will persist our changes to the database
        // regardless of what the parent scope does.
        dbContextScope.SaveChanges();

        // If the caller of this method had already
        // loaded that account object into their own
        // DbContext instance, their version
        // has now become stale. They won't see that
        // this account has been disabled and might
        // therefore execute incorrect logic.
        // So make sure that the version our caller
        // has is up-to-date.
        dbContextScope.RefreshEntitiesInParentScope(new[] { account });
    }
}
Why DbContextScope and not UnitOfWork?
The first version of the DbContextScope class I wrote was actually called UnitOfWork. This is arguably the most commonly used name for this type of component.
But as I tried to use that UnitOfWork component in a real-world application, I kept getting really confused as to how I was supposed to use it and what it really did. This is despite the fact that I was the one who researched, designed and implemented it and despite the fact that I knew what it did and how it worked inside-out. Yet, I kept getting myself confused and had to often take a step back and think hard about how this "unit of work" related to the actual problem I was trying to solve: managing my DbContext instances.
If even I, who had spent a significant amount of time researching, designing and implementing this component, kept getting confused when trying to use it, there clearly wasn't a hope that anyone else would find it easy to use it.
So I renamed it DbContextScope and suddenly everything became clearer.
The main issue I had with the UnitOfWork name, I believe, is that at the application level, it often doesn't make a lot of sense. At the lower levels, for example at the database level, a "unit of work" is a very clear and concrete concept. This is Martin Fowler's definition of a unit of work:
Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems.
There is no ambiguity as to what a unit of work means at the database level.
At the application level however, a "unit of work" is a very vague concept that could mean everything and nothing. And it's certainly not clear how this "unit of work" relates to Entity Framework, to the issue of managing DbContext instances and to the problem of ensuring that the persistent entities we're manipulating are attached to the right DbContext instance.
As a result, any developer trying to use a "UnitOfWork" would have to pore through its source code to find out what it really does. The definition of the unit of work pattern is simply too vague to be useful at the application level.
In fact, for many applications, an application-level "unit of work" doesn't even make any sense. Many applications will have to use several non-transactional services during the course of a business transaction, such as remote APIs or non-transactional legacy components. The changes made there cannot be rolled back. Pretending otherwise is counter-productive, confusing and makes it even harder to write correct code.
A DbContextScope on the other hand does what it says on the tin. Nothing more, nothing less. It doesn't pretend to be what it's not. And I've found that this simple name change significantly reduced the cognitive load required to use that component and to verify that it was being used correctly.
Of course, naming this component DbContextScope means that you can't hide the fact that you're using Entity Framework from your services anymore. UnitOfWork is a conveniently vague term that allows you to abstract away the persistence mechanism used in the lower layers. Whether or not abstracting EF away from your service layer is a good thing is another debate that we won't get into here.
See it in action
The source code on GitHub includes a demo application that demonstrates the most common use-cases.
How DbContextScope works
The source code is well commented and I would encourage you to read through it. In addition, this excellent blog post by Stephen Toub on the ExecutionContext is a mandatory read if you'd like to fully understand how the ambient context pattern was implemented in DbContextScope.
Further reading
The personal blog of Rowan Miller, the program manager for the Entity Framework team, is a must-read for any developer working on an Entity Framework-based application.
Bonus material
Where not to create your DbContext instances
An Entity Framework anti-pattern commonly seen in the wild is to implement the creation and disposal of DbContext in data access methods (e.g. in repository methods in a traditional 3-tier application). It usually looks like this:
public class UserService : IUserService
{
    private readonly IUserRepository _userRepository;

    public UserService(IUserRepository userRepository)
    {
        if (userRepository == null) throw new ArgumentNullException("userRepository");
        _userRepository = userRepository;
    }

    public void MarkUserAsPremium(Guid userId)
    {
        var user = _userRepository.Get(userId);
        user.IsPremiumUser = true;
        _userRepository.Save(user);
    }
}

public class UserRepository : IUserRepository
{
    public User Get(Guid userId)
    {
        using (var context = new MyDbContext())
        {
            return context.Set<User>().Find(userId);
        }
    }

    public void Save(User user)
    {
        using (var context = new MyDbContext())
        {
            // [...]
            // (either attach the provided entity to the context
            // or load it from the context and update its properties
            // from the provided entity)

            context.SaveChanges();
        }
    }
}
By doing this, you're losing pretty much every feature that Entity Framework provides via the DbContext, including its 1st-level cache, its identity map, its unit-of-work, and its change tracking and lazy-loading abilities. That's because in the scenario above, a new DbContext instance is created for every database query and disposed immediately afterwards, hence preventing the DbContext instance from being able to track the state of your data objects across the entire business transaction.
You're effectively reducing Entity Framework to a basic ORM in the literal sense of the term: a mapper from your objects to their relational representation in the database.
There are some applications where this type of architecture does make sense. If you're working on such an application, you should however ask yourself why you're using Entity Framework in the first place. If you're going to use it as a basic ORM and won't use any of the features that it provides on top of its ORM capabilities, you might be better off using a lightweight ORM library such as Dapper. Chances are it would simplify your code and offer better performance by not having the additional overhead that EF introduces to support its additional functionalities.
from:http://mehdi.me/ambient-dbcontext-in-ef6/