Click here to Skip to main content
15,879,535 members
Articles / Entity Framework

Highly Effective Data Insertion at Case of Oracle DB

Rate me:
Please Sign up or sign in to vote.
5.00/5 (8 votes)
31 Oct 2016CPOL5 min read 19.4K   8  
EF implementation of BulkInsert operation and "Arrays in a Bind" solution at case of absence of EF.

Introduction

There are a lot of articles and posts about: "How to increase performance of inserting new entities at case of Entity Framework?". Often, answers contain such recommendations as: to play with Configuration.AutoDetectChangesEnabled, divide your data to portions and use for each of them individual context, disposing it, and a lot of other approaches. For example, in case of MS SQL, I use a very cool library BulkInsert: https://efbulkinsert.codeplex.com/, but what we will do, if we use Oracle database? I will try to answer this question in this article.

What to Install

Fortunately, there is a project, forked from mentioned above BulkInsert library: https://github.com/Thorium/EntityFramework.BulkInsert, with its help, we will solve our main problem - implementing BulkInsert approach to increase insert performance in case of Oracle database.

First of all, you should download its source code (the most important for us is DevartProvider class, located in Providers folder) and include the corresponding assembly in your project. Then, you need to load some external packages with the help of Nuget. They are:

Image 1

I will explain for which reasons each package is needed:

  1. EntityFramework.MappingAPI for BulkInsert inner stuff.
  2. Simple.Data.Oracle and dependent Simple.Data. packages contain Oracle provider which BulkInsert assembly will use.
  3. Oracle.ManagedDataAccess. packages necessity will be described lately.
  4. EntityFramework is the core package, which is indeed always needed.

Solution

Contexts

Now, let's create our contexts:

C#
internal class OracleContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Order> Orders { get; set; }
    
    //other stuff...
}

internal class OracleContextBulkInsert : OracleContext 
{
    public OracleContextInsert()
    {
        //this extension comes from BulkInsert library!
        EntityFrameworkContextExtensions.RegisterDevartBulkInsertProvider(this);
    }
}

But there is one restriction: OracleContextBulkInsert approach is only applicable for performing BulkInsert operation. If we will try to perform one of the common EF operations, for example, to get some data, an exception will be thrown. This is the point, where Oracle.ManagedDataAccess. packages mentioned above come out to scene. Actually, we will use two contexts: OracleContext and OracleContextBulkInsert. They will use providers from Oracle.ManagedDataAccess. packages and Simple.Data packages accordingly. You can think about this as something familiar to the CQRS approach: OracleContext for common operations and OracleContextBulkInsert for highly effective inserts.

Repository

I think it is a good idea to implement the repository pattern to decorate and hide our two contexts, because, who wants to know about them and inner logic, that stay behind? That is why, context classes have internal access modifier.

C#
public interface IOracleRepository
{
    void Insert<TContext>(IEnumerable<TContext> entities) where TContext : class;

    List<TDto> GetList<TContext, TDto>(Expression<Func<TContext, bool>> predicate, 
               Expression<Func<TContext, TDto>> selector) where TContext : class;
    //other common methods...
}

public class OracleRepository : IOracleRepository
{        
    private OracleContext CreateContext(bool isSelect = true)
    {
        return isSelect ? new OracleContext() : new OracleContextBulkInsert();
    }

    private DbSet<T> Table<T>(OracleContextBase context) where T : class
    {
        return context.Set<T>();
    }

    public List<TDto> GetList<TContext, TDto>(Expression<Func<TContext, bool>> predicate, 
                      Expression<Func<TContext, TDto>> selector) where TContext : class
    {
        using (var context = CreateContext())
        {
            return Table<TContext>(context).Where(predicate).Select(selector).ToList();
        }
    }

    public void Insert<TContext>(IEnumerable<TContext> entities) where TContext : class
    {
        using (var context = CreateContext(false))
        {
            context.BulkInsert(entities);
        }
    }
}

As you can see, it is quite a usual implementation of repository pattern, except for one important thing - CreateContext method's result depends on inside which method it is called.

Usage

As I already said, we won't use our context directly, but through repository, not worrying about inner implemented logic. This is a simple example:

C#
var repo = new OracleRepository();
var sales = repo.GetList<Sale, SalesDto>(x => x.Amount > 100,
              x => new SalesDto { Id = x.Id, Amount = x.Amount});

//BulkInsert behind the scene
repo.Insert(new List<Sales> { 
       new Sale{ Amount = 200 }, 
       new Sale{ Amount = 300 }
});

App.config

C#
<configuration>
  <!-- other stuff -->
  <appSettings>    
    <add key="InnerDbProviderName" value="Devart.Data.Oracle" />    
  </appSettings>
  <connectionStrings>    
    <add name="OracleContextBulkInsert" connectionString="your connection string" 
                                        providerName="Devart.Data.Oracle" />
    <add name="OracleContext" connectionString="your connection string" 
                              providerName="Oracle.ManagedDataAccess.Client"/>        
  </connectionStrings>
</configuration>

There are two very important things:

  1. You should specify: <add key="InnerDbProviderName" value="Devart.Data.Oracle"/> - it is a requirement of BulkInsert project.
  2. Now you have two connection strings, because of two contexts and two corresponding providers, as you can see in App.config (of course, connection strings must be the same and I hope that it is not very hard for you to specify it twice).

Issues

If you use such web diagnostic platforms as Glimpse: http://getglimpse.com/, there will be a problem with its compatibility against BulkInsert library (exception: BulkInsertProviderNotFoundException will be thrown). This problem is fixed in case of MS SQL here, but, unfortunately, there is no existing solution for Oracle database, so you should refuse to use Glimpse or fix this problem by yourself. On the other hand, you can use MiniProfiler (http://miniprofiler.com/) as alternative, it is fully compatible with our situation.

Also bitness of your Oracle Client (that should be installed on target PC) should be the same as your application. For example, in case of IIS, you can tune one option for app pool - allow 32 bit applications to true value, it will fix the problem, if you have x86 Oracle Client, but your application has x64 bitness.

Arrays in a Bind Solution (Without Entity Framework)

If you don't want to have a deal with BulkInsert or you do not use Entity Framework, there is another solution, that provides results that are good enough. You can find the full description here. The main idea is that you are creating a very simple query like this:

C#
insert into mytable(columnName1, columnName2, ..) values(:columnName1, :columnName2, ...)

where :columnName1, :columnName2 are parameters, but instead of being simple atomic values, they will be arrays. That is all, but you should also assign property ArrayBindCount at OracleCommand variable to the corresponding value. Full solution, that was made, based on code from the link above with adding some modifications: reflection and generic approaches, will look like this:

C#
private static Dictionary<Type, OracleDbType> dotNet2Oracle { get; set; } = 
    new Dictionary<Type, OracleDbType> {
        [typeof(string)] = OracleDbType.Varchar2,
        [typeof(int)] = OracleDbType.Int32,
        [typeof(DateTime)] = OracleDbType.Date,
        [typeof(decimal)] = OracleDbType.Decimal,
    };

public void BatchInsert<TContext>(IEnumerable<TContext> entities) where TContext : class
{
    if (entities.Count() == 0)
        return;            

    using (var con = new OracleConnection
          (ConfigurationManager.ConnectionStrings["OracleContextInsert"].ConnectionString))
    {
        con.Open();
        var cmd = con.CreateCommand();
        var type = entities.GetType().GetGenericArguments()[0];
        var tableName = (type.GetCustomAttributes
        (typeof(TableAttribute), false).FirstOrDefault() as TableAttribute)?.Name ?? type.Name;

        cmd.CommandText = $@"insert into {tableName} ({type.GetProperties().Select(x => x.Name)
            .Aggregate((a, b) => $"{a}, {b}")}) values 
            ({type.GetProperties().Select(x => $":{x.Name}")
            .Aggregate((a, b) => $"{a}, {b}")})";

        cmd.ArrayBindCount = entities.Count();

        var oracleTypesType = typeof(OracleDbType);
        foreach (var prop in type.GetProperties())
        {
            var param = new OracleParameter();
            param.ParameterName = prop.Name;
            var customType = (prop.GetCustomAttributes
            (typeof(ColumnAttribute), false).FirstOrDefault() as ColumnAttribute)?.TypeName;

            if (customType != null)
            {
                OracleDbType enumVal;
                if (Enum.TryParse(customType, out enumVal))
                    param.OracleDbType = enumVal;
                else
                    throw new Exception($@"Custom column's type ({customType}) has 
                       no corresponding projection at OracleDbType!");
            }                        
            else
            {
                var targetType = prop.PropertyType
                       .GetGenericArguments().FirstOrDefault() ?? prop.PropertyType;

                if (!dotNet2Oracle.ContainsKey(targetType))
                    throw new Exception($@"Corresponding Oracle type for this .Net type
                    ({targetType.Name}) not founded, 
                    add necessary pair to dotNet2Oracle dictionary!");
                param.OracleDbType = dotNet2Oracle[targetType];
            }                         
            
            param.Value = entities.Select(x => type.GetProperty(prop.Name).GetValue(x)).ToArray();
            cmd.Parameters.Add(param);
        }

        cmd.ExecuteNonQuery();             
   }
}

By default, table's name will be the same as the name of TContext type, to change it, just add TableAttribute, specifying desired value to Name property, columns names are properties names. dotNet2Oracle dictionary contains type mappings between .NET and Oracle types, you can simply add other necessary values to it, Also, it is possible to specify Oracle type directly via ColumnAttribute and its property TypeName. For this solution, you should install only item 3. from BulkInsert list. Performance comparison - this approach is about 1.5 times worse than BulkInsert, but has no issues, described in the previous section.

Conclusion

In this article, I have shown you how to implement BulkInsert approach with Oracle database with the help of library: https://github.com/Thorium/EntityFramework.BulkInsert. One can use this solution in case of code-first, model-first or database-first approaches, of course, some changes and additional stuff may be needed in case of concrete variant, but the core approach will remain the same. Also, you can use Arrays in a Bind approach (if you do not use Entity Framework), that is a more simple, but less effective solution.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Other Other
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
-- There are no messages in this forum --