Often the data we are modelling must be imported into a database.  Data importing can be a complex and difficult task.

The data frequently comes from several different sources and is imported in several steps. Sometimes the data for different tables comes from different sources and must be imported at different times, using different tools or configurations.

At times, even different columns in the same table can come from different sources and must be imported in different ways.  This can mean that columns with essential data may need to be nullable until a few stages of the import or data processing are complete.

In some situations, connections between tables must be assigned or computed after the data has been imported, which can mean that mandatory foreign keys have to be left empty for a while.

Various constraints, such as NOT NULL, foreign key and uniqueness constraints, may need to be disabled or deferred until the import is complete.
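
One way to cope with columns and foreign keys that cannot be filled in straight away is to import into a staging table with relaxed constraints and copy rows into the strict table only once they are complete. The following is a minimal sketch using Python's built-in `sqlite3` module; the `city`, `person_staging` and `person` tables and the `staged_import` function are hypothetical names invented for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- Staging table: city_id is nullable because it is computed after loading.
CREATE TABLE person_staging (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    city_name TEXT,
    city_id INTEGER
);
-- Final table: the foreign key is mandatory.
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    city_id INTEGER NOT NULL REFERENCES city(id)
);
"""

def staged_import(city_names, people):
    """Load raw rows, compute the links, then promote the complete rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)

    # Stage 1: load each source as it arrives.
    conn.executemany("INSERT INTO city (name) VALUES (?)",
                     [(name,) for name in city_names])
    conn.executemany(
        "INSERT INTO person_staging (name, city_name) VALUES (?, ?)", people)

    # Stage 2: compute the connection between the tables.
    conn.execute("""
        UPDATE person_staging
        SET city_id = (SELECT city.id FROM city
                       WHERE city.name = person_staging.city_name)
    """)

    # Stage 3: report rows that still have no link, and promote the rest.
    unresolved = [row[0] for row in conn.execute(
        "SELECT name FROM person_staging WHERE city_id IS NULL ORDER BY id")]
    conn.execute("""
        INSERT INTO person (id, name, city_id)
        SELECT id, name, city_id FROM person_staging
        WHERE city_id IS NOT NULL
    """)
    conn.commit()
    return conn, unresolved
```

Calling `staged_import(["Oslo"], [("Ann", "Oslo"), ("Bob", "Atlantis")])` promotes Ann to the `person` table and returns Bob in the unresolved list, so the strict table never holds a row with a missing foreign key.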

It is very common for data to be imported and then cleaned up.  This process can be very complex as data can be invalid in many different ways.


Make sure that data is validated after import.  Small errors in importing can cause big problems if the software assumes that the data is valid.  Assuming that data is valid is dangerous, but checking validity at every step of a process can be time-consuming, computationally expensive and error-prone.  Your approach to validating data will always be a balancing act.
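
A post-import validation pass can be as simple as a list of named queries, each of which selects the offending rows. The sketch below assumes a hypothetical `customer` table with `id`, `email` and `balance` columns; the checks themselves are only examples of the kind of rule you might write.

```python
import sqlite3

# Each check is a description plus a query that returns the ids of bad rows.
CHECKS = {
    "missing email":
        "SELECT id FROM customer WHERE email IS NULL OR email = ''",
    "negative balance":
        "SELECT id FROM customer WHERE balance < 0",
    "duplicate email":
        "SELECT MIN(id) FROM customer GROUP BY email HAVING COUNT(*) > 1",
}

def validate(conn, checks=CHECKS):
    """Run every check once and return {check name: [offending ids]}."""
    report = {}
    for name, query in checks.items():
        bad_ids = [row[0] for row in conn.execute(query)]
        if bad_ids:
            report[name] = bad_ids
    return report
```

Running the checks once, straight after the import, is one way to strike the balance described above: the rest of the software can then rely on the data without re-checking it at every step. An empty report means every check passed.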


There can be many steps in importing data from various sources, but it is best to make the import a reproducible process, so that extra steps can be added easily and the data imported (and validated) again.  During development, this is by far the easiest method, saving time and minimising errors.
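
One way to keep an import reproducible is to express it as an ordered list of small steps that always run against a fresh database. The sketch below is only an illustration: the `item` table, the step functions and the hard-coded rows (standing in for a real source file) are all assumptions.

```python
import sqlite3

def create_schema(conn):
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")

def load_items(conn):
    # Stand-in for reading a real source; note the stray whitespace.
    rows = [(" widget ",), ("gadget",)]
    conn.executemany("INSERT INTO item (name) VALUES (?)", rows)

def clean_names(conn):
    # Clean-up step: strip leading and trailing spaces from names.
    conn.execute("UPDATE item SET name = TRIM(name)")

# Adding a new stage means adding one function to this list.
STEPS = [create_schema, load_items, clean_names]

def run_import(steps=STEPS):
    """Rebuild the database from scratch by running every step in order."""
    conn = sqlite3.connect(":memory:")
    for step in steps:
        step(conn)
        conn.commit()
    return conn
```

Because `run_import()` starts from an empty database each time, running it twice gives the same result, and a newly added step is applied to all of the data rather than only to rows imported after it was written.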

Once users start to modify data, redoing an import becomes much more complex because the newly imported data must be merged with their changes, and merge imports can be difficult to manage.  If possible, finish all the data manipulation that may be required before allowing the imported data to be modified in other ways.
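
To illustrate why merging is harder than a clean re-import, here is a minimal sketch of one possible merge policy: new rows are added, untouched rows are refreshed, and rows a user has edited are left alone. The `product` table and its `user_modified` flag are hypothetical, and a real system would need its own rules for deciding which side wins.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE product (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            user_modified INTEGER NOT NULL DEFAULT 0  -- set to 1 on user edits
        )
    """)
    return conn

def merge_import(conn, rows):
    """Re-import (id, name) rows without overwriting user edits."""
    for rec_id, name in rows:
        # Add the row if it is new; do nothing if the id already exists.
        conn.execute(
            "INSERT OR IGNORE INTO product (id, name) VALUES (?, ?)",
            (rec_id, name))
        # Refresh an existing row only if no user has edited it.
        conn.execute(
            "UPDATE product SET name = ? WHERE id = ? AND user_modified = 0",
            (name, rec_id))
    conn.commit()
```

Even this simple policy requires the application to record which rows users have touched, which is extra bookkeeping that a plain rebuild never needs.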