EPANET Network Components – Database tables and rows, Part 2 (#19)

In this case study we are going through the process of designing a database for storing the information contained in an input file for EPANET, software which can analyse water supply distribution networks.

The last post in our continuing case study modelled the spring as a reservoir and discussed some principles of modelling and identifiers.  We were able to highlight the most important principle of modelling: Don’t include anything unless you need it.  Keep things as simple as possible, and only add complexity where your application demands it.

Junction – the Pool of Siloam

By Tamar Hayardeni תמר הירדני (My own work יצרתי בעצמי) [CC BY 3.0 (http://creativecommons.org/licenses/by/3.0)], via Wikimedia Commons

The Pool of Siloam By Tamar Hayardeni תמר הירדני (My own work יצרתי בעצמי) [CC BY 3.0 (http://creativecommons.org/licenses/by/3.0)], via Wikimedia Commons

The next simplest item to model is the junction which represents the Pool of Siloam.  The requirements for our different data record types were listed three weeks ago (in issue #16), and are included below in the description of each column as we define our JUNCTION table.  The network ID column has also been included, since we know that all network objects need to be associated with a specific network:

Table JUNCTION

Column name Data type Nullable? Description
NETWORKID Foreign key No ID of the network in the NETWORK table
ID Character(31) No ID label (up to 31 characters (letters and digits))
ELEVATION Floating point No Elevation (m or ft)
BASEDEMAND Floating point Yes Base demand flow (flow units) (optional)
DEMANDID Character(31) Yes Demand pattern ID (optional)

This table can contain each of the data fields required by the EPANET analysis engine.  Columns are included for optional values, but they are marked as “Nullable” because they can be left empty.

Database Rows

For our single junction, the row we would need is shown below (in metric units):

Table JUNCTION

NETWORKID ID HEAD DEMAND DEMANDID
1 PoolOfSiloam 635.7 18.9

In Imperial/US units, the necessary row would be:

Table JUNCTION

NETWORKID ID HEAD DEMAND DEMANDID
1 PoolOfSiloam 2086 300

Pipe – Hezekiah’s Tunnel

The last item to model is the pipe which represents Hezekiah’s Tunnel.  For anyone who is more familiar with fluid flow, we need to acknowledge a simplification we have made in our modelling.  Hezekiah’s Tunnel does not run full, while the normal assumption for pipes is that they will be full of water.  We may discuss the effect of this later in our case study.  The requirements for our different data record types were listed three weeks ago (in issue #16), and for a pipe they are included below in the description of each column as we define our PIPE table.  The network ID column has also been included, since we know that all network objects need to be associated with a specific network:

Table PIPE

Column name Data type Nullable? Description
NETWORKID Foreign key No ID of the network in the NETWORK table
ID Character(31) No ID label (up to 31 characters (letters and digits))
STARTID Character(31) No ID of start node
ENDID Character(31) No ID of end node
LENGTH Floating point No Length (m or feet)
DIAMETER Floating point No Diameter (mm or feet)
ROUGHNESS Floating point No Roughness coefficient
MINORLOSS Floating point No Minor loss coefficient
STATUS Character(6) No Status (Open, Closed or CV)

Two questions arise when defining the columns in this table:

1. Should the “STARTID” and “ENDID” columns be foreign keys?  This is just a slightly different angle on the question we answered last week about the ID column in the RESERVOIR table.  The answer?  It depends.  For our use, where we are assuming that our input and output data are both checked, there is no advantage in doing so.  If we were designing a database that was to be used by a network design tool, it would be much better to use foreign key references – leaving the database to do some of our checking for us in areas where databases excel.

2. The STATUS column can contain one of three values: “OPEN”, “CLOSED” or “CV” (meaning Check Valve, a valve which prevents water from flowing backwards in a pipe).  Lookup tables are a standard feature of databases in cases where a limited number of different natural values will be used, in order to reduce storage and improve consistency, as well as for philosophical reasons.

In this case, we could store a single digit representing each possible value (e.g. 1=OPEN, 2=CLOSED, 3=CV).  This can be a very effective way of reducing storage space in some situations or for avoiding small errors which can introduce different values which are unintended; for example, variations in case (e.g. “OPEn”), extra spaces (e.g. “ CLOSED”) or simple spelling mistakes (e.g. “CLOZED”).

However, for this application, adding a lookup table would be completely against our principle of only adding complexity when it is necessary or valuable!  Storage space is cheap and we would be saving very little, even if we had millions of rows.  And if we are considering the argument of data accuracy or consistency, we can dismiss this on the basis that our data is already being checked before we receive it.  Once again, if we were developing a design tool, lookup tables could be valuable, but we are not.  Complexity has an appearance of excellence, but the appearance is utterly misleading.  Complexity increases the difficulty of understanding a system.  Each extra item – yes, every single one – makes understanding more difficult and slower.

Note that complexity can also be included in the data itself, as was discussed at the start of this case study.  We could use just one column for storing all of the information for a pipe, but then we would need to parse the data and pull it apart into the pieces we need.  This would simplify the table design, but lose us the ability to easily find all of the pipes that have a diameter in a particular range or to list the pipes in order of length.  Our modelling should never lose us anything useful in a quest for simplicity, but it nor should it add complexity that gives no measurable advantage.  “Philosophical reasons” do not offer any measurable advantage except contentment in the person with the reasons.  Sometimes, we have to make decisions based on which option gives the greatest advantage.

In this case, we would gain no overall advantage from using lookup tables, so we will not do so.

Database Rows

For our single pipe, the row we would need is shown below (in metric units):

Table PIPE

NETWORKID ID STARTID ENDID LENGTH DIAMETER ROUGHNESS MINORLOSS STATUS
1 HezekiahsTunnel GihonSpring PoolOfSiloam 533.1 1200 60 0.0 OPEN

In Imperial/US units, the necessary row would be:

Table PIPE

NETWORKID ID STARTID ENDID LENGTH DIAMETER ROUGHNESS MINORLOSS STATUS
1 HezekiahsTunnel GihonSpring PoolOfSiloam 1749 48 60 0.0 OPEN

Now we have the tables that we need to describe the parts of our network and the rows that define our network.  We have not yet included any of the tables that describe other types of equipment (valves, pumps, etc.), or those which would be required for controlling the network operation.  The next post (DV) will look at more tables and examine the questions they raise.