Is Excel a Database?

Post date: Feb 14, 2014 10:18:23 PM

Yes - with caveats - and assuming the word database is meant as a common synonym, though technically incorrect, for DBMS (or Database Management System) To understand what I mean, let us examine the proper definition for database from Wiki:

"A database is an organized collection of data."

When we open XL, it has no data. So technically, it's not a database. So the appropriate question should be, "Is Excel a DBMS?". For those not familiar with that term, let us turn to wiki once more:

"(A DBMS is) specially designed software application that interacts with users, other applications, and the database itself to capture and analyze data."

XL can do that. So yes - with caveats, XL is a DBMS. Hopefully you are still reading and willing to ask the more important questions, “How good of a DBMS is XL?” and "What kinds of data is XL appropriate for?"

How good of a DBMS is XL?

"Good" requires comparison. So let us start by comparing XL's conformance to standards.

Standards and Features:

The best way to do determine XL's conformance to standards is to use SQL (Structured Query Language) the de facto standard for querying and manipulating databases. It is not a database. It is not a DBMS. It is a language that invokes a DBMS’s methods over data the DBMS understands. We can use SQL over XL and XL’s data. So using SQL, we now enter mysterious territory few people know exists. This will become apparent right after we create a simple routine.

Open a new XL Workbook. Use ALT-F11 to get to the VBE. Use menu option Insert > Module. Use menu option View > Project Explorer. In the Project Explorer, find and double click Module1. In the large code window enter this:

Sub Test(sSQL As String)

Dim oCN As Object

Set oCN = CreateObject("ADODB.Connection")

oCN.Open "ODBC;DSN=Excel Files;" & _

"Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" & _

"DBQ=C:\Test.xls;" & _

"ReadOnly=0;"

oCN.Execute sSQL

oCN.Close

Set oCN = Nothing

End Sub

We will use this routine to execute SQL statements against XL and see what it does, and does not support. Let’s start by pressing CTRL-G to get to the Immediate Window. Copy this to the Immediate Window.

Test "CREATE TABLE Persons (PersonID int, LastName varchar(255), FirstName varchar(255), Address varchar(255), City varchar(255))"

Click just right of the line and press ENTER. That created a new Workbook in C:\ called Test.xls and added a Worksheet called Persons with five column headings (Bet you didn’t know XL’s VBA could do that!) XL did exactly what a DBMS should do. So let’s see what else XL can do as a DBMS. Try this line:

Test "CREATE TABLE States (Code varchar, Description varchar)"

That added another Worksheet to Test.xls called States with two column headings. Wow - XL looks just like a database - right? Let’s see. Try this:

Test "INSERT INTO Persons VALUES (1,'Hatmaker','Craig','100 Main St', 'Anytown')"

That added a record to worksheet Persons - just like a database. Let’s push on. Type:

Test "DELETE FROM Persons Where PersonID = 1"

That blows up with a "Data type mismatch in criteria expression" error. Huh? That error says XL thinks PersonID isn't a number anymore. Well let's see if it thinks it's a character. Type:

Test "DELETE FROM Persons Where PersonID = '1'"

XL reports "Delete data in a linked table is not supported by this ISAM" which means XL simply doesn't support a fundamental database operation like DELETE. But hey! we know we can delete records in XL. What gives? And to be honest, I have no idea. All I know is SQL is the standard for manipulating databases but XL refuses to let SQL perform this basic operation. Let’s continue. Try:

Test "ALTER TABLE Persons ADD DateOfBirth date"

This is supposed to add a column. XL reports "Invalid operation". Once again, we can do that through XL’s GUI but XL refuses SQL.

Test "CREATE INDEX PersonsByName ON Persons (PersonID)"

XL reports "Operation is not supported for this type of object". An Index provides a way of accessing records by key, a bit like XL’s INDEX function. But like any function, XL’s INDEX function returns a result. A database Index does not. It is an object that hold pointers to all records (not just one) sequenced for speed based on criteria in CREATE INDEX statement. There is no Index object in XL and so this SQL operation fails.

Test "CREATE VIEW PersonsByName as SELECT LastName FROM Persons"

This is supposed to create an object that only exposes the LastName field. Think of this as hiding all columns except LastName. This is useful for security but it fails with message "Operation is not supported for this type of object". And thinking of security:

Test "GRANT SELECT ON Persons TO [Craig Hatmaker]"

This is supposed to grant me authorization to the Persons table. XL reports “Invalid SQL statement; expected ‘DELETE’, ’INSERT’, ’PROCEDURE’, ‘SELECT’, or ‘UPDATE’” Unlike DBMSs, XL has no user based security.

Explore this strange territory on your own. Try any SQL statement using the Test routine and discover exactly what XL can, and cannot do as a DBMS..

Metrics:

Another comparison is normal DBMS metrics. Here a some metrics of popular DMBSs.

Clearly, XL's metrics place the most severe restrictions on data.

Architecture:

Enterprise databases are all "server based". This is important because when we request data from those DBMSs, the server handles the request and returns only the results. In MS Access and XL, then entire table must reside on the PC so the PC can handle the request. This can have huge impacts on network performance.

Most Enterprise database can also spread the workload across multiple servers which makes them highly scalable. As the demand for data increases, we can add servers and improve performance.

What kinds of data is XL appropriate for?

As we can see, XL doesn't measure up very well compared to products marketed as DBMSs, but that doesn't mean XL isn't better than those products for certain applications. For small, single user, simple datasets XL is my 'go to' DBMS because its user interface (UI) is superior. But as soon as my data is meant to be shared, or if it grows beyond a few thousand rows, it's time to port the database to something else.

Conclusion:

XL supports: minimal DBMS features; far fewer records; and far fewer concurrent users. It’s not scalable and it’s not fast. But sometimes that's okay - as long as we don't try to force XL to handle more than is appropriate and are willing to move our data when needed.

References:

W3schools.com SQL Tutorial:

http://www.w3schools.com/sql/default.asp?PHPSESSID=300ae3404d5fa2612f238abeebb8869c

Beginner SQL Tutorial:

http://beginner-sql-tutorial.com/sql.htm

DBMS Metric Comparisons

http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems

XL Metrics

http://office.microsoft.com/en-us/excel-help/excel-specifications-and-limits-HA103980614.aspx