Create a Survey Database in Access: Part I

Tue, Jul 21, 2009

Data Management

Create a Survey Database in Access: Part I

Microsoft Access is an excellent tool to assist with data entry for a survey project. With Access, you can create a simple data entry interface that not only looks like your survey, but also includes features to reduce data entry errors and still make the process go fast. This is a two part tutorial spanning 16 videos which walks through every step involved in creating a typical database for storing survey or questionnaire data.

surveycustomerMany people enter survey data (or data from chart reviews) directly into Excel or SPSS, or worse: directly into an Access table instead of an Access form. This is spreadsheet doom! Entering data directly into a spreadsheet is fraught with problems: it yields higher rates of data entry errors and data anomalies, many of which you won’t uncover until you start to analyze the data and find yourself wondering why Larry Smith’s survey was entered twice; why some zip codes have 3 digits and some have letters; and why the female gender has been entered four different ways: “Female”, “F”, “girl”, and “Beyonce.” It’s also very confusing to have to figure out how someone who answered, “No, I didn’t attend the orientation” was somehow able to answer in great detail a follow-up question about what they liked most about the orientation – the orientation that they supposedly didn’t attend.

These errors will add to your project hours of tedious data cleaning and investigating. If you’re doing both the data entry and data analysis, you have only yourself to blame. But if someone else is doing the analysis, don’t be surprised if they stop responding to your emails when you ask for their help on another project. Save yourself the headaches by making a great Access database from the start.

I’ve split this tutorial into two parts:

  • Part I (which you’re reading now) introduces you to the sample survey project, the Access environment, and shows you how to set up your tables in Access. (6 videos)
  • Part II shows you how to build the data entry form, which is based on the tables you built earlier. (10 videos)

Video 1: A Quick Look at the Database you’ll Build

http://www.researchtutorials.com/videos/dm/surveydatabaseA1.flv

Video 2: A Quick Look at the Sample Survey

http://www.researchtutorials.com/videos/dm/surveydatabaseA2.flv

Video 3: Turn Your Survey into a Codebook (the Blueprint for your Database)

http://www.researchtutorials.com/videos/dm/surveydatabaseA3.flv

Video 4: Opening Access and Getting Started

http://www.researchtutorials.com/videos/dm/surveydatabaseA4.flv

Video 5: Create a “Lookup Table” to Store the Answer Choices

The method I teach here for setting up your tables differs from the method most database experts would recommend. My approach is very simplified, and requires you to build only two tables: one table to store all of the answer options to your multiple choice questions (e.g., “Yes,” “No,” “Strongly Disagree,” “Disagree,” etc.) and one table to store the actual responses you collect from people who filled out your surveys. (An even more simplified version would involve only one table that holds everything.)

There are pros and cons to this simplified approach:

CONS

It will be difficult to use Access’s built-in query and report features to calculate and show summary statistics of your data (e.g., frequencies and percentages to show how everyone answered each question.) This is because your data – when structured in the way I teach in this tutorial – will not be fully “normalized.” Creating a non-normal structure means that you’re cramming into one or two tables many different categories of data: data about your survey questions (their wording, their numbering, etc.); data about answer choices on your survey (Yes/No, Strongly Disagree … Strongly Agree, etc.); data about your respondents (name, sex, race, etc.); and data on the actual responses people gave to each question (e.g., how many people answer “Yes” to question 7, etc.). A fully normalized survey database would put each of these categories of data into its own table, like so:

  • tblQuestions: stores information about the survey questions (with fields like QuestionID, QuestionNumber, QuestionText, etc.)
  • tblAnswerChoices: stores information about the response options for each multiple choice question (with fields like QuestionID, Choice [e.g., "Yes," "No," "Strongly Agree," "Agree," etc..)
  • tblRespondents: stores information on the respondents (with fields like RespondentID, LastName, Sex, Race, etc.)
  • tblResponses: stores the actual responses you collect (with fields like ResponseID, QuestionID, ResponseID, Response, etc.)

These four tables would then be connected to each other, like so:

Relationships

Each of these tables is called - in database speak - a "relation," which is why Access is a type of "relational database" management system. In a normalized database, each table contains only data on a single topic (one table for data about survey questions; one table for data about respondents, etc.). The tables are then linked to each other on a common field or "key" to form relationships. Take a look at tblQuestions and tblResponses. See the line connecting them? Notice it has a "1" next to tblQuestions and an infinity sign next to tblResponses. This tells us that there is a "one-to-many relationship" between these two tables, such that "one" question can have "many" responses.

When each category of data is stored in its own table, it's much easier to create Access queries to calculate statistics about each category of data. Or you can combine queries to get statistics about multiple categories of data, like a report that shows how males vs. females (data from tblRespondents) answered (data from tblResponses) each question (data from tblResponses). Normalization is a big concept and entire books are dedicated to it. For a quick but excellent overview of normalization and relational database concepts, check out Databases: Normalizing Access Data and Databases: Access Terminology and Relational Database Concepts.

Another downside to this simplified approach is that you can't reuse your survey database to manage other (different) surveys you might collect data on in the future. In a normalized survey database, the characteristics of the survey are not hard coded into the table or forms.

It's the difference between this non-normal structure ...

nonnormaltable

... and this normalized one:

normaltabledata

When we view the normalized table in Datasheet view, we see this is where the characteristics of the survey have been specified. That's right! The features of the survey are entered in rows, as real, dynamic data:

When you're done with the survey project, you can easily delete the data and keep the table structures (and probably the forms) intact for use with another survey. (Or, add a 5th table to the relationships picture earlier - a table called tblSurveys (with fields like SurveyID, SurveyName, SurveyAuthor) - and link it (one-to-many) to tblQuestions (add a new field in tblQuestions called SurveyID). Vaoila! You have a single Access database you can use to manage all of your survey projects.

So with all the advantages you get from a normalized structure, why would I recommend a non-normal, simplified approach?

PROS

Statistical programs (like SPSS, SAS, etc.) hate normalized data. If you plan to export your data to a statistical program for analysis, you'll have to spend a lot of time converting your normalized data (which, for one, is spread across four tables) into a non-normal structure, like a single flat file. Look at the picture above (the one in Datasheet view). If you export that to SPSS, it will give you a dataset with variables called QuestionID, QuestionNumber, etc., and cases containing data about the survey questions. Um ... that's not what you want. Instead you need a dataset with variables (i.e., columns) like q1, q2, q3, and so on, with cases that represent each respondent's answers to the survey. In a normalized Access database, your variables are stored as live data, in rows; in SPSS, your variables are hard coded, one per column, and their properties are stored in the Variable view window. With a normalized structure, not only will you need to merge the four different tables into one, but you will also need to somehow transpose the columns and rows so that you end up with something SPSS can make sense of. Ugh.

Another advantage of my simplified approach is that it gives you great flexibility when creating the data entry form(s), also called the graphical user interface (or GUI, pronounced "gooey"). The closer the GUI looks like the original survey, the faster (and more error-free) the data entry will go. In the GUI of a normalized survey database, you most likely have to present the survey questions as one long, single column, continuous list. Have fun with all that vertical scrolling. You won't have much flexibility with making an eye-pleasing, multi-column, multi-tab, bells-and-whistles GUI like the one you'll learn in Part II of this tutorial.

Again, it's the difference between a GUI based on a non-normalized table structure:

nonnormalgui2

... and a GUI based on a normalized one (this one is from a different project):

normalgui

Both GUIs "work," but with the first one I was able to take more liberties with the design and implement better validation rules.

A final advantage of the simplified approach is that it's easier to add to your form custom validation rules, automatic skip pattern logic, and so on. Creating conditional codes on a normalized GUI form like above is tricky. (e.g., 'If user selects "Other" for Question1, enable Question1_specify; otherwise, disable Question1_specify and jump to Question 2.). On either type of form, you have to write code to add such rules, but it's much easier to do this in a GUI based on a non-normalized structure.

So the choice of whether to normalize or not is up to you. As you can see, it depends on your needs, the project, and, most importantly, whether you plan to analyze you data in Access or a statistical program. I've built lots of survey databases and have always found the non-normalized approach much better suited to my needs. Of course, this doesn't stop the Access purists from shaking their fists at me.

Now back to the videos!

Video 5: Create a "Lookup Table" to Store the Answer Choices

http://www.researchtutorials.com/videos/dm/surveydatabaseA5.flv

Video 6: Create a Table to Store the Survey Questions and the Data you Collect

http://www.researchtutorials.com/videos/dm/surveydatabase6.flv

Once the tables are done, you’re ready to build your form. So when you’re ready, check out Create a Survey Database in Access: Part II.

  • Hotmail
  • Yahoo Mail
  • Evernote
  • Delicious
  • Digg
  • Share/Bookmark
, ,

30 Responses to “Create a Survey Database in Access: Part I”

  1. Paula Says:

    Great tutorial! Thanks for sharing.

    Now I’d like to extract a report from surveys collected. My rating scale is 1 to 5 and I’d like to be able to show how many responses were received for question 1 with a 1, how many 2’s, 3’s etc. With the drop down for answer choices, I’m not sure how to do that. I’ve tried a number of different things and still am unable to get the information to show properly on a report. Can you offer any insight on how to do this?

    Reply

    • Kurt Says:

      Paula:

      Good question. Recall from the tutorial that the non-normal structure makes it difficult to calculate statistics, sometimes even basic ones like you asked about. But where there’s a will there’s a way!

      I added a report which does what you need to the sample download available at this tutorial:

      http://www.researchtutorials.com/data-management/make-access-combo-boxes-drop-down-automatically/

      Basically, I wanted to calculate stats for 3 questions.

      1. I created 3 separate queries to calculate the number of people who selected each possible response for question 1, 2, & 3.

      2. Since I wanted to group the results of all 3 questions, I created a Union Query (can only do this in SQL view) to bring the results together into a single query.

      3. I created a report based on the Union Query, and added some headings, sort order (sort by filter, then value), and labels to calculate percentages.

      4. I created another report (just a blank one, not based on anything), and added the report in #3 to it. (So it’s now a subreport.)

      Take a look at the download and dig around. I hope it makes sense. I plan on doing a video tutorial for this but for now this should get you started.

      Reply

      • Paula Says:

        Thanks again for the posting Kurt. I followed the instruction to adjust the combo boxes. Works great. As for the report created, I don’t see that in the posting. I have created a query although I can’t seem to get proper results in counting the different ratings. I have yet to create the union query because of this. I’m not sure what I’m doing wrong. In creating your query how did you indicate to calculate each possible response to your questions? Obviously I’m missing something. Thanks for your help.

        Reply

        • Kurt Says:

          I should clarify. Go to the *end* of the tutorial I linked to and you’ll see an option to download a sample Access database. Download and open that database; that’s where I built the queries and report I described in my reply.

          Reply

        • Damian Says:

          I think I’m still trying to get what you managed to achieve.
          My scale is 1 to 6 and I want to know how many selected ‘1′ as their answer to question 1, how many selected 2, how many selected 3….etc. And so on for all the questions.
          The survey was conducted on a Board of Directors, each completing a Self-Assessment Questionnaire and then completing the same Questionnaire for each fellow Director.

          I have looked at Kurt’s sample database but I think I may be missing the obvious.

          Is my report request similar to the one youu were attempting?
          Regards
          Damian

          Reply

      • Andrew Says:

        Great tutorial! But I am also having issues with video 6 – it seems to cut off after a few minutes. Is there any chance it is stored at an alternative address that I might try and access?

        Reply

  2. Paula Says:

    Hey Kurt
    I think I’ve got it all sorted out now. Great example! My survery had numerous questions (A to Q)that I wanted to track so I had to create separate reports in order to view all the responses as I had too much data in the control source. It works well though as I have a basic survey as well as advanced. So I’ve put the advanced question results in a separate report.

    Thanks again for your help and tutorial.

    Reply

  3. Doug Says:

    Excellent tutorial! I found this site looking for a way to make my survey GUI behave exactly like you demonstrate with subsequent questions disabled based on a previous answer. Using the typical normalized structure it was next to impossible to figure out. I was a little disappointed to learn that the tutorial stopped short of showing the code you use to enable this. Any chance of taking a look at the code or getting a copy of the full application?

    Thanks.

    Reply

  4. Kirsten Says:

    Hello –

    Thank you for posting this great tutorial! It was exactly what I needed. I am having one stumbling block, however. First, I am working with Access 2007. I have built my lookup tbl as directed and I have built my survey question table but when I go into the table and click on Row Source line to build the drop down boxes, my lookup table does not appear. What appears is TABLE 1. If I click on this, it is merely my survey question table. I am stumped. I have tried saving both as Access 2003 compatible and trying over, but still no luck. Any ideas?

    Reply

  5. Kirsten Says:

    UPDATE!

    I think I got it working now – I was saving as separate databases instead of tables within one database. Thank you!

    Reply

  6. Damian Says:

    Kurt, you are to be commended. Excellent presentation skills.

    Video 6 in Part I freezes part way through.
    Is that my version only?
    Regards

    Reply

  7. Shubash Says:

    Do you have the tutorial on the survey designed from the normalized tables ? I’m designing a survey, however the questions are continually changing so I created a table and a simple form to update questions but the rest of it I’m stuck.

    Reply

    • Kurt Says:

      > Do you have the tutorial on the survey designed from the normalized tables ?

      Not yet. In the meantime, check out Duane Hookom’s At Your Survey database here: http://www.rogersaccesslibrary.com/forum/forum_posts.asp?TID=3

      It’s a great template and comes with built in reports, too. One limitation is that it doesn’t support “check all that apply” type questions. (You can do add these, but it requires making a lot of modifications.)

      Reply

  8. David Says:

    I am very confused about the normalized version of this survey. I have 100 surveys to enter into Access from a college, all of which are four pages, but plan to add multiple batches of 100 surveys in the near future (for other population samples– ie, other colleges).

    I want to keep each batch of 100 separate, and I will want to use regression at one point to analyze the data.

    I tried to follow your “normalized data tables” brief introduction, but it doesn’t make any sense to me. Your four proposed tables were: 1 for questions, 1 for possible answers, 1 for respondents, and 1 for completed surveys (”responses”). I like that idea of using the Responses table, but get stuck.

    The Questions table you show in datasheet view has a new record for each question– not a new field for each question. Yet your Answers table only shows “question ID” and “choice” for possible fields– that’s where I get stuck. What do I enter in for choice in the datasheet view of the Answer table for it to make sense later on when I link it with the Questions and Responses table. (For now, I don’t plan on keeping a Respondent table).

    Please let me know. Thank you so much.

    -David

    Reply

    • Kurt Says:

      > The Questions table you show in datasheet view has a new record for each question– not a new field for each question. Yet your Answers table only shows “question ID” and “choice” for possible fields– that’s where I get stuck. What do I enter in for choice in the datasheet view of the Answer table for it to make sense later on when I link it with the Questions and Responses table.

      tblAnswerChoices would contain entries like:

      QuestionID Choice
      1 Male
      1 Female
      2 Strongly Disagree
      2 Disagree
      2 Agree
      2 Strongly Agree
      etc.

      The data entry form you’ll create (see the last screenshot before Video 5) is based on a query that brings together tblQuestions *and* tblAnswerChoices. (In the query, the two tables are linked by QuestionID). The data entry form should be in Continuous view. The left side of the form includes fields from tblQuestions like QuestionNumber and QuestionText, and the right side of the form includes the field Choices from tblAnswerChoices.

      By keeping answer choices in a separate table, it does complicate things but it allows you to calculate in Access frequency statistics (how many people answered Male, how many answered Female, etc.).
      If you don’t need to do that in Access, you could easily put an AnswerChoice field in tblQuestions. It would be a combo box and the values would be populated from a value list or pulled from a lookup table.

      The normalized GUI for a survey database is tricky and very involved, and I’m leaving out a lot of steps. For more details about how to put it together, take a look at my earlier link (in a comment reply) to Duane Hookum’s AtYourSurvey database.

      Reply

      • David Says:

        Thank you very much for your thorough reply, Kurt. You answered my question to a “T.” I realized I was more concerned with being able to use the interface of Access to run frequency statistics than “taking the easy way out” with a non-normalized table, when in fact I can more readily determine those statistics within Excel with the added bonus of not having to group the data to also do regression. It’s not only easier, but it makes more sense. Function over form, as they say. Wonderful. Thank you again, you are great at what you do. I hope you have considered teaching if not already.

        Reply

  9. Leah Says:

    Hello,
    I am suddenly getting a “property value is too large” error message as I make my survey table in Access. Any suggestions? I do have a lot of fields, it’s a large survey. I’m doing it your way, the non-normalized way.
    Thanks!

    Reply

    • Kurt Says:

      Leah:

      How many fields do you have? This error usually occurs when you try to stuff too much information into one table. Although 255 is technically the maximum number of fields a table can support, Access can start to act weird even when a table begins to *approach* its limit.

      Try this:

      1. As a precaution, make a backup of your database: File > Back Up Database (This is the quick and easy way.)
      2. Compact and repair your database: Tools > Database Utilities > Compact and Repair (This will clean out the bloat – like deleted fields – that accumulates and remains stored in memory when developing a database. The bloat can make Access think it has 255 fields in a table when in fact it has fewer.)
      3. Test things out. If you still get the error:

      - Create a new database
      - With the new database open, and the old database closed, import all of the objects (tables, forms, etc.) from the old one, into the new: File > Get External Data > Import …

      4. Test things out.

      There are degrees of normalization, and you may need to consider implementing some level of normalization. For example, do you have a bunch of fields to store several examples of the same type of thing? (Probably so if you have several fields differentiated only by a number, like Color1, Color2, Color3.) Such groupings should be stored in a table of their own (e.g., tblColor), with a minimum of three fields:

      tblColors
      ———
      ColorID (Datatype: Autonumber; make this the primary key)
      SurveyID (or whatever the primary key of the main table is called; Datatype: Number)
      Color

      tblColor should be linked to you main table (tblSurveys?) a one-to-many relationship, such that “one survey can have many colors associated with it.” Data entry is handled via a continuous subform (based on tblColors), linked to the main form via SurveyID. The subform allows you to enter as many colors (or whatever the thing is) per survey as you want.

      I’ll do a tutorial on this soon. In the meantime, read over the comments where I address in more detail what I mean. Also Google “one to many” + Access for some examples.

      Reply

  10. psn codes free Says:

    Good day, I found out about your blog on Yahoo Blogs and think that this is actually quite intriguing and offers wonderful subject matter. Many thanks with regard to the fantastic post, I will certainly promote this on Twitter. Have a good day.

    Reply

  11. perry Says:

    it took alot of seaching but i found this site. it is exactly what i neeeded. thanks for your efforts.

    Reply

  12. perry Says:

    question

    both video 5 and 6 stop play after a couple of minutes. is it me or the site?

    Reply

  13. Enoch Says:

    Hi,
    I have been trying to create a form based on the normalized table but have not succeeded. Could you please show me how you did the one indicated in this presentation?

    Thank you.

    Reply

  14. Andrew Says:

    Hi Kurt,

    This is another very basic question, but I was able to successfully create an entry form for a simple survey I am conducting, but when I fill out the form and create multiple records, the data does not show up in the TblSurvey table. I am sure this is something simple, but I am an access novice! Thanks.

    Reply

  15. Atta Says:

    This is another very basic question, but I was able to successfully create an entry form for a simple survey I am conducting, but when I fill out the form and create multiple records, the data does not show up in the TblSurvey table. I am sure this is something simple, but I am an access novice! Thanks.

    Reply

  16. Tally Says:

    Creating conditional codes on a normalized GUI form like above is tricky. (e.g., ‘If user selects “Other” for Question1, enable Question1_specify; otherwise, disable Question1_specify and jump to Question 2.). On either type of form, you have to write code to add such rules, but it’s much easier to do this in a GUI based on a non-normalized structure.

    How do i make this happen (the above) Can you tell me where to put the rules and what the rules are?

    Reply

  17. Corey Says:

    great tutorial!

    simple question I hope…in your normalized table structure….is the “ResponseID” in tblRespondent supposed to be “respondentID” with a One to Many to respondentID in the tblresponse?

    Im trying to find out why there would be a response field in the Respondent table. Thanks!

    Reply

    • Kurt Says:

      Corey: Good catch! You’re exactly right: RespondentID would be a better name for those fields, because one Respondent can have many responses (in this case, a response to each question on the survey). It still works as it’s presented, but I agree that RespondentID better conveys the relation.

      Reply

  18. Cheryl Says:

    Awsome training. Exactly what I needed. I spent a few days trying to create a database for my “Subcontractor Feedback” and always ran into some roadblock. Although it’s not technically a survey, it is very similiar. You Rock!!!

    Reply


Leave a Reply