Data Sources Guide

The Data Sources Guide is a document in which you record information on the sources of your Input Data Files.

Contents of the Data Sources Guide

The information you provide in the Data Sources Guide about an Input Data File differs depending on whether you obtained the data file from an existing source or created it yourself.

Information about existing data files

For each Input Data File you obtain from an existing source, the Data Sources Guide should provide the following information:

A bibliographic citation for the Input Data File

    • For some data sets, the data provider gives a suggested citation style. If this is the case, use the style suggested by the provider.
    • If the provider does not suggest a style, use whatever editorial style you use throughout your report (e.g., APA, Chicago, a publisher-specific style, or formatting guidelines given by your instructor).
    • If a Digital Object Identifier (DOI) has been assigned to the data file, include that with the citation.
    • Indicate the date you downloaded the data file.

Instructions explaining how a reader can obtain copies of the Input Data File directly from the original source

    These instructions should be written in plain English, in complete sentences.

    They should be detailed and precise enough that a reader who follows them can realistically locate and obtain a copy of the Input Data File that is identical to the one stored in your InputData/ folder.

    Note that a bibliographic citation, even if it includes a DOI, does not generally give enough information to enable a reader to find a specific data file. For example, a single DOI can be attached to an object containing multiple data files. Bibliographic citations are intended to give credit to the producers and/or distributors of the data, not necessarily to indicate how a user can obtain the data.

    Similarly, although it is often useful to indicate a web address or URL where the data can be accessed, only giving a web address is usually not sufficient, because many data files may be posted at a single URL.

    In the case of restricted data for which a user must apply for authorization, your instructions should also indicate whom the user should contact and what steps must be taken to gain authorization.
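One way to help a reader confirm that the file they obtained is identical to the one in your InputData/ folder is to report a checksum alongside the instructions. This is not something the guide requires; the sketch below is only an illustration, and the function names `sha256_of_file` and `files_are_identical` are hypothetical helpers introduced here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_are_identical(path_a, path_b):
    """Return True when both files have the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

For instance, you could run `sha256_of_file` on a file in InputData/ and quote the resulting digest in the Data Sources Guide; a reader who downloads the file from the original source can then compute the digest of their copy and compare the two.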

A note on the availability of a codebook for the Input Data File

    A codebook for a data file contains information a user would need to be able to understand and work with the data, such as names, definitions and coding schemes for the variables, the population or universe from which the data were drawn, and the sampling procedure used to select the observations. Documents that provide this kind of information are also sometimes called user's guides or data dictionaries.

    Producers and distributors of public-use data generally make codebooks or similar documentation available with their datasets.

    If a codebook for the Input Data File is available in a format that can be downloaded and saved on your computer (e.g., in a .pdf file), you should store a copy of it in your Metadata/ folder. In that case, you should include a note in your Data Sources Guide indicating that the codebook is stored there.

    If a codebook is only available in a format that cannot be stored in your documentation (e.g., an interactive web interface where you can search for information about the Input Data File), your Data Sources Guide should include a note indicating how a user can access that information (e.g., a URL for the web interface where the information is available).

Information about data files you create yourself

For each Input Data File you create yourself, the Data Sources Guide should provide the following information:

● A description of how you constructed the data file

    For example:

    • If you created a data file by scraping data from the web, explain qualitatively what the procedure involved, and provide a copy of the script that implemented it.
    • If you created a data file containing responses to a survey you conducted, describe the survey qualitatively, and provide a copy of the survey instrument.
    • If you created a data file by conducting an experiment, give details on your experimental methods.

● A note on the availability of a codebook for the data file

    When you construct a data file yourself, there is no pre-existing codebook. You must therefore write a codebook for every Input Data File you create yourself, and save it in the Metadata/ folder.

    See the Codebooks page for details about writing a codebook.

    In your Data Sources Guide, write a note giving the filename of the codebook and indicating that a copy is available in your Metadata/ folder.

Writing the Data Sources Guide

  • If you have multiple Input Data Files, the Data Sources Guide should be organized in sections, with one section providing the information outlined above for each of the Input Data Files.
  • You may use any word processing or typesetting software you like (e.g., Microsoft Word, Google Docs, or LaTeX) to write the Data Sources Guide.
  • The final version of the Data Sources Guide you save in your documentation should be in .pdf format.
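The one-section-per-file organization described above can be sketched as a small outline generator. This is only an illustration of the structure, not a required tool: the function name `build_outline`, the item wording, and the plain-text layout are assumptions introduced here, and the resulting outline would still need to be filled in and saved as a .pdf.

```python
# Items to cover for a file obtained from an existing source.
EXISTING_SOURCE_ITEMS = [
    "Bibliographic citation (with DOI, if assigned, and download date)",
    "Instructions for obtaining the file from the original source",
    "Note on the availability of a codebook",
]

# Items to cover for a file you created yourself.
SELF_CREATED_ITEMS = [
    "Description of how the data file was constructed",
    "Note on the availability of a codebook",
]


def build_outline(input_files):
    """Build a plain-text outline with one section per Input Data File.

    `input_files` is a list of (filename, created_by_author) pairs, where
    `created_by_author` is True for files you created yourself.
    """
    lines = ["Data Sources Guide", ""]
    for filename, created_by_author in input_files:
        items = SELF_CREATED_ITEMS if created_by_author else EXISTING_SOURCE_ITEMS
        lines.append(filename)
        for item in items:
            lines.append("  - " + item + ":")
        lines.append("")  # blank line between sections
    return "\n".join(lines)
```

Calling `build_outline` with a list of your own file names and flags returns a text skeleton you can paste into your word processor and complete section by section.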

Naming the Data Sources Guide

Give your Data Sources Guide the name DataSources.pdf.