Read and hear what students have to say about their experiences learning to conduct transparent and reproducible research with the TIER Protocol.

Mallory Hart ’16, Colgate University

...Compared to other research projects I had conducted before being introduced to TIER, I had a structure that helped me to more thoroughly explore my data and research topic... Because of my work with TIER, I began to approach other problems with a similar framework...

  • Read the complete testimonial.

    In using TIER for my year-long thesis, I realized that replicability is valuable not only for the academic community as a whole, but also for the individual process of research. Understanding how TIER orders data and code so that a project can be replicated gave me an organized structure for approaching my own work. Knowing that my code and data would need to be broken down for replication into three stages, (1) Import, (2) Processing, and (3) Results (sketched below), helped me tackle my problem one step at a time. This made the task of empirical research less overwhelming and also made me think critically about what I was doing before I did it. Rather than hasty and haphazard code, each line in a TIER project has a known purpose that clearly connects to the end goal. Conducting my project with this structure also taught me the importance of knowing my data inside and out before attempting to draw conclusions from econometric methods.

    Compared to other research projects I had conducted before being introduced to TIER, I had a structure that helped me to more thoroughly explore my data and research topic. The convenience of having all stages of my work in a simple and clear structure allowed me to perform more sophisticated processing and analysis than I had in the past. Keeping my original data files, along with the code that imported them, came in handy multiple times when I wanted to go back and change which variables were included or how they were treated, something I had been unable to do with ease in a previous project. I was able to move back and forth between different stages of my project without the usual headaches. I could easily go to an earlier stage, fix a problem, and pick up where I had left off without rewriting the rest of the code. The structure saved me time in the long run and gave me a clear picture of how best to manage my data and the stages of my work.

    TIER taught me the general importance of having an organizational plan before approaching a problem and the usefulness of documenting my work. Because of my work with TIER, I began to approach other problems with a similar framework. I tackled a research paper in an interdisciplinary class with a three-stage action plan based on each discipline, and I meticulously documented another project for a job I had on campus. I ended up writing the paper with much more ease because my research was broken down into stages with clear and concise notes. For the campus job, my bosses greatly appreciated the organized notes I had taken and recognized their value: other interns could conduct a similar project in the future and learn from what I had done. TIER’s emphasis on organization and transparency is something I will continue to benefit from in and out of the academic setting.
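
The three-stage layout Hart describes can be tied together by a short master do-file that runs each stage in order. The sketch below is illustrative only: the file names are hypothetical, and the exact files and folders vary across versions of the TIER Protocol.

    * run.do: a master file that executes the three TIER stages in order (file names are hypothetical)
    do "1-Import.do"         // read the raw data files and save importable copies
    do "2-Processing.do"     // clean and merge the data to construct the analysis data set
    do "3-Results.do"        // run the estimations and produce the tables reported in the paper

Running the master file from top to bottom regenerates every result directly from the original raw data, which is what makes the project replicable.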

Caitlin Gallagher ’15, Haverford College

...Rather than becoming lost in the tool and spending considerable time searching for errors, I was able to focus on the actual research and data analysis...My entire team, including my professor and research librarian, was able to easily access our files...

  • Read the complete testimonial.

    In fall 2014 I took a statistical methods course with Professor Ball and was introduced to Stata. As I had never used Stata before, I inevitably experienced occasional challenges with it. However, by recording commands in a "do-file" rather than entering them directly into Stata, I was able to identify my errors exactly (a minimal example appears below). The ability to retrace my steps and to determine where the issues arose not only made correcting the mistakes relatively easy, but also helped me to better understand how Stata works. Rather than becoming lost in the tool and spending considerable time searching for errors, I was able to focus on the actual research and data analysis.

    In addition to learning Stata, we learned techniques for managing and documenting data and associated files. We were introduced to the Open Science Framework (OSF), a platform that supported the organizational structure of our research projects. My entire team, including my professor and research librarian, was able to easily access our files, which included folders for raw data, written work, imported data, data analysis, and do-files. The combination of sound data documentation techniques and OSF allowed me to better understand Stata and avoid becoming lost in my own work. It also makes it easier for future scholars to replicate and extend my work. I believe it is critical that these techniques be adopted by all scholars conducting empirical research.
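
The difference between typing commands interactively and recording them in a do-file can be seen in a sketch as small as the one below. The file and variable names are hypothetical; the point is that every step lives in a file that can be re-run, so an error can be traced to the exact line that produced it.

    * survey-analysis.do: a minimal do-file (file and variable names are hypothetical)
    import delimited "RawData/survey.csv", clear    // read the raw file into Stata
    generate log_income = ln(income)                // construct the variable used in the analysis
    regress test_score log_income                   // run the regression reported in the write-up

In a project organized the way Gallagher describes, a file like this would sit in its own folder on OSF alongside the raw data, imported data, data analysis, and written work.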

Steven Evans ’15, Colgate University

...The TIER documentation I had put together easily saved me countless hours of attempting to recreate my research...I am convinced that the TIER Protocol should be the standard across all academic research papers...

  • Read the complete testimonial.

    During the summer after my junior year at Colgate, I had the opportunity to research wind turbine installations and their effect on the local real estate market with Prof. Michael O’Hara. I collected my data from the NYS GIS Clearinghouse and from a local real estate company. These two sources required substantial cleaning and modification before they were usable in my models. With the reluctant help of various students and professors in the geography department, I was able to import my raw data and use ArcGIS to perform the calculations required for several independent variables in my model. Fortunately, cleaning the real estate data set was a job I could handle largely on my own. After eight weeks of working with my data, I felt confident that I would be able to reproduce my data without any trouble if I needed to work with it again in the future, but Prof. O’Hara insisted that I follow the TIER Protocol and document my work. Recording each of my steps was painful (especially in the geography lab), but I’m exceedingly glad I had a comprehensive log of my work. Half a year later, while embarking on my capstone paper, I decided to expand on my summer research. The TIER documentation I had put together easily saved me countless hours of attempting to recreate my research.

    Following my experience at Colgate, I am convinced that the TIER Protocol should be the standard across all academic research papers. Without a system for documenting each step used to produce results from raw datasets, I cannot imagine a practical way of comprehensively reviewing either my own work or work done by my peers.

Patrick Haneman ’12, Haverford College

...without documented do-files, I would have had no chance of recalling each and every nuance of my prior work with the data... I came to appreciate that what people do with data to draw conclusions can and should be transparent not only to themselves but also to anyone interested in reproducing the results down the line.

  • Read the complete testimonial.

    As I have learned throughout my life, though especially in the past year while working on my thesis and searching for a job, organization is extremely important. Regarding my thesis, it was vital that I be organized because I often had to refer back to steps I had taken months earlier. For instance, at one point during my analysis, I noticed my results for steals and blocks were quite unusual. (Specifically, I was finding evidence suggesting that road teams had significant advantages in both categories.) Since I kept my "raw data" (i.e., data as it appeared when I collected it from an outside source) and "importable data" (i.e., data that I "cleaned up" and categorized before importing it into a statistical software program called Stata) organized, I was able to refer back to the raw and importable data and discover that I had incorrectly labeled "home team blocks" as "road team blocks" and vice versa.

    Moreover, because the project lasted for months, I maintained do-files to keep track of each and every step I took during my analysis. If I had instead adjusted and analyzed my data interactively (through Stata) without documented do-files, I would have had no chance of recalling each and every nuance of my prior work with the data, and thus would have been unable to reproduce my results in the long run.

    And of course, my do-files themselves had to be organized. At the end of each day of thesis work, I saved updated versions of my two major do-files, titled "Import" and "Analysis" (a sketch of this two-file setup appears below). As Professor Ball continually stressed, saving a new do-file each day (with a name like "Updated Import on Feb 2") would have led to clutter in the short term and, in the long run, to confusion about which version of the work was the most up to date.

    Overall, while adhering to Professor Ball's replicability structure, I came to appreciate that what people do with data to draw conclusions can and should be transparent not only to themselves but also to anyone interested in reproducing the results down the line.
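
The two-file setup and the relabeling fix Haneman describes might look something like the sketch below. All file and variable names are hypothetical, the regression is purely illustrative, and the grouped rename syntax assumes Stata 12 or newer.

    * Import.do: build the importable data set from the raw files (names are hypothetical)
    import delimited "RawData/boxscores.csv", clear
    rename (home_blocks road_blocks) (road_blocks home_blocks)   // correct the swapped home/road labels found in the raw data
    save "ImportableData/games.dta", replace     // "replace" overwrites yesterday's copy, so only one current version exists

    * Analysis.do: every analysis step, run on the importable data
    use "ImportableData/games.dta", clear
    regress road_blocks home_court               // illustrative specification only

Because the correction lives in Import.do rather than in an undocumented interactive session, re-running the two files reproduces the corrected results from the raw data every time.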

Giff Brooks ’12, Haverford College

...recording my analysis electronically is beneficial insofar as it increases the transparency, and thus credibility, of my findings. My work can be reproduced by anyone, anywhere...if I realize I made a mistake editing or organizing my data (or simply want to try a different tack in my analysis), I can tweak my do-file to rectify my mistake...

  • Read the complete testimonial.

    I can point to four main benefits that arise from documenting and recording economic (or any other data-centric) analysis in a reproducible form, for example in a Stata do-file. Doing so is useful because it is transparent, it is convenient, it lets me comment on my own work, and it keeps my project organized.

    First, recording my analysis electronically is beneficial insofar as it increases the transparency, and thus credibility, of my findings. My work can be reproduced by anyone, anywhere. If any of my findings or analysis is incorrect, this vastly improves the chances that my errors will be noticed by a sharp analyst.

    Second, keeping electronic files that document my analysis is simply convenient. As a project develops, it is quite easy to open my do-file and pick up where I left off the day before. Compare that to trying to recreate my analysis, and then build on it, day after day. Moreover, if I realize I made a mistake editing or organizing my data (or simply want to try a different tack in my analysis), I can tweak my do-file to rectify my mistake. This is vastly superior to having to start anew with the raw data, as I would have to do absent documentation of my previous work.

    Third, the commenting feature included in most analytical software packages is a tool that should not be overlooked. In my experience, interweaving comments (i.e., text ignored by the program but visible to the analyst) with my commands has been immensely helpful (a short example appears below). Lines of code that might otherwise look like gobbledygook can be made readable with informative comments. Furthermore, comments can remind me why I chose to run a certain test or create a certain variable, and they can interpret a given result "in plain English."

    Finally, maintaining an electronic record has aided me, especially late in my project, by keeping things organized. One do-file, for example, can contain every single step necessary for an analysis, from pulling in raw data from the internet to outputting the results as neat tables in Microsoft Word. Recording my analysis in such a form ensures that I never forget to run a given test, or mix up the order of my commands, or irreparably harm my data.
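
A do-file with interwoven comments of the kind Brooks describes might look like the sketch below. All file and variable names are hypothetical, and the final line assumes the user-written estout package (installed with "ssc install estout") for exporting a table that opens in Microsoft Word.

    * analysis.do: one file covering every step, with comments throughout (names are hypothetical)
    import delimited "raw/city_data.csv", clear      // Step 1: pull in the raw data
    generate log_rent = ln(rent)                     // take logs because the model is specified in elasticities
    generate log_pop  = ln(population)
    regress log_rent log_pop                         // Step 2: main specification
    * The coefficient on log_pop is the elasticity of rents with respect to population.
    esttab using "results/table1.rtf", replace       // Step 3: export a formatted table readable in Word

Re-running this single file repeats every step in the same order, which is what keeps a forgotten test, a shuffled command, or an accidental change to the data from slipping through unnoticed.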