Read and hear what students have to say about their experiences learning to conduct transparent and reproducible research with the TIER Protocol.
Samantha Wetzel '18, Haverford College
"Project TIER emphasizes clear, concise documentation in the form of Stata do-files that is critical for the sake of transparency and replication within the academic community. Not only does it prove credibility, but easy-to-follow code also allows researchers to extend and enhance the robustness of one’s research."
Read Samantha's complete testimonial.
In my Junior Research Seminar on the Federal Reserve, my final paper was the framework for what would become my year-long senior thesis project. I collected data, formatted it, and conducted Stata analysis. However, when it was time to begin my thesis, I lost this work and could not remember what steps I had taken, because I had failed to document my code well enough to replicate and understand my previous analysis. Project TIER, though, changed the way I managed my senior thesis, and ultimately led me to co-publish organized, focused, and transparent research with my advisor, Professor Carola Binder.
First, Project TIER emphasizes clear, concise documentation in the form of Stata do-files that is critical for the sake of transparency and replication within the academic community. Not only does it prove credibility, but easy-to-follow code also allows researchers to extend and enhance the robustness of one’s research. At the same time, it allows overlooked errors to come to light, which is critical for us, as a community of researchers, to learn and find honest answers to our questions.
Additionally, formatting my data, analyzing it, and commenting on it was a rewarding learning process that helped me fully understand my data’s strengths and limitations and the reasoning behind each of my statistical methods. Before my thesis, I had a basic understanding of Stata syntax, but Project TIER led me to become more comfortable and better equipped to conduct research beyond my undergraduate studies.
Project TIER also benefits the collaborative research process. For example, my advisor and I decided to submit portions of my thesis for publication. She was able to efficiently validate and extend the findings, and we were both able to work simultaneously on furthering the research.
Lastly, as a student-athlete at Haverford College, I found that Project TIER gave me the tools to structure my research and made me feel in control of a process that can often be overwhelming. I play women’s basketball, and the busiest part of my season coincides with primary data entry, processing, and analysis; Project TIER helped me develop code in which each line had a purpose and each do-file had an end goal. Further, I could manage my time more efficiently. When I detected mistakes, I did not start from scratch; instead, I could quickly locate where errors had occurred. Overall, Project TIER is an extremely valuable tool for anyone conducting quantitative research, as it guides individuals to create a narrative of their analysis that not only benefits their individual work but also supports the exploration and strengthening of ideas within the academic community.
Mallory Hart ’16, Colgate University
"Compared to other research projects I had conducted before being introduced to TIER, I had a structure that helped me to more thoroughly explore my data and research topic... Because of my work with TIER, I began to approach other problems with a similar framework."
Read Mallory's complete testimonial.
In using TIER for my year-long thesis, I realized that replicability is not only valuable for the academic community as a whole, but also for individual processes of research. Understanding the stages TIER establishes for ordering data and code so that they can be replicated gave me an organized structure with which to approach my project. Knowing that my code and data would need to be broken down for replication into three stages, (1) Import, (2) Processing, and (3) Results, helped me tackle my problem one step at a time. This made the task of empirical research less overwhelming and also made me think critically about what I was doing before I did it. Rather than hasty and haphazard code, each line in a TIER project has a known purpose that clearly connects to the end goal. Conducting my project with this structure also taught me the importance of knowing my data inside and out before attempting to draw conclusions from econometric methods.
Compared to other research projects I had conducted before being introduced to TIER, I had a structure that helped me to more thoroughly explore my data and research topic. The convenience of having all stages of my work in a simple and clear structure allowed me to perform more sophisticated processing and analysis than I had in the past. Having my original data and the code that imported those files came in handy multiple times when I wanted to go back and change which variables were included or how they were treated, which I had been unable to do with ease in a previous project. I was able to move back and forth between different stages of my project without a headache. I could easily go to an earlier stage of my project, fix a problem, and move on to where I had left off without rewriting the rest of the code. The structure saved me time in the long run and gave me a clear picture of how best to manage the data and stages of my work.
TIER taught me the general importance of having an organizational plan before approaching a problem and the usefulness of documenting my work. Because of my work with TIER, I began to approach other problems with a similar framework. I approached a research paper in an interdisciplinary class with a three-stage action plan based on each discipline, and also meticulously documented another project for a job I had on campus. I ended up writing the paper with much more ease since my research was broken down into stages with clear and concise notes. For the project at my campus job, my bosses greatly appreciated the organized notes I had taken and recognized their value, since other interns could conduct a similar project in the future and learn from what I had done. TIER’s emphasis on organization and transparency is something that I will continually benefit from in and out of the academic setting.
Caitlin Gallagher ’15, Haverford College
"Rather than becoming lost in the tool and spending considerable time searching for errors, I was able to focus on the actual research and data analysis...My entire team, including my professor and research librarian, were able to easily access my team’s files."
Read Caitlin's complete testimonial.
In fall 2014 I took a statistical methods course with Professor Ball and was introduced to Stata. As I had never used Stata before, I inevitably experienced occasional challenges with it. However, by recording commands in a "do-file" rather than entering commands directly into Stata, I was able to identify my errors exactly. The ability to retrace my steps and to determine where the issues arose not only facilitated a relatively easy correction of the mistakes, but also helped me to better understand how Stata works. Rather than becoming lost in the tool and spending considerable time searching for errors, I was able to focus on the actual research and data analysis.
In addition to learning Stata, we learned techniques for managing and documenting data and associated files. We were introduced to the Open Science Framework (OSF), a platform that aided the organizational structure of our research projects. My entire team, including my professor and research librarian, were able to easily access my team’s files, which included folders for raw data, written works, imported data, data analysis, and do-files. The combination of using correct data documenting techniques and OSF allowed me to better understand Stata and avoid becoming lost in my own work. It also facilitates the replication of my work and its extension by future scholars. I believe it is critical that these techniques be implemented by all scholars conducting empirical research.
Steven Evans ’15, Colgate University
"The TIER documentation I put together easily saved me countless hours of attempting to recreate my research...I am convinced that the TIER protocol should be the standard across all academic research papers."
Read Steven's complete testimonial.
During my junior year summer at Colgate, I had the opportunity to research wind turbine installations and their effect on the local real estate market with Prof. Michael O’Hara. I collected my data from the NYS GIS clearinghouse and from a local real estate company. These two sources required substantial cleaning and modification before they were usable in my models. With the reluctant help of various students and professors in the geography department, I was able to import my raw data and use ArcGIS to make the calculations required for several independent variables in my model. Fortunately, cleaning the real estate data set proved to be a unilateral effort and I was able to make the necessary changes largely on my own. After eight weeks of working with my data, I felt confident that I would be able to reproduce my data without any trouble if I needed to work with it again in the future, but Prof. O’Hara insisted that I follow the TIER protocol and document my work. Recording each of my steps was painful (especially in the geography lab), but I’m exceedingly glad I had a comprehensive log of my work. Half a year later, while embarking on my capstone paper, I decided to expand on my summer research. The TIER documentation I put together easily saved me countless hours of attempting to recreate my research.
Following my experience at Colgate, I am convinced that the TIER protocol should be the standard across all academic research papers. Without a system of documenting each step used to produce results from raw datasets, I cannot imagine a practical way of comprehensively reviewing either my own work or work done by my peers.
Patrick Haneman ’12, Haverford College
"Without documented do-files, I would have had no chance of recalling each and every nuance of my prior work with the data... I came to appreciate that what people do with data to draw conclusions can and should be transparent not only to themselves but also to anyone interested in reproducing the results down the line."
Read Patrick's complete testimonial.
As I have learned throughout my life, though especially in the past year while working on my thesis and searching for a job, organization is extremely important. Regarding my thesis, it was vital that I be organized because I often had to refer back to steps I had taken months earlier. For instance, at one point during my analysis, I noticed my results for steals and blocks were quite unusual. (Specifically, I was finding evidence suggesting that road teams had significant advantages in both categories.) Since I kept my "raw data" (i.e., data as it appeared when I collected it from an outside source) and "importable data" (i.e., data that I "cleaned up" and categorized before importing into a statistical software program called Stata) organized, I was able to refer back to the raw and importable data and discover that I had incorrectly labeled "home team blocks" as "road team blocks" and vice versa.
Moreover, because the project lasted for months, I maintained do-files to keep track of each and every step I took during my analysis. If I had instead adjusted and analyzed my data interactively (through Stata) without documented do-files, I would have had no chance of recalling each and every nuance of my prior work with the data, and thus I would have been unable to reproduce my results in the long run.
And of course, my do-files themselves had to be organized. At the end of each day of thesis work, I saved updated versions of my two major do-files (titled "Import" and "Analysis"). As Professor Ball continually stressed, saving new do-files each day (such as "Updated Import on Feb 2") would lead to clutter in the short term and questions about which file was the most up-to-date version in the long run.
Overall, while adhering to Professor Ball's replicability structure, I came to appreciate that what people do with data to draw conclusions can and should be transparent not only to themselves but also to anyone interested in reproducing the results down the line.
Giff Brooks ’12, Haverford College
"Recording my analysis electronically is beneficial insofar as it increases the transparency, and thus credibility, of my findings. My work can be reproduced by anyone, anywhere...if I realize I made a mistake editing or organizing my data (or simply want to try a different tack in my analysis), I can tweak my do-file to rectify my mistake."
Read Giff's complete testimonial.
I can point to four main benefits that arise from documenting and recording economic (or any other data-centric) analysis in a reproducible fashion, e.g., in a Stata do-file. Doing so is useful because it is transparent, it is convenient, it offers an opportunity to comment on my own work, and it maintains organization.
First, recording my analysis electronically is beneficial insofar as it increases the transparency, and thus credibility, of my findings. My work can be reproduced by anyone, anywhere. If any of my findings or analysis is incorrect, this vastly improves the chances that my errors will be noticed by a sharp analyst.
Second, keeping electronic files that document my analysis is simply convenient. As a project develops, it is quite easy to open my do-file and pick up where I left off the day before. Compare that to trying to recreate my analysis (and then build on it) day after day. Moreover, if I realize I made a mistake editing or organizing my data (or simply want to try a different tack in my analysis), I can tweak my do-file to rectify my mistake. This is vastly superior to having to start anew with the raw data, as I would have to do absent documentation of my previous work.
Third, the commenting feature included in most analytical software packages is a tool that should not be overlooked. In my experience, interweaving comments (i.e. text unread by the program, but visible to the analyst) with my commands has been immensely helpful. Lines of code that might otherwise look like gobbledygook can be enhanced with informative comments. Furthermore, they can be used to remind me of why I have chosen to run a certain test or create a certain variable, as well as to interpret a given result "in plain English."
Finally, maintaining an electronic record has aided me, especially late in my project, by keeping things organized. One do-file, for example, can contain every single step necessary for an analysis, from pulling in raw data from the internet to outputting the results as neat tables in Microsoft Word. Recording my analysis in such a form ensures that I never forget to run a given test, or mix up the order of my commands, or irreparably harm my data.