Scientific Method

A topic that fascinates me is how the scientific method has been modified and iterated upon to better fit the workflows of different niches of research and technology. The purpose of all of these methods is essentially the same: to find and report information. Because specialized frameworks streamline the workflows of different fields, it is interesting and informative to compare some of these specialized versions against each other, as well as against the basic scientific method.

The Scientific Method

The scientific method is a framework for answering questions and testing hypotheses through experimentation. The method can be described in many ways, but it almost always starts with asking a question and concludes with communicating results.

An example of the method’s workflow is summarized in Figure 1 below.

Figure 1: Scientific Method


The scientific method is usually the first framework a young STEM student is exposed to. The goal of the method is to answer a question by supporting or refuting a hypothesis through experimentation. It is a fantastic general tool for experiments in all scientific disciplines, and it has some built-in iteration to further explore a question should the initial hypothesis fail to completely explain the phenomenon being studied.

The Engineering Method

The scientific method is fantastic at answering questions such as, “Why does this phenomenon happen?” or, “If we increase x, how does it affect y?” Where it falls short, however, is when it is used to solve a problem.

The engineering method is a modified version of the scientific method that specializes in using product design to solve problems that have never been solved before. An example of the method’s workflow is summarized in Figure 2 below.

Figure 2: Engineering Method


The engineering method is the framework that I have the most experience with, and it is very near and dear to my heart. Finding and defining a problem that has never been solved and developing a solution to it is inspiring, and the engineering method is the roadmap from problem to solution. Like the scientific method, the engineering method has some built-in iteration to improve solutions and prototypes. An important detail, however, is that the problem definition and specified requirements do not change. This ensures that the solution moves to fit the problem and not the other way around. Redefining the problem to fit your first prototype always leaves your initial problem unsolved.
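To make the "requirements do not change" idea concrete, here is a minimal Python sketch of that loop. Everything in it is hypothetical and invented for illustration: the `Requirements` and `Prototype` classes, their fields, and the `design_loop` function are my own assumptions, not part of any formal definition of the engineering method. The only point is that the requirements are frozen while the prototype is free to change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the problem definition cannot be edited later
class Requirements:
    max_weight_kg: float  # hypothetical requirement
    min_load_kg: float    # hypothetical requirement


@dataclass
class Prototype:
    weight_kg: float
    load_kg: float


def meets(req, proto):
    """Check a prototype against the fixed requirements."""
    return proto.weight_kg <= req.max_weight_kg and proto.load_kg >= req.min_load_kg


def design_loop(req, proto, improve, max_rounds=10):
    """Revise the prototype (never the requirements) until it passes.

    Returns (prototype, round_number) on success, or (None, max_rounds)
    if no passing prototype was found within max_rounds.
    """
    for round_number in range(1, max_rounds + 1):
        if meets(req, proto):
            return proto, round_number
        proto = improve(proto)  # the solution moves toward the problem
    return None, max_rounds


# Example: each revision adds a little weight and a lot of load capacity.
req = Requirements(max_weight_kg=5.0, min_load_kg=100.0)
first_try = Prototype(weight_kg=4.0, load_kg=70.0)
final, rounds = design_loop(
    req, first_try, lambda p: Prototype(p.weight_kg + 0.25, p.load_kg + 15.0)
)
```

Because `Requirements` is a frozen dataclass, trying to assign to `req.min_load_kg` after the fact raises an error, which mirrors the rule that the problem is not redefined to fit the first prototype.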

The Data Science Method (CRISP-DM)

The data science method is an interesting hybrid of the two methods discussed above. Like the scientific method, data science processes are often meant to draw clarity from raw, initially unintelligible data. Like the engineering method, the data science method can also be used to solve business problems and often results in a newly developed product, such as a data model or application. To those inspired by STEM fields, the process has both the gratification of answering a question and the satisfaction of solving a new problem.

There are a few different versions of the data science process, but I will focus on the Cross Industry Standard Process for Data Mining (CRISP-DM), a very popular framework. An example of the method’s workflow is summarized in Figure 3 below.

Figure 3: CRISP-DM


A feature of the data science process that differentiates it from the other processes described in this post is its heavy emphasis on iteration. The data may reveal previously unknown business understanding, which could call for the collection of more data. The first model built may bring to light the need for further data scrubbing and preparation. Even after deploying a product, more knowledge may call for improvements in the product. The process leaves the option of continuous improvement open, which is both exciting and daunting. A work of art is never truly finished.
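The back-and-forth described above can be sketched as a small state machine. The six phase names are the standard CRISP-DM phases; the `TRANSITIONS` table is my own reading of the arrows in the commonly drawn CRISP-DM diagram, and the `is_valid_path` helper is a hypothetical name made up for this illustration rather than part of any library.

```python
from enum import Enum


class Phase(Enum):
    BUSINESS_UNDERSTANDING = "Business Understanding"
    DATA_UNDERSTANDING = "Data Understanding"
    DATA_PREPARATION = "Data Preparation"
    MODELING = "Modeling"
    EVALUATION = "Evaluation"
    DEPLOYMENT = "Deployment"


# Allowed moves between phases (an assumption based on the usual diagram).
# The backward moves are what make the process iterative.
TRANSITIONS = {
    Phase.BUSINESS_UNDERSTANDING: {Phase.DATA_UNDERSTANDING},
    Phase.DATA_UNDERSTANDING: {Phase.BUSINESS_UNDERSTANDING, Phase.DATA_PREPARATION},
    Phase.DATA_PREPARATION: {Phase.MODELING},
    Phase.MODELING: {Phase.DATA_PREPARATION, Phase.EVALUATION},
    Phase.EVALUATION: {Phase.BUSINESS_UNDERSTANDING, Phase.DEPLOYMENT},
    Phase.DEPLOYMENT: {Phase.BUSINESS_UNDERSTANDING},  # outer cycle: start again
}


def is_valid_path(path):
    """Return True if every consecutive pair of phases is an allowed move."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))


# A project where the first model sends us back for more data preparation.
project = [
    Phase.BUSINESS_UNDERSTANDING,
    Phase.DATA_UNDERSTANDING,
    Phase.DATA_PREPARATION,
    Phase.MODELING,
    Phase.DATA_PREPARATION,  # first model exposed messy data
    Phase.MODELING,
    Phase.EVALUATION,
    Phase.DEPLOYMENT,
]
```

Here `is_valid_path(project)` returns `True`, while a path that jumps straight from business understanding to modeling does not, since that move skips the data work in between.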

Final Thoughts

Data science is a relatively new field of study, and the standardized processes used in the field, such as CRISP-DM, reflect that in their more modern workflows. It seems that modern data science methods make use of the quick iteration possible with computers and platforms like Jupyter Notebooks. The workflow has been optimized for today’s tools, which makes the field of data science all the more exciting to be entering.

I have adopted CRISP-DM as my default process for future work, and I look forward to becoming more familiar with the framework as I use it in my career as an engineer and data scientist.
