Establishing a Workflow for 2019 (and beyond)


For DataBased’s inaugural blog post (🎉) I wanted to write about something that I always find myself thinking about when I start a new project: workflows! No workflow is perfect, nor is any workflow linear. But having one is helpful because it is created for the sole purpose of working for you. No matter what form it takes, a workflow informs each step that you take.

What is a workflow?

Instead of wading through Stack Overflow first or googling without a focus, I refer to a specific set of steps to get my mental cogs moving and my feet on the ground. That’s all a workflow is. It is akin to a mental checklist that activates whenever you start a new project. The domain or the type of project does not really matter. A workflow can be applied to anything that has several moving parts.

What I Currently Do

Over time, I’ve come up with a workflow that is a result of experiences I’ve accumulated over the course of my undergraduate journey thus far. From course projects to team-based work and independent data analyses to lab presentations, my workflow has kept me tethered to some idea of what I should do next. It has 7 steps:

  1. Inspiration: Coming up with an idea to tackle
  2. Ideation: Brainstorming ways to execute the idea
  3. Implementation: Carrying out what was identified in step 2
  4. Analysis: Going over and fine-tuning findings produced by step 3
  5. Cleaning: Taking a step back to address lingering problems stemming from step 4. I make sure to ask for help and to look into helpful resources if I’m stuck. Finishing touches are also added at this point.
  6. Presenting: Packaging my efforts in a format that can be shared with other people.
  7. Reflecting: Taking a moment to think about the process and address what I could have done differently or better.

Where in the world did these steps come from?

In the context of data science, I’m still getting my bearings on what I should do and when I should do it. In retrospect, however, I’ve realized that it is more productive to pick up ideas and follow my instincts along the way than to wait idly.

Inspiration & Ideation

For example, for ten weeks during the summer of 2017, I participated in a data science REU experience hosted by my school, the University of Massachusetts Amherst (UMass). There I worked with the Expanding Computing Education Pathways (ECEP) Alliance. My project was primarily concerned with coming up with a way to reorganize ECEP’s internal website to best serve their network of alliance stakeholders across the United States. As I worked on this, I recall feeling overwhelmed. After researching what others have done to adequately serve end users, I eventually stumbled on IDEO (a design firm) and their human-centered design approach.

IDEO’s approach ultimately shaped my inspiration and ideation steps that summer. The rest of the steps that I used (different from the steps that I have now) came naturally as I familiarized myself with IDEO’s philosophy and exposed myself to resources that delved into UX/UI design and web programming.

My 2017 workflow from the REUMass poster I made for my project with the ECEP Alliance.

Implementation, Analysis, & Presenting

The implementation, analysis, and presentation steps came from work on my honors thesis and courses in programming and business at UMass. Through them, I have had the opportunity to take the reins and create something of my own design. But in a world full of ideas, do any of them matter if they’re not acted on? It’s easy to come up with ideas; the most difficult part is actually creating something that is tangible and presentable.

Whenever my code was not readable, my numbers weren’t making sense, or other people were having a hard time understanding what I shared, I knew it was time to go back to the drawing board. Continually working through these problems in my academic work is what added these steps to my workflow.

Cleaning & Reflecting

At the end of each semester, I always look back on what went well and what I could have done differently, so that I can improve over break for the next semester. Because I am detached from my projects at that point, it is easy to see the rough edges that I might have missed. Though cleaning up after submitting something is too late for deadlines, it still helps to revisit old projects to tie up loose thoughts and ideas.

But I don’t have to wait for an entire semester to clean up and reflect. By allocating dedicated time during a project to clean up and reflect, I can reap the same benefits and put them to use while the project is still fresh. Looking back also inspires me with new ways to approach problems in the future. As a result, it is usually a productive process that I try to do often.

A Workflow’s Role in Data Science

My workflow likely won’t be the same as my peers’ because everyone gets work done differently. When I need a workflow that better matches the type of work I’m doing, I find no shame in researching what others do. Doing this gives me the opportunity to tweak and fine-tune my own steps.

In data science, there seem to be a number of archetypal steps. For example, in a 2013 blog post in Communications of the ACM, Philip Guo provides an overview of a typical data science workflow.

A screenshot of a typical data science workflow from Philip Guo’s post in Communications of the ACM.

We share the analysis, reflection, and presentation (dissemination) steps, but we define them slightly differently. Our paths are also not linear. My steps are numbered just to convey the order they naturally flow in. In reality, I move backwards and forwards as many times as necessary. For example, if I am stuck during the cleaning step, I will consult my peers, professors, and online resources for insight. If I end up uncovering an issue that needs to be resolved, I may go back a couple of steps to try something else.
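For readers who think in code, here is one way the back-and-forth movement described above could be sketched in Python. This is only an illustration under my own assumptions: the `Step` enum and the `move` helper are hypothetical names made up for this example, and the walk-through at the end is invented, not a record of a real project.

```python
from enum import IntEnum


class Step(IntEnum):
    """The seven workflow steps, numbered in their natural order."""
    INSPIRATION = 1
    IDEATION = 2
    IMPLEMENTATION = 3
    ANALYSIS = 4
    CLEANING = 5
    PRESENTING = 6
    REFLECTING = 7


def move(current: Step, offset: int) -> Step:
    """Move forward (positive offset) or backward (negative offset).

    The result is clamped so it never goes past the first or last step.
    """
    target = max(Step.INSPIRATION, min(Step.REFLECTING, current + offset))
    return Step(target)


# Hypothetical walk-through: stuck while cleaning, go back a couple of
# steps, then work forward again one step at a time.
path = [Step.CLEANING]
path.append(move(path[-1], -2))  # back to implementation
path.append(move(path[-1], 1))   # forward to analysis
path.append(move(path[-1], 1))   # forward to cleaning again
print(" -> ".join(step.name.title() for step in path))
```

The numbering only fixes a default order; nothing stops a project from revisiting the same step several times, which is the point of the sketch.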

I think Guo’s post, and others like it, places meaningful emphasis on the fact that workflows are relative. The main difference is how they are applied and optimized for the particular project at hand.

As I continue to make progress with my education and independent projects, my workflow will help to ensure that I’m not at a standstill. Sometimes I will have to diverge from what I have laid out and that’s okay. Ultimately, by establishing these steps, I know where to start and how to get to the end.

I hope sharing my steps helps you fine-tune your own steps or inspires something new.

Here’s to a great 2019! ✨