Stash - Data Engineer

Who We Are

Stash is a digital-first financial services company committed to making saving and investing accessible to everyone. By breaking down barriers and building transparent, technology-driven products, we help the 99% build smarter financial habits so they can confidently save more, grow wealth, and enjoy life.

Position Details 

At Stash, data is at the core of how we make decisions and build great products for millions of users. As a Data Engineer, you will be part of our Data Platform Team, which leads the architectural design and implementation of a modern data infrastructure at scale. You will build distributed services and large-scale processing systems that help teams across the company work faster and smarter. You will partner with Data Science to productionize machine learning models and algorithms, turning them into data-driven features that make our products smarter for our users.

Tools and technologies in our tech stack (evolving):

  • Hadoop, YARN, Spark, MongoDB, Hive

  • AWS EMR/EC2/Lambda/Kinesis/S3/Glue/DynamoDB/API Gateway, Redshift

  • Elasticsearch, Airflow, Terraform

  • Scala, Python

What you'll do:

  • Build core components of the data platform that serve a range of consumers, including but not limited to data science, engineering, product, and QA

  • Build data ingestion and transformation jobs as they are needed

  • Productionize our machine learning models and algorithms into data-driven feature MVPs that scale

  • Leverage best practices in continuous integration and deployment to our cloud-based infrastructure

  • Build scalable data services that bridge the gap between the analytics and application spaces

  • Optimize data access and consumption for our business and product colleagues

  • Develop an understanding of key product, user, and business questions

Who we’re looking for:

  • 3+ years of professional experience working in data engineering

  • BS/MS in Computer Science, Engineering, Mathematics, or a related field

  • You have built large-scale data products and understand the tradeoffs made when building these features

  • You have a deep understanding of system design, data structures, and algorithms

  • Experience working with (or a strong interest in) Python or Scala

  • Experience working with a cluster manager (YARN / Mesos / Kubernetes)

  • Experience with distributed computing and working with Spark, Hadoop, or the MapReduce framework

  • Experience working on a cloud platform such as AWS

  • Experience with ETL in general

Gold stars:

  • Experience working with Apache Airflow

  • Experience working with AWS Glue

  • Experience in Machine Learning and Information Retrieval