Who We Are
Stash is a digital-first financial services company committed to making saving and investing accessible to everyone. By breaking down barriers and building transparent, technology-driven products, we help the 99% build smarter financial habits so they can confidently save more, grow wealth, and enjoy life.
Position Details
At Stash, data is at the core of how we make decisions and build great products for millions of users. As a Data Engineer, you will be a part of our Data Platform Team, which leads the architectural design decisions and implementation of a modern data infrastructure at scale. You will build distributed services and large-scale processing systems that help teams across the company work faster and smarter. You will also partner with Data Science to productionize machine learning models and algorithms into data-driven products for our users.
Tools and technologies in our tech stack (evolving):
Hadoop, Yarn, Spark, MongoDB, Hive
AWS EMR/EC2/Lambda/Kinesis/S3/Glue/DynamoDB/API Gateway, Redshift
ElasticSearch, Airflow, and Terraform.
Scala, Python
What you'll do:
Build core components of the data platform, serving consumers including, but not limited to, data science, engineering, product, and QA
Build data ingestion and transformation jobs as needed
Productionize our machine learning models and algorithms into data-driven feature MVPs that scale
Leverage best practices in continuous integration and deployment to our cloud-based infrastructure
Build scalable data services to bridge the gap between analytics and application space
Optimize data access and consumption for our business and product colleagues
Develop an understanding of key product, user, and business questions
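To give a flavor of the ingestion-and-transformation work above, here is a minimal extract/transform/load sketch in Python. All names (`Deposit`, `extract`, `transform`, `load`) are hypothetical illustrations; real pipelines here would read from sources like Kinesis or S3 and write to stores like Redshift rather than in-memory collections.

```python
from dataclasses import dataclass

# Hypothetical record shape for illustration only.
@dataclass
class Deposit:
    user_id: str
    amount_cents: int

def extract(raw_rows):
    """Parse raw dict rows into typed records, skipping malformed ones."""
    records = []
    for row in raw_rows:
        try:
            records.append(Deposit(user_id=str(row["user_id"]),
                                   amount_cents=int(row["amount_cents"])))
        except (KeyError, ValueError):
            continue  # a production job would route these to a dead-letter queue
    return records

def transform(records):
    """Aggregate total deposits per user, converting cents to dollars."""
    totals = {}
    for r in records:
        totals[r.user_id] = totals.get(r.user_id, 0) + r.amount_cents
    return {uid: cents / 100 for uid, cents in totals.items()}

def load(aggregates, sink):
    """Write aggregates to a sink; a plain dict stands in for a warehouse table."""
    sink.update(aggregates)
    return sink
```

The same extract/transform/load shape carries over to Spark or Glue jobs, where each stage operates on distributed DataFrames instead of Python lists.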
Who we’re looking for:
3+ years of professional experience working in data engineering
BS / MS in Computer Science, Engineering, Mathematics, or a related field
You have built large-scale data products and understand the tradeoffs made when building these features
You have a deep understanding of system design, data structures, and algorithms
Experience with (or a strong interest in) Python or Scala
Experience working with a cluster manager (YARN, Mesos, or Kubernetes)
Experience with distributed computing frameworks such as Spark, Hadoop, or MapReduce
Experience working on a cloud platform such as AWS
Experience with ETL in general
Gold stars:
Experience working with Apache Airflow
Experience working with AWS Glue
Experience in Machine Learning and Information Retrieval