
ML/Data Analytics Engineer

straddle

Software Engineering, Data Science
Denver, CO, USA
Posted on Jun 3, 2025

About the Role

We are seeking an ML/Data Analytics Engineer to join our engineering team and take ownership of the data pipelines and machine learning infrastructure that support our fintech platform. In this role, you will be the crucial link between raw data and actionable insights, ensuring that data flows smoothly from our products into analytics dashboards and fraud detection models. You’ll work on building systems that handle everything from aggregating transaction data and customer information to deploying machine learning models that evaluate risk in real time. If you enjoy writing production-quality code as much as wrangling datasets and tuning models, this hybrid role at the intersection of software engineering and data science will be a great fit.

On any given day, you might be writing Python ETL jobs to extract and transform new data sources (for example, pulling in bank transaction logs or user activity events) and orchestrating those jobs with a tool like Apache Airflow or a managed cloud pipeline service. You’ll collaborate with the Data Science Lead to take prototypes of fraud detection or identity scoring models and implement the robust, scalable systems needed to run those models in production (such as setting up an API endpoint or microservice for real-time scoring). You will also create analytical queries and dashboards to help the team monitor key metrics such as payment success rates, model performance, and user growth trends. This role involves a mix of backend engineering, DevOps for data (managing databases and cloud services), and applied ML engineering.
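
To give a concrete flavor of that day-to-day work, here is a minimal, hypothetical sketch of a daily ETL job written with Apache Airflow's TaskFlow API. The data source, field names, and warehouse loader are illustrative placeholders, not a description of our actual pipeline.

    # Hypothetical daily ETL job: extract transaction logs, validate them,
    # and load them into a warehouse table. Illustrative only.
    from datetime import datetime
    from airflow.decorators import dag, task

    @dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
    def transaction_etl():
        @task
        def extract() -> list[dict]:
            # e.g., pull the previous day's bank transaction logs from an upstream API
            return [{"txn_id": "t_1", "amount_cents": 1250, "status": "settled"}]

        @task
        def transform(rows: list[dict]) -> list[dict]:
            # normalize records and drop anything that fails basic validation
            return [r for r in rows if r["amount_cents"] > 0]

        @task
        def load(rows: list[dict]) -> None:
            # write to the warehouse (Snowflake, BigQuery, Redshift, ...) -- stubbed out here
            print(f"loading {len(rows)} validated rows")

        load(transform(extract()))

    transaction_etl()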

Because we are a small, agile team, the ML/Data Analytics Engineer will have broad responsibilities and plenty of autonomy. You’ll be expected to uphold strong coding standards and deliver reliable systems even as requirements change rapidly. Your contributions will directly influence our ability to make data-driven decisions and deliver intelligent features to customers. This is a full-time position based in Denver, CO with flexibility for remote work. We offer a competitive base salary and equity package. For an engineer who loves data and wants to build something impactful from the ground up, this role provides the opportunity to shape the data foundation of a promising fintech startup.

Key Responsibilities

  • Data Pipeline Development – Design, build, and maintain robust data pipelines that collect and process data from various parts of our system (e.g., user onboarding data, transaction records, external banking data via APIs). Ensure data is extracted, transformed, and loaded into appropriate storage (databases, data lakes/warehouses) in a reliable, repeatable way.

  • Machine Learning Engineering – Collaborate with data scientists to productionize machine learning models. This includes rewriting or optimizing model code for efficiency, setting up REST/GraphQL endpoints or batch processes to serve model predictions (such as fraud risk scores) to the application, and integrating these into the transaction workflow (a minimal serving sketch follows this list).

  • Real-Time Processing – Implement real-time or near-real-time data processing where required. For example, set up message queues or streaming systems to handle events like incoming payments or login attempts, feeding them into fraud detection algorithms with low latency.

  • Data Analytics & Reporting – Work on the analytics side by writing complex SQL queries or using BI tools to enable reporting on key business and product metrics. Develop internal dashboards to surface insights (e.g., daily active users, number of payments processed, fraud alerts triggered) for team members and leadership.

  • Data Infrastructure Management – Oversee our databases and data warehouse solutions. Tune database performance, manage schema migrations for new data needs, and ensure secure and compliant handling of sensitive information (encryption, access controls, data retention policies).

  • Collaboration & Support – Partner with the Data Science Lead and Risk team to understand data requirements and ensure the pipeline meets their needs (e.g., delivering labeled datasets for model training or features for analytics). Work with software engineers to instrument the application code to emit important events and logs for analysis. Assist Customer Success or Product teams by pulling data when ad-hoc analysis is needed.

  • Quality and Monitoring – Implement monitoring for data pipeline jobs and ML services to quickly detect failures or anomalies. Set up alerting for data quality issues (like missing data or pipeline delays) and work to make the system self-healing where possible. Write unit and integration tests for your pipelines and model serving code to maintain a high reliability bar.

  • Continuous Improvement – Keep up with the latest tools and best practices in data engineering and MLOps. Evaluate and introduce new technologies (like analytics platforms, feature stores, or ML workflow tools) that could enhance our capabilities. Continuously refactor and improve existing data systems for better performance and maintainability.
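
As a taste of the serving work referenced above, the sketch below shows one plausible shape for a real-time scoring endpoint: a small FastAPI service that loads a pre-trained model and returns a fraud risk score per transaction. The model path, feature set, and flag threshold are assumptions for illustration, not our production design.

    # Hypothetical real-time fraud scoring service (FastAPI + scikit-learn model).
    import joblib
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    model = joblib.load("fraud_model.joblib")  # placeholder path to a serialized model

    class Transaction(BaseModel):
        amount_cents: int
        account_age_days: int
        failed_logins_24h: int

    @app.post("/score")
    def score(txn: Transaction) -> dict:
        # assemble the feature vector in the order the model was trained on
        features = [[txn.amount_cents, txn.account_age_days, txn.failed_logins_24h]]
        risk = float(model.predict_proba(features)[0][1])  # probability of the "fraud" class
        return {"risk_score": risk, "flagged": risk > 0.8}

In practice a service like this would sit inside the payment workflow and be covered by the monitoring, alerting, and testing described in the responsibilities above.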

Required Qualifications

  • Data Engineering Experience – 3+ years of experience as a data engineer, machine learning engineer, or similar role. Strong knowledge of building data pipelines and working with ETL processes in a production environment.

  • Programming Skills – Proficiency in Python (for data scripts and ML integration) and SQL (for querying and managing data). Familiarity with at least one statically-typed language (Java, Scala, Go, etc.) is a bonus. Writing clean, maintainable code is a must.

  • Database and Storage – Experience with relational databases (e.g., PostgreSQL, MySQL) and writing complex SQL queries. Familiarity with data warehousing concepts and tools (such as Snowflake, BigQuery, or Redshift); exposure to NoSQL data stores for unstructured data is also helpful.

  • Machine Learning Fundamentals – Solid understanding of how common machine learning models function and are deployed. You don’t need to be a model researcher, but you should be comfortable taking a trained model and handling tasks like serialization, versioning, and setting up an API to serve predictions (see the sketch after this list). Experience with ML libraries or frameworks (scikit-learn, TensorFlow, etc.) is important.

  • Cloud & DevOps – Hands-on experience with cloud services (AWS, GCP, or Azure) especially those related to data processing (e.g., AWS Lambda, Kinesis, S3, Google Cloud Functions, Pub/Sub, BigQuery, etc.). Knowledge of containerization (Docker) and CI/CD pipelines to deploy data/ML services.

  • Data Visualization/Analysis – Ability to use or learn tools for creating dashboards or reports (such as Tableau, Looker, or Python visualization libraries) to help non-technical team members understand the data. Strong analytical thinking to interpret data trends.

  • Problem-Solving – Excellent debugging and problem-solving abilities, especially when dealing with messy data or system issues. Able to trace problems across complex systems (from data ingestion to processing to output).

  • Attention to Detail – Diligence in verifying data accuracy and consistency. Understanding of the importance of clean data for analytics and model performance.

  • Team Collaboration – Good communication skills to work with a cross-functional team. Capable of translating technical information for less-technical stakeholders. Comfortable working in an agile, iterative development process.
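
For the model-handling tasks mentioned in the Machine Learning Fundamentals bullet, the kind of serialization and versioning we have in mind looks roughly like the sketch below; the file layout, toy training step, and metadata fields are illustrative assumptions rather than an actual workflow.

    # Illustrative only: persist a trained model alongside simple version metadata.
    import json
    from datetime import datetime, timezone

    import joblib
    from sklearn.linear_model import LogisticRegression

    # toy training step standing in for a real fraud model
    model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])

    version = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    joblib.dump(model, f"fraud_model_{version}.joblib")
    with open(f"fraud_model_{version}.json", "w") as f:
        json.dump({"version": version, "library": "scikit-learn",
                   "features": ["amount_cents"]}, f)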

Preferred Qualifications

  • Fintech/Payments Exposure – Experience with financial datasets or payment processing systems. Understanding of transactions, ledgers, or fraud signals in banking data. This domain knowledge can help in building relevant features and interpreting data correctly.

  • Stream Processing – Familiarity with streaming frameworks or messaging systems (Apache Kafka, AWS Kinesis, RabbitMQ, etc.) and experience building stream processing jobs (using Spark Streaming, Flink, etc.) for real-time analytics.

  • MLOps & Automation – Experience with machine learning operations: model deployment workflows, automation of retraining, continuous integration for ML (CI/CD pipelines that handle data and model updates). Familiarity with tools like MLflow, Kubeflow, or Vertex AI for model management.

  • Big Data Tools – Knowledge of big data processing frameworks like Apache Spark or Hadoop in case we need to scale to large datasets. Experience optimizing large-scale data jobs for performance/cost efficiency.

  • Security & Compliance – Understanding of data security best practices, especially concerning sensitive personal and financial data. Experience implementing data compliance measures (GDPR, SOC 2, etc.) or working with encrypted data.

  • Academic Background – Bachelor’s or Master’s in Computer Science, Data Engineering, or related field. Emphasis or coursework in databases or machine learning engineering is a plus.

  • Adaptability – Demonstrated ability to pick up new technologies quickly. Since our stack may evolve, showing a history of learning new tools (like transitioning from one database or pipeline technology to another) is valuable.