
Rafael J.
Data Engineer

Airflow
Kubernetes
Spark
GitHub
Python
PostgreSQL
Amazon AWS
Google Cloud
Docker Cloud
Bio

Adept computer science student with a strong enthusiasm for software engineering, big data, cloud computing, DevOps, DataOps, and machine learning.

  • Mid-Level Data Engineer
    8/1/2022 - Present

    Participated in the construction of a comprehensive data platform designed to facilitate data-driven decision-making. Conducted in-depth studies to determine the minimum permissions required for developers and tools to operate effectively on Google Cloud Platform (GCP). Implemented Extract, Load, Transform (ELT) processes, along with cataloging, quality testing, and governance procedures for Open Finance API data. Enhanced the development workflow through Continuous Integration and Continuous Delivery (CI/CD), including code reviews, unit tests, linting, building Docker images, and deploying applications on Kubernetes (GKE). Provided support for users of the data platform by assisting with local setup, creating new data pipelines, and answering technical questions related to Apache Airflow, dbt (data build tool), Git, GCP, and DevSecOps, while incorporating user feedback to improve development ergonomics. Collaborated frequently with Site Reliability Engineering (SRE) and Security teams to support platform development.

  • Data Engineering Intern
    2/1/2021 - 12/1/2021

    Developed proficiency in the creation, debugging, and refactoring of ETL flows involving open business data acquired via API, web scraping, and FTP. Automated processes through continuous integration, infrastructure as code, and scripts. Orchestrated ETL processes using Apache Airflow and implemented ETL jobs utilizing Python, PostgreSQL, Logstash, and Elasticsearch. Created thorough documentation of business rules and ETL processes using Markdown. Gained experience with multi-cloud environments including GCP (GKE, GCE, Cloud SQL, GCS, VPC, IAM) and AWS (EC2, ECS, S3, VPC). Conducted a proof of concept with Spark on GCP Dataproc for data transformation, and automated development pipelines using GitLab CI/CD. Provisioned cloud infrastructure using Terraform, Helm, Kubernetes, and Docker while monitoring and optimizing costs in GCP. Employed DevSecOps practices with Kubernetes Secrets, GCP Secret Manager, GCP VPC Firewall rules, and GitLab CI/CD Variables.

  • Python Development Intern
    6/1/2019 - 2/1/2021

    Developed proficiency in Python programming, specifically for web scraping tender data from various public bodies. Enhanced automation skills through the effective use of Jenkins, ensuring efficient task automation. Demonstrated strong abilities in database querying with PostgreSQL. Leveraged Azure DevOps for streamlined project management and continuous integration/continuous delivery (CI/CD) processes.

  • Data Science Intern
    9/1/2018 - 6/1/2019

    Developed expertise in Python programming with a focus on web scraping real estate advertisement data using Scrapy. Acquired advanced skills in data mining and visualization through the use of Pandas, Seaborn, and Folium. Engineered a real estate pricing model leveraging NumPy and scikit-learn. Designed and implemented microservices for web scraping and pricing, utilizing AWS technologies such as API Gateway, Lambda, Fargate, and S3. Provisioned cloud infrastructure with AWS CloudFormation and Serverless API. Enhanced database querying capabilities with extensive use of MySQL.

  • Computer Science at Federal University of Rio Grande do Sul
    2016 - 2023

Rafael is available for hire

All Howdy candidates are vetted for skills and English proficiency.