Gobisan

GOBISAN ANANTHANADARAJAN

Data Engineer

Profile


I am a data engineer with 1.5+ years of experience, focusing on developing, operationalizing, monitoring, and optimizing reliable data infrastructure. Strong experience spanning Data Engineering, Data Observability, Distributed Data Processing, Databases, Cloud, E-Commerce (Retail), and Agile Practices.

Languages


  • English
  • Tamil
  • Sinhala

Experience


Engineer - Business Intelligence Systems

Colombo, Western Province, Sri Lanka

  • Collaborating with the Business Intelligence Systems team to build, monitor, and optimize data products.
Skills: Azure Databricks, Azure Data Factory, Azure DevOps, Power BI, Python, MS SQL Server, SQL/T-SQL, PySpark

Engineer - Data Platform

Colombo, Western Province, Sri Lanka May 2023 - Jun 2024 • 1 yr 1 mos

  • Collaborated with the data platform team to consolidate data from various enterprise application sources into a data lake and to support deriving enterprise-wide insights from the consolidated data.
Achievements:
  • Collaborated on the proactive monitoring of extraction pipelines and quality of extracted data.
  • Collaborated on the reduction of ADF extraction pipeline execution time from more than 6 hours to 4 hours.
  • Collaborated on the optimization of a data transformation pipeline which resulted in a significant cost reduction.
  • Developed an ADF extraction pipeline with a chunking mechanism to extract large volumes of data.
  • Developed a Python-based Azure Function to schedule and extract data from a report publisher system.
  • Transformed data using Synapse Analytics and Databricks and created a comprehensive Power BI report to support employee rehire decision-making.
Responsibilities:
  • Actively participate in the stabilization activities of the data platform (post-production monitoring, optimization, cost reduction, data quality).
  • Build CI/CD pipelines using Azure DevOps to deploy Azure Function Apps (ARM, Functions, Health Check Probe) and Azure Databricks notebooks.
  • Use Azure Synapse Analytics and Databricks to perform data transformations and orchestrate the transformation process.
  • Create comprehensive Power BI reports to support data-driven decision-making for various departments within the organization.
Skills: Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure DevOps, Azure Functions, Power BI, Azure Data Lake, Python, MS SQL Server, SQL, PySpark, GitHub, T-SQL, CI/CD

Intern - Data Platform

Colombo, Western Province, Sri Lanka Oct 2022 - May 2023 • 7 mos

Achievements:

  • Replaced an existing ADF pipeline that failed due to a timeout issue.
  • Developed and operationalized 5+ ADF pipelines capable of handling both full and incremental loads, extracting data from relational databases and REST APIs into the data lake.
Responsibilities:
  • Build data ingestion pipelines to ingest data from various enterprise source systems into the data lake.
  • Support post-production monitoring and bug fixing across the data pipelines.
Skills: Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure DevOps, Azure Functions, Power BI, Azure Data Lake, Python, MS SQL Server, SQL, PySpark, GitHub, T-SQL, CI/CD

Technical Skills


Languages

  • Python
  • Java
  • SQL
  • JavaScript
  • C++
  • R

Database Systems

  • MS SQL Server
  • Oracle Database
  • PostgreSQL
  • MySQL

Data Platform Technologies

  • Azure Data Factory
  • Azure Databricks
  • Azure Synapse Analytics
  • Azure Data Lake Storage
  • Azure Functions
  • Apache Spark
  • Power BI
  • SSIS
  • SSAS
  • SSRS

Machine Learning

  • Scikit-learn
  • TensorFlow
  • Keras
  • NumPy
  • Pandas

Web Technologies

  • Flask
  • JSP, Servlets
  • Jersey
  • Ajax
  • jQuery
  • Bootstrap
  • HTML
  • CSS

Cloud

  • Azure
  • AWS

Other

  • Data Modelling
  • Data Observability
  • ETL/ELT Process
  • CI/CD (Azure DevOps)
  • Version Control (Git, GitHub)
  • Web Scraping (Selenium)
  • Microservices Architecture
  • RESTful Web Services
  • MVC Design Pattern
  • Unit and Integration Testing
  • Agile Practices

Projects


Hotel Booking Cancellation Predictor (Data Mining)

Sep 2022 - Oct 2022

Developed a web application to predict hotel booking cancellations using data mining techniques. Several models were trained using different classification algorithms (SVM, Logistic Regression, Decision Tree, Naive Bayes), and the best-performing model was used in the web application. Data preprocessing techniques were applied to provide quality data for model training.

  • Python
  • Scikit-learn
  • Flask
  • Heroku
  • HTML
  • Bootstrap
GitHub Repo

Data Warehousing and BI solution for Bank Transactions

Apr 2022 - May 2022

Developed an ETL pipeline, data warehouse, OLAP Cube, and visualizations (Excel, SSRS) to analyze the historical data of a bank's transactions and improve the transaction process. The data warehouse is based on a snowflake schema, which contains an accumulating fact table, a slowly changing dimension, and other dimensions. 

  • SSIS
  • SSAS
  • SSRS
  • MS SQL Server
  • Power BI
  • MS Excel
GitHub Repo

Customer Support System for PowerGrid (RESTful Web Services)

Apr 2022 - May 2022

This is a RESTful web service for power grid maintenance that monitors users' power consumption, generates monthly bills and automatically sends them to users, and accepts online payments.

  • Java (Jersey)
  • jQuery
  • MySQL
  • Tomcat
  • Maven
  • Postman
GitHub Repo

Apparel Manufacturing Management (Web Application)

Jul 2021 - Oct 2021

This web application offers apparel manufacturers a fully integrated management suite with an analytics dashboard for each management module. It is based on the MVC architecture.

  • Java (Servlets & JSP)
  • HTML
  • Bootstrap
  • JavaScript
  • MySQL
GitHub Repo

Vehicle Rental and Parking Slot Reservation (Mobile Application)

Jul 2021 - Oct 2021

This is an Android application that lets vehicle owners advertise vehicles and parking area owners advertise parking slots. Customers can reserve vehicles and parking slots through the application.

  • Java
  • Firebase
  • Android Studio
GitHub Repo

Education


BSc (Hons) in Information Technology - Specialising in Data Science

Faculty of Computing, Sri Lanka Institute of Information Technology 2020 - 2023

G.C.E A/L Examination 2018 – Physical Science Stream

Hindu College Colombo, Sri Lanka 2005 - 2018

Certificates


DP-203: Microsoft Certified: Azure Data Engineer Associate

Earners of the Azure Data Engineer Associate certification have demonstrated understanding of common data engineering tasks to implement and manage data engineering workloads on Microsoft Azure, using a number of Azure services.

View Credentials

AZ-900: Microsoft Certified: Azure Fundamentals

Earners of the Azure Fundamentals certification have demonstrated foundational level knowledge of cloud services and how those services are provided with Microsoft Azure.

View Credentials