Projects

A collection of my data engineering, analytics, and software engineering projects, from ETL pipelines to machine learning applications.

Services

Data Analytics

ETL pipeline development, data visualization with Power BI and Tableau, SQL optimization, and business intelligence reporting.

Python, SQL, Power BI, Tableau, AWS, Snowflake, dbt

Backend Development

Designing and implementing RESTful APIs, database architecture, and server-side applications.

Python, FastAPI, PostgreSQL, MySQL, MongoDB, AWS

Featured Projects

Automated Invoice Processing Pipeline

  • Designed an event-driven architecture that eliminates manual data entry by automatically triggering extraction pipelines upon file upload, reducing processing time by 95%.
  • Migrated real-time invoice processing to a scheduled Airflow batch architecture, optimizing compute resource usage by processing uploads in hourly micro-batches.
  • Implemented unstructured-to-structured data transformation using AI-based OCR to parse PDF invoices and normalize them into a relational PostgreSQL schema.
Python, PostgreSQL, Streamlit, AWS Lambda, AWS S3, AWS RDS, AWS Textract
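
A minimal sketch of the upload trigger described above, not the project's actual code. It assumes a Lambda function subscribed to S3 event notifications and only shows how the uploaded PDFs could be picked out of the event; `uploaded_objects` and `handler` are hypothetical names, and the OCR and database steps are left out.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for every object-created record in an S3 event."""
    pairs = []
    for record in event.get("Records", []):
        # Only react to uploads, not deletions or other notification types.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    """Hypothetical Lambda entry point: select the PDF invoices to extract."""
    pdfs = [(b, k) for b, k in uploaded_objects(event) if k.lower().endswith(".pdf")]
    # A real pipeline would pass each (bucket, key) to the OCR step here.
    return {"queued": pdfs}
```

Decoding the key before use matters because a file named `acme invoice.pdf` arrives in the notification as `acme+invoice.pdf`.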

Tech Job Market Trends Dashboard

Jan 2024 - Mar 2024

Scraped job postings from career sites using BeautifulSoup to track company hiring patterns. Standardized company names with Pandas and analyzed month-over-month position changes with NumPy. Visualized hiring trends using Matplotlib.

Python, Pandas, BeautifulSoup, Matplotlib, NumPy
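
A rough sketch of the name standardization and month-over-month step, written in plain Python rather than the Pandas and NumPy code the project used. The suffix list and the function names `standardize` and `month_over_month` are assumptions for illustration; the project's real cleaning rules are not shown here.

```python
import re
from collections import Counter

# Hypothetical list of legal suffixes to strip from company names.
_SUFFIXES = re.compile(r"\b(inc|llc|ltd|corp|corporation|co)\b\.?", re.IGNORECASE)


def standardize(name: str) -> str:
    """Collapse spelling variants such as 'ACME, Inc.' and 'acme corp' to 'Acme'."""
    cleaned = _SUFFIXES.sub("", name)
    cleaned = re.sub(r"[^\w\s]", " ", cleaned)  # drop punctuation
    return " ".join(cleaned.split()).title()


def month_over_month(postings):
    """postings: iterable of (company, 'YYYY-MM') pairs, one per job posting.

    Returns {company: {month: change in posting count vs. the previous
    observed month}}. Months with no postings are not filled in.
    """
    counts = Counter((standardize(company), month) for company, month in postings)
    by_company = {}
    for (company, month), n in counts.items():
        by_company.setdefault(company, {})[month] = n
    changes = {}
    for company, monthly in by_company.items():
        months = sorted(monthly)  # 'YYYY-MM' strings sort chronologically
        changes[company] = {
            cur: monthly[cur] - monthly[prev] for prev, cur in zip(months, months[1:])
        }
    return changes
```

Standardizing before counting is the important part: without it, the same employer would be split across several spellings and the monthly changes would be understated.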

DC Bikeshare Demand & Peak Usage Analysis

Aug 2024 - Oct 2024

Processed 2+ million Bikeshare trips using Pandas to identify usage patterns across DC metro stations. Calculated statistical correlations between weather and ridership. Generated Seaborn heatmaps and Plotly interactive visualizations.

Python, Pandas, Seaborn, Plotly, Looker
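
For reference, a small sketch of the kind of correlation involved, assuming it was a Pearson coefficient between a daily weather measure and daily trip counts (the write-up does not say which statistic was used). It is plain Python for clarity; `pearson` is an illustrative helper, not code from the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of co-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Called as `pearson(daily_temperature, daily_trip_count)`, it returns a value between -1 and 1; with Pandas the same figure comes from `Series.corr`.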

Open Source Contribution - Crawl4AI

Mar 2025 - May 2025

Built async scraping pipelines with Crawl4AI that handle 500+ pages and implemented session management. Enhanced FastAPI endpoints with JWT authentication, reducing unauthorized access attempts by 95%.

Python, Crawl4AI, FastAPI, JWT, AsyncIO
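
A sketch of the bounded-concurrency pattern that an async pipeline over hundreds of pages typically relies on. It uses only `asyncio`; the `fetch` argument is a placeholder for whatever coroutine does the actual crawling, and no Crawl4AI calls are shown. `crawl_all` and `max_concurrency` are names chosen for this example.

```python
import asyncio


async def crawl_all(urls, fetch, max_concurrency=10):
    """Run fetch(url) for every URL with at most max_concurrency in flight.

    Returns {url: result}; a failed fetch stores the exception instead of
    aborting the whole batch.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:  # wait here once the limit is reached
            try:
                return url, await fetch(url)
            except Exception as exc:
                return url, exc

    pairs = await asyncio.gather(*(bounded(url) for url in urls))
    return dict(pairs)
```

Usage would look like `results = asyncio.run(crawl_all(urls, fetch_page))`, where `fetch_page` is a hypothetical coroutine. Capping concurrency keeps the crawler from overwhelming the target site, and catching errors per URL means one bad page does not lose the rest of the batch.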
View all repositories on GitHub