5+ years transforming data chaos into actionable insights with Python, PySpark, Azure, and modern data stacks
I'm a data engineer based in Boston with a passion for building production-grade ETL systems that process millions of records efficiently. My background in Cloud Engineering gives me a strong foundation for building scalable, reliable data infrastructure.
I'm currently pursuing Data Engineer positions in the Boston area while building my professional brand through technical projects and LinkedIn content. I specialize in designing and implementing dimensional data models, optimizing query performance, and mentoring teams on modern data practices.
When I'm not engineering data pipelines, you'll find me training calisthenics at Franklin Park, exploring Afrobeats music, or planning my next travel adventure.
Expert in Python, SQL, PySpark, Airflow, and modern cloud platforms
Specializing in medallion architecture and dimensional modeling
Proficient with Azure services, containerization, and CI/CD pipelines
Building enterprise-grade systems with logging, monitoring, and error handling
Fully containerized streaming pipeline that provides near-real-time visibility into stock price movements. Streams data from the Alpha Vantage API through Kafka and Spark into PostgreSQL every 5 minutes. A live Grafana dashboard visualizes closing prices, price spreads, and ingestion metrics, and the entire 9-service stack launches with a single docker compose command.
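A minimal sketch of the Kafka-to-PostgreSQL leg of this pipeline, assuming a hypothetical stock_quotes topic, illustrative column names, and in-network service hostnames; the real stack also configures checkpointing and the Kafka/JDBC connector packages:

```python
# Sketch: consume stock quotes from Kafka with Spark Structured Streaming
# and append each micro-batch to PostgreSQL. Topic, schema, and hostnames
# are illustrative assumptions, not the production values.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("stock-stream").getOrCreate()

quote_schema = (
    StructType()
    .add("symbol", StringType())
    .add("close", DoubleType())
    .add("ts", TimestampType())
)

quotes = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")
    .option("subscribe", "stock_quotes")
    .load()
    .select(F.from_json(F.col("value").cast("string"), quote_schema).alias("q"))
    .select("q.*")
)

def write_batch(df, batch_id):
    # Append the micro-batch to PostgreSQL over JDBC.
    (df.write.format("jdbc")
        .option("url", "jdbc:postgresql://postgres:5432/market")
        .option("dbtable", "quotes")
        .option("user", "etl")
        .option("password", "etl")
        .mode("append")
        .save())

quotes.writeStream.foreachBatch(write_batch).start().awaitTermination()
```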
Production-ready ETL pipeline transforming severely corrupted sales data into an analytics-ready format. Processed 37,432 records, reconstructed 1,385 missing SKUs with a 100% recovery rate, and implemented a star schema with 11 automated quality checks.
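Two of the automated quality checks, sketched with pandas and hypothetical column names; in the pipeline, 11 such checks gate the load into the star schema:

```python
# Sketch of two data quality gates run before loading the star schema.
# Column names ("sku", "product_key") are illustrative assumptions.
import pandas as pd

def check_sku_completeness(facts: pd.DataFrame) -> dict:
    """Fail the load if any row still lacks a SKU after reconstruction."""
    missing = int(facts["sku"].isna().sum())
    return {"check": "sku_completeness", "failures": missing, "passed": missing == 0}

def check_referential_integrity(facts: pd.DataFrame, dim_product: pd.DataFrame) -> dict:
    """Every product key in the fact table must exist in the product dimension."""
    orphans = int((~facts["product_key"].isin(dim_product["product_key"])).sum())
    return {"check": "fk_product", "failures": orphans, "passed": orphans == 0}
```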
Integrated pipeline combining Salesforce, Stripe, and Google Sheets data. Built automated workflows for data reconciliation and reporting, with error handling and retry logic for production reliability.
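The retry logic follows a standard exponential-backoff pattern; this is a hedged sketch, with fetch_invoices standing in as a hypothetical placeholder for the real Stripe and Salesforce calls:

```python
# Sketch: retry a flaky API call with exponential backoff before failing.
import functools
import logging
import time

def with_retries(max_attempts: int = 3, base_delay: float = 2.0):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the error to the caller
                    delay = base_delay * 2 ** (attempt - 1)
                    logging.warning("attempt %d failed; retrying in %.1fs", attempt, delay)
                    time.sleep(delay)
        return wrapper
    return decorator

@with_retries(max_attempts=4)
def fetch_invoices(since):
    ...  # hypothetical placeholder for the real API call
```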
ETL system integrating NYC Citi Bike trip data with weather patterns. Built dimensional models for analysis and scheduled daily orchestration in Apache Airflow with automated data quality checks.
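A minimal sketch of the daily DAG, assuming Airflow 2.4+ and hypothetical task callables; the production version adds alerting and backfill handling:

```python
# Sketch: daily Airflow DAG fanning in trip and weather extracts, then
# building dimensions and running quality checks. Task bodies are stubs.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_trips(**_): ...      # stub: pull Citi Bike trip data
def extract_weather(**_): ...    # stub: pull weather observations
def build_dimensions(**_): ...   # stub: load dimensional models
def run_quality_checks(**_): ... # stub: validate the day's load

with DAG(
    dag_id="citibike_weather_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    trips = PythonOperator(task_id="extract_trips", python_callable=extract_trips)
    weather = PythonOperator(task_id="extract_weather", python_callable=extract_weather)
    dims = PythonOperator(task_id="build_dimensions", python_callable=build_dimensions)
    checks = PythonOperator(task_id="quality_checks", python_callable=run_quality_checks)

    [trips, weather] >> dims >> checks
```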
Web scraping solution extracting company data from the UK government's business registry. Built robust parsing logic with error recovery and automated daily data collection with intelligent caching.
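A sketch of the fetch layer, assuming requests plus a simple on-disk cache keyed by URL hash; the registry endpoint and parsing logic are omitted:

```python
# Sketch: fetch a page once, serving repeat requests from disk.
# The cache directory and keying scheme are illustrative assumptions.
import hashlib
import pathlib

import requests

CACHE_DIR = pathlib.Path(".cache")
CACHE_DIR.mkdir(exist_ok=True)

def fetch(url: str, timeout: int = 30) -> str:
    """Return cached HTML when available; otherwise fetch, cache, and return it."""
    key = hashlib.sha256(url.encode()).hexdigest()
    path = CACHE_DIR / f"{key}.html"
    if path.exists():
        return path.read_text()
    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()  # error recovery: let the caller retry on HTTP failures
    path.write_text(resp.text)
    return resp.text
```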
Enterprise database system deployed on Google Cloud Platform. Designed a 7-table normalized schema (3NF), populated it with 21,000+ rows of synthetic data using Python and Faker, and wrote analytics queries answering key business questions.
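A hedged sketch of the synthetic-data step for one hypothetical table; the actual project generates 21,000+ rows across all 7 tables:

```python
# Sketch: generate synthetic customer rows with Faker for bulk insert.
# The customers table and its columns are illustrative assumptions.
from faker import Faker

fake = Faker()

def make_customers(n: int) -> list[tuple]:
    """Build n synthetic customer rows: (id, name, email, city, signup_date)."""
    return [
        (i, fake.name(), fake.email(), fake.city(), fake.date_this_decade())
        for i in range(1, n + 1)
    ]

rows = make_customers(3000)
# e.g. cursor.executemany("INSERT INTO customers VALUES (%s, %s, %s, %s, %s)", rows)
```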
I'm actively seeking Data Engineer positions in the Boston area. Whether you have a question or just want to say hi, feel free to reach out!