hsjoi0214/README.md

PJ · Embedded Software Engineer → Data/AI Engineer

E-mobility software engineer for embedded systems with experience in data analysis and transformation. I am now transitioning into Data Engineering and ML systems roles, driven by strong interest and curiosity.


Professional Background

Experienced Embedded Software Engineer with a strong foundation in automation, data-driven systems, and scalable software architectures.
Currently transitioning into Data Engineering and Applied Machine Learning, leveraging a deep understanding of system design, data flows, and distributed computation.

Technical Alignment with Data Engineering

  • badge
    Engineered complex automation and control systems using PA-Base/Script, an object-oriented scripting environment conceptually similar to Python/C++, which built strong foundations in modular software design, data manipulation, and process automation.

  • badge

    1. Designed and deployed automated data acquisition and transformation pipelines for large-scale battery testing, analogous to modern ETL (Extract, Transform, Load) workflows in data engineering.
    2. Implemented process control flows via DAG-based orchestration (PA-Graph), mirroring dependency management in tools like Apache Airflow.
  • badge

    1. Developed structured and distributed databases for managing cell, pack, and end-of-line test data, conceptually aligned with PostgreSQL, AWS RDS, and DynamoDB architectures.
    2. Implemented cloud-based data synchronization for global test environments, paralleling AWS S3 and Azure Data Lake solutions.
  • badge

    1. Analyzed large-scale battery performance data to detect trends and anomalies using statistical and algorithmic reasoning, laying the groundwork for machine learning workflows.
    2. Built user-facing dashboards (PA-Design) for visualization and reporting, comparable to frameworks like Streamlit or Plotly.
  • badge

    1. Built real-time monitoring solutions for distributed test systems, providing insight into data quality, system health, and performance, conceptually aligned with Prometheus, Grafana, and AWS CloudWatch.
    2. Defined alerting and metric-tracking logic for anomaly detection and proactive maintenance.
  • badge
    Automated deployment and testing pipelines for hardware-software integration, extending continuous integration and delivery (CI/CD) concepts into data and MLOps workflows.

  • badge
    Led global customer training sessions across Europe, the USA, and China, and authored internal documentation and user guides to standardize testing and data workflows.
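
The DAG-based orchestration mentioned above can be illustrated with a minimal Python sketch. It is not PA-Graph or Airflow code: the step names are hypothetical, and the standard-library `graphlib` module stands in for a real scheduler to show how dependencies determine execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical test-bench steps (not taken from PA-Graph).
# Each key maps to the set of steps that must finish before it can run.
PIPELINE = {
    "acquire": set(),
    "validate": {"acquire"},
    "transform": {"validate"},
    "load": {"transform"},
    "report": {"load", "validate"},
}


def execution_order(dag):
    """Return one valid run order in which every step follows its dependencies."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(dag).static_order())


order = execution_order(PIPELINE)
print(order)
```

An orchestrator such as Airflow adds scheduling, retries, and state tracking on top, but the ordering rule is the same: a task runs only after everything upstream of it has completed.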

Broader Experience

  • Developed full-stack applications and data science and ML projects, demonstrating proficiency across both software engineering and data infrastructure layers.
  • Familiar with AWS Cloud, Python, SQL, Databricks, Terraform, Docker, and CI/CD pipelines.

My experience in embedded systems taught me to build reliable, data-centric automation in distributed environments: skills that map directly to modern data engineering and cloud computing.

Email Website Medium


Skills & Transition Path

  • focus
    Transitioning toward production-grade data engineering, data science, and applied ML work.

  • skills
    AWS Cloud Solutions: Glue, Lambda, API Gateway, S3, IaC (Terraform, CloudFormation), Simple Data Lake, CloudWatch, Cost Explorer, RDS, DynamoDB, IAM, VPC Security, Databricks, Jenkins (CI/CD), Airflow (DAGs).

  • technical
    AWS (Cloud): Lambda, S3, API Gateway, RDS, DynamoDB, IAM, Service Catalog, Terraform (IaC), CloudWatch, Cost Explorer, EKS, SQS, Glue, Athena, VPC, and others.

    Programming & Tools: Python, SQL, Unix Shell Scripting, PySpark, ETL.

    DevOps & Automation: CI/CD, Git, Jenkins, Airflow, Terraform (IaC), Kafka (Basic), Containerisation (EKS, Docker).

    Design & Architecture: System Design, Client-Server Architecture, Microservices, Serverless Architecture, Event-Driven Architecture, Data Modeling, Database Design.

    Observability & Monitoring: OpenTelemetry (OTel), Jaeger, Databricks, Prometheus, Grafana, custom DIY Monitoring & Observability Panel.


Featured Projects (learning + build)

Market Data Platform
Cloud-native streaming & batch pipelines for financial market data, data quality gates + real-time & analytical serving.

The Knowledge Drip
AI-driven knowledge delivery platform using hybrid search (BM25 + embeddings) & personalized insights via SMS.
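
A minimal sketch of what hybrid search (BM25 + embeddings) can look like. Everything here is an assumption for illustration: the documents, the two-dimensional "embeddings" (a real system would get vectors from an embedding model), and the choice of reciprocal rank fusion to merge the two rankings. The project itself may combine scores differently.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the tokenized query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def rank(scores):
    """Document indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranking contributes 1 / (k + rank) per document."""
    fused = Counter()
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return [doc_id for doc_id, _ in fused.most_common()]


# Toy corpus and hand-made vectors, purely illustrative.
docs = [s.split() for s in (
    "battery cells degrade with heat",
    "airflow schedules dag tasks",
    "heat pumps move thermal energy",
)]
doc_vecs = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.3]]
query, query_vec = "battery heat".split(), [0.8, 0.2]

lexical = rank(bm25_scores(query, docs))
semantic = rank([cosine(query_vec, v) for v in doc_vecs])
fused = rrf([lexical, semantic])
print(fused)
```

Rank fusion avoids having to put BM25 scores and cosine similarities on a common scale, which is why it is used in this sketch instead of a weighted sum.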

RAGbot
RAG chatbot for Crime and Punishment — information retrieval + LLM via Streamlit.

Housing Price Prediction
Feature-engineered XGBoost pipeline; Streamlit app; Kaggle RMSE 0.12033.
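
For context on the score, here is a minimal sketch of the metric, assuming the project targets Kaggle's House Prices competition, whose leaderboard reports RMSE between the logarithms of predicted and actual sale prices. The README does not name the exact metric, so treat this as an assumption; the function name is hypothetical.

```python
import math


def rmse_log(y_true, y_pred):
    """RMSE between the natural logs of actual and predicted prices."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(math.log(p) - math.log(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))
```

Working in log space penalizes relative error, so a 10% miss on a cheap house counts about the same as a 10% miss on an expensive one.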

Brazil Market Expansion
SQL + Tableau dashboards on an artificial Brazil market dataset; structured insights & schema design.

Eniac Discount Analysis
Discount strategy & product segmentation on €7.8M revenue; seasonal demand & margin impact.

Weather App
Minimalist JS + OpenWeather app with essentials + outfit suggestions.

Movie Night
CLI scraper curating top 50 films of 2023; filters + GCS/Heroku.


Current Work & Learning

  • working

    1. Knowledge app integrating multiple APIs, Supabase (PostgreSQL), a hosting environment, and a recommendation system (repo is private, permission-based access). Done.
    2. Medium article explaining the detailed workflow of the Knowledge app. Done.
    3. Personal blogging website built from scratch — roadmap includes adding a text-to-speech model (private repo).
    4. Medium Article explaining the workings of Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) in depth.
  • learning

    1. Agentic knowledge graph construction
    2. Building AI Agents and Agentic Workflows

Journey & Achievements

Moving closer to downstream data roles through projects, certifications, and writing.


Collaboration & Contact

  • collaborate
    Open to and excited about collaborating on end-to-end data engineering, data science, and applied ML projects, anything from small builds to production-grade pipelines.

  • askme
    From embedded systems to end-to-end data workflows: engineering pipelines, applied ML, RAG and deep learning — deployed with DataOps/DevOps practices (CI/CD, IaC, automation, monitoring, Docker/Kubernetes).


Pinned

  1. fact_scrap (Public)

    Read about this app at: https://medium.com/@prakash1402/the-knowledge-drip-b832e40206c0

    Python

  2. market-data-platform (Public)

    Cloud-native data engineering project that builds streaming and batch pipelines for financial market data, enforcing data quality gates and serving clean analytics to both real-time applications an…

    Python

  3. RAGbot (Public)

    Retrieval-Augmented Generation (RAG) chatbot for exploring Crime and Punishment — combines document retrieval with LLMs to deliver context-aware, literary insights via a Streamlit app.

    Python

  4. housing-price-prediction (Public)

    End-to-end Kaggle house price predictor with domain-driven feature engineering, 2-stage feature filtering, and XGBoost — deployed as an interactive Streamlit app.

    Python

  5. brazil-market-expansion (Public)

    Data storytelling project analyzing an artificial Brazil market dataset — SQL + Tableau dashboards with structured insights and schema design.

  6. eniac-discount-analysis (Public)

    Data analysis project exploring discount strategies and product segmentation on a €7.8M revenue dataset, uncovering insights on seasonal demand and margin impact.

    Python