Merge large structured datasets, such as Excel files, locally

D-S007/DataMerger

📊 Merge Multiple Excel Files — Large Dataset Manager

A high-performance, privacy-first web application built to merge, manage, and restructure extremely large Excel datasets directly on your local machine — without relying on cloud tools or exposing sensitive business data.


🚨 Problem Statement

Traditional spreadsheet tools like Microsoft Excel and Google Sheets struggle when handling enterprise-scale datasets. Performance degradation, crashes, and row/column limitations make large-scale data preparation inefficient and risky.

Real scenario that triggered this project:

  • 7 Excel files

  • Each containing ~25,000 rows × 120 columns

  • Required to be merged into one master dataset

  • Needed to:

    • Keep header row from the first file only
    • Remove duplicate headers from the other 6 files
    • Reorder data blocks easily (move up/down)
    • Prepare a clean dataset for downstream analysis

Combined, that is roughly 175,000 rows × 120 columns (about 21 million cells), a volume beyond the practical limits of spreadsheet software, especially when confidentiality restrictions prevent the use of online tools.

💡 Solution: a local, browser-based data merging tool that runs entirely on your own machine: fast, secure, and purpose-built for large structured files.
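
The header rule from the scenario above (keep the first file's header, drop it from every later file) can be sketched as a small pure function. This is a minimal illustration, not the project's actual code: the `Cell` and `Row` types and the `mergeTables` name are hypothetical, and it assumes every file starts with exactly one header row.

```typescript
// A cell value as it might appear in a parsed worksheet row (hypothetical type).
type Cell = string | number | boolean | null;
type Row = Cell[];

/**
 * Merge several tables (arrays of rows) into one master table.
 * Assumption: the first row of every table is its header row.
 */
function mergeTables(tables: Row[][]): Row[] {
  const merged: Row[] = [];
  tables.forEach((rows, fileIndex) => {
    rows.forEach((row, rowIndex) => {
      // Skip the repeated header row in every file after the first.
      if (fileIndex > 0 && rowIndex === 0) return;
      merged.push(row);
    });
  });
  return merged;
}

// Usage with two tiny hypothetical tables.
const fileA: Row[] = [["id", "name"], [1, "Ada"], [2, "Lin"]];
const fileB: Row[] = [["id", "name"], [3, "Sam"]];
const master = mergeTables([fileA, fileB]);
console.log(master.length); // 4: one header row plus three data rows
```

Rows are pushed one at a time rather than with `merged.push(...rows)`, because spreading a very large array into a function call can exceed the JavaScript engine's argument limit. If the `xlsx` package listed under Technologies is used for parsing, row arrays of this shape could come from `XLSX.utils.sheet_to_json(sheet, { header: 1 })`.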


🎯 Primary Use Case

This tool is designed for professionals such as:

  • Business Analysts
  • Data Analysts
  • Operations Teams
  • Finance & Reporting Teams

They frequently need to:

  ✔ Merge multiple structured Excel files
  ✔ Preserve a single standardized header schema
  ✔ Rearrange file order before the final merge
  ✔ Work with large datasets safely offline
  ✔ Avoid uploading confidential company data to third-party services


🔒 Privacy-First Architecture

All file processing happens locally in your browser. No uploads. No servers. No data leaves your machine.

Because nothing is uploaded, the tool can be used where internal data governance and confidentiality policies rule out online services.


🧩 Core Features

  • 📂 Merge multiple Excel files into a single dataset
  • 🏷 Keep header row from the first file only
  • 🧹 Automatically remove redundant headers
  • 🔀 Reorder file blocks before merging (move up/down)
  • ⚡ Optimized for large datasets
  • 💻 Runs entirely in the browser using client-side processing
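
The move up/down reordering listed above amounts to moving one entry by a single position in the list of loaded files. A minimal sketch, assuming the file order is held as a plain array; `moveItem`, `Direction`, and the file names are hypothetical and not taken from the project's source:

```typescript
type Direction = "up" | "down";

/**
 * Return a new array with the item at `index` moved one position up or down.
 * Moves that would fall outside the array return an unchanged copy.
 */
function moveItem<T>(items: readonly T[], index: number, direction: Direction): T[] {
  const target = direction === "up" ? index - 1 : index + 1;
  const result = [...items];
  const inRange = (i: number) => i >= 0 && i < result.length;
  if (!inRange(index) || !inRange(target)) return result;
  // Remove the item from its old position and insert it at the new one.
  result.splice(target, 0, ...result.splice(index, 1));
  return result;
}

// Usage: move the third file one step earlier in the merge order.
const fileOrder = ["jan.xlsx", "feb.xlsx", "mar.xlsx"]; // hypothetical names
console.log(moveItem(fileOrder, 2, "up")); // ["jan.xlsx", "mar.xlsx", "feb.xlsx"]
```

Returning a new array instead of mutating the input fits React state updates, where a changed reference is what triggers a re-render.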

🛠 Technologies

| Layer              | Technology          |
| ------------------ | ------------------- |
| Frontend Framework | React 19.2.3        |
| Language           | TypeScript 5.9.3    |
| Build Tool         | Vite 7.2.4          |
| Styling            | Tailwind CSS 4.1.17 |
| Excel Processing   | XLSX 0.18.5         |

🚀 Getting Started

Prerequisites

  • Node.js 20.19+ or 22.12+ (the minimum required by Vite 7)
  • npm, yarn, or pnpm

Installation

git clone <repository-url>
cd DataMerger
npm install

Development

npm run dev

The app runs at http://localhost:5173 (Vite's default dev server port).


Build for Production

npm run build

Preview Production Build

npm run preview

📁 Project Structure

src/
├── App.tsx           # Main application component
├── main.tsx          # Application entry point
├── index.css         # Global styles
└── utils/
    └── cn.ts         # Utility functions

📈 Why This Tool Matters

This project transforms a manual, crash-prone spreadsheet process into a controlled, scalable data preparation workflow. It bridges the gap between spreadsheet users and full data engineering pipelines — without requiring backend infrastructure.


🧭 Roadmap

Next major enhancements planned:

  • 📄 Support for additional file formats:

    • CSV
    • TSV
    • JSON
    • More structured data types
  • 🧠 Smart column validation & mismatch detection

  • 📊 Basic summary statistics after merge

  • 💾 Export options for multiple formats
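
As an illustration of what the planned column validation might check, the sketch below compares each file's header against the first file's header and reports the differences. It is one possible approach under stated assumptions, not the planned implementation: `findHeaderMismatches` and `HeaderMismatch` are hypothetical names, and headers are compared by trimmed, case-sensitive name, ignoring column order.

```typescript
interface HeaderMismatch {
  fileIndex: number;    // position of the file in the merge order (0-based)
  missing: string[];    // columns in the reference header but not in this file
  unexpected: string[]; // columns in this file but not in the reference header
}

/**
 * Compare every header against the first one (the reference).
 * Assumption: names match after trimming whitespace; order is ignored.
 */
function findHeaderMismatches(headers: string[][]): HeaderMismatch[] {
  const trimAll = (names: string[]) => names.map((name) => name.trim());
  const reference = trimAll(headers[0] ?? []);
  const mismatches: HeaderMismatch[] = [];
  headers.forEach((header, fileIndex) => {
    if (fileIndex === 0) return; // the reference never mismatches itself
    const current = trimAll(header);
    const missing = reference.filter((name) => current.indexOf(name) === -1);
    const unexpected = current.filter((name) => reference.indexOf(name) === -1);
    if (missing.length > 0 || unexpected.length > 0) {
      mismatches.push({ fileIndex, missing, unexpected });
    }
  });
  return mismatches;
}

// Usage with three hypothetical headers.
const mismatchReport = findHeaderMismatches([
  ["id", "name", "region"],
  ["id", "name", "region "], // trailing space only: treated as a match
  ["id", "name", "country"], // "region" is missing, "country" is unexpected
]);
console.log(JSON.stringify(mismatchReport));
```

A report like this could be shown before merging, so that a file with a divergent schema is caught before its rows are appended under the wrong columns.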


👨‍💼 Author Motivation

Built out of a real operational bottleneck faced during large-scale business data consolidation. The goal was simple:

Handle enterprise-sized structured data efficiently, privately, and without spreadsheet limitations.

This tool is the first step toward a broader suite of local-first data preparation utilities.
