Building a Parquet File Reader with Python and Tkinter

Introduction

Parquet is a popular columnar storage file format widely used in big data and analytics. However, reading and analyzing Parquet files on a desktop often requires programming knowledge.

In this blog, we’ll walk through building a Parquet File Reader Windows desktop application using Python, Tkinter, and Pandas. The application provides a user-friendly interface to:
  • Open and view Parquet files
  • Search and filter data
  • Sort columns
  • Export data to CSV, Excel, and JSON
  • Switch between light/dark themes
  • Track usage statistics

1. Setting Up the Project

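To begin, install the required dependencies. Tkinter is included with the standard Python installer for Windows, so it needs no separate install:

pip install pandas pyarrow openpyxl
  • pandas: Loads Parquet files into DataFrames and writes the exports
  • pyarrow: The Parquet engine that pandas.read_parquet relies on
  • openpyxl: Needed by pandas for Excel (.xlsx) export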

Next, let’s structure our project:

ParquetReader/
│── assets/                # Icons and UI assets
│── app.py                 # Main application
│── components.py          # UI elements
│── ui_helper.py           # Data handling functions
│── utils.py               # Export utilities
│── usage_stats.py         # Collects usage data
│── requirements.txt       # Dependencies

2. Implementing the Main Application (app.py)

The app.py file creates the Tkinter window and wires together the functions defined in the helper modules.

Key Features in app.py

  • Creates the main window
  • Initializes UI components
  • Adds a menu bar
  • Starts a background thread to track app usage

import tkinter as tk
from tkinter import ttk
import threading
from components import (
    create_header, create_search_frame, create_treeview, 
    create_button_frame, create_metadata_label, create_menu_bar
)
from ui_helper import open_parquet_file, toggle_theme
from usage_stats import collect_usage_data  # Track usage stats

def run_usage_tracker():
    """Run usage tracking in a separate thread."""
    usage_thread = threading.Thread(target=collect_usage_data, daemon=True)
    usage_thread.start()

def main():
    """Initialize the application."""
    root = tk.Tk()
    root.title("Parquet File Reader")
    root.geometry("900x600")
    root.configure(bg="#f0f0f0")

    # Initialize UI components
    metadata_label = create_metadata_label(root)
    tree, v_scrollbar, h_scrollbar = create_treeview(root)
    search_frame, search_entry, search_column_combobox = create_search_frame(root, tree, v_scrollbar, h_scrollbar)
    button_frame = create_button_frame(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label)

    # Create menu bar
    create_menu_bar(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label)

    # Start usage tracking
    run_usage_tracker()

    root.mainloop()

if __name__ == "__main__":
    main()
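
The usage_stats.py module is imported above but never shown. As a minimal sketch of what collect_usage_data could look like, the version below counts launches in a local JSON file. The file location and the fields it records are assumptions made for illustration, not the original implementation.

```python
import json
import time
from pathlib import Path

# Hypothetical location: the article does not say where usage data is stored.
DEFAULT_STATS_PATH = Path.home() / ".parquet_reader_usage.json"


def collect_usage_data(stats_path=DEFAULT_STATS_PATH):
    """Record one application launch in a local JSON file and return the stats."""
    stats_path = Path(stats_path)
    try:
        stats = json.loads(stats_path.read_text(encoding="utf-8"))
        if not isinstance(stats, dict):
            raise ValueError("unexpected stats format")
    except (OSError, ValueError):
        # First run, or a missing/corrupt file: start from a clean record.
        stats = {"launches": 0, "last_launch": None}

    stats["launches"] = stats.get("launches", 0) + 1
    stats["last_launch"] = time.strftime("%Y-%m-%dT%H:%M:%S")
    stats_path.write_text(json.dumps(stats, indent=2), encoding="utf-8")
    return stats
```

Because the function takes no required arguments, it can be passed directly as the thread target in run_usage_tracker. It only touches the file system, never a Tkinter widget, so it is safe to run off the main thread.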

3. Creating the UI Components (components.py)

The UI consists of:
  • A menu bar (File Open, Export, Exit)
  • A search bar
  • A treeview (table) for data
  • Action buttons

import tkinter as tk
from tkinter import ttk
from utils import export_to_csv, export_to_excel, export_to_json
from ui_helper import open_parquet_file, search_data

def create_menu_bar(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label):
    """Create the menu bar."""
    menu_bar = tk.Menu(root)
    root.config(menu=menu_bar)

    file_menu = tk.Menu(menu_bar, tearoff=0)
    file_menu.add_command(label="Open Parquet", command=lambda: open_parquet_file(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label))
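    # The export helpers in utils.py take the loaded DataFrame as an argument.
    # Assumption: the loader keeps it on the tree as tree.df (None until a file is opened).
    file_menu.add_command(label="Export to CSV", command=lambda: export_to_csv(getattr(tree, "df", None)))
    file_menu.add_command(label="Export to Excel", command=lambda: export_to_excel(getattr(tree, "df", None)))
    file_menu.add_command(label="Export to JSON", command=lambda: export_to_json(getattr(tree, "df", None)))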
    file_menu.add_separator()
    file_menu.add_command(label="Exit", command=root.quit)
    
    menu_bar.add_cascade(label="File", menu=file_menu)
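
Column sorting is on the feature list, but no sorting code appears in the listings. One common approach with a ttk.Treeview is to reorder the rows when a heading is clicked. The sketch below assumes that approach; the names sort_treeview_column and _sort_key are hypothetical and not taken from the original project.

```python
import math


def _sort_key(value):
    """Order numbers numerically and everything else as case-insensitive text."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return (1, 0.0, str(value).lower())
    if math.isnan(number):
        # NaN does not compare consistently, so treat it as text.
        return (1, 0.0, str(value).lower())
    return (0, number, "")


def sort_treeview_column(tree, column, reverse=False):
    """Reorder the rows of a ttk.Treeview by the values in one column."""
    rows = [(tree.set(item, column), item) for item in tree.get_children("")]
    rows.sort(key=lambda pair: _sort_key(pair[0]), reverse=reverse)
    for index, (_, item) in enumerate(rows):
        tree.move(item, "", index)
    # Rebind the heading so the next click sorts in the opposite direction.
    tree.heading(
        column,
        command=lambda: sort_treeview_column(tree, column, not reverse),
    )
```

To hook it up, the heading can be created with a command, for example tree.heading(col, text=col, command=lambda c=col: sort_treeview_column(tree, c)). The c=col default matters: without it every heading would sort by the last column in the loop.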

4. Handling Data (ui_helper.py)

This file contains functions for loading, searching, and displaying Parquet data.

import pandas as pd
from tkinter import messagebox, filedialog

def open_parquet_file(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label):
    """Open and display a Parquet file."""
    file_path = filedialog.askopenfilename(filetypes=[("Parquet files", "*.parquet")])
    if file_path:
        try:
            df = pd.read_parquet(file_path)
            tree.df = df  # Keep a reference so other features (e.g. export) can reach the data
            display_data(df, tree, v_scrollbar, h_scrollbar)
            search_column_combobox["values"] = list(df.columns)
            # Assumes metadata_label is a Label widget
            metadata_label.config(text=f"{file_path}: {len(df)} rows, {len(df.columns)} columns")
        except Exception as e:
            messagebox.showerror("Error", f"Failed to read file: {e}")

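def display_data(df, tree, v_scrollbar, h_scrollbar):
    """Show DataFrame contents in the treeview."""
    # Clear rows left over from a previously opened file
    tree.delete(*tree.get_children())
    tree["columns"] = list(df.columns)
    tree["show"] = "headings"
    for col in df.columns:
        tree.heading(col, text=col)
    # itertuples is considerably faster than iterrows for row-by-row access
    for row in df.itertuples(index=False, name=None):
        tree.insert("", "end", values=row)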
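
The search_data function imported in components.py is not shown in the article. A minimal sketch of the filtering step it would need is below, assuming a case-insensitive substring match on the column picked in the combobox. The name filter_rows is hypothetical.

```python
import pandas as pd


def filter_rows(df, column, query):
    """Return the rows whose `column` contains `query`, ignoring case."""
    if not query:
        # An empty search box shows everything.
        return df
    # Cast to str so numeric and date columns can be searched as text;
    # regex=False treats the query literally instead of as a pattern.
    mask = df[column].astype(str).str.contains(query, case=False, regex=False)
    return df[mask]
```

The filtered frame can then be handed to display_data to refresh the table. Matching literally rather than by regular expression is a deliberate choice here, so that a query such as "a.b" or "(" behaves the way a non-programmer would expect.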

5. Exporting Data (utils.py)

Users can export the table data to CSV, Excel, or JSON.

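from tkinter import filedialog, messagebox

def export_to_csv(df):
    """Export DataFrame to CSV."""
    if df is None:
        # Nothing has been loaded yet
        messagebox.showwarning("No data", "Open a Parquet file before exporting.")
        return
    file_path = filedialog.asksaveasfilename(defaultextension=".csv", filetypes=[("CSV files", "*.csv")])
    if file_path:
        df.to_csv(file_path, index=False)
        messagebox.showinfo("Success", "Data exported successfully!")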
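
Only the CSV exporter is shown above. The Excel and JSON versions would presumably follow the same pattern with a different pandas writer. As a rough sketch, the writing step for all three formats could be chosen from the file extension. The name export_dataframe and the records-style JSON layout are assumptions for illustration.

```python
from pathlib import Path


def export_dataframe(df, file_path):
    """Write `df` to `file_path`, picking the pandas writer from the extension."""
    suffix = Path(file_path).suffix.lower()
    if suffix == ".csv":
        df.to_csv(file_path, index=False)
    elif suffix == ".xlsx":
        # pandas delegates .xlsx writing to openpyxl
        df.to_excel(file_path, index=False)
    elif suffix == ".json":
        # One JSON object per row; other orient values are equally valid
        df.to_json(file_path, orient="records", indent=2)
    else:
        raise ValueError(f"Unsupported export format: {suffix!r}")
```

Keeping the file dialog and message boxes out of this function means it can be exercised without a GUI, while each menu command still supplies its own dialog and default extension.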

6. Implementing Dark Mode (ui_helper.py)

def toggle_theme(root):
    """Switch between light and dark themes."""
    if root["bg"] == "#f0f0f0":
        root["bg"] = "#2d2d2d"
    else:
        root["bg"] = "#f0f0f0"
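
Changing the root background does not restyle ttk widgets such as the treeview, because those take their colors from ttk.Style. The sketch below shows one way to extend the toggle, assuming a ttk.Style instance is passed in. The background colors come from the function above; the foreground colors and the names THEMES, next_theme, and apply_theme are assumptions.

```python
# Background colors match toggle_theme; foreground colors are illustrative.
THEMES = {
    "light": {"bg": "#f0f0f0", "fg": "#000000"},
    "dark": {"bg": "#2d2d2d", "fg": "#ffffff"},
}


def next_theme(current_bg):
    """Return the name of the theme to switch to, given the current background."""
    return "dark" if current_bg == THEMES["light"]["bg"] else "light"


def apply_theme(root, style, name):
    """Apply the named palette to the root window and to common ttk widgets."""
    colors = THEMES[name]
    root.configure(bg=colors["bg"])
    style.configure(
        "Treeview",
        background=colors["bg"],
        fieldbackground=colors["bg"],
        foreground=colors["fg"],
    )
    style.configure("TLabel", background=colors["bg"], foreground=colors["fg"])
```

A toggle then becomes apply_theme(root, style, next_theme(root["bg"])), with style created once as ttk.Style(root). Some native platform themes may ignore part of these options; switching to a theme such as "clam" with style.theme_use is a common workaround.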

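7. Packaging as an Executable

To distribute the app as a single .exe file, install PyInstaller (pip install pyinstaller) and run the following from the project folder. PyInstaller builds for the operating system it runs on, so this must be done on Windows: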

pyinstaller --onefile --windowed --icon=assets/icon.ico app.py

Conclusion

We successfully built a Parquet File Reader with:
  • File loading & display
  • Sorting, searching & filtering
  • Export options
  • Dark mode
  • Usage tracking

You can customize the app by adding charts, filters, or data analytics tools. 🚀
