Introduction
Parquet is a popular columnar storage file format widely used in big data and analytics. However, reading and analyzing Parquet files on a desktop often requires programming knowledge.
In this blog, we’ll walk through building a Parquet File Reader Windows desktop application using Python, Tkinter, and Pandas. The application provides a user-friendly interface to:
✅ Open and view Parquet files
✅ Search and filter data
✅ Sort columns
✅ Export data to CSV, Excel, and JSON
✅ Switch between light/dark themes
✅ Track usage statistics
1. Setting Up the Project
To begin, install the required dependencies:
pip install pandas pyarrow openpyxl
pandas: For handling Parquet files
pyarrow: Required for reading Parquet format
openpyxl: Needed for Excel export
Next, let’s structure our project:
ParquetReader/
│── assets/ # Icons and UI assets
│── app.py # Main application
│── components.py # UI elements
│── ui_helper.py # Data handling functions
│── utils.py # Export utilities
│── usage_stats.py # Collects usage data
│── requirements.txt # Dependencies
2. Implementing the Main Application (app.py)
The app.py file initializes the Tkinter GUI and imports functionality from the helper files.
Key Features in app.py
✔ Creates the main window
✔ Initializes UI components
✔ Adds a menu bar
✔ Starts a background thread to track app usage
import tkinter as tk
from tkinter import ttk
import threading

from components import (
    create_header, create_search_frame, create_treeview,
    create_button_frame, create_metadata_label, create_menu_bar
)
from ui_helper import open_parquet_file, toggle_theme
from usage_stats import collect_usage_data  # Track usage stats


def run_usage_tracker():
    """Run usage tracking in a separate thread."""
    usage_thread = threading.Thread(target=collect_usage_data, daemon=True)
    usage_thread.start()


def main():
    """Initialize the application."""
    root = tk.Tk()
    root.title("Parquet File Reader")
    root.geometry("900x600")
    root.configure(bg="#f0f0f0")

    # Initialize UI components
    metadata_label = create_metadata_label(root)
    tree, v_scrollbar, h_scrollbar = create_treeview(root)
    search_frame, search_entry, search_column_combobox = create_search_frame(root, tree, v_scrollbar, h_scrollbar)
    button_frame = create_button_frame(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label)

    # Create menu bar
    create_menu_bar(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label)

    # Start usage tracking
    run_usage_tracker()

    root.mainloop()


if __name__ == "__main__":
    main()
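The post does not show usage_stats.py, so here is a minimal sketch of what collect_usage_data might look like, assuming it only needs to count launches and record the last start time in a local JSON file. The file location (STATS_FILE), the JSON keys, and the optional path parameter are all hypothetical, not taken from the original project.

```python
import json
import time
from pathlib import Path

# Hypothetical location for the stats file; the real project may store it elsewhere.
STATS_FILE = Path.home() / ".parquet_reader_usage.json"


def collect_usage_data(path=STATS_FILE):
    """Increment a launch counter and record the launch time (assumed behavior)."""
    path = Path(path)
    try:
        stats = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # First run, or the file is corrupt: start from a clean record.
        stats = {}
    if not isinstance(stats, dict):
        stats = {}

    stats["launch_count"] = stats.get("launch_count", 0) + 1
    stats["last_launch"] = time.strftime("%Y-%m-%dT%H:%M:%S")

    path.write_text(json.dumps(stats, indent=2), encoding="utf-8")
    return stats
```

Because the function touches no Tkinter widgets, it is safe to run from the daemon thread started in run_usage_tracker; anything that updates the UI is generally better kept on the main thread.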
3. Creating the UI Components (components.py)
The UI consists of:
✔ A menu bar (File Open, Export, Exit)
✔ A search bar
✔ A treeview (table) for data
✔ Action buttons
import tkinter as tk  # "from tkinter import tk" would fail: tkinter has no "tk" name
from tkinter import ttk

from utils import export_to_csv, export_to_excel, export_to_json
from ui_helper import open_parquet_file, search_data


def create_menu_bar(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label):
    """Create the menu bar."""
    menu_bar = tk.Menu(root)
    root.config(menu=menu_bar)

    file_menu = tk.Menu(menu_bar, tearoff=0)
    file_menu.add_command(label="Open Parquet", command=lambda: open_parquet_file(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label))
    # Note: the export helpers in utils.py take the loaded DataFrame as an argument,
    # so these calls need to be given the DataFrame that is currently open.
    file_menu.add_command(label="Export to CSV", command=lambda: export_to_csv())
    file_menu.add_command(label="Export to Excel", command=lambda: export_to_excel())
    file_menu.add_command(label="Export to JSON", command=lambda: export_to_json())
    file_menu.add_separator()
    file_menu.add_command(label="Exit", command=root.quit)

    menu_bar.add_cascade(label="File", menu=file_menu)
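The export commands in the menu need access to whichever DataFrame was loaded last, because export_to_csv in utils.py takes the DataFrame as its argument. One way to wire that up is a small shared-state object. The AppState class and make_export_command helper below are hypothetical names used for illustration, not part of the original project.

```python
class AppState:
    """Holds the most recently loaded DataFrame (None until a file is opened)."""

    def __init__(self):
        self.df = None


def make_export_command(state, export_func):
    """Return a zero-argument callback suitable for a Tkinter menu command.

    The callback reads state.df when the menu item is clicked, so it always
    exports the file that is open at that moment.
    """
    def command():
        return export_func(state.df)

    return command
```

With this in place, a menu entry could be registered as file_menu.add_command(label="Export to CSV", command=make_export_command(state, export_to_csv)), and open_parquet_file would set state.df = df after a successful read. Both of those changes are assumptions about how you might adapt the code.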
4. Handling Data (ui_helper.py)
This file contains functions for loading, searching, and displaying Parquet data.
import pandas as pd
from tkinter import messagebox, filedialog


def open_parquet_file(root, tree, v_scrollbar, h_scrollbar, search_column_combobox, metadata_label):
    """Open and display a Parquet file."""
    file_path = filedialog.askopenfilename(filetypes=[("Parquet files", "*.parquet")])
    if file_path:
        try:
            df = pd.read_parquet(file_path)
            display_data(df, tree, v_scrollbar, h_scrollbar)
            search_column_combobox["values"] = list(df.columns)
        except Exception as e:
            messagebox.showerror("Error", f"Failed to read file: {e}")


def display_data(df, tree, v_scrollbar, h_scrollbar):
    """Show DataFrame contents in the treeview."""
    # Remove rows left over from a previously opened file.
    tree.delete(*tree.get_children())
    tree["columns"] = list(df.columns)
    tree["show"] = "headings"
    for col in df.columns:
        tree.heading(col, text=col)
    # Insert one treeview row per DataFrame row.
    for _, row in df.iterrows():
        tree.insert("", "end", values=list(row))
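components.py imports search_data from this file, and the feature list mentions sorting, but neither is shown in the post. As a rough sketch of the data side of both features, the two helpers below filter and sort a DataFrame with pandas. The names filter_dataframe and sort_dataframe are hypothetical, and the choice of a case-insensitive literal substring match is an assumption about how search should behave.

```python
import pandas as pd


def filter_dataframe(df, column, query):
    """Return rows whose `column` contains `query` (case-insensitive, literal text)."""
    if not query:
        return df
    # Compare as text so numeric and date columns can be searched too.
    text = df[column].astype(str)
    mask = text.str.contains(query, case=False, na=False, regex=False)
    # Never match missing values, whatever their string form is.
    mask = mask & df[column].notna()
    return df[mask]


def sort_dataframe(df, column, ascending=True):
    """Return a copy of df sorted by `column`, with missing values last."""
    return df.sort_values(by=column, ascending=ascending, na_position="last", kind="stable")
```

A search_data implementation could then call filter_dataframe with the combobox column and the search entry text, and pass the result to display_data to refresh the treeview.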
5. Exporting Data (utils.py)
Users can export the table data to CSV, Excel, or JSON.
from tkinter import filedialog, messagebox


def export_to_csv(df):
    """Export DataFrame to CSV."""
    if df is not None:
        file_path = filedialog.asksaveasfilename(defaultextension=".csv", filetypes=[("CSV files", "*.csv")])
        if file_path:  # An empty string means the user cancelled the dialog
            df.to_csv(file_path, index=False)
            messagebox.showinfo("Success", "Data exported successfully!")
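Only the CSV exporter is shown here. The Excel and JSON exporters would presumably follow the same pattern, so the writing step can be sketched as a single helper that picks the format from the file extension. write_dataframe is a hypothetical name, and the JSON layout (orient="records") is an assumption; the original project may use a different one.

```python
from pathlib import Path


def write_dataframe(df, file_path):
    """Write a pandas DataFrame to file_path, choosing the format by extension."""
    suffix = Path(file_path).suffix.lower()
    if suffix == ".csv":
        df.to_csv(file_path, index=False)
    elif suffix == ".xlsx":
        # pandas relies on openpyxl to write .xlsx files.
        df.to_excel(file_path, index=False)
    elif suffix == ".json":
        # orient="records" produces a list with one {column: value} object per row.
        df.to_json(file_path, orient="records", indent=2)
    else:
        raise ValueError(f"Unsupported export format: {suffix!r}")
```

Each export_to_* function could then show its own save dialog and hand the chosen path to this helper, which keeps the dialog code and the file-writing code separate.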
6. Implementing Dark Mode (ui_helper.py)
def toggle_theme(root):
    """Switch between light and dark themes."""
    if root["bg"] == "#f0f0f0":
        root["bg"] = "#2d2d2d"
    else:
        root["bg"] = "#f0f0f0"
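toggle_theme as written only recolors the root window, so the ttk Treeview keeps its default colors. A possible extension, sketched below, keeps the two palettes in a dictionary and also restyles the table through a ttk.Style object. The THEMES dictionary, the foreground colors, and the function names next_theme and apply_theme are hypothetical; only the two background colors come from the code above.

```python
# Hypothetical palettes; only the two "bg" values appear in the original code.
THEMES = {
    "light": {"bg": "#f0f0f0", "fg": "#000000"},
    "dark": {"bg": "#2d2d2d", "fg": "#ffffff"},
}


def next_theme(current_bg):
    """Return the name of the theme to switch to, given the current background."""
    return "dark" if current_bg == THEMES["light"]["bg"] else "light"


def apply_theme(root, style, name):
    """Apply a palette to the root window and to ttk Treeview widgets.

    `root` is the tk.Tk window and `style` is a ttk.Style instance.
    """
    colors = THEMES[name]
    root.configure(bg=colors["bg"])
    style.configure(
        "Treeview",
        background=colors["bg"],
        foreground=colors["fg"],
        fieldbackground=colors["bg"],
    )
    style.configure("Treeview.Heading", background=colors["bg"], foreground=colors["fg"])
```

A toggle would then be apply_theme(root, ttk.Style(root), next_theme(root["bg"])). Some native ttk themes on Windows ignore part of this styling; if the table does not change color, switching to a theme such as "clam" with style.theme_use is a common workaround.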
7. Packaging as an Executable
To distribute the app as an .exe file, install PyInstaller (pip install pyinstaller) and run:
pyinstaller --onefile --windowed --icon=assets/icon.ico app.py
Conclusion
We successfully built a Parquet File Reader with:
✅ File loading & display
✅ Sorting, searching & filtering
✅ Export options
✅ Dark mode
✅ Usage tracking
You can customize the app by adding charts, filters, or data analytics tools. 🚀